razaimam45 committed
Commit ed1aeb3 · verified · 1 Parent(s): e3843b7

Update README.md

Files changed (1):
  1. README.md +182 -3
README.md CHANGED
@@ -1,3 +1,182 @@
- ---
- license: mit
- ---

# RobustMedCLIP: On the Robustness of Medical Vision-Language Models: Are they Truly Generalizable?

> **Accepted at [Medical Image Understanding and Analysis (MIUA) 2025]**

[![License: MIT](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![Paper](https://img.shields.io/badge/Paper-PDF-blue)](https://arxiv.org/abs/2505.15425)
[![Dataset](https://img.shields.io/badge/Dataset-MediMeta--C-orange)](https://github.com/BioMedIA-MBZUAI/RobustMedCLIP)
[![Project](https://img.shields.io/badge/Project-RobustMedCLIP-red)](https://github.com/BioMedIA-MBZUAI/RobustMedCLIP)

---

## 🚀 Highlights

- 🧠 **MVLM Benchmarking**: Evaluate 5 major and recent MVLMs across **5 modalities**, **7 corruption types**, and **5 severity levels**
- 📉 **Corruption Evaluation**: Analyze degradation under Gaussian noise, motion blur, pixelation, and more
- 🔬 **MediMeta-C**: A new benchmark simulating real-world OOD shifts in high-resolution medical images
- 🧪 **Few-shot Robustness**: **RobustMedCLIP** adapts with just 1-10% of clean data
- 🧠 **LoRA Efficient Tuning**: Low-rank fine-tuning of the transformer attention layers

<p align="center">
  <img src="assets/pipeline.png" width="750" alt="Pipeline Overview">
</p>
<p align="center">
Overview of the RobustMedCLIP pipeline: A) few-shot sampling of clean samples from MediMeta and MedMNIST across 5 modalities; B) fine-tuning LoRA adapters on the few-shot samples; C) distribution shifts of MediMeta-C relative to the clean samples; D) evaluation results (top-1 accuracy and corruption error) for 4 baselines and RobustMedCLIP.
</p>

---

## 📦 Installation

```bash
git clone https://github.com/BioMedIA-MBZUAI/RobustMedCLIP.git
cd RobustMedCLIP
conda create -n robustmedclip python=3.12.7
conda activate robustmedclip
pip install -r requirements.txt
pip install huggingface_hub
```

You will also need to replace `<YOUR-HUGGINGFACE-TOKEN>` in the commands below with your personal Hugging Face access token in order to download the datasets and model weights directly.
To create an access token, go to your Hugging Face `Settings`, open the `Access Tokens` tab, and click the **New token** button to create a new User Access Token.

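If you prefer not to paste the token into every command, you can authenticate once with the `huggingface_hub` library installed above (a minimal sketch; `huggingface-cli login` does the same from the shell):

```python
from huggingface_hub import login

# Stores the access token locally so later hub downloads pick it up automatically.
login(token="<YOUR-HUGGINGFACE-TOKEN>")
```
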
---

## 🧠 Models

All baseline and RobustMedCLIP model checkpoints are available for direct download via Hugging Face at [RobustMedCLIP](https://huggingface.co/razaimam45/RobustMedCLIP/tree/main):

```bash
huggingface-cli download razaimam45/RobustMedCLIP \
    --local-dir ./outputs \
    --repo-type model \
    --token <YOUR-HUGGINGFACE-TOKEN>
```

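The same download can also be scripted with the `huggingface_hub` Python API instead of the CLI (a minimal sketch using the same repo and paths):

```python
from huggingface_hub import snapshot_download

# Mirrors the huggingface-cli command above: fetches every file in the model repo
# into ./outputs at the project root.
snapshot_download(
    repo_id="razaimam45/RobustMedCLIP",
    repo_type="model",
    local_dir="./outputs",
    token="<YOUR-HUGGINGFACE-TOKEN>",
)
```
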
📁 `outputs/` folder structure: the `outputs/` folder (placed in the project root) contains all trained model weights and evaluation results:

```bash
outputs/
├── checkpoints/   # Baseline MVLMs (MedCLIP, UniMedCLIP)
├── exp-rank-8/    # RobustMedCLIP (LoRA rank = 8) for ViT and ResNet across few-shot budgets (1/3/7/10)%
├── exp-rank-16/   # RobustMedCLIP (LoRA rank = 16) for ViT and ResNet across few-shot budgets (1/3/7/10)%
└── results/       # Evaluation logs across mCE/Accuracy metrics
```

---

## 🧬 Datasets

This project introduces MediMeta-C as a corruption benchmark and evaluates MVLMs on both the MediMeta-C and MedMNIST-C benchmarks.

| Dataset        | Modality         | Clean Samples | Corruption Sets          | Resolution |
|----------------|------------------|---------------|--------------------------|------------|
| **MediMeta-C** | Multi-modality   | 5 modalities  | 7 corruptions × 5 levels | High-res   |
| **MedMNIST-C** | Public benchmark | 5 modalities  | 7 corruptions × 5 levels | Low-res    |

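Each corruption is applied at five increasing severity levels, following the ImageNet-C recipe. As a toy illustration of the idea (the noise schedule below is illustrative, not the exact parameters used to build MediMeta-C):

```python
import numpy as np

def gaussian_noise(image, severity=1):
    """Add Gaussian noise to an image in [0, 1]; higher severity means a larger std."""
    sigmas = [0.04, 0.08, 0.12, 0.18, 0.26]  # illustrative severity schedule
    noisy = image + np.random.normal(scale=sigmas[severity - 1], size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Example: corrupt a dummy 224x224 RGB image at severity 3
clean = np.random.rand(224, 224, 3)
corrupted = gaussian_noise(clean, severity=3)
```
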
### 📂 Dataset Structure

The MediMeta-C dataset is hosted on Hugging Face and organized as follows:

```bash
MediMeta-C/
├── pbc/                                # Blood Cell modality
│   ├── test/                           # Test set
│   │   ├── clean.npz                   # Clean samples
│   │   ├── brightness_severity_1.npz
│   │   ├── brightness_severity_2.npz
│   │   ├── ...                         # Other severity levels
│   │   └── brightness_severity_5.npz
│   ├── val/                            # Validation set
│   │   ├── clean.npz
│   │   ├── contrast_severity_1.npz
│   │   ├── contrast_severity_2.npz
│   │   ├── ...                         # Other severity levels
│   │   └── contrast_severity_5.npz
├── fundus/                             # Fundus modality
│   ├── test/
│   ├── val/
│   └── ...                             # Similar structure as above
├── ...                                 # Other modalities
└── README.md                           # Dataset description
```

You can download the dataset from [MediMeta-C](https://huggingface.co/datasets/razaimam45/MediMeta-C/tree/main) and [MedMNIST-C](https://github.com/francescodisalvo05/medmnistc-api). The downloaded folder `data/MediMeta-C` should sit in the root of the project folder.

```bash
huggingface-cli download razaimam45/MediMeta-C --local-dir ./data/MediMeta-C --repo-type dataset --token <YOUR-HUGGINGFACE-TOKEN>
```

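Each `.npz` archive bundles the arrays for one corruption/severity combination of one split. A minimal sketch for inspecting one of the downloaded files (the array names are printed rather than assumed, since they are not documented here):

```python
import numpy as np

# Pick one corruption/severity file from the downloaded benchmark
path = "data/MediMeta-C/pbc/test/brightness_severity_3.npz"

archive = np.load(path)
print(archive.files)  # names of the arrays stored in this archive
for name in archive.files:
    print(name, archive[name].shape, archive[name].dtype)
```
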
---

## 🔧 Usage

### 1. Few-Shot Tuning

You can fine-tune RobustMedCLIP with either a ViT or a ResNet backbone:

```bash
# Fine-tune with ViT backbone (e.g., BioMedCLIP)
bash scripts/run_finetune_vit.sh

# Fine-tune with ResNet backbone (e.g., MedCLIP)
bash scripts/run_finetune_resnet.sh
```

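These scripts fine-tune low-rank (LoRA) adapters in the attention layers rather than updating the full backbone. As a rough illustration of what such an adapter looks like (a schematic PyTorch sketch, not the exact module used in this repo):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear projection plus a trainable low-rank update: W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():    # freeze the pretrained projection
            p.requires_grad = False
        self.lora_a = nn.Linear(base.in_features, rank, bias=False)
        self.lora_b = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # update starts at zero, preserving pretrained behavior
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

# Example: wrap a 768-dim attention projection with a rank-8 adapter
proj = LoRALinear(nn.Linear(768, 768), rank=8)
out = proj(torch.randn(4, 768))
```
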
### 2. Evaluation

Evaluate a fine-tuned or pretrained MVLM (including RMedCLIP):

```bash
# Evaluation for RobustMedCLIP (RMC)
bash scripts/run_eval_rmed.sh

# Custom evaluation on other models (rmedclip, biomedclip, unimedclip, medclip, clip)
python evaluate.py --model rmedclip \
    --backbone vit \
    --gpu 0 --corruptions all --collection medimeta
```

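The evaluated MVLMs classify zero-shot by comparing image embeddings against text embeddings of class prompts, and evaluation reports the top-1 accuracy of that prediction on each corrupted set. A schematic sketch of the protocol (illustrative only; it assumes a CLIP-style model exposing `encode_image`/`encode_text`, not the exact code in `evaluate.py`):

```python
import torch

@torch.no_grad()
def zero_shot_top1(model, images, tokenized_prompts):
    """Predict, for each image, the class whose prompt embedding is most similar."""
    text_feat = model.encode_text(tokenized_prompts)     # (num_classes, d)
    image_feat = model.encode_image(images)              # (batch, d)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    return (image_feat @ text_feat.T).argmax(dim=-1)     # top-1 class index per image
```
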
---

## 📊 Results

RobustMedCLIP consistently outperforms prior MVLMs under corruption across all modalities:

| Model        | Clean Error ↓ | mCE ↓ (avg) |
| ------------ | ------------- | ----------- |
| CLIP         | 100.0         | 100.0       |
| MedCLIP      | 106.4         | 112.5       |
| BioMedCLIP   | 116.3         | 126.8       |
| UniMedCLIP   | 111.8         | 98.87       |
| **RMedCLIP** | **62.8**      | **81.0**    |

Detailed benchmarks are available in `Results and Discussions`.

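Here mCE is the mean Corruption Error from the ImageNet-C protocol: for each corruption, a model's error rates are summed over the five severities and normalized by the corresponding errors of a reference model (CLIP in the table above, which is why its entries read 100.0), then averaged over corruptions. A minimal sketch of that computation (array shapes assumed for illustration):

```python
import numpy as np

def mean_corruption_error(model_err, ref_err):
    """model_err, ref_err: (num_corruptions, num_severities) arrays of error rates."""
    ce_per_corruption = model_err.sum(axis=1) / ref_err.sum(axis=1)  # normalized CE per corruption
    return 100.0 * ce_per_corruption.mean()                          # mCE as a percentage

# Toy example: 7 corruptions x 5 severities
ref = np.random.uniform(0.3, 0.6, size=(7, 5))  # reference (e.g., CLIP) error rates
model = 0.8 * ref                                # a model that is 20% better everywhere
print(mean_corruption_error(model, ref))         # ~80.0
```
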
---

## ✏️ Citation

If you find this repository helpful, please cite our paper:

```bibtex
@misc{imam2025robustnessmedicalvisionlanguagemodels,
      title={On the Robustness of Medical Vision-Language Models: Are they Truly Generalizable?},
      author={Raza Imam and Rufael Marew and Mohammad Yaqub},
      year={2025},
      eprint={2505.15425},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.15425},
}
```

---

## 🤝 Acknowledgements

* Built on top of [BioMedCLIP](https://arxiv.org/abs/2303.00915) and [MedCLIP](https://arxiv.org/abs/2210.10163)
* MediMeta-C corruption designs are inspired by [ImageNet-C](https://arxiv.org/abs/1903.12261) and [MedMNIST-C](https://arxiv.org/abs/2406.17536)

For questions, contact: **[[email protected]](mailto:[email protected])**