G2D-NeurIPS2024

G2D: From Global to Dense Radiography Representation Learning via Vision-Language Pre-training, NeurIPS 2024.

Installation

To clone this repository:

git clone https://github.com/cheliu-computation/G2D-NeurIPS24.git

To install Python dependencies:

pip install -r requirements.txt

All experiments were run on A100 GPUs.

Pre-training dataset download

We used the following dataset:

  • MIMIC-CXR: We use the radiographs from the MIMIC-CXR-JPG dataset. The paired medical reports can be downloaded from MIMIC-CXR.

Preprocessing

  • First, we follow the MGCA preprocessing to build a master CSV that lists every CXR scan together with its associated report. You can find it in Preprocessing.
  • Then, run 'ext_data.py' to extract all scans and save them as a single npy file, which accelerates the pre-training stage.
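
The idea behind the second step can be sketched as follows. This is a minimal illustration, not the contents of 'ext_data.py': the function names, the `Path` column name, and the 256x256 target size are all assumptions made for the example, and the real script may differ.

```python
import csv

import numpy as np


def load_grayscale(path, size):
    """Read one image as a uint8 grayscale array resized to `size` (width, height)."""
    # Pillow is imported lazily so the packing logic itself only needs NumPy.
    from PIL import Image

    with Image.open(path) as img:
        return np.asarray(img.convert("L").resize(size), dtype=np.uint8)


def pack_scans(master_csv, out_npy, size=(256, 256), path_column="Path",
               load_image=load_grayscale):
    """Stack every scan listed in `master_csv` into one array and save it as .npy.

    `path_column` is a hypothetical column name; adjust it to the master CSV.
    """
    with open(master_csv, newline="") as f:
        paths = [row[path_column] for row in csv.DictReader(f)]
    # One (N, H, W) array, so training can read pixels without decoding JPEGs again.
    scans = np.stack([load_image(p, size) for p in paths])
    np.save(out_npy, scans)
    return scans.shape
```

A call such as `pack_scans("master.csv", "scans.npy")` would write the array once. For a file too large for memory, `np.load("scans.npy", mmap_mode="r")` reads slices on demand.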

Pre-training

We pre-trained G2D on MIMIC-CXR using this command:


cd /G2D-NeurIPS24/PRETRAIN
torchrun --nnodes=1 --nproc_per_node=8 main.py
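
With `--nnodes=1 --nproc_per_node=8`, torchrun starts eight processes on one machine and exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to each of them. The sketch below shows how a script launched this way typically picks up those values. It is a generic illustration, not the repository's `main.py`; the helper names `read_torchrun_env` and `setup_distributed` are hypothetical.

```python
import os


def read_torchrun_env(env=None):
    """Return (rank, local_rank, world_size) from the variables torchrun exports."""
    env = os.environ if env is None else env
    # The defaults let the same script run as a single process without torchrun.
    rank = int(env.get("RANK", 0))
    local_rank = int(env.get("LOCAL_RANK", 0))
    world_size = int(env.get("WORLD_SIZE", 1))
    return rank, local_rank, world_size


def setup_distributed():
    """Bind this process to its GPU and join the process group (needs PyTorch with CUDA)."""
    import torch
    import torch.distributed as dist

    rank, local_rank, world_size = read_torchrun_env()
    torch.cuda.set_device(local_rank)
    # No init_method is given, so the default env:// rendezvous is used,
    # which reads MASTER_ADDR and MASTER_PORT as set by torchrun.
    dist.init_process_group(backend="nccl")
    return rank, local_rank, world_size
```

To train on fewer GPUs, lowering `--nproc_per_node` changes `WORLD_SIZE` accordingly. Whether batch size or learning rate then needs adjusting depends on the training script.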

Fine-tuning on downstream tasks

We evaluate G2D on three fine-tuning downstream tasks (image classification, object detection, and semantic segmentation) and two zero-shot downstream tasks (zero-shot image classification and zero-shot image grounding).

For image classification, semantic segmentation, and object detection, we follow the official MGCA-NeurIPS2022 configuration and code. The data can be found in the MGCA repository.

For zero-shot image classification and grounding, we follow MedKLIP-ICCV2023. Please use their official code to extract the data and implement the Image-Text Retrieval tasks.
