On the Utility of Foundation Models for Fast MRI: Vision-Language-Guided Image Reconstruction

This repository provides the official implementation of the paper: On the Utility of Foundation Models for Fast MRI: Vision-Language-Guided Image Reconstruction.

Introduction

We investigate whether vision-language foundation models can improve undersampled MRI reconstruction. Our approach uses high-level semantic embeddings from a pretrained vision-language foundation model (Janus) to guide reconstruction through contrastive learning: the embedding of the reconstructed image is aligned with a target semantic distribution, which encourages consistency with high-level perceptual cues. The proposed objective works with various deep learning-based reconstruction methods and can flexibly incorporate semantic priors from multimodal sources. We evaluate reconstructions guided by prior embeddings derived from either image-only or image-language auxiliary information.
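The contrastive alignment described above can be illustrated with a small InfoNCE-style loss. This is a minimal NumPy sketch, not the loss implemented in loss_function.py: the function names, the temperature value, the embedding shapes, and the choice of what counts as a negative embedding are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(values)))."""
    m = values.max()
    return m + np.log(np.exp(values - m).sum())


def info_nce_loss(recon_emb, positive_embs, negative_embs, temperature=0.1):
    """InfoNCE-style loss for a single reconstructed-image embedding.

    recon_emb:     (d,)   embedding of the current reconstruction
    positive_embs: (P, d) prior embeddings from the target distribution
    negative_embs: (N, d) embeddings to move away from (assumed, e.g. from
                          artifact-corrupted images)
    """
    q = l2_normalize(recon_emb)
    pos = l2_normalize(positive_embs) @ q / temperature  # (P,) similarities
    neg = l2_normalize(negative_embs) @ q / temperature  # (N,) similarities
    # -log( sum_pos exp(s) / sum_all exp(s) ), always non-negative
    return float(log_sum_exp(np.concatenate([pos, neg])) - log_sum_exp(pos))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.normal(size=16)
    positives = target + 0.1 * rng.normal(size=(8, 16))
    negatives = -target + 0.1 * rng.normal(size=(8, 16))
    print("near target:", info_nce_loss(target, positives, negatives))
    print("far from target:", info_nce_loss(-target, positives, negatives))
```

The loss is small when the reconstruction embedding lies close to the prior (positive) embeddings and grows as it drifts toward the negatives, which is the behavior a contrastive guidance term relies on.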

Project Structure

Foundation-Model-MRI-Reconstruction/
├── image_language_demo.py          # Image-language embedding-guided reconstruction
├── INR_demo.py                     # Implicit Neural Representation (INR)-based reconstruction guided by image-only embeddings
├── Unet_demo.py                    # UNet-based reconstruction guided by image-only embeddings
├── Unrolled_demo.py                # Unrolled network reconstruction guided by image-only embeddings
├── model.py                        # Model definitions (Unrolled, SIREN, CG solver)
├── loss_function.py                # Contrastive and reconstruction loss functions
├── utils.py                        # Utility functions for data processing
├── feature_extraction_image.py     # Extract image embeddings from foundation model
├── feature_extraction_image_language.py  # Extract image-language embeddings
├── demo_data.mat                   # Demo k-space data
├── unet/                           # UNet architecture
│   ├── unet_model.py
│   ├── unet_parts.py
│   └── pre_trained_weights/        # Pre-trained UNet weights
├── Janus/                          # Janus foundation model
├── image_examples/                 # Example images for image-language embedding extraction
└── prior_embeddings_image_language/  # Pre-computed image-language embeddings

Installation

Prerequisites

Python >= 3.8
PyTorch >= 1.12
CUDA-compatible GPU

Dependencies

# Clone the repository
git clone https://github.com/I3Tlab/Foundation-Model-MRI-Reconstruction.git
cd Foundation-Model-MRI-Reconstruction

# Install dependencies
pip install torch torchvision torchaudio
pip install transformers scipy numpy h5py scikit-image scikit-learn
pip install sigpy tqdm tensorboard matplotlib umap-learn einops

Install minLoRA by following the instructions at https://github.com/cchangcs/minLoRA

Download Janus-Pro Weights

Download the pretrained Janus-Pro-1B model from Hugging Face, then set foundation_model_path in the demo scripts to the download location.

Download FastMRI Data

The FastMRI dataset is required for extracting image embeddings. To access the dataset:

Register and Request Access: Visit the FastMRI website and complete the data access request form.
Download the Dataset: Once approved, download the knee or brain MRI data. For this project, we primarily use:
- knee_multicoil_train
- knee_multicoil_val
Organize the Data: Place the downloaded data in your preferred directory and update the data paths in the demo scripts accordingly.

For quick testing, we provide:

demo_data.mat: A multi-coil k-space slice for demo reconstruction
image_examples/: Example images used to generate image-language embeddings
prior_embeddings_image_language/: Pre-generated image-language embeddings
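Before running a demo it can help to check what a .mat file actually contains. The sketch below uses scipy.io.whosmat to list variable names, shapes, and MATLAB classes without loading the arrays. Because the variable names inside demo_data.mat are not documented here, the example runs on a synthetic stand-in file; the name kspace_stand_in and the (coils, rows, cols) layout are hypothetical.

```python
import os
import tempfile

import numpy as np
from scipy.io import savemat, whosmat


def summarize_mat(path):
    """Map each variable name in a .mat file to its (shape, MATLAB class)."""
    return {name: (shape, kind) for name, shape, kind in whosmat(path)}


def demo_with_stand_in_file():
    """Write a synthetic multi-coil k-space array and summarize it."""
    rng = np.random.default_rng(0)
    # Hypothetical layout: (coils, rows, cols); the real file may differ.
    kspace = rng.normal(size=(4, 8, 8)) + 1j * rng.normal(size=(4, 8, 8))
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stand_in.mat")
        savemat(path, {"kspace_stand_in": kspace})
        return summarize_mat(path)


if __name__ == "__main__":
    for name, (shape, kind) in demo_with_stand_in_file().items():
        print(name, shape, kind)
```

To inspect the shipped file, call summarize_mat("demo_data.mat") from the repository root. Note that scipy.io reads MATLAB files up to v7.2; v7.3 files are HDF5 containers and need h5py instead.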

Usage

Important: Before running any scripts, update the following paths in the code to match your environment:
foundation_model_path = "/path/to/Janus-Pro-1B"
feat_path = "/path/to/prior_embeddings"
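A quick check that the configured paths exist can catch typos before a long run starts. The helper below is a hypothetical convenience and is not part of the repository; only the variable names foundation_model_path and feat_path come from the demo scripts.

```python
from pathlib import Path


def missing_paths(**named_paths):
    """Return the names of configured paths that do not exist on disk."""
    return [
        name
        for name, location in named_paths.items()
        if not Path(location).expanduser().exists()
    ]


if __name__ == "__main__":
    missing = missing_paths(
        foundation_model_path="/path/to/Janus-Pro-1B",
        feat_path="/path/to/prior_embeddings",
    )
    if missing:
        print("Update these paths before running:", ", ".join(missing))
```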

1. Feature Extraction

Before reconstruction, extract prior embeddings from auxiliary images.

Image Embeddings

Note: You must download the FastMRI dataset before running this script, as the raw data files are too large to include in this repository. Please update the data paths in the script to point to your local data directory.

python feature_extraction_image.py

This writes the embeddings to prior_embeddings/ and produces a UMAP visualization of them.

Image-Language Embeddings

We provide example images in image_examples/ for extracting image-language embeddings.

python feature_extraction_image_language.py

This writes the embeddings to prior_embeddings_image_language/ and produces a UMAP visualization of them.

2. MRI Reconstruction

We provide four reconstruction demos, each optimized with a data-consistency loss and a contrastive loss:

U-Net-based Reconstruction

Learns a direct transformation from undersampled inputs to reconstructed images:

python Unet_demo.py

Unrolled Network Reconstruction

Unrolls a variable-splitting iterative reconstruction algorithm into a sequence of learnable stages. Each stage alternates between a U-Net and an explicit data-consistency step solved with the conjugate gradient method:

python Unrolled_demo.py
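To make the data-consistency step concrete, the sketch below solves the regularized least-squares problem min_x ||Ax - y||^2 + lam ||x - z||^2 with conjugate gradient, where z stands for the network output of the current stage. It is a simplified illustration, not the solver in model.py: it assumes a single-coil Cartesian acquisition (A is a sampling mask applied after an orthonormal 2D FFT), whereas the demo data are multi-coil, and the function names and the value of lam are hypothetical.

```python
import numpy as np


def forward_op(image, mask):
    """A: image -> undersampled k-space (orthonormal 2D FFT, then mask)."""
    return mask * np.fft.fft2(image, norm="ortho")


def adjoint_op(kspace, mask):
    """A^H: undersampled k-space -> image."""
    return np.fft.ifft2(mask * kspace, norm="ortho")


def cg_data_consistency(y, mask, z, lam=0.05, n_iter=10, tol=1e-10):
    """Solve (A^H A + lam I) x = A^H y + lam z with conjugate gradient."""

    def normal_op(x):
        return adjoint_op(forward_op(x, mask), mask) + lam * x

    b = adjoint_op(y, mask) + lam * z
    x = z.astype(np.complex128)      # warm start from the network output
    r = b - normal_op(x)             # initial residual
    p = r.copy()                     # initial search direction
    rs_old = np.vdot(r, r).real
    for _ in range(n_iter):
        if rs_old < tol:             # squared residual norm small enough
            break
        Ap = normal_op(p)
        alpha = rs_old / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.normal(size=(16, 16)) + 1j * rng.normal(size=(16, 16))
    mask = (rng.random((16, 16)) < 0.4).astype(float)
    y = forward_op(truth, mask)
    x = cg_data_consistency(y, mask, z=np.zeros_like(truth))
    print("k-space error on sampled points:",
          np.linalg.norm(forward_op(x, mask) - y))
```

In this single-coil setting the system also has a closed-form solution in k-space, so CG is not strictly needed; it becomes necessary once coil sensitivity maps enter the forward operator and the normal operator is no longer diagonal in k-space.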

Implicit Neural Representation (INR)

Represents the MR image as a continuous function of spatial coordinates. The function is parameterized by an MLP, which takes spatial coordinates as input and predicts the corresponding image intensities:

python INR_demo.py
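The coordinate-to-intensity mapping can be sketched as a forward pass through a small MLP with sine activations, in the spirit of the SIREN model listed in model.py. This NumPy version only evaluates a randomly initialized network and does no training; the layer sizes, the omega_0 value, the zero bias initialization, and the two-channel (real, imaginary) output are assumptions for illustration and may differ from INR_demo.py.

```python
import numpy as np


def make_coordinate_grid(height, width):
    """Return (height*width, 2) pixel coordinates scaled to [-1, 1]."""
    rows = np.linspace(-1.0, 1.0, height)
    cols = np.linspace(-1.0, 1.0, width)
    rr, cc = np.meshgrid(rows, cols, indexing="ij")
    return np.stack([rr.ravel(), cc.ravel()], axis=-1)


def init_sine_mlp(rng, in_dim=2, hidden_dim=64, n_hidden=3, out_dim=2,
                  omega_0=30.0):
    """Random weights drawn from SIREN-style uniform ranges; zero biases."""
    dims = [in_dim] + [hidden_dim] * n_hidden + [out_dim]
    layers = []
    for i, (fan_in, fan_out) in enumerate(zip(dims[:-1], dims[1:])):
        # First layer: U(-1/n, 1/n); later layers: U(+-sqrt(6/n)/omega_0)
        bound = 1.0 / fan_in if i == 0 else np.sqrt(6.0 / fan_in) / omega_0
        weights = rng.uniform(-bound, bound, size=(fan_in, fan_out))
        layers.append((weights, np.zeros(fan_out)))
    return layers


def sine_mlp_forward(coords, layers, omega_0=30.0):
    """Map (n_points, 2) coordinates to n_points complex intensities."""
    h = coords
    for weights, bias in layers[:-1]:
        h = np.sin(omega_0 * (h @ weights + bias))
    weights, bias = layers[-1]
    out = h @ weights + bias          # linear output: (real, imaginary)
    return out[:, 0] + 1j * out[:, 1]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    coords = make_coordinate_grid(32, 32)
    layers = init_sine_mlp(rng)
    image = sine_mlp_forward(coords, layers).reshape(32, 32)
    print(image.shape, image.dtype)
```

Because the image is a function of continuous coordinates, the same weights can be evaluated on a denser grid by calling make_coordinate_grid with a larger size.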

Image-Language Guided Reconstruction

Guides the reconstruction with prior embeddings derived from image-language auxiliary information (see prior_embeddings_image_language/):

python image_language_demo.py

3. Results

Reconstruction results are saved in reconstruction_results/:

pred_results_*.mat - Reconstructed images
model/ - Model checkpoints
log/ - TensorBoard logs

Monitor training with TensorBoard:

tensorboard --logdir=reconstruction_results

Related Resources

Citation

@article{feng2026,
author = {Feng, Ruimin and He, Xingxin and Mercer, Ronald and Stewart, Zachary and Liu, Fang},
title = {On the Utility of Foundation Models for Fast MRI: Vision-Language-Guided Image Reconstruction},
journal = {Magnetic Resonance in Medicine},
volume = {},
number = {},
pages = {},
doi = {https://doi.org/10.1002/mrm.70374},
url = {https://onlinelibrary.wiley.com/doi/abs/10.1002/mrm.70374},
}

Contact

For questions or issues, please open a GitHub issue or contact the authors.

Intelligent Imaging Innovation and Translation Lab at the Athinoula A. Martinos Center of Massachusetts General Hospital and Harvard Medical School

Ruimin Feng (rfeng3@mgh.harvard.edu)
Fang Liu (fliu12@mgh.harvard.edu)

149 13th Street, Suite 2301 Charlestown, Massachusetts 02129, USA
