Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration

Jinyoung Park, Minseong Bae, Jeehye Na, Hyunwoo J. Kim.

Official PyTorch implementation of "Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration" (AAAI 2026).

Environment

To install requirements, run:

git clone https://github.com/mlvlab/LLaMo.git
cd LLaMo
conda create -n llamo python==3.9
conda activate llamo
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
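
After installation, it can help to confirm that the pinned PyTorch packages were actually picked up. The following is a minimal sketch, not part of the repository: `check_pins` and `PINNED` are hypothetical names, and the pins are copied from the `pip install` command above. It uses only the standard-library `importlib.metadata` module, so it also runs when a package is missing.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the pip install command above.
PINNED = {"torch": "2.3.1", "torchvision": "0.18.1", "torchaudio": "2.3.1"}


def check_pins(pins=PINNED):
    """Map each package name to (installed_version_or_None, matches_pin)."""
    report = {}
    for name, wanted in pins.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None
        # Wheels from a CUDA-specific index may carry a local suffix such as "+cu118".
        ok = found is not None and found.split("+")[0] == wanted
        report[name] = (found, ok)
    return report


for pkg, (found, ok) in check_pins().items():
    print(f"{pkg}: installed={found} matches_pin={ok}")
```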

Preparation

Pretrained graph encoder

We use the pre-trained graph encoder checkpoint from the MoleculeSTM repository. You can download it from the link. Place the pretrained graph model in the `MoleculeSTM/` folder.

Datasets

You can download the datasets from the link. Place both datasets (MoleculeDesc, instruction_tuning) in the data/ folder.
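
The expected layout can be checked before training starts. This is a small sketch under the assumption that the script runs from the repository root; the directory names come from the instructions above, while `missing_dirs` and `EXPECTED_DIRS` are hypothetical helpers that do not exist in the repository.

```python
from pathlib import Path

# Directory names taken from the preparation instructions; the helper is illustrative.
EXPECTED_DIRS = ("MoleculeSTM", "data/MoleculeDesc", "data/instruction_tuning")


def missing_dirs(repo_root="."):
    """Return the expected directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


missing = missing_dirs()
print("Missing directories:", ", ".join(missing) if missing else "none")
```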

Checkpoint

You can download our checkpoint from the link.

We are currently refactoring the code to integrate with Hugging Face. Please stay tuned :)

Training

You can update the training config in the config_file folder.

Step 1. Molecular graph-language alignment

python train.py --root_train 'data/MoleculeDesc/' --root_eval 'data/MoleculeDesc/' --devices '0,1,2,3' --filename "stage1" --max_epochs 3 --mode train --inference_batch_size 16 --batch_size 4 --config_file config_file/stage1.yaml --accumulate_grad_batches 4
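
When changing the number of GPUs, it is worth keeping the effective batch size in mind. The sketch below assumes that `--batch_size` is a per-device value and that training is data-parallel across the listed devices. Neither assumption is stated in this README, so check the training code before relying on it. `effective_batch_size` is a hypothetical helper.

```python
def effective_batch_size(batch_size, devices, accumulate_grad_batches):
    """Samples contributing to one optimizer step, assuming per-device batches."""
    # --devices is passed as a comma-separated string such as '0,1,2,3'.
    num_devices = len([d for d in devices.split(",") if d.strip()])
    return batch_size * num_devices * accumulate_grad_batches


# Values from the command above: batch size 4, four devices, accumulation 4.
print(effective_batch_size(4, "0,1,2,3", 4))  # 64
```

Under these assumptions, halving the device count while doubling `--accumulate_grad_batches` would keep the effective batch size unchanged.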

Step 2. Instruction tuning

python train.py --root_train 'data/instruction_tuning/' --root_eval 'data/MoleculeDesc/' --devices '0,1,2,3' --filename "stage2" --max_epochs 3 --mode train --inference_batch_size 16 --batch_size 4 --config_file config_file/stage2.yaml --accumulate_grad_batches 4 --stage_path "./all_checkpoints/stage1/last.ckpt"
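
The two stages differ only in the training data, the output name and config, and the `--stage_path` that points stage 2 at the stage 1 checkpoint. A minimal sketch of building both commands from one place is shown below. `stage_command` is a hypothetical helper, and every flag and path is copied from the commands above rather than from the repository's argument parser.

```python
import shlex


def stage_command(stage, root_train, stage_path=None):
    """Build the train.py argument list for one training stage."""
    cmd = [
        "python", "train.py",
        "--root_train", root_train,
        "--root_eval", "data/MoleculeDesc/",
        "--devices", "0,1,2,3",
        "--filename", stage,
        "--max_epochs", "3",
        "--mode", "train",
        "--inference_batch_size", "16",
        "--batch_size", "4",
        "--config_file", f"config_file/{stage}.yaml",
        "--accumulate_grad_batches", "4",
    ]
    if stage_path is not None:
        # Stage 2 starts from the checkpoint written by stage 1.
        cmd += ["--stage_path", stage_path]
    return cmd


stage1 = stage_command("stage1", "data/MoleculeDesc/")
stage2 = stage_command(
    "stage2", "data/instruction_tuning/",
    stage_path="./all_checkpoints/stage1/last.ckpt",
)
for cmd in (stage1, stage2):
    print(shlex.join(cmd))
```

To run the stages back to back, each list could be passed to `subprocess.run(cmd, check=True)` so that stage 2 only starts if stage 1 exits successfully.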

Inference and Evaluation

Inference

To generate LLaMo outputs for the molecule description generation task, run the following command.

python train.py --root_train 'data/MoleculeDesc/' --root_eval 'data/MoleculeDesc/' --devices '0,1,2,3' --filename "desc_output" --mode eval --inference_batch_size 1 --batch_size 1 --config_file config_file/stage2.yaml --stage_path <path_to_checkpoint>

Evaluation

To evaluate LLaMo on the molecule description generation task, run the following command.

python evaluate.py --task desc --path <path_to_predictions>

Contact

If you have any questions, please open an issue on this repository or contact us at lpmn678@korea.ac.kr.

Citation

If you find our work interesting, please consider giving the repository a ⭐ and citing our paper.

@inproceedings{park2024llamo,
  title={Improving Large Molecular Language Model via Relation-aware Multimodal Collaboration},
  author={Park, Jinyoung and Bae, Minseong and Na, Jeehye and Kim, Hyunwoo J},
  booktitle={AAAI},
  year={2026}
}
