Official PyTorch implementation of Rainbow Padding, a simple yet powerful strategy that resolves `<eos>` overflow in diffusion language models (dLLMs).
Visit our Project page and the arXiv paper if you are interested! This repository provides a step-by-step pipeline for SFT LoRA training and evaluation using Rainbow Padding.
If you have any questions, please contact the authors.
```bash
conda env create -f environment.yaml
conda activate rainbow
```
We follow the curation recipe introduced in Dream (arXiv:2508.15487).
The training corpus consists of 0.5M public examples; the source datasets and curation details are provided in Appendix C.1 of the paper.
⚠️ Note: Specific SFT configurations for both Dream and LLaDA were not publicly released (to the best of our knowledge).
You can directly download our preprocessed datasets from Google Drive:
```bash
# Same data with tokenization per model type.
# LLaDA SFT data
gdown --folder 1U8kVGYiWRsqWCDRsHUjeKTiDPrL0FsMp
# Dream SFT data
gdown --folder 1-oei1KRTFADMRljPqX5rPuTGcJ7fpHdI
```
We use 🤗 Accelerate for multi-GPU training.
| Argument | Description |
|---|---|
| `batch_size` | Batch size per GPU. Control the total batch size using `gradient_accumulation_steps` in `./method/sft.py`. |
| `pad_num` | Number of cyclic padding tokens. Use `0` for `<eos>` padding or any positive integer (e.g., 3, 7) for Rainbow Padding (see the sketch below). |
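The sketch below illustrates how `pad_num` cyclic padding tokens could fill a tokenized response up to a fixed length during SFT data preparation. It is only a rough illustration under assumed conventions (function name, token handling, and padding placement are not the repository's actual preprocessing code):

```python
# Illustrative sketch only: cyclic ("rainbow") padding vs. plain <eos> padding.
# pad_ids holds pad_num distinct padding-token ids; an empty list falls back to <eos> padding.
def pad_response(response_ids, eos_id, pad_ids, target_len):
    padded = list(response_ids) + [eos_id]
    i = 0
    while len(padded) < target_len:
        if pad_ids:   # Rainbow Padding: cycle through distinct padding tokens
            padded.append(pad_ids[i % len(pad_ids)])
            i += 1
        else:         # pad_num == 0: repeat <eos>, as in standard SFT padding
            padded.append(eos_id)
    return padded[:target_len]
```

With `pad_num=7`, for example, `pad_ids` would contain seven distinct token ids repeated in a fixed order, so no single token dominates the padded region of the sequence.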
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --num_processes=4 main.py --model_type=llada_base --pad_num=7
```
To resume training from a saved run, pass `--resume_dir`:
```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --num_processes=4 main.py --model_type=llada_base --pad_num=7 --resume_dir model/llada_base/sft_5e-05_lora_epoch3_rank32_pad7
```
We have uploaded our checkpoint to Hugging Face: quasar529/rainbow-padding-llada.
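If you want to load the released adapter outside the evaluation script, a minimal sketch (assuming the checkpoint is a standard PEFT LoRA adapter on top of the LLaDA base model) looks like this:

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

# LLaDA ships custom modeling code, so trust_remote_code is required.
base = AutoModel.from_pretrained(
    "GSAI-ML/LLaDA-8B-Base", trust_remote_code=True, torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("GSAI-ML/LLaDA-8B-Base", trust_remote_code=True)

# Attach the Rainbow Padding LoRA weights released on Hugging Face.
model = PeftModel.from_pretrained(base, "quasar529/rainbow-padding-llada")
```

The evaluation script passes the same identifiers via `model_path` and `lora_path`, so this manual step is normally unnecessary.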
We use the widely used LM-Eval-Harness library; our evaluation script is adapted from LLaDA's evaluation script.
You can find the evaluation script in eval/eval_llada_instruct.py.
To run evaluation, you must install specific versions of datasets and lm-eval due to dependency constraints:
```bash
pip install datasets==3.6.0 lm-eval==0.4.9.1
```
- `lm-eval==0.4.9.1` requires `datasets>=2.16.0,<4.0`.
- The latest `datasets` releases (≥4.0.0) are therefore incompatible.
- Downgrading `datasets` to 3.6.0 satisfies lm-eval's requirements and ensures stable evaluation. If you skip this step, the evaluation scripts may still run but can break unexpectedly due to mismatched APIs.
```bash
# Example: humaneval_instruct
# If you want to use wandb, set wandb_log, wandb_project, and wandb_entity in --model_args.
accelerate launch --num_processes=1 eval/eval_llada_instruct.py \
    --tasks humaneval_instruct \
    --model llada_dist \
    --batch_size 1 \
    --log_samples \
    --output_path "/home/quasar529/rainbow-padding/eval/output" \
    --confirm_run_unsafe_code \
    --model_args model_path='GSAI-ML/LLaDA-8B-Base',steps=1024,gen_length=1024,block_length=1024,lora_path='quasar529/rainbow-padding-llada',device='cuda',wandb_log=True,wandb_project='llada-eval',wandb_entity='your-wandb-entity'
```
If you want to reproduce all evaluation tasks from the paper at once, simply run the provided shell script:
```bash
sh eval/eval.sh
```
If you find this work useful, please cite:
```bibtex
@article{kim2025rainbow,
  title={Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs},
  author={Kim, Bumjun and Jeon, Dongjae and Kim, Dueun and Jeung, Wonje and No, Albert},
  journal={arXiv preprint arXiv:2510.03680},
  year={2025}
}
```
This code builds upon the open-sourced implementations of
Dream and LLaDA.
We thank the authors for releasing their resources and inspiring this work.

