
🌈 Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs

Official PyTorch implementation of Rainbow Padding, a simple yet powerful strategy that resolves <eos> overflow in diffusion language models (dLLMs).

Visit our project page and the arXiv paper if you are interested! This repository provides a step-by-step pipeline for LoRA SFT training and evaluation with Rainbow Padding.

If you have any questions, please contact the authors.


Demo

Rainbow Padding

Rainbow Padding Demo

LLaDA Instruct

LLaDA Instruct Demo

1. Setup

1️⃣ Create the Conda Environment

conda env create -f environment.yaml

2️⃣ Activate the Environment

conda activate rainbow

2. Dataset Preparation

We follow the curation recipe introduced in Dream (arXiv:2508.15487).
The training corpus consists of 0.5M publicly available examples; the full list of sources is provided in Appendix C.1 of the paper.

⚠️ Note: To the best of our knowledge, the exact SFT configurations for Dream and LLaDA have not been publicly released.

Download pre-tokenized data (recommended)

You can directly download our preprocessed datasets from Google Drive:

# The same data, tokenized separately for each model type.

# LLaDA SFT data
gdown --folder 1U8kVGYiWRsqWCDRsHUjeKTiDPrL0FsMp

# Dream SFT data
gdown --folder 1-oei1KRTFADMRljPqX5rPuTGcJ7fpHdI
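
If you would like to inspect the downloaded data before training, the sketch below assumes the folders are Hugging Face datasets directories saved with save_to_disk; this is an assumption on our part, so adapt it if the actual on-disk format differs.

# Minimal inspection sketch. Assumption: the downloaded folder is a Hugging Face
# `datasets` directory created with save_to_disk; the path below is hypothetical --
# use the folder name that gdown actually created.
from datasets import load_from_disk

ds = load_from_disk("llada_sft_data")
print(ds)  # splits, column names, and row counts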

3. LoRA SFT Training

We use 🤗 Accelerate for multi-GPU training.

Key Arguments for main.py

Argument | Description
batch_size | Batch size per GPU. The effective (total) batch size is controlled via gradient_accumulation_steps in ./method/sft.py.
pad_num | Number of cyclic padding tokens. Use 0 for plain <eos> padding, or any positive integer (e.g., 3 or 7) for Rainbow Padding.
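
To make the pad_num semantics concrete, here is a minimal, illustrative sketch of the cyclic-padding idea; the function, token IDs, and exact token placement are ours for illustration only, and the SFT code in ./method/sft.py remains the reference.

# Illustration of cyclic (rainbow) padding vs. plain <eos> padding.
# Token IDs and placement are hypothetical; see ./method/sft.py for the real logic.
def pad_response(token_ids, target_len, eos_id, pad_ids):
    padded = list(token_ids) + [eos_id]       # the response still ends with a single <eos>
    cycle = pad_ids if pad_ids else [eos_id]  # pad_num = 0 falls back to <eos>-only padding
    i = 0
    while len(padded) < target_len:
        padded.append(cycle[i % len(cycle)])  # cycle through the distinct padding tokens
        i += 1
    return padded

# pad_num = 3: three distinct padding tokens are cycled after the final <eos>
print(pad_response([11, 12, 13], target_len=10, eos_id=2, pad_ids=[101, 102, 103]))
# -> [11, 12, 13, 2, 101, 102, 103, 101, 102, 103]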

Example: Training with 4 GPUs and 7 Rainbow Padding Tokens

1️⃣ Initial Training

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --num_processes=4 main.py --model_type=llada_base --pad_num=7

2️⃣ Continue Training from a Checkpoint

CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --num_processes=4 main.py --model_type=llada_base  --pad_num=7  --resume_dir model/llada_base/sft_5e-05_lora_epoch3_rank32_pad7
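
For reference, the effective (total) batch size is the per-GPU batch_size multiplied by the number of processes and gradient_accumulation_steps. A quick arithmetic sketch with hypothetical values:

# Hypothetical values; gradient_accumulation_steps itself is set in ./method/sft.py.
batch_size_per_gpu = 2            # --batch_size
num_processes = 4                 # --num_processes in the accelerate launch command
gradient_accumulation_steps = 8   # configured in ./method/sft.py

effective_batch_size = batch_size_per_gpu * num_processes * gradient_accumulation_steps
print(effective_batch_size)       # 64 samples per optimizer step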

4. Evaluation

We have uploaded our checkpoint to Hugging Face: quasar529/rainbow-padding-llada.

We use the widely used LM-Eval-Harness library, and our evaluation script is adapted from LLaDA's evaluation script. You can find it in eval/eval_llada_instruct.py.

⚠️ Dependency Notice

To run evaluation, you must install specific versions of datasets and lm-eval due to dependency constraints:

pip install datasets==3.6.0 lm-eval==0.4.9.1
  • lm-eval==0.4.9.1 requires datasets>=2.16.0,<4.0.
  • The latest datasets releases (≥4.0.0) therefore violate this constraint.
  • Downgrading datasets to 3.6.0 satisfies lm-eval's requirements and ensures stable evaluation. If you skip this step, the evaluation scripts may still run but can break unexpectedly due to mismatched APIs.
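
If you want to double-check the pinned versions before launching a long evaluation run, a quick sanity-check sketch (not part of the repository) looks like this:

# Convenience sketch for verifying the pinned versions; the distribution name
# lm_eval may also appear as lm-eval depending on your environment.
from importlib.metadata import PackageNotFoundError, version

for dist, expected in [("datasets", "3.6.0"), ("lm_eval", "0.4.9.1")]:
    try:
        installed = version(dist)
    except PackageNotFoundError:
        installed = "not installed"
    status = "OK" if installed == expected else "MISMATCH"
    print(f"{dist}: installed={installed}, expected={expected} -> {status}")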

Example Command

# Example: humaneval_instruct
# To log to wandb, set wandb_log, wandb_project, and wandb_entity in --model_args (as in the last line below).
accelerate launch --num_processes=1 eval/eval_llada_instruct.py \
  --tasks humaneval_instruct \
  --model llada_dist \
  --batch_size 1 \
  --log_samples \
  --output_path "/home/quasar529/rainbow-padding/eval/output" \
  --confirm_run_unsafe_code \
  --model_args model_path='GSAI-ML/LLaDA-8B-Base',steps=1024,gen_length=1024,block_length=1024,lora_path='quasar529/rainbow-padding-llada',device='cuda',wandb_log=True,wandb_project='llada-eval',wandb_entity='your-wandb-entity'
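
The --model_args value is a single comma-separated list of key=value pairs. If you script several runs, a small helper like the hypothetical one below can assemble it; the parameter names are copied from the command above, and values containing commas would need extra care.

# Hypothetical helper for assembling the --model_args string programmatically.
def build_model_args(**kwargs):
    return ",".join(f"{key}={value}" for key, value in kwargs.items())

model_args = build_model_args(
    model_path="GSAI-ML/LLaDA-8B-Base",
    steps=1024,
    gen_length=1024,
    block_length=1024,
    lora_path="quasar529/rainbow-padding-llada",
    device="cuda",
)
print(model_args)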

If you want to reproduce all evaluation tasks performed in the paper at once, you can simply run the provided shell script:

sh eval/eval.sh

5. Citation

If you find this work useful, please cite:

@article{kim2025rainbow,
  title={Rainbow Padding: Mitigating Early Termination in Instruction-Tuned Diffusion LLMs},
  author={Kim, Bumjun and Jeon, Dongjae and Kim, Dueun and Jeung, Wonje and No, Albert},
  journal={arXiv preprint arXiv:2510.03680},
  year={2025}
}

6. Acknowledgements

This code builds upon the open-sourced implementations of
Dream and LLaDA.
We thank the authors for releasing their resources and inspiring this work.

