Time-Annealed Perturbation Sampling (TAPS) is an inference-time method for improving diversity in diffusion language models without sacrificing generation quality.
This repository contains the official implementation of TAPS and the code used to reproduce experiments reported in the paper.
*Figure: a conceptual comparison of the inference process between the base Diffusion-LM and TAPS, illustrating their different context-conditioning behaviors.*
This repository supports two diffusion language model backbones:
| Backbone | Hugging Face | Loader |
|---|---|---|
| LLaDA-8B-Instruct | [GSAI-ML/LLaDA-8B-Instruct](https://huggingface.co/GSAI-ML/LLaDA-8B-Instruct) | `transformers.AutoModel` |
| TraDo-8B-Instruct | [Gen-Verse/TraDo-8B-Instruct](https://huggingface.co/Gen-Verse/TraDo-8B-Instruct) | `transformers.AutoModelForCausalLM` |
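The table above can be read as a mapping from checkpoint to loader class. The helper below is a hypothetical sketch of that mapping, not code from this repository; in particular, `trust_remote_code=True` is an assumption (custom diffusion-LM architectures often require it), and any extra `from_pretrained` keyword arguments are left to the caller:

```python
# Hypothetical helper mapping each backbone to its transformers Auto class.
LOADERS = {
    "GSAI-ML/LLaDA-8B-Instruct": "AutoModel",
    "Gen-Verse/TraDo-8B-Instruct": "AutoModelForCausalLM",
}

def loader_name_for(repo_id: str) -> str:
    """Return the name of the transformers Auto class for a backbone."""
    return LOADERS[repo_id]

def load_backbone(repo_id: str, **kwargs):
    """Load a backbone with its matching Auto class (downloads weights)."""
    import transformers  # imported lazily so the mapping is usable offline
    loader = getattr(transformers, loader_name_for(repo_id))
    # trust_remote_code=True is an assumption for custom model code
    return loader.from_pretrained(repo_id, trust_remote_code=True, **kwargs)
```

For example, `load_backbone("GSAI-ML/LLaDA-8B-Instruct", torch_dtype="bfloat16")` would dispatch to `AutoModel.from_pretrained`.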
This project uses two separate Python environments:

- `llada`: for LLaDA-related experiments
- `trado`: for TraDo-related experiments
```bash
# Clone the repository
git clone https://github.com/Johnny221B/TAPS.git
cd TAPS

# Create the llada environment
python -m venv envs/llada
source envs/llada/bin/activate
pip install --upgrade pip
pip install -r requirements_llada.txt

# Create the trado environment
python -m venv envs/trado
source envs/trado/bin/activate
pip install --upgrade pip
pip install -r requirements_trado.txt
```

Evaluation is supported on the following benchmarks:

- GSM8K
- WritingPrompts
- NoveltyBench
- Arena-Hard-Auto
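The `--cond_embed_noise_std`, `--cond_noise_start`, and `--cond_noise_until` flags in the evaluation commands below suggest that Gaussian noise is injected into the condition embeddings only within a window of diffusion time. The sketch below illustrates one plausible schedule of this kind; the linear anneal and the exact window semantics are assumptions for illustration, and the actual TAPS schedule is the one defined in the paper:

```python
import numpy as np

def perturb_condition(cond_embeds, step, total_steps,
                      noise_std=0.35, start=0.05, until=0.95):
    """Add Gaussian noise to condition embeddings inside a time window.

    Hypothetical sketch: noise is active only while the fraction of
    completed steps lies in [start, until), and its scale is annealed
    linearly toward zero as denoising progresses.
    """
    frac = step / total_steps
    if not (start <= frac < until):
        return cond_embeds  # outside the window: condition left untouched
    # Linear anneal: full noise_std at `start`, zero at `until` (an assumption)
    scale = noise_std * (until - frac) / (until - start)
    return cond_embeds + scale * np.random.randn(*np.shape(cond_embeds))
```

With the defaults above, noise would be applied for roughly the middle 90% of the denoising trajectory and fade out as generation completes, which matches the intuition of perturbing early, exploratory steps more than late, committed ones.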
Example: evaluating LLaDA with TAPS (embedding-noise mode) on WritingPrompts:

```bash
cd TAPS
CUDA_VISIBLE_DEVICES=0,1 accelerate launch --num_processes 2 -m benchmarks.writingprompts.eval_llada_wp \
  --model_path /path/to/llada \
  --mode embedding \
  --dataset euclaise/writingprompts \
  --num_prompts 50 \
  --num_samples 16 \
  --temperature 0.7 \
  --cfg 0.0 \
  --cond_embed_noise_std 0.35 \
  --cond_noise_start 0.05 \
  --cond_noise_until 0.95 \
  --cond_embed_impl hook \
  --steps 512 \
  --gen_length 512 \
  --block_length 256 \
  --empty_cache_every 20
```

Example: evaluating TraDo with TAPS on WritingPrompts:

```bash
cd TAPS
CUDA_VISIBLE_DEVICES=0 python -m benchmarks.writingprompts.eval_trado_wp \
  --run_name trado_embedding_run \
  --mode embedding \
  --model_path /path/to/trado \
  --num_prompts 25 \
  --num_samples 16 \
  --gen_length 512 \
  --steps 4 \
  --block_length 4 \
  --temperature 0.8 \
  --seed 1234 \
  --cond_embed_noise_std 0.40 \
  --top_k 0 \
  --top_p 1.0 \
  --min_p 0.0
```

If you find TAPS useful, please cite:

```bibtex
@misc{wu2026timeannealedperturbationsamplingdiverse,
  title={Time-Annealed Perturbation Sampling: Diverse Generation for Diffusion Language Models},
  author={Jingxuan Wu and Zhenglin Wan and Xingrui Yu and Yuzhe Yang and Yiqiao Huang and Ivor Tsang and Yang You},
  year={2026},
  eprint={2601.22629},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2601.22629},
}
```
This project is released under the MIT License. See the LICENSE file for the full text.
SPDX-License-Identifier: MIT
