generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Remove custom get_train/eval_dataloader from OnlineDPO
#5291
opened Mar 16, 2026 by
albertvillanova
Remove TrainingArguments import from experimental trainers
#5290
opened Mar 16, 2026 by
albertvillanova
perf(trl): set a timeout on vllm client HTTP calls
#5288
opened Mar 14, 2026 by
tejasae-afk
Add reference to DeepSeekMath in accuracy_reward docstring
#5287
opened Mar 13, 2026 by
qgallouedec
5 tasks
Prevent corruption of DPO VLM training if "keep_end" truncation_mode
#5286
opened Mar 13, 2026 by
albertvillanova
Fix accuracy_reward crash when called from non-main thread
#5281
opened Mar 13, 2026 by
qgallouedec
Introduce backend rollout-completions interface and decouple OpenEnv helper from vLLM internals
#5256
opened Mar 10, 2026 by
rycerzes
batch params together in weight sync and async update the weights
#5249
opened Mar 9, 2026 by
winglian
5 tasks
Introduce minimal generation backend interface for GRPO and RLOO trainers
#5244
opened Mar 8, 2026 by
rycerzes
feat: log raw importance ratios and fraction of truncation/masking in vLLM importance sampling correction
#5243
opened Mar 8, 2026 by
muupan
1 of 5 tasks
[GRPO] Fix re-tokenization bug in tool-calling loop by concatenating token IDs
#5242
opened Mar 7, 2026 by
qgallouedec
Update openenv examples to use environment_factory
#5235
opened Mar 6, 2026 by
sergiopaniego
8 tasks
Allow reward functions to log extra columns and scalar metrics
#5233
opened Mar 6, 2026 by
manueldeprada
vLLM Server Sync via LoRA Adapter Reload (avoid merge + full weight sync) for GRPO
#5188
opened Feb 26, 2026 by
lfranceschetti
[GKD] Buffer Implementation for Distillation Trainer
#5137
opened Feb 20, 2026 by
cmpatino
3 tasks done