Pull requests: huggingface/trl

Pull requests list

docs: clarify PPO entropy metrics in PPO trainer docs
#5289 opened Mar 14, 2026 by biefan
perf(trl): set a timeout on vllm client HTTP calls
#5288 opened Mar 14, 2026 by tejasae-afk
Add reference to DeepSeekMath in accuracy_reward docstring
#5287 opened Mar 13, 2026 by qgallouedec
5 tasks
Support max_length in DPO VLM training
#5284 opened Mar 13, 2026 by albertvillanova
Add Cursor rules from AGENTS.md
#5280 opened Mar 12, 2026 by qgallouedec
Add Nemotron 3 to tests via tiny model
#5278 opened Mar 12, 2026 by sergiopaniego
5 tasks
Centralize AI agent templates in .ai
#5268 opened Mar 10, 2026 by qgallouedec
async streaming grpo w prefetch
#5250 opened Mar 9, 2026 by winglian
5 tasks
Update openenv examples to use environment_factory
#5235 opened Mar 6, 2026 by sergiopaniego
8 tasks
Simplify NeMo Gym user experience
#5156 opened Feb 24, 2026 by cmunley1
DPO padding-free
#5141 opened Feb 21, 2026 by qgallouedec Draft
5 tasks
[GKD] Buffer Implementation for Distillation Trainer
#5137 opened Feb 20, 2026 by cmpatino
3 tasks done
MGPO feature addition
#5126 opened Feb 19, 2026 by damoonsh
2 of 5 tasks