Puzer/Finetuning-LLM-agent-using-RL

🚧 Upcoming: FunctionGemma × GRPO Agent Training

I am currently researching how to fine-tune Google's newly released FunctionGemma with TRL v0.26+, focusing on GRPO for agent training.

The goal is to fine-tune an agent inside an interactive environment, using the new tool-use support in GRPOTrainer.

Which environment? I don't know yet, lol.
The spectrum ranges from a simple game to a coding/research agent.
It might be useful, maybe not, but it will for sure be fun :)
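
While the environment is still undecided, the core idea behind GRPO can be illustrated in plain Python: each completion sampled for a prompt is scored, and its advantage is its reward normalized against the other completions in the same group. This is only a rough sketch, not the training code for this project. The function name, the `eps` value, the use of population standard deviation, and the reward values are all assumptions made for illustration, and TRL's actual implementation may differ in details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against its own group.

    A "group" is the set of completions sampled for a single prompt.
    eps avoids division by zero when every completion gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for 4 completions of the same prompt,
# e.g. 1.0 when the agent's tool call solved the task, 0.0 otherwise.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that beat their group's average get a positive advantage and are reinforced; the rest get a negative one. When every completion in a group receives the same reward, all advantages are zero and that prompt contributes no learning signal.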

📅 ETA: A full blog post and code breakdown will be published here in Q1 2026.

Star or Watch this repository to get notified when the post goes live!
