Add 32b fully async swe agent training example in skyrl #26

Open
Weili-0234 wants to merge 4 commits into ThunderAgent-org:dev/skyrl-fully-async-swe-32b from Weili-0234:dev/skyrl-32b-fully-async

Conversation

@Weili-0234
Collaborator

This PR extends the existing ThunderAgent SkyRL example with a fully-async RL example for 32B Qwen3. Changes:

  • Add a complete 32B training example, including launch orchestration, profiling, and SWE-bench image pre-pull
  • Add a 6-node SLURM reproducibility pipeline and separate wrappers for the default and TR routers
  • Update the training/inference integration path for fully-async Megatron workloads (generation loop, inference endpoint/client integration, worker/runtime behavior, and related trainer logic)
  • Add docs and setup templates for credentials and cluster-specific configs

@HaoKang-Timmy
Collaborator

I wonder if we could separate this into another branch of this repo so we can keep improving it. Thanks!

@Weili-0234
Collaborator Author

Yeah sure!

Weili-0234 marked this pull request as draft March 4, 2026 21:12
Weili-0234 changed the base branch from main to dev/skyrl-fully-async-swe-32b March 4, 2026 21:17
Weili-0234 marked this pull request as ready for review March 4, 2026 21:18

2 participants