
[pull] main from inclusionAI:main #29

Merged

pull[bot] merged 7 commits into axistore80-coder:main from inclusionAI:main
Apr 8, 2026
Conversation


pull[bot] commented Apr 8, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

rchardx and others added 7 commits April 8, 2026 11:23
Keep no-drop evaluation stable when batches do not divide evenly
across DP workers. Pad eval inputs with zero-weight dummy items
and make engine loss reduction tolerate local empty shards so
distributed and pipeline-parallel paths stay synchronized.

Key changes:
- pad evaluate_* dispatches to preserve DP and RW pairing invariants
- skip zero-weight local loss work in FSDP, Megatron, and Archon
- add eval dispatch regression tests and refresh CLI reference docs
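The padding idea behind this commit can be sketched in plain Python. This is a minimal illustration under assumed names (`pad_eval_batch`, `weighted_loss`, and the `(value, weight)` pair layout are all hypothetical, not the repository's API): dummy items carry zero weight so they divide the batch evenly across DP workers without perturbing the weighted loss, and the local reduction tolerates a shard that holds only padding.

```python
def pad_eval_batch(items, dp_size):
    """Pad (value, weight) pairs with zero-weight dummies so the
    batch divides evenly across dp_size data-parallel workers."""
    remainder = len(items) % dp_size
    if remainder:
        filler = items[-1][0]  # reuse any value as filler
        items = items + [(filler, 0.0)] * (dp_size - remainder)
    return items


def weighted_loss(shard):
    """Reduce one local shard, tolerating the all-padding case."""
    total_weight = sum(w for _, w in shard)
    if total_weight == 0.0:  # shard contains only zero-weight dummies
        return 0.0
    return sum(v * w for v, w in shard) / total_weight
```

Because every worker sees the same number of items and dummies contribute zero weight, `weighted_loss` over the padded batch equals the mean over the real items only.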
* feat(ci): separate vllm and sglang pyproject.toml

* fix(ci): support vllm pyproject in docker and install tests

* fix(ci): correct Dockerfile RUN chaining for pyproject swap

* fix(ci): avoid read-only pyproject bind mount writes

* fix(ci): validate docker variant and sync vllm lockfile docs

Fail fast on invalid Docker VARIANT values to prevent silently building the wrong backend image, and align vLLM setup instructions with CI by copying uv.vllm.lock when swapping pyproject variants.

Key changes:

- validate VARIANT via case statement in Dockerfile

- update README/docs/agent guidance with uv.vllm.lock copy step

- regenerate CLI reference docs via pre-commit hook
…rging PP shards (#1145)

* fix(engine): XCCL lora weights update was being overwritten when pp>1

* chore: addressed gemini comments
Ranks with no gradients (e.g. frozen non-LoRA params) previously
returned 0.0 immediately, skipping the all_reduce. Ranks that do
have gradients then hang waiting for the collective to complete.

Move device init before the empty-grads check and make zero-grad
ranks still participate in all_reduce with a zero-valued tensor.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
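The hang described above follows a common collectives pitfall: an early return on one rank makes the participant count non-uniform, so the other ranks block forever. A minimal sketch of the fixed pattern, simulating the collective with a plain sum instead of a real `all_reduce` (all names here are illustrative, not the repository's code):

```python
def grad_sq_contribution(local_grads):
    """Per-rank contribution to a (simulated) all_reduce.

    Buggy pattern: a rank with no gradients returned early and never
    entered the collective, leaving the remaining ranks blocked.
    Fixed pattern: every rank contributes a value -- zero when it has
    nothing -- so the collective always sees all participants.
    """
    if not local_grads:
        return 0.0  # still participates, with a zero-valued contribution
    return sum(g * g for g in local_grads)


def all_reduce_sum(per_rank_contributions):
    """Stand-in for the collective: each rank appears exactly once."""
    return sum(per_rank_contributions)
```

With a real backend (e.g. `torch.distributed.all_reduce`), the same rule applies: allocate the device tensor before any empty-grads check, and let zero-grad ranks pass a zero tensor into the collective rather than skipping it.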
pull[bot] locked and limited conversation to collaborators Apr 8, 2026
pull[bot] added the ⤵️ pull label Apr 8, 2026
pull[bot] merged commit 595a3c4 into axistore80-coder:main Apr 8, 2026


5 participants