
[pull] main from inclusionAI:main #21

Merged
pull[bot] merged 4 commits into axistore80-coder:main from inclusionAI:main
Mar 31, 2026

Conversation


@pull pull bot commented Mar 31, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

garrett4wade and others added 4 commits March 31, 2026 10:35
Restore Qwen VLM training by matching FSDP weight names to the layout expected by SGLang during distributed updates.

Key changes:
- special-case Qwen VL parameter naming for SGLang weight sync
- pin the cuDNN dependency needed for SGLang vision runs
- replace the skipped VLM example with Geometry3K coverage for SGLang and vLLM
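The weight-name matching described above can be sketched as a small remapping helper. This is an illustrative assumption, not the repo's actual code: both the FSDP wrapper prefix and the `model.visual.` special case below are hypothetical names chosen to show the pattern.

```python
# Hypothetical sketch of the Qwen VL special case: translate an
# FSDP-side parameter name into the name SGLang expects during a
# distributed weight update. The prefixes are assumptions.

def remap_qwen_vl_name(fsdp_name: str) -> str:
    # FSDP nests parameters under wrapper prefixes; strip them first.
    name = fsdp_name.replace("_fsdp_wrapped_module.", "")
    # Assumed special case: SGLang serves the vision tower under
    # "visual." rather than "model.visual.".
    if name.startswith("model.visual."):
        name = name[len("model."):]
    return name
```

Language-model parameters pass through unchanged; only vision-tower names are rewritten.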

Enable blockwise 128x128 FP8 e4m3fn matmuls for the Archon engine via
torchao, with BF16 master weights and on-the-fly quantization.

Key changes:
- Add FP8 linear patching (enable_fp8_linear) and expert patching
  (enable_fp8_experts) with deepcopy-safe types.MethodType binding
- Add FP8 checkpoint detection, preparation, and dequantization
  with CPU fallback and Shard(0) DTensor support
- Add ArchonFP8Config with mode/exclude_modules/include_experts/use_triton
- Add post-parallelism shard alignment validation for TP safety
- Disable torch.compile when FP8 is active (incompatible with 2D scales)
- Add a comprehensive test suite: forward/backward correctness, checkpoint
  detection/preparation/dequantization, MoE dispatch, and distributed
  sharded dequantization
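The deepcopy-safe `types.MethodType` binding noted above works because Python's `copy` module rebinds bound methods to the copied instance. A minimal stand-in of the patching pattern, with no torch and no real FP8 math (the `Linear` class and `enable_fp8_linear_sketch` name are hypothetical, loosely modeled on the `enable_fp8_linear` mentioned in the commit):

```python
import copy
import types

class Linear:
    """Toy stand-in for a framework linear layer (illustration only)."""
    def __init__(self, scale):
        self.scale = scale
    def forward(self, x):
        return x * self.scale

def enable_fp8_linear_sketch(module):
    # Sketch of the patching pattern: wrap the original forward. The
    # real patch would quantize to FP8 e4m3fn with blockwise 128x128
    # scales before the matmul; this stand-in just delegates.
    orig = type(module).forward
    def fp8_forward(self, x):
        return orig(self, x)
    # Binding via types.MethodType is deepcopy-safe: copy.deepcopy
    # rebinds the stored method to the copied instance, not the original.
    module.forward = types.MethodType(fp8_forward, module)
```

After `copy.deepcopy`, the clone's patched `forward` is bound to the clone itself, so the override survives model cloning; a plain function stored on the instance would not rebind this way.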

…ervice (#1112)

Wire VLLMBridgeBackend into the full inference service stack with feature
parity to the SGLang path, including per-process env isolation,
context-window capping, and end-to-end test coverage.

Key changes:
- Add backend_type config field and --backend-type CLI arg for data proxy
- Wire VLLMBridgeBackend selection in data proxy app factory
- Delegate InfBridge abort/resubmit to backend protocol methods instead
  of hardcoded SGLang payload access
- Cap max_new_tokens in VLLMBridgeBackend to match SGLangBridgeBackend
- Add env override support to RPCGuard /fork endpoint
- Pass TRITON_CACHE_PATH, VLLM_CACHE_ROOT, VLLM_ALLOW_RUNTIME_LORA_UPDATING
  as per-process env vars when launching vLLM inference servers
- Add launch_vllm_server() helper to integration_utils
- Add 8 vLLM unit tests and 7 vLLM e2e integration test variants
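The per-process env isolation above can be sketched as building a dedicated environment per server launch. `vllm_server_env` is a hypothetical helper (the three variable names come from the commit message; the cache layout and the "1" value are assumptions):

```python
import os

def vllm_server_env(cache_dir: str) -> dict:
    # Start from the parent environment, then point each cache at a
    # per-process directory so concurrently launched vLLM servers do
    # not clash on disk.
    env = dict(os.environ)
    env["TRITON_CACHE_PATH"] = os.path.join(cache_dir, "triton")
    env["VLLM_CACHE_ROOT"] = os.path.join(cache_dir, "vllm")
    # Allow LoRA adapters to be swapped at runtime (value assumed).
    env["VLLM_ALLOW_RUNTIME_LORA_UPDATING"] = "1"
    return env
```

The resulting dict would then be passed as the child process environment when launching each vLLM inference server.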
@pull pull bot locked and limited conversation to collaborators Mar 31, 2026
@pull pull bot added the ⤵️ pull label Mar 31, 2026
@pull pull bot merged commit d366571 into axistore80-coder:main Mar 31, 2026
