[pull] main from inclusionAI:main#37

Merged
pull[bot] merged 2 commits into axistore80-coder:main from
inclusionAI:main
Apr 15, 2026


@pull pull bot commented Apr 15, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

guozhihao-224 and others added 2 commits April 15, 2026 10:55
)

* feat(service): support multi-node inference in gateway controller

Enable inference instances to span multiple physical nodes for large
models (e.g. Llama-3.1 405B with tp_size=16 across 2x 8-GPU nodes).
When nnodes > 1, the controller groups every nnodes workers into one
inference instance, allocates rendezvous ports for distributed init,
forks inference servers on all nodes per group, and only records the
head node as the HTTP endpoint. When nnodes=1 (default), behavior is
identical to the existing code.

Key changes:
- Add nnodes field to GatewayControllerConfig (default 1)
- Extend vLLMConfig.build_args()/build_cmd() with n_nodes/node_rank
- Restructure controller fork loop with grouped multi-node support
- Register forked inference processes in _forked_services for cleanup
- Fix server_infos indexing to avoid IndexError when nnodes > 1
- Data proxies only fork on head nodes (one per DP group)
- Defensive checks: worker count validation, RuntimeError over assert

Refs: #1149
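
The grouping rule described in this commit message can be illustrated with a minimal sketch. It is not the controller's actual code: `InstanceGroup`, `group_workers`, and the worker names are hypothetical, and the sketch only shows the idea of grouping every `nnodes` workers into one instance, recording the head node as the endpoint, and raising `RuntimeError` (rather than asserting) on a bad worker count.

```python
from dataclasses import dataclass


@dataclass
class InstanceGroup:
    """One inference instance spanning several workers (hypothetical type)."""
    head: str           # only the head worker is recorded as the HTTP endpoint
    members: list[str]  # all workers in the group; list index is the node rank


def group_workers(workers: list[str], nnodes: int = 1) -> list[InstanceGroup]:
    """Group every `nnodes` consecutive workers into one inference instance."""
    # Defensive checks raise RuntimeError so they survive `python -O`,
    # which strips assert statements.
    if nnodes < 1:
        raise RuntimeError(f"nnodes must be >= 1, got {nnodes}")
    if len(workers) % nnodes != 0:
        raise RuntimeError(
            f"{len(workers)} workers cannot be split into groups of {nnodes}"
        )
    groups = []
    for start in range(0, len(workers), nnodes):
        members = workers[start:start + nnodes]
        groups.append(InstanceGroup(head=members[0], members=members))
    return groups


# Four workers with nnodes=2 yield two instances and two endpoints.
workers = ["node0", "node1", "node2", "node3"]
groups = group_workers(workers, nnodes=2)
endpoints = [g.head for g in groups]
```

With `nnodes=1` every worker is its own group and its own endpoint, which matches the "identical to the existing code" default described above.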

* refactor(inference): replace nnodes with n_gpus_per_node for multi-node configuration
…SDK example (#1177)

Add AgentServiceController and Guard for production-style orchestration,
replace tau2/PydanticAI demo with Claude Agent SDK integration.

Key changes:
- Add controller/ with scheduler-based Guard creation (mirrors GatewayInferenceController)
- Add guard/ module (pass-through to areal.infra.rpc.guard)
- Add config dataclasses with __post_init__ validation
- Replace Tau2Agent with ClaudeAgent (session-persistent ClaudeSDKClient)
- Session lifecycle: close_session, Worker endpoint, DataProxy propagation
- Initialize rollback on failure, register-before-commit in scale_up
- Unregister with retry in scale_down, skip pair on failure
- Lock-protected _pairs, ThreadPoolExecutor health monitor
- Timing-safe WebSocket auth via hmac.compare_digest
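
As a rough illustration of the last bullet, a timing-safe token check with the standard library's `hmac.compare_digest` might look like the sketch below. The function name `token_matches` and the token values are assumptions for illustration; how the actual WebSocket handler obtains the presented token is not shown in this commit message.

```python
import hmac


def token_matches(presented: str, expected: str) -> bool:
    """Compare two tokens without leaking where they first differ."""
    # Encode to bytes first: compare_digest accepts str only for ASCII input.
    return hmac.compare_digest(
        presented.encode("utf-8"), expected.encode("utf-8")
    )


# Hypothetical tokens, for illustration only.
ok = token_matches("s3cret-token", "s3cret-token")
rejected = token_matches("guess", "s3cret-token")
```

A plain `==` comparison can return as soon as the first differing character is found, so its running time may reveal how much of a guess was correct; `compare_digest` is designed to avoid that content-dependent early exit.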

BREAKING CHANGE: areal.experimental.agent_service.__init__.py no longer
re-exports symbols. Import from submodules directly.
@pull pull bot locked and limited conversation to collaborators Apr 15, 2026
@pull pull bot added the ⤵️ pull label Apr 15, 2026
@pull pull bot merged commit f7e690a into axistore80-coder:main Apr 15, 2026