Claude (SDK) chat + tool-calling, budgeting, and docs/test updates by akhilsinghcodes · Pull Request #1 · akhilsinghcodes/agents_fleet

akhilsinghcodes · 2026-06-01T10:11:25Z

Summary
This PR expands Agents Fleet beyond PTY/CLI session monitoring by adding a Claude (SDK) chat experience built on the Anthropic SDK. It adds agentic tool-calling that runs repo commands behind Approve/Reject gating, tighter budgeting (including enforcement inside tool loops), and documentation and tests covering the new capabilities.

Key Features Added

Claude (SDK) Chat Sessions

Added a chat-first Claude SDK UI (React) with:
- per-session transcript rendering
- session id display for copy/reference within Agents Fleet
- “New chat” flow that resets the draft + creates a fresh session
Persisted Claude SDK artifacts to SQLite:
- config (claude_sdk_config_v1)
- user messages (claude_sdk_user_message_v1)
- assistant messages (claude_sdk_assistant_message_v1)
- usage snapshots (claude_sdk_usage_v1)
- tool approvals (claude_sdk_tool_approval_v1)
- tool results (claude_sdk_tool_result_v1)
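
The artifact kinds listed above could be modelled as a string-literal union so that a misspelled kind fails at compile time. This is a minimal sketch, not the PR's actual code: only the kind strings come from this description, while the `Artifact` shape and the `isClaudeSdkKind` and `latestOfKind` helpers are hypothetical.

```typescript
// Kind strings are taken from the PR description. The Artifact shape and the
// helper names are illustrative assumptions, not the repository's schema.
const CLAUDE_SDK_ARTIFACT_KINDS = [
  "claude_sdk_config_v1",
  "claude_sdk_user_message_v1",
  "claude_sdk_assistant_message_v1",
  "claude_sdk_usage_v1",
  "claude_sdk_tool_approval_v1",
  "claude_sdk_tool_result_v1",
] as const;

type ClaudeSdkArtifactKind = (typeof CLAUDE_SDK_ARTIFACT_KINDS)[number];

interface Artifact {
  sessionId: string;
  kind: ClaudeSdkArtifactKind;
  payloadJson: string; // JSON-encoded payload (assumed storage format)
  createdAt: number; // epoch milliseconds (assumed)
}

// Type guard for kind strings read back from storage.
function isClaudeSdkKind(kind: string): kind is ClaudeSdkArtifactKind {
  return CLAUDE_SDK_ARTIFACT_KINDS.some((k) => k === kind);
}

// Return the newest artifact of a given kind, or undefined if none exists.
function latestOfKind(
  artifacts: Artifact[],
  kind: ClaudeSdkArtifactKind,
): Artifact | undefined {
  let latest: Artifact | undefined;
  for (const a of artifacts) {
    if (a.kind !== kind) continue;
    if (latest === undefined || a.createdAt > latest.createdAt) latest = a;
  }
  return latest;
}
```

A lookup such as `latestOfKind(artifacts, "claude_sdk_usage_v1")` would then give the UI the most recent usage snapshot for a session.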

WebSocket Streaming (Chat + Tools)

Extended /ws beyond PTY streaming to support Claude SDK:
- assistant streaming events (claude_sdk_chunk, claude_sdk_done)
- tool request + output events (claude_sdk_tool_request, claude_sdk_tool_output)
- client tool decisions (claude_sdk_tool_decision) to approve/reject commands
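
One way to keep these /ws messages type-safe is a discriminated union keyed on the event name, plus a validator for the client-sent decision. The sketch below is an assumption-heavy illustration: the event names come from this PR, but every payload field (`sessionId`, `toolUseId`, `text`, `command`, `output`, `decision`) and both function names are hypothetical.

```typescript
// Event type names come from the PR description; all payload fields are assumed.
type ServerEvent =
  | { type: "claude_sdk_chunk"; sessionId: string; text: string }
  | { type: "claude_sdk_done"; sessionId: string }
  | { type: "claude_sdk_tool_request"; sessionId: string; toolUseId: string; command: string }
  | { type: "claude_sdk_tool_output"; sessionId: string; toolUseId: string; output: string };

interface ToolDecisionMessage {
  type: "claude_sdk_tool_decision";
  sessionId: string;
  toolUseId: string;
  decision: "approve" | "reject";
}

// Validate an untrusted client frame before acting on it.
// Returns null for anything that is not a well-formed tool decision.
function parseToolDecision(raw: string): ToolDecisionMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const { type, sessionId, toolUseId, decision } = data as Record<string, unknown>;
  if (type !== "claude_sdk_tool_decision") return null;
  if (typeof sessionId !== "string" || typeof toolUseId !== "string") return null;
  if (decision !== "approve" && decision !== "reject") return null;
  return { type: "claude_sdk_tool_decision", sessionId, toolUseId, decision };
}

// Client side: fold streamed chunks into the assistant text being rendered.
function appendEvent(transcript: string, event: ServerEvent): string {
  switch (event.type) {
    case "claude_sdk_chunk":
      return transcript + event.text;
    default:
      return transcript; // other events do not change the assistant text
  }
}
```

Rejecting malformed frames at the boundary means the server only ever unblocks a command on an explicit, well-formed approve or reject.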

Tool-Calling: `run_command` (Any Shell Command)

Implemented Anthropic tool-calling in the Claude SDK turn runner:
- tool: run_command({ command })
- executes commands in the session repo working directory
Added explicit user gating:
- UI shows each tool request inline with Approve / Reject
- server blocks execution until a decision is received
Added command output limits:
- tool output capped to 100KB to protect context and budgets
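
The three pieces above (tool definition, approval gate, output cap) can be sketched as follows. The tool object uses the `name` / `description` / `input_schema` layout of the Anthropic Messages API; the description text, the `ApprovalGate` class, `capOutput`, and the reading of 100KB as 100 * 1024 bytes are assumptions for illustration, not the PR's implementation.

```typescript
// Tool definition in the Anthropic Messages API tool format.
// The description strings are illustrative.
const RUN_COMMAND_TOOL = {
  name: "run_command",
  description: "Run a shell command in the session's repository working directory.",
  input_schema: {
    type: "object",
    properties: {
      command: { type: "string", description: "Shell command to execute" },
    },
    required: ["command"],
  },
} as const;

type Decision = "approve" | "reject";

// Hypothetical gate: the turn runner awaits waitForDecision() and the
// WebSocket handler calls decide() when the user clicks Approve or Reject.
class ApprovalGate {
  private pending = new Map<string, (d: Decision) => void>();

  waitForDecision(toolUseId: string): Promise<Decision> {
    return new Promise<Decision>((resolve) => {
      this.pending.set(toolUseId, resolve);
    });
  }

  // Returns false when no request with that id is waiting.
  decide(toolUseId: string, decision: Decision): boolean {
    const resolve = this.pending.get(toolUseId);
    if (!resolve) return false;
    this.pending.delete(toolUseId);
    resolve(decision);
    return true;
  }

  get pendingCount(): number {
    return this.pending.size;
  }
}

const MAX_TOOL_OUTPUT_BYTES = 100 * 1024; // assumes "100KB" means 102,400 bytes
const TRUNCATION_MARKER = "\n[output truncated]";

// Cap output by UTF-8 byte length. The marker is appended after the cap, and
// a multi-byte character split at the boundary decodes as U+FFFD.
function capOutput(output: string, maxBytes: number = MAX_TOOL_OUTPUT_BYTES): string {
  const bytes = new TextEncoder().encode(output);
  if (bytes.length <= maxBytes) return output;
  return new TextDecoder().decode(bytes.subarray(0, maxBytes)) + TRUNCATION_MARKER;
}
```

Because the command only runs after `waitForDecision` resolves with "approve", a dropped connection simply leaves the request pending rather than executing anything.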

Budgeting Improvements

Budget enforcement for Claude SDK sessions:
- preflight enforcement on send
- enforcement during tool loops, not just at turn start
Model-aware USD estimation:
- introduced computeModelCostUsd with a best-effort model pricing table
- Claude SDK cost calculations use model pricing + SDK usage when available
Fixed a critical bug:
- the HTTP fallback route previously called processManager.stopSession(...) for Claude SDK sessions, which is a no-op for them
- it now updates the sessions row directly, setting stop_reason='budget_exceeded' along with timestamps and status
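
To make the budgeting flow concrete, here is a small sketch of model-aware cost estimation with a preflight check and a check after every model call in a tool loop. Only the name `computeModelCostUsd` and the `budget_exceeded` stop reason come from this PR; the signature, the placeholder prices (not real Anthropic pricing), the fallback rule, and the `>=` comparison are all assumptions.

```typescript
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

interface ModelPricing {
  inputUsdPerMTok: number; // USD per million input tokens
  outputUsdPerMTok: number; // USD per million output tokens
}

// Placeholder models and prices for illustration only, NOT real pricing.
const PRICING_TABLE: { [model: string]: ModelPricing | undefined } = {
  "example-model-small": { inputUsdPerMTok: 1, outputUsdPerMTok: 5 },
  "example-model-large": { inputUsdPerMTok: 10, outputUsdPerMTok: 50 },
};

// Assumption: unknown models fall back to the most expensive entry so that
// the estimate errs on the side of stopping early.
const FALLBACK_PRICING: ModelPricing = { inputUsdPerMTok: 10, outputUsdPerMTok: 50 };

function computeModelCostUsd(model: string, usage: TokenUsage): number {
  const p = PRICING_TABLE[model] ?? FALLBACK_PRICING;
  return (
    (usage.inputTokens * p.inputUsdPerMTok + usage.outputTokens * p.outputUsdPerMTok) /
    1_000_000
  );
}

type StopReason = "completed" | "budget_exceeded";

interface TurnResult {
  stopReason: StopReason;
  spentUsd: number;
  stepsRun: number;
}

// Each element of `steps` stands in for the usage reported by one model call
// in the tool loop. A real runner would await the SDK call instead.
function runTurnWithBudget(
  model: string,
  budgetUsd: number,
  spentUsd: number,
  steps: TokenUsage[],
): TurnResult {
  // Preflight: refuse to start when the budget is already used up.
  if (spentUsd >= budgetUsd) return { stopReason: "budget_exceeded", spentUsd, stepsRun: 0 };
  let stepsRun = 0;
  for (const usage of steps) {
    spentUsd += computeModelCostUsd(model, usage);
    stepsRun++;
    // Mid-loop: re-check after every call, not only at turn start.
    if (spentUsd >= budgetUsd) return { stopReason: "budget_exceeded", spentUsd, stepsRun };
  }
  return { stopReason: "completed", spentUsd, stepsRun };
}
```

Checking after each call bounds the overshoot to the cost of a single model call, which is the main benefit of enforcing inside the loop.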

Usage Visibility

UI fetches latest claude_sdk_usage_v1 artifact after a turn and displays:
- input tokens
- output tokens
- thinking tokens (if present)
- cache read/write tokens (if present)
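
The "if present" handling above might look like the following sketch. The `input_tokens`, `output_tokens`, `cache_read_input_tokens`, and `cache_creation_input_tokens` names follow the usage object of the Anthropic Messages API; `thinking_tokens`, the assumption that the artifact payload mirrors that object, and the `formatUsage` helper are hypothetical.

```typescript
// Assumed payload of a claude_sdk_usage_v1 artifact.
interface UsageSnapshot {
  input_tokens: number;
  output_tokens: number;
  thinking_tokens?: number | null; // hypothetical field name
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Build display lines, omitting optional counters that are absent or null.
function formatUsage(u: UsageSnapshot): string[] {
  const lines = [`Input tokens: ${u.input_tokens}`, `Output tokens: ${u.output_tokens}`];
  if (u.thinking_tokens != null) lines.push(`Thinking tokens: ${u.thinking_tokens}`);
  if (u.cache_read_input_tokens != null) {
    lines.push(`Cache read tokens: ${u.cache_read_input_tokens}`);
  }
  if (u.cache_creation_input_tokens != null) {
    lines.push(`Cache write tokens: ${u.cache_creation_input_tokens}`);
  }
  return lines;
}
```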

Tests Added

Added server test verifying Claude SDK USD budget enforcement:
- apps/server/test/claude_sdk_budget.test.ts
- uses a mocked Anthropic SDK (no network)

Docs Updated

ROADMAP.md
- marked Claude SDK chat/tooling/budgeting items as done
- added next steps: pricing configurability + further budget accuracy/testing
README.md
- updated WebSocket description to include SDK events
- added Claude SDK section + screenshots:
  - budget stop
  - tool call + output
  - tool permission gate
ARCHITECTURE.md
- updated system diagram and narrative to include Claude SDK path
- documented artifact kinds for Claude SDK sessions

Notes / Follow-ups

Model pricing table is best-effort and should be made configurable (env/JSON) for accuracy across accounts/contracts.
WS tool-loop budget enforcement relies on estimated/model-based cost; accuracy improves when SDK usage is present.
Additional WS-level integration tests (tool approvals + mid-loop budget cutoff) can be added next.

Validation

pnpm -r typecheck
pnpm -r build
pnpm -r -F @agents_fleet/server test

Akhil Singh added 7 commits June 1, 2026 02:05

- c6c4708 Add Claude SDK chat support
- e421c99 Add Claude SDK chat tool support
- 83f312c Load model pricing from config
- dd5bc3d Add configurable model pricing defaults
- 966071d Add remote pricing API support
- 1b17625 Document Claude SDK API key requirement
- d1b2548 Update README.md

akhilsinghcodes merged commit c82b42e into main Jun 1, 2026
1 check passed

akhilsinghcodes deleted the feature/claude_enhancement branch June 1, 2026 10:42
