assembly code/live: gateway tool-calling resilience, voice interruptibility, gpt-5.1 default by alexkroman · Pull Request #251 · AssemblyAI/cli

alexkroman · 2026-06-18T22:01:44Z

Fixes uncovered while driving assembly code (and assembly live) against the LLM Gateway, plus the gpt-5.1 default switch. Each gateway issue is also reported upstream for a server-side fix; the CLI-side changes keep the CLI resilient regardless of the upstream timeline, and all of them become harmless no-ops once the gateway is fixed.

Commits

read_skill tool + empty tool-args workaround (57bea22): the skills middleware loads skills from ~/.claude/skills, but the prompt told the model to open them with the cwd-bound read_file, which cannot reach them; this commit adds a dedicated read_skill tool. Separately, the gateway drops Anthropic's required tool_use.input when the OpenAI arguments field is empty ("" or "{}"), which fails with a 400 (surfaced as a 500 when streaming) and wedges the session; outgoing empty arguments now get a placeholder.
Drop spurious blank tool-call delta (11b2027): every streamed turn starts with an empty tool-call delta {"function":{"id":"","name":"","arguments":""}}; on a pure-text turn that became a tool call with name="" (Error: is not a valid tool). It is now dropped before langchain sees it.
Voice readback interruptible on the daemon thread (6ff3241): readback played on a daemon thread where neither Ctrl-C's KeyboardInterrupt nor the between-synth-chunks flag check could stop it; the cancel flag is now polled during playback, so Ctrl-C interrupts speaking.
Default assembly code and assembly live to gpt-5.1 (8819df6): both can be overridden with --model; assembly llm is unchanged. Verified that the gateway accepts gpt-5.1; --help snapshot regenerated.
Test split + type narrowing (1d81481): moved the model tests into test_code_model.py (the original file was over the 500-line gate) and narrowed types.
WIP: Firecrawl search + live agent MCP tools (c0c8ad2): @alexkroman's in-progress work, which rides along on this branch (now gate-clean).

Verification

./scripts/check.sh → All checks passed — ruff/mypy/pyright/xenon/import-linter, 100% patch coverage, mutation gate (13 mutants), escape-hatch gate, build + twine.

The streaming tool-call id fix from this investigation already landed as #247.

🤖 Generated with Claude Code

… empty tool-args

Two `assembly code` fixes uncovered while building a voice agent:

1. read_skill tool. The skills middleware loads skills from its own backend rooted at ~/.claude/skills, but deepagents' stock prompt tells the model to open each SKILL.md with `read_file`, which is bound to the cwd sandbox and can't reach them, so the model got `File '/aai-cli/SKILL.md' not found`. Add a read-only `read_skill` tool bound to the skills directory (with a traversal guard) and a prompt that points the model at it. build_skills() now returns the (middleware, tool) pair, wired together in _build_agent.

2. Empty tool-call arguments. The LLM Gateway maps OpenAI `arguments` onto Anthropic `tool_use.input` but drops `input` entirely when arguments are empty (""/"{}"), which Anthropic rejects (400, surfaced as 500 when streaming). Because the failing call sits in history, every later turn fails too, wedging the session. _ensure_tool_call_arguments substitutes a minimal non-empty placeholder in the outgoing payload so the gateway emits a valid input. Request-only; the tool already ran locally with its real args. (Reported upstream for a server-side fix; this keeps the CLI resilient.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
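The two fixes in this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the function names `resolve_skill_path` and `ensure_tool_call_arguments`, the `PLACEHOLDER_ARGS` value, and the message shape (OpenAI-style `tool_calls[].function.arguments`) are all hypothetical stand-ins, and the commit does not say which placeholder it actually sends.

```python
import json
from pathlib import Path

# Hypothetical placeholder value; the commit message only says it is
# "minimal" and "non-empty".
PLACEHOLDER_ARGS = json.dumps({"_": ""})


def resolve_skill_path(skills_root: Path, relative: str) -> Path:
    """Resolve `relative` under `skills_root`, rejecting paths that escape it."""
    root = skills_root.resolve()
    candidate = (root / relative).resolve()
    # is_relative_to requires Python 3.9+.
    if not candidate.is_relative_to(root):
        raise ValueError(f"path escapes the skills directory: {relative}")
    return candidate


def ensure_tool_call_arguments(messages: list) -> list:
    """Return a copy of `messages` with empty tool-call arguments replaced.

    Only the outgoing request is patched; the input list is left untouched.
    """
    patched = []
    for message in messages:
        calls = message.get("tool_calls")
        if not calls:
            patched.append(message)
            continue
        new_calls = []
        for call in calls:
            function = dict(call.get("function") or {})
            args = function.get("arguments") or ""
            if args.strip() in ("", "{}"):
                function["arguments"] = PLACEHOLDER_ARGS
            new_calls.append({**call, "function": function})
        patched.append({**message, "tool_calls": new_calls})
    return patched
```

Copying each message rather than mutating it matches the "request-only" note above: the local history keeps the real (empty) arguments, and only the payload sent to the gateway carries the placeholder.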

…elta

Every streaming turn (when tools are available) begins with an empty tool-call delta: {"function": {"id": "", "name": "", "arguments": ""}}. On a pure-text turn (e.g. the agent asking clarifying questions) no real tool call follows, so langchain is left with a tool call whose name is "", deepagents dispatches it, and the turn dies with `Error: is not a valid tool`.

Extend the streaming normalizer to drop any tool-call delta with no name, id, or arguments before langchain converts the chunk (this also harmlessly drops the gateway's empty argument-continuation deltas). A real text+tool turn still yields exactly one correct tool call; a pure-text turn yields none.

Reported upstream for a server-side fix; this keeps the CLI resilient meanwhile.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
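A minimal sketch of the filtering rule described above, assuming an OpenAI-style streaming chunk (`choices[].delta.tool_calls[]`). The helper names `is_blank_tool_call` and `drop_blank_tool_calls` are hypothetical and are not taken from the PR; the check looks for an id both at the top level and inside `function`, since the delta quoted in the commit message carries it inside `function`.

```python
def is_blank_tool_call(delta_call: dict) -> bool:
    """True when a tool-call delta has no id, name, or arguments at all."""
    function = delta_call.get("function") or {}
    fields = (
        delta_call.get("id"),
        function.get("id"),
        function.get("name"),
        function.get("arguments"),
    )
    return not any(fields)


def drop_blank_tool_calls(chunk: dict) -> dict:
    """Remove blank tool-call deltas from a streaming chunk, in place."""
    for choice in chunk.get("choices", []):
        delta = choice.get("delta")
        if not isinstance(delta, dict):
            continue
        calls = delta.get("tool_calls")
        if not calls:
            continue
        kept = [call for call in calls if not is_blank_tool_call(call)]
        if kept:
            delta["tool_calls"] = kept
        else:
            # Nothing real left: remove the key so no tool call is built.
            delta.pop("tool_calls")
    return chunk
```

A continuation delta that carries only a non-empty `arguments` fragment is kept, because at least one of the checked fields is truthy.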

The coding-agent TUI plays each spoken reply on a daemon thread. Its two cancel channels both failed there:

- PcmPlayer chunks its writes "so a Ctrl-C lands between them", but that relies on KeyboardInterrupt reaching the *playing* thread. That holds for the foreground `assembly speak` CLI, not for the TUI, where Ctrl-C is handled by Textual on the UI thread.
- The only cross-thread signal, the `_cancel` event, was checked solely between synthesizer chunks (the feed wrapper), never during sounddevice's blocking playback.

So readback was effectively uninterruptible: Ctrl-C did nothing, and the daemon thread stayed blocked in speak() instead of advancing to listen for the next turn.

Poll the cancel flag inside PcmPlayer's piece-write loop (abort the device and drop the rest of the chunk when set), and have voice.speak hand the player a live poll of `_cancel`. Cancellation is now honored within ~one 4 KiB piece (~85 ms) regardless of which thread the interrupt arrives on. The poll is optional, so `assembly speak` (foreground, KeyboardInterrupt-driven) is unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
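The polling approach can be sketched as below. This is not PcmPlayer's actual code: `play_chunk`, the `OutputDevice` protocol, and the `should_cancel` parameter are hypothetical names, and the device is assumed to expose blocking `write()` and `abort()` methods.

```python
import threading
from typing import Callable, Optional, Protocol

PIECE_BYTES = 4096  # 4 KiB pieces, the size quoted in the commit message


class OutputDevice(Protocol):
    """Assumed device interface: a blocking write plus an abort."""

    def write(self, data: bytes) -> None: ...

    def abort(self) -> None: ...


def play_chunk(
    device: OutputDevice,
    chunk: bytes,
    should_cancel: Optional[Callable[[], bool]] = None,
) -> bool:
    """Write `chunk` piece by piece; return False if playback was cancelled.

    The poll is optional, so a caller that relies on KeyboardInterrupt
    can omit it and get the previous behavior.
    """
    for start in range(0, len(chunk), PIECE_BYTES):
        if should_cancel is not None and should_cancel():
            device.abort()  # stop the device and drop the rest of the chunk
            return False
        device.write(chunk[start:start + PIECE_BYTES])
    return True


# Usage sketch: any thread may set the event; the playing thread sees it
# before its next piece.
cancel = threading.Event()
# play_chunk(device, pcm_bytes, should_cancel=cancel.is_set)
```

The ~85 ms figure is consistent with 16-bit mono PCM at 24 kHz (4096 / (2 × 24000) ≈ 0.085 s), though the commit message does not state the audio format.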

Switch the default LLM Gateway model for `assembly code` (code_agent/prompt.py) and `assembly live` (agent_cascade/config.py) to gpt-5.1. The live default is now a literal rather than llm.DEFAULT_MODEL, so it's independent of the one-shot `assembly llm` default (still claude-haiku). Both override with --model.

Verified gpt-5.1 is accepted by the gateway. Updated the cascade config test and regenerated the `--help` snapshot (both commands show [default: gpt-5.1]).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Make the branch gate-clean now that the firecrawl WIP no longer blocks it:

- Move the code_agent/model.py unit tests out of test_code_agent.py (which had grown past the 500-line file-length gate) into a new test_code_model.py, and add it to pyrightconfig.tests.json's ignore list alongside the other langchain-boundary test files.
- Narrow `delta` to dict in _hoist_in_choice via early returns so mypy accepts the in-place `delta["tool_calls"]` assignment, and isinstance-narrow the chat model in the payload/convert-chunk tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
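The early-return narrowing pattern mentioned above looks roughly like this. It is a sketch only: what `_hoist_in_choice` actually hoists is not described in this commit, so the id-hoisting step here (copying `function.id` to the top-level `id`) is an assumption made for illustration, and both function names below are hypothetical.

```python
def hoist_id(call: object) -> object:
    """Assumed behavior: copy function.id to the call's top-level id."""
    if not isinstance(call, dict):
        return call
    function = call.get("function")
    if isinstance(function, dict) and function.get("id") and not call.get("id"):
        return {**call, "id": function["id"]}
    return call


def hoist_in_choice(choice: dict[str, object]) -> None:
    """Rewrite delta["tool_calls"] in place, narrowing types via early returns."""
    delta = choice.get("delta")  # typed as `object | None` at this point
    if not isinstance(delta, dict):
        return  # after this line the type checker treats `delta` as a dict
    calls = delta.get("tool_calls")
    if not isinstance(calls, list):
        return
    # Item assignment type-checks only because `delta` was narrowed above.
    delta["tool_calls"] = [hoist_id(call) for call in calls]
```

Without the `isinstance` guard, `delta` would still be typed as `object` at the assignment, and a type checker would reject the indexed write.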

- code_agent/tui.py: guard on_worker_state_changed on is_running. A turn worker can finish after the app tears down (quit / test run_test exit), and driving _finish_turn then queries an unmounted DOM (NoMatches on "#spinner"). Skip it when the app isn't running. Fixes test_code_tui_voice flakiness on Windows.
- test_live_tui.py: the reply-text assertion waited only for the AssistantMessage widget to mount, but agent_transcript sets the text via a separate call_from_thread hop, so the wait raced the text on a slow runner. Wait for the text itself.
- pyrightconfig.tests.json: ignore test_code_tui_voice.py alongside the other Textual-boundary test files (it now duck-types a Worker.StateChanged event).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

alexkroman-assembly and others added 6 commits June 18, 2026 15:26

alexkroman force-pushed the code-agent-fixes branch from db00550 to 8a84c96 Compare June 18, 2026 22:37

alexkroman added this pull request to the merge queue Jun 18, 2026

Merged via the queue into main with commit c765bc1 Jun 18, 2026
20 checks passed

alexkroman deleted the code-agent-fixes branch June 18, 2026 22:58

alexkroman restored the code-agent-fixes branch June 18, 2026 23:24

alexkroman deleted the code-agent-fixes branch June 18, 2026 23:30

alexkroman mentioned this pull request Jun 18, 2026

assembly code/live: voice-interrupt UX, modal dismissal, concise speech, gemini live default #252

Merged

alexkroman merged 6 commits into main from code-agent-fixes

2 participants
