
Add voice-only Textual TUI for live agent cascade sessions #250

Merged
alexkroman merged 3 commits into main from claude/funny-carson-duyxk8 on Jun 18, 2026
Conversation

@alexkroman

Collaborator

Introduces LiveAgentApp, a hands-free voice-only Textual UI for assembly live (the agent cascade), complementing the existing code-agent TUI. The new TUI drops the text prompt entirely since input is purely voice-based, while reusing the shared chrome (ASSEMBLY wordmark, animated voice bar, transcript message widgets).

Key changes

  • New aai_cli/agent_cascade/tui.py: Implements LiveAgentApp (the Textual app) and _TuiRenderer (marshals cascade callbacks onto the UI thread). The app displays a scrolling transcript above an animated voice bar that tracks the session phase (listening/thinking/speaking). A worker thread drives the blocking cascade via run_conversation, with on_stop closing the audio to unblock that worker on quit.

  • TUI selection logic in aai_cli/commands/agent_cascade/_exec.py:

    • _should_use_tui(): Determines when to use the TUI (interactive mic session in human mode on a TTY, not file/sample input, not --json/-o text, not piped stdin/stdout).
    • _run_live_tui(): Wires the duplex audio, cascade dependencies, and hands the app a run_conversation closure plus on_stop callback.
    • Refactored _web_search_note() to return the notice string (or None) so it can be passed to the TUI as a notification and still emitted to stderr in line-renderer mode.
  • Shared voice-bar infrastructure in aai_cli/code_agent/tui_status.py:

    • Extracted VOICE_FRAMES (animated meter) and _VOICE_PHASES (phase labels + colors) as module-level constants so both the code and live TUIs use identical animations and styling.
    • Added voicebar_markup() helper to generate the voice bar's content (meter, label, accent color, optional hint).
  • Message widget enhancements in aai_cli/code_agent/messages.py:

    • Added UserMessage.set_text() to update the user prompt in place (used by the live TUI to grow interim voice transcripts without mounting new widgets).
    • Extracted _user_markup() helper so the styling is consistent between constructor and set_text().
  • Comprehensive test suite in tests/test_live_tui.py (342 lines):

    • Direct UI-thread tests for splash, user partial/final, agent replies, interruption, voice-bar animation, and error handling.
    • Worker-thread integration tests driving the real _TuiRenderer through a scripted cascade to cover the off-thread hop, error path, and teardown without a mic/speaker/socket.
    • Integration tests verifying TUI selection logic and wiring into run_agent_cascade.
  • Test updates in tests/test_code_messages.py and tests/test_code_tui_status.py: Added tests for UserMessage.set_text() and voicebar_markup().
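As a rough illustration of the shared voice-bar helper described above, the sketch below builds a markup string from a phase, an animation tick, and an optional hint. The PR names `VOICE_FRAMES`, `_VOICE_PHASES`, and `voicebar_markup()` but does not show their values or signature, so the frame glyphs, labels, colors, and parameters here are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical values: the PR only says there are 6 block-character frames and
# that each phase carries a label and a color. These glyphs/colors are invented.
VOICE_FRAMES = ("▁▂▃", "▂▃▄", "▃▄▅", "▄▅▆", "▃▄▅", "▂▃▄")
_VOICE_PHASES = {
    "listening": ("Listening", "green"),
    "thinking": ("Thinking", "yellow"),
    "speaking": ("Speaking", "cyan"),
}


def voicebar_markup(phase: str, tick: int, hint: Optional[str] = None) -> str:
    """Return Rich-style markup for the voice bar (assumed signature)."""
    label, color = _VOICE_PHASES[phase]
    # Wrap the tick around the frame list so the meter pulses indefinitely.
    meter = VOICE_FRAMES[tick % len(VOICE_FRAMES)]
    text = f"[{color}]{meter} {label}[/{color}]"
    if hint:
        # The optional hint is rendered dimmed after the label.
        text += f" [dim]{hint}[/dim]"
    return text
```

Because both front-ends would call the same helper with the same constants, a change to a frame or an accent color shows up in the code TUI and the live TUI at once.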

Implementation details

  • The cascade runs on a worker thread; every renderer call hops back to the UI thread via call_from_thread(). Once the app has torn down (for example, a quit mid-turn), that call raises RuntimeError. The event is moot by then, so it is dropped rather than surfaced as an unhandled exception.
  • The voice bar animates via a 0.3s interval timer, cycling through 6 block-character frames for a smooth pulse effect.
  • Quit (Ctrl-C or Ctrl-Q) calls on_stop to close the audio, which ends the mic iterator and unblocks the cascade worker, then exits the app.
  • The TUI is disabled for file/sample input, machine output modes (--json, -o text), and non-TTY environments — all fall back to the line renderer.
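The selection rule in the last bullet can be sketched as a pure function. This is not the actual `_should_use_tui()` from `_exec.py`; the function name, parameter names, and the way the TTY state is passed in are assumptions chosen so the rule is easy to read and test.

```python
import sys
from typing import Optional, Tuple


def should_use_tui(
    *,
    source: Optional[str],         # hypothetical: file path or sample name; None means live mic
    json_output: bool,             # True when --json was passed
    output_format: Optional[str],  # value of -o, e.g. "text"; None when unset
    stdin_is_tty: bool,
    stdout_is_tty: bool,
) -> bool:
    """Return True only for an interactive mic session on a real terminal."""
    if source is not None:
        return False  # file/sample input keeps the line renderer
    if json_output or output_format == "text":
        return False  # machine output modes keep the line renderer
    # Piped stdin or stdout means we are not on an interactive TTY.
    return stdin_is_tty and stdout_is_tty


def detect_ttys() -> Tuple[bool, bool]:
    """Read the TTY state of the current process's stdin and stdout."""
    return sys.stdin.isatty(), sys.stdout.isatty()
```

Taking the TTY flags as arguments rather than reading `sys.stdin` inside the function keeps the gate deterministic under test, where stdin and stdout are usually not terminals.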

https://claude.ai/code/session_01WfPLNLa2h4B5khteUmYnFc

`assembly live` (the agent cascade) now runs in a Textual TUI by default for an
interactive mic session: a scrolling transcript above an animated voice bar that
tracks listening / thinking / speaking. There is no text input — it's a
hands-free spoken experience.

The new `agent_cascade/tui.py` (`LiveAgentApp`) reuses the `assembly code` TUI's
chrome: the ASSEMBLY wordmark splash (`code_agent.banner`), the transcript
message widgets (`code_agent.messages`), and the voice-bar rendering — extracted
into `code_agent.tui_status.voicebar_markup`/`VOICE_FRAMES` so both front-ends
share one source. `UserMessage` grows an interim transcript in place via a new
`set_text`.

The blocking `engine.run_cascade` runs on a worker thread and reaches the UI
through a `_TuiRenderer` (the `engine.Renderer` protocol) that hops each call
onto the UI thread; a quit closes the duplex audio, ending the mic iterator and
unblocking the worker. `_exec._should_use_tui` gates the front-end: file/sample
input, `--json`/`-o text`, and non-TTY runs keep the plain `AgentRenderer` line
output.
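A minimal sketch of the renderer hop, assuming a stand-in app object rather than the real `LiveAgentApp`. `FakeApp`, `TuiRenderer`, `agent_reply`, and `run_worker` are hypothetical names. The stand-in runs the callback inline instead of scheduling it on a UI event loop, and it raises `RuntimeError` after shutdown only to mimic the behaviour the description attributes to `call_from_thread()`.

```python
import threading
from typing import Callable, List


class FakeApp:
    """Stand-in for the Textual app (hypothetical; not the real LiveAgentApp)."""

    def __init__(self) -> None:
        self.running = True
        self.transcript: List[str] = []

    def call_from_thread(self, callback: Callable[..., None], *args: object) -> None:
        # Mimics the described behaviour: the hop fails once the app is gone.
        if not self.running:
            raise RuntimeError("app is not running")
        callback(*args)  # a real app would schedule this on its UI thread


class TuiRenderer:
    """Forwards cascade events to the app, dropping any that arrive too late."""

    def __init__(self, app: FakeApp) -> None:
        self._app = app

    def _post(self, callback: Callable[..., None], *args: object) -> None:
        try:
            self._app.call_from_thread(callback, *args)
        except RuntimeError:
            pass  # app already torn down: the event is moot, so drop it

    def agent_reply(self, text: str) -> None:
        self._post(self._app.transcript.append, text)


def run_worker(renderer: TuiRenderer, replies: List[str]) -> None:
    """Drive the renderer from a worker thread, as the blocking cascade would."""

    def work() -> None:
        for reply in replies:
            renderer.agent_reply(reply)

    worker = threading.Thread(target=work)
    worker.start()
    worker.join()
```

Swallowing only `RuntimeError` inside `_post` keeps the drop narrow: any other exception raised while rendering still propagates from the worker.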

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01WfPLNLa2h4B5khteUmYnFc
@alexkroman alexkroman enabled auto-merge June 18, 2026 20:57
@alexkroman alexkroman added this pull request to the merge queue Jun 18, 2026
Merged via the queue into main with commit b562e86 Jun 18, 2026
18 of 20 checks passed
@alexkroman alexkroman deleted the claude/funny-carson-duyxk8 branch June 18, 2026 21:24
2 participants