
Chunk speak text and disable TTS keepalive deadline for long --url input#231

Merged
alexkroman merged 3 commits into main from fix-speak-tts-keepalive-timeout on Jun 18, 2026

@alexkroman

Collaborator

Problem

assembly speak --url https://arxiv.org/pdf/... (a long PDF/article) failed with:

Error: TTS session failed: sent 1011 (internal error) keepalive ping timeout; no close frame received

The "no close frame received" part means the client (the websockets library) gave up on a still-alive socket, not that the server crashed.

Root cause

Two independent contributors, both fixed here:

  1. The whole document was sent in a single PocketTTS Generate frame. PocketTTS is a streaming model meant to be fed incrementally; a whole paper stalls the server long enough that it stops answering the websocket keepalive ping, and the client closes the socket with code 1011.

  2. websockets' default 20s keepalive pong deadline is too aggressive for this workload. A server slow to emit the first Audio frame under load gets killed before producing anything.

Fix

  • session.synthesize_chunked splits the text into sentence-aligned chunks packed to a safe char budget and synthesizes one connection per chunk — the same one-sentence-per-connection pattern agent-cascade already uses. An over-long, terminator-less PDF blob is hard-sliced so no single Generate can blow past the server's input ceiling. Bonus: audio now starts on the first chunk instead of after the whole document.
  • ping_timeout=None on the TTS socket disables the redundant pong deadline. _RECV_TIMEOUT_SECONDS (60s) is already the per-frame liveness authority, so a slow-but-alive server is no longer killed; a genuinely dead connection still fails cleanly.
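
For the second bullet, a minimal sketch of what "per-frame liveness authority" could look like, assuming the receive loop wraps each read in asyncio.wait_for. Only ping_timeout=None and the 60s value of _RECV_TIMEOUT_SECONDS come from this PR; iter_frames, its recv/is_last parameters, and TTS_CONNECT_KWARGS are hypothetical names for illustration, not the actual session code.

```python
import asyncio
from collections.abc import AsyncIterator, Awaitable, Callable

# 60s per-frame deadline, as stated in the PR description.
_RECV_TIMEOUT_SECONDS = 60.0

# Hypothetical: keyword arguments that would be passed to websockets.connect().
# ping_timeout=None turns off the pong deadline, so a slow server is not closed
# with 1011 just for answering a keepalive ping late.
TTS_CONNECT_KWARGS = {"ping_timeout": None}


async def iter_frames(
    recv: Callable[[], Awaitable[bytes]],
    timeout: float = _RECV_TIMEOUT_SECONDS,
    is_last: Callable[[bytes], bool] = lambda frame: False,
) -> AsyncIterator[bytes]:
    """Yield frames from recv(), failing if any single frame takes too long.

    Assumption: recv is any zero-argument coroutine function, such as the
    recv method of an open websocket connection.
    """
    while True:
        # A dead connection surfaces here as a timeout error instead of
        # hanging forever, even though the pong deadline is disabled.
        frame = await asyncio.wait_for(recv(), timeout)
        yield frame
        if is_last(frame):
            return
```

With this shape, a server that takes 30s to emit its first Audio frame is fine, while one that sends nothing for longer than the timeout still produces a clean failure.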

Notes

  • New aai_cli/tts/text.py holds the pure split_sentences/chunk_text helpers (Rich-free, unit-tested).
  • The oversized test_tts_session.py was split along its natural seam (single-synthesis vs dialogue) via a shared tests/_tts_session_helpers.py to stay under the 500-line file gate.
  • Full scripts/check.sh gate passes (coverage, 100% patch coverage, mutation gate, build).
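
For reviewers who want the chunking idea at a glance, here is a rough sketch of sentence-aligned packing with a hard-slice fallback. It is not the code in aai_cli/tts/text.py: the function names match the helpers mentioned above, but the signatures, the regex, and the MAX_CHARS value are assumptions made for illustration.

```python
import re

# Hypothetical budget; the PR does not state the real character limit.
MAX_CHARS = 400

# Split on whitespace that follows a sentence terminator (., ! or ?).
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Return the non-empty, stripped sentences of text."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def chunk_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters."""
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        # A terminator-less blob longer than the budget is hard-sliced so
        # that no single chunk can exceed max_chars.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit; close the chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk would then be synthesized on its own connection, which is why audio can start after the first chunk rather than after the whole document.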

🤖 Generated with Claude Code

alexkroman-assembly and others added 2 commits June 17, 2026 16:26
`assembly speak --url <pdf>` failed with "TTS session failed: sent 1011
(internal error) keepalive ping timeout" on long documents. Two causes,
both fixed:

- The entire extracted document was sent in a single PocketTTS `Generate`
  frame. PocketTTS is a streaming model meant to be fed incrementally;
  a whole paper stalls the server so it stops answering the websocket
  keepalive ping, and the client closes the still-alive socket with 1011.
  `synthesize_chunked` now splits the text into sentence-aligned chunks
  (packed to a safe char budget; an over-long terminator-less PDF blob is
  sliced) and synthesizes one connection per chunk — the same
  one-sentence-per-connection pattern agent-cascade already uses. Audio
  also starts on the first chunk instead of after the whole document.

- Independently, disable websockets' 20s keepalive pong deadline on the
  TTS socket (`ping_timeout=None`). `_RECV_TIMEOUT_SECONDS` (60s) is
  already the liveness authority per frame, so a server slow to emit the
  first frame under load is no longer killed prematurely.

Also splits the oversized tts session test module along its natural seam
(single-synthesis vs dialogue) via a shared `_tts_session_helpers` module
to stay under the 500-line file gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alexkroman alexkroman enabled auto-merge June 17, 2026 23:31
@alexkroman alexkroman added this pull request to the merge queue Jun 18, 2026
Merged via the queue into main with commit aeae9b7 Jun 18, 2026
19 checks passed
@alexkroman alexkroman deleted the fix-speak-tts-keepalive-timeout branch June 18, 2026 01:15