Fix race conditions and error handling in async operations #95

Merged: alexkroman merged 1 commit into main from claude/clever-hopper-jh8r0k on Jun 12, 2026

@alexkroman alexkroman commented Jun 12, 2026

Collaborator

Summary

This PR fixes the 10 verified findings from a full-codebase review: it hardens error handling and fixes race conditions across concurrent operations, bounds a hang-prone receive loop, and lands two small dedup cleanups. The full scripts/check.sh gate passes (100% patch coverage; all diff-scoped mutants killed).

Key Changes

Error Handling in Streaming Workers

  • aai_cli/streaming/session.py: Wrap non-CLIError exceptions raised in parallel worker threads in APIError so they fail the run cleanly, instead of dying silently with the daemon thread while the command exits with code 0
  • tests/test_stream_session.py: Add test verifying unexpected worker errors are caught and reported with clean output (no raw tracebacks)
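
A minimal sketch of the worker-wrapping pattern described above, not the code in aai_cli/streaming/session.py. The `CLIError` and `APIError` classes are stand-ins with an assumed shape, and `run_workers` is a hypothetical name used only for illustration.

```python
import threading


class CLIError(Exception):
    """Stand-in for the CLI's base error type (assumed shape)."""


class APIError(CLIError):
    """Stand-in for the CLI's APIError (assumed shape)."""


def run_workers(tasks):
    """Run each task in a daemon thread; re-raise the first failure in the caller."""
    errors = []
    lock = threading.Lock()

    def guarded(task):
        try:
            task()
        except CLIError as exc:
            # Already a clean, user-facing error: record it unchanged.
            with lock:
                errors.append(exc)
        except Exception as exc:
            # Anything else would otherwise die with the thread; wrap it.
            with lock:
                errors.append(APIError(f"worker failed: {exc}"))

    threads = [threading.Thread(target=guarded, args=(t,), daemon=True) for t in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    if errors:
        raise errors[0]
```

Because the failure is re-raised on the calling thread, the command's normal error path can report it and set a non-zero exit code.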

TTS WebSocket Timeout Bounds

  • aai_cli/tts/session.py: Add a 60-second timeout to all ws.recv() calls via a new _recv_raw() helper, so assembly speak fails cleanly instead of hanging forever if the server goes silent mid-session; TimeoutError maps to a clean APIError
  • tests/test_tts_session.py: Add tests verifying the timeout is applied to every frame and a silent server surfaces as a clean error
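
The bounded receive can be sketched roughly as follows. This is an illustration under assumptions, not the actual `_recv_raw()` helper: it assumes a socket whose `recv()` accepts `timeout=` and raises `TimeoutError` (as the synchronous client in the websockets library does), and `APIError` and `SilentSocket` are stand-ins.

```python
RECV_TIMEOUT_SECONDS = 60  # value stated in the PR description


class APIError(Exception):
    """Stand-in for the CLI's APIError (assumed shape)."""


def recv_raw(ws, timeout=RECV_TIMEOUT_SECONDS):
    """Receive one frame, converting a timeout into a clean APIError."""
    try:
        # Assumes recv() accepts timeout= and raises TimeoutError when it expires.
        return ws.recv(timeout=timeout)
    except TimeoutError as exc:
        raise APIError(f"no response from server within {timeout}s") from exc


class SilentSocket:
    """Fake socket that never delivers a frame; exercises the timeout path."""

    def recv(self, timeout=None):
        raise TimeoutError
```

Routing every receive through one helper is what makes "the timeout is applied to every frame" checkable in a test.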

OAuth Callback Capture Thread Safety

  • aai_cli/auth/loopback.py: Add a threading.Lock to CallbackCapture and claim-once semantics: the first matching callback wins, and once the capture is claimed (by a callback or by the timeout in wait()), a late/duplicate callback can no longer mutate the result
  • tests/test_auth_loopback.py: Add tests for duplicate-callback handling and timeout claiming
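
One way to picture the claim-once semantics is the sketch below. It is not the real `CallbackCapture`: the method name `offer`, the attribute names, and the omission of callback matching (for example, on a state parameter) are all simplifying assumptions.

```python
import threading


class CallbackCapture:
    """Claim-once capture: the first claim (callback or timeout) is final."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = threading.Event()
        self._claimed = False
        self.result = None

    def offer(self, result):
        """Called by the HTTP handler; returns True only for the winning callback."""
        with self._lock:
            if self._claimed:
                return False
            self._claimed = True
            self.result = result
        self._done.set()
        return True

    def wait(self, timeout):
        """Block until a callback arrives or the timeout claims the capture."""
        self._done.wait(timeout)
        with self._lock:
            # Claiming here means a callback arriving after the timeout is ignored.
            self._claimed = True
            return self.result
```

Both the handler and the timeout path take the same lock before touching the claim flag, so neither can observe or overwrite a half-written result.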

WebSocket Handshake Status Classification (dedup)

  • aai_cli/ws.py: Add a shared handshake_status() that reads both structured shapes (SDK .code, websockets .response.status_code); is_rejected_key() now vetoes 403 through it, so an SDK-shaped 403 with auth-worded text is no longer misclassified as a rejected key
  • aai_cli/streaming/diagnostics.py: Remove the duplicate _handshake_status() and use the shared classifier — the two copies had already drifted
  • tests/test_ws.py: Add tests for both handshake exception shapes, the SDK-403 veto, and non-handshake exceptions
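
A rough sketch of a shared classifier that reads both exception shapes named above. The attribute lookups follow the PR text (`.code`, `.response.status_code`); the rest of `is_rejected_key`, including the 401 check and the "unauthorized" text match, is an assumption made only to show the 403 veto.

```python
def handshake_status(exc):
    """Return the HTTP status of a rejected handshake, or None if unavailable."""
    code = getattr(exc, "code", None)  # SDK-shaped error
    if isinstance(code, int):
        return code
    response = getattr(exc, "response", None)  # websockets-shaped error
    status = getattr(response, "status_code", None)
    return status if isinstance(status, int) else None


def is_rejected_key(exc):
    """Treat 401 or auth-worded text as a rejected key, but never a 403."""
    status = handshake_status(exc)
    if status == 403:
        return False  # veto: forbidden is not the same as a bad key
    if status == 401:
        return True
    # Assumed fallback for errors that carry no structured status.
    return "unauthorized" in str(exc).lower()
```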

AMS JSON Response Validation

  • aai_cli/auth/ams.py: Catch ValueError from resp.json() and raise a clean APIError for 2xx responses with unparseable bodies (proxy interference, truncation) instead of letting a raw JSONDecodeError surface as "Unexpected error"
  • tests/test_auth_ams.py: Add test for non-JSON 200 response handling
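
The JSON guard can be sketched like this. `parse_body`, `FakeResponse`, and the error message are hypothetical; the only fact relied on is that `json.JSONDecodeError` is a subclass of `ValueError`, so catching `ValueError` covers it.

```python
import json


class APIError(Exception):
    """Stand-in for the CLI's APIError (assumed shape)."""


def parse_body(resp):
    """Decode a 2xx response body, mapping parse failures to APIError."""
    try:
        return resp.json()
    except ValueError as exc:
        # json.JSONDecodeError subclasses ValueError, so this covers it.
        raise APIError("server returned a response that is not valid JSON") from exc


class FakeResponse:
    """Minimal response double exposing only json()."""

    def __init__(self, text):
        self.text = text

    def json(self):
        return json.loads(self.text)
```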

Sessions Table Rendering

  • aai_cli/commands/sessions.py: Distinguish None (missing value → blank cell) from 0 (legitimate zero duration → renders "0") in the duration column
  • tests/test_sessions_command.py: Add test verifying zero duration renders as "0", not blank
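
The None-versus-zero distinction comes down to testing identity rather than truthiness. A minimal sketch, with `duration_cell` as a hypothetical helper name (the actual rendering code in sessions.py is not shown in this PR description):

```python
def duration_cell(value):
    """Render a duration cell: blank only when the value is missing."""
    # A truthiness check such as `str(value) if value else ""` would blank
    # out 0, because 0 is falsy in Python.
    return "" if value is None else str(value)
```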

Auto-Login Error Handling

  • aai_cli/context.py: Catch TypeError from the TOML writer (non-serializable values) and emit the clean "could not save the credentials" message instead of the generic "Unexpected error"
  • tests/test_context.py: Add test for TypeError during credential persistence
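
As an illustration of the mapping, the sketch below wraps a writer callable. `save_credentials`, the injected `write` parameter, and the inclusion of `OSError` alongside `TypeError` are assumptions; only the `TypeError` case and the message text come from the PR description.

```python
class CLIError(Exception):
    """Stand-in for the CLI's user-facing error type (assumed shape)."""


def save_credentials(write, data):
    """Persist data with the given writer, mapping known failures to CLIError."""
    try:
        write(data)
    except (OSError, TypeError) as exc:
        # OSError: disk or permission problems (assumed to be handled already).
        # TypeError: a value the writer cannot serialize.
        raise CLIError("could not save the credentials") from exc
```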

Generated Voice-Agent Script

  • aai_cli/code_gen/agent.py: Guard the generated send_mic thread's ws.send() with try/except so the thread ends quietly when the socket closes, instead of dumping a daemon-thread traceback on every normal exit of the sample script. The ready-gate was restructured to if not ready.is_set(): continue, which is equivalent logic reshaped for the guard
  • tests/test_code_gen_stream_agent.py: Add test verifying the generated code handles socket close gracefully
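
The guarded send loop might look roughly like the sketch below. The function signature, the iterable of chunks, and the broad `except Exception` are assumptions for illustration; the generated script may catch a narrower exception type.

```python
import threading


def send_mic(ws, chunks, ready):
    """Forward audio chunks once the session is ready; stop quietly on close."""
    for chunk in chunks:
        if not ready.is_set():
            continue  # drop audio captured before the session is ready
        try:
            ws.send(chunk)
        except Exception:
            # The socket closed (normal shutdown): end the thread without a traceback.
            return
```

Inverting the ready check into an early `continue` keeps the `try` block limited to the single `ws.send()` call.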

Quiet-Flag Dedup

  • aai_cli/argscan.py: Add requests_quiet() so the --quiet/-q token forms live next to requests_json()
  • aai_cli/telemetry.py: Use it in _notice_suppressed() instead of a hardcoded duplicate list
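
A possible shape for the shared quiet-flag check, assuming exact token matching only; the real `requests_quiet()` signature and any handling of combined short flags are not shown in this PR description.

```python
QUIET_FLAGS = ("--quiet", "-q")  # the two forms named in the PR


def requests_quiet(argv):
    """Return True if any argument is exactly one of the quiet flags."""
    return any(arg in QUIET_FLAGS for arg in argv)
```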

https://claude.ai/code/session_01Cb2fCtBiA6LG667UnzWjvf

Correctness:
- streaming/session.py: a non-CLIError exception in a parallel worker now
  fails the run with a clean error instead of dying with the daemon thread
  and letting the command exit 0 for a failed stream.
- auth/loopback.py: the OAuth callback capture is now claim-once under a
  lock — a duplicate/late callback can no longer overwrite the captured
  token, and the timeout path can no longer race the handler mid-write.
- tts/session.py: every protocol recv() is bounded (60s), so a server that
  goes silent mid-synthesis fails 'assembly speak' cleanly instead of
  hanging it forever.
- auth/ams.py: a 2xx AMS response with an unparseable JSON body raises a
  clean APIError instead of escaping as a raw JSONDecodeError.
- code_gen/agent.py: the generated voice-agent script's send_mic thread
  ends quietly when the socket closes instead of dumping a traceback on
  every normal exit.
- commands/sessions.py: a 0-second audio duration renders as "0" instead
  of being coerced to a blank cell.
- context.py: a TypeError while persisting browser-login credentials maps
  to the "could not save the credentials" message, not "Unexpected error".

Cleanup:
- ws.py/streaming/diagnostics.py: one shared handshake_status() classifier
  for 401/403 across stream/agent/speak — the two copies had already
  drifted on the SDK .code shape.
- argscan.py/telemetry.py: quiet-flag forms (--quiet/-q) now live in
  argscan.requests_quiet alongside requests_json instead of being
  hardcoded in telemetry.

https://claude.ai/code/session_01Cb2fCtBiA6LG667UnzWjvf
@alexkroman alexkroman enabled auto-merge (squash) June 12, 2026 03:30
@alexkroman alexkroman merged commit b7093a2 into main Jun 12, 2026
8 checks passed
@alexkroman alexkroman deleted the claude/clever-hopper-jh8r0k branch June 12, 2026 03:33