fix(client): render off the select loop so a slow terminal can't wedge it (phux-fysb)#51

Open
phall1 wants to merge 1 commit into main from fix/fysb-nonblocking-stdout-sink
phall1 commented Jun 4, 2026

Summary

Re-attaching to a multi-pane/window session froze the client — screen stuck, Ctrl-C and detach dead — and then the pane shells appeared to die. This decouples rendering from the input loop so a slow terminal can never wedge it.

Root cause

The attach tokio::select! loop renders synchronously: every paint_full_frame/render_at ends in out.flush(), and when out is the real tty that flush blocks until the terminal drains. The paint runs inside the biased conn.recv() arm, so a slow flush starves the stdin/signal arms: the client can't process Ctrl-C or detach, which is the wedge.

Why multi-pane specifically: the paint_full_frame burst on re-attach grows with pane count, about 4× from 1 to 6 panes (measured 5.6 KB → 22 KB). That crosses a wedge threshold that a single pane, which drains instantly on any terminal, never reaches. The "pane shells die" symptom is collateral: force-killing the wedged terminal EOFs the session, which reaps the panes (exit code 1).
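
To make the stall concrete, here is a rough std-only model of a synchronous paint against a slow terminal. SlowTty, blocked_for, the 4096-byte chunk size and the 5 ms delay are all assumptions chosen for illustration; only the 5.6 KB and 22 KB frame sizes come from the measurements above. The time blocked_for reports is time during which the loop could not have serviced stdin or signals.

```rust
use std::io::{self, Write};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a tty that drains slowly:
/// each write accepts at most `chunk` bytes and costs `delay`.
struct SlowTty {
    chunk: usize,
    delay: Duration,
}

impl Write for SlowTty {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        thread::sleep(self.delay);
        Ok(buf.len().min(self.chunk))
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Paints synchronously and reports how long the caller was stuck in the write.
fn blocked_for(out: &mut impl Write, frame: &[u8]) -> io::Result<Duration> {
    let start = Instant::now();
    out.write_all(frame)?;
    out.flush()?;
    Ok(start.elapsed())
}

fn main() -> io::Result<()> {
    let mut tty = SlowTty { chunk: 4096, delay: Duration::from_millis(5) };
    // Frame sizes from the measurements above; contents are irrelevant here.
    let one_pane = blocked_for(&mut tty, &vec![b' '; 5_600])?;
    let six_panes = blocked_for(&mut tty, &vec![b' '; 22_000])?;
    println!("1 pane blocked {one_pane:?}, 6 panes blocked {six_panes:?}");
    Ok(())
}
```

In this model the blocked time grows with the number of chunks the terminal has to accept, which is why a larger re-attach frame is the case that crosses the threshold.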

The fix

A new StdoutSink (attach/stdout_writer.rs) threads through main_loop as out:

  • write* only buffer in memory; flush() ships complete frames to a dedicated OS thread (phux-stdout) that owns real stdout and does the blocking write off the runtime thread.
  • The select! loop therefore never blocks on the terminal — input and signals are always serviced.
  • Backpressure: if the queued backlog exceeds 256 KiB the sink drops the stale backlog and sets needs_resync; main_loop polls it and repaints the latest state via paint_full_frame (self-contained — supersedes the dropped diffs). Memory stays bounded; the screen renders at the terminal's pace.
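
The three bullets above can be sketched with the standard library alone. This is a minimal illustration, not the code in attach/stdout_writer.rs: SketchSink, Shared, take_needs_resync and finish are hypothetical names, the Mutex/Condvar queue is one possible design, and dropping the overflowing frame together with the backlog is an assumption.

```rust
use std::collections::VecDeque;
use std::io::{self, Write};
use std::sync::{Arc, Condvar, Mutex};
use std::thread::{self, JoinHandle};

struct Shared {
    frames: VecDeque<Vec<u8>>,
    queued_bytes: usize,
    closed: bool,
}

/// Hypothetical stand-in for the PR's StdoutSink (names and layout are assumed).
pub struct SketchSink<W> {
    pending: Vec<u8>, // frame under construction
    shared: Arc<(Mutex<Shared>, Condvar)>,
    max_backlog: usize,
    needs_resync: bool,
    writer: JoinHandle<W>,
}

impl<W: Write + Send + 'static> SketchSink<W> {
    pub fn spawn(mut inner: W, max_backlog: usize) -> io::Result<Self> {
        let state = Shared { frames: VecDeque::new(), queued_bytes: 0, closed: false };
        let shared = Arc::new((Mutex::new(state), Condvar::new()));
        let worker = Arc::clone(&shared);
        let writer = thread::Builder::new().name("phux-stdout".into()).spawn(move || {
            let (lock, cv) = &*worker;
            loop {
                let frame = {
                    let mut s = lock.lock().unwrap();
                    loop {
                        if let Some(f) = s.frames.pop_front() {
                            s.queued_bytes -= f.len();
                            break f;
                        }
                        if s.closed {
                            return inner; // queue drained and sink finished
                        }
                        s = cv.wait(s).unwrap();
                    }
                };
                // The lock is released before the blocking write.
                if inner.write_all(&frame).and_then(|()| inner.flush()).is_err() {
                    return inner;
                }
            }
        })?;
        Ok(Self { pending: Vec::new(), shared, max_backlog, needs_resync: false, writer })
    }

    /// True once after a backlog drop; the caller should then repaint a full frame.
    pub fn take_needs_resync(&mut self) -> bool {
        std::mem::take(&mut self.needs_resync)
    }

    /// Lets the writer drain the queue, joins it, and returns the inner writer.
    pub fn finish(self) -> W {
        let (lock, cv) = &*self.shared;
        lock.lock().unwrap().closed = true;
        cv.notify_one();
        self.writer.join().expect("writer thread panicked")
    }
}

impl<W> Write for SketchSink<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.pending.extend_from_slice(buf); // memory only, never touches the tty
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        if self.pending.is_empty() {
            return Ok(());
        }
        let frame = std::mem::take(&mut self.pending);
        let (lock, cv) = &*self.shared;
        let mut s = lock.lock().unwrap();
        if s.queued_bytes + frame.len() > self.max_backlog {
            // Assumption: drop the backlog and this frame; a full repaint supersedes both.
            s.frames.clear();
            s.queued_bytes = 0;
            self.needs_resync = true;
        } else {
            s.queued_bytes += frame.len();
            s.frames.push_back(frame);
            cv.notify_one();
        }
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut sink = SketchSink::spawn(io::stdout(), 256 * 1024)?;
    sink.write_all(b"frame 1\n")?;
    sink.flush()?; // returns at once; the phux-stdout thread does the real write
    sink.finish();
    Ok(())
}
```

A shared queue is used here instead of a channel because the producer has to be able to discard frames that are already queued; a plain mpsc sender cannot take items back.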

Lifecycle

run/run_with_predict → run_buffered (spawns the writer) → attach_session. The synchronous test seam run_with_stdout(_predict) passes None, None, so phux-server integration tests are unchanged. On Detached/Err the writer is drained + joined before the terminal-reset writes (detach stays prompt, reset isn't garbled); SwitchTo keeps the writer across the session switch.
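
The ordering on Detached/Err can be illustrated roughly as follows. drain_then_reset and the RESET bytes are hypothetical (the sequences shown leave the alternate screen and show the cursor; the client's actual reset writes may differ), and a plain mpsc channel stands in for the sink's queue.

```rust
use std::io::{self, Write};
use std::sync::mpsc;
use std::thread;

/// Illustrative reset bytes only: leave the alternate screen, show the cursor.
const RESET: &[u8] = b"\x1b[?1049l\x1b[?25h";

/// Ships `frames` through a writer thread, then drains and joins it
/// before the reset bytes are written to the same output.
fn drain_then_reset<W: Write + Send + 'static>(
    mut tty: W,
    frames: Vec<Vec<u8>>,
) -> io::Result<W> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();
    let writer = thread::spawn(move || -> io::Result<W> {
        // Iteration ends once every sender is dropped and the queue is empty.
        for frame in rx {
            tty.write_all(&frame)?;
        }
        tty.flush()?;
        Ok(tty)
    });

    for frame in frames {
        if tx.send(frame).is_err() {
            break; // writer already exited on an I/O error
        }
    }

    drop(tx); // drain: no more frames will arrive
    let mut tty = writer.join().expect("writer thread panicked")?;

    // Only now is it safe to write the reset; no queued frame can land after it.
    tty.write_all(RESET)?;
    tty.flush()?;
    Ok(tty)
}

fn main() -> io::Result<()> {
    drain_then_reset(io::stdout(), vec![b"last frame\n".to_vec()])?;
    Ok(())
}
```

Joining before the reset is what keeps the two byte streams from interleaving: the writer thread is the only owner of the output until it returns it.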

Validation

  • Unit: flush_does_not_block_on_a_slow_sink checks that flush() stays <25 ms even when the inner sink sleeps 50 ms/chunk (the core "loop can't be starved" property). Plus overflow-drops-backlog-sets-resync and ship-in-order.
  • Grep: no render-path code writes stdout directly — every flush goes through the sink.
  • Regression: 343 client tests + attach lifecycle / snapshot / in-process reattach / graceful terminal-restore all green. just ci green.
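
For readers who want to see the shape of the slow-sink property, a self-contained approximation is sketched below. It is not the test in the PR: SlowSink, HandOff and worst_flush_latency are hypothetical names, and the hand-off uses an unbounded mpsc channel with no backlog cap. The 50 ms per chunk and the 25 ms bound are the figures quoted above.

```rust
use std::io::{self, Write};
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};
use std::time::{Duration, Instant};

/// Inner writer that takes 50 ms per chunk, standing in for a slow terminal.
struct SlowSink;

impl Write for SlowSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        thread::sleep(Duration::from_millis(50));
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Minimal hand-off sink (hypothetical): flush() only enqueues the frame.
struct HandOff {
    pending: Vec<u8>,
    tx: Sender<Vec<u8>>,
}

impl HandOff {
    fn spawn<W: Write + Send + 'static>(mut inner: W) -> (Self, JoinHandle<()>) {
        let (tx, rx) = mpsc::channel::<Vec<u8>>();
        let handle = thread::spawn(move || {
            for frame in rx {
                if inner.write_all(&frame).is_err() {
                    break;
                }
            }
        });
        (Self { pending: Vec::new(), tx }, handle)
    }
}

impl Write for HandOff {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.pending.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        let frame = std::mem::take(&mut self.pending);
        self.tx
            .send(frame)
            .map_err(|_| io::Error::new(io::ErrorKind::BrokenPipe, "writer thread exited"))
    }
}

/// Longest single flush() observed while the writer thread is busy sleeping.
fn worst_flush_latency(frames: usize) -> Duration {
    let (mut sink, handle) = HandOff::spawn(SlowSink);
    let mut worst = Duration::ZERO;
    for _ in 0..frames {
        sink.write_all(b"frame").unwrap();
        let start = Instant::now();
        sink.flush().unwrap();
        worst = worst.max(start.elapsed());
    }
    drop(sink); // closes the channel so the writer exits after draining
    handle.join().unwrap();
    worst
}

fn main() {
    let worst = worst_flush_latency(4);
    assert!(worst < Duration::from_millis(25), "flush blocked for {worst:?}");
}
```

The assertion is timing-based, so a heavily loaded machine could in principle trip it; the margin between a queue push and 25 ms is what makes that unlikely.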

Caveat — please confirm on a real terminal

I could not reproduce the exact wedge end-to-end headlessly: a throttled pty reader trips macOS pty flow-control that gates input on any binary (a harness artifact, not phux), and the detach keybinding won't fire through raw pty writes. The fix is mechanism-proven, but the definitive check is on a real terminal: open a multi-pane session, detach, re-attach — it should stay responsive instead of freezing.

🤖 Generated with Claude Code

…e it (phux-fysb)

Re-attaching to a multi-pane/window session froze the client: screen
stuck, Ctrl-C and detach dead. The attach select! loop rendered
SYNCHRONOUSLY -- every paint_full_frame/render_at ends in out.flush(),
and when `out` is the real tty that flush BLOCKS until the terminal
drains. Because the paint runs inside the biased conn.recv() arm, a slow
flush starves the stdin/signal arms. Multi-pane re-attach is the trigger:
its paint_full_frame burst is ~4x a single pane's bytes (measured 5.6KB
-> 22KB at 1->6 panes), enough to cross the wedge threshold a single
pane never reaches. (The "pane shells die" the user saw is collateral:
force-killing the wedged terminal EOFs the session.)

New StdoutSink threads through main_loop as `out`: writes buffer in
memory, flush() ships the complete-frame bytes to a dedicated OS thread
(phux-stdout) that owns real stdout and does the blocking write off the
runtime thread. The select! loop never blocks on the terminal, so input
and signals are always serviced. Backpressure: if the queued backlog
exceeds 256KiB the sink drops it and sets needs_resync; main_loop polls
that and repaints the latest state via paint_full_frame (self-contained,
supersedes the dropped diffs) -- memory bounded, screen renders at the
sink's pace.

Wiring keeps the synchronous test seam: run/run_with_predict ->
run_buffered (spawns the writer) -> attach_session; run_with_stdout(_predict)
still passes None,None so phux-server integration tests are unchanged.
On Detached/Err the writer is drained+joined before the terminal-reset
writes (so detach is prompt and the reset isn't garbled); SwitchTo keeps
the writer across the session switch.

Tests: stdout_writer unit tests incl. flush_does_not_block_on_a_slow_sink
(flush stays <25ms behind a 50ms/chunk sink -- the core property) and
overflow-drops-backlog-sets-resync. No render-path code writes stdout
directly, so every flush goes through the sink. 343 client tests +
attach lifecycle/snapshot/reattach/graceful-restore green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>