fix(rtmg): reap dead WS clients that wedge the send path#295
Merged
Conversation
The websockets sync send path (send_context → send_data) holds the
connection's protocol lock across a *blocking* socket.sendall with no
send timeout. When a client stops reading — a killed tab, a slept
laptop, or a Cloudflare named-tunnel half-open where the pod↔edge TCP
stays ESTABLISHED while the browser is gone — the next slice ws.send
wedges on a full TCP window and pins the protocol lock forever. The
library's own keepalive ping then deadlocks acquiring that same lock,
so the dead client is never detected: state.running stays True, the
runner holds the pod's one-session seat, and /sessions never empties
(lockHeld stays true). Measured: a frozen client left ~2.6 MB stuck in
the server's send queue with the session still registered 13+ minutes
later. The recv loop can't help either (a half-open peer sends no close
frame) and our own ConnectionClosed detectors are the threads stuck in
sendall.
Add a per-session dead-client watchdog that detects this from OUTSIDE
the wedged Python I/O threads, via GIL- and lock-independent syscalls on
the socket FD:
- send stall (SIOCOUTQ): the kernel send queue stays positive and
drains zero bytes for a grace window. A live client's kernel ACKs and
drains regardless of the client app's responsiveness, so a slow link,
a GC pause, or a backgrounded tab keeps draining and is never reaped;
only a client reading nothing trips it. Keying on kernel drainage —
not app-level ack progress — is what avoids the false-reap of a
GIL-starved-but-alive client that sinks an ack-stall timeout.
- peer close (TCP_INFO state != ESTABLISHED / FD error): the
"0 established sockets but session still registered" fingerprint,
where the library's recv thread missed the EOF.
On either signal it flips state.running False (runner exits → registry
unregister fires → /sessions empties → lockHeld flips false and the pod
auto-returns to the pool) and force-shuts the socket so the wedged
sendall and recv thread unblock at once. Grace is configurable via
DEMON_DEAD_CLIENT_STALL_S (default 30s).
Supersedes #293 (the ack-stall reaper), whose app-level ack-progress
signal false-reaps live-but-GIL-starved clients.
Validated on a 5090 pod: a genuinely half-open (frozen) client is reaped
via send_stall; clean close, an active client, a paused-but-alive client
(draining, zero acks, 78s), and a backpressured client with acks
trickling (78s, window-drops engaged) are all left untouched.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Off Linux the SIOCOUTQ/TCP_INFO syscalls fail and would reap every session, so gate the watchdog to Linux and no-op elsewhere. Add TCP_INFO.tcpi_bytes_acked as the drain signal so a backpressured-but- alive client whose send queue plateaus at a constant positive level isn't false-reaped. Hoist struct/sys imports. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The
websocketssync send path holds the connection's protocol lock across a blockingsocket.sendall. A half-open client (killed tab, slept laptop, Cloudflare tunnel still ESTABLISHED while the browser is gone) wedges the nextws.sendon a full TCP window and pins that lock forever. The keepalive ping then deadlocks on the same lock, so the dead client is never detected:state.runningstays True, the pod's one session seat is never released, and/sessionsnever empties (measured: ~2.6 MB stuck, session still registered 13+ min later).Add a per-session watchdog that detects this from outside the wedged I/O threads, via GIL-/lock-independent syscalls on the socket FD:
SIOCOUTQ(queued bytes) stays positive whileTCP_INFO.tcpi_bytes_ackedmakes no progress and the queue doesn't drain for a grace window. Keying on the kernel's acked-bytes counter (not app-level acks) leaves slow links, GC pauses, backgrounded tabs, and backpressured-but-alive clients untouched; only a client reading nothing trips it.TCP_INFOstate leaves ESTABLISHED, or the FD errors.On either signal it flips
state.runningFalse (runner exits → registry unregister → pod returns to pool) and force-shuts the socket so the wedgedsendalland recv thread unblock at once. Grace is configurable viaDEMON_DEAD_CLIENT_STALL_S(default 30s). Linux-only; no-ops elsewhere.Supersedes #293 (the app-level ack-stall reaper, which false-reaps GIL-starved-but-alive clients). Validated on a 5090 pod: a frozen client is reaped via send_stall; clean close, active, paused-but-alive, and backpressured clients are all left untouched.