[serve] Record WebSocket disconnect proxy metrics#64185
Conversation
3b31aea to
469024f
Compare
There was a problem hiding this comment.
Code Review
This pull request ensures that proxy metrics and access logs are recorded correctly when a client disconnects, particularly for WebSocket connections (which are assigned a status code of "1006"). It also handles GeneratorExit explicitly to prevent it from being treated as an unexpected server error. The review feedback highlights a critical issue where raising GeneratorExit leaves status as None for non-websocket requests, potentially causing an AttributeError when status.code is accessed later. A suggestion is provided to defensively set status to a "499" status code (client closed request) when catching GeneratorExit.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| except GeneratorExit: | ||
| raise |
There was a problem hiding this comment.
By adding except GeneratorExit: raise to prevent GeneratorExit from being caught as an unexpected 500 error, status remains None if the generator is closed before the response starts (e.g., when an HTTP client disconnects early).
In the finally block of send_request_to_replica, if status is None and the request type is not "websocket", it falls through to the else block:
else:
status_code = status.codeThis will raise an unhandled AttributeError: 'NoneType' object has no attribute 'code', masking the original GeneratorExit and potentially causing resource leaks or broken cleanup.
To fix this, we should defensively set status to an appropriate status code (such as "499" for client closed request) when catching GeneratorExit before re-raising it.
| except GeneratorExit: | |
| raise | |
| except GeneratorExit: | |
| if status is None: | |
| status = ResponseStatus(code="499", is_error=True) | |
| raise |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 469024f. Configure here.
Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
469024f to
e52b9e5
Compare
Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
e15b5f2 to
c973a14
Compare
Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
48a7592 to
0923bc3
Compare

Why are these changes needed?
Fixes #64077.
When an ASGI sender closes a WebSocket mid stream,
GenericProxy.proxy_requestcan be finalized atyield messagebefore it sees the finalResponseStatus. The ongoing request gauge is decremented, but request counters, error counters, latency metrics, and access logs are skipped.This moves proxy request accounting into the wrapper
finallyblock, so a finalized WebSocket stream is still counted once. If no status reached the wrapper, it records WebSocket code1006.The HTTP proxy also now lets
GeneratorExitpropagate instead of routing it through the normal exception response path. That avoids yielding while the async generator is being closed.Related issue number
#64077
Checks
python3 -m ruff check python/ray/serve/_private/proxy.py python/ray/serve/tests/unit/test_proxy.pypython3 -m ruff format --check python/ray/serve/_private/proxy.py python/ray/serve/tests/unit/test_proxy.pypython3 -m compileall -q python/ray/serve/_private/proxy.py python/ray/serve/tests/unit/test_proxy.pygit diff --checkI tried the focused pytest for the new test, but this local checkout cannot import
ray._rayletbecause the compiled Ray extension is not built here.