Skip to content

TCPClient: close the in-flight SSLIOStream on TLS handshake timeout#3673

Open
HrachShah wants to merge 1 commit into
tornadoweb:masterfrom
HrachShah:fix/tcpclient-tls-handshake-timeout-closes-stream
Open

TCPClient: close the in-flight SSLIOStream on TLS handshake timeout#3673
HrachShah wants to merge 1 commit into
tornadoweb:masterfrom
HrachShah:fix/tcpclient-tls-handshake-timeout-closes-stream

Conversation

@HrachShah

Copy link
Copy Markdown

Fixes #3614

When TCPClient.connect is called with both ssl_options and a timeout, a TLS handshake timeout leaks the underlying socket. IOStream.start_tls transfers socket ownership to a new SSLIOStream before the handshake completes, and gen.with_timeout deliberately leaves the wrapped future running on TimeoutError. The original stream's socket is already None, so its close() is a no-op, and the only caller-reachable reference to the in-flight SSLIOStream (which owns the real socket) is the inner future.

This change makes IOStream.start_tls keep a back-reference from the returned future to the new SSLIOStream (future._ssl_stream) before returning. TCPClient.connect captures that future and, on TimeoutError, calls getattr(handshake_future, "_ssl_stream", None) and close()s the stream if present. If the handshake later fails, SSLIOStream's own failure path closes the socket so the late callback is a no-op; if the handshake eventually succeeds, the caller is already gone (TimeoutError was raised) and the now-owned stream is closed at the application layer as usual.

TCPClientSSLTimeoutTest.test_tls_handshake_timeout_closes_ssl_stream in tornado/test/tcpclient_test.py patches IOStream.start_tls with a fake that does not actually drive the handshake (so the inner future stays pending), runs client.connect(..., ssl_options=..., timeout=0.05), asserts the TimeoutError, resolves the inner future, and asserts that the patched SSLIOStream.close was called exactly once. The test fails on the pre-fix code (the SSL stream is not closed) and passes with the fix.

IOStream.start_tls transfers ownership of the underlying socket to a new
SSLIOStream before the handshake completes and sets self.socket to None
on the original stream. The future start_tls returns is the only
caller-reachable reference to the new stream; the original stream's
close() is a no-op on the None socket.

If gen.with_timeout around start_tls fires, the original TimeoutError
propagates to the caller and the new SSLIOStream is left in flight. The
handshake may eventually fail (which closes the stream anyway) or it
may succeed (which leaves the socket held by the SSLIOStream with no
caller-side reference), so the socket fd leaks until GC.

Expose the new SSLIOStream on the handshake future via future._ssl_stream
so the timeout branch in TCPClient.connect can close it. TimeoutError
is re-raised after the close so the caller's behaviour is unchanged.

Regression test: monkey-patches IOStream.start_tls to return a pending
future while recording the new SSLIOStream, runs TCPClient.connect with
a short timeout, resolves the pending future, and asserts the recorded
SSLIOStream had close() called on it.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Socket leak in TCPClient.connect when TLS handshake times out

1 participant