quic: add TLS session ticket resumption support#42734
quic: add TLS session ticket resumption support#42734bellatoris wants to merge 12 commits intoenvoyproxy:mainfrom
Conversation
This change enables TLS session ticket resumption for QUIC connections, allowing clients to resume TLS sessions without full handshakes when reconnecting across server instances. Implementation details: - Add EnvoyTlsServerHandshaker that provides custom ProofSourceHandle to intercept certificate selection and configure session ticket options - Store filter chain pointer in SSL ex_data during handshake to enable session ticket callback to access the correct transport socket factory - Delegate session ticket encryption/decryption to ServerContextImpl which uses configured session_ticket_keys - Add runtime guard envoy.reloadable_features.quic_session_ticket_support (default false) to control feature enablement The feature integrates with existing DownstreamTlsContext configuration: - Uses session_ticket_keys from TLS context for ticket encryption - Respects disable_stateless_session_resumption setting - Honors handles_session_resumption capability flag Risk Level: Low (behind runtime guard, disabled by default) Signed-off-by: Doogie Min <doogie.min@sendbird.com>
|
Hi @bellatoris, welcome and thank you for your contribution. We will try to review your Pull Request as quickly as possible. In the meantime, please take a look at the contribution guidelines if you have not done so already. |
|
Please fix format. /wait |
Signed-off-by: Doogie Min <doogie.min@sendbird.com>
|
Please merge main. |
done |
|
@bellatoris looks like some CI checks need to be fixed |
a5c20f4 to
a290a7e
Compare
|
@nezdolik seems like they are flaky ones. except code coverage, all tests is passed. |
|
/wait Need merge of main |
|
Adding @RyanTheOptimist as a reviewer as @danzh2010 hasn't gotten to this yet. |
|
This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions! |
|
@RyanTheOptimist or @danzh2010 please review |
danzh2010
left a comment
There was a problem hiding this comment.
@RyanTheOptimist do you mind take a look at this. I'm not an expert of the TLS handshake implementation.
| SSL_set_ex_data(client_hello.ssl, EnvoyQuicProofSource::filterChainExDataIndex(), | ||
| const_cast<Network::FilterChain*>(&transport_data->filter_chain_)); | ||
|
|
||
| auto& factory = dynamic_cast<const QuicServerTransportSocketFactory&>( |
There was a problem hiding this comment.
static_cast should be sufficient.
| auto ticket_config = factory.getSessionTicketConfig(); | ||
| if (ticket_config.disable_stateless_resumption || !ticket_config.has_keys || | ||
| ticket_config.handles_session_resumption) { | ||
| SSL_set_options(client_hello.ssl, SSL_OP_NO_TICKET); |
There was a problem hiding this comment.
I thought QUICHE has the ability to disable ticket resumption so QUICHE user doesn't need to directly modify the boring SSL object. Is it missing the SSL_CTX_set_tlsext_ticket_key_cb? I'm not an expert in boring SSL, but my impression is that QUICHE completely hide the ssl object behind its own APIs. Maybe @RyanTheOptimist can shed some light on this?
And do we need to retain filter chain object in the ssl extension data? Can we store it in EnvoyProofSourceHandle or the handshaker?
There was a problem hiding this comment.
Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here. https://quiche.googlesource.com/quiche/+/refs/heads/main/quiche/quic/core/crypto/tls_connection.h?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F#100
There was a problem hiding this comment.
Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here.
I'd prefer to use TlsConnection::DisableTicketSupport() directly, but tls_connection() returns const TlsConnection* and there's no non-const accessor from the subclass. The wrapper calls the same SSL_set_options(ssl(), SSL_OP_NO_TICKET) via the protected ssl() accessor. Open to suggestions if there's a better way to reach it.
And do we need to retain filter chain object in the ssl extension data? Can we store it in EnvoyProofSourceHandle or the handshaker?
The tlsext_ticket_key_cb callback is a C function pointer that only receives SSL*, and TlsConnection::ConnectionFromSsl() is protected with no public path back to the handshaker. SSL ex_data was the most straightforward way I found to pass per-connection context into this callback. Happy to take a different approach if you have ideas.
There was a problem hiding this comment.
Sorry, which tls_connection() method are you referring to? TlsHandshaker::tls_connection()? We (QUICHE developers) could easily add a non-const version of that method if that would be helpful? Or we could call TlsServerHandshaker::DisableResumption() but maybe that's not ergonomic either?
There was a problem hiding this comment.
Yes, I'm referring to TlsHandshaker::tls_connection() which returns const TlsConnection*. A non-const version would be ideal.
TlsServerHandshaker::DisableResumption() won't work here because QUICHE sets can_disable_resumption_ = false immediately before calling SelectCertificate():
// tls_server_handshaker.cc, EarlySelectCertCallback():
can_disable_resumption_ = false; // line 1079
const QuicAsyncStatus status = proof_source_handle_->SelectCertificate( // line 1080: our code runs here
...);So when our EnvoyProofSourceHandle::SelectCertificate() needs to disable tickets based on per-filter-chain config, DisableResumption() returns false and does nothing:
// tls_server_handshaker.cc:
bool TlsServerHandshaker::DisableResumption() {
if (!can_disable_resumption_ || ...) { // already false
return false; // no-op
}
tls_connection_.DisableTicketSupport(); // never reached
return true;
}If you could add a non-const tls_connection() accessor, I'd switch to tls_connection()->DisableTicketSupport() directly.
There was a problem hiding this comment.
Ok, once #44019 lands, a non-const tls_connection() method will be available.
/wait
There was a problem hiding this comment.
@RyanTheOptimist i have addressed it, could you review it again?
| struct SessionTicketConfig { | ||
| bool has_keys; | ||
| bool disable_stateless_resumption; | ||
| bool handles_session_resumption; |
There was a problem hiding this comment.
Please comment on the meaning of these fields. Particular since the latter two both refer to "resumption" it would be good to make the distinction clear.
| auto ticket_config = factory.getSessionTicketConfig(); | ||
| if (ticket_config.disable_stateless_resumption || !ticket_config.has_keys || | ||
| ticket_config.handles_session_resumption) { | ||
| SSL_set_options(client_hello.ssl, SSL_OP_NO_TICKET); |
There was a problem hiding this comment.
Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here. https://quiche.googlesource.com/quiche/+/refs/heads/main/quiche/quic/core/crypto/tls_connection.h?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F#100
| #include "source/common/quic/quic_server_transport_socket_factory.h" | ||
| #include "source/server/listener_stats.h" | ||
|
|
||
| #include "openssl/ssl.h" |
Oh sorry. the "waiting" tag filtered this out of my "to do" list query. |
ff12350 to
3d0f9b8
Compare
Resolve conflicts: - changelogs/current.yaml: keep both PR entry and upstream entries - source/common/quic/BUILD: keep macros dep, remove cert_compression_lib (upstream removed), update quiche dep names - source/common/quic/envoy_quic_proof_source.cc: use registerCertCompression (upstream rename), remove cert_compression.h include Signed-off-by: Doogie Min <doogie.min@sendbird.com>
- Replace dynamic_cast with static_cast for QuicServerTransportSocketFactory - Use handshaker->disableTicketSupport() instead of direct SSL_set_options - Add documentation comments to SessionTicketConfig fields - Remove unnecessary #include "openssl/ssl.h" from envoy_quic_proof_source.h - Fix SelectCertificate signature to match upstream QUICHE API change (added bool disable_alps_explicit_codepoint parameter) Signed-off-by: Doogie Min <doogie.min@sendbird.com>
|
@bellatoris please merge main. @RyanTheOptimist please re-review. |
…ource-update # Conflicts: # changelogs/current.yaml
3d0f9b8 to
8060975
Compare
The QUICHE API parameter `disable_alps_explicit_codepoint` triggers the spelling checker. The plural form `codepoints` already exists. Signed-off-by: Doogie Min <doogie.min@sendbird.com>
|
thx @paul-r-gall main merged |
|
/retest |
1 similar comment
|
/retest |
Add tests to improve coverage for source/common/quic and source/extensions/quic/crypto_stream: quic_transport_socket_factory_test.cc: - CreateDownstreamTransportSocketPanics: verify PANIC stub - GetSessionTicketConfig: verify default config values - SessionTicketProcessNullCtx: verify null ssl_ctx returns 0 envoy_quic_proof_source_test.cc: - FilterChainExDataIndex: verify stable non-negative index - OnNewSslCtxWithSessionTicketSupport: smoke test with guard enabled - OnNewSslCtxWithSessionTicketSupportDisabled: smoke test with guard disabled envoy_quic_dispatcher_test.cc: - HandshakeWithSessionTicketSupport: process CHLO through EnvoyTlsServerHandshaker with runtime guard enabled to exercise SelectCertificate, ComputeSignature, and disableTicketSupport Signed-off-by: Doogie Min <doogie.min@sendbird.com>
6786164 to
2123532
Compare
For context: envoyproxy/envoy#42734 (comment) PiperOrigin-RevId: 885760611
|
/retest |
Add tests to improve coverage for source/common/quic and source/extensions/quic/crypto_stream: Source changes: - Remove custom disableTicketSupport() wrapper, use QUICHE's new non-const tls_connection().DisableTicketSupport() API instead - Remove unnecessary openssl/ssl.h include from handshaker header quic_transport_socket_factory_test.cc: - CreateDownstreamTransportSocketPanics: verify PANIC stub - GetSessionTicketConfig: verify default config values - SessionTicketProcessNullCtx: verify null ssl_ctx returns 0 envoy_quic_proof_source_test.cc: - FilterChainExDataIndex: verify stable non-negative index - OnNewSslCtxWithSessionTicketSupport: smoke test with guard enabled - OnNewSslCtxWithSessionTicketSupportDisabled: smoke test with guard disabled envoy_quic_dispatcher_test.cc: - HandshakeWithSessionTicketSupport: process CHLO through EnvoyTlsServerHandshaker with runtime guard enabled to exercise SelectCertificate and session ticket configuration paths Signed-off-by: Doogie Min <doogie.min@sendbird.com>
f9527db to
72209d1
Compare
…rage tests Extract the stateless lambda in OnNewSslCtx to a public static method ticketKeyCallback so it can be unit-tested directly. Use dynamic_cast (consistent with getTransportSocketAndFilterChain in the same file). Add two new tests: - TicketKeyCallbackNullFilterChain: verifies return 0 when no filter chain ex_data is set on the SSL - TicketKeyCallbackWithFilterChain: verifies delegation through to the transport socket factory sessionTicketProcess Also extract loadCertsIntoFactory() helper to eliminate 3x copy-pasted cert-loading mock setup boilerplate across tests. Signed-off-by: Doogie Min <doogie.min@sendbird.com>
72209d1 to
845a200
Compare
Commit Message: quic: add session ticket resumption support using configured session ticket keys
Additional Description:
Summary
TLS session resumption is essential for QUIC performance. Without it, every connection requires a full TLS handshake, and 0-RTT becomes meaningless since there's no session state to resume from. As noted in #42682, TLS-related data accounts for roughly 1/3 of bytes during connection establishment - session resumption eliminates most of this overhead.
Currently, Envoy's QUIC implementation does not support session resumption across workers or processes. While users can configure
session_ticket_keysorsession_ticket_keys_sds_secret_configin downstream TLS context, these settings have no effect on QUIC connections. This limitation is documented in #25418, which explicitly states that session ticket key plumbing is missing from the QUIC implementation.This PR bridges that gap by enabling QUIC to use the same session ticket keys configured for TCP TLS, allowing session resumption to work across workers and processes.
Implementation
We create a custom
EnvoyTlsServerHandshakerthat extends QUICHE'sTlsServerHandshakerand providesEnvoyProofSourceHandleto pass filter chain context during certificate selection.Note: QUICHE's
DefaultProofSourceHandleis a private nested class insideTlsServerHandshaker, so we cannot extend or reuse it directly. We had to copy the relevant implementation and add our customizations.Key design decisions:
SSL ex_data for context passing: During
SelectCertificate(), we store the filter chain pointer in SSL ex_data. This allows the session ticket callback to retrieve per-connection configuration.SSL_CTX_set_tlsext_ticket_key_cboverSSL_CTX_set_ticket_aead_method: We use the same callback mechanism as TCP TLS rather than QUICHE'sTicketCrypterinterface. This allows us to reuseServerContextImpl::sessionTicketProcess()directly, ensuring identical session ticket handling between TCP and QUIC.Respect existing configuration: We check
disable_stateless_session_resumption,session_ticket_keys, andhandles_session_resumptionfrom the transport socket factory and disable tickets accordingly viaSSL_OP_NO_TICKET.Flow
Risk Level: Low (behind runtime guard, disabled by default)
Testing: Existing tests, manual verification with QUIC clients
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A
[Optional Runtime guard:]
envoy.reloadable_features.quic_session_ticket_support(default: false)[Optional Fixes #Issue] Partially addresses #25418