Skip to content

quic: add TLS session ticket resumption support#42734

Open
bellatoris wants to merge 12 commits intoenvoyproxy:mainfrom
bellatoris:doogie/quic-proof-source-update
Open

quic: add TLS session ticket resumption support#42734
bellatoris wants to merge 12 commits intoenvoyproxy:mainfrom
bellatoris:doogie/quic-proof-source-update

Conversation

@bellatoris
Copy link
Contributor

Commit Message: quic: add session ticket resumption support using configured session ticket keys
Additional Description:

Summary

TLS session resumption is essential for QUIC performance. Without it, every connection requires a full TLS handshake, and 0-RTT becomes meaningless since there's no session state to resume from. As noted in #42682, TLS-related data accounts for roughly 1/3 of bytes during connection establishment - session resumption eliminates most of this overhead.

Currently, Envoy's QUIC implementation does not support session resumption across workers or processes. While users can configure session_ticket_keys or session_ticket_keys_sds_secret_config in downstream TLS context, these settings have no effect on QUIC connections. This limitation is documented in #25418, which explicitly states that session ticket key plumbing is missing from the QUIC implementation.

This PR bridges that gap by enabling QUIC to use the same session ticket keys configured for TCP TLS, allowing session resumption to work across workers and processes.

Implementation

We create a custom EnvoyTlsServerHandshaker that extends QUICHE's TlsServerHandshaker and provides EnvoyProofSourceHandle to pass filter chain context during certificate selection.

Note: QUICHE's DefaultProofSourceHandle is a private nested class inside TlsServerHandshaker, so we cannot extend or reuse it directly. We had to copy the relevant implementation and add our customizations.

Key design decisions:

  1. SSL ex_data for context passing: During SelectCertificate(), we store the filter chain pointer in SSL ex_data. This allows the session ticket callback to retrieve per-connection configuration.

  2. SSL_CTX_set_tlsext_ticket_key_cb over SSL_CTX_set_ticket_aead_method: We use the same callback mechanism as TCP TLS rather than QUICHE's TicketCrypter interface. This allows us to reuse ServerContextImpl::sessionTicketProcess() directly, ensuring identical session ticket handling between TCP and QUIC.

  3. Respect existing configuration: We check disable_stateless_session_resumption, session_ticket_keys, and handles_session_resumption from the transport socket factory and disable tickets accordingly via SSL_OP_NO_TICKET.

Flow

EnvoyTlsServerHandshaker::MaybeCreateProofSourceHandle()
  └─→ EnvoyProofSourceHandle::SelectCertificate()
        └─→ SSL_set_ex_data(ssl, filter_chain)

SSL_CTX_set_tlsext_ticket_key_cb (installed in OnNewSslCtx)
  └─→ SSL_get_ex_data(ssl) → filter_chain
        └─→ QuicServerTransportSocketFactory::sessionTicketProcess()
              └─→ ServerContextImpl::sessionTicketProcess()

Risk Level: Low (behind runtime guard, disabled by default)
Testing: Existing tests, manual verification with QUIC clients
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A
[Optional Runtime guard:] envoy.reloadable_features.quic_session_ticket_support (default: false)
[Optional Fixes #Issue] Partially addresses #25418

This change enables TLS session ticket resumption for QUIC connections,
allowing clients to resume TLS sessions without full handshakes when
reconnecting across server instances.

Implementation details:
- Add EnvoyTlsServerHandshaker that provides custom ProofSourceHandle
  to intercept certificate selection and configure session ticket options
- Store filter chain pointer in SSL ex_data during handshake to enable
  session ticket callback to access the correct transport socket factory
- Delegate session ticket encryption/decryption to ServerContextImpl
  which uses configured session_ticket_keys
- Add runtime guard envoy.reloadable_features.quic_session_ticket_support
  (default false) to control feature enablement

The feature integrates with existing DownstreamTlsContext configuration:
- Uses session_ticket_keys from TLS context for ticket encryption
- Respects disable_stateless_session_resumption setting
- Honors handles_session_resumption capability flag

Risk Level: Low (behind runtime guard, disabled by default)

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@repokitteh-read-only
Copy link

Hi @bellatoris, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #42734 was opened by bellatoris.

see: more, trace.

@repokitteh-read-only
Copy link

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #42734 was opened by bellatoris.

see: more, trace.

@ravenblackx
Copy link
Contributor

Please fix format.

/wait

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@kyessenov
Copy link
Contributor

Please merge main.
/wait

@bellatoris
Copy link
Contributor Author

Please merge main.

done

@nezdolik
Copy link
Member

@bellatoris looks like some CI checks need to be fixed

@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch from a5c20f4 to a290a7e Compare January 30, 2026 12:42
@bellatoris
Copy link
Contributor Author

@nezdolik seems like they are flaky ones. except code coverage, all tests is passed.

@KBaichoo
Copy link
Contributor

KBaichoo commented Feb 5, 2026

/wait

Need merge of main

@ggreenway
Copy link
Member

Adding @RyanTheOptimist as a reviewer as @danzh2010 hasn't gotten to this yet.

@github-actions
Copy link

github-actions bot commented Mar 7, 2026

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label Mar 7, 2026
@ggreenway
Copy link
Member

@RyanTheOptimist or @danzh2010 please review

@github-actions github-actions bot removed the stale stalebot believes this issue/PR has not been touched recently label Mar 9, 2026
Copy link
Contributor

@danzh2010 danzh2010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RyanTheOptimist do you mind take a look at this. I'm not an expert of the TLS handshake implementation.

SSL_set_ex_data(client_hello.ssl, EnvoyQuicProofSource::filterChainExDataIndex(),
const_cast<Network::FilterChain*>(&transport_data->filter_chain_));

auto& factory = dynamic_cast<const QuicServerTransportSocketFactory&>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

static_cast should be sufficient.

auto ticket_config = factory.getSessionTicketConfig();
if (ticket_config.disable_stateless_resumption || !ticket_config.has_keys ||
ticket_config.handles_session_resumption) {
SSL_set_options(client_hello.ssl, SSL_OP_NO_TICKET);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought QUICHE has the ability to disable ticket resumption so QUICHE user doesn't need to directly modify the boring SSL object. Is it missing the SSL_CTX_set_tlsext_ticket_key_cb? I'm not an expert in boring SSL, but my impression is that QUICHE completely hide the ssl object behind its own APIs. Maybe @RyanTheOptimist can shed some light on this?

And do we need to retain filter chain object in the ssl extension data? Can we store it in EnvoyProofSourceHandle or the handshaker?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here. https://quiche.googlesource.com/quiche/+/refs/heads/main/quiche/quic/core/crypto/tls_connection.h?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F#100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here.

I'd prefer to use TlsConnection::DisableTicketSupport() directly, but tls_connection() returns const TlsConnection* and there's no non-const accessor from the subclass. The wrapper calls the same SSL_set_options(ssl(), SSL_OP_NO_TICKET) via the protected ssl() accessor. Open to suggestions if there's a better way to reach it.

And do we need to retain filter chain object in the ssl extension data? Can we store it in EnvoyProofSourceHandle or the handshaker?

The tlsext_ticket_key_cb callback is a C function pointer that only receives SSL*, and TlsConnection::ConnectionFromSsl() is protected with no public path back to the handshaker. SSL ex_data was the most straightforward way I found to pass per-connection context into this callback. Happy to take a different approach if you have ideas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, which tls_connection() method are you referring to? TlsHandshaker::tls_connection()? We (QUICHE developers) could easily add a non-const version of that method if that would be helpful? Or we could call TlsServerHandshaker::DisableResumption() but maybe that's not ergonomic either?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'm referring to TlsHandshaker::tls_connection() which returns const TlsConnection*. A non-const version would be ideal.

TlsServerHandshaker::DisableResumption() won't work here because QUICHE sets can_disable_resumption_ = false immediately before calling SelectCertificate():

// tls_server_handshaker.cc, EarlySelectCertCallback():
can_disable_resumption_ = false;                                          // line 1079
const QuicAsyncStatus status = proof_source_handle_->SelectCertificate(   // line 1080: our code runs here
    ...);

So when our EnvoyProofSourceHandle::SelectCertificate() needs to disable tickets based on per-filter-chain config, DisableResumption() returns false and does nothing:

  // tls_server_handshaker.cc:
  bool TlsServerHandshaker::DisableResumption() {
    if (!can_disable_resumption_ || ...) {  // already false
      return false;                         // no-op
    }
    tls_connection_.DisableTicketSupport(); // never reached
    return true;
  }

If you could add a non-const tls_connection() accessor, I'd switch to tls_connection()->DisableTicketSupport() directly.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, once #44019 lands, a non-const tls_connection() method will be available.

/wait

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RyanTheOptimist i have addressed it, could you review it again?

struct SessionTicketConfig {
bool has_keys;
bool disable_stateless_resumption;
bool handles_session_resumption;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please comment on the meaning of these fields. Particular since the latter two both refer to "resumption" it would be good to make the distinction clear.

auto ticket_config = factory.getSessionTicketConfig();
if (ticket_config.disable_stateless_resumption || !ticket_config.has_keys ||
ticket_config.handles_session_resumption) {
SSL_set_options(client_hello.ssl, SSL_OP_NO_TICKET);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah QUICHE provides this method. Is it sufficient? in fact the whole TlsConnection object looks useful here. https://quiche.googlesource.com/quiche/+/refs/heads/main/quiche/quic/core/crypto/tls_connection.h?autodive=0%2F%2F%2F%2F%2F%2F%2F%2F#100

#include "source/common/quic/quic_server_transport_socket_factory.h"
#include "source/server/listener_stats.h"

#include "openssl/ssl.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?

@RyanTheOptimist
Copy link
Contributor

@RyanTheOptimist or @danzh2010 please review

Oh sorry. the "waiting" tag filtered this out of my "to do" list query.

@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch 3 times, most recently from ff12350 to 3d0f9b8 Compare March 12, 2026 11:55
Resolve conflicts:
- changelogs/current.yaml: keep both PR entry and upstream entries
- source/common/quic/BUILD: keep macros dep, remove cert_compression_lib
  (upstream removed), update quiche dep names
- source/common/quic/envoy_quic_proof_source.cc: use registerCertCompression
  (upstream rename), remove cert_compression.h include

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
- Replace dynamic_cast with static_cast for QuicServerTransportSocketFactory
- Use handshaker->disableTicketSupport() instead of direct SSL_set_options
- Add documentation comments to SessionTicketConfig fields
- Remove unnecessary #include "openssl/ssl.h" from envoy_quic_proof_source.h
- Fix SelectCertificate signature to match upstream QUICHE API change
  (added bool disable_alps_explicit_codepoint parameter)

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@paul-r-gall
Copy link
Contributor

@bellatoris please merge main.

@RyanTheOptimist please re-review.

…ource-update

# Conflicts:
#	changelogs/current.yaml
@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch from 3d0f9b8 to 8060975 Compare March 17, 2026 13:52
The QUICHE API parameter `disable_alps_explicit_codepoint` triggers
the spelling checker. The plural form `codepoints` already exists.

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris
Copy link
Contributor Author

thx @paul-r-gall main merged

@bellatoris
Copy link
Contributor Author

/retest

1 similar comment
@bellatoris
Copy link
Contributor Author

/retest

Add tests to improve coverage for source/common/quic and
source/extensions/quic/crypto_stream:

quic_transport_socket_factory_test.cc:
- CreateDownstreamTransportSocketPanics: verify PANIC stub
- GetSessionTicketConfig: verify default config values
- SessionTicketProcessNullCtx: verify null ssl_ctx returns 0

envoy_quic_proof_source_test.cc:
- FilterChainExDataIndex: verify stable non-negative index
- OnNewSslCtxWithSessionTicketSupport: smoke test with guard enabled
- OnNewSslCtxWithSessionTicketSupportDisabled: smoke test with guard disabled

envoy_quic_dispatcher_test.cc:
- HandshakeWithSessionTicketSupport: process CHLO through
  EnvoyTlsServerHandshaker with runtime guard enabled to exercise
  SelectCertificate, ComputeSignature, and disableTicketSupport

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch from 6786164 to 2123532 Compare March 18, 2026 01:24
copybara-service bot pushed a commit to google/quiche that referenced this pull request Mar 18, 2026
@bellatoris
Copy link
Contributor Author

/retest

Add tests to improve coverage for source/common/quic and
source/extensions/quic/crypto_stream:

Source changes:
- Remove custom disableTicketSupport() wrapper, use QUICHE's new
  non-const tls_connection().DisableTicketSupport() API instead
- Remove unnecessary openssl/ssl.h include from handshaker header

quic_transport_socket_factory_test.cc:
- CreateDownstreamTransportSocketPanics: verify PANIC stub
- GetSessionTicketConfig: verify default config values
- SessionTicketProcessNullCtx: verify null ssl_ctx returns 0

envoy_quic_proof_source_test.cc:
- FilterChainExDataIndex: verify stable non-negative index
- OnNewSslCtxWithSessionTicketSupport: smoke test with guard enabled
- OnNewSslCtxWithSessionTicketSupportDisabled: smoke test with guard disabled

envoy_quic_dispatcher_test.cc:
- HandshakeWithSessionTicketSupport: process CHLO through
  EnvoyTlsServerHandshaker with runtime guard enabled to exercise
  SelectCertificate and session ticket configuration paths

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch from f9527db to 72209d1 Compare March 20, 2026 06:35
…rage tests

Extract the stateless lambda in OnNewSslCtx to a public static method
ticketKeyCallback so it can be unit-tested directly. Use dynamic_cast
(consistent with getTransportSocketAndFilterChain in the same file).

Add two new tests:
- TicketKeyCallbackNullFilterChain: verifies return 0 when no filter chain
  ex_data is set on the SSL
- TicketKeyCallbackWithFilterChain: verifies delegation through to the
  transport socket factory sessionTicketProcess

Also extract loadCertsIntoFactory() helper to eliminate 3x copy-pasted
cert-loading mock setup boilerplate across tests.

Signed-off-by: Doogie Min <doogie.min@sendbird.com>
@bellatoris bellatoris force-pushed the doogie/quic-proof-source-update branch from 72209d1 to 845a200 Compare March 20, 2026 06:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants