Skip to content

chore: tidy up SSL context creation now that we've dropped Python 3.8#2364

Open
james-garner-canonical wants to merge 2 commits intocanonical:mainfrom
james-garner-canonical:26-03+chore+cleanup-sslcontext-creation
Open

chore: tidy up SSL context creation now that we've dropped Python 3.8#2364
james-garner-canonical wants to merge 2 commits intocanonical:mainfrom
james-garner-canonical:26-03+chore+cleanup-sslcontext-creation

Conversation

@james-garner-canonical
Copy link
Contributor

@james-garner-canonical james-garner-canonical commented Mar 5, 2026

This PR tidies up SSL context in ops-tracing, using create_default_context now that we've dropped Python 3.8.

Due to changes to the default behaviour in Python 3.13, we still need two lines of manual changes afterwards. The docs say:

The protocol, options, cipher and other settings may change to more restrictive values anytime without prior deprecation. The values represent a fair balance between compatibility and security.
If your application needs specific settings, you should create a SSLContext and apply the settings yourself.

This makes me wonder if we should stick to manually creating the SSLContext ourselves, though the docs do indicate that applying the settings is OK too -- I'm just wondering if anything else relevant might change underneath us.

Resolves #2055

# Note that ssl.create_default_context() doesn't allow setting the context.protocol in a
# way that's the same across Python 3.8 and 3.10 onwards. Whip the context up by hand.
context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
context = ssl.create_default_context(cadata=ca)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ssl.create_default_context defaults to purpose=Purpose.SERVER_AUTH. This uses SSLContext(ssl.PROTOCOL_TLS_CLIENT).

Comment on lines +112 to +115
# The partial chain flag allows trusting intermediate CAs in the CA list
# without the matching root CA.
# Enabled by default in Python 3.13+, needed explicitly before that.
context.verify_flags |= ssl.VERIFY_X509_PARTIAL_CHAIN
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is annoyingly interleaved here. This chunk of additions corresponds to these lines below:

-        if partial_chain := getattr(ssl, 'VERIFY_X509_PARTIAL_CHAIN', None):
-            # Available starting from Python 3.10. The partial chain flag allows trusting an
-            # intermediate CAs in the CA list without the matching root CA.
-            context.verify_flags |= partial_chain

Comment on lines +119 to +120
# Enabled by default in Python 3.13+, so must be explicitly cleared.
context.verify_flags &= ~ssl.VERIFY_X509_STRICT
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This diff is annoyingly interleaved here. This addition corresponds to this line above:

-        # context.verify_flags |= ssl.VERIFY_X509_STRICT

And matches this retained comment:

        # Can't use strict certificate chain validation until our self-signed ca is fixed
        # https://github.com/canonical/self-signed-certificates-operator/issues/330
        # https://github.com/canonical/tls-certificates-interface/pull/333

Comment on lines -123 to -126
if ca is not None:
context.load_verify_locations(cadata=ca)
else:
context.load_default_certs()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now handled by create_default_context. Per the docs:

Passing SERVER_AUTH as purpose sets verify_mode to CERT_REQUIRED and either loads CA certificates (when at least one of cafile, capath or cadata is given) or uses SSLContext.load_default_certs() to load default CA certificates.

@james-garner-canonical james-garner-canonical marked this pull request as ready for review March 5, 2026 03:26
# Enabled by default in Python 3.13+, needed explicitly before that.
context.verify_flags |= ssl.VERIFY_X509_PARTIAL_CHAIN
# Can't use strict certificate chain validation until our self-signed ca is fixed
# https://github.com/canonical/self-signed-certificates-operator/issues/330
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dimaqq
Copy link
Contributor

dimaqq commented Mar 5, 2026

How I would validate this change:

  1. Hack up Ops (ipython or unit test) to try to connect to (somewhere).
  2. Run the {old,new} code x Python {3.8, 3.10, 3.12}.
  3. Capture the traffic (e.g. wireshark) and inspect the initial handshake

How to inspect: one of

  • A public HTTPS endpoint (like google.com) + Wireshark, or
  • openssl s_server with some cert and cmd line flags that show decoded packets

(We've got integration tests, one uses self-signed-certs, but we don't have an integration test against a real cert provider).

@james-garner-canonical
Copy link
Contributor Author

james-garner-canonical commented Mar 5, 2026

How I would validate this change:

  1. Hack up Ops (ipython or unit test) to try to connect to (somewhere).
  2. Run the {old,new} code x Python {3.8, 3.10, 3.12}.
  3. Capture the traffic (e.g. wireshark) and inspect the initial handshake

How to inspect: one of

  • A public HTTPS endpoint (like google.com) + Wireshark, or
  • openssl s_server with some cert and cmd line flags that show decoded packets

I'm not sure that this is necessary, as the SSLContext objects created with code on main and the code from this PR should be identical on all the Python versions we test on (Python 3.10, 3.12, 3.14). I've added a test to demonstrate this, though I'm not sure it's worth keeping it around.

(We've got integration tests, one uses self-signed-certs, but we don't have an integration test against a real cert provider).

I forgot that we don't run integration tests on PRs, I've kicked off a run here.

Comment on lines +132 to +136
for attr in dir(new):
new_attr = getattr(new, attr)
if callable(new_attr):
continue
assert new_attr == getattr(original, attr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this only were so simple.

IIRC we set ALPN or NPN protocols, but those can't be inspected.
I can't be sure if that's the only setting like that.

The point is that the change needs to be validated outside of Python process.

P.S. this equivalence test should be transient, I don't think that it needs to be checked in.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, one of the unchanged lines in in updated method is context.set_alpn_protocols(['http/1.1']), and it doesn't look like we can get them back from the context. But given that context is still an ssl.SSLContext, why would we expect this line to do anything different now? create_default_context doesn't touch ALPN protocols.

I guess the real concern is that there are some other under the hood settings that we care about that are broken for us with create_default_context. If that's the case, then given that the docs indicate that create_default_context may change settings to be 'more restrictive' at any time, I wonder if create_default_context is the right fit for us, as we'd need to repeat manual validation whenever ssl changes.

P.S. this equivalence test should be transient, I don't think that it needs to be checked in.

Agreed.

Comment on lines +109 to +120
context = ssl.create_default_context(cadata=ca)
context.minimum_version = ssl.TLSVersion.TLSv1_2 # See comment at the top of module
context.set_alpn_protocols(['http/1.1'])
# The partial chain flag allows trusting intermediate CAs in the CA list
# without the matching root CA.
# Enabled by default in Python 3.13+, needed explicitly before that.
context.verify_flags |= ssl.VERIFY_X509_PARTIAL_CHAIN
# Can't use strict certificate chain validation until our self-signed ca is fixed
# https://github.com/canonical/self-signed-certificates-operator/issues/330
# https://github.com/canonical/tls-certificates-interface/pull/333
# context.verify_flags |= ssl.VERIFY_X509_STRICT
if partial_chain := getattr(ssl, 'VERIFY_X509_PARTIAL_CHAIN', None):
# Available starting from Python 3.10. The partial chain flag allows trusting an
# intermediate CAs in the CA list without the matching root CA.
context.verify_flags |= partial_chain

if ca is not None:
context.load_verify_locations(cadata=ca)
else:
context.load_default_certs()

# Enabled by default in Python 3.13+, so must be explicitly cleared.
context.verify_flags &= ~ssl.VERIFY_X509_STRICT
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having eyeballed this, it looks good.

Copy link
Collaborator

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed this in detail - one question first: it feels like the goal of the issue was to get rid of creating a custom context that was required to support 3.8. However, because of other issues, we're still creating a custom context. What are we gaining here? Should we be waiting for some future Python version instead?

@james-garner-canonical
Copy link
Contributor Author

I haven't reviewed this in detail - one question first: it feels like the goal of the issue was to get rid of creating a custom context that was required to support 3.8. However, because of other issues, we're still creating a custom context. What are we gaining here? Should we be waiting for some future Python version instead?

Yeah I feel the same way, and I suspect that create_default_context is always going to have this issue, or at least require careful attention with every Python release.

@tonyandrewmeyer
Copy link
Collaborator

Yeah I feel the same way, and I suspect that create_default_context is always going to have this issue, or at least require careful attention with every Python release.

Does that mean that it's not worth doing this issue, or that we need to have an issue like this with each Python release to handle whatever quirks are there? Or something else?

@dimaqq
Copy link
Contributor

dimaqq commented Mar 9, 2026

Given that all we promise is this: https://documentation.ubuntu.com/ops/latest/reference/ops-tracing/#ops-tracing-security

I think that our CI should cover what we need.

Let's remove the transient equivalence test and merge.

@dimaqq
Copy link
Contributor

dimaqq commented Mar 9, 2026

P.S. we could push min TLS to 1.3, because:

However, that could be a subtle breaking change for someone with an older, deployed COS.
So I'll punt that change to Ops 4.0 list.

Copy link
Contributor

@dimaqq dimaqq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove the equivalence test though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Clean up SSLContext handling in ops-tracing

3 participants