CNTRLPLANE-3741: PKI config tests for service-ca and kube-apiserver-operator#31341
CNTRLPLANE-3741: PKI config tests for service-ca and kube-apiserver-operator#31341kaleemsiddiqu wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds extended e2e coverage for configurable PKI profiles, including PKI test registration, shared helpers, and kube-apiserver and service-ca certificate regeneration checks for uniform and mixed key settings. ChangesConfigurable PKI e2e coverage
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Scheduling required tests: |
13280d2 to
1820a38
Compare
|
/test e2e-gcp-ovn-techpreview-serial-1of2 |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (1)
test/extended/pki/helpers.go (1)
108-125: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low value
buildKeyConfigsilently returns an incomplete config for unrecognized algorithms.If
algorithmis neither RSA nor ECDSA, onlyAlgorithmis set and both key bodies stay zero-valued, which could produce a confusing apply result downstream. Consider asserting/erroring on the default case so misconfigured test cases fail fast.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/pki/helpers.go` around lines 108 - 125, buildKeyConfig currently falls through for any algorithm other than KeyAlgorithmRSA or KeyAlgorithmECDSA, leaving KeyConfig partially populated and hiding misconfigured test inputs. Update buildKeyConfig to explicitly handle the supported algorithms and add a default branch that fails fast (for example by asserting or returning an error) when an unrecognized KeyAlgorithm is passed, so callers of buildKeyConfig get an immediate signal instead of an incomplete config.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/pki/helpers.go`:
- Around line 284-293: waitForSecretRegeneration currently treats any existing
secret as success, so it can return the old secret or a newly created secret
before cert data is ready. Update the helper in helpers.go to wait for a
genuinely regenerated secret by comparing against the deleted secret’s
UID/resourceVersion (or by confirming the secret is gone first), and then
require that the new secret contains certKey in Data before returning. Adjust
the call site in pki_kube_apiserver.go to pass the identifier needed to
distinguish the old secret from the recreated one.
- Around line 168-174: The PKI create-or-update flow in the helper currently
treats every Get failure as “missing” and falls back to Create, which can hide
RBAC, timeout, or server errors. Update the logic in the PKI helper path to only
call Create when configClient.ConfigV1alpha1().PKIs().Get returns
apierrors.IsNotFound(err), and otherwise return the original error; apply the
same NotFound-gated pattern in applyMixedPKIConfig as well.
In `@test/extended/pki/pki_kube_apiserver.go`:
- Around line 378-383: The deleteCertificateSecret failure path in
pki_kube_apiserver.go can mask a false pass because the loop only logs a warning
and continues without verifying anything. Update the certificate regeneration
checks around deleteCertificateSecret in the relevant test loop(s) so that a
missing/un-deletable expected secret causes the test to fail, or track whether
at least one certificate was successfully deleted and verified before allowing
the spec to pass. Apply the same fix to the duplicate verification block that
uses the same pattern elsewhere in the file.
- Around line 184-188: The deferred cleanup in `g.DeferCleanup` is using
`context.Background()`, which drops the test/spec context and leaves
`cleanupPKIConfiguration` unbounded; update the cleanup path to derive from the
existing spec context in `pki_kube_apiserver.go` using a detached context (for
example via `context.WithoutCancel`) and wrap it with a timeout before calling
`cleanupPKIConfiguration` so cleanup still has context values but cannot run
indefinitely.
- Around line 25-171: The package-level integrationTestCertificates slice in
pki_kube_apiserver.go is unused by the kube-apiserver test helpers. Remove the
dead variable, or update the relevant test setup functions to reference
integrationTestCertificates directly so the shared certificate list is actually
consumed. Use the integrationTestCertificates symbol and the kube-apiserver PKI
test helpers in this file to locate the change.
- Around line 173-174: The top-level ginkgo recovery hook in the `Describe` body
is unnecessary here. Remove the `defer g.GinkgoRecover()` from the `PKI
Configuration` `Describe` block in `pki_kube_apiserver.go`, and only add
`GinkgoRecover()` inside any spawned goroutine that may call `Fail` or Gomega in
the future.
In `@test/extended/pki/pki_service_ca.go`:
- Around line 331-333: The cleanup in the deferred namespace deletion is
ignoring a returned error, which can hide failed teardown; update the defer in
the pki service CA test to capture the result of
kubeClient.CoreV1().Namespaces().Delete and explicitly assert or log any error
instead of discarding it. Use the existing cleanup block around the testNS
deletion in the deferred func and keep the failure handling consistent with the
test’s other error checks so namespace leaks are surfaced.
- Around line 33-35: The deferred cleanup in the Ginkgo test uses
context.Background(), which disconnects it from the spec context. Update the
DeferCleanup block in pki_service_ca.go to derive a detached context from the
existing test context using context.WithoutCancel(ctx) and wrap it with a
bounded timeout, then pass that context into cleanupPKIConfiguration so cleanup
still runs independently without losing test-scoped values. Use the existing
test context variable from the surrounding spec and keep the cleanup logic
inside g.DeferCleanup.
---
Nitpick comments:
In `@test/extended/pki/helpers.go`:
- Around line 108-125: buildKeyConfig currently falls through for any algorithm
other than KeyAlgorithmRSA or KeyAlgorithmECDSA, leaving KeyConfig partially
populated and hiding misconfigured test inputs. Update buildKeyConfig to
explicitly handle the supported algorithms and add a default branch that fails
fast (for example by asserting or returning an error) when an unrecognized
KeyAlgorithm is passed, so callers of buildKeyConfig get an immediate signal
instead of an incomplete config.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 0bcf1c75-08b8-4f9c-b678-e0aeb6c80603
📒 Files selected for processing (4)
test/extended/include.gotest/extended/pki/helpers.gotest/extended/pki/pki_kube_apiserver.gotest/extended/pki/pki_service_ca.go
|
Scheduling required tests: |
1820a38 to
aa956c0
Compare
|
/test e2e-gcp-ovn-techpreview-serial-1of2 |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
test/extended/pki/pki_kube_apiserver.go (3)
196-384: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚖️ Poor tradeoffConsider extracting the shared regeneration/verify loop.
testKubeAPIServerCertificatesandtestMixedKubeAPIServerCertificatesdiffer only in how the expected algorithm/size/curve is sourced; the get-UID → delete → wait → verify body is duplicated. A small helper taking(cert operatorCertificate, expectedAlgorithm, expectedRSASize, expectedECDSACurve)would remove the duplication and keep timeout/verification logic in one place.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/pki/pki_kube_apiserver.go` around lines 196 - 384, The regeneration/verify flow is duplicated in testKubeAPIServerCertificates and testMixedKubeAPIServerCertificates; extract the shared get-UID → deleteCertificateSecret → waitForSecretRegeneration → getCertificateFromSecret loop into a helper. Make the helper take operatorCertificate plus the expected algorithm/size/curve inputs so both callers can reuse the same timeout and verification logic while only differing in how expectations are populated.
232-236: 📐 Maintainability & Code Quality | 🔵 Trivial | 💤 Low valueRedundant
if err != nilaround the assertion.
o.Expect(err).NotTo(o.HaveOccurred())already fails (and halts via panic) whenerr != nil, so the surrounding conditional adds nothing and only obscures intent. The same pattern repeats at Lines 336-339. Assert unconditionally.♻️ Suggested change
- oldSecret, err := kubeClient.CoreV1().Secrets(cert.Namespace).Get(ctx, cert.SecretName, metav1.GetOptions{}) - if err != nil { - o.Expect(err).NotTo(o.HaveOccurred(), "certificate %s/%s must exist before deletion", cert.Namespace, cert.SecretName) - } + oldSecret, err := kubeClient.CoreV1().Secrets(cert.Namespace).Get(ctx, cert.SecretName, metav1.GetOptions{}) + o.Expect(err).NotTo(o.HaveOccurred(), "certificate %s/%s must exist before deletion", cert.Namespace, cert.SecretName)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/pki/pki_kube_apiserver.go` around lines 232 - 236, Remove the redundant `if err != nil` guard around the `o.Expect(err).NotTo(o.HaveOccurred(), ...)` assertion in the certificate secret lookup, and let the expectation run unconditionally before using `oldSecret.UID`. Apply the same cleanup to the repeated pattern later in this test (the `kubeClient.CoreV1().Secrets(...).Get` plus `o.Expect` block), so the intent is clearer and the assertion alone handles failures.
42-48: 🩺 Stability & Availability | 🔵 TrivialAccept the Ginkgo context in these
Itbodies.These
[Slow]specs can wait up to 20 minutes, socontext.TODO()bypasses Ginkgo timeout and cancellation handling. Thread the spec context through instead.♻️ Suggested change
- g.It("should validate uniform PKI configurations and certificate regeneration [apigroup:config.openshift.io][Slow][Skipped:MicroShift]", func() { - testUniformPKIConfigurations(context.TODO(), kubeClient, configClient) - }) + g.It("should validate uniform PKI configurations and certificate regeneration [apigroup:config.openshift.io][Slow][Skipped:MicroShift]", func(ctx context.Context) { + testUniformPKIConfigurations(ctx, kubeClient, configClient) + }) - g.It("should validate mixed PKI configurations and certificate regeneration [apigroup:config.openshift.io][Slow][Skipped:MicroShift]", func() { - testMixedPKIConfigurations(context.TODO(), kubeClient, configClient) - }) + g.It("should validate mixed PKI configurations and certificate regeneration [apigroup:config.openshift.io][Slow][Skipped:MicroShift]", func(ctx context.Context) { + testMixedPKIConfigurations(ctx, kubeClient, configClient) + })🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/pki/pki_kube_apiserver.go` around lines 42 - 48, The two Ginkgo It bodies are using context.TODO(), which bypasses Ginkgo’s timeout and cancellation handling for these long-running [Slow] specs. Update the uniform and mixed PKI specs to accept the Ginkgo context from the It body and pass that spec context through to testUniformPKIConfigurations and testMixedPKIConfigurations instead of creating a TODO context, so cancellation and timeout behavior is preserved.Source: Path instructions
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/extended/pki/pki_kube_apiserver.go`:
- Around line 196-384: The regeneration/verify flow is duplicated in
testKubeAPIServerCertificates and testMixedKubeAPIServerCertificates; extract
the shared get-UID → deleteCertificateSecret → waitForSecretRegeneration →
getCertificateFromSecret loop into a helper. Make the helper take
operatorCertificate plus the expected algorithm/size/curve inputs so both
callers can reuse the same timeout and verification logic while only differing
in how expectations are populated.
- Around line 232-236: Remove the redundant `if err != nil` guard around the
`o.Expect(err).NotTo(o.HaveOccurred(), ...)` assertion in the certificate secret
lookup, and let the expectation run unconditionally before using
`oldSecret.UID`. Apply the same cleanup to the repeated pattern later in this
test (the `kubeClient.CoreV1().Secrets(...).Get` plus `o.Expect` block), so the
intent is clearer and the assertion alone handles failures.
- Around line 42-48: The two Ginkgo It bodies are using context.TODO(), which
bypasses Ginkgo’s timeout and cancellation handling for these long-running
[Slow] specs. Update the uniform and mixed PKI specs to accept the Ginkgo
context from the It body and pass that spec context through to
testUniformPKIConfigurations and testMixedPKIConfigurations instead of creating
a TODO context, so cancellation and timeout behavior is preserved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 57aa33bc-1f72-4c5b-94ea-09a80b5afb76
📒 Files selected for processing (4)
test/extended/include.gotest/extended/pki/helpers.gotest/extended/pki/pki_kube_apiserver.gotest/extended/pki/pki_service_ca.go
🚧 Files skipped from review as they are similar to previous changes (3)
- test/extended/include.go
- test/extended/pki/helpers.go
- test/extended/pki/pki_service_ca.go
|
Scheduling required tests: |
aa956c0 to
adbcf03
Compare
|
/test e2e-gcp-ovn-techpreview-serial-1of2 |
|
Scheduling required tests: |
adbcf03 to
ba29d50
Compare
|
/retest |
|
/test pull-ci-openshift-origin-main-e2e-gcp-ovn-techpreview-pkiconfig |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: gangwgr, kaleemsiddiqu The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/verified by CI |
|
@kaleemsiddiqu: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
0080764 to
66180a4
Compare
|
New changes are detected. LGTM label has been removed. |
|
/test pull-ci-openshift-origin-main-e2e-gcp-ovn-techpreview-pkiconfig |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/pki/helpers.go`:
- Around line 71-79: Both polling helpers are swallowing every client error and
only timing out, which hides real failures. Update waitForPKICRD and
waitForSecretRegeneration to use the callback error return from
wait.PollUntilContextTimeout so only the expected “not ready”/not found case
keeps waiting, while RBAC, transport, or apiserver errors are returned
immediately. Keep the logic localized in these helper functions and preserve
their existing retry behavior for the transient unavailable case.
In `@test/extended/pki/pki_kube_apiserver.go`:
- Around line 42-48: The PKI test cases are using context.TODO(), which drops
the Ginkgo spec context and its cancellation/deadline behavior. Update the two
g.It blocks in the PKI kube apiserver spec to pass the existing spec ctx through
to testUniformPKIConfigurations and testMixedPKIConfigurations instead of
creating a new TODO context.
- Around line 98-103: The fixed sleep before the kube-apiserver certificate
regeneration checks is too brittle and can race with operator reconciliation. In
both test loops in pki_kube_apiserver.go, replace the 10-second time.Sleep with
a wait that observes operator progress after the PKI config change before
calling testKubeAPIServerCertificates. Use the existing test flow around tc,
kubeClient, and the reconciliation helpers to block until the operator has
consumed the updated configuration, then continue with the certificate
assertions.
In `@test/extended/pki/pki_service_ca.go`:
- Around line 211-214: Only treat a NotFound error as the optional absence of
the serving cert in pki_service_ca.go; the current Get call in the serving-cert
lookup silently ignores all errors and leaves oldServingCertUID empty. Update
the logic around kubeClient.CoreV1().Secrets(...).Get so it explicitly checks
for apierrors.IsNotFound(err) before continuing, and for any other error return
it immediately instead of proceeding. Keep the existing oldServingCertUID
assignment path in the successful Get case.
- Around line 319-324: The deferred namespace cleanup in the PKI service CA
helper still depends on the cancellable spec context, so deletion can be skipped
when the test context is canceled. Update the cleanup in the helper that defers
kubeClient.CoreV1().Namespaces().Delete to use a detached context via
context.WithoutCancel(ctx) and wrap it with an explicit timeout before issuing
the delete, while keeping the existing NotFound handling and warning log.
- Around line 174-178: The service-ca reconciliation check is currently only
logged and then execution continues, which violates the “never ignore error
returns” rule. In the PKI test flow around WaitForOperatorProgressingFalse and
the subsequent “Operator has reconciled PKI configuration” log, change the
handling so a non-nil error fails the test immediately instead of warning and
proceeding. Ensure the surrounding logic only continues once the operator has
successfully reconciled.
- Around line 39-55: Pass the Ginkgo context through the PKI test flow instead
of using context.TODO(), so the spec lifecycle controls the CRD waits, config
updates, and certificate polling. Update the g.It callback in the PKI service CA
test to accept a context and forward it into testServiceCAPKIConfiguration, then
change testServiceCAPKIConfiguration to take that ctx parameter and use it for
the client polling and wait logic.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 440f93eb-4914-4bf0-97be-d1c56f25bb1e
📒 Files selected for processing (5)
pkg/testsuites/standard_suites.gotest/extended/include.gotest/extended/pki/helpers.gotest/extended/pki/pki_kube_apiserver.gotest/extended/pki/pki_service_ca.go
✅ Files skipped from review due to trivial changes (1)
- test/extended/include.go
66180a4 to
8719e20
Compare
…perator Tests for kube-apiserver and service-ca for pki config Signed-off-by: Kaleemullah Siddiqui <ksiddiqu@redhat.com>
8719e20 to
f0278a2
Compare
|
@kaleemsiddiqu: This pull request references CNTRLPLANE-3741 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Scheduling required tests: |
|
/test e2e-gcp-ovn-techpreview-pkiconfig |
|
/retest |
2 similar comments
|
/retest |
|
/retest |
|
@kaleemsiddiqu: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: f0278a2
New tests seen in this PR at sha: f0278a2
|
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: f0278a2
New tests seen in this PR at sha: f0278a2
|
Tests for kube-apiserver and service-ca added
Summary by CodeRabbit
openshift/pkiconfig) to validate configurable PKI certificate regeneration.kube-apiserverPKI regeneration by deleting targeted certificate Secrets and confirming regenerated certificate key properties for uniform and mixed configurations.service-caunder theConfigurablePKIfeature gate, including regeneration and validation of signing and serving certificates (RSA and ECDSA key/curve parameters, with mixed overrides).