[WIP] AGENT-1449: Add IRI registry credential rotation support #5766
rwsu wants to merge 4 commits into openshift:main
Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The IRI controller merges registry auth credentials into the global pull secret after bootstrap. This triggers the template controller to re-render template MCs (00-master, etc.) with the updated pull secret, producing a different rendered MC hash than what bootstrap created.

The mismatch causes the MCD DaemonSet pod to fail during bootstrap: it reads the bootstrap-rendered MC name from the node annotation, but that MC no longer exists in-cluster (replaced by the re-rendered one). The MCD falls back to reading /etc/machine-config-daemon/currentconfig, which was never written because the firstboot MCD detected "no changes" and skipped it. Both master nodes go Degraded and never recover.

Fix by merging IRI auth into the pull secret during bootstrap before template MC rendering, so both bootstrap and in-cluster produce identical rendered MC hashes. Extract the pull secret merge logic into a shared MergeIRIAuthIntoPullSecret function used by both the bootstrap path and the in-cluster IRI controller.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
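For readers unfamiliar with the pull secret layout, here is a minimal sketch of what such a merge could look like. It is not the PR's MergeIRIAuthIntoPullSecret: the name `mergeRegistryAuth`, its signature, and the example host are assumptions made for illustration. It relies only on the standard `{"auths": {host: {"auth": base64(user:password)}}}` shape, and returns the input bytes untouched when nothing changes so that repeated merges do not alter the result.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// authEntry is one registry entry in a dockerconfigjson-style pull secret.
type authEntry struct {
	Auth string `json:"auth"`
}

// mergeRegistryAuth is a hypothetical helper (not the PR's function). It adds or
// replaces the entry for host and reports whether the pull secret changed.
func mergeRegistryAuth(pullSecret []byte, host, user, password string) ([]byte, bool, error) {
	doc := map[string]json.RawMessage{}
	if err := json.Unmarshal(pullSecret, &doc); err != nil {
		return nil, false, fmt.Errorf("parsing pull secret: %w", err)
	}
	if doc == nil { // input was the JSON literal null
		doc = map[string]json.RawMessage{}
	}

	auths := map[string]json.RawMessage{}
	if raw, ok := doc["auths"]; ok {
		if err := json.Unmarshal(raw, &auths); err != nil {
			return nil, false, fmt.Errorf("parsing auths: %w", err)
		}
		if auths == nil { // "auths": null
			auths = map[string]json.RawMessage{}
		}
	}

	token := base64.StdEncoding.EncodeToString([]byte(user + ":" + password))
	if raw, ok := auths[host]; ok {
		var existing authEntry
		if err := json.Unmarshal(raw, &existing); err == nil && existing.Auth == token {
			// Already merged: return the input bytes so the caller sees no change.
			return pullSecret, false, nil
		}
	}

	entry, err := json.Marshal(authEntry{Auth: token})
	if err != nil {
		return nil, false, err
	}
	auths[host] = entry

	rawAuths, err := json.Marshal(auths) // map keys are emitted in sorted order
	if err != nil {
		return nil, false, err
	}
	doc["auths"] = rawAuths

	out, err := json.Marshal(doc)
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}

func main() {
	in := []byte(`{"auths":{"quay.io":{"auth":"ZXhhbXBsZQ=="}}}`)
	out, changed, err := mergeRegistryAuth(in, "registry.example:5000", "user1", "s3cret")
	if err != nil {
		panic(err)
	}
	fmt.Println(changed, string(out))
}
```

Because both callers would go through the same function and the JSON encoder sorts map keys, the merged bytes come out the same for the same inputs, which is the property the rendered MC hash depends on.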
Implement safe credential rotation for the IRI registry using a desired-vs-current pattern with generation-numbered usernames. The auth secret holds the desired password; the pull secret (read from rendered MachineConfig) holds the deployed password. When they differ, a three-phase rotation is performed:

1. Deploy dual htpasswd (old + new credentials with different usernames)
2. Update pull secret after all MCPs finish rolling out
3. Clean up dual htpasswd to single entry after new pull secret is deployed

This avoids authentication deadlocks during rolling MachineConfig updates because the pull secret always contains the old credentials, which are present in every version of the htpasswd. Mid-rotation password changes are handled by verifying htpasswd hashes with bcrypt.CompareHashAndPassword and regenerating if they don't match.

Key changes:
- Add MachineConfigPool lister/informer to IRI controller
- Add reconcileAuthCredentials with three-case rotation logic
- Add getDeployedIRICredentials (reads from rendered MC, not API)
- Add areAllPoolsUpdated (checks all pools including workers)
- Add HtpasswdHasValidEntry, GenerateHtpasswdEntry, GenerateDualHtpasswd, NextIRIUsername, ExtractIRICredentialsFromPullSecret helpers
- Vendor golang.org/x/crypto/bcrypt for htpasswd hash generation
- Add credential rotation design doc

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
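The desired-vs-current decision can be sketched as a pure function. This is an illustration under assumptions, not the PR's reconcileAuthCredentials: the `observed` struct, the `action` names, and the choice to also wait for all pools before cleanup are hypothetical. `HtpasswdAcceptsDesired` stands in for the bcrypt check against the desired password.

```go
package main

import "fmt"

// action names the step a reconcile pass would take (hypothetical labels).
type action string

const (
	actionNone             action = "none"
	actionDeployDual       action = "deploy dual htpasswd"
	actionWaitForPools     action = "wait for pools"
	actionUpdatePullSecret action = "update pull secret"
	actionCleanup          action = "clean up to single htpasswd entry"
)

// observed is an assumed snapshot of what the controller can see.
type observed struct {
	DesiredPassword        string // from the auth secret
	DeployedPassword       string // from the pull secret in the rendered MachineConfig
	HtpasswdIsDual         bool   // htpasswd currently holds two entries
	HtpasswdAcceptsDesired bool   // an entry verifies against the desired password
	AllPoolsUpdated        bool   // every MachineConfigPool finished rolling out
}

// nextAction picks one step; each step is safe to repeat on the next pass.
func nextAction(o observed) action {
	if o.DesiredPassword == o.DeployedPassword {
		// Nothing to rotate, or the new pull secret is already deployed.
		if !o.HtpasswdIsDual {
			return actionNone
		}
		if !o.AllPoolsUpdated {
			return actionWaitForPools
		}
		return actionCleanup // phase 3
	}
	// A rotation is pending. Regenerate the dual file if it is missing or if the
	// desired password changed mid-rotation and no longer matches the hash.
	if !o.HtpasswdIsDual || !o.HtpasswdAcceptsDesired {
		return actionDeployDual // phase 1
	}
	if !o.AllPoolsUpdated {
		return actionWaitForPools
	}
	return actionUpdatePullSecret // phase 2
}

func main() {
	fmt.Println(nextAction(observed{DesiredPassword: "new", DeployedPassword: "old"}))
}
```

Keeping the decision free of API calls makes each of the three cases easy to unit test in isolation.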
…ial rotation

Add three new e2e tests:
- TestIRIAuth_UnauthenticatedRequestReturns401: verifies registry rejects unauthenticated requests with 401 when auth is enabled
- TestIRIAuth_AuthenticatedRequestSucceeds: verifies registry accepts requests with valid Basic Auth credentials
- TestIRIAuth_CredentialRotation: end-to-end test of the three-phase credential rotation (dual htpasswd, pull secret update, cleanup)

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
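The first two checks boil down to a Basic Auth probe. The sketch below shows that shape against an in-process stand-in handler rather than the real registry or the PR's e2e helpers; `newFakeRegistry`, `statusFor`, and the sample credentials are hypothetical. The `/v2/` path is the registry API base endpoint.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// newFakeRegistry is a stand-in for an htpasswd-protected registry endpoint.
func newFakeRegistry(user, pass string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		u, p, ok := r.BasicAuth()
		userOK := subtle.ConstantTimeCompare([]byte(u), []byte(user)) == 1
		passOK := subtle.ConstantTimeCompare([]byte(p), []byte(pass)) == 1
		if !ok || !userOK || !passOK {
			w.Header().Set("WWW-Authenticate", `Basic realm="registry"`)
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		w.WriteHeader(http.StatusOK)
	})
}

// statusFor sends GET /v2/ to the handler, with Basic Auth only when user is set,
// and returns the HTTP status code.
func statusFor(h http.Handler, user, pass string) int {
	req := httptest.NewRequest(http.MethodGet, "/v2/", nil)
	if user != "" {
		req.SetBasicAuth(user, pass)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	reg := newFakeRegistry("user1", "s3cret")
	fmt.Println(statusFor(reg, "", ""))            // unauthenticated request
	fmt.Println(statusFor(reg, "user1", "s3cret")) // valid credentials
}
```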
@rwsu: An error was encountered searching for bug AGENT-1449 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details.

Full error message:
No response returned: Get "https://issues.redhat.com/rest/api/2/issue/AGENT-1449": GET https://issues.redhat.com/rest/api/2/issue/AGENT-1449 giving up after 5 attempt(s)
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: rwsu
/cc @andfasano
- What I did
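Implement safe credential rotation for the IRI registry using a desired-vs-current pattern with generation-numbered usernames. The auth secret holds the desired password; the pull secret (read from rendered MachineConfig) holds the deployed password. When they differ, a three-phase rotation is performed:

1. Deploy dual htpasswd (old + new credentials with different usernames)
2. Update pull secret after all MCPs finish rolling out
3. Clean up dual htpasswd to single entry after new pull secret is deployed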
This avoids authentication deadlocks during rolling MachineConfig updates because the pull secret always contains the old credentials, which are present in every version of the htpasswd. Mid-rotation password changes are handled by verifying htpasswd hashes with bcrypt.CompareHashAndPassword and regenerating if they don't match.
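To make the deadlock argument concrete, here is a small model of the rollout, offered as a sketch rather than anything from the PR. Each registry replica is reduced to the set of passwords its htpasswd accepts, and a phase is safe when every client credential in use is accepted by every replica, since the load balancer may pick any of them. The scenario names and the "naive" comparison case are assumptions for illustration.

```go
package main

import "fmt"

// htpasswd models the set of passwords one registry replica accepts.
type htpasswd map[string]bool

// scenario is a mid-rollout snapshot: replica htpasswd versions and the
// credentials clients may present from their pull secrets.
type scenario struct {
	name     string
	replicas []htpasswd
	clients  []string
}

// alwaysAuthenticates reports whether every client credential is accepted by
// every replica, i.e. a request succeeds whichever replica serves it.
func alwaysAuthenticates(replicas []htpasswd, clients []string) bool {
	for _, r := range replicas {
		for _, c := range clients {
			if !r[c] {
				return false
			}
		}
	}
	return true
}

// rotationScenarios lists the mixed states each phase can pass through.
func rotationScenarios() []scenario {
	singleOld := htpasswd{"old": true}
	dual := htpasswd{"old": true, "new": true}
	singleNew := htpasswd{"new": true}
	return []scenario{
		// Phase 1: htpasswd rolls out, pull secrets still hold the old password.
		{"phase 1", []htpasswd{singleOld, dual}, []string{"old"}},
		// Phase 2: every replica is dual, pull secrets roll from old to new.
		{"phase 2", []htpasswd{dual, dual}, []string{"old", "new"}},
		// Phase 3: pull secrets are all new, htpasswd rolls back to one entry.
		{"phase 3", []htpasswd{dual, singleNew}, []string{"new"}},
		// Naive: htpasswd and pull secret swapped together in one rollout.
		{"naive single-step swap", []htpasswd{singleOld, singleNew}, []string{"old", "new"}},
	}
}

func main() {
	for _, s := range rotationScenarios() {
		fmt.Printf("%s: safe=%v\n", s.name, alwaysAuthenticates(s.replicas, s.clients))
	}
}
```

In this model the three phased states stay safe, while the single-step swap leaves a node holding the old password able to reach a replica that only accepts the new one.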
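Key changes:
- Add MachineConfigPool lister/informer to IRI controller
- Add reconcileAuthCredentials with three-case rotation logic
- Add getDeployedIRICredentials (reads from rendered MC, not API)
- Add areAllPoolsUpdated (checks all pools including workers)
- Add HtpasswdHasValidEntry, GenerateHtpasswdEntry, GenerateDualHtpasswd, NextIRIUsername, ExtractIRICredentialsFromPullSecret helpers
- Vendor golang.org/x/crypto/bcrypt for htpasswd hash generation
- Add credential rotation design doc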
- How to verify it
Update the password field in the auth secret to trigger the rotation.
Verify that /etc/iri-registry/auth/htpasswd has been updated.
Verify that iri-registry accepts both the old and new credentials during rollout.
Verify that the global pull-secret contains the new credentials after the rollout is complete.
- Description for the changelog
Add credential rotation support for the IRI registry. When the auth secret's password field is updated, the controller performs a three-phase rotation: (1) deploys a dual htpasswd with both old and new credentials so all nodes accept both passwords during rollout, (2) updates the global pull secret with the new credentials after all MachineConfigPools are fully updated, and (3) cleans up the dual htpasswd to a single entry once the new credentials are deployed everywhere. This avoids authentication deadlocks caused by api-int load-balancing requests across master nodes that may be at different stages of the rollout.
Also adds e2e tests for registry authentication (401 on unauthenticated requests, 200 with valid credentials) and an end-to-end credential rotation test that exercises all three phases.
Depends on #5765.