Skip to content

AGENT-1449: Add IRI registry authentication support to MCO#5765

Open
rwsu wants to merge 2 commits intoopenshift:mainfrom
rwsu:AGENT-1449-auth
Open

AGENT-1449: Add IRI registry authentication support to MCO#5765
rwsu wants to merge 2 commits intoopenshift:mainfrom
rwsu:AGENT-1449-auth

Conversation

@rwsu
Copy link

@rwsu rwsu commented Mar 13, 2026

*- What I did

Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image.

- How to verify it

  • Verify the IRI registry rejects requests without providing credentials

- Description for the changelog

Add htpasswd authentication to the Internal Release Image (IRI) registry. The registry now requires Basic Auth credentials, with the password stored in the internal-release-image-registry-auth secret and automatically merged into the global pull secret so kubelet can authenticate when pulling images. The master MachineConfig is updated to configure the registry's htpasswd file and enable auth environment variables in the systemd unit.

rwsu and others added 2 commits March 13, 2026 11:40
Add htpasswd-based authentication to the IRI registry. The installer
generates credentials and provides them via a bootstrap secret. The MCO
mounts the htpasswd file into the registry container and configures
registry auth environment variables. The registry password is merged
into the node pull secret so kubelet can authenticate when pulling
the release image.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The IRI controller merges registry auth credentials into the global
pull secret after bootstrap. This triggers the template controller to
re-render template MCs (00-master, etc.) with the updated pull secret,
producing a different rendered MC hash than what bootstrap created.

The mismatch causes the MCD DaemonSet pod to fail during bootstrap:
it reads the bootstrap-rendered MC name from the node annotation, but
that MC no longer exists in-cluster (replaced by the re-rendered one).
The MCD falls back to reading /etc/machine-config-daemon/currentconfig,
which was never written because the firstboot MCD detected "no changes"
and skipped it. Both master nodes go Degraded and never recover.

Fix by merging IRI auth into the pull secret during bootstrap before
template MC rendering, so both bootstrap and in-cluster produce
identical rendered MC hashes.

Extract the pull secret merge logic into a shared MergeIRIAuthIntoPullSecret
function used by both the bootstrap path and the in-cluster IRI controller.

Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
@openshift-ci-robot
Copy link
Contributor

@rwsu: An error was encountered searching for bug AGENT-1449 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details.

Full error message. No response returned: Get "https://issues.redhat.com/rest/api/2/issue/AGENT-1449": GET https://issues.redhat.com/rest/api/2/issue/AGENT-1449 giving up after 5 attempt(s)

Please contact an administrator to resolve this issue, then request a bug refresh with /jira refresh.

Details

In response to this:

*- What I did

Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image.

- How to verify it

  • Verify the IRI registry rejects requests without providing credentials

- Description for the changelog

Add htpasswd authentication to the Internal Release Image (IRI) registry. The registry now requires Basic Auth credentials, with the password stored in the internal-release-image-registry-auth secret and automatically merged into the global pull secret so kubelet can authenticate when pulling images. The master MachineConfig is updated to configure the registry's htpasswd file and enable auth environment variables in the systemd unit.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link

coderabbitai bot commented Mar 13, 2026

Walkthrough

This pull request introduces IRI (Internal Release Image) authentication secret support across the bootstrap and controller flows. It adds a new auth secret parameter to multiple function signatures, implements a merge function to inject IRI registry credentials into pull secrets, and updates template files to support optional htpasswd-based authentication for the internal registry.

Changes

Cohort / File(s) Summary
Core Bootstrap Functions
pkg/controller/bootstrap/bootstrap.go, pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go
Updated function signatures to accept and pass the new iriAuthSecret parameter; added conditional logic to merge IRI auth credentials into pull secret when feature gate is enabled.
Constants
pkg/controller/common/constants.go
Added new exported constant InternalReleaseImageAuthSecretName with value "internal-release-image-registry-auth".
Pull Secret Merging
pkg/controller/internalreleaseimage/pullsecret.go
Implemented new MergeIRIAuthIntoPullSecret function to inject IRI registry authentication into dockerconfigjson pull secrets with htpasswd credentials.
Renderer
pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go
Extended NewRendererByRole signature to accept iriAuthSecret parameter and propagated htpasswd data into render context for template usage.
Controller Logic
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
Added kubeClient field, retrieves optional IRI auth secret during sync, treats auth secret as related for reconciliation updates, and merges credentials into the pull secret after role processing.
Bootstrap Tests
pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go
Updated existing test calls to pass nil auth secret; added new test case to verify bootstrap with auth secret.
Controller Tests
pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
Added import for encoding/json; introduced test hook for pull secret verification; added test cases for auth secret handling and pull secret merging.
Test Helpers
pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
Added factory functions pullSecret() and iriAuthSecret() for test object construction; introduced auth validation helper; extended ControllerConfigBuilder with DNS domain support; updated master config expectations to include auth-related ignition files.
Pull Secret Tests
pkg/controller/internalreleaseimage/pullsecret_test.go
Added comprehensive test coverage for MergeIRIAuthIntoPullSecret, validating auth entry addition, preservation of existing entries, stale entry updates, and error handling for invalid JSON and missing auths field.
Template Files
pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml, templates/master/files/usr-local-bin-load-registry-image-sh.yaml, templates/master/units/iri-registry.service.yaml
New htpasswd template file and updates to registry image loading script to use authenticated pulls; extended service unit with conditional auth volume mounting and environment variables when htpasswd is configured.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from bfournie and umohnani8 March 13, 2026 21:45
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 13, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rwsu
Once this PR has been reviewed and has the lgtm label, please assign yuqi-zhang for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)

378-383: Consider whether merge failure should be fatal.

If the IRI auth secret exists, it indicates auth is configured for the registry. When merging credentials into the pull secret fails, nodes may be unable to pull from the authenticated registry. The current warning-only approach could lead to silent authentication failures at runtime.

Consider whether this should return an error to trigger a retry, or at least emit an event for visibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`
around lines 378 - 383, Currently mergeIRIAuthIntoPullSecret failure is only
logged with klog.Warningf which can hide critical pull-auth problems; change
this to return the error from the surrounding reconcile path so the controller
retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or
wrapped err from the Reconcile method when
ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally
emit a Kubernetes event for visibility using the controller's event recorder
(e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge
failure and the secret name (iriAuthSecret) so operators see the issue in
events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/controller/internalreleaseimage/pullsecret.go`:
- Around line 21-31: The code currently constructs iriRegistryHost using
baseDomain without validation, which allows empty/whitespace domains (producing
"api-int.:22625"); before creating iriRegistryHost validate baseDomain (e.g.,
use strings.TrimSpace(baseDomain) and check for empty), and if invalid return an
error (e.g., fmt.Errorf("empty baseDomain") ) so the function fails fast; update
the section that defines iriRegistryHost to perform this check and only build
iriRegistryHost after validation.

In
`@pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml`:
- Line 40: The podman pull call unconditionally uses --authfile
/var/lib/kubelet/config.json which fails if that file is missing; change the
logic around the podman pull with registryImage so you first test for the
authfile's existence (e.g. [ -f /var/lib/kubelet/config.json ] and non-empty)
and only add --authfile /var/lib/kubelet/config.json when present, otherwise
perform an unauthenticated podman pull "${registryImage}" fallback; ensure the
conditional preserves current error handling and quoting of registryImage.

---

Nitpick comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 378-383: Currently mergeIRIAuthIntoPullSecret failure is only
logged with klog.Warningf which can hide critical pull-auth problems; change
this to return the error from the surrounding reconcile path so the controller
retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or
wrapped err from the Reconcile method when
ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally
emit a Kubernetes event for visibility using the controller's event recorder
(e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge
failure and the secret name (iriAuthSecret) so operators see the issue in
events.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 28340cd8-6caf-4ee6-8c24-9145c611edd0

📥 Commits

Reviewing files that changed from the base of the PR and between 5f0d9d7 and a5a65dc.

📒 Files selected for processing (13)
  • pkg/controller/bootstrap/bootstrap.go
  • pkg/controller/common/constants.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_controller_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.go
  • pkg/controller/internalreleaseimage/internalreleaseimage_renderer.go
  • pkg/controller/internalreleaseimage/pullsecret.go
  • pkg/controller/internalreleaseimage/pullsecret_test.go
  • pkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yaml
  • pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml
  • pkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml

Comment on lines +21 to +31
iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain)

var dockerConfig map[string]interface{}
if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil {
return nil, fmt.Errorf("could not parse pull secret: %w", err)
}

auths, ok := dockerConfig["auths"].(map[string]interface{})
if !ok {
return nil, fmt.Errorf("pull secret missing 'auths' field")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Validate baseDomain before building IRI host

Line 21 builds the registry host even when baseDomain is empty, producing api-int.:22625 and silently injecting a broken auth entry. Fail fast on empty/whitespace domain.

Proposed fix
 import (
 	"encoding/base64"
 	"encoding/json"
 	"fmt"
+	"strings"
 )
@@
 func MergeIRIAuthIntoPullSecret(pullSecretRaw []byte, password string, baseDomain string) ([]byte, error) {
 	if password == "" {
 		return pullSecretRaw, nil
 	}
+	if strings.TrimSpace(baseDomain) == "" {
+		return nil, fmt.Errorf("baseDomain is required to build IRI registry host")
+	}
 
 	iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain)

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain)
var dockerConfig map[string]interface{}
if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil {
return nil, fmt.Errorf("could not parse pull secret: %w", err)
}
auths, ok := dockerConfig["auths"].(map[string]interface{})
if !ok {
return nil, fmt.Errorf("pull secret missing 'auths' field")
}
if strings.TrimSpace(baseDomain) == "" {
return nil, fmt.Errorf("baseDomain is required to build IRI registry host")
}
iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain)
var dockerConfig map[string]interface{}
if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil {
return nil, fmt.Errorf("could not parse pull secret: %w", err)
}
auths, ok := dockerConfig["auths"].(map[string]interface{})
if !ok {
return nil, fmt.Errorf("pull secret missing 'auths' field")
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/controller/internalreleaseimage/pullsecret.go` around lines 21 - 31, The
code currently constructs iriRegistryHost using baseDomain without validation,
which allows empty/whitespace domains (producing "api-int.:22625"); before
creating iriRegistryHost validate baseDomain (e.g., use
strings.TrimSpace(baseDomain) and check for empty), and if invalid return an
error (e.g., fmt.Errorf("empty baseDomain") ) so the function fails fast; update
the section that defines iriRegistryHost to perform this check and only build
iriRegistryHost after validation.

# As a fallback, let's try to fetch the registry image remotely
echo "Trying to pull ${registryImage} from remote registry..."
if podman pull '${registryImage}'; then
if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard authenticated pull on authfile existence

Line 40 always uses --authfile /var/lib/kubelet/config.json. If that file is missing when this service starts, the remote pull path fails even when an unauthenticated pull could succeed. Add a file-existence guard and fallback.

Proposed fix
-    if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then
+    pull_args=()
+    if [[ -s /var/lib/kubelet/config.json ]]; then
+        pull_args+=(--authfile /var/lib/kubelet/config.json)
+    fi
+
+    if podman pull "${pull_args[@]}" "${registryImage}"; then
         echo "Successfully pulled ${registryImage} from remote registry"
         exit 0
     fi

As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then
pull_args=()
if [[ -s /var/lib/kubelet/config.json ]]; then
pull_args+=(--authfile /var/lib/kubelet/config.json)
fi
if podman pull "${pull_args[@]}" "${registryImage}"; then
echo "Successfully pulled ${registryImage} from remote registry"
exit 0
fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml`
at line 40, The podman pull call unconditionally uses --authfile
/var/lib/kubelet/config.json which fails if that file is missing; change the
logic around the podman pull with registryImage so you first test for the
authfile's existence (e.g. [ -f /var/lib/kubelet/config.json ] and non-empty)
and only add --authfile /var/lib/kubelet/config.json when present, otherwise
perform an unauthenticated podman pull "${registryImage}" fallback; ensure the
conditional preserves current error handling and quoting of registryImage.

@rwsu
Copy link
Author

rwsu commented Mar 13, 2026

/cc @andfasano

@openshift-ci openshift-ci bot requested a review from andfasano March 13, 2026 22:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants