AGENT-1449: Add IRI registry authentication support to MCO#5765
AGENT-1449: Add IRI registry authentication support to MCO#5765rwsu wants to merge 2 commits intoopenshift:mainfrom
Conversation
Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The IRI controller merges registry auth credentials into the global pull secret after bootstrap. This triggers the template controller to re-render template MCs (00-master, etc.) with the updated pull secret, producing a different rendered MC hash than what bootstrap created. The mismatch causes the MCD DaemonSet pod to fail during bootstrap: it reads the bootstrap-rendered MC name from the node annotation, but that MC no longer exists in-cluster (replaced by the re-rendered one). The MCD falls back to reading /etc/machine-config-daemon/currentconfig, which was never written because the firstboot MCD detected "no changes" and skipped it. Both master nodes go Degraded and never recover. Fix by merging IRI auth into the pull secret during bootstrap before template MC rendering, so both bootstrap and in-cluster produce identical rendered MC hashes. Extract the pull secret merge logic into a shared MergeIRIAuthIntoPullSecret function used by both the bootstrap path and the in-cluster IRI controller. Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
|
@rwsu: An error was encountered searching for bug AGENT-1449 on the Jira server at https://issues.redhat.com. No known errors were detected, please see the full error message for details. Full error message.
No response returned: Get "https://issues.redhat.com/rest/api/2/issue/AGENT-1449": GET https://issues.redhat.com/rest/api/2/issue/AGENT-1449 giving up after 5 attempt(s)
Please contact an administrator to resolve this issue, then request a bug refresh with DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis pull request introduces IRI (Internal Release Image) authentication secret support across the bootstrap and controller flows. It adds a new auth secret parameter to multiple function signatures, implements a merge function to inject IRI registry credentials into pull secrets, and updates template files to support optional htpasswd-based authentication for the internal registry. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: rwsu The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/controller/internalreleaseimage/internalreleaseimage_controller.go (1)
378-383: Consider whether merge failure should be fatal.If the IRI auth secret exists, it indicates auth is configured for the registry. When merging credentials into the pull secret fails, nodes may be unable to pull from the authenticated registry. The current warning-only approach could lead to silent authentication failures at runtime.
Consider whether this should return an error to trigger a retry, or at least emit an event for visibility.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go` around lines 378 - 383, Currently mergeIRIAuthIntoPullSecret failure is only logged with klog.Warningf which can hide critical pull-auth problems; change this to return the error from the surrounding reconcile path so the controller retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or wrapped err from the Reconcile method when ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally emit a Kubernetes event for visibility using the controller's event recorder (e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge failure and the secret name (iriAuthSecret) so operators see the issue in events.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/controller/internalreleaseimage/pullsecret.go`:
- Around line 21-31: The code currently constructs iriRegistryHost using
baseDomain without validation, which allows empty/whitespace domains (producing
"api-int.:22625"); before creating iriRegistryHost validate baseDomain (e.g.,
use strings.TrimSpace(baseDomain) and check for empty), and if invalid return an
error (e.g., fmt.Errorf("empty baseDomain") ) so the function fails fast; update
the section that defines iriRegistryHost to perform this check and only build
iriRegistryHost after validation.
In
`@pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml`:
- Line 40: The podman pull call unconditionally uses --authfile
/var/lib/kubelet/config.json which fails if that file is missing; change the
logic around the podman pull with registryImage so you first test for the
authfile's existence (e.g. [ -f /var/lib/kubelet/config.json ] and non-empty)
and only add --authfile /var/lib/kubelet/config.json when present, otherwise
perform an unauthenticated podman pull "${registryImage}" fallback; ensure the
conditional preserves current error handling and quoting of registryImage.
---
Nitpick comments:
In `@pkg/controller/internalreleaseimage/internalreleaseimage_controller.go`:
- Around line 378-383: Currently mergeIRIAuthIntoPullSecret failure is only
logged with klog.Warningf which can hide critical pull-auth problems; change
this to return the error from the surrounding reconcile path so the controller
retries (i.e., replace the klog.Warningf branch with a return fmt.Errorf(...) or
wrapped err from the Reconcile method when
ctrl.mergeIRIAuthIntoPullSecret(cconfig, iriAuthSecret) fails), and additionally
emit a Kubernetes event for visibility using the controller's event recorder
(e.g., ctrl.recorder.Eventf or ctrl.eventRecorder.Eventf) mentioning the merge
failure and the secret name (iriAuthSecret) so operators see the issue in
events.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 28340cd8-6caf-4ee6-8c24-9145c611edd0
📒 Files selected for processing (13)
pkg/controller/bootstrap/bootstrap.gopkg/controller/common/constants.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap.gopkg/controller/internalreleaseimage/internalreleaseimage_bootstrap_test.gopkg/controller/internalreleaseimage/internalreleaseimage_controller.gopkg/controller/internalreleaseimage/internalreleaseimage_controller_test.gopkg/controller/internalreleaseimage/internalreleaseimage_helpers_test.gopkg/controller/internalreleaseimage/internalreleaseimage_renderer.gopkg/controller/internalreleaseimage/pullsecret.gopkg/controller/internalreleaseimage/pullsecret_test.gopkg/controller/internalreleaseimage/templates/master/files/iri-registry-auth-htpasswd.yamlpkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yamlpkg/controller/internalreleaseimage/templates/master/units/iri-registry.service.yaml
| iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain) | ||
|
|
||
| var dockerConfig map[string]interface{} | ||
| if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil { | ||
| return nil, fmt.Errorf("could not parse pull secret: %w", err) | ||
| } | ||
|
|
||
| auths, ok := dockerConfig["auths"].(map[string]interface{}) | ||
| if !ok { | ||
| return nil, fmt.Errorf("pull secret missing 'auths' field") | ||
| } |
There was a problem hiding this comment.
Validate baseDomain before building IRI host
Line 21 builds the registry host even when baseDomain is empty, producing api-int.:22625 and silently injecting a broken auth entry. Fail fast on empty/whitespace domain.
Proposed fix
import (
"encoding/base64"
"encoding/json"
"fmt"
+ "strings"
)
@@
func MergeIRIAuthIntoPullSecret(pullSecretRaw []byte, password string, baseDomain string) ([]byte, error) {
if password == "" {
return pullSecretRaw, nil
}
+ if strings.TrimSpace(baseDomain) == "" {
+ return nil, fmt.Errorf("baseDomain is required to build IRI registry host")
+ }
iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain)As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain) | |
| var dockerConfig map[string]interface{} | |
| if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil { | |
| return nil, fmt.Errorf("could not parse pull secret: %w", err) | |
| } | |
| auths, ok := dockerConfig["auths"].(map[string]interface{}) | |
| if !ok { | |
| return nil, fmt.Errorf("pull secret missing 'auths' field") | |
| } | |
| if strings.TrimSpace(baseDomain) == "" { | |
| return nil, fmt.Errorf("baseDomain is required to build IRI registry host") | |
| } | |
| iriRegistryHost := fmt.Sprintf("api-int.%s:22625", baseDomain) | |
| var dockerConfig map[string]interface{} | |
| if err := json.Unmarshal(pullSecretRaw, &dockerConfig); err != nil { | |
| return nil, fmt.Errorf("could not parse pull secret: %w", err) | |
| } | |
| auths, ok := dockerConfig["auths"].(map[string]interface{}) | |
| if !ok { | |
| return nil, fmt.Errorf("pull secret missing 'auths' field") | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/controller/internalreleaseimage/pullsecret.go` around lines 21 - 31, The
code currently constructs iriRegistryHost using baseDomain without validation,
which allows empty/whitespace domains (producing "api-int.:22625"); before
creating iriRegistryHost validate baseDomain (e.g., use
strings.TrimSpace(baseDomain) and check for empty), and if invalid return an
error (e.g., fmt.Errorf("empty baseDomain") ) so the function fails fast; update
the section that defines iriRegistryHost to perform this check and only build
iriRegistryHost after validation.
| # As a fallback, let's try to fetch the registry image remotely | ||
| echo "Trying to pull ${registryImage} from remote registry..." | ||
| if podman pull '${registryImage}'; then | ||
| if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then |
There was a problem hiding this comment.
Guard authenticated pull on authfile existence
Line 40 always uses --authfile /var/lib/kubelet/config.json. If that file is missing when this service starts, the remote pull path fails even when an unauthenticated pull could succeed. Add a file-existence guard and fallback.
Proposed fix
- if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then
+ pull_args=()
+ if [[ -s /var/lib/kubelet/config.json ]]; then
+ pull_args+=(--authfile /var/lib/kubelet/config.json)
+ fi
+
+ if podman pull "${pull_args[@]}" "${registryImage}"; then
echo "Successfully pulled ${registryImage} from remote registry"
exit 0
fiAs per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if podman pull --authfile /var/lib/kubelet/config.json "${registryImage}"; then | |
| pull_args=() | |
| if [[ -s /var/lib/kubelet/config.json ]]; then | |
| pull_args+=(--authfile /var/lib/kubelet/config.json) | |
| fi | |
| if podman pull "${pull_args[@]}" "${registryImage}"; then | |
| echo "Successfully pulled ${registryImage} from remote registry" | |
| exit 0 | |
| fi |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@pkg/controller/internalreleaseimage/templates/master/files/usr-local-bin-load-registry-image-sh.yaml`
at line 40, The podman pull call unconditionally uses --authfile
/var/lib/kubelet/config.json which fails if that file is missing; change the
logic around the podman pull with registryImage so you first test for the
authfile's existence (e.g. [ -f /var/lib/kubelet/config.json ] and non-empty)
and only add --authfile /var/lib/kubelet/config.json when present, otherwise
perform an unauthenticated podman pull "${registryImage}" fallback; ensure the
conditional preserves current error handling and quoting of registryImage.
|
/cc @andfasano |
*- What I did
Add htpasswd-based authentication to the IRI registry. The installer generates credentials and provides them via a bootstrap secret. The MCO mounts the htpasswd file into the registry container and configures registry auth environment variables. The registry password is merged into the node pull secret so kubelet can authenticate when pulling the release image.
- How to verify it
- Description for the changelog
Add htpasswd authentication to the Internal Release Image (IRI) registry. The registry now requires Basic Auth credentials, with the password stored in the internal-release-image-registry-auth secret and automatically merged into the global pull secret so kubelet can authenticate when pulling images. The master MachineConfig is updated to configure the registry's htpasswd file and enable auth environment variables in the systemd unit.