MCO-1972: Removes OSImageURLConfig from the build controller #5424

Open
cheesesashimi wants to merge 6 commits into openshift:main from cheesesashimi:zzlotnik/osimageurl-redux

@cheesesashimi
Member

@cheesesashimi cheesesashimi commented Nov 18, 2025

- What I did

This decouples the Build Controller from OSImageURLConfig and makes the OSImageURL and BaseOSExtensionsContainerImage fields on the rendered MachineConfig the source of truth for the base OS and extensions images used for Image-Mode OpenShift. The idea is that if a different OS image is selected on a per-pool basis (e.g., one pool is RHEL9 and another is RHEL10 for dual-streams), then the Build Controller should use the rendered MachineConfig of each pool as the source of truth for that pool.

However, if one sets the OSImageStream name on the MachineConfigPool and also sets OSImageURL on a MachineConfig, the MCO should degrade, because it would otherwise override the value provided by the cluster admin. This PR also includes an E2E test which verifies that this is the case. This new E2E test will not run automatically until openshift/release#75329 is merged.

- How to verify it

The best way to verify this is to create a cluster and then create a MachineConfig which overrides the OSImageURL value. The Build Controller should build a new OS image based upon the new OSImageURL value.

- Description for the changelog
MachineConfigs should be the source of truth for the Build Controller

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Nov 18, 2025
@openshift-ci
Contributor

openshift-ci bot commented Nov 18, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@cheesesashimi
Member Author

/test unit verify e2e-gcp-op-ocl

BaseOSContainerImage: m.MachineConfig.Spec.OSImageURL,
BaseOSExtensionsContainerImage: m.MachineConfig.Spec.BaseOSExtensionsContainerImage,
// This value is purposely left empty because the ConfigMap does not actually
// populate this value. However, we want the hashing to be stable.
Member Author

@cheesesashimi cheesesashimi Nov 18, 2025

Note to reviewer: This might be a moot point, because the hashes will change when someone upgrades from one OCP release to another. However, that means that old images may get rebuilt in the process, which is undesirable.
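
The hashing concern in this note can be illustrated with a small Go sketch. It is not the Build Controller's hashing code, which the excerpt does not show: `buildInputs`, its `LegacyField` member, and `hashInputs` are hypothetical names, and hashing the JSON encoding with SHA-256 is an assumption made for the example. The point is only that a field which is always left empty keeps the digest identical between runs, while populating it would change the digest and so trigger a rebuild.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// buildInputs is a hypothetical stand-in for the struct being hashed.
// The real name of the third field is not shown in the PR excerpt.
type buildInputs struct {
	BaseOSContainerImage           string `json:"baseOSContainerImage"`
	BaseOSExtensionsContainerImage string `json:"baseOSExtensionsContainerImage"`
	LegacyField                    string `json:"legacyField"` // purposely left empty
}

// hashInputs returns the hex SHA-256 digest of the JSON encoding of in.
// encoding/json emits struct fields in declaration order, so equal inputs
// always produce equal digests.
func hashInputs(in buildInputs) (string, error) {
	b, err := json.Marshal(in)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	in := buildInputs{
		BaseOSContainerImage:           "quay.io/example/os:base",
		BaseOSExtensionsContainerImage: "quay.io/example/os:extensions",
	}
	digest, err := hashInputs(in)
	if err != nil {
		panic(err)
	}
	fmt.Println(digest)
}
```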

@cheesesashimi cheesesashimi changed the title from "Removes OSImageURLConfig from the build controller" to "MCO-1972: Removes OSImageURLConfig from the build controller" Nov 18, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Nov 18, 2025
@openshift-ci-robot
Contributor

openshift-ci-robot commented Nov 18, 2025

@cheesesashimi: This pull request references MCO-1972 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) Nov 18, 2025
@cheesesashimi
Member Author

/test e2e-gcp-op-ocl

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from 49a39f6 to 3ffaec0 February 5, 2026 15:25
@cheesesashimi
Member Author

/test unit verify e2e-gcp-op-ocl

@cheesesashimi
Member Author

/test e2e-gcp-op-ocl

@cheesesashimi cheesesashimi marked this pull request as ready for review February 6, 2026 14:34
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Feb 6, 2026
@cheesesashimi
Member Author

/test unit

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from adb5a66 to b66544e February 10, 2026 15:41
@pablintino
Contributor

/retest-required
/lgtm

@openshift-ci openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Feb 11, 2026
@ptalgulk01

Pre-merge verified:

Environment Setup:
OCP Version: 4.22.0-0-2026-02-18-060304-test-ci-ln-ti6cfjk-latest
Platform: AWS

Pre-requisites

  • Create a Containerfile and push the resulting image to the quay.io/mcoqe/layering repo
  • Get the sha256 digest of the image

Steps

  • Apply the MC with OSImageURL
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker 
  name: os-layer-custom
spec:
  osImageURL: "quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559"
EOF
machineconfig.machineconfiguration.openshift.io/os-layer-custom created
  • Wait for the MCP update to complete
  • Check that the image is applied on the node
$ oc debug node/ip-10-0-21-13.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-21-13us-east-2computeinternal-debug-6mczg ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559
                   Digest: sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559
                  Version: 9.8.20260214-0 (2026-02-18T13:06:30Z)

Removing debug pod ...

$ oc get mc rendered-worker-13ea0dd3cdbf18ac59647ba5de9d4e8a  -o jsonpath='{.spec.osImageURL}'
quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559

@cheesesashimi
Member Author

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Feb 24, 2026
@cheesesashimi
Member Author

/test verify

Ensures that a cluster admin may either override the OSImageURL field or
set the desired OSImageStream name, but not both. This ensures that
either the cluster admin or the MCO manages the OS image and
prevents the MCO from overriding this setting.

@cheesesashimi cheesesashimi force-pushed the zzlotnik/osimageurl-redux branch from ce83ad7 to 6303082 March 2, 2026 17:25
@cheesesashimi
Member Author

/test e2e-gcp-op-techpreview

@pablintino
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label (indicates that a PR is ready to be merged) Mar 3, 2026
@cheesesashimi
Member Author

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold label (indicates that a PR should not merge because someone has issued a /hold command) Mar 3, 2026
@ptalgulk01

ptalgulk01 commented Mar 4, 2026

Case 1: Verify that the MCP is degraded when both the osstream and an osImageURL MC are applied
Steps:

  • Create a custom pool
  • Patch the osstream on the pool
$ oc patch mcp custom2 --type merge -p '{"spec":{"osImageStream":{"name":"rhel-10"}}}'
machineconfigpool.machineconfiguration.openshift.io/custom2 patched
$ oc debug node/ip-10-0-66-149.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-66-149us-east-2computeinternal-debug-vn5s6 ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:registry.build07.ci.openshift.org/ci-ln-dwy4i6t/stable@sha256:4a1798a3b92a794a69d56eaf78c1521a1c4d2e52fd05057072780ec19ccabd45
                   Digest: sha256:4a1798a3b92a794a69d56eaf78c1521a1c4d2e52fd05057072780ec19ccabd45
                  Version: 10.1.20260126-0 (2026-01-27T06:43:47Z)

Removing debug pod ...
  • Apply the osImageURL MC to the custom pool
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
 labels:
   machineconfiguration.openshift.io/role: custom2
 name: os-layer-custom2
spec:
 osImageURL: "quay.io/mcoqe/layering@sha256:11b1ab30f33dd92b43c92be37f35033c0ddc0955fce144a3e2d7a81d5790a559"
EOF
  • The MCP becomes degraded with the error below
oc get mcp custom2 -w
NAME      CONFIG                                              UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
custom2   rendered-custom2-63007943eea37dec9bb259de89961c7d   True      False      False      1              1                   1                     0                      86m
custom2   rendered-custom2-63007943eea37dec9bb259de89961c7d   True      False      True       1              1                   1                     0                      86m
^C

  - lastTransitionTime: "2026-03-04T08:51:09Z"
    message: 'Failed to render configuration for pool custom2: could not generate
      rendered MachineConfig: cannot override MachineConfig osImageURL and set MachineConfigPool
      spec.osImageStream.name simultaneously'
    reason: ""
    status: "True"
    type: RenderDegraded

Case 2: Verify the off-cluster and on-cluster layering case

  • Apply the MC below
MC template
oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: custom3 
  name: os-layer-custom3
spec:
  osImageURL: "quay.io/mcoqe/layering@sha256:3760b74a58b882494c5e99d7191647822a2dfea43983cb5882887325b99327a7"
EOF
machineconfig.machineconfiguration.openshift.io/os-layer-custom3 created
  • Check that the changes are applied
$  oc debug node/ip-10-0-2-202.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-2-202us-east-2computeinternal-debug-928f4 ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:quay.io/mcoqe/layering@sha256:3760b74a58b882494c5e99d7191647822a2dfea43983cb5882887325b99327a7
                   Digest: sha256:3760b74a58b882494c5e99d7191647822a2dfea43983cb5882887325b99327a7
                  Version: 10.1.20260126-0 (2026-03-04T09:02:55Z)

Removing debug pod ...

oc debug node/ip-10-0-2-202.us-east-2.compute.internal -- chroot /host ls /etc/test-offlayering.test
Starting pod/ip-10-0-2-202us-east-2computeinternal-debug-w7csh ...
To use host binaries, run `chroot /host`
/etc/test-offlayering.test

Removing debug pod ...
  • Enable OCL
OCL template
$ sh ../ocl/create-long-lived-token.sh pull-copy
$ oc create -f - << EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineOSConfig
metadata:
  name: custom3
spec:
  machineConfigPool:
    name: custom3
  imageBuilder:
    imageBuilderType: Job
  baseImagePullSecret:
    name: pull-copy
  renderedImagePushSecret:
    name: $(oc get -n openshift-machine-config-operator sa builder -ojsonpath='{.secrets[0].name}')
  renderedImagePushSpec: "image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image:latest"
  containerFile:
  - content: |-
      RUN  echo "This is from ON-CLUSTER layering" > /etc/test-onlayering.test
EOF
  • Check that the off-cluster and on-cluster layering files are present
sh-5.2# ls /etc | grep -i layer
test-offlayering.test
test-onlayering.test
  • Delete the MC and check that a MOSB is triggered and the off-cluster layering file is no longer present
$ oc get machineosbuilds
NAME                                       PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
custom3-5715646d7e43ffab6b29cd92225f7855   False      True       False       False         False    14s
custom3-9d3bae242432a426740e78c5dbef87f0   False      False      True        False         False    15m

oc debug node/ip-10-0-2-202.us-east-2.compute.internal -- chroot /host ls /etc/ | grep -i layering
Starting pod/ip-10-0-2-202us-east-2computeinternal-debug-jb954 ...
To use host binaries, run `chroot /host`

Removing debug pod ...
test-onlayering.test

@ptalgulk01

ptalgulk01 commented Mar 5, 2026

Case 3: Patch the osstream for the OCL-enabled pool and check that a build is triggered.

  • Patch the pool to use the rhel-10 osstream
oc get mcp custom3 -ojsonpath='{.spec.osImageStream.name}'
rhel-10
  • Check that OCL triggers another build
oc get machineosbuilds
NAME                                       PREPARED   BUILDING   SUCCEEDED   INTERRUPTED   FAILED   AGE
custom3-13289be38498b1112a462829db03b493   False      True       False       False         False    103s
custom3-5715646d7e43ffab6b29cd92225f7855   False      False      True        False         False    16h
custom3-9d3bae242432a426740e78c5dbef87f0   False      False      True        False         False    16h

oc debug node/ip-10-0-22-15.us-east-2.compute.internal -- chroot /host rpm-ostree status
Starting pod/ip-10-0-22-15us-east-2computeinternal-debug-7ccqr ...
To use host binaries, run `chroot /host`
State: idle
Deployments:
* ostree-unverified-registry:image-registry.openshift-image-registry.svc:5000/openshift-machine-config-operator/ocb-image@sha256:c2ddf1d07f2a40a83935beb3419ca346f6fe637c12e1be9e5a31952d8f81ce30
                   Digest: sha256:c2ddf1d07f2a40a83935beb3419ca346f6fe637c12e1be9e5a31952d8f81ce30
                  Version: 10.1.20260126-0 (2026-03-06T12:52:13Z)

Removing debug pod ...

/label qe-approved
/verified by @ptalgulk01

@openshift-ci-robot openshift-ci-robot added the verified label (signifies that the PR passed pre-merge verification criteria) Mar 5, 2026
@openshift-ci-robot
Contributor

@ptalgulk01: This PR has been marked as verified by @ptalgulk01.

@openshift-ci openshift-ci bot added the qe-approved label (signifies that QE has signed off on this PR) Mar 5, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Mar 5, 2026

@cheesesashimi: This pull request references MCO-1972 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

@cheesesashimi
Member Author

/test e2e-aws-ovn

@cheesesashimi
Member Author

/retest-required

@pablintino
Contributor

/retest-required

@cheesesashimi
Member Author

/test e2e-gcp-op-part2

@cheesesashimi
Member Author

/test e2e-gcp-op-part2

@pablintino
Contributor

/override ci/prow/e2e-gcp-op-part2
Override as the failure is known and unrelated.

@openshift-ci
Contributor

openshift-ci bot commented Mar 12, 2026

@pablintino: Overrode contexts on behalf of pablintino: ci/prow/e2e-gcp-op-part2

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@pablintino
Contributor

/lgtm
/override ci/prow/e2e-gcp-op-part2 ci/prow/e2e-hypershift
Overriding known, unrelated issues.

@openshift-ci
Contributor

openshift-ci bot commented Mar 13, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheesesashimi, pablintino

Details Needs approval from an approver in each of these files:
  • OWNERS [cheesesashimi,pablintino]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci bot commented Mar 13, 2026

@pablintino: Overrode contexts on behalf of pablintino: ci/prow/e2e-gcp-op-part2, ci/prow/e2e-hypershift

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 5f0d9d7 and 2 for PR HEAD 6303082 in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 3474828 and 1 for PR HEAD 6303082 in total

@openshift-ci
Contributor

openshift-ci bot commented Mar 14, 2026

@cheesesashimi: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                        Commit    Details   Required   Rerun command
ci/prow/e2e-hypershift           6303082   link      unknown    /test e2e-hypershift
ci/prow/okd-scos-images          6303082   link      unknown    /test okd-scos-images
ci/prow/e2e-gcp-op-part2         6303082   link      unknown    /test e2e-gcp-op-part2
ci/prow/e2e-gcp-op-part1         6303082   link      unknown    /test e2e-gcp-op-part1
ci/prow/verify                   6303082   link      unknown    /test verify
ci/prow/e2e-gcp-op-single-node   6303082   link      unknown    /test e2e-gcp-op-single-node
ci/prow/unit                     6303082   link      unknown    /test unit
ci/prow/bootstrap-unit           6303082   link      unknown    /test bootstrap-unit
ci/prow/e2e-aws-ovn              6303082   link      unknown    /test e2e-aws-ovn
ci/prow/images                   6303082   link      unknown    /test images
ci/prow/e2e-aws-ovn-upgrade      6303082   link      unknown    /test e2e-aws-ovn-upgrade
ci/prow/verify-deps              6303082   link      unknown    /test verify-deps

@openshift-merge-robot openshift-merge-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) Mar 14, 2026
@openshift-merge-robot
Contributor

PR needs rebase.

@coderabbitai

coderabbitai bot commented Mar 14, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • needs-rebase: Indicates a PR cannot be merged because it has merge conflicts with HEAD.
  • qe-approved: Signifies that QE has signed off on this PR.
  • verified: Signifies that the PR passed pre-merge verification criteria.
