CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding #8626

Merged
openshift-merge-bot[bot] merged 6 commits into openshift:main from muraee:metrics-forwarding-api
Jun 25, 2026
Conversation

muraee commented May 28, 2026

Contributor

Summary

  • Introduces spec.monitoring.metricsForwarding API on HostedCluster and HostedControlPlane, replacing the annotation-based hypershift.openshift.io/enable-metrics-forwarding mechanism
  • Adds per-cluster metricsSet field (Telemetry/SRE/All) that overrides the global METRICS_SET env var on the HyperShift Operator
  • Updates all consumers (CPO predicates, HCCO, HO SRE config sync) to use the new spec field
  • Maintains backward compatibility: the deprecated annotation is honored when the spec field is not set
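
As a rough illustration of the shape described above, the sketch below models spec.monitoring with plain Go types and an enum check. The type layout follows the field paths quoted in the review comments (Monitoring.MetricsForwarding.Mode and .MetricsSet), but the constant names, JSON tags, and the validate and parseMonitoring helpers are assumptions made for this example, not the definitions in api/hypershift/v1beta1.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical stand-ins for the types this PR adds. The real definitions live
// in api/hypershift/v1beta1 and may differ in field tags and validation.
type MetricsForwardingMode string
type MetricsSet string

const (
	ModeEnabled  MetricsForwardingMode = "Enabled"
	ModeDisabled MetricsForwardingMode = "Disabled"

	SetTelemetry MetricsSet = "Telemetry"
	SetSRE       MetricsSet = "SRE"
	SetAll       MetricsSet = "All"
)

type MetricsForwardingSpec struct {
	Mode       MetricsForwardingMode `json:"mode,omitempty"`
	MetricsSet MetricsSet            `json:"metricsSet,omitempty"`
}

type MonitoringSpec struct {
	MetricsForwarding MetricsForwardingSpec `json:"metricsForwarding"`
}

// validate mimics the enum validation a CRD schema would enforce.
// An empty value means "unset" and is accepted.
func validate(m MonitoringSpec) error {
	switch m.MetricsForwarding.Mode {
	case "", ModeEnabled, ModeDisabled:
	default:
		return fmt.Errorf("unsupported mode %q", m.MetricsForwarding.Mode)
	}
	switch m.MetricsForwarding.MetricsSet {
	case "", SetTelemetry, SetSRE, SetAll:
	default:
		return fmt.Errorf("unsupported metricsSet %q", m.MetricsForwarding.MetricsSet)
	}
	return nil
}

// parseMonitoring decodes the JSON form of spec.monitoring and validates it.
func parseMonitoring(raw []byte) (MonitoringSpec, error) {
	var m MonitoringSpec
	if err := json.Unmarshal(raw, &m); err != nil {
		return MonitoringSpec{}, err
	}
	return m, validate(m)
}

func main() {
	m, err := parseMonitoring([]byte(`{"metricsForwarding":{"mode":"Enabled","metricsSet":"SRE"}}`))
	fmt.Println(m.MetricsForwarding.Mode, m.MetricsForwarding.MetricsSet, err)
}
```

In the real API the enum check would come from the generated CRD schema rather than a Go function; the helper here only makes the accepted values explicit.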

Test plan

  • Unit tests pass for endpoint-resolver predicate (TestPredicate)
  • Unit tests pass for HCCO metrics forwarder (TestReconcileMetricsForwarder)
  • make verify passes (0 lint issues)
  • make api-lint-fix passes (0 issues)
  • Envtest validation YAML added for monitoring field enum validation
  • E2E: verify metrics forwarding works with spec.monitoring.metricsForwarding.mode: Enabled
  • E2E: verify backward compat with annotation still works

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added structured monitoring configuration on HostedCluster and HostedControlPlane to configure metrics forwarding (Enabled/Disabled) and to select metrics set (Telemetry, SRE, All). Deprecated metrics-forwarding annotation is preserved for backward compatibility when the spec is unset.
  • Bug Fixes

    • Controllers and components now consistently honor spec-driven monitoring settings for enabling/disabling forwarding and SRE metrics selection.
  • Tests

    • Updated unit and e2e tests and helpers to exercise the new spec-driven monitoring behavior.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests are triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) May 28, 2026
@openshift-ci-robot


@muraee: This pull request explicitly references no jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) May 28, 2026
openshift-ci Bot commented May 28, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai Bot commented May 28, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR migrates metrics forwarding configuration from annotation-based to spec-based control. It adds MonitoringSpec, MetricsForwardingSpec, and enums (MetricsForwardingMode, MetricsSet), copies HostedCluster.Spec.Monitoring to HostedControlPlane.Spec.Monitoring with a backward-compatibility shim for the deprecated annotation, computes an effective metrics set for SRE reconciliation, updates controller predicates and gating to use the spec field, and updates unit and e2e tests to drive behavior via the new spec.
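
The "effective metrics set" mentioned in the walkthrough can be pictured as a simple override rule: a non-empty per-cluster metricsSet wins, otherwise the operator-wide METRICS_SET value applies. The following is a minimal sketch under that reading; the effectiveMetricsSet function name and signature are hypothetical and are not taken from the controller code.

```go
package main

import (
	"fmt"
	"os"
)

// MetricsSet is a hypothetical stand-in for the API enum (Telemetry, SRE, All).
type MetricsSet string

// effectiveMetricsSet returns the per-cluster value when it is set and falls
// back to the operator-wide value otherwise.
func effectiveMetricsSet(global, perCluster MetricsSet) MetricsSet {
	if perCluster != "" {
		return perCluster
	}
	return global
}

func main() {
	// METRICS_SET is the operator-level env var named in the PR summary.
	global := MetricsSet(os.Getenv("METRICS_SET"))

	// A cluster that sets spec.monitoring.metricsForwarding.metricsSet.
	fmt.Println(effectiveMetricsSet(global, "SRE"))
	// A cluster that leaves the field unset inherits the global value.
	fmt.Println(effectiveMetricsSet(global, ""))
}
```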

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant HostedClusterController
  participant HostedCluster
  participant HostedControlPlane
  participant ControlPlaneOperator
  participant MetricsForwarder

  User->>HostedCluster: set spec.monitoring.metricsForwarding.mode=Enabled
  HostedClusterController->>HostedCluster: read Spec.Monitoring
  HostedClusterController->>HostedControlPlane: copy Spec.Monitoring (apply deprecated-annotation shim if unset)
  ControlPlaneOperator->>HostedControlPlane: read Spec.Monitoring
  ControlPlaneOperator->>MetricsForwarder: enable/disable based on MetricsForwarding.Mode and MetricsSet
  MetricsForwarder-->>ControlPlaneOperator: status

Suggested reviewers

  • jparrill
  • sdminonne
🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Structure And Quality | ⚠️ Warning | Unit tests lack meaningful failure messages on assertions: TestPredicate, TestReconcileMetricsForwarder, TestReconcileHostedControlPlaneMonitoring missing diagnostic context. | Add descriptive failure messages to Expect() assertions (e.g., Expect(err).NotTo(HaveOccurred(), "should enable metrics forwarding on HCP")) to help diagnose failures. |
✅ Passed checks (10 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a new spec.monitoring API field to enable metrics forwarding configuration. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names are stable and deterministic. No dynamic content found in test case names (no fmt.Sprintf, string concatenation, UUIDs, timestamps, or generated identifiers detected). |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR only adds API schema and controller logic for metrics forwarding; no deployment manifests, affinity rules, topology constraints, or topology-unaware replica logic. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | New e2e tests contain no hardcoded IPv4 addresses, IPv4-specific parsing, or external connectivity requirements. All connections use DNS service names and cluster-internal services. |
| No-Weak-Crypto | ✅ Passed | No weak cryptography usage found in PR: no MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB mode, custom crypto implementations, or non-constant-time secret comparisons detected in modified code. |
| Container-Privileges | ✅ Passed | PR adds monitoring API fields; no privileged container settings (privileged: true, hostPID, hostNetwork, hostIPC, SYS_ADMIN, allowPrivilegeEscalation: true, runAsUser: 0) introduced. |
| No-Sensitive-Data-In-Logs | ✅ Passed | PR adds monitoring API fields (Enabled/Disabled enums and metrics sets) with no logging of spec values; code reads these fields for feature gating but never logs them. |

@muraee muraee marked this pull request as ready for review May 28, 2026 14:53
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) May 28, 2026
@muraee muraee changed the title from "NO-JIRA: Add spec.monitoring API for metrics forwarding" to "CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding" May 28, 2026
openshift-ci-robot commented May 28, 2026

@muraee: This pull request references CNTRLPLANE-3526 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

@openshift-ci openshift-ci Bot added the labels below and removed the do-not-merge/needs-area label May 28, 2026
  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/testing: Indicates the PR includes changes for e2e testing
@openshift-ci openshift-ci Bot requested review from csrwng and jparrill May 28, 2026 14:53
muraee commented May 28, 2026

Contributor Author

@coderabbitai review

coderabbitai Bot commented May 28, 2026

Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8626 May 28, 2026 14:58 Inactive

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/hypershift/v1beta1/hosted_controlplane.go`:
- Around line 190-195: The Monitoring field in HostedControlPlane currently has
an inconsistent JSON tag; locate the Monitoring declaration (Monitoring
MonitoringSpec) and replace the tag string that contains both
"omitempty,omitzero" with the single "omitzero" form (i.e.,
json:"monitoring,omitzero") so it matches the style used by HostedClusterSpec
and other fields like AutoNode.

In
`@control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go`:
- Around line 1189-1192: The code reads
hcp.Spec.Monitoring.MetricsForwarding.MetricsSet without nil checks which can
panic; update the logic around effectiveMetricsSet (and keep r.MetricsSet
fallback) to first verify hcp.Spec.Monitoring != nil and
hcp.Spec.Monitoring.MetricsForwarding != nil before accessing MetricsSet, and
only override effectiveMetricsSet when those pointers are non-nil and MetricsSet
is non-empty (preserve existing behavior of using metrics.MetricsSet(...) when
present).

In
`@control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go`:
- Around line 46-49: The predicate function dereferences
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil checks which
can panic when Monitoring or MetricsForwarding are nil; update
predicate(cpContext component.WorkloadContext) to first check cpContext.HCP,
cpContext.HCP.Spec, cpContext.HCP.Spec.Monitoring and
cpContext.HCP.Spec.Monitoring.MetricsForwarding for nil before reading Mode and
combine that guarded check with the existing DisableMonitoringServices
annotation check (hyperv1.DisableMonitoringServices) so the function returns
false (no metrics forwarding) when any of the intermediate structs are nil and
only true when Mode == hyperv1.MetricsForwardingModeEnabled and monitoring is
not disabled.

In
`@control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go`:
- Around line 69-72: The predicate function reads
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil guards; add
checks in predicate to ensure cpContext.HCP, cpContext.HCP.Spec, and
cpContext.HCP.Spec.Monitoring are non-nil (and that Monitoring.MetricsForwarding
is present) before accessing Mode, and return false (no reconcile) if any are
nil; keep the existing DisableMonitoringServices annotation check
(hyperv1.DisableMonitoringServices) and only evaluate Mode when the monitoring
structs exist to avoid nil-pointer panics.
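
A minimal sketch of the gating that the two predicate comments above describe: deploy only when monitoring services are not disabled and the mode is explicitly Enabled. The hcp struct is a trimmed, hypothetical view of HostedControlPlane, the annotation key is an assumed value for hyperv1.DisableMonitoringServices, and treating the mere presence of that annotation as "disabled" is also an assumption.

```go
package main

import "fmt"

type MetricsForwardingMode string

const (
	ModeEnabled  MetricsForwardingMode = "Enabled"
	ModeDisabled MetricsForwardingMode = "Disabled"

	// Assumed key for hyperv1.DisableMonitoringServices; verify against the API package.
	disableMonitoringAnnotation = "hypershift.openshift.io/disable-monitoring-services"
)

// hcp is a trimmed, hypothetical view of HostedControlPlane that keeps only
// the fields the predicate needs.
type hcp struct {
	Annotations map[string]string
	Mode        MetricsForwardingMode // stands in for Spec.Monitoring.MetricsForwarding.Mode
}

// shouldDeploy reports whether a metrics-forwarding component should be
// reconciled. A nil object, a disabled-monitoring annotation, or any mode other
// than Enabled (including the unset zero value) yields false.
func shouldDeploy(h *hcp) bool {
	if h == nil {
		return false
	}
	if _, disabled := h.Annotations[disableMonitoringAnnotation]; disabled {
		return false
	}
	return h.Mode == ModeEnabled
}

func main() {
	fmt.Println(shouldDeploy(&hcp{Mode: ModeEnabled}))
	fmt.Println(shouldDeploy(&hcp{Mode: ModeDisabled}))
	fmt.Println(shouldDeploy(nil))
}
```

If the spec fields are value types rather than pointers, reading Mode on an unset spec simply yields the empty string, so only the outer object needs a nil guard in this sketch.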

In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2508-2511: The current fallback flips any non-Enabled mode
(including an explicit Disabled) to Enabled when the deprecated annotation
exists; change the condition to only apply the annotation fallback when the HCP
mode is not explicitly set (e.g., empty/unspecified) rather than any mode other
than Enabled. Concretely, update the check around
hcp.Spec.Monitoring.MetricsForwarding.Mode so it only sets
hyperv1.MetricsForwardingModeEnabled from the deprecated
hcluster.Annotations[hyperv1.EnableMetricsForwarding] when the existing mode is
the unset/zero value (not when it equals hyperv1.MetricsForwardingModeDisabled
or any explicit value).
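
The fallback rule requested in the last comment can be sketched as follows: the deprecated annotation is consulted only when the spec mode is the unset zero value, so an explicit Disabled is never flipped. The resolveMode helper is hypothetical, and checking for the annotation's presence (rather than a specific value) is an assumption; the annotation key is the one named in the PR summary.

```go
package main

import "fmt"

type MetricsForwardingMode string

const (
	ModeEnabled  MetricsForwardingMode = "Enabled"
	ModeDisabled MetricsForwardingMode = "Disabled"

	// Deprecated annotation named in the PR summary.
	enableMetricsForwardingAnnotation = "hypershift.openshift.io/enable-metrics-forwarding"
)

// resolveMode returns the mode to copy onto the HostedControlPlane.
// An explicit spec value always wins; the annotation only fills in an unset mode.
func resolveMode(specMode MetricsForwardingMode, annotations map[string]string) MetricsForwardingMode {
	if specMode != "" {
		return specMode
	}
	if _, ok := annotations[enableMetricsForwardingAnnotation]; ok {
		return ModeEnabled
	}
	return specMode
}

func main() {
	annotated := map[string]string{enableMetricsForwardingAnnotation: "true"}

	fmt.Printf("%q\n", resolveMode(ModeDisabled, annotated)) // explicit Disabled is kept
	fmt.Printf("%q\n", resolveMode("", annotated))           // unset mode falls back to the annotation
	fmt.Printf("%q\n", resolveMode("", nil))                 // nothing set stays unset
}
```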
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9bad19ca-0e30-40f1-88ee-d8aca6240665

📥 Commits

Reviewing files that changed from the base of the PR and between e8aa9bb and 996709e.

⛔ Files ignored due to path filters (44)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/hostedclusterspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/hostedcontrolplanespec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/metricsforwardingspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/monitoringspec.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.monitoring.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (11)
  • api/hypershift/v1beta1/hosted_controlplane.go
  • api/hypershift/v1beta1/hostedcluster_types.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go

Comment thread api/hypershift/v1beta1/hosted_controlplane.go
Comment thread hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go Outdated
Comment thread api/hypershift/v1beta1/hostedcluster_types.go
@muraee muraee force-pushed the metrics-forwarding-api branch from 996709e to 89814ff Compare May 28, 2026 15:15
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8626 May 28, 2026 15:21 Inactive

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 951-952: The code directly reads
hcp.Spec.Monitoring.MetricsForwarding.Mode which can panic when Monitoring or
MetricsForwarding is nil; update the conditional to first nil-check
hcp.Spec.Monitoring and hcp.Spec.Monitoring.MetricsForwarding and only compare
Mode to hyperv1.MetricsForwardingModeEnabled when both are non-nil, otherwise
treat it as not enabled and call return k8sutil.DeleteAllIfNeeded(ctx, r.client,
deployment, cm, servingCA, podMonitor); ensure you reference the same symbols
(hcp, Spec, Monitoring, MetricsForwarding, Mode,
hyperv1.MetricsForwardingModeEnabled) so the branch exactly mirrors the intended
behavior.
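
The branch described above (reconcile the forwarder resources when the mode is Enabled, delete them otherwise) could look roughly like this. The store type is a hypothetical in-memory stand-in for the cluster client, and reconcileForwarder is an illustrative helper, not the HCCO function; the real code calls k8sutil.DeleteAllIfNeeded on the deployment, config map, serving CA, and pod monitor.

```go
package main

import "fmt"

type MetricsForwardingMode string

const ModeEnabled MetricsForwardingMode = "Enabled"

// store is a hypothetical in-memory stand-in for the cluster client:
// a key is present when the named resource exists.
type store map[string]bool

// forwarderResources mirrors the four objects named in the review comment.
var forwarderResources = []string{"deployment", "cm", "servingCA", "podMonitor"}

// reconcileForwarder creates the forwarder resources when forwarding is
// enabled and removes them for every other mode, including the unset value.
func reconcileForwarder(mode MetricsForwardingMode, s store) {
	for _, name := range forwarderResources {
		if mode == ModeEnabled {
			s[name] = true
		} else {
			delete(s, name)
		}
	}
}

func main() {
	s := store{}
	reconcileForwarder(ModeEnabled, s)
	fmt.Println(len(s)) // resources present after enabling
	reconcileForwarder("Disabled", s)
	fmt.Println(len(s)) // resources removed after disabling
}
```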
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9bc66ee1-87cc-4ca1-8f51-fda6d9f9d236

📥 Commits

Reviewing files that changed from the base of the PR and between 996709e and 89814ff.

⛔ Files ignored due to path filters (44)
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (11)
  • api/hypershift/v1beta1/hosted_controlplane.go
  • api/hypershift/v1beta1/hostedcluster_types.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go
  • api/hypershift/v1beta1/hosted_controlplane.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

@codecov

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 71.05263% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 42.60%. Comparing base (b6903c6) to head (170c140).
⚠️ Report is 41 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...trollers/hostedcluster/hostedcluster_controller.go | 70.00% | 8 Missing and 1 partial ⚠️ |
| ...stedcontrolplane/v2/endpoint_resolver/component.go | 66.66% | 1 Missing ⚠️ |
| ...s/hostedcontrolplane/v2/metrics_proxy/component.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8626      +/-   ##
==========================================
+ Coverage   42.50%   42.60%   +0.10%     
==========================================
  Files         768      768              
  Lines       95272    95318      +46     
==========================================
+ Hits        40498    40614     +116     
+ Misses      51971    51892      -79     
- Partials     2803     2812       +9     
| Files with missing lines | Coverage Δ |
|---|---|
| .../hostedcontrolplane/v2/metrics_proxy/deployment.go | 86.02% <100.00%> (+13.84%) ⬆️ |
| ...rconfigoperator/controllers/resources/resources.go | 57.25% <100.00%> (+0.27%) ⬆️ |
| ...stedcontrolplane/v2/endpoint_resolver/component.go | 12.00% <66.66%> (-3.39%) ⬇️ |
| ...s/hostedcontrolplane/v2/metrics_proxy/component.go | 0.00% <0.00%> (ø) |
| ...trollers/hostedcluster/hostedcluster_controller.go | 52.36% <70.00%> (+0.37%) ⬆️ |

... and 3 files with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 35.46% <ø> (ø) |
| cpo-hostedcontrolplane | 44.96% <71.42%> (+0.12%) ⬆️ |
| cpo-other | 44.76% <100.00%> (+0.44%) ⬆️ |
| hypershift-operator | 53.11% <70.00%> (+0.06%) ⬆️ |
| other | 31.69% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.


Allow the forwarded metrics set (guest-side, via metrics-proxy) to be
configured independently from the MC-side ServiceMonitor/PodMonitor
relabel configs. When MetricsForwardingSpec.MetricsSet is not set, it
falls back to MonitoringSpec.MetricsSet (then to the global METRICS_SET
env var).
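
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the struct shapes are simplified stand-ins for the real API types, `forwardedMetricsSet` is a hypothetical helper name, and the global value is passed in as a parameter rather than read from the `METRICS_SET` env var.

```go
package main

import "fmt"

// MetricsSet mirrors the Telemetry/SRE/All values named in the PR summary.
type MetricsSet string

const (
	MetricsSetTelemetry MetricsSet = "Telemetry"
	MetricsSetSRE       MetricsSet = "SRE"
	MetricsSetAll       MetricsSet = "All"
)

// MetricsForwardingSpec and MonitoringSpec are simplified, assumed shapes.
// The real types in api/hypershift/v1beta1 may differ (pointers, extra fields).
type MetricsForwardingSpec struct {
	MetricsSet MetricsSet // empty means "not set"
}

type MonitoringSpec struct {
	MetricsSet        MetricsSet // empty means "not set"
	MetricsForwarding *MetricsForwardingSpec
}

// forwardedMetricsSet (hypothetical name) resolves the guest-side metrics set:
// forwarding-level override first, then the monitoring-level value, then the
// operator-wide global.
func forwardedMetricsSet(m *MonitoringSpec, global MetricsSet) MetricsSet {
	if m != nil {
		if m.MetricsForwarding != nil && m.MetricsForwarding.MetricsSet != "" {
			return m.MetricsForwarding.MetricsSet
		}
		if m.MetricsSet != "" {
			return m.MetricsSet
		}
	}
	return global
}

func main() {
	global := MetricsSetTelemetry

	// Nothing set on the cluster: the global value wins.
	fmt.Println(forwardedMetricsSet(nil, global))

	// Only the monitoring-level value is set.
	fmt.Println(forwardedMetricsSet(&MonitoringSpec{MetricsSet: MetricsSetSRE}, global))

	// The forwarding-level value overrides the monitoring-level one.
	fmt.Println(forwardedMetricsSet(&MonitoringSpec{
		MetricsSet:        MetricsSetSRE,
		MetricsForwarding: &MetricsForwardingSpec{MetricsSet: MetricsSetAll},
	}, global))
}
```

Keeping the resolution in one function means the MC-side monitors and the guest-side forwarder can each call it with their own inputs instead of repeating the precedence rules.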

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@muraee muraee force-pushed the metrics-forwarding-api branch from f5670b8 to 0c33a77 Compare June 23, 2026 14:54
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2026
- Broaden SRE ConfigMap watch to enqueue HostedClusters with per-cluster
  metricsSet SRE override, not just when operator global is SRE
- Extract effectiveMetricsSet() helper to deduplicate override resolution
- Fix comment to say "None" instead of "Disabled" matching the enum
- Fix test names to use Forward/None instead of Enabled/Disabled
- Add positive test for Mode=Forward verifying resources are preserved
- Add envtest for required mode field on MetricsForwardingSpec
- Add godoc to exported Predicate function
- Add TestMetricsSetConstantsInSync asserting hyperv1/metrics type parity
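
As a rough illustration of the first two bullets, the sketch below selects which clusters to enqueue when the SRE ConfigMap changes. Only the helper name `effectiveMetricsSet` comes from the commit message; its signature, the `cluster` type, and `clustersToEnqueue` are assumptions made for this example, and the real watch handler works on controller-runtime objects rather than plain structs.

```go
package main

import "fmt"

// cluster is a hypothetical, flattened view of a HostedCluster.
type cluster struct {
	Name       string
	MetricsSet string // per-cluster override; "" means not set
}

// effectiveMetricsSet returns the per-cluster override when present,
// otherwise the operator-wide value. The signature is assumed.
func effectiveMetricsSet(override, global string) string {
	if override != "" {
		return override
	}
	return global
}

// clustersToEnqueue (hypothetical) picks every cluster whose effective
// metrics set is SRE, whether that comes from the override or the global.
func clustersToEnqueue(clusters []cluster, global string) []string {
	var names []string
	for _, c := range clusters {
		if effectiveMetricsSet(c.MetricsSet, global) == "SRE" {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	clusters := []cluster{
		{Name: "a"},
		{Name: "b", MetricsSet: "SRE"},
		{Name: "c", MetricsSet: "Telemetry"},
	}
	fmt.Println(clustersToEnqueue(clusters, "Telemetry")) // prints [b]
	fmt.Println(clustersToEnqueue(clusters, "SRE"))       // prints [a b]
}
```

With a global of Telemetry, cluster `b` is still enqueued because of its own override, which is the case the broadened watch is meant to cover.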

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@muraee muraee force-pushed the metrics-forwarding-api branch from 0c33a77 to ef52175 Compare June 23, 2026 14:56
@csrwng

csrwng commented Jun 23, 2026

Contributor

/lgtm

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8626 June 23, 2026 15:03 Inactive
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 23, 2026
@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
Older CPOs (4.22) only read the EnableMetricsForwarding annotation, not
the new spec.monitoring.metricsForwarding field. When the HO sets the
spec on the HCP without also setting the annotation, a 4.22 CPO never
deploys metrics forwarding components, breaking 4.22 e2e tests and
real-world N-1 version skew scenarios.
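
The version-skew fix can be pictured as the HO writing the deprecated annotation next to the new spec field, so that a CPO which predates the field still sees it. The sketch below is an assumption-laden illustration: the annotation key is the one quoted in the PR summary, but the value `"true"`, the presence-only check in `legacyCPOEnabled`, and the helper names are all hypothetical.

```go
package main

import "fmt"

// Annotation key quoted in the PR summary.
const enableMetricsForwardingAnnotation = "hypershift.openshift.io/enable-metrics-forwarding"

// Mode values follow the Forward/None naming used in the review fixes.
// Plain strings stand in for the real enum type.
const (
	modeForward = "Forward"
	modeNone    = "None"
)

// syncLegacyAnnotation (hypothetical) returns a copy of the HCP annotations
// with the deprecated annotation aligned to the spec mode. An unset mode
// leaves existing annotations untouched, so the annotation-only path keeps working.
func syncLegacyAnnotation(annotations map[string]string, mode string) map[string]string {
	out := make(map[string]string, len(annotations)+1)
	for k, v := range annotations {
		out[k] = v
	}
	switch mode {
	case modeForward:
		out[enableMetricsForwardingAnnotation] = "true" // assumed value
	case modeNone:
		delete(out, enableMetricsForwardingAnnotation)
	}
	return out
}

// legacyCPOEnabled models an older CPO that only looks at the annotation.
// Checking for presence rather than a specific value is an assumption.
func legacyCPOEnabled(annotations map[string]string) bool {
	_, ok := annotations[enableMetricsForwardingAnnotation]
	return ok
}

func main() {
	hcp := syncLegacyAnnotation(nil, modeForward)
	fmt.Println(legacyCPOEnabled(hcp)) // prints true

	hcp = syncLegacyAnnotation(hcp, modeNone)
	fmt.Println(legacyCPOEnabled(hcp)) // prints false

	// Spec not set: a pre-existing annotation is preserved.
	hcp = syncLegacyAnnotation(map[string]string{enableMetricsForwardingAnnotation: "true"}, "")
	fmt.Println(legacyCPOEnabled(hcp)) // prints true
}
```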

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@csrwng

csrwng commented Jun 24, 2026

Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 24, 2026
@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2069771585861980160 | Cost: $3.38010825 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@muraee

muraee commented Jun 24, 2026

Contributor Author

/retest-required

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented Jun 24, 2026

Test Failure Analysis Complete

Job Information

  • Prow Job 1: pull-ci-openshift-hypershift-main-e2e-aws-4-22
  • Build ID: 2069815825975480320
  • Target: e2e-aws-4-22
  • Result: 600 tests, 29 skipped, 2 failures
  • Failed Test: TestKarpenterUpgradeControlPlane/Main (2169.60s)

  • Prow Job 2: pull-ci-openshift-hypershift-main-e2e-aws
  • Build ID: 2069815825908371456
  • Target: e2e-aws
  • Result: 623 tests, 30 skipped, 4 failures
  • Failed Test: TestKarpenter/Main/Parallel_provisioning_tests/OpenshiftEC2NodeClass_Kubelet_propagation (313.11s)

Test Failure Analysis

Error

Job 1 (e2e-aws-4-22) — TestKarpenterUpgradeControlPlane/Main:
  Failed to wait for HostedCluster e2e-clusters-pngkr/karpenter-upgrade-control-plane-9js8d
  to rollout in 30m0s: context deadline exceeded
  wanted most recent version history to have state Completed, has state Partial
  ClusterVersionProgressing=True: ClusterOperatorsUpdating(Working towards
  4.22.0-0.ci-2026-06-24-105629: 572 of 697 done (82% complete), waiting on dns, network)

Job 2 (e2e-aws) — TestKarpenter/.../OpenshiftEC2NodeClass_Kubelet_propagation:
  Get "https://10.0.136.209:10250/containerLogs/kube-system/kubelet-config-checker/checker":
  remote error: tls: internal error
  Status: "Failure", Code: 500

Summary

Both failures are in Karpenter-related tests that are unrelated to PR #8626 (CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding). The PR adds a spec.monitoring API for metrics forwarding, touching only API types, CRD manifests, apply-configuration client code, and a metrics proxy test utility — it does not modify any Karpenter tests, upgrade logic, kubelet certificate handling, or cluster operator reconciliation. Cross-referencing with other recent PRs (#8820, #8819) confirms both TestKarpenterUpgradeControlPlane and TestKarpenter fail frequently with identical failure modes on completely unrelated changes, establishing these as known flaky tests.

Root Cause

Job 1 (e2e-aws-4-22) — TestKarpenterUpgradeControlPlane: The test triggers a hosted cluster control plane upgrade from one CI release image to another. After updating spec.release.image, it waits up to 30 minutes for the ClusterVersion rollout to complete. The upgrade stalled at 82% completion (572 of 697 done), blocked on the dns and network cluster operators failing to converge. This is a transient CVO (Cluster Version Operator) issue where cluster operators become stuck during upgrade — not caused by any code change in this PR. The test's 30-minute timeout (WaitForDataPlaneRollout in test/e2e/util/util.go:614) expired while waiting for ClusterVersionProgressing=False.

Job 2 (e2e-aws) — OpenshiftEC2NodeClass_Kubelet_propagation: The test creates a custom OpenshiftEC2NodeClass with specific kubelet configuration, provisions a Karpenter node, then deploys a privileged checker pod to validate kubelet settings. The failure occurs when attempting to fetch container logs from the kubelet on port 10250 — a TLS handshake failure (tls: internal error) between the API server and the kubelet on the newly provisioned node (IP 10.0.136.209). This is a node-level certificate bootstrapping issue on a freshly Karpenter-provisioned node where the kubelet's serving certificate may not yet be properly signed or trusted. This is entirely in the node bootstrap/certificate path and has no relationship to the monitoring API changes.

Both failures reproduce on other PRs with completely different code changes, confirming they are pre-existing infrastructure/environment flakiness.

Recommendations
  1. Retest the PR — Both failures are flaky infrastructure issues unrelated to the PR changes. A /retest should resolve both jobs.

  2. No code changes needed in PR #8626: the PR's 58 changed files (API types, CRD manifests, generated deepcopy, apply-configuration code, and a metrics proxy test utility) have zero overlap with the failing Karpenter tests.

  3. For test owners — Consider:

    • Increasing the WaitForDataPlaneRollout timeout for TestKarpenterUpgradeControlPlane beyond 30 minutes, or adding retry logic for CVO stalls on specific operators (dns, network).
    • Adding retry/backoff logic for the kubelet TLS connection in OpenshiftEC2NodeClass_Kubelet_propagation to handle transient certificate bootstrapping delays on newly provisioned Karpenter nodes.
Evidence

| Evidence | Detail |
|---|---|
| PR #8626 changed files | 58 files: API types, CRD manifests, deepcopy, apply-configuration, util_metrics_proxy.go, hosted_cluster_metrics_test.go. Zero Karpenter or upgrade test files. |
| Job 1 failure | TestKarpenterUpgradeControlPlane/Main: CVO stalled at 82% (572/697), waiting on dns + network operators. 30m timeout exceeded. |
| Job 1 condition | ClusterVersionProgressing=True: ClusterOperatorsUpdating(Working towards 4.22.0-0.ci-2026-06-24-105629: 572 of 697 done (82% complete), waiting on dns, network) |
| Job 2 failure | TestKarpenter/.../OpenshiftEC2NodeClass_Kubelet_propagation: TLS internal error on kubelet port 10250 at IP 10.0.136.209 |
| Job 2 error | Get "https://10.0.136.209:10250/containerLogs/kube-system/kubelet-config-checker/checker": remote error: tls: internal error (HTTP 500) |
| Flaky on PR #8820 | Same TestKarpenterUpgradeControlPlane fails with EnsureNoCrashingPods on unrelated PR |
| Flaky on PR #8819 | TestKarpenter/ValidateHostedCluster fails on unrelated PR |
| Test pass rate | Job 1: 598/600 passed (99.7%). Job 2: 619/623 passed (99.4%). Only Karpenter tests failed. |

@muraee

muraee commented Jun 25, 2026

Contributor Author

/retest-required

@muraee

muraee commented Jun 25, 2026

Contributor Author

/verified by e2e

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 25, 2026
@openshift-ci-robot

@muraee: This PR has been marked as verified by e2e.

Details

In response to this:

/verified by e2e

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 25, 2026

Contributor

@muraee: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 02675db into openshift:main Jun 25, 2026
53 checks passed
@muraee

muraee commented Jun 30, 2026

Contributor Author

/jira backport release-4.22

@openshift-ci-robot

@muraee: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.22

Details

In response to this:

/jira backport release-4.22

@openshift-cherrypick-robot

@openshift-ci-robot: #8626 failed to apply on top of branch "release-4.22":

Applying: feat(api): add spec.monitoring with metricsForwarding API
Applying: chore(api): regenerate CRDs, deepcopy, clients, vendor, and docs
Using index info to reconstruct a base tree...
M	api/hypershift/v1beta1/zz_generated.deepcopy.go
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml
A	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml
M	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml
A	api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml
M	client/applyconfiguration/utils.go
M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml
M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml
M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml
M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml
M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml
.git/rebase-apply/patch:2276: trailing whitespace.
<a href="#hypershift.openshift.io/v1beta1.HostedClusterSpec">HostedClusterSpec</a>, 
.git/rebase-apply/patch:2509: trailing whitespace.
<a href="#hypershift.openshift.io/v1beta1.HostedClusterSpec">HostedClusterSpec</a>, 
warning: 2 lines add whitespace errors.

M	cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml
M	docs/content/reference/aggregated-docs.md
M	docs/content/reference/api.md
M	vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go
M	vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go
M	vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go
Falling back to patching base and 3-way merge...
Auto-merging api/hypershift/v1beta1/zz_generated.deepcopy.go
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AutoNodeKarpenter.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AutoNodeKarpenter.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml
Auto-merging api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml
Auto-merging client/applyconfiguration/utils.go
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml
Auto-merging cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml
Auto-merging docs/content/reference/aggregated-docs.md
Auto-merging docs/content/reference/api.md
Auto-merging vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go
Auto-merging vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go
Auto-merging vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go
Applying: feat(cpo,ho): use spec.monitoring API instead of annotation
Using index info to reconstruct a base tree...
M	control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
M	control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
M	hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
M	hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
A	test/e2e/v2/tests/hosted_cluster_metrics_test.go
Falling back to patching base and 3-way merge...
Auto-merging control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
CONFLICT (content): Merge conflict in control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
Auto-merging control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
Auto-merging hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
CONFLICT (content): Merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
Auto-merging hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
CONFLICT (content): Merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
CONFLICT (modify/delete): test/e2e/v2/tests/hosted_cluster_metrics_test.go deleted in HEAD and modified in feat(cpo,ho): use spec.monitoring API instead of annotation.  Version feat(cpo,ho): use spec.monitoring API instead of annotation of test/e2e/v2/tests/hosted_cluster_metrics_test.go left in tree.
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
hint: When you have resolved this problem, run "git am --continue".
hint: If you prefer to skip this patch, run "git am --skip" instead.
hint: To restore the original branch and stop patching, run "git am --abort".
hint: Disable this message with "git config set advice.mergeConflict false"
Patch failed at 0003 feat(cpo,ho): use spec.monitoring API instead of annotation

Details

In response to this:

@muraee: The following backport issues have been created:

Queuing cherrypicks to the requested branches to be created after this PR merges:
/cherrypick release-4.22

In response to this:

/jira backport release-4.22


Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/testing: Indicates the PR includes changes for e2e testing
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.
  • verified: Signifies that the PR passed pre-merge verification criteria

Projects

None yet

9 participants