fix(hostedclustersizing): add override fall back #7554
When ClusterSizeOverrideAnnotation is set to a value not found in the ClusterSizingConfiguration, the controller now gracefully falls back to autoscaling (if enabled) or node count sizing, rather than skipping sizing entirely.

Signed-off-by: Brendan Bergen <bbergen@redhat.com>
Assisted-by: Claude Opus 4.5 (via Cursor)
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: SudoBrendan
@SudoBrendan: all tests passed!
PR needs rebase.
Stale PRs are closed after 21d of inactivity. If this PR is still relevant, comment to refresh it or remove the stale label. If this PR is safe to close now please do so with /lifecycle stale |
Stale PRs rot after 14d of inactivity. Mark the PR as fresh by commenting. If this PR is safe to close now, please do so with /lifecycle rotten
Now I have a complete picture of all the failures. Here is the analysis:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

Summary

All four failures on PR #7554 are caused by the PR being extremely stale: it was opened on January 20, 2026, and is now 1,842 commits behind

Root Cause

There are two distinct root causes:

1. Tide merge conflict (primary blocker):
2. Konflux Enterprise Contract failures (pre-existing, not caused by PR code):
Recommendations
Evidence
What this PR does / why we need it:
Today, if you set the size override annotation to a value that is not an exact string match for a size in the config, your cluster remains unsized rather than falling back to the standard logic (autoscaling if enabled, otherwise node count sizing). The only indication that this has happened is a log message in the controller.
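To make the intended behavior concrete, here is a minimal, self-contained sketch of the fallback order described above. It is not the controller's actual code: `sizeConfig`, `sizingInput`, `hasSize`, `sizeForNodeCount`, and `resolveSize` are hypothetical names, and modelling autoscaling as a single recommended size string is an assumption made only for illustration.

```go
package main

import "fmt"

// sizeConfig is a hypothetical, simplified stand-in for one size entry in a
// ClusterSizingConfiguration: a name plus an inclusive node-count range.
type sizeConfig struct {
	Name     string
	MinNodes int
	MaxNodes int // negative means no upper bound
}

// sizingInput gathers what this sketch needs to pick a size (all hypothetical).
type sizingInput struct {
	Override       string // value of the size override annotation, "" if unset
	AutoscaleOn    bool   // whether autoscaling-based sizing is enabled
	AutoscaledSize string // size recommended by autoscaling, "" if none
	NodeCount      int    // current node count of the hosted cluster
}

// hasSize reports whether name exactly matches a configured size name.
func hasSize(sizes []sizeConfig, name string) bool {
	for _, s := range sizes {
		if s.Name == name {
			return true
		}
	}
	return false
}

// sizeForNodeCount returns the first configured size whose range contains nodes.
func sizeForNodeCount(sizes []sizeConfig, nodes int) (string, bool) {
	for _, s := range sizes {
		if nodes >= s.MinNodes && (s.MaxNodes < 0 || nodes <= s.MaxNodes) {
			return s.Name, true
		}
	}
	return "", false
}

// resolveSize returns the chosen size and the path that chose it.
func resolveSize(sizes []sizeConfig, in sizingInput) (size, source string) {
	if in.Override != "" && hasSize(sizes, in.Override) {
		return in.Override, "override"
	}
	// An unset or unknown override falls through to the standard logic
	// instead of leaving the cluster unsized.
	if in.AutoscaleOn && hasSize(sizes, in.AutoscaledSize) {
		return in.AutoscaledSize, "autoscaling"
	}
	if name, ok := sizeForNodeCount(sizes, in.NodeCount); ok {
		return name, "nodeCount"
	}
	return "", "none"
}

func main() {
	sizes := []sizeConfig{{"small", 0, 10}, {"medium", 11, 100}, {"large", 101, -1}}
	// "Medium" is not an exact match for "medium", so the override is ignored.
	size, source := resolveSize(sizes, sizingInput{Override: "Medium", NodeCount: 4})
	fmt.Println(size, source) // prints "small nodeCount"
}
```

In this sketch a mistyped override such as `Medium` no longer short-circuits sizing; the cluster is sized by node count (or by autoscaling when that is enabled), which is the behavior this PR proposes.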
NOTE: I think this seems odd - do others agree this is a bug, not a feature? Is there a reason to prefer an unsized cluster when the override is set incorrectly that I'm unaware of?
Which issue(s) this PR fixes:
Adjacent work I happened to discover while trying to refine what CNTRLPLANE-2581 would be
Special notes for your reviewer:
Checklist: