[backport release/v25.3.x] operator: defer rolling restart while a recently replaced pod is still coming up #1539
Merged
3 tasks
RafalKorepta approved these changes on May 22, 2026
…l coming up

During a rolling restart the operator deletes one pod at a time and waits for the cluster to report healthy before proceeding. In practice the Redpanda health API lags behind pod state: after deleting pod A, the new pod A can appear in the cache with the correct revision before Redpanda detects broker A's departure, so isHealthy remains true and the operator immediately deletes pod B, causing two pods to be unavailable simultaneously.

HasRecentlyReplacedPods in pool.go closes this race. Before entering the rolling loop (and only when there are pods to roll), it checks whether any pod that already carries the latest StatefulSet revision is not yet Running+Ready, or any pod is still terminating. If so, the reconciler defers with a requeue rather than proceeding to the next deletion. This is narrower than "all pods ready": pods with the old revision that happen to be unhealthy for unrelated reasons do not block the roll.

Backport of the guard introduced on main in #1446.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8fad3c6 to
c72d8bf
The acceptance test suite is failing nondeterministically. Maybe a fix similar to the one in #1541 should be applied.
Summary
Backport of the HasRecentlyReplacedPods rolling-restart guard introduced on
main in #1446. The guard itself lives in operator/internal/lifecycle/pool.go
and is called from the rolling loop in
operator/internal/controller/redpanda/redpanda_controller.go.

Why
During a rolling restart the operator deletes one pod at a time and waits for
the cluster to report healthy before proceeding. In practice the Redpanda
health API lags behind pod state: after deleting pod A, the new pod A can
appear in the cache with the correct revision before Redpanda detects broker
A's departure, so isHealthy remains true and the operator immediately
deletes pod B, causing two pods to be unavailable simultaneously.

HasRecentlyReplacedPods in pool.go closes this race. Before entering the
rolling loop (and only when there are pods to roll), it checks whether any
pod that already carries the latest StatefulSet revision is not yet
Running+Ready. If so, the reconciler defers with a requeue rather than
proceeding to the next deletion. This is narrower than "all pods ready" —
pods with the old revision that happen to be unhealthy for unrelated reasons
do not block the roll.
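
The decision described above can be sketched in a few lines of Go. This is a minimal illustration, not the code in pool.go: the Pod struct, its fields, and the helper names (hasRecentlyReplacedPods, podsToRoll, shouldDeferRoll) are hypothetical stand-ins, and the real operator works with Kubernetes pod objects. The "still terminating" check follows the commit message for this change.

```go
package main

import "fmt"

// Pod is a hypothetical, simplified stand-in for a Kubernetes pod. The
// fields are assumptions made for illustration only.
type Pod struct {
	Name        string
	Revision    string // StatefulSet revision the pod was created from
	Running     bool   // pod phase is Running
	Ready       bool   // pod Ready condition is true
	Terminating bool   // pod has a deletion timestamp set
}

// hasRecentlyReplacedPods reports whether any pod already on the latest
// revision is not yet Running+Ready, or any pod is still terminating.
// Old-revision pods that are merely unhealthy do not count.
func hasRecentlyReplacedPods(pods []Pod, latest string) bool {
	for _, p := range pods {
		if p.Terminating {
			return true
		}
		if p.Revision == latest && !(p.Running && p.Ready) {
			return true
		}
	}
	return false
}

// podsToRoll returns the pods that still carry an old revision.
func podsToRoll(pods []Pod, latest string) []Pod {
	var out []Pod
	for _, p := range pods {
		if p.Revision != latest {
			out = append(out, p)
		}
	}
	return out
}

// shouldDeferRoll consults the guard only when there is something left to
// roll, matching the ordering described in the text.
func shouldDeferRoll(pods []Pod, latest string) bool {
	return len(podsToRoll(pods, latest)) > 0 && hasRecentlyReplacedPods(pods, latest)
}

func main() {
	// Pod a was just replaced and is still starting; pod b is next in line.
	racing := []Pod{
		{Name: "a", Revision: "rev-2", Running: true, Ready: false},
		{Name: "b", Revision: "rev-1", Running: true, Ready: true},
	}
	fmt.Println(shouldDeferRoll(racing, "rev-2")) // true: requeue, do not delete b yet

	// An unhealthy old-revision pod does not block the roll.
	unrelated := []Pod{
		{Name: "a", Revision: "rev-2", Running: true, Ready: true},
		{Name: "b", Revision: "rev-1", Running: true, Ready: false},
	}
	fmt.Println(shouldDeferRoll(unrelated, "rev-2")) // false: b can be rolled
}
```

The second case is what makes the guard narrower than an "all pods ready" check: only pods that have already been replaced, or are on their way out, can hold up the next deletion.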
Why a fresh commit instead of a cherry-pick
#1446 landed as a large batch (acceptance test segmentation + multicluster
fixes + this guard). Cherry-picking it whole would drag in a much larger diff
and types (MulticlusterStatefulSet, multicluster client interfaces) that
don't apply cleanly to this release branch. This PR carries just the guard
chunk verbatim from main.

Test plan
go vet ./operator/internal/lifecycle/... ./operator/internal/controller/redpanda/... (clean)
observe at most one pod Terminating + one pod NotReady at any moment
(vs. two-down windows pre-guard)
Follow-ups
the cluster-IsHealthy heuristic with per-broker risks. Tracked separately:
rpadmin client bindings in common-go#170,
operator-side wiring after that lands.
— separate PR.
🤖 Generated with Claude Code