Retry DRBD adjust after stale bitmap attach failure #491
Open
kvaps wants to merge 1 commit into LINBIT:master from
Conversation
Author comment: We've integrated this change into Cozystack as part of cozystack/cozystack#2331.
Summary
When `drbdadm adjust` fails during local attach with the error `already has a bitmap, this should not happen`, it means DRBD still has stale local bitmap state for the target minor. In that case LINSTOR currently aborts the adjust and leaves the resource diskless even though peers may still be healthy.
This change teaches the satellite to:
- parse the `minor` from stderr
- run `drbdsetup detach <minor>`, with a `--force` fallback
- retry `drbdadm adjust` once
Why
I hit this while recovering DRBD resources after a cluster incident. In practice this looked like an unintentional diskless resource in LINSTOR while `drbdadm status` still showed a healthy `Primary` with `peer-disk:UpToDate` on other nodes.
The detach + retry path was enough to resynchronize LINSTOR with the actual DRBD device state and allow the local disk to be reattached.
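For readers who want the control flow at a glance, the detach + retry path can be sketched roughly as follows. This is an illustrative Python sketch, not the satellite's actual code. The names `parse_stale_minor`, `adjust_with_retry` and `run_command` are hypothetical, and the regular expression assumes the minor appears in stderr as the first argument of an echoed `drbdsetup attach` command, which may not match the real output. The sketch also retries even when the forced detach fails; whether the patch does the same is not stated here.

```python
import re
import subprocess
from typing import Callable, Optional, Sequence, Tuple

# Error text quoted in the PR description.
STALE_BITMAP_MARKER = "already has a bitmap, this should not happen"

# ASSUMPTION: stderr echoes the failed command as "drbdsetup attach <minor> ...".
# The real stderr layout may differ; adapt the pattern accordingly.
MINOR_PATTERN = re.compile(r"drbdsetup\S*\s+attach\s+(\d+)")

Result = Tuple[int, str]  # (exit code, stderr)
Runner = Callable[[Sequence[str]], Result]


def run_command(argv: Sequence[str]) -> Result:
    """Run a command and return its exit code and stderr."""
    proc = subprocess.run(list(argv), capture_output=True, text=True)
    return proc.returncode, proc.stderr


def parse_stale_minor(stderr: str) -> Optional[int]:
    """Return the minor number if stderr reports the stale-bitmap failure."""
    if STALE_BITMAP_MARKER not in stderr:
        return None
    match = MINOR_PATTERN.search(stderr)
    return int(match.group(1)) if match else None


def adjust_with_retry(resource: str, run: Runner = run_command) -> Result:
    """Run `drbdadm adjust`; on a stale-bitmap failure, detach and retry once."""
    code, stderr = run(["drbdadm", "adjust", resource])
    if code == 0:
        return code, stderr

    minor = parse_stale_minor(stderr)
    if minor is None:
        # Unrelated failure: report it unchanged, no retry.
        return code, stderr

    # Plain detach first, then fall back to --force if that fails.
    detach_code, _ = run(["drbdsetup", "detach", str(minor)])
    if detach_code != 0:
        run(["drbdsetup", "detach", str(minor), "--force"])

    # Exactly one retry; its result is final.
    return run(["drbdadm", "adjust", resource])
```

Passing the command runner as a parameter keeps the sketch testable without a DRBD installation: a fake `run` can replay the failing stderr and record which commands were issued.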
Validation
1.33.1