fix: atomic subsystem namespace slot check via FDB transaction#1065
Open
boddumanohar wants to merge 1 commit into
Open
fix: atomic subsystem namespace slot check via FDB transaction#1065boddumanohar wants to merge 1 commit into
boddumanohar wants to merge 1 commit into
Conversation
Parallel clone/create requests sharing a subsystem (namespaced=True) all read the same namespace count from DB and can each decide the slot is free, resulting in more lvols than max_namespace_per_subsys being written to one NQN. Fix: replace the bare lvol.write_to_db() call with a single FDB transactional function (write_lvol_with_ns_check) that re-counts active namespaces for the target NQN inside the transaction and writes the new lvol record only when the subsystem still has room. Because the range-read (b'object/LVol/') and the write share one FDB transaction, concurrent writers that race on the same NQN trigger an OCC conflict on commit. FDB automatically retries the loser with fresh data, serialising the slot allocation without any explicit lock — parallel creates on *different* subsystems continue to run without any contention. Affected callers: - snapshot_controller.clone() (namespaced clones) - lvol_controller.add_lvol_ha() (namespaced lvol creates) Both paths now return a retryable error instead of silently over- allocating when the slot is taken after the OCC conflict is resolved. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
Parallel
clone()andadd_lvol_ha()requests withnamespaced=Trueshare a subsystem (NQN). All threads read the same namespace count from DB, each decides the slot is free, and each writes an lvol with the same NQN — resulting in more lvols thanmax_namespace_per_subsysin one subsystem.This is a classic TOCTOU race: the check (
get_next_available_subsystem_on_node) and the act (write_to_db) were not atomic.Fix
Replace the bare
lvol.write_to_db()with a newdb_controller.write_lvol_with_ns_check(lvol)that wraps both the count check and the write in a single FDB transaction:FDB's OCC means: if two transactions both read the
object/LVol/range and try to commit, the one that sees a stale read loses, gets retried with fresh data, and correctly sees the slot is now taken. No explicit lock, no serialisation of unrelated requests — parallel creates on different subsystems are completely unaffected.Callers changed
snapshot_controller.pyclone()— namespaced cloneslvol_controller.pyadd_lvol_ha()— namespaced lvol createsBoth return a retryable error (
"Subsystem namespace limit reached concurrently; retry") instead of silently over-allocating when the OCC check fails.Why not a mutex?
A per-node lock would serialise all clone requests on a node (even ones targeting different subsystems), adding ~15–25 ms of queue wait per request under parallel load. OCC only serialises the rare actual conflict.
Diff size
55 lines across 3 files.
🤖 Generated with Claude Code