Refactor garbage collection to prevent premature pruning and panic vu…#69
Open
MokshonWork wants to merge 1 commit into
Open
Refactor garbage collection to prevent premature pruning and panic vu…#69MokshonWork wants to merge 1 commit into
MokshonWork wants to merge 1 commit into
Conversation
…lnerabilities Signed-off-by: Moksh Goyal <221651574+MokshonWork@users.noreply.github.com>
88b0a2f to
948dda3
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Garbage Collection Concurrency & Stability Overhaul
What changes did I make?
lore/src/repository.rs):LoreRepositoryGcArgsstruct to introduce two new configuration fields:grace_period_secandprune_threshold.lore-storage/src/maintenance.rs):gcexecution loop to return an idiomaticResult<(), GcError>instead of blindly swallowing errors or crashing.store.evict()call withstore.evict_with_grace(), passing down the new grace period threshold.gc_lease::acquire_exclusive_sweep()) that locks before any eviction operations can occur.Why were these changes made?
These modifications directly address two systemic weaknesses in the repository's storage backend:
How is this useful?
This refactor brings enterprise-grade concurrency safety to the garbage collection pipeline.
By introducing a temporal grace period and active staging leases, we guarantee that no chunk is ever deleted while it is still "in-flight" during a workspace transition. Furthermore, migrating to an atomic
Resultsystem ensures that if a system crashes mid-sweep, the garbage collector rolls back cleanly on the next startup without causing permanent metadata corruption. This makes the tool robust enough for massive, highly concurrent CI environments.