Skip to content

repository: fix Repository.delete edge cases (#9815)#9820

Open
mr-raj12 wants to merge 1 commit into
borgbackup:masterfrom
mr-raj12:fix-repository-delete-edge-cases
Open

repository: fix Repository.delete edge cases (#9815)#9820
mr-raj12 wants to merge 1 commit into
borgbackup:masterfrom
mr-raj12:fix-repository-delete-edge-cases

Conversation

@mr-raj12

Copy link
Copy Markdown
Contributor

Closes #9815.

Follow-up to #9813, which made Repository.delete a real single-object delete. That surfaced three issues:

1. compact_pack contiguity assert. delete() works off the chunk index, which can still list a stale earlier copy of a re-put object whose bytes remain in the pack. The retained offset == covered assert then failed because the target no longer starts at offset 0. compact_pack now takes a complete flag: compact keeps the strict check, delete passes complete=False and copies the known survivors at their real offsets, tolerating the gap.

2. Quadratic index writes in check --repair. delete() rewrote the full chunk index on every call, and verify_data's repair loop calls it once per defect chunk. delete() now takes a persist flag; check --repair passes persist=False and the index is rebuilt once in finish(), so there is no per-chunk full index write.

3. Misleading repair output. verify_data logged that defect chunks "can not be removed yet" while actually removing them, and it dropped chunks from the repository index but not from the checker's own index, so the affected files were not reported in the same run. The messages are corrected and the chunk is dropped from the checker index too.

Adds a regression test for the stale-earlier-object case.

@codecov

codecov Bot commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.15%. Comparing base (106dfba) to head (4fe7e0e).
⚠️ Report is 1 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9820      +/-   ##
==========================================
+ Coverage   84.94%   85.15%   +0.20%     
==========================================
  Files          92       92              
  Lines       15291    15294       +3     
  Branches     2296     2298       +2     
==========================================
+ Hits        12989    13023      +34     
+ Misses       1611     1580      -31     
  Partials      691      691              

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@mr-raj12 mr-raj12 marked this pull request as ready for review June 27, 2026 05:12
Tolerate a stale earlier object in the pack via a compact_pack complete flag, skip the per-delete index write in check --repair via a delete persist flag, and make verify_data report the files whose chunks it removes. Add a regression test for the stale-earlier-object case.
@mr-raj12 mr-raj12 force-pushed the fix-repository-delete-edge-cases branch from ada4e8b to 4fe7e0e Compare June 27, 2026 05:25
Comment on lines +392 to 397
data = corrupt(repository.get(chunk.id), 123)
# re-putting the same id would not repoint the index: close() persists new index
# entries incrementally, not updates to existing ones, so the next process would still
# read the old, good copy. delete first, then put the corrupted bytes as a fresh entry.
repository.delete(chunk.id)
repository.put(chunk.id, data)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, that should work better.

even better than that would be to simulate reality of bit corruption more closely: use some low-level code that directly corrupts the file on disk without going through delete/put and without modifying the index.

Comment thread src/borg/repository.py
Comment on lines +804 to +805
A pack holds several objects, so we rewrite it via compact_pack(complete=False) rather than
dropping the whole pack. With persist=True the full chunk index is written back so the next

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please read these AI-made comments and make sure they don't talk about the bad hacks of the past?

Dropping a whole pack instead of a defect chunk is nothing we need to discuss or even mention.

Comment thread src/borg/repository.py
return self._pack_writer.add(id, data)

def delete(self, id):
def delete(self, id, *, persist=True):

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

persist is a bad name, maybe update_index or so?

Comment thread src/borg/repository.py
Comment on lines +829 to +834
keep_ids and drop_ids come from the caller iterating the chunk index for this pack_id; they
must not overlap. With complete=True (compact) they are the pack's whole object set, so their
ranges must tile it from offset 0 with no gap or overlap. With complete=False (delete) the index
view may be partial (a re-put can repoint an earlier object to another pack while its bytes stay
here), so we copy only the known survivors at their real offsets and allow gaps. Kept objects are
copied into a new pack via store.defrag and repointed; dropped objects' index entries are removed.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rewrite this to be more clear.

also remove crap like "come from the caller iterating chunk index for this pack id". a description of parameters needs to tell what they are and what we do here with them, not where a call comes from or what the caller is doing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

borg2: PR #9813 aftermath

2 participants