fix: clean up staging ODB after dvc add --out transfer #11008
BALOGUN-DAVID wants to merge 2 commits into treeverse:main
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@            Coverage Diff             @@
##             main   #11008      +/-   ##
==========================================
+ Coverage   90.68%   90.98%   +0.30%
==========================================
  Files         504      505       +1
  Lines       39795    41150    +1355
  Branches     3141     3263     +122
==========================================
+ Hits        36087    37440    +1353
- Misses       3042     3071      +29
+ Partials      666      639      -27
Hi, this PR does not seem to work. You can verify this with the following test:

    def test_add_out_cleans_up_staging(tmp_dir, dvc):
        tmp_dir.gen({"src_dir": {"a.txt": "aaa", "b.txt": "bbb"}})
        odb = dvc.cache.local
        assert list(odb.fs.find(odb.path)) == []
        dvc.add("src_dir", out="dst_dir")
        assert (tmp_dir / "dst_dir").read_text() == {"a.txt": "aaa", "b.txt": "bbb"}
        assert not [p for p in odb.fs.find(odb.path) if p.endswith(".tmp")]

The test that you have added only checks whether the method that you added gets called.
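To illustrate the point about call-only tests, here is a minimal, self-contained sketch. It does not use DVC at all: `FakeStaging`, `transfer`, and the file names are hypothetical stand-ins. It only shows how a spy-style assertion can pass while leftover `.tmp` files remain on disk, which is the situation a state-based check like the test above is meant to catch.

```python
import tempfile
from pathlib import Path
from unittest import mock


class FakeStaging:
    """Hypothetical stand-in for a staging store; not DVC's ObjectDB."""

    def clear(self):
        # Clears nothing on disk: it has no knowledge of files in the cache dir.
        pass


def transfer(staging, cache_dir: Path) -> None:
    """Hypothetical transfer that leaves a temp file behind in the cache."""
    (cache_dir / "upload.tmp").write_text("partial")  # leftover artifact
    (cache_dir / "object").write_text("aaa")          # the real object
    staging.clear()  # the call a spy-based test would observe


def run_demo():
    with tempfile.TemporaryDirectory() as d:
        cache_dir = Path(d)
        staging = FakeStaging()
        # Spy on clear() while still delegating to the real method.
        with mock.patch.object(staging, "clear", wraps=staging.clear) as spy:
            transfer(staging, cache_dir)
        spy_called = spy.call_count == 1
        # State-based check: which temp files are actually left behind?
        leftovers = sorted(p.name for p in cache_dir.rglob("*.tmp"))
    return spy_called, leftovers


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the spy assertion succeeds even though a `.tmp` file is still present, so only the check on the resulting files distinguishes a working cleanup from a non-working one.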
Removes leftover .tmp files from the destination cache directly after transfer to prevent disk bloat. Includes target test case for dvc add --out. Fixes treeverse#11008.
    for path in odb.fs.find(odb.path):
        if path.endswith(".tmp"):
            odb.fs.remove(path)
The external .tmp cleanup here feels brittle, and is much broader than necessary.
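One possible reading of this concern, offered as an assumption since the comment does not elaborate: a suffix-based sweep removes every `.tmp` file under the cache, including ones that another in-progress operation may still need. The sketch below is not DVC code; `broad_sweep`, `scoped_cleanup`, and the file names are hypothetical and only contrast the two scopes on a plain temporary directory.

```python
import tempfile
from pathlib import Path


def broad_sweep(cache_dir: Path) -> list[str]:
    """Remove every *.tmp file under cache_dir, regardless of who created it."""
    removed = []
    for path in sorted(cache_dir.rglob("*.tmp")):
        path.unlink()
        removed.append(path.name)
    return removed


def scoped_cleanup(created: list[Path]) -> list[str]:
    """Remove only the temp files this operation recorded creating."""
    removed = []
    for path in created:
        if path.exists():
            path.unlink()
            removed.append(path.name)
    return removed


def make_cache(root: Path) -> tuple[Path, Path]:
    """Populate a fake cache with our temp file and an unrelated one."""
    ours = root / "ours.tmp"
    theirs = root / "other_operation.tmp"
    ours.write_text("x")
    theirs.write_text("y")
    return ours, theirs


def demo():
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        make_cache(Path(a))
        broad = broad_sweep(Path(a))      # deletes both temp files

        ours, _ = make_cache(Path(b))
        scoped = scoped_cleanup([ours])   # deletes only the tracked file
        survivors = sorted(p.name for p in Path(b).iterdir())
    return broad, scoped, survivors


if __name__ == "__main__":
    print(demo())
```

Whether tracking created files is practical inside DVC's transfer path is a separate question that this sketch does not answer.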
Thanks for taking the time to open this PR. I am going to close it for now, as I do not think it fully addresses the issue and it seems to depend on implementation details. This bug is also not a high priority for me at the moment.

Fix Staging ODB Cleanup for dvc add --out

📝 Description
When running dvc add --out, DVC copies source files into the cache via a staging mechanism but fails to clean up the temporary staging ODB (stored in memfs) and temporary upload files after the transfer completes. This causes lingering temporary files and memfs bloat. This issue was confirmed as a bug by DVC maintainers.

Approach:
This PR adds a staging.clear() call inside the Output.transfer() method in dvc/output.py, directly after otransfer() successfully moves the objects from the staging area to the cache ODB.

Impact:
Prevents memory leaks and disk space accumulation by freeing temporary memfs staging files and caching artifacts once the dvc add --out command completes successfully.

✅ Test Results
- Adds test_add_with_out_cleans_up_staging in tests/func/test_add.py, which uses a mocker spy on ObjectDB.clear to ensure the cleanup function is triggered.
- Existing add --out tests pass (e.g., test_add_with_out, test_add_force_overwrite_out, test_add_to_cache_different_name).

🧩 Visual Evidence: output.py Changes
🧪 Visual Evidence: Test Changes
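As a rough illustration of the approach described above (stage objects, copy them into the cache, then clear the staging store), here is a minimal sketch. It is not the PR's diff and does not import DVC: `InMemoryODB`, `copy_objects`, and this `transfer` are hypothetical stand-ins for the staging ODB, `otransfer()`, and `Output.transfer()`, and SHA-256 is used only as a convenient content hash.

```python
import hashlib


class InMemoryODB:
    """Hypothetical object store mapping hash -> bytes; not DVC's ObjectDB."""

    def __init__(self):
        self.objects = {}

    def add(self, oid: str, data: bytes) -> None:
        self.objects[oid] = data

    def clear(self) -> None:
        self.objects.clear()


def copy_objects(src: InMemoryODB, dst: InMemoryODB, oids: set[str]) -> None:
    """Stand-in for the transfer step: copy the named objects from src to dst."""
    for oid in oids:
        dst.add(oid, src.objects[oid])


def transfer(files: dict[str, bytes], cache: InMemoryODB):
    """Stage files, copy them to the cache, then clear the staging store."""
    staging = InMemoryODB()
    oids = set()
    for data in files.values():
        oid = hashlib.sha256(data).hexdigest()
        staging.add(oid, data)
        oids.add(oid)

    copy_objects(staging, cache, oids)
    # Clear only after the copy succeeded, so a failed copy keeps its staged data.
    staging.clear()
    return staging, oids


if __name__ == "__main__":
    cache = InMemoryODB()
    staging, oids = transfer({"a.txt": b"aaa", "b.txt": b"bbb"}, cache)
    print(len(cache.objects), len(staging.objects))
```

Note that this only models the in-memory staging store. It says nothing about temporary files written into the destination cache, which is the gap the review comments in this conversation point to.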
📋 Checklist
[x] ❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. 🙏