Add __str__/__repr__ to all public classes and fix stale lock cleanup#62
rfrenchseti wants to merge 1 commit into `main`
Conversation
Fix #21: Add `__str__` and `__repr__` methods to `FileCache`, `FileCacheSource` (and all subclasses), and enhance `FCPath`'s `__repr__` to include non-default parameters (`anonymous`, `lock_timeout`, `nthreads`). Also add a `cache_name` property to `FileCache`.

Fix #56: Add a `clean_up_stale_locks()` method to `FileCache` that walks the cache directory and removes orphaned lock files left by crashed processes. Also improve the `_retrieve_multi_locked` wait loop to automatically detect and clean up stale locks by attempting to acquire them, rather than waiting until timeout.

Made-with: Cursor
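As a rough, dependency-free illustration of the cleanup idea described above (the PR itself probes locks with zero-timeout acquisition via `filelock`; this sketch instead uses an age-based heuristic, and the `_LOCK_PREFIX` value and `max_age` threshold are assumptions, not the library's actual implementation):

```python
import time
from pathlib import Path

_LOCK_PREFIX = '.__lock__'  # hypothetical prefix; the real one lives in FileCache


def clean_up_stale_locks(cache_dir, max_age=3600):
    """Remove lock files older than max_age seconds (age-based heuristic).

    Walks the cache directory recursively, treats any lock file whose mtime
    is older than max_age as orphaned, and deletes it. Returns the count of
    removed lock files, mirroring the PR's reported return value.
    """
    removed = 0
    now = time.time()
    for lock_path in Path(cache_dir).rglob(f'{_LOCK_PREFIX}*'):
        if now - lock_path.stat().st_mtime > max_age:
            lock_path.unlink(missing_ok=True)
            removed += 1
    return removed
```

An age heuristic can misfire on very slow downloads, which is why the PR prefers the acquire-based probe; the sketch only shows the walk-and-remove shape.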
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (warning)
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@           Coverage Diff            @@
##             main      #62      +/-  ##
=========================================
- Coverage   89.55%   89.38%   -0.18%
=========================================
  Files           5        5
  Lines        2413     2486      +73
  Branches      507      523      +16
=========================================
+ Hits         2161     2222      +61
- Misses        149      157       +8
- Partials      103      107       +4
```
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
filecache/file_cache.py (1)
1723-1746: ⚠️ Potential issue | 🟠 Major

Keep `exception_on_fail=True` behavior consistent when the other process drops the lock without a file.

The new stale-lock branch records `url` in `files_not_exist`, but the existing `if not lock_path.is_file()` branch still doesn't. That path can therefore return a `FileNotFoundError` object inside the result list instead of raising.

🩹 Proposed fix

```diff
 if not lock_path.is_file():
     func_ret[idx] = FileNotFoundError(
         f'Another process failed to download {url}')
     self._log_debug(f'  Download elsewhere failed: {url}')
+    files_not_exist.append(url)
     continue
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@filecache/file_cache.py` around lines 1723 - 1746, The branch that handles a missing lock file (the "if not lock_path.is_file()" block) sets func_ret[idx] to FileNotFoundError and logs the event but does not add the URL to files_not_exist, causing inconsistent behavior vs the stale-lock branch; update that branch to also append url to files_not_exist (same as the stale_lock branch) so both paths record the URL for the later exception_on_fail handling and preserve the existing func_ret and logging logic for lock_path, func_ret, files_not_exist, and the logged message.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@filecache/file_cache.py`:
- Around line 489-500: The cleanup code acquires a FileLock then unlinks the
lock file before releasing it (lock_path.unlink then lock.release), which can
fail on Windows and split locks on POSIX; change the order so the lock is
released prior to deleting the lock file: after successfully acquiring the lock
(lock.acquire()) call lock.release() first, then unlink the path
(lock_path.unlink(...)), and keep the removed increment and logging
(_log_info/_log_debug) intact; ensure this adjustment is applied in the
stale-lock cleanup block that uses filelock.FileLock, lock.acquire,
lock.release, and lock_path.unlink.
- Around line 451-465: The __repr__ implementation omits boolean
lifecycle/locking flags when they are False, making non-default states
indistinguishable; change FileCache.__repr__ to always include the boolean
fields (self._delete_on_exit, self._time_sensitive, self._is_mp_safe,
self._anonymous) in the parts list by appending their actual values (e.g.,
f'delete_on_exit={self._delete_on_exit!r}') instead of only appending when True,
while keeping the existing conditional logic for lock_timeout and nthreads if
desired.
In `@README.md`:
- Around line 207-211: The example uses FileCache.clean_up_stale_locks() but
omits the import and the mp_safe=True context; update the snippet to import
FileCache and construct it with mp_safe=True (i.e., use
FileCache('shared-cache', mp_safe=True)) before calling clean_up_stale_locks()
so the example is self-contained and matches the multiprocessing-safe
discussion.
In `@tests/test_file_cache.py`:
- Around line 1073-1116: Add a regression test that exercises the wait-loop
stale-lock recovery path in FileCache by simulating a waiter in
_retrieve_multi_locked: create a lock file at the _LOCK_PREFIX path for a target
returned by get_local_path, spawn or simulate a concurrent waiter that calls the
code path that would block in _retrieve_multi_locked (e.g., by invoking the
method that uses the wait loop or by mimicking the waiter behavior), then make
the lock file stale (modify its mtime or write invalid content) while the waiter
is waiting and assert the waiter exits/fails immediately (no sleep until
timeout) and that clean_up_stale_locks() removes the stale lock; reference
FileCache, _retrieve_multi_locked, clean_up_stale_locks, get_local_path and
_LOCK_PREFIX to locate the relevant code paths.
---
Outside diff comments:
In `@filecache/file_cache.py`:
- Around line 1723-1746: The branch that handles a missing lock file (the "if
not lock_path.is_file()" block) sets func_ret[idx] to FileNotFoundError and logs
the event but does not add the URL to files_not_exist, causing inconsistent
behavior vs the stale-lock branch; update that branch to also append url to
files_not_exist (same as the stale_lock branch) so both paths record the URL for
the later exception_on_fail handling and preserve the existing func_ret and
logging logic for lock_path, func_ret, files_not_exist, and the logged message.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1aa274b7-d372-4df7-acb5-57e66715a206
📒 Files selected for processing (7)
README.md, filecache/file_cache.py, filecache/file_cache_path.py, filecache/file_cache_source.py, tests/test_file_cache.py, tests/test_file_cache_path.py, tests/test_file_cache_source.py
```python
def __repr__(self) -> str:
    parts = [repr(self._cache_name)]
    if self._delete_on_exit:
        parts.append('delete_on_exit=True')
    if self._time_sensitive:
        parts.append('time_sensitive=True')
    if self._is_mp_safe:
        parts.append('mp_safe=True')
    if self._anonymous:
        parts.append('anonymous=True')
    if self._lock_timeout != 60:
        parts.append(f'lock_timeout={self._lock_timeout!r}')
    if self._nthreads != 8:
        parts.append(f'nthreads={self._nthreads!r}')
    return f'FileCache({", ".join(parts)})'
```
`__repr__` drops the non-default lifecycle and locking states.

`delete_on_exit` and `mp_safe` are only rendered when they are `True`. That makes `FileCache(None, delete_on_exit=False)` and `FileCache('shared', mp_safe=False)` look like default instances even though their behavior is the opposite.
🛠️ Proposed fix

```diff
 def __repr__(self) -> str:
     parts = [repr(self._cache_name)]
-    if self._delete_on_exit:
-        parts.append('delete_on_exit=True')
+    default_delete_on_exit = (self._cache_name is None)
+    if self._delete_on_exit != default_delete_on_exit:
+        parts.append(f'delete_on_exit={self._delete_on_exit!r}')
     if self._time_sensitive:
         parts.append('time_sensitive=True')
-    if self._is_mp_safe:
-        parts.append('mp_safe=True')
+    default_mp_safe = (self._cache_name is not None)
+    if self._is_mp_safe != default_mp_safe:
+        parts.append(f'mp_safe={self._is_mp_safe!r}')
     if self._anonymous:
         parts.append('anonymous=True')
     if self._lock_timeout != 60:
         parts.append(f'lock_timeout={self._lock_timeout!r}')
     if self._nthreads != 8:
         parts.append(f'nthreads={self._nthreads!r}')
     return f'FileCache({", ".join(parts)})'
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
def __repr__(self) -> str:
    parts = [repr(self._cache_name)]
    default_delete_on_exit = (self._cache_name is None)
    if self._delete_on_exit != default_delete_on_exit:
        parts.append(f'delete_on_exit={self._delete_on_exit!r}')
    if self._time_sensitive:
        parts.append('time_sensitive=True')
    default_mp_safe = (self._cache_name is not None)
    if self._is_mp_safe != default_mp_safe:
        parts.append(f'mp_safe={self._is_mp_safe!r}')
    if self._anonymous:
        parts.append('anonymous=True')
    if self._lock_timeout != 60:
        parts.append(f'lock_timeout={self._lock_timeout!r}')
    if self._nthreads != 8:
        parts.append(f'nthreads={self._nthreads!r}')
    return f'FileCache({", ".join(parts)})'
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@filecache/file_cache.py` around lines 451 - 465, The __repr__ implementation
omits boolean lifecycle/locking flags when they are False, making non-default
states indistinguishable; change FileCache.__repr__ to always include the
boolean fields (self._delete_on_exit, self._time_sensitive, self._is_mp_safe,
self._anonymous) in the parts list by appending their actual values (e.g.,
f'delete_on_exit={self._delete_on_exit!r}') instead of only appending when True,
while keeping the existing conditional logic for lock_timeout and nthreads if
desired.
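The pattern in the proposed fix — compute each flag's default from the other constructor arguments and print the flag only when it differs — can be distilled into a standalone sketch. The `Cache` class and its default rules below are illustrative stand-ins, not the real `FileCache`:

```python
class Cache:
    def __init__(self, name=None, delete_on_exit=None, mp_safe=None):
        self._name = name
        # Illustrative defaults: anonymous (unnamed) caches are deleted on
        # exit; named (shared) caches are multiprocessing-safe.
        self._delete_on_exit = (name is None) if delete_on_exit is None else delete_on_exit
        self._mp_safe = (name is not None) if mp_safe is None else mp_safe

    def __repr__(self):
        parts = [repr(self._name)]
        # Emit a flag only when it differs from its computed default, so the
        # repr stays short yet round-trips every non-default configuration.
        if self._delete_on_exit != (self._name is None):
            parts.append(f'delete_on_exit={self._delete_on_exit!r}')
        if self._mp_safe != (self._name is not None):
            parts.append(f'mp_safe={self._mp_safe!r}')
        return f'Cache({", ".join(parts)})'
```

With this shape, `Cache('shared', mp_safe=False)` renders its override while plain `Cache('shared')` stays compact — exactly the distinction the review says the original `__repr__` loses.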
```python
lock = filelock.FileLock(lock_path, timeout=0)
try:
    lock.acquire()
except filelock._error.Timeout:
    self._log_debug(f'Lock file is active, skipping: {lock_path}')
    continue
try:
    lock_path.unlink(missing_ok=True)
    removed += 1
    self._log_info(f'Removed stale lock file: {lock_path}')
finally:
    lock.release()
```
Release the stale lock before deleting its lock file.
This new cleanup path reverses the release/unlink order already used elsewhere in FileCache. That sequence is there for a reason: unlinking an acquired lock can fail on Windows and can split the lock on POSIX.
🔒 Proposed fix

```diff
-try:
-    lock_path.unlink(missing_ok=True)
-    removed += 1
-    self._log_info(f'Removed stale lock file: {lock_path}')
-finally:
-    lock.release()
+lock.release()
+lock_path.unlink(missing_ok=True)
+removed += 1
+self._log_info(f'Removed stale lock file: {lock_path}')
```

📝 Committable suggestion

‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
lock = filelock.FileLock(lock_path, timeout=0)
try:
    lock.acquire()
except filelock._error.Timeout:
    self._log_debug(f'Lock file is active, skipping: {lock_path}')
    continue
lock.release()
lock_path.unlink(missing_ok=True)
removed += 1
self._log_info(f'Removed stale lock file: {lock_path}')
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@filecache/file_cache.py` around lines 489 - 500, The cleanup code acquires a
FileLock then unlinks the lock file before releasing it (lock_path.unlink then
lock.release), which can fail on Windows and split locks on POSIX; change the
order so the lock is released prior to deleting the lock file: after
successfully acquiring the lock (lock.acquire()) call lock.release() first, then
unlink the path (lock_path.unlink(...)), and keep the removed increment and
logging (_log_info/_log_debug) intact; ensure this adjustment is applied in the
stale-lock cleanup block that uses filelock.FileLock, lock.acquire,
lock.release, and lock_path.unlink.
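The zero-timeout acquisition probe this review discusses can be illustrated with POSIX advisory locks. The sketch below uses `fcntl.flock` directly (POSIX-only; on Unix, `filelock` wraps the same primitive) and deliberately releases the lock before any unlink would happen, matching the ordering argued for in the comment. The function name is hypothetical:

```python
import fcntl
import os


def is_lock_stale(lock_path: str) -> bool:
    """Return True if no live process holds the advisory lock on lock_path.

    Attempts a non-blocking exclusive lock as a probe: success means the
    original holder is gone (the lock is stale), so we release immediately.
    Failure (EWOULDBLOCK) means a live holder still owns it.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return False  # a live holder still owns the lock
        fcntl.flock(fd, fcntl.LOCK_UN)  # release before any unlink
        return True
    finally:
        os.close(fd)
```

Because `flock` locks belong to the open file description, a second open of the same file — even in the same process — sees the holder's lock, which is what makes the probe meaningful.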
```python
fc = FileCache('shared-cache')
removed = fc.clean_up_stale_locks()
print(f'Cleaned up {removed} stale lock(s)')
```
Make the stale-lock example self-contained and explicit about mp_safe.
The snippet currently omits the import and does not reflect the mp_safe=True context described above it.
📝 Proposed doc fix

```diff
-fc = FileCache('shared-cache')
+from filecache import FileCache
+
+fc = FileCache('shared-cache', mp_safe=True)
 removed = fc.clean_up_stale_locks()
 print(f'Cleaned up {removed} stale lock(s)')
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @README.md around lines 207 - 211, The example uses
FileCache.clean_up_stale_locks() but omits the import and the mp_safe=True
context; update the snippet to import FileCache and construct it with
mp_safe=True (i.e., use FileCache('shared-cache', mp_safe=True)) before calling
clean_up_stale_locks() so the example is self-contained and matches the
multiprocessing-safe discussion.
```python
def test_clean_up_stale_locks():
    with FileCache(cache_name=None, mp_safe=True) as fc:
        local_path = fc.get_local_path('gs://test/test.txt')
        lock_path = local_path.parent / f'{fc._LOCK_PREFIX}{local_path.name}'
        lock_path.parent.mkdir(parents=True, exist_ok=True)
        lock_path.write_text('stale')
        assert lock_path.is_file()
        removed = fc.clean_up_stale_locks()
        assert removed == 1
        assert not lock_path.is_file()


def test_clean_up_stale_locks_active():
    with FileCache(cache_name=None, mp_safe=True) as fc:
        local_path = fc.get_local_path('gs://test/test.txt')
        lock_path = local_path.parent / f'{fc._LOCK_PREFIX}{local_path.name}'
        lock_path.parent.mkdir(parents=True, exist_ok=True)
        lock = filelock.FileLock(lock_path, timeout=0)
        lock.acquire()
        try:
            removed = fc.clean_up_stale_locks()
            assert removed == 0
            assert lock_path.is_file()
        finally:
            lock.release()
            lock_path.unlink(missing_ok=True)


def test_clean_up_stale_locks_empty():
    with FileCache(cache_name=None) as fc:
        removed = fc.clean_up_stale_locks()
        assert removed == 0


def test_clean_up_stale_locks_multiple():
    with FileCache(cache_name=None, mp_safe=True) as fc:
        for i in range(3):
            local_path = fc.get_local_path(f'gs://test/test{i}.txt')
            lock_path = local_path.parent / f'{fc._LOCK_PREFIX}{local_path.name}'
            lock_path.parent.mkdir(parents=True, exist_ok=True)
            lock_path.write_text('stale')
        removed = fc.clean_up_stale_locks()
        assert removed == 3
```
🧹 Nitpick | 🔵 Trivial
Add a regression test for the wait-loop stale-lock recovery path.
These tests only call clean_up_stale_locks() directly. The other half of the change lives in _retrieve_multi_locked(), where a waiter should detect a lock that becomes stale and fail immediately instead of sleeping until timeout; that behavior is still uncovered.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_file_cache.py` around lines 1073 - 1116, Add a regression test
that exercises the wait-loop stale-lock recovery path in FileCache by simulating
a waiter in _retrieve_multi_locked: create a lock file at the _LOCK_PREFIX path
for a target returned by get_local_path, spawn or simulate a concurrent waiter
that calls the code path that would block in _retrieve_multi_locked (e.g., by
invoking the method that uses the wait loop or by mimicking the waiter
behavior), then make the lock file stale (modify its mtime or write invalid
content) while the waiter is waiting and assert the waiter exits/fails
immediately (no sleep until timeout) and that clean_up_stale_locks() removes the
stale lock; reference FileCache, _retrieve_multi_locked, clean_up_stale_locks,
get_local_path and _LOCK_PREFIX to locate the relevant code paths.
Done on a different branch
Summary

Fix #21 (Add __str__ and __repr__ to all classes): Add `__str__` and `__repr__` methods to all public classes that were missing them:

- `FileCache`: `__str__` returns the cache directory path; `__repr__` includes the cache name and non-default configuration (`delete_on_exit`, `time_sensitive`, `mp_safe`, `anonymous`, `lock_timeout`, `nthreads`). Also adds a `cache_name` property.
- `FileCacheSource` (base class): `__str__` returns the source prefix (e.g., `gs://bucket`); `__repr__` includes the class name, scheme, remote, and anonymous flag. All subclasses (`FileCacheSourceFile`, `FileCacheSourceHTTP`, `FileCacheSourceGS`, `FileCacheSourceS3`) inherit these.
- `FileCacheSourceFake`: Overrides `__repr__` to also include `storage_dir` when it differs from the default.
- `FCPath`: Enhanced `__repr__` to include non-default `anonymous`, `lock_timeout`, and `nthreads` parameters alongside the path.

Fix #56 (Programs can leave behind locks that never go away): Address stale lock files left behind by crashed processes:

- A `FileCache.clean_up_stale_locks()` method that walks the cache directory, identifies lock files not actively held by any process (by attempting to acquire them with zero timeout), and removes stale ones. Returns the count of removed lock files.
- An improved `_retrieve_multi_locked` wait loop that automatically detects stale locks from crashed processes. Instead of waiting until timeout when a lock file exists but no process holds it, the code now attempts to acquire the lock; if successful, it cleans up the stale lock and reports the download failure immediately.

Test plan

- `clean_up_stale_locks()`: stale lock removal, active lock preservation, empty cache, multiple stale locks
- `FileCache.__str__` and `__repr__` with various configurations
- `FileCache.cache_name` property
- `FileCacheSource*.__str__` and `__repr__` for all subclasses
- `FileCacheSourceFake.__repr__` with custom storage directory
- `FCPath.__repr__` tests for enhanced detail

Made with Cursor