You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
During PR #641 Round-2 blind review (R2-S-3 finding), security-engineer-r2 flagged that shared/claude_md_manager.py::file_lock has a 5-second timeout with fail-open behavior. An attacker (same-user, since this is local-trust scope) holding the sidecar .{filename}.lock file for >5 seconds causes legitimate writers to silently degrade to unlocked writes.
# pact-plugin/hooks/shared/claude_md_manager.py (file_lock)# Acquire fcntl.flock on .{filename}.lock with 5s timeout# On timeout or any disk error → fail-open (proceed without lock)
Fail-open is correct under current threat models (same-user-trust, where the attacker has equivalent capability via direct os tooling anyway), but the timeout is short and there's no observability for "how often does fail-open fire?"
Symptoms attacker could induce
Concurrent CLAUDE.md writes that race past lock ordering → torn writes (PACT_MANAGED block partial overwrite by routing-block update, or vice versa).
Race window during which strip_orphan_routing_markers runs without holding the lock → potential to lose a writer's append.
The likelihood is low (same-user is the only adversary) but the surface is enumerable.
Proposal — defense-in-depth
Option A — TTL on sidecar lock
# Before fcntl.flock acquisition, check the sidecar lock's mtime.# If mtime > now - LOCK_TTL_SECONDS (e.g., 30s), the previous holder is presumed dead;# unlink the sidecar and reacquire.LOCK_TTL_SECONDS=30deffile_lock(target: Path):
sidecar=target.parent/f".{target.name}.lock"ifsidecar.exists():
age=time.time() -sidecar.stat().st_mtimeifage>LOCK_TTL_SECONDS:
sidecar.unlink(missing_ok=True) # presumed dead holder# ... existing acquisition with 5s timeout ...
This bounds the worst-case wait at LOCK_TTL_SECONDS (30s) for legitimate writers, instead of leaving them indefinitely waiting on a hung process or fail-open silently after 5s.
Option B — fail-open observability
# On fail-open, emit a journal event so audits can enumerate fail-open frequencydeffile_lock(target: Path):
try:
# ... acquisition ...yieldexcept (fcntl.error, TimeoutError) ase:
_journal_lock_fail_open(str(target), e.errno, e.args[0])
yield# proceed without lock
session_journal.py accepts a new lock_fail_open event type. Doesn't change behavior; gives security review observability.
Recommendation
Implement both: Option B first (observability is cheap), Option A as the security hardening once we have data on actual fail-open frequency.
Acceptance criteria
lock_fail_open journal event emitted on every file_lock fail-open path.
LOCK_TTL_SECONDS sidecar-lock TTL implemented; lock files older than 30s are presumed dead and unlinked.
Background
During PR #641 Round-2 blind review (R2-S-3 finding), security-engineer-r2 flagged that
shared/claude_md_manager.py::file_lockhas a 5-second timeout with fail-open behavior. An attacker (same-user, since this is local-trust scope) holding the sidecar.{filename}.lockfile for >5 seconds causes legitimate writers to silently degrade to unlocked writes.Fail-open is correct under current threat models (same-user-trust, where the attacker has equivalent capability via direct os tooling anyway), but the timeout is short and there's no observability for "how often does fail-open fire?"
Symptoms attacker could induce
The likelihood is low (same-user is the only adversary) but the surface is enumerable.
Proposal — defense-in-depth
Option A — TTL on sidecar lock
This bounds the worst-case wait at
LOCK_TTL_SECONDS(30s) for legitimate writers, instead of leaving them indefinitely waiting on a hung process or fail-open silently after 5s.Option B — fail-open observability
session_journal.pyaccepts a newlock_fail_openevent type. Doesn't change behavior; gives security review observability.Recommendation
Implement both: Option B first (observability is cheap), Option A as the security hardening once we have data on actual fail-open frequency.
Acceptance criteria
lock_fail_openjournal event emitted on every file_lock fail-open path.Cross-references
shared/claude_md_manager.py::file_lock(5s timeout, fail-open)file_lockis the canonical helper for shared-file read-mutate-write