TIN-1620: dev-env zero-diff fingerprint for the repo-roam ladder #507
Merged
Adds the one missing precision layer the large-workdir research identified for Gate G5 / TIN-1620 (one expendable live repo): a git-aware dev-env fingerprint that asserts a roamed repo is byte-and-semantically identical across hosts, with no mid-reconcile .git corruption. This is a tighter assertion layered ON TOP of the existing QA-matrix rows (T2/T3 exact bytes, T8/T9 + M3/M6 peer-edit rehydrate, T10/T11 + M5/M5-R conflict/keep-both, T12 symlink, T13 modes), not a new matrix and not a new sync engine.

scripts/repo-roam-fingerprint.sh: capture/compare/seed-canary/self-test. capture records git status --porcelain=v2 --branch, HEAD+branch+symbolic-ref, refs, staged vs unstaged diff hashes, index blob shas, untracked, stash, reflog tip, git fsck --full, and a sorted working-file sha256 manifest honoring the fail-closed reconcile deny-set (.env*/secret/live-WAL recorded DENIED, never hashed). compare exits nonzero on ANY difference and requires fsck clean on both sides. seed-canary builds a throwaway repo (feature branch + staged + unstaged + untracked + stash + exec script + symlink) and refuses $HOME / ~/git / fs-root.

The tool plugs into the existing canary/evidence scaffold (git-repo-canary.sh -> home-canary-linux-xr-shadow.sh for shadow/push, git-repo-restore-proof.sh for rollback, the neo-honey / TIN-1620 flip-flop harnesses for cross-host lifecycle) and emits one new dev-env-fingerprint/ subtree plus one dev-env-zero-diff gate line; it does not duplicate any of them.

docs/ops/repo-roam-test-plan-2026-06-08.md: the canary runbook, mapping each step (R0-R5) to the EXISTING T/M rows + the TIN-1620 G5 acceptance, marking LIVE-fleet steps as out of scope for this PR. Documents the Facet-4 enrollment fact (.git-as-files is config-scoped via a per-repo -c config with sync_git_dirs=true + git_sync_mode="raw"; ~/git enrollment works with NO Rust change; the global flip is forbidden for blast radius), the Facet-5 zero-diff caveats (mtime/index trap needing a git update-index --refresh mitigation; symlink drop in the reconcile collect path), and the Facet-6 raw-mode .git corruption gate (G5-git-5 expected-fail until conflict resolution is .git-aware). Flags the .git-as-files vs git-bundle tension and the unlisted stash assertion as operator-review items.

Taskfile: lazy:dev-env-fingerprint + lazy:test-dev-env-fingerprint, mirroring the existing inventory/canary task surface.

shellcheck-clean; self-test (disposable /tmp seed -> capture -> capture -> compare, plus drift-detection negative control) and the regression suite both pass. No live fleet touched.
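For illustration only, the "sorted working-file sha256 manifest honoring the fail-closed deny-set" could look roughly like the sketch below. This is not the code in scripts/repo-roam-fingerprint.sh: the function names `is_denied` and `manifest` are hypothetical, the deny patterns are guesses at what `.env*`/secret/live-WAL might mean, and it assumes GNU `sha256sum` and file names without embedded newlines.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a deny-set-aware working-file manifest.
# The patterns here are illustrative assumptions, not the script's real list.

is_denied() {
  # Decide on the basename only; anything that looks secret-bearing is denied.
  local base
  base=$(basename -- "$1")
  case "$base" in
    .env*|*secret*|*-wal|*.wal) return 0 ;;
    *) return 1 ;;
  esac
}

manifest() {
  # Usage: manifest <dir>
  # Prints "<sha256-or-DENIED>  <relative path>" per regular file, in byte order.
  # .git is pruned; symlinks are skipped because of -type f.
  local root=$1 rel
  ( cd -- "$root" && find . -path ./.git -prune -o -type f -print | LC_ALL=C sort ) |
  while IFS= read -r rel; do
    if is_denied "$rel"; then
      # Fail closed: record that the file exists, never read its content.
      printf 'DENIED  %s\n' "$rel"
    else
      printf '%s  %s\n' "$(cd -- "$root" && sha256sum -- "$rel" | cut -d' ' -f1)" "$rel"
    fi
  done
}
```

Two manifests produced this way on different hosts can be compared with a plain `diff`; a denied file still contributes a line, so its presence or absence is visible without its bytes ever being hashed.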
Three must-fixes for the dev-env zero-diff fingerprint (PR #507):

1. capture is now genuinely read-only. Drop the `git write-tree` call from head.env: it wrote tree objects into <repo>/.git/objects and touched the index, breaking the read-only contract that the live R0 step relies on when pointing capture at real expendable repos. The staged/index identity is already captured by `git ls-files -s` (index-blobs.txt) + the `git diff --cached` hash (diff-cached.sha256), so write-tree was redundant. Header + runbook read-only claims corrected to be TRUE.
2. Tighten the fsck corruption grep to genuine signals only: ^error: / ^fatal: / invalid sha1 pointer / broken link. Drop the broad `missing` / `dangling.*commit` matches that false-positive on healthy repos with gc'd / expired reflogs (a lone `dangling commit <sha>` notice was wrongly flipping fsck=dirty).
3. Make the [PR] vs [LIVE] boundary unmissable in the runbook: a green self-test only proves the assertion engine is internally consistent on one host. It is NOT proof of flip-flop zero-diff in either direction or of live .git corruption catching. Those are delegated to the [LIVE] R2/R3/R5 steps and the Facet-6 harness (PR #506).
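As a rough illustration of must-fix 2, the tightened classification can be sketched as a filter over `git fsck --full` output. The function name `classify_fsck` is hypothetical and the real script may structure this differently; only the four patterns and the `fsck=clean` / `fsck=dirty` values come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: classify fsck output read from stdin.
# Only the genuine corruption signals named in the must-fix are matched.

classify_fsck() {
  # grep reads all of stdin (no -q), so the upstream fsck never sees SIGPIPE.
  if grep -E '^error:|^fatal:|invalid sha1 pointer|broken link' >/dev/null; then
    echo 'fsck=dirty'
  else
    echo 'fsck=clean'
  fi
}
```

A plausible invocation is `git -C "$repo" fsck --full 2>&1 | classify_fsck`. Under these patterns a lone `dangling commit <sha>` notice classifies as clean, while an `invalid sha1 pointer` line classifies as dirty.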
What
Adds the one missing precision layer the large-workdir research identified for Gate G5 / TIN-1620 (one expendable live repo): a git-aware dev-env zero-diff fingerprint that asserts a roamed `~/git` repo is byte-and-semantically identical across hosts (committed + uncommitted + staged + untracked + branch/HEAD + stash), with no mid-reconcile `.git` corruption.

This is not new scope. Repo-roam == Phase 2 "single live repo" in `large-workdir-onboarding-design-2026-05-25.md` == Gate G5 in `large-workdir-daily-driver-sequencing-2026-05-30.md` == `TIN-1620` (live-execution child `TIN-1908`). The fingerprint is a tighter assertion layered ON TOP of the existing QA-matrix rows (T2/T3, T8/T9+M3/M6, T10/T11+M5/M5-R, T12, T13): not a new matrix, not a new sync engine.

[PR] vs [LIVE]: what a green bar in THIS PR does and does NOT prove
Files
- `scripts/repo-roam-fingerprint.sh`: `capture` / `compare` (exit nonzero on any diff, fsck-clean required both sides) / `seed-canary` / `self-test`. Honors the fail-closed reconcile deny-set (`.env*`/secret/live-WAL recorded `DENIED`, never hashed). `seed-canary` refuses `$HOME` / `~/git` / fs-root.
- `scripts/test-repo-roam-fingerprint.sh`: regression suite (deterministic match, drift detection, deny-set, safety refusals).
- `docs/ops/repo-roam-test-plan-2026-06-08.md`: the canary runbook, mapping each step R0-R5 to the existing T/M rows + TIN-1620 G5 acceptance, with [LIVE] markers for fleet-only steps (out of scope here).
- `Taskfile.yaml`: `lazy:dev-env-fingerprint` + `lazy:test-dev-env-fingerprint`.

Reuses vs adds
- Reuses (does not duplicate): `git-repo-canary.sh` -> `home-canary-linux-xr-shadow.sh` (shadow/push, `.git`-as-plain-files), `git-repo-restore-proof.sh` (rollback), `large-workdir-inventory.py`, the neo-honey / TIN-1620 flip-flop lifecycle harnesses, and the `docs/release/evidence/<run-id>/` packet convention.
- Adds exactly one new evidence subtree (`dev-env-fingerprint/`) and one gate line (`dev-env-zero-diff=pass`).

Key findings documented in the runbook
- Enrollment (Facet 4): `.git`-as-files is config-scoped via a per-repo `-c` config (`sync_git_dirs=true` + `git_sync_mode="raw"`) on the scheduled `tcfs reconcile --path --prefix --execute` unit. The global flip is forbidden (fleet-wide blast radius). An additive `--sync-git-dirs` flag is a nice-to-have, not required.
- Zero-diff caveats (Facet 5): the mtime/index trap (… `SyncManifest`; restored `.git/index` smudges on first `git status`) needs a `git update-index --refresh` mitigation; the symlink drop (`reconcile.rs:1120` `preserve_symlinks:false`) fails T12. The fingerprint makes both visible.
- Raw-mode `.git` corruption gate (Facet 6): raw-mode per-file `.git` conflict resolution can produce half-applied refs (`git fsck: invalid sha1 pointer`); G5-git-5 is expected-fail until resolution is `.git`-aware.
- Operator-review items: the `.git`-as-files vs git-bundle tension (R4 acceptance currently bundle-based); `stash` as an explicit precision assertion (not a new ticket).

Validation
- `shellcheck` clean on both scripts.
- `task lazy:test-dev-env-fingerprint` passes; self-test (disposable `/tmp` seed -> capture -> capture -> compare + drift negative control) passes. Scope of this green: assertion-engine consistency on one host only; see the [PR] vs [LIVE] section above.
- `capture` is strictly read-only (must-fix): it no longer runs `git write-tree` (which previously wrote tree objects into `<repo>/.git/objects` and touched the index). Proven on a scratch `/tmp` repo with a staged change present: the entire `.git` tree is byte-for-byte identical and `git status` + object count are unchanged before/after capture.
- The fsck corruption grep is tightened to genuine signals only (`^error:` / `^fatal:` / `invalid sha1 pointer` / `broken link`) and drops broad `missing` / `dangling.*commit` matches that false-positived on healthy repos with gc'd / expired reflogs. Verified: a healthy repo emitting a lone `dangling commit <sha>` notice now reports `fsck=clean` (the old pattern flipped it `dirty`); a repo with an `invalid sha1 pointer` still reports `fsck=dirty`.
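One way the "byte-for-byte identical before/after capture" check could be reproduced is sketched below. It is an assumption-laden illustration, not the procedure actually used for this PR: `tree_digest` and `assert_read_only` are hypothetical helper names, and the sketch assumes GNU `find`, `sort -z`, `xargs -0 -r`, and `sha256sum`. It compares file contents and the file list only, not mtimes or modes.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: prove a command leaves a directory tree byte-identical.

tree_digest() {
  # Usage: tree_digest <dir>
  # One sha256 over the sorted per-file sha256 list, so any added, removed,
  # or modified file changes the result.
  ( cd -- "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum ) |
    sha256sum | cut -d' ' -f1
}

assert_read_only() {
  # Usage: assert_read_only <dir> <command> [args...]
  # Returns 0 only if <dir> has the same digest before and after the command.
  local dir=$1 before after
  shift
  before=$(tree_digest "$dir")
  "$@" >/dev/null 2>&1 || true   # the command's own exit status is not judged here
  after=$(tree_digest "$dir")
  [ "$before" = "$after" ]
}
```

Used as `assert_read_only "$repo/.git" <capture command>`, a nonzero return would indicate that the command wrote into `.git`, which is the failure mode `git write-tree` introduced.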