fix: diff→symbol joins return empty results in multi-repo daemons#74
Merged
Conversation
Multi-repo daemons key indexed file paths as "<prefix>/<rel>", while git diffs and forge file lists carry repo-relative paths. Every file→symbol join fed the relative path straight into the exact-match GetFileNodes lookup, so get_pr_impact / triage_prs / conflicts_prs and all thirteen MapGitDiff-driven tools (review, pr_risk, diff_context, detect_changes, suggest_reviewers, ...) reported zero changed symbols and 0.0 risk in multi-repo mode. MapGitDiff / MapGitDiffWithLines now take the repo prefix and retry a missed lookup with the prefixed form — raw-first keeps unprefixed single-repo graphs working and never double-prefixes. The PR tools resolve the prefix from the repo selector, the picked repo root, or the session's cwd-bound repo (the CLI dials the daemon with its working directory, so "gortex prs <N>" inside a tracked repo needs no selector). ChangedFiles keeps diff-relative paths so git pathspec re-joins are unaffected.
The diff parsers anchor on "+++ b/" headers. A developer's global diff.mnemonicPrefix=true makes worktree-side diffs emit c/ w/ prefixes (diff.noprefix=true drops them entirely), zeroing out every parsed hunk — review / detect_changes / diff_context silently reported an empty changeset for the unstaged/staged/all scopes while compare kept working (commit-to-commit diffs keep a/ b/). The test fixture already guarded itself against exactly this config; production never did. GitDiffArgs now forces the standard form with -c overrides ahead of the diff subcommand, and the three argv builders (diffmap, review old-side rawDiff, sibling per-file diff) share it.
…sion cwd The diff-driven handlers had three different, individually broken ways of answering "which repo is this call about": - detect_changes / diff_context / the pre-commit prompt defaulted to repoRoot "." via the single-indexer field, which is nil in multi-repo daemons — their git diff ran in the daemon's own cwd and always came back empty. - The review family (review, review_pack, post_review, critique_review, pr_review_context, sibling_diff_context, suggest_reviewers, review_questions, pr_risk) resolved a repo selector only as a tracked prefix, so the CLI's default --repo (a filesystem path) and the no-selector multi-repo case both failed with "could not resolve a repository root". - resolveRepoFilter read the same `repo` arg without path normalization and interpreted the review tools' `scope` argument (unstaged / staged / all / compare) as a saved-scope name, so even a resolvable root then died on "unknown scope" or "filter resolves to nothing". One resolver now answers the question for all of them: Server.diffRepoScope(ctx, repo) honours an explicit selector exclusively (prefix or path, normalized; unknown selectors resolve nothing so the caller errors instead of silently diffing another repo), then falls back to the lone tracked repo, then to the session's cwd-bound repo — clients dial the daemon with their working directory, so `gortex review` and `gortex detect_changes` need no selector inside a tracked repo. The git-diff scope values are reserved words to the saved-scope filter, and the filter normalizes path selectors the same way. detect_changes, diff_context, and the pre_commit prompt grow an optional repo argument.
The per-file risk rollup keyed rows by both path vocabularies at once: changed-symbol paths come from graph nodes (multi-repo daemons key them "<prefix>/<rel>") while the diff's changed files are repo-relative. Every file surfaced twice — once prefixed with its real impact tier and once as a LOW diff-only duplicate. Invisible before the prefix-aware join landed (the symbol side was always empty); rankFileRisk now normalizes all keys to the diff-relative form via the report's repo prefix. The headline also no longer reads "BLOCK: no findings": a non-APPROVE verdict with zero findings is risk-driven, so it says which tier earned it — "BLOCK: no rule findings, but 9 of 20 changed file(s) carry CRITICAL blast-radius risk".
…dict
"CRITICAL <file> (0 finding(s))" answered neither question a reviewer
asks: critical *why*, and does it matter if the change is tested? The
impact analysis already computed both answers and threw them away.
FileRisk rows now carry the evidence — the widest blast radius among the
file's changed symbols and how many of them lack a covering test:
CRITICAL internal/mcp/tools_multi.go (0 finding(s)) — 114 affected, 16/16 changed symbols untested
and coverage tempers the verdict: a file whose changed symbols are all
test-covered contributes at most REVIEW. Blast radius says how far a
regression could reach; covering tests say how likely it is to land
unnoticed; only the combination blocks. Files with no impact data keep
the conservative ladder.
The headline names the worst tier and its coverage ("9 of 22 changed
file(s) carry CRITICAL blast-radius risk (9 without covering tests)"),
and a clean rulepack run reads as a pass ("✓ No inline findings — the
deterministic rulepack passed") instead of sounding like an absence.
The CLI's human rendering also collapses onto review.RenderSummary — it
kept a hand-rolled duplicate of the renderer that silently dropped every
field added after it forked; both audiences now share the canonical
packet (the human output gains the cost line).
The coverage evidence read every change in this repo as untested —
including files with visible *_test.go siblings — because the index
config excludes "**/*_test.go": no test symbols in the graph means no
test callers in any blast radius, and the rollup presented that
blindness as a negative finding on every row.
The probe is language-aware because a coarser one lied the other way:
this repo indexes Python test_* files, so "any test symbol exists" was
true while Go coverage stayed unknowable. coverageKnownForDiff requires
each changed file's own test convention (_test.go, .test.ts, test_, …)
to be present among the indexed symbols, scanning the repo's nodes once
and caching per prefix for the daemon's lifetime.
When coverage is unknowable the rows keep the blast-radius fact ("112
affected") but carry no untested counts, the verdict keeps the
conservative ladder, and the headline says so: "... carry CRITICAL
blast-radius risk (test coverage unknown — the index carries no test
symbols)".
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Symptom
Every diff- and PR-driven tool silently reported an empty changeset:
gortex prs <N>printedrisk LOW (score 0.0),Blast radius: 0 changed symbol(s), and all five review priorities at0.0for a 10-file PR. The same all-zero result came back frompr_risk,review,review_pack,diff_context,detect_changes,pr_review_context,suggest_reviewers,critique_review,post_review,sibling_diff_context, and the pre-commit prompt. No errors anywhere — the tools just saw nothing.Two independent root causes, fixed in one commit each.
Root cause 1 — repo-prefix mismatch in the file→symbol join
Multi-repo daemons key indexed file paths as
<prefix>/<rel>(gortex/internal/daemon/overlay.go), but git diffs and forge file lists carry repo-relative paths (internal/daemon/overlay.go). Both join choke points fed the relative path straight into the exact-matchGetFileNodeslookup:Server.changedSymbolsForFiles(forge paths →get_pr_impact,triage_prs,conflicts_prs)analysis.MapGitDiff/MapGitDiffWithLines(git hunk paths → the 13 diff-driven handlers above)Every lookup missed → 0 changed symbols → every downstream axis (callers, community, coverage, flow, security) scored zero.
Fix
analysis.JoinFileNodes/JoinFilePath: raw lookup first (keeps unprefixed single-repo graphs and already-prefixed input working, never double-prefixes), then retry with<prefix>/<path>.MapGitDiff/MapGitDiffWithLinestake an explicitrepoPrefix; the shared hunk join is extracted intojoinHunksToSymbols(the two functions had verbatim-duplicate loops).DiffResult.ChangedFileskeeps diff-relative paths so git pathspec re-joins are unaffected.review.Options.RepoPrefix+BuildChangeViewthreading for the review engine.Server.diffJoinPrefix(repoRoot)resolves root → prefix for the MapGitDiff handlers;Server.prJoinPrefix(ctx, repo)resolves the PR-tools prefix from thereposelector, the picked repo root, or the session's cwd-bound repo — the CLI dials the daemon with its working directory, sogortex prs <N>inside a tracked repo needs no explicit selector.suggest_reviewersownership/co-change joins use the same prefix-aware lookup.Root cause 2 — git diff header prefixes vs. user config (
0891ac8d)The diff parsers anchor on
+++ b/headers. Withdiff.mnemonicPrefix=truein the developer's git config, worktree-side diffs emitc/w/prefixes (diff.noprefix=truedrops them entirely) — every parsed hunk drops, and theunstaged/staged/allscopes report an empty changeset whilecomparekeeps working (commit-to-commit diffs keepa/ b/). The test fixture already pinned both keys tofalseto protect itself from exactly this; production never did.Replace the first two "Out of scope" bullets with a third fix section:
Root cause 3 — no unified repo-scope resolution (
f391e8fb)detect_changes/diff_context/ the pre-commit prompt defaulted torepoRoot := "."(the daemon's own cwd) via the nil-in-multi-repo single-indexer field; the review family resolved--repoonly as a tracked prefix, so the CLI's path default failed; andresolveRepoFiltermis-read the review tools'scopeargument as a saved-scope name and the path-formrepoas a workspace filter — even a resolvable root then errored.Fix:
Server.diffRepoScope(ctx, repo)— explicit selector (prefix or path, normalized, honoured exclusively), then the lone tracked repo, then the session's cwd-bound repo. All diff-driven handlers use it;detect_changes,diff_context, and thepre_commitprompt grow an optionalrepoargument; the git-scope values are reserved words to the saved-scope filter.Fix:
analysis.GitDiffArgsbuilds the diff argv with-c diff.mnemonicPrefix=false -c diff.noprefix=falseahead of the subcommand, and the three previously-duplicated argv builders (diffmap, review old-siderawDiff, sibling per-file diff) now share it.