Skip to content

fix: diff→symbol joins return empty results in multi-repo daemons#74

Merged
zzet merged 6 commits into
mainfrom
fix/pr-impact-multirepo-paths
Jun 11, 2026
Merged

fix: diff→symbol joins return empty results in multi-repo daemons#74
zzet merged 6 commits into
mainfrom
fix/pr-impact-multirepo-paths

Conversation

@zzet

@zzet zzet commented Jun 11, 2026

Copy link
Copy Markdown
Owner

Symptom

Every diff- and PR-driven tool silently reported an empty changeset: gortex prs <N> printed risk LOW (score 0.0), Blast radius: 0 changed symbol(s), and all five review priorities at 0.0 for a 10-file PR. The same all-zero result came back from pr_risk, review, review_pack, diff_context, detect_changes, pr_review_context, suggest_reviewers, critique_review, post_review, sibling_diff_context, and the pre-commit prompt. No errors anywhere — the tools just saw nothing.

Two independent root causes, fixed in one commit each.

Root cause 1 — repo-prefix mismatch in the file→symbol join

Multi-repo daemons key indexed file paths as <prefix>/<rel> (gortex/internal/daemon/overlay.go), but git diffs and forge file lists carry repo-relative paths (internal/daemon/overlay.go). Both join choke points fed the relative path straight into the exact-match GetFileNodes lookup:

  • Server.changedSymbolsForFiles (forge paths → get_pr_impact, triage_prs, conflicts_prs)
  • analysis.MapGitDiff / MapGitDiffWithLines (git hunk paths → the 13 diff-driven handlers above)

Every lookup missed → 0 changed symbols → every downstream axis (callers, community, coverage, flow, security) scored zero.

Fix

  • analysis.JoinFileNodes / JoinFilePath: raw lookup first (keeps unprefixed single-repo graphs and already-prefixed input working, never double-prefixes), then retry with <prefix>/<path>.
  • MapGitDiff / MapGitDiffWithLines take an explicit repoPrefix; the shared hunk join is extracted into joinHunksToSymbols (the two functions had verbatim-duplicate loops). DiffResult.ChangedFiles keeps diff-relative paths so git pathspec re-joins are unaffected.
  • review.Options.RepoPrefix + BuildChangeView threading for the review engine.
  • Server.diffJoinPrefix(repoRoot) resolves root → prefix for the MapGitDiff handlers; Server.prJoinPrefix(ctx, repo) resolves the PR-tools prefix from the repo selector, the picked repo root, or the session's cwd-bound repo — the CLI dials the daemon with its working directory, so gortex prs <N> inside a tracked repo needs no explicit selector.
  • suggest_reviewers ownership/co-change joins use the same prefix-aware lookup.

Root cause 2 — git diff header prefixes vs. user config (0891ac8d)

The diff parsers anchor on +++ b/ headers. With diff.mnemonicPrefix=true in the developer's git config, worktree-side diffs emit c/ w/ prefixes (diff.noprefix=true drops them entirely) — every parsed hunk drops, and the unstaged / staged / all scopes report an empty changeset while compare keeps working (commit-to-commit diffs keep a/ b/). The test fixture already pinned both keys to false to protect itself from exactly this; production never did.

Replace the first two "Out of scope" bullets with a third fix section:

Root cause 3 — no unified repo-scope resolution (f391e8fb)

detect_changes / diff_context / the pre-commit prompt defaulted to repoRoot := "." (the daemon's own cwd) via the nil-in-multi-repo single-indexer field; the review family resolved --repo only as a tracked prefix, so the CLI's path default failed; and resolveRepoFilter mis-read the review tools' scope argument as a saved-scope name and the path-form repo as a workspace filter — even a resolvable root then errored.

Fix: Server.diffRepoScope(ctx, repo) — explicit selector (prefix or path, normalized, honoured exclusively), then the lone tracked repo, then the session's cwd-bound repo. All diff-driven handlers use it; detect_changes, diff_context, and the pre_commit prompt grow an optional repo argument; the git-scope values are reserved words to the saved-scope filter.

Fix: analysis.GitDiffArgs builds the diff argv with -c diff.mnemonicPrefix=false -c diff.noprefix=false ahead of the subcommand, and the three previously-duplicated argv builders (diffmap, review old-side rawDiff, sibling per-file diff) now share it.

zzet added 6 commits June 11, 2026 20:28
Multi-repo daemons key indexed file paths as "<prefix>/<rel>", while git
diffs and forge file lists carry repo-relative paths. Every file→symbol
join fed the relative path straight into the exact-match GetFileNodes
lookup, so get_pr_impact / triage_prs / conflicts_prs and all thirteen
MapGitDiff-driven tools (review, pr_risk, diff_context, detect_changes,
suggest_reviewers, ...) reported zero changed symbols and 0.0 risk in
multi-repo mode.

MapGitDiff / MapGitDiffWithLines now take the repo prefix and retry a
missed lookup with the prefixed form — raw-first keeps unprefixed
single-repo graphs working and never double-prefixes. The PR tools
resolve the prefix from the repo selector, the picked repo root, or the
session's cwd-bound repo (the CLI dials the daemon with its working
directory, so "gortex prs <N>" inside a tracked repo needs no selector).
ChangedFiles keeps diff-relative paths so git pathspec re-joins are
unaffected.
The diff parsers anchor on "+++ b/" headers. A developer's global
diff.mnemonicPrefix=true makes worktree-side diffs emit c/ w/ prefixes
(diff.noprefix=true drops them entirely), zeroing out every parsed hunk
— review / detect_changes / diff_context silently reported an empty
changeset for the unstaged/staged/all scopes while compare kept working
(commit-to-commit diffs keep a/ b/). The test fixture already guarded
itself against exactly this config; production never did.

GitDiffArgs now forces the standard form with -c overrides ahead of the
diff subcommand, and the three argv builders (diffmap, review old-side
rawDiff, sibling per-file diff) share it.
…sion cwd

The diff-driven handlers had three different, individually broken ways
of answering "which repo is this call about":

- detect_changes / diff_context / the pre-commit prompt defaulted to
  repoRoot "." via the single-indexer field, which is nil in multi-repo
  daemons — their git diff ran in the daemon's own cwd and always came
  back empty.
- The review family (review, review_pack, post_review, critique_review,
  pr_review_context, sibling_diff_context, suggest_reviewers,
  review_questions, pr_risk) resolved a repo selector only as a tracked
  prefix, so the CLI's default --repo (a filesystem path) and the
  no-selector multi-repo case both failed with "could not resolve a
  repository root".
- resolveRepoFilter read the same `repo` arg without path normalization
  and interpreted the review tools' `scope` argument (unstaged / staged /
  all / compare) as a saved-scope name, so even a resolvable root then
  died on "unknown scope" or "filter resolves to nothing".

One resolver now answers the question for all of them:
Server.diffRepoScope(ctx, repo) honours an explicit selector exclusively
(prefix or path, normalized; unknown selectors resolve nothing so the
caller errors instead of silently diffing another repo), then falls back
to the lone tracked repo, then to the session's cwd-bound repo — clients
dial the daemon with their working directory, so `gortex review` and
`gortex detect_changes` need no selector inside a tracked repo. The
git-diff scope values are reserved words to the saved-scope filter, and
the filter normalizes path selectors the same way. detect_changes,
diff_context, and the pre_commit prompt grow an optional repo argument.
The per-file risk rollup keyed rows by both path vocabularies at once:
changed-symbol paths come from graph nodes (multi-repo daemons key them
"<prefix>/<rel>") while the diff's changed files are repo-relative. Every
file surfaced twice — once prefixed with its real impact tier and once as
a LOW diff-only duplicate. Invisible before the prefix-aware join landed
(the symbol side was always empty); rankFileRisk now normalizes all keys
to the diff-relative form via the report's repo prefix.

The headline also no longer reads "BLOCK: no findings": a non-APPROVE
verdict with zero findings is risk-driven, so it says which tier earned
it — "BLOCK: no rule findings, but 9 of 20 changed file(s) carry
CRITICAL blast-radius risk".
…dict

"CRITICAL <file> (0 finding(s))" answered neither question a reviewer
asks: critical *why*, and does it matter if the change is tested? The
impact analysis already computed both answers and threw them away.

FileRisk rows now carry the evidence — the widest blast radius among the
file's changed symbols and how many of them lack a covering test:

  CRITICAL internal/mcp/tools_multi.go (0 finding(s)) — 114 affected, 16/16 changed symbols untested

and coverage tempers the verdict: a file whose changed symbols are all
test-covered contributes at most REVIEW. Blast radius says how far a
regression could reach; covering tests say how likely it is to land
unnoticed; only the combination blocks. Files with no impact data keep
the conservative ladder.

The headline names the worst tier and its coverage ("9 of 22 changed
file(s) carry CRITICAL blast-radius risk (9 without covering tests)"),
and a clean rulepack run reads as a pass ("✓ No inline findings — the
deterministic rulepack passed") instead of sounding like an absence.

The CLI's human rendering also collapses onto review.RenderSummary — it
kept a hand-rolled duplicate of the renderer that silently dropped every
field added after it forked; both audiences now share the canonical
packet (the human output gains the cost line).
The coverage evidence read every change in this repo as untested —
including files with visible *_test.go siblings — because the index
config excludes "**/*_test.go": no test symbols in the graph means no
test callers in any blast radius, and the rollup presented that
blindness as a negative finding on every row.

The probe is language-aware because a coarser one lied the other way:
this repo indexes Python test_* files, so "any test symbol exists" was
true while Go coverage stayed unknowable. coverageKnownForDiff requires
each changed file's own test convention (_test.go, .test.ts, test_, …)
to be present among the indexed symbols, scanning the repo's nodes once
and caching per prefix for the daemon's lifetime.

When coverage is unknowable the rows keep the blast-radius fact ("112
affected") but carry no untested counts, the verdict keeps the
conservative ladder, and the headline says so: "... carry CRITICAL
blast-radius risk (test coverage unknown — the index carries no test
symbols)".
@zzet zzet merged commit 85db7cb into main Jun 11, 2026
10 checks passed
@zzet zzet deleted the fix/pr-impact-multirepo-paths branch June 11, 2026 21:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant