Skip to content

perf(db): drop redundant idx_edges_source and idx_edges_target#1

Closed
andreinknv wants to merge 1 commit into
mainfrom
perf/drop-redundant-edge-indexes
Closed

perf(db): drop redundant idx_edges_source and idx_edges_target#1
andreinknv wants to merge 1 commit into
mainfrom
perf/drop-redundant-edge-indexes

Conversation

@andreinknv
Copy link
Copy Markdown
Owner

Summary

Drops two redundant secondary indexes from the edges table:

  • idx_edges_source — fully covered by idx_edges_source_kind via SQLite's left-prefix scan.
  • idx_edges_target — fully covered by idx_edges_target_kind.

These indexes pre-date the (source, kind) / (target, kind) composites added later. Once the wider indexes were introduced, the narrow ones became dead weight on every write and every disk page — but they were never removed.

Empirical validation

Measured on a synthesized 50K-node / 250K-edge SQLite database that mirrors a realistic mid-size codebase. Reproducer: scripts/spikes/spike-edge-indexes.mjs.

Metric Baseline (with redundant) Stripped Δ
DB size 34.7 MB 27.0 MB -22.2%
Bulk insert (250K edges) 590 ms 431 ms 1.37× faster
WHERE source = ? latency reference 0.88× no regression
WHERE target = ? latency reference 0.95× no regression

EXPLAIN output on the stripped DB confirms the kind-prefixed indexes still cover source-only / target-only queries:

SEARCH edges USING COVERING INDEX idx_edges_source_kind (source=?)

Run yourself with node scripts/spikes/spike-edge-indexes.mjs.

Why this is safe

Query shape Old plan New plan
WHERE source = ? idx_edges_source idx_edges_source_kind (left-prefix, covering)
WHERE target = ? idx_edges_target idx_edges_target_kind (left-prefix, covering)
WHERE source = ? AND kind = ? idx_edges_source_kind idx_edges_source_kind (unchanged)
WHERE target = ? AND kind = ? idx_edges_target_kind idx_edges_target_kind (unchanged)

SQLite's query planner uses a B-tree's leading column(s) for prefix matches, so the composite indexes are a strict superset of the dropped ones for these queries.

Migration

Adds schema migration v4 (CURRENT_SCHEMA_VERSION 3 → 4):

DROP INDEX IF EXISTS idx_edges_source;
DROP INDEX IF EXISTS idx_edges_target;
  • Fresh databases skip both indexes (removed from schema.sql).
  • Existing v3 databases drop them on next open via the migration runner.
  • Reversible (just re-create the indexes if needed) — no data loss.

Test plan

  • npx tsc --noEmit clean
  • npx vitest run — 382/382 tests pass
  • New tests in __tests__/pr19-improvements.test.ts:
    • Fresh DB does not contain idx_edges_source / idx_edges_target
    • Upgrade path drops both narrow indexes when present and bumps schema version to 4
  • Spike script reproduces the measurements above

Files changed

File Change
src/db/migrations.ts New v4 migration; CURRENT_SCHEMA_VERSION 3 → 4
src/db/schema.sql Removed two CREATE INDEX lines
__tests__/foundation.test.ts Updated v3 → v4 expectation
__tests__/pr19-improvements.test.ts Added v4 migration tests + updated version expectation
scripts/spikes/spike-edge-indexes.mjs New benchmark reproducer

🤖 Generated with Claude Code

These two narrow indexes are fully covered by the wider
idx_edges_source_kind and idx_edges_target_kind composite
indexes via SQLite's left-prefix scan. Keeping them costs DB
size and bulk-insert time without giving any query that the
kind-prefixed indexes don't already cover.

Empirical measurements on a 50K-node / 250K-edge synthesized DB
(scripts/spikes/spike-edge-indexes.mjs):

  - DB size: -22.2% (34.7 MB -> 27.0 MB)
  - Bulk insert: 1.37x faster (590ms -> 431ms)
  - Source-only / target-only query latency: no regression
    (EXPLAIN: SEARCH edges USING COVERING INDEX
     idx_edges_source_kind (source=?))

Adds schema migration v4. Fresh databases skip the indexes
entirely; existing v3 databases drop them on next open.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@andreinknv
Copy link
Copy Markdown
Owner Author

Superseded by upstream PR colbymchenry#122

@andreinknv andreinknv closed this Apr 28, 2026
andreinknv added a commit that referenced this pull request Apr 30, 2026
Two fixes surfaced while verifying b6cef74 against the live ollama and
codegraph indexes:

1. buildReviewContext spread clobbered DEFAULTS with undefined.
   { ...DEFAULTS, ...options } with options.maxCoChangeWarnings set to
   undefined (the shape MCP forwards when the caller did not override)
   set opts.maxCoChangeWarnings to undefined. undefined > 0 is false,
   so the co-change loop was skipped entirely — coChangeWarnings: [] on
   every default review_context call. Same shape silently disabled the
   jaccard threshold and uncapped callers/callees. Replaced the spread
   with a manual merge that ignores undefined values.

2. searchHybrid only diversified the strict no-embeddings-backend path.
   When embeddings are configured but the cache is empty (the common
   summarize:false state, which both project configs use), the call
   fell through to ftsResults.slice(0, limit) without diversification —
   letting six New constructors flood a 12-result budget. Same issue on
   the embed-call-failed and empty-vec-result paths. Routed all three
   FTS-only fallbacks through diversifyByName.

Regression tests in stress-test-roundtwo-fixes.test.ts exercise both
paths through the real surface (MCP-shape options for #1, configured-
but-unpopulated embeddings for #2). Suite: 1074 / 13 skip / 0 fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 1, 2026
… docs

Three reviewer-flagged improvements on 6d0e7a2 (the WASM-visibility
commit). All informational items from that review — no functional
regression, no scope drift, suite stays at 1117/13/0.

## #1 — backend is now per-DatabaseConnection, not a process global

`createDatabase` previously set a module-level `activeBackend` and
exposed it via `getActiveBackend()`. In the MCP cross-project path
(handleStatus called against `projectPath` opens a SECOND DB via
openSync) the global reflected whichever DB was opened most recently,
not necessarily the one whose stats were just rendered. Benign in
practice (every DB in a process resolves the same backend) but
structurally imprecise.

Refactor:
- `createDatabase(dbPath)` now returns `{db, backend}` instead of a
  bare `SqliteDatabase`. Caller stores both.
- `DatabaseConnection` carries a `private backend: SqliteBackend` and
  exposes `getBackend()`.
- `CodeGraph.getBackend()` delegates — that's the public surface.
- CLI `codegraph status` and MCP `handleStatus` both call
  `cg.getBackend()` instead of the global. The global is removed.

Two pre-existing tests (`migrations-015-016`, `migrations-022`) that
called `createDatabase` directly now destructure `{db: adapter}`.

## #2 — fix recipe deduplicated across the two code surfaces

The `xcode-select` / `npm rebuild` / `npm install --save` recipe
appeared inline in both `buildWasmFallbackBanner` (sqlite-adapter.ts)
and the MCP `handleStatus` formatter (mcp/tools.ts). New
`WASM_FALLBACK_FIX_RECIPE` constant in sqlite-adapter.ts is the
single source for the one-line summary; the MCP formatter
interpolates it. The banner formats the same content multi-line for
the stderr surface. README is intentionally separate (different
audience, different rendering).

## #3 — README troubleshooting now covers Linux

Section title renamed "Indexing is slow on macOS / WASM fallback" ->
"Indexing is slow / WASM fallback active". New code block lists fix
steps for macOS, Debian/Ubuntu, RHEL/Fedora, and the cross-platform
`npm install --save` escape hatch. The banner stderr block also
gained the Linux equivalent for symmetry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 1, 2026
Both items came back APPROVE-with-info on the prior review and are
pure cleanup — no behavior change, no API surface change.

## #1 — RHEL/Fedora step in WASM_FALLBACK_FIX_RECIPE constant

The constant in `src/db/sqlite-adapter.ts` is documented as the
"single source of truth" for the fix recipe shown in the MCP
`Backend:` line, but it only listed macOS + Debian/Ubuntu paths. The
multi-line `buildWasmFallbackBanner` and the README both already
include the `yum groupinstall "Development Tools"` step for
RHEL/Fedora; the constant was the lone surface missing it. Now
appended so MCP-displayed guidance matches the other two surfaces
on every supported platform.

## #2 — hoist inline `import('./db').SqliteBackend` to top-level

`CodeGraph.getBackend()` was the only method in `src/index.ts` using
the inline-import type form. Adding `SqliteBackend` to the existing
`import { DatabaseConnection, getDatabasePath } from './db'` keeps
the file consistent. No circular-import risk since `./db` already
re-exports the type from `./db/sqlite-adapter`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 1, 2026
… server-config flags

Tooling-gap backlog (codegraph/docs/codegraph-tooling-gaps.md) closed:
  #1 freshness severity bucket — `classifyFreshness` with fresh|recent|stale|very_stale
  #2 allowStale flag — opt-in bypass for the heavy-drift gate, registry-injected schema
  #3 module format in status — `module-format.ts` parses package.json + tsconfig (JSONC-safe)
  #4 codegraph_imports tool + import-classifier — file/directory/bare/unresolvable filters
  #5 dynamic imports — extractor catches `import('…')` + `require('…')`, incl. template_string
  #6 build-context refs — new `build_context_refs` table for `__dirname` / `import.meta.*`
  #7 files.is_test flag — column populated by glob; surfaced in status as `(N test)`
  colbymchenry#11 summarize-also-embeds (discovered while dogfooding) — `cg.summarizeAll()` chains
       `embedAllSummaries`; new `cg.embedAll()` for embed-only path; CLI `codegraph embed`

CLI/MCP alignment (5/32 → 33+/35):
  - 13 new CLI commands via `runViaMCP` shim: callers, callees, impact, node, similar,
    biomarkers, imports, help-tools, explore, hotspots, dead-code, config-refs, sql-refs,
    module-summary, role, coverage-query, pending-summaries, save-summaries, review-context
  - 7 new MCP tools: codegraph_imports, codegraph_embed, codegraph_summarize, codegraph_sync,
    codegraph_reindex, codegraph_coverage_ingest, codegraph_init, codegraph_uninit,
    codegraph_unlock, codegraph_affected

MCP server-level operator config (`codegraph serve --mcp`):
  - --no-write-tools / --allow-stale-default / --disable-tool (sandboxing)
  - --llm-endpoint / --llm-chat-model / --llm-ask-model / --llm-embedding-model /
    --llm-api-key (operator LLM config; per-project config wins on conflict)
  - New CODEGRAPH_LLM_* env vars wired through `mergeLlmEnv` in resolveLlmProviders

Architectural cleanups:
  - `bypassFreshnessGate` and `isWriteTool` declarative flags on ToolModule (replaces
    growing string-comparison chain in execute())
  - `withAllowStale` registry injection only on tools that DO see the gate
  - DRY of inline copy-paste in 3 hooks → `src/index-hooks/enclosing.ts`
  - `LlmClient.isEmbeddingReachable` for split-provider correctness
  - SyncResult `lockContention` flag → handleSync emits distinct retryable message
  - `clearStructural` deletes from build_context_refs (was orphan-leaking on --force)
  - cli:dev npm script + tsx CLI fixed (web-tree-sitter `import type` for type-only refs)

Migrations:
  023-files-is-test.ts — add `files.is_test`
  024-build-context-refs.ts — add `build_context_refs` table

Reviewer rounds: 11 total, all REQUEST_CHANGES addressed inline. Notable fixes:
  - JSONC URL strip via state machine (was eating `https://` tails)
  - classifyFreshness very_stale now requires isStale (in-sync-but-old → recent)
  - Dynamic imports also match template_string nodes
  - process.exit deferred until after finally cleanup in runViaMCP
  - --same-language / --different-language mutual exclusion guard
  - help-tools CLI bypasses isInitialized (works without a project)
  - handleUninit sweeps projectCache by getProjectRoot (no dangling alias leaks)
  - handleAffected errors instead of silently dropping unsupported glob filters
  - mergeLlmEnv preserves precedence: legacy flat config wins over env-synthesised block

Suite: 1268 passing, 1 expected red (colbymchenry#8 — undecided), 13 skipped, 1 todo, 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 1, 2026
…iness

Adds a new `### Server config` section to `codegraph_status` that
surfaces operator-level MCP flags so a sandboxed deployment can verify
the running server actually applied the intended config:

  - **Disabled tools:** lists names from `--disable-tool`
  - **Write tools:** disabled when `--no-write-tools` is set
  - **Default `allowStale`:** true when `--allow-stale-default` is set
  - **LLM env overrides:** which CODEGRAPH_LLM_* vars are active

Section is hidden entirely when nothing is set (default install stays
clean). Renders identically across MCP and CLI surfaces since both
route through `handleStatus`.

Visual treatment for the existing sections:
  - Backend: ✅ suffix when sqlite-vec is loaded; ⚠ when not
  - Status: 🟢 fresh / 🟡 recent / 🟠 stale / 🔴 very_stale
    (driven by `FreshnessSeverity` from #1)
  - Feature Readiness: 🟢/🟡/⚪ traffic-light per row, bold keys,
    percentages on summaries
  - Section headers gained icons: 🚦 Feature Readiness, 🤖 LLM providers,
    🔧 Server config

Test changes:
  - 5 new cases for the Server config section (omitted-when-empty,
    disable-write-tools, disabled-tools list, allowStale default,
    LLM env-overrides)
  - Two fragile string-equality assertions widened to regex
    (mcp-handlers-stress-test-fixes / mcp-llm-visibility) so future
    emoji tweaks do not re-break them — semantic invariants preserved.

Suite: 1273 passing, 1 expected red (colbymchenry#8), 0 regressions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
Closes BACKLOG.md section A #1. Returns the structural symbol-level
diff of the working tree vs a git ref (default HEAD) — the cumulative
"+/-/~ symbols" view of an agent's edits. Pairs with the planned
codegraph_diff branch-vs-branch (B colbymchenry#17): same machinery, different
ref pair.

How it works
- listChangedFilesSince(ref) uses "git diff --name-status --no-renames REF"
  + "git ls-files --others" to capture committed-since-REF, staged,
  unstaged, and untracked paths in one set. --no-renames means a
  "git mv" surfaces as old-path-D + new-path-A so the agent sees
  symbols moving sides instead of an opaque rename.
- For each path, getFileAtRef reads the baseline blob via
  "git show REF:path"; current source comes from disk.
- Both sides go through the existing stateless extractFromSource;
  ensureGrammarsForFiles loads tree-sitter grammars for the changed
  languages (the runViaMCP shim doesn't preload them, only the live
  MCP server does — defensive load keeps the tool self-sufficient).
- Diff matches symbols by (filePath, qualifiedName, kind). Modified
  detection: signature change OR line-range change OR async/static/
  exported flag flip. Body-only edits that don't shift line numbers
  fall through as "unchanged" — accepted MVP false-negative.
- INTERESTING_KINDS filter drops file/module/namespace/import/export/
  variable/constant/parameter — those kinds rotate on every edit
  and would bury the report in noise.
- Optional includeBiomarkers attaches current findings on each
  added/modified node by looking up (qualifiedName, kind) against the
  persisted graph.

Surfaces
- src/compare/index.ts — async compareToRef(cg, options).
- src/mcp/tools/compare.ts — codegraph_compare_to_ref tool. Markdown
  report with summary + skipped-file list + per-file +/-/~ sections.
- src/bin/codegraph.ts — "codegraph compare-to-ref" CLI command via
  the runViaMCP shim.
- src/git-utils.ts — getFileAtRef + listChangedFilesSince helpers.

Tests
- __tests__/compare.test.ts — 8 vitest cases (added/removed/modified,
  untracked = all-added, deleted = all-removed, no-change, pathFilter,
  no-git error, includeBiomarkers, "git mv" rename split).
- Suite: 1361/13/0 (was 1352/13/0 + 1 from prior bug-fix + 8 new).

Reviewer caught a rename bug pre-commit (parseGitStatusPorcelain
dropped the old path on "git mv"). Switched to "git diff REF" with
--no-renames so committed/staged/unstaged + renames all flow through
one classifier; untracked files get a separate ls-files pass.

Dogfooded on this very branch — output cleanly enumerates the new
files' added symbols, the modified-files' actual function changes,
and skips boilerplate like file-level line-range churn.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
…y#19)

Two ranking changes in findRelevantContext, gated by the eval
harness shipped in B colbymchenry#26:

1. Common-term dampener (one-sided IDF) in runMultiTermTextSearch.
   Each per-term search-result score gets multiplied by
   `1 - DAMPEN_RATE * (numHits / fetchLimit)`. With
   DAMPEN_RATE=0.5: a term that saturates the fetch cap (16/16)
   gets 0.5×; mid-saturation (8/16) gets 0.75×; rare hits (1/16)
   get effectively 1.0×. Counters the cross-call score-summing
   problem where common tokens ("nodes", "parse" in this repo)
   drowned out rare ones ("extraction") because each independent
   per-term search only knows BM25-IDF within ITS own results,
   not across calls.

2. PageRank centrality boost in collectAndScoreCandidates.
   `score *= 1 + γ * sqrt(centrality)` with γ=5. Pulls
   high-centrality hubs (ExtractionOrchestrator, etc.) into
   entry-point selection. Sqrt-smooths the long tail so leaves
   barely move while hubs at the top centrality (~0.12 in this
   repo) get +173%.

Eval results
------------
self-eval suite (11 cases): mean recall 0.79 → 0.91 (+0.121),
pass 9/11 → 10/11. Zero per-case regressions. Two cases gained:
- self-explore-extraction-pipeline: 0.00 → 1.00
- self-explore-search-cascade: 0.67 → 1.00

The gate from B colbymchenry#26 (compare.ts) was load-bearing — multiple
intermediate iterations were rejected:
- bidirectional IDF (`log(1 + cap/hits)`): regressed
  self-explore-compare-to-ref by 0.33. Rolled back.
- cliff dampener (saturated→0.5×, else→1.0×, no centrality
  change): flipped extraction-pipeline +1.00 but regressed
  biomarker-engine -0.33. Rejected.
- smooth dampener at DAMPEN_RATE=0.4: zero movement either
  direction.
- DAMPEN_RATE=0.5 with γ=2 centrality (existing): same
  trade as cliff. Rejected.
- DAMPEN_RATE=0.5 with γ=5 centrality: biomarker-engine
  recovered (centrality boost on its hub symbols compensated
  for the dampened common-term scores). Two wins held. Gate
  passed.

Reviewer pass — docstring rot fixed
- CENTRALITY_BOOST_WEIGHT JSDoc still showed γ=2 arithmetic
  examples after I bumped the constant to 5. Updated to the
  correct +173% / +50% / +16% percentages.
- DAMPEN_RATE inline comment showed numbers from a superseded
  DAMPEN_RATE=0.4 iteration. Updated to the shipped 0.5 values.
- Both flagged via memo scrutiny-area #1 (docstring rotting
  after a behavior change). Memo workflow paying off — fifth
  consecutive review where memo content was load-bearing.

Verification
- npm run typecheck (tsgo) — clean.
- npx vitest run — 1392/13/0 (unchanged; ranking changes
  measured via eval, not vitest).
- npx tsx __tests__/evaluation/runner.ts . --cases self
  --compare __tests__/evaluation/baseline-self.json — within
  budget, +0.121 mean recall, +1 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
…henry#23)

Flip the SQLite backend default. Pre-PR, createDatabase tried
better-sqlite3 first and fell back to node:sqlite; the package
shipped better-sqlite3 in optionalDependencies so npm install
attempted to compile native bindings on every fresh install.
Post-PR, node:sqlite is the built-in default (Node 22.5+,
zero native compile, zero platform binaries); better-sqlite3
remains supported as a perf opt-in (~1.5× faster on indexing)
when the user installs it explicitly.

The Node 22 floor is already in place (engines.node = ">=22.5.0"),
so this isn't a forced version bump — just exercising what the
existing floor enabled.

Why this is a win
- Fresh installs no longer compile native bindings (the #1 cause
  of "npm install fails on Apple Silicon" issues per the dep audit).
- 22 MB net install savings versus the previous default path
  on platforms that pulled prebuilt better-sqlite3 binaries.
- Users who want the perf opt-in still get it via a single
  explicit `npm install better-sqlite3 --save`.

Adapter shims required
- node:sqlite throws on unused named parameters; better-sqlite3
  silently ignored them. Several call sites pass
  `bindingsFromObject(record, FULL_SCHEMA)` where the SQL only
  binds a subset (notably upsertFile excludes churn-managed
  columns). Added `extractNamedPlaceholders` to filter the
  bindings object on each prepare()'d statement's run/get/all.
- node:sqlite throws on double-close; better-sqlite3 was
  silently a no-op. Made `NodeSqliteAdapter.close()` idempotent
  to match the existing contract.

Surfaces flipped
- src/db/sqlite-adapter.ts createDatabase priority order +
  banner text (now flags failed perf opt-in, not active
  fallback).
- src/mcp/tools/status.ts reports `node:sqlite (built-in default)`
  vs `better-sqlite3 (perf opt-in, native)`.
- src/bin/codegraph.ts CLI status display same.
- package.json dropped better-sqlite3 from optionalDependencies.
- @types/better-sqlite3 kept in devDependencies for compile-time
  SqliteDatabase interface.

Test fallout addressed
- wasm-savepoint banner-text assertions: updated for the new
  framing.
- mcp-llm-visibility + sqlite-vec backend assertions: now match
  either flavor so the test stays portable across both dev
  environments.
- cochange + search-quality migration tests that directly
  `require('better-sqlite3')`: now `it.skipIf(!hasBetterSqlite3)`.
  21 tests skip when the perf opt-in isn't installed (they all
  poke the migration path against a raw better-sqlite3 handle;
  production uses the adapter regardless).

Reviewer pass — fixed before commit
- buildFallbackBanner said "Or remove the perf opt-in" then
  printed `npm install better-sqlite3 --save` — that's INSTALL,
  not remove. Bug. Fixed to `npm uninstall better-sqlite3`. Test
  assertion flipped to match.
- extractNamedPlaceholders JSDoc now notes that node:sqlite's
  "Unknown named parameter" diagnostic for typo'd keys is
  silenced as a deliberate trade.
- Sixth consecutive review where memo content (scrutiny-area #1
  docstring rot) drove the catch.

Verification
- npm run typecheck (tsgo) — clean.
- npx vitest run — 1371 passed / 34 skipped / 0 failed
  (vs 1392/13/0 baseline; +21 skipped reflect direct-handle
  migration tests that need better-sqlite3).
- npm run eval:self --compare baseline-self.json — within
  budget, +0.121 mean recall (B colbymchenry#19 ranking arc holds on
  the new default backend).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
…enry#28)

Two changes in one — they are tightly coupled (the dedup was
motivated by the user catching that adding one language touched
6 places).

Lua addition
- src/extraction/wasm/tree-sitter-lua.wasm vendored from
  tree-sitter-wasms@0.1.13 (MIT).
- src/extraction/languages/lua.ts — LanguageDef + LanguageExtractor
  patterned on bash.ts. Handles function_definition_statement
  (top-level + M.method + M:method), local_function_definition_statement,
  local_variable_declaration. Fixture under
  docs/test-beds/lua/fixture.lua extracts 11 symbols, 0 errors.
  v1 scope: top-level functions + table methods + locals.
  `require()` import-node promotion deferred — the visitor
  scaffolding is in place but doesn't fire for calls nested
  inside local declarations; documented as a known v1 gap.
- src/extraction/languages/registry.ts adds LUA_DEF.
- src/types.ts adds 'lua' to the Language union.
- __tests__/pr19-improvements.test.ts: vendored-grammar count
  bumped 24 → 25.

Dedup
User caught that the registration surface had drifted:

- **Real drift bug fixed**: src/search/query-parser.ts
  LANGUAGE_VALUES was an unchecked duplicate that had ALREADY
  drifted before this change — only listed 20 of 24 registered
  languages, so `lang:bash` / `lang:rescript` etc. silently fell
  through to FTS text instead of filtering. Replaced with
  `new Set(VALID_LANGUAGES.filter(l => l !== 'unknown'))`.
  VALID_LANGUAGES (config.ts) has the existing compile-time
  exhaustiveness check against the Language union, so future
  drifts on the search-filter set are now blocked at the build.
  VALID_LANGUAGES went from `const` to `export const`.

- **JS-family helper**: 5 sites in resolution/import-resolver.ts
  (×2), resolution/index.ts, extraction/tree-sitter-decls.ts (×2)
  had the same hardcoded
  `lang === 'typescript' || lang === 'javascript' || lang === 'tsx' || lang === 'jsx'`
  disjunction. Replaced with a single `isJsFamily()` helper in
  src/utils.ts. resolution/index.ts's existing `isJsTsLanguage`
  is now a one-liner alias to keep call sites stable.

After this commit, adding a new language touches 4 places:
WASM file, extractor, registry entry, types.ts Language union.
The 4th is intrinsic to TypeScript — the type system needs a
literal union at compile time. The 1st-3rd are the irreducible
implementation surface.

Backlog scope decision
Per user 2026-05-03: PowerShell / Solidity / Elixir descoped from
B colbymchenry#28 — audiences too narrow to justify the per-language
maintenance cost (grammar sync, extractor edge cases, fixture,
eval footprint). Lua kept (Neovim configs, OpenResty, embedded
scripting, game logic). Re-evaluate on demand.

Reviewer pass — APPROVE with 2 info-level edge-case findings:
- multi-param Lua signature could double-paren if grammar emits
  parens in `parameter_list` (fixture probe didn't surface this);
  worth a fixture-pinned signature assertion in a follow-up.
- `readLuaStringLiteral` skips leveled long-brackets `[==[...]==]`;
  documented v1 gap.

Sixth consecutive review where memo content (scrutiny areas #1
docstring rot and #7 speculative-export check) was load-bearing.

Verification
- npm run typecheck (tsgo) — clean.
- npx vitest run — 1371 / 34 / 0 (unchanged).
- End-to-end Lua extraction probe: 11 nodes, 10 edges, 0 errors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
Pre-PR, `graphql-extractor.ts` explicitly skipped `type_system_extension`
AST nodes ("intentionally skipped in v1 — merging extensions
across files needs a second resolution pass we don't do yet"),
so federation-style `extend type User { posts: [Post] }` produced
zero nodes. Post-PR, each extension emits a separate node carrying
the new fields/values plus an `extends` UnresolvedReference
targeting the base type — cross-file merging reconstructible by
walking the resolver-promoted edges.

Mapping
- `extend type X { … }`        → class      + extends ref
- `extend interface X { … }`   → interface  + extends ref
- `extend input X { … }`       → class      + extends ref
- `extend enum X { … }`        → enum       + extends ref + new enum_members
- `extend union X = …`         → type_alias + extends ref + new union refs
- `extend scalar X`            → unsupported by tree-sitter-graphql
                                   0.1.0 (parses as ERROR);
                                   defensive scaffold kept for a
                                   future grammar bump

Per-line node-id derivation makes multi-extension cases distinct
(`extend type User` at L5 and L20 both produce nodes named `User`
of kind `class` with separate ids). Cross-file: filePath in the
id-hash makes them unique by source location. Fields / enum
values / union members go under the extension node, preserving
"this field came from this extender" provenance.

Known same-file edge case
If a base definition and its extension live in the SAME file, the
existing `findBestMatch` line-proximity may pick the extension's
own node (distance 0) over the base definition (distance > 0),
producing a self-referential extends edge. Federation patterns
put base + extension in different files, which is what this
targets. Documented in `pushExtendsRef` JSDoc as a future
resolver-pass filter target.

Files
- src/extraction/graphql-extractor.ts: visitDefinition routes to
  the new `visitTypeSystemExtension` dispatcher; 6 emit*Extension
  methods reuse `emitFieldsOf` and the new `pushExtendsRef`
  helper. Class-level docstring mapping table updated to cover
  the extension forms (memo scrutiny-area #1 catch by reviewer).
- __tests__/graphql-extend-type.test.ts: 3 new cases (5 kinds
  end-to-end, signature distinction, type_of refs).
- __tests__/extraction.test.ts: one existing test flipped from
  "extend type silently produces zero nodes (v1 out-of-scope)" to
  "extension node + extends ref emitted".
- docs/test-beds/graphql/fixture.graphql: full schema fixture
  covering definitions and all 5 supported extension forms;
  auto-discovered by the language-coverage harness.

Verification
- npm run typecheck (tsgo) — clean.
- npx vitest run — 1374 / 34 / 0 (was 1371; +3 new + 1 flipped).
- E2E probe on a multi-kind extend fixture: 5 extension nodes,
  5 extends refs, fields under the right parent, 0 errors.

Reviewer pass — eighth memo-load-bearing review this session:
- Class-level mapping table missing the extension rows (memo
  scrutiny-area #1 docstring rotting). Added.
- Same-file self-resolve edge case noted as future resolver
  filter target.
- emitScalarExtension's unreachable status confirmed adequate
  per its existing JSDoc (memo scrutiny-area #7 doesn't apply
  to private methods).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
Second commit of the trace-logging arc. Wraps every
toolHandler.execute() call in a TraceLogger.log() so the call
flows into the mcp_tool_calls table that landed in 027139b.
The viewer's Agent-trace tab (commit C) reads it back.

Lifecycle:
- TraceLogger is created lazily once `cg` is open (via
  ensureTraceLogger() called on each dispatch). Skipped entirely
  when --no-write-tools is passed: the spirit of that flag is "no
  DB writes", and trace logging is a write path.
- log() is contractually best-effort (DB failures swallowed at
  debug). The dispatch site additionally wraps it in try/finally
  so even a contract violation can never strand the tool result —
  sendResult always fires.

Reviewer-memo gates passed:
- #1 docstring rot: ensureTraceLogger and the dispatch comments
  match the implementation.
- #2 best-effort claim is upheld at both layers (logger try/catch
  + caller try/finally).
- #5 N/A — this commit doesn't add tables.

The 3-line wiring isn't separately tested; TraceLogger has 10
round-trip tests covering every behavior the wiring composes,
and the wiring has no branches. Suite 1422/34/0 unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 3, 2026
The eval covered exact symbol search, multi-hop exploration, and
the new payload budget — but never exercised the fuzzy-edit-distance
fallback path that `searchNodes` cascades to when FTS / LIKE return
zero hits. A regression in `searchNodesFuzzy` or `suggestSymbolNames`
would silently break agent typo recovery without affecting any
existing case.

Four new cases in `cases-self.ts`:
  - `self-fuzzy-extractFromSorce` — single missing char ('u')
  - `self-fuzzy-compreToRef` — single missing char ('a')
  - `self-fuzzy-serchNodes` — single missing char ('a')
  - `self-fuzzy-doublehit-CodGrap` — TWO errors, just inside the
    SUGGEST_MAX_EDIT_DIST cap of 2. This case fails immediately
    if a future change tightens the budget — explicit early-warning
    for accidental over-pruning of the fuzzy reach.

No `kinds` filter on these — fuzzy walks the whole distinct-name
set; we want it to find the symbol regardless of declared kind.

Eval after re-baseline: **15/15 passed | recall=1.00 | mrr=0.91**
(MRR up from 0.86 because every fuzzy case ranks the target at #1).

Semantic-mode eval coverage remains a gap (filed as B9): the eval
setup opens a fresh `.codegraph/` with no embeddings, so semantic
cases would all skip. Wiring an embeddings fixture or a per-eval
`embedAll()` is its own scope.

Suite: **1577 / 34 / 0** (test count unchanged — these are eval
cases, not vitest tests).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 4, 2026
…(B8)

Closes the gap that gated B3's real-world impact. B3 added an
AMBIGUOUS confidence bucket driven by `tieMargin`, but the picker in
`resolveOne` only saw ties ACROSS strategies (framework vs import vs
name-match). Within `matchByExactName`, multiple same-name candidates
went through `findBestMatch` which returned ONE node and discarded
the rest — so the textbook ambiguity case (two `ambigTarget` exports,
one no-import caller) silently resolved to one with confidence
stamped EXTRACTED. AMBIGUOUS was unreachable in production.

## Fix

`findBestMatch` now returns `{ best, runnerUp }` (the second-highest-
scoring candidate, or null when only one). `matchByExactName`
projects both nodes' proximity through the same
`proximity >= 30 ? 0.7 : 0.4` confidence rule and stamps
`tieMargin = max(0, bestConfidence - runnerUpConfidence)` on the
returned `ResolvedRef`. The `Math.max(0, ...)` clamp respects the
documented "non-negative gap" invariant — when the scoring heuristic
promotes a low-proximity bestMatch over a higher-proximity runnerUp
(e.g. same-file boost beats directory proximity), the raw delta can
go negative; the classifier still flags it AMBIGUOUS but consumers
reading metadata.tieMargin won't see a negative number.

Same-id runner-ups skipped (FTS dups, repeat exports — those
reinforce, don't contest).

The picker's existing AMBIGUOUS short-circuit
(`tieMargin < AMBIGUOUS_TIE_MARGIN`) lights up automatically — no
upstream changes needed.

## Verification

Reproducer fixture (two `ambigTarget` exports + caller without import):
  - BEFORE B8: edge stamped `confidence=EXTRACTED`, no tieMargin in metadata.
  - AFTER B8: `confidence=AMBIGUOUS`, `tieMargin=0`, `resolvedBy=exact-match`.

## Behavior change to flag

Any agent that was filtering by `minConfidence='EXTRACTED'` and
relying on same-name same-kind ambiguities passing through silently
will now see fewer rows. That's the CORRECT behavior — those rows
were guesses presented as facts — but downstream code that assumed
every EXTRACTED row was concrete needs to be aware.

## Reviewer cycle

REQUEST_CHANGES round 1: negative-tieMargin invariant violation +
stale `classifyConfidence` JSDoc on the exact-match case (recurring
reviewer-memo finding #1) + stale "tracked as B8" comment in the B3
test. All four addressed in this commit.

## Tests + eval

Suite: **1579 / 34 / 0** (+2 B8 tests: end-to-end fixture +
non-negative invariant property check). Eval: 15/15 | recall=1.00 |
mrr=0.91 | meanPayloadΔ +0.3% | within budget.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 8, 2026
…arch arc)

Three additions inspired by the 2026-05-08 code-KG paper sweep
(CodeRAG, GraphCodeAgent, GraphGen4Code, Maarleveld GNN survey,
CodexGraph, FalkorDB). Each verified absent before implementation;
all opt-in to keep default structural traversals unchanged.

#1 — In-graph similar_to edges (similarity as graph hops)
  - New EdgeKind similar_to (confidence=INFERRED, metadata.score)
  - buildSimilarToEdges reuses findSimilarViaVec; delete+insert are
    wrapped in db.transaction so the replacement is atomic
  - Migration 036 partial-index on edges(source,kind) WHERE
    kind=similar_to — guarded by sqlite_master existence check for
    the pre-016 hand-rolled migration tests
  - Surfaced as codegraph_admin({action: build-similarity-edges})
    and CLI codegraph admin build-similarity-edges --k --min-score
  - EXCLUDED_EDGE_KINDS keeps it out of default traversals; explicit
    edgeKinds bypass the filter

#2 — mode=intent search over symbol_summaries
  - FTS5 virtual table summary_fts (porter unicode61, mirrors nodes_fts)
    + INSERT/UPDATE/DELETE triggers on symbol_summaries
  - Migration 037 + parallel schema.sql entry for fresh-init path
  - bm25-ranked, optional kind/language/pathFilter filters
  - pathFilter LIKE uses canonical backslash-escape pattern matching
    queries-findings.ts:227-232 (no _/% injection)
  - Refuse-when-empty error points at codegraph summarize
  - FTS5 query parse errors caught and re-surfaced as a clear
    syntax-error message

#3 — Intra-procedural def_use edges (TS/JS/TSX/JSX)
  - New EdgeKind def_use as self-loops on function/method nodes;
    metadata carries name, defLine, useLines
  - Scope-bounded extractor in src/extraction/def-use.ts, called
    from both tsExtractFunction and tsExtractMethod
  - Skips parameters (function inputs), fields (covered by
    field_access), nested-scope vars (belong to inner function set)
  - EXCLUDED_EDGE_KINDS opt-in; no traversal helper assumes
    source != target

Schema-version assertions bumped 35 to 37 in foundation.test.ts and
pr19-improvements.test.ts.

Suite 1742/0/34 (was 1729 baseline, +13 new tests). Three reviewer
rounds: round 1 caught LIKE-escape, atomicity, field rename, FTS5
try/catch; round 2 caught a JSDoc rot from the rename and a
contradictory test assertion; round 3 APPROVE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 8, 2026
…ge-level diff (eval arc)

Three additions with pre-set hypotheses + post-impl measurement, per
the user's "evaluation and measurement how much useful it will
actually be" brief. Pre-impl thresholds were committed before code so
post numbers couldn't be motivated. One feature deferred with honest
documented reasoning.

#1 Selective parse-cache invalidation
  - clearParseCache(qb, language?) returns deleted-row count
  - CLI: --clear-parse-cache [language] (boolean OR optional value)
  - MCP: clearParseCache: boolean + clearParseCacheLanguage: string
    (typed schema doesn't allow oneOf — split into two args; language
     wins when both set)

  Hypothesis (pre-set): >=3x wall-clock speedup, <30s absolute
  Measured (codegraph repo, 463/498 = 93% TS in parse cache):
    - TS-only clear: 3.70s wall / 3.77s user-CPU
    - Full clear:    3.87s wall / 8.13s user-CPU
    - Wall ratio: 1.04x (parallelism masks the work delta)
    - User-CPU ratio: 2.16x more work for full clear
  Verdict: speed threshold NOT MET on monoglot testbed. Real value
  here is correctness (targeted invalidation when an extractor
  changes). On polyglot repos at 50% target-lang ratio, expected ~2x
  wall-clock speedup.

#1.5 Docstring source for mode='intent' (user follow-up: "make intent richer")
  - Migration 038: docstring_fts FTS5 over nodes.docstring + INS/UPD/DEL
    triggers (with WHEN docstring IS NOT NULL AND != ''); schema.sql parity
    for fresh-init path; pragma_table_info guard for pre-016 hand-rolled
    migration test setups
  - _search-intent.ts queries BOTH summary_fts AND docstring_fts,
    UNIONs by node_id keeping best rank, surfaces 'via summary' /
    'via docstring' provenance label per result
  - Empty-corpus check fixed: FTS5 external-content COUNT(*) reads
    from the source table, not the actual indexed rows — switched to
    direct content count

  Hypothesis (pre-set): >=30% recall increase
  Measured (20 hand-picked intent queries, codegraph corpus):
    - Summary-only hits: 22
    - Docstring-only hits: 34
    - Combined unique-node hits: 56 (2.55x = 155% improvement)
  Verdict: well above threshold. Best ROI of this arc — docstrings
  cover 26% of nodes vs summaries' 18%, AND describe intent verbatim
  in JSDoc / Python docstrings / Go comments.

#2 Edge-level diff in compare_to_ref
  - EdgeDelta / EdgesDelta types
  - diffEdgeLists keyed by stable
    (srcQualName::srcKind=>tgtQualName::tgtKind::edgeKind) so line
    shifts don't surface as spurious changes
  - Cross-file edges out of scope (compareToRef is per-file)
  - Opt-in via includeEdges: true (CLI: --include-edges)
  - Renderer surfaces source -> target node IDs (round-1 reviewer
    finding: discarded data; fixed)

  Hypothesis (pre-set): >=30% additional info (>=20% loose)
  Measured (HEAD vs HEAD~3 on this branch):
    - Node changes: 21+11+308 = 340
    - Edge changes (NEW signal): 83+11 = 94
    - Files surfaced: 22 -> 27 (+5 visible only via edge changes)
    - Information gain: 94/340 = 27.6%
  Verdict: >=20% threshold MET; just below >=30% strict. The 5
  newly-surfaced files (pure-edge changes) are the qualitative win.

#3 Stack-graphs cross-language resolver — DEFERRED
  Survey of the codegraph corpus: monoglot TypeScript. child_process
  invocations are ~30 git execFileSync calls; no Python/Ruby/Go
  spawn targets. Dynamic imports are NPM packages. string_imports
  table is dominated by test fixtures. Conclusion: this corpus
  lacks the ground-truth cross-language references needed to
  measure a scope-graph rule meaningfully. Building infrastructure
  without testable signal would be speculative abstraction (CLAUDE.md
  anti-pattern). Stays on the borrow-ideas backlog as the
  long-horizon item; not blocked, just not session-feasible without
  a polyglot testbed.

Schema-version assertions bumped 37 -> 38 in foundation.test.ts and
pr19-improvements.test.ts.

Suite 1746/0/34 (was 1742; +4 new tests). Two reviewer rounds: round
1 caught the edge-delta formatter discarding source/target IDs;
round 2 APPROVE. Info-level note tracked for later: summary_fts_au
trigger could mirror the docstring_fts_au SELECT-WHERE guard pattern
for consistency.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 8, 2026
…enrichment

User asked: "is there other hot data we can apply intent search on, and
is there something missing we can apply on hot data?" Answer: test
descriptions mined from it/test/describe(...) calls in test files.
"Hot" = extracted at index time, no LLM pass needed. Plus enrich
codegraph_tests_for and codegraph_node with the same data while we're
in the area.

What shipped
============

#1.6a Schema + mining hook
  - Migration 039: test_names (id / file_path / line / description, FK
    on files.path, ON DELETE CASCADE) + test_names_fts FTS5 with
    porter unicode61 tokenizer + INS/UPD/DEL triggers; schema.sql
    parity for fresh-init path; sqlite_master guard for pre-016 hand-
    rolled migration tests
  - src/index-hooks/test-names.ts: regex-mining hook (NOT tree-sitter
    — patterns are simple enough; ~5% missed cases acceptable per
    docstring justification). Matches it/test/describe/context/
    fit/fdescribe/fcontext/xit/xtest/xdescribe with optional
    .skip/.only/.each/.todo/.concurrent/.sequential/.failing modifiers.
    Idempotent: full rescan on indexAll, per-file rescan on sync.
    Runs after TESTS_EDGES_HOOK so test->subject edges are fresh.

#1.6b mode='intent' third source
  - _search-intent.ts queries summary_fts + docstring_fts +
    test_names_fts. Symbol-anchored sources merge by node_id keeping
    best rank; test_names render in a separate "Test-description
    matches" section because they anchor to file:line not a node id.
  - kind/languageFilter skip the test_names branch (test_names don't
    have a node kind). pathFilter applies via the canonical
    LIKE-escape pattern from queries-findings.ts (memo item #3).
  - Empty-corpus refusal + no-match messages updated to mention all
    three sources.

#1.6c codegraph_tests_for enrichment
  - TestRow.testDescriptions: Array<{line, description}>, fetched via
    fetchTestDescriptionsForFile (50-row cap per file).
  - appendBucket renders up to 8 descriptions per file with overflow
    note ("...and N more assertions").
  - handleFilesMode (the files=[...] path) gets the same enrichment.

#1.6d codegraph_node enrichment
  - When fetching a single symbol's details, surface up to 5 mined
    test assertions from any test file linked to the symbol's file
    via the `tests` edge. Anchored by file (not by precise symbol —
    label says "may include broader-scope tests").

Schema-version assertions bumped 38 -> 39 in foundation.test.ts and
pr19-improvements.test.ts.

Measurement
===========

Coverage on this codegraph repo: 2169 test descriptions across 133
test files mined.

Per-source hits across 10 hand-picked queries:
  - summary-only:  22 hits
  - docstring:     80 hits
  - test_name:    101 hits
  - Combined:     203 hits
  - Gain from test_name: 99% additional signal vs summary+docstring

Vocabulary specialization: test_names dominate on assertion-style
queries ("rejects", "returns empty", "refusal error"); docstrings
dominate on implementation-style queries ("parses tree sitter",
"writes embeddings"). Sources are complementary, not redundant.

Suite 1751/0/34 (was 1750; +1 for the new node-enrichment test;
total +5 across the test-names test file). Typecheck clean.

Reviewer: APPROVE with 3 info-level findings. Addressed the missing
codegraph_node enrichment test in this commit; the two style notes
(canonical FK-safe insert pattern via `WHERE EXISTS`, transaction
wrap on clearAllTestNames for very large repos) are tracked as
follow-ups.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 8, 2026
…ulk codegraph_at_range

Three additive surface extensions to cut investigation round-trips:

#1 codegraph_node accepts `symbols: string[]` (up to 20) alongside
   the existing single-symbol `symbol`. Duplicate inputs that resolve
   to the same node are merged. Saves N round-trips when checking a
   list of suspect symbols.

#2 codegraph_node accepts four new inline-expansion flags
   (`includeCallers` / `includeCallees` / `includeBiomarkers` /
   `includeTests`). When set, the response folds in the corresponding
   tool's answer under each card, capped per-section to keep token
   pressure low (10 callers, 10 callees, 5 findings, 5 test files).
   Collapses 3-5 round-trips into one for "tell me everything about X"
   patterns; the dedicated tools remain available for the full lists.

#3 codegraph_at_range accepts `ranges: Array<{file, startLine, endLine}>`
   (up to 100) alongside the single-range form. Output renders one
   subsection per range so the agent can map results back to specific
   diff hunks. PR review with N hunks goes from N+1 calls to 1.

All three paths are additive — the legacy single-input shapes are
preserved verbatim. Backward-compat is locked in by the existing tests
plus 19 new ones (8 for node multi/expansions, 5 for bulk at-range,
+6 from refactoring). Docs updated in CLAUDE.md, README.md, and the
server-instructions playbook.

Suite 1772/0/34. No schema migration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 8, 2026
Round-2 reviewer pass (across 9aade08..2c7bf25) found four items.
Two request_changes (correctness), two info-level (quality):

## stopPool dead variable + wrong condition

src/llm/pool-controller.ts:268-297. The local `stopped` variable was
declared, conditionally incremented in the second pass, but never
read (the return statement uses `all.length - alreadyDead`). Worse,
the conditional `else if (!alreadyDead)` was logically wrong (means
"only count gracefully-exited members when total alreadyDead is
zero") — would have become a real bug if anyone refactored the
return to use the counter. Drop both: the arithmetic in the return
already gives the correct result.

## validateConfig JSDoc overclaim (recurring scrutiny pattern #1)

src/installer/llm-setup-cli.ts. JSDoc said "ping /v1/models AND
send a one-token sample call"; implementation only did the sample
call. Fix is to MAKE THE JSDoc TRUE: extracted `probeModelsEndpoint`
and gate both `pingChat` and `pingEmbed` on it. Also bumps the
per-call timeout 10s → 30s; cold-start of a 30B-class model from
disk can hit 20+ seconds on first byte, and a false-positive
"validation failed" during first-run setup is worse UX than waiting.

## isPidAlive PID-recycle caveat (info)

src/llm/pool-controller.ts:82-91. JSDoc now mentions the false-
positive risk on long-uptime systems and points at the `started`
timestamp as the secondary heuristic for callers that need to
distrust the result.

## spawnServer fd leak (info)

src/llm/pool-controller.ts. Added `fs.closeSync(log)` after
`child.unref()`. The child inherits the fd; the parent doesn't need
its reference. CLI process is short-lived so the leak resolved on
exit, but correct hygiene matters and the fix is one line.

Suite: 1923 / 0 / 34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
Three friction items + one test regression follow-up to e11b2f9.

## colbymchenry#37 — pool-aware llm auto-detect

probeKnownLocalServers() in src/llm/detect.ts probed only the 4
hardcoded legacy endpoints (Ollama 11434, llama.cpp 8080, MLX 8081,
LM Studio 1234). When the user spawns a recommended-tier pool via
"codegraph llm pool up" (chat 8085-8087, embed 8090-8091), the auto-
detect missed it and reported "No local LLM detected" even though a
healthy pool existed on the same machine.

Fix: extend probeKnownLocalServers(probeTimeoutMs, projectPath?).
When projectPath is supplied, read .codegraph/pool.json via
readPoolState() (from pool-controller.ts) and probe each member's
endpoint additively to — not replacing — the legacy 4. Returns a new
"pool: ProbedPool | null" field; null when projectPath is unset or
no pool.json exists; arrays empty when pool.json lists members but
none respond. The wizard in src/installer/llm-setup.ts surfaces pool
members as pool-chat-i / pool-embed-i options prepended before
legacy entries in buildChatProviderOptions /
buildEmbeddingProviderOptions; resolveChatChoice /
resolveEmbeddingChoice parse the new keys via regex dispatch.

New test file __tests__/llm-detect-pool-aware.test.ts (6 cases): no
projectPath, missing pool.json, reachable chat member, reachable
embed member, unreachable members, both reachable together.

Static import for readPoolState — no circular dep
(pool-controller.ts does not import detect.ts).

## colbymchenry#45 — bin/codegraph.ts:1251 legacy alias typo

Error message referenced config.llm.summarizeLlmModel as a "legacy"
field. That identifier doesn't exist; the real legacy alias per
LEGACY_LLM_FIELD_MAP in src/config.ts is the flat-field
config.llm.chatModel. Fix: replace the parenthetical hint accordingly.

## colbymchenry#46 — stale config field names in agent-facing error messages

Errors in src/mcp/tools/_search-semantic.ts (lines 74, 137),
src/mcp/tools/ask.ts (line 110), src/mcp/tools/dead-code.ts (line 30)
referenced LEGACY field names (config.llm.chat, config.llm.askChat,
config.llm.embeddings) as if they were canonical — these strings are
sent to the user when LLM is unavailable, so a user following the
message ends up configuring stale field names. Fix: lead with the
canonical purpose-suffixed names (summarizeLlm / askLlm / embeddingLlm),
demote legacy aliases to a footnote ("legacy ... also accepted").

## Test regression fix

Commit e11b2f9 renamed the status output label "Chat model:" to
"Summarize model:" in src/mcp/tools/status.ts:596 but did not update
__tests__/mcp-llm-visibility.test.ts which still asserted on the old
wording. Three assertions updated (lines 70, 99, 127) plus one
describe-text aligned (askChat differs from chat -> askLlm differs
from summarizeLlm).

## Reviewer pass

Independent reviewer (with .claude/reviewer-memo.md prepended) caught
one item under memo recurring scrutiny area #1 (docstring rot): the
module-level JSDoc at src/llm/detect.ts:1-13 still described the
module's job as "detect Ollama, llama.cpp, MLX and LM Studio
instances on conventional ports" without mentioning pool-aware
probing. The function-level JSDoc on probeKnownLocalServers was
updated by the implementer but the file header was not. Fixed in
this commit.

## Verification

- 10 files changed, +180/-22, plus 1 new test file (189 lines, 6 tests)
- npm run typecheck — clean
- 82/82 tests pass across 7 LLM-related test files
- probeKnownLocalServers call sites: exactly 2 (definition +
  src/installer/llm-setup.ts:83 caller)
- New exports ProbedPool + (now-exported) ProbedServer both have
  concrete in-tree callers — verified per reviewer-memo item #7

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
…polish items

Eight friction-tracker items addressed in parallel by sub-agents (2 Haiku,
1 Sonnet); reviewer caught one real correctness edge case (bucket overlap
on degenerate fresh-index shapes) plus two info items, all addressed in
this commit.

## colbymchenry#21 — at_range cost-benefit JSDoc

Doc-only update to src/mcp/tools/at-range.ts. Tool description and
JSDoc now state "pays off most on dense files (100+ symbols) and
multi-range bulk lookups; for tiny preview fetches on small files,
raw `head -N` is comparable." No code change.

## colbymchenry#25 — blame surfaces rename detection inline

src/git-utils.ts gains a new helper `getFileFollowEarliestTs` that runs
`git log --follow --format=%aI -- <path>` (5 s timeout, ISO timestamp).
src/mcp/tools/blame.ts compares the rename-aware oldest commit against
the line-range-only timeline's oldest. When `--follow` reaches further
back, appends a warning that the timeline truncated at the file's
rename and points at `git log --follow <file>` for the full history.
Edge cases handled: not-a-git-repo, timeout, empty timeline.

Test approach uses `vi.spyOn` to mock pre-rename history because
real fixtures are unreliable: modern git's `git log -L` follows
renames via content-similarity tracking, making a deterministic
black-box rename-fixture impossible.

## colbymchenry#26 — hotspots split into 3 mutually-exclusive categories

src/db/queries-history.ts gains `getCategorizedHotspots` and
src/mcp/tools/hotspots.ts gains a `category: 'risk' | 'maintenance'
| 'brittle' | 'all'` arg (default 'risk' for backward compat).
Thresholds use 75/25 percentile rather than hardcoded magic
numbers — they adapt as the project grows.

Buckets:
- risk        : high centrality AND high churn — where bugs hide
- maintenance : high churn AND not-high centrality — refactor target
- brittle     : high centrality AND not-high churn — stable critical

Reviewer-caught correctness bug: original filters used `<= low` for
the secondary axis, which collapsed buckets when high == low (fresh
index where centrality is uniformly zero, or repos where every file
has identical churn). A file at the threshold could appear in both
risk AND maintenance simultaneously. Fixed by switching maintenance
and brittle to `< highThreshold`, making them strictly disjoint
even on degenerate inputs. Also added a more-hint when any section
hit the per-category cap (the existing `category='risk'` path
already had this; `category='all'` now mirrors).

New `__tests__/hotspots.test.ts` (4 cases) covers all-section
rendering, single-category dispatch, and the backward-compat
default path.

## colbymchenry#27 — search centrality:high differentiates "hook hasn't run"
       vs "no node met the threshold"

src/mcp/tools/search.ts. `probeCentralityFilterCulprit` now runs a
sub-millisecond probe `SELECT 1 FROM nodes WHERE centrality IS NOT
NULL LIMIT 1` (uses the existing `idx_nodes_centrality` index). When
ALL nodes have NULL centrality the agent gets the existing "centrality
hook hasn't run — run codegraph index" hint. When SOME nodes have
centrality but none cleared the filter, a different hint suggests
relaxing the threshold. Two-case hint instead of one.

## colbymchenry#28 — search exact promotes multi-token-query warning to pre-result

src/mcp/tools/search.ts. `buildConceptHintIfNeeded` now returns
`{ preResult, postResult }` instead of a single string. When the
query splits into 2+ space-separated non-qualified tokens (likely
"multiple symbol names"), the agent gets a leading hint to call
search per name OR use codegraph_explore — BEFORE the result list
rather than buried after.

Field-qualified tokens (`kind:function lang:typescript`) and
single-free-token queries are unchanged.

## colbymchenry#33 — callers on "constructor" with no callers explains
       the instantiates-edge model

src/mcp/tools/callers.ts. When the resolved symbol is
`kind=method && name=constructor` AND the callers list is empty,
appends a one-line note: "constructors are invoked via
`new ClassName(...)`, which graph-edges as `instantiates` on the
parent class. To find construction sites, run codegraph_callers on
the enclosing class instead of 'constructor'." Both the multi-match
and single-match paths got the note (guarded by the same
kind+name+empty check). Constructors WITH callers (e.g. via super())
render normally — no false positive.

## colbymchenry#35 — node.symbol tie-break prefers non-fixture, then centrality

src/mcp/tools/symbol-resolver.ts. `pickFromMultipleExactMatches`
now filters out fixture paths first (falls back to all-fixture when
that's all that matches), then sorts by centrality DESC (NULL → 0).
A `helper` symbol that exists in both `src/core.ts` and
`docs/test-beds/fixture.ts` resolves to `src/core.ts` as the displayed
primary. Tier #3 (last_touched_ts) deferred — data not in the
resolver's existing query.

Reviewer-caught DRY issue: the fixture-path regex set was duplicated
between symbol-resolver.ts and dead-code.ts (introduced by parallel
sub-agents on the same brief). Extracted to `isFixturePath` in
src/mcp/tools/shared.ts; both consumers now import the single source.

## colbymchenry#49 — getSummaryCoverage denominator threading (3 call sites)

src/bin/codegraph.ts (lines 348, 1461) + src/mcp/tools/status.ts
(line 440). All three pass `SUMMARIZABLE_KINDS` to getSummaryCoverage
to match the canonical pattern from the previously-fixed
_search-intent.ts:218. Without this, the helper falls back to
COUNT(*) which inflates the denominator with parameters / imports /
file nodes — its own JSDoc explicitly warns against this.

## Test re-additions

Sub-agent #1 deleted its own test files for colbymchenry#33 and colbymchenry#35 (a brief
misread — "DO NOT commit" was interpreted as "DO NOT leave tests in
repo"). Re-added as
`__tests__/mcp-callers-constructor-and-fixture-tiebreak.test.ts`
covering: constructor-with-no-callers note appears,
non-constructor-method note absent, name-collision picks non-fixture
primary.

## Verification

- 15 modified files + 2 new test files, +619/-55
- npm run typecheck — clean
- 74/74 tests pass across 9 LLM/search/hotspots-related test files
- New exports: `isFixturePath` (shared.ts), `getCategorizedHotspots`
  (queries-history.ts), `getFileFollowEarliestTs` (git-utils.ts) —
  all have concrete in-tree callers in the same diff per
  reviewer-memo item #7

Reviewer pass with .claude/reviewer-memo.md prepended caught:
- (request_changes) bucket-exclusivity edge case → fixed
- (info) isFixturePath duplication → deduped
- (info) category='all' missing more-hint → added

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
Reviewer pass on b6d06f1 caught a stale tense-strong claim in the
module docstring (recurring scrutiny pattern #1) and a needless
4x over-fetch heuristic in the HNSW query path:

- Module docstring updated: "one HNSW KNN per unsummarised node
  (falling back to vec0 / in-memory cosine when hnswlib-node is
  unavailable)" matches the new findNearestNeighbors fallback chain.

- findNearestNeighbors no longer passes embeddingModel to hnsw.query.
  The per-pass index built by buildHnswForPropagator is already
  filtered to one model, so HnswIndex.query's model post-filter would
  reject nothing while paying the 4x over-fetch cost. (Distinct from
  similar-edges 8.4, where a single per-dim index can carry multiple
  models.)

- buildHnswForPropagator JSDoc now documents the model:dim 1:1
  invariant + notes the empty-rows branch is unreachable via
  runNeighborPropagator.

Suite remains 2100/0/34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
Reviewer pass on 3e0bb32^..ac857bd caught two docstring rots
(recurring scrutiny pattern #1) and two info-level robustness nits:

1. RerankerClient.scoreOne JSDoc said "Returns NaN when not
   configured" but the underlying rerank() throws LlmEndpointError
   first, so the ?? NaN fallback is unreachable on that path.
   Tightened the JSDoc to match actual behavior.

2. searchHybridBlendWithEmbeddings JSDoc described embed +
   expansion + RRF but missed the new reranker pass landed in this
   stack. Added a clause covering the rerankerLlm-gated reorder.

3. rerankerCache stored the rejected promise from a failed model
   load forever — a transient HF rate-limit or offline disk-full
   would permanently disable reranking for the rest of the process
   lifetime. Added a `.catch` that evicts the cache entry on
   rejection so the next call retries cleanly.

4. maybeRerankSemantic now skips reranking for hits whose candidate
   text builds to the empty string (no name + signature + docstring
   + summary). Passing '' to a cross-encoder produces a meaningless
   score that would shuffle the hit's rank unpredictably; better
   to keep the original cosine ordering for that subset.

Suite remains 2195/0/34. Typecheck clean.

Reviewer also flagged a coverage gap on isReachable's load-throws
branch that can't be tested with hoisted vi.mock — left as a
follow-up since the catch is exercised by the docstring contract
and integration tests cover the practical paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
Reviewer pass on 6fbdc76..5e0f9c1 caught two request_changes
issues — both instances of recurring scrutiny patterns:

1. **Playbook gap** (pattern colbymchenry#9, CLI/MCP alignment): the new
   `codegraph_review_neighbors` tool and the new
   `codegraph_admin({action: 'embed-only'})` action were absent
   from `SERVER_INSTRUCTIONS`, so agents relying on the playbook
   for tool selection wouldn't know they exist. Added bullets to
   the use-case map AND the PR-review chain AND the tier-1 list.

2. **Docstring rot** (pattern #1): `cgRunIndexUnderLock`'s JSDoc
   still claimed it "runs the extractor, resolve, post-index hooks,
   DB maintenance" with no mention of the embedOnly short-circuit
   added in 6aa0bcc. Added a paragraph documenting the skip path
   and clarifying that the embed pass itself is caller-owned.

Reviewer also flagged the tautological co_changes test in
embed-only.test.ts as info-level — kept as-is since the centrality
test carries the meaningful "postHooks skipped" signal and the
co_changes assertion is a cheap regression guard.

Suite remains 2215/0/34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
New module src/llm/secrets-detector.ts. Heuristic-first detector that
identifies functions handling API keys, JWTs, passwords, AWS access
keys, env-secret reads, PII patterns, and hardcoded long-token literals.
No ML model — curated regex pattern set with weighted contributions
into a 0-1 score and named `signals[]` for downstream filtering.

Public API:
- detectSecretsHandling({ name, signature, body, summary }) → { score, reasons[], signals[] }
- 8 categorical signals: api-key-name, jwt-pattern, password-name,
  crypto-secret, aws-access-key, env-secret-read, pii-name, literal-token
- Test-name guard down-weights findings on test fixtures by 0.5x
- Same category fires once even on multiple matches (no double-count)

32 vitest cases across 5 describe groups. Friction notes captured:
JWT literal regex needed `.` in the body charset; test-name guard
extended to camelCase patterns (`testApiKeyAuth`, not just `test_*`).

Integration into the biomarker registry as a new findings type lands
separately (tracked as Stage 7 #1 follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 9, 2026
Wires the heuristic secrets/PII detector into the biomarker pipeline
as a new findings type:

- New 'secrets_handling' entry in BIOMARKER_NAMES.
- New evaluateSecretsHandling(node) in engine.ts — wraps the detector
  and maps score ladders to severity:
    score >= 0.7 → error
    score >= 0.5 → warning
    score >= 0.3 → info
    else null
- Detail field carries the {signals, reasons} arrays so downstream
  consumers can show "why this fired".
- New evaluateFileSecretsHandling per-file pass in biomarkers/index.ts;
  runs after evaluateConstantStaleDocs in the analyseFile flow.
- BIOMARKER_CACHE_KEY bumped v4 → v5 so existing indexes self-heal
  and emit the new findings on next sync.
- mcp-reindex.test.ts assertion updated to the new key.

v1 scope: name + signature + docstring + summary as detector input.
Body inspection (literal AWS keys / hardcoded JWTs in source) is a
future enhancement once the orchestrator threads bodies into rule
contexts. Even without bodies, the regex set catches name-based +
signature-based + docstring-based signals.

Suite remains 2374/0/34. Typecheck clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 10, 2026
Threads function bodies into the secrets_handling biomarker via
the orchestrator's existing `loadAnalysableSource` read. Each node's
body is sliced from the file source by 1-based startLine/endLine and
passed to `detectSecretsHandling` alongside name + signature +
docstring.

Body-only signal upgrades:
- Literal AWS access keys (AKIA...) inside source caught.
- Hardcoded JWT literals (eyJ...) inside source caught.
- `process.env.SECRET_*` / `ENV['TOKEN']` env-secret reads caught.
- Long base64-ish hardcoded tokens caught.

Without body, the v1 integration only fired on name / signature /
docstring evidence — missed the most concrete cases.

evaluateSecretsHandling now accepts an optional `body` field; when
absent (legacy callers), defaults to '' (current v1 behaviour).
biomarkers/index.ts threads `prep.src` through to the new pass and
slices via a small helper.

Suite remains 2433/0/34. Typecheck clean. BIOMARKER_CACHE_KEY stays
at v5 — same biomarker name, same finding shape, just a richer
detector input. Existing index re-evaluates on next sync via the
content-hash gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 11, 2026
…olved-by-FTS

Three more friction closures from the 2026-05-11 workout:

- src/extraction/index.ts + src/db/queries-files.ts: heal-marked
  files now propagate through the git fast-path. The
  EXTRACTION_LOGIC_VERSION self-heal sets needs_reextract=1 on every
  file, but the git fast-path eoCollectGitChanges only iterated
  files git reported as changed — so heal-marked files with no
  diff were silently skipped in any git-tracked project. Adds a
  getFilesNeedingReextract(qb) query and unions the result into
  the change set after the git fast-path returns. Full-scan path
  was already correct via eoClassifyFileChange; no change there.
  Closes friction colbymchenry#22 (surfaced while validating the cluster fix
  in 6957b74 — the original needs_reextract=1 → 620 of 627 stayed
  pending after sync, requiring admin index --force).

- src/mcp/tools/entry-points.ts: new "MCP tools" bucket scans for
  exported `*_TOOL` constants whose initializer signature contains
  `name: 'codegraph_*'` and surfaces them with the parsed canonical
  tool name. Closes friction colbymchenry#13 / cluster symptom #6: every MCP
  tool dispatcher previously fell through every bucket because
  routes/CLI commands/CLI-files/public-exports each had a different
  filter mismatch. Five buckets now (HTTP routes, CLI commands,
  MCP tools, CLI files, public exports). __tests__/mcp-entry-points
  grows 5 → 6 tests with the new bucket fixture.

- Friction colbymchenry#9 (semantic/intent search can't bridge tool-name →
  handler-name): resolved by existing FTS5 path. Investigation
  showed nodes_fts indexes the `signature` column at bm25=2, so
  searching `codegraph_X` already returns the matching XXX_TOOL
  constant via signature substring match. The morning-workout
  "No results" observation was likely a stale-index transient
  (pre-cluster-fix). Added a regression test in mcp-search-family
  to lock in the behavior.

Test impact: 2633 → 2635 (one new MCP-tools-bucket test, one new
canonical-tool-name regression test). Type-check clean.

Closes:
- colbymchenry#22 EXTRACTION_LOGIC_VERSION heal flag honored
- colbymchenry#13 entry_points 5th bucket for MCP tool dispatchers
- colbymchenry#9 already-resolved-by-FTS — locked in with regression test

Still open:
- colbymchenry#8 / cluster #1 (context whiff): graph edge now exists, but the
  context-builder ranker still surfaces ToolHandler/ToolResult/
  ToolCtx instead of dispatchers. Needs a context-ranker boost
  for `codegraph_*` queries. Logged as colbymchenry#24 follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 11, 2026
Closes friction colbymchenry#8 / cluster #1: codegraph_context whiff on the
PRIMARY natural-language tool when the query mentions a canonical
MCP tool name.

Symptom: query like "how does codegraph_find dispatch across
by:name/content/env/sql axes" returned generic ToolHandler /
ToolResult / ToolCtx plumbing or unrelated SQL extractor symbols
(both match `name`/`content`/`sql` tokens in the trailing English).
The actual XXX_TOOL constant got pushed below the cap by
text-search noise even though FTS5 finds it via signature.

Fix: new `cbPromoteMcpToolMatches` helper. After the standard
extractor + prefix passes, scan extracted query symbols for the
canonical `codegraph_X` shape; for each hit, look up `constant`
nodes whose initializer signature contains `name: 'codegraph_X'`
and append them to exactMatches with a high base score
(MCP_TOOL_PROMOTION_SCORE = 100). The score puts the registered
tool above text-search and centrality-boosted noise so it lands
in the top entry-point slots.

Implementation cost: one extra `getNodesByKind('constant')` call
per query, bounded by the count of top-level exported constants
(~hundreds in a typical codebase). No-op when no `codegraph_*`
token is in the extracted symbol list.

Tests: __tests__/context.test.ts grows 18 → 19 cases. Fixture has
DEMO_TOOL plus three buildSql* helpers; without promotion the SQL
helpers outrank DEMO_TOOL in the entry points.

Test impact: 2635 → 2636 (one new MCP-tool-promotion test).
Type-check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv added a commit that referenced this pull request May 11, 2026
Independent reviewer (REQUEST_CHANGES verdict) flagged two
stale-docstring instances in entry-points.ts and two info-tier
defensive improvements. Closing all four.

REQUEST_CHANGES (canonical "docstring rotting" pattern from
.claude/reviewer-memo.md scrutiny area #1):

- src/mcp/tools/entry-points.ts:29 — file-level JSDoc said "All
  four buckets filter out test paths" after the MCP tools bucket
  was added. Now reads "All five buckets".

- src/mcp/tools/entry-points.ts:258 — the agent-visible
  `definition.description` (surfaced on every codegraph_entry_points
  metadata fetch) said "Four buckets (capped at 20 each): HTTP
  routes, CLI commands, CLI files, public exports". Now lists
  five buckets including the new "MCP tools (XXX_TOOL constants
  registered via `definition: { name: 'codegraph_*' }`)" entry.

INFO improvements:

- detectMcpToolName: the substring guard required a literal space
  ("name: 'codegraph_") while the regex used `name:\s*` (zero or
  more whitespace), so `name:'codegraph_X'` (no space) was
  silently missed. Loosened the guard to two cheap substring
  checks (`includes('name:')` + `includes('codegraph_')`) — false
  positives are filtered by the regex anyway.

- MCP_TOOL_PROMOTION_SCORE: bumped 100 → 200. Reviewer flagged
  that `applyCentralityBoost` can amplify competing hubs by up
  to ~6× (1 + 5*sqrt(c)), so a node with FTS=40 + centrality=0.4
  could reach ~166 and outrank the promoted constant at 100.
  200 leaves headroom for the realistic worst case; the JSDoc
  documents the math + the centrality-1 hub edge case where a
  truly exceptional hub could still outrank (which would be a
  legitimate entry point anyway).

Test impact: 2636 → 2636 (no behavior changed for the existing
tests — copy update, guard loosened, score bumped). Type-check
clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant