feat(mcp): steer agents to structural tools (#626) #627
Agents on the codedb MCP surface default to search -> read -> edit and skip the structural tools (symbol/callers/deps/outline), so the code graph goes unexercised. Make the structural path the path of least resistance.

- Reframe tool descriptions + server instructions to prescribe the structural tools first and cast codedb_search as a substring/phrase fallback.
- Runtime nudge on search: a bare identifier that resolves to an indexed symbol prepends a one-line pointer to codedb_symbol/codedb_callers (text output only, skipped for format=json).
- Runtime nudge on read: a whole-file read (>=400 lines, no range) prepends a pointer to codedb_outline; wired into both the cached and uncached paths.
- Tests (issue-626) cover the gating logic: isBareIdentifier and fullFileReadHint.

Closes #626. (#623 closed separately as a duplicate; its distinct loop/redundancy-detection guardrail is not addressed here.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
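The two gates named above can be sketched as pure functions. This is a minimal illustration, not the code in this PR: the signatures, the identifier rule (ASCII letter or underscore first, then letters, digits, or underscores), and the hint wording are assumptions. Only the function names, the 400-line threshold, and the "no range" condition come from the description. It assumes Zig 0.11 or later for the indexed `for` syntax.

```zig
const std = @import("std");

/// Hypothetical gate: true when `query` is a single identifier-shaped token.
/// Assumed rule: first byte is an ASCII letter or '_', the rest are letters,
/// digits, or '_'. Spaces and punctuation disqualify the query.
fn isBareIdentifier(query: []const u8) bool {
    if (query.len == 0) return false;
    for (query, 0..) |c, i| {
        const ok = switch (c) {
            'a'...'z', 'A'...'Z', '_' => true,
            '0'...'9' => i > 0, // digits are not allowed in first position
            else => false,
        };
        if (!ok) return false;
    }
    return true;
}

/// Threshold taken from the PR text (>=400 lines, no range).
const full_read_hint_min_lines: usize = 400;

/// Hypothetical gate: returns an advisory line for a large whole-file read,
/// or null when the read is ranged or the file is short. Hint text is made up.
fn fullFileReadHint(line_count: usize, has_range: bool) ?[]const u8 {
    if (has_range or line_count < full_read_hint_min_lines) return null;
    return "hint: large file; codedb_outline lists its symbols first\n";
}

test "gates fire only on the intended inputs" {
    try std.testing.expect(isBareIdentifier("make_bytes"));
    try std.testing.expect(!isBareIdentifier("def content"));
    try std.testing.expect(fullFileReadHint(399, false) == null);
    try std.testing.expect(fullFileReadHint(400, false) != null);
}
```

Keeping both gates free of I/O and allocation is what makes them testable in isolation, which matches the "Tests cover the gating logic" bullet.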
Benchmark Regression Report
Thresholds: 10.00% and 50,000 ns absolute delta
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 14ef8b3328
        .fuzzy = false,
        .max_results = 1,
    };
    const results = explorer.searchSymbols(spec, alloc) catch return;
Avoid full symbol scans before every search
When codedb_search receives any bare identifier in text mode, this calls searchSymbols just to decide whether to print a hint. searchSymbols is not an exact-name O(1) lookup; it iterates the symbol index and then scans outlines when there is no match (src/explore.zig around the symbol-index/outlines loops), so common searches for local variables or config words that are not definitions now pay a full symbol/outline scan before the normal content search. This is on the benchmark-critical search path for large repos; use a direct symbol-index probe or another cheap existence check for the nudge.
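One shape the "direct symbol-index probe" could take is a single hash lookup on a name-keyed map. The sketch below is only an illustration under stated assumptions: `SymbolIndex`, its `by_name` field, and `hasSymbol` are hypothetical names, and nothing here reflects how src/explore.zig actually stores symbols. `std.StringHashMap` and its `contains` method are standard-library calls.

```zig
const std = @import("std");

/// Hypothetical stand-in for an exact-name symbol index.
/// The value type (a symbol id) is an assumption for illustration.
const SymbolIndex = struct {
    by_name: std.StringHashMap(u32),

    /// Existence check for the nudge: one hash lookup, no fallback scan
    /// over outlines when the name is missing.
    fn hasSymbol(self: *const SymbolIndex, name: []const u8) bool {
        return self.by_name.contains(name);
    }
};

test "probe answers without scanning" {
    var index = SymbolIndex{
        .by_name = std.StringHashMap(u32).init(std.testing.allocator),
    };
    defer index.by_name.deinit();

    try index.by_name.put("make_bytes", 1);

    try std.testing.expect(index.hasSymbol("make_bytes"));
    try std.testing.expect(!index.hasSymbol("timeout")); // miss stays cheap
}
```

The point of the design is that a miss costs the same as a hit, so searches for local variables or config words do not pay for the hint decision.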
    }
    // Issue #626: nudge toward the structural tools when the query is a bare
    // symbol name. Text output only — would corrupt the format=json payload.
    if (!json_fmt) appendSearchSymbolNudge(alloc, explorer, query, out);
Keep the search result header first
This writes the nudge before the normal N results for ... header. MCP summaries for codedb_search parse the first output line to display the result count, so an exact-symbol search now produces a bogus summary like ↪ results instead of the actual count for clients that show the summary block. Emit the nudge after the search header/results or update the summary parser to skip advisory lines.
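The second option, a summary parser that skips advisory lines, might look like the sketch below. It is an assumption-laden illustration: `firstSummaryLine` and the `"hint: "` prefix used in the test are hypothetical, and the real summary code and nudge prefix may differ. The `std.mem` calls are standard-library functions.

```zig
const std = @import("std");

/// Returns the first line of `output` that does not start with
/// `advisory_prefix`, or null when there is no such line.
/// An empty prefix disables skipping, so the first line is returned.
fn firstSummaryLine(output: []const u8, advisory_prefix: []const u8) ?[]const u8 {
    var rest = output;
    while (rest.len > 0) {
        const end = std.mem.indexOfScalar(u8, rest, '\n') orelse rest.len;
        const line = rest[0..end];
        const is_advisory = advisory_prefix.len > 0 and
            std.mem.startsWith(u8, line, advisory_prefix);
        if (!is_advisory) return line;
        // Step past the newline if there is one, otherwise stop at the end.
        rest = if (end < rest.len) rest[end + 1 ..] else rest[rest.len..];
    }
    return null;
}

test "summary skips a prepended nudge" {
    const out = "hint: try codedb_symbol\n3 results for foo\nsrc/a.zig:1\n";
    const line = firstSummaryLine(out, "hint: ") orelse return error.NoSummary;
    try std.testing.expectEqualStrings("3 results for foo", line);
}
```

Emitting the nudge after the header avoids the parser change entirely; the skip approach is useful only if the advisory line must stay first.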
#626)

Follow-up to the #626 structural steering. Auditing the tool surface showed mcpGenerateGuidance already steers most graph tools as "-> next" hints (callers->callpath, edit->changes, hot->outline, the symbol/search/outline/word chain). The single genuine gap is codedb_deps: nothing points to it and it has no next-hint.

- Add depsHint: after a single-definition codedb_symbol hit (the moment before an edit, when blast radius matters), prepend a one-line pointer to codedb_deps. Pure + count-gated (results.len == 1), text-only, mirrors fullFileReadHint.
- Upgrade three passive differentiator descriptions to prescriptive: codedb_deps (impact/blast-radius), codedb_hot (orientation), codedb_changes (what-changed).

No callpath nudge: codedb_callers already emits "-> next: codedb_callpath", so an inline one would duplicate it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
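The count gate described for depsHint can be sketched as a pure function over the number of definitions found. This is a hedged illustration only: the signature and the hint text are assumptions, and only the name depsHint and the results.len == 1 condition come from the commit message.

```zig
const std = @import("std");

/// Hypothetical count-gated hint: fires only when codedb_symbol resolved
/// to exactly one definition. Zero hits means there is nothing to point
/// at; several hits means the target of a dependency query is ambiguous.
fn depsHint(result_count: usize) ?[]const u8 {
    if (result_count != 1) return null;
    // Hint wording is invented for this sketch.
    return "hint: before editing, codedb_deps shows what depends on this\n";
}

test "depsHint fires only for a single definition" {
    try std.testing.expect(depsHint(0) == null);
    try std.testing.expect(depsHint(1) != null);
    try std.testing.expect(depsHint(2) == null);
}
```

Because the function takes only a count, the caller decides separately whether the output is text (where the hint may be prepended) or format=json (where it is skipped).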
Problem

Agents on the codedb MCP surface default to search → read → edit and skip the structural tools (symbol/callers/deps/outline), so the code graph goes unexercised and codedb ends up used like a leaner grep. (See #626.)

Change

Make the structural path the path of least resistance, two ways:

- Tool descriptions and server instructions prescribe the structural tools first and cast codedb_search as a substring/phrase fallback, not the default.
- Runtime nudges:
  - codedb_search on a bare identifier that resolves to an indexed symbol prepends a one-line pointer to codedb_symbol/codedb_callers (text output only; skipped for format=json).
  - codedb_read of a whole file (≥400 lines, no range) prepends a pointer to codedb_outline; wired into both the cached (renderReadBytes) and uncached read paths.

Tests

src/test_mcp.zig: the issue-626 cases cover the gating logic: isBareIdentifier (fires on make_bytes, not on def content / make_bytes( / 2fast) and fullFileReadHint (null under 400 lines, fires at 400 and above). Full suite green: zig build test → exit 0.

Verification

Driven live through a fresh codedb mcp server: the search nudge fires for a bare indexed identifier and is correctly absent from the CLI path and from format=json. A Sonnet 4.6 agent reads the new server instruction and cites it when choosing structural tools first.

Caveat worth stating: Sonnet 4.6 in Claude Code already navigates structurally (outline-before-read, ranged reads, symbol/callers by reflex), so on these tasks neither runtime nudge actually triggered; the steering is confirmation/insurance for this model. The payoff is on grep-happier agent loops; a SWE-bench-Lite A/B is the test that can show the nudges deflecting a bad call, and is tracked separately.

Issues
Closes #626. #623 was closed as a duplicate; its distinct concern (tail-runaway / loop-redundancy convergence guardrail) is not addressed here and was preserved in that issue's close comment.
🤖 Generated with Claude Code