Summary
When the codedb MCP tools are available, agents almost exclusively use search → read → edit (a grep+cat+patch flow) and effectively never invoke the structural-intelligence tools - symbol, callers, callpath, deps, outline, context. codedb's actual differentiator (the code graph) goes unexercised; in practice it's used like ripgrep with smaller output.
Evidence
Tool-call traces of agent runs over SWE-bench Lite (django/sympy/requests/etc.). Representative full trace for one task:
ToolSearch -> search("def content") -> read(response.py) -> search("def make_bytes")
-> read(response.py) -> ToolSearch -> edit -> edit -> edit
Across audited runs the pattern is the same: search/read/edit only. No symbol/callers/deps/outline/context calls were observed.
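For anyone who wants to repeat the audit on their own traces, here is a minimal sketch of the tally. It assumes traces are arrow-separated strings like the one above. The tool names come from this issue, but the grouping into `GENERIC` and `STRUCTURAL` and the helper names `parse_trace` and `classify` are hypothetical and not part of codedb or the benchmark harness.

```python
from collections import Counter

# Tool names as listed in this issue; the two groupings are an assumption.
GENERIC = {"search", "read", "edit"}
STRUCTURAL = {"symbol", "callers", "callpath", "deps", "outline", "context"}


def parse_trace(trace: str) -> list[str]:
    """Split an arrow-separated trace into bare tool names (arguments dropped)."""
    steps = [s.strip() for s in trace.replace("\n", " ").split("->")]
    return [s.split("(", 1)[0].strip() for s in steps if s]


def classify(trace: str) -> Counter:
    """Count calls per category: generic, structural, or other."""
    counts = Counter()
    for name in parse_trace(trace):
        if name in STRUCTURAL:
            counts["structural"] += 1
        elif name in GENERIC:
            counts["generic"] += 1
        else:
            counts["other"] += 1
    return counts


# The representative trace quoted above.
trace = '''ToolSearch -> search("def content") -> read(response.py) -> search("def make_bytes")
-> read(response.py) -> ToolSearch -> edit -> edit -> edit'''

counts = classify(trace)
print(dict(counts))
```

On the representative trace this gives no structural calls, with everything other than the two `ToolSearch` steps falling in the generic bucket.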
Impact
- The measured token savings vs native tools (~34% median) come from compact tool output, not from smarter graph-based navigation - i.e. codedb is winning as "a leaner grep/read", leaving its headline value on the table.
- Grep-style exploration on large repos correlates with the runaway trajectories tracked separately.
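As a rough illustration of how a paired median saving can be computed, the sketch below takes per-instance token counts for the two tool surfaces and reports the median of the per-instance fractional savings. This is an assumption about the metric's definition, not a description of the actual harness. The function name `paired_savings` is hypothetical, and the numbers are placeholders, not benchmark data.

```python
from statistics import median


def paired_savings(native_tokens: list[int], codedb_tokens: list[int]) -> list[float]:
    """Per-instance fractional saving, 1 - codedb/native, for paired runs."""
    if len(native_tokens) != len(codedb_tokens):
        raise ValueError("runs must be paired one-to-one")
    return [1 - c / n for n, c in zip(native_tokens, codedb_tokens)]


# Placeholder token counts for three paired instances (NOT measured values).
native = [1000, 2000, 4000]
codedb = [700, 1200, 3600]

savings = paired_savings(native, codedb)
median_saving = median(savings)
print(f"median saving: {median_saving:.0%}")
```

A number computed this way measures how much smaller the transcripts are. It does not show whether the saving came from compact output or from graph navigation, so it needs to be read alongside the tool-call tally.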
Suggested direction
- Strengthen tool descriptions / server instructions to steer agents to structural tools first (e.g. "to locate a definition use symbol, not search; to find usages use callers").
- Or consolidate the generic search/read so the structural path is the path of least resistance.
Found via an independent SWE-bench Lite token-efficiency benchmark: identical agent (`claude -p`, Sonnet 4.6) and tasks, only the tool surface differs - native Read/Grep/Edit vs codedb MCP tools. N=51 paired instances; full harness + data available.