index: codedb.snapshot written into project root pollutes git / risks committing a multi-MB index #625

Description

@justrach

Summary

The codedb MCP server exposes a large tool surface (~23 tools). With Claude Code's deferred-tool mechanism this means every session spends ToolSearch round-trips just to load tool schemas before doing any real work, and the tool definitions tax cached context each turn.

Evidence

  • Every audited agent run began with 2 ToolSearch calls (to load codedb tool schemas) before the first real action.
  • In the A/B benchmark, codedb runs emitted 23.5% more output tokens than the native-tools baseline (more, smaller tool calls). This is the main reason a ~23% raw-token saving converted to only a ~2% cost saving at Sonnet pricing.
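
To illustrate why a raw-token saving can shrink once pricing is applied, here is a minimal sketch. The per-million-token prices and the token counts are assumptions chosen for illustration, not benchmark data, and the sketch ignores prompt caching, so it will not reproduce the ~2% figure. It only shows the mechanism: output tokens are priced higher than input tokens, so a rise in output tokens offsets part of the saving on input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Usage:
    """Token usage for one run (hypothetical numbers, not benchmark data)."""
    input_tokens: int
    output_tokens: int

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens


def cost(usage: Usage,
         input_price_per_mtok: float = 3.0,     # assumed USD per 1M input tokens
         output_price_per_mtok: float = 15.0,   # assumed USD per 1M output tokens
         ) -> float:
    """Dollar cost of a run under the assumed prices (no cache discounts)."""
    return (usage.input_tokens * input_price_per_mtok
            + usage.output_tokens * output_price_per_mtok) / 1_000_000


def saving(baseline: float, candidate: float) -> float:
    """Relative saving of candidate versus baseline (0.25 means 25% less)."""
    return (baseline - candidate) / baseline


# Hypothetical mix: 23% fewer total tokens, but 23.5% more output tokens.
baseline = Usage(input_tokens=1_000_000, output_tokens=100_000)
candidate = Usage(input_tokens=723_500, output_tokens=123_500)

token_saving = saving(baseline.total, candidate.total)
cost_saving = saving(cost(baseline), cost(candidate))
print(f"token saving: {token_saving:.1%}, cost saving: {cost_saving:.1%}")
```

Under these assumptions the cost saving comes out well below the token saving. The real gap depends on the actual input/output mix and on how much of the input is served from cache.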

Impact

  • Per-session schema-loading overhead and per-turn context tax.
  • High tool-call/output-token count blunts the cost benefit of the token savings.

Suggested direction

  • Consolidate overlapping tools (e.g. read/outline; search/find/word/glob) into fewer, mode-flagged tools.
  • Ship a slim default toolset, with the long tail opt-in.
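
As a rough sketch of what "mode-flagged" consolidation could look like, the example below folds four lookup behaviours into one entry point with a `mode` argument. Everything here is hypothetical: the function name `locate`, the mode names, and the in-memory `files` mapping are stand-ins for illustration and do not reflect codedb's actual tools or schemas.

```python
import fnmatch
import re


def locate(files: dict[str, str], mode: str, query: str) -> list[str]:
    """Return sorted paths matching `query` under the given `mode`.

    Hypothetical consolidated tool: one schema with a `mode` flag instead of
    four separate tools. `files` maps path -> file content.
    """
    if mode == "text":
        # Plain substring match against file content.
        def matches(path: str, content: str) -> bool:
            return query in content
    elif mode == "regex":
        # Regular-expression match against file content.
        pattern = re.compile(query)

        def matches(path: str, content: str) -> bool:
            return pattern.search(content) is not None
    elif mode == "word":
        # Whole-word match: the query must sit between word boundaries.
        pattern = re.compile(r"\b" + re.escape(query) + r"\b")

        def matches(path: str, content: str) -> bool:
            return pattern.search(content) is not None
    elif mode == "glob":
        # Shell-style pattern match against the path, not the content.
        def matches(path: str, content: str) -> bool:
            return fnmatch.fnmatchcase(path, query)
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    return sorted(path for path, content in files.items()
                  if matches(path, content))


# Usage with a tiny in-memory corpus (illustrative only).
files = {"src/a.py": "def foo(): pass", "src/b.txt": "food"}
print(locate(files, "word", "foo"))
print(locate(files, "glob", "*.py"))
```

The design trade-off is that the agent loads one schema instead of several, at the price of a slightly richer parameter description for that one tool.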

Found via an independent SWE-bench Lite token-efficiency benchmark: identical agent (`claude -p`, Sonnet 4.6) and tasks, with only the tool surface differing (native Read/Grep/Edit vs codedb MCP tools). N=51 paired instances; full harness + data available.
