feat(mcp): emit server-level instructions in initialize response #121

Closed
andreinknv wants to merge 3 commits into colbymchenry:main from andreinknv:feat/mcp-server-instructions
@andreinknv (Contributor) commented Apr 28, 2026

Summary

The MCP initialize response can include an instructions field that clients (Claude Code, Cursor, opencode, LangChain, OpenAI Agent SDK, …) surface in the agent's system prompt automatically. Today codegraph emits an empty initialize response — agents only see individual tool descriptions, with no overall guidance on how to compose them.

This adds the missing playbook in a new src/mcp/server-instructions.ts module, wired into the initialize handler.
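
As a rough illustration of that wiring, the sketch below builds an initialize result that carries an instructions string. The fields protocolVersion, capabilities, serverInfo, and instructions follow the MCP initialize result shape. The names SERVER_INSTRUCTIONS and buildInitializeResult, the version placeholder, and the two-line playbook text are assumptions made for this example and are not the contents of src/mcp/server-instructions.ts.

```typescript
// Hypothetical stand-in for the string exported by src/mcp/server-instructions.ts.
const SERVER_INSTRUCTIONS: string = [
  "To see what changing a symbol would break, call codegraph_impact.",
  "Prefer codegraph_search over grep.",
].join("\n");

// Shape of an MCP initialize result; `instructions` is optional in the protocol.
interface InitializeResult {
  protocolVersion: string;
  capabilities: { tools: Record<string, unknown> };
  serverInfo: { name: string; version: string };
  instructions?: string;
}

// Assumed helper: builds the JSON-RPC `result` payload for `initialize`.
function buildInitializeResult(protocolVersion: string): InitializeResult {
  return {
    protocolVersion,
    capabilities: { tools: {} },
    serverInfo: { name: "codegraph", version: "0.0.0" }, // placeholder version
    instructions: SERVER_INSTRUCTIONS, // the single added line in the handler
  };
}
```

Keeping the string in its own module means the playbook can change without touching the handler, which only ever references the exported constant.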

Empirical validation (A/B test)

Tested in-session by running the same task two ways and counting tool calls:

Task: "Predict the blast radius of changing extractFromSource in the codegraph codebase."

| Approach | Calls | Output | Completeness |
| --- | --- | --- | --- |
| Path A: naive (no playbook) | codegraph_search, then codegraph_callers → ~5 more recursive walks | ≈7 calls, fragmented output | Partial |
| Path B: playbook-guided | codegraph_impact("extractFromSource") | 1 call, 152 transitive symbols across 14 files | Complete at depth 2 |

The playbook's mapping "What would changing this break?" → codegraph_impact saved ~6 redundant tool calls and produced a more complete answer. The benefit isn't theoretical — without the meta-guidance, the natural agent instinct is to start with codegraph_search (the most general-sounding tool) and walk the call graph manually. Tool descriptions alone don't redirect that instinct.
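
The difference in call counts can be reproduced on a toy graph. The sketch below is only a model of the two paths: the graph data, the callers and impact functions, and the toolCalls counter are all invented for illustration, and the real codegraph_impact is depth-limited, which this version ignores.

```typescript
// Toy reverse call graph: symbol -> direct callers. Entirely made-up data.
const callersOf: Record<string, string[]> = {
  extractFromSource: ["indexFile", "reindex"],
  indexFile: ["indexProject"],
  reindex: ["watchLoop"],
  indexProject: [],
  watchLoop: [],
};

let toolCalls = 0;

// Stand-in for one codegraph_callers round trip (one tool call per symbol).
function callers(symbol: string): string[] {
  toolCalls += 1;
  return callersOf[symbol] ?? [];
}

// Path A: the agent walks the graph itself, paying one call per visited symbol.
function manualBlastRadius(root: string): Set<string> {
  const seen = new Set<string>();
  const queue: string[] = [root];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const caller of callers(current)) {
      if (!seen.has(caller)) {
        seen.add(caller);
        queue.push(caller);
      }
    }
  }
  return seen;
}

// Path B: stand-in for codegraph_impact. The server performs the same
// traversal internally, so the agent pays for a single call.
function impact(root: string): Set<string> {
  toolCalls += 1;
  const seen = new Set<string>();
  const stack: string[] = [root];
  while (stack.length > 0) {
    const current = stack.pop() as string;
    for (const caller of callersOf[current] ?? []) {
      if (!seen.has(caller)) {
        seen.add(caller);
        stack.push(caller);
      }
    }
  }
  return seen;
}
```

Both functions return the same four transitive callers here, but the manual walk costs one call per visited symbol while the impact stand-in costs one call in total.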

What the instructions teach the agent

  • Tool selection by intent — quick map from "what is X" / "how does X work" / "what would changing X break" to the right tool.
  • Common chains — onboarding (context first), PR review (review_context), refactor planning (search → callers → impact), debugging a regression.
  • Tier discipline — start at the cheap deterministic tier (search, context, callers, callees, impact, node, explore, files, status), escalate to conditional tools only when their data exists, reach for LLM-mediated tools only when the cheap path doesn't suffice.
  • Agent-bridge tier — explicit recipe for projects without a local LLM where the agent itself summarizes via codegraph_pending_summaries + codegraph_save_summaries.
  • Anti-patterns — don't grep when search exists, don't chain search+node when context covers it, don't query the index immediately after a write.
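
The tier-discipline rule can be pictured as a small availability filter. This is a sketch under assumptions: the tier-1 short names come from the list above, the codegraph_ prefix is applied by analogy with the tool names this PR mentions, codegraph_coverage is used as the one conditional example, and the ToolEntry, ProjectState, and usableTools names are hypothetical. LLM-mediated tools are left out because their names are not given here.

```typescript
type Tier = "deterministic" | "conditional" | "llm";

// Assumed project facts an agent could learn from a status call.
interface ProjectState {
  hasCoverageData: boolean; // e.g. an lcov report was ingested earlier
}

interface ToolEntry {
  name: string;
  tier: Tier;
  isAvailable: (state: ProjectState) => boolean;
}

// Cheapest tier first.
const TIER_ORDER: Tier[] = ["deterministic", "conditional", "llm"];

const DETERMINISTIC: string[] = [
  "search", "context", "callers", "callees", "impact",
  "node", "explore", "files", "status",
];

const CATALOG: ToolEntry[] = [
  ...DETERMINISTIC.map((short): ToolEntry => ({
    name: `codegraph_${short}`,
    tier: "deterministic",
    isAvailable: () => true, // always usable once the index exists
  })),
  {
    name: "codegraph_coverage",
    tier: "conditional",
    isAvailable: (state) => state.hasCoverageData, // only when its data exists
  },
];

// Lists the tools worth trying, ordered from cheapest tier to most expensive.
function usableTools(state: ProjectState): string[] {
  return CATALOG
    .filter((tool) => tool.isAvailable(state))
    .sort((a, b) => TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier))
    .map((tool) => tool.name);
}
```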

Why MCP-level vs CLAUDE.md

CLAUDE.md is a Claude-Code-only convention. The MCP instructions protocol field reaches every client. Both can coexist — the existing CLAUDE.md template still covers the Claude-Code-specific Explore-agent pattern. This PR adds the universal playbook on top.

Why a separate PR

Originally considered as part of #111 (LLM tools), but pulled out because:

Per-language guidance — intentionally not included

Considered (and explicitly rejected): per-language sections like "in Python, callers includes decorators." Reasons:

  1. Token cost on every session for content irrelevant ~80% of the time (codegraph supports 19+ languages but typical sessions touch 1-3).
  2. The tools themselves are language-agnostic; result shape differs per-language but tool usage doesn't.
  3. Where language matters (codegraph_sql, codegraph_config), the tool description already self-documents.
  4. Project-specific patterns belong in project-local CLAUDE.md, not the universal MCP instructions.

If we ever want this, the principled implementation is dynamic per-project tailoring at initialize time (only emit the SQL section if the project has SQL nodes, etc.). Out of scope here.
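
For reference, one way such tailoring could look is sketched below. Nothing here exists in the PR: the InstructionSection shape, the requiresNodeKind field, the buildInstructions function, and the "sql" node-kind label are hypothetical names chosen for the example.

```typescript
interface InstructionSection {
  title: string;
  body: string;
  // Node kind that must be present in the index for this section to be
  // emitted; undefined means the section is always emitted.
  requiresNodeKind?: string;
}

// Assembles the instructions string at initialize time, skipping sections
// whose required node kind does not occur in the current project.
function buildInstructions(
  sections: InstructionSection[],
  nodeKindsInProject: ReadonlySet<string>,
): string {
  return sections
    .filter(
      (section) =>
        section.requiresNodeKind === undefined ||
        nodeKindsInProject.has(section.requiresNodeKind),
    )
    .map((section) => `## ${section.title}\n${section.body}`)
    .join("\n\n");
}

// Example input: one universal section and one SQL-only section.
const exampleSections: InstructionSection[] = [
  { title: "Tool selection", body: "Use codegraph_impact for blast radius." },
  { title: "SQL", body: "Use codegraph_sql for table references.", requiresNodeKind: "sql" },
];
```

With this shape, a project without SQL nodes pays no tokens for the SQL section, which addresses the first objection in the list above.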

Test plan

  • npx tsc --noEmit clean
  • npx vitest run clean (no test changes — the JSON-RPC initialize response is structurally compatible)
  • A/B test (above) — validated the playbook reduces tool calls on a representative task

Files changed

| File | Change |
| --- | --- |
| src/mcp/server-instructions.ts | New module (~75 lines, mostly the instructions string) |
| src/mcp/index.ts | 1 import + 1 line in the initialize result |

🤖 Generated with Claude Code

…flicts

Today every PR adding an MCP tool conflicts on the same two
shared lists in src/mcp/tools.ts: the tools[] array (the
list_tools surface) and the case switch in execute(). After this
refactor:

  Adding a new MCP tool:
  1. Drop a file at src/mcp/tools/<name>.ts exporting a
     <NAME>_TOOL: ToolModule (definition + handlerKey).
  2. Add one import line and one array entry to
     src/mcp/tools/registry.ts.
  3. Implement handle<Name>(args) on ToolHandler in tools.ts and
     add the new key to HandlerKey in tools/types.ts.

Step 3 is the only remaining "shared method on a single class"
conflict surface. Extracting handler bodies into per-tool files
(making step 3 also a single-file addition) is left as a
follow-up — the cost/benefit favors landing this incremental win
now and finishing the body extraction once language and migration
refactors land.
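
Steps 1 and 2 can be sketched roughly as follows. The shapes are
simplified guesses (the real ToolDefinition carries more than a name
and description), the two modules are inlined in one block instead of
living in separate files, and the description strings are placeholders.

```typescript
// Hypothetical, trimmed-down shapes; the real ones live in
// src/mcp/tool-types.ts and src/mcp/tools/types.ts.
type HandlerKey = "handleSearch" | "handleStatus";

interface ToolDefinition {
  name: string;
  description: string;
}

interface ToolModule {
  definition: ToolDefinition;
  handlerKey: HandlerKey;
}

// Step 1: what a per-tool file such as src/mcp/tools/status.ts might export.
const STATUS_TOOL: ToolModule = {
  definition: { name: "codegraph_status", description: "Report index status." },
  handlerKey: "handleStatus",
};

const SEARCH_TOOL: ToolModule = {
  definition: { name: "codegraph_search", description: "Search indexed symbols." },
  handlerKey: "handleSearch",
};

// Step 2: the registry lists every module once, sorted alphabetically.
const MODULES: ToolModule[] = [SEARCH_TOOL, STATUS_TOOL];

const BY_NAME = new Map<string, ToolModule>(
  MODULES.map((mod): [string, ToolModule] => [mod.definition.name, mod]),
);

function getToolModules(): ToolModule[] {
  return [...MODULES]; // copy so callers cannot mutate the registry
}

function getToolModule(name: string): ToolModule | undefined {
  return BY_NAME.get(name);
}

// Derived list_tools surface: one definition per registered module.
const tools: ToolDefinition[] = MODULES.map((mod) => mod.definition);
```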

## What's new

- **src/mcp/tool-types.ts** — extracted ToolDefinition, ToolResult,
  PropertySchema, projectPathProperty into a shared module so
  per-tool files can import without circular dependency.
- **src/mcp/tools/types.ts** — ToolModule interface, HandlerKey
  string union, and ToolHandlerLike (a structural type that
  ToolHandler now `implements`, providing compile-time guarantee
  that every HandlerKey maps to a real method).
- **src/mcp/tools/<name>.ts × 9** — one file per existing tool
  (callees, callers, context, explore, files, impact, node, search,
  status). Each ~25-30 lines: import + definition literal +
  handlerKey reference.
- **src/mcp/tools/registry.ts** — static-import barrel, sorted
  alphabetically. Exports getToolModules(), getToolModule(name),
  and the derived `tools[]` array.
- **src/mcp/tools.ts** — ~200 lines deleted from the top
  (inline types + tools[] array + projectPathProperty).
  execute()'s case-switch replaced with a registry lookup +
  type-safe `this[mod.handlerKey](args)` dispatch (now compile-
  time-checked thanks to `implements ToolHandlerLike`).
  All `private async handle*` methods now public to match the
  interface. errorResult/textResult also public for the same reason.
- **src/mcp/index.ts** — MCPServer's tool-existence check switched
  from a linear `tools.find()` scan to the O(1) `getToolModule()`
  Map lookup, eliminating two parallel lookup paths.
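
The dispatch described above might look like the sketch below. It is
an assumption-laden reduction: two handlers instead of nine, a
simplified ToolResult, a hard-coded HANDLER_KEYS map standing in for
the registry lookup, and an illustrative error message. Only the
technique (a mapped type that ToolHandler implements, then
`this[key](args)`) is taken from the description.

```typescript
interface ToolResult {
  content: string;
  isError?: boolean;
}

type ToolArgs = Record<string, unknown>;
type HandlerKey = "handleSearch" | "handleStatus";

// Structural type: every HandlerKey must exist as a method with this signature.
type ToolHandlerLike = {
  [K in HandlerKey]: (args: ToolArgs) => Promise<ToolResult>;
};

// Stand-in for the registry: tool name -> handler key.
const HANDLER_KEYS = new Map<string, HandlerKey>([
  ["codegraph_search", "handleSearch"],
  ["codegraph_status", "handleStatus"],
]);

// `implements` makes the compiler reject a HandlerKey with no matching method.
class ToolHandler implements ToolHandlerLike {
  async handleSearch(args: ToolArgs): Promise<ToolResult> {
    return { content: `results for ${String(args["query"])}` };
  }

  async handleStatus(_args: ToolArgs): Promise<ToolResult> {
    return { content: "index ok" };
  }

  async execute(name: string, args: ToolArgs): Promise<ToolResult> {
    const key = HANDLER_KEYS.get(name);
    if (key === undefined) {
      // Unknown-tool path: report an error result instead of throwing.
      return { content: `Unknown tool: ${name}`, isError: true };
    }
    // Element-access call keeps `this` bound to the handler instance.
    return this[key](args);
  }
}
```

Removing handleStatus from the class while leaving it in HandlerKey
turns into a compile error on the `implements` clause, which is the
guarantee the cast-based version could not give.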

## Tests

387/387 pass. **7 new tests** in __tests__/mcp-tool-registry.test.ts:
- Definitions are well-formed (name shape, description length).
- handlerKey shape (`handle<UpperCase>`).
- Every registered handlerKey resolves to a real method on
  ToolHandler.
- Exported `tools[]` exactly mirrors the registry.
- Canonical 9 main-line tools regression guard.
- execute() unknown-tool error path.
- **End-to-end dispatch smoke test**: execute('codegraph_status', {})
  reaches the real handler body (no broken `this` binding) — would
  fail loudly if the dynamic dispatch chain ever breaks.

## Reviewer pass

Independent reviewer ran once. 2 REQUEST_CHANGES + 2 INFO addressed:

1. ToolHandlerLike was defined but never enforced —
   ToolHandler now `implements ToolHandlerLike`. Eliminates the
   `(this as unknown as Record<...>)` cast in execute(); dispatch
   is fully compile-time-checked.
2. No end-to-end dispatch test — added one (see Tests above).
3. MCPServer.handleToolsCall used a linear `tools.find()` scan
   while execute() used Map lookup — switched to getToolModule()
   for parity.
4. Removed redundant .slice() in registry.ts (map() already
   returns a fresh array).

## Backward compat

src/mcp/tools.ts still re-exports ToolDefinition, ToolResult, the
mutable `tools[]` array, ToolHandler, and getExploreBudget. Every
existing consumer (`import { ToolDefinition, ToolResult, tools,
ToolHandler } from './tools'`) keeps working unchanged.

## Affected open PRs

- colbymchenry#110 (review-context): rebases to 1 new file in tools/ + 2
  lines in registry.ts + 1 method on ToolHandler + 1 line in
  HandlerKey.
- colbymchenry#112 (centrality+churn): same shape for the codegraph_hotspots
  tool.
- colbymchenry#114 (config-refs): same shape for codegraph_config.
- colbymchenry#115 (sql-refs): same shape for codegraph_sql.

Each goes from a 4-way conflict (tools[] + case + handler + helpers)
down to a single conflict surface (the HandlerKey entry in
tools/types.ts plus the handler method on ToolHandler in tools.ts).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
andreinknv and others added 2 commits April 28, 2026 12:43
The MCP `initialize` response can include an `instructions` field that
clients (Claude Code, Cursor, opencode, LangChain, OpenAI Agent SDK,
etc.) surface in the agent's system prompt automatically. Today
codegraph emits an empty initialize response — agents only see
individual tool descriptions, no overall guidance on how to compose
them.

This adds the missing playbook:

- **Tool selection by intent** — quick map from "what is X" / "how does
  X work" / "what would changing X break" to the right tool.
- **Common chains** — onboarding (context first), PR review
  (review_context), refactor planning (search → callers → impact),
  debugging a regression.
- **Tier discipline** — start at the cheap deterministic tier (search,
  context, callers, callees, impact, node, explore, files, status),
  escalate to conditional tools only when their data exists, and only
  reach for LLM-mediated tools when the cheap path doesn't suffice.
- **Agent-bridge tier** — explicit recipe for projects without a local
  LLM where the agent itself summarizes via codegraph_pending_summaries
  + codegraph_save_summaries.
- **Anti-patterns** — don't grep when search exists, don't chain
  search+node when context covers it, don't query the index immediately
  after a write.

Lives in src/mcp/server-instructions.ts so it's easy to update without
touching the JSON-RPC dispatch in src/mcp/index.ts. Single-file, no
schema changes, no migrations, no test changes needed.

References tools that exist on `main` today; doesn't presume any of the
in-flight feature PRs (colbymchenry#110, colbymchenry#112-115, colbymchenry#111) have landed. After those
merge, the relevant sections of this guidance start applying without
needing a follow-up edit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…kers

Two new tools landed in colbymchenry#124 and colbymchenry#125 that this playbook should
route the agent to instead of falling back to "read the source":

  - codegraph_biomarkers (PR colbymchenry#125): structured static-analysis
    signals (Code Health, cyclomatic, nesting, length) so an
    agent can ask "is this function risky to change?" without
    reading the source.
  - codegraph_coverage (PR colbymchenry#124): per-symbol coverage from lcov
    so an agent can ask "is this function tested?" with a
    structured answer.

Updates:
  - "When to use which tool" map gains two entries.
  - Refactor-planning chain expanded to call both tools before
    callers/impact -- and points at the killer cross-tool query
    (high-centrality + warning-severity findings).
  - Tier table places biomarkers in tier 1 (always available
    after colbymchenry#125 lands) and coverage in tier 2 (conditional on a
    prior `codegraph coverage <lcov>` ingestion).

Both references are forward-compatible: agents that try to call
a not-yet-merged tool get a graceful "unknown tool" error, same
pattern the existing playbook already uses for colbymchenry#110, colbymchenry#111, etc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@colbymchenry (Owner) commented

Closing — same situation as #122. The PR is also stacked on #117's tool-registry refactor (the per-tool tools/<name>.ts files in the diff are #117's), AND the instructions string itself references a lot of tools that aren't on main (codegraph_biomarkers, codegraph_coverage, codegraph_hotspots, codegraph_config, codegraph_sql, the LLM stack). Roughly half the playbook is for features not in the codebase.

The core idea is good — telling agents to reach for codegraph_impact instead of walking callers manually is a real win. Going to land a scoped version directly: only the tools that exist on main, no (PR #X, when present) references, ~40 lines instead of 75.

colbymchenry added a commit that referenced this pull request May 8, 2026
Adds a universal tool-selection playbook surfaced by MCP clients
(Claude Code, Cursor, opencode, LangChain, OpenAI Agent SDK) in the
agent's system prompt automatically. Without this, agents have to
infer tool composition from individual tool descriptions and tend to
walk callers manually instead of reaching for codegraph_impact, etc.

Scoped tight: only the 9 tools that exist on main today
(search/context/callers/callees/impact/node/explore/files/status), no
"(when present)" references to unmerged tools, no per-language
guidance. ~40 lines of useful guidance.

Salvaged from #121, which bundled the instructions with #117's MCP
tool-registry refactor and referenced many tools that don't exist on
main.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>