An MCP (Model Context Protocol) server that gives Claude Code semantic search over any codebase — without sending your code to any external API. It uses local Ollama embeddings and ChromaDB to index your repo on-disk, then lets Claude retrieve only the relevant chunks when answering questions.
When you ask Claude about your codebase, the naive approach is to paste entire files into the context window. That burns tokens fast:
| Approach | Tokens consumed (typical 50k-line repo) |
|---|---|
| Paste all files into context | ~200,000–500,000 tokens per session |
| This MCP server (top-5 chunks) | ~1,000–3,000 tokens per query |
Savings: roughly 98.5% to 99.8% fewer tokens per codebase query, going by the ranges in the table above.
Instead of Claude reading every file, it issues a semantic search and receives only the most relevant 5 code chunks (configurable). For a large repo this translates directly into lower cost and faster responses.
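As a rough check, the reduction can be computed directly from the table's ranges. The figures are the estimates quoted above, not measurements, and `reduction_percent` is an illustrative helper rather than part of this project.

```python
def reduction_percent(full_paste_tokens: int, search_tokens: int) -> float:
    """Percentage of tokens saved by searching instead of pasting everything."""
    return 100.0 * (1 - search_tokens / full_paste_tokens)


# Least favorable pairing from the table: smallest paste, largest search result.
worst = reduction_percent(200_000, 3_000)
# Most favorable pairing: largest paste, smallest search result.
best = reduction_percent(500_000, 1_000)

print(f"{worst:.1f}% to {best:.1f}%")  # 98.5% to 99.8%
```

Note that the table compares a per-session figure with a per-query figure, so a session with many searches will save somewhat less than a single query suggests.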
- Indexing: your repo's files are chunked (512 tokens, 50-token overlap) and embedded locally using `nomic-embed-text` via Ollama. Embeddings are stored in a persistent ChromaDB database on disk.
- Search: when Claude needs to understand your code, it calls `search_codebase` with a natural language query. The query is embedded and the top-k most similar chunks are returned.
- Auto-index: the MCP server detects the current working directory and auto-indexes it on the first search if it hasn't been indexed yet.
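The chunking and top-k steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the server's actual code: the tokens are stand-in integers rather than real tokenizer output, the vectors are hand-written rather than Ollama embeddings, and cosine similarity is assumed as the ranking metric (the real metric depends on how the ChromaDB collection is configured). The names `chunk_tokens`, `cosine`, and `top_k` are hypothetical.

```python
import math


def chunk_tokens(tokens, chunk_size=512, overlap=50):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=5):
    """Return the indices of the k chunk vectors most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# 1,000 stand-in tokens produce windows starting at 0, 462, and 924.
chunks = chunk_tokens(list(range(1000)))
print([len(c) for c in chunks])  # [512, 512, 76]
```

The 50-token overlap means a function or statement that straddles a chunk boundary still appears whole in at least one chunk, provided it is shorter than the overlap.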
Everything runs 100% locally — no code leaves your machine.
- Python 3.10+
- Ollama running locally with:
  - `nomic-embed-text` model (for embeddings)

```bash
ollama pull nomic-embed-text
```

```bash
git clone https://github.com/<your-username>/claude-augment.git
cd claude-augment
pip install -r requirements.txt
```

Add the following to your Claude Code MCP config (`~/.claude/claude_desktop_config.json` or via `claude mcp add`):

```json
{
  "mcpServers": {
    "codebase-search": {
      "command": "python",
      "args": ["/absolute/path/to/claude-augment/mcp_server.py"]
    }
  }
}
```

Or using the CLI:

```bash
claude mcp add codebase-search python /absolute/path/to/claude-augment/mcp_server.py
```

Open any project in your terminal and start Claude Code. On your first search query, the server will auto-index the current repo. After that, Claude will automatically search your codebase semantically before answering questions about it.
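If you take the JSON config route and the file already lists other servers, the new entry has to be merged in rather than pasted over the existing content. A minimal sketch of that merge follows; `add_mcp_server` is a hypothetical helper, and `other-server` is a made-up existing entry used only for illustration.

```python
import copy
import json


def add_mcp_server(config: dict, name: str, script_path: str) -> dict:
    """Return a copy of config with a Python server entry added under mcpServers."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    servers = updated.setdefault("mcpServers", {})
    servers[name] = {"command": "python", "args": [script_path]}
    return updated


# Hypothetical existing config that already declares one server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}

merged = add_mcp_server(
    existing,
    "codebase-search",
    "/absolute/path/to/claude-augment/mcp_server.py",
)
print(json.dumps(merged, indent=2))
```

In practice you would load the dict with `json.load`, merge, and write it back with `json.dump`; the `claude mcp add` command shown above avoids hand-editing altogether.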
| Tool | Description |
|---|---|
| `search_codebase` | Semantic search over the indexed repo. Auto-indexes on first use. |
| `reindex_repo` | Re-index the current repo from scratch (run after major changes). |
| `index_status` | Show how many chunks are indexed for the current repo. |
| `list_indexed_repos` | List all repos indexed so far and their chunk counts. |
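How the four tools relate can be illustrated with a toy in-memory model. This is only a sketch of the behavior described in the table: the real tools' signatures and return formats are not documented here, the `ToyIndex` class is hypothetical, and a substring match stands in for embedding-based search.

```python
class ToyIndex:
    """In-memory stand-in for the on-disk index, keyed by repo path."""

    def __init__(self, chunker):
        self._chunker = chunker  # callable: repo path -> iterable of text chunks
        self._repos = {}         # repo path -> list of text chunks

    def reindex_repo(self, repo):
        """Rebuild the repo's chunks from scratch and return the chunk count."""
        self._repos[repo] = list(self._chunker(repo))
        return len(self._repos[repo])

    def index_status(self, repo):
        """Number of chunks currently indexed for the repo (0 if never indexed)."""
        return len(self._repos.get(repo, []))

    def list_indexed_repos(self):
        """Map every indexed repo to its chunk count."""
        return {repo: len(chunks) for repo, chunks in self._repos.items()}

    def search_codebase(self, repo, query, k=5):
        """Return up to k matching chunks, indexing the repo first if needed."""
        if repo not in self._repos:  # auto-index on first use
            self.reindex_repo(repo)
        # Assumption: substring matching replaces semantic similarity here.
        hits = [c for c in self._repos[repo] if query.lower() in c.lower()]
        return hits[:k]


# Made-up repo contents for the demonstration.
fake_chunks = {"/repo/a": ["def login(user): ...", "def logout(user): ...", "class Cart: ..."]}
index = ToyIndex(lambda repo: fake_chunks[repo])

before = index.index_status("/repo/a")             # nothing indexed yet
results = index.search_codebase("/repo/a", "login")  # triggers auto-indexing
after = index.index_status("/repo/a")
print(before, results, after)
```

The point of the model is the ordering: `index_status` reports zero until the first `search_codebase` call, after which the repo shows up in `list_indexed_repos`.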
You can pre-index a repo manually before opening it in Claude:
```bash
python index_repo.py /path/to/your/repo
# With a custom collection name:
python index_repo.py /path/to/your/repo --collection my_project
```

Supported file extensions:

`.py` `.ts` `.tsx` `.js` `.jsx` `.go` `.java` `.rs` `.cpp` `.c` `.h` `.rb` `.php` `.swift` `.kt` `.cs` `.md` `.txt` `.yaml` `.yml` `.toml` `.json` `.sh` `.sql` `.ipynb`
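The extension list above amounts to a filter applied while walking the repo. A minimal sketch of such a filter is below, assuming a simple recursive walk; `find_indexable_files` is a hypothetical name, and the real indexer may additionally skip directories such as `.git` or `node_modules`, which this sketch does not do.

```python
import tempfile
from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EXTENSIONS = {
    ".py", ".ts", ".tsx", ".js", ".jsx", ".go", ".java", ".rs", ".cpp", ".c",
    ".h", ".rb", ".php", ".swift", ".kt", ".cs", ".md", ".txt", ".yaml",
    ".yml", ".toml", ".json", ".sh", ".sql", ".ipynb",
}


def find_indexable_files(repo_root):
    """Return repo-relative POSIX paths of files with a supported extension."""
    root = Path(repo_root)
    matches = sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
    return [p.relative_to(root).as_posix() for p in matches]


# Demonstration on a throwaway directory with two supported files and one binary.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "app.py").write_text("print('hi')\n")
    (root / "blob.bin").write_bytes(b"\x00\x01")
    (root / "docs" / "notes.md").write_text("# Notes\n")
    found = find_indexable_files(root)

print(found)  # ['app.py', 'docs/notes.md']
```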