Contained indexing: manual re-index only, dedup projects, lower RAM on 16GB#625

Open
Andy11-cpu wants to merge 2 commits into DeusData:main from Andy11-cpu:fix/contained-indexing


@Andy11-cpu Andy11-cpu commented Jun 25, 2026


Summary

  • Add auto_watch config (default false). The git watcher no longer registers on every MCP initialize for already-indexed projects — this was causing background re-index storms even when auto_index was false.
  • Re-indexing happens only when you call index_repository, or when auto_watch is set to true after a manual index.
  • Deduplicate project DBs: cbm_find_existing_project_name matches canonical/git identity so the same repo opened from different paths reuses one index.
  • Tier RAM budget: 25% on ≤16GB, 35% on ≤32GB, 50% above (was always 50%).
  • Install hooks are opt-in via --hooks (PreToolUse Grep/Glob augmenter no longer installed by default).
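
The tiered RAM budget in the fourth bullet can be worked through as plain arithmetic. This is a minimal Python sketch of the percentages stated above, not the project's C code; the function name `ram_budget_mb` and the exact tier boundaries in MB (16384 and 32768) are assumptions made for illustration.

```python
MB_PER_GB = 1024


def ram_budget_mb(total_ram_mb: int) -> int:
    """Return the memory budget in MB for a host with total_ram_mb of RAM."""
    if total_ram_mb <= 16 * MB_PER_GB:
        percent = 25  # small hosts: quarter of RAM
    elif total_ram_mb <= 32 * MB_PER_GB:
        percent = 35  # mid-size hosts
    else:
        percent = 50  # large hosts keep the previous flat share
    # Integer math avoids float rounding surprises.
    return total_ram_mb * percent // 100


for total in (16384, 32768, 65536):
    print(total, ram_budget_mb(total))
```

On a 16GB host this gives 4096 MB, compared with 8192 MB under the previous flat 50%.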

Motivation

On a 16GB Windows machine with multiple indexed projects, the MCP server reserved ~8GB RAM and pegged CPU from watcher-driven re-indexing plus hook-augment spawning on every Grep/Glob.


Setup on any machine (fork branch)

Branch: fix/contained-indexing on Andy11-cpu/codebase-memory-mcp

Option A — Build from source (recommended until merged)

Windows (MSYS2 / llvm-mingw, same as CI):

git clone https://github.com/Andy11-cpu/codebase-memory-mcp.git
cd codebase-memory-mcp
git checkout fix/contained-indexing

# Install llvm-mingw (portable) + GNU make (scoop), then build:
#   scoop install make
#   Download llvm-mingw ucrt x86_64 zip from https://github.com/mstorsjo/llvm-mingw/releases
#   Add llvm-mingw/bin and Git usr/bin to PATH (Git provides sh + grep for Makefile)

make -j4 -f Makefile.cbm cbm CC=clang CXX=clang++
# Binary: build/c/codebase-memory-mcp.exe

Linux / macOS:

git clone https://github.com/Andy11-cpu/codebase-memory-mcp.git
cd codebase-memory-mcp
git checkout fix/contained-indexing
make -f Makefile.cbm cbm
# Binary: build/c/codebase-memory-mcp

WSL (Linux binary, not Windows exe):

cd /mnt/c/path/to/codebase-memory-mcp
sudo apt-get install -y build-essential zlib1g-dev
make -f Makefile.cbm cbm

Option B — Official release + manual config (partial fix only)

v0.8.1 does not include the auto_watch gate. You can still reduce pain until this PR merges:

codebase-memory-mcp config set auto_index false
# Remove PreToolUse hook from agent settings if install added it
# Do NOT enable auto_watch (not available on v0.8.1)

MCP config

Point your agent at the patched binary, not the v0.8.1 binary under AppData (unless you have replaced that one with the patched build):

Cursor: ~/.cursor/mcp.json

{
  "mcpServers": {
    "codebase-memory-mcp": {
      "command": "/absolute/path/to/codebase-memory-mcp/build/c/codebase-memory-mcp"
    }
  }
}

On Windows use forward slashes, e.g. C:/Users/you/Projects/codebase-memory-mcp/build/c/codebase-memory-mcp.exe
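
If you prefer to generate this file rather than hand-edit it, the following Python sketch builds the same JSON and converts a Windows path to forward slashes. The helper name `mcp_config` is hypothetical, and the example path is the placeholder from above, not a real install location.

```python
import json
from pathlib import PureWindowsPath


def mcp_config(binary_path: str) -> str:
    """Return mcp.json content pointing at binary_path, using forward slashes."""
    # PureWindowsPath accepts both separators; as_posix() emits forward slashes.
    command = PureWindowsPath(binary_path).as_posix()
    config = {"mcpServers": {"codebase-memory-mcp": {"command": command}}}
    return json.dumps(config, indent=2)


print(mcp_config(r"C:\Users\you\Projects\codebase-memory-mcp\build\c\codebase-memory-mcp.exe"))
```

The sketch only prints the JSON; writing it to the agent's config file (and merging with any existing servers) is left to you.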

Claude Code: ~/.claude/.mcp.json (same command path).

Restart the agent after editing MCP config.

Verify the patch is active

codebase-memory-mcp config list
# Expect: auto_index = false, auto_watch = false

codebase-memory-mcp --version
# Expect a dev or branch build string until this is released

# On 16GB RAM, first log line should show ~25% budget:
# level=info msg=mem.init budget_mb=~4096 total_ram_mb=~16384

On MCP connect with existing indexes, logs should show watcher.skip (reason: auto_watch_disabled), not watcher.watch.
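
These log checks can be automated. The sketch below is a hedged Python example: it assumes every log line uses the same space-separated key=value layout as the mem.init line above, and that the skip reason appears as a `reason=` field. Both are assumptions, and the sample lines are illustrative, not captured output.

```python
def parse_fields(line: str) -> dict:
    """Split a 'key=value key=value' log line into a dict (no quoted values)."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def check_startup_log(lines) -> list:
    """Return a list of problems; an empty list means the patch looks active."""
    problems = []
    for line in lines:
        fields = parse_fields(line)
        msg = fields.get("msg")
        if msg == "watcher.watch":
            problems.append("watcher registered: " + line)
        elif msg == "mem.init":
            budget = int(fields["budget_mb"])
            total = int(fields["total_ram_mb"])
            # On hosts up to 16GB the budget should not exceed 25% of RAM.
            if total <= 16384 and budget * 100 > total * 25:
                problems.append("budget above 25%: " + line)
    return problems


sample = [
    "level=info msg=mem.init budget_mb=4096 total_ram_mb=16384",
    "level=info msg=watcher.skip reason=auto_watch_disabled",
]
print(check_startup_log(sample))
```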

Agent skill — "update quicklook"

Copy this skill to the other machine so you can re-index on demand by saying "update quicklook":

Cursor: ~/.cursor/skills/update-quicklook/SKILL.md
Claude Code: ~/.claude/skills/update-quicklook/SKILL.md

Minimal skill content (or copy from this fork's docs after merge):

---
name: update-quicklook
description: Manually re-index a codebase-memory-mcp project. Use when the user says "update quicklook", "reindex", or wants the knowledge graph refreshed after merges.
---

# Update Quicklook

1. list_projects — check current index state
2. index_repository(repo_path="<absolute path>") — default mode full
3. Report node/edge counts. Never enable auto_watch or auto_index.

Daily workflow (after setup)

1. Work on branches / merge PRs normally — no background re-index
2. When you want a fresh graph: say "update quicklook" (or name the repo)
   → agent runs index_repository on that repo only
3. Use graph tools: search_graph, trace_path, get_architecture, detect_changes
4. After a big merge: "update quicklook" again — you control when it runs
5. Never run: config set auto_watch true   (unless you explicitly want git polling)
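
The gating this workflow relies on can be summarized as a small decision function. This is a Python sketch of the behavior described in the summary, not the server's code: the function name `watcher_decision` and the `not_indexed` return value are hypothetical, while `watcher.watch` and `watcher.skip` mirror the log messages named above.

```python
def watcher_decision(auto_watch: bool, already_indexed: bool) -> str:
    """Decide what happens to the git watcher on MCP initialize."""
    if not already_indexed:
        # Nothing to watch until index_repository has been run manually.
        return "not_indexed"
    if auto_watch:
        # Opt-in: git polling may trigger re-indexing in the background.
        return "watcher.watch"
    # Default: existing index is left alone until the next manual re-index.
    return "watcher.skip"


for auto_watch in (False, True):
    for indexed in (False, True):
        print(auto_watch, indexed, watcher_decision(auto_watch, indexed))
```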

CLI equivalent:

codebase-memory-mcp cli index_repository '{"repo_path": "/path/to/repo"}'
codebase-memory-mcp cli list_projects '{}'
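
When scripting these calls, building the JSON argument programmatically avoids shell-quoting mistakes, which are easy to make on Windows. A minimal Python sketch follows, assuming the binary is on PATH under the name shown above; the helper `cli_command` is hypothetical.

```python
import json
import shlex


def cli_command(tool: str, params: dict, binary: str = "codebase-memory-mcp") -> list:
    """Build the argv list for a 'cli <tool> <json>' invocation."""
    return [binary, "cli", tool, json.dumps(params)]


cmd = cli_command("index_repository", {"repo_path": "/path/to/repo"})
# shlex.join shows the POSIX-shell form of the command for logging.
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the list to `subprocess.run` skips the shell entirely, so the JSON needs no extra quoting.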

MCP tools (from agent): index_repository, list_projects, index_status, search_graph, trace_path, etc.


What NOT to do

| Avoid | Why |
| --- | --- |
| config set auto_index true | Indexes on every new MCP connect |
| config set auto_watch true | Git polling re-index after manual index |
| install --hooks | PreToolUse Grep/Glob augmenter spawns on every search |
| Same repo at two paths without this PR | Duplicate .db files (e.g. monaith vs monaith-current); this PR dedups by git identity |
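
To illustrate what deduplication by git identity can mean, here is a Python sketch that maps different spellings of one remote URL to a single key and reuses the first project name registered for it. It is not how cbm_find_existing_project_name is implemented, which this description does not show; `normalize_remote`, `resolve_project_name`, and the example.com URLs are all hypothetical.

```python
import re
from urllib.parse import urlparse


def normalize_remote(url: str) -> str:
    """Reduce a git remote URL to 'host/path' without scheme, user or '.git'."""
    url = url.strip()
    scp_style = re.match(r"^[\w.-]+@([\w.-]+):(.+)$", url)  # git@host:owner/repo
    if scp_style:
        host, path = scp_style.group(1), scp_style.group(2)
    else:
        parsed = urlparse(url)
        host, path = parsed.hostname or "", parsed.path
    path = path.strip("/")
    if path.endswith(".git"):
        path = path[:-4]
    return host.lower() + "/" + path


def resolve_project_name(registry: dict, requested_name: str, remote_url: str) -> str:
    """Return the existing project name for this repo, or register the new one."""
    return registry.setdefault(normalize_remote(remote_url), requested_name)


registry = {}
first = resolve_project_name(registry, "monaith", "git@example.com:acme/monaith.git")
second = resolve_project_name(registry, "monaith-current", "https://example.com/acme/monaith")
print(first, second)
```

Both calls resolve to the same project name, so the second checkout would reuse the first index instead of creating a new .db file.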

Test plan

  • DCO sign-off
  • Lint (clang-format + cppcheck)
  • make -f Makefile.cbm test passes (CI)
  • MCP connect with existing index: no watcher registration (watcher.skip, not watcher.watch)
  • index_repository updates graph; no duplicate .db for same canonical repo
  • mem.init shows ~4096 budget_mb on 16GB host
  • install without --hooks leaves agent settings.json unchanged

Fork: https://github.com/Andy11-cpu/codebase-memory-mcp/tree/fix/contained-indexing

Andy11-cpu force-pushed the fix/contained-indexing branch 2 times, most recently from 2682180 to f6fc7cd on June 25, 2026 17:30
Stop registering the git watcher on every MCP initialize unless auto_watch is explicitly enabled (default false). Re-index only via index_repository or when auto_watch is turned on after a manual index.

Also deduplicate project DBs by canonical/git identity, tier RAM budget to 25% on 16GB machines, and make install hooks opt-in via --hooks.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: Andy11-cpu <canada11@duck.com>
Andy11-cpu force-pushed the fix/contained-indexing branch from f6fc7cd to 3450263 on June 25, 2026 17:52
@Andy11-cpu Andy11-cpu marked this pull request as draft June 26, 2026 07:27
@Andy11-cpu Andy11-cpu marked this pull request as ready for review June 26, 2026 07:37
@Andy11-cpu Andy11-cpu marked this pull request as draft June 26, 2026 07:41
@Andy11-cpu Andy11-cpu marked this pull request as ready for review June 26, 2026 07:43