New files not indexed — WAL-checkpoint blocked on successfully-indexed project (Windows, v0.6.0) #277

@davidseraphi

v0.6.0 (confirm via codebase-memory-mcp --version before filing), Windows 11, stdio MCP registration via project-scope .mcp.json.

Relationship to existing CLOSED issues

Triaged before filing; distinct from all three:

If this repro turns out to be a re-emergence of any of the above under a new code path, please re-close as duplicate with a pointer.

Symptom

index_repository returns status: indexed, nodes: N, edges: M, where N and M are the counts from before the new files were added: the indexer reports success, but the returned counts do not reflect files added since the last clean checkpoint. search_graph, search_code, and trace_path all return empty results for symbols in those files. This affects both a delete_project → index_repository rebuild AND incremental re-index cycles; that is, even a full rebuild from scratch through the MCP tool surface does NOT resolve it.

Root cause (empirically verified)

Multiple stdio-spawned codebase-memory-mcp.exe processes persist across client session closes — 10 orphan processes observed in our case, accumulated across multiple Claude Code sessions where stdio shutdown did not handshake cleanly. SQLite WAL at ~/.cache/codebase-memory-mcp/<project>.db-wal grows (9.1 MB observed) while the main <project>.db file's mtime stays frozen at the date of the last clean checkpoint. New symbols get written to WAL, but checkpoint-into-main cannot land because orphan-process connections hold read locks.
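The blocking mechanism can be reproduced in isolation. The following is a minimal sketch in Python (standard-library sqlite3 only), not the server's code: the table name, file name, and function name are made up for illustration, and a second connection with an open read transaction stands in for an orphan process. It shows PRAGMA wal_checkpoint(TRUNCATE) reporting busy while the old snapshot is held, then completing once it is released.

```python
import os
import sqlite3
import tempfile


def checkpoint_with_reader(db_path):
    """Return (blocked_row, released_row, wal_bytes) from two TRUNCATE checkpoints."""
    # isolation_level=None leaves transaction control to the explicit BEGIN/COMMIT below.
    # timeout=0 makes a blocked checkpoint report busy at once instead of retrying.
    writer = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL")
    writer.execute("CREATE TABLE nodes (name TEXT)")  # hypothetical schema
    writer.execute("INSERT INTO nodes VALUES ('old_fn')")

    # Stand-in for an orphan process: a second connection that opens a read
    # transaction and does not end it.
    reader = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    reader.execute("BEGIN")
    reader.execute("SELECT count(*) FROM nodes").fetchall()

    # A new symbol is committed, but for now it lives only in the WAL.
    writer.execute("INSERT INTO nodes VALUES ('new_fn')")

    # Result row layout: (busy, frames_in_wal, frames_checkpointed).
    blocked = writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()

    reader.execute("COMMIT")  # the "orphan" releases its snapshot
    released = writer.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
    wal_bytes = os.path.getsize(db_path + "-wal")

    reader.close()
    writer.close()
    return blocked, released, wal_bytes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        blocked, released, wal_bytes = checkpoint_with_reader(os.path.join(tmp, "demo.db"))
        print("reader open:  ", blocked)
        print("reader closed:", released, "WAL bytes:", wal_bytes)
```

In the first result the busy column is 1 and fewer frames are checkpointed than the WAL contains; in the second the checkpoint completes and the WAL file is truncated to zero bytes. Whether the orphan server processes hold a read transaction in exactly this way is an assumption; the sketch only shows that such a connection is sufficient to produce the symptom.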

Verified by: after running taskkill /F /IM codebase-memory-mcp.exe against all 10 processes, deleting ~/.cache/codebase-memory-mcp/<project>.db* (all three files: .db, .db-shm, .db-wal), and re-indexing the project, we observed a delta of +42 nodes / +41 edges for that project; multiple prior sessions' accumulated index writes finally landed. A delete_project + index_repository cycle alone does NOT resolve this; killing the orphan processes and deleting the cache files is required.

Minimal repro (Windows)

  1. Index a project via index_repository, confirm symbols appear in search_graph.
  2. Close the client session without a clean stdio-shutdown handshake (typical — stdio parent-death is SIGHUP-equivalent, not a graceful close).
  3. Open a new client session against the same project.
  4. Add a new .py file with a distinct function name.
  5. delete_project → index_repository → search_graph(name_pattern=".*new_fn.*") → total: 0.
  6. ls -la ~/.cache/codebase-memory-mcp/ shows .db-wal growing across retries; .db mtime stuck at first-ever-index date.
  7. tasklist | findstr codebase-memory-mcp shows N > 1 orphan processes.
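The file-level check in step 6 can also be scripted. This is a small sketch, assuming the cache layout described in this report (one <project>.db per project, with a sibling <project>.db-wal); the function name wal_report is hypothetical. It lists each database with its WAL size and the age of the main file, so a large WAL next to an old .db mtime stands out.

```python
import time
from pathlib import Path


def wal_report(cache_dir):
    """Return (db name, WAL bytes, seconds since main .db was modified) per database."""
    rows = []
    for db in sorted(Path(cache_dir).glob("*.db")):
        wal = Path(str(db) + "-wal")
        wal_bytes = wal.stat().st_size if wal.exists() else 0
        db_age = time.time() - db.stat().st_mtime
        rows.append((db.name, wal_bytes, db_age))
    return rows


if __name__ == "__main__":
    # Cache location as given in this report.
    cache = Path.home() / ".cache" / "codebase-memory-mcp"
    for name, wal_bytes, db_age in wal_report(cache):
        print(f"{name}: WAL {wal_bytes / 1e6:.1f} MB, "
              f"main file last modified {db_age / 3600:.1f} h ago")
```

A WAL that keeps growing across retries while the main file's age keeps increasing matches the stuck state described above.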

Workaround

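# Git Bash syntax: the doubled slashes stop MSYS from rewriting /F and /IM as paths.
# In cmd.exe or PowerShell, use: taskkill /F /IM codebase-memory-mcp.exe
taskkill //F //IM codebase-memory-mcp.exe
rm ~/.cache/codebase-memory-mcp/<project>.db*
# The next client tool call auto-respawns a fresh server via stdio
# (no manual restart needed if registered via .mcp.json).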

Verified working; the client-side MCP auto-respawn is transparent. Other projects' indexes in the cache are preserved (they have separate .db files per project).
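For anyone scripting the cleanup step, here is a minimal sketch of the file deletion in Python. The function name remove_project_cache is hypothetical. It names the three files explicitly rather than using a glob, so only the given project's database is touched. It assumes the orphan processes have already been killed; on Windows the deletes may fail while a process still has the files open.

```python
from pathlib import Path


def remove_project_cache(cache_dir, project):
    """Delete <project>.db, .db-shm and .db-wal; return the names actually removed."""
    removed = []
    for suffix in (".db", ".db-shm", ".db-wal"):
        path = Path(cache_dir) / f"{project}{suffix}"
        if path.exists():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling remove_project_cache(Path.home() / ".cache" / "codebase-memory-mcp", "<project>") with the real project name in place of the placeholder mirrors the rm line above.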

Proposed fixes

  • On delete_project: issue PRAGMA wal_checkpoint(TRUNCATE) before touching files on disk.
  • On startup: detect stale .db-wal beyond a threshold (10 MB?) and either force-checkpoint or emit a loud warning.
  • On stdio-close: trap SIGTERM / stdio EOF and run PRAGMA wal_checkpoint(TRUNCATE) before exit.
  • Document the orphan-process failure mode in the README's troubleshooting section.
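The startup check in the second bullet could look roughly like the following. This is a sketch in Python for illustration only, not a patch against the server: the function name, return values, and the 10 MB constant (taken from the tentative threshold above) are all assumptions. It attempts a TRUNCATE checkpoint when the WAL exceeds the threshold and reports whether the checkpoint was blocked, which is the point at which a loud warning would be emitted.

```python
import os
import sqlite3

STALE_WAL_BYTES = 10 * 1024 * 1024  # the tentative "10 MB?" threshold from the list above


def checkpoint_if_stale(db_path, threshold=STALE_WAL_BYTES):
    """Return 'ok', 'checkpointed', or 'blocked' for the database at db_path."""
    wal_path = db_path + "-wal"
    if not os.path.exists(wal_path) or os.path.getsize(wal_path) <= threshold:
        return "ok"
    # timeout=0: do not wait on other connections, just report the current state.
    conn = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    try:
        busy = conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()[0]
    except sqlite3.OperationalError:
        # Treat a lock error raised before the checkpoint could run as blocked too.
        busy = 1
    finally:
        conn.close()
    return "blocked" if busy else "checkpointed"
```

A 'blocked' result at startup would indicate that some other connection, such as an orphan process, is still pinning the WAL, and is the case worth surfacing to the user.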

Downstream context

Encountered while building Outside-Diff Impact Slicing (ODSC) review agents that depend on this MCP's trace_path + detect_changes. Our diagnostic trail is at
.claude/agents/VALIDATION_LOG.md Entry 2 (§"Cold-start resolution (2026-04-21, same session)") of the downstream project; see
https://github.com/davidseraphi/PROJECT_MEMORY_ENGINE_OS/blob/master/.claude/agents/VALIDATION_LOG.md
(public once the current local-ahead commits are pushed — available on request if not yet visible when this issue is triaged).

Happy to run additional diagnostics on our repro on request (e.g., sqlite3 <db> 'PRAGMA wal_checkpoint;' against the stuck-state DB to confirm the checkpoint-cannot-land hypothesis mechanically).
