Deep codebase understanding for AI coding agents, without smashing your context window.
When AI coding agents encounter a large codebase, they typically do one of two things: skim too quickly and miss critical details, or read too much and exhaust their context window. Either way, they start writing code with a shallow understanding, and the results show it.
comprehend teaches AI agents to systematically understand a codebase before touching it. Instead of dumping files into the context window, it uses a measure-first protocol that produces smaller, richer context than brute-force approaches:
- Measure the problem (file count, total size, structure)
- Plan the right strategy based on actual measurements
- Fan out parallel subagents to read and analyze different parts of the codebase
- Accumulate findings in a persistent REPL — facts survive across tool calls instead of evaporating
- Synthesize a deep understanding from structured results
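The first two steps above (measure, then plan) can be sketched in a few lines of Python. This is only an illustration: the function names `measure` and `plan` and both byte thresholds are assumptions made for the example, not values taken from the skill itself.

```python
import os

# Hypothetical thresholds in bytes; the skill's real cutoffs may differ.
DIRECT_LIMIT = 50_000
RECURSIVE_LIMIT = 500_000


def measure(root):
    """Walk a directory tree and return (file_count, total_bytes)."""
    file_count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                continue  # skip unreadable files and broken symlinks
            file_count += 1
    return file_count, total_bytes


def plan(total_bytes):
    """Pick an analysis strategy from the measured size."""
    if total_bytes <= DIRECT_LIMIT:
        return "direct"
    if total_bytes <= RECURSIVE_LIMIT:
        return "recursive"
    return "batched-parallel"
```

Calling `plan(measure(".")[1])` before reading anything gives the agent a strategy based on numbers rather than a guess.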
The persistent REPL is the key insight. It acts as shared memory: subagents write their findings to named variables, and the parent agent reads and aggregates them with plain Python. Nothing gets lost. Nothing bloats the context window.
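A minimal sketch of the shared-memory idea, assuming a single namespace dict stands in for the REPL process. The `run` helper and the variable names (`auth_findings`, `db_findings`) are hypothetical; in practice each snippet would arrive through the REPL client rather than a local function call.

```python
# One namespace dict stands in for the REPL's persistent state.
namespace = {}


def run(code):
    """Execute a snippet against the shared namespace, as one REPL call would."""
    exec(code, namespace)


# Each call models a separate tool call, possibly from a different subagent.
run("auth_findings = {'files': 3, 'issues': ['token never expires']}")
run("db_findings = {'files': 5, 'issues': []}")

# The parent aggregates later with plain Python; earlier results are still there.
run("all_issues = auth_findings['issues'] + db_findings['issues']")
run("total_files = auth_findings['files'] + db_findings['files']")

print(namespace["all_issues"])   # ['token never expires']
print(namespace["total_files"])  # 8
```

Only the small aggregated values need to enter the parent agent's context; the raw per-subagent findings stay in the namespace.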
| | Without comprehend | With comprehend |
|---|---|---|
| Context usage | Reads files directly into the agent's window until it runs out of room | Subagents do the reading; main agent stays small and focused |
| Cross-file reasoning | Reads file A, then file B, then forgets half of file A | Builds an architecture map in the REPL, then traces specific paths |
| Large files | Truncates or skims | Chunks at natural boundaries (function defs, markdown headers, timestamps) and analyzes each chunk in parallel |
| State between calls | Every tool call starts fresh | REPL variables, imports, and functions persist for the entire session |
comprehend is a skill for Claude Code, OpenAI, Gemini, or your favorite LLM du jour. It provides:
- A context assessment protocol -- measure before analyzing, choose the right strategy for the size
- Three analysis primitives -- direct (small), recursive (medium), batched parallel (large)
- A persistent REPL server -- Python process over Unix socket that maintains state across shell calls
- A text chunking utility -- splits files at natural boundaries, not arbitrary character counts
- Worked examples -- five real patterns (log analysis, code review, document Q&A, data comparison, cross-file tracing)
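To illustrate what splitting at natural boundaries means, here is a simplified sketch. It is not the code in `chunk_text.py`; the `chunk_at_boundaries` function and its `BOUNDARY` pattern are assumptions made for this example, and the size limit is treated as a soft target.

```python
import re

# Lines that start a top-level Python definition or a markdown header.
# Simplification: a column-0 Python comment ("# ...") also matches.
BOUNDARY = re.compile(r"^(def |class |#{1,6} )")


def chunk_at_boundaries(text, max_chars):
    """Split text into chunks, breaking only on boundary lines.

    max_chars is a soft limit: a chunk grows past it until the next
    boundary line appears, so no function or section is cut in half.
    """
    chunks = []
    current = []
    current_len = 0
    for line in text.splitlines(keepends=True):
        would_overflow = current_len + len(line) > max_chars
        if current and would_overflow and BOUNDARY.match(line):
            chunks.append("".join(current))
            current, current_len = [], 0
        current.append(line)
        current_len += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Joining the returned chunks reproduces the input exactly, so nothing is dropped or duplicated between chunks.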
npx skills add johnwbyrd/comprehend

Or install manually for a single project:

git clone https://github.com/johnwbyrd/comprehend.git
mkdir -p .claude/skills
cp -r comprehend/skills/comprehend .claude/skills/

Or install for every project under your user account:

git clone https://github.com/johnwbyrd/comprehend.git
mkdir -p ~/.claude/skills
cp -r comprehend/skills/comprehend ~/.claude/skills/

Invoke with /comprehend, or let it activate automatically when the agent encounters large-context analysis tasks: anything involving files over 50KB, multi-file analysis, or codebase-wide understanding.
All scripts are pure Python 3 with no external dependencies.
chunk_text.py — Measure and split files at natural boundaries:
python .claude/skills/comprehend/scripts/chunk_text.py info large_file.txt
python .claude/skills/comprehend/scripts/chunk_text.py chunk large_file.txt --size 80000
python .claude/skills/comprehend/scripts/chunk_text.py boundaries source.py

repl_server.py / repl_client.py -- Persistent Python REPL over Unix socket (TCP on Windows):
REPL_ADDR=$(python .claude/skills/comprehend/scripts/repl_server.py --make-addr)
nohup python .claude/skills/comprehend/scripts/repl_server.py "$REPL_ADDR" > /dev/null 2>&1 &
python .claude/skills/comprehend/scripts/repl_client.py "$REPL_ADDR" 'x = 42'
python .claude/skills/comprehend/scripts/repl_client.py "$REPL_ADDR" 'print(x + 1)' # 43
python .claude/skills/comprehend/scripts/repl_client.py "$REPL_ADDR" --vars
python .claude/skills/comprehend/scripts/repl_client.py "$REPL_ADDR" --shutdown

The persistent REPL gives agents arbitrary Python execution that persists across tool calls. This is powerful but carries significant risk if your LLM behaves maliciously. Use comprehend in a sandboxed environment (containers, VMs, or your agent platform's built-in sandbox). Note carefully the limitation of liability in the LICENSE file.
Inspired by the RLM framework from the MIT OASYS lab (paper, blog). The core idea -- that language models should decompose large problems into smaller ones, using persistent state to accumulate findings -- maps naturally onto the subagent + REPL architecture that modern coding agents already have.
Also related: RAPTOR (Recursive Abstractive Processing for Tree-Organized Retrieval), which builds hierarchical summaries bottom-up over chunks. comprehend uses a similar structure -- chunk, summarize in parallel, aggregate -- but operates online during a session rather than as an offline indexing step, and relies on the agent to choose natural boundaries rather than on embedding-based clustering.
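The chunk, summarize in parallel, aggregate structure can be sketched with the standard library. This is a rough illustration only: `summarize` is a hypothetical stand-in for a subagent call (here it just counts lines), and `map_reduce` is a name chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def summarize(chunk):
    """Hypothetical stand-in for a subagent; it only records simple facts."""
    lines = chunk.splitlines()
    return {"lines": len(lines), "head": lines[0] if lines else ""}


def map_reduce(chunks, max_workers=4):
    # Fan out: one summary per chunk, computed concurrently.
    # pool.map returns results in the same order as the inputs.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        summaries = list(pool.map(summarize, chunks))
    # Aggregate: fold the per-chunk results into one structure.
    return {
        "chunks": len(summaries),
        "total_lines": sum(s["lines"] for s in summaries),
        "heads": [s["head"] for s in summaries],
    }


result = map_reduce(["def a():\n    pass\n", "def b():\n    pass\n"])
print(result["total_lines"])  # 4
```

Swapping `summarize` for a real subagent call keeps the same shape: the parent only ever sees the compact aggregated result.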