A small CLI for understanding how Claude Code's multi-project context isolation works by building a clean-room version of it.
The goal is not to replace Claude Code. The goal is to make three normally-invisible mechanisms observable:
- Stateless model contract — every API call rebuilds the full context from disk.
- Stateful sessions on disk — each project's conversation lives in one append-only JSONL file.
- Filesystem as the boundary — one folder is one workspace; isolation is topological, not logical.
If those bullets feel abstract, the demo below makes them concrete in about 60 seconds.
Requires Node ≥ 24 (for process.loadEnvFile).
npm install
npm run typecheck
npm test # 19 tests across 5 suites

Set up credentials in `.env` at the repo root:
ANTHROPIC_API_KEY=sk-...
# Optional: route through an Anthropic-compatible gateway
ANTHROPIC_BASE_URL=https://your-gateway.example.com/v1

Heads up: shell env shadows `.env`. Node's `process.loadEnvFile()` does not overwrite variables already set in your shell. If `chat` exits with `API auth: ANTHROPIC_API_KEY is not set` even though your `.env` looks right, check `env | grep ANTHROPIC` (or, in PowerShell, `Get-ChildItem env:ANTHROPIC*`): an empty shell variable will shadow your file value.
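The shadowing rule can be checked mechanically. The following is a minimal sketch, not part of Cleanroom: `findShadowedKeys` is a hypothetical helper that compares the keys parsed from `.env` against the shell environment as it looked before loading, and it assumes (as described above) that a variable set to the empty string still counts as set.

```typescript
// Hypothetical helper, not part of Cleanroom: reports .env keys that the
// shell environment already defines (including as an empty string), because
// process.loadEnvFile() leaves existing values untouched.
type Env = Record<string, string | undefined>;

function findShadowedKeys(shellEnv: Env, fileVars: Record<string, string>): string[] {
  return Object.keys(fileVars).filter(
    (key) => shellEnv[key] !== undefined && shellEnv[key] !== fileVars[key],
  );
}

// Example: the empty shell variable wins over the value in the file.
const shadowed = findShadowedKeys(
  { ANTHROPIC_API_KEY: "" },
  {
    ANTHROPIC_API_KEY: "sk-example",
    ANTHROPIC_BASE_URL: "https://your-gateway.example.com/v1",
  },
);
console.log(shadowed); // reports ANTHROPIC_API_KEY only
```

Unsetting the reported variables in the shell (rather than editing `.env`) is the fix in that situation.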
All commands operate on a workspace identified by --workspace <id> (short form -w) or the env
var $CLEANROOM_WORKSPACE. The workspace ID is the folder name under $CLEANROOM_HOME
(default ~/Cleanroom / C:\Users\<you>\Cleanroom).
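The lookup described above could be sketched as follows. This is illustrative only: `resolveWorkspaceDir` and its input shape are hypothetical names, and the sketch assumes the command-line flag takes precedence over `$CLEANROOM_WORKSPACE`, which the text above does not state explicitly.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical input shape: the parsed flag plus an environment snapshot.
interface ResolveInput {
  flag?: string; // value of --workspace / -w, if given
  env: Record<string, string | undefined>;
}

// Assumption: the flag wins over $CLEANROOM_WORKSPACE when both are present.
function resolveWorkspaceDir(input: ResolveInput): string {
  const id = input.flag ?? input.env.CLEANROOM_WORKSPACE;
  if (!id) {
    throw new Error("no workspace: pass --workspace <id> or set CLEANROOM_WORKSPACE");
  }
  // $CLEANROOM_HOME defaults to a Cleanroom folder in the user's home directory.
  const home = input.env.CLEANROOM_HOME ?? path.join(os.homedir(), "Cleanroom");
  return path.join(home, id);
}
```

Passing the environment in as an argument, instead of reading `process.env` inside the function, keeps the resolver easy to test.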
| Command | What it does |
|---|---|
| `cleanroom new <name>` | Create a workspace directory with default files |
| `cleanroom list` | List all workspaces under `$CLEANROOM_HOME` |
| `cleanroom chat -w <id> "<msg>"` | Send one message; rebuild context, call the model, append both turns to the log |
| `cleanroom inspect -w <id> [--next "<msg>"]` | Show exactly what the next call would send (no API call, no writes) |
| `cleanroom tail -w <id> [-n 20]` | Print the last N raw JSONL entries |
| `cleanroom events -w <id>` | Print the events log (`session_loaded` / `exchange_completed` / `error`) |
~/Cleanroom/
└── demo/                     # workspace = directory
    ├── context.md            # system prompt for this workspace
    ├── .sessions/
    │   └── current.jsonl     # append-only conversation log
    └── .cleanroom/
        ├── settings.json     # { model, maxInputTokens, warningRatio }
        └── events.jsonl      # observability log (best-effort)
This is the demo that makes "stateless model + filesystem state" click.
Step 1: Create a workspace and give it a personality.
cleanroom new demo
# write the system prompt
printf 'You are a terse code reviewer. Reply in 1 sentence.\n' > ~/Cleanroom/demo/context.md

Step 2: Ask a question.
$ cleanroom chat -w demo "I just wrote a 200-line function with no tests. Thoughts?"
That's a maintainability and reliability risk—split it into smaller focused functions and add at
least unit tests around core logic and edge cases before merging.

One-sentence terse review. As asked.
Step 3: Edit `context.md` without restarting anything. Just change the file on disk.
printf 'You are an enthusiastic cheerleader. Always celebrate the user.\n' > ~/Cleanroom/demo/context.md

Step 4: Ask the exact same question in the same workspace.
$ cleanroom chat -w demo "I just wrote a 200-line function with no tests. Thoughts?"
Nice hustle getting it written — that's real momentum. 🎉
Now protect Future You:
- **200 lines + no tests = high bug risk**
- **Refactor** into smaller functions (single responsibility)
- Add a **few fast unit tests** first:
- happy path
- edge cases
- error handling
...

Same workspace, same conversation history, completely different voice.
Why this works. Cleanroom holds zero conversation state in memory between invocations.
Each chat call re-reads context.md from disk, re-reads the full session log from disk, rebuilds
the API payload from scratch, and sends it. There is no live process to "remember" the old persona.
The file IS the persona. Edit the file → next call uses the new persona — even mid-conversation,
even with prior turns in history.
This is exactly how Claude Code picks up CLAUDE.md edits the moment you save them.
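The per-invocation rebuild can be sketched in a few lines. This is a simplified illustration, not Cleanroom's actual code: `buildPayloadFromDisk`, the `LogEntry` fields, and the payload shape are assumptions, and token budgeting is left out entirely. The demo at the bottom writes to a throwaway temp directory and edits `context.md` between two builds.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Assumed entry and payload shapes; the real schema lives in the spec.
interface LogEntry { type: string; role?: "user" | "assistant"; content?: string }
interface Message { role: "user" | "assistant"; content: string }
interface Payload { system: string; messages: Message[] }

function buildPayloadFromDisk(workspaceDir: string, nextMessage: string): Payload {
  // Re-read the system prompt on every call: no cache, so edits apply at once.
  const system = fs.readFileSync(path.join(workspaceDir, "context.md"), "utf8").trim();

  // Re-read the whole append-only log; a missing file means an empty history.
  const logPath = path.join(workspaceDir, ".sessions", "current.jsonl");
  const raw = fs.existsSync(logPath) ? fs.readFileSync(logPath, "utf8") : "";
  const history: Message[] = [];
  for (const line of raw.split("\n")) {
    if (line.trim() === "") continue;
    const entry = JSON.parse(line) as LogEntry;
    if ((entry.role === "user" || entry.role === "assistant") && typeof entry.content === "string") {
      history.push({ role: entry.role, content: entry.content });
    }
  }
  return { system, messages: [...history, { role: "user", content: nextMessage }] };
}

// Demo: change the persona file between two builds in a temp workspace.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "cleanroom-sketch-"));
fs.writeFileSync(path.join(dir, "context.md"), "You are a terse code reviewer.\n");
const first = buildPayloadFromDisk(dir, "Thoughts?");
fs.writeFileSync(path.join(dir, "context.md"), "You are an enthusiastic cheerleader.\n");
const second = buildPayloadFromDisk(dir, "Thoughts?");
console.log(first.system, "->", second.system);
```

Because nothing is held between the two calls, the second payload carries the new system prompt even though no process was restarted.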
inspect shows the next call's payload without making the call. Run it right after step 3 above:
$ cleanroom inspect -w demo
=== Workspace ===
id: demo
path: C:\Users\globa\Cleanroom\demo
context.md: C:\Users\globa\Cleanroom\demo\context.md
session: C:\Users\globa\Cleanroom\demo\.sessions\current.jsonl
model: claude-haiku-4-5-20251001
tokenizer: claude-tokenizer@0.0.4
boundary: Only this workspace is active
=== Budget ===
used: 211 / 100000 (0.2%)
system: 13 tok
history: 198 tok (4 loaded, 0 trimmed of 4 total)
current message: 0 tok
=== System prompt ===
You are an enthusiastic cheerleader. Always celebrate the user.
=== Messages (4 included) ===
[1] user
I just wrote a 200-line function with no tests. Thoughts?
[2] assistant
That's a maintainability and reliability risk—split it into smaller focused functions...
[3] user
...
Two things to notice:
- System prompt is the new cheerleader text: `context.md` is re-read every invocation.
- Messages include the prior terse reviewer reply: the session log is the ground truth, and the system prompt is layered over it.
Run cleanroom inspect twice with an edit between them; the System prompt section changes
immediately. That's the proof that there's no in-memory cache.
cleanroom new alpha
cleanroom new beta
printf 'The secret word is PURPLE.\n' > ~/Cleanroom/alpha/context.md
printf 'The secret word is GREEN.\n' > ~/Cleanroom/beta/context.md
cleanroom chat -w alpha "What is the secret word?"
# → PURPLE
cleanroom chat -w beta "What is the secret word?"
# → GREEN

Workspaces share nothing. Nothing under `~/Cleanroom/beta/` is opened during `chat -w alpha`. This is enforced by `tests/isolation.test.ts`, which wraps `fs.promises.{access,stat,readFile,readdir,open}` and asserts that no recorded read path crosses workspace roots.
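The core of such an assertion is a path-containment check. The sketch below is not the code in `tests/isolation.test.ts`; `isInsideRoot` and `findCrossings` are hypothetical helpers showing one way to decide whether a recorded read path stays inside the active workspace root.

```typescript
import * as path from "node:path";

// True when `candidate` resolves to `root` itself or to something beneath it.
function isInsideRoot(root: string, candidate: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(candidate));
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  // An absolute result means a different drive on Windows.
  return !escapes && !path.isAbsolute(rel);
}

// Returns every recorded read path that leaves the active workspace.
function findCrossings(activeRoot: string, recordedReads: string[]): string[] {
  return recordedReads.filter((p) => !isInsideRoot(activeRoot, p));
}

const crossings = findCrossings("/home/u/Cleanroom/alpha", [
  "/home/u/Cleanroom/alpha/context.md",
  "/home/u/Cleanroom/alpha/.sessions/current.jsonl",
  "/home/u/Cleanroom/beta/context.md",
]);
console.log(crossings.length); // 1: only the read under beta/ crosses
```

Comparing resolved relative paths rather than string prefixes avoids treating a sibling such as `alphabet/` as part of `alpha/`.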
cleanroom events shows the observability log. The messages=N field on session_loaded is the
direct evidence of rebuild-from-disk: it counts the entries the next call sees in the on-disk
log, computed fresh each time.
$ cleanroom events -w demo
2026-05-20T09:57:17Z session_loaded ws=demo messages=0
2026-05-20T09:57:22Z exchange_completed ws=demo user=01KS... asst=01KS... in=264 out=34
2026-05-20T09:57:40Z session_loaded ws=demo messages=2
2026-05-20T09:57:49Z exchange_completed ws=demo user=01KS... asst=01KS... in=317 out=200
messages=0 → messages=2 between two chat calls means the second invocation found two prior
turns on disk. If we held state in memory, this number wouldn't need to come from disk — and the
file-edit demo above wouldn't work.
| Code | Meaning |
|---|---|
| 0 | success |
| 1 | generic error |
| 2 | workspace not found |
| 3 | context overflow (system prompt + current message exceed maxInputTokens) |
| 4 | API error (network / auth / rate limit / server) |
| 5 | JSONL corruption that cannot be tolerated (reserved; MVP rarely throws this) |
Every error message includes the workspace ID. Spec §9.
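One way to keep the table and the code in sync is a single lookup keyed by failure kind. The following is a sketch under assumptions: the `FailureKind` names, `describeFailure`, and the `[ws=...]` message prefix are all hypothetical; only the numeric codes and the rule that every message carries the workspace ID come from above.

```typescript
// Hypothetical failure kinds; the numeric values mirror the table above.
type FailureKind =
  | "generic"
  | "workspace_not_found"
  | "context_overflow"
  | "api"
  | "jsonl_corruption";

const EXIT_CODES: Record<FailureKind | "success", number> = {
  success: 0,
  generic: 1,
  workspace_not_found: 2,
  context_overflow: 3,
  api: 4,
  jsonl_corruption: 5,
};

// Builds the exit code and a message that always names the workspace.
function describeFailure(
  kind: FailureKind,
  workspaceId: string,
  detail: string,
): { code: number; message: string } {
  return { code: EXIT_CODES[kind], message: `[ws=${workspaceId}] ${detail}` };
}

const failure = describeFailure("workspace_not_found", "demo", "workspace not found");
console.log(failure.code, failure.message);
// A CLI entry point would typically finish with: process.exitCode = failure.code;
```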
The full engineering spec — schemas, the context-builder algorithm, isolation contract, exit
codes, what's intentionally out of scope — lives in
docs/cleanroom-spec-cli-mvp.md. Read that before any
non-trivial change.
The five module rules worth knowing without reading the spec:
- `src/core/context-builder.ts`, `tokenizer.ts`, `types.ts` are pure. No I/O. The builder takes a `BuildContextInput` (already-loaded system prompt + already-parsed history) and returns the API payload + budget report. The I/O lives in the command layer.
- No in-memory state survives a `chat` invocation. No module-level cache, no singleton conversation store, no `process.env.CURRENT_WORKSPACE`. Everything is keyed by workspace path and threaded through arguments.
- The Context Builder never writes. Reads only. It is safe to run from `inspect` without side effects.
- System prompt and the current user message are never trimmed. If they don't fit alone, throw `ContextOverflowError` (exit 3). History is trimmed from the oldest end.
- Unknown JSONL `type` values are skipped silently. This is how we ship MVP today and add `tool_use`/`tool_result` later without breaking old readers (`tests/forward-compat.test.ts` pins this).
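The trimming and skip rules in the list above can be sketched as one pure function. This is an illustration, not the real `context-builder.ts`: the input field names, the `"message"` entry type, and the injected `countTokens` callback are assumptions made so the example needs no tokenizer and performs no I/O.

```typescript
// Assumed shapes; only the rules themselves come from the list above.
interface Entry { type: string; role?: "user" | "assistant"; content?: string }
interface Message { role: "user" | "assistant"; content: string }

class ContextOverflowError extends Error {}

function buildContext(input: {
  systemPrompt: string;
  history: Entry[];
  currentMessage: string;
  maxInputTokens: number;
  countTokens: (text: string) => number; // injected so the builder stays pure
}): { messages: Message[]; used: number; trimmed: number } {
  const { countTokens, maxInputTokens } = input;

  // Skip entry types this reader does not understand (e.g. a future tool_use).
  const known: Message[] = [];
  for (const e of input.history) {
    if (e.type === "message" && e.role !== undefined && e.content !== undefined) {
      known.push({ role: e.role, content: e.content });
    }
  }

  // System prompt and current message are never trimmed: they fit or we fail.
  let used = countTokens(input.systemPrompt) + countTokens(input.currentMessage);
  if (used > maxInputTokens) {
    throw new ContextOverflowError("system prompt + current message exceed maxInputTokens");
  }

  // Walk from newest to oldest so the oldest turns are the ones dropped.
  const kept: Message[] = [];
  for (let i = known.length - 1; i >= 0; i--) {
    const cost = countTokens(known[i].content);
    if (used + cost > maxInputTokens) break;
    used += cost;
    kept.unshift(known[i]);
  }

  const messages = [...kept];
  if (input.currentMessage !== "") {
    messages.push({ role: "user", content: input.currentMessage });
  }
  return { messages, used, trimmed: known.length - kept.length };
}
```

Since the function only reads its arguments and returns a value, the same call can back both `chat` and `inspect`; only the caller decides whether anything is sent or written.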
- Not a streaming UI. `chat` blocks until the reply is complete.
- Not a tool-calling runtime. The JSONL schema reserves `tool_use`/`tool_result` so future you can add them without a migration; the MVP doesn't emit them.
- Not a compaction engine. `SystemNote` is reserved for summaries; not implemented.
- Not multi-session, not search, not fork. Each workspace has exactly one `current.jsonl`.
See docs/cleanroom-spec-cli-mvp.md §12 for the full out-of-scope
list and why.