feat: guard against OOM on oversized workspace roots #1

Merged: ForeverInLaw merged 5 commits into main from feat/index-root-guards on Jun 2, 2026
@ForeverInLaw ForeverInLaw commented Jun 2, 2026

Problem

The indexer walks the whole tree via ignore::WalkBuilder with no guard on root size. The adapter defaults the workspace to cwd, so opening an agent session in C:\ or %USERPROFILE% makes the daemon index the entire drive or home directory. Every file's contents, symbols, BM25 tokens, and embeddings are held in RAM, which leads to OOM. Only per-file defenses existed (SKIP_DIRS, MAX_FILE_BYTES).

Changes (3 layers)

  1. Root guard (continuum-daemon): refuse background indexing and recursive watching when the workspace root is a filesystem root or the user home dir. The daemon still serves memory + on-demand text search. Override with CONTINUUM_ALLOW_LARGE_ROOT=1.
  2. File-count cap (continuum-indexer): bound a pass to CONTINUUM_MAX_FILES (default 50000, 0 = unlimited); truncation is logged. Backstop for big trees that aren't home/drive roots.
  3. Broader SKIP_DIRS: add VCS, dependency, build-output, language-cache, and home bloat dirs (AppData, Library, .cache, .cargo, vendor, …).
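
For reviewers who want the shape of layer 1 without opening the diff, here is a minimal sketch of a root guard. It is not the code merged in this PR: the names `RootRefusal`, `refuse_root`, and `allow_large_root_from_env` are hypothetical, both paths are assumed to be canonicalized by the caller, and treating only the exact value `1` as the override is an assumption.

```rust
use std::path::Path;

/// Why automatic indexing was refused (hypothetical type, for illustration).
#[derive(Debug, PartialEq, Eq)]
pub enum RootRefusal {
    FilesystemRoot,
    HomeDir,
}

/// Reads the override described in the PR. Assumption: only "1" enables it.
pub fn allow_large_root_from_env() -> bool {
    std::env::var("CONTINUUM_ALLOW_LARGE_ROOT")
        .map(|v| v == "1")
        .unwrap_or(false)
}

/// Returns `Some(reason)` when `root` should not be indexed automatically.
/// `root` and `home` are assumed to be canonicalized already.
pub fn refuse_root(
    root: &Path,
    home: Option<&Path>,
    allow_large: bool,
) -> Option<RootRefusal> {
    if allow_large {
        return None;
    }
    // A filesystem root is rooted and has no parent (`/`, or `C:\` on Windows).
    if root.has_root() && root.parent().is_none() {
        return Some(RootRefusal::FilesystemRoot);
    }
    // Exact match only: subdirectories of home are deliberately not refused.
    if home == Some(root) {
        return Some(RootRefusal::HomeDir);
    }
    None
}
```

A caller would pass `allow_large_root_from_env()` as the third argument and, on `Some(reason)`, skip the background index and recursive watcher while keeping the daemon running.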

Plus README env-var docs.
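
The CONTINUUM_MAX_FILES semantics (default 50000, 0 = unlimited, truncation reported) can be sketched as below. This is an illustration under assumptions, not the indexer's actual code: `parse_max_files` and `cap_files` are hypothetical helpers, and falling back to the default on an unset or unparsable value is an assumption.

```rust
/// Default cap stated in the PR description.
const DEFAULT_MAX_FILES: usize = 50_000;

/// Interprets a raw CONTINUUM_MAX_FILES value. `None` means unlimited.
pub fn parse_max_files(raw: Option<&str>) -> Option<usize> {
    match raw.and_then(|s| s.trim().parse::<usize>().ok()) {
        Some(0) => None,                 // 0 disables the cap
        Some(n) => Some(n),              // explicit cap
        None => Some(DEFAULT_MAX_FILES), // unset or unparsable (assumed fallback)
    }
}

/// Keeps at most `cap` items and reports whether anything was dropped,
/// so the caller can log the truncation.
pub fn cap_files<I: Iterator>(files: I, cap: Option<usize>) -> (Vec<I::Item>, bool) {
    match cap {
        None => (files.collect(), false),
        Some(n) => {
            let mut files = files;
            let kept: Vec<I::Item> = files.by_ref().take(n).collect();
            // One more item after the cap means the pass was truncated.
            let truncated = files.next().is_some();
            (kept, truncated)
        }
    }
}
```

In a real pass the iterator would be the walker's file entries, and the value would come from `std::env::var("CONTINUUM_MAX_FILES").ok()`.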

Verify

  • cargo build -p continuum-daemon -p continuum-indexer — clean.
  • cargo test -p continuum-indexer — 18 passed (new SKIP_DIRS assertions for .venv, AppData).
  • Manual: run the daemon with --workspace C:\ → it logs "skipping automatic indexing: ... is a filesystem root" and stays up; with CONTINUUM_ALLOW_LARGE_ROOT=1 set → it indexes (capped at 50k).

Risks

  • Home-dir match is exact (canonicalized); home subdirs (Desktop, Documents) are not guarded and are caught by the file cap instead.
  • out/build/Library as skip names could hide a real source dir that shares one of those names; mitigated by the 0-disable cap and standard naming.
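
To make the second risk concrete, here is a sketch of name-based directory skipping. The list is an illustrative subset of names mentioned in this PR, and `is_skipped_dir` is a hypothetical helper; exact, case-sensitive matching on the final path component is an assumption about how the indexer compares names.

```rust
use std::path::Path;

/// Illustrative subset of the names listed in this PR; the real list is longer.
const SKIP_DIRS: &[&str] = &[
    "AppData", "Library", ".cache", ".cargo", "vendor", ".venv", "out", "build",
];

/// True when the last path component exactly matches a skip name.
/// The match is by name only, so a genuine source directory called
/// `build` or `out` is skipped just like a build-output directory.
pub fn is_skipped_dir(path: &Path) -> bool {
    path.file_name()
        .and_then(|name| name.to_str())
        .map_or(false, |name| SKIP_DIRS.contains(&name))
}
```

Because only the name is inspected, `proj/build` is skipped even if it holds hand-written sources, while `proj/builder` is not.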

Add version-control, dependency, build-output, language-cache, and
home-directory bloat dirs (AppData, Library, .cache, .cargo, vendor,
etc.) to SKIP_DIRS so a misaimed workspace root pulls in far less.

Bound a single index pass to CONTINUUM_MAX_FILES (default 50000, 0
disables). A workspace rooted at a huge tree no longer loads unbounded
symbols, tokens, and embeddings into memory; truncation is logged.

Refuse background indexing and recursive watching when the workspace
root is a filesystem root or the user's home directory, the cases that
trigger OOM. Memory and on-demand text search still work; override with
CONTINUUM_ALLOW_LARGE_ROOT=1.

@ForeverInLaw ForeverInLaw merged commit 437218f into main Jun 2, 2026
3 checks passed
@ForeverInLaw ForeverInLaw deleted the feat/index-root-guards branch June 2, 2026 00:57