diff --git a/CLAUDE.md b/CLAUDE.md index e4230818..f3934322 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -1,6 +1,6 @@ # CLAUDE.md -Vectorless is a reasoning-native document intelligence engine written in Rust. +Vectorless is a Document Understanding Engine for AI written in Rust. ## Principles @@ -10,32 +10,56 @@ Vectorless is a reasoning-native document intelligence engine written in Rust. ## Project Structure -- `rust/` - Rust core engine - - `src/client/` - Client API (EngineBuilder, Engine) - facade layer, no business logic - - `src/document/` - Document data structures (DocumentTree, NavigationIndex, ReasoningIndex) - - `src/index/` - Compile pipeline (8-stage, checkpointing, incremental update) - - `src/retrieval/` - Retrieval dispatch layer (preprocessing, dispatch, postprocessing, cache, streaming) - - `src/query/` - Query understanding and planning (intent classification, rewrite, decomposition) - - `src/agent/` - Retrieval execution (Worker: doc navigation, Orchestrator: supervisor loop + multi-doc fusion) - - `src/rerank/` - Result reranking and answer synthesis (dedup, scoring, fusion, synthesis) - - `src/scoring/` - Scoring and ranking strategies (BM25, relevance scoring, score combination) - - `src/llm/` - LLM client (connection pool, memo/caching, throttle/rate-limiting, fallback) - - `src/storage/` - Persistence (Workspace, LRU cache, backend abstraction file/memory) - - `src/graph/` - Cross-document relationship graph - - `src/metrics/` - Metrics collection and reporting - - `src/events/` - Event system for progress monitoring - - `src/config/` - Configuration types and validation - - `src/error.rs` - Unified error types - - `src/utils/` - Utility functions (token counting, fingerprinting, validation) - - `examples/` - Rust examples (flow, indexing, pdf, batch, etc.) 
-- `python/` - Python SDK (PyO3 bindings) + CLI +Cargo workspace with 17 fine-grained Rust crates + pure Python SDK: + +``` +vectorless-core/ +├── vectorless-error/ # Error types (Result, Error enum) +├── vectorless-document/ # Document types (Document, Tree, NavigationIndex, ReasoningIndex) +├── vectorless-config/ # Configuration hub (aggregates all config types) +├── vectorless-utils/ # Utilities (fingerprinting, token counting, validation) +├── vectorless-scoring/ # Scoring (BM25, keyword extraction) +├── vectorless-graph/ # Cross-document relationship graph +├── vectorless-events/ # Event system for progress monitoring +├── vectorless-metrics/ # Metrics collection and reporting +├── vectorless-llm/ # LLM client (pool, memo/cache, throttle, fallback) +├── vectorless-storage/ # Persistence (Workspace, LRU cache, file/memory backends) +├── vectorless-query/ # Query understanding (intent classification, rewrite) +├── vectorless-index/ # Compile pipeline (10-stage, checkpointing, incremental update) +├── vectorless-agent/ # Retrieval execution (Worker navigation + Orchestrator fusion) +├── vectorless-retrieval/ # Retrieval dispatch layer (dispatcher, cache, streaming) +├── vectorless-rerank/ # Result reranking (dedup, BM25 scoring, fusion) +├── vectorless-engine/ # Facade (Engine, EngineBuilder) — re-exports public API +└── vectorless-py/ # PyO3 bindings (compiled into Python native module) +``` + +- `vectorless/` - Pure Python SDK (high-level wrappers, CLI, config loading, integrations) +- `examples/` - Python examples (primary, for Python ecosystem) - `docs/` - Docusaurus documentation site -- `samples/` - Sample files + +### Dependency Layers + +``` +Layer 0: error · document · utils · scoring (no workspace deps) +Layer 1: graph · events · config · metrics (depends on Layer 0) +Layer 2: llm · storage (depends on Layer 0–1) +Layer 3: query (depends on Layer 0–2) +Layer 4: index · agent (depends on Layer 0–3) +Layer 5: retrieval · rerank (depends on Layer 0–4) 
+Layer 6: engine (facade) · vectorless-py (bindings) (depends on all) +``` + +### Compilation Isolation + +Changing one module recompiles only that crate plus the crates above it, up to the facade: +- Change `agent` → agent, retrieval, rerank, engine, py recompile; index/llm/storage untouched +- Change `llm` → llm and the layers above it recompile; index/agent/stage are not recompiled +- Change `document` → everything recompiles (core types; expected behavior) ### Retrieval Call Flow ``` -Engine.query() +Engine.ask() → retrieval/dispatcher → query/understand() → QueryPlan (LLM intent + concepts + strategy) → Orchestrator (always, single or multi-doc) @@ -49,16 +73,17 @@ Engine.query() ## Build Commands ```bash -# Rust core -cd rust -cargo build # Build -cargo test # Run tests +# Build (workspace) +cargo build # Build all crates +cargo test # Run tests (488 tests across all crates) cargo clippy # Lint cargo fmt # Format code +# Build specific crate (fast — only that crate + dependents) +cargo build -p vectorless-agent + # Python SDK -cd python -pip install -e . # Install in editable mode +pip install -e . # Install in editable mode (from project root, uses maturin) # Docs site cd docs @@ -145,7 +170,9 @@ When uncertain whether an operation is safe, **default to asking user confirmati ## Common Development Workflow -1. **Adding features**: Implement in appropriate `rust/src/` module, add tests +1. **Adding features**: Implement in the appropriate `vectorless-core/vectorless-*/` crate, add tests 2. **Fixing bugs**: Add failing test case first, fix and ensure tests pass -3. **Python bindings**: Update `python/src/lib.rs` (PyO3) when Rust APIs change -4. **Committing code**: Use semantic commit messages, format: `type(scope): description` +3. **Adding crates**: New modules get their own crate under `vectorless-core/`, add to workspace Cargo.toml +4. **Python bindings**: Update `vectorless-core/vectorless-py/src/lib.rs` (PyO3) when Rust APIs change +5. **Python SDK**: Update `vectorless/` when API surface changes +6. 
**Committing code**: Use semantic commit messages, format: `type(scope): description` diff --git a/Cargo.toml b/Cargo.toml index c3c13c88..30641940 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -1,10 +1,28 @@ [workspace] -members = ["rust", "python"] +members = [ + "vectorless-core/vectorless-error", + "vectorless-core/vectorless-document", + "vectorless-core/vectorless-config", + "vectorless-core/vectorless-utils", + "vectorless-core/vectorless-scoring", + "vectorless-core/vectorless-graph", + "vectorless-core/vectorless-events", + "vectorless-core/vectorless-metrics", + "vectorless-core/vectorless-llm", + "vectorless-core/vectorless-storage", + "vectorless-core/vectorless-query", + "vectorless-core/vectorless-index", + "vectorless-core/vectorless-agent", + "vectorless-core/vectorless-retrieval", + "vectorless-core/vectorless-rerank", + "vectorless-core/vectorless-engine", + "vectorless-core/vectorless-py", +] resolver = "2" [workspace.package] -version = "0.1.32" -description = "Reasoning-based Document Engine" +version = "0.1.12" +description = "Document Understanding Engine for AI" edition = "2024" authors = ["zTgx "] license = "Apache-2.0" diff --git a/HISTORY.md b/HISTORY.md new file mode 100644 index 00000000..387ed3fb --- /dev/null +++ b/HISTORY.md @@ -0,0 +1,99 @@ +# HISTORY + +## 0.1.11 (2026-04-21) + +- Project description updated to "reasoning-based document engine" +- Core principles documentation (Reason don't vector, Model fails we fail, No thought no answer) +- Updated homepage with three core principles and key features + +## 0.1.10 (2026-04-21) + +- Description generation enabled by default +- `timeout_secs` option for Python indexing +- Agent-based navigation documentation + +## 0.1.9 (2026-04-20) + +- **Agent-based retrieval architecture**: replaced pilot/search with Orchestrator + Workers +- Navigation commands: `ls`, `cd`, `cat`, `grep`, `find`, `head`, `pwd`, `wc` +- Orchestrator supervisor loop with dynamic re-planning +- Query understanding 
pipeline with `QueryPlan` +- Evidence evaluation and replanning modules +- `NavigationIndex` with `DocCard` and `SectionCard` +- LLM-based confidence scoring (replaced BM25) +- Unified rerank pipeline (replaced synthesis/fusion) +- `DocCard` catalog in workspace storage +- Shared concurrency control for LLM clients +- Memoization for LLM operations in retrieval pipeline +- LLM request timeout configuration + +## 0.1.8 (2026-04-16) + +- GitHub Actions workflow for automated releases +- Endpoint parameter support for API configuration +- Custom config option in `EngineBuilder` +- Enhanced error messages with detailed failure info +- Endpoint validation in engine builder + +## 0.1.7 (2026-04-15) + +- Runtime metrics reports (LLM, Pilot, Retrieval) +- Recursive option for `from_dir` method +- Directory indexing support via `IndexContext` +- Centralized `LlmPool` configuration system +- Shared LLM client injected into pipeline context +- Pipeline checkpoint for resumable indexing +- `source_path` field and updated `QueryContext` API + +## 0.1.6 (2026-04-15) + +- `IndexMetrics` binding with detailed indexing statistics +- `StrategyPreference` for controlling retrieval strategies +- Pure Pilot search algorithm, beam search with backtracking +- Per-step reasoning support in search algorithms +- Binary pruning and pre-filtering for wide nodes +- LLM-based query complexity detection +- Cross-document strategy with graph-based boosting +- Synonym expansion for improved query recall +- Default summary strategy changed to Full + +## 0.1.4 (2026-04-13) + +- PDF parser: switch to `pdf-extract` for reliable text extraction +- Concurrent LLM verification for TOC entries +- PDF indexing example + +## 0.1.3 (2026-04-13) + +- Internal module naming cleanup (`_` prefix for private functions) + +## 0.1.2 (2026-04-13) + +- Search-from functionality and ToC-based navigation +- Reasoning chain (replacing navigation trace) +- Adaptive budget controller for pipeline token management +- 
Structural path constraints and hints extraction +- Reasoning index for fast retrieval path resolution +- Document graph system for cross-document relationships +- Streaming retrieval with `RetrieveEvent` support +- Multi-document query support +- Incremental indexing with content and logic fingerprinting +- Parallel processing for multiple document sources +- Pipeline checkpoint and content merging/splitting support + +## 0.1.1 (2026-04-08) + +- Workspace-managed dependencies and configuration +- LLM pilot functionality and summary generation +- Query decomposition support +- LLM-first search with TOC-based location +- Restructured Python examples + +## 0.1.0 (2026-04-07) + +Initial Python SDK release. + +- PyO3 bindings for the Rust engine core +- Basic `Engine` class with `index()` and `query()` methods +- `pyproject.toml` with maturin build backend +- Ruff formatting configuration diff --git a/examples/batch_indexing/README.md b/examples/batch_indexing/README.md deleted file mode 100644 index 41e87fae..00000000 --- a/examples/batch_indexing/README.md +++ /dev/null @@ -1,28 +0,0 @@ -# Batch Indexing Example - -Demonstrates indexing multiple documents at once using: -- `from_paths` -- explicit list of file paths -- `from_dir` -- all supported files in a directory -- `from_bytes` -- raw in-memory content - -Also shows cross-document querying with `with_doc_ids`. 
- -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash -python main.py -``` - -## Environment Variables - -| Variable | Description | Default | -|------------------------|----------------------|-----------| -| `VECTORLESS_API_KEY` | LLM API key | `sk-...` | -| `VECTORLESS_MODEL` | LLM model name | `gpt-4o` | -| `VECTORLESS_ENDPOINT` | Custom API endpoint | `None` | diff --git a/examples/batch_indexing/main.py b/examples/batch_indexing/main.py deleted file mode 100644 index c68b3626..00000000 --- a/examples/batch_indexing/main.py +++ /dev/null @@ -1,180 +0,0 @@ -""" -Batch indexing example -- demonstrates indexing multiple documents at once -using from_paths, from_dir, and from_bytes. - -Usage: - pip install vectorless - python main.py -""" - -import asyncio -import os - -from vectorless import ( - Engine, - IndexContext, - IndexOptions, - QueryContext, - VectorlessError, -) - -# --- Configuration --- -API_KEY = os.environ.get("VECTORLESS_API_KEY", "sk-...") -MODEL = os.environ.get("VECTORLESS_MODEL", "gpt-4o") -ENDPOINT = os.environ.get("VECTORLESS_ENDPOINT", None) -# Sample documents for demonstration -DOCS = { - "alpha.md": """\ -# Alpha Report - -## Summary - -Alpha is a distributed key-value store designed for low-latency reads. -It uses a log-structured merge tree for storage. - -## Architecture - -Write requests go through a write-ahead log, then are buffered in memory. -When the buffer is full, it is flushed to disk as an immutable SSTable. -""", - "beta.md": """\ -# Beta Report - -## Summary - -Beta is a stream processing engine that consumes events from Kafka topics -and applies real-time transformations using a DAG-based execution model. - -## Performance - -Beta processes up to 2 million events per second per node on commodity hardware. -""", - "gamma.md": """\ -# Gamma Report - -## Summary - -Gamma is a feature store that bridges the gap between offline feature -computation and online serving. 
Features are computed in Spark and served -via a low-latency gRPC endpoint. - -## Integration - -Gamma integrates with Alpha for feature metadata storage and Beta for -real-time feature updates. -""", -} - - -def write_sample_docs(base_dir: str) -> list[str]: - """Write sample markdown files and return their paths.""" - paths = [] - for name, content in DOCS.items(): - path = os.path.join(base_dir, name) - with open(path, "w") as f: - f.write(content) - paths.append(path) - return paths - - -async def main() -> None: - engine = Engine( - api_key=API_KEY, - model=MODEL, - endpoint=ENDPOINT, - ) - - # Create a temp directory with sample documents - docs_dir = "./batch_docs" - os.makedirs(docs_dir, exist_ok=True) - paths = write_sample_docs(docs_dir) - - # ---- 1. Index multiple files at once via from_paths ---- - print("=" * 50) - print(" from_paths -- index a list of files") - print("=" * 50) - - ctx = IndexContext.from_paths(paths) - result = await engine.index(ctx) - - print(f" Indexed {len(result.items)} document(s)") - for item in result.items: - print(f" - {item.name} ({item.doc_id[:8]}...)") - if result.has_failures(): - for f in result.failed: - print(f" ! Failed: {f.source} -- {f.error}") - print() - - doc_ids = [item.doc_id for item in result.items] - - # ---- 2. Query across all batch-indexed documents ---- - print("=" * 50) - print(" Query across multiple documents") - print("=" * 50) - - answer = await engine.query( - QueryContext( - "Which system processes the most events per second?" - ).with_doc_ids(doc_ids) - ) - for item in answer.items: - print(f" [{item.doc_id[:8]}...] score={item.score:.2f}") - print(f" {item.content[:200]}...") - print() - - # ---- 3. 
Index a directory via from_dir ---- - print("=" * 50) - print(" from_dir -- index all supported files in a directory") - print("=" * 50) - - # Clear first so we see fresh results - await engine.clear() - - ctx = IndexContext.from_dir(docs_dir).with_options( - IndexOptions(generate_summaries=True, generate_description=True) - ) - result = await engine.index(ctx) - - print(f" Indexed {len(result.items)} document(s)") - for item in result.items: - desc = item.description[:80] if item.description else "N/A" - print(f" - {item.name}: {desc}...") - print() - - # ---- 4. Index from raw bytes via from_bytes ---- - print("=" * 50) - print(" from_bytes -- index in-memory content") - print("=" * 50) - - md_bytes = b"""# Delta Notes - -## Key Points - -- Delta uses CRDTs for conflict-free replication. -- Writes are locally committed then asynchronously propagated. -- Read repair ensures eventual consistency across all replicas. -""" - - ctx = IndexContext.from_bytes(md_bytes, "markdown").with_name("delta") - result = await engine.index(ctx) - - print(f" Indexed: {result.doc_id}") - print() - - # ---- Cleanup ---- - print("=" * 50) - print(" Cleanup") - print("=" * 50) - - removed = await engine.clear() - print(f" Removed {removed} document(s)") - - # Remove temp files - for p in paths: - os.remove(p) - os.rmdir(docs_dir) - print(f" Cleaned up {docs_dir}/") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/document_management/README.md b/examples/document_management/README.md deleted file mode 100644 index e41148e0..00000000 --- a/examples/document_management/README.md +++ /dev/null @@ -1,28 +0,0 @@ -# Document Management Example - -Demonstrates CRUD operations on indexed documents: - -- `engine.list()` -- list all documents -- `engine.exists(doc_id)` -- check if a document exists -- `engine.remove(doc_id)` -- remove a single document -- `engine.clear()` -- remove all documents - -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash 
-python main.py -``` - -## Environment Variables - -| Variable | Description | Default | -|------------------------|----------------------|-----------| -| `VECTORLESS_API_KEY` | LLM API key | `sk-...` | -| `VECTORLESS_MODEL` | LLM model name | `gpt-4o` | -| `VECTORLESS_ENDPOINT` | Custom API endpoint | `None` | diff --git a/examples/document_management/main.py b/examples/document_management/main.py deleted file mode 100644 index 5d206a89..00000000 --- a/examples/document_management/main.py +++ /dev/null @@ -1,132 +0,0 @@ -""" -Document management example -- demonstrates CRUD operations on indexed documents: -list, exists, remove, and clear. - -Usage: - pip install vectorless - python main.py -""" - -import asyncio -import os - -from vectorless import ( - Engine, - IndexContext, - QueryContext, - VectorlessError, -) - -# --- Configuration --- -API_KEY = os.environ.get("VECTORLESS_API_KEY", "sk-...") -MODEL = os.environ.get("VECTORLESS_MODEL", "gpt-4o") -ENDPOINT = os.environ.get("VECTORLESS_ENDPOINT", None) -# Sample documents -SAMPLE_A = """\ -# Project Alpha - -## Overview - -Project Alpha is a next-generation database engine written in Rust. -It supports ACID transactions and serializable isolation. - -## Features - -- MVCC concurrency control -- B-tree and LSM storage engines -- Query planner with cost-based optimization -""" - -SAMPLE_B = """\ -# Project Beta - -## Overview - -Project Beta is a web framework for building real-time applications. -It uses WebSocket-based communication and server-side rendering. 
- -## Features - -- Hot module reloading -- Built-in authentication middleware -- Automatic code splitting -""" - - -async def main() -> None: - engine = Engine( - api_key=API_KEY, - model=MODEL, - endpoint=ENDPOINT, - ) - - # ---- Index two documents ---- - print("Indexing two documents...") - - result_a = await engine.index( - IndexContext.from_content(SAMPLE_A, "markdown").with_name("alpha") - ) - doc_id_a = result_a.doc_id - print(f" A: {doc_id_a}") - - result_b = await engine.index( - IndexContext.from_content(SAMPLE_B, "markdown").with_name("beta") - ) - doc_id_b = result_b.doc_id - print(f" B: {doc_id_b}") - print() - - # ---- list() -- show all indexed documents ---- - print("--- list() ---") - docs = await engine.list() - for doc in docs: - pages = f", pages={doc.page_count}" if doc.page_count else "" - lines = f", lines={doc.line_count}" if doc.line_count else "" - print(f" {doc.name} id={doc.id[:8]}... format={doc.format}{pages}{lines}") - print(f" Total: {len(docs)} document(s)\n") - - # ---- exists() -- check if a document is indexed ---- - print("--- exists() ---") - for did, label in [(doc_id_a, "A"), (doc_id_b, "B"), ("nonexistent-id", "?")]: - found = await engine.exists(did) - print(f" {label}: exists={found}") - print() - - # ---- Query a specific document ---- - print("--- query(doc_id_a) ---") - answer = await engine.query( - QueryContext("What storage engines does Alpha support?").with_doc_ids([doc_id_a]) - ) - item = answer.single() - if item: - print(f" Score: {item.score:.2f}") - print(f" Answer: {item.content[:200]}...\n") - - # ---- remove() -- delete a single document ---- - print("--- remove(doc_id_a) ---") - removed = await engine.remove(doc_id_a) - print(f" Removed A: {removed}") - - # Verify it's gone - exists_a = await engine.exists(doc_id_a) - print(f" exists(A) after removal: {exists_a}") - print() - - # ---- list() again -- only B should remain ---- - print("--- list() after removal ---") - docs = await engine.list() - for doc in 
docs: - print(f" {doc.name} id={doc.id[:8]}...") - print(f" Total: {len(docs)} document(s)\n") - - # ---- clear() -- remove all remaining documents ---- - print("--- clear() ---") - cleared = await engine.clear() - print(f" Cleared {cleared} document(s)") - - docs = await engine.list() - print(f" Remaining: {len(docs)} document(s)") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/error_handling/README.md b/examples/error_handling/README.md deleted file mode 100644 index 2424d618..00000000 --- a/examples/error_handling/README.md +++ /dev/null @@ -1,33 +0,0 @@ -# Error Handling Example - -Demonstrates how to catch and inspect `VectorlessError` exceptions: - -- Invalid format strings -- Invalid indexing modes -- Querying non-existent documents -- Batch indexing with partial failures -- Engine creation with invalid credentials - -The `VectorlessError` exception provides: -- `kind` -- error category (`"config"`, `"not_found"`, `"parse"`, `"llm"`, etc.) -- `message` -- human-readable error description - -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash -python main.py -``` - -## Environment Variables - -| Variable | Description | Default | -|------------------------|----------------------|-----------| -| `VECTORLESS_API_KEY` | LLM API key | `sk-...` | -| `VECTORLESS_MODEL` | LLM model name | `gpt-4o` | -| `VECTORLESS_ENDPOINT` | Custom API endpoint | `None` | diff --git a/examples/error_handling/main.py b/examples/error_handling/main.py deleted file mode 100644 index 22099e3d..00000000 --- a/examples/error_handling/main.py +++ /dev/null @@ -1,107 +0,0 @@ -""" -Error handling example -- demonstrates catching and inspecting VectorlessError. 
- -Usage: - pip install vectorless - python main.py -""" - -import asyncio -import os - -from vectorless import ( - Engine, - IndexContext, - IndexOptions, - QueryContext, - VectorlessError, -) - -# --- Configuration --- -API_KEY = os.environ.get("VECTORLESS_API_KEY", "sk-...") -MODEL = os.environ.get("VECTORLESS_MODEL", "gpt-4o") -ENDPOINT = os.environ.get("VECTORLESS_ENDPOINT", None) - -async def main() -> None: - engine = Engine( - api_key=API_KEY, - model=MODEL, - endpoint=ENDPOINT, - ) - - # ---- 1. Invalid format ---- - print("--- Invalid format in from_bytes ---") - try: - ctx = IndexContext.from_bytes(b"hello", "xml") - except VectorlessError as e: - print(f" Caught VectorlessError:") - print(f" kind: {e.kind}") - print(f" message: {e.message}") - print(f" repr: {repr(e)}") - print() - - # ---- 2. Invalid indexing mode ---- - print("--- Invalid indexing mode ---") - try: - opts = IndexOptions(mode="bad_mode") - except VectorlessError as e: - print(f" Caught VectorlessError:") - print(f" kind: {e.kind}") - print(f" message: {e.message}") - print() - - # ---- 3. Query a non-existent document ---- - print("--- Query non-existent document ---") - try: - await engine.query( - QueryContext("What is this?").with_doc_ids(["does-not-exist"]) - ) - except VectorlessError as e: - print(f" Caught VectorlessError:") - print(f" kind: {e.kind}") - print(f" message: {e.message}") - print() - - # ---- 4. 
Index with partial failure in batch ---- - print("--- Batch indexing with mixed results ---") - good = IndexContext.from_content("# Real Doc\n\nThis is valid content.", "markdown") - - result = await engine.index(good.with_name("good_doc")) - if result.has_failures(): - for f in result.failed: - print(f" Failed: {f.source} -- {f.error}") - else: - print(f" Success: {result.doc_id}") - - # Inspect individual items - for item in result.items: - print(f" Item: {item.name} ({item.format})") - if item.metrics: - m = item.metrics - print(f" Total time: {m.total_time_ms} ms, LLM calls: {m.llm_calls}") - print() - - # ---- 5. Engine creation with bad credentials ---- - print("--- Engine with invalid credentials ---") - try: - bad_engine = Engine( - api_key="sk-invalid-key-12345", - model="gpt-4o", - ) - # Try to use it -- the error will surface on the first LLM call - await bad_engine.index( - IndexContext.from_content("# Test\n", "markdown").with_name("fail_test") - ) - except VectorlessError as e: - print(f" Caught VectorlessError:") - print(f" kind: {e.kind}") - print(f" message: {e.message[:120]}...") - print() - - # ---- Cleanup ---- - await engine.clear() - print("Done.") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/index_directory/main.py b/examples/index_directory/main.py deleted file mode 100644 index 08b1c3bd..00000000 --- a/examples/index_directory/main.py +++ /dev/null @@ -1,99 +0,0 @@ -""" -Directory indexing example — recursively index all documents in a directory. 
- -Usage: - python index_directory.py /path/to/docs - python index_directory.py /path/to/docs --no-recursive - -Environment variables: - LLM_API_KEY — Your LLM API key (required) - LLM_MODEL — Model name (default: google/gemini-3-flash-preview) - LLM_ENDPOINT — API endpoint (default: http://localhost:4000/api/v1) -""" - -import argparse -import asyncio -import os - -from vectorless import Engine, IndexContext, QueryContext - - -async def main(): - parser = argparse.ArgumentParser(description="Index a directory of documents") - parser.add_argument("directory", help="Directory path to index") - parser.add_argument( - "--no-recursive", - action="store_true", - help="Only scan top-level files (default: recursive)", - ) - args = parser.parse_args() - - # Build engine - api_key = os.environ.get("LLM_API_KEY", "sk-or-v1-...") - model = os.environ.get("LLM_MODEL", "google/gemini-3-flash-preview") - endpoint = os.environ.get("LLM_ENDPOINT", "http://localhost:4000/api/v1") - - engine = Engine( - api_key=api_key, - model=model, - endpoint=endpoint, - ) - - recursive = not args.no_recursive - - # Index directory - ctx = IndexContext.from_dir(args.directory, recursive=recursive) - - if ctx.is_empty(): - print(f"No supported files found in: {args.directory}") - return - - print(f"{'Recursively scanning' if recursive else 'Scanning top-level files in'}: {args.directory}") - print(f"Found files to index") - - result = await engine.index(ctx) - - print(f"\nIndexed {len(result.items)} document(s):") - for item in result.items: - print(f" {item.name} ({item.doc_id})") - if item.metrics: - print(f" nodes: {item.metrics.nodes_processed}, time: {item.metrics.total_time_ms}ms") - - if result.has_failures(): - print("\nFailed:") - for f in result.failed: - print(f" {f.source} — {f.error}") - - # Query across all indexed documents - query = "What is this about?" 
- print(f'\nQuerying: "{query}"') - - answer = await engine.query(QueryContext(query)) - for item in answer.items: - print(f" [{item.doc_id} score={item.score:.2f}]") - preview = item.content[:200] - print(f" {preview}") - if len(item.content) > 200: - print(" ...") - - # Metrics report - report = engine.metrics_report() - print("\nMetrics:") - print( - f" LLM: {report.llm.total_calls} calls, " - f"{report.llm.total_tokens} tokens, " - f"${report.llm.estimated_cost_usd:.4f}" - ) - print( - f" Retrieval: {report.retrieval.total_queries} queries, " - f"avg score {report.retrieval.avg_path_score:.2f}" - ) - - # Cleanup - docs = await engine.list() - for doc in docs: - await engine.remove(doc.id) - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/index_metrics/README.md b/examples/index_metrics/README.md deleted file mode 100644 index 78bdd552..00000000 --- a/examples/index_metrics/README.md +++ /dev/null @@ -1,42 +0,0 @@ -# IndexMetrics Example - -Demonstrates how to inspect detailed indexing pipeline metrics via `IndexMetrics`. - -`IndexMetrics` is attached to each `IndexItem` and provides: - -| Field | Description | -|------------------------|----------------------------------------------| -| `total_time_ms` | Total indexing time | -| `parse_time_ms` | Document parsing stage duration | -| `build_time_ms` | Tree building stage duration | -| `enhance_time_ms` | Summary/enhancement stage duration | -| `nodes_processed` | Number of tree nodes processed | -| `summaries_generated` | Successfully generated summaries | -| `summaries_failed` | Failed summary generations | -| `llm_calls` | Total LLM API calls made | -| `total_tokens_generated` | Total tokens produced by the LLM | -| `topics_indexed` | Topics added to the reasoning index | -| `keywords_indexed` | Keywords added to the reasoning index | - -This example compares documents indexed with and without summaries enabled -to show how `IndexOptions` affect pipeline stages and LLM usage. 
- -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash -python main.py -``` - -## Environment Variables - -| Variable | Description | Default | -|------------------------|----------------------|-----------| -| `VECTORLESS_API_KEY` | LLM API key | `sk-...` | -| `VECTORLESS_MODEL` | LLM model name | `gpt-4o` | -| `VECTORLESS_ENDPOINT` | Custom API endpoint | `None` | diff --git a/examples/index_metrics/main.py b/examples/index_metrics/main.py deleted file mode 100644 index bfea4cf0..00000000 --- a/examples/index_metrics/main.py +++ /dev/null @@ -1,233 +0,0 @@ -""" -IndexMetrics example -- demonstrates inspecting detailed indexing pipeline metrics. - -IndexMetrics exposes timing, node processing, LLM usage, and reasoning index -statistics for each indexed document. This example compares two documents with -different IndexOptions to show how options affect the pipeline. - -Usage: - pip install vectorless - python main.py -""" - -import asyncio -import os - -from vectorless import ( - Engine, - IndexContext, - IndexItem, - IndexMetrics, - IndexOptions, - VectorlessError, -) - -# --- Configuration --- -API_KEY = os.environ.get("VECTORLESS_API_KEY", "sk-...") -MODEL = os.environ.get("VECTORLESS_MODEL", "gpt-4o") -ENDPOINT = os.environ.get("VECTORLESS_ENDPOINT", None) -# --- Sample documents with varying complexity --- -SIMPLE_DOC = """\ -# Quick Note - -This is a short note about caching strategies. -Redis is commonly used as an in-memory cache. -""" - -COMPLEX_DOC = """\ -# Distributed Systems Design Guide - -## Consensus - -Raft is a consensus algorithm designed to be easy to understand. -It elects a leader via randomized timeouts and replicates log entries -to a majority of followers before committing them. - -## Replication - -State machine replication ensures that all replicas execute the same -commands in the same order. Primary-backup replication is simpler but -provides lower availability during leader failover. 
- -## Partitioning - -Consistent hashing distributes keys across nodes with minimal -remapping when the cluster size changes. Virtual nodes improve balance -when the key space is small. - -## Failure Detection - -Phi accrual failure detection treats failure as a continuous suspicion -level rather than a binary alive/dead state. This reduces false -positives during transient network issues. -""" - - -def print_pipeline_breakdown(m: IndexMetrics) -> None: - """Print a breakdown of pipeline stages and their percentages.""" - total = m.total_time_ms - if total == 0: - print(" (no timing data)") - return - - parse_pct = m.parse_time_ms / total * 100 - build_pct = m.build_time_ms / total * 100 - enhance_pct = m.enhance_time_ms / total * 100 - other_pct = max(0, 100 - parse_pct - build_pct - enhance_pct) - - print(f" Parse: {m.parse_time_ms:>5} ms ({parse_pct:5.1f}%)") - print(f" Build: {m.build_time_ms:>5} ms ({build_pct:5.1f}%)") - print(f" Enhance: {m.enhance_time_ms:>5} ms ({enhance_pct:5.1f}%)") - print(f" Other: {total - m.parse_time_ms - m.build_time_ms - m.enhance_time_ms:>5} ms ({other_pct:5.1f}%)") - - -def print_llm_stats(m: IndexMetrics) -> None: - """Print LLM utilization statistics.""" - print(f" LLM calls: {m.llm_calls}") - print(f" Tokens generated: {m.total_tokens_generated}") - if m.llm_calls > 0: - avg_tokens = m.total_tokens_generated / m.llm_calls - print(f" Avg tokens/call: {avg_tokens:.0f}") - - -def print_summary_stats(m: IndexMetrics) -> None: - """Print summary generation success/failure.""" - total = m.summaries_generated + m.summaries_failed - print(f" Summaries ok: {m.summaries_generated}") - print(f" Summaries failed: {m.summaries_failed}") - if total > 0: - success_rate = m.summaries_generated / total * 100 - print(f" Success rate: {success_rate:.1f}%") - - -def print_reasoning_index(m: IndexMetrics) -> None: - """Print reasoning index statistics.""" - print(f" Nodes processed: {m.nodes_processed}") - print(f" Topics indexed: 
{m.topics_indexed}") - print(f" Keywords indexed: {m.keywords_indexed}") - - -def print_full_report(item: IndexItem) -> None: - """Print a full metrics report for an indexed item.""" - m = item.metrics - print(f" Document: {item.name} ({item.format})") - if m is None: - print(" (no metrics)") - return - - print(f" Total time: {m.total_time_ms} ms") - print(f" repr: {repr(m)}") - - print() - print(" Pipeline stages:") - print_pipeline_breakdown(m) - - print() - print(" LLM usage:") - print_llm_stats(m) - - print() - print(" Summary generation:") - print_summary_stats(m) - - print() - print(" Reasoning index:") - print_reasoning_index(m) - - -async def main() -> None: - engine = Engine( - api_key=API_KEY, - model=MODEL, - endpoint=ENDPOINT, - ) - - # ================================================================ - # 1. Index a simple document WITHOUT summaries - # ================================================================ - print("=" * 55) - print(" Run 1: Simple doc, summaries OFF") - print("=" * 55) - - opts_no_summary = IndexOptions( - generate_summaries=False, - generate_description=False, - ) - result = await engine.index( - IndexContext.from_content(SIMPLE_DOC, "markdown") - .with_name("simple_no_summary") - .with_options(opts_no_summary) - ) - item = result.items[0] - print_full_report(item) - doc_id_1 = item.doc_id - print() - - # ================================================================ - # 2. 
Index the same simple document WITH summaries - # ================================================================ - print("=" * 55) - print(" Run 2: Simple doc, summaries ON") - print("=" * 55) - - opts_with_summary = IndexOptions( - generate_summaries=True, - generate_description=True, - ) - result = await engine.index( - IndexContext.from_content(SIMPLE_DOC, "markdown") - .with_name("simple_with_summary") - .with_options(opts_with_summary) - ) - item = result.items[0] - print_full_report(item) - doc_id_2 = item.doc_id - print() - - # ================================================================ - # 3. Compare: summaries OFF vs ON for the simple doc - # ================================================================ - # Both runs are reported above: with summaries ON, expect more LLM - # calls, more tokens generated, and a longer enhance stage than in Run 1. - - # ================================================================ - # 4. Index a complex document WITH summaries - # ================================================================ - print("=" * 55) - print(" Run 3: Complex doc, summaries ON") - print("=" * 55) - - result = await engine.index( - IndexContext.from_content(COMPLEX_DOC, "markdown") - .with_name("complex_with_summary") - .with_options(opts_with_summary) - ) - item = result.items[0] - print_full_report(item) - doc_id_3 = item.doc_id - print() - - # ================================================================ - # 5. 
Summary table - # ================================================================ - print("=" * 55) - print(" Comparison table") - print("=" * 55) - - docs = await engine.list() - for doc in docs: - print(f" {doc.name:<30} id={doc.id[:8]}...") - if doc.description: - print(f" description: {doc.description[:80]}") - - # ================================================================ - # Cleanup - # ================================================================ - print() - cleared = await engine.clear() - print(f"Cleaned up {cleared} document(s).") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/indexing/README.md b/examples/indexing/README.md deleted file mode 100644 index dc60506f..00000000 --- a/examples/indexing/README.md +++ /dev/null @@ -1,15 +0,0 @@ -# Indexing Example - -Demonstrates the full Vectorless workflow: index, query, list, graph, cleanup. - -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash -python main.py -``` diff --git a/examples/indexing/main.py b/examples/indexing/main.py deleted file mode 100644 index f2adce3b..00000000 --- a/examples/indexing/main.py +++ /dev/null @@ -1,130 +0,0 @@ -""" -Indexing example — demonstrates the full Vectorless workflow. - -Usage: - pip install vectorless - python main.py -""" - -import asyncio -import os -from vectorless import Engine, IndexContext, IndexOptions, QueryContext - -# os is used only for removing the sample file - -# --- Configuration --- -# Replace with your own credentials -API_KEY = "sk-..." -MODEL = "gpt-4o" - - -async def main(): - # --- 1. Create engine --- - engine = Engine( - api_key=API_KEY, - model=MODEL, - ) - print("Engine created\n") - - # --- 2. Index from text --- - print("--- Index from text ---") - result = await engine.index( - IndexContext.from_content( - """# Architecture Guide - -## Overview - -Vectorless is a reasoning-native document intelligence engine. -It uses hierarchical semantic trees instead of vector embeddings. 
- -## Key Concepts - -- **Semantic Tree**: Documents are parsed into a tree of sections. -- **LLM Navigation**: Queries are resolved by traversing the tree. -- **No Vectors**: No embeddings, no similarity search, no vector DB. -""", - "markdown", - ).with_name("architecture") - ) - doc_id = result.doc_id - print(f" Indexed: {doc_id}") - print(f" Items: {result.total()}\n") - - # --- 3. Index from file --- - print("--- Index from file ---") - # Write a sample file first - sample_path = "./sample_report.md" - with open(sample_path, "w") as f: - f.write("""# Q4 Financial Report - -## Revenue - -Total revenue for Q4 was $12.3M, up 15% from Q3. -SaaS subscriptions accounted for $8.1M, consulting for $4.2M. - -## Costs - -Operating costs were $9.8M, including $3.2M in engineering salaries. -Marketing spend was reduced by 8% to $1.5M. - -## Outlook - -Projected Q1 revenue is $13.5M based on current pipeline. -""") - - result = await engine.index(IndexContext.from_path(sample_path)) - file_doc_id = result.doc_id - print(f" Indexed: {file_doc_id}\n") - os.remove(sample_path) - - # --- 4. Index with options --- - print("--- Index with options (summaries + description) ---") - result = await engine.index( - IndexContext.from_content( - "# API Reference\n\n## GET /users\n\nList all users.\n\n## POST /users\n\nCreate a user.", - "markdown", - ) - .with_name("api_ref") - .with_options(IndexOptions(generate_summaries=True, generate_description=True)), - ) - print(f" Indexed: {result.doc_id}\n") - - # --- 5. Query --- - print("--- Query ---") - answer = await engine.query( - QueryContext("What was the total revenue?").with_doc_ids([file_doc_id]) - ) - item = answer.single() - if item: - print(f" Score: {item.score:.2f}") - print(f" Answer: {item.content[:200]}\n") - - # --- 6. 
List documents --- - print("--- List documents ---") - docs = await engine.list() - for doc in docs: - desc = f" — {doc.description}" if doc.description else "" - print(f" {doc.name} ({doc.id[:8]}...){desc}") - print() - - # --- 7. Document graph --- - print("--- Document graph ---") - graph = await engine.get_graph() - if graph: - print(f" Nodes: {graph.node_count()}, Edges: {graph.edge_count()}") - for doc_id in graph.doc_ids(): - node = graph.get_node(doc_id) - if node: - neighbors = graph.get_neighbors(doc_id) - kw = ", ".join(k.keyword for k in node.top_keywords[:3]) - print(f" {node.title}: keywords=[{kw}], neighbors={len(neighbors)}") - print() - - # --- 8. Cleanup --- - print("--- Cleanup ---") - removed = await engine.clear() - print(f" Removed {removed} document(s)") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/pdf_indexing/README.md b/examples/pdf_indexing/README.md deleted file mode 100644 index cfee9a95..00000000 --- a/examples/pdf_indexing/README.md +++ /dev/null @@ -1,27 +0,0 @@ -# PDF Indexing Example - -Demonstrates indexing a PDF file, inspecting indexing metrics, and querying. - -## Setup - -```bash -pip install vectorless -``` - -## Run - -```bash -# Use the sample PDF from the repository -python main.py - -# Or specify your own PDF file -python main.py /path/to/document.pdf -``` - -## Environment Variables - -| Variable | Description | Default | -|------------------------|----------------------|-----------| -| `VECTORLESS_API_KEY` | LLM API key | `sk-...` | -| `VECTORLESS_MODEL` | LLM model name | `gpt-4o` | -| `VECTORLESS_ENDPOINT` | Custom API endpoint | `None` | diff --git a/examples/pdf_indexing/main.py b/examples/pdf_indexing/main.py deleted file mode 100644 index c1e36727..00000000 --- a/examples/pdf_indexing/main.py +++ /dev/null @@ -1,123 +0,0 @@ -""" -PDF indexing example -- demonstrates indexing PDF files and inspecting metrics. 
- -Usage: - pip install vectorless - python main.py [path/to/file.pdf] - -If no path is given, uses the sample PDF in the repository. -""" - -import asyncio -import os -import sys - -from vectorless import ( - Engine, - IndexContext, - IndexItem, - IndexMetrics, - IndexOptions, - QueryContext, - VectorlessError, -) - -# --- Configuration --- -API_KEY = os.environ.get("VECTORLESS_API_KEY", "sk-...") -MODEL = os.environ.get("VECTORLESS_MODEL", "gpt-4o") -ENDPOINT = os.environ.get("VECTORLESS_ENDPOINT", None) -# Resolve the sample PDF path relative to the repo root -SAMPLE_PDF = os.path.join( - os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))), - "samples", - "Docker_Cheat_Sheet.pdf", -) - - -def print_separator(title: str) -> None: - print(f"\n{'=' * 40}") - print(f" {title}") - print(f"{'=' * 40}") - - -def print_metrics(item: IndexItem) -> None: - """Pretty-print indexing metrics for a single item.""" - m: IndexMetrics | None = item.metrics - if m is None: - print(" (no metrics available)") - return - - print(f" Total time: {m.total_time_ms:>6} ms") - print(f" Parse time: {m.parse_time_ms:>6} ms") - print(f" Build time: {m.build_time_ms:>6} ms") - print(f" Enhance time: {m.enhance_time_ms:>6} ms") - print(f" Nodes processed: {m.nodes_processed:>6}") - print(f" Summaries ok: {m.summaries_generated:>6}") - print(f" Summaries failed: {m.summaries_failed:>6}") - print(f" LLM calls: {m.llm_calls:>6}") - print(f" Tokens generated: {m.total_tokens_generated:>6}") - print(f" Topics indexed: {m.topics_indexed:>6}") - print(f" Keywords indexed: {m.keywords_indexed:>6}") - - -async def main() -> None: - pdf_path = sys.argv[1] if len(sys.argv) > 1 else SAMPLE_PDF - - if not os.path.isfile(pdf_path): - print(f"Error: file not found: {pdf_path}") - sys.exit(1) - - engine = Engine( - api_key=API_KEY, - model=MODEL, - endpoint=ENDPOINT, - ) - - # ---- Index with description + summaries enabled ---- - print_separator("Indexing PDF") - - options = 
IndexOptions(generate_summaries=True, generate_description=True) - ctx = IndexContext.from_path(pdf_path).with_options(options) - - try: - result = await engine.index(ctx) - except VectorlessError as e: - print(f"Indexing failed: [{e.kind}] {e.message}") - return - - if result.has_failures(): - for f in result.failed: - print(f" Failed: {f.source} -- {f.error}") - return - - doc_id = result.doc_id - print(f" doc_id: {doc_id}") - - for item in result.items: - print(f"\n Item: {item.name} ({item.format})") - if item.page_count is not None: - print(f" Pages: {item.page_count}") - if item.description: - print(f" Description: {item.description[:120]}...") - print_metrics(item) - - # ---- Query the PDF ---- - print_separator("Query") - - answer = await engine.query( - QueryContext("What is this document about?").with_doc_ids([doc_id]) - ) - item = answer.single() - if item: - print(f" Score: {item.score:.2f}") - print(f" Nodes: {item.node_ids}") - print(f" Content: {item.content[:300]}...") - - # ---- Cleanup ---- - print_separator("Cleanup") - removed = await engine.clear() - print(f" Removed {removed} document(s)") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/examples/session_walkthrough/README.md b/examples/session_walkthrough/README.md deleted file mode 100644 index 17174a07..00000000 --- a/examples/session_walkthrough/README.md +++ /dev/null @@ -1,32 +0,0 @@ -# Session API Walkthrough - -Demonstrates the full high-level Vectorless Python API using the `Session` and `SyncSession` classes. 
- -## What it covers - -| # | Topic | API | -|---|-------|-----| -| 1 | Session creation | `Session()`, `from_env()`, `from_config_file()` | -| 2 | Indexing sources | `index(content=)`, `index(path=)`, `index(bytes_data=)`, `index(directory=)` | -| 3 | Batch indexing | `index_batch(paths, jobs=N)` | -| 4 | Querying | `ask(question, doc_ids=)`, `ask(question, workspace_scope=True)` | -| 5 | Streaming query | `query_stream()` async iterator | -| 6 | Document management | `list_documents()`, `document_exists()`, `remove_document()`, `clear_all()` | -| 7 | Document graph | `get_graph()` nodes, edges, keywords | -| 8 | Event callbacks | `EventEmitter` with `@on_index` / `@on_query` decorators | -| 9 | Metrics | `metrics_report()` | -| 10 | Sync API | `SyncSession` (no async/await) | - -## Setup - -```bash -pip install vectorless -export VECTORLESS_API_KEY="sk-..." -export VECTORLESS_MODEL="gpt-4o" -``` - -## Run - -```bash -python main.py -``` diff --git a/examples/session_walkthrough/main.py b/examples/session_walkthrough/main.py deleted file mode 100644 index f94e5bb6..00000000 --- a/examples/session_walkthrough/main.py +++ /dev/null @@ -1,589 +0,0 @@ -""" -Session API walkthrough -- demonstrates the full high-level Vectorless API. - -This example uses the Session class (recommended entry point) to cover: - 1. Session creation (constructor / from_env / from_config_file) - 2. Indexing from various sources (content, path, directory, bytes) - 3. Batch indexing with concurrency control - 4. Querying with doc_ids and workspace scope - 5. Streaming query with real-time events - 6. Document management (list, exists, remove, clear) - 7. Cross-document relationship graph - 8. Event callbacks for progress monitoring - 9. Metrics reporting - 10. SyncSession (synchronous API, no async/await) - -Usage: - export VECTORLESS_API_KEY="sk-..." 
- export VECTORLESS_MODEL="gpt-4o" - pip install vectorless - python main.py -""" - -import asyncio -import os -import tempfile - -from vectorless import ( - Session, - SyncSession, - EventEmitter, - VectorlessError, -) -from vectorless.events import IndexEventType, QueryEventType - - -# ────────────────────────────────────────────────────────────────── -# Sample documents used throughout the example -# ────────────────────────────────────────────────────────────────── - -ARCHITECTURE_DOC = """\ -# Vectorless Architecture - -## Overview - -Vectorless is a reasoning-native document intelligence engine. -It uses hierarchical semantic trees instead of vector embeddings. - -## Key Concepts - -- **Semantic Tree**: Documents are parsed into a tree of sections. -- **LLM Navigation**: Queries are resolved by traversing the tree. -- **No Vectors**: No embeddings, no similarity search, no vector DB. - -## Retrieval Flow - -Engine.query() - -> query/understand() -> QueryPlan - -> Orchestrator dispatches Workers - -> Workers navigate document trees - -> rerank -> synthesis -> answer -""" - -FINANCE_DOC = """\ -# Q4 Financial Report - -## Revenue - -Total revenue for Q4 was $12.3M, up 15% from Q3. -SaaS subscriptions accounted for $8.1M, consulting for $4.2M. - -## Costs - -Operating costs were $9.8M, including $3.2M in engineering salaries. -Marketing spend was reduced by 8% to $1.5M. - -## Outlook - -Projected Q1 revenue is $13.5M based on current pipeline. -""" - -SECURITY_DOC = """\ -# Security Policy - -## Authentication - -All API requests require a Bearer token in the Authorization header. -Tokens expire after 24 hours and must be refreshed. - -## Data Encryption - -Data at rest is encrypted using AES-256. Data in transit uses TLS 1.3. - -## Audit Logging - -All access to sensitive data is logged and retained for 90 days. 
-""" - - -# ────────────────────────────────────────────────────────────────── -# Helper: set up a temp directory with sample files -# ────────────────────────────────────────────────────────────────── - -def create_sample_directory() -> tuple[str, list[str]]: - """Create a temp directory with sample documents. Returns (dir, paths).""" - tmpdir = tempfile.mkdtemp(prefix="vectorless_walkthrough_") - docs = { - "architecture.md": ARCHITECTURE_DOC, - "finance.md": FINANCE_DOC, - "security.md": SECURITY_DOC, - } - paths = [] - for name, content in docs.items(): - path = os.path.join(tmpdir, name) - with open(path, "w") as f: - f.write(content) - paths.append(path) - return tmpdir, paths - - -def cleanup_directory(tmpdir: str) -> None: - """Remove all files in the temp directory.""" - for fname in os.listdir(tmpdir): - os.remove(os.path.join(tmpdir, fname)) - os.rmdir(tmpdir) - - -# ────────────────────────────────────────────────────────────────── -# Section 1: Session Creation -# ────────────────────────────────────────────────────────────────── - -async def demo_session_creation() -> Session: - """Demonstrate different ways to create a Session.""" - print("=" * 60) - print(" 1. 
Session Creation") - print("=" * 60) - - # Option A: Constructor with explicit credentials - api_key = os.environ.get("VECTORLESS_API_KEY", "sk-...") - model = os.environ.get("VECTORLESS_MODEL", "gpt-4o") - endpoint = os.environ.get("VECTORLESS_ENDPOINT") - - session = Session(api_key=api_key, model=model, endpoint=endpoint) - print(f" Created: {session}") - - # Option B: from environment variables - # session = Session.from_env() - - # Option C: from a config file - # session = Session.from_config_file("~/.vectorless/config.toml") - - # Option D: with an EventEmitter for progress callbacks - # events = EventEmitter() - # session = Session(api_key=api_key, model=model, events=events) - - print() - return session - - -# ────────────────────────────────────────────────────────────────── -# Section 2: Indexing from Various Sources -# ────────────────────────────────────────────────────────────────── - -async def demo_indexing(session: Session, tmpdir: str, paths: list[str]) -> dict[str, str]: - """Demonstrate indexing from content, path, directory, and bytes.""" - print("=" * 60) - print(" 2. Indexing") - print("=" * 60) - - doc_ids: dict[str, str] = {} - - # --- 2a. Index from in-memory content --- - print(" [content] Indexing from string...") - result = await session.index( - content=ARCHITECTURE_DOC, - format="markdown", - name="architecture", - ) - doc_ids["architecture"] = result.doc_id # type: ignore[assignment] - print(f" doc_id: {result.doc_id}") - print(f" items: {result.total()}") - - # --- 2b. Index from a file path --- - print(" [path] Indexing from file path...") - result = await session.index(path=paths[1], name="finance") - doc_ids["finance"] = result.doc_id # type: ignore[assignment] - print(f" doc_id: {result.doc_id}") - - # --- 2c. 
Index from raw bytes --- - print(" [bytes] Indexing from raw bytes...") - result = await session.index( - bytes_data=SECURITY_DOC.encode("utf-8"), - format="markdown", - name="security", - ) - doc_ids["security"] = result.doc_id # type: ignore[assignment] - print(f" doc_id: {result.doc_id}") - - # --- 2d. Index a directory --- - print(" [dir] Indexing a directory...") - # Clear first to see fresh results - await session.clear_all() - doc_ids.clear() # ids collected in 2a-2c are now stale - - result = await session.index(directory=tmpdir, name="all_docs") - print(f" doc_id: {result.doc_id}") - print(f" items: {len(result.items)}") - for item in result.items: - print(f" - {item.name} ({item.doc_id[:8]}...)") - doc_ids[item.name] = item.doc_id - - print() - return doc_ids - - -# ────────────────────────────────────────────────────────────────── -# Section 3: Batch Indexing with Concurrency -# ────────────────────────────────────────────────────────────────── - -async def demo_batch_indexing(session: Session, paths: list[str]) -> list[str]: - """Demonstrate batch indexing with concurrent jobs.""" - print("=" * 60) - print(" 3. Batch Indexing (concurrency=2)") - print("=" * 60) - - # Clear to start fresh - await session.clear_all() - - results = await session.index_batch( - paths, - mode="default", - jobs=2, # max 2 concurrent indexing operations - force=False, - ) - - doc_ids = [] - for r in results: - print(f" {r.doc_id[:8]}... ({len(r.items)} items)") - for item in r.items: - doc_ids.append(item.doc_id) - - print(f" Batch indexed {len(results)} file(s), {len(doc_ids)} document(s) total") - print() - return doc_ids - - -# ────────────────────────────────────────────────────────────────── -# Section 4: Querying -# ────────────────────────────────────────────────────────────────── - -async def demo_querying(session: Session, doc_ids: list[str]) -> None: - """Demonstrate querying with doc_ids and workspace scope.""" - print("=" * 60) - print(" 4. 
Querying") - print("=" * 60) - - # --- Query specific documents --- - print(" [ask] Query specific documents...") - response = await session.ask( - "What was the total revenue for Q4?", - doc_ids=doc_ids[:2], # limit to first two docs - ) - - result = response.single() - if result: - print(f" Score: {result.score:.2f}") - print(f" Confidence: {result.confidence:.2f}") - print(f" Answer: {result.content[:150]}...") - if result.evidence: - print(f" Evidence: {len(result.evidence)} item(s)") - for ev in result.evidence[:2]: - print(f" - {ev.title}: {ev.content[:80]}...") - if result.metrics: - print(f" LLM calls: {result.metrics.llm_calls}") - print(f" Nodes: {result.metrics.nodes_visited}") - - # --- Query across all documents --- - print() - print(" [workspace_scope] Query across entire workspace...") - response = await session.ask( - "How is data encrypted?", - workspace_scope=True, - ) - for item in response.items: - print(f" [{item.doc_id[:8]}...] score={item.score:.2f}") - print(f" {item.content[:120]}...") - - # --- Query with timeout --- - print() - print(" [timeout] Query with 30s timeout...") - try: - response = await session.ask( - "What is the retrieval flow?", - doc_ids=doc_ids, - timeout_secs=30, - ) - if response.single(): - print(f" Answer: {response.single().content[:150]}...") - except VectorlessError as e: - print(f" Error: {e}") - - print() - - -# ────────────────────────────────────────────────────────────────── -# Section 5: Streaming Query -# ────────────────────────────────────────────────────────────────── - -async def demo_streaming(session: Session, doc_ids: list[str]) -> None: - """Demonstrate streaming query with real-time events.""" - print("=" * 60) - print(" 5. 
Streaming Query") - print("=" * 60) - - stream = await session.query_stream( - "What are the key concepts?", - doc_ids=doc_ids[:1], - ) - - event_count = 0 - async for event in stream: - event_count += 1 - event_type = event.get("type", "unknown") - # Print a compact summary of each event - if event_type == "completed": - results = event.get("results", []) - print(f" [{event_count}] completed — {len(results)} result(s)") - elif event_type == "error": - print(f" [{event_count}] error — {event.get('message', '')}") - else: - print(f" [{event_count}] {event_type}") - - # The final result is available after iteration completes - if stream.result: - final = stream.result - item = final.single() - if item: - print(f" Final answer: {item.content[:150]}...") - - print() - - -# ────────────────────────────────────────────────────────────────── -# Section 6: Document Management -# ────────────────────────────────────────────────────────────────── - -async def demo_document_management(session: Session, doc_ids: list[str]) -> None: - """Demonstrate list, exists, remove, and clear.""" - print("=" * 60) - print(" 6. Document Management") - print("=" * 60) - - # --- List all documents --- - docs = await session.list_documents() - print(f" Listed {len(docs)} document(s):") - for doc in docs: - pages = f", pages={doc.page_count}" if doc.page_count else "" - print(f" {doc.name} id={doc.id[:8]}... 
format={doc.format}{pages}") - - # --- Check existence --- - if doc_ids: - exists = await session.document_exists(doc_ids[0]) - print(f"\n exists({doc_ids[0][:8]}...): {exists}") - - # --- Remove a document --- - if len(doc_ids) > 1: - removed = await session.remove_document(doc_ids[1]) - print(f" remove({doc_ids[1][:8]}...): {removed}") - - # Verify removal - exists_after = await session.document_exists(doc_ids[1]) - print(f" exists after removal: {exists_after}") - - # --- List again --- - docs = await session.list_documents() - print(f"\n After removal: {len(docs)} document(s)") - - print() - - -# ────────────────────────────────────────────────────────────────── -# Section 7: Cross-Document Relationship Graph -# ────────────────────────────────────────────────────────────────── - -async def demo_graph(session: Session) -> None: - """Demonstrate the cross-document relationship graph.""" - print("=" * 60) - print(" 7. Document Graph") - print("=" * 60) - - graph = await session.get_graph() - - if graph is None or graph.is_empty(): - print(" Graph is empty (no documents or no relationships found)") - print() - return - - print(f" Nodes: {graph.node_count()}, Edges: {graph.edge_count()}") - - for did in graph.doc_ids(): - node = graph.get_node(did) - if node: - keywords = ", ".join(k.keyword for k in node.top_keywords[:5]) - neighbors = graph.get_neighbors(did) - print(f" {node.title}") - print(f" format: {node.format}, nodes: {node.node_count}") - print(f" keywords: [{keywords}]") - print(f" neighbors: {len(neighbors)}") - for edge in neighbors[:3]: - target = graph.get_node(edge.target_doc_id) - target_name = target.title if target else edge.target_doc_id[:8] - weight_str = f"weight={edge.weight:.2f}" - evidence_str = "" - if edge.evidence: - evidence_str = f", shared_keywords={edge.evidence.shared_keyword_count}" - print(f" -> {target_name} ({weight_str}{evidence_str})") - - print() - - -# ────────────────────────────────────────────────────────────────── -# 
Section 8: Event Callbacks -# ────────────────────────────────────────────────────────────────── - -async def demo_events() -> None: - """Demonstrate event callbacks with EventEmitter.""" - print("=" * 60) - print(" 8. Event Callbacks") - print("=" * 60) - - events = EventEmitter() - - @events.on_index - def on_index_event(event): - if event.event_type == IndexEventType.STARTED: - print(f" [INDEX] Started: {event.path or event.message}") - elif event.event_type == IndexEventType.COMPLETE: - print(f" [INDEX] Complete: {event.doc_id or event.message}") - elif event.event_type == IndexEventType.ERROR: - print(f" [INDEX] Error: {event.message}") - - @events.on_query - def on_query_event(event): - if event.event_type == QueryEventType.STARTED: - print(f" [QUERY] Started: {event.query}") - elif event.event_type == QueryEventType.COMPLETE: - print(f" [QUERY] Complete: {event.total_results} result(s)") - - # Create a session with the event emitter - api_key = os.environ.get("VECTORLESS_API_KEY", "sk-...") - model = os.environ.get("VECTORLESS_MODEL", "gpt-4o") - session = Session(api_key=api_key, model=model, events=events) - - # Index and query — events fire automatically - await session.index(content=ARCHITECTURE_DOC, format="markdown", name="demo_events") - await session.ask("What are the key concepts?", workspace_scope=True) - - await session.clear_all() - print() - - -# ────────────────────────────────────────────────────────────────── -# Section 9: Metrics -# ────────────────────────────────────────────────────────────────── - -async def demo_metrics(session: Session) -> None: - """Demonstrate metrics reporting.""" - print("=" * 60) - print(" 9. 
Metrics Report") - print("=" * 60) - - report = session.metrics_report() - if report: - # The report contains llm and retrieval subsections - if hasattr(report, "llm"): - llm = report.llm - print(f" LLM Metrics:") - print(f" Total calls: {getattr(llm, 'total_calls', 'N/A')}") - print(f" Total tokens: {getattr(llm, 'total_tokens', 'N/A')}") - print(f" Cache hit rate: {getattr(llm, 'cache_hit_rate', 'N/A')}") - if hasattr(report, "retrieval"): - ret = report.retrieval - print(f" Retrieval Metrics:") - print(f" Total queries: {getattr(ret, 'total_queries', 'N/A')}") - print(f" Avg latency: {getattr(ret, 'avg_latency_ms', 'N/A')} ms") - else: - print(" No metrics available") - - print() - - -# ────────────────────────────────────────────────────────────────── -# Section 10: SyncSession (Synchronous API) -# ────────────────────────────────────────────────────────────────── - -def demo_sync_session() -> None: - """Demonstrate the synchronous Session (no async/await needed).""" - print("=" * 60) - print(" 10. 
SyncSession (no async/await)") - print("=" * 60) - - api_key = os.environ.get("VECTORLESS_API_KEY", "sk-...") - model = os.environ.get("VECTORLESS_MODEL", "gpt-4o") - - # Can also use: SyncSession.from_env() - with SyncSession(api_key=api_key, model=model) as session: - # Index from content - result = session.index( - content=FINANCE_DOC, - format="markdown", - name="sync_demo", - ) - print(f" Indexed: {result.doc_id}") - - # Query - response = session.ask( - "What was the total revenue?", - doc_ids=[result.doc_id], # type: ignore[list-item] - ) - item = response.single() - if item: - print(f" Answer: {item.content[:150]}...") - - # Cleanup - session.clear_all() - print(" Cleaned up") - - print() - - -# ────────────────────────────────────────────────────────────────── -# Main -# ────────────────────────────────────────────────────────────────── - -async def main() -> None: - print() - print(" Vectorless — Session API Walkthrough") - print(" " + "-" * 38) - print() - - # 1. Create session - session = await demo_session_creation() - - # Set up sample directory - tmpdir, paths = create_sample_directory() - - try: - # 2. Indexing - doc_id_map = await demo_indexing(session, tmpdir, paths) - all_doc_ids = list(doc_id_map.values()) - - # 3. Batch indexing (clears and re-indexes) - batch_doc_ids = await demo_batch_indexing(session, paths) - all_doc_ids = batch_doc_ids if batch_doc_ids else all_doc_ids - - # 4. Querying - if all_doc_ids: - await demo_querying(session, all_doc_ids) - - # 5. Streaming query - if all_doc_ids: - await demo_streaming(session, all_doc_ids) - - # 6. Document management - await demo_document_management(session, all_doc_ids) - - # 7. Graph - await demo_graph(session) - - # 8. Events (creates its own session) - await demo_events() - - # 9. Metrics - await demo_metrics(session) - - finally: - # Cleanup - await session.clear_all() - cleanup_directory(tmpdir) - print("=" * 60) - print(" Cleanup complete.") - print("=" * 60) - - # 10. 
SyncSession (separate, runs synchronously) - demo_sync_session() - - print(" Done.") - - -if __name__ == "__main__": - asyncio.run(main()) diff --git a/rust/examples/single_doc_challenge.rs b/examples/single_doc_challenge.py similarity index 55% rename from rust/examples/single_doc_challenge.rs rename to examples/single_doc_challenge.py index cb174acf..10a55fe6 100644 --- a/rust/examples/single_doc_challenge.rs +++ b/examples/single_doc_challenge.py @@ -1,25 +1,29 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Single-document reasoning challenge. -//! -//! Indexes a realistic technical document and asks questions that require -//! the engine to navigate deep into the tree, cross-reference details -//! across distant sections, and extract information buried in nested -//! structures — not surface-level keyword matches. -//! -//! ```bash -//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \ -//! LLM_ENDPOINT=https://api.openai.com/v1 \ -//! cargo run --example single_doc_challenge -//! ``` - -use vectorless::{DocumentFormat, EngineBuilder, IndexContext, QueryContext}; - -/// A research report with information scattered across sections. -/// The answers to the challenge questions require connecting dots -/// from different parts of the document, not simple keyword lookup. -const REPORT: &str = r#" +# Copyright (c) 2026 vectorless developers +# SPDX-License-Identifier: Apache-2.0 + +"""Single-document reasoning challenge. + +Indexes a realistic technical document and asks questions that require +the engine to navigate deep into the tree, cross-reference details +across distant sections, and extract information buried in nested +structures — not surface-level keyword matches. 
+ +```bash +LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \ + LLM_ENDPOINT=https://api.openai.com/v1 \ + python examples/single_doc_challenge.py +``` +""" + +import asyncio +import os + +from vectorless import Engine + +# A research report with information scattered across sections. +# The answers to the challenge questions require connecting dots +# from different parts of the document, not simple keyword lookup. +REPORT = """ # Quantum Computing Division — Annual Research Report 2025 ## Executive Summary @@ -143,111 +147,79 @@ larger code distances) 4. File 25+ patents 5. Grow revenue to $60M -"#; - -/// Challenge questions designed to test deep reasoning. -/// None of these can be answered by simple keyword search — -/// each requires connecting information from multiple sections. -const CHALLENGE_QUESTIONS: &[&str] = &[ - // Requires: cross-reference Lab B's device characterization needs with - // Lab A's FR-02 specs, then connect to the CapEx table for FR-02 cost +""" + +CHALLENGE_QUESTIONS = [ + # Requires: cross-reference Lab B's device characterization needs with + # Lab A's FR-02 specs, then connect to the CapEx table for FR-02 cost "How much did the only refrigerator capable of characterizing Lab B's devices cost, and where is it located?", - // Requires: trace Lab C's below-threshold result → depends on Lab A's T1 - // improvement → depends on tantalum junction transition + # Requires: trace Lab C's below-threshold result -> depends on Lab A's T1 + # improvement -> depends on tantalum junction transition "What specific materials change in another lab made Lab C's error correction milestone possible?", - // Requires: find the firmware bug in Lab D section, then look at the - // Lab A FR-01 qubit count, then compute the impact window + # Requires: find the firmware bug in Lab D section, then look at the + # Lab A FR-01 qubit count, then compute the impact window "How many qubits were affected by the firmware bug, and for how many days?", - // Requires: Lab B 
gap/target ratio (70%) × theoretical target (0.5meV) - // → actual gap = 0.35meV, compare with 2026 goal of 0.45meV + # Requires: Lab B gap/target ratio (70%) * theoretical target (0.5meV) + # -> actual gap = 0.35meV, compare with 2026 goal of 0.45meV "What is the gap between Lab B's current topological gap achievement and the 2026 target, in meV?", - // Requires: trace the dependency chain: 256-qubit goal → need FR-03 → - // cost $9-11M → government contracts are largest revenue source at $19.8M + # Requires: trace the dependency chain: 256-qubit goal -> need FR-03 -> + # cost $9-11M -> government contracts are largest revenue source at $19.8M "If the 2026 qubit scaling goal requires a new refrigerator, can the largest revenue source category alone cover its estimated cost?", -]; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - tracing_subscriber::fmt() - .compact() - .with_env_filter( - tracing_subscriber::EnvFilter::try_from_default_env() - .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info")), - ) - .init(); - - println!("=== Single-Document Reasoning Challenge ===\n"); - - let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string()); - let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string()); - let endpoint = - std::env::var("LLM_ENDPOINT").unwrap_or_else(|_| "https://api.openai.com/v1".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - // Index (skip if already indexed — we're testing retrieval, not indexing) - let doc_name = "qc_report_2025"; - let doc_id = { - let existing = engine.list().await?; - if let Some(doc) = existing.iter().find(|d| d.name == doc_name) { - println!("Document already indexed, reusing: {}\n", doc.id); - doc.id.clone() - } else { - println!("Indexing research report..."); - let result = engine - .index( - 
IndexContext::from_content(REPORT, DocumentFormat::Markdown) - .with_name(doc_name), - ) - .await?; - let id = result.doc_id().unwrap().to_string(); - println!(" doc_id: {}\n", id); - id - } - }; - - // Challenge queries - for (i, question) in CHALLENGE_QUESTIONS.iter().enumerate() { - println!("Q{}: {}", i + 1, question); - - match engine - .query(QueryContext::new(*question).with_doc_ids(vec![doc_id.clone()])) - .await - { - Ok(response) => { - if let Some(item) = response.single() { - if item.content.is_empty() { - println!(" (no answer found)\n"); - } else { - // Print first 3 lines as preview - for line in item.content.lines().take(3) { - println!(" {}", line); - } - let remaining = item.content.lines().count().saturating_sub(3); - if remaining > 0 { - println!(" ... ({} more lines)", remaining); - } - println!(" confidence: {:.2}\n", item.confidence); - } - } else { - println!(" (no results)\n"); - } - } - Err(e) => { - println!(" error: {}\n", e); - } - } - } - - // Uncomment to remove the document after testing: - // engine.remove(&doc_id).await?; - // println!("Cleaned up."); - - Ok(()) -} +] + + +async def main() -> None: + print("=== Single-Document Reasoning Challenge ===\n") + + api_key = os.environ.get("LLM_API_KEY", "sk-...") + model = os.environ.get("LLM_MODEL", "gpt-4o") + endpoint = os.environ.get("LLM_ENDPOINT", "https://api.openai.com/v1") + + engine = Engine(api_key=api_key, model=model, endpoint=endpoint) + + doc_name = "qc_report_2025" + + # Check if already indexed + doc_id = None + docs = await engine.list_documents() + for doc in docs: + if doc.name == doc_name: + doc_id = doc.doc_id + print(f"Document already indexed, reusing: {doc_id}\n") + break + + if doc_id is None: + print("Indexing research report...") + from vectorless._core import IndexContext + + ctx = IndexContext.from_content(REPORT, "markdown").with_name(doc_name) + result = await engine.index(ctx) + doc_id = result.doc_id + print(f" doc_id: {doc_id}\n") + + # Challenge 
queries + for i, question in enumerate(CHALLENGE_QUESTIONS, 1): + print(f"Q{i}: {question}") + + try: + answer = await engine.ask(question, doc_ids=[doc_id]) + if not answer.content: + print(" (no answer found)\n") + else: + lines = answer.content.split("\n") + for line in lines[:3]: + print(f" {line}") + remaining = len(lines) - 3 + if remaining > 0: + print(f" ... ({remaining} more lines)") + print(f" confidence: {answer.confidence:.2f}\n") + except Exception as e: + print(f" error: {e}\n") + + # Uncomment to remove the document after testing: + # await engine.forget(doc_id) + # print("Cleaned up.") + + +if __name__ == "__main__": + asyncio.run(main()) diff --git a/pyproject.toml b/pyproject.toml index ac8ae458..cee403e3 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -4,20 +4,19 @@ build-backend = "maturin" [project] name = "vectorless" -version = "0.1.11" -description = "Reasoning-based Document Engine" +dynamic = ["version"] +description = "Document Understanding Engine for AI" readme = "README.md" -requires-python = ">=3.9" +requires-python = ">=3.10" license = { text = "Apache-2.0" } authors = [ - { name = "vectorless developers", email = "beautifularea@gmail.com" } + { name = "Vectorless", email = "beautifularea@gmail.com" } ] classifiers = [ "Development Status :: 4 - Beta", "Intended Audience :: Developers", "License :: OSI Approved :: Apache Software License", "Programming Language :: Python :: 3", - "Programming Language :: Python :: 3.9", "Programming Language :: Python :: 3.10", "Programming Language :: Python :: 3.11", "Programming Language :: Python :: 3.12", @@ -26,12 +25,12 @@ classifiers = [ "Topic :: Scientific/Engineering :: Artificial Intelligence", "Topic :: Text Processing :: Linguistic", ] -keywords = ["rag", "document", "retrieval", "llm", "document-intelligence"] +keywords = ["document", "understanding", "ai", "reasoning", "document-intelligence"] dependencies = [ "pydantic>=2.0", "click>=8.0", - "tomli>=2.0; python_version < '3.11'", 
+ "tomli>=2.0; python_version < '3.11'", # 3.10 only, 3.11+ has tomllib built-in ] [project.optional-dependencies] @@ -67,23 +66,23 @@ Repository = "https://github.com/vectorlessflow/vectorless" Documentation = "https://www.vectorless.dev/docs/intro" [tool.maturin] -python-source = "python" +python-source = "." module-name = "vectorless._vectorless" -manifest-path = "python/Cargo.toml" +manifest-path = "vectorless-core/vectorless-py/Cargo.toml" features = ["pyo3/extension-module"] [tool.pytest.ini_options] asyncio_mode = "auto" -testpaths = ["python/tests"] +testpaths = ["tests"] [tool.mypy] -python_version = "3.9" +python_version = "3.10" warn_return_any = true warn_unused_configs = true [tool.ruff] line-length = 100 -target-version = "py39" +target-version = "py310" [tool.ruff.lint] select = ["E", "F", "W", "I", "N", "UP", "B"] @@ -93,3 +92,12 @@ quote-style = "double" indent-style = "space" skip-magic-trailing-comma = false line-ending = "lf" + +[tool.uv] +dev-dependencies = [ + "pytest>=7.0", + "pytest-asyncio>=0.21", + "mypy>=1.0", + "ruff>=0.4", + "maturin>=1.5", +] diff --git a/python/README.md b/python/README.md deleted file mode 100644 index be4761d0..00000000 --- a/python/README.md +++ /dev/null @@ -1,239 +0,0 @@ -# Vectorless Python SDK - -Python bindings for [vectorless](https://github.com/vectorlessflow/vectorless) — a reasoning-native document intelligence engine for AI. 
- -## Installation - -```bash -pip install vectorless -``` - -## Quick Start - -```python -import asyncio -from vectorless import Engine, IndexContext, QueryContext - -async def main(): - # Create engine — api_key and model are required - engine = Engine( - api_key="sk-...", - model="gpt-4o", - ) - - # Index a document - result = await engine.index(IndexContext.from_path("./report.pdf")) - doc_id = result.doc_id - print(f"Indexed: {doc_id}") - - # Query the document - result = await engine.query( - QueryContext("What is the total revenue?").with_doc_ids([doc_id]) - ) - item = result.single() - print(f"Answer: {item.content}") - print(f"Score: {item.score:.2f}") - - # List all documents - for doc in await engine.list(): - print(f" - {doc.name} ({doc.id})") - - # Cleanup - await engine.remove(doc_id) - -asyncio.run(main()) -``` - -## API Reference - -### Engine - -The main entry point for vectorless. - -```python -class Engine: - def __init__( - self, - config_path: str | None = None, - api_key: str | None = None, - model: str | None = None, - endpoint: str | None = None, - ): ... - - async def index(self, ctx: IndexContext) -> IndexResult: ... - async def query(self, ctx: QueryContext) -> QueryResult: ... - async def list(self) -> list[DocumentInfo]: ... - async def remove(self, doc_id: str) -> bool: ... - async def clear(self) -> int: ... - async def exists(self, doc_id: str) -> bool: ... - async def get_graph(self) -> DocumentGraph | None: ... -``` - -### IndexContext - -Context for indexing documents. - -```python -class IndexContext: - @staticmethod - def from_path(path: str, name: str | None = None) -> IndexContext: ... - - @staticmethod - def from_paths(paths: list[str]) -> IndexContext: ... - - @staticmethod - def from_dir(path: str, recursive: bool = True) -> IndexContext: ... - - @staticmethod - def from_content( - content: str, - name: str | None = None, - format: str = "markdown", - ) -> IndexContext: ... 
- - @staticmethod - def from_bytes( - data: bytes, - name: str, - format: str, - ) -> IndexContext: ... - - def with_options(self, options: IndexOptions) -> IndexContext: ... - def with_mode(self, mode: str) -> IndexContext: ... -``` - -**Supported formats:** -- `"markdown"` / `"md"` - Markdown content -- `"pdf"` - PDF documents - -### QueryContext - -Context for querying documents. - -```python -class QueryContext: - def __init__(self, query: str): ... - - def with_doc_ids(self, doc_ids: list[str]) -> QueryContext: ... - def with_workspace(self) -> QueryContext: ... - def with_timeout_secs(self, secs: int) -> QueryContext: ... - def with_force_analysis(self, force: bool) -> QueryContext: ... -``` - -### IndexResult - -```python -class IndexResult: - @property - def doc_id(self) -> str | None: ... - @property - def items(self) -> list[IndexItem]: ... - @property - def failed(self) -> list[FailedItem]: ... - def has_failures(self) -> bool: ... - def total(self) -> int: ... - def __len__(self) -> int: ... -``` - -### QueryResult - -```python -class QueryResult: - @property - def items(self) -> list[QueryResultItem]: ... - @property - def failed(self) -> list[FailedItem]: ... - def single(self) -> QueryResultItem | None: ... - def has_failures(self) -> bool: ... - def __len__(self) -> int: ... -``` - -### QueryResultItem - -```python -class QueryResultItem: - @property - def doc_id(self) -> str: ... - @property - def content(self) -> str: ... - @property - def score(self) -> float: ... - @property - def node_ids(self) -> list[str]: ... -``` - -### IndexItem - -```python -class IndexItem: - @property - def doc_id(self) -> str: ... - @property - def name(self) -> str: ... - @property - def format(self) -> str: ... - @property - def description(self) -> str | None: ... - @property - def source_path(self) -> str | None: ... - @property - def page_count(self) -> int | None: ... - @property - def metrics(self) -> IndexMetrics | None: ... 
-``` - -### DocumentInfo - -```python -class DocumentInfo: - @property - def id(self) -> str: ... - @property - def name(self) -> str: ... - @property - def format(self) -> str: ... - @property - def description(self) -> str | None: ... - @property - def source_path(self) -> str | None: ... - @property - def page_count(self) -> int | None: ... - @property - def line_count(self) -> int | None: ... -``` - -### VectorlessError - -```python -class VectorlessError(Exception): - @property - def message(self) -> str: ... - @property - def kind(self) -> str: ... # "config", "parse", "not_found", "llm" -``` - -## Development - -### Building from source - -```bash -# Install maturin -pip install maturin - -# Build and install (from project root) -maturin develop - -# Run tests -pytest -``` - -### Publishing to PyPI - -```bash -maturin build --release -maturin publish -``` - -## License - -Apache-2.0 diff --git a/python/src/context.rs b/python/src/context.rs deleted file mode 100644 index 2bf0ae94..00000000 --- a/python/src/context.rs +++ /dev/null @@ -1,282 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! IndexContext, QueryContext, and IndexOptions Python wrappers. - -use pyo3::prelude::*; - -use ::vectorless::{DocumentFormat, IndexContext, IndexMode, IndexOptions, QueryContext}; - -use super::error::VectorlessError; - -/// Parse format string to DocumentFormat. -fn parse_format(format: &str) -> PyResult { - match format.to_lowercase().as_str() { - "markdown" | "md" => Ok(DocumentFormat::Markdown), - "pdf" => Ok(DocumentFormat::Pdf), - _ => Err(PyErr::from(VectorlessError::new( - format!("Unknown format: {}. Supported: markdown, pdf", format), - "config", - ))), - } -} - -// ============================================================ -// IndexOptions -// ============================================================ - -/// Options for controlling indexing behavior. 
-/// -/// Args: -/// mode: Indexing mode - "default", "force", or "incremental". -/// generate_summaries: Whether to generate summaries. Default: True. -/// generate_description: Whether to generate document description. Default: False. -/// generate_ids: Whether to generate node IDs. Default: True. -/// enable_synonym_expansion: Whether to expand keywords with LLM-generated -/// synonyms during indexing. Improves recall for differently-worded queries. -/// Default: False. -#[pyclass(name = "IndexOptions", skip_from_py_object)] -#[derive(Clone)] -pub struct PyIndexOptions { - pub(crate) inner: IndexOptions, -} - -#[pymethods] -impl PyIndexOptions { - #[new] - #[pyo3(signature = (mode="default", generate_summaries=true, generate_description=false, generate_ids=true, enable_synonym_expansion=true, timeout_secs=None))] - fn new( - mode: &str, - generate_summaries: bool, - generate_description: bool, - generate_ids: bool, - enable_synonym_expansion: bool, - timeout_secs: Option, - ) -> PyResult { - let mut opts = IndexOptions::new(); - match mode { - "default" => {} - "force" => opts = opts.with_mode(IndexMode::Force), - "incremental" => opts = opts.with_mode(IndexMode::Incremental), - _ => { - return Err(PyErr::from(VectorlessError::new( - format!( - "Unknown mode: {}. 
Supported: default, force, incremental", - mode - ), - "config", - ))); - } - } - opts.generate_summaries = generate_summaries; - opts.generate_description = generate_description; - opts.generate_ids = generate_ids; - opts.enable_synonym_expansion = enable_synonym_expansion; - if let Some(secs) = timeout_secs { - opts = opts.with_timeout_secs(secs); - } - Ok(Self { inner: opts }) - } - - fn __repr__(&self) -> String { - format!( - "IndexOptions(mode='{}', generate_summaries={}, generate_description={}, generate_ids={}, enable_synonym_expansion={})", - match self.inner.mode { - IndexMode::Default => "default", - IndexMode::Force => "force", - IndexMode::Incremental => "incremental", - }, - self.inner.generate_summaries, - self.inner.generate_description, - self.inner.generate_ids, - self.inner.enable_synonym_expansion, - ) - } -} - -// ============================================================ -// IndexContext -// ============================================================ - -/// Context for indexing a document. -/// -/// Create using the static methods: -/// -/// ```python -/// from vectorless import IndexContext -/// -/// # Single file -/// ctx = IndexContext.from_path("./document.pdf") -/// -/// # Multiple files -/// ctx = IndexContext.from_paths(["./a.pdf", "./b.md"]) -/// -/// # Directory -/// ctx = IndexContext.from_dir("./docs/") -/// -/// # From text -/// ctx = IndexContext.from_content("# Title\\nContent...", "markdown").with_name("doc") -/// -/// # From bytes -/// ctx = IndexContext.from_bytes(data, "pdf").with_name("doc") -/// ``` -#[pyclass(name = "IndexContext")] -pub struct PyIndexContext { - pub(crate) inner: IndexContext, -} - -#[pymethods] -impl PyIndexContext { - /// Create an IndexContext from a single file path. - #[staticmethod] - fn from_path(path: String) -> Self { - Self { - inner: IndexContext::from_path(&path), - } - } - - /// Create an IndexContext from multiple file paths. 
- #[staticmethod] - fn from_paths(paths: Vec) -> Self { - Self { - inner: IndexContext::from_paths(&paths), - } - } - - /// Create an IndexContext from all supported files in a directory. - /// - /// Args: - /// path: Directory path to scan. - /// recursive: If True, scan subdirectories recursively. Default: False. - #[staticmethod] - #[pyo3(signature = (path, recursive=false))] - fn from_dir(path: String, recursive: bool) -> Self { - let inner = IndexContext::from_dir(&path, recursive); - Self { inner } - } - - /// Create an IndexContext from text content. - #[staticmethod] - #[pyo3(signature = (content, format="markdown"))] - fn from_content(content: String, format: &str) -> PyResult { - let doc_format = parse_format(format)?; - let ctx = IndexContext::from_content(&content, doc_format); - Ok(Self { inner: ctx }) - } - - /// Create an IndexContext from binary data. - #[staticmethod] - fn from_bytes(data: Vec, format: &str) -> PyResult { - let doc_format = parse_format(format)?; - let ctx = IndexContext::from_bytes(data, doc_format); - Ok(Self { inner: ctx }) - } - - /// Set the document name (single-source only). - fn with_name(&self, name: String) -> Self { - let ctx = self.inner.clone().with_name(&name); - Self { inner: ctx } - } - - /// Apply indexing options. - fn with_options(&self, options: &PyIndexOptions) -> Self { - let ctx = self.inner.clone().with_options(options.inner.clone()); - Self { inner: ctx } - } - - /// Set indexing mode. - fn with_mode(&self, mode: &str) -> PyResult { - let m = match mode { - "default" => IndexMode::Default, - "force" => IndexMode::Force, - "incremental" => IndexMode::Incremental, - _ => { - return Err(PyErr::from(VectorlessError::new( - format!( - "Unknown mode: {}. Supported: default, force, incremental", - mode - ), - "config", - ))); - } - }; - let ctx = self.inner.clone().with_mode(m); - Ok(Self { inner: ctx }) - } - - /// Number of document sources. 
- fn __len__(&self) -> usize { - self.inner.len() - } - - /// Whether no sources are present. - fn is_empty(&self) -> bool { - self.inner.is_empty() - } - - fn __repr__(&self) -> String { - format!("IndexContext(sources={})", self.inner.len()) - } -} - -// ============================================================ -// QueryContext -// ============================================================ - -/// Context for a query operation. -/// -/// ```python -/// from vectorless import QueryContext -/// -/// # Query specific documents -/// ctx = QueryContext("What is the total revenue?").with_doc_ids([doc_id]) -/// -/// # Query multiple documents -/// ctx = QueryContext("What is the architecture?").with_doc_ids(["doc-1", "doc-2"]) -/// -/// # Query entire workspace -/// ctx = QueryContext("Explain the algorithm") -/// ``` -#[pyclass(name = "QueryContext")] -pub struct PyQueryContext { - pub(crate) inner: QueryContext, -} - -#[pymethods] -impl PyQueryContext { - /// Create a new query context (defaults to workspace scope). - #[new] - fn new(query: String) -> Self { - Self { - inner: QueryContext::new(&query), - } - } - - /// Set scope to specific documents. - fn with_doc_ids(&self, doc_ids: Vec) -> Self { - let ctx = self.inner.clone().with_doc_ids(doc_ids); - Self { inner: ctx } - } - - /// Set scope to entire workspace. - fn with_workspace(&self) -> Self { - let ctx = self.inner.clone().with_workspace(); - Self { inner: ctx } - } - - /// Set per-operation timeout in seconds. - fn with_timeout_secs(&self, secs: u64) -> Self { - let ctx = self.inner.clone().with_timeout_secs(secs); - Self { inner: ctx } - } - - /// Force the Orchestrator to analyze documents before dispatching Workers. 
- fn with_force_analysis(&self, force: bool) -> Self { - let ctx = self.inner.clone().with_force_analysis(force); - Self { inner: ctx } - } - - fn __repr__(&self) -> String { - "QueryContext(...)".to_string() - } -} diff --git a/python/src/document.rs b/python/src/document.rs deleted file mode 100644 index d5652fba..00000000 --- a/python/src/document.rs +++ /dev/null @@ -1,59 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! DocumentInfo Python wrapper. - -use pyo3::prelude::*; - -use ::vectorless::DocumentInfo; - -/// Information about an indexed document. -#[pyclass(name = "DocumentInfo")] -pub struct PyDocumentInfo { - pub(crate) inner: DocumentInfo, -} - -#[pymethods] -impl PyDocumentInfo { - #[getter] - fn id(&self) -> &str { - &self.inner.id - } - - #[getter] - fn name(&self) -> &str { - &self.inner.name - } - - #[getter] - fn format(&self) -> &str { - &self.inner.format - } - - #[getter] - fn description(&self) -> Option<&str> { - self.inner.description.as_deref() - } - - #[getter] - fn source_path(&self) -> Option<&str> { - self.inner.source_path.as_deref() - } - - #[getter] - fn page_count(&self) -> Option { - self.inner.page_count - } - - #[getter] - fn line_count(&self) -> Option { - self.inner.line_count - } - - fn __repr__(&self) -> String { - format!( - "DocumentInfo(id='{}', name='{}', format='{}')", - self.inner.id, self.inner.name, self.inner.format - ) - } -} diff --git a/python/src/results.rs b/python/src/results.rs deleted file mode 100644 index ba4ea776..00000000 --- a/python/src/results.rs +++ /dev/null @@ -1,506 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Query and index result Python wrappers. 
- -use pyo3::prelude::*; - -use ::vectorless::IndexMetrics; -use ::vectorless::{ - EvidenceItem, FailedItem, IndexItem, IndexResult, QueryMetrics, QueryResult, QueryResultItem, -}; - -// ============================================================ -// EvidenceItem -// ============================================================ - -/// A single piece of evidence with source attribution. -#[pyclass(name = "EvidenceItem")] -pub struct PyEvidenceItem { - pub(crate) inner: EvidenceItem, -} - -#[pymethods] -impl PyEvidenceItem { - /// Section title where this evidence was found. - #[getter] - fn title(&self) -> &str { - &self.inner.title - } - - /// Navigation path (e.g., "Root/Chapter 1/Section 1.2"). - #[getter] - fn path(&self) -> &str { - &self.inner.path - } - - /// Raw evidence content. - #[getter] - fn content(&self) -> &str { - &self.inner.content - } - - /// Source document name. - #[getter] - fn doc_name(&self) -> Option<&str> { - self.inner.doc_name.as_deref() - } - - fn __repr__(&self) -> String { - format!( - "EvidenceItem(title='{}', path='{}', content_len={})", - self.inner.title, - self.inner.path, - self.inner.content.len() - ) - } -} - -// ============================================================ -// QueryMetrics -// ============================================================ - -/// Query execution metrics. -#[pyclass(name = "QueryMetrics")] -pub struct PyQueryMetrics { - pub(crate) inner: QueryMetrics, -} - -#[pymethods] -impl PyQueryMetrics { - /// Number of LLM calls made. - #[getter] - fn llm_calls(&self) -> u32 { - self.inner.llm_calls - } - - /// Number of navigation rounds used. - #[getter] - fn rounds_used(&self) -> u32 { - self.inner.rounds_used - } - - /// Number of distinct nodes visited. - #[getter] - fn nodes_visited(&self) -> usize { - self.inner.nodes_visited - } - - /// Number of evidence items collected. - #[getter] - fn evidence_count(&self) -> usize { - self.inner.evidence_count - } - - /// Total characters of collected evidence. 
- #[getter] - fn evidence_chars(&self) -> usize { - self.inner.evidence_chars - } - - fn __repr__(&self) -> String { - format!( - "QueryMetrics(llm_calls={}, rounds={}, evidence={})", - self.inner.llm_calls, self.inner.rounds_used, self.inner.evidence_count - ) - } -} - -// ============================================================ -// QueryResultItem -// ============================================================ - -/// A single document's query result. -#[pyclass(name = "QueryResultItem")] -pub struct PyQueryResultItem { - pub(crate) inner: QueryResultItem, -} - -#[pymethods] -impl PyQueryResultItem { - /// The document ID. - #[getter] - fn doc_id(&self) -> &str { - &self.inner.doc_id - } - - /// The retrieved content (synthesized answer or raw evidence). - #[getter] - fn content(&self) -> &str { - &self.inner.content - } - - /// Confidence score (0.0 to 1.0). - #[getter] - fn score(&self) -> f32 { - self.inner.confidence - } - - /// Node IDs that matched (navigation paths). - #[getter] - fn node_ids(&self) -> Vec { - self.inner.node_ids.clone() - } - - /// Evidence items with source attribution. - #[getter] - fn evidence(&self) -> Vec { - self.inner - .evidence - .iter() - .map(|e| PyEvidenceItem { - inner: EvidenceItem { - title: e.title.clone(), - path: e.path.clone(), - content: e.content.clone(), - doc_name: e.doc_name.clone(), - }, - }) - .collect() - } - - /// Execution metrics for this query. - #[getter] - fn metrics(&self) -> Option { - self.inner.metrics.as_ref().map(|m| PyQueryMetrics { - inner: QueryMetrics { - llm_calls: m.llm_calls, - rounds_used: m.rounds_used, - nodes_visited: m.nodes_visited, - evidence_count: m.evidence_count, - evidence_chars: m.evidence_chars, - }, - }) - } - - /// Confidence score (0.0 to 1.0). 
- #[getter] - fn confidence(&self) -> f32 { - self.inner.confidence - } - - fn __repr__(&self) -> String { - format!( - "QueryResultItem(doc_id='{}', confidence={:.2}, evidence={})", - self.inner.doc_id, - self.inner.confidence, - self.inner.evidence.len() - ) - } -} - -// ============================================================ -// FailedItem -// ============================================================ - -/// A failed item in a batch operation. -#[pyclass(name = "FailedItem")] -pub struct PyFailedItem { - pub(crate) inner: FailedItem, -} - -#[pymethods] -impl PyFailedItem { - /// Source description. - #[getter] - fn source(&self) -> &str { - &self.inner.source - } - - /// Error message. - #[getter] - fn error(&self) -> &str { - &self.inner.error - } - - fn __repr__(&self) -> String { - format!( - "FailedItem(source='{}', error='{}')", - self.inner.source, self.inner.error - ) - } -} - -// ============================================================ -// QueryResult -// ============================================================ - -/// Result of a document query. -#[pyclass(name = "QueryResult")] -pub struct PyQueryResult { - pub(crate) inner: QueryResult, -} - -#[pymethods] -impl PyQueryResult { - /// Result items (one per document). - #[getter] - fn items(&self) -> Vec { - self.inner - .items - .iter() - .map(|i| PyQueryResultItem { - inner: QueryResultItem { - doc_id: i.doc_id.clone(), - node_ids: i.node_ids.clone(), - content: i.content.clone(), - evidence: i.evidence.clone(), - metrics: i.metrics.clone(), - confidence: i.confidence, - }, - }) - .collect() - } - - /// Get the first (single-doc) result item. - fn single(&self) -> Option { - self.inner.single().map(|i| PyQueryResultItem { - inner: QueryResultItem { - doc_id: i.doc_id.clone(), - node_ids: i.node_ids.clone(), - content: i.content.clone(), - evidence: i.evidence.clone(), - metrics: i.metrics.clone(), - confidence: i.confidence, - }, - }) - } - - /// Number of result items. 
- fn __len__(&self) -> usize { - self.inner.len() - } - - /// Whether any documents failed. - fn has_failures(&self) -> bool { - self.inner.has_failures() - } - - /// Failed items. - #[getter] - fn failed(&self) -> Vec { - self.inner - .failed - .iter() - .map(|f| PyFailedItem { - inner: FailedItem::new(&f.source, &f.error), - }) - .collect() - } - - fn __repr__(&self) -> String { - format!( - "QueryResult(items={}, failed={})", - self.inner.len(), - self.inner.failed.len() - ) - } -} - -// ============================================================ -// IndexMetrics -// ============================================================ - -/// Indexing pipeline metrics. -#[pyclass(name = "IndexMetrics")] -pub struct PyIndexMetrics { - pub(crate) inner: IndexMetrics, -} - -#[pymethods] -impl PyIndexMetrics { - /// Total indexing time (ms). - #[getter] - fn total_time_ms(&self) -> u64 { - self.inner.total_time_ms() - } - - /// Parse stage duration (ms). - #[getter] - fn parse_time_ms(&self) -> u64 { - self.inner.parse_time_ms - } - - /// Build stage duration (ms). - #[getter] - fn build_time_ms(&self) -> u64 { - self.inner.build_time_ms - } - - /// Enhance (summary) stage duration (ms). - #[getter] - fn enhance_time_ms(&self) -> u64 { - self.inner.enhance_time_ms - } - - /// Number of nodes processed. - #[getter] - fn nodes_processed(&self) -> usize { - self.inner.nodes_processed - } - - /// Number of summaries successfully generated. - #[getter] - fn summaries_generated(&self) -> usize { - self.inner.summaries_generated - } - - /// Number of summaries that failed to generate. - #[getter] - fn summaries_failed(&self) -> usize { - self.inner.summaries_failed - } - - /// Number of LLM calls made. - #[getter] - fn llm_calls(&self) -> usize { - self.inner.llm_calls - } - - /// Total tokens generated by LLM. - #[getter] - fn total_tokens_generated(&self) -> usize { - self.inner.total_tokens_generated - } - - /// Number of topics in reasoning index. 
- #[getter] - fn topics_indexed(&self) -> usize { - self.inner.topics_indexed - } - - /// Number of keywords in reasoning index. - #[getter] - fn keywords_indexed(&self) -> usize { - self.inner.keywords_indexed - } - - fn __repr__(&self) -> String { - format!( - "IndexMetrics(total={}ms, summaries={}, failed={}, llm_calls={})", - self.inner.total_time_ms(), - self.inner.summaries_generated, - self.inner.summaries_failed, - self.inner.llm_calls, - ) - } -} - -// ============================================================ -// IndexItem / IndexResult -// ============================================================ - -/// A single indexed document item. -#[pyclass(name = "IndexItem")] -pub struct PyIndexItem { - pub(crate) inner: IndexItem, -} - -#[pymethods] -impl PyIndexItem { - #[getter] - fn doc_id(&self) -> &str { - &self.inner.doc_id - } - - #[getter] - fn name(&self) -> &str { - &self.inner.name - } - - #[getter] - fn format(&self) -> String { - format!("{:?}", self.inner.format).to_lowercase() - } - - #[getter] - fn description(&self) -> Option<&str> { - self.inner.description.as_deref() - } - - #[getter] - fn source_path(&self) -> Option<&str> { - self.inner.source_path.as_deref() - } - - #[getter] - fn page_count(&self) -> Option { - self.inner.page_count - } - - /// Indexing pipeline metrics (timing, LLM usage, etc.). - #[getter] - fn metrics(&self) -> Option { - self.inner - .metrics - .as_ref() - .map(|m| PyIndexMetrics { inner: m.clone() }) - } - - fn __repr__(&self) -> String { - format!( - "IndexItem(doc_id='{}', name='{}')", - self.inner.doc_id, self.inner.name - ) - } -} - -/// Result of a document indexing operation. -#[pyclass(name = "IndexResult")] -pub struct PyIndexResult { - pub(crate) inner: IndexResult, -} - -#[pymethods] -impl PyIndexResult { - /// The document ID (convenience for single-document indexing). - #[getter] - fn doc_id(&self) -> Option { - self.inner.doc_id().map(|s| s.to_string()) - } - - /// All indexed items. 
-    #[getter]
-    fn items(&self) -> Vec<PyIndexItem> {
-        self.inner
-            .items
-            .iter()
-            .map(|i| PyIndexItem { inner: i.clone() })
-            .collect()
-    }
-
-    /// Failed items.
-    #[getter]
-    fn failed(&self) -> Vec<PyFailedItem> {
-        self.inner
-            .failed
-            .iter()
-            .map(|f| PyFailedItem {
-                inner: FailedItem::new(&f.source, &f.error),
-            })
-            .collect()
-    }
-
-    /// Whether any items failed.
-    fn has_failures(&self) -> bool {
-        self.inner.has_failures()
-    }
-
-    /// Total number of items (successful + failed).
-    fn total(&self) -> usize {
-        self.inner.total()
-    }
-
-    fn __len__(&self) -> usize {
-        self.inner.len()
-    }
-
-    fn __repr__(&self) -> String {
-        format!(
-            "IndexResult(doc_id={:?}, count={}, failed={})",
-            self.inner.doc_id(),
-            self.inner.items.len(),
-            self.inner.failed.len()
-        )
-    }
-}
diff --git a/python/src/streaming.rs b/python/src/streaming.rs
deleted file mode 100644
index eafa688e..00000000
--- a/python/src/streaming.rs
+++ /dev/null
@@ -1,179 +0,0 @@
-// Copyright (c) 2026 vectorless developers
-// SPDX-License-Identifier: Apache-2.0
-
-//! PyO3 streaming query wrapper.
-//!
-//! Bridges Rust's `mpsc::Receiver` to a Python async iterator,
-//! yielding real-time retrieval progress events as dicts.
-
-use pyo3::exceptions::PyStopAsyncIteration;
-use pyo3::prelude::*;
-use pyo3::types::PyDict;
-use pyo3_async_runtimes::tokio::future_into_py;
-use std::sync::Arc;
-use tokio::sync::{Mutex, mpsc};
-
-use ::vectorless::{RetrieveEvent, SufficiencyLevel};
-
-/// Convert a `RetrieveEvent` into a Python dict with a `"type"` key.
-fn event_to_dict(event: RetrieveEvent, py: Python<'_>) -> PyResult<Bound<'_, PyDict>> {
-    let dict = PyDict::new(py);
-    match event {
-        RetrieveEvent::Started { query, strategy } => {
-            dict.set_item("type", "started")?;
-            dict.set_item("query", query)?;
-            dict.set_item("strategy", strategy)?;
-        }
-        RetrieveEvent::StageCompleted { stage, elapsed_ms } => {
-            dict.set_item("type", "stage_completed")?;
-            dict.set_item("stage", stage)?;
-            dict.set_item("elapsed_ms", elapsed_ms)?;
-        }
-        RetrieveEvent::NodeVisited {
-            node_id,
-            title,
-            score,
-        } => {
-            dict.set_item("type", "node_visited")?;
-            dict.set_item("node_id", node_id)?;
-            dict.set_item("title", title)?;
-            dict.set_item("score", score)?;
-        }
-        RetrieveEvent::ContentFound {
-            node_id,
-            title,
-            preview,
-            score,
-        } => {
-            dict.set_item("type", "content_found")?;
-            dict.set_item("node_id", node_id)?;
-            dict.set_item("title", title)?;
-            dict.set_item("preview", preview)?;
-            dict.set_item("score", score)?;
-        }
-        RetrieveEvent::Backtracking { from, to, reason } => {
-            dict.set_item("type", "backtracking")?;
-            dict.set_item("from", from)?;
-            dict.set_item("to", to)?;
-            dict.set_item("reason", reason)?;
-        }
-        RetrieveEvent::SufficiencyCheck { level, tokens } => {
-            let level_str = match level {
-                SufficiencyLevel::Sufficient => "sufficient",
-                SufficiencyLevel::PartialSufficient => "partial_sufficient",
-                SufficiencyLevel::Insufficient => "insufficient",
-            };
-            dict.set_item("type", "sufficiency_check")?;
-            dict.set_item("level", level_str)?;
-            dict.set_item("tokens", tokens)?;
-        }
-        RetrieveEvent::Completed { response } => {
-            dict.set_item("type", "completed")?;
-            dict.set_item("confidence", response.confidence)?;
-            dict.set_item("is_sufficient", response.is_sufficient)?;
-            dict.set_item("strategy_used", response.strategy_used)?;
-            dict.set_item("tokens_used", response.tokens_used)?;
-            dict.set_item("content", response.content)?;
-
-            let results: Vec<Bound<'_, PyDict>> = response
-                .results
-                .into_iter()
-                .map(|r| {
-                    let rd = PyDict::new(py);
-                    rd.set_item("node_id", &r.node_id)?;
-                    rd.set_item("title", &r.title)?;
-                    rd.set_item("content", &r.content)?;
-                    rd.set_item("score", r.score)?;
-                    rd.set_item("depth", r.depth)?;
-                    Ok(rd)
-                })
-                .collect::<PyResult<Vec<_>>>()?;
-            dict.set_item("results", results)?;
-        }
-        RetrieveEvent::Error { message } => {
-            dict.set_item("type", "error")?;
-            dict.set_item("message", message)?;
-        }
-    }
-    Ok(dict)
-}
-
-/// Python-facing async iterator over streaming retrieval events.
-///
-/// Usage::
-///
-///     stream = await engine.query_stream(ctx)
-///     async for event in stream:
-///         print(event["type"])
-#[pyclass(name = "StreamingQuery")]
-pub struct PyStreamingQuery {
-    rx: Arc<Mutex<Option<mpsc::Receiver<RetrieveEvent>>>>,
-}
-
-impl PyStreamingQuery {
-    pub fn new(rx: mpsc::Receiver<RetrieveEvent>) -> Self {
-        Self {
-            rx: Arc::new(Mutex::new(Some(rx))),
-        }
-    }
-}
-
-#[pymethods]
-impl PyStreamingQuery {
-    fn __aiter__(slf: PyRef<'_, Self>) -> PyRef<'_, Self> {
-        slf
-    }
-
-    fn __anext__<'py>(&self, py: Python<'py>) -> PyResult<Bound<'py, PyAny>> {
-        let rx: Arc<Mutex<Option<mpsc::Receiver<RetrieveEvent>>>> = Arc::clone(&self.rx);
-        future_into_py(py, async move {
-            let mut guard = rx.lock().await;
-            let receiver: &mut Option<mpsc::Receiver<RetrieveEvent>> = &mut *guard;
-            match receiver {
-                None => Err(PyStopAsyncIteration::new_err("stream exhausted")),
-                Some(rx) => match rx.recv().await {
-                    Some(event) => {
-                        let is_terminal = matches!(
-                            &event,
-                            RetrieveEvent::Completed { .. } | RetrieveEvent::Error { .. }
-                        );
-                        if is_terminal {
-                            *guard = None;
-                        }
-                        // We cannot convert to dict here (no Python token in async context).
-                        // Instead, store the event and convert on the Python side.
-                        // PyO3 0.28: future_into_py resolves on the Python thread,
-                        // so we use Python::with_gil equivalent via pyo3_async_runtimes.
-                        //
-                        // The cleanest approach: wrap in a PyO3-compatible type.
-                        // Since RetrieveEvent doesn't implement IntoPyObject, we convert
-                        // to a simple serializable form.
-                        Ok(SerializedEvent(event))
-                    }
-                    None => {
-                        *guard = None;
-                        Err(PyStopAsyncIteration::new_err("stream closed"))
-                    }
-                },
-            }
-        })
-    }
-
-    fn __repr__(&self) -> String {
-        "StreamingQuery(...)".to_string()
-    }
-}
-
-/// Wrapper to carry a RetrieveEvent across the async boundary
-/// and convert it to a dict on the Python thread.
-struct SerializedEvent(RetrieveEvent);
-
-impl<'py> IntoPyObject<'py> for SerializedEvent {
-    type Target = PyDict;
-    type Output = Bound<'py, Self::Target>;
-    type Error = PyErr;
-
-    fn into_pyobject(self, py: Python<'py>) -> Result<Self::Output, Self::Error> {
-        event_to_dict(self.0, py)
-    }
-}
diff --git a/python/vectorless/__init__.py b/python/vectorless/__init__.py
deleted file mode 100644
index d2ff88d0..00000000
--- a/python/vectorless/__init__.py
+++ /dev/null
@@ -1,71 +0,0 @@
-"""
-Vectorless — Reasoning-native document engine.
-
-Every retrieval is a reasoning act.
-
-Quick Start:
-    from vectorless import Session
-
-    session = Session(api_key="sk-...", model="gpt-4o")
-    result = await session.index(path="./report.pdf")
-    answer = await session.ask("What is the revenue?", doc_ids=[result.doc_id])
-    print(answer.single().content)
-"""
-
-# High-level API (recommended)
-from vectorless.session import Session
-from vectorless.sync_session import SyncSession
-from vectorless.config import EngineConfig, load_config, load_config_from_env, load_config_from_file
-from vectorless.events import EventEmitter
-from vectorless.streaming import StreamingQueryResult
-from vectorless.types import (
-    DocumentGraphWrapper,
-    EdgeEvidence,
-    Evidence,
-    FailedItem,
-    GraphEdge,
-    GraphNode,
-    IndexItemWrapper,
-    IndexMetrics,
-    IndexResultWrapper,
-    QueryMetrics,
-    QueryResponse,
-    QueryResult,
-    WeightedKeyword,
-)
-
-# Version and error types
-from vectorless._vectorless import VectorlessError, __version__
-
-__all__ = [
-    # Primary API
-    "Session",
-    "SyncSession",
-    # Configuration
-    "EngineConfig",
-    "load_config",
-    "load_config_from_env",
-    "load_config_from_file",
-    # Events
-    "EventEmitter",
-    # Streaming
-    "StreamingQueryResult",
-    # Result types
-    "QueryResponse",
-    "QueryResult",
-    "QueryMetrics",
-    "Evidence",
-    "IndexResultWrapper",
-    "IndexItemWrapper",
-    "IndexMetrics",
-    "FailedItem",
-    # Graph types
-    "DocumentGraphWrapper",
-    "GraphNode",
-    "GraphEdge",
-    "EdgeEvidence",
-    "WeightedKeyword",
-    # Error and version
-    "VectorlessError",
-    "__version__",
-]
diff --git a/python/vectorless/_core.py b/python/vectorless/_core.py
deleted file mode 100644
index c83d089a..00000000
--- a/python/vectorless/_core.py
+++ /dev/null
@@ -1,54 +0,0 @@
-"""Internal re-exports from the Rust PyO3 module.
-
-This module is NOT part of the public API. Use ``vectorless.Session`` instead.
-"""
-
-from vectorless._vectorless import (
-    Config,
-    DocumentGraph,
-    DocumentGraphNode,
-    DocumentInfo,
-    EdgeEvidence,
-    Engine,
-    EvidenceItem,
-    FailedItem,
-    GraphEdge,
-    IndexContext,
-    IndexItem,
-    IndexMetrics,
-    IndexOptions,
-    IndexResult,
-    QueryContext,
-    QueryMetrics,
-    QueryResult,
-    QueryResultItem,
-    StreamingQuery,
-    VectorlessError,
-    WeightedKeyword,
-    __version__,
-)
-
-__all__ = [
-    "Config",
-    "DocumentGraph",
-    "DocumentGraphNode",
-    "DocumentInfo",
-    "EdgeEvidence",
-    "Engine",
-    "EvidenceItem",
-    "FailedItem",
-    "GraphEdge",
-    "IndexContext",
-    "IndexItem",
-    "IndexMetrics",
-    "IndexOptions",
-    "IndexResult",
-    "QueryContext",
-    "QueryMetrics",
-    "QueryResult",
-    "QueryResultItem",
-    "StreamingQuery",
-    "VectorlessError",
-    "WeightedKeyword",
-    "__version__",
-]
diff --git a/rust/Cargo.toml b/rust/Cargo.toml
deleted file mode 100644
index 723e46f7..00000000
--- a/rust/Cargo.toml
+++ /dev/null
@@ -1,94 +0,0 @@
-[package]
-name = "vectorless"
-version.workspace = true
-edition.workspace = true
-authors.workspace = true
-description.workspace = true
-license.workspace = true
-repository.workspace = true
-homepage.workspace = true
-documentation = "https://docs.rs/vectorless"
-keywords = ["rag", "document", "retrieval", "indexing", "llm"]
-categories = ["text-processing", "data-structures", "algorithms"]
-readme = "../README.md"
-exclude = ["docs/", "examples/", ".*"]
-
-[dependencies]
-# Async runtime
-tokio = { workspace = true }
-async-trait = { workspace = true }
-futures = { workspace = true }
-
-# Serialization
-serde = { workspace = true }
-serde_json = { workspace = true }
-
-# Error handling
-thiserror = { workspace = true }
-anyhow = { workspace = true, optional = true }
-
-# OpenAI-compatible API client
-async-openai = { workspace = true }
-
-# UUID
-uuid = { workspace = true }
-
-# Time
-chrono = { workspace = true }
-
-# Logging
-tracing = { workspace = true }
-
-# Rate limiting
-governor = { workspace = true }
-nonzero_ext = { workspace = true }
-
-# Token counting
-tiktoken-rs = { workspace = true }
-
-# Text processing
-regex = { workspace = true }
-
-# Markdown parsing
-pulldown-cmark = { workspace = true }
-
-# Tree data structure
-indextree = { workspace = true }
-
-# LRU cache
-lru = { workspace = true }
-
-# Checksum
-sha2 = { workspace = true }
-
-# BLAKE2b hashing
-blake2 = { workspace = true }
-base64 = { workspace = true }
-
-# Synchronization primitives
-parking_lot = { workspace = true }
-
-# Compression
-flate2 = { workspace = true }
-
-# File locking (Unix)
-[target.'cfg(unix)'.dependencies]
-libc = { workspace = true }
-
-# PDF processing
-pdf-extract = { workspace = true }
-lopdf = { workspace = true }
-
-# Random number generation
-rand = { workspace = true }
-
-# BM25 scoring
-bm25 = { workspace = true }
-
-[dev-dependencies]
-tempfile = { workspace = true }
-tokio-test = { workspace = true }
-tracing-subscriber = { workspace = true }
-
-[lints]
-workspace = true
diff --git a/rust/examples/deep_retrieval.rs b/rust/examples/deep_retrieval.rs
deleted file mode 100644
index 44877543..00000000
--- a/rust/examples/deep_retrieval.rs
+++ /dev/null
@@ -1,221 +0,0 @@
-// Copyright (c) 2026 vectorless developers
-// SPDX-License-Identifier: Apache-2.0
-
-//! Complex retrieval example — forces SubAgent navigation, not fast path.
-//!
-//! This example indexes a document where the answer to a tricky question
-//! is NOT directly accessible via keyword lookup in the ReasoningIndex.
-//! The SubAgent must navigate through multiple levels, collect evidence
-//! from different sections, and synthesize a cross-referenced answer.
-//!
-//! # Usage
-//!
-//! ```bash
-//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \
-//! LLM_ENDPOINT=https://api.openai.com/v1 cargo run --example deep_retrieval
-//! ```
-
-use vectorless::{EngineBuilder, IndexContext, IndexOptions, QueryContext};
-
-/// A compact but deeply nested document about a fictional space mission.
-///
-/// Structure (4 levels deep):
-///
-/// Mission Atlas Report
-/// ├── Launch Operations
-/// │   ├── Vehicle Configuration
-/// │   │   ├── Stage 1 Parameters
-/// │   │   └── Stage 2 Parameters
-/// │   └── Countdown Timeline
-/// │       ├── T-48h to T-12h
-/// │       └── T-12h to T-0
-/// ├── Orbital Mechanics
-/// │   ├── Transfer Orbit Analysis
-/// │   │   ├── Delta-V Budget
-/// │   │   └── Gravity Assist Profile
-/// │   └── Station-Keeping Schedule
-/// ├── Payload Operations
-/// │   ├── Satellite Alpha Deployment
-/// │   │   ├── Separation Sequence
-/// │   │   └── Solar Panel Extension
-/// │   ├── Satellite Beta Deployment
-/// │   │   ├── Antenna Calibration
-/// │   │   └── Frequency Assignment
-/// │   └── Re-entry Capsule
-/// │       ├── Heat Shield Specs
-/// │       └── Landing Zone Selection
-/// └── Mission Anomalies
-///     ├── Day 3 Communication Blackout
-///     └── Day 17 Thruster Misfire
-const MISSION_REPORT: &str = r#"
-# Mission Atlas Report
-
-## Launch Operations
-
-### Vehicle Configuration
-
-#### Stage 1 Parameters
-
-The first stage utilizes a LOX/RP-1 bipropellant configuration with a sea-level thrust of 7,600 kN. Burn time is 162 seconds with a specific impulse of 282 seconds. The propellant mass fraction is 0.894. Stage separation occurs at T+162s at an altitude of approximately 68 km with a velocity of 2,340 m/s.
-
-#### Stage 2 Parameters
-
-The second stage employs a single RL-10C engine using LOX/LH2 with a vacuum thrust of 110 kN. Burn duration extends to 370 seconds with a specific impulse of 448 seconds. The stage carries 20,800 kg of propellant. Engine ignition occurs at T+165s following a 3-second coast phase after stage separation.
-
-### Countdown Timeline
-
-#### T-48h to T-12h
-
-During the early countdown phase, the launch team completed propellant loading verification and navigation system alignment. A minor issue was detected in the Stage 2 fuel temperature sensor at T-36h, which was resolved by recalibrating the sensor threshold from 20.1K to 19.8K. Weather briefing at T-24h indicated 85% probability of favorable conditions with upper-level winds at 45 knots.
-
-#### T-12h to T-0
-
-Final countdown proceeded nominally. Auxiliary power unit start occurred at T-4h. Range safety checks completed at T-2h. Go/No-Go poll at T-30 minutes was unanimous across all stations. Terminal count at T-9 minutes was initiated with no holds. Liftoff occurred at 14:37:22 UTC on March 15, achieving the targeted azimuth of 72.3 degrees.
-
-## Orbital Mechanics
-
-### Transfer Orbit Analysis
-
-#### Delta-V Budget
-
-The total mission delta-V budget is 4,832 m/s, allocated as follows: ascent to parking orbit 1,890 m/s, trans-target injection 2,210 m/s, orbit insertion 510 m/s, and station-keeping reserve 222 m/s. The parking orbit was achieved at 185 km circular with an inclination of 28.5 degrees. The gravity assist maneuver at Titan contributed an effective delta-V savings of 380 m/s, which allowed the mission to carry 15% more payload than the original baseline design.
-
-#### Gravity Assist Profile
-
-The Titan flyby occurred on Day 47 at a closest approach distance of 950 km. The bending angle was 38.7 degrees with an asymptotic velocity of 4.2 km/s relative to Titan. This maneuver shifted the spacecraft trajectory from a Hohmann-type direct transfer to a gravity-assisted trajectory, reducing total flight time from 187 days to 143 days. Post-flyby trajectory correction burn of 3.4 m/s was executed on Day 49 to refine the approach corridor.
-
-### Station-Keeping Schedule
-
-Station-keeping maneuvers are planned at 14-day intervals with a delta-V allocation of 2.8 m/s per maneuver. The first three maneuvers consumed 2.6, 3.1, and 2.5 m/s respectively, staying within the allocated budget. Orbital decay rate without correction is approximately 0.3 km per 14-day cycle due to atmospheric drag at the operational altitude of 420 km.
-
-## Payload Operations
-
-### Satellite Alpha Deployment
-
-#### Separation Sequence
-
-Satellite Alpha separated from the payload adapter at T+3h42m using a Marman band release mechanism. Separation velocity was 0.45 m/s with a tip-off rate of 0.02 deg/s. Initial telemetry confirmed solar panel deployment signal at T+3h58m. First ground station contact occurred over Svalbard at T+4h12m confirming nominal spacecraft health.
-
-#### Solar Panel Extension
-
-Both solar arrays deployed fully within 8 minutes of the deployment command. Array 1 generated 4,280 W and Array 2 generated 4,310 W, for a combined initial output of 8,590 W against a design target of 8,400 W. The arrays use triple-junction GaAs cells with a beginning-of-life efficiency of 30.7%. Power margin at end-of-life (7 years) is projected at 6,950 W, still above the minimum operational requirement of 6,200 W.
-
-### Satellite Beta Deployment
-
-#### Antenna Calibration
-
-Satellite Beta's high-gain antenna completed calibration in three phases. Phase 1 (boresight alignment) achieved a pointing accuracy of 0.023 degrees against a requirement of 0.05 degrees. Phase 2 (pattern verification) confirmed the sidelobe levels were within specification at -28 dB below main beam. Phase 3 (EIRP verification) measured 52.4 dBW against a required minimum of 51.0 dBW.
-
-#### Frequency Assignment
-
-Satellite Beta operates in Ka-band with a downlink center frequency of 20.185 GHz and an uplink at 30.050 GHz. The allocated bandwidth is 500 MHz per polarization, supporting 24 transponders with 36 MHz spacing. Cross-polarization isolation exceeds 30 dB. The link budget supports a minimum data rate of 1.2 Gbps under rain fade conditions corresponding to 99.7% availability in the primary coverage zone.
-
-### Re-entry Capsule
-
-#### Heat Shield Specs
-
-The re-entry capsule thermal protection system uses a phenolic-impregnated carbon ablator (PICA-X) with a thickness of 33 mm on the forebody. Maximum predicted heat flux is 185 W/cm² at the stagnation point during re-entry at 11.2 km/s. The heat shield mass is 86 kg, representing 12% of the total capsule dry mass of 717 kg. The backshell uses a lighter SLA-561V material with a 15 mm thickness rated for 45 W/cm².
-
-#### Landing Zone Selection
-
-The primary landing zone is located at 34.2°N 108.7°W in the White Sands Proving Ground, with an elliptical footprint of 15 km × 8 km at the 3-sigma confidence level. Wind drift analysis based on 10 years of upper-atmosphere data predicts a mean offset of 3.2 km northeast. The backup landing zone is at 32.5°N 106.5°W near Fort Bliss, activated only if the primary zone weather violates the surface wind constraint of 12 m/s.
-
-## Mission Anomalies
-
-### Day 3 Communication Blackout
-
-At approximately 07:14 UTC on Day 3, the primary S-band transponder experienced an unexpected carrier loss lasting 4 hours and 22 minutes. Root cause analysis identified a single-event upset (SEU) in the command decoder ASIC, caused by a high-energy proton from the inner Van Allen belt. The transponder recovered autonomously after a watchdog timer reset. No command sequences were lost as the onboard computer continued executing the stored timeline. Redundant transponder was not activated because the primary recovery occurred before the 6-hour switchover threshold.
-
-### Day 17 Thruster Misfire
-
-At 14:52 UTC on Day 17, thruster cluster B3 (one of eight attitude control clusters) fired for 2.3 seconds during a period when no thruster activity was commanded. This produced an unplanned delta-V of 0.08 m/s and an attitude perturbation of 0.3 degrees. Telemetry analysis revealed a stuck valve in the B3 propellant control valve assembly, likely caused by particulate contamination during ground processing. The flight software detected the anomaly within 500 ms and inhibited the B3 cluster. Subsequent attitude corrections were performed using the remaining seven clusters. The propellant impact of the lost cluster reduces the available delta-V for the mission by approximately 4 m/s, leaving a remaining reserve of 218 m/s against a requirement of 150 m/s.
-"#;
-
-/// Questions designed to force deep navigation:
-///
-/// 1. "How much delta-V budget remains after the Day 17 thruster failure,
-///    and is it enough to complete the mission?"
-///    → Requires finding delta-V budget (Orbital Mechanics > Transfer > Delta-V Budget)
-///    AND the anomaly impact (Mission Anomalies > Day 17 Thruster Misfire)
-///    AND cross-referencing reserve vs requirement.
-///
-/// 2. "What is the total power generation margin at end-of-life for Satellite Alpha
-///    compared to its minimum operational requirement?"
-///    → Requires finding EOL power (Payload > Alpha > Solar Panel Extension)
-///    and computing the difference.
-///
-/// 3. "If the B3 thruster cluster had failed during the Day 3 blackout instead of
-///    Day 17, would the spacecraft have been able to recover attitude without
-///    ground intervention?"
-///    → Requires combining anomaly timelines and thruster redundancy info.
-const QUERIES: &[&str] = &["where can i find the backup landing zone"];
-
-#[tokio::main]
-async fn main() -> vectorless::Result<()> {
-    tracing_subscriber::fmt::init();
-
-    println!("=== Deep Retrieval Example ===\n");
-
-    let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string());
-    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string());
-    let endpoint = std::env::var("LLM_ENDPOINT").unwrap_or_else(|_| "https://api".to_string());
-
-    // Build engine
-    let engine = EngineBuilder::new()
-        .with_key(&api_key)
-        .with_model(&model)
-        .with_endpoint(&endpoint)
-        .build()
-        .await
-        .map_err(|e| vectorless::Error::Config(e.to_string()))?;
-
-    // Index document
-    let temp_dir = tempfile::tempdir()?;
-    let md_path = temp_dir.path().join("mission_atlas.md");
-    tokio::fs::write(&md_path, MISSION_REPORT).await?;
-
-    let index_result = engine
-        .index(IndexContext::from_path(&md_path).with_options(IndexOptions::new().with_summaries()))
-        .await?;
-    let doc_id = index_result.doc_id().unwrap().to_string();
-    println!("Indexed document: {}\n", doc_id);
-
-    // Query
-    for query in QUERIES {
-        println!("Q: \"{}\"", query);
-
-        match engine
-            .query(
-                QueryContext::new(*query)
-                    .with_doc_ids(vec![doc_id.clone()])
-                    .with_force_analysis(true),
-            )
-            .await
-        {
-            Ok(result) => {
-                if let Some(item) = result.single() {
-                    if item.content.is_empty() {
-                        println!("   No relevant content found");
-                    } else {
-                        println!("   A:");
-                        for line in item.content.lines().take(10) {
-                            println!("   {}", line);
-                        }
-                        if item.content.lines().count() > 10 {
-                            println!(
-                                "   ... ({} more lines)",
-                                item.content.lines().count() - 10
-                            );
-                        }
-                    }
-                }
-            }
-            Err(e) => println!("   Error: {}", e),
-        }
-        println!();
-    }
-
-    // Cleanup
-    engine.remove(&doc_id).await?;
-    Ok(())
-}
diff --git a/rust/examples/events.rs b/rust/examples/events.rs
deleted file mode 100644
index 3db97706..00000000
--- a/rust/examples/events.rs
+++ /dev/null
@@ -1,155 +0,0 @@
-// Copyright (c) 2026 vectorless developers
-// SPDX-License-Identifier: Apache-2.0
-
-//! Event callbacks example.
-//!
-//! This example demonstrates the event system for:
-//! - Monitoring indexing progress
-//! - Tracking query execution
-//! - Debugging retrieval behavior
-//!
-//! # Usage
-//!
-//! ```bash
-//! # Using environment variables for LLM config:
-//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \
-//! LLM_ENDPOINT=https://api.openai.com/v1 cargo run --example events
-//!
-//! # Or with defaults (edit the code to set your key/endpoint):
-//! cargo run --example events
-//! ```
-
-use std::sync::Arc;
-use std::sync::atomic::{AtomicUsize, Ordering};
-
-use vectorless::{EngineBuilder, IndexContext, QueryContext};
-use vectorless::{EventEmitter, IndexEvent, QueryEvent};
-
-#[tokio::main]
-async fn main() -> Result<(), Box<dyn std::error::Error>> {
-    // Initialize tracing for debug output (set RUST_LOG=debug to see more)
-    tracing_subscriber::fmt::init();
-
-    println!("=== Event Callbacks Example ===\n");
-
-    // 1. Create event emitter with handlers
-    println!("Step 1: Setting up event handlers...\n");
-
-    let index_count = Arc::new(AtomicUsize::new(0));
-    let query_count = Arc::new(AtomicUsize::new(0));
-    let nodes_visited = Arc::new(AtomicUsize::new(0));
-
-    let index_count_clone = index_count.clone();
-    let query_count_clone = query_count.clone();
-    let nodes_visited_clone = nodes_visited.clone();
-
-    let events = EventEmitter::new()
-        // Index events
-        .on_index(move |e| match e {
-            IndexEvent::Started { path } => {
-                println!("  [INDEX] Started: {}", path);
-            }
-            IndexEvent::FormatDetected { format } => {
-                println!("  [INDEX] Format: {:?}", format);
-            }
-            IndexEvent::TreeBuilt { node_count } => {
-                println!("  [INDEX] Tree built: {} nodes", node_count);
-            }
-            IndexEvent::Complete { doc_id } => {
-                println!("  [INDEX] Complete: {}", &doc_id[..8]);
-                index_count_clone.fetch_add(1, Ordering::SeqCst);
-            }
-            IndexEvent::Error { message } => {
-                println!("  [INDEX] Error: {}", message);
-            }
-            _ => {}
-        })
-        // Query events
-        .on_query(move |e| match e {
-            QueryEvent::Started { query } => {
-                println!("  [QUERY] Started: \"{}\"", query);
-                query_count_clone.fetch_add(1, Ordering::SeqCst);
-            }
-            QueryEvent::NodeVisited { title, score, .. } => {
-                println!("  [QUERY] Visited: \"{}\" (score: {:.2})", title, score);
-                nodes_visited_clone.fetch_add(1, Ordering::SeqCst);
-            }
-            QueryEvent::CandidateFound { node_id, score } => {
-                println!(
-                    "  [QUERY] Candidate: {} (score: {:.2})",
-                    &node_id[..8],
-                    score
-                );
-            }
-            QueryEvent::Complete {
-                total_results,
-                confidence,
-            } => {
-                println!(
-                    "  [QUERY] Complete: {} results, confidence: {:.2}",
-                    total_results, confidence
-                );
-            }
-            QueryEvent::Error { message } => {
-                println!("  [QUERY] Error: {}", message);
-            }
-            _ => {}
-        });
-
-    println!("  ✓ Event handlers configured\n");
-
-    // Build engine with LLM configuration from environment or defaults.
-    // Adjust the defaults below to match your setup.
-    let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string());
-    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string());
-    let endpoint =
-        std::env::var("LLM_ENDPOINT").unwrap_or_else(|_| "https://api.openai.com/v1".to_string());
-
-    // 2. Create engine with events
-    println!("Step 2: Creating engine with event emitter...");
-    let engine = EngineBuilder::new()
-        .with_key(&api_key)
-        .with_model(&model)
-        .with_endpoint(&endpoint)
-        .with_events(events)
-        .build()
-        .await?;
-    println!("  ✓ Engine created\n");
-
-    // 3. Index a document with events
-    println!("Step 3: Indexing document (with events)...");
-    let result = engine
-        .index(IndexContext::from_path("../README.md"))
-        .await?;
-    let doc_id = result.doc_id().unwrap().to_string();
-    println!("  ✓ Indexed: {doc_id}\n");
-
-    // 4. Query with events
-    println!("Step 4: Querying (with events)...");
-    let result = engine
-        .query(QueryContext::new("What is vectorless?").with_doc_ids(vec![doc_id.clone()]))
-        .await?;
-    if let Some(item) = result.single() {
-        println!("  ✓ Found result ({} chars)", item.content.len());
-        if !item.content.is_empty() {
-            let preview: String = item.content.chars().take(200).collect();
-            println!("  Preview: {}...", preview);
-        }
-    }
-
-    // 5. Stats
-    println!("\n--- Stats ---");
-    println!(
-        "  Documents indexed: {}",
-        index_count.load(Ordering::SeqCst)
-    );
-    println!("  Queries executed: {}", query_count.load(Ordering::SeqCst));
-    println!("  Nodes visited: {}", nodes_visited.load(Ordering::SeqCst));
-
-    // Cleanup
-    engine.remove(&doc_id).await?;
-    println!("\n  Cleaned up");
-
-    println!("\n=== Done ===");
-    Ok(())
-}
diff --git a/rust/examples/flow.rs b/rust/examples/flow.rs
deleted file mode 100644
index ce13d80b..00000000
--- a/rust/examples/flow.rs
+++ /dev/null
@@ -1,143 +0,0 @@
-// Copyright (c) 2026 vectorless developers
-// SPDX-License-Identifier: Apache-2.0
-
-//! Complete Markdown processing flow example.
-//!
-//! This example demonstrates the full pipeline:
-//! 1. Create a Vectorless client
-//! 2. Index a Markdown document
-//! 3. Show document structure in JSON format
-//! 4. Query the document
-//!
-//! # Usage
-//!
-//! ```bash
-//! # Using environment variables for LLM config:
-//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \
-//! LLM_ENDPOINT=https://api.openai.com/v1 cargo run --example flow
-//!
-//! # Or with defaults (edit the code to set your key/endpoint):
-//! cargo run --example flow
-//! ```
-
-use vectorless::{EngineBuilder, IndexContext, IndexOptions, QueryContext};
-
-/// Sample markdown content for demonstration.
-const SAMPLE_MARKDOWN: &str = r#"
-# Vectorless Architecture Guide
-
-## Overview
-
-Vectorless is a reasoning-native document intelligence engine that transforms documents into hierarchical semantic trees. Unlike traditional RAG systems that rely on vector embeddings and similarity search, Vectorless uses LLM-powered tree navigation to retrieve relevant content through deep contextual understanding.
-
-The core idea is simple: structured documents already have inherent semantic relationships encoded in their headings, sections, and paragraphs. By preserving this structure as a navigable tree, an LLM can efficiently locate relevant information by following the document's own logical organization.
-
-## Architecture
-
-The system consists of three main components: an indexing pipeline, a storage layer, and a retrieval engine. The indexing pipeline parses documents into tree structures and generates summaries. The storage layer persists indexed documents to disk. The retrieval engine navigates the tree at query time using search algorithms guided by LLM decisions.
-
-### Indexing Pipeline
-
-The indexing pipeline processes documents through multiple stages: parsing, tree building, enhancement (LLM summary generation), and reasoning index construction. Each stage is independently configurable and can be enabled or disabled based on requirements. The pipeline supports incremental re-indexing with content fingerprinting to avoid redundant work when documents haven't changed.
-
-### Retrieval Engine
-
-The retrieval engine uses an agent-based architecture where an Orchestrator coordinates Workers that navigate the document tree using LLM-guided decisions (ls, cd, cat, find, grep). The Orchestrator evaluates progress after each step and can replan when results are insufficient. The engine is budget-aware, tracking token usage and making cost-conscious decisions about when to invoke the LLM versus using cheaper heuristic scoring.
-
-## Performance
-
-Under typical workloads, indexing a 50-page document takes approximately 10-30 seconds depending on LLM response latency and the complexity of the document structure. Query latency ranges from 200ms for simple keyword-matched queries to 3-5 seconds for complex multi-hop reasoning queries that require multiple LLM calls during tree navigation.
-
-The system is designed for accuracy over speed. By leveraging document structure and LLM reasoning, it achieves higher retrieval quality than vector-based approaches on structured documents like technical reports, legal contracts, and research papers.
-"#;
-
-#[tokio::main]
-async fn main() -> vectorless::Result<()> {
-    // Initialize tracing for debug output (set RUST_LOG=debug to see more)
-    tracing_subscriber::fmt::init();
-
-    println!("=== Vectorless Flow Example ===\n");
-
-    // Build engine with LLM configuration from environment or defaults.
-    // Adjust the defaults below to match your setup.
-    let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string());
-    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string());
-    let endpoint = std::env::var("LLM_ENDPOINT").unwrap_or_else(|_| "https://api".to_string());
-
-    // Step 1: Create a Vectorless client
-    println!("Step 1: Creating Vectorless client...");
-
-    let engine = EngineBuilder::new()
-        .with_key(&api_key)
-        .with_model(&model)
-        .with_endpoint(&endpoint)
-        .build()
-        .await
-        .map_err(|e| vectorless::Error::Config(e.to_string()))?;
-
-    println!("  - Client created successfully");
-    println!();
-
-    // Step 2: Index the sample Markdown document
-    println!("Step 2: Indexing Markdown document...");
-
-    let temp_dir = tempfile::tempdir()?;
-    let md_path = temp_dir.path().join("sample.md");
-    tokio::fs::write(&md_path, SAMPLE_MARKDOWN).await?;
-
-    let index_result = engine
-        .index(IndexContext::from_path(&md_path).with_options(IndexOptions::new().with_summaries()))
-        .await?;
-    let doc_id = index_result.doc_id().unwrap().to_string();
-
-    println!("  - Document indexed successfully");
-    println!("  - Document ID: {}", doc_id);
-    println!();
-
-    // Step 3: List indexed documents
-    println!("Step 3: Indexed documents:");
-    for doc in engine.list().await? {
-        println!("  - {} ({})", doc.name, doc.id);
-    }
-    println!();
-
-    // Step 4: Query the document
-    println!("Step 4: Querying the document...");
-
-    let queries = vec!["What is the seconds for complex multi-hop?"];
-
-    for query in queries {
-        println!("  Query: \"{}\"", query);
-
-        match engine
-            .query(QueryContext::new(query).with_doc_ids(vec![doc_id.clone()]))
-            .await
-        {
-            Ok(result) => {
-                if let Some(item) = result.single() {
-                    if item.content.is_empty() {
-                        println!("  - No relevant content found");
-                    } else {
-                        println!("  - Found relevant content:");
-                        for line in item.content.lines() {
-                            println!("    {}", line);
-                        }
-                    }
-                } else {
-                    println!("  - No results");
-                }
-            }
-            Err(e) => {
-                println!("  - Error: {}", e);
-            }
-        }
-        println!();
-    }
-
-    // Cleanup
-    // for doc in engine.list().await? {
-    //     engine.remove(&doc.id).await?;
-    // }
-
    Ok(())
-}
diff --git a/rust/examples/graph.rs b/rust/examples/graph.rs
deleted file mode 100644
index 5fccd084..00000000
--- a/rust/examples/graph.rs
+++ /dev/null
@@ -1,106 +0,0 @@
-// Copyright (c) 2026 vectorless developers
-// SPDX-License-Identifier: Apache-2.0
-
-//! Document graph example for Vectorless.
-//!
-//! Demonstrates how to retrieve the cross-document relationship graph
-//! after indexing. The graph is automatically rebuilt after each index call,
-//! connecting documents that share keywords via Jaccard similarity.
-//!
-//! # Usage
-//!
-//! ```bash
-//! # Using environment variables for LLM config:
-//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \
-//! cargo run --example graph
-//!
-//! # Or with defaults (edit the code to set your key/endpoint):
-//! cargo run --example graph
-//! ```
-
-use vectorless::{EngineBuilder, IndexContext};
-
-#[tokio::main]
-async fn main() -> vectorless::Result<()> {
-    // Initialize tracing for debug output (set RUST_LOG=debug to see more)
-    tracing_subscriber::fmt::init();
-
-    println!("=== Document Graph Example ===\n");
-
-    // Build engine with LLM configuration from environment or defaults.
-    // Adjust the defaults below to match your setup.
-    let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string());
-    let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string());
-
-    // 1. Create engine
-    let engine = EngineBuilder::new()
-        .with_key(&api_key)
-        .with_model(&model)
-        .build()
-        .await
-        .map_err(|e: vectorless::BuildError| vectorless::Error::Config(e.to_string()))?;
-
-    // 2. Index documents — graph is rebuilt automatically
-    let result = engine
-        .index(IndexContext::from_paths(&["../README.md", "../CLAUDE.md"]))
-        .await?;
-
-    println!("Indexed {} document(s)", result.items.len());
-    for item in &result.items {
-        println!("  - {} ({})", item.name, item.doc_id);
-    }
-    println!();
-
-    // 3. Get the document graph
-    match engine.get_graph().await?
{ - Some(graph) => { - println!( - "Document graph: {} nodes, {} edges", - graph.node_count(), - graph.edge_count() - ); - - // Show document nodes - for doc_id in graph.doc_ids() { - if let Some(node) = graph.get_node(doc_id) { - println!( - " Node: {} — {} keyword(s), top: {:?}", - node.title, - node.top_keywords.len(), - node.top_keywords - .iter() - .take(3) - .map(|kw| &kw.keyword) - .collect::<Vec<_>>() - ); - - // Show edges (connected documents) - let neighbors = graph.get_neighbors(doc_id); - if !neighbors.is_empty() { - for edge in neighbors { - println!( - " → {} (weight={:.2}, jaccard={:.2}, shared={})", - edge.target_doc_id, - edge.weight, - edge.evidence.keyword_jaccard, - edge.evidence.shared_keyword_count, - ); - } - } else { - println!(" (no connections)"); - } - } - } - } - None => println!("No graph available (no documents with reasoning index)"), - } - - // 4. Cleanup - let docs = engine.list().await?; - for doc in &docs { - engine.remove(&doc.id).await?; - } - - println!("\n=== Done ==="); - Ok(()) -} diff --git a/rust/examples/index_directory.rs b/rust/examples/index_directory.rs deleted file mode 100644 index 2696df99..00000000 --- a/rust/examples/index_directory.rs +++ /dev/null @@ -1,114 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Directory indexing example — recursively index all documents in a directory. -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! LLM_ENDPOINT=http://localhost:4000/api/v1 \ -//! cargo run --example index_directory -- /path/to/docs -//! -//! # With recursive flag (default): -//! cargo run --example index_directory -- /path/to/docs --recursive -//! -//! # Non-recursive (top-level only): -//! cargo run --example index_directory -- /path/to/docs --no-recursive -//!
``` - -use vectorless::{EngineBuilder, IndexContext}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - tracing_subscriber::fmt::init(); - - // Parse CLI arguments - let args: Vec<String> = std::env::args().collect(); - let dir = args.get(1).map(|s| s.as_str()).unwrap_or("./samples"); - let recursive = !args.iter().any(|a| a == "--no-recursive"); - - // Build engine - let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-or-v1-...".to_string()); - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - // Index directory - println!( - "{}indexing: {}", - if recursive { "Recursively " } else { "" }, - dir - ); - let ctx = IndexContext::from_dir(dir, recursive); - - if ctx.is_empty() { - println!("No supported files found in: {}", dir); - return Ok(()); - } - - println!("Found {} file(s) to index", ctx.len()); - - let result = engine.index(ctx).await?; - - println!("\nIndexed {} document(s):", result.items.len()); - for item in &result.items { - println!(" {} ({})", item.name, item.doc_id); - if let Some(metrics) = &item.metrics { - println!( - " nodes: {}, time: {}ms", - metrics.nodes_processed, - metrics.total_time_ms() - ); - } - } - - if result.has_failures() { - println!("\nFailed:"); - for f in &result.failed { - println!(" {} — {}", f.source, f.error); - } - } - - // Query across all indexed documents - let query = "What is this about?"; - println!("\nQuerying: \"{query}\""); - - let answer = engine.query(vectorless::QueryContext::new(query)).await?; - - for item in &answer.items { - println!(" [{} confidence={:.2}]", item.doc_id, item.confidence); - let preview: String = 
item.content.chars().take(200).collect(); - println!(" {preview}"); - if item.content.len() > 200 { - println!(" ..."); - } - } - - // Metrics report - let report = engine.metrics_report(); - println!("\nMetrics:"); - println!( - " LLM: {} calls, {} tokens, ${:.4}", - report.llm.total_calls, report.llm.total_tokens, report.llm.estimated_cost_usd, - ); - println!( - " Retrieval: {} queries, avg score {:.2}", - report.retrieval.total_queries, report.retrieval.avg_path_score, - ); - - // Cleanup - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - } - - Ok(()) -} diff --git a/rust/examples/index_incremental.rs b/rust/examples/index_incremental.rs deleted file mode 100644 index 6500a992..00000000 --- a/rust/examples/index_incremental.rs +++ /dev/null @@ -1,122 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Incremental indexing example — re-index with change detection. -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! LLM_ENDPOINT=http://localhost:4000/api/v1 cargo run --example index_incremental -//! -//! # Or with defaults (edit the code to set your key/endpoint): -//! cargo run --example index_incremental -//! ``` - -use vectorless::{DocumentFormat, EngineBuilder, IndexContext, IndexMode}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - // Initialize tracing for debug output (set RUST_LOG=debug to see more) - tracing_subscriber::fmt::init(); - - // Build engine with LLM configuration from environment or defaults. - // Adjust the defaults below to match your setup. 
- let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-or-v1-...".to_string()); - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - let content_v1 = r#"# API Reference - -## GET /users - -Returns a list of all users in the system. - -## POST /users - -Creates a new user account. -"#; - - let content_v2 = r#"# API Reference - -## GET /users - -Returns a paginated list of users. Supports `?page=` and `?limit=` parameters. - -## POST /users - -Creates a new user account. Requires email and password fields. - -## DELETE /users/:id - -Deletes a user by their unique identifier. -"#; - - // 1. Initial full index - println!("--- Initial index ---"); - let result = engine - .index(IndexContext::from_content( - content_v1, - DocumentFormat::Markdown, - )) - .await?; - - let doc_id = result.items[0].doc_id.clone(); - if let Some(m) = &result.items[0].metrics { - println!( - "indexed in {}ms, {} nodes", - m.total_time_ms(), - m.nodes_processed - ); - } - - // 2. Re-index unchanged content (incremental) — skips processing - println!("\n--- Re-index unchanged (incremental) ---"); - let result = engine - .index( - IndexContext::from_content(content_v1, DocumentFormat::Markdown) - .with_mode(IndexMode::Incremental), - ) - .await?; - - for item in &result.items { - println!("doc_id: {} (unchanged, skipped)", item.doc_id); - } - - // 3. 
Re-index with changes (incremental) — detects diff and updates - println!("\n--- Re-index with changes (incremental) ---"); - let result = engine - .index( - IndexContext::from_content(content_v2, DocumentFormat::Markdown) - .with_mode(IndexMode::Incremental), - ) - .await?; - - for item in &result.items { - if let Some(m) = &item.metrics { - println!( - "updated in {}ms, {} nodes", - m.total_time_ms(), - m.nodes_processed - ); - } - } - - println!("\ndoc_id: {doc_id}"); - - // Cleanup - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - } - - Ok(()) -} diff --git a/rust/examples/index_pdf.rs b/rust/examples/index_pdf.rs deleted file mode 100644 index 0f9ae607..00000000 --- a/rust/examples/index_pdf.rs +++ /dev/null @@ -1,118 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! PDF indexing example — index a PDF document via the vectorless engine. -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! cargo run --example index_pdf -- ../samples/Docker_Cheat_Sheet.pdf -//! -//! # Or with defaults (edit the code to set your key/endpoint): -//! cargo run --example index_pdf -- ../samples/Docker_Cheat_Sheet.pdf -//! ``` - -use std::path::Path; - -use vectorless::{EngineBuilder, IndexContext}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - // Initialize tracing so we can see pipeline logs. - // Set RUST_LOG=info or RUST_LOG=debug for more detail. 
- tracing_subscriber::fmt::init(); - - let args: Vec<String> = std::env::args().collect(); - - let pdf_path = args.get(1).map(|s| s.as_str()).unwrap_or_else(|| { - eprintln!("Usage: cargo run --example index_pdf -- <pdf-path>"); - std::process::exit(1); - }); - - if !Path::new(pdf_path).exists() { - eprintln!("Error: file not found: {}", pdf_path); - std::process::exit(1); - } - - println!("=== Indexing PDF: {} ===\n", pdf_path); - - // LLM configuration is required — set these environment variables: - // LLM_API_KEY — your API key (required) - // LLM_MODEL — model name (default: google/gemini-3-flash-preview) - // LLM_ENDPOINT — API endpoint (default: http://localhost:4000/api/v1) - let api_key = match std::env::var("LLM_API_KEY") { - Ok(key) => key, - Err(_) => { - eprintln!("Error: LLM_API_KEY environment variable is required."); - eprintln!("Set it before running:"); - eprintln!(" LLM_API_KEY=sk-xxx cargo run --example index_pdf -- <pdf-path>"); - std::process::exit(1); - } - }; - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - tracing::info!( - "LLM config — key: {}..., model: {}, endpoint: {}", - &api_key[..api_key.len().min(8)], - model, - endpoint - ); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - let result = engine.index(IndexContext::from_path(pdf_path)).await?; - - println!( - "Indexed: {}, Failed: {}", - result.items.len(), - result.failed.len() - ); - - for item in &result.items { - println!("\n--- {} ---", item.name); - println!("doc_id: {}", item.doc_id); - println!("format: {:?}", item.format); - - if let Some(metrics) = &item.metrics { - println!("\nMetrics:"); - println!(" total time: {}ms", metrics.total_time_ms()); - println!(" parse: {}ms", 
metrics.parse_time_ms); - println!(" build: {}ms", metrics.build_time_ms); - println!(" enhance: {}ms", metrics.enhance_time_ms); - println!(" nodes: {}", metrics.nodes_processed); - println!(" summaries: {}", metrics.summaries_generated); - println!(" failed: {}", metrics.summaries_failed); - println!(" llm calls: {}", metrics.llm_calls); - println!(" tokens: {}", metrics.total_tokens_generated); - println!(" topics: {}", metrics.topics_indexed); - println!(" keywords: {}", metrics.keywords_indexed); - - if metrics.llm_calls == 0 { - println!("\n *** WARNING: No LLM calls were made. ***"); - println!(" Set RUST_LOG=info to see pipeline logs:"); - println!(" RUST_LOG=info cargo run --example index_pdf -- <pdf-path>"); - println!(" Check LLM_API_KEY, LLM_MODEL, and LLM_ENDPOINT are valid."); - } - } - } - - for fail in &result.failed { - eprintln!("FAILED: {} — {}", fail.source, fail.error); - } - - // Cleanup workspace (uncomment to clean up after run) - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - } - - Ok(()) -} diff --git a/rust/examples/index_single.rs b/rust/examples/index_single.rs deleted file mode 100644 index edaa2460..00000000 --- a/rust/examples/index_single.rs +++ /dev/null @@ -1,103 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Single document indexing example — index one document from content. -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! LLM_ENDPOINT=http://localhost:4000/api/v1 cargo run --example index_single -//! -//! # Or with defaults (edit the code to set your key/endpoint): -//! cargo run --example index_single -//!
``` - -use vectorless::{DocumentFormat, EngineBuilder, IndexContext}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - // Initialize tracing for debug output (set RUST_LOG=debug to see more) - tracing_subscriber::fmt::init(); - - // Build engine with LLM configuration from environment or defaults. - // Adjust the defaults below to match your setup. - let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-or-v1-...".to_string()); - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - let content = r#"# Distributed Data Processing Platform - -## Introduction - -This document provides a comprehensive overview of the distributed data processing platform architecture. The system is designed to handle petabyte-scale data workloads with sub-second query latency, supporting both real-time streaming and batch processing paradigms. The architecture follows a microservices-based approach with independent scaling capabilities for each component, enabling cost-effective resource utilization across varying workload patterns. - - -## System Architecture - -The platform follows a layered architecture pattern with clear separation of concerns between ingestion, processing, storage, and serving layers. Each layer can be independently deployed, scaled, and upgraded without affecting other layers, following the principle of bounded contexts from domain-driven design. Inter-layer communication uses a combination of asynchronous message passing for data flow and synchronous gRPC calls for control plane operations. 
- -### Ingestion Layer - -The ingestion layer serves as the entry point for all data entering the platform. It supports multiple protocols including HTTP REST, gRPC, Apache Kafka, and AWS Kinesis. The layer is responsible for data validation, schema enforcement, initial transformation, and routing to downstream processing pipelines. Built on a reactive architecture using backpressure-aware operators, the ingestion layer gracefully handles burst traffic patterns without overwhelming downstream services. - - -### Processing Engine - -The processing engine is the core computational component of the platform, responsible for transforming, enriching, aggregating, and analyzing ingested data. It supports both stream processing for real-time analytics and batch processing for historical analysis. The engine is built on a custom execution framework that optimizes query plans based on data statistics and available compute resources. - -### Storage Layer - -The storage layer provides a unified abstraction over multiple storage backends, each optimized for different access patterns. The hot tier uses an in-memory columnar cache for frequently accessed dimensions and recent fact data, providing microsecond-level access latency. The warm tier uses a distributed key-value store backed by NVMe SSDs for data accessed within the past 30 days. The cold tier uses object storage with Parquet file format for historical data, achieving cost efficiency at the expense of higher access latency. - -Data is automatically tiered based on configurable policies that consider access frequency, data age, and query patterns. The tiering engine runs as a background service that continuously monitors access patterns and migrates data between tiers. Metadata about data placement is maintained in a distributed metadata service built on etcd, which provides consistent reads and writes with linearizable semantics. 
- -### Query Serving Layer - -The query serving layer provides the external-facing API for executing analytical queries against the processed data. It supports SQL queries via a PostgreSQL-compatible wire protocol, making it accessible to a wide range of BI tools and existing applications without requiring driver changes. The query router analyzes incoming queries and determines the optimal execution strategy, considering which storage tiers contain the relevant data and whether partial results can be served from cached aggregations. - -Query results are optionally materialized in a result cache that uses a time-to-live (TTL) policy combined with lazy invalidation based on upstream data freshness markers. The cache achieves a hit rate of approximately 85% for dashboard workloads, significantly reducing the computational load on the processing engine for repetitive query patterns. - -## Deployment and Operations - -The platform is deployed on Kubernetes with Helm charts that encapsulate all deployment configurations, resource limits, and scaling policies. Each microservice is packaged as a container image with multi-stage builds that minimize image size and attack surface. The CI/CD pipeline uses a GitOps workflow with ArgoCD, ensuring that all changes to production are auditable, reproducible, and reversible. - -Monitoring is implemented using a Prometheus and Grafana stack, with custom metrics exported by each service using a shared instrumentation library. Key performance indicators including query latency percentiles, ingestion throughput, processing lag, and error rates are tracked on operational dashboards with automated alerting through PagerDuty integration. Distributed tracing using OpenTelemetry provides end-to-end visibility into request flows across microservices, enabling rapid diagnosis of performance anomalies and error root causes. 
-"#; - - // Index from content string - let result = engine - .index(IndexContext::from_content( - content, - DocumentFormat::Markdown, - )) - .await?; - - for item in &result.items { - println!("doc_id: {}", item.doc_id); - println!("name: {}", item.name); - println!("format: {:?}", item.format); - - if let Some(ref metrics) = item.metrics { - println!("time: {}ms", metrics.total_time_ms()); - println!("nodes: {}", metrics.nodes_processed); - println!("tokens: {}", metrics.total_tokens_generated); - } - } - - // Cleanup - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - } - - Ok(()) -} diff --git a/rust/examples/indexing.rs b/rust/examples/indexing.rs deleted file mode 100644 index fe78c254..00000000 --- a/rust/examples/indexing.rs +++ /dev/null @@ -1,59 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Batch indexing example — index multiple documents via the vectorless engine. -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! LLM_ENDPOINT=http://localhost:4000/api/v1 cargo run --example indexing -//! -//! # Or with defaults (edit the code to set your key/endpoint): -//! cargo run --example indexing -//! ``` - -use vectorless::{EngineBuilder, IndexContext}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - // Initialize tracing for debug output (set RUST_LOG=debug to see more) - tracing_subscriber::fmt::init(); - - // Build engine with LLM configuration from environment or defaults. - // Adjust the defaults below to match your setup. 
- let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-or-v1-...".to_string()); - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - // Index multiple documents in a single call. - // Paths are resolved relative to the workspace directory. - let result = engine - .index(IndexContext::from_paths(&["../README.md", "../CLAUDE.md"])) - .await?; - - println!("Indexed {} document(s)", result.items.len()); - for item in &result.items { - println!(" - {} ({})", item.name, item.doc_id); - if let Some(metrics) = &item.metrics { - println!(" Time: {}ms", metrics.total_time_ms()); - println!(" Nodes: {}", metrics.nodes_processed); - } - } - - // Cleanup - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - } - - Ok(()) -} diff --git a/rust/examples/indexing_flow.rs b/rust/examples/indexing_flow.rs deleted file mode 100644 index 03eb3a87..00000000 --- a/rust/examples/indexing_flow.rs +++ /dev/null @@ -1,173 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Indexing pipeline flow example — demonstrates the full indexing pipeline -//! with detailed metrics breakdown. -//! -//! This example walks through: -//! 1. Creating a Vectorless engine -//! 2. Indexing a Markdown document from content -//! 3. Inspecting per-stage timing metrics -//! -//! Set `RUST_LOG=info` to see pipeline stage logs, or `RUST_LOG=debug` for -//! detailed internal progress. -//! -//! # Usage -//! -//! ```bash -//! # Using environment variables for LLM config: -//! LLM_API_KEY=sk-xxx LLM_MODEL=google/gemini-3-flash-preview \ -//! 
LLM_ENDPOINT=http://localhost:4000/api/v1 cargo run --example indexing_flow -//! -//! # Or with defaults (edit the code to set your key/endpoint): -//! cargo run --example indexing_flow -//! ``` - -use vectorless::{DocumentFormat, EngineBuilder, IndexContext}; - -/// Sample document with multi-level headings to exercise tree construction -/// and navigation index building. -const SAMPLE_MARKDOWN: &str = r#" -# Payment Platform Technical Guide - -## Overview - -This guide covers the architecture and implementation details of the payment processing platform. The system handles credit card payments, bank transfers, and digital wallets across multiple currencies and regions. It is designed for high availability with 99.99% uptime SLA and supports peak throughput of 10,000 transactions per second. - -## Architecture - -The platform uses a microservices architecture with event-driven communication between services. Each service owns its data store and communicates through a message broker for eventual consistency. The system is deployed on Kubernetes with automatic horizontal scaling based on request queue depth. - -### Ingestion Gateway - -The ingestion gateway is the entry point for all payment requests. It handles request validation, authentication, idempotency checks, and routing to the appropriate payment processor. The gateway implements circuit breaker patterns to gracefully degrade when downstream processors experience issues. - -### Payment Processing Engine - -The payment processing engine orchestrates the lifecycle of each payment transaction. It manages state transitions from initiation through authorization, capture, settlement, and reconciliation. The engine supports both synchronous and asynchronous payment flows, depending on the payment method and processor requirements. - -### Settlement Service - -The settlement service handles batch settlement with acquiring banks and payment networks. 
It runs on a configurable schedule (typically end-of-day for each banking region) and groups authorized transactions into settlement batches. The service handles currency conversion, fee calculation, and split payments for marketplace scenarios. - -## Security - -All payment data is encrypted at rest using AES-256 and in transit using TLS 1.3. Cardholder data is tokenized immediately upon receipt and stored in a PCI DSS Level 1 compliant vault. The platform undergoes annual PCI DSS audits and quarterly network vulnerability scans. - -### Fraud Detection - -Real-time fraud detection uses a rules engine combined with a machine learning model that scores each transaction based on velocity checks, geolocation anomalies, device fingerprinting, and behavioral patterns. Transactions exceeding configurable risk thresholds are automatically held for manual review. - -### Compliance - -The platform complies with PCI DSS, SOC 2 Type II, GDPR, and regional payment regulations including PSD2 (Europe) and local data residency requirements. Audit logs are retained for 7 years and accessible through a dedicated compliance API. - -## Monitoring and Operations - -Real-time dashboards track transaction volumes, success rates, latency percentiles, and error rates across all payment methods and processors. Automated alerting triggers on-call rotations when key metrics deviate from baseline thresholds. -"#; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - tracing_subscriber::fmt::init(); - - println!("=== Indexing Pipeline Flow Example ===\n"); - - // Build engine with LLM configuration from environment or defaults. 
- let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string()); - let model = - std::env::var("LLM_MODEL").unwrap_or_else(|_| "google/gemini-3-flash-preview".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT") - .unwrap_or_else(|_| "http://localhost:4000/api/v1".to_string()); - - // Step 1: Create engine - println!("Step 1: Creating engine..."); - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - println!(" Done.\n"); - - // Step 2: Index from content - println!("Step 2: Indexing document from content...\n"); - let result = engine - .index(IndexContext::from_content( - SAMPLE_MARKDOWN, - DocumentFormat::Markdown, - )) - .await?; - - println!(" Indexed {} document(s)\n", result.items.len()); - - // Step 3: Inspect indexing results and metrics - for item in &result.items { - println!("--- Document Info ---"); - println!(" doc_id: {}", item.doc_id); - println!(" name: {}", item.name); - println!(" format: {:?}", item.format); - - if let Some(desc) = &item.description { - println!(" summary: {}...", &desc[..desc.len().min(120)]); - } - - if let Some(ref metrics) = item.metrics { - println!("\n--- Pipeline Stage Metrics ---"); - println!(" Stage Time (ms)"); - println!(" ─────────────────────────────"); - println!(" Parse {:>8}", metrics.parse_time_ms); - println!(" Build {:>8}", metrics.build_time_ms); - println!(" Validate {:>8}", metrics.validate_time_ms); - println!(" Split {:>8}", metrics.split_time_ms); - println!(" Enhance {:>8}", metrics.enhance_time_ms); - println!(" Enrich {:>8}", metrics.enrich_time_ms); - println!( - " Reasoning Index {:>8}", - metrics.reasoning_index_time_ms - ); - println!( - " Navigation Index {:>8}", - metrics.navigation_index_time_ms - ); - println!(" Optimize {:>8}", metrics.optimize_time_ms); - println!(" ─────────────────────────────"); - println!(" Total {:>8}", 
metrics.total_time_ms()); - - println!("\n--- Index Output ---"); - println!(" Nodes processed: {}", metrics.nodes_processed); - println!(" Summaries generated: {}", metrics.summaries_generated); - println!(" Summaries failed: {}", metrics.summaries_failed); - println!(" LLM calls: {}", metrics.llm_calls); - println!( - " Tokens generated: {}", - metrics.total_tokens_generated - ); - - println!("\n--- Navigation Index ---"); - println!(" Nav entries: {}", metrics.nav_entries_indexed); - println!(" Child routes: {}", metrics.child_routes_indexed); - - println!("\n--- Reasoning Index ---"); - println!(" Topics indexed: {}", metrics.topics_indexed); - println!(" Keywords indexed: {}", metrics.keywords_indexed); - - println!("\n--- Tree Optimization ---"); - println!(" Nodes skipped: {}", metrics.nodes_skipped); - println!(" Nodes merged: {}", metrics.nodes_merged); - } - - println!(); - } - - // Step 4: Cleanup - println!("Step 3: Cleaning up..."); - for doc in engine.list().await? { - engine.remove(&doc.id).await?; - println!(" Removed: {} ({})", doc.name, doc.id); - } - - println!("\n=== Done ==="); - Ok(()) -} diff --git a/rust/examples/query.rs b/rust/examples/query.rs deleted file mode 100644 index 8914081d..00000000 --- a/rust/examples/query.rs +++ /dev/null @@ -1,83 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Query-only example — query an already-indexed document. -//! -//! Assumes the workspace already contains indexed documents -//! (e.g. from `cargo run --example flow` or `index_single`). -//! -//! # Usage -//! -//! ```bash -//! LLM_API_KEY=sk-xxx LLM_MODEL=gpt-4o \ -//! LLM_ENDPOINT=https://api.openai.com/v1 cargo run --example query -//! 
``` - -use vectorless::{EngineBuilder, QueryContext}; - -#[tokio::main] -async fn main() -> vectorless::Result<()> { - tracing_subscriber::fmt::init(); - - let api_key = std::env::var("LLM_API_KEY").unwrap_or_else(|_| "sk-...".to_string()); - let model = std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o".to_string()); - let endpoint = std::env::var("LLM_ENDPOINT").unwrap_or_else(|_| "https://api".to_string()); - - let engine = EngineBuilder::new() - .with_key(&api_key) - .with_model(&model) - .with_endpoint(&endpoint) - .build() - .await - .map_err(|e| vectorless::Error::Config(e.to_string()))?; - - // List available documents - let docs = engine.list().await?; - if docs.is_empty() { - println!("No indexed documents found. Run an indexing example first."); - return Ok(()); - } - - println!("Available documents:"); - for doc in &docs { - println!(" - {} ({})", doc.name, doc.id); - } - println!(); - - // Query a specific document - let doc_id = docs[0].id.clone(); - let queries = vec![ - "What is the system architecture?", - "How does the storage layer work?", - ]; - - for query in queries { - println!("Query: \"{}\"", query); - - match engine - .query(QueryContext::new(query).with_doc_ids(vec![doc_id.clone()])) - .await - { - Ok(result) => { - if let Some(item) = result.single() { - if item.content.is_empty() { - println!(" No relevant content found"); - } else { - println!(" Found:"); - for line in item.content.lines() { - println!(" {}", line); - } - } - } else { - println!(" No results"); - } - } - Err(e) => { - println!(" Error: {}", e); - } - } - println!(); - } - - Ok(()) -} diff --git a/rust/src/client/test_support.rs b/rust/src/client/test_support.rs deleted file mode 100644 index dd443da8..00000000 --- a/rust/src/client/test_support.rs +++ /dev/null @@ -1,54 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Test-only helpers for constructing Engine instances without a real LLM. -//! -//! 
This module is exposed via `vectorless::__test_support` and should **only** -//! be used in integration tests. - -use std::sync::Arc; - -use crate::client::engine::Engine; -use crate::client::indexer::IndexerClient; -use crate::client::retriever::RetrieverClient; -use crate::config::Config; -use crate::events::EventEmitter; -use crate::index::PipelineExecutor; -use crate::llm::LlmClient; -use crate::llm::config::LlmConfig; -use crate::metrics::MetricsHub; -use crate::storage::Workspace; - -/// Build an `Engine` with a no-LLM pipeline for integration testing. -/// -/// The pipeline skips enhance/summary stages but exercises: -/// parse → build → validate → split → enrich → optimize. -/// -/// # Example -/// -/// ```rust,ignore -/// let tmp = tempfile::tempdir().unwrap(); -/// let engine = vectorless::__test_support::build_test_engine(tmp.path()).await; -/// ``` -pub async fn build_test_engine(workspace_dir: &std::path::Path) -> Engine { - let config = Config::default(); - - // No-LLM indexer: pipeline without enhance stage - let executor_factory: Arc<dyn Fn() -> PipelineExecutor + Send + Sync> = - Arc::new(|| PipelineExecutor::new()); - let indexer = IndexerClient::with_factory(executor_factory); - - let workspace = Workspace::new(workspace_dir).await.unwrap(); - let retriever = RetrieverClient::new(LlmClient::new(LlmConfig::default())); - - Engine::with_components( - config, - workspace, - retriever, - indexer, - EventEmitter::new(), - Arc::new(MetricsHub::with_defaults()), - ) - .await - .unwrap() -} diff --git a/rust/src/client/workspace.rs b/rust/src/client/workspace.rs deleted file mode 100644 index db296493..00000000 --- a/rust/src/client/workspace.rs +++ /dev/null @@ -1,243 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Workspace management client. -//! -//! This module provides async CRUD operations for document persistence -//! through the workspace abstraction. -//! -//! # Example -//! -//! ```rust,ignore -//!
let workspace = WorkspaceClient::new(workspace_storage).await; -//! -//! // Save a document -//! workspace.save(&doc).await?; -//! -//! // Load a document -//! let doc = workspace.load("doc-id").await?; -//! -//! // List all documents -//! for doc in workspace.list().await? { -//! println!("{}: {}", doc.id, doc.name); -//! } -//! ``` - -use std::sync::Arc; - -use tracing::{debug, info}; - -use crate::error::Result; -use crate::storage::{PersistedDocument, Workspace}; - -use super::types::DocumentInfo; -use crate::events::{EventEmitter, WorkspaceEvent}; - -/// Workspace management client. -/// -/// Provides async thread-safe CRUD operations for document persistence. -/// All operations are async and can be safely called from multiple tasks. -/// -/// # Thread Safety -/// -/// The client is fully thread-safe and can be cloned cheaply -/// (it uses `Arc` internally). -#[derive(Clone)] -pub(crate) struct WorkspaceClient { - /// Workspace storage. - workspace: Arc<Workspace>, - - /// Event emitter. - events: EventEmitter, -} - -impl WorkspaceClient { - /// Create a new workspace client. - pub async fn new(workspace: Workspace) -> Self { - Self { - workspace: Arc::new(workspace), - events: EventEmitter::new(), - } - } - - /// Create with event emitter. - pub fn with_events(mut self, events: EventEmitter) -> Self { - self.events = events; - self - } - - /// Save a document to the workspace. - /// - /// If a document with the same ID already exists, logs a warning - /// (this can happen during concurrent indexing of the same source). - /// - /// # Errors - /// - /// Returns an error if the workspace write fails.
- pub async fn save(&self, doc: &PersistedDocument) -> Result<()> { - let doc_id = doc.meta.id.clone(); - - if self.workspace.contains(&doc_id).await { - tracing::warn!( - doc_id, - name = %doc.meta.name, - "Overwriting existing document — possible concurrent index of the same source" - ); - } - - self.workspace.add(doc).await?; - - info!("Saved document: {}", doc_id); - self.events.emit_workspace(WorkspaceEvent::Saved { doc_id }); - - Ok(()) - } - - /// Load a document from the workspace. - /// - /// Returns `Ok(None)` if the document doesn't exist. - /// - /// # Errors - /// - /// Returns an error if the workspace read fails. - pub async fn load(&self, doc_id: &str) -> Result<Option<PersistedDocument>> { - let doc = self.workspace.load_and_cache(doc_id).await?; - - if let Some(ref _d) = doc { - debug!("Loaded document: {}", doc_id); - } - - self.events.emit_workspace(WorkspaceEvent::Loaded { - doc_id: doc_id.to_string(), - cache_hit: doc.is_some(), - }); - - Ok(doc) - } - - /// Remove a document from the workspace. - /// - /// Returns `Ok(true)` if the document was removed, `Ok(false)` if it didn't exist. - /// - /// # Errors - /// - /// Returns an error if the workspace write fails. - pub async fn remove(&self, doc_id: &str) -> Result<bool> { - let removed = self.workspace.remove(doc_id).await?; - - if removed { - info!("Removed document: {}", doc_id); - self.events.emit_workspace(WorkspaceEvent::Removed { - doc_id: doc_id.to_string(), - }); - } - - Ok(removed) - } - - /// Check if a document exists in the workspace. - /// - /// # Errors - /// - /// Returns an error if the workspace read fails. - pub async fn exists(&self, doc_id: &str) -> Result<bool> { - Ok(self.workspace.contains(doc_id).await) - } - - /// List all documents in the workspace. - /// - /// # Errors - /// - /// Returns an error if the workspace read fails.
- pub async fn list(&self) -> Result<Vec<DocumentInfo>> { - let doc_ids = self.workspace.list_documents().await; - let mut result = Vec::with_capacity(doc_ids.len()); - - for id in &doc_ids { - if let Some(meta) = self.workspace.get_meta(id).await { - result.push(DocumentInfo { - id: meta.id, - name: meta.doc_name, - format: meta.doc_type, - description: meta.doc_description, - source_path: meta.path, - page_count: meta.page_count, - line_count: meta.line_count, - }); - } - } - - Ok(result) - } - - /// Get document info by ID. - /// - /// # Errors - /// - /// Returns an error if the workspace read fails. - pub async fn get_document_info(&self, doc_id: &str) -> Result<Option<DocumentInfo>> { - Ok(self - .workspace - .get_meta(doc_id) - .await - .map(|meta| DocumentInfo { - id: meta.id, - name: meta.doc_name, - format: meta.doc_type, - description: meta.doc_description, - source_path: meta.path, - page_count: meta.page_count, - line_count: meta.line_count, - })) - } - - /// Clear all documents from the workspace. - /// - /// Returns the number of documents removed. - /// - /// # Errors - /// - /// Returns an error if the workspace write fails. - pub async fn clear(&self) -> Result<usize> { - let doc_ids = self.workspace.list_documents().await; - let mut removed = 0usize; - - for doc_id in &doc_ids { - match self.workspace.remove(doc_id).await { - Ok(true) => removed += 1, - Ok(false) => {} - Err(e) => tracing::warn!("Failed to remove document {}: {}", doc_id, e), - } - } - - if removed > 0 { - info!("Cleared workspace: {removed} documents removed"); - self.events - .emit_workspace(WorkspaceEvent::Cleared { count: removed }); - } - - Ok(removed) - } - - /// Get the underlying workspace Arc (for advanced use). - pub(crate) fn inner(&self) -> Arc<Workspace> { - Arc::clone(&self.workspace) - } - - /// Find a document ID by its source file path. - /// - /// Used for incremental indexing to check if a file has already been indexed.
- pub async fn find_by_source_path(&self, path: &std::path::Path) -> Option<String> { - self.workspace.find_by_source_path(path).await - } - - /// Get the document graph, loading from backend if not cached. - pub async fn get_graph(&self) -> Result<Option<crate::graph::DocumentGraph>> { - self.workspace.get_graph().await - } - - /// Persist the document graph to the backend. - pub async fn set_graph(&self, graph: &crate::graph::DocumentGraph) -> Result<()> { - self.workspace.set_graph(graph).await - } -} diff --git a/rust/src/config/mod.rs b/rust/src/config/mod.rs deleted file mode 100644 index 3ece0fe6..00000000 --- a/rust/src/config/mod.rs +++ /dev/null @@ -1,16 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Internal configuration management. -//! -//! Users configure vectorless via [`EngineBuilder`](crate::client::EngineBuilder) methods, -//! not by directly interacting with this module. - -mod types; -mod validator; - -pub use types::Config; -pub(crate) use types::{ - CompressionAlgorithm, FallbackBehavior, FallbackConfig, IndexerConfig, LlmConfig, - LlmMetricsConfig, MetricsConfig, OnAllFailedBehavior, RetrievalMetricsConfig, SlotConfig, -}; diff --git a/rust/src/lib.rs b/rust/src/lib.rs deleted file mode 100644 index a4263042..00000000 --- a/rust/src/lib.rs +++ /dev/null @@ -1,101 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -#![allow(dead_code)] - -//! # Vectorless -//! -//! A reasoning-native document engine for AI. -//! -//! It will reason through any of your structured documents — **PDFs, Markdown, -//! reports, contracts** — and retrieve only what's relevant. Nothing more, -//! nothing less. -//! -//! ## Quick Start -//! -//! ```rust,no_run -//! use vectorless::{EngineBuilder, IndexContext, QueryContext}; -//! -//! #[tokio::main] -//! async fn main() -> Result<(), Box<dyn std::error::Error>> { -//! let engine = EngineBuilder::new() -//! .with_key("sk-...") -//! .with_model("gpt-4o")
.with_endpoint("https://api.openai.com/v1") -//! .build() -//! .await?; -//! -//! let result = engine.index(IndexContext::from_path("./document.md")).await?; -//! let doc_id = result.doc_id().unwrap(); -//! -//! let result = engine.query( -//! QueryContext::new("What is this about?") -//! .with_doc_ids(vec![doc_id.to_string()]), -//! ).await?; -//! println!("{}", result.content); -//! -//! Ok(()) -//! } -//! ``` - -// ── Modules ────────────────────────────────────────────────────────────────── - -mod agent; -mod client; -mod config; -mod document; -mod error; -mod events; -mod graph; -mod metrics; - -mod index; -mod llm; -mod query; -mod rerank; -mod retrieval; -mod scoring; -mod storage; -mod utils; - -// ── Public API ─────────────────────────────────────────────────────────────── - -// Client -pub use client::{ - BuildError, Confidence, DocumentFormat, DocumentInfo, Engine, EngineBuilder, EvidenceItem, - FailedItem, IndexContext, IndexItem, IndexMode, IndexOptions, IndexResult, QueryContext, - QueryMetrics, QueryResult, QueryResultItem, -}; - -// Config -pub use config::Config; - -// Documents -pub use document::{ - DocumentStructure, DocumentTree, NodeId, ReasoningIndexConfig, StructureNode, TocConfig, - TocEntry, TocNode, TocView, TreeNode, -}; - -// Graph -pub use graph::{DocumentGraph, DocumentGraphNode, EdgeEvidence, GraphEdge, WeightedKeyword}; - -// Events -pub use events::{EventEmitter, IndexEvent, QueryEvent, WorkspaceEvent}; - -// Metrics -pub use metrics::{IndexMetrics, LlmMetricsReport, MetricsReport, RetrievalMetricsReport}; - -// Retrieval (streaming) -pub use retrieval::{RetrieveEvent, SufficiencyLevel}; - -// Errors -pub use error::{Error, Result}; - -/// Test-only utilities. -/// -/// **Do not use in production code.** This module exposes helpers for writing -/// integration tests without a real LLM endpoint. 
-#[doc(hidden)] -pub mod __test_support { - pub use crate::client::test_support::*; -} diff --git a/rust/src/rerank/types.rs b/rust/src/rerank/types.rs deleted file mode 100644 index 4b42f351..00000000 --- a/rust/src/rerank/types.rs +++ /dev/null @@ -1,14 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Rerank result types. - -/// Output from the rerank pipeline. -pub struct RerankOutput { - /// Synthesized answer. - pub answer: String, - /// Number of LLM calls used during synthesis/fusion. - pub llm_calls: u32, - /// Confidence score (0.0–1.0) — derived from LLM evaluate() result. - pub confidence: f32, -} diff --git a/rust/src/storage/workspace.rs b/rust/src/storage/workspace.rs deleted file mode 100644 index 936fc815..00000000 --- a/rust/src/storage/workspace.rs +++ /dev/null @@ -1,666 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Async workspace management for document collections. -//! -//! This module provides the primary workspace implementation for document -//! persistence, using async I/O for integration with runtimes like Tokio. -//! -//! # Features -//! -//! - **Async I/O** - All operations are async for non-blocking performance -//! - **LRU Cache** - Automatic caching with configurable size -//! - **Thread-Safe** - Fully thread-safe with `Arc` -//! - **Pluggable Backend** - Use file storage, in-memory, or custom backends -//! -//! # Example -//! -//! ```rust,ignore -//! use vectorless::storage::Workspace; -//! -//! #[tokio::main] -//! async fn main() -> Result<()> { -//! let workspace = Workspace::new("./workspace").await?; -//! -//! // Add a document -//! workspace.add(&doc).await?; -//! -//! // Load with caching -//! let loaded = workspace.load_and_cache("doc-1").await?; -//! -//! Ok(()) -//! } -//! 
``` - -use std::collections::HashMap; -use std::path::PathBuf; -use std::sync::Arc; - -use serde::{Deserialize, Serialize}; -use tokio::sync::RwLock; -use tracing::{debug, info, warn}; - -use super::backend::{FileBackend, StorageBackend}; -use super::cache::DocumentCache; -use super::persistence::{PersistedDocument, load_document_from_bytes, save_document_to_bytes}; -use crate::Error; -use crate::error::Result; - -const META_KEY: &str = "meta"; -const CATALOG_KEY: &str = "catalog"; -const DEFAULT_CACHE_SIZE: usize = 100; - -/// Lightweight metadata entry for the async workspace index. -#[derive(Debug, Clone, Serialize, Deserialize)] -pub struct DocumentMetaEntry { - /// Document ID. - pub id: String, - /// Document name/title. - pub doc_name: String, - /// Document description. - #[serde(default)] - pub doc_description: Option<String>, - /// Document type (pdf, md, etc.). - pub doc_type: String, - /// Source file path. - #[serde(default)] - pub path: Option<String>, - /// Page count (for PDFs). - #[serde(skip_serializing_if = "Option::is_none")] - pub page_count: Option<usize>, - /// Line count (for markdown). - #[serde(skip_serializing_if = "Option::is_none")] - pub line_count: Option<usize>, -} - -/// Options for async workspace creation. -#[derive(Debug, Clone)] -pub struct WorkspaceOptions { - /// LRU cache size (default: 100). - pub cache_size: usize, -} - -impl Default for WorkspaceOptions { - fn default() -> Self { - Self { - cache_size: DEFAULT_CACHE_SIZE, - } - } -} - -impl WorkspaceOptions { - /// Create new options with defaults. - pub fn new() -> Self { - Self::default() - } - - /// Set the cache size. - pub fn with_cache_size(mut self, size: usize) -> Self { - self.cache_size = size; - self - } -} - -/// Inner state for the async workspace. -struct WorkspaceInner { - /// Storage backend. - backend: Arc<dyn StorageBackend>, - /// Root path (for file-based backends). - root: Option<PathBuf>, - /// Document metadata index.
- meta_index: HashMap<String, DocumentMetaEntry>, - /// DocCard catalog — lightweight document summaries for Orchestrator analysis. - catalog: HashMap<String, crate::document::DocCard>, - /// LRU cache for loaded documents. - cache: DocumentCache, - /// Cross-document relationship graph (cached). - document_graph: Option<crate::graph::DocumentGraph>, -} - -/// An async workspace for managing indexed documents. -/// -/// Uses `tokio::sync::RwLock` for async-safe concurrent access. -/// All operations are async and can be safely called from multiple tasks. -/// -/// # Thread Safety -/// -/// The async workspace is fully thread-safe and can be cloned cheaply -/// (it uses `Arc` internally). -#[derive(Clone)] -pub struct Workspace { - inner: Arc<RwLock<WorkspaceInner>>, -} - -impl std::fmt::Debug for Workspace { - fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { - f.debug_struct("Workspace").finish() - } -} - -impl Workspace { - /// Create a new async workspace with a storage backend. - pub async fn with_backend(backend: Arc<dyn StorageBackend>) -> Result<Self> { - Self::with_backend_and_options(backend, WorkspaceOptions::default()).await - } - - /// Create an async workspace with backend and options. - pub async fn with_backend_and_options( - backend: Arc<dyn StorageBackend>, - options: WorkspaceOptions, - ) -> Result<Self> { - let mut inner = WorkspaceInner { - backend, - root: None, - meta_index: HashMap::new(), - catalog: HashMap::new(), - cache: DocumentCache::with_capacity(options.cache_size), - document_graph: None, - }; - - Self::load_meta_index(&mut inner)?; - Self::load_catalog_index(&mut inner)?; - - Ok(Self { - inner: Arc::new(RwLock::new(inner)), - }) - } - - /// Create a new file-based async workspace at the given path. - pub async fn new(path: impl Into<PathBuf>) -> Result<Self> { - Self::with_options(path, WorkspaceOptions::default()).await - } - - /// Create a new async workspace with custom cache size.
- pub async fn with_cache_size(path: impl Into<PathBuf>, cache_size: usize) -> Result<Self> { - Self::with_options( - path, - WorkspaceOptions { - cache_size, - ..Default::default() - }, - ) - .await - } - - /// Create a new async workspace with custom options. - pub async fn with_options(path: impl Into<PathBuf>, options: WorkspaceOptions) -> Result<Self> { - let root = path.into(); - let backend = Arc::new(FileBackend::new(&root)?); - - let mut inner = WorkspaceInner { - backend, - root: Some(root), - meta_index: HashMap::new(), - catalog: HashMap::new(), - cache: DocumentCache::with_capacity(options.cache_size), - document_graph: None, - }; - - Self::load_meta_index(&mut inner)?; - Self::load_catalog_index(&mut inner)?; - - Ok(Self { - inner: Arc::new(RwLock::new(inner)), - }) - } - - /// Get the workspace root path (if file-based). - pub async fn path(&self) -> Option<PathBuf> { - let inner = self.inner.read().await; - inner.root.clone() - } - - /// List all document IDs in the workspace. - pub async fn list_documents(&self) -> Vec<String> { - let inner = self.inner.read().await; - inner.meta_index.keys().cloned().collect() - } - - /// Get metadata for a document. - pub async fn get_meta(&self, id: &str) -> Option<DocumentMetaEntry> { - let inner = self.inner.read().await; - inner.meta_index.get(id).cloned() - } - - /// Check if a document exists. - pub async fn contains(&self, id: &str) -> bool { - let inner = self.inner.read().await; - inner.meta_index.contains_key(id) - } - - /// Add a document to the workspace.
- pub async fn add(&self, doc: &PersistedDocument) -> Result<()> { - let mut inner = self.inner.write().await; - - let doc_id = doc.meta.id.clone(); - let key = Self::doc_key(&doc_id); - - // Serialize and save via backend - let bytes = save_document_to_bytes(doc)?; - inner.backend.put(&key, &bytes)?; - - // Update meta index - let meta_entry = DocumentMetaEntry { - id: doc_id.clone(), - doc_name: doc.meta.name.clone(), - doc_description: doc.meta.description.clone(), - doc_type: doc.meta.format.clone(), - path: doc - .meta - .source_path - .as_ref() - .map(|p| p.to_string_lossy().to_string()), - page_count: if doc.pages.is_empty() { - None - } else { - Some(doc.pages.len()) - }, - line_count: doc.meta.line_count, - }; - - inner.meta_index.insert(doc_id.clone(), meta_entry); - Self::save_meta_index(&inner)?; - - // Update catalog with DocCard - if let Some(card) = doc - .navigation_index - .as_ref() - .and_then(|nav| nav.doc_card().cloned()) - { - inner.catalog.insert(doc_id.clone(), card); - Self::save_catalog_index(&inner)?; - } - - // Remove from cache if present - let _ = inner.cache.remove(&doc_id); - - info!("Saved document {} to async workspace", doc_id); - - // Invalidate document graph since documents changed - inner.document_graph = None; - - Ok(()) - } - - /// Load a document from the workspace. - /// - /// Uses LRU cache: returns cached version if available, - /// otherwise loads from backend and caches it. - pub async fn load(&self, id: &str) -> Result<Option<PersistedDocument>> { - // First check if document exists (read lock) - { - let inner = self.inner.read().await; - if !inner.meta_index.contains_key(id) { - return Ok(None); - } - - // Check LRU cache - if let Some(cached) = inner.cache.get(id)? { - debug!("Cache hit for document {}", id); - return Ok(Some(cached)); - } - } - - // Load from backend (need read lock for backend access) - let inner = self.inner.read().await; - let key = Self::doc_key(id); - - match inner.backend.get(&key)?
{ - Some(bytes) => { - let doc = load_document_from_bytes(&bytes)?; - - // Note: We can't modify the cache with only a read lock - // For now, we return the document without caching - // A more sophisticated implementation would use a separate cache structure - - debug!("Loaded document {} from backend", id); - Ok(Some(doc)) - } - None => { - warn!("Document {} in meta index but not in backend", id); - Ok(None) - } - } - } - - /// Load a document and cache it (requires write lock for caching). - pub async fn load_and_cache(&self, id: &str) -> Result<Option<PersistedDocument>> { - // First check if document exists (read lock) - { - let inner = self.inner.read().await; - if !inner.meta_index.contains_key(id) { - return Ok(None); - } - - // Check LRU cache - if let Some(cached) = inner.cache.get(id)? { - debug!("Cache hit for document {}", id); - return Ok(Some(cached)); - } - } - - // Load from backend and cache (write lock) - let inner = self.inner.write().await; - let key = Self::doc_key(id); - - match inner.backend.get(&key)? { - Some(bytes) => { - let doc = load_document_from_bytes(&bytes)?; - - // Add to cache - inner.cache.put(id.to_string(), doc.clone())?; - - debug!("Loaded and cached document {}", id); - Ok(Some(doc)) - } - None => { - warn!("Document {} in meta index but not in backend", id); - Ok(None) - } - } - } - - /// Remove a document from the workspace.
- pub async fn remove(&self, id: &str) -> Result<bool> { - let mut inner = self.inner.write().await; - - if !inner.meta_index.contains_key(id) { - return Ok(false); - } - - let key = Self::doc_key(id); - inner.backend.delete(&key)?; - - inner.meta_index.remove(id); - - // Remove from cache and catalog - let _ = inner.cache.remove(id); - inner.catalog.remove(id); - - Self::save_meta_index(&inner)?; - Self::save_catalog_index(&inner)?; - - info!("Removed document {} from async workspace", id); - - // Invalidate document graph since documents changed - inner.document_graph = None; - - Ok(true) - } - - /// Get the number of documents in the workspace. - pub async fn len(&self) -> usize { - let inner = self.inner.read().await; - inner.meta_index.len() - } - - /// Check if the workspace is empty. - pub async fn is_empty(&self) -> bool { - let inner = self.inner.read().await; - inner.meta_index.is_empty() - } - - /// Find a document ID by its source path. - /// - /// Returns the first document whose `source_path` matches. - /// Used for incremental indexing to check if a file has already been indexed. - pub async fn find_by_source_path(&self, path: &std::path::Path) -> Option<String> { - let target = path.to_string_lossy().to_string(); - let inner = self.inner.read().await; - for (_, entry) in &inner.meta_index { - if entry.path.as_deref() == Some(target.as_str()) { - return Some(entry.id.clone()); - } - } - None - } - - /// Get the number of items currently in the LRU cache. - pub async fn cache_len(&self) -> usize { - let inner = self.inner.read().await; - inner.cache.len() - } - - /// Get cache utilization (0.0 to 1.0). - pub async fn cache_utilization(&self) -> f64 { - let inner = self.inner.read().await; - inner.cache.utilization() - } - - /// Get cache statistics. - pub async fn cache_stats(&self) -> super::cache::CacheStats { - let inner = self.inner.read().await; - inner.cache.stats() - } - - /// Clear the LRU cache.
- pub async fn clear_cache(&self) -> Result<()> { - let inner = self.inner.write().await; - inner.cache.clear()?; - debug!("Cleared async document cache"); - Ok(()) - } - - // ========================================================================= - // Document Graph Methods - // ========================================================================= - - /// Storage key for the document graph. - const GRAPH_KEY: &'static str = "_graph"; - - /// Get the document graph, loading from backend if not cached. - pub async fn get_graph(&self) -> Result<Option<crate::graph::DocumentGraph>> { - // Check cache first - { - let inner = self.inner.read().await; - if inner.document_graph.is_some() { - return Ok(inner.document_graph.clone()); - } - } - - // Load from backend - let inner = self.inner.read().await; - match inner.backend.get(Self::GRAPH_KEY)? { - Some(bytes) => { - let graph: crate::graph::DocumentGraph = - serde_json::from_slice(&bytes).map_err(|e| { - crate::Error::Serialization(format!("Failed to deserialize graph: {}", e)) - })?; - debug!("Loaded document graph from backend"); - Ok(Some(graph)) - } - None => Ok(None), - } - } - - /// Persist the document graph to the backend. - pub async fn set_graph(&self, graph: &crate::graph::DocumentGraph) -> Result<()> { - let mut inner = self.inner.write().await; - let bytes = serde_json::to_vec(graph).map_err(|e| { - crate::Error::Serialization(format!("Failed to serialize graph: {}", e)) - })?; - inner.backend.put(Self::GRAPH_KEY, &bytes)?; - inner.document_graph = Some(graph.clone()); - info!( - "Persisted document graph ({} nodes, {} edges)", - graph.node_count(), - graph.edge_count() - ); - Ok(()) - } - - /// Invalidate the cached document graph (e.g. after add/remove).
- pub async fn invalidate_graph(&self) -> Result<()> { - let mut inner = self.inner.write().await; - inner.document_graph = None; - // Also remove from backend so stale graphs don't persist - let _ = inner.backend.delete(Self::GRAPH_KEY); - debug!("Invalidated document graph cache"); - Ok(()) - } - - /// Get the storage key for a document. - fn doc_key(id: &str) -> String { - id.to_string() - } - - /// Load the meta index from backend. - fn load_meta_index(inner: &mut WorkspaceInner) -> Result<()> { - match inner.backend.get(META_KEY)? { - Some(bytes) => { - let meta: HashMap<String, DocumentMetaEntry> = serde_json::from_slice(&bytes) - .map_err(|e| Error::Parse(format!("Failed to parse meta index: {}", e)))?; - inner.meta_index = meta; - info!( - "Loaded {} document(s) from async workspace index", - inner.meta_index.len() - ); - } - None => { - // Try to rebuild from existing keys - Self::rebuild_meta_index(inner)?; - } - } - Ok(()) - } - - /// Save the meta index to backend. - fn save_meta_index(inner: &WorkspaceInner) -> Result<()> { - let bytes = serde_json::to_vec_pretty(&inner.meta_index) - .map_err(|e| Error::Parse(format!("Failed to serialize meta index: {}", e)))?; - inner.backend.put(META_KEY, &bytes)?; - Ok(()) - } - - /// Load the DocCard catalog from backend. - fn load_catalog_index(inner: &mut WorkspaceInner) -> Result<()> { - match inner.backend.get(CATALOG_KEY)? { - Some(bytes) => { - let catalog: HashMap<String, crate::document::DocCard> = - serde_json::from_slice(&bytes).map_err(|e| { - Error::Parse(format!("Failed to parse catalog index: {}", e)) - })?; - inner.catalog = catalog; - info!("Loaded DocCard catalog: {} entries", inner.catalog.len()); - } - None => { - // Rebuild from existing documents - Self::rebuild_catalog(inner)?; - } - } - Ok(()) - } - - /// Save the DocCard catalog to backend.
- fn save_catalog_index(inner: &WorkspaceInner) -> Result<()> { - let bytes = serde_json::to_vec_pretty(&inner.catalog) - .map_err(|e| Error::Parse(format!("Failed to serialize catalog: {}", e)))?; - inner.backend.put(CATALOG_KEY, &bytes)?; - Ok(()) - } - - /// Rebuild the DocCard catalog from existing documents. - fn rebuild_catalog(inner: &mut WorkspaceInner) -> Result<()> { - let keys = inner.backend.keys()?; - let reserved = ["meta", "_graph", "catalog"]; - let doc_keys: Vec<_> = keys - .iter() - .filter(|k| !reserved.contains(&k.as_str())) - .collect(); - - for key in doc_keys { - if let Some(bytes) = inner.backend.get(key)? { - if let Ok(doc) = load_document_from_bytes(&bytes) { - if let Some(card) = doc - .navigation_index - .as_ref() - .and_then(|nav| nav.doc_card().cloned()) - { - inner.catalog.insert(doc.meta.id.clone(), card); - } - } - } - } - - if !inner.catalog.is_empty() { - Self::save_catalog_index(inner)?; - info!("Rebuilt DocCard catalog: {} entries", inner.catalog.len()); - } - - Ok(()) - } - - /// Get all DocCards from the catalog. - pub async fn list_catalog(&self) -> Vec<(String, crate::document::DocCard)> { - let inner = self.inner.read().await; - inner - .catalog - .iter() - .map(|(id, card)| (id.clone(), card.clone())) - .collect() - } - - /// Get a single DocCard by document ID. - pub async fn get_doc_card(&self, id: &str) -> Option<crate::document::DocCard> { - let inner = self.inner.read().await; - inner.catalog.get(id).cloned() - } - - /// Rebuild the meta index from existing documents. - fn rebuild_meta_index(inner: &mut WorkspaceInner) -> Result<()> { - let keys = inner.backend.keys()?; - let reserved = ["meta", "_graph", "catalog"]; - let doc_keys: Vec<_> = keys - .iter() - .filter(|k| !reserved.contains(&k.as_str())) - .collect(); - - for key in doc_keys { - if let Some(bytes) = inner.backend.get(key)?
{ - if let Ok(doc) = load_document_from_bytes(&bytes) { - let doc_id = doc.meta.id.clone(); - let meta_entry = DocumentMetaEntry { - id: doc_id.clone(), - doc_name: doc.meta.name, - doc_description: doc.meta.description, - doc_type: doc.meta.format, - path: doc - .meta - .source_path - .as_ref() - .map(|p| p.to_string_lossy().to_string()), - page_count: if doc.pages.is_empty() { - None - } else { - Some(doc.pages.len()) - }, - line_count: doc.meta.line_count, - }; - inner.meta_index.insert(doc_id, meta_entry); - } - } - } - - if !inner.meta_index.is_empty() { - Self::save_meta_index(inner)?; - info!( - "Rebuilt async index from {} document(s)", - inner.meta_index.len() - ); - } - - Ok(()) - } -} - -#[cfg(test)] -mod tests { - use super::*; - use crate::document::DocumentTree; - - fn create_test_doc(id: &str) -> PersistedDocument { - let meta = super::super::persistence::DocumentMeta::new(id, "Test Doc", "md"); - let tree = DocumentTree::new("Root", "Content"); - PersistedDocument::new(meta, tree) - } -} diff --git a/rust/tests/integration.rs b/rust/tests/integration.rs deleted file mode 100644 index 00c6b0cf..00000000 --- a/rust/tests/integration.rs +++ /dev/null @@ -1,165 +0,0 @@ -// Copyright (c) 2026 vectorless developers -// SPDX-License-Identifier: Apache-2.0 - -//! Integration tests for the Engine client. -//! -//! These tests exercise the full index → persist → query lifecycle -//! without requiring a real LLM endpoint, using the no-LLM pipeline. 
- -use std::path::PathBuf; - -use vectorless::__test_support::build_test_engine; -use vectorless::{Engine, IndexContext, IndexMode}; - -async fn setup() -> (Engine, tempfile::TempDir) { - let tmp = tempfile::tempdir().unwrap(); - let engine = build_test_engine(tmp.path()).await; - (engine, tmp) -} - -#[tokio::test] -async fn test_index_and_persist_single_markdown() { - let (engine, tmp) = setup().await; - - // Write a test markdown file - let md_path = tmp.path().join("test.md"); - std::fs::write(&md_path, "# Hello\n\nWorld content here.").unwrap(); - - let ctx = IndexContext::from_path(&md_path).with_mode(IndexMode::Force); - let result = engine.index(ctx).await.unwrap(); - - assert_eq!(result.len(), 1); - assert!(!result.has_failures()); - let doc_id = result.doc_id().unwrap(); - assert!(!doc_id.is_empty()); - - // Verify persisted - assert!(engine.exists(doc_id).await.unwrap()); - - // List should contain 1 doc - let docs = engine.list().await.unwrap(); - assert_eq!(docs.len(), 1); - assert_eq!(docs[0].name, "test"); - - // Remove - assert!(engine.remove(doc_id).await.unwrap()); - assert!(!engine.exists(doc_id).await.unwrap()); -} - -#[tokio::test] -async fn test_index_from_content() { - let (engine, _tmp) = setup().await; - - let ctx = IndexContext::from_content( - "# Title\n\nParagraph 1\n\n## Section\n\nParagraph 2", - vectorless::DocumentFormat::Markdown, - ) - .with_name("inline-doc"); - - let result = engine.index(ctx).await.unwrap(); - assert_eq!(result.len(), 1); - let doc_id = result.doc_id().unwrap(); - - // Verify it's persisted and loadable - assert!(engine.exists(doc_id).await.unwrap()); - - // Clean up - engine.remove(doc_id).await.unwrap(); -} - -#[tokio::test] -async fn test_index_multiple_sources_parallel() { - let (engine, tmp) = setup().await; - - // Create 3 markdown files - let paths: Vec<PathBuf> = (0..3) - .map(|i| { - let p = tmp.path().join(format!("doc{i}.md")); - std::fs::write(&p, format!("# Doc {i}\n\nContent {i}")).unwrap(); - p - })
.collect(); - - let ctx = IndexContext::from_paths(paths).with_mode(IndexMode::Force); - let result = engine.index(ctx).await.unwrap(); - - assert_eq!(result.len(), 3); - assert!(!result.has_failures()); - - let docs = engine.list().await.unwrap(); - assert_eq!(docs.len(), 3); - - // Clear all - let count = engine.clear().await.unwrap(); - assert_eq!(count, 3); -} - -#[tokio::test] -async fn test_index_default_mode_skips_existing() { - let (engine, tmp) = setup().await; - - let md_path = tmp.path().join("existing.md"); - std::fs::write(&md_path, "# Original\n\nOriginal content.").unwrap(); - - // First index - let ctx = IndexContext::from_path(&md_path); - let result1 = engine.index(ctx).await.unwrap(); - assert_eq!(result1.len(), 1); - let id1 = result1.doc_id().unwrap().to_string(); - - // Second index with Default mode — should skip - let ctx = IndexContext::from_path(&md_path); - let result2 = engine.index(ctx).await.unwrap(); - assert_eq!(result2.len(), 1); - assert!(!result2.has_failures()); - // Same doc ID — not re-indexed - assert_eq!(result2.doc_id().unwrap(), id1); -} - -#[tokio::test] -async fn test_force_mode_reindexes() { - let (engine, tmp) = setup().await; - - let md_path = tmp.path().join("force.md"); - std::fs::write(&md_path, "# Version 1").unwrap(); - - // First index - let ctx = IndexContext::from_path(&md_path); - let result1 = engine.index(ctx).await.unwrap(); - let id1 = result1.doc_id().unwrap().to_string(); - - // Force re-index — should get a new doc ID - let ctx = IndexContext::from_path(&md_path).with_mode(IndexMode::Force); - let result2 = engine.index(ctx).await.unwrap(); - assert_eq!(result2.len(), 1); - // Different doc ID — re-indexed - assert_ne!(result2.doc_id().unwrap(), id1); -} - -#[tokio::test] -async fn test_clear_empty_workspace() { - let (engine, _tmp) = setup().await; - - let count = engine.clear().await.unwrap(); - assert_eq!(count, 0); -} - -#[tokio::test] -async fn test_remove_nonexistent() { - let (engine, _tmp) = 
setup().await; - - let removed = engine.remove("nonexistent-id").await.unwrap(); - assert!(!removed); -} - -#[tokio::test] -async fn test_index_from_bytes() { - let (engine, _tmp) = setup().await; - - let ctx = IndexContext::from_bytes(vec![1, 2, 3, 4], vectorless::DocumentFormat::Pdf) - .with_name("test-bytes"); - - // This will fail at parse (not a real PDF), but should error gracefully - let result = engine.index(ctx).await; - assert!(result.is_err()); -} diff --git a/python/tests/__init__.py b/tests/__init__.py similarity index 100% rename from python/tests/__init__.py rename to tests/__init__.py diff --git a/python/tests/conftest.py b/tests/conftest.py similarity index 100% rename from python/tests/conftest.py rename to tests/conftest.py diff --git a/python/tests/test_cli/__init__.py b/tests/test_cli/__init__.py similarity index 100% rename from python/tests/test_cli/__init__.py rename to tests/test_cli/__init__.py diff --git a/python/tests/test_compat/__init__.py b/tests/test_compat/__init__.py similarity index 100% rename from python/tests/test_compat/__init__.py rename to tests/test_compat/__init__.py diff --git a/python/tests/test_config.py b/tests/test_config.py similarity index 100% rename from python/tests/test_config.py rename to tests/test_config.py diff --git a/python/tests/test_events.py b/tests/test_events.py similarity index 100% rename from python/tests/test_events.py rename to tests/test_events.py diff --git a/python/tests/test_session.py b/tests/test_session.py similarity index 100% rename from python/tests/test_session.py rename to tests/test_session.py diff --git a/python/tests/test_types.py b/tests/test_types.py similarity index 100% rename from python/tests/test_types.py rename to tests/test_types.py diff --git a/vectorless-core/vectorless-agent/Cargo.toml b/vectorless-core/vectorless-agent/Cargo.toml new file mode 100644 index 00000000..0d7f7936 --- /dev/null +++ b/vectorless-core/vectorless-agent/Cargo.toml @@ -0,0 +1,29 @@ +[package] 
+name = "vectorless-agent"
+version.workspace = true
+edition.workspace = true
+authors.workspace = true
+description.workspace = true
+license.workspace = true
+repository.workspace = true
+homepage.workspace = true
+
+[dependencies]
+vectorless-document = { path = "../vectorless-document" }
+vectorless-error = { path = "../vectorless-error" }
+vectorless-llm = { path = "../vectorless-llm" }
+vectorless-query = { path = "../vectorless-query" }
+vectorless-rerank = { path = "../vectorless-rerank" }
+vectorless-scoring = { path = "../vectorless-scoring" }
+tokio = { workspace = true }
+async-trait = { workspace = true }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+futures = { workspace = true }
+chrono = { workspace = true }
+thiserror = { workspace = true }
+regex = { workspace = true }
+
+[lints]
+workspace = true
diff --git a/rust/src/agent/command.rs b/vectorless-core/vectorless-agent/src/command.rs
similarity index 97%
rename from rust/src/agent/command.rs
rename to vectorless-core/vectorless-agent/src/command.rs
index 5507a1d1..07385420 100644
--- a/rust/src/agent/command.rs
+++ b/vectorless-core/vectorless-agent/src/command.rs
@@ -7,7 +7,7 @@
 //! simple and forgiving — unknown input falls back to `Ls` so the agent can
 //! re-observe its surroundings.
 
-use crate::document::{NavigationIndex, NodeId};
+use vectorless_document::{NavigationIndex, NodeId};
 
 /// Parsed command from LLM output.
 #[derive(Debug, Clone, PartialEq)]
@@ -192,7 +192,7 @@ pub fn resolve_target_extended(
     target: &str,
     nav_index: &NavigationIndex,
     current_node: NodeId,
-    tree: &crate::document::DocumentTree,
+    tree: &vectorless_document::DocumentTree,
 ) -> Option<NodeId> {
     let target = strip_quotes(target);
     // Try the primary resolver first
@@ -222,7 +222,7 @@ pub fn resolve_target_extended(
 fn search_descendants(
     target_lower: &str,
     start: NodeId,
-    tree: &crate::document::DocumentTree,
+    tree: &vectorless_document::DocumentTree,
     max_depth: usize,
 ) -> Option<NodeId> {
     let mut queue: Vec<(NodeId, usize)> = vec![(start, 0)];
@@ -305,7 +305,7 @@ mod tests {
 
     #[test]
     fn test_resolve_target_quoted() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         let mut tree = DocumentTree::new("Root", "");
         let root = tree.root();
@@ -387,7 +387,7 @@ mod tests {
 
     #[test]
     fn test_resolve_target_numeric() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         let mut tree = DocumentTree::new("Root", "");
         let root = tree.root();
@@ -420,7 +420,7 @@ mod tests {
 
     #[test]
     fn test_resolve_target_exact() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         let mut tree = DocumentTree::new("Root", "");
         let root = tree.root();
@@ -445,7 +445,7 @@ mod tests {
 
     #[test]
     fn test_resolve_target_case_insensitive() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         let mut tree = DocumentTree::new("Root", "");
         let root = tree.root();
@@ -474,7 +474,7 @@ mod tests {
 
     #[test]
     fn test_resolve_target_contains() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         let mut tree = DocumentTree::new("Root", "");
         let root = tree.root();
@@ -498,13 +498,13 @@ mod tests {
     #[test]
     fn test_resolve_target_no_routes() {
         let nav_index = NavigationIndex::new();
-        let tree = crate::document::DocumentTree::new("Root", "");
+        let tree = vectorless_document::DocumentTree::new("Root", "");
         assert!(resolve_target("anything", &nav_index, tree.root()).is_none());
     }
 
     #[test]
     fn test_resolve_target_extended_deep_search() {
-        use crate::document::{ChildRoute, DocumentTree};
+        use vectorless_document::{ChildRoute, DocumentTree};
 
         // root → "Wrapper" → "Research Labs" → "Lab B"
         let mut tree = DocumentTree::new("Root", "root content");
diff --git a/rust/src/agent/config.rs b/vectorless-core/vectorless-agent/src/config.rs
similarity index 92%
rename from rust/src/agent/config.rs
rename to vectorless-core/vectorless-agent/src/config.rs
index 0873c8c6..d3b784f1 100644
--- a/rust/src/agent/config.rs
+++ b/vectorless-core/vectorless-agent/src/config.rs
@@ -84,6 +84,8 @@ pub struct Output {
     pub metrics: Metrics,
     /// Confidence score (0.0–1.0) — derived from LLM evaluate() result.
     pub confidence: f32,
+    /// Reasoning trace steps collected during agent navigation.
+    pub trace_steps: Vec<TraceStep>,
 }
 
 impl Output {
@@ -94,22 +96,15 @@ impl Output {
             evidence: Vec::new(),
             metrics: Metrics::default(),
             confidence: 0.0,
+            trace_steps: Vec::new(),
         }
     }
 }
 
 /// A single piece of evidence collected during navigation.
-#[derive(Debug, Clone, Serialize, Deserialize)]
-pub struct Evidence {
-    /// Navigation path where this evidence was found (e.g., "Root/API Reference/Auth").
-    pub source_path: String,
-    /// Title of the node.
-    pub node_title: String,
-    /// Content of the node.
-    pub content: String,
-    /// Source document name (set by Orchestrator in multi-doc scenarios).
-    pub doc_name: Option<String>,
-}
+///
+/// Re-exported from [`vectorless_rerank::types::Evidence`].
+pub use vectorless_rerank::types::Evidence;
 
 /// Agent execution metrics.
 #[derive(Debug, Clone, Default, Serialize, Deserialize)]
@@ -148,6 +143,8 @@ pub struct WorkerOutput {
     pub metrics: WorkerMetrics,
     /// Document name this Worker was assigned to.
     pub doc_name: String,
+    /// Reasoning trace steps from this Worker.
+    pub trace_steps: Vec<TraceStep>,
 }
 
 /// Metrics specific to a single Worker's execution.
@@ -184,6 +181,7 @@ impl From<WorkerOutput> for Output {
                 evidence_chars: wo.metrics.evidence_chars,
             },
             confidence: 0.0,
+            trace_steps: wo.trace_steps,
         }
     }
}
@@ -209,11 +207,11 @@ pub enum Scope<'a> {
 
 /// Read-only access to a single document's compile artifacts.
 pub struct DocContext<'a> {
     /// Document content tree.
-    pub tree: &'a crate::document::DocumentTree,
+    pub tree: &'a vectorless_document::DocumentTree,
     /// Navigation index (includes DocCard).
-    pub nav_index: &'a crate::document::NavigationIndex,
+    pub nav_index: &'a vectorless_document::NavigationIndex,
     /// Reasoning index (keyword/topic lookup).
-    pub reasoning_index: &'a crate::document::ReasoningIndex,
+    pub reasoning_index: &'a vectorless_document::ReasoningIndex,
     /// Document name (for evidence source attribution).
     pub doc_name: &'a str,
 }
diff --git a/rust/src/agent/context.rs b/vectorless-core/vectorless-agent/src/context.rs
similarity index 90%
rename from rust/src/agent/context.rs
rename to vectorless-core/vectorless-agent/src/context.rs
index c4e542bf..f984bd51 100644
--- a/rust/src/agent/context.rs
+++ b/vectorless-core/vectorless-agent/src/context.rs
@@ -6,7 +6,7 @@
 //! These types provide the agent with structured access to the document's
 //! navigation index, content tree, and reasoning index — all read-only.
 
-use crate::document::{ChildRoute, NodeId, TopicEntry};
+use vectorless_document::{ChildRoute, NodeId, TopicEntry};
 
 // Re-export from config for convenience
 pub use super::config::{DocContext, WorkspaceContext};
@@ -57,17 +57,17 @@ impl<'a> DocContext<'a> {
     }
 
     /// Get the document's DocCard, if available.
-    pub fn doc_card(&self) -> Option<&crate::document::DocCard> {
+    pub fn doc_card(&self) -> Option<&vectorless_document::DocCard> {
         self.nav_index.doc_card()
     }
 
     /// Get the navigation entry for a node (overview, hints, tags).
-    pub fn nav_entry(&self, node: NodeId) -> Option<&crate::document::NavEntry> {
+    pub fn nav_entry(&self, node: NodeId) -> Option<&vectorless_document::NavEntry> {
         self.nav_index.get_entry(node)
     }
 
     /// Get the summary shortcut (pre-computed overview), if available.
-    pub fn summary_shortcut(&self) -> Option<&crate::document::SummaryShortcut> {
+    pub fn summary_shortcut(&self) -> Option<&vectorless_document::SummaryShortcut> {
         self.reasoning_index.summary_shortcut()
     }
 
@@ -105,7 +105,7 @@ impl<'a> WorkspaceContext<'a> {
     }
 
     /// Get all DocCards for documents that have them.
-    pub fn doc_cards(&self) -> Vec<(usize, &crate::document::DocCard)> {
+    pub fn doc_cards(&self) -> Vec<(usize, &vectorless_document::DocCard)> {
         self.docs
             .iter()
             .enumerate()
diff --git a/rust/src/agent/events.rs b/vectorless-core/vectorless-agent/src/events.rs
similarity index 100%
rename from rust/src/agent/events.rs
rename to vectorless-core/vectorless-agent/src/events.rs
diff --git a/rust/src/agent/mod.rs b/vectorless-core/vectorless-agent/src/lib.rs
similarity index 96%
rename from rust/src/agent/mod.rs
rename to vectorless-core/vectorless-agent/src/lib.rs
index f471258a..a566ab6b 100644
--- a/rust/src/agent/mod.rs
+++ b/vectorless-core/vectorless-agent/src/lib.rs
@@ -51,5 +51,5 @@ pub trait Agent {
     /// Agent name for logging and events.
     fn name(&self) -> &str;
     /// Execute the agent, consuming self.
-    async fn run(self) -> crate::error::Result;
+    async fn run(self) -> vectorless_error::Result;
 }
diff --git a/rust/src/agent/orchestrator/analyze.rs b/vectorless-core/vectorless-agent/src/orchestrator/analyze.rs
similarity index 95%
rename from rust/src/agent/orchestrator/analyze.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/analyze.rs
index 47dd58f1..2edd9612 100644
--- a/rust/src/agent/orchestrator/analyze.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/analyze.rs
@@ -8,10 +8,10 @@
 
 use tracing::{debug, info};
 
-use crate::error::Error;
-use crate::llm::LlmClient;
-use crate::query::QueryPlan;
-use crate::scoring::bm25::extract_keywords;
+use vectorless_error::Error;
+use vectorless_llm::LlmClient;
+use vectorless_query::QueryPlan;
+use vectorless_scoring::bm25::extract_keywords;
 
 use super::super::config::WorkspaceContext;
 use super::super::prompts::{DispatchEntry, orchestrator_analysis, parse_dispatch_plan};
@@ -43,11 +43,11 @@ pub async fn analyze(
     query: &str,
     ws: &WorkspaceContext<'_>,
     state: &mut OrchestratorState,
-    emitter: &crate::agent::EventEmitter,
+    emitter: &crate::EventEmitter,
     skip_analysis: bool,
     query_plan: &QueryPlan,
     llm: &LlmClient,
-) -> crate::error::Result {
+) -> vectorless_error::Result {
     if skip_analysis {
         debug!("Phase 1: skipping (user-specified documents)");
         let dispatches = (0..ws.doc_count())
diff --git a/rust/src/agent/orchestrator/dispatch.rs b/vectorless-core/vectorless-agent/src/orchestrator/dispatch.rs
similarity index 97%
rename from rust/src/agent/orchestrator/dispatch.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/dispatch.rs
index f599ac1d..0916ec46 100644
--- a/rust/src/agent/orchestrator/dispatch.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/dispatch.rs
@@ -5,7 +5,7 @@
 
 use tracing::{info, warn};
 
-use crate::llm::LlmClient;
+use vectorless_llm::LlmClient;
 
 use super::super::Agent;
 use super::super::config::{AgentConfig, WorkspaceContext};
@@ -13,7 +13,7 @@ use super::super::events::EventEmitter;
 use super::super::prompts::DispatchEntry;
 use super::super::state::OrchestratorState;
 use super::super::worker::Worker;
-use crate::query::QueryPlan;
+use vectorless_query::QueryPlan;
 
 /// Dispatch Workers in parallel and collect results.
 pub async fn dispatch_and_collect(
diff --git a/rust/src/agent/orchestrator/evaluate.rs b/vectorless-core/vectorless-agent/src/orchestrator/evaluate.rs
similarity index 97%
rename from rust/src/agent/orchestrator/evaluate.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/evaluate.rs
index 88e7e07f..9b0898e1 100644
--- a/rust/src/agent/orchestrator/evaluate.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/evaluate.rs
@@ -8,8 +8,8 @@
 
 use tracing::info;
 
-use crate::error::Error;
-use crate::llm::LlmClient;
+use vectorless_error::Error;
+use vectorless_llm::LlmClient;
 
 use super::super::config::Evidence;
 use super::super::prompts::{check_sufficiency, parse_sufficiency_response};
@@ -30,7 +30,7 @@ pub async fn evaluate(
     query: &str,
     evidence: &[Evidence],
     llm: &LlmClient,
-) -> crate::error::Result {
+) -> vectorless_error::Result {
     let evidence_summary = format_evidence_summary(evidence);
     let (system, user) = check_sufficiency(query, &evidence_summary);
diff --git a/rust/src/agent/orchestrator/mod.rs b/vectorless-core/vectorless-agent/src/orchestrator/mod.rs
similarity index 94%
rename from rust/src/agent/orchestrator/mod.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/mod.rs
index 643ca0fe..e17d1f39 100644
--- a/rust/src/agent/orchestrator/mod.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/mod.rs
@@ -16,8 +16,8 @@ mod supervisor;
 
 use tracing::info;
 
-use crate::llm::LlmClient;
-use crate::query::QueryPlan;
+use vectorless_llm::LlmClient;
+use vectorless_query::QueryPlan;
 
 use super::Agent;
 use super::config::{AgentConfig, Output, WorkspaceContext};
@@ -75,7 +75,7 @@ impl<'a> Agent for Orchestrator<'a> {
         "orchestrator"
     }
 
-    async fn run(self) -> crate::error::Result {
+    async fn run(self) -> vectorless_error::Result {
         let Orchestrator {
             query,
             ws,
@@ -191,11 +191,12 @@ pub async fn finalize_output(
     emitter: &EventEmitter,
     orch_llm_calls: u32,
     multi_doc: bool,
-    intent: crate::query::QueryIntent,
+    intent: vectorless_query::QueryIntent,
     confidence: f32,
-) -> crate::error::Result {
+) -> vectorless_error::Result {
     let rerank_result =
-        crate::rerank::process(query, &state.all_evidence, multi_doc, intent, confidence).await?;
+        vectorless_rerank::process(query, &state.all_evidence, multi_doc, intent, confidence)
+            .await?;
 
     let total_llm_calls = orch_llm_calls + rerank_result.llm_calls;
 
     if !rerank_result.answer.is_empty() {
diff --git a/rust/src/agent/orchestrator/replan.rs b/vectorless-core/vectorless-agent/src/orchestrator/replan.rs
similarity index 97%
rename from rust/src/agent/orchestrator/replan.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/replan.rs
index 58ce32b6..5694c370 100644
--- a/rust/src/agent/orchestrator/replan.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/replan.rs
@@ -9,9 +9,9 @@
 
 use tracing::info;
 
-use crate::error::Error;
-use crate::llm::LlmClient;
-use crate::scoring::bm25::extract_keywords;
+use vectorless_error::Error;
+use vectorless_llm::LlmClient;
+use vectorless_scoring::bm25::extract_keywords;
 
 use super::super::config::Evidence;
 use super::super::prompts::DispatchEntry;
@@ -41,7 +41,7 @@ pub async fn replan(
     total_docs: usize,
     doc_cards_text: &str,
     llm: &LlmClient,
-) -> crate::error::Result {
+) -> vectorless_error::Result {
     let evidence_summary = format_evidence_context(collected_evidence);
     let keywords = extract_keywords(query);
     let find_text = if keywords.is_empty() {
diff --git a/rust/src/agent/orchestrator/supervisor.rs b/vectorless-core/vectorless-agent/src/orchestrator/supervisor.rs
similarity index 97%
rename from rust/src/agent/orchestrator/supervisor.rs
rename to vectorless-core/vectorless-agent/src/orchestrator/supervisor.rs
index 664d06c8..d98dd1a6 100644
--- a/rust/src/agent/orchestrator/supervisor.rs
+++ b/vectorless-core/vectorless-agent/src/orchestrator/supervisor.rs
@@ -5,8 +5,8 @@
 
 use tracing::info;
 
-use crate::llm::LlmClient;
-use crate::query::QueryPlan;
+use vectorless_llm::LlmClient;
+use vectorless_query::QueryPlan;
 
 use super::super::config::{AgentConfig, WorkspaceContext};
 use super::super::events::EventEmitter;
@@ -41,7 +41,7 @@ pub async fn run_supervisor_loop(
     emitter: &EventEmitter,
     query_plan: &QueryPlan,
     skip_analysis: bool,
-) -> crate::error::Result {
+) -> vectorless_error::Result {
     let mut current_dispatches = initial_dispatches;
     let mut iteration: u32 = 0;
     let mut eval_sufficient = false;
diff --git a/rust/src/agent/prompts.rs b/vectorless-core/vectorless-agent/src/prompts.rs
similarity index 100%
rename from rust/src/agent/prompts.rs
rename to vectorless-core/vectorless-agent/src/prompts.rs
diff --git a/rust/src/agent/state.rs b/vectorless-core/vectorless-agent/src/state.rs
similarity index 93%
rename from rust/src/agent/state.rs
rename to vectorless-core/vectorless-agent/src/state.rs
index 9e0612ad..d2d3029c 100644
--- a/rust/src/agent/state.rs
+++ b/vectorless-core/vectorless-agent/src/state.rs
@@ -5,7 +5,8 @@
 
 use std::collections::HashSet;
 
-use crate::document::NodeId;
+use vectorless_document::NodeId;
+use vectorless_document::TraceStep;
 
 use super::config::{Evidence, Output};
 
@@ -47,6 +48,8 @@ pub struct WorkerState {
     pub check_count: u32,
     /// Whether a navigation plan was generated in Phase 1.5.
     pub plan_generated: bool,
+    /// Reasoning trace steps collected during navigation.
+    pub trace_steps: Vec<TraceStep>,
 }
 
 /// Maximum number of history entries to keep for prompt injection.
@@ -69,6 +72,7 @@ impl WorkerState {
             plan: String::new(),
             check_count: 0,
             plan_generated: false,
+            trace_steps: Vec::new(),
         }
     }
 
@@ -108,7 +112,7 @@ impl WorkerState {
     }
 
     /// Check if evidence has already been collected for a specific node.
-    pub fn has_evidence_for(&self, node_id: crate::document::NodeId) -> bool {
+    pub fn has_evidence_for(&self, node_id: vectorless_document::NodeId) -> bool {
         self.collected_nodes.contains(&node_id)
     }
 
@@ -184,6 +188,7 @@ impl WorkerState {
                 evidence_chars,
             },
             doc_name: doc_name.to_string(),
+            trace_steps: self.trace_steps,
         }
     }
}
@@ -259,11 +264,13 @@ impl OrchestratorState {
                 ..Default::default()
             },
             confidence: 0.0,
+            trace_steps: self.collect_trace_steps(),
         }
     }
 
     /// Merge all sub-results into a single Output (consuming self).
     pub fn into_output(self, answer: String) -> Output {
+        let trace_steps = self.collect_trace_steps();
         Output {
             answer,
             evidence: self.all_evidence,
@@ -284,8 +291,18 @@ impl OrchestratorState {
                 ..Default::default()
             },
             confidence: 0.0,
+            trace_steps,
         }
     }
+
+    /// Collect trace steps from all sub-results.
+    fn collect_trace_steps(&self) -> Vec<TraceStep> {
+        let mut steps = Vec::new();
+        for result in &self.sub_results {
+            steps.extend(result.trace_steps.iter().cloned());
+        }
+        steps
+    }
 }
 
 impl Default for OrchestratorState {
diff --git a/rust/src/agent/tools/common.rs b/vectorless-core/vectorless-agent/src/tools/common.rs
similarity index 100%
rename from rust/src/agent/tools/common.rs
rename to vectorless-core/vectorless-agent/src/tools/common.rs
diff --git a/rust/src/agent/tools/mod.rs b/vectorless-core/vectorless-agent/src/tools/mod.rs
similarity index 100%
rename from rust/src/agent/tools/mod.rs
rename to vectorless-core/vectorless-agent/src/tools/mod.rs
diff --git a/rust/src/agent/tools/orchestrator.rs b/vectorless-core/vectorless-agent/src/tools/orchestrator.rs
similarity index 91%
rename from rust/src/agent/tools/orchestrator.rs
rename to vectorless-core/vectorless-agent/src/tools/orchestrator.rs
index 96dcb116..4b3d72ac 100644
--- a/rust/src/agent/tools/orchestrator.rs
+++ b/vectorless-core/vectorless-agent/src/tools/orchestrator.rs
@@ -4,7 +4,7 @@
 
 //! Orchestrator tools: ls_docs, find_cross, dispatch.
 
 use super::ToolResult;
-use crate::agent::config::WorkspaceContext;
+use crate::config::WorkspaceContext;
 
 /// Execute `ls_docs` — list all document cards.
 ///
@@ -112,14 +112,14 @@ pub fn find_cross(keywords: &[String], ctx: &WorkspaceContext) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::document::{DocCard, NavigationIndex, ReasoningIndex, SectionCard};
+    use vectorless_document::{DocCard, NavigationIndex, ReasoningIndex, SectionCard};
 
     fn build_workspace() -> (
-        Vec<crate::document::DocumentTree>,
+        Vec<vectorless_document::DocumentTree>,
         Vec<NavigationIndex>,
         Vec<ReasoningIndex>,
     ) {
-        let tree1 = crate::document::DocumentTree::new("2024 Report", "content");
+        let tree1 = vectorless_document::DocumentTree::new("2024 Report", "content");
         let mut nav1 = NavigationIndex::new();
         nav1.set_doc_card(DocCard {
             title: "2024 Financial Report".to_string(),
@@ -134,7 +134,7 @@ mod tests {
             total_leaves: 10,
         });
 
-        let tree2 = crate::document::DocumentTree::new("2023 Report", "content");
+        let tree2 = vectorless_document::DocumentTree::new("2023 Report", "content");
         let mut nav2 = NavigationIndex::new();
         nav2.set_doc_card(DocCard {
             title: "2023 Financial Report".to_string(),
@@ -160,13 +160,13 @@ mod tests {
     fn test_ls_docs_shows_cards() {
         let (trees, navs, ridxs) = build_workspace();
         let docs = vec![
-            crate::agent::config::DocContext {
+            crate::config::DocContext {
                 tree: &trees[0],
                 nav_index: &navs[0],
                 reasoning_index: &ridxs[0],
                 doc_name: "2024",
             },
-            crate::agent::config::DocContext {
+            crate::config::DocContext {
                 tree: &trees[1],
                 nav_index: &navs[1],
                 reasoning_index: &ridxs[1],
@@ -185,10 +185,10 @@ mod tests {
 
     #[test]
     fn test_ls_docs_empty() {
-        let tree = crate::document::DocumentTree::new("Empty", "");
+        let tree = vectorless_document::DocumentTree::new("Empty", "");
         let nav = NavigationIndex::new();
         let ridx = ReasoningIndex::default();
-        let docs = vec![crate::agent::config::DocContext {
+        let docs = vec![crate::config::DocContext {
             tree: &tree,
             nav_index: &nav,
             reasoning_index: &ridx,
diff --git a/rust/src/agent/tools/worker/cat.rs b/vectorless-core/vectorless-agent/src/tools/worker/cat.rs
similarity index 93%
rename from rust/src/agent/tools/worker/cat.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/cat.rs
index 107aafa8..e4aeb055 100644
--- a/rust/src/agent/tools/worker/cat.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/cat.rs
@@ -3,9 +3,9 @@
 
 //! `cat` — read node content and collect as evidence.
 
-use crate::agent::command;
-use crate::agent::config::{DocContext, Evidence};
-use crate::agent::state::WorkerState;
+use crate::command;
+use crate::config::{DocContext, Evidence};
+use crate::state::WorkerState;
 
 use super::super::ToolResult;
 
@@ -65,7 +65,7 @@ pub fn cat(target: &str, ctx: &DocContext, state: &mut WorkerState) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
+    use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
 
     fn build_test_tree() -> (DocumentTree, NavigationIndex, NodeId, NodeId, NodeId) {
         let mut tree = DocumentTree::new("Root", "root content");
@@ -101,7 +101,7 @@ mod tests {
         let ctx = DocContext {
             tree: &tree,
             nav_index: &nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         };
         let mut state = WorkerState::new(root, 15);
diff --git a/rust/src/agent/tools/worker/cd.rs b/vectorless-core/vectorless-agent/src/tools/worker/cd.rs
similarity index 94%
rename from rust/src/agent/tools/worker/cd.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/cd.rs
index 8d874832..14972abe 100644
--- a/rust/src/agent/tools/worker/cd.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/cd.rs
@@ -3,9 +3,9 @@
 
 //! `cd`, `cd_absolute`, `cd_up` — navigation commands.
 
-use crate::agent::command;
-use crate::agent::config::DocContext;
-use crate::agent::state::WorkerState;
+use crate::command;
+use crate::config::DocContext;
+use crate::state::WorkerState;
 
 use super::super::ToolResult;
 
@@ -132,7 +132,7 @@ pub fn cd_up(ctx: &DocContext, state: &mut WorkerState) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
+    use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
 
     fn build_test_tree() -> (DocumentTree, NavigationIndex, NodeId, NodeId, NodeId) {
         let mut tree = DocumentTree::new("Root", "root content");
@@ -168,7 +168,7 @@ mod tests {
         let ctx = DocContext {
             tree: &tree,
             nav_index: &nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         };
         let mut state = WorkerState::new(root, 15);
@@ -185,7 +185,7 @@ mod tests {
         let ctx = DocContext {
             tree: &tree,
             nav_index: &nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         };
         let mut state = WorkerState::new(root, 15);
@@ -232,7 +232,7 @@ mod tests {
         let ctx = DocContext {
             tree: &tree,
             nav_index: &nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         };
         let mut state = WorkerState::new(root, 15);
@@ -250,7 +250,7 @@ mod tests {
         let ctx = DocContext {
             tree: &tree,
             nav_index: &nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         };
         let mut state = WorkerState::new(root, 15);
diff --git a/rust/src/agent/tools/worker/find.rs b/vectorless-core/vectorless-agent/src/tools/worker/find.rs
similarity index 94%
rename from rust/src/agent/tools/worker/find.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/find.rs
index 47912b01..b48b4189 100644
--- a/rust/src/agent/tools/worker/find.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/find.rs
@@ -3,7 +3,7 @@
 
 //! `find_tree` — search for nodes by title pattern across the entire tree.
 
-use crate::agent::config::DocContext;
+use crate::config::DocContext;
 
 use super::super::ToolResult;
 
@@ -43,8 +43,8 @@ pub fn find_tree(pattern: &str, ctx: &DocContext) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::agent::config::DocContext;
-    use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
+    use crate::config::DocContext;
+    use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
 
     fn build_rich_tree() -> (DocumentTree, NavigationIndex, NodeId) {
         let mut tree = DocumentTree::new(
@@ -90,7 +90,7 @@ mod tests {
         DocContext {
             tree: &$tree,
             nav_index: &$nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         }
     };
diff --git a/rust/src/agent/tools/worker/grep.rs b/vectorless-core/vectorless-agent/src/tools/worker/grep.rs
similarity index 92%
rename from rust/src/agent/tools/worker/grep.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/grep.rs
index 077b077e..b1555fd7 100644
--- a/rust/src/agent/tools/worker/grep.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/grep.rs
@@ -3,8 +3,8 @@
 
 //! `grep` — regex search across all node content in the current subtree.
 
-use crate::agent::config::DocContext;
-use crate::agent::state::WorkerState;
+use crate::config::DocContext;
+use crate::state::WorkerState;
 
 use super::super::ToolResult;
 use super::collect_subtree;
 
@@ -61,9 +61,9 @@ pub fn grep(pattern: &str, ctx: &DocContext, state: &WorkerState) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::agent::config::DocContext;
-    use crate::agent::state::WorkerState;
-    use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
+    use crate::config::DocContext;
+    use crate::state::WorkerState;
+    use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
 
     fn build_rich_tree() -> (DocumentTree, NavigationIndex, NodeId) {
         let mut tree = DocumentTree::new(
@@ -109,7 +109,7 @@ mod tests {
         DocContext {
             tree: &$tree,
             nav_index: &$nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         }
     };
@@ -167,7 +167,7 @@ mod tests {
         let ctx = rich_ctx!(tree, nav);
         let mut state = WorkerState::new(root, 15);
 
-        crate::agent::tools::worker::cd::cd("Expenses", &ctx, &mut state);
+        crate::tools::worker::cd::cd("Expenses", &ctx, &mut state);
         let result = grep("revenue", &ctx, &state);
         assert!(result.success);
         assert!(result.feedback.contains("No matches"));
diff --git a/rust/src/agent/tools/worker/head.rs b/vectorless-core/vectorless-agent/src/tools/worker/head.rs
similarity index 90%
rename from rust/src/agent/tools/worker/head.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/head.rs
index 5fefa234..764cba7a 100644
--- a/rust/src/agent/tools/worker/head.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/head.rs
@@ -3,9 +3,9 @@
 
 //! `head` — preview first N lines of a node without collecting evidence.
 
-use crate::agent::command;
-use crate::agent::config::DocContext;
-use crate::agent::state::WorkerState;
+use crate::command;
+use crate::config::DocContext;
+use crate::state::WorkerState;
 
 use super::super::ToolResult;
 
@@ -53,9 +53,9 @@ pub fn head(target: &str, lines: usize, ctx: &DocContext, state: &WorkerState) -> ToolResult {
 #[cfg(test)]
 mod tests {
     use super::*;
-    use crate::agent::config::DocContext;
-    use crate::agent::state::WorkerState;
-    use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
+    use crate::config::DocContext;
+    use crate::state::WorkerState;
+    use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId};
 
     fn build_rich_tree() -> (DocumentTree, NavigationIndex, NodeId) {
         let mut tree = DocumentTree::new(
@@ -88,7 +88,7 @@ mod tests {
         DocContext {
             tree: &$tree,
             nav_index: &$nav,
-            reasoning_index: &crate::document::ReasoningIndex::default(),
+            reasoning_index: &vectorless_document::ReasoningIndex::default(),
             doc_name: "test",
         }
     };
diff --git a/rust/src/agent/tools/worker/ls.rs b/vectorless-core/vectorless-agent/src/tools/worker/ls.rs
similarity index 94%
rename from rust/src/agent/tools/worker/ls.rs
rename to vectorless-core/vectorless-agent/src/tools/worker/ls.rs
index a00d688e..3c85bc18 100644
--- a/rust/src/agent/tools/worker/ls.rs
+++ b/vectorless-core/vectorless-agent/src/tools/worker/ls.rs
@@ -3,8 +3,8 @@
 
 //! `ls` — list children of the current node.
-use crate::agent::config::DocContext; -use crate::agent::state::WorkerState; +use crate::config::DocContext; +use crate::state::WorkerState; use super::super::ToolResult; @@ -75,7 +75,7 @@ pub fn ls(ctx: &DocContext, state: &WorkerState) -> ToolResult { #[cfg(test)] mod tests { use super::*; - use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId}; + use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId}; fn build_test_tree() -> (DocumentTree, NavigationIndex, NodeId, NodeId, NodeId) { let mut tree = DocumentTree::new("Root", "root content"); @@ -111,7 +111,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let state = WorkerState::new(root, 15); diff --git a/rust/src/agent/tools/worker/mod.rs b/vectorless-core/vectorless-agent/src/tools/worker/mod.rs similarity index 94% rename from rust/src/agent/tools/worker/mod.rs rename to vectorless-core/vectorless-agent/src/tools/worker/mod.rs index eb73d34f..4a9fc6e6 100644 --- a/rust/src/agent/tools/worker/mod.rs +++ b/vectorless-core/vectorless-agent/src/tools/worker/mod.rs @@ -21,7 +21,7 @@ pub use ls::ls; pub use pwd::pwd; pub use wc::wc; -use crate::document::{DocumentTree, NodeId}; +use vectorless_document::{DocumentTree, NodeId}; /// Collect all NodeIds in the subtree rooted at `node` (inclusive). pub(super) fn collect_subtree(node: NodeId, tree: &DocumentTree) -> Vec<NodeId> { diff --git a/rust/src/agent/tools/worker/pwd.rs b/vectorless-core/vectorless-agent/src/tools/worker/pwd.rs similarity index 84% rename from rust/src/agent/tools/worker/pwd.rs rename to vectorless-core/vectorless-agent/src/tools/worker/pwd.rs index 7adcf084..c5ff06b9 100644 --- a/rust/src/agent/tools/worker/pwd.rs +++ b/vectorless-core/vectorless-agent/src/tools/worker/pwd.rs @@ -3,7 +3,7 @@ //! `pwd` — show current navigation path.
-use crate::agent::state::WorkerState; +use crate::state::WorkerState; use super::super::ToolResult; @@ -15,9 +15,9 @@ pub fn pwd(state: &WorkerState) -> ToolResult { #[cfg(test)] mod tests { use super::*; - use crate::agent::config::DocContext; - use crate::agent::tools::worker::cd::cd; - use crate::document::{ChildRoute, DocumentTree, NavigationIndex}; + use crate::config::DocContext; + use crate::tools::worker::cd::cd; + use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex}; fn build_test_tree() -> (DocumentTree, NavigationIndex) { let mut tree = DocumentTree::new("Root", "root content"); @@ -45,7 +45,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let mut state = WorkerState::new(root, 15); diff --git a/rust/src/agent/tools/worker/wc.rs b/vectorless-core/vectorless-agent/src/tools/worker/wc.rs similarity index 89% rename from rust/src/agent/tools/worker/wc.rs rename to vectorless-core/vectorless-agent/src/tools/worker/wc.rs index 4ea7ec01..adc05cff 100644 --- a/rust/src/agent/tools/worker/wc.rs +++ b/vectorless-core/vectorless-agent/src/tools/worker/wc.rs @@ -3,9 +3,9 @@ //! `wc` — show node content statistics. 
-use crate::agent::command; -use crate::agent::config::DocContext; -use crate::agent::state::WorkerState; +use crate::command; +use crate::config::DocContext; +use crate::state::WorkerState; use super::super::ToolResult; @@ -42,9 +42,9 @@ pub fn wc(target: &str, ctx: &DocContext, state: &WorkerState) -> ToolResult { #[cfg(test)] mod tests { use super::*; - use crate::agent::config::DocContext; - use crate::agent::state::WorkerState; - use crate::document::{ChildRoute, DocumentTree, NavigationIndex, NodeId}; + use crate::config::DocContext; + use crate::state::WorkerState; + use vectorless_document::{ChildRoute, DocumentTree, NavigationIndex, NodeId}; fn build_rich_tree() -> (DocumentTree, NavigationIndex, NodeId) { let mut tree = DocumentTree::new( @@ -77,7 +77,7 @@ mod tests { DocContext { tree: &$tree, nav_index: &$nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", } }; diff --git a/rust/src/agent/worker/execute.rs b/vectorless-core/vectorless-agent/src/worker/execute.rs similarity index 99% rename from rust/src/agent/worker/execute.rs rename to vectorless-core/vectorless-agent/src/worker/execute.rs index 66c40250..12ac88e5 100644 --- a/rust/src/agent/worker/execute.rs +++ b/vectorless-core/vectorless-agent/src/worker/execute.rs @@ -5,7 +5,7 @@ use tracing::{info, warn}; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; use super::super::command::{Command, parse_command}; use super::super::config::{DocContext, Step}; diff --git a/rust/src/agent/worker/format.rs b/vectorless-core/vectorless-agent/src/worker/format.rs similarity index 100% rename from rust/src/agent/worker/format.rs rename to vectorless-core/vectorless-agent/src/worker/format.rs diff --git a/rust/src/agent/worker/mod.rs b/vectorless-core/vectorless-agent/src/worker/mod.rs similarity index 97% rename from rust/src/agent/worker/mod.rs rename to 
vectorless-core/vectorless-agent/src/worker/mod.rs index b906d30e..d4ac5453 100644 --- a/rust/src/agent/worker/mod.rs +++ b/vectorless-core/vectorless-agent/src/worker/mod.rs @@ -24,10 +24,10 @@ use super::context::FindHit; use super::events::EventEmitter; use super::state::WorkerState; use super::tools::worker as tools; -use crate::error::Error; -use crate::llm::LlmClient; -use crate::query::QueryPlan; -use crate::scoring::bm25::extract_keywords; +use vectorless_error::Error; +use vectorless_llm::LlmClient; +use vectorless_query::QueryPlan; +use vectorless_scoring::bm25::extract_keywords; use navigation::run_navigation_loop; use planning::build_plan_prompt; @@ -75,7 +75,7 @@ impl<'a> Agent for Worker<'a> { "worker" } - async fn run(self) -> crate::error::Result { + async fn run(self) -> vectorless_error::Result { let Worker { query, task, diff --git a/rust/src/agent/worker/navigation.rs b/vectorless-core/vectorless-agent/src/worker/navigation.rs similarity index 91% rename from rust/src/agent/worker/navigation.rs rename to vectorless-core/vectorless-agent/src/worker/navigation.rs index 29b1a680..cb6b06ba 100644 --- a/rust/src/agent/worker/navigation.rs +++ b/vectorless-core/vectorless-agent/src/worker/navigation.rs @@ -14,8 +14,8 @@ use super::super::state::WorkerState; use super::execute::{execute_command, parse_and_detect_failure}; use super::format::format_visited_titles; use super::planning::{build_replan_prompt, format_keyword_hints}; -use crate::error::Error; -use crate::llm::LlmClient; +use vectorless_error::Error; +use vectorless_llm::LlmClient; /// Run the Phase 2 navigation loop. 
/// @@ -32,7 +32,7 @@ pub async fn run_navigation_loop( index_hits: &[FindHit], intent_context: &str, llm_calls: &mut u32, -) -> crate::error::Result<()> { +) -> vectorless_error::Result<()> { let use_dispatch_prompt = task.is_some(); let keyword_hints = format_keyword_hints(index_hits, ctx); let max_llm = config.max_llm_calls; @@ -203,7 +203,7 @@ fn handle_parse_failure( (command, is_parse_failure) } -/// Push a round's command + feedback preview into history. +/// Push a round's command + feedback preview into history and trace. fn push_round_history(state: &mut WorkerState, cmd_str: &str) { let feedback_preview = if state.last_feedback.len() > 120 { let boundary = state.last_feedback.ceil_char_boundary(120); @@ -212,6 +212,13 @@ fn push_round_history(state: &mut WorkerState, cmd_str: &str) { state.last_feedback.clone() }; state.push_history(format!("{} → {}", cmd_str, feedback_preview)); + + let round = state.max_rounds.saturating_sub(state.remaining); + state.trace_steps.push(vectorless_document::TraceStep { + action: cmd_str.to_string(), + observation: state.last_feedback.chars().take(200).collect(), + round, + }); } /// Dynamic re-planning after an insufficient check. 
@@ -228,7 +235,7 @@ async fn handle_replan( emitter: &EventEmitter, llm_calls: &mut u32, max_llm: u32, -) -> crate::error::Result<()> { +) -> vectorless_error::Result<()> { if !is_check { return Ok(()); } @@ -270,9 +277,9 @@ async fn handle_replan( #[cfg(test)] mod tests { use super::*; - use crate::agent::config::DocContext; - use crate::agent::state::WorkerState; - use crate::document::{DocumentTree, NodeId}; + use crate::config::DocContext; + use crate::state::WorkerState; + use vectorless_document::{DocumentTree, NodeId}; fn test_ctx() -> (DocumentTree, NodeId) { let tree = DocumentTree::new("Root", "root content"); @@ -283,11 +290,11 @@ mod tests { #[test] fn test_handle_parse_failure_valid_command() { let (tree, root) = test_ctx(); - let nav = crate::document::NavigationIndex::new(); + let nav = vectorless_document::NavigationIndex::new(); let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let mut state = WorkerState::new(root, 10); @@ -300,11 +307,11 @@ mod tests { #[test] fn test_handle_parse_failure_unrecognized() { let (tree, root) = test_ctx(); - let nav = crate::document::NavigationIndex::new(); + let nav = vectorless_document::NavigationIndex::new(); let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let mut state = WorkerState::new(root, 10); @@ -319,11 +326,11 @@ mod tests { #[test] fn test_handle_parse_failure_short_response() { let (tree, root) = test_ctx(); - let nav = crate::document::NavigationIndex::new(); + let nav = vectorless_document::NavigationIndex::new(); let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: 
&vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let mut state = WorkerState::new(root, 10); @@ -374,11 +381,11 @@ mod tests { #[test] fn test_build_round_prompt_dispatch_first_round() { let (tree, root) = test_ctx(); - let nav = crate::document::NavigationIndex::new(); + let nav = vectorless_document::NavigationIndex::new(); let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test_doc", }; let mut state = WorkerState::new(root, 10); @@ -402,11 +409,11 @@ mod tests { #[test] fn test_build_round_prompt_navigation_subsequent_round() { let (tree, root) = test_ctx(); - let nav = crate::document::NavigationIndex::new(); + let nav = vectorless_document::NavigationIndex::new(); let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test_doc", }; let mut state = WorkerState::new(root, 10); diff --git a/rust/src/agent/worker/planning.rs b/vectorless-core/vectorless-agent/src/worker/planning.rs similarity index 94% rename from rust/src/agent/worker/planning.rs rename to vectorless-core/vectorless-agent/src/worker/planning.rs index a42bf52c..80149e7a 100644 --- a/rust/src/agent/worker/planning.rs +++ b/vectorless-core/vectorless-agent/src/worker/planning.rs @@ -5,8 +5,8 @@ use std::collections::HashSet; -use crate::query::QueryIntent; -use crate::scoring::bm25::{Bm25Engine, FieldDocument, extract_keywords}; +use vectorless_query::QueryIntent; +use vectorless_scoring::bm25::{Bm25Engine, FieldDocument, extract_keywords}; use super::super::config::DocContext; use super::super::context::FindHit; @@ -222,8 +222,8 @@ pub fn format_keyword_hints(keyword_hits: &[FindHit], ctx: &DocContext<'_>) -> S } /// Build the ancestor path string for a node (e.g., "root/Chapter 1/Section 1.2"). 
-pub fn build_ancestor_path(node_id: crate::document::NodeId, ctx: &DocContext<'_>) -> String { - let mut path: Vec<crate::document::NodeId> = ctx.tree.ancestors_iter(node_id).collect(); +pub fn build_ancestor_path(node_id: vectorless_document::NodeId, ctx: &DocContext<'_>) -> String { + let mut path: Vec<vectorless_document::NodeId> = ctx.tree.ancestors_iter(node_id).collect(); path.reverse(); path.iter() .filter_map(|&id| ctx.node_title(id)) @@ -473,7 +473,7 @@ fn build_sibling_hints(state: &WorkerState, ctx: &DocContext<'_>) -> String { if let Some(parent) = ctx.parent(state.current_node) { if let Some(routes) = ctx.ls(parent) { - let unvisited: Vec<&crate::document::ChildRoute> = routes + let unvisited: Vec<&vectorless_document::ChildRoute> = routes .iter() .filter(|r| !state.visited.contains(&r.node_id)) .collect(); @@ -490,7 +490,7 @@ fn build_sibling_hints(state: &WorkerState, ctx: &DocContext<'_>) -> String { if let Some(grandparent) = ctx.parent(parent) { if let Some(routes) = ctx.ls(grandparent) { - let unvisited_parent_siblings: Vec<&crate::document::ChildRoute> = routes + let unvisited_parent_siblings: Vec<&vectorless_document::ChildRoute> = routes .iter() .filter(|r| !state.visited.contains(&r.node_id) && r.node_id != parent) .collect(); @@ -517,25 +517,25 @@ fn build_sibling_hints(state: &WorkerState, ctx: &DocContext<'_>) -> String { #[cfg(test)] mod tests { use super::*; - use crate::agent::config::DocContext; - use crate::agent::config::Evidence; - use crate::agent::state::WorkerState; - use crate::document::{ChildRoute, NavEntry, NodeId}; - use crate::scoring::bm25::extract_keywords; + use crate::config::DocContext; + use crate::config::Evidence; + use crate::state::WorkerState; + use vectorless_document::{ChildRoute, NavEntry, NodeId}; + use vectorless_scoring::bm25::extract_keywords; fn build_semantic_test_tree() -> ( - crate::document::DocumentTree, - crate::document::NavigationIndex, + vectorless_document::DocumentTree, + vectorless_document::NavigationIndex, NodeId, NodeId, NodeId, ) { - let mut
tree = crate::document::DocumentTree::new("Root", "root content"); + let mut tree = vectorless_document::DocumentTree::new("Root", "root content"); let root = tree.root(); let revenue = tree.add_child(root, "Revenue", "revenue content"); let expenses = tree.add_child(root, "Expenses", "expense content"); - let mut nav = crate::document::NavigationIndex::new(); + let mut nav = vectorless_document::NavigationIndex::new(); nav.add_entry( root, NavEntry { @@ -600,7 +600,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; assert_eq!(build_ancestor_path(revenue, &ctx), "Root/Revenue"); @@ -613,7 +613,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let keywords = extract_keywords("What is the revenue?"); @@ -632,7 +632,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let keywords = extract_keywords("operating costs analysis"); @@ -651,7 +651,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let keywords = extract_keywords("xyzzy foobar"); @@ -673,7 +673,7 @@ mod tests { let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "test", }; let (system, user) = build_replan_prompt("What is total revenue?", None, &state, &ctx); @@ -688,7 +688,7 @@ mod tests { 
let ctx = DocContext { tree: &tree, nav_index: &nav, - reasoning_index: &crate::document::ReasoningIndex::default(), + reasoning_index: &vectorless_document::ReasoningIndex::default(), doc_name: "Financial Report", }; let ls_output = diff --git a/vectorless-core/vectorless-config/Cargo.toml b/vectorless-core/vectorless-config/Cargo.toml new file mode 100644 index 00000000..c42bda8b --- /dev/null +++ b/vectorless-core/vectorless-config/Cargo.toml @@ -0,0 +1,17 @@ +[package] +name = "vectorless-config" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-graph = { path = "../vectorless-graph" } +serde = { workspace = true } +serde_json = { workspace = true } + +[lints] +workspace = true diff --git a/vectorless-core/vectorless-config/src/lib.rs b/vectorless-core/vectorless-config/src/lib.rs new file mode 100644 index 00000000..30217490 --- /dev/null +++ b/vectorless-core/vectorless-config/src/lib.rs @@ -0,0 +1,20 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Internal configuration management. +//! +//! Users configure vectorless via [`EngineBuilder`](vectorless_engine::EngineBuilder) methods, +//! not by directly interacting with this module. 
+ +mod types; +mod validator; + +pub use types::Config; +pub use types::DocumentGraphConfig; +pub use types::LlmMetricsConfig; +pub use types::MetricsConfig; +pub use types::RetrievalMetricsConfig; +pub use types::{ + CompressionAlgorithm, FallbackBehavior, FallbackConfig, IndexerConfig, LlmConfig, + OnAllFailedBehavior, RetrievalConfig, RetryConfig, SlotConfig, StorageConfig, ThrottleConfig, +}; diff --git a/rust/src/config/types/indexer.rs b/vectorless-core/vectorless-config/src/types/indexer.rs similarity index 100% rename from rust/src/config/types/indexer.rs rename to vectorless-core/vectorless-config/src/types/indexer.rs diff --git a/rust/src/config/types/llm_pool.rs b/vectorless-core/vectorless-config/src/types/llm_pool.rs similarity index 94% rename from rust/src/config/types/llm_pool.rs rename to vectorless-core/vectorless-config/src/types/llm_pool.rs index b38497aa..2cd129f1 100644 --- a/rust/src/config/types/llm_pool.rs +++ b/vectorless-core/vectorless-config/src/types/llm_pool.rs @@ -4,7 +4,7 @@ //! Unified LLM configuration. //! //! This module consolidates all LLM-related configuration into a single -//! cohesive structure. Users configure via [`EngineBuilder`](crate::client::EngineBuilder) +//! cohesive structure. Users configure via [`EngineBuilder`](vectorless_engine::EngineBuilder) //! for simple cases, or construct [`LlmConfig`] programmatically for advanced use. use serde::{Deserialize, Serialize}; @@ -333,17 +333,6 @@ impl RetryConfig { let delay_ms = delay_ms.min(self.max_delay_ms as f64); std::time::Duration::from_millis(delay_ms as u64) } - - /// Convert to the runtime retry config (used by llm module). 
- pub fn to_runtime_config(&self) -> crate::llm::config::RetryConfig { - crate::llm::config::RetryConfig { - max_attempts: self.max_attempts, - initial_delay_ms: self.initial_delay_ms, - max_delay_ms: self.max_delay_ms, - multiplier: self.multiplier, - retry_on_rate_limit: self.retry_on_rate_limit, - } - } } /// Throttle / rate-limiting configuration. @@ -402,16 +391,6 @@ impl ThrottleConfig { self.requests_per_minute = rpm; self } - - /// Convert to the runtime concurrency config. - pub fn to_runtime_config(&self) -> crate::llm::throttle::ConcurrencyConfig { - crate::llm::throttle::ConcurrencyConfig { - max_concurrent_requests: self.max_concurrent_requests, - requests_per_minute: self.requests_per_minute, - enabled: self.enabled, - semaphore_enabled: self.semaphore_enabled, - } - } } /// Fallback behavior on errors. diff --git a/rust/src/config/types/metrics.rs b/vectorless-core/vectorless-config/src/types/metrics.rs similarity index 100% rename from rust/src/config/types/metrics.rs rename to vectorless-core/vectorless-config/src/types/metrics.rs diff --git a/rust/src/config/types/mod.rs b/vectorless-core/vectorless-config/src/types/mod.rs similarity index 91% rename from rust/src/config/types/mod.rs rename to vectorless-core/vectorless-config/src/types/mod.rs index e6ba3f8b..717b137d 100644 --- a/rust/src/config/types/mod.rs +++ b/vectorless-core/vectorless-config/src/types/mod.rs @@ -11,17 +11,19 @@ mod storage; use serde::{Deserialize, Serialize}; -pub(crate) use indexer::IndexerConfig; -pub(crate) use llm_pool::{ - FallbackBehavior, FallbackConfig, LlmConfig, OnAllFailedBehavior, SlotConfig, +pub use indexer::IndexerConfig; +pub use llm_pool::{ + FallbackBehavior, FallbackConfig, LlmConfig, OnAllFailedBehavior, RetryConfig, SlotConfig, + ThrottleConfig, }; -pub(crate) use metrics::{LlmMetricsConfig, MetricsConfig, RetrievalMetricsConfig}; -pub(crate) use retrieval::RetrievalConfig; -pub(crate) use storage::{CompressionAlgorithm, StorageConfig}; +pub use 
metrics::{LlmMetricsConfig, MetricsConfig, RetrievalMetricsConfig}; +pub use retrieval::RetrievalConfig; +pub use storage::{CompressionAlgorithm, StorageConfig}; +pub use vectorless_graph::DocumentGraphConfig; /// Main configuration for vectorless. /// -/// Users typically configure via [`EngineBuilder`](crate::client::EngineBuilder): +/// Users typically configure via [`EngineBuilder`](vectorless_engine::EngineBuilder): /// /// ```rust,no_run /// use vectorless::client::EngineBuilder; @@ -73,7 +75,7 @@ pub struct Config { /// Document graph configuration. #[serde(default)] - pub graph: crate::graph::DocumentGraphConfig, + pub graph: DocumentGraphConfig, } impl Default for Config { @@ -84,7 +86,7 @@ impl Default for Config { indexer: IndexerConfig::default(), retrieval: RetrievalConfig::default(), storage: StorageConfig::default(), - graph: crate::graph::DocumentGraphConfig::default(), + graph: DocumentGraphConfig::default(), } } } @@ -126,7 +128,7 @@ impl Config { } /// Set the document graph configuration. - pub fn with_graph(mut self, graph: crate::graph::DocumentGraphConfig) -> Self { + pub fn with_graph(mut self, graph: DocumentGraphConfig) -> Self { self.graph = graph; self } @@ -205,13 +207,24 @@ impl Config { } /// Configuration validation error. -#[derive(Debug, Clone, thiserror::Error)] -#[error("Configuration validation failed with {} error(s)", self.errors.len())] +#[derive(Debug, Clone)] pub struct ConfigValidationError { /// Validation errors. pub errors: Vec, } +impl std::fmt::Display for ConfigValidationError { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + write!( + f, + "Configuration validation failed with {} error(s)", + self.errors.len() + ) + } +} + +impl std::error::Error for ConfigValidationError {} + /// A single validation error. 
#[derive(Debug, Clone)] pub struct ValidationError { diff --git a/rust/src/config/types/retrieval.rs b/vectorless-core/vectorless-config/src/types/retrieval.rs similarity index 100% rename from rust/src/config/types/retrieval.rs rename to vectorless-core/vectorless-config/src/types/retrieval.rs diff --git a/rust/src/config/types/storage.rs b/vectorless-core/vectorless-config/src/types/storage.rs similarity index 100% rename from rust/src/config/types/storage.rs rename to vectorless-core/vectorless-config/src/types/storage.rs diff --git a/rust/src/config/validator.rs b/vectorless-core/vectorless-config/src/validator.rs similarity index 100% rename from rust/src/config/validator.rs rename to vectorless-core/vectorless-config/src/validator.rs diff --git a/vectorless-core/vectorless-document/Cargo.toml b/vectorless-core/vectorless-document/Cargo.toml new file mode 100644 index 00000000..ace36491 --- /dev/null +++ b/vectorless-core/vectorless-document/Cargo.toml @@ -0,0 +1,23 @@ +[package] +name = "vectorless-document" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +regex = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } +indextree = { workspace = true } +chrono = { workspace = true } +uuid = { workspace = true } + +[dev-dependencies] +tempfile = { workspace = true } + +[lints] +workspace = true diff --git a/vectorless-core/vectorless-document/src/format.rs b/vectorless-core/vectorless-document/src/format.rs new file mode 100644 index 00000000..78f6e52e --- /dev/null +++ b/vectorless-core/vectorless-document/src/format.rs @@ -0,0 +1,62 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Document format and sufficiency types. +//! +//! These types are used across multiple modules and are defined here +//! 
to avoid circular dependencies between crates. + +use serde::{Deserialize, Serialize}; + +/// Supported document formats. +#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)] +pub enum DocumentFormat { + /// Markdown files (.md, .markdown) + Markdown, + /// PDF files (.pdf) + Pdf, +} + +impl DocumentFormat { + /// Detect format from file extension. + pub fn from_extension(ext: &str) -> Option<Self> { + match ext.to_lowercase().as_str() { + "md" | "markdown" => Some(Self::Markdown), + "pdf" => Some(Self::Pdf), + _ => None, + } + } + + /// Get the file extension for this format. + pub fn extension(&self) -> &'static str { + match self { + Self::Markdown => "md", + Self::Pdf => "pdf", + } + } + + /// All supported file extensions (lowercase). + /// + /// Single source of truth — used by directory scanning to + /// discover indexable files. + pub const SUPPORTED_EXTENSIONS: &'static [&'static str] = &["md", "pdf"]; +} + +/// Sufficiency level for incremental retrieval. +#[derive(Debug, Clone, Copy, PartialEq, Eq)] +pub enum SufficiencyLevel { + /// Information is sufficient, stop retrieving. + Sufficient, + + /// Partial information, can continue if needed. + PartialSufficient, + + /// Information is insufficient, continue retrieving. + Insufficient, +} + +impl Default for SufficiencyLevel { + fn default() -> Self { + Self::Insufficient + } +} diff --git a/rust/src/document/mod.rs b/vectorless-core/vectorless-document/src/lib.rs similarity index 85% rename from rust/src/document/mod.rs rename to vectorless-core/vectorless-document/src/lib.rs index d3de3bfc..85ea1ff3 100644 --- a/rust/src/document/mod.rs +++ b/vectorless-core/vectorless-document/src/lib.rs @@ -16,6 +16,7 @@ //! - [`NodeReference`] - In-document reference (e.g., "see Appendix G") //! - [`RefType`] - Type of reference (Section, Appendix, Table, etc.)
+mod format; mod navigation; mod node; mod reasoning; @@ -24,7 +25,9 @@ mod serde_helpers; mod structure; mod toc; mod tree; +pub mod understanding; +pub use format::{DocumentFormat, SufficiencyLevel}; pub use navigation::{ChildRoute, DocCard, NavEntry, NavigationIndex, SectionCard}; pub use node::{NodeId, TreeNode}; pub use reasoning::{ @@ -35,3 +38,6 @@ pub use reference::ReferenceExtractor; pub use structure::{DocumentStructure, StructureNode}; pub use toc::{TocConfig, TocEntry, TocNode, TocView}; pub use tree::{DocumentTree, RetrievalIndex}; +pub use understanding::{ + Answer, Concept, Document, DocumentInfo, Evidence, IngestInput, ReasoningTrace, TraceStep, +}; diff --git a/rust/src/document/navigation.rs b/vectorless-core/vectorless-document/src/navigation.rs similarity index 99% rename from rust/src/document/navigation.rs rename to vectorless-core/vectorless-document/src/navigation.rs index dbfeadd4..348fef43 100644 --- a/rust/src/document/navigation.rs +++ b/vectorless-core/vectorless-document/src/navigation.rs @@ -234,7 +234,7 @@ pub struct SectionCard { #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; + use crate::tree::DocumentTree; fn build_small_tree() -> DocumentTree { // Root -> [Child1 (leaf), Child2 -> [Grandchild (leaf)]] diff --git a/rust/src/document/node.rs b/vectorless-core/vectorless-document/src/node.rs similarity index 100% rename from rust/src/document/node.rs rename to vectorless-core/vectorless-document/src/node.rs diff --git a/rust/src/document/reasoning.rs b/vectorless-core/vectorless-document/src/reasoning.rs similarity index 97% rename from rust/src/document/reasoning.rs rename to vectorless-core/vectorless-document/src/reasoning.rs index 2c4ab01b..533244de 100644 --- a/rust/src/document/reasoning.rs +++ b/vectorless-core/vectorless-document/src/reasoning.rs @@ -339,7 +339,7 @@ mod tests { #[test] fn test_builder_basic() { // Create a simple tree to get valid NodeIds - let mut tree = 
crate::document::DocumentTree::new("Root", "root content"); + let mut tree = crate::tree::DocumentTree::new("Root", "root content"); let child1 = tree.add_child(tree.root(), "Introduction", "intro content"); let child2 = tree.add_child(tree.root(), "Methods", "methods content"); @@ -370,7 +370,7 @@ mod tests { #[test] fn test_serialization_roundtrip_empty() { - let mut tree = crate::document::DocumentTree::new("Root", "content"); + let mut tree = crate::tree::DocumentTree::new("Root", "content"); let child = tree.add_child(tree.root(), "Section 1", "s1 content"); let mut builder = ReasoningIndexBuilder::new(); @@ -395,7 +395,7 @@ mod tests { #[test] fn test_serialization_roundtrip_with_hot_nodes() { - let mut tree = crate::document::DocumentTree::new("Root", ""); + let mut tree = crate::tree::DocumentTree::new("Root", ""); let root = tree.root(); let c1 = tree.add_child(root, "S1", "content 1"); let c2 = tree.add_child(root, "S2", "content 2"); @@ -426,7 +426,7 @@ mod tests { #[test] fn test_backward_compat_hot_nodes_empty_object() { // Simulate old JSON where hot_nodes was serialized as {} by derive. 
- let mut tree = crate::document::DocumentTree::new("Root", ""); + let mut tree = crate::tree::DocumentTree::new("Root", ""); let child = tree.add_child(tree.root(), "S1", "c"); let mut builder = ReasoningIndexBuilder::new(); diff --git a/rust/src/document/reference.rs b/vectorless-core/vectorless-document/src/reference.rs similarity index 100% rename from rust/src/document/reference.rs rename to vectorless-core/vectorless-document/src/reference.rs diff --git a/rust/src/document/serde_helpers.rs b/vectorless-core/vectorless-document/src/serde_helpers.rs similarity index 99% rename from rust/src/document/serde_helpers.rs rename to vectorless-core/vectorless-document/src/serde_helpers.rs index cb658c35..00495da7 100644 --- a/rust/src/document/serde_helpers.rs +++ b/vectorless-core/vectorless-document/src/serde_helpers.rs @@ -93,7 +93,7 @@ where #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; + use crate::tree::DocumentTree; /// Wrapper struct to test `#[serde(with)]` through serde_json round-trip. 
#[derive(Serialize, Deserialize, Debug)] diff --git a/rust/src/document/structure.rs b/vectorless-core/vectorless-document/src/structure.rs similarity index 100% rename from rust/src/document/structure.rs rename to vectorless-core/vectorless-document/src/structure.rs diff --git a/rust/src/document/toc.rs b/vectorless-core/vectorless-document/src/toc.rs similarity index 100% rename from rust/src/document/toc.rs rename to vectorless-core/vectorless-document/src/toc.rs diff --git a/rust/src/document/tree.rs b/vectorless-core/vectorless-document/src/tree.rs similarity index 99% rename from rust/src/document/tree.rs rename to vectorless-core/vectorless-document/src/tree.rs index 1659471b..4080c00f 100644 --- a/rust/src/document/tree.rs +++ b/vectorless-core/vectorless-document/src/tree.rs @@ -825,7 +825,7 @@ impl Default for DocumentTree { #[cfg(test)] mod tests { use super::*; - use crate::document::reference::{NodeReference, RefType}; + use crate::reference::{NodeReference, RefType}; #[test] fn test_children_with_refs_no_references() { diff --git a/vectorless-core/vectorless-document/src/understanding.rs b/vectorless-core/vectorless-document/src/understanding.rs new file mode 100644 index 00000000..4be8e29d --- /dev/null +++ b/vectorless-core/vectorless-document/src/understanding.rs @@ -0,0 +1,306 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Understanding types — the core objects that define the Document Understanding Engine. +//! +//! These types form the stable public contract: +//! - [`Document`] — the unified post-ingest artifact (internal first-class citizen) +//! - [`DocumentInfo`] — what `ingest()` returns to users +//! - [`Concept`] — key concept extracted from a document +//! - [`Answer`] — what `ask()` returns +//! - [`Evidence`] — proof trail for an answer +//! 
- [`ReasoningTrace`] / [`TraceStep`] — always-mandatory reasoning trace + +use serde::{Deserialize, Serialize}; + +use super::toc::TocNode; + +// --------------------------------------------------------------------------- +// Document — unified post-ingest artifact +// --------------------------------------------------------------------------- + +/// An understood document — the core artifact of the understanding phase. +/// +/// This is what `ingest()` produces internally and what `ask()` consumes. +/// It unifies tree + navigation index + reasoning index + summary + concepts +/// into a single first-class type, replacing the previous loose coupling of +/// `DocContext { &tree, &nav, &reasoning }`. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct Document { + /// Unique document identifier. + pub doc_id: String, + /// Document name/title. + pub name: String, + /// Document format ("pdf", "markdown", "docx"). + pub format: String, + /// Source file path (if indexed from a file). + #[serde(default, skip_serializing_if = "Option::is_none")] + pub source_path: Option<String>, + + // ── Three indexes (engine internal) ── + /// Hierarchical semantic tree. + pub tree: super::tree::DocumentTree, + /// Pre-computed navigation structure. + pub nav_index: super::navigation::NavigationIndex, + /// Keyword / topic / section summaries. + pub reasoning_index: super::reasoning::ReasoningIndex, + + // ── Understanding results (ingest stage output) ── + /// Document-level summary. + pub summary: String, + /// Key concepts the engine identified. + #[serde(default)] + pub concepts: Vec<Concept>, + + // ── Metadata ── + /// Page count (for PDFs). + #[serde(default, skip_serializing_if = "Option::is_none")] + pub page_count: Option<u32>, + /// Number of sections in the tree.
+ #[serde(default)] + pub section_count: usize, +} + +// --------------------------------------------------------------------------- +// DocumentInfo — what ingest() returns to users +// --------------------------------------------------------------------------- + +/// The engine's understanding of a document — returned by `ingest()`. +/// +/// Rich enough for users to confirm the engine "got it right": +/// summary, structure (TOC), and key concepts. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct DocumentInfo { + /// Unique document identifier. + pub doc_id: String, + /// Document name. + pub name: String, + /// Document format ("pdf", "markdown", "docx"). + pub format: String, + /// Document-level summary — what this document is about. + pub summary: String, + /// Table of contents — the document's structure as the engine sees it. + pub structure: TocNode, + /// Key concepts the engine identified. + pub concepts: Vec<Concept>, + /// Number of sections in the document. + pub section_count: usize, + /// Page count (for PDFs). + pub page_count: Option<u32>, +} + +impl Document { + /// Get node content by ID (Agent `cat` command). + pub fn cat(&self, node_id: super::node::NodeId) -> Option<&str> { + self.tree.get(node_id).map(|n| n.content.as_str()) + } + + /// Find nodes containing a keyword in title or content. + pub fn find(&self, keyword: &str) -> Vec<(super::node::NodeId, &str)> { + let kw = keyword.to_lowercase(); + self.tree + .traverse() + .iter() + .filter_map(|&id| { + let node = self.tree.get(id)?; + if node.title.to_lowercase().contains(&kw) + || node.content.to_lowercase().contains(&kw) + { + Some((id, node.title.as_str())) + } else { + None + } + }) + .collect() + } + + /// Get node title by ID. + pub fn node_title(&self, node_id: super::node::NodeId) -> Option<&str> { + self.tree.get(node_id).map(|n| n.title.as_str()) + } + + /// Number of sections in the tree.
+ pub fn section_count(&self) -> usize { + self.section_count + } + + /// Produce the public DocumentInfo view of this document. + pub fn info(&self) -> DocumentInfo { + let toc = super::toc::TocView::new().generate(&self.tree); + DocumentInfo { + doc_id: self.doc_id.clone(), + name: self.name.clone(), + format: self.format.clone(), + summary: self.summary.clone(), + structure: toc, + concepts: self.concepts.clone(), + section_count: self.section_count, + page_count: self.page_count, + } + } +} + +// --------------------------------------------------------------------------- +// Concept +// --------------------------------------------------------------------------- + +/// A key concept extracted from a document. +/// +/// Produced during the ingest pipeline's final concept extraction step. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct Concept { + /// Concept name (e.g., "capacitor derating"). + pub name: String, + /// One-sentence explanation. + pub summary: String, + /// Which sections this concept appears in. + pub sections: Vec<String>, +} + +// --------------------------------------------------------------------------- +// Answer — what ask() returns +// --------------------------------------------------------------------------- + +/// The result of `ask()` — a reasoned answer with evidence and trace. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct Answer { + /// The answer content. + pub content: String, + /// Evidence supporting the answer. + pub evidence: Vec<Evidence>, + /// Confidence score (0.0–1.0). + pub confidence: f32, + /// Reasoning trace — how the agent arrived at this answer. Always present. + pub trace: ReasoningTrace, +} + +// --------------------------------------------------------------------------- +// Evidence +// --------------------------------------------------------------------------- + +/// A piece of evidence supporting an answer — with source attribution.
+#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct Evidence { + /// Original document text. + pub content: String, + /// Navigation path (e.g., "Root/Chapter 3/Section 3.2"). + pub source_path: String, + /// Which document this evidence came from. + pub doc_name: String, + /// Relevance to the question (0.0–1.0). + pub relevance: f32, +} + +// --------------------------------------------------------------------------- +// ReasoningTrace — always mandatory +// --------------------------------------------------------------------------- + +/// Reasoning trace — how the agent arrived at the answer. Always present. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct ReasoningTrace { + /// The steps the agent took. + pub steps: Vec<TraceStep>, +} + +impl ReasoningTrace { + /// Create an empty trace. + pub fn empty() -> Self { + Self { steps: Vec::new() } + } + + /// Create a trace with a single step. + pub fn single(action: impl Into<String>, observation: impl Into<String>, round: u32) -> Self { + Self { + steps: vec![TraceStep { + action: action.into(), + observation: observation.into(), + round, + }], + } + } + + /// Add a step to the trace. + pub fn push(&mut self, action: impl Into<String>, observation: impl Into<String>, round: u32) { + self.steps.push(TraceStep { + action: action.into(), + observation: observation.into(), + round, + }); + } +} + +/// A single step in the reasoning trace. +#[derive(Debug, Clone, Serialize, Deserialize)] +pub struct TraceStep { + /// What the agent did (e.g., "cd Chapter 3"). + pub action: String, + /// What the agent observed (e.g., "Found 5 sections about..."). + pub observation: String, + /// Which round this step was in. + pub round: u32, +} + +// --------------------------------------------------------------------------- +// IngestInput — what ingest() takes +// --------------------------------------------------------------------------- + +/// Input to `ingest()` — the document to be understood.
+#[derive(Debug, Clone)] +pub enum IngestInput { + /// Index from a file path. + Path(std::path::PathBuf), + /// Index from raw bytes. + Bytes { + /// Document name. + name: String, + /// Raw document bytes. + data: Vec<u8>, + /// Document format. + format: super::format::DocumentFormat, + }, + /// Index from a text string. + Text { + /// Document name. + name: String, + /// Document content. + content: String, + }, +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_reasoning_trace_empty() { + let trace = ReasoningTrace::empty(); + assert!(trace.steps.is_empty()); + } + + #[test] + fn test_reasoning_trace_single() { + let trace = ReasoningTrace::single("cd Chapter 3", "Found 5 sections", 1); + assert_eq!(trace.steps.len(), 1); + assert_eq!(trace.steps[0].action, "cd Chapter 3"); + assert_eq!(trace.steps[0].round, 1); + } + + #[test] + fn test_reasoning_trace_push() { + let mut trace = ReasoningTrace::empty(); + trace.push("ls", "Root with 3 children", 0); + trace.push("cd Chapter 2", "Found target section", 1); + assert_eq!(trace.steps.len(), 2); + } + + #[test] + fn test_concept_serialization() { + let concept = Concept { + name: "capacitor derating".into(), + summary: "Reducing capacitor specs for reliability".into(), + sections: vec!["Section 3.2".into()], + }; + let json = serde_json::to_string(&concept).unwrap(); + assert!(json.contains("capacitor derating")); + } +} diff --git a/vectorless-core/vectorless-engine/Cargo.toml b/vectorless-core/vectorless-engine/Cargo.toml new file mode 100644 index 00000000..acf7c7c6 --- /dev/null +++ b/vectorless-core/vectorless-engine/Cargo.toml @@ -0,0 +1,37 @@ +[package] +name = "vectorless-engine" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-agent = { path = "../vectorless-agent" } +vectorless-config = { path = "../vectorless-config" }
+vectorless-document = { path = "../vectorless-document" } +vectorless-error = { path = "../vectorless-error" } +vectorless-events = { path = "../vectorless-events" } +vectorless-graph = { path = "../vectorless-graph" } +vectorless-index = { path = "../vectorless-index" } +vectorless-llm = { path = "../vectorless-llm" } +vectorless-metrics = { path = "../vectorless-metrics" } +vectorless-retrieval = { path = "../vectorless-retrieval" } +vectorless-rerank = { path = "../vectorless-rerank" } +vectorless-storage = { path = "../vectorless-storage" } +vectorless-utils = { path = "../vectorless-utils" } +tokio = { workspace = true } +tracing = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } +uuid = { workspace = true } +chrono = { workspace = true } +thiserror = { workspace = true } +parking_lot = { workspace = true } +async-trait = { workspace = true } +futures = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/client/builder.rs b/vectorless-core/vectorless-engine/src/builder.rs similarity index 95% rename from rust/src/client/builder.rs rename to vectorless-core/vectorless-engine/src/builder.rs index b3ccc6ea..5fd2d6b9 100644 --- a/rust/src/client/builder.rs +++ b/vectorless-core/vectorless-engine/src/builder.rs @@ -6,10 +6,13 @@ //! This module provides [`EngineBuilder`] for configuring and building //! [`Engine`] instances with sensible defaults. -use crate::{ - client::engine::Engine, client::retriever::RetrieverClient, config::Config, - events::EventEmitter, metrics::MetricsHub, storage::Workspace, -}; +use vectorless_config::Config; +use vectorless_events::EventEmitter; +use vectorless_metrics::MetricsHub; +use vectorless_storage::Workspace; + +use super::engine::Engine; +use super::retriever::RetrieverClient; /// Builder for creating a [`Engine`] client. 
/// @@ -190,10 +193,10 @@ impl EngineBuilder { // Build LlmPool from unified LlmConfig (shared metrics hub) let metrics_hub = std::sync::Arc::new(MetricsHub::with_defaults()); - let pool = crate::llm::LlmPool::from_config(&config.llm, Some(metrics_hub.clone())); + let pool = vectorless_llm::LlmPool::from_config(&config.llm, Some(metrics_hub.clone())); // Indexer uses pool.index() - let indexer = crate::client::indexer::IndexerClient::with_llm(pool.index().clone()); + let indexer = super::indexer::IndexerClient::with_llm(pool.index().clone()); // Retriever uses pool.retrieval() via agent system let retriever = RetrieverClient::new(pool.retrieval().clone()); diff --git a/rust/src/client/engine.rs b/vectorless-core/vectorless-engine/src/engine.rs similarity index 54% rename from rust/src/client/engine.rs rename to vectorless-core/vectorless-engine/src/engine.rs index 47ad9bf3..e1833f43 100644 --- a/rust/src/client/engine.rs +++ b/vectorless-core/vectorless-engine/src/engine.rs @@ -3,16 +3,17 @@ //! Main Engine client - the entry point for vectorless. //! -//! The Engine provides a unified API for document indexing and retrieval: +//! The Engine provides a unified API for the Document Understanding Engine: //! -//! - [`index`](Engine::index) — Index documents from files, content, or bytes -//! - [`query`](Engine::query) — Query documents using natural language -//! - [`query_stream`](Engine::query_stream) — Query with streaming results +//! - [`ingest`](Engine::ingest) — Understand a document (parse, analyze, persist) +//! - [`ask`](Engine::ask) — Ask a question (returns answer + evidence + trace) +//! - [`forget`](Engine::forget) — Remove a document +//! - [`list_documents`](Engine::list_documents) — List all understood documents //! //! # Example //! //! ```rust,no_run -//! use vectorless::client::{EngineBuilder, IndexContext, QueryContext}; +//! use vectorless::{EngineBuilder, IngestInput}; //! //! # #[tokio::main] //! 
# async fn main() -> Result<(), Box<dyn std::error::Error>> { @@ -23,16 +24,22 @@ //! .build() //! .await?; //! -//! // Index a document -//! let result = engine.index(IndexContext::from_path("./document.md")).await?; -//! let doc_id = result.doc_id().unwrap(); +//! // Understand a document +//! let doc = engine.ingest(IngestInput::Path("./document.md".into())).await?; +//! println!("{}: {}", doc.name, doc.summary); //! -//! // Query -//! let result = engine.query( -//! QueryContext::new("What is this?").with_doc_ids(vec![doc_id.to_string()]) -//! ).await?; +//! // Ask a question +//! let answer = engine.ask("What is this?", &[doc.doc_id.clone()]).await?; +//! println!("{}", answer.content); //! -//! println!("Found: {}", result.content); +//! // List all understood documents +//! let docs = engine.list_documents().await?; +//! for d in &docs { +//! println!("{}: {}", d.name, d.summary); +//! } +//! +//! // Forget a document +//! engine.forget(&doc.doc_id).await?; //! # Ok(()) //! # } //! ``` @@ -42,26 +49,24 @@ use std::{collections::HashMap, sync::Arc}; use futures::StreamExt; use tracing::{info, warn}; -use crate::{ - DocumentTree, Error, - config::Config, - error::Result, - events::EventEmitter, - index::{ - PipelineOptions, - incremental::{self, IndexAction}, - }, - metrics::MetricsHub, - retrieval::RetrieveEventReceiver, - storage::{PersistedDocument, Workspace}, +use vectorless_config::Config; +use vectorless_document::{ + Answer, Document as UnderstandingDocument, DocumentTree, Evidence, IngestInput, ReasoningTrace, +}; +use vectorless_error::{Error, Result}; +use vectorless_events::EventEmitter; +use vectorless_index::{ + PipelineOptions, + incremental::{self, IndexAction}, +}; +use vectorless_metrics::MetricsHub; +use vectorless_storage::{PersistedDocument, Workspace}; + use super::{ index_context::{IndexContext, IndexSource}, indexer::IndexerClient, - query_context::{QueryContext, QueryScope}, retriever::RetrieverClient, - types::{DocumentInfo, FailedItem, IndexItem, IndexMode,
IndexResult, QueryResult}, + types::{FailedItem, IndexItem, IndexMode, IndexResult}, workspace::WorkspaceClient, }; @@ -132,18 +137,16 @@ impl Engine { } // ============================================================ - // Document Indexing + // Ingest Pipeline (private — called by ingest()) // ============================================================ - /// Index one or more documents. - /// - /// Accepts an [`IndexContext`] that specifies the source (file path, - /// directory, content string, or bytes) and indexing options. - /// Multiple sources are indexed in parallel. + /// Run the ingest pipeline: parse, compile, persist. /// + /// Accepts an [`IndexContext`] that specifies the source and options. + /// Multiple sources are processed in parallel. /// Returns an [`IndexResult`] containing the indexed document metadata. #[tracing::instrument(skip_all, fields(sources = ctx.sources.len()))] - pub async fn index(&self, ctx: IndexContext) -> Result<IndexResult> { + async fn ingest_pipeline(&self, ctx: IndexContext) -> Result<IndexResult> { if ctx.is_empty() { return Err(Error::Config("No document sources provided".into())); } @@ -407,392 +410,136 @@ impl Engine { } // ============================================================ - // Document Querying + // Understanding Engine API // ============================================================ - /// Query documents. + /// Understand a document — parse, analyze, and persist. /// - /// Accepts a [`QueryContext`] that specifies the query text and scope - /// (single document, multiple documents, or entire workspace).
- #[tracing::instrument(skip_all, fields(query = %ctx.query))] - pub async fn query(&self, ctx: QueryContext) -> Result<QueryResult> { - let timeout_secs = ctx.timeout_secs; - - self.with_timeout(timeout_secs, async move { - let doc_ids = self.resolve_scope(&ctx.scope).await?; - info!(doc_count = doc_ids.len(), "Resolving documents for query"); - - let (documents, failed) = self.load_documents(&doc_ids).await?; - info!( - loaded = documents.len(), - failed = failed.len(), - "Documents loaded" - ); - if documents.is_empty() { - return Err(Error::Config(format!( - "No documents available for query: {} failures", - failed.len() - ))); + /// Returns a [`vectorless_document::DocumentInfo`] with summary, structure, and concepts. + /// The engine builds a full understanding including tree, navigation index, + /// reasoning index, summary, and key concepts. + pub async fn ingest(&self, input: IngestInput) -> Result<vectorless_document::DocumentInfo> { + let ctx = match &input { + IngestInput::Path(path) => IndexContext::from_path(path), + IngestInput::Bytes { data, format, .. } => { + IndexContext::from_bytes(data.clone(), *format) } + IngestInput::Text { content, .. } => IndexContext::from_content( + content, + vectorless_index::parse::DocumentFormat::Markdown, + ), + }; - let skip_analysis = !ctx.force_analysis; - let mut result = self - .retriever - .query(&documents, &ctx.query, skip_analysis) - .await?; - result.failed.extend(failed); - Ok(result) - }) - .await + let result = self.ingest_pipeline(ctx).await?; + + let doc_id = result + .doc_id() + .ok_or_else(|| Error::Config("ingest produced no results".into()))? + .to_string(); + + // Load the persisted document to build DocumentInfo + let persisted = self + .workspace + .load(&doc_id) + .await? + .ok_or_else(|| Error::Config("Document not found after ingest".into()))?; + + let doc = Self::persisted_to_understanding_document(persisted); + Ok(doc.info()) } - /// Query a document with streaming results.
+ /// Ask a question — returns a reasoned answer with evidence and trace. /// - /// Returns a receiver that yields retrieval events - /// as the retrieval agent progresses through navigation. + /// - `input`: the question (required) + /// - `ids`: document IDs to search. Empty = search all documents. /// - /// Supports single-document and multi-document scope. - /// Events are translated from the agent's internal event stream - /// into the public `RetrieveEventReceiver` stream. - pub async fn query_stream(&self, ctx: QueryContext) -> Result<RetrieveEventReceiver> { - let doc_ids = self.resolve_scope(&ctx.scope).await?; - let query = ctx.query.clone(); - - // Load all requested documents (need owned PersistedDocument for spawned task) - let mut docs = Vec::new(); - for doc_id in &doc_ids { - let doc = match self.workspace.load(doc_id).await? { - Some(d) => d, - None => return Err(Error::Config(format!("Document not found: {}", doc_id))), - }; - docs.push((doc_id.clone(), doc)); - } + /// Always returns an [`Answer`] with content, evidence, confidence, and + /// a mandatory reasoning trace.
+ pub async fn ask(&self, input: &str, ids: &[String]) -> Result<Answer> { + // Resolve doc IDs + let doc_ids = if ids.is_empty() { + let docs = self.list_documents().await?; + if docs.is_empty() { + return Err(Error::Config("Workspace is empty".into())); + } + docs.into_iter().map(|d| d.doc_id).collect::<Vec<_>>() + } else { + ids.to_vec() + }; - // Create agent event channel - let (agent_tx, mut agent_rx) = - crate::agent::events::channel(crate::agent::events::DEFAULT_AGENT_EVENT_BOUND); - let (retrieve_tx, retrieve_rx) = - crate::retrieval::stream::channel(crate::retrieval::stream::DEFAULT_STREAM_BOUND); - - // Spawn a task that translates AgentEvents → RetrieveEvents - tokio::spawn(async move { - use crate::agent::AgentEvent; - use crate::retrieval::stream::RetrieveEvent; - - while let Some(event) = agent_rx.recv().await { - let translated = match event { - // ── Query Understanding ── - AgentEvent::QueryUnderstandingStarted { query } => RetrieveEvent::Started { - query, - strategy: "query_understanding".to_string(), - }, - AgentEvent::QueryUnderstandingCompleted { query, .. } => { - RetrieveEvent::StageCompleted { - stage: format!("query_understanding: {}", query), - elapsed_ms: 0, - } - } + // Load documents + let (documents, failed) = self.load_documents(&doc_ids).await?; + if documents.is_empty() { + return Err(Error::Config(format!( + "No documents available: {} failures", + failed.len() + ))); + } - // ── Orchestrator ── - AgentEvent::OrchestratorStarted { - query, - doc_count, - skip_analysis, - } => RetrieveEvent::Started { - query, - strategy: if skip_analysis { - "orchestrator_skip_analysis".to_string() - } else { - format!("orchestrator({}_docs)", doc_count) - }, - }, - AgentEvent::OrchestratorAnalyzing { - doc_count, - keywords, - } => RetrieveEvent::StageCompleted { - stage: format!( - "orchestrator_analyzing_{}_docs_kw_{}", - doc_count, - keywords.len() - ), - elapsed_ms: 0, - }, - AgentEvent::WorkerDispatched { - doc_idx, - doc_name, - task, - ..
- } => RetrieveEvent::StageCompleted { - stage: format!("dispatch_{}_{}_{}", doc_idx, doc_name, task.len().min(30)), - elapsed_ms: 0, - }, - AgentEvent::WorkerCompleted { - doc_idx, - doc_name, - evidence_count, - rounds_used, - llm_calls, - success, - } => RetrieveEvent::StageCompleted { - stage: format!( - "worker_{}_{}_done_e{}_r{}_l{}_{}", - doc_idx, doc_name, evidence_count, rounds_used, llm_calls, success - ), - elapsed_ms: 0, - }, - AgentEvent::OrchestratorEvaluated { - sufficient, - evidence_count, - missing_info: _, - } => RetrieveEvent::SufficiencyCheck { - level: if sufficient { - crate::retrieval::SufficiencyLevel::Sufficient - } else { - crate::retrieval::SufficiencyLevel::Insufficient - }, - tokens: evidence_count, - }, - AgentEvent::OrchestratorReplanning { - reason, - evidence_count, - } => RetrieveEvent::StageCompleted { - stage: format!( - "orchestrator_replan_{}_e{}", - &reason[..reason.len().min(30)], - evidence_count - ), - elapsed_ms: 0, - }, - AgentEvent::OrchestratorCompleted { - evidence_count, - total_llm_calls, - dispatch_rounds, - } => RetrieveEvent::StageCompleted { - stage: format!( - "orchestrator_done_e{}_l{}_r{}", - evidence_count, total_llm_calls, dispatch_rounds - ), - elapsed_ms: 0, - }, - - // ── Worker ── - AgentEvent::WorkerStarted { - doc_name, - task: _, - max_rounds, - } => RetrieveEvent::StageCompleted { - stage: format!("worker_started_{}_r{}", doc_name, max_rounds), - elapsed_ms: 0, - }, - AgentEvent::WorkerPlanGenerated { doc_name, plan_len } => { - RetrieveEvent::StageCompleted { - stage: format!("plan_{}_{}chars", doc_name, plan_len), - elapsed_ms: 0, - } - } - AgentEvent::WorkerRound { - doc_name, - round, - command, - success: _, - elapsed_ms, - } => RetrieveEvent::StageCompleted { - stage: format!("round_{}_{}_{}", doc_name, round, command), - elapsed_ms, - }, - AgentEvent::EvidenceCollected { - doc_name, - node_title, - source_path, - content_len, - total_evidence: _, - } => RetrieveEvent::ContentFound { - 
node_id: source_path, - title: format!("[{}] {}", doc_name, node_title), - preview: String::new(), - score: if content_len > 0 { 0.8 } else { 0.0 }, - }, - AgentEvent::WorkerSufficiencyCheck { - doc_name: _, - sufficient, - evidence_count, - .. - } => RetrieveEvent::SufficiencyCheck { - level: if sufficient { - crate::retrieval::SufficiencyLevel::Sufficient - } else { - crate::retrieval::SufficiencyLevel::Insufficient - }, - tokens: evidence_count, - }, - AgentEvent::WorkerReplan { - doc_name, - missing_info, - plan_len, - } => RetrieveEvent::StageCompleted { - stage: format!( - "replan_{}_{}_{}chars", - doc_name, - &missing_info[..missing_info.len().min(30)], - plan_len - ), - elapsed_ms: 0, - }, - AgentEvent::WorkerBudgetWarning { - doc_name, - warning_type, - round, - } => RetrieveEvent::StageCompleted { - stage: format!( - "budget_warning_{}_{}_round_{}", - doc_name, warning_type, round - ), - elapsed_ms: 0, - }, - AgentEvent::WorkerDone { - doc_name, - evidence_count, - rounds_used, - llm_calls, - budget_exhausted: _, - plan_generated: _, - } => RetrieveEvent::StageCompleted { - stage: format!( - "worker_done_{}_e{}_r{}_l{}", - doc_name, evidence_count, rounds_used, llm_calls - ), - elapsed_ms: 0, - }, - - // ── Answer Pipeline ── - AgentEvent::AnswerStarted { - evidence_count, - multi_doc, - } => RetrieveEvent::StageCompleted { - stage: format!( - "answer_start_{}_e{}", - if multi_doc { "multi" } else { "single" }, - evidence_count - ), - elapsed_ms: 0, - }, - AgentEvent::AnswerCompleted { - answer_len, - confidence, - } => RetrieveEvent::StageCompleted { - stage: format!("synthesis_{}_{}chars", confidence, answer_len), - elapsed_ms: 0, - }, - - // ── Terminal ── - AgentEvent::Completed { - evidence_count, - llm_calls, - answer_len, - } => { - let response = crate::retrieval::RetrieveResponse { - results: Vec::new(), - content: String::new(), - confidence: if evidence_count > 0 { 0.8 } else { 0.0 }, - is_sufficient: true, - strategy_used: 
format!("agent(l={},a={})", llm_calls, answer_len), - reasoning_chain: crate::retrieval::ReasoningChain::default(), - tokens_used: answer_len, - }; - let _ = retrieve_tx - .send(RetrieveEvent::Completed { response }) - .await; - break; // Completed is terminal - } - AgentEvent::Error { stage, message } => { - let _ = retrieve_tx - .send(RetrieveEvent::Error { - message: format!("[{}] {}", stage, message), - }) - .await; - break; // Error is terminal - } - }; - - // For non-terminal events, send the translated event - if !matches!( - translated, - RetrieveEvent::Completed { .. } | RetrieveEvent::Error { .. } - ) { - if retrieve_tx.send(translated).await.is_err() { - break; // Receiver dropped - } - } - } - }); + // Build DocContexts from Documents and dispatch + let doc_contexts: Vec<vectorless_agent::DocContext> = documents + .iter() + .map(|doc| vectorless_agent::DocContext { + tree: &doc.tree, + nav_index: &doc.nav_index, + reasoning_index: &doc.reasoning_index, + doc_name: &doc.name, + }) + .collect(); + + let skip_analysis = !ids.is_empty(); + let scope = if skip_analysis { + vectorless_agent::Scope::Specified(doc_contexts) + } else { + vectorless_agent::Scope::Workspace(vectorless_agent::WorkspaceContext::new( + doc_contexts, + )) + }; - // Run the agent in a background task + let emitter = vectorless_agent::EventEmitter::noop(); let config = self.retriever.config().clone(); let llm = self.retriever.llm().clone(); - let emitter = crate::agent::EventEmitter::new(agent_tx); - let metrics_hub = Arc::clone(&self.metrics_hub); - let start = std::time::Instant::now(); - - tokio::spawn(async move { - // Prepare owned indices (fill defaults for missing) - let owned_docs: Vec<( - String, - crate::storage::PersistedDocument, - crate::document::NavigationIndex, - crate::document::ReasoningIndex, - )> = docs - .into_iter() - .map(|(id, doc)| { - let nav = doc.navigation_index.clone().unwrap_or_default(); - let ridx = doc.reasoning_index.clone().unwrap_or_default(); - (id, doc, nav, ridx) - }) - 
.collect(); - - // All streaming queries are user-specified docs → always use Scope::Specified - let doc_contexts: Vec<crate::agent::DocContext> = owned_docs - .iter() - .map(|(id, doc, nav, ridx)| crate::agent::DocContext { - tree: &doc.tree, - nav_index: nav, - reasoning_index: ridx, - doc_name: id.as_str(), - }) - .collect(); - let scope = crate::agent::Scope::Specified(doc_contexts); - let result = - crate::retrieval::dispatcher::dispatch(&query, scope, &config, &llm, &emitter) - .await; - - // Bridge agent metrics into global MetricsHub - if let Ok(output) = result { - let m = &output.metrics; - let elapsed = start.elapsed(); - metrics_hub.record_retrieval_query( - m.rounds_used as u64, - m.nodes_visited as u64, - elapsed.as_millis() as u64, - ); - } - }); + let output = + vectorless_retrieval::dispatcher::dispatch(input, scope, &config, &llm, &emitter) + .await?; - Ok(retrieve_rx) + // Convert Output -> Answer + Ok(Self::output_to_answer(&output)) } - // ============================================================ - // Document Management - // ============================================================ - - /// Get a list of all indexed documents. - pub async fn list(&self) -> Result<Vec<DocumentInfo>> { - self.workspace.list().await + /// Remove a document from the workspace. + pub async fn forget(&self, doc_id: &str) -> Result<()> { + self.workspace.remove(doc_id).await?; + Ok(()) } - /// Remove a document from the workspace. - pub async fn remove(&self, doc_id: &str) -> Result<bool> { - self.workspace.remove(doc_id).await + /// List all understood documents. + /// + /// Returns [`Vec<DocumentInfo>`] with summary, structure, and concepts + /// for each document.
+ pub async fn list_documents(&self) -> Result<Vec<vectorless_document::DocumentInfo>> { + let ids = self.workspace.inner().list_documents().await; + let mut result = Vec::new(); + for id in ids { + match self.workspace.load(&id).await { + Ok(Some(persisted)) => { + result.push(Self::persisted_to_understanding_document(persisted).info()); + } + Ok(None) => { + tracing::warn!(doc_id = %id, "Document in index but not in storage"); + } + Err(e) => { + tracing::warn!(doc_id = %id, error = %e, "Failed to load document"); + } + } + } + Ok(result) } + // ============================================================ + // Utility Methods + // ============================================================ + /// Check if a document exists in the workspace. pub async fn exists(&self, doc_id: &str) -> Result<bool> { self.workspace.exists(doc_id).await @@ -809,18 +556,73 @@ /// /// The graph is automatically rebuilt after indexing documents. /// Returns `None` if no graph has been built yet. - pub async fn get_graph(&self) -> Result<Option<crate::graph::DocumentGraph>> { + pub async fn get_graph(&self) -> Result<Option<vectorless_graph::DocumentGraph>> { self.workspace.get_graph().await } /// Generate a complete metrics report. /// - /// Returns a [`MetricsReport`](crate::metrics::MetricsReport) containing + /// Returns a [`MetricsReport`](vectorless_metrics::MetricsReport) containing /// LLM usage and retrieval operation metrics. - pub fn metrics_report(&self) -> crate::metrics::MetricsReport { + pub fn metrics_report(&self) -> vectorless_metrics::MetricsReport { self.metrics_hub.generate_report() } + // ============================================================ + // Internal: type conversions + // ============================================================ + + /// Convert a PersistedDocument to a Document (understanding type).
+ fn persisted_to_understanding_document(persisted: PersistedDocument) -> UnderstandingDocument { + let nav_index = persisted.navigation_index.unwrap_or_default(); + let reasoning_index = persisted.reasoning_index.unwrap_or_default(); + let tree = persisted.tree; + + let section_count = tree.node_count(); + + UnderstandingDocument { + doc_id: persisted.meta.id, + name: persisted.meta.name, + format: persisted.meta.format, + source_path: persisted + .meta + .source_path + .as_ref() + .map(|p| p.to_string_lossy().to_string()), + tree, + nav_index, + reasoning_index, + summary: persisted.meta.description.unwrap_or_default(), + concepts: persisted.concepts, + page_count: persisted.meta.page_count, + section_count, + } + } + + /// Convert agent Output to public Answer type. + fn output_to_answer(output: &vectorless_agent::Output) -> Answer { + // Build evidence + let evidence: Vec<Evidence> = output + .evidence + .iter() + .map(|e| Evidence { + content: e.content.clone(), + source_path: e.source_path.clone(), + doc_name: e.doc_name.clone().unwrap_or_default(), + relevance: 0.0, + }) + .collect(); + + Answer { + content: output.answer.clone(), + evidence, + confidence: output.confidence, + trace: ReasoningTrace { + steps: output.trace_steps.clone(), + }, + } + } + // ============================================================ // Internal // ============================================================ @@ -829,23 +631,13 @@ impl Engine { async fn load_documents( &self, doc_ids: &[String], - ) -> Result<( - Vec<( - crate::document::DocumentTree, - crate::document::NavigationIndex, - crate::document::ReasoningIndex, - String, - )>, - Vec<FailedItem>, - )> { + ) -> Result<(Vec<UnderstandingDocument>, Vec<FailedItem>)> { let mut documents = Vec::new(); let mut failed = Vec::new(); for doc_id in doc_ids { match self.workspace.load(doc_id).await { Ok(Some(doc)) => { - let nav_index = doc.navigation_index.unwrap_or_default(); - let reasoning_index = doc.reasoning_index.unwrap_or_default(); - documents.push((doc.tree, nav_index,
reasoning_index, doc_id.clone())); + documents.push(Self::persisted_to_understanding_document(doc)); } Ok(None) => { failed.push(FailedItem::new(doc_id, "Document not found")); @@ -875,20 +667,6 @@ impl Engine { } } - /// Resolve QueryScope into a list of document IDs. - async fn resolve_scope(&self, scope: &QueryScope) -> Result<Vec<String>> { - match scope { - QueryScope::Documents(ids) => Ok(ids.clone()), - QueryScope::Workspace => { - let docs = self.list().await?; - if docs.is_empty() { - return Err(Error::Config("Workspace is empty".to_string())); - } - Ok(docs.into_iter().map(|d| d.id).collect()) - } - } - } - /// Build pipeline options for pipeline execution (with checkpoint dir). /// /// This is the single source of truth for pipeline configuration. @@ -897,13 +675,13 @@ impl Engine { options: &super::types::IndexOptions, source: &IndexSource, ) -> PipelineOptions { - use crate::index::{IndexMode, ReasoningIndexConfig, SummaryStrategy}; + use vectorless_index::{IndexMode, ReasoningIndexConfig, SummaryStrategy}; let format = match source { IndexSource::Path(path) => self .indexer .detect_format_from_path(path) - .unwrap_or(crate::index::parse::DocumentFormat::Markdown), + .unwrap_or(vectorless_index::parse::DocumentFormat::Markdown), IndexSource::Content { format, .. } => *format, IndexSource::Bytes { format, ..
} => *format, }; @@ -912,8 +690,8 @@ impl Engine { PipelineOptions { mode: match format { - crate::index::parse::DocumentFormat::Markdown => IndexMode::Markdown, - crate::index::parse::DocumentFormat::Pdf => IndexMode::Pdf, + vectorless_index::parse::DocumentFormat::Markdown => IndexMode::Markdown, + vectorless_index::parse::DocumentFormat::Pdf => IndexMode::Pdf, }, generate_ids: options.generate_ids, summary_strategy: if options.generate_summaries { @@ -927,7 +705,9 @@ impl Engine { enable_synonym_expansion: options.enable_synonym_expansion, ..ReasoningIndexConfig::default() }, - concurrency: self.config.llm.throttle.to_runtime_config(), + concurrency: vectorless_llm::throttle::ConcurrencyConfig::from( + &self.config.llm.throttle, + ), ..Default::default() } } @@ -967,8 +747,8 @@ impl Engine { return Ok(IndexAction::Skip(incremental::SkipInfo { doc_id: existing_id, name, - format: crate::index::parse::DocumentFormat::from_extension(&format_str) - .unwrap_or(crate::index::parse::DocumentFormat::Markdown), + format: vectorless_index::parse::DocumentFormat::from_extension(&format_str) + .unwrap_or(vectorless_index::parse::DocumentFormat::Markdown), description: desc, page_count: pages, })); @@ -985,8 +765,9 @@ impl Engine { None => return Ok(IndexAction::FullIndex { existing_id: None }), }; - let format = crate::index::parse::DocumentFormat::from_extension(&stored_doc.meta.format) - .unwrap_or(crate::index::parse::DocumentFormat::Markdown); + let format = + vectorless_index::parse::DocumentFormat::from_extension(&stored_doc.meta.format) + .unwrap_or(vectorless_index::parse::DocumentFormat::Markdown); let pipeline_options = self.build_pipeline_options(options, source); // If logic fingerprint changed, remove old doc before full reprocess @@ -1052,7 +833,7 @@ impl Engine { "Documents loaded for graph rebuild" ); - let mut builder = crate::graph::DocumentGraphBuilder::new(self.config.graph.clone()); + let mut builder = 
vectorless_graph::DocumentGraphBuilder::new(self.config.graph.clone()); for doc in &loaded_docs { let keywords = Self::extract_keywords_from_doc(&doc); builder.add_document( @@ -1074,7 +855,7 @@ impl Engine { Ok(()) } - /// Extract keyword → weight map from a persisted document's ReasoningIndex. + /// Extract keyword -> weight map from a persisted document's ReasoningIndex. fn extract_keywords_from_doc(doc: &PersistedDocument) -> HashMap<String, f32> { let mut keywords = HashMap::new(); if let Some(ref ri) = doc.reasoning_index { @@ -1109,9 +890,9 @@ impl std::fmt::Debug for Engine { #[cfg(test)] mod tests { use super::*; - use crate::client::types::IndexMode; + use crate::types::IndexMode; - // ── resolve_index_action Default mode ────────────────────────────────── + // -- resolve_index_action Default mode ---------------------------------------------- // We can't call resolve_index_action without a workspace, but we can // verify IndexMode equality logic used inside. @@ -1123,13 +904,13 @@ mod tests { assert_ne!(mode, IndexMode::Incremental); } - // ── build_index_item ────────────────────────────────────────────────── + // -- build_index_item ---------------------------------------------------------------- - // Build_index_item only transforms data — no I/O. - use crate::client::indexed_document::IndexedDocument; + // Build_index_item only transforms data -- no I/O.
+ use crate::indexed_document::IndexedDocument; fn make_doc() -> IndexedDocument { - IndexedDocument::new("test-id", crate::index::parse::DocumentFormat::Markdown) + IndexedDocument::new("test-id", vectorless_index::parse::DocumentFormat::Markdown) .with_name("test.md") .with_description("test doc") .with_source_path(std::path::PathBuf::from("/tmp/test.md")) @@ -1142,7 +923,10 @@ mod tests { assert_eq!(item.doc_id, "test-id"); assert_eq!(item.name, "test.md"); - assert_eq!(item.format, crate::index::parse::DocumentFormat::Markdown); + assert_eq!( + item.format, + vectorless_index::parse::DocumentFormat::Markdown + ); assert_eq!(item.description, Some("test doc".to_string())); assert_eq!(item.source_path, Some("/tmp/test.md".to_string())); assert!(item.metrics.is_none()); @@ -1150,10 +934,10 @@ mod tests { #[test] fn test_build_index_item_no_source_path() { - let doc = IndexedDocument::new("id", crate::index::parse::DocumentFormat::Pdf); + let doc = IndexedDocument::new("id", vectorless_index::parse::DocumentFormat::Pdf); let item = Engine::build_index_item(&doc); assert_eq!(item.source_path, Some(String::new())); // unwrap_or_default - assert_eq!(item.format, crate::index::parse::DocumentFormat::Pdf); + assert_eq!(item.format, vectorless_index::parse::DocumentFormat::Pdf); } } diff --git a/rust/src/client/index_context.rs b/vectorless-core/vectorless-engine/src/index_context.rs similarity index 99% rename from rust/src/client/index_context.rs rename to vectorless-core/vectorless-engine/src/index_context.rs index 30cb2502..df109db6 100644 --- a/rust/src/client/index_context.rs +++ b/vectorless-core/vectorless-engine/src/index_context.rs @@ -38,7 +38,7 @@ use std::path::PathBuf; -use crate::index::parse::DocumentFormat; +use vectorless_document::DocumentFormat; use super::types::{IndexMode, IndexOptions}; diff --git a/rust/src/client/indexed_document.rs b/vectorless-core/vectorless-engine/src/indexed_document.rs similarity index 87% rename from 
rust/src/client/indexed_document.rs rename to vectorless-core/vectorless-engine/src/indexed_document.rs index 3aa78f65..ee1cbbce 100644 --- a/rust/src/client/indexed_document.rs +++ b/vectorless-core/vectorless-engine/src/indexed_document.rs @@ -9,15 +9,15 @@ use std::path::PathBuf; -use crate::document::DocumentTree; -use crate::index::parse::DocumentFormat; -use crate::metrics::IndexMetrics; -use crate::storage::PageContent; +use vectorless_document::DocumentFormat; +use vectorless_document::DocumentTree; +use vectorless_metrics::IndexMetrics; +use vectorless_storage::PageContent; /// An indexed document with its tree structure and metadata. /// /// Internal intermediate produced by the indexing pipeline and consumed -/// by [`Engine`](super::Engine) to create a [`PersistedDocument`](crate::storage::PersistedDocument). +/// by [`Engine`](super::Engine) to create a [`PersistedDocument`](vectorless_storage::PersistedDocument). #[derive(Debug, Clone)] pub(crate) struct IndexedDocument { /// Unique document identifier. @@ -48,10 +48,13 @@ pub(crate) struct IndexedDocument { pub metrics: Option<IndexMetrics>, /// Pre-computed reasoning index for retrieval acceleration. - pub reasoning_index: Option<crate::document::ReasoningIndex>, + pub reasoning_index: Option<vectorless_document::ReasoningIndex>, /// Pre-computed navigation index for agent-based retrieval. - pub navigation_index: Option<crate::document::NavigationIndex>, + pub navigation_index: Option<vectorless_document::NavigationIndex>, + + /// Key concepts extracted from the document.
+ pub concepts: Vec, } impl IndexedDocument { @@ -69,6 +72,7 @@ impl IndexedDocument { metrics: None, reasoning_index: None, navigation_index: None, + concepts: Vec::new(), } } diff --git a/rust/src/client/indexer.rs b/vectorless-core/vectorless-engine/src/indexer.rs similarity index 93% rename from rust/src/client/indexer.rs rename to vectorless-core/vectorless-engine/src/indexer.rs index 2c598382..20490f6d 100644 --- a/rust/src/client/indexer.rs +++ b/vectorless-core/vectorless-engine/src/indexer.rs @@ -26,15 +26,15 @@ use std::sync::Arc; use tracing::info; use uuid::Uuid; -use crate::error::{Error, Result}; -use crate::index::parse::DocumentFormat; -use crate::index::{IndexInput, IndexMode, PipelineExecutor, PipelineOptions}; -use crate::llm::LlmClient; -use crate::storage::{DocumentMeta, PersistedDocument}; +use vectorless_document::DocumentFormat; +use vectorless_error::{Error, Result}; +use vectorless_index::{IndexInput, IndexMode, PipelineExecutor, PipelineOptions}; +use vectorless_llm::LlmClient; +use vectorless_storage::{DocumentMeta, PersistedDocument}; use super::index_context::IndexSource; use super::indexed_document::IndexedDocument; -use crate::events::{EventEmitter, IndexEvent}; +use vectorless_events::{EventEmitter, IndexEvent}; /// Document indexing client. 
/// @@ -95,7 +95,7 @@ impl IndexerClient { source: &IndexSource, name: Option<&str>, mut pipeline_options: PipelineOptions, - existing_tree: Option<&crate::DocumentTree>, + existing_tree: Option<&vectorless_document::DocumentTree>, ) -> Result { pipeline_options.existing_tree = existing_tree.cloned(); match source { @@ -123,7 +123,7 @@ impl IndexerClient { let path = path.canonicalize().unwrap_or_else(|_| path.to_path_buf()); // Validate file before indexing - let validation = crate::utils::validate_file(&path)?; + let validation = vectorless_utils::validate_file(&path)?; if !validation.valid { return Err(Error::Parse( validation @@ -161,7 +161,7 @@ impl IndexerClient { pipeline_options: PipelineOptions, ) -> Result { // Validate content before indexing - let validation = crate::utils::validate_content(content, format); + let validation = vectorless_utils::validate_content(content, format); if !validation.valid { return Err(Error::Parse( validation @@ -193,7 +193,7 @@ impl IndexerClient { pipeline_options: PipelineOptions, ) -> Result { // Validate bytes before indexing - let validation = crate::utils::validate_bytes(bytes, format); + let validation = vectorless_utils::validate_bytes(bytes, format); if !validation.valid { return Err(Error::Parse( validation @@ -253,7 +253,7 @@ impl IndexerClient { fn build_indexed_document( &self, doc_id: String, - result: crate::index::PipelineResult, + result: vectorless_index::PipelineResult, format: DocumentFormat, name: Option<&str>, path: Option<&Path>, @@ -280,6 +280,7 @@ impl IndexerClient { doc.reasoning_index = result.reasoning_index; doc.navigation_index = result.navigation_index; + doc.concepts = result.concepts; if let Some(p) = path { doc = doc.with_source_path(p); @@ -340,7 +341,7 @@ impl IndexerClient { // Compute content fingerprint for incremental indexing (async I/O) if let Some(ref path) = doc.source_path { if let Ok(bytes) = tokio::fs::read(path).await { - let fp = 
crate::utils::fingerprint::Fingerprint::from_bytes(&bytes); + let fp = vectorless_utils::fingerprint::Fingerprint::from_bytes(&bytes); meta = meta.with_fingerprint(fp); } } @@ -367,6 +368,7 @@ impl IndexerClient { persisted.reasoning_index = doc.reasoning_index; persisted.navigation_index = doc.navigation_index; + persisted.concepts = doc.concepts; persisted .meta .update_processing_stats(node_count, summary_tokens, duration_ms); diff --git a/rust/src/client/mod.rs b/vectorless-core/vectorless-engine/src/lib.rs similarity index 76% rename from rust/src/client/mod.rs rename to vectorless-core/vectorless-engine/src/lib.rs index 8a370e57..d6f656c1 100644 --- a/rust/src/client/mod.rs +++ b/vectorless-core/vectorless-engine/src/lib.rs @@ -72,7 +72,6 @@ mod indexed_document; mod indexer; mod query_context; mod retriever; -pub(crate) mod test_support; mod types; mod workspace; @@ -95,12 +94,28 @@ pub use query_context::QueryContext; // ============================================================ pub use types::{ - Confidence, DocumentInfo, EvidenceItem, FailedItem, IndexItem, IndexMode, IndexOptions, - IndexResult, QueryMetrics, QueryResult, QueryResultItem, + Confidence, EvidenceItem, FailedItem, IndexItem, IndexMode, IndexOptions, IndexResult, + QueryMetrics, QueryResult, QueryResultItem, }; // ============================================================ // Parser Types (needed for IndexContext::from_content) // ============================================================ -pub use crate::index::parse::DocumentFormat; +pub use vectorless_document::DocumentFormat; + +// ============================================================ +// Re-exports from sub-crates (for downstream consumers) +// ============================================================ + +pub use vectorless_config::Config; +pub use vectorless_document::DocumentTree; +pub use vectorless_document::{ + Answer, Concept, DocumentInfo, Evidence, IngestInput, ReasoningTrace, TraceStep, +}; +pub use 
vectorless_error::{Error, Result}; +pub use vectorless_events::{EventEmitter, IndexEvent, QueryEvent, WorkspaceEvent}; +pub use vectorless_graph::{ + DocumentGraph, DocumentGraphNode, EdgeEvidence, GraphEdge, WeightedKeyword, +}; +pub use vectorless_metrics::{LlmMetricsReport, MetricsReport, RetrievalMetricsReport}; diff --git a/rust/src/client/query_context.rs b/vectorless-core/vectorless-engine/src/query_context.rs similarity index 100% rename from rust/src/client/query_context.rs rename to vectorless-core/vectorless-engine/src/query_context.rs diff --git a/rust/src/client/retriever.rs b/vectorless-core/vectorless-engine/src/retriever.rs similarity index 81% rename from rust/src/client/retriever.rs rename to vectorless-core/vectorless-engine/src/retriever.rs index 67f53f6b..217e182a 100644 --- a/rust/src/client/retriever.rs +++ b/vectorless-core/vectorless-engine/src/retriever.rs @@ -8,13 +8,16 @@ use tracing::info; -use crate::agent::{self, config::AgentConfig, events::EventEmitter as AgentEventEmitter}; -use crate::client::types::QueryResult; -use crate::document::{DocumentTree, NavigationIndex, ReasoningIndex}; -use crate::error::Result; -use crate::events::{EventEmitter, QueryEvent}; -use crate::llm::LlmClient; -use crate::retrieval::{dispatcher, postprocessor}; +use super::types::QueryResult; +use vectorless_agent::{ + self, config::AgentConfig, config::DocContext, config::Scope, config::WorkspaceContext, + events::EventEmitter as AgentEventEmitter, +}; +use vectorless_document::{DocumentTree, NavigationIndex, ReasoningIndex}; +use vectorless_error::Result; +use vectorless_events::{EventEmitter, QueryEvent}; +use vectorless_llm::LlmClient; +use vectorless_retrieval::{dispatcher, postprocessor}; /// Document retrieval client. 
/// @@ -82,9 +85,9 @@ impl RetrieverClient { skip_analysis, "Querying: {:?}", question ); - let doc_contexts: Vec<agent::DocContext> = documents + let doc_contexts: Vec<DocContext> = documents .iter() - .map(|(tree, nav, ridx, id)| agent::DocContext { + .map(|(tree, nav, ridx, id)| DocContext { tree, nav_index: nav, reasoning_index: ridx, @@ -93,9 +96,9 @@ impl RetrieverClient { .collect(); let scope = if skip_analysis { - agent::Scope::Specified(doc_contexts) + Scope::Specified(doc_contexts) } else { - agent::Scope::Workspace(agent::WorkspaceContext::new(doc_contexts)) + Scope::Workspace(WorkspaceContext::new(doc_contexts)) }; let emitter = AgentEventEmitter::noop(); @@ -135,6 +138,6 @@ mod tests { #[test] fn test_retriever_client_creation() { let _client = - RetrieverClient::new(LlmClient::new(crate::llm::config::LlmConfig::default())); + RetrieverClient::new(LlmClient::new(vectorless_llm::config::LlmConfig::default())); } } diff --git a/rust/src/client/types.rs b/vectorless-core/vectorless-engine/src/types.rs similarity index 87% rename from rust/src/client/types.rs rename to vectorless-core/vectorless-engine/src/types.rs index 8995f6f2..7cd421f5 100644 --- a/rust/src/client/types.rs +++ b/vectorless-core/vectorless-engine/src/types.rs @@ -7,8 +7,8 @@ use serde::{Deserialize, Serialize}; -use crate::index::parse::DocumentFormat; -use crate::metrics::IndexMetrics; +use vectorless_document::DocumentFormat; +use vectorless_metrics::IndexMetrics; // ============================================================ // Partial Success @@ -253,64 +253,10 @@ impl IndexItem { } // ============================================================ -// Query Types +// Query Types — re-exported from retrieval crate // ============================================================ -/// A single piece of evidence with source attribution. -#[derive(Debug, Clone)] -pub struct EvidenceItem { - /// Section title where this evidence was found. - pub title: String, - /// Navigation path (e.g., "Root/Chapter 1/Section 1.2").
- pub path: String, - /// Raw evidence content. - pub content: String, - /// Source document name (set in multi-doc scenarios). - pub doc_name: Option<String>, -} - -/// Query execution metrics. -#[derive(Debug, Clone, Default)] -pub struct QueryMetrics { - /// Number of LLM calls made. - pub llm_calls: u32, - /// Number of navigation rounds used. - pub rounds_used: u32, - /// Number of distinct nodes visited. - pub nodes_visited: usize, - /// Number of evidence items collected. - pub evidence_count: usize, - /// Total characters of collected evidence. - pub evidence_chars: usize, -} - -/// Confidence score of the query result (0.0–1.0). -/// -/// Derived from LLM evaluate() — whether evidence was deemed sufficient -/// and how many replan rounds were needed. -pub type Confidence = f32; - -/// A single document's query result. -#[derive(Debug, Clone)] -pub struct QueryResultItem { - /// The document ID. - pub doc_id: String, - - /// Matching node IDs (navigation paths). - pub node_ids: Vec<String>, - - /// Synthesized answer or raw evidence content. - pub content: String, - - /// Evidence items that contributed to this result, with source attribution. - pub evidence: Vec<EvidenceItem>, - - /// Execution metrics for this query. - pub metrics: Option<QueryMetrics>, - - /// Confidence score (0.0–1.0) — derived from LLM evaluation. - pub confidence: Confidence, -} +pub use vectorless_retrieval::{Confidence, EvidenceItem, QueryMetrics, QueryResultItem}; /// Result of a document query.
/// diff --git a/vectorless-core/vectorless-error/Cargo.toml b/vectorless-core/vectorless-error/Cargo.toml new file mode 100644 index 00000000..c39a0ab1 --- /dev/null +++ b/vectorless-core/vectorless-error/Cargo.toml @@ -0,0 +1,15 @@ +[package] +name = "vectorless-error" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +thiserror = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/error.rs b/vectorless-core/vectorless-error/src/error.rs similarity index 100% rename from rust/src/error.rs rename to vectorless-core/vectorless-error/src/error.rs diff --git a/vectorless-core/vectorless-error/src/lib.rs b/vectorless-core/vectorless-error/src/lib.rs new file mode 100644 index 00000000..85d6a0bf --- /dev/null +++ b/vectorless-core/vectorless-error/src/lib.rs @@ -0,0 +1,8 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Error types for the vectorless library. 
+ +mod error; + +pub use error::{Error, Result}; diff --git a/vectorless-core/vectorless-events/Cargo.toml b/vectorless-core/vectorless-events/Cargo.toml new file mode 100644 index 00000000..c21492d7 --- /dev/null +++ b/vectorless-core/vectorless-events/Cargo.toml @@ -0,0 +1,17 @@ +[package] +name = "vectorless-events" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +parking_lot = { workspace = true } +vectorless-error = { path = "../vectorless-error" } +vectorless-document = { path = "../vectorless-document" } + +[lints] +workspace = true diff --git a/rust/src/events/emitter.rs b/vectorless-core/vectorless-events/src/emitter.rs similarity index 100% rename from rust/src/events/emitter.rs rename to vectorless-core/vectorless-events/src/emitter.rs diff --git a/rust/src/events/mod.rs b/vectorless-core/vectorless-events/src/lib.rs similarity index 100% rename from rust/src/events/mod.rs rename to vectorless-core/vectorless-events/src/lib.rs diff --git a/rust/src/events/types.rs b/vectorless-core/vectorless-events/src/types.rs similarity index 97% rename from rust/src/events/types.rs rename to vectorless-core/vectorless-events/src/types.rs index 05ca0754..45bb5ad4 100644 --- a/rust/src/events/types.rs +++ b/vectorless-core/vectorless-events/src/types.rs @@ -6,8 +6,8 @@ //! Provides enums for indexing, query, and workspace events //! that can be observed via [`EventEmitter`](super::EventEmitter). -use crate::index::parse::DocumentFormat; -use crate::retrieval::SufficiencyLevel; +use vectorless_document::DocumentFormat; +use vectorless_document::SufficiencyLevel; /// Indexing operation events. 
#[derive(Debug, Clone)] diff --git a/vectorless-core/vectorless-graph/Cargo.toml b/vectorless-core/vectorless-graph/Cargo.toml new file mode 100644 index 00000000..a441bfd6 --- /dev/null +++ b/vectorless-core/vectorless-graph/Cargo.toml @@ -0,0 +1,18 @@ +[package] +name = "vectorless-graph" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-document = { path = "../vectorless-document" } +tracing = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/graph/builder.rs b/vectorless-core/vectorless-graph/src/builder.rs similarity index 100% rename from rust/src/graph/builder.rs rename to vectorless-core/vectorless-graph/src/builder.rs diff --git a/rust/src/graph/config.rs b/vectorless-core/vectorless-graph/src/config.rs similarity index 100% rename from rust/src/graph/config.rs rename to vectorless-core/vectorless-graph/src/config.rs diff --git a/rust/src/graph/mod.rs b/vectorless-core/vectorless-graph/src/lib.rs similarity index 94% rename from rust/src/graph/mod.rs rename to vectorless-core/vectorless-graph/src/lib.rs index f1b48862..594609e4 100644 --- a/rust/src/graph/mod.rs +++ b/vectorless-core/vectorless-graph/src/lib.rs @@ -9,7 +9,7 @@ //! - [`DocumentGraphConfig`] — configuration for graph building and retrieval boosting //! //! The document graph is a workspace-scoped, weighted graph built from each document's -//! [`ReasoningIndex`](crate::document::ReasoningIndex) keyword data. It enables +//! [`ReasoningIndex`](vectorless_document::ReasoningIndex) keyword data. It enables //! graph-aware retrieval ranking where connected documents receive a relevance boost. //! //! 
# Data Flow diff --git a/rust/src/graph/types.rs b/vectorless-core/vectorless-graph/src/types.rs similarity index 100% rename from rust/src/graph/types.rs rename to vectorless-core/vectorless-graph/src/types.rs diff --git a/vectorless-core/vectorless-index/Cargo.toml b/vectorless-core/vectorless-index/Cargo.toml new file mode 100644 index 00000000..587b54b6 --- /dev/null +++ b/vectorless-core/vectorless-index/Cargo.toml @@ -0,0 +1,38 @@ +[package] +name = "vectorless-index" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-config = { path = "../vectorless-config" } +vectorless-document = { path = "../vectorless-document" } +vectorless-error = { path = "../vectorless-error" } +vectorless-llm = { path = "../vectorless-llm" } +vectorless-metrics = { path = "../vectorless-metrics" } +vectorless-scoring = { path = "../vectorless-scoring" } +vectorless-storage = { path = "../vectorless-storage" } +vectorless-utils = { path = "../vectorless-utils" } +tokio = { workspace = true } +async-trait = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } +tracing = { workspace = true } +pulldown-cmark = { workspace = true } +pdf-extract = { workspace = true } +lopdf = { workspace = true } +regex = { workspace = true } +uuid = { workspace = true } +chrono = { workspace = true } +rand = { workspace = true } +futures = { workspace = true } +base64 = { workspace = true } +sha2 = { workspace = true } +tempfile = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/index/config.rs b/vectorless-core/vectorless-index/src/config.rs similarity index 97% rename from rust/src/index/config.rs rename to vectorless-core/vectorless-index/src/config.rs index 798951b1..e9133c40 100644 --- a/rust/src/index/config.rs +++ b/vectorless-core/vectorless-index/src/config.rs @@ -10,10 
+10,10 @@ //! - [`ThinningConfig`] - Node merging settings use super::summary::SummaryStrategy; -use crate::config::IndexerConfig; -use crate::document::{DocumentTree, ReasoningIndexConfig}; -use crate::llm::throttle::ConcurrencyConfig; -use crate::utils::fingerprint::{Fingerprint, Fingerprinter}; +use vectorless_config::IndexerConfig; +use vectorless_document::{DocumentTree, ReasoningIndexConfig}; +use vectorless_llm::throttle::ConcurrencyConfig; +use vectorless_utils::fingerprint::{Fingerprint, Fingerprinter}; use std::path::PathBuf; diff --git a/rust/src/index/incremental/detector.rs b/vectorless-core/vectorless-index/src/incremental/detector.rs similarity index 99% rename from rust/src/index/incremental/detector.rs rename to vectorless-core/vectorless-index/src/incremental/detector.rs index 23107bb1..011edab8 100644 --- a/rust/src/index/incremental/detector.rs +++ b/vectorless-core/vectorless-index/src/incremental/detector.rs @@ -13,8 +13,8 @@ use std::time::SystemTime; use serde::{Deserialize, Serialize}; -use crate::document::{DocumentTree, NodeId}; -use crate::utils::fingerprint::{Fingerprint, Fingerprinter, NodeFingerprint}; +use vectorless_document::{DocumentTree, NodeId}; +use vectorless_utils::fingerprint::{Fingerprint, Fingerprinter, NodeFingerprint}; /// Type of change detected. 
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)] @@ -569,7 +569,7 @@ pub fn compute_all_node_fingerprints(tree: &DocumentTree) -> HashMap Result { let content = tokio::fs::read_to_string(path) .await - .map_err(|e| crate::Error::Parse(format!("Failed to read file: {}", e)))?; + .map_err(|e| vectorless_error::Error::Parse(format!("Failed to read file: {}", e)))?; let mut result = self.parse(&content).await?; diff --git a/rust/src/index/parse/mod.rs b/vectorless-core/vectorless-index/src/parse/mod.rs similarity index 78% rename from rust/src/index/parse/mod.rs rename to vectorless-core/vectorless-index/src/parse/mod.rs index 0bcba9f4..d9bde2bf 100644 --- a/rust/src/index/parse/mod.rs +++ b/vectorless-core/vectorless-index/src/parse/mod.rs @@ -5,15 +5,6 @@ //! //! Supports Markdown and PDF formats. Parsing is dispatched directly //! via `match` — no trait objects or registry needed. -//! -//! # Quick parse -//! -//! ```rust,ignore -//! use vectorless::index::parse::{parse_content, parse_bytes, DocumentFormat}; -//! -//! let result = parse_content("# Title\nContent", DocumentFormat::Markdown).await?; -//! let result = parse_bytes(&pdf_bytes, DocumentFormat::Pdf).await?; -//! ``` pub mod markdown; pub mod pdf; @@ -25,9 +16,9 @@ pub use types::{DocumentFormat, DocumentMeta, ParseResult, RawNode}; use std::path::Path; -use crate::error::Result; -use crate::index::parse::markdown::MarkdownParser; -use crate::llm::LlmClient; +use crate::parse::markdown::MarkdownParser; +use vectorless_error::Result; +use vectorless_llm::LlmClient; /// Parse a string content document. 
pub async fn parse_content( @@ -40,7 +31,7 @@ pub async fn parse_content( let parser = MarkdownParser::new(); parser.parse(content).await } - DocumentFormat::Pdf => Err(crate::Error::Parse( + DocumentFormat::Pdf => Err(vectorless_error::Error::Parse( "PDF requires bytes, not string content".to_string(), )), } @@ -75,8 +66,9 @@ pub async fn parse_bytes( ) -> Result { match format { DocumentFormat::Markdown => { - let content = std::str::from_utf8(bytes) - .map_err(|e| crate::Error::Parse(format!("Invalid UTF-8 content: {}", e)))?; + let content = std::str::from_utf8(bytes).map_err(|e| { + vectorless_error::Error::Parse(format!("Invalid UTF-8 content: {}", e)) + })?; let parser = MarkdownParser::new(); parser.parse(content).await } diff --git a/rust/src/index/parse/pdf/mod.rs b/vectorless-core/vectorless-index/src/parse/pdf/mod.rs similarity index 100% rename from rust/src/index/parse/pdf/mod.rs rename to vectorless-core/vectorless-index/src/parse/pdf/mod.rs diff --git a/rust/src/index/parse/pdf/parser.rs b/vectorless-core/vectorless-index/src/parse/pdf/parser.rs similarity index 97% rename from rust/src/index/parse/pdf/parser.rs rename to vectorless-core/vectorless-index/src/parse/pdf/parser.rs index a3327cc0..61e05787 100644 --- a/rust/src/index/parse/pdf/parser.rs +++ b/vectorless-core/vectorless-index/src/parse/pdf/parser.rs @@ -12,13 +12,13 @@ use std::path::Path; use lopdf::Document as LopdfDocument; use tracing::{info, warn}; -use crate::Error; -use crate::error::Result; -use crate::index::parse::toc::TocProcessor; -use crate::llm::LlmClient; +use crate::parse::toc::TocProcessor; +use vectorless_error::Error; +use vectorless_error::Result; +use vectorless_llm::LlmClient; use super::types::{PdfMetadata, PdfPage, PdfParseResult}; -use crate::index::parse::{DocumentFormat, DocumentMeta, ParseResult, RawNode}; +use crate::parse::{DocumentFormat, DocumentMeta, ParseResult, RawNode}; /// PDF document parser. 
pub struct PdfParser { @@ -192,7 +192,7 @@ impl PdfParser { /// Convert TOC entries to RawNodes. fn toc_entries_to_raw_nodes( &self, - entries: &[crate::index::parse::toc::TocEntry], + entries: &[crate::parse::toc::TocEntry], pages: &[PdfPage], ) -> Vec { let mut nodes = Vec::new(); @@ -217,7 +217,7 @@ impl PdfParser { /// Get content for a TOC entry from pages. fn get_content_for_entry( &self, - entry: &crate::index::parse::toc::TocEntry, + entry: &crate::parse::toc::TocEntry, pages: &[PdfPage], ) -> String { let start_page = entry.physical_page.unwrap_or(1); diff --git a/rust/src/index/parse/pdf/types.rs b/vectorless-core/vectorless-index/src/parse/pdf/types.rs similarity index 99% rename from rust/src/index/parse/pdf/types.rs rename to vectorless-core/vectorless-index/src/parse/pdf/types.rs index 3b978836..c666d011 100644 --- a/rust/src/index/parse/pdf/types.rs +++ b/vectorless-core/vectorless-index/src/parse/pdf/types.rs @@ -3,8 +3,8 @@ //! PDF document types. -use crate::utils::estimate_tokens; use serde::{Deserialize, Serialize}; +use vectorless_utils::estimate_tokens; /// A single page from a PDF document. 
#[derive(Debug, Clone, Serialize, Deserialize)] diff --git a/rust/src/index/parse/toc/assigner.rs b/vectorless-core/vectorless-index/src/parse/toc/assigner.rs similarity index 98% rename from rust/src/index/parse/toc/assigner.rs rename to vectorless-core/vectorless-index/src/parse/toc/assigner.rs index 267cda18..7c0404fa 100644 --- a/rust/src/index/parse/toc/assigner.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/assigner.rs @@ -7,12 +7,12 @@ use futures::stream::{self, StreamExt}; use std::collections::HashMap; use tracing::{debug, info}; -use crate::error::Result; -use crate::index::parse::pdf::PdfPage; -use crate::llm::config::LlmConfig; +use crate::parse::pdf::PdfPage; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::{PageOffset, TocEntry}; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; /// Page assigner configuration. #[derive(Debug, Clone)] diff --git a/rust/src/index/parse/toc/detector.rs b/vectorless-core/vectorless-index/src/parse/toc/detector.rs similarity index 98% rename from rust/src/index/parse/toc/detector.rs rename to vectorless-core/vectorless-index/src/parse/toc/detector.rs index 8484e101..c9960253 100644 --- a/rust/src/index/parse/toc/detector.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/detector.rs @@ -6,12 +6,12 @@ use regex::Regex; use tracing::debug; -use crate::error::Result; -use crate::llm::config::LlmConfig; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::TocDetection; -use crate::index::parse::pdf::PdfPage; -use crate::llm::LlmClient; +use crate::parse::pdf::PdfPage; +use vectorless_llm::LlmClient; /// TOC detector configuration. 
#[derive(Debug, Clone)] diff --git a/rust/src/index/parse/toc/mod.rs b/vectorless-core/vectorless-index/src/parse/toc/mod.rs similarity index 100% rename from rust/src/index/parse/toc/mod.rs rename to vectorless-core/vectorless-index/src/parse/toc/mod.rs diff --git a/rust/src/index/parse/toc/parser.rs b/vectorless-core/vectorless-index/src/parse/toc/parser.rs similarity index 98% rename from rust/src/index/parse/toc/parser.rs rename to vectorless-core/vectorless-index/src/parse/toc/parser.rs index df0f306d..fe97708a 100644 --- a/rust/src/index/parse/toc/parser.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/parser.rs @@ -5,11 +5,11 @@ use tracing::debug; -use crate::error::Result; -use crate::llm::config::LlmConfig; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::TocEntry; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; /// TOC parser configuration. #[derive(Debug, Clone)] diff --git a/rust/src/index/parse/toc/processor.rs b/vectorless-core/vectorless-index/src/parse/toc/processor.rs similarity index 99% rename from rust/src/index/parse/toc/processor.rs rename to vectorless-core/vectorless-index/src/parse/toc/processor.rs index e53b6346..bc8d52af 100644 --- a/rust/src/index/parse/toc/processor.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/processor.rs @@ -10,9 +10,9 @@ use futures::stream::{self, StreamExt}; use tracing::{debug, info, warn}; -use crate::error::Result; -use crate::index::parse::pdf::PdfPage; -use crate::llm::LlmClient; +use crate::parse::pdf::PdfPage; +use vectorless_error::Result; +use vectorless_llm::LlmClient; use super::assigner::{PageAssigner, PageAssignerConfig}; use super::detector::{TocDetector, TocDetectorConfig}; diff --git a/rust/src/index/parse/toc/repairer.rs b/vectorless-core/vectorless-index/src/parse/toc/repairer.rs similarity index 98% rename from rust/src/index/parse/toc/repairer.rs rename to vectorless-core/vectorless-index/src/parse/toc/repairer.rs index 
61ba414e..977f8635 100644 --- a/rust/src/index/parse/toc/repairer.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/repairer.rs @@ -6,13 +6,13 @@ use futures::stream::{self, StreamExt}; use tracing::{debug, info}; -use crate::error::Result; -use crate::index::parse::pdf::PdfPage; -use crate::llm::config::LlmConfig; +use crate::parse::pdf::PdfPage; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::{TocEntry, VerificationError, VerificationReport}; use super::verifier::IndexVerifier; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; /// Repairer configuration. #[derive(Debug, Clone)] diff --git a/rust/src/index/parse/toc/structure_extractor.rs b/vectorless-core/vectorless-index/src/parse/toc/structure_extractor.rs similarity index 99% rename from rust/src/index/parse/toc/structure_extractor.rs rename to vectorless-core/vectorless-index/src/parse/toc/structure_extractor.rs index 63ce9d7e..c9f29ddb 100644 --- a/rust/src/index/parse/toc/structure_extractor.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/structure_extractor.rs @@ -10,12 +10,12 @@ use futures::stream::{self, StreamExt}; use tracing::{debug, info, warn}; -use crate::error::Result; -use crate::index::parse::pdf::PdfPage; -use crate::llm::config::LlmConfig; +use crate::parse::pdf::PdfPage; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::TocEntry; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; /// Configuration for structure extraction. 
#[derive(Debug, Clone)] diff --git a/rust/src/index/parse/toc/types.rs b/vectorless-core/vectorless-index/src/parse/toc/types.rs similarity index 100% rename from rust/src/index/parse/toc/types.rs rename to vectorless-core/vectorless-index/src/parse/toc/types.rs diff --git a/rust/src/index/parse/toc/verifier.rs b/vectorless-core/vectorless-index/src/parse/toc/verifier.rs similarity index 98% rename from rust/src/index/parse/toc/verifier.rs rename to vectorless-core/vectorless-index/src/parse/toc/verifier.rs index 1e3d1d45..460f39be 100644 --- a/rust/src/index/parse/toc/verifier.rs +++ b/vectorless-core/vectorless-index/src/parse/toc/verifier.rs @@ -7,12 +7,12 @@ use futures::stream::{self, StreamExt}; use rand::seq::SliceRandom; use tracing::{debug, info}; -use crate::error::Result; -use crate::index::parse::pdf::PdfPage; -use crate::llm::config::LlmConfig; +use crate::parse::pdf::PdfPage; +use vectorless_error::Result; +use vectorless_llm::config::LlmConfig; use super::types::{ErrorType, TocEntry, VerificationError, VerificationReport}; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; /// Verifier configuration. #[derive(Debug, Clone)] diff --git a/rust/src/index/parse/types.rs b/vectorless-core/vectorless-index/src/parse/types.rs similarity index 81% rename from rust/src/index/parse/types.rs rename to vectorless-core/vectorless-index/src/parse/types.rs index baaa8224..92dd6b0f 100644 --- a/rust/src/index/parse/types.rs +++ b/vectorless-core/vectorless-index/src/parse/types.rs @@ -6,43 +6,12 @@ //! This module defines the types used for document parsing: //! - [`RawNode`] - A raw node extracted from a document before tree construction //! - [`DocumentMeta`] - Metadata about a document -//! - [`DocumentFormat`] - Supported document formats +//! - [`DocumentFormat`] - Supported document formats (re-exported from document module) use serde::{Deserialize, Serialize}; -/// Supported document formats. 
-#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)] -pub enum DocumentFormat { - /// Markdown files (.md, .markdown) - Markdown, - /// PDF files (.pdf) - Pdf, -} - -impl DocumentFormat { - /// Detect format from file extension. - pub fn from_extension(ext: &str) -> Option<Self> { - match ext.to_lowercase().as_str() { - "md" | "markdown" => Some(Self::Markdown), - "pdf" => Some(Self::Pdf), - _ => None, - } - } - - /// Get the file extension for this format. - pub fn extension(&self) -> &'static str { - match self { - Self::Markdown => "md", - Self::Pdf => "pdf", - } - } - - /// All supported file extensions (lowercase). - /// - /// Single source of truth — used by directory scanning to - /// discover indexable files. - pub const SUPPORTED_EXTENSIONS: &'static [&'static str] = &["md", "pdf"]; -} +/// Re-export [`DocumentFormat`] from the document module. +pub use vectorless_document::DocumentFormat; /// A raw node extracted from a document. /// diff --git a/rust/src/index/pipeline/checkpoint.rs b/vectorless-core/vectorless-index/src/pipeline/checkpoint.rs similarity index 99% rename from rust/src/index/pipeline/checkpoint.rs rename to vectorless-core/vectorless-index/src/pipeline/checkpoint.rs index 4ba1f01a..ad607214 100644 --- a/rust/src/index/pipeline/checkpoint.rs +++ b/vectorless-core/vectorless-index/src/pipeline/checkpoint.rs @@ -13,8 +13,8 @@ use chrono::{DateTime, Utc}; use serde::{Deserialize, Serialize}; use tracing::{info, warn}; -use crate::document::DocumentTree; -use crate::index::parse::RawNode; +use crate::parse::RawNode; +use vectorless_document::DocumentTree; use super::metrics::IndexMetrics; diff --git a/rust/src/index/pipeline/context.rs b/vectorless-core/vectorless-index/src/pipeline/context.rs similarity index 96% rename from rust/src/index/pipeline/context.rs rename to vectorless-core/vectorless-index/src/pipeline/context.rs index f34876b9..9bc01101 100644 --- a/rust/src/index/pipeline/context.rs +++
b/vectorless-core/vectorless-index/src/pipeline/context.rs @@ -6,9 +6,9 @@ use std::collections::HashMap; use std::path::PathBuf; -use crate::document::{DocumentTree, NavigationIndex, NodeId, ReasoningIndex}; -use crate::index::parse::{DocumentFormat, RawNode}; -use crate::llm::LlmClient; +use crate::parse::{DocumentFormat, RawNode}; +use vectorless_document::{Concept, DocumentTree, NavigationIndex, NodeId, ReasoningIndex}; +use vectorless_llm::LlmClient; use super::super::{PipelineOptions, SummaryStrategy}; use super::metrics::IndexMetrics; @@ -251,6 +251,9 @@ pub struct IndexContext { /// Navigation index for Agent-based retrieval (built by NavigationIndexStage). pub navigation_index: Option<NavigationIndex>, + /// Key concepts extracted from the document (built by ConceptExtractionStage). + pub concepts: Vec<Concept>, + /// Existing tree from previous indexing (for incremental updates). /// When set, the enhance and reasoning stages can reuse data from unchanged nodes. pub existing_tree: Option<DocumentTree>, @@ -289,6 +292,7 @@ impl IndexContext { summary_cache: SummaryCache::default(), reasoning_index: None, navigation_index: None, + concepts: Vec::new(), existing_tree: None, stage_results: HashMap::new(), metrics: IndexMetrics::default(), @@ -387,6 +391,7 @@ impl IndexContext { summary_cache: self.summary_cache, reasoning_index: self.reasoning_index, navigation_index: self.navigation_index, + concepts: self.concepts, } } } @@ -429,6 +434,9 @@ pub struct PipelineResult { /// Navigation index for Agent-based retrieval. pub navigation_index: Option<NavigationIndex>, + + /// Key concepts extracted from the document.
+ pub concepts: Vec<Concept>, } impl PipelineResult { diff --git a/rust/src/index/pipeline/executor.rs b/vectorless-core/vectorless-index/src/pipeline/executor.rs similarity index 86% rename from rust/src/index/pipeline/executor.rs rename to vectorless-core/vectorless-index/src/pipeline/executor.rs index 34c1f43a..16956888 100644 --- a/rust/src/index/pipeline/executor.rs +++ b/vectorless-core/vectorless-index/src/pipeline/executor.rs @@ -8,13 +8,14 @@ use tracing::info; -use crate::error::Result; -use crate::llm::LlmClient; +use vectorless_error::Result; +use vectorless_llm::LlmClient; use super::super::PipelineOptions; use super::super::stages::{ - BuildStage, EnhanceStage, EnrichStage, IndexStage, NavigationIndexStage, OptimizeStage, - ParseStage, ReasoningIndexStage, SplitStage, ValidateStage, + BuildStage, ConceptExtractionStage, EnhanceStage, EnrichStage, IndexStage, + NavigationIndexStage, OptimizeStage, ParseStage, ReasoningIndexStage, SplitStage, + ValidateStage, VerifyStage, }; use super::context::{IndexInput, PipelineResult}; use super::orchestrator::PipelineOrchestrator; @@ -55,8 +56,10 @@ impl PipelineExecutor { /// 4. `split` - Split oversized leaf nodes (optional) /// 5. `enrich` - Add metadata and cross-references /// 6. `reasoning_index` - Build pre-computed reasoning index - /// 7. `navigation_index` - Build Agent navigation index - /// 8. `optimize` - Optimize tree structure + /// 7. `concept_extraction` - Extract key concepts (optional) + /// 8. `navigation_index` - Build Agent navigation index + /// 9. `verify` - Validate ingest output reliability + /// 10.
`optimize` - Optimize tree structure pub fn new() -> Self { let orchestrator = PipelineOrchestrator::new() .stage_with_priority(ParseStage::new(), 10) @@ -65,7 +68,9 @@ impl PipelineExecutor { .stage_with_priority(SplitStage::new(), 25) .stage_with_priority(EnrichStage::new(), 40) .stage_with_priority(ReasoningIndexStage::new(), 45) + .stage_with_priority(ConceptExtractionStage::new(), 47) .stage_with_priority(NavigationIndexStage::new(), 50) + .stage_with_priority(VerifyStage, 55) .stage_with_priority(OptimizeStage::new(), 60); Self { orchestrator } @@ -81,8 +86,10 @@ impl PipelineExecutor { /// 5. `enhance` - LLM-based enhancement (summaries) /// 6. `enrich` - Add metadata /// 7. `reasoning_index` - Build pre-computed reasoning index - /// 8. `navigation_index` - Build Agent navigation index - /// 9. `optimize` - Optimize tree + /// 8. `concept_extraction` - Extract key concepts via LLM (optional) + /// 9. `navigation_index` - Build Agent navigation index + /// 10. `verify` - Validate ingest output reliability + /// 11. 
`optimize` - Optimize tree pub fn with_llm(client: LlmClient) -> Self { tracing::info!( "PipelineExecutor::with_llm — cloning client to ParseStage + EnhanceStage + context" @@ -93,10 +100,12 @@ impl PipelineExecutor { .stage_with_priority(BuildStage::new(), 20) .stage_with_priority(ValidateStage::new(), 22) .stage_with_priority(SplitStage::new(), 25) - .stage_with_priority(EnhanceStage::with_llm_client(client), 30) + .stage_with_priority(EnhanceStage::with_llm_client(client.clone()), 30) .stage_with_priority(EnrichStage::new(), 40) .stage_with_priority(ReasoningIndexStage::new(), 45) + .stage_with_priority(ConceptExtractionStage::with_llm_client(client), 47) .stage_with_priority(NavigationIndexStage::new(), 50) + .stage_with_priority(VerifyStage, 55) .stage_with_priority(OptimizeStage::new(), 60); Self { orchestrator } diff --git a/rust/src/index/pipeline/metrics.rs b/vectorless-core/vectorless-index/src/pipeline/metrics.rs similarity index 76% rename from rust/src/index/pipeline/metrics.rs rename to vectorless-core/vectorless-index/src/pipeline/metrics.rs index f25fe29f..9c08d69a 100644 --- a/rust/src/index/pipeline/metrics.rs +++ b/vectorless-core/vectorless-index/src/pipeline/metrics.rs @@ -3,4 +3,4 @@ //! Re-export IndexMetrics from the metrics module. 
-pub use crate::metrics::IndexMetrics; +pub use vectorless_metrics::IndexMetrics; diff --git a/rust/src/index/pipeline/mod.rs b/vectorless-core/vectorless-index/src/pipeline/mod.rs similarity index 100% rename from rust/src/index/pipeline/mod.rs rename to vectorless-core/vectorless-index/src/pipeline/mod.rs diff --git a/rust/src/index/pipeline/orchestrator.rs b/vectorless-core/vectorless-index/src/pipeline/orchestrator.rs similarity index 98% rename from rust/src/index/pipeline/orchestrator.rs rename to vectorless-core/vectorless-index/src/pipeline/orchestrator.rs index 10f1f3ad..9421d2c9 100644 --- a/rust/src/index/pipeline/orchestrator.rs +++ b/vectorless-core/vectorless-index/src/pipeline/orchestrator.rs @@ -27,7 +27,7 @@ use std::collections::HashMap; use std::time::Instant; use tracing::{debug, error, info, warn}; -use crate::error::Result; +use vectorless_error::Result; use super::super::PipelineOptions; use super::super::stages::IndexStage; @@ -95,7 +95,7 @@ pub struct PipelineOrchestrator { /// Registered stages with metadata. stages: Vec, /// Shared LLM client injected into pipeline context. - llm_client: Option<crate::llm::LlmClient>, + llm_client: Option<vectorless_llm::LlmClient>, } impl Default for PipelineOrchestrator { @@ -114,7 +114,7 @@ impl PipelineOrchestrator { } /// Set the shared LLM client (injected into pipeline context).
- pub fn with_llm_client(mut self, client: crate::llm::LlmClient) -> Self { + pub fn with_llm_client(mut self, client: vectorless_llm::LlmClient) -> Self { self.llm_client = Some(client); self } @@ -220,7 +220,7 @@ impl PipelineOrchestrator { for entry in &self.stages { for dep in &entry.depends_on { if !name_to_idx.contains_key(dep.as_str()) { - return Err(crate::error::Error::Config(format!( + return Err(vectorless_error::Error::Config(format!( "Stage '{}' depends on non-existent stage '{}'", entry.stage.name(), dep @@ -276,7 +276,7 @@ impl PipelineOrchestrator { .filter(|i| !result.contains(i)) .map(|i| self.stages[i].stage.name()) .collect(); - return Err(crate::error::Error::Config(format!( + return Err(vectorless_error::Error::Config(format!( "Circular dependency detected involving stages: {:?}", remaining ))); @@ -619,7 +619,7 @@ impl PipelineOrchestrator { dyn std::future::Future< Output = ( ParallelEntry, - std::result::Result<StageResult, crate::error::Error>, + std::result::Result<StageResult, vectorless_error::Error>, ), > + Send, >, @@ -869,7 +869,7 @@ struct ParallelEntry { /// Failure policy (captured before swap). policy: FailurePolicy, /// Access pattern (captured before swap). - access: crate::index::stages::AccessPattern, + access: crate::stages::AccessPattern, } /// Builder for creating custom stage configurations.
diff --git a/rust/src/index/pipeline/policy.rs b/vectorless-core/vectorless-index/src/pipeline/policy.rs similarity index 100% rename from rust/src/index/pipeline/policy.rs rename to vectorless-core/vectorless-index/src/pipeline/policy.rs diff --git a/rust/src/index/stages/build.rs b/vectorless-core/vectorless-index/src/stages/build.rs similarity index 97% rename from rust/src/index/stages/build.rs rename to vectorless-core/vectorless-index/src/stages/build.rs index 02b5eda8..29eb687b 100644 --- a/rust/src/index/stages/build.rs +++ b/vectorless-core/vectorless-index/src/stages/build.rs @@ -7,14 +7,14 @@ use super::async_trait; use std::time::Instant; use tracing::{debug, info}; -use crate::document::{DocumentTree, NodeId}; -use crate::error::Result; -use crate::index::parse::RawNode; -use crate::utils::estimate_tokens; +use crate::parse::RawNode; +use vectorless_document::{DocumentTree, NodeId}; +use vectorless_error::Result; +use vectorless_utils::estimate_tokens; use super::{IndexStage, StageResult}; -use crate::index::ThinningConfig; -use crate::index::pipeline::IndexContext; +use crate::ThinningConfig; +use crate::pipeline::IndexContext; /// Build stage - constructs a tree from raw nodes. pub struct BuildStage; diff --git a/vectorless-core/vectorless-index/src/stages/concept.rs b/vectorless-core/vectorless-index/src/stages/concept.rs new file mode 100644 index 00000000..d6a1f52b --- /dev/null +++ b/vectorless-core/vectorless-index/src/stages/concept.rs @@ -0,0 +1,241 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Concept extraction stage — extracts key concepts from topics and summaries. 
+ +use std::collections::HashMap; + +use serde::Deserialize; +use tracing::{info, warn}; + +use vectorless_document::Concept; +use vectorless_error::Result; +use vectorless_llm::LlmClient; + +use super::async_trait; +use super::{AccessPattern, IndexStage, StageResult}; +use crate::pipeline::IndexContext; + +/// Maximum number of top keywords to send to the LLM for concept extraction. +const MAX_TOPICS: usize = 20; + +/// Maximum number of concepts to extract. +const MAX_CONCEPTS: usize = 15; + +/// Concept extraction stage. +/// +/// Takes the reasoning index's topic entries and tree summaries, then uses +/// a single LLM call to extract structured [`Concept`] values. +/// Falls back to basic keyword-based concepts when no LLM is available. +pub struct ConceptExtractionStage { + llm_client: Option<LlmClient>, +} + +impl ConceptExtractionStage { + /// Create a new stage without LLM support (keyword-based fallback). + pub fn new() -> Self { + Self { llm_client: None } + } + + /// Create a stage with LLM support for rich concept extraction. + pub fn with_llm_client(client: LlmClient) -> Self { + Self { + llm_client: Some(client), + } + } +} + +#[async_trait] +impl IndexStage for ConceptExtractionStage { + fn name(&self) -> &str { + "concept_extraction" + } + + fn depends_on(&self) -> Vec<&'static str> { + vec!["reasoning_index"] + } + + fn is_optional(&self) -> bool { + true + } + + fn access_pattern(&self) -> AccessPattern { + AccessPattern { + reads_tree: true, + writes_concepts: true, + ..AccessPattern::default() + } + } + + async fn execute(&mut self, ctx: &mut IndexContext) -> Result<StageResult> { + let concepts = if let Some(ref client) = self.llm_client { + extract_with_llm(ctx, client).await + } else { + extract_from_topics(ctx) + }; + + let count = concepts.len(); + ctx.concepts = concepts; + info!("[concept_extraction] Extracted {} concepts", count); + + Ok(StageResult::success("concept_extraction")) + } +} + +/// Extract concepts using LLM from topics and summaries.
+async fn extract_with_llm(ctx: &mut IndexContext, client: &LlmClient) -> Vec<Concept> { + let (topics, section_titles) = gather_source_data(ctx); + + if topics.is_empty() { + warn!("[concept_extraction] No topics available for extraction"); + return Vec::new(); + } + + let system = "You are a document analysis assistant. Extract the most important concepts \ + from the given topics and section titles. For each concept, provide:\n\ + - name: a short name (2-4 words)\n\ + - summary: a one-sentence explanation\n\ + - sections: list of section titles where this concept appears\n\n\ + Return ONLY a valid JSON array of objects. No explanation, no markdown. \ + Maximum 15 concepts, ordered by importance."; + + let user_prompt = format!( + "Document topics (keyword: relevance weight):\n{}\n\n\ + Section titles:\n{}", + topics + .iter() + .map(|(k, w)| format!("- {} (weight: {:.2})", k, w)) + .collect::<Vec<_>>() + .join("\n"), + section_titles.join(", "), + ); + + #[derive(Debug, Deserialize)] + #[serde(rename_all = "snake_case")] + struct RawConcept { + name: String, + summary: String, + #[serde(default)] + sections: Vec<String>, + } + + match client + .complete_json::<Vec<RawConcept>>(&system, &user_prompt) + .await + { + Ok(raw) => raw + .into_iter() + .take(MAX_CONCEPTS) + .map(|c| Concept { + name: c.name, + summary: c.summary, + sections: c.sections, + }) + .collect(), + Err(e) => { + warn!( + "[concept_extraction] LLM extraction failed: {}, using fallback", + e + ); + extract_from_topics(ctx) + } + } +} + +/// Fallback: derive basic concepts from topic keywords. +fn extract_from_topics(ctx: &mut IndexContext) -> Vec<Concept> { + let (topics, section_titles) = gather_source_data(ctx); + + topics + .into_iter() + .take(MAX_CONCEPTS) + .map(|(name, _)| Concept { + name: name.clone(), + summary: String::new(), + sections: section_titles.clone(), + }) + .collect() +} + +/// Gather top topics and section titles from the pipeline context.
+fn gather_source_data(ctx: &IndexContext) -> (Vec<(String, f32)>, Vec<String>) { + // Collect top keywords by weight + let mut topics: Vec<(String, f32)> = Vec::new(); + + if let Some(ref ri) = ctx.reasoning_index { + let mut all: Vec<(String, f32)> = ri + .all_topic_entries() + .map(|(keyword, entries)| { + let max_weight = entries.iter().map(|e| e.weight).fold(0.0_f32, f32::max); + (keyword.clone(), max_weight) + }) + .collect(); + all.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap_or(std::cmp::Ordering::Equal)); + all.truncate(MAX_TOPICS); + topics = all; + } + + // Collect section titles from the tree + let section_titles: Vec<String> = ctx + .tree + .as_ref() + .map(|tree| { + tree.traverse() + .iter() + .filter_map(|&id| { + let node = tree.get(id)?; + if !node.title.is_empty() { + Some(node.title.clone()) + } else { + None + } + }) + .collect() + }) + .unwrap_or_default(); + + (topics, section_titles) +} + +#[cfg(test)] +mod tests { + use super::*; + + #[test] + fn test_extract_from_empty_topics() { + let topics = Vec::<(String, f32)>::new(); + let titles = vec!["Section 1".to_string()]; + // Basic sanity: empty topics produce empty concepts + let concepts: Vec<Concept> = topics + .into_iter() + .take(MAX_CONCEPTS) + .map(|(name, _)| Concept { + name, + summary: String::new(), + sections: titles.clone(), + }) + .collect(); + assert!(concepts.is_empty()); + } + + #[test] + fn test_extract_from_topics_basic() { + let topics: Vec<(String, f32)> = vec![ + ("quantum".to_string(), 0.95), + ("error correction".to_string(), 0.88), + ("qubit".to_string(), 0.82), + ]; + let titles = vec!["Research Labs".to_string()]; + let concepts: Vec<Concept> = topics + .into_iter() + .take(MAX_CONCEPTS) + .map(|(name, _)| Concept { + name, + summary: String::new(), + sections: titles.clone(), + }) + .collect(); + assert_eq!(concepts.len(), 3); + assert_eq!(concepts[0].name, "quantum"); + } +} diff --git a/rust/src/index/stages/enhance.rs b/vectorless-core/vectorless-index/src/stages/enhance.rs similarity index 96%
rename from rust/src/index/stages/enhance.rs rename to vectorless-core/vectorless-index/src/stages/enhance.rs index 0223d572..9613674f 100644 --- a/rust/src/index/stages/enhance.rs +++ b/vectorless-core/vectorless-index/src/stages/enhance.rs @@ -9,16 +9,16 @@ use std::sync::Arc; use std::time::{Duration, Instant}; use tracing::{debug, info, warn}; -use crate::document::NodeId; -use crate::error::Result; -use crate::index::incremental; -use crate::llm::LlmClient; -use crate::llm::memo::{MemoKey, MemoStore}; -use crate::utils::fingerprint::Fingerprint; +use crate::incremental; +use vectorless_document::NodeId; +use vectorless_error::Result; +use vectorless_llm::LlmClient; +use vectorless_llm::memo::{MemoKey, MemoStore}; +use vectorless_utils::fingerprint::Fingerprint; use super::{IndexStage, StageResult}; -use crate::index::pipeline::{FailurePolicy, IndexContext, StageRetryConfig}; -use crate::index::summary::{LlmSummaryGenerator, SummaryGenerator, SummaryStrategy}; +use crate::pipeline::{FailurePolicy, IndexContext, StageRetryConfig}; +use crate::summary::{LlmSummaryGenerator, SummaryGenerator, SummaryStrategy}; /// A node that needs LLM summary generation. 
struct PendingNode { @@ -277,7 +277,7 @@ impl IndexStage for EnhanceStage { // Shortcut: use original content as summary for short nodes (Borrow A) let token_count = node .token_count - .unwrap_or_else(|| crate::utils::estimate_tokens(&node.content)); + .unwrap_or_else(|| vectorless_utils::estimate_tokens(&node.content)); if shortcut_threshold > 0 && token_count > 0 && token_count <= shortcut_threshold { tree.set_summary(node_id, &node.content); debug!( @@ -344,7 +344,7 @@ impl IndexStage for EnhanceStage { failed += 1; } else { ctx.metrics - .add_tokens_generated(crate::utils::estimate_tokens(&response)); + .add_tokens_generated(vectorless_utils::estimate_tokens(&response)); if is_leaf { // Leaf node: response is a plain content summary diff --git a/rust/src/index/stages/enrich.rs b/vectorless-core/vectorless-index/src/stages/enrich.rs similarity index 96% rename from rust/src/index/stages/enrich.rs rename to vectorless-core/vectorless-index/src/stages/enrich.rs index 88ea8cc1..e14611e2 100644 --- a/rust/src/index/stages/enrich.rs +++ b/vectorless-core/vectorless-index/src/stages/enrich.rs @@ -7,11 +7,11 @@ use super::async_trait; use std::time::Instant; use tracing::{debug, info}; -use crate::document::{DocumentTree, NodeId, ReferenceExtractor, TocView}; -use crate::error::Result; +use vectorless_document::{DocumentTree, NodeId, ReferenceExtractor, TocView}; +use vectorless_error::Result; use super::{AccessPattern, IndexStage, StageResult}; -use crate::index::pipeline::IndexContext; +use crate::pipeline::IndexContext; /// Enrich stage - adds metadata to the tree. 
pub struct EnrichStage; @@ -169,7 +169,7 @@ impl IndexStage for EnrichStage { let tree = ctx .tree .as_mut() - .ok_or_else(|| crate::Error::IndexBuild("Tree not built".to_string()))?; + .ok_or_else(|| vectorless_error::Error::IndexBuild("Tree not built".to_string()))?; let node_count = tree.node_count(); info!("[enrich] Starting: {} nodes", node_count); diff --git a/rust/src/index/stages/mod.rs b/vectorless-core/vectorless-index/src/stages/mod.rs similarity index 94% rename from rust/src/index/stages/mod.rs rename to vectorless-core/vectorless-index/src/stages/mod.rs index 9a3c405f..a5bab452 100644 --- a/rust/src/index/stages/mod.rs +++ b/vectorless-core/vectorless-index/src/stages/mod.rs @@ -4,6 +4,7 @@ //! Index pipeline stages. mod build; +mod concept; mod enhance; mod enrich; mod navigation; @@ -12,8 +13,10 @@ mod parse; mod reasoning; mod split; mod validate; +mod verify_ingest; pub use build::BuildStage; +pub use concept::ConceptExtractionStage; pub use enhance::EnhanceStage; pub use enrich::EnrichStage; pub use navigation::NavigationIndexStage; @@ -22,10 +25,11 @@ pub use parse::ParseStage; pub use reasoning::ReasoningIndexStage; pub use split::SplitStage; pub use validate::ValidateStage; +pub use verify_ingest::VerifyStage; use super::pipeline::{FailurePolicy, IndexContext, StageResult}; -use crate::error::Result; pub use async_trait::async_trait; +use vectorless_error::Result; /// Declares which context fields a stage reads/writes. /// Used by the orchestrator to determine safe parallel execution. @@ -41,6 +45,8 @@ pub struct AccessPattern { pub writes_navigation_index: bool, /// Whether this stage writes to `description`. pub writes_description: bool, + /// Whether this stage writes to `concepts`. + pub writes_concepts: bool, } /// Index pipeline stage. 
diff --git a/rust/src/index/stages/navigation.rs b/vectorless-core/vectorless-index/src/stages/navigation.rs similarity index 95% rename from rust/src/index/stages/navigation.rs rename to vectorless-core/vectorless-index/src/stages/navigation.rs index 0a41517f..8c25a411 100644 --- a/rust/src/index/stages/navigation.rs +++ b/vectorless-core/vectorless-index/src/stages/navigation.rs @@ -17,12 +17,12 @@ use std::time::Instant; use tracing::{debug, info, warn}; -use crate::document::{ChildRoute, DocumentTree, NavEntry, NavigationIndex, NodeId}; -use crate::error::Result; +use vectorless_document::{ChildRoute, DocumentTree, NavEntry, NavigationIndex, NodeId}; +use vectorless_error::Result; use super::async_trait; use super::{AccessPattern, IndexStage, StageResult}; -use crate::index::pipeline::IndexContext; +use crate::pipeline::IndexContext; /// Navigation Index Stage — builds the Agent navigation index. /// @@ -31,7 +31,7 @@ use crate::index::pipeline::IndexContext; /// - A list of [`ChildRoute`] entries, one per child, with title, description, and leaf count. /// /// The resulting [`NavigationIndex`] is stored in `ctx.navigation_index` and -/// serialized as part of [`PersistedDocument`](crate::storage::persistence::PersistedDocument). +/// serialized as part of [`PersistedDocument`](vectorless_storage::persistence::PersistedDocument). pub struct NavigationIndexStage; impl NavigationIndexStage { @@ -227,12 +227,12 @@ impl IndexStage for NavigationIndexStage { // Phase 3: Build DocCard from root-level data (already computed, zero LLM). // Provides a compact document summary for multi-document Orchestrator Agent. 
if let Some(root_entry) = nav_index.get_entry(tree.root()) { - let sections: Vec<crate::document::SectionCard> = nav_index + let sections: Vec<vectorless_document::SectionCard> = nav_index .get_child_routes(tree.root()) .map(|routes| { routes .iter() - .map(|r| crate::document::SectionCard { + .map(|r| vectorless_document::SectionCard { title: r.title.clone(), description: r.description.clone(), leaf_count: r.leaf_count, @@ -241,7 +241,7 @@ impl IndexStage for NavigationIndexStage { }) .unwrap_or_default(); - let doc_card = crate::document::DocCard { + let doc_card = vectorless_document::DocCard { title: tree .get(tree.root()) .map(|n| n.title.clone()) @@ -293,7 +293,7 @@ impl IndexStage for NavigationIndexStage { #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; + use vectorless_document::DocumentTree; fn build_test_tree() -> DocumentTree { let mut tree = DocumentTree::new("Root", "root content"); @@ -392,8 +392,8 @@ mod tests { // Build context with the tree let mut ctx = IndexContext::new( - crate::index::pipeline::IndexInput::content("test"), - crate::index::config::PipelineOptions::default(), + crate::pipeline::IndexInput::content("test"), + crate::config::PipelineOptions::default(), ); ctx.tree = Some(tree); @@ -440,8 +440,8 @@ mod tests { let tree = DocumentTree::new("Root", "content"); let mut ctx = IndexContext::new( - crate::index::pipeline::IndexInput::content("test"), - crate::index::config::PipelineOptions::default(), + crate::pipeline::IndexInput::content("test"), + crate::config::PipelineOptions::default(), ); ctx.tree = Some(tree); @@ -459,8 +459,8 @@ mod tests { #[tokio::test] async fn test_execute_no_tree() { let ctx = IndexContext::new( - crate::index::pipeline::IndexInput::content("test"), - crate::index::config::PipelineOptions::default(), + crate::pipeline::IndexInput::content("test"), + crate::config::PipelineOptions::default(), ); // ctx.tree is None diff --git a/rust/src/index/stages/optimize.rs b/vectorless-core/vectorless-index/src/stages/optimize.rs similarity index 94%
rename from rust/src/index/stages/optimize.rs rename to vectorless-core/vectorless-index/src/stages/optimize.rs index 8186d494..61ee4706 100644 --- a/rust/src/index/stages/optimize.rs +++ b/vectorless-core/vectorless-index/src/stages/optimize.rs @@ -7,9 +7,9 @@ use super::{AccessPattern, async_trait}; use std::time::Instant; use tracing::{debug, info}; -use crate::document::NodeId; -use crate::error::Result; -use crate::index::pipeline::IndexContext; +use crate::pipeline::IndexContext; +use vectorless_document::NodeId; +use vectorless_error::Result; use super::{IndexStage, StageResult}; @@ -28,9 +28,9 @@ impl OptimizeStage { /// Non-leaf nodes (section headings with subsections) are never merged, /// even if their own content is empty. fn merge_small_leaves( - tree: &mut crate::document::DocumentTree, + tree: &mut vectorless_document::DocumentTree, min_tokens: usize, - metrics: &mut crate::index::IndexMetrics, + metrics: &mut crate::IndexMetrics, ) -> usize { let mut merged_count = 0; @@ -108,7 +108,7 @@ impl OptimizeStage { } /// Remove empty intermediate nodes (skip root). 
- fn remove_empty_nodes(tree: &mut crate::document::DocumentTree) -> usize { + fn remove_empty_nodes(tree: &mut vectorless_document::DocumentTree) -> usize { let mut removed_count = 0; let root = tree.root(); @@ -189,7 +189,7 @@ impl IndexStage for OptimizeStage { let tree = ctx .tree .as_mut() - .ok_or_else(|| crate::Error::IndexBuild("Tree not built".to_string()))?; + .ok_or_else(|| vectorless_error::Error::IndexBuild("Tree not built".to_string()))?; let node_count = tree.node_count(); info!( @@ -242,10 +242,10 @@ impl IndexStage for OptimizeStage { #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; - use crate::index::PipelineOptions; - use crate::index::pipeline::IndexContext; - use crate::index::pipeline::IndexInput; + use crate::PipelineOptions; + use crate::pipeline::IndexContext; + use crate::pipeline::IndexInput; + use vectorless_document::DocumentTree; /// Create a tree with small leaf children under root for merge tests. /// @@ -286,7 +286,7 @@ mod tests { fn test_merge_small_leaves_merges_adjacent_pair() { let mut tree = make_merge_test_tree(); let root = tree.root(); - let mut metrics = crate::index::pipeline::IndexMetrics::new(); + let mut metrics = crate::pipeline::IndexMetrics::new(); // Threshold 100: Leaf A (50) and Leaf B (30) should merge let merged = OptimizeStage::merge_small_leaves(&mut tree, 100, &mut metrics); @@ -307,7 +307,7 @@ mod tests { #[test] fn test_merge_small_leaves_nothing_above_threshold() { let mut tree = make_merge_test_tree(); - let mut metrics = crate::index::pipeline::IndexMetrics::new(); + let mut metrics = crate::pipeline::IndexMetrics::new(); // Threshold 10: all leaves are above this, nothing merges let merged = OptimizeStage::merge_small_leaves(&mut tree, 10, &mut metrics); @@ -327,7 +327,7 @@ mod tests { n.token_count = Some(5); } - let mut metrics = crate::index::pipeline::IndexMetrics::new(); + let mut metrics = crate::pipeline::IndexMetrics::new(); let _ = 
OptimizeStage::merge_small_leaves(&mut tree, 100, &mut metrics);

         // Leaf A should now contain both contents with heading prefix
@@ -355,7 +355,7 @@ mod tests {
             n.token_count = Some(5);
         }

-        let mut metrics = crate::index::pipeline::IndexMetrics::new();
+        let mut metrics = crate::pipeline::IndexMetrics::new();
         let merged = OptimizeStage::merge_small_leaves(&mut tree, 100, &mut metrics);

         // Section is non-leaf, only Leaf is a leaf — no adjacent pair of leaves
@@ -447,7 +447,7 @@ mod tests {
     #[test]
     fn test_merge_small_leaves_empty_tree() {
         let mut tree = DocumentTree::new("Root", "");
-        let mut metrics = crate::index::pipeline::IndexMetrics::new();
+        let mut metrics = crate::pipeline::IndexMetrics::new();

         let merged = OptimizeStage::merge_small_leaves(&mut tree, 100, &mut metrics);
         assert_eq!(merged, 0, "Root with no children should merge nothing");
diff --git a/rust/src/index/stages/parse.rs b/vectorless-core/vectorless-index/src/stages/parse.rs
similarity index 85%
rename from rust/src/index/stages/parse.rs
rename to vectorless-core/vectorless-index/src/stages/parse.rs
index b0e542f6..7dbaa076 100644
--- a/rust/src/index/stages/parse.rs
+++ b/vectorless-core/vectorless-index/src/stages/parse.rs
@@ -7,17 +7,17 @@ use super::async_trait;
 use std::time::Instant;
 use tracing::{debug, info};

-use crate::error::Result;
-use crate::index::parse::DocumentFormat;
+use vectorless_document::DocumentFormat;
+use vectorless_error::Result;

 use super::{IndexStage, StageResult};
-use crate::index::IndexMode;
-use crate::index::pipeline::{IndexContext, IndexInput};
+use crate::IndexMode;
+use crate::pipeline::{IndexContext, IndexInput};

 /// Parse stage - extracts raw nodes from documents.
 pub struct ParseStage {
     /// Optional LLM client for PDF structure extraction.
-    llm_client: Option<crate::llm::LlmClient>,
+    llm_client: Option<vectorless_llm::LlmClient>,
 }

 impl ParseStage {
@@ -27,7 +27,7 @@ impl ParseStage {
     }

     /// Create a parse stage with an LLM client.
- pub fn with_llm_client(client: crate::llm::LlmClient) -> Self { + pub fn with_llm_client(client: vectorless_llm::LlmClient) -> Self { Self { llm_client: Some(client), } @@ -39,8 +39,9 @@ impl ParseStage { IndexMode::Auto => match &ctx.input { IndexInput::File(path) => { let ext = path.extension().and_then(|e| e.to_str()).unwrap_or(""); - DocumentFormat::from_extension(ext) - .ok_or_else(|| crate::Error::Parse(format!("Unknown format: {}", ext))) + DocumentFormat::from_extension(ext).ok_or_else(|| { + vectorless_error::Error::Parse(format!("Unknown format: {}", ext)) + }) } IndexInput::Content { format, .. } => Ok(*format), IndexInput::Bytes { format, .. } => Ok(*format), @@ -99,7 +100,7 @@ impl IndexStage for ParseStage { debug!("[parse] Reading file: {:?}", ctx.source_path); // Parse directly - crate::index::parse::parse_file(&path, format, self.llm_client.clone()).await? + crate::parse::parse_file(&path, format, self.llm_client.clone()).await? } IndexInput::Content { content, @@ -112,8 +113,7 @@ impl IndexStage for ParseStage { debug!("[parse] Parsing inline content ({} chars)", content.len()); // Parse content directly - crate::index::parse::parse_content(content, *format, self.llm_client.clone()) - .await? + crate::parse::parse_content(content, *format, self.llm_client.clone()).await? } IndexInput::Bytes { data, name, format } => { // Set name @@ -122,7 +122,7 @@ impl IndexStage for ParseStage { debug!("[parse] Parsing bytes ({} bytes)", data.len()); // Parse bytes - crate::index::parse::parse_bytes(data, *format, self.llm_client.clone()).await? + crate::parse::parse_bytes(data, *format, self.llm_client.clone()).await? 
} }; diff --git a/rust/src/index/stages/reasoning.rs b/vectorless-core/vectorless-index/src/stages/reasoning.rs similarity index 95% rename from rust/src/index/stages/reasoning.rs rename to vectorless-core/vectorless-index/src/stages/reasoning.rs index d30e1303..da11d008 100644 --- a/rust/src/index/stages/reasoning.rs +++ b/vectorless-core/vectorless-index/src/stages/reasoning.rs @@ -11,17 +11,17 @@ use std::collections::HashMap; use std::time::Instant; use tracing::{debug, info, warn}; -use crate::document::{ +use vectorless_document::{ NodeId, ReasoningIndexBuilder, ReasoningIndexConfig, SectionSummary, SummaryShortcut, TopicEntry, }; -use crate::error::Result; -use crate::llm::LlmClient; -use crate::scoring::extract_keywords; +use vectorless_error::Result; +use vectorless_llm::LlmClient; +use vectorless_scoring::extract_keywords; use super::async_trait; use super::{AccessPattern, IndexStage, StageResult}; -use crate::index::pipeline::IndexContext; +use crate::pipeline::IndexContext; /// Reasoning Index Stage - builds a pre-computed reasoning index from the document tree. /// @@ -56,7 +56,7 @@ impl ReasoningIndexStage { /// Build the topic-to-path mapping by extracting keywords from all nodes. fn build_topic_paths( - tree: &crate::document::DocumentTree, + tree: &vectorless_document::DocumentTree, config: &ReasoningIndexConfig, ) -> (HashMap>, usize) { let mut keyword_nodes: HashMap> = HashMap::new(); @@ -143,7 +143,7 @@ impl ReasoningIndexStage { } /// Build section map from depth-1 nodes. - fn build_section_map(tree: &crate::document::DocumentTree) -> HashMap { + fn build_section_map(tree: &vectorless_document::DocumentTree) -> HashMap { let mut section_map = HashMap::new(); let root = tree.root(); for child_id in tree.children(root) { @@ -253,7 +253,7 @@ impl ReasoningIndexStage { } /// Build summary shortcut from root and depth-1 nodes. 
-    fn build_summary_shortcut(tree: &crate::document::DocumentTree) -> Option<SummaryShortcut> {
+    fn build_summary_shortcut(tree: &vectorless_document::DocumentTree) -> Option<SummaryShortcut> {
         let root = tree.root();
         let root_node = tree.get(root)?;
@@ -477,9 +477,9 @@ mod tests {

     #[test]
     fn test_build_topic_paths_basic() {
-        use crate::document::ReasoningIndexConfig;
+        use vectorless_document::ReasoningIndexConfig;

-        let mut tree = crate::document::DocumentTree::new("Root", "");
+        let mut tree = vectorless_document::DocumentTree::new("Root", "");
         let root = tree.root();
         let c1 = tree.add_child(root, "Machine Learning Introduction", "");
         let c2 = tree.add_child(root, "Deep Learning Methods", "");
@@ -511,9 +511,9 @@ mod tests {

     #[test]
     fn test_build_topic_paths_weight_normalization() {
-        use crate::document::ReasoningIndexConfig;
+        use vectorless_document::ReasoningIndexConfig;

-        let mut tree = crate::document::DocumentTree::new("Root", "");
+        let mut tree = vectorless_document::DocumentTree::new("Root", "");
         let root = tree.root();

         let _c1 = tree.add_child(root, "rust ownership", "rust borrowing rules");
@@ -534,9 +534,9 @@ mod tests {

     #[test]
     fn test_build_topic_paths_respects_max_keyword_entries() {
-        use crate::document::ReasoningIndexConfig;
+        use vectorless_document::ReasoningIndexConfig;

-        let mut tree = crate::document::DocumentTree::new("Root", "");
+        let mut tree = vectorless_document::DocumentTree::new("Root", "");
         let root = tree.root();

         // Create many children with unique keywords
@@ -561,7 +561,7 @@ mod tests {

     #[test]
     fn test_build_section_map() {
-        let mut tree = crate::document::DocumentTree::new("Root", "");
+        let mut tree = vectorless_document::DocumentTree::new("Root", "");
         let root = tree.root();
         let c1 = tree.add_child(root, "Introduction", "content");
         let c2 = tree.add_child(root, "Methods", "content");
@@ -586,7 +586,7 @@ mod tests {

     #[test]
     fn test_build_summary_shortcut() {
-        let mut tree = crate::document::DocumentTree::new("Root", "");
+        let mut tree =
vectorless_document::DocumentTree::new("Root", ""); let root = tree.root(); let c1 = tree.add_child(root, "S1", "summary 1"); let c2 = tree.add_child(root, "S2", "summary 2"); @@ -614,7 +614,7 @@ mod tests { #[test] fn test_build_summary_shortcut_fallback_to_children() { // Root has no summary → fallback to concatenating children - let mut tree = crate::document::DocumentTree::new("Root", ""); + let mut tree = vectorless_document::DocumentTree::new("Root", ""); let root = tree.root(); let c1 = tree.add_child(root, "S1", ""); let c2 = tree.add_child(root, "S2", ""); diff --git a/rust/src/index/stages/split.rs b/vectorless-core/vectorless-index/src/stages/split.rs similarity index 97% rename from rust/src/index/stages/split.rs rename to vectorless-core/vectorless-index/src/stages/split.rs index 245688b8..c729214b 100644 --- a/rust/src/index/stages/split.rs +++ b/vectorless-core/vectorless-index/src/stages/split.rs @@ -6,13 +6,13 @@ use std::time::Instant; use tracing::{debug, info}; -use crate::document::{DocumentTree, NodeId}; -use crate::error::Result; -use crate::utils::estimate_tokens; +use vectorless_document::{DocumentTree, NodeId}; +use vectorless_error::Result; +use vectorless_utils::estimate_tokens; use super::{AccessPattern, IndexStage, StageResult, async_trait}; -use crate::index::config::SplitConfig; -use crate::index::pipeline::IndexContext; +use crate::config::SplitConfig; +use crate::pipeline::IndexContext; /// Split stage — breaks oversized leaf nodes into smaller children. 
/// @@ -228,6 +228,7 @@ impl IndexStage for SplitStage { writes_reasoning_index: false, writes_navigation_index: false, writes_description: false, + writes_concepts: false, } } diff --git a/rust/src/index/stages/validate.rs b/vectorless-core/vectorless-index/src/stages/validate.rs similarity index 93% rename from rust/src/index/stages/validate.rs rename to vectorless-core/vectorless-index/src/stages/validate.rs index 312ff18a..5b165a2d 100644 --- a/rust/src/index/stages/validate.rs +++ b/vectorless-core/vectorless-index/src/stages/validate.rs @@ -7,10 +7,10 @@ use std::collections::HashSet; use std::time::Instant; use tracing::{debug, info, warn}; -use crate::error::Result; +use vectorless_error::Result; use super::{AccessPattern, IndexStage, StageResult, async_trait}; -use crate::index::pipeline::IndexContext; +use crate::pipeline::IndexContext; /// Maximum allowed tree depth. const MAX_DEPTH: usize = 20; @@ -80,7 +80,7 @@ impl ValidateStage { } /// Check that tree depth is reasonable. - fn check_depth(tree: &crate::document::DocumentTree, issues: &mut Vec) { + fn check_depth(tree: &vectorless_document::DocumentTree, issues: &mut Vec) { let all_nodes = tree.traverse(); let max_depth = all_nodes .iter() @@ -100,7 +100,10 @@ impl ValidateStage { } /// Check for leaf nodes with empty titles. - fn check_empty_titles(tree: &crate::document::DocumentTree, issues: &mut Vec) { + fn check_empty_titles( + tree: &vectorless_document::DocumentTree, + issues: &mut Vec, + ) { let leaves = tree.leaves(); let mut empty_count = 0; @@ -122,7 +125,7 @@ impl ValidateStage { /// Check token count consistency: parent's tokens should be >= sum of children's. fn check_token_consistency( - tree: &crate::document::DocumentTree, + tree: &vectorless_document::DocumentTree, issues: &mut Vec, ) { let all_nodes = tree.traverse(); @@ -167,7 +170,7 @@ impl ValidateStage { /// Check for content duplication across leaf nodes. 
fn check_content_duplication( - tree: &crate::document::DocumentTree, + tree: &vectorless_document::DocumentTree, issues: &mut Vec, ) { let leaves = tree.leaves(); @@ -239,6 +242,7 @@ impl IndexStage for ValidateStage { writes_reasoning_index: false, writes_navigation_index: false, writes_description: false, + writes_concepts: false, } } @@ -295,11 +299,11 @@ impl IndexStage for ValidateStage { #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; + use vectorless_document::DocumentTree; fn make_context_with_tree(tree: DocumentTree) -> IndexContext { - let input = crate::index::IndexInput::content("test"); - let options = crate::index::config::PipelineOptions::default(); + let input = crate::IndexInput::content("test"); + let options = crate::config::PipelineOptions::default(); let mut ctx = IndexContext::new(input, options); ctx.tree = Some(tree); ctx @@ -351,8 +355,8 @@ mod tests { #[test] fn test_validate_no_tree_error() { - let input = crate::index::IndexInput::content("test"); - let options = crate::index::config::PipelineOptions::default(); + let input = crate::IndexInput::content("test"); + let options = crate::config::PipelineOptions::default(); let ctx = IndexContext::new(input, options); let stage = ValidateStage::new(); diff --git a/vectorless-core/vectorless-index/src/stages/verify_ingest.rs b/vectorless-core/vectorless-index/src/stages/verify_ingest.rs new file mode 100644 index 00000000..33119d83 --- /dev/null +++ b/vectorless-core/vectorless-index/src/stages/verify_ingest.rs @@ -0,0 +1,78 @@ +// Copyright (c) 2026 vectorless developers +// SPDX-License-Identifier: Apache-2.0 + +//! Verify stage — validates ingest output reliability before persist. + +use tracing::{info, warn}; + +use super::async_trait; +use super::{AccessPattern, IndexStage}; +use crate::pipeline::{IndexContext, StageResult}; +use vectorless_error::{Error, Result}; + +/// Verification stage — ensures ingest produced reliable output. 
+///
+/// Checks:
+/// - Tree is non-empty (at least root node)
+/// - Document summary is non-empty
+/// - At least one concept was extracted
+///
+/// A missing or empty tree is a hard error; an empty summary or missing
+/// concepts are logged as warnings rather than failing the pipeline.
+pub struct VerifyStage;
+
+#[async_trait]
+impl IndexStage for VerifyStage {
+    fn name(&self) -> &str {
+        "verify"
+    }
+
+    fn depends_on(&self) -> Vec<&'static str> {
+        vec!["concept_extraction"]
+    }
+
+    fn is_optional(&self) -> bool {
+        false
+    }
+
+    fn access_pattern(&self) -> AccessPattern {
+        AccessPattern {
+            reads_tree: true,
+            ..AccessPattern::default()
+        }
+    }
+
+    async fn execute(&mut self, ctx: &mut IndexContext) -> Result<StageResult> {
+        // Tree must exist and have nodes
+        let tree = ctx
+            .tree
+            .as_ref()
+            .ok_or_else(|| Error::InvalidStructure("document tree is empty".into()))?;
+        let node_count = tree.node_count();
+        if node_count == 0 {
+            return Err(Error::InvalidStructure("tree has no nodes".into()));
+        }
+
+        // Summary should be non-empty (warning only, non-fatal)
+        let has_summary = ctx
+            .description
+            .as_ref()
+            .is_some_and(|s| !s.trim().is_empty());
+        if !has_summary {
+            warn!("[verify] Document summary is empty");
+        }
+
+        // Concepts should be present (warning only, non-fatal)
+        if ctx.concepts.is_empty() {
+            warn!("[verify] No concepts extracted from document");
+        }
+
+        info!(
+            "[verify] Passed: {} nodes, summary={}, concepts={}",
+            node_count,
+            has_summary,
+            ctx.concepts.len()
+        );
+
+        Ok(StageResult::success("verify"))
+    }
+}
diff --git a/rust/src/index/summary/full.rs b/vectorless-core/vectorless-index/src/summary/full.rs
similarity index 94%
rename from rust/src/index/summary/full.rs
rename to vectorless-core/vectorless-index/src/summary/full.rs
index c9e76e33..bc8a3a92 100644
--- a/rust/src/index/summary/full.rs
+++ b/vectorless-core/vectorless-index/src/summary/full.rs
@@ -3,8 +3,8 @@

 //! Full summary strategy - generate summaries for all nodes.
-use crate::document::NodeId; -use crate::llm::LlmClient; +use vectorless_document::NodeId; +use vectorless_llm::LlmClient; use super::{SummaryGenerator, SummaryStrategyConfig}; @@ -46,7 +46,7 @@ impl FullStrategy { } /// Generate a summary for content. - pub async fn generate(&self, title: &str, content: &str) -> crate::llm::LlmResult { + pub async fn generate(&self, title: &str, content: &str) -> vectorless_llm::LlmResult { self.generator.generate(title, content).await } diff --git a/rust/src/index/summary/lazy.rs b/vectorless-core/vectorless-index/src/summary/lazy.rs similarity index 98% rename from rust/src/index/summary/lazy.rs rename to vectorless-core/vectorless-index/src/summary/lazy.rs index 6d9cadef..29821d5b 100644 --- a/rust/src/index/summary/lazy.rs +++ b/vectorless-core/vectorless-index/src/summary/lazy.rs @@ -7,7 +7,7 @@ use std::collections::HashMap; use std::sync::Arc; use tokio::sync::RwLock; -use crate::llm::LlmClient; +use vectorless_llm::LlmClient; use super::{SummaryGenerator, SummaryStrategyConfig}; @@ -93,7 +93,7 @@ impl LazyStrategy { node_id: &str, title: &str, content: &str, - ) -> crate::llm::LlmResult { + ) -> vectorless_llm::LlmResult { // Check cache first if self.persist { if let Some(cached) = self.get_cached(node_id).await { diff --git a/rust/src/index/summary/mod.rs b/vectorless-core/vectorless-index/src/summary/mod.rs similarity index 100% rename from rust/src/index/summary/mod.rs rename to vectorless-core/vectorless-index/src/summary/mod.rs diff --git a/rust/src/index/summary/selective.rs b/vectorless-core/vectorless-index/src/summary/selective.rs similarity index 96% rename from rust/src/index/summary/selective.rs rename to vectorless-core/vectorless-index/src/summary/selective.rs index 18c8946e..f933d988 100644 --- a/rust/src/index/summary/selective.rs +++ b/vectorless-core/vectorless-index/src/summary/selective.rs @@ -3,8 +3,8 @@ //! Selective summary strategy - generate summaries only for qualifying nodes. 
-use crate::document::{DocumentTree, NodeId}; -use crate::llm::LlmClient; +use vectorless_document::{DocumentTree, NodeId}; +use vectorless_llm::LlmClient; use super::{SummaryGenerator, SummaryStrategyConfig}; @@ -89,7 +89,7 @@ impl SelectiveStrategy { } /// Generate a summary for content. - pub async fn generate(&self, title: &str, content: &str) -> crate::llm::LlmResult { + pub async fn generate(&self, title: &str, content: &str) -> vectorless_llm::LlmResult { self.generator.generate(title, content).await } diff --git a/rust/src/index/summary/strategy.rs b/vectorless-core/vectorless-index/src/summary/strategy.rs similarity index 98% rename from rust/src/index/summary/strategy.rs rename to vectorless-core/vectorless-index/src/summary/strategy.rs index 7937aa74..03f7a6c1 100644 --- a/rust/src/index/summary/strategy.rs +++ b/vectorless-core/vectorless-index/src/summary/strategy.rs @@ -5,10 +5,10 @@ use async_trait::async_trait; -use crate::document::{DocumentTree, NodeId}; -use crate::llm::memo::{MemoKey, MemoStore, MemoValue}; -use crate::llm::{LlmClient, LlmResult}; -use crate::utils::fingerprint::Fingerprint; +use vectorless_document::{DocumentTree, NodeId}; +use vectorless_llm::memo::{MemoKey, MemoStore, MemoValue}; +use vectorless_llm::{LlmClient, LlmResult}; +use vectorless_utils::fingerprint::Fingerprint; /// Configuration for summary strategies. 
#[derive(Debug, Clone)] diff --git a/vectorless-core/vectorless-llm/Cargo.toml b/vectorless-core/vectorless-llm/Cargo.toml new file mode 100644 index 00000000..8aa02162 --- /dev/null +++ b/vectorless-core/vectorless-llm/Cargo.toml @@ -0,0 +1,35 @@ +[package] +name = "vectorless-llm" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-config = { path = "../vectorless-config" } +vectorless-error = { path = "../vectorless-error" } +vectorless-metrics = { path = "../vectorless-metrics" } +vectorless-utils = { path = "../vectorless-utils" } +async-openai = { workspace = true } +tokio = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } +tracing = { workspace = true } +thiserror = { workspace = true } +chrono = { workspace = true } +governor = { workspace = true } +nonzero_ext = { workspace = true } +lru = { workspace = true } +parking_lot = { workspace = true } +uuid = { workspace = true } +rand = { workspace = true } +base64 = { workspace = true } + +[dev-dependencies] +tempfile = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/llm/client.rs b/vectorless-core/vectorless-llm/src/client.rs similarity index 98% rename from rust/src/llm/client.rs rename to vectorless-core/vectorless-llm/src/client.rs index 3eeb60af..0356f7bb 100644 --- a/rust/src/llm/client.rs +++ b/vectorless-core/vectorless-llm/src/client.rs @@ -147,7 +147,7 @@ impl LlmClient { } /// Add metrics hub for recording LLM call statistics. 
-    pub fn with_shared_metrics(mut self, hub: Arc<crate::metrics::MetricsHub>) -> Self {
+    pub fn with_shared_metrics(mut self, hub: Arc<vectorless_metrics::MetricsHub>) -> Self {
         self.executor = self.executor.with_shared_metrics(hub);
         self
     }
@@ -355,7 +355,7 @@ mod tests {

     #[test]
     fn test_client_with_concurrency() {
-        use crate::llm::throttle::ConcurrencyConfig;
+        use crate::throttle::ConcurrencyConfig;

         let controller = ConcurrencyController::new(ConcurrencyConfig::conservative());
         let client = LlmClient::for_model("gpt-4o-mini").with_concurrency(controller);
@@ -365,7 +365,7 @@ mod tests {

     #[test]
     fn test_client_with_shared_metrics() {
-        use crate::metrics::MetricsHub;
+        use vectorless_metrics::MetricsHub;

         let hub = MetricsHub::shared();
         let client = LlmClient::for_model("gpt-4o").with_shared_metrics(hub.clone());
diff --git a/rust/src/llm/config.rs b/vectorless-core/vectorless-llm/src/config.rs
similarity index 93%
rename from rust/src/llm/config.rs
rename to vectorless-core/vectorless-llm/src/config.rs
index 32685e36..b85d9831 100644
--- a/rust/src/llm/config.rs
+++ b/vectorless-core/vectorless-llm/src/config.rs
@@ -9,7 +9,7 @@ use std::time::Duration;
 /// Runtime LLM client configuration.
 ///
 /// This is the runtime representation used by [`LlmClient`](super::LlmClient).
-/// Created from the config-layer [`LlmConfig`](crate::config::LlmConfig)
+/// Created from the config-layer [`LlmConfig`](vectorless_config::LlmConfig)
 /// during pool construction — users never construct this directly.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct LlmConfig {
@@ -150,6 +150,18 @@ fn default_true() -> bool {
     true
 }

+impl From<&vectorless_config::RetryConfig> for RetryConfig {
+    fn from(c: &vectorless_config::RetryConfig) -> Self {
+        Self {
+            max_attempts: c.max_attempts,
+            initial_delay_ms: c.initial_delay_ms,
+            max_delay_ms: c.max_delay_ms,
+            multiplier: c.multiplier,
+            retry_on_rate_limit: c.retry_on_rate_limit,
+        }
+    }
+}
+
 impl Default for RetryConfig {
     fn default() -> Self {
         Self {
diff --git a/rust/src/llm/error.rs b/vectorless-core/vectorless-llm/src/error.rs
similarity index 97%
rename from rust/src/llm/error.rs
rename to vectorless-core/vectorless-llm/src/error.rs
index 5969cf72..94598345 100644
--- a/rust/src/llm/error.rs
+++ b/vectorless-core/vectorless-llm/src/error.rs
@@ -93,9 +93,9 @@ impl From for LlmError {
     }
 }

-impl From<LlmError> for crate::Error {
+impl From<LlmError> for vectorless_error::Error {
     fn from(e: LlmError) -> Self {
-        crate::Error::Llm(e.to_string())
+        vectorless_error::Error::Llm(e.to_string())
     }
 }
diff --git a/rust/src/llm/executor.rs b/vectorless-core/vectorless-llm/src/executor.rs
similarity index 99%
rename from rust/src/llm/executor.rs
rename to vectorless-core/vectorless-llm/src/executor.rs
index 409b474e..37790fc8 100644
--- a/rust/src/llm/executor.rs
+++ b/vectorless-core/vectorless-llm/src/executor.rs
@@ -62,7 +62,7 @@ use super::config::LlmConfig;
 use super::error::{LlmError, LlmResult};
 use super::fallback::{FallbackChain, FallbackStep};
 use super::throttle::ConcurrencyController;
-use crate::metrics::MetricsHub;
+use vectorless_metrics::MetricsHub;

 /// Unified executor for LLM operations.
/// @@ -518,7 +518,7 @@ mod tests { #[test] fn test_executor_with_throttle() { - use crate::llm::throttle::ConcurrencyConfig; + use crate::throttle::ConcurrencyConfig; let controller = ConcurrencyController::new(ConcurrencyConfig::conservative()); let executor = LlmExecutor::for_model("gpt-4o-mini").with_throttle(controller); diff --git a/rust/src/llm/fallback.rs b/vectorless-core/vectorless-llm/src/fallback.rs similarity index 99% rename from rust/src/llm/fallback.rs rename to vectorless-core/vectorless-llm/src/fallback.rs index fb6e37cd..d9a84c0e 100644 --- a/rust/src/llm/fallback.rs +++ b/vectorless-core/vectorless-llm/src/fallback.rs @@ -24,7 +24,7 @@ use serde::{Deserialize, Serialize}; use tracing::{debug, info, warn}; use super::error::LlmError; -use crate::config::{ +use vectorless_config::{ FallbackBehavior, FallbackConfig as ConfigFallbackConfig, OnAllFailedBehavior, }; diff --git a/rust/src/llm/mod.rs b/vectorless-core/vectorless-llm/src/lib.rs similarity index 93% rename from rust/src/llm/mod.rs rename to vectorless-core/vectorless-llm/src/lib.rs index bd65e58a..8bb01c3b 100644 --- a/rust/src/llm/mod.rs +++ b/vectorless-core/vectorless-llm/src/lib.rs @@ -29,14 +29,17 @@ //! 
``` mod client; -pub(crate) mod config; +pub mod config; mod error; mod executor; mod fallback; -pub(crate) mod memo; +pub mod memo; mod pool; -pub(crate) mod throttle; +pub mod throttle; pub use client::LlmClient; pub use error::LlmResult; pub use pool::LlmPool; + +// Re-export vectorless_error types for internal use +pub(crate) use vectorless_error::{Error, Result}; diff --git a/rust/src/llm/memo/mod.rs b/vectorless-core/vectorless-llm/src/memo/mod.rs similarity index 100% rename from rust/src/llm/memo/mod.rs rename to vectorless-core/vectorless-llm/src/memo/mod.rs diff --git a/rust/src/llm/memo/store.rs b/vectorless-core/vectorless-llm/src/memo/store.rs similarity index 97% rename from rust/src/llm/memo/store.rs rename to vectorless-core/vectorless-llm/src/memo/store.rs index 2fdfcea4..5a87bd61 100644 --- a/rust/src/llm/memo/store.rs +++ b/vectorless-core/vectorless-llm/src/memo/store.rs @@ -18,8 +18,8 @@ use serde::{Deserialize, Serialize}; use tracing::{debug, info}; use super::types::{MemoEntry, MemoKey, MemoOpType, MemoStats, MemoValue}; -use crate::error::Result; -use crate::utils::fingerprint::Fingerprint; +use vectorless_error::Result; +use vectorless_utils::fingerprint::Fingerprint; /// Default TTL for cache entries (7 days). 
const DEFAULT_TTL: Duration = Duration::days(7); @@ -388,14 +388,15 @@ impl MemoStore { stats, }; - let parent = path - .parent() - .ok_or_else(|| crate::Error::Parse("Invalid path for memo store".to_string()))?; + let parent = path.parent().ok_or_else(|| { + vectorless_error::Error::Parse("Invalid path for memo store".to_string()) + })?; tokio::fs::create_dir_all(parent).await?; let temp_path = path.with_extension("tmp"); - let json = serde_json::to_vec_pretty(&data) - .map_err(|e| crate::Error::Parse(format!("Failed to serialize memo store: {}", e)))?; + let json = serde_json::to_vec_pretty(&data).map_err(|e| { + vectorless_error::Error::Parse(format!("Failed to serialize memo store: {}", e)) + })?; tokio::fs::write(&temp_path, &json).await?; tokio::fs::rename(&temp_path, path).await?; @@ -414,8 +415,9 @@ impl MemoStore { } let bytes = tokio::fs::read(path).await?; - let data: MemoStoreData = serde_json::from_slice(&bytes) - .map_err(|e| crate::Error::Parse(format!("Failed to deserialize memo store: {}", e)))?; + let data: MemoStoreData = serde_json::from_slice(&bytes).map_err(|e| { + vectorless_error::Error::Parse(format!("Failed to deserialize memo store: {}", e)) + })?; let mut cache = self.cache.write(); diff --git a/rust/src/llm/memo/types.rs b/vectorless-core/vectorless-llm/src/memo/types.rs similarity index 99% rename from rust/src/llm/memo/types.rs rename to vectorless-core/vectorless-llm/src/memo/types.rs index a45aed12..9e3cb86d 100644 --- a/rust/src/llm/memo/types.rs +++ b/vectorless-core/vectorless-llm/src/memo/types.rs @@ -6,7 +6,7 @@ use chrono::{DateTime, Utc}; use serde::{Deserialize, Serialize}; -use crate::utils::fingerprint::Fingerprint; +use vectorless_utils::fingerprint::Fingerprint; /// Types of operations that can be memoized. #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, Serialize, Deserialize)] @@ -157,7 +157,7 @@ impl MemoKey { /// Compute a fingerprint of this key for storage. 
pub fn fingerprint(&self) -> Fingerprint { - use crate::utils::fingerprint::Fingerprinter; + use vectorless_utils::fingerprint::Fingerprinter; let mut fp = Fingerprinter::new(); fp.write_u64(self.op_type.as_byte() as u64); diff --git a/rust/src/llm/pool.rs b/vectorless-core/vectorless-llm/src/pool.rs similarity index 85% rename from rust/src/llm/pool.rs rename to vectorless-core/vectorless-llm/src/pool.rs index 9acef8ca..45d3b4be 100644 --- a/rust/src/llm/pool.rs +++ b/vectorless-core/vectorless-llm/src/pool.rs @@ -9,7 +9,7 @@ use super::client::LlmClient; use super::config::LlmConfig; use super::fallback::{FallbackChain, FallbackConfig}; use super::throttle::ConcurrencyController; -use crate::metrics::MetricsHub; +use vectorless_metrics::MetricsHub; /// Pool of LLM clients for different purposes. /// @@ -21,7 +21,7 @@ use crate::metrics::MetricsHub; /// /// # Construction /// -/// The pool is built from a [`config::LlmConfig`](crate::config::LlmConfig) +/// The pool is built from a [`config::LlmConfig`](vectorless_config::LlmConfig) /// which defines the global credentials and per-slot overrides. /// /// ```rust,ignore @@ -49,14 +49,14 @@ impl LlmPool { /// When `metrics` is provided, all clients share the same hub /// for unified LLM call statistics. 
     pub fn from_config(
-        config: &crate::config::LlmConfig,
+        config: &vectorless_config::LlmConfig,
         metrics: Option<Arc<MetricsHub>>,
     ) -> Self {
         let api_key = config.api_key.clone();
         let endpoint = config.endpoint.clone().unwrap_or_default();
-        let retry = config.retry.to_runtime_config();
+        let retry = super::config::RetryConfig::from(&config.retry);

-        let make_config = |slot: &crate::config::SlotConfig| -> LlmConfig {
+        let make_config = |slot: &vectorless_config::SlotConfig| -> LlmConfig {
             LlmConfig {
                 model: config.resolve_model(slot),
                 endpoint: endpoint.clone(),
@@ -81,14 +81,14 @@ impl LlmPool {
         ));

         // Attach shared throttle controller from config
-        let concurrency_config = config.throttle.to_runtime_config();
+        let concurrency_config = super::throttle::ConcurrencyConfig::from(&config.throttle);
         let controller = Arc::new(ConcurrencyController::new(concurrency_config));

         // Attach shared fallback chain from config
         let fallback_config: FallbackConfig = config.fallback.clone().into();
         let fallback_chain = Arc::new(FallbackChain::new(fallback_config));

-        let build_client = |slot_config: &crate::config::SlotConfig| {
+        let build_client = |slot_config: &vectorless_config::SlotConfig| {
             let mut client = LlmClient::new(make_config(slot_config))
                 .with_shared_concurrency(controller.clone())
                 .with_shared_openai_client(openai_client.clone())
@@ -107,7 +107,7 @@ impl LlmPool {

     /// Create a pool with default configurations.
     pub fn from_defaults() -> Self {
-        Self::from_config(&crate::config::LlmConfig::default(), None)
+        Self::from_config(&vectorless_config::LlmConfig::default(), None)
     }

     /// Get the index client.
@@ -133,10 +133,10 @@ mod tests {
 
     #[test]
     fn test_pool_from_config() {
-        let config = crate::config::LlmConfig::new("gpt-4o")
+        let config = vectorless_config::LlmConfig::new("gpt-4o")
             .with_api_key("sk-test")
             .with_endpoint("https://api.openai.com/v1")
-            .with_index(crate::config::SlotConfig::fast().with_model("gpt-4o-mini"));
+            .with_index(vectorless_config::SlotConfig::fast().with_model("gpt-4o-mini"));
 
         let pool = LlmPool::from_config(&config, None);
 
@@ -147,7 +147,7 @@ mod tests {
 
     #[test]
     fn test_pool_from_config_with_metrics() {
-        let config = crate::config::LlmConfig::new("gpt-4o")
+        let config = vectorless_config::LlmConfig::new("gpt-4o")
             .with_api_key("sk-test")
             .with_endpoint("https://api.openai.com/v1");
 
@@ -163,7 +163,7 @@ mod tests {
 
     #[test]
     fn test_pool_shared_metrics_hub() {
-        let config = crate::config::LlmConfig::new("gpt-4o")
+        let config = vectorless_config::LlmConfig::new("gpt-4o")
             .with_api_key("sk-test")
             .with_endpoint("https://api.openai.com/v1");
 
diff --git a/rust/src/llm/throttle.rs b/vectorless-core/vectorless-llm/src/throttle.rs
similarity index 95%
rename from rust/src/llm/throttle.rs
rename to vectorless-core/vectorless-llm/src/throttle.rs
index 5de96743..2a0d27fb 100644
--- a/rust/src/llm/throttle.rs
+++ b/vectorless-core/vectorless-llm/src/throttle.rs
@@ -44,6 +44,17 @@ pub struct ConcurrencyConfig {
     pub semaphore_enabled: bool,
 }
 
+impl From<&vectorless_config::ThrottleConfig> for ConcurrencyConfig {
+    fn from(c: &vectorless_config::ThrottleConfig) -> Self {
+        Self {
+            max_concurrent_requests: c.max_concurrent_requests,
+            requests_per_minute: c.requests_per_minute,
+            enabled: c.enabled,
+            semaphore_enabled: c.semaphore_enabled,
+        }
+    }
+}
+
 fn default_max_concurrent_requests() -> usize {
     10
 }
diff --git a/vectorless-core/vectorless-metrics/Cargo.toml b/vectorless-core/vectorless-metrics/Cargo.toml
new file mode 100644
index 00000000..93237dcc
--- /dev/null
+++ b/vectorless-core/vectorless-metrics/Cargo.toml
@@ -0,0 +1,19 @@
+[package]
+name = "vectorless-metrics"
+version.workspace = true
+edition.workspace = true
+authors.workspace = true
+description.workspace = true
+license.workspace = true
+repository.workspace = true
+homepage.workspace = true
+
+[dependencies]
+vectorless-config = { path = "../vectorless-config" }
+vectorless-error = { path = "../vectorless-error" }
+serde = { workspace = true }
+tracing = { workspace = true }
+parking_lot = { workspace = true }
+
+[lints]
+workspace = true
diff --git a/rust/src/metrics/hub.rs b/vectorless-core/vectorless-metrics/src/hub.rs
similarity index 99%
rename from rust/src/metrics/hub.rs
rename to vectorless-core/vectorless-metrics/src/hub.rs
index c4f70fe4..ff4f553b 100644
--- a/rust/src/metrics/hub.rs
+++ b/vectorless-core/vectorless-metrics/src/hub.rs
@@ -7,7 +7,7 @@ use std::sync::Arc;
 
 use super::llm::{LlmMetrics, LlmMetricsReport};
 use super::retrieval::{RetrievalMetrics, RetrievalMetricsReport};
-use crate::config::MetricsConfig;
+use vectorless_config::MetricsConfig;
 
 /// Central metrics hub for unified collection.
 ///
diff --git a/rust/src/metrics/index.rs b/vectorless-core/vectorless-metrics/src/index.rs
similarity index 100%
rename from rust/src/metrics/index.rs
rename to vectorless-core/vectorless-metrics/src/index.rs
diff --git a/rust/src/metrics/mod.rs b/vectorless-core/vectorless-metrics/src/lib.rs
similarity index 100%
rename from rust/src/metrics/mod.rs
rename to vectorless-core/vectorless-metrics/src/lib.rs
diff --git a/rust/src/metrics/llm.rs b/vectorless-core/vectorless-metrics/src/llm.rs
similarity index 99%
rename from rust/src/metrics/llm.rs
rename to vectorless-core/vectorless-metrics/src/llm.rs
index 257747ae..09d1546a 100644
--- a/rust/src/metrics/llm.rs
+++ b/vectorless-core/vectorless-metrics/src/llm.rs
@@ -5,7 +5,7 @@
 
 use std::sync::atomic::{AtomicU64, Ordering};
 
-use crate::config::LlmMetricsConfig;
+use vectorless_config::LlmMetricsConfig;
 
 /// LLM metrics tracker.
 #[derive(Debug, Default)]
diff --git a/rust/src/metrics/retrieval.rs b/vectorless-core/vectorless-metrics/src/retrieval.rs
similarity index 99%
rename from rust/src/metrics/retrieval.rs
rename to vectorless-core/vectorless-metrics/src/retrieval.rs
index 682250e9..ce15941e 100644
--- a/rust/src/metrics/retrieval.rs
+++ b/vectorless-core/vectorless-metrics/src/retrieval.rs
@@ -5,7 +5,7 @@
 
 use std::sync::atomic::{AtomicU64, Ordering};
 
-use crate::config::RetrievalMetricsConfig;
+use vectorless_config::RetrievalMetricsConfig;
 
 /// Retrieval metrics tracker.
 #[derive(Debug, Default)]
diff --git a/python/Cargo.toml b/vectorless-core/vectorless-py/Cargo.toml
similarity index 74%
rename from python/Cargo.toml
rename to vectorless-core/vectorless-py/Cargo.toml
index 93a0d557..967540ce 100644
--- a/python/Cargo.toml
+++ b/vectorless-core/vectorless-py/Cargo.toml
@@ -1,11 +1,12 @@
 [package]
 name = "vectorless-py"
-version = "0.1.0"
+version.workspace = true
 edition.workspace = true
 authors.workspace = true
 description = "Python bindings for vectorless"
 license.workspace = true
 repository.workspace = true
+homepage.workspace = true
 
 [lib]
 name = "vectorless"
@@ -15,4 +16,7 @@ crate-type = ["cdylib"]
 pyo3 = { workspace = true }
 pyo3-async-runtimes = { workspace = true }
 tokio = { version = "1", features = ["rt-multi-thread"] }
-vectorless = { path = "../rust" }
+vectorless-engine = { path = "../vectorless-engine" }
+
+[lints]
+workspace = true
diff --git a/vectorless-core/vectorless-py/src/answer.rs b/vectorless-core/vectorless-py/src/answer.rs
new file mode 100644
index 00000000..92131392
--- /dev/null
+++ b/vectorless-core/vectorless-py/src/answer.rs
@@ -0,0 +1,103 @@
+// Copyright (c) 2026 vectorless developers
+// SPDX-License-Identifier: Apache-2.0
+
+//! Answer Python wrapper.
+
+use pyo3::prelude::*;
+
+use ::vectorless_engine::Answer;
+
+/// A reasoned answer with evidence and trace.
+#[pyclass(name = "Answer")]
+pub struct PyAnswer {
+    pub(crate) inner: Answer,
+}
+
+#[pymethods]
+impl PyAnswer {
+    /// The answer content.
+    #[getter]
+    fn content(&self) -> &str {
+        &self.inner.content
+    }
+
+    /// Evidence supporting the answer.
+    #[getter]
+    fn evidence(&self) -> Vec<PyEvidence> {
+        self.inner
+            .evidence
+            .iter()
+            .map(|e| PyEvidence {
+                content: e.content.clone(),
+                source_path: e.source_path.clone(),
+                doc_name: e.doc_name.clone(),
+                relevance: e.relevance,
+            })
+            .collect()
+    }
+
+    /// Confidence score (0.0–1.0).
+    #[getter]
+    fn confidence(&self) -> f32 {
+        self.inner.confidence
+    }
+
+    /// Reasoning trace — how the agent arrived at this answer.
+    #[getter]
+    fn trace(&self) -> PyReasoningTrace {
+        PyReasoningTrace {
+            steps: self
+                .inner
+                .trace
+                .steps
+                .iter()
+                .map(|s| PyTraceStep {
+                    action: s.action.clone(),
+                    observation: s.observation.clone(),
+                    round: s.round,
+                })
+                .collect(),
+        }
+    }
+
+    fn __repr__(&self) -> String {
+        format!(
+            "Answer(confidence={:.2}, evidence={}, trace_steps={})",
+            self.inner.confidence,
+            self.inner.evidence.len(),
+            self.inner.trace.steps.len()
+        )
+    }
+}
+
+/// A piece of evidence with source attribution.
+#[pyclass(name = "Evidence")]
+pub struct PyEvidence {
+    #[pyo3(get)]
+    pub content: String,
+    #[pyo3(get)]
+    pub source_path: String,
+    #[pyo3(get)]
+    pub doc_name: String,
+    #[pyo3(get)]
+    pub relevance: f32,
+}
+
+/// Reasoning trace — always present.
+#[pyclass(name = "ReasoningTrace")]
+pub struct PyReasoningTrace {
+    #[pyo3(get)]
+    pub steps: Vec<PyTraceStep>,
+}
+
+/// A single step in the reasoning trace.
+#[pyclass(name = "TraceStep", skip_from_py_object)]
+#[derive(Clone)]
+pub struct PyTraceStep {
+    #[pyo3(get)]
+    pub action: String,
+    #[pyo3(get)]
+    pub observation: String,
+    #[pyo3(get)]
+    pub round: u32,
+}
diff --git a/python/src/config.rs b/vectorless-core/vectorless-py/src/config.rs
similarity index 94%
rename from python/src/config.rs
rename to vectorless-core/vectorless-py/src/config.rs
index ce601311..6b043ea6 100644
--- a/python/src/config.rs
+++ b/vectorless-core/vectorless-py/src/config.rs
@@ -24,7 +24,7 @@ use pyo3::prelude::*;
 /// ```
 #[pyclass(name = "Config")]
 pub struct PyConfig {
-    pub(crate) inner: vectorless::Config,
+    pub(crate) inner: vectorless_engine::Config,
 }
 
 #[pymethods]
@@ -33,7 +33,7 @@ impl PyConfig {
     #[new]
     fn new() -> Self {
         Self {
-            inner: vectorless::Config::default(),
+            inner: vectorless_engine::Config::default(),
         }
     }
diff --git a/vectorless-core/vectorless-py/src/document.rs b/vectorless-core/vectorless-py/src/document.rs
new file mode 100644
index 00000000..af200d02
--- /dev/null
+++ b/vectorless-core/vectorless-py/src/document.rs
@@ -0,0 +1,78 @@
+// Copyright (c) 2026 vectorless developers
+// SPDX-License-Identifier: Apache-2.0
+
+//! DocumentInfo Python wrapper.
+
+use pyo3::prelude::*;
+
+use ::vectorless_engine::DocumentInfo;
+
+/// Information about an understood document.
+#[pyclass(name = "DocumentInfo")]
+pub struct PyDocumentInfo {
+    pub(crate) inner: DocumentInfo,
+}
+
+#[pymethods]
+impl PyDocumentInfo {
+    #[getter]
+    fn doc_id(&self) -> &str {
+        &self.inner.doc_id
+    }
+
+    #[getter]
+    fn name(&self) -> &str {
+        &self.inner.name
+    }
+
+    #[getter]
+    fn format(&self) -> &str {
+        &self.inner.format
+    }
+
+    #[getter]
+    fn summary(&self) -> &str {
+        &self.inner.summary
+    }
+
+    #[getter]
+    fn concepts(&self) -> Vec<PyConcept> {
+        self.inner
+            .concepts
+            .iter()
+            .map(|c| PyConcept {
+                name: c.name.clone(),
+                summary: c.summary.clone(),
+                sections: c.sections.clone(),
+            })
+            .collect()
+    }
+
+    #[getter]
+    fn section_count(&self) -> usize {
+        self.inner.section_count
+    }
+
+    #[getter]
+    fn page_count(&self) -> Option<usize> {
+        self.inner.page_count
+    }
+
+    fn __repr__(&self) -> String {
+        format!(
+            "DocumentInfo(doc_id='{}', name='{}', format='{}')",
+            self.inner.doc_id, self.inner.name, self.inner.format
+        )
+    }
+}
+
+/// A key concept extracted from a document.
+#[pyclass(name = "Concept")]
+pub struct PyConcept {
+    #[pyo3(get)]
+    pub name: String,
+    #[pyo3(get)]
+    pub summary: String,
+    #[pyo3(get)]
+    pub sections: Vec<String>,
+}
diff --git a/python/src/engine.rs b/vectorless-core/vectorless-py/src/engine.rs
similarity index 58%
rename from python/src/engine.rs
rename to vectorless-core/vectorless-py/src/engine.rs
index 23924568..66c4e30c 100644
--- a/python/src/engine.rs
+++ b/vectorless-core/vectorless-py/src/engine.rs
@@ -1,69 +1,65 @@
 // Copyright (c) 2026 vectorless developers
 // SPDX-License-Identifier: Apache-2.0
 
-//! Engine Python wrapper and async helpers.
+//! Engine Python wrapper — async ingest/ask/forget/list_documents.
 use pyo3::prelude::*;
 use pyo3_async_runtimes::tokio::future_into_py;
 use std::sync::Arc;
 use tokio::runtime::Runtime;
 
-use ::vectorless::{Engine, EngineBuilder, IndexContext, QueryContext};
+use ::vectorless_engine::{Engine, EngineBuilder, IngestInput};
 
-use super::config::PyConfig;
-use super::context::{PyIndexContext, PyQueryContext};
+use super::answer::PyAnswer;
 use super::document::PyDocumentInfo;
 use super::error::VectorlessError;
 use super::error::to_py_err;
 use super::graph::PyDocumentGraph;
 use super::metrics::PyMetricsReport;
-use super::results::{PyIndexResult, PyQueryResult};
-use super::streaming::PyStreamingQuery;
 
 // ============================================================
 // Engine async helpers (named functions to avoid FnOnce HRTB issue)
 // ============================================================
 
-async fn run_index(engine: Arc<Engine>, ctx: IndexContext) -> PyResult<PyIndexResult> {
-    let result = engine.index(ctx).await.map_err(to_py_err)?;
-    Ok(PyIndexResult { inner: result })
+async fn run_ingest(engine: Arc<Engine>, input: IngestInput) -> PyResult<PyDocumentInfo> {
+    let doc = engine.ingest(input).await.map_err(to_py_err)?;
+    Ok(PyDocumentInfo { inner: doc })
 }
 
-async fn run_query(engine: Arc<Engine>, ctx: QueryContext) -> PyResult<PyQueryResult> {
-    let result = engine.query(ctx).await.map_err(to_py_err)?;
-    Ok(PyQueryResult { inner: result })
+async fn run_ask(
+    engine: Arc<Engine>,
+    question: String,
+    doc_ids: Vec<String>,
+) -> PyResult<PyAnswer> {
+    let answer = engine.ask(&question, &doc_ids).await.map_err(to_py_err)?;
+    Ok(PyAnswer { inner: answer })
 }
 
-async fn run_list(engine: Arc<Engine>) -> PyResult<Vec<PyDocumentInfo>> {
-    let docs = engine.list().await.map_err(to_py_err)?;
+async fn run_forget(engine: Arc<Engine>, doc_id: String) -> PyResult<()> {
+    engine.forget(&doc_id).await.map_err(to_py_err)
+}
+
+async fn run_list_documents(engine: Arc<Engine>) -> PyResult<Vec<PyDocumentInfo>> {
+    let docs = engine.list_documents().await.map_err(to_py_err)?;
     Ok(docs
         .into_iter()
         .map(|d| PyDocumentInfo { inner: d })
        .collect())
 }
 
-async fn run_remove(engine: Arc<Engine>, doc_id: String) -> PyResult<bool> {
-    engine.remove(&doc_id).await.map_err(to_py_err)
+async fn run_exists(engine: Arc<Engine>, doc_id: String) -> PyResult<bool> {
+    engine.exists(&doc_id).await.map_err(to_py_err)
 }
 
 async fn run_clear(engine: Arc<Engine>) -> PyResult<usize> {
     engine.clear().await.map_err(to_py_err)
 }
 
-async fn run_exists(engine: Arc<Engine>, doc_id: String) -> PyResult<bool> {
-    engine.exists(&doc_id).await.map_err(to_py_err)
-}
-
 async fn run_get_graph(engine: Arc<Engine>) -> PyResult<Option<PyDocumentGraph>> {
     let graph = engine.get_graph().await.map_err(to_py_err)?;
     Ok(graph.map(|g| PyDocumentGraph { inner: g }))
 }
 
-async fn run_query_stream(engine: Arc<Engine>, ctx: QueryContext) -> PyResult<PyStreamingQuery> {
-    let rx = engine.query_stream(ctx).await.map_err(to_py_err)?;
-    Ok(PyStreamingQuery::new(rx))
-}
-
 fn run_metrics_report(engine: Arc<Engine>) -> PyMetricsReport {
     PyMetricsReport {
         inner: engine.metrics_report(),
@@ -74,25 +70,29 @@ fn run_metrics_report(engine: Arc<Engine>) -> PyMetricsReport {
 // Engine
 // ============================================================
 
-/// The main vectorless engine.
+/// The vectorless Document Understanding Engine.
 ///
-/// `api_key` and `model` are **required**.
+/// All methods are **async** — use `await` to call them.
 ///
 /// ```python
-/// from vectorless import Engine, IndexContext, QueryContext
+/// from vectorless import Engine
+///
+/// engine = Engine(api_key="sk-...", model="gpt-4o")
 ///
-/// engine = Engine(
-///     api_key="sk-...",
-///     model="gpt-4o",
-/// )
+/// # Understand a document
+/// doc = await engine.ingest("./report.pdf")
+/// print(doc.summary)
 ///
-/// # Index
-/// result = await engine.index(IndexContext.from_path("./report.pdf"))
-/// doc_id = result.doc_id
+/// # Ask a question
+/// answer = await engine.ask("What is the revenue?", doc_ids=[doc.doc_id])
+/// print(answer.content)
+/// print(answer.trace)  # reasoning trace — always present
 ///
-/// # Query
-/// answer = await engine.query(QueryContext("What is the revenue?").with_doc_ids([doc_id]))
-/// print(answer.single().content)
+/// # List all understood documents
+/// docs = await engine.list_documents()
+///
+/// # Forget a document
+/// await engine.forget(doc.doc_id)
 /// ```
 #[pyclass(name = "Engine")]
 pub struct PyEngine {
@@ -117,7 +117,7 @@ impl PyEngine {
         api_key: Option<String>,
         model: Option<String>,
         endpoint: Option<String>,
-        config: Option>,
+        config: Option>,
     ) -> PyResult<Self> {
         let rt = Runtime::new().map_err(|e| {
             PyErr::from(VectorlessError::new(
@@ -160,80 +160,73 @@ impl PyEngine {
         })
     }
 
-    /// Index a document.
+    /// Understand a document — parse, analyze, and persist.
     ///
     /// Args:
-    ///     ctx: IndexContext created from from_path, from_paths, from_dir, etc.
+    ///     path: File path to the document (PDF or Markdown).
     ///
     /// Returns:
-    ///     IndexResult with doc_id and items.
+    ///     DocumentInfo with doc_id, summary, structure, concepts.
     ///
     /// Raises:
-    ///     VectorlessError: If indexing fails.
-    fn index<'py>(&self, py: Python<'py>, ctx: &PyIndexContext) -> PyResult<Bound<'py, PyAny>> {
+    ///     VectorlessError: If ingest fails.
+    fn ingest<'py>(&self, py: Python<'py>, path: String) -> PyResult<Bound<'py, PyAny>> {
         let engine = Arc::clone(&self.inner);
-        let index_ctx = ctx.inner.clone();
-        future_into_py(py, run_index(engine, index_ctx))
+        let input = IngestInput::Path(path.into());
+        future_into_py(py, run_ingest(engine, input))
     }
 
-    /// Query indexed documents.
+    /// Ask a question — returns a reasoned answer with evidence and trace.
     ///
     /// Args:
-    ///     ctx: QueryContext with query text and scope.
+    ///     question: The question to ask (required).
+    ///     doc_ids: List of document IDs to search. Empty = search all.
     ///
     /// Returns:
-    ///     QueryResult with answer and score.
+    ///     Answer with content, evidence, confidence, and trace.
     ///
     /// Raises:
-    ///     VectorlessError: If query fails.
-    fn query<'py>(&self, py: Python<'py>, ctx: &PyQueryContext) -> PyResult<Bound<'py, PyAny>> {
+    ///     VectorlessError: If ask fails.
+    #[pyo3(signature = (question, doc_ids=None))]
+    fn ask<'py>(
+        &self,
+        py: Python<'py>,
+        question: String,
+        doc_ids: Option<Vec<String>>,
+    ) -> PyResult<Bound<'py, PyAny>> {
         let engine = Arc::clone(&self.inner);
-        let query_ctx = ctx.inner.clone();
-        future_into_py(py, run_query(engine, query_ctx))
+        let ids = doc_ids.unwrap_or_default();
+        future_into_py(py, run_ask(engine, question, ids))
     }
 
-    /// Query documents with streaming progress events.
-    ///
-    /// Returns a StreamingQuery async iterator that yields real-time
-    /// retrieval events as dicts with a ``"type"`` key.
+    /// Remove a document by ID.
     ///
     /// Args:
-    ///     ctx: QueryContext with query text and scope.
-    ///
-    /// Returns:
-    ///     StreamingQuery async iterator.
+    ///     doc_id: The document ID to remove.
     ///
     /// Raises:
-    ///     VectorlessError: If query setup fails.
-    fn query_stream<'py>(
-        &self,
-        py: Python<'py>,
-        ctx: &PyQueryContext,
-    ) -> PyResult<Bound<'py, PyAny>> {
+    ///     VectorlessError: If removal fails.
+    fn forget<'py>(&self, py: Python<'py>, doc_id: String) -> PyResult<Bound<'py, PyAny>> {
         let engine = Arc::clone(&self.inner);
-        let query_ctx = ctx.inner.clone();
-        future_into_py(py, run_query_stream(engine, query_ctx))
+        future_into_py(py, run_forget(engine, doc_id))
     }
 
-    /// List all indexed documents.
+    /// List all understood documents.
     ///
     /// Returns:
-    ///     List of DocumentInfo objects.
-    fn list<'py>(&self, py: Python<'py>) -> PyResult<Bound<'py, PyAny>> {
+    ///     List of DocumentInfo objects with summary, structure, and concepts.
+    fn list_documents<'py>(&self, py: Python<'py>) -> PyResult<Bound<'py, PyAny>> {
         let engine = Arc::clone(&self.inner);
-        future_into_py(py, run_list(engine))
+        future_into_py(py, run_list_documents(engine))
     }
 
-    /// Remove a document by ID.
-    ///
-    /// Returns:
-    ///     True if removed, False if not found.
-    fn remove<'py>(&self, py: Python<'py>, doc_id: String) -> PyResult<Bound<'py, PyAny>> {
+    /// Check if a document exists.
+    fn exists<'py>(&self, py: Python<'py>, doc_id: String) -> PyResult<Bound<'py, PyAny>> {
         let engine = Arc::clone(&self.inner);
-        future_into_py(py, run_remove(engine, doc_id))
+        future_into_py(py, run_exists(engine, doc_id))
     }
 
-    /// Remove all indexed documents.
+    /// Remove all documents.
     ///
     /// Returns:
     ///     Number of documents removed.
@@ -242,12 +235,6 @@ impl PyEngine {
         future_into_py(py, run_clear(engine))
     }
 
-    /// Check if a document exists.
-    fn exists<'py>(&self, py: Python<'py>, doc_id: String) -> PyResult<Bound<'py, PyAny>> {
-        let engine = Arc::clone(&self.inner);
-        future_into_py(py, run_exists(engine, doc_id))
-    }
-
     /// Get the cross-document relationship graph.
     ///
     /// Returns:
diff --git a/python/src/error.rs b/vectorless-core/vectorless-py/src/error.rs
similarity index 97%
rename from python/src/error.rs
rename to vectorless-core/vectorless-py/src/error.rs
index e4a977b8..c4715614 100644
--- a/python/src/error.rs
+++ b/vectorless-core/vectorless-py/src/error.rs
@@ -6,7 +6,7 @@
 use pyo3::exceptions::PyException;
 use pyo3::prelude::*;
 
-use ::vectorless::Error as RustError;
+use ::vectorless_engine::Error as RustError;
 
 /// Python exception for vectorless errors.
 #[pyclass(extends = PyException, subclass)]
diff --git a/python/src/graph.rs b/vectorless-core/vectorless-py/src/graph.rs
similarity index 97%
rename from python/src/graph.rs
rename to vectorless-core/vectorless-py/src/graph.rs
index a424316f..affba45a 100644
--- a/python/src/graph.rs
+++ b/vectorless-core/vectorless-py/src/graph.rs
@@ -5,7 +5,9 @@
 
 use pyo3::prelude::*;
 
-use ::vectorless::{DocumentGraph, DocumentGraphNode, EdgeEvidence, GraphEdge, WeightedKeyword};
+use ::vectorless_engine::{
+    DocumentGraph, DocumentGraphNode, EdgeEvidence, GraphEdge, WeightedKeyword,
+};
 
 /// A keyword with weight from document analysis.
 #[pyclass(name = "WeightedKeyword")]
diff --git a/python/src/lib.rs b/vectorless-core/vectorless-py/src/lib.rs
similarity index 51%
rename from python/src/lib.rs
rename to vectorless-core/vectorless-py/src/lib.rs
index d17c5830..e6ad77ea 100644
--- a/python/src/lib.rs
+++ b/vectorless-core/vectorless-py/src/lib.rs
@@ -5,54 +5,42 @@
 
 use pyo3::prelude::*;
 
+mod answer;
 mod config;
-mod context;
 mod document;
 mod engine;
 mod error;
 mod graph;
 mod metrics;
-mod results;
-mod streaming;
 
+use answer::{PyAnswer, PyEvidence, PyReasoningTrace, PyTraceStep};
 use config::PyConfig;
-use context::{PyIndexContext, PyIndexOptions, PyQueryContext};
-use document::PyDocumentInfo;
+use document::{PyConcept, PyDocumentInfo};
 use engine::PyEngine;
 use error::VectorlessError;
 use graph::{PyDocumentGraph, PyDocumentGraphNode, PyEdgeEvidence, PyGraphEdge, PyWeightedKeyword};
 use metrics::{PyLlmMetricsReport, PyMetricsReport, PyRetrievalMetricsReport};
-use results::{
-    PyEvidenceItem, PyFailedItem, PyIndexItem, PyIndexMetrics, PyIndexResult, PyQueryMetrics,
-    PyQueryResult, PyQueryResultItem,
-};
-use streaming::PyStreamingQuery;
 
-/// Vectorless - Reasoning-native document intelligence engine.
+/// Vectorless — Document Understanding Engine for AI.
 ///
 /// ```python
-/// from vectorless import Engine, IndexContext, QueryContext
+/// from vectorless import Engine
 ///
 /// engine = Engine(api_key="sk-...", model="gpt-4o")
-/// result = await engine.index(IndexContext.from_path("./report.pdf"))
-/// answer = await engine.query(QueryContext("What is the revenue?").with_doc_ids([result.doc_id]))
-/// print(answer.single().content)
+/// doc = await engine.ingest("./report.pdf")
+/// answer = await engine.ask("What is the revenue?", doc_ids=[doc.doc_id])
+/// print(answer.content)
 /// ```
 #[pymodule]
 fn _vectorless(m: &Bound<'_, PyModule>) -> PyResult<()> {
     m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
+    m.add_class::()?;
     m.add_class::()?;
+    m.add_class::()?;
+    m.add_class::()?;
+    m.add_class::()?;
+    m.add_class::()?;
+    m.add_class::()?;
     m.add_class::()?;
     m.add_class::()?;
     m.add_class::()?;
@@ -62,8 +50,6 @@ fn _vectorless(m: &Bound<'_, PyModule>) -> PyResult<()> {
     m.add_class::()?;
     m.add_class::()?;
     m.add_class::()?;
-    m.add_class::()?;
-    m.add_class::()?;
 
     m.add("__version__", env!("CARGO_PKG_VERSION"))?;
diff --git a/python/src/metrics.rs b/vectorless-core/vectorless-py/src/metrics.rs
similarity index 98%
rename from python/src/metrics.rs
rename to vectorless-core/vectorless-py/src/metrics.rs
index f194cd82..19f94623 100644
--- a/python/src/metrics.rs
+++ b/vectorless-core/vectorless-py/src/metrics.rs
@@ -5,7 +5,7 @@
 
 use pyo3::prelude::*;
 
-use ::vectorless::{LlmMetricsReport, MetricsReport, RetrievalMetricsReport};
+use ::vectorless_engine::{LlmMetricsReport, MetricsReport, RetrievalMetricsReport};
 
 /// LLM usage metrics report.
 #[pyclass(name = "LlmMetricsReport")]
diff --git a/vectorless-core/vectorless-query/Cargo.toml b/vectorless-core/vectorless-query/Cargo.toml
new file mode 100644
index 00000000..353e94dc
--- /dev/null
+++ b/vectorless-core/vectorless-query/Cargo.toml
@@ -0,0 +1,22 @@
+[package]
+name = "vectorless-query"
+version.workspace = true
+edition.workspace = true
+authors.workspace = true
+description.workspace = true
+license.workspace = true
+repository.workspace = true
+homepage.workspace = true
+
+[dependencies]
+vectorless-error = { path = "../vectorless-error" }
+vectorless-llm = { path = "../vectorless-llm" }
+vectorless-scoring = { path = "../vectorless-scoring" }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+tokio = { workspace = true }
+chrono = { workspace = true }
+
+[lints]
+workspace = true
diff --git a/rust/src/query/mod.rs b/vectorless-core/vectorless-query/src/lib.rs
similarity index 87%
rename from rust/src/query/mod.rs
rename to vectorless-core/vectorless-query/src/lib.rs
index bbe6806d..38ecb116 100644
--- a/rust/src/query/mod.rs
+++ b/vectorless-core/vectorless-query/src/lib.rs
@@ -23,8 +23,8 @@ mod understand;
 
 pub use types::{QueryIntent, QueryPlan};
 
-use crate::llm::LlmClient;
-use crate::scoring::bm25::extract_keywords;
+use vectorless_llm::LlmClient;
+use vectorless_scoring::bm25::extract_keywords;
 
 /// Query understanding pipeline.
 ///
@@ -38,7 +38,7 @@ impl QueryPipeline {
     /// 2. LLM deep understanding (intent, concepts, complexity, strategy)
     ///
     /// Errors propagate — the caller handles retries or failure.
-    pub async fn understand(query: &str, llm: &LlmClient) -> crate::error::Result<QueryPlan> {
+    pub async fn understand(query: &str, llm: &LlmClient) -> vectorless_error::Result<QueryPlan> {
         let keywords = extract_keywords(query);
         understand::understand(query, &keywords, llm).await
     }
diff --git a/rust/src/query/types.rs b/vectorless-core/vectorless-query/src/types.rs
similarity index 100%
rename from rust/src/query/types.rs
rename to vectorless-core/vectorless-query/src/types.rs
diff --git a/rust/src/query/understand.rs b/vectorless-core/vectorless-query/src/understand.rs
similarity index 98%
rename from rust/src/query/understand.rs
rename to vectorless-core/vectorless-query/src/understand.rs
index 9790e557..c124395a 100644
--- a/rust/src/query/understand.rs
+++ b/vectorless-core/vectorless-query/src/understand.rs
@@ -9,7 +9,7 @@
 use serde::Deserialize;
 use tracing::{info, warn};
 
-use crate::llm::LlmClient;
+use vectorless_llm::LlmClient;
 
 use super::types::{Complexity, QueryIntent, QueryPlan, SubQuery};
 
@@ -32,14 +32,14 @@ pub async fn understand(
     query: &str,
     keywords: &[String],
     llm: &LlmClient,
-) -> crate::error::Result<QueryPlan> {
+) -> vectorless_error::Result<QueryPlan> {
     let (system, user) = understand_prompt(query, keywords);
 
     info!("Query understanding: calling LLM...");
     let response = llm.complete(&system, &user).await?;
 
     if response.trim().is_empty() {
         warn!("Query understanding: LLM returned empty response");
-        return Err(crate::error::Error::Config(
+        return Err(vectorless_error::Error::Config(
             "Query understanding failed: LLM returned an empty response. \
              Check your API key, model, and endpoint configuration."
                 .to_string(),
@@ -48,7 +48,7 @@ pub async fn understand(
 
     let analysis = parse_analysis(&response).ok_or_else(|| {
         let preview = &response[..response.len().min(300)];
-        crate::error::Error::Config(format!(
+        vectorless_error::Error::Config(format!(
             "Query understanding returned unparseable response ({} chars): {}",
             response.len(),
             preview
diff --git a/vectorless-core/vectorless-rerank/Cargo.toml b/vectorless-core/vectorless-rerank/Cargo.toml
new file mode 100644
index 00000000..f9a9112c
--- /dev/null
+++ b/vectorless-core/vectorless-rerank/Cargo.toml
@@ -0,0 +1,19 @@
+[package]
+name = "vectorless-rerank"
+version.workspace = true
+edition.workspace = true
+authors.workspace = true
+description.workspace = true
+license.workspace = true
+repository.workspace = true
+homepage.workspace = true
+
+[dependencies]
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+vectorless-error = { path = "../vectorless-error" }
+vectorless-query = { path = "../vectorless-query" }
+
+[lints]
+workspace = true
diff --git a/rust/src/rerank/dedup.rs b/vectorless-core/vectorless-rerank/src/dedup.rs
similarity index 99%
rename from rust/src/rerank/dedup.rs
rename to vectorless-core/vectorless-rerank/src/dedup.rs
index 8644a932..713c6088 100644
--- a/rust/src/rerank/dedup.rs
+++ b/vectorless-core/vectorless-rerank/src/dedup.rs
@@ -5,7 +5,7 @@
 
 use std::collections::HashSet;
 
-use crate::agent::Evidence;
+use crate::types::Evidence;
 
 /// Minimum characters for an evidence item to be considered meaningful.
 const MIN_EVIDENCE_CHARS: usize = 50;
diff --git a/rust/src/rerank/mod.rs b/vectorless-core/vectorless-rerank/src/lib.rs
similarity index 95%
rename from rust/src/rerank/mod.rs
rename to vectorless-core/vectorless-rerank/src/lib.rs
index bc179ec3..875c1024 100644
--- a/rust/src/rerank/mod.rs
+++ b/vectorless-core/vectorless-rerank/src/lib.rs
@@ -21,9 +21,8 @@ pub mod types;
 
 use tracing::info;
 
-use crate::agent::Evidence;
-use crate::query::QueryIntent;
-use types::RerankOutput;
+use types::{Evidence, RerankOutput};
+use vectorless_query::QueryIntent;
 
 /// Process agent output through the rerank pipeline.
 ///
@@ -35,7 +34,7 @@ pub async fn process(
     _multi_doc: bool,
     intent: QueryIntent,
     confidence: f32,
-) -> crate::error::Result<RerankOutput> {
+) -> vectorless_error::Result<RerankOutput> {
     let deduped = dedup::dedup(evidence);
 
     if deduped.is_empty() {
         info!("No evidence after dedup");
diff --git a/vectorless-core/vectorless-rerank/src/types.rs b/vectorless-core/vectorless-rerank/src/types.rs
new file mode 100644
index 00000000..73d19ce9
--- /dev/null
+++ b/vectorless-core/vectorless-rerank/src/types.rs
@@ -0,0 +1,29 @@
+// Copyright (c) 2026 vectorless developers
+// SPDX-License-Identifier: Apache-2.0
+
+//! Rerank result types.
+
+use serde::{Deserialize, Serialize};
+
+/// A single piece of evidence collected during navigation.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct Evidence {
+    /// Navigation path where this evidence was found (e.g., "Root/API Reference/Auth").
+    pub source_path: String,
+    /// Title of the node.
+    pub node_title: String,
+    /// Content of the node.
+    pub content: String,
+    /// Source document name (set by Orchestrator in multi-doc scenarios).
+    pub doc_name: Option<String>,
+}
+
+/// Output from the rerank pipeline.
+pub struct RerankOutput {
+    /// Synthesized answer.
+    pub answer: String,
+    /// Number of LLM calls used during synthesis/fusion.
+    pub llm_calls: u32,
+    /// Confidence score (0.0–1.0) — derived from LLM evaluate() result.
+    pub confidence: f32,
+}
diff --git a/vectorless-core/vectorless-retrieval/Cargo.toml b/vectorless-core/vectorless-retrieval/Cargo.toml
new file mode 100644
index 00000000..b364f762
--- /dev/null
+++ b/vectorless-core/vectorless-retrieval/Cargo.toml
@@ -0,0 +1,30 @@
+[package]
+name = "vectorless-retrieval"
+version.workspace = true
+edition.workspace = true
+authors.workspace = true
+description.workspace = true
+license.workspace = true
+repository.workspace = true
+homepage.workspace = true
+
+[dependencies]
+vectorless-agent = { path = "../vectorless-agent" }
+vectorless-document = { path = "../vectorless-document" }
+vectorless-error = { path = "../vectorless-error" }
+vectorless-llm = { path = "../vectorless-llm" }
+vectorless-query = { path = "../vectorless-query" }
+vectorless-storage = { path = "../vectorless-storage" }
+vectorless-utils = { path = "../vectorless-utils" }
+tokio = { workspace = true }
+serde = { workspace = true }
+serde_json = { workspace = true }
+tracing = { workspace = true }
+futures = { workspace = true }
+parking_lot = { workspace = true }
+
+[dev-dependencies]
+indextree = { workspace = true }
+
+[lints]
+workspace = true
diff --git a/rust/src/retrieval/cache.rs b/vectorless-core/vectorless-retrieval/src/cache.rs
similarity index 99%
rename from rust/src/retrieval/cache.rs
rename to vectorless-core/vectorless-retrieval/src/cache.rs
index ecfce79a..c924732c 100644
--- a/rust/src/retrieval/cache.rs
+++ b/vectorless-core/vectorless-retrieval/src/cache.rs
@@ -22,8 +22,8 @@
 
 use std::time::Instant;
 
 use tracing::warn;
 
-use crate::document::NodeId;
-use crate::utils::fingerprint::Fingerprint;
+use vectorless_document::NodeId;
+use vectorless_utils::fingerprint::Fingerprint;
 
 /// A tiered reasoning cache for the retrieval pipeline.
 ///
diff --git a/rust/src/retrieval/dispatcher.rs b/vectorless-core/vectorless-retrieval/src/dispatcher.rs
similarity index 89%
rename from rust/src/retrieval/dispatcher.rs
rename to vectorless-core/vectorless-retrieval/src/dispatcher.rs
index e92766fb..b2c45096 100644
--- a/rust/src/retrieval/dispatcher.rs
+++ b/vectorless-core/vectorless-retrieval/src/dispatcher.rs
@@ -17,12 +17,12 @@
 
 use tracing::info;
 
-use crate::agent::config::{AgentConfig, Scope, WorkspaceContext};
-use crate::agent::orchestrator::Orchestrator;
-use crate::agent::{Agent, EventEmitter, Output};
-use crate::error::{Error, Result};
-use crate::llm::LlmClient;
-use crate::query::QueryPipeline;
+use vectorless_agent::config::{AgentConfig, Scope, WorkspaceContext};
+use vectorless_agent::orchestrator::Orchestrator;
+use vectorless_agent::{Agent, EventEmitter, Output};
+use vectorless_error::{Error, Result};
+use vectorless_llm::LlmClient;
+use vectorless_query::QueryPipeline;
 
 /// Dispatch a query to the Orchestrator.
 ///
diff --git a/rust/src/retrieval/mod.rs b/vectorless-core/vectorless-retrieval/src/lib.rs
similarity index 87%
rename from rust/src/retrieval/mod.rs
rename to vectorless-core/vectorless-retrieval/src/lib.rs
index bab04971..96d6f1db 100644
--- a/rust/src/retrieval/mod.rs
+++ b/vectorless-core/vectorless-retrieval/src/lib.rs
@@ -25,4 +25,7 @@ pub mod stream;
 mod types;
 
 pub use stream::{RetrieveEvent, RetrieveEventReceiver};
-pub use types::{ReasoningChain, RetrieveResponse, SufficiencyLevel};
+pub use types::{
+    Confidence, EvidenceItem, QueryMetrics, QueryResultItem, ReasoningChain, RetrieveResponse,
+    SufficiencyLevel,
+};
diff --git a/rust/src/retrieval/postprocessor.rs b/vectorless-core/vectorless-retrieval/src/postprocessor.rs
similarity index 96%
rename from rust/src/retrieval/postprocessor.rs
rename to vectorless-core/vectorless-retrieval/src/postprocessor.rs
index fddc8c5e..79da0ebf 100644
--- a/rust/src/retrieval/postprocessor.rs
+++ b/vectorless-core/vectorless-retrieval/src/postprocessor.rs
@@ -9,8 +9,8 @@
 
 use std::collections::BTreeMap;
 
-use crate::agent::config::{Evidence, Metrics, Output};
-use crate::client::{Confidence, EvidenceItem, QueryMetrics, QueryResultItem};
+use crate::types::{Confidence, EvidenceItem, QueryMetrics, QueryResultItem};
+use vectorless_agent::config::{Evidence, Metrics, Output};
 
 /// Convert agent output to query result items, split by document.
 ///
diff --git a/rust/src/retrieval/stream.rs b/vectorless-core/vectorless-retrieval/src/stream.rs
similarity index 100%
rename from rust/src/retrieval/stream.rs
rename to vectorless-core/vectorless-retrieval/src/stream.rs
diff --git a/rust/src/retrieval/types.rs b/vectorless-core/vectorless-retrieval/src/types.rs
similarity index 70%
rename from rust/src/retrieval/types.rs
rename to vectorless-core/vectorless-retrieval/src/types.rs
index 3d1e41e5..ca15ee55 100644
--- a/rust/src/retrieval/types.rs
+++ b/vectorless-core/vectorless-retrieval/src/types.rs
@@ -5,24 +5,8 @@
 
 use serde::{Deserialize, Serialize};
 
-/// Sufficiency level for incremental retrieval.
-#[derive(Debug, Clone, Copy, PartialEq, Eq)]
-pub enum SufficiencyLevel {
-    /// Information is sufficient, stop retrieving.
-    Sufficient,
-
-    /// Partial information, can continue if needed.
-    PartialSufficient,
-
-    /// Information is insufficient, continue retrieving.
-    Insufficient,
-}
-
-impl Default for SufficiencyLevel {
-    fn default() -> Self {
-        Self::Insufficient
-    }
-}
+/// Re-export [`SufficiencyLevel`] from the document module.
+pub use vectorless_document::SufficiencyLevel;
 
 /// Complete retrieval response.
 #[derive(Debug, Clone)]
@@ -207,3 +191,55 @@ pub struct ReasoningStep {
     /// Human-readable explanation of the decision.
pub reasoning: String, } + +// ============================================================ +// Query result types (used by engine) +// ============================================================ + +/// Confidence score of the query result (0.0–1.0). +pub type Confidence = f32; + +/// A single piece of evidence with source attribution. +#[derive(Debug, Clone)] +pub struct EvidenceItem { + /// Section title where this evidence was found. + pub title: String, + /// Navigation path (e.g., "Root/Chapter 1/Section 1.2"). + pub path: String, + /// Raw evidence content. + pub content: String, + /// Source document name (set in multi-doc scenarios). + pub doc_name: Option<String>, +} + +/// Query execution metrics. +#[derive(Debug, Clone, Default)] +pub struct QueryMetrics { + /// Number of LLM calls made. + pub llm_calls: u32, + /// Number of navigation rounds used. + pub rounds_used: u32, + /// Number of distinct nodes visited. + pub nodes_visited: usize, + /// Number of evidence items collected. + pub evidence_count: usize, + /// Total characters of collected evidence. + pub evidence_chars: usize, +} + +/// A single document's query result. +#[derive(Debug, Clone)] +pub struct QueryResultItem { + /// The document ID. + pub doc_id: String, + /// Matching node IDs (navigation paths). + pub node_ids: Vec<String>, + /// Synthesized answer or raw evidence content. + pub content: String, + /// Evidence items that contributed to this result, with source attribution. + pub evidence: Vec<EvidenceItem>, + /// Execution metrics for this query. + pub metrics: Option<QueryMetrics>, + /// Confidence score (0.0–1.0) — derived from LLM evaluation.
+ pub confidence: Confidence, +} diff --git a/vectorless-core/vectorless-scoring/Cargo.toml b/vectorless-core/vectorless-scoring/Cargo.toml new file mode 100644 index 00000000..f1c169d2 --- /dev/null +++ b/vectorless-core/vectorless-scoring/Cargo.toml @@ -0,0 +1,17 @@ +[package] +name = "vectorless-scoring" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +bm25 = { workspace = true } +regex = { workspace = true } +async-trait = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/scoring/bm25.rs b/vectorless-core/vectorless-scoring/src/bm25.rs similarity index 100% rename from rust/src/scoring/bm25.rs rename to vectorless-core/vectorless-scoring/src/bm25.rs diff --git a/rust/src/scoring/mod.rs b/vectorless-core/vectorless-scoring/src/lib.rs similarity index 100% rename from rust/src/scoring/mod.rs rename to vectorless-core/vectorless-scoring/src/lib.rs diff --git a/vectorless-core/vectorless-storage/Cargo.toml b/vectorless-core/vectorless-storage/Cargo.toml new file mode 100644 index 00000000..87ce252e --- /dev/null +++ b/vectorless-core/vectorless-storage/Cargo.toml @@ -0,0 +1,38 @@ +[package] +name = "vectorless-storage" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-config = { path = "../vectorless-config" } +vectorless-document = { path = "../vectorless-document" } +vectorless-error = { path = "../vectorless-error" } +vectorless-utils = { path = "../vectorless-utils" } +vectorless-graph = { path = "../vectorless-graph" } +tokio = { workspace = true } +serde = { workspace = true } +serde_json = { workspace = true } +flate2 = { workspace = true } +lru = { workspace = true } +tracing = { workspace = true } +chrono = { 
workspace = true } +uuid = { workspace = true } +sha2 = { workspace = true } +base64 = { workspace = true } +parking_lot = { workspace = true } +regex = { workspace = true } +thiserror = { workspace = true } + +[dev-dependencies] +tempfile = { workspace = true } + +[target.'cfg(unix)'.dependencies] +libc = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/storage/backend/file.rs b/vectorless-core/vectorless-storage/src/backend/file.rs similarity index 99% rename from rust/src/storage/backend/file.rs rename to vectorless-core/vectorless-storage/src/backend/file.rs index ab461fe7..454ca5e4 100644 --- a/rust/src/storage/backend/file.rs +++ b/vectorless-core/vectorless-storage/src/backend/file.rs @@ -10,8 +10,8 @@ use std::sync::RwLock; use tracing::debug; use super::StorageBackend; -use crate::Error; -use crate::error::Result; +use vectorless_error::Error; +use vectorless_error::Result; /// File system storage backend. /// diff --git a/rust/src/storage/backend/memory.rs b/vectorless-core/vectorless-storage/src/backend/memory.rs similarity index 74% rename from rust/src/storage/backend/memory.rs rename to vectorless-core/vectorless-storage/src/backend/memory.rs index 4844f8e2..197ddace 100644 --- a/rust/src/storage/backend/memory.rs +++ b/vectorless-core/vectorless-storage/src/backend/memory.rs @@ -7,7 +7,7 @@ use std::collections::HashMap; use std::sync::RwLock; use super::StorageBackend; -use crate::error::Result; +use vectorless_error::Result; /// In-memory storage backend. 
/// @@ -39,68 +39,60 @@ impl MemoryBackend { impl StorageBackend for MemoryBackend { fn get(&self, key: &str) -> Result<Option<Vec<u8>>> { - let data = self - .data - .read() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let data = self.data.read().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; Ok(data.get(key).cloned()) } fn put(&self, key: &str, value: &[u8]) -> Result<()> { - let mut data = self - .data - .write() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let mut data = self.data.write().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; data.insert(key.to_string(), value.to_vec()); Ok(()) } fn delete(&self, key: &str) -> Result<bool> { - let mut data = self - .data - .write() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let mut data = self.data.write().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; Ok(data.remove(key).is_some()) } fn exists(&self, key: &str) -> Result<bool> { - let data = self - .data - .read() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let data = self.data.read().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; Ok(data.contains_key(key)) } fn keys(&self) -> Result<Vec<String>> { - let data = self - .data - .read() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let data = self.data.read().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; Ok(data.keys().cloned().collect()) } fn len(&self) -> Result<usize> { - let data = self - .data - .read() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let data = self.data.read().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?;
Ok(data.len()) } fn clear(&self) -> Result<()> { - let mut data = self - .data - .write() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let mut data = self.data.write().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; data.clear(); Ok(()) } fn batch_put(&self, items: &[(&str, &[u8])]) -> Result<()> { - let mut data = self - .data - .write() - .map_err(|_| crate::Error::Cache("Memory backend lock poisoned".to_string()))?; + let mut data = self.data.write().map_err(|_| { + vectorless_error::Error::Cache("Memory backend lock poisoned".to_string()) + })?; for (key, value) in items { data.insert(key.to_string(), value.to_vec()); } diff --git a/rust/src/storage/backend/mod.rs b/vectorless-core/vectorless-storage/src/backend/mod.rs similarity index 100% rename from rust/src/storage/backend/mod.rs rename to vectorless-core/vectorless-storage/src/backend/mod.rs diff --git a/rust/src/storage/backend/trait_def.rs b/vectorless-core/vectorless-storage/src/backend/trait_def.rs similarity index 99% rename from rust/src/storage/backend/trait_def.rs rename to vectorless-core/vectorless-storage/src/backend/trait_def.rs index 782bdac0..9d74f232 100644 --- a/rust/src/storage/backend/trait_def.rs +++ b/vectorless-core/vectorless-storage/src/backend/trait_def.rs @@ -5,7 +5,7 @@ use std::fmt::Debug; -use crate::error::Result; +use vectorless_error::Result; /// Storage backend trait for abstracting different storage systems. 
/// diff --git a/rust/src/storage/cache.rs b/vectorless-core/vectorless-storage/src/cache.rs similarity index 98% rename from rust/src/storage/cache.rs rename to vectorless-core/vectorless-storage/src/cache.rs index 70f7d4a9..2b06c240 100644 --- a/rust/src/storage/cache.rs +++ b/vectorless-core/vectorless-storage/src/cache.rs @@ -21,8 +21,8 @@ use std::sync::atomic::{AtomicU64, Ordering}; use lru::LruCache; use super::persistence::PersistedDocument; -use crate::Error; -use crate::error::Result; +use vectorless_error::Error; +use vectorless_error::Result; /// Default cache size (number of documents). const DEFAULT_CACHE_SIZE: usize = 100; @@ -267,8 +267,8 @@ pub struct CacheStats { #[cfg(test)] mod tests { use super::*; - use crate::document::DocumentTree; - use crate::storage::{DocumentMeta, PersistedDocument}; + use crate::{DocumentMeta, PersistedDocument}; + use vectorless_document::DocumentTree; fn create_test_doc(id: &str) -> PersistedDocument { let meta = DocumentMeta::new(id, "Test Doc", "md"); diff --git a/rust/src/storage/codec.rs b/vectorless-core/vectorless-storage/src/codec.rs similarity index 95% rename from rust/src/storage/codec.rs rename to vectorless-core/vectorless-storage/src/codec.rs index 4c7e864e..ce750222 100644 --- a/rust/src/storage/codec.rs +++ b/vectorless-core/vectorless-storage/src/codec.rs @@ -30,8 +30,8 @@ use flate2::Compression; use flate2::read::GzDecoder; use flate2::write::GzEncoder; -use crate::Error; -use crate::error::Result; +use vectorless_error::Error; +use vectorless_error::Result; /// Codec trait for compression/decompression. pub trait Codec: Debug + Send + Sync { @@ -145,7 +145,7 @@ impl Codec for GzipCodec { /// Create a codec from configuration. 
pub fn codec_from_config( enabled: bool, - algorithm: crate::config::CompressionAlgorithm, + algorithm: vectorless_config::CompressionAlgorithm, level: u32, ) -> Box<dyn Codec> { if !enabled { @@ -153,8 +153,8 @@ pub fn codec_from_config( } match algorithm { - crate::config::CompressionAlgorithm::Gzip => Box::new(GzipCodec::new(level)), - crate::config::CompressionAlgorithm::Zstd => { + vectorless_config::CompressionAlgorithm::Gzip => Box::new(GzipCodec::new(level)), + vectorless_config::CompressionAlgorithm::Zstd => { // Zstd not implemented yet, fallback to gzip // TODO: Add zstd support when needed Box::new(GzipCodec::new(level)) @@ -228,7 +228,7 @@ mod tests { #[test] fn test_codec_from_config() { - use crate::config::CompressionAlgorithm; + use vectorless_config::CompressionAlgorithm; // Disabled compression let codec = codec_from_config(false, CompressionAlgorithm::Gzip, 6); diff --git a/rust/src/storage/mod.rs b/vectorless-core/vectorless-storage/src/lib.rs similarity index 100% rename from rust/src/storage/mod.rs rename to vectorless-core/vectorless-storage/src/lib.rs diff --git a/rust/src/storage/lock.rs b/vectorless-core/vectorless-storage/src/lock.rs similarity index 99% rename from rust/src/storage/lock.rs rename to vectorless-core/vectorless-storage/src/lock.rs index feb484ba..3783d51f 100644 --- a/rust/src/storage/lock.rs +++ b/vectorless-core/vectorless-storage/src/lock.rs @@ -15,8 +15,8 @@ use std::fs::{File, OpenOptions}; use std::path::Path; -use crate::Error; -use crate::error::Result; +use vectorless_error::Error; +use vectorless_error::Result; /// A file lock that is automatically released when dropped.
/// diff --git a/rust/src/storage/migration.rs b/vectorless-core/vectorless-storage/src/migration.rs similarity index 99% rename from rust/src/storage/migration.rs rename to vectorless-core/vectorless-storage/src/migration.rs index 1711169b..3cc35512 100644 --- a/rust/src/storage/migration.rs +++ b/vectorless-core/vectorless-storage/src/migration.rs @@ -35,8 +35,8 @@ use std::collections::HashMap; use tracing::{debug, info, warn}; -use crate::Error; -use crate::error::Result; +use vectorless_error::Error; +use vectorless_error::Result; /// Current data format version. pub const CURRENT_VERSION: u32 = 1; diff --git a/rust/src/storage/persistence.rs b/vectorless-core/vectorless-storage/src/persistence.rs similarity index 95% rename from rust/src/storage/persistence.rs rename to vectorless-core/vectorless-storage/src/persistence.rs index b2dac4d4..e0a83763 100644 --- a/rust/src/storage/persistence.rs +++ b/vectorless-core/vectorless-storage/src/persistence.rs @@ -15,9 +15,9 @@ use std::fs::File; use std::io::{BufReader, BufWriter, Write}; use std::path::{Path, PathBuf}; -use crate::Error; -use crate::document::{DocumentTree, NavigationIndex, ReasoningIndex}; -use crate::error::Result; +use vectorless_document::{DocumentTree, NavigationIndex, ReasoningIndex}; +use vectorless_error::Error; +use vectorless_error::Result; /// Current format version for persisted documents. const FORMAT_VERSION: u32 = 1; @@ -63,17 +63,17 @@ pub struct DocumentMeta { /// Content fingerprint for change detection. #[serde( default, - skip_serializing_if = "crate::utils::fingerprint::Fingerprint::is_zero" + skip_serializing_if = "vectorless_utils::fingerprint::Fingerprint::is_zero" )] - pub content_fingerprint: crate::utils::fingerprint::Fingerprint, + pub content_fingerprint: vectorless_utils::fingerprint::Fingerprint, /// Logic fingerprint (hash of pipeline configuration used to produce this document). 
/// If the pipeline config changes, a full reprocess is needed even if content didn't change. #[serde( default, - skip_serializing_if = "crate::utils::fingerprint::Fingerprint::is_zero" + skip_serializing_if = "vectorless_utils::fingerprint::Fingerprint::is_zero" )] - pub logic_fingerprint: crate::utils::fingerprint::Fingerprint, + pub logic_fingerprint: vectorless_utils::fingerprint::Fingerprint, /// Processing version (incremented when algorithm changes). #[serde(default)] @@ -110,8 +110,8 @@ impl DocumentMeta { line_count: None, created_at: now, modified_at: now, - content_fingerprint: crate::utils::fingerprint::Fingerprint::zero(), - logic_fingerprint: crate::utils::fingerprint::Fingerprint::zero(), + content_fingerprint: vectorless_utils::fingerprint::Fingerprint::zero(), + logic_fingerprint: vectorless_utils::fingerprint::Fingerprint::zero(), processing_version: 0, node_count: 0, total_summary_tokens: 0, @@ -133,13 +133,16 @@ impl DocumentMeta { } /// Set the content fingerprint. - pub fn with_fingerprint(mut self, fp: crate::utils::fingerprint::Fingerprint) -> Self { + pub fn with_fingerprint(mut self, fp: vectorless_utils::fingerprint::Fingerprint) -> Self { self.content_fingerprint = fp; self } /// Set the logic fingerprint. - pub fn with_logic_fingerprint(mut self, fp: crate::utils::fingerprint::Fingerprint) -> Self { + pub fn with_logic_fingerprint( + mut self, + fp: vectorless_utils::fingerprint::Fingerprint, + ) -> Self { self.logic_fingerprint = fp; self } @@ -172,7 +175,7 @@ impl DocumentMeta { /// Mark as processed with given fingerprint and version. pub fn mark_processed( &mut self, - fp: crate::utils::fingerprint::Fingerprint, + fp: vectorless_utils::fingerprint::Fingerprint, version: u32, model: Option<&str>, ) { @@ -185,7 +188,7 @@ impl DocumentMeta { /// Check if the document needs reprocessing. 
pub fn needs_reprocessing( &self, - current_fp: &crate::utils::fingerprint::Fingerprint, + current_fp: &vectorless_utils::fingerprint::Fingerprint, current_version: u32, ) -> bool { // Never processed @@ -232,6 +235,10 @@ pub struct PersistedDocument { /// Navigation index for Agent-based retrieval. #[serde(default, skip_serializing_if = "Option::is_none")] pub navigation_index: Option<NavigationIndex>, + + /// Key concepts extracted from the document. + #[serde(default, skip_serializing_if = "Vec::is_empty")] + pub concepts: Vec<String>, +} impl PersistedDocument { @@ -244,6 +251,7 @@ pages: Vec::new(), reasoning_index: None, navigation_index: None, + concepts: Vec::new(), } } diff --git a/vectorless-core/vectorless-utils/Cargo.toml b/vectorless-core/vectorless-utils/Cargo.toml new file mode 100644 index 00000000..d883a53a --- /dev/null +++ b/vectorless-core/vectorless-utils/Cargo.toml @@ -0,0 +1,25 @@ +[package] +name = "vectorless-utils" +version.workspace = true +edition.workspace = true +authors.workspace = true +description.workspace = true +license.workspace = true +repository.workspace = true +homepage.workspace = true + +[dependencies] +vectorless-error = { path = "../vectorless-error" } +vectorless-document = { path = "../vectorless-document" } +serde = { workspace = true } +sha2 = { workspace = true } +blake2 = { workspace = true } +tiktoken-rs = { workspace = true } +base64 = { workspace = true } +thiserror = { workspace = true } + +[dev-dependencies] +serde_json = { workspace = true } + +[lints] +workspace = true diff --git a/rust/src/utils/fingerprint.rs b/vectorless-core/vectorless-utils/src/fingerprint.rs similarity index 100% rename from rust/src/utils/fingerprint.rs rename to vectorless-core/vectorless-utils/src/fingerprint.rs diff --git a/rust/src/utils/mod.rs b/vectorless-core/vectorless-utils/src/lib.rs similarity index 100% rename from rust/src/utils/mod.rs rename to vectorless-core/vectorless-utils/src/lib.rs diff --git
a/rust/src/utils/token.rs b/vectorless-core/vectorless-utils/src/token.rs similarity index 100% rename from rust/src/utils/token.rs rename to vectorless-core/vectorless-utils/src/token.rs diff --git a/rust/src/utils/validation.rs b/vectorless-core/vectorless-utils/src/validation.rs similarity index 98% rename from rust/src/utils/validation.rs rename to vectorless-core/vectorless-utils/src/validation.rs index ae541c16..3133175d 100644 --- a/rust/src/utils/validation.rs +++ b/vectorless-core/vectorless-utils/src/validation.rs @@ -5,8 +5,8 @@ use std::path::Path; -use crate::error::{Error, Result}; -use crate::index::parse::DocumentFormat; +use vectorless_document::DocumentFormat; +use vectorless_error::{Error, Result}; /// Maximum file size before emitting a warning (100 MB). const LARGE_FILE_THRESHOLD: usize = 100 * 1024 * 1024; diff --git a/vectorless/README.md b/vectorless/README.md new file mode 100644 index 00000000..d03b76a6 --- /dev/null +++ b/vectorless/README.md @@ -0,0 +1,165 @@ +# Vectorless Python SDK + +Python bindings for [vectorless](https://github.com/vectorlessflow/vectorless) — a Document Understanding Engine for AI. 
+ +## Installation + +```bash +pip install vectorless +``` + +## Quick Start + +```python +import asyncio +from vectorless import Engine + +async def main(): + # Create engine — api_key and model are required + engine = Engine( + api_key="sk-...", + model="gpt-4o", + ) + + # Understand a document + doc = await engine.ingest("./report.pdf") + print(f"Understood: {doc.name} — {doc.summary}") + + # Ask a question + answer = await engine.ask( + "What is the total revenue?", + doc_ids=[doc.doc_id], + ) + print(f"Answer: {answer.content}") + print(f"Confidence: {answer.confidence:.2f}") + print(f"Evidence: {len(answer.evidence)} pieces") + print(f"Trace: {len(answer.trace.steps)} steps") + + # List all understood documents + docs = await engine.list_documents() + for d in docs: + print(f" - {d.name} ({d.doc_id})") + + # Forget a document + await engine.forget(doc.doc_id) + +asyncio.run(main()) +``` + +## API Reference + +### Engine + +The main entry point. All methods are **async** except `metrics_report`, which is synchronous. + +```python +class Engine: + def __init__( + self, + api_key: str | None = None, + model: str | None = None, + endpoint: str | None = None, + config: Config | None = None, + ): ... + + async def ingest(self, path: str) -> DocumentInfo: ... + async def ask(self, question: str, doc_ids: list[str] | None = None) -> Answer: ... + async def forget(self, doc_id: str) -> None: ... + async def list_documents(self) -> list[DocumentInfo]: ... + async def exists(self, doc_id: str) -> bool: ... + async def clear(self) -> int: ... + async def get_graph(self) -> DocumentGraph | None: ... + def metrics_report(self) -> MetricsReport: ...
+``` + +### DocumentInfo + +```python +class DocumentInfo: + doc_id: str + name: str + format: str + summary: str + concepts: list[Concept] + section_count: int + page_count: int | None +``` + +### Answer + +```python +class Answer: + content: str + evidence: list[Evidence] + confidence: float + trace: ReasoningTrace +``` + +### Evidence + +```python +class Evidence: + content: str + source_path: str + doc_name: str + relevance: float +``` + +### ReasoningTrace + +```python +class ReasoningTrace: + steps: list[TraceStep] +``` + +### TraceStep + +```python +class TraceStep: + action: str + observation: str + round: int +``` + +### Concept + +```python +class Concept: + name: str + description: str + confidence: float +``` + +### VectorlessError + +```python +class VectorlessError(Exception): + message: str + kind: str # "config", "parse", "not_found", "llm" +``` + +## Development + +### Building from source + +```bash +# Install maturin +pip install maturin + +# Build and install (from project root) +maturin develop + +# Run tests +pytest +``` + +### Publishing to PyPI + +```bash +maturin build --release +maturin publish +``` + +## License + +Apache-2.0 diff --git a/vectorless/__init__.py b/vectorless/__init__.py new file mode 100644 index 00000000..7b5f67a4 --- /dev/null +++ b/vectorless/__init__.py @@ -0,0 +1,67 @@ +""" +Vectorless — Document Understanding Engine for AI. 
+ +Quick Start: + from vectorless import Engine + + engine = Engine(api_key="sk-...", model="gpt-4o") + doc = await engine.ingest("./report.pdf") + answer = await engine.ask("What is the revenue?", doc_ids=[doc.doc_id]) + print(answer.content) +""" + +# Core Engine and types from Rust +from vectorless._vectorless import ( + Answer, + Concept, + Config, + DocumentGraph, + DocumentInfo, + EdgeEvidence, + Engine, + Evidence, + GraphEdge, + MetricsReport, + ReasoningTrace, + TraceStep, + VectorlessError, + WeightedKeyword, + __version__, +) + +# Configuration utilities +from vectorless.config import EngineConfig, load_config, load_config_from_env, load_config_from_file + +# Events +from vectorless.events import EventEmitter + +__all__ = [ + # Primary API + "Engine", + # Configuration + "EngineConfig", + "load_config", + "load_config_from_env", + "load_config_from_file", + "Config", + # Events + "EventEmitter", + # Document types + "DocumentInfo", + "Concept", + # Answer types + "Answer", + "Evidence", + "ReasoningTrace", + "TraceStep", + # Graph types + "DocumentGraph", + "GraphEdge", + "EdgeEvidence", + "WeightedKeyword", + # Metrics + "MetricsReport", + # Error and version + "VectorlessError", + "__version__", +] diff --git a/python/vectorless/_async_utils.py b/vectorless/_async_utils.py similarity index 100% rename from python/vectorless/_async_utils.py rename to vectorless/_async_utils.py diff --git a/python/vectorless/_compat/__init__.py b/vectorless/_compat/__init__.py similarity index 100% rename from python/vectorless/_compat/__init__.py rename to vectorless/_compat/__init__.py diff --git a/python/vectorless/_compat/langchain.py b/vectorless/_compat/langchain.py similarity index 100% rename from python/vectorless/_compat/langchain.py rename to vectorless/_compat/langchain.py diff --git a/python/vectorless/_compat/llamaindex.py b/vectorless/_compat/llamaindex.py similarity index 100% rename from python/vectorless/_compat/llamaindex.py rename to 
vectorless/_compat/llamaindex.py diff --git a/vectorless/_core.py b/vectorless/_core.py new file mode 100644 index 00000000..53ea7ce5 --- /dev/null +++ b/vectorless/_core.py @@ -0,0 +1,48 @@ +"""Internal re-exports from the Rust PyO3 module. + +This module is NOT part of the public API. Use ``vectorless.Engine`` instead. +""" + +from vectorless._vectorless import ( + Answer, + Concept, + Config, + DocumentGraph, + DocumentGraphEdge, + DocumentGraphNode, + DocumentInfo, + EdgeEvidence, + Engine, + Evidence, + GraphEdge, + LlmMetricsReport, + MetricsReport, + ReasoningTrace, + RetrievalMetricsReport, + TraceStep, + VectorlessError, + WeightedKeyword, + __version__, +) + +__all__ = [ + "Answer", + "Concept", + "Config", + "DocumentGraph", + "DocumentGraphEdge", + "DocumentGraphNode", + "DocumentInfo", + "EdgeEvidence", + "Engine", + "Evidence", + "GraphEdge", + "LlmMetricsReport", + "MetricsReport", + "ReasoningTrace", + "RetrievalMetricsReport", + "TraceStep", + "VectorlessError", + "WeightedKeyword", + "__version__", +] diff --git a/python/vectorless/cli/__init__.py b/vectorless/cli/__init__.py similarity index 100% rename from python/vectorless/cli/__init__.py rename to vectorless/cli/__init__.py diff --git a/python/vectorless/cli/commands/__init__.py b/vectorless/cli/commands/__init__.py similarity index 100% rename from python/vectorless/cli/commands/__init__.py rename to vectorless/cli/commands/__init__.py diff --git a/python/vectorless/cli/commands/add.py b/vectorless/cli/commands/add.py similarity index 100% rename from python/vectorless/cli/commands/add.py rename to vectorless/cli/commands/add.py diff --git a/python/vectorless/cli/commands/ask.py b/vectorless/cli/commands/ask.py similarity index 100% rename from python/vectorless/cli/commands/ask.py rename to vectorless/cli/commands/ask.py diff --git a/python/vectorless/cli/commands/config_cmd.py b/vectorless/cli/commands/config_cmd.py similarity index 100% rename from 
python/vectorless/cli/commands/config_cmd.py rename to vectorless/cli/commands/config_cmd.py diff --git a/python/vectorless/cli/commands/info.py b/vectorless/cli/commands/info.py similarity index 100% rename from python/vectorless/cli/commands/info.py rename to vectorless/cli/commands/info.py diff --git a/python/vectorless/cli/commands/init.py b/vectorless/cli/commands/init.py similarity index 100% rename from python/vectorless/cli/commands/init.py rename to vectorless/cli/commands/init.py diff --git a/python/vectorless/cli/commands/list_cmd.py b/vectorless/cli/commands/list_cmd.py similarity index 100% rename from python/vectorless/cli/commands/list_cmd.py rename to vectorless/cli/commands/list_cmd.py diff --git a/python/vectorless/cli/commands/query.py b/vectorless/cli/commands/query.py similarity index 100% rename from python/vectorless/cli/commands/query.py rename to vectorless/cli/commands/query.py diff --git a/python/vectorless/cli/commands/remove.py b/vectorless/cli/commands/remove.py similarity index 100% rename from python/vectorless/cli/commands/remove.py rename to vectorless/cli/commands/remove.py diff --git a/python/vectorless/cli/commands/stats.py b/vectorless/cli/commands/stats.py similarity index 100% rename from python/vectorless/cli/commands/stats.py rename to vectorless/cli/commands/stats.py diff --git a/python/vectorless/cli/commands/tree.py b/vectorless/cli/commands/tree.py similarity index 100% rename from python/vectorless/cli/commands/tree.py rename to vectorless/cli/commands/tree.py diff --git a/python/vectorless/cli/main.py b/vectorless/cli/main.py similarity index 100% rename from python/vectorless/cli/main.py rename to vectorless/cli/main.py diff --git a/python/vectorless/cli/output.py b/vectorless/cli/output.py similarity index 100% rename from python/vectorless/cli/output.py rename to vectorless/cli/output.py diff --git a/python/vectorless/config/__init__.py b/vectorless/config/__init__.py similarity index 100% rename from 
python/vectorless/config/__init__.py rename to vectorless/config/__init__.py diff --git a/python/vectorless/config/loading.py b/vectorless/config/loading.py similarity index 100% rename from python/vectorless/config/loading.py rename to vectorless/config/loading.py diff --git a/python/vectorless/config/models.py b/vectorless/config/models.py similarity index 100% rename from python/vectorless/config/models.py rename to vectorless/config/models.py diff --git a/python/vectorless/events.py b/vectorless/events.py similarity index 100% rename from python/vectorless/events.py rename to vectorless/events.py diff --git a/python/vectorless/jupyter.py b/vectorless/jupyter.py similarity index 100% rename from python/vectorless/jupyter.py rename to vectorless/jupyter.py diff --git a/python/vectorless/py.typed b/vectorless/py.typed similarity index 100% rename from python/vectorless/py.typed rename to vectorless/py.typed diff --git a/python/vectorless/session.py b/vectorless/session.py similarity index 100% rename from python/vectorless/session.py rename to vectorless/session.py diff --git a/python/vectorless/streaming.py b/vectorless/streaming.py similarity index 100% rename from python/vectorless/streaming.py rename to vectorless/streaming.py diff --git a/python/vectorless/sync_session.py b/vectorless/sync_session.py similarity index 100% rename from python/vectorless/sync_session.py rename to vectorless/sync_session.py diff --git a/python/vectorless/types/__init__.py b/vectorless/types/__init__.py similarity index 100% rename from python/vectorless/types/__init__.py rename to vectorless/types/__init__.py diff --git a/python/vectorless/types/graph.py b/vectorless/types/graph.py similarity index 100% rename from python/vectorless/types/graph.py rename to vectorless/types/graph.py diff --git a/python/vectorless/types/results.py b/vectorless/types/results.py similarity index 100% rename from python/vectorless/types/results.py rename to vectorless/types/results.py