ADR-018: Knowledge Layer & Retrieval Architecture — Tracking Issue #655

## Summary

Tracking issue for ADR-018 implementation. Introduces a three-layer knowledge architecture (L1 Core Reference, L2 Local RAG, L3 Community Hub) that provides model-agnostic domain knowledge to the LLM across all providers. Covers MIDI/OSC/ArtNet/HID protocol specs, platform automation patterns, device profiles, and provider-aware retrieval tuning.

**ADR:** `docs/adrs/ADR-018-knowledge-layer-retrieval-architecture.md`
**Total Estimate:** 17–24 hours across 3 phases

## Decisions

| # | Decision | Status |
|---|----------|--------|
| D1 | Three-Layer Knowledge Architecture (L1/L2/L3) | 🔲 |
| D2 | L1 — Core Reference Document (2,500 token target, 4,000 cap) | 🔲 |
| D3 | L2 — Local Retrieval Index (ONNX embeddings, SQLite, cosine search) | 🔲 |
| D3.1 | Index Contents (MIDI, OSC, ArtNet, HID, Platform, Devices) | 🔲 |
| D3.2 | Chunking Strategy (section-aware, 300-token target, 50-token overlap) | 🔲 |
| D3.3 | Embedding Model (all-MiniLM-L6-v2, 384-dim, ONNX Runtime) | 🔲 |
| D3.4 | Index Storage (SQLite with BLOB embeddings, brute-force cosine) | 🔲 |
| D3.5 | Retrieval Pipeline (7-step: query → domain filter → embed → search → threshold → budget → inject) | 🔲 |
| D4 | L3 — Optional Online Retrieval (Conductor Hub, opt-in, embedding-only) | 🔲 |
| D5 | Retrieval Integration Points (system prompt assembly, Tauri commands) | 🔲 |
| D6 | Provider-Aware Retrieval Tuning (ProviderProfile per provider) | 🔲 |
| D7 | Knowledge Source Management UI (Settings panel) | 🔲 |
| D8 | Skills Interaction & Budget Arbitration (L2 trimmed first, L1 never trimmed) | 🔲 |
| D9 | Index Build Pipeline (CLI tool, parse → chunk → embed → store) | 🔲 |
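The search, threshold, and budget steps of the D3.5 pipeline, combined with the brute-force cosine search from D3.4, could look roughly like the sketch below. The `Chunk` struct, its field names, and the `retrieve` function are hypothetical and not taken from the ADR. Skipping a chunk that does not fit and continuing with smaller ones is one possible budget policy, assumed here for illustration.

```rust
/// A stored chunk: its text, a precomputed token count, and its embedding.
/// Struct and field names are illustrative, not taken from the ADR.
#[derive(Debug, Clone)]
pub struct Chunk {
    pub text: String,
    pub tokens: usize,
    pub embedding: Vec<f32>,
}

/// Cosine similarity between two vectors.
/// Returns 0.0 when the lengths differ or either vector has zero magnitude.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Brute-force retrieval: score every chunk, drop scores below `threshold`,
/// keep the best `top_k`, then admit chunks in score order while they fit
/// within `budget` tokens. A chunk that does not fit is skipped, not truncated.
pub fn retrieve<'a>(
    query: &[f32],
    chunks: &'a [Chunk],
    threshold: f32,
    top_k: usize,
    budget: usize,
) -> Vec<(&'a Chunk, f32)> {
    let mut scored: Vec<(&Chunk, f32)> = chunks
        .iter()
        .map(|c| (c, cosine(query, &c.embedding)))
        .filter(|(_, s)| *s >= threshold)
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(top_k);

    let mut used = 0;
    let mut out = Vec::new();
    for (c, s) in scored {
        if used + c.tokens <= budget {
            used += c.tokens;
            out.push((c, s));
        }
    }
    out
}
```

With the values listed in this issue, a call would be `retrieve(&query_embedding, &chunks, 0.35, 5, 1_500)`. At 540–800 chunks of 384 dimensions, a linear scan like this is a small amount of arithmetic per query, which is presumably why no vector index is planned.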

## Phase Breakdown

### Phase 1: L1 Core Reference (3–4h)
- [ ] 1A: Author `docs/llm-reference.md` Core Reference Document → D1, D2
- [ ] 1B: Inject L1 into system prompt assembly → D2, D5, D8

### Phase 2: L2 Local Retrieval (10–14h)
- [ ] 2A: Knowledge index infrastructure — ONNX, SQLite, cosine search → D3, D3.3, D3.4
- [ ] 2B: Index build CLI — section-aware chunker + pipeline → D3.2, D9
- [ ] 2C: Source content preparation — 6 domains, 540–800 chunks → D3.1
- [ ] 2D: Retrieval pipeline integration — 7-step pipeline + Tauri commands → D3.5, D5
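For task 2B, a section-aware chunker with a 300-token target and 50-token overlap (D3.2) might be sketched as follows. The function names are hypothetical. Tokens are approximated by whitespace-separated words, which is an assumption made to keep the example self-contained; the real build CLI would presumably count tokens with the embedding model's tokenizer. Section boundaries are detected naively as lines starting with `#`, so a `#` comment line inside a fenced code block would be misread as a heading.

```rust
/// Split a Markdown document into sections at heading lines (lines starting
/// with '#'). Each section keeps its heading as its first line.
pub fn split_sections(doc: &str) -> Vec<String> {
    let mut sections: Vec<String> = Vec::new();
    let mut current = String::new();
    for line in doc.lines() {
        // A new heading closes the section collected so far.
        if line.starts_with('#') && !current.trim().is_empty() {
            sections.push(std::mem::take(&mut current));
        }
        current.push_str(line);
        current.push('\n');
    }
    if !current.trim().is_empty() {
        sections.push(current);
    }
    sections
}

/// Cut one section into windows of at most `target` tokens, where consecutive
/// windows share `overlap` tokens. Tokens are approximated by words here.
pub fn chunk_section(section: &str, target: usize, overlap: usize) -> Vec<String> {
    assert!(overlap < target, "overlap must be smaller than target");
    let words: Vec<&str> = section.split_whitespace().collect();
    let step = target - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + target).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}

/// Chunk a whole document. Windows never cross a section boundary.
pub fn chunk_document(doc: &str, target: usize, overlap: usize) -> Vec<String> {
    split_sections(doc)
        .iter()
        .flat_map(|s| chunk_section(s, target, overlap))
        .collect()
}
```

Calling `chunk_document(text, 300, 50)` yields one chunk for any section of 300 words or fewer, and overlapping windows for longer sections.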

### Phase 3: Polish & Extend (4–6h)
- [ ] 3A: Provider-aware retrieval tuning with ProviderProfile → D6
- [ ] 3B: Knowledge Sources settings UI → D7
- [ ] 3C: L3 online retrieval stub and opt-in toggle → D4
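As a starting point for 3A, a `ProviderProfile` could carry the retrieval parameters that vary per provider. This issue names only the type, so the fields below, the `ProfileRegistry` helper, and the fallback behaviour are all assumptions. The default values are the L2 retrieval settings stated in this issue (1,500 tokens, top-5, threshold 0.35).

```rust
use std::collections::HashMap;

/// Per-provider retrieval settings. Field names are hypothetical.
#[derive(Debug, Clone, PartialEq)]
pub struct ProviderProfile {
    pub max_retrieval_tokens: usize,
    pub top_k: usize,
    pub min_score: f32,
}

impl Default for ProviderProfile {
    fn default() -> Self {
        // Defaults mirror the L2 retrieval budget listed in this issue.
        Self {
            max_retrieval_tokens: 1_500,
            top_k: 5,
            min_score: 0.35,
        }
    }
}

/// Hypothetical registry: providers without an explicit override
/// fall back to the default profile.
#[derive(Debug, Default)]
pub struct ProfileRegistry {
    overrides: HashMap<String, ProviderProfile>,
}

impl ProfileRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register or replace the profile for one provider.
    pub fn set(&mut self, provider: &str, profile: ProviderProfile) {
        self.overrides.insert(provider.to_string(), profile);
    }

    /// Look up a provider's profile, or return the default.
    pub fn get(&self, provider: &str) -> ProviderProfile {
        self.overrides.get(provider).cloned().unwrap_or_default()
    }
}
```

A provider with a small context window might, for example, register a tighter profile through `set`; the concrete numbers for each provider are left to the ADR and are not suggested here.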

## Dependency Graph

```
1A (reference doc) ──→ 1B (injection)
                             ↓
2A (index infra) ──→ 2B (build CLI) ──→ 2D (pipeline)
                                  ↗
         2C (content) ──────────
                                        ↓
                              3A (provider tuning)
                              3B (settings UI)
                              3C (L3 stub)

Phase 1 and Phase 2A can start in parallel.
Phase 2C (content) can start in parallel with 2A/2B.
Phase 3 sub-tasks are independent of each other.
```

## Key Technical Details

| Aspect | Detail |
|--------|--------|
| **Embedding Model** | all-MiniLM-L6-v2 via ONNX Runtime (`ort` crate), 384-dim, 80 MB, <100 ms/query |
| **Index Storage** | SQLite at `$CONDUCTOR_DATA_DIR/knowledge/knowledge.db`, 2–5 MB |
| **L1 Token Budget** | Target 2,500, hard cap 4,000 tokens |
| **L2 Retrieval Budget** | Max 1,500 tokens per query, top-5, threshold 0.35 |
| **L3 Privacy** | Embedding vectors only — no raw text sent to Hub |
| **Budget Arbitration** | Trim L2 first, then older Skills. L1 and T1 are never trimmed. |
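The budget arbitration rule in the table can be illustrated with a small sketch. The `Kind` and `Block` types and the `arbitrate` function are hypothetical. Two details are assumptions rather than decisions from the ADR: L2 blocks are taken to be ordered by descending relevance, so the last one is dropped first, and Skills carry an `age` value where a higher number means older.

```rust
/// Kinds of system prompt content that compete for the token budget.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    L1,
    T1,
    Skill,
    L2,
}

/// One block of prompt content. `age` is only meaningful for Skills
/// (higher means older); this field is an assumption for illustration.
#[derive(Debug, Clone)]
pub struct Block {
    pub kind: Kind,
    pub tokens: usize,
    pub age: u32,
}

/// Trim blocks until the total fits `budget`: L2 blocks go first (last in
/// the list first), then Skills from oldest to newest. L1 and T1 are never
/// removed, so the result can still exceed the budget.
pub fn arbitrate(mut blocks: Vec<Block>, budget: usize) -> Vec<Block> {
    let mut total: usize = blocks.iter().map(|b| b.tokens).sum();
    while total > budget {
        let victim = blocks
            .iter()
            .rposition(|b| b.kind == Kind::L2)
            .or_else(|| {
                blocks
                    .iter()
                    .enumerate()
                    .filter(|(_, b)| b.kind == Kind::Skill)
                    .max_by_key(|(_, b)| b.age)
                    .map(|(i, _)| i)
            });
        match victim {
            Some(i) => total -= blocks.remove(i).tokens,
            // Only L1 and T1 remain; nothing else may be trimmed.
            None => break,
        }
    }
    blocks
}
```

If only L1 and T1 remain and the prompt is still over budget, this sketch returns them unchanged; how the caller handles that case is outside what this issue specifies.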

## External Dependencies

- **ADR-007** (LLM Integration): Provider abstraction, MCP tools, Skills system
- **ADR-015** (LLM Signal Awareness): T1/T2/T3 signal context — coexists in system prompt
- **ADR-013** (LLM Canvas): `markdown` artifact type may render retrieved guides
- **ONNX Runtime**: `ort` Rust crate (MIT/Apache-2.0) — new dependency
- **all-MiniLM-L6-v2**: Sentence Transformers model (MIT License) — bundled ONNX file

## References

- **ADR-018**: `docs/adrs/ADR-018-knowledge-layer-retrieval-architecture.md`
- **Event types**: `conductor-core/src/event_types.rs`
- **MIDI learn**: `conductor-gui/src-tauri/src/midi_learn.rs`
- **Context management**: `conductor-gui/src-tauri/src/llm/context.rs`
- **Chat store**: `conductor-gui/ui/src/lib/stores/chat.js`
- **LLM commands**: `conductor-gui/src-tauri/src/llm_commands.rs`
- **Provider modules**: `conductor-gui/src-tauri/src/llm/providers/`
- **Settings UI**: `conductor-gui/ui/src/lib/workspace/AppSettingsView.svelte`
- **Conversation DB**: `conductor-gui/src-tauri/src/llm/db.rs`
