Baseline comparison harness: trie-memory vs LLM + vector DB #5

## Summary
The central architectural bet of this PoC is "paraphrase-capable recall without embeddings." That claim is only meaningful when measured against the obvious baseline: an LLM plus vector DB stack run on the same scenarios.

## What to do
- Port Scenarios 1 through 5 and the 20-query stress corpus to a baseline implementation:
  - Embeddings: any open model (e.g., sentence-transformers or local nomic-embed).
  - Vector store: sqlite-vec, qdrant-lite, or similar.
  - Renderer: template-only (same no-fabrication contract), or a small LLM pass — document which.
- Run both systems on the same corpus and queries. For each system, record mode distribution, recall quality, latency, and storage footprint.
- Publish the comparison as `docs/baseline_comparison.md`.
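
One possible shape for the harness core is sketched below. Everything in it is an assumption for illustration, not existing project code: the `RecallSystem` trait, the `Answer`, `Query`, and `Report` types, the mode labels, and the substring check standing in for "recall quality". Storage footprint is left out because it depends on the chosen vector store.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Hypothetical answer shape: a mode label plus the rendered text.
pub struct Answer {
    pub mode: String,
    pub text: String,
}

/// Hypothetical interface that both trie-memory and the baseline would implement.
pub trait RecallSystem {
    fn name(&self) -> &str;
    fn answer(&mut self, query: &str) -> Answer;
}

/// One corpus query with the fact a correct answer should contain.
pub struct Query {
    pub text: String,
    pub expected: String,
}

/// Per-system measurements collected over one corpus run.
pub struct Report {
    pub system: String,
    pub mode_counts: BTreeMap<String, usize>,
    pub correct: usize,
    pub total: usize,
    pub total_latency: Duration,
}

/// Runs every query through one system, timing only the `answer` call.
pub fn run_corpus<S: RecallSystem>(system: &mut S, queries: &[Query]) -> Report {
    let mut mode_counts = BTreeMap::new();
    let mut correct = 0;
    let mut total_latency = Duration::ZERO;
    for q in queries {
        let start = Instant::now();
        let answer = system.answer(&q.text);
        total_latency += start.elapsed();
        *mode_counts.entry(answer.mode).or_insert(0) += 1;
        // Assumption: an answer counts as correct if it contains the expected fact.
        if answer.text.contains(&q.expected) {
            correct += 1;
        }
    }
    Report {
        system: system.name().to_string(),
        mode_counts,
        correct,
        total: queries.len(),
        total_latency,
    }
}

/// Stand-in system used only to exercise the harness; not a real backend.
pub struct StubSystem;

impl RecallSystem for StubSystem {
    fn name(&self) -> &str {
        "stub"
    }
    fn answer(&mut self, query: &str) -> Answer {
        if query.contains("coffee") {
            Answer { mode: "recall".into(), text: "Alice prefers coffee".into() }
        } else {
            Answer { mode: "abstain".into(), text: String::new() }
        }
    }
}
```

Putting both systems behind one trait keeps the corpus, the timing, and the scoring identical for each side, so any difference in the report comes from the system rather than from the harness.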

## Acceptance
- Reproducible harness exists and is documented.
- A results table shows where trie-memory wins, where it loses, where it ties.
- README "honest state" section references the baseline numbers instead of speculation.
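
The win/lose/tie table could be generated from the measurements rather than written by hand. The sketch below is one way to do that. The `Metric` struct, the relative `tolerance` used to call a tie, and the table columns are all hypothetical choices, not something the project has settled on.

```rust
/// Outcome for trie-memory on one metric, relative to the baseline.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Verdict {
    Win,
    Lose,
    Tie,
}

/// One measured metric for both systems (hypothetical shape).
pub struct Metric {
    pub name: &'static str,
    pub trie: f64,
    pub baseline: f64,
    /// True for metrics like recall, false for latency or storage.
    pub higher_is_better: bool,
    /// Relative difference at or below which the result is called a tie.
    pub tolerance: f64,
}

/// Classifies a metric as a win, loss, or tie for trie-memory.
pub fn verdict(m: &Metric) -> Verdict {
    let scale = m.trie.abs().max(m.baseline.abs());
    if scale == 0.0 || (m.trie - m.baseline).abs() <= m.tolerance * scale {
        return Verdict::Tie;
    }
    let trie_higher = m.trie > m.baseline;
    if trie_higher == m.higher_is_better {
        Verdict::Win
    } else {
        Verdict::Lose
    }
}

/// Renders the metrics as a Markdown table, one row per metric.
pub fn render_table(metrics: &[Metric]) -> String {
    let mut out = String::from(
        "| Metric | trie-memory | baseline | Verdict |\n|---|---|---|---|\n",
    );
    for m in metrics {
        out.push_str(&format!(
            "| {} | {} | {} | {:?} |\n",
            m.name,
            m.trie,
            m.baseline,
            verdict(m)
        ));
    }
    out
}
```

An explicit tolerance stops small measurement noise from being reported as a win or a loss. The right value per metric would have to come from repeated runs.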

## Links
- `docs/design/honest_agent/progress.md` Phase D (listed as deferred; explicitly pulled up to M2 because it is load-bearing for the central claim).
- `tests/phase_b_stress.rs`
- `tests/honest_agent_paraphrase.rs`

