feat: add Record Provenance across memory records #58
Migration 8->9 adds a nullable provenance column (CHECK-constrained to verbatim/user_authored/extracted/derived) to messages, decisions, learnings, breadcrumbs, and loa_entries. Legacy rows stay NULL (unknown) per ADR-0001: never guessed, never laundered.

Write paths stamp provenance automatically:
- CLI add + MCP memory_add -> user_authored (no public override)
- raw conversation capture (import, dump, PreCompact flush) -> verbatim
- extraction writers (hooks, structured extraction, LoA, import-legacy) -> extracted
- derived reserved for future internal paths

recall provenance backfill classifies legacy rows on deterministic evidence only; it is a dry run by default and applies changes only with --execute. CLI search flags unknown provenance by default; --show-provenance shows all values. MCP search/hybrid/recall payloads carry provenance for every record type.

Refs #42, ADR-0001
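The migration and stamping rules in this commit message can be sketched roughly as follows. This is an illustration only: `migrationStatements`, `WritePath`, and `STAMP` are hypothetical names, the `TEXT` column type is an assumption, and the PR's actual migration code may be organized differently.

```typescript
// Hypothetical sketch; identifiers are illustrative, not the PR's real code.
type Provenance = "verbatim" | "user_authored" | "extracted" | "derived";

const PROVENANCE_VALUES: readonly Provenance[] = [
  "verbatim",
  "user_authored",
  "extracted",
  "derived",
];

// The five tables named in the commit message.
const PROVENANCE_TABLES = [
  "messages",
  "decisions",
  "learnings",
  "breadcrumbs",
  "loa_entries",
] as const;

// One ALTER TABLE per table. The column is nullable, so legacy rows keep
// NULL (unknown); a NULL value does not violate an IN (...) CHECK in SQLite.
function migrationStatements(): string[] {
  const allowed = PROVENANCE_VALUES.map((v) => `'${v}'`).join(", ");
  return PROVENANCE_TABLES.map(
    (table) =>
      `ALTER TABLE ${table} ADD COLUMN provenance TEXT CHECK (provenance IN (${allowed}))`
  );
}

// Assumed labels for the write paths listed in the commit message.
type WritePath =
  | "cli_add"
  | "mcp_memory_add"
  | "raw_conversation_capture"
  | "extraction_writer";

// Each write path stamps a fixed value; callers cannot override it.
// "derived" is reserved, so no current path maps to it.
const STAMP: Record<WritePath, Provenance> = {
  cli_add: "user_authored",
  mcp_memory_add: "user_authored",
  raw_conversation_capture: "verbatim",
  extraction_writer: "extracted",
};
```

Keeping the stamp in a fixed lookup rather than a caller-supplied argument is one way to honor the "no public override" constraint.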
…sult payloads
- backfill: dry-run default writes nothing, --execute classifies only evidence-backed rows, never overwrites, idempotent, table filter, unknown-table rejection
- write paths: CLI add stamps user_authored, structured extraction stamps extracted, batch message capture persists verbatim, unstamped writes stay NULL
- hooks: sqlite-writers stamp extracted + legacy-DB column guard, PreCompact flush stamps verbatim + pre-provenance DB guard
- conversation import: raw messages stamped verbatim
- search(): provenance present for all five record types, NULL as null
- CLI display contract: quiet for known, flags unknown, --show-provenance
- ADR-0001 contract pins: MCP memory_add schema and CLI expose no provenance override

Refs #42
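The backfill behavior these tests pin down (dry run by default, evidence-backed rows only, never overwriting, idempotent) can be sketched as below. The classification rules follow the PR description; `LegacyRow`, `evidenceFor`, and `backfill` are hypothetical names, and the in-memory row array stands in for the real SQLite tables.

```typescript
// Hypothetical sketch; operates on in-memory rows instead of a database.
type Provenance = "verbatim" | "user_authored" | "extracted" | "derived";

const TABLES = ["messages", "decisions", "learnings", "breadcrumbs", "loa_entries"] as const;
type Table = (typeof TABLES)[number];

interface LegacyRow {
  table: string;
  category?: string | null;
  provenance: Provenance | null; // null means unknown
}

function isKnownTable(name: string): name is Table {
  return (TABLES as readonly string[]).indexOf(name) !== -1;
}

// Returns a value only when deterministic evidence exists; otherwise null.
// Never returns "user_authored".
function evidenceFor(row: LegacyRow): Provenance | null {
  const table = row.table;
  if (!isKnownTable(table)) throw new Error(`unknown table: ${table}`);
  switch (table) {
    case "messages":
      return "verbatim"; // only historical write path is raw capture
    case "loa_entries":
      return "extracted"; // historical writers store machine-generated extracts
    case "decisions":
    case "learnings":
      return row.category === "auto-extracted" ? "extracted" : null;
    case "breadcrumbs":
      return row.category === "extracted-idea" ? "extracted" : null;
  }
}

interface BackfillReport {
  classified: number;
  leftUnknown: number;
  alreadyStamped: number;
}

// Dry run unless execute is true; rows that already carry a value are skipped.
function backfill(rows: LegacyRow[], execute = false): BackfillReport {
  const report: BackfillReport = { classified: 0, leftUnknown: 0, alreadyStamped: 0 };
  for (const row of rows) {
    if (row.provenance !== null) {
      report.alreadyStamped++;
      continue;
    }
    const proposed = evidenceFor(row);
    if (proposed === null) {
      report.leftUnknown++;
      continue;
    }
    report.classified++;
    if (execute) row.provenance = proposed;
  }
  return report;
}
```

Running `backfill(rows, true)` a second time classifies nothing new, because every evidence-backed row is already stamped and the rest stay NULL.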
…ckfill
- cli-reference: search flag, display contract, Record Provenance section
- mcp-tools: provenance in search/hybrid/recall payloads; memory_add stamps user_authored with no provenance parameter (ADR-0001)
- architecture: provenance column + migration 8->9 note
- slash-commands + /Recall:search: --show-provenance flag
- FOR_CLAUDE/FOR_PI/FOR_OPENCODE: CLI examples kept in sync

Refs #42
Review: PR #58, Record Provenance (issue #42, ADR-0001)
Reviewed at head
Binding design constraints: all verified ✅
Scope judgment:
Closes #42
What
Adds Record Provenance, the declared origin and transformation level of every memory record, as automatic write-path metadata per ADR-0001 and the CONTEXT.md glossary.

Schema
- Migration 8->9 adds a nullable provenance column to messages, decisions, learnings, breadcrumbs, and loa_entries, CHECK-constrained to verbatim | user_authored | extracted | derived (NULL allowed: legacy unknown stays representable).
- PRAGMA user_version advances through the existing migration system; fresh-install DDL carries the same column.
- The Provenance type propagates through all record interfaces and SearchResult.

Write-path stamping (no public override)
- MCP memory_add → user_authored: schema exposes no provenance parameter
- CLI recall add → user_authored: no --provenance flag
- hook writers (hooks/lib/sqlite-writers.ts) → extracted (with legacy-DB column guard)
- raw conversation capture → verbatim
- extraction writers → extracted
- derived: reserved in the vocabulary

Backfill: recall provenance backfill

Dry-run by default, --execute to apply. Never guesses:
- messages → verbatim (only historical write path is raw capture)
- loa_entries → extracted (all historical writers store machine-generated extracts)
- decisions/learnings → extracted only where category = 'auto-extracted'
- breadcrumbs → extracted only where category = 'extracted-idea'

Everything else is reported and left NULL. Never overwrites; never assigns user_authored.

Display contract
- recall search: quiet for known provenance, visibly flags unknown (⚠ [provenance: unknown]).
- recall search --show-provenance: shows every value.
- memory_search, memory_hybrid_search, and memory_recall payloads carry provenance for every record type (legacy NULL reported as unknown).

Tests (500 pass, lint clean)
- CLI display contract (quiet for known, flags unknown, --show-provenance)
- search() payload provenance for all five tables
- memory_add schema and CLI expose no provenance override

Docs
cli-reference.md, mcp-tools.md, architecture.md, slash-commands.md, the /Recall:search command doc, and the FOR_CLAUDE/FOR_PI/FOR_OPENCODE CLI blocks (kept in sync).

Notes
The display changes apply to recall search (the command that gained --show-provenance); the separate recall semantic / recall hybrid CLI display paths were left untouched.
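As a rough illustration of the display contract for recall search and the MCP payloads, the sketch below uses hypothetical helpers `provenanceTag` and `payloadProvenance`. The unknown marker is taken from the description above; the `[provenance: <value>]` format for known values under --show-provenance is an assumption.

```typescript
// Hypothetical sketch; helper names and the known-value format are assumptions.
type Provenance = "verbatim" | "user_authored" | "extracted" | "derived";

// CLI rendering: stay quiet for known values unless --show-provenance is set,
// and always flag unknown (NULL) provenance.
function provenanceTag(provenance: Provenance | null, showProvenance = false): string {
  if (provenance === null) return "⚠ [provenance: unknown]";
  return showProvenance ? `[provenance: ${provenance}]` : "";
}

// MCP payloads: every record carries a value, with legacy NULL reported as "unknown".
function payloadProvenance(provenance: Provenance | null): Provenance | "unknown" {
  return provenance === null ? "unknown" : provenance;
}
```

For example, `provenanceTag("extracted")` yields an empty string, while `provenanceTag(null)` yields the warning marker regardless of the flag.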