Add a persistent sqlite knowledge-vault backend #238

Merged: Antawari merged 3 commits into main from cat0629/vault-sqlite-backend on Jun 29, 2026
@Antawari

Contributor

What

A third VaultBackend implementation backed by the standard-library sqlite3 module — no new dependencies, so it runs in CI (unlike the embedded-vector backend, whose optional deps are absent there).

  • Conforms to the existing VaultBackend protocol and the frozen vault-entry shape.
  • Mirrors the in-memory backend's keyword retrieval byte-for-byte (LIKE prefilter, ranking in Python), so results are deterministic and identical across environments. Keyword retrieval only, no embeddings.
  • Single-table storage with a forward-only versioned schema recorded in a small meta table; all SQL parameterized.
  • Wired into the backend factory behind a new "sqlite" option; existing memory / lancedb / fallback paths untouched.
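
For readers unfamiliar with the pattern, here is a minimal sketch of a forward-only versioned schema tracked in a meta table. It is an illustration under assumptions, not the code in this PR: the table names `meta` and `entries`, the column set, the `schema_version` key, and the `ensure_schema` helper are all hypothetical.

```python
import sqlite3

# Hypothetical target version; the PR does not state the real number.
SCHEMA_VERSION = 1

# Hypothetical migrations: version -> statements that upgrade *to* that version.
# Forward-only means there are no downgrade steps.
MIGRATIONS = {
    1: (
        "CREATE TABLE IF NOT EXISTS entries ("
        " content_hash TEXT PRIMARY KEY,"
        " source TEXT NOT NULL,"
        " entry_type TEXT NOT NULL,"
        " content TEXT NOT NULL)",
    ),
}


def ensure_schema(conn: sqlite3.Connection) -> int:
    """Bring the database up to SCHEMA_VERSION; a no-op if already there."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS meta (key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    row = conn.execute(
        "SELECT value FROM meta WHERE key = ?", ("schema_version",)
    ).fetchone()
    current = int(row[0]) if row else 0

    # A file written by a newer build is refused rather than downgraded.
    if current > SCHEMA_VERSION:
        raise RuntimeError(f"database schema {current} is newer than {SCHEMA_VERSION}")

    # Apply only the missing steps, in order, recording each one as it lands.
    for version in range(current + 1, SCHEMA_VERSION + 1):
        for statement in MIGRATIONS[version]:
            conn.execute(statement)
        conn.execute(
            "INSERT OR REPLACE INTO meta (key, value) VALUES (?, ?)",
            ("schema_version", str(version)),
        )
    conn.commit()
    return SCHEMA_VERSION
```

Calling `ensure_schema` on every open is what makes reopening an existing file idempotent in this sketch: the second call finds the recorded version and applies nothing.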

Tests

tests/unit/test_sqlite_vault.py — mirrors the in-memory contract (store → exists → query → get_by_source, round-trip fidelity, dedup-by-content-hash, upsert, limit, entry_type filter) plus persistence across reopening the same file path, idempotent schema on reopen, and factory wiring.
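
As a rough illustration of the persistence case, the sketch below writes through one connection, closes it, and reads back through a fresh connection to the same file. It uses plain `sqlite3` rather than the backend class from this PR; the `notes` table and the `store_note` / `load_note` helpers are hypothetical stand-ins.

```python
import sqlite3
import tempfile
from pathlib import Path


def store_note(db_path: Path, source: str, content: str) -> None:
    """Open the file, write one row, and close the connection again."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS notes (source TEXT PRIMARY KEY, content TEXT NOT NULL)"
        )
        conn.execute(
            "INSERT OR REPLACE INTO notes (source, content) VALUES (?, ?)",
            (source, content),
        )
        conn.commit()
    finally:
        conn.close()


def load_note(db_path: Path, source: str):
    """Read one row back through a brand-new connection."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT content FROM notes WHERE source = ?", (source,)
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()


def test_persists_across_reopen() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "vault.db"
        store_note(db_path, "design.md", "use sqlite for the vault")
        # No connection is shared between the write and the read,
        # so the value can only have come from the file on disk.
        assert load_note(db_path, "design.md") == "use sqlite for the vault"
        assert load_note(db_path, "missing.md") is None
```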

Local sanity: ruff check + ruff format --check clean, protocol-doc citations 0 drift, vault tests green.

🤖 Generated with Claude Code

Antawari and others added 3 commits June 29, 2026 12:31
A third VaultBackend implementation backed by the standard-library
sqlite3 module — no extra dependencies, so it runs in CI (unlike the
embedded-vector backend, whose optional deps are absent there). It
conforms to the existing VaultBackend protocol and the frozen vault
entry shape, and mirrors the in-memory backend's keyword retrieval
byte-for-byte (LIKE prefilter, ranking done in Python) so results are
deterministic and identical across environments.

Storage is a single table with a forward-only versioned schema recorded
in a small meta table; all SQL is parameterized. Keyword retrieval only,
no embeddings. Wired into the backend factory behind a new "sqlite"
option; the existing options are untouched. Tests mirror the in-memory
contract plus persistence across reopening the same file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the two dynamically-built statements (the upsert and the query
prefilter) with static literals: a fixed-column INSERT ... ON CONFLICT,
and a SELECT that reads the table (optionally narrowed by entry_type)
with the scoring done in Python as before. Every value is still a bound
parameter; this clears the shared gate's SQL-construction lint and is
simpler — the keyword ranking still mirrors the in-memory backend.
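
A sketch of what the two static statements described above might look like, with ranking left to Python. The table layout, the conflict key (`content_hash`), and the scoring rule (summed term occurrences, ties kept in insertion order) are assumptions made for illustration; the actual ranking is whatever the in-memory backend does. `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import hashlib
import sqlite3

# Hypothetical single-table layout, keyed by a hash of the content.
CREATE_SQL = (
    "CREATE TABLE IF NOT EXISTS entries ("
    " content_hash TEXT PRIMARY KEY,"
    " source TEXT NOT NULL,"
    " entry_type TEXT NOT NULL,"
    " content TEXT NOT NULL)"
)

# Static literal: fixed columns, every value is a bound parameter.
UPSERT_SQL = (
    "INSERT INTO entries (content_hash, source, entry_type, content)"
    " VALUES (:content_hash, :source, :entry_type, :content)"
    " ON CONFLICT(content_hash) DO UPDATE SET"
    " source = excluded.source,"
    " entry_type = excluded.entry_type,"
    " content = excluded.content"
)

# Static literal: the optional filter is a bound NULL check,
# so no SQL text is ever assembled at runtime.
SELECT_SQL = (
    "SELECT source, entry_type, content FROM entries"
    " WHERE :entry_type IS NULL OR entry_type = :entry_type"
    " ORDER BY rowid"
)


def store(conn: sqlite3.Connection, source: str, entry_type: str, content: str) -> None:
    """Insert an entry, or update the existing row with the same content hash."""
    content_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    conn.execute(
        UPSERT_SQL,
        {
            "content_hash": content_hash,
            "source": source,
            "entry_type": entry_type,
            "content": content,
        },
    )
    conn.commit()


def query(conn: sqlite3.Connection, text: str, limit: int = 5, entry_type=None):
    """Read candidate rows with the static SELECT, then rank them in Python."""
    rows = conn.execute(SELECT_SQL, {"entry_type": entry_type}).fetchall()
    terms = text.lower().split()
    scored = []
    for source, row_type, content in rows:
        haystack = content.lower()
        score = sum(haystack.count(term) for term in terms)  # assumed scoring rule
        if score > 0:
            scored.append((score, (source, row_type, content)))
    # sorted() is stable, so equal scores keep their rowid order.
    scored.sort(key=lambda pair: -pair[0])
    return [row for _, row in scored[:limit]]
```

Because the ranking never touches SQL, the result order depends only on the stored rows and the Python scoring function, not on the SQLite build.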

Declare the new test file in the file-budget ledger so it does not draw
against the frozen tests/unit package total (the established pattern for
new test coverage).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Antawari merged commit 045ba1e into main on Jun 29, 2026
4 checks passed
Antawari deleted the cat0629/vault-sqlite-backend branch on June 29, 2026 at 20:40