A research knowledge graph runtime. Typed substrates, closed taxonomies, and a verb-consolidated MCP surface — built for agents that need structure, not just vectors.
Vector search finds similar text. A knowledge graph finds structure — lineages, dependencies, contradictions, gaps. khive gives your research agent a typed, queryable graph that grows as it works: read a paper and entities + edges fall out; make a connection and it's traversable immediately; come back next session and the graph remembers what you built.
No Neo4j. No SPARQL endpoint to deploy. SQLite on disk, MCP over stdio, cargo test in 4 seconds.
| Capability | How |
|---|---|
| 67 verbs, 7 packs | KG, GTD, memory, brain, comm, schedule, knowledge — all load by default |
| Typed entities | 9 closed kinds: concept, document, dataset, project, person, org, artifact, service, resource |
| Typed edges | 17 closed relations in 9 categories (structure, derivation, provenance, temporal, dependency, impl, lateral, annotation, epistemic) |
| Typed notes | 5 closed kinds: observation, insight, question, decision, reference |
| Hybrid retrieval | FTS5 + vector RRF with embedding rerank; shipped BM25, HNSW, Vamana, and fusion crates for pack-specific retrieval paths |
| Graph traversal | BFS with depth/direction/relation filters, bidirectional shortest path |
| GQL + SPARQL queries | Parse to SQL, run against the same SQLite backend |
| Salience-weighted notes | Notes carry salience scores; search ranks by semantic relevance × salience |
| Cross-substrate links | Notes annotate entities (and vice versa) via the same edge system |
| Soft delete + supersede | History-preserving: old records stay, newer ones supersede via graph edges |
| Namespace attribution | Every record stamped with a namespace; all records share one store by default; operators may supply a custom Gate for namespace-scoped enforcement |
Everything in khive is one of three things:
| Substrate | What it is | Mutability | Example |
|---|---|---|---|
| Entity | A graph node with typed edges | Mutable + soft-delete | LoRA (concept), arxiv:2106.09685 (document) |
| Note | A temporal observation about the world | Mutable + soft-delete | "FlashAttention gains scale with seq len, not batch" |
| Event | An audit log entry | Immutable | create(kind="entity", ...) was called at T |
Entities are things. Notes are what you think about things. Events are what happened.
One MCP tool: request (ADR-016 + ADR-027). Every verb is a parsed op inside it.
request(ops="verb(arg=value, arg=value)") # single op
request(ops="[v1(...), v2(...), v3(...)]") # parallel batch (max 100)
request(ops="[{\"tool\":\"v1\",\"args\":{...}}, ...]") # equivalent JSON form
All 7 packs load by default — 67 verbs out of the box:
| Pack | Prefix | Verbs | What it does |
|---|---|---|---|
| kg | (bare) | 16 | Entities, edges, notes, graph queries |
| gtd | gtd. |
5 | Task lifecycle (inbox → next → active → done) |
| memory | memory. |
5 | Salience-weighted remember / decay-ranked recall |
| brain | brain. |
13 | Bayesian user profiles + feedback loop |
| comm | comm. |
5 | Threaded messaging |
| schedule | schedule. |
4 | Reminders and scheduled verb execution |
| knowledge | knowledge. |
19 | Atom-based KB with embedding rerank search |
create, list, search take kind=entity|note (or kind=edge for list).
get, update, delete, merge are UUID-only — they auto-detect the record type.
Agents reach khive via MCP stdio — Python, TypeScript, Rust, or any MCP-compatible client. No language SDK to learn.
┌──────────────────────────────────────────────────────────────┐
│ kkernel mcp — stdio MCP server (the `kkernel` binary) │
│ khived — persistent daemon (ADR-049): warm runtime │
│ auto-spawned on first request │
│ 1 tool: `request` (ADR-016 + ADR-027) — parses DSL, │
│ dispatches each op through the VerbRegistry │
└──────────────────────────────────────────────────────────────┘
↕ VerbRegistry dispatch
┌──────────────────────────────────────────────────────────────┐
│ khive-pack-kg — KG vocabulary + 16 verb handlers │
│ khive-pack-gtd — task lifecycle (5 verbs) │
│ khive-pack-memory — salience + decay recall (5 verbs) │
│ khive-pack-brain — Bayesian profiles (13 verbs) │
│ khive-pack-comm — threaded messaging (5 verbs) │
│ khive-pack-schedule — reminders + scheduled ops (4 verbs) │
│ khive-pack-knowledge — atom KB + embedding rerank (19 verbs)│
└──────────────────────────────────────────────────────────────┘
↕ in-process
┌──────────────────────────────────────────────────────────────┐
│ khive-runtime, khive-request, khive-query, khive-db, │
│ khive-storage, khive-score, khive-types │
└──────────────────────────────────────────────────────────────┘
Embedded SQLite storage with FTS5 trigram text search and sqlite-vec vector search. Retrieval ships in-process BM25, HNSW, Vamana, and fusion crates for hybrid and pack-specific search paths. One binary, one DB file, no external services to run.
The khived daemon (ADR-049) keeps the runtime warm between MCP sessions — embedding model stays loaded, SQLite connections stay open, pack registries stay initialized. It auto-spawns on first request and persists in the background, eliminating cold-start overhead on reconnect.
HTTP gateway, CLI, and visual frontend are planned for future releases.
Knowledge search runs on an in-process Vamana ANN index (khive-vamana). On the standard
SIFT-1M benchmark, the index returns recall@10 of 0.95 at a p50 query latency of 171µs over
1,000,000 vectors, measured on a single laptop (macos-arm64, commit eb6696c). Tail latency
stays under 250µs and the index scales sublinearly as the corpus grows from 100K to 1M vectors:
| Vectors | Recall@10 | p50 | p95 | p99 | Build |
|---|---|---|---|---|---|
| 100K | 0.9504 | 71µs | 93µs | 102µs | 11.1s |
| 316K | 0.9523 | 130µs | 177µs | 200µs | 54.6s |
| 1M | 0.9521 | 171µs | 216µs | 234µs | 320.8s |
Query latency grows about 2.4x while the corpus grows 10x, so the search path is sublinear in corpus size over this range. Index build is a one-time cost paid at construction; it grows super-linearly here (about 29x build time for the 10x corpus).
The speedups over exhaustive search implied by these latencies (89x at 100K, 153x at 316K, 341x
at 1M) are computed against a back-derived brute-force baseline rather than a directly measured
one (see #167); treat them as indicative, not
as a measured headline figure. The benchmark harness lives in perf/; raw data is in
perf/ledger.csv.
| Crate | Purpose |
|---|---|
khive-types |
Domain types, Pack trait, closed enums |
khive-score |
Deterministic i64 fixed-point scoring |
khive-storage |
Trait-only capability surface (zero implementations) |
khive-db |
SQLite backend: entity/note/edge tables, FTS5 TextSearch, current sqlite-vec VectorStore compatibility |
khive-retrieval |
Hybrid retrieval primitives |
khive-fusion |
RRF, weighted, union, vector-only, and keyword-only fusion strategies |
khive-bm25 |
BM25 keyword index |
khive-hnsw |
HNSW vector index |
khive-vamana |
Vamana ANN index used by knowledge search |
khive-query |
SPARQL / GQL → SQL compiler |
khive-runtime |
Service API + VerbRegistry + PackRuntime trait |
khive-request |
Request DSL parser (function-call, JSON; pipe / LNDL planned). Transport-agnostic AST. |
khive-pack-kg |
KG pack: vocabulary, verb handlers, kind validation |
khive-pack-gtd |
GTD pack: task lifecycle over the notes substrate |
khive-pack-memory |
Memory pack: salience-weighted remember/recall with decay |
khive-pack-brain |
Brain pack: Bayesian user profiles, feedback, resolution |
khive-pack-comm |
Comm pack: threaded messaging with inbox |
khive-pack-schedule |
Schedule pack: reminders and scheduled verb execution |
khive-pack-knowledge |
Knowledge pack: atom-based KB with embedding rerank search |
khive-mcp |
MCP server library — single request tool dispatching through the VerbRegistry (served by kkernel mcp) |
kkernel |
The single shipped binary — kkernel mcp serves MCP; admin subcommands (exec, reindex, db, …) |
Dependency direction (storage stack): types → score → storage → db → query → runtime → packs → mcp.
Side input: request → mcp (the DSL parser is consumed only at the MCP dispatch boundary;
packs do not depend on it).
Storage is trait-only; backends (SQLite today, Postgres tomorrow) implement the traits without
touching consumers.
1. Install:
npm install -g khive2. Add to your MCP config (.mcp.json in your project, or ~/.claude/mcp.json for
global):
{ "mcpServers": { "khive": { "command": "khive", "args": ["mcp"] } } }That's it. All 7 packs load by default, a background daemon auto-spawns to keep the runtime
warm, and Claude Code discovers the request tool with the full 67-verb catalog.
If you prefer Rust tooling or need to build from source:
cargo install kkernel # from crates.io — installs the `kkernel` binary
# or:
git clone https://github.com/ohdearquant/khive.git && cd khive
cd crates && cargo build --release -p kkernelThen point your MCP config at the binary's mcp subcommand:
{ "mcpServers": { "khive": { "command": "kkernel", "args": ["mcp"] } } }kkernel is the single shipped binary; kkernel mcp serves the MCP request
surface. The npm package installs a thin khive (and a khive-mcp compatibility)
shim that forwards to kkernel mcp — the Cargo path invokes kkernel directly.
The agent expresses verbs as DSL ops inside the single request tool:
request(ops="create(kind=\"entity\", entity_kind=\"concept\", name=\"LoRA\")")
request(ops="search(kind=\"entity\", query=\"parameter efficient fine-tuning\")")
request(ops="link(source_id=\"<uuid>\", target_id=\"<uuid>\", relation=\"variant_of\")")
# Batch multiple ops in one call:
request(ops="[create(...), create(...), link(...)]")
For guided research workflows, install the marketplace plugin:
/plugin marketplace add ohdearquant/khive
/plugin install kg
This adds 4 workflow skills and a researcher agent:
| Skill | What it does |
|---|---|
/kg:digest |
Ingest material into the graph — extract entities, link, verify |
/kg:explore |
Discover what the graph knows — traverse, narrate, surface gaps |
/kg:connect |
Wire a new concept into existing knowledge — find relations |
/kg:polish |
Audit and fix — orphans, low-degree nodes, duplicates |
khive mcp # Default: ~/.khive/khive.db
khive mcp --db /path/to/my.db # Custom DB path
khive mcp --db :memory: # Ephemeral (testing)
khive mcp --namespace my-project # Default namespace (default: "local")
khive mcp --no-embed # Disable local embedding model
khive mcp --log debug # Log level (default: warn)Environment variables: KHIVE_DB, KHIVE_NAMESPACE, KHIVE_NO_EMBED, KHIVE_LOG.
cd crates && cargo test --workspace
make ci # Full CI: fmt, clippy, test, buildPrerequisites: Rust 1.94+ (via rustup), Deno 2.x (for the TypeScript CLI layer — optional)
- Node.js 20+ and pnpm (for frontend — optional)
- Feature branches + PRs. Never push directly to main.
make cimust pass (fmt, clippy, test, no-default-features check, release build).- Conventional commits:
feat(types): add NoteKind taxonomy. - Schema/interface changes need a design doc — propose in the PR or as an issue.
- See CLAUDE.md for the developer guide, AGENTS.md for agent usage.
v0.3.0 — published on crates.io. 67 verbs across 7 packs, 9 entity kinds, 17 edge relations, daemon warm startup (ADR-049), knowledge search with embedding rerank, Bayesian brain profiles, threaded messaging, scheduled verb execution. Ready for use with Claude Code and any MCP-compatible agent.
Apache 2.0 — see LICENSE.