feat(entities): persist false-positive entity↔document unlink (#35) #37

Merged: tenfourty merged 1 commit into main from feat/entity-suppressions on Jun 25, 2026

@tenfourty (Owner)

What

Adds a way to suppress false-positive entity↔document links that survives re-indexing and Granola sync.

  • New sidecar store memory/.kbx/entity-suppressions.json (src/kb/suppressions.py) — keyed by document path → entity names, case-insensitive, atomic writes.
  • find_entity_mentions(..., suppressed_ids=...) skips suppressed entities across all matching tiers.
  • Indexer loads the sidecar, maps suppressed entity names→ids per document, and threads suppressed_ids into mention derivation — so the fix persists through a full re-index (names, not ids, are the key).
  • KnowledgeBase.unlink_entity() / relink_entity() API methods (drop the live mention now + record/remove the suppression).
  • CLI: kbx entity unlink "Name" <doc> / kbx entity relink "Name" <doc> (auto-commit aware).
  • MCP: kb_entity_unlink / kb_entity_relink (both idempotent).
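The sidecar store in the first bullet can be sketched as follows. This is a minimal illustration, not the contents of src/kb/suppressions.py: the function names, the JSON layout (document path mapped to a list of entity names), and the temp-file-plus-rename strategy for atomic writes are all assumptions made for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def load_suppressions(path: Path) -> dict[str, list[str]]:
    """Return {document path: [entity names]}; a missing or corrupt file yields {}."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # unreadable file, bad encoding, or invalid JSON
        return {}
    return data if isinstance(data, dict) else {}


def save_suppressions(path: Path, data: dict[str, list[str]]) -> None:
    """Write to a temp file in the same directory, then rename it over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, indent=2, sort_keys=True)
        os.replace(tmp, path)  # atomic when source and target share a filesystem
    except BaseException:
        os.unlink(tmp)  # do not leave a stray temp file behind
        raise


def add_suppression(path: Path, doc: str, entity: str) -> bool:
    """Record a suppression; return False if it already exists (idempotent)."""
    data = load_suppressions(path)
    names = data.setdefault(doc, [])
    if any(n.casefold() == entity.casefold() for n in names):
        return False
    names.append(entity)
    save_suppressions(path, data)
    return True


def remove_suppression(path: Path, doc: str, entity: str) -> bool:
    """Remove a suppression; return False if there was nothing to remove."""
    data = load_suppressions(path)
    names = data.get(doc, [])
    kept = [n for n in names if n.casefold() != entity.casefold()]
    if len(kept) == len(names):
        return False
    if kept:
        data[doc] = kept
    else:
        data.pop(doc)
    save_suppressions(path, data)
    return True
```

Comparing names with `casefold()` gives the case-insensitive behaviour, and returning a boolean from both mutators is one simple way to make repeated calls harmless.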

Why

Entity linking is heuristic and occasionally wrong (a bare first name matching the wrong person, a title substring that isn't really about the entity). Mentions are re-derived on every index and Granola sync regenerates meeting files, so a correction couldn't live in the DB row or in frontmatter — it would be wiped on the next sync. A sidecar keyed by entity name is the only place a suppression survives id churn + file regeneration.
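To make the id-churn argument concrete, here is a toy sketch of resolving suppressed names to ids at index time. The `Entity` class, `resolve_suppressed_ids`, and the substring matcher `find_mentions` are hypothetical stand-ins, not the project's `find_entity_mentions`; only the `suppressed_ids` idea is taken from the description above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    id: int
    name: str


def resolve_suppressed_ids(entities, suppressed_names):
    """Map suppressed entity names to whatever ids they have in this index run."""
    wanted = {n.casefold() for n in suppressed_names}
    return {e.id for e in entities if e.name.casefold() in wanted}


def find_mentions(text, entities, suppressed_ids=frozenset()):
    """Toy matcher: case-insensitive substring match that skips suppressed ids."""
    lowered = text.casefold()
    return [
        e.id
        for e in entities
        if e.id not in suppressed_ids and e.name.casefold() in lowered
    ]


# Sidecar content (assumed shape): document path -> suppressed entity names.
suppressions = {"meetings/standup.md": ["Sam"]}
text = "Sam and Priya reviewed the roadmap."

# The same entities before and after a full re-index that reassigned ids.
first_index = [Entity(1, "Sam"), Entity(2, "Priya")]
reindexed = [Entity(41, "Sam"), Entity(42, "Priya")]

for entities in (first_index, reindexed):
    ids = resolve_suppressed_ids(entities, suppressions["meetings/standup.md"])
    print(find_mentions(text, entities, suppressed_ids=ids))
```

Had the suppression been stored as id `1`, it would no longer match anything after the re-index, and the false-positive link to "Sam" would reappear.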

Tests

  • test_suppressions.py — store unit tests (load/add/remove, idempotency, corrupt-file tolerance).
  • test_unlink.py — full integration through index_all: unlink drops the mention and stays dropped across a full reindex; relink restores it; CLI happy-path + error paths.
  • test_entities.py — matcher honours suppressed_ids.
  • test_mcp.py — kb_entity_unlink / kb_entity_relink handlers (success + error shapes).

Docs updated: docs/entities.md (new §Suppressions), docs/cli.md, CLAUDE.md (tool count 31→33 + MCP section), CLI _AGENT_PLAYBOOK.

Closes #35.

Entity↔document links are heuristic and occasionally wrong (a bare first name
matching the wrong person, a title substring that isn't really about the
entity). Mentions are re-derived on every index and Granola sync regenerates
meeting files, so a correction can't live in the DB row or in frontmatter — it
would be wiped on the next sync.

Suppressions now live in a sidecar (memory/.kbx/entity-suppressions.json),
keyed by document path → entity names. The indexer loads it and skips those
entities for that document across all matching tiers, so the fix survives a
full re-index (names, not ids, are the key). Exposed as `kbx entity unlink` /
`kbx entity relink` and the kb_entity_unlink / kb_entity_relink MCP tools;
both write through the auto-commit path.
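The unlink/relink behaviour described above (drop the live mention now, record the suppression, stay dropped across a re-index) can be sketched with an in-memory stand-in. `TinyKB` is hypothetical and is not the real `KnowledgeBase` class; in particular, having relink restore the mention on the next index pass rather than immediately is an assumption of this sketch.

```python
class TinyKB:
    """In-memory stand-in for illustration only."""

    def __init__(self):
        self.mentions = set()   # {(doc, casefolded entity name)}
        self.suppressions = {}  # {doc: {casefolded entity name}}

    def index(self, doc, text, entity_names):
        """Re-derive mentions for one document from scratch, honouring suppressions."""
        self.mentions = {m for m in self.mentions if m[0] != doc}
        blocked = self.suppressions.get(doc, set())
        lowered = text.casefold()
        for name in entity_names:
            key = name.casefold()
            if key in lowered and key not in blocked:
                self.mentions.add((doc, key))

    def unlink_entity(self, name, doc):
        """Drop the live mention and record the suppression; safe to repeat."""
        key = name.casefold()
        self.mentions.discard((doc, key))
        blocked = self.suppressions.setdefault(doc, set())
        changed = key not in blocked
        blocked.add(key)
        return changed

    def relink_entity(self, name, doc):
        """Remove the suppression so the next index pass can link again."""
        key = name.casefold()
        blocked = self.suppressions.get(doc, set())
        changed = key in blocked
        blocked.discard(key)
        return changed


kb = TinyKB()
doc, text = "meetings/standup.md", "Sam joined late."
kb.index(doc, text, ["Sam"])
kb.unlink_entity("Sam", doc)
kb.index(doc, text, ["Sam"])  # full re-index: the mention stays dropped
print(kb.mentions)
```

Both methods return whether anything changed, so calling either one twice in a row is a no-op the second time.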
@tenfourty merged commit 4fe6154 into main on Jun 25, 2026
8 checks passed
@tenfourty deleted the feat/entity-suppressions branch on June 25, 2026, 12:01

Linked issue: feat: un-link (and suppress regeneration of) false-positive entity↔document matches