feat: explicit typed entity relations for graph-style queries #2

@tenfourty

Summary

Store bidirectional typed links between entities (people, projects, teams) with relation type and reason. Enable graph-style queries like "who reports to Person A?", "all decisions from Person B's 1:1s", "everything connected to Project X".

# Create a relation
kbx entity relate "Person A" --type "reports-to" --target "Person B"

# Query relations
kbx entity relations "Person A"
kbx entity relations "Project X" --type "leads"

Motivation

kbx currently knows:

  • Which entities exist (people, projects, teams)
  • Which documents mention which entities (via find_entity_mentions())

But it doesn't know the relationships between entities. The entity mention index tells us "Person A appears in 47 documents" but not "Person A reports to Person B" or "Person A leads Project X".

This information exists implicitly — in org charts, meeting transcripts, entity file metadata — but it's not queryable. You can't ask kbx "who works on Project X?" or "show me everything connected to Person A's team".

With typed relations:

  • kbx entity relations "Person A" → shows direct reports, projects, collaborators
  • kbx search "migration" --entity "Person A" → expands to include documents from related entities (Person A's team, Person A's projects)
  • kbx person find "Person A" → profile includes a relations section
  • Debrief pipeline can auto-extract relations from meeting transcripts ("Person A is taking over Project X")

Inspiration: OpenViking stores bidirectional URI-to-URI links with reasons via a RelationService. Relations are auto-created during session commits and queryable via API. Their _create_relations() method extracts URIs from context parts and creates bidirectional links.

Design

1. Relation Schema

New relations table in SQLite:

CREATE TABLE relations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_entity_id INTEGER NOT NULL REFERENCES entities(id),
    target_entity_id INTEGER NOT NULL REFERENCES entities(id),
    relation_type TEXT NOT NULL,          -- constrained vocabulary
    reason TEXT,                          -- free-text explanation
    source_doc_path TEXT,                 -- document that evidences this relation
    confidence TEXT DEFAULT 'manual',     -- 'manual', 'auto-high', 'auto-low'
    created_at TEXT NOT NULL,             -- ISO 8601
    updated_at TEXT,                      -- ISO 8601
    valid_until TEXT,                     -- ISO 8601, NULL = indefinite

    UNIQUE(source_entity_id, target_entity_id, relation_type)
);

CREATE INDEX idx_relations_source ON relations(source_entity_id);
CREATE INDEX idx_relations_target ON relations(target_entity_id);
CREATE INDEX idx_relations_type ON relations(relation_type);

Unique constraint: One relation of each type per source→target pair. Creating a duplicate updates the existing row (reason, source_doc_path, updated_at) instead of inserting a second one.
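
The update-on-duplicate behaviour could be implemented with SQLite's upsert clause. The following is a minimal sketch, not the project's actual code: `upsert_relation` is a hypothetical helper, the table is a trimmed copy of the schema above, and it assumes SQLite 3.24 or newer (the first release with `ON CONFLICT ... DO UPDATE`).

```python
import sqlite3
from datetime import datetime, timezone

# Trimmed copy of the relations schema (foreign keys and some columns omitted).
SCHEMA = """
CREATE TABLE relations (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    source_entity_id INTEGER NOT NULL,
    target_entity_id INTEGER NOT NULL,
    relation_type TEXT NOT NULL,
    reason TEXT,
    source_doc_path TEXT,
    created_at TEXT NOT NULL,
    updated_at TEXT,
    UNIQUE(source_entity_id, target_entity_id, relation_type)
);
"""


def upsert_relation(conn, source_id, target_id, relation_type,
                    reason=None, source_doc_path=None):
    """Hypothetical helper: insert a relation, or update it if the triple exists."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        """
        INSERT INTO relations
            (source_entity_id, target_entity_id, relation_type,
             reason, source_doc_path, created_at)
        VALUES (?, ?, ?, ?, ?, ?)
        ON CONFLICT(source_entity_id, target_entity_id, relation_type)
        DO UPDATE SET
            reason = excluded.reason,
            source_doc_path = excluded.source_doc_path,
            updated_at = excluded.created_at
        """,
        (source_id, target_id, relation_type, reason, source_doc_path, now),
    )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
upsert_relation(conn, 1, 2, "leads")
upsert_relation(conn, 1, 2, "leads", reason="Took over in Q1")  # same triple
rows = conn.execute("SELECT reason, updated_at FROM relations").fetchall()
```

After the second call there is still a single row; its reason is replaced and updated_at is set, while created_at keeps the original insertion time.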

2. Relation Types

Constrained vocabulary with free-text fallback. The predefined types are intended to cover the common cases; a custom type with a required reason covers the rest.

Person → Person

| Type | Inverse | Example |
|---|---|---|
| reports-to | manages | Person A reports to Person B |
| collaborates-with | collaborates-with | Person A collaborates with Person B (symmetric) |
| mentors | mentored-by | Person A mentors Person B |
| delegates-to | delegated-from | Person A delegates work to Person B |

Person → Project

| Type | Inverse | Example |
|---|---|---|
| leads | led-by | Person A leads Project X |
| contributes-to | contributor | Person A contributes to Project X |
| sponsors | sponsored-by | Person A sponsors Project X |
| reviews | reviewed-by | Person A reviews Project X |

Person → Team

| Type | Inverse | Example |
|---|---|---|
| member-of | has-member | Person A is a member of Team X |
| leads | led-by | Person A leads Team X (reuses same type) |

Project → Project

| Type | Inverse | Example |
|---|---|---|
| depends-on | depended-on-by | Project X depends on Project Y |
| blocks | blocked-by | Project X blocks Project Y |
| related-to | related-to | Project X is related to Project Y (symmetric) |

Any → Any

| Type | Inverse | Example |
|---|---|---|
| custom | custom | Free-text, reason required |

Validation: The CLI and API validate that relation types are appropriate for the source/target entity types (e.g. reports-to only valid for person→person). custom is always allowed.

Inverse lookup: When storing A→B with type reports-to, the system can traverse B→A as manages without storing a separate row. The inverse mapping is defined in code, not duplicated in the database.
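
The validation rule could be expressed as a lookup from relation type to the allowed (source kind, target kind) pairs. This is a sketch under assumptions: `ALLOWED_PAIRS` and `validate_relation` are hypothetical names, and the kind labels "person", "project" and "team" are assumed to match how kbx classifies entities. The pairs themselves are transcribed from the type tables in this section.

```python
# Allowed (source_kind, target_kind) pairs per predefined relation type.
# "custom" is handled separately because it is valid between any kinds.
ALLOWED_PAIRS = {
    "reports-to": {("person", "person")},
    "collaborates-with": {("person", "person")},
    "mentors": {("person", "person")},
    "delegates-to": {("person", "person")},
    "leads": {("person", "project"), ("person", "team")},
    "contributes-to": {("person", "project")},
    "sponsors": {("person", "project")},
    "reviews": {("person", "project")},
    "member-of": {("person", "team")},
    "depends-on": {("project", "project")},
    "blocks": {("project", "project")},
    "related-to": {("project", "project")},
}


def validate_relation(relation_type, source_kind, target_kind, reason=None):
    """Return None when the relation is acceptable, else an error message."""
    if relation_type == "custom":
        # custom is always allowed, but a reason is required.
        return None if reason else "custom relations require a reason"
    pairs = ALLOWED_PAIRS.get(relation_type)
    if pairs is None:
        return f"unknown relation type: {relation_type}"
    if (source_kind, target_kind) not in pairs:
        return f"{relation_type} is not valid for {source_kind} -> {target_kind}"
    return None


ok = validate_relation("reports-to", "person", "person")
bad = validate_relation("reports-to", "person", "project")
```

Only the types from the Type column are listed here. Inverse types such as manages would be normalised to their stored form before validation.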

3. CLI Surface

# Create relations
kbx entity relate "Person A" --type "reports-to" --target "Person B"
kbx entity relate "Person A" --type "leads" --target "Project X" --reason "Took over in Q1"
kbx entity relate "Project X" --type "depends-on" --target "Project Y" --reason "Shared auth module"

# List all relations for an entity
kbx entity relations "Person A" --json
# Output:
# {
#   "entity": "Person A",
#   "relations": [
#     {"type": "reports-to", "target": "Person B", "reason": null, "since": "2026-01-15"},
#     {"type": "leads", "target": "Project X", "reason": "Took over in Q1", "since": "2026-02-01"},
#     {"type": "collaborates-with", "target": "Person C", "reason": null, "since": "2026-01-20"}
#   ],
#   "inverse_relations": [
#     {"type": "manages", "source": "Person D", "reason": null, "since": "2025-11-01"},
#     {"type": "mentored-by", "source": "Person E", "reason": null, "since": "2026-01-10"}
#   ]
# }

# Filter by type
kbx entity relations "Person A" --type "reports-to" --json
kbx entity relations "Project X" --type "leads" --json

# Remove a relation
kbx entity unrelate "Person A" --type "reports-to" --target "Person B"

# List all relations of a type across the graph
kbx entity relations --type "reports-to" --json
# → all reporting relationships in the system

4. Auto-Extraction

During debrief or indexing, detect high-confidence relations from text patterns:

Pattern-based extraction (high confidence):

  • "Person A reports to Person B" → reports-to
  • "Person A is leading Project X" / "Person A took over Project X" → leads
  • "Person A joined Team Y" → member-of
  • "Project X depends on Project Y" → depends-on; "Project X is blocked by Project Y" → blocked-by

Meeting-context extraction (low confidence):

  • Meeting attendees who consistently appear together → collaborates-with (after N co-occurrences)
  • Person mentioned as "taking action" on a project → contributes-to

Confidence levels:

  • manual — explicitly created via CLI or API
  • auto-high — pattern-matched with strong signal (e.g. "reports to" in an org chart document)
  • auto-low — inferred from co-occurrence or weak patterns

Auto-extracted relations should be suggested, not auto-committed (at least initially):

kbx entity suggest-relations --from-doc memory/meetings/2026/01/15/notes.md
# Suggested relations:
#   Person A --reports-to--> Person B  (confidence: high, source: "Person A reports to Person B")
#   Person C --leads--> Project X      (confidence: low, source: "Person C is driving Project X")
# Accept all? [y/N/select]

A --auto-accept flag with a confidence threshold could enable unattended extraction:

kbx entity suggest-relations --auto-accept --min-confidence high
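
A minimal sketch of the pattern-based tier, assuming plain regular expressions over document text. Everything here is illustrative: `suggest_relations` is a hypothetical function, entity names are matched naively as "Person X", "Project X" or "Team X" to mirror the placeholders in this issue, and storing "X is blocked by Y" as Y blocks X is an assumption about the canonical direction.

```python
import re

# Naive entity matcher for the placeholder names used in this issue.
ENTITY = r"((?:Person|Project|Team) \w+)"

# (compiled pattern, relation type, swap source and target?)
PATTERNS = [
    (re.compile(ENTITY + r" reports to " + ENTITY), "reports-to", False),
    (re.compile(ENTITY + r" (?:is leading|took over) " + ENTITY), "leads", False),
    (re.compile(ENTITY + r" joined " + ENTITY), "member-of", False),
    (re.compile(ENTITY + r" depends on " + ENTITY), "depends-on", False),
    # Assumption: "X is blocked by Y" is recorded as Y --blocks--> X.
    (re.compile(ENTITY + r" is blocked by " + ENTITY), "blocks", True),
]


def suggest_relations(text):
    """Return suggested relations; nothing is written to the database."""
    suggestions = []
    for pattern, relation_type, swap in PATTERNS:
        for match in pattern.finditer(text):
            source, target = match.group(1), match.group(2)
            if swap:
                source, target = target, source
            suggestions.append({
                "source": source,
                "type": relation_type,
                "target": target,
                "confidence": "auto-high",
                "evidence": match.group(0),  # the matched sentence fragment
            })
    return suggestions


text = "Person A reports to Person B. Project X is blocked by Project Y."
suggestions = suggest_relations(text)
```

Real extraction would resolve names against the entity index rather than a fixed regex. The sketch only shows how a matched phrase maps to a typed suggestion with its evidence.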

5. Query Integration

Relations expand the search scope when querying with --entity:

# Without relations: searches only documents mentioning Person A
kbx search "deployment issues" --entity "Person A"

# With relations: also includes documents from Person A's projects and direct reports
kbx search "deployment issues" --entity "Person A" --expand-relations

Expansion logic:

  1. Find entity "Person A"
  2. Find related entities: projects Person A leads, people Person A manages, teams Person A belongs to
  3. Collect documents mentioning Person A OR any related entity
  4. Score with relation distance decay: direct mention = 1.0, one hop = 0.7, two hops = 0.4
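
The four steps above can be sketched as a breadth-first walk over the relation graph followed by a per-document weight. This is an illustration only: `expand_entities` and `score_documents` are hypothetical names, the graph is passed in as a plain dict, and how the weight combines with the text relevance score is left open.

```python
from collections import deque

# Decay weights from step 4: hop count -> weight.
DECAY = {0: 1.0, 1: 0.7, 2: 0.4}


def expand_entities(start, neighbours, max_hops=2):
    """Breadth-first expansion; returns {entity: weight}."""
    weights = {start: DECAY[0]}
    queue = deque([(start, 0)])
    while queue:
        entity, hops = queue.popleft()
        if hops == max_hops:
            continue
        for related in neighbours.get(entity, ()):
            if related not in weights:  # first visit is the shortest path
                weights[related] = DECAY[hops + 1]
                queue.append((related, hops + 1))
    return weights


def score_documents(weights, mentions):
    """mentions: {doc_path: set of entity names}.

    A document gets the best weight among the entities it mentions;
    documents mentioning none of them are left out.
    """
    scores = {}
    for doc, entities in mentions.items():
        best = max((weights[e] for e in entities if e in weights), default=0.0)
        if best > 0:
            scores[doc] = best
    return scores


neighbours = {
    "Person A": ["Project X", "Person C"],
    "Project X": ["Project Y"],
}
weights = expand_entities("Person A", neighbours)
scores = score_documents(weights, {
    "a.md": {"Person A"},
    "b.md": {"Project Y"},
    "c.md": {"Person Z"},
})
```

Here Project Y is two hops from Person A, so b.md gets weight 0.4, and c.md is excluded because Person Z is not reachable.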

Integration with two-pass search (#69): Relations feed directly into Pass 1 entity scoring. When Pass 1 finds "Person A", it can expand to related entities and include their documents in Pass 2's candidate set.

6. Profile Integration

kbx person find and kbx project find output should include a relations section:

kbx person find "Person A" --json
# {
#   "name": "Person A",
#   "role": "Engineering Manager",
#   "team": "Platform",
#   "facts": [...],
#   "relations": {
#     "reports_to": ["Person B"],
#     "manages": ["Person C", "Person D", "Person E"],
#     "leads": ["Project X", "Project Y"],
#     "collaborates_with": ["Person F"],
#     "member_of": ["Team Platform"]
#   },
#   ...
# }

Human-readable output:

Person A — Engineering Manager (Platform)
  Reports to: Person B
  Manages: Person C, Person D, Person E
  Leads: Project X, Project Y
  Collaborates with: Person F
  Member of: Team Platform
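
Both outputs can be derived from the same list of (relation type, entity name) pairs. The sketch below assumes that list has already been resolved from the relations table, with inverse types applied; `build_relations_section` and `render_human` are hypothetical helpers.

```python
from collections import defaultdict


def build_relations_section(relations):
    """Group (relation_type, name) pairs into the JSON "relations" object."""
    section = defaultdict(list)
    for relation_type, name in relations:
        # JSON keys use underscores: "reports-to" -> "reports_to".
        section[relation_type.replace("-", "_")].append(name)
    return dict(section)


def render_human(section):
    """Render the grouped relations as indented "Label: a, b" lines."""
    lines = []
    for key, names in section.items():
        label = key.replace("_", " ").capitalize()  # "reports_to" -> "Reports to"
        lines.append(f"  {label}: {', '.join(names)}")
    return "\n".join(lines)


section = build_relations_section([
    ("reports-to", "Person B"),
    ("manages", "Person C"),
    ("manages", "Person D"),
    ("member-of", "Team Platform"),
])
text = render_human(section)
```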

7. Bidirectionality

Storage: Only one row per relation (source→target). The inverse is computed at query time from the inverse type mapping.

INVERSE_MAP = {
    "reports-to": "manages",
    "manages": "reports-to",
    "leads": "led-by",
    "led-by": "leads",
    "contributes-to": "contributor",
    "contributor": "contributes-to",
    "mentors": "mentored-by",
    "mentored-by": "mentors",
    "delegates-to": "delegated-from",
    "delegated-from": "delegates-to",
    "sponsors": "sponsored-by",
    "sponsored-by": "sponsors",
    "reviews": "reviewed-by",
    "reviewed-by": "reviews",
    "member-of": "has-member",
    "has-member": "member-of",
    "depends-on": "depended-on-by",
    "depended-on-by": "depends-on",
    "blocks": "blocked-by",
    "blocked-by": "blocks",
    # Symmetric types
    "collaborates-with": "collaborates-with",
    "related-to": "related-to",
    "custom": "custom",
}

Traversal: get_relations(entity_id) queries both source_entity_id = ? and target_entity_id = ?, applying the inverse mapping for target-side results.
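
A possible shape for the traversal, shown against an in-memory table. This is a sketch rather than the final API: the return format is an assumption, and `INVERSE_SUBSET` is a small stand-in for the full inverse mapping.

```python
import sqlite3

# Stand-in for the full inverse mapping (subset, for illustration).
INVERSE_SUBSET = {
    "reports-to": "manages",
    "leads": "led-by",
    "collaborates-with": "collaborates-with",
}


def get_relations(conn, entity_id):
    """Return (relation_type, other_entity_id) pairs as seen from entity_id."""
    rows = conn.execute(
        """
        SELECT source_entity_id, target_entity_id, relation_type
        FROM relations
        WHERE source_entity_id = ? OR target_entity_id = ?
        """,
        (entity_id, entity_id),
    ).fetchall()
    result = []
    for source_id, target_id, relation_type in rows:
        if source_id == entity_id:
            # Stored direction: use the type as written.
            result.append((relation_type, target_id))
        else:
            # Target side: present the row through its inverse type.
            result.append((INVERSE_SUBSET.get(relation_type, relation_type), source_id))
    return result


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relations (id INTEGER PRIMARY KEY, source_entity_id INTEGER, "
    "target_entity_id INTEGER, relation_type TEXT)"
)
conn.executemany(
    "INSERT INTO relations (source_entity_id, target_entity_id, relation_type) "
    "VALUES (?, ?, ?)",
    [(1, 2, "reports-to"), (3, 1, "reports-to"), (1, 10, "leads")],
)
relations = sorted(get_relations(conn, 1))
```

Entity 1 reports to entity 2, and entity 3 reports to entity 1, so from entity 1's point of view the second row appears as manages.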

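Canonical direction: For asymmetric types, the CLI normalises input to a single stored direction. For example, kbx entity relate "Person B" --type "manages" --target "Person A" is stored as Person A --reports-to--> Person B. The canonical form is the type listed in the Type column of the tables in section 2; when the inverse type is given, the CLI swaps source and target and stores the canonical type.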
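
The normalisation step could be a small lookup applied before any write. In this sketch `TO_CANONICAL` and `canonicalise` are hypothetical names, and treating the Type column of the section 2 tables as the stored form is an assumption.

```python
# Inverse type -> canonical (stored) type. Symmetric types and "custom"
# are absent because they need no rewriting.
TO_CANONICAL = {
    "manages": "reports-to",
    "mentored-by": "mentors",
    "delegated-from": "delegates-to",
    "led-by": "leads",
    "contributor": "contributes-to",
    "sponsored-by": "sponsors",
    "reviewed-by": "reviews",
    "has-member": "member-of",
    "depended-on-by": "depends-on",
    "blocked-by": "blocks",
}


def canonicalise(source, relation_type, target):
    """Rewrite an inverse-typed relation into its stored direction."""
    canonical = TO_CANONICAL.get(relation_type)
    if canonical is None:
        return source, relation_type, target  # already canonical
    return target, canonical, source  # swap direction, use the canonical type


stored = canonicalise("Person B", "manages", "Person A")
```

Normalising before the insert keeps the unique constraint meaningful: the same fact entered from either side resolves to the same row.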

8. Relation Decay and Staleness

Relations can become stale — people change teams, projects end, reporting lines shift.

valid_until field: Optional expiry date. Relations with valid_until < now are excluded from active queries but retained for historical queries.

# Create a time-bounded relation
kbx entity relate "Person A" --type "leads" --target "Project X" --valid-until 2026-06-30

# Expire a relation (soft delete)
kbx entity relate "Person A" --type "leads" --target "Project X" --valid-until 2026-03-01

Staleness detection: Extend kbx entity stale to include relation staleness:

kbx entity stale --include-relations --json
# Flags:
#   - Relations where neither entity has been mentioned in 60+ days
#   - Relations with valid_until in the past
#   - Auto-extracted relations that were never confirmed

Historical queries: --include-expired flag to include expired relations:

kbx entity relations "Person A" --include-expired --json
# Shows current and past relations, with valid_until dates
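
The active/expired split could be a single optional filter on valid_until. A minimal sketch, assuming dates are stored in a uniform ISO 8601 format so that string comparison matches chronological order; `active_relations` is a hypothetical helper and only source-side rows are queried to keep it short.

```python
import sqlite3


def active_relations(conn, entity_id, today, include_expired=False):
    """List relations for entity_id, hiding expired rows unless requested."""
    sql = (
        "SELECT relation_type, target_entity_id, valid_until "
        "FROM relations WHERE source_entity_id = ?"
    )
    params = [entity_id]
    if not include_expired:
        # NULL means indefinite; otherwise keep rows that have not yet expired.
        sql += " AND (valid_until IS NULL OR valid_until >= ?)"
        params.append(today)
    return conn.execute(sql + " ORDER BY id", params).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relations (id INTEGER PRIMARY KEY, source_entity_id INTEGER, "
    "target_entity_id INTEGER, relation_type TEXT, valid_until TEXT)"
)
conn.executemany(
    "INSERT INTO relations (source_entity_id, target_entity_id, relation_type, valid_until) "
    "VALUES (?, ?, ?, ?)",
    [(1, 10, "leads", "2026-03-01"), (1, 2, "reports-to", None)],
)
current = active_relations(conn, 1, today="2026-04-01")
history = active_relations(conn, 1, today="2026-04-01", include_expired=True)
```

With today set to 2026-04-01 the leads relation has expired, so it appears only when include_expired is set.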

Integration with Other Features

  • Two-pass search (#69): Relations feed into Pass 1 entity expansion. Finding "Person A" automatically surfaces their projects and reports.
  • Search explain (#68): Explain output shows which relations were used to expand the search scope and how relation distance affected scoring.
  • kbx context output: Could include top relations per entity in the compact context view (e.g. "Person A (EM, manages: 3, leads: Project X)").
  • Debrief pipeline: Post-meeting debriefs could suggest relation updates based on transcript content (new assignments, team changes, project handoffs).

Implementation Phases

  1. Phase 1 — Schema + manual CRUD: Create relations table, migration, kbx entity relate/unrelate/relations commands. Inverse mapping. ~2 days
  2. Phase 2 — Profile integration: Show relations in kbx person find and kbx project find output. ~1 day
  3. Phase 3 — Query expansion: --entity + --expand-relations in search. Relation distance scoring. ~2 days
  4. Phase 4 — Auto-extraction: Pattern-based relation suggestions from documents. suggest-relations command. ~2-3 days
  5. Phase 5 — Decay + staleness: valid_until, expired relation handling, stale relation detection. ~1 day

Open Questions

  • Should relations be stored in the entity markdown files (as a ## Relations section) in addition to the DB? This would make them visible in kbx view and git-trackable, but adds sync complexity.
  • Should kbx context include relation counts or key relations per entity? This would give agents more structural context but increases context size.
  • For auto-extraction, should co-occurrence-based collaborates-with relations require a minimum number of shared meetings (e.g. 3+) to avoid noise?
  • Should relation types be extensible via config (user-defined types in kbx.toml) or is the predefined set sufficient?
  • How should relations interact with entity merging/dedup? If two entity records are merged, their relations should be merged too.

Restored from prior issue #70 after repository history rewrite (PII remediation, 2026-05-27).
