Milestone: v0.6 — Adaptive Knowledge Ingestion Pipeline #78

@Steake

Milestone anchor issue. Tracks all adaptive knowledge ingestion work specified in issue #33 and docs/ADAPTIVE_INGESTION_README.md.

Core Deliverables

  • Adaptive ingestion workers + CPU Autotuner (8-core/16GB target)
  • Layout/sentence-aware chunking at 3 levels: Fast / Balanced / Deep
  • Tightened custom vector DB contract (single source of ANN/search)
  • Graph builder from vector kNN neighbours (Document → Chunk → Concept)
  • Persistent, responsive Jobs UI with preflight ETAs
  • GET /api/graph/{docId} knowledge graph endpoint
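The graph-builder deliverable can be illustrated with a minimal sketch. It assumes the vector DB exposes some kNN lookup, represented here by a hypothetical `knn(chunk_id, k)` callable that returns `(neighbour_id, score)` pairs. The function name `build_chunk_graph`, the node and edge field names, and the `min_score` cutoff are all illustrative assumptions, not the project's actual contract. Concept nodes are left out for brevity.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

# Hypothetical shape of one kNN hit: (chunk id, similarity score).
Neighbour = Tuple[str, float]


def build_chunk_graph(
    doc_id: str,
    chunk_ids: Sequence[str],
    knn: Callable[[str, int], List[Neighbour]],
    k: int = 5,
    min_score: float = 0.0,
) -> Dict[str, list]:
    """Build a Document -> Chunk graph plus chunk-to-chunk similarity edges.

    All neighbour lookups go through `knn`, standing in for the vector DB,
    so no ANN search is duplicated on the caller's side.
    """
    nodes = [{"id": doc_id, "type": "document"}]
    nodes += [{"id": cid, "type": "chunk"} for cid in chunk_ids]

    # Every chunk hangs off its parent document.
    edges = [
        {"source": doc_id, "target": cid, "type": "contains"}
        for cid in chunk_ids
    ]

    seen: Set[Tuple[str, str]] = set()
    for cid in chunk_ids:
        for neighbour_id, score in knn(cid, k):
            # Skip self-matches and weak neighbours.
            if neighbour_id == cid or score < min_score:
                continue
            # Treat similarity as undirected: store each pair once.
            a, b = sorted((cid, neighbour_id))
            if (a, b) in seen:
                continue
            seen.add((a, b))
            edges.append(
                {"source": a, "target": b, "type": "similar", "score": score}
            )

    return {"nodes": nodes, "edges": edges}
```

A handler for `GET /api/graph/{docId}` could plausibly return a structure like this as JSON, though the real response schema is not specified here.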

Acceptance Criteria

  • Import ≥300 MB PDF on 8-core/16 GB without OOM
  • Preflight ETA accuracy ±25% after 2 min
  • Self-query MRR@10 ≥ 0.6
  • Jobs UI persists across page reloads
  • Vector DB is single source of embeddings — no client-side ANN duplication
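Two of the criteria above are numeric and could be checked with small helpers along these lines. This is a sketch under stated assumptions: "self-query" is read as querying with text taken from a chunk and expecting that same chunk back, and the ±25% band is measured relative to the actual duration. Both readings, and the names `mrr_at_k` and `eta_within_tolerance`, are assumptions rather than definitions from the issue.

```python
from typing import Dict, Sequence


def mrr_at_k(ranked: Dict[str, Sequence[str]], k: int = 10) -> float:
    """Mean reciprocal rank over queries, cut off at rank k.

    `ranked` maps the expected chunk id for each query to the ordered
    list of chunk ids the search returned for it.
    """
    if not ranked:
        return 0.0
    total = 0.0
    for expected_id, results in ranked.items():
        for rank, chunk_id in enumerate(results[:k], start=1):
            if chunk_id == expected_id:
                total += 1.0 / rank
                break  # only the first hit counts
    return total / len(ranked)


def eta_within_tolerance(
    predicted_s: float, actual_s: float, tolerance: float = 0.25
) -> bool:
    """True if the predicted duration is within +/- tolerance of the actual one."""
    if actual_s <= 0:
        raise ValueError("actual_s must be positive")
    return abs(predicted_s - actual_s) <= tolerance * actual_s
```

For example, a run would pass the retrieval criterion when `mrr_at_k(results) >= 0.6`, and the ETA criterion when `eta_within_tolerance(eta_at_2_min, measured_total)` holds, where both argument names are placeholders.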

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
