feat: graph-enhanced retrieval — native GraphStore + pipeline integration#394

Open
2233admin wants to merge 3 commits intoNevaMind-AI:mainfrom
2233admin:fix/retrieve-pgvector-direct

@2233admin

Summary

Add graph-enhanced retrieval as a native module in memU, replacing external ad-hoc graph recall.

Changes

Phase 1: GraphStore module

  • GraphNode, GraphEdge, GraphCommunity domain models + SQLModel ORM
  • PostgresGraphStore repository (800+ lines): CRUD, dual-path graph recall, PPR, LPA, global PageRank
  • Alembic migration for gm_* tables with scope column support
  • Wired into PostgresStore alongside existing repos
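For readers unfamiliar with PPR (personalized PageRank), the idea behind seeded graph recall can be sketched in a few lines of pure Python. This is a minimal illustration, not the code in `PostgresGraphStore`: the function name `personalized_pagerank`, the undirected edge-list input, and the default damping factor of 0.85 are all assumptions made for the example.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration PPR over an undirected edge list (illustrative only)."""
    seed_set = set(seeds)
    if not seed_set:
        return {}

    # Build an undirected adjacency map from the edge list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = set(neighbors) | seed_set

    # The restart distribution puts all probability mass on the seed nodes.
    restart = {n: (1.0 / len(seed_set) if n in seed_set else 0.0) for n in nodes}
    rank = dict(restart)

    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            out = neighbors.get(n, ())
            if not out:
                # Isolated node: hand its mass back to the seeds.
                for s in seed_set:
                    nxt[s] += damping * rank[n] / len(seed_set)
                continue
            share = damping * rank[n] / len(out)
            for m in out:
                nxt[m] += share
        rank = nxt
    return rank


scores = personalized_pagerank(
    [("a", "b"), ("b", "c"), ("c", "d")], seeds=["a"]
)
```

On this four-node path, nodes closer to the seed `a` receive more mass than the far end `d`, which is the property a recall step would rely on when ranking graph nodes around the query's entry points.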

Phase 2: Pipeline integration

  • RetrieveGraphConfig: enabled, weight (0.0-1.0), max_nodes
  • recall_graph WorkflowStep in RAG retrieve workflow
  • Score fusion: items * alpha + graph * beta, applied only when graph recall is active and returns results
  • graph_nodes[] in retrieve response
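The fusion rule above can be sketched as follows. This is a hedged illustration rather than the code in `_rag_build_context`: `GraphFusionConfig` is a hypothetical stand-in for `RetrieveGraphConfig`, the default values are invented, `alpha = 1 - weight` is an assumption (the PR only states that `weight` is β), and the sketch assumes item and graph scores share the same keys.

```python
from dataclasses import dataclass


@dataclass
class GraphFusionConfig:
    """Hypothetical stand-in for RetrieveGraphConfig (defaults are invented)."""
    enabled: bool = False
    weight: float = 0.3   # beta, the graph share of the fused score
    max_nodes: int = 10

    def __post_init__(self):
        # Reject weights outside the documented 0.0-1.0 range.
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must be between 0.0 and 1.0")


def fuse_scores(item_scores, graph_scores, config):
    """Blend vector and graph scores; leave items untouched without graph hits."""
    if not config.enabled or not graph_scores:
        # Graph inactive or empty: item scores are not deflated.
        return dict(item_scores)

    beta = config.weight
    alpha = 1.0 - beta  # assumption: the two weights sum to 1
    fused = {key: alpha * score for key, score in item_scores.items()}
    for key, score in graph_scores.items():
        fused[key] = fused.get(key, 0.0) + beta * score
    return fused


cfg = GraphFusionConfig(enabled=True, weight=0.25)
unchanged = fuse_scores({"x": 0.8}, {}, cfg)
blended = fuse_scores({"x": 0.8}, {"x": 1.0, "y": 0.4}, cfg)
```

Returning the item scores unchanged in the early-exit branch is what keeps existing rankings stable for deployments that enable the graph before any nodes exist.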

Phase 3: Tests + docs + review

  • 30 unit tests (PPR, LPA, merge, fusion, config, models, ORM)
  • README section with config example
  • Two independent reviews: all P1/P2 findings fixed
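Since the graph algorithms can be tested without a database, a label propagation (LPA) pass is easy to exercise on a tiny in-memory graph. The sketch below is an assumption-laden illustration, not the PR's implementation: the name `label_propagation`, the fixed sorted visiting order, and the smallest-label tie-break are choices made here to keep the result deterministic, and node ids are assumed to be sortable.

```python
from collections import Counter, defaultdict


def label_propagation(edges, max_rounds=20):
    """Deterministic asynchronous LPA over an undirected edge list."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Every node starts in its own community.
    labels = {n: n for n in neighbors}

    for _ in range(max_rounds):
        changed = False
        for n in sorted(neighbors):
            counts = Counter(labels[m] for m in neighbors[n])
            top = max(counts.values())
            # Break ties by picking the smallest label so runs are repeatable.
            best = min(label for label, c in counts.items() if c == top)
            if labels[n] != best:
                labels[n] = best
                changed = True
        if not changed:
            break
    return labels


two_triangles = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("x", "y"), ("y", "z"), ("x", "z"),
]
communities = label_propagation(two_triangles)
```

Two disconnected triangles end up with one label each, which is the kind of small fixture a pure-Python unit test can assert against.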

Review findings addressed

  • [P1] Scope filtering on all graph queries (multi-user isolation)
  • [P1] Edge cascade on node delete
  • [P2] Item scores are scaled down only when graph recall is active and returns results
  • [P3] Weight validation, dead code removal, state key declarations

Discussion points

  • ddl_mode="validate" still runs Alembic upgrade (pre-existing)
  • Migration hard-codes user_id; dynamic scope models may need migration updates

Test plan

  • 107 tests pass (77 existing + 30 new)
  • E2E verified with live PG (78 nodes, 50 edges, 36 communities)

- GraphNode/GraphEdge/GraphCommunity domain models + SQLModel ORM
- GraphStore repository with CRUD + dual-path graph recall + PPR/LPA
- Alembic migration for gm_* tables with scope column support
- Wired into PostgresStore alongside existing repos
- 77 existing tests still passing
- RetrieveGraphConfig: enabled, weight (β), max_nodes
- recall_graph WorkflowStep in RAG workflow
- Score fusion in _rag_build_context: vector*α + graph*β
- graph_nodes[] in retrieve response
- 77 tests pass, E2E verified with live PG data
Tests cover: PPR algorithm (8), global PageRank (3), LPA community
detection (4), merge results (4), score fusion (3), config (4),
domain models (3), ORM registration (1). All pure-Python, no DB needed.
@2233admin 2233admin force-pushed the fix/retrieve-pgvector-direct branch from cf48f48 to 1e1a992 Compare March 28, 2026 03:37