Add Context Graph extension for structural code awareness #1

Merged: Ricoledan merged 8 commits into main from feature/context-graph on Mar 11, 2026

Summary

  • Context Graph: Adds a persistent SQLite graph that captures structural relationships between code chunks (calls, imports, inheritance), enabling AI agents to understand how code connects — not just what it says
  • Three new MCP tools (context_query, graph_annotate, session_summary) for graph-aware code exploration, agent annotations, and session state tracking
  • Fully opt-in via SEMANTIC_CODE_GRAPH_ENABLED=true with zero overhead when disabled and graceful degradation on failure
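The opt-in behavior could look roughly like the sketch below. Only the variable name SEMANTIC_CODE_GRAPH_ENABLED and the three tool names come from this PR; the `GraphConfig` shape, `loadGraphConfig`, and `graphToolNames` are hypothetical and written for illustration.

```typescript
// Hypothetical config shape; the PR names GraphConfig but does not show its fields.
interface GraphConfig {
  enabled: boolean;
}

// Reads the opt-in flag from an environment-like record (for example process.env).
function loadGraphConfig(env: Record<string, string | undefined>): GraphConfig {
  // Assumption: only the exact string "true" enables the graph.
  return { enabled: env["SEMANTIC_CODE_GRAPH_ENABLED"] === "true" };
}

// When the flag is off, no graph tools are registered at all.
function graphToolNames(config: GraphConfig): string[] {
  return config.enabled
    ? ["context_query", "graph_annotate", "session_summary"]
    : [];
}
```

Taking the environment as a parameter instead of reading a global keeps the function easy to test.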

Architecture

Tree-sitter AST → chunkCodeWithEdges() → chunks + raw edges
                                              ↓
                              resolveEdges() → GraphEdge[]
                                              ↓
                              SQLite GraphStore (graph_nodes, graph_edges)
                                              ↓
                              BFS traversal → context_query results
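The last stage of the pipeline above is a breadth-first expansion around a starting chunk. The following is a minimal in-memory sketch of that idea, not the GraphStore implementation (which runs against SQLite). The `ChunkEdge` shape, the function name, and the choice to follow edges in both directions are assumptions.

```typescript
// Assumed minimal edge shape; the real GraphEdge type likely carries more fields.
interface ChunkEdge {
  source: string;
  target: string;
  type: string;
}

// Breadth-first expansion from `start`, up to `maxDepth` hops.
// Returns each reachable node ID mapped to its hop distance.
function bfsNeighborhood(
  edges: ChunkEdge[],
  start: string,
  maxDepth: number
): Map<string, number> {
  // Build an undirected adjacency list (assumption: both directions are followed).
  const adjacency = new Map<string, string[]>();
  const link = (a: string, b: string): void => {
    const list = adjacency.get(a) ?? [];
    list.push(b);
    adjacency.set(a, list);
  };
  for (const e of edges) {
    link(e.source, e.target);
    link(e.target, e.source);
  }

  const depth = new Map<string, number>([[start, 0]]);
  let frontier: string[] = [start];
  for (let d = 1; d <= maxDepth && frontier.length > 0; d++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const neighbor of adjacency.get(node) ?? []) {
        if (!depth.has(neighbor)) {
          depth.set(neighbor, d); // first visit is the shortest distance
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return depth;
}
```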

Changes by phase

  1. Foundation: GraphNode, GraphEdge, RawEdge types, GraphConfig, SQLite schema, GraphStore class with CRUD + BFS
  2. Edge Extraction: chunkCodeWithEdges() extracts calls/imports/extends/implements from AST; resolveEdges() matches symbols to chunk IDs
  3. Session Memory: SessionManager with visited nodes, frontier, reasoning log, annotations, TTL cleanup
  4. MCP Tools: context_query (search + graph expansion), graph_annotate (notes + agent_linked edges), session_summary (exploration state)
  5. Watcher Integration: Incremental graph updates on file changes, stale node detection
  6. Testing: 68 new tests across 7 files (unit, integration, security, performance); all 410 tests pass

New dependencies

  • better-sqlite3 + @types/better-sqlite3 (synchronous SQLite, prebuilt binaries)

Test plan

  • npm test — all 410 tests pass (22 suites), zero regressions
  • npm test tests/graph/ — 59 graph unit tests pass
  • Integration tests: chunk → graph pipeline, context query with session tracking
  • Security tests: SQL injection prevention via parameterized queries
  • Performance benchmarks: BFS < 1ms, session_summary < 0.1ms
  • Manual test: start MCP server with SEMANTIC_CODE_GRAPH_ENABLED=true, run context_query
  • Manual test: graph_annotate + session_summary workflow
  • Manual test: modify a file, verify stale nodes detected

🤖 Generated with Claude Code

Add context graph foundation with:
- GraphNode, GraphEdge, RawEdge, ChunkResult types
- GraphConfig with env var loading (SEMANTIC_CODE_GRAPH_ENABLED, etc.)
- SQLite schema with graph_nodes, graph_edges, graph_meta tables
- GraphStore class with CRUD, BFS traversal, stale detection
- GraphError class in errors module
- better-sqlite3 dependency

Add chunkCodeWithEdges() that extracts structural edges alongside chunks:
- Call expressions → 'calls' edges
- Import statements → 'imports' edges
- Extends/implements → 'extends'/'implements' edges
- Export statements → 'exports' edges
- Edge resolver that matches symbols to chunk IDs (same-file=1.0, cross-file=0.8)
- Language configs extended with callNodeTypes, importNodeTypes, heritageNodeTypes
- TypeScript, JavaScript, Python configs populated; others get empty extraction
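The edge resolver described above could be sketched as follows. The confidence values (same-file 1.0, cross-file 0.8) come from the commit message; the field names, the symbol index shape, and the policy of skipping ambiguous or self-referencing matches are assumptions for illustration.

```typescript
// Assumed shapes; the PR names RawEdge and GraphEdge but does not show their fields.
type EdgeType = "calls" | "imports" | "extends" | "implements" | "exports";

interface RawEdge {
  sourceChunkId: string;
  sourceFile: string;
  targetSymbol: string;
  type: EdgeType;
}

interface GraphEdge {
  source: string;
  target: string;
  type: EdgeType;
  confidence: number;
}

interface SymbolEntry {
  chunkId: string;
  file: string;
}

// Matches each raw edge's target symbol to a chunk ID via a symbol index.
function resolveEdges(
  raw: RawEdge[],
  symbols: Map<string, SymbolEntry[]>
): GraphEdge[] {
  const resolved: GraphEdge[] = [];
  for (const edge of raw) {
    const candidates = symbols.get(edge.targetSymbol) ?? [];
    // Prefer definitions in the same file; otherwise fall back to all candidates.
    const sameFile = candidates.filter((c) => c.file === edge.sourceFile);
    const pool = sameFile.length > 0 ? sameFile : candidates;
    if (pool.length !== 1) continue; // assumption: skip unresolved or ambiguous symbols
    const target = pool[0];
    if (target.chunkId === edge.sourceChunkId) continue; // no self-references
    resolved.push({
      source: edge.sourceChunkId,
      target: target.chunkId,
      type: edge.type,
      confidence: target.file === edge.sourceFile ? 1.0 : 0.8,
    });
  }
  return resolved;
}
```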

Add SessionManager with:
- Per-session visited nodes tracking (cap: 10K)
- Priority-based frontier for exploration suggestions
- Chronological reasoning log (cap: 1K entries)
- Per-node annotations
- TTL-based cleanup (default: 1hr, 60s interval)
- Serialize/deserialize for session persistence
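TTL-based cleanup of idle sessions might look like the sketch below. The one-hour default comes from the commit message; the `SessionRegistry` class, its fields, and its method names are hypothetical and do not reflect the actual SessionManager API.

```typescript
// Hypothetical session record; field names are illustrative.
interface SessionRecord {
  id: string;
  lastActive: number; // epoch milliseconds
}

class SessionRegistry {
  private readonly sessions = new Map<string, SessionRecord>();
  private readonly ttlMs: number;

  // Default TTL of one hour, per the commit message.
  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Creates the session or refreshes its last-active timestamp.
  touch(id: string, now: number): void {
    this.sessions.set(id, { id, lastActive: now });
  }

  has(id: string): boolean {
    return this.sessions.has(id);
  }

  // Drops sessions idle for longer than the TTL; returns how many were removed.
  cleanup(now: number): number {
    let removed = 0;
    for (const [id, session] of Array.from(this.sessions)) {
      if (now - session.lastActive > this.ttlMs) {
        this.sessions.delete(id);
        removed++;
      }
    }
    return removed;
  }
}
```

A real manager would presumably call `cleanup(Date.now())` from a timer at the 60s interval mentioned above; passing the clock in as an argument keeps the sketch deterministic.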

Add three new MCP tools (conditional on SEMANTIC_CODE_GRAPH_ENABLED):
- context_query: semantic search + graph neighborhood expansion
- graph_annotate: write notes on nodes, create agent_linked edges
- session_summary: visited/frontier/stale/annotation counts + reasoning log
Register tools in startServer(), add graph cleanup to shutdown handlers

Integrate graph store into indexing pipeline:
- indexFile() extracts edges via chunkCodeWithEdges() when graph enabled
- indexDirectory() builds symbol index after chunking, resolves edges, batch upserts
- FileWatcher.processFileChange() deletes old graph data, re-extracts, resolves
- FileWatcher.handleFileDelete() removes graph data for deleted files
- SemanticSearchTool.setGraphStore() wires graph into indexer/watcher
- All graph operations are conditional and degrade gracefully if disabled
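The delete-then-re-extract flow used on file changes can be illustrated with a small in-memory stand-in for the graph store. This is a sketch under assumptions: `InMemoryGraph`, `ChunkNode`, `FileEdge`, and all method names are hypothetical, and the real store is SQLite-backed.

```typescript
// Hypothetical minimal shapes for illustration.
interface ChunkNode {
  id: string;
  file: string;
}

interface FileEdge {
  source: string;
  target: string;
}

class InMemoryGraph {
  nodes = new Map<string, ChunkNode>();
  edges: FileEdge[] = [];

  upsert(nodes: ChunkNode[], edges: FileEdge[]): void {
    for (const n of nodes) this.nodes.set(n.id, n);
    this.edges.push(...edges);
  }

  // Removes a file's nodes and every edge touching them (a cascading delete).
  deleteFile(file: string): void {
    const removed = new Set<string>();
    this.nodes.forEach((n, id) => {
      if (n.file === file) removed.add(id);
    });
    removed.forEach((id) => this.nodes.delete(id));
    this.edges = this.edges.filter(
      (e) => !removed.has(e.source) && !removed.has(e.target)
    );
  }

  // On a file change: drop the old graph data, then insert the fresh extraction.
  reindexFile(file: string, nodes: ChunkNode[], edges: FileEdge[]): void {
    this.deleteFile(file);
    this.upsert(nodes, edges);
  }
}
```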

Add 7 test files covering:
- graph-store.test.ts: CRUD, BFS traversal, cascading deletes, stale detection (27 tests)
- extractor.test.ts: symbol resolution, ambiguity, self-ref prevention (8 tests)
- session.test.ts: visit tracking, frontier, TTL, serialize/deserialize (16 tests)
- context-query.integration.test.ts: end-to-end search + graph expansion (3 tests)
- graph-indexing.integration.test.ts: full pipeline chunk → graph (5 tests)
- graph-injection.test.ts: SQL injection prevention via parameterized queries (5 tests)
- graph.perf.test.ts: benchmarks for BFS, extraction overhead, session summary (4 tests)

All 410 tests pass (22 suites), zero regressions.
…ardening

- Extract shared ID validation into src/utils/validation.ts
- Add validateId/validateIds to GraphStore operations (upsertNodes,
  upsertEdges, getNode, getNeighbors) for defense-in-depth
- Add ID format regex to Zod schemas in all graph tools
- Fix visitNode frontier bug: remove node from frontier even when
  visited-nodes cap is reached, preventing infinite revisit loops
- Refactor chunkCode/chunkCodeWithEdges to share core logic via
  chunkCodeCore(), eliminating ~150 lines of duplication
- Export generateChunkId and use it in context-query.ts instead of
  duplicated filePathToChunkId()
- Replace always-passing test assertions (toBeGreaterThanOrEqual(0))
  with meaningful checks; fix test data to produce valid chunks
- Rename mislabeled integration test to reflect actual scope
- Remove unused getAllNodes prepared statement
- Bump version 0.3.3 → 0.4.0 (new feature, backward compatible)
- Document context_query, graph_annotate, session_summary tools in README
- Add graph configuration env vars to README
- Update project structure to include graph/ and tools/
- Update architecture.md with graph component details
- Add code-graph, context-graph keywords to package.json
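The visitNode frontier fix listed above can be illustrated with a small sketch. The class and method names other than `visitNode` are hypothetical, and the frontier is modeled as a simple priority map. The key point is the ordering: the node leaves the frontier unconditionally, and only then is the visited-nodes cap checked.

```typescript
class ExplorationState {
  readonly visited = new Set<string>();
  readonly frontier = new Map<string, number>(); // node ID -> priority
  private readonly maxVisited: number;

  // Default cap of 10K visited nodes, per the SessionManager commit message.
  constructor(maxVisited: number = 10000) {
    this.maxVisited = maxVisited;
  }

  // Queues a node as an exploration suggestion unless it was already visited.
  suggest(nodeId: string, priority: number): void {
    if (!this.visited.has(nodeId)) this.frontier.set(nodeId, priority);
  }

  visitNode(nodeId: string): void {
    // Always drop the node from the frontier first, so it cannot be
    // suggested again even when the visited set is full.
    this.frontier.delete(nodeId);
    if (this.visited.size < this.maxVisited) {
      this.visited.add(nodeId);
    }
  }

  // Returns the highest-priority frontier node, or undefined if none remain.
  nextSuggestion(): string | undefined {
    let best: string | undefined = undefined;
    let bestPriority = -Infinity;
    for (const [id, priority] of Array.from(this.frontier)) {
      if (priority > bestPriority) {
        best = id;
        bestPriority = priority;
      }
    }
    return best;
  }
}
```

If the frontier removal sat inside the cap check instead, a node visited after the cap was reached would stay in the frontier and keep coming back as the next suggestion.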
Ricoledan merged commit bd45032 into main on Mar 11, 2026
1 of 4 checks passed