Add Context Graph extension for structural code awareness #1
Merged
Add context graph foundation with:
- GraphNode, GraphEdge, RawEdge, ChunkResult types
- GraphConfig with env var loading (SEMANTIC_CODE_GRAPH_ENABLED, etc.)
- SQLite schema with graph_nodes, graph_edges, graph_meta tables
- GraphStore class with CRUD, BFS traversal, stale detection
- GraphError class in errors module
- better-sqlite3 dependency
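The BFS traversal mentioned above can be illustrated without a database. The following is a minimal sketch, not the PR's GraphStore code: the `SketchEdge` shape and the `bfsNeighborhood` helper are hypothetical, and edges are treated as undirected so that both callers and callees are reachable.

```typescript
// Hypothetical edge shape; the PR's actual GraphEdge fields are not shown here.
interface SketchEdge {
  source: string;
  target: string;
  kind: "calls" | "imports" | "extends" | "implements" | "exports";
}

// Breadth-first expansion from a start node, limited to maxDepth hops.
// Returns every reachable node ID mapped to its hop distance from start.
function bfsNeighborhood(
  edges: SketchEdge[],
  start: string,
  maxDepth: number,
): Map<string, number> {
  // Build an undirected adjacency list (an assumption of this sketch).
  const adjacency = new Map<string, string[]>();
  const link = (a: string, b: string): void => {
    const list = adjacency.get(a);
    if (list) list.push(b);
    else adjacency.set(a, [b]);
  };
  for (const e of edges) {
    link(e.source, e.target);
    link(e.target, e.source);
  }

  const depth = new Map<string, number>([[start, 0]]);
  let frontier: string[] = [start];
  for (let d = 1; d <= maxDepth && frontier.length > 0; d++) {
    const next: string[] = [];
    for (const id of frontier) {
      for (const neighbor of adjacency.get(id) ?? []) {
        // The depth map doubles as the visited set, so cycles terminate.
        if (!depth.has(neighbor)) {
          depth.set(neighbor, d);
          next.push(neighbor);
        }
      }
    }
    frontier = next;
  }
  return depth;
}
```

Capping the depth keeps the neighborhood small when a search hit sits in a densely connected part of the code.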
Add chunkCodeWithEdges() that extracts structural edges alongside chunks:
- Call expressions → 'calls' edges
- Import statements → 'imports' edges
- Extends/implements → 'extends'/'implements' edges
- Export statements → 'exports' edges
- Edge resolver that matches symbols to chunk IDs (same-file=1.0, cross-file=0.8)
- Language configs extended with callNodeTypes, importNodeTypes, heritageNodeTypes
- TypeScript, JavaScript, Python configs populated; others get empty extraction
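The resolution step can be sketched as follows. The 1.0 and 0.8 confidence values come from the commit message; everything else is an assumption made for illustration, including the field names, the `resolveEdgesSketch` helper, and the choice to skip ambiguous symbols and self-references rather than guess.

```typescript
// Hypothetical shapes; the PR's real RawEdge and GraphEdge fields may differ.
interface RawEdge {
  sourceChunkId: string;
  sourceFile: string;
  targetSymbol: string;
  kind: string;
}
interface ResolvedEdge {
  source: string;
  target: string;
  kind: string;
  confidence: number;
}
interface SymbolEntry {
  chunkId: string;
  file: string;
}

function resolveEdgesSketch(
  raw: RawEdge[],
  symbolIndex: Map<string, SymbolEntry[]>,
): ResolvedEdge[] {
  const resolved: ResolvedEdge[] = [];
  for (const edge of raw) {
    const candidates = symbolIndex.get(edge.targetSymbol) ?? [];
    const sameFile = candidates.filter((c) => c.file === edge.sourceFile);

    let target: SymbolEntry | undefined;
    let confidence = 0;
    if (sameFile.length === 1) {
      // A unique definition in the same file is the strongest match.
      target = sameFile[0];
      confidence = 1.0;
    } else if (sameFile.length === 0 && candidates.length === 1) {
      // A unique definition elsewhere is accepted with lower confidence.
      target = candidates[0];
      confidence = 0.8;
    }
    // Assumption: unknown or ambiguous symbols produce no edge.
    if (!target) continue;
    // Assumption: a chunk never links to itself.
    if (target.chunkId === edge.sourceChunkId) continue;

    resolved.push({
      source: edge.sourceChunkId,
      target: target.chunkId,
      kind: edge.kind,
      confidence,
    });
  }
  return resolved;
}
```

Storing the confidence on the edge lets later consumers decide how much weight to give cross-file matches.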
Add SessionManager with:
- Per-session visited nodes tracking (cap: 10K)
- Priority-based frontier for exploration suggestions
- Chronological reasoning log (cap: 1K entries)
- Per-node annotations
- TTL-based cleanup (default: 1hr, 60s interval)
- Serialize/deserialize for session persistence
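To make the frontier and TTL behaviour concrete, here is a small sketch. It is not the PR's SessionManager: `SessionRegistry`, `ExploreSession`, and their methods are hypothetical names, and the clock is passed in explicitly so the logic can be exercised without timers. Only the 1-hour default TTL is taken from the commit message.

```typescript
// Hypothetical per-session state for this sketch.
interface ExploreSession {
  visited: Set<string>;
  frontier: Map<string, number>; // node ID -> priority (higher is better)
  lastAccess: number;            // epoch milliseconds
}

class SessionRegistry {
  private readonly sessions = new Map<string, ExploreSession>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 60 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Fetch or create a session and refresh its last-access time.
  touch(id: string, now: number): ExploreSession {
    let session = this.sessions.get(id);
    if (!session) {
      session = { visited: new Set(), frontier: new Map(), lastAccess: now };
      this.sessions.set(id, session);
    }
    session.lastAccess = now;
    return session;
  }

  // Return up to `limit` frontier nodes, highest priority first.
  suggest(id: string, now: number, limit: number): string[] {
    const session = this.touch(id, now);
    return [...session.frontier.entries()]
      .sort((a, b) => b[1] - a[1])
      .slice(0, limit)
      .map(([nodeId]) => nodeId);
  }

  // Drop sessions idle for longer than the TTL; returns how many were removed.
  cleanup(now: number): number {
    let removed = 0;
    for (const [id, session] of this.sessions) {
      if (now - session.lastAccess > this.ttlMs) {
        this.sessions.delete(id);
        removed++;
      }
    }
    return removed;
  }

  has(id: string): boolean {
    return this.sessions.has(id);
  }
}
```

In a long-running server, `cleanup` would presumably be called from a periodic timer such as the 60s interval mentioned above.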
Add three new MCP tools (conditional on SEMANTIC_CODE_GRAPH_ENABLED):
- context_query: semantic search + graph neighborhood expansion
- graph_annotate: write notes on nodes, create agent_linked edges
- session_summary: visited/frontier/stale/annotation counts + reasoning log

Register tools in startServer(), add graph cleanup to shutdown handlers
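Conditional registration could look roughly like the sketch below. The three graph tool names and the environment variable come from the commit message; the base tool name `semantic_search`, the helper functions, and the rule that only the string "true" enables the feature are assumptions. The sketch deliberately returns tool names instead of calling any MCP SDK API.

```typescript
// Environment passed in explicitly so the logic is easy to test.
type EnvVars = Record<string, string | undefined>;

// Hypothetical name for the always-available tool.
const BASE_TOOLS: string[] = ["semantic_search"];
// Tool names as listed in the commit message.
const GRAPH_TOOLS: string[] = ["context_query", "graph_annotate", "session_summary"];

function isGraphEnabled(env: EnvVars): boolean {
  // Assumption: only "true" (case-insensitive, trimmed) turns the graph on.
  const raw = env["SEMANTIC_CODE_GRAPH_ENABLED"] ?? "";
  return raw.trim().toLowerCase() === "true";
}

function toolsToRegister(env: EnvVars): string[] {
  const tools = [...BASE_TOOLS];
  if (isGraphEnabled(env)) {
    tools.push(...GRAPH_TOOLS);
  }
  return tools;
}
```

In a Node.js server the argument would typically be `process.env`. When the flag is unset, no graph tool is registered, so the graph code paths are never reached.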
Integrate graph store into indexing pipeline:
- indexFile() extracts edges via chunkCodeWithEdges() when graph enabled
- indexDirectory() builds symbol index after chunking, resolves edges, batch upserts
- FileWatcher.processFileChange() deletes old graph data, re-extracts, resolves
- FileWatcher.handleFileDelete() removes graph data for deleted files
- SemanticSearchTool.setGraphStore() wires graph into indexer/watcher
- All graph operations are conditional and degrade gracefully if disabled
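The delete-then-re-extract flow for a changed file can be sketched with an in-memory stand-in for the store. `InMemoryGraph`, `FileNode`, `FileEdge`, and `onFileChanged` are hypothetical names. Treating an undefined store as "graph disabled" and swallowing graph errors are assumptions about how the graceful degradation might work.

```typescript
// Hypothetical minimal node and edge shapes.
interface FileNode {
  id: string;
  file: string;
}
interface FileEdge {
  source: string;
  target: string;
}

class InMemoryGraph {
  readonly nodes = new Map<string, FileNode>();
  edges: FileEdge[] = [];

  // Remove a file's nodes and every edge touching them (cascading delete).
  deleteFile(file: string): void {
    const removed = new Set<string>();
    for (const [id, node] of this.nodes) {
      if (node.file === file) {
        removed.add(id);
        this.nodes.delete(id);
      }
    }
    this.edges = this.edges.filter(
      (e) => !removed.has(e.source) && !removed.has(e.target),
    );
  }

  upsert(nodes: FileNode[], edges: FileEdge[]): void {
    for (const n of nodes) this.nodes.set(n.id, n);
    this.edges.push(...edges);
  }
}

// Re-index one changed file: drop stale graph data, then insert the fresh extraction.
function onFileChanged(
  store: InMemoryGraph | undefined,
  file: string,
  extract: (file: string) => { nodes: FileNode[]; edges: FileEdge[] },
): void {
  if (!store) return; // graph disabled: nothing to do
  try {
    store.deleteFile(file);
    const fresh = extract(file);
    store.upsert(fresh.nodes, fresh.edges);
  } catch {
    // Assumption: graph failures are swallowed so regular indexing can continue.
  }
}
```

Deleting before inserting means symbols removed from the file do not linger as orphan nodes.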
Add 7 test files covering:
- graph-store.test.ts: CRUD, BFS traversal, cascading deletes, stale detection (27 tests)
- extractor.test.ts: symbol resolution, ambiguity, self-ref prevention (8 tests)
- session.test.ts: visit tracking, frontier, TTL, serialize/deserialize (16 tests)
- context-query.integration.test.ts: end-to-end search + graph expansion (3 tests)
- graph-indexing.integration.test.ts: full pipeline chunk → graph (5 tests)
- graph-injection.test.ts: SQL injection prevention via parameterized queries (5 tests)
- graph.perf.test.ts: benchmarks for BFS, extraction overhead, session summary (4 tests)

All 410 tests pass (22 suites), zero regressions.
…ardening
- Extract shared ID validation into src/utils/validation.ts
- Add validateId/validateIds to GraphStore operations (upsertNodes, upsertEdges, getNode, getNeighbors) for defense-in-depth
- Add ID format regex to Zod schemas in all graph tools
- Fix visitNode frontier bug: remove node from frontier even when visited-nodes cap is reached, preventing infinite revisit loops
- Refactor chunkCode/chunkCodeWithEdges to share core logic via chunkCodeCore(), eliminating ~150 lines of duplication
- Export generateChunkId and use it in context-query.ts instead of duplicated filePathToChunkId()
- Strengthen always-passing test assertions (toBeGreaterThanOrEqual(0)) with meaningful checks; fix test data to produce valid chunks
- Rename mislabeled integration test to reflect actual scope
- Remove unused getAllNodes prepared statement
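The visitNode frontier fix is easiest to see in isolation. The sketch below is one plausible reading of the fix and not the PR's code; `ExplorationState` and its members are hypothetical. The key point is that the frontier removal happens before the cap check.

```typescript
class ExplorationState {
  readonly visited = new Set<string>();
  readonly frontier = new Set<string>();
  private readonly maxVisited: number;

  constructor(maxVisited: number) {
    this.maxVisited = maxVisited;
  }

  // Returns true if the node is recorded as visited after the call.
  visitNode(id: string): boolean {
    // Always drop the node from the frontier first. If this only happened
    // after a successful insert, a full visited set would leave the node in
    // the frontier and it would be suggested again indefinitely.
    this.frontier.delete(id);
    if (this.visited.has(id)) return true;
    if (this.visited.size >= this.maxVisited) return false;
    this.visited.add(id);
    return true;
  }

  // Next suggestion in insertion order, or undefined when the frontier is empty.
  next(): string | undefined {
    for (const id of this.frontier) return id;
    return undefined;
  }
}
```

With a cap of 1 and a frontier of two nodes, the second visit is not recorded, but the node still leaves the frontier, so `next()` eventually returns `undefined`.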
- Bump version 0.3.3 → 0.4.0 (new feature, backward compatible)
- Document context_query, graph_annotate, session_summary tools in README
- Add graph configuration env vars to README
- Update project structure to include graph/ and tools/
- Update architecture.md with graph component details
- Add code-graph, context-graph keywords to package.json
Summary
- Three new MCP tools (context_query, graph_annotate, session_summary) for graph-aware code exploration, agent annotations, and session state tracking
- Enabled via SEMANTIC_CODE_GRAPH_ENABLED=true, with zero overhead when disabled and graceful degradation on failure

Architecture
Changes by phase
- GraphNode, GraphEdge, RawEdge types, GraphConfig, SQLite schema, GraphStore class with CRUD + BFS
- chunkCodeWithEdges() extracts calls/imports/extends/implements from AST; resolveEdges() matches symbols to chunk IDs
- SessionManager with visited nodes, frontier, reasoning log, annotations, TTL cleanup
- context_query (search + graph expansion), graph_annotate (notes + agent_linked edges), session_summary (exploration state)

New dependencies
- better-sqlite3 + @types/better-sqlite3 (synchronous SQLite, prebuilt binaries)

Test plan
- npm test: all 410 tests pass (22 suites), zero regressions
- npm test tests/graph/: 59 graph unit tests pass
- SEMANTIC_CODE_GRAPH_ENABLED=true, run context_query
- graph_annotate + session_summary workflow

🤖 Generated with Claude Code