feat: code intelligence — code graph, coverage, impact, and spec↔code mapping #19

@lsmonki

Summary

Add a specd impact command that, given a spec identifier, file path, or symbol name, computes the full blast radius across both the spec graph and the code graph, and produces an actionable task list for the change.

Today, understanding the impact of any change — to a spec or to code — requires manual cross-referencing: checking which specs depend on which, tracing what code implements a spec, and separately analyzing code-level dependencies. This is slow, error-prone, and the kind of work that should be automated by a spec-driven tool.

Why

  • The primary question in spec-driven development is "I'm changing this spec — what breaks?" A spec change can affect dependent specs, the code that implements it, and transitively all code that depends on that code. Today there is no way to answer this question without manual grep and domain knowledge.
  • The reverse is equally important: "I'm changing this code — which specs are affected?" A code change may violate constraints from one or more specs, but today there is no formal link between source files and the specs that govern them.
  • Reduces risk of incomplete changes. Without unified impact analysis, changes that satisfy the compiler can still silently violate spec constraints, and spec changes can leave implementations out of sync.
  • Enables automated task generation. With both graphs connected, specd can generate a concrete, ordered list of changes needed — specs to update, files to modify, tests to add.

Primary use case: spec change impact

specd impact specs/core/snapshot-hasher/spec.md

Expected answer:

  1. Dependent specs — which other specs depend on this one (upstream dependsOn traversal)
  2. Implementing code — which source files implement this spec (spec→code mapping)
  3. Code blast radius — for each implementing file, what other code depends on it (code graph)
  4. Transitive spec impact — for each affected code file, which other specs does it fall under
  5. Task list — ordered list of everything that needs review or update

This is the most natural question in a spec-driven workflow, and today it cannot be answered without manual effort.

Secondary use case: code change impact

specd impact src/domain/services/content-hash.ts
specd impact --symbol contentHash

Expected answer: the reverse — which specs govern this code, which other specs depend on those, and what the full code blast radius is.

What specd has today

| Capability | Source | What it connects |
|---|---|---|
| Spec → spec dependencies | dependsOn in .specd-metadata.yaml | Spec graph traversal |
| Spec → its own files | contentHashes in .specd-metadata.yaml | Freshness of spec.md, verify.md (not source code) |
| Schema → spec sections | contextSections selectors in schema | Which sections to extract from spec files |

Note: the specs/<package>/ layout convention used in specd's own repository is project-specific — each team can organize their specs differently. specd does not infer a spec→package mapping from directory structure, and any impact tool must not assume one.

What does NOT exist today

  • Spec → code mapping. There is no formal link between a spec and the source files that implement it. This is the critical missing piece for the primary use case. The contentHashes field tracks the spec's own files, not source code.
  • Code → spec mapping. The reverse is also missing — no way to ask "which spec covers content-hash.ts?"
  • Code graph. specd has no understanding of import/call relationships between source files.
  • Unified impact view. No way to go from "I'm changing this spec/file/symbol" to "here's everything affected."

What needs to be built

1. Spec ↔ Code mapping (bidirectional)

The most critical missing piece. A mechanism to associate specs with the source code that implements them, and vice versa. The mapping must not require modifying source code files — it lives entirely in spec-side artifacts.

Option A: Explicit covers field in spec metadata
Add a covers field to .specd-metadata.yaml:

covers:
  - src/domain/services/content-hash.ts
  - src/application/use-cases/_shared/compute-artifact-hash.ts
  • Spec → code: read the covers field
  • Code → spec: build an inverted index across all specs

Pros: Explicit, precise, queryable, works regardless of directory layout conventions, bidirectional from a single source, no source code modifications.
Cons: Per-spec maintenance burden, can go stale if files move without updating metadata.
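The "inverted index" for the code → spec direction could look roughly like the sketch below. It assumes the covers fields have already been read from each spec's metadata into a plain object; the CoversBySpec type, the buildInvertedIndex name, and the second spec entry in the sample data are illustrative assumptions, not part of specd today.

```typescript
// Hypothetical shape: spec id -> files listed in that spec's `covers` field.
type CoversBySpec = Record<string, string[]>;

// Build the code -> spec direction by inverting the spec -> code mapping.
function buildInvertedIndex(coversBySpec: CoversBySpec): Map<string, string[]> {
  const index = new Map<string, string[]>();
  for (const [specId, files] of Object.entries(coversBySpec)) {
    for (const file of files) {
      const specs = index.get(file) ?? [];
      if (!specs.includes(specId)) specs.push(specId);
      index.set(file, specs);
    }
  }
  return index;
}

// Illustrative data only: the second entry is invented to show a file
// that falls under more than one spec.
const coversBySpec: CoversBySpec = {
  "core/snapshot-hasher": [
    "src/domain/services/content-hash.ts",
    "src/application/use-cases/_shared/compute-artifact-hash.ts",
  ],
  "core/validate-artifacts": ["src/domain/services/content-hash.ts"],
};

const codeToSpecs = buildInvertedIndex(coversBySpec);
```

With this in place, "which spec covers content-hash.ts?" becomes a single lookup: codeToSpecs.get("src/domain/services/content-hash.ts").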

Option B: Centralized mapping file
A project-level specd-covers.yaml (or section in specd.yaml) that maps all specs to their implementing files:

mappings:
  core/snapshot-hasher:
    - packages/core/src/domain/services/content-hash.ts
    - packages/core/src/infrastructure/fs/hash.ts
  core/validate-artifacts:
    - packages/core/src/application/use-cases/validate-artifacts.ts
  • Spec → code: look up the spec in the mapping
  • Code → spec: build an inverted index

Pros: Single file to audit, easy to review in PRs, no source code modifications, no per-spec metadata overhead.
Cons: One more file to maintain, can desynchronize with reality, doesn't scale well for very large projects.

Option C: Hybrid — metadata covers + centralized file
Allow both. The impact engine merges both sources.
Pros: Teams choose what fits their workflow.
Cons: Complexity, potential conflicts.
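A minimal sketch of what merging the two sources might involve, to make the "potential conflicts" concrete. The issue does not define merge semantics, so this assumes a per-spec union and treats a spec declared in both places with different file sets as a conflict; mergeCovers and its result shape are hypothetical.

```typescript
type Mapping = Record<string, string[]>;

interface MergeResult {
  merged: Mapping;     // union of both sources, deduplicated per spec
  conflicts: string[]; // specs declared in both sources with different file sets
}

// Canonical form of a file list, so order and duplicates do not matter.
const canonical = (files: string[]): string => [...new Set(files)].sort().join("\n");

// Assumption: union is the merge rule; a real implementation might instead
// prefer one source or refuse to continue on conflict.
function mergeCovers(fromMetadata: Mapping, fromCentral: Mapping): MergeResult {
  const merged: Mapping = {};
  const conflicts: string[] = [];
  const specIds = new Set([...Object.keys(fromMetadata), ...Object.keys(fromCentral)]);
  for (const specId of specIds) {
    const a: string[] | undefined = fromMetadata[specId];
    const b: string[] | undefined = fromCentral[specId];
    merged[specId] = [...new Set([...(a ?? []), ...(b ?? [])])].sort();
    if (a !== undefined && b !== undefined && canonical(a) !== canonical(b)) {
      conflicts.push(specId);
    }
  }
  return { merged, conflicts };
}
```

Reporting conflicts rather than silently resolving them keeps the choice of policy (warn, fail, prefer one source) open.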

Recommendation: Start with Option A (covers in metadata) — it's the most natural place since it lives alongside the spec it describes. Staleness can be mitigated with specd verify --covers to validate listed files still exist. An agent can automate covers updates as part of the change workflow.
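The staleness check behind specd verify --covers could be as small as the sketch below. The findStaleCovers helper is hypothetical, and the existence check is passed in as a function so the sketch stays self-contained; in Node it could be fs.existsSync.

```typescript
// Returns the entries in `covers` that no longer exist.
// `exists` is injected (an assumption of this sketch) so the logic can be
// exercised without touching the file system.
function findStaleCovers(covers: string[], exists: (path: string) => boolean): string[] {
  return covers.filter((file) => !exists(file));
}

// Example: pretend only one of the two covered files is still on disk.
const onDisk = new Set(["src/domain/services/content-hash.ts"]);
const stale = findStaleCovers(
  [
    "src/domain/services/content-hash.ts",
    "src/application/use-cases/_shared/compute-artifact-hash.ts",
  ],
  (path) => onDisk.has(path),
);
```

A non-empty result would be the signal to fail verification or to prompt an agent to update the covers field.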

2. Code graph — language-agnostic via Tree-sitter

specd must be language-agnostic. The code graph cannot depend on a specific compiler or language toolchain. Tree-sitter is the right foundation:

  • MIT licensed — no licensing concerns. Each language grammar is also MIT or similarly permissive.
  • Native Node.js bindings: tree-sitter and node-tree-sitter on npm.
  • Proven at scale — used by GitHub, Neovim, Zed, and other tools for multi-language analysis.
  • Fast — native C parsers, incremental parsing, handles large codebases.

Architecture

Source files → Tree-sitter parser (per language) → Language-specific AST
  → Query extractor (per language) → Unified symbol model
  → Graph builder → Code knowledge graph
  → Impact engine → Blast radius

Core (language-agnostic):

  • A unified symbol model: File, Function, Class, Method, Import
  • A unified relation model: CALLS, IMPORTS, EXPORTS, EXTENDS, IMPLEMENTS
  • Graph builder, traversal, and impact analysis — all operate on the unified model
  • Serialized index (e.g., JSON or SQLite) for fast incremental queries
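One possible shape for the unified model, sketched as TypeScript types. The symbol and relation kinds come from the lists above; the field names, the "<file>#<name>" id scheme, and the choice of JSON for the serialized index are assumptions for illustration.

```typescript
type SymbolKind = "File" | "Function" | "Class" | "Method" | "Import";
type RelationKind = "CALLS" | "IMPORTS" | "EXPORTS" | "EXTENDS" | "IMPLEMENTS";

interface SymbolNode {
  id: string; // hypothetical id scheme: "<file>#<name>"
  kind: SymbolKind;
  name: string;
  file: string;
}

interface Relation {
  kind: RelationKind;
  from: string; // SymbolNode.id of the source
  to: string;   // SymbolNode.id of the target
}

interface CodeGraph {
  nodes: SymbolNode[];
  relations: Relation[];
}

// JSON is one of the two index formats mentioned above; SQLite would need a schema instead.
function serializeGraph(graph: CodeGraph): string {
  return JSON.stringify(graph);
}

function deserializeGraph(json: string): CodeGraph {
  return JSON.parse(json) as CodeGraph; // no validation in this sketch
}

// Tiny example graph: one function calling another across files.
const graph: CodeGraph = {
  nodes: [
    { id: "a.ts#run", kind: "Function", name: "run", file: "a.ts" },
    { id: "b.ts#contentHash", kind: "Function", name: "contentHash", file: "b.ts" },
  ],
  relations: [{ kind: "CALLS", from: "a.ts#run", to: "b.ts#contentHash" }],
};
```

Because every language adapter would emit these same types, the graph builder and impact engine never need to know which language a node came from.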

Language adapters (per language):
Each language adapter is a set of Tree-sitter queries (S-expression patterns) that extract symbols and relations from that language's AST into the unified model. What changes per language is only the query file — the rest of the pipeline is shared.

Example — a TypeScript adapter would have queries for:

  • import_statement → IMPORTS relation
  • export_statement → EXPORTS relation
  • function_declaration, method_definition → Function/Method nodes
  • call_expression → CALLS relation
  • class_declaration, extends_clause → Class + EXTENDS

A Python adapter would have different queries (import_from_statement, function_definition, class_definition, etc.) but produce the same unified model.
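To illustrate the split between the per-language part and the shared pipeline, here is a deliberately simplified stand-in. It does not use Tree-sitter or real S-expression queries: the adapter is reduced to a node-type lookup table, and AstNode is a hypothetical minimal interface mimicking the type and children of a parsed syntax node. Only the table would change per language.

```typescript
// Minimal stand-in for a parsed syntax node (assumption of this sketch).
interface AstNode {
  type: string;
  text: string;
  children: AstNode[];
}

type Rule =
  | { emit: "symbol"; kind: "Function" | "Method" | "Class" }
  | { emit: "relation"; kind: "IMPORTS" | "EXPORTS" | "CALLS" | "EXTENDS" };

// Per-language part: node type -> unified model entry (TypeScript example from the list above).
const typescriptRules: Record<string, Rule> = {
  import_statement: { emit: "relation", kind: "IMPORTS" },
  export_statement: { emit: "relation", kind: "EXPORTS" },
  function_declaration: { emit: "symbol", kind: "Function" },
  method_definition: { emit: "symbol", kind: "Method" },
  call_expression: { emit: "relation", kind: "CALLS" },
  class_declaration: { emit: "symbol", kind: "Class" },
  extends_clause: { emit: "relation", kind: "EXTENDS" },
};

interface Extracted {
  rule: Rule;
  text: string;
}

// Shared part: walk the tree depth-first in source order and apply the rule table.
function extract(root: AstNode, rules: Record<string, Rule>): Extracted[] {
  const out: Extracted[] = [];
  const stack: AstNode[] = [root];
  while (stack.length > 0) {
    const node = stack.pop()!;
    const rule: Rule | undefined = rules[node.type];
    if (rule !== undefined) out.push({ rule, text: node.text });
    // Push children reversed so the leftmost child is visited first.
    stack.push(...[...node.children].reverse());
  }
  return out;
}
```

Real queries can capture named sub-nodes (for example the callee of a call), which a flat lookup table cannot; the sketch only shows where the language boundary sits.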

Bundled languages:

  • TypeScript / JavaScript — included by default (specd's own language)
  • Additional languages installed as optional grammar packages

Plugin model for grammars:

# specd.yaml
codeGraph:
  languages:
    - typescript   # built-in
    - python       # requires tree-sitter-python
    - go           # requires tree-sitter-go

Each grammar is an npm package (tree-sitter-typescript, tree-sitter-python, etc.) plus a specd query file that maps that language's AST nodes to the unified model.

Why not other approaches

| Approach | Why not |
|---|---|
| TypeScript compiler API / ts-morph | TypeScript-only; violates the language-agnostic requirement |
| LSP | Requires running language servers, heavy, hard to batch, different protocol quirks per server |
| Regex import parsing | No call graph, no symbol-level precision, fragile across languages |
| External tool integration | Licensing risk, dependency on third-party maintenance, can't guarantee availability |
| Source code annotations | Requires modifying source files; invasive, couples code to spec tooling |

3. Unified impact engine

Combine spec graph + code graph + spec↔code mapping:

Input: spec
  → Spec graph: find dependent specs (upstream dependsOn)
  → Spec→code mapping: find implementing source files (covers)
  → Code graph: for each implementing file, find callers at depth 1/2/3
  → Code→spec mapping: for each affected code file, find other governing specs
  → Output: unified blast radius + ordered task list

Input: file or symbol
  → Code graph: find all callers/importers at depth 1/2/3
  → Code→spec mapping: for each affected file, find governing specs
  → Spec graph: for each affected spec, traverse dependsOn upstream
  → Output: unified blast radius + ordered task list
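The spec-input flow above could be sketched as a breadth-first walk over reverse code edges followed by a lookup in the covers mapping. Everything here is an assumption for illustration: the ReverseDeps shape, the function names, and the choice to also check the implementing files themselves for other governing specs. The dependent-spec step is left out; the same depth-limited walk could serve it by running over reverse dependsOn edges.

```typescript
// file -> files that import or call it (reverse edges of the code graph)
type ReverseDeps = Map<string, string[]>;

// Breadth-first walk over reverse edges, recording the first depth (1..maxDepth)
// at which each file is reached. Start files themselves are not included.
function blastRadius(starts: string[], dependents: ReverseDeps, maxDepth = 3): Map<string, number> {
  const depthOf = new Map<string, number>();
  const seen = new Set(starts);
  let frontier = starts;
  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const file of frontier) {
      for (const dep of dependents.get(file) ?? []) {
        if (seen.has(dep)) continue; // also guards against import cycles
        seen.add(dep);
        depthOf.set(dep, depth);
        next.push(dep);
      }
    }
    frontier = next;
  }
  return depthOf;
}

// Spec-input flow: covers -> code blast radius -> other governing specs.
function impactOfSpec(
  specId: string,
  covers: Record<string, string[]>,
  dependents: ReverseDeps,
): { implementing: string[]; affected: Map<string, number>; otherSpecs: string[] } {
  const implementing = covers[specId] ?? [];
  const affected = blastRadius(implementing, dependents);
  // Assumption: implementing files count as "touched" too when looking for other specs.
  const touched = new Set([...implementing, ...affected.keys()]);
  const otherSpecs = Object.entries(covers)
    .filter(([id, files]) => id !== specId && files.some((f) => touched.has(f)))
    .map(([id]) => id);
  return { implementing, affected, otherSpecs };
}
```

The depth recorded per file maps directly onto the d=1 / d=2 / d=3 buckets in the output, and the file-or-symbol flow would start the same walk from the given file instead of from covers.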

Output format

Human-readable (default)

specd impact specs/core/snapshot-hasher/spec.md

Dependent specs (via dependsOn):
  - specs/core/validate-artifacts/spec.md
  - specs/core/compile-context/spec.md
  - specs/cli/change-approve/spec.md
  ... (6 total)

Implementing code (via covers):
  - packages/core/src/domain/services/content-hash.ts
  - packages/core/src/infrastructure/fs/hash.ts

Code blast radius:
  d=1 (WILL BREAK):      4 files
  d=2 (LIKELY AFFECTED):  5 files
  d=3 (MAY NEED TESTING): 4 files

Cross-spec impact:
  Code in blast radius also covered by:
  - specs/core/compile-context/spec.md (3 files)
  - specs/core/validate-artifacts/spec.md (2 files)

Task list (15 files, 7 specs):
  1. [spec]  Review/update 6 dependent specs
  2. [code]  Modify 2 implementing files
  3. [code]  Update 8 affected callers
  4. [test]  Update 3 test files

Risk: CRITICAL | Files: ~15 | Specs: 7

Machine-readable (--format json)

Full JSON output for consumption by agents, CI, or other tools.
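The issue does not define the JSON schema. As a starting point for discussion, the sketch below mirrors the sections of the human-readable output; every field name is an assumption, and the risk level is taken as given because no thresholds are specified. The summaryLine helper infers its counts from the sample output (files = implementing + blast radius, specs = dependents + the target), which is also an assumption.

```typescript
type Risk = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Hypothetical report shape for --format json.
interface ImpactReport {
  target: string;
  dependentSpecs: string[];
  implementingCode: string[];
  blastRadius: { depth: 1 | 2 | 3; files: string[] }[];
  crossSpecImpact: { spec: string; files: number }[];
  tasks: { kind: "spec" | "code" | "test"; description: string }[];
  risk: Risk;
}

// Rebuild the one-line summary from the structured report.
function summaryLine(report: ImpactReport): string {
  const radiusFiles = report.blastRadius.reduce((sum, bucket) => sum + bucket.files.length, 0);
  const files = report.implementingCode.length + radiusFiles;
  const specs = report.dependentSpecs.length + 1; // dependents plus the target spec
  return `Risk: ${report.risk} | Files: ~${files} | Specs: ${specs}`;
}

const example: ImpactReport = {
  target: "specs/core/snapshot-hasher/spec.md",
  dependentSpecs: ["specs/core/validate-artifacts/spec.md"],
  implementingCode: ["packages/core/src/domain/services/content-hash.ts"],
  blastRadius: [{ depth: 1, files: ["a.ts", "b.ts"] }],
  crossSpecImpact: [{ spec: "specs/core/validate-artifacts/spec.md", files: 1 }],
  tasks: [{ kind: "spec", description: "Review/update 1 dependent spec" }],
  risk: "HIGH",
};
```

Deriving the human-readable view from the same structure would keep the two formats from drifting apart.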

Rollout

| Phase | Scope | Depends on |
|---|---|---|
| v1 | Spec-only impact (dependsOn traversal, no code) | Existing metadata |
| v1.5 | Spec↔code mapping (covers field) + staleness check | Metadata extension |
| v2 | Tree-sitter code graph (TS/JS built-in) + unified impact | Tree-sitter integration |
| v2.5 | Additional language adapters (Python, Go, Rust, etc.) | Grammar packages + query files |
| v3 | Change-level impact, CI integration, MCP tool | v2+ |

Related issues

Acceptance criteria

  • specd impact <spec-path> returns dependent specs + implementing code + code blast radius
  • specd impact <file-path> returns affected code files + governing specs + dependent specs
  • specd impact --symbol <name> resolves symbol and runs full analysis
  • specd impact --change <id> analyzes all files in a change
  • Output includes risk assessment (LOW/MEDIUM/HIGH/CRITICAL)
  • Output includes ordered task list
  • JSON output available via --format json
  • MCP tool available for agent consumption
  • specd verify --covers validates that files listed in covers still exist
  • Code graph is language-agnostic via Tree-sitter
  • TypeScript/JavaScript grammar built-in; additional languages via plugin grammars
  • Language adapters are Tree-sitter query files mapping to a unified symbol model
  • No source code modifications required — mapping lives entirely in spec-side artifacts
