
Code-intelligence improvements: call-edge return-usage, per-function CFG reaching-defs, affected-by re-resolution, contract bridges, in-process type resolvers #83

Merged
zzet merged 5 commits into main from feat/improvements on Jun 13, 2026

@zzet zzet commented Jun 13, 2026

Summary

Five independent code-intelligence improvements, one commit each. Every commit is fully implemented and tested; the branch builds with CGO and passes go test -race across all touched packages.

1. Return-value usage classification on call edges (26d50ae7)

Every call edge is stamped at extraction time with how its call site consumes the callee's return value — discarded, assigned, partially_ignored, returned, goroutine, deferred, argument, or condition. A single parent-chain classifier, driven by per-grammar node-kind tables, covers Go, Python, JavaScript, TypeScript, Java, Rust, Ruby, and C#; closure and switch-expression boundaries are honored so a call inside a block or match arm is not mislabeled by its enclosing statement, and unknown shapes stay unstamped rather than guessed. Surfaced on find_usages (per-usage field + filter) and verify_change (per-function distribution of real call sites), so a return-signature change shows exactly how each caller uses the value.
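The parent-chain idea can be sketched as follows. This is a minimal illustration, not the project's classifier: the `Node` type, the kind names (loosely modelled on tree-sitter grammar names), and the `transparent` set are all assumptions, and `partially_ignored` is left out because it needs to inspect assignment targets.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for a syntax-tree node: a kind plus a parent link.
type Node struct {
	Kind   string
	Parent *Node
}

// usageByKind is an illustrative node-kind table for one grammar.
// The kind names are assumptions, not the project's actual tables.
var usageByKind = map[string]string{
	"expression_statement":  "discarded",
	"assignment_statement":  "assigned",
	"short_var_declaration": "assigned",
	"return_statement":      "returned",
	"go_statement":          "goroutine",
	"defer_statement":       "deferred",
	"argument_list":         "argument",
	"if_statement":          "condition",
}

// transparent lists wrapper kinds the walk may pass through.
var transparent = map[string]bool{
	"expression_list":          true,
	"parenthesized_expression": true,
}

// classify walks up from a call node and returns the usage label of the first
// classifying ancestor. Any other non-transparent ancestor (a closure boundary
// or an unknown shape) stops the walk and leaves the edge unstamped ("").
func classify(call *Node) string {
	for p := call.Parent; p != nil; p = p.Parent {
		if usage, ok := usageByKind[p.Kind]; ok {
			return usage
		}
		if !transparent[p.Kind] {
			return ""
		}
	}
	return ""
}

func main() {
	// return f()
	ret := &Node{Kind: "return_statement"}
	list := &Node{Kind: "expression_list", Parent: ret}
	call := &Node{Kind: "call_expression", Parent: list}
	fmt.Println(classify(call)) // returned

	// return () => f()  (expression-bodied closure, as in JavaScript)
	closure := &Node{Kind: "arrow_function", Parent: list}
	inner := &Node{Kind: "call_expression", Parent: closure}
	fmt.Printf("%q\n", classify(inner)) // ""
}
```

In this sketch the closure case stays unstamped; the real classifier may assign a more specific label at that boundary.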

2. Per-function control-flow graphs + reaching-definitions fixpoint (878dd0e7)

New internal/cfg package: on-demand per-function CFGs built from the tree-sitter AST for Go, Python, JavaScript, TypeScript, Java, Rust, and Ruby — basic blocks with per-statement def/use sets, labeled edges (branches, loops, labeled break/continue, switch/match fallthrough, try/except/finally), and a bitset GEN/KILL reaching-definitions fixpoint producing statement-granular def→use chains, plus a Mermaid renderer. Exposed as the get_cfg MCP tool and analyze kind=def_use. internal/dataflow gained a CFG-backed refiner on flow_between and taint_paths: same-function value-flow hops are confirmed or pruned based on whether the definition actually reaches the use, and pruned paths sink in the ranking.
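For readers unfamiliar with the GEN/KILL formulation, here is a minimal sketch of a bitset reaching-definitions fixpoint. It works at block granularity with a single `uint64` per set, whereas the description above says the package tracks statements; the `Block` type and field names are hypothetical.

```go
package main

import "fmt"

// Block is a hypothetical basic block. Bit i of Gen/Kill refers to definition i.
type Block struct {
	Gen   uint64 // definitions created in this block
	Kill  uint64 // definitions of the same variables made elsewhere
	Preds []int  // indices of predecessor blocks
}

// reachingDefs iterates IN[b] = union of OUT[p] over predecessors p and
// OUT[b] = Gen[b] | (IN[b] &^ Kill[b]) until no OUT set changes.
func reachingDefs(blocks []Block) (in, out []uint64) {
	in = make([]uint64, len(blocks))
	out = make([]uint64, len(blocks))
	for changed := true; changed; {
		changed = false
		for i, b := range blocks {
			var merged uint64
			for _, p := range b.Preds {
				merged |= out[p]
			}
			in[i] = merged
			next := b.Gen | (merged &^ b.Kill)
			if next != out[i] {
				out[i] = next
				changed = true
			}
		}
	}
	return in, out
}

func main() {
	// d0: x = 1            (block 0)
	// loop header          (block 1, preds 0 and 2)
	// d1: x = x + 1        (block 2, loop body)
	// use of x after loop  (block 3)
	blocks := []Block{
		{Gen: 0b01, Kill: 0b10},
		{Preds: []int{0, 2}},
		{Gen: 0b10, Kill: 0b01, Preds: []int{1}},
		{Preds: []int{1}},
	}
	in, _ := reachingDefs(blocks)
	fmt.Printf("%02b\n", in[3]) // 11: both d0 and d1 reach the use
}
```

A def-to-use chain then follows by intersecting the IN set at a use with the definitions of the variable being read.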

3. Affected-by re-resolution on incremental sync (f518c494)

When an incremental sync changes a file's symbol signatures (or removes symbols / changes their kind), the files that reference those symbols are re-resolved synchronously in the same pipeline; a body-only edit produces no delta and fans out to nothing. The delta is computed on a line-insensitive, graph-derived symbol shape, so it is meaningful across languages and is not defeated by line-embedded node IDs, and parse failures are no longer mistaken for symbol removal. Affected files come from a reverse reference-facts lookup (RefFactsReader.LoadRefFactsByTargets, backed by a new by-target index) unioned with a pre-evict in-edge snapshot, capped with truncation accounting; the no-delta path stays cheap.
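The delta rule can be illustrated with a small sketch. The `Symbol` type, the `shape` encoding, and the assumption that symbol names are unique within a file are all hypothetical; the point is only that line numbers are excluded from the comparison and that a failed parse yields no delta.

```go
package main

import (
	"fmt"
	"sort"
)

// Symbol is a hypothetical per-file symbol record.
type Symbol struct {
	Name      string
	Kind      string
	Signature string
	Line      int // deliberately excluded from the shape
}

// shape is the line-insensitive identity compared across syncs.
func shape(s Symbol) string { return s.Kind + "|" + s.Signature }

// delta returns the names of symbols that were removed or whose kind or
// signature changed. If the new parse failed, it reports nothing, so a
// broken file is not mistaken for mass symbol removal.
func delta(old, cur []Symbol, parseOK bool) []string {
	if !parseOK {
		return nil
	}
	curShapes := make(map[string]string, len(cur))
	for _, s := range cur {
		curShapes[s.Name] = shape(s)
	}
	var changed []string
	for _, s := range old {
		if sh, ok := curShapes[s.Name]; !ok || sh != shape(s) {
			changed = append(changed, s.Name)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	old := []Symbol{
		{Name: "Foo", Kind: "func", Signature: "(a int) error", Line: 10},
		{Name: "Bar", Kind: "func", Signature: "()", Line: 20},
	}
	cur := []Symbol{
		{Name: "Foo", Kind: "func", Signature: "(a int) error", Line: 14}, // moved only
		{Name: "Bar", Kind: "func", Signature: "(x string)", Line: 24},    // signature changed
	}
	fmt.Println(delta(old, cur, true))  // [Bar]
	fmt.Println(delta(old, nil, false)) // []
}
```

Only the names in the returned delta would feed the reverse lookup for referencing files; an empty delta means no fan-out.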

4. Persisted cross-service contract-bridge subgraph (30e58f45)

IDL-aware contract extraction (.proto package/service/method canonical identities with brace-bounded service blocks; a Thrift extractor) plus a matcher join that pairs gRPC/Thrift providers and consumers across casing and package-qualification, gated on real gRPC evidence so plain Register*Server function definitions don't mint phantom providers. Each matched provider↔consumer group materializes one persisted contract-bridge node, scoped to the (workspace, project) match boundary so unrelated services never merge, with deterministic node fields and reconcile serialization. The contracts tool gained action=bridge: a reciprocal-rank-fusion group query and a cross-service impact mode.
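Reciprocal rank fusion itself is a standard technique: each candidate scores the sum of 1/(k + rank) over every ranked list it appears in. The sketch below shows the general formula only; which rankings the bridge query actually fuses is not stated here, so the two input lists and the constant k = 60 are assumptions.

```go
package main

import (
	"fmt"
	"sort"
)

// fuse combines several ranked lists with reciprocal rank fusion:
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
// Ties are broken by id so the output is deterministic.
func fuse(rankings [][]string, k float64) []string {
	scores := map[string]float64{}
	for _, list := range rankings {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b]
	})
	return ids
}

func main() {
	// Two hypothetical rankings of bridge groups for the same query.
	byName := []string{"orders", "billing", "users"}
	byPath := []string{"billing", "users", "orders"}
	fmt.Println(fuse([][]string{byName, byPath}, 60)) // [billing orders users]
}
```

A group ranked consistently high across lists ("billing": second and first) beats one that is first in a single list and last in the other ("orders").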

5. In-process tree-sitter type resolvers for six languages (8605e8fc)

New internal/semantic/tstypes package: per-language type resolvers for Java, Python, Ruby, Rust, TypeScript/JavaScript, and C# that run fully in-process over the shared tree-sitter AST — no external language server spawned. A table-driven engine builds per-file scope graphs, binds declared and constructor types, propagates them through local assignments, resolves receivers against the graph's method sets via import-aware cross-file lookup, and synthesizes implements/extends edges per language. Resolutions are stamped at the ast_resolved tier with semantic_source <lang>-types, never downgrading a stronger edge; ambiguous receivers are skipped. Enrichment is scoped to the repo being enriched, runs its graph-apply phase under the resolve mutex, persists full edge provenance on disk backends, and wires single-file incremental enrichment. Providers register as supplemental and coexist with LSP providers.
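The bind, propagate, and resolve steps can be sketched with a toy scope chain. Everything here is hypothetical (the `Scope` and `MethodSets` types, the flow-insensitive handling of reassignment, the node ID format); it only illustrates how a receiver's type is recovered from a local assignment and why an ambiguous receiver is skipped.

```go
package main

import "fmt"

// Scope is a hypothetical lexical scope mapping variable names to type names.
// An empty type name marks a variable whose type is ambiguous.
type Scope struct {
	parent *Scope
	types  map[string]string
}

func NewScope(parent *Scope) *Scope {
	return &Scope{parent: parent, types: map[string]string{}}
}

// Bind records a declared or constructor type. A conflicting rebind in the
// same scope marks the variable ambiguous instead of guessing.
func (s *Scope) Bind(name, typ string) {
	if prev, ok := s.types[name]; ok && prev != typ {
		s.types[name] = ""
		return
	}
	s.types[name] = typ
}

// Lookup searches this scope, then its ancestors.
func (s *Scope) Lookup(name string) (string, bool) {
	for c := s; c != nil; c = c.parent {
		if t, found := c.types[name]; found {
			return t, t != ""
		}
	}
	return "", false
}

// Assign propagates the known type of src to dst (dst = src).
func (s *Scope) Assign(dst, src string) {
	if t, ok := s.Lookup(src); ok {
		s.Bind(dst, t)
	}
}

// MethodSets stands in for the graph's method sets: type -> method -> node ID.
type MethodSets map[string]map[string]string

// ResolveCall resolves recv.method() or reports false when the receiver type
// is unknown or ambiguous, or the type has no such method.
func ResolveCall(s *Scope, ms MethodSets, recv, method string) (string, bool) {
	t, ok := s.Lookup(recv)
	if !ok {
		return "", false
	}
	id, ok := ms[t][method]
	return id, ok
}

func main() {
	ms := MethodSets{"UserRepo": {"Save": "UserRepo.Save"}}
	fn := NewScope(NewScope(nil))
	fn.Bind("repo", "UserRepo") // repo = new UserRepo()
	fn.Assign("r", "repo")      // r = repo
	id, ok := ResolveCall(fn, ms, "r", "Save")
	fmt.Printf("%q %v\n", id, ok) // "UserRepo.Save" true

	fn.Bind("r", "OrderRepo") // r reassigned to another type: now ambiguous
	id, ok = ResolveCall(fn, ms, "r", "Save")
	fmt.Printf("%q %v\n", id, ok) // "" false
}
```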

Quality

  • Each feature was implemented end-to-end, then put through a multi-agent adversarial review; every confirmed finding was fixed with a regression test that fails before the fix and passes after. A handful of flagged items were examined and dropped as false positives with code-grounded reasons.
  • Builds on current main (v0.44.1). Conflicts with the recently merged work were resolved by union, and the merge surfaced (and fixed) one real omission where two additional call edges were left unstamped.
  • go build ./... + CGO binary build green; go vet clean; go test -race green across parser/languages, cfg, dataflow, graph, graph/store_sqlite, graph/storetest, analysis, contracts, semantic/..., serverstack, resolver, config, agents/..., indexer, and mcp (5,791 tests).

Notes

  • One honest scope limit: a pure arity change to a plain .js (not .ts) function remains undetectable by the affected-by delta, because the JavaScript extractor emits no parameter-shape nodes; TypeScript, Python, Java, C#, Rust, and Go are covered. Closing this would require a change to the JS extractor, outside the scope of this branch.

zzet added 5 commits June 13, 2026 09:11
Stamp return_usage on every call edge at extraction time across Go, Python,
JavaScript, TypeScript, Java, Rust, Ruby, and C#: discarded, assigned,
partially_ignored, returned, goroutine, deferred, argument, condition. One
shared parent-chain classifier driven by per-grammar node-kind tables, with
closure and switch-expression boundaries honored so a call inside a block or
match arm is not mislabeled by its enclosing statement; unknown shapes stay
unstamped rather than mislabeled.

Surfaced on find_usages (per-usage return_usage field + filter param) and
verify_change (per-function return-usage distribution of real call sites, so
a return-signature change shows how every call site consumes the value).
…xpoint

New internal/cfg package: on-demand per-function CFGs from the tree-sitter AST
for Go, Python, JavaScript, TypeScript, Java, Rust, and Ruby — basic blocks
with per-statement def/use sets, labeled edges (branches, loops, labeled
break/continue, switch/match fallthrough, try/except/finally), and a bitset
GEN/KILL reaching-definitions fixpoint producing statement-granular def-to-use
chains, with a Mermaid renderer.

Exposed as the get_cfg MCP tool and analyze kind=def_use. internal/dataflow
gained a CFG-backed refiner on flow_between and taint_paths: same-function
value_flow hops are confirmed or pruned based on whether the def reaches the
use, and pruned paths sink in the ranking.
When an incremental sync changes a file's symbol signatures (or removes
symbols / changes their kind), the files that reference those symbols are
re-resolved synchronously in the same pipeline; a body-only edit produces no
delta and fans out to nothing. The delta is computed on a line-insensitive,
graph-derived symbol shape so it is meaningful across languages and not
defeated by line-embedded node IDs, and parse failures are not mistaken for
symbol removal.

Affected files come from a reverse reference-facts lookup
(RefFactsReader.LoadRefFactsByTargets, backed by a new by-target index) unioned
with a pre-evict in-edge snapshot, capped with truncation accounting. The
no-delta path stays cheap. Also fixes the cold-index resolver shadow-swap and a
stale ref-facts row left when a reference disappears.
IDL-aware contract extraction (.proto package/service/method canonical
identities with brace-bounded service blocks; a Thrift extractor) plus a
matcher join that pairs gRPC/Thrift providers and consumers across casing and
package-qualification, gated on real gRPC evidence so plain Register*Server
function definitions do not mint phantom providers.

Each matched provider-consumer group materializes one persisted contract-bridge
node, scoped to the (workspace, project) match boundary so unrelated services
never merge, with deterministic node fields and reconcile serialization. The
contracts tool gained action=bridge: a reciprocal-rank-fusion group query and a
cross-service impact mode.
New internal/semantic/tstypes package: per-language type resolvers for Java,
Python, Ruby, Rust, TypeScript/JavaScript, and C# that run fully in-process
over the shared tree-sitter AST — no external language server. A table-driven
engine builds per-file scope graphs, binds declared and constructor types,
propagates them through local assignments, resolves receivers against the
graph's method sets via import-aware cross-file lookup, and synthesizes
implements/extends edges per language.

Resolutions are stamped at the ast_resolved tier with semantic_source
<lang>-types, never downgrading a stronger edge; ambiguous receivers are
skipped. Enrichment is scoped to the repo being enriched, runs its graph-apply
phase under the resolve mutex, persists full edge provenance on disk backends,
and wires single-file incremental enrichment. Providers register as
supplemental in the semantic manager and coexist with LSP providers.
@zzet zzet force-pushed the feat/improvements branch from 8605e8f to 68edf88 June 13, 2026 07:11
@zzet zzet merged commit d05f85e into main Jun 13, 2026
11 checks passed
@zzet zzet deleted the feat/improvements branch June 13, 2026 07:30