feat: wire up framework route extraction #89
Hi @timomeara, solid PR. The bug verification is rock-solid (I confirmed the same …

Two concrete improvements that I think would land cleanly inside this PR. Both prevent latent bugs and one of them removes a merge conflict you'll hit with #101 when both merge.

1. Apply …
Thanks @andreinknv, appreciate the careful read.

On comment stripping: agreed, this is a real bug class and worth fixing in this PR rather than relying on follow-up coordination. Since …

On …

On the optional follow-ups: agree both make sense as separate PRs (per-file registry, IndexResult failure summary). I'd rather not expand scope here.

Will push the comment-stripper + regression tests shortly.
… extractors
Replaces comment characters and string-literal contents with spaces (not removal) so source offsets stay valid for downstream regex match index -> line number conversion. Handles Python triple-quoted docstrings, Ruby =begin/=end, Rust nested block comments, and the standard //, #, /* */ forms across the supported languages.
This is consumed by framework extract() methods in a follow-up commit so that commented-out / docstring routing examples don't surface as phantom route nodes in the graph.
…antom routes)
Pipes the per-language stripCommentsForRegex helper into every framework
extract() that scans raw source: django/flask/fastapi (python.ts),
express, laravel, rails, spring, go, rust, aspnet, vapor, plus
swiftui/uikit struct extraction in swift.ts.
Without this, examples like:
# path('/admin/', AdminPanel.as_view())
""" path('/users/', UserListView.as_view()) """
urlpatterns = [path('/real/', RealView.as_view())]
produced 3 phantom route nodes. Now only the real one is extracted.
Each framework gets a regression test in __tests__/frameworks.test.ts
asserting that line-, block-, docstring- and (where relevant)
heredoc-style commented-out routes do not surface as nodes.
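The offset-preserving stripping described in these commit messages can be sketched as follows. This is a minimal illustration, not the PR's stripCommentsForRegex: the names stripPythonCommentsSketch, lineOf and findPathCalls are hypothetical, only Python-style # comments and triple-quoted strings are handled, and ordinary string literals are deliberately left intact so the route URL stays matchable.

```typescript
// Hypothetical sketch: blank out comments and docstrings without moving offsets.
function blank(text: string): string {
  // Every character except a newline becomes a space, so both character
  // offsets and line numbers are unchanged.
  return text.replace(/[^\n]/g, " ");
}

function stripPythonCommentsSketch(source: string): string {
  // Alternation order matters: triple-quoted strings first, then ordinary
  // one-line strings (matched so a '#' inside them is not treated as a
  // comment), then # comments.
  const pattern =
    /"""[\s\S]*?"""|'''[\s\S]*?'''|"(?:\\.|[^"\\\n])*"|'(?:\\.|[^'\\\n])*'|#[^\n]*/g;
  return source.replace(pattern, (m) => {
    const head = m.slice(0, 3);
    const isComment = m[0] === "#";
    const isTripleQuoted = head === '"""' || head === "'''";
    return isComment || isTripleQuoted ? blank(m) : m;
  });
}

// Convert a match index into a 1-based line number.
function lineOf(source: string, index: number): number {
  return source.slice(0, index).split("\n").length;
}

// Simplified stand-in for a framework's route regex.
function findPathCalls(source: string): { url: string; line: number }[] {
  const re = /path\(\s*['"]([^'"]*)['"]/g;
  const found: { url: string; line: number }[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    found.push({ url: m[1], line: lineOf(source, m.index) });
  }
  return found;
}

const sample = [
  "# path('/admin/', AdminPanel.as_view())",
  "\"\"\" path('/users/', UserListView.as_view()) \"\"\"",
  "urlpatterns = [path('/real/', RealView.as_view())]",
].join("\n");

const stripped = stripPythonCommentsSketch(sample);
const rawCalls = findPathCalls(sample);        // 3 matches, two of them phantom
const cleanCalls = findPathCalls(stripped);    // 1 match: '/real/' on line 3
```

Because the blanked text has the same length as the input, the line number computed from a match index in the stripped text is also correct for the original file.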
Conflict resolution after rebasing main onto this PR:
- src/extraction/tree-sitter.ts: main added VueExtractor (new file src/extraction/vue-extractor.ts via colbymchenry#66). The PR's restructured if/else chain in extractFromSource gets a new vue branch alongside svelte/liquid/dfm so the framework-extract pipeline runs uniformly for vue files too.
- src/resolution/frameworks/vue.ts: the vue resolver still used the dead extractNodes(): Node[] interface that this PR replaced. Migrated to extract(): { nodes, references }, matching the other 13 resolvers. Vue's nuxt route detection (pages/, server/api/, middleware/) keeps working, just emits no references (matches react.ts shape).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
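For readers unfamiliar with the hook shape, here is a rough sketch of what a resolver looks like after migrating to extract(): { nodes, references }. All type names (SketchNode, SketchReference, ExtractResult, FrameworkResolverSketch) and the nuxtPagesSketch resolver are hypothetical simplifications, not the project's real definitions, and the file-based route derivation is only an assumption about how pages/ detection might work.

```typescript
// Hypothetical, simplified shapes; the project's real types carry more fields.
interface SketchNode {
  id: string;
  kind: string;
  name: string;
  filePath: string;
  line: number;
}

interface SketchReference {
  fromNodeId: string;
  referenceName: string;
  referenceKind: string;
  line: number;
}

interface ExtractResult {
  nodes: SketchNode[];
  references: SketchReference[];
}

interface FrameworkResolverSketch {
  name: string;
  languages: string[];
  // Optional hook: resolvers without it contribute nothing at extraction time.
  extract?(filePath: string, content: string): ExtractResult;
}

// File-based routing: the route comes from the path alone, so this resolver
// emits route nodes but no handler references.
const nuxtPagesSketch: FrameworkResolverSketch = {
  name: "vue",
  languages: ["vue"],
  extract(filePath: string, _content: string): ExtractResult {
    const m = /(?:^|\/)pages\/(.+)\.vue$/.exec(filePath);
    if (m === null) return { nodes: [], references: [] };
    // pages/users/index.vue -> /users, pages/index.vue -> /
    const url = "/" + m[1].replace(/(^|\/)index$/, "");
    const node: SketchNode = {
      id: `route:${filePath}:1:${url}`,
      kind: "route",
      name: url,
      filePath,
      line: 1,
    };
    return { nodes: [node], references: [] };
  },
};

// A caller treats a missing hook the same as an empty result.
function runExtract(
  resolver: FrameworkResolverSketch,
  filePath: string,
  content: string
): ExtractResult {
  return resolver.extract
    ? resolver.extract(filePath, content)
    : { nodes: [], references: [] };
}

const usersPage = runExtract(nuxtPagesSketch, "pages/users/index.vue", "<template></template>");
const nonPage = runExtract(nuxtPagesSketch, "src/App.vue", "<template></template>");
```

Returning an empty references array rather than omitting the field keeps every resolver on one return shape, so the caller never has to special-case frameworks that only produce nodes.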
This is excellent work — thanks Tim. The bug analysis is genuine (extractNodes had zero callers, so every framework's route extraction was being silently discarded), the per-framework commits make this reviewable, and the integration test that builds a real Django project on disk to assert the route→view edge is exactly the right kind of verification. Pushed a conflict-resolution commit to your branch:
Full test suite: 481/481 passing (423 → 481, +58 new from your tests). Merging.
Problem
FrameworkResolver.extractNodes is declared in the type at src/resolution/types.ts but has zero callers across the entire src/ tree (confirmed via grep). Meanwhile every framework resolver (Django, Flask, FastAPI, Express, Laravel, Rails, Spring, Go, Rust, C#, Swift, React, Svelte) ships an extractNodes implementation that does real work and is then discarded.

As a result, the graph has zero route-kind nodes in practice. Checked on a real Django codebase: 23 urls.py files indexed, 0 route nodes produced, 0 edges from URL configs to view classes. codegraph_callers(MyView) silently misses its most important caller: the URL pattern that binds it.

Separately, the existing Django extractor's regex captures the view name in group 2 but the destructure discards it, so even if the hook were wired up it wouldn't link routes to views. Similar bugs exist in other frameworks.
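The dropped-capture bug can be illustrated with a small sketch. This is an assumption-laden approximation, not the PR's Django extractor: extractDjangoRoutesSketch, RouteNodeSketch, HandlerRefSketch and the regex are hypothetical, while the fromNodeId / referenceName / referenceKind field names and the route:<file>:<line>:<url> ID format follow the wording used elsewhere in this PR.

```typescript
interface RouteNodeSketch {
  id: string;
  kind: "route";
  name: string;
  filePath: string;
  line: number;
}

interface HandlerRefSketch {
  fromNodeId: string;
  referenceName: string;
  referenceKind: "references";
  line: number;
}

function extractDjangoRoutesSketch(
  filePath: string,
  content: string
): { nodes: RouteNodeSketch[]; references: HandlerRefSketch[] } {
  const nodes: RouteNodeSketch[] = [];
  const references: HandlerRefSketch[] = [];
  // Group 1: URL pattern. Group 2: handler expression, e.g.
  // "UserListView.as_view" or "views.user_list".
  const re = /\b(?:re_)?path\(\s*r?['"]([^'"]*)['"]\s*,\s*([A-Za-z_][\w.]*)/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(content)) !== null) {
    // The bug class: writing `const [, url] = m` here silently drops group 2,
    // so a route node exists but nothing links it to its view.
    const [, url, handler] = m;
    const line = content.slice(0, m.index).split("\n").length;
    const id = `route:${filePath}:${line}:${url}`;
    nodes.push({ id, kind: "route", name: url, filePath, line });
    // include(...) delegates to another urls module; it is not a view.
    if (handler !== "include") {
      references.push({
        fromNodeId: id,
        // For class-based views keep the class name, not ".as_view".
        referenceName: handler.replace(/\.as_view$/, ""),
        referenceKind: "references",
        line,
      });
    }
  }
  return { nodes, references };
}

const urlsPy = [
  "from django.urls import include, path",
  "from .views import UserListView",
  "",
  "urlpatterns = [",
  '    path("users/", UserListView.as_view(), name="user-list"),',
  '    path("api/", include("api.urls")),',
  "]",
].join("\n");

const djangoResult = extractDjangoRoutesSketch("users/urls.py", urlsPy);
```

In this sample, two route nodes are produced but only the users/ route yields a reference, named UserListView, which is what a later resolution step would need to connect the route to the view class.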
Fix
- Replace the extractNodes?(filePath, content): Node[] hook with extract?(filePath, content): { nodes, references }.
- Call extract() inside the extraction pipeline for every framework whose declared languages include the current file's language. The orchestrator detects frameworks once per index run via a filesystem-backed ResolutionContext and plumbs the names through the parse-worker boundary (strings, not function refs, because structured clone can't serialize methods).
- Run the emitted references through the existing reference resolution (name matching, import resolution, framework resolve()) to produce route -> handler edges with kind references.

After this change, codegraph_callers(UserListView) on a Django project returns the URL pattern that binds it.

Frameworks covered
- Django: path(), re_path(), url(), include() in urls.py (CBV .as_view(), dotted module paths)
- Flask: @app.route('/x', methods=[...]), blueprint routes
- FastAPI: @app.get(...), @router.post(...), all standard methods
- Express: app.get(...), router.post(...) with middleware chains (handler = last arg)
- Laravel: Route::get(), Route::resource(), Controller@action, tuple syntax
- Rails: get '/x', to: 'users#index', hash-rocket =>
- Spring: @GetMapping, @PostMapping, @RequestMapping on methods
- Go: r.GET(...), router.HandleFunc(...)
- Rust: .route("/x", get(handler))
- C# (ASP.NET): [HttpGet("/x")] attributes
- Swift (Vapor): app.get("x", use: handler)

Tests
- __tests__/frameworks.test.ts: 29 tests. For each framework, a representative route pattern must produce both the expected route node and a handler reference with correct fromNodeId / referenceName / referenceKind.
- __tests__/frameworks-integration.test.ts: builds a real tmp Django project on disk (manage.py, requirements.txt, users/views.py with UserListView, users/urls.py with path("users/", UserListView.as_view(), ...)), runs full indexAll(), and asserts that the route node exists, the class node exists, and an edge of kind references connects them.

Before this PR the integration test fails (0 route nodes). After, it passes.

Full suite: 410 tests, 409 pass. The 1 pre-existing failure is FileWatcher > debounced sync > should trigger sync after file change, an fs.watch timing flake that reproduces on the base commit too and is unrelated to this work.

Architecture
The cleanest hook point turned out to be inside extractFromSource itself, because both the main-thread fallback path and the worker-thread parse path go through it. That way the worker doesn't need to know anything about framework objects, only a string[] of detected names.

The references flow through the existing ReferenceResolver.resolveAll so they're linked by the same name-matching / import-resolution / framework resolve() machinery that handles every other kind of reference. That means Django's view-class-targeting logic in djangoResolver.resolve() is re-used automatically for route references, with no new resolution path to maintain.

Scope notes
- … include(('api.urls', 'api')), comments containing fake path(...) calls, DRF router.register action expansion) are listed as follow-ups.
- … route:<file>:<line>:<url>). Matches existing framework precedent; an edit that adds a route at the top of a file will churn downstream IDs. Worth revisiting when incremental indexing lands.
- … <Route element={<Page/>}/> → Page wiring is a follow-up.

Stats
The bulk of the docs delta is docs/plans/2026-04-24-framework-resolver-extract.md, the implementation plan. Happy to drop that commit if you'd prefer the PR without the planning artifact.

Commit sequence
15 commits, one per framework (revertable independently):