feat: wire up framework route extraction #46
Open
mschreib28 wants to merge 18 commits into
… extractors Replaces comment characters and string-literal contents with spaces (not removal) so source offsets stay valid for downstream regex match index -> line number conversion. Handles Python triple-quoted docstrings, Ruby =begin/=end, Rust nested block comments, and the standard //, #, /* */ forms across the supported languages. This is consumed by framework extract() methods in a follow-up commit so that commented-out / docstring routing examples don't surface as phantom route nodes in the graph.
…antom routes)
Pipes the per-language stripCommentsForRegex helper into every framework
extract() that scans raw source: django/flask/fastapi (python.ts),
express, laravel, rails, spring, go, rust, aspnet, vapor, plus
swiftui/uikit struct extraction in swift.ts.
Without this, examples like:

    # path('/admin/', AdminPanel.as_view())
    """ path('/users/', UserListView.as_view()) """
    urlpatterns = [path('/real/', RealView.as_view())]

produced 3 route nodes, two of them phantoms. Now only the real one is
extracted.
Each framework gets a regression test in __tests__/frameworks.test.ts
asserting that line-, block-, docstring- and (where relevant)
heredoc-style commented-out routes do not surface as nodes.
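
For reviewers, a minimal sketch of the offset-preserving idea described in the commit messages above. This is not the PR's `stripCommentsForRegex`: the name `blankPythonComments`, the Python-only scope, and the choice to blank only `#` comments and triple-quoted strings (ordinary string literals are skipped over but left intact) are assumptions made for illustration.

```typescript
// Hypothetical helper (not the PR's code): overwrite Python '#' comments and
// triple-quoted strings with spaces so every remaining character keeps its offset.
function blankPythonComments(src: string): string {
  const out = src.split("");
  // Blank [start, end) but keep newlines, so line numbers are unchanged.
  const blank = (start: number, end: number): void => {
    for (let k = start; k < end; k++) {
      if (out[k] !== "\n") out[k] = " ";
    }
  };
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (ch === "#") {
      let end = src.indexOf("\n", i);
      if (end === -1) end = src.length;
      blank(i, end);
      i = end;
    } else if (ch === '"' || ch === "'") {
      const triple = ch + ch + ch;
      if (src.slice(i, i + 3) === triple) {
        const close = src.indexOf(triple, i + 3);
        const end = close === -1 ? src.length : close + 3;
        blank(i, end);
        i = end;
      } else {
        // Ordinary string: skip to the closing quote so a '#' inside it
        // is not mistaken for a comment. Backslash escapes are honored.
        let j = i + 1;
        while (j < src.length && src[j] !== ch && src[j] !== "\n") {
          if (src[j] === "\\") j++;
          j++;
        }
        i = j + 1;
      }
    } else {
      i++;
    }
  }
  return out.join("");
}

// Convert a regex match index into a 1-based line number.
function lineOf(src: string, index: number): number {
  return src.slice(0, index).split("\n").length;
}

// The three-line example from the commit message.
const sample = [
  "# path('/admin/', AdminPanel.as_view())",
  '""" path(\'/users/\', UserListView.as_view()) """',
  "urlpatterns = [path('/real/', RealView.as_view())]",
].join("\n");

const cleaned = blankPythonComments(sample);
const routes: { url: string; line: number }[] = [];
const routeRe = /path\(\s*'([^']+)'/g;
let hit: RegExpExecArray | null;
while ((hit = routeRe.exec(cleaned)) !== null) {
  routes.push({ url: hit[1], line: lineOf(cleaned, hit.index) });
}
// routes now holds only { url: "/real/", line: 3 }
```

Because the blanked text has the same length as the input, a match index found in `cleaned` maps to the same line in the original file, which is why the helper replaces characters instead of deleting them.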
## Problem

`FrameworkResolver.extractNodes` is declared in the type at `src/resolution/types.ts` but has zero callers across the entire `src/` tree (confirmed via grep). Meanwhile every framework resolver (Django, Flask, FastAPI, Express, Laravel, Rails, Spring, Go, Rust, C#, Swift, React, Svelte) ships an `extractNodes` implementation that does real work and is then discarded.

As a result, the graph has zero `route`-kind nodes in practice. Checked on a real Django codebase: 23 `urls.py` files indexed, 0 route nodes produced, 0 edges from URL configs to view classes. `codegraph_callers(MyView)` silently misses its most important caller: the URL pattern that binds it.

Separately, the existing Django extractor's regex captures the view name in group 2 but the destructure discards it, so even if the hook were wired up it wouldn't link routes to views. Similar bugs exist in other frameworks.

## Fix

- Replaces the dead `extractNodes?(filePath, content): Node[]` hook with `extract?(filePath, content): { nodes, references }`.
- Runs `extract()` inside the extraction pipeline for every framework whose declared `languages` include the current file's language. The orchestrator detects frameworks once per index run via a filesystem-backed `ResolutionContext` and plumbs the names through the parse-worker boundary (strings, not function refs, because structured clone can't serialize methods).
- Updates all 13 existing framework resolvers to emit both route nodes AND handler references. The references flow through the existing resolution pipeline (name matching, import resolution, framework-specific `resolve()`) to produce `route -> handler` edges with kind `references`.

After this change, `codegraph_callers(UserListView)` on a Django project returns the URL pattern that binds it.

## Frameworks covered

| Framework | Shapes recognized |
|---|---|
| Django | `path()`, `re_path()`, `url()`, `include()` in `urls.py` (CBV `.as_view()`, dotted module paths) |
| Flask | `@app.route('/x', methods=[...])`, blueprint routes |
| FastAPI | `@app.get(...)`, `@router.post(...)`, all standard methods |
| Express | `app.get(...)`, `router.post(...)` with middleware chains (handler = last arg) |
| Laravel | `Route::get()`, `Route::resource()`, `Controller@action`, tuple syntax |
| Rails | `get '/x', to: 'users#index'`, hash-rocket `=>` |
| Spring | `@GetMapping`, `@PostMapping`, `@RequestMapping` on methods |
| Gin / chi / gorilla / mux | `r.GET(...)`, `router.HandleFunc(...)` |
| Axum / actix / Rocket | `.route("/x", get(handler))` |
| ASP.NET | `[HttpGet("/x")]` attributes |
| Vapor | `app.get("x", use: handler)` |
| React Router / SvelteKit | Route component nodes (interface migration only; handler refs are a follow-up) |

## Tests

- Unit tests per framework in `__tests__/frameworks.test.ts` (29 tests). Each framework asserts a representative route pattern produces both the expected route node and a handler reference with correct `fromNodeId` / `referenceName` / `referenceKind`.
- End-to-end Django test in `__tests__/frameworks-integration.test.ts`: builds a real tmp Django project on disk (manage.py, requirements.txt, users/views.py with `UserListView`, users/urls.py with `path("users/", UserListView.as_view(), ...)`), runs full `indexAll()`, asserts the `route` node exists, the `class` node exists, and an edge between them with kind `references`.

Before this PR the integration test fails (0 route nodes). After, it passes. Full suite: 410 tests, 409 pass. The 1 pre-existing failure is `FileWatcher > debounced sync > should trigger sync after file change`, an fs.watch timing flake that reproduces on the base commit too and is unrelated to this work.

## Architecture

The cleanest hook point turned out to be inside `extractFromSource` itself, because both the main-thread fallback path and the worker-thread parse path go through it. That way the worker doesn't need to know anything about framework objects, only a `string[]` of detected names.

```
indexAll()
  ├─ detectFrameworks() → string[]   (once per run, filesystem-backed context)
  └─ for each file: postMessage({ ..., frameworkNames })
       worker: extractFromSource(path, content, lang, frameworkNames)
         ├─ tree-sitter pass → {nodes, unresolvedReferences, errors}
         └─ for fw in getApplicableFrameworks(names, lang):
              fw.extract(path, content) → {nodes, references}
              merge into result
```

The references flow through the existing `ReferenceResolver.resolveAll` so they're linked by the same name-matching / import-resolution / framework `resolve()` machinery that handles every other kind of reference. That means Django's view-class-targeting logic in `djangoResolver.resolve()` is re-used automatically for route references, with no new resolution path to maintain.

## Scope notes

- Regex-based extraction throughout. AST-based is a tracked follow-up (the plan doc explicitly scopes it out). Current regex handles the realistic shapes covered by the test suite; known edge cases (namespaced `include(('api.urls', 'api'))`, comments containing fake `path(...)` calls, DRF `router.register` action expansion) are listed as follow-ups.
- Node IDs embed line numbers (`route:<file>:<line>:<url>`). Matches existing framework precedent; an edit that adds a route at the top of a file will churn downstream IDs. Worth revisiting when incremental indexing lands.
- React Router / SvelteKit only migrate to the new interface without emitting handler refs. `<Route element={<Page/>}/>` → `Page` wiring is a follow-up.

## Stats

| Category | Lines |
|----------|------:|
| Production code (`src/`) | +760 / -683 |
| Tests (`__tests__/`) | +370 |
| Docs (README + plan) | +1139 |

The bulk of the docs delta is `docs/plans/2026-04-24-framework-resolver-extract.md`, the implementation plan. Happy to drop that commit if you'd prefer the PR without the planning artifact.

## Commit sequence

15 commits, one per framework (revertable independently):

```
docs: add framework extract wiring plan
feat(resolution): replace extractNodes with extract() returning nodes and references
feat(resolution): add getApplicableFrameworks helper for per-language dispatch
feat(django): emit route nodes and route->view references in extract()
feat(flask,fastapi): emit route nodes and route->handler references
feat(express): emit route nodes and route->handler references
feat(laravel): emit route nodes and route->handler references
feat(rails): emit route nodes and route->handler references
feat(spring): emit route nodes and route->handler references
feat(go): emit route nodes and route->handler references
feat(rust): emit route nodes and route->handler references
feat(aspnet): emit route nodes and route->handler references
feat(swift,vapor): emit route nodes and route->handler references
chore(react,svelte): migrate resolvers to extract() interface
feat(extraction): run framework extractors after tree-sitter parse
docs: document framework route extraction
```

Copied from colbymchenry/codegraph#89
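
To make the Architecture section concrete, here is a rough sketch of the `extract()` contract and the per-language dispatch. Only `extract`, `getApplicableFrameworks`, the `fromNodeId` / `referenceName` / `referenceKind` fields, and the `route:<file>:<line>:<url>` ID shape come from the description above. The remaining field names, the `registry` array, `runFrameworkExtractors`, and the Django regex are assumptions for illustration, and the PR's actual types may differ.

```typescript
// Illustrative shapes only; field names beyond those quoted in the PR are assumed.
interface RouteNode {
  id: string;
  kind: "route";
  name: string;
  filePath: string;
  line: number;
}

interface HandlerRef {
  fromNodeId: string;
  referenceName: string;
  referenceKind: "references";
}

interface ExtractResult {
  nodes: RouteNode[];
  references: HandlerRef[];
}

interface FrameworkResolver {
  name: string;
  languages: string[];
  extract?(filePath: string, content: string): ExtractResult;
}

// Hypothetical Django extractor: group 1 is the URL, group 2 the view name.
const djangoSketch: FrameworkResolver = {
  name: "django",
  languages: ["python"],
  extract(filePath, content) {
    const result: ExtractResult = { nodes: [], references: [] };
    const re = /path\(\s*['"]([^'"]*)['"]\s*,\s*([A-Za-z_][\w.]*)/g;
    let m: RegExpExecArray | null;
    while ((m = re.exec(content)) !== null) {
      const url = m[1];
      // Keep group 2 and drop a trailing ".as_view" so the reference names the class.
      const view = m[2].replace(/\.as_view$/, "");
      const line = content.slice(0, m.index).split("\n").length;
      const id = `route:${filePath}:${line}:${url}`;
      result.nodes.push({ id, kind: "route", name: url, filePath, line });
      result.references.push({ fromNodeId: id, referenceName: view, referenceKind: "references" });
    }
    return result;
  },
};

// Assumed registry; the worker receives only framework names (plain strings).
const registry: FrameworkResolver[] = [djangoSketch];

function getApplicableFrameworks(names: string[], lang: string): FrameworkResolver[] {
  return registry.filter(
    (fw) => names.indexOf(fw.name) !== -1 && fw.languages.indexOf(lang) !== -1
  );
}

// Run every applicable extractor and merge the results.
function runFrameworkExtractors(
  path: string,
  content: string,
  lang: string,
  frameworkNames: string[]
): ExtractResult {
  const merged: ExtractResult = { nodes: [], references: [] };
  for (const fw of getApplicableFrameworks(frameworkNames, lang)) {
    const r = fw.extract?.(path, content);
    if (!r) continue;
    merged.nodes.push(...r.nodes);
    merged.references.push(...r.references);
  }
  return merged;
}

const urlsPy = 'urlpatterns = [\n    path("users/", UserListView.as_view(), name="user-list"),\n]';
const extracted = runFrameworkExtractors("users/urls.py", urlsPy, "python", ["django"]);
// extracted.nodes[0].id is "route:users/urls.py:2:users/"
// extracted.references[0].referenceName is "UserListView"
```

Looking resolvers up by name inside the worker is what lets the message payload stay a plain `string[]`: the structured clone algorithm used by `postMessage` cannot copy functions, so resolver objects with methods could not cross that boundary.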