feat(parser): comprehensive PHP/Laravel support — fix infrastructure + Laravel semantic edges#252
Open
Minidoracat wants to merge 4 commits into tirth8205:main
…ure + add Laravel semantic edges
PHP's core parsing infrastructure (CALLS, INHERITS, IMPORTS edges) was
completely non-functional because `_get_call_name()` could not match
tree-sitter-php's `name` node type, `_get_bases()` had no PHP branch,
and `_extract_import()` fell through to a raw-text fallback.
This commit fixes the PHP foundation and adds Laravel-specific semantic
analysis on top:
**Phase 1 — PHP infrastructure fix:**
- `_get_call_name()`: add PHP-specific branches for all 4 call expression
types (function_call, member_call, scoped_call, object_creation)
- `_get_bases()`: add PHP branch for `base_clause` (extends) and
`class_interface_clause` (implements)
- `_extract_import()`: add PHP branch handling simple, grouped, and
aliased `use` statements with proper AST traversal
- `_CLASS_TYPES["php"]`: add `trait_declaration`, `enum_declaration`
- `_CALL_TYPES["php"]`: add `scoped_call_expression`,
`object_creation_expression`
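The node-type mismatch behind the `_get_call_name()` fix can be illustrated with a minimal sketch. It does not use tree-sitter: `Node` is a hypothetical stand-in for a parsed AST node, and `get_call_name` is a simplified illustration rather than the function in `parser.py`.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical stand-in for a tree-sitter node (assumption, not the real API)."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


def get_call_name(node: Node, language: str) -> Optional[str]:
    """Return the callee name of a call node, or None if no identifier child matches."""
    # Most grammars label identifiers "identifier"; tree-sitter-php labels them "name".
    wanted = {"identifier"}
    if language == "php":
        wanted.add("name")
    for child in node.children:
        if child.type in wanted:
            return child.text
    return None


# A PHP call such as strlen($s), reduced to the parts that matter here.
call = Node(
    "function_call_expression",
    children=[Node("name", "strlen"), Node("arguments")],
)
```

Without the PHP branch the lookup only accepts `identifier` children, so the same node yields no name and therefore no CALLS edge.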
**Phase 2 — Entry points + Blade detection:**
- `_LANG_ENTRY_NAME_PATTERNS`: language-scoped entry-point patterns so
PHP-specific names (handle, boot, register, up, down) don't pollute
other languages
- `detect_language()`: handle `.blade.php` compound extension before
the generic suffix lookup
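A minimal sketch of why the compound extension has to be checked first: the last suffix of `home.blade.php` is `.php`, so a plain suffix lookup would classify a Blade template as ordinary PHP. The extension table and the `"blade"` label are assumptions made for illustration; the real mapping in `parser.py` may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension table (assumption); the real one covers more languages.
EXTENSION_TO_LANGUAGE = {".php": "php", ".py": "python", ".js": "javascript"}


def detect_language(path: str) -> Optional[str]:
    """Map a file path to a language label, or None if the extension is unknown."""
    name = Path(path).name.lower()
    # Compound extension first: Path("home.blade.php").suffix is only ".php".
    if name.endswith(".blade.php"):
        return "blade"  # assumed label for Blade templates
    return EXTENSION_TO_LANGUAGE.get(Path(name).suffix)
```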
**Phase 3 — Laravel semantic edges:**
- `_extract_php_constructs()`: detect Route definitions
(`Route::get('/path', [Controller::class, 'method'])`) and emit CALLS
edges to controller methods
- Detect Eloquent relationships (`hasMany`, `belongsTo`, etc.) and emit
REFERENCES edges to target models
- `_php_class_from_class_access()`: correctly extract class names from
both short (`Post::class`) and FQCN (`\App\Models\Post::class`) forms
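The Route and `::class` handling described above can be sketched on plain strings. The PR describes AST-based extraction, so the regex below is a swapped-in simplification, and the function names `class_name_from_class_access` and `route_target` are hypothetical. Returning the short class name (rather than the FQCN) is also an assumption.

```python
import re
from typing import Optional, Tuple

# Matches Route::<verb>('<path>', [<Class>::class, '<method>']) on a single line.
_ROUTE_RE = re.compile(
    r"Route::\w+\(\s*['\"][^'\"]*['\"]\s*,\s*"
    r"\[\s*([\\\w]+::class)\s*,\s*['\"](\w+)['\"]\s*\]"
)


def class_name_from_class_access(text: str) -> Optional[str]:
    """Return the short class name from `Post::class` or `\\App\\Models\\Post::class`."""
    text = text.strip()
    if not text.endswith("::class"):
        return None
    qualified = text[: -len("::class")]
    # Keep only the last namespace segment; works for short and FQCN forms.
    short = qualified.rsplit("\\", 1)[-1]
    return short or None


def route_target(line: str) -> Optional[Tuple[str, str]]:
    """Return (controller, method) for a Route definition, or None if none is found."""
    match = _ROUTE_RE.search(line)
    if match is None:
        return None
    controller = class_name_from_class_access(match.group(1))
    if controller is None:
        return None
    return controller, match.group(2)
```

A CALLS edge would then point from the route file to the returned controller method; how the edge target is named in the graph is not specified here.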
**Phase 4 — Blade templates + PSR-4:**
- `_parse_blade()`: regex-based extraction of `@extends`, `@include`,
`@component`, `@livewire` directives as IMPORTS_FROM/REFERENCES edges
- `_find_php_composer_psr4()`: resolve PHP namespaces to file paths via
`composer.json` autoload PSR-4 mappings with caching
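Both Phase 4 pieces can be sketched in a few lines, under stated assumptions: the regex only covers directives whose first argument is a quoted string, the PSR-4 resolver picks the longest matching namespace prefix, and caching is left out. The function names are hypothetical and are not those used in `parser.py`.

```python
import json
import re
from pathlib import PurePosixPath
from typing import List, Optional, Tuple

# The four directives named in the commit, each followed by a quoted first argument.
_BLADE_DIRECTIVE_RE = re.compile(
    r"@(extends|include|component|livewire)\(\s*['\"]([^'\"]+)['\"]"
)


def blade_directives(source: str) -> List[Tuple[str, str]]:
    """Return (directive, target) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in _BLADE_DIRECTIVE_RE.finditer(source)]


def load_psr4(composer_json_text: str) -> dict:
    """Extract the autoload PSR-4 mapping from composer.json text."""
    data = json.loads(composer_json_text)
    return data.get("autoload", {}).get("psr-4", {})


def resolve_class_path(fqcn: str, psr4: dict) -> Optional[str]:
    """Map a fully qualified class name to a relative file path via PSR-4 prefixes."""
    fqcn = fqcn.lstrip("\\")
    # Try longer prefixes first so the most specific mapping wins.
    for prefix in sorted(psr4, key=len, reverse=True):
        if fqcn.startswith(prefix):
            base = psr4[prefix]
            if isinstance(base, list):  # composer allows several directories
                if not base:
                    continue
                base = base[0]
            relative = fqcn[len(prefix):].replace("\\", "/") + ".php"
            return str(PurePosixPath(base) / relative)
    return None
```

For example, with the common `"App\\": "app/"` mapping, `\App\Models\Post` resolves to `app/Models/Post.php`.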
**Tested on real Laravel 9 and 12 projects:**
- CALLS edges: 0 → 9,369 (Laravel 12 project), 4,962 → 35,771 (Laravel 9)
- INHERITS edges: 0 → 481 / 0 → 346
- REFERENCES edges: 2 → 74 / 9 → 54
- Total edges: +226% / +266%
26 new tests covering all phases. 761 total tests pass, 0 regressions.
Update limitations section to reflect PHP/Laravel entry-point detection and add framework-aware parsing row to the features table.
Sync zh-CN, ja-JP, ko-KR, hi-IN with the Framework-aware parsing feature row added to the English README in the previous commit.
Upstream added ^handle$ to the universal _ENTRY_NAME_PATTERNS, so 'handle' now matches all languages — not just PHP. Narrow the negative assertion to boot/register/up which remain PHP-specific.
069bcf1 to 66a48c7
Summary
- `_get_call_name()`, `_get_bases()`, and `_extract_import()` all had no PHP-specific branches, making CALLS / INHERITS / IMPORTS edges completely non-functional for PHP codebases
- Language-scoped entry-point patterns so PHP-specific names (`handle`, `boot`, `register`, `up`/`down`) don't pollute other languages
Motivation
PHP is listed as a supported language, but the parser produced zero CALLS edges and zero INHERITS edges for PHP files. The root cause: tree-sitter-php uses `name` as the AST node type for identifiers (not `identifier` like other grammars), so `_get_call_name()` could never match PHP call expressions. Similarly, `_get_bases()` and `_extract_import()` had no PHP branches, falling through to defaults that produced no useful edges.
Tested on real Laravel 9, 12, and 13 projects.
All edges spot-checked for accuracy — Route→Controller mappings, Eloquent relationships, Filament resource inheritance, and Blade directives all correspond to real code relationships.
Changes
**Phase 1: PHP infrastructure fix (`parser.py`)**
- `_get_call_name()`: PHP-specific branches for 4 call expression types (`function_call_expression`, `member_call_expression`, `scoped_call_expression`, `object_creation_expression`)
- `_get_bases()`: PHP branch for `base_clause` (extends) + `class_interface_clause` (implements)
- `_extract_import()`: PHP branch handling simple, grouped (`use Foo\{A, B}`), and aliased imports
- `_CLASS_TYPES["php"]`: add `trait_declaration`, `enum_declaration`
- `_CALL_TYPES["php"]`: add `scoped_call_expression`, `object_creation_expression`
**Phase 2: Entry points + Blade detection**
- `flows.py`: `_LANG_ENTRY_NAME_PATTERNS` dict for language-scoped patterns; `_matches_entry_name()` accepts optional `language` parameter
- `parser.py`: `detect_language()` checks `.blade.php` compound extension before generic suffix lookup
**Phase 3: Laravel semantic edges (`parser.py`)**
- `_extract_php_constructs()`: Route definitions (`Route::get('/path', [Controller::class, 'method'])`) → CALLS edge to controller method
- Eloquent relationships (`hasMany`, `belongsTo`, etc.; 11 methods) → REFERENCES edge to target model
- `_php_class_from_class_access()`: handles both short (`Post::class`) and FQCN (`\App\Models\Post::class`) forms
**Phase 4: Blade templates + PSR-4 (`parser.py`)**
- `_parse_blade()`: regex-based extraction of `@extends`, `@include`, `@component`, `@livewire` as IMPORTS_FROM / REFERENCES edges
- `_find_php_composer_psr4()`: resolve namespaces to file paths via `composer.json` autoload PSR-4 mappings with caching
**Docs (`README.md`)**
Test plan
- 26 new tests in `test_multilang.py` (TestPHPParsing: 14, TestLaravelParsing: 5, TestBladeParsing: 6) and `test_flows.py` (1)
- `ruff check` clean
- … `language == "php"` or in PHP-only methods
🤖 Generated with Claude Code