Perf: cache class-name resolution and skip redundant return-type body scans#62
Merged
Merged
Conversation
AbstractSniff::getClassNameWithNamespace() and getClassName() walk the token stream backwards from the last token of the file on every call, and are invoked by several sniffs once per T_FUNCTION (MethodTypeHintSniff, ReturnTypeHintSniff, DocBlockReturnSelfSniff, FullyQualifiedClassNameInDocBlockSniff, Psr4Sniff). On large files with many methods this becomes O(methods * file_size). Cache the result by filename so each file resolves at most once. DocBlockReturnSelfSniff::process() called getReturnTypes() unconditionally for every method with a return annotation. getReturnTypes() walks the entire function body looking for return statements. Since the three assertions guarded by it only fire when the annotation contains 'self', '$this', or the own class name, add a cheap pre-filter that skips the body scan for plain types like 'void', 'int', 'string', 'array<...>', etc. Measured on a real-world ~11k-line CakePHP controller with ~100 actions: parallel=1, single file: 40s -> 20s (-50%) parallel=16, whole codebase: 128s -> 90s (-30%) Total CPU across all workers: 355s -> 277s (-22%) Existing test suite (100 tests / 122 assertions) passes unchanged.
dereuromark
added a commit
that referenced
this pull request
May 13, 2026
…dentSniff (#63) * Use O(1) conditions lookups in ConsistentIndentSniff Three helpers in ConsistentIndentSniff (isInsideClosure, isInsideSwitchCase, couldBeInCaseBlock) walked the token stream backwards up to 2000 / 500 / 500 tokens for every indented line. The tokens they searched for are already tracked in the per-token 'conditions' map built by the phpcs PHP tokenizer: - T_CLOSURE: PHP::processAdditional() rewrites the conditions of every token inside an anonymous function from T_FUNCTION to T_CLOSURE (PHP.php around line 2840). - T_SWITCH / T_CASE / T_DEFAULT: all scope openers, so createLevelMap() adds them to inner tokens' conditions. So the inner-loop scans were redundant. Replace with O(depth) walks of the conditions array (typically <10 entries). T_FN is the one exception — arrow functions get scope_opener/scope_closer but don't propagate to inner tokens' conditions — so keep a bounded fallback scan for it. Removed the outdated "Case blocks don't show up in conditions" comment; they do. Measured on the same workload as #62 (CakePHP 5 app, 1095 files, biggest controller 11k lines): parallel=1, single file: 20s -> 16.7s parallel=16, whole codebase: 1m30s -> 1m16s Sniff perf on the slowest file: 4.12s (24%) -> 1.09s (8%). Existing test suite (100 tests / 122 assertions) passes unchanged. Output diff across 1095 real-world files: identical violations. * Cache T_FN scope ranges instead of bounded backward scan Initial draft used a 500-token backward scan to find a containing T_FN (arrow function) when the conditions-based fast path didn't match. Codex review correctly flagged that as a behaviour regression: arrow function bodies can legitimately exceed 500 tokens (a long match expression, a deeply nested array as the body, a long method chain), and the bounded scan would then fail to detect that we're inside the arrow and could emit false-positive over-indentation errors on lines the original implementation correctly skipped. Replace the bounded scan with a per-file cache of [scope_opener, scope_closer] ranges for every T_FN token. Built lazily on first use in O(file) and checked in O(arrow_count) per query (typically a handful of arrows per file). No scan limit, no false negatives. Cache stores the token count alongside the ranges so it self- invalidates if the file is re-tokenized inside a phpcbf fix loop. Net effect vs the bounded-scan draft: parallel=1 ResidentsController.php: 16.7s -> 6.5s parallel=16 whole codebase wall: 1m16s -> 27.9s The big extra speedup comes from cheap empty-array iteration in files with no arrow functions at all, which was previously paying for a 500-token scan per indented line.
This was referenced May 13, 2026
dereuromark
added a commit
that referenced
this pull request
May 13, 2026
UseStatementSniff has its own getUseStatements() implementation that duplicates UseStatementsTrait::getUseStatements() (cached in #64). It already has an instance-level cache via existingStatements, which covers repeated calls within a single phpcs pass; the new static cache adds coverage across phpcbf fix iterations, where populateTokenListeners() creates fresh sniff instances per pass and resets the instance cache. Cache invalidation follows the same fingerprint-based scheme as #64: token count alone is not strong enough, since an alias rename keeps it constant. Cached entries record a content fingerprint of each use statement range and re-verify them against the live tokens before being trusted. The cache also refuses to serve an empty result so it cannot return stale state for a file where a fix added a first use statement while another simultaneous fix happened to keep the file's overall token count unchanged. Measured on the same CakePHP 5 app from #62 / #63 / #64 / #65 (parallel=1, --report=performance): PhpCollective.Namespaces.UseStatement 3.36s -> 2.08s Existing test suite (100 tests / 122 assertions) passes unchanged. The FQCN cache changes from the earlier draft of this PR were dropped in favour of the cache that landed via #65.
dereuromark
added a commit
that referenced
this pull request
May 13, 2026
…#66) UseStatementSniff has its own getUseStatements() implementation that duplicates UseStatementsTrait::getUseStatements() (cached in #64). It already has an instance-level cache via existingStatements, which covers repeated calls within a single phpcs pass; the new static cache adds coverage across phpcbf fix iterations, where populateTokenListeners() creates fresh sniff instances per pass and resets the instance cache. Cache invalidation follows the same fingerprint-based scheme as #64: token count alone is not strong enough, since an alias rename keeps it constant. Cached entries record a content fingerprint of each use statement range and re-verify them against the live tokens before being trusted. The cache also refuses to serve an empty result so it cannot return stale state for a file where a fix added a first use statement while another simultaneous fix happened to keep the file's overall token count unchanged. Measured on the same CakePHP 5 app from #62 / #63 / #64 / #65 (parallel=1, --report=performance): PhpCollective.Namespaces.UseStatement 3.36s -> 2.08s Existing test suite (100 tests / 122 assertions) passes unchanged. The FQCN cache changes from the earlier draft of this PR were dropped in favour of the cache that landed via #65.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two small changes that cut whole-codebase
composer cs-checktime by ~30% and the slowest single file by ~50% on real-world large CakePHP code, with no behaviour change.1.
AbstractSniff: per-file cache for class-name resolutiongetClassNameWithNamespace()calls$phpcsFile->findPrevious([T_CLASS, T_TRAIT, T_INTERFACE, T_ENUM], $lastToken)— i.e. it scans backwards from the last token of the file to find the class declaration, which sits at the top. The result is stable for the lifetime of a phpcs run on a given file.It is invoked indirectly by every sniff that needs the current class name, once per
T_FUNCTION:MethodTypeHintSniff::process->getCurrentClassName()ReturnTypeHintSniff::assertNotThisOrStatic->getClassNameWithNamespace()DocBlockReturnSelfSniff::fixClassToThis->getClassName()FullyQualifiedClassNameInDocBlockSniff::findInSameNameSpace->getNamespace()Psr4Sniff::process->getClassName()On an 11k-line controller with ~100 methods, that is hundreds of full-file walks per file. Caching by filename turns this into one walk per file. Cache key is
$phpcsFile->getFilename(); entries live for the lifetime of the worker process, which is bounded by the file set assigned to it.2.
DocBlockReturnSelfSniff: pre-filter before expensive return-type scangetReturnTypes()walks the entire function body looking for return statements. The originalprocess()called it unconditionally for every method with a return docblock entry. But the three assertions guarded by it only fire when the annotation containsself,$this, or the own class name — so for plain types likevoid,int,string,array<...>, all that work was wasted.Added a cheap pre-filter on the
partsarray before the body scan. Skips the scan for the common case.Benchmarks
Measured on a 1095-file CakePHP 5 app whose biggest controller is
ResidentsController.phpat 11,111 lines / ~100 actions, onsquizlabs/php_codesniffer:^4.0.1andslevomat/coding-standard:^8.23.0:parallel=1, ResidentsController.php aloneparallel=16, whole codebase (wall)parallel=16, whole codebase (CPU)Top entries in the perf report on the slowest file went from:
PhpCollective.Commenting.DocBlockReturnSelf5.86s (32%),PhpCollective.Classes.MethodTypeHint3.27s (18%)The 50% gain on the biggest file is the key win — it eliminates the load-imbalance bottleneck that previously prevented
parallel=16from scaling much beyond 2.7x.Verification
composer cs-checkandcomposer stanon this repo: clean.self,$this, and fully-qualified class name in return docblocks: identical 3 errors reported before vs after, with both fixable markers preserved.