Blasp v4: Driver-based architecture rewrite #48
…ction
Adds a `Blaspable` trait that hooks into the Eloquent `saving` event to automatically check and sanitize (or reject) profanity on specified model attributes. Supports per-model language, mask, and mode overrides.
- Blaspable trait with sanitize/reject modes and helper methods
- ProfanityRejectedException for reject mode
- ModelProfanityDetected event fired on detection
- `model.mode` config key in blasp.php
- 21 tests covering all trait functionality
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove all v3-era source files that have been replaced by the new v4 architecture: Abstracts, Config, Contracts, Facades, Generators, Normalizers, Registries, and the monolithic BlaspService/ProfanityDetector. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New modular core with Analyzer, Dictionary, Result, and driver-based detection (RegexDriver, PatternDriver). Includes normalizers per language, configurable masking strategies, severity levels, and false positive filtering. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BlaspManager with fluent PendingCheck API, Facade, ServiceProvider, middleware, validation rule, artisan commands (clear, test, languages), events (ProfanityDetected, ContentBlocked), and BlaspFake for testing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update composer.json laravel extra to point to new BlaspServiceProvider and Facade namespaces. Add severity tiers to English language config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Migrate all tests to use the new v4 Facade, PendingCheck fluent API, and Result methods. Simplify TestCase base class to use BlaspServiceProvider. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Full rewrite covering the new driver architecture, fluent API, Result object, Blaspable trait, middleware, validation rules, testing utilities, events, artisan commands, configuration reference, and v3 migration guide. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register 'blasp' as a short middleware alias, add @clean Blade directive for XSS-safe profanity masking in views, and register isProfane/cleanProfanity macros on Str and Stringable for fluent usage. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Move all classes from Blaspsoft\Blasp\Laravel\* to Blaspsoft\Blasp\* and update imports across src and tests to match. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Catches sound-alike profanity evasions (e.g. "phuck", "fuk", "sheit") that bypass the regex and pattern drivers. Uses PHP's metaphone() for indexing and levenshtein() for confirmation, with a curated false-positive list to protect common words like "fork", "duck", and "beach". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
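The metaphone-plus-Levenshtein idea in the commit above can be sketched as follows. This is a minimal illustration, not the package's PhoneticDriver: the function name `soundsLikeProfanity`, the word lists, and the distance threshold of 2 are all assumptions made for the example.

```php
<?php

/**
 * Hypothetical helper (not part of Blasp): returns the dictionary word that
 * $candidate sounds like, or null when nothing matches.
 */
function soundsLikeProfanity(
    string $candidate,
    array $profanities,
    array $falsePositives,
    int $maxDistance = 2
): ?string {
    $candidate = strtolower($candidate);

    // Curated false positives are protected before any phonetic work happens
    if (in_array($candidate, $falsePositives, true)) {
        return null;
    }

    // Index the dictionary by metaphone key so lookups touch few candidates
    $index = [];
    foreach ($profanities as $word) {
        $index[metaphone($word)][] = $word;
    }

    // Same sound is only a hint; edit distance confirms the spelling is close
    foreach ($index[metaphone($candidate)] ?? [] as $word) {
        if (levenshtein($candidate, $word) <= $maxDistance) {
            return $word;
        }
    }

    return null;
}

$hit = soundsLikeProfanity('phuck', ['fuck'], ['fork', 'duck']);
$protected = soundsLikeProfanity('fork', ['fuck'], ['fork', 'duck']);
$clean = soundsLikeProfanity('hello', ['fuck'], ['fork', 'duck']);
```

The two-stage check keeps the cheap phonetic lookup from flagging words that merely share a sound: the Levenshtein threshold is what decides whether a candidate is reported.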
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Allows combining regex, pattern, and phonetic drivers so a single
check() call catches obfuscated text, exact matches, and sound-alikes
in one pass. Supports config-based (`driver('pipeline')`) and ad-hoc
(`pipeline('regex', 'phonetic')`) usage with union merge semantics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
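A rough sketch of what union merge semantics could look like. The `runPipeline` function, the match array shape, and the two toy drivers are hypothetical stand-ins; the commit does not describe how Blasp deduplicates matches, so keying by position is an assumption.

```php
<?php

/**
 * Hypothetical union merge: run every driver and keep each distinct match.
 * Each driver is a callable(string): array of ['word', 'start', 'length'].
 */
function runPipeline(string $text, callable ...$drivers): array
{
    $merged = [];
    foreach ($drivers as $driver) {
        foreach ($driver($text) as $match) {
            // Key by span so two drivers reporting the same span count once
            $key = $match['start'] . ':' . $match['length'];
            $merged[$key] ??= $match;
        }
    }

    // Return matches in the order they appear in the text
    usort($merged, fn (array $a, array $b): int => $a['start'] <=> $b['start']);

    return $merged;
}

// Toy drivers standing in for an exact matcher and a sound-alike matcher
$exact = fn (string $t): array => ($p = stripos($t, 'damn')) !== false
    ? [['word' => 'damn', 'start' => $p, 'length' => 4]]
    : [];
$soundAlike = fn (string $t): array => ($p = stripos($t, 'darn')) !== false
    ? [['word' => 'darn', 'start' => $p, 'length' => 4]]
    : [];

$matches = runPipeline('darn it, damn it', $exact, $soundAlike, $exact);
```

Passing `$exact` twice shows the union behaviour: the duplicate span is reported once.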
Add severity maps (mild/moderate/extreme) for Spanish, French, and German so withSeverity() filtering works correctly for all languages instead of defaulting everything to High. Implement result caching in PendingCheck — check() results are cached by a hash of all parameters (text, driver, language, severity, allow/block lists, mask strategy). CallbackMask bypasses cache since closures can't serialize. Add Result::fromArray() for deserialization, extend Dictionary::clearCache() to also clear result cache, and add cache.results config toggle. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
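Caching by a hash of all parameters might look like the sketch below. The function name `resultCacheKey`, the `blasp_result_` prefix, and the choice of SHA-256 over JSON are assumptions for illustration; the commit only states that the key is a hash of the listed parameters.

```php
<?php

/**
 * Hypothetical cache-key builder: hashes every parameter that can change
 * the outcome of a check, so different configurations never share an entry.
 */
function resultCacheKey(
    string $text,
    string $driver,
    string $language,
    ?string $severity,
    array $allowList,
    array $blockList,
    string $maskStrategy
): string {
    // Sort the lists so ['a', 'b'] and ['b', 'a'] map to the same key
    sort($allowList);
    sort($blockList);

    $payload = json_encode(
        [$text, $driver, $language, $severity, $allowList, $blockList, $maskStrategy],
        JSON_INVALID_UTF8_SUBSTITUTE
    );

    return 'blasp_result_' . hash('sha256', $payload);
}

$a = resultCacheKey('some text', 'regex', 'english', 'high', ['x', 'y'], [], 'character');
$b = resultCacheKey('some text', 'regex', 'english', 'high', ['y', 'x'], [], 'character');
$c = resultCacheKey('some text', 'phonetic', 'english', 'high', ['x', 'y'], [], 'character');
```

A closure-based mask has no stable serialized form to feed into such a hash, which is consistent with the commit's note that CallbackMask bypasses the cache.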
…lists Non-English severity maps (Spanish, French, German) only had 3 tiers (mild, moderate, extreme) while English had 4. Added 'high' tier with representative strong profanity words to each. Also added 39 words that appeared in severity maps but were missing from profanities arrays (21 English, 5 French, 13 German), which meant they could never be detected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Dictionary: sanitize language parameter to prevent path traversal
via loadLanguageConfig(), forLanguage(), and forLanguages()
- TestCommand: rename --verbose to --detail to avoid conflict with
Symfony Console's built-in -v|--verbose flag
- PatternDriver, PhoneticDriver, RegexDriver: convert PREG_OFFSET_CAPTURE
byte offsets to character offsets for correct multibyte string handling
- PatternDriver, PhoneticDriver, RegexDriver: apply severity filter before
masking so low-severity words aren't masked in cleanText when filtered out
- Blasp facade: throw RuntimeException in assertChecked() and
assertCheckedTimes() when fake() hasn't been called, instead of silently
passing
- Profanity rule: convert static factory methods to instance methods with
__callStatic for backward compat, enabling chaining like
Profanity::in('spanish')->severity(Severity::High)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
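The byte-to-character offset conversion mentioned in the list above can be illustrated with a small sketch. The helper name `byteOffsetToCharOffset` is hypothetical, and the drivers may implement the conversion differently.

```php
<?php

/**
 * Hypothetical helper: converts a byte offset reported by
 * PREG_OFFSET_CAPTURE into a character offset usable with mb_substr().
 */
function byteOffsetToCharOffset(string $subject, int $byteOffset): int
{
    // Count the characters in the bytes that precede the match
    return mb_strlen(substr($subject, 0, $byteOffset), 'UTF-8');
}

$text = 'Eres un cabrón, idiota';

// preg_* functions always report offsets in bytes, even with the /u flag
preg_match('/idiota/u', $text, $m, PREG_OFFSET_CAPTURE);
[$word, $byteOffset] = $m[0];

$charOffset = byteOffsetToCharOffset($text, $byteOffset);
$slice = mb_substr($text, $charOffset, mb_strlen($word, 'UTF-8'), 'UTF-8');
```

Because `ó` occupies two bytes in UTF-8, the byte offset is one greater than the character offset here; using the byte offset with `mb_substr()` would mask the wrong span.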
- Remove 'knob' from false_positives list (conflicts with profanities)
- PatternDriver: deduplicate overlapping matches before masking to prevent double-masking (e.g., "motherfucker" matching both "motherfucker" and "fuck")
- PhoneticDriver, RegexDriver: pass byte offsets to FalsePositiveFilter methods (isInsideHexToken, isSpanningWordBoundary, getFullWordContext) which use byte-level operations, while keeping character offsets for MatchedWord positions and mb_substr masking
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…n PatternDriver Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…icDriver matches Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces unbounded lazy quantifier (*?) with {0,3} in the separator
expression between profanity characters. This prevents PHP-FPM worker
segfaults caused by PCRE JIT stack overflow when processing 1,300+
complex patterns with nested lazy quantifiers.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…tack overflow
Each branch in the separator group now matches exactly one character,
with the outer {0,3}? handling repetition. Removes redundant (?:\s)
alternative since \s is already in the character class.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
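The shape of the bounded separator described in the two commits above can be sketched as follows. The `SEPARATOR` character class and the `buildBoundedPattern` helper are assumptions for illustration; the actual separator expression in Blasp is not shown in this conversation. The sketch only demonstrates the `{0,3}?` bound, not the PCRE JIT failure itself.

```php
<?php

// Hypothetical separator class: each repetition matches exactly one character
const SEPARATOR = '[\s\-_.*]';

/**
 * Hypothetical builder: allows at most three separator characters
 * between consecutive letters of the word, matched lazily.
 */
function buildBoundedPattern(string $word): string
{
    $chars = array_map(
        fn (string $c): string => preg_quote($c, '/'),
        mb_str_split($word)
    );

    // {0,3}? replaces an unbounded *? so backtracking depth stays small
    return '/' . implode('(?:' . SEPARATOR . '){0,3}?', $chars) . '/iu';
}

$pattern = buildBoundedPattern('damn');

$plain = preg_match($pattern, 'damn');       // no separators
$spaced = preg_match($pattern, 'd-a_m.n');   // one separator between letters
$tooFar = preg_match($pattern, 'd----amn');  // four separators exceed the bound
```

The trade-off is explicit: text obfuscated with more than three separators between letters no longer matches, in exchange for a hard limit on how much work each pattern can do.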
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
📝 Walkthrough
Validates pipeline driver config, guards against self-referential pipeline names, early-exits non-string validator inputs, preserves nested disable state, improves UTF-8 handling in matchers/drivers, reorders severity filtering before deduplication/masking, adjusts middleware field selection order, trims language lists, and limits tracked cache keys.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 13
Note
Due to the large number of review comments, Critical, Major severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
config/languages/spanish.php (1)
149-149: ⚠️ Potential issue | 🟡 Minor
Duplicate entry: `cabronazo` appears twice in the profanities list.
This term is listed at both line 149 and line 186.
Proposed fix
  'cabronazo',
- 'hijoelagranputa',
+ 'hijoelagranputa',
Remove one of the duplicate `cabronazo` entries (line 149 or 186).
Also applies to: 186-186
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/languages/spanish.php` at line 149, The Spanish profanities list contains a duplicated entry 'cabronazo'; remove one of the two occurrences so the array contains a single 'cabronazo' entry (locate the profanities array where 'cabronazo' appears and delete either the item at the earlier occurrence or the later one), ensuring the array syntax/commas remain valid after removal.
src/Core/Normalizers/SpanishNormalizer.php (1)
21-33: ⚠️ Potential issue | 🟠 Major
Handle `preg_replace_callback()` failure paths to maintain string contract.
Both `preg_replace_callback()` calls (lines 21 and 28) can return `null` on PCRE errors or invalid UTF-8. The first failure leaves `$normalizedString` as `null`, which the second callback receives instead of a string, and the method's return type declaration (`string` at line 7) is violated when `null` is returned at line 35.
Proposed fix
- $normalizedString = preg_replace_callback('/\bll(?=[aeiouáéíóúü])/i', function ($matches) {
+ $normalizedString = preg_replace_callback('/\bll(?=[aeiouáéíóúü])/i', function ($matches) {
      $match = $matches[0];
      if ($match === 'LL') return 'Y';
      if ($match === 'Ll') return 'Y';
      return 'y';
- }, $normalizedString);
+ }, $normalizedString) ?? $normalizedString;
- $normalizedString = preg_replace_callback('/rr/i', function ($matches) {
+ $normalizedString = preg_replace_callback('/rr/i', function ($matches) {
      $match = $matches[0];
      if ($match === 'RR') return 'R';
      if ($match === 'Rr') return 'R';
      return 'r';
- }, $normalizedString);
+ }, $normalizedString) ?? $normalizedString;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Normalizers/SpanishNormalizer.php` around lines 21 - 33, The preg_replace_callback calls can return null and thus break the method's string return contract; change each assignment to use a temporary variable (e.g. $result = preg_replace_callback(...)) and then check if $result === null — if so, keep the previous string value of $normalizedString (or cast to string) and log/handle the PCRE failure as appropriate; otherwise assign $normalizedString = $result. Apply this for both callbacks that touch $normalizedString so the method (in SpanishNormalizer, the function using $normalizedString and declared to return string) never ends up returning null.
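The null-fallback pattern this comment asks for can be shown in isolation. The wrapper name `safeReplaceCallback` is hypothetical, and the `/u` flag in the usage lines is an assumption added so that invalid UTF-8 actually triggers a PCRE error in the demonstration.

```php
<?php

/**
 * Hypothetical wrapper: keeps the previous string when PCRE fails,
 * so callers always receive a string rather than null.
 */
function safeReplaceCallback(string $pattern, callable $callback, string $subject): string
{
    $result = preg_replace_callback($pattern, $callback, $subject);

    // preg_replace_callback() returns null on PCRE errors such as bad UTF-8
    return $result ?? $subject;
}

$collapse = fn (array $m): string => 'r';

// Valid input: the replacement is applied
$ok = safeReplaceCallback('/rr/iu', $collapse, 'perro');

// Invalid UTF-8 with /u: PCRE fails, so the original bytes are returned
$broken = "pe\xFFrro";
$kept = safeReplaceCallback('/rr/iu', $collapse, $broken);
$error = preg_last_error();
```

Checking `preg_last_error()` right after the call, as in the last line, is one way to log the failure instead of silently falling back.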
🟡 Minor comments (12)
src/Core/Matchers/FalsePositiveFilter.php-125-139 (1)
125-139: ⚠️ Potential issue | 🟡 Minor
Use the `/\w/u` regex flag for proper Unicode word boundary detection in `getFullWordContext()`.
The method uses the `/\w/` pattern without the `/u` flag to detect word character boundaries. This matches only ASCII word characters `[A-Za-z0-9_]`, not Unicode letters. When expanding to full word context for strings containing accented or non-ASCII word characters, the regex will stop at the first non-ASCII character, potentially producing incorrect context. Add the `/u` flag to the regex patterns on lines 130 and 134 to properly support UTF-8 word boundaries.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Matchers/FalsePositiveFilter.php` around lines 125 - 139, In getFullWordContext, the preg_match calls using '/\w/' only match ASCII word characters and break on UTF-8; update both regexes in the left and right expansion loops to use the Unicode flag (e.g. change '/\w/' to '/\w/u') so preg_match properly recognizes non-ASCII word characters when expanding the left and right bounds around the match.
tests/BlaspCheckTest.php-163-163 (1)
163-163: ⚠️ Potential issue | 🟡 Minor
Typo in test method name: `boudary` → `boundary`.
Proposed fix
- public function test_word_boudary()
+ public function test_word_boundary()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/BlaspCheckTest.php` at line 163, Rename the test method test_word_boudary to test_word_boundary to fix the typo; update the method declaration and any references/annotations that call or refer to test_word_boudary so PHPUnit runs the corrected test name, ensuring the function signature in BlaspCheckTest (public function test_word_boudary) is changed to public function test_word_boundary.
tests/BlaspCheckTest.php-175-175 (1)
175-175: ⚠️ Potential issue | 🟡 Minor
Typo in test method name: `pural` → `plural`.
Proposed fix
- public function test_pural_profanity()
+ public function test_plural_profanity()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/BlaspCheckTest.php` at line 175, Rename the test method named test_pural_profanity to test_plural_profanity (fixing the "pural" → "plural" typo) and update any references to that method (calls, annotations, or data providers) so PHPUnit discovers and runs it correctly; ensure method name in class BlaspCheckTest and any related docblocks or test-suite references are adjusted accordingly.
tests/BlaspCheckTest.php-193-193 (1)
193-193: ⚠️ Potential issue | 🟡 Minor
Typo in test method name: `subtitution` → `substitution`.
Proposed fix
- public function test_ass_subtitution()
+ public function test_ass_substitution()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/BlaspCheckTest.php` at line 193, Rename the test method test_ass_subtitution to test_ass_substitution to fix the typo; update the method declaration in BlaspCheckTest (and any references, data providers, or annotations that refer to test_ass_subtitution) so the test runner and any callers use the corrected name test_ass_substitution.
tests/CacheDriverConfigurationTest.php-18-32 (1)
18-32: ⚠️ Potential issue | 🟡 Minor
These tests don't actually prove the cache behavior yet.
`test_dictionary_can_be_created_without_cache()` never asserts that no cache entries were written, and both clear-cache tests call `Dictionary::clearCache()` before warming any dictionary. They will still pass if caching is broken or `clearCache()` is a no-op. Prime a language first and assert the registry changes around the clear call.
🧪 Suggested test hardening
  public function test_dictionary_can_be_created_without_cache(): void
  {
      Config::set('blasp.cache.driver', null);
      $dictionary = Dictionary::forLanguage('english');
      $this->assertNotNull($dictionary);
      $this->assertNotEmpty($dictionary->getProfanities());
+     $this->assertFalse(Cache::has('blasp_cache_keys'));
  }
  public function test_clear_cache_works(): void
  {
+     Config::set('blasp.cache.driver', 'array');
+     Dictionary::forLanguage('english');
+     $this->assertNotEmpty(Cache::store('array')->get('blasp_cache_keys', []));
+
      Dictionary::clearCache();
      $this->assertFalse(Cache::has('blasp_cache_keys'));
  }
  ...
  public function test_clear_cache_with_custom_driver(): void
  {
      Config::set('blasp.cache.driver', 'array');
+     Dictionary::forLanguage('english');
+     $this->assertNotEmpty(Cache::store('array')->get('blasp_cache_keys', []));
      Dictionary::clearCache();
      $keys = Cache::store('array')->get('blasp_cache_keys', []);
      $this->assertEmpty($keys);
  }
Also applies to: 51-58
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/CacheDriverConfigurationTest.php` around lines 18 - 32, Both tests are not validating caching behavior: update test_dictionary_can_be_created_without_cache to set Config::set('blasp.cache.driver', null), call Dictionary::forLanguage('english') to prime the dictionary, then assert Cache::has('blasp_cache_keys') is false (no entries written) while still asserting $dictionary and getProfanities() are valid; update test_clear_cache_works to first prime the cache by calling Dictionary::forLanguage('english') (or another language), assert Cache::has('blasp_cache_keys') is true, then call Dictionary::clearCache() and assert Cache::has('blasp_cache_keys') is false to prove clearCache() actually removes registry entries (use Dictionary::forLanguage, Dictionary::clearCache, Cache::has and Config::set references).
tests/MultiLanguageProfanityTest.php-29-31 (1)
29-31: ⚠️ Potential issue | 🟡 Minor
Keep a native UTF-8 case in the regression set.
The PR calls out multibyte/UTF-8 fixes, but these assertions now only prove the ASCII transliterations. If accent/ß normalization regresses, this file can still stay green.
Proposed fix
  $testCases = [
      'mierda' => 'Esta es una mierda',
      'joder' => 'No quiero joder',
      'cabron' => 'Eres un cabron',
+     'cabrón' => 'Eres un cabrón',
      'puta' => 'La puta madre',
  ];
@@
  $testCases = [
      'english' => ['FUCK', 'FuCk', 'fUcK'],
-     'spanish' => ['MIERDA', 'MiErDa', 'mIeRdA'],
-     'german' => ['SCHEISSE', 'ScHeIsSe', 'schEISSE'],
+     'spanish' => ['MIERDA', 'MiErDa', 'mIeRdA', 'CABRÓN'],
+     'german' => ['SCHEISSE', 'ScHeIsSe', 'schEISSE', 'SchEiße'],
      'french' => ['MERDE', 'MeRdE', 'mErDe'],
  ];
Also applies to: 93-96
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/MultiLanguageProfanityTest.php` around lines 29 - 31, The test regression set currently uses ASCII transliterations (e.g. "'cabron' => 'Eres un cabron'") which hides multibyte/UTF‑8 behavior; update those entries to use native UTF‑8 characters (for example change "cabron"/"Eres un cabron" to "cabrón"/"Eres un cabrón" and any similar ASCII forms) and apply the same UTF‑8 replacements to the other entries referred to (the ones around the 93-96 block) so the tests exercise accent/ß/multibyte normalization end-to-end.
tests/ConfigurationLoaderTest.php-80-83 (1)
80-83: ⚠️ Potential issue | 🟡 Minor
Seed the cache before asserting `clearCache()`.
As written, this test passes even if `Dictionary::clearCache()` is a no-op, because `blasp_cache_keys` is never created first.
Proposed fix
  public function test_clear_cache()
  {
+     Cache::put('blasp_cache_keys', ['blasp.test']);
+     $this->assertTrue(Cache::has('blasp_cache_keys'));
+
      Dictionary::clearCache();
      $this->assertFalse(Cache::has('blasp_cache_keys'));
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/ConfigurationLoaderTest.php` around lines 80 - 83, The test test_clear_cache currently calls Dictionary::clearCache() then asserts Cache::has('blasp_cache_keys') is false but never seeds that cache key first; modify the test to seed the cache (e.g. Cache::put('blasp_cache_keys', ['dummy' => true]) or Cache::forever('blasp_cache_keys', ['dummy' => true])) before calling Dictionary::clearCache(), then call Dictionary::clearCache() and assert Cache::has('blasp_cache_keys') is false to verify the method actually removes the key.
tests/MultiLanguageProfanityTest.php-43-45 (1)
43-45: ⚠️ Potential issue | 🟡 Minor
Remove the duplicate German test key.
PHP keeps only the last value for duplicate string keys, so one of these cases is silently discarded before the loop runs. Replace it with the UTF-8 variant 'scheiße' (with ß) to add coverage for the multibyte character handling:
Proposed fix
  $testCases = [
-     'scheisse' => 'Das ist scheisse',
-     'scheisse' => 'Das ist scheisse',
+     'scheisse' => 'Das ist scheisse',
+     'scheiße' => 'Das ist scheiße',
      'arsch' => 'Du bist ein arsch',
      'ficken' => 'Ich will ficken',
      'verdammt' => 'Verdammt noch mal',
  ];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/MultiLanguageProfanityTest.php` around lines 43 - 45, The $testCases array in MultiLanguageProfanityTest.php contains duplicate string keys ('scheisse') so PHP drops the first entry; update the array assigned to $testCases to remove the duplicate key and replace one of them with the UTF-8 variant 'scheiße' (use 'scheiße' as the key and a corresponding value like 'Das ist scheiße') so the test covers multibyte character handling; adjust only the $testCases entries (refer to the $testCases variable in this file/test) and keep other test logic unchanged.
tests/ResultCachingTest.php-107-118 (1)
107-118: ⚠️ Potential issue | 🟡 Minor
This test loses the keys it needs to verify.
Lines 112-113 reload `$keys` after `Dictionary::clearCache()`, so the `foreach` iterates the cleared list and never asserts that the previously cached result entries were removed. Keep the pre-clear key set and assert against that.
♻️ Suggested fix
  $keys = Cache::get('blasp_result_cache_keys', []);
  $this->assertNotEmpty($keys);
  Dictionary::clearCache();
- $keys = Cache::get('blasp_result_cache_keys', []);
  $this->assertNull(Cache::get('blasp_result_cache_keys'));
  // Verify the cached result data was also cleared
  foreach ($keys as $key) {
      $this->assertNull(Cache::get($key));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/ResultCachingTest.php` around lines 107 - 118, The test reloads $keys after calling Dictionary::clearCache(), losing the original key list needed for verification; change the test to capture the pre-clear keys (e.g. $originalKeys = $keys) before calling Dictionary::clearCache(), remove the second assignment to $keys, keep the assertion that Cache::get('blasp_result_cache_keys') is null, and iterate $originalKeys in the foreach to assert Cache::get($key) is null for each previously cached entry (references: $keys, $originalKeys, Dictionary::clearCache(), Cache::get()).
src/Core/Result.php-156-157 (1)
156-157: ⚠️ Potential issue | 🟡 Minor
Edge case: Empty text with matches could produce incorrect score.
When `$originalText` is empty (or whitespace-only), `preg_split` returns `[]` with `PREG_SPLIT_NO_EMPTY`, resulting in `count() = 0`. The `max(1, ...)` ensures `$totalWords >= 1`, but if `$words` is non-empty while `$originalText` is empty, the score calculation may not reflect the expected context.
This is defensive but worth documenting the expected behavior when `withMatches()` is called with words but an empty original text.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Result.php` around lines 156 - 157, The current total-words calculation in Result.php can miscount when $originalText is empty but $words is non-empty; modify the logic used before calling Score::calculate so that if trim($originalText) is empty and $words is provided, $totalWords is derived from count($words) (or count(preg_split(..., implode(' ', $words)))) instead of defaulting to 1; update the block that computes $totalWords (the line using preg_split on $originalText ?: implode(' ', $words)) and ensure Score::calculate($matchedWords, $totalWords) receives this corrected value, and add a short comment in the withMatches-related area documenting this edge-case behavior.
src/PendingCheck.php-151-157 (1)
151-157: ⚠️ Potential issue | 🟡 Minor
Unused `$falsePositives` parameter in configure().
The `$falsePositives` parameter is accepted but ignored. This appears to be incomplete backward-compatibility implementation. Either implement false positive filtering or document that this parameter is deprecated/ignored.
🔧 Proposed fix to document or implement
Option 1: Document as ignored (if intentional):
+ /**
+  * @deprecated Use allow() for false positives. The $falsePositives parameter is ignored.
+  */
  public function configure(?array $profanities = null, ?array $falsePositives = null): self
Option 2: Implement the functionality:
  public function configure(?array $profanities = null, ?array $falsePositives = null): self
  {
      if ($profanities !== null) {
          $this->blockList = array_merge($this->blockList, $profanities);
      }
+     if ($falsePositives !== null) {
+         $this->allowList = array_merge($this->allowList, $falsePositives);
+     }
      return $this;
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/PendingCheck.php` around lines 151 - 157, The configure() method currently ignores the $falsePositives parameter; update it to apply false-positive handling by merging $falsePositives into a dedicated property and removing any false positives from the current block list: inside PendingCheck::configure add handling like if ($falsePositives !== null) { $this->falsePositives = array_merge($this->falsePositives ?? [], $falsePositives); $this->blockList = array_values(array_diff($this->blockList, $falsePositives)); } so the unique symbols to change are the configure() method, the $falsePositives parameter, the $this->blockList property, and add/use $this->falsePositives property to store the allowed items.
src/Core/Dictionary.php-49-66 (1)
49-66: ⚠️ Potential issue | 🟡 Minor
Use `mb_strtolower()` for UTF-8 safety.
The PR objectives mention UTF-8/multibyte safety fixes, but this code uses `strtolower()` which doesn't handle multibyte characters correctly. Additionally, line 55 performs a case-sensitive `in_array()` check against `$this->profanities` which may contain mixed-case words.
🛡️ Proposed fix
- $this->allowList = array_map('strtolower', $allowList);
- $this->blockList = array_map('strtolower', $blockList);
+ $this->allowList = array_map(fn($w) => mb_strtolower($w, 'UTF-8'), $allowList);
+ $this->blockList = array_map(fn($w) => mb_strtolower($w, 'UTF-8'), $blockList);
  $this->language = $language;
  // Apply block list — add extra words to profanities
  foreach ($this->blockList as $word) {
-     if (!in_array($word, $this->profanities)) {
+     if (!in_array($word, array_map(fn($p) => mb_strtolower($p, 'UTF-8'), $this->profanities))) {
          $this->profanities[] = $word;
          $this->severityMap[$word] = Severity::High;
      }
  }
  // Remove allow-listed words
  if (!empty($this->allowList)) {
      $this->profanities = array_values(array_filter(
          $this->profanities,
-         fn($p) => !in_array(strtolower($p), $this->allowList)
+         fn($p) => !in_array(mb_strtolower($p, 'UTF-8'), $this->allowList)
      ));
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Dictionary.php` around lines 49 - 66, Replace all uses of strtolower() with mb_strtolower() for UTF-8 safety when normalizing $allowList and $blockList (e.g. the array_map calls that assign $this->allowList and $this->blockList), and ensure comparisons against $this->profanities are done case-insensitively by comparing normalized values (use mb_strtolower on both sides). Specifically, update the block-list loop that checks in_array($word, $this->profanities) to compare mb_strtolower($word) against a lowercased profanities set (or normalize $this->profanities once), and change the allow-list filter closure (fn($p) => !in_array(strtolower($p), $this->allowList)) to use mb_strtolower($p) and mb_strtolower for list entries so all checks are multibyte-safe.
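A small, self-contained illustration of the multibyte-safe comparison discussed in this comment. The `normalizeWords` helper and the sample word lists are hypothetical; only `mb_strtolower()` and the normalize-both-sides idea come from the review.

```php
<?php

/** Hypothetical helper: lowercases every word with multibyte awareness. */
function normalizeWords(array $words): array
{
    return array_map(
        fn (string $w): string => mb_strtolower($w, 'UTF-8'),
        $words
    );
}

$profanities = ['Cabrón', 'SCHEIßE'];
$allowList = normalizeWords(['CABRÓN']);

// Normalize the dictionary side too, so mixed-case entries still compare equal
$remaining = array_values(array_filter(
    $profanities,
    fn (string $p): bool => !in_array(mb_strtolower($p, 'UTF-8'), $allowList, true)
));
```

With byte-oriented `strtolower()` the accented `Ó` may be left untouched depending on PHP version and locale, in which case the allow-list entry would never match its dictionary counterpart.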
🧹 Nitpick comments (16)
src/Core/Matchers/CompoundWordDetector.php (1)
39-43: Use `array_keys()` to avoid the unused `$_` variable.
Static analysis flagged the unused `$_`. Since only the keys are needed, iterating over `array_keys()` is cleaner.
♻️ Proposed fix
- foreach ($profanityExpressions as $profanity => $_) {
-     if (strlen($profanity) >= 3 && stripos($remainder, $profanity) !== false) {
+ foreach (array_keys($profanityExpressions) as $profanity) {
+     if (strlen($profanity) >= 3 && stripos($remainder, $profanity) !== false) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Matchers/CompoundWordDetector.php` around lines 39 - 43, In CompoundWordDetector (the loop over $profanityExpressions), replace the foreach that uses an unused value slot (foreach ($profanityExpressions as $profanity => $_)) by iterating only the keys (e.g., foreach (array_keys($profanityExpressions) as $profanity)) so static analysis warnings go away while preserving the existing logic that checks strlen($profanity) >= 3 and stripos($remainder, $profanity) !== false before returning false.
src/Core/Score.php (1)
7-16: Add type documentation for the `$matchedWords` parameter.
The method assumes array elements are `MatchedWord` instances (accessing `$word->severity->weight()`), but this isn't documented. Consider adding a PHPDoc annotation for IDE support and clarity.
📝 Suggested documentation
+ /**
+  * @param MatchedWord[] $matchedWords
+  */
  public static function calculate(array $matchedWords, int $totalWordCount): int
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Score.php` around lines 7 - 16, Add a PHPDoc type annotation for the $matchedWords parameter on the calculate method to document that the array contains MatchedWord instances (e.g., `@param` MatchedWord[] $matchedWords). Update the docblock above the public static function calculate(array $matchedWords, int $totalWordCount): int to reference the MatchedWord class so IDEs and static analyzers know the element type used when calling $word->severity->weight().
composer.json (1)
18-24: Potential testbench/Laravel version mismatch.
`illuminate/support` supports Laravel 8–12, but `orchestra/testbench ^10.0` only supports Laravel 12. This may cause issues when testing against older Laravel versions in CI. Consider either narrowing `illuminate/support` to `^12.0` or using a wider testbench version range if backward compatibility is needed.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@composer.json` around lines 18 - 24, The composer.json currently allows "illuminate/support" ^8–12 while "orchestra/testbench" is fixed to ^10.0, causing a testbench/Laravel version mismatch; update composer.json so both requirements align—either restrict "illuminate/support" to ^12.0 if you only intend to support Laravel 12, or broaden "orchestra/testbench" to a range that supports the older Laravel majors you need (so the "require" entry for "illuminate/support" and the "require-dev" entry for "orchestra/testbench" reference compatible major versions).
src/Core/Normalizers/EnglishNormalizer.php (1)
5-11: Consider extending or delegating to `NullNormalizer`.
This implementation is identical to `NullNormalizer`. While having separate language-specific normalizer classes allows independent evolution (e.g., adding English-specific transformations later), you could reduce duplication by extending `NullNormalizer` or delegating to it until English-specific logic is needed.
♻️ Optional: extend NullNormalizer
-class EnglishNormalizer implements StringNormalizer
+class EnglishNormalizer extends NullNormalizer
 {
-    public function normalize(string $string): string
-    {
-        return $string;
-    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Normalizers/EnglishNormalizer.php` around lines 5 - 11, The EnglishNormalizer currently duplicates NullNormalizer behavior; update EnglishNormalizer to reuse NullNormalizer by either extending NullNormalizer (e.g., class EnglishNormalizer extends NullNormalizer implements StringNormalizer) or delegating its normalize(string $string): string to an internal NullNormalizer instance, keeping the public normalize method and the EnglishNormalizer class name so language-specific logic can be added later.
config/languages/spanish.php (1)
4-33: Severity map added - verify coverage alignment with profanities list.
The new severity mapping provides good categorization. However, some terms in `profanities` (e.g., `homosexual` at line 74, `jodido`/`jodida` at lines 44-45) are not present in any severity tier. Consider whether all profanities should have a severity assignment for consistent scoring behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/languages/spanish.php` around lines 4 - 33, The severity map ('severity' array) is missing entries for some profanities declared elsewhere (e.g., the profanities list contains "homosexual" and "jodido"/"jodida" which are not present in any severity tier), so update the severity mapping to include those terms with appropriate tiers; locate the 'severity' array in this diff and add the missing tokens (or remove/normalize them from the profanities list if intentionally excluded) so every profanity in the profanities collection has a corresponding severity level and consistent scoring behavior.
src/Core/Normalizers/GermanNormalizer.php (1)
18-23: Case-preserving callback handles common cases; consider edge cases.
The callback handles `SCH`, `Sch`, and defaults to `sh`. Mixed-case variants like `sCH` or `ScH` will normalize to lowercase `sh`. This is likely acceptable for profanity detection, but if full case preservation is desired:
Alternative: preserve casing per character
 $normalizedString = preg_replace_callback('/sch/i', function ($matches) {
     $match = $matches[0];
-    if ($match === 'SCH') return 'SH';
-    if ($match === 'Sch') return 'Sh';
-    return 'sh';
+    return (ctype_upper($match[0]) ? 'S' : 's') . (ctype_upper($match[2]) ? 'H' : 'h');
 }, $normalizedString);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Normalizers/GermanNormalizer.php` around lines 18 - 23, The sch replacement callback in GermanNormalizer (the preg_replace_callback on $normalizedString) only special-cases 'SCH' and 'Sch' and otherwise returns 'sh', which loses mixed-case patterns like 'sCH' or 'ScH'; update the callback to compute the replacement per-character by examining the original $matches[0] characters and producing 's'/'S' + 'h'/'H' according to each source character's case (preserve full per-character casing) so mixed-case inputs map to corresponding mixed-case outputs while keeping the existing checks for 'SCH' and 'Sch'.
src/Console/ClearCommand.php (1)
13-17: Consider adding error handling for cache clear failures.
The command assumes `Dictionary::clearCache()` always succeeds. While cache clearing is typically safe, adding minimal error handling would improve robustness.
♻️ Optional: Add try-catch for robustness
 public function handle(): void
 {
-    Dictionary::clearCache();
-    $this->info('Blasp cache cleared successfully!');
+    try {
+        Dictionary::clearCache();
+        $this->info('Blasp cache cleared successfully!');
+    } catch (\Throwable $e) {
+        $this->error('Failed to clear cache: ' . $e->getMessage());
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Console/ClearCommand.php` around lines 13 - 17, Wrap the call to Dictionary::clearCache() in a try-catch inside ClearCommand::handle so failures are caught; on success keep $this->info('Blasp cache cleared successfully!'), on exception call $this->error(...) with the exception message (e.g. $e->getMessage()) and return a non-zero status (return 1) so the command reports failure, otherwise return 0. Ensure you reference Dictionary::clearCache, ClearCommand::handle, $this->info and $this->error when making the change.
tests/AllLanguagesDetectionTest.php (1)
102-118: Use accented spellings here so this remains an end-to-end normalizer test.
Line 103 says “umlauts and eszett”, but `scheisse` is already the ASCII fallback; Line 114 calls out accents, but `connard` has none. A broken normalizer wiring in `Blasp::...()->check()` would still pass. Swap at least one case to `scheiße` and an accented French term like `enculé` or `mèrde`.
♻️ Suggested update
-        $germanTests = ['scheisse', 'Scheisse', 'SCHEISSE'];
+        $germanTests = ['scheiße', 'Scheiße', 'SCHEISSE'];
 ...
-        $frenchTests = ['connard', 'CONNARD', 'Connard'];
+        $frenchTests = ['enculé', 'ENCULÉ', 'Enculé'];
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/AllLanguagesDetectionTest.php` around lines 102 - 118, The test uses ASCII fallbacks so the normalization pipeline isn't truly exercised; update the test inputs used with Blasp::german()->check and Blasp::french()->check to include at least one real-accented form (e.g., replace or add "scheiße" among $germanTests and replace or add a French accented term like "enculé" in $frenchTests) so the end-to-end normalizer is validated rather than only ASCII variants.
tests/DetectionStrategyRegistryTest.php (1)
47-60: Consider suppressing the unused parameter warnings or using named underscore prefixes.
The anonymous class implementing `DriverInterface` has unused parameters (`$app`, `$dictionary`, `$mask`, `$options`) flagged by PHPMD. While these are required by the interface contract, you could improve clarity by using underscore-prefixed names to signal intentional non-use.
🔧 Optional: Use underscore prefix for intentionally unused parameters
-        $this->manager->extend('custom', function ($app) {
-            return new class implements DriverInterface {
-                public function detect(string $text, Dictionary $dictionary, MaskStrategyInterface $mask, array $options = []): Result
+        $this->manager->extend('custom', function ($_app) {
+            return new class implements DriverInterface {
+                public function detect(string $text, Dictionary $_dictionary, MaskStrategyInterface $_mask, array $_options = []): Result
                 {
                     return new Result($text, $text, [], 0);
                 }
             };
         });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/DetectionStrategyRegistryTest.php` around lines 47 - 60, The anonymous DriverInterface implementation in the test (inside test_extend_registers_custom_driver and the closure passed to $this->manager->extend) has unused parameters in detect causing PHPMD warnings; update the detect signature to mark them intentionally unused by renaming parameters to _app, _dictionary, _mask, and _options (or add a PHPMD suppression comment on the anonymous class) so the interface contract is preserved but static analysis warnings are silenced.
tests/SeverityMapTest.php (1)
8-8: Missing TestCase import.
The class extends `TestCase` but there's no explicit import statement. This relies on the class being in the same namespace (`Blaspsoft\Blasp\Tests`), which should work if `TestCase.php` exists there, but explicit imports improve clarity.
♻️ Add explicit import for clarity
 use Blaspsoft\Blasp\Facades\Blasp;
 use Blaspsoft\Blasp\Enums\Severity;
+use Blaspsoft\Blasp\Tests\TestCase;

 class SeverityMapTest extends TestCase
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/SeverityMapTest.php` at line 8, The test class SeverityMapTest currently extends TestCase without an explicit import; add a use statement to import the TestCase class used by the tests (e.g., add "use PHPUnit\Framework\TestCase;" or the project's TestCase namespace if different) so the class reference is explicit and clear, ensuring the file defines SeverityMapTest extends TestCase with the corresponding use line at the top.
src/Core/Matchers/RegexMatcher.php (1)
103-112: Verify default quantifier usage.
The default quantifier `*?` could potentially cause performance issues with certain inputs. While all current callers in this file pass explicit quantifiers (`''`, `'+'`), external consumers could trigger the default.
Consider making the quantifier parameter required or documenting that `*?` is the default:
♻️ Proposed documentation improvement
-    private function generateEscapedExpression(array $characters = [], array $escapedCharacters = [], string $quantifier = '*?'): string
+    /**
+     * @param string $quantifier Repetition quantifier (default '*?' for zero-or-more lazy)
+     */
+    private function generateEscapedExpression(array $characters = [], array $escapedCharacters = [], string $quantifier = '*?'): string
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Matchers/RegexMatcher.php` around lines 103 - 112, The method generateEscapedExpression currently uses a risky default quantifier '*?' which may cause performance problems; make the quantifier explicit by removing the default and requiring callers to pass it (update the signature of generateEscapedExpression to require string $quantifier), then update all call sites in this class to pass their explicit quantifiers (noting existing callers already pass '' and '+'), and add/update the PHPDoc for generateEscapedExpression to state that the quantifier must be provided and what values are expected.
src/PendingCheck.php (1)
241-259: Lenient mode silently overrides driver selection.
When `lenientMode` is true, the driver is forced to `'pattern'` regardless of any explicit `driver()` call. This could be surprising to users who call `->driver('phonetic')->lenient()`.
Consider documenting this behavior or logging a debug message:
 if ($this->lenientMode) {
+    // Lenient mode uses pattern driver for looser matching
     $driverName = 'pattern';
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/PendingCheck.php` around lines 241 - 259, The current resolveDriver() silently forces the driver to 'pattern' whenever $this->lenientMode is true, overriding any explicit ->driver() selection; update resolveDriver() so that it only falls back to the 'pattern' driver when lenientMode is true AND no explicit driver was chosen (i.e. $this->driverName is null), or alternatively emit a debug/log message when lenientMode overrides an explicit driver; adjust the logic around $driverName, keeping references to resolveDriver(), $this->driverName, $this->lenientMode, manager->getDefaultDriver(), and the 'pattern' literal to implement the conditional override or add a logging statement.
tests/PhoneticDriverTest.php (1)
112-126: Consider more specific assertion for phonetic variant detection.
The test uses `str_contains` to check if any matched word contains 'fuck' or 'phuk'. While flexible, this could match unintended substrings. Consider asserting on the exact base word if the dictionary/matcher behavior is deterministic.
-        $matched = false;
-        foreach ($result->uniqueWords() as $word) {
-            if (str_contains($word, 'fuck') || str_contains($word, 'phuk')) {
-                $matched = true;
-                break;
-            }
-        }
-        $this->assertTrue($matched, 'Expected a fuck/phuk variant in uniqueWords: ' . implode(', ', $result->uniqueWords()));
+        $this->assertContains('fuck', $result->uniqueWords(), 'Expected base word "fuck" in uniqueWords');
However, if the base word can legitimately vary, the current approach is acceptable.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/PhoneticDriverTest.php` around lines 112 - 126, The test_detects_phonetic_evasion currently uses str_contains on each entry of $result->uniqueWords() which may produce false positives; update the assertion to check for exact matches against the expected deterministic base word(s) instead: retrieve the array from $result->uniqueWords() and assert that it contains one of the exact allowed variants (e.g., 'fucking' or 'phuking' or whatever the phonetic driver is expected to produce) using equality comparisons or in_array, or if the dictionary can legitimately vary keep a small explicit whitelist of acceptable exact base words and assert intersection with that whitelist is non-empty; locate this change in the test_detects_phonetic_evasion function and replace the str_contains loop with an exact-match check against the whitelist of expected phonetic variants.
src/Core/Dictionary.php (3)
194-198: Use `mb_strtolower()` in `getSeverity()` for consistency.
For UTF-8 safety, this should use `mb_strtolower()` to match the multibyte handling elsewhere in the codebase.
♻️ Proposed fix
 public function getSeverity(string $word): Severity
 {
-    $lower = strtolower($word);
+    $lower = mb_strtolower($word, 'UTF-8');
     return $this->severityMap[$lower] ?? Severity::High;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Dictionary.php` around lines 194 - 198, The getSeverity method uses strtolower which is not multibyte-safe; change it to use mb_strtolower to match the rest of the codebase and ensure UTF-8 characters are handled correctly. Update the implementation in function getSeverity(string $word): Severity to call mb_strtolower($word) (optionally passing 'UTF-8') before looking up $this->severityMap[$lower]; keep the fallback to Severity::High unchanged so behavior remains the same for missing entries.
294-318: Use `mb_strtolower()` in `buildSeverityMap()` for UTF-8 safety.
Lines 302 and 310 use `strtolower()` which isn't multibyte-safe. This should be consistent with the PR's UTF-8 safety goals.
♻️ Proposed fix
 private static function buildSeverityMap(array $config): array
 {
     $map = [];
     if (isset($config['severity']) && is_array($config['severity'])) {
         foreach ($config['severity'] as $level => $words) {
             $severity = Severity::tryFrom($level) ?? Severity::High;
             foreach ($words as $word) {
-                $map[strtolower($word)] = $severity;
+                $map[mb_strtolower($word, 'UTF-8')] = $severity;
             }
         }
     }
     // Words only in profanities (not in severity map) default to High
     if (isset($config['profanities'])) {
         foreach ($config['profanities'] as $word) {
-            $lower = strtolower($word);
+            $lower = mb_strtolower($word, 'UTF-8');
             if (!isset($map[$lower])) {
                 $map[$lower] = Severity::High;
             }
         }
     }
     return $map;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Dictionary.php` around lines 294 - 318, In buildSeverityMap(), strtolower() is used on words from $config['severity'] and $config['profanities'], which is not multibyte-safe; replace those calls with mb_strtolower($word, 'UTF-8') (or mb_strtolower($word) with default encoding if project-wide) so UTF-8 characters are handled correctly when populating the $map and checking !isset($map[$lower]); ensure the same change is made for both places where strtolower() is used in this method.
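To illustrate why the two `Dictionary` comments above matter, here is a minimal standalone sketch. The `severityKey()` helper and the `$severityMap` array are hypothetical stand-ins, not the package's actual code; the point is only that a map built from lowercase UTF-8 keys needs a multibyte-aware lowercase on lookup as well.

```php
<?php

// Hypothetical helper: normalizes a word into a severity-map lookup key.
// mb_strtolower() with an explicit encoding lowercases multibyte letters
// such as "É", which the byte-oriented strtolower() does not handle reliably.
function severityKey(string $word): string
{
    return mb_strtolower($word, 'UTF-8');
}

// Hypothetical severity map, keyed the same way at build time.
$severityMap = [
    severityKey('Enculé') => 'high',
];

// An upper-case variant resolves to the same key, so the lookup succeeds.
$found = isset($severityMap[severityKey('ENCULÉ')]);
var_dump($found);
```

If the build side and the lookup side used different lowercasing functions, accented entries could silently fall through to the `Severity::High` default, which is the inconsistency the review points at.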
16-16: Unused constant `CACHE_TTL`.
This constant is defined but never referenced elsewhere in the class. Either remove it or use it in the caching methods (e.g., when storing cache entries).
🧹 Proposed fix: Remove unused constant
 class Dictionary
 {
-    private const CACHE_TTL = 86400;
-
     private array $profanities;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Core/Dictionary.php` at line 16, The private constant CACHE_TTL is declared but never used in the Dictionary class; either remove the constant or apply it where cache entries are stored/returned. If you want to keep it, update the caching methods (e.g., methods that set or save cache entries in class Dictionary such as any putCache/saveToCache/setCache method) to use CACHE_TTL as the TTL value when calling the cache store operation; otherwise delete the unused CACHE_TTL declaration to eliminate dead code. Ensure references use the constant name CACHE_TTL and that any cache store call passes it as the TTL argument.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 72fbb9b0-bf10-4996-82b6-8fec9c9dd973
📒 Files selected for processing (95)
- README.md
- composer.json
- config/blasp.php
- config/config.php
- config/languages/english.php
- config/languages/french.php
- config/languages/german.php
- config/languages/spanish.php
- src/Abstracts/BaseDetectionStrategy.php
- src/Abstracts/StringNormalizer.php
- src/BlaspManager.php
- src/BlaspService.php
- src/BlaspServiceProvider.php
- src/Blaspable.php
- src/Config/ConfigurationLoader.php
- src/Config/DetectionConfig.php
- src/Config/MultiLanguageDetectionConfig.php
- src/Console/ClearCommand.php
- src/Console/Commands/BlaspClearCommand.php
- src/Console/LanguagesCommand.php
- src/Console/TestCommand.php
- src/Contracts/DetectionConfigInterface.php
- src/Contracts/DetectionStrategyInterface.php
- src/Contracts/ExpressionGeneratorInterface.php
- src/Contracts/MultiLanguageConfigInterface.php
- src/Contracts/RegistryInterface.php
- src/Core/Analyzer.php
- src/Core/Contracts/DriverInterface.php
- src/Core/Contracts/MaskStrategyInterface.php
- src/Core/Dictionary.php
- src/Core/Masking/CallbackMask.php
- src/Core/Masking/CharacterMask.php
- src/Core/Masking/GrawlixMask.php
- src/Core/MatchedWord.php
- src/Core/Matchers/CompoundWordDetector.php
- src/Core/Matchers/FalsePositiveFilter.php
- src/Core/Matchers/PhoneticMatcher.php
- src/Core/Matchers/RegexMatcher.php
- src/Core/Normalizers/EnglishNormalizer.php
- src/Core/Normalizers/FrenchNormalizer.php
- src/Core/Normalizers/GermanNormalizer.php
- src/Core/Normalizers/NullNormalizer.php
- src/Core/Normalizers/SpanishNormalizer.php
- src/Core/Normalizers/StringNormalizer.php
- src/Core/Result.php
- src/Core/Score.php
- src/Drivers/PatternDriver.php
- src/Drivers/PhoneticDriver.php
- src/Drivers/PipelineDriver.php
- src/Drivers/RegexDriver.php
- src/Enums/Severity.php
- src/Events/ContentBlocked.php
- src/Events/ModelProfanityDetected.php
- src/Events/ProfanityDetected.php
- src/Exceptions/ProfanityRejectedException.php
- src/Facades/Blasp.php
- src/Middleware/CheckProfanity.php
- src/Normalizers/EnglishStringNormalizer.php
- src/Normalizers/GermanStringNormalizer.php
- src/Normalizers/Normalize.php
- src/PendingCheck.php
- src/ProfanityDetector.php
- src/Registries/DetectionStrategyRegistry.php
- src/Registries/LanguageNormalizerRegistry.php
- src/Rules/Profanity.php
- src/ServiceProvider.php
- src/Testing/BlaspFake.php
- tests/AllLanguagesApiTest.php
- tests/AllLanguagesDetectionTest.php
- tests/BladeDirectiveTest.php
- tests/BlaspCheckTest.php
- tests/BlaspableTest.php
- tests/CacheDriverConfigurationTest.php
- tests/ConfigurationLoaderLanguageTest.php
- tests/ConfigurationLoaderTest.php
- tests/CustomMaskCharacterTest.php
- tests/DetectionStrategyRegistryTest.php
- tests/EdgeCaseTest.php
- tests/EmptyInputTest.php
- tests/FrenchStringNormalizerTest.php
- tests/GermanStringNormalizerTest.php
- tests/Issue24Test.php
- tests/Issue32FalsePositiveTest.php
- tests/MiddlewareAliasTest.php
- tests/MultiLanguageDetectionConfigTest.php
- tests/MultiLanguageProfanityTest.php
- tests/PhoneticDriverTest.php
- tests/PipelineDriverTest.php
- tests/ProfanityExpressionGeneratorTest.php
- tests/ResultCachingTest.php
- tests/SeverityMapTest.php
- tests/SpanishStringNormalizerTest.php
- tests/StrMacroTest.php
- tests/TestCase.php
- tests/UuidFalsePositiveTest.php
💤 Files with no reviewable changes (21)
- src/Abstracts/StringNormalizer.php
- src/Normalizers/EnglishStringNormalizer.php
- src/Contracts/ExpressionGeneratorInterface.php
- tests/UuidFalsePositiveTest.php
- src/Contracts/RegistryInterface.php
- src/Console/Commands/BlaspClearCommand.php
- src/Contracts/DetectionStrategyInterface.php
- src/Normalizers/GermanStringNormalizer.php
- src/Normalizers/Normalize.php
- src/ProfanityDetector.php
- src/Abstracts/BaseDetectionStrategy.php
- config/config.php
- src/Contracts/DetectionConfigInterface.php
- src/Contracts/MultiLanguageConfigInterface.php
- src/Config/DetectionConfig.php
- src/Registries/DetectionStrategyRegistry.php
- src/Config/ConfigurationLoader.php
- src/Registries/LanguageNormalizerRegistry.php
- src/ServiceProvider.php
- src/BlaspService.php
- src/Config/MultiLanguageDetectionConfig.php
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Preserve previous state in Blaspable::withoutBlaspChecking() for nested calls
- Reject recursive pipeline driver configuration
- Guard validation rule against non-string input
- Respect except fields when middleware fields config is set
- Apply severity filter before overlap dedup in PatternDriver
- Apply severity filter before masking in RegexDriver
- Use mb_strtolower/mb_strlen in PhoneticMatcher for UTF-8 safety
- Remove unused Dictionary import from BlaspServiceProvider
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/BlaspManager.php`:
- Around line 81-83: Normalize and validate the drivers list before the
self-reference check in BlaspManager: ensure $config['drivers'] exists and is an
array of strings (filter non-strings and cast items to string or throw) and
build $driverNames using strtolower on each entry so the in_array check detects
any case variants of "pipeline"; then perform the in_array('pipeline',
$driverNames, true) guard and throw the InvalidArgumentException if found (this
prevents miscased values from bypassing the guard and avoids TypeError from
array_map when createDriver() / createPipelineDriver() could recurse).
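The guard described in this comment could look roughly like the sketch below. It is a standalone illustration under stated assumptions: `normalizePipelineDrivers()` is a hypothetical function name, the exception messages are invented, and this version throws on non-string entries rather than filtering them (the comment allows either).

```php
<?php

/**
 * Hypothetical helper: validates and lowercases the configured sub-driver
 * names, then rejects a pipeline that lists itself as a sub-driver.
 */
function normalizePipelineDrivers(array $config): array
{
    $drivers = $config['drivers'] ?? [];

    if (!is_array($drivers)) {
        throw new InvalidArgumentException('Pipeline "drivers" must be an array.');
    }

    $driverNames = [];
    foreach ($drivers as $driver) {
        // Fail early on non-string entries instead of hitting a TypeError later.
        if (!is_string($driver)) {
            throw new InvalidArgumentException('Pipeline driver names must be strings.');
        }
        $driverNames[] = strtolower($driver);
    }

    // After lowercasing, "Pipeline" and "PIPELINE" are caught by the same strict check.
    if (in_array('pipeline', $driverNames, true)) {
        throw new InvalidArgumentException('A pipeline cannot include itself as a sub-driver.');
    }

    return $driverNames;
}
```

Normalizing before the check is what closes the gap: a strict `in_array('pipeline', ...)` on the raw config would let a miscased value through and allow the recursion the guard is meant to prevent.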
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7e68e2a6-9fe2-4aa6-af22-3c68f10b933d
📒 Files selected for processing (7)
- src/BlaspManager.php
- src/BlaspServiceProvider.php
- src/Blaspable.php
- src/Core/Matchers/PhoneticMatcher.php
- src/Drivers/PatternDriver.php
- src/Drivers/RegexDriver.php
- src/Middleware/CheckProfanity.php
🚧 Files skipped from review as they are similar to previous changes (4)
- src/Drivers/PatternDriver.php
- src/Middleware/CheckProfanity.php
- src/Drivers/RegexDriver.php
- src/Core/Matchers/PhoneticMatcher.php
- RegexDriver: track masked ranges and use immutable normalized string for position lookups to prevent offset drift across mutated buffers
- RegexDriver: extract matched text from original input instead of normalized string to preserve original casing/characters
- PipelineDriver: pass explicit UTF-8 encoding to mb_substr calls
- PendingCheck: cap cache key tracking with configurable max_tracked_keys to prevent unbounded growth
- French config: remove mince/flûte/flute from profanity list as they are common benign words that cause false positives
- German config: move schwul/schwule/schwuler/schwules from extreme to moderate severity as they are also neutral self-identifiers
- StrMacroTest: replace toString() with (string) cast for Laravel 8 compat
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/Drivers/PipelineDriver.php (1)
63-69: Apply the same UTF-8 fix to PatternDriver and RegexDriver for consistency.
`PatternDriver` (lines 87–89 in `src/Drivers/PatternDriver.php`) and `RegexDriver` (lines 152–154 in `src/Drivers/RegexDriver.php`) call `mb_substr()` without an explicit encoding parameter, while `PipelineDriver` explicitly specifies `'UTF-8'`. When these drivers are composed in a pipeline, the inconsistent encoding handling could produce different masking results for multibyte input if the internal encoding differs from UTF-8.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Drivers/PipelineDriver.php` around lines 63 - 69, PatternDriver and RegexDriver currently call mb_substr() without the explicit encoding, causing inconsistent behavior with PipelineDriver which uses 'UTF-8'; update the mb_substr() calls in src/Drivers/PatternDriver.php and src/Drivers/RegexDriver.php to pass 'UTF-8' as the final argument (same as in PipelineDriver) so multibyte strings are handled consistently in PatternDriver and RegexDriver (look for the mb_substr usages inside the classes PatternDriver and RegexDriver and add the 'UTF-8' parameter).
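As a rough illustration of the explicit-encoding pattern, the sketch below masks a character range with `mb_substr()` and a fixed `'UTF-8'` argument. `maskRange()` is a hypothetical helper written for this example; the drivers' real masking code is not shown in this thread and may be structured differently.

```php
<?php

/**
 * Hypothetical helper: replaces $length characters starting at character
 * offset $start with a mask character. Passing 'UTF-8' explicitly keeps the
 * offsets character-based regardless of the configured internal encoding.
 */
function maskRange(string $text, int $start, int $length, string $maskChar = '*'): string
{
    $before = mb_substr($text, 0, $start, 'UTF-8');
    $after  = mb_substr($text, $start + $length, null, 'UTF-8');

    return $before . str_repeat($maskChar, $length) . $after;
}

// "scheiße" is 7 characters but 8 bytes in UTF-8, so byte offsets would drift.
echo maskRange('so eine scheiße hier', 8, 7), PHP_EOL;
```

Using the same explicit encoding in every driver means offsets computed by one stage of a pipeline stay valid when another stage slices the same string.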
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ff8e6473-56ae-490d-b05c-1e687662615c
📒 Files selected for processing (6)
- config/languages/french.php
- config/languages/german.php
- src/Drivers/PipelineDriver.php
- src/Drivers/RegexDriver.php
- src/PendingCheck.php
- tests/StrMacroTest.php
✅ Files skipped from review due to trivial changes (1)
- config/languages/french.php
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Drivers/RegexDriver.php
$keys = array_unique($keys);

// Evict oldest keys when exceeding the configured limit
$maxKeys = config('blasp.cache.max_tracked_keys', 1000);
if (count($keys) > $maxKeys) {
    $keys = array_slice($keys, -$maxKeys);
}
Refresh key recency before dedupe/trim to avoid stale cache survivors.
array_unique() keeps the first occurrence, so repeated keys are not moved to the “recent” end. With FIFO-style trimming, actively reused keys can be evicted from blasp_result_cache_keys, and then Dictionary::clearCache() may miss clearing them.
🔧 Proposed fix
- $keys[] = $key;
- $keys = array_unique($keys);
+ // Move key to most-recent position while preserving uniqueness
+ $keys = array_values(array_filter($keys, static fn ($k) => $k !== $key));
+ $keys[] = $key;
// Evict oldest keys when exceeding the configured limit
- $maxKeys = config('blasp.cache.max_tracked_keys', 1000);
+ $maxKeys = max(1, (int) config('blasp.cache.max_tracked_keys', 1000));
if (count($keys) > $maxKeys) {
$keys = array_slice($keys, -$maxKeys);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/PendingCheck.php` around lines 321 - 327, The current use of
array_unique($keys) keeps the first occurrence so frequently reused keys stay at
the front and may be FIFO-evicted; to refresh recency before dedupe/trim, dedupe
while preserving the last occurrence instead of the first: transform $keys so
duplicates are removed but the most recent appearance is kept (e.g. reverse
$keys, call array_unique, then reverse back or use a keyed map to keep last),
then apply the existing max-tracked-keys trimming using the same $maxKeys from
config('blasp.cache.max_tracked_keys', 1000); update the code around $keys,
array_unique and array_slice in PendingCheck (the block manipulating $keys) to
perform this reverse-dedupe-restore sequence so hot keys are considered recent
and not evicted.
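The recency-refresh idea can be tried in isolation with the sketch below. `trackCacheKey()` is a hypothetical pure function that mirrors the proposed fix (remove any earlier occurrence, append, then trim from the front); it takes the limit as a parameter instead of reading `config()`, so it runs without Laravel.

```php
<?php

/**
 * Hypothetical helper: records $key as the most recently used entry and
 * evicts the oldest entries once the list exceeds $maxKeys.
 */
function trackCacheKey(array $keys, string $key, int $maxKeys): array
{
    $maxKeys = max(1, $maxKeys);

    // Drop any earlier occurrence so the key moves to the "recent" end.
    $keys = array_values(array_filter($keys, static fn ($k) => $k !== $key));
    $keys[] = $key;

    // Keep only the newest $maxKeys entries.
    if (count($keys) > $maxKeys) {
        $keys = array_slice($keys, -$maxKeys);
    }

    return $keys;
}

$keys = [];
foreach (['a', 'b', 'c', 'a', 'd'] as $key) {
    $keys = trackCacheKey($keys, $key, 3);
}

print_r($keys); // 'a' was reused, so 'b' is evicted instead: c, a, d
```

With the original `array_unique()` approach the same sequence would end as `b, c, d`, dropping the reused key `a` from the tracked list even though it was touched more recently than `b`.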
… type validation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- `Blaspable` Eloquent trait, `CheckProfanity` middleware, Blade directive, Str/Stringable macros, validation rule
- `Blasp::fake()` with assertions
- `ProfanityDetected`, `ContentBlocked`, `ModelProfanityDetected`
- `Blaspsoft\Blasp\Laravel\` merged into `Blaspsoft\Blasp\`
Test plan
composer test)
🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Configuration
Tests