-
Notifications
You must be signed in to change notification settings - Fork 5
Closed
Labels
area: nfaThompson NFA, PikeVM, BoundedBacktrackerThompson NFA, PikeVM, BoundedBacktrackerpriority: highImportant for next releaseImportant for next releasetype: performanceSpeed/memory improvement or regressionSpeed/memory improvement or regression
Description
Problem
PikeVM compiles . (AnyCharNotNL) into ~10 UTF-8 byte-range alternation splits in the NFA. For patterns like \{\{(.*?)\}\}, PikeVM must evaluate all 10 branches at every byte position.
On LangArena's Template::Regex benchmark (50000 template vars, 50 iterations, ~10MB text), this results in ~5 billion branch evaluations. coregex takes 20.2s vs stdlib 6.5s (3x slower) vs Rust 3.8s (5.3x slower).
Stdlib avoids this because it decodes one rune at a time natively without alternation splits. Rust uses lazy DFA for forward scanning, PikeVM only for captures.
Proposed Solutions
- Rune-native
.in PikeVM — decode UTF-8 rune directly instead of splitting into byte-range alternations. Eliminates 10x branch overhead. - DFA-based ReplaceAll — for simple capture patterns like
\{\{(.*?)\}\}, use lazy DFA for forward scan to find match boundaries, then PikeVM only for capture extraction within the match. - Hybrid approach — ASCII fast path (single byte, no alternation) + rune decode fallback for non-ASCII.
Impact
- Template::Regex: 20s → ~4-6s target (parity with stdlib/Rust)
- Any pattern with
.*?or.+on large inputs - Affects PikeVM, BoundedBacktracker
References
- Issue Not success usage in my project #124 (original report by @kostya)
- Rust regex PikeVM:
regex-automata/src/nfa/thompson/pikevm.rs
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
area: nfaThompson NFA, PikeVM, BoundedBacktrackerThompson NFA, PikeVM, BoundedBacktrackerpriority: highImportant for next releaseImportant for next releasetype: performanceSpeed/memory improvement or regressionSpeed/memory improvement or regression