perf: PikeVM UTF-8 dot alternation overhead (3x slower than stdlib on Template::Regex)

## Problem

PikeVM compiles `.` (AnyCharNotNL) into ~10 UTF-8 byte-range alternation splits in the NFA. For patterns like `\{\{(.*?)\}\}`, PikeVM must evaluate all 10 branches at every byte position.

On LangArena's Template::Regex benchmark (50000 template vars, 50 iterations, ~10MB text), this results in ~5 billion branch evaluations. coregex takes **20.2s** vs stdlib **6.5s** (3x slower) vs Rust **3.8s** (5.3x slower).

Stdlib avoids this because it decodes one rune at a time natively without alternation splits. Rust uses lazy DFA for forward scanning, PikeVM only for captures.

## Proposed Solutions

1. **Rune-native `.` in PikeVM** — decode UTF-8 rune directly instead of splitting into byte-range alternations. Eliminates 10x branch overhead.
2. **DFA-based ReplaceAll** — for simple capture patterns like `\{\{(.*?)\}\}`, use lazy DFA for forward scan to find match boundaries, then PikeVM only for capture extraction within the match.
3. **Hybrid approach** — ASCII fast path (single byte, no alternation) + rune decode fallback for non-ASCII.

## Impact

- Template::Regex: 20s → ~4-6s target (parity with stdlib/Rust)
- Any pattern with `.*?` or `.+` on large inputs
- Affects PikeVM, BoundedBacktracker

## References

- Issue #124 (original report by @kostya)
- Rust regex PikeVM: `regex-automata/src/nfa/thompson/pikevm.rs`


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf: PikeVM UTF-8 dot alternation overhead (3x slower than stdlib on Template::Regex) #132

Problem

Proposed Solutions

Impact

References

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

perf: PikeVM UTF-8 dot alternation overhead (3x slower than stdlib on Template::Regex) #132

Description

Problem

Proposed Solutions

Impact

References

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions