Skip to content

perf: DFA-first ReplaceAll for Rust-level performance on capture-heavy workloads #135

@kolkov

Description

@kolkov

Problem

ReplaceAllStringFunc currently runs the full PikeVM path for every match: forward scan to find match end, then capture extraction. For patterns like \{\{(.*?)\}\} on large inputs (~10MB, 50K replacements), this is significantly slower than Rust's approach.

Rust regex uses lazy DFA for the forward scan to find match boundaries, then falls back to PikeVM only for capture group extraction within the already-found match span. This minimizes the expensive NFA simulation to small bounded regions.

Current State (v0.12.7)

Phase 1 (PR #134) fixed the PikeVM split-chain overhead for . with sparse-dispatch — 2.8-4.8x speedup. But the architectural gap remains: PikeVM still does full forward scanning where a DFA would be faster.

LangArena Template::Regex (\{\{(.*?)\}\}, ReplaceAllStringFunc, ~10MB text):

  • Rust: ~3.8s (lazy DFA forward + PikeVM captures)
  • Go stdlib: ~6.5s (rune-native NFA)
  • coregex v0.12.7: TBD (was ~20s pre-fix, expect ~5-7s with sparse-dispatch)

Proposed Approach

  1. For ReplaceAllStringFunc / ReplaceAllString, use lazy DFA for forward match boundary detection
  2. Once match boundaries are found, run PikeVM only on the match span for capture extraction
  3. This matches Rust's two-phase architecture: DFA for speed, NFA for captures

Impact

Metadata

Metadata

Assignees

No one assigned

    Labels

    area: metaStrategy selection, engine orchestrationpriority: highImportant for next releasetype: performanceSpeed/memory improvement or regression

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions