Skip to content

perf: SIMD acceleration for sparse char_class matches #64

@kolkov

Description

@kolkov

Summary

Add SIMD-accelerated search for CharClassSearcher when matches are sparse (>64 non-matching bytes between matches).

Background

In PR #63, we evaluated SIMD for char_class patterns but found it slower for the common case where matches are frequent (30-50% of positions). However, SIMD can provide significant speedup when:

  • Searching for rare characters in large text
  • Patterns like [0-9]+ in mostly alphabetic text
  • Log parsing where keywords are sparse

Proposed Implementation

Hybrid Approach

func (s *CharClassSearcher) SearchAt(haystack []byte, at int) (int, int, bool) {
    // For sparse search (gap > 64 bytes), use SIMD to find first match
    if s.sparseHint && len(haystack)-at > 64 {
        pos := simd.MemchrInTable(haystack[at:], &s.membership)
        if pos == -1 {
            return -1, -1, false
        }
        at += pos
    }
    
    // Then use scalar for match extent
    // ...
}

SIMD Functions (already implemented)

  • simd.MemchrWord(haystack) - find first [A-Za-z0-9_]
  • simd.MemchrNotWord(haystack) - find first non-word char
  • simd.MemchrInTable(haystack, table) - generic 256-byte table lookup

Acceptance Criteria

  • Detect sparse vs dense patterns (heuristic or adaptive)
  • Use SIMD MemchrInTable for sparse search
  • Benchmark: >2x speedup on sparse patterns
  • No regression on dense patterns (existing tests)

References

Priority

Low - Current streaming implementation is already 4.9x faster than stdlib. This is an optimization for edge cases.

Metadata

Metadata

Assignees

No one assigned

    Labels

    arch: amd64x86-64 specific (AVX2, SSSE3)area: simdLow-level SIMD assembly (AVX2, SSSE3)effort: 8Very large, ~1 weekpriority: lowBacklog, nice to havestrategy: char-classCharClassSearcher optimizationtype: enhancementImprove existing feature

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions