
refactor(scan): simplify NEON in-string probe + add ARM64 CI #34

Merged
membphis merged 2 commits into main from worktree-neon-vmaxvq-probe on May 17, 2026
membphis (Collaborator) commented May 17, 2026

Summary

Simplify NEON scanner code and add ARM64 CI coverage.

Changes

Code simplification:

  • Replace byte_mask64 (uses movemask16 pairwise-add chain) with inline vmaxvq_u8 probe
  • Remove unused byte_mask16 and byte_mask64 functions
  • Net: -7 lines, cleaner code path
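
As an illustration of the inline probe described above, here is a minimal sketch, not the code from this PR. It assumes a 64-byte chunk and a hypothetical helper name `has_quote_or_backslash`; the actual logic lives inline in `scan_neon_impl` and may differ. The comparison results for `"` and `\` are OR'd together across the chunk, and a single `vmaxvq_u8` reduction answers "is any byte a hit?" without building a bitmask. A scalar fallback is included only so the sketch compiles on non-ARM64 hosts.

```rust
/// Hypothetical helper (not from the PR): true if the 64-byte chunk
/// contains a double quote or a backslash.
#[cfg(target_arch = "aarch64")]
fn has_quote_or_backslash(chunk: &[u8; 64]) -> bool {
    use std::arch::aarch64::*;
    // SAFETY: every load reads 16 bytes inside the 64-byte array, and NEON
    // is part of the baseline for Rust's standard aarch64 targets.
    unsafe {
        let quote = vdupq_n_u8(b'"');
        let backslash = vdupq_n_u8(b'\\');
        let mut acc = vdupq_n_u8(0);
        for i in 0..4 {
            let v = vld1q_u8(chunk.as_ptr().add(i * 16));
            // Each comparison lane is 0xFF on a match and 0x00 otherwise.
            let hit = vorrq_u8(vceqq_u8(v, quote), vceqq_u8(v, backslash));
            acc = vorrq_u8(acc, hit);
        }
        // Horizontal max: non-zero exactly when some lane matched.
        vmaxvq_u8(acc) != 0
    }
}

/// Scalar fallback so the sketch also builds on non-ARM64 hosts.
#[cfg(not(target_arch = "aarch64"))]
fn has_quote_or_backslash(chunk: &[u8; 64]) -> bool {
    chunk.iter().any(|&b| b == b'"' || b == b'\\')
}
```

The probe only reports whether a hit exists, not where it is. That fits a fast path that skips clean chunks and falls back to a slower position-finding step when the probe fires.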

CI coverage:

  • Add macos-14 (Apple Silicon) to test matrix
  • Ensures NEON code paths are tested in CI (previously only tested locally)

membphis added 2 commits May 17, 2026 08:59
Replace byte_mask64 (which uses movemask16 pairwise-add chain) with
vmaxvq_u8 on OR'd comparison results for detecting quote/backslash
in the in-string fast path. The vmaxvq_u8 approach is ~3x faster for
the probe itself, though end-to-end gains are masked by the existing
memchr2 cross-chunk jump optimization.

Changes:
- Remove unused byte_mask16 and byte_mask64 functions (-19 lines)
- Inline vmaxvq_u8 probe logic in scan_neon_impl (+12 lines)
- Add ARM64 (macos-14) to CI matrix for NEON coverage
- Add bench_neon128.rs for micro-benchmarking probe methods
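
For contrast, the kind of pairwise-add movemask that the commit message says `byte_mask64` was built on could look roughly like the sketch below. This is an assumption about the general technique, not the removed `movemask16` code; the name `movemask_eq_sketch` and its signature are hypothetical. Each matching lane is reduced to its bit weight, and three `vpaddq_u8` steps fold the 16 lanes into a 16-bit mask. That costs more than a single `vmaxvq_u8` reduction when only a yes/no answer is needed.

```rust
/// Hypothetical sketch (not the removed code): bit i of the result is set
/// when chunk[i] == needle.
#[cfg(target_arch = "aarch64")]
fn movemask_eq_sketch(chunk: &[u8; 16], needle: u8) -> u16 {
    use std::arch::aarch64::*;
    // Per-lane bit weights, repeated for the low and high 8 lanes.
    const WEIGHTS: [u8; 16] = [1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128];
    // SAFETY: both loads read exactly 16 bytes from 16-byte arrays.
    unsafe {
        let v = vld1q_u8(chunk.as_ptr());
        let eq = vceqq_u8(v, vdupq_n_u8(needle));
        // Keep each lane's weight where the comparison matched.
        let bits = vandq_u8(eq, vld1q_u8(WEIGHTS.as_ptr()));
        // Three pairwise adds fold lanes 0..8 into byte 0 and 8..16 into byte 1.
        let s1 = vpaddq_u8(bits, bits);
        let s2 = vpaddq_u8(s1, s1);
        let s3 = vpaddq_u8(s2, s2);
        vgetq_lane_u16::<0>(vreinterpretq_u16_u8(s3))
    }
}

/// Scalar fallback so the sketch also builds on non-ARM64 hosts.
#[cfg(not(target_arch = "aarch64"))]
fn movemask_eq_sketch(chunk: &[u8; 16], needle: u8) -> u16 {
    chunk
        .iter()
        .enumerate()
        .fold(0u16, |mask, (i, &b)| mask | (((b == needle) as u16) << i))
}
```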
@membphis membphis changed the title from "perf(scan): use vmaxvq_u8 for NEON in-string fast probe" to "refactor(scan): simplify NEON in-string probe + add ARM64 CI" May 17, 2026
@membphis membphis merged commit 28ad4b7 into main May 17, 2026
3 checks passed