From ee3ca171bdb0edca630937bd93ec500e82c4ed38 Mon Sep 17 00:00:00 2001
From: Andy
Date: Tue, 10 Mar 2026 17:27:50 +0300
Subject: [PATCH] docs: update benchmark results to v0.12.6 (AMD EPYC, regex-bench)

---
 README.md  | 26 ++++++++++++++------------
 ROADMAP.md | 37 ++++++++++++++++++++++++-------------
 2 files changed, 38 insertions(+), 25 deletions(-)

diff --git a/README.md b/README.md
index a21f0bb..fdb1bc9 100644
--- a/README.md
+++ b/README.md
@@ -60,18 +60,20 @@ func main() {
 
 ## Performance
 
-Cross-language benchmarks on 6MB input ([source](https://github.com/kolkov/regex-bench)):
-
-| Pattern | Go stdlib | coregex | vs stdlib |
-|---------|-----------|---------|-----------|
-| Literal alternation | 600 ms | 5 ms | **113x** |
-| Inner `.*keyword.*` | 453 ms | 2 ms | **285x** |
-| Suffix `.*\.txt` | 350 ms | <1 ms | **350x+** |
-| Multiline `(?m)^/.*\.php` | 103 ms | <1 ms | **100x+** |
-| Email validation | 389 ms | <1 ms | **389x+** |
-| URL extraction | 350 ms | <1 ms | **350x+** |
-| IP address | 825 ms | 10 ms | **82x** |
-| Char class `[\w]+` | 670 ms | 112 ms | **6x** |
+Cross-language benchmarks on 6MB input, AMD EPYC ([source](https://github.com/kolkov/regex-bench)):
+
+| Pattern | Go stdlib | coregex | Rust regex | vs stdlib | vs Rust |
+|---------|-----------|---------|------------|-----------|---------|
+| Literal alternation | 483 ms | 4.6 ms | 0.6 ms | **104x** | 7.8x slower |
+| Multi-literal | 1401 ms | 12.7 ms | 4.6 ms | **110x** | 2.7x slower |
+| Inner `.*keyword.*` | 232 ms | 0.25 ms | 0.28 ms | **926x** | **1.1x faster** |
+| Suffix `.*\.txt` | 234 ms | 0.88 ms | 1.07 ms | **266x** | **1.2x faster** |
+| Multiline `(?m)^/.*\.php` | 103 ms | 0.65 ms | 0.66 ms | **159x** | **~parity** |
+| Email validation | 261 ms | 0.58 ms | 0.21 ms | **449x** | 2.7x slower |
+| URL extraction | 258 ms | 0.63 ms | 0.34 ms | **409x** | 1.8x slower |
+| IP address | 495 ms | 2.2 ms | 12.0 ms | **230x** | **5.5x faster** |
+| Char class `[\w]+` | 525 ms | 40.7 ms | 50.3 ms | **12x** | **1.2x faster** |
+| Word repeat `(\w{2,8})+` | 659 ms | 187 ms | 48.3 ms | **3.5x** | 3.8x slower |
 
 **Where coregex excels:**
 - Multiline patterns (`(?m)^/.*\.php`) — near Rust parity, 100x+ vs stdlib
diff --git a/ROADMAP.md b/ROADMAP.md
index 37a880c..74fc4cf 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -216,24 +216,35 @@ v1.0.0 STABLE → Production release with API stability guarantee
 
 ## Performance Targets
 
-### Current (v0.8.20) ✅ ACHIEVED
-
-| Pattern Type | stdlib | coregex | Speedup | Status |
-|--------------|--------|---------|---------|--------|
-| Inner literal `.*keyword.*` | 12.6ms | 4µs | **3154x** | ✅ |
-| Suffix `.*\.txt` | 1.3ms | 855ns | **1549x** | ✅ |
-| Suffix alternation `.*\.(txt\|log\|md)` 1KB | 15.5µs | 454ns | **34x** | ✅ |
-| Suffix alternation `.*\.(txt\|log\|md)` 1MB | 57ms | 147µs | **385x** | ✅ |
-| FindAll `.*@suffix` | 316ms | 3.6ms | **87x** | ✅ |
-| Alternation `(foo\|bar\|...)` | 9.7µs | 40ns | **242x** | ✅ |
-| Case-insensitive 32KB | 1.2ms | 4.6µs | **263x** | ✅ |
-| Character class `\d+` | 6.7µs | 1.5µs | **4.5x** | ✅ |
-| Email patterns | 22µs | 2µs | **11x** | ✅ |
+### Current (v0.12.6) — AMD EPYC, 6MB input ✅ ACHIEVED
+
+Cross-language benchmarks via [regex-bench](https://github.com/kolkov/regex-bench):
+
+| Pattern | Go stdlib | coregex | Rust regex | vs stdlib | vs Rust |
+|---------|-----------|---------|------------|-----------|---------|
+| Literal alternation | 483 ms | 4.6 ms | 0.6 ms | **104x** | 7.8x slower |
+| Multi-literal | 1401 ms | 12.7 ms | 4.6 ms | **110x** | 2.7x slower |
+| Inner `.*keyword.*` | 232 ms | 0.25 ms | 0.28 ms | **926x** | **1.1x faster** |
+| Suffix `.*\.txt` | 234 ms | 0.88 ms | 1.07 ms | **266x** | **1.2x faster** |
+| Multiline `(?m)^/.*\.php` | 103 ms | 0.65 ms | 0.66 ms | **159x** | **~parity** |
+| Email validation | 261 ms | 0.58 ms | 0.21 ms | **449x** | 2.7x slower |
+| URL extraction | 258 ms | 0.63 ms | 0.34 ms | **409x** | 1.8x slower |
+| IP address | 495 ms | 2.2 ms | 12.0 ms | **230x** | **5.5x faster** |
+| Char class `[\w]+` | 525 ms | 40.7 ms | 50.3 ms | **12x** | **1.2x faster** |
+| Alpha+digit | 261 ms | 25.7 ms | 11.9 ms | **10x** | 2.1x slower |
+| Word+digit | 271 ms | 26.2 ms | 12.0 ms | **10x** | 2.1x slower |
+| Word repeat `(\w{2,8})+` | 659 ms | 187 ms | 48.3 ms | **3.5x** | 3.8x slower |
+| HTTP methods | 106 ms | 0.90 ms | 0.70 ms | **117x** | 1.2x slower |
+| Anchored PHP | 0.00 ms | 0.01 ms | 0.01 ms | ~1x | ~parity |
+| Multiline PHP | 103 ms | 0.65 ms | 0.66 ms | **159x** | **~parity** |
+
+**5 patterns faster than Rust**: inner_literal, suffix, IP, char_class, multiline_php.
 
 ### Remaining for v1.0.0
 
 | Feature | Status | Priority |
 |---------|--------|----------|
+| Close Teddy gap vs Rust (7.8x) | Blocked on Go 1.26 archsimd | High |
 | ARM NEON SIMD | Planned | Medium |
 | Look-around assertions | Planned | Medium |
 | API stability guarantee | Required | High |