Add BITOP XOR coverage for short-tail (50B values × 10 source keys, pipeline 10) #403
Merged
fcostaoliveira merged 1 commit on Jun 10, 2026
…B values, pipeline 10)

Mirrors the redis/redis#15289 repro workload (-5.4% RPS XOR / -3.4% OR after 391e3452 -> 8dfb823) so the AVX2 short-tail path can be regressed/bisected on the fleet.

Spec layout:
- Preload 10 string keys (key:1..key:10), each 50 bytes (32-byte AVX2 block + 18-byte scalar tail, the same length class that triggers the regression).
- Bench `BITOP XOR tmp key:1 ... key:10` with pipeline 10, 1 client, 1 thread, test-time 180s.
- Build variants: bookworm gcc 15.2.0 amd64 + arm64 + dockerhub.
- Priority 67 (regression coverage, not high-priority release-gate).

This pairs with the existing 2keys-bitmap-100M-bits-bitop-and spec, which covers the large-buffer SIMD path; the short-tail variant covers the call / vzeroupper overhead specific to the AVX2 dispatch in bitopCommand.
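The two length classes mentioned above come down to simple arithmetic. A minimal sketch, assuming the 32-byte AVX2 block width stated in the commit message and assuming "100M bits" means exactly 100,000,000 bits; `split_blocks` is a hypothetical helper for illustration, not a function from Redis:

```python
AVX2_BLOCK_BYTES = 32  # block width stated in the commit message


def split_blocks(length_bytes, block_bytes=AVX2_BLOCK_BYTES):
    """Return (full_blocks, tail_bytes) for a value of the given length."""
    return divmod(length_bytes, block_bytes)


# 50-byte values: one full block plus an 18-byte scalar tail.
short_value = split_blocks(50)

# Existing large-bitmap spec, assuming exactly 100,000,000 bits per key.
large_value = split_blocks(100_000_000 // 8)

print(short_value)  # (1, 18)
print(large_value)  # (390625, 0)
```

Under that assumption the large bitmap divides evenly into 32-byte blocks and leaves no scalar tail, while a 50-byte value always leaves one.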
Why
redis/redis#15289 reports a -5.41% RPS / +5.93% avg-latency regression on `BITOP XOR` (and -3.42% / +3.77% on `BITOP OR`) between `391e3452` and `8dfb823` (PR #13898: Implement DIFF, DIFF1, ANDOR and ONE for BITOP), specifically when source values are short enough to leave a non-32B-multiple scalar tail (50B is the headline length: one AVX2 block + 18B tail).

The currently registered BITOP spec, `memtier_benchmark-2keys-bitmap-100M-bits-bitop-and`, exercises the long-buffer SIMD path (12.5MB src1/src2), which is exactly the path where 8dfb823 is a win. It cannot reproduce the short-tail loss.

What
This PR adds a new spec, `memtier_benchmark-10keys-bitmap-50B-bitop-xor-pipeline-10`, that mirrors the reporter's repro:

- Preload 10 string keys (`key:1..key:10`), each holding a 50-byte string (32B AVX2 block + 18B scalar tail)
- `BITOP XOR tmp key:1 key:2 ... key:10`, pipeline 10, 1 client, 1 thread, test-time 180s

The XOR variant carries the strongest signal in the issue's table (-5.41% RPS, p ≈ 7e-9); a follow-up OR variant can be added if useful, but XOR alone is enough to bisect/track this regression on the AVX2 fleet.
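For reference, the benchmarked command and the result it computes can be sketched in plain Python. This is a minimal illustration, not the spec itself: `bitop_xor_command` and `xor_reference` are hypothetical helpers and the sample values are made up; only the key names, the `tmp` destination, and the 50-byte length come from the description above.

```python
from functools import reduce


def bitop_xor_command(n_keys=10, dest="tmp"):
    """Build the benchmarked command string: BITOP XOR tmp key:1 ... key:N."""
    sources = [f"key:{i}" for i in range(1, n_keys + 1)]
    return " ".join(["BITOP", "XOR", dest, *sources])


def xor_reference(values):
    """Byte-wise XOR of equal-length byte strings (the value BITOP XOR stores in dest)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*values))


# Ten illustrative 50-byte values; the real preload contents are not specified here.
values = [bytes([i]) * 50 for i in range(1, 11)]

print(bitop_xor_command())
print(len(xor_reference(values)))  # 50
```

Every source value has the same length, so the destination is also 50 bytes and each call works on the same one-block-plus-tail shape.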
Notes
10keys-bitmap-50B-bitop-xor-pipeline-10) so the AVX2 short-tail bench is self-describing.

Note
Low Risk
Benchmark registry YAML only; no Redis runtime or production code changes.
Overview
Registers a new memtier benchmark spec, `memtier_benchmark-10keys-bitmap-50B-bitop-xor-pipeline-10`, so CI can bisect and track the BITOP XOR short-tail regression from redis/redis#15289 (50-byte values → one AVX2 block plus 18-byte scalar tail), which the existing large-bitmap BITOP AND spec does not hit.

The spec preloads 10 string keys with 50B values, then runs `BITOP XOR` across all ten with pipeline 10 (180s, 1 client/thread), on oss-standalone with bookworm gcc amd64/arm64 and dockerhub build variants, at priority 67 as regression coverage rather than a release gate.

Reviewed by Cursor Bugbot for commit 6fd4c58.