Skip to content

stream 70-30 benchmark: use --test-time 120 for stable variance#374

Open
paulorsousa wants to merge 3 commits into
mainfrom
fix-stream-70-30-variance
Open

stream 70-30 benchmark: use --test-time 120 for stable variance#374
paulorsousa wants to merge 3 commits into
mainfrom
fix-stream-70-30-variance

Conversation

@paulorsousa

@paulorsousa paulorsousa commented Apr 15, 2026

Copy link
Copy Markdown
Collaborator

Replace -n 1000 with --test-time 120 to ensure steady-state measurement.

With -n 1000 (200K total requests at ~127K ops/s), the test finishes in ~1-2 seconds, making it extremely sensitive to startup jitter — observed 99.6% std.dev across runs on ARM.

The 1k-consumers variant of this same test already uses --test-time 120 and achieves 0.1% variance.

Change

  • -n 1000--test-time 120 (120-second steady-state run)
  • Updated description to reflect the new measurement mode

Note

Low Risk
Changes are limited to benchmark YAML specifications (no runtime code changes), with primary impact being longer and more stressful benchmark runs that may affect CI/resource usage.

Overview
Adds two new stream GC-focused memtier benchmark specs (memtier_benchmark-stream-50-cgroups-xadd-xreadgroup-xtrim-{gc,aggressive-gc}.yml) that fully populate 50 consumer-group PELs and include a 50/20/30 mix of XADD/XREADGROUP/XTRIM to force GC compaction under active consumption.

Replaces the previous stream-50-cgroups xadd+xtrim-only suites (which only preloaded 2k PEL entries) and updates memtier_benchmark-stream-concurrent-xadd-xreadgroup-70-30.yml to run for --test-time 120 (instead of -n 1000) with refreshed description for steadier measurements.

Reviewed by Cursor Bugbot for commit 5996773. Bugbot is set up for automated code reviews on this repo. Configure here.

Replace -n 1000 with --test-time 120 to ensure steady-state measurement.
With -n 1000 (200K total requests at ~127K ops/s), the test finishes in
~1-2 seconds, making it extremely sensitive to startup jitter — observed
99.6% std.dev across runs. The 1k-consumers variant of this same test
already uses --test-time 120 and achieves 0.1% variance.
@jit-ci

jit-ci Bot commented Apr 15, 2026

Copy link
Copy Markdown

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

Two issues prevented these benchmarks from exercising the
streamEntryIsReferenced() PEL scan path:

1. Init only read 2000 entries per group (COUNT 2000) — changed to
   COUNT 100000 to read all pre-loaded entries, fully populating PELs
   and advancing min_cgroup_last_id.

2. No XREADGROUP during benchmark — min_cgroup_last_id never advanced
   past init, so streamEntryIsReferenced() short-circuited at the
   min_cgroup_last_id check before reaching the PEL scan. Added 20%
   XREADGROUP to the command mix (50% XADD, 20% XREADGROUP, 30% XTRIM)
   so consumer groups keep consuming and advancing min_cgroup_last_id.

Without these fixes, XTRIM entries were always > min_cgroup_last_id
and took the early return path — the O(C) PEL scan across 50 groups
was never reached.
@@ -138,7 +139,8 @@ clientconfig:
tool: memtier_benchmark
arguments: >
--data-size 100
--command="XADD stream-key * field __data__" --command-key-pattern="P" --command-ratio=70
--command="XADD stream-key * field __data__" --command-key-pattern="P" --command-ratio=50
--command="XREADGROUP GROUP cg-01 consumer1 COUNT 10 STREAMS stream-key >" --command-key-pattern="P" --command-ratio=20

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XREADGROUP on one group won't advance min_cgroup_last_id

Medium Severity

The benchmark XREADGROUP only targets cg-01, but the description claims it "advances min_cgroup_last_id." Since min_cgroup_last_id is the minimum across all 50 groups, it stays pinned at the initial position because the other 49 groups never consume new entries. Once the preloaded entries are trimmed away (very quickly for MAXLEN ~ 1000), newly-added entries have IDs above min_cgroup_last_id, causing the exact short-circuit the benchmark is designed to avoid. The benchmark then spends most of its 120 seconds not stressing the intended O(C) PEL-scan path.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6031c9b. Configure here.

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 5996773. Configure here.

arguments: >
--data-size 100
--command="XADD stream-key * field __data__" --command-key-pattern="P" --command-ratio=50
--command="XREADGROUP GROUP cg-01 consumer1 COUNT 10 STREAMS stream-key >" --command-key-pattern="P" --command-ratio=20

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

XREADGROUP reads only one of fifty consumer groups

Medium Severity

The benchmark's XREADGROUP command only reads from cg-01, but 50 consumer groups exist. The description claims this "advances min_cgroup_last_id," but since min_cgroup_last_id is the minimum of all groups' last_delivered_id, advancing only one group while the other 49 remain fixed doesn't advance the minimum. After the initial preloaded entries are trimmed (first seconds), new entries from XADD have IDs above min_cgroup_last_id and cause streamEntryIsReferenced() to short-circuit — meaning the O(C) PEL-scan code path the benchmark claims to stress is not exercised during the steady-state portion of the 120-second test.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 5996773. Configure here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant