Skip to content

Cluster code micro-optimization#4094

Open
klaussfreire wants to merge 23 commits into
redis:masterfrom
klaussfreire:micro_opt
Open

Cluster code micro-optimization#4094
klaussfreire wants to merge 23 commits into
redis:masterfrom
klaussfreire:micro_opt

Conversation

@klaussfreire

@klaussfreire klaussfreire commented Jun 1, 2026

Copy link
Copy Markdown

Description of change

A bunch of low-hanging micro-optimizations.

Mostly common subexpression extraction to local variables when beneficial. Barely noticeable benchmark impact but technically an improvement without readability impact IMO.

One of the most impactful changes concentrates around key extraction and node determination.
It optimizes the most common path for key extraction which is that of single-key commands,
precomputing key position during command parsing to have a simple fast path in get_keys.

This last one does produce a large impact in cluster benchmarks, of around 20%.

Some optimizations also apply to regular (non-cluster) clients, but the main concern of this optimization effort was around cluster ops, so they were not benchmarked.

Benchmark results

This uses a benchmark similar to those in basic_operations, but with a cluster client.
Two pipeline patterns were used:

  • hset+x: hset followed by hexpire, each hset + hexpire call counts as a single "request"
  • setex: to do both at once
    Both patterns include value payloads of around 50 bytes each.

The first one stresses multiple command types with various return types, while the second one stresses a streamlined single-command pipeline with limited response data.

The benchmark was run in AWS against a 3-node valkey EC cluster (EC versions of Redis OSS does not support setex), using hiredis 3.3.1. The setup builds a mostly CPU-bound case with a very fast but still realistic network between client and cluster.

Large pipelines show as dnf because they get deadlocked, something that will be covered on another PR.

Numbers are given in requests per second (RPS) and relative uplift.

Pipeline size master hset+x master setex micro_opt hset+x micro_opt setex uplift hset+x uplift setex
100 14285 23292 16973 26705 18.8% 14.6%
10000 17598 31909 22116 39784 25.7% 24.7%
50000 15785 30328 19386 36752 22.8% 21.2%
150000 dnf dnf dnf dnf
300000 dnf dnf dnf dnf

Pull Request check-list

Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.


Note

Medium Risk
Changes cluster slot/key routing and pipeline target selection; behavior should be equivalent but mistakes would mis-route commands across shards.

Overview
This PR speeds up Redis Cluster command routing and pipelines, with smaller wins elsewhere in the client stack.

Key extraction precomputes _single_key_pos when COMMAND metadata is loaded and adds a fast path in get_keys (plus _is_keyed_command) so typical single-key commands skip heavier parsing. Sync and async CommandsParser share the same pattern, with minor refactors for multi-key and negative last_key_pos handling.

Cluster node selection normalizes policy callbacks to a single (self, command, *args, **kwargs) shape, caches resolved policy callbacks per command (cleared after 5k entries), and reuses keyed/keyless CommandPolicies in cluster pipeline execution with a per-batch policy cache. Slot routing reuses a local is_read flag.

Other hot-path tweaks: Redis.parse_response skips option handling when options is empty; Encoder.encode uses clearer early returns; bool_ok treats both b"OK" and "OK"; hiredis parsers cache reader/sock locals; policy resolvers avoid redundant dict lookups; the command packer benchmark joins list payloads before sendall.

Reviewed by Cursor Bugbot for commit 7f51775. Bugbot is set up for automated code reviews on this repo. Configure here.

A bunch of low-hanging, low-impact micro-optimizations.

Mostly common subexpression extraction to local variables when
beneficial. Barely noticeable benchmark impact but technically
an improvement without readability impact IMO.
Concentrate around key extraction and node determination.

Optimizes the most common path for key extraction which is
that of single-key commands.
Common subexpression factorization
Getting the policy callback out of all the dicts is very indirect
and wasteful for simple commands. Cache simple commands so the
next time it only takes a single lookup.

About 3.7% improvement on overall benchmark.
Avoid resolving the same command policies over and over.

All resolvers are "static" in the short term, during a single
pipeline execution, so cache their results for the duration
to reduce the amount of work done.
Simple commands have simple responses, so a fast path for those
improves response parsing speed in the average
Another big speedup in command policy resolution
@jit-ci

jit-ci Bot commented Jun 1, 2026

Copy link
Copy Markdown

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

Were meant for pre-PR testing only

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 99df020. Configure here.

Comment thread redis/_parsers/commands.py
if isinstance(value, (bytes, bytearray, memoryview)):
return value
elif isinstance(value, str):
return value.encode(self.encoding, self.encoding_errors)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Encoder str check now precedes bool check incorrectly

Low Severity

The reordering moves the isinstance(value, str) check before the isinstance(value, bool) check. While bool is not a subclass of str in standard Python so current behavior is preserved, the original ordering had a clear comment explaining the bool check exists specifically as a guard before int. Moving str above bool obscures this intent and makes the code fragile — any future custom type inheriting both str and bool-like behavior could break silently. The comment "special case bool since it is a subclass of int" is now misleadingly placed after str, not before int.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 99df020. Configure here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really, the concern for bool was around int, not str. If it were, bytes would have a similar problem. That behavior is preserved. String and byte values are by far the most commonly used so they should be checked first.

@petyaslavova

Copy link
Copy Markdown
Collaborator

Hi @klaussfreire, thank you for your contribution! I'll take a look at it shortly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants