Skip to content

Fix async RedisCluster lifecycle after aclose#4069

Open
seplease wants to merge 4 commits into
redis:masterfrom
seplease:fix-3846-async-cluster-aclose
Open

Fix async RedisCluster lifecycle after aclose#4069
seplease wants to merge 4 commits into
redis:masterfrom
seplease:fix-3846-async-cluster-aclose

Conversation

@seplease

@seplease seplease commented May 16, 2026

Copy link
Copy Markdown

Summary

Fix async RedisCluster reopening itself after aclose().

Previously, calling await client.aclose() did not fully terminate the cluster client lifecycle. A later command could trigger lazy reinitialization and recreate cluster connections.

This change separates "not initialized yet" from "explicitly closed" and prevents closed async cluster clients from being reused.

Fixes #3846.

Background

RedisCluster uses lazy initialization internally.

The problem was that _initialize represented two different states at once:

  • the client has not been initialized yet
  • the client has already been explicitly closed

aclose() disconnected node connections, but also reset _initialize=True.

As a result, a later command path treated the client as uninitialized and called initialize() again, reopening cluster connections after close.

That behavior made aclose() act more like a soft reset than a terminal lifecycle operation.

There was also a race window between initialize() and aclose() where connections created during initialization could survive the close path.

Changes

Introduced an explicit _closed state for the async cluster client lifecycle.

Main changes:

  • added _closed lifecycle tracking
  • added _ensure_open() checks before initialization and command execution
  • changed public aclose() to mark the client closed and prevent reuse
  • made aclose() wait for in-flight initialization before disconnecting nodes
  • separated internal node reset from terminal client close

Internal cleanup paths are now split into:

  • aclose()
    • marks the client closed
    • prevents future reuse
  • _close_nodes()
    • resets node state without closing the client lifecycle
    • still allows reinitialization
  • _disconnect_nodes()
    • performs the actual disconnect cleanup

Cluster recovery paths (MOVED, ClusterDownError, SlotNotCoveredError, and pipeline retry) now use _close_nodes() so topology refresh behavior remains unchanged.

Tests

Added regression coverage for:

  • commands executed after aclose()
  • calling aclose() before initialization
  • concurrent initialize() / aclose() execution
  • pipeline execution after close
  • initialization failure cleanup
  • node cleanup paths that still allow reinitialization
  • transaction pipeline execution after close
  • context manager re-entry after exit

Also added an integration-style regression test against a real Docker Redis Cluster.

Ran:

.venv/bin/python -m pytest tests/test_asyncio/test_cluster.py -q \
  --redis-url=redis://localhost:16379/0 \
  --redis-ssl-url=rediss://localhost:27379/0 \
  -m 'not onlynoncluster and not redismod and not fixed_client'

Result:

238 passed, 195 skipped, 33 deselected

Also validated with:

.venv/bin/ruff format --check redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py
.venv/bin/ruff check redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py
.venv/bin/python -m py_compile redis/asyncio/cluster.py tests/test_asyncio/test_cluster.py tests/test_asyncio/conftest.py

Notes

This PR only changes the async cluster client lifecycle behavior.

The sync RedisCluster implementation is unchanged.


Note

Medium Risk
Changes core async cluster client lifecycle and error-recovery disconnect paths; behavior shifts for code that called aclose() then reused the same instance or relied on implicit reopening.

Overview
Fixes async RedisCluster so aclose() is terminal: a new _closed flag and _ensure_open() block initialize(), execute_command(), and pipeline paths after close, instead of treating a closed client as merely “uninitialized” and lazily reconnecting.

Lifecycle split: aclose() sets _closed and disconnects under the init lock (including waiting on in-flight initialize()); _close_nodes() / _disconnect_nodes() tear down node pools without marking the client closed—used for context-manager exit, MOVED / cluster-down recovery, and pipeline retries so topology refresh still works.

Tests & fixtures: Broad regression tests (reuse after close, concurrent init/close, pipelines, real cluster ping) and cluster teardown flushdb when the fixture client is already closed.

Reviewed by Cursor Bugbot for commit 00924b4. Bugbot is set up for automated code reviews on this repo. Configure here.

@jit-ci

jit-ci Bot commented May 16, 2026

Copy link
Copy Markdown

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

Comment thread redis/asyncio/cluster.py

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit b922022. Configure here.

Comment thread redis/asyncio/cluster.py
@petyaslavova

Copy link
Copy Markdown
Collaborator

Hi @seplease, thank you for your contribution! I will have a look at it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants