Skip to content

Avoid per-check fd allocation in hiredis _socket_can_read() — use poll() instead of a per-call selector#4118

Open
violuke wants to merge 1 commit into
redis:masterfrom
violuke:master
Open

Avoid per-check fd allocation in hiredis _socket_can_read() — use poll() instead of a per-call selector#4118
violuke wants to merge 1 commit into
redis:masterfrom
violuke:master

Conversation

@violuke

@violuke violuke commented Jun 12, 2026

Copy link
Copy Markdown

Follow-up to #4115, which fixed the ValueError: filedescriptor out of range in select() in the hiredis parser by replacing select.select() with selectors.DefaultSelector(). This relates also to #4117

The selector approach has a residual failure mode: DefaultSelector resolves to EpollSelector on Linux and KqueueSelector on macOS, both of which allocate a new file descriptor on every call. _socket_can_read() constructs a fresh selector per readiness check, and Connection.can_read() runs on every connection acquisition — so every command now allocates (and frees) an fd just to check readiness.

This matters because fd pressure is exactly the condition that triggers this bug class: a socket only lands on fd >= 1024 in a process holding many descriptors. In a process at or near RLIMIT_NOFILE, the readiness check itself now fails with OSError: [Errno 24] Too many open files (raised from epoll_create/kqueue inside selectors.py), so a worker that runs out of fds loses the ability to talk to Redis entirely even though its connection socket is healthy.

This PR switches _socket_can_read() to select.poll(), which:

  • has no FD_SETSIZE limit (the original bug stays fixed);
  • allocates no file descriptor — the poll set lives in the syscall arguments, not in a kernel object — so readiness checks keep working under fd exhaustion;
  • avoids constructing a selector object per check on the hot path.

Behaviour is preserved: timeout seconds is converted to poll's milliseconds, timeout=0 remains a non-blocking check and timeout=None blocks forever; poll() always reports POLLHUP/POLLERR/POLLNVAL regardless of the registered mask, so a closed or errored socket still counts as "readable", matching the previous select()/selector semantics. On Windows, where select.poll is unavailable, the selectors.DefaultSelector path from #4115 is retained as the fallback (Windows is not subject to the Unix FD_SETSIZE value limit).

The async hiredis parser is unaffected — it reads via asyncio streams and performs no select/poll calls.

Changes

  • redis/_parsers/hiredis.py: _socket_can_read() prefers select.poll() when available; the selector-based check remains as the Windows fallback.
  • tests/test_connection.py:
    • test_hiredis_socket_can_read_uses_poll (parametrized) — verifies the poll path and the seconds-to-milliseconds timeout conversion for 0, 0.001, and None.
    • test_hiredis_socket_can_read_falls_back_to_default_selector — the existing selector test from Fix hiredis readiness checks for high file descriptors #4115, updated to patch _HAS_POLL = False so it exercises the fallback path.
    • test_hiredis_socket_can_read_handles_high_file_descriptor — kept from Fix hiredis readiness checks for high file descriptors #4115; skip condition updated for the new branch order.
    • test_hiredis_socket_can_read_under_fd_exhaustion (new) — lowers RLIMIT_NOFILE, fills the fd table via os.dup() until EMFILE, then asserts readiness checks still work for both a readable and an empty socket pair. This test fails with OSError: [Errno 24] against current master and passes with this change.

How tested

  • All hiredis can_read unit tests pass (none require a running server).
  • Verified the new fd-exhaustion regression test catches the issue: with redis/_parsers/hiredis.py reverted to master, it fails with OSError: [Errno 24] Too many open files raised from selectors.py; with this change it passes.
  • ruff check, ruff format --check, and vulture redis whitelist.py --min-confidence 80 are clean.

Note

Medium Risk
Touches the hot-path readiness check on every hiredis can_read(); behavior is intended to match prior semantics but poll vs epoll edge cases and Windows fallback warrant careful review.

Overview
Hiredis socket readiness now uses select.poll() on platforms where it exists, instead of creating a fresh selectors.DefaultSelector() on every _socket_can_read() call. That keeps the #4115 fix (no FD_SETSIZE / high-fd select failures) while avoiding epoll/kqueue per-check fd allocation, which could raise EMFILE when the process is already fd-starved—the same situation that pushes Redis sockets onto high fds. Windows still uses the selector fallback via _HAS_POLL.

Tests now assert the poll path (including timeout → ms conversion), force the selector path with _HAS_POLL = False, and add test_hiredis_socket_can_read_under_fd_exhaustion to prove readiness checks still work when the fd table is full.

Reviewed by Cursor Bugbot for commit 5347b79. Bugbot is set up for automated code reviews on this repo. Configure here.

…_read()` — use `poll()` instead of a per-call selector
@jit-ci

jit-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown

Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset.

In case there are security findings, they will be communicated to you as a comment inside the PR.

Hope you’ll enjoy using Jit.

Questions? Comments? Want to learn more? Get in touch with us.

@jit-ci

jit-ci Bot commented Jun 12, 2026

Copy link
Copy Markdown

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

@petyaslavova

Copy link
Copy Markdown
Collaborator

Hi @violuke, thanks for the contribution!

I'll take a closer look on Monday. I had completely missed the fact that DefaultSelector actually uses a new file descriptor, so this definitely shouldn't stay as it is. This PR looks like a good first step toward a proper solution.

Thanks for pushing this forward!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants