[wip] tuneable worker pools by dwoz · Pull Request #68532 · saltstack/salt

dwoz · 2025-12-13T22:48:38Z

What does this PR do?

What issues does this PR fix or reference?

Fixes

Previous Behavior

Remove this section if not relevant

New Behavior

Remove this section if not relevant

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

Docs
Changelog - https://docs.saltproject.io/en/master/topics/development/changelog.html
Tests written/updated

Commits signed with GPG?

Yes/No

…nflict - Implement forward_message() in TCP and WebSocket transports The abstract method was added to DaemonizedRequestServer but only implemented in ZeroMQ. TCP and WS now log a warning since worker pool routing is only supported for ZeroMQ transport. - Remove duplicate frontend_channels in ReqServer.__bind() The dispatcher IS the front-end that listens on port 4506 for minion connections. Creating separate frontend_channels caused a port conflict preventing minions from connecting to the master. Fixes: - Lint failure: abstract-method warnings for tcp.py and ws.py - Test failures: minions unable to connect (request timeouts)

The dispatcher was incorrectly trying to create its own request_server transport which attempted to bind to port 4506, conflicting with the frontend channel that minions connect to. Changes: - Restored frontend_channels that listen on port 4506 for minion connections - Modified PoolDispatcherChannel to receive frontend_channels and connect to them as a worker (like MWorker does) - Dispatcher now properly routes requests from frontend to pool channels Architecture flow: 1. Minions connect to frontend_channels on port 4506 2. Dispatcher acts as a worker to frontend_channels 3. Dispatcher classifies requests and forwards to pool channels 4. Pool workers handle requests from their pool's channel

Updated test_pool_routing.py to match the new PoolDispatcherChannel constructor signature which now requires frontend_channels parameter.

The current dispatcher architecture is fundamentally flawed. The forward_message() method tries to use REQ→DEALER pattern which doesn't work, causing all requests to timeout. Issue: Dispatcher cannot forward messages between ZeroMQ queue devices. The architecture needs to be redesigned. Disabling by default to unblock CI while we rework the implementation.

Replaced the broken dispatcher architecture with a custom ZeroMQ routing device that properly routes requests to different worker pools based on command classification. **What Changed:** 1. **New ZeroMQ Pooled Router** (`salt/transport/zeromq.py`): - Implemented `zmq_device_pooled()` method that creates one frontend ROUTER socket and multiple backend DEALER sockets (one per pool) - Routes incoming requests by deserializing payload, classifying command, and forwarding to appropriate pool's DEALER socket - Tracks pending requests to match responses back to clients - Uses `zmq.Poller()` for efficient multi-socket handling 2. **Simplified Master Architecture** (`salt/master.py`): - Removed complex dispatcher process and frontend/pool channel separation - Single request server transport with pooled routing (ZeroMQ only) - Workers connect directly to pool-specific IPC sockets (workers-{pool}.ipc) - Removed unused tornado.ioloop import 3. **Enabled by Default** (`salt/config/__init__.py`): - Changed `worker_pools_enabled` default from False to True - Feature now works correctly and provides workload isolation **Why the Previous Approach Failed:** The dispatcher tried to use `forward_message()` which created REQ sockets to connect to pool DEALER sockets. REQ→DEALER pattern doesn't work for bidirectional request/response. ZeroMQ queue devices are transparent pass-throughs and cannot route based on message content. **New Approach:** Custom routing device at ZeroMQ socket level: - Frontend ROUTER (port 4506) ← minions connect here - Custom routing loop examines each request payload - Forwards to appropriate pool's DEALER socket - Workers connect to their pool's DEALER (workers-{pool}.ipc) - Responses automatically route back through ZeroMQ envelope tracking **Testing:** Tested locally with salt-master and salt-minion: - Minion connects successfully (no timeouts) - `salt test-minion test.ping` works correctly - Pooled router logs show proper pool creation (fast, general) - Workers bind to pool-specific IPC sockets as expected

**The Bug:** The pooled routing device was incorrectly matching responses back to clients. When multiple clients sent requests to the same pool concurrently, responses would be sent to the wrong clients because the code just found the FIRST pending request for that pool, not the ACTUAL client whose request was answered. **The Symptom:** - Tests passed when run individually (single request at a time) - Tests failed massively in CI (many concurrent requests) - 147 test failures in CI, all related to incorrect response routing **The Fix:** Preserve the client identity through the entire request/response cycle: 1. When forwarding request to pool DEALER: - Before: `dealer.send_multipart([b"", payload])` # Lost identity! - After: `dealer.send_multipart([identity, b"", payload])` # Preserves it! 2. When receiving response from pool DEALER: - Before: Tried to match by pool name (WRONG!) - After: Extract identity from response: `identity = reply_msg[0]` 3. The ROUTER/DEALER queue device automatically preserves routing envelopes, so responses now correctly include the original client identity **How It Works:** - Client A -> [A_id, "", req] -> Frontend ROUTER - Frontend -> [A_id, "", req] -> Pool DEALER -> Worker - Worker -> response -> Pool DEALER -> [A_id, "", resp] - Frontend <- [A_id, "", resp] <- sends to Client A (CORRECT!) This fix eliminates the flawed `pending_requests` dict and broken matching logic.

The DEALER socket doesn't maintain request-response correlation automatically. This commit implements proper FIFO queue tracking per pool to correctly route responses back to the original client that sent each request. Key changes: - Track pending requests per pool using collections.deque (FIFO queue) - Send only [b'', payload] to DEALER (not client identity) - Pop client identity from FIFO queue when receiving response from DEALER - This ensures responses are matched to requests in correct FIFO order

Changed from ROUTER-ROUTER pattern (which requires DEALER workers) to ROUTER-DEALER pattern (compatible with REQ workers) by forwarding the entire message envelope through DEALER sockets. **Root Cause:** - DEALER load-balances to multiple REQ workers - Previous attempts stripped client identity before forwarding - Responses couldn't be correlated back to original clients **Solution:** - Forward entire envelope `[client_id, b"", payload]` through DEALER - DEALER preserves envelope when forwarding to/from REQ workers - Response includes client_id, allowing proper routing back to client This mimics how zmq.QUEUE device handles correlation, but with custom routing logic to direct requests to appropriate worker pools.

When worker pools are enabled, the zmq_device_pooled() function creates pool-specific IPC files (workers-fast.ipc, workers-medium.ipc, etc) but does not create the standard workers.ipc file that components like netapi expect to check if the master is running. The salt.netapi.NetapiClient._is_master_running() method checks for the existence of workers.ipc to determine if the master daemon is available. Without this file, netapi tests and components fail with "Salt Master is not available" errors even when the master is running correctly. This commit adds creation of a workers.ipc marker file after setting up the pool-specific DEALER sockets, ensuring backward compatibility with components that rely on this file for master status checks. Fixes integration test failures in chunks 3 & 5 related to netapi and other components that check master availability.

The pooled routing device process was registered as "MWorkerQueue-Pooled" but tests expect to see logs from a process named "MWorkerQueue". This change ensures the process uses the expected name regardless of whether worker pools are enabled.

…ools - Replace non-deterministic hash() with zlib.adler32() for pool port offsets. This ensures consistency across spawned processes on Windows where hash randomization is enabled by default. - Ensure workers.ipc is removed if it exists as a socket before creating the marker file. This prevents crashes during upgrades/downgrades where a legacy socket file might still exist in the socket directory. - Improve error handling during marker file creation to prevent process crashes. Fixes package test timeouts on Windows and Linux.

- Create worker channels once per pool instead of once per worker process. This significantly reduces Master startup time, especially on Windows where process spawning is slow and pickling is expensive. - Handle encrypted payloads (bytes) in RequestRouter._extract_command. Encrypted traffic is now correctly routed to the default pool instead of potentially causing AttributeErrors.

- Update all transport pre_fork methods to accept *args and **kwargs. - Update ReqServerChannel and PubServerChannel pre_fork to accept *args and **kwargs. - Simplify master.py to call chan.pre_fork directly with worker_pools. - This resolves crashes in non-ZeroMQ transports (TCP/WS) that were receiving unexpected worker_pools arguments.

- Pass SMaster.secrets to MasterPubServerChannel.pre_fork. - Correctly extract and set SMaster.secrets in both PubServerChannel and MasterPubServerChannel publish daemons. - This ensures spawned processes (Windows/macOS) have access to the AES key needed for serial number generation and payload wrapping.

- Update transport unit tests to match new pre_fork method signature. - Add safety checks in ZeroMQ _send_recv loop to ensure future.set_exception() is only called if the future is not already done. This prevents InvalidStateError when a task is cancelled or times out simultaneously with a socket error.

- Apply black formatting and fix indentation in unit tests. - Add an asyncio.Lock to RequestClient.connect() to prevent multiple concurrent connection attempts from spawning redundant _send_recv tasks. This resolves EFSM (invalid state) errors under high concurrency.

- Add close() method to PoolRoutingChannelV2Revised to ensure all pool clients and servers are properly shut down. - Harden RequestClient._send_recv loop to always reconnect on ZMQError. This ensures the REQ socket state is reset if a send or receive fails, preventing subsequent EFSM (invalid state) errors. - Add robust deserialization handling in _send_recv.

- Apply black formatting and remove trailing whitespace in pool_routing_v2_revised.py. - Handle encrypted payloads (bytes) in handle_and_route_message to prevent AttributeError. - Apply set comprehension optimization in deb.py (suggested by pyupgrade).

- Add future.done() checks before every future transition in RequestClient._send_recv. - This prevents asyncio.exceptions.InvalidStateError when a request times out or is cancelled just as a message arrives or a socket error occurs. - Add 'reconnect storm' protection by skipping already-completed futures pulled from the queue. - Improve error handling to ensure ANY ZeroMQ error resets the REQ socket state machine. - Update CI failure tracker with root cause and fix status.

…ssues - Await socket.poll() in RequestClient._send_recv to fix AttributeError when using asyncio ZeroMQ transport. - Ensure WebSocket handlers correctly return the WebSocketResponse object to satisfy aiohttp requirements. - Fix test_client_send_recv_on_cancelled_error by adding a shutdown sentinel to terminate the _send_recv loop.

- Call pre_fork on pool-specific worker channels in master.py. - Skip redundant ZMQ device startup in pre_fork if pool_name is set. - Fix AttributeError in RequestClient and AsyncReqMessageClient by ensuring send_recv_task is initialized and using yield/await on socket.poll(). - Improve task management in _init_socket() to prevent task leaks.

- Implement task ID tracking in both asyncio and Tornado ZeroMQ clients. - Ensures only the latest spawned _send_recv task can process the request queue. - This prevents 'reconnect storms' and EFSM (invalid state) errors where multiple tasks would interleave operations on the same REQ socket.

- Prevent request queue reset on connect() to avoid losing pending requests. - Add comprehensive task_id checks in both asyncio and Tornado clients to ensure only the latest spawned task is processing the queue and using the socket. - Handle EFSM explicitly during socket polling. - Ensure all future transitions are protected by future.done() checks.

- Add missing @tornado.gen.coroutine to AsyncReqMessageClient.send(). - Fix bug where _closing was reset to False in close(). - Implement message re-queuing in _send_recv() for both asyncio and Tornado clients to prevent message loss when a task is superseded by a newer one. - Remove redundant task_id checks from the middle of the send/recv loop.

- Re-introduce safe poll(POLLOUT) check during idle periods to detect socket issues, while ensuring it never interferes with in-flight requests. - Use future.cancelled() for robust cancellation detection across both asyncio and Tornado client implementations. - Fix UnboundLocalError by correctly ordering task_id checks and queue retrieval. - Ensure the receive loop strictly breaks on ZMQError to reset the state machine. - Fix potential message loss by re-queuing messages when a task is superseded.

dwoz requested a review from a team as a code owner December 13, 2025 22:48

dwoz had a problem deploying to ci December 13, 2025 22:48 — with GitHub Actions Error

dwoz added the test:full Run the full test suite label Dec 13, 2025

dwoz temporarily deployed to ci December 13, 2025 22:49 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 13, 2025 22:56 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 13, 2025 23:14 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 14, 2025 05:02 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 14, 2025 05:11 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 14, 2025 05:28 — with GitHub Actions Inactive

dwoz temporarily deployed to ci December 31, 2025 22:41 — with GitHub Actions Inactive

dwoz and others added 30 commits April 3, 2026 14:12

Use worker pools config in our tests

56f678e

Route requests

91be45b

Fix test to pass frontend_channels to PoolDispatcherChannel

b220dbd

Updated test_pool_routing.py to match the new PoolDispatcherChannel constructor signature which now requires frontend_channels parameter.

Fix linter

c74f615

Fix tests

6655c96

Fix KeyError: 'transport' in iter_transport_opts and pool_routing test

edd8ce5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[wip] tuneable worker pools#68532

[wip] tuneable worker pools#68532
dwoz wants to merge 31 commits intosaltstack:masterfrom
dwoz:tunnable-mworkers

dwoz commented Dec 13, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dwoz commented Dec 13, 2025

What does this PR do?

What issues does this PR fix or reference?

Previous Behavior

New Behavior

Merge requirements satisfied?

Commits signed with GPG?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant