GUACAMOLE-2265: Fix worker process leak after client disconnect. #662

Open
escra wants to merge 1 commit into apache:staging/1.6.1 from ESCRA-GmbH:fix/worker-process-leak

Conversation

@escra escra commented Apr 14, 2026

Summary

Worker processes may remain alive indefinitely after all users have
disconnected, particularly under resource pressure (e.g. cgroup pids-limit).
Over days/weeks this leads to steady PID accumulation until no new connections
can be created.

Root causes

1. pthread_create() failure ignored in guacd_proc_add_user()

When thread creation fails with EAGAIN (common under cgroup pids-limit), the
user thread never runs, so guacd_proc_stop() is never called and the worker
blocks in recvmsg() forever.
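
A minimal sketch of checking the pthread_create() return value, not the actual guacd code. The names user_params, start_user_thread(), and stop_flag are hypothetical, and writing to stop_flag stands in for calling guacd_proc_stop(). Thread creation is passed in as a function pointer only so that the EAGAIN path can be exercised without actually exhausting the PID limit.

```c
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-in for the per-user parameters handed to the thread. */
typedef struct user_params {
    int fd;          /* file descriptor received for this user */
    int *stop_flag;  /* set to 1 to ask the worker to shut down */
} user_params;

/* Same shape as pthread_create(), so a failing version can be injected. */
typedef int thread_create_fn(pthread_t *, const pthread_attr_t *,
                             void *(*)(void *), void *);

/* Hypothetical user thread: owns params and releases them when done. */
static void *user_thread(void *data) {
    user_params *params = data;
    close(params->fd);
    free(params);
    return NULL;
}

/* Returns 0 on success, or the error number reported by thread creation. */
static int start_user_thread(thread_create_fn *create, user_params *params) {
    pthread_t thread;
    int *stop_flag = params->stop_flag; /* read before the thread may free params */

    int err = create(&thread, NULL, user_thread, params);
    if (err != 0) {
        /* The thread never ran, so nothing else will release these. */
        close(params->fd);
        free(params);
        *stop_flag = 1; /* stand-in for guacd_proc_stop() */
        return err;
    }

    pthread_detach(thread);
    return 0;
}

/* Test double that fails the way a cgroup pids-limit would. */
static int failing_create(pthread_t *thread, const pthread_attr_t *attr,
                          void *(*fn)(void *), void *arg) {
    (void) thread; (void) attr; (void) fn; (void) arg;
    return EAGAIN;
}
```

In normal use the first argument would simply be pthread_create; passing failing_create shows that the descriptor is closed and the stop request is raised when creation fails.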

2. Worker main loop blocks indefinitely in recvmsg()

The loop in guacd_exec_proc() calls guacd_recv_fd() with no timeout or
health check. If guacd_proc_stop() is not triggered for any reason, the
worker hangs at 0% CPU consuming a PID slot indefinitely.

3. RDP fail path does not clean up resources

guac_rdp_handle_connection() does not free the display, FreeRDP instance,
keyboard, or SVC list on the fail path, leaking resources on every failed
connection attempt.
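
The general shape of a fail label that releases everything acquired so far can be sketched as follows. This is an illustration of the pattern only: the session struct, acquire(), release(), and handle_connection() are hypothetical and do not correspond to the FreeRDP or guacd APIs. The live_allocations counter exists purely to make leaks observable.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the display, FreeRDP instance, keyboard,
 * and SVC list named in the description. */
typedef struct session {
    void *display;
    void *instance;
    void *keyboard;
    void *svc_list;
} session;

/* Number of resources currently held; used only to detect leaks. */
static int live_allocations = 0;

static void *acquire(void) {
    void *p = malloc(1);
    if (p != NULL)
        live_allocations++;
    return p;
}

/* Safe to call with NULL, so cleanup need not track what was acquired. */
static void release(void *p) {
    if (p != NULL) {
        free(p);
        live_allocations--;
    }
}

static void session_cleanup(session *s) {
    release(s->svc_list);
    release(s->keyboard);
    release(s->instance);
    release(s->display);
}

/* Returns 0 on success, 1 on failure. connect_ok simulates whether the
 * remote host was reachable. */
static int handle_connection(int connect_ok) {
    session s = { NULL, NULL, NULL, NULL };

    if ((s.display = acquire()) == NULL) goto fail;
    if ((s.instance = acquire()) == NULL) goto fail;
    if ((s.keyboard = acquire()) == NULL) goto fail;
    if ((s.svc_list = acquire()) == NULL) goto fail;
    if (!connect_ok) goto fail;

    /* ... the session would run here ... */

    session_cleanup(&s);
    return 0;

fail:
    /* Without this call, every failed attempt leaks what was acquired. */
    session_cleanup(&s);
    return 1;
}
```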

Fix

  • Handle pthread_create() failure by closing the FD, freeing params, and
    calling guacd_proc_stop()
  • Replace the bare recvmsg() loop with a poll()-based loop that checks
    client state every 5 seconds (GUACD_PROC_IDLE_TIMEOUT_MS). Workers exit
    when the client has stopped or all users have disconnected. An absolute
    safety timeout (GUACD_PROC_MAX_IDLE_MS = 30s) handles edge cases where
    the normal exit path is blocked
  • Add resource cleanup to the fail label in
    guac_rdp_handle_connection()

JIRA

GUACAMOLE-2265

Related

  • GUACAMOLE-2118 -
    Sporadic hanging connections after upgrade
  • GUACAMOLE-2143 -
    Improve process management to prevent zombie process accumulation

Test plan

  • Run guacd under cgroup with pids.max=50, create 60 concurrent
    connections, verify workers exit after disconnect
  • Kill the user's browser mid-session, verify worker exits within 30s
  • Trigger RDP connection failures (unreachable host), verify no resource
    leak via /proc/[pid]/fd count
  • Stress test: 100 rapid connect/disconnect cycles, verify process count
    returns to baseline
  • Verify normal operation is unaffected (long-running sessions, reconnect)
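
For the /proc/[pid]/fd check in the test plan, one way to count descriptors is a small helper like the one below. It is a hypothetical test aid, not part of this PR, and assumes a Linux /proc filesystem. When a process counts its own descriptors, the result includes the one opendir() holds while reading the directory, so comparisons should be made between two counts taken the same way.

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>

/* Returns the number of entries in /proc/<pid>/fd, or -1 if the
 * directory cannot be opened (no such process, no permission, no /proc). */
static int count_open_fds(pid_t pid) {
    char path[64];
    snprintf(path, sizeof path, "/proc/%ld/fd", (long) pid);

    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    int count = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* Skip the "." and ".." entries. */
        if (entry->d_name[0] != '.')
            count++;
    }

    closedir(dir);
    return count;
}
```

Sampling this before and after a batch of failed connection attempts and comparing the two values gives a quick indication of whether descriptors are leaking.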

Worker processes may remain alive indefinitely after all users have
disconnected, particularly under resource pressure (e.g. cgroup
pids-limit). The root cause is threefold:

1. guacd_proc_add_user() ignores the pthread_create() return value.
   When thread creation fails (EAGAIN), the user thread never runs,
   so guacd_proc_stop() is never called and the worker blocks in
   recvmsg() forever.

2. The worker main loop blocks indefinitely in guacd_recv_fd() with
   no timeout or health check. If guacd_proc_stop() is not triggered
   for any reason, the worker hangs at 0% CPU consuming a PID slot.

3. The RDP fail path in guac_rdp_handle_connection() does not clean up
   the display, FreeRDP instance, keyboard, or SVC list, leaking
   resources on every failed connection attempt.

Fix:
- Handle pthread_create() failure by closing the FD, freeing params,
  and calling guacd_proc_stop().
- Replace the bare recvmsg() loop with a poll()-based loop that checks
  client state every 5 seconds. Workers exit when the client has
  stopped or all users have disconnected. An absolute safety timeout
  (30s) handles edge cases where cleanup is blocked.
- Add resource cleanup to the fail label in guac_rdp_handle_connection().

@necouchman (Contributor)

@escra I believe the issue you're describing here may be a duplicate of GUACAMOLE-2118, and your fix may address that issue. Please check on that issue and see if it matches, and re-tag your issue as GUACAMOLE-2118.
