GUACAMOLE-2265: Fix worker process leak after client disconnect.#677
Open
escra wants to merge 1 commit into
Open
GUACAMOLE-2265: Fix worker process leak after client disconnect.#677escra wants to merge 1 commit into
escra wants to merge 1 commit into
Conversation
Worker processes are never terminated after a client disconnects when the normal exit path is blocked, leaking one guacd child per affected connection until the container is restarted. The worker's main loop blocked indefinitely in recvmsg() waiting for new user file descriptors on fd_socket and only woke up when a new user joined, so it could not notice that the client had stopped and no further users would ever arrive. Replace the blocking receive with a poll() that wakes on a fixed idle interval (GUACD_PROC_IDLE_TIMEOUT_MS) and decides whether to keep waiting or exit based on client state and connected user count. After the last user disconnects a short reconnect grace period (GUACD_PROC_RECONNECT_GRACE_MS) is honoured so a browser refresh or tab re-open can rejoin the existing session, and an absolute safety net (GUACD_PROC_MAX_IDLE_MS) force-exits a worker whose teardown is wedged (e.g. guac_argv_await stuck in the FreeRDP authenticate callback). The absolute safety net is gated on connected_users == 0 and idle_cycles is reset while users are connected. The poll loop only observes new user FDs on fd_socket; an already-connected single-user session generates no new FDs, so without this gating a healthy active session accumulated idle cycles and was force-terminated at the GUACD_PROC_MAX_IDLE_MS mark. The safety net now measures true post-disconnect idle time and can never fire on a live session, while still catching the blocked-teardown case. If pthread_create() for a user thread fails, the connection is closed and the worker is stopped so a failed thread launch cannot leak the worker. Verified with a k6 soak (143s) and an interactive session (285s) that an actively connected session survives well past GUACD_PROC_MAX_IDLE_MS with no premature disconnect and no 'exceeded maximum idle time' warning, and with a deterministic reproducer that holds one active single-user session past the safety-net threshold.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This is the main-targeted forward-port of #662 (GUACAMOLE-2265). The original PR targets the frozen
staging/1.6.1branch; this applies the same fix on top of currentmain(HEAD e775052).Problem
A guacd worker process could leak after client disconnect: the blocking
guacd_recv_fd()loop inguacd_exec_proc()never returns when the connection drops without a clean FD close, leaving the child alive in the process map.Fix
poll()-based loop with idle timeout, plus reconnect-grace and max-idle handling (GUACD_PROC_IDLE_TIMEOUT_MS,GUACD_PROC_RECONNECT_GRACE_MS,GUACD_PROC_MAX_IDLE_MS).POLLERR/POLLHUP/POLLNVAL.pthread_create()failure handling inguacd_proc_add_user()(close fd, free params, stop proc).Verification
Builds cleanly on
main;guacdlinks successfully. Substance is identical to #662 (verified diff parity).