From a15de3d263a5759ca4f07aa7a2b9e6d494551150 Mon Sep 17 00:00:00 2001
From: Stefan Boberg
Date: Mon, 23 Mar 2026 11:40:11 +0100
Subject: Process management improvements (#881)

This PR improves process lifecycle handling and resilience across several areas:

- **Reclaim stale shared-memory entries instead of exiting** (`zenserver.cpp`): When a zenserver instance fails to attach as a sponsor to an existing process (e.g. because the PID was reused by an unrelated process), the server now clears the stale shared-memory entry and proceeds with normal startup instead of calling `std::exit(1)`.
- **Wait for child process exit in `Kill()` and `Terminate()` on Unix** (`process.cpp`): After sending `SIGTERM` in `Kill()`, the code now waits up to 5s for graceful shutdown (escalating to `SIGKILL` on timeout), matching the Windows behavior. `Terminate()` also waits after `SIGKILL` so the child is properly reaped and doesn't linger as a zombie clogging up the process table.
- **Fix sysctl buffer race in macOS `FindProcess`** (`process.cpp`): The macOS process enumeration now retries the `sysctl` call (up to 3 attempts with 25% buffer padding) to handle the race where the process list changes between the sizing call and the data-fetching call. Also flattens the nesting and fixes the guard/free scoping.
- **Terminate stale processes before integration tests** (`zenserver-test.cpp`, `test.lua`): The integration test runner now accepts a `--kill-stale-processes` flag (passed automatically by `test.lua`) that scans for and terminates any leftover `zenserver`, `zenserver-test`, and `zentest-appstub` processes from previous test runs, logging the executable name and PID of each. This addresses flaky test failures caused by stale processes from prior runs holding ports or other resources.
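
For readers who want a picture of the `Kill()` escalation described above, here is a minimal POSIX sketch. It is not the code in `process.cpp`: the names `WaitForChildExit`, `KillWithEscalation` and `KillResult`, the 10 ms polling interval, and the caller-supplied grace period (the PR uses 5s) are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <csignal>
#include <thread>

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of the zen codebase): polls waitpid() with
// WNOHANG until the child has been reaped or the timeout elapses.
// Returns true if the child is gone (reaped here, or no longer waitable).
static bool WaitForChildExit(pid_t Pid, std::chrono::milliseconds Timeout)
{
    const auto Deadline = std::chrono::steady_clock::now() + Timeout;
    for (;;)
    {
        int         Status = 0;
        const pid_t Result = waitpid(Pid, &Status, WNOHANG);
        if (Result == Pid)
        {
            return true;  // child terminated and was reaped, no zombie left
        }
        if (Result == -1 && errno == ECHILD)
        {
            return true;  // nothing to wait for (already reaped or not our child)
        }
        if (Result == -1 && errno != EINTR)
        {
            return false;  // unexpected waitpid failure
        }
        if (std::chrono::steady_clock::now() >= Deadline)
        {
            return false;  // still running when the timeout expired
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

enum class KillResult
{
    ExitedGracefully,  // child went away within the grace period after SIGTERM
    ForceKilled,       // SIGKILL was needed
    Failed             // signalling or waiting failed
};

// Hypothetical sketch of the SIGTERM -> wait -> SIGKILL -> wait sequence.
KillResult KillWithEscalation(pid_t Pid, std::chrono::milliseconds GracePeriod)
{
    assert(Pid > 0);

    // Ask politely first. ESRCH means the process is already gone, which the
    // wait below reports as a graceful exit.
    if (kill(Pid, SIGTERM) != 0 && errno != ESRCH)
    {
        return KillResult::Failed;
    }
    if (WaitForChildExit(Pid, GracePeriod))
    {
        return KillResult::ExitedGracefully;
    }

    // Grace period expired: escalate, then wait again so the child is reaped.
    if (kill(Pid, SIGKILL) != 0 && errno != ESRCH)
    {
        return KillResult::Failed;
    }
    return WaitForChildExit(Pid, GracePeriod) ? KillResult::ForceKilled : KillResult::Failed;
}
```

The second wait is the part that matters for the zombie problem mentioned above: `SIGKILL` ends the child, but its process-table entry is only released once the parent calls `waitpid()` on it.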
---
 src/zenserver/zenserver.cpp | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

(limited to 'src/zenserver/zenserver.cpp')

diff --git a/src/zenserver/zenserver.cpp b/src/zenserver/zenserver.cpp
index 49cbbb9fc..ea86c5654 100644
--- a/src/zenserver/zenserver.cpp
+++ b/src/zenserver/zenserver.cpp
@@ -684,11 +684,33 @@ ZenServerMain::Run()
                     }
                     else
                     {
-                        ZEN_CONSOLE_WARN(ZEN_APP_NAME " exiting, failed to add sponsor owner pid {} to process listening to port {} (pid: {})",
-                                         m_ServerOptions.OwnerPid,
-                                         m_ServerOptions.BasePort,
-                                         Entry->Pid.load());
-                        std::exit(1);
+                        // The entry's process failed to pick up our sponsor request after
+                        // multiple attempts. Before reclaiming the entry, verify that the
+                        // PID does not still belong to a zenserver process. If it does, the
+                        // server is alive but unresponsive – fall back to the original error
+                        // path. If the PID is gone or belongs to a different executable the
+                        // entry is genuinely stale and safe to reclaim.
+                        const int             StalePid = Entry->Pid.load();
+                        std::error_code       ExeEc;
+                        std::filesystem::path PidExePath     = GetProcessExecutablePath(StalePid, ExeEc);
+                        const bool            PidIsZenServer = !ExeEc && (PidExePath.filename() == GetRunningExecutablePath().filename());
+                        if (PidIsZenServer)
+                        {
+                            ZEN_CONSOLE_WARN(ZEN_APP_NAME
+                                             " exiting, failed to add sponsor to process on port {} "
+                                             "(pid {}); that pid is still a running zenserver instance",
+                                             m_ServerOptions.BasePort,
+                                             StalePid);
+                            std::exit(1);
+                        }
+                        ZEN_CONSOLE_WARN(
+                            "Failed to add sponsor to process on port {} (pid {}); "
+                            "pid belongs to '{}' – assuming stale entry and reclaiming",
+                            m_ServerOptions.BasePort,
+                            StalePid,
+                            ExeEc ? "" : PidExePath.filename().string());
+                        Entry->Reset();
+                        Entry = nullptr;
                     }
                 }
                 else
--
cgit v1.2.3
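
The reclaim decision in the hunk above reduces to a small predicate over the executable-path lookup, which can be easier to reason about (and unit test) in isolation. The sketch below is an illustration only: `SponsorFailureAction` and `ClassifyFailedSponsor` are hypothetical names that do not exist in the zen codebase, and the paths in the usage note are made-up examples.

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical result type: what to do when the sponsor request was never
// picked up by the process recorded in the shared-memory entry.
enum class SponsorFailureAction
{
    ExitWithError,  // the PID still maps to our own executable: server is alive but unresponsive
    ReclaimEntry    // the PID is gone or belongs to another executable: entry is stale
};

// Hypothetical pure-function version of the check in the diff. LookupError and
// PidExePath stand in for the outputs of the executable-path lookup for the
// recorded PID; SelfExePath stands in for the path of the running executable.
SponsorFailureAction ClassifyFailedSponsor(const std::error_code&       LookupError,
                                           const std::filesystem::path& PidExePath,
                                           const std::filesystem::path& SelfExePath)
{
    // Only the file name is compared, as in the diff, so two installs of the
    // same binary in different directories still count as the same program.
    const bool PidIsSameExecutable = !LookupError && (PidExePath.filename() == SelfExePath.filename());
    return PidIsSameExecutable ? SponsorFailureAction::ExitWithError : SponsorFailureAction::ReclaimEntry;
}
```

With these assumptions, a lookup that yields `/opt/zen/zenserver` while running as `/usr/bin/zenserver` maps to `ExitWithError`, whereas a lookup error or a path such as `/usr/bin/python3` maps to `ReclaimEntry`. Note that a file-name match cannot distinguish the original server from an unrelated zenserver instance that happened to reuse the PID; the diff accepts that by falling back to the original error path in both cases.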