diff options
| author | Stefan Boberg <[email protected]> | 2026-04-17 19:48:22 +0200 |
|---|---|---|
| committer | GitHub Enterprise <[email protected]> | 2026-04-17 19:48:22 +0200 |
| commit | 2e14b335d14d0735da7070803869b6fea741dc57 (patch) | |
| tree | 25cc389df422483d4820a66a0a0c6898f4b561b2 /scripts/test_scripts | |
| parent | cleanup buildstorageoperations (#973) (diff) | |
| download | archived-zen-2e14b335d14d0735da7070803869b6fea741dc57.tar.xz archived-zen-2e14b335d14d0735da7070803869b6fea741dc57.zip | |
[zenhorde] security + robustness fixes (#972)
### Critical (cryptographic correctness)
- AES-GCM nonce: replace homebrew `N32[0]++; N32[1]--; N32[2] = ^` scheme with NIST SP 800-38D ยง8.2.1 deterministic construction (64-bit big-endian counter). Session tears down on counter exhaustion instead of reusing a nonce.
- Remove `std::random_device` / `mt19937` nonce seed - the deterministic construction from the previous commit doesn't need an RNG, and `std::random_device` isn't guaranteed to be a CSPRNG.
- BCrypt return values: check every `BCRYPT_SUCCESS`, cache the `BCRYPT_KEY_HANDLE` on the context instead of re-creating it per message, destroy under null-guards. Closes the silent-downgrade-to-non-GCM path.
### High
- OpenSSL: check `EVP_CIPHER_CTX_new` / `EVP_EncryptInit_ex` / `EVP_DecryptInit_ex` return values in the constructor and set `HasErrors` on failure.
- Log AES-GCM tag-verification failures distinctly from other decrypt errors (BCrypt `STATUS_AUTH_TAG_MISMATCH` / OpenSSL `EVP_DecryptFinal_ex` post-set-tag), with a sequence counter for correlation.
- Thread a bounds-checked `ReadCursor` through every `Read*` parser helper; `ReadException` / `ReadExecuteResult` / `ReadBlobRequest` now return `bool` and callers treat malformed frames as protocol errors. Closes the `0xFF` varint OOB-read.
- Validate `ReadBlobRequest` locator as a safe filename component (reject path separators, `..`, NUL/control, drive colons, leading/trailing dot/space, length > 255). Closes the path-traversal attack on the `BundleDir / (Locator + ".blob")` join.
- Bind `AsyncAgentMessageChannel`'s timer and `AsyncReadResponse` entry onto the socket's strand; expose `AsyncComputeSocket::GetStrand()`. Removes the race between the bare-io_context timer completion and `OnFrame` on `m_PendingHandler` under the 3-thread pool.
- Drop the long-lived `m_EncryptBuffer` member - encrypt into a fresh per-write buffer shared with the completion handler. Also fixes thread-safety of the encrypt path.
- Validate server-returned `ClusterId` against `[A-Za-z0-9._-]{1,64}` before concatenating into the `api/v2/compute/<ClusterId>` URL.
### Medium
- `EVP_CIPHER_CTX_reset` + re-bind cipher on every encrypt/decrypt so stale state cannot bleed across messages. Also logs EVP failures.
- Malformed `ExecuteResult` (size != 4) now tears down the agent instead of silently reporting `ExitCode = -1`.
- Replace `assert(Eq != nullptr)` on env var parsing with a `zen::runtime_error` - assert is compiled out in release and `*(Eq+1)` was UB.
- Blob name uses `zen::Oid::NewOid()` (24 hex chars, seeded from `random_device` run-id + monotonic serial) instead of predictable `<pid>_<ms>_<counter>`. Refuse to overwrite an existing blob path.
- Cap `m_RecentlyDrainedWorkerIds` at 256 entries with an FIFO eviction queue.
- `Blob(Data, Length)` rejects `Length > INT32_MAX` instead of wrapping the int32 wire fields.
- Static `AuthToken` path uses `HttpClientAccessToken::TimePoint::max()` (never-expires sentinel) instead of synthesizing `now + 24h`.
- Remove dead `m_Transport` field and `else if (m_Transport)` branch in `AsyncHordeAgent::Cancel()`.
Diffstat (limited to 'scripts/test_scripts')
| -rw-r--r-- | scripts/test_scripts/kill-test-processes.ps1 | 19 | ||||
| -rwxr-xr-x | scripts/test_scripts/kill-test-processes.sh | 45 |
2 files changed, 64 insertions, 0 deletions
diff --git a/scripts/test_scripts/kill-test-processes.ps1 b/scripts/test_scripts/kill-test-processes.ps1 new file mode 100644 index 000000000..0668a5319 --- /dev/null +++ b/scripts/test_scripts/kill-test-processes.ps1 @@ -0,0 +1,19 @@ +# Kill leftover CI test processes (zenserver, minio, nomad, consul) whose +# executable lives under the given build directory. Windows counterpart of +# kill-test-processes.sh; see that file for rationale. +# +# Usage: kill-test-processes.ps1 -Label <label> -BuildDir <path> + +param( + [Parameter(Mandatory=$true)][string]$Label, + [Parameter(Mandatory=$true)][string]$BuildDir +) + +foreach ($name in @('zenserver', 'minio', 'nomad', 'consul')) { + $procs = Get-Process -Name $name -ErrorAction SilentlyContinue | + Where-Object { $_.Path -and $_.Path.StartsWith($BuildDir, [System.StringComparison]::OrdinalIgnoreCase) } + foreach ($p in $procs) { + Write-Host "Killing $Label $name (PID $($p.Id)): $($p.Path)" + $p | Stop-Process -Force -ErrorAction SilentlyContinue + } +} diff --git a/scripts/test_scripts/kill-test-processes.sh b/scripts/test_scripts/kill-test-processes.sh new file mode 100755 index 000000000..e5fb2fcaa --- /dev/null +++ b/scripts/test_scripts/kill-test-processes.sh @@ -0,0 +1,45 @@ +#!/usr/bin/env bash +# +# Kill leftover CI test processes (zenserver, minio, nomad, consul) whose +# executable lives under the given build directory. +# +# Used by CI workflows to clean up any test processes from a previous run +# before starting a new one, and to reap any that are still running after +# the run finishes. +# +# Usage: kill-test-processes.sh <label> <build_dir> +# label: word printed in log messages, e.g. "stale" or "leftover" +# build_dir: absolute path; only processes whose executable is under this +# directory are killed. +# +# We resolve each PID's actual executable path rather than relying on argv +# (which `pgrep -a` reports). argv starts with the short process name (e.g. +# "consul") regardless of how the process was launched, so a pure argv prefix +# match never fires and leftovers silently survived between runs. + +set -u + +label=${1:?label required} +build_dir=${2:?build_dir required} + +get_exe_path() { + local pid=$1 + if [[ -r "/proc/$pid/exe" ]]; then + # Linux + readlink -f "/proc/$pid/exe" 2>/dev/null || true + else + # macOS fallback: the "txt" file descriptor points at the process binary + lsof -p "$pid" -Fn -a -d txt 2>/dev/null | awk '/^n/{print substr($0,2); exit}' || true + fi +} + +for name in zenserver minio nomad consul; do + while read -r pid; do + [[ -z "$pid" ]] && continue + exe=$(get_exe_path "$pid") + if [[ "$exe" == "$build_dir"* ]]; then + echo "Killing $label $name (PID $pid): $exe" + kill -9 "$pid" 2>/dev/null || true + fi + done < <(pgrep -x "$name" 2>/dev/null || true) +done |