### Critical (cryptographic correctness)
- AES-GCM nonce: replace homebrew `N32[0]++; N32[1]--; N32[2] = ^` scheme with NIST SP 800-38D §8.2.1 deterministic construction (64-bit big-endian counter). Session tears down on counter exhaustion instead of reusing a nonce.
- Remove `std::random_device` / `mt19937` nonce seed - the deterministic construction from the previous commit doesn't need an RNG, and `std::random_device` isn't guaranteed to be a CSPRNG.
- BCrypt return values: check every `BCRYPT_SUCCESS`, cache the `BCRYPT_KEY_HANDLE` on the context instead of re-creating it per message, destroy under null-guards. Closes the silent-downgrade-to-non-GCM path.
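
For reviewers unfamiliar with the deterministic construction, here is a minimal sketch of the idea, not the code in this PR. It assumes a 96-bit IV laid out as a 32-bit fixed field followed by the 64-bit big-endian counter; the class name `GcmNonceSequence`, the `FirstCounter` parameter, and the `std::optional` return are hypothetical and exist only to illustrate refusing to issue a nonce once the counter is exhausted.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>

// Hypothetical helper, not the PR's actual class.
// IV layout (assumption): fixed field (32 bits) || invocation counter (64 bits, big-endian).
class GcmNonceSequence
{
public:
    using Nonce = std::array<std::uint8_t, 12>;

    // FirstCounter is exposed only so the exhaustion path can be exercised.
    explicit GcmNonceSequence(std::uint32_t FixedField, std::uint64_t FirstCounter = 0)
        : m_FixedField(FixedField), m_Counter(FirstCounter)
    {
    }

    // Returns the next unique nonce, or std::nullopt once every counter value
    // has been handed out. The caller is expected to tear the session down then.
    std::optional<Nonce> Next()
    {
        if (m_Exhausted)
        {
            return std::nullopt;
        }

        Nonce Out{};
        for (std::size_t i = 0; i < 4; ++i)
        {
            Out[i] = static_cast<std::uint8_t>(m_FixedField >> (24 - 8 * i));
        }
        for (std::size_t i = 0; i < 8; ++i)
        {
            Out[4 + i] = static_cast<std::uint8_t>(m_Counter >> (56 - 8 * i));
        }

        // Never wrap: after the last value is issued, latch the exhausted flag.
        if (m_Counter == UINT64_MAX)
        {
            m_Exhausted = true;
        }
        else
        {
            ++m_Counter;
        }
        return Out;
    }

private:
    std::uint32_t m_FixedField;
    std::uint64_t m_Counter;
    bool m_Exhausted = false;
};
```

Because the counter only moves forward and never wraps, two calls on the same sequence cannot return the same nonce, which is the property the old increment/decrement/xor scheme could not guarantee.
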
### High
- OpenSSL: check `EVP_CIPHER_CTX_new` / `EVP_EncryptInit_ex` / `EVP_DecryptInit_ex` return values in the constructor and set `HasErrors` on failure.
- Log AES-GCM tag-verification failures distinctly from other decrypt errors (BCrypt `STATUS_AUTH_TAG_MISMATCH` / OpenSSL `EVP_DecryptFinal_ex` post-set-tag), with a sequence counter for correlation.
- Thread a bounds-checked `ReadCursor` through every `Read*` parser helper; `ReadException` / `ReadExecuteResult` / `ReadBlobRequest` now return `bool` and callers treat malformed frames as protocol errors. Closes the `0xFF` varint OOB-read.
- Validate `ReadBlobRequest` locator as a safe filename component (reject path separators, `..`, NUL/control, drive colons, leading/trailing dot/space, length > 255). Closes the path-traversal attack on the `BundleDir / (Locator + ".blob")` join.
- Bind `AsyncAgentMessageChannel`'s timer and `AsyncReadResponse` entry onto the socket's strand; expose `AsyncComputeSocket::GetStrand()`. Removes the race between the bare-io_context timer completion and `OnFrame` on `m_PendingHandler` under the 3-thread pool.
- Drop the long-lived `m_EncryptBuffer` member - encrypt into a fresh per-write buffer shared with the completion handler. Also fixes thread-safety of the encrypt path.
- Validate server-returned `ClusterId` against `[A-Za-z0-9._-]{1,64}` before concatenating into the `api/v2/compute/<ClusterId>` URL.
### Medium
- `EVP_CIPHER_CTX_reset` + re-bind cipher on every encrypt/decrypt so stale state cannot bleed across messages. Also logs EVP failures.
- Malformed `ExecuteResult` (size != 4) now tears down the agent instead of silently reporting `ExitCode = -1`.
- Replace `assert(Eq != nullptr)` on env var parsing with a `zen::runtime_error` - assert is compiled out in release and `*(Eq+1)` was UB.
- Blob name uses `zen::Oid::NewOid()` (24 hex chars, seeded from `random_device` run-id + monotonic serial) instead of predictable `<pid>_<ms>_<counter>`. Refuse to overwrite an existing blob path.
- Cap `m_RecentlyDrainedWorkerIds` at 256 entries with a FIFO eviction queue.
- `Blob(Data, Length)` rejects `Length > INT32_MAX` instead of wrapping the int32 wire fields.
- Static `AuthToken` path uses `HttpClientAccessToken::TimePoint::max()` (never-expires sentinel) instead of synthesizing `now + 24h`.
- Remove dead `m_Transport` field and `else if (m_Transport)` branch in `AsyncHordeAgent::Cancel()`.
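
As an illustration of the capped `m_RecentlyDrainedWorkerIds` item above, a bounded set with FIFO eviction might look like the sketch below. The class name `BoundedRecentSet` and the use of `std::string` worker IDs are assumptions; the actual container and ID type in the code base may differ.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical sketch: a set that remembers at most Capacity IDs and evicts
// the oldest one (first in, first out) when a new ID would exceed the cap.
class BoundedRecentSet
{
public:
    explicit BoundedRecentSet(std::size_t Capacity) : m_Capacity(Capacity) {}

    void Insert(std::string Id)
    {
        // Re-inserting a known ID is a no-op and does not refresh its age.
        if (m_Capacity == 0 || m_Ids.count(Id) != 0)
        {
            return;
        }
        if (m_Order.size() == m_Capacity)
        {
            m_Ids.erase(m_Order.front());
            m_Order.pop_front();
        }
        m_Order.push_back(Id);
        m_Ids.insert(std::move(Id));
    }

    bool Contains(const std::string& Id) const { return m_Ids.count(Id) != 0; }
    std::size_t Size() const { return m_Order.size(); }

private:
    std::size_t m_Capacity;
    std::deque<std::string> m_Order;        // insertion order, oldest at the front
    std::unordered_set<std::string> m_Ids;  // membership lookup
};
```

With a capacity of 256 this keeps lookups constant-time on average while bounding memory, at the cost of forgetting the oldest drained worker first.
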