aboutsummaryrefslogtreecommitdiff
path: root/src/zenhttp/include
Commit message (Collapse)AuthorAgeFilesLines
* sessions: persist to disk, prune, track client liveness, accept UE_LOGFMT ↵HEADmainStefan Boberg2026-05-051-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (#1014) Branch started as a sessions-service overhaul (persistence, client liveness, UE_LOGFMT intake) and grew to pick up adjacent infrastructure work: an early-startup log backlog, a hardened `MemoryArena`, the `zen trace serve` viewer gaining a counter view + compact timeline + tabbed callsite panel, defensive fixes in the third-party `tourist` trace parser, a series of allocation reductions across the HTTP and compact-binary hot paths, and a new `zen sessions` CLI command tree. ## Sessions service **Persistence.** Each session lives on disk under `<DataRoot>/sessions/<id>/` as `info.cb` (metadata) plus `log.bin` (length-prefixed CbObject log records). On startup the service scans that directory and loads prior sessions as ended sessions, preloading the tail of each log so historical views work after a restart. `SessionLog` is noexcept-constructed and falls back to a disabled state on disk errors, so a bad disk can't take down `RegisterSession`. `GetSession` falls back to the ended-sessions list (fixes historical log fetches over HTTP). `LoadTail` counts only successfully-parsed records. **Pruning.** Periodic cleanup task drops ended sessions once any of three caps is exceeded: age (default 1 year), count (default 1000), or total on-disk footprint (default 50 MiB). Runs 30 s after startup, hourly thereafter. Active sessions never pruned; disk removal and directory stat happen outside the exclusive lock so a slow filesystem can't stall lookups. **Client liveness.** Sessions carry a `ProcessHandle` for the client-reported pid, captured at registration time so Windows pid recycling can't produce false positives. A 30 s asio timer probes liveness and ends dead sessions through the normal remove path, producing a synthetic `Session ended: process exited (...)` line persisted to `log.bin`. Windows decodes common NTSTATUS exit codes to human names (Ctrl-C, access violation, stack overflow, ...); POSIX stays at plain `process exited`. Clients auto-fill `ClientPid` only for local targets (unix socket / loopback); the server defensively accepts pids only from `IsLocalMachineRequest()` peers. zenserver also reports its own pid when registering its self-session, so it shows up with a real pid in the dashboard and `zen sessions ls`. **Synthetic end-of-session line.** `RemoveSession` takes an optional reason; before the session moves to the ended list it appends an Info-level `Session ended[: reason]` entry through the normal log path (released outside `m_Lock`). Current reasons: `client request` (HTTP DELETE), `server shutdown` (self-session), `process exited (...)` (liveness). **UE_LOGFMT structured entries.** `POST /sessions/{id}/log` now accepts `{level, logger, format, fields}` alongside the existing `{level, logger, message}` shape. New `logtemplate.{h,cpp}` implements UE's `StructuredLog.cpp` template grammar (field paths with `.name` / `[N]`, `{{`/`}}` escapes, `$text` / `$format` / `$locformat` object conventions, bounded recursion). Renders to a displayable message at intake while persisting raw format + fields so a future UI can drill into fields without another schema bump. Hot path is zero-alloc — renders into `ExtendableStringBuilder<256>` using stack-buffered `Oid::ToString` / `IoHash::ToHexString` overloads. UI shows a `{…}` marker with the raw template + JSON-pretty fields on hover. **Parent sessions.** `SessionInfo` gains `parent_session_id`; hub-managed storage server child processes inherit the hub's session id via `--parent-session=<id>`. `ZEN_SESSIONS_URL` env var becomes a fallback for `--sessions-url` / config when neither is provided. The in-process session log sink is disabled when a remote sessions target is configured (logs flow through `SessionsServiceClient` instead). The sessions UI groups child sessions under their parent (collapsible/expandable, sorts as a unit, supports nesting). **Platform reporting.** `SessionInfo` gains `Platform`, flowed end-to-end: client auto-fills via `GetRuntimePlatformName()`, server persists in `info.cb` (`plat`) and emits on GET. UI renders as a SimpleIcons-style inline SVG (windows / macOS / iOS / linux / wine / android / playstation / xbox / nintendo) with case-insensitive alias resolution (Win32/Win64, PS4/PS5, XSX/XSS, NintendoSwitch, iPhone/iPad, Darwin/OSX). Unknown values fall back to text; sorting runs on the underlying string. **WebSocket log streaming.** Sessions UI moves from 2 s polling to a WebSocket push model. New `WsSubscriber` has a stable id + helper methods. UI caps the log-line DOM at 5 000 entries with a shared cursor-regression helper, factored out of two call sites. Per-broadcast allocations trimmed on the push path; fixed a stack overrun in the WS log broadcast hex-id buffer. **Log memory.** `LogEntry::Level` is now `logging::LogLevel` (1 byte) instead of `std::string` (~32 B) — saves ~310 KB per full 10 k-entry deque and eliminates a per-message allocation in the in-proc sink. On-disk format writes an int32 and accepts either int or legacy string on read. `LogEntry` strings now live in a `MemoryArena`; logger names are interned across the deque. `SessionLog::Append` and `WriteSessionInfoFile` drop their `UniqueBuffer` round-trip and write `CbObject::GetView()` straight through `BasicFile` / `SafeWriteFile`. Multi-entry `POST /log` batched under one lock + one push. **In-proc log timestamps.** `InProcSessionLogSink::TimePointToDateTime` previously preserved only whole seconds, so every in-proc entry rendered at `.000` ms in the dashboard and `zen sessions tail`. It now adds the sub-second part (nanoseconds → 100 ns ticks) to keep ms precision end-to-end. **UI.** Side "Session Details" panel is gone — its info is inline in the table (appname, mode, platform, id, timestamps, this/log pills, active dot). Bottom panel is a tabbed `Log | Metadata` view with a right-side "Session Information" panel beside metadata; log-only controls (filter, newest-first, follow, log-level filter, expand/collapse) hide when Metadata is active, polling keeps running across tab switches. Wide-mode toggle fills the viewport edge-to-edge. Log lines show the logger category; timestamps render in 24 h with zero-padded fields regardless of locale. Sessions list defaults to All / 10 per page / created-desc, gains click-to-sort headers on the full dataset, a header filter box, and a pager aligned to the table's right edge. Duplicate auto-injected `<h1>Sessions</h1>` removed. ## `zen sessions` CLI New command tree on the `zen` client for inspecting the sessions service from the terminal: - **`zen sessions ls`** — lists sessions (active first, ended next; newest-first within each group) with id, status, app/mode, pid, created, duration, and log count. Supports `--status active|ended|all` (default `all`). - **`zen sessions status`** — prints the sessions service summary: self id, active / ended counts, and the read/write/delete/list/request/bad-request counters from `/stats/sessions`. - **`zen sessions tail [session]`** — tails a session's log. With no argument it tails zenserver's own session (resolved via `/sessions/list`'s `self_id`); an explicit 24-hex id targets any session, including ended ones (historical replay). `--lines N` (default 50, 0 = all buffered) trims the initial dump client-side. `--follow` prefers a WebSocket push subscription on `/sessions/ws` for sub-second latency; on upgrade failure (older server, blocked port, unix-socket transport) it falls back to HTTP cursor polling at `--interval-ms` (default 500), with sleeps chunked to 50 ms so Ctrl-C reacts quickly. Output matches `zen::logging::FullFormatter` (`[YY-MM-DD HH:MM:SS.mmm] [lvl] [logger] message`); on a TTY the level is colored and the logger is bold, with continuation lines indented under the message column using the *visible* prefix width. 404 surfaces as `(session ended)` and connection errors as `(server gone)` — both clean exits, so stopping the server mid-tail no longer prints a stack trace. - **`zen sessions ui`** — opens `<host>/dashboard/?page=sessions` in the user's default browser. Rejects unix-socket hosts. A small `ZenServiceClient::IsUnixSocket()` helper now wraps the unix-socket check used by `ui`, `sessions tail` (WS path), and `sessions ui`. ## Logging `BacklogSink` captures early-startup log entries in a fixed-capacity ring so late-attached sinks (session sink, file sink) can replay them. Detaches from the broadcast list when disabled; backed by destructor-only cleanup (no `unique_ptr` indirection per entry). Tuned defaults so the backlog covers typical bring-up without unbounded growth. ## `zen trace serve` viewer - Compact timeline mode for high-density views. - New `TRACE_INT_VALUE` / `TRACE_FLOAT_VALUE` counter trace points + a counters page in the viewer. - Callsite tables collapsed into a single tabbed panel. - Lossless `Oid <-> Guid` bridge for trace session ids; trace `SessionId` plumbed through. - `tourist` parser hardening: bounds-check `BufferStream::read`, validate `Type::info_size` before `patch()`, convert `parse_important_aux` to a loop (avoids deep recursion), widen `ParserPool` index to `uint32`, bounds-check field offsets in the dispatcher, pin `Types::parse` buffer up-front. ## `MemoryArena` Configurable chunk size, inline chunk list, oversize requests routed to truly-dedicated chunks (no slack waste, no fragmentation when one allocation is much larger than the chunk). ## Allocation cleanups across hot paths - `zenhttp::HttpRequestRouter::HandleRequest` and `FormatPackageMessageInternal`: drop heap allocations. - Compact-binary validation: `eastl::fixed_vector` + `eastl::sort`; eliminate `std::vector` churn. - `zenserverprocess`: trim transient allocations in spawn paths. - Sessions HTTP intake / broadcast: drop transient `std::string` allocs.
* hub async s3 client (#1024)Dan Engelbrecht2026-05-052-45/+211
| | | | | | | | | | | | | | - Feature: `AsyncHttpClient` adds cancellable request tokens, streaming GET to a file (`AsyncDownload`), zero-copy chunk-callback GET (`AsyncStream`), pull-mode body source for streaming `AsyncPut`, retry layer mirroring the synchronous client, and a submit-side in-flight cap (`HttpClientSettings::MaxConcurrentRequests`) so hub-scale fanout against a single host cannot stall queued handles into curl's connect-timeout window - Feature: Hub hydration can route S3 transfers through a non-blocking `AsyncHttpClient` (curl_multi + asio) backed by a single io thread; hydrate and dehydrate now pipeline requests instead of blocking worker threads - `--hub-hydration-async-enabled` (Lua: `hub.hydration.async.enabled`, default true) - `--hub-hydration-async-max-concurrent-requests` (Lua: `hub.hydration.async.maxconcurrentrequests`, default `clamp(cpu*4, 128, 512)`) - Feature: Hub provision/deprovision/obliterate now run as two phases on separate worker pools so per-module hydration cannot starve child-process spawn/despawn (and vice versa) - New `--hub-instance-spawn-threads` (Lua: `hub.instance.spawnthreads`, default `clamp(cpu/8, 4, 16)`) drives child-process spawn/despawn - `--hub-instance-provision-threads` (Lua: `hub.instance.provisionthreads`) now drives per-module hydrate/dehydrate scheduling only; default changed from `max(cpu/4, 2)` to `clamp(cpu/8, 4, 12)` - `--hub-hydration-threads` (Lua: `hub.hydration.threads`) now controls per-file workers inside a single hydrate/dehydrate; default changed from `max(cpu/4, 2)` to `clamp(cpu/8, 4, 12)` - Feature: `AsyncHttpClient` owns its `asio::io_context` and one io thread by default; the `(BaseUri, io_context&)` constructor is preserved for callers that want to share an externally-driven `io_context` across clients (caller MUST keep the loop running until the client destructs) - Feature: `Hub::Configuration` C++ struct fields renamed (`OptionalProvisionWorkerPool`/`OptionalHydrationWorkerPool` -> `OptionalProvisionPool`/`OptionalSpawnPool`/`OptionalHydrationPool`). Embedders constructing `Hub` directly must update field names; provision and spawn pools must both be set or both null (asserted at construction). - Bugfix: `S3Client` signing-key cache no longer returns stale signatures after IMDS-rotated credentials change `AccessKeyId`; cache is now keyed on `(DateStamp, AccessKeyId)`
* watchdog ephemeral port exhaust (#1022)Dan Engelbrecht2026-05-042-0/+63
| | | - Improvement: Hub pools HTTP connections to managed instances so provision/deprovision churn no longer exhausts Windows ephemeral ports
* zenhttp improvements (robustness / correctness) (#968)Stefan Boberg2026-05-043-7/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A collection of security, correctness, and robustness fixes in `zenhttp` and `zencore` surfaced by security review. Most items are small, independent commits grouped here because they all tighten trust boundaries or fix UB along the same code paths. ## WebSocket protocol hardening (RFC 6455) - **Enforce the client-side mask bit**. Server-side frame loops now reject unmasked frames with close code 1002 per §5.1. Prevents HTTP intermediary smuggling. - **Validate control frames and RSV bits**. Fragmented control frames, oversized (>125 B) control payloads, and any non-zero RSV bit now fail the connection before allocation. - **Lower per-frame payload cap** from 256 MB → 4 MB. Bounds per-connection accumulator memory. - **Implement message fragmentation**. Continuation frames are coalesced and delivered as a single message; interleaved non-control frames close with 1002; assembled messages are capped at 4 MB (1009 on overflow). Previously partial fragments were delivered to handlers, bypassing payload validation. - **Parse the 101 handshake response properly** in `HttpWsClient`. Status-line, `Upgrade`, `Connection`, and `Sec-WebSocket-Accept` are now matched exactly rather than via substring searches against the full body. ## Auth / OIDC hardening - **Constant-time password compare** in `PasswordSecurity::IsAllowed` (closes a remote length/content timing oracle). Adds a shared `ConstantTimeEquals` helper. - **Harden Basic-auth header parsing**: trim trailing LWS, reject control bytes and DEL in the credential. - **OIDC discovery pinning**: require HTTPS (loopback exempt), verify `issuer` matches `BaseUrl`, require `token_endpoint` / `userinfo_endpoint` / `jwks_uri` to share origin with `BaseUrl`, reject empty `token_endpoint`. - **Restrict `POST /auth/oidc/refreshtoken`** to local-machine requests. Previously unauthenticated in default deployments — remote callers could evict or replace cached tokens. - **Stop logging OIDC provider response bodies** on refresh failure (IdPs echo `refresh_token` back in error bodies). - **Drop the unused `IdentityToken` field** from `OidcClient` / `OpenIdToken` so nothing in the tree accidentally trusts an unverified JWT. ## Auth state encryption migration - Add `AesGcm` AEAD primitive (BCrypt / OpenSSL backends, mbedTLS stubbed) and `CryptoRandom::Fill` CSPRNG helper in `zencore/crypto.h`. - Migrate authstate file from AES-256-CBC with a fixed IV to AES-GCM with a fresh 12-byte random nonce per write and the 4-byte `ZEN1` magic bound as AAD. Legacy-CBC files are transparently read once and rewritten in the new format. ## Filesystem / IO robustness - `IoBufferExtendedCore::Materialize` now checks `MAP_FAILED` on POSIX (was comparing to `nullptr`, which let the failure sentinel propagate into later reads and `munmap(MAP_FAILED, ...)`). - `IoBufferBuilder::MakeFromFile / MakeFromTemporaryFile`: close the FD/HANDLE on exception via a dismissable `ScopeGuard`; actually check the `fstat()` return value (previously used an uninitialized `FileSize`). - `ReadFromFileMaybe`: loop short reads, retry `EINTR`, chunk Windows `ReadFile` at `0xFFFFFFFF` bytes (fixes silent truncation of multi-GiB reads). - `WipeDirectory`: compare `FindFirstFileW` handle against `INVALID_HANDLE_VALUE` rather than `nullptr`. - `RemoveFileNative` (Linux/macOS): report non-`ENOENT` stat failures via the `std::error_code` out-param and stop reading `st_mode` after a failed stat. ## Buffer / compression correctness - Avoid per-copy `IoBufferCore` heap allocations in `CompositeBuffer::CopyTo / ViewOrCopyRange` iterators; add fast path for `BufferHeader::Read` when the 64-byte header fits in the first plain-memory segment. - `BufferHeader`: add `IsHeaderValid()` gate covering `BlockSizeExponent` range, `BlockCount * BlockSize` overflow, and `TotalRawSize` bounds before any arithmetic uses them. Defends against attacker-controlled headers that can pass the CRC and trigger OOB writes in `DecompressBlock`.
* zenhttp: add FollowRedirects option to HttpClient (#982)Stefan Boberg2026-04-201-0/+7
| | | | | | - Adds `FollowRedirects` (default `false`) and `MaxRedirects` (default `5`) fields to `HttpClientSettings`. - When `FollowRedirects` is enabled, the curl backend sets `CURLOPT_FOLLOWLOCATION` and `CURLOPT_MAXREDIRS` so HTTP 3xx redirects are handled transparently in the transport layer — callers no longer need to parse `Location` headers and re-issue requests themselves. - Defaults are off, so existing callers see no behavior change.
* Move ZipFs from zenserver frontend into zenhttp (#980)Stefan Boberg2026-04-201-0/+35
| | | Moves `ZipFs` from `src/zenserver/frontend/` to `src/zenhttp/` so any binary linking `zenhttp` can serve a bundled web UI from a zip archive (motivator: the upcoming `zen trace serve` subcommand).
* fix utf characters in source code (#953)Dan Engelbrecht2026-04-133-9/+9
|
* Compute OIDC auth, async Horde agents, and orchestrator improvements (#913)Stefan Boberg2026-04-131-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rework of the Horde agent subsystem from synchronous per-thread I/O to an async ASIO-driven architecture, plus provisioner scale-down with graceful draining, OIDC authentication, scheduler improvements, and dashboard UI for provisioner control. ### Async Horde Agent Rewrite - Replace synchronous `HordeAgent` (one thread per agent, blocking I/O) with `AsyncHordeAgent` — an ASIO state machine running on a shared `io_context` thread pool - Replace `TcpComputeTransport`/`AesComputeTransport` with `AsyncTcpComputeTransport`/`AsyncAesComputeTransport` - Replace `AgentMessageChannel` with `AsyncAgentMessageChannel` using frame queuing and ASIO timers - Delete `ComputeBuffer` and `ComputeChannel` ring-buffer classes (no longer needed) ### Provisioner Drain / Scale-Down - `HordeProvisioner` can now drain agents when target core count is lowered: queries each agent's `/compute/session/status` for workload, selects candidates by largest-fit/lowest-workload, and sends `/compute/session/drain` - Configurable `--horde-drain-grace-period` (default 300s) before force-kill - Implement `IProvisionerStateProvider` interface to expose provisioner state to the orchestrator HTTP layer - Forward `--coordinator-session`, `--provision-clean`, and `--provision-tracehost` through both Horde and Nomad provisioners to spawned workers ### OIDC Authentication - `HordeClient` accepts an `AccessTokenProvider` (refreshable token function) as alternative to static `--horde-token` - Wire up `OidcToken.exe` auto-discovery via `httpclientauth::CreateFromOidcTokenExecutable` with `--HordeUrl` mode - New `--horde-oidctoken-exe-path` CLI option for explicit path override ### Orchestrator & Scheduler - Orchestrator generates a session ID at startup; workers include `coordinator_session` in announcements so the orchestrator can reject stale-session workers - New `Rejected` action state — when a remote runner declines at capacity, the action is rescheduled without retry count increment - Reduce scheduler lock contention: snapshot pending actions under shared lock, sort/trim outside the lock - Parallelize remote action submission across runners via `WorkerThreadPool` with slow-submit warnings - New action field `FailureReason` populated by all runner types (exit codes, sandbox failures, exceptions) - New endpoints: `session/drain`, `session/status`, `session/sunset`, `provisioner/status`, `provisioner/target` ### Remote Execution - Eager-attach mode for `RemoteHttpRunner` — bundles all attachments upfront in a `CbPackage` for single-roundtrip submits - Track in-flight submissions to prevent over-queuing - Show remote runner hostname in `GetDisplayName()` - `--announce-url` to override the endpoint announced to the coordinator (e.g. relay-visible address) ### Frontend Dashboard - Delete standalone `compute.html` (925 lines) and `orchestrator.html` (669 lines), consolidated into JS page modules - Add provisioner panel to orchestrator dashboard: target/active/estimated core counts, draining agent count - Editable target-cores input with debounced POST to `/orch/provisioner/target` - Per-agent provisioning status badges (active / draining / deallocated) in the agents table - Active vs total CPU counts in agents summary row ### CLI - New `zen compute record-start` / `record-stop` subcommands - `zen exec` progress bar with submit and completion phases, atomic work counters, `--progress` mode (Pretty/Plain/Quiet) ### Other - `DataDir` supports environment variable expansion - Worker manifest validation checks for `worker.zcb` marker to detect incomplete cached directories - Linux/Mac runners `nice(5)` child processes to avoid starving the main server - `ComputeService::SetShutdownCallback` wired to `RequestExit` via `session/sunset` - Curl HTTP client logs effective URL on failure - `MachineInfo` carries `Pool` and `Mode` from Horde response - Horde bundle creation includes `.pdb` on Windows
* minor fixups (#948)Dan Engelbrecht2026-04-131-3/+3
| | | | | | | * objectstore.cpp - m_TotalBytesServed now tracks all range cases (single, multi, 416) * async http: docstring corrected: curl_multi_socket_action() / ASIO socket async_wait remove non-ascii characters * fix singlethreaded gc option in lua to not use dash * fix changelog order
* HTTP range responses (RFC 7233) - httpobjectstore (#928)Dan Engelbrecht2026-04-102-2/+16
| | | | | | | | | - Improvement: HTTP range responses (RFC 7233) are now fully compliant across the object store and build store - 206 Partial Content responses now include a `Content-Range` header; previously absent for single-range requests, which broke `HttpClient::GetRanges()` - 416 Range Not Satisfiable responses now include `Content-Range: bytes */N` as required by RFC 7233 - Out-of-bounds range requests return 416 Range Not Satisfiable (was 400 Bad Request) - Single-byte ranges (`bytes=N-N`) are now correctly accepted (were previously rejected) - Range byte positions widened from 32-bit to 64-bit; RFC 7233 imposes no size limit on byte range values - Build store binary GET requests with a Range header now return 206 Partial Content with `Content-Range` (previously returned 200 OK without it)
* Add async HTTP client (curl_multi + ASIO) (#918)Stefan Boberg2026-04-091-0/+123
| | | | | | | | | | | | | | | | | | | | | | | - Adds `AsyncHttpClient` — an asynchronous HTTP client using `curl_multi_socket_action` integrated with ASIO for event-driven I/O. Supports GET, POST, PUT, DELETE, HEAD with both callback-based and `std::future`-based APIs. - Extracts shared curl helpers (callbacks, URL encoding, header construction, error mapping) into `httpclientcurlhelpers.h`, eliminating duplication between the sync and async implementations. ## Design - All curl_multi state is serialized on an `asio::strand`, safe with multi-threaded io_contexts. - Two construction modes: owned io_context (creates internal thread) or external io_context (caller runs the loop). - Socket readiness is detected via `asio::ip::tcp::socket::async_wait` driven by curl's `CURLMOPT_SOCKETFUNCTION`/`CURLMOPT_TIMERFUNCTION` — no polling, sub-millisecond latency. - Completion callbacks are dispatched off the strand onto the io_context so slow callbacks don't starve the curl event loop. Exceptions in callbacks are caught and logged. ## Files | File | Change | |------|--------| | `zenhttp/include/zenhttp/asynchttpclient.h` | New public header | | `zenhttp/clients/asynchttpclient.cpp` | Implementation (~1000 lines) | | `zenhttp/clients/httpclientcurlhelpers.h` | Shared curl helpers extracted from sync client | | `zenhttp/clients/httpclientcurl.cpp` | Removed duplicated helpers, uses shared header | | `zenhttp/asynchttpclient_test.cpp` | 8 test cases: verbs, payloads, callbacks, concurrency, external io_context, connection errors | | `zenhttp/zenhttp.cpp` | Forcelink registration for new tests |
* migrate from http_parser to llhttp (#929)Dan Engelbrecht2026-04-091-2/+3
|
* hub instance dashboard proxy (#914)Dan Engelbrecht2026-04-012-2/+2
| | | - Feature: Hub dashboard proxy - instance dashboards are accessible through the hub server at `/hub/proxy/{port}/` without requiring direct port access
* Request validation and resilience improvements (#864)Stefan Boberg2026-03-303-8/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ### Security: Input validation & path safety - **Reject local file references by default** in package parsing — only allow when explicitly opted in by the service (`ParseFlags::kAllowLocalReferences`) and validated by an `ILocalRefPolicy` (fail-closed: no policy = rejected) - **`DataRootLocalRefPolicy`** restricts local ref paths to the server's data root via canonical path prefix matching - **Validate attachment hashes** in compute HTTP handlers — decompresses and re-hashes each attachment at ingestion time to reject tampered payloads - **Path traversal validation** for worker descriptions (`pathvalidation.h`) — rejects absolute paths, `..` components, Windows reserved device names, and invalid filename characters - **Harden CbPackage parsing** against corrupt inputs — overflow-safe attachment count, bounds checks on local ref offset/size, graceful failure instead of `ZEN_ASSERT` for untrusted data - **Harden legacy package parser** — reject zero-size binary fields, missing mappers, and optionally validate resolved attachment hashes - **Bounds check in `CbPackageReader::MarshalLocalChunkReference`** — detect when `MakeFromFile` silently clamps offset+size to file size ### Reliability: Lock consolidation & bug fixes - **Consolidate three action map locks into one** (`m_ActionMapLock`) — eliminates deadlock risk from multi-lock ordering, simplifies state transitions, and fixes a race where newly enqueued actions were briefly invisible to `GetActionResult`/`FindActionResult` - **Fix infinite loop in `BaseRunnerGroup::SubmitActions`** when actions exceed total runner capacity — cap round-robin at `TotalCapacity` and default unassigned results to "No capacity" - **Fix `MakeSafeAbsolutePathInPlace` for UNC paths** — `\server\share` now correctly becomes `\?\UNC\server\share` instead of `\?\server\share` - **Fix `max_retries=0`** — previously fell through to the default of 3; now correctly means "no retries" ### New: ManagedProcessRunner - Cross-platform process runner backed by `SubprocessManager` — uses async exit callbacks instead of polling, delegates CPU/memory metrics to the manager's built-in sampler - `ProcessGroup` (JobObject on Windows, process group on POSIX) for bulk cancellation on shutdown - `--managed` flag on `zen exec inproc` to select this runner - Refactored monitor thread lifecycle — `StartMonitorThread()` now called from derived constructors to avoid calling virtual functions from base constructor ### Process management - **Suppress crash dialogs** via `JOB_OBJECT_UILIMIT_ERRORMODE` + `SEM_NOGPFAULTERRORBOX` in both `WindowsProcessRunner` and `JobObject::Initialize` — prevents WER/Dr. Watson modal dialogs from blocking the monitor thread - **CREATE_SUSPENDED → AssignProcessToJobObject → ResumeThread** pattern in `WindowsProcessRunner` — ensures job object assignment before process execution - **Move stdout/stderr callbacks to `Spawn()` parameters** in `SubprocessManager` — prevents race where early output could be missed before callback installation - Consistent PID logging across all runner types ### Test infrastructure - **`zentest-appstub`**: Added `Fail` (configurable exit code) and `Crash` (abort / nullptr deref) test functions - **Compute integration tests**: exit code handling, auto-retry exhaustion, manual reschedule after failure, mixed success/failure queues, crash handling (abort + nullptr), crash auto-retry, immediate query visibility after enqueue - **Package format tests**: truncated header, bad magic, attachment count overflow, truncated data, local ref rejection/acceptance, policy enforcement (inside/outside root, traversal, no-policy fail-closed) - **Legacy package parser tests**: empty input, zero-size binary, hash resolution with/without mapper, hash mismatch detection - **UNC path tests** for `MakeSafeAbsolutePath` ### Misc - ANSI color helper macros (`ZEN_RED`, `ZEN_BRIGHT_WHITE`, etc.) and `ZEN_BOLD`/`ZEN_DIM`/etc. - Generic `fmt::formatter` for types with free `ToString` functions - Compute dashboard: truncated hash display with monospace font and hover for full value - Renamed `usonpackage_forcelink` → `cbpackage_forcelink` - Compute enabled by default in xmake config (releases still explicitly disable)
* Misc small fixes (#897)Stefan Boberg2026-03-271-0/+3
| | | | | | | | | | - **Eliminate `<regex>` usage** — Replaced `std::regex`-based URL parsing in `jupiterbuildstorage.cpp` with manual `string_view` parsing. Added `CXXOPTS_NO_REGEX` to disable regex in cxxopts. Includes comprehensive tests for the new URL parser. - **Add missing HTTP response codes** — Added `102`, `103`, `203`, `207`, `208`, `226`, `306`, `421`, `425`, `451` to the enum and reason string lookup. - **Add `ForceColor` support to zen CLI** — Plumbed the `ForceColor` logging option through to the zen client. - **Add `.clangd` config** — Strips MSVC-specific flags clangd can't handle and suppresses noisy clang-tidy checks. - **Generic `fmt::formatter` for `ToString`** — Concept-based formatter that auto-formats any type with a free `ToString()` function, removing the need for per-type specializations. - **Fix OpenSSL dependency** — Changed `zenhorde` to use `openssl3` package on Linux/macOS. - **Add `<cmath>` include** — Missing include in `hyperloglog.h`. - **GCC compile fix** — Moved `static constinit` variable inside lambda in `logging.cpp`.
* remove CPR HTTP client backend (#894)Stefan Boberg2026-03-272-101/+0
| | | CPR is no longer needed now that HttpClient has fully transitioned to raw libcurl. This removes the CPR library, its build integration, implementation files, and all conditional compilation guards, leaving curl as the sole HTTP client backend.
* idle deprovision in hub (#895)Dan Engelbrecht2026-03-274-3/+11
| | | | | | | | | | | | | - Feature: Hub watchdog automatically deprovisions inactive provisioned and hibernated instances - Feature: Added `stats/activity_counters` endpoint to measure server activity - Feature: Added configuration options for hub watchdog - `--hub-watchdog-provisioned-inactivity-timeout-seconds` Inactivity timeout before a provisioned instance is deprovisioned - `--hub-watchdog-hibernated-inactivity-timeout-seconds` Inactivity timeout before a hibernated instance is deprovisioned - `--hub-watchdog-inactivity-check-margin-seconds` Margin before timeout at which an activity check is issued - `--hub-watchdog-cycle-interval-ms` Watchdog poll interval in milliseconds - `--hub-watchdog-cycle-processing-budget-ms` Maximum time budget per watchdog cycle in milliseconds - `--hub-watchdog-instance-check-throttle-ms` Minimum delay between checks on a single instance - `--hub-watchdog-activity-check-connect-timeout-ms` Connect timeout for activity check requests - `--hub-watchdog-activity-check-request-timeout-ms` Request timeout for activity check requests
* hub instance state refactor (#892)Dan Engelbrecht2026-03-271-1/+4
| | | | | | - Improvement: Provisioning a hibernated instance now automatically wakes it instead of requiring an explicit wake call first - Improvement: Deprovisioning now accepts instances in Crashed or Hibernated states, not just Provisioned - Improvement: Added `--consul-health-interval-seconds` and `--consul-deregister-after-seconds` options to control Consul health check behavior (defaults: 10s and 30s) - Improvement: Consul registration now occurs when provisioning starts; health check intervals are applied once provisioning completes
* Merge branch 'de/v5.7.25-hotpatch' (#880)Dan Engelbrecht2026-03-231-3/+2
|
* Unique session/client tracking using HyperLogLog (#884)Stefan Boberg2026-03-231-0/+16
| | | | | | | | | | | | | | ## Summary Adds probabilistic cardinality estimation for tracking unique HTTP clients and sessions using a HyperLogLog implementation. - Add a `HyperLogLog<Precision>` template in `zentelemetry` with thread-safe lock-free register updates, merge support, and XXH3 hashing - Feed client IP addresses (via raw bytes) and session IDs (via `Oid` bytes) into their respective HyperLogLog estimators from both the ASIO and http.sys server backends - Emit `distinct_clients` and `distinct_sessions` cardinality estimates in HTTP `CollectStats()` - Add tests covering empty, single, duplicates, accuracy, merge, and clear scenarios ## Why HyperLogLog Tracking exact unique counts would require storing every observed IP or session ID. HyperLogLog provides a memory-bounded probabilistic estimate (~1–2% error) using only a few KB of memory regardless of traffic volume.
* improve auth token refresh (#863)Dan Engelbrecht2026-03-182-5/+25
| | | Authentication callbacks are not thread safe, ensured call sites does single threaded calls
* Compute batching (#849)Stefan Boberg2026-03-182-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ### Compute Batch Submission - Consolidate duplicated action submission logic in `httpcomputeservice` into a single `HandleSubmitAction` supporting both single-action and batch (actions array) payloads - Group actions by queue in `RemoteHttpRunner` and submit as batches with configurable chunk size, falling back to individual submission on failure - Extract shared helpers: `MakeErrorResult`, `ValidateQueueForEnqueue`, `ActivateActionInQueue`, `RemoveActionFromActiveMaps` ### Retracted Action State - Add `Retracted` state to `RunnerAction` for retry-free rescheduling — an explicit request to pull an action back and reschedule it on a different runner without incrementing `RetryCount` - Implement idempotent `RetractAction()` on `RunnerAction` and `ComputeServiceSession` - Add `POST jobs/{lsn}/retract` and `queues/{queueref}/jobs/{lsn}/retract` HTTP endpoints - Add state machine documentation and per-state comments to `RunnerAction` ### Compute Race Fixes - Fix race in `HandleActionUpdates` where actions enqueued between session abandon and scheduler tick were never abandoned, causing `GetActionResult` to return 202 indefinitely - Fix queue `ActiveCount` race where `NotifyQueueActionComplete` was called after releasing `m_ResultsLock`, allowing callers to observe stale counters immediately after `GetActionResult` returned OK ### Logging Optimization and ANSI improvements - Improve `AnsiColorStdoutSink` write efficiency — single write call, dirty-flag flush, `RwLock` instead of `std::mutex` - Move ANSI color emission from sink into formatters via `Formatter::SetColorEnabled()`; remove `ColorRangeStart`/`End` from `LogMessage` - Extract color helpers (`AnsiColorForLevel`, `StripAnsiSgrSequences`) into `helpers.h` - Strip upstream ANSI SGR escapes in non-color output mode. This enables colour in log messages without polluting log files with ANSI control sequences - Move `RotatingFileSink`, `JsonFormatter`, and `FullFormatter` from header-only to pimpl with `.cpp` files ### CLI / Exec Refactoring - Extract `ExecSessionRunner` class from ~920-line `ExecUsingSession` into focused methods and a `ExecSessionConfig` struct - Replace monolithic `ExecCommand` with subcommand-based architecture (`http`, `inproc`, `beacon`, `dump`, `buildlog`) - Allow parent options to appear after subcommand name by parsing subcommand args permissively and forwarding unmatched tokens to the parent parser ### Testing Improvements - Fix `--test-suite` filter being ignored due to accumulation with default wildcard filter - Add test suite banners to test listener output - Made `function.session.abandon_pending` test more robust ### Startup / Reliability Fixes - Fix silent exit when a second zenserver instance detects a port conflict — use `ZEN_CONSOLE_*` for log calls that precede `InitializeLogging()` - Fix two potential SIGSEGV paths during early startup: guard `sentry_options_new()` returning nullptr, and throw on `ZenServerState::Register()` returning nullptr instead of dereferencing - Fail on unrecognized zenserver `--mode` instead of silently defaulting to store ### Other - Show host details (hostname, platform, CPU count, memory) when discovering new compute workers - Move frontend `html.zip` from source tree into build directory - Add format specifications for Compact Binary and Compressed Buffer wire formats - Add `WriteCompactBinaryObject` to zencore - Extended `ConsoleTui` with additional functionality - Add `--vscode` option to `xmake sln` for clangd / `compile_commands.json` support - Disable compute/horde/nomad in release builds (not yet production-ready) - Disable unintended `ASIO_HAS_IO_URING` enablement - Fix crashpad patch missing leading whitespace - Clean up code triggering gcc false positives
* zen hub port reuse (#850)Dan Engelbrecht2026-03-171-0/+1
| | | | | | | | - Feature: Added `--allow-port-probing` option to control whether zenserver searches for a free port on startup (default: true, automatically false when --dedicated is set) - Feature: Added new hub options for controlling provisioned storage server instances: - `--hub-instance-http` - HTTP server implementation for instances (asio/httpsys) - `--hub-instance-http-threads` - Number of HTTP connection threads per instance - `--hub-instance-corelimit` - Limit CPU concurrency per instance - Improvement: Hub now manages a deterministic port pool for provisioned instances allowing reuse of unused ports
* URI decoding, process env, compiler info, httpasio strands, regex route ↵Stefan Boberg2026-03-161-50/+6
| | | | | | | | | | | | | | | | | removal (#841) - Percent-decode URIs in ASIO HTTP server to match http.sys CookedUrl behavior, ensuring consistent decoded paths across backends - Add Environment field to CreateProcOptions for passing extra env vars to child processes (Windows: merged into Unicode environment block; Unix: setenv in fork) - Add GetCompilerName() and include it in build options startup logging - Suppress Windows CRT error dialogs in test harness for headless/CI runs - Fix mimalloc package: pass CMAKE_BUILD_TYPE, skip cfuncs test for cross-compile - Add virtual destructor to SentryAssertImpl to fix debug-mode warning - Simplify object store path handling now that URIs arrive pre-decoded - Add URI decoding test coverage for percent-encoded paths and query params - Simplify httpasio request handling by using strands (guarantees no parallel handlers per connection) - Removed deprecated regex-based route matching support - Fix full GC never triggering after cross-toolchain builds: The `gc_state` file stores `system_clock` ticks, but the tick resolution differs between toolchains (nanoseconds on GCC/standard clang, microseconds on UE clang). A nanosecond timestamp misinterpreted as microseconds appears far in the future (~year 58,000), bypassing the staleness check and preventing time-based full GC from ever running. Fixed by also resetting when the stored timestamp is in the future. - Clamp GC countdown display to configured interval: Prevents nonsensical log output (e.g. "Full GC in 492128002h") caused by the above or any other clock anomaly. The clamp applies to both the scheduler log and the status API.
* Made CPR optional, html generated at build time (#840)Stefan Boberg2026-03-132-9/+15
| | | | | | | - Fix potential crash on startup caused by logging macros being invoked before the logging system is initialized (null logger dereference in `ZenServerState::Sweep()`). `LoggerRef::ShouldLog` now guards against a null logger pointer. - Make CPR an optional dependency (`--zencpr` build option, enabled by default) so builds can proceed without it - Make zenvfs Windows-only (platform-specific target) - Generate the frontend zip at build time from source HTML files instead of checking in a binary blob which would accumulate with every single update
* Unix Domain Socket auto discovery (#833)Stefan Boberg2026-03-133-6/+11
| | | | | | | | This PR adds end-to-end Unix domain socket (UDS) support, allowing zen CLI to discover and connect to UDS-only servers automatically. - **`unix://` URI scheme in zen CLI**: The `-u` / `--hosturl` option now accepts `unix:///path/to/socket` to connect to a zenserver via a Unix domain socket instead of TCP. - **Per-instance shared memory for extended server info**: Each zenserver instance now publishes a small shared memory section (keyed by SessionId) containing per-instance data that doesn't fit in the fixed-size ZenServerEntry -- starting with the UDS socket path. This is a 4KB pagefile-backed section on Windows (`Global\ZenInstance_{sessionid}`) and a POSIX shared memory object on Linux/Mac (`/UnrealEngineZen_{sessionid}`). - **Client-side auto-discovery of UDS servers**: `zen info`, `zen status`, etc. now automatically discover and prefer UDS connections when a server publishes a socket path. Servers running with `--no-network` (UDS-only) are no longer invisible to the CLI. - **`kNoNetwork` flag in ZenServerEntry**: Servers started with `--no-network` advertise this in their shared state entry. Clients skip TCP fallback for these servers, and display commands (`ps`, `status`, `top`) show `-` instead of a port number to indicate TCP is not available.
* Add --no-network option (#831)Stefan Boberg2026-03-121-4/+5
| | | | - Add `--no-network` CLI option which disables all TCP/HTTPS listeners, restricting zenserver to Unix domain socket communication only. - Also fixes asio upgrade breakage on main
* added streaming download of payloads http client Post (#824)Dan Engelbrecht2026-03-111-1/+4
| | | | | | * added streaming download of payloads in cpr client ::Post * curlclient Post streaming download * case sensitivity fixes for http headers * move over missing functionality from crpclient to httpclient
* HttpClient using libcurl, Unix Sockets for HTTP. HTTPS support (#770)Stefan Boberg2026-03-104-11/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main goal of this change is to eliminate the cpr back-end altogether and replace it with the curl implementation. I would expect to drop cpr as soon as we feel happy with the libcurl back-end. That would leave us with a direct dependency on libcurl only, and cpr can be eliminated as a dependency. ### HttpClient Backend Overhaul - Implemented a new **libcurl-based HttpClient** backend (`httpclientcurl.cpp`, ~2000 lines) as an alternative to the cpr-based one - Made HttpClient backend **configurable at runtime** via constructor arguments and `-httpclient=...` CLI option (for zen, zenserver, and tests) - Extended HttpClient test suite to cover multipart/content-range scenarios ### Unix Domain Socket Support - Added Unix domain socket support to **httpasio** (server side) - Added Unix domain socket support to **HttpClient** - Added Unix domain socket support to **HttpWsClient** (WebSocket client) - Templatized `HttpServerConnectionT<SocketType>` and `WsAsioConnectionT<SocketType>` to handle TCP, Unix, and SSL sockets uniformly via `if constexpr` dispatch ### HTTPS Support - Added **preliminary HTTPS support to httpasio** (for Mac/Linux via OpenSSL) - Added **basic HTTPS support for http.sys** (Windows) - Implemented HTTPS test for httpasio - Split `InitializeServer` into smaller sub-functions for http.sys ### Other Notable Changes - Improved **zenhttp-test stability** with dynamic port allocation - Enhanced port retry logic in http.sys (handles ERROR_ACCESS_DENIED) - Fatal signal/exception handlers for backtrace generation in tests - Added `zen bench http` subcommand to exercise network + HTTP client/server communication stack
* Dashboard overhaul, compute integration (#814)Stefan Boberg2026-03-093-14/+115
| | | | | | | | | | - **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS. - **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking. - **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration. - **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP. - **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch. - **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support. Also contains some fixes from TSAN analysis.
* add fallback for zencache multirange (#816)Dan Engelbrecht2026-03-091-1/+1
| | | | | | | | | | | * clean up BuildStorageResolveResult to allow capabilities * add check for multirange request capability * add MaxRangeCountPerRequest capabilities * project export tests * add InMemoryBuildStorageCache * progress and logging improvements * fix ElapsedSeconds calculations in fileremoteprojectstore.cpp * oplogs/builds test script
* Claude config, some bug fixes (#813)Stefan Boberg2026-03-064-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Claude config updates * Bug fixes and hardening across `zencore` and `zenhttp`, identified via static analysis. ### zencore - **`ZEN_ASSERT` macro** -- extended to accept an optional string message literal; added `ZEN_ASSERT_MSG_` helper for message formatting. Callers needing runtime fmt-style formatting should use `ZEN_ASSERT_FORMAT`. - **`MpscQueue`** -- fixed `TypeCompatibleStorage` to use a properly-sized `char Storage[sizeof(T)]` array instead of a single `char`; corrected `Data()` to cast `&Storage` rather than `this`; switched cache-line alignment to a fixed constant to avoid GCC's `-Winterference-size` warning. Enabled previously-disabled tests. - **`StringBuilderImpl`** -- initialized `m_Base`/`m_CurPos`/`m_End` to `nullptr`. Fixed `StringCompare` return type (`bool` -> `int`). Fixed `ParseInt` to reject strings with trailing non-numeric characters. Removed deprecated `<codecvt>` include. - **`NiceNumGeneral`** -- replaced `powl()` with integer `IntPow()` to avoid floating-point precision issues. - **`RwLock::ExclusiveLockScope`** -- added move constructor/assignment; initialized `m_Lock` to `nullptr`. - **`Latch::AddCount`** -- fixed variable type (`std::atomic_ptrdiff_t` -> `std::ptrdiff_t` for the return value of `fetch_add`). - **`thread.cpp`** -- fixed Linux `pthread_setname_np` 16-byte name truncation; added null check before dereferencing in `Event::Close()`; fixed `NamedEvent::Close()` to call `close(Fd)` outside the lock region; added null guard in `NamedMutex` destructor; `Sleep()` now returns early for non-positive durations. - **`MD5Stream`** -- was entirely stubbed out (no-op); now correctly calls `MD5Init`/`MD5Update`/`MD5Final`. Fixed `ToHexString` to use the correct string length. Fixed forward declarations. Fixed tests to compare `compare() == 0`. - **`sentryintegration.cpp`** -- guard against null `filename`/`funcname` in spdlog message handler to prevent a crash in `fmt::format`. - **`jobqueue.cpp`** -- fixed lost job ID when `IdGenerator` wraps around zero; fixed raw `Job*` in `RunningJobs` map (potential use-after-free) to `RefPtr<Job>`; fixed range-loop copies; fixed format string typo. - **`trace.cpp`** -- suppress GCC false-positive warnings in third-party `trace.h` include. ### zenhttp - **WebSocket close race** (`wsasio`, `wshttpsys`, `httpwsclient`) -- `m_CloseSent` promoted from `bool` to `std::atomic<bool>`; close check changed to `exchange(true)` to eliminate the check-then-set data race. - **`wsframecodec.cpp`** -- reject WebSocket frames with payload > 256 MB to prevent OOM from malformed/malicious frames. - **`oidc.cpp`** -- URL-encode refresh token and client ID in token requests (`FormUrlEncode`); parse `end_session_endpoint` and `device_authorization_endpoint` from OIDC discovery document. - **`httpclientcommon.cpp`** -- propagate error code from `AppendData` when flushing the cache buffer. - **`httpclient.h`** -- initialize all uninitialized members (`ErrorCode`, `UploadedBytes`, `DownloadedBytes`, `ElapsedSeconds`, `MultipartBoundary` fields). - **`httpserver.h`** -- fix `operator=` return type for `HttpRpcHandler` (missing `&`). - **`packageformat.h`** -- fix `~0u` (32-bit truncation) to `~uint64_t(0)` for a `uint64_t` field. - **`httpparser`** -- initialize `m_RequestVerb` in both declaration and `ResetState()`. - **`httpplugin.cpp`** -- initialize `m_BasePort`; fix format string missing quotes around connection name. - **`httptracer.h`** -- move `#pragma once` before includes. - **`websocket.h`** -- initialize `WebSocketMessage::Opcode`. ### zenserver - **`hubservice.cpp`** -- fix two `ZEN_ASSERT` calls that incorrectly used fmt-style format args; converted to `ZEN_ASSERT_FORMAT`.
* compute orchestration (#763)Stefan Boberg2026-03-041-0/+1
| | | | | | | | | | - Added local process runners for Linux/Wine, Mac with some sandboxing support - Horde & Nomad provisioning for development and testing - Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API - Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates - Some security hardening - Improved scalability and `zen exec` command Still experimental - compute support is disabled by default
* HTTP improvements (#803)Stefan Boberg2026-03-042-0/+9
| | | | | - Add GetTotalBytesReceived/GetTotalBytesSent to HttpServer with implementations in ASIO and http.sys backends - Add ExpectedErrorCodes to HttpClientSettings to suppress warn/info logs for anticipated HTTP error codes - Also fixes minor issues in `CprHttpClient::Download`
* unity build fixes (#802)Stefan Boberg2026-03-041-0/+1
| | | | | Various fixes to make cpp files build in unity build mode as an aside using Unity build doesn't really seem to work on Linux, unsure why but it leads to link-time issues
* use multi range requests (#800)Dan Engelbrecht2026-03-031-1/+3
| | | | | | | - Improvement: `zen builds download` now uses multi-range requests for blocks to reduce download size - Improvement: `zen oplog-import` now uses partial block with multi-range requests for blocks to reduce download size - Improvement: Improved feedback in log/console during `zen oplog-import` - Improvement: `--allow-partial-block-requests` now defaults to `true` for `zen builds download` and `zen oplog-import` (was `mixed`) - Improvement: Improved range merging analysis when downloading partial blocks
* add full WebSocket (RFC 6455) client/server support for zenhttp (#792)Stefan Boberg2026-02-274-1/+153
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * This branch adds full WebSocket (RFC 6455) support to the HTTP server layer, covering both transport backends, a client, and tests. - **`websocket.h`** -- Core interfaces: `WebSocketOpcode`, `WebSocketMessage`, `WebSocketConnection` (ref-counted), and `IWebSocketHandler`. Services opt in to WebSocket support by implementing `IWebSocketHandler` alongside their existing `HttpService`. - **`httpwsclient.h`** -- `HttpWsClient`: an ASIO-backed `ws://` client with both standalone (own thread) and shared `io_context` modes. Supports connect timeout and optional auth token injection via `IWsClientHandler` callbacks. - **`wsasio.cpp/h`** -- `WsAsioConnection`: WebSocket over ASIO TCP. Takes over the socket after the HTTP 101 handshake and runs an async read/write loop with a queued write path (guarded by `RwLock`). - **`wshttpsys.cpp/h`** -- `WsHttpSysConnection`: WebSocket over http.sys opaque-mode connections (Windows only). Uses `HttpReceiveRequestEntityBody` / `HttpSendResponseEntityBody` via IOCP, sharing the same threadpool as normal http.sys traffic. Self-ref lifetime management ensures graceful drain of outstanding async ops. - **`httpsys_iocontext.h`** -- Tagged `OVERLAPPED` wrapper (`HttpSysIoContext`) used to distinguish normal HTTP transactions from WebSocket read/write completions in the single IOCP callback. - **`wsframecodec.cpp/h`** -- `WsFrameCodec`: static helpers for parsing (unmasked and masked) and building (unmasked server frames and masked client frames) RFC 6455 frames across all three payload length encodings (7-bit, 16-bit, 64-bit). Also computes `Sec-WebSocket-Accept` keys. - **`clients/httpwsclient.cpp`** -- `HttpWsClient::Impl`: ASIO-based client that performs the HTTP upgrade handshake, then hands off to the frame codec for the read loop. Manages its own `io_context` thread or plugs into an external one. - **`httpasio.cpp`** -- ASIO server now detects `Upgrade: websocket` requests, checks the matching `HttpService` for `IWebSocketHandler` via `dynamic_cast`, performs the RFC 6455 handshake (101 response), and spins up a `WsAsioConnection`. - **`httpsys.cpp`** -- Same upgrade detection and handshake logic for the http.sys backend, using `WsHttpSysConnection` and `HTTP_SEND_RESPONSE_FLAG_OPAQUE`. - **`httpparser.cpp/h`** -- Extended to surface the `Upgrade` / `Connection` / `Sec-WebSocket-Key` headers needed by the handshake. - **`httpcommon.h`** -- Minor additions (probably new header constants or response codes for the WS upgrade). - **`httpserver.h`** -- Small interface changes to support WebSocket registration. - **`zenhttp.cpp` / `xmake.lua`** -- New source files wired in; build config updated. - **Unit tests** (`websocket.framecodec`): round-trip encode/decode for text, binary, close frames; all three payload sizes; masked and unmasked variants; RFC 6455 `Sec-WebSocket-Accept` test vector. - **Integration tests** (`websocket.integration`): full ASIO server tests covering handshake (101), normal HTTP coexistence, echo, server-push broadcast, client close handshake, ping/pong auto-response, sequential messages, and rejection of upgrades on non-WS services. - **Client tests** (`websocket.client`): `HttpWsClient` connect+echo+close, connection failure (bad port -> close code 1006), and server-initiated close. * changed HttpRequestParser::ParseCurrentHeader to use switch instead of if/else chain * remove spurious printf --------- Co-authored-by: Stefan Boberg <[email protected]>
* MeasureLatency now bails out quickly if it experiences a connection error (#789)Stefan Boberg2026-02-271-0/+3
| | | previously it would stall some 40s in this case
* add support in http client to accept multi-range responses (#788)Dan Engelbrecht2026-02-271-0/+14
| | | * add support in http client to accept multi-range responses
* adding HttpClient tests (#785)Stefan Boberg2026-02-261-1/+2
| | | | | | | | | | | | | | | | | | | | Add comprehensive `HttpClient` test suite. Covers: - **HTTP verbs** -- GET, POST, PUT, DELETE, HEAD dispatch correctly - **GET/POST/PUT/Upload/Download** -- payload round-trips (IoBuffer, CbObject, CompositeBuffer), content types, large payloads, file-spill downloads - **Status codes** -- 2xx/4xx/5xx classification, exact code matching - **Response API** -- IsSuccess, AsText, AsObject, ToText, ErrorMessage, ThrowError - **Error handling** -- connection refused, request timeout, nonexistent endpoints - **Session management** -- default ID, SetSessionId, reset to zero - **Authentication** -- token provider, expired tokens, bearer verification - **Content type detection** -- text, JSON, binary, CbObject - **Request metadata** -- elapsed time, upload/download byte counts - **Retry logic** -- retry after transient 503s, no-retry baseline - **Latency measurement** -- MeasureLatency against live and unreachable servers - **KeyValueMap** -- construction from pairs, string_views, initializer lists - **Transport-level faults (GET)** -- connection reset/close before response, partial headers, truncated body, mid-body reset, stalled response timeout, retry after RST - **Transport-level faults (POST)** -- server reset/close before consuming body, mid-body reset, early 503 without consuming upload, stalled upload timeout, retry with large body after transient failures Also adds zenhttp-test to the xmake test runner (xmake test --run=http).
* HttpService/Frontend improvements (#782)Stefan Boberg2026-02-251-0/+16
| | | | | | | - zenhttp: added `GetServiceUri()`/`GetExternalHost()` - enables code to quickly generate an externally reachable URI for a given service - frontend: improved Uri handling (better defaults) - added support for 404 page (to make it easier to find a good URL)
* use partial blocks for oplog import (#780)Dan Engelbrecht2026-02-241-0/+9
| | | | | Feature: Add --allow-partial-block-requests to zen oplog-import Improvement: zen oplog-import now uses partial block requests to reduce download size Improvement: Use latency to Cloud Storage host and Zen Cache host when calculating partial block requests
* add http server root password protection (#757)Dan Engelbrecht2026-02-173-28/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Feature: Added `--security-config-path` option to zenserver to configure security settings - Expects a path to a .json file - Default is an empty path resulting in no extra security settings and legacy behavior - Current support is a top level filter of incoming http requests restricted to the `password` type - `password` type will check the `Authorization` header and match it to the selected authorization strategy - Currently the security settings is very basic and configured to a fixed username+password at startup { "http" { "root": { "filter": { "type": "password", "config": { "password": { "username": "<username>", "password": "<password>" }, "protect-machine-local-requests": false, "unprotected-uris": [ "/health/", "/health/info", "/health/version" ] } } } } }
* misc fixes brought over from sb/proto (#759)Stefan Boberg2026-02-172-4/+17
| | | | | | | | * `RwLock::WithSharedLock` and `RwLock::WithExclusiveLock` can now return a value (which is returned by the passed function) * Comma-separated logger specification now correctly deals with commas * `GetSystemMetrics` properly accounts for cores * cpr response formatter passes arguments in the right order * `HttpServerRequest::SetLogRequest` can be used to selectively log HTTP requests
* add foundation for http password protection (#756)Dan Engelbrecht2026-02-131-0/+52
|
* add IHttpRequestFilter to allow server implementation to filter/reject ↵Dan Engelbrecht2026-02-131-4/+18
| | | | | requests (#753) * add IHttpRequestFilter to allow server implementation to filter/reject requests
* add IsLocalMachineRequest to HttpServerRequest (#749)Dan Engelbrecht2026-02-121-0/+2
| | | * add IsLocalMachineRequest to HttpServerRequest
* add simple http client tests (#751)Dan Engelbrecht2026-02-121-1/+3
| | | * add simple http client tests and fix run loop of http server to not rely on application quit
* remove ZENCORE_API completely (#718)Stefan Boberg2026-01-191-2/+2
| | | initially we had ZENCORE_API macros to potentially allow for DLL linkage. It turns out that this is not useful and the macros just contribute noise, so this change removes them completely.
* add otel instrumentation (#581)Stefan Boberg2025-12-111-2/+5
| | | | | | | | this change adds OTEL tracing to a few places * Top-level application lifecycle (config/init/cleanup, main loop) * http.sys requests it also brings some otlptrace optimizations and dynamic configuration of tracing. OTLP tracing is currently always disabled