| Commit message (Collapse) | Author | Age | Files | Lines |
| |
- Improvement: Hub pools HTTP connections to managed instances so provision/deprovision churn no longer exhausts Windows ephemeral ports
| |
A collection of security, correctness, and robustness fixes in `zenhttp` and `zencore` surfaced by security review. Most items are small, independent commits grouped here because they all tighten trust boundaries or fix UB along the same code paths.
## WebSocket protocol hardening (RFC 6455)
- **Enforce the client-side mask bit**. Server-side frame loops now reject unmasked frames with close code 1002 per §5.1. Prevents HTTP intermediary smuggling.
- **Validate control frames and RSV bits**. Fragmented control frames, oversized (>125 B) control payloads, and any non-zero RSV bit now fail the connection before allocation.
- **Lower per-frame payload cap** from 256 MB → 4 MB. Bounds per-connection accumulator memory.
- **Implement message fragmentation**. Continuation frames are coalesced and delivered as a single message; interleaved non-control frames close with 1002; assembled messages are capped at 4 MB (1009 on overflow). Previously partial fragments were delivered to handlers, bypassing payload validation.
- **Parse the 101 handshake response properly** in `HttpWsClient`. Status-line, `Upgrade`, `Connection`, and `Sec-WebSocket-Accept` are now matched exactly rather than via substring searches against the full body.
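The per-frame checks listed above can be sketched as a single header validation step that runs before any payload buffer is allocated. This is a minimal illustration, not the `zenhttp` code: `WsFrameHeader`, `WsCloseCode`, and `ValidateClientFrame` are hypothetical names, and using 1009 for an oversized single frame is an assumption (the notes above only state 1009 for assembled messages).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical types for illustration; these are not the zenhttp declarations.
enum class WsCloseCode : uint16_t
{
    ProtocolError = 1002,
    MessageTooBig = 1009,
};

struct WsFrameHeader
{
    bool     Fin;            // final fragment flag
    uint8_t  Rsv;            // RSV1..RSV3 packed into the low three bits
    uint8_t  Opcode;         // 0x0 continuation, 0x1 text, 0x2 binary, 0x8..0xF control
    bool     Masked;         // mask bit from the second header byte
    uint64_t PayloadLength;  // decoded 7/16/64-bit length
};

constexpr uint64_t kMaxFramePayload   = 4ull * 1024 * 1024;  // 4 MB cap from the list above
constexpr uint64_t kMaxControlPayload = 125;                 // RFC 6455 limit for control frames

// Returns the close code to send, or std::nullopt when the header is acceptable.
inline std::optional<WsCloseCode> ValidateClientFrame(const WsFrameHeader& Header)
{
    if (!Header.Masked)
    {
        return WsCloseCode::ProtocolError;  // client-to-server frames must be masked
    }
    if (Header.Rsv != 0)
    {
        return WsCloseCode::ProtocolError;  // no extension negotiated, so RSV must be zero
    }
    const bool IsControl = (Header.Opcode & 0x8) != 0;
    if (IsControl && (!Header.Fin || Header.PayloadLength > kMaxControlPayload))
    {
        return WsCloseCode::ProtocolError;  // fragmented or oversized control frame
    }
    if (Header.PayloadLength > kMaxFramePayload)
    {
        return WsCloseCode::MessageTooBig;  // assumption: 1009 for the per-frame cap
    }
    return std::nullopt;
}
```

Running every check on the decoded header alone means a hostile peer cannot force an allocation just by announcing a large length.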
## Auth / OIDC hardening
- **Constant-time password compare** in `PasswordSecurity::IsAllowed` (closes a remote length/content timing oracle). Adds a shared `ConstantTimeEquals` helper.
- **Harden Basic-auth header parsing**: trim trailing LWS, reject control bytes and DEL in the credential.
- **OIDC discovery pinning**: require HTTPS (loopback exempt), verify `issuer` matches `BaseUrl`, require `token_endpoint` / `userinfo_endpoint` / `jwks_uri` to share origin with `BaseUrl`, reject empty `token_endpoint`.
- **Restrict `POST /auth/oidc/refreshtoken`** to local-machine requests. Previously unauthenticated in default deployments — remote callers could evict or replace cached tokens.
- **Stop logging OIDC provider response bodies** on refresh failure (IdPs echo `refresh_token` back in error bodies).
- **Drop the unused `IdentityToken` field** from `OidcClient` / `OpenIdToken` so nothing in the tree accidentally trusts an unverified JWT.
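A constant-time comparison of the kind described for `ConstantTimeEquals` can be sketched as follows. The signature is an assumption (the real helper may take spans or raw pointers), and the loop shape is one common approach rather than the actual implementation.

```cpp
#include <cstddef>
#include <string_view>

// Sketch only: the signature is assumed, not taken from zencore.
// Every byte position up to the longer length is visited, so the running time
// does not depend on where the first mismatch occurs.
inline bool ConstantTimeEquals(std::string_view A, std::string_view B)
{
    const size_t Longest = A.size() > B.size() ? A.size() : B.size();
    unsigned char Accumulator = 0;
    for (size_t Index = 0; Index < Longest; ++Index)
    {
        // Out-of-range positions compare against zero; the bounds tests depend
        // only on the lengths, never on the secret bytes themselves.
        const unsigned char ByteA = Index < A.size() ? static_cast<unsigned char>(A[Index]) : 0;
        const unsigned char ByteB = Index < B.size() ? static_cast<unsigned char>(B[Index]) : 0;
        Accumulator = static_cast<unsigned char>(Accumulator | (ByteA ^ ByteB));
    }
    // The length difference is checked separately so that padding with zero
    // bytes can never make two different-length inputs compare equal.
    const bool SameLength = A.size() == B.size();
    return (Accumulator == 0) & SameLength;
}
```

An optimizing compiler is free to restructure a loop like this, so production code often prefers a vetted primitive from the crypto backend already in use (OpenSSL provides `CRYPTO_memcmp`, for example).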
## Auth state encryption migration
- Add `AesGcm` AEAD primitive (BCrypt / OpenSSL backends, mbedTLS stubbed) and `CryptoRandom::Fill` CSPRNG helper in `zencore/crypto.h`.
- Migrate authstate file from AES-256-CBC with a fixed IV to AES-GCM with a fresh 12-byte random nonce per write and the 4-byte `ZEN1` magic bound as AAD. Legacy-CBC files are transparently read once and rewritten in the new format.
## Filesystem / IO robustness
- `IoBufferExtendedCore::Materialize` now checks `MAP_FAILED` on POSIX (was comparing to `nullptr`, which let the failure sentinel propagate into later reads and `munmap(MAP_FAILED, ...)`).
- `IoBufferBuilder::MakeFromFile / MakeFromTemporaryFile`: close the FD/HANDLE on exception via a dismissable `ScopeGuard`; actually check the `fstat()` return value (previously used an uninitialized `FileSize`).
- `ReadFromFileMaybe`: loop short reads, retry `EINTR`, chunk Windows `ReadFile` at `0xFFFFFFFF` bytes (fixes silent truncation of multi-GiB reads).
- `WipeDirectory`: compare `FindFirstFileW` handle against `INVALID_HANDLE_VALUE` rather than `nullptr`.
- `RemoveFileNative` (Linux/macOS): report non-`ENOENT` stat failures via the `std::error_code` out-param and stop reading `st_mode` after a failed stat.
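The short-read and `EINTR` handling described for `ReadFromFileMaybe` follows a standard POSIX pattern, sketched below. `ReadFully` is a hypothetical helper name used only for this illustration, and the Windows `ReadFile` chunking is not shown.

```cpp
#include <cerrno>
#include <cstddef>
#include <unistd.h>

// Hypothetical helper, POSIX only. Returns the number of bytes read (less than
// Size only at end of file) or -1 on a real error.
inline ssize_t ReadFully(int Fd, void* Buffer, size_t Size)
{
    char*  Out   = static_cast<char*>(Buffer);
    size_t Total = 0;
    while (Total < Size)
    {
        const ssize_t Count = ::read(Fd, Out + Total, Size - Total);
        if (Count > 0)
        {
            Total += static_cast<size_t>(Count);  // short read: keep going
            continue;
        }
        if (Count == 0)
        {
            break;  // end of file
        }
        if (errno == EINTR)
        {
            continue;  // interrupted by a signal before any data arrived: retry
        }
        return -1;  // genuine failure, errno is left for the caller
    }
    return static_cast<ssize_t>(Total);
}
```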
## Buffer / compression correctness
- Avoid per-copy `IoBufferCore` heap allocations in `CompositeBuffer::CopyTo / ViewOrCopyRange` iterators; add fast path for `BufferHeader::Read` when the 64-byte header fits in the first plain-memory segment.
- `BufferHeader`: add `IsHeaderValid()` gate covering `BlockSizeExponent` range, `BlockCount * BlockSize` overflow, and `TotalRawSize` bounds before any arithmetic uses them. Defends against attacker-controlled headers that can pass the CRC and trigger OOB writes in `DecompressBlock`.
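The idea behind the `IsHeaderValid()` gate is to reject implausible fields before any size arithmetic is performed on them. The sketch below illustrates that ordering under stated assumptions: the field widths, the exponent limits, and the name `IsHeaderPlausible` are all hypothetical and do not reflect the real `BufferHeader` layout.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical subset of header fields; widths and limits are assumptions.
struct BlockHeaderFields
{
    uint8_t  BlockSizeExponent;
    uint64_t BlockCount;
    uint64_t TotalRawSize;
};

constexpr uint8_t kMinBlockSizeExponent = 9;   // assumed lower bound (512 B)
constexpr uint8_t kMaxBlockSizeExponent = 31;  // assumed upper bound (2 GiB)

inline bool IsHeaderPlausible(const BlockHeaderFields& Header)
{
    // 1. Range-check the exponent before it is used as a shift amount.
    if (Header.BlockSizeExponent < kMinBlockSizeExponent || Header.BlockSizeExponent > kMaxBlockSizeExponent)
    {
        return false;
    }
    const uint64_t BlockSize = uint64_t(1) << Header.BlockSizeExponent;

    // 2. Detect BlockCount * BlockSize overflow by dividing instead of multiplying.
    if (Header.BlockCount != 0 && BlockSize > std::numeric_limits<uint64_t>::max() / Header.BlockCount)
    {
        return false;
    }
    const uint64_t Capacity = BlockSize * Header.BlockCount;

    // 3. The declared raw size must fit in the blocks the header claims to have.
    return Header.TotalRawSize <= Capacity;
}
```

A checksum only proves the header was not corrupted in transit; a sender who controls the bytes also controls the checksum, which is why the field checks have to stand on their own.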
| |
Stopping the zenserver Windows service (via `sc stop`, `zen service stop`, system shutdown, or any other SCM path) was being ignored. SCM would eventually force-kill the process after its timeout, giving an ungraceful shutdown.
## Root cause
PR #751 ("add simple http client tests", c37421a3b) restructured each HTTP server's `OnRun` loop from
```cpp
do { m_ShutdownEvent.Wait(WaitTimeout); }
while (!IsApplicationExitRequested());
```
to
```cpp
do { ShutdownRequested = m_ShutdownEvent.Wait(WaitTimeout); }
while (!ShutdownRequested);
```
That was well-intentioned — tests wanted to start/stop an HTTP server without touching global process state — but the old loop was the only thing that turned `RequestApplicationExit()` into an actual server wake-up. Once it was removed, `RequestApplicationExit(0)` was silently downgraded to "just sets a flag". The `WindowsService::SvcCtrlHandler` stop path was calling exactly that, so SCM stops stopped working. The sponsor-process check path kept working only because it *also* calls `m_Http->RequestExit()` via `ZenServerBase::RequestExit()`.
## Fix
- Restore `IsApplicationExitRequested()` as a secondary exit condition in each HTTP server's `OnRun` loop (`httpsys`, `httpasio`, `httpmulti`, `httpnull`, `httpplugin`) alongside the per-server `m_ShutdownEvent` that #751 introduced. Preserves #751's goal — tests can still call `server->RequestExit()` without touching global state — while making `RequestApplicationExit()` wake the server up again, which the rest of the codebase and `SvcCtrlHandler` assume.
- Clean up the service control handler in the same pass: also accept `SERVICE_CONTROL_SHUTDOWN`, report `STOP_PENDING` with a 30s `dwWaitHint` (was 0), drop the redundant second `ReportSvcStatus` call, and remove `ghSvcStopEvent` which nothing ever `Wait()`-ed on.
- Advertise `SERVICE_ACCEPT_STOP | SERVICE_ACCEPT_SHUTDOWN` while running; drop controls while stop-pending/stopped.
- Make `WindowsService` destructor virtual (latent UB given `Run()` was already virtual).
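The restored loop shape from the first bullet can be sketched in a self-contained form. The `Event` class and the global flag below are simplified stand-ins written for this example; they mirror the names used above but are not the zen implementations.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Stand-ins for the process-wide exit flag (assumed shape, not the real API).
inline std::atomic<bool> g_ApplicationExitRequested{false};
inline void RequestApplicationExit(int /*ExitCode*/) { g_ApplicationExitRequested.store(true); }
inline bool IsApplicationExitRequested() { return g_ApplicationExitRequested.load(); }

// Minimal manual-reset event: Wait() returns true if signaled within the timeout.
class Event
{
public:
    void Set()
    {
        {
            std::lock_guard<std::mutex> Lock(m_Mutex);
            m_Signaled = true;
        }
        m_Condition.notify_all();
    }
    bool Wait(std::chrono::milliseconds Timeout)
    {
        std::unique_lock<std::mutex> Lock(m_Mutex);
        return m_Condition.wait_for(Lock, Timeout, [this] { return m_Signaled; });
    }

private:
    std::mutex              m_Mutex;
    std::condition_variable m_Condition;
    bool                    m_Signaled = false;
};

// Exits on either the per-server event (what tests use) or the global flag
// (what the service control handler sets).
inline void RunUntilExit(Event& ShutdownEvent, std::chrono::milliseconds WaitTimeout)
{
    bool ShutdownRequested = false;
    do
    {
        ShutdownRequested = ShutdownEvent.Wait(WaitTimeout);
    } while (!ShutdownRequested && !IsApplicationExitRequested());
}
```

With this shape the global flag is noticed within one `WaitTimeout` period, while a per-server `Set()` still wakes the loop immediately.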
| |
- Improvement: Hub shares a single S3 client and IMDS credential provider across all modules, reducing IMDS load and surviving transient IMDS blips during bulk provisioning
- Improvement: Hub validates hydration config at startup; bad `--hub-hydration-target-spec` or `--hub-hydration-target-config` now fails `zen hub` at boot instead of per-module at first hydrate
- Improvement: S3 hydration multipart chunk size configurable via `settings.chunk-size` (default 32 MiB)
- Improvement: S3 client extracts `<Error><Code>` and `<Message>` from XML error bodies (previously logged as `<unhandled content format>`)
- Improvement: S3 client fails fast with a "no credentials available" error when AWS credentials are missing, instead of sending an unsigned request that S3 rejects with a generic 400
- Improvement: IMDS credential provider retries transient connection failures (up to 3 attempts with backoff)
- Improvement: HTTP clients with `RetryCount > 0` also retry on `CURLE_COULDNT_CONNECT`
| |
- Adds `FollowRedirects` (default `false`) and `MaxRedirects` (default `5`) fields to `HttpClientSettings`.
- When `FollowRedirects` is enabled, the curl backend sets `CURLOPT_FOLLOWLOCATION` and `CURLOPT_MAXREDIRS` so HTTP 3xx redirects are handled transparently in the transport layer — callers no longer need to parse `Location` headers and re-issue requests themselves.
- Defaults are off, so existing callers see no behavior change.
| |
Moves `ZipFs` from `src/zenserver/frontend/` to `src/zenhttp/` so any binary linking `zenhttp` can serve a bundled web UI from a zip archive (motivator: the upcoming `zen trace serve` subcommand).
| |
A series of correctness and API hygiene fixes to the intrusive refcount primitives in `zenbase`, culminating in the removal of `RefPtr<T>` in favour of a single unified `Ref<T>` smart pointer.
The changes are motivated by two pieces of latent UB sitting under every `Ref<T>` / `TRefCounted<T>` in the codebase, plus a handful of API footguns on the smart-pointer side (silent raw-pointer decay, missing converting moves, unconstrained conversions from unrelated types).
## Correctness fixes
- **Strict-aliasing UB in atomic helpers** — `AtomicIncrement`/`Decrement`/`Add` took a `volatile uint32_t&` and reinterpret-cast it to `std::atomic<T>*`. The object was never constructed as a `std::atomic`, so the access was type-punning UB. Fixed by changing `m_RefCount` to `std::atomic<uint32_t>` directly in `RefCounted`, `TRefCounted<T>` and `IoBufferCore`. The helpers (and `zenbase/atomic.h`) are later removed entirely — the three callers now invoke `fetch_add`/`fetch_sub` directly.
- **const_cast of non-mutable member** — `AddRef()` / `Release()` are `const` but mutated `m_RefCount` via `const_cast`. Since `m_RefCount` wasn't `mutable`, writing through the cast was UB for any `const`-qualified holder (e.g. a `static const` refcounted singleton). Fixed by marking `m_RefCount` `mutable` and dropping the `const_cast` in `AddRef`/`Release`.
- **Public non-virtual `TRefCounted` destructor** — allowed `delete basePtr;` to slice past the CRTP `DeleteThis()` contract. Moved to `protected`.
## Memory-ordering cleanup
- `AddRef` weakened from seq_cst to **relaxed** (a thread can only take a new reference via one it already holds; nothing needs to synchronize).
- `Release` weakened from seq_cst to **acq_rel** (sufficient to order prior writes before the destructor, and make the decrement visible to observers).
- Diagnostic `RefCount()` / `GetRefCount()` reads made **relaxed** and spelled out as explicit `.load()` — the returned value is stale the moment it's observed, so stronger ordering gives no guarantee.
- No-op on x86 (`lock xadd` either way), but removes a full barrier on every `Ref<T>` copy on ARM64 (Apple silicon / Windows-on-ARM).
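Taken together, the correctness and memory-ordering changes above amount to the following pattern for an intrusive reference count. `RefCountedSketch` is a hypothetical class written for illustration; the real `RefCounted` / `TRefCounted<T>` types differ in detail (for instance, the CRTP variant dispatches to `DeleteThis()` rather than relying on a virtual destructor).

```cpp
#include <atomic>
#include <cstdint>

// Illustrative only; not the zenbase class.
class RefCountedSketch
{
public:
    // Relaxed is enough: a new reference can only be taken through an existing one.
    void AddRef() const noexcept { m_RefCount.fetch_add(1, std::memory_order_relaxed); }

    // acq_rel orders all prior writes to the object before its destruction.
    void Release() const noexcept
    {
        if (m_RefCount.fetch_sub(1, std::memory_order_acq_rel) == 1)
        {
            delete this;
        }
    }

    // Diagnostic read; the value may be stale as soon as it is returned.
    uint32_t RefCount() const noexcept { return m_RefCount.load(std::memory_order_relaxed); }

protected:
    // Protected so callers cannot bypass Release() with a plain delete.
    virtual ~RefCountedSketch() = default;

private:
    // mutable std::atomic: const AddRef/Release need neither const_cast nor type punning.
    mutable std::atomic<uint32_t> m_RefCount{0};
};
```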
## `RefPtr` / `Ref` unification
Before this branch, `RefPtr<T>` and `Ref<T>` were subtly different in ways that made the safer of the two (`Ref`) harder to use and the looser one (`RefPtr`) dangerous:
- `RefPtr::operator T*()` was implicit — `delete refPtr;` compiled silently (double-delete), and the raw pointer could outlive the temporary `RefPtr` it was extracted from. Made `explicit`, then removed entirely once call sites were migrated to `.Get()`.
- `RefPtr(T*)` was implicit while `RefPtr(RefPtr<Derived>&&)` was `explicit` — exactly the opposite of the safety intent. Reversed.
- `RefPtr`'s converting move was unconstrained (any `RefPtr<U>` with an implicitly-convertible `U*` satisfied it, including `void*` and multiple-inheritance base offsets). Added a `DerivedFrom<U, T>` constraint matching `Ref<T>`.
- `Ref<T>` was missing a converting move ctor / move-assignment from `Ref<Derived>` — upcasts of rvalues were going through `AddRef`+`Release` instead of a pointer steal. Added.
- `Release()` and the non-move smart-pointer ops were not `noexcept`, despite being so in practice. Marked `noexcept` throughout.
After all of the above, the two types were functionally identical. The final commit deletes `RefPtr` and rewrites the ~10 consumer files to use `Ref`.
| |
- Improvement: New `ZEN_SCOPED_LOG(Expr)` macro routes `ZEN_INFO`/`ZEN_WARN`/`ZEN_DEBUG` in the enclosing block through the given logger expression instead of the default
- Improvement: `BuildContainer`, `SaveOplog`, and `LoadOplogContext` now take a caller-provided `LoggerRef` so diagnostic messages route through the caller's logger
| |
- Bugfix: OAuth client credentials token request now sends correct `application/x-www-form-urlencoded` content type
- Improvement: HTTP client Content-Type in additional headers now overrides the payload content type
| |
Rework of the Horde agent subsystem from synchronous per-thread I/O to an async ASIO-driven architecture, plus provisioner scale-down with graceful draining, OIDC authentication, scheduler improvements, and dashboard UI for provisioner control.
### Async Horde Agent Rewrite
- Replace synchronous `HordeAgent` (one thread per agent, blocking I/O) with `AsyncHordeAgent` — an ASIO state machine running on a shared `io_context` thread pool
- Replace `TcpComputeTransport`/`AesComputeTransport` with `AsyncTcpComputeTransport`/`AsyncAesComputeTransport`
- Replace `AgentMessageChannel` with `AsyncAgentMessageChannel` using frame queuing and ASIO timers
- Delete `ComputeBuffer` and `ComputeChannel` ring-buffer classes (no longer needed)
### Provisioner Drain / Scale-Down
- `HordeProvisioner` can now drain agents when target core count is lowered: queries each agent's `/compute/session/status` for workload, selects candidates by largest-fit/lowest-workload, and sends `/compute/session/drain`
- Configurable `--horde-drain-grace-period` (default 300s) before force-kill
- Implement `IProvisionerStateProvider` interface to expose provisioner state to the orchestrator HTTP layer
- Forward `--coordinator-session`, `--provision-clean`, and `--provision-tracehost` through both Horde and Nomad provisioners to spawned workers
### OIDC Authentication
- `HordeClient` accepts an `AccessTokenProvider` (refreshable token function) as alternative to static `--horde-token`
- Wire up `OidcToken.exe` auto-discovery via `httpclientauth::CreateFromOidcTokenExecutable` with `--HordeUrl` mode
- New `--horde-oidctoken-exe-path` CLI option for explicit path override
### Orchestrator & Scheduler
- Orchestrator generates a session ID at startup; workers include `coordinator_session` in announcements so the orchestrator can reject stale-session workers
- New `Rejected` action state — when a remote runner declines at capacity, the action is rescheduled without retry count increment
- Reduce scheduler lock contention: snapshot pending actions under shared lock, sort/trim outside the lock
- Parallelize remote action submission across runners via `WorkerThreadPool` with slow-submit warnings
- New action field `FailureReason` populated by all runner types (exit codes, sandbox failures, exceptions)
- New endpoints: `session/drain`, `session/status`, `session/sunset`, `provisioner/status`, `provisioner/target`
### Remote Execution
- Eager-attach mode for `RemoteHttpRunner` — bundles all attachments upfront in a `CbPackage` for single-roundtrip submits
- Track in-flight submissions to prevent over-queuing
- Show remote runner hostname in `GetDisplayName()`
- `--announce-url` to override the endpoint announced to the coordinator (e.g. relay-visible address)
### Frontend Dashboard
- Delete standalone `compute.html` (925 lines) and `orchestrator.html` (669 lines), consolidated into JS page modules
- Add provisioner panel to orchestrator dashboard: target/active/estimated core counts, draining agent count
- Editable target-cores input with debounced POST to `/orch/provisioner/target`
- Per-agent provisioning status badges (active / draining / deallocated) in the agents table
- Active vs total CPU counts in agents summary row
### CLI
- New `zen compute record-start` / `record-stop` subcommands
- `zen exec` progress bar with submit and completion phases, atomic work counters, `--progress` mode (Pretty/Plain/Quiet)
### Other
- `DataDir` supports environment variable expansion
- Worker manifest validation checks for `worker.zcb` marker to detect incomplete cached directories
- Linux/Mac runners `nice(5)` child processes to avoid starving the main server
- `ComputeService::SetShutdownCallback` wired to `RequestExit` via `session/sunset`
- Curl HTTP client logs effective URL on failure
- `MachineInfo` carries `Pool` and `Mode` from Horde response
- Horde bundle creation includes `.pdb` on Windows
| |
* log curl raw error on retry, add retry on CURLE_PARTIAL_FILE error
| |
* objectstore.cpp - m_TotalBytesServed now tracks all range cases (single, multi, 416)
* async http: docstring corrected: curl_multi_socket_action() / ASIO socket async_wait
remove non-ascii characters
* fix singlethreaded gc option in lua to not use dash
* fix changelog order
| |
- Improvement: HTTP range responses (RFC 7233) are now fully compliant across the object store and build store
  - 206 Partial Content responses now include a `Content-Range` header; previously absent for single-range requests, which broke `HttpClient::GetRanges()`
  - 416 Range Not Satisfiable responses now include `Content-Range: bytes */N` as required by RFC 7233
  - Out-of-bounds range requests return 416 Range Not Satisfiable (was 400 Bad Request)
  - Single-byte ranges (`bytes=N-N`) are now correctly accepted (were previously rejected)
  - Range byte positions widened from 32-bit to 64-bit; RFC 7233 imposes no size limit on byte range values
  - Build store binary GET requests with a Range header now return 206 Partial Content with `Content-Range` (previously returned 200 OK without it)
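The range behaviour listed above can be illustrated with a small resolver for a single `bytes=N-M` or `bytes=N-` range. This is a sketch under simplifying assumptions: suffix ranges (`bytes=-N`) and multi-range headers are treated as malformed here, and `ResolveRange`, `RangeResult`, and `ContentRange` are hypothetical names, not the object store's API.

```cpp
#include <charconv>
#include <cstdint>
#include <string>
#include <string_view>
#include <system_error>

enum class RangeStatus { Ok, NotSatisfiable, Malformed };

struct RangeResult
{
    RangeStatus Status;
    uint64_t    First = 0;  // inclusive, 64-bit so offsets beyond 4 GiB work
    uint64_t    Last  = 0;  // inclusive
};

inline bool ParseU64(std::string_view Text, uint64_t& Out)
{
    if (Text.empty())
    {
        return false;
    }
    const char* End = Text.data() + Text.size();
    const auto [Ptr, Ec] = std::from_chars(Text.data(), End, Out);
    return Ec == std::errc{} && Ptr == End;
}

inline RangeResult ResolveRange(std::string_view Header, uint64_t TotalSize)
{
    constexpr std::string_view Prefix = "bytes=";
    if (Header.substr(0, Prefix.size()) != Prefix)
    {
        return RangeResult{RangeStatus::Malformed};
    }
    Header.remove_prefix(Prefix.size());
    const size_t Dash = Header.find('-');
    if (Dash == std::string_view::npos)
    {
        return RangeResult{RangeStatus::Malformed};
    }
    uint64_t First = 0;
    uint64_t Last  = 0;
    if (!ParseU64(Header.substr(0, Dash), First))
    {
        return RangeResult{RangeStatus::Malformed};  // also rejects suffix ranges in this sketch
    }
    const std::string_view LastText = Header.substr(Dash + 1);
    const bool OpenEnded = LastText.empty();
    if (!OpenEnded && (!ParseU64(LastText, Last) || Last < First))
    {
        return RangeResult{RangeStatus::Malformed};
    }
    if (First >= TotalSize)
    {
        return RangeResult{RangeStatus::NotSatisfiable};  // maps to 416
    }
    if (OpenEnded || Last >= TotalSize)
    {
        Last = TotalSize - 1;  // clamp to the end of the representation
    }
    return RangeResult{RangeStatus::Ok, First, Last};  // maps to 206; N-N is a valid one-byte range
}

// Builds the Content-Range value for a 206 or 416 response.
inline std::string ContentRange(const RangeResult& Range, uint64_t TotalSize)
{
    if (Range.Status == RangeStatus::NotSatisfiable)
    {
        return "bytes */" + std::to_string(TotalSize);
    }
    return "bytes " + std::to_string(Range.First) + "-" + std::to_string(Range.Last) + "/" + std::to_string(TotalSize);
}
```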
| |
* reduce zenserver spawns in tests
* fix filesystemutils wrong test suite name
* tweak tests for faster runtime
* reduce more test runtime
* more wall time improvements
* fast http and processmanager tests
| |
- Adds `AsyncHttpClient` — an asynchronous HTTP client using `curl_multi_socket_action` integrated with ASIO for event-driven I/O. Supports GET, POST, PUT, DELETE, HEAD with both callback-based and `std::future`-based APIs.
- Extracts shared curl helpers (callbacks, URL encoding, header construction, error mapping) into `httpclientcurlhelpers.h`, eliminating duplication between the sync and async implementations.
## Design
- All curl_multi state is serialized on an `asio::strand`, safe with multi-threaded io_contexts.
- Two construction modes: owned io_context (creates internal thread) or external io_context (caller runs the loop).
- Socket readiness is detected via `asio::ip::tcp::socket::async_wait` driven by curl's `CURLMOPT_SOCKETFUNCTION`/`CURLMOPT_TIMERFUNCTION` — no polling, sub-millisecond latency.
- Completion callbacks are dispatched off the strand onto the io_context so slow callbacks don't starve the curl event loop. Exceptions in callbacks are caught and logged.
## Files
| File | Change |
|------|--------|
| `zenhttp/include/zenhttp/asynchttpclient.h` | New public header |
| `zenhttp/clients/asynchttpclient.cpp` | Implementation (~1000 lines) |
| `zenhttp/clients/httpclientcurlhelpers.h` | Shared curl helpers extracted from sync client |
| `zenhttp/clients/httpclientcurl.cpp` | Removed duplicated helpers, uses shared header |
| `zenhttp/asynchttpclient_test.cpp` | 8 test cases: verbs, payloads, callbacks, concurrency, external io_context, connection errors |
| `zenhttp/zenhttp.cpp` | Forcelink registration for new tests |
| |
- Feature: Hub dashboard proxy - instance dashboards are accessible through the hub server at `/hub/proxy/{port}/` without requiring direct port access
| |
### Security: Input validation & path safety
- **Reject local file references by default** in package parsing — only allow when explicitly opted in by the service (`ParseFlags::kAllowLocalReferences`) and validated by an `ILocalRefPolicy` (fail-closed: no policy = rejected)
- **`DataRootLocalRefPolicy`** restricts local ref paths to the server's data root via canonical path prefix matching
- **Validate attachment hashes** in compute HTTP handlers — decompresses and re-hashes each attachment at ingestion time to reject tampered payloads
- **Path traversal validation** for worker descriptions (`pathvalidation.h`) — rejects absolute paths, `..` components, Windows reserved device names, and invalid filename characters
- **Harden CbPackage parsing** against corrupt inputs — overflow-safe attachment count, bounds checks on local ref offset/size, graceful failure instead of `ZEN_ASSERT` for untrusted data
- **Harden legacy package parser** — reject zero-size binary fields, missing mappers, and optionally validate resolved attachment hashes
- **Bounds check in `CbPackageReader::MarshalLocalChunkReference`** — detect when `MakeFromFile` silently clamps offset+size to file size
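The traversal part of the path validation above can be sketched with `std::filesystem`. This covers only the absolute-path and `..` checks; the Windows reserved device names and invalid-character checks mentioned above are omitted, and `IsSafeRelativePath` is a hypothetical name rather than the function exported by `pathvalidation.h`.

```cpp
#include <filesystem>

// Hypothetical helper: accepts only non-empty relative paths with no ".." component.
inline bool IsSafeRelativePath(const std::filesystem::path& Candidate)
{
    if (Candidate.empty() || Candidate.is_absolute() || Candidate.has_root_name() || Candidate.has_root_directory())
    {
        return false;  // anything rooted could escape the intended base directory
    }
    for (const std::filesystem::path& Component : Candidate)
    {
        if (Component == "..")
        {
            return false;  // rejected even when the net effect stays inside, e.g. "a/../b"
        }
    }
    return true;
}
```

Rejecting every `..` component outright is stricter than normalizing first, but it avoids having to reason about symlinks or platform-specific normalization rules.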
### Reliability: Lock consolidation & bug fixes
- **Consolidate three action map locks into one** (`m_ActionMapLock`) — eliminates deadlock risk from multi-lock ordering, simplifies state transitions, and fixes a race where newly enqueued actions were briefly invisible to `GetActionResult`/`FindActionResult`
- **Fix infinite loop in `BaseRunnerGroup::SubmitActions`** when actions exceed total runner capacity — cap round-robin at `TotalCapacity` and default unassigned results to "No capacity"
- **Fix `MakeSafeAbsolutePathInPlace` for UNC paths**: `\\server\share` now correctly becomes `\\?\UNC\server\share` instead of `\\?\server\share`
- **Fix `max_retries=0`** — previously fell through to the default of 3; now correctly means "no retries"
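The UNC fix above comes down to how the Windows extended-length prefix is applied: a drive path gets `\\?\` prepended, while a UNC path has its leading `\\` replaced by `\\?\UNC\`. A minimal string-level sketch follows, using a hypothetical `ToExtendedLengthPath` helper on narrow strings (the real function works in place and presumably on `std::filesystem::path`).

```cpp
#include <string>
#include <string_view>

inline bool StartsWith(std::string_view Text, std::string_view Prefix)
{
    return Text.substr(0, Prefix.size()) == Prefix;
}

// Hypothetical helper illustrating the prefix rules only.
inline std::string ToExtendedLengthPath(std::string_view Path)
{
    constexpr std::string_view ExtendedPrefix = R"(\\?\)";
    constexpr std::string_view UncPrefix      = R"(\\)";
    if (StartsWith(Path, ExtendedPrefix))
    {
        return std::string(Path);  // already in extended-length form
    }
    if (StartsWith(Path, UncPrefix))
    {
        // \\server\share -> \\?\UNC\server\share (the leading "\\" is dropped)
        return std::string(R"(\\?\UNC\)") + std::string(Path.substr(UncPrefix.size()));
    }
    return std::string(ExtendedPrefix) + std::string(Path);  // C:\dir -> \\?\C:\dir
}
```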
### New: ManagedProcessRunner
- Cross-platform process runner backed by `SubprocessManager` — uses async exit callbacks instead of polling, delegates CPU/memory metrics to the manager's built-in sampler
- `ProcessGroup` (JobObject on Windows, process group on POSIX) for bulk cancellation on shutdown
- `--managed` flag on `zen exec inproc` to select this runner
- Refactored monitor thread lifecycle — `StartMonitorThread()` now called from derived constructors to avoid calling virtual functions from base constructor
### Process management
- **Suppress crash dialogs** via `JOB_OBJECT_UILIMIT_ERRORMODE` + `SEM_NOGPFAULTERRORBOX` in both `WindowsProcessRunner` and `JobObject::Initialize` — prevents WER/Dr. Watson modal dialogs from blocking the monitor thread
- **CREATE_SUSPENDED → AssignProcessToJobObject → ResumeThread** pattern in `WindowsProcessRunner` — ensures job object assignment before process execution
- **Move stdout/stderr callbacks to `Spawn()` parameters** in `SubprocessManager` — prevents race where early output could be missed before callback installation
- Consistent PID logging across all runner types
### Test infrastructure
- **`zentest-appstub`**: Added `Fail` (configurable exit code) and `Crash` (abort / nullptr deref) test functions
- **Compute integration tests**: exit code handling, auto-retry exhaustion, manual reschedule after failure, mixed success/failure queues, crash handling (abort + nullptr), crash auto-retry, immediate query visibility after enqueue
- **Package format tests**: truncated header, bad magic, attachment count overflow, truncated data, local ref rejection/acceptance, policy enforcement (inside/outside root, traversal, no-policy fail-closed)
- **Legacy package parser tests**: empty input, zero-size binary, hash resolution with/without mapper, hash mismatch detection
- **UNC path tests** for `MakeSafeAbsolutePath`
### Misc
- ANSI color helper macros (`ZEN_RED`, `ZEN_BRIGHT_WHITE`, etc.) and `ZEN_BOLD`/`ZEN_DIM`/etc.
- Generic `fmt::formatter` for types with free `ToString` functions
- Compute dashboard: truncated hash display with monospace font and hover for full value
- Renamed `usonpackage_forcelink` → `cbpackage_forcelink`
- Compute enabled by default in xmake config (releases still explicitly disable)
| |
- **Eliminate `<regex>` usage** — Replaced `std::regex`-based URL parsing in `jupiterbuildstorage.cpp` with manual `string_view` parsing. Added `CXXOPTS_NO_REGEX` to disable regex in cxxopts. Includes comprehensive tests for the new URL parser.
- **Add missing HTTP response codes** — Added `102`, `103`, `203`, `207`, `208`, `226`, `306`, `421`, `425`, `451` to the enum and reason string lookup.
- **Add `ForceColor` support to zen CLI** — Plumbed the `ForceColor` logging option through to the zen client.
- **Add `.clangd` config** — Strips MSVC-specific flags clangd can't handle and suppresses noisy clang-tidy checks.
- **Generic `fmt::formatter` for `ToString`** — Concept-based formatter that auto-formats any type with a free `ToString()` function, removing the need for per-type specializations.
- **Fix OpenSSL dependency** — Changed `zenhorde` to use `openssl3` package on Linux/macOS.
- **Add `<cmath>` include** — Missing include in `hyperloglog.h`.
- **GCC compile fix** — Moved `static constinit` variable inside lambda in `logging.cpp`.
| |
CPR is no longer needed now that HttpClient has fully transitioned to raw libcurl. This removes the CPR library, its build integration, implementation files, and all conditional compilation guards, leaving curl as the sole HTTP client backend.
| |
- Feature: Hub watchdog automatically deprovisions inactive provisioned and hibernated instances
- Feature: Added `stats/activity_counters` endpoint to measure server activity
- Feature: Added configuration options for hub watchdog
  - `--hub-watchdog-provisioned-inactivity-timeout-seconds` Inactivity timeout before a provisioned instance is deprovisioned
  - `--hub-watchdog-hibernated-inactivity-timeout-seconds` Inactivity timeout before a hibernated instance is deprovisioned
  - `--hub-watchdog-inactivity-check-margin-seconds` Margin before timeout at which an activity check is issued
  - `--hub-watchdog-cycle-interval-ms` Watchdog poll interval in milliseconds
  - `--hub-watchdog-cycle-processing-budget-ms` Maximum time budget per watchdog cycle in milliseconds
  - `--hub-watchdog-instance-check-throttle-ms` Minimum delay between checks on a single instance
  - `--hub-watchdog-activity-check-connect-timeout-ms` Connect timeout for activity check requests
  - `--hub-watchdog-activity-check-request-timeout-ms` Request timeout for activity check requests
| |
- Improvement: Provisioning a hibernated instance now automatically wakes it instead of requiring an explicit wake call first
- Improvement: Deprovisioning now accepts instances in Crashed or Hibernated states, not just Provisioned
- Improvement: Added `--consul-health-interval-seconds` and `--consul-deregister-after-seconds` options to control Consul health check behavior (defaults: 10s and 30s)
- Improvement: Consul registration now occurs when provisioning starts; health check intervals are applied once provisioning completes
| |
Adds a `SubprocessManager` for managing child processes with ASIO-integrated async exit detection, stdout/stderr pipe capture, and periodic metrics sampling. Also introduces `ProcessGroup` for OS-backed process grouping (Windows JobObjects / POSIX process groups).
### SubprocessManager
- Async process exit detection using platform-native mechanisms (Windows `object_handle`, Linux `pidfd_open`, macOS `kqueue EVFILT_PROC`) — no polling
- Stdout/stderr capture via async pipe readers with per-process or default callbacks
- Periodic round-robin metrics sampling (CPU, memory) across managed processes
- Spawn, adopt, remove, kill, and enumerate managed processes
### ProcessGroup
- OS-level process grouping: Windows JobObject (kill-on-close guarantee), POSIX `setpgid` (bulk signal delivery)
- Atomic group kill via `TerminateJobObject` (Windows) or `kill(-pgid, sig)` (POSIX)
- Per-group aggregate metrics and enumeration
### ProcessHandle improvements
- Added explicit constructors from `int` (pid) and `void*` (native handle)
- Added move constructor and move assignment operator
### ProcessMetricsTracker
- Cross-platform process metrics (CPU time, working set, page faults) via `QueryProcessMetrics()`
- ASIO timer-driven periodic sampling with configurable interval and batch size
- Aggregate metrics across tracked processes
### Other changes
- Fixed `zentest-appstub` writing a spurious `Versions` file to cwd on every invocation
| |
## Summary
This PR adds a session management service, several new dashboard pages, and a number of infrastructure improvements.
### Sessions Service
- `SessionsServiceClient` in `zenutil` announces sessions to a remote zenserver with a 15s heartbeat (POST/PUT/DELETE lifecycle)
- Storage server registers itself with its own local sessions service on startup
- Session mode attribute coupled to server mode (Compute, Proxy, Hub, etc.)
- Ended sessions tracked with `ended_at` timestamp; status filtering (Active/Ended/All)
- `--sessions-url` config option for remote session announcement
- In-process log sink (`InProcSessionLogSink`) forwards server log output to the server's own session, visible in the dashboard
### Session Log Viewer
- POST/GET endpoints for session logs (`/sessions/{id}/log`) supporting raw text and structured JSON/CbObject with batch `entries` array
- In-memory log storage per session (capped at 10k entries) with cursor-based pagination for efficient incremental fetching
- Log panel in the sessions dashboard with incremental DOM updates, auto-scroll (Follow toggle), newest-first toggle, text filter, and log-level coloring
- Auto-selects the server's own session on page load
### TCP Log Streaming
- `LogStreamListener` and `TcpLogStreamSink` for log delivery over TCP
- Sequence numbers on each message with drop detection and synthetic "dropped" notice on gaps
- Gathered buffer writes to reduce syscall overhead when flushing batches
- Tests covering basic delivery, multi-line splitting, drop detection, and sequencing
### New Dashboard Pages
- **Sessions**: master-detail layout with selectable rows, metadata panel, live WebSocket updates, paging, abbreviated date formatting, and "this" pill for the local session
- **Object Store**: summary stats tiles and bucket table with click-to-expand inline object listing (`GET /obj/`)
- **Storage**: per-volume disk usage breakdown (`GET /admin/storage`), Garbage Collection status section (next-run countdown, last-run stats), and GC History table with paginated rows and expandable detail panels
- **Network**: overview tiles, per-service request table, proxy connections, and live WebSocket updates; distinct client IPs and session counts via HyperLogLog
### Documentation Page
- In-dashboard Docs page with sidebar navigation, markdown rendering (via `marked`), Mermaid diagram support (theme-aware), collapsible sections, text filtering with highlighting, and cross-document linking
- New user-facing docs: `overview.md` (with architecture and per-mode diagrams), `sessions.md`, `cache.md`, `projects.md`; updated `compute.md`
- Dev docs moved to `docs/dev/`
### Infrastructure & Bug Fixes
- **Deflate compression** for the embedded frontend zip (~3.4MB → ~950KB); zlib inflate support added to `ZipFs` with cached decompressed buffers
- **Local IP addresses**: `GetLocalIpAddresses()` (Windows via `GetAdaptersAddresses`, Linux/Mac via `getifaddrs`); surfaced in `/status/status`, `/health/info`, and the dashboard banner
- **Dashboard nav**: unified into `zen-nav` web component with `MutationObserver` for dynamically added links, CSS `::part()` to merge banner/nav border radii, and prefix-based active link detection
- Stats broadcast refactored from manual JSON string concatenation to `CbObjectWriter`; `CbObject`-to-JS conversion improved for `TimeSpan`, `DateTime`, and large integers
- Stats WebSocket boilerplate consolidated into `ZenPage.connect_stats_ws()`
| |
## Summary
Adds probabilistic cardinality estimation for tracking unique HTTP clients and sessions using a HyperLogLog implementation.
- Add a `HyperLogLog<Precision>` template in `zentelemetry` with thread-safe lock-free register updates, merge support, and XXH3 hashing
- Feed client IP addresses (via raw bytes) and session IDs (via `Oid` bytes) into their respective HyperLogLog estimators from both the ASIO and http.sys server backends
- Emit `distinct_clients` and `distinct_sessions` cardinality estimates in HTTP `CollectStats()`
- Add tests covering empty, single, duplicates, accuracy, merge, and clear scenarios
## Why HyperLogLog
Tracking exact unique counts would require storing every observed IP or session ID. HyperLogLog provides a memory-bounded probabilistic estimate (~1–2% error) using only a few KB of memory regardless of traffic volume.
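The trade-off described above can be illustrated with a minimal sketch. This is not the `zentelemetry` implementation: `HllSketch` and `MixHash` are hypothetical names, and a splitmix64-style mixer stands in for XXH3 so the example needs only the standard library. With `Precision = 12` the sketch holds 4096 one-byte registers (4 KB).

```cpp
#include <array>
#include <atomic>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Stand-in 64-bit mixer (splitmix64 finalizer). Assumption: the real code hashes with XXH3.
inline uint64_t MixHash(uint64_t X)
{
    X += 0x9e3779b97f4a7c15ull;
    X = (X ^ (X >> 30)) * 0xbf58476d1ce4e5b9ull;
    X = (X ^ (X >> 27)) * 0x94d049bb133111ebull;
    return X ^ (X >> 31);
}

template<unsigned Precision>
class HllSketch  // hypothetical name, for illustration only
{
    static_assert(Precision >= 7 && Precision <= 16, "alpha constant below assumes >= 128 registers");
    static constexpr size_t kRegisters = size_t(1) << Precision;

public:
    void Add(uint64_t Hash)
    {
        // Top Precision bits select the register
        const size_t Index = size_t(Hash >> (64 - Precision));
        // Rank = 1-based position of the first set bit in the remaining bits
        uint8_t Rank = 1;
        for (uint64_t Bits = Hash << Precision; Rank <= 64 - Precision && (Bits >> 63) == 0; Bits <<= 1)
        {
            ++Rank;
        }
        RaiseTo(Index, Rank);
    }

    // Union of two sketches: element-wise maximum of the registers
    void Merge(const HllSketch& Other)
    {
        for (size_t I = 0; I < kRegisters; ++I)
        {
            RaiseTo(I, Other.m_Registers[I].load(std::memory_order_relaxed));
        }
    }

    double Estimate() const
    {
        const double M = double(kRegisters);
        double Sum = 0.0;
        size_t Zeros = 0;
        for (const auto& Register : m_Registers)
        {
            const uint8_t V = Register.load(std::memory_order_relaxed);
            Sum += std::ldexp(1.0, -int(V));  // 2^-V
            Zeros += (V == 0) ? 1 : 0;
        }
        const double Alpha = 0.7213 / (1.0 + 1.079 / M);
        const double Raw = Alpha * M * M / Sum;
        // Small-range correction: fall back to linear counting while registers are still empty
        if (Raw <= 2.5 * M && Zeros != 0)
        {
            return M * std::log(M / double(Zeros));
        }
        return Raw;
    }

private:
    // Lock-free "store maximum" via compare-exchange
    void RaiseTo(size_t Index, uint8_t Rank)
    {
        uint8_t Current = m_Registers[Index].load(std::memory_order_relaxed);
        while (Current < Rank &&
               !m_Registers[Index].compare_exchange_weak(Current, Rank, std::memory_order_relaxed))
        {
        }
    }

    std::array<std::atomic<uint8_t>, kRegisters> m_Registers{};
};
```

Feeding the same value twice leaves every register unchanged, which is why duplicates do not affect the estimate.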
|
| |
|
|
| |
- Bugfix: Retry OIDC token refresh once on failure before propagating the error
- Bugfix: Handle HTTP 501 (Not Implemented) from Jupiter as a signal to fall back from multi-range to single-range requests
|
| |
|
| |
Authentication callbacks are not thread-safe; call sites now invoke them from a single thread only
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
### Compute Batch Submission
- Consolidate duplicated action submission logic in `httpcomputeservice` into a single `HandleSubmitAction` supporting both single-action and batch (actions array) payloads
- Group actions by queue in `RemoteHttpRunner` and submit as batches with configurable chunk size, falling back to individual submission on failure
- Extract shared helpers: `MakeErrorResult`, `ValidateQueueForEnqueue`, `ActivateActionInQueue`, `RemoveActionFromActiveMaps`
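The group-then-chunk step can be sketched roughly as follows. `PendingAction`, `Batch`, and `GroupIntoBatches` are hypothetical names, not types from `RemoteHttpRunner`, and the fallback to individual submission on failure is left out.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a queued action: just a queue id and a sequence number
struct PendingAction
{
    std::string QueueId;
    int         Lsn;
};

using Batch = std::vector<int>;

// Groups actions by queue, then splits each group into batches of at most ChunkSize entries
inline std::map<std::string, std::vector<Batch>> GroupIntoBatches(const std::vector<PendingAction>& Actions,
                                                                  size_t                            ChunkSize)
{
    if (ChunkSize == 0)
    {
        ChunkSize = 1;  // degenerate setting: fall back to one action per batch
    }
    std::map<std::string, std::vector<Batch>> Result;
    for (const PendingAction& Action : Actions)
    {
        std::vector<Batch>& Batches = Result[Action.QueueId];
        if (Batches.empty() || Batches.back().size() >= ChunkSize)
        {
            Batches.emplace_back();  // start a new chunk for this queue
        }
        Batches.back().push_back(Action.Lsn);
    }
    return Result;
}
```

Submission order within each queue is preserved, so a caller could submit the batches in sequence and retry the members of a failed batch one by one.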
### Retracted Action State
- Add `Retracted` state to `RunnerAction` for retry-free rescheduling — an explicit request to pull an action back and reschedule it on a different runner without incrementing `RetryCount`
- Implement idempotent `RetractAction()` on `RunnerAction` and `ComputeServiceSession`
- Add `POST jobs/{lsn}/retract` and `queues/{queueref}/jobs/{lsn}/retract` HTTP endpoints
- Add state machine documentation and per-state comments to `RunnerAction`
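A minimal sketch of how a retract differs from a failure-driven retry, assuming a much-reduced state set. `ActionSketch` and its state names are illustrative only; the actual `RunnerAction` state machine has more states and synchronization that are not shown here.

```cpp
#include <cstdint>

// Reduced, hypothetical state set for illustration
enum class SketchState
{
    Pending,
    Running,
    Completed,
    Retracted
};

struct ActionSketch
{
    SketchState State      = SketchState::Pending;
    uint32_t    RetryCount = 0;

    // Failure path: rescheduling after a failure consumes one retry
    void RequeueAfterFailure()
    {
        ++RetryCount;
        State = SketchState::Pending;
    }

    // Idempotent: repeated calls leave the action in the same state.
    // Returns false if the action already finished and cannot be pulled back.
    bool Retract()
    {
        if (State == SketchState::Completed)
        {
            return false;
        }
        State = SketchState::Retracted;
        return true;
    }

    // Retracted actions go back to Pending without touching RetryCount
    void Reschedule()
    {
        if (State == SketchState::Retracted)
        {
            State = SketchState::Pending;
        }
    }
};
```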
### Compute Race Fixes
- Fix race in `HandleActionUpdates` where actions enqueued between session abandon and scheduler tick were never abandoned, causing `GetActionResult` to return 202 indefinitely
- Fix queue `ActiveCount` race where `NotifyQueueActionComplete` was called after releasing `m_ResultsLock`, allowing callers to observe stale counters immediately after `GetActionResult` returned OK
### Logging Optimization and ANSI improvements
- Improve `AnsiColorStdoutSink` write efficiency — single write call, dirty-flag flush, `RwLock` instead of `std::mutex`
- Move ANSI color emission from sink into formatters via `Formatter::SetColorEnabled()`; remove `ColorRangeStart`/`End` from `LogMessage`
- Extract color helpers (`AnsiColorForLevel`, `StripAnsiSgrSequences`) into `helpers.h`
- Strip upstream ANSI SGR escapes in non-color output mode. This enables colour in log messages without polluting log files with ANSI control sequences
- Move `RotatingFileSink`, `JsonFormatter`, and `FullFormatter` from header-only to pimpl with `.cpp` files
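Stripping SGR escapes for non-color output could look roughly like the sketch below. `StripSgr` is a hypothetical stand-in and is not the `StripAnsiSgrSequences` helper from `helpers.h`; it removes only sequences of the form `ESC [ digits-and-semicolons m` and leaves every other byte untouched.

```cpp
#include <cstddef>
#include <string>

// Removes ANSI SGR sequences (ESC '[' <digits and ';'> 'm') from a string.
// Other escape sequences, such as cursor movement, are passed through unchanged.
inline std::string StripSgr(const std::string& In)
{
    std::string Out;
    Out.reserve(In.size());
    size_t I = 0;
    while (I < In.size())
    {
        if (In[I] == '\033' && I + 1 < In.size() && In[I + 1] == '[')
        {
            size_t J = I + 2;
            while (J < In.size() && ((In[J] >= '0' && In[J] <= '9') || In[J] == ';'))
            {
                ++J;
            }
            if (J < In.size() && In[J] == 'm')
            {
                I = J + 1;  // skip the whole SGR sequence
                continue;
            }
        }
        Out.push_back(In[I++]);
    }
    return Out;
}
```

A file sink could apply this to each formatted line, so that messages carrying colour codes from upstream still read cleanly in log files.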
### CLI / Exec Refactoring
- Extract `ExecSessionRunner` class from ~920-line `ExecUsingSession` into focused methods and a `ExecSessionConfig` struct
- Replace monolithic `ExecCommand` with subcommand-based architecture (`http`, `inproc`, `beacon`, `dump`, `buildlog`)
- Allow parent options to appear after subcommand name by parsing subcommand args permissively and forwarding unmatched tokens to the parent parser
### Testing Improvements
- Fix `--test-suite` filter being ignored due to accumulation with default wildcard filter
- Add test suite banners to test listener output
- Made `function.session.abandon_pending` test more robust
### Startup / Reliability Fixes
- Fix silent exit when a second zenserver instance detects a port conflict — use `ZEN_CONSOLE_*` for log calls that precede `InitializeLogging()`
- Fix two potential SIGSEGV paths during early startup: guard `sentry_options_new()` returning nullptr, and throw on `ZenServerState::Register()` returning nullptr instead of dereferencing
- Fail on unrecognized zenserver `--mode` instead of silently defaulting to store
### Other
- Show host details (hostname, platform, CPU count, memory) when discovering new compute workers
- Move frontend `html.zip` from source tree into build directory
- Add format specifications for Compact Binary and Compressed Buffer wire formats
- Add `WriteCompactBinaryObject` to zencore
- Extended `ConsoleTui` with additional functionality
- Add `--vscode` option to `xmake sln` for clangd / `compile_commands.json` support
- Disable compute/horde/nomad in release builds (not yet production-ready)
- Disable unintended `ASIO_HAS_IO_URING` enablement
- Fix crashpad patch missing leading whitespace
- Clean up code triggering gcc false positives
|
| |
|
|
|
|
|
|
| |
- Feature: Added `--allow-port-probing` option to control whether zenserver searches for a free port on startup (default: true, automatically false when --dedicated is set)
- Feature: Added new hub options for controlling provisioned storage server instances:
- `--hub-instance-http` - HTTP server implementation for instances (asio/httpsys)
- `--hub-instance-http-threads` - Number of HTTP connection threads per instance
- `--hub-instance-corelimit` - Limit CPU concurrency per instance
- Improvement: Hub now manages a deterministic port pool for provisioned instances allowing reuse of unused ports
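One way to picture a deterministic pool with reuse is a sorted free set that always hands out the lowest available port. This is a sketch under that assumption; `PortPool` is a hypothetical name and the hub's actual allocation policy may differ.

```cpp
#include <cstdint>
#include <optional>
#include <set>

// Hypothetical port pool: always hands out the lowest free port, so allocation order is reproducible
class PortPool
{
public:
    PortPool(uint16_t FirstPort, uint16_t Count) : m_First(FirstPort), m_Count(Count)
    {
        for (uint16_t I = 0; I < Count; ++I)
        {
            m_Free.insert(uint16_t(FirstPort + I));
        }
    }

    // Returns the lowest free port, or nullopt when the pool is exhausted
    std::optional<uint16_t> Acquire()
    {
        if (m_Free.empty())
        {
            return std::nullopt;
        }
        const uint16_t Port = *m_Free.begin();
        m_Free.erase(m_Free.begin());
        return Port;
    }

    // Returns a port to the pool so that a later Acquire() can reuse it
    void Release(uint16_t Port)
    {
        if (Port >= m_First && uint32_t(Port) < uint32_t(m_First) + m_Count)
        {
            m_Free.insert(Port);
        }
    }

private:
    uint16_t           m_First;
    uint16_t           m_Count;
    std::set<uint16_t> m_Free;
};
```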
|
| |
|
|
|
|
|
| |
This PR makes it *possible* to do a Windows build on Linux via `clang-cl`.
It doesn't actually change any build process. No policy change, just mechanics and some code fixes to clear clang compilation.
The code fixes are mainly related to #include file name casing, to match the on-disk casing of the SDK files (via xwin).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
removal (#841)
- Percent-decode URIs in ASIO HTTP server to match http.sys CookedUrl behavior, ensuring consistent decoded paths across backends
- Add Environment field to CreateProcOptions for passing extra env vars to child processes (Windows: merged into Unicode environment block; Unix: setenv in fork)
- Add GetCompilerName() and include it in build options startup logging
- Suppress Windows CRT error dialogs in test harness for headless/CI runs
- Fix mimalloc package: pass CMAKE_BUILD_TYPE, skip cfuncs test for cross-compile
- Add virtual destructor to SentryAssertImpl to fix debug-mode warning
- Simplify object store path handling now that URIs arrive pre-decoded
- Add URI decoding test coverage for percent-encoded paths and query params
- Simplify httpasio request handling by using strands (guarantees no parallel handlers per connection)
- Removed deprecated regex-based route matching support
- Fix full GC never triggering after cross-toolchain builds: The `gc_state` file stores `system_clock` ticks, but the tick resolution differs between toolchains (nanoseconds on GCC/standard clang, microseconds on UE clang). A nanosecond timestamp misinterpreted as microseconds appears far in the future (~year 58,000), bypassing the staleness check and preventing time-based full GC from ever running. Fixed by also resetting when the stored timestamp is in the future.
- Clamp GC countdown display to configured interval: Prevents nonsensical log output (e.g. "Full GC in 492128002h") caused by the above or any other clock anomaly. The clamp applies to both the scheduler log and the status API.
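The two GC fixes above amount to a sanity check on the stored timestamp and a clamp on the displayed countdown. The following is a minimal sketch; `IsStoredGcTimeUsable` and `ClampCountdown` are hypothetical names and the real scheduler's staleness logic is not reproduced.

```cpp
#include <algorithm>
#include <chrono>

using GcClock = std::chrono::system_clock;

// A persisted "last full GC" time that lies in the future cannot be trusted
// (for example, ticks written at one resolution and read back at another), so the caller should reset it.
inline bool IsStoredGcTimeUsable(GcClock::time_point Stored, GcClock::time_point Now)
{
    return Stored <= Now;
}

// Keeps the displayed countdown within [0, Interval] regardless of clock anomalies
inline std::chrono::seconds ClampCountdown(std::chrono::seconds Remaining, std::chrono::seconds Interval)
{
    return std::max(std::chrono::seconds(0), std::min(Remaining, Interval));
}
```

With a 24-hour interval, a computed remainder of 492128002 hours would be reported as 24 hours, and a negative remainder as zero.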
|
| |
|
|
|
|
|
| |
- Fix potential crash on startup caused by logging macros being invoked before the logging system is initialized (null logger dereference in `ZenServerState::Sweep()`). `LoggerRef::ShouldLog` now guards against a null logger pointer.
- Make CPR an optional dependency (`--zencpr` build option, enabled by default) so builds can proceed without it
- Make zenvfs Windows-only (platform-specific target)
- Generate the frontend zip at build time from source HTML files instead of checking in a binary blob which would accumulate with every single update
|
| |
|
|
|
|
|
|
|
|
| |
- Add clang-cl warning suppressions in xmake.lua matching Linux/macOS set
- Guard /experimental:c11atomics with {tools="cl"} for MSVC-only
- Fix long long / int64_t redefinition in string.h for clang-cl
- Fix unclosed namespace in callstacktrace.cpp #else branch
- Fix missing override in httpplugin.cpp
- Reorder WorkerPool fields to match designated initializer order
- Use INVALID_SOCKET instead of SOCKET_ERROR for SOCKET comparisons
|
| |
|
|
|
|
|
|
| |
This PR adds end-to-end Unix domain socket (UDS) support, allowing zen CLI to discover and connect to UDS-only servers automatically.
- **`unix://` URI scheme in zen CLI**: The `-u` / `--hosturl` option now accepts `unix:///path/to/socket` to connect to a zenserver via a Unix domain socket instead of TCP.
- **Per-instance shared memory for extended server info**: Each zenserver instance now publishes a small shared memory section (keyed by SessionId) containing per-instance data that doesn't fit in the fixed-size ZenServerEntry -- starting with the UDS socket path. This is a 4KB pagefile-backed section on Windows (`Global\ZenInstance_{sessionid}`) and a POSIX shared memory object on Linux/Mac (`/UnrealEngineZen_{sessionid}`).
- **Client-side auto-discovery of UDS servers**: `zen info`, `zen status`, etc. now automatically discover and prefer UDS connections when a server publishes a socket path. Servers running with `--no-network` (UDS-only) are no longer invisible to the CLI.
- **`kNoNetwork` flag in ZenServerEntry**: Servers started with `--no-network` advertise this in their shared state entry. Clients skip TCP fallback for these servers, and display commands (`ps`, `status`, `top`) show `-` instead of a port number to indicate TCP is not available.
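As a rough illustration of the `unix://` form accepted by `-u` / `--hosturl`, the sketch below extracts the socket path from such a URL. `ParseUnixSocketUrl` is a hypothetical helper, not the CLI's parser, and it assumes an absolute path always follows the scheme.

```cpp
#include <optional>
#include <string>

// Returns the socket path for URLs of the form unix:///absolute/path, or nullopt otherwise
inline std::optional<std::string> ParseUnixSocketUrl(const std::string& Url)
{
    static const std::string kScheme = "unix://";
    if (Url.compare(0, kScheme.size(), kScheme) != 0)
    {
        return std::nullopt;  // some other scheme, e.g. http://
    }
    std::string Path = Url.substr(kScheme.size());
    if (Path.empty() || Path[0] != '/')
    {
        return std::nullopt;  // assumption: only absolute socket paths are accepted
    }
    return Path;
}
```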
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switches the default HTTP client to the libcurl-based backend and follows up with a series of correctness fixes and code quality improvements to `CurlHttpClient`.
**Backend switch & build fixes:**
- Switch default HTTP client to libcurl-based backend
- Suppress `[[nodiscard]]` warning when building fmt
- Miscellaneous bugfixes in HttpClient/libcurl
- Pass `-y` to `xmake config` in `xmake test` task
**Boilerplate reduction:**
- Add `Session::SetHeaders()` for RAII ownership of `curl_slist`, eliminating manual `curl_slist_free_all` calls from every verb method
- Add `Session::PerformWithResponseCallbacks()` to absorb the repeated 12-line write+header callback setup block
- Extract `ParseHeaderLine()` shared helper, replacing 4 duplicate header-parsing implementations
- Extract `BuildHeaderMap()` and `ApplyContentTypeFromHeaders()` helpers to deduplicate header-to-map conversion and Content-Type scanning
- Unify the two `DoWithRetry` overloads (PayloadFile variant now delegates to the Validate variant)
**Correctness fixes:**
- `TransactPackage`: both phases now use `PerformWithResponseCallbacks()`, fixing missing abort support and a dead header collection loop
- `TransactPackage`: error path now routes through `CommonResponse`, preserving curl error codes and messages for the caller
- `ValidatePayload`: merged 3 separate header-scan loops into a single pass
**Performance improvements:**
- Replace `fmt::format` with `ExtendableStringBuilder` in `BuildHeaderList` and `BuildUrlWithParameters`, eliminating heap allocations in the common case
- Replace `curl_easy_escape`/`curl_free` with inline URL percent-encoding using `AsciiSet`
- Remove wasteful `CommonResponse(...)` construction in retry logging, formatting directly from `CurlResult` fields
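Inline percent-encoding of the kind mentioned above can be sketched as follows. `AsciiSet` is not reproduced here; a plain predicate over the RFC 3986 unreserved characters stands in for it, and `PercentEncode` is a hypothetical name.

```cpp
#include <string>

// RFC 3986 "unreserved" characters, which are left as-is
inline bool IsUnreserved(unsigned char C)
{
    return (C >= 'A' && C <= 'Z') || (C >= 'a' && C <= 'z') || (C >= '0' && C <= '9') || C == '-' || C == '.' ||
           C == '_' || C == '~';
}

// Percent-encodes every byte outside the unreserved set, using uppercase hex digits
inline std::string PercentEncode(const std::string& In)
{
    static const char kHex[] = "0123456789ABCDEF";
    std::string       Out;
    Out.reserve(In.size());
    for (unsigned char C : In)
    {
        if (IsUnreserved(C))
        {
            Out.push_back(char(C));
        }
        else
        {
            Out.push_back('%');
            Out.push_back(kHex[C >> 4]);
            Out.push_back(kHex[C & 0x0F]);
        }
    }
    return Out;
}
```

Because the result is appended directly to a caller-owned buffer, no library-allocated string needs to be released afterwards.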
|
| |
|
|
| |
- Add `--no-network` CLI option which disables all TCP/HTTPS listeners, restricting zenserver to Unix domain socket communication only.
- Also fixes asio upgrade breakage on main
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Migrate away from deprecated APIs that have been removed:
- io_service -> io_context
- io_service::work -> executor_work_guard
- resolver::query/iterator -> resolver::resolve() with results_type
- address::from_string() -> make_address()
---
Breaking Changes (1.33.0)
- deferred as default completion token — can omit token in coroutines: co_await socket.async_read_some(buf)
- cancel_after / cancel_at — timeout any async operation: co_await sock.async_read_some(buf, cancel_after(5s))
- Partial completion token adapters — as_tuple, redirect_error, bind_executor etc. composable via pipe: co_await (async_write(sock, buf) | as_tuple | cancel_after(10s))
- composed — simpler alternative to async_compose for stateful operation implementations
- co_composed moved out of experimental
1.35.0 — Allocator & Resolver
- Allocator constructors for io_context and thread_pool — control memory allocation for services, I/O objects, strands
- Configurable resolver thread pool ("resolver"/"threads")
- Timer heap pre-allocation ("timer"/"heap_reserve")
1.37.0 — Inline Executors & Reactor Tuning
- inline_executor — always executes inline (useful as completion executor)
- inline_or_executor<> — tries inline first, falls back to wrapped executor
- New dispatch/post/defer overloads that run a function on one executor and deliver result to a handler on another
- redirect_disposition — captures disposition into a variable (like redirect_error but generic)
- Reactor config: reset_edge_on_partial_read, use_eventfd, use_timerfd
Notable Fixes
- Resource leak in awaitable move assignment (1.37.0)
- Memory leak in SSL stream move assignment (1.37.0)
- Thread sanitizer issue in kqueue reactor (1.37.0)
- co_spawn non-reentrant completion handler fix (1.36.0)
- Windows file append mode fix (1.32.0)
- SSL engine move assignment leak (1.33.0)
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a **transparent TCP proxy mode** to zenserver (activated via `zenserver proxy`), allowing it to sit between clients and upstream Zen servers to inspect and monitor HTTP/1.x traffic in real time. Primarily useful during development, to be able to observe multi-server/client interactions in one place.
- **Dedicated proxy port** -- Proxy mode defaults to port 8118 with its own data directory to avoid collisions with a normal zenserver instance.
- **TCP proxy core** (`src/zenserver/proxy/`) -- A new transparent TCP proxy that forwards connections to upstream targets, using multi-threaded I/O for connection handling. Unix domain sockets are supported alongside TCP/IP on both the downstream (listener) and upstream sides.
- **HTTP traffic inspection** -- Parses HTTP/1.x request/response streams inline to extract method, path, status, content length, and WebSocket upgrades without breaking the proxied data.
- **Proxy dashboard** -- A web UI showing live connection stats, per-target request counts, active connections, bytes transferred, and client IP/session ID rollups.
- **Server mode display** -- Dashboard banner now shows the running server mode (Zen Proxy, Zen Compute, etc.).
Supporting changes included in this branch:
- **Wildcard log level matching** -- Log levels can now be set per-category using wildcard patterns (e.g. `proxy.*=debug`).
- **`zen down --all`** -- New flag to shut down all running zenserver instances; also used by the new `xmake kill` task.
- Minor test stability fixes (flaky hash collisions, per-thread RNG seeds).
- Support ZEN_MALLOC environment variable for default allocator selection and switch default to rpmalloc
- Fixed sentry-native build to allow LTO on Windows
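Wildcard matching of log categories, as in `proxy.*=debug`, could be implemented along the lines of the sketch below. `WildcardMatch` is a hypothetical helper and assumes `*` is the only metacharacter, matching any run of characters including an empty one.

```cpp
#include <cstddef>
#include <string>

// Glob-style match where '*' matches any (possibly empty) run of characters
inline bool WildcardMatch(const std::string& Pattern, const std::string& Text)
{
    size_t P     = 0;
    size_t T     = 0;
    size_t StarP = std::string::npos;  // position of the most recent '*' in Pattern
    size_t StarT = 0;                  // text position that '*' is currently matched up to
    while (T < Text.size())
    {
        if (P < Pattern.size() && Pattern[P] == '*')
        {
            StarP = P++;
            StarT = T;
        }
        else if (P < Pattern.size() && Pattern[P] == Text[T])
        {
            ++P;
            ++T;
        }
        else if (StarP != std::string::npos)
        {
            // Mismatch: let the last '*' absorb one more character and retry
            P = StarP + 1;
            T = ++StarT;
        }
        else
        {
            return false;
        }
    }
    while (P < Pattern.size() && Pattern[P] == '*')
    {
        ++P;  // trailing stars match the empty remainder
    }
    return P == Pattern.size();
}
```

Under these assumptions `proxy.*` matches `proxy.http` but not `proxy` itself, since the dot is a literal character.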
|
| |
|
|
|
|
| |
* added streaming download of payloads in cpr client ::Post
* curlclient Post streaming download
* case sensitivity fixes for http headers
* move over missing functionality from cprclient to httpclient
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main goal of this change is to eliminate the cpr back-end altogether and replace it with the curl implementation. I would expect to drop cpr as soon as we feel happy with the libcurl back-end. That would leave us with a direct dependency on libcurl only, and cpr can be eliminated as a dependency.
### HttpClient Backend Overhaul
- Implemented a new **libcurl-based HttpClient** backend (`httpclientcurl.cpp`, ~2000 lines)
as an alternative to the cpr-based one
- Made HttpClient backend **configurable at runtime** via constructor arguments
and `-httpclient=...` CLI option (for zen, zenserver, and tests)
- Extended HttpClient test suite to cover multipart/content-range scenarios
### Unix Domain Socket Support
- Added Unix domain socket support to **httpasio** (server side)
- Added Unix domain socket support to **HttpClient**
- Added Unix domain socket support to **HttpWsClient** (WebSocket client)
- Templatized `HttpServerConnectionT<SocketType>` and `WsAsioConnectionT<SocketType>`
to handle TCP, Unix, and SSL sockets uniformly via `if constexpr` dispatch
### HTTPS Support
- Added **preliminary HTTPS support to httpasio** (for Mac/Linux via OpenSSL)
- Added **basic HTTPS support for http.sys** (Windows)
- Implemented HTTPS test for httpasio
- Split `InitializeServer` into smaller sub-functions for http.sys
### Other Notable Changes
- Improved **zenhttp-test stability** with dynamic port allocation
- Enhanced port retry logic in http.sys (handles ERROR_ACCESS_DENIED)
- Fatal signal/exception handlers for backtrace generation in tests
- Added `zen bench http` subcommand to exercise network + HTTP client/server communication stack
|
| |
|
|
|
|
|
|
|
|
| |
- **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS.
- **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking.
- **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration.
- **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP.
- **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch.
- **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support.
Also contains some fixes from TSAN analysis.
|
| |
|
|
|
|
|
|
|
|
|
| |
* clean up BuildStorageResolveResult to allow capabilities
* add check for multirange request capability
* add MaxRangeCountPerRequest capabilities
* project export tests
* add InMemoryBuildStorageCache
* progress and logging improvements
* fix ElapsedSeconds calculations in fileremoteprojectstore.cpp
* oplogs/builds test script
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removes the vendored spdlog library (~12,000 lines) and replaces it with a purpose-built logging system in zencore (~1,800 lines). The new implementation provides the same functionality with fewer abstractions, no shared_ptr overhead, and full control over the logging pipeline.
### What changed
**New logging core in zencore/logging/:**
- LogMessage, Formatter, Sink, Logger, Registry - core abstractions matching spdlog's model but simplified
- AnsiColorStdoutSink - ANSI color console output (replaces spdlog stdout_color_sink)
- MsvcSink - OutputDebugString on Windows (replaces spdlog msvc_sink)
- AsyncSink - async logging via BlockingQueue worker thread (replaces spdlog async_logger)
- NullSink, MessageOnlyFormatter - utility types
- Thread-safe timestamp caching in formatters using RwLock
**Moved to zenutil/logging/:**
- FullFormatter - full log formatting with timestamp, logger name, level, source location, multiline alignment
- JsonFormatter - structured JSON log output
- RotatingFileSink - rotating file sink with atomic size tracking
**API changes:**
- Log levels are now an enum (LogLevel) instead of int, eliminating the zen::logging::level namespace
- LoggerRef no longer wraps shared_ptr - it holds a raw pointer with the registry owning lifetime
- Logger error handler is wired through Registry and propagated to all loggers on registration
- Logger::Log() now populates ThreadId on every message
**Cleanup:**
- Deleted thirdparty/spdlog/ entirely (110+ files)
- Deleted full_test_formatter (was ~80% duplicate of FullFormatter)
- Renamed snake_case classes to PascalCase (full_formatter -> FullFormatter, json_formatter -> JsonFormatter, sentry_sink -> SentrySink)
- Removed spdlog from xmake dependency graph
### Build / test impact
- zencore no longer depends on spdlog
- zenutil and zenvfs xmake.lua updated to drop spdlog dep
- zentelemetry xmake.lua updated to drop spdlog dep
- All existing tests pass, no test changes required beyond formatter class renames
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Claude config updates
* Bug fixes and hardening across `zencore` and `zenhttp`, identified via static analysis.
### zencore
- **`ZEN_ASSERT` macro** -- extended to accept an optional string message literal; added `ZEN_ASSERT_MSG_` helper for message formatting. Callers needing runtime fmt-style formatting should use `ZEN_ASSERT_FORMAT`.
- **`MpscQueue`** -- fixed `TypeCompatibleStorage` to use a properly-sized `char Storage[sizeof(T)]` array instead of a single `char`; corrected `Data()` to cast `&Storage` rather than `this`; switched cache-line alignment to a fixed constant to avoid GCC's `-Winterference-size` warning. Enabled previously-disabled tests.
- **`StringBuilderImpl`** -- initialized `m_Base`/`m_CurPos`/`m_End` to `nullptr`. Fixed `StringCompare` return type (`bool` -> `int`). Fixed `ParseInt` to reject strings with trailing non-numeric characters. Removed deprecated `<codecvt>` include.
- **`NiceNumGeneral`** -- replaced `powl()` with integer `IntPow()` to avoid floating-point precision issues.
- **`RwLock::ExclusiveLockScope`** -- added move constructor/assignment; initialized `m_Lock` to `nullptr`.
- **`Latch::AddCount`** -- fixed variable type (`std::atomic_ptrdiff_t` -> `std::ptrdiff_t` for the return value of `fetch_add`).
- **`thread.cpp`** -- fixed Linux `pthread_setname_np` 16-byte name truncation; added null check before dereferencing in `Event::Close()`; fixed `NamedEvent::Close()` to call `close(Fd)` outside the lock region; added null guard in `NamedMutex` destructor; `Sleep()` now returns early for non-positive durations.
- **`MD5Stream`** -- was entirely stubbed out (no-op); now correctly calls `MD5Init`/`MD5Update`/`MD5Final`. Fixed `ToHexString` to use the correct string length. Fixed forward declarations. Fixed tests to compare `compare() == 0`.
- **`sentryintegration.cpp`** -- guard against null `filename`/`funcname` in spdlog message handler to prevent a crash in `fmt::format`.
- **`jobqueue.cpp`** -- fixed lost job ID when `IdGenerator` wraps around zero; fixed raw `Job*` in `RunningJobs` map (potential use-after-free) to `RefPtr<Job>`; fixed range-loop copies; fixed format string typo.
- **`trace.cpp`** -- suppress GCC false-positive warnings in third-party `trace.h` include.
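The `MpscQueue` storage fix can be illustrated with a small sketch of correctly sized and aligned raw storage. `SlotStorage` is a hypothetical stand-in for `TypeCompatibleStorage` and assumes C++17 for `std::launder`.

```cpp
#include <new>
#include <string>
#include <utility>

// Raw storage that can hold one T: sized by sizeof(T) and aligned for T.
// A single `char` member, as in the bug described above, would be too small for most types.
template<typename T>
struct SlotStorage
{
    alignas(T) char Storage[sizeof(T)];

    // Constructs a T in place inside the buffer
    template<typename... Args>
    T* Construct(Args&&... A)
    {
        return ::new (static_cast<void*>(&Storage)) T(std::forward<Args>(A)...);
    }

    // Points at the buffer itself (&Storage), not at `this`
    T* Data() { return std::launder(reinterpret_cast<T*>(&Storage)); }

    void Destroy() { Data()->~T(); }
};

static_assert(sizeof(SlotStorage<double>) == sizeof(double), "storage must be exactly one T wide");
static_assert(alignof(SlotStorage<double>) == alignof(double), "storage must be aligned for T");
```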
### zenhttp
- **WebSocket close race** (`wsasio`, `wshttpsys`, `httpwsclient`) -- `m_CloseSent` promoted from `bool` to `std::atomic<bool>`; close check changed to `exchange(true)` to eliminate the check-then-set data race.
- **`wsframecodec.cpp`** -- reject WebSocket frames with payload > 256 MB to prevent OOM from malformed/malicious frames.
- **`oidc.cpp`** -- URL-encode refresh token and client ID in token requests (`FormUrlEncode`); parse `end_session_endpoint` and `device_authorization_endpoint` from OIDC discovery document.
- **`httpclientcommon.cpp`** -- propagate error code from `AppendData` when flushing the cache buffer.
- **`httpclient.h`** -- initialize all uninitialized members (`ErrorCode`, `UploadedBytes`, `DownloadedBytes`, `ElapsedSeconds`, `MultipartBoundary` fields).
- **`httpserver.h`** -- fix `operator=` return type for `HttpRpcHandler` (missing `&`).
- **`packageformat.h`** -- fix `~0u` (32-bit truncation) to `~uint64_t(0)` for a `uint64_t` field.
- **`httpparser`** -- initialize `m_RequestVerb` in both declaration and `ResetState()`.
- **`httpplugin.cpp`** -- initialize `m_BasePort`; fix format string missing quotes around connection name.
- **`httptracer.h`** -- move `#pragma once` before includes.
- **`websocket.h`** -- initialize `WebSocketMessage::Opcode`.
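The check-then-set race fixed in the WebSocket close path comes down to making "has a close already been sent?" and "mark it sent" a single atomic step. A minimal sketch, with `CloseGate` as a hypothetical wrapper around the flag:

```cpp
#include <atomic>

// Hypothetical wrapper showing the exchange(true) pattern for a send-once close frame
class CloseGate
{
public:
    // Returns true for exactly one caller, even when called concurrently.
    // A plain `if (!m_CloseSent) { m_CloseSent = true; ... }` lets two threads both pass the check.
    bool TryBeginClose() { return !m_CloseSent.exchange(true); }

    bool IsCloseSent() const { return m_CloseSent.load(); }

private:
    std::atomic<bool> m_CloseSent{false};
};
```

Only the caller that observes `true` from `TryBeginClose()` would go on to write the close frame.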
### zenserver
- **`hubservice.cpp`** -- fix two `ZEN_ASSERT` calls that incorrectly used fmt-style format args; converted to `ZEN_ASSERT_FORMAT`.
|
| |
|
|
|
|
|
|
|
|
| |
- Added local process runners for Linux/Wine and Mac, with some sandboxing support
- Horde & Nomad provisioning for development and testing
- Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API
- Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates
- Some security hardening
- Improved scalability and `zen exec` command
Still experimental - compute support is disabled by default
|
| |
|
|
|
| |
- Add GetTotalBytesReceived/GetTotalBytesSent to HttpServer with implementations in ASIO and http.sys backends
- Add ExpectedErrorCodes to HttpClientSettings to suppress warn/info logs for anticipated HTTP error codes
- Also fixes minor issues in `CprHttpClient::Download`
|
| |
|
|
|
| |
Various fixes to make cpp files build in unity build mode
As an aside, unity builds do not currently seem to work on Linux: they lead to link-time issues, for reasons not yet understood.
|
| |
|
|
|
|
|
| |
- Improvement: `zen builds download` now uses multi-range requests for blocks to reduce download size
- Improvement: `zen oplog-import` now uses partial block requests with multi-range requests to reduce download size
- Improvement: Improved feedback in log/console during `zen oplog-import`
- Improvement: `--allow-partial-block-requests` now defaults to `true` for `zen builds download` and `zen oplog-import` (was `mixed`)
- Improvement: Improved range merging analysis when downloading partial blocks
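The actual range merging analysis is not described here. As a purely illustrative sketch of the general idea, the code below coalesces byte ranges whose gap is at most a chosen threshold, trading a few wasted bytes for fewer ranges per request. `ByteRange`, `MergeRanges`, and `MaxGap` are hypothetical names.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct ByteRange
{
    uint64_t Offset;
    uint64_t Length;
};

// Sorts ranges by offset and merges neighbours that overlap or lie within MaxGap bytes of each other
inline std::vector<ByteRange> MergeRanges(std::vector<ByteRange> Ranges, uint64_t MaxGap)
{
    std::sort(Ranges.begin(), Ranges.end(), [](const ByteRange& A, const ByteRange& B) {
        return A.Offset < B.Offset;
    });
    std::vector<ByteRange> Merged;
    for (const ByteRange& Range : Ranges)
    {
        if (Range.Length == 0)
        {
            continue;  // nothing to download
        }
        if (!Merged.empty())
        {
            ByteRange&     Last    = Merged.back();
            const uint64_t LastEnd = Last.Offset + Last.Length;
            if (Range.Offset <= LastEnd + MaxGap)
            {
                // Close enough: extend the previous range to cover this one
                const uint64_t End = std::max(LastEnd, Range.Offset + Range.Length);
                Last.Length        = End - Last.Offset;
                continue;
            }
        }
        Merged.push_back(Range);
    }
    return Merged;
}
```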
|