| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
- Improvement: Hub pools HTTP connections to managed instances so provision/deprovision churn no longer exhausts Windows ephemeral ports
|
| |
|
|
|
|
|
|
|
| |
- Improvement: Hub shares a single S3 client and IMDS credential provider across all modules, reducing IMDS load and surviving transient IMDS blips during bulk provisioning
- Improvement: Hub validates hydration config at startup; bad `--hub-hydration-target-spec` or `--hub-hydration-target-config` now fails `zen hub` at boot instead of per-module at first hydrate
- Improvement: S3 hydration multipart chunk size configurable via `settings.chunk-size` (default 32 MiB)
- Improvement: S3 client extracts `<Error><Code>` and `<Message>` from XML error bodies (previously logged as `<unhandled content format>`)
- Improvement: S3 client fails fast with a "no credentials available" error when AWS credentials are missing, instead of sending an unsigned request that S3 rejects with a generic 400
- Improvement: IMDS credential provider retries transient connection failures (up to 3 attempts with backoff)
- Improvement: HTTP clients with `RetryCount > 0` also retry on `CURLE_COULDNT_CONNECT`
|
| |
|
|
|
|
| |
- Adds `FollowRedirects` (default `false`) and `MaxRedirects` (default `5`) fields to `HttpClientSettings`.
- When `FollowRedirects` is enabled, the curl backend sets `CURLOPT_FOLLOWLOCATION` and `CURLOPT_MAXREDIRS` so HTTP 3xx redirects are handled transparently in the transport layer — callers no longer need to parse `Location` headers and re-issue requests themselves.
- Defaults are off, so existing callers see no behavior change.
|
| |
|
|
| |
- Improvement: New `ZEN_SCOPED_LOG(Expr)` macro routes `ZEN_INFO`/`ZEN_WARN`/`ZEN_DEBUG` in the enclosing block through the given logger expression instead of the default
- Improvement: `BuildContainer`, `SaveOplog`, and `LoadOplogContext` now take a caller-provided `LoggerRef` so diagnostic messages route through the caller's logger
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rework of the Horde agent subsystem from synchronous per-thread I/O to an async ASIO-driven architecture, plus provisioner scale-down with graceful draining, OIDC authentication, scheduler improvements, and dashboard UI for provisioner control.
### Async Horde Agent Rewrite
- Replace synchronous `HordeAgent` (one thread per agent, blocking I/O) with `AsyncHordeAgent` — an ASIO state machine running on a shared `io_context` thread pool
- Replace `TcpComputeTransport`/`AesComputeTransport` with `AsyncTcpComputeTransport`/`AsyncAesComputeTransport`
- Replace `AgentMessageChannel` with `AsyncAgentMessageChannel` using frame queuing and ASIO timers
- Delete `ComputeBuffer` and `ComputeChannel` ring-buffer classes (no longer needed)
### Provisioner Drain / Scale-Down
- `HordeProvisioner` can now drain agents when target core count is lowered: queries each agent's `/compute/session/status` for workload, selects candidates by largest-fit/lowest-workload, and sends `/compute/session/drain`
- Configurable `--horde-drain-grace-period` (default 300s) before force-kill
- Implement `IProvisionerStateProvider` interface to expose provisioner state to the orchestrator HTTP layer
- Forward `--coordinator-session`, `--provision-clean`, and `--provision-tracehost` through both Horde and Nomad provisioners to spawned workers
### OIDC Authentication
- `HordeClient` accepts an `AccessTokenProvider` (refreshable token function) as alternative to static `--horde-token`
- Wire up `OidcToken.exe` auto-discovery via `httpclientauth::CreateFromOidcTokenExecutable` with `--HordeUrl` mode
- New `--horde-oidctoken-exe-path` CLI option for explicit path override
### Orchestrator & Scheduler
- Orchestrator generates a session ID at startup; workers include `coordinator_session` in announcements so the orchestrator can reject stale-session workers
- New `Rejected` action state — when a remote runner declines at capacity, the action is rescheduled without retry count increment
- Reduce scheduler lock contention: snapshot pending actions under shared lock, sort/trim outside the lock
- Parallelize remote action submission across runners via `WorkerThreadPool` with slow-submit warnings
- New action field `FailureReason` populated by all runner types (exit codes, sandbox failures, exceptions)
- New endpoints: `session/drain`, `session/status`, `session/sunset`, `provisioner/status`, `provisioner/target`
### Remote Execution
- Eager-attach mode for `RemoteHttpRunner` — bundles all attachments upfront in a `CbPackage` for single-roundtrip submits
- Track in-flight submissions to prevent over-queuing
- Show remote runner hostname in `GetDisplayName()`
- `--announce-url` to override the endpoint announced to the coordinator (e.g. relay-visible address)
### Frontend Dashboard
- Delete standalone `compute.html` (925 lines) and `orchestrator.html` (669 lines), consolidated into JS page modules
- Add provisioner panel to orchestrator dashboard: target/active/estimated core counts, draining agent count
- Editable target-cores input with debounced POST to `/orch/provisioner/target`
- Per-agent provisioning status badges (active / draining / deallocated) in the agents table
- Active vs total CPU counts in agents summary row
### CLI
- New `zen compute record-start` / `record-stop` subcommands
- `zen exec` progress bar with submit and completion phases, atomic work counters, `--progress` mode (Pretty/Plain/Quiet)
### Other
- `DataDir` supports environment variable expansion
- Worker manifest validation checks for `worker.zcb` marker to detect incomplete cached directories
- Linux/Mac runners `nice(5)` child processes to avoid starving the main server
- `ComputeService::SetShutdownCallback` wired to `RequestExit` via `session/sunset`
- Curl HTTP client logs effective URL on failure
- `MachineInfo` carries `Pool` and `Mode` from Horde response
- Horde bundle creation includes `.pdb` on Windows
|
| |
|
| |
* log curl raw error on retry, add retry on CURLE_PARTIAL_FILE error
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Adds `AsyncHttpClient` — an asynchronous HTTP client using `curl_multi_socket_action` integrated with ASIO for event-driven I/O. Supports GET, POST, PUT, DELETE, HEAD with both callback-based and `std::future`-based APIs.
- Extracts shared curl helpers (callbacks, URL encoding, header construction, error mapping) into `httpclientcurlhelpers.h`, eliminating duplication between the sync and async implementations.
## Design
- All curl_multi state is serialized on an `asio::strand`, safe with multi-threaded io_contexts.
- Two construction modes: owned io_context (creates internal thread) or external io_context (caller runs the loop).
- Socket readiness is detected via `asio::ip::tcp::socket::async_wait` driven by curl's `CURLMOPT_SOCKETFUNCTION`/`CURLMOPT_TIMERFUNCTION` — no polling, sub-millisecond latency.
- Completion callbacks are dispatched off the strand onto the io_context so slow callbacks don't starve the curl event loop. Exceptions in callbacks are caught and logged.
## Files
| File | Change |
|------|--------|
| `zenhttp/include/zenhttp/asynchttpclient.h` | New public header |
| `zenhttp/clients/asynchttpclient.cpp` | Implementation (~1000 lines) |
| `zenhttp/clients/httpclientcurlhelpers.h` | Shared curl helpers extracted from sync client |
| `zenhttp/clients/httpclientcurl.cpp` | Removed duplicated helpers, uses shared header |
| `zenhttp/asynchttpclient_test.cpp` | 8 test cases: verbs, payloads, callbacks, concurrency, external io_context, connection errors |
| `zenhttp/zenhttp.cpp` | Forcelink registration for new tests |
|
| |
|
|
|
|
| |
- Improvement: Provisioning a hibernated instance now automatically wakes it instead of requiring an explicit wake call first
- Improvement: Deprovisioning now accepts instances in Crashed or Hibernated states, not just Provisioned
- Improvement: Added `--consul-health-interval-seconds` and `--consul-deregister-after-seconds` options to control Consul health check behavior (defaults: 10s and 30s)
- Improvement: Consul registration now occurs when provisioning starts; health check intervals are applied once provisioning completes
|
| |
|
| |
Authentication callbacks are not thread safe, ensured call sites does single threaded calls
|
| |
|
|
|
|
|
|
| |
This PR adds end-to-end Unix domain socket (UDS) support, allowing zen CLI to discover and connect to UDS-only servers automatically.
- **`unix://` URI scheme in zen CLI**: The `-u` / `--hosturl` option now accepts `unix:///path/to/socket` to connect to a zenserver via a Unix domain socket instead of TCP.
- **Per-instance shared memory for extended server info**: Each zenserver instance now publishes a small shared memory section (keyed by SessionId) containing per-instance data that doesn't fit in the fixed-size ZenServerEntry -- starting with the UDS socket path. This is a 4KB pagefile-backed section on Windows (`Global\ZenInstance_{sessionid}`) and a POSIX shared memory object on Linux/Mac (`/UnrealEngineZen_{sessionid}`).
- **Client-side auto-discovery of UDS servers**: `zen info`, `zen status`, etc. now automatically discover and prefer UDS connections when a server publishes a socket path. Servers running with `--no-network` (UDS-only) are no longer invisible to the CLI.
- **`kNoNetwork` flag in ZenServerEntry**: Servers started with `--no-network` advertise this in their shared state entry. Clients skip TCP fallback for these servers, and display commands (`ps`, `status`, `top`) show `-` instead of a port number to indicate TCP is not available.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switches the default HTTP client to the libcurl-based backend and follows up with a series of correctness fixes and code quality improvements to `CurlHttpClient`.
**Backend switch & build fixes:**
- Switch default HTTP client to libcurl-based backend
- Suppress `[[nodiscard]]` warning when building fmt
- Miscellaneous bugfixes in HttpClient/libcurl
- Pass `-y` to `xmake config` in `xmake test` task
**Boilerplate reduction:**
- Add `Session::SetHeaders()` for RAII ownership of `curl_slist`, eliminating manual `curl_slist_free_all` calls from every verb method
- Add `Session::PerformWithResponseCallbacks()` to absorb the repeated 12-line write+header callback setup block
- Extract `ParseHeaderLine()` shared helper, replacing 4 duplicate header-parsing implementations
- Extract `BuildHeaderMap()` and `ApplyContentTypeFromHeaders()` helpers to deduplicate header-to-map conversion and Content-Type scanning
- Unify the two `DoWithRetry` overloads (PayloadFile variant now delegates to the Validate variant)
**Correctness fixes:**
- `TransactPackage`: both phases now use `PerformWithResponseCallbacks()`, fixing missing abort support and a dead header collection loop
- `TransactPackage`: error path now routes through `CommonResponse`, preserving curl error codes and messages for the caller
- `ValidatePayload`: merged 3 separate header-scan loops into a single pass
**Performance improvements:**
- Replace `fmt::format` with `ExtendableStringBuilder` in `BuildHeaderList` and `BuildUrlWithParameters`, eliminating heap allocations in the common case
- Replace `curl_easy_escape`/`curl_free` with inline URL percent-encoding using `AsciiSet`
- Remove wasteful `CommonResponse(...)` construction in retry logging, formatting directly from `CurlResult` fields
|
| |
|
|
|
|
| |
* added streaming download of payloads in cpr client ::Post
* curlclient Post streaming download
* case sensitivity fixes for http headers
* move over missing functionality from crpclient to httpclient
|
|
|
The main goal of this change is to eliminate the cpr back-end altogether and replace it with the curl implementation. I would expect to drop cpr as soon as we feel happy with the libcurl back-end. That would leave us with a direct dependency on libcurl only, and cpr can be eliminated as a dependency.
### HttpClient Backend Overhaul
- Implemented a new **libcurl-based HttpClient** backend (`httpclientcurl.cpp`, ~2000 lines)
as an alternative to the cpr-based one
- Made HttpClient backend **configurable at runtime** via constructor arguments
and `-httpclient=...` CLI option (for zen, zenserver, and tests)
- Extended HttpClient test suite to cover multipart/content-range scenarios
### Unix Domain Socket Support
- Added Unix domain socket support to **httpasio** (server side)
- Added Unix domain socket support to **HttpClient**
- Added Unix domain socket support to **HttpWsClient** (WebSocket client)
- Templatized `HttpServerConnectionT<SocketType>` and `WsAsioConnectionT<SocketType>`
to handle TCP, Unix, and SSL sockets uniformly via `if constexpr` dispatch
### HTTPS Support
- Added **preliminary HTTPS support to httpasio** (for Mac/Linux via OpenSSL)
- Added **basic HTTPS support for http.sys** (Windows)
- Implemented HTTPS test for httpasio
- Split `InitializeServer` into smaller sub-functions for http.sys
### Other Notable Changes
- Improved **zenhttp-test stability** with dynamic port allocation
- Enhanced port retry logic in http.sys (handles ERROR_ACCESS_DENIED)
- Fatal signal/exception handlers for backtrace generation in tests
- Added `zen bench http` subcommand to exercise network + HTTP client/server communication stack
|