| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
| |
- Improvement: Hub provision, deprovision, hibernate, and wake operations are now async. HTTP requests returns 202 Accepted while the operation completes in the background
- Improvement: Hub returns 202 Accepted (instead of 409 Conflict) when the same async operation is already in progress for a module
- Improvement: Hub returns 200 OK when a requested state transition is already satisfied
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a `SubprocessManager` for managing child processes with ASIO-integrated async exit detection, stdout/stderr pipe capture, and periodic metrics sampling. Also introduces `ProcessGroup` for OS-backed process grouping (Windows JobObjects / POSIX process groups).
### SubprocessManager
- Async process exit detection using platform-native mechanisms (Windows `object_handle`, Linux `pidfd_open`, macOS `kqueue EVFILT_PROC`) — no polling
- Stdout/stderr capture via async pipe readers with per-process or default callbacks
- Periodic round-robin metrics sampling (CPU, memory) across managed processes
- Spawn, adopt, remove, kill, and enumerate managed processes
### ProcessGroup
- OS-level process grouping: Windows JobObject (kill-on-close guarantee), POSIX `setpgid` (bulk signal delivery)
- Atomic group kill via `TerminateJobObject` (Windows) or `kill(-pgid, sig)` (POSIX)
- Per-group aggregate metrics and enumeration
### ProcessHandle improvements
- Added explicit constructors from `int` (pid) and `void*` (native handle)
- Added move constructor and move assignment operator
### ProcessMetricsTracker
- Cross-platform process metrics (CPU time, working set, page faults) via `QueryProcessMetrics()`
- ASIO timer-driven periodic sampling with configurable interval and batch size
- Aggregate metrics across tracked processes
### Other changes
- Fixed `zentest-appstub` writing a spurious `Versions` file to cwd on every invocation
|
| |
|
|
|
|
|
| |
- **Cross-platform `GetProcessMetrics`**: Implement Linux (`/proc/{pid}/stat`, `/proc/{pid}/statm`, `/proc/{pid}/status`) and macOS (`proc_pidinfo(PROC_PIDTASKINFO)`) support for CPU times and memory metrics. Fix Windows to populate the `MemoryBytes` field (was always 0). All platforms now set `MemoryBytes = WorkingSetSize`.
- **`ProcessMetricsTracker`**: Experimental utility class (`zenutil`) that periodically samples resource usage for a set of tracked child processes. Supports both a dedicated background thread and an ASIO steady_timer mode. Computes delta-based CPU usage percentage across samples, with batched sampling (8 processes per tick) to limit per-cycle overhead.
- **`ProcessHandle` documentation**: Add Doxygen comments to all public methods describing platform-specific behavior.
- **Cleanup**: Remove unused `ZEN_RUN_TESTS` macro (inlined at its single call site in `zenserver/main.cpp`), remove dead `#if 0` thread-shutdown workaround block.
- **Minor fixes**: Use `HttpClientAccessToken` constructor in hordeclient instead of setting private members directly. Log ASIO version at startup and include it in the server settings list.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
## Summary
This PR adds a session management service, several new dashboard pages, and a number of infrastructure improvements.
### Sessions Service
- `SessionsServiceClient` in `zenutil` announces sessions to a remote zenserver with a 15s heartbeat (POST/PUT/DELETE lifecycle)
- Storage server registers itself with its own local sessions service on startup
- Session mode attribute coupled to server mode (Compute, Proxy, Hub, etc.)
- Ended sessions tracked with `ended_at` timestamp; status filtering (Active/Ended/All)
- `--sessions-url` config option for remote session announcement
- In-process log sink (`InProcSessionLogSink`) forwards server log output to the server's own session, visible in the dashboard
### Session Log Viewer
- POST/GET endpoints for session logs (`/sessions/{id}/log`) supporting raw text and structured JSON/CbObject with batch `entries` array
- In-memory log storage per session (capped at 10k entries) with cursor-based pagination for efficient incremental fetching
- Log panel in the sessions dashboard with incremental DOM updates, auto-scroll (Follow toggle), newest-first toggle, text filter, and log-level coloring
- Auto-selects the server's own session on page load
### TCP Log Streaming
- `LogStreamListener` and `TcpLogStreamSink` for log delivery over TCP
- Sequence numbers on each message with drop detection and synthetic "dropped" notice on gaps
- Gathered buffer writes to reduce syscall overhead when flushing batches
- Tests covering basic delivery, multi-line splitting, drop detection, and sequencing
### New Dashboard Pages
- **Sessions**: master-detail layout with selectable rows, metadata panel, live WebSocket updates, paging, abbreviated date formatting, and "this" pill for the local session
- **Object Store**: summary stats tiles and bucket table with click-to-expand inline object listing (`GET /obj/`)
- **Storage**: per-volume disk usage breakdown (`GET /admin/storage`), Garbage Collection status section (next-run countdown, last-run stats), and GC History table with paginated rows and expandable detail panels
- **Network**: overview tiles, per-service request table, proxy connections, and live WebSocket updates; distinct client IPs and session counts via HyperLogLog
### Documentation Page
- In-dashboard Docs page with sidebar navigation, markdown rendering (via `marked`), Mermaid diagram support (theme-aware), collapsible sections, text filtering with highlighting, and cross-document linking
- New user-facing docs: `overview.md` (with architecture and per-mode diagrams), `sessions.md`, `cache.md`, `projects.md`; updated `compute.md`
- Dev docs moved to `docs/dev/`
### Infrastructure & Bug Fixes
- **Deflate compression** for the embedded frontend zip (~3.4MB → ~950KB); zlib inflate support added to `ZipFs` with cached decompressed buffers
- **Local IP addresses**: `GetLocalIpAddresses()` (Windows via `GetAdaptersAddresses`, Linux/Mac via `getifaddrs`); surfaced in `/status/status`, `/health/info`, and the dashboard banner
- **Dashboard nav**: unified into `zen-nav` web component with `MutationObserver` for dynamically added links, CSS `::part()` to merge banner/nav border radii, and prefix-based active link detection
- Stats broadcast refactored from manual JSON string concatenation to `CbObjectWriter`; `CbObject`-to-JS conversion improved for `TimeSpan`, `DateTime`, and large integers
- Stats WebSocket boilerplate consolidated into `ZenPage.connect_stats_ws()`
|
| |
|
|
|
|
|
|
|
|
|
| |
- **`Logger` now holds a single `SinkPtr`** instead of a `std::vector<SinkPtr>`. The `SetSinks`/`AddSink` API is replaced with a single `SetSink`. This removes complexity from `Logger` itself and makes `Clone()` cheaper (no vector copy).
- **New `BroadcastSink`** (`zencore/logging/broadcastsink.h`) acts as a thread-safe, shared indirection point that fans out to a dynamic list of child sinks. Adding or removing a child sink via `AddSink`/`RemoveSink` is immediately visible to every `Logger` that holds a reference to it — including cloned loggers — without requiring each logger to be updated individually.
- **`GetDefaultBroadcastSink()`** (exposed from `zenutil/logging.h`) gives server-layer code access to the shared broadcast sink so it can register optional sinks (OTel, TCP log stream) after logging is initialized, without going through `Default()->AddSink()`.
### Motivation
Previously, dynamically adding sinks post-initialization mutated the default logger's internal sink vector directly. This was fragile: cloned loggers (created before `AddSink` was called) would not pick up the new sinks. `BroadcastSink` fixes this by making the sink list a shared, mutable object that all loggers sharing the same broadcast instance observe uniformly.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Upgrade mimalloc from v2.1.2 to v2.2.7. Note that mimalloc is no longer the default allocator so this only impacts users who somehow opt into mimalloc via `--malloc=mimalloc` or compile with different defaults
- Add all available mimalloc versions (1.6.7–3.2.8) to the package definition for testing
- Log the active memory allocator (with version where available) at server startup
- Annotate vendored rpmalloc with its source commit and version
## Notable changes in mimalloc 2.1.2 → 2.2.7
- **Memory release fix** (2.2.4): fix case where OS memory was not always fully released
- **Race condition fix** (2.2.6): fixed rare race condition and potential buffer overflow in debug statistics
- **Windows arm64 support** (2.1.9)
- **Guarded build** (2.1.9): new build mode that places OS guard pages behind objects to catch buffer overflows
- **THP awareness** (2.2.6): auto-detects transparent huge pages and adjusts purge size to avoid fragmentation
- **Faster TLS access on Windows** (2.2.6)
- **Improved calloc and aligned allocation performance** (2.2.6)
- **New diagnostic APIs** (2.2.2): `mi_options_print`, `mi_arenas_print`, `mi_stat_get` / `mi_stat_get_json`
- **macOS**: use `MADV_FREE_REUSABLE` for better memory behavior (2.2.4)
- **Build fixes**: Android, Xbox, musl, mingw, arm32, Debian 32-bit, non-BMI1 x64 systems
## Allocator logging
Added `FMalloc::GetName()` pure virtual so the server logs which allocator is active at startup:
```
zenserver - memory allocator: mimalloc 2.2.7
```
Allocator names include version where available:
- `mimalloc 2.2.7` (runtime version via `mi_version()`)
- `rpmalloc 1.5.0-dev.20250810` (ad-hoc version from vendored develop branch commit)
- `ansi`, `stomp` (no version info available)
## Test plan
- [x] Builds successfully on Windows (release)
- [x] Verify server startup log shows allocator name
- [x] Test with `--malloc=mimalloc` (default) and `--malloc=rpmalloc`
- [x] Run test suites to check for regressions
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- **RAII pipe handles for child process stdout/stderr capture**: `StdoutPipeHandles` is now a proper RAII type with automatic cleanup, move semantics, and partial close support. This makes it safe to use pipes for capturing child process output without risking handle/fd leaks.
- **Optional separate stderr pipe**: `CreateProcOptions` now accepts a `StderrPipe` field so callers can capture stdout and stderr independently. When null (default), stderr shares the stdout pipe as before.
- **LogStreamListener with pluggable handler**: The TCP log stream listener accepts connections from remote processes and delivers parsed log lines through a `LogStreamHandler` interface, set dynamically via `SetHandler()`. This allows any client to receive log messages without depending on a specific console implementation.
- **TcpLogStreamSink for zen::logging**: A logging sink that forwards log messages to a `LogStreamListener` over TCP, using the native `zen::logging::Sink` infrastructure with proper thread-safe synchronization.
- **Reliable child process exit codes on Linux**: `waitpid` result handling is fixed so `ProcessHandle::GetExitCode()` returns the real exit code. `ProcessHandle::Reset()` reaps zombies directly, replacing the global `IgnoreChildSignals()` which prevented exit code collection entirely. Also fixes a TOCTOU race in `ProcessHandle::Wait()` on Linux/Mac.
- **Pipe capture test suite**: Tests covering stdout/stderr capture via pipes (both shared and separate modes), RAII cleanup, move semantics, and exit code propagation using `zentest-appstub` as the child process.
- **Service command integration tests**: Shell-based integration tests for `zen service` covering the full lifecycle (install, status, start, stop, uninstall) on all three platforms — Linux (systemd), macOS (launchd), and Windows (SCM via PowerShell).
- **Test script reorganization**: Platform-specific test scripts moved from `scripts/test_scripts/` into `scripts/test_linux/`, `test_mac/`, and `test_windows/`.
|
| |
|
|
|
|
|
| |
- Improvement: Hub module listing now includes per-instance process metrics (memory, CPU time, working set, pagefile usage)
- Improvement: Hub now monitors provisioned instance health in the background and refreshes process metrics periodically
- Improvement: Hub no longer exposes raw `StorageServerInstance` pointers to callers; instance state is returned as value snapshots (`Hub::InstanceInfo`)
- Improvement: Hub instance access is now guarded by RAII per-instance locks (`SharedLockedPtr`/`ExclusiveLockedPtr`), preventing concurrent modifications during provisioning and deprovisioning
- Improvement: Hub instance lifecycle is now tracked as a `HubInstanceState` enum covering transitional states (Provisioning, Deprovisioning, Hibernating, Waking); exposed as a string in the HTTP API and dashboard
|
| |
|
|
|
|
|
|
|
|
| |
- Install a crash handler at the very top of main() in both zenserver and zen
- On Windows, uses SetUnhandledExceptionFilter with StackWalk64 for accurate
crash-site backtraces with DbgHelp symbol resolution
- On Linux/Mac, uses sigaction with async-signal-safe backtrace output
- Automatically superseded when Sentry/crashpad installs its own handlers
- Stays active for the full process lifetime if Sentry is disabled or absent
- Include .sym debug symbol files in Linux release bundle
|
| |
|
| |
Improved workaround for troubles with code potentially logging before logging is initialized. Any logging will be routed to a default console logger until logging is initialized fully
|
| |
|
|
|
|
|
| |
Add natvis for Compact Binary
Includes natvis for DateTime, TimeSpan, IoHash, Guid, Oid.
Based on UE CL 51830581.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
### Compute Batch Submission
- Consolidate duplicated action submission logic in `httpcomputeservice` into a single `HandleSubmitAction` supporting both single-action and batch (actions array) payloads
- Group actions by queue in `RemoteHttpRunner` and submit as batches with configurable chunk size, falling back to individual submission on failure
- Extract shared helpers: `MakeErrorResult`, `ValidateQueueForEnqueue`, `ActivateActionInQueue`, `RemoveActionFromActiveMaps`
### Retracted Action State
- Add `Retracted` state to `RunnerAction` for retry-free rescheduling — an explicit request to pull an action back and reschedule it on a different runner without incrementing `RetryCount`
- Implement idempotent `RetractAction()` on `RunnerAction` and `ComputeServiceSession`
- Add `POST jobs/{lsn}/retract` and `queues/{queueref}/jobs/{lsn}/retract` HTTP endpoints
- Add state machine documentation and per-state comments to `RunnerAction`
### Compute Race Fixes
- Fix race in `HandleActionUpdates` where actions enqueued between session abandon and scheduler tick were never abandoned, causing `GetActionResult` to return 202 indefinitely
- Fix queue `ActiveCount` race where `NotifyQueueActionComplete` was called after releasing `m_ResultsLock`, allowing callers to observe stale counters immediately after `GetActionResult` returned OK
### Logging Optimization and ANSI improvements
- Improve `AnsiColorStdoutSink` write efficiency — single write call, dirty-flag flush, `RwLock` instead of `std::mutex`
- Move ANSI color emission from sink into formatters via `Formatter::SetColorEnabled()`; remove `ColorRangeStart`/`End` from `LogMessage`
- Extract color helpers (`AnsiColorForLevel`, `StripAnsiSgrSequences`) into `helpers.h`
- Strip upstream ANSI SGR escapes in non-color output mode. This enables colour in log messages without polluting log files with ANSI control sequences
- Move `RotatingFileSink`, `JsonFormatter`, and `FullFormatter` from header-only to pimpl with `.cpp` files
### CLI / Exec Refactoring
- Extract `ExecSessionRunner` class from ~920-line `ExecUsingSession` into focused methods and a `ExecSessionConfig` struct
- Replace monolithic `ExecCommand` with subcommand-based architecture (`http`, `inproc`, `beacon`, `dump`, `buildlog`)
- Allow parent options to appear after subcommand name by parsing subcommand args permissively and forwarding unmatched tokens to the parent parser
### Testing Improvements
- Fix `--test-suite` filter being ignored due to accumulation with default wildcard filter
- Add test suite banners to test listener output
- Made `function.session.abandon_pending` test more robust
### Startup / Reliability Fixes
- Fix silent exit when a second zenserver instance detects a port conflict — use `ZEN_CONSOLE_*` for log calls that precede `InitializeLogging()`
- Fix two potential SIGSEGV paths during early startup: guard `sentry_options_new()` returning nullptr, and throw on `ZenServerState::Register()` returning nullptr instead of dereferencing
- Fail on unrecognized zenserver `--mode` instead of silently defaulting to store
### Other
- Show host details (hostname, platform, CPU count, memory) when discovering new compute workers
- Move frontend `html.zip` from source tree into build directory
- Add format specifications for Compact Binary and Compressed Buffer wire formats
- Add `WriteCompactBinaryObject` to zencore
- Extended `ConsoleTui` with additional functionality
- Add `--vscode` option to `xmake sln` for clangd / `compile_commands.json` support
- Disable compute/horde/nomad in release builds (not yet production-ready)
- Disable unintended `ASIO_HAS_IO_URING` enablement
- Fix crashpad patch missing leading whitespace
- Clean up code triggering gcc false positives
|
| |
|
|
|
|
| |
- Improvement: Add easy access options for sanitizers with `xmake config` and `xmake test` as options
- `--msan=[y|n]` Enable MemorySanitizer (Linux only, requires all deps instrumented)
- `--asan=[y|n]` Enable AddressSanitizer (disables mimalloc and sentry)
- `--tsan=[y|n]` Enable ThreadSanitizer (Linux/Mac only)
|
| |
|
|
|
|
|
| |
This PR makes it *possible* to do a Windows build on Linux via `clang-cl`.
It doesn't actually change any build process. No policy change, just mechanics and some code fixes to clear clang compilation.
The code fixes are mainly related to #include file name casing, to match the on-disk casing of the SDK files (via xwin).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
removal (#841)
- Percent-decode URIs in ASIO HTTP server to match http.sys CookedUrl behavior, ensuring consistent decoded paths across backends
- Add Environment field to CreateProcOptions for passing extra env vars to child processes (Windows: merged into Unicode environment block; Unix: setenv in fork)
- Add GetCompilerName() and include it in build options startup logging
- Suppress Windows CRT error dialogs in test harness for headless/CI runs
- Fix mimalloc package: pass CMAKE_BUILD_TYPE, skip cfuncs test for cross-compile
- Add virtual destructor to SentryAssertImpl to fix debug-mode warning
- Simplify object store path handling now that URIs arrive pre-decoded
- Add URI decoding test coverage for percent-encoded paths and query params
- Simplify httpasio request handling by using strands (guarantees no parallel handlers per connection)
- Removed deprecated regex-based route matching support
- Fix full GC never triggering after cross-toolchain builds: The `gc_state` file stores `system_clock` ticks, but the tick resolution differs between toolchains (nanoseconds on GCC/standard clang, microseconds on UE clang). A nanosecond timestamp misinterpreted as microseconds appears far in the future (~year 58,000), bypassing the staleness check and preventing time-based full GC from ever running. Fixed by also resetting when the stored timestamp is in the future.
- Clamp GC countdown display to configured interval: Prevents nonsensical log output (e.g. "Full GC in 492128002h") caused by the above or any other clock anomaly. The clamp applies to both the scheduler log and the status API.
|
| |
|
|
|
|
|
|
| |
- Add block cloning (copy-on-write) support for Linux and macOS to complement the existing Windows (ReFS) implementation
- **Linux**: `TryCloneFile` via `FICLONE` ioctl, `CloneQueryInterface` with range cloning via `FICLONERANGE` (Btrfs/XFS)
- **macOS**: `TryCloneFile` via `clonefile()` syscall (APFS), `SupportsBlockRefCounting` via `VOL_CAP_INT_CLONE`. `CloneQueryInterface` is not implemented as macOS lacks a sub-file range clone API
- Promote `ScopedFd` to file scope for broader use in filesystem code
- Add test scripts for block cloning validation on Linux (Btrfs via loopback) and macOS (APFS)
- Also added test script for testing on Windows (ReFS)
|
| |
|
|
|
|
|
| |
- Fix potential crash on startup caused by logging macros being invoked before the logging system is initialized (null logger dereference in `ZenServerState::Sweep()`). `LoggerRef::ShouldLog` now guards against a null logger pointer.
- Make CPR an optional dependency (`--zencpr` build option, enabled by default) so builds can proceed without it
- Make zenvfs Windows-only (platform-specific target)
- Generate the frontend zip at build time from source HTML files instead of checking in a binary blob which would accumulate with every single update
|
| |
|
|
|
|
|
|
|
|
| |
- Add clang-cl warning suppressions in xmake.lua matching Linux/macOS set
- Guard /experimental:c11atomics with {tools="cl"} for MSVC-only
- Fix long long / int64_t redefinition in string.h for clang-cl
- Fix unclosed namespace in callstacktrace.cpp #else branch
- Fix missing override in httpplugin.cpp
- Reorder WorkerPool fields to match designated initializer order
- Use INVALID_SOCKET instead of SOCKET_ERROR for SOCKET comparisons
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Switches the default HTTP client to the libcurl-based backend and follows up with a series of correctness fixes and code quality improvements to `CurlHttpClient`.
**Backend switch & build fixes:**
- Switch default HTTP client to libcurl-based backend
- Suppress `[[nodiscard]]` warning when building fmt
- Miscellaneous bugfixes in HttpClient/libcurl
- Pass `-y` to `xmake config` in `xmake test` task
**Boilerplate reduction:**
- Add `Session::SetHeaders()` for RAII ownership of `curl_slist`, eliminating manual `curl_slist_free_all` calls from every verb method
- Add `Session::PerformWithResponseCallbacks()` to absorb the repeated 12-line write+header callback setup block
- Extract `ParseHeaderLine()` shared helper, replacing 4 duplicate header-parsing implementations
- Extract `BuildHeaderMap()` and `ApplyContentTypeFromHeaders()` helpers to deduplicate header-to-map conversion and Content-Type scanning
- Unify the two `DoWithRetry` overloads (PayloadFile variant now delegates to the Validate variant)
**Correctness fixes:**
- `TransactPackage`: both phases now use `PerformWithResponseCallbacks()`, fixing missing abort support and a dead header collection loop
- `TransactPackage`: error path now routes through `CommonResponse`, preserving curl error codes and messages for the caller
- `ValidatePayload`: merged 3 separate header-scan loops into a single pass
**Performance improvements:**
- Replace `fmt::format` with `ExtendableStringBuilder` in `BuildHeaderList` and `BuildUrlWithParameters`, eliminating heap allocations in the common case
- Replace `curl_easy_escape`/`curl_free` with inline URL percent-encoding using `AsciiSet`
- Remove wasteful `CommonResponse(...)` construction in retry logging, formatting directly from `CurlResult` fields
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a **transparent TCP proxy mode** to zenserver (activated via `zenserver proxy`), allowing it to sit between clients and upstream Zen servers to inspect and monitor HTTP/1.x traffic in real time. Primarily useful during development, to be able to observe multi-server/client interactions in one place.
- **Dedicated proxy port** -- Proxy mode defaults to port 8118 with its own data directory to avoid collisions with a normal zenserver instance.
- **TCP proxy core** (`src/zenserver/proxy/`) -- A new transparent TCP proxy that forwards connections to upstream targets, with support for both TCP/IP and Unix socket listeners. Multi-threaded I/O for connection handling. Supports Unix domain sockets for both upstream/downstream.
- **HTTP traffic inspection** -- Parses HTTP/1.x request/response streams inline to extract method, path, status, content length, and WebSocket upgrades without breaking the proxied data.
- **Proxy dashboard** -- A web UI showing live connection stats, per-target request counts, active connections, bytes transferred, and client IP/session ID rollups.
- **Server mode display** -- Dashboard banner now shows the running server mode (Zen Proxy, Zen Compute, etc.).
Supporting changes included in this branch:
- **Wildcard log level matching** -- Log levels can now be set per-category using wildcard patterns (e.g. `proxy.*=debug`).
- **`zen down --all`** -- New flag to shut down all running zenserver instances; also used by the new `xmake kill` task.
- Minor test stability fixes (flaky hash collisions, per-thread RNG seeds).
- Support ZEN_MALLOC environment variable for default allocator selection and switch default to rpmalloc
- Fixed sentry-native build to allow LTO on Windows
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
- **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS.
- **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking.
- **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration.
- **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP.
- **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch.
- **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support.
Also contains some fixes from TSAN analysis.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add auto-detection of colour support to `AnsicolourStdoutSink`.
**New `colorMode` enum** (`On`, `Off`, `Auto`) added to the header, accepted by the `AnsicolorStdoutSink` constructor. Defaults to `Auto`, so all existing call sites are unaffected.
**`Auto` mode detection logic** (in `IscolourTerminal()`):
1. **TTY check** -- if stdout is not a terminal, colour is disabled.
2. **`NO_COLOR`** -- respects the no-colour.org convention. If set, colour is disabled.
3. **`COLORTERM`** -- if set (e.g. `truecolour`, `24bit`), colour is enabled.
4. **`TERM`** -- rejects `dumb`; accepts known colour-capable terminals via substring match:
`alacritty`, `ansi`, `colour`, `console`, `cygwin`, `gnome`, `konsole`, `kterm`, `linux`, `msys`, `putty`, `rxvt`, `screen`, `tmux`, `vt100`, `vt102`, `xterm`.
Substring matching covers variants like `xterm-256color` and `rxvt-unicode`.
5. **Fallback** -- Windows defaults to colour enabled (modern console supports ANSI natively); other platforms default to disabled.
When colour is disabled, ANSI escape sequences are omitted entirely from the output.
NOTE: this doesn't currently apply to all paths which do logging in zen as they may be determining their colour output mode separately from `AnsicolorStdoutSink`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Removes the vendored spdlog library (~12,000 lines) and replaces it with a purpose-built logging system in zencore (~1,800 lines). The new implementation provides the same functionality with fewer abstractions, no shared_ptr overhead, and full control over the logging pipeline.
### What changed
**New logging core in zencore/logging/:**
- LogMessage, Formatter, Sink, Logger, Registry - core abstractions matching spdlog's model but simplified
- AnsiColorStdoutSink - ANSI color console output (replaces spdlog stdout_color_sink)
- MsvcSink - OutputDebugString on Windows (replaces spdlog msvc_sink)
- AsyncSink - async logging via BlockingQueue worker thread (replaces spdlog async_logger)
- NullSink, MessageOnlyFormatter - utility types
- Thread-safe timestamp caching in formatters using RwLock
**Moved to zenutil/logging/:**
- FullFormatter - full log formatting with timestamp, logger name, level, source location, multiline alignment
- JsonFormatter - structured JSON log output
- RotatingFileSink - rotating file sink with atomic size tracking
**API changes:**
- Log levels are now an enum (LogLevel) instead of int, eliminating the zen::logging::level namespace
- LoggerRef no longer wraps shared_ptr - it holds a raw pointer with the registry owning lifetime
- Logger error handler is wired through Registry and propagated to all loggers on registration
- Logger::Log() now populates ThreadId on every message
**Cleanup:**
- Deleted thirdparty/spdlog/ entirely (110+ files)
- Deleted full_test_formatter (was ~80% duplicate of FullFormatter)
- Renamed snake_case classes to PascalCase (full_formatter -> FullFormatter, json_formatter -> JsonFormatter, sentry_sink -> SentrySink)
- Removed spdlog from xmake dependency graph
### Build / test impact
- zencore no longer depends on spdlog
- zenutil and zenvfs xmake.lua updated to drop spdlog dep
- zentelemetry xmake.lua updated to drop spdlog dep
- All existing tests pass, no test changes required beyond formatter class renames
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Claude config updates
* Bug fixes and hardening across `zencore` and `zenhttp`, identified via static analysis.
### zencore
- **`ZEN_ASSERT` macro** -- extended to accept an optional string message literal; added `ZEN_ASSERT_MSG_` helper for message formatting. Callers needing runtime fmt-style formatting should use `ZEN_ASSERT_FORMAT`.
- **`MpscQueue`** -- fixed `TypeCompatibleStorage` to use a properly-sized `char Storage[sizeof(T)]` array instead of a single `char`; corrected `Data()` to cast `&Storage` rather than `this`; switched cache-line alignment to a fixed constant to avoid GCC's `-Winterference-size` warning. Enabled previously-disabled tests.
- **`StringBuilderImpl`** -- initialized `m_Base`/`m_CurPos`/`m_End` to `nullptr`. Fixed `StringCompare` return type (`bool` -> `int`). Fixed `ParseInt` to reject strings with trailing non-numeric characters. Removed deprecated `<codecvt>` include.
- **`NiceNumGeneral`** -- replaced `powl()` with integer `IntPow()` to avoid floating-point precision issues.
- **`RwLock::ExclusiveLockScope`** -- added move constructor/assignment; initialized `m_Lock` to `nullptr`.
- **`Latch::AddCount`** -- fixed variable type (`std::atomic_ptrdiff_t` -> `std::ptrdiff_t` for the return value of `fetch_add`).
- **`thread.cpp`** -- fixed Linux `pthread_setname_np` 16-byte name truncation; added null check before dereferencing in `Event::Close()`; fixed `NamedEvent::Close()` to call `close(Fd)` outside the lock region; added null guard in `NamedMutex` destructor; `Sleep()` now returns early for non-positive durations.
- **`MD5Stream`** -- was entirely stubbed out (no-op); now correctly calls `MD5Init`/`MD5Update`/`MD5Final`. Fixed `ToHexString` to use the correct string length. Fixed forward declarations. Fixed tests to compare `compare() == 0`.
- **`sentryintegration.cpp`** -- guard against null `filename`/`funcname` in spdlog message handler to prevent a crash in `fmt::format`.
- **`jobqueue.cpp`** -- fixed lost job ID when `IdGenerator` wraps around zero; fixed raw `Job*` in `RunningJobs` map (potential use-after-free) to `RefPtr<Job>`; fixed range-loop copies; fixed format string typo.
- **`trace.cpp`** -- suppress GCC false-positive warnings in third-party `trace.h` include.
### zenhttp
- **WebSocket close race** (`wsasio`, `wshttpsys`, `httpwsclient`) -- `m_CloseSent` promoted from `bool` to `std::atomic<bool>`; close check changed to `exchange(true)` to eliminate the check-then-set data race.
- **`wsframecodec.cpp`** -- reject WebSocket frames with payload > 256 MB to prevent OOM from malformed/malicious frames.
- **`oidc.cpp`** -- URL-encode refresh token and client ID in token requests (`FormUrlEncode`); parse `end_session_endpoint` and `device_authorization_endpoint` from OIDC discovery document.
- **`httpclientcommon.cpp`** -- propagate error code from `AppendData` when flushing the cache buffer.
- **`httpclient.h`** -- initialize all uninitialized members (`ErrorCode`, `UploadedBytes`, `DownloadedBytes`, `ElapsedSeconds`, `MultipartBoundary` fields).
- **`httpserver.h`** -- fix `operator=` return type for `HttpRpcHandler` (missing `&`).
- **`packageformat.h`** -- fix `~0u` (32-bit truncation) to `~uint64_t(0)` for a `uint64_t` field.
- **`httpparser`** -- initialize `m_RequestVerb` in both declaration and `ResetState()`.
- **`httpplugin.cpp`** -- initialize `m_BasePort`; fix format string missing quotes around connection name.
- **`httptracer.h`** -- move `#pragma once` before includes.
- **`websocket.h`** -- initialize `WebSocketMessage::Opcode`.
### zenserver
- **`hubservice.cpp`** -- fix two `ZEN_ASSERT` calls that incorrectly used fmt-style format args; converted to `ZEN_ASSERT_FORMAT`.
|
| |
|
| |
Compile fixes for various versions of gcc,clang (non-UE)
|
| |
|
|
|
| |
* remove stray std::unique_ptr<AuthMgr> Auth; causing crashes
* add more feedback during parsing of auth options
|
| |
|
|
|
|
|
|
|
|
| |
- Added local process runners for Linux/Wine, Mac with some sandboxing support
- Horde & Nomad provisioning for development and testing
- Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API
- Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates
- Some security hardening
- Improved scalability and `zen exec` command
Still experimental - compute support is disabled by default
|
| |
|
|
|
| |
Various fixes to make cpp files build in unity build mode
as an aside using Unity build doesn't really seem to work on Linux, unsure why but it leads to link-time issues
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Makes all test cases part of a test suite. Test suites are named after the module and the name of the file containing the implementation of the test.
* This allows for better and more predictable filtering of which test cases to run which should also be able to reduce the time CI spends in tests since it can filter on the tests for that particular module.
Also improves `xmake test` behaviour:
* instead of an explicit list of projects just enumerate the test projects which are available based on build system state
* also introduces logic to avoid running `xmake config` unnecessarily which would invalidate the existing build and do lots of unnecessary work since dependencies were invalidated by the updated config
* also invokes build only for the chosen test targets
As a bonus, also adds `xmake sln --open` which allows opening IDE after generation of solution/xmake project is done.
|
| |
|
|
|
|
| |
* when `--verbose` is specified to zenserver-test, all child process output (typically, zenserver instances) is piped through to stdout. you can also pass `--verbose` to `xmake test` to accomplish the same thing.
* this PR also consolidates all test runner `main` function logic (such as from zencore-test, zenhttp-test etc) into central implementation in zencore for consistency and ease of maintenance
* also added extended utf8-tests including a fix to `Utf8ToWide()`
|
| |
|
|
|
| |
This change introduces job object support on Windows to be able to more accurately track and limit resource usage on storage instances created by the hub service. It also ensures that all child instances can be torn down reliably on exit.
Also made it so hub tests no longer pop up console windows while running.
|
| |
|
| |
* Ported "lane trace" feature from UE (by way of IAX)
|
| |
|
|
|
|
|
| |
- Add missing includes in hashutils.h (`<cstddef>`, `<type_traits>`)
- Add `ZenContentType` parameter to all `IoBufferBuilder` factory methods so content type is set at buffer creation time
- Fix null dereference in `SharedBuffer::GetFileReference()` when buffer is null
- Fix out-of-bounds read in trace command-line argument parsing when arg length exactly matches option length
- Add unit tests for 32-bit `CountLeadingZeros`
|
| |
|
|
|
| |
Allows user to automate launching of zenserver dashboard, including when multiple instances are running. If multiple instances are running you can open all dashboards with `--all`, and also using the in-terminal chooser which also allows you to open a specific instance.
Also includes a fix to `zen exec` when using offset/stride/limit
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zencore fixes:
- filesystem.cpp: ReadFile error reporting logic
- compactbinaryvalue.h: CbValue::As*String error reporting logic
zenhttp fixes:
- httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB)
- httpsys async workpool initialization race
zenstore fixes:
- cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost)
- structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses)
- cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete)
- cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size
- buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash)
- buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob
zenserver fixes:
- httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow)
- hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper)
- zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
|
| |
|
| |
Co-authored-by: Stefan Boberg <[email protected]>
|
| |
|
|
|
| |
(was MakeSafeAbsolutePathÍnPlace - note accent)
Also fixed misleading comments on multiple functions in filesystem.h
|
| |
|
|
|
| |
* implemented selective request logging for http.sys for consistency with asio
* fixed traversal of GetLogicalProcessorInformationEx to account for variable-sized records
* also adds CPU usage metrics
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Feature: Added `--security-config-path` option to zenserver to configure security settings
- Expects a path to a .json file
- Default is an empty path resulting in no extra security settings and legacy behavior
- Current support is a top level filter of incoming http requests restricted to the `password` type
- `password` type will check the `Authorization` header and match it to the selected authorization strategy
- Currently the security settings is very basic and configured to a fixed username+password at startup
{
"http" {
"root": {
"filter": {
"type": "password",
"config": {
"password": {
"username": "<username>",
"password": "<password>"
},
"protect-machine-local-requests": false,
"unprotected-uris": [
"/health/",
"/health/info",
"/health/version"
]
}
}
}
}
}
|
| |
|
|
|
|
|
|
| |
* `RwLock::WithSharedLock` and `RwLock::WithExclusiveLock` can now return a value (which is returned by the passed function)
* Comma-separated logger specification now correctly deals with commas
* `GetSystemMetrics` properly accounts for cores
* cpr response formatter passes arguments in the right order
* `HttpServerRequest::SetLogRequest` can be used to selectively log HTTP requests
|
| |
|
| |
also made sure log initialization calls it to ensure the console output format is retained even if the console logger was set up before logging is initialized
|
| | |
|
| |
|
|
|
|
| |
* add ability to override scheduling mode in ParallelWork
* Don't increase buffering size when copying from local cache with --boost-worker-memory enabled
* Rework scheduling writes of downloaded data to reduce memory usage
|
| |
|
|
|
|
|
| |
* add system metrics output to top command
* removed unnecessary xmake directives
* file system API/comment tweaks
* fixed out-of-range access in httpserver test
* updated ZenServer base API to allow customization by mode
|
| |
|
| |
initially we had ZENCORE_API macros to potentially allow for DLL linkage. It turns out that this is not useful and the macros just contribute noise, so this change removes them completely.
|
| |
|
|
|
|
|
|
| |
* this adds a consul package which can be used to fetch a consul binary
* it also adds a `ConsulProcess` helper which can be used to spawn and manage a consul service instance
* zencore dependencies brought across:
- `except_fmt.h` for easer generation of formatted exception messages
- `process.h/cpp` changes (adds `Kill` operation and process group support on Windows)
- `string.h` changes to allow generic use of `WideToUtf8()`
|
| |
|
|
|
|
|
|
| |
commands (#706)
* added `--exclude-folders` to `zen upload`, `zen download` and `zen diff`
added `--exclude-extensions` to `zen upload` and `zen diff`
excluded folder names are now matched by folder name in subfolders in addition to root level folders
* allow multiple token separators
|
| |
|
|
|
| |
- Improvement: Deeper validation of data when scrub is activated (cas/cache/project)
- Improvement: Enabled more multi threading when running scrub operations
- Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
|
| |
|
| |
Implemented GetSystemMetrics function for Mac and Linux
|