aboutsummaryrefslogtreecommitdiff
path: root/src/zenserver/frontend/html
Commit message (Collapse)AuthorAgeFilesLines
* fix hub consule health endpoint registration (#917)Dan Engelbrecht3 days1-0/+1
| | | | * use correct health endpoint for zenhubserver consul registration * add total disk space on hub resource pane
* s3 and consul fixes (#916)Dan Engelbrecht4 days1-1/+1
| | | | | | | | | | | * fix endpoint for stats/hub in compute/hub.html page * fix api token call failure for imds (using wrong overload for Put) * add "localhost" to healt check url in consul when no address is given * add consul fallback deregister if normal deregister fails * add consul registration unit test
* add provision button to hub ui (#915)Dan Engelbrecht4 days1-0/+133
|
* hub instance dashboard proxy (#914)Dan Engelbrecht4 days2-2/+2
| | | - Feature: Hub dashboard proxy - instance dashboards are accessible through the hub server at `/hub/proxy/{port}/` without requiring direct port access
* Request validation and resilience improvements (#864)Stefan Boberg6 days1-8/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ### Security: Input validation & path safety - **Reject local file references by default** in package parsing — only allow when explicitly opted in by the service (`ParseFlags::kAllowLocalReferences`) and validated by an `ILocalRefPolicy` (fail-closed: no policy = rejected) - **`DataRootLocalRefPolicy`** restricts local ref paths to the server's data root via canonical path prefix matching - **Validate attachment hashes** in compute HTTP handlers — decompresses and re-hashes each attachment at ingestion time to reject tampered payloads - **Path traversal validation** for worker descriptions (`pathvalidation.h`) — rejects absolute paths, `..` components, Windows reserved device names, and invalid filename characters - **Harden CbPackage parsing** against corrupt inputs — overflow-safe attachment count, bounds checks on local ref offset/size, graceful failure instead of `ZEN_ASSERT` for untrusted data - **Harden legacy package parser** — reject zero-size binary fields, missing mappers, and optionally validate resolved attachment hashes - **Bounds check in `CbPackageReader::MarshalLocalChunkReference`** — detect when `MakeFromFile` silently clamps offset+size to file size ### Reliability: Lock consolidation & bug fixes - **Consolidate three action map locks into one** (`m_ActionMapLock`) — eliminates deadlock risk from multi-lock ordering, simplifies state transitions, and fixes a race where newly enqueued actions were briefly invisible to `GetActionResult`/`FindActionResult` - **Fix infinite loop in `BaseRunnerGroup::SubmitActions`** when actions exceed total runner capacity — cap round-robin at `TotalCapacity` and default unassigned results to "No capacity" - **Fix `MakeSafeAbsolutePathInPlace` for UNC paths** — `\server\share` now correctly becomes `\?\UNC\server\share` instead of `\?\server\share` - **Fix `max_retries=0`** — previously fell through to the default of 3; now correctly means "no retries" ### New: ManagedProcessRunner - Cross-platform process runner backed by `SubprocessManager` — uses async exit callbacks instead of polling, delegates CPU/memory metrics to the manager's built-in sampler - `ProcessGroup` (JobObject on Windows, process group on POSIX) for bulk cancellation on shutdown - `--managed` flag on `zen exec inproc` to select this runner - Refactored monitor thread lifecycle — `StartMonitorThread()` now called from derived constructors to avoid calling virtual functions from base constructor ### Process management - **Suppress crash dialogs** via `JOB_OBJECT_UILIMIT_ERRORMODE` + `SEM_NOGPFAULTERRORBOX` in both `WindowsProcessRunner` and `JobObject::Initialize` — prevents WER/Dr. Watson modal dialogs from blocking the monitor thread - **CREATE_SUSPENDED → AssignProcessToJobObject → ResumeThread** pattern in `WindowsProcessRunner` — ensures job object assignment before process execution - **Move stdout/stderr callbacks to `Spawn()` parameters** in `SubprocessManager` — prevents race where early output could be missed before callback installation - Consistent PID logging across all runner types ### Test infrastructure - **`zentest-appstub`**: Added `Fail` (configurable exit code) and `Crash` (abort / nullptr deref) test functions - **Compute integration tests**: exit code handling, auto-retry exhaustion, manual reschedule after failure, mixed success/failure queues, crash handling (abort + nullptr), crash auto-retry, immediate query visibility after enqueue - **Package format tests**: truncated header, bad magic, attachment count overflow, truncated data, local ref rejection/acceptance, policy enforcement (inside/outside root, traversal, no-policy fail-closed) - **Legacy package parser tests**: empty input, zero-size binary, hash resolution with/without mapper, hash mismatch detection - **UNC path tests** for `MakeSafeAbsolutePath` ### Misc - ANSI color helper macros (`ZEN_RED`, `ZEN_BRIGHT_WHITE`, etc.) and `ZEN_BOLD`/`ZEN_DIM`/etc. - Generic `fmt::formatter` for types with free `ToString` functions - Compute dashboard: truncated hash display with monospace font and hover for full value - Renamed `usonpackage_forcelink` → `cbpackage_forcelink` - Compute enabled by default in xmake config (releases still explicitly disable)
* hub resource limits (#900)Dan Engelbrecht6 days9-150/+104
| | | | | | | | | | | | - Feature: Hub dashboard now shows a Resources tile with disk and memory usage against configured limits - Feature: Hub module listing now shows state-change timestamps and duration for each instance - Improvement: Hub provisioning rejects new instances when disk or memory usage exceeds configurable thresholds; limits are disabled by default (0 = no limit) - `--hub-provision-disk-limit-bytes` - Reject provisioning when used disk exceeds this many bytes - `--hub-provision-disk-limit-percent` - Reject provisioning when used disk exceeds this percentage of total disk - `--hub-provision-memory-limit-bytes` - Reject provisioning when used memory exceeds this many bytes - `--hub-provision-memory-limit-percent` - Reject provisioning when used memory exceeds this percentage of total RAM - Improvement: Hub process metrics are now tracked atomically per active instance slot, eliminating per-query process handle lookups - Improvement: Hub, Build Store, and Workspaces service stats sections in the dashboard are now collapsible - Bugfix: Hub watchdog loop did not check `m_ShutdownFlag`, causing it to spin indefinitely on shutdown
* dashboard improvements (#896)Dan Engelbrecht9 days8-139/+569
| | | | | | - Feature: Added Workspaces dashboard page with HTTP request stats and per-workspace metrics - Feature: Added Build Storage dashboard page with service-specific HTTP request stats - Improvement: Front page now shows Hub and Object Store activity tiles; HTTP panel is fixed above the tiles grid - Improvement: HTTP stats tiles now include 5m/15m rates and p999/max latency across all service pages
* Dashboard refresh (logs, storage, network, object store, docs) (#835)Stefan Boberg13 days20-246/+7939
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ## Summary This PR adds a session management service, several new dashboard pages, and a number of infrastructure improvements. ### Sessions Service - `SessionsServiceClient` in `zenutil` announces sessions to a remote zenserver with a 15s heartbeat (POST/PUT/DELETE lifecycle) - Storage server registers itself with its own local sessions service on startup - Session mode attribute coupled to server mode (Compute, Proxy, Hub, etc.) - Ended sessions tracked with `ended_at` timestamp; status filtering (Active/Ended/All) - `--sessions-url` config option for remote session announcement - In-process log sink (`InProcSessionLogSink`) forwards server log output to the server's own session, visible in the dashboard ### Session Log Viewer - POST/GET endpoints for session logs (`/sessions/{id}/log`) supporting raw text and structured JSON/CbObject with batch `entries` array - In-memory log storage per session (capped at 10k entries) with cursor-based pagination for efficient incremental fetching - Log panel in the sessions dashboard with incremental DOM updates, auto-scroll (Follow toggle), newest-first toggle, text filter, and log-level coloring - Auto-selects the server's own session on page load ### TCP Log Streaming - `LogStreamListener` and `TcpLogStreamSink` for log delivery over TCP - Sequence numbers on each message with drop detection and synthetic "dropped" notice on gaps - Gathered buffer writes to reduce syscall overhead when flushing batches - Tests covering basic delivery, multi-line splitting, drop detection, and sequencing ### New Dashboard Pages - **Sessions**: master-detail layout with selectable rows, metadata panel, live WebSocket updates, paging, abbreviated date formatting, and "this" pill for the local session - **Object Store**: summary stats tiles and bucket table with click-to-expand inline object listing (`GET /obj/`) - **Storage**: per-volume disk usage breakdown (`GET /admin/storage`), Garbage Collection status section (next-run countdown, last-run stats), and GC History table with paginated rows and expandable detail panels - **Network**: overview tiles, per-service request table, proxy connections, and live WebSocket updates; distinct client IPs and session counts via HyperLogLog ### Documentation Page - In-dashboard Docs page with sidebar navigation, markdown rendering (via `marked`), Mermaid diagram support (theme-aware), collapsible sections, text filtering with highlighting, and cross-document linking - New user-facing docs: `overview.md` (with architecture and per-mode diagrams), `sessions.md`, `cache.md`, `projects.md`; updated `compute.md` - Dev docs moved to `docs/dev/` ### Infrastructure & Bug Fixes - **Deflate compression** for the embedded frontend zip (~3.4MB → ~950KB); zlib inflate support added to `ZipFs` with cached decompressed buffers - **Local IP addresses**: `GetLocalIpAddresses()` (Windows via `GetAdaptersAddresses`, Linux/Mac via `getifaddrs`); surfaced in `/status/status`, `/health/info`, and the dashboard banner - **Dashboard nav**: unified into `zen-nav` web component with `MutationObserver` for dynamically added links, CSS `::part()` to merge banner/nav border radii, and prefix-based active link detection - Stats broadcast refactored from manual JSON string concatenation to `CbObjectWriter`; `CbObject`-to-JS conversion improved for `TimeSpan`, `DateTime`, and large integers - Stats WebSocket boilerplate consolidated into `ZenPage.connect_stats_ws()`
* add hub instance crash recovery (#885)Dan Engelbrecht13 days2-1/+8
| | | * add hub instance crash recovery
* hub web UI improvements (#878)Dan Engelbrecht2026-03-222-20/+742
| | | | | | | | | | - Improvement: Hub dashboard module list improved - Animated state dots, with flashing transitions for hibernating, waking, provisioning, and deprovisioning - Per-row port used by the provisioned instance - Per-row hibernate, wake, and deprovision actions with confirmation for destructive operations - Per-row button to open the instance dashboard in a new window - Multi-select with bulk hibernate/wake/deprovision and select-all - Pagination at 50 lines per page - Inline fold-out panel per row with human-readable process metrics
* add hub instance info (#869)Dan Engelbrecht2026-03-202-4/+4
| | | | | | | - Improvement: Hub module listing now includes per-instance process metrics (memory, CPU time, working set, pagefile usage) - Improvement: Hub now monitors provisioned instance health in the background and refreshes process metrics periodically - Improvement: Hub no longer exposes raw `StorageServerInstance` pointers to callers; instance state is returned as value snapshots (`Hub::InstanceInfo`) - Improvement: Hub instance access is now guarded by RAII per-instance locks (`SharedLockedPtr`/`ExclusiveLockedPtr`), preventing concurrent modifications during provisioning and deprovisioning - Improvement: Hub instance lifecycle is now tracked as a `HubInstanceState` enum covering transitional states (Provisioning, Deprovisioning, Hibernating, Waking); exposed as a string in the HTTP API and dashboard
* Compute batching (#849)Stefan Boberg2026-03-184-17/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ### Compute Batch Submission - Consolidate duplicated action submission logic in `httpcomputeservice` into a single `HandleSubmitAction` supporting both single-action and batch (actions array) payloads - Group actions by queue in `RemoteHttpRunner` and submit as batches with configurable chunk size, falling back to individual submission on failure - Extract shared helpers: `MakeErrorResult`, `ValidateQueueForEnqueue`, `ActivateActionInQueue`, `RemoveActionFromActiveMaps` ### Retracted Action State - Add `Retracted` state to `RunnerAction` for retry-free rescheduling — an explicit request to pull an action back and reschedule it on a different runner without incrementing `RetryCount` - Implement idempotent `RetractAction()` on `RunnerAction` and `ComputeServiceSession` - Add `POST jobs/{lsn}/retract` and `queues/{queueref}/jobs/{lsn}/retract` HTTP endpoints - Add state machine documentation and per-state comments to `RunnerAction` ### Compute Race Fixes - Fix race in `HandleActionUpdates` where actions enqueued between session abandon and scheduler tick were never abandoned, causing `GetActionResult` to return 202 indefinitely - Fix queue `ActiveCount` race where `NotifyQueueActionComplete` was called after releasing `m_ResultsLock`, allowing callers to observe stale counters immediately after `GetActionResult` returned OK ### Logging Optimization and ANSI improvements - Improve `AnsiColorStdoutSink` write efficiency — single write call, dirty-flag flush, `RwLock` instead of `std::mutex` - Move ANSI color emission from sink into formatters via `Formatter::SetColorEnabled()`; remove `ColorRangeStart`/`End` from `LogMessage` - Extract color helpers (`AnsiColorForLevel`, `StripAnsiSgrSequences`) into `helpers.h` - Strip upstream ANSI SGR escapes in non-color output mode. This enables colour in log messages without polluting log files with ANSI control sequences - Move `RotatingFileSink`, `JsonFormatter`, and `FullFormatter` from header-only to pimpl with `.cpp` files ### CLI / Exec Refactoring - Extract `ExecSessionRunner` class from ~920-line `ExecUsingSession` into focused methods and a `ExecSessionConfig` struct - Replace monolithic `ExecCommand` with subcommand-based architecture (`http`, `inproc`, `beacon`, `dump`, `buildlog`) - Allow parent options to appear after subcommand name by parsing subcommand args permissively and forwarding unmatched tokens to the parent parser ### Testing Improvements - Fix `--test-suite` filter being ignored due to accumulation with default wildcard filter - Add test suite banners to test listener output - Made `function.session.abandon_pending` test more robust ### Startup / Reliability Fixes - Fix silent exit when a second zenserver instance detects a port conflict — use `ZEN_CONSOLE_*` for log calls that precede `InitializeLogging()` - Fix two potential SIGSEGV paths during early startup: guard `sentry_options_new()` returning nullptr, and throw on `ZenServerState::Register()` returning nullptr instead of dereferencing - Fail on unrecognized zenserver `--mode` instead of silently defaulting to store ### Other - Show host details (hostname, platform, CPU count, memory) when discovering new compute workers - Move frontend `html.zip` from source tree into build directory - Add format specifications for Compact Binary and Compressed Buffer wire formats - Add `WriteCompactBinaryObject` to zencore - Extended `ConsoleTui` with additional functionality - Add `--vscode` option to `xmake sln` for clangd / `compile_commands.json` support - Disable compute/horde/nomad in release builds (not yet production-ready) - Disable unintended `ASIO_HAS_IO_URING` enablement - Fix crashpad patch missing leading whitespace - Clean up code triggering gcc false positives
* Transparent proxy mode (#823)Stefan Boberg2026-03-124-10/+504
| | | | | | | | | | | | | | | | | Adds a **transparent TCP proxy mode** to zenserver (activated via `zenserver proxy`), allowing it to sit between clients and upstream Zen servers to inspect and monitor HTTP/1.x traffic in real time. Primarily useful during development, to be able to observe multi-server/client interactions in one place. - **Dedicated proxy port** -- Proxy mode defaults to port 8118 with its own data directory to avoid collisions with a normal zenserver instance. - **TCP proxy core** (`src/zenserver/proxy/`) -- A new transparent TCP proxy that forwards connections to upstream targets, with support for both TCP/IP and Unix socket listeners. Multi-threaded I/O for connection handling. Supports Unix domain sockets for both upstream/downstream. - **HTTP traffic inspection** -- Parses HTTP/1.x request/response streams inline to extract method, path, status, content length, and WebSocket upgrades without breaking the proxied data. - **Proxy dashboard** -- A web UI showing live connection stats, per-target request counts, active connections, bytes transferred, and client IP/session ID rollups. - **Server mode display** -- Dashboard banner now shows the running server mode (Zen Proxy, Zen Compute, etc.). Supporting changes included in this branch: - **Wildcard log level matching** -- Log levels can now be set per-category using wildcard patterns (e.g. `proxy.*=debug`). - **`zen down --all`** -- New flag to shut down all running zenserver instances; also used by the new `xmake kill` task. - Minor test stability fixes (flaky hash collisions, per-thread RNG seeds). - Support ZEN_MALLOC environment variable for default allocator selection and switch default to rpmalloc - Fixed sentry-native build to allow LTO on Windows
* Dashboard overhaul, compute integration (#814)Stefan Boberg2026-03-0928-854/+4327
| | | | | | | | | | - **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS. - **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking. - **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration. - **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP. - **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch. - **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support. Also contains some fixes from TSAN analysis.
* compute orchestration (#763)Stefan Boberg2026-03-049-50/+2222
| | | | | | | | | | - Added local process runners for Linux/Wine, Mac with some sandboxing support - Horde & Nomad provisioning for development and testing - Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API - Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates - Some security hardening - Improved scalability and `zen exec` command Still experimental - compute support is disabled by default
* icon and header logo changeszousar2026-02-194-19/+21
|
* Merge branch 'main' into zs/web-ui-improvementszousar2026-02-181-0/+991
|\
| * structured compute basics (#714)Stefan Boberg2026-02-181-0/+991
| | | | | | | | | | | | | | | | | | this change adds the `zencompute` component, which can be used to distribute work dispatched from UE using the DDB (Derived Data Build) APIs via zenserver this change also adds a distinct zenserver compute mode (`zenserver compute`) which is intended to be used for leaf compute nodes to exercise the compute functionality without directly involving UE, a `zen exec` subcommand is also added, which can be used to feed replays through the system all new functionality is considered *experimental* and disabled by default at this time, behind the `zencompute` option in xmake config
* | entry.js handles missing/native items more gracefullyzousar2026-02-182-4/+32
| |
* | Dependencies table doesn't reflow the entries pagezousar2026-02-171-5/+19
| |
* | Rename the cache section in the web uizousar2026-02-171-1/+1
| |
* | Make files table in entry.js paginated and searchablezousar2026-02-171-40/+170
| |
* | Added custom page for cook.artifactszousar2026-02-164-4/+428
| |
* | Change breadcrumbs for oplogs to be more descriptivezousar2026-02-152-6/+12
| |
* | Add support for listing files on oplog entrieszousar2026-02-151-9/+110
| |
* | Restore handling for hard/soft name prefixeszousar2026-02-151-4/+16
| |
* | Enhance dependencies to include soft and hard depszousar2026-02-141-11/+13
|/
* Fix formatting of stat pages (#748)Liam Mitchell2026-02-093-3/+10
| | | * Fix formatting of stat pages
* Zs/oplog navigation fix (#731)Zousar Shaker2026-01-231-6/+6
| | | * Fix incorrect oplog navigation symbols
* Avoid rendering user text input as HTML (#700)Liam Mitchell2026-01-091-1/+1
| | | * Avoid rendering user text input as HTML
* Added actions to drop all projects or all z$ namespaces. (#655)Martin Ridgers2025-11-182-2/+54
| | | | | | | | | | | * Save references to the project and zcache tables * Add an attribute to table rows with the actionable project/namespace id * Drop-all option for projects and cache namespaces * Updated frontend .zip archive * Edited changelog
* Include version string on the dashboard's start page. (#651)Martin Ridgers2025-11-173-2/+26
| | | | | | | | | | | * Method to get plain text from an async request * Include server's version on the dashboard start page * Same paragraph style as the rest of the method * Updated changelog * Update frontend archive
* Mr/dashboard stats summary tweak (#592)Martin Ridgers2025-10-201-9/+14
| | | | | | | * More robust dashboard stats summary * Updated changelog * Updated frontend zip archive
* Sorting oplog tree view by size would raise an error. (#497)Martin Ridgers2025-09-161-1/+1
| | | | | | | * Sorting oplog tree view by size would raise an error * Updated frontend .zip archive * Updated changelog
* If the oplog has no packagestoreentry then show the raw json.Florent Devillechabrol2025-08-141-2/+5
|
* Do not skip oplog without package data.Florent Devillechabrol2025-08-142-36/+39
|
* Surfaced basic z$ information to self-hosted dashboard (#441)Martin Ridgers2025-06-182-0/+124
| | | | | | - Namespaces are listed on the start page. - Namespaces can be dropped. - New page to show details of a namespace and list its buckets. - Buckets can be dropped.
* Make metadata presentation more genericzousar2025-04-151-23/+42
|
* Fix for BigInt conversion bugzousar2025-04-151-1/+1
|
* Avoid signed overflow using BigIntzousar2025-04-115-13/+16
| | | | Bias for use of BigInt when consuming integer fields in compact binary to avoid values showing up as negative due to overflow on the Number type.
* Merge pull request #343 from ue-foundation/zs/web-ui-oplog-searchZousar Shaker2025-04-042-3/+7
|\ | | | | Oplog search improvements
| * Oplog search improvementszousar2025-04-032-3/+7
| | | | | | | | | | | | - Case insensitive search - Allow search of 1 or 2 character strings - Reset table when doing a null search
* | Alternate fix by explicitly initializing pkg_idzousar2025-04-041-2/+1
| |
* | Bump db versionzousar2025-04-031-1/+1
| | | | | | | | Required to refresh db contents after ID fix.
* | Don't duplicate ID bytes when more than one pkg_datazousar2025-04-031-0/+1
|/ | | | | | ID was getting extended and left shifted if we encountered multiple package data items in a single entry. So instead of the ID being 0x0c6500b7fb8dbe2e, it was 0x0C6500B7FB8DBE2E0C6500B7FB8DBE2E. When we went to look up an imported package by ID, it would not be found and the import would be presented as a blank string. Addressing this by making the first package data the only referenceable one. Second package datas are currently used for optional data blobs, and will not be imported or referenced. They are sidecar data.
* Removed do_nothing from entry.jszousar2025-03-251-2/+0
|
* Add CookPackageArtifacts attachment to web uizousar2025-03-211-2/+67
|
* Missing import statment on dashboard's start page (#314)Martin Ridgers2025-03-191-0/+1
|
* Dashboard: view -> list rename, table style fix, file name appended to ↵Martin Ridgers2024-12-124-5/+8
| | | | | | | | | | | | | | | downloads (#264) * Single-column tables could overflow their maximum width * Suffix oplog entry data's file name when downloading * Renamed "view" link to "list" * Ensure all undesirable characters are removed from page name * Updated embedded frontend Zip archive * Wrote some entries into the changelog
* Dashboard CSS fixes and archival of a partial treemap view (#242)Martin Ridgers2024-11-283-6/+194
| | | | | | | | | | | * Input boxes' text was unreadable when using the dark theme * Change from margins to padding top/bottom - easier to reason about vertical styling. * A treemap. Not used anywhere and not finished. Submitting so it isn't lost * Prevent tables' first content columns from collapsing * Dashboardk .zip archive update