| author | Stefan Boberg <[email protected]> | 2026-03-23 12:23:19 +0100 |
|---|---|---|
| committer | GitHub Enterprise <[email protected]> | 2026-03-23 12:23:19 +0100 |
| commit | 26aa50677403e4c5ad053b221bc7264fe1d249f2 (patch) | |
| tree | de196528390e8875b0551d52071038120d969f73 /docs | |
| parent | Process management improvements (#881) (diff) | |
| download | zen-26aa50677403e4c5ad053b221bc7264fe1d249f2.tar.xz zen-26aa50677403e4c5ad053b221bc7264fe1d249f2.zip | |
Documentation updates (#882)
Restructured the docs folder in preparation for more docs. Improved the contents a bit.
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/NOTES.md | 20 | ||||
| -rw-r--r-- | docs/_index.json | 34 | ||||
| -rw-r--r-- | docs/cache.md | 143 | ||||
| -rw-r--r-- | docs/compute.md | 92 | ||||
| -rw-r--r-- | docs/dev/CODING.md (renamed from docs/CODING.md) | 0 | ||||
| -rw-r--r-- | docs/dev/Deploy.md (renamed from docs/Deploy.md) | 0 | ||||
| -rw-r--r-- | docs/dev/Github_runner_setup.md (renamed from docs/Github_runner_setup.md) | 0 | ||||
| -rw-r--r-- | docs/dev/VersioningCompatibility.md (renamed from docs/VersioningCompatibility.md) | 0 | ||||
| -rw-r--r-- | docs/dev/WindowsOnLinux.md (renamed from docs/WindowsOnLinux.md) | 0 | ||||
| -rw-r--r-- | docs/dev/xmake.md | 54 | ||||
| -rw-r--r-- | docs/glossary.md | 120 | ||||
| -rw-r--r-- | docs/overview.md | 188 | ||||
| -rw-r--r-- | docs/projects.md | 123 | ||||
| -rw-r--r-- | docs/sessions.md | 96 | ||||
| -rw-r--r-- | docs/specs/CompactBinary.md | 6 |
15 files changed, 832 insertions, 44 deletions
diff --git a/docs/NOTES.md b/docs/NOTES.md deleted file mode 100644 index 76476b0ea..000000000 --- a/docs/NOTES.md +++ /dev/null @@ -1,20 +0,0 @@ -# General Notes - -Some implementation notes and things which we may want to address in the future. - -## Memory Management - -We’ll likely want to *not* use `mimalloc` by default due to memory overheads, but perhaps it’s a win on high volume servers? In that case we need a way to opt in, but it’s not obvious how that might be done since we need to configure it quite early. It would even be preferable to not even compile with mimalloc to avoid the unfortunate way they initialize using TLS mechanisms in more recent versions since this does not play well with static linking. Instead of `mimalloc` it may be preferable to use `rpmalloc` instead as it is simpler and we have internal developer support if necessary. - -## Testing - -`doctest` has some thread local state which can unfortunately end up running after the main thread has exited and torn everything down. When it tries to free memory after main has exited things go bad. Currently this mostly ends up being an issue when running tests in the debugger. Some heuristics have been implemented to try and wait for all threads to exit before continuing shutting down but it does not feel like a proper solution. - -# Hub - -## Data Obliteration - -We need to support data obliteration on a module level. This means removing any local state for a given -module id and also any cold data. 
- -Add ability to register service with Consul via REST API diff --git a/docs/_index.json b/docs/_index.json new file mode 100644 index 000000000..6dea6a987 --- /dev/null +++ b/docs/_index.json @@ -0,0 +1,34 @@ +[ + { + "title": "Overview", + "path": "overview.md" + }, + { + "title": "Sessions", + "path": "sessions.md" + }, + { + "title": "Cache", + "path": "cache.md" + }, + { + "title": "Projects", + "path": "projects.md" + }, + { + "title": "Compute", + "path": "compute.md" + }, + { + "title": "Glossary", + "path": "glossary.md" + }, + { + "title": "Compact Binary Format", + "path": "specs/CompactBinary.md" + }, + { + "title": "Compressed Buffer Format", + "path": "specs/CompressedBuffer.md" + } +] diff --git a/docs/cache.md b/docs/cache.md new file mode 100644 index 000000000..a916945fa --- /dev/null +++ b/docs/cache.md @@ -0,0 +1,143 @@ +# Structured Cache (DDC) + +The Cache service is the primary storage interface for the Derived Data Cache +(DDC). It stores and retrieves derived data produced by the Unreal Engine +cooker — cooked assets, shader bytecode, texture compilations, and other build +artifacts. + +The cache is designed to allow the client to avoid doing computationally intensive +work by using the output produced by another machine/user or a previous local run. + +## Concepts + +### Namespace + +A logical partition within the cache. Each namespace (e.g. `ns_ue.ddc`) has its +own set of buckets and independent storage accounting. Namespaces allow different +types of cached data to be managed separately. + +### Bucket + +A subdivision within a namespace. Buckets group cache entries by the type of +derived data they contain (e.g. `bulkdatalist`, `animationsequence`, +`shaderbytecode`). Bucket names are lowercase alphanumeric strings. + +### Key + +Cache entries are identified by a 40-character hex hash within a bucket. The +full key path is `{bucket}/{keyhash}`. 
The key itself has no prescribed +derivation, but typically the client summarises all inputs to a transformation +and feeds the result into a hash function such as **IoHash** (BLAKE3/160) to +produce it. + +The inputs conceptually include the code performing the transformation, though +not in a literal sense — that would be impractical. Instead, one or more version +numbers are typically included as a proxy for the code itself. + +It is important that all inputs influencing the outcome of the transformation +are accounted for. Missing inputs can lead to non-determinism in the cooker and +editor, where stale cached results are used instead of recomputing. + +### Cache Record + +A cache record is a structured entry stored in Compact Binary format. Records +contain metadata describing the cached result along with references to one or +more attachments that hold the actual payload data. When a client performs a +cache lookup, the record is returned first, and the client can then fetch the +referenced attachments as needed. + +### Cache Value + +A cache value is an unstructured (opaque) entry stored as a raw binary blob. +Unlike records, values do not carry structured metadata or separate attachment +references — the entire payload is stored and retrieved as a single unit. Values +are used for simpler caching scenarios where the overhead of structured records +is unnecessary. + +### Attachment + +A binary payload (chunk) associated with a cache entry, identified by the +content hash of the data. Attachments are stored in CAS and referenced by +the cache entry. The full attachment path is `{bucket}/{keyhash}/{valuehash}`. 
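The derivation described above can be illustrated in a few lines of Python. This is a hedged sketch, not the engine's implementation: BLAKE3 is not in the Python standard library, so `hashlib.blake2b` with a 20-byte digest stands in for IoHash, and the `derive_cache_key` helper and its length-prefixed input framing are illustrative assumptions.

```python
import hashlib

def derive_cache_key(bucket: str, code_version: str, inputs: list[bytes]) -> str:
    # Stand-in for IoHash (BLAKE3/160): any 160-bit hash illustrates the idea.
    h = hashlib.blake2b(digest_size=20)
    # A version string acts as a proxy for the code performing the transformation.
    h.update(code_version.encode("utf-8"))
    for blob in inputs:
        # Length-prefix each input so concatenation is unambiguous.
        h.update(len(blob).to_bytes(8, "little"))
        h.update(blob)
    return f"{bucket}/{h.hexdigest()}"  # 40-character hex key within the bucket
```

For example, `derive_cache_key("shaderbytecode", "v17", [b"source", b"defines"])` yields a stable `{bucket}/{keyhash}` path; omitting any input from the list is exactly the stale-result hazard described above.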
+ +## API + +**Base URI:** `/z$/` + +### Service and Namespace Info + +``` +GET /z$/ # Service info (namespaces, storage size) +GET /z$/{namespace} # Namespace info (buckets, sizes, config) +GET /z$/{namespace}?bucketsizes=* # Include per-bucket storage sizes +``` + +### Bucket Operations + +``` +GET /z$/{namespace}/{bucket} # Bucket info (size, entry count) +DELETE /z$/{namespace}/{bucket} # Drop all entries in a bucket +DELETE /z$/{namespace} # Drop entire namespace +``` + +### Record Operations + +Records are structured cache entries (Compact Binary format). + +``` +GET /z$/{namespace}/{bucket}/{keyhash} # Get cached record +PUT /z$/{namespace}/{bucket}/{keyhash} # Store cached record +``` + +### Attachment Operations + +Attachments are binary chunks referenced by cache entries. + +``` +GET /z$/{namespace}/{bucket}/{keyhash}/{valuehash} # Get attachment +PUT /z$/{namespace}/{bucket}/{keyhash}/{valuehash} # Store attachment +``` + +### Batch Operations (RPC) + +For high-throughput scenarios, the RPC endpoint supports batch get/put +operations using the Compact Binary protocol: + +``` +POST /z$/{namespace}/$rpc # Batch RPC operations +``` + +Supported RPC operations: + +- `PutCacheRecords` — batch store of structured records +- `GetCacheRecords` — batch retrieve of structured records +- `PutCacheValues` — batch store of unstructured values +- `GetCacheValues` — batch retrieve of unstructured values +- `GetCacheChunks` — batch retrieve of chunks with upstream fallback + +### Detailed Inventory + +``` +GET /z$/details$ # All cache contents +GET /z$/details$/{namespace} # Namespace inventory +GET /z$/details$/{namespace}/{bucket} # Bucket inventory +GET /z$/details$/{namespace}/{bucket}/{key} # Single key details +``` + +Add `?csv=true` for CSV output, `?details=true` for size information, or +`?attachmentdetails=true` for full attachment listings. + +## Upstream Caching + +The cache service supports tiered caching through upstream connections. 
When a +cache miss occurs locally, the request can be forwarded to an upstream storage +instance (e.g. a shared build farm cache). Successfully retrieved content is +stored locally for future requests. + +## Web Dashboard + +The Cache page in the dashboard shows: + +- Per-namespace breakdown with disk and memory usage +- Per-bucket entry counts and storage sizes +- Cache hit/miss statistics diff --git a/docs/compute.md b/docs/compute.md index df8a22870..06453c975 100644 --- a/docs/compute.md +++ b/docs/compute.md @@ -1,29 +1,33 @@ -# DDC compute interface design documentation +# Zen Compute -This is a work in progress +> **Note:** Compute interfaces are a work in progress and are not yet included +> in official binary releases. -## General architecture +Zen server implements a compute interface for managing distributed processing +via the UE5 DDC 2.0 Build API. -The Zen server compute interfaces implement a basic model for distributing compute processes. -Clients can implement [Functions](#functions) in [worker executables](#workers) and dispatch -[actions](#actions) to them via a message based interface. +## Compute Model -The API requires users to describe the actions and the workers explicitly fully up front and the -work is described and submitted as singular objects to the compute service. The model somewhat -resembles Lambda and other stateless compute services but is more tightly constrained to allow -for optimizations and to integrate tightly with the storage components in Zen server. +The compute interface implements a "pure function" model for distributing work, +similar in spirit to serverless compute paradigms like AWS Lambda. -This is in contrast with Unreal Build Accelerator in where the worker (remote process) -and the inputs are discovered on-the-fly as the worker progresses and inputs and results -are communicated via relatively high-frequency RPCs. 
+Clients implement transformation [Functions](#functions) in +[worker executables](#workers) and dispatch [Actions](#actions) to them via a +message-based interface. -### Actions +Actions and workers must be described explicitly and fully up front — work is +submitted as self-contained objects to the compute service. This is more +constrained than general-purpose serverless platforms, which allows for +optimizations and tight integration with Zen's storage model. -An action is described by an action descriptor, which is a compact binary object which -contains a self-contained description of the inputs and the function which should be applied -to generate an output. -#### Sample Action Descriptor +## Actions + +An action is the unit of work in the compute model. It is described by an action +descriptor — a Compact Binary object containing a self-contained description of +the inputs and the function to apply to produce an output. + +### Sample Action Descriptor ``` work item 4857714dee2383b50b2e7d72afd79848ab5d13f8 (2 attachments): @@ -39,7 +43,7 @@ Inputs: RawSize: 3334 ``` -### Functions +## Functions Functions are identified by a name, and a version specification. For matching purposes there's also a build system version specification. @@ -56,7 +60,7 @@ function version build system CompileShaderJobs 83027356-2cf7-41ca-aba5-c81ab0ff2129 17fe280d-ccd8-4be8-a9d1-89c944a70969 69cb9bb50e9600b5bd5e5ca4ba0f9187b118069a ``` -### Workers +## Workers A worker is an executable which accepts some command line options which are used to pass the information required to execute an action. There are two modes, one legacy mode which is @@ -87,7 +91,7 @@ communication with the invoking program (the 'build system'). To be able to evol interface, each worker also indicates the version of the build system using the `BuildSystemVersion` attribute. 
-#### Sample Worker Descriptor +### Sample Worker Descriptor ``` worker 69cb9bb50e9600b5bd5e5ca4ba0f9187b118069a: @@ -201,3 +205,49 @@ worker resolution. worker. `/compute/queues/{oidtoken}/jobs/{lsn}` - GET action result by LSN, scoped to the queue + +## Relationship to Unreal Build Accelerator + +Zen Compute is designed to complement Unreal Build Accelerator (UBA), not +replace it. The two systems target different workload characteristics: + +- **Zen Compute** — suited to workloads where the inputs and function are fully + known before execution begins. All data is declared up front in the action + descriptor, and the worker runs as a self-contained transformation. This + enables content-addressed caching of results and efficient scheduling. + +- **UBA** — suited to workloads where inputs are discovered dynamically as the + process runs. The remote process and its dependencies are resolved on the fly, + with inputs and results exchanged via high-frequency RPCs throughout execution. + +In practice, Zen Compute handles workloads like shader compilation where the +inputs are well-defined, while UBA handles more complex build processes with +dynamic dependency graphs. 
+ +## Execution Flow + +```mermaid +sequenceDiagram + participant C as Client + participant G as Zen Server + participant Q as Runner + participant W as Worker + + C->>G: POST /jobs + G-->>C: 202 Accepted (job_id) + G->>Q: enqueue(action) + Q-->>G: job_id + + C->>G: GET /jobs/job_id + G-->>C: 202 Accepted (job_id) + + Q->>W: spawn() + Q-->>W: action + W->>W: process + W->>Q: complete(job_id) + + C->>G: GET /jobs/job_id + G->>Q: status(job_id) + Q-->>G: done + G-->>C: 200 OK (result) +``` diff --git a/docs/CODING.md b/docs/dev/CODING.md index 8924c8107..8924c8107 100644 --- a/docs/CODING.md +++ b/docs/dev/CODING.md diff --git a/docs/Deploy.md b/docs/dev/Deploy.md index 2a5e9be0f..2a5e9be0f 100644 --- a/docs/Deploy.md +++ b/docs/dev/Deploy.md diff --git a/docs/Github_runner_setup.md b/docs/dev/Github_runner_setup.md index 42b2b1a01..42b2b1a01 100644 --- a/docs/Github_runner_setup.md +++ b/docs/dev/Github_runner_setup.md diff --git a/docs/VersioningCompatibility.md b/docs/dev/VersioningCompatibility.md index f4a283653..f4a283653 100644 --- a/docs/VersioningCompatibility.md +++ b/docs/dev/VersioningCompatibility.md diff --git a/docs/WindowsOnLinux.md b/docs/dev/WindowsOnLinux.md index 540447cb2..540447cb2 100644 --- a/docs/WindowsOnLinux.md +++ b/docs/dev/WindowsOnLinux.md diff --git a/docs/dev/xmake.md b/docs/dev/xmake.md new file mode 100644 index 000000000..a529107b8 --- /dev/null +++ b/docs/dev/xmake.md @@ -0,0 +1,54 @@ +# xmake notes and tips and tricks + +We use xmake to build code in this tree. We also use xmake to handle some of our third-party dependencies which are +not in the tree. + +This document is intended as a basic guide on how to accomplish some common things when working with the Zen codebase. + +The official documentation for xmake is located at https://xmake.io/; it covers most features but isn't +necessarily rich in examples on how to accomplish things on different platforms and environments. 
+ +# Build basics + +xmake is what I'd call a "stateful" build system in that there is a 'configuration' step which you +will generally want to run before actually building the code. This allows you to specify which +mode you want to build (in our case, "debug" or "release"), and also allows you to perform additional +configuration of options such as whether we should include support for Sentry crash reporting, use +of 'mimalloc' for memory allocations, etc. + +To configure xmake for building 'debug' (which includes building all third-party packages) and +trigger a build, issue these commands in the shell: + +``` +dev/zen> xmake config --mode=debug +dev/zen> xmake +``` + +If all goes well, you will be able to run the compiled programs using: + +``` +dev/zen> xmake run zenserver +dev/zen> xmake run zen +``` + +# Cleaning out *all* build state + +You may run into build issues at some point due to bad on-disk state. For instance your workstation +could crash at an inopportune moment leaving things in an inconsistent state, or you may run into bugs +in compilers or the build system itself. + +When faced with this, it's good to be able to wipe out all state which influences the build. Since xmake +uses a number of different locations to store state, it's not entirely obvious at first how to accomplish +this. + +## Windows + +``` +dev\zen> rmdir /s /q .xmake build %LOCALAPPDATA%\.xmake %TEMP%\.xmake +``` + +## Linux / macOS + +``` +dev/zen> rm -rf .xmake build ~/.xmake +``` diff --git a/docs/glossary.md b/docs/glossary.md new file mode 100644 index 000000000..6973f7222 --- /dev/null +++ b/docs/glossary.md @@ -0,0 +1,120 @@ +# Storage + +## DDC (Derived Data Cache) +A cache for derived data produced by the Unreal Engine cooker. DDC entries are +keyed by a hash (typically generated from the input data combined with code +versions and parameters) and typically contain cooked assets, shader bytecode, or +other build artifacts. 
DDC is one of the primary workloads served by zen. + +### Cache Namespace +A logical partition within the cache service. Each namespace (e.g. +`ns_ue.ddc`) has its own set of buckets and storage accounting. Namespaces +allow different types of cached data to be managed independently. + +### Cache Bucket +A subdivision within a cache namespace. Buckets provide fine-grained grouping of +cache entries, typically corresponding to a specific type of derived data (e.g. +`bulkdatalist`, `animationsequence`). + +## Project Store +A storage service for per-project data such as asset metadata and editor state. +Like the build store, project store entries have configurable expiration and are +garbage collected accordingly. + +## CAS (Content-Addressable Storage) +The lowest storage layer in zen. Data is stored and retrieved by its content hash +(IoHash), also known as a Content Id ('cid'). CAS provides automatic deduplication +since identical content always produces the same hash. CAS content is garbage +collected based on what is referenced from higher-level services. + +Content is immutable once written to CAS. Note that zenserver always stores data +in `CompressedBuffer` format, and the Content Id is derived from the *decompressed* +data. + +### IoHash +A 20-byte (160-bit) content hash used to identify data in CAS. IoHash is the +fundamental addressing unit for content-addressable operations. + +### Attachment +A reference from a higher-level store (build store, project store, or cache) to +a CAS object. Attachments are what keep CAS objects alive during garbage +collection — unreferenced CAS content is eligible for removal. + +## Build Store +A storage service for build artifacts. Build store entries have configurable +expiration and are subject to garbage collection when they exceed their maximum +retention duration. + +## Object Store +A general-purpose object storage service for arbitrary key-value data. 
Objects +are organized by namespace and can have associated metadata. + +# Garbage Collection + +## Full GC +A complete garbage collection pass that scans all referencers and reference +stores, removes expired data, prunes unreferenced CAS content, and optionally +compacts storage blocks. Full GC runs at a configured interval (typically hours). + +## Lightweight GC +A faster, less thorough garbage collection pass focused on removing expired +entries without performing full reference scanning or compaction. Runs more +frequently than full GC. + +## Referencer +A component that holds references to CAS content (e.g. the cache service, +project store, or build store). During GC, each referencer is scanned to +determine which CAS objects are still in use. + +## Reference Store +The CAS-side counterpart to a referencer. During GC, reference stores are scanned +to identify and remove CAS objects that are no longer referenced by any +referencer. + +## Compaction +A GC phase that reclaims fragmented disk space by rewriting storage blocks. A +block is compacted when its usage falls below the configured threshold +percentage. Compaction frees disk space but is more expensive than simple +deletion. + +## Write Block +A period during GC where write operations to the affected stores are temporarily +blocked to ensure consistency. The write block duration is tracked as a GC +performance metric. + +# Network + +## http.sys +The Windows kernel-mode HTTP server driver used by zen on Windows for maximum +throughput. Requires either administrator privileges or a URL reservation to bind +to network interfaces. Can be overridden with `--http=asio` to use the +user-mode ASIO server instead. + +## ASIO Server +An asynchronous I/O based HTTP server implementation available on all platforms. +This is the default on Linux and macOS, and can be used on Windows as an +alternative to http.sys. + +# Sessions + +## Session +A logical connection from a client application (e.g. 
Unreal Editor, a build +agent) to the zen server. Each session has an ID, application name, mode, and +tracks activity timestamps. Sessions can carry metadata and log entries. + +## Session Log +A per-session log stream visible in the sessions browser UI. The server's own +session log captures zen's internal log output, while external sessions can +post log entries via the API. + +# General + +## zen CLI +The command-line client utility (`zen`) for interacting with a running +zenserver. Provides commands for cache management, server status, diagnostics, +and more. + +## zenserver +The main server binary that integrates all zen storage services, the HTTP +server, telemetry, and the web dashboard. It can be launched in a number +of modes (storage, compute, proxy, hub) depending on the desired functionality. diff --git a/docs/overview.md b/docs/overview.md new file mode 100644 index 000000000..17e880ffb --- /dev/null +++ b/docs/overview.md @@ -0,0 +1,188 @@ +# Zen Storage Service + +Zen is a high-performance *local storage* service for Unreal Engine 5. It manages +*secondary* data such as cooker output, derived data caches (DDC), project +metadata, and build artifacts. In addition to storage, Zen can schedule certain +data transformations (shader compilation, texture compression, etc.) for +distributed execution. + +Zen is primarily deployed as a daemon on user machines for local caching of +cooker artifacts, but can also serve as a high-performance shared instance for +build farms and teams — complementing and offloading Unreal Cloud DDC. + +## Storage Services + +### DDC (Derived Data Cache) / Structured Cache + +The primary workload alongside the Project Store. Caches derived data produced +by the Unreal Engine cooker — cooked assets, shader bytecode, texture compilations, +and other intermediate artifacts. Entries are keyed by a hash derived from a set of +inputs, and organized into namespaces and buckets. See [Cache](cache.md). 
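As an illustration of how entries are addressed, a client might assemble the REST path for a cache entry from its namespace, bucket, and key hash. The validation follows the naming rules in the cache documentation (lowercase alphanumeric buckets, 40-character hex keys); the helper itself is a sketch, not actual client code:

```python
import re

_BUCKET_RE = re.compile(r"^[a-z0-9]+$")   # bucket: lowercase alphanumeric
_KEY_RE = re.compile(r"^[0-9a-f]{40}$")   # key: 40-character hex hash

def cache_entry_path(namespace: str, bucket: str, keyhash: str) -> str:
    # Build the `/z$/{namespace}/{bucket}/{keyhash}` path for a cache record,
    # rejecting names that violate the documented conventions.
    if not _BUCKET_RE.match(bucket):
        raise ValueError(f"invalid bucket name: {bucket!r}")
    if not _KEY_RE.match(keyhash):
        raise ValueError(f"invalid key hash: {keyhash!r}")
    return f"/z$/{namespace}/{bucket}/{keyhash}"
```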
+ +### Content-Addressable Storage (CAS) + +The lowest storage layer. Data is stored and retrieved by its content hash +(IoHash). CAS provides automatic deduplication — identical content always +produces the same hash. Higher-level services reference CAS objects via +attachments. + +### Project Store + +The project store holds cooked asset data and associated metadata required +for dependency tracking by the cooker. + +Entries have configurable expiration and are garbage collected accordingly. + +See [Projects](projects.md). + +### Build Store + +Storage for build artifacts (staged or packaged builds) with configurable +retention. Build store entries are subject to garbage collection when they +exceed their maximum retention duration. + +This is an optional feature, and it is disabled by default. + +### Object Store + +General-purpose key-value storage organized by namespace, with support for +associated metadata. Similar in spirit to S3 in that it presents a generic +key-value store interface. + +Primarily intended to be used for certain developer workflows, such as when +working with UE5 Individual Asset Streaming/Download. This is optional and +disabled by default. + +## Compute Services + +Zen can optionally act as a remote executor for UE5's Derived Data Build +interface, distributing fine-grained actions across multiple workers with +low-latency scheduling. + +See [Compute](compute.md) for details. + +## Supporting Services + +### Sessions + +Tracks client connections from Unreal Editor instances, build agents, and other +tools. Each session carries an application name, mode, metadata, and a log +stream visible in the web dashboard. + +### Garbage Collection + +Automatic management of storage lifecycle. Full GC scans all references, removes +expired data, prunes unreferenced CAS content, and compacts fragmented storage. +Lightweight GC runs more frequently with a focus on expiration only. 
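The GC model can be sketched as a simple mark-and-sweep over referencers. This illustrates the concept only, with hypothetical names; the real implementation also handles expiration, write blocking, and compaction:

```python
def full_gc(cas: dict[str, bytes], referencers: list[set[str]]) -> list[str]:
    # Scan every referencer to build the live set, then prune CAS objects
    # that no referencer keeps alive (the "attachments" in glossary terms).
    live: set[str] = set()
    for refs in referencers:
        live |= refs
    pruned = [cid for cid in cas if cid not in live]
    for cid in pruned:
        del cas[cid]
    return pruned
```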
+ +## Server Modes + +Zenserver supports several operational modes, selected at launch: + +### Storage Server + +The primary mode, running all storage and cache services. + +### Hub + +A coordination service for managing multiple storage instances on the +same host. + +```mermaid +graph LR + Client[UE5 Client] <--> Zen[Zen Server] + Client2[UE5 Client II] <--> Zen2[Zen Server] + + subgraph User 1 + Client + end + + subgraph User 2 + Client2 + end + + subgraph Hub[Hub Instance] + Zen + Zen2[Zen Server] + HubServer[Hub Server] + end + + Orchestrator --> HubServer +``` + +### Compute + +Action processing endpoint for distributed Derived Data Build workloads. + +```mermaid +graph LR + Client[UE5 Client] <--> Zen[Zen Server] + Zen <--> Zen2[Compute] + Zen <--> Zen3[Compute] + Zen <--> Zen4[Compute] +``` + +### Proxy + +A relay server that forwards requests to upstream storage instances, with +optional stream parsing and recording. Primarily intended for use during +development to aid in visualising component interactions in advanced setups. 
+ +```mermaid +graph LR + Client[UE5 Client] <--> Proxy <--> Zen[Zen Server] + Proxy <--> Zen2[Zen Server] +``` + +## Architecture + +```mermaid +graph TD + Client[UE5 Client] --> DDC + Client --> PRJ + Client --> OBJ + + subgraph Zen[Zen Server] + DDC[DDC Cache] + PRJ[Project Store] + OBJ[Object Store] + CAS[Content-Addressable Storage] + end + + DDC --> CAS + PRJ --> CAS + + DDC -.-> Upstream[Upstream Cache] +``` + +## Web Dashboard + +Zenserver includes an embedded web dashboard providing real-time visibility into: + +- **Storage** — disk usage, directory breakdown, cache namespaces, and garbage + collection status and history +- **Network** — HTTP request rates, latency percentiles, and bandwidth by + service +- **Sessions** — active and ended sessions with live log streaming +- **Info** — server configuration, build info, and runtime state + +## CLI + +The `zen` command-line utility provides commands for interacting with a running +zenserver: + +- **Server management** — `serve`, `up`, `admin` +- **Cache operations** — `cache`, `projectstore`, `workspaces` +- **Storage operations** — `copy`, `dedup`, `vfs`, `wipe` +- **Monitoring** — `status`, `info`, `top`, `trace`, `bench` +- **Diagnostics** — `ui` (launch web dashboard), `version` + +## HTTP Server + +Zenserver uses platform-optimized HTTP serving: + +- **Windows** — http.sys kernel-mode driver for maximum throughput (default), + with ASIO as a fallback +- **Linux / macOS** — ASIO-based asynchronous I/O server + +The server implementation can be overridden with `--http=asio` on any platform. diff --git a/docs/projects.md b/docs/projects.md new file mode 100644 index 000000000..158b4e3e1 --- /dev/null +++ b/docs/projects.md @@ -0,0 +1,123 @@ +# Project Store + +The Project Store service provides per-project storage for asset metadata, +editor state, and other project-specific data. 
Unlike the cache, which stores +ephemeral derived data, the project store manages persistent data that tracks +the state of an Unreal Engine project. + +## Concepts + +### Project + +A project is a named container identified by an alphanumeric string (with +underscores and dots allowed). Each project holds one or more operation logs. + +One project typically corresponds to an Unreal project (`.uproject`) on disk, and +will generally hold one operation log per cooked platform. + +### OpLog (Operation Log) + +An operation log is a sequential transaction log associated with a build target +or variant within a project. Each oplog tracks changes as an ordered sequence of +operations, identified by a Log Sequence Number (LSN). The operation log is +append-only and entries are immutable once written. + +### Operation Log Entry (Op) + +An atomic unit in the oplog. Most operations correspond to a cooker package save +and will contain data associated with that asset. Each operation has: + +- **Key** — a unique OID identifier determined by the client. This is typically derived from the asset name associated with the entry. +- **LSN** — its position in the oplog sequence +- **Metadata** — free-form structured data used by the cooker/editor +- **Data** — the operation payload (structured or unstructured) +- **Attachments** — referenced binary chunks stored in CAS + +An oplog may have more than one entry with the same Key. In this case the entry with the highest LSN (i.e. the most recent entry) +will be considered "current" and will be served if a client requests data for a particular Key. + +### Chunk + +A binary data blob referenced by operation log entries. Chunks are stored in CAS and +identified by their content hash, providing automatic deduplication. 
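The "highest LSN wins" rule can be sketched as a replay over the oplog. A minimal illustration with hypothetical types:

```python
def current_entries(oplog: list[tuple[int, str, bytes]]) -> dict[str, bytes]:
    # Entries as (lsn, key, payload). Replaying in ascending LSN order means
    # the entry with the highest LSN for each key ends up "current".
    current: dict[str, bytes] = {}
    for _lsn, key, payload in sorted(oplog):
        current[key] = payload
    return current
```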
+ +## API + +**Base URI:** `/prj/` + +### Project Management + +``` +GET /prj/ # List all projects +POST /prj/{project} # Create a project +PUT /prj/{project} # Update project metadata +GET /prj/{project} # Get project info +DELETE /prj/{project} # Delete a project +``` + +### OpLog Management + +``` +GET /prj/{project}/oplog/{target} # Get oplog info (head LSN) +POST /prj/{project}/oplog/{target} # Create oplog +PUT /prj/{project}/oplog/{target} # Update oplog +DELETE /prj/{project}/oplog/{target} # Delete oplog +``` + +### Operations + +``` +GET /prj/{project}/oplog/{target}/entries # List operations with LSN +GET /prj/{project}/oplog/{target}/{lsn} # Get operation by LSN +POST /prj/{project}/oplog/{target}/new # Create new operation +POST /prj/{project}/oplog/{target}/batch # Batch create operations +``` + +### Chunks + +``` +POST /prj/{project}/oplog/{target}/prep # Prepare chunk uploads +GET /prj/{project}/oplog/{target}/{chunk} # Get chunk by ID +GET /prj/{project}/oplog/{target}/{chunk}/info # Get chunk metadata +GET /prj/{project}/oplog/{target}/chunkinfos # Batch chunk info +GET /prj/{project}/oplog/{target}/files # List files in oplog +``` + +### Import / Export + +``` +POST /prj/{project}/oplog/{target}/save # Export oplog +GET /prj/{project}/oplog/{target}/load # Import oplog +``` + +### Validation + +``` +POST /prj/{project}/oplog/{target}/validate # Validate oplog integrity +``` + +### Detailed Inventory + +``` +GET /prj/details$ # All projects +GET /prj/details$/{project} # Single project +GET /prj/details$/{project}/{target} # OpLog details +GET /prj/details$/{project}/{target}/{chunk} # Operation details +``` + +Add `?csv=true` for CSV output, `?details=true` for size information, or +`?attachmentdetails=true` for full attachment listings including CAS references. + +## Storage Lifecycle + +Project store entries have configurable expiration. Expired entries are removed +during garbage collection. 
The expiration duration is set globally via the +`MaxProjectStoreDuration` configuration parameter. + +## Web Dashboard + +The Projects page in the dashboard shows: + +- List of all registered projects +- Per-project oplog details and operation counts +- Storage usage per project diff --git a/docs/sessions.md b/docs/sessions.md new file mode 100644 index 000000000..593c95283 --- /dev/null +++ b/docs/sessions.md @@ -0,0 +1,96 @@ +# Sessions + +The Sessions service tracks client connections to zenserver. Each session +represents a logical connection from a client application — an Unreal Editor +instance, a build agent, a CLI tool, or any other process interacting with the +server. + +## Concepts + +### Session + +A session is identified by a unique OID and carries: + +- **AppName** — the name of the connecting application (e.g. `UnrealEditor`, + `zen`) +- **Mode** — the operational mode of the client +- **JobId** — an optional job identifier for build farm scenarios +- **Metadata** — arbitrary key-value data attached by the client +- **Timestamps** — creation, last update, and end times + +### Session Log + +Each session has an associated log stream that can receive entries from the +client or from zenserver itself. The server's own session captures internal log +output, making it visible in the web dashboard. + +Log entries contain a timestamp, severity level (`debug`, `info`, `warn`, +`error`), a message string, and optional structured data fields. + +Logs are kept in memory with a maximum of 10,000 entries per session. When the +limit is exceeded, the oldest entries are evicted. 
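The bounded log with oldest-first eviction maps naturally onto a fixed-capacity deque. A minimal sketch (class and field names are illustrative, not the server's implementation):

```python
from collections import deque

class SessionLog:
    # Bounded in-memory log; the capacity is configurable here, while the
    # server uses 10,000 entries per session.
    def __init__(self, max_entries: int = 10_000) -> None:
        self._entries: deque = deque(maxlen=max_entries)  # oldest evicted first
        self._next_seq = 0  # monotonic sequence number, survives eviction

    def append(self, level: str, message: str) -> int:
        self._entries.append((self._next_seq, level, message))
        self._next_seq += 1
        return self._next_seq

    def __len__(self) -> int:
        return len(self._entries)
```

Keeping a monotonic sequence number per entry, rather than an index into the deque, is what lets a cursor remain valid even after old entries have been evicted.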
+ +## API + +**Base URI:** `/sessions/` + +### Listing Sessions + +``` +GET /sessions/ # Active sessions (default) +GET /sessions/?status=ended # Ended sessions +GET /sessions/?status=all # All sessions +``` + +### Managing Sessions + +``` +POST /sessions/{id} # Create or update a session +PUT /sessions/{id} # Update session metadata +DELETE /sessions/{id} # End and remove a session +GET /sessions/{id} # Get session details +``` + +When creating a session, the request body should contain: + +- `appname` — application name +- `mode` — operational mode +- `jobid` — optional job identifier +- `metadata` — optional key-value metadata + +### Session Logs + +``` +GET /sessions/{id}/log # Retrieve log entries +POST /sessions/{id}/log # Append log entries +``` + +**Retrieving logs** supports two modes: + +- **Cursor-based** (recommended): pass `?cursor=N` where N is the cursor value + from the previous response. Returns only entries appended since that cursor. +- **Offset/limit**: pass `?offset=N&limit=M` for traditional pagination. + +**Appending logs** accepts: + +- **Plain text** — each line becomes a separate log entry +- **JSON** — a single log object or `{"entries": [...]}` array, each with + `level`, `message`, and optional `data` fields + +### Live Updates + +``` +GET /sessions/ws # WebSocket connection +``` + +The WebSocket endpoint pushes the full session list to connected clients +periodically. This powers the real-time session table in the web dashboard. 
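The two accepted append formats could be normalised roughly as follows. This is a sketch only; the default `info` level for plain-text lines is an assumption, not documented behaviour:

```python
import json

def parse_log_body(body: str, content_type: str) -> list[dict]:
    if content_type == "application/json":
        doc = json.loads(body)
        if isinstance(doc, dict) and "entries" in doc:
            return doc["entries"]        # {"entries": [...]} array form
        return [doc]                     # single log object
    # Plain text: each non-empty line becomes a separate log entry.
    return [{"level": "info", "message": line}
            for line in body.splitlines() if line]
```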
+ +## Web Dashboard + +The Sessions page in the dashboard provides: + +- A table of active, ended, or all sessions with sorting and pagination +- Session detail panel showing metadata and properties +- Live log viewer with follow mode, newest-first ordering, and text filtering +- The server's own session is selected by default on page load diff --git a/docs/specs/CompactBinary.md b/docs/specs/CompactBinary.md index d8cccbd1e..f7a9637ec 100644 --- a/docs/specs/CompactBinary.md +++ b/docs/specs/CompactBinary.md @@ -110,10 +110,10 @@ Every field has a type, stored as a single byte. The low 6 bits identify the typ ### 3.1 Type byte layout ``` - Bit 7 Bit 6 Bits 5..0 -┌───────────┬───────────┬──────────────────┐ + Bit 7 Bit 6 Bits 5..0 +┌──────────────┬──────────────┬──────────────────────┐ │ HasFieldName │ HasFieldType │ Type ID (0x00–0x3F) │ -└───────────┴───────────┴──────────────────┘ +└──────────────┴──────────────┴──────────────────────┘ ``` ### 3.2 Flags |
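The type byte layout in section 3.1 maps to a few bit operations. A minimal sketch:

```python
def parse_type_byte(b: int) -> tuple[bool, bool, int]:
    # Bit 7: HasFieldName, bit 6: HasFieldType, bits 5..0: type ID (0x00-0x3F).
    has_field_name = bool(b & 0x80)
    has_field_type = bool(b & 0x40)
    type_id = b & 0x3F
    return has_field_name, has_field_type, type_id
```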