| author | Stefan Boberg <[email protected]> | 2026-03-04 14:13:46 +0100 |
|---|---|---|
| committer | GitHub Enterprise <[email protected]> | 2026-03-04 14:13:46 +0100 |
| commit | 0763d09a81e5a1d3df11763a7ec75e7860c9510a | |
| tree | 074575ba6ea259044a179eab0bb396d37268fb09 /docs | |
| parent | native xmake toolchain definition for UE-clang (#805) | |
compute orchestration (#763)
- Added local process runners for Linux/Wine, Mac with some sandboxing support
- Horde & Nomad provisioning for development and testing
- Client session queues with lifecycle management (active/draining/cancelled), automatic retry with configurable limits, and manual reschedule API
- Improved web UI for orchestrator, compute, and hub dashboards with WebSocket push updates
- Some security hardening
- Improved scalability and `zen exec` command
Still experimental - compute support is disabled by default
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/compute.md | 79 |
1 files changed, 65 insertions, 14 deletions
diff --git a/docs/compute.md b/docs/compute.md
index 417622f94..df8a22870 100644
--- a/docs/compute.md
+++ b/docs/compute.md
@@ -122,31 +122,82 @@ functions:
 version: '83027356-2cf7-41ca-aba5-c81ab0ff2129'
 ```
 
-## API (WIP not final)
+## API
 
-The compute interfaces are currently exposed on the `/apply` endpoint but this
-will be subject to change as we adapt the interfaces during development. The LSN
+The compute interfaces are exposed on the `/compute` endpoint. The LSN
 APIs below are intended to replace the action ID oriented APIs.
 
 The POST APIs typically involve a two-step dance where a descriptor is POSTed and
-the service responds with a list of `needs` chunks (identified via `IoHash`) which
-it does not have yet. The client can then follow up with a POST of a Compact Binary
+the service responds with a list of `needs` chunks (identified via `IoHash`) which
+it does not have yet. The client can then follow up with a POST of a Compact Binary
 Package containing the descriptor along with the needed chunks.
 
-`/apply/ready` - health check endpoint returns HTTP 200 OK or HTTP 503
+`/compute/ready` - health check endpoint returns HTTP 200 OK or HTTP 503
 
-`/apply/sysinfo` - system information endpoint
+`/compute/sysinfo` - system information endpoint
 
-`/apply/record/start`, `/apply/record/stop` - start/stop action recording
+`/compute/record/start`, `/compute/record/stop` - start/stop action recording
 
-`/apply/workers/{worker}` - GET/POST worker descriptors and payloads
+`/compute/workers/{worker}` - GET/POST worker descriptors and payloads
 
-`/apply/jobs/completed` - GET list of completed actions
+`/compute/jobs/completed` - GET list of completed actions
 
-`/apply/jobs/{lsn}` - GET completed action results from LSN, POST action cancellation by LSN, priority changes by LSN
+`/compute/jobs/{lsn}` - GET completed action results from LSN, POST action cancellation by LSN, priority changes by LSN
 
-`/apply/jobs/{worker}/{action}` - GET completed action (job) results by action ID
+`/compute/jobs/{worker}/{action}` - GET completed action (job) results by action ID
 
-`/apply/jobs/{worker}` - GET pending/running jobs for worker, POST requests to schedule action as a job
+`/compute/jobs/{worker}` - GET pending/running jobs for worker, POST requests to schedule action as a job
 
-`/apply/jobs` - POST request to schedule action as a job
+`/compute/jobs` - POST request to schedule action as a job
+
+### Queues
+
+Queues provide a way to logically group actions submitted by a client session. This enables
+per-session cancellation and completion polling without affecting actions submitted by other
+sessions.
+
+#### Local access (integer ID routes)
+
+These routes use sequential integer queue IDs and are restricted to local (loopback)
+connections only. Remote requests receive HTTP 403 Forbidden.
+
+`/compute/queues` - POST to create a new queue. Returns a `queue_id` which is used to
+reference the queue in subsequent requests.
+
+`/compute/queues/{queue}` - GET queue status (active, completed, failed, and cancelled
+action counts, plus `is_complete` flag indicating all actions have finished). DELETE to
+cancel all pending and running actions in the queue.
+
+`/compute/queues/{queue}/completed` - GET list of completed action LSNs for this queue
+whose results have not yet been retired. A queue-scoped alternative to `/compute/jobs/completed`.
+
+`/compute/queues/{queue}/jobs` - POST to submit an action to a queue with automatic worker
+resolution. Accepts an optional `priority` query parameter.
+
+`/compute/queues/{queue}/jobs/{worker}` - POST to submit an action to a queue targeting a
+specific worker. Accepts an optional `priority` query parameter.
+
+`/compute/queues/{queue}/jobs/{lsn}` - GET action result by LSN, scoped to the queue
+
+#### Remote access (OID token routes)
+
+These routes use cryptographically generated 24-character hex tokens (OIDs) instead of
+integer queue IDs. Tokens are unguessable and safe to use over the network. The token
+mapping lives entirely in the HTTP service layer; the underlying compute service only
+knows about integer queue IDs.
+
+`/compute/queues/remote` - POST to create a new queue with token-based access. Returns
+`queue_token` (24-char hex string) and `queue_id` (integer, for internal visibility).
+
+`/compute/queues/{oidtoken}` - GET queue status or DELETE to cancel, same semantics as
+the integer ID variant but using the OID token for identification.
+
+`/compute/queues/{oidtoken}/completed` - GET list of completed action LSNs for this queue.
+
+`/compute/queues/{oidtoken}/jobs` - POST to submit an action to a queue with automatic
+worker resolution.
+
+`/compute/queues/{oidtoken}/jobs/{worker}` - POST to submit an action targeting a specific
+worker.
+
+`/compute/queues/{oidtoken}/jobs/{lsn}` - GET action result by LSN, scoped to the queue
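
The queue routes and the two-step POST flow documented in this change can be illustrated with a small Python sketch. Nothing here is taken from the zen codebase: the helper names `queue_path` and `submit_two_step`, the injected `post` callable, and the dict payloads with `descriptor` and `chunks` keys are hypothetical stand-ins (the real service exchanges Compact Binary, and the exact response shape is an assumption). Only the route layout, the `needs` list, and the 24-character hex token format come from the documentation text.

```python
import re
from typing import Any, Callable, Dict, Mapping, Union

# Remote queues are addressed by 24-character hex tokens (OIDs), per the docs.
# Accepting both upper and lower case hex digits is an assumption.
_OID_TOKEN = re.compile(r"[0-9a-fA-F]{24}")


def queue_path(queue: Union[int, str], *parts: Union[int, str]) -> str:
    """Build a /compute/queues/... path (hypothetical helper).

    An int is treated as a local integer queue ID, a str as a remote OID token.
    """
    if isinstance(queue, int) and not isinstance(queue, bool):
        ident = str(queue)
    elif isinstance(queue, str) and _OID_TOKEN.fullmatch(queue):
        ident = queue
    else:
        raise ValueError(f"not an integer queue ID or 24-char hex token: {queue!r}")
    return "/".join(["/compute/queues", ident, *(str(p) for p in parts)])


# Assumed transport: takes a payload dict, returns the decoded response dict.
Post = Callable[[Dict[str, Any]], Dict[str, Any]]


def submit_two_step(post: Post, descriptor: Dict[str, Any],
                    local_chunks: Mapping[str, bytes]) -> Dict[str, Any]:
    """POST the descriptor alone, then a package with whatever chunks are needed."""
    first = post({"descriptor": descriptor})
    needs = list(first.get("needs", []))  # hashes the service does not have yet
    if not needs:
        return first  # the service already had every chunk
    package = {
        "descriptor": descriptor,
        "chunks": {chunk_hash: local_chunks[chunk_hash] for chunk_hash in needs},
    }
    return post(package)
```

Under these assumptions, `queue_path(7, "jobs")` gives `/compute/queues/7/jobs`, and passing a 24-character hex token instead of the integer produces the corresponding remote route. Injecting `post` keeps the sketch independent of any particular HTTP client or serialization layer.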