| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A collection of security, correctness, and robustness fixes in `zenhttp` and `zencore` surfaced by security review. Most items are small, independent commits grouped here because they all tighten trust boundaries or fix UB along the same code paths.
## WebSocket protocol hardening (RFC 6455)
- **Enforce the client-side mask bit**. Server-side frame loops now reject unmasked frames with close code 1002 per §5.1. Prevents HTTP intermediary smuggling.
- **Validate control frames and RSV bits**. Fragmented control frames, oversized (>125 B) control payloads, and any non-zero RSV bit now fail the connection before allocation.
- **Lower per-frame payload cap** from 256 MB → 4 MB. Bounds per-connection accumulator memory.
- **Implement message fragmentation**. Continuation frames are coalesced and delivered as a single message; interleaved non-control frames close with 1002; assembled messages are capped at 4 MB (1009 on overflow). Previously partial fragments were delivered to handlers, bypassing payload validation.
- **Parse the 101 handshake response properly** in `HttpWsClient`. Status-line, `Upgrade`, `Connection`, and `Sec-WebSocket-Accept` are now matched exactly rather than via substring searches against the full body.
## Auth / OIDC hardening
- **Constant-time password compare** in `PasswordSecurity::IsAllowed` (closes a remote length/content timing oracle). Adds a shared `ConstantTimeEquals` helper.
- **Harden Basic-auth header parsing**: trim trailing LWS, reject control bytes and DEL in the credential.
- **OIDC discovery pinning**: require HTTPS (loopback exempt), verify `issuer` matches `BaseUrl`, require `token_endpoint` / `userinfo_endpoint` / `jwks_uri` to share origin with `BaseUrl`, reject empty `token_endpoint`.
- **Restrict `POST /auth/oidc/refreshtoken`** to local-machine requests. Previously unauthenticated in default deployments — remote callers could evict or replace cached tokens.
- **Stop logging OIDC provider response bodies** on refresh failure (IdPs echo `refresh_token` back in error bodies).
- **Drop the unused `IdentityToken` field** from `OidcClient` / `OpenIdToken` so nothing in the tree accidentally trusts an unverified JWT.
## Auth state encryption migration
- Add `AesGcm` AEAD primitive (BCrypt / OpenSSL backends, mbedTLS stubbed) and `CryptoRandom::Fill` CSPRNG helper in `zencore/crypto.h`.
- Migrate authstate file from AES-256-CBC with a fixed IV to AES-GCM with a fresh 12-byte random nonce per write and the 4-byte `ZEN1` magic bound as AAD. Legacy-CBC files are transparently read once and rewritten in the new format.
## Filesystem / IO robustness
- `IoBufferExtendedCore::Materialize` now checks `MAP_FAILED` on POSIX (was comparing to `nullptr`, which let the failure sentinel propagate into later reads and `munmap(MAP_FAILED, ...)`).
- `IoBufferBuilder::MakeFromFile / MakeFromTemporaryFile`: close the FD/HANDLE on exception via a dismissable `ScopeGuard`; actually check the `fstat()` return value (previously used an uninitialized `FileSize`).
- `ReadFromFileMaybe`: loop short reads, retry `EINTR`, chunk Windows `ReadFile` at `0xFFFFFFFF` bytes (fixes silent truncation of multi-GiB reads).
- `WipeDirectory`: compare `FindFirstFileW` handle against `INVALID_HANDLE_VALUE` rather than `nullptr`.
- `RemoveFileNative` (Linux/macOS): report non-`ENOENT` stat failures via the `std::error_code` out-param and stop reading `st_mode` after a failed stat.
## Buffer / compression correctness
- Avoid per-copy `IoBufferCore` heap allocations in `CompositeBuffer::CopyTo / ViewOrCopyRange` iterators; add fast path for `BufferHeader::Read` when the 64-byte header fits in the first plain-memory segment.
- `BufferHeader`: add `IsHeaderValid()` gate covering `BlockSizeExponent` range, `BlockCount * BlockSize` overflow, and `TotalRawSize` bounds before any arithmetic uses them. Defends against attacker-controlled headers that can pass the CRC and trigger OOB writes in `DecompressBlock`.
|
| |
|
|
|
|
|
| |
- `GetEnvVariable` now returns `std::optional<std::string>` so callers can distinguish an unset variable from one set to an empty value.
- Windows path uses `SetLastError(ERROR_SUCCESS)` + `ERROR_ENVVAR_NOT_FOUND` to detect "not found"; POSIX path returns `nullopt` when `getenv` returns `nullptr`.
- All call sites migrated. Most use `.value_or("")` to preserve current empty-or-unset behavior. The diagnostic helpers in `zen-test/artifactprovider-tests.cpp` now report `<unset>` vs `<empty>` distinctly.
- Added a check in the `ExpandEnvironmentVariables` test confirming `nullopt` for an unset variable; PATH/HOME lookups in that test use `REQUIRE(has_value())` so a missing var fails cleanly instead of throwing `bad_optional_access`.
|
| |
|
|
|
|
|
|
|
| |
- Bugfix: `builds download` partial-block fetch decisions now account for build storage host latency
- Bugfix: Transfer rate displays in `builds` commands now smooth correctly
- Split `buildstorageoperations.cpp` (8.5k lines) into per-operation TUs: buildinspect, buildprimecache, buildstorageresolve, buildupdatefolder, builduploadfolder, buildvalidatebuildpart; stats moved to buildstoragestats.h.
- FilteredRate extracted to zenutil.
- BuildsCommand shared state consolidated into a BuildsConfiguration struct; subcommands inherit from BuildsSubCmdBase holding a `const BuildsConfiguration&` instead of a `BuildsCommand&`.
- `ProgressBar` renamed to `ConsoleProgressBar`; mode enum (`ConsoleProgressMode`) lifted to namespace scope; `PushLogOperation`/`PopLogOperation`/`ForceLinebreak` promoted to virtuals on `ProgressBase`.
- Free-function wrappers (`UploadFolder`, `DownloadFolder`, `ValidateBuildPart`) added around the existing operation classes so callers stop reimplementing setup + stats logging.
|
| |
|
|
| |
- Improvement: New `ZEN_SCOPED_LOG(Expr)` macro routes `ZEN_INFO`/`ZEN_WARN`/`ZEN_DEBUG` in the enclosing block through the given logger expression instead of the default
- Improvement: `BuildContainer`, `SaveOplog`, and `LoadOplogContext` now take a caller-provided `LoggerRef` so diagnostic messages route through the caller's logger
|
| |
|
| |
- Improvement: Replaced `OperationLogOutput` with `ProgressBase` in `zenutil`; logging and progress reporting are now separate concerns. Operation classes receive a `LoggerRef` for logging and a `ProgressBase&` for progress bars
|
| |
|
|
| |
Fix strings passed to RegisterRoute in HttpProjectStore: strings are literals now instead of regexes, so '\$' needs to be changed to just '$'.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
### Security: Input validation & path safety
- **Reject local file references by default** in package parsing — only allow when explicitly opted in by the service (`ParseFlags::kAllowLocalReferences`) and validated by an `ILocalRefPolicy` (fail-closed: no policy = rejected)
- **`DataRootLocalRefPolicy`** restricts local ref paths to the server's data root via canonical path prefix matching
- **Validate attachment hashes** in compute HTTP handlers — decompresses and re-hashes each attachment at ingestion time to reject tampered payloads
- **Path traversal validation** for worker descriptions (`pathvalidation.h`) — rejects absolute paths, `..` components, Windows reserved device names, and invalid filename characters
- **Harden CbPackage parsing** against corrupt inputs — overflow-safe attachment count, bounds checks on local ref offset/size, graceful failure instead of `ZEN_ASSERT` for untrusted data
- **Harden legacy package parser** — reject zero-size binary fields, missing mappers, and optionally validate resolved attachment hashes
- **Bounds check in `CbPackageReader::MarshalLocalChunkReference`** — detect when `MakeFromFile` silently clamps offset+size to file size
### Reliability: Lock consolidation & bug fixes
- **Consolidate three action map locks into one** (`m_ActionMapLock`) — eliminates deadlock risk from multi-lock ordering, simplifies state transitions, and fixes a race where newly enqueued actions were briefly invisible to `GetActionResult`/`FindActionResult`
- **Fix infinite loop in `BaseRunnerGroup::SubmitActions`** when actions exceed total runner capacity — cap round-robin at `TotalCapacity` and default unassigned results to "No capacity"
- **Fix `MakeSafeAbsolutePathInPlace` for UNC paths** — `\server\share` now correctly becomes `\?\UNC\server\share` instead of `\?\server\share`
- **Fix `max_retries=0`** — previously fell through to the default of 3; now correctly means "no retries"
### New: ManagedProcessRunner
- Cross-platform process runner backed by `SubprocessManager` — uses async exit callbacks instead of polling, delegates CPU/memory metrics to the manager's built-in sampler
- `ProcessGroup` (JobObject on Windows, process group on POSIX) for bulk cancellation on shutdown
- `--managed` flag on `zen exec inproc` to select this runner
- Refactored monitor thread lifecycle — `StartMonitorThread()` now called from derived constructors to avoid calling virtual functions from base constructor
### Process management
- **Suppress crash dialogs** via `JOB_OBJECT_UILIMIT_ERRORMODE` + `SEM_NOGPFAULTERRORBOX` in both `WindowsProcessRunner` and `JobObject::Initialize` — prevents WER/Dr. Watson modal dialogs from blocking the monitor thread
- **CREATE_SUSPENDED → AssignProcessToJobObject → ResumeThread** pattern in `WindowsProcessRunner` — ensures job object assignment before process execution
- **Move stdout/stderr callbacks to `Spawn()` parameters** in `SubprocessManager` — prevents race where early output could be missed before callback installation
- Consistent PID logging across all runner types
### Test infrastructure
- **`zentest-appstub`**: Added `Fail` (configurable exit code) and `Crash` (abort / nullptr deref) test functions
- **Compute integration tests**: exit code handling, auto-retry exhaustion, manual reschedule after failure, mixed success/failure queues, crash handling (abort + nullptr), crash auto-retry, immediate query visibility after enqueue
- **Package format tests**: truncated header, bad magic, attachment count overflow, truncated data, local ref rejection/acceptance, policy enforcement (inside/outside root, traversal, no-policy fail-closed)
- **Legacy package parser tests**: empty input, zero-size binary, hash resolution with/without mapper, hash mismatch detection
- **UNC path tests** for `MakeSafeAbsolutePath`
### Misc
- ANSI color helper macros (`ZEN_RED`, `ZEN_BRIGHT_WHITE`, etc.) and `ZEN_BOLD`/`ZEN_DIM`/etc.
- Generic `fmt::formatter` for types with free `ToString` functions
- Compute dashboard: truncated hash display with monospace font and hover for full value
- Renamed `usonpackage_forcelink` → `cbpackage_forcelink`
- Compute enabled by default in xmake config (releases still explicitly disable)
|
| |
|
|
|
|
| |
- Feature: Added Workspaces dashboard page with HTTP request stats and per-workspace metrics
- Feature: Added Build Storage dashboard page with service-specific HTTP request stats
- Improvement: Front page now shows Hub and Object Store activity tiles; HTTP panel is fixed above the tiles grid
- Improvement: HTTP stats tiles now include 5m/15m rates and p999/max latency across all service pages
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Feature: Hub watchdog automatically deprovisions inactive provisioned and hibernated instances
- Feature: Added `stats/activity_counters` endpoint to measure server activity
- Feature: Added configuration options for hub watchdog
- `--hub-watchdog-provisioned-inactivity-timeout-seconds` Inactivity timeout before a provisioned instance is deprovisioned
- `--hub-watchdog-hibernated-inactivity-timeout-seconds` Inactivity timeout before a hibernated instance is deprovisioned
- `--hub-watchdog-inactivity-check-margin-seconds` Margin before timeout at which an activity check is issued
- `--hub-watchdog-cycle-interval-ms` Watchdog poll interval in milliseconds
- `--hub-watchdog-cycle-processing-budget-ms` Maximum time budget per watchdog cycle in milliseconds
- `--hub-watchdog-instance-check-throttle-ms` Minimum delay between checks on a single instance
- `--hub-watchdog-activity-check-connect-timeout-ms` Connect timeout for activity check requests
- `--hub-watchdog-activity-check-request-timeout-ms` Request timeout for activity check requests
|
| |
|
| |
Authentication callbacks are not thread safe, ensured call sites does single threaded calls
|
| | |
|
| | |
|
| |
|
|
|
| |
- Improvement: Fixed issue where oplog upload could create blocks larger than the max limit (64Mb)
Refactored remoteprojectstore.cpp to use ParallelWork and exceptions for error handling.
|
| |\ |
|
| | |\ |
|
| | |\ \ |
|
| | | | |
| | | |
| | | |
| | | | |
command line or config
|
| | | | |
| | | |
| | | |
| | | | |
and add lua option
|
| | | | |
| | | |
| | | |
| | | | |
OidcToken executable path
|
| |\ \ \ \
| | |_|/
| |/| | |
|
| | | | |
| | | |
| | | |
| | | |
| | | | |
* create oplogs as they are imported
* Improved logic for partial block analisys
* unit tests for ChunkBlockAnalyser
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS.
- **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking.
- **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration.
- **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP.
- **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch.
- **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support.
Also contains some fixes from TSAN analysis.
|
| | | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
* clean up BuildStorageResolveResult to allow capabilities
* add check for multirange request capability
* add MaxRangeCountPerRequest capabilities
* project export tests
* add InMemoryBuildStorageCache
* progress and logging improvements
* fix ElapsedSeconds calculations in fileremoteprojectstore.cpp
* oplogs/builds test script
|
| | | |/
| |/|
| | |
| | |
| | | |
Feature: Add --allow-partial-block-requests to zen oplog-import
Improvement: zen oplog-import now uses partial block requests to reduce download size
Improvement: Use latency to Cloud Storage host and Zen Cache host when calculating partial block requests
|
| | | |
| | |
| | |
| | | |
or config
|
| |/ / |
|
| | |
| |
| |
| | |
* replace http router AddPattern with AddMatcher
* fix scrub logging
|
| | | |
|
| | |
| |
| |
| | |
project root
|
| | |
| |
| |
| | |
out of loop
|
| |/ |
|
| |
|
|
|
|
|
|
|
|
| |
- Feature: `zen oplog-export`, `zen oplog-import` and `zen oplog-download` now has options to boost workers
- `--boost-worker-count` - Increase the number of worker threads - may cause computer to be less responsive
- `--boost-worker-memory` - Increase the limit where we write downloaded data to temporary storage to conserve space - may cause computer to be less responsive due to high memory usage
- `--boost-workers` - Enables both 'boost-worker-count' and 'boost-worker-memory' - may cause computer to be less responsive
- Improvement: Refactored boost options for `zen builds` operations `upload`, `download`, `diff`, `prime-cache`, `fetch-blob` and `validate-part`
- `--boost-worker-count` - Increase the number of worker threads - may cause computer to be less responsive
- `--boost-worker-memory` - Increase the limit where we write downloaded data to temporary storage to conserve space - may cause computer to be less responsive due to high memory usage
- `--boost-workers` - Enables both 'boost-worker-count' and 'boost-worker-memory' - may cause computer to be less responsive
|
| | |
|
| |
|
| |
* add option to enable/disable upload to builds cache
|
| |
|
|
|
|
| |
* make sure the correct `UE_WITH_TRACE` conditional is used to enable/disable support code as appropriate
* fixed some accidental `int32`, `int64` et al usage, due to typedefs leaking through from trace header
with this fix, it is now possible to build with `--zentrace=no` again
|
| |
|
| |
* add host discovery and zen cache support for oplog import
|
|
|
* move all storage-related services into storage tree
* move config into config/
* also move admin service into storage since it mostly has storage related functionality
* header consolidation
|