| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
| |
Switch from a fixed 10ms poll loop (curl_multi_perform) to the
event-driven curl_multi_socket_action API. curl now tells us which
sockets to watch via CURLMOPT_SOCKETFUNCTION and when to fire
timeouts via CURLMOPT_TIMERFUNCTION. Socket readiness is detected
through asio::ip::tcp::socket::async_wait, eliminating the polling
latency entirely.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Introduces AsyncHttpClient backed by curl_multi_perform driven by an
ASIO steady_timer. Supports callback-based and std::future-based APIs
for GET/POST/PUT/DELETE/HEAD, with both owned and external io_context
modes. All curl_multi state is serialized on an asio::strand, making
it safe with multi-threaded io_contexts.
Shared curl helpers (callbacks, URL encoding, header construction,
error mapping) extracted into httpclientcurlhelpers.h to eliminate
duplication between the sync and async implementations.
|
| | |
|
| | |
|
| |
|
|
| |
* use correct health endpoint for zenhubserver consul registration
* add total disk space on hub resource pane
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
* fix endpoint for stats/hub in compute/hub.html page
* fix api token call failure for imds (using wrong overload for Put)
* add "localhost" to healt check url in consul when no address is given
* add consul fallback deregister if normal deregister fails
* add consul registration unit test
|
| | |
|
| |
|
| |
- Feature: Hub dashboard proxy - instance dashboards are accessible through the hub server at `/hub/proxy/{port}/` without requiring direct port access
|
| | |
|
| |
|
|
|
| |
- Improvement: Hub child process spawning on macOS now uses `posix_spawn` in line with Apple recommendations
- Bugfix: Hub child process spawning on Linux now uses `vfork` instead of `fork`, preventing ENOMEM failures on systems with strict memory overcommit (`vm.overcommit_memory=2`)
- Bugfix: Fixed process group management on POSIX; child processes were not placed into the correct process group, breaking group-wide signal delivery
|
| |
|
| |
- Improvement: Consul token is now re-read from the environment variable on every request, allowing token rotation without restarting the service
|
| |
|
|
|
| |
CI test runs (#909)
Adds steps to the validate workflow on all platforms that kill any zenserver, minio, nomad, or consul processes launched from the build output directory. Runs before tests to clear stale processes from previous runs, and after tests (always, even on failure) to clean up.
|
| |
|
|
|
|
| |
* Unit test coverage for zero byte file handling in oplogs
* Unit test fixes for the zero length file case
* Fixes for zero length file attachments
* Additional fix for zero length file attachments
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
| |
(#907)
* fix potential race with stats counters missing when to Stop filtered values
* fix off by one in PutMultipartBuildBlob retry path
* use move operation instead of copy operation PutMultipartBlob
* fix filter Stop() for upload operations and fix bug with generateblock count filter
|
| |
|
| |
- Bugfix: Fixed concurrency issue in JupiterBuildStorage when updating stats
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
- Improvement: Hub server now supports Lua config file for all hub-specific options
- `hub.upstreamnotification.*` - upstream notification endpoint and instance ID
- `hub.consul.*` - service registration endpoint, token, health interval, deregister timeout
- `hub.instance.*` - base port, HTTP class, thread count, core limit, config path
- `hub.instance.limits.*` - instance count cap, disk and memory usage limits
- `hub.hydration.*` - hydration target spec and config path
- `hub.watchdog.*` - cycle timing, inactivity timeouts, and activity check timeouts
- Improvement: Added `--hub-instance-base-port-number` as an alias for `--hub-base-port-number`, and `--upstream-notification-instance-id` as an alias for `--instance-id`
- Improvement: Added hub mode documentation at docs/hub.md
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
### Security: Input validation & path safety
- **Reject local file references by default** in package parsing — only allow when explicitly opted in by the service (`ParseFlags::kAllowLocalReferences`) and validated by an `ILocalRefPolicy` (fail-closed: no policy = rejected)
- **`DataRootLocalRefPolicy`** restricts local ref paths to the server's data root via canonical path prefix matching
- **Validate attachment hashes** in compute HTTP handlers — decompresses and re-hashes each attachment at ingestion time to reject tampered payloads
- **Path traversal validation** for worker descriptions (`pathvalidation.h`) — rejects absolute paths, `..` components, Windows reserved device names, and invalid filename characters
- **Harden CbPackage parsing** against corrupt inputs — overflow-safe attachment count, bounds checks on local ref offset/size, graceful failure instead of `ZEN_ASSERT` for untrusted data
- **Harden legacy package parser** — reject zero-size binary fields, missing mappers, and optionally validate resolved attachment hashes
- **Bounds check in `CbPackageReader::MarshalLocalChunkReference`** — detect when `MakeFromFile` silently clamps offset+size to file size
### Reliability: Lock consolidation & bug fixes
- **Consolidate three action map locks into one** (`m_ActionMapLock`) — eliminates deadlock risk from multi-lock ordering, simplifies state transitions, and fixes a race where newly enqueued actions were briefly invisible to `GetActionResult`/`FindActionResult`
- **Fix infinite loop in `BaseRunnerGroup::SubmitActions`** when actions exceed total runner capacity — cap round-robin at `TotalCapacity` and default unassigned results to "No capacity"
- **Fix `MakeSafeAbsolutePathInPlace` for UNC paths** — `\server\share` now correctly becomes `\?\UNC\server\share` instead of `\?\server\share`
- **Fix `max_retries=0`** — previously fell through to the default of 3; now correctly means "no retries"
### New: ManagedProcessRunner
- Cross-platform process runner backed by `SubprocessManager` — uses async exit callbacks instead of polling, delegates CPU/memory metrics to the manager's built-in sampler
- `ProcessGroup` (JobObject on Windows, process group on POSIX) for bulk cancellation on shutdown
- `--managed` flag on `zen exec inproc` to select this runner
- Refactored monitor thread lifecycle — `StartMonitorThread()` now called from derived constructors to avoid calling virtual functions from base constructor
### Process management
- **Suppress crash dialogs** via `JOB_OBJECT_UILIMIT_ERRORMODE` + `SEM_NOGPFAULTERRORBOX` in both `WindowsProcessRunner` and `JobObject::Initialize` — prevents WER/Dr. Watson modal dialogs from blocking the monitor thread
- **CREATE_SUSPENDED → AssignProcessToJobObject → ResumeThread** pattern in `WindowsProcessRunner` — ensures job object assignment before process execution
- **Move stdout/stderr callbacks to `Spawn()` parameters** in `SubprocessManager` — prevents race where early output could be missed before callback installation
- Consistent PID logging across all runner types
### Test infrastructure
- **`zentest-appstub`**: Added `Fail` (configurable exit code) and `Crash` (abort / nullptr deref) test functions
- **Compute integration tests**: exit code handling, auto-retry exhaustion, manual reschedule after failure, mixed success/failure queues, crash handling (abort + nullptr), crash auto-retry, immediate query visibility after enqueue
- **Package format tests**: truncated header, bad magic, attachment count overflow, truncated data, local ref rejection/acceptance, policy enforcement (inside/outside root, traversal, no-policy fail-closed)
- **Legacy package parser tests**: empty input, zero-size binary, hash resolution with/without mapper, hash mismatch detection
- **UNC path tests** for `MakeSafeAbsolutePath`
### Misc
- ANSI color helper macros (`ZEN_RED`, `ZEN_BRIGHT_WHITE`, etc.) and `ZEN_BOLD`/`ZEN_DIM`/etc.
- Generic `fmt::formatter` for types with free `ToString` functions
- Compute dashboard: truncated hash display with monospace font and hover for full value
- Renamed `usonpackage_forcelink` → `cbpackage_forcelink`
- Compute enabled by default in xmake config (releases still explicitly disable)
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Feature: Added `--hub-hydration-target-config` option to specify the hydration target via a JSON config file (mutually exclusive with `--hub-hydration-target-spec`); supports `file` and `s3` types with structured settings
```json
{
"type": "file",
"settings": {
"path": "/path/to/hydration/storage"
}
}
```
```json
{
"type": "s3",
"settings": {
"uri": "s3://bucket[/prefix]",
"region": "us-east-1",
"endpoint": "http://localhost:9000",
"path-style": true
}
}
```
- Improvement: Hub hydration dehydration skips the `.sentry-native` directory
- Bugfix: Fixed `MakeSafeAbsolutePathInPlace` when a UNC prefix is present but path uses mixed delimiters
|
| |
|
|
|
|
|
|
|
|
|
|
| |
- Feature: Hub dashboard now shows a Resources tile with disk and memory usage against configured limits
- Feature: Hub module listing now shows state-change timestamps and duration for each instance
- Improvement: Hub provisioning rejects new instances when disk or memory usage exceeds configurable thresholds; limits are disabled by default (0 = no limit)
- `--hub-provision-disk-limit-bytes` - Reject provisioning when used disk exceeds this many bytes
- `--hub-provision-disk-limit-percent` - Reject provisioning when used disk exceeds this percentage of total disk
- `--hub-provision-memory-limit-bytes` - Reject provisioning when used memory exceeds this many bytes
- `--hub-provision-memory-limit-percent` - Reject provisioning when used memory exceeds this percentage of total RAM
- Improvement: Hub process metrics are now tracked atomically per active instance slot, eliminating per-query process handle lookups
- Improvement: Hub, Build Store, and Workspaces service stats sections in the dashboard are now collapsible
- Bugfix: Hub watchdog loop did not check `m_ShutdownFlag`, causing it to spin indefinitely on shutdown
|
| |
|
| |
Replace doctest SUBCASEs with sequential scoped blocks so the MinIO server is spawned once and torn down via RAII at scope exit, instead of being restarted for every subcase re-entry. Fixes flaky CI on macOS caused by repeated MinIO process start/stop.
|
| |\
| |
| | |
Zs/file intern extern conversion
|
| | | |
|
| | | |
|
| |/ |
|
| |
|
| |
Split create_release.yml into a lightweight gate that checks for an existing git tag and a reusable workflow (create_release_impl.yml) containing all build/release jobs. When VERSION.txt is merged to a non-release branch the tag check short-circuits the entire workflow, preventing duplicate builds and failed artifact uploads.
|
| | |
|
| | |
|
| | |
|
| | |
|
| |
|
|
|
|
|
|
|
|
| |
- **Eliminate `<regex>` usage** — Replaced `std::regex`-based URL parsing in `jupiterbuildstorage.cpp` with manual `string_view` parsing. Added `CXXOPTS_NO_REGEX` to disable regex in cxxopts. Includes comprehensive tests for the new URL parser.
- **Add missing HTTP response codes** — Added `102`, `103`, `203`, `207`, `208`, `226`, `306`, `421`, `425`, `451` to the enum and reason string lookup.
- **Add `ForceColor` support to zen CLI** — Plumbed the `ForceColor` logging option through to the zen client.
- **Add `.clangd` config** — Strips MSVC-specific flags clangd can't handle and suppresses noisy clang-tidy checks.
- **Generic `fmt::formatter` for `ToString`** — Concept-based formatter that auto-formats any type with a free `ToString()` function, removing the need for per-type specializations.
- **Fix OpenSSL dependency** — Changed `zenhorde` to use `openssl3` package on Linux/macOS.
- **Add `<cmath>` include** — Missing include in `hyperloglog.h`.
- **GCC compile fix** — Moved `static constinit` variable inside lambda in `logging.cpp`.
|
| | |
|
| |
|
|
|
|
| |
- Feature: Added Workspaces dashboard page with HTTP request stats and per-workspace metrics
- Feature: Added Build Storage dashboard page with service-specific HTTP request stats
- Improvement: Front page now shows Hub and Object Store activity tiles; HTTP panel is fixed above the tiles grid
- Improvement: HTTP stats tiles now include 5m/15m rates and p999/max latency across all service pages
|
| |
|
| |
CPR is no longer needed now that HttpClient has fully transitioned to raw libcurl. This removes the CPR library, its build integration, implementation files, and all conditional compilation guards, leaving curl as the sole HTTP client backend.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
- Feature: Hub watchdog automatically deprovisions inactive provisioned and hibernated instances
- Feature: Added `stats/activity_counters` endpoint to measure server activity
- Feature: Added configuration options for hub watchdog
- `--hub-watchdog-provisioned-inactivity-timeout-seconds` Inactivity timeout before a provisioned instance is deprovisioned
- `--hub-watchdog-hibernated-inactivity-timeout-seconds` Inactivity timeout before a hibernated instance is deprovisioned
- `--hub-watchdog-inactivity-check-margin-seconds` Margin before timeout at which an activity check is issued
- `--hub-watchdog-cycle-interval-ms` Watchdog poll interval in milliseconds
- `--hub-watchdog-cycle-processing-budget-ms` Maximum time budget per watchdog cycle in milliseconds
- `--hub-watchdog-instance-check-throttle-ms` Minimum delay between checks on a single instance
- `--hub-watchdog-activity-check-connect-timeout-ms` Connect timeout for activity check requests
- `--hub-watchdog-activity-check-request-timeout-ms` Request timeout for activity check requests
|
| |
|
| |
Fixes incorrect .pdata section for ASM functions on Windows, early failure on invalid dictionarySize, and removes spurious libcxx dependency on Linux/Mac. Adds Linux ARM64 and Windows ARM64 libraries.
|
| |
|
|
|
|
| |
- Improvement: Provisioning a hibernated instance now automatically wakes it instead of requiring an explicit wake call first
- Improvement: Deprovisioning now accepts instances in Crashed or Hibernated states, not just Provisioned
- Improvement: Added `--consul-health-interval-seconds` and `--consul-deregister-after-seconds` options to control Consul health check behavior (defaults: 10s and 30s)
- Improvement: Consul registration now occurs when provisioning starts; health check intervals are applied once provisioning completes
|
| |
|
|
|
| |
- Improvement: Hub provision, deprovision, hibernate, and wake operations are now async. HTTP requests returns 202 Accepted while the operation completes in the background
- Improvement: Hub returns 202 Accepted (instead of 409 Conflict) when the same async operation is already in progress for a module
- Improvement: Hub returns 200 OK when a requested state transition is already satisfied
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a `SubprocessManager` for managing child processes with ASIO-integrated async exit detection, stdout/stderr pipe capture, and periodic metrics sampling. Also introduces `ProcessGroup` for OS-backed process grouping (Windows JobObjects / POSIX process groups).
### SubprocessManager
- Async process exit detection using platform-native mechanisms (Windows `object_handle`, Linux `pidfd_open`, macOS `kqueue EVFILT_PROC`) — no polling
- Stdout/stderr capture via async pipe readers with per-process or default callbacks
- Periodic round-robin metrics sampling (CPU, memory) across managed processes
- Spawn, adopt, remove, kill, and enumerate managed processes
### ProcessGroup
- OS-level process grouping: Windows JobObject (kill-on-close guarantee), POSIX `setpgid` (bulk signal delivery)
- Atomic group kill via `TerminateJobObject` (Windows) or `kill(-pgid, sig)` (POSIX)
- Per-group aggregate metrics and enumeration
### ProcessHandle improvements
- Added explicit constructors from `int` (pid) and `void*` (native handle)
- Added move constructor and move assignment operator
### ProcessMetricsTracker
- Cross-platform process metrics (CPU time, working set, page faults) via `QueryProcessMetrics()`
- ASIO timer-driven periodic sampling with configurable interval and batch size
- Aggregate metrics across tracked processes
### Other changes
- Fixed `zentest-appstub` writing a spurious `Versions` file to cwd on every invocation
|
| | |
|
| |
|
|
|
|
| |
- **Replace crashpad static-libc++ patch file with `io.replace()` in `on_install`** — The old `.patch` file was fragile (trailing-whitespace stripping on Windows would silently break it). Using `io.replace()` in the xmake build script is more robust and easier to maintain.
- **Skip sentry-native `on_test` link check on Linux** — The link test requires `-lc++abi` when building with the UE clang toolchain but adding it unconditionally breaks GCC/libstdc++ builds. The zenserver build itself validates that the library is usable.
- **Add `crashpad-test.sh`** — A test script that launches a release zenserver, waits for the health endpoint, then verifies that `crashpad_handler` is running, no `sentry_init` failure was logged, and the handler has no dynamic `libc++.so.1` dependency.
- **Add Crashpad Check step to Linux release CI** — Runs `crashpad-test.sh` in the `validate` workflow for release builds to catch crashpad regressions before merge.
|
| |
|
|
| |
* refactor hub callbacks
* improve http responses
|
| |
|
|
|
|
|
| |
- **Cross-platform `GetProcessMetrics`**: Implement Linux (`/proc/{pid}/stat`, `/proc/{pid}/statm`, `/proc/{pid}/status`) and macOS (`proc_pidinfo(PROC_PIDTASKINFO)`) support for CPU times and memory metrics. Fix Windows to populate the `MemoryBytes` field (was always 0). All platforms now set `MemoryBytes = WorkingSetSize`.
- **`ProcessMetricsTracker`**: Experimental utility class (`zenutil`) that periodically samples resource usage for a set of tracked child processes. Supports both a dedicated background thread and an ASIO steady_timer mode. Computes delta-based CPU usage percentage across samples, with batched sampling (8 processes per tick) to limit per-cycle overhead.
- **`ProcessHandle` documentation**: Add Doxygen comments to all public methods describing platform-specific behavior.
- **Cleanup**: Remove unused `ZEN_RUN_TESTS` macro (inlined at its single call site in `zenserver/main.cpp`), remove dead `#if 0` thread-shutdown workaround block.
- **Minor fixes**: Use `HttpClientAccessToken` constructor in hordeclient instead of setting private members directly. Log ASIO version at startup and include it in the server settings list.
|
| | |
|
| | |
|