| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Integrates the **tourist** trace analysis library and builds a full `zen trace` command suite for working with Unreal Engine `.utrace` files.
### Trace analysis library (`thirdparty/tourist/`)
- Adds the tourist library as a third-party dependency with three modules: **foundation** (platform primitives, memory, scheduling), **trace** (UE Trace protocol decoding), and **analysis** (event dispatching and analyzer framework).
- Cross-platform support for Windows, Linux, and macOS.
### `zen trace` CLI commands (`src/zen/cmds/`, `src/zen/trace/`)
- **`zen trace analyze`** — Summarize a `.utrace` file: session metadata, thread inventory, command line + build configuration, CPU profiling scopes, timing, event rates, log messages, and (with symbols) memory allocation metrics including live-allocs dumps, callstack-keyed aggregation, and allocation churn. Optional HTML output for memory reports.
- **`zen trace inspect`** — Dump the event schema (declared types, fields, sizes) from a trace file.
- **`zen trace trim`** — Extract a time-window from a trace into a new `.utrace` file.
- **`zen trace serve`** — Launch a local HTTP server hosting an interactive trace viewer; opens in the default browser.
### Symbolication (`src/zen/trace/symbol_resolver.*`, `thirdparty/raw_pdb/`)
- Pluggable resolver with multiple backends: `pdb` (in-tree raw_pdb), `dbghelp` (Windows), `llvm-symbolizer` (all platforms), `atos` (macOS). An `auto` backend picks the best available tool per platform.
- Microsoft Symbol Server support: downloads PDBs on demand using a redirect-aware HTTP client.
- Local PDB cache keyed by image GUID preserves symbols across binary recompilation.
- Callstack trimming heuristic strips UE internal noise from reports.
- Binary analysis cache (`.ucache_z`) avoids re-resolving the same trace.
### Interactive trace viewer (`src/zen/frontend/html/`, `src/zen/trace/trace_viewer_service.*`)
- Timeline: scope-level detail, horizontal zoom/pan, vertical scrolling, viewport-driven loading with pre-computed LOD for responsive navigation of large traces.
- Thread grouping (collapsible sidebar sections) synthesized from name suffixes, natural sort order, visual distinction between lane threads and OS threads.
- Bookmark and region annotations; region categories with per-category toggles; bookmark marker toggle in the toolbar.
- Filterable Logs tab showing captured `UE_LOG` output.
- Stats tab with per-scope aggregate statistics.
- Memory tab with interactive allocation analysis and an allocation size histogram.
- CsvProfiler event parsing and chart UI.
### Other in-branch supporting changes
- **Cross-platform browser launcher** (`browser_launcher.{h,cpp}`) used by `trace serve`.
- **`ReciprocalU64`** fast 64-bit integer division (zencore/intmath) for trace analyzers.
- **`parallelsort`** cross-platform parallel sort helper (zenutil).
- Frontend zip build rule so the viewer's HTML assets are bundled into `zen.exe`.
- `/Zo` flag for better optimized debug info on Windows release builds.
- `trace-tests.cpp` in the `zen-test` harness (harness itself landed on main via #985).
|
| |
|
|
| |
- Improvement: New `ZEN_SCOPED_LOG(Expr)` macro routes `ZEN_INFO`/`ZEN_WARN`/`ZEN_DEBUG` in the enclosing block through the given logger expression instead of the default
- Improvement: `BuildContainer`, `SaveOplog`, and `LoadOplogContext` now take a caller-provided `LoggerRef` so diagnostic messages route through the caller's logger
|
| |
|
|
|
|
|
|
|
|
| |
- **Frontend dashboard overhaul**: Unified compute/main dashboards into a single shared UI. Added new pages for cache, projects, metrics, sessions, info (build/runtime config, system stats). Added live-update via WebSockets with pause control, sortable detail tables, themed styling. Refactored compute/hub/orchestrator pages into modular JS.
- **HTTP server fixes and stats**: Fixed http.sys local-only fallback when default port is in use, implemented root endpoint redirect for http.sys, fixed Linux/Mac port reuse. Added /stats endpoint exposing HTTP server metrics (bytes transferred, request rates). Added WebSocket stats tracking.
- **OTEL/diagnostics hardening**: Improved OTLP HTTP exporter with better error handling and resilience. Extended diagnostics services configuration.
- **Session management**: Added new sessions service with HTTP endpoints for registering, updating, querying, and removing sessions. Includes session log file support. This is still WIP.
- **CLI subcommand support**: Added support for commands with subcommands in the zen CLI tool, with improved command dispatch.
- **Misc**: Exposed CPU usage/hostname to frontend, fixed JS compact binary float32/float64 decoding, limited projects displayed on front page to 25 sorted by last access, added vscode:// link support.
Also contains some fixes from TSAN analysis.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
**Bug fixes across zenstore, zenremotestore, and related subsystems, primarily surfaced by static analysis.**
## Cache subsystem (cachedisklayer.cpp)
- Fixed tombstone scoping bug: tombstone flag and missing entry were recorded outside the block where data was removed, causing non-missing entries to be incorrectly tombstoned
- Fixed use-after-overwrite: `RemoveMemCachedData`/`RemoveMetaData` were called after `Payload` was overwritten on cache put, leaking stale data
- Fixed incorrect retry sleep formula (`100 - (3 - RetriesLeft) * 100` always produced the same or negative value; corrected to `(3 - RetriesLeft) * 100`)
- Fixed broken `break` missing from sidecar file read loop, causing reads past valid data
- Fixed missing format argument in three `ZEN_WARN`/`ZEN_ERROR` log calls (format string had `{}` placeholders with no corresponding argument, or vice versa)
- Fixed elapsed timer being accumulated inside the wrong scope in `HandleRpcGetCacheRecords`
- Fixed test asserting against unserialized `RecordPolicy` instead of the deserialized `Loaded` copy
- Initialized `AbortFlag`/`PauseFlag` atomics at declaration (UB if read before first write)
## Build store (buildstore.cpp / buildstore.h)
- Fixed wrong variable used in warning log: used loop index `ResultIndex` instead of `Index`/`MetaLocationResultIndexes[Index]`, logging wrong hash values
- Fixed `sizeof(AccessTimesHeader)` used instead of `sizeof(AccessTimeRecord)` when advancing write offset, corrupting the access times file if the sizes differ
- Initialized `m_LastAccessTimeUpdateCount` atomic member (was uninitialized)
- Changed map iteration loops to use `const auto&` to avoid unnecessary copies
## Project store (projectstore.cpp / projectstore.h)
- Fixed wrong iterator dereferenced in `IterateChunks`: used `ChunkIt->second` (from a different map lookup) instead of `MetaIt->second`
- Fixed wrong assert variable: `Sizes[Index]` should be `RawSizes[Index]`
- Fixed `MakeTombstone`/`IsTombstone` inconsistency: `MakeTombstone` was zeroing `OpLsn` but `IsTombstone` checks `OpLsn.Number != 0`; tombstone creation now preserves `OpLsn`
- Fixed uninitialized `InvalidEntries` counter
- Fixed format string mismatch in warning log
- Initialized `AbortFlag`/`PauseFlag` atomics; changed map iteration to `const auto&`
## Workspaces (workspaces.cpp)
- Fixed missing alias registration when a workspace share is updated: alias was deleted but never re-inserted
- Fixed integer overflow in range clamping: `(RequestedOffset + RequestedSize) > Size` could wrap; corrected to `RequestedSize > Size - RequestedOffset`
- Changed map iteration loops to `const auto&`
## CAS subsystem (cas.cpp, caslog.cpp, compactcas.cpp, filecas.cpp)
- Fixed `IterateChunks` passing original `Payload` buffer instead of the modified `Chunk` buffer (content type was set on the copy but the original was sent to the callback)
- Fixed invalid `std::future::get()` call on default-constructed futures
- Fixed sign-comparison in `CasLogFile::Replay` loop (`int i` vs `size_t`)
- Changed `CasLogFile::IsValid` and `Open` to take `const std::filesystem::path&` instead of by value
- Fixed format string in `~CasContainerStrategy` error log
## Remote store (zenremotestore)
- Fixed `FolderContent::operator==` always returning true: loop variable `PathCount` was initialized to 0 instead of `Paths.size()`
- Fixed `GetChunkIndexForRawHash` looking up from wrong map (`RawHashToSequenceIndex` instead of `ChunkHashToChunkIndex`)
- Fixed double-counted `UniqueSequencesFound` stat (incremented in both branches of an if/else)
- Fixed `RawSize` sentinel value truncation: `(uint32_t)-1` assigned to a `uint64_t` field; corrected to `(uint64_t)-1`
- Initialized uninitialized atomic and struct members across `buildstorageoperations.h`, `chunkblock.h`, and `remoteprojectstore.h`
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zencore fixes:
- filesystem.cpp: ReadFile error reporting logic
- compactbinaryvalue.h: CbValue::As*String error reporting logic
zenhttp fixes:
- httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB)
- httpsys async workpool initialization race
zenstore fixes:
- cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost)
- structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses)
- cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete)
- cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size
- buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash)
- buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob
zenserver fixes:
- httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow)
- hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper)
- zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
|
| |
|
|
| |
This reverts commit 3c89c486338890ce39ddebe5be4722a09e85701a.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zenstore fixes:
- cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost)
- structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses)
- cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete)
- cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size
- buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash)
- buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob
zenserver fixes:
- httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow)
- hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper)
- zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
| |
|
| |
* reduce held locks while performing scrub operation
|
| |
|
|
|
| |
executed in the destructor (#683)
don't call WriteChunks in batch operation if no chunks needs to be written
|
| |
|
|
|
| |
* use fixed vectors for batch requests
* refactor cache batch value put/get to not execute code that can throw execeptions in destructor
* extend test with multi-bucket requests
|
| |
|
| |
* add checkes to protect against access violation due to failed disk read
|
| |
|
|
|
| |
- Improvement: Deeper validation of data when scrub is activated (cas/cache/project)
- Improvement: Enabled more multi threading when running scrub operations
- Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
|
| |
|
| |
* try to move file into place before trying speculative remove of target file
|
| |
|
|
| |
* move referencemetadata to zenstore
* rename zenutil/windows/service to windowsservice
|
| |
|
|
|
|
|
|
|
| |
* remove dependency to zenutil/workerpools.h from remoteprojectstore.cpp
* remove dependency to zenutil/workerpools.h from buildstoragecache.cpp
* remove unneded include
* move jupiter helpers to zenremotestore
* move parallelwork to zencore
* remove zenutil dependency from zenremotestore
* clean up test project dependencies - use indirect dependencies
|
| |
|
| |
- Improvement: Add additional validations when reading disk cache records to get references in GC
|
| |
|
|
|
| |
- Ensure that text responses are in a field named "Message"
- Change the record response to be named "Record" instead of "Object"
|
| |
|
|
| |
Conflicts are now treated as successes, and we optionally return a Details array instead of an ErrorMessages array. Details are returned for all requests in a batch, or no requests in a batch depending on whether there are any details to be shared about any of the put requests. The details for a conflict include the raw hash and raw size of the item. If the item is a record, we also include the record as an object.
|
| |
|
| |
- Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work begin queued up. UE-305498
|
| |
|
|
|
|
|
| |
- Refactor so we can have more than one cas store for project store and cache.
- Refactor `UpstreamCacheClient` so it is not tied to a specific CidStore
- Refactor scrub to keep the GC interface ScrubStorage function separate from scrub accessor functions (renamed to Scrub).
- Refactor storage size to keep GC interface StorageSize function separate from size accessor functions (renamed to TotalSize)
- Refactor cache storage so `ZenCacheDiskLayer::CacheBucket` implements GcStorage interface rather than `ZenCacheNamespace`
|
| |
|
|
| |
keep rawsize and rawhash if available when using batch for inline puts
keep rawsize and rawhash of input value if we have calculated it for validation already
|
| |\ |
|
| | | |
|
| | | |
|
| | |
| |
| |
| | |
Previously rejected puts would put the chunks, but not write them to the index, which was wrong.
|
| | | |
|
| |\| |
|
| | | |
|
| |\| |
|
| | | |
|
| | |
| |
| |
| |
| | |
* exception safety when issuing ParallelWork
* add asserts to Latch usage to catch usage errors
* extended error messaging and recovery handling in ParallelWork destructor to help find issues
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* make sure to close log file when resetting log
* drop entries that refers to missing blocks
* Don't scrub keys that has been rewritten
* currectly count added bytes / m_TotalSize
* fix negative sleep time in BlockStoreFile::Open()
* be defensive when fetching log position
* append to log files *after* we updated all state successfully
* explicitly close stuff in destructors with exception catching
* clean up empty size block store files
|
| | |
| |
| |
| |
| | |
- Feature: `zen builds pause`, `zen builds resume` and `zen builds abort` commands to control a running `zen builds` command
- `--process-id` the process id to control, if omitted it tries to find a running process using the same executable as itself
- Improvement: Process report now indicates if it is pausing or aborting
|
| | |
| |
| |
| |
| | |
* Don't count a miss twice for memory stats if the entry can't be found
* changelog
|
| | |
| |
| |
| | |
- Bugfix: Flush the last block before closing the last new block written to during blockstore compact. UE-291196
- Feature: Drop unreachable CAS data during GC pass. UE-291196
|
| | |
| |
| |
| | |
* don't hold exclusive locks while deleting files from a dropped bucket/namespace
* cleaner detection of missing namespace when issuing a drop
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* use ParallelWork in rpc playback
* use ParallelWork in projectstore
* use ParallelWork in buildstore
* use ParallelWork in cachedisklayer
* use ParallelWork in compactcas
* use ParallelWork in filecas
* don't set abort flag in ParallelWork destructor
* add PrepareFileForScatteredWrite for temp files in httpclient
* Use PrepareFileForScatteredWrite when stream-decompressing files
* be more relaxed when deleting temp files
* allow explicit zen-cache when using direct host url without resolving
* fix lambda capture when writing loose chunks
* no delay when attempting to remove temp files
|
| | |
| |
| |
| |
| | |
- Improvement: Cleaned up snapshot writing for CompactCAS/FileCas/Cache/Project stores
- Improvement: Safer recovery when failing to delete log for CompactCAS/FileCas/Cache/Project stores
- Improvement: Added log file reset when writing snapshot at startup for FileCas
|
| | |
| |
| |
| | |
Feature: Add per bucket cache configuration (Lua options file only)
Improvement: --cache-memlayer-sizethreshold is now deprecated and has a new name: --cache-bucket-memlayer-sizethreshold to line up with per cache bucket configuration
|
| | |
| |
| | |
* tweak block iteration chunk sizes
|
| | |
| |
| | |
* optimize cache bucket snapshot and sidecar writing
|
| | | |
|
| | | |
|
| | |
| |
| | |
* do cache bucket flush/write snapshot as part of compact to reduce disk I/O
|
| | |
| |
| | |
- Bugfix: Long file paths now works correctly on Windows
|
| | |
| |
| |
| |
| |
| | |
* Added EASTL to help with eliminating memory allocations
* Applied EASTL to eliminate memory allocations, primarily by using `fixed_vector` et al to use stack allocations / inline struct allocations
Reduces memory events in traces by close to a factor of 10 in test scenario (starting editor for project F)
|
| | |
| |
| |
| | |
Result structure contains status and a string message (may be empty)
|
| | | |
|
| | |
| |
| |
| | |
Value comparison methods moved to more appropriate area in file.
|
| |/
|
|
| |
Overwrite with differing value should be denied if QueryLocal is not present and StoreLocal is present. Overwrite with equal value should succeed regardless of policy flags.
|