aboutsummaryrefslogtreecommitdiff
path: root/src/zenstore/cache/structuredcachestore.cpp
Commit message (Collapse)AuthorAgeFilesLines
* zenstore bug-fixes from static analysis pass (#815)Stefan Boberg11 days1-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | **Bug fixes across zenstore, zenremotestore, and related subsystems, primarily surfaced by static analysis.** ## Cache subsystem (cachedisklayer.cpp) - Fixed tombstone scoping bug: tombstone flag and missing entry were recorded outside the block where data was removed, causing non-missing entries to be incorrectly tombstoned - Fixed use-after-overwrite: `RemoveMemCachedData`/`RemoveMetaData` were called after `Payload` was overwritten on cache put, leaking stale data - Fixed incorrect retry sleep formula (`100 - (3 - RetriesLeft) * 100` always produced the same or negative value; corrected to `(3 - RetriesLeft) * 100`) - Fixed broken `break` missing from sidecar file read loop, causing reads past valid data - Fixed missing format argument in three `ZEN_WARN`/`ZEN_ERROR` log calls (format string had `{}` placeholders with no corresponding argument, or vice versa) - Fixed elapsed timer being accumulated inside the wrong scope in `HandleRpcGetCacheRecords` - Fixed test asserting against unserialized `RecordPolicy` instead of the deserialized `Loaded` copy - Initialized `AbortFlag`/`PauseFlag` atomics at declaration (UB if read before first write) ## Build store (buildstore.cpp / buildstore.h) - Fixed wrong variable used in warning log: used loop index `ResultIndex` instead of `Index`/`MetaLocationResultIndexes[Index]`, logging wrong hash values - Fixed `sizeof(AccessTimesHeader)` used instead of `sizeof(AccessTimeRecord)` when advancing write offset, corrupting the access times file if the sizes differ - Initialized `m_LastAccessTimeUpdateCount` atomic member (was uninitialized) - Changed map iteration loops to use `const auto&` to avoid unnecessary copies ## Project store (projectstore.cpp / projectstore.h) - Fixed wrong iterator dereferenced in `IterateChunks`: used `ChunkIt->second` (from a different map lookup) instead of `MetaIt->second` - Fixed wrong assert variable: `Sizes[Index]` should be `RawSizes[Index]` - Fixed `MakeTombstone`/`IsTombstone` inconsistency: `MakeTombstone` was zeroing `OpLsn` but `IsTombstone` checks `OpLsn.Number != 0`; tombstone creation now preserves `OpLsn` - Fixed uninitialized `InvalidEntries` counter - Fixed format string mismatch in warning log - Initialized `AbortFlag`/`PauseFlag` atomics; changed map iteration to `const auto&` ## Workspaces (workspaces.cpp) - Fixed missing alias registration when a workspace share is updated: alias was deleted but never re-inserted - Fixed integer overflow in range clamping: `(RequestedOffset + RequestedSize) > Size` could wrap; corrected to `RequestedSize > Size - RequestedOffset` - Changed map iteration loops to `const auto&` ## CAS subsystem (cas.cpp, caslog.cpp, compactcas.cpp, filecas.cpp) - Fixed `IterateChunks` passing original `Payload` buffer instead of the modified `Chunk` buffer (content type was set on the copy but the original was sent to the callback) - Fixed invalid `std::future::get()` call on default-constructed futures - Fixed sign-comparison in `CasLogFile::Replay` loop (`int i` vs `size_t`) - Changed `CasLogFile::IsValid` and `Open` to take `const std::filesystem::path&` instead of by value - Fixed format string in `~CasContainerStrategy` error log ## Remote store (zenremotestore) - Fixed `FolderContent::operator==` always returning true: loop variable `PathCount` was initialized to 0 instead of `Paths.size()` - Fixed `GetChunkIndexForRawHash` looking up from wrong map (`RawHashToSequenceIndex` instead of `ChunkHashToChunkIndex`) - Fixed double-counted `UniqueSequencesFound` stat (incremented in both branches of an if/else) - Fixed `RawSize` sentinel value truncation: `(uint32_t)-1` assigned to a `uint64_t` field; corrected to `(uint64_t)-1` - Initialized uninitialized atomic and struct members across `buildstorageoperations.h`, `chunkblock.h`, and `remoteprojectstore.h`
* Add test suites (#799)Stefan Boberg2026-03-021-0/+4
| | | | | | | | | | | | | Makes all test cases part of a test suite. Test suites are named after the module and the name of the file containing the implementation of the test. * This allows for better and more predictable filtering of which test cases to run which should also be able to reduce the time CI spends in tests since it can filter on the tests for that particular module. Also improves `xmake test` behaviour: * instead of an explicit list of projects just enumerate the test projects which are available based on build system state * also introduces logic to avoid running `xmake config` unnecessarily which would invalidate the existing build and do lots of unnecessary work since dependencies were invalidated by the updated config * also invokes build only for the chosen test targets As a bonus, also adds `xmake sln --open` which allows opening IDE after generation of solution/xmake project is done.
* test running / reporting improvements (#797)Stefan Boberg2026-02-281-1/+1
| | | | | | | | | | | | | | | | | | | **CI/CD improvements (validate.yml):** - Add test reporter (`ue-foundation/test-reporter@v2`) for all three platforms, rendering JUnit test results directly in PR check runs - Add "Trust workspace" step on Windows to fix git safe.directory ownership issue with self-hosted runners - Clean stale report files before each test run to prevent false failures from leftover XML - Broaden `paths-ignore` to skip builds for non-code changes (`*.md`, `LICENSE`, `.gitignore`, `docs/**`) **Test improvements:** - Convert `CHECK` to `REQUIRE` in several test suites (projectstore, integration, http) for fail-fast behavior - Mark some tests with `doctest::skip()` for selective execution - Skip httpclient transport tests pending investigation - Add `--noskip` option to `xmake test` task - Add `--repeat=<N>` option to `xmake test` task, to run tests repeatedly N times or until there is a failure **xmake test output improvements:** - Add totals row to test summary table - Right-justify numeric columns in summary table
* Various bug fixes (#778)Stefan Boberg2026-02-241-1/+4
| | | | | | | | | | | | | | | | | | | | | | zencore fixes: - filesystem.cpp: ReadFile error reporting logic - compactbinaryvalue.h: CbValue::As*String error reporting logic zenhttp fixes: - httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB) - httpsys async workpool initialization race zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
* Revert "Fix correctness and concurrency bugs found during code review"Stefan Boberg2026-02-241-4/+1
| | | | This reverts commit 3c89c486338890ce39ddebe5be4722a09e85701a.
* Fix correctness and concurrency bugs found during code reviewStefan Boberg2026-02-241-1/+4
| | | | | | | | | | | | | | | | | zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths) Co-Authored-By: Claude Opus 4.6 <[email protected]>
* Fix unit test that relies on being able to overwritezousar2025-12-191-2/+2
|
* remove catching of exceptions in batch operations now that they are not ↵Dan Engelbrecht2025-12-101-22/+11
| | | | | executed in the destructor (#683) don't call WriteChunks in batch operation if no chunks needs to be written
* batch op not in destructor (#676)Dan Engelbrecht2025-12-041-45/+77
| | | | | * use fixed vectors for batch requests * refactor cache batch value put/get to not execute code that can throw execeptions in destructor * extend test with multi-bucket requests
* automatic scrub on startup (#667)Dan Engelbrecht2025-11-271-29/+21
| | | | | - Improvement: Deeper validation of data when scrub is activated (cas/cache/project) - Improvement: Enabled more multi threading when running scrub operations - Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
* Various fixes to address issues flagged by gcc / non-UE toolchain build (#621)Stefan Boberg2025-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | * gcc: avoid using memset on nontrivial struct * redundant `return std::move` * fixed various compilation issues flagged by gcc * fix issue in xmake.lua detecting whether we are building with the UE toolchain or not * add GCC ignore -Wundef (comment is inaccurate) * remove redundant std::move * don't catch exceptions by value * unreferenced variables * initialize "by the book" instead of memset * remove unused exception reference * add #include <cstring> to fix gcc build * explicitly poulate KeyValueMap by traversing input spans fixes gcc compilation * remove unreferenced variable * eliminate redundant `std::move` which gcc complains about * fix gcc compilation by including <cstring> * tag unreferenced variable to fix gcc compilation * fixes for various cases of naming members the same as their type
* add ability to limit concurrency (#565)Stefan Boberg2025-10-101-2/+2
| | | | | | | | | | | | effective concurrency in zenserver can be limited via the `--corelimit=<N>` option on the command line. Any value passed in here will be used instead of the return value from `std::thread::hardware_concurrency()` if it is lower. * added --corelimit option to zenserver * made sure thread pools are configured lazily and not during global init * added log output indicating effective and HW concurrency * added change log entry * removed debug logging from ZenEntryPoint::Run() also removed main thread naming on Linux since it makes the output from `top` and similar tools confusing (it shows `main` instead of `zenserver`)
* speed up tests (#555)Dan Engelbrecht2025-10-061-32/+56
| | | | | | | | | | | | * faster FileSystemTraversal test * faster jobqueue test * faster NamedEvent test * faster cache tests * faster basic http tests * faster blockstore test * faster cache store tests * faster compactcas tests * more responsive zenserver launch * tweak worker pool sizes in tests
* cacherequests helpers test only (#551)Dan Engelbrecht2025-10-031-2/+1
| | | | * don't use cacherequests utils in cache_cmd.cpp * make zenutil/cacherequests code into test code helpers only
* Adjust the responses from PUT commandszousar2025-09-231-2/+2
| | | | | - Ensure that text responses are in a field named "Message" - Change the record response to be named "Record" instead of "Object"
* Change batch put responses for client reportingzousar2025-09-191-2/+7
| | | | Conflicts are now treated as successes, and we optionally return a Details array instead of an ErrorMessages array. Details are returned for all requests in a batch, or no requests in a batch depending on whether there are any details to be shared about any of the put requests. The details for a conflict include the raw hash and raw size of the item. If the item is a record, we also include the record as an object.
* add EMode to WorkerTheadPool to avoid thread starvation (#492)Dan Engelbrecht2025-09-101-45/+59
| | | - Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work begin queued up. UE-305498
* per namespace/project cas prep refactor (#470)Dan Engelbrecht2025-08-201-39/+29
| | | | | | | - Refactor so we can have more than one cas store for project store and cache. - Refactor `UpstreamCacheClient` so it is not tied to a specific CidStore - Refactor scrub to keep the GC interface ScrubStorage function separate from scrub accessor functions (renamed to Scrub). - Refactor storage size to keep GC interface StorageSize function separate from size accessor functions (renamed to TotalSize) - Refactor cache storage so `ZenCacheDiskLayer::CacheBucket` implements GcStorage interface rather than `ZenCacheNamespace`
* Merge branch 'main' into zs/put-overwrite-policyzousar2025-06-241-18/+33
|\
| * make sure we unregister from GC before we drop bucket/namespaces (#443)Dan Engelbrecht2025-06-191-0/+1
| |
| * unblock cache bucket drop (#406)Dan Engelbrecht2025-05-261-14/+28
| | | | | | | | * don't hold exclusive locks while deleting files from a dropped bucket/namespace * cleaner detection of missing namespace when issuing a drop
| * reduced memory churn using fixed_xxx containers (#236)Stefan Boberg2025-03-061-4/+4
| | | | | | | | | | | | * Added EASTL to help with eliminating memory allocations * Applied EASTL to eliminate memory allocations, primarily by using `fixed_vector` et al to use stack allocations / inline struct allocations Reduces memory events in traces by close to a factor of 10 in test scenario (starting editor for project F)
* | Change to PutResult structurezousar2025-06-241-10/+10
| | | | | | | | Result structure contains status and a string message (may be empty)
* | Enforce Overwrite Prevention According To Cache Policyzousar2025-02-261-23/+39
|/ | | | Overwrite with differing value should be denied if QueryLocal is not present and StoreLocal is present. Overwrite with equal value should succeed regardless of policy flags.
* Add multithreading directory scanning in core/filesystem (#277)Dan Engelbrecht2025-01-221-1/+1
| | | | | | add DirectoryContent::IncludeFileSizes add DirectoryContent::IncludeAttributes add multithreaded GetDirectoryContent use multithreaded GetDirectoryContent in workspace folder scanning
* added support for dynamic LLM tags (#245)Stefan Boberg2024-12-021-0/+27
| | | | | * added FLLMTag which can be used to register memory tags outside of core * changed `UE_MEMSCOPE` -> `ZEN_MEMSCOPE` for consistency * instrumented some subsystems with dynamic tags
* oplog prep gc fix (#216)Dan Engelbrecht2024-11-151-18/+35
| | | | | | - Added option gc-validation to zenserver that does a check for missing references in all oplog post full GC. Enabled by default. - Feature: Added option gc-validation to zen gc command to control reference validation. Enabled by default. - Added more details in post GC log. - Fixed race condition in oplog writes which could cause used attachments to be incorrectly removed by GC
* bucket size queries (#203)Dan Engelbrecht2024-10-211-1/+23
| | | - Feature: Added options --bucketsize and --bucketsizes to zen cache-info to get data sizes in cache buckets and attachments
* remove gc v1 (#121)Dan Engelbrecht2024-10-031-283/+17
| | | | | * kill gc v1 * block use of gc v1 from zen command line * warn and flip to gcv2 if --gc-v2=false is specified for zenserver
* Add `gc-attachment-passes` option to zenserver (#167)Dan Engelbrecht2024-09-251-1/+1
| | | | | Added option `gc-attachment-passes` to zenserver Cleaned up GCv2 start and stop logs and added identifier to easily find matching start and end of a GC pass in log file Fixed project store not properly sorting references found during lock phase
* gc unused refactor (#165)Dan Engelbrecht2024-09-231-14/+10
| | | | | * optimize IoHash and OId comparisions * refactor filtering of unused references * add attachment filtering to gc
* unblock PreCache (#164)Dan Engelbrecht2024-09-201-1/+1
| | | Don't lock disk cache buckets from writing when scanning records for attachment references
* trace scopes improvementsDan Engelbrecht2024-09-101-2/+2
|
* move gc logs to gc logger (#142)Dan Engelbrecht2024-09-041-1/+9
| | | - Improvement: Move GC logging in callback functions into "gc" context
* oplog index snapshots (#140)Dan Engelbrecht2024-09-031-0/+2
| | | - Feature: Added project store oplog index snapshots for faster opening of oplog - opening oplogs are roughly 10x faster
* Make sure `noexcept` functions does not leak exceptions (#136)Dan Engelbrecht2024-08-231-6/+6
|
* Make sure we monitor for new project, oplogs, namespaces and buckets during ↵Dan Engelbrecht2024-06-131-0/+235
| | | | | | GCv2 (#93) - Bugfix: Make sure we monitor and include new project/oplogs created during GCv2 - Bugfix: Make sure we monitor and include new namespaces/cache buckets created during GCv2
* add batching of CacheStore requests for GetCacheValues/GetCacheChunks (#90)Dan Engelbrecht2024-06-041-10/+135
| | | | | | * cache file size of block on open * add ability to control size limit for small chunk callback when iterating block * Add batch fetch of cache values in the GetCacheValues request
* batch cache put (#67)Dan Engelbrecht2024-05-021-12/+69
| | | - Improvement: Batch scope for put of cache values
* use direct file access for large file hash (#63)Dan Engelbrecht2024-04-261-4/+4
| | | - Improvement: Refactor `IoHash::HashBuffer` and `BLAKE3::HashBuffer` to not use memory mapped files. Performs better and saves ~10% of oplog export time on CI
* improved assert (#37)Dan Engelbrecht2024-04-041-2/+2
| | | | - Improvement: Add file and line to ASSERT exceptions - Improvement: Catch call stack when throwing assert exceptions and log/output call stack at important places to provide more context to caller
* clean up test linking (#4)Dan Engelbrecht2024-03-141-10/+10
| | | | | | | - Improvement: Add zenhttp-test and zenutil-test - Improvement: Moved cachepolicy test to cachepolicy.cpp - Improvement: Renamed cachestore tests from z$ to cachestore - Improvement: Moved test linking so test for a lib is linked by <lib>-test - Improvement: Removed HttpRequestParseRelativeUri in httpstructuredcache.cpp and use the one in cacherequests.h instead
* improved block store logging and more gcv2 tests (#659)Dan Engelbrecht2024-02-271-10/+1
| | | | * improved gc/blockstore logging * more gcv2 tests
* remove reference caching (#658)Dan Engelbrecht2024-02-271-135/+84
| | | * remove reference caching
* hashing fixes (#657)Dan Engelbrecht2024-02-261-1/+72
| | | | | * move structuredcachestore tests to zenstore-test * Don't materialize entire files when hashing if it is a large files * rewrite CompositeBuffer::Mid to never materialize buffers
* separate RPC processing from HTTP processing (#626)Stefan Boberg2023-12-201-1/+1
| | | | | | * moved all RPC processing from HttpStructuredCacheService into separate CacheRpcHandler class in zenstore * move package marshaling to zenutil. was previously in zenhttp/httpshared but it's useful in other contexts as well where we don't want to depend on zenhttp * introduced UpstreamCacheClient, this provides a subset of functions on UpstreamCache and lives in zenstore
* move cachedisklayer and structuredcachestore into zenstore (#624)Stefan Boberg2023-12-191-0/+2440