aboutsummaryrefslogtreecommitdiff
path: root/src/zenstore/cache
Commit message (Collapse)AuthorAgeFilesLines
* test running / reporting improvements (#797)Stefan Boberg11 days1-1/+1
| | | | | | | | | | | | | | | | | | | **CI/CD improvements (validate.yml):** - Add test reporter (`ue-foundation/test-reporter@v2`) for all three platforms, rendering JUnit test results directly in PR check runs - Add "Trust workspace" step on Windows to fix git safe.directory ownership issue with self-hosted runners - Clean stale report files before each test run to prevent false failures from leftover XML - Broaden `paths-ignore` to skip builds for non-code changes (`*.md`, `LICENSE`, `.gitignore`, `docs/**`) **Test improvements:** - Convert `CHECK` to `REQUIRE` in several test suites (projectstore, integration, http) for fail-fast behavior - Mark some tests with `doctest::skip()` for selective execution - Skip httpclient transport tests pending investigation - Add `--noskip` option to `xmake test` task - Add `--repeat=<N>` option to `xmake test` task, to run tests repeatedly N times or until there is a failure **xmake test output improvements:** - Add totals row to test summary table - Right-justify numeric columns in summary table
* Various bug fixes (#778)Stefan Boberg2026-02-243-4/+7
| | | | | | | | | | | | | | | | | | | | | | zencore fixes: - filesystem.cpp: ReadFile error reporting logic - compactbinaryvalue.h: CbValue::As*String error reporting logic zenhttp fixes: - httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB) - httpsys async workpool initialization race zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
* Revert "Fix correctness and concurrency bugs found during code review"Stefan Boberg2026-02-243-7/+4
| | | | This reverts commit 3c89c486338890ce39ddebe5be4722a09e85701a.
* Fix correctness and concurrency bugs found during code reviewStefan Boberg2026-02-243-4/+7
| | | | | | | | | | | | | | | | | zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths) Co-Authored-By: Claude Opus 4.6 <[email protected]>
* reduce blocking in scrub (#743)Dan Engelbrecht2026-02-031-56/+86
| | | * reduce held locks while performing scrub operation
* don't do full cb-object validation on cache records when read from disk (#739)Dan Engelbrecht2026-01-291-10/+9
| | | * don't do full cb-object validation on cache records when read from disk
* Fix unit test that relies on being able to overwritezousar2025-12-191-2/+2
|
* Ensure upstream put propagation includes overwritezousar2025-12-191-5/+12
| | | | When changing the default limit-overwrite behavior, a unit test surfaced a bug where an put of data with overwrite cache policy would not get propagated via zen's built-in upstream mechanism with a matching overwrite cache policy to the upstream. This change ensures that it does and leaves the unit test configured to exercise this scenario.
* remove catching of exceptions in batch operations now that they are not ↵Dan Engelbrecht2025-12-102-101/+61
| | | | | executed in the destructor (#683) don't call WriteChunks in batch operation if no chunks needs to be written
* batch op not in destructor (#676)Dan Engelbrecht2025-12-043-328/+402
| | | | | * use fixed vectors for batch requests * refactor cache batch value put/get to not execute code that can throw execeptions in destructor * extend test with multi-bucket requests
* add checks to protect against access violation due to failed disk read (#675)Dan Engelbrecht2025-12-041-0/+12
| | | * add checkes to protect against access violation due to failed disk read
* automatic scrub on startup (#667)Dan Engelbrecht2025-11-273-64/+69
| | | | | - Improvement: Deeper validation of data when scrub is activated (cas/cache/project) - Improvement: Enabled more multi threading when running scrub operations - Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
* RawOffset can be anything and we expect an empty buffer to be returned along ↵Dan Engelbrecht2025-11-261-17/+23
| | | | with RawSize = 0 if the offset was out of bounds for the value. (#666)
* Various fixes to address issues flagged by gcc / non-UE toolchain build (#621)Stefan Boberg2025-11-011-0/+1
| | | | | | | | | | | | | | | | | | | | * gcc: avoid using memset on nontrivial struct * redundant `return std::move` * fixed various compilation issues flagged by gcc * fix issue in xmake.lua detecting whether we are building with the UE toolchain or not * add GCC ignore -Wundef (comment is inaccurate) * remove redundant std::move * don't catch exceptions by value * unreferenced variables * initialize "by the book" instead of memset * remove unused exception reference * add #include <cstring> to fix gcc build * explicitly poulate KeyValueMap by traversing input spans fixes gcc compilation * remove unreferenced variable * eliminate redundant `std::move` which gcc complains about * fix gcc compilation by including <cstring> * tag unreferenced variable to fix gcc compilation * fixes for various cases of naming members the same as their type
* optimize filecas write file (#613)Dan Engelbrecht2025-10-241-16/+10
| | | * try to move file into place before trying speculative remove of target file
* add ability to limit concurrency (#565)Stefan Boberg2025-10-101-2/+2
| | | | | | | | | | | | effective concurrency in zenserver can be limited via the `--corelimit=<N>` option on the command line. Any value passed in here will be used instead of the return value from `std::thread::hardware_concurrency()` if it is lower. * added --corelimit option to zenserver * made sure thread pools are configured lazily and not during global init * added log output indicating effective and HW concurrency * added change log entry * removed debug logging from ZenEntryPoint::Run() also removed main thread naming on Linux since it makes the output from `top` and similar tools confusing (it shows `main` instead of `zenserver`)
* speed up tests (#555)Dan Engelbrecht2025-10-061-32/+56
| | | | | | | | | | | | * faster FileSystemTraversal test * faster jobqueue test * faster NamedEvent test * faster cache tests * faster basic http tests * faster blockstore test * faster cache store tests * faster compactcas tests * more responsive zenserver launch * tweak worker pool sizes in tests
* cacherequests helpers test only (#551)Dan Engelbrecht2025-10-035-5/+673
| | | | * don't use cacherequests utils in cache_cmd.cpp * make zenutil/cacherequests code into test code helpers only
* zenutil cleanup (#550)Dan Engelbrecht2025-10-031-1/+2
| | | | * move referencemetadata to zenstore * rename zenutil/windows/service to windowsservice
* remove zenutil dependency in zenremotestore (#547)Dan Engelbrecht2025-10-031-1/+1
| | | | | | | | | * remove dependency to zenutil/workerpools.h from remoteprojectstore.cpp * remove dependency to zenutil/workerpools.h from buildstoragecache.cpp * remove unneded include * move jupiter helpers to zenremotestore * move parallelwork to zencore * remove zenutil dependency from zenremotestore * clean up test project dependencies - use indirect dependencies
* more cbobject validations (#527)Dan Engelbrecht2025-09-291-5/+19
| | | - Improvement: Add additional validations when reading disk cache records to get references in GC
* GetCacheChunk value request respects RawHash (#518)Zousar Shaker2025-09-291-1/+3
| | | When requesting a value via the GetCacheChunks rpc method, if the request specified a raw hash, only return a hit if the raw hash matches what was in the cache.
* Improvement to Incomplete Result Iterationzousar2025-09-251-2/+1
| | | | From review feedback
* Report Incomplete Records To Clientzousar2025-09-241-5/+39
| | | | When requesting partial records, report back when a record is incomplete via an "Incomplete" array of bools that is a sibling to the "Result" array for batch/rpc operations, or via the HttpResponseCode::PartialContent status code for individual record requests.
* Adjust the responses from PUT commandszousar2025-09-233-14/+9
| | | | | - Ensure that text responses are in a field named "Message" - Change the record response to be named "Record" instead of "Object"
* Change batch put responses for client reportingzousar2025-09-193-33/+75
| | | | Conflicts are now treated as successes, and we optionally return a Details array instead of an ErrorMessages array. Details are returned for all requests in a batch, or no requests in a batch depending on whether there are any details to be shared about any of the put requests. The details for a conflict include the raw hash and raw size of the item. If the item is a record, we also include the record as an object.
* add EMode to WorkerTheadPool to avoid thread starvation (#492)Dan Engelbrecht2025-09-102-48/+63
| | | - Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work begin queued up. UE-305498
* add validation of compact binary payloads before reading them (#483)Dan Engelbrecht2025-09-041-5/+3
| | | * add validation of compact binary payloads before reading them
* revert multi-cid store (#475)Dan Engelbrecht2025-08-261-37/+17
|
* per namespace/project cas prep refactor (#470)Dan Engelbrecht2025-08-203-68/+82
| | | | | | | - Refactor so we can have more than one cas store for project store and cache. - Refactor `UpstreamCacheClient` so it is not tied to a specific CidStore - Refactor scrub to keep the GC interface ScrubStorage function separate from scrub accessor functions (renamed to Scrub). - Refactor storage size to keep GC interface StorageSize function separate from size accessor functions (renamed to TotalSize) - Refactor cache storage so `ZenCacheDiskLayer::CacheBucket` implements GcStorage interface rather than `ZenCacheNamespace`
* reduce lock contention when checking for disk cache put reject (#465)Dan Engelbrecht2025-08-121-91/+81
| | | | keep rawsize and rawhash if available when using batch for inline puts keep rawsize and rawhash of input value if we have calculated it for validation already
* Merge branch 'main' into zs/put-overwrite-policyZousar Shaker2025-08-081-2/+2
|\
| * add the correct set of references hashes in batched inline mode (#459)Dan Engelbrecht2025-08-061-3/+3
| |
* | precommitzousar2025-08-071-9/+4
| |
* | Avoid committing chunks for batch rejected putszousar2025-08-071-42/+35
| | | | | | | | Previously rejected puts would put the chunks, but not write them to the index, which was wrong.
* | Moving put rejections to happen in batch handlingzousar2025-08-051-33/+87
| |
* | Merge branch 'main' into zs/put-overwrite-policyzousar2025-08-051-11/+40
|\|
| * add hardening for legacy cache bucket manifests (#454)Dan Engelbrecht2025-08-041-11/+40
| |
* | xmake precommitzousar2025-06-241-4/+4
| |
* | Merge branch 'main' into zs/put-overwrite-policyzousar2025-06-243-308/+530
|\|
| * make sure we unregister from GC before we drop bucket/namespaces (#443)Dan Engelbrecht2025-06-192-1/+4
| |
| * graceful wait in parallelwork destructor (#438)Dan Engelbrecht2025-06-161-35/+45
| | | | | | | | | | * exception safety when issuing ParallelWork * add asserts to Latch usage to catch usage errors * extended error messaging and recovery handling in ParallelWork destructor to help find issues
| * missing chunks bugfix (#424)Dan Engelbrecht2025-06-091-14/+70
| | | | | | | | | | | | | | | | | | | | | | * make sure to close log file when resetting log * drop entries that refers to missing blocks * Don't scrub keys that has been rewritten * currectly count added bytes / m_TotalSize * fix negative sleep time in BlockStoreFile::Open() * be defensive when fetching log position * append to log files *after* we updated all state successfully * explicitly close stuff in destructors with exception catching * clean up empty size block store files
| * pause, resume and abort running builds cmd (#421)Dan Engelbrecht2025-06-051-4/+8
| | | | | | | | | | - Feature: `zen builds pause`, `zen builds resume` and `zen builds abort` commands to control a running `zen builds` command - `--process-id` the process id to control, if omitted it tries to find a running process using the same executable as itself - Improvement: Process report now indicates if it is pausing or aborting
| * fix cachbucket mem hit count (#415)Dan Engelbrecht2025-06-021-4/+7
| | | | | | | | | | * Don't count a miss twice for memory stats if the entry can't be found * changelog
| * add missing flush inblockstore compact (#411)Dan Engelbrecht2025-05-301-2/+22
| | | | | | | | - Bugfix: Flush the last block before closing the last new block written to during blockstore compact. UE-291196 - Feature: Drop unreachable CAS data during GC pass. UE-291196
| * unblock cache bucket drop (#406)Dan Engelbrecht2025-05-262-40/+110
| | | | | | | | * don't hold exclusive locks while deleting files from a dropped bucket/namespace * cleaner detection of missing namespace when issuing a drop
| * handle exception with batch work (#401)Dan Engelbrecht2025-05-191-15/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * use ParallelWork in rpc playback * use ParallelWork in projectstore * use ParallelWork in buildstore * use ParallelWork in cachedisklayer * use ParallelWork in compactcas * use ParallelWork in filecas * don't set abort flag in ParallelWork destructor * add PrepareFileForScatteredWrite for temp files in httpclient * Use PrepareFileForScatteredWrite when stream-decompressing files * be more relaxed when deleting temp files * allow explicit zen-cache when using direct host url without resolving * fix lambda capture when writing loose chunks * no delay when attempting to remove temp files
| * keep snapshot on log delete fail (#391)Dan Engelbrecht2025-05-121-26/+21
| | | | | | | | | | - Improvement: Cleaned up snapshot writing for CompactCAS/FileCas/Cache/Project stores - Improvement: Safer recovery when failing to delete log for CompactCAS/FileCas/Cache/Project stores - Improvement: Added log file reset when writing snapshot at startup for FileCas
| * enable per bucket config (#388)Dan Engelbrecht2025-05-121-2/+19
| | | | | | | | Feature: Add per bucket cache configuration (Lua options file only) Improvement: --cache-memlayer-sizethreshold is now deprecated and has a new name: --cache-bucket-memlayer-sizethreshold to line up with per cache bucket configuration