aboutsummaryrefslogtreecommitdiff
path: root/src/zenstore/cas.cpp
Commit message (Collapse)AuthorAgeFilesLines
* zenstore bug-fixes from static analysis pass (#815)Stefan Boberg11 days1-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | **Bug fixes across zenstore, zenremotestore, and related subsystems, primarily surfaced by static analysis.** ## Cache subsystem (cachedisklayer.cpp) - Fixed tombstone scoping bug: tombstone flag and missing entry were recorded outside the block where data was removed, causing non-missing entries to be incorrectly tombstoned - Fixed use-after-overwrite: `RemoveMemCachedData`/`RemoveMetaData` were called after `Payload` was overwritten on cache put, leaking stale data - Fixed incorrect retry sleep formula (`100 - (3 - RetriesLeft) * 100` always produced the same or negative value; corrected to `(3 - RetriesLeft) * 100`) - Fixed broken `break` missing from sidecar file read loop, causing reads past valid data - Fixed missing format argument in three `ZEN_WARN`/`ZEN_ERROR` log calls (format string had `{}` placeholders with no corresponding argument, or vice versa) - Fixed elapsed timer being accumulated inside the wrong scope in `HandleRpcGetCacheRecords` - Fixed test asserting against unserialized `RecordPolicy` instead of the deserialized `Loaded` copy - Initialized `AbortFlag`/`PauseFlag` atomics at declaration (UB if read before first write) ## Build store (buildstore.cpp / buildstore.h) - Fixed wrong variable used in warning log: used loop index `ResultIndex` instead of `Index`/`MetaLocationResultIndexes[Index]`, logging wrong hash values - Fixed `sizeof(AccessTimesHeader)` used instead of `sizeof(AccessTimeRecord)` when advancing write offset, corrupting the access times file if the sizes differ - Initialized `m_LastAccessTimeUpdateCount` atomic member (was uninitialized) - Changed map iteration loops to use `const auto&` to avoid unnecessary copies ## Project store (projectstore.cpp / projectstore.h) - Fixed wrong iterator dereferenced in `IterateChunks`: used `ChunkIt->second` (from a different map lookup) instead of `MetaIt->second` - Fixed wrong assert variable: `Sizes[Index]` should be `RawSizes[Index]` - Fixed `MakeTombstone`/`IsTombstone` inconsistency: `MakeTombstone` was zeroing `OpLsn` but `IsTombstone` checks `OpLsn.Number != 0`; tombstone creation now preserves `OpLsn` - Fixed uninitialized `InvalidEntries` counter - Fixed format string mismatch in warning log - Initialized `AbortFlag`/`PauseFlag` atomics; changed map iteration to `const auto&` ## Workspaces (workspaces.cpp) - Fixed missing alias registration when a workspace share is updated: alias was deleted but never re-inserted - Fixed integer overflow in range clamping: `(RequestedOffset + RequestedSize) > Size` could wrap; corrected to `RequestedSize > Size - RequestedOffset` - Changed map iteration loops to `const auto&` ## CAS subsystem (cas.cpp, caslog.cpp, compactcas.cpp, filecas.cpp) - Fixed `IterateChunks` passing original `Payload` buffer instead of the modified `Chunk` buffer (content type was set on the copy but the original was sent to the callback) - Fixed invalid `std::future::get()` call on default-constructed futures - Fixed sign-comparison in `CasLogFile::Replay` loop (`int i` vs `size_t`) - Changed `CasLogFile::IsValid` and `Open` to take `const std::filesystem::path&` instead of by value - Fixed format string in `~CasContainerStrategy` error log ## Remote store (zenremotestore) - Fixed `FolderContent::operator==` always returning true: loop variable `PathCount` was initialized to 0 instead of `Paths.size()` - Fixed `GetChunkIndexForRawHash` looking up from wrong map (`RawHashToSequenceIndex` instead of `ChunkHashToChunkIndex`) - Fixed double-counted `UniqueSequencesFound` stat (incremented in both branches of an if/else) - Fixed `RawSize` sentinel value truncation: `(uint32_t)-1` assigned to a `uint64_t` field; corrected to `(uint64_t)-1` - Initialized uninitialized atomic and struct members across `buildstorageoperations.h`, `chunkblock.h`, and `remoteprojectstore.h`
* Add test suites (#799)Stefan Boberg2026-03-021-0/+4
| | | | | | | | | | | | | Makes all test cases part of a test suite. Test suites are named after the module and the name of the file containing the implementation of the test. * This allows for better and more predictable filtering of which test cases to run which should also be able to reduce the time CI spends in tests since it can filter on the tests for that particular module. Also improves `xmake test` behaviour: * instead of an explicit list of projects just enumerate the test projects which are available based on build system state * also introduces logic to avoid running `xmake config` unnecessarily which would invalidate the existing build and do lots of unnecessary work since dependencies were invalidated by the updated config * also invokes build only for the chosen test targets As a bonus, also adds `xmake sln --open` which allows opening IDE after generation of solution/xmake project is done.
* Various bug fixes (#778)Stefan Boberg2026-02-241-6/+6
| | | | | | | | | | | | | | | | | | | | | | zencore fixes: - filesystem.cpp: ReadFile error reporting logic - compactbinaryvalue.h: CbValue::As*String error reporting logic zenhttp fixes: - httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB) - httpsys async workpool initialization race zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
* Revert "Fix correctness and concurrency bugs found during code review"Stefan Boberg2026-02-241-6/+6
| | | | This reverts commit 3c89c486338890ce39ddebe5be4722a09e85701a.
* Fix correctness and concurrency bugs found during code reviewStefan Boberg2026-02-241-6/+6
| | | | | | | | | | | | | | | | | zenstore fixes: - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost) - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses) - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete) - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash) - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob zenserver fixes: - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow) - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper) - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths) Co-Authored-By: Claude Opus 4.6 <[email protected]>
* oplog download size (#690)Dan Engelbrecht2025-12-151-9/+9
| | | | - Bugfix: Upload of oplogs could reference multiple blocks for the same chunk causing redundant downloads of blocks - Improvement: Use the improved block reuse selection function from zen builds upload in zen oplog-export to reduce oplog download size
* add EMode to WorkerTheadPool to avoid thread starvation (#492)Dan Engelbrecht2025-09-101-7/+12
| | | - Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work begin queued up. UE-305498
* per namespace/project cas prep refactor (#470)Dan Engelbrecht2025-08-201-10/+8
| | | | | | | - Refactor so we can have more than one cas store for project store and cache. - Refactor `UpstreamCacheClient` so it is not tied to a specific CidStore - Refactor scrub to keep the GC interface ScrubStorage function separate from scrub accessor functions (renamed to Scrub). - Refactor storage size to keep GC interface StorageSize function separate from size accessor functions (renamed to TotalSize) - Refactor cache storage so `ZenCacheDiskLayer::CacheBucket` implements GcStorage interface rather than `ZenCacheNamespace`
* zen print fixes/improvements (#469)Dan Engelbrecht2025-08-191-1/+1
| | | | | - Improvement: `zen print` now allows output of compact binary content even if they are in non-optimal format (Unifom vs Non-Uniform arrays and objects) - Feature: `zen print` now has a `--show-type-info` option to add type information to output of compact binary content - Bugfix: Stats information for Build Store (Zen Store Cache) no longer throws exception and outputs invalid state information
* tweak iterate block parameters (#390)Dan Engelbrecht2025-05-121-2/+5
| | | * tweak block iteration chunk sizes
* long filename support (#330)Dan Engelbrecht2025-03-311-1/+1
| | | - Bugfix: Long file paths now works correctly on Windows
* Memory tracking improvements (#262)Stefan Boberg2024-12-111-1/+0
| | | | | * added LLM tag to properly tag RPC allocations * annotated some more httpsys functions with memory tags * only emit memory scope events if the active tag is different from the new tag
* added support for dynamic LLM tags (#245)Stefan Boberg2024-12-021-0/+20
| | | | | * added FLLMTag which can be used to register memory tags outside of core * changed `UE_MEMSCOPE` -> `ZEN_MEMSCOPE` for consistency * instrumented some subsystems with dynamic tags
* caller controls threshold for bulk-loading chunks in IterateChunks (#222)Dan Engelbrecht2024-11-251-4/+8
| | | | | | * Allow caller to control threshold for bulk-loading chunks in IterateChunks * use smaller batch chunk reading for /fileinfos and /chunkinfos as we do not intend to read the payload * use smaller batch read buffer when just querying for size of attachments
* split zencore/memory.h -> memoryview.h, memcmp.h (#228)Stefan Boberg2024-11-251-1/+1
| | | | | | | minor clean-up `zencore/memory.h` used to contain a variety of things including `Malloc` support along with `MemoryView` etc since the memory allocator stuff moved into `zencore/memory/memory.h` there was basically only `MemoryView` and `MemCmp` in there which seemed better to split out into separate headers to avoid overloading `memory.h`
* remove gc v1 (#121)Dan Engelbrecht2024-10-031-11/+0
| | | | | * kill gc v1 * block use of gc v1 from zen command line * warn and flip to gcv2 if --gc-v2=false is specified for zenserver
* separate worker pools into burst/background to avoid background jobs ↵Dan Engelbrecht2024-08-221-1/+1
| | | | blocking client requests (#134)
* use write and move in place for safer writing of files (#70)Dan Engelbrecht2024-05-021-13/+1
| | | * use write and move in place for safer writing of files
* fix get project files loop (#68)Dan Engelbrecht2024-04-301-5/+8
| | | | | - Bugfix: Remove extra loop causing GetProjectFiles for project store to find all chunks once for each chunk found - Bugfix: Don't capture ChunkIndex variable in CasImpl::IterateChunks by reference as it causes crash - Improvement: Make FileCasStrategy::IterateChunks (optionally) multithreaded (improves GetProjectFiles performance)
* oplog iterate chunks content type (#65)Dan Engelbrecht2024-04-261-3/+24
| | | | - Bugfix: Properly set content type of chunks fetch from CidStore - Improvement: Add IterateChunks(std::span<Oid>) for better performance in get oplog
* iterate cas chunks (#59)Dan Engelbrecht2024-04-241-0/+24
| | | - Improvement: Reworked GetChunkInfos in oplog store to reduce disk thrashing and improve performance
* InsertChunks for CAS store (#55)Dan Engelbrecht2024-04-221-9/+114
| | | - Improvement: Add batching when writing multiple small chunks to block store - decreases I/O load significantly on oplog import
* Use multithreading to fetch size/rawsize of entries in ↵Dan Engelbrecht2024-03-281-1/+1
| | | | | | `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` (#30) - Improvement: Use multithreading to fetch size/rawsize of entries in `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` - Improvement: Add `GetMediumWorkerPool()` in addition to `LargeWorkerPool()` and `SmallWorkerPool()`
* fix potential partially written files (#2)Dan Engelbrecht2024-03-131-2/+12
| | | | * Make sure WriteFile() does not leave incomplete files * use TemporaryFile and MoveTemporaryIntoPlace to avoid leaving partial files on error
* Make sure we wait for all scheduled tasks to complete before throwing ↵Dan Engelbrecht2024-02-281-0/+7
| | | | | exceptions further (#662) Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
* improve trace (#606)Dan Engelbrecht2023-12-131-1/+5
| | | | | * Adding some more trace scopes for better visiblity * Removed spammy trace scope when replaying oplogs * Remove "::Disk" from trace scopes - redundant now that we have merge disk and memory layers
* global thread worker pools (#577)Dan Engelbrecht2023-11-291-1/+2
| | | - Improvement: Use two global worker thread pools instead of ad-hoc creation of worker pools
* close thread pool + parallel CasImpl::Initialize (#576)Dan Engelbrecht2023-11-281-4/+17
| | | | * close thread pool at destruction * parallell casimpl::initialize
* add `flush` command and more gc status info (#483)Dan Engelbrecht2023-10-181-0/+2
| | | | | | - Feature: New endpoint `/admin/flush ` to flush all storage - CAS, Cache and ProjectStore - Feature: New command `zen flush` to flush all storage - CAS, Cache and ProjectStore - Improved: Command `zen gc-status` now gives details about storage, when last GC occured, how long until next GC etc - Changed: Cache access and write log are disabled by default
* add more trace scopes (#362)Dan Engelbrecht2023-09-151-0/+2
| | | | | * more trace scopes * Make sure ReplayLogEntries uses the correct size for oplog buffer * changelog
* Content scrubbing (#271)Stefan Boberg2023-05-161-1/+5
| | | Added zen scrub command which may be triggered via the zen CLI helper. This traverses storage and validates contents either by content hash and/or by structure. If unexpected data is encountered it is invalidated.
* Additional trace instrumentation (#312)Stefan Boberg2023-05-161-0/+6
| | | | | | | | | * added trace instrumentation to upstreamcache * added asio trace instrumentation * added trace annotations for project store * added trace annotations for BlockStore * added trace annotations for HttpClient * added trace annotations for CAS/GC
* minor GC API cleanupStefan Boberg2023-05-151-6/+6
| | | | | Scrub -> ScrubStorage Trigger -> TriggerGc (to make relationship to TriggerScrub clearer)
* moved source directories into `/src` (#264)Stefan Boberg2023-05-021-0/+355
* moved source directories into `/src` * updated bundle.lua for new `src` path * moved some docs, icon * removed old test trees