path: root/src/zenstore/blockstore.cpp
Commit message | Author | Age | Files | Lines
* minor zenstore/blockstore fixes (#821) | Stefan Boberg | 6 days | 1 | -12/+12
  - Fix clang-format error accidentally introduced by recent PR
  - Fix `FileSize()` CAS race that repeatedly invalidated the cache when concurrent callers both missed; remove `store(0)` on CAS failure
  - Fix `WriteChunks` not accounting for initial alignment padding in `m_TotalSize`, causing drift vs `WriteChunk`'s correct accounting
  - Fix Create retry sleep computing negative values (100 - N*100 instead of 100 + N*100), matching the Open retry pattern
  - Fix `~BlockStore` error log missing format placeholder for `Ex.what()`
  - Fix `GetFreeBlockIndex` infinite loop when all indexes have orphan files on disk but aren't in `m_ChunkBlocks`; bound probe to `m_MaxBlockCount`
  - Fix `IterateBlock` ignoring `SmallSizeCallback` return value for single out-of-bounds chunks, preventing early termination
  - Fix `BlockStoreCompactState::IterateBlocks` iterating map by value instead of const reference
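The retry-sleep fix in this commit comes down to a sign error in a linear backoff. The sketch below is illustrative only: `RetryDelay` and `BuggyRetryDelay` are hypothetical names, not functions from blockstore.cpp, and treating the values as milliseconds with a 0-based attempt counter is an assumption.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from blockstore.cpp): delay before retry number
// Attempt, assuming a 0-based counter and millisecond units.
inline std::chrono::milliseconds RetryDelay(int Attempt)
{
    assert(Attempt >= 0);
    // Corrected form from the commit message: 100 + N*100
    return std::chrono::milliseconds(100 + Attempt * 100);
}

// The buggy form named in the commit message, kept only for comparison.
inline std::chrono::milliseconds BuggyRetryDelay(int Attempt)
{
    // 100 - N*100 drops to zero at N == 1 and goes negative from N == 2
    return std::chrono::milliseconds(100 - Attempt * 100);
}
```

With the corrected form the delay grows with every attempt, while the buggy form stops being positive after the first retry.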
* Add test suites (#799) | Stefan Boberg | 2026-03-02 | 1 | -0/+4
  Makes all test cases part of a test suite. Test suites are named after the module and the name of the file containing the implementation of the test.
  * This allows for better and more predictable filtering of which test cases to run which should also be able to reduce the time CI spends in tests since it can filter on the tests for that particular module.
  Also improves `xmake test` behaviour:
  * instead of an explicit list of projects just enumerate the test projects which are available based on build system state
  * also introduces logic to avoid running `xmake config` unnecessarily which would invalidate the existing build and do lots of unnecessary work since dependencies were invalidated by the updated config
  * also invokes build only for the chosen test targets
  As a bonus, also adds `xmake sln --open` which allows opening IDE after generation of solution/xmake project is done.
* reduce batch size for reads (#740) | Dan Engelbrecht | 2026-01-29 | 1 | -2/+2
  * reduce maximum size per chunk to read to reduce disk contention
  * increase timeout before warning on slow shut down of zenserver
  * reduce default window size for blockstore chunk iteration
* various optimizations (#704) | Dan Engelbrecht | 2026-01-09 | 1 | -6/+18
  - Improvement: Validate chunk hashes when dechunking files in oplog import
  - Improvement: Use stream decompression when dechunking files
  - Improvement: When assembling blocks for oplog export, make sure we keep under/at block size limit
  - Improvement: Make cancelling of oplog import more responsive
  - Improvement: Use decompress to composite to avoid allocating a new memory buffer for uncompressed chunks during oplog import
  - Improvement: Reduce memory buffer size and allocate it on demand when writing multiple chunks to block store
  - Improvement: Reduce lock contention when fetching/checking existence of chunks in block store
* automatic scrub on startup (#667) | Dan Engelbrecht | 2025-11-27 | 1 | -0/+1
  - Improvement: Deeper validation of data when scrub is activated (cas/cache/project)
  - Improvement: Enabled more multi threading when running scrub operations
  - Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
* fix block store file appender (#658) | Dan Engelbrecht | 2025-11-20 | 1 | -3/+43
  * fix bug where we write buffered data instead of provided data in BlockStoreFileAppender
* add append-only buffering of BlockStoreFile (#652) | Dan Engelbrecht | 2025-11-17 | 1 | -9/+124
  * add append-only buffering of BlockStoreFile
    replaces use of BasicFileWriter in Compact which bypassed cached position in BlockStore
* optimize blockstore flush (#614) | Dan Engelbrecht | 2025-10-27 | 1 | -28/+55
  * rework block store block flushing to only happen once at end of block write outside of locks
  * fix warning at startup if no gc.dlog file exists
* optimize blockstore filesize (#612) | Dan Engelbrecht | 2025-10-24 | 1 | -1/+11
  * since we only ever append to a block store file we don't need to actually flush the position
* add ability to limit concurrency (#565) | Stefan Boberg | 2025-10-10 | 1 | -2/+2
  effective concurrency in zenserver can be limited via the `--corelimit=<N>` option on the command line. Any value passed in here will be used instead of the return value from `std::thread::hardware_concurrency()` if it is lower.
  * added --corelimit option to zenserver
  * made sure thread pools are configured lazily and not during global init
  * added log output indicating effective and HW concurrency
  * added change log entry
  * removed debug logging from ZenEntryPoint::Run()
  also removed main thread naming on Linux since it makes the output from `top` and similar tools confusing (it shows `main` instead of `zenserver`)
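The `--corelimit` rule described in this commit (use the limit only when it is lower than the hardware value) can be sketched as follows. This is not zenserver's implementation: `DetectHardwareThreads` and `EffectiveConcurrency` are hypothetical names, and treating a limit of 0 as "no limit given" is an assumption.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: std::thread::hardware_concurrency() may return 0 when
// the value cannot be determined, so fall back to 1 in that case.
inline unsigned DetectHardwareThreads()
{
    const unsigned N = std::thread::hardware_concurrency();
    return N != 0 ? N : 1;
}

// Hypothetical helper: CoreLimit == 0 is assumed to mean "option not set".
// The limit only applies when it is lower than the detected thread count.
inline unsigned EffectiveConcurrency(unsigned CoreLimit, unsigned HardwareThreads)
{
    assert(HardwareThreads > 0);
    if (CoreLimit != 0 && CoreLimit < HardwareThreads)
    {
        return CoreLimit;
    }
    return HardwareThreads;
}
```

A caller would pass `DetectHardwareThreads()` as the second argument; a limit above the detected value has no effect.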
* fix missing chunk in block after gc (#560) | Dan Engelbrecht | 2025-10-06 | 1 | -1/+3
  * make sure we use aligned write pos in blockstore compact when checking target block size
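Why the aligned write position matters for the size check can be shown with a small sketch. The names `AlignUp` and `FitsInBlock` and the example numbers are assumptions made for illustration; the actual compact code is not shown on this page.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round Pos up to the next multiple of Alignment.
inline uint64_t AlignUp(uint64_t Pos, uint64_t Alignment)
{
    assert(Alignment != 0);
    return (Pos + Alignment - 1) / Alignment * Alignment;
}

// Hypothetical check: the chunk is written at the aligned position, so the
// fit test has to start from that position rather than the raw write pos.
inline bool FitsInBlock(uint64_t WritePos, uint64_t ChunkSize, uint64_t Alignment, uint64_t MaxBlockSize)
{
    return AlignUp(WritePos, Alignment) + ChunkSize <= MaxBlockSize;
}
```

For example, with a write position of 10, a chunk of 50 bytes, 16-byte alignment and a 64-byte block, the unaligned sum (60) appears to fit, but the chunk would actually start at 16 and end at 66, past the end of the block.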
* speed up tests (#555) | Dan Engelbrecht | 2025-10-06 | 1 | -58/+72
  * faster FileSystemTraversal test
  * faster jobqueue test
  * faster NamedEvent test
  * faster cache tests
  * faster basic http tests
  * faster blockstore test
  * faster cache store tests
  * faster compactcas tests
  * more responsive zenserver launch
  * tweak worker pool sizes in tests
* fix missing chunk (#548) | Dan Engelbrecht | 2025-10-03 | 1 | -2/+12
  * fix race condition where BlockStoreFile::m_CachedFileSize may be reset between check and get in FileSize()
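A "reset between check and get" race of this kind typically arises when an atomic is loaded twice. The sketch below shows the general pattern and one common remedy (load once into a local); `SizeCache` is a hypothetical type, the use of 0 as the "unknown" marker is an assumption, and this is not necessarily how the commit fixed it.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for a cached file size; 0 is assumed to mean "unknown".
struct SizeCache
{
    std::atomic<uint64_t> Cached{0};

    // Racy pattern: another thread may reset Cached to 0 between the two
    // loads, so the second load can return 0 even though the check passed.
    uint64_t GetRacy(uint64_t Fallback) const
    {
        if (Cached.load() != 0)
        {
            return Cached.load();
        }
        return Fallback;
    }

    // Safer pattern: take one snapshot and use it for both check and result.
    uint64_t Get(uint64_t Fallback) const
    {
        const uint64_t Snapshot = Cached.load();
        return Snapshot != 0 ? Snapshot : Fallback;
    }
};
```

In a real implementation `Fallback` would be the size queried from the file system on a cache miss.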
* fix race condition in BlockStoreFile::Flush (#525) | Dan Engelbrecht | 2025-09-29 | 1 | -2/+2
  Bugfix: Flush of blockstore file could sometimes cause an error due to a race condition
* more responsive cancel during oplog import (#505) | Dan Engelbrecht | 2025-09-22 | 1 | -6/+7
  - Improvement: Faster oplog import due to chunk existence check improvement
  - Improvement: Cancelling oplog import is now more responsive during initial phase
* add EMode to WorkerThreadPool to avoid thread starvation (#492) | Dan Engelbrecht | 2025-09-10 | 1 | -100/+112
  - Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work being queued up. UE-305498
* missing chunks bugfix (#424) | Dan Engelbrecht | 2025-06-09 | 1 | -16/+37
  * make sure to close log file when resetting log
  * drop entries that refer to missing blocks
  * Don't scrub keys that have been rewritten
  * correctly count added bytes / m_TotalSize
  * fix negative sleep time in BlockStoreFile::Open()
  * be defensive when fetching log position
  * append to log files *after* we updated all state successfully
  * explicitly close stuff in destructors with exception catching
  * clean up empty size block store files
* add missing flush in blockstore compact (#411) | Dan Engelbrecht | 2025-05-30 | 1 | -11/+67
  - Bugfix: Flush the last block before closing the last new block written to during blockstore compact. UE-291196
  - Feature: Drop unreachable CAS data during GC pass. UE-291196
* faster oplog validate (#408) | Dan Engelbrecht | 2025-05-30 | 1 | -5/+42
  Improvement: Faster oplog validate to reduce GC wall time and disk I/O pressure
* optimize block store CompactBlocks (#384) | Dan Engelbrecht | 2025-05-07 | 1 | -14/+19
  - Improvement: Optimize block compact reducing memcpy operations
  - Improvement: Handle padding of block store blocks when compacting to avoid excessive flushing of write buffer
  - Improvement: Handle padding when writing oplog index snapshot to avoid unnecessary flushing of write buffer
* iterate chunks crash fix (#376) | Dan Engelbrecht | 2025-05-02 | 1 | -10/+21
  * Bugfix: Add explicit lambda capture in CasContainer::IterateChunks to avoid accessing state data references
* long filename support (#330) | Dan Engelbrecht | 2025-03-31 | 1 | -9/+9
  - Bugfix: Long file paths now work correctly on Windows
* zen build cache service (#318) | Dan Engelbrecht | 2025-03-26 | 1 | -1/+1
  - **EXPERIMENTAL** `zen builds`
  - Feature: `--zen-cache-host` option for `upload` and `download` operations to use a zenserver host `/builds` endpoint for storing build blob and blob metadata
  - Feature: New `/builds` endpoint for caching build blobs and blob metadata
  - `/builds/{namespace}/{bucket}/{buildid}/blobs/{hash}` `GET` and `PUT` method for storing and fetching blobs
  - `/builds/{namespace}/{bucket}/{buildid}/blobs/putBlobMetadata` `POST` method for storing metadata about blobs
  - `/builds/{namespace}/{bucket}/{buildid}/blobs/getBlobMetadata` `POST` method for fetching metadata about blobs
  - `/builds/{namespace}/{bucket}/{buildid}/blobs/exists` `POST` method for checking existence of blobs
* Add multithreading directory scanning in core/filesystem (#277) | Dan Engelbrecht | 2025-01-22 | 1 | -2/+3
  add DirectoryContent::IncludeFileSizes
  add DirectoryContent::IncludeAttributes
  add multithreaded GetDirectoryContent
  use multithreaded GetDirectoryContent in workspace folder scanning
* batch fetch record cache values (#266) | Dan Engelbrecht | 2024-12-17 | 1 | -2/+7
  - Improvement: Batch fetch record attachments when appropriate
  - Improvement: Reduce memory buffer allocation in BlockStore::IterateBlock
  - Improvement: Tweaked BlockStore::IterateBlock logic when to use threaded work (at least 4 chunks requested)
  - Bugfix: CasContainerStrategy::IterateChunks could give wrong payload/index when requesting 1 or 2 chunks
* added support for dynamic LLM tags (#245) | Stefan Boberg | 2024-12-02 | 1 | -0/+19
  * added FLLMTag which can be used to register memory tags outside of core
  * changed `UE_MEMSCOPE` -> `ZEN_MEMSCOPE` for consistency
  * instrumented some subsystems with dynamic tags
* use plain sorted array instead of map of vectors (#237) | Dan Engelbrecht | 2024-11-27 | 1 | -18/+26
  * use plain sorted array instead of map of vectors
  * reserve vectors up front = 5% perf increase
  * don't do batch read of chunks if we have a single chunk -> 1% perf gain
* caller controls threshold for bulk-loading chunks in IterateChunks (#222) | Dan Engelbrecht | 2024-11-25 | 1 | -1/+2
  * Allow caller to control threshold for bulk-loading chunks in IterateChunks
  * use smaller batch chunk reading for /fileinfos and /chunkinfos as we do not intend to read the payload
  * use smaller batch read buffer when just querying for size of attachments
* don't read chunks into memory during cache batch fetch unless we may cache them in memory (#188) | Dan Engelbrecht | 2024-10-09 | 1 | -12/+12
  * Don't read chunks into memory during cache batch fetch unless we may cache them in memory
* remove gc v1 (#121) | Dan Engelbrecht | 2024-10-03 | 1 | -631/+17
  * kill gc v1
  * block use of gc v1 from zen command line
  * warn and flip to gcv2 if --gc-v2=false is specified for zenserver
* gc block size target max size (#180) | Dan Engelbrecht | 2024-10-02 | 1 | -8/+31
  * If a block is small (less than half max size) we add it to blocks to compact
    Sort blocks when iterating over them
  * do compact of block stores even if no new unused are found
  * do compact phase even if bucket is empty
* optimize startup time (#175) | Dan Engelbrecht | 2024-09-30 | 1 | -15/+5
  * use tsl::robin_set for BlockIndexSet
    don't calculate full block location when only block index is needed
  * don't copy visitor function
  * reserve space for attachments
* exception safety when writing block (#168) | Dan Engelbrecht | 2024-09-25 | 1 | -10/+8
  * make sure we always clear writing block from m_ActiveWriteBlocks even if we have an exception
* gc performance improvements (#160) | Dan Engelbrecht | 2024-09-17 | 1 | -3/+3
  * optimized ValidateCbUInt
  * optimized iohash comparison
  * replace unordered set/map with tsl/robin set/map in blockstore
  * increase max buffer size when writing cache bucket sidecar
  * only store meta data for files < 4Gb
  * faster ReadAttachmentsFromMetaData
  * remove memcpy call in BlockStoreDiskLocation
  * only write cache bucket state to disk if GC deleted anything
* oplog index snapshots (#140) | Dan Engelbrecht | 2024-09-03 | 1 | -0/+2
  - Feature: Added project store oplog index snapshots for faster opening of oplog - opening oplogs is roughly 10x faster
* meta info store (#75) | Dan Engelbrecht | 2024-08-30 | 1 | -7/+147
  - Feature: Added option `--gc-cache-attachment-store` which caches referenced attachments in cache records on disk for faster GC - default is `false`
  - Feature: Added option `--gc-projectstore-attachment-store` which caches referenced attachments in project store oplogs on disk for faster GC - default is `false`
* prevent new block in gc (#118) | Dan Engelbrecht | 2024-08-15 | 1 | -1/+1
  * make sure we don't reset write-pos for new block for each block iterated
* Skip chunk in block stores when iterating a block if the location is out of range (#109) | Dan Engelbrecht | 2024-08-12 | 1 | -2/+7
* don't assert that we have moved bytes if source block is zero size (#97) | Dan Engelbrecht | 2024-06-14 | 1 | -1/+2
  * don't assert that we have moved bytes if source block is zero size
  * handle invalid session ids gracefully
* add batching of CacheStore requests for GetCacheValues/GetCacheChunks (#90) | Dan Engelbrecht | 2024-06-04 | 1 | -14/+40
  * cache file size of block on open
  * add ability to control size limit for small chunk callback when iterating block
  * Add batch fetch of cache values in the GetCacheValues request
* refactor BlockStore IterateChunks (#77) | Dan Engelbrecht | 2024-05-17 | 1 | -189/+196
  Improvement: Refactored IterateChunks to allow reuse in diskcachelayer and hide public GetBlockFile() function in BlockStore
* iterate cas chunks (#59) | Dan Engelbrecht | 2024-04-24 | 1 | -83/+122
  - Improvement: Reworked GetChunkInfos in oplog store to reduce disk thrashing and improve performance
* InsertChunks for CAS store (#55) | Dan Engelbrecht | 2024-04-22 | 1 | -0/+163
  - Improvement: Add batching when writing multiple small chunks to block store - decreases I/O load significantly on oplog import
* gc v2 disk freed space fix and oplog stats report improvement (#45) | Dan Engelbrecht | 2024-04-15 | 1 | -10/+15
  - Bugfix: Correctly calculate size freed/data moved from blocks in GCv2
  - Improvement: Reduced details in remote store stats for oplog export/import to user
  - Improvement: Transfer speed for oplog export/import is now an overall number rather than average of speed per single request
* improved assert (#37) | Dan Engelbrecht | 2024-04-04 | 1 | -3/+3
  - Improvement: Add file and line to ASSERT exceptions
  - Improvement: Catch call stack when throwing assert exceptions and log/output call stack at important places to provide more context to caller
* validate rpc chunk responses (#36) | Dan Engelbrecht | 2024-04-03 | 1 | -1/+5
  * Validate size of found chunks in cas/cache
* add disk caching to block move (#661) | Dan Engelbrecht | 2024-02-27 | 1 | -23/+36
  * add disk caching to block move
* improved block store logging and more gcv2 tests (#659) | Dan Engelbrecht | 2024-02-27 | 1 | -16/+52
  * improved gc/blockstore logging
  * more gcv2 tests
* Add retry with optional resume logic to HttpClient::Download (#639) | Dan Engelbrecht | 2024-01-24 | 1 | -80/+84
  - Improvement: Refactored Jupiter upstream to use HttpClient
  - Improvement: Added retry and resume logic to HttpClient
  - Improvement: Added authentication support to HttpClient
  - Improvement: Clearer logging in GCV2 compact of FileCas/BlockStore
  - Improvement: Size details in oplog import logging
* Fix crash bug when trying to inspect non-open block file in GC (#614) | Dan Engelbrecht | 2023-12-18 | 1 | -7/+19