path: root/src/zenstore
Commit message | Author | Age | Files | Lines
* test running / reporting improvements (#797) | Stefan Boberg | 11 days | 1 | -1/+1
  **CI/CD improvements (validate.yml):**
  - Add test reporter (`ue-foundation/test-reporter@v2`) for all three platforms, rendering JUnit test results directly in PR check runs
  - Add "Trust workspace" step on Windows to fix git safe.directory ownership issue with self-hosted runners
  - Clean stale report files before each test run to prevent false failures from leftover XML
  - Broaden `paths-ignore` to skip builds for non-code changes (`*.md`, `LICENSE`, `.gitignore`, `docs/**`)
  **Test improvements:**
  - Convert `CHECK` to `REQUIRE` in several test suites (projectstore, integration, http) for fail-fast behavior
  - Mark some tests with `doctest::skip()` for selective execution
  - Skip httpclient transport tests pending investigation
  - Add `--noskip` option to `xmake test` task
  - Add `--repeat=<N>` option to `xmake test` task, to run tests repeatedly N times or until there is a failure
  **xmake test output improvements:**
  - Add totals row to test summary table
  - Right-justify numeric columns in summary table
* Various bug fixes (#778) | Stefan Boberg | 2026-02-24 | 5 | -12/+14
  zencore fixes:
  - filesystem.cpp: ReadFile error reporting logic
  - compactbinaryvalue.h: CbValue::As*String error reporting logic
  zenhttp fixes:
  - httpasio BindAcceptor would `return 0;` in a function returning `std::string` (UB)
  - httpsys async workpool initialization race
  zenstore fixes:
  - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost)
  - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses)
  - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete)
  - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size
  - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash)
  - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob
  zenserver fixes:
  - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow)
  - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper)
  - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
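Three of the bug classes listed above (`sizeof(sizeof(...))`, an output parameter passed by value, and a reversed unsigned subtraction) can be reproduced in a few lines. The following is a minimal sketch only: `ExampleHeader`, `IsLargeEnoughBuggy`, `CollectByValue`, `RangeLength` and the other names are hypothetical and are not taken from the zen sources, and the range helper assumes an exclusive end offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical on-disk record layout, used only for illustration.
struct ExampleHeader
{
    std::uint64_t Offset;
    std::uint64_t Size;
    std::uint32_t Flags;
    std::uint32_t Reserved;
};

// sizeof(sizeof(T)) is the size of a std::size_t, not the size of T.
static_assert(sizeof(sizeof(ExampleHeader)) == sizeof(std::size_t), "inner sizeof yields a size_t");
static_assert(sizeof(ExampleHeader) > sizeof(std::size_t), "the struct is larger than a size_t");

// Buggy shape: accepts any payload that is at least sizeof(size_t) bytes.
inline bool IsLargeEnoughBuggy(std::size_t PayloadSize)
{
    return PayloadSize >= sizeof(sizeof(ExampleHeader));
}

// Fixed shape: compares against the real struct size.
inline bool IsLargeEnoughFixed(std::size_t PayloadSize)
{
    return PayloadSize >= sizeof(ExampleHeader);
}

// By value: the function appends to a private copy, so the caller sees nothing.
inline void CollectByValue(std::vector<int> Results)
{
    Results.push_back(1);
}

// By reference: the caller's vector receives the result.
inline void CollectByReference(std::vector<int>& Results)
{
    Results.push_back(1);
}

// Reversed operands: Start - End wraps around for unsigned types when End > Start.
inline std::uint64_t RangeLengthReversed(std::uint64_t Start, std::uint64_t End)
{
    return Start - End;
}

// Correct order for an exclusive end offset (assumption made for this sketch).
inline std::uint64_t RangeLength(std::uint64_t Start, std::uint64_t End)
{
    assert(End >= Start);
    return End - Start;
}
```

Each buggy variant compiles without errors, which is why problems of this kind tend to surface in review or through wrong results rather than in the build.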
* Revert "Fix correctness and concurrency bugs found during code review" | Stefan Boberg | 2026-02-24 | 5 | -14/+12
  This reverts commit 3c89c486338890ce39ddebe5be4722a09e85701a.
* Fix correctness and concurrency bugs found during code review | Stefan Boberg | 2026-02-24 | 5 | -12/+14
  zenstore fixes:
  - cas.cpp: GetFileCasResults Results param passed by value instead of reference (large chunk results were silently lost)
  - structuredcachestore.cpp: MissCount unconditionally incremented (counted hits as misses)
  - cacherpc.cpp: Wrong boolean in Incomplete response array (all entries marked incomplete)
  - cachedisklayer.cpp: sizeof(sizeof(...)) in two validation checks computed sizeof(size_t) instead of struct size
  - buildstore.cpp: Wrong hash tracked in GC key list (BlobHash pushed twice instead of MetadataHash)
  - buildstore.cpp: Removed duplicate m_LastAccessTimeUpdateCount increment in PutBlob
  zenserver fixes:
  - httpbuildstore.cpp: Reversed subtraction in HTTP range calculation (unsigned underflow)
  - hubservice.cpp: Deadlock in Provision() calling Wake() while holding m_Lock (extracted WakeLocked helper)
  - zipfs.cpp: Data race in GetFile() lazy initialization (added RwLock with shared/exclusive paths)
  Co-Authored-By: Claude Opus 4.6 <[email protected]>
* GC - fix handling of attachment ranges, http access token expiration, lock file retry logic (#766) | Stefan Boberg | 2026-02-20 | 2 | -4/+5
  * GC - fix handling of attachment ranges
  * fix trace/log strings
  * fix HTTP access token expiration time logic
  * added missing lock retry in zenserver startup
* reduce lock time for project store gc precache and gc validate (#750) | Dan Engelbrecht | 2026-02-11 | 2 | -46/+261
  * add oplog snapshot function to allow reduction of held oplog locks
  * release project lock when precaching each oplog
* use matcher over regex (#744) | Dan Engelbrecht | 2026-02-04 | 1 | -2/+2
  * replace http router AddPattern with AddMatcher
  * fix scrub logging
* reduce blocking in scrub (#743) | Dan Engelbrecht | 2026-02-03 | 3 | -67/+100
  * reduce held locks while performing scrub operation
* reduce batch size for reads (#740) | Dan Engelbrecht | 2026-01-29 | 1 | -2/+2
  * reduce maximum size per chunk to read to reduce disk contention
  * increase timeout before warning on slow shut down of zenserver
  * reduce default window size for blockstore chunk iteration
* don't do full cb-object validation on cache records when read from disk (#739) | Dan Engelbrecht | 2026-01-29 | 1 | -10/+9
  * don't do full cb-object validation on cache records when read from disk
* remove ZENCORE_API completely (#718) | Stefan Boberg | 2026-01-19 | 2 | -3/+3
  initially we had ZENCORE_API macros to potentially allow for DLL linkage. It turns out that this is not useful and the macros just contribute noise, so this change removes them completely.
* various optimizations (#704) | Dan Engelbrecht | 2026-01-09 | 3 | -11/+25
  - Improvement: Validate chunk hashes when dechunking files in oplog import
  - Improvement: Use stream decompression when dechunking files
  - Improvement: When assembling blocks for oplog export, make sure we keep under/at block size limit
  - Improvement: Make cancelling of oplog import more responsive
  - Improvement: Use decompress to composite to avoid allocating a new memory buffer for uncompressed chunks during oplog import
  - Improvement: Reduce memory buffer size and allocate it on demand when writing multiple chunks to block store
  - Improvement: Reduce lock contention when fetching/checking existence of chunks in block store
* added early-out check in GcManager::ScrubStorage(ScrubContext& GcCtx) (#698) | Stefan Boberg | 2026-01-07 | 1 | -1/+7
  minimises time spent doing setup work after the deadline has expired
  also added log output with deadline/timeout information
* Fix unit test that relies on being able to overwrite | zousar | 2025-12-19 | 1 | -2/+2
* Ensure upstream put propagation includes overwrite | zousar | 2025-12-19 | 2 | -5/+13
  When changing the default limit-overwrite behavior, a unit test surfaced a bug where a put of data with overwrite cache policy would not get propagated via zen's built-in upstream mechanism with a matching overwrite cache policy to the upstream. This change ensures that it does and leaves the unit test configured to exercise this scenario.
* Change default limit-overwrite behavior to true | zousar | 2025-12-17 | 1 | -1/+1
* remove error warning in output (#695) | Dan Engelbrecht | 2025-12-17 | 1 | -2/+2
  * changed some logging strings so they don't get caught in CI logging
* oplog download size (#690) | Dan Engelbrecht | 2025-12-15 | 1 | -9/+9
  - Bugfix: Upload of oplogs could reference multiple blocks for the same chunk causing redundant downloads of blocks
  - Improvement: Use the improved block reuse selection function from zen builds upload in zen oplog-export to reduce oplog download size
* add otel instrumentation (#581) | Stefan Boberg | 2025-12-11 | 2 | -2/+6
  this change adds OTEL tracing to a few places
  * Top-level application lifecycle (config/init/cleanup, main loop)
  * http.sys requests
  it also brings some otlptrace optimizations and dynamic configuration of tracing. OTLP tracing is currently always disabled
* catch all exceptions during projectstore scrub (#686) | Dan Engelbrecht | 2025-12-11 | 1 | -0/+6
* remove catching of exceptions in batch operations now that they are not executed in the destructor (#683) | Dan Engelbrecht | 2025-12-10 | 2 | -101/+61
  don't call WriteChunks in batch operation if no chunks need to be written
* batch op not in destructor (#676) | Dan Engelbrecht | 2025-12-04 | 5 | -343/+428
  * use fixed vectors for batch requests
  * refactor cache batch value put/get to not execute code that can throw exceptions in destructor
  * extend test with multi-bucket requests
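The refactor described above moves work that can throw out of a destructor. As a minimal sketch of that general pattern, assuming a made-up `ExampleBatch` type (not the zenstore batch API), the queued work runs in an explicit `Execute()` call and the destructor does nothing that can fail:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical batch type used only to illustrate the pattern.
class ExampleBatch
{
public:
    // Queue an operation; nothing runs yet.
    void Add(std::function<void()> Op) { m_Ops.push_back(std::move(Op)); }

    // Run the queued operations in order. This may throw, and the caller
    // can catch the exception in an ordinary try/catch block.
    void Execute()
    {
        std::vector<std::function<void()>> Ops = std::move(m_Ops);
        m_Ops.clear();  // leave the batch in a well-defined empty state
        for (std::function<void()>& Op : Ops)
        {
            Op();
        }
    }

    std::size_t Pending() const { return m_Ops.size(); }

    // The destructor only releases memory. Destructors are implicitly
    // noexcept, so an exception escaping one would call std::terminate.
    ~ExampleBatch() = default;

private:
    std::vector<std::function<void()>> m_Ops;
};
```

With this shape, a failing operation surfaces as a normal exception at the `Execute()` call site instead of terminating the process during stack unwinding.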
* add checks to protect against access violation due to failed disk read (#675) | Dan Engelbrecht | 2025-12-04 | 1 | -0/+12
  * add checks to protect against access violation due to failed disk read
* make sure we use exclusive lock in projectstore when flushing/writing snapshot (#673) | Dan Engelbrecht | 2025-12-01 | 2 | -12/+37
* automatic scrub on startup (#667) | Dan Engelbrecht | 2025-11-27 | 14 | -367/+613
  - Improvement: Deeper validation of data when scrub is activated (cas/cache/project)
  - Improvement: Enabled more multi threading when running scrub operations
  - Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
* RawOffset can be anything and we expect an empty buffer to be returned along with RawSize = 0 if the offset was out of bounds for the value. (#666) | Dan Engelbrecht | 2025-11-26 | 1 | -17/+23
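The contract stated in this commit subject (any `RawOffset` is accepted, and an out-of-bounds offset yields an empty buffer with `RawSize = 0`) can be sketched as follows. `ExampleRange` and `ReadRange` are hypothetical names, and clamping an in-bounds request to the available bytes is an assumption of this sketch rather than something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical result type: the returned bytes plus their size.
struct ExampleRange
{
    std::vector<std::uint8_t> Data;
    std::uint64_t             RawSize;
};

// Returns the bytes of Value starting at RawOffset.
// An offset at or past the end is not an error: it yields an empty range.
inline ExampleRange ReadRange(const std::vector<std::uint8_t>& Value,
                              std::uint64_t                    RawOffset,
                              std::uint64_t                    RequestedSize)
{
    if (RawOffset >= Value.size())
    {
        return ExampleRange{{}, 0};  // out of bounds: empty buffer, RawSize = 0
    }

    // Assumption: a request that runs past the end is clamped to what exists.
    const std::uint64_t Available = Value.size() - RawOffset;
    const std::uint64_t Size      = std::min(RequestedSize, Available);

    const auto Begin = Value.begin() + static_cast<std::ptrdiff_t>(RawOffset);
    const auto End   = Begin + static_cast<std::ptrdiff_t>(Size);
    return ExampleRange{std::vector<std::uint8_t>(Begin, End), Size};
}
```

Checking the offset before subtracting avoids the unsigned wrap-around that `Value.size() - RawOffset` would otherwise produce for large offsets.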
* fix block store file appender (#658) | Dan Engelbrecht | 2025-11-20 | 1 | -3/+43
  * fix bug where we write buffered data instead of provided data in BlockStoreFileAppender
* add append-only buffering of BlockStoreFile (#652) | Dan Engelbrecht | 2025-11-17 | 1 | -9/+124
  * add append-only buffering of BlockStoreFile
  replaces use of BasicFileWriter in Compact which bypassed cached position in BlockStore
* switch to xmake for package management (#611) | Stefan Boberg | 2025-11-07 | 2 | -3/+3
  This change removes our dependency on vcpkg for package management, in favour of bringing some code in-tree in the `thirdparty` folder as well as using the xmake built-in package management feature. For the latter, all the package definitions are maintained in the zen repo itself, in the `repo` folder.
  It should now also be easier to build the project as it will no longer depend on having the right version of vcpkg installed, which has been a common problem for new people coming in to the codebase. Now you should only need xmake to build.
  * Bumps xmake requirement on github runners to 2.9.9 to resolve an issue where xmake on Windows invokes cmake with `v144` toolchain which does not exist
  * BLAKE3 is now in-tree at `thirdparty/blake3`
  * cpr is now in-tree at `thirdparty/cpr`
  * cxxopts is now in-tree at `thirdparty/cxxopts`
  * fmt is now in-tree at `thirdparty/fmt`
  * robin-map is now in-tree at `thirdparty/robin-map`
  * ryml is now in-tree at `thirdparty/ryml`
  * sol2 is now in-tree at `thirdparty/sol2`
  * spdlog is now in-tree at `thirdparty/spdlog`
  * utfcpp is now in-tree at `thirdparty/utfcpp`
  * xmake package repo definitions are in `repo`
  * implemented support for sanitizers. ASAN is supported on Windows; TSAN, UBSAN, MSAN etc. are supported on Linux/MacOS, though I have not yet tested it extensively on MacOS
  * the zencore encryption implementation also now supports using mbedTLS which is used on MacOS, though for now we still use openssl on Linux
  * crashpad
  * bumps libcurl to 8.11.0 (from 8.8.0) which should address a rare build upload bug
* fix clean directory and make them use effective threading where appropriate (#625) [v5.7.8-pre5, v5.7.8-pre3, v5.7.8-pre2] | Dan Engelbrecht | 2025-11-03 | 1 | -1/+1
  fix retry logic so it does not immediately sleep if file does not exist
  make sure we don't try to delete target folder files if we have already wiped it
* Various fixes to address issues flagged by gcc / non-UE toolchain build (#621) | Stefan Boberg | 2025-11-01 | 5 | -5/+13
  * gcc: avoid using memset on nontrivial struct
  * redundant `return std::move`
  * fixed various compilation issues flagged by gcc
  * fix issue in xmake.lua detecting whether we are building with the UE toolchain or not
  * add GCC ignore -Wundef (comment is inaccurate)
  * remove redundant std::move
  * don't catch exceptions by value
  * unreferenced variables
  * initialize "by the book" instead of memset
  * remove unused exception reference
  * add #include <cstring> to fix gcc build
  * explicitly populate KeyValueMap by traversing input spans, fixes gcc compilation
  * remove unreferenced variable
  * eliminate redundant `std::move` which gcc complains about
  * fix gcc compilation by including <cstring>
  * tag unreferenced variable to fix gcc compilation
  * fixes for various cases of naming members the same as their type
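Several items in the list above correspond to well-known gcc diagnostics. The following is a minimal sketch of the corrected forms, using hypothetical names (`ExampleState`, `MakeState`, `Describe`) that do not come from the zen sources: value-initialization instead of `memset` on a nontrivial struct, a plain `return` of a local instead of `return std::move(...)`, and catching by const reference instead of by value.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Nontrivial because of the std::string member: memset over this type would
// overwrite the string's internal state and is undefined behaviour.
struct ExampleState
{
    std::string Name;
    int         Count = 0;
};

inline ExampleState MakeState()
{
    ExampleState State{};  // value-initialize "by the book" instead of memset
    State.Name = "example";
    // Returning the local by name lets the compiler elide the copy or move it
    // implicitly; wrapping it in std::move would only inhibit copy elision.
    return State;
}

inline std::string Describe()
{
    try
    {
        throw std::runtime_error("boom");
    }
    catch (const std::exception& Ex)  // by const reference: no copy, no slicing
    {
        return Ex.what();
    }
}
```

Catching `std::exception` by value would slice the `std::runtime_error` down to its base class, so `what()` would no longer be guaranteed to return the original message.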
* fix use-after-free in TEST_CASE("compactcas.threadedinsert") (#620) [v5.7.8-pre1] | Stefan Boberg | 2025-10-30 | 1 | -6/+8
* fix --zentrace=no compile errors (#616) | Stefan Boberg | 2025-10-28 | 1 | -8/+8
  * make sure the correct `UE_WITH_TRACE` conditional is used to enable/disable support code as appropriate
  * fixed some accidental `int32`, `int64` et al usage, due to typedefs leaking through from trace header
  with this fix, it is now possible to build with `--zentrace=no` again
* optimize blockstore flush (#614) | Dan Engelbrecht | 2025-10-27 | 3 | -35/+66
  * rework block store block flushing to only happen once at end of block write outside of locks
  * fix warning at startup if no gc.dlog file exists
* optimize filecas write file (#613) | Dan Engelbrecht | 2025-10-24 | 1 | -16/+10
  * try to move file into place before trying speculative remove of target file
* optimize blockstore filesize (#612) | Dan Engelbrecht | 2025-10-24 | 1 | -1/+11
  * since we only ever append to a block store file we don't need to actually flush the position
* fix gc disk load graph (#610) | Dan Engelbrecht | 2025-10-24 | 1 | -3/+3
  * make sure our gc disk load graph includes the latest measurement value
* gracefully handle broken gc dlog (#606) | Dan Engelbrecht | 2025-10-24 | 1 | -0/+8
  * if gc.dlog is corrupt, remove and restart a new log
* refactor CasContainerStrategy::IterateOneBlock to make it more readable (#607) | Dan Engelbrecht | 2025-10-24 | 2 | -91/+102
* if we are low on disk space, only run GC if it will remove any data (#603) | Dan Engelbrecht | 2025-10-23 | 2 | -90/+160
  * if we are low on disk space, only run GC if it will remove any data
  * make sure we don't treat bail of GC due to disk space as success causing 0 wait between GC passes
* remove scope in GC that prevented GC from executing (#600) | Dan Engelbrecht | 2025-10-22 | 1 | -30/+31
* add support for OTLP logging/tracing (#599) | Stefan Boberg | 2025-10-22 | 5 | -5/+5
  - adds `zentelemetry` project which houses new functionality for serializing logs and traces in OpenTelemetry Protocol format (OTLP)
  - moved existing stats functionality from `zencore` to `zentelemetry`
  - adds `TRefCounted<T>` for vtable-less refcounting
  - adds `MemoryArena` class which allows for linear allocation of memory from chunks
  - adds `protozero` which is used to encode OTLP protobuf messages
* fix builds storage stats (#590) | Dan Engelbrecht | 2025-10-20 | 2 | -5/+10
  * restructure builds storage stats to match web-ui expectations
* fix gc state switching (#588) | Dan Engelbrecht | 2025-10-17 | 1 | -40/+38
  * fix state issue in GC thread where shutting down gc did not always block gc from running
* add ability to limit concurrency (#565) | Stefan Boberg | 2025-10-10 | 4 | -11/+11
  effective concurrency in zenserver can be limited via the `--corelimit=<N>` option on the command line. Any value passed in here will be used instead of the return value from `std::thread::hardware_concurrency()` if it is lower.
  * added --corelimit option to zenserver
  * made sure thread pools are configured lazily and not during global init
  * added log output indicating effective and HW concurrency
  * added change log entry
  * removed debug logging from ZenEntryPoint::Run()
  also removed main thread naming on Linux since it makes the output from `top` and similar tools confusing (it shows `main` instead of `zenserver`)
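The rule described above (use the `--corelimit` value instead of `std::thread::hardware_concurrency()` only when it is lower) amounts to taking a minimum. A minimal sketch follows; the function names are hypothetical, and treating a limit of 0 as "no limit" is an assumption of this sketch, not something the commit message specifies.

```cpp
#include <algorithm>
#include <cassert>
#include <thread>

// Pure helper so the rule can be checked without depending on the machine.
// Assumption: CoreLimit == 0 means that no limit was given.
inline unsigned ClampConcurrency(unsigned HardwareConcurrency, unsigned CoreLimit)
{
    if (CoreLimit == 0)
    {
        return HardwareConcurrency;
    }
    return std::min(HardwareConcurrency, CoreLimit);
}

// Applies the rule to the current machine. Note that
// std::thread::hardware_concurrency() may itself return 0 when the value
// cannot be determined, in which case this sketch also returns 0.
inline unsigned EffectiveConcurrency(unsigned CoreLimit)
{
    return ClampConcurrency(std::thread::hardware_concurrency(), CoreLimit);
}
```

For example, on a 16-core machine `ClampConcurrency(16, 4)` gives 4, while a limit of 32 leaves the hardware value of 16 in place.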
* fix missing chunk in block after gc (#560) | Dan Engelbrecht | 2025-10-06 | 1 | -1/+3
  * make sure we use aligned write pos in blockstore compact when checking target block size
* fixed issue in compactcas.restart test due to std::vector<bool> (#559) | Stefan Boberg | 2025-10-06 | 1 | -2/+5
  `std::vector<bool>` is a special container since it bit packs the values rather than just using an array of booleans. This means that updating it on multiple threads simultaneously is dangerous, even when each thread writes a different index, because neighbouring elements share the same underlying word.
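One common way around the `std::vector<bool>` problem is to store one byte per flag, so that every element is a distinct memory location. The sketch below illustrates that general technique and is not necessarily the change made in the test; `MarkAllConcurrently` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <type_traits>
#include <vector>

// std::vector<bool> hands out proxy objects rather than bool&, because the
// individual flags are packed into shared words.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> elements are accessed through a proxy");

// One thread per flag, each writing only its own element. With one byte per
// element the writes touch distinct memory locations, so this is race free.
inline std::vector<std::uint8_t> MarkAllConcurrently(std::size_t Count)
{
    std::vector<std::uint8_t> Done(Count, 0);

    std::vector<std::thread> Workers;
    Workers.reserve(Count);
    for (std::size_t Index = 0; Index < Count; ++Index)
    {
        Workers.emplace_back([&Done, Index] { Done[Index] = 1; });
    }
    for (std::thread& Worker : Workers)
    {
        Worker.join();  // join before reading the results
    }
    return Done;
}
```

The vector is sized before any thread starts and never resized while they run, which matters as much as the element type: a reallocation during the writes would be a race of its own.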
* speed up tests (#555) | Dan Engelbrecht | 2025-10-06 | 3 | -154/+209
  * faster FileSystemTraversal test
  * faster jobqueue test
  * faster NamedEvent test
  * faster cache tests
  * faster basic http tests
  * faster blockstore test
  * faster cache store tests
  * faster compactcas tests
  * more responsive zenserver launch
  * tweak worker pool sizes in tests
* cacherequests helpers test only (#551) | Dan Engelbrecht | 2025-10-03 | 12 | -8/+1051
  * don't use cacherequests utils in cache_cmd.cpp
  * make zenutil/cacherequests code into test code helpers only
* zenutil cleanup (#550) | Dan Engelbrecht | 2025-10-03 | 4 | -2/+145
  * move referencemetadata to zenstore
  * rename zenutil/windows/service to windowsservice