| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
file retry logic (#766)
* GC - fix handling of attachment ranges
* fix trace/log strings
* fix HTTP access token expiration time logic
* added missing lock retry in zenserver startup
|
| |
|
|
|
| |
minimises time spent doing setup work after the deadline has expired
also added log output with deadline/timeout information
|
| |
|
|
|
|
|
|
| |
this change adds OTEL tracing to a few places
* Top-level application lifecycle (config/init/cleanup, main loop)
* http.sys requests
it also brings some otlptrace optimizations and dynamic configuration of tracing. OTLP tracing is currently always disabled
|
| |
|
|
|
| |
- Improvement: Deeper validation of data when scrub is activated (cas/cache/project)
- Improvement: Enabled more multi threading when running scrub operations
- Improvement: Added means to force a scrub operation at startup with a new release using ZEN_DATA_FORCE_SCRUB_VERSION variable in xmake.lua
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change removes our dependency on vcpkg for package management, in favour of bringing some code in-tree in the `thirdparty` folder as well as using the xmake build-in package management feature. For the latter, all the package definitions are maintained in the zen repo itself, in the `repo` folder.
It should now also be easier to build the project as it will no longer depend on having the right version of vcpkg installed, which has been a common problem for new people coming in to the codebase. Now you should only need xmake to build.
* Bumps xmake requirement on github runners to 2.9.9 to resolve an issue where xmake on Windows invokes cmake with `v144` toolchain which does not exist
* BLAKE3 is now in-tree at `thirdparty/blake3`
* cpr is now in-tree at `thirdparty/cpr`
* cxxopts is now in-tree at `thirdparty/cxxopts`
* fmt is now in-tree at `thirdparty/fmt`
* robin-map is now in-tree at `thirdparty/robin-map`
* ryml is now in-tree at `thirdparty/ryml`
* sol2 is now in-tree at `thirdparty/sol2`
* spdlog is now in-tree at `thirdparty/spdlog`
* utfcpp is now in-tree at `thirdparty/utfcpp`
* xmake package repo definitions is in `repo`
* implemented support for sanitizers. ASAN is supported on windows, TSAN, UBSAN, MSAN etc are supported on Linux/MacOS though I have not yet tested it extensively on MacOS
* the zencore encryption implementation also now supports using mbedTLS which is used on MacOS, though for now we still use openssl on Linux
* crashpad
* bumps libcurl to 8.11.0 (from 8.8.0) which should address a rare build upload bug
|
| |
|
|
|
| |
* rework block store block flushing to only happen once at end of block write outside of locks
* fix warning at startup if no gc.dlog file exists
|
| |
|
| |
* make sure our gc disk load graph includes the latest measurement value
|
| |
|
| |
* if gc.dlog is corrupt, remove and restart a new log
|
| |
|
|
| |
* if we are low on disk space, only run GC if it will remove any data
* make sure we don't treat bail of GC due to disk space as success causing 0 wait between GC passes
|
| | |
|
| |
|
| |
* fix state issue in GC thread where shutting down gc did not always block gc from running
|
| |
|
|
|
|
| |
* fix for invalid regex in HttpBuildStoreService - triggers with most recent MSVC version
* in GcScheduler don't wait for exit signal if exit has already been requested. this caused extended waits for shutdown in some automated tests on very fast machines, possibly also due to some behaviour change in condition_variable
* speculative fix/workaround for issue with TLS teardown on secondary thread while main was tearing down trace
|
| | |
|
| |
|
|
|
| |
- Change BadAlloc exceptions in GC to warnings
- Add explict ASSERT exception catch in http plugin request processing
- Make exceptions handled in http request processing to warnings
|
| |
|
| |
- Improvement: Add a new mode to worker thread pools to avoid starvation of workers which could cause long stalls due to other work begin queued up. UE-305498
|
| |
|
|
|
|
| |
- Improvement: For projectstore oplog GET operation, only read basic information and release it if the oplog is not already open to reduce memory usage when listing oplogs in web UI
- Improvement: Reduce memory usage for oplog Op address lookup
Refactored Oplog::EState -> Oplog ::EMode and make sure we open data files in read-only mode when EMode::kBasicReadOnly is used.
|
| |
|
|
|
| |
* clean up trace command line options
explicitly shut down worker pools
* some additional startup trace scopes
|
| |
|
|
| |
* check low disk space condition more frequently and trigger GC when low water mark is reached
* show waited time when waiting for zenserver instance to exit
|
| |
|
|
|
| |
contention (#385)
* make RemoveExpiredData and PreCache serial to reduce CPU overhead / lock contention
|
| |
|
|
| |
* oom and ood exceptions in GC are now treated as warnings instead of errors
|
| |
|
|
|
| |
* block writing GC state/info if disk is full
* fix if/else on error while writing gc state
|
| |
|
| |
- Bugfix: Long file paths now works correctly on Windows
|
| |
|
|
|
|
|
|
|
| |
- **EXPERIMENTAL** `zen builds`
- Feature: `--zen-cache-host` option for `upload` and `download` operations to use a zenserver host `/builds` endpoint for storing build blob and blob metadata
- Feature: New `/builds` endpoint for caching build blobs and blob metadata
- `/builds/{namespace}/{bucket}/{buildid}/blobs/{hash}` `GET` and `PUT` method for storing and fetching blobs
- `/builds/{namespace}/{bucket}/{buildid}/blobs/putBlobMetadata` `POST` method for storing metadata about blobs
- `/builds/{namespace}/{bucket}/{buildid}/blobs/getBlobMetadata` `POST` method for fetching metadata about blobs
- `/builds/{namespace}/{bucket}/{buildid}/blobs/exists` `POST` method for checking existance of blobs
|
| |
|
|
| |
* Suppress progress report callback if oplog import detects oplog with zero ops
* output error code when catching system errors
|
| |
|
| |
This change adds more instrumentation for memory tracking, so that as little as possible comes through as Unknown in Insights analysis.
|
| | |
|
| |
|
|
|
| |
* added FLLMTag which can be used to register memory tags outside of core
* changed `UE_MEMSCOPE` -> `ZEN_MEMSCOPE` for consistency
* instrumented some subsystems with dynamic tags
|
| | |
|
| |
|
|
|
|
| |
- Added option gc-validation to zenserver that does a check for missing references in all oplog post full GC. Enabled by default.
- Feature: Added option gc-validation to zen gc command to control reference validation. Enabled by default.
- Added more details in post GC log.
- Fixed race condition in oplog writes which could cause used attachments to be incorrectly removed by GC
|
| |
|
|
| |
pressure (#205)
|
| |
|
|
|
| |
* kill gc v1
* block use of gc v1 from zen command line
* warn and flip to gcv2 if --gc-v2=false is specified for zenserver
|
| | |
|
| |
|
|
|
| |
- Do a single call to mempcy when fetching attachments from the meta store in GC
- Use small lambda when calling std::sort in FilterReferences (enables inlining of the comparision function)
- Use a single function for < and == comparision in KeepUnusedReferences
|
| |
|
| |
* Use alternate IoHash comparision function - reduces KeepUnusedReferences execution time by ~20%
|
| |
|
| |
* zen command - add options to control meta data cache when triggering gc
|
| |
|
|
|
| |
Added option `gc-attachment-passes` to zenserver
Cleaned up GCv2 start and stop logs and added identifier to easily find matching start and end of a GC pass in log file
Fixed project store not properly sorting references found during lock phase
|
| |
|
|
|
| |
* optimize IoHash and OId comparisions
* refactor filtering of unused references
* add attachment filtering to gc
|
| |
|
| |
- Improvement: Move GC logging in callback functions into "gc" context
|
| |
|
|
| |
blocking client requests (#134)
|
| |
|
| |
* if disk space is low, set the last gc time to avoid spamming retries
|
| | |
|
| |
|
|
| |
* Retry writing GC state if it fails to handle transient problems
* If GC operation fails demote errors to warnings on consecutive fails
|
| |
|
| |
* add option to force gcv2 to run single threaded
|
| |
|
|
|
|
| |
GCv2 (#93)
- Bugfix: Make sure we monitor and include new project/oplogs created during GCv2
- Bugfix: Make sure we monitor and include new namespaces/cache buckets created during GCv2
|
| |
|
| |
* use write and move in place for safer writing of files
|
| |
|
| |
- Bugfix: Harden GCv2 when errors occur and gracefully abort GC operation on error
|
| |
|
|
| |
- Improvement: Add file and line to ASSERT exceptions
- Improvement: Catch call stack when throwing assert exceptions and log/output call stack at important places to provide more context to caller
|
| |
|
|
|
|
| |
`/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` (#30)
- Improvement: Use multithreading to fetch size/rawsize of entries in `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files`
- Improvement: Add `GetMediumWorkerPool()` in addition to `LargeWorkerPool()` and `SmallWorkerPool()`
|
| |
|
|
|
| |
exceptions further (#662)
Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
|
| |
|
| |
* Don't capture local variables in loop by reference
|