path: root/src/zenstore/include
Commit message | Author | Age | Files | Lines
* validate rpc chunk responses (#36) | Dan Engelbrecht | 2024-04-03 | 1 | -1/+1
|   * Validate size of found chunks in cas/cache
* add support for responding with partial cache chunks (#11) | Dan Engelbrecht | 2024-03-21 | 1 | -2/+3
|   * add support for responding with partial cache chunks
* special treatment large oplog attachments v2 (#5) | Dan Engelbrecht | 2024-03-14 | 1 | -0/+54
|   - Bugfix: Install Ctrl+C handler earlier when doing `zen oplog-export` and `zen oplog-export` to properly cancel jobs
|   - Improvement: Add ability to block a set of CAS entries from GC in project store
|   - Improvement: Large attachments and loose files are now split into smaller chunks and stored in blocks during oplog export
* Make sure we wait for all scheduled tasks to complete before throwing exceptions further (#662) | Dan Engelbrecht | 2024-02-28 | 1 | -0/+4
|   Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
* improved block store logging and more gcv2 tests (#659) | Dan Engelbrecht | 2024-02-27 | 1 | -1/+2
|   * improved gc/blockstore logging
|   * more gcv2 tests
* remove reference caching (#658) | Dan Engelbrecht | 2024-02-27 | 1 | -43/+8
|   * remove reference caching
* hashing fixes (#657) | Dan Engelbrecht | 2024-02-26 | 2 | -1/+2
|   * move structuredcachestore tests to zenstore-test
|   * Don't materialize entire files when hashing if it is a large file
|   * rewrite CompositeBuffer::Mid to never materialize buffers
* separate RPC processing from HTTP processing (#626) | Stefan Boberg | 2023-12-20 | 5 | -1/+275
|   * moved all RPC processing from HttpStructuredCacheService into separate CacheRpcHandler class in zenstore
|   * move package marshaling to zenutil. was previously in zenhttp/httpshared but it's useful in other contexts as well where we don't want to depend on zenhttp
|   * introduced UpstreamCacheClient, this provides a subset of functions on UpstreamCache and lives in zenstore
* move cachedisklayer and structuredcachestore into zenstore (#624) | Stefan Boberg | 2023-12-19 | 3 | -0/+853
|
* Fix crash bug when trying to inspect non-open block file in GC (#614) | Dan Engelbrecht | 2023-12-18 | 1 | -0/+1
|
* improved scrubbing of oplogs and filecas (#596) | Stefan Boberg | 2023-12-11 | 2 | -1/+8
|   - Improvement: Scrub command now validates compressed buffer hashes in filecas storage (used for large chunks)
|   - Improvement: Added --dry, --no-gc and --no-cas options to zen scrub command
|   - Improvement: Implemented oplog scrubbing (previously was a no-op)
|   - Improvement: Implemented support for running scrubbing at startup with --scrub=<options>
* reserve vectors in gcv2 upfront / load factor for robin_map (#582) | Dan Engelbrecht | 2023-12-04 | 2 | -16/+26
|   * reserve vectors in gcv2 upfront
|   * set max load factor for robin_map indexes to reduce memory usage
|   * set min load factor for robin_map indexes to allow them to shrink
* use 32 bit offset and size in BlockStoreLocation (#581) | Dan Engelbrecht | 2023-12-01 | 1 | -13/+13
|   - Improvement: Reduce memory usage in GC and diskbucket flush
* add separate PreCache step for GcReferenceChecker (#578) | Dan Engelbrecht | 2023-12-01 | 2 | -0/+6
|   - Improvement: GCv2: Use separate PreCache step to improve concurrency when checking references
|   - Improvement: GCv2: Improved verbose logging
|   - Improvement: GCv2: Sort chunks to read by block/offset when finding references
|   - Improvement: GCv2: Exit as soon as no more unreferenced items are left
* optimized index snapshot reading/writing (#561) | Stefan Boberg | 2023-11-27 | 1 | -1/+10
|   the previous implementation of in-memory index snapshots serialises data to memory before writing to disk and vice versa when reading. This leads to some memory spikes which end up pushing useful data out of system cache and also cause stalls on I/O operations.
|   this change moves more code to a streaming serialisation approach which scales better from a memory usage perspective and also performs much better
* gc stop command (#569) [v0.2.36-pre2] | Dan Engelbrecht | 2023-11-27 | 1 | -0/+2
|   - Feature: New endpoint `/admin/gc-stop` to cancel a running garbage collect operation
|   - Feature: Added `zen gc-stop` command to cancel a running garbage collect operation
|   - Bugfix: GCv2 - make sure to discover all projects and oplogs before checking for expired data
* Add GC Cancel/Stop (#568) | Dan Engelbrecht | 2023-11-24 | 2 | -5/+16
|   - GcScheduler will now cancel any running GC when it shuts down.
|   - Old GC is rather limited in *when* it reacts to cancel of GC. GCv2 is more responsive.
* add command line options for compact block threshold and gc verbose (#557) | Dan Engelbrecht | 2023-11-21 | 1 | -2/+8
|   - Feature: Added new options to zenserver for GC V2
|     - `--gc-compactblock-threshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
|     - `--gc-verbose` GCV2 - enable more verbose output when running a GC pass
|   - Feature: Added new options to `zen gc` command for GC V2
|     - `--compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
|     - `--verbose` GCV2 - enable more verbose output when running a GC pass
|   - Feature: Added new parameters for endpoint `admin/gc` (PUT)
|     - `compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
|     - `verbose` GCV2 - enable more verbose output when running a GC pass
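One way to read the compact block threshold is as a simple usage test per block. The sketch below is an assumption, not zenstore code: `ShouldSkipCompaction` is a hypothetical helper, and treating "at or above the threshold" as the skip condition is a guess at the semantics. Only the 90% default comes from the commit message.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from zenstore: decides whether a block still
// holds enough live data that compacting it is not worth the disk I/O.
// ThresholdPercent mirrors the documented default of 90.
// Assumes byte counts small enough that multiplying by 100 cannot overflow.
inline bool ShouldSkipCompaction(std::uint64_t UsedBytes,
                                 std::uint64_t BlockBytes,
                                 std::uint32_t ThresholdPercent = 90)
{
    if (BlockBytes == 0)
    {
        return true;  // nothing to reclaim from an empty block
    }
    // Integer form of (UsedBytes / BlockBytes) >= (ThresholdPercent / 100)
    return UsedBytes * 100 >= BlockBytes * ThresholdPercent;
}
```

Under this reading, a block with 95 of 100 bytes in use is left alone at the default threshold, while a block with 50 of 100 bytes in use becomes a compaction candidate.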
* compact separate for gc referencer (#533) | Dan Engelbrecht | 2023-11-21 | 2 | -49/+57
|   - Refactor GCV2 so GcReferencer::RemoveExpiredData returns a store compactor, moving out the actual disk work from deleting items in the index.
|   - Refactor GCV2 GcResult to reuse GcCompactStoreStats and GcStats
|   - Make Compacting of stores non-parallel to not eat all the disk I/O when running GC
* gc history log (#519) | Dan Engelbrecht | 2023-11-13 | 1 | -0/+7
|   - Feature: Writes a `gc.log` with settings and detailed result after each GC execution (version 2 only)
|   - Break out file name rotate to allow access for gclog
|   - CompactBinaryToJson(MemoryView Data, StringBuilderBase& InBuilder)
* spdlog implementation hiding (#498) | Stefan Boberg | 2023-11-06 | 1 | -19/+16
|   this change aims to hide logging internals from client code, in order to make it easier to extend and take more control over the logging process in the future. As a bonus side effect, the generated code is much tighter (net delta around 2.5% on the resulting executable which includes lots of thirdparty code) and should take less time to compile and link.
|   Client usage via macros is pretty much unchanged. The main exposure client code had to spdlog internals before was the use of custom loggers per subsystem, where it would be common to have `spdlog::logger` references to keep a reference to a logger within a class. This is now replaced by `zen::LoggerRef` which currently simply encapsulates an actual `spdlog::logger` instance, but this is intended to be an implementation detail which will change in the future.
|   The way the change works is that we now handle any formatting of log messages in the zencore logging subsystem instead of relying on `spdlog` to manage this. We use the `fmt` library to do the formatting which means the client usage is identical to using `spdlog`. The formatted message is then forwarded onto any sinks etc which are still implemented via `spdlog`.
* gc v2 tests (#512) | Dan Engelbrecht | 2023-11-06 | 1 | -1/+1
|   * set MaxBlockCount at init
|   * properly calculate total size
|   * basic blockstore compact blocks test
|   * correct detection of block swap
|   * Use one implementation for CreateRandomBlob
|   * reduce some data sets to increase speed of tests
|   * reduce test time
|   * rename BlockStoreCompactState::AddBlock -> BlockStoreCompactState::IncludeBlock
* statsd for cas (#511) | Dan Engelbrecht | 2023-11-06 | 1 | -10/+13
|   * separate statsd interfaces so they can be accessible to zenstore
|   * statsd for cas
* multithread cache bucket (#508) | Dan Engelbrecht | 2023-11-06 | 1 | -4/+5
|   * Multithread init and flush of cache bucket
|   * tweaked threading count for bucket discovery, disklayer flush and gc v2
* individual gc stats (#506) | Dan Engelbrecht | 2023-10-30 | 1 | -21/+69
|   - Feature: New parameter for endpoint `admin/gc` (GET) `details=true` which gives detailed stats on GC operation when using GC V2
|   - Feature: New options for zen command `gc-status`
|     - `--details` that enables the detailed output from the last GC operation when using GC V2
* New GC implementation (#459) | Dan Engelbrecht | 2023-10-30 | 2 | -7/+238
|   - Feature: New garbage collection implementation, still in evaluation mode. Enabled by `--gc-v2` command line option
* fix m_LastFullGcDuration, m_LastFullGCDiff, m_LastFullGcDuration and m_LastLightweightGcDuration stats (#494) | Dan Engelbrecht | 2023-10-23 | 1 | -8/+8
* Remove any unreferenced blocks in block store on open (#492) | Dan Engelbrecht | 2023-10-23 | 1 | -3/+2
|   * Remove any unreferenced blocks in block store on open
* Don't prune block locations due to missing blocks at startup (#487) | Dan Engelbrecht | 2023-10-20 | 1 | -5/+7
|   * Don't prune block locations due to missing blocks at startup
|   This makes the behaviour consistent with FileCas - you can have an index that is not fully backed by data. Asking for a location that is not backed by data results in getting an empty result back.
|   Also, don't try to GC blocks that are unknown to the block store at the time of snapshot (to avoid removing data that comes in after GatherReferences in GC)
* clean up GcContributor and GcStorage to be pure interfaces (#485) | Dan Engelbrecht | 2023-10-20 | 1 | -9/+3
|
* Add --skip-delete option to gc command (#484) | Dan Engelbrecht | 2023-10-20 | 1 | -0/+1
|   - Feature: Add `--skip-delete` option to gc command
|   - Bugfix: Fix implementation when claiming GC reserve during GC
* add `flush` command and more gc status info (#483) | Dan Engelbrecht | 2023-10-18 | 1 | -9/+56
|   - Feature: New endpoint `/admin/flush` to flush all storage - CAS, Cache and ProjectStore
|   - Feature: New command `zen flush` to flush all storage - CAS, Cache and ProjectStore
|   - Improved: Command `zen gc-status` now gives details about storage, when last GC occurred, how long until next GC etc
|   - Changed: Cache access and write log are disabled by default
* check that block does not exist on disk before starting write to it (#449) | Dan Engelbrecht | 2023-10-05 | 1 | -0/+2
|   * check that block does not exist on disk before starting write to it
* faster accesstime save restore (#439) | Dan Engelbrecht | 2023-10-03 | 1 | -1/+1
|   - Improvement: Reduce time a cache bucket is locked for write when flushing/garbage collecting
|   - Change format for faster read/write and reduced size on disk
|   - Don't lock index while writing manifest to disk
|   - Skip garbage collect if we are currently in a Flush operation
|   - BlockStore::Flush no longer terminates currently writing block
|   - Garbage collect references to currently writing block but keep the block as new data may be added
|   - Fix BlockStore::Prune used disk space calculation
|   - Don't materialize data in filecas when we just need the size
* lightweight gc (#431) | Dan Engelbrecht | 2023-10-02 | 1 | -11/+16
|   - Feature: Add lightweight GC that only removes items from cache/project store without cleaning up data referenced in Cid store
|   - Add `skipcid` parameter to http endpoint `admin/gc`, defaults to "false"
|   - Add `--skipcid` option to `zen gc` command, defaults to false
|   - Add `--gc-lightweight-interval-seconds` option to zenserver
* adding more stats (#429) | Dan Engelbrecht | 2023-09-28 | 1 | -3/+7
|   - Feature: Add detailed stats on requests and data sizes on a per-bucket level, use parameter `cachestorestats=true` on the `/stats/z$` endpoint to enable
|   - Feature: Add detailed stats on requests and data sizes on cidstore, use parameter `cidstorestats=true` on the `/stats/z$` endpoint to enable
|   - Feature: Dashboard now accepts parameters in the URL which are passed on to the `/stats/z$` endpoint
* More statistics for Cache, Project Store and Cid Store (#405) | Dan Engelbrecht | 2023-09-14 | 1 | -8/+16
|   Cache: requestcount, badrequestcount, writes
|   Project Store: requestcount
|   Cid Store: cidhits, cidmisses, cidwrites
* safer gc on low disk (#373) | Dan Engelbrecht | 2023-08-22 | 1 | -1/+1
|   - Improvement: Make sure we have disk space available to do GC and use reserve up front if need be
* CidStore now implements the ChunkResolver interface | Stefan Boberg | 2023-06-30 | 2 | -9/+12
|   this allows client code to use the ChunkResolver interface instead of CidStore, which can help with testing scenarios
* added zen::ChunkResolver | Stefan Boberg | 2023-06-30 | 1 | -0/+9
|   cherry-picked from sb/proto to reduce delta
* Content scrubbing (#271) | Stefan Boberg | 2023-05-16 | 3 | -27/+60
|   Added zen scrub command which may be triggered via the zen CLI helper. This traverses storage and validates contents either by content hash and/or by structure. If unexpected data is encountered it is invalidated.
* Add `--gc-projectstore-duration-seconds` option (#281) | Dan Engelbrecht | 2023-05-16 | 1 | -6/+12
|   * Add `--gc-projectstore-duration-seconds` option
|   * Cleanup lua gc options parsing
|   * Remove dead configuration values
|   * changelog
* removed remnants of ZEN_USE_REF_TRACKING | Stefan Boberg | 2023-05-15 | 1 | -8/+0
|   this code was originally meant to be used for GC but is no longer needed
* added ScrubStorage to GcStorage base class | Stefan Boberg | 2023-05-15 | 1 | -2/+4
|
* added static_assert for BlockStoreDiskLocation | Stefan Boberg | 2023-05-15 | 1 | -0/+2
|
* corrected CidStore comment | Stefan Boberg | 2023-05-15 | 1 | -4/+0
|
* minor GC API cleanup | Stefan Boberg | 2023-05-15 | 2 | -15/+15
|   Scrub -> ScrubStorage
|   Trigger -> TriggerGc (to make relationship to TriggerScrub clearer)
* clang-format (sorry) | Stefan Boberg | 2023-05-11 | 1 | -1/+1
|
* build fix (accidental commit on the wrong branch) | Stefan Boberg | 2023-05-11 | 1 | -0/+3
|
* Low disk space detector (#277) | Dan Engelbrecht | 2023-05-09 | 1 | -6/+23
|   - Feature: Disk writes are now blocked early and return an insufficient storage error if free disk space falls below the `--low-diskspace-threshold` value
|   - Never keep an entry in m_ChunkBlocks that points to a nullptr