aboutsummaryrefslogtreecommitdiff
path: root/src/zenstore/include
Commit message (Collapse)AuthorAgeFilesLines
* remove gc v1 (#121)Dan Engelbrecht2024-10-034-100/+17
| | | | | * kill gc v1 * block use of gc v1 from zen command line * warn and flip to gcv2 if --gc-v2=false is specified for zenserver
* gc block size target max size (#180)Dan Engelbrecht2024-10-021-2/+11
| | | | | | * If a block is small (less than half max size) we add it to blocks to compact Sort blocks when iterating over them * do compact of block stores even if no new unused are found * do compact phase even if bucket is empty
* gc command attachment options (#176)Dan Engelbrecht2024-09-302-7/+14
| | | * zen command - add options to control meta data cache when triggering gc
* optimize startup time (#175)Dan Engelbrecht2024-09-301-9/+3
| | | | | | * use tsl::robin_set for BlockIndexSet don't calculate full block location when only block index is needed * don't copy visitor function * reserve space for attachments
* reduce lock time for memcache trim (#171)Dan Engelbrecht2024-09-271-3/+4
| | | | | | | - Improvement: Faster memcache trimming - Reduce calculations while holding bucket lock for memcache trim analysis to reduce contention - When trimming memcache, evict 25% more than required to reduce frequency of trimming - When trimming memcache, don't repack memcache data vector, defer that to regular garbage collection - When trimming memcache, deallocate memcache buffers when not holding exclusive lock in bucket
* Add `gc-attachment-passes` option to zenserver (#167)Dan Engelbrecht2024-09-251-4/+11
| | | | | Added option `gc-attachment-passes` to zenserver Cleaned up GCv2 start and stop logs and added identifier to easily find matching start and end of a GC pass in log file Fixed project store not properly sorting references found during lock phase
* Added namespace qualifier (optional) for z$ rpc requests (#166)Stefan Boberg2024-09-231-0/+1
| | | This change adds support for a namespace-qualified RPC endpoint for z$ at `/z$/<namespace>/$rpc` which may be used to validate RPC requests by URL inspection. The old scheme is still supported.
* gc unused refactor (#165)Dan Engelbrecht2024-09-232-18/+27
| | | | | * optimize IoHash and OId comparisions * refactor filtering of unused references * add attachment filtering to gc
* made fmt formatter format function const (#162)Stefan Boberg2024-09-201-1/+1
| | | this appears to be required as of fmt v11
* unblock PreCache (#164)Dan Engelbrecht2024-09-201-1/+1
| | | Don't lock disk cache buckets from writing when scanning records for attachment references
* gc performance improvements (#160)Dan Engelbrecht2024-09-172-29/+22
| | | | | | | | | | * optimized ValidateCbUInt * optimized iohash comparision * replace unordered set/map with tsl/robin set/map in blockstore * increase max buffer size when writing cache bucket sidecar * only store meta data for files < 4Gb * faster ReadAttachmentsFromMetaData * remove memcpy call in BlockStoreDiskLocation * only write cache bucket state to disk if GC deleted anything
* clean cache slog files on startup (#143)Dan Engelbrecht2024-09-041-3/+7
| | | | - Bugfix: If we fail to move a temporary file into place, try to re-open the file so we clean it up - Improvement: Clean up cache bucket log files at startup as we store the matching information in the index snapshot for the bucket
* move gc logs to gc logger (#142)Dan Engelbrecht2024-09-041-0/+1
| | | - Improvement: Move GC logging in callback functions into "gc" context
* oplog index snapshots (#140)Dan Engelbrecht2024-09-031-2/+2
| | | - Feature: Added project store oplog index snapshots for faster opening of oplog - opening oplogs are roughly 10x faster
* meta info store (#75)Dan Engelbrecht2024-08-302-7/+24
| | | | - Feature: Added option `--gc-cache-attachment-store` which caches referenced attachments in cache records on disk for faster GC - default is `false` - Feature: Added option `--gc-projectstore-attachment-store` which caches referenced attachments in project store oplogs on disk for faster GC - default is `false`
* hardening and reduced spam from GC on failure (#112)Dan Engelbrecht2024-08-141-2/+3
| | | | * Retry writing GC state if it fails to handle transient problems * If GC operation fails demote errors to warnings on consecutive fails
* add gc single threaded option (#104)Dan Engelbrecht2024-08-071-1/+4
| | | * add option to force gcv2 to run single threaded
* Make sure we monitor for new project, oplogs, namespaces and buckets during ↵Dan Engelbrecht2024-06-133-10/+72
| | | | | | GCv2 (#93) - Bugfix: Make sure we monitor and include new project/oplogs created during GCv2 - Bugfix: Make sure we monitor and include new namespaces/cache buckets created during GCv2
* workspaces config and fixes (#92)Dan Engelbrecht2024-06-111-4/+5
| | | | * fix alias request capture * use single config file for workspaces
* workspace share aliases (#91)Dan Engelbrecht2024-06-041-0/+15
| | | | | | | - Add `zen workspace-share` `--root-path` option - the root local file path of the workspace - if given it will automatically create the workspace before creating the share. If `--workspace` is omitted, an id will be generated from the `--root-path` parameter - Add `/ws/share/{alias}/` endpoint - a shortcut to `/ws/{workspace_id}/{share_id}/` based endpoints using the alias for a workspace share - Add `--alias` option to replace `--workspace` and `--share` options for `workspace-share` zen commands - Rename `zen workspace create` `folder` option to `root-path` - Rename `zen workspace create` `folder` option to `share-path`
* add batching of CacheStore requests for GetCacheValues/GetCacheChunks (#90)Dan Engelbrecht2024-06-043-19/+56
| | | | | | * cache file size of block on open * add ability to control size limit for small chunk callback when iterating block * Add batch fetch of cache values in the GetCacheValues request
* workspace shares (#84)Dan Engelbrecht2024-05-291-0/+102
| | | Feature: New 'workspaces' service which allows a user to share a local folder via zenserver. A workspace can have mulitple workspace shares and they provie an HTTP API that is compatible with the project oplog HTTP API. Workspaces and shares are preserved between runs. Workspaces feature is disabled by default - enable with --workspaces-enabled option when launching zenserver.
* refactor BlockStore IterateChunks (#77)Dan Engelbrecht2024-05-171-12/+11
| | | Improvement: Refactored IterateChunks to allow reuse in diskcachelayer and hide public GetBlockFile() function in BlockStore
* batch cache put (#67)Dan Engelbrecht2024-05-022-6/+46
| | | - Improvement: Batch scope for put of cache values
* iterate cas chunks (#59)Dan Engelbrecht2024-04-242-2/+8
| | | - Improvement: Reworked GetChunkInfos in oplog store to reduce disk thrashing and improve performance
* safer gcv2 on error (#60)Dan Engelbrecht2024-04-241-0/+2
| | | - Bugfix: Harden GCv2 when errors occur and gracefully abort GC operation on error
* InsertChunks for CAS store (#55)Dan Engelbrecht2024-04-222-9/+15
| | | - Improvement: Add batching when writing multiple small chunks to block store - decreases I/O load significantly on oplog import
* validate rpc chunk responses (#36)Dan Engelbrecht2024-04-031-1/+1
| | | * Validate size of found chunks in cas/cache
* add support for responding with partial cache chunks (#11)Dan Engelbrecht2024-03-211-2/+3
| | | * add support for responding with partial cache chunks
* special treatment large oplog attachments v2 (#5)Dan Engelbrecht2024-03-141-0/+54
| | | | | - Bugfix: Install Ctrl+C handler earlier when doing `zen oplog-export` and `zen oplog-export` to properly cancel jobs - Improvement: Add ability to block a set of CAS entries from GC in project store - Improvement: Large attachments and loose files are now split into smaller chunks and stored in blocks during oplog export
* Make sure we wait for all scheduled tasks to complete before throwing ↵Dan Engelbrecht2024-02-281-0/+4
| | | | | exceptions further (#662) Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
* improved block store logging and more gcv2 tests (#659)Dan Engelbrecht2024-02-271-1/+2
| | | | * improved gc/blockstore logging * more gcv2 tests
* remove reference caching (#658)Dan Engelbrecht2024-02-271-43/+8
| | | * remove reference caching
* hashing fixes (#657)Dan Engelbrecht2024-02-262-1/+2
| | | | | * move structuredcachestore tests to zenstore-test * Don't materialize entire files when hashing if it is a large files * rewrite CompositeBuffer::Mid to never materialize buffers
* separate RPC processing from HTTP processing (#626)Stefan Boberg2023-12-205-1/+275
| | | | | | * moved all RPC processing from HttpStructuredCacheService into separate CacheRpcHandler class in zenstore * move package marshaling to zenutil. was previously in zenhttp/httpshared but it's useful in other contexts as well where we don't want to depend on zenhttp * introduced UpstreamCacheClient, this provides a subset of functions on UpstreamCache and lives in zenstore
* move cachedisklayer and structuredcachestore into zenstore (#624)Stefan Boberg2023-12-193-0/+853
|
* Fix crash bug when trying to inspect non-open block file in GC (#614)Dan Engelbrecht2023-12-181-0/+1
|
* improved scrubbing of oplogs and filecas (#596)Stefan Boberg2023-12-112-1/+8
| | | | | | - Improvement: Scrub command now validates compressed buffer hashes in filecas storage (used for large chunks) - Improvement: Added --dry, --no-gc and --no-cas options to zen scrub command - Improvement: Implemented oplog scrubbing (previously was a no-op) - Improvement: Implemented support for running scrubbint at startup with --scrub=<options>
* reserve vectors in gcv2 upfront / load factor for robin_map (#582)Dan Engelbrecht2023-12-042-16/+26
| | | | | * reserve vectors in gcv2 upfront * set max load factor for robin_map indexes to reduce memory usage * set min load factor for robin_map indexes to allow them to shrink
* use 32 bit offset and size in BlockStoreLocation (#581)Dan Engelbrecht2023-12-011-13/+13
| | | - Improvement: Reduce memory usage in GC and diskbucket flush
* add separate PreCache step for GcReferenceChecker (#578)Dan Engelbrecht2023-12-012-0/+6
| | | | | | - Improvement: GCv2: Use separate PreCache step to improve concurrency when checking references - Improvement: GCv2: Improved verbose logging - Improvement: GCv2: Sort chunks to read by block/offset when finding references - Improvement: GCv2: Exit as soon as no more unreferenced items are left
* optimized index snapshot reading/writing (#561)Stefan Boberg2023-11-271-1/+10
| | | | | the previous implementation of in-memory index snapshots serialise data to memory before writing to disk and vice versa when reading. This leads to some memory spikes which end up pushing useful data out of system cache and also cause stalls on I/O operations. this change moves more code to a streaming serialisation approach which scales better from a memory usage perspective and also performs much better
* gc stop command (#569)v0.2.36-pre2Dan Engelbrecht2023-11-271-0/+2
| | | | | - Feature: New endpoint `/admin/gc-stop` to cancel a running garbage collect operation - Feature: Added `zen gc-stop` command to cancel a running garbage collect operation - Bugfix: GCv2 - make sure to discover all projects and oplogs before checking for expired data
* Add GC Cancel/Stop (#568)Dan Engelbrecht2023-11-242-5/+16
| | | | - GcScheduler will now cancel any running GC when it shuts down. - Old GC is rather limited in *when* it reacts to cancel of GC. GCv2 is more responsive.
* add command line options for compact block threshold and gc verbose (#557)Dan Engelbrecht2023-11-211-2/+8
| | | | | | | | | | | - Feature: Added new options to zenserver for GC V2 - `--gc-compactblock-threshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `--gc-verbose` GCV2 - enable more verbose output when running a GC pass - Feature: Added new options to `zen gc` command for GC V2 - `--compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `--verbose` GCV2 - enable more verbose output when running a GC pass - Feature: Added new parameters for endpoint `admin/gc` (PUT) - `compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `verbose` GCV2 - enable more verbose output when running a GC pass
* compact separate for gc referencer (#533)Dan Engelbrecht2023-11-212-49/+57
| | | | | - Refactor GCV2 so GcReferencer::RemoveExpiredData returns a store compactor, moving out the actual disk work from deleting items in the index. - Refactor GCV2 GcResult to reuse GcCompactStoreStats and GcStats - Make Compacting of stores non-parallell to not eat all the disk I/O when running GC
* gc history log (#519)Dan Engelbrecht2023-11-131-0/+7
| | | | | - Feature: Writes a `gc.log` with settings and detailed result after each GC execution (version 2 only) - Break out file name rotate to allow access for gclog - CompactBinaryToJson(MemoryView Data, StringBuilderBase& InBuilder)
* spdlog implementation hiding (#498)Stefan Boberg2023-11-061-19/+16
| | | | | | | | | this change aims to hide logging internals from client code, in order to make it easier to extend and take more control over the logging process in the future. As a bonus side effect, the generated code is much tighter (net delta around 2.5% on the resulting executable which includes lots of thirdparty code) and should take less time to compile and link. Client usage via macros is pretty much unchanged. The main exposure client code had to spdlog internals before was the use of custom loggers per subsystem, where it would be common to have `spdlog::logger` references to keep a reference to a logger within a class. This is now replaced by `zen::LoggerRef` which currently simply encapsulates an actual `spdlog::logger` instance, but this is intended to be an implementation detail which will change in the future. The way the change works is that we now handle any formatting of log messages in the zencore logging subsystem instead of relying on `spdlog` to manage this. We use the `fmt` library to do the formatting which means the client usage is identical to using `spdlog`. The formatted message is then forwarded onto any sinks etc which are still implememted via `spdlog`.
* gc v2 tests (#512)Dan Engelbrecht2023-11-061-1/+1
| | | | | | | | | | * set MaxBlockCount at init * properly calculate total size * basic blockstore compact blocks test * correct detection of block swap * Use one implementation for CreateRandomBlob * reduce some data sets to increase speed of tests * reduce test time * rename BlockStoreCompactState::AddBlock -> BlockStoreCompactState::IncludeBlock
* statsd for cas (#511)Dan Engelbrecht2023-11-061-10/+13
| | | | * separate statsd interfaces so they can be accessible to zenstore * statsd for cas