path: root/src/zenserver/cache/structuredcachestore.cpp
Commit message | Author | Age | Files | Lines
* move cachedisklayer and structuredcachestore into zenstore (#624) | Stefan Boberg | 2023-12-19 | 1 | -2456/+0
* add separate PreCache step for GcReferenceChecker (#578) | Dan Engelbrecht | 2023-12-01 | 1 | -33/+66
    - Improvement: GCv2: Use separate PreCache step to improve concurrency when checking references
    - Improvement: GCv2: Improved verbose logging
    - Improvement: GCv2: Sort chunks to read by block/offset when finding references
    - Improvement: GCv2: Exit as soon as no more unreferenced items are left
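As a rough illustration of the "sort chunks to read by block/offset" item above, the ordering can be sketched as follows. `ChunkLocation` and `SortForSequentialRead` are hypothetical names chosen for this example; the actual zenserver types and code are not shown in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical location record (assumption, not the real zenserver type).
struct ChunkLocation
{
    uint32_t BlockIndex;  // which block file the chunk lives in
    uint64_t Offset;      // byte offset of the chunk inside that block
    uint64_t Size;        // chunk size in bytes
};

// Order the reads so that chunks in the same block are visited together,
// in ascending offset, which turns scattered reads into mostly forward reads.
inline void SortForSequentialRead(std::vector<ChunkLocation>& Locations)
{
    std::sort(Locations.begin(), Locations.end(), [](const ChunkLocation& A, const ChunkLocation& B) {
        return std::tie(A.BlockIndex, A.Offset) < std::tie(B.BlockIndex, B.Offset);
    });
}
```

Sorting on the (block, offset) pair is one plausible reading of the changelog line; the real implementation may batch or group reads differently.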
* compact separate for gc referencer (#533) | Dan Engelbrecht | 2023-11-21 | 1 | -110/+118
    - Refactor GCV2 so GcReferencer::RemoveExpiredData returns a store compactor, moving out the actual disk work from deleting items in the index.
    - Refactor GCV2 GcResult to reuse GcCompactStoreStats and GcStats
    - Make Compacting of stores non-parallel to not eat all the disk I/O when running GC
* Don't put cache entries into the memory cache on Put, only on Get (#518) | Dan Engelbrecht | 2023-11-07 | 1 | -5/+21
* spdlog implementation hiding (#498) | Stefan Boberg | 2023-11-06 | 1 | -2/+3
    this change aims to hide logging internals from client code, in order to make it easier to extend and take more control over the logging process in the future.

    As a bonus side effect, the generated code is much tighter (net delta around 2.5% on the resulting executable which includes lots of thirdparty code) and should take less time to compile and link.

    Client usage via macros is pretty much unchanged. The main exposure client code had to spdlog internals before was the use of custom loggers per subsystem, where it would be common to have `spdlog::logger` references to keep a reference to a logger within a class. This is now replaced by `zen::LoggerRef` which currently simply encapsulates an actual `spdlog::logger` instance, but this is intended to be an implementation detail which will change in the future.

    The way the change works is that we now handle any formatting of log messages in the zencore logging subsystem instead of relying on `spdlog` to manage this. We use the `fmt` library to do the formatting which means the client usage is identical to using `spdlog`. The formatted message is then forwarded onto any sinks etc which are still implemented via `spdlog`.
* gc v2 tests (#512) | Dan Engelbrecht | 2023-11-06 | 1 | -53/+9
    * set MaxBlockCount at init
    * properly calculate total size
    * basic blockstore compact blocks test
    * correct detection of block swap
    * Use one implementation for CreateRandomBlob
    * reduce some data sets to increase speed of tests
    * reduce test time
    * rename BlockStoreCompactState::AddBlock -> BlockStoreCompactState::IncludeBlock
* statsd for cas (#511) | Dan Engelbrecht | 2023-11-06 | 1 | -1/+1
    * separate statsd interfaces so they can be accessible to zenstore
    * statsd for cas
* individual gc stats (#506) | Dan Engelbrecht | 2023-10-30 | 1 | -139/+167
    - Feature: New parameter for endpoint `admin/gc` (GET) `details=true` which gives detailed stats on GC operation when using GC V2
    - Feature: New options for zen command `gc-status`
      - `--details` that enables the detailed output from the last GC operation when using GC V2
* New GC implementation (#459) | Dan Engelbrecht | 2023-10-30 | 1 | -1/+653
    - Feature: New garbage collection implementation, still in evaluation mode. Enabled by `--gc-v2` command line option
* statsd metrics reporting (#496) | Stefan Boberg | 2023-10-25 | 1 | -4/+24
    added support for reporting metrics via statsd style UDP messaging, which is supported by many monitoring solution providers

    this change adds reporting only of three cache related metrics (hit/miss/put) but this should be extended to include more metrics after additional evaluation
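For readers unfamiliar with the wire format, a statsd counter is a short text line of the form `<name>:<value>|c`, usually sent as a single UDP datagram. The sketch below only builds that line; the function name and the metric name used in the usage note are assumptions made for this example, not names taken from zenserver, and the UDP send itself is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <string_view>

// Build one statsd counter line, e.g. "some.metric:1|c".
// Hypothetical helper: zenserver's own statsd interface is not shown in this log.
inline std::string FormatStatsdCounter(std::string_view Name, int64_t Delta)
{
    std::string Message(Name);
    Message += ':';
    Message += std::to_string(Delta);
    Message += "|c";  // "c" marks the value as a counter increment
    return Message;
}
```

A hit/miss/put trio could then be reported by formatting one line per event, for instance `FormatStatsdCounter("cache.hit", 1)`, and handing the string to whatever UDP socket wrapper the server uses.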
* merge disk and memory layers (#493) | Dan Engelbrecht | 2023-10-24 | 1 | -96/+69
    - Feature: Added `--cache-memlayer-sizethreshold` option to zenserver to control at which size cache entries get cached in memory
    - Changed: Merged cache memory layer with cache disk layer to reduce memory and cpu overhead
* Remove any unreferenced blocks in block store on open (#492) | Dan Engelbrecht | 2023-10-23 | 1 | -8/+3
    * Remove any unreferenced blocks in block store on open
* Don't prune block locations due to missing blocks at startup (#487) | Dan Engelbrecht | 2023-10-20 | 1 | -0/+8
    * Don't prune block locations due to missing blocks at startup

    This makes the behaviour consistent with FileCas - you can have an index that is not fully backed by data. Asking for a location that is not backed by data results in getting an empty result back.

    Also, don't try to GC blocks that are unknown to the block store at the time of snapshot (to avoid removing data that comes in after GatherReferences in GC)
* clean up GcContributor and GcStorage to be pure interfaces (#485) | Dan Engelbrecht | 2023-10-20 | 1 | -2/+6
* add `flush` command and more gc status info (#483) | Dan Engelbrecht | 2023-10-18 | 1 | -3/+8
    - Feature: New endpoint `/admin/flush` to flush all storage - CAS, Cache and ProjectStore
    - Feature: New command `zen flush` to flush all storage - CAS, Cache and ProjectStore
    - Improved: Command `zen gc-status` now gives details about storage, when last GC occurred, how long until next GC etc
    - Changed: Cache access and write log are disabled by default
* cache reference tracking (#455) | Dan Engelbrecht | 2023-10-10 | 1 | -36/+45
    - Feature: Add caching of referenced CId content for structured cache records, this avoids disk thrashing when gathering references for GC - disabled by default, enable with `--cache-reference-cache-enabled`
    - Improvement: Faster collection of referenced CId content in project store
* reject bad bucket reads (#456) | Stefan Boberg | 2023-10-09 | 1 | -4/+13
    * extended bad bucket rejection logic to include GET operations as well as PUTs
* reject known bad bucket names in structured cache (#452) [v0.2.27-pre0] | Stefan Boberg | 2023-10-06 | 1 | -5/+30
    * added string_view helpers for ParseHexBytes/ParseHexNumber
    * reject known bad buckets in structured cache put handler (32-character hex bucket names are rejected)
    * also added bucket rejection logic to bucket discovery
    * added rejected_writes stat to HttpStructuredCache
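The rule stated above (32-character hex bucket names are rejected) can be sketched as a small predicate. `LooksLikeBadBucketName` is a hypothetical name for this example, and accepting both upper- and lower-case hex digits is an assumption; the commit's actual check is built on its ParseHexBytes/ParseHexNumber helpers, which are not reproduced here.

```cpp
#include <cassert>
#include <cctype>
#include <string_view>

// Returns true when Name is exactly 32 characters long and every character
// is a hexadecimal digit. Hypothetical helper, not the zenserver function.
inline bool LooksLikeBadBucketName(std::string_view Name)
{
    if (Name.size() != 32)
    {
        return false;
    }
    for (char C : Name)
    {
        // Cast avoids undefined behaviour for negative char values.
        if (!std::isxdigit(static_cast<unsigned char>(C)))
        {
            return false;
        }
    }
    return true;
}
```

A put or get handler could call such a predicate before touching storage and count the rejection in a stat such as `rejected_writes`.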
* faster accesstime save restore (#439) | Dan Engelbrecht | 2023-10-03 | 1 | -29/+33
    - Improvement: Reduce time a cache bucket is locked for write when flushing/garbage collecting
      - Change format for faster read/write and reduced size on disk
      - Don't lock index while writing manifest to disk
      - Skip garbage collect if we are currently in a Flush operation
      - BlockStore::Flush no longer terminates currently writing block
      - Garbage collect references to currently writing block but keep the block as new data may be added
      - Fix BlockStore::Prune used disk space calculation
      - Don't materialize data in filecas when we just need the size
* Limit size of memory cache layer (#423) | Dan Engelbrecht | 2023-10-02 | 1 | -21/+51
    - Feature: Limit the size ZenCacheMemoryLayer may use
      - `--cache-memlayer-targetfootprint` option to set which size (in bytes) it should be limited to, zero to have it unbounded
      - `--cache-memlayer-maxage` option to set how long (in seconds) cache items should be kept in the memory cache

    Do more "standard" GC rather than clearing everything.

    Tries to purge memory on Get/Put on the fly if exceeding limit - not sure if we should have a polling thread instead of adding overhead to Get/Put (however light it may be).
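The "purge on Get/Put when exceeding the limit" idea described above can be illustrated with a minimal sketch. Everything here is an assumption made for the example: the class name, the least-recently-accessed eviction order, and passing the clock in as a parameter. It is not the ZenCacheMemoryLayer implementation, and it ignores locking entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical bounded cache: a target footprint in bytes (0 = unbounded)
// and a maximum age in seconds, both enforced lazily on Get/Put.
class BoundedMemoryCache
{
public:
    BoundedMemoryCache(size_t TargetFootprint, uint64_t MaxAgeSeconds)
    : m_TargetFootprint(TargetFootprint), m_MaxAgeSeconds(MaxAgeSeconds)
    {
    }

    void Put(const std::string& Key, std::string Value, uint64_t NowSeconds)
    {
        Remove(Key);  // replace any previous value for this key
        m_TotalBytes += Value.size();
        m_Entries.emplace(Key, Entry{std::move(Value), NowSeconds});
        Trim(NowSeconds);
    }

    const std::string* Get(const std::string& Key, uint64_t NowSeconds)
    {
        Trim(NowSeconds);
        auto It = m_Entries.find(Key);
        if (It == m_Entries.end())
        {
            return nullptr;
        }
        It->second.LastAccess = NowSeconds;
        return &It->second.Value;
    }

    size_t TotalBytes() const { return m_TotalBytes; }

private:
    struct Entry
    {
        std::string Value;
        uint64_t    LastAccess;
    };

    void Remove(const std::string& Key)
    {
        auto It = m_Entries.find(Key);
        if (It != m_Entries.end())
        {
            m_TotalBytes -= It->second.Value.size();
            m_Entries.erase(It);
        }
    }

    void Trim(uint64_t NowSeconds)
    {
        // Pass 1: drop entries older than the maximum age.
        for (auto It = m_Entries.begin(); It != m_Entries.end();)
        {
            const uint64_t Last = It->second.LastAccess;
            if (NowSeconds > Last && NowSeconds - Last > m_MaxAgeSeconds)
            {
                m_TotalBytes -= It->second.Value.size();
                It = m_Entries.erase(It);
            }
            else
            {
                ++It;
            }
        }
        // Pass 2: evict least recently accessed entries until under target.
        while (m_TargetFootprint != 0 && m_TotalBytes > m_TargetFootprint && !m_Entries.empty())
        {
            auto Oldest = m_Entries.begin();
            for (auto It = m_Entries.begin(); It != m_Entries.end(); ++It)
            {
                if (It->second.LastAccess < Oldest->second.LastAccess)
                {
                    Oldest = It;
                }
            }
            m_TotalBytes -= Oldest->second.Value.size();
            m_Entries.erase(Oldest);
        }
    }

    size_t                                 m_TargetFootprint;
    uint64_t                               m_MaxAgeSeconds;
    size_t                                 m_TotalBytes = 0;
    std::unordered_map<std::string, Entry> m_Entries;
};
```

The linear scan for the oldest entry keeps the sketch short but is the kind of per-call overhead the commit message worries about; a polling thread or an access-ordered list would move that cost off the Get/Put path.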
* adding more stats (#429) | Dan Engelbrecht | 2023-09-28 | 1 | -4/+53
    - Feature: Add detailed stats on requests and data sizes on a per-bucket level, use parameter `cachestorestats=true` on the `/stats/z$` endpoint to enable
    - Feature: Add detailed stats on requests and data sizes on cidstore, use parameter `cidstorestats=true` on the `/stats/z$` endpoint to enable
    - Feature: Dashboard now accepts parameters in the URL which are passed on to the `/stats/z$` endpoint
* Add runtime status/control of logging (#419) | Dan Engelbrecht | 2023-09-22 | 1 | -12/+29
    - Feature: New endpoint `/admin/logs` to query status of logging and log file locations and cache logging
      - `enablewritelog`=`true`/`false` parameter to control cache write logging
      - `enableaccesslog`=`true`/`false` parameter to control cache access logging
      - `loglevel` = `trace`/`debug`/`info`/`warning`/`error`
    - Feature: New zen command `logs` to query/control zen logging
      - No arguments gives status of logging and paths to log files
      - `--cache-write-log` `enable`/`disable` to control cache write logging
      - `--cache-access-log` `enable`/`disable` to control cache access logging
      - `--loglevel` `trace`/`debug`/`info`/`warning`/`error` to set debug level
* VFS implementation for local storage service (#396) | Stefan Boberg | 2023-09-20 | 1 | -0/+18
    currently, only Windows (using Projected File System) is supported
* add DiskWriteBlocker to structured cache store log writer (#408) | Dan Engelbrecht | 2023-09-15 | 1 | -30/+40
    * add DiskWriteBlocker to structured cache store log writer
    * changelog
* add more trace scopes (#362) | Dan Engelbrecht | 2023-09-15 | 1 | -2/+10
    * more trace scopes
    * Make sure ReplayLogEntries uses the correct size for oplog buffer
    * changelog
* ZenCacheStore is now reference counted (#398) | Stefan Boberg | 2023-09-13 | 1 | -0/+10
    this change also adds a GetNamespaces function which may be used to enumerate all currently known cache namespaces
* gracefully handle errors when writing cache log (#391) | Dan Engelbrecht | 2023-09-11 | 1 | -29/+50
    * gracefully handle errors when writing cache log
    * changelog
    * fix log message
* Minor: Make sure to reset cache logging worker thread event to avoid busy-looping looking for more work | Dan Engelbrecht | 2023-08-24 | 1 | -0/+1
* single thread async cache log (#361) | Dan Engelbrecht | 2023-08-17 | 1 | -72/+83
    * rework cache store background logging
    * correct capture for context
* skip upstream logic early if we have no upstream endpoints (#359) | Dan Engelbrecht | 2023-08-17 | 1 | -23/+42
    * Skip upstream logic early if we have no upstream endpoints
    * make cache store logging of CbObjects async
    * changelog
* cache log sessionid (#297) | Stefan Boberg | 2023-05-23 | 1 | -50/+78
    * implemented structured cache logging to be used as audit trail to help analyse potential cache pollution/corruption
    * added common header to all known log targets
    * made Oid::operator bool explicit to avoid logging/text format mishaps
    * HttpClient::operator bool -> explicit
    * changed cache logs to not rotate on start in order to retain more history
    * added CacheRequestContext
    * properly initialize request context
    * log session id and request id on zencachestore get/put
    * changelog
* Restructured structured cache store (#314) | Stefan Boberg | 2023-05-17 | 1 | -2422/+2
    This change separates out the disk and memory storage strategies into separate cpp/h files to improve maintainability.
* Content scrubbing (#271) | Stefan Boberg | 2023-05-16 | 1 | -135/+290
    Added zen scrub command which may be triggered via the zen CLI helper. This traverses storage and validates contents either by content hash and/or by structure. If unexpected data is encountered it is invalidated.
* Add `--gc-projectstore-duration-seconds` option (#281) | Dan Engelbrecht | 2023-05-16 | 1 | -9/+9
    * Add `--gc-projectstore-duration-seconds` option
    * Cleanup lua gc options parsing
    * Remove dead configuration values
    * changelog
* removed remnants of ZEN_USE_REF_TRACKING | Stefan Boberg | 2023-05-15 | 1 | -19/+0
    this code was originally meant to be used for GC but is no longer needed
* minor GC API cleanup | Stefan Boberg | 2023-05-15 | 1 | -13/+13
    Scrub -> ScrubStorage
    Trigger -> TriggerGc (to make relationship to TriggerScrub clearer)
* Remove ZEN_CACHE_TRACKER etc | Stefan Boberg | 2023-05-15 | 1 | -13/+0
    this was code which was originally intended for use with GC but it's no longer useful
* wipe cache buckets block store that may contain invalid state (#300) | Dan Engelbrecht | 2023-05-12 | 1 | -0/+14
    * wipe cache buckets block store that may contain invalid state
    * Update CHANGELOG.md
* fix gc bucket index compaction (#299) | Dan Engelbrecht | 2023-05-12 | 1 | -4/+5
    * fix compaction of m_Payloads and m_AccessTimes in ZenCacheDiskLayer::CacheBucket
    * changelog
* implemented structured cache logging (#296) | Stefan Boberg | 2023-05-12 | 1 | -2/+73
    may be used as audit trail to help analyse potential cache pollution/corruption
    * also added common header with timestamp to all known log targets
    * made `Oid::operator bool` explicit to avoid logging/text format mishaps
    * made `HttpClient::operator bool` explicit
* WARN level log if we can't write snapshot/manifest/access times (#288) | Dan Engelbrecht | 2023-05-11 | 1 | -2/+9
* clean up log/index reading and fix incorrect logging about bad log files (#286) | Dan Engelbrecht | 2023-05-10 | 1 | -4/+8
* Validate that entries point inside valid blocks at startup (#280) | Dan Engelbrecht | 2023-05-09 | 1 | -4/+44
    * Separate initialization of block store from pruning of unknown blocks
    * Validate that entries point inside valid blocks
* moved source directories into `/src` (#264) | Stefan Boberg | 2023-05-02 | 1 | -0/+3648
    * moved source directories into `/src`
    * updated bundle.lua for new `src` path
    * moved some docs, icon
    * removed old test trees