path: root/src/zenstore/gc.cpp
Commit message | Author | Age | Files | Lines
* safer gcv2 on error (#60) | Dan Engelbrecht | 2024-04-24 | 1 | -1/+19
    - Bugfix: Harden GCv2 when errors occur and gracefully abort GC operation on error
* improved assert (#37) | Dan Engelbrecht | 2024-04-04 | 1 | -16/+16
    - Improvement: Add file and line to ASSERT exceptions
    - Improvement: Catch call stack when throwing assert exceptions and log/output call stack at important places to provide more context to caller
* Use multithreading to fetch size/rawsize of entries in `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` (#30) | Dan Engelbrecht | 2024-03-28 | 1 | -2/+2
    - Improvement: Use multithreading to fetch size/rawsize of entries in `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files`
    - Improvement: Add `GetMediumWorkerPool()` in addition to `LargeWorkerPool()` and `SmallWorkerPool()`
* Make sure we wait for all scheduled tasks to complete before throwing exceptions further (#662) | Dan Engelbrecht | 2024-02-28 | 1 | -48/+88
    Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
* Don't capture local variables in loop by reference (#623) | Dan Engelbrecht | 2023-12-19 | 1 | -27/+27
    * Don't capture local variables in loop by reference
* fix peak disk load in gc status (#608) | Dan Engelbrecht | 2023-12-13 | 1 | -12/+11
    * MaxLoad is max load per monitor slot, not the MaxLoad for the entire graph
* improved scrubbing of oplogs and filecas (#596) | Stefan Boberg | 2023-12-11 | 1 | -3/+10
    - Improvement: Scrub command now validates compressed buffer hashes in filecas storage (used for large chunks)
    - Improvement: Added --dry, --no-gc and --no-cas options to zen scrub command
    - Improvement: Implemented oplog scrubbing (previously was a no-op)
    - Improvement: Implemented support for running scrubbing at startup with --scrub=<options>
* add separate PreCache step for GcReferenceChecker (#578) | Dan Engelbrecht | 2023-12-01 | 1 | -6/+54
    - Improvement: GCv2: Use separate PreCache step to improve concurrency when checking references
    - Improvement: GCv2: Improved verbose logging
    - Improvement: GCv2: Sort chunks to read by block/offset when finding references
    - Improvement: GCv2: Exit as soon as no more unreferenced items are left
* global thread worker pools (#577) | Dan Engelbrecht | 2023-11-29 | 1 | -10/+4
    - Improvement: Use two global worker thread pools instead of ad-hoc creation of worker pools
* tracing for gcv2 (#574) | Dan Engelbrecht | 2023-11-28 | 1 | -1/+11
    - Improvement: Added more trace scopes for GCv2
    - Bugfix: Make sure we can override flags to "false" when running `zen gc` command
        - `smallobjects`, `skipcid`, `skipdelete`, `verbose`
* gc stop command (#569) (tag: v0.2.36-pre2) | Dan Engelbrecht | 2023-11-27 | 1 | -1/+16
    - Feature: New endpoint `/admin/gc-stop` to cancel a running garbage collect operation
    - Feature: Added `zen gc-stop` command to cancel a running garbage collect operation
    - Bugfix: GCv2 - make sure to discover all projects and oplogs before checking for expired data
* Add GC Cancel/Stop (#568) | Dan Engelbrecht | 2023-11-24 | 1 | -6/+97
    - GcScheduler will now cancel any running GC when it shuts down.
    - Old GC is rather limited in *when* it reacts to cancel of GC. GCv2 is more responsive.
* add command line options for compact block threshold and gc verbose (#557) | Dan Engelbrecht | 2023-11-21 | 1 | -146/+210
    - Feature: Added new options to zenserver for GC V2
        - `--gc-compactblock-threshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
        - `--gc-verbose` GCV2 - enable more verbose output when running a GC pass
    - Feature: Added new options to `zen gc` command for GC V2
        - `--compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
        - `--verbose` GCV2 - enable more verbose output when running a GC pass
    - Feature: Added new parameters for endpoint `admin/gc` (PUT)
        - `compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90%
        - `verbose` GCV2 - enable more verbose output when running a GC pass
* compact separate for gc referencer (#533) | Dan Engelbrecht | 2023-11-21 | 1 | -396/+425
    - Refactor GCV2 so GcReferencer::RemoveExpiredData returns a store compactor, moving out the actual disk work from deleting items in the index.
    - Refactor GCV2 GcResult to reuse GcCompactStoreStats and GcStats
    - Make Compacting of stores non-parallel to not eat all the disk I/O when running GC
* blocking queue fix (#550) | Dan Engelbrecht | 2023-11-16 | 1 | -1/+1
    * make BlockingQueue::m_CompleteAdding non-atomic
    * ZenCacheDiskLayer::Flush logging
    * name worker threads in ZenCacheDiskLayer::DiscoverBuckets
    * name worker threads in gcv2
    * improved logging in ZenServerInstance
    * scrub threadpool naming
    * remove waitpid handling, we should just call wait to kill zombie processes
* gc history log (#519) | Dan Engelbrecht | 2023-11-13 | 1 | -8/+279
    - Feature: Writes a `gc.log` with settings and detailed result after each GC execution (version 2 only)
    - Break out file name rotate to allow access for gclog
    - CompactBinaryToJson(MemoryView Data, StringBuilderBase& InBuilder)
* gc v2 tests (#512) | Dan Engelbrecht | 2023-11-06 | 1 | -27/+11
    * set MaxBlockCount at init
    * properly calculate total size
    * basic blockstore compact blocks test
    * correct detection of block swap
    * Use one implementation for CreateRandomBlob
    * reduce some data sets to increase speed of tests
    * reduce test time
    * rename BlockStoreCompactState::AddBlock -> BlockStoreCompactState::IncludeBlock
* multithread cache bucket (#508) | Dan Engelbrecht | 2023-11-06 | 1 | -252/+252
    * Multithread init and flush of cache bucket
    * tweaked threading count for bucket discovery, disklayer flush and gc v2
* individual gc stats (#506) | Dan Engelbrecht | 2023-10-30 | 1 | -265/+366
    - Feature: New parameter for endpoint `admin/gc` (GET) `details=true` which gives detailed stats on GC operation when using GC V2
    - Feature: New options for zen command `gc-status`
        - `--details` that enables the detailed output from the last GC operation when using GC V2
* New GC implementation (#459) | Dan Engelbrecht | 2023-10-30 | 1 | -18/+308
    - Feature: New garbage collection implementation, still in evaluation mode. Enabled by `--gc-v2` command line option
* added missing includes (#504) | Stefan Boberg | 2023-10-27 | 1 | -0/+1
    this change adds some includes to files which "inherit" includes from elsewhere
    this was exposed on another branch when removing some heavy dependencies from central headers
* fix m_LastFullGcDuration, m_LastFullGCDiff, m_LastFullGcDuration and m_LastLightweightGcDuration stats (#494) | Dan Engelbrecht | 2023-10-23 | 1 | -18/+13
* clean up GcContributor and GcStorage to be pure interfaces (#485) | Dan Engelbrecht | 2023-10-20 | 1 | -24/+0
* Add --skip-delete option to gc command (#484) | Dan Engelbrecht | 2023-10-20 | 1 | -0/+4
    - Feature: Add `--skip-delete` option to gc command
    - Bugfix: Fix implementation when claiming GC reserve during GC
* add `flush` command and more gc status info (#483) | Dan Engelbrecht | 2023-10-18 | 1 | -23/+89
    - Feature: New endpoint `/admin/flush` to flush all storage - CAS, Cache and ProjectStore
    - Feature: New command `zen flush` to flush all storage - CAS, Cache and ProjectStore
    - Improved: Command `zen gc-status` now gives details about storage, when last GC occurred, how long until next GC etc
    - Changed: Cache access and write log are disabled by default
* skip lightweight GC if full GC is due soon (#467) | Stefan Boberg | 2023-10-12 | 1 | -20/+30
    GC will now skip a lightweight GC if a full GC is due to run within the next lightweight GC interval
    also fixed some minor typos
* fixed GC logging output stats (#458) | Stefan Boberg | 2023-10-10 | 1 | -1/+1
    disk usage stats are now properly reported in log messages
* fix gc infinite loop (#453) | Dan Engelbrecht | 2023-10-06 | 1 | -1/+9
    * make sure we update last gc time even if gc fails
    * If we can't check if an oplog/project markerfile exists, assume it is not expired
* trivial: log output typo in GC | Stefan Boberg | 2023-10-05 | 1 | -1/+1
* clean up date formatting (#440) | Stefan Boberg | 2023-10-02 | 1 | -4/+4
    * clean up date formatting (previous code would include a newline)
* fix formatting of gc start message (tag: v0.2.26-pre0) | Dan Engelbrecht | 2023-10-02 | 1 | -1/+1
* Handle OOM and OOD more gracefully to not spam Sentry with error reports (#434) | Dan Engelbrecht | 2023-10-02 | 1 | -1/+39
    - Improvement: Catch Out Of Memory and Out Of Disk exceptions and report back to requester without reporting an error to Sentry
    - Improvement: If creating bucket fails when storing an item in the structured cache, log a warning and propagate error to requester without reporting an error to Sentry
    - Improvement: Make an explicit flush of the active block written to in blockstore flush
    - Improvement: Make sure cache and cas MakeIndexSnapshot does not throw exception on failure which would cause an abnormal termination at exit
* lightweight gc (#431) | Dan Engelbrecht | 2023-10-02 | 1 | -47/+141
    - Feature: Add lightweight GC that only removes items from cache/project store without cleaning up data referenced in Cid store
        - Add `skipcid` parameter to http endpoint `admin/gc`, defaults to "false"
        - Add `--skipcid` option to `zen gc` command, defaults to false
        - Add `--gc-lightweight-interval-seconds` option to zenserver
* add more trace scopes (#362) | Dan Engelbrecht | 2023-09-15 | 1 | -0/+5
    * more trace scopes
    * Make sure ReplayLogEntries uses the correct size for oplog buffer
    * changelog
* retry file create (#383) | Dan Engelbrecht | 2023-09-04 | 1 | -1/+4
    * add retry logic when creating files
    * only write disk usage log if disk writes are allowed
    * changelog
* safer gc on low disk (#373) | Dan Engelbrecht | 2023-08-22 | 1 | -36/+45
    - Improvement: Make sure we have disk space available to do GC and use reserve up front if need be
* catch exceptions when scheduling GC and when writing GC scheduling state (#339) | Dan Engelbrecht | 2023-08-01 | 1 | -136/+149
    * catch exceptions when scheduling GC and when writing GC scheduling state
* Content scrubbing (#271) | Stefan Boberg | 2023-05-16 | 1 | -7/+135
    Added zen scrub command which may be triggered via the zen CLI helper. This traverses storage and validates contents either by content hash and/or by structure. If unexpected data is encountered it is invalidated.
* Additional trace instrumentation (#312) | Stefan Boberg | 2023-05-16 | 1 | -0/+5
    * added trace instrumentation to upstreamcache
    * added asio trace instrumentation
    * added trace annotations for project store
    * added trace annotations for BlockStore
    * added trace annotations for HttpClient
    * added trace annotations for CAS/GC
* Add `--gc-projectstore-duration-seconds` option (#281) | Dan Engelbrecht | 2023-05-16 | 1 | -26/+51
    * Add `--gc-projectstore-duration-seconds` option
    * Cleanup lua gc options parsing
    * Remove dead configuration values
    * changelog
* removed remnants of ZEN_USE_REF_TRACKING | Stefan Boberg | 2023-05-15 | 1 | -20/+0
    this code was originally meant to be used for GC but is no longer needed
* minor GC API cleanup | Stefan Boberg | 2023-05-15 | 1 | -5/+5
    Scrub -> ScrubStorage
    Trigger -> TriggerGc (to make relationship to TriggerScrub clearer)
* make sure we create gc root directory before checking disk space | Dan Engelbrecht | 2023-05-10 | 1 | -7/+7
* Low disk space detector (#277) | Dan Engelbrecht | 2023-05-09 | 1 | -3/+45
    * Feature: Disk writes are now blocked early and return an insufficient storage error if free disk space falls below the `--low-diskspace-threshold` value
    * Never keep an entry in m_ChunkBlocks that points to a nullptr
* moved source directories into `/src` (#264) | Stefan Boberg | 2023-05-02 | 1 | -0/+1312
    * moved source directories into `/src`
    * updated bundle.lua for new `src` path
    * moved some docs, icon
    * removed old test trees