aboutsummaryrefslogtreecommitdiff
path: root/src/zenserver/projectstore
Commit message (Collapse)AuthorAgeFilesLines
* Use multithreading to fetch size/rawsize of entries in ↵Dan Engelbrecht2024-03-283-60/+164
| | | | | | `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` (#30) - Improvement: Use multithreading to fetch size/rawsize of entries in `/prj/{project}/oplog/{log}/chunkinfos` and `/prj/{project}/oplog/{log}/files` - Improvement: Add `GetMediumWorkerPool()` in addition to `LargeWorkerPool()` and `SmallWorkerPool()`
* add "fieldnames" query param for GetProjectFiles/GetProjectChunkInfos (#29)Dan Engelbrecht2024-03-283-23/+137
| | | | | | | | - Improvement: It is now possible to control which fields to include in `/prj/{project}/oplog/{log}/chunkinfos` request by adding a comma delimited list of filed names for `fieldnames` parameter - Default fields are: `id`, `rawhash` and `rawsize` (translates to `?fieldnames=id,rawhash,rawsize`) - Use `?fieldnames=*` to get all the fields - Improvement: It is now possible to control which fields to include in `/prj/{project}/oplog/{log}/files` request by adding a comma delimited list of filed names for `fieldnames` parameter - Default fields are: `id`, `clientpath` and `serverpath` (translates to `?fieldnames=id,clientpath,serverpath`), `filter=client` only applies if `fieldnames` is not given as a parameter - Use `?fieldnames=*` to get all the fields
* Get raw size for compressed chunks correctly for ↵Dan Engelbrecht2024-03-271-1/+7
| | | | `/prj/{project}/oplog/{log}/chunkinfos` (#27)
* make existing block log clearerDan Engelbrecht2024-03-261-1/+1
|
* fix order of parameters in block check result message (#26)Dan Engelbrecht2024-03-261-1/+1
| | | * fix order of parametes in block check result message
* add filter to projectstore entries request (#25)Dan Engelbrecht2024-03-261-6/+65
| | | | | | | * Add HttpServerRequest::Decode to decode http request parameters * Add support to filter projectstore `entries` request using the `fieldfilter` where the wanted fields are comma (,) delimited * Add support for responding with compressed payloads for projectstore `entries` requests by adding AcceptType `compressed-binary` to the request header * Add support for responding with compressed payloads for projectstore `files` requests by adding AcceptType `compressed-binary` to the request header * Add support for responding with compressed payloads for projectstore `chunkinfo` requests by adding AcceptType `compressed-binary` to the request header
* consistent paths encoding (#24)Dan Engelbrecht2024-03-253-47/+49
| | | | * Don't encode filesystem path to UTF8 unless stored in compactbinary string * Be consistent where we encode/decode paths to UTF8
* add a limit to the number of times we attempt to finalize (#22)Dan Engelbrecht2024-03-251-81/+133
| | | | | | - Improvement: Add limit to the number of times we attempt to finalize and exported oplog - Improvement: Switch to large thread pool when executing oplog export/import - Improvement: Clean up reporting of missing attachments in oplog export/import - Improvement: Remove double-reporting of abort reason for oplog export/import
* use batch request for checking existing blocks as Jupiter is now fixed (#20)v5.4.2-pre15v5.4.2-pre14Dan Engelbrecht2024-03-231-41/+10
|
* check existance of reused blocks (#18)Dan Engelbrecht2024-03-225-43/+167
| | | | | * Add HasAttachments to RemoteProjectStore so we can query if attachment blocks actually exist * use individual requests for compressed blob check in Jupiter * remove weird 1000.5 to 1000.0 when converting seconds to milliseconds
* non memory copy compressed range (#13)Dan Engelbrecht2024-03-203-32/+39
| | | | | * Add CompressedBuffer::GetRange that references source data rather than make a memory copy * Use Compressed.CopyRange in project store GetChunkRange * docs for CompressedBuffer::CopyRange and CompressedBuffer::GetRange
* read jupiter token from file (#10)Dan Engelbrecht2024-03-181-5/+12
| | | | | * Add ability to specify a json file for cloud access token OidcToken.exe in UE can be asked to produce such a file with the -OutFile option * avoid division by zero when reporting progress
* special treatment large oplog attachments v2 (#5)Dan Engelbrecht2024-03-144-624/+1325
| | | | | - Bugfix: Install Ctrl+C handler earlier when doing `zen oplog-export` and `zen oplog-export` to properly cancel jobs - Improvement: Add ability to block a set of CAS entries from GC in project store - Improvement: Large attachments and loose files are now split into smaller chunks and stored in blocks during oplog export
* fix potential partially written files (#2)Dan Engelbrecht2024-03-131-8/+2
| | | | * Make sure WriteFile() does not leave incomplete files * use TemporaryFile and MoveTemporaryIntoPlace to avoid leaving partial files on error
* Make sure we wait for all scheduled tasks to complete before throwing ↵Dan Engelbrecht2024-02-281-1/+3
| | | | | exceptions further (#662) Bugfix: We must not throw exceptions to calling function until all async work we spawned has returned
* Keep track of added ops during GCV2 instead of rescanning full oplog when ↵Dan Engelbrecht2024-02-132-8/+25
| | | | added ops are detected (#652)
* Save compressed large attachments to temporary files on disk (#650)Dan Engelbrecht2024-02-122-183/+217
| | | | | | | | | * Save large compressed large attachments to temporary files on disk * bump oplog block max size up to 64Mb again * Make sure CompositeBuffer::AppendBuffers actually moves inputs when it should * removed parallell execution of fetching payload for block assembly it was not actually helping and added complexity * make sure we move/release payload buffers as soon as possible * make sure we don't read in full large attachments to memory when computing hash
* compress large attachments on demand (#647)Dan Engelbrecht2024-02-054-362/+550
| | | | | | | - Improvement: Speed up oplog export by fetching/compressing big attachments on demand - Improvement: Speed up oplog export by batch-fetcing small attachments - Improvement: Speed up oplog import by batching writes of oplog ops - Improvement: Tweak oplog export default block size and embed size limit - Improvement: Add more messaging and progress during oplog import/export
* respond with BadRequest result instead of throwing exception on bad request ↵Dan Engelbrecht2024-02-051-2/+12
| | | | input (#648)
* improve oplog export logging (#644)Dan Engelbrecht2024-01-316-104/+219
| | | | | | - Improvement: More details in oplog import/export logs - Improvement: Switch from Download to Get when fetching Refs from Jupiter as they can't be resumed anyway and streaming to disk is redundant - Bugfix: Make sure we clear read callback when doing Put in HttpClient to avoid timeout due to not sending data when reusing sessions - Bugfix: Respect `--ignore-missing-attachments` in `oplog-export` command when loose file is missing on disk
* fix response error conversion (#643)Dan Engelbrecht2024-01-291-4/+5
| | | | * make sure we properly convert compact-binary results to text when receiving errors * log fix
* add ignore-missing-attachments option to oplog export (debugging tool) (#641)Dan Engelbrecht2024-01-253-35/+48
| | | | | | | * add ignore-missing-attachments option to oplog export (debugging tool) * add more status codes to do retry for in http client * add missing X-Jupiter-IoHash header for jupiter PutRef * reduce oplog block size to reduce amount of redundant chunks to download * improved logging
* Add retry with optional resume logic to HttpClient::Download (#639)Dan Engelbrecht2024-01-243-101/+51
| | | | | | | - Improvement: Refactored Jupiter upstream to use HttpClient - Improvement: Added retry and resume logic to HttpClient - Improvement: Added authentication support to HttpClient - Improvement: Clearer logging in GCV2 compact of FileCas/BlockStore - Improvement: Size details in oplog import logging
* oplog import/export improvements (#634)Dan Engelbrecht2024-01-233-224/+368
| | | | * improve feedback from oplog import/export * improve oplog save performance
* add --ignore-missing-attachments to oplog-import command (#637)Dan Engelbrecht2024-01-223-49/+83
|
* improved errors from jupiter upstream (#636)Dan Engelbrecht2024-01-221-5/+22
| | | * get more detailed error messages from jupiter upstream
* separate RPC processing from HTTP processing (#626)Stefan Boberg2023-12-202-2/+2
| | | | | | * moved all RPC processing from HttpStructuredCacheService into separate CacheRpcHandler class in zenstore * move package marshaling to zenutil. was previously in zenhttp/httpshared but it's useful in other contexts as well where we don't want to depend on zenhttp * introduced UpstreamCacheClient, this provides a subset of functions on UpstreamCache and lives in zenstore
* ensure we can build without trace (#619)Stefan Boberg2023-12-191-2/+2
| | | | `xmake config -zentrace=n` would previously not build cleanly
* improve trace (#606)Dan Engelbrecht2023-12-131-2/+0
| | | | | * Adding some more trace scopes for better visiblity * Removed spammy trace scope when replaying oplogs * Remove "::Disk" from trace scopes - redundant now that we have merge disk and memory layers
* improved scrubbing of oplogs and filecas (#596)Stefan Boberg2023-12-112-81/+193
| | | | | | - Improvement: Scrub command now validates compressed buffer hashes in filecas storage (used for large chunks) - Improvement: Added --dry, --no-gc and --no-cas options to zen scrub command - Improvement: Implemented oplog scrubbing (previously was a no-op) - Improvement: Implemented support for running scrubbint at startup with --scrub=<options>
* Change naming to ChunkInfos instead of Chunkszousar2023-12-064-12/+12
|
* Ran precommitzousar2023-12-052-7/+3
|
* Get hash when retrieving chunkszousar2023-12-052-7/+32
| | | | Also changes the returned fields for each chunk from size->rawsize. Backwards compatibility is not a concern as this was unused in past zenserver releases.
* Add endpoint for all chunk infoszousar2023-12-013-8/+45
| | | | Add endpoint for querying all chunk infos in an oplog.
* add separate PreCache step for GcReferenceChecker (#578)Dan Engelbrecht2023-12-011-14/+29
| | | | | | - Improvement: GCv2: Use separate PreCache step to improve concurrency when checking references - Improvement: GCv2: Improved verbose logging - Improvement: GCv2: Sort chunks to read by block/offset when finding references - Improvement: GCv2: Exit as soon as no more unreferenced items are left
* global thread worker pools (#577)Dan Engelbrecht2023-11-291-12/+4
| | | - Improvement: Use two global worker thread pools instead of ad-hoc creation of worker pools
* tracing for gcv2 (#574)Dan Engelbrecht2023-11-281-0/+12
| | | | | | - Improvement: Added more trace scopes for GCv2 - Bugfix: Make sure we can override flags to "false" when running `zen gc` commmand - `smallobjects`, `skipcid`, `skipdelete`, `verbose`
* gcv2 tests for project store and bugfixes (#571)Dan Engelbrecht2023-11-271-84/+209
| | | * gcv2 tests for project store and bugfixes
* gc stop command (#569)v0.2.36-pre2Dan Engelbrecht2023-11-271-2/+21
| | | | | - Feature: New endpoint `/admin/gc-stop` to cancel a running garbage collect operation - Feature: Added `zen gc-stop` command to cancel a running garbage collect operation - Bugfix: GCv2 - make sure to discover all projects and oplogs before checking for expired data
* add command line options for compact block threshold and gc verbose (#557)Dan Engelbrecht2023-11-211-1/+17
| | | | | | | | | | | - Feature: Added new options to zenserver for GC V2 - `--gc-compactblock-threshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `--gc-verbose` GCV2 - enable more verbose output when running a GC pass - Feature: Added new options to `zen gc` command for GC V2 - `--compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `--verbose` GCV2 - enable more verbose output when running a GC pass - Feature: Added new parameters for endpoint `admin/gc` (PUT) - `compactblockthreshold` GCV2 - how much of a compact block should be used to skip compacting the block, default is 90% - `verbose` GCV2 - enable more verbose output when running a GC pass
* compact separate for gc referencer (#533)Dan Engelbrecht2023-11-212-85/+205
| | | | | - Refactor GCV2 so GcReferencer::RemoveExpiredData returns a store compactor, moving out the actual disk work from deleting items in the index. - Refactor GCV2 GcResult to reuse GcCompactStoreStats and GcStats - Make Compacting of stores non-parallell to not eat all the disk I/O when running GC
* disk layer gc and error/warnings cleanup (#515)Dan Engelbrecht2023-11-081-41/+41
| | | | | | | - Improvement: Use GC reserve when writing index/manifest for a disk cache bucket when disk is low when available - Improvement: Demote errors to warning for issues that are not critical and we handle gracefully - Improvement: Treat more out of memory errors from windows as Out Of Memory errors Fixed wrong sizeof() statement for compactcas index (luckily the two structs are of same size)
* spdlog implementation hiding (#498)Stefan Boberg2023-11-063-8/+9
| | | | | | | | | this change aims to hide logging internals from client code, in order to make it easier to extend and take more control over the logging process in the future. As a bonus side effect, the generated code is much tighter (net delta around 2.5% on the resulting executable which includes lots of thirdparty code) and should take less time to compile and link. Client usage via macros is pretty much unchanged. The main exposure client code had to spdlog internals before was the use of custom loggers per subsystem, where it would be common to have `spdlog::logger` references to keep a reference to a logger within a class. This is now replaced by `zen::LoggerRef` which currently simply encapsulates an actual `spdlog::logger` instance, but this is intended to be an implementation detail which will change in the future. The way the change works is that we now handle any formatting of log messages in the zencore logging subsystem instead of relying on `spdlog` to manage this. We use the `fmt` library to do the formatting which means the client usage is identical to using `spdlog`. The formatted message is then forwarded onto any sinks etc which are still implememted via `spdlog`.
* gc v2 tests (#512)Dan Engelbrecht2023-11-061-12/+1
| | | | | | | | | | * set MaxBlockCount at init * properly calculate total size * basic blockstore compact blocks test * correct detection of block swap * Use one implementation for CreateRandomBlob * reduce some data sets to increase speed of tests * reduce test time * rename BlockStoreCompactState::AddBlock -> BlockStoreCompactState::IncludeBlock
* individual gc stats (#506)Dan Engelbrecht2023-10-302-30/+64
| | | | | - Feature: New parameter for endpoint `admin/gc` (GET) `details=true` which gives details stats on GC operation when using GC V2 - Feature: New options for zen command `gc-status` - `--details` that enables the detailed output from the last GC operation when using GC V2
* New GC implementation (#459)Dan Engelbrecht2023-10-302-5/+270
| | | - Feature: New garbage collection implementation, still in evaluation mode. Enabled by `--gc-v2` command line option
* added missing includes (#504)Stefan Boberg2023-10-271-0/+2
| | | | | this change adds some includes to files which "inherit" includes from elsewhere this was exposed on another branch when removing some heavy dependencies from central headers
* merge disk and memory layers (#493)Dan Engelbrecht2023-10-241-1/+1
| | | | - Feature: Added `--cache-memlayer-sizethreshold` option to zenserver to control at which size cache entries get cached in memory - Changed: Merged cache memory layer with cache disk layer to reduce memory and cpu overhead
* clean up GcContributor and GcStorage to be pure interfaces (#485)Dan Engelbrecht2023-10-202-3/+7
|
* add `flush` command and more gc status info (#483)Dan Engelbrecht2023-10-181-2/+3
| | | | | | - Feature: New endpoint `/admin/flush ` to flush all storage - CAS, Cache and ProjectStore - Feature: New command `zen flush` to flush all storage - CAS, Cache and ProjectStore - Improved: Command `zen gc-status` now gives details about storage, when last GC occured, how long until next GC etc - Changed: Cache access and write log are disabled by default