* This is 99% mechanical:
Extracted repo.Repository interface that only exposes high-level object and manifest management methods, but not blob nor content management.
Renamed old *repo.Repository to *repo.DirectRepository
Reviewed codebase to only depend on repo.Repository as much as possible, but added way for low-level CLI commands to use DirectRepository.
* PR fixes
, where blob.Storage.PutBlob gets a list of slices and writes them sequentially
* performance: added gather.Bytes and gather.WriteBuffer
They are similar to bytes.Buffer but instead of managing a single
byte slice, they maintain a list of slices that and when they run out of
space they allocate new fixed-size slice from a free list.
This helps keep memory allocations completely under control regardless
of the size of data written.
* switch from byte slices and bytes.Buffer to gather.Bytes.
This is mostly mechanical, the only cases where it's not involve blob
storage providers, where we leverage the fact that we don't need to
ever concatenate the slices into one and instead we can do gather
writes.
* PR feedback
* object: added AsyncWrites to ObjectWriter, which improves performance of uploading of a single file
Fixes#351
Co-Authored-By: Julio López <julio+gh@kasten.io>
* performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression
* repo: switched object writer to buf.Pool
* content: switched encryption to use buf.Pool
* object: switched compression to use buf.Pool
* testing: added missing content manager Close()
Previously we would store special field Payload for contents
that were added but never flushed to the store and it was not
encrypted. This required special handling different for pending
vs flushed contents.
This change maintains pending pack buffer ready to be flushed
and appends encrypted contents to it, which avoids data copying.
The buffers are pooled to avoid allocations.
Add an implementation of the metadata store using kopia snapshots and restores to manage persistence of walk metadata. Metadata are stored to and retrieved from the store, and a mechanism for persisting the store is implemented using kopia snapshots.
- added pooled splitters and ability to reset them without having to recreate
- added support for caller-provided compressor output to be able to pool it
- added pooling of compressor instances, since those are costly
non-optimized (0.5.0)
0. BLAKE2B-256-128 AES256-GCM-HMAC-SHA256 644.9 MiB / second
before this change:
0. BLAKE2B-256-128 AES256-GCM-HMAC-SHA256 655.9 MiB / second
after (this change):
0. BLAKE2B-256-128 AES256-GCM-HMAC-SHA256 781.5 MiB / second
* performance: plumbed through output buffer to encryption and hashing, so that the caller can pre-allocate/reuse it
* testing: fixed how we do comparison of byte slices to account for possible nils, which can be returned from encryption
Upload: reduced memory usage during uploads
Replaced custom work item management with much simpler solution.
Upload: added pooling of upload buffers
This reduces a lot of allocations and GC pressure.
In a test of 2 dirs x 100K files x 100 bytes each, allocated bytes
went from 27 GB to 20 GB.
This improves #93
* repo: added some initial metrics using OpenCensus
* cli: added flags to expose Prometheus metrics on a local endpoint
`--metrics-listen-addr=localhost:X` exposes prometheus metrics on
http://localhost:X/metrics
Also, kopia server will automatically expose /metrics endpoint on the
same port it runs as, without authentication.