kopia

mirror of https://github.com/kopia/kopia.git synced 2026-05-18 11:44:36 -04:00

Author	SHA1	Message	Date
Prasad Ghangal	3bf947d746	feat(repository): Metadata compression config support for directory and indirect content (#4080 ) * Configure compressor for k and x prefixed content Adds metadata compression setting to policy Add support to configure compressor for k and x prefixed content Set zstd-fastest as the default compressor for metadata in the policy Adds support to set and show metadata compression to kopia policy commands Adds metadata compression config to dir writer Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Pass concatenate options with ConcatenateOptions struct Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Move content compression handling to caller Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Move handling manifests to manifest pkg Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Correct const in server_test Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Remove unnecessary whitespace Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> * Disable metadata compression for < V2 format Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com> --------- Signed-off-by: Prasad Ghangal <prasad.ganghal@veeam.com>	2024-10-23 23:28:23 -07:00
Jarek Kowalski	e36fa78385	feat(snapshots): added support for per-directory splitter overrides (#3887 ) This is useful when backing up directories that have giant files aligned at MiB boundary, such as VM disk backups, etc.	2024-06-07 13:42:15 -07:00
Jarek Kowalski	d0fc1e03c4	fix(server): do not make blocking calls inside server status API (#3666 ) also reduce global server lock scope	2024-02-21 12:34:16 -08:00
Jarek Kowalski	524ffaf4b8	refactor(repository): added context to potentially blocking repository methods (#3654 ) Primarily for wiring a context.Context to a call to content.Manager.refresh, which was using a detached context.	2024-02-20 14:48:23 -08:00
Jarek Kowalski	51dcaa985d	chore(ci): upgraded linter to 1.48.0 (#2294 ) Mechanically fixed all issues, added `lint-fix` make target.	2022-08-09 06:07:54 +00:00
Jarek Kowalski	23299c3451	refactor(repository): ensure MutableParameters are never cached (#2284 )	2022-08-06 18:11:32 -07:00
Jarek Kowalski	68b8afd43f	feat(snapshots): improved performance when uploading huge files (#2064 ) * feat(snapshots): improved performance when uploading huge files This is controlled by an upload policy which specifies the size threshold above which individual files are uploaded in parts and concatenated. This allows multiple threads to run splitting, hashing, compression and encryption in parallel, which was previously only possible across multiple files, but not when a single file was being uploaded. The default is 2GiB for now, so this feature only kicks in for very large files. In the future we may lower this. Benchmark involved uploading a single 42.1 GB file which was a VM disk snapshot of fresh Ubuntu installation (fresh EXT4 partition with lots of zero bytes) to a brand-new filesystem repository on local SSD of M1 Pro Macbook Pro 2021. * before: 59-63s (~700 MB/s) * after: 15-17s (~2.6 GB/s) * additional test to ensure files are really e2e readable	2022-06-24 07:38:07 +00:00
Jarek Kowalski	9bf9cac7fb	refactor(repository): ensure we always parse content.ID and object.ID (#1960 ) * refactor(repository): ensure we always parse content.ID and object.ID This changes the types to be incompatible with string to prevent direct conversion to and from string. This has the additional benefit of reducing number of memory allocations and bytes for all IDs. content.ID went from 2 allocations to 1: typical case 32 characters + 16 bytes per-string overhead worst-case 65 characters + 16 bytes per-string overhead now: 34 bytes object.ID went from 2 allocations to 1: typical case 32 characters + 16 bytes per-string overhead worst-case 65 characters + 16 bytes per-string overhead now: 36 bytes * move index.{ID,IDRange} methods to separate files * replaced index.IDFromHash with content.IDFromHash externally * minor tweaks and additional tests * Update repo/content/index/id_test.go Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com> * Update repo/content/index/id_test.go Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com> * pr feedback * post-merge fixes * pr feedback * pr feedback * fixed subtle regression in sortedContents() This was actually not producing invalid results because of how base36 works, just not sorting as efficiently as it could. Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>	2022-05-25 14:15:56 +00:00
Jarek Kowalski	daa62de3e4	chore(ci): added checklocks static analyzer (#1838 ) From https://github.com/google/gvisor/tree/master/tools/checklocks This will perform static verification that we're using `sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access to certain fields. This was mostly just a matter of adding annotations to indicate which fields are guarded by which mutex. In a handful of places the code had to be refactored to allow static analyzer to do its job better or to not be confused by some constructs. In one place this actually uncovered a bug where a function was not releasing a lock properly in an error case. The check is part of `make lint` but can also be invoked by `make check-locks`.	2022-03-19 22:42:59 -07:00
Jarek Kowalski	e67f84e0ba	chore(general): updated linter to 1.44.0 (#1681 )	2022-01-25 21:21:13 -08:00
Jarek Kowalski	bbbef44d8a	More coverage improvements (#1577 ) * increased direct coverage for internal/cache * object: code coverage improvements for object writer	2021-12-11 23:27:42 -08:00
Jarek Kowalski	8b760b66a8	logging: added memoization of Logger instances per context (#1369 )	2021-10-09 05:02:18 -07:00
Jarek Kowalski	792cc874dc	repo: allow reusing of object writer buffers (#1315 ) This reduces memory consumption and speeds up backups. 1. Backing up kopia repository (3.5 GB files:133102 dirs:20074): before: 25s, 490 MB after: 21s, 445 MB 2. Large files (14.8 GB, 76 files) before: 30s, 597 MB after: 28s, 495 MB All tests repeated 5 times for clean local filesystem repo.	2021-09-25 14:54:31 -07:00
Jarek Kowalski	35d0f31c0d	huge: replaced the use of allocated byte slices with populating gather.WriteBuffer in the repository (#1244 ) This helps recycle buffers more efficiently during snapshots. Also, improved memory tracking, enabled profiling flags and added pprof by default.	2021-08-20 08:45:10 -07:00
Jarek Kowalski	40510c043d	Support for content-level compression (#1076 ) * cli: added a flag to create repository with v2 index features * content: plumb through compression.ID parameter to content.Manager.WriteContent() * content: expose content.Manager.SupportsContentCompression This allows object manager to decide whether to create compressed object or let the content manager do it. * object: if compression is requested and the repo supports it, pass compression ID to the content manager * cli: show compression status in 'repository status' * cli: output compression information in 'content list' and 'content stats' * content: compression and decompression support * content: unit tests for compression * object: compression tests * testing: added integration tests against v2 index * testing: run all e2e tests with and without content-level compression * htmlui: added UI for specifying index format on creation * cli: additional tests for 'content ls' and 'content stats' * applied pr suggestions	2021-05-22 05:35:27 -07:00
Jarek Kowalski	30ca3e2e6c	Upgraded linter to 1.40.1 (#1072 ) * tools: upgraded linter to 1.40.1 * lint: fixed nolintlint violations * lint: disabled tagliatelle linter * lint: fixed remaining warnings	2021-05-15 12:12:34 -07:00
Jarek Kowalski	f4347886b8	logging: simplified log levels (#954 ) Removed Warning, Notify and Fatal: * `Warning` => `Error` or `Info` * `Notify` => `Info` * `Fatal` was never used. Note that --log-level=warning is still supported for backwards compatibility, but it is the same as --log-level=error. Co-authored-by: Julio López <julio+gh@kasten.io>	2021-04-09 07:27:35 -07:00
Jarek Kowalski	66cebb79cb	Fixed empty object IDs in checkpoints (#649 ) * object: fixed race condition between Result() and Checkpoint() This would sometimes result in indirect objects having empty object IDs. Fixes #648 * upload: ensure checkpoints never contain empty object IDs. * testing: reduce armhf test weight	2020-09-29 07:14:47 -07:00
Jarek Kowalski	f0b97b960b	Fixed checkpointing to not restart the entire upload process (#594 ) * object: added Checkpoint() method to object writer * upload: refactored code structure to allow better checkpointing * upload: removed Checkpoint() method from UploadProgress * Update fs/entry.go Co-authored-by: Julio López <julio+gh@kasten.io>	2020-09-12 22:36:22 -07:00
Jarek Kowalski	e22d22dba2	object: implemented fast concatenation of objects by merging their index entries (#607 )	2020-09-11 20:12:01 -07:00
Jarek Kowalski	faf280616a	Splitter throughput improvements (#606 ) * object: refactored writer to detect split points before writing This introduces new primitive that will be moved into splitters themselves in subsequent commits. I'm doing this in small steps to ensure we don't regress at any time. * splitter: refactored TestSplitters test This uses the slow (byte-by-byte) and fast (nextSplitPoint) methods of determining split points. Note nextSplitPoint is not implemented by splitters yet, but this verifies that the test is expecting the right thing. * object: splitter refactoring - replaced ShouldSplit() with NextSplitPoint() everywhere, still not optimized * splitter: added additional dimension to splitter_test We split either in large chunks or one byte at a time to catch the corner cases in the splitter implementation. * splitter: optimized splitters using NextSplitPoint primitive This improves splitter performance by about 40% (buzhash) and makes it virtually free for FIXED splitter.	2020-09-11 19:45:48 -07:00
Jarek Kowalski	9a6dea898b	Linter upgrade to v1.30.0 (#526 ) * fixed godot linter errors * reformatted source with gofumpt * disabled some linters * fixed nolintlint warnings * fixed gci warnings * lint: fixed 'nestif' warnings * lint: fixed 'exhaustive' warnings * lint: fixed 'gocritic' warnings * lint: fixed 'noctx' warnings * lint: fixed 'wsl' warnings * lint: fixed 'goerr113' warnings * lint: fixed 'gosec' warnings * lint: upgraded linter to 1.30.0 * lint: more 'exhaustive' warnings Co-authored-by: Nick <nick@kasten.io>	2020-08-12 19:28:53 -07:00
Jarek Kowalski	573d10422a	object: ensure that all I objects have a content prefix. When prefix is not specified on ObjectWriter, we force 'x' content prefix on intermediate contents, so object IDs will look like: Ix{hash} This ensures the index contents will be stored in `q` blobs, making `snapshot gc` easier.	2020-04-12 23:55:09 -07:00
Jarek Kowalski	8687f1c008	object: added AsyncWrites to ObjectWriter, which improves performance… (#369 ) * object: added AsyncWrites to ObjectWriter, which improves performance of uploading of a single file Fixes #351 Co-Authored-By: Julio López <julio+gh@kasten.io>	2020-03-22 09:02:33 -07:00
Jarek Kowalski	239d809075	performance: introduced buf.Pool which helps reuse memory buffers (#345 ) * performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression * repo: switched object writer to buf.Pool * content: switched encryption to use buf.Pool * object: switched compression to use buf.Pool * testing: added missing content manager Close()	2020-03-18 20:42:16 -07:00
Jarek Kowalski	8d452a8285	performance: improvements to object manager (#336 ) - added pooled splitters and ability to reset them without having to recreate - added support for caller-provided compressor output to be able to pool it - added pooling of compressor instances, since those are costly	2020-03-13 08:56:18 -07:00
Jarek Kowalski	d181403284	crypto: refactored encryption, hashing and splitter into separate packages (#274 ) Added some tests, deleted XSALSA20 which never worked E2E	2020-02-27 12:36:49 -08:00
Jarek Kowalski	0b8c4d0ef9	object: fixed compression bug where we were not clearing the buffer. This effectively defeated the purpose of compression, caused high memory usage and other kinds of bad behavior. Refactored the code to prevent this issue by resetting the buffer at the caller, not the callee. Fixed previous e2e test to catch the issue mentioned in #166, verified it fails against master and passes with this change.	2020-01-09 16:36:57 -08:00
Jarek Kowalski	ac70a38101	lint: upgraded to 1.22.2 and make lint issues a build failure. Fixed or silenced linter warnings, mostly due to magic numeric constants	2020-01-03 16:39:30 -08:00
Jarek Kowalski	2ba4e83cef	moved all compression to separate package and sanitized identifiers	2019-12-10 23:25:28 -08:00
Jarek Kowalski	aec3cdcb2f	object: added support for compressed objects	2019-12-10 23:25:28 -08:00
Jarek Kowalski	6217df1a87	lint: switched to 1.21 and fixed a ton of whitespace issues discovered by new wsl linter	2019-11-26 06:49:49 -08:00
Jarek Kowalski	54edb97b3a	refactoring: renamed repo/block to repo/content Also introduced strongly typed content.ID and manifest.ID (instead of string) This aligns identifiers across all layers of repository: blob.ID content.ID object.ID manifest.ID	2019-06-01 22:24:19 -07:00
Jarek Kowalski	63303904e1	switched remaining fmt.Errorf to errors.Wrap()	2019-06-01 10:57:05 -07:00
Jarek Kowalski	03339c18af	[breaking change] deprecated DYNAMIC splitter due to license issue. The splitter in question was depending on github.com/silvasur/buzhash which is not licensed according to FOSSA bot. Switched to new faster implementation of buzhash, which is unfortunately incompatible and will split the objects in different places. This change is semi-breaking - old repositories can be read, but when uploading large objects they will be re-uploaded where previously they would be de-duped. Also added 'benchmark splitters' subcommand and moved 'block cryptobenchmark' subcommand to 'benchmark crypto'.	2019-05-30 22:20:45 -07:00
Jarek Kowalski	0c41d41276	Fixed up paths after merge	2019-05-27 15:48:39 -07:00
Jarek Kowalski	327d8317d8	refactored repo/ into separate github.com/kopia/repo/ git repository	2018-10-26 20:40:57 -07:00
Jarek Kowalski	1b014c875a	simplified repository API password handling. completely rewrote password storage: - by default passwords are kept in OS-specific keyring (Keychain on macOS, Windows Credentials Manager on Windows), which can be optionally disabled to store password in a local file. - on Linux keychain is disabled by default (does not work reliably in terminal sessions), but can be enabled using command-line flag.	2018-09-07 21:34:31 -07:00
Jarek Kowalski	91066f2469	reorganized low-level repository packages by moving them all under kopia/kopia/repo/	2018-08-30 22:01:05 -07:00

39 Commits