* remove deprecated `snapshot gc` command
* run `maintenance` instead of `snapshot gc` in robustness
* use `maintenance` command instead of `gc` alias for clarity
* use `maintenance run` in `TestSnapshotDeleteRestore`
This commit changes the behavior of the command
`kopia repo upgrade begin...` so that it no longer fails (exit code 1) when the repository is already using the latest format version. Instead, a helpful message is printed and the program exits with code zero. In effect the command becomes idempotent: successive upgrades return the same exit code. Such an idempotent API is desirable, especially where automation is built around format upgrades.
Before this change, exit code 1 was returned when upgrading a repository that is already up to date:
```
$ kopia repo status | grep "Format Version"
Format version: 3
$ kopia repo upgrade begin --upgrade-owner-id admin
[1] ERROR error setting the upgrade lock intent: repository is using version 3, and version 3 is the maximum
```
After this change, exit code 0 is returned:
```
$ kopia repo upgrade begin --upgrade-owner-id admin
[0] Repository format is already upto date.
```
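The idempotent behavior described above can be sketched roughly as follows. This is a minimal illustration, not kopia's implementation: `maxFormatVersion`, `errAlreadyUpToDate`, `beginUpgrade` and `exitCode` are all hypothetical names introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// maxFormatVersion is a hypothetical stand-in for the newest supported format version.
const maxFormatVersion = 3

// errAlreadyUpToDate is a hypothetical sentinel error for the "nothing to do" case.
var errAlreadyUpToDate = errors.New("repository format is already up to date")

// beginUpgrade is a hypothetical stand-in for the upgrade step.
func beginUpgrade(currentVersion int) error {
	if currentVersion >= maxFormatVersion {
		return errAlreadyUpToDate
	}
	// The real upgrade work would happen here.
	return nil
}

// exitCode maps the result to a process exit code.
// "Already up to date" counts as success, which makes repeated runs idempotent.
func exitCode(err error) int {
	switch {
	case err == nil, errors.Is(err, errAlreadyUpToDate):
		return 0
	default:
		return 1
	}
}

func main() {
	err := beginUpgrade(3)
	if errors.Is(err, errAlreadyUpToDate) {
		fmt.Println("Repository format is already up to date.")
	}
	fmt.Println("exit code:", exitCode(err))
}
```

With this shape, automation can call the upgrade step repeatedly and treat every zero exit code as "the repository is at the latest version".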
* feat(server): improved server shutdown and integration tests
Added `--shutdown-grace-period` flag to `kopia server start` command
which can be used to specify how long the server will wait for active
connections to finish before forcibly shutting down.
This allowed removal of the final out-of-process execution
during integration tests and of the `integration-tests` target,
which was running the same tests as `tests` but in out-of-process mode.
We thus now have all the test coverage in-process without having to
build and launch the `kopia` binary.
* fixed logging
* increase test timeout
* speed up and/or parallelize longest-running tests
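As a rough illustration of what a shutdown grace period means for an HTTP server, the sketch below uses the standard `net/http` API. It is an assumption about the general technique, not kopia's server code; `shutdownWithGrace` and `runDemo` are hypothetical helpers.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// shutdownWithGrace waits up to gracePeriod for active connections to finish.
// If the grace period elapses first, remaining connections are closed forcibly.
func shutdownWithGrace(srv *http.Server, gracePeriod time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), gracePeriod)
	defer cancel()

	err := srv.Shutdown(ctx)
	if errors.Is(err, context.DeadlineExceeded) {
		// Grace period expired: drop whatever is still open.
		return srv.Close()
	}

	return err
}

// runDemo starts a server on a random loopback port and shuts it down again.
func runDemo() error {
	srv := &http.Server{Addr: "127.0.0.1:0"}
	served := make(chan error, 1)

	go func() { served <- srv.ListenAndServe() }()

	if err := shutdownWithGrace(srv, 2*time.Second); err != nil {
		return err
	}

	// ListenAndServe reports http.ErrServerClosed after a clean shutdown.
	if err := <-served; !errors.Is(err, http.ErrServerClosed) {
		return err
	}

	return nil
}

func main() {
	fmt.Println("shutdown error:", runDemo())
}
```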
* feat(cli): print upgrade owner in repository status
To help users better understand the state of their repository, this
one-line change also prints the upgrade owner's ID in the output of
`kopia repository status`.
* Upgrade `create --format-version` help message
To show that there is now a format version 3 that can be set.
* Return ReadCloser from StreamingFile
Allow better resource management by returning something that can be closed
when dealing with StreamingFiles.
* Close StreamingFile Reader during upload
* Use NopCloser on inputs that don't implement Close
Fixup callers of the StreamingFile API by wrapping regular Readers with
NopCloser calls where necessary.
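A minimal sketch of the wrapping pattern, using the standard library's `io.NopCloser`. The helpers `readAndClose` and `fromString` are hypothetical and only illustrate a consumer that expects an `io.ReadCloser`.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// readAndClose is a hypothetical consumer that owns the stream and closes it when done.
func readAndClose(rc io.ReadCloser) (string, error) {
	defer rc.Close()

	data, err := io.ReadAll(rc)
	if err != nil {
		return "", err
	}

	return string(data), nil
}

// fromString wraps a plain Reader, which has no Close method, in a no-op closer.
func fromString(s string) io.ReadCloser {
	return io.NopCloser(strings.NewReader(s))
}

func main() {
	content, err := readAndClose(fromString("hello"))
	fmt.Println(content, err)
}
```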
Almost all were easy to replace, except ones exposed via JSON which
have been left as-is.
The linter has a cool behavior where it flags attempts to pass, for
example, an `atomic.Int32` by value, which is always a mistake,
say as an argument to `fmt.Sprintf()`.
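The sketch below shows the safe pattern: keep the atomic value behind a pointer and pass the result of `Load()` to formatting functions. The `counter` type and helper names are hypothetical; the by-value mistake is the kind of thing checks such as `go vet`'s `copylocks` report.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// counter is a hypothetical struct holding an atomic field.
type counter struct {
	hits atomic.Int32
}

// describe takes a pointer so the atomic value is never copied.
func describe(c *counter) string {
	// Pass the loaded int32, not the atomic.Int32 itself.
	return fmt.Sprintf("hits=%d", c.hits.Load())
}

// countInParallel increments the counter from n goroutines and formats the result.
func countInParallel(n int) string {
	var (
		c  counter
		wg sync.WaitGroup
	)

	for i := 0; i < n; i++ {
		wg.Add(1)

		go func() {
			defer wg.Done()
			c.hits.Add(1)
		}()
	}

	wg.Wait()

	return describe(&c)
}

func main() {
	fmt.Println(countInParallel(100))
}
```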
This may be a breaking change for users who rely on particular kopia metrics (unlikely):
- introduced blob-level metrics:
* `kopia_blob_download_full_blob_bytes_total`
* `kopia_blob_download_partial_blob_bytes_total`
* `kopia_blob_upload_bytes_total`
* `kopia_blob_storage_latency_ms` - per-method latency distribution
* `kopia_blob_errors_total` - per-method error counter
- updated cache metrics to indicate particular cache
* `kopia_cache_hit_bytes_total{cache="CACHE_TYPE"}`
* `kopia_cache_hit_total{cache="CACHE_TYPE"}`
* `kopia_cache_malformed_total{cache="CACHE_TYPE"}`
* `kopia_cache_miss_total{cache="CACHE_TYPE"}`
* `kopia_cache_miss_errors_total{cache="CACHE_TYPE"}`
* `kopia_cache_miss_bytes_total{cache="CACHE_TYPE"}`
* `kopia_cache_store_errors_total{cache="CACHE_TYPE"}`
where `CACHE_TYPE` is one of `contents`, `metadata` or `index-blobs`
- reorganized and unified content-level metrics:
* `kopia_content_write_bytes_total`
* `kopia_content_write_duration_nanos_total`
* `kopia_content_compression_attempted_bytes_total`
* `kopia_content_compression_attempted_duration_nanos_total`
* `kopia_content_compression_savings_bytes_total`
* `kopia_content_compressible_bytes_total`
* `kopia_content_non_compressible_bytes_total`
* `kopia_content_after_compression_bytes_total`
* `kopia_content_decompressed_bytes_total`
* `kopia_content_decompressed_duration_nanos_total`
* `kopia_content_encrypted_bytes_total`
* `kopia_content_encrypted_duration_nanos_total`
* `kopia_content_hashed_bytes_total`
* `kopia_content_hashed_duration_nanos_total`
* `kopia_content_deduplicated_bytes_total`
* `kopia_content_read_bytes_total`
* `kopia_content_read_duration_nanos_total`
* `kopia_content_decrypted_bytes_total`
* `kopia_content_decrypted_duration_nanos_total`
* `kopia_content_uploaded_bytes_total`
Also introduced `internal/metrics` framework which constructs Prometheus metrics in a uniform way and will allow us to include some of these metrics in telemetry report in future PRs.
This removes tons of boilerplate code around:
- retry loop
- connection management
- storage registration
* used generics in runInParallel
* introduced generics in freepool
* introduced strong typing for workshare.Pool and workshare.AsyncGroup
* fixed linter error on openbsd
Lack of generics support is blocking various dependency upgrades,
so this unblocks that.
Temporarily disabled `checklocks` linter until it is fixed upstream.
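To illustrate what generics buy in a helper of this kind, here is a hypothetical generic `runInParallel`. It shares only the name with kopia's function; the signature and behavior are assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"sync"
)

// runInParallel is a hypothetical generic helper: it applies fn to every input
// in its own goroutine and returns the results in input order.
func runInParallel[T, R any](inputs []T, fn func(T) (R, error)) ([]R, error) {
	results := make([]R, len(inputs))
	errs := make([]error, len(inputs))

	var wg sync.WaitGroup

	for i, in := range inputs {
		wg.Add(1)

		go func(i int, in T) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no locking is needed.
			results[i], errs[i] = fn(in)
		}(i, in)
	}

	wg.Wait()

	// Report the first error in input order, if any.
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}

	return results, nil
}

// squares shows a typed call site: no interface{} casts are required.
func squares(nums []int) ([]int, error) {
	return runInParallel(nums, func(n int) (int, error) { return n * n, nil })
}

func main() {
	out, err := squares([]int{1, 2, 3})
	fmt.Println(out, err)
}
```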
* Update display on repository summary
* Apply throughout app
* Situate units_test
* Update Command Line documentation
* Envar cleanup
* Rename to BytesString
* Restore envar string available for test
* Remove extraneous empty check and restore UIPreferences field for frontend
* PR: config bool cleanup and missed `BaseEnv`s
* Fix lint and test
In the function that parses the tags passed to the create snapshot
command, a tag with an incorrect format produced an error message
that did not show the tag itself, making such errors difficult to
debug. This commit includes the offending tag in the error message.
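A small sketch of the idea follows. The `key:value` tag format, the function name `parseTags` and the error wording are assumptions made for illustration, not kopia's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags is a hypothetical parser that assumes tags look like "key:value".
func parseTags(tags []string) (map[string]string, error) {
	result := make(map[string]string, len(tags))

	for _, tag := range tags {
		key, value, found := strings.Cut(tag, ":")
		if !found || key == "" {
			// Quote the offending tag so the user can see exactly what was rejected.
			return nil, fmt.Errorf("invalid tag format %q, expected key:value", tag)
		}

		result[key] = value
	}

	return result, nil
}

func main() {
	_, err := parseTags([]string{"env:prod", "broken"})
	fmt.Println(err)
}
```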
* feat(cli): Allow restore from snapshotted path
* Find files in multiple snapshots
* Added --snapshot-time to restore
* Added restore by path test
* More timespec formats
* Test for snapshot list with a file in multiple snapshots
* Handle restore without target path
* Fix for tests
* Made changes requested in PR and rebased
* implemented format blob cache abstraction
* moved upgrade lock logic to repo/format
* moved set parameters logic to repo/format
* moved change password functionality to repo/format
* mechanical changes
* mechanical changes to react to format manager interface
* get current repository format bytes instead of static
* implemented format.Manager which dynamically fetches and caches latest format blob
* repo changes to use format.Manager
* fixed failing unit test due to different timings
* reduced lock contention by using RWMutex
* serve immutable parts of format without any locks
* increase test timeout
* fixed handling of negative validDuration
The new rules are:
- validDuration < 0 - ignore initial cached file, refresh every 15min
- validDuration > 15min - refresh every 15 minutes
- validDuration > 0 && validDuration <= 15min - refresh using provided
interval (mostly used for testing)
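The rules above can be written down as a small decision function. `refreshPolicy` and `maxRefreshInterval` are hypothetical names; the source does not say what happens when `validDuration` is exactly zero, so the fallback in the default branch is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// maxRefreshInterval mirrors the 15-minute cap stated in the rules.
const maxRefreshInterval = 15 * time.Minute

// refreshPolicy returns the refresh interval and whether the initial cached
// file should be ignored, following the three rules listed above.
func refreshPolicy(validDuration time.Duration) (interval time.Duration, ignoreInitialCache bool) {
	switch {
	case validDuration < 0:
		return maxRefreshInterval, true
	case validDuration > maxRefreshInterval:
		return maxRefreshInterval, false
	case validDuration > 0:
		// 0 < validDuration <= 15min: use the provided interval.
		return validDuration, false
	default:
		// Exactly zero is not covered by the stated rules; falling back to
		// the cap is an assumption of this sketch.
		return maxRefreshInterval, false
	}
}

func main() {
	for _, d := range []time.Duration{-time.Second, time.Minute, time.Hour} {
		interval, ignore := refreshPolicy(d)
		fmt.Println(d, "->", interval, ignore)
	}
}
```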
* Encryptor pipeline
* Added ECC related options to repository create cli command
* Fix for lint errors
* Fixing comments from the PR
* Fixed lint errors
* Changes requested in PR
* Created e2e test
* Initial implementation of ecc using Encryptor interface
* Created benchmark ecc command
* Fixing the order inside the wrapper
* Removed rs_bw because it is always worse
* Fixing naming and adding more comments
* Different approaches depending on file size/space overhead
* Fixes requested in PR
* Fixed lint errors
* Fixes requested in the PR
* Fixed import order
* Fixed more lint errors
* refactor(repository): moved format blob management to separate package
This is completely mechanical, no behavior changes, only:
- moved types and functions to a new package
- adjusted visibility where needed
- added missing godoc
- renamed some identifiers to align with current usage
- mechanically converted some top-level functions into member functions
- fixed some mis-named variables
* refactor(repository): moved content.FormattingOptions to format.ContentFormat
* feat(repository): added `required features` to the repository
This is intended for future compatibility to be able to reliably
stop old kopia client from being able to open a repository when
the old code does not understand new `required feature`.
Required features are checked on startup and periodically using the
same method as upgrade lock, where they will return errors during blob
operations.
* pr feedback
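The required-features check described above could look roughly like this. The feature names, `supportedFeatures` and `checkRequiredFeatures` are all hypothetical; the sketch only shows the principle that a client refuses to proceed when it sees a feature it does not understand.

```go
package main

import "fmt"

// supportedFeatures is a hypothetical set of features this client understands.
var supportedFeatures = map[string]bool{
	"index-v2": true,
}

// checkRequiredFeatures returns an error naming the first required feature
// that this (hypothetical) client does not understand.
func checkRequiredFeatures(required []string) error {
	for _, feature := range required {
		if !supportedFeatures[feature] {
			return fmt.Errorf("repository requires feature %q which this client does not support", feature)
		}
	}

	return nil
}

func main() {
	fmt.Println(checkRequiredFeatures([]string{"index-v2"}))
	fmt.Println(checkRequiredFeatures([]string{"index-v2", "future-feature"}))
}
```

Running the same check both on startup and periodically, as the commit describes, would let a long-running old client notice a newly added requirement.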
When upgrading from legacy to epoch manager-based index, we will write
an intentionally-corrupted index blob, such that old clients won't be
able to understand it when they read the repository index using legacy
format.
The error message emitted by very old clients is not great, but it's
safer to do that rather than corrupt the repository.
Note that this additional safety has a delay of up to 15 minutes
which is the time required for old clients to stop relying on index list
cache in case of very long-running snapshots, server or KopiaUI.
Instead of passing static content.FormattingOptions (and caching it)
we now introduce an interface to provide its values.
This will allow the values to dynamically change at runtime in the
future to support cases like live migration.
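The difference between a cached static value and a provider interface can be sketched as follows. The interface, its single `MaxPackSize` method and the `mutableProvider` type are hypothetical and chosen only to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// formattingOptionsProvider is a hypothetical interface: callers ask for the
// value each time instead of caching a static struct.
type formattingOptionsProvider interface {
	MaxPackSize() int
}

// mutableProvider is a hypothetical provider whose value can change at runtime.
type mutableProvider struct {
	mu          sync.RWMutex
	maxPackSize int
}

func (p *mutableProvider) MaxPackSize() int {
	p.mu.RLock()
	defer p.mu.RUnlock()

	return p.maxPackSize
}

func (p *mutableProvider) setMaxPackSize(n int) {
	p.mu.Lock()
	defer p.mu.Unlock()

	p.maxPackSize = n
}

// packsNeeded consults the provider on every call, so it sees updated values.
func packsNeeded(p formattingOptionsProvider, totalBytes int) int {
	size := p.MaxPackSize()

	return (totalBytes + size - 1) / size
}

func main() {
	p := &mutableProvider{maxPackSize: 10}
	fmt.Println(packsNeeded(p, 25))

	p.setMaxPackSize(5)
	fmt.Println(packsNeeded(p, 25))
}
```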
* kopia format upgrade lock
* Update cli/command_repository_set_parameters_test.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* Update cli/command_repository_upgrade.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* Update cli/command_repository_upgrade.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* pr feedback
* pr feedback
* add a min drain time check
* env var for io-drain-timeout
* fix: add more doctext around upgrade phases
* build: wrap with EnvName
* add experimental warning
* protect upgrade cli behind env variable
* fix conflicts after relocating the upgrade lock
* generalize the command args
* drop certain features as per feedback
* sub-divide the upgrade command into begin and rollback
* Update cli/command_repository_upgrade.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* Update cli/command_repository_upgrade.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* missing return
* rename force flag to allow-unsafe-upgrade
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Ali Dowair <adowair@umich.edu>
Co-authored-by: Shikhar Mall <small@kopia.io>
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
Also hide the flag, since it's not recommended to be tweaked anyway.
The value of <=45m is very important for safety of the garbage collection algorithms - too long an interval between checkpoints could mean that GC treats contents in the middle of being uploaded as unused, because they are not reachable from any snapshots or checkpoints.
Fixes #2193
This was caused by a default `-1ns` which is no longer supported
in latest Kingpin.
The effect was that `kopia cache set` without
`--max-list-cache-duration` would fail. Unfortunately the test was passing
that flag, so the failure was missed.
This was likely caused by https://github.com/alecthomas/kingpin/pull/329
* feat(infra): improved support for in-process testing
* support for killing of a running server using simulated Ctrl-C
* support for overriding os.Stdin
* migrated many tests from the exe runner to in-process runner
* added required indirection when defining Envar() so we can later override it in tests
* refactored CLI runners by moving environment overrides to CLITestEnv
This is caused by a fix where fs.Directory was incorrectly reporting
its size == total size of all files in all subdirectories and
`snapshot list` was relying on that.
Fixes #2144
Some compression algorithms are not recommended because they
allocate disproportionate amounts of memory. They are still
possible to use, just marked as NOT RECOMMENDED in the UI.
This commit renames the sparse restore flag (`kopia snapshot restore`
and `kopia restore`) to conform more with the naming precedents in
the Kopia code. This is a breaking change.
The original motivation can be found here:
https://github.com/kopia/htmlui/pull/61#discussion_r899155054
- removed a bunch of hacks, which should improve logging
performance by avoiding interfaces and data translation. This will
allow the use of de-sugared loggers in performance-critical
logging situations.
- this will also allow using features of ZAP more directly without
having to reimplement them.
- moved logging.Printf() to testlogging
- refactored `uitask` to store logs in a structural format and
present them as JSON only in the UI
- renamed printf_logger.go to printf.go so that fewer columns are used
in the logs
* feat(snapshots): improved performance when uploading huge files
This is controlled by an upload policy which specifies the size
threshold above which individual files are uploaded in parts
and concatenated.
This allows multiple threads to run splitting, hashing, compression
and encryption in parallel, which was previously only possible across
multiple files, but not when a single file was being uploaded.
The default is 2GiB for now, so this feature only kicks in for very
large files. In the future we may lower this.
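The general idea of processing one large file in parallel parts can be sketched as below. This is not kopia's uploader: `splitParts` and `processInParts` are hypothetical, SHA-256 hashing stands in for the per-part splitting, hashing, compression and encryption work, and the ordered result slice stands in for the final concatenation.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync"
)

// splitParts cuts data into consecutive parts of at most partSize bytes.
func splitParts(data []byte, partSize int) [][]byte {
	if partSize <= 0 {
		// Guard against a non-positive size: treat the input as one part.
		return [][]byte{data}
	}

	var parts [][]byte

	for start := 0; start < len(data); start += partSize {
		end := start + partSize
		if end > len(data) {
			end = len(data)
		}

		parts = append(parts, data[start:end])
	}

	return parts
}

// processInParts handles every part in its own goroutine and returns the
// per-part results in the original order.
func processInParts(data []byte, partSize int) [][sha256.Size]byte {
	parts := splitParts(data, partSize)
	sums := make([][sha256.Size]byte, len(parts))

	var wg sync.WaitGroup

	for i, part := range parts {
		wg.Add(1)

		go func(i int, part []byte) {
			defer wg.Done()
			// Each goroutine writes only to its own slot.
			sums[i] = sha256.Sum256(part)
		}(i, part)
	}

	wg.Wait()

	return sums
}

func main() {
	sums := processInParts(make([]byte, 10), 4)
	fmt.Println("parts processed:", len(sums))
}
```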
Benchmark involved uploading a single 42.1 GB file which was a VM disk
snapshot of fresh Ubuntu installation (fresh EXT4 partition with lots
of zero bytes) to a brand-new filesystem repository on local SSD of
M1 Pro Macbook Pro 2021.
* before: 59-63s (~700 MB/s)
* after: 15-17s (~2.6 GB/s)
* additional test to ensure files are really e2e readable