Use non-formatting logging functions for message without formatting.
For example, `log.Info("message")` instead of `log.Infof("message")`
Configure linter for printf-like functions
Code movement and simplification, no functional changes.
Objectives:
- Allow callers specifying the needed key (or hash) size, instead of
hard-coding it in the registered PBK derivers. Conceptually, the caller
needs to specify the key size, since that is a requirement of the
(encryption) algorithm being used in the caller. Now, the code changes
here do not result in any functional changes since the key size is
always 32 bytes.
- Remove a global definition for the default PB key deriver to use.
Instead, each of the 3 use case sets the default value.
Changes:
- `crypto.DeriveKeyFromPassword` now takes a key size.
- Adds new constants for the key sizes at the callers.
- Removes the global `crypto.MasterKeySize` const.
- Removes the global `crypto.DefaultKeyDerivationAlgorithm` const.
- Adds const for the default derivation algorithms for each use case.
- Adds a const for the salt length in the `internal/user` package, to ensure
the same salt length is used in both hash versions.
- Unexports various functions, variables and constants in the `internal/crypto`
& `internal/user` packages.
- Renames various constants for consistency.
- Removes unused functions and symbols.
- Renames files to be consistent and better reflect the structure of the code.
- Adds a couple of tests to ensure the const values are in sync and supported.
- Fixes a couple of typos
Followups to:
- #3725
- #3770
- #3779
- #3799
- #3816
The individual commits show the code transformations to simplify the
review of the changes.
* feat(cli): print upgrade owner in repository status
To help users understand the state of their repository better, this one
line change also prints out the upgrade owner's ID in the output of
`kopia repository status`.
* Upgrade `create --format-version` help message
To show that there is now a format version 3 that can be set.
* Encryptor pipeline
* Added ECC related options to repository create cli command
* Fix for lint errors
* Fixing comments from the PR
* Fixed lint errors
* Changes requested in PR
* Created e2e test
* refactor(repository): moved format blob management to separate package
This is completely mechanical, no behavior changes, only:
- moved types and functions to a new package
- adjusted visibility where needed
- added missing godoc
- renamed some identifiers to align with current usage
- mechanically converted some top-level functions into member functions
- fixed some mis-named variables
* refactor(repository): moved content.FormatingOptions to format.ContentFormat
* feat(infra): improved support for in-process testing
* support for killing of a running server using simulated Ctrl-C
* support for overriding os.Stdin
* migrated many tests from the exe runner to in-process runner
* added required indirection when defining Envar() so we can later override it in tests
* refactored CLI runners by moving environment overrides to CLITestEnv
New flag `--enable-jaeger-collector` and the corresponding
`KOPIA_ENABLE_JAEGER_COLLECTOR` environment variable enables Jaeger
exporter, which by default sends OTEL traces to Jaeger collector on
http://localhost:14268/api/traces
To change this, use environment variables:
* `OTEL_EXPORTER_JAEGER_ENDPOINT`
* `OTEL_EXPORTER_JAEGER_USER`
* `OTEL_EXPORTER_JAEGER_PASSWORD`
When tracing is disabled, the impact on performance is negligible.
To see this in action:
1. Download latest Jaeger all-in-one from https://www.jaegertracing.io/download/
2. Run `jaeger-all-in-one` binary without any parameters.
3. Run `kopia --enable-jaeger-collector snapshot create ...`
4. Go to http://localhost:16686/search and search for traces
When enabled, metrics are pushed to the provided Prometheus Push
Gateway at the start and end of each command and periodically every
few seconds.
```
--metrics-push-addr=http://address:port
--metrics-push-interval=5s
--metrics-push-job=kopia
--metrics-push-grouping=a:b --metrics-push-grouping=c:d
--metrics-push-username=user
--metrics-push-password=pass
```
* refactor cli tests to allow the use of in-memory mock
* use in-memory repo for set-parameters cli tests
* move inmemory storage provider into test package
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
* feat: persisting retention options in repository blob
- plumb retention parameters through wrapped storage
- generalize aes encryption mechanism
- rewrite the retention blob on password change
- do not write retention blob when empty
* handle retention-blob not-found failures
* cli params to set retention modes on repository create
* enable versioned map mock storage with retention settings
* adding unit tests
* write format and retention blob with retention settings if available
* rename certain functions and constants specific to format blob
* delete retention cache on password-change
* fix: replace SetTime() api call with TouchBlob()
* Update repo/repository_test.go
Co-authored-by: Nick <nick@kasten.io>
* pr feedback and codecov improvements
* fix: rename retention-blob structures to generic blob-cfg
* fix: remove minio dependency on retention constants
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>
Added policy.Definition which allows us to precisely report where
each piece of policy came from.
Fixed a one-off bug with "noParent", which prevented merging of parent
policies one level too soon.
Added a whole bunch of merging helpers and generic reflection-based
test that ensures every single merge is tested.
* blob: changed default shards from {3,3} to {1,3}
Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.
Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.
The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).
* fixed compat tests
* content: fixed repo upgrade version
Previously upgrade would enable epoch manager and index v2 but would
not set the version of the format itself. Everything worked fine
but it would not protect from old kopia opening the repository.
* ci: added compatibility test that uses real 0.8 and current binaries
* testing: KOPIA_TEST_LOG_OUTPUT logs subcommand outputs
* cli: additional flags for 'blob list'
* Makefile: run all tests against epoch-based index manager
* epoch: added support for deletion watermark, which keeps track of latest maintenance which dropped index entries
* content: added deletion watermark to content manager
* maintenance: improved maintenance without safety to force rewrites
* maintenance: skip quick maintenance when epoch manager is enabled
* maintenance: do not enable quick maintenance when epoch manager is used
* testing: skip TestIndexOptimize when running against epoch manager-backed index strutures
* cli: added 'repository validate-provider' which runs a set of tests against blob storage provider to validate it
This implements a provider tests which exercises subtle behaviors which are not always correctly implemented by providers claiming compatibility with S3, for example.
The test checks:
- not found behavior
- prefix scans
- timestamps
- write atomicity
* retry: improved error message on failure
* rclone: fixed stats reporting and awaiting for completion
* webdav: prevent panic when attempting to mkdir with empty name
* testing: run providervalidation.ValidateProvider as part of regular provider tests
* cli: print a recommendation to validate provider after repository creation
* repo: added 'enable password change' flag (defaults to true for new repositories), which prevents embedding replicas of kopia.repository in pack blobs
* cli: added 'repo change-password' which can change the password of a connected repository
* repo: nit - renamed variables and functions dealing with key derivation
* repo: fixed cache validation HMAC secret to use stored HMAC secret instead of password-derived one
* cli: added test for repo change-password
* repo: negative cases for attempting to change password in an old repository
* Update cli/command_repository_change_password.go
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
* epoch: misc fixes and logging
* blob: misc helpers
* cli: removed useless 'repository upgrade', replaced by 'repository set-parameters'
* content: implemented indexBlobManagerV1 which uses epoch manager
* cli: commands to manipulate repository epoch parameters
* cli: commands to examine epoch-based indexes
* content: added test suite that uses epoch-based index manager
* content: fixed a ton of test data races caused by sharing blobtesting.DataMap
* cli: additional tests and validation for 'repository set-params'
* testing: replaced the use of suite with our own, since suite is not parallelizable
* logging: added logger wrappers for Broadcast and Prefix
* nit: moved max hash size to a named constant
* content: added internal logger
* content: replaced context-based logging with explicit Loggers
This will capture the logger.Logger associated with the context when
the repository is opened and will reuse it for all logs instead of
creating new logger for each log message.
The new logger will also write logs to the internal logger in addition
to writing to a log file/console.
* cli: allow decrypting all blobs whose names start with _
* maintenance: added logs cleanup
* cli: commands to view logs
* cli: log selected command on each write session
* cli: added a flag to create repository with v2 index features
* content: plumb through compression.ID parameter to content.Manager.WriteContent()
* content: expose content.Manager.SupportsContentCompression
This allows object manager to decide whether to create compressed object
or let the content manager do it.
* object: if compression is requested and the repo supports it, pass compression ID to the content manager
* cli: show compression status in 'repository status'
* cli: output compression information in 'content list' and 'content stats'
* content: compression and decompression support
* content: unit tests for compression
* object: compression tests
* testing: added integration tests against v2 index
* testing: run all e2e tests with and without content-level compression
* htmlui: added UI for specifying index format on creation
* cli: additional tests for 'content ls' and 'content stats'
* applied pr suggestions
* cli: fixed remaining testability indirections for output and logging
* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process
* tests: refactored most e2e tests to invoke kopia subcommands in-process
* Makefile: enable code coverage for cli/ and internal/
* testing: pass 'testing' tag to unit tests which uses much faster (insecure) password hashing scheme
* Makefile: push coverage from PRs again
* tests: disable buffer management to reduce memory usage on ARM
* cli: fixed misaligned atomic field on ARMHF
also temporarily fixed statup-time benign race condition when setting
default on the timeZone variable, which is the last global variable.
cli: major refactoring of how CLI commands are registered
The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attached the flags or
subcommand to the provided parent.
This change is 94.3% mechanical, but is fully organic and hand-made.
* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`
- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
interface
Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.
`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).
* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session
* cli: disable maintenance in 'kopia server start'
Server will close the repository before completing.
* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
- used atomic to manage SharedManager.closed
* removed stale example
* linter: fixed spurious failures
Co-authored-by: Julio López <julio+gh@kasten.io>
* linter: upgraded to 1.33, disabled some linters
* lint: fixed 'errorlint' errors
This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.
Verified that there are no exceptions for errorlint linter which
guarantees that.
* lint: fixed or suppressed wrapcheck errors
* lint: nolintlint and misc cleanups
Co-authored-by: Julio López <julio+gh@kasten.io>
* logging: cleaned up stderr logging
- do not show module
- do not show timestamps by default (enable with --console-timestamps)
* logging: replaced most printStderr() with log.Info
* cli: additional logging cleanup
Support for remote content repository where all contents and
manifests are fetched over HTTP(S) instead of locally
manipulating blob storage
* server: implement content and manifest access APIs
* apiclient: moved Kopia API client to separate package
* content: exposed content.ValidatePrefix()
* manifest: added JSON serialization attributes to EntryMetadata
* repo: changed repo.Open() to return Repository instead of *DirectRepository
* repo: added apiServerRepository
* cli: added 'kopia repository connect server'
This sets up repository connection via the API server instead of
directly-manipulated storage.
* server: add support for specifying a list of usernames/password via --htpasswd-file
* tests: added API server repository E2E test
* server: only return manifests (policies and snapshots) belonging to authenticated user
Maintenance: support for automatic GC
Moved maintenance algorithms from 'cli' to 'repo/maintenance' package
Added support for CLI commands:
kopia gc - performs quick maintenance
kopia gc --full- perform full maintenance
Full maintenance performs snapshot gc, but it's not safe to do this automatically possibly in parallel to snapshots being taken. This will be addressed ~0.7 timeframe.
New ciphers are using authenticated encryption with associated data
(AEAD) and per-content key derived using HMAC-SHA256:
* AES256-GCM-HMAC-SHA256
* CHACHA20-POLY1305-HMAC-SHA256
They support content IDs of arbitrary length and are quite fast:
On my 2019 MBP:
- BLAKE2B-256 + AES256-GCM-HMAC-SHA256 - 648.7 MiB / second
- BLAKE2B-256 + CHACHA20-POLY1305-HMAC-SHA256 - 597.1 MiB / second
- HMAC-SHA256 + AES256-GCM-HMAC-SHA256 351 MiB / second
- HMAC-SHA256 + CHACHA20-POLY1305-HMAC-SHA256 316.2 MiB / second
Previous ciphers had several subtle issues:
* SALSA20 encryption, used weak nonce (64 bit prefix of content ID),
which means that for any two contents, whose IDs that have the same
64-bit prefix, their plaintext can be decoded from the ciphertext
alone.
* AES-{128,192,256}-CTR were not authenticated, so we were
required to hash plaintext after decryption to validate. This is not
recommended due to possibility of subtle timing attacks if an attacker
controls the ciphertext.
* SALSA20-HMAC was only validating checksum and not that the ciphertext
was for the correct content ID.
New repositories cannot be created using deprecated ciphers, but they
will still be supported for existing repositories, until at least 0.6.0.
The users are encouraged to migrate to one of new ciphers when 0.5.0 is
out.
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)
This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.