* blob: changed default shards from {3,3} to {1,3}
Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.
Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.
The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).
* fixed compat tests
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
* Support setting AWS S3 storage class for all types of blobs
* Read .storageconfig file
* Improve loading logic
* Hide .storageconfig from ListBlobs()
When --`config-file` is passed as a filename without any directory
(absolute or relative) it is resolved in OS-specific
config path.
For example on macOS:
`--config-file foo.config`
resolves to:
`~/Library/Application Support/kopia/foo.config`
* cli: added 'repository validate-provider' which runs a set of tests against blob storage provider to validate it
This implements a provider tests which exercises subtle behaviors which are not always correctly implemented by providers claiming compatibility with S3, for example.
The test checks:
- not found behavior
- prefix scans
- timestamps
- write atomicity
* retry: improved error message on failure
* rclone: fixed stats reporting and awaiting for completion
* webdav: prevent panic when attempting to mkdir with empty name
* testing: run providervalidation.ValidateProvider as part of regular provider tests
* cli: print a recommendation to validate provider after repository creation
* logging: added logger wrappers for Broadcast and Prefix
* nit: moved max hash size to a named constant
* content: added internal logger
* content: replaced context-based logging with explicit Loggers
This will capture the logger.Logger associated with the context when
the repository is opened and will reuse it for all logs instead of
creating new logger for each log message.
The new logger will also write logs to the internal logger in addition
to writing to a log file/console.
* cli: allow decrypting all blobs whose names start with _
* maintenance: added logs cleanup
* cli: commands to view logs
* cli: log selected command on each write session
* cli: fixed remaining testability indirections for output and logging
* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process
* tests: refactored most e2e tests to invoke kopia subcommands in-process
* Makefile: enable code coverage for cli/ and internal/
* testing: pass 'testing' tag to unit tests which uses much faster (insecure) password hashing scheme
* Makefile: push coverage from PRs again
* tests: disable buffer management to reduce memory usage on ARM
* cli: fixed misaligned atomic field on ARMHF
also temporarily fixed statup-time benign race condition when setting
default on the timeZone variable, which is the last global variable.
* introduced passwordpersist package which has password persistence
strategies (keyring, file, none, multiple) with possibility of adding
more in the future.
* moved all password persistence logic out of 'repo'
* removed global variable repo.EnableKeyRing
cli: major refactoring of how CLI commands are registered
The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attached the flags or
subcommand to the provided parent.
This change is 94.3% mechanical, but is fully organic and hand-made.
* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
Removed Warning, Notify and Fatal:
* `Warning` => `Error` or `Info`
* `Notify` => `Info`
* `Fatal` was never used.
Note that --log-level=warning is still supported for backwards
compatibility, but it is the same as --log-level=error.
Co-authored-by: Julio López <julio+gh@kasten.io>
* cli: added --safety=full|none flag to maintenance commands
This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.
This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with single `--safety` option.
* 'blob gc'
* 'content rewrite'
* 'snapshot gc'
* pr renames
* maintenance: fixed computation of safe time for --safety=none
* maintenance: improved logging for blob gc
* maintenance: do not rewrite truly short, densely packed packs
* mechanical: pass eventual consistency settle time via CompactOptions
* maintenance: add option to disable eventual consistency time buffers with --safety=none
* maintenance: trigger flush at the end of snapshot gc
* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs
* testing: allow debugging of integration tests inside VSCode
* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
* nit: replaced harcoded string constants with named constants
* acl: added management of ACL entries
* auth: implemented DefaultAuthorizer which uses ACLs if any entries are found in the system and falls back to LegacyAuthorizer if not
* cli: switch to DefaultAuthorizer when starting server
* cli: added ACL management
* server: refactored authenticator + added refresh
Authenticator is now an interface which also supports Refresh.
* authz: refactored authorizer to be an interface + added Refresh()
* server: refresh authentication and authorizer
* e2e tests for ACLs
* server: handling of SIGHUP to refresh authn/authz caches
* server: reorganized flags to specify auth options:
- removed '--allow-repository-users' - it's always on
- one of --without-password, --server-password or --random-password
can be specified to specify password for the UI user
- htpasswd-file - can be specified to provide password for UI or remote
users
* cli: moved 'kopia user' to 'kopia server user'
* server: allow all UI actions if no authenticator is set
* acl: removed priority until we have a better understood use case for it
* acl: added validation of allowed labels when adding ACL entries
* site: added docs for ACLs
Fixes#690
This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.
Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.
The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.
After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only partial set of files.
In case of any errors, the manifest (and each directory object) will
list the number if failures and no more than 10 examples of failed
files/directories along with their respective errors.
Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.
With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
* cache: improved cache cleanup on exit
Ensure we do one full sweep before closing if cache has been modified.
Before we would do periodic sweep every minute which would not kick in
for very short snapshots, which Kopia does very frequently. This leads
to build-up of metadata cache entries (q blobs) that never
get cleaned until some long session.
* caching: streamlined cache handling
- deprecated caching-related flags, now cache is always on or off with
no way to disable it per invocation.
- reduced default list cache duration from 10min to 30s
- moved blob-list cache to separate subdirectory
- cleaned up cache info output to include blob-list cache parameters
- removed ability to disable cache for per-context (this was only
used in 'snapshot verify' codepath)
- added ability to partially clear individual caches via CLI
* blob: refactored upload reporting
Instead of plumbing this through blob storage context, we are passing
and explicit callback that reports uploads as they happen.
* htmlui: improved counter presentation
* nit: added missing UI route which fixes Reload behavior on the Tasks page
* user: added user profile (username&password for authentication) and CRUD methods
* manifest: helpers for disambiguating manifest entries
* authn: added repository-based user authenticator
* cli: added commands to manipulate user accounts and passwords
* cli: added --allow-repository-users option to 'server start'
* Update cli/command_user_info.go
Co-authored-by: Julio López <julio+gh@kasten.io>
* Always return false when the user is not found.
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`
- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
interface
Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.
`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).
* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session
* cli: disable maintenance in 'kopia server start'
Server will close the repository before completing.
* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
- used atomic to manage SharedManager.closed
* removed stale example
* linter: fixed spurious failures
Co-authored-by: Julio López <julio+gh@kasten.io>
* content: fixed time-based auto-flush behavior to behave like Flush()
Previously it would sometimes be possible for a content whose write
started before time-based flush to finish writing afterwards (and it
would be included in the new index).
Refactored the code so that time-based flush happens before WriteContent
write and behaves exactly the same was as real Flush() so all writes
started before it will be awaited during the flush.
Also previous regression test was incorrect since it was mocking the
wrong blob method.
* content: refactored index blob manager crypto to separate file
This will be reused for encrypting session info.
* content: added support for session markers
Session marker (`s` blob) is written BEFORE the first data blob
(`p` or `q`) that belongs to new index segment (`n` is written).
Session marker is removed AFTER the index blob (`n`) has been written.
All pack and index blobs belonging to a session will have the session
ID as its suffix, so that if a reader can see `s<sessionID>` blob, they
will ignore any `p` and `q` blobs with the same suffix.
* maintenance: ignore blobs belonging to active sessions when running blob garbage collection
* cli: added 'sessions list' for listing active sessions
* content: added retrying writing previously failed blobs before writing new one
* linter: upgraded to 1.33, disabled some linters
* lint: fixed 'errorlint' errors
This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.
Verified that there are no exceptions for errorlint linter which
guarantees that.
* lint: fixed or suppressed wrapcheck errors
* lint: nolintlint and misc cleanups
Co-authored-by: Julio López <julio+gh@kasten.io>
Both source and destination can be specified using user@host,
@host or user@host:/path where destination values override the
corresponding parts of the source, so both targeted
and mass copying is supported.
Supported combinations are:
Source: Destination Behavior
---------------------------------------------------
@host1 @host2 copy snapshots from all users of host1
user1@host1 @host2 copy all snapshots to user1@host2
user1@host1 user2@host2 copy all snapshots to user2@host2
user1@host1:/path1 @host2 copy to user1@host2:/path1
user1@host1:/path1 user2@host2 copy to user2@host2:/path1
user1@host1:/path1 user2@host2:/path2 copy snapshots from single path
When --move is specified, the matching source snapshots are also deleted.
* cli: upgraded kingpin to latest version (not tagged)
This allows using `EnableFileExpansion` to disable treating
arguments prefixed with "@" as file includes.
* logging: cleaned up stderr logging
- do not show module
- do not show timestamps by default (enable with --console-timestamps)
* logging: replaced most printStderr() with log.Info
* cli: additional logging cleanup
* cli: ensure advanced commands are not accidentally used
This prints an error when a dangerous command is used without
first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable.
Co-authored-by: Julio López <julio+gh@kasten.io>
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config
* cli: added 'repository set-client' to configure parameters of connected repository
* cli: cleaned up 'repository status' output
* cli: added 'index inspect' which can dump contents of index blob or local file
* repo: added read-only option when connecting to a repo which prevents any mutations
Co-authored-by: Julio Lopez <julio+gh@k....io>
- run maintenance even if the command is about to return an error
(otherwise if folks have persistent error causing snapshots to fail
they will never run maintenance)
- disable progress output after snapshotting so that
'kopia snapshot --all' output is clean
instead moved to run as part of maintenance ('kopia maintenance run')
added 'kopia maintenance run --force' flag which runs maintenance even
if not owned
Support for remote content repository where all contents and
manifests are fetched over HTTP(S) instead of locally
manipulating blob storage
* server: implement content and manifest access APIs
* apiclient: moved Kopia API client to separate package
* content: exposed content.ValidatePrefix()
* manifest: added JSON serialization attributes to EntryMetadata
* repo: changed repo.Open() to return Repository instead of *DirectRepository
* repo: added apiServerRepository
* cli: added 'kopia repository connect server'
This sets up repository connection via the API server instead of
directly-manipulated storage.
* server: add support for specifying a list of usernames/password via --htpasswd-file
* tests: added API server repository E2E test
* server: only return manifests (policies and snapshots) belonging to authenticated user
* maintenance: encrypt maintenance schedule block
* maintenance: created snapshotmaintenance package that wraps maintenance and performs snapshot GC + regular maintenance in one shot, used in CLI and server
* PR feedback.
* mechanical rename of package snapshot/gc => snapshot/snapshotgc
* maintenance: record maintenance run times and statuses
Also stopped dropping deleted contents during quick maintenance, since
doing this safely requires coordinating with snapshot GC which is
part of full maintenance.
* cli: 'maintenance info' outputs maintenance run history
* maintenance: only drop index entries when it's safe to do so
This is based on the timestamp of previous successful GC that's old
enough to resolve all race conditions between snapshot creation and GC.
* maintenance: added internal flush to RewriteContents() to better measure its time
Maintenance: support for automatic GC
Moved maintenance algorithms from 'cli' to 'repo/maintenance' package
Added support for CLI commands:
kopia gc - performs quick maintenance
kopia gc --full- perform full maintenance
Full maintenance performs snapshot gc, but it's not safe to do this automatically possibly in parallel to snapshots being taken. This will be addressed ~0.7 timeframe.
* This is 99% mechanical:
Extracted repo.Repository interface that only exposes high-level object and manifest management methods, but not blob nor content management.
Renamed old *repo.Repository to *repo.DirectRepository
Reviewed codebase to only depend on repo.Repository as much as possible, but added way for low-level CLI commands to use DirectRepository.
* PR fixes
* repo: added some initial metrics using OpenCensus
* cli: added flags to expose Prometheus metrics on a local endpoint
`--metrics-listen-addr=localhost:X` exposes prometheus metrics on
http://localhost:X/metrics
Also, kopia server will automatically expose /metrics endpoint on the
same port it runs as, without authentication.
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)
This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
/api/v1/repo/create
/api/v1/repo/connect
/api/v1/repo/disconnect
Refactored server code and fixed a number of outstanding robustness
issues. Tweaked the API responses a bit to make more sense when consumed
by the UI.