* blob: changed default shards from {3,3} to {1,3}
It turns out that for a very large repository of around 100 TB (5M blobs),
we end up creating up to ~16M directories, which is way too many
and slows down listing. Currently each leaf directory has only a handful
of files.
Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
a factor of 4096.
The change is done in a backwards-compatible way and will respect
the custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).
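The directory counts above can be checked with a short sketch. It assumes
that each shard character is one of 16 hex digits and that the leaf file is
named after the remainder of the blob ID; `shardedPath` and `leafDirs` are
hypothetical helpers, not the actual implementation.

```go
package main

import (
	"fmt"
	"path"
)

// shardedPath splits the leading characters of blobID into nested directory
// names, one per entry in shards, and returns the resulting relative path.
// IDs that are too short to shard are returned unchanged.
func shardedPath(blobID string, shards []int) string {
	total := 0
	for _, n := range shards {
		total += n
	}
	if len(blobID) <= total {
		return blobID
	}

	parts := make([]string, 0, len(shards)+1)
	rest := blobID
	for _, n := range shards {
		parts = append(parts, rest[:n])
		rest = rest[n:]
	}
	parts = append(parts, rest)

	return path.Join(parts...)
}

// leafDirs returns the maximum number of leaf directories for a layout,
// assuming 16 possible values (hex digits) per shard character.
func leafDirs(shards []int) int {
	dirs := 1
	for _, n := range shards {
		for i := 0; i < n; i++ {
			dirs *= 16
		}
	}
	return dirs
}

func main() {
	for _, layout := range [][]int{{3, 3}, {1, 3}, {3}} {
		fmt.Println(layout, shardedPath("abcdef0123456789", layout), leafDirs(layout))
	}
	// Ratio between the {3,3} and {3} layouts.
	fmt.Println(leafDirs([]int{3, 3}) / leafDirs([]int{3}))
}
```

With these assumptions {3,3} yields 16^6 = 16,777,216 leaf directories and
{3} yields 16^3 = 4096, which is where the factor of 4096 comes from.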
* fixed compat tests
* cli: fixed remaining testability indirections for output and logging
* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process
* tests: refactored most e2e tests to invoke kopia subcommands in-process
* Makefile: enable code coverage for cli/ and internal/
* testing: pass 'testing' tag to unit tests, which enables a much faster (insecure) password hashing scheme
* Makefile: push coverage from PRs again
* tests: disable buffer management to reduce memory usage on ARM
* cli: fixed misaligned atomic field on ARMHF
also temporarily fixed a benign startup-time race condition when setting
the default on the timeZone variable, which is the last global variable.
* introduced passwordpersist package which has password persistence
strategies (keyring, file, none, multiple) with possibility of adding
more in the future.
* moved all password persistence logic out of 'repo'
* removed global variable repo.EnableKeyRing
cli: major refactoring of how CLI commands are registered
The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attach the flags or
subcommand to the provided parent.
This change is 94.3% mechanical, but is fully organic and hand-made.
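A minimal sketch of the pattern, using the standard library `flag` package
as a stand-in for the real CLI framework. `commandExample`, its flags and
`runExample` are hypothetical names made up for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// commandExample is a hypothetical command. Its flags live in struct
// fields instead of package-level variables.
type commandExample struct {
	name    string
	verbose bool
}

// setup attaches this command's flags to the provided parent flag set.
func (c *commandExample) setup(parent *flag.FlagSet) {
	parent.StringVar(&c.name, "name", "", "name to operate on")
	parent.BoolVar(&c.verbose, "verbose", false, "enable verbose output")
}

// run reports the parsed state; a real command would do its work here.
func (c *commandExample) run() string {
	return fmt.Sprintf("name=%q verbose=%v", c.name, c.verbose)
}

// runExample builds a fresh command and flag set for every invocation,
// so no state leaks between runs (or between tests).
func runExample(args []string) (string, error) {
	var c commandExample

	fs := flag.NewFlagSet("example", flag.ContinueOnError)
	c.setup(fs)

	if err := fs.Parse(args); err != nil {
		return "", err
	}

	return c.run(), nil
}

func main() {
	fmt.Println(runExample([]string{"--name=alice", "--verbose"}))
	fmt.Println(runExample(nil))
}
```

Because every invocation gets its own struct, two commands can run in the
same process without sharing flag values.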
* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
* Dockerfile: specified reasonable default options for containerized kopia
* addressed pr comments, switched to gcr.io/distroless/static:nonroot
distroless has no executable code, so this requires KOPIA_PASSWORD
to always be provided via env, b/c distroless does not have
/bin/stty to disable TTY echo (we should not require that, BTW)
* site: added docker image documentation
* repo: refactored connect code to set up cache for server repositories
- improved logic to close the cache on last connection
- preemptively add all contents with a prefix to the cache
- refactored how config is loaded and saved
Now cache dir will be stored as relative and resolved to absolute as
part of loading and saving the file, in all other places cache dir
is expected to be absolute.
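A rough sketch of the relative/absolute conversion. It assumes the stored
path is relative to the directory of the config file; the function names
and the example paths are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// storedCacheDir converts an absolute cache directory to the form written
// to the config file: relative to the config file's directory (assumption).
// If no relative path can be computed, the absolute path is kept.
func storedCacheDir(configFile, absCacheDir string) string {
	rel, err := filepath.Rel(filepath.Dir(configFile), absCacheDir)
	if err != nil {
		return absCacheDir
	}
	return rel
}

// resolvedCacheDir converts the stored value back to an absolute path when
// the config file is loaded.
func resolvedCacheDir(configFile, stored string) string {
	if filepath.IsAbs(stored) {
		return filepath.Clean(stored)
	}
	return filepath.Join(filepath.Dir(configFile), stored)
}

func main() {
	cfg := "/home/user/.config/kopia/repository.config"
	abs := "/home/user/.cache/kopia/0123"

	stored := storedCacheDir(cfg, abs)
	fmt.Println(stored)
	fmt.Println(resolvedCacheDir(cfg, stored))
}
```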
* server: removed cache directory from the API and UI
This won't be easily available and does not seem useful to expose
anyway.
* cli: enabled cache commands for server repositories
* cli: added KOPIA_CACHE_DIRECTORY environment variable
This is used on two occasions - when setting up connection (it gets
persisted in the config) and later when opening (to override the
cache location from config). It makes setting up docker container with
mounted cache somewhat easier with one environment variable.
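The precedence described above could look roughly like this. Only the
variable name comes from this change; `effectiveCacheDir` and the fallback
path are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

const cacheDirEnv = "KOPIA_CACHE_DIRECTORY"

// effectiveCacheDir returns the value of the environment variable when it
// is set and otherwise the fallback (the default location when connecting,
// or the location from the config file when opening). The lookup function
// is a parameter so that tests do not need to modify the real environment.
func effectiveCacheDir(fallback string, getenv func(string) string) string {
	if v := getenv(cacheDirEnv); v != "" {
		return v
	}
	return fallback
}

func main() {
	fmt.Println(effectiveCacheDir("/home/user/.cache/kopia/0123", os.Getenv))
}
```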
* cli: show cache size for the server cache
* tls: present more helpful error message that includes SHA256 fingerprint of the TLS server on mismatch
* server: return the name of user who attempted to login when authentication fails
* cache: improved cache cleanup on exit
Ensure we do one full sweep before closing if cache has been modified.
Previously we would do a periodic sweep every minute, which would not kick in
for very short snapshots, which Kopia takes very frequently. This led
to a build-up of metadata cache entries (q blobs) that never
got cleaned up until some long session.
* caching: streamlined cache handling
- deprecated caching-related flags; the cache is now either always on or
always off, with no way to disable it per invocation.
- reduced default list cache duration from 10min to 30s
- moved blob-list cache to separate subdirectory
- cleaned up cache info output to include blob-list cache parameters
- removed ability to disable cache per context (this was only
used in the 'snapshot verify' codepath)
- added ability to partially clear individual caches via CLI
* linter: upgraded to 1.33, disabled some linters
* lint: fixed 'errorlint' errors
This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks do not rely on strict equality anywhere.
Verified that there are no exceptions for errorlint linter which
guarantees that.
* lint: fixed or suppressed wrapcheck errors
* lint: nolintlint and misc cleanups
Co-authored-by: Julio López <julio+gh@kasten.io>
This can be specified at `repo create` or `repo connect` to enable
actions. By default actions are disabled to avoid security risks
associated with executing code.
Alternatively during `snapshot create` one can specify
`--force-enable-actions` or `--force-disable-actions`
Both source and destination can be specified using user@host,
@host or user@host:/path where destination values override the
corresponding parts of the source, so both targeted
and mass copying is supported.
Supported combinations are:
Source              Destination          Behavior
---------------------------------------------------------------------------
@host1              @host2               copy snapshots from all users of host1
user1@host1         @host2               copy all snapshots to user1@host2
user1@host1         user2@host2          copy all snapshots to user2@host2
user1@host1:/path1  @host2               copy to user1@host2:/path1
user1@host1:/path1  user2@host2          copy to user2@host2:/path1
user1@host1:/path1  user2@host2:/path2   copy snapshots from single path
When --move is specified, the matching source snapshots are also deleted.
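The override rule can be sketched as follows. `sourceSpec`, `parseSpec` and
`applyDestination` are hypothetical names, and the sketch does not validate
which source/destination combinations are allowed.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceSpec holds the parts of user@host:/path; empty parts are wildcards.
type sourceSpec struct {
	User, Host, Path string
}

// parseSpec parses "user@host", "@host" or "user@host:/path".
func parseSpec(s string) (sourceSpec, error) {
	at := strings.Index(s, "@")
	if at < 0 {
		return sourceSpec{}, fmt.Errorf("missing '@' in %q", s)
	}

	spec := sourceSpec{User: s[:at]}
	rest := s[at+1:]

	if colon := strings.Index(rest, ":"); colon >= 0 {
		spec.Host, spec.Path = rest[:colon], rest[colon+1:]
	} else {
		spec.Host = rest
	}

	if spec.Host == "" {
		return sourceSpec{}, fmt.Errorf("missing host in %q", s)
	}

	return spec, nil
}

// applyDestination returns src with every non-empty part of dst applied.
func applyDestination(src, dst sourceSpec) sourceSpec {
	out := src
	if dst.User != "" {
		out.User = dst.User
	}
	if dst.Host != "" {
		out.Host = dst.Host
	}
	if dst.Path != "" {
		out.Path = dst.Path
	}
	return out
}

func (s sourceSpec) String() string {
	out := s.User + "@" + s.Host
	if s.Path != "" {
		out += ":" + s.Path
	}
	return out
}

func main() {
	src, _ := parseSpec("user1@host1:/path1")
	for _, d := range []string{"@host2", "user2@host2", "user2@host2:/path2"} {
		dst, err := parseSpec(d)
		if err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(src, "->", applyDestination(src, dst))
	}
}
```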
* cli: upgraded kingpin to latest version (not tagged)
This allows using `EnableFileExpansion` to disable treating
arguments prefixed with "@" as file includes.
* logging: cleaned up stderr logging
- do not show module
- do not show timestamps by default (enable with --console-timestamps)
* logging: replaced most printStderr() with log.Info
* cli: additional logging cleanup
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config
* cli: added 'repository set-client' to configure parameters of connected repository
* cli: cleaned up 'repository status' output
* cli: added 'index inspect' which can dump contents of index blob or local file
* repo: added read-only option when connecting to a repo which prevents any mutations
Co-authored-by: Julio Lopez <julio+gh@k....io>
The hostname/username are now persisted when connecting to repository
in a local config file.
This prevents weird behavior changes when hostname is suddenly changed,
such as when moving between networks.
repo.Repository will now expose Hostname/Username properties which
are always guaranteed to be set, and are used throughout.
Removed --hostname/--username overrides when taking snapshot et al.
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but this can be overridden
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
or globally, by calling logging.SetDefaultLogger(newLoggerFunc)
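A simplified sketch of how a context-associated logger can work. The
function names mirror the ones mentioned above, but the types and
signatures here are assumptions made for illustration, not the actual
logging package.

```go
package main

import (
	"context"
	"fmt"
)

// Logger and LoggerFactory are assumed shapes for this sketch.
type Logger interface {
	Infof(format string, args ...interface{})
}

type LoggerFactory func(module string) Logger

// nullLogger discards everything, matching "no logs by default".
type nullLogger struct{}

func (nullLogger) Infof(string, ...interface{}) {}

type contextKey struct{}

var defaultFactory LoggerFactory = func(string) Logger { return nullLogger{} }

// SetDefaultLogger overrides logging globally.
func SetDefaultLogger(f LoggerFactory) { defaultFactory = f }

// WithLogger overrides logging for ctx and all contexts derived from it.
func WithLogger(ctx context.Context, f LoggerFactory) context.Context {
	return context.WithValue(ctx, contextKey{}, f)
}

// GetLogger returns the logger for the context, or the default one.
func GetLogger(ctx context.Context, module string) Logger {
	if f, ok := ctx.Value(contextKey{}).(LoggerFactory); ok {
		return f(module)
	}
	return defaultFactory(module)
}

// sliceLogger appends formatted lines to a shared slice.
type sliceLogger struct {
	module string
	lines  *[]string
}

func (l sliceLogger) Infof(format string, args ...interface{}) {
	*l.lines = append(*l.lines, "["+l.module+"] "+fmt.Sprintf(format, args...))
}

// demo logs once with the default logger and once with a nested override.
func demo() []string {
	var lines []string
	capture := func(module string) Logger {
		return sliceLogger{module: module, lines: &lines}
	}

	root := context.Background()
	GetLogger(root, "repo").Infof("dropped %d", 1) // discarded by default

	nested := WithLogger(root, capture)
	GetLogger(nested, "repo").Infof("kept %d", 2) // captured

	return lines
}

func main() {
	fmt.Println(demo())
}
```

A test could pass a factory that forwards to t.Logf() in the same way that
`capture` forwards to a slice here.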
This refactoring allowed removing the dependency of the Kopia repo
on the go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
Also introduced strongly typed content.ID and manifest.ID (instead of string)
This aligns identifiers across all layers of repository:
blob.ID
content.ID
object.ID
manifest.ID
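A short illustration of what the strong typing buys. `BlobID`, `ContentID`
and `describeContent` are hypothetical stand-ins defined in a single
package for brevity.

```go
package main

import "fmt"

// Distinct string types: values of one cannot be passed where the other
// is expected without an explicit conversion.
type (
	BlobID    string
	ContentID string
)

// describeContent only accepts content IDs.
func describeContent(id ContentID) string {
	return "content:" + string(id)
}

func main() {
	b := BlobID("p0123")
	c := ContentID("abcd")

	fmt.Println(describeContent(c))

	// describeContent(b) // does not compile: BlobID is not ContentID

	// An explicit conversion makes the layer crossing visible in code.
	fmt.Println(describeContent(ContentID(b)))
}
```

Note that untyped string constants still convert implicitly, so
describeContent("abcd") compiles; the protection applies to typed values.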
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.
Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.
Also renamed CLI subcommands from `kopia storage` to `kopia blob`.
While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
completely rewrote password storage:
- by default passwords are kept in OS-specific keyring (Keychain on macOS,
Windows Credentials Manager on Windows), which can be optionally disabled
to store password in a local file.
- on Linux the keyring is disabled by default (does not work reliably
in terminal sessions), but can be enabled using a command-line flag.