Commit Graph

47 Commits

Author SHA1 Message Date
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

Turns out that for a very large repository of around 100TB (5M blobs),
we end up creating up to ~16M directories, which is way too many
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).

* fixed compat tests
2021-11-16 06:02:04 -08:00
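The effect of the shard configuration described in cead806a3f can be sketched as follows. This is a minimal illustration, not kopia's actual implementation: `shardedPath` and `maxLeafDirs` are hypothetical helpers, and blob IDs are assumed to consist of hexadecimal characters only (16 possible values per character).

```go
package main

import (
	"fmt"
	"path"
)

// shardedPath splits the leading characters of a blob ID into nested
// directory names according to shards and uses the remainder as the file
// name. Hypothetical helper for illustration only.
func shardedPath(blobID string, shards []int) string {
	parts := []string{}
	rest := blobID
	for _, n := range shards {
		// Stop sharding when the remaining ID is too short to split.
		if len(rest) <= n {
			break
		}
		parts = append(parts, rest[:n])
		rest = rest[n:]
	}
	parts = append(parts, rest)
	return path.Join(parts...)
}

// maxLeafDirs returns the maximum number of leaf directories, assuming
// every sharded character is a hexadecimal digit.
func maxLeafDirs(shards []int) int {
	total := 1
	for _, n := range shards {
		for i := 0; i < n; i++ {
			total *= 16
		}
	}
	return total
}

func main() {
	id := "0123456789abcdef"
	fmt.Println(shardedPath(id, []int{3, 3})) // 012/345/6789abcdef
	fmt.Println(shardedPath(id, []int{1, 3})) // 0/123/456789abcdef
	fmt.Println(maxLeafDirs([]int{3, 3}))     // 16777216
	fmt.Println(maxLeafDirs([]int{1, 3}))     // 65536
	fmt.Println(maxLeafDirs([]int{3}))        // 4096
}
```

Under the hexadecimal assumption, {3,3} yields up to 16^6 (about 16M) leaf directories, which matches the figure quoted in the commit message, while {3} yields 4096, a reduction by a factor of 4096.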
Jarek Kowalski
bffe8b37da repo: limit the duration of kopia.repository caching to 15 minutes (#1196)
added flags to specify kopia.repository cache duration
2021-07-15 11:32:55 -07:00
Jarek Kowalski
30ca3e2e6c Upgraded linter to 1.40.1 (#1072)
* tools: upgraded linter to 1.40.1

* lint: fixed nolintlint violations

* lint: disabled tagliatelle linter

* lint: fixed remaining warnings
2021-05-15 12:12:34 -07:00
Jarek Kowalski
fcd507a56d Refactored most of the CLI tests to run in-process as opposed to using sub-processes (#1059)
* cli: fixed remaining testability indirections for output and logging

* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process

* tests: refactored most e2e tests to invoke kopia subcommands in-process

* Makefile: enable code coverage for cli/ and internal/

* testing: pass 'testing' tag to unit tests which uses much faster (insecure) password hashing scheme

* Makefile: push coverage from PRs again

* tests: disable buffer management to reduce memory usage on ARM

* cli: fixed misaligned atomic field on ARMHF

also temporarily fixed startup-time benign race condition when setting
default on the timeZone variable, which is the last global variable.
2021-05-11 22:26:28 -07:00
Jarek Kowalski
41931f21ce repo: refactored password persistence (#1065)
* introduced passwordpersist package which has password persistence
  strategies (keyring, file, none, multiple) with possibility of adding
  more in the future.
* moved all password persistence logic out of 'repo'
* removed global variable repo.EnableKeyRing
2021-05-11 21:53:36 -07:00
Jarek Kowalski
8a2167784d cli: final steps to remove last global variables for password and globalPasswordFromToken (#1056)
* cli: removed globalPassword variable

* cli: remove globalPasswordFromToken variable
2021-05-07 12:43:47 -07:00
Jarek Kowalski
d2288c443f cli: major refactoring (#1046)
cli: major refactoring of how CLI commands are registered

The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attach the flags or
subcommand to the provided parent.

This change is 94.3% mechanical, but is fully organic and hand-made.

* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
2021-05-03 10:28:00 -07:00
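The pattern described in d2288c443f (flags as struct fields, attached through a `setup()` method) might look roughly like the sketch below. It uses the standard library `flag` package as a stand-in for the CLI library kopia actually uses, and `commandCacheSet`, `runCacheSet` and the `max-size-mb` flag are hypothetical names chosen for this example.

```go
package main

import (
	"flag"
	"fmt"
)

// commandCacheSet is a hypothetical command whose flags live in struct
// fields instead of package-level variables.
type commandCacheSet struct {
	maxSizeMB int
}

// setup attaches this command's flags to the provided flag set.
func (c *commandCacheSet) setup(fs *flag.FlagSet) {
	fs.IntVar(&c.maxSizeMB, "max-size-mb", 500, "maximum cache size in MB")
}

// run executes the command using only state stored on the struct.
func (c *commandCacheSet) run() string {
	return fmt.Sprintf("cache size set to %d MB", c.maxSizeMB)
}

// runCacheSet builds a fresh command and flag set for each invocation, so
// repeated in-process calls (for example from tests) do not share state.
func runCacheSet(args []string) (string, error) {
	var c commandCacheSet
	fs := flag.NewFlagSet("cache-set", flag.ContinueOnError)
	c.setup(fs)
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return c.run(), nil
}

func main() {
	out, err := runCacheSet([]string{"--max-size-mb=100"})
	fmt.Println(out, err) // cache size set to 100 MB <nil>
}
```

Because no flag value is stored globally, two invocations with different arguments cannot interfere with each other, which is the property that makes in-process testing practical.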
Jarek Kowalski
3a94c16678 Dockerfile: switched to distroless, specified default environment variables for containerized kopia (#897)
* Dockerfile: specified reasonable default options for containerized kopia

* addressed pr comments, switched to gcr.io/distroless/static:nonroot

distroless has no executable code, so this requires KOPIA_PASSWORD
to always be provided via env, b/c distroless does not have
/bin/stty to disable TTY echo (we should not require that, BTW)

* site: added docker image documentation
2021-03-19 21:54:48 -07:00
Jarek Kowalski
1f1465f4ba Improvements and cleanups for connecting to kopia server (#870)
* repo: refactored connect code to set up cache for server repositories

- improved logic to close the cache on last connection
- preemptively add all contents with a prefix to the cache
- refactored how config is loaded and saved

Now cache dir will be stored as relative and resolved to absolute as
part of loading and saving the file, in all other places cache dir
is expected to be absolute.

* server: removed cache directory from the API and UI

This won't be easily available and does not seem useful to expose
anyway.

* cli: enabled cache commands for server repositories

* cli: added KOPIA_CACHE_DIRECTORY environment variable

This is used on two occasions - when setting up connection (it gets
persisted in the config) and later when opening (to override the
cache location from config). It makes setting up docker container with
mounted cache somewhat easier with one environment variable.

* cli: show cache size for the server cache

* tls: present more helpful error message that includes SHA256 fingerprint of the TLS server on mismatch

* server: return the name of user who attempted to login when authentication fails
2021-03-07 11:25:21 -08:00
Jarek Kowalski
c990fc9ec1 cache: streamlined flags and cache handling (#831)
* cache: improved cache cleanup on exit

Ensure we do one full sweep before closing if cache has been modified.

Before we would do periodic sweep every minute which would not kick in
for very short snapshots, which Kopia does very frequently. This leads
to build-up of metadata cache entries (q blobs) that never
get cleaned until some long session.

* caching: streamlined cache handling

- deprecated caching-related flags, now cache is always on or off with
  no way to disable it per invocation.
- reduced default list cache duration from 10min to 30s
- moved blob-list cache to separate subdirectory
- cleaned up cache info output to include blob-list cache parameters
- removed ability to disable cache per-context (this was only
  used in 'snapshot verify' codepath)
- added ability to partially clear individual caches via CLI
2021-02-16 17:30:49 -08:00
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
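The reason e03971fc59 insists on errors.Is() and errors.As() can be shown with a small sketch: once an error is wrapped, a strict `==` comparison no longer matches the original sentinel. `ErrNotFound`, `lookup` and `isNotFound` are hypothetical names used only for this illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel error used only in this example.
var ErrNotFound = errors.New("not found")

// lookup wraps the sentinel with additional context using %w.
func lookup(key string) error {
	return fmt.Errorf("looking up %q: %w", key, ErrNotFound)
}

// isNotFound reports whether err is, or wraps, ErrNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	err := lookup("abc")

	// Strict comparison fails once the error has been wrapped.
	fmt.Println(err == ErrNotFound) // false

	// errors.Is walks the wrap chain and still finds the sentinel.
	fmt.Println(isNotFound(err)) // true
}
```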
Jarek Kowalski
eecd9d13c9 actions: Added --enable-actions flag (#737)
This can be specified at `repo create` or `repo connect` to enable
actions. By default actions are disabled to avoid security risks
associated with executing code.

Alternatively during `snapshot create` one can specify
`--force-enable-actions` or `--force-disable-actions`
2020-12-21 18:05:25 -08:00
Jarek Kowalski
ad4b222939 cli: added support for copying (or moving) snapshot history (#703)
Both source and destination can be specified using user@host,
@host or user@host:/path where destination values override the
corresponding parts of the source, so both targeted
and mass copying is supported.

Supported combinations are:

Source:             Destination         Behavior
---------------------------------------------------
@host1              @host2              copy snapshots from all users of host1
user1@host1         @host2              copy all snapshots to user1@host2
user1@host1         user2@host2         copy all snapshots to user2@host2
user1@host1:/path1  @host2              copy to user1@host2:/path1
user1@host1:/path1  user2@host2         copy to user2@host2:/path1
user1@host1:/path1  user2@host2:/path2  copy snapshots from single path

When --move is specified, the matching source snapshots are also deleted.

* cli: upgraded kingpin to latest version (not tagged)

This allows using `EnableFileExpansion` to disable treating
arguments prefixed with "@" as file includes.
2020-12-04 16:34:55 -08:00
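The override rule in ad4b222939 (destination values replace the corresponding parts of the source) can be sketched as below. This is an illustration under assumptions, not the code from the commit: `sourceInfo`, `parse` and `resolve` are hypothetical names, and the parser splits at the first ':' and '@' without any validation.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceInfo is a hypothetical type; an empty field means "not specified".
type sourceInfo struct {
	User, Host, Path string
}

// parse splits "user@host:/path", "user@host" or "@host" into its parts.
func parse(s string) sourceInfo {
	var si sourceInfo
	if i := strings.Index(s, ":"); i >= 0 {
		si.Path = s[i+1:]
		s = s[:i]
	}
	if i := strings.Index(s, "@"); i >= 0 {
		si.User = s[:i]
		si.Host = s[i+1:]
	}
	return si
}

// resolve starts from the source and overrides every part that the
// destination specifies explicitly.
func resolve(src, dst sourceInfo) sourceInfo {
	out := src
	if dst.User != "" {
		out.User = dst.User
	}
	if dst.Host != "" {
		out.Host = dst.Host
	}
	if dst.Path != "" {
		out.Path = dst.Path
	}
	return out
}

// String formats the value back into user@host[:/path] form.
func (si sourceInfo) String() string {
	s := si.User + "@" + si.Host
	if si.Path != "" {
		s += ":" + si.Path
	}
	return s
}

func main() {
	src := parse("user1@host1:/path1")
	fmt.Println(resolve(src, parse("@host2")))             // user1@host2:/path1
	fmt.Println(resolve(src, parse("user2@host2")))        // user2@host2:/path1
	fmt.Println(resolve(src, parse("user2@host2:/path2"))) // user2@host2:/path2
}
```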
Jarek Kowalski
9d7cf71a37 Logging flags (#674)
* logging: cleaned up stderr logging

- do not show module
- do not show timestamps by default (enable with --console-timestamps)

* logging: replaced most printStderr() with log.Info

* cli: additional logging cleanup
2020-10-10 10:48:37 -07:00
Jarek Kowalski
29ce1819cb Added support for setting and changing repository client options (description, read-only, hostname, username) (#589)
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config

* cli: added 'repository set-client' to configure parameters of connected repository

* cli: cleaned up 'repository status' output
2020-09-04 13:57:15 -07:00
Jarek Kowalski
c11850cfb7 Tools to help investigate repository structures safely (#553)
* cli: added 'index inspect' which can dump contents of index blob or local file
* repo: added read-only option when connecting to a repo which prevents any mutations

Co-authored-by: Julio Lopez <julio+gh@k....io>
2020-08-25 17:55:48 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00
Jarek Kowalski
d526843124 cli: fixed metadata cache size on connect/create 2020-03-07 21:47:32 -08:00
Jarek Kowalski
fb181257bf cli: implemented update check, fixes #119 2020-03-04 22:06:05 -08:00
Jarek Kowalski
e3854f7773 BREAKING: changed how hostname/username are handled
The hostname/username are now persisted when connecting to repository
in a local config file.

This prevents weird behavior changes when hostname is suddenly changed,
such as when moving between networks.

repo.Repository will now expose Hostname/Username properties which
are always guaranteed to be set, and are used throughout.

Removed --hostname/--username overrides when taking snapshot et al.
2020-02-25 20:40:23 -08:00
Jarek Kowalski
c8fcae93aa logging: refactored logging
This is mostly mechanical and changes how loggers are instantiated.

Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).

By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling

ctx = logging.WithLogger(ctx, newLoggerFunc)

To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)

This refactoring allowed removing the dependency of the Kopia repo
on the go-logging library (the CLI still uses it, though).

It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
2020-02-25 17:24:44 -08:00
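The general idea behind c8fcae93aa, a logger carried by the context with a silent default, might be sketched as follows. The names here (`Logger`, `withLogger`, `loggerFrom`, `run`) are hypothetical and the signatures are assumptions; they are not the API of kopia's logging package.

```go
package main

import (
	"context"
	"fmt"
)

// Logger is a hypothetical printf-style logging function.
type Logger func(format string, args ...interface{})

// loggerKey is an unexported key type that avoids collisions in the context.
type loggerKey struct{}

// withLogger returns a nested context that carries l.
func withLogger(ctx context.Context, l Logger) context.Context {
	return context.WithValue(ctx, loggerKey{}, l)
}

// loggerFrom returns the logger attached to ctx, or a no-op logger when
// none is attached, so nothing is logged by default.
func loggerFrom(ctx context.Context) Logger {
	if l, ok := ctx.Value(loggerKey{}).(Logger); ok {
		return l
	}
	return func(string, ...interface{}) {}
}

// run logs once without a logger and once with a capturing logger, and
// returns the captured lines.
func run() []string {
	ctx := context.Background()
	loggerFrom(ctx)("dropped %v", 1) // no logger attached, nothing recorded

	var lines []string
	nested := withLogger(ctx, func(format string, args ...interface{}) {
		lines = append(lines, fmt.Sprintf(format, args...))
	})
	loggerFrom(nested)("opened %v", "repo")
	return lines
}

func main() {
	fmt.Println(run()) // [opened repo]
}
```

A test could attach a logger that forwards to t.Logf() in the same way the capturing logger is attached here.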
Jarek Kowalski
edca1733b6 repo: moved password persistence to repository layer 2020-02-09 20:55:07 -08:00
Jarek Kowalski
ac70a38101 lint: upgraded to 1.22.2 and make lint issues a build failure
fixed or silenced linter warnings, mostly due to magic numeric constants
2020-01-03 16:39:30 -08:00
Jarek Kowalski
6217df1a87 lint: switched to 1.21 and fixed a ton of whitespace issues discovered
by new wsl linter
2019-11-26 06:49:49 -08:00
Jarek Kowalski
9893a552e2 metrics: disable google analytics as it's not providing any useful data, need to rethink the story here 2019-08-17 16:13:30 -07:00
Jarek Kowalski
6ef696d97a cli: resolve symlinks for snapshot roots
also - error handling improvements in the CLI
2019-07-18 08:40:44 -10:00
Jarek Kowalski
22170b4832 cli: changed 'cache set' subcommand to support changing individual parameters 2019-06-11 22:08:52 -07:00
Jarek Kowalski
bf311400f4 Added separate cache for metadata.
All blocks with a non-empty prefix land in that cache which has its own
size and expires independently from content cache.
2019-06-08 11:57:23 -07:00
Jarek Kowalski
54edb97b3a refactoring: renamed repo/block to repo/content
Also introduced strongly typed content.ID and manifest.ID (instead of string)

This aligns identifiers across all layers of repository:

blob.ID
content.ID
object.ID
manifest.ID
2019-06-01 22:24:19 -07:00
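The benefit of strongly typed identifiers mentioned in 54edb97b3a can be illustrated with a short sketch. The types below are hypothetical stand-ins declared in a single package for brevity; they are not the definitions from the kopia packages.

```go
package main

import "fmt"

// Hypothetical stand-ins for per-layer identifier types.
type (
	BlobID    string
	ContentID string
)

// describeContent accepts only ContentID, so passing a BlobID variable
// is rejected at compile time.
func describeContent(id ContentID) string {
	return "content " + string(id)
}

func main() {
	var c ContentID = "abc"
	fmt.Println(describeContent(c)) // content abc

	var b BlobID = "p123"
	// describeContent(b) would not compile: BlobID is not ContentID.
	// An explicit conversion makes crossing layers visible in the code.
	fmt.Println(describeContent(ContentID(b))) // content p123
}
```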
Jarek Kowalski
9e5d0beccd refactoring: renamed storage.Storage to blob.Storage
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.

Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.

Also renamed CLI subcommands from `kopia storage` to `kopia blob`.

While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
2019-06-01 14:10:35 -07:00
Jarek Kowalski
1a7a02ddbe cleanup imports by grouping all local imports together 2019-06-01 10:57:55 -07:00
Jarek Kowalski
0c41d41276 Fixed up paths after merge 2019-05-27 15:48:39 -07:00
Jarek Kowalski
a6a153b22e switched fmt.Errorf() to errors.Wrap() 2019-05-11 12:34:14 -07:00
Jarek Kowalski
327d8317d8 refactored repo/ into separate github.com/kopia/repo/ git repository 2018-10-26 20:40:57 -07:00
Jarek Kowalski
1b014c875a simplified repository API password handling.
completely rewrote password storage:

- by default passwords are kept in OS-specific keyring (Keychain on macOS,
Windows Credentials Manager on Windows), which can be optionally disabled
to store password in a local file.

- on Linux keychain is disabled by default (does not work reliably
in terminal sessions), but can be enabled using command-line flag.
2018-09-07 21:34:31 -07:00
Jarek Kowalski
91066f2469 reorganized low-level repository packages by moving them all under kopia/kopia/repo/ 2018-08-30 22:01:05 -07:00
Jarek Kowalski
b2b34c1dea reacted to a change in gometalinter that failed the build 2018-07-10 06:19:22 -07:00
Jarek Kowalski
53db414ff7 Added simple analytics mechanism based on Google Analytics for tracking features usage and latency.
Controlled on command line via --analytics-consent (defaults to asking user)
2018-05-30 21:22:07 -07:00
Jarek Kowalski
d8201229d8 plumbed through ctx in storage.Storage APIs and all uses 2018-04-03 17:39:54 -07:00
Jarek Kowalski
453bab3560 fixed some lint errors, mostly dead code and missing error checks 2018-03-19 12:26:28 -07:00
Jarek Kowalski
faa2625a5f revamped CLI help to hide most commands 2018-02-15 19:49:51 -08:00
Jarek Kowalski
090d97ba78 cli: reorganized all repo commands (connect/create/status) to top level 2018-01-10 19:13:09 -08:00
Jarek Kowalski
bf4c0e694d refactored CLI to use individual subcommands to connect to individual storage types, each with specialized flags and validation 2018-01-10 19:13:08 -08:00
Jarek Kowalski
f9f2c54993 added command to reconfigure caching on a repository that's already connected 2018-01-04 17:44:58 -08:00
Jarek Kowalski
30c11dc926 refactored block manager to support on-disk caching 2017-11-27 18:07:16 -08:00
Jarek Kowalski
8435ed4c80 beginnings of end-to-end test, cleaned up stdout vs stderr output in a few cases 2017-09-04 17:42:50 -07:00
Jarek Kowalski
f606ab4347 reorganized more top-level CLI commands into subcommands 2017-08-20 07:51:24 -07:00