536 Commits

Author SHA1 Message Date
Jarek Kowalski
93930d20cb policy: revamped policy merge mechanism (#1538)
Added policy.Definition which allows us to precisely report where
each piece of policy came from.

Fixed an off-by-one bug with "noParent", which prevented merging of parent
policies one level too soon.

Added a whole bunch of merging helpers and a generic reflection-based
test that ensures every single merge is tested.
2021-11-27 18:14:45 -08:00
Lukas Rieger
5224f79d7d [snapshot restore] use non-atomic writes (#1534)
* don't flush every file two times on snapshot restore
2021-11-26 13:10:44 -08:00
Jarek Kowalski
8ab3e049d2 cli: fixed 'snapshot list --json --max-results' (#1529)
Fixes #1455
2021-11-20 22:42:27 -08:00
Jarek Kowalski
525720db95 cli: added 'snapshot pin' command (#1528) 2021-11-20 20:53:25 -08:00
CrendKing
2394b420b0 Change Mbit/s units to MB/s (base-10) (#1522) 2021-11-18 15:41:40 -08:00
Jarek Kowalski
62edab618f throttling: implemented a Throttler based on token bucket and configur… (#1512)
* throttling: implemented a Throttler based on token bucket and configurable window.

* cli: rewired throttle options to use common Limits structure and helpers

The JSON is backwards compatible.

* blob: remove explicit throttling from gcs,s3,b2 & azure

* cleanup: removed internal/throttle

* repo: add throttling wrapper around storage at the repository level

* throttling: expose APIs to get limits and add validation

* server: expose API to get/set throttle in a running server

* pr feedback
2021-11-16 07:39:26 -08:00
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

It turns out that for a very large repository of around 100TB (5M blobs),
we end up creating up to ~16M directories, which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).
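
The directory counts quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes every shard character is one of 16 hex digits, which is what makes the ~16M and 4096 figures work out; `leafDirs` is a hypothetical helper and not part of kopia.

```go
package main

import "fmt"

// leafDirs returns the maximum number of leaf directories produced by the
// given shard lengths, assuming each shard character is a hex digit.
func leafDirs(shards []int) int {
	n := 1
	for _, s := range shards {
		for i := 0; i < s; i++ {
			n *= 16
		}
	}
	return n
}

func main() {
	old := leafDirs([]int{3, 3}) // two levels of 3 hex characters each
	simple := leafDirs([]int{3}) // one level of 3 hex characters

	fmt.Println(old, simple, old/simple) // prints "16777216 4096 4096"
}
```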

* fixed compat tests
2021-11-16 06:02:04 -08:00
Shikhar Mall
2857c4831a storage api put-blob retention options (#1511)
* storage api put-blob retention options

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2021-11-15 19:46:42 -08:00
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Jarek Kowalski
89edfbf257 maintenance: send logs to content log as well (#1496) 2021-11-06 23:08:00 -07:00
Jarek Kowalski
03def8f33a server: maintenance in newly-created repo (#1494)
The issue in #1439 was caused by goroutine context being associated
with the HTTP request so it became canceled soon after the request was
over, thus the goroutine to run maintenance never ran.

Fixed by adding ctxutil.Detach()

Also fixed logging by passing top-level contexts to requests
and added --log-server-requests flag to `server start` which enables
request logging.
2021-11-06 17:10:53 -07:00
Jarek Kowalski
dcff6c285d Added support for logging policies (#1472)
* policy: introduced OptionalBool - refactoring

* policy: added logging policy

* testing: added support for symlinks and modtime to mockfs

* logging: exposed NullLogger instance

* upload: emit debug logs according to logging policies

* cli: logging policy support
2021-11-06 10:06:05 -07:00
Jarek Kowalski
2a6140d82f fixed directory read race condition (#1489)
This was introduced by a refactoring in #1361 - unlike
ioutil.ReadDir() which internally handles list/delete race and always
returns os.FileInfo, Info() on os.DirEntry can fail if a file
is deleted right after listing it.

Fixes #1486
2021-11-05 10:18:03 -07:00
Eng Zer Jun
c3f4c41591 refactor: move from ioutil.ReadDir to os.ReadDir (#1361)
* refactor: move from ioutil.ReadDir to os.ReadDir

This commit is an addition to PR #1360. According to
`ioutil.ReadDir` documentation (https://pkg.go.dev/io/ioutil#ReadDir),
`os.ReadDir` should be preferred as it is a more efficient and correct
implementation.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* perf: optimize localfs scan performance

Reference: https://github.com/kopia/kopia/pull/1361#issuecomment-937345195
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-11-04 16:57:24 -07:00
Jarek Kowalski
a0cfa2556f introduced structural debug logging and optional JSON output (#1475)
* logging: added Logger.Debugw(message, key1, value1, ..., keyN, valueN)

This is based on ZAP and allows structural logs to be emitted.

* cli: added --json-log-console and --json-log-file flags

* logging: updated storage logging wrapper to use structural logging

* pr feedback
2021-11-03 21:57:37 -07:00
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other components in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Jarek Kowalski
c8c433fb60 cli: when migrating snapshot honor destination policies (#1433)
In particular this applies compression based on destination repository
policies.

By default ignore rules are disabled during migration to preserve all
files, but that can be optionally enabled using '--apply-ignore-rules'.

Fixes #1429
2021-10-21 22:12:06 -07:00
Jarek Kowalski
39a195c6f7 cli: clarified usage of 'policy set --snapshot-time' (#1430)
Fixes #1426
2021-10-20 09:04:44 -07:00
Jarek Kowalski
76533246b4 sftp: support for password authentication (#1425)
Fixes #1373
2021-10-20 07:33:24 -07:00
Jarek Kowalski
3c4cf33376 cli: 'repo sync-to' will compare only unique ID not the entire format (#1423)
This allows it to tolerate format changes, such as the ones made
during upgrade.

Fixes #1421
2021-10-19 09:22:57 -07:00
Robert Schindler
3e43f8e76f Migrate: pick latest instead of oldest snapshot with --latest-only (#1420)
Fix #1414
2021-10-19 08:56:35 -07:00
Jarek Kowalski
97e05f996e cli: benchmark compressor memory usage (#1416) 2021-10-18 21:39:00 -07:00
Jarek Kowalski
d3568a6fdb cli: added 'index inspect --parallel' flag (#1404) 2021-10-17 18:35:21 -07:00
Jarek Kowalski
191a51b278 ui: fixed snapshotting UNC roots (#1401)
This was caused by additional resolution of path names only done in UI,
which caused \\hostname\share to be treated as relative and resolved
against the home directory.

Fixes #1385
Fixes #1362
2021-10-17 13:25:12 -07:00
Jarek Kowalski
5451a19296 cli: added 'snapshot delete --all-snapshots-for-source' (#1397) 2021-10-17 09:46:44 -07:00
Jarek Kowalski
51fdbf2048 nit: minor log output cleanups (#1380) 2021-10-13 06:07:18 -07:00
Jarek Kowalski
4a47bc3210 logging: switched from go-logging to zap (#1376)
This is much more efficient in terms of memory allocations
and speeds up backup due to less GC pressure.

Fixes #1345
2021-10-12 22:52:24 -07:00
Jarek Kowalski
8b760b66a8 logging: added memoization of Logger instances per context (#1369) 2021-10-09 05:02:18 -07:00
Eng Zer Jun
73e492c9db refactor: move from io/ioutil to io and os package (#1360)
* refactor: move from io/ioutil to io and os package

The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* chore: remove //nolint:gosec for os.ReadFile

At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-06 08:39:10 -07:00
Jarek Kowalski
3bb5b63289 sftp: fixed performance regression due to connection management (#1359)
* sftp: fixed performance regression due to connection management

The previous pooling was causing serialization of all requests, which
was too slow.

This change effectively reverts connection pooling but adds
automatic reconnection + unit tests.

* sftp: fixed unwanted retry on initial connection
2021-10-06 07:28:42 -07:00
CrendKing
93b9bf15b4 Support setting AWS S3 storage class for all types of blobs (#1335)
* Support setting AWS S3 storage class for all types of blobs

* Read .storageconfig file

* Improve loading logic

* Hide .storageconfig from ListBlobs()
2021-10-03 19:01:39 -07:00
Pawit Pornkitprasan
4c16770a8a cli: linux: read KOPIA_USE_KEYRING env (#1337)
On Linux, Keyring is disabled by default and `--use-keyring` must
be specified on every command-line invocation, which gets quite
annoying.

This commit allows enabling keyring from environment variable
instead.
2021-10-01 22:56:50 -07:00
cristihcd
d9c1bd0758 minor spelling and grammar fixes (#1318) 2021-09-28 15:19:02 -07:00
Jarek Kowalski
d56b93587e cli: only read first 128 MiB of provided file for compression benchmark (#1317) 2021-09-26 15:08:09 -07:00
Jarek Kowalski
1894727458 cli: fixed windows color output (#1305)
Fixes #1287
2021-09-19 21:59:24 -07:00
Jarek Kowalski
5ab712b3e7 cli: fixed error message when trying 'repo sync-to from-config' and the config file is connected to the repository server (#1304) 2021-09-19 21:54:47 -07:00
Jarek Kowalski
1566b74263 blob: support for custom blob store sharding (#1299)
* blob: support for custom blob store sharding

This is experimental.

The .shards file can reside in the root of any blob storage that uses
sharding (filesystem/sftp/webdav) and can specify rules for sharding.

{
  "default": [3,2,1],
  "overrides": [
    { "prefix": "p", "shards": [2,2] },
    { "prefix": "x", "shards": [1,1,1] }
  ],
  "maxNonShardedLength": 2
}

With this in place we'll be later able to do resharding of the
repository to optimize get/put/list performance for both repositories
and caches.
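
To make the example file above concrete, the sketch below parses it and derives a sharded path for a blob ID. The semantics are assumptions, not confirmed by this commit message: shard values are taken as successive prefix lengths, the first override whose prefix matches the ID wins, and IDs no longer than maxNonShardedLength are stored unsharded. The types and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path"
	"strings"
)

type override struct {
	Prefix string `json:"prefix"`
	Shards []int  `json:"shards"`
}

type shardConfig struct {
	Default             []int      `json:"default"`
	Overrides           []override `json:"overrides"`
	MaxNonShardedLength int        `json:"maxNonShardedLength"`
}

// example is the .shards content quoted in the commit message.
const example = `{
  "default": [3,2,1],
  "overrides": [
    { "prefix": "p", "shards": [2,2] },
    { "prefix": "x", "shards": [1,1,1] }
  ],
  "maxNonShardedLength": 2
}`

func parseConfig(data string) (shardConfig, error) {
	var c shardConfig
	err := json.Unmarshal([]byte(data), &c)
	return c, err
}

// shardedPath splits id into directory components (ASSUMED semantics).
func (c shardConfig) shardedPath(id string) string {
	if len(id) <= c.MaxNonShardedLength {
		return id // short IDs stay in the root
	}

	shards := c.Default
	for _, o := range c.Overrides {
		if strings.HasPrefix(id, o.Prefix) {
			shards = o.Shards
			break
		}
	}

	var parts []string
	rest := id
	for _, n := range shards {
		if len(rest) <= n {
			break // not enough characters left for another level
		}
		parts = append(parts, rest[:n])
		rest = rest[n:]
	}

	return path.Join(append(parts, rest)...)
}

func main() {
	c, err := parseConfig(example)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}

	for _, id := range []string{"abcdef1234", "pabcdef12", "x12345", "ab"} {
		fmt.Println(id, "->", c.shardedPath(id))
	}
}
```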

* cli: command line tools to manipulate shards in a directory
2021-09-19 18:50:38 -07:00
Jarek Kowalski
c48f2e5528 cli: added support for passing bare filename for --config-file parameter (#1293)
When `--config-file` is passed as a filename without any directory
(absolute or relative) it is resolved in the OS-specific
config path.

For example on macOS:

`--config-file foo.config`

resolves to:

`~/Library/Application Support/kopia/foo.config`
2021-09-15 18:30:38 -07:00
Jarek Kowalski
928150fe6b linter: upgrade to 1.42.1 (#1292) 2021-09-14 19:11:39 -07:00
Jarek Kowalski
8b2b91f9f9 content: fixed repo upgrade version (#1286)
* content: fixed repo upgrade version

Previously upgrade would enable epoch manager and index v2 but would
not set the version of the format itself. Everything worked fine
but it would not protect from old kopia opening the repository.

* ci: added compatibility test that uses real 0.8 and current binaries
2021-09-10 22:51:51 -07:00
Julio Lopez
06a997385a cli: include parameters in maintenance info JSON output (#981)
* cli: include parameters in maintenance info JSON output
* e2e: add maintenance info checks in e2e test
* cli: simple test for 'maintenance info --json' command
2021-09-10 17:51:55 -07:00
Jarek Kowalski
7e68d8e4c1 Consolidated format version flags (#1284) 2021-09-08 18:44:03 -07:00
Julio Lopez
1077cc24d3 blob/s3: point-in-time support for s3 versioned stores (#1259)
Adds a `--point-in-time` flag for `repo connect`
2021-09-02 21:38:08 -07:00
Jarek Kowalski
56edb042ae content: switched defaults to use v2 index format and epoch-based indexes (#1275) 2021-09-02 05:53:01 -07:00
Jarek Kowalski
9cebffc628 Fix endurance test (#1254) 2021-08-27 04:22:18 -07:00
Jarek Kowalski
9e182f131a linter: upgraded to 1.42.0 (#1246) 2021-08-20 18:26:45 -07:00
Jarek Kowalski
35d0f31c0d huge: replaced the use of allocated byte slices with populating gather.WriteBuffer in the repository (#1244)
This helps recycle buffers more efficiently during snapshots.
Also, improved memory tracking, enabled profiling flags and added pprof
by default.
2021-08-20 08:45:10 -07:00
Jarek Kowalski
1238a0db60 cli: additional 'index inspect' flags (#1231)
* cli: fixed output of 'index inspect' to go to stdout

* cli: additional flags for 'index inspect'
2021-08-06 21:54:38 -07:00
Jarek Kowalski
67165cae5f build(deps): bump github.com/prometheus/client_golang (#1226)
Includes manual change to fix linter deprecation warning.

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/master/CHANGELOG.md)
- [Commits](https://github.com/prometheus/client_golang/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-02 21:36:15 -07:00
Jarek Kowalski
d6d9a1fb5f Maintenance improvements for epoch-based index structures (#1225)
* testing: KOPIA_TEST_LOG_OUTPUT logs subcommand outputs

* cli: additional flags for 'blob list'

* Makefile: run all tests against epoch-based index manager

* epoch: added support for deletion watermark, which keeps track of latest maintenance which dropped index entries

* content: added deletion watermark to content manager

* maintenance: improved maintenance without safety to force rewrites

* maintenance: skip quick maintenance when epoch manager is enabled

* maintenance: do not enable quick maintenance when epoch manager is used

* testing: skip TestIndexOptimize when running against epoch manager-backed index structures
2021-08-02 21:08:54 -07:00