Commit Graph

285 Commits

Author SHA1 Message Date
Julio Lopez
e420d74096 refactor(general): minor cleanups in robustness framework (#1971)
- nit: re-group struct fields
- nit: use consts
- nit: remove unnecessary fmt.Errorf(...) usage
- nit: avoid unnecessarily calling defaultActionControls when there is already a value
- robustness: increase readability in actions map declaration
- Prefer named functions over closures for the actions.
- nit: fix typo
- nit: simplify robustness Log.StringThisRun
  Iterates only over the tail of the slice, which avoids iterating over the entire slice
2022-05-24 21:15:55 +00:00
Jarek Kowalski
a61927e089 chore(infra): added more leak checks to tests (#1953) 2022-05-16 06:37:57 +00:00
Jarek Kowalski
81c0580a01 feat(cli): REVERT added 'content delete --forget' flag (#1932) (#1940)
This reverts commit d990af4dc2.
2022-05-10 03:45:25 +00:00
Jarek Kowalski
d990af4dc2 feat(cli): added 'content delete --forget' flag (#1932)
* feat(cli): added 'content delete --forget' flag

This allows low-level hiding of entries in the index, which makes
them completely invisible.

For #1906

* improved code coverage

* pr feedback
2022-05-06 21:16:12 -07:00
chaitalisg
325edd2229 test(general): Ignore directory size check before and after restore (#1904) 2022-04-27 00:13:31 +00:00
chaitalisg
08cf7eb936 test(general): Add a test method to perform repository format version upgrade (#1832)
Authored-by: Chaitali Gondhalekar (@chaitalisg)
Co-authored-by: Julio (@julio-lopez)
2022-03-30 15:16:49 -07:00
Ali Dowair
aafe56cd6f feat(snapshots): support restoring sparse files (#1823)
* feat(snapshots): support restoring sparse files

This commit implements basic support for restoring sparse files from
a snapshot. When specifying "--mode=sparse" in a snapshot restore
command, Kopia will make a best effort to make sure the underlying
filesystem allocates the minimum amount of blocks needed to persist
restored files. In other words, enabling this feature will "force"
all restored files to be sparse: blocks of zero bytes in the source
file should not be allocated.

* Address review comments

- Separate sparse option into its own bool flag
- Implement sparsefile package with copySparse method
- Truncate once before writing sparse file
- Check error from Truncate
- Add unit test for copySparse
- Invoke GetBlockSize once per file copy
- Remove support for Windows and explain why
- Add unit test for stat package

Co-authored-by: Dave Smith-Uchida <dave@kasten.io>
2022-03-22 19:09:50 -07:00
Jarek Kowalski
daa62de3e4 chore(ci): added checklocks static analyzer (#1838)
From https://github.com/google/gvisor/tree/master/tools/checklocks

This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.

This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.

In a handful of places the code had to be refactored to allow static
analyzer to do its job better or to not be confused by some
constructs.

In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.

The check is part of `make lint` but can also be invoked by
`make check-locks`.
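
As a rough illustration of the annotation style, here is a small sketch. The `counter` type is hypothetical and does not come from Kopia; the `+checklocks:mu` comment follows the convention documented by the gVisor checklocks tool. The `defer` shows one common way to avoid the kind of bug mentioned above, where a lock is not released on an error path.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// counter is a hypothetical type used only to show the annotation.
type counter struct {
	mu sync.Mutex
	// +checklocks:mu
	value int
}

// add increments the counter and rejects negative deltas.
// The deferred Unlock runs on the error return as well, so the mutex is
// never left held.
func (c *counter) add(delta int) (int, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if delta < 0 {
		return c.value, errors.New("negative delta")
	}

	c.value += delta

	return c.value, nil
}

func main() {
	c := &counter{}

	if _, err := c.add(-1); err != nil {
		fmt.Println("rejected:", err)
	}

	// This call would deadlock if the error path above had kept the lock.
	v, _ := c.add(2)
	fmt.Println("value:", v)
}
```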
2022-03-19 22:42:59 -07:00
Jarek Kowalski
c6944c70b1 fix(repository): fix deduplication when snapshotting identical files in parallel (#1815)
This issue was reported on Slack by Ming. Thanks!
This is particularly easy to reproduce when compression is used.
2022-03-08 20:11:06 -08:00
Jarek Kowalski
369d304084 refactor(repository): better context cancelation handling (#1802)
Instead of ignoring context cancelation in Open(), ensure we don't
spawn goroutines that might be canceled.
2022-03-06 16:56:30 -08:00
Jarek Kowalski
90df511609 fix(snapshots): treat empty retention policy as retaining ALL, not NONE (#1733)
This is a safety measure which addresses P0 improvement for #1732.

Given that retention policies that retain nothing make no sense, this
is not considered a breaking change.
2022-02-07 11:40:27 -08:00
Shikhar Mall
63bedd3446 feat(cli): allow changing retention parameters from CLI (#1680)
Co-authored-by: Shikhar Mall <small@kopia.io>
2022-02-02 19:04:22 -08:00
Jarek Kowalski
fd163cfc20 feat(kopiaui): connect to repository asynchronously on startup (#1691)
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and the user will see a new `Initializing` page.

This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
2022-01-29 18:28:52 -08:00
Jarek Kowalski
f404806557 fix(ci): fixed linter issue, do not ignore in workflow (#1687) 2022-01-29 08:15:24 -08:00
Jarek Kowalski
9cad0edb53 test(ui): added end-to-end HTML UI test (#1686)
* test(general): refactored parsing of server output

* test(ui): added experimental end-to-end test using chromedp
2022-01-29 01:34:45 -08:00
Jarek Kowalski
e67f84e0ba chore(general): updated linter to 1.44.0 (#1681) 2022-01-25 21:21:13 -08:00
Jarek Kowalski
aeb483a081 fix(testing): fixed robustness tests (#1661)
Fixes #1660
Broken by #1644
2022-01-19 17:12:23 -08:00
Jarek Kowalski
003b150a0e fix(ui): fixed HTTP 400 response when repository is not connected (#1659) 2022-01-14 08:47:41 -08:00
Jarek Kowalski
3d58566644 fix(security): prevent cross-site request forgery in the UI website (#1653)
* fix(security): prevent cross-site request forgery in the UI website

This fixes a [cross-site request forgery (CSRF)](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
vulnerability in self-hosted UI for Kopia server.

The vulnerability allows a potential attacker to make unauthorized API
calls against a running Kopia server. It requires an attacker to trick
the user into visiting a malicious website while also logged into a
Kopia website.

The vulnerability only affected self-hosted Kopia servers with UI. The
following configurations were not vulnerable:

* Kopia Repository Server without UI
* KopiaUI (desktop app)
* command-line usage of `kopia`

All users are strongly recommended to upgrade at the earliest
convenience.

* pr feedback
2022-01-13 11:31:51 -08:00
Jarek Kowalski
2e9a57f0b4 server: support for server control APIs and tooling (#1644)
This adds a new set of APIs `/api/v1/control/*` which can be used to administratively control a running server.

Once the server is started, the administrative user can control it
using CLI commands:

export KOPIA_SERVER_ADDRESS=...
export KOPIA_SERVER_CERT_FINGERPRINT=...
export KOPIA_SERVER_PASSWORD=...

* `kopia server status` - displays status of sources managed by the server
* `kopia server snapshot` - triggers server-side upload of snapshots for managed sources
* `kopia server cancel` - cancels upload of snapshots for managed sources
* `kopia server pause` - pauses scheduled snapshots for managed sources
* `kopia server resume` - resumes scheduled snapshots for managed sources
* `kopia server refresh` - causes server to resynchronize with externally-made changes, such as policies or new sources
* `kopia server flush` - causes server to flush all pending writes
* `kopia server shutdown` - graceful shutdown of the server

Authentication uses a new user `server-control` and is disabled
by default. To enable it when starting the server, provide the password
using one of the following methods:

* `--server-control-password`
* `--random-server-control-password`
* `.htpasswd` file
* `KOPIA_SERVER_CONTROL_PASSWORD` environment variable

This change allows us to tighten the API security and remove some
methods that UI user was able to call, but which were not needed.
2022-01-03 18:48:38 -08:00
Jarek Kowalski
b81362d72c testing: do not run randomized tests in code coverage mode (#1585) 2021-12-13 22:07:50 -08:00
Jarek Kowalski
e870fcc4aa cli: reduce number of index blob writes in 'snapshot create --all' (#1574) 2021-12-11 10:08:34 -08:00
Jarek Kowalski
7673753050 Merge retention tags in snapshot lists (#1567)
* cli: refactored snapshot list

* cli: show range tags in snapshot list

For example, if N snapshots are coalesced together because they
have identical roots, we may now emit:

```
  2021-03-31 23:09:27 PDT ked3400debc7dd61baffab070bafd59cd (monthly-10)
  2021-04-30 06:12:53 PDT kd0576d212e55a831b7ff1636f90a7233 (monthly-4..9)
  + 5 identical snapshots until 2021-09-30 23:00:19 PDT
  2021-10-31 23:22:25 PDT k846bf22aa2863d27f05e820f840b14f8 (monthly-3)
  2021-11-08 21:29:31 PST k5793ddcd61ef27b93c75ab74a5828176 (latest-1..3,hourly-1..13,daily-1..7,weekly-1..4,monthly-1..2,annual-1)
  + 18 identical snapshots until 2021-12-04 10:09:54 PST
```

* server: server-side coalescing of snapshot

* ui: added coalescing of retention tags
2021-12-05 20:49:41 -08:00
Jarek Kowalski
5f04fad003 ui: major improvements to new snapshot flow (#1565)
* ui: changed how PolicyEditor is instantiated via a route

* server: added paths/resolve API

* server: refresh affected source manager after policy change

Also switched the 15-second refresh cycle, which is way too aggressive,
to a 30-minute cycle (the manual refresh button can be used if needed).

* policy: allow overriding top-level policy for estimation

* server: changed source create API to always require policy

* ui: streamlined new snapshot and estimate flow

* linter fix
2021-12-04 22:13:10 -08:00
Jarek Kowalski
8ab3e049d2 cli: fixed 'snapshot list --json --max-results' (#1529)
Fixes #1455
2021-11-20 22:42:27 -08:00
Jarek Kowalski
62edab618f throttling: implemented a Throttler based on token bucket and configur… (#1512)
* throttling: implemented a Throttler based on token bucket and configurable window.

* cli: rewired throttle options to use common Limits structure and helpers

The JSON is backwards compatible.

* blob: remove explicit throttling from gcs,s3,b2 & azure

* cleanup: removed internal/throttle

* repo: add throttling wrapper around storage at the repository level

* throttling: expose APIs to get limits and add validation

* server: expose API to get/set throttle in a running server

* pr feedback
2021-11-16 07:39:26 -08:00
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

It turns out that for a very large repository of around 100 TB (5M blobs),
we end up creating up to ~16M directories, which is way too many
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).

* fixed compat tests
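
To make the shard notation concrete, here is a small sketch of how a blob ID might be split into directories for a given shard specification. `shardedPath` and `maxLeafDirs` are hypothetical helpers, and the directory count assumes every sharded character is a hexadecimal digit, which is a simplification.

```go
package main

import (
	"fmt"
	"strings"
)

// shardedPath splits blobID into nested directories whose name lengths
// are given by shards; the remainder becomes the file name.
func shardedPath(blobID string, shards []int) string {
	var parts []string

	rest := blobID

	for _, n := range shards {
		// Stop sharding when the ID is too short to leave a file name.
		if len(rest) <= n {
			break
		}

		parts = append(parts, rest[:n])
		rest = rest[n:]
	}

	parts = append(parts, rest)

	return strings.Join(parts, "/")
}

// maxLeafDirs returns the upper bound on leaf directories, assuming each
// sharded character can take 16 values.
func maxLeafDirs(shards []int) int {
	total := 1

	for _, n := range shards {
		for i := 0; i < n; i++ {
			total *= 16
		}
	}

	return total
}

func main() {
	for _, shards := range [][]int{{3, 3}, {1, 3}} {
		fmt.Println(shards, shardedPath("abcdef0123456789", shards), maxLeafDirs(shards))
	}
}
```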
2021-11-16 06:02:04 -08:00
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Eng Zer Jun
c3f4c41591 refactor: move from ioutil.ReadDir to os.ReadDir (#1361)
* refactor: move from ioutil.ReadDir to os.ReadDir

This commit is an addition to PR #1360. According to
`ioutil.ReadDir` documentation (https://pkg.go.dev/io/ioutil#ReadDir),
`os.ReadDir` should be preferred as it is a more efficient and correct
implementation.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* perf: optimize localfs scan performance

Reference: https://github.com/kopia/kopia/pull/1361#issuecomment-937345195
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-11-04 16:57:24 -07:00
Jarek Kowalski
a0cfa2556f introduced structural debug logging and optional JSON output (#1475)
* logging: added Logger.Debugw(message, key1, value1, ..., keyN, valueN)

This is based on ZAP and allows structural logs to be emitted.

* cli: added --json-log-console and --json-log-file flags

* logging: updated storage logging wrapper to use structural logging

* pr feedback
2021-11-03 21:57:37 -07:00
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other components in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Jarek Kowalski
c8c433fb60 cli: when migrating snapshot honor destination policies (#1433)
In particular this applies compression based on destination repository
policies.

By default ignore rules are disabled during migration to preserve all
files, but that can be optionally enabled using '--apply-ignore-rules'.

Fixes #1429
2021-10-21 22:12:06 -07:00
Jarek Kowalski
3c4cf33376 cli: 'repo sync-to' will compare only unique ID not the entire format (#1423)
This allows it to tolerate format changes, such as the ones made
during upgrade.

Fixes #1421
2021-10-19 09:22:57 -07:00
Jarek Kowalski
e96ddc6105 upload: fix ignoring files after action redirects snapshot directory (#1409)
Fixes #1399
Fixes #1346
2021-10-17 23:02:00 -07:00
Jarek Kowalski
d9e0723fe4 actions: expose KOPIA_ACTION and KOPIA_VERSION environment variables (#1398)
Fixes #1394
2021-10-17 11:59:36 -07:00
Jarek Kowalski
5451a19296 cli: added 'snapshot delete --all-snapshots-for-source' (#1397) 2021-10-17 09:46:44 -07:00
Jarek Kowalski
97c4483cc9 testing: added tests for filesystem --flat option (#1384) 2021-10-13 21:21:55 -07:00
Jarek Kowalski
4a47bc3210 logging: switched from go-logging to zap (#1376)
This is much more efficient in terms of memory allocations
and speeds up backup due to less GC pressure.

Fixes #1345
2021-10-12 22:52:24 -07:00
Z
33c8733750 add unicode filename test, add env switches for long filenames/unicode filenames, update workflow file to include env variables (#1371) 2021-10-09 12:38:36 -07:00
Eng Zer Jun
73e492c9db refactor: move from io/ioutil to io and os package (#1360)
* refactor: move from io/ioutil to io and os package

The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* chore: remove //nolint:gosec for os.ReadFile

At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-06 08:39:10 -07:00
Jarek Kowalski
792cc874dc repo: allow reusing of object writer buffers (#1315)
This reduces memory consumption and speeds up backups.

1. Backing up kopia repository (3.5 GB files:133102 dirs:20074):

before: 25s, 490 MB
after: 21s, 445 MB

2. Large files (14.8 GB, 76 files)

before: 30s, 597 MB
after: 28s, 495 MB

All tests repeated 5 times for clean local filesystem repo.
2021-09-25 14:54:31 -07:00
Jarek Kowalski
928150fe6b linter: upgrade to 1.42.1 (#1292) 2021-09-14 19:11:39 -07:00
Jarek Kowalski
8b2b91f9f9 content: fixed repo upgrade version (#1286)
* content: fixed repo upgrade version

Previously upgrade would enable epoch manager and index v2 but would
not set the version of the format itself. Everything worked fine
but it would not protect from old kopia opening the repository.

* ci: added compatibility test that uses real 0.8 and current binaries
2021-09-10 22:51:51 -07:00
Julio Lopez
06a997385a cli: include parameters in maintenance info JSON output (#981)
* cli: include parameters in maintenance info JSON output
* e2e: add maintenance info checks in e2e test
* cli: simple test for 'maintenance info --json' command
2021-09-10 17:51:55 -07:00
Jarek Kowalski
d98b0edead endurance: rewrote test to be more stable (#1285) 2021-09-09 21:05:33 -07:00
Jarek Kowalski
7e68d8e4c1 Consolidated format version flags (#1284) 2021-09-08 18:44:03 -07:00
Jarek Kowalski
9e182f131a linter: upgraded to 1.42.0 (#1246) 2021-08-20 18:26:45 -07:00
Jarek Kowalski
35d0f31c0d huge: replaced the use of allocated byte slices with populating gather.WriteBuffer in the repository (#1244)
This helps recycle buffers more efficiently during snapshots.
Also, improved memory tracking, enabled profiling flags and added pprof
by default.
2021-08-20 08:45:10 -07:00
Jarek Kowalski
d6d9a1fb5f Maintenance improvements for epoch-based index structures (#1225)
* testing: KOPIA_TEST_LOG_OUTPUT logs subcommand outputs

* cli: additional flags for 'blob list'

* Makefile: run all tests against epoch-based index manager

* epoch: added support for deletion watermark, which keeps track of latest maintenance which dropped index entries

* content: added deletion watermark to content manager

* maintenance: improved maintenance without safety to force rewrites

* maintenance: skip quick maintenance when epoch manager is enabled

* maintenance: do not enable quick maintenance when epoch manager is used

* testing: skip TestIndexOptimize when running against epoch manager-backed index structures
2021-08-02 21:08:54 -07:00
Jarek Kowalski
730ba7b94a Repository password change support (#1197)
* repo: added 'enable password change' flag (defaults to true for new repositories), which prevents embedding replicas of kopia.repository in pack blobs

* cli: added 'repo change-password' which can change the password of a connected repository

* repo: nit - renamed variables and functions dealing with key derivation

* repo: fixed cache validation HMAC secret to use stored HMAC secret instead of password-derived one

* cli: added test for repo change-password

* repo: negative cases for attempting to change password in an old repository

* Update cli/command_repository_change_password.go

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-07-17 07:58:02 -07:00