Commit Graph

562 Commits

Author SHA1 Message Date
Jarek Kowalski
9d63e56bb9 feat(cli): improved formatting of 'policy show' outputs (#1767) 2022-02-22 22:21:48 -08:00
Jarek Kowalski
6d29b2a082 feat(cli): added snapshot list flags: --storage-stats and --reverse (#1766)
For #1722
Related #1755
2022-02-21 20:24:13 -08:00
Jarek Kowalski
f1d3130351 refactor(repository): expose ContentInfo() on repo.Repository (#1765) 2022-02-21 14:38:59 -08:00
Jarek Kowalski
f5cee5ae42 refactor(snapshots): refactored TreeWalker to use workshare (#1762)
Also cleaned up the API and added unit tests.
2022-02-20 16:58:53 -08:00
Jarek Kowalski
a48e24e693 feat(providers): add Google Drive support (#1731)
* feat(provider): Add Google Drive support.

Co-authored-by: xkxx <xkxx@users.noreply.github.com>
Co-authored-by: xkxx <xkxiang@gmail.com>
2022-02-16 22:34:48 -08:00
Jarek Kowalski
90df511609 fix(snapshots): treat empty retention policy as retaining ALL, not NONE (#1733)
This is a safety measure which addresses P0 improvement for #1732.

Given that retention policies that retain nothing make no sense, this
is not considered a breaking change.
2022-02-07 11:40:27 -08:00
Shikhar Mall
63bedd3446 feat(cli): allow changing retention parameters from CLI (#1680)
Co-authored-by: Shikhar Mall <small@kopia.io>
2022-02-02 19:04:22 -08:00
Jarek Kowalski
e15852af41 test(cli): fixed test flake in TestServerControl (#1717) 2022-02-02 18:44:48 -08:00
Shikhar Mall
aa5e4cfb33 refactor(cli): An in-memory storage mock setup for CLI tests (#1697)
* refactor cli tests to allow the use of in-memory mock

* use in-memory repo for set-parameters cli tests

* move inmemory storage provider into test package

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2022-02-01 10:29:13 -08:00
Jarek Kowalski
fd163cfc20 feat(kopiaui): connect to repository asynchronously on startup (#1691)
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and user will see new `Initializing` page.

This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
2022-01-29 18:28:52 -08:00
Jarek Kowalski
f67274e229 fix(providers): fixed DoNotRecreate and tests for gcs (#1688)
Also simplified validation test suite, which will simply test whether
the provider supports DoNotRecreate or properly rejects it without
external configuration.
2022-01-29 09:12:07 -08:00
Jarek Kowalski
9cad0edb53 test(ui): added end-to-end HTML UI test (#1686)
* test(general): refactored parsing of server output

* test(ui): added experimental end-to-end test using chromedp
2022-01-29 01:34:45 -08:00
Ali Dowair
7ca8b85a57 feat(providers): expand PutBlob API to allow for idempotent puts (#1654)
* Add a new PutBlob option and blob error type

When `DoNotRecreate` is set as true, the blob put operation should
only succeed if no blob with the given blob ID already exists.
Othwerwise, `ErrBlobAlreadyExists` is returned.

* Validate default storage providers' support

By default, storage providers should not support idempotent creates.
This commit adds error handling to exit early if `DoNotRecreate` is
set to true. The commit also verifies this behavior in the provider
validation test.

* Implement support for new option in GCS storage

* Push PutBlob option handling down to Impl

When PutBlob options were introduced, error handling logic for them
was implemented for the Sharded storage interface. However, the
behavior of different providers that implement Sharded can be
different, so it's better to push the options down to be processed in
the provider implementations.

* Introduce new error type for unsupported put opts

To unify error handling code and make it more maintainable, introduce
a new error type `blob.ErrUnsupportedPutBlobOption`, which is to be
returned whenever a storage provider implementation is given put
options it does not support.
2022-01-27 08:49:06 -08:00
Jarek Kowalski
e67f84e0ba chore(general): updated linter to 1.44.0 (#1681) 2022-01-25 21:21:13 -08:00
Shikhar Mall
b592776edf feat(repository): persistence for blob-retention configuration (#1596)
* feat: persisting retention options in repository blob

 - plumb retention parameters through wrapped storage
 - generalize aes encryption mechanism
 - rewrite the retention blob on password change
 - do not write retention blob when empty

* handle retention-blob not-found failures

* cli params to set retention modes on repository create

* enable versioned map mock storage with retention settings

* adding unit tests

* write format and retention blob with retention settings if available

* rename certain functions and constants specific to format blob

* delete retention cache on password-change

* fix: replace SetTime() api call with TouchBlob()

* Update repo/repository_test.go

Co-authored-by: Nick <nick@kasten.io>

* pr feedback and codecov improvements

* fix: rename retention-blob structures to generic blob-cfg

* fix: remove minio dependency on retention constants

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>
2022-01-22 08:37:00 -08:00
Jarek Kowalski
32ed220a6c build(lint): enabled gochecknoglobals and tagged existing globals (#1664) 2022-01-15 12:54:56 -08:00
Jarek Kowalski
3d58566644 fix(security): prevent cross-site request forgery in the UI website (#1653)
* fix(security): prevent cross-site request forgery in the UI website

This fixes a [cross-site request forgery (CSRF)](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
vulnerability in self-hosted UI for Kopia server.

The vulnerability allows potential attacker to make unauthorized API
calls against a running Kopia server. It requires an attacker to trick
the user into visiting a malicious website while also logged into a
Kopia website.

The vulnerability only affected self-hosted Kopia servers with UI. The
following configurations were not vulnerable:

* Kopia Repository Server without UI
* KopiaUI (desktop app)
* command-line usage of `kopia`

All users are strongly recommended to upgrade at the earliest
convenience.

* pr feedback
2022-01-13 11:31:51 -08:00
Jarek Kowalski
2e9a57f0b4 server: support for server control APIs and tooling (#1644)
This adds new set of APIs `/api/v1/control/*` which can be used to administratively control a running server.

Once the server is started, the administrative user can control it
using CLI commands:

export KOPIA_SERVER_ADDRESS=...
export KOPIA_SERVER_CERT_FINGERPRINT=...
export KOPIA_SERVER_PASSWORD=...

* `kopia server status` - displays status of sources managed by the server
* `kopia server snapshot` - triggers server-side upload of snapshots for managed sources
* `kopia server cancel` - cancels upload of snapshots for managed sources
* `kopia server pause` - pauses scheduled snapshots for managed sources
* `kopia server resume` - resumes scheduled snapshots for managed sources
* `kopia server refresh` - causes server to resynchronize with externally-made changes, such as policies or new sources
* `kopia server flush` - causes server to flush all pending writes
* `kopia server shutdown` - graceful shutdown of the server

Authentication uses new user `server-control` and is disabled
by default. To enable it when starting the server, provide the password
using one of the following methods:

* `--server-control-password`
* `--random-server-control-password`
* `.htpasswd` file
* `KOPIA_SERVER_CONTROL_PASSWORD` environment variable

This change allows us to tighten the API security and remove some
methods that UI user was able to call, but which were not needed.
2022-01-03 18:48:38 -08:00
Jarek Kowalski
c66b1c3e76 server: moved serving of static files to internal/server package (#1637) 2022-01-01 13:07:47 -08:00
Jarek Kowalski
f56ad31d41 ui: apply dark mode default and persist user choice (#1621) 2021-12-23 12:09:55 -08:00
Jarek Kowalski
7401684e71 blob: replaced blob.Storage.SetTime() method with blob.PutOptions.SetTime (#1595)
* sharded: plumbed through blob.PutOptions

* blob: removed blob.Storage.SetTime() method

This was only used for `kopia repo sync-to` and got replaced with
an equivalent blob.PutOptions.SetTime, which wehn set to non-zero time
will attempt to set the modification time on a file.

Since some providers don't support changing modification time, we
are able to emulate it using per-blob metadata (on B2, Azure and GCS),
sadly S3 is still unsupported, because it does not support returning
metadata in list results.

Also added PutOptions.GetTime, which when set to not nil, will
populate the provided variable with actual time that got assigned
to the blob.

Added tests that verify that each provider supports GetTime
and SetTime according to this spec.

* blob: additional test coverage for filesystem storage

* blob: added PutBlobAndGetMetadata() helper and used where appropriate

* fixed test failures

* pr feedback

* Update repo/blob/azure/azure_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* Update repo/blob/filesystem/filesystem_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* Update repo/blob/filesystem/filesystem_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* blobtesting: fixed object_locking_map.go

* blobtesting: removed SetTime from ObjectLockingMap

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
2021-12-18 14:00:20 -08:00
strager
4dd28a3346 Fix typo in policy help documentation (#1597)
The field is called ignoreDotFiles in JSON, but the documentation says
dotIgnoreFiles. Fix the docs to refer to the correct field name.
2021-12-18 11:17:27 -08:00
Jarek Kowalski
e870fcc4aa cli: reduce number of index blob writes in 'snapshot create --all' (#1574) 2021-12-11 10:08:34 -08:00
Jarek Kowalski
7673753050 Merge retention tags in snapshot lists (#1567)
* cli: refactored snapshot list

* cli: show range tags in snapshot list

For example if N snapshots are coalesced together because they
have identical roots we may emit now:

```
  2021-03-31 23:09:27 PDT ked3400debc7dd61baffab070bafd59cd (monthly-10)
  2021-04-30 06:12:53 PDT kd0576d212e55a831b7ff1636f90a7233 (monthly-4..9)
  + 5 identical snapshots until 2021-09-30 23:00:19 PDT
  2021-10-31 23:22:25 PDT k846bf22aa2863d27f05e820f840b14f8 (monthly-3)
  2021-11-08 21:29:31 PST k5793ddcd61ef27b93c75ab74a5828176 (latest-1..3,hourly-1..13,daily-1..7,weekly-1..4,monthly-1..2,annual-1)
  + 18 identical snapshots until 2021-12-04 10:09:54 PST
```

* server: server-side coalescing of snapshot

* ui: added coalescing of retention tags
2021-12-05 20:49:41 -08:00
Jarek Kowalski
2cb05f7501 logging: added log rotation and improved predictability of log sweep (#1562)
* logging: added log rotation and improved predictability of log sweep

With this change logs will be rotated every 50 MB, which prevents
accumulation of giant files while server is running.

This change will also guarantee that log sweep completes at least once
before each invocation of Kopia finishes. Previously it was a goroutine
that was not monitored for completion.

Flags can be used to override default behaviors:

* `--max-log-file-segment-size`
* `--no-wait-for-log-sweep` - disables waiting for full log sweep

Fixes #1561

* logging: added --log-dir-max-total-size-mb flag

This limits the total size of all logs in a directory to 1 GB.
2021-12-03 16:43:46 -08:00
Jarek Kowalski
920341cb68 cache: prevent metadata cache thrashing if working set exceeds max defined size (#1557)
This is done by protecting newly added cache items from being swept for
X amount of time where X defaults to:

* `metadata` - 24 hours (new)
* `data` - 10 min (new)
* `indexes` - 1 hours (same as today)

Fixes #1540
2021-12-03 15:35:01 -08:00
Jarek Kowalski
93930d20cb policy: revamped policy merge mechanism (#1538)
Added policy.Definition which allows us to precisely report where
each piece of policy came from.

Fixed a one-off bug with "noParent", which prevented merging of parent
policies one level too soon.

Added a whole bunch of merging helpers and generic reflection-based
test that ensures every single merge is tested.
2021-11-27 18:14:45 -08:00
Lukas Rieger
5224f79d7d [snapshot restore] use non-atomic writes (#1534)
* don't flush every file two times on snapshot restore
2021-11-26 13:10:44 -08:00
Jarek Kowalski
8ab3e049d2 cli: fixed 'snapshot list --json --max-results' (#1529)
Fixes #1455
2021-11-20 22:42:27 -08:00
Jarek Kowalski
525720db95 cli: added 'snapshot pin' command (#1528) 2021-11-20 20:53:25 -08:00
CrendKing
2394b420b0 Change Mbit/s units to MB/s (base-10) (#1522) 2021-11-18 15:41:40 -08:00
Jarek Kowalski
62edab618f throtting: implemented a Throttler based on token bucket and configur… (#1512)
* throtting: implemented a Throttler based on token bucket and configurable window.

* cli: rewired throttle options to use common Limits structure and helpers

The JSON is backwards compatible.

* blob: remove explicit throttling from gcs,s3,b2 & azure

* cleanup: removed internal/throttle

* repo: add throttling wrapper around storage at the repository level

* throttling: expose APIs to get limits and add validation

* server: expose API to get/set throttle in a running server

* pr feedback
2021-11-16 07:39:26 -08:00
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).

* fixed compat tests
2021-11-16 06:02:04 -08:00
Shikhar Mall
2857c4831a storage api put-blob retention options (#1511)
* storage api put-blob retention options

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2021-11-15 19:46:42 -08:00
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Jarek Kowalski
89edfbf257 maintenance: send logs to content log as well (#1496) 2021-11-06 23:08:00 -07:00
Jarek Kowalski
03def8f33a server: maintenance in newly-created repo (#1494)
The issue in #1439 was caused by goroutine context being associated
with the HTTP request so it became canceled soon after the request was
over, thus the goroutine to run maintenance never ran.

Fixed by adding ctxutil.Detach()

Also fixed logging by passing top-level contexts to requests
and added --log-server-requests flag to `server start` which enables
request logging.
2021-11-06 17:10:53 -07:00
Jarek Kowalski
dcff6c285d Added support for logging policies (#1472)
* policy: introduced OptionalBool - refactoring

* policy: added logging policy

* testing: added support for symlinks and modtime to mockfs

* logging: exposed NullLogger instance

* upload: emit debug logs according to logging policies

* cli: logging policy support
2021-11-06 10:06:05 -07:00
Jarek Kowalski
2a6140d82f fixed directory read race condition (#1489)
This was introduced by a refactoring in #1361 - unlike
ioutil.ReadDir() which internally handles list/delete race and always
returns os.FileInfo, Info() on os.DirEntry can fail if a file
is deleted right after listing it.

Fixes #1486
2021-11-05 10:18:03 -07:00
Eng Zer Jun
c3f4c41591 refactor: move from ioutil.ReadDir to os.ReadDir (#1361)
* refactor: move from ioutil.ReadDir to os.ReadDir

This commit is an addition to PR #1360. According to
`ioutil.ReadDir` documentation (https://pkg.go.dev/io/ioutil#ReadDir),
`os.ReadDir` should be preferred as it is a more efficient and correct
implementation.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* perf: optimize localfs scan performance

Reference: https://github.com/kopia/kopia/pull/1361#issuecomment-937345195
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-11-04 16:57:24 -07:00
Jarek Kowalski
a0cfa2556f introduced structural debug logging and optional JSON output (#1475)
* logging: added Logger.Debugw(message, key1, value1, ..., keyN, valueN)

This is based on ZAP and allows structural logs to be emitted.

* cli: added --json-log-console and --json-log-file flags

* logging: updated storage logging wrapper to use structural logging

* pr feedback
2021-11-03 21:57:37 -07:00
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Jarek Kowalski
c8c433fb60 cli: when migrating snapshot honor destination policies (#1433)
In particular this applies compression based on destination repository
policies.

By default ignore rules are disabled during migration to preserve all
files, but that can be optionally enabled using '--apply-ignore-rules'.

Fixes #1429
2021-10-21 22:12:06 -07:00
Jarek Kowalski
39a195c6f7 cli: clarified usage of 'policy set --snapshot-time` (#1430)
Fixes #1426
2021-10-20 09:04:44 -07:00
Jarek Kowalski
76533246b4 sftp: support for password authentication (#1425)
Fixes #1373
2021-10-20 07:33:24 -07:00
Jarek Kowalski
3c4cf33376 cli: 'repo sync-to' will compare only unique ID not the entire format (#1423)
This allows it to tolerate format changes, such as the ones made
during upgrade.

Fixes #1421
2021-10-19 09:22:57 -07:00
Robert Schindler
3e43f8e76f Migrate: pick latest instead of oldest snapshot with --latest-only (#1420)
Fix #1414
2021-10-19 08:56:35 -07:00
Jarek Kowalski
97e05f996e cli: benchmark compressor memory usage (#1416) 2021-10-18 21:39:00 -07:00
Jarek Kowalski
d3568a6fdb cli: added 'index inspect --parallel' flag (#1404) 2021-10-17 18:35:21 -07:00
Jarek Kowalski
191a51b278 ui: fixed snapshotting UNC roots (#1401)
This was caused by additional resolution of path names only done in UI,
which caused \\hostname\share to be treated as relative and resolved
against the home directory.

Fixes #1385
Fixes #1362
2021-10-17 13:25:12 -07:00