Commit Graph

591 Commits

Author SHA1 Message Date
Julio Lopez
511f4aa65d chore(cli): minor metrics-related cleanups (#1995)
* stop ticker to release resources
* nit: fix typo
* nit: add new line at EOF
2022-05-31 14:04:01 -07:00
Julio Lopez
99fb50118f fix(cli): add retention to JSON output (#1992)
refactor(cli): snapshot list JSON functionality.
Defines SnapshotManifest struct for the snapshot list JSON output.

test(cli): `snapshot list --json`
2022-05-31 13:43:42 -07:00
Jarek Kowalski
17c74e6386 feat(cli): added open telemetry tracing support (#1988)
New flag `--enable-jaeger-collector` and the corresponding
`KOPIA_ENABLE_JAEGER_COLLECTOR` environment variable enables Jaeger
exporter, which by default sends OTEL traces to Jaeger collector on
http://localhost:14268/api/traces

To change this, use environment variables:

* `OTEL_EXPORTER_JAEGER_ENDPOINT`
* `OTEL_EXPORTER_JAEGER_USER`
* `OTEL_EXPORTER_JAEGER_PASSWORD`

When tracing is disabled, the impact on performance is negligible.

To see this in action:

1. Download latest Jaeger all-in-one from https://www.jaegertracing.io/download/
2. Run `jaeger-all-in-one` binary without any parameters.
3. Run `kopia --enable-jaeger-collector snapshot create ...`
4. Go to http://localhost:16686/search and search for traces
2022-05-28 10:39:00 -07:00
Jarek Kowalski
e8c1cfe142 feat(cli): added flags for pushing kopia metrics (#1983)
When enabled, metrics are pushed to the provided Prometheus Push
Gateway at the start and end of each command and periodically every
few seconds.

```
--metrics-push-addr=http://address:port
--metrics-push-interval=5s
--metrics-push-job=kopia
--metrics-push-grouping=a:b --metrics-push-grouping=c:d
--metrics-push-username=user
--metrics-push-password=pass
```
2022-05-28 07:44:59 -07:00
Julio Lopez
e9ca80dd27 refactor(cli): deprecate snapshot gc command (#1973)
* refactor(cli): deprecate `snapshot gc` command
* replace `snapshot gc` in tests with `maintenance run`
2022-05-26 05:51:36 +00:00
Jarek Kowalski
9bf9cac7fb refactor(repository): ensure we always parse content.ID and object.ID (#1960)
* refactor(repository): ensure we always parse content.ID and object.ID

This changes the types to be incompatible with string to prevent direct
conversion to and from string.

This has the additional benefit of reducing number of memory allocations
and bytes for all IDs.

content.ID went from 2 allocations to 1:
   typical case 32 characters + 16 bytes per-string overhead
   worst-case 65 characters + 16 bytes per-string overhead
   now: 34 bytes

object.ID went from 2 allocations to 1:
   typical case 32 characters + 16 bytes per-string overhead
   worst-case 65 characters + 16 bytes per-string overhead
   now: 36 bytes

* move index.{ID,IDRange} methods to separate files

* replaced index.IDFromHash with content.IDFromHash externally

* minor tweaks and additional tests

* Update repo/content/index/id_test.go

Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>

* Update repo/content/index/id_test.go

Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>

* pr feedback

* post-merge fixes

* pr feedback

* pr feedback

* fixed subtle regression in sortedContents()

This was actually not producing invalid results because of how base36
works, just not sorting as efficiently as it could.

Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
2022-05-25 14:15:56 +00:00
Jarek Kowalski
f49bcdd883 feat(cli): implementation for 'kopia snapshot fix' (#1930)
* feat(cli): implementation for 'kopia snapshot fix'

This allows modifications and fixes to the snapshots after they have
been taken.

Supported are:

* `kopia snapshot fix remove-invalid-files [--verify-files-percent=X]`

Removes all directory entries where the underlying files cannot be
read based on index analysis (this does not read the files, only index
structures so is reasonably quick).

`--verify-files-percent=100` can be used to trigger full read for
all files.

* `kopia snapshot fix remove-files --object-id=<object-id>`

Removes the object with a given ID from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.

* `kopia snapshot fix remove-files --filename=<wildcard>`

Removes the files with a given name from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.

By default all snapshots are analyzed and rewritten. To limit the scope
use:

--source=user@host:/path
--manifest-id=manifestID

By default the rewrite operation writes new directory entries but
does not replace the manifests. To do that pass `--commit`.

Related #1906
Fixes #799

reorganized CLI per PR suggestion

* additional logging for diff command

* added Clone() method to snapshot manifst and directory entry

* added a comprehensive test, moved DirRewriter  to separate file

* pr feedback

* more pr feedback

* improved logging output

* disable test in -race configuration since it's way to slow

* pr feedback
2022-05-25 01:17:55 +00:00
Julio Lopez
3d1de6f27a chore(general): minor cleanups (#1959)
- expand command flag description for clarification
- include blob id in blob get error in the cache
- nit: remove unused BOTO_PATH
- nit: fix comment
- cleanup: remove unnecessary function declaration in interface
- leverage 'testify' to simplify test
2022-05-23 15:16:25 -07:00
Jarek Kowalski
99eeb3c063 feat(cli): added CLI for controlling throttler (#1956)
Supported are:

```
$ kopia throttle set \
    --download-bytes-per-second=N | unlimited
    --upload-bytes-per-second=N | unlimited
    --read-requests-per-second=N | unlimited
    --write-requests-per-second=N | unlimited
    --list-requests-per-second=N | unlimited
    --concurrent-reads=N | unlimited
    --concurrent-writes=N | unlimited
```

To change parameters of a running server use:

```
$ kopia server throttle set \
    --address=<server-url> \
    --server-control-password=<password> \
    --download-bytes-per-second=N | unlimited
    --upload-bytes-per-second=N | unlimited
    --read-requests-per-second=N | unlimited
    --write-requests-per-second=N | unlimited
    --list-requests-per-second=N | unlimited
    --concurrent-reads=N | unlimited
    --concurrent-writes=N | unlimited
```
2022-05-18 01:27:06 -07:00
Jarek Kowalski
a61927e089 chore(infra): added more leak checks to tests (#1953) 2022-05-16 06:37:57 +00:00
Jarek Kowalski
1ae6c6df03 fix(repository): fixed slow goroutine leak from indexBlobCache, added tests (#1950) 2022-05-16 01:21:30 +00:00
Jarek Kowalski
81c0580a01 feat(cli): REVERT added 'content delete --forget' flag (#1932) (#1940)
This reverts commit d990af4dc2.
2022-05-10 03:45:25 +00:00
Jarek Kowalski
d990af4dc2 feat(cli): added 'content delete --forget' flag (#1932)
* feat(cli): added 'content delete --forget' flag

This allows low-level hiding of entries in the index, which makes
them completely invisible.

For #1906

* improved code coverage

* pr feedback
2022-05-06 21:16:12 -07:00
Jarek Kowalski
98f3473b67 refactor(snapshots): extracted snapshotfs.Verifier component (#1921)
* refactor(snapshots): extracted snapshotfs.Verifier component

* refactor(repository): added tests for snapshotfs.Verifier, misc cleanups

* fixed data race

* fixed atomic alignment

* nit
2022-05-02 04:03:28 +00:00
Jarek Kowalski
5d87d81733 chore(repository): extracted content index building and parsing into repo/content/index (#1881) 2022-04-05 18:04:50 -07:00
Ali Dowair
044792db30 feat(cli): show storage capacity in repo status (#1867)
The connected repository's backing storage capacity and available
space can be now retrieved from `kopia repository status`. In text
format, these fields are printed in a human friendly form (MiB, GiB).
In JSON mode (`--json`), they are output as bytes.

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
Co-authored-by: Julio
2022-04-01 01:03:51 +00:00
Jarek Kowalski
92683c5a5e feat(snapshots): support for controlling upload parallelism via policies (#1850) 2022-03-26 19:20:49 +00:00
Jarek Kowalski
d7e10dba59 feat(cli): improved kopia benchmark commands (#1849)
* added --parallel flag
* added `kopia benchmark encryption`
* added `kopia benchmark hashing`
2022-03-26 14:28:08 +00:00
Ali Dowair
aafe56cd6f feat(snapshots): support restoring sparse files (#1823)
* feat(snapshots): support restoring sparse files

This commit implements basic support for restoring sparse files from
a snapshot. When specifying "--mode=sparse" in a snapshot restore
command, Kopia will make a best effort to make sure the underlying
filesystem allocates the minimum amount of blocks needed to persist
restored files. In other words, enabling this feature will "force"
all restored files to be sparse-blocks of zero bytes in the source
file should not be allocated.

* Address review comments

- Separate sparse option into its own bool flag
- Implement sparsefile packagewith copySparse method
- Truncate once before writing sparse file
- Check error from Truncate
- Add unit test for copySparse
- Invoke GetBlockSize once per file copy
- Remove support for Windows and explain why
- Add unit test for stat package

Co-authored-by: Dave Smith-Uchida <dave@kasten.io>
2022-03-22 19:09:50 -07:00
Jarek Kowalski
3ef472248d fix(ci): fixed flaky TestServerControl (#1840) 2022-03-20 20:19:21 -07:00
Jarek Kowalski
daa62de3e4 chore(ci): added checklocks static analyzer (#1838)
From https://github.com/google/gvisor/tree/master/tools/checklocks

This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.

This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.

In a handful of places the code had to be refactored to allow static
analyzer to do its job better or to not be confused by some
constructs.

In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.

The check is part of `make lint` but can also be invoked by
`make check-locks`.
2022-03-19 22:42:59 -07:00
Jarek Kowalski
9054fd0dc2 chore(ci): upgraded linter to 1.45 (#1836)
* chore(deps): upgraded linter to 1.45

* fixed linter warning
2022-03-18 22:24:42 -07:00
Jarek Kowalski
991c08e4b4 chore(repository): switched from opencensus to directly exporting prometheus metrics (#1831) 2022-03-17 23:39:36 -07:00
Julio Lopez
3f9e27c97b feat(cli): add --json output to 'repo status' command (#1834)
* cleanup(repository) add kopia:sensitive tag to content.FormattingOptions field

* feat(cli): add --json output to  command
2022-03-17 22:22:24 -07:00
Jarek Kowalski
2b8f7453db fix(cli): fixed and unified help text for policy commands (#1829)
Fixes #1822
2022-03-16 00:46:57 -07:00
Jarek Kowalski
c4ab086d5f chore(repository): streamlined internal content cache API (#1828)
moved from ./repo/content to ./internal/cache
2022-03-15 23:57:04 -07:00
Jarek Kowalski
69dc7ba969 feat(repository): added 'hint' to Prefetch methods. (#1825) 2022-03-12 23:16:39 -08:00
Jarek Kowalski
369d304084 refactor(repository): better context cancelation handling (#1802)
Instead of ignoring context cancelation in Open(), ensure we don't
spawn goroutines that might be canceled.
2022-03-06 16:56:30 -08:00
Jarek Kowalski
926e14aacb feat(repository): added PrefetchObjects() API (#1779)
* feat(repository): added precaching of data blobs

* feat(repository): added utilities for converting ID slices to strings

* feat(repository): added object.PrefetchBackingContents

* feat(repository): implemented Repository.PrefetchObjects

* feat(cli): added 'cache prefetch' subcommand

* feat(repository): prefetch in parallel

* added tests
2022-03-06 14:30:58 -08:00
Jarek Kowalski
9d63e56bb9 feat(cli): improved formatting of 'policy show' outputs (#1767) 2022-02-22 22:21:48 -08:00
Jarek Kowalski
6d29b2a082 feat(cli): added snapshot list flags: --storage-stats and --reverse (#1766)
For #1722
Related #1755
2022-02-21 20:24:13 -08:00
Jarek Kowalski
f1d3130351 refactor(repository): expose ContentInfo() on repo.Repository (#1765) 2022-02-21 14:38:59 -08:00
Jarek Kowalski
f5cee5ae42 refactor(snapshots): refactored TreeWalker to use workshare (#1762)
Also cleaned up the API and added unit tests.
2022-02-20 16:58:53 -08:00
Jarek Kowalski
a48e24e693 feat(providers): add Google Drive support (#1731)
* feat(provider): Add Google Drive support.

Co-authored-by: xkxx <xkxx@users.noreply.github.com>
Co-authored-by: xkxx <xkxiang@gmail.com>
2022-02-16 22:34:48 -08:00
Jarek Kowalski
90df511609 fix(snapshots): treat empty retention policy as retaining ALL, not NONE (#1733)
This is a safety measure which addresses P0 improvement for #1732.

Given that retention policies that retain nothing make no sense, this
is not considered a breaking change.
2022-02-07 11:40:27 -08:00
Shikhar Mall
63bedd3446 feat(cli): allow changing retention parameters from CLI (#1680)
Co-authored-by: Shikhar Mall <small@kopia.io>
2022-02-02 19:04:22 -08:00
Jarek Kowalski
e15852af41 test(cli): fixed test flake in TestServerControl (#1717) 2022-02-02 18:44:48 -08:00
Shikhar Mall
aa5e4cfb33 refactor(cli): An in-memory storage mock setup for CLI tests (#1697)
* refactor cli tests to allow the use of in-memory mock

* use in-memory repo for set-parameters cli tests

* move inmemory storage provider into test package

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2022-02-01 10:29:13 -08:00
Jarek Kowalski
fd163cfc20 feat(kopiaui): connect to repository asynchronously on startup (#1691)
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and user will see new `Initializing` page.

This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
2022-01-29 18:28:52 -08:00
Jarek Kowalski
f67274e229 fix(providers): fixed DoNotRecreate and tests for gcs (#1688)
Also simplified validation test suite, which will simply test whether
the provider supports DoNotRecreate or properly rejects it without
external configuration.
2022-01-29 09:12:07 -08:00
Jarek Kowalski
9cad0edb53 test(ui): added end-to-end HTML UI test (#1686)
* test(general): refactored parsing of server output

* test(ui): added experimental end-to-end test using chromedp
2022-01-29 01:34:45 -08:00
Ali Dowair
7ca8b85a57 feat(providers): expand PutBlob API to allow for idempotent puts (#1654)
* Add a new PutBlob option and blob error type

When `DoNotRecreate` is set as true, the blob put operation should
only succeed if no blob with the given blob ID already exists.
Othwerwise, `ErrBlobAlreadyExists` is returned.

* Validate default storage providers' support

By default, storage providers should not support idempotent creates.
This commit adds error handling to exit early if `DoNotRecreate` is
set to true. The commit also verifies this behavior in the provider
validation test.

* Implement support for new option in GCS storage

* Push PutBlob option handling down to Impl

When PutBlob options were introduced, error handling logic for them
was implemented for the Sharded storage interface. However, the
behavior of different providers that implement Sharded can be
different, so it's better to push the options down to be processed in
the provider implementations.

* Introduce new error type for unsupported put opts

To unify error handling code and make it more maintainable, introduce
a new error type `blob.ErrUnsupportedPutBlobOption`, which is to be
returned whenever a storage provider implementation is given put
options it does not support.
2022-01-27 08:49:06 -08:00
Jarek Kowalski
e67f84e0ba chore(general): updated linter to 1.44.0 (#1681) 2022-01-25 21:21:13 -08:00
Shikhar Mall
b592776edf feat(repository): persistence for blob-retention configuration (#1596)
* feat: persisting retention options in repository blob

 - plumb retention parameters through wrapped storage
 - generalize aes encryption mechanism
 - rewrite the retention blob on password change
 - do not write retention blob when empty

* handle retention-blob not-found failures

* cli params to set retention modes on repository create

* enable versioned map mock storage with retention settings

* adding unit tests

* write format and retention blob with retention settings if available

* rename certain functions and constants specific to format blob

* delete retention cache on password-change

* fix: replace SetTime() api call with TouchBlob()

* Update repo/repository_test.go

Co-authored-by: Nick <nick@kasten.io>

* pr feedback and codecov improvements

* fix: rename retention-blob structures to generic blob-cfg

* fix: remove minio dependency on retention constants

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>
2022-01-22 08:37:00 -08:00
Jarek Kowalski
32ed220a6c build(lint): enabled gochecknoglobals and tagged existing globals (#1664) 2022-01-15 12:54:56 -08:00
Jarek Kowalski
3d58566644 fix(security): prevent cross-site request forgery in the UI website (#1653)
* fix(security): prevent cross-site request forgery in the UI website

This fixes a [cross-site request forgery (CSRF)](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
vulnerability in self-hosted UI for Kopia server.

The vulnerability allows potential attacker to make unauthorized API
calls against a running Kopia server. It requires an attacker to trick
the user into visiting a malicious website while also logged into a
Kopia website.

The vulnerability only affected self-hosted Kopia servers with UI. The
following configurations were not vulnerable:

* Kopia Repository Server without UI
* KopiaUI (desktop app)
* command-line usage of `kopia`

All users are strongly recommended to upgrade at the earliest
convenience.

* pr feedback
2022-01-13 11:31:51 -08:00
Jarek Kowalski
2e9a57f0b4 server: support for server control APIs and tooling (#1644)
This adds new set of APIs `/api/v1/control/*` which can be used to administratively control a running server.

Once the server is started, the administrative user can control it
using CLI commands:

export KOPIA_SERVER_ADDRESS=...
export KOPIA_SERVER_CERT_FINGERPRINT=...
export KOPIA_SERVER_PASSWORD=...

* `kopia server status` - displays status of sources managed by the server
* `kopia server snapshot` - triggers server-side upload of snapshots for managed sources
* `kopia server cancel` - cancels upload of snapshots for managed sources
* `kopia server pause` - pauses scheduled snapshots for managed sources
* `kopia server resume` - resumes scheduled snapshots for managed sources
* `kopia server refresh` - causes server to resynchronize with externally-made changes, such as policies or new sources
* `kopia server flush` - causes server to flush all pending writes
* `kopia server shutdown` - graceful shutdown of the server

Authentication uses new user `server-control` and is disabled
by default. To enable it when starting the server, provide the password
using one of the following methods:

* `--server-control-password`
* `--random-server-control-password`
* `.htpasswd` file
* `KOPIA_SERVER_CONTROL_PASSWORD` environment variable

This change allows us to tighten the API security and remove some
methods that UI user was able to call, but which were not needed.
2022-01-03 18:48:38 -08:00
Jarek Kowalski
c66b1c3e76 server: moved serving of static files to internal/server package (#1637) 2022-01-01 13:07:47 -08:00
Jarek Kowalski
f56ad31d41 ui: apply dark mode default and persist user choice (#1621) 2021-12-23 12:09:55 -08:00
Jarek Kowalski
7401684e71 blob: replaced blob.Storage.SetTime() method with blob.PutOptions.SetTime (#1595)
* sharded: plumbed through blob.PutOptions

* blob: removed blob.Storage.SetTime() method

This was only used for `kopia repo sync-to` and got replaced with
an equivalent blob.PutOptions.SetTime, which wehn set to non-zero time
will attempt to set the modification time on a file.

Since some providers don't support changing modification time, we
are able to emulate it using per-blob metadata (on B2, Azure and GCS),
sadly S3 is still unsupported, because it does not support returning
metadata in list results.

Also added PutOptions.GetTime, which when set to not nil, will
populate the provided variable with actual time that got assigned
to the blob.

Added tests that verify that each provider supports GetTime
and SetTime according to this spec.

* blob: additional test coverage for filesystem storage

* blob: added PutBlobAndGetMetadata() helper and used where appropriate

* fixed test failures

* pr feedback

* Update repo/blob/azure/azure_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* Update repo/blob/filesystem/filesystem_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* Update repo/blob/filesystem/filesystem_storage.go

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>

* blobtesting: fixed object_locking_map.go

* blobtesting: removed SetTime from ObjectLockingMap

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
2021-12-18 14:00:20 -08:00