Commit Graph

500 Commits

Author SHA1 Message Date
Jarek Kowalski
f49bcdd883 feat(cli): implementation for 'kopia snapshot fix' (#1930)
* feat(cli): implementation for 'kopia snapshot fix'

This allows modifications and fixes to the snapshots after they have
been taken.

Supported are:

* `kopia snapshot fix remove-invalid-files [--verify-files-percent=X]`

Removes all directory entries where the underlying files cannot be
read based on index analysis (this does not read the files, only index
structures so is reasonably quick).

`--verify-files-percent=100` can be used to trigger full read for
all files.

* `kopia snapshot fix remove-files --object-id=<object-id>`

Removes the object with a given ID from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.

* `kopia snapshot fix remove-files --filename=<wildcard>`

Removes the files with a given name from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.

By default all snapshots are analyzed and rewritten. To limit the scope
use:

--source=user@host:/path
--manifest-id=manifestID

By default the rewrite operation writes new directory entries but
does not replace the manifests. To do that pass `--commit`.

Related #1906
Fixes #799

reorganized CLI per PR suggestion

* additional logging for diff command

* added Clone() method to snapshot manifst and directory entry

* added a comprehensive test, moved DirRewriter  to separate file

* pr feedback

* more pr feedback

* improved logging output

* disable test in -race configuration since it's way to slow

* pr feedback
2022-05-25 01:17:55 +00:00
Julio Lopez
3d1de6f27a chore(general): minor cleanups (#1959)
- expand command flag description for clarification
- include blob id in blob get error in the cache
- nit: remove unused BOTO_PATH
- nit: fix comment
- cleanup: remove unnecessary function declaration in interface
- leverage 'testify' to simplify test
2022-05-23 15:16:25 -07:00
ashmrtn
9f85864da5 feat(snapshots): Add callback-based iteration function to Directory interface (#1957)
* New interface method to iterate over dir entries

* Fix build and test failures from interface

* Fix entry iteration for StaticDirectory

* Make utility function for directory iteration

* Fix lint errors

* No wrapcheck on fs.ReaddirToIterate

* Be consistent for IterateEntry implementations
2022-05-20 18:04:35 -07:00
Jarek Kowalski
99eeb3c063 feat(cli): added CLI for controlling throttler (#1956)
Supported are:

```
$ kopia throttle set \
    --download-bytes-per-second=N | unlimited
    --upload-bytes-per-second=N | unlimited
    --read-requests-per-second=N | unlimited
    --write-requests-per-second=N | unlimited
    --list-requests-per-second=N | unlimited
    --concurrent-reads=N | unlimited
    --concurrent-writes=N | unlimited
```

To change parameters of a running server use:

```
$ kopia server throttle set \
    --address=<server-url> \
    --server-control-password=<password> \
    --download-bytes-per-second=N | unlimited
    --upload-bytes-per-second=N | unlimited
    --read-requests-per-second=N | unlimited
    --write-requests-per-second=N | unlimited
    --list-requests-per-second=N | unlimited
    --concurrent-reads=N | unlimited
    --concurrent-writes=N | unlimited
```
2022-05-18 01:27:06 -07:00
Jarek Kowalski
a61927e089 chore(infra): added more leak checks to tests (#1953) 2022-05-16 06:37:57 +00:00
Jarek Kowalski
c971d7c4f8 feat(providers): ensure Ctrl-C is not passed to rclone (#1951)
Based on https://stackoverflow.com/questions/33165530/prevent-ctrlc-from-interrupting-exec-command-in-golang

Fixes #1934
2022-05-16 05:28:57 +00:00
Jarek Kowalski
1ae6c6df03 fix(repository): fixed slow goroutine leak from indexBlobCache, added tests (#1950) 2022-05-16 01:21:30 +00:00
Jarek Kowalski
98f3473b67 refactor(snapshots): extracted snapshotfs.Verifier component (#1921)
* refactor(snapshots): extracted snapshotfs.Verifier component

* refactor(repository): added tests for snapshotfs.Verifier, misc cleanups

* fixed data race

* fixed atomic alignment

* nit
2022-05-02 04:03:28 +00:00
Jarek Kowalski
5d87d81733 chore(repository): extracted content index building and parsing into repo/content/index (#1881) 2022-04-05 18:04:50 -07:00
Jarek Kowalski
4f36e4e1d0 fix(repository): fixed accidental cache directory location change during refactoring (#1866) 2022-03-27 19:46:40 +00:00
Jarek Kowalski
b64ad62f69 refactor(repository): unified data and metadata cache implementations (#1864)
With this change there's now a single implementation with one boolean parameter that indicates whether we should fetch blobs or contents.
2022-03-27 08:53:20 -07:00
Jarek Kowalski
92683c5a5e feat(snapshots): support for controlling upload parallelism via policies (#1850) 2022-03-26 19:20:49 +00:00
Jarek Kowalski
b2b5e6fb97 fix(repository): fixed 'unable to get content data and info' (#1844)
We were incorrectly assigning the same content cache key regardless of
the storage format of the content (compression format, etc.)

This means that if a content has both compressed and non-compressed
entry in the index, it would sometimes get cached incorrectly
and the subsequent reads would fail.

Clearing the cache would fix the issue - this change appends
format-specific suffix to cache keys, so that clearing the cache is not
needed.

Huge thanks to Mark Derricutt who helped get to the bottom of this
on Slack.

Fix #1843
2022-03-22 20:39:36 -07:00
Ali Dowair
aafe56cd6f feat(snapshots): support restoring sparse files (#1823)
* feat(snapshots): support restoring sparse files

This commit implements basic support for restoring sparse files from
a snapshot. When specifying "--mode=sparse" in a snapshot restore
command, Kopia will make a best effort to make sure the underlying
filesystem allocates the minimum amount of blocks needed to persist
restored files. In other words, enabling this feature will "force"
all restored files to be sparse-blocks of zero bytes in the source
file should not be allocated.

* Address review comments

- Separate sparse option into its own bool flag
- Implement sparsefile packagewith copySparse method
- Truncate once before writing sparse file
- Check error from Truncate
- Add unit test for copySparse
- Invoke GetBlockSize once per file copy
- Remove support for Windows and explain why
- Add unit test for stat package

Co-authored-by: Dave Smith-Uchida <dave@kasten.io>
2022-03-22 19:09:50 -07:00
Jarek Kowalski
716ed71b59 fix(server): fixed server startup race condition (#1842)
This fixes a regression introduced in #1837
2022-03-20 22:21:03 -07:00
Jarek Kowalski
daa62de3e4 chore(ci): added checklocks static analyzer (#1838)
From https://github.com/google/gvisor/tree/master/tools/checklocks

This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.

This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.

In a handful of places the code had to be refactored to allow static
analyzer to do its job better or to not be confused by some
constructs.

In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.

The check is part of `make lint` but can also be invoked by
`make check-locks`.
2022-03-19 22:42:59 -07:00
Jarek Kowalski
9cba4a97be refactor(repository): major server code refactoring (#1837)
This removes big shared lock held for for the duration of each request
and replaces it with trivially short lock to capture the current
state of the server/repository before passing it to handlers.

Handlers are now limited to only accessing a small subset of Server
functionality to be able to better reason about them.
2022-03-19 22:01:38 -07:00
Jarek Kowalski
9054fd0dc2 chore(ci): upgraded linter to 1.45 (#1836)
* chore(deps): upgraded linter to 1.45

* fixed linter warning
2022-03-18 22:24:42 -07:00
Jarek Kowalski
6ba2f586ad feat(repository): ensure we don't run parallel fetches for the same blob IDs (#1835)
* feat(repository): ensure we don't run parallel fetches for the same blob IDs

Fixed a bunch of test flakes.

* fixed race condition, limit the size of mutex cache by using LRU
2022-03-18 21:27:06 -07:00
Jarek Kowalski
991c08e4b4 chore(repository): switched from opencensus to directly exporting prometheus metrics (#1831) 2022-03-17 23:39:36 -07:00
Ali Dowair
c7ce87f95b feat(providers): Implement API to get Storage free space (#1753)
* Introduce Volume sub-interface
  The Volume interface defines APIs to access a storage provider's
  volume (disk) capacity, usage, etc.. It is inherited by the Storage
  interface, and is at the same hierarchical level as the Reader
  interface.

* Add validations for new Volume method:
  Check that GetCapacity() either returns `ErrNotAVolume`, or that it
  returns a Capacity struct with values that make sense.

* Implement default (passthrough) GetCapacity:
  Cloud providers do not have finite volumes, and WebDAV volumes have no
  notion of volume size and usage. These implementations should just
  return an error (ErrNotAVolume) when their GetCapacity() is called.

* Implement GetCapacity for sftp storage: Uses the sftp.Client interface
* Implement GetCapacity for logging, readonly store
* Implement GetCapacity() for blobtesting implementations

* Implement GetCapacity() for Google Drive:
  Also modifies GetDriveClient to return the entire service instead of just the Files client.

* Implemented GetCapacity() for filesystem storage:
  Implemented the function in a seperate file for each OS/architecture (Unix, OpenBSD, Windows).
2022-03-16 12:35:33 -07:00
Jarek Kowalski
3ef81134a1 fix(cli): fixing ignoring lines starting with '#' in 'policy edit' (#1830)
Fixes #1821
2022-03-16 08:00:50 -07:00
Jarek Kowalski
c4ab086d5f chore(repository): streamlined internal content cache API (#1828)
moved from ./repo/content to ./internal/cache
2022-03-15 23:57:04 -07:00
Jarek Kowalski
1aa3a35518 refactor(repository): moved index blob cache to separate directory (#1827)
Also cleaned up internal content cache APIs.
2022-03-13 21:40:15 -07:00
Jarek Kowalski
69dc7ba969 feat(repository): added 'hint' to Prefetch methods. (#1825) 2022-03-12 23:16:39 -08:00
Jarek Kowalski
db7dcac33c feat(repo): exposed PrefetchContents on the repo.Repository() (#1824) 2022-03-12 14:36:54 -08:00
Jarek Kowalski
e591f4b79b chore(ci): skip non-deterministic tests when computing code coverage (#1817) 2022-03-09 07:42:01 -08:00
Jarek Kowalski
c6944c70b1 fix(repository): fix deduplication when snapshotting identical files in parallel (#1815)
This issue was reported on Slack by Ming. Thanks!
This is particularly easy to reproduce when compression is used.
2022-03-08 20:11:06 -08:00
Jarek Kowalski
369d304084 refactor(repository): better context cancelation handling (#1802)
Instead of ignoring context cancelation in Open(), ensure we don't
spawn goroutines that might be canceled.
2022-03-06 16:56:30 -08:00
Jarek Kowalski
926e14aacb feat(repository): added PrefetchObjects() API (#1779)
* feat(repository): added precaching of data blobs

* feat(repository): added utilities for converting ID slices to strings

* feat(repository): added object.PrefetchBackingContents

* feat(repository): implemented Repository.PrefetchObjects

* feat(cli): added 'cache prefetch' subcommand

* feat(repository): prefetch in parallel

* added tests
2022-03-06 14:30:58 -08:00
Jarek Kowalski
6baf10ad31 fix(repository): fixed 'context canceled' regression (#1799)
This was broken in #1758 by holding onto a context beyond its intended
lifetime.

Fixes #1796
2022-03-03 21:02:40 -08:00
Shikhar Mall
58c28f2205 feat(general): Upgrade lock retries & monitoring during open & write sessions (#1758)
* background upgrade lock monitor

* retry lock forever on connect

* pr feedback

* remove time computations under read lock for efficiency

* extend the unit test to cover lock monitoring with a controlled time function

* more cleanup

Co-authored-by: Shikhar Mall <small@kopia.io>
2022-02-24 15:00:02 -08:00
Jarek Kowalski
f1d3130351 refactor(repository): expose ContentInfo() on repo.Repository (#1765) 2022-02-21 14:38:59 -08:00
Jarek Kowalski
f5cee5ae42 refactor(snapshots): refactored TreeWalker to use workshare (#1762)
Also cleaned up the API and added unit tests.
2022-02-20 16:58:53 -08:00
Jarek Kowalski
4bc69d05ef feat(snapshots): big upload performance improvements by parallelizing directory traversal (#1752)
* feat(general): added internal/workshare package

This introduces work sharing utility useful when walking trees of
things (such as filesystem), which allows N threads/goroutines to be
used.

Whenever a routine is visiting its children, it can share some of that
work with another idle goroutine in the pool (when available). If
no other goroutine is idle, we are already at capacity and the caller
simply does the work in their own goroutine.

The API introduced here is not the most beautiful, but allows us to
avoid allocations in most cases, which is critical for high-performance
data processing.

* feat(snapshots): speed up uploads by parallelizing directory traversal

Previously directories were walked strictly sequentially which means
we could never be uploading data from multiple directories in parallel,
even if they had just a few files each.

This change switches to using the new `workshare` utility which improves
parallelism. It also reduces memory allocations, goroutine creations
and overall memory usage when taking large snapshots, while increasing
CPU utilization.

Tests on realistic directory structures show huge speed-ups during cold
snapshots (without any metadata caching:)

Photo library - 160GB, files:41717 dirs:1350

    Before: 3m11s
    After: 1m50s
    Total time reduction: 43%

Working code directory - 30.7 GB files:194560 dirs:42455

    Before: 55s
    After: 25s
    Total time reduction: 55%

* do not report multiple cancelation errors during parallel uploads

* do not report multiple cancelation errors during parallel uploads

* pr feedback, clarified usage, added comments

* fixed flaky test
2022-02-18 21:16:30 -08:00
Jarek Kowalski
a48e24e693 feat(providers): add Google Drive support (#1731)
* feat(provider): Add Google Drive support.

Co-authored-by: xkxx <xkxx@users.noreply.github.com>
Co-authored-by: xkxx <xkxiang@gmail.com>
2022-02-16 22:34:48 -08:00
Jarek Kowalski
512dcebaf4 feat(ui): support for editing pins and description (#1748)
https://user-images.githubusercontent.com/249880/152926565-2d3186d2-7989-4482-bcbc-9b20659745dd.mp4
2022-02-11 20:34:15 -08:00
Jarek Kowalski
04f70f74f3 test(server): fixed flaky TestSnapshotCounters test (#1734) 2022-02-06 21:12:49 -08:00
Jarek Kowalski
4fb473dc77 feat(ui): support for snapshot and source deletion (#1730)
* feat(ui): add support snapshot and source deletion

https://user-images.githubusercontent.com/249880/152661589-51aeaab3-4b14-4c87-84ec-519815b34a89.mp4

* fix(server): fixed ListSnapshots() unique result
2022-02-05 22:43:54 -08:00
Julio Lopez
d68cc9f647 refactor(general): misc nits (#1729)
* nit: clarify maxCleanupTime documentation

* refactor: runTaskIndexCompactionQuick helper

Expand one-line wrapper

* nit: improve error message

* nit: clarify error message

* nit: use errors.New for static error messages
2022-02-04 17:38:40 -08:00
Onkar Bhat
6f55e65cbd feat(providers): treat token expiration errors as non-retryable (#1675)
Token expiration errors should be treated as non-retryable errors.
2022-02-02 17:38:32 -08:00
Shikhar Mall
aa5e4cfb33 refactor(cli): An in-memory storage mock setup for CLI tests (#1697)
* refactor cli tests to allow the use of in-memory mock

* use in-memory repo for set-parameters cli tests

* move inmemory storage provider into test package

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2022-02-01 10:29:13 -08:00
Jarek Kowalski
fd163cfc20 feat(kopiaui): connect to repository asynchronously on startup (#1691)
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and user will see new `Initializing` page.

This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
2022-01-29 18:28:52 -08:00
Jarek Kowalski
f67274e229 fix(providers): fixed DoNotRecreate and tests for gcs (#1688)
Also simplified validation test suite, which will simply test whether
the provider supports DoNotRecreate or properly rejects it without
external configuration.
2022-01-29 09:12:07 -08:00
Jarek Kowalski
9cad0edb53 test(ui): added end-to-end HTML UI test (#1686)
* test(general): refactored parsing of server output

* test(ui): added experimental end-to-end test using chromedp
2022-01-29 01:34:45 -08:00
Jarek Kowalski
400b3c5ed5 fix(server): sleep 30m after failed maintenance (#1684)
Fixes #1651
Fixes #1652
2022-01-27 18:27:40 -08:00
Ali Dowair
7ca8b85a57 feat(providers): expand PutBlob API to allow for idempotent puts (#1654)
* Add a new PutBlob option and blob error type

When `DoNotRecreate` is set as true, the blob put operation should
only succeed if no blob with the given blob ID already exists.
Othwerwise, `ErrBlobAlreadyExists` is returned.

* Validate default storage providers' support

By default, storage providers should not support idempotent creates.
This commit adds error handling to exit early if `DoNotRecreate` is
set to true. The commit also verifies this behavior in the provider
validation test.

* Implement support for new option in GCS storage

* Push PutBlob option handling down to Impl

When PutBlob options were introduced, error handling logic for them
was implemented for the Sharded storage interface. However, the
behavior of different providers that implement Sharded can be
different, so it's better to push the options down to be processed in
the provider implementations.

* Introduce new error type for unsupported put opts

To unify error handling code and make it more maintainable, introduce
a new error type `blob.ErrUnsupportedPutBlobOption`, which is to be
returned whenever a storage provider implementation is given put
options it does not support.
2022-01-27 08:49:06 -08:00
Jarek Kowalski
e67f84e0ba chore(general): updated linter to 1.44.0 (#1681) 2022-01-25 21:21:13 -08:00
Jarek Kowalski
9cb2a40816 feat(providers): improved sharded directory creation (#1665)
When a sharded directory is missing do not attempt to create all
its parents, but only children of the repository root.

This way when a top-level directory is unmounted, we won't recreate
it unnecessarily.

This is implemented for filesystem and SFTP providers.
2022-01-23 14:56:35 -08:00
Shikhar Mall
b592776edf feat(repository): persistence for blob-retention configuration (#1596)
* feat: persisting retention options in repository blob

 - plumb retention parameters through wrapped storage
 - generalize aes encryption mechanism
 - rewrite the retention blob on password change
 - do not write retention blob when empty

* handle retention-blob not-found failures

* cli params to set retention modes on repository create

* enable versioned map mock storage with retention settings

* adding unit tests

* write format and retention blob with retention settings if available

* rename certain functions and constants specific to format blob

* delete retention cache on password-change

* fix: replace SetTime() api call with TouchBlob()

* Update repo/repository_test.go

Co-authored-by: Nick <nick@kasten.io>

* pr feedback and codecov improvements

* fix: rename retention-blob structures to generic blob-cfg

* fix: remove minio dependency on retention constants

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>
2022-01-22 08:37:00 -08:00