* feat(cli): implementation for 'kopia snapshot fix'
This allows modifications and fixes to the snapshots after they have
been taken.
Supported are:
* `kopia snapshot fix remove-invalid-files [--verify-files-percent=X]`
Removes all directory entries where the underlying files cannot be
read based on index analysis (this reads only index structures, not
the files themselves, so it is reasonably quick).
`--verify-files-percent=100` can be used to trigger a full read of
all files.
* `kopia snapshot fix remove-files --object-id=<object-id>`
Removes the object with a given ID from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
* `kopia snapshot fix remove-files --filename=<wildcard>`
Removes the files with a given name from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
By default all snapshots are analyzed and rewritten. To limit the scope
use:
--source=user@host:/path
--manifest-id=manifestID
By default the rewrite operation writes new directory entries but
does not replace the manifests. To do that pass `--commit`.
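The filename-based removal described above can be pictured as a recursive tree rewrite. The following is a minimal sketch, not the actual rewriter: the `entry` type and the `removeMatching` and `countFiles` helpers are hypothetical, and wildcard matching is assumed to behave like Go's `path.Match`.

```go
package main

import (
	"fmt"
	"path"
)

// entry is a hypothetical, simplified stand-in for a snapshot directory entry.
type entry struct {
	name     string
	isDir    bool
	children []*entry // only used when isDir is true
}

// removeMatching returns a rewritten copy of the tree rooted at dir in which
// every file whose name matches the wildcard pattern has been dropped.
// The original tree is left untouched.
func removeMatching(dir *entry, pattern string) (*entry, error) {
	out := &entry{name: dir.name, isDir: true}

	for _, c := range dir.children {
		if c.isDir {
			sub, err := removeMatching(c, pattern)
			if err != nil {
				return nil, err
			}
			out.children = append(out.children, sub)
			continue
		}

		matched, err := path.Match(pattern, c.name)
		if err != nil {
			return nil, fmt.Errorf("invalid pattern %q: %w", pattern, err)
		}
		if !matched {
			out.children = append(out.children, &entry{name: c.name})
		}
	}

	return out, nil
}

// countFiles returns the number of non-directory entries under e.
func countFiles(e *entry) int {
	if !e.isDir {
		return 1
	}
	n := 0
	for _, c := range e.children {
		n += countFiles(c)
	}
	return n
}

func main() {
	root := &entry{name: "root", isDir: true, children: []*entry{
		{name: "notes.txt"},
		{name: "secret.key"},
		{name: "sub", isDir: true, children: []*entry{{name: "other.key"}}},
	}}

	fixed, err := removeMatching(root, "*.key")
	if err != nil {
		panic(err)
	}
	fmt.Println("files before:", countFiles(root), "after:", countFiles(fixed))
}
```

Building a new tree instead of mutating the old one keeps the original entries readable until the caller decides to replace them.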
Related #1906
Fixes #799
reorganized CLI per PR suggestion
* additional logging for diff command
* added Clone() method to snapshot manifest and directory entry
* added a comprehensive test, moved DirRewriter to separate file
* pr feedback
* more pr feedback
* improved logging output
* disable test in -race configuration since it's way too slow
* pr feedback
- expand command flag description for clarification
- include blob id in blob get error in the cache
- nit: remove unused BOTO_PATH
- nit: fix comment
- cleanup: remove unnecessary function declaration in interface
- leverage 'testify' to simplify test
* New interface method to iterate over dir entries
* Fix build and test failures from interface
* Fix entry iteration for StaticDirectory
* Make utility function for directory iteration
* Fix lint errors
* No wrapcheck on fs.ReaddirToIterate
* Be consistent for IterateEntry implementations
We were incorrectly assigning the same content cache key regardless of
the storage format of the content (compression format, etc.)
This means that if a content has both compressed and non-compressed
entry in the index, it would sometimes get cached incorrectly
and the subsequent reads would fail.
Clearing the cache would fix the issue - this change appends
format-specific suffix to cache keys, so that clearing the cache is not
needed.
Huge thanks to Mark Derricutt who helped get to the bottom of this
on Slack.
Fixes #1843
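The cache-key fix can be illustrated with a short sketch. The `contentInfo` fields and the exact key layout below are assumptions made for illustration, not the real index structures. The point is only that the storage format becomes part of the key.

```go
package main

import "fmt"

// contentInfo is a hypothetical subset of an index entry for a content.
type contentInfo struct {
	contentID           string
	compressionHeaderID uint32 // 0 is assumed to mean "not compressed"
	formatVersion       byte
}

// cacheKey appends a format-specific suffix to the content ID so that a
// compressed and a non-compressed entry for the same content never share
// a cache slot.
func cacheKey(ci contentInfo) string {
	return fmt.Sprintf("%s.%x.%x", ci.contentID, ci.compressionHeaderID, ci.formatVersion)
}

func main() {
	plain := contentInfo{contentID: "abcd", formatVersion: 1}
	packed := contentInfo{contentID: "abcd", compressionHeaderID: 0x1000, formatVersion: 2}

	fmt.Println(cacheKey(plain))
	fmt.Println(cacheKey(packed))
}
```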
* feat(snapshots): support restoring sparse files
This commit implements basic support for restoring sparse files from
a snapshot. When specifying "--mode=sparse" in a snapshot restore
command, Kopia will make a best effort to make sure the underlying
filesystem allocates the minimum amount of blocks needed to persist
restored files. In other words, enabling this feature will "force"
all restored files to be sparse: blocks of zero bytes in the source
file should not be allocated.
* Address review comments
- Separate sparse option into its own bool flag
- Implement sparsefile package with copySparse method
- Truncate once before writing sparse file
- Check error from Truncate
- Add unit test for copySparse
- Invoke GetBlockSize once per file copy
- Remove support for Windows and explain why
- Add unit test for stat package
Co-authored-by: Dave Smith-Uchida <dave@kasten.io>
From https://github.com/google/gvisor/tree/master/tools/checklocks
This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.
This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.
In a handful of places the code had to be refactored to allow the
static analyzer to do its job better or to avoid confusing it with
some constructs.
In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.
The check is part of `make lint` but can also be invoked by
`make check-locks`.
This removes the big shared lock held for the duration of each request
and replaces it with a trivially short lock to capture the current
state of the server/repository before passing it to handlers.
Handlers are now limited to only accessing a small subset of Server
functionality to be able to better reason about them.
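The locking pattern can be sketched as follows. The `server`, `repository` and `handle` names are hypothetical and much smaller than the real types. The lock is held only long enough to copy the current repository pointer, and the handler runs without it.

```go
package main

import (
	"fmt"
	"sync"
)

// repository is a hypothetical stand-in for the state handlers need.
type repository struct {
	name string
}

type server struct {
	mu  sync.RWMutex
	rep *repository // guarded by mu
}

// currentRepository holds the lock only long enough to read the pointer.
func (s *server) currentRepository() *repository {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.rep
}

// setRepository swaps the repository, e.g. after a reconnect.
func (s *server) setRepository(r *repository) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rep = r
}

// handle captures the state first and then runs the handler without any
// server lock held, so slow handlers cannot block each other.
func (s *server) handle(h func(rep *repository) string) string {
	rep := s.currentRepository()
	return h(rep)
}

func main() {
	s := &server{rep: &repository{name: "primary"}}
	describe := func(rep *repository) string { return "serving " + rep.name }

	fmt.Println(s.handle(describe))
	s.setRepository(&repository{name: "secondary"})
	fmt.Println(s.handle(describe))
}
```

Because handlers receive only the captured value, they cannot reach back into arbitrary server state, which is what makes them easier to reason about.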
* feat(repository): ensure we don't run parallel fetches for the same blob IDs
Fixed a bunch of test flakes.
* fixed race condition, limit the size of mutex cache by using LRU
* Introduce Volume sub-interface
The Volume interface defines APIs to access a storage provider's
volume (disk) capacity, usage, etc. It is inherited by the Storage
interface, and is at the same hierarchical level as the Reader
interface.
* Add validations for new Volume method:
Check that GetCapacity() either returns `ErrNotAVolume`, or that it
returns a Capacity struct with values that make sense.
* Implement default (passthrough) GetCapacity:
Cloud providers do not have finite volumes, and WebDAV volumes have no
notion of volume size and usage. These implementations should just
return an error (ErrNotAVolume) when their GetCapacity() is called.
* Implement GetCapacity for sftp storage: Uses the sftp.Client interface
* Implement GetCapacity for logging, readonly store
* Implement GetCapacity() for blobtesting implementations
* Implement GetCapacity() for Google Drive:
Also modifies GetDriveClient to return the entire service instead of just the Files client.
* Implemented GetCapacity() for filesystem storage:
Implemented the function in a separate file for each OS/architecture (Unix, OpenBSD, Windows).
* background upgrade lock monitor
* retry lock forever on connect
* pr feedback
* remove time computations under read lock for efficiency
* extend the unit test to cover lock monitoring with a controlled time function
* more cleanup
Co-authored-by: Shikhar Mall <small@kopia.io>
* feat(general): added internal/workshare package
This introduces work sharing utility useful when walking trees of
things (such as filesystem), which allows N threads/goroutines to be
used.
Whenever a routine is visiting its children, it can share some of that
work with another idle goroutine in the pool (when available). If
no other goroutine is idle, we are already at capacity and the caller
simply does the work in their own goroutine.
The API introduced here is not the most beautiful, but allows us to
avoid allocations in most cases, which is critical for high-performance
data processing.
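The sharing rule described above (hand work to an idle goroutine if there is one, otherwise do it inline) can be sketched as follows. This is a simplified illustration, not the `workshare` API: the `pool`, `run`, `node` and `totalSize` names are hypothetical, and unlike the real package this sketch allocates closures and starts goroutines freely.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// pool tracks how many extra goroutines may run at once; each token in
// the channel represents one idle worker slot.
type pool struct {
	idle chan struct{}
}

func newPool(workers int) *pool {
	p := &pool{idle: make(chan struct{}, workers)}
	for i := 0; i < workers; i++ {
		p.idle <- struct{}{}
	}
	return p
}

// run shares fn with an idle worker when one is available; otherwise the
// pool is at capacity and fn runs in the caller's own goroutine.
func (p *pool) run(wg *sync.WaitGroup, fn func()) {
	select {
	case <-p.idle:
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer func() { p.idle <- struct{}{} }() // mark the slot idle again
			fn()
		}()
	default:
		fn()
	}
}

// node is a hypothetical tree node, e.g. a directory with a size.
type node struct {
	size     int64
	children []*node
}

func walk(p *pool, wg *sync.WaitGroup, n *node, total *int64) {
	atomic.AddInt64(total, n.size)
	for _, c := range n.children {
		c := c
		p.run(wg, func() { walk(p, wg, c, total) })
	}
}

// totalSize sums the sizes of all nodes using up to `workers` extra goroutines.
func totalSize(p *pool, root *node) int64 {
	var (
		wg    sync.WaitGroup
		total int64
	)
	walk(p, &wg, root, &total)
	wg.Wait()
	return total
}

func main() {
	leaf := func(s int64) *node { return &node{size: s} }
	root := &node{size: 1, children: []*node{
		{size: 2, children: []*node{leaf(3), leaf(3)}},
		{size: 2, children: []*node{leaf(3), leaf(3)}},
	}}
	fmt.Println("total:", totalSize(newPool(4), root))
}
```

Running inline when no worker is idle means the walk can never deadlock waiting for a free slot, whatever the depth of the tree.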
* feat(snapshots): speed up uploads by parallelizing directory traversal
Previously directories were walked strictly sequentially which means
we could never be uploading data from multiple directories in parallel,
even if they had just a few files each.
This change switches to using the new `workshare` utility which improves
parallelism. It also reduces memory allocations, goroutine creations
and overall memory usage when taking large snapshots, while increasing
CPU utilization.
Tests on realistic directory structures show huge speed-ups during cold
snapshots (without any metadata caching):
Photo library - 160GB, files:41717 dirs:1350
Before: 3m11s
After: 1m50s
Total time reduction: 43%
Working code directory - 30.7 GB files:194560 dirs:42455
Before: 55s
After: 25s
Total time reduction: 55%
* do not report multiple cancelation errors during parallel uploads
* pr feedback, clarified usage, added comments
* fixed flaky test
* refactor cli tests to allow the use of in-memory mock
* use in-memory repo for set-parameters cli tests
* move inmemory storage provider into test package
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
This allows the KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and the user will see a new `Initializing` page.
This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
Also simplified validation test suite, which will simply test whether
the provider supports DoNotRecreate or properly rejects it without
external configuration.
* Add a new PutBlob option and blob error type
When `DoNotRecreate` is set as true, the blob put operation should
only succeed if no blob with the given blob ID already exists.
Otherwise, `ErrBlobAlreadyExists` is returned.
* Validate default storage providers' support
By default, storage providers should not support idempotent creates.
This commit adds error handling to exit early if `DoNotRecreate` is
set to true. The commit also verifies this behavior in the provider
validation test.
* Implement support for new option in GCS storage
* Push PutBlob option handling down to Impl
When PutBlob options were introduced, error handling logic for them
was implemented for the Sharded storage interface. However, the
behavior of different providers that implement Sharded can be
different, so it's better to push the options down to be processed in
the provider implementations.
* Introduce new error type for unsupported put opts
To unify error handling code and make it more maintainable, introduce
a new error type `blob.ErrUnsupportedPutBlobOption`, which is to be
returned whenever a storage provider implementation is given put
options it does not support.
When a sharded directory is missing do not attempt to create all
its parents, but only children of the repository root.
This way when a top-level directory is unmounted, we won't recreate
it unnecessarily.
This is implemented for filesystem and SFTP providers.
* feat: persisting retention options in repository blob
- plumb retention parameters through wrapped storage
- generalize aes encryption mechanism
- rewrite the retention blob on password change
- do not write retention blob when empty
* handle retention-blob not-found failures
* cli params to set retention modes on repository create
* enable versioned map mock storage with retention settings
* adding unit tests
* write format and retention blob with retention settings if available
* rename certain functions and constants specific to format blob
* delete retention cache on password-change
* fix: replace SetTime() api call with TouchBlob()
* Update repo/repository_test.go
Co-authored-by: Nick <nick@kasten.io>
* pr feedback and codecov improvements
* fix: rename retention-blob structures to generic blob-cfg
* fix: remove minio dependency on retention constants
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>