* Add new blob.Storage call to see if it's readonly
Return whether the storage is readonly so higher layers in the stack can
selectively disable some functionality if needed, like compaction.
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
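A minimal sketch of how such a probe might look and be used; the method name `IsReadOnly` is an assumption for illustration, not necessarily the actual API:

```go
// Hypothetical sketch: a read-only probe on the storage interface lets
// higher layers skip mutating work such as compaction. The method name
// is illustrative; the actual Kopia API may differ.
package sketch

type Storage interface {
	IsReadOnly() bool
}

func maybeCompact(s Storage) {
	if s.IsReadOnly() {
		return // compaction would need to write; skip on read-only storage
	}
	// ... run compaction ...
}
```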
* Implement ability to extend retention time on S3 buckets using Object Locks
* Move object-lock extension to maintenance.Params.
* Use a default function for unsupported extensions instead of duplicating code
* Fix potential lockup during object-lock extension
* Fix race condition. Add more code coverage
* rebase to V3
* Add checks to prevent user from setting Retention Period < Full Maintenance Interval
---------
Co-authored-by: Ashlie Martinez <ashmrtnz@alcion.ai>
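For illustration, a hedged sketch of extending one object's retention date using minio-go v7 (the S3 client Kopia uses); the bucket, key, retention mode, and period here are placeholders, not the actual maintenance wiring:

```go
// Hedged sketch: extend the retain-until date on a single object via
// S3 Object Lock. All inputs are placeholders.
package main

import (
	"context"
	"time"

	minio "github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
)

func extendRetention(ctx context.Context, c *minio.Client, bucket, key string, period time.Duration) error {
	mode := minio.Governance
	until := time.Now().Add(period) // must be later than the current retain-until date
	return c.PutObjectRetention(ctx, bucket, key, minio.PutObjectRetentionOptions{
		Mode:            &mode,
		RetainUntilDate: &until,
	})
}

func main() {
	c, err := minio.New("s3.amazonaws.com", &minio.Options{
		Creds:  credentials.NewEnvAWS(),
		Secure: true,
	})
	if err != nil {
		panic(err)
	}
	_ = extendRetention(context.Background(), c, "my-bucket", "p/some-blob", 30*24*time.Hour)
}
```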
From https://github.com/google/gvisor/tree/master/tools/checklocks
This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.
This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.
In a handful of places the code had to be refactored to let the static
analyzer do its job better, or to avoid confusing it with certain
constructs.
In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.
The check is part of `make lint` but can also be invoked by
`make check-locks`.
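A small sketch of the annotation style checklocks consumes (field and function annotations); the names are illustrative:

```go
// Sketch of checklocks annotations: the analyzer verifies that every
// access to an annotated field happens with the named mutex held.
package main

import "sync"

type counter struct {
	mu sync.Mutex

	// +checklocks:mu
	value int
}

// +checklocks:c.mu
func (c *counter) incLocked() {
	c.value++ // OK: the annotation promises c.mu is held by the caller
}

func (c *counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock() // the analyzer also catches missed unlocks on error paths
	c.incLocked()
}

func main() {
	var c counter
	c.Inc()
}
```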
* Introduce Volume sub-interface
The Volume interface defines APIs to access a storage provider's
volume (disk) capacity, usage, etc. It is embedded in the Storage
interface, and sits at the same hierarchical level as the Reader
interface. (A rough sketch of the shape appears after the GetCapacity
entries below.)
* Add validations for new Volume method:
Check that GetCapacity() either returns `ErrNotAVolume`, or that it
returns a Capacity struct with values that make sense.
* Implement default (passthrough) GetCapacity:
Cloud providers do not have finite volumes, and WebDAV volumes have no
notion of volume size and usage. These implementations should just
return an error (ErrNotAVolume) when their GetCapacity() is called.
* Implement GetCapacity for sftp storage: Uses the sftp.Client interface
* Implement GetCapacity for logging, readonly store
* Implement GetCapacity() for blobtesting implementations
* Implement GetCapacity() for Google Drive:
Also modifies GetDriveClient to return the entire service instead of just the Files client.
* Implemented GetCapacity() for filesystem storage:
Implemented the function in a separate file for each OS/architecture (Unix, OpenBSD, Windows).
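A consolidated sketch of the shape described by the entries above; the type and field names are assumptions rather than Kopia's exact API. For sftp, the real implementation presumably derives these numbers from sftp.Client's StatVFS results.

```go
// Hypothetical sketch of the Volume sub-interface and the two kinds of
// implementations described above; names and fields are illustrative.
package main

import (
	"errors"
	"fmt"
)

// Capacity reports total and free bytes on a volume.
type Capacity struct {
	SizeBytes uint64
	FreeBytes uint64
}

// ErrNotAVolume is returned by providers with no notion of volume size
// (cloud object stores, WebDAV).
var ErrNotAVolume = errors.New("not a volume")

// Volume is embedded in the Storage interface alongside Reader.
type Volume interface {
	GetCapacity() (Capacity, error)
}

// cloudStorage: the default passthrough implementation just returns the error.
type cloudStorage struct{}

func (cloudStorage) GetCapacity() (Capacity, error) { return Capacity{}, ErrNotAVolume }

// localStorage: a finite volume reports consistent numbers, which the
// new validations check (SizeBytes > 0, FreeBytes <= SizeBytes).
type localStorage struct{}

func (localStorage) GetCapacity() (Capacity, error) {
	return Capacity{SizeBytes: 1 << 40, FreeBytes: 1 << 39}, nil
}

func main() {
	for _, v := range []Volume{cloudStorage{}, localStorage{}} {
		if c, err := v.GetCapacity(); errors.Is(err, ErrNotAVolume) {
			fmt.Println("not a volume")
		} else if err == nil {
			fmt.Printf("size=%d free=%d\n", c.SizeBytes, c.FreeBytes)
		}
	}
}
```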
* sharded: plumbed through blob.PutOptions
* blob: removed blob.Storage.SetTime() method
This was only used for `kopia repo sync-to` and got replaced with
an equivalent blob.PutOptions.SetTime, which, when set to a non-zero time,
will attempt to set the modification time on a file.
Since some providers don't support changing the modification time, we
are able to emulate it using per-blob metadata (on B2, Azure and GCS);
sadly, S3 is still unsupported, because it does not support returning
metadata in list results.
Also added PutOptions.GetTime, which, when set to a non-nil pointer, will
populate the provided variable with the actual time that got assigned
to the blob.
Added tests that verify that each provider supports GetTime
and SetTime according to this spec.
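A sketch of a call site using the options described above; the PutBlob signature here is an assumption for illustration:

```go
// Sketch of the SetTime/GetTime options described above; the Storage
// interface is a stand-in, not Kopia's actual blob.Storage.
package sketch

import (
	"context"
	"fmt"
	"time"
)

// PutOptions as described: SetTime forces a modification time,
// GetTime receives the time the provider actually assigned.
type PutOptions struct {
	SetTime time.Time  // when non-zero, attempt to set the blob's mtime
	GetTime *time.Time // when non-nil, filled with the assigned time
}

type Storage interface {
	PutBlob(ctx context.Context, id string, data []byte, opts PutOptions) error
}

func syncOne(ctx context.Context, dst Storage, id string, data []byte, srcTime time.Time) error {
	var assigned time.Time

	// Preserve the source blob's timestamp on the destination (as
	// `kopia repo sync-to` needs) and read back what was assigned.
	if err := dst.PutBlob(ctx, id, data, PutOptions{SetTime: srcTime, GetTime: &assigned}); err != nil {
		return err
	}

	fmt.Println("assigned time:", assigned)
	return nil
}
```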
* blob: additional test coverage for filesystem storage
* blob: added PutBlobAndGetMetadata() helper and used where appropriate
* fixed test failures
* pr feedback
* Update repo/blob/azure/azure_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update repo/blob/filesystem/filesystem_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update repo/blob/filesystem/filesystem_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* blobtesting: fixed object_locking_map.go
* blobtesting: removed SetTime from ObjectLockingMap
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
The fix is to discard the hidden monotonic time component of time.Time
by converting to Unix time and back.
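The conversion looks like this (t.Round(0) is the time package's documented shortcut for the same thing):

```go
// Discard the hidden monotonic reading so later comparisons use
// wall-clock time (which jumps forward after the machine wakes from sleep).
package main

import (
	"fmt"
	"time"
)

func main() {
	t := time.Now() // carries a monotonic reading

	// Round-trip through Unix time drops the monotonic component.
	wall := time.Unix(t.Unix(), int64(t.Nanosecond()))

	// Equivalent shortcut documented by the time package:
	wall2 := t.Round(0)

	fmt.Println(wall.Equal(wall2)) // true
}
```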
Reviewed usage of clock.Now() and replaced it with timetrack.StartTimer()
when measuring elapsed time.
The problem in #1402 was that the passage of time was measured using
monotonic time and not wall-clock time. When the computer goes
to sleep, monotonic time continues as if nothing happened, while
wall-clock time makes a leap when the computer wakes up. This wall-clock
behavior is what the epoch manager (and most other components in Kopia)
relies upon.
Fixes #1402
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
* refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
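The replacements are mechanical; the mapping from the Go 1.16 release notes, with one example call site:

```go
// Old io/ioutil call            -> replacement since Go 1.16
//
// ioutil.ReadFile(name)         -> os.ReadFile(name)
// ioutil.WriteFile(name, b, m)  -> os.WriteFile(name, b, m)
// ioutil.TempFile(dir, pat)     -> os.CreateTemp(dir, pat)
// ioutil.TempDir(dir, pat)      -> os.MkdirTemp(dir, pat)
// ioutil.ReadAll(r)             -> io.ReadAll(r)
// ioutil.NopCloser(r)           -> io.NopCloser(r)
// ioutil.Discard                -> io.Discard
// ioutil.ReadDir(name)          -> os.ReadDir(name) (note: returns
//                                  []fs.DirEntry instead of []fs.FileInfo)
package sketch

import "os"

func readConfig(path string) ([]byte, error) {
	return os.ReadFile(path) // was: ioutil.ReadFile(path)
}
```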
* chore: remove //nolint:gosec for os.ReadFile
At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
Globally replaced all use of the time package with the internal 'clock'
package, which provides an indirection over time.Now().
Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT
logfile: squelch annoying log message
testenv: added faketimeserver which serves time over HTTP
testing: added endurance test which tests kopia over long time scale
This creates a kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).
The test is not part of any test suite yet but will run in
post-submit mode only, preferably 24/7.
testing: refactored internal/clock to only support injection when
the 'testing' build tag is present
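A hedged sketch of the indirection; the real internal/clock package and the KOPIA_FAKE_CLOCK_ENDPOINT wiring differ in detail:

```go
// Sketch of a time indirection that only permits injection in test
// builds; names are illustrative, not Kopia's actual internal/clock code.
package clock

import "time"

// nowFunc defaults to the real clock; the rest of the codebase must go
// through Now() instead of calling time.Now() directly.
var nowFunc = time.Now

// Now returns the current time from the active source.
func Now() time.Time { return nowFunc() }

// SetFake would live in a separate file compiled only under the
// 'testing' build tag (via //go:build testing), so production binaries
// cannot have their clock replaced, e.g. from KOPIA_FAKE_CLOCK_ENDPOINT.
func SetFake(f func() time.Time) { nowFunc = f }
```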
* blob: added DisplayName() method to blob.Storage
* cli: added 'kopia repo sync-to <provider>' which replicates BLOBs
Usage demo: https://asciinema.org/a/352299
Fixes #509
* implemented suggestion by Ciantic to fail sync if the destination repository is not compatible with the source
* cli: added 'kopia repo sync --must-exist'
This ensures that the target repository is not empty; otherwise syncing to
an accidentally unmounted filesystem directory might copy everything
again.
* content: added support for cache of own writes
This keeps track of which blobs (`n` and `m`) have been written by the
local repository client, so that even if the storage listing
is eventually consistent (as in S3), we get somewhat sane behavior.
Note that this still assumes read-after-create semantics, which
S3 also guarantees; otherwise it's very hard to do anything useful.
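A rough sketch of the idea with hypothetical types (not Kopia's actual implementation):

```go
// Sketch: remember our own recent writes and merge them into possibly
// stale listings from eventually-consistent storage.
package sketch

import (
	"sync"
	"time"
)

type ownWritesCache struct {
	mu sync.Mutex
	// blob ID -> time we wrote it; relies on read-after-create semantics,
	// so the blob is fetchable even if listings don't show it yet.
	written map[string]time.Time
}

func (c *ownWritesCache) recordWrite(id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.written == nil {
		c.written = map[string]time.Time{}
	}
	c.written[id] = time.Now()
}

// mergeList adds any of our own writes that an eventually-consistent
// listing has not surfaced yet.
func (c *ownWritesCache) mergeList(listed []string) []string {
	seen := map[string]bool{}
	for _, id := range listed {
		seen[id] = true
	}
	c.mu.Lock()
	defer c.mu.Unlock()
	for id := range c.written {
		if !seen[id] {
			listed = append(listed, id)
		}
	}
	return listed
}
```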
* compaction: support for compaction logs
Instead of compaction immediately deleting source index blobs, we now
write log entries (with `m` prefix) which are merged on reads
and applied only if the blob list includes all inputs and outputs, in
which case the inputs are discarded since they are known to have been
superseded by the outputs.
This addresses eventual consistency issues in stores such as S3,
which don't guarantee list-after-put or list-after-delete. With such
stores the repository is ultimately eventually consistent and there's
not much that can be done about it, unless we use a second strongly
consistent storage (such as GCS) for the index only.
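The apply rule as a sketch; the entry structure is hypothetical:

```go
// Sketch of the merge-time rule described above: a compaction log entry
// takes effect only when the current blob listing contains all of its
// inputs and outputs; only then are the inputs dropped as superseded.
package sketch

type compactionLogEntry struct {
	Inputs  []string // index blobs consumed by the compaction
	Outputs []string // index blobs it produced
}

func applyCompactionLog(entries []compactionLogEntry, present map[string]bool) map[string]bool {
	live := make(map[string]bool, len(present))
	for id := range present {
		live[id] = true
	}
	for _, e := range entries {
		all := true
		for _, id := range append(append([]string{}, e.Inputs...), e.Outputs...) {
			if !present[id] {
				all = false // a missing blob means the listing is stale; don't apply
				break
			}
		}
		if all {
			for _, id := range e.Inputs {
				delete(live, id) // inputs are superseded by the outputs
			}
		}
	}
	return live
}
```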
* content: updated list cache to cache both `n` and `m`
* repo: fixed cache clear on windows
Clearing the cache requires closing the repository first, as Windows holds
the files locked.
This requires the ability to close the repository twice.
* content: refactored index blob management into indexBlobManager
* testing: fixed blobtesting.Map storage to allow overwrites
* blob: added debug output String() to blob.Metadata
* testing: added indexBlobManager stress test
This works by using N parallel "actors", each repeatedly performing
operations on indexBlobManagers, all sharing a single eventually
consistent storage.
Each actor runs in a loop and randomly selects between:
- *reading* all contents in indexes and verifying that it includes
all contents written by the actor so far and that contents are
correctly marked as deleted
- *creating* new contents
- *deleting* one of previously-created contents (by the same actor)
- *compacting* all index files into one
The test runs on accelerated time (every read of time moves it by 0.1
seconds) and simulates several hours of running.
In case of a failure, the log should provide enough debugging
information to trace the exact sequence of events leading up to the
failure - each log line is prefixed with actorID and all storage
access is logged.
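A structural sketch of one actor's loop (storage plumbing elided, names hypothetical):

```go
// Sketch of a single stress-test actor: each iteration randomly picks
// among verify / create / delete / compact, as described above.
package sketch

import (
	"fmt"
	"math/rand"
)

type actor struct {
	id      int
	created []string // contents this actor has written and not yet deleted
}

func (a *actor) run(iterations int, rnd *rand.Rand) {
	for i := 0; i < iterations; i++ {
		switch rnd.Intn(4) {
		case 0:
			a.verifyOwnContents() // all surviving contents must be visible
		case 1:
			id := fmt.Sprintf("actor%d-content%d", a.id, i)
			a.writeContent(id)
			a.created = append(a.created, id)
		case 2:
			if n := len(a.created); n > 0 {
				k := rnd.Intn(n)
				a.deleteContent(a.created[k])
				a.created = append(a.created[:k], a.created[k+1:]...)
			}
		case 3:
			a.compactAllIndexes() // merge all index files into one
		}
	}
}

// Elided: these talk to an indexBlobManager over shared
// eventually-consistent storage, logging each operation with the actorID.
func (a *actor) verifyOwnContents()      {}
func (a *actor) writeContent(id string)  {}
func (a *actor) deleteContent(id string) {}
func (a *actor) compactAllIndexes()      {}
```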
* makefile: increase test timeout
* content: fixed index blob manager race
The race is that if we delete a compaction log too early, it may lead to
previously deleted contents becoming temporarily live again to an
outside observer.
Added a test case that reproduces the issue; verified that it fails
without the fix and passes with it.
* testing: improvements to TestIndexBlobManagerStress test
- better logging to be able to trace the root cause in case of a failure
- prevented concurrent compaction which is unsafe:
The sequence:
1. A creates contentA1 in INDEX-1
2. B creates contentB1 in INDEX-2
3. A deletes contentA1 in INDEX-3
4. B does compaction, but is not seeing INDEX-3 (due to EC or simply
because B started its read before #3 completed), so it writes
INDEX-4==merge(INDEX-1,INDEX-2)
* INDEX-4 has contentA1 as active
5. A does compaction but it's not seeing INDEX-4 yet (due to EC
or because its read started before #4), so it drops contentA1 and writes
INDEX-5=merge(INDEX-1,INDEX-2,INDEX-3)
* INDEX-5 does not have contentA1
6. C sees INDEX-4 and INDEX-5, and merge(INDEX-4,INDEX-5)
contains contentA1, which is wrong because contentA1 has been deleted
(and there's no record of it anywhere in the system)
* content: when building pack index ensure index bytes are different each time by adding 32 random bytes
This is done by introducing N unsynchronized caches, which simulate
what the frontend of a cloud storage system might do and which cause
eventually consistent behavior.
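A sketch of that simulation with hypothetical types:

```go
// Sketch of the eventual-consistency simulation: N unsynchronized
// frontend caches over one authoritative map, so different observers
// see different stale views of the same storage.
package sketch

import "math/rand"

type ecStorage struct {
	backing map[string][]byte   // authoritative contents
	caches  []map[string][]byte // N stale per-frontend views
}

// put writes through to the backing store, but only the frontend that
// handled the request observes the write immediately.
func (s *ecStorage) put(id string, data []byte, rnd *rand.Rand) {
	s.backing[id] = data
	s.caches[rnd.Intn(len(s.caches))][id] = data
}

// list routes the reader to a random frontend, which may not have seen
// recent writes yet: exactly the list-after-put gap described above.
func (s *ecStorage) list(rnd *rand.Rand) map[string][]byte {
	return s.caches[rnd.Intn(len(s.caches))]
}
```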