kopia

mirror of https://github.com/kopia/kopia.git synced 2026-06-16 18:19:03 -04:00

Author	SHA1	Message	Date
Jarek Kowalski	4d7f0cb6cd	Fixed symlink restore behavior on macOS (#673 ) * restore: use symlink-specific APIs instead of chmod, chown and chtimes * upload: fix updating directory modtime for symlinks * cli: plumbed through flags to restore to control new behaviors * localfs: use Lstat() instead of Stat() in Child() method * testing: added restore tests for new flags	2020-10-10 11:03:35 -07:00
Jarek Kowalski	1962882aa8	testing: use shorter RSA keys to speed up server tests (#665 )	2020-10-04 22:06:59 -07:00
Jarek Kowalski	ae38fa3917	Speed up integration tests (#653 ) * testing: don't use expensive scrypt-65536-8-1 in integration tests * testing: use platform-specific encryption and hashing for arm and arm64 to speed up tests * testing: manually manage log directory to be able to analyze integration test failures * testing: snapshot_gc_test was too quick * Makefile: renamed target building integration test binary	2020-09-30 22:01:16 -07:00
Jarek Kowalski	0758a92c58	restore: improved user experience (#644 ) * restore: improved user experience * 'snapshot restore' is now the same as 'restore' and both will support restoring by manifest ID, root ID or root ID + subdirectory * added support for restoring individual files * implemented PR feedback and refactored object ID parsing Moving helpers inside the snapshot/ package helped clean up the code a lot.	2020-09-28 22:57:24 -07:00
Jarek Kowalski	c9c8d27c8d	Repro and fix for zero-sized snapshot bug (#641 ) * server: repro for zero-sized snapshot bug As described in https://kopia.discourse.group/t/kopia-0-7-0-not-backing-up-any-files-repro-needed/136/5 * server: fixed zero-sized snapshots after repository is connected via API The root cause was that source manager was inheriting HTTP call context which was immediately closed after the 'connect' RPC returned thus silently killing all uploads.	2020-09-23 20:15:36 -07:00
Julio López	ae6a960080	Prefer t.TempDir() over makeScratchDir(t) (#612 ) Prefer t.TempDir() over makeScratchDir(t) Remove unused randomString Leverage T.TempDir() in CLITest env	2020-09-22 22:16:39 -07:00
Jarek Kowalski	6bdcb81712	ignorefs: fixed arm-specific linter warning (#637 ) * ignorefs: fixed arm-specific linter warning * testing: TestServerStart fixes for armhf	2020-09-22 19:04:05 -07:00
Jarek Kowalski	fce9497375	restore: support for symlinks (experimental) (#621 )	2020-09-18 10:29:20 -07:00
Nick	7f61dc6637	[Robustness] Add command line parameters for kopia snapshotter (#576 ) * [Robustness] Add command line parameters for kopia snapshotter Add flags for: - no-progress - parallel - cache sizes - no update check Add an integration test to validate snapshotter expected output against a kopia executable.	2020-09-18 01:15:19 -07:00
Jarek Kowalski	f2cf71d914	logging: revamped logs from content manager to be machine parseable (#617 ) * logging: revamped logs from content manager to be machine parseable Logs from the content manager (except reads) are sent to separate log file that is always free from personally-identifiable information (e.g. no file names, just content IDs and blob IDs). Also moved CLI logs to a subdirectory (cli-logs) and put content logs in a parallel directory (content-logs) Also, the log file name will now include the type of the command that was invoked: kopia-20200913-134157-16110-snapshot-create.log Fixes #588 * tests: moved all logs from tests to a separate directory	2020-09-16 20:04:26 -07:00
Jarek Kowalski	c7be3a0c87	testing: added performance benchmark (#618 ) The benchmark creates 20 GB of files in different configurations * 10 x 2 GB files * 100 x 200 MB files * 1000 x 20 MB files and backs them up to a local filesystem repository measuring time, CPU and RAM usage. The benchmarking script uses a GCP instance (n1-standard-8) with fast NVMe flash to eliminate local filesystem latency. Current performance numbers show major improvement in latency in 0.7.0-rc1 due to splitter throughput optimization (#606).	2020-09-15 21:30:08 -07:00
Jarek Kowalski	6a14ac8a2a	cli: ensure advanced commands are not accidentally used (#611 ) * cli: ensure advanced commands are not accidentally used This prints an error when a dangerous command is used without first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable. Co-authored-by: Julio López <julio+gh@kasten.io>	2020-09-12 20:31:25 -07:00
Julio López	64b6018140	Test for directory reuse after GC (#601 ) content:Allow returning deleted content in GetContent maintenance: check deleted contents as well maintenance: test for when a directory content is reused after deletion testing: add support for repo open options in repotesting * Allow passing repo options to MustReopen * Add repotesting.Environment.MustConnectOpenAnother * Remove kopia.config.mlock file * snapshot create helper * Fix content delete related and e2e tests	2020-09-12 19:28:52 -07:00
Jarek Kowalski	29ce1819cb	Added support for setting and changing repository client options (description, read-only, hostname, username) (#589 ) * repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config * cli: added 'repository set-client' to configure parameters of connected repository * cli: cleaned up 'repository status' output	2020-09-04 13:57:15 -07:00
Jarek Kowalski	ded1ecf936	implemented Cache Directory Tagging Specification + CLI + UI (#565 ) Fixes #564 cli: added 'kopia policy set --ignore-cache-dirs' option to control whether to ignore caches (global default=true) ui: added checkbox to control 'Ignore Cache Dirs' in policy editor ignorefs: moved ignoring cache directories to ignorefs layer Co-authored-by: Julio López <julio+gh@kasten.io>	2020-08-31 21:35:26 -07:00
Jarek Kowalski	c242235a32	blob: added SetTime() method which may be optionally implemented by blob.Storage (#575 ) cli: added --times option to 'repository sync'	2020-08-31 19:50:15 -07:00
Jarek Kowalski	965160dba1	cli: ignore trailing / in repository server URL (#569 ) Fixes #557	2020-08-30 16:10:26 -07:00
Jarek Kowalski	1a8fcb086c	Added endurance test which tests kopia over long time scale (#558 ) Globally replaced all use of time with internal 'clock' package which provides indirection to time.Now() Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT logfile: squelch annoying log message testenv: added faketimeserver which serves time over HTTP testing: added endurance test which tests kopia over long time scale This creates kopia repository and simulates usage of Kopia over multiple months (using accelerated fake time) to trigger effects that are only visible after long time passage (maintenance, compactions, expirations). The test is not used as part of any test suite yet but will run in post-submit mode only, preferably 24/7. testing: refactored internal/clock to only support injection when 'testing' build tag is present	2020-08-26 23:03:46 -07:00
Nick	e7675f2d01	Address additional suggestions from fio workload PR #529 (#550 ) Followup on recent PR #529, some suggestions and discussion after it was merged: - Express probability as float in range [0,1] - Add a unit test for DeleteContentsAtDepth - Add a comment on writeFilesAtDepth explaining depth vs branchDepth - Refactor pickRandSubdirPath for easier readability and understanding Upon some reflection, I decided to refactor pickRandSubdirPath() to gather indexes and pick randomly from them instead of the previous reservoir sampling approach. I think this is easier to understand going forward without extra explanation, doesn't have much additional memory overhead, and reduces the number of rand calls to 1.	2020-08-20 21:10:56 -07:00
Nick	da6b933542	[Robustness] Add additional fio workloads and fix fio runner (#529 ) * [Robustness] Add additional fio workloads Add more fio workloads to write files at different depths in random branches of the generated file system tree. - Write files at depth - Write files at a specified depth, creating a new directory branch at a random depth - Delete a random directory at a given depth - Delete some or all of the contents of a random directory at a specified depth	2020-08-14 21:54:52 -07:00
Nick	14d50aaa50	[Robustness] Fix for kopia runner and custom working directory (#533 ) * [Robustness] Fix for kopia runner and custom work dir Apply fix similar to #293 for the robustness kopia runner. Add control for runner working directory.	2020-08-14 17:32:45 -07:00
Nick	0c3ab1337e	[Robustness] Fswalker should ignore host name (#531 ) Fix fswalker to ignore hostname to allow reporting on walks done across different hosts. Also prevent Before and After walk data from printing to reduce log size.	2020-08-13 16:40:23 -07:00
Nick	7da4022cfd	[refactor] Move robustness and engine packages (#528 ) Perform minor refactor by moving robustness and engine packages in preparation for later PRs Fix import path	2020-08-13 14:56:44 -07:00
Jarek Kowalski	9a6dea898b	Linter upgrade to v1.30.0 (#526 ) * fixed godot linter errors * reformatted source with gofumpt * disabled some linters * fixed nolintlint warnings * fixed gci warnings * lint: fixed 'nestif' warnings * lint: fixed 'exhaustive' warnings * lint: fixed 'gocritic' warnings * lint: fixed 'noctx' warnings * lint: fixed 'wsl' warnings * lint: fixed 'goerr113' warnings * lint: fixed 'gosec' warnings * lint: upgraded linter to 1.30.0 * lint: more 'exhaustive' warnings Co-authored-by: Nick <nick@kasten.io>	2020-08-12 19:28:53 -07:00
Jarek Kowalski	bb2434dc28	upload: auto-ignore kopia cache directories when creating snapshots (#524 ) This creates a marker file named `.kopia-cache` in the directory that is the root of cache. When the uploader finds this file, it will treat the entire directory as if it were empty. This allows excluding directory caches from entire home and root directories.	2020-08-10 11:19:54 -07:00
Jarek Kowalski	505ab92e21	Support for repository sync (#522 ) * blob: added DisplayName() method to blob.Storage * cli: added 'kopia repo sync-to <provider>' which replicates BLOBs Usage demo: https://asciinema.org/a/352299 Fixes #509 * implemented suggestion by Ciantic to fail sync if the destination repository is not compatible with the source * cli: added 'kopia repo sync --must-exist' This ensures that target repository is not empty, otherwise syncing to an accidentally unmounted filesystem directory might copy everything again.	2020-08-09 12:36:41 -07:00
Nick	7867513732	[Fix #519 ] Fix pack blob parse in TestRestoreFail (#520 ) Fix parsing of pack blob ID by using a specific regex instead of a strings.Contains. This prevents the test from misidentifying other blob IDs as pack blobs, such as `kopia.maintenance`.	2020-08-05 18:31:48 -07:00
Jarek Kowalski	40acf238f3	Fixed arm and arm64 build. (#506 ) * fixed a number of cases where misaligned data was causing panics on armv7 (but not armv8) * travis: enable arm64 * test: reduce compressed data sizes when running on arm * arm: wait longer for snapshots	2020-07-30 17:31:28 -07:00
Jarek Kowalski	8ead49b779	restore: support for zip, tar and tar.gz restore outputs (#482 ) * restore: support for zip, tar and tar.gz restore outputs Moved restore functionality to its own package. * Fix enum values in the 'mode' flag Co-authored-by: Julio López <julio+gh@kasten.io>	2020-07-22 22:56:11 -07:00
Nick	ce5e6dcd13	[Robustness] Add first robustness tests Add two tests: - TestManySmallFiles: writes 100k files size 4k to a directory. Snapshots the data tree, restores and validates data. - TestModifyWorkload: Loops over a simple randomized workload. Performs a series of random file writes to some random sub-directories, then takes a snapshot of the data tree. All snapshots taken during this test are restore-verified at the end. A global test engine is instantiated in main_test.go, to be used in the robustness test suite across tests (saves time loading/saving metadata once per run instead of per test).	2020-07-14 22:37:11 -07:00
Nick	82a2fa0ea5	[Robustness] Add test engine to manage snapshot verification testing (#468 ) * Add test engine to manage snapshot verification testing Test engine manages the test and metadata repositories, snapshot checker, metadata storage persistence, and file writer. It is the high level helper that will be invoked in the snapshot verification testing suite. - modify data directory file structure - issue snapshot/restore/delete to the data directory - accumulate metadata over the course of the test suite - flush accumulated metadata to the metadata repository - load historical metadata from the repository on initialization - perform automatic data integrity verification on snap restore This change corresponds to the robustness execution engine component from the design documentation.	2020-06-27 23:46:37 -07:00
Jarek Kowalski	79757672ca	server: implemented 'flush' and 'refresh' API Added test that verifies that when client performs Flush (which happens at the end of each snapshot and when repository is closed), the server writes new blobs to the storage. Fixes #464	2020-06-07 19:38:13 -07:00
Nick	b30da511e7	Wire up snapshot-store-compare behavior Connect the snapshotter, the storer, and the comparer. Invoke the snapshotter to take/restore/delete snapshots on the repo, the comparer to gather metadata before the snapshot and after the restore, and the storer to save metadata for later lookup when verifying restores.	2020-05-31 21:12:31 -07:00
Jarek Kowalski	d68273a576	Improvements for dealing with eventually-consistent stores (S3) (#437 ) * content: added support for cache of own writes This keeps track of which blobs (n and m) have been written by the local repository client, so that even if the storage listing is eventually consistent (as in S3), we get somewhat sane behavior. Note that this is still assuming read-after-create semantics, which S3 also guarantees, otherwise it's very hard to do anything useful. * compaction: support for compaction logs Instead of compaction immediately deleting source index blobs, we now write log entries (with `m` prefix) which are merged on reads and applied only if the blob list includes all inputs and outputs, in which case the inputs are discarded since they are known to have been superseded by the outputs. This addresses eventual consistency issues in stores such as S3, which don't guarantee list-after-put or list-after-delete. With such stores the repository is ultimately eventually consistent and there's not much that can be done about it, unless we use a second strongly consistent storage (such as GCS) for the index only. * content: updated list cache to cache both `n` and `m` * repo: fixed cache clear on windows Clearing cache requires closing repository first, as Windows is holding the files locked. This requires ability to close the repository twice. * content: refactored index blob management into indexBlobManager * testing: fixed blobtesting.Map storage to allow overwrites * blob: added debug output String() to blob.Metadata * testing: added indexBlobManager stress test This works by using N parallel "actors", each repeatedly performing operations on indexBlobManagers all sharing single eventually consistent storage. Each actor runs in a loop and randomly selects between: - reading all contents in indexes and verifying that it includes all contents written by the actor so far and that contents are correctly marked as deleted - creating new contents - deleting one of previously-created contents (by the same actor) - compacting all index files into one The test runs on accelerated time (every read of time moves it by 0.1 seconds) and simulates several hours of running. In case of a failure, the log should provide enough debugging information to trace the exact sequence of events leading up to the failure - each log line is prefixed with actorID and all storage access is logged. * makefile: increase test timeout * content: fixed index blob manager race The race is where if we delete compaction log too early, it may lead to previously deleted contents becoming temporarily live again to an outside observer. Added test case that reproduces the issue, verified that it fails without the fix and passes with it. * testing: improvements to TestIndexBlobManagerStress test - better logging to be able to trace the root cause in case of a failure - prevented concurrent compaction which is unsafe: The sequence: 1. A creates contentA1 in INDEX-1 2. B creates contentB1 in INDEX-2 3. A deletes contentA1 in INDEX-3 4. B does compaction, but is not seeing INDEX-3 (due to EC or simply because B started read before #3 completed), so it writes INDEX-4==merge(INDEX-1,INDEX-2) * INDEX-4 has contentA1 as active 5. A does compaction but it's not seeing INDEX-4 yet (due to EC or because read started before #4), so it drops contentA1, writes INDEX-5=merge(INDEX-1,INDEX-2,INDEX-3) * INDEX-5 does not have contentA1 6. C sees INDEX-4 and INDEX-5 and merge(INDEX-4,INDEX-5) contains contentA1 which is wrong, because contentA1 has been deleted (and there's no record of it anywhere in the system) * content: when building pack index ensure index bytes are different each time by adding 32 random bytes	2020-05-31 17:11:20 -07:00
Julio López	0875939d56	dep: upgrade protobuf dependents (#442 ) Upgrade cloud.google.com/go/storage to v1.8.0 from version 1.6.0 Change logs: - https://github.com/googleapis/google-cloud-go/releases/tag/storage%2Fv1.8.0 - https://github.com/googleapis/google-cloud-go/releases/tag/storage%2Fv1.7.0 Protobuf from 1.3.5 to 1.4.2 - https://github.com/golang/protobuf/releases - https://github.com/golang/protobuf/releases/tag/v1.4.2 Use google.golang.org/protobuf version 1.23.0 Instead of github.com/golang/protobuf/proto which has been superseded - https://github.com/protocolbuffers/protobuf-go/releases cloud.google.com/go from 0.54.0 to 0.57.0 - https://github.com/googleapis/google-cloud-go/releases/tag/v0.57.0 - https://github.com/googleapis/google-cloud-go/releases/tag/v0.56.0 - https://github.com/googleapis/google-cloud-go/releases/tag/v0.55.0 google.golang.org/api from 0.20 to 0.25.0 - https://github.com/googleapis/google-api-go-client/releases github.com/prometheus/client_golang to 1.6.0 - https://github.com/prometheus/client_golang/releases Required changes: - Fix import paths for protobuf imports - Add linter exception - Use prototext package to marshal to text	2020-05-21 13:22:59 -07:00
Julio López	23c12125d8	Build test/tools in Darwin as well (#447 ) Upgrade github.com/google/fswalker to get fixes for Darwin/macOS #212 google/fswalker#25	2020-05-21 12:02:27 -07:00
Jarek Kowalski	ca28469706	cli: improved 'snapshot delete' usage (#436 ) New usage: ``` kopia snapshot delete manifestID... [--delete] kopia snapshot delete rootObjectID... [--delete] ``` Fixes #435 cli: added --unsafe-ignore-source as alias for `--delete` This is a hidden flag for backwards compatibility. It will be removed.	2020-05-13 23:43:45 -07:00
Jarek Kowalski	d657415817	testing: added blob.Storage wrapper that simulates eventual consistency (#434 ) This is done by introducing N unsynchronized caches, which simulate what frontend of a cloud storage system might do, that causes eventual consistency behavior.	2020-05-09 12:19:32 -07:00
Jarek Kowalski	be4b897579	Support for remote repository (#427 ) Support for remote content repository where all contents and manifests are fetched over HTTP(S) instead of locally manipulating blob storage * server: implement content and manifest access APIs * apiclient: moved Kopia API client to separate package * content: exposed content.ValidatePrefix() * manifest: added JSON serialization attributes to EntryMetadata * repo: changed repo.Open() to return Repository instead of DirectRepository repo: added apiServerRepository * cli: added 'kopia repository connect server' This sets up repository connection via the API server instead of directly-manipulated storage. * server: add support for specifying a list of usernames/password via --htpasswd-file * tests: added API server repository E2E test * server: only return manifests (policies and snapshots) belonging to authenticated user	2020-05-02 21:41:49 -07:00
Jarek Kowalski	4b4628a21e	Repository maintenance support (#411 ) Maintenance: support for automatic GC Moved maintenance algorithms from 'cli' to 'repo/maintenance' package Added support for CLI commands: kopia gc - performs quick maintenance kopia gc --full - performs full maintenance Full maintenance performs snapshot gc, but it's not safe to do this automatically possibly in parallel to snapshots being taken. This will be addressed in the ~0.7 timeframe.	2020-04-14 00:11:41 -07:00
Jarek Kowalski	057c2789d8	Kopia UI: support for multiple repositories + portability (#398 ) * server: when serving HTML UI, prefix the title with string from KOPIA_UI_TITLE_PREFIX envar * kopia-ui: support for multiple repositories + portability This is a major rewrite of the app/ codebase which changes how configuration for repositories is maintained and how it flows through the component hierarchy. Portable mode is enabled by creating 'repositories' subdirectory before launching the app. on macOS: <parent>/KopiaUI.app <parent>/repositories/ On Windows, option #1 - nested directory <parent>\KopiaUI.exe <parent>\repositories\ On Windows, option #2 - parallel directory <parent>\some-dir\KopiaUI.exe <parent>\repositories\ In portable mode, repositories will have 'cache' and 'logs' nested in it.	2020-04-04 17:18:37 -07:00
Jarek Kowalski	6cb9b8fa4f	repo: refactored public API (#318 ) * This is 99% mechanical: Extracted repo.Repository interface that only exposes high-level object and manifest management methods, but not blob nor content management. Renamed old repo.Repository to repo.DirectRepository Reviewed codebase to only depend on repo.Repository as much as possible, but added way for low-level CLI commands to use DirectRepository. * PR fixes	2020-03-26 08:04:01 -07:00
Seb Patane	6789f8e64c	cli: allow override of snapshot start time and end time	2020-03-21 09:27:32 -07:00
Jarek Kowalski	239d809075	performance: introduced buf.Pool which helps reuse memory buffers (#345 ) * performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression * repo: switched object writer to buf.Pool * content: switched encryption to use buf.Pool * object: switched compression to use buf.Pool * testing: added missing content manager Close()	2020-03-18 20:42:16 -07:00
Nick	78a921ab2c	Rename Simple mutex	2020-03-14 12:00:22 -07:00
Nick	60d1cc821a	[robustness testing] Implement metadata store for test metadata using kopia Add an implementation of the metadata store using kopia snapshots and restores to manage persistence of walk metadata. Metadata are stored to and retrieved from the store, and a mechanism for persisting the store is implemented using kopia snapshots.	2020-03-14 12:00:22 -07:00
Jarek Kowalski	6b04b42794	tests: added smoke test that exercises all combinations of encryption and hashing	2020-03-13 20:42:12 -07:00
Jarek Kowalski	e80f5536c3	performance: plumbed through output buffer to encryption and hashing,… (#333 ) * performance: plumbed through output buffer to encryption and hashing, so that the caller can pre-allocate/reuse it * testing: fixed how we do comparison of byte slices to account for possible nils, which can be returned from encryption	2020-03-12 08:27:44 -07:00
Jarek Kowalski	514df69afa	performance: added wrapper around io.Copy() this pools copy buffers so they can be reused instead of throwing away after each io.Copy()	2020-03-10 21:52:30 -07:00
Nick	05852322da	Add snapshotter interface and kopia implementation Snapshotter interface describes an entity that can create, restore, and delete snapshots, as well as manage a repository. Add kopia implementation of the snapshotter interface.	2020-03-10 07:32:14 -07:00

158 Commits