159 Commits

Author SHA1 Message Date
Jarek Kowalski
6b756bad40 fshasher: truncate timestamps to full seconds when comparing to accommodate filesystems that lose precision (#661) 2020-10-03 15:15:24 -07:00
Jarek Kowalski
f66fe5789e Eliminated busy loop after snapshot failure (#658)
* server: if a snapshot fails, don't start the next one for 5 minutes or until the next successful refresh.

* Makefile: don't print skipped tests
2020-10-02 19:48:21 -07:00
Jarek Kowalski
ae38fa3917 Speed up integration tests (#653)
* testing: don't use expensive scrypt-65536-8-1 in integration tests

* testing: use platform-specific encryption and hashing for arm and arm64 to speed up tests

* testing: manually manage log directory to be able to analyze integration test failures

* testing: snapshot_gc_test was too quick

* Makefile: renamed target building integration test binary
2020-09-30 22:01:16 -07:00
Jarek Kowalski
a01bfde39a Makefile: plug in gotestsum for better test output (#652) 2020-09-29 21:14:20 -07:00
Jarek Kowalski
1636071f6b testing: increase test timeout because 90s is often flaky 2020-09-26 18:49:53 -07:00
Jarek Kowalski
c7be3a0c87 testing: added performance benchmark (#618)
The benchmark creates 20 GB of files in different configurations

* 10 x 2 GB files
* 100 x 200 MB files
* 1000 x 20 MB files

and backs them up to a local filesystem repository measuring time,
CPU and RAM usage.

The benchmarking script uses a GCP instance (n1-standard-8) with fast NVMe
flash to eliminate local filesystem latency.

Current performance numbers show major improvement in latency in
0.7.0-rc1 due to splitter throughput optimization (#606).
2020-09-15 21:30:08 -07:00
Jarek Kowalski
57888a81eb travis: disable publishing RPM on pull requests since it needs credentials 2020-09-11 09:21:14 -07:00
Jarek Kowalski
17764567ab travis: update RPM and APT repos on linux/amd64 even for non-tagged commits 2020-09-10 17:30:17 -07:00
Jarek Kowalski
f8d0abb020 Makefile: fixed publishing logic 2020-09-09 23:38:10 -07:00
Jarek Kowalski
4ef314bee5 Added RPM repository (#600)
* goreleaser: added signatures to RPM binaries

Currently goreleaser does not support it, so we're overriding
signing script and signing all RPMs that it produces.

Also changed goreleaser parameters to only publish binaries
when running on linux/amd64.

* build: added automatic publishing of RPMs to a YUM repository

Also fixed RPM file names to match local conventions.
2020-09-09 23:18:20 -07:00
Jarek Kowalski
1a8fcb086c Added endurance test which tests kopia over long time scale (#558)
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()

Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT

logfile: squelch annoying log message

testenv: added faketimeserver which serves time over HTTP

testing: added endurance test which tests kopia over long time scale

This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).

The test is not yet part of any test suite but will run in
post-submit mode only, preferably 24/7.

testing: refactored internal/clock to only support injection when
'testing' build tag is present
2020-08-26 23:03:46 -07:00
Jarek Kowalski
b7872760e0 makefile: fixed BOTO_PATH 2020-08-14 14:33:58 -07:00
Jarek Kowalski
b381d31eb8 tools: added apt-publish tool that pushes to APT repository
The repository is in GCS and the script will automatically copy
DEB files from dist/ to the proper locations and will regenerate
APT package index.
2020-08-14 14:08:10 -07:00
Jarek Kowalski
eea6b466af maintenance: fixed scheduling of quick maintenance (#507)
* maintenance: fixed scheduling of quick maintenance

* Makefile: increase unit test timeouts
2020-08-01 06:57:10 -07:00
Jarek Kowalski
40acf238f3 Fixed arm and arm64 build. (#506)
* fixed a number of cases where misaligned data was causing panics on armv7 (but not armv8)
* travis: enable arm64
* test: reduce compressed data sizes when running on arm
* arm: wait longer for snapshots
2020-07-30 17:31:28 -07:00
Jarek Kowalski
d68273a576 Improvements for dealing with eventually-consistent stores (S3) (#437)
* content: added support for cache of own writes

This keeps track of which blobs (n and m) have been written by the
local repository client, so that even if the storage listing
is eventually consistent (as in S3), we get somewhat sane behavior.

Note that this is still assuming read-after-create semantics, which
S3 also guarantees, otherwise it's very hard to do anything useful.

* compaction: support for compaction logs

Instead of compaction immediately deleting source index blobs, we now
write log entries (with `m` prefix) which are merged on reads
and applied only if the blob list includes all inputs and outputs, in
which case the inputs are discarded since they are known to have been
superseded by the outputs.

This addresses eventual consistency issues in stores such as S3,
which don't guarantee list-after-put or list-after-delete. With such
stores the repository is ultimately eventually consistent and there's
not much that can be done about it, unless we use a second, strongly
consistent storage (such as GCS) for the index only.

* content: updated list cache to cache both `n` and `m`

* repo: fixed cache clear on windows

Clearing cache requires closing repository first, as Windows is holding
the files locked.

This requires ability to close the repository twice.

* content: refactored index blob management into indexBlobManager

* testing: fixed blobtesting.Map storage to allow overwrites

* blob: added debug output String() to blob.Metadata

* testing: added indexBlobManager stress test

This works by using N parallel "actors", each repeatedly performing
operations on indexBlobManagers all sharing single eventually consistent
storage.

Each actor runs in a loop and randomly selects between:

- *reading* all contents in indexes and verifying that it includes
  all contents written by the actor so far and that contents are
  correctly marked as deleted
- *creating* new contents
- *deleting* one of previously-created contents (by the same actor)
- *compacting* all index files into one

The test runs on accelerated time (every read of time moves it by 0.1
seconds) and simulates several hours of running.

In case of a failure, the log should provide enough debugging
information to trace the exact sequence of events leading up to the
failure - each log line is prefixed with actorID and all storage
access is logged.

* makefile: increase test timeout

* content: fixed index blob manager race

The race is where if we delete compaction log too early, it may lead to
previously deleted contents becoming temporarily live again to an
outside observer.

Added test case that reproduces the issue, verified that it fails
without the fix and passes with it.

* testing: improvements to TestIndexBlobManagerStress test

- better logging to be able to trace the root cause in case of a failure
- prevented concurrent compaction which is unsafe:

The sequence:

1. A creates contentA1 in INDEX-1
2. B creates contentB1 in INDEX-2
3. A deletes contentA1 in INDEX-3
4. B does compaction, but is not seeing INDEX-3 (due to EC or simply
   because B started read before #3 completed), so it writes
   INDEX-4==merge(INDEX-1,INDEX-2)
   * INDEX-4 has contentA1 as active
5. A does compaction but it's not seeing INDEX-4 yet (due to EC
   or because read started before #4), so it drops contentA1, writes
   INDEX-5=merge(INDEX-1,INDEX-2,INDEX-3)
   * INDEX-5 does not have contentA1
6. C sees INDEX-4 and INDEX-5 and merge(INDEX-4,INDEX-5)
   contains contentA1 which is wrong, because contentA1 has been deleted
   (and there's no record of it anywhere in the system)

* content: when building pack index ensure index bytes are different each time by adding 32 random bytes
2020-05-31 17:11:20 -07:00
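The compaction-log rule in d68273a576 (apply a log entry only if the blob listing includes all of its inputs and outputs, then discard the inputs) can be sketched in Go as follows. The struct, field names and blob IDs are made up for illustration; this is not the indexBlobManager implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// compactionLogEntry (hypothetical shape) records which index blobs were
// merged (Inputs) into which new blobs (Outputs).
type compactionLogEntry struct {
	Inputs  []string
	Outputs []string
}

// allVisible reports whether every ID is present in the listing.
func allVisible(visible map[string]bool, ids []string) bool {
	for _, id := range ids {
		if !visible[id] {
			return false
		}
	}

	return true
}

// activeIndexBlobs returns the listed blobs minus inputs of every log entry
// whose inputs and outputs are all visible in the listing.
func activeIndexBlobs(listed []string, entries []compactionLogEntry) []string {
	visible := map[string]bool{}
	for _, id := range listed {
		visible[id] = true
	}

	superseded := map[string]bool{}

	for _, e := range entries {
		if !allVisible(visible, e.Inputs) || !allVisible(visible, e.Outputs) {
			continue // listing is incomplete, so the entry is ignored for now
		}

		for _, in := range e.Inputs {
			superseded[in] = true
		}
	}

	var result []string

	for _, id := range listed {
		if !superseded[id] {
			result = append(result, id)
		}
	}

	sort.Strings(result)

	return result
}

func main() {
	log := []compactionLogEntry{{Inputs: []string{"n1", "n2"}, Outputs: []string{"n3"}}}

	// Output n3 is not listed yet: inputs stay in use.
	fmt.Println(activeIndexBlobs([]string{"n1", "n2"}, log)) // [n1 n2]

	// Everything is listed: inputs are superseded by n3.
	fmt.Println(activeIndexBlobs([]string{"n1", "n2", "n3"}, log)) // [n3]
}
```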
Jarek Kowalski
4b4628a21e Repository maintenance support (#411)
Maintenance: support for automatic GC

Moved maintenance algorithms from 'cli' to 'repo/maintenance' package

Added support for CLI commands:

kopia gc - performs quick maintenance
kopia gc --full - performs full maintenance

Full maintenance performs snapshot gc, but it's not safe to do this automatically, possibly in parallel with snapshots being taken. This will be addressed in the ~0.7 timeframe.
2020-04-14 00:11:41 -07:00
Jarek Kowalski
70d4c8764a cli: improvements to content selection for list/rewrite/stats/verify (#409)
They now uniformly support 3 flags:

--prefix=P       selects contents with the specified prefix
--prefixed       selects contents with ANY prefix
--non-prefixed   selects non-prefixed contents

Also changed content manager iteration API to support ranges.

cli: add --prefix to 'blob gc' and 'blob stats'
2020-04-06 18:43:41 -07:00
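The three selection flags from 70d4c8764a could map onto a filter like the Go sketch below. Two things here are assumptions rather than facts from the commit: that a "prefixed" content ID starts with a single letter in the range g..z, and that no flag means "select everything". All names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// selection mirrors the three flags; at most one is expected to be set.
type selection struct {
	prefix      string // --prefix=P
	prefixed    bool   // --prefixed
	nonPrefixed bool   // --non-prefixed
}

// hasPrefix assumes (not confirmed by the commit) that a prefixed content ID
// starts with one letter in g..z, which cannot occur in a hex hash.
func hasPrefix(id string) bool {
	return id != "" && id[0] >= 'g' && id[0] <= 'z'
}

// matches reports whether a content ID is selected.
func (s selection) matches(id string) bool {
	switch {
	case s.prefix != "":
		return strings.HasPrefix(id, s.prefix)
	case s.prefixed:
		return hasPrefix(id)
	case s.nonPrefixed:
		return !hasPrefix(id)
	default:
		return true // assumption: no flag selects everything
	}
}

func main() {
	ids := []string{"k0a1b2c3", "m9f8e7d6", "0a1b2c3d"} // made-up IDs

	for _, s := range []selection{{prefix: "k"}, {prefixed: true}, {nonPrefixed: true}} {
		var picked []string

		for _, id := range ids {
			if s.matches(id) {
				picked = append(picked, id)
			}
		}

		fmt.Println(picked)
	}
}
```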
Jarek Kowalski
0017f59add Makefile: fix travis-setup on Windows 2020-04-03 23:11:29 -07:00
Jarek Kowalski
2055fe3cca Makefile: support Windows Appveyor environment (#407)
* Makefile: support Windows Appveyor environment

* appveyor: added .appveyor.yml

* travis: disable windows environment

* Makefile: do not run vet-time-inject on Windows

* Makefile: removed the use of backtick invocation of subshell which does not work on windows

* Makefile: make integration-tests work on Windows outside of Travis
2020-04-03 23:03:42 -07:00
Jarek Kowalski
60977812f0 Support for gather writes, where blob.Storage.PutBlob gets a list of slices and writes them sequentially (#373)
* performance: added gather.Bytes and gather.WriteBuffer

They are similar to bytes.Buffer but instead of managing a single
byte slice, they maintain a list of slices, and when they run out of
space they allocate a new fixed-size slice from a free list.

This helps keep memory allocations completely under control regardless
of the size of data written.

* switch from byte slices and bytes.Buffer to gather.Bytes.

This is mostly mechanical, the only cases where it's not involve blob
storage providers, where we leverage the fact that we don't need to
ever concatenate the slices into one and instead we can do gather
writes.

* PR feedback
2020-03-24 15:05:52 -07:00
Jarek Kowalski
239d809075 performance: introduced buf.Pool which helps reuse memory buffers (#345)
* performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression
* repo: switched object writer to buf.Pool
* content: switched encryption to use buf.Pool
* object: switched compression to use buf.Pool
* testing: added missing content manager Close()
2020-03-18 20:42:16 -07:00
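Buffer reuse of the kind introduced in 239d809075 can be illustrated with the standard library's sync.Pool. Note that this swaps in sync.Pool for illustration; kopia's buf.Pool is a separate implementation whose API is not shown here, and the `frame` function is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable scratch buffers.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// frame (hypothetical) wraps a payload using a pooled scratch buffer
// instead of allocating a new one on every call.
func frame(payload []byte) []byte {
	b := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(b)

	b.Reset()
	b.WriteString("BEGIN:")
	b.Write(payload)
	b.WriteString(":END")

	// Copy the result out because the buffer goes back to the pool.
	return append([]byte(nil), b.Bytes()...)
}

func main() {
	fmt.Println(string(frame([]byte("one")))) // BEGIN:one:END
	fmt.Println(string(frame([]byte("two")))) // BEGIN:two:END
}
```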
Jarek Kowalski
c9877bf130 performance: refactored content manager to avoid copying data
Previously we would store special field Payload for contents
that were added but never flushed to the store and it was not
encrypted. This required special handling different for pending
vs flushed contents.

This change maintains pending pack buffer ready to be flushed
and appends encrypted contents to it, which avoids data copying.
The buffers are pooled to avoid allocations.
2020-03-17 18:07:10 -07:00
Jarek Kowalski
e80f5536c3 performance: plumbed through output buffer to encryption and hashing,… (#333)
* performance: plumbed through output buffer to encryption and hashing, so that the caller can pre-allocate/reuse it

* testing: fixed how we do comparison of byte slices to account for possible nils, which can be returned from encryption
2020-03-12 08:27:44 -07:00
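The caller-supplied output buffer pattern from e80f5536c3 is sketched below using SHA-256 from the standard library as a stand-in; the `hashInto` name and signature are hypothetical and are not kopia's hashing interface.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashInto (hypothetical) writes the SHA-256 digest of data into the
// caller's buffer, reusing its capacity, and returns the filled slice.
func hashInto(output, data []byte) []byte {
	h := sha256.New()
	h.Write(data)

	// Sum appends to its argument; output[:0] reuses the backing array
	// when it has enough capacity and allocates only otherwise.
	return h.Sum(output[:0])
}

func main() {
	buf := make([]byte, 0, sha256.Size) // allocated once by the caller

	d1 := hashInto(buf, []byte("first"))
	d2 := hashInto(buf, []byte("second")) // overwrites the bytes d1 points at

	fmt.Println(len(d1), len(d2)) // 32 32
	fmt.Println(&d1[0] == &d2[0]) // true: both share the caller's buffer
}
```

Because the result aliases the caller's buffer, a digest that must outlive the next call has to be copied first.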
Jarek Kowalski
514df69afa performance: added wrapper around io.Copy()
This pools copy buffers so they can be reused instead of being thrown
away after each io.Copy()
2020-03-10 21:52:30 -07:00
Jarek Kowalski
ec56ed3d70 Makefile: fixed layering test 2020-03-10 17:43:56 -07:00
Jarek Kowalski
f07d164829 Makefile: added layering-test, fixes #324 2020-03-10 13:18:23 -07:00
Julio López
d9ce3d0ad6 Inject time in Kopia components (#314)
Motivation: Allow time injection for (unit) tests, to more easily test and
verify time-dependent invariants.

Add time injection support for:

* repo.Manager
* manifest.Manager
* snapshot.Uploader

Then, wire up to these components. The content.Manager already had support for
time injection, but was not wired up from the time function passed to repo creation.

Add an internal/faketime package for testing. Mainly code movement from testing
code in the repo/content package. Motivation: make it available to other packages
outside content. Also, add simple tests for faketime functions.
2020-03-10 00:42:10 -07:00
Nick
2c72fbd514 Remove FIO_USE_DOCKER env 2020-03-03 20:36:43 -08:00
Nick
b98236a535 Use fio image from dockerhub
Changing image to ljishen/fio instead of building
an image in kopia.
2020-03-03 20:36:43 -08:00
Nick
c5d8c9a271 Using docker to wrap fio execution to add robustness tool tests to Travis
Update the fio runner to use a docker image if the appropriate environment variable is set. Docker image is built via a makefile target and used in the robustness tool tests.
2020-03-03 20:36:43 -08:00
Jarek Kowalski
d95e6a3d09 sftp: Fixed issues in SFTP provider, Fixes #216
- did not work on windows due to use of filepath which uses backslash
  instead of slash
- added support for embedding SFTP key
- fixed UI controls
- misc fixes for KopiaUI
- added progress reporting
2020-03-01 18:56:06 -08:00
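The first item in d95e6a3d09 reflects a general Go pitfall: path/filepath uses the host OS separator (backslash on Windows), while remote SFTP paths need forward slashes, which the path package always produces. A small sketch with a hypothetical helper name and made-up path elements:

```go
package main

import (
	"fmt"
	"path"
)

// remotePath (hypothetical) joins path elements for a remote server. It uses
// package path, which always uses forward slashes regardless of the local
// operating system, rather than path/filepath.
func remotePath(base string, elems ...string) string {
	return path.Join(append([]string{base}, elems...)...)
}

func main() {
	fmt.Println(remotePath("/srv/kopia", "p", "abc.f")) // /srv/kopia/p/abc.f
}
```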
Jarek Kowalski
fd91186ead travis: used own retry script, since travis_retry does not work in makefile 2020-02-29 10:55:46 -08:00
Jarek Kowalski
007890ebfd travis: used own retry script, since travis_retry does not work in makefile 2020-02-29 10:46:42 -08:00
Jarek Kowalski
6d4d66621e travis: wrap setup and release steps with travis_retry to reduce flakes 2020-02-29 10:33:46 -08:00
Nick
173e18c97d Fix typo in makefile so CI can pass (upload-coverage target)
Fixing a typo in the Makefile target upload-coverage. Without the fix, the target is not found if Travis is running against a PR.
2020-02-28 18:10:10 -08:00
Jarek Kowalski
c55e53041c travis: fixed --skip-publish logic 2020-02-27 21:49:00 -08:00
Jarek Kowalski
7512f581f1 travis: checkout go.mod/go.sum at the end of travis-setup 2020-02-27 21:30:35 -08:00
Jarek Kowalski
2d214d2670 Makefile: refactored travis-release to run goreleaser first, which should avoid dirty git state 2020-02-27 21:04:57 -08:00
Jarek Kowalski
4ca9bee898 Makefile: print git diff before goreleaser 2020-02-26 22:22:19 -08:00
Jarek Kowalski
9b50a6e891 test: increased e2e test timeout
Added linear retry support when waiting for snapshots
2020-02-22 19:27:10 -08:00
Jarek Kowalski
98f1b26f39 kopia-ui: support for auto-update and publishing prerelease binaries from Travis to kopia/kopia-ui-release 2020-02-19 23:08:12 -08:00
Jarek Kowalski
11d6eb1c6c Added unit tests for HTML UI to make it a bit harder to regress.
Covered are:

- connect and create flow
- parameter pages for all providers
- connect using token
2020-02-19 18:22:45 -08:00
Jarek Kowalski
f9db94ca77 travis: disable windows signing by setting CSC_LINK and CSC_KEY_PASSWORD to empty, attempt 2 2020-02-18 06:50:35 -08:00
Jarek Kowalski
795b4f14e6 travis: disable windows signing by setting CSC_LINK and CSC_KEY_PASSWORD to empty 2020-02-17 23:29:45 -08:00
Jarek Kowalski
8208990321 travis: fix --skip-sign logic 2020-02-17 21:01:26 -08:00
Jarek Kowalski
98e877f437 travis: disable signing on windows 2020-02-17 20:19:07 -08:00
Jarek Kowalski
35a7bb6038 travis: deparallelize tasks within build to reduce flakes and improve logging 2020-02-17 20:19:07 -08:00
Jarek Kowalski
7f79c77d73 Makefile: refactored tools to be installable on Windows with only minimal deps (make/curl/unzip)
This allows full kopia and kopia-ui to be built on Windows along with
running lint and integration tests.
2020-02-17 18:45:08 -08:00
Jarek Kowalski
f8720eda62 travis: re-enabled integration tests on Windows after disabling GPG 2020-02-10 19:13:43 -08:00