Commit Graph

39 Commits

Author SHA1 Message Date
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Eng Zer Jun
73e492c9db refactor: move from io/ioutil to io and os package (#1360)
* refactor: move from io/ioutil to io and os package

The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* chore: remove //nolint:gosec for os.ReadFile

At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-06 08:39:10 -07:00
Jarek Kowalski
928150fe6b linter: upgrade to 1.42.1 (#1292) 2021-09-14 19:11:39 -07:00
Jarek Kowalski
9e182f131a linter: upgraded to 1.42.0 (#1246) 2021-08-20 18:26:45 -07:00
Steve Joachim
83b4dee349 serialize client auth (#232) (#1198) 2021-07-15 11:27:13 -07:00
Steve Joachim
d73e0d60ce robustness: reduce memory consumption (#1155)
Detailed changes:
* Implement Kopia API client
* Implement kopia client
* Implement new persister
* Separate storeLoad and Delete concurrency tests
* Update Store interface to return error for Delete
* Return errors from os.RemoveAll
* Robustness test updates
* Push robustness metadata without writing to fs
* Fix testenv.AssertNoError references
* Update tests to use new kopia persister
* Minor updates to multiclient test cases
* Use require.NoError instead of assertNoError
* Add context to store interface
* Update logging to be less verbose
* Use io instead of ioutil package
* Simplify restore by using object ID
* Accommodate repository.NewWriter signature change
* Improve tests to increase code coverage
* Spelling and error string fixes
* Address lint errors

Co-authored-by: Nick <nick@kasten.io>
2021-06-30 13:46:51 -07:00
Steve Joachim
ba5eb74151 robustness: use testing.T for parallel robustness tests (#1051)
Use testing.T for parallel robustness tests
Elide t.Helper() to improve test output
2021-05-18 21:22:32 -07:00
Jarek Kowalski
281a7fcc95 e2e test refactoring (#1058)
* tests: refactored test directory creation into separate package

* mechanical: refactored e2e test output parsing and error handling
2021-05-08 11:15:31 -07:00
Steve Joachim
d2b816934c robustness: add tests for kopia server repos (#1029)
* Add error handling for robustness Checker
* robustness engine updates for concurrency
* add client-server utils for multiclient robustness
* create client package for robustness tests
* create multi-client robustness test framework
* multi-client robustness test definitions
* update robustness test runner scripts
* Address unparam lint errors

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-04-29 21:45:13 -07:00
Jarek Kowalski
9a756c719f Enabled race detector in CI, fixed a few data races (#919)
* content: fixed data race in IterateUnreferencedBlobs

* upload: fixed data race between uploader and estimator

* testing: fixed data race in repo/blob/logging test

* makefile: run tests on CI/linux/amd64 with -race

* robustness: fixed test race

* content: fixed data race getContentDataUnlocked that triggers TestParallelWrites - looks scary but in practice very hard to trigger in real life and does not cause data corruption

* testing: reduce test complexity under race detector

* server: fixed minor race in refreshStatus()

* testing: reduced depth of sharedTestDataDir2

* ci: run race detector in separate job

* ci: run unit test race detector in parallel to integration tests
2021-04-02 18:21:04 -07:00
Nick
b2b921fb82 Add context to robustness engine interfaces (#893)
Add context to all interface methods, exposed through engine actions to the test writer/engine user.

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-03-17 20:02:02 -07:00
Nick
eaf14a5fa5 Path protection between robustness engine FileWriter and Snapshotter (#865)
Protect filesystem subtrees from concurrent manipulation during critical sections
if engine actions are called asynchronously. This change provides coordination
between the `Snapshotter` and the `FileWriter`. For example, the `FileWriter`
should be blocked from perturbing the same directory tree if a
Gather-Snapshot is taking place along that tree simultaneously.
This will ensure the fingerprint data accumulated during the `Gather` phase
will correspond unambiguously to the data included in the snapshot.

Extend build flags to kopia snapshotter
  This package now imports fswalker which can only be built for
  darwin,amd64 or linux,amd64
2021-03-16 15:15:15 -07:00
Nick
9b3cae781f Fix robustness engine i/o limit test flake (#864)
Instead of a flaky timing check, calculate the amount of data that was actually written to the fio file tree.
2021-03-04 15:21:23 -08:00
Nick
7e57984bba Metadata protection for asynchronous robustness transactions (#851)
Add metadata R/W locking for asynchronous accesses to robustness engine metadata.
Remove the index from the Store interface and maintain it only in Checker, where it's used.
2021-03-02 23:48:44 -08:00
Nick
1722cd1db8 Path lock utility for coordination between robustness engine actions (#850)
* Path lock utility for coordination between robustness engine actions

Add a utility to ensure path-based synchronization between goroutines. If a path is locked, a subsequent Lock will block if the requested path is the same, or a child/parent (recursive), of the locked path.

This assists with coordination between asynchronous robustness engine actions that may rely on the underlying data directory remaining unchanged. For example:

- between gathering a filesystem fingerprint and taking a snapshot.
- when one WriteFilesAtDepth command has traversed into a directory that another goroutine has picked for deletion.

* Fix linter
2021-03-02 23:43:20 -08:00
Jarek Kowalski
e694367da8 lint: fixed vet-time-inject and replaced with forbidigo linter (#848)
added faketime.NewClockTimeWithOffset and used that to fix flaky
'TestDeleteUnreferencedBlobs' test.
2021-02-21 07:46:04 -08:00
Jarek Kowalski
4bf42e337d fix long filenames on Windows (#822)
* windows: fixed handling of long filenames
2021-02-12 09:09:42 -08:00
carlbraganza
c23d42f84d Refactored the robustness engine constructor to use externally specified interfaces. (#815) 2021-02-03 11:19:33 -08:00
carlbraganza
6fd4409bb3 Introduced a FileWriter interface in the robustness test and refactored accordingly. (#797) 2021-01-25 19:57:27 -08:00
carlbraganza
8d907c937c All robustness test engine interface definitions moved into the test root directory (#791)
* All robustness test engine interface definitions moved into the test root directory.

* Fixed lint issue.
2021-01-21 18:55:08 -08:00
Jarek Kowalski
1f3b8d4da4 upgrade linter to 1.35 (#786)
* lint: added test that enforces Makefile and GH action linter versions are in sync
* workaround for linter gomnd problem - https://github.com/golangci/golangci-lint/issues/1653
2021-01-16 18:21:16 -08:00
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
Julio López
3795ffc6f9 robustness: minor cleanups (#726)
Remove unnecessary intermediate variables.
Send SIGTERM instead of SIGKILL to terminate child kopia server process.
Set Pdeathsig on Linux for child kopia server process.
Trivial: reduce scope of hostFioDataPathStr variable.
Trivial: rename local variable.
Trivial: Use log.Fatalln instead of log + exit(1).
Improve error message in robustness test to tell apart failure cause.
2020-12-16 12:49:54 -08:00
Julio López
35346863d2 robustness: add support for kopia server storage repo (#720)
Adds single client to server robustness tests.

Co-authored-by: Rahul M Chheda <rchheda@infracloud.io>
2020-12-15 17:56:30 -08:00
Jarek Kowalski
e8714cb2a1 s3: upgraded to minio v7 (#707) 2020-12-02 19:46:05 -08:00
Nick
71dcbcf2e3 Robustness engine actions with stats and log (#685)
* Robustness engine actions with stats and logging

- Add actions to robustness engine
- Actions wrap other functional behavior and serve as a common interface for collecting stats
- Add stats for the engine, both per run and cumulative over time
- Add a log for actions that the engine has executed
- Add recovery logic to re-sync snapshot metadata after a possible failed engine run (e.g. if metadata wasn't properly persisted).

Current built-in actions:
- snapshot root directory
- restore random snapshot ID into a target restore path
- delete a random snapshot ID
- run GC
- write random files to the local data directory
- delete a random subdirectory under the local data directory
- delete files in a directory
- restore a snapshot ID into the local data directory

Actions are executed according to a set of options, which dictate the relative probabilities of picking a given action, along with ranges for action-specific parameters that can be randomized.
2020-11-17 01:07:04 -08:00
Jarek Kowalski
1a8fcb086c Added endurance test which tests kopia over long time scale (#558)
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()

Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT

logfile: squelch annoying log message

testenv: added faketimeserver which serves time over HTTP

testing: added endurance test which tests kopia over long time scale

This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).

The test is not used part of any test suite yet but will run in
post-submit mode only, preferably 24/7.

testing: refactored internal/clock to only support injection when
'testing' build tag is present
2020-08-26 23:03:46 -07:00
Nick
14d50aaa50 [Robustness] Fix for kopia runner and custom working directory (#533)
* [Robustness] Fix for kopia runner and custom work dir

Apply fix similar to #293 for the robustness kopia runner.
Add control for runner working directory.
2020-08-14 17:32:45 -07:00
Nick
7da4022cfd [refactor] Move robustness and engine packages (#528)
Perform minor refactor by moving robustness and engine packages
in preparation for later PRs

Fix import path
2020-08-13 14:56:44 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00
Jarek Kowalski
40acf238f3 Fixed arm and arm64 build. (#506)
* fixed a number of cases where misaligned data was causing panics on armv7 (but not armv8)
* travis: enable arm64
* test: reduce compressed data sizes when running on arm
* arm: wait longer for snapshots
2020-07-30 17:31:28 -07:00
Nick
ce5e6dcd13 [Robustness] Add first robustness tests
Add two tests:
- TestManySmallFiles: writes 100k files size 4k to a directory. Snapshots the data tree, restores and validates data.
- TestModifyWorkload: Loops over a simple randomized workload. Performs a series of random file writes to some random sub-directories, then takes a snapshot of the data tree. All snapshots taken during this test are restore-verified at the end.

A global test engine is instantiated in main_test.go, to be used in the robustness test suite across tests (saves time loading/saving metadata once per run instead of per test).
2020-07-14 22:37:11 -07:00
Nick
82a2fa0ea5 [Robustness] Add test engine to manage snapshot verification testing (#468)
* Add test engine to manage snapshot verification testing

Test engine manages the test and metadata repositories, snapshot
checker, metadata storage persistence, and file writer. It is
the high level helper that will be invoked in the snapshot
verification testing suite.

- modify data directory file structure
- issue snapshot/restore/delete to the data directory
- accumulate metadata over the course of the test suite
- flush accumulated metadata to the metadata repository
- load historical metadata from the repository on initialization
- perform automatic data integrity verification on snap restore

This change corresponds to the robustness execution engine component from the design documentation.
2020-06-27 23:46:37 -07:00
Nick
b30da511e7 Wire up snapshot-store-compare behavior
Connect the snapshotter, the storer, and the comparer. Invoke
the snapshotter to take/restore/delete snapshots on the repo,
the comparer to gather metadata before the snapshot and
after the restore, and the storer to save metadata for later
lookup when verifying restores.
2020-05-31 21:12:31 -07:00
Nick
78a921ab2c Rename Simple mutex 2020-03-14 12:00:22 -07:00
Nick
60d1cc821a [robustness testing] Implement metadata store for test metadata using kopia
Add an implementation of the metadata store using kopia snapshots and restores to manage persistence of walk metadata. Metadata are stored to and retrieved from the store, and a mechanism for persisting the store is implemented using kopia snapshots.
2020-03-14 12:00:22 -07:00
Nick
05852322da Add snapshotter interface and kopia implementation
Snapshotter interface describes an entity that can create,
restore, and delete snapshots, as well as manage a repository.

Add kopia implementation of the snapshotter interface.
2020-03-10 07:32:14 -07:00
Nick
3cb7e37fc1 Add comparer interface and fswalker implementation (#226)
Add comparer interface which gathers data on a path and
compares that data to a new path, returning error if the path
differs in any way from the input data. The details of what
constitutes a difference is left to the implementation.

FSWalker implementation uses Walk and Report to do the data
gathering and comparison. Filters are applied to sort out any
differences that might be expected (e.g. ctime, atime, mtime,
rename of root directory after restore).
2020-02-13 17:07:18 -08:00