Commit Graph

43 Commits

Author SHA1 Message Date
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard the hidden monotonic time component of time.Time
by converting to Unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that the passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that the
epoch manager (and most other components in Kopia) relies upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Jarek Kowalski
7e68d8e4c1 Consolidated format version flags (#1284) 2021-09-08 18:44:03 -07:00
Jarek Kowalski
d6d9a1fb5f Maintenance improvements for epoch-based index structures (#1225)
* testing: KOPIA_TEST_LOG_OUTPUT logs subcommand outputs

* cli: additional flags for 'blob list'

* Makefile: run all tests against epoch-based index manager

* epoch: added support for deletion watermark, which keeps track of latest maintenance which dropped index entries

* content: added deletion watermark to content manager

* maintenance: improved maintenance without safety to force rewrites

* maintenance: skip quick maintenance when epoch manager is enabled

* maintenance: do not enable quick maintenance when epoch manager is used

* testing: skip TestIndexOptimize when running against epoch manager-backed index structures
2021-08-02 21:08:54 -07:00
Jarek Kowalski
e64d5b8eab Fixed a few subtle threading bugs uncovered by stress test and rewrote the test to be model-based (#1157)
* testing: refactored logs directory management

* content: fixed index mutex to be shared across all write sessions

added mutex protection during writecontent/refresh race

* testing: upload log artifacts

* content: bump revision number after index has been added

This fixes a bug where the manifest manager in another session for
the same open repository may not see an added content, because it
will prematurely cache the incomplete set of contents.

This took 2 weeks to find.

* manifest: improved log output, fixed unnecessary mutex release

* testing: rewrote stress test to be model-based and more precise
2021-07-01 21:37:27 -07:00
Jarek Kowalski
d84c884321 Added content manager internal logging (#1116)
* logging: added logger wrappers for Broadcast and Prefix

* nit: moved max hash size to a named constant

* content: added internal logger

* content: replaced context-based logging with explicit Loggers

This will capture the logger.Logger associated with the context when
the repository is opened and will reuse it for all logs instead of
creating new logger for each log message.

The new logger will also write logs to the internal logger in addition
to writing to a log file/console.

* cli: allow decrypting all blobs whose names start with _

* maintenance: added logs cleanup

* cli: commands to view logs

* cli: log selected command on each write session
2021-06-05 08:48:43 -07:00
Jarek Kowalski
fcd507a56d Refactored most of the CLI tests to run in-process as opposed to using sub-processes (#1059)
* cli: fixed remaining testability indirections for output and logging

* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process

* tests: refactored most e2e tests to invoke kopia subcommands in-process

* Makefile: enable code coverage for cli/ and internal/

* testing: pass 'testing' tag to unit tests, which use a much faster (insecure) password hashing scheme

* Makefile: push coverage from PRs again

* tests: disable buffer management to reduce memory usage on ARM

* cli: fixed misaligned atomic field on ARMHF

also temporarily fixed a benign startup-time race condition when setting
the default on the timeZone variable, which is the last global variable.
2021-05-11 22:26:28 -07:00
Jarek Kowalski
281a7fcc95 e2e test refactoring (#1058)
* tests: refactored test directory creation into separate package

* mechanical: refactored e2e test output parsing and error handling
2021-05-08 11:15:31 -07:00
Jarek Kowalski
d2288c443f cli: major refactoring (#1046)
cli: major refactoring of how CLI commands are registered

The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attach the flags or
subcommand to the provided parent.
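
The pattern can be sketched roughly as follows. This uses the standard library `flag` package as a stand-in for whatever flag library the CLI actually uses, and the `commandList` type, its fields and flag names are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
)

// commandList is a hypothetical command; its flags are struct fields, not globals.
type commandList struct {
	showAll    bool
	maxResults int
}

// setup attaches this command's flags to the provided parent flag set.
func (c *commandList) setup(parent *flag.FlagSet) {
	parent.BoolVar(&c.showAll, "all", false, "include hidden items")
	parent.IntVar(&c.maxResults, "max-results", 100, "maximum number of results")
}

// parseList builds a fresh command instance and parses args into it.
func parseList(args []string) (*commandList, error) {
	fs := flag.NewFlagSet("list", flag.ContinueOnError)
	c := &commandList{}
	c.setup(fs)

	if err := fs.Parse(args); err != nil {
		return nil, err
	}

	return c, nil
}

func main() {
	a, _ := parseList([]string{"--all"})
	b, _ := parseList([]string{"--max-results=5"})

	// Each instance holds its own flag values, so tests can run side by side.
	fmt.Println(a.showAll, a.maxResults)
	fmt.Println(b.showAll, b.maxResults)
}
```

Because no state lives in package-level variables, two command instances can be parsed in the same process without interfering with each other.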

This change is 94.3% mechanical, but is fully organic and hand-made.

* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
2021-05-03 10:28:00 -07:00
Jarek Kowalski
7c088338ce testing: upload endurance test logs as artifacts, run more frequently 2021-04-07 14:11:39 -07:00
Jarek Kowalski
d07eb9f300 cli: added --safety=full|none flag to maintenance commands (#912)
* cli: added --safety=full|none flag to maintenance commands

This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.

This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with single `--safety` option.

* 'blob gc'
* 'content rewrite'
* 'snapshot gc'

* pr renames

* maintenance: fixed computation of safe time for --safety=none

* maintenance: improved logging for blob gc

* maintenance: do not rewrite truly short, densely packed packs

* mechanical: pass eventual consistency settle time via CompactOptions

* maintenance: add option to disable eventual consistency time buffers with --safety=none

* maintenance: trigger flush at the end of snapshot gc

* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs

* testing: allow debugging of integration tests inside VSCode

* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
2021-04-02 21:56:01 -07:00
Jarek Kowalski
175ca8bd7a Misc cleanups (#899)
* apiclient: stop logging short-term cookies

* testing: unset KOPIA_PASSWORD in tests, which disrupts subprocesses
2021-03-19 21:57:15 -07:00
Pavan Navarathna
3e76169921 Support for stdin streams (#862)
* Add StreamingFile interface
* unit test for virtualfs
* CLI: Snapshot create support for stdin sources
* Uploader support for fs.StreamingFile
* End to end test for stdin source snapshot
* upload test to improve coverage
2021-03-04 15:34:05 -08:00
Jarek Kowalski
e2b9a81ac3 Major CI/CD refactoring and re-added support for ARM/ARM64 runners (#849)
* ci: refactored CI/CD logic & Makefile

- removed all travis CI emulation environment variables and replaced with:

CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true

- refactored all OS and architecture-specific decisions to use standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel

* testing: fixed tests on ARM and ARM64

- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
2021-02-23 00:52:54 -08:00
Jarek Kowalski
23273af1cd snapshot: reworked error handling and added fail-fast option (#840)
Fixes #690

This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.

Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.

The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.

After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only a partial set of files.

In case of any errors, the manifest (and each directory object) will
list the number of failures and no more than 10 examples of failed
files/directories along with their respective errors.

Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.

With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
2021-02-17 10:29:01 -08:00
Jarek Kowalski
81e0ecf2e1 testing: all logs to t.Logf() when the test fails (#833)
* testing: all logs to t.Logf() when the test fails

* testing: send server stderr to t.Logf()
2021-02-13 16:32:36 -08:00
Jarek Kowalski
4bf42e337d fix long filenames on Windows (#822)
* windows: fixed handling of long filenames
2021-02-12 09:09:42 -08:00
Jarek Kowalski
646c325826 Implemented new streaming GRPC protocol for Kopia Repository Server (#789)
* grpcapi: added GRPC API for the repository server

* repo: added transparent retries to GRPC repository client

Normally GRPC reconnects automatically, which can survive server
restarts (minus transient errors).

In our case we're establishing a stream which will be broken and
needs to be restarted after io.EOF is detected.

It is safe to do transparent retries for read-only sessions (repo.Repository),
but not safe for write sessions (repo.RepositoryWriter), because the
session may re-connect to a different server that won't have the buffered
content write available in memory.
2021-01-28 05:15:12 -08:00
Jarek Kowalski
1f3b8d4da4 upgrade linter to 1.35 (#786)
* lint: added test that enforces Makefile and GH action linter versions are in sync
* workaround for linter gomnd problem - https://github.com/golangci/golangci-lint/issues/1653
2021-01-16 18:21:16 -08:00
Peter Palotas
cd8f3e81b8 Created end-to-end tests verifying .kopiaignore behavior. (#774)
* Created end-to-end tests verifying .kopiaignore behavior.

This is related to #571 and #773, but provided as a separate PR to include tests that did not work before PR #773.

* Commented failing tests.

These tests will be re-enabled when #773 is done.

* Added additional commented tests of .kopiaignore

These will be uncommented in #773.
2021-01-08 07:39:59 -08:00
Jarek Kowalski
f1b471d7e6 Fixes for test flakes (#770)
* testing: prevented spurious test flakes caused by kopia subprocesses messing with stderr

This was not causing actual failures, but misreporting error messages.

* testing: ensure random names are always unique by adding a counter
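
One way to get this guarantee is to combine a process-wide atomic counter with the random suffix. The sketch below is an assumption about the approach, and the `uniqueName` helper and its name format are hypothetical.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync/atomic"
)

// nameCounter increases on every call, so two names can never collide
// within one process even if the random parts happen to match.
var nameCounter int64

// uniqueName returns prefix followed by a counter and a random suffix.
func uniqueName(prefix string) string {
	n := atomic.AddInt64(&nameCounter, 1)
	return fmt.Sprintf("%s-%d-%x", prefix, n, rand.Int63())
}

func main() {
	fmt.Println(uniqueName("testdir"))
	fmt.Println(uniqueName("testdir"))
}
```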
2021-01-05 21:37:23 -08:00
Jarek Kowalski
207009939f cli: only fetch the persisted password from keychain if one was not provided on the command line (#744)
This also fixed a test bug where the test was incorrectly passing
password via environment variable and it was (incorrectly) expected
to be ignored.

Password is determined in the following order:

- flag/environment variable (highest priority)
- persistent storage
- asking user (lowest priority)
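
The priority order above can be expressed as a short fallback chain. This is a sketch under assumptions: `resolvePassword` and its parameters are hypothetical, and an empty string is taken to mean "not provided".

```go
package main

import (
	"errors"
	"fmt"
)

// resolvePassword picks the password source by priority:
// flag/environment variable first, then persistent storage, then the user.
func resolvePassword(fromFlagOrEnv, persisted string, askUser func() (string, error)) (string, error) {
	if fromFlagOrEnv != "" {
		return fromFlagOrEnv, nil
	}

	if persisted != "" {
		return persisted, nil
	}

	if askUser == nil {
		return "", errors.New("no password available")
	}

	return askUser()
}

func main() {
	prompt := func() (string, error) { return "typed-by-user", nil }

	p, _ := resolvePassword("from-flag", "from-keychain", prompt)
	fmt.Println(p)

	p, _ = resolvePassword("", "from-keychain", prompt)
	fmt.Println(p)

	p, _ = resolvePassword("", "", prompt)
	fmt.Println(p)
}
```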
2020-12-24 22:39:02 -08:00
Jarek Kowalski
ae38fa3917 Speed up integration tests (#653)
* testing: don't use expensive scrypt-65536-8-1 in integration tests

* testing: use platform-specific encryption and hashing for arm and arm64 to speed up tests

* testing: manually manage log directory to be able to analyze integration test failures

* testing: snapshot_gc_test was too quick

* Makefile: renamed target building integration test binary
2020-09-30 22:01:16 -07:00
Jarek Kowalski
0758a92c58 restore: improved user experience (#644)
* restore: improved user experience

* 'snapshot restore' is now the same as 'restore' and both will
  support restoring by manifest ID, root ID or root ID + subdirectory

* added support for restoring individual files

* implemented PR feedback and refactored object ID parsing

Moving helpers inside the snapshot/ package helped clean up the code
a lot.
2020-09-28 22:57:24 -07:00
Jarek Kowalski
c9c8d27c8d Repro and fix for zero-sized snapshot bug (#641)
* server: repro for zero-sized snapshot bug

As described in https://kopia.discourse.group/t/kopia-0-7-0-not-backing-up-any-files-repro-needed/136/5

* server: fixed zero-sized snapshots after repository is connected via API

The root cause was that source manager was inheriting HTTP call context
which was immediately closed after the 'connect' RPC returned thus
silently killing all uploads.
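
The failure mode can be reproduced in a few lines. This sketch only illustrates the context lifetime issue and is not the actual fix; `handleConnect` and its `detach` parameter are hypothetical, and the deferred `cancel` simulates the server closing the request context when the handler returns.

```go
package main

import (
	"context"
	"fmt"
)

// handleConnect simulates an RPC handler that hands a context to
// long-running background work and then returns.
func handleConnect(detach bool) context.Context {
	reqCtx, cancel := context.WithCancel(context.Background())
	defer cancel() // the request context ends as soon as the handler returns

	workCtx := reqCtx
	if detach {
		// Background work gets a context that is not tied to the request.
		workCtx = context.Background()
	}

	return workCtx
}

func main() {
	fmt.Println("inherited context error:", handleConnect(false).Err())
	fmt.Println("detached context error: ", handleConnect(true).Err())
}
```

Any upload that checks the inherited context sees it as canceled right after the handler returns, which matches the silent failure described above.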
2020-09-23 20:15:36 -07:00
Julio López
ae6a960080 Prefer t.TempDir() over makeScratchDir(t) (#612)
Prefer t.TempDir() over makeScratchDir(t)
Remove unused randomString
Leverage T.TempDir() in CLITest env
2020-09-22 22:16:39 -07:00
Jarek Kowalski
fce9497375 restore: support for symlinks (experimental) (#621) 2020-09-18 10:29:20 -07:00
Jarek Kowalski
f2cf71d914 logging: revamped logs from content manager to be machine parseable (#617)
* logging: revamped logs from content manager to be machine parseable

Logs from the content manager (except reads) are sent to separate log
file that is always free from personally-identifiable information
(e.g. no file names, just content IDs and blob IDs).

Also moved CLI logs to a subdirectory (cli-logs) and put content logs
in a parallel directory (content-logs)

Also, the log file name will now include the type of the command that
was invoked:

   kopia-20200913-134157-16110-snapshot-create.log

Fixes #588

* tests: moved all logs from tests to a separate directory
2020-09-16 20:04:26 -07:00
Jarek Kowalski
6a14ac8a2a cli: ensure advanced commands are not accidentally used (#611)
* cli: ensure advanced commands are not accidentally used

This prints an error when a dangerous command is used without
first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable.
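
Such a guard might look like the following sketch. Only the variable name and the `enabled` value come from the commit message; `checkAdvancedCommand` and the error text are hypothetical, and the lookup function is injected so the check can be tested without touching the real environment.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

const advancedCommandsEnvVar = "KOPIA_ADVANCED_COMMANDS"

// checkAdvancedCommand returns an error unless the environment variable
// is set to "enabled". getenv is injected to keep the check testable.
func checkAdvancedCommand(getenv func(string) string) error {
	if getenv(advancedCommandsEnvVar) != "enabled" {
		return errors.New("dangerous command: set " + advancedCommandsEnvVar + "=enabled to use it")
	}

	return nil
}

func main() {
	if err := checkAdvancedCommand(os.Getenv); err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Println("running advanced command")
}
```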

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 20:31:25 -07:00
Jarek Kowalski
1a8fcb086c Added endurance test which tests kopia over long time scale (#558)
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()

Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT

logfile: squelch annoying log message

testenv: added faketimeserver which serves time over HTTP

testing: added endurance test which tests kopia over long time scale

This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).

The test is not used as part of any test suite yet but will run in
post-submit mode only, preferably 24/7.

testing: refactored internal/clock to only support injection when
'testing' build tag is present
2020-08-26 23:03:46 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00
Jarek Kowalski
514df69afa performance: added wrapper around io.Copy()
this pools copy buffers so they can be reused instead of being thrown away
after each io.Copy()
2020-03-10 21:52:30 -07:00
Jarek Kowalski
fb181257bf cli: implemented update check, fixes #119 2020-03-04 22:06:05 -08:00
Nick
8e0167027d Use bytes buffer to capture stderr instead of reading pipe
Give a `*bytes.Buffer` to the command and let `package exec` read from the pipe into the buffer for us.

The current use of the `StderrPipe()` method was problematic; the documentation for os/exec states:

> Wait will close the pipe after seeing the command exit, so most callers need not close the pipe themselves. It is thus incorrect to call Wait before all reads from the pipe have completed. For the same reason, it is incorrect to use Run when using StderrPipe.
2020-03-02 17:04:46 -08:00
Jarek Kowalski
faa7225c23 testing: better fix for ignoring ErrClosed 2020-03-01 10:53:52 -08:00
Jarek Kowalski
5e1b03dcba test: ignore ErrClose when reading from stderr to fix test flake 2020-02-29 20:07:45 -08:00
Nick
835d0b259c Fix error clobbering in kopia cli runner
Error variable was being shared between c.Output and
ioutil.ReadAll. The two could race to set err, and if
Output() finishes first with an error, ReadAll could overwrite
it with nil. Fix is to delegate an independent error for
the pipe read, and fail the test if it is non-nil.
2020-02-29 13:58:10 -08:00
Julio López
a9670762fd Generate shorter file names, and thus paths, in E2E tests (#229)
- Reduces name lengths by ~ 1/2
- Generate file names up to maxNameLength

Co-Authored-By: Nick <nick@kasten.io>
2020-02-14 22:25:50 -08:00
Jarek Kowalski
75929f65e9 travis: add integration tests and install-noui on Windows 2020-02-09 22:46:19 -08:00
Nick
383c042bf5 Adding low-level FSWalker walker/reporter functionality
Adds a wrapper around `Walk` that takes a Policy (protobuf definition) and performs a walk using it as configuration. The resulting Walk struct pointer is returned. The only exported functionality is unfortunately to read the Policy as a protobuf text file, so the implementation creates a temporary policy file whose lifetime is the duration of the call.

Adds a wrapper around the FSWalker reporter `Compare` functionality. Takes a config file and two Walk pointers and compares the walks, returning the pb-defined Report struct. Again, the only exported functionality for reading config information is to read it as a protobuf text file. Creates a temporary config file, whose lifetime is the duration of the call, to pass in to the fswalker function.
2020-02-07 12:09:39 -08:00
Jarek Kowalski
9680dc376b cli: improvements for 'kopia server' and client
Those will make it possible to securely host 'kopia server' embedded
in a desktop app that runs in the background and can access UI.

- added support for using and generating TLS certificates
- added /api/v1/shutdown API to remotely trigger server shutdown
- added support for automatically shutting down server if no requests
  arrive within a certain amount of time
- added support for generating and printing random password to STDERR

TLS supports 3 modes:

1. serve TLS using externally-provided cert/key PEM files
2. generate & write PEM files, then serve TLS using them
3. generate and use ephemeral cert/key (prints SHA256 fingerprint)
2020-01-24 17:25:45 -08:00
Nick
7367f85afe Snapshot failure test and fix kopia panic for non-executable directories
Adding test to probe snapshot failure. Tests snapshot of non-existent source directory, and then iterates through file permissions for files, directories, and issues snapshot to the source. Permission combinations are applied to a parent directory of the source, source itself, contents of source root when the source is a directory, and contents of a subdirectory in that source root.

Fixing kopia executable panic when a directory's contents can't be read. Previously the only Lstat error responded to was not-exist, letting other errors fall through and passing a nil `os.FileInfo` to the following function call, resulting in panic.
2020-01-20 21:29:55 -08:00
Jarek Kowalski
2bc8383e47 testing: customized e2e test directory tree shapes
based on PR feedback, instead of 3 uniform directories,
have 3 different shapes.
2020-01-16 19:40:34 -08:00
Jarek Kowalski
7d39be976f testing: split end_to_end_test into separate files.
refactored test helpers to separate package
made all tests parallel and improved the code structure
2020-01-16 19:40:34 -08:00