The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
The fix is to discard the hidden monotonic time component of time.Time
by converting to Unix time and back.
Reviewed usage of clock.Now() and replaced it with timetrack.StartTimer()
when measuring elapsed time.
The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. That wall-clock behavior is what the
epoch manager (and most other components in Kopia) rely upon.
Fixes #1402
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
* testing: KOPIA_TEST_LOG_OUTPUT logs subcommand outputs
* cli: additional flags for 'blob list'
* Makefile: run all tests against epoch-based index manager
* epoch: added support for deletion watermark, which keeps track of latest maintenance which dropped index entries
* content: added deletion watermark to content manager
* maintenance: improved maintenance without safety to force rewrites
* maintenance: skip quick maintenance when epoch manager is enabled
* maintenance: do not enable quick maintenance when epoch manager is used
* testing: skip TestIndexOptimize when running against epoch manager-backed index structures
* testing: refactored logs directory management
* content: fixed index mutex to be shared across all write sessions
added mutex protection during writecontent/refresh race
* testing: upload log artifacts
* content: bump revision number after index has been added
This fixes a bug where manifest manager in another session for
the same open repository may not see a content added, because they
will prematurely cache the incomplete set of contents.
This took 2 weeks to find.
* manifest: improved log output, fixed unnecessary mutex release
* testing: rewrote stress test to be model-based and more precise
* logging: added logger wrappers for Broadcast and Prefix
* nit: moved max hash size to a named constant
* content: added internal logger
* content: replaced context-based logging with explicit Loggers
This will capture the logger.Logger associated with the context when
the repository is opened and will reuse it for all logs instead of
creating new logger for each log message.
The new logger will also write logs to the internal logger in addition
to writing to a log file/console.
* cli: allow decrypting all blobs whose names start with _
* maintenance: added logs cleanup
* cli: commands to view logs
* cli: log selected command on each write session
* cli: fixed remaining testability indirections for output and logging
* cli: added cli.RunSubcommand() which is used in testing to execute a subcommand in the same process
* tests: refactored most e2e tests to invoke kopia subcommands in-process
* Makefile: enable code coverage for cli/ and internal/
* testing: pass 'testing' tag to unit tests, which then use a much faster (insecure) password hashing scheme
* Makefile: push coverage from PRs again
* tests: disable buffer management to reduce memory usage on ARM
* cli: fixed misaligned atomic field on ARMHF
also temporarily fixed startup-time benign race condition when setting
the default on the timeZone variable, which is the last global variable.
cli: major refactoring of how CLI commands are registered
The goal is to eliminate flags as global variables to allow for better
testing. Each command and subcommand and most sets of flags are now
their own struct with 'setup()' methods that attach the flags or
subcommand to the provided parent.
This change is 94.3% mechanical, but is fully organic and hand-made.
* introduced cli.appServices interface which provides the environment in which commands run
* remove auto-maintenance global flag
* removed globals in memory_tracking.go
* removed globals from cli_progress.go
* removed globals from the update_check.go
* moved configPath into TheApp
* removed remaining globals from config.go
* refactored logfile to get rid of global variables
* removed 'app' global variable
* linter fixes
* fixed password_*.go build
* fixed BSD build
* cli: added --safety=full|none flag to maintenance commands
This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.
This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with a single `--safety` option.
* 'blob gc'
* 'content rewrite'
* 'snapshot gc'
* pr renames
* maintenance: fixed computation of safe time for --safety=none
* maintenance: improved logging for blob gc
* maintenance: do not rewrite truly short, densely packed packs
* mechanical: pass eventual consistency settle time via CompactOptions
* maintenance: add option to disable eventual consistency time buffers with --safety=none
* maintenance: trigger flush at the end of snapshot gc
* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs
* testing: allow debugging of integration tests inside VSCode
* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
* Add StreamingFile interface
* unit test for virtualfs
* CLI: Snapshot create support for stdin sources
* Uploader support for fs.StreamingFile
* End to end test for stdin source snapshot
* upload test to improve coverage
* ci: refactored CI/CD logic & Makefile
- removed all travis CI emulation environment variables and replaced with:
CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true
- refactored all OS and architecture-specific decisions to use standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel
* testing: fixed tests on ARM and ARM64
- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
Fixes #690
This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.
Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.
The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.
After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only a partial set of files.
In case of any errors, the manifest (and each directory object) will
list the number of failures and no more than 10 examples of failed
files/directories along with their respective errors.
Once the snapshot is complete we will return a non-zero exit code to the
operating system if there were any fatal errors during snapshotting.
With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
* grpcapi: added GRPC API for the repository server
* repo: added transparent retries to GRPC repository client
Normally GRPC reconnects automatically, which can survive server
restarts (minus transient errors).
In our case we're establishing a stream which will be broken and
needs to be restarted after io.EOF is detected.
It is safe to do transparent retries for read-only sessions (repo.Repository),
but not for write sessions (repo.RepositoryWriter), because the
session may re-connect to a different server that won't have the buffered
content write available in memory.
* Created end-to-end tests verifying .kopiaignore behavior.
This is related to #571 and #773, but provided as a separate PR to include tests that did not work before PR #773.
* Commented failing tests.
These tests will be re-enabled when #773 is done.
* Added additional commented tests of .kopiaignore
These will be uncommented in #773.
* testing: prevented spurious test flakes caused by kopia subprocesses messing with stderr
This was not causing actual failures, but misreporting error messages.
* testing: ensure random names are always unique by adding a counter
This also fixed a test bug where the test was incorrectly passing
password via environment variable and it was (incorrectly) expected
to be ignored.
Password is determined in the following order:
- flag/environment variable (highest priority)
- persistent storage
- asking user (lowest priority)
* testing: don't use expensive scrypt-65536-8-1 in integration tests
* testing: use platform-specific encryption and hashing for arm and arm64 to speed up tests
* testing: manually manage log directory to be able to analyze integration test failures
* testing: snapshot_gc_test was too quick
* Makefile: renamed target building integration test binary
* restore: improved user experience
* 'snapshot restore' is now the same as 'restore' and both will
support restoring by manifest ID, root ID or root ID + subdirectory
* added support for restoring individual files
* implemented PR feedback and refactored object ID parsing
Moving helpers inside the snapshot/ package helped clean up the code
a lot.
* server: repro for zero-sized snapshot bug
As described in https://kopia.discourse.group/t/kopia-0-7-0-not-backing-up-any-files-repro-needed/136/5
* server: fixed zero-sized snapshots after repository is connected via API
The root cause was that the source manager was inheriting the HTTP call
context, which was immediately closed after the 'connect' RPC returned,
thus silently killing all uploads.
* logging: revamped logs from content manager to be machine parseable
Logs from the content manager (except reads) are sent to a separate log
file that is always free from personally-identifiable information
(e.g. no file names, just content IDs and blob IDs).
Also moved CLI logs to a subdirectory (cli-logs) and put content logs
in a parallel directory (content-logs)
Also, the log file name will now include the type of the command that
was invoked:
kopia-20200913-134157-16110-snapshot-create.log
Fixes #588
* tests: moved all logs from tests to a separate directory
* cli: ensure advanced commands are not accidentally used
This prints an error when a dangerous command is used without
first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable.
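A guard of this kind might be sketched as below. Only the variable name and the value `enabled` come from the description; the function name, the injected `getenv` parameter (used to keep the check testable), and the error text are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// advancedCommandsEnv is the environment variable named in the description above.
const advancedCommandsEnv = "KOPIA_ADVANCED_COMMANDS"

// checkAdvancedCommand (hypothetical) returns an error unless the caller has
// explicitly opted in. getenv is injected so tests need not modify the
// process environment.
func checkAdvancedCommand(getenv func(string) string) error {
	if getenv(advancedCommandsEnv) != "enabled" {
		return errors.New("this command is dangerous; set " +
			advancedCommandsEnv + "=enabled to use it")
	}
	return nil
}

func main() {
	if err := checkAdvancedCommand(os.Getenv); err != nil {
		fmt.Fprintln(os.Stderr, "error:", err)
		return
	}
	fmt.Println("running advanced command")
}
```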
Co-authored-by: Julio López <julio+gh@kasten.io>
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()
Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT
logfile: squelch annoying log message
testenv: added faketimeserver which serves time over HTTP
testing: added endurance test which tests kopia over long time scale
This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).
The test is not yet part of any test suite but will run in
post-submit mode only, preferably 24/7.
testing: refactored internal/clock to only support injection when
'testing' build tag is present
Give a `*bytes.Buffer` to the command and let `package exec` read from the pipe into the buffer for us.
The current use of the `StderrPipe()` method was problematic; the documentation for os/exec states:
> Wait will close the pipe after seeing the command exit, so most callers need not close the pipe themselves. It is thus incorrect to call Wait before all reads from the pipe have completed. For the same reason, it is incorrect to use Run when using StderrPipe.
The error variable was being shared between c.Output and
ioutil.ReadAll. The two could race to set err, and if
Output() finishes first with an error, ReadAll could overwrite
it with nil. The fix is to use an independent error variable for
the pipe read, and fail the test if it is non-nil.
Adds a wrapper around `Walk` that takes a Policy (protobuf definition) and performs a walk using it as configuration. The resulting Walk struct pointer is returned. The only exported functionality is unfortunately to read the Policy as a protobuf text file, so the implementation creates a temporary policy file whose lifetime is the duration of the call.
Adds a wrapper around the FSWalker reporter `Compare` functionality. Takes a config file and two Walk pointers and compares the walks, returning the pb-defined Report struct. Again, the only exported functionality for reading config information is to read it as a protobuf text file. Creates a temporary config file, whose lifetime is the duration of the call, to pass in to the fswalker function.
These changes make it possible to securely host 'kopia server' embedded
in a desktop app that runs in the background and can access the UI.
- added support for using and generating TLS certificates
- added /api/v1/shutdown API to remotely trigger server shutdown
- added support for automatically shutting down the server if no requests
arrive within a certain amount of time
- added support for generating and printing random password to STDERR
TLS supports 3 modes:
1. serve TLS using externally-provided cert/key PEM files
2. generate & write PEM files, then serve TLS using them
3. generate and use ephemeral cert/key (prints SHA256 fingerprint)
Adding a test to probe snapshot failures. It snapshots a non-existent source directory, then iterates through permission settings for files and directories and issues a snapshot of the source for each. Permission combinations are applied to a parent directory of the source, the source itself, the contents of the source root when the source is a directory, and the contents of a subdirectory in that source root.
Fixing kopia executable panic when a directory's contents can't be read. Previously the only Lstat error responded to was not-exist, letting other errors fall through and passing a nil `os.FileInfo` to the following function call, resulting in panic.