Commit Graph

212 Commits

Author SHA1 Message Date
Jarek Kowalski
7c088338ce testing: upload endurance test logs as artifacts, run more frequently 2021-04-07 14:11:39 -07:00
Jarek Kowalski
67ae65eb56 testing: fixed TestFullMaintenance flake (#944) 2021-04-05 21:08:01 -07:00
Jarek Kowalski
49c1d08ccb cli: output usage to stdout but errors to stderr (#941)
* cli: output usage to stdout but errors to stderr

* fixed test flake
2021-04-04 12:05:27 -07:00
Jarek Kowalski
79adef0f33 ci: run endurance test 2021-04-03 19:10:36 -07:00
Jarek Kowalski
d07eb9f300 cli: added --safety=full|none flag to maintenance commands (#912)
* cli: added --safety=full|none flag to maintenance commands

This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.

This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with single `--safety` option.

* 'blob gc'
* 'content rewrite'
* 'snapshot gc'

* pr renames

* maintenance: fixed computation of safe time for --safety=none

* maintenance: improved logging for blob gc

* maintenance: do not rewrite truly short, densely packed packs

* mechanical: pass eventual consistency settle time via CompactOptions

* maintenance: add option to disable eventual consistency time buffers with --safety=none

* maintenance: trigger flush at the end of snapshot gc

* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs

* testing: allow debugging of integration tests inside VSCode

* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
2021-04-02 21:56:01 -07:00
Jarek Kowalski
9a128ffb9f filesystem: support ~ in repository path, require absolute paths (#922)
Fixes #918
2021-04-02 21:55:24 -07:00
Jarek Kowalski
8beb265c27 nit: output snapshot ID when --json is used (#921) 2021-04-02 19:58:17 -07:00
Jarek Kowalski
9a756c719f Enabled race detector in CI, fixed a few data races (#919)
* content: fixed data race in IterateUnreferencedBlobs

* upload: fixed data race between uploader and estimator

* testing: fixed data race in repo/blob/logging test

* makefile: run tests on CI/linux/amd64 with -race

* robustness: fixed test race

* content: fixed data race getContentDataUnlocked that triggers TestParallelWrites - looks scary but in practice very hard to trigger in real life and does not cause data corruption

* testing: reduce test complexity under race detector

* server: fixed minor race in refreshStatus()

* testing: reduced depth of sharedTestDataDir2

* ci: run race detector in separate job

* ci: run unit test race detector in parallel to integration tests
2021-04-02 18:21:04 -07:00
Jarek Kowalski
74833cefcb cli: added standard --json flags to several commands (#910)
* cli: added standard --json flags to several commands

Fixes #272

* Update flag description

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-03-25 17:55:18 -07:00
Jarek Kowalski
175ca8bd7a Misc cleanups (#899)
* apiclient: stop logging short-term cookies

* testing: unset KOPIA_PASSWORD in tests, which disrupts subprocesses
2021-03-19 21:57:15 -07:00
Jarek Kowalski
cbcd59f18e Added repository user authorization support + server flag refactoring + refresh (#890)
* nit: replaced hardcoded string constants with named constants

* acl: added management of ACL entries

* auth: implemented DefaultAuthorizer which uses ACLs if any entries are found in the system and falls back to LegacyAuthorizer if not

* cli: switch to DefaultAuthorizer when starting server

* cli: added ACL management

* server: refactored authenticator + added refresh

Authenticator is now an interface which also supports Refresh.

* authz: refactored authorizer to be an interface + added Refresh()

* server: refresh authentication and authorizer

* e2e tests for ACLs

* server: handling of SIGHUP to refresh authn/authz caches

* server: reorganized flags to specify auth options:

- removed '--allow-repository-users' - it's always on
- one of --without-password, --server-password or --random-password
  can be specified to specify password for the UI user
- htpasswd-file - can be specified to provide password for UI or remote
  users

* cli: moved 'kopia user' to 'kopia server user'

* server: allow all UI actions if no authenticator is set

* acl: removed priority until we have a better understood use case for it

* acl: added validation of allowed labels when adding ACL entries

* site: added docs for ACLs
2021-03-18 23:03:27 -07:00
Nick
b2b921fb82 Add context to robustness engine interfaces (#893)
Add context to all interface methods, exposed through engine actions to the test writer/engine user.

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-03-17 20:02:02 -07:00
Nick
eaf14a5fa5 Path protection between robustness engine FileWriter and Snapshotter (#865)
Protect filesystem subtrees from concurrent manipulation during critical sections
if engine actions are called asynchronously. This change provides coordination
between the `Snapshotter` and the `FileWriter`. For example, the `FileWriter`
should be blocked from perturbing the same directory tree if a
Gather-Snapshot is taking place along that tree simultaneously.
This will ensure the fingerprint data accumulated during the `Gather` phase
will correspond unambiguously to the data included in the snapshot.

Extend build flags to kopia snapshotter
  This package now imports fswalker which can only be built for
  darwin,amd64 or linux,amd64
2021-03-16 15:15:15 -07:00
Jarek Kowalski
689ed0a851 server: refactored authentication and authorization (#871)
This formalizes the concept of a 'UI user' which is a local
user that can call APIs the same way that UI does it.

The server will now allow access to:

- UI user (identified using `--server-username` with password specified
  using `--server-password` or `--random-password`)
- remote users with usernames/passwords specified in `--htpasswd-file`
- remote users defined in the repository using `kopia users add`
  when `--allow-repository-users` is passed.

The UI user only has access to methods specifically designated as such
(normally APIs used by the UI + few special ones such as 'shutdown').

Remote users (identified via `user@host`) don't get access to UI APIs.

There are some APIs that can be accessed by any authenticated
caller (UI or remote):

- /api/v1/flush
- /api/v1/repo/status
- /api/v1/repo/sync
- /api/v1/repo/parameters

To make this easier to understand in code, refactored server handlers
to require specifying what kind of authorization is required
at registration time.
2021-03-08 22:25:22 -08:00
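The registration-time authorization described in 689ed0a851 can be illustrated with a minimal Go sketch. This is not Kopia's server code: `requiredAuth`, `caller`, `router` and the `/api/v1/shutdown` path are hypothetical names chosen for illustration, and only the four "any authenticated caller" paths come from the commit message.

```go
package main

import "fmt"

// requiredAuth is a hypothetical enum naming who may call a handler.
type requiredAuth int

const (
	anyAuthenticatedUser requiredAuth = iota // UI user or remote user
	uiUserOnly                               // only the local UI user
)

// caller is a hypothetical description of the authenticated principal.
type caller struct {
	authenticated bool
	isUIUser      bool
}

type router struct {
	routes map[string]requiredAuth
}

// handle records the authorization requirement when the route is registered.
func (r *router) handle(path string, auth requiredAuth) {
	r.routes[path] = auth
}

// allowed reports whether the caller may invoke the handler at path.
func (r *router) allowed(path string, c caller) bool {
	auth, ok := r.routes[path]
	if !ok || !c.authenticated {
		return false
	}
	return auth == anyAuthenticatedUser || c.isUIUser
}

func newRouter() *router {
	r := &router{routes: map[string]requiredAuth{}}
	r.handle("/api/v1/flush", anyAuthenticatedUser)
	r.handle("/api/v1/repo/status", anyAuthenticatedUser)
	r.handle("/api/v1/repo/sync", anyAuthenticatedUser)
	r.handle("/api/v1/repo/parameters", anyAuthenticatedUser)
	r.handle("/api/v1/shutdown", uiUserOnly) // hypothetical path for a UI-only API
	return r
}

func main() {
	r := newRouter()
	remote := caller{authenticated: true}
	fmt.Println(r.allowed("/api/v1/flush", remote))    // true
	fmt.Println(r.allowed("/api/v1/shutdown", remote)) // false
}
```

Declaring the requirement next to the route keeps the access rule visible where the handler is wired up, which appears to be the readability goal the commit describes.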
Pavan Navarathna
3e76169921 Support for stdin streams (#862)
* Add StreamingFile interface
* unit test for virtualfs
* CLI: Snapshot create support for stdin sources
* Uploader support for fs.StreamingFile
* End to end test for stdin source snapshot
* upload test to improve coverage
2021-03-04 15:34:05 -08:00
Nick
9b3cae781f Fix robustness engine i/o limit test flake (#864)
Instead of a flaky timing check, calculate the amount of data that was actually written to the fio file tree.
2021-03-04 15:21:23 -08:00
Nick
7e57984bba Metadata protection for asynchronous robustness transactions (#851)
Add metadata R/W locking for asynchronous accesses to robustness engine metadata.
Remove the index from the Store interface and maintain it only in Checker, where it's used.
2021-03-02 23:48:44 -08:00
Nick
1722cd1db8 Path lock utility for coordination between robustness engine actions (#850)
* Path lock utility for coordination between robustness engine actions

Add a utility to ensure path-based synchronization between goroutines. If a path is locked, a subsequent Lock will block if the requested path is the same, or a child/parent (recursive), of the locked path.

This assists with coordination between asynchronous robustness engine actions that may rely on the underlying data directory remaining unchanged. For example:

- between gathering a filesystem fingerprint and taking a snapshot.
- when one WriteFilesAtDepth command has traversed into a directory that another goroutine has picked for deletion.

* Fix linter
2021-03-02 23:43:20 -08:00
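The blocking rule described in 1722cd1db8 (same path, parent, or child) can be sketched in Go as follows. This is a simplified illustration, not the utility from the commit: `pathLocker`, `overlaps` and the use of slash-separated paths are assumptions.

```go
package main

import (
	"fmt"
	"path"
	"strings"
	"sync"
)

// pathLocker is a hypothetical type: Lock(p) blocks while any held path is
// equal to p, an ancestor of p, or a descendant of p.
type pathLocker struct {
	mu   sync.Mutex
	cond *sync.Cond
	held map[string]struct{}
}

func newPathLocker() *pathLocker {
	l := &pathLocker{held: map[string]struct{}{}}
	l.cond = sync.NewCond(&l.mu)
	return l
}

// overlaps reports whether a and b are the same path or one contains the other.
func overlaps(a, b string) bool {
	// Normalize and end each path with exactly one "/" so that "/data/a"
	// is not treated as a parent of "/data/ab".
	a = strings.TrimSuffix(path.Clean(a), "/") + "/"
	b = strings.TrimSuffix(path.Clean(b), "/") + "/"
	return strings.HasPrefix(a, b) || strings.HasPrefix(b, a)
}

func (l *pathLocker) conflicts(p string) bool {
	for h := range l.held {
		if overlaps(h, p) {
			return true
		}
	}
	return false
}

// Lock waits until no overlapping path is held, then records p as held.
func (l *pathLocker) Lock(p string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	for l.conflicts(p) {
		l.cond.Wait()
	}
	l.held[path.Clean(p)] = struct{}{}
}

// Unlock releases p and wakes all waiters so they can re-check for conflicts.
func (l *pathLocker) Unlock(p string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.held, path.Clean(p))
	l.cond.Broadcast()
}

func main() {
	l := newPathLocker()
	l.Lock("/data/a")
	done := make(chan struct{})
	go func() {
		l.Lock("/data/a/b") // blocks until the parent path is unlocked
		fmt.Println("child locked")
		l.Unlock("/data/a/b")
		close(done)
	}()
	l.Unlock("/data/a")
	<-done
}
```

A single condition variable keeps the sketch short; every unlock wakes all waiters, which is simple but may be wasteful when many goroutines contend.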
Jarek Kowalski
4e705726fe Implemented caching for server connections (#845)
* cache: refactored reusable portion of cache into separate package

* repo: plumbed through caching for remote repository clients

* repo: plumb through cache in the unit tests

* cache: ensure we only allow absolute cache paths, fixed cache path resolution for remote repositories
2021-03-01 06:15:39 -08:00
Jarek Kowalski
e2b9a81ac3 Major CI/CD refactoring and re-added support for ARM/ARM64 runners (#849)
* ci: refactored CI/CD logic & Makefile

- removed all travis CI emulation environment variables and replaced with:

CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true

- refactored all OS and architecture-specific decisions to use standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel

* testing: fixed tests on ARM and ARM64

- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
2021-02-23 00:52:54 -08:00
Jarek Kowalski
e694367da8 lint: fixed vet-time-inject and replaced with forbidigo linter (#848)
added faketime.NewClockTimeWithOffset and used that to fix flaky
'TestDeleteUnreferencedBlobs' test.
2021-02-21 07:46:04 -08:00
Jarek Kowalski
23273af1cd snapshot: reworked error handling and added fail-fast option (#840)
Fixes #690

This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.

Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.

The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.

After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only a partial set of files.

In case of any errors, the manifest (and each directory object) will
list the number of failures and no more than 10 examples of failed
files/directories along with their respective errors.

Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.

With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
2021-02-17 10:29:01 -08:00
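The error-handling behavior described in 23273af1cd can be approximated with the Go sketch below. It is an illustration under assumptions, not the uploader's code: `summary`, `snapshotEntries` and `alwaysFail` are hypothetical names, and counting ignored (non-fatal) errors among the failures is a guess rather than something the commit states.

```go
package main

import (
	"errors"
	"fmt"
)

// maxFailureExamples mirrors the "no more than 10 examples" limit in the commit message.
const maxFailureExamples = 10

// summary is a hypothetical stand-in for the failure data kept in a manifest.
type summary struct {
	Failed   int      // total number of entries that failed
	Fatal    int      // failures that should produce a non-zero exit code
	Examples []string // at most maxFailureExamples entries
	Partial  bool     // set when fail-fast stopped the snapshot early
}

// snapshotEntries reads every entry, collecting errors instead of aborting,
// unless failFast is set. With ignoreErrors, failures are treated as non-fatal.
func snapshotEntries(names []string, read func(string) error, failFast, ignoreErrors bool) summary {
	var s summary
	for _, name := range names {
		err := read(name)
		if err == nil {
			continue
		}
		s.Failed++
		if len(s.Examples) < maxFailureExamples {
			s.Examples = append(s.Examples, name+": "+err.Error())
		}
		if ignoreErrors {
			continue // non-fatal: reported as a warning only
		}
		s.Fatal++
		if failFast {
			s.Partial = true
			break
		}
	}
	return s
}

// alwaysFail simulates a source that cannot be read.
func alwaysFail(string) error { return errors.New("permission denied") }

func main() {
	names := make([]string, 12)
	for i := range names {
		names[i] = fmt.Sprintf("file%d", i)
	}
	s := snapshotEntries(names, alwaysFail, false, false)
	fmt.Println(s.Failed, len(s.Examples), s.Partial) // 12 10 false
}
```

A caller would write the manifest regardless of the result and return a non-zero exit code only when `Fatal` is greater than zero.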
Jarek Kowalski
5240f62e47 Auto shutdown fix (#834)
* server: removed auto-shutdown option

* server: added --shutdown-on-stdin which will shutdown server when stdin is closed. used by kopia-ui
2021-02-13 19:49:32 -08:00
Jarek Kowalski
81e0ecf2e1 testing: all logs to t.Logf() when the test fails (#833)
* testing: all logs to t.Logf() when the test fails

* testing: send server stderr to t.Logf()
2021-02-13 16:32:36 -08:00
Jarek Kowalski
de840547e6 Improved upload reporting (#832)
* blob: refactored upload reporting

Instead of plumbing this through blob storage context, we are passing
an explicit callback that reports uploads as they happen.

* htmlui: improved counter presentation

* nit: added missing UI route which fixes Reload behavior on the Tasks page
2021-02-13 10:51:11 -08:00
Jarek Kowalski
4bf42e337d fix long filenames on Windows (#822)
* windows: fixed handling of long filenames
2021-02-12 09:09:42 -08:00
Pavan Navarathna
c964e244f0 Support for ignoring sources when using snapshot create --all (#804)
* Add manual field to SchedulingPolicy
* CLI: Set and show for policy with manual field
* CLI: Edit policy support for manual field
* Check manual when creating snapshot for all source
* End to end test for snapshot create all
* Add UI option for setting Manual field
2021-02-10 21:59:06 -08:00
Jarek Kowalski
5d07237156 Added support for user authentication using user profiles stored in the repository (#809)
* user: added user profile (username&password for authentication) and CRUD methods
* manifest: helpers for disambiguating manifest entries
* authn: added repository-based user authenticator
* cli: added commands to manipulate user accounts and passwords
* cli: added --allow-repository-users option to 'server start'
* Update cli/command_user_info.go

Co-authored-by: Julio López <julio+gh@kasten.io>
* Always return false when the user is not found.
2021-02-03 22:04:05 -08:00
carlbraganza
c23d42f84d Refactored the robustness engine constructor to use externally specified interfaces. (#815) 2021-02-03 11:19:33 -08:00
Jarek Kowalski
1a826d85c5 cli: added '--insecure' flag to 'kopia server start' (#803)
* cli: added '--insecure' flag to 'kopia server start'

This is a breaking change for development scenarios to prevent people
from unknowingly launching insecure servers.

Attempt to start a server without either TLS or password protection
results in an error now (unless --insecure is also passed).

KopiaUI already launches server with TLS and random password, so it
does not require it.
2021-01-28 09:13:57 -08:00
Jarek Kowalski
646c325826 Implemented new streaming GRPC protocol for Kopia Repository Server (#789)
* grpcapi: added GRPC API for the repository server

* repo: added transparent retries to GRPC repository client

Normally GRPC reconnects automatically, which can survive server
restarts (minus transient errors).

In our case we're establishing a stream which will be broken and
needs to be restarted after io.EOF is detected.

It is safe to do transparent retries for read-only (repo.Repository),
but not safe for write sessions (repo.RepositoryWriter), because the
session may re-connect to different server that won't have the buffered
content write available in memory.
2021-01-28 05:15:12 -08:00
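The retry policy described in 646c325826 (retry read-only calls after the stream breaks with `io.EOF`, never retry write sessions) might look roughly like this in Go. The sketch does not use the real GRPC client; `callWithRetry`, `brokenStream` and `noopReconnect` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// callWithRetry runs op and, for read-only sessions only, reconnects and
// retries when the stream was broken (io.EOF), up to maxAttempts calls.
func callWithRetry(readOnly bool, maxAttempts int, reconnect func() error, op func() error) error {
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		err = op()
		if !errors.Is(err, io.EOF) {
			return err // success, or an error unrelated to the broken stream
		}
		if !readOnly {
			// A write session may reconnect to a server that lacks the
			// buffered content, so the error is surfaced to the caller.
			return err
		}
		if rerr := reconnect(); rerr != nil {
			return rerr
		}
	}
	return err
}

// brokenStream returns an operation that fails with io.EOF n times, then succeeds.
func brokenStream(n int) func() error {
	return func() error {
		if n > 0 {
			n--
			return io.EOF
		}
		return nil
	}
}

func noopReconnect() error { return nil }

func main() {
	fmt.Println(callWithRetry(true, 3, noopReconnect, brokenStream(2)))  // <nil>
	fmt.Println(callWithRetry(false, 3, noopReconnect, brokenStream(2))) // EOF
}
```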
Jarek Kowalski
a11ddaf2cf restore: added support for incremental restore and ignoring copy errors (#794)
* restore: added support for incremental restore and ignoring copy errors

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-01-27 06:29:15 -08:00
carlbraganza
6fd4409bb3 Introduced a FileWriter interface in the robustness test and refactored accordingly. (#797) 2021-01-25 19:57:27 -08:00
carlbraganza
8d907c937c All robustness test engine interface definitions moved into the test root directory (#791)
* All robustness test engine interface definitions moved into the test root directory.

* Fixed lint issue.
2021-01-21 18:55:08 -08:00
Nick
122a31b905 Overwrite symlinks with optional flag - fix #689 (#783)
Fixes #689

Add symlink overwrite behavior to fix "file exists" error when restoring a symlink that already exists

Before creating the restored symlink, check `os.Lstat`:
- If it returns an error indicating the file does not exist, proceed to symlink creation
- If it returns any other error, propagate the error up to the caller
- If the fileInfo indicates the entry is a symlink AND `--no-overwrite-symlinks` was set in the restore command, propagate an error to the caller
- If `--no-overwrite-symlinks` was NOT set, remove the existing symlink before proceeding to symlink creation
- Else the file exists but it is not of type symlink. Halt the operation and propagate an error indicating we tried to restore a symlink over a file system entry that already existed but was not a symlink.

Added case to `TestSnapshotRestore` that fails before this fix and succeeds after. The case is simply to restore the same snapshot into the same directory twice in a row, where the second restore will be on top of the first one.

Added test case to ensure `--no-overwrite-symlinks` throws an error as expected if restoring into a directory where a symlink already exists at the path symlink creation is attempted.

Added test case to ensure that the restore operation fails if a symlink is needed to be restored to the same path as an existing non-symlink filesystem entry with the same name.

* Skip overwrite test on Windows

If test is run as non-admin it is likely to fail on Windows
with insufficient permissions to overwrite the previously
restored data.

* Add brief summary of overwrite behavior to help

Add a brief summary to the restore command help text
 of expected behavior when restoring into a target location
 that has existing data present.
2021-01-21 12:26:42 -08:00
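The `os.Lstat` decision sequence listed in 122a31b905 can be sketched as a single Go function. This is an illustration rather than the restore code itself; `prepareSymlinkPath` and its `noOverwrite` parameter (standing in for `--no-overwrite-symlinks`) are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// prepareSymlinkPath makes sure a symlink can be created at path, following
// the checks described in the commit message.
func prepareSymlinkPath(path string, noOverwrite bool) error {
	fi, err := os.Lstat(path)
	switch {
	case errors.Is(err, os.ErrNotExist):
		return nil // nothing in the way, proceed to symlink creation
	case err != nil:
		return err // any other error is propagated to the caller
	case fi.Mode()&os.ModeSymlink == 0:
		return fmt.Errorf("will not overwrite %q: existing entry is not a symlink", path)
	case noOverwrite:
		return fmt.Errorf("will not overwrite existing symlink %q", path)
	default:
		return os.Remove(path) // remove the old symlink before re-creating it
	}
}

func main() {
	dir, err := os.MkdirTemp("", "restore-example")
	if err != nil {
		fmt.Println("cannot create temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	link := filepath.Join(dir, "link")
	if err := os.Symlink("target", link); err != nil {
		fmt.Println("cannot create symlink:", err)
		return
	}
	fmt.Println(prepareSymlinkPath(link, true))  // error: symlink exists, overwrite disabled
	fmt.Println(prepareSymlinkPath(link, false)) // <nil>: old symlink removed
}
```

Using `os.Lstat` rather than `os.Stat` matters here because `Stat` would follow the link and describe its target instead of the link itself.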
Jarek Kowalski
fa7976599c repo: refactored repository interfaces (#780)
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`

- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
  interface

Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.

`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).

* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session

* cli: disable maintenance in 'kopia server start'
  Server will close the repository before completing.

* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
  - used atomic to manage SharedManager.closed

* removed stale example
* linter: fixed spurious failures

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-01-20 11:41:47 -08:00
Jarek Kowalski
1f3b8d4da4 upgrade linter to 1.35 (#786)
* lint: added test that enforces Makefile and GH action linter versions are in sync
* workaround for linter gomnd problem - https://github.com/golangci/golangci-lint/issues/1653
2021-01-16 18:21:16 -08:00
Jarek Kowalski
4c8b9291e1 content: refactoring of Manager (#785)
- renamed content.Manager to content.WriteManager
- merged lockFreeManager and CommittedReadManager into SharedManager
- also reassigned some methods to SharedManager (no code move)
2021-01-14 23:13:57 -08:00
Jarek Kowalski
f517703079 Preliminary support for sessions (#752)
* content: fixed time-based auto-flush behavior to behave like Flush()

Previously it would sometimes be possible for a content whose write
started before time-based flush to finish writing afterwards (and it
would be included in the new index).

Refactored the code so that time-based flush happens before WriteContent
write and behaves exactly the same way as real Flush() so all writes
started before it will be awaited during the flush.

Also previous regression test was incorrect since it was mocking the
wrong blob method.

* content: refactored index blob manager crypto to separate file

This will be reused for encrypting session info.

* content: added support for session markers

Session marker (`s` blob) is written BEFORE the first data blob
(`p` or `q`) that belongs to new index segment (`n` is written).

Session marker is removed AFTER the index blob (`n`) has been written.

All pack and index blobs belonging to a session will have the session
ID as its suffix, so that if a reader can see `s<sessionID>` blob, they
will ignore any `p` and `q` blobs with the same suffix.

* maintenance: ignore blobs belonging to active sessions when running blob garbage collection

* cli: added 'sessions list' for listing active sessions

* content: added retrying writing previously failed blobs before writing new one
2021-01-14 00:25:51 -08:00
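The reader-side rule from f517703079 (ignore `p` and `q` blobs whose suffix matches a visible `s<sessionID>` marker) is sketched below in Go. The exact blob naming is an assumption: this example pretends the session ID follows the last `-` in the blob ID, and `committedPacks` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// committedPacks returns the pack blobs ("p" or "q" prefix) that do not
// belong to a session whose marker blob ("s" prefix) is still visible.
// Assumed, hypothetical ID layout: "<prefix><hash>-<sessionID>".
func committedPacks(blobIDs []string) []string {
	// First pass: collect IDs of sessions that are still active.
	active := map[string]bool{}
	for _, id := range blobIDs {
		if strings.HasPrefix(id, "s") {
			active[strings.TrimPrefix(id, "s")] = true
		}
	}

	// Second pass: keep only pack blobs outside active sessions.
	var result []string
	for _, id := range blobIDs {
		if !strings.HasPrefix(id, "p") && !strings.HasPrefix(id, "q") {
			continue // not a pack blob
		}
		if i := strings.LastIndex(id, "-"); i >= 0 && active[id[i+1:]] {
			continue // belongs to a session that has not written its index yet
		}
		result = append(result, id)
	}
	return result
}

func main() {
	blobs := []string{"sabc", "p111-abc", "q222-abc", "p333-def", "n444"}
	fmt.Println(committedPacks(blobs)) // [p333-def]
}
```

Because the marker is written before the first data blob and removed after the index blob, a filter of this shape would hide in-flight packs from garbage collection for as long as their session is active.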
Peter Palotas
ae37719e51 Improved .kopiaignore pattern matching (#773)
* Improved .kopiaignore pattern matching

.kopiaignore pattern matching now (hopefully) conforms to the .gitignore specification (https://git-scm.com/docs/gitignore)

Replaced old package "ignore" with a newly written "wcmatch" that manages the globbing. This should support all the patterns that .gitignore supports.  

Some changes in ignorefs that dealt with how the patterns were matched.

This fixes #571

* Fixed invalid matching of non-rooted patterns that contained a slash.

If a pattern contains a slash in the middle of the pattern this should only match relative to the .gitignore file, i.e. the same as if it started with a '/' according to the .gitignore spec.

Example:
foo/bar should match "/foo/bar", but not "/other/foo/bar".  
whereas 
"bar" matches both "/bar" and "/foo/bar"

* Uncommented previously failing tests.

* Fixed problem with matching "nested" .kopiaignore files.

Ignore-patterns must be applied from the root .kopiaignore down the hierarchy, so that an ignore file in a subdirectory can negate a pattern from a parent directory.

* Uncommented tests that should now work.
2021-01-08 08:13:18 -08:00
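The rooted versus non-rooted matching rule from ae37719e51 can be demonstrated with a deliberately small Go sketch. It is not the `wcmatch` package and covers only this one rule (no negation, `**`, or trailing-slash handling); `matchIgnore` is a hypothetical helper built on the standard `path.Match`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// matchIgnore reports whether pattern matches p, where p is a slash-separated
// path relative to the directory containing the ignore file.
func matchIgnore(pattern, p string) bool {
	p = strings.TrimPrefix(p, "/")
	if strings.Contains(pattern, "/") {
		// A pattern with a slash is anchored to the ignore file's directory,
		// exactly as if it started with "/".
		ok, _ := path.Match(strings.TrimPrefix(pattern, "/"), p)
		return ok
	}
	// A pattern without a slash matches the base name at any depth.
	ok, _ := path.Match(pattern, path.Base(p))
	return ok
}

func main() {
	fmt.Println(matchIgnore("foo/bar", "/foo/bar"))       // true
	fmt.Println(matchIgnore("foo/bar", "/other/foo/bar")) // false
	fmt.Println(matchIgnore("bar", "/bar"))               // true
	fmt.Println(matchIgnore("bar", "/foo/bar"))           // true
}
```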
Peter Palotas
cd8f3e81b8 Created end-to-end tests verifying .kopiaignore behavior. (#774)
* Created end-to-end tests verifying .kopiaignore behavior.

This is related to #571 and #773, but provided as a separate PR to include tests that did not work before PR #773.

* Commented failing tests.

These tests will be re-enabled when #773 is done.

* Added additional commented tests of .kopiaignore

These will be uncommented in #773.
2021-01-08 07:39:59 -08:00
Jarek Kowalski
8e17edcdf6 content: mechanical refactoring of content manager to extract CommittedReadManager (#771)
* content: introduced ContentReadManager
  This only introduces new type and reassigns methods
* content: moved all CommittedReadManager methods to one file
* content: more code movement
* content: refactored read manager setup
* content: refactored test hook that allowed cleaner passing of custom own writes cache
2021-01-07 00:39:20 -08:00
Jarek Kowalski
f1b471d7e6 Fixes for test flakes (#770)
* testing: prevented spurious test flakes caused by kopia subprocesses messing with stderr

This was not causing actual failures, but misreporting error messages.

* testing: ensure random names are always unique by adding a counter
2021-01-05 21:37:23 -08:00
Jarek Kowalski
207009939f cli: only fetch the persisted password from keychain if one was not provided on the command line (#744)
This also fixed a test bug where the test was incorrectly passing
password via environment variable and it was (incorrectly) expected
to be ignored.

Password is determined in the following order:

- flag/environment variable (highest priority)
- persistent storage
- asking user (lowest priority)
2020-12-24 22:39:02 -08:00
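The password lookup order listed in 207009939f can be expressed as a short Go sketch. The function and helper names (`resolvePassword`, `keychain`, `noKeychain`, `prompt`) are hypothetical and only illustrate the priority order; the CLI's actual code may be structured differently.

```go
package main

import "fmt"

// resolvePassword returns the password from the highest-priority source:
// flag or environment variable, then persistent storage, then the user.
func resolvePassword(flagOrEnv string, persisted func() (string, bool), ask func() (string, error)) (string, error) {
	if flagOrEnv != "" {
		return flagOrEnv, nil // explicit value wins; the keychain is not consulted
	}
	if p, ok := persisted(); ok {
		return p, nil
	}
	return ask()
}

// Hypothetical sources used for the demonstration below.
func keychain() (string, bool)   { return "from-keychain", true }
func noKeychain() (string, bool) { return "", false }
func prompt() (string, error)    { return "from-prompt", nil }

func main() {
	p, _ := resolvePassword("from-flag", keychain, prompt)
	fmt.Println(p) // from-flag
	p, _ = resolvePassword("", keychain, prompt)
	fmt.Println(p) // from-keychain
	p, _ = resolvePassword("", noKeychain, prompt)
	fmt.Println(p) // from-prompt
}
```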
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
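The motivation for the `errorlint` fixes in e03971fc59 is easy to see in a small Go example: once an error is wrapped, a direct `==` comparison stops matching, while `errors.Is` still finds the wrapped sentinel. The sentinel `errBlobNotFound` and the helpers below are hypothetical and exist only for this demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// errBlobNotFound is a hypothetical sentinel error.
var errBlobNotFound = errors.New("blob not found")

// getBlob wraps the sentinel with context using the %w verb.
func getBlob(id string) error {
	return fmt.Errorf("getting blob %q: %w", id, errBlobNotFound)
}

// isNotFound unwraps the error chain instead of comparing directly.
func isNotFound(err error) bool {
	return errors.Is(err, errBlobNotFound)
}

func main() {
	err := getBlob("p123")
	fmt.Println(err == errBlobNotFound) // false: the wrapper is a different value
	fmt.Println(isNotFound(err))        // true: errors.Is walks the chain
}
```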
Jarek Kowalski
246dcf80ba testing: added 'snapshot verify' test 2020-12-21 20:02:53 -08:00
Jarek Kowalski
d7ca543356 cli: improvements to 'snapshot verify'
* When running against direct repository, it will verify that all
  backing blobs exist based on results of listing.
* Deprecated annoying --all-sources flag which is now default if no
  sources are provided.
2020-12-21 20:02:53 -08:00
Jarek Kowalski
eecd9d13c9 actions: Added --enable-actions flag (#737)
This can be specified at `repo create` or `repo connect` to enable
actions. By default actions are disabled to avoid security risks
associated with executing code.

Alternatively during `snapshot create` one can specify
`--force-enable-actions` or `--force-disable-actions`
2020-12-21 18:05:25 -08:00
Jarek Kowalski
4f7d211f72 Added support for actions that run before&after snapshot roots and before/after specific folders (#722)
* policy: add actions
* fs: added LocalFilesystemPath() which can optionally return local filesystem
  path (if entry is local)
* cli: added support for setting policy actions
* upload: support for executing actions before/after folder (non-inheritable)
  and before/after snapshots (inheritable)
* testing: end-to-end test for actions
* additional tests for actions with embedded scripts
2020-12-21 15:53:21 -08:00
Julio López
3795ffc6f9 robustness: minor cleanups (#726)
Remove unnecessary intermediate variables.
Send SIGTERM instead of SIGKILL to terminate child kopia server process.
Set Pdeathsig on Linux for child kopia server process.
Trivial: reduce scope of hostFioDataPathStr variable.
Trivial: rename local variable.
Trivial: Use log.Fatalln instead of log + exit(1).
Improve error message in robustness test to tell apart failure cause.
2020-12-16 12:49:54 -08:00