Commit Graph

397 Commits

Author SHA1 Message Date
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
Jarek Kowalski
d7ca543356 cli: improvements to 'snapshot verify'
* When running against direct repository, it will verify that all
  backing blobs exist based on results of listing.
* Deprecated annoying --all-sources flag which is now default if no
  sources are provided.
2020-12-21 20:02:53 -08:00
Jarek Kowalski
eecd9d13c9 actions: Added --enable-actions flag (#737)
This can be specified at `repo create` or `repo connect` to enable
actions. By default actions are disabled to avoid security risks
associated with executing code.

Alternatively during `snapshot create` one can specify
`--force-enable-actions` or `--force-disable-actions`
2020-12-21 18:05:25 -08:00
Jarek Kowalski
4f7d211f72 Added support for actions that run before&after snapshot roots and before/after specific folders (#722)
* policy: add actions
* fs: added LocalFilesystemPath() which can optionally return local filesystem
  path (if entry is local)
* cli: added support for setting policy actions
* upload: support for executing actions before/after folder (non-inheritable)
  and before/after snapshots (inheritable)
* testing: end-to-end test for actions
* additional tests for actions with embedded scripts
2020-12-21 15:53:21 -08:00
Jarek Kowalski
f279a5be69 'kopia policy set' code cleanup (#730)
* cli: split command_policy_set.go by individual areas

* cli: refactored 'policy set' implementation to reuse helpers

* use defined const instead of literal

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-18 10:55:43 -08:00
Jarek Kowalski
ad4b222939 cli: added support for copying (or moving) snapshot history (#703)
Both source and destination can be specified using user@host,
@host or user@host:/path where destination values override the
corresponding parts of the source, so both targeted
and mass copying is supported.

Supported combinations are:

Source:             Destination         Behavior
---------------------------------------------------
@host1              @host2              copy snapshots from all users of host1
user1@host1         @host2              copy all snapshots to user1@host2
user1@host1         user2@host2         copy all snapshots to user2@host2
user1@host1:/path1  @host2              copy to user1@host2:/path1
user1@host1:/path1  user2@host2         copy to user2@host2:/path1
user1@host1:/path1  user2@host2:/path2  copy snapshots from single path

When --move is specified, the matching source snapshots are also deleted.

* cli: upgraded kingpin to latest version (not tagged)

This allows using `EnableFileExpansion` to disable treating
arguments prefixed with "@" as file includes.
2020-12-04 16:34:55 -08:00
Piotr Tabor
bf729095dc Fuse: Passthrough options to allow other-users access and mounting in empty directory. (#691) 2020-10-31 21:57:27 -07:00
Erkki Seppälä
6a93e4d5b9 Added support for scanning only one filesystem via files policy (#676)
The new files policy oneFileSystem ignores files that are mounted to
other filesystems similarly to tar's --one-file-system switch. For
example, if this is enabled, backing up / should now automatically
ignore /dev, /proc, etc, so the directory entries themselves don't
appear in the backup. The value of the policy is 'false' by default.

This is implemented by adding a non-windows-field Device (of type
DeviceInfo, reflecting the implementation of Owner) to the Entry
interface. DeviceInfo holds the dev and rdev acquired with stat (same
way as with Owner), but in addition to that it also holds the same
values for the parent directory. It would seem that doing this in some
other way, ie. in ReadDir, would require modifying the ReadDir
interface which seems a too large modification for a feature this
small.

This change introduces a duplication of 'stat' call to the files, as
the Owner feature already does a separate call. I doubt the
performance implications are noticeable, though with some refactoring
both Owner and Device fields could be filled in in one go.

Filling in the field has been placed in fs/localfs/localfs.go where
entryFromChildFileInfo has acquired a third parameter giving the the
parent entry. From that information the Device of the parent is
retrieved, to be passed off to platformSpecificDeviceInfo which does
the rest of the paperwork. Other fs implementations just put in the
default values.

The Dev and Rdev fields returned by the 'stat' call have different
sizes on different platforms, but for convenience they are internally
handled the same. The conversion is done with local_fs_32bit.go and
local_fs_64bit.go which are conditionally compiled on different
platforms.

Finally the actual check of the condition is in ignorefs.go function
shouldIncludeByDevice which is analoguous to the other similarly named
functions.

Co-authored-by: Erkki Seppälä <flux@inside.org>
2020-10-14 22:45:32 -07:00
Jarek Kowalski
170bdeb944 cli: improved restore progress information (#678)
Demo: https://asciinema.org/a/8RV8SwBVt2CI0HSsklYwO8FhP
2020-10-11 11:56:46 -07:00
Jarek Kowalski
4d7f0cb6cd Fixed symlink restore behavior on macOS (#673)
* restore: use symlink-specific APIs instead of chmod, chown and chtimes

* upload: fix updating directory modtime for symlinks

* cli: plumbed through flags to restore to control new behaviors

* localfs: use Lstat() instead of Stat() in Child() method

* testing: added restore tests for new flags
2020-10-10 11:03:35 -07:00
Jarek Kowalski
9d7cf71a37 Logging flags (#674)
* logging: cleaned up stderr logging

- do not show module
- do not show timestamps by default (enable with --console-timestamps)

* logging: replaced most printStderr() with log.Info

* cli: additional logging cleanup
2020-10-10 10:48:37 -07:00
Jarek Kowalski
ec9c4d6095 restore: support for parallelization (#668) 2020-10-07 21:41:32 -07:00
Jarek Kowalski
4fd0bcf7dc cli: 'snapshot create' switched all stderr output to use logger (#663)
This allows 'kopia snap create --all --log-level=warning --no-progress'
to produce no console output unless an error is encountered.
2020-10-05 17:28:00 -07:00
Jarek Kowalski
d08c052651 cli: removed confusing default on 'snapshot list' that only lists last 100 snapshots per source (#666) 2020-10-05 16:59:06 -07:00
Jarek Kowalski
0758a92c58 restore: improved user experience (#644)
* restore: improved user experience

* 'snapshot restore' is now the same as 'restore' and both will
  support restoring by manifest ID, root ID or root ID + subdirectory

* added support for restoring individual files

* implemented PR feedback and refactored object ID parsing

Moving helpers inside the snapshot/ package helped clean up the code
a lot.
2020-09-28 22:57:24 -07:00
Jarek Kowalski
ff6a414ec5 cli: When listing directory that had errors, print error summary at the end. (#643)
Can be disabled with `--no-error-summary`.
Quick demo: https://asciinema.org/a/2rma0sx2mD6HoIPy6VL0QEFeP

Also refactored fs.Directory to provide Summary optionally.
2020-09-25 09:06:41 -07:00
Jarek Kowalski
d0d6ac4767 SFTP connectivity and docs improvements (#623)
* sftp: support for external SSH command and host verfication improvements

- removed custom parsing of hostnames and verification and replaced with
  standard 'knownhosts' implementation.

- added option to launch external SSH command which supports
  aliases, agent, etc.

NOTE, we're still not supporting any cases where password needs to be
entered on the command line, since that would be incompatible with
the UI which uses client-server model.

Fixes #500
Fixes #414

* site: updated SFTP repository connection instructions

Fixes #590
2020-09-20 11:10:13 -07:00
Jarek Kowalski
538c644586 cli: don't ask for password if repository is not connected (#627) 2020-09-19 11:45:03 -07:00
Jarek Kowalski
fce9497375 restore: support for symlinks (experimental) (#621) 2020-09-18 10:29:20 -07:00
Jarek Kowalski
9551d2495d upload: scan the directory to be uploaded in parallel to estimate the amount of data to be uploaded (#622)
This allows better progress indicator in the CLI and UI.
The percentage completed is not displayed until estimate is available.

Quick demo: https://asciinema.org/a/O7ktcWSgaGUPfJwhzc65mMWM1
2020-09-17 23:59:18 -07:00
Jarek Kowalski
9faf6b33d0 cli: fixed snapshot delete to support deleting file (not directory) snapshots by object ID (#613) 2020-09-12 22:36:47 -07:00
Jarek Kowalski
f0b97b960b Fixed checkpointing to not restart the entire upload process (#594)
* object: added Checkpoint() method to object writer

* upload: refactored code structure to allow better checkpointing

* upload: removed Checkpoint() method from UploadProgress

* Update fs/entry.go

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 22:36:22 -07:00
Jarek Kowalski
6a14ac8a2a cli: ensure advanced commands are not accidentally used (#611)
* cli: ensure advanced commands are not accidentally used

This prints an error when a dangerous command is used without
first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable.

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 20:31:25 -07:00
Jarek Kowalski
faf280616a Splitter throughput improvements (#606)
* object: refactored writer to detect split points before writing

This introduces new primitive that will be moved into splitters
themselves in subsequent commits. I'm doing this in small steps to
ensure we don't regress at any time.

* splitter: refactored TestSplitters test

This is use slow (byte-by-byte) and fast (nextSplitPoint) methods of
determining split points.

Note nextSplitPoint is not implemented by splitters yet, but this
verifies that the test is expecting the right thing.

* object: splitter refactoring - replaced ShouldSplit() with NextSplitPoint() everywhere, still not optimized

* splitter: added additional dimension to splitter_test

We split either in large chunks or one byte at a time to catch
the corner cases in the splitter implementation.

* splitter: optimized splitters using NextSplitPoint primitive

This improves splitter performance by about 40% (buzhash) and makes
it virtually free for FIXED splitter.
2020-09-11 19:45:48 -07:00
Jarek Kowalski
3b87902433 Kopia UI improvements for repository management (#592)
* cli: added --tls-print-server-cert flag

This prints complete server certificate that is base64 and PEM-encoded.

It is needed for Electron to securely connect to the server outside of
the browser, since there's no way to trust certificate by fingerprint.

* server: added repo/exists API

* server: added ClientOptions to create and connect API

* server: exposed current-user API

* server: API to change description of a repository

* htmlui: refactored connect/create flow

This cleaned up the code a lot and made UX more obvious.

* kopia-ui: simplified repository management UX

Removed repository configuration window which was confusing due to
the notion of 'server'.

Now KopiaUI will automatically launch 'kopia server --ui' for each
config found in the kopia config directory and shut it down every
time repository is disconnected.

See https://youtu.be/P4Ll_LR4UVM for a quick demo.

Fixes #583
2020-09-07 08:00:19 -07:00
Jarek Kowalski
29ce1819cb Added support for setting and changing repository client options (description, read-only, hostname, username) (#589)
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config

* cli: added 'repository set-client' to configure parameters of connected repository

* cli: cleaned up 'repository status' output
2020-09-04 13:57:15 -07:00
Jarek Kowalski
a5838ff34c Improvements to UX for mounting directories (both CLI and KopiaUI) (#573)
* cli: simplified mount command

See https://youtu.be/1Nt_HIl-NWQ

It will always use WebDAV on Windows and FUSE on Unix. Removed
confusing options.

New usage:

$ kopia mount [--browse]
    Mounts all snapshots in a temporary filesystem directory
    (both Unix and Windows).

$ kopia mount <object> [--browse]
    Mounts given object in a temporary filesystem directory
    (both Unix and Windows).

$ kopia mount <object> z: [--browse]
    Mounts given object as a given drive letter in Windows (using
    temporary WebDAV mount).

$ kopia mount <object> * [--browse]
    Mounts given object as a random drive letter in Windows.

$ kopia mount <object> /mount/path [--browse]
    Mounts given object in given path in Unix.

<object> can be the ID of a directory 'k<hash>' or 'all'

Optional --browse automatically opens OS-native file browser.

* htmlui: added UI for mounting directories

See https://youtu.be/T-9SshVa1d8 for a quick demo.

Also replaced some UI text with icons.

* lint: windows-specific fix
2020-09-03 17:46:48 -07:00
Jarek Kowalski
90a9cf1143 cli: plumbed through missing --server-cert-fingerprint option (#580)
Fixes: #577
2020-09-01 19:57:14 -07:00
Jarek Kowalski
ded1ecf936 implemented Cache Directory Tagging Specification + CLI + UI (#565)
Fixes #564

cli: added 'kopia policy set --ignore-cache-dirs' option to control
whether to ignore caches (global default=true)

ui: added checkbox to control 'Ignore Cache Dirs' in policy editor

ignorefs: moved ignoring cache directories to ignorefs layer

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-08-31 21:35:26 -07:00
Jarek Kowalski
c242235a32 blob: added SetTime() method which may be optionally implemented by blob.Storage (#575)
cli: added --times option to 'repository sync'
2020-08-31 19:50:15 -07:00
Jarek Kowalski
965160dba1 cli: ignore trailing / in repository server URL (#569)
Fixes #557
2020-08-30 16:10:26 -07:00
Jarek Kowalski
1a8fcb086c Added endurance test which tests kopia over long time scale (#558)
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()

Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT

logfile: squelch annoying log message

testenv: added faketimeserver which serves time over HTTP

testing: added endurance test which tests kopia over long time scale

This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).

The test is not used part of any test suite yet but will run in
post-submit mode only, preferably 24/7.

testing: refactored internal/clock to only support injection when
'testing' build tag is present
2020-08-26 23:03:46 -07:00
Jarek Kowalski
c11850cfb7 Tools to help investigate repository structures safely (#553)
* cli: added 'index inspect' which can dump contents of index blob or local file
* repo: added read-only option when connecting to a repo which prevents any mutations

Co-authored-by: Julio Lopez <julio+gh@k....io>
2020-08-25 17:55:48 -07:00
Jarek Kowalski
7ae823945c Experimental rclone backend (#545)
This will launch 'rclone webdav server' passing random TLS
certificate and username/password and serve predefined rclone
remote path.

This is very experimental, use with caution.

Fixes #313.

Additional / required changes:
* blob: (experimental) support for rclone provider
* server: refactored TLS utilities to separate package
* webdav: add support for specifying trusted TLS certificate fingerprint
* kopia-ui: added rclone support
2020-08-17 20:43:41 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00
Jarek Kowalski
984480908b Updated documentation for v0.6.0 release (#525)
* cli: small tweaks to kopia server mode

  * print SHA256 certficate thumbprint for auto-generated certs.
  * client will accept both upper- and lowercase thumbprint values

* site: updated documentation for v0.6.0 release

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-08-10 21:11:48 -07:00
Jarek Kowalski
505ab92e21 Support for repository sync (#522)
* blob: added DisplayName() method to blob.Storage

* cli: added 'kopia repo sync-to <provider>' which replicates BLOBs

Usage demo: https://asciinema.org/a/352299

Fixes #509

* implemented suggestion by Ciantic to fail sync if the destination repository is not compatible with the source

* cli: added 'kopia repo sync --must-exist'

This ensures that target repository is not empty, otherwise syncing to
an accidentally unmounted filesystem directory might copy everything
again.
2020-08-09 12:36:41 -07:00
Jarek Kowalski
04fdd0105e Fix for policy manager not setting labels in some cases (#517)
* cli: fixed 'kopia policy rm' deleting global when passed policy ID

* policy: additional unit test coverage for policy manager

* fixed path parsing logic to avoid the use filepath package which is platform-dependent, added more tests
2020-08-06 21:25:48 -07:00
Jarek Kowalski
9de7348a75 cli: auto-upgrade the repository to 0.6.0 (#512)
On first 'kopia snapshot' or 'kopia server' use where the 'maintenance'
manifest cannot be found, we create a default one and display usage
note.
2020-08-04 23:53:18 -07:00
Jarek Kowalski
fca8155283 update_check: clean up version numbers to always include 'v' prefix 2020-07-31 22:35:20 -07:00
Jarek Kowalski
8ead49b779 restore: support for zip, tar and tar.gz restore outputs (#482)
* restore: support for zip, tar and tar.gz restore outputs

Moved restore functionality to its own package.

* Fix enum values in the 'mode' flag

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-07-22 22:56:11 -07:00
Jarek Kowalski
1c2534199c cli: fixed panic() during snapshot verification caused by refactoring to new Repository interface 2020-06-24 08:39:56 -07:00
Jarek Kowalski
3843ed6727 cli: added content verify --include-deleted flag 2020-06-24 08:39:56 -07:00
Jarek Kowalski
25934a544d content: fixed deletion of cleanup blobs (#462)
* content: ensure that cleanup blobs have unique contents to prevent situation where they keep getting rewritten and thus never deleted

* cli: added '--decrypt' option to 'kopia blob show'
2020-06-08 08:14:01 -07:00
Jarek Kowalski
293cb10471 cosmetic changes to maintenance behavior (#461)
- run maintenance even if the command is about to return an error
  (otherwise if folks have persistent error causing snapshots to fail
  they will never run maintenance)
- disable progress output after snapshotting so that
  'kopia snapshot --all' output is clean
2020-06-01 23:38:18 -07:00
Jarek Kowalski
960c33475e maintenance: disabled automatic compaction on repository opening
instead moved to run as part of maintenance ('kopia maintenance run')

added 'kopia maintenance run --force' flag which runs maintenance even
if not owned
2020-06-01 00:57:32 -07:00
Jarek Kowalski
d68273a576 Improvements for dealing with eventually-consistent stores (S3) (#437)
* content: added support for cache of own writes

Thi keeps track of which blobs (n and m) have been written by the
local repository client, so that even if the storage listing
is eventually consistent (as in S3), we get somewhat sane behavior.

Note that this is still assumming read-after-create semantics, which
S3 also guarantees, otherwise it's very hard to do anything useful.

* compaction: support for compaction logs

Instead of compaction immediately deleting source index blobs, we now
write log entries (with `m` prefix) which are merged on reads
and applied only if the blob list includes all inputs and outputs, in
which case the inputs are discarded since they are known to have been
superseded by the outputs.

This addresses eventual consistency issues in stores such as S3,
which don't guarantee list-after-put or list-after-delete. With such
stores the repository is ultimately eventually consistent and there's
not much that can be done about it, unless we use second strongly
consistent storage (such as GCS) for the index only.

* content: updated list cache to cache both `n` and `m`

* repo: fixed cache clear on windows

Clearing cache requires closing repository first, as Windows is holding
the files locked.

This requires ability to close the repository twice.

* content: refactored index blob management into indexBlobManager

* testing: fixed blobtesting.Map storage to allow overwrites

* blob: added debug output String() to blob.Metadata

* testing: added indexBlobManager stress test

This works by using N parallel "actors", each repeatedly performing
operations on indexBlobManagers all sharing single eventually consistent
storage.

Each actor runs in a loop and randomly selects between:

- *reading* all contents in indexes and verifying that it includes
  all contents written by the actor so far and that contents are
  correctly marked as deleted
- *creating* new contents
- *deleting* one of previously-created contents (by the same actor)
- *compacting* all index files into one

The test runs on accelerated time (every read of time moves it by 0.1
seconds) and simulates several hours of running.

In case of a failure, the log should provide enough debugging
information to trace the exact sequence of events leading up to the
failure - each log line is prefixed with actorID and all storage
access is logged.

* makefile: increase test timeout

* content: fixed index blob manager race

The race is where if we delete compaction log too early, it may lead to
previously deleted contents becoming temporarily live again to an
outside observer.

Added test case that reproduces the issue, verified that it fails
without the fix and passed with one.

* testing: improvements to TestIndexBlobManagerStress test

- better logging to be able to trace the root cause in case of a failure
- prevented concurrent compaction which is unsafe:

The sequence:

1. A creates contentA1 in INDEX-1
2. B creates contentB1 in INDEX-2
3. A deletes contentA1 in INDEX-3
4. B does compaction, but is not seeing INDEX-3 (due to EC or simply
   because B started read before #3 completed), so it writes
   INDEX-4==merge(INDEX-1,INDEX-2)
   * INDEX-4 has contentA1 as active
5. A does compaction but it's not seeing INDEX-4 yet (due to EC
   or because read started before #4), so it drops contentA1, writes
   INDEX-5=merge(INDEX-1,INDEX-2,INDEX-3)
   * INDEX-5 does not have contentA1
7. C sees INDEX-5 and INDEX-5 and merge(INDEX-4,INDEX-5)
   contains contentA1 which is wrong, because A has been deleted
   (and there's no record of it anywhere in the system)

* content: when building pack index ensure index bytes are different each time by adding 32 random bytes
2020-05-31 17:11:20 -07:00
Jarek Kowalski
9123e3ebff cli: 'kopia server' made --ui default (#452) 2020-05-25 19:09:26 -07:00
Julio López
e493177236 Remove legacy flags from snapshot create command (#441)
Remove `--hostname` and `--username` flags
2020-05-18 22:48:18 -07:00
Jarek Kowalski
ca28469706 cli: improved 'snapshot delete' usage (#436)
New usage:

```
kopia snapshot delete manifestID... [--delete]
kopia snapshot delete rootObjectID... [--delete]
```

Fixes #435

cli: added --unsafe-ignore-source as alias for `--delete`
This is a hidden flag for backwards compatibility. It will be removed.
2020-05-13 23:43:45 -07:00