Commit Graph

421 Commits

Author SHA1 Message Date
Jarek Kowalski
c990fc9ec1 cache: streamlined flags and cache handling (#831)
* cache: improved cache cleanup on exit

Ensure we do one full sweep before closing if cache has been modified.

Before we would do periodic sweep every minute which would not kick in
for very short snapshots, which Kopia does very frequently. This leads
to build-up of metadata cache entries (q blobs) that never
get cleaned until some long session.

* caching: streamlined cache handling

- deprecated caching-related flags, now cache is always on or off with
  no way to disable it per invocation.
- reduced default list cache duration from 10min to 30s
- moved blob-list cache to separate subdirectory
- cleaned up cache info output to include blob-list cache parameters
- removed ability to disable cache for per-context (this was only
  used in 'snapshot verify' codepath)
- added ability to partially clear individual caches via CLI
2021-02-16 17:30:49 -08:00
Jarek Kowalski
675bf4e033 Removed manifest manager refresh + server improvements (#835)
* manifest: removed explicit refresh

Instead, content manager is exposing a revision counter that changes
on each mutation or index change. Manifest manager will be invalidated
whenever this is encountered.

* server: refactored initialization API

* server: added unit tests for repository server APIs (HTTP and REST)

* server: ensure we don't upload contents that already exist

This saves bandwidth, since the client can compute hash locally
and ask the server whether the object exists before starting the upload.
2021-02-15 23:55:58 -08:00
Jarek Kowalski
5240f62e47 Auto shutdown fix (#834)
* server: removed auto-shutdown option

* server: added --shutdown-on-stdin which will shutdown server when stdin is closed. used by kopia-ui
2021-02-13 19:49:32 -08:00
Jarek Kowalski
de840547e6 Improved upload reporting (#832)
* blob: refactored upload reporting

Instead of plumbing this through blob storage context, we are passing
and explicit callback that reports uploads as they happen.

* htmlui: improved counter presentation

* nit: added missing UI route which fixes Reload behavior on the Tasks page
2021-02-13 10:51:11 -08:00
Jarek Kowalski
4bf42e337d fix long filenames on Windows (#822)
* windows: fixed handling of long filenames
2021-02-12 09:09:42 -08:00
Pavan Navarathna
c964e244f0 Support for ignoring sources when using snapshot create --all (#804)
* Add manual field to SchedulingPolicy
* CLI: Set and show for policy with manual field
* CLI: Edit policy support for manual field
* Check manual when creating snapshot for all source
* End to end test for snapshot create all
* Add UI option for setting Manual field
2021-02-10 21:59:06 -08:00
Nick
0e1c617045 Fix progress output string format (#828) 2021-02-10 11:18:52 -08:00
Jarek Kowalski
5d07237156 Added support for user authentication using user profiles stored in the repository (#809)
* user: added user profile (username&password for authentication) and CRUD methods
* manifest: helpers for disambiguating manifest entries
* authn: added repository-based user authenticator
* cli: added commands to manipulate user accounts and passwords
* cli: added --allow-repository-users option to 'server start'
* Update cli/command_user_info.go

Co-authored-by: Julio López <julio+gh@kasten.io>
* Always return false when the user is not found.
2021-02-03 22:04:05 -08:00
Jarek Kowalski
1a826d85c5 cli: added '--insecure' flag to 'kopia server start' (#803)
* cli: added '--insecure' flag to 'kopia server start'

This is a breaking change for development scenarios to prevent people
from unknowingly launching insecure servers.

Attempt to start a server without either TLS or password protection
results in an error now (unless --insecure is also passed).

KopiaUI already launches server with TLS and random password, so it
does not require it.
2021-01-28 09:13:57 -08:00
Jarek Kowalski
646c325826 Implemented new streaming GRPC protocol for Kopia Repository Server (#789)
* grpcapi: added GPRC API for the repository server

* repo: added transparent retries to GRPC repository client

Normally GRPC reconnects automatically, which can survive server
restarts (minus transient errors).

In our case we're establishing a stream which will be broken and
needs to be restarted after io.EOF is detected.

It safe to do transparent retries for read-only (repo.Repository),
but not safe for write sessions (repo.RepositoryWriter), because the
session may re-connect to different server that won't have the buffered
content write available in memory.
2021-01-28 05:15:12 -08:00
Jarek Kowalski
a11ddaf2cf restore: added support for incremental restore and ignoring copy errors (#794)
* restore: added support for incremental restore and ignoring copy errors

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-01-27 06:29:15 -08:00
Janne Johansson
45912e272e Option to print out the commands for benchmark (#779)
* Option to print out the commands for using crypto, splitter and compression

Co-authored-by: Janne Johansson <janne.johansson@safespring.com>
Co-authored-by: Jarek Kowalski <jaak@jkowalski.net>
2021-01-23 20:04:09 -08:00
Nick
122a31b905 Overwrite symlinks with optional flag - fix #689 (#783)
Fixes #689

Add symlink overwrite behavior to fix "file exists" error when restoring a symlink that already exists

Before creating the restored symlink, check `os.Lstat`:
- If it returns an error indicating the file does not exist, proceed to symlink creation
- If it returns any other error, propagate the error up to the caller
- If the fileInfo indicates the entry is a symlink AND `--no-overwrite-symlinks` was set in the restore command, propagate an error to the caller
- If `--no-overwrite-symlinks` was NOT set, remove the existing symlink before proceeding to symlink creation
- Else the file exists but it is not of type symlink. Halt the operation and propagate an error indicating we tried to restore a symlink over a file system entry that already existed but was not a symlink.

Added case to `TestSnapshotRestore` that fails before this fix and succeeds after. The case is simply to restore the same snapshot into the same directory twice in a row, where the second restore will be on top of the first one.

Added test case to ensure `--no-overwrite-symlinks` throws an error as expected if restoring into a directory where a symlink already exists at the path symlink creation is attempted.

Added test case to ensure that the restore operation fails if a symlink is needed to be restored to the same path as an existing non-symlink filesystem entry with the same name.

* Skip overwrite test on Windows

If test is run as non-admin it is likely to fail on Windows
with insufficient permissions to overwrite the previously
restored data.

* Add brief summary of overwrite behavior to help

Add a brief summary to the restore command help text
 of expected behavior when restoring into a target location
 that has existing data present.
2021-01-21 12:26:42 -08:00
Jarek Kowalski
5912247f29 server: reworked authn/authz (#788)
* server: reworked authn/authz

Previously authentication was done as an wrapper handler and
authorization was inlined. This change moves authn/authz handlers
inside the server and implements separate authorization module that's
individually tested.

Also fixed an issue where server users were not able to see global
or host-level policies.

* PR feedback
2021-01-21 07:31:34 -08:00
Jarek Kowalski
fa7976599c repo: refactored repository interfaces (#780)
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`

- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
  interface

Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.

`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).

* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session

* cli: disable maintenance in 'kopia server start'
  Server will close the repository before completing.

* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
  - used atomic to manage SharedManager.closed

* removed stale example
* linter: fixed spurious failures

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-01-20 11:41:47 -08:00
Jarek Kowalski
4c8b9291e1 content: refactoring of Manager (#785)
- renamed content.Manager to content.WriteManager
- merged lockFreeManager and CommittedReadManager into SharedManager
- also reassigned some methods to SharedManager (no code move)
2021-01-14 23:13:57 -08:00
Jarek Kowalski
f517703079 Preliminary support for sessions (#752)
* content: fixed time-based auto-flush behavior to behave like Flush()

Previously it would sometimes be possible for a content whose write
started before time-based flush to finish writing afterwards (and it
would be included in the new index).

Refactored the code so that time-based flush happens before WriteContent
write and behaves exactly the same was as real Flush() so all writes
started before it will be awaited during the flush.

Also previous regression test was incorrect since it was mocking the
wrong blob method.

* content: refactored index blob manager crypto to separate file

This will be reused for encrypting session info.

* content: added support for session markers

Session marker (`s` blob) is written BEFORE the first data blob
(`p` or `q`) that belongs to new index segment (`n` is written).

Session marker is removed AFTER the index blob (`n`) has been written.

All pack and index blobs belonging to a session will have the session
ID as its suffix, so that if a reader can see `s<sessionID>` blob, they
will ignore any `p` and `q` blobs with the same suffix.

* maintenance: ignore blobs belonging to active sessions when running blob garbage collection

* cli: added 'sessions list' for listing active sessions

*  content: added retrying writing previously failed blobs before writing new one
2021-01-14 00:25:51 -08:00
Jarek Kowalski
ed9db56b87 Cleaned up and refactored object manager (#782)
* object: refactored Open() and VerifyObject() to be stateless

(no code movement yet to facilitate review)

* mechanical: moved function more appropriate files

* object: remove object manager tracing which was unused
2021-01-13 00:40:23 -08:00
Jarek Kowalski
73a34ff7ff trivial: move CachingOptions out of content.Manager, where it's not needed (#775)
* trivial: move CachingOptions out of content.Manager, where it's not needed

* trivial: removed newManagerWithOptions which was the same as NewManager

also moved one-time initialization to newReadManager()
2021-01-07 18:51:15 -08:00
Jarek Kowalski
d3a6421213 cli: make sure we don't run maintenance as part of read-only actions (#769)
* cli: make sure we don't run maintenance as part of read-only actions

* cli: classified some actions that use *repo.DirectRepository as read-only too
2021-01-05 21:55:02 -08:00
Janne Johansson
f3737fef6e crypto -vs- compress (#766)
Probably a cut-n-paste error.
2021-01-05 07:03:57 -08:00
Jarek Kowalski
f751e4f5c7 cli: only run update check if the binary was built from GitHub and use the original repository name (#756) 2021-01-04 21:33:43 -08:00
Jarek Kowalski
5e8e175cfa repo: refactored read/write methods of repo.Repository (#749)
Reader methods go to repo.Reader and write methods go to repo.Writer
Switched usage to new interfaces based on linter errors.
2021-01-04 21:33:12 -08:00
Jarek Kowalski
207009939f cli: only fetch the persisted password from keychain if one was not provided on the command line (#744)
This also fixed a test bug where the test was incorrectly passing
password via environment variable and it was (incorrectly) expected
to be ignored.

Password is determined in the following order:

- flag/environment variable (highest priority)
- persistent storage
- asking user (lowest priority)
2020-12-24 22:39:02 -08:00
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
Jarek Kowalski
d7ca543356 cli: improvements to 'snapshot verify'
* When running against direct repository, it will verify that all
  backing blobs exist based on results of listing.
* Deprecated annoying --all-sources flag which is now default if no
  sources are provided.
2020-12-21 20:02:53 -08:00
Jarek Kowalski
eecd9d13c9 actions: Added --enable-actions flag (#737)
This can be specified at `repo create` or `repo connect` to enable
actions. By default actions are disabled to avoid security risks
associated with executing code.

Alternatively during `snapshot create` one can specify
`--force-enable-actions` or `--force-disable-actions`
2020-12-21 18:05:25 -08:00
Jarek Kowalski
4f7d211f72 Added support for actions that run before&after snapshot roots and before/after specific folders (#722)
* policy: add actions
* fs: added LocalFilesystemPath() which can optionally return local filesystem
  path (if entry is local)
* cli: added support for setting policy actions
* upload: support for executing actions before/after folder (non-inheritable)
  and before/after snapshots (inheritable)
* testing: end-to-end test for actions
* additional tests for actions with embedded scripts
2020-12-21 15:53:21 -08:00
Jarek Kowalski
f279a5be69 'kopia policy set' code cleanup (#730)
* cli: split command_policy_set.go by individual areas

* cli: refactored 'policy set' implementation to reuse helpers

* use defined const instead of literal

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-18 10:55:43 -08:00
Jarek Kowalski
ad4b222939 cli: added support for copying (or moving) snapshot history (#703)
Both source and destination can be specified using user@host,
@host or user@host:/path where destination values override the
corresponding parts of the source, so both targeted
and mass copying is supported.

Supported combinations are:

Source:             Destination         Behavior
---------------------------------------------------
@host1              @host2              copy snapshots from all users of host1
user1@host1         @host2              copy all snapshots to user1@host2
user1@host1         user2@host2         copy all snapshots to user2@host2
user1@host1:/path1  @host2              copy to user1@host2:/path1
user1@host1:/path1  user2@host2         copy to user2@host2:/path1
user1@host1:/path1  user2@host2:/path2  copy snapshots from single path

When --move is specified, the matching source snapshots are also deleted.

* cli: upgraded kingpin to latest version (not tagged)

This allows using `EnableFileExpansion` to disable treating
arguments prefixed with "@" as file includes.
2020-12-04 16:34:55 -08:00
Piotr Tabor
bf729095dc Fuse: Passthrough options to allow other-users access and mounting in empty directory. (#691) 2020-10-31 21:57:27 -07:00
Erkki Seppälä
6a93e4d5b9 Added support for scanning only one filesystem via files policy (#676)
The new files policy oneFileSystem ignores files that are mounted to
other filesystems similarly to tar's --one-file-system switch. For
example, if this is enabled, backing up / should now automatically
ignore /dev, /proc, etc, so the directory entries themselves don't
appear in the backup. The value of the policy is 'false' by default.

This is implemented by adding a non-windows-field Device (of type
DeviceInfo, reflecting the implementation of Owner) to the Entry
interface. DeviceInfo holds the dev and rdev acquired with stat (same
way as with Owner), but in addition to that it also holds the same
values for the parent directory. It would seem that doing this in some
other way, ie. in ReadDir, would require modifying the ReadDir
interface which seems a too large modification for a feature this
small.

This change introduces a duplication of 'stat' call to the files, as
the Owner feature already does a separate call. I doubt the
performance implications are noticeable, though with some refactoring
both Owner and Device fields could be filled in in one go.

Filling in the field has been placed in fs/localfs/localfs.go where
entryFromChildFileInfo has acquired a third parameter giving the the
parent entry. From that information the Device of the parent is
retrieved, to be passed off to platformSpecificDeviceInfo which does
the rest of the paperwork. Other fs implementations just put in the
default values.

The Dev and Rdev fields returned by the 'stat' call have different
sizes on different platforms, but for convenience they are internally
handled the same. The conversion is done with local_fs_32bit.go and
local_fs_64bit.go which are conditionally compiled on different
platforms.

Finally the actual check of the condition is in ignorefs.go function
shouldIncludeByDevice which is analoguous to the other similarly named
functions.

Co-authored-by: Erkki Seppälä <flux@inside.org>
2020-10-14 22:45:32 -07:00
Jarek Kowalski
170bdeb944 cli: improved restore progress information (#678)
Demo: https://asciinema.org/a/8RV8SwBVt2CI0HSsklYwO8FhP
2020-10-11 11:56:46 -07:00
Jarek Kowalski
4d7f0cb6cd Fixed symlink restore behavior on macOS (#673)
* restore: use symlink-specific APIs instead of chmod, chown and chtimes

* upload: fix updating directory modtime for symlinks

* cli: plumbed through flags to restore to control new behaviors

* localfs: use Lstat() instead of Stat() in Child() method

* testing: added restore tests for new flags
2020-10-10 11:03:35 -07:00
Jarek Kowalski
9d7cf71a37 Logging flags (#674)
* logging: cleaned up stderr logging

- do not show module
- do not show timestamps by default (enable with --console-timestamps)

* logging: replaced most printStderr() with log.Info

* cli: additional logging cleanup
2020-10-10 10:48:37 -07:00
Jarek Kowalski
ec9c4d6095 restore: support for parallelization (#668) 2020-10-07 21:41:32 -07:00
Jarek Kowalski
4fd0bcf7dc cli: 'snapshot create' switched all stderr output to use logger (#663)
This allows 'kopia snap create --all --log-level=warning --no-progress'
to produce no console output unless an error is encountered.
2020-10-05 17:28:00 -07:00
Jarek Kowalski
d08c052651 cli: removed confusing default on 'snapshot list' that only lists last 100 snapshots per source (#666) 2020-10-05 16:59:06 -07:00
Jarek Kowalski
0758a92c58 restore: improved user experience (#644)
* restore: improved user experience

* 'snapshot restore' is now the same as 'restore' and both will
  support restoring by manifest ID, root ID or root ID + subdirectory

* added support for restoring individual files

* implemented PR feedback and refactored object ID parsing

Moving helpers inside the snapshot/ package helped clean up the code
a lot.
2020-09-28 22:57:24 -07:00
Jarek Kowalski
ff6a414ec5 cli: When listing directory that had errors, print error summary at the end. (#643)
Can be disabled with `--no-error-summary`.
Quick demo: https://asciinema.org/a/2rma0sx2mD6HoIPy6VL0QEFeP

Also refactored fs.Directory to provide Summary optionally.
2020-09-25 09:06:41 -07:00
Jarek Kowalski
d0d6ac4767 SFTP connectivity and docs improvements (#623)
* sftp: support for external SSH command and host verfication improvements

- removed custom parsing of hostnames and verification and replaced with
  standard 'knownhosts' implementation.

- added option to launch external SSH command which supports
  aliases, agent, etc.

NOTE, we're still not supporting any cases where password needs to be
entered on the command line, since that would be incompatible with
the UI which uses client-server model.

Fixes #500
Fixes #414

* site: updated SFTP repository connection instructions

Fixes #590
2020-09-20 11:10:13 -07:00
Jarek Kowalski
538c644586 cli: don't ask for password if repository is not connected (#627) 2020-09-19 11:45:03 -07:00
Jarek Kowalski
fce9497375 restore: support for symlinks (experimental) (#621) 2020-09-18 10:29:20 -07:00
Jarek Kowalski
9551d2495d upload: scan the directory to be uploaded in parallel to estimate the amount of data to be uploaded (#622)
This allows better progress indicator in the CLI and UI.
The percentage completed is not displayed until estimate is available.

Quick demo: https://asciinema.org/a/O7ktcWSgaGUPfJwhzc65mMWM1
2020-09-17 23:59:18 -07:00
Jarek Kowalski
9faf6b33d0 cli: fixed snapshot delete to support deleting file (not directory) snapshots by object ID (#613) 2020-09-12 22:36:47 -07:00
Jarek Kowalski
f0b97b960b Fixed checkpointing to not restart the entire upload process (#594)
* object: added Checkpoint() method to object writer

* upload: refactored code structure to allow better checkpointing

* upload: removed Checkpoint() method from UploadProgress

* Update fs/entry.go

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 22:36:22 -07:00
Jarek Kowalski
6a14ac8a2a cli: ensure advanced commands are not accidentally used (#611)
* cli: ensure advanced commands are not accidentally used

This prints an error when a dangerous command is used without
first setting KOPIA_ADVANCED_COMMANDS=enabled environment variable.

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 20:31:25 -07:00
Jarek Kowalski
faf280616a Splitter throughput improvements (#606)
* object: refactored writer to detect split points before writing

This introduces new primitive that will be moved into splitters
themselves in subsequent commits. I'm doing this in small steps to
ensure we don't regress at any time.

* splitter: refactored TestSplitters test

This is use slow (byte-by-byte) and fast (nextSplitPoint) methods of
determining split points.

Note nextSplitPoint is not implemented by splitters yet, but this
verifies that the test is expecting the right thing.

* object: splitter refactoring - replaced ShouldSplit() with NextSplitPoint() everywhere, still not optimized

* splitter: added additional dimension to splitter_test

We split either in large chunks or one byte at a time to catch
the corner cases in the splitter implementation.

* splitter: optimized splitters using NextSplitPoint primitive

This improves splitter performance by about 40% (buzhash) and makes
it virtually free for FIXED splitter.
2020-09-11 19:45:48 -07:00
Jarek Kowalski
3b87902433 Kopia UI improvements for repository management (#592)
* cli: added --tls-print-server-cert flag

This prints complete server certificate that is base64 and PEM-encoded.

It is needed for Electron to securely connect to the server outside of
the browser, since there's no way to trust certificate by fingerprint.

* server: added repo/exists API

* server: added ClientOptions to create and connect API

* server: exposed current-user API

* server: API to change description of a repository

* htmlui: refactored connect/create flow

This cleaned up the code a lot and made UX more obvious.

* kopia-ui: simplified repository management UX

Removed repository configuration window which was confusing due to
the notion of 'server'.

Now KopiaUI will automatically launch 'kopia server --ui' for each
config found in the kopia config directory and shut it down every
time repository is disconnected.

See https://youtu.be/P4Ll_LR4UVM for a quick demo.

Fixes #583
2020-09-07 08:00:19 -07:00
Jarek Kowalski
29ce1819cb Added support for setting and changing repository client options (description, read-only, hostname, username) (#589)
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config

* cli: added 'repository set-client' to configure parameters of connected repository

* cli: cleaned up 'repository status' output
2020-09-04 13:57:15 -07:00