Commit Graph

117 Commits

Author SHA1 Message Date
Jarek Kowalski
40510c043d Support for content-level compression (#1076)
* cli: added a flag to create repository with v2 index features

* content: plumb through compression.ID parameter to content.Manager.WriteContent()

* content: expose content.Manager.SupportsContentCompression

This allows object manager to decide whether to create compressed object
or let the content manager do it.

* object: if compression is requested and the repo supports it, pass compression ID to the content manager

* cli: show compression status in 'repository status'

* cli: output compression information in 'content list' and 'content stats'

* content: compression and decompression support

* content: unit tests for compression

* object: compression tests

* testing: added integration tests against v2 index

* testing: run all e2e tests with and without content-level compression

* htmlui: added UI for specifying index format on creation

* cli: additional tests for 'content ls' and 'content stats'

* applied pr suggestions
2021-05-22 05:35:27 -07:00
Jarek Kowalski
5179ad2cd2 cli: test + misc improvements (#1083)
* cli: Added --max-examples-per-bucket flag to 'kopia snapshot estimate'

Added and cleaned up a bunch of unit tests.

Fixes #1054

* cli: misc tests to increase code coverage of the cli package

* ci: move code coverage run into separate GH job
2021-05-17 21:47:11 -07:00
Jarek Kowalski
30ca3e2e6c Upgraded linter to 1.40.1 (#1072)
* tools: upgraded linter to 1.40.1

* lint: fixed nolintlint violations

* lint: disabled tagliatelle linter

* lint: fixed remaining warnings
2021-05-15 12:12:34 -07:00
Jarek Kowalski
41931f21ce repo: refactored password persistence (#1065)
* introduced passwordpersist package which has password persistence
  strategies (keyring, file, none, multiple) with possibility of adding
  more in the future.
* moved all password persistence logic out of 'repo'
* removed global variable repo.EnableKeyRing
2021-05-11 21:53:36 -07:00
Sirish Bathina
dd41296f2a Tagging of kopia snapshots and listing of snapshots by tag (#1030) 2021-04-30 06:16:19 -07:00
Jarek Kowalski
df430371b9 Refactored content.Info to be an interface and switched index parsing to be lazy (#1008) 2021-04-27 05:53:52 -07:00
Jarek Kowalski
d290c0a967 ui: do not attempt running maintenance if the current user is not the maintenance owner, to avoid producing error in the Tasks tab (#1010) 2021-04-24 12:37:06 -07:00
Jarek Kowalski
70a83b381b kopia-ui: for read-only repositories start a read-only source manager (#1009) 2021-04-24 11:02:33 -07:00
Jarek Kowalski
62fab592f0 ui: fixed Estimate not honoring the defined policies (#1002) 2021-04-21 17:26:00 -07:00
Jarek Kowalski
74f926cb0d content: added content.Info.OriginalLength (#989) 2021-04-19 19:44:10 -07:00
Jarek Kowalski
2062c07259 mechanical field renames (#988)
* content: mechanical rename content.Info.Length -> content.Info.PackedLength
* server: renamed grpc API ContentInfo.length->packed_length (non-breaking)
2021-04-16 22:42:32 -07:00
Jarek Kowalski
f4347886b8 logging: simplified log levels (#954)
Removed Warning, Notify and Fatal:

* `Warning` => `Error` or `Info`
* `Notify` => `Info`
* `Fatal` was never used.

Note that --log-level=warning is still supported for backwards
compatibility, but it is the same as --log-level=error.

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-04-09 07:27:35 -07:00
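The backwards-compatible handling of `--log-level=warning` could be sketched as a small parsing function. The remaining level names (`debug`, `info`, `error`) and the `parseLevel` helper are assumptions made for this example; the commit only states that `warning` is treated the same as `error`.

```go
package main

import (
	"fmt"
	"strings"
)

// level is a hypothetical log level type for this sketch.
type level int

const (
	levelDebug level = iota
	levelInfo
	levelError
)

// parseLevel maps a flag value to a level, accepting the legacy
// "warning" value as an alias for "error".
func parseLevel(s string) (level, error) {
	switch strings.ToLower(s) {
	case "debug":
		return levelDebug, nil
	case "info":
		return levelInfo, nil
	case "error", "warning":
		return levelError, nil
	default:
		return 0, fmt.Errorf("unknown log level %q", s)
	}
}

func main() {
	for _, s := range []string{"info", "warning", "error", "notify"} {
		l, err := parseLevel(s)
		fmt.Println(s, l, err)
	}
}
```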
Jarek Kowalski
b8c3ae378b testing: replaced locally-defined must() with require.NoError() (#942) 2021-04-05 09:57:50 -07:00
Jarek Kowalski
d07eb9f300 cli: added --safety=full|none flag to maintenance commands (#912)
* cli: added --safety=full|none flag to maintenance commands

This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.

This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with single `--safety` option.

* 'blob gc'
* 'content rewrite'
* 'snapshot gc'

* pr renames

* maintenance: fixed computation of safe time for --safety=none

* maintenance: improved logging for blob gc

* maintenance: do not rewrite truly short, densely packed packs

* mechanical: pass eventual consistency settle time via CompactOptions

* maintenance: add option to disable eventual consistency time buffers with --safety=none

* maintenance: trigger flush at the end of snapshot gc

* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs

* testing: allow debugging of integration tests inside VSCode

* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
2021-04-02 21:56:01 -07:00
Jarek Kowalski
9a128ffb9f filesystem: support ~ in repository path, require absolute paths (#922)
Fixes #918
2021-04-02 21:55:24 -07:00
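One way to expand `~` and then insist on an absolute path is sketched below. The helper name `resolveRepoPath` and the error text are hypothetical, and the sketch only handles the Unix-style `~` and `~/` forms; how Kopia treats other forms is not stated in the commit.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// resolveRepoPath expands a leading "~" using the given home directory
// and rejects paths that are still relative afterwards.
func resolveRepoPath(p, home string) (string, error) {
	switch {
	case p == "~":
		p = home
	case strings.HasPrefix(p, "~/"):
		p = filepath.Join(home, p[2:])
	}
	if !filepath.IsAbs(p) {
		return "", fmt.Errorf("repository path must be absolute: %q", p)
	}
	return filepath.Clean(p), nil
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("cannot determine home directory:", err)
		return
	}
	fmt.Println(resolveRepoPath("~/kopia-repo", home))
	fmt.Println(resolveRepoPath("relative/repo", home))
}
```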
Jarek Kowalski
9a756c719f Enabled race detector in CI, fixed a few data races (#919)
* content: fixed data race in IterateUnreferencedBlobs

* upload: fixed data race between uploader and estimator

* testing: fixed data race in repo/blob/logging test

* makefile: run tests on CI/linux/amd64 with -race

* robustness: fixed test race

* content: fixed data race in getContentDataUnlocked that triggers TestParallelWrites - looks scary but in practice very hard to trigger in real life and does not cause data corruption

* testing: reduce test complexity under race detector

* server: fixed minor race in refreshStatus()

* testing: reduced depth of sharedTestDataDir2

* ci: run race detector in separate job

* ci: run unit test race detector in parallel to integration tests
2021-04-02 18:21:04 -07:00
Jarek Kowalski
2c2c9d52e0 nit: refactored repetitive repotesting setup code (#916) 2021-03-29 14:52:14 -07:00
Jarek Kowalski
cbcd59f18e Added repository user authorization support + server flag refactoring + refresh (#890)
* nit: replaced hardcoded string constants with named constants

* acl: added management of ACL entries

* auth: implemented DefaultAuthorizer which uses ACLs if any entries are found in the system and falls back to LegacyAuthorizer if not

* cli: switch to DefaultAuthorizer when starting server

* cli: added ACL management

* server: refactored authenticator + added refresh

Authenticator is now an interface which also supports Refresh.

* authz: refactored authorizer to be an interface + added Refresh()

* server: refresh authentication and authorizer

* e2e tests for ACLs

* server: handling of SIGHUP to refresh authn/authz caches

* server: reorganized flags to specify auth options:

- removed '--allow-repository-users' - it's always on
- one of --without-password, --server-password or --random-password
  can be specified to specify password for the UI user
- htpasswd-file - can be specified to provide password for UI or remote
  users

* cli: moved 'kopia user' to 'kopia server user'

* server: allow all UI actions if no authenticator is set

* acl: removed priority until we have a better understood use case for it

* acl: added validation of allowed labels when adding ACL entries

* site: added docs for ACLs
2021-03-18 23:03:27 -07:00
Jarek Kowalski
4efb06849e server: ensure we reject access to the UI static files for users other than the UI user (#884)
This is for a scenario where a user provides valid username/password
but such that the username is not authorized to access the UI.

Previously we'd make it look like they got access (because they can
see the UI at least partially), but all API calls would fail.

With this change we're failing early with HTTP 403 and pointing the
users at a GH issue explaining what to do.

Fixes #580.
2021-03-13 09:58:27 -08:00
Jarek Kowalski
132e2eef50 New snapshot UX - streamlined snapshot creation and policy setting (#878)
* uitask: added support for reporting string progress info

* server: report current directory as task progress

* snapshot: created reusable Estimate() method to be used during upload, cli estimate and via API

* cli: switched to snapshotfs.Estimate()

* server: added API to estimate snapshot size

* kopia-ui: fixed directory selector

* htmlui: streamlined new snapshot flow and cleaned up policy setting

See https://youtu.be/8p6csuoB3kg
2021-03-10 23:04:55 -08:00
Jarek Kowalski
689ed0a851 server: refactored authentication and authorization (#871)
This formalizes the concept of a 'UI user' which is a local
user that can call APIs the same way that UI does it.

The server will now allow access to:

- UI user (identified using `--server-username` with password specified
  using `--server-password` or `--random-password`)
- remote users with usernames/passwords specified in `--htpasswd-file`
- remote users defined in the repository using `kopia users add`
  when `--allow-repository-users` is passed.

The UI user only has access to methods specifically designated as such
(normally APIs used by the UI + few special ones such as 'shutdown').

Remote users (identified via `user@host`) don't get access to UI APIs.

There are some APIs that can be accessed by any authenticated
caller (UI or remote):

- /api/v1/flush
- /api/v1/repo/status
- /api/v1/repo/sync
- /api/v1/repo/parameters

To make this easier to understand in code, refactored server handlers
to require specifying what kind of authorization is required
at registration time.
2021-03-08 22:25:22 -08:00
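Declaring the required authorization when a handler is registered could be sketched as follows. Everything here is illustrative: the `authKind` values, the `server.handle` method, the basic-auth password map and the `/api/v1/shutdown` path are assumptions, not Kopia's actual server code. Only `/api/v1/repo/status` and the UI-user versus remote-user split are taken from the commit message.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// authKind says who may call a handler; it is chosen at registration time.
type authKind int

const (
	requireUIUser        authKind = iota // only the local UI user
	anyAuthenticatedUser                 // UI user or any remote user
)

type caller struct {
	name string
	isUI bool
}

type server struct {
	mux          *http.ServeMux
	authenticate func(r *http.Request) (caller, bool)
}

// handle registers h and wraps it with the authorization check for kind.
func (s *server) handle(path string, kind authKind, h http.HandlerFunc) {
	s.mux.HandleFunc(path, func(w http.ResponseWriter, r *http.Request) {
		c, ok := s.authenticate(r)
		if !ok {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		if kind == requireUIUser && !c.isUI {
			http.Error(w, "forbidden", http.StatusForbidden)
			return
		}
		h(w, r)
	})
}

// newServer builds a toy server that authenticates with HTTP basic auth.
func newServer(uiUser string, passwords map[string]string) *server {
	s := &server{mux: http.NewServeMux()}
	s.authenticate = func(r *http.Request) (caller, bool) {
		u, p, ok := r.BasicAuth()
		want, known := passwords[u]
		if !ok || !known || subtle.ConstantTimeCompare([]byte(want), []byte(p)) != 1 {
			return caller{}, false
		}
		return caller{name: u, isUI: u == uiUser}, true
	}
	ok := func(w http.ResponseWriter, r *http.Request) { fmt.Fprintln(w, "ok") }
	s.handle("/api/v1/repo/status", anyAuthenticatedUser, ok)
	s.handle("/api/v1/shutdown", requireUIUser, ok) // hypothetical path
	return s
}

// status performs an in-memory request and returns the HTTP status code.
func status(s *server, path, user, pass string) int {
	req := httptest.NewRequest(http.MethodGet, path, nil)
	req.SetBasicAuth(user, pass)
	rec := httptest.NewRecorder()
	s.mux.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	s := newServer("ui-user", map[string]string{"ui-user": "ui-pass", "alice@laptop": "remote-pass"})
	fmt.Println(status(s, "/api/v1/repo/status", "alice@laptop", "remote-pass"))
	fmt.Println(status(s, "/api/v1/shutdown", "alice@laptop", "remote-pass"))
}
```

Putting the check in the registration helper means a handler cannot be added without someone deciding who is allowed to call it.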
Jarek Kowalski
1f1465f4ba Improvements and cleanups for connecting to kopia server (#870)
* repo: refactored connect code to set up cache for server repositories

- improved logic to close the cache on last connection
- preemptively add all contents with a prefix to the cache
- refactored how config is loaded and saved

Now cache dir will be stored as relative and resolved to absolute as
part of loading and saving the file, in all other places cache dir
is expected to be absolute.

* server: removed cache directory from the API and UI

This won't be easily available and does not seem useful to expose
anyway.

* cli: enabled cache commands for server repositories

* cli: added KOPIA_CACHE_DIRECTORY environment variable

This is used on two occasions - when setting up connection (it gets
persisted in the config) and later when opening (to override the
cache location from config). It makes setting up docker container with
mounted cache somewhat easier with one environment variable.

* cli: show cache size for the server cache

* tls: present more helpful error message that includes SHA256 fingerprint of the TLS server on mismatch

* server: return the name of user who attempted to login when authentication fails
2021-03-07 11:25:21 -08:00
Jarek Kowalski
9620b57e35 server: avoid password hashing by using short-lived JWT tokens (#857)
Tokens encode the authenticated user, last for 1 minute and are signed
with HMAC-SHA-256. This improves HTTP server performance by a lot:

BEFORE: 168383 files (6.4 GB) - 3m38s
AFTER: 168383 files (6.4 GB) - 1m37s
2021-03-01 06:17:06 -08:00
Jarek Kowalski
4e705726fe Implemented caching for server connections (#845)
* cache: refactored reusable portion of cache into separate package

* repo: plumbed through caching for remote repository clients

* repo: plumb through cache in the unit tests

* cache: ensure we only allow absolute cache paths, fixed cache path resolution for remote repositories
2021-03-01 06:15:39 -08:00
Julio López
7bafe51dcc Replace go-bindata with //go:embed (#844)
* Replace htmlui_fallback.go with go:embed
* Replace go-bindata generated UI with go:embed
* Update site Go version to 1.16
* Update BUILD.md to reflect workflow with go:embed
2021-02-23 01:09:40 -08:00
Jarek Kowalski
e2b9a81ac3 Major CI/CD refactoring and re-added support for ARM/ARM64 runners (#849)
* ci: refactored CI/CD logic & Makefile

- removed all travis CI emulation environment variables and replaced with:

CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true

- refactored all OS and architecture-specific decisions to use standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel

* testing: fixed tests on ARM and ARM64

- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
2021-02-23 00:52:54 -08:00
Jarek Kowalski
23273af1cd snapshot: reworked error handling and added fail-fast option (#840)
Fixes #690

This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.

Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.

The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.

After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only a partial set of files.

In case of any errors, the manifest (and each directory object) will
list the number of failures and no more than 10 examples of failed
files/directories along with their respective errors.

Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.

With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
2021-02-17 10:29:01 -08:00
Jarek Kowalski
fe9ebfb671 server: test flake fix (#839)
Addresses https://github.com/kopia/kopia/runs/1915273219?check_suite_focus=true

Verified by testing 100 times.
2021-02-16 19:40:50 -08:00
Jarek Kowalski
675bf4e033 Removed manifest manager refresh + server improvements (#835)
* manifest: removed explicit refresh

Instead, content manager is exposing a revision counter that changes
on each mutation or index change. Manifest manager will be invalidated
whenever this is encountered.

* server: refactored initialization API

* server: added unit tests for repository server APIs (HTTP and REST)

* server: ensure we don't upload contents that already exist

This saves bandwidth, since the client can compute hash locally
and ask the server whether the object exists before starting the upload.
2021-02-15 23:55:58 -08:00
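The revision-counter approach to invalidation mentioned above could be sketched like this: the store bumps a counter on every mutation, and the cache reloads only when the counter differs from the value it saw last. `contentStore` and `manifestCache` are hypothetical stand-ins for the content and manifest managers, not their real APIs.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// contentStore stands in for the content manager; every mutation bumps revision.
type contentStore struct {
	mu       sync.Mutex
	revision int64
	items    map[string]string
}

func newContentStore() *contentStore { return &contentStore{items: map[string]string{}} }

func (s *contentStore) Put(id, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[id] = value
	s.revision++
}

func (s *contentStore) Revision() int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.revision
}

// snapshot returns the current revision together with the sorted item IDs.
func (s *contentStore) snapshot() (int64, []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	ids := make([]string, 0, len(s.items))
	for id := range s.items {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return s.revision, ids
}

// manifestCache reloads from the store only when the revision has changed,
// so no explicit Refresh() call is needed.
type manifestCache struct {
	store          *contentStore
	loaded         bool
	loadedRevision int64
	ids            []string
	loads          int // number of reloads, kept for demonstration
}

func (c *manifestCache) IDs() []string {
	if !c.loaded || c.store.Revision() != c.loadedRevision {
		c.loadedRevision, c.ids = c.store.snapshot()
		c.loaded = true
		c.loads++
	}
	return c.ids
}

func main() {
	s := newContentStore()
	c := &manifestCache{store: s}
	s.Put("m1", "first")
	fmt.Println(c.IDs(), c.loads)
	fmt.Println(c.IDs(), c.loads) // served from cache, no reload
	s.Put("m2", "second")
	fmt.Println(c.IDs(), c.loads) // revision changed, reloaded
}
```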
Jarek Kowalski
de840547e6 Improved upload reporting (#832)
* blob: refactored upload reporting

Instead of plumbing this through blob storage context, we are passing
an explicit callback that reports uploads as they happen.

* htmlui: improved counter presentation

* nit: added missing UI route which fixes Reload behavior on the Tasks page
2021-02-13 10:51:11 -08:00
Jarek Kowalski
504238df7a Restore UI (#823)
* server: added restore api
* htmlui: restore UI
2021-02-11 02:08:47 -08:00
Jarek Kowalski
dc3d9f53a3 Added support for Tasks in the UI (#818)
* uitask: added package for managing and introspection into tasks running inside the process

* server: added API for getting details of tasks running inside the server

* htmlui: added new tab called 'Tasks'

This allows access to progress, logs and cancelation for long-running
tasks (Snapshots, Maintenance, and in the future Restore, Estimate,
Verify)

* snapshot: improve counters returned from the upload
2021-02-08 17:54:39 -08:00
Jarek Kowalski
646c325826 Implemented new streaming GRPC protocol for Kopia Repository Server (#789)
* grpcapi: added GRPC API for the repository server

* repo: added transparent retries to GRPC repository client

Normally GRPC reconnects automatically, which can survive server
restarts (minus transient errors).

In our case we're establishing a stream which will be broken and
needs to be restarted after io.EOF is detected.

It is safe to do transparent retries for read-only (repo.Repository),
but not safe for write sessions (repo.RepositoryWriter), because the
session may re-connect to different server that won't have the buffered
content write available in memory.
2021-01-28 05:15:12 -08:00
Jarek Kowalski
5912247f29 server: reworked authn/authz (#788)
* server: reworked authn/authz

Previously authentication was done as a wrapper handler and
authorization was inlined. This change moves authn/authz handlers
inside the server and implements separate authorization module that's
individually tested.

Also fixed an issue where server users were not able to see global
or host-level policies.

* PR feedback
2021-01-21 07:31:34 -08:00
Jarek Kowalski
fa7976599c repo: refactored repository interfaces (#780)
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`

- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
  interface

Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.

`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).

* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session

* cli: disable maintenance in 'kopia server start'
  Server will close the repository before completing.

* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
  - used atomic to manage SharedManager.closed

* removed stale example
* linter: fixed spurious failures

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-01-20 11:41:47 -08:00
Jarek Kowalski
4c8b9291e1 content: refactoring of Manager (#785)
- renamed content.Manager to content.WriteManager
- merged lockFreeManager and CommittedReadManager into SharedManager
- also reassigned some methods to SharedManager (no code move)
2021-01-14 23:13:57 -08:00
Jarek Kowalski
73a34ff7ff trivial: move CachingOptions out of content.Manager, where it's not needed (#775)
* trivial: move CachingOptions out of content.Manager, where it's not needed

* trivial: removed newManagerWithOptions which was the same as NewManager

also moved one-time initialization to newReadManager()
2021-01-07 18:51:15 -08:00
Jarek Kowalski
5e8e175cfa repo: refactored read/write methods of repo.Repository (#749)
Reader methods go to repo.Reader and write methods go to repo.Writer
Switched usage to new interfaces based on linter errors.
2021-01-04 21:33:12 -08:00
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
Piotr Tabor
bf729095dc Fuse: Passthrough options to allow other-users access and mounting in empty directory. (#691) 2020-10-31 21:57:27 -07:00
Jarek Kowalski
f66fe5789e Eliminated busy loop after snapshot failure (#658)
* server: if a snapshot fails, don't start the next one for 5 minutes or until the next successful refresh.

* Makefile: don't print skipped tests
2020-10-02 19:48:21 -07:00
Jarek Kowalski
0758a92c58 restore: improved user experience (#644)
* restore: improved user experience

* 'snapshot restore' is now the same as 'restore' and both will
  support restoring by manifest ID, root ID or root ID + subdirectory

* added support for restoring individual files

* implemented PR feedback and refactored object ID parsing

Moving helpers inside the snapshot/ package helped clean up the code
a lot.
2020-09-28 22:57:24 -07:00
Jarek Kowalski
c9c8d27c8d Repro and fix for zero-sized snapshot bug (#641)
* server: repro for zero-sized snapshot bug

As described in https://kopia.discourse.group/t/kopia-0-7-0-not-backing-up-any-files-repro-needed/136/5

* server: fixed zero-sized snapshots after repository is connected via API

The root cause was that source manager was inheriting HTTP call context
which was immediately closed after the 'connect' RPC returned thus
silently killing all uploads.
2020-09-23 20:15:36 -07:00
Jarek Kowalski
3b87902433 Kopia UI improvements for repository management (#592)
* cli: added --tls-print-server-cert flag

This prints complete server certificate that is base64 and PEM-encoded.

It is needed for Electron to securely connect to the server outside of
the browser, since there's no way to trust certificate by fingerprint.

* server: added repo/exists API

* server: added ClientOptions to create and connect API

* server: exposed current-user API

* server: API to change description of a repository

* htmlui: refactored connect/create flow

This cleaned up the code a lot and made UX more obvious.

* kopia-ui: simplified repository management UX

Removed repository configuration window which was confusing due to
the notion of 'server'.

Now KopiaUI will automatically launch 'kopia server --ui' for each
config found in the kopia config directory and shut it down every
time repository is disconnected.

See https://youtu.be/P4Ll_LR4UVM for a quick demo.

Fixes #583
2020-09-07 08:00:19 -07:00
Jarek Kowalski
29ce1819cb Added support for setting and changing repository client options (description, read-only, hostname, username) (#589)
* repo: refactored client-specific options (hostname,username,description,readonly) into new struct that is JSON-compatible with current config

* cli: added 'repository set-client' to configure parameters of connected repository

* cli: cleaned up 'repository status' output
2020-09-04 13:57:15 -07:00
Jarek Kowalski
a5838ff34c Improvements to UX for mounting directories (both CLI and KopiaUI) (#573)
* cli: simplified mount command

See https://youtu.be/1Nt_HIl-NWQ

It will always use WebDAV on Windows and FUSE on Unix. Removed
confusing options.

New usage:

$ kopia mount [--browse]
    Mounts all snapshots in a temporary filesystem directory
    (both Unix and Windows).

$ kopia mount <object> [--browse]
    Mounts given object in a temporary filesystem directory
    (both Unix and Windows).

$ kopia mount <object> z: [--browse]
    Mounts given object as a given drive letter in Windows (using
    temporary WebDAV mount).

$ kopia mount <object> * [--browse]
    Mounts given object as a random drive letter in Windows.

$ kopia mount <object> /mount/path [--browse]
    Mounts given object in given path in Unix.

<object> can be the ID of a directory 'k<hash>' or 'all'

Optional --browse automatically opens OS-native file browser.

* htmlui: added UI for mounting directories

See https://youtu.be/T-9SshVa1d8 for a quick demo.

Also replaced some UI text with icons.

* lint: windows-specific fix
2020-09-03 17:46:48 -07:00
Jarek Kowalski
1a8fcb086c Added endurance test which tests kopia over long time scale (#558)
Globally replaced all use of time with internal 'clock' package
which provides indirection to time.Now()

Added support for faking clock in Kopia via KOPIA_FAKE_CLOCK_ENDPOINT

logfile: squelch annoying log message

testenv: added faketimeserver which serves time over HTTP

testing: added endurance test which tests kopia over long time scale

This creates kopia repository and simulates usage of Kopia over multiple
months (using accelerated fake time) to trigger effects that are only
visible after long time passage (maintenance, compactions, expirations).

The test is not used as part of any test suite yet but will run in
post-submit mode only, preferably 24/7.

testing: refactored internal/clock to only support injection when
'testing' build tag is present
2020-08-26 23:03:46 -07:00
Jarek Kowalski
48f253173b kopia-ui: added ability to connect to kopia server and few other minor tweaks (#546)
* kopia-ui: added ability to connect to kopia server

* kopia-ui: update status page to show some data for repositories connected to API server

* kopia-ui: hide user@host selection dropdown for kopia server repositories
2020-08-16 17:57:37 -07:00
Jarek Kowalski
27ec5c70a9 server: pre-read request body to fix HTTP/2 deadlock (#539)
Fixes #538 (hopefully)
2020-08-15 21:53:46 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00