Commit Graph

140 Commits

Author SHA1 Message Date
Jarek Kowalski
dc964bee43 ui: Policy Editor - show effective value and definition point for policy fields (#1545)
* policy: resolve API for policy editor

* htmlui: enhanced Policy Editor UI to preview effective values
2021-11-30 21:40:41 -08:00
Jarek Kowalski
93930d20cb policy: revamped policy merge mechanism (#1538)
Added policy.Definition which allows us to precisely report where
each piece of policy came from.

Fixed a one-off bug with "noParent", which prevented merging of parent
policies one level too soon.

Added a whole bunch of merging helpers and generic reflection-based
test that ensures every single merge is tested.
2021-11-27 18:14:45 -08:00
Jarek Kowalski
a5d689eb36 ui: Added test to verify #1057 (#1526) 2021-11-20 11:43:03 -08:00
Jarek Kowalski
62edab618f throtting: implemented a Throttler based on token bucket and configur… (#1512)
* throtting: implemented a Throttler based on token bucket and configurable window.

* cli: rewired throttle options to use common Limits structure and helpers

The JSON is backwards compatible.

* blob: remove explicit throttling from gcs,s3,b2 & azure

* cleanup: removed internal/throttle

* repo: add throttling wrapper around storage at the repository level

* throttling: expose APIs to get limits and add validation

* server: expose API to get/set throttle in a running server

* pr feedback
2021-11-16 07:39:26 -08:00
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).

* fixed compat tests
2021-11-16 06:02:04 -08:00
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Jarek Kowalski
e41c53b01b server: ensure all HTTP requests are processed in a detached context (#1495) 2021-11-06 17:35:57 -07:00
Jarek Kowalski
03def8f33a server: maintenance in newly-created repo (#1494)
The issue in #1439 was caused by goroutine context being associated
with the HTTP request so it became canceled soon after the request was
over, thus the goroutine to run maintenance never ran.

Fixed by adding ctxutil.Detach()

Also fixed logging by passing top-level contexts to requests
and added --log-server-requests flag to `server start` which enables
request logging.
2021-11-06 17:10:53 -07:00
Jarek Kowalski
0b737c170d maintenance: improved scheduling (#1493)
Instead of attempting maintenance every 10 minutes we will do a longer
sleep until the predicted next maintenance time (or 4 hours, whichever
is shorter).

Related #1439
2021-11-06 16:44:27 -07:00
Jarek Kowalski
0d0f48a7ee clock: discard monotonic clock component in clock.Now() (#1437)
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md

The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.

Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.

The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other compontents in Kopia) rely upon.

Fixes #1402

Co-authored-by: Julio Lopez <julio+gh@kasten.io>
2021-10-22 15:35:09 -07:00
Jarek Kowalski
fa2a931891 ui: fixed the refresh button in source list (#1428) 2021-10-20 06:58:37 -07:00
Jarek Kowalski
789a739a5b server: fixed next snapshot time computation (#1418)
Fixes #1405

Moved logic to SchedulingPolicy, added unit tests.
2021-10-18 22:55:11 -07:00
Jarek Kowalski
8b760b66a8 logging: added memoization of Logger instances per context (#1369) 2021-10-09 05:02:18 -07:00
Eng Zer Jun
73e492c9db refactor: move from io/ioutil to io and os package (#1360)
* refactor: move from io/ioutil to io and os package

The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* chore: remove //nolint:gosec for os.ReadFile

At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-06 08:39:10 -07:00
Jarek Kowalski
6b7c03bc97 deps: upgrade to github.com/golang-jwt/jwt/v4@4.1.0 and fixed linter (#1342) 2021-10-02 10:07:46 -07:00
Jarek Kowalski
792cc874dc repo: allow reusing of object writer buffers (#1315)
This reduces memory consumption and speeds up backups.

1. Backing up kopia repository (3.5 GB files:133102 dirs:20074):

before: 25s, 490 MB
after: 21s, 445 MB

2. Large files (14.8 GB, 76 files)

before: 30s, 597 MB
after: 28s, 495 MB

All tests repeated 5 times for clean local filesystem repo.
2021-09-25 14:54:31 -07:00
Jarek Kowalski
928150fe6b linter: upgrade to 1.42.1 (#1292) 2021-09-14 19:11:39 -07:00
Jarek Kowalski
7e68d8e4c1 Consolidated format version flags (#1284) 2021-09-08 18:44:03 -07:00
Jarek Kowalski
35d0f31c0d huge: replaced the use of allocated byte slices with populating gather.WriteBuffer in the repository (#1244)
This helps recycle buffers more efficiently during snapshots.
Also, improved memory tracking, enabled profiling flags and added pprof
by default.
2021-08-20 08:45:10 -07:00
Jarek Kowalski
4aacad25f5 server: switched to github.com/golang-jwt/jwt/v4 to fix upstream security issue (#1235) 2021-08-07 09:18:29 -07:00
Jarek Kowalski
1ef3d243a0 repo: big performance improvement for WriteContent with repo server (#1182)
* repo: big performance improvement for WriteContent with repo server

When re-uploading previously snapshotted directory we fetch directory
content `k<hash>` and very frequently end up writing the exact same
content. By caching last N content IDs we can avoid costly round-trip
to the server since we know that content ID was present in the session.

Also added small number of asynchronous writes, which also helps with
upload performance. Background writes are awaited before Flush().

Performance when snapshotting lots of small files (source code):

31.9 GB files:471205 dirs:75817, warm cache
Before: 260s
After: 55s (4-5x faster)

* fixed tests
2021-07-09 22:39:04 -07:00
Jarek Kowalski
9e059a1277 upgraded linter to 1.41.0 (#1144) 2021-06-16 19:44:55 -07:00
Jarek Kowalski
4b251bdaac mechanical: added ctx parameter to repo.{Direct}WriteSession callback (#1114) 2021-06-02 23:12:30 -07:00
Jarek Kowalski
40510c043d Support for content-level compression (#1076)
* cli: added a flag to create repository with v2 index features

* content: plumb through compression.ID parameter to content.Manager.WriteContent()

* content: expose content.Manager.SupportsContentCompression

This allows object manager to decide whether to create compressed object
or let the content manager do it.

* object: if compression is requested and the repo supports it, pass compression ID to the content manager

* cli: show compression status in 'repository status'

* cli: output compression information in 'content list' and 'content stats'

* content: compression and decompression support

* content: unit tests for compression

* object: compression tests

* testing: added integration tests against v2 index

* testing: run all e2e tests with and without content-level compression

* htmlui: added UI for specifying index format on creation

* cli: additional tests for 'content ls' and 'content stats'

* applied pr suggestions
2021-05-22 05:35:27 -07:00
Jarek Kowalski
5179ad2cd2 cli: test + misc improvements (#1083)
* cli: Added --max-examples-per-bucket flag to 'kopia snapshot estimate'

Added and cleaned up a bunch of unit tests.

Fixes #1054

* cli: misc tests to increase code coverage of the cli package

* ci: move code coverage run into separate GH job
2021-05-17 21:47:11 -07:00
Jarek Kowalski
30ca3e2e6c Upgraded linter to 1.40.1 (#1072)
* tools: upgraded linter to 1.40.1

* lint: fixed nolintlint vionlations

* lint: disabled tagliatele linter

* lint: fixed remaining warnings
2021-05-15 12:12:34 -07:00
Jarek Kowalski
41931f21ce repo: refactored password persistence (#1065)
* introduced passwordpersist package which has password persistence
  strategies (keyring, file, none, multiple) with possibility of adding
  more in the future.
* moved all password persistence logic out of 'repo'
* removed global variable repo.EnableKeyRing
2021-05-11 21:53:36 -07:00
Sirish Bathina
dd41296f2a Tagging of kopia snapshots and listing of snapshots by tag (#1030) 2021-04-30 06:16:19 -07:00
Jarek Kowalski
df430371b9 Refactored content.Info to be an interface and switched index parsing to be lazy (#1008) 2021-04-27 05:53:52 -07:00
Jarek Kowalski
d290c0a967 ui: do not attempt running maintenance if the current user is not the maintenance owner, to avoid producing error in the Tasks tab (#1010) 2021-04-24 12:37:06 -07:00
Jarek Kowalski
70a83b381b kopia-ui: for read-only repositories start a read-only source manager (#1009) 2021-04-24 11:02:33 -07:00
Jarek Kowalski
62fab592f0 ui: fixed Estimate not honoring the defined policies (#1002) 2021-04-21 17:26:00 -07:00
Jarek Kowalski
74f926cb0d content: added content.Info.OriginalLength (#989) 2021-04-19 19:44:10 -07:00
Jarek Kowalski
2062c07259 mechanical field renames (#988)
* content: mechanical rename content.Info.Length -> content.Info.PackedLength
* server: renamed grpc API ContentInfo.length->packed_length (non-breaking)
2021-04-16 22:42:32 -07:00
Jarek Kowalski
f4347886b8 logging: simplified log levels (#954)
Removed Warning, Notify and Fatal:

* `Warning` => `Error` or `Info`
* `Notify` => `Info`
* `Fatal` was never used.

Note that --log-level=warning is still supported for backwards
compatibility, but it is the same as --log-level=error.

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-04-09 07:27:35 -07:00
Jarek Kowalski
b8c3ae378b testing: replaced locally-defined must() with require.NoError() (#942) 2021-04-05 09:57:50 -07:00
Jarek Kowalski
d07eb9f300 cli: added --safety=full|none flag to maintenance commands (#912)
* cli: added --safety=full|none flag to maintenance commands

This allows selection between safe, high-latency maintenance parameters
which allow concurrent access (`full`) or low-latency which may be
unsafe in certain situations when concurrent Kopia processes are
running.

This is a breaking change for advanced CLI commands, where it removes
timing parameters and replaces them with single `--safety` option.

* 'blob gc'
* 'content rewrite'
* 'snapshot gc'

* pr renames

* maintenance: fixed computation of safe time for --safety=none

* maintenance: improved logging for blob gc

* maintenance: do not rewrite truly short, densely packed packs

* mechanical: pass eventual consistency settle time via CompactOptions

* maintenance: add option to disable eventual consistency time buffers with --safety=none

* maintenance: trigger flush at the end of snapshot gc

* maintenance: reload indexes after compaction that drops deleted entries, this allows single-pass maintenance with --safety=none to delete all unused blobs

* testing: allow debugging of integration tests inside VSCode

* testing: added end-to-end maintenance test that verifies that full maintenance with --safety=none removes all data
2021-04-02 21:56:01 -07:00
Jarek Kowalski
9a128ffb9f filesystem: support ~ in repository path, require absolute paths (#922)
Fixes #918
2021-04-02 21:55:24 -07:00
Jarek Kowalski
9a756c719f Enabled race detector in CI, fixed a few data races (#919)
* content: fixed data race in IterateUnreferencedBlobs

* upload: fixed data race between uploader and estimator

* testing: fixed data race in repo/blob/logging test

* makefile: run tests on CI/linux/amd64 with -race

* robustness: fixed test race

* content: fixed data race getContentDataUnlocked that triggers TestParallelWrites - looks scary but in practice very hard to trigger in real life and does not cause data corruption

* testing: reduce test complexity under race detector

* server: fixed minor race in refreshStatus()

* testing: reduced depth of sharedTestDataDir2

* ci: run race detector in separate job

* ci: run unit test race detector in parallel to integration tests
2021-04-02 18:21:04 -07:00
Jarek Kowalski
2c2c9d52e0 nit: refactored repetitive reportesting setup code (#916) 2021-03-29 14:52:14 -07:00
Jarek Kowalski
cbcd59f18e Added repository user authorization support + server flag refactoring + refresh (#890)
* nit: replaced harcoded string constants with named constants

* acl: added management of ACL entries

* auth: implemented DefaultAuthorizer which uses ACLs if any entries are found in the system and falls back to LegacyAuthorizer if not

* cli: switch to DefaultAuthorizer when starting server

* cli: added ACL management

* server: refactored authenticator + added refresh

Authenticator is now an interface which also supports Refresh.

* authz: refactored authorizer to be an interface + added Refresh()

* server: refresh authentication and authorizer

* e2e tests for ACLs

* server: handling of SIGHUP to refresh authn/authz caches

* server: reorganized flags to specify auth options:

- removed '--allow-repository-users' - it's always on
- one of --without-password, --server-password or --random-password
  can be specified to specify password for the UI user
- htpasswd-file - can be specified to provide password for UI or remote
  users

* cli: moved 'kopia user' to 'kopia server user'

* server: allow all UI actions if no authenticator is set

* acl: removed priority until we have a better understood use case for it

* acl: added validation of allowed labels when adding ACL entries

* site: added docs for ACLs
2021-03-18 23:03:27 -07:00
Jarek Kowalski
4efb06849e server: ensure we reject access to the UI static files for users other than the UI user (#884)
This is for a scenario where a user provides valid username/password
but such that the username is not authorized to access the UI.

Previously we'd make it look like they got access (because they can
see the UI at leaast partially), but all API calls would fail.

With this change we're failing early with HTTP 403 and pointing the
users at a GH issue explaining what to do.

Fixes #580.
2021-03-13 09:58:27 -08:00
Jarek Kowalski
132e2eef50 New snapshot UX - streamlined snapshot creation and policy setting (#878)
* uitask: added support for reporting string progress info

* server: report current directory as task progress

* snapshot: created reusable Estimate() method to be used during upload, cli estimate and via API

* cli: switched to snapshotfs.Estimate()

* server: added API to estimate snapshot size

* kopia-ui: fixed directory selector

* htmlui: streamlined new snapshot flow and cleaned up policy setting

See https://youtu.be/8p6csuoB3kg
2021-03-10 23:04:55 -08:00
Jarek Kowalski
689ed0a851 server: refactored authentication and authorization (#871)
This formalizes the concept of a 'UI user' which is a local
user that can call APIs the same way that UI does it.

The server will now allow access to:

- UI user (identified using `--server-username` with password specified
  using `--server-password' or `--random-password`)
- remote users with usersnames/passwords specified in `--htpasswd-file`
- remote users defined in the repository using `kopia users add`
  when `--allow-repository-users` is passed.

The UI user only has access to methods specifically designated as such
(normally APIs used by the UI + few special ones such as 'shutdown').

Remote users (identified via `user@host`) don't get access to UI APIs.

There are some APIs that can be accessed by any authenticated
caller (UI or remote):

- /api/v1/flush
- /api/v1/repo/status
- /api/v1/repo/sync
- /api/v1/repo/parameters

To make this easier to understand in code, refactored server handlers
to require specifing what kind of authorization is required
at registration time.
2021-03-08 22:25:22 -08:00
Jarek Kowalski
1f1465f4ba Improvements and cleanups for connecting to kopia server (#870)
* repo: refactored connect code set up cache for server repositories

- improved logic to close the cache on last connection
- preemptively add all contents with a prefix to the cache
- refactored how config is loaded and saved

Now cache dir will be stored as relative and resolved to absolute as
part of loading and saving the file, in all other places cache dir
is expected to be absolute.

* server: removed cache directory from the API and UI

This won't be easily available and does not seem useful to expose
anyway.

* cli: enabled cache commands for server repositories

* cli: added KOPIA_CACHE_DIRECTORY environment variable

This is used on two occassions - when setting up connection (it gets
persisted in the config) and later when opening (to override the
cache location from config). It makes setting up docker container with
mounted cache somewhat easier with one environment variable.

* cli: show cache size for the server cache

* tls: present more helpful error message that includes SHA256 fingerprint of the TLS server on mismatch

* server: return the name of user who attempted to login when authentication fails
2021-03-07 11:25:21 -08:00
Jarek Kowalski
9620b57e35 server: avoid password hashing by using short-lived JWT tokens (#857)
Tokens encode the authenticated user, last for 1 minute and are signed
with HMAC-SHA-256. This improves HTTP server performance by a lot:

BEFORE: 168383 files (6.4 GB) - 3m38s
AFTER: 168383 files (6.4 GB) - 1m37s
2021-03-01 06:17:06 -08:00
Jarek Kowalski
4e705726fe Implemented caching for server connections (#845)
* cache: refactored reusable portion of cache into separate package

* repo: plumbed through caching for remote repository clients

* repo: plumb through cache in the unit tests

* cache: ensure we only allow absolute cache paths, fixed cache path resolution for remote repositories
2021-03-01 06:15:39 -08:00
Julio López
7bafe51dcc Replace go-bindata with //go:embed (#844)
* Replace htmlui_fallback.go with go:embed
* Replace go-bindata generated UI with go:embed
* Update site Go version to 1.16
* Update BUILD.md to reflect workflow with go:embed
2021-02-23 01:09:40 -08:00
Jarek Kowalski
e2b9a81ac3 Major CI/CD refactoring and re-added support for ARM/ARM64 runners (#849)
* ci: refactored CI/CD logic & Makefile

- removed all travis CI emulation environment variables and replaced with:

CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true

- refactored all OS and architecture-specific decisions to use around standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel

* testing: fixed tests on ARM and ARM64

- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
2021-02-23 00:52:54 -08:00
Jarek Kowalski
23273af1cd snapshot: reworked error handling and added fail-fast option (#840)
Fixes #690

This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.

Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.

The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.

After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only partial set of files.

In case of any errors, the manifest (and each directory object) will
list the number if failures and no more than 10 examples of failed
files/directories along with their respective errors.

Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.

With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
2021-02-17 10:29:01 -08:00