Commit Graph

130 Commits

Author SHA1 Message Date
Julio López
39fb62970f refactor(cli): refactor diagnosis flags (#5026)
Refactor `--profile-*` flags:
- Multiple profile types can be enabled at once, before only
  a single type profiling could be done during a process execution.
- The new `--profiles-store-on-exit` enables all available profile
  types, except for CPU profiling which needs to be explicitly enabled.
- Profiling parameters can now be set via new flags. This allows setting
  the profile parameters for the pprof endpoint, as well as when saving
  profiles to files on exit.
- Group profiling flags with other observability flags
- Adds a `--diagnostics-output-directory` flag that unifies and
  supersedes the `--profile-dir` and `--metrics-directory` flags

Enhancements and behavior changes:
- Profile flags now have effect for all kopia commands, including
  `server start`. Before these flags did not have any effect
  in a few commands.
- Multiple profile types can be enabled at once, before only
  a single type profiling could be done during a process execution.
- The new `--profiles-store-on-exit` enables all available profile
  types, except for CPU profiling which needs to be explicitly enabled.
- Profiling parameters can now be set via new flags. This allows setting
  the profile parameters for the pprof endpoint, as well as when saving
  profiles to files on exit.

The following flags have been removed:
- `--profile-dir`: superseded by the `--diagnostics-output-directory` flag
- `--profile-blocking`: the `--profile-store-on-exit` flag enables blocking
  profiling. Use `--profile-blocking-rate=0` to explicitly disable it.
- `--profile-memory`: the `--profile-store-on-exit` flag enables memory
  profiling. Use `--profile-memory-rate=0` to explicitly disable it.
- `--profile-mutex`: the `--profile-store-on-exit` flag enables mutex
  profiling. Use `--profile-mutex-fraction=0` to explicitly disable it.

Add CLI test for profile flags.
2025-11-26 09:40:01 -08:00
Julio Lopez
bb20d9e11a chore(ci): enable wsl_v5:{assign,expr} linter settings (#4982)
Enable wsl_v5 settings:
- assign
- expr
2025-11-12 23:12:06 -08:00
Julio Lopez
19af93f2b1 refactor(ci): enable wsl_v5:return linter (#4975)
Also, disable redundant nakedret linter
2025-11-11 16:53:10 -08:00
Julio Lopez
6650223291 refactor(general): cleanup observabilityFlags (#4852)
- nit: rename function to repositoryAction.
  It always calls the action with a repository
- move allocator stats functionality to observability
- rename observability functions to start/stop. They
  start and stop more than just the metrics services.
- rename field to c.enablePProfEndpoint for clarity.
- add observability run function to make it explicit
  where start and stop are called.
2025-09-28 23:59:27 -07:00
Julio Lopez
a0a4a7ba18 refactor(cli): ensure auto-maintenance errors are propagated (#4851)
Ensure auto-maintenance errors are propagated.
This enables sending notifications for failed "auto-maintenances".

Preserve action callback error when closing the repository fails.
2025-09-28 23:55:15 -07:00
Jarek Kowalski
0f7253eb66 feat(general): rewrote content logs to always be JSON-based and reorganized log structure (#4822)
This is a breaking change to users who might be using Kopia as a library.

### Log Format

```json
{"t":"<timestamp-rfc-3389-microseconds>", "span:T1":"V1", "span:T2":"V2", "n":"<source>", "m":"<message>", /*parameters*/}
```

Where each record is associated with one or more spans that describe its scope:

* `"span:client": "<hash-of-username@hostname>"`
* `"span:repo": "<random>"` - random identifier of a repository connection (from `repo.Open`)
* `"span:maintenance": "<random>"` - random identifier of a maintenance session
* `"span:upload": "<hash-of-username@host:/path>"` - uniquely identifies upload session of a given directory
* `"span:checkpoint": "<random>"` - encapsulates each checkpoint operation during Upload
* `"span:server-session": "<random>"` -single client connection to the server
* `"span:flush": "<random>"` - encapsulates each Flush session
* `"span:maintenance": "<random>"` - encapsulates each maintenance operation
* `"span:loadIndex" : "<random>"` - encapsulates index loading operation
* `"span:emr" : "<random>"` - encapsulates epoch manager refresh
* `"span:writePack": "<pack-blob-ID>"` - encapsulates pack blob preparation and writing

(plus additional minor spans for various phases of the maintenance).

Notable points:

- Used internal zero allocation JSON writer for reduced memory usage.
- renamed `--disable-internal-log` to `--disable-repository-log` (controls saving blobs to repository)
- added `--disable-content-log` (controls writing of `content-log` files)
- all storage operations are also logged in a structural way and associated with the corresponding spans.
- all content IDs are logged in a truncated format (since first N bytes that are usually enough to be unique) to improve compressibility of logs (blob IDs are frequently repeated but content IDs usually appear just once).

This format should make it possible to recreate the journey of any single content throughout pack blobs, indexes and compaction events.
2025-09-27 17:11:13 -07:00
Julio Lopez
24af54e55e refactor(cil): rename flag to --dangerous-commands (#4773) 2025-09-03 12:35:46 -07:00
Julio Lopez
6142059575 refactor(cli): reword message for dangerous commands (#4765) 2025-08-16 20:03:31 -07:00
Julio Lopez
70032f8498 refactor(cli): remove deprecated no-op flags (#4764)
Removes the following flags:
--caching: no-op
--list-caching: no-op
--enable-jaeger-collector: errors out when specified.

Also removes no-longer-used `deprecatedFlag` function.
2025-08-16 19:44:36 -07:00
Julio Lopez
3d4c5f8f9e refactor(general): s/interface{}/any/ (#4614) 2025-05-29 06:07:49 +00:00
Julio Lopez
1c5c4e2568 refactor(cli): cleanup cli.repositoryAccessMode (#4541)
Remove `repositoryAccessMode.mustBeConnected`
It is always true.

Rename `repositoryAccessMode.disableMaintenance`
to `allowMaintenance`.
This explicitly conveys when maintenance is allowed
to run. It is related to accessing a repository in
'read-only' mode.
2025-05-01 15:35:11 -07:00
Julio Lopez
8098f49c90 chore(ci): remove exclusion for unused ctx parameters (#4530)
Remove unused-parameter exclusion for `ctx` in revive linter.

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
Co-authored-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-04-26 23:11:36 -07:00
Jarek Kowalski
b874bb8b9b feat(cli): send general error notifications only if the standard output is not a terminal (#4245) 2024-11-15 18:00:09 -08:00
Jarek Kowalski
afb85cbb34 feat(cli): send error notifications and snapshot reports (#4233)
* feat(cli): send error notifications and snapshot reports

Notifications will be sent to all configured notification profiles
according to their severity levels.

The following events will trigger notifications:

- Snapshot is created (CLI only, severity >= report)
- Server Maintenance error occurs (CLI, server and UI, severity >= error)
- Any other CLI error occurs (CLI only, severity >= error).

A flag `--no-error-notifications` can be used to disable error notifications.

* added template tests

* improved time formatting in templates

* plumb through notifytemplate.Options

* more testing for formatting options

* fixed default date format to RFC1123
2024-11-11 17:53:50 -08:00
Julio López
2ca59394c1 refactor(cli): do not run auto-maintenance after a command fails (#4168) 2024-10-14 10:17:11 -07:00
Jarek Kowalski
c0bd372d29 feat(cli): support for defining notification profiles and templates via CLI (#4034)
* feat(cli): support for defining notification profiles via CLI

Profile management:

```
$ kopia notification profile configure email \
    --profile-name=X \
    --smtp-server=smtp.gmail.com \
    --smtp-port=587 \
    --smtp-username=X \
    --smtp-password=X \
    --mail-from=X \
    --mail-to=X \
    --format=html|txt \
    [--send-test-notification]

$ kopia notification profile configure pushover --profile-name=X \
    --user-key=X \
    --app-token=X \
    --format=html|txt \
    [--send-test-notification]

$ kopia notification profile configure webhook --profile-name=X \
    --endpooint=http://some-address:port/path \
    --method=POST|PUT \
    --format=html|txt \
    [--send-test-notification]

$ kopia notification profile test --profile-name=X

$ kopia notification profile delete --profile-name=X

$ kopia notification profile list
```

Template management:

```
$ kopia notification template show X

$ kopia notification template set X \
   --from-stdin | --from-file=X | --editor

$ kopia notification template remove X

$ kopia notification template list

```

Implements #1958

* additional refactoring for testability, various naming tweaks
2024-10-06 16:28:39 +00:00
Julio López
961a39039b refactor(general): use errors.New where appropriate (#4160)
Replaces 'errors.Errorf\("([^"]+)"\)' => 'errors.New("\1")'
2024-10-05 19:05:00 -07:00
Julio López
5dbc8a478a refactor(general): minor cleanups (#4003)
Followups to #3655

* wrap fs.Reader
* nit: remove unnecessary intermediate variable
* nit: rename local variable
* cleanup: move restore.Progress interface to cli pkg
* move cliRestoreProgress to a separate file
* refactor(general): replace switch with if/else for clarity
  Removes a tautology for `err == nil`, which was guaranteed
  to be true in the second case statement for the switch.
  Replacing the switch statement with and if/else block is clearer.
* initialize restoreProgress in restore command
* fix: use error.Wrapf with format string and args


Simplify SetCounters signature:

Pass arguments in a `restore.Stats` struct.
  `SetCounters(s restore.Stats)`
Simplifies call sites and implementation.
In this case it makes sense to pass all the values
using the restore.Stats struct as it simplifies
the calls.
However, this pattern should be avoided in general
as it essentially makes all the arguments "optional".
This makes it easy to miss setting a value and simply
passing 0 (the default value), thus it becomes error
prone.
In this particular case, the struct is being passed
through verbatim, thus eliminating the risk of
missing a value, at least in the current state of
the code.
2024-08-27 09:42:58 -07:00
Jarek Kowalski
fcb8197f3f chore(ci): upgraded linter to 1.59.0 (#3883) 2024-05-29 20:31:57 -07:00
Eugene Sumin
2b92388286 refactor(general): Increase restore progress granularity (#3655)
When restoring huge file(s), the progress reporting is done in a bit
weird way:

```
kopia_test % kopia snapshot restore ka2084d263182164b6cf3456668e6b6da /Users/eugen.sumin/kopia_test/2
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/2) with parallelism=8...
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.6 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Processed 6 (5.4 GB) of 5 (5.4 GB) 1.5 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (5.4 GB).
```
In fact, the amount of restored data is dumped when particular file
completely restored.

This PR contains the least invasive change, which allows us to see
progress update while file is downloaded from object storage.
```
Restoring to local filesystem (/Users/eugen.sumin/kopia_test/55) with parallelism=8...
Processed 2 (3.1 MB) of 5 (1.8 GB).
Processed 4 (459.6 MB) of 5 (1.8 GB) 270.3 MB/s (25.2%) remaining 4s.
Processed 4 (468.7 MB) of 5 (1.8 GB) 269 MB/s (25.7%) remaining 4s.
Processed 4 (741.6 MB) of 5 (1.8 GB) 269 MB/s (40.6%) remaining 3s.
Processed 4 (1.1 GB) of 5 (1.8 GB) 280 MB/s (57.6%) remaining 2s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 291.1 MB/s (75.2%) remaining 1s.
Processed 5 (1.4 GB) of 5 (1.8 GB) 289.8 MB/s (75.6%) remaining 1s.
Processed 5 (1.6 GB) of 5 (1.8 GB) 270.2 MB/s (85.3%) remaining 0s.
Processed 5 (1.7 GB) of 5 (1.8 GB) 256.3 MB/s (95.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Processed 6 (1.8 GB) of 5 (1.8 GB) 251 MB/s (100.0%) remaining 0s.
Restored 5 files, 1 directories and 0 symbolic links (1.8 GB).
```

---------

Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
2024-05-10 09:47:13 -07:00
Jarek Kowalski
7278f570e2 chore(ci): upgraded linter to 1.57.1 (#3753) 2024-03-25 22:20:38 -07:00
Jarek Kowalski
29cd545c33 chore(ci): upgrade linter to 1.56.2 (#3714) 2024-03-09 10:39:11 -08:00
Julio Lopez
c56d330383 feat(cli): handle SIGTERM (#3562)
* refactor(test): allow signaling sub-process from testenv.CLIExeRunner
* test(cli): add test for handling SIGTERM
* feat(general): catch and process SIGTERM for termination
* refactor(cli): rename function cli.App.onTerminate
  Renames function from onCtrlC to a more generic onTerminate
2024-01-11 18:02:31 -08:00
Jarek Kowalski
cbc66f936d chore(ci): upgraded linter to 1.53.3 (#3079)
* chore(ci): upgraded linter to 1.53.3

This flagged a bunch of unused parameters, so the PR is larger than
usual, but 99% mechanical.

* separate lint CI task

* run Lint in separate CI
2023-06-18 13:26:01 -07:00
Jarek Kowalski
6fa50640f4 build(deps): manual upgrade to github.com/alecthomas/kingpin/v2 (#2804)
also upgraded github.com/klauspost/reedsolomon to latest non-retracted version
go mod tidy
2023-03-11 06:28:05 -08:00
Jarek Kowalski
f69424961f chore(ci): upgrade golang to 1.19.2 and linter to 1.50.1 (#2526)
Lack of generics support is blocking various dependency upgrades,
so this unblocks that.

Temporarily disabled `checklocks` linter until it is fixed upstream.
2022-10-28 11:02:47 -07:00
Jarek Kowalski
51dcaa985d chore(ci): upgraded linter to 1.48.0 (#2294)
Mechanically fixed all issues, added `lint-fix` make target.
2022-08-09 06:07:54 +00:00
Ricardo Pescuma Domenecci
46697a69ae feat(cli): allow to profile benchmarks (#2281) 2022-08-06 00:54:55 +00:00
Jarek Kowalski
b9be9632a2 feat(repository): added required features to the repository (#2220)
* feat(repository): added `required features` to the repository

This is intended for future compatibility to be able to reliably
stop old kopia client from being able to open a repository when
the old code does not understand new `required feature`.

Required features are checked on startup and periodically using the
same method as upgrade lock, where they will return errors during blob
operations.

* pr feedback
2022-07-29 09:31:17 -07:00
Shikhar Mall
26e6f59b2b feat(cli): New Upgrade CLI / Switch to Format Version 3 (upgrade coordination) (#1818)
* kopia format upgrade lock

* Update cli/command_repository_set_parameters_test.go

Co-authored-by: Ali Dowair <adowair@umich.edu>

* Update cli/command_repository_upgrade.go

Co-authored-by: Ali Dowair <adowair@umich.edu>

* Update cli/command_repository_upgrade.go

Co-authored-by: Ali Dowair <adowair@umich.edu>

* pr feedback

* pr feedback

* add a min drain time check

* env var for io-drain-timeout

* fix: add more doctext around upgrade phases

* build: wrap with EnvName

* add experimental warning

* protect upgrade cli behind env varible

* fix conflicts after relocating the upgrade lock

* generalize the command args

* drop certain features as per feedback

* sub-divide the upgrade command into begin and rollback

* Update cli/command_repository_upgrade.go

Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>

* Update cli/command_repository_upgrade.go

Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>

* missing return

* rename force flag to allow-unsafe-upgrade

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Ali Dowair <adowair@umich.edu>
Co-authored-by: Shikhar Mall <small@kopia.io>
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
2022-07-27 16:23:45 -07:00
Jarek Kowalski
ea257b1597 feat(cli): removed unnecessary logs from cli-logs (#2174)
- removed memory tracking since it's redundant with profiling
  and prometheus support.
- various cleanups to make sure default log is clean
2022-07-10 16:25:25 -07:00
Jarek Kowalski
8515d050e5 test(infra): improved support for in-process testing (#2169)
* feat(infra): improved support for in-process testing

* support for killing of a running server using simulated Ctrl-C
* support for overriding os.Stdin
* migrated many tests from the exe runner to in-process runner

* added required indirection when defining Envar() so we can later override it in tests

* refactored CLI runners by moving environment overrides to CLITestEnv
2022-07-09 18:22:50 -07:00
Jarek Kowalski
17c74e6386 feat(cli): added open telemetry tracing support (#1988)
New flag `--enable-jaeger-collector` and the corresponding
`KOPIA_ENABLE_JAEGER_COLLECTOR` environment variable enables Jaeger
exporter, which by default sends OTEL traces to Jaeger collector on
http://localhost:14268/api/traces

To change this, use environment variables:

* `OTEL_EXPORTER_JAEGER_ENDPOINT`
* `OTEL_EXPORTER_JAEGER_USER`
* `OTEL_EXPORTER_JAEGER_PASSWORD`

When tracing is disabled, the impact on performance is negligible.

To see this in action:

1. Download latest Jaeger all-in-one from https://www.jaegertracing.io/download/
2. Run `jaeger-all-in-one` binary without any parameters.
3. Run `kopia --enable-jaeger-collector snapshot create ...`
4. Go to http://localhost:16686/search and search for traces
2022-05-28 10:39:00 -07:00
Jarek Kowalski
e8c1cfe142 feat(cli): added flags for pushing kopia metrics (#1983)
When enabled, metrics are pushed to the provided Prometheus Push
Gateway at the start and end of each command and periodically every
few seconds.

```
--metrics-push-addr=http://address:port
--metrics-push-interval=5s
--metrics-push-job=kopia
--metrics-push-grouping=a:b --metrics-push-grouping=c:d
--metrics-push-username=user
--metrics-push-password=pass
```
2022-05-28 07:44:59 -07:00
Jarek Kowalski
1ae6c6df03 fix(repository): fixed slow goroutine leak from indexBlobCache, added tests (#1950) 2022-05-16 01:21:30 +00:00
Jarek Kowalski
991c08e4b4 chore(repository): switched from opencensus to directly exporting prometheus metrics (#1831) 2022-03-17 23:39:36 -07:00
Jarek Kowalski
a48e24e693 feat(providers): add Google Drive support (#1731)
* feat(provider): Add Google Drive support.

Co-authored-by: xkxx <xkxx@users.noreply.github.com>
Co-authored-by: xkxx <xkxiang@gmail.com>
2022-02-16 22:34:48 -08:00
Shikhar Mall
aa5e4cfb33 refactor(cli): An in-memory storage mock setup for CLI tests (#1697)
* refactor cli tests to allow the use of in-memory mock

* use in-memory repo for set-parameters cli tests

* move inmemory storage provider into test package

Co-authored-by: Shikhar Mall <shikhar@kasten.io>
2022-02-01 10:29:13 -08:00
Jarek Kowalski
fd163cfc20 feat(kopiaui): connect to repository asynchronously on startup (#1691)
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and user will see new `Initializing` page.

This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
2022-01-29 18:28:52 -08:00
Jarek Kowalski
e67f84e0ba chore(general): updated linter to 1.44.0 (#1681) 2022-01-25 21:21:13 -08:00
Jarek Kowalski
32ed220a6c build(lint): enabled gochecknoglobals and tagged existing globals (#1664) 2022-01-15 12:54:56 -08:00
Jarek Kowalski
3d58566644 fix(security): prevent cross-site request forgery in the UI website (#1653)
* fix(security): prevent cross-site request forgery in the UI website

This fixes a [cross-site request forgery (CSRF)](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
vulnerability in self-hosted UI for Kopia server.

The vulnerability allows potential attacker to make unauthorized API
calls against a running Kopia server. It requires an attacker to trick
the user into visiting a malicious website while also logged into a
Kopia website.

The vulnerability only affected self-hosted Kopia servers with UI. The
following configurations were not vulnerable:

* Kopia Repository Server without UI
* KopiaUI (desktop app)
* command-line usage of `kopia`

All users are strongly recommended to upgrade at the earliest
convenience.

* pr feedback
2022-01-13 11:31:51 -08:00
Jarek Kowalski
2cb05f7501 logging: added log rotation and improved predictability of log sweep (#1562)
* logging: added log rotation and improved predictability of log sweep

With this change logs will be rotated every 50 MB, which prevents
accumulation of giant files while server is running.

This change will also guarantee that log sweep completes at least once
before each invocation of Kopia finishes. Previously it was a goroutine
that was not monitored for completion.

Flags can be used to override default behaviors:

* `--max-log-file-segment-size`
* `--no-wait-for-log-sweep` - disables waiting for full log sweep

Fixes #1561

* logging: added --log-dir-max-total-size-mb flag

This limits the total size of all logs in a directory to 1 GB.
2021-12-03 16:43:46 -08:00
Jarek Kowalski
cead806a3f blob: changed default shards from {3,3} to {1,3} (#1513)
* blob: changed default shards from {3,3} to {1,3}

Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.

Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.

The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).

* fixed compat tests
2021-11-16 06:02:04 -08:00
Jarek Kowalski
8a4ac4dec3 Upgraded linter to 1.43.0 (#1505)
* fixed new gocritic violations
* fixed new 'contextcheck' violations
* fixed 'gosec' warnings
* suppressed ireturn and varnamelen linters
* fixed tenv violations, enabled building robustness tests on arm64
* fixed remaining linux failures
* makefile: fixed 'lint-all' target when running on arm64
* linter: increase deadline
* disable nilnil linter - to be enabled in separate PR
2021-11-11 17:03:11 -08:00
Jarek Kowalski
51fdbf2048 nit: minor log output cleanups (#1380) 2021-10-13 06:07:18 -07:00
Jarek Kowalski
4a47bc3210 logging: switched from go-logging to zap (#1376)
This is much more efficient in terms of memory allocations
and speeds up backup due to less GC pressure.

Fixes #1345
2021-10-12 22:52:24 -07:00
Jarek Kowalski
8b760b66a8 logging: added memoization of Logger instances per context (#1369) 2021-10-09 05:02:18 -07:00
CrendKing
93b9bf15b4 Support setting AWS S3 storage class for all types of blobs (#1335)
* Support setting AWS S3 storage class for all types of blobs

* Read .storageconfig file

* Improve loading logic

* Hide .storageconfig from ListBlobs()
2021-10-03 19:01:39 -07:00
Jarek Kowalski
1894727458 cli: fixed windows color output (#1305)
Fixes #1287
2021-09-19 21:59:24 -07:00