New flag `--enable-jaeger-collector` and the corresponding
`KOPIA_ENABLE_JAEGER_COLLECTOR` environment variable enables Jaeger
exporter, which by default sends OTEL traces to Jaeger collector on
http://localhost:14268/api/traces
To change this, use environment variables:
* `OTEL_EXPORTER_JAEGER_ENDPOINT`
* `OTEL_EXPORTER_JAEGER_USER`
* `OTEL_EXPORTER_JAEGER_PASSWORD`
When tracing is disabled, the impact on performance is negligible.
To see this in action:
1. Download latest Jaeger all-in-one from https://www.jaegertracing.io/download/
2. Run `jaeger-all-in-one` binary without any parameters.
3. Run `kopia --enable-jaeger-collector snapshot create ...`
4. Go to http://localhost:16686/search and search for traces
When enabled, metrics are pushed to the provided Prometheus Push
Gateway at the start and end of each command and periodically every
few seconds.
```
--metrics-push-addr=http://address:port
--metrics-push-interval=5s
--metrics-push-job=kopia
--metrics-push-grouping=a:b --metrics-push-grouping=c:d
--metrics-push-username=user
--metrics-push-password=pass
```
* refactor(repository): ensure we always parse content.ID and object.ID
This changes the types to be incompatible with string to prevent direct
conversion to and from string.
This has the additional benefit of reducing number of memory allocations
and bytes for all IDs.
content.ID went from 2 allocations to 1:
typical case 32 characters + 16 bytes per-string overhead
worst-case 65 characters + 16 bytes per-string overhead
now: 34 bytes
object.ID went from 2 allocations to 1:
typical case 32 characters + 16 bytes per-string overhead
worst-case 65 characters + 16 bytes per-string overhead
now: 36 bytes
* move index.{ID,IDRange} methods to separate files
* replaced index.IDFromHash with content.IDFromHash externally
* minor tweaks and additional tests
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* pr feedback
* post-merge fixes
* pr feedback
* pr feedback
* fixed subtle regression in sortedContents()
This was actually not producing invalid results because of how base36
works, just not sorting as efficiently as it could.
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* feat(cli): implementation for 'kopia snapshot fix'
This allows modifications and fixes to the snapshots after they have
been taken.
Supported are:
* `kopia snapshot fix remove-invalid-files [--verify-files-percent=X]`
Removes all directory entries where the underlying files cannot be
read based on index analysis (this does not read the files, only index
structures so is reasonably quick).
`--verify-files-percent=100` can be used to trigger full read for
all files.
* `kopia snapshot fix remove-files --object-id=<object-id>`
Removes the object with a given ID from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
* `kopia snapshot fix remove-files --filename=<wildcard>`
Removes the files with a given name from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
By default all snapshots are analyzed and rewritten. To limit the scope
use:
--source=user@host:/path
--manifest-id=manifestID
By default the rewrite operation writes new directory entries but
does not replace the manifests. To do that pass `--commit`.
Related #1906Fixes#799
reorganized CLI per PR suggestion
* additional logging for diff command
* added Clone() method to snapshot manifst and directory entry
* added a comprehensive test, moved DirRewriter to separate file
* pr feedback
* more pr feedback
* improved logging output
* disable test in -race configuration since it's way to slow
* pr feedback
- expand command flag description for clarification
- include blob id in blob get error in the cache
- nit: remove unused BOTO_PATH
- nit: fix comment
- cleanup: remove unnecessary function declaration in interface
- leverage 'testify' to simplify test
* feat(cli): added 'content delete --forget' flag
This allows low-level hiding of entries in the index, which makes
them completely invisible.
For #1906
* improved code coverage
* pr feedback
The connected repository's backing storage capacity and available
space can be now retrieved from `kopia repository status`. In text
format, these fields are printed in a human friendly form (MiB, GiB).
In JSON mode (`--json`), they are output as bytes.
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
Co-authored-by: Julio
* feat(snapshots): support restoring sparse files
This commit implements basic support for restoring sparse files from
a snapshot. When specifying "--mode=sparse" in a snapshot restore
command, Kopia will make a best effort to make sure the underlying
filesystem allocates the minimum amount of blocks needed to persist
restored files. In other words, enabling this feature will "force"
all restored files to be sparse-blocks of zero bytes in the source
file should not be allocated.
* Address review comments
- Separate sparse option into its own bool flag
- Implement sparsefile packagewith copySparse method
- Truncate once before writing sparse file
- Check error from Truncate
- Add unit test for copySparse
- Invoke GetBlockSize once per file copy
- Remove support for Windows and explain why
- Add unit test for stat package
Co-authored-by: Dave Smith-Uchida <dave@kasten.io>
From https://github.com/google/gvisor/tree/master/tools/checklocks
This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.
This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.
In a handful of places the code had to be refactored to allow static
analyzer to do its job better or to not be confused by some
constructs.
In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.
The check is part of `make lint` but can also be invoked by
`make check-locks`.
This is a safety measure which addresses P0 improvement for #1732.
Given that retention policies that retain nothing make no sense, this
is not considered a breaking change.
* refactor cli tests to allow the use of in-memory mock
* use in-memory repo for set-parameters cli tests
* move inmemory storage provider into test package
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
This allows KopiaUI server to start when the repository directory
is not mounted or otherwise unavailable. Connection attempts will
be retried indefinitely and user will see new `Initializing` page.
This also exposes `Open` and `Connect` as tasks allowing the user to see
logs directly in the UI and cancel the operation.
Also simplified validation test suite, which will simply test whether
the provider supports DoNotRecreate or properly rejects it without
external configuration.
* Add a new PutBlob option and blob error type
When `DoNotRecreate` is set as true, the blob put operation should
only succeed if no blob with the given blob ID already exists.
Othwerwise, `ErrBlobAlreadyExists` is returned.
* Validate default storage providers' support
By default, storage providers should not support idempotent creates.
This commit adds error handling to exit early if `DoNotRecreate` is
set to true. The commit also verifies this behavior in the provider
validation test.
* Implement support for new option in GCS storage
* Push PutBlob option handling down to Impl
When PutBlob options were introduced, error handling logic for them
was implemented for the Sharded storage interface. However, the
behavior of different providers that implement Sharded can be
different, so it's better to push the options down to be processed in
the provider implementations.
* Introduce new error type for unsupported put opts
To unify error handling code and make it more maintainable, introduce
a new error type `blob.ErrUnsupportedPutBlobOption`, which is to be
returned whenever a storage provider implementation is given put
options it does not support.
* feat: persisting retention options in repository blob
- plumb retention parameters through wrapped storage
- generalize aes encryption mechanism
- rewrite the retention blob on password change
- do not write retention blob when empty
* handle retention-blob not-found failures
* cli params to set retention modes on repository create
* enable versioned map mock storage with retention settings
* adding unit tests
* write format and retention blob with retention settings if available
* rename certain functions and constants specific to format blob
* delete retention cache on password-change
* fix: replace SetTime() api call with TouchBlob()
* Update repo/repository_test.go
Co-authored-by: Nick <nick@kasten.io>
* pr feedback and codecov improvements
* fix: rename retention-blob structures to generic blob-cfg
* fix: remove minio dependency on retention constants
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Nick <nick@kasten.io>
* fix(security): prevent cross-site request forgery in the UI website
This fixes a [cross-site request forgery (CSRF)](https://en.wikipedia.org/wiki/Cross-site_request_forgery)
vulnerability in self-hosted UI for Kopia server.
The vulnerability allows potential attacker to make unauthorized API
calls against a running Kopia server. It requires an attacker to trick
the user into visiting a malicious website while also logged into a
Kopia website.
The vulnerability only affected self-hosted Kopia servers with UI. The
following configurations were not vulnerable:
* Kopia Repository Server without UI
* KopiaUI (desktop app)
* command-line usage of `kopia`
All users are strongly recommended to upgrade at the earliest
convenience.
* pr feedback
This adds new set of APIs `/api/v1/control/*` which can be used to administratively control a running server.
Once the server is started, the administrative user can control it
using CLI commands:
export KOPIA_SERVER_ADDRESS=...
export KOPIA_SERVER_CERT_FINGERPRINT=...
export KOPIA_SERVER_PASSWORD=...
* `kopia server status` - displays status of sources managed by the server
* `kopia server snapshot` - triggers server-side upload of snapshots for managed sources
* `kopia server cancel` - cancels upload of snapshots for managed sources
* `kopia server pause` - pauses scheduled snapshots for managed sources
* `kopia server resume` - resumes scheduled snapshots for managed sources
* `kopia server refresh` - causes server to resynchronize with externally-made changes, such as policies or new sources
* `kopia server flush` - causes server to flush all pending writes
* `kopia server shutdown` - graceful shutdown of the server
Authentication uses new user `server-control` and is disabled
by default. To enable it when starting the server, provide the password
using one of the following methods:
* `--server-control-password`
* `--random-server-control-password`
* `.htpasswd` file
* `KOPIA_SERVER_CONTROL_PASSWORD` environment variable
This change allows us to tighten the API security and remove some
methods that UI user was able to call, but which were not needed.
* sharded: plumbed through blob.PutOptions
* blob: removed blob.Storage.SetTime() method
This was only used for `kopia repo sync-to` and got replaced with
an equivalent blob.PutOptions.SetTime, which wehn set to non-zero time
will attempt to set the modification time on a file.
Since some providers don't support changing modification time, we
are able to emulate it using per-blob metadata (on B2, Azure and GCS),
sadly S3 is still unsupported, because it does not support returning
metadata in list results.
Also added PutOptions.GetTime, which when set to not nil, will
populate the provided variable with actual time that got assigned
to the blob.
Added tests that verify that each provider supports GetTime
and SetTime according to this spec.
* blob: additional test coverage for filesystem storage
* blob: added PutBlobAndGetMetadata() helper and used where appropriate
* fixed test failures
* pr feedback
* Update repo/blob/azure/azure_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update repo/blob/filesystem/filesystem_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update repo/blob/filesystem/filesystem_storage.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* blobtesting: fixed object_locking_map.go
* blobtesting: removed SetTime from ObjectLockingMap
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>