Added functionality to calculate aggregate statistics when
comparing what's changed between snapshots using kopia diff
Statistics collected during snapshot diff computation includes:
- files added/removed/modified
- dirs added/removed/modified
- files/dirs with metadata changes but same underlying content (OID)
Testing approach:
Added a test for verifying stats collected when comparing two directories with the same objectID but metadata changes across snapshots (dir mode, dir mod time, dir owner, etc), expectation is all the appropriate dir stats fields are updated.
Added another test for verifying stats collected when comparing two directories with similar file contents but the metadata for the files have changed between snapshots but content remains unchanged. Expectation is all the relevant file level stats fields are updated.
Existing tests have been updated due to stats now being printed in addition to previous output.
The kopia server was not uploading any logs to the repository,
because "repodiag" blob uploads would always fail.
The cause was the following: when the (log) repodiag blob
PUT operation was initiated, the `Context` used for this
operation was already canceled.
The context used for blob uploads is passed to
`repodiag.NewLogManager` when opening the repository.
In the case of the kopia server, the repository is asynchronously
opened in `server.Server.InitReposotoryAsync`. The context
passed to `repo.Open` is canceled after the "open repository"
server task completes.
This issue was introduced in #1691
Change:
Use `context.WithoutCancel()` instead of the context passed
when the repo diagnoser is created.
New tests are included to reproduce this failure and verify
the fix.
- test: ensure server logs are uploaded to the repo
- test: honor cancellation in map storage
- test: repodiag context cancellation
Ref:
- #1691
This was caused by the client using key derivation algorithm
from a config file (which did not have it when it was generated
using old version of Kopia).
Fixes#4254
Followups to #3655
* wrap fs.Reader
* nit: remove unnecessary intermediate variable
* nit: rename local variable
* cleanup: move restore.Progress interface to cli pkg
* move cliRestoreProgress to a separate file
* refactor(general): replace switch with if/else for clarity
Removes a tautology for `err == nil`, which was guaranteed
to be true in the second case statement for the switch.
Replacing the switch statement with and if/else block is clearer.
* initialize restoreProgress in restore command
* fix: use error.Wrapf with format string and args
Simplify SetCounters signature:
Pass arguments in a `restore.Stats` struct.
`SetCounters(s restore.Stats)`
Simplifies call sites and implementation.
In this case it makes sense to pass all the values
using the restore.Stats struct as it simplifies
the calls.
However, this pattern should be avoided in general
as it essentially makes all the arguments "optional".
This makes it easy to miss setting a value and simply
passing 0 (the default value), thus it becomes error
prone.
In this particular case, the struct is being passed
through verbatim, thus eliminating the risk of
missing a value, at least in the current state of
the code.
Ensure repository disconnection at the end of the `server start` CLI command.
This was caught as a result of fixing the test below.
Fix `TestServerStartInsecure`:
Remove `--password=xxx` parameter, which causes a server start failure
due to incorrect repo password, and not for the case being checked,
which is the lack of the `--insecure` parameter.
Update test comments accordingly.
Code movement and simplification, no functional changes.
Objectives:
- Allow callers specifying the needed key (or hash) size, instead of
hard-coding it in the registered PBK derivers. Conceptually, the caller
needs to specify the key size, since that is a requirement of the
(encryption) algorithm being used in the caller. Now, the code changes
here do not result in any functional changes since the key size is
always 32 bytes.
- Remove a global definition for the default PB key deriver to use.
Instead, each of the 3 use case sets the default value.
Changes:
- `crypto.DeriveKeyFromPassword` now takes a key size.
- Adds new constants for the key sizes at the callers.
- Removes the global `crypto.MasterKeySize` const.
- Removes the global `crypto.DefaultKeyDerivationAlgorithm` const.
- Adds const for the default derivation algorithms for each use case.
- Adds a const for the salt length in the `internal/user` package, to ensure
the same salt length is used in both hash versions.
- Unexports various functions, variables and constants in the `internal/crypto`
& `internal/user` packages.
- Renames various constants for consistency.
- Removes unused functions and symbols.
- Renames files to be consistent and better reflect the structure of the code.
- Adds a couple of tests to ensure the const values are in sync and supported.
- Fixes a couple of typos
Followups to:
- #3725
- #3770
- #3779
- #3799
- #3816
The individual commits show the code transformations to simplify the
review of the changes.
* Fix restoring objects with I prefix
set default of snapshot-time to 'latest' as noted in the help output
* Change test of restore to check it works without a time given
This is because --snapshot-time defaults to "latest" now.
* fix(repository): fixed handling of content.Info
Previously content.Info was an interface which was implemented by:
* index.InfoStruct
* index.indexEntryInfoV1
* index.indexEntryInfoV2
The last 2 implementations were relying on memory-mapped files
which in rare cases could be closed while Kopia was still processing
them leading to #2599.
This changes fixes the bug and strictly separates content.Info (which
is now always a struct) from the other two (which were renamed as
index.InfoReader and only used inside repo/content/...).
In addition to being safer, this _should_ reduce memory allocations.
* reduce the size of content.Info with proper alignment.
* pr feedback
* renamed index.InfoStruct to index.Info
* feat(server): reduce server refreshes of the repository
Previously each source would refresh itself from the repository
very frequently to determine the upcoming snapshot time. This change
refactors source manager so it does not own the repository connection
on its own but instead delegates all policy reads through the server.
Also introduces a new server scheduler that is responsible for
centrally managing the snapshot schedule and triggering snapshots
when they are due.
* Update cli/command_server_start.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update internal/server/server.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* Update internal/server/server_maintenance.go
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* pr feedback
---------
Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>
* feat(repository): apply retention policies server-side
This allows append-only snapshots where the client can never delete
arbitrary manifests and policies are maintained on the server.
The client only needs permissions to create snapshots in a given, which
automatically gives them permission to invoke the server-side method
for their own snapshots only.
* Update cli/command_acl_add.go
Co-authored-by: Guillaume <Gui13@users.noreply.github.com>
* Update internal/server/api_manifest.go
Co-authored-by: Guillaume <Gui13@users.noreply.github.com>
* Update internal/server/api_manifest.go
Co-authored-by: Guillaume <Gui13@users.noreply.github.com>
* Update internal/server/grpc_session.go
Co-authored-by: Guillaume <Gui13@users.noreply.github.com>
---------
Co-authored-by: Guillaume <Gui13@users.noreply.github.com>
Adds a check for snapshot create when --all and a source path are
given simultaneously. Returns an error in this case since it would
otherwise create a duplicate snapshot for the specified source path.
---
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
- fixed directory iteration order
- reduced providervalidation memory usage
- disabled one test case of TestSnapshotSparseRestore
(filed https://github.com/kopia/kopia/issues/3178 to fix)
Changes kopia's behavior to match the exit code that would
have been returned when the `--json` flag was not specified.
`kopia snapshot create my/path --json` terminates with a 0
status code in cases where
`kopia snapshot create my/path` terminates with a
non-zero exit code.
One such case is when there are permissions errors reading
files or directories to snapshot.
Adds end-to-end tests for snapshot create with '--json' flag
* remove deprecated `snapshot gc` command
* run `maintenance` instead of `snapshot gc` in robustness
* use `maintenance` command instead of `gc` alias for clarity
* use `maintenance run` in `TestSnapshotDeleteRestore`
This change adds a new streaming response to the FindManifests API. The
server will deliver the response in chunks of N manifests where N is
requested by the client. This allows the client to process the response
in chunks and improves pipelining of responses.
For now client will hold the entire response in memory since this
is what FindManifests() API currently does. This will be fixed in a
follow up change.
Replaces #2713Fixes#2660
* feat(server): improved server shutdown and integration tests
Added `--shutdown-grace-period` flag to `kopia server start` command
which can be used to specify how long the server will wait for active
connections to finish before forcibly shutting down.
This allowed removal of final out-of-process execution of
during integration tests and the need for `integration-tests` target
which was running the same tests as `tests` but in out-of-process mode.
We thus now have all the test coverage in-process without having to
build and launch `kopia` binary.
* fixed logging
* increase test timeout
* speed up and/or parallelize longest-running tests
Also modified an end-to-end test to also check that these extra mode flags work when snapshotting+restoring.
Manually tested fuse-mount.
Co-authored-by: Luca Citi <lciti@ieee.org>
* feat(cli): Allow restore from snapshoted path
* Find files in multiple snapshots
* Added --snapshot-time to restore
* Added restore by path test
* More timespec formats
* Test for snapshot list with a file in multiple snapshots
* Handle restore without target path
* Fix for tests
* Made changes requested in PR and rebased
* Encryptor pipeline
* Added ECC related options to repository create cli command
* Fix for lint errors
* Fixing comments from the PR
* Fixed lint errors
* Changes requested in PR
* Created e2e test
* refactor(repository): moved format blob management to separate package
This is completely mechanical, no behavior changes, only:
- moved types and functions to a new package
- adjusted visibility where needed
- added missing godoc
- renamed some identifiers to align with current usage
- mechanically converted some top-level functions into member functions
- fixed some mis-named variables
* refactor(repository): moved content.FormatingOptions to format.ContentFormat
* kopia format upgrade lock
* Update cli/command_repository_set_parameters_test.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* Update cli/command_repository_upgrade.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* Update cli/command_repository_upgrade.go
Co-authored-by: Ali Dowair <adowair@umich.edu>
* pr feedback
* pr feedback
* add a min drain time check
* env var for io-drain-timeout
* fix: add more doctext around upgrade phases
* build: wrap with EnvName
* add experimental warning
* protect upgrade cli behind env varible
* fix conflicts after relocating the upgrade lock
* generalize the command args
* drop certain features as per feedback
* sub-divide the upgrade command into begin and rollback
* Update cli/command_repository_upgrade.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* Update cli/command_repository_upgrade.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* missing return
* rename force flag to allow-unsafe-upgrade
Co-authored-by: Shikhar Mall <shikhar@kasten.io>
Co-authored-by: Ali Dowair <adowair@umich.edu>
Co-authored-by: Shikhar Mall <small@kopia.io>
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
Also hide the flag, since it's not recommended to be tweaked anyway.
The value of <=45m is very important for safety of the garbage collection algorithms - too long an interval between checkpoints could mean that GC treats contents in the middle of being uploaded as unused, because they are not reachable from any snapshots or checkpoints.
Fixes#2193
* feat(infra): improved support for in-process testing
* support for killing of a running server using simulated Ctrl-C
* support for overriding os.Stdin
* migrated many tests from the exe runner to in-process runner
* added required indirection when defining Envar() so we can later override it in tests
* refactored CLI runners by moving environment overrides to CLITestEnv
This commit renames the sparse restore flag (`kopia snapshot restore`
and `kopia restore`) to conform more with the naming precedents in
the Kopia code. This is a breaking change.
The original motivation can be found here:
https://github.com/kopia/htmlui/pull/61#discussion_r899155054
* Unify sparse and normal IO output
This commit refactors the code paths that excercise normal and sparse
writing of restored content. The goal is to expose sparsefile.Copy()
and iocopy.Copy() to be interchangeable, thereby allowing us to wrap
or transform their behavior more easily in the future.
* Introduce getStreamCopier()
* Pull ioCopy() into getStreamCopier()
* Fix small nit in E2E test
We should be getting the block size of the destination file, not
the source file.
* Call stat.GetBlockSize() once per FilesystemOutput
A tiny refactor to pull this call out of the generated stream copier,
as the block size should not change from one file to the next within
a restore entry.
NOTE: as a side effect, if block size could not be found (an error
is returned), we will return the default stream copier instead of
letting the sparse copier fail. A warning will be logged, but this
error will not cause the restore to fail; it will proceed silently.