Adds a `--stats-only` flag to the `diff` command to reduce
output verbosity.
When `kopia diff` is invoked with the `--stats-only` flag it
only outputs the aggregate statistics of changes.
- remove spurious log message, duplicates what is being sent to the output;
- nit: update struct comment;
- consolidate code via a parameterized function;
- rename variable for clarity, it makes it easier to understand by avoiding negative condition check;
- remove unused return value;
- replace if blocks with switch.
Added functionality to calculate aggregate statistics when
comparing what's changed between snapshots using kopia diff
Statistics collected during snapshot diff computation includes:
- files added/removed/modified
- dirs added/removed/modified
- files/dirs with metadata changes but same underlying content (OID)
Testing approach:
Added a test for verifying stats collected when comparing two directories with the same objectID but metadata changes across snapshots (dir mode, dir mod time, dir owner, etc), expectation is all the appropriate dir stats fields are updated.
Added another test for verifying stats collected when comparing two directories with similar file contents but the metadata for the files have changed between snapshots but content remains unchanged. Expectation is all the relevant file level stats fields are updated.
Existing tests have been updated due to stats now being printed in addition to previous output.
* chore(ci): upgraded linter to 1.53.3
This flagged a bunch of unused parameters, so the PR is larger than
usual, but 99% mechanical.
* separate lint CI task
* run Lint in separate CI
* Remove remaining internal uses of Readdir
* Remove old helpers and interface functions.
* Update tests for updated fs.Directory interface
* Fix index out of range error in snapshot walker
Record one error if an error occurred and it's not limiting errors
* Use helper functions more; exit loops early
Follow up on reviewer comments and reduce code duplication, use more
targetted functions like Directory.Child, and exit directory iteration
early if possible.
* Remove fs.Entries type and unused functions
Leave some functions dealing with sorting and finding entries in fs
package. This retains tests for those functions while still allowing
mockfs to access them.
* Simplify function return
* feat(cli): implementation for 'kopia snapshot fix'
This allows modifications and fixes to the snapshots after they have
been taken.
Supported are:
* `kopia snapshot fix remove-invalid-files [--verify-files-percent=X]`
Removes all directory entries where the underlying files cannot be
read based on index analysis (this does not read the files, only index
structures so is reasonably quick).
`--verify-files-percent=100` can be used to trigger full read for
all files.
* `kopia snapshot fix remove-files --object-id=<object-id>`
Removes the object with a given ID from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
* `kopia snapshot fix remove-files --filename=<wildcard>`
Removes the files with a given name from the entire snapshot tree.
Useful when you accidentally snapshot a sensitive file.
By default all snapshots are analyzed and rewritten. To limit the scope
use:
--source=user@host:/path
--manifest-id=manifestID
By default the rewrite operation writes new directory entries but
does not replace the manifests. To do that pass `--commit`.
Related #1906Fixes#799
reorganized CLI per PR suggestion
* additional logging for diff command
* added Clone() method to snapshot manifst and directory entry
* added a comprehensive test, moved DirRewriter to separate file
* pr feedback
* more pr feedback
* improved logging output
* disable test in -race configuration since it's way to slow
* pr feedback
* refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* chore: remove //nolint:gosec for os.ReadFile
At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* linter: upgraded to 1.33, disabled some linters
* lint: fixed 'errorlint' errors
This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.
Verified that there are no exceptions for errorlint linter which
guarantees that.
* lint: fixed or suppressed wrapcheck errors
* lint: nolintlint and misc cleanups
Co-authored-by: Julio López <julio+gh@kasten.io>
The new files policy oneFileSystem ignores files that are mounted to
other filesystems similarly to tar's --one-file-system switch. For
example, if this is enabled, backing up / should now automatically
ignore /dev, /proc, etc, so the directory entries themselves don't
appear in the backup. The value of the policy is 'false' by default.
This is implemented by adding a non-windows-field Device (of type
DeviceInfo, reflecting the implementation of Owner) to the Entry
interface. DeviceInfo holds the dev and rdev acquired with stat (same
way as with Owner), but in addition to that it also holds the same
values for the parent directory. It would seem that doing this in some
other way, ie. in ReadDir, would require modifying the ReadDir
interface which seems a too large modification for a feature this
small.
This change introduces a duplication of 'stat' call to the files, as
the Owner feature already does a separate call. I doubt the
performance implications are noticeable, though with some refactoring
both Owner and Device fields could be filled in in one go.
Filling in the field has been placed in fs/localfs/localfs.go where
entryFromChildFileInfo has acquired a third parameter giving the the
parent entry. From that information the Device of the parent is
retrieved, to be passed off to platformSpecificDeviceInfo which does
the rest of the paperwork. Other fs implementations just put in the
default values.
The Dev and Rdev fields returned by the 'stat' call have different
sizes on different platforms, but for convenience they are internally
handled the same. The conversion is done with local_fs_32bit.go and
local_fs_64bit.go which are conditionally compiled on different
platforms.
Finally the actual check of the condition is in ignorefs.go function
shouldIncludeByDevice which is analoguous to the other similarly named
functions.
Co-authored-by: Erkki Seppälä <flux@inside.org>
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)
This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
- Use it to compare the entry attributes for all entry types.
Allows comparing differences in file attributes among the
contents of two directories. This is useful for verifying
restored contents in end-to-end tests.
- Print message about modified entries only when both entries
are files and they are being compared.