* Implement volume shadow copy support on Windows
* Update go-vss version
* Fix unused variables
* Rename upload_actions*.go files
* Move vss settings to a separate policy section
* Handle existing shadow copy root
* Fix tests
* Fix lint issues
* Add cli policy test
* Add OS snapshot integration test
* Add GitHub Actions VSS test
* Fix "Incorrect function" error for root VSS snapshots
* Rename err to finalErr in createOSSnapshot
* Add OSSnapshotMode test
* Do not modify paths starting with \\?\ on Windows
* Allow warning messages in logfile tests
* Fix ignorefs not wrapping OS snapshot directory
* Retry VSS creation if another op was in progress
---------
Co-authored-by: Jarek Kowalski <jaak@jkowalski.net>
* chore(ci): upgraded linter to 1.53.3
This flagged a bunch of unused parameters, so the PR is larger than
usual, but 99% mechanical.
* separate lint CI task
* run Lint in separate CI
* Return ReadCloser from StreamingFile
Allow better resource management by returning something that can be closed
when dealing with StreamingFiles.
* Close StreamingFile Reader during upload
* Use NopCloser on inputs that don't implement Close
Fixup callers of the StreamingFile API by wrapping regular Readers with
NopCloser calls where necessary.
Also modified an end-to-end test to check that these extra mode flags work when snapshotting and restoring.
Manually tested fuse-mount.
Co-authored-by: Luca Citi <lciti@ieee.org>
Almost all were easy to replace, except ones exposed via JSON which
have been left as-is.
The linter has a cool behavior where it flags attempts to pass, for
example, `atomic.Int32` by value, which is always a mistake, say as an
argument to `fmt.Sprintf()`.
This removes tons of boilerplate code around:
- retry loop
- connection management
- storage registration
* used generics in runInParallel
* introduced generics in freepool
* introduced strong typing for workshare.Pool and workshare.AsyncGroup
* fixed linter error on openbsd
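A rough sketch of what a generic parallel helper can look like. The signature below is an assumption made for illustration and is not necessarily the one used by `runInParallel` in the codebase.

```go
package main

import (
	"fmt"
	"sync"
)

// runInParallel is a hypothetical generic helper: it applies fn to every
// item concurrently and returns the results in input order. If any call
// fails, it returns the first non-nil error in input order.
func runInParallel[T, R any](items []T, fn func(T) (R, error)) ([]R, error) {
	results := make([]R, len(items))
	errs := make([]error, len(items))

	var wg sync.WaitGroup

	for i, item := range items {
		wg.Add(1)

		go func(i int, item T) {
			defer wg.Done()
			// Each goroutine writes to its own slot, so no locking is needed.
			results[i], errs[i] = fn(item)
		}(i, item)
	}

	wg.Wait()

	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}

	return results, nil
}

func main() {
	squares, err := runInParallel([]int{1, 2, 3}, func(n int) (int, error) {
		return n * n, nil
	})
	fmt.Println(squares, err)
}
```

With type parameters the callback's input and output types are checked at compile time, so callers do not need `interface{}` values or type assertions.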
* Allow setting mod time on StreamingFiles
Only set during struct creation. Default the old constructor to using
the current time as the mod time.
* Change how mod time is handled for StreamingFiles
Don't set StreamingFile mod time in the uploader, instead use the value
in the file's metadata. Also allows StreamingFiles to be recognized as
cached files (previously uploaded). StreamingFiles don't know their file
size until they've been completely uploaded so leaving that out makes
them eligible for being marked as "cached".
This commit combined with the previous commit slightly changes how
timestamps on StreamingFiles are handled. It will result in them having
slightly earlier timestamps because they are now set on struct creation
instead of when the file was uploaded.
As timestamps are fairly fine-grained and the default is to use the
current time as the mod time it seems unlikely this patch will result in
incorrectly thinking a StreamingFile is cached even though it has
changed size.
* Uploader test for StreamingFile caching
* refactor(repository): moved format blob management to separate package
This is completely mechanical, no behavior changes, only:
- moved types and functions to a new package
- adjusted visibility where needed
- added missing godoc
- renamed some identifiers to align with current usage
- mechanically converted some top-level functions into member functions
- fixed some mis-named variables
* refactor(repository): moved content.FormatingOptions to format.ContentFormat
* Allow dynamic directory entries with virtualfs
* Tests for new virtualfs implementation
* Add escape hatch for estimator during upload
Some virtualfs.StreamingDirectory-s may not be able to (efficiently)
support iterating through entries multiple times. Make a way for the
estimator to ask if they support multiple iterations and skip the
directory if they do not.
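The escape hatch can be pictured roughly as follows. All types and method names here (`directory`, `SupportsMultipleIterations`, `Iterate`, `estimate`) are assumptions for the sake of the example, not the actual interfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// directory is a hypothetical, simplified directory interface.
type directory interface {
	SupportsMultipleIterations() bool
	Iterate(cb func(name string)) error
}

// staticDir can be iterated any number of times.
type staticDir struct{ names []string }

func (d *staticDir) SupportsMultipleIterations() bool { return true }

func (d *staticDir) Iterate(cb func(string)) error {
	for _, n := range d.names {
		cb(n)
	}
	return nil
}

// streamingDir can only be iterated once.
type streamingDir struct {
	names    []string
	consumed bool
}

func (d *streamingDir) SupportsMultipleIterations() bool { return false }

func (d *streamingDir) Iterate(cb func(string)) error {
	if d.consumed {
		return errors.New("streaming directory already iterated")
	}
	d.consumed = true
	for _, n := range d.names {
		cb(n)
	}
	return nil
}

// estimate counts entries but skips directories that cannot be iterated
// again, leaving them untouched for the real upload pass.
func estimate(dirs []directory) (int, error) {
	total := 0
	for _, d := range dirs {
		if !d.SupportsMultipleIterations() {
			continue
		}
		if err := d.Iterate(func(string) { total++ }); err != nil {
			return 0, err
		}
	}
	return total, nil
}

func main() {
	dirs := []directory{
		&staticDir{names: []string{"a", "b"}},
		&streamingDir{names: []string{"x", "y", "z"}},
	}
	fmt.Println(estimate(dirs))
}
```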
* Expand Directory interface
Expand the Directory interface instead of making a new interface as it's
error-prone to ensure all wrapper types properly handle types that use
the new interface.
* Post-rebase fixes
* Make StreamingDirectory single iteration only
Simplify code and test slightly by not allowing users to declare a
StreamingDirectory that can be iterated through multiple times.
* Add better test for estimator ignoring stream dir
Previous test in uploader had a race condition, meaning it may not catch
all cases.
* Ignore atomic access in checklocks
Comparisons known to be done after all additions to the variables in
question.
* Implement reviewer feedback
* Remove unused function parameter
* Remove remaining internal uses of Readdir
* Remove old helpers and interface functions.
* Update tests for updated fs.Directory interface
* Fix index out of range error in snapshot walker
Record one error if an error occurred and it's not limiting errors
* Use helper functions more; exit loops early
Follow up on reviewer comments and reduce code duplication, use more
targeted functions like Directory.Child, and exit directory iteration
early if possible.
* Remove fs.Entries type and unused functions
Leave some functions dealing with sorting and finding entries in fs
package. This retains tests for those functions while still allowing
mockfs to access them.
* Simplify function return
When combined with #1963, it significantly reduces memory usage.
When backing up Kopia enlistment with various binaries 2.8GB
(files:74180 dirs:12322):
Before: max memory 440MB, time 5.8s
After: max memory 360MB, time 5.4s
* refactor(repository): ensure we always parse content.ID and object.ID
This changes the types to be incompatible with string to prevent direct
conversion to and from string.
This has the additional benefit of reducing number of memory allocations
and bytes for all IDs.
content.ID went from 2 allocations to 1:
typical case 32 characters + 16 bytes per-string overhead
worst-case 65 characters + 16 bytes per-string overhead
now: 34 bytes
object.ID went from 2 allocations to 1:
typical case 32 characters + 16 bytes per-string overhead
worst-case 65 characters + 16 bytes per-string overhead
now: 36 bytes
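To illustrate the idea of an ID type that is incompatible with string, here is a sketch. The field layout, the prefix rule, and the `ParseID` name are assumptions for the example and may differ from the real `content.ID` and `object.ID`.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// ID is a hypothetical fixed-size identifier. Because it is a struct rather
// than a string type, ID(s) and string(id) conversions do not compile, so
// callers have to go through ParseID and String.
type ID struct {
	prefix byte     // optional single-character prefix, 0 when absent
	length uint8    // number of hash bytes in use
	data   [32]byte // hash bytes stored inline, no separate allocation
}

// ParseID parses an optional one-character prefix followed by hex digits.
// In this sketch an odd-length input is taken to start with a prefix.
func ParseID(s string) (ID, error) {
	var id ID

	if len(s)%2 == 1 {
		id.prefix = s[0]
		s = s[1:]
	}

	if len(s) == 0 || len(s) > 2*len(id.data) {
		return ID{}, errors.New("invalid ID length")
	}

	n, err := hex.Decode(id.data[:], []byte(s))
	if err != nil {
		return ID{}, fmt.Errorf("invalid ID: %w", err)
	}

	id.length = uint8(n)

	return id, nil
}

// String converts the ID back to its textual form.
func (id ID) String() string {
	h := hex.EncodeToString(id.data[:id.length])
	if id.prefix != 0 {
		return string([]byte{id.prefix}) + h
	}
	return h
}

func main() {
	id, err := ParseID("kabcdef01")
	fmt.Println(id, err)
}
```

Storing the bytes inline keeps the value in a single fixed-size block, and forcing every conversion through a parser means malformed IDs are rejected at the boundary.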
* move index.{ID,IDRange} methods to separate files
* replaced index.IDFromHash with content.IDFromHash externally
* minor tweaks and additional tests
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* pr feedback
* post-merge fixes
* pr feedback
* pr feedback
* fixed subtle regression in sortedContents()
This was actually not producing invalid results because of how base36
works, just not sorting as efficiently as it could.
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* New interface method to iterate over dir entries
* Fix build and test failures from interface
* Fix entry iteration for StaticDirectory
* Make utility function for directory iteration
* Fix lint errors
* No wrapcheck on fs.ReaddirToIterate
* Be consistent for IterateEntry implementations
From https://github.com/google/gvisor/tree/master/tools/checklocks
This will perform static verification that we're using
`sync.Mutex`, `sync.RWMutex` and `atomic` correctly to guard access
to certain fields.
This was mostly just a matter of adding annotations to indicate which
fields are guarded by which mutex.
In a handful of places the code had to be refactored to allow static
analyzer to do its job better or to not be confused by some
constructs.
In one place this actually uncovered a bug where a function was not
releasing a lock properly in an error case.
The check is part of `make lint` but can also be invoked by
`make check-locks`.
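For illustration, the annotations look roughly like this. The `counter` type is made up for the example; the `+checklocks` comment syntax follows the checklocks README as best understood here.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical type used only to show the annotation style.
type counter struct {
	mu sync.Mutex

	// +checklocks:mu
	value int
}

// Increment takes the lock and releases it with defer, so the lock is also
// released on early-return and error paths.
func (c *counter) Increment() int {
	c.mu.Lock()
	defer c.mu.Unlock()

	return c.incrementLocked()
}

// incrementLocked must be called with c.mu held.
//
// +checklocks:c.mu
func (c *counter) incrementLocked() int {
	c.value++
	return c.value
}

func main() {
	c := &counter{}
	c.Increment()
	fmt.Println(c.Increment())
}
```

The annotations are ordinary comments, so the code compiles and runs unchanged when the analyzer is not in use.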
The dual time measurement is described in
https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
The fix is to discard hidden monotonic time component of time.Time
by converting to unix time and back.
Reviewed usage of clock.Now() and replaced with timetrack.StartTimer()
when measuring time.
The problem in #1402 was that passage of time was measured using
the monotonic time and not wall clock time. When the computer goes
to sleep, monotonic time is still monotonic while wall clock time makes
a leap when the computer wakes up. This is the behavior that
epoch manager (and most other components in Kopia) rely upon.
Fixes #1402
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
* refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
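For reference, the replacements look like this in practice. The `roundTrip` function is a made-up example; the mapping from `ioutil` functions to their `io` and `os` equivalents is as documented in the Go 1.16 release notes.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes content to a temporary file and reads it back using the
// io and os replacements for the deprecated io/ioutil functions.
func roundTrip(content string) (string, error) {
	dir, err := os.MkdirTemp("", "example") // was ioutil.TempDir
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "data.txt")

	if err := os.WriteFile(path, []byte(content), 0o600); err != nil { // was ioutil.WriteFile
		return "", err
	}

	data, err := os.ReadFile(path) // was ioutil.ReadFile
	if err != nil {
		return "", err
	}

	all, err := io.ReadAll(strings.NewReader(string(data))) // was ioutil.ReadAll
	if err != nil {
		return "", err
	}

	return string(all), nil
}

func main() {
	fmt.Println(roundTrip("hello"))
}
```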
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* chore: remove //nolint:gosec for os.ReadFile
At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* localfs: reduce memory usage when scanning short directories
We read first 100 entries to determine if the directory is short
before forking to goroutines. Also reduced goroutine count, 16 was way
too aggressive.
* localfs: fixed windows-specific behavior where os.Lstat() returns different timestamps than ReadDir()
* cachefs: make cache read O(1) instead of O(n)
Move the wrapping of fs.Entry before caching rather than doing
it every time we try to read from the cache.
* Add StreamingFile interface
* unit test for virtualfs
* CLI: Snapshot create support for stdin sources
* Uploader support for fs.StreamingFile
* End to end test for stdin source snapshot
* upload test to improve coverage
* policy: added errorHandling.ignoreUnknownTypes flag (defaults to true)
* cli: get/set ignore-unknown-types policy flag
* htmlui: added UI for setting ignore-unknown-types
* htmlui: fixed typo
* fs: return fs.ErrorEntry when a directory entry is not recognized (localfs and repofs)
* upload: explicitly handle unknown entry types by treating them as ignored errors
* ci: refactored CI/CD logic & Makefile
- removed all travis CI emulation environment variables and replaced with:
CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true
- refactored all OS and architecture-specific decisions to use standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel
* testing: fixed tests on ARM and ARM64
- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
Fixes #690
This is a breaking change for folks who are expecting snapshots to fail
quickly without writing a snapshot manifest in case of an error.
Before this change, any source read failure would cause the entire
snapshot to fail (and not write a snapshot manifest as a result),
unless `ignoreFileErrors` or `ignoreDirectoryErrors` was set.
The new behavior is to continue snapshotting remaining files and
directories (this can be disabled by passing `--fail-fast` flag or
setting `KOPIA_SNAPSHOT_FAIL_FAST=1` environment variable) and defer
returning an error until the very end.
After snapshotting we will always attempt to write the snapshot manifest
(except when the root of the snapshot itself cannot be opened). In case
of a fail-fast error, the manifest will be marked as 'partial' and
the directory tree will contain only a partial set of files.
In case of any errors, the manifest (and each directory object) will
list the number of failures and no more than 10 examples of failed
files/directories along with their respective errors.
Once the snapshot is complete we will return non-zero exit code to the
operating system if there were any fatal errors during snapshotting.
With this change we are repurposing `ignoreFileErrors` and
`ignoreDirectoryErrors` to designate some errors as non-fatal.
Non-fatal errors are reported as warnings in the logs and will not
cause a non-zero exit code to be returned.
* Improved .kopiaignore pattern matching
.kopiaignore pattern matching now (hopefully) conforms to the .gitignore specification (https://git-scm.com/docs/gitignore)
Replaced old package "ignore" with a newly written "wcmatch" that manages the globbing. This should support all the patterns that .gitignore supports.
Some changes in ignorefs that dealt with how the patterns were matched.
This fixes #571
* Fixed invalid matching of non-rooted patterns that contained a slash.
If a pattern contains a slash in the middle of the pattern this should only match relative to the .gitignore file, i.e. the same as if it started with a '/' according to the .gitignore spec.
Example:
"foo/bar" should match "/foo/bar", but not "/other/foo/bar".
whereas
"bar" matches both "/bar" and "/foo/bar"
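The anchoring rule can be sketched for wildcard-free patterns as follows. The `isAnchored` and `matchesLiteral` helpers are hypothetical and far simpler than a real matcher: they ignore wildcards, negation, and the directory-only meaning of a trailing slash.

```go
package main

import (
	"fmt"
	"strings"
)

// isAnchored reports whether a pattern is relative to the directory of the
// ignore file: true when it has a slash at the beginning or in the middle.
// A trailing slash alone does not anchor the pattern.
func isAnchored(pattern string) bool {
	return strings.Contains(strings.TrimSuffix(pattern, "/"), "/")
}

// matchesLiteral checks a wildcard-free pattern against a slash-separated
// path given relative to the ignore file's directory (with a leading "/").
func matchesLiteral(pattern, path string) bool {
	p := strings.Trim(pattern, "/")
	rel := strings.TrimPrefix(path, "/")

	if isAnchored(pattern) {
		// Anchored patterns must match from the ignore file's directory.
		return rel == p
	}

	// Unanchored patterns may match at any depth.
	return rel == p || strings.HasSuffix(rel, "/"+p)
}

func main() {
	fmt.Println(matchesLiteral("foo/bar", "/foo/bar"))
	fmt.Println(matchesLiteral("foo/bar", "/other/foo/bar"))
	fmt.Println(matchesLiteral("bar", "/bar"))
	fmt.Println(matchesLiteral("bar", "/foo/bar"))
}
```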
* Uncommented previously failing tests.
* Fixed problem with matching "nested" .kopiaignore files.
Ignore-patterns must be applied from the root .kopiaignore down the hierarchy, so that an ignore file in a subdirectory can negate a pattern from a parent directory.
* Uncommented tests that should now work.
* Created end-to-end tests verifying .kopiaignore behavior.
This is related to #571 and #773, but provided as a separate PR to include tests that did not work before PR #773.
* Commented failing tests.
These tests will be re-enabled when #773 is done.
* Added additional commented tests of .kopiaignore
These will be uncommented in #773.