* refactor: move from io/ioutil to io and os package
The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* chore: remove //nolint:gosec for os.ReadFile
At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.
Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
* cli: added a flag to create repository with v2 index features
* content: plumb through compression.ID parameter to content.Manager.WriteContent()
* content: expose content.Manager.SupportsContentCompression
This allows object manager to decide whether to create compressed object
or let the content manager do it.
* object: if compression is requested and the repo supports it, pass compression ID to the content manager
* cli: show compression status in 'repository status'
* cli: output compression information in 'content list' and 'content stats'
* content: compression and decompression support
* content: unit tests for compression
* object: compression tests
* testing: added integration tests against v2 index
* testing: run all e2e tests with and without content-level compression
* htmlui: added UI for specifying index format on creation
* cli: additional tests for 'content ls' and 'content stats'
* applied pr suggestions
Removed Warning, Notify and Fatal:
* `Warning` => `Error` or `Info`
* `Notify` => `Info`
* `Fatal` was never used.
Note that --log-level=warning is still supported for backwards
compatibility, but it is the same as --log-level=error.
Co-authored-by: Julio López <julio+gh@kasten.io>
* testing: ensure tests are releasing all buffer pools to reduce memory usage, we had huge leaks
* object: reduced complexity and memory usage of TestEndToEndReadAndSeekWithCompression
* manifest: more test fixes
* trivial: update comment
Co-authored-by: Julio López <julio+gh@kasten.io>
* ci: refactored CI/CD logic & Makefile
- removed all travis CI emulation environment variables and replaced with:
CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true
- refactored all OS and architecture-specific decisions to use around standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel
* testing: fixed tests on ARM and ARM64
- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
* object: refactored Open() and VerifyObject() to be stateless
(no code movement yet to facilitate review)
* mechanical: moved function more appropriate files
* object: remove object manager tracing which was unused
* linter: upgraded to 1.33, disabled some linters
* lint: fixed 'errorlint' errors
This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.
Verified that there are no exceptions for errorlint linter which
guarantees that.
* lint: fixed or suppressed wrapcheck errors
* lint: nolintlint and misc cleanups
Co-authored-by: Julio López <julio+gh@kasten.io>
* object: fixed race condition between Result() and Checkpoint()
This would sometimes result in indirect objects having empty object IDs.
Fixes#648
* upload: ensure checkpoints never containt empty object IDs.
* testing: reduce armhf test weight
* object: refactored writer to detect split points before writing
This introduces new primitive that will be moved into splitters
themselves in subsequent commits. I'm doing this in small steps to
ensure we don't regress at any time.
* splitter: refactored TestSplitters test
This is use slow (byte-by-byte) and fast (nextSplitPoint) methods of
determining split points.
Note nextSplitPoint is not implemented by splitters yet, but this
verifies that the test is expecting the right thing.
* object: splitter refactoring - replaced ShouldSplit() with NextSplitPoint() everywhere, still not optimized
* splitter: added additional dimension to splitter_test
We split either in large chunks or one byte at a time to catch
the corner cases in the splitter implementation.
* splitter: optimized splitters using NextSplitPoint primitive
This improves splitter performance by about 40% (buzhash) and makes
it virtually free for FIXED splitter.
* fixed a number of cases where misaligned data was causing panics on armv7 (but not armv8)
* travis: enable arm64
* test: reduce compressed data sizes when running on arm
* arm: wait longer for snapshots
When prefix is not specified on ObjectWriter, we force
'x' content prefix on intermediate contents, so object IDs
will look like:
Ix{hash}
This ensures the index contents will be stored in `q` blobs,
making `snapshot gc` easier.
* object: added AsyncWrites to ObjectWriter, which improves performance of uploading of a single file
Fixes#351
Co-Authored-By: Julio López <julio+gh@kasten.io>
* performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression
* repo: switched object writer to buf.Pool
* content: switched encryption to use buf.Pool
* object: switched compression to use buf.Pool
* testing: added missing content manager Close()
- added pooled splitters and ability to reset them without having to recreate
- added support for caller-provided compressor output to be able to pool it
- added pooling of compressor instances, since those are costly
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)
This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
this effectively defeated the purpose of compression, caused high
memory usage and other kinds of bad behavior.
refactored the code to prevent this issue by resetting the buffer
at the caller not callee.
fixed previous e2e test to catch the issue mentioned in #166,
verified it fails against master and passes with this change.
Also introduced strongly typed content.ID and manifest.ID (instead of string)
This aligns identifiers across all layers of repository:
blob.ID
content.ID
object.ID
manifest.ID
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.
Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.
Also renamed CLI subcommands from `kopia storage` to `kopia blob`.
While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
The splitter in question was depending on
github.com/silvasur/buzhash which is not licensed according to FOSSA bot
Switched to new faster implementation of buzhash, which is
unfortunately incompatible and will split the objects in different
places.
This change is be semi-breaking - old repositories can be read, but
when uploading large objects they will be re-uploaded where previously
they would be de-duped.
Also added 'benchmark splitters' subcommand and moved 'block cryptobenchmark'
subcommand to 'benchmark crypto'.
completely rewrote password storage:
- by default passwords are kept in OS-specific keyring (Keychain on macOS,
Windows Credentials Manager on Windows), which can be optionally disabled
to store password in a local file.
- on Linux keychain is disabled by default (does not work reliably
in terminal sessions), but can be enabled using command-line flag.