Commit Graph

51 Commits

Author SHA1 Message Date
Jarek Kowalski
8b760b66a8 logging: added memoization of Logger instances per context (#1369) 2021-10-09 05:02:18 -07:00
Eng Zer Jun
73e492c9db refactor: move from io/ioutil to io and os package (#1360)
* refactor: move from io/ioutil to io and os package

The io/ioutil package has been deprecated as of Go 1.16, see
https://golang.org/doc/go1.16#ioutil. This commit replaces the existing
io/ioutil functions with their new definitions in io and os packages.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

* chore: remove //nolint:gosec for os.ReadFile

At the time of this commit, the G304 rule of gosec does not include the
`os.ReadFile` function. We remove `//nolint:gosec` temporarily until
https://github.com/securego/gosec/pull/706 is merged.

Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>
2021-10-06 08:39:10 -07:00
Jarek Kowalski
792cc874dc repo: allow reusing of object writer buffers (#1315)
This reduces memory consumption and speeds up backups.

1. Backing up kopia repository (3.5 GB files:133102 dirs:20074):

before: 25s, 490 MB
after: 21s, 445 MB

2. Large files (14.8 GB, 76 files)

before: 30s, 597 MB
after: 28s, 495 MB

All tests repeated 5 times for clean local filesystem repo.
2021-09-25 14:54:31 -07:00
Jarek Kowalski
35d0f31c0d huge: replaced the use of allocated byte slices with populating gather.WriteBuffer in the repository (#1244)
This helps recycle buffers more efficiently during snapshots.
Also, improved memory tracking, enabled profiling flags and added pprof
by default.
2021-08-20 08:45:10 -07:00
Jarek Kowalski
40510c043d Support for content-level compression (#1076)
* cli: added a flag to create repository with v2 index features

* content: plumb through compression.ID parameter to content.Manager.WriteContent()

* content: expose content.Manager.SupportsContentCompression

This allows object manager to decide whether to create compressed object
or let the content manager do it.

* object: if compression is requested and the repo supports it, pass compression ID to the content manager

* cli: show compression status in 'repository status'

* cli: output compression information in 'content list' and 'content stats'

* content: compression and decompression support

* content: unit tests for compression

* object: compression tests

* testing: added integration tests against v2 index

* testing: run all e2e tests with and without content-level compression

* htmlui: added UI for specifying index format on creation

* cli: additional tests for 'content ls' and 'content stats'

* applied pr suggestions
2021-05-22 05:35:27 -07:00
Jarek Kowalski
30ca3e2e6c Upgraded linter to 1.40.1 (#1072)
* tools: upgraded linter to 1.40.1

* lint: fixed nolintlint vionlations

* lint: disabled tagliatele linter

* lint: fixed remaining warnings
2021-05-15 12:12:34 -07:00
Jarek Kowalski
df430371b9 Refactored content.Info to be an interface and switched index parsing to be lazy (#1008) 2021-04-27 05:53:52 -07:00
Jarek Kowalski
2062c07259 mechanical field renames (#988)
* content: mechanical rename content.Info.Length -> content.Info.PackedLength
* server: renamed grpc API ContentInfo.length->packed_length (non-breaking)
2021-04-16 22:42:32 -07:00
Jarek Kowalski
f4347886b8 logging: simplified log levels (#954)
Removed Warning, Notify and Fatal:

* `Warning` => `Error` or `Info`
* `Notify` => `Info`
* `Fatal` was never used.

Note that --log-level=warning is still supported for backwards
compatibility, but it is the same as --log-level=error.

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-04-09 07:27:35 -07:00
Jarek Kowalski
7c108930ef testing: ensure tests are releasing all buffer pools to reduce memory usage, we had huge leaks (#895)
* testing: ensure tests are releasing all buffer pools to reduce memory usage, we had huge leaks

* object: reduced complexity and memory usage of TestEndToEndReadAndSeekWithCompression

* manifest: more test fixes

* trivial: update comment

Co-authored-by: Julio López <julio+gh@kasten.io>
2021-03-18 06:40:33 -07:00
Jarek Kowalski
e2b9a81ac3 Major CI/CD refactoring and re-added support for ARM/ARM64 runners (#849)
* ci: refactored CI/CD logic & Makefile

- removed all travis CI emulation environment variables and replaced with:

CI_TAG=<empty>|tag
IS_PULL_REQUEST=false|true

- refactored all OS and architecture-specific decisions to use around standard GOOS/GOARCH values instead of uname/OS
- re-added self-hosted runner for ARMHF (3 replicas)
- added brand new self-hosted runner for ARM64 (3 replicas)
- disabled attempts to publish and sign on forks
- improved integration test log output to better see timings and sub-tests
- print longest tests (unit tests and integration) after each run
- verified that all configurations build successfully on a clone (jkowalski/kopia)
- run make setup in parallel

* testing: fixed tests on ARM and ARM64

- fixed ARM-specific alignment issue
- cleaned up test logging
- fixed huge params warning threshold because it was tripping on ARM.
- reduced test complexity to make them fit in 15 minutes
2021-02-23 00:52:54 -08:00
Jarek Kowalski
1f3b8d4da4 upgrade linter to 1.35 (#786)
* lint: added test that enforces Makefile and GH action linter versions are in sync
* workaround for linter gomnd problem - https://github.com/golangci/golangci-lint/issues/1653
2021-01-16 18:21:16 -08:00
Jarek Kowalski
ed9db56b87 Cleaned up and refactored object manager (#782)
* object: refactored Open() and VerifyObject() to be stateless

(no code movement yet to facilitate review)

* mechanical: moved function more appropriate files

* object: remove object manager tracing which was unused
2021-01-13 00:40:23 -08:00
Jarek Kowalski
d52af7cebb fixed cases where nil was passed to errors.Wrap() causing nil to be returned (#747) 2020-12-24 20:35:29 -08:00
Jarek Kowalski
e03971fc59 Upgraded linter to v1.33.0 (#734)
* linter: upgraded to 1.33, disabled some linters

* lint: fixed 'errorlint' errors

This ensures that all error comparisons use errors.Is() or errors.As().
We will be wrapping more errors going forward so it's important that
error checks are not strict everywhere.

Verified that there are no exceptions for errorlint linter which
guarantees that.

* lint: fixed or suppressed wrapcheck errors

* lint: nolintlint and misc cleanups

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-12-21 22:39:22 -08:00
Jarek Kowalski
66cebb79cb Fixed empty object IDs in checkpoints (#649)
* object: fixed race condition between Result() and Checkpoint()

This would sometimes result in indirect objects having empty object IDs.

Fixes #648

* upload: ensure checkpoints never containt empty object IDs.

* testing: reduce armhf test weight
2020-09-29 07:14:47 -07:00
Jarek Kowalski
f0b97b960b Fixed checkpointing to not restart the entire upload process (#594)
* object: added Checkpoint() method to object writer

* upload: refactored code structure to allow better checkpointing

* upload: removed Checkpoint() method from UploadProgress

* Update fs/entry.go

Co-authored-by: Julio López <julio+gh@kasten.io>
2020-09-12 22:36:22 -07:00
Jarek Kowalski
e22d22dba2 object: implemented fast concatenation of objects by merging their index entries (#607) 2020-09-11 20:12:01 -07:00
Jarek Kowalski
faf280616a Splitter throughput improvements (#606)
* object: refactored writer to detect split points before writing

This introduces new primitive that will be moved into splitters
themselves in subsequent commits. I'm doing this in small steps to
ensure we don't regress at any time.

* splitter: refactored TestSplitters test

This is use slow (byte-by-byte) and fast (nextSplitPoint) methods of
determining split points.

Note nextSplitPoint is not implemented by splitters yet, but this
verifies that the test is expecting the right thing.

* object: splitter refactoring - replaced ShouldSplit() with NextSplitPoint() everywhere, still not optimized

* splitter: added additional dimension to splitter_test

We split either in large chunks or one byte at a time to catch
the corner cases in the splitter implementation.

* splitter: optimized splitters using NextSplitPoint primitive

This improves splitter performance by about 40% (buzhash) and makes
it virtually free for FIXED splitter.
2020-09-11 19:45:48 -07:00
Julio López
aebf8616a2 object: loadSeekTable helper (#608) 2020-09-11 19:12:13 -07:00
Jarek Kowalski
9a6dea898b Linter upgrade to v1.30.0 (#526)
* fixed godot linter errors
* reformatted source with gofumpt
* disabled some linters
* fixed nolintlint warnings
* fixed gci warnings
* lint: fixed 'nestif' warnings
* lint: fixed 'exhaustive' warnings
* lint: fixed 'gocritic' warnings
* lint: fixed 'noctx' warnings
* lint: fixed 'wsl' warnings
* lint: fixed 'goerr113' warnings
* lint: fixed 'gosec' warnings
* lint: upgraded linter to 1.30.0
* lint: more 'exhaustive' warnings

Co-authored-by: Nick <nick@kasten.io>
2020-08-12 19:28:53 -07:00
Jarek Kowalski
40acf238f3 Fixed arm and arm64 build. (#506)
* fixed a number of cases where misaligned data was causing panics on armv7 (but not armv8)
* travis: enable arm64
* test: reduce compressed data sizes when running on arm
* arm: wait longer for snapshots
2020-07-30 17:31:28 -07:00
Jarek Kowalski
573d10422a object: ensure that all I objects have a content prefix
When prefix is not specified on ObjectWriter, we force
'x' content prefix on intermediate contents, so object IDs
will look like:

Ix{hash}

This ensures the index contents will be stored in `q` blobs,
making `snapshot gc` easier.
2020-04-12 23:55:09 -07:00
Jarek Kowalski
8687f1c008 object: added AsyncWrites to ObjectWriter, which improves performance… (#369)
* object: added AsyncWrites to ObjectWriter, which improves performance of uploading of a single file

Fixes #351

Co-Authored-By: Julio López <julio+gh@kasten.io>
2020-03-22 09:02:33 -07:00
Jarek Kowalski
239d809075 performance: introduced buf.Pool which helps reuse memory buffers (#345)
* performance: added buf.Pool which can be used to manage ephemeral buffers for encryption and compression
* repo: switched object writer to buf.Pool
* content: switched encryption to use buf.Pool
* object: switched compression to use buf.Pool
* testing: added missing content manager Close()
2020-03-18 20:42:16 -07:00
Jarek Kowalski
8d452a8285 performance: improvements to object manager (#336)
- added pooled splitters and ability to reset them without having to recreate
- added support for caller-provided compressor output to be able to pool it
- added pooling of compressor instances, since those are costly
2020-03-13 08:56:18 -07:00
Jarek Kowalski
d181403284 crypto: refactored encryption, hashing and splitter into separate packages (#274)
Added some tests, deleted XSALSA20 which never worked E2E
2020-02-27 12:36:49 -08:00
Jarek Kowalski
c8fcae93aa logging: refactored logging
This is mostly mechanical and changes how loggers are instantiated.

Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).

By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling

ctx = logging.WithLogger(ctx, newLoggerFunc)

To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)

This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).

It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
2020-02-25 17:24:44 -08:00
Jarek Kowalski
0b8c4d0ef9 object: fixed compression bug where we were not clearing the buffer
this effectively defeated the purpose of compression, caused high
memory usage and other kinds of bad behavior.

refactored the code to prevent this issue by resetting the buffer
at the caller not callee.

fixed previous e2e test to catch the issue mentioned in #166,
verified it fails against master and passes with this change.
2020-01-09 16:36:57 -08:00
Jarek Kowalski
ac70a38101 lint: upgraded to 1.22.2 and make lint issues a build failure
fixed or silenced linter warnings, mostly due to magic numeric constants
2020-01-03 16:39:30 -08:00
Jarek Kowalski
2ba4e83cef moved all compression to separate package and sanitized identifiers 2019-12-10 23:25:28 -08:00
Jarek Kowalski
aec3cdcb2f object: added support for compressed objects 2019-12-10 23:25:28 -08:00
Jarek Kowalski
cfa2eea1e1 object: stop verifying object length in VerifyObject
With compression, the decompressed object length is
not known until we read the content.
2019-12-10 23:25:28 -08:00
Jarek Kowalski
58759551b3 compression: added object.Compressor object and some implementations 2019-12-10 23:25:28 -08:00
Jarek Kowalski
6217df1a87 lint: switched to 1.21 and fixed a ton of whitespace issues discovered
by new wsl linter
2019-11-26 06:49:49 -08:00
Jarek Kowalski
f4afaec9b5 object: fixed Seek() behavior for seeking to end of object 2019-11-25 23:36:24 -08:00
Jarek Kowalski
72520029b0 golangci-lint: added more linters
Also fixed pre-existing lint errors.
2019-06-02 22:56:57 -07:00
Jarek Kowalski
54edb97b3a refactoring: renamed repo/block to repo/content
Also introduced strongly typed content.ID and manifest.ID (instead of string)

This aligns identifiers across all layers of repository:

blob.ID
content.ID
object.ID
manifest.ID
2019-06-01 22:24:19 -07:00
Jarek Kowalski
916da07e0f deprecate block format v0 2019-06-01 16:40:51 -07:00
Jarek Kowalski
9e5d0beccd refactoring: renamed storage.Storage to blob.Storage
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.

Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.

Also renamed CLI subcommands from `kopia storage` to `kopia blob`.

While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
2019-06-01 14:10:35 -07:00
Jarek Kowalski
1a7a02ddbe cleanup imports by grouping all local imports together 2019-06-01 10:57:55 -07:00
Jarek Kowalski
63303904e1 switched remaining fmt.Errorf to errors.Wrap() 2019-06-01 10:57:05 -07:00
Jarek Kowalski
6678c01f46 object_splitter_test: added baseline test 2019-05-31 06:49:54 -07:00
Jarek Kowalski
03339c18af [breaking change] deprecated DYNAMIC splitter due to license issue
The splitter in question was depending on
github.com/silvasur/buzhash which is not licensed according to FOSSA bot

Switched to new faster implementation of buzhash, which is
unfortunately incompatible and will split the objects in different
places.

This change is be semi-breaking - old repositories can be read, but
when uploading large objects they will be re-uploaded where previously
they would be de-duped.

Also added 'benchmark splitters' subcommand and moved 'block cryptobenchmark'
subcommand to 'benchmark crypto'.
2019-05-30 22:20:45 -07:00
Jarek Kowalski
0c41d41276 Fixed up paths after merge 2019-05-27 15:48:39 -07:00
Jarek Kowalski
327d8317d8 refactored repo/ into separate github.com/kopia/repo/ git repository 2018-10-26 20:40:57 -07:00
Jarek Kowalski
dd1c0943cd refactor: moved config file management to kopia/kopia/repo
also fixed layering issue and removed dependency from 'object'
on 'config'
2018-10-23 19:43:43 -07:00
Jarek Kowalski
2f8481b9d4 repo: removed TESTONLY_MD5 algorithm everywhere 2018-10-17 17:58:35 -07:00
Jarek Kowalski
1b014c875a simplified repository API password handling.
completely rewrote password storage:

- by default passwords are kept in OS-specific keyring (Keychain on macOS,
Windows Credentials Manager on Windows), which can be optionally disabled
to store password in a local file.

- on Linux keychain is disabled by default (does not work reliably
in terminal sessions), but can be enabled using command-line flag.
2018-09-07 21:34:31 -07:00
Jarek Kowalski
1bbc169c0d added missing package godoc, fixed test paths 2018-08-31 19:12:58 -07:00