kopia

mirror of https://github.com/kopia/kopia.git synced 2026-05-11 16:25:13 -04:00

Author	SHA1	Message	Date
Matthieu MOREL	8a176255c0	fix(general): enable wsl for all go files (#4524) Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>	2025-04-26 13:01:20 -07:00
Julio López	961a39039b	refactor(general): use `errors.New` where appropriate (#4160) Replaces 'errors.Errorf\("([^"]+)"\)' => 'errors.New("\1")'	2024-10-05 19:05:00 -07:00
Julio López	1f9f9a1846	chore(general): use non-formatting log variants when there is no formatting (#3931) Use non-formatting logging functions for messages without formatting. For example, `log.Info("message")` instead of `log.Infof("message")`. Configure linter for printf-like functions	2024-06-18 23:13:17 -07:00
Jarek Kowalski	b55d5b474c	refactor(repository): refactored internal index read API to reduce memory allocations (#3754) * refactor(repository): refactored internal index read API to reduce memory allocations * fixed stress test flake, improved debuggability * fixed spurious checklocks failures * post-merge fixes * pr feedback	2024-04-12 22:59:11 -07:00
Jarek Kowalski	09415e0c7d	chore(ci): upgraded to go 1.22 (#3746) Upgrades go to 1.22 and switches to new-style for loops --------- Co-authored-by: Julio López <1953782+julio-lopez@users.noreply.github.com>	2024-04-08 09:52:47 -07:00
Jarek Kowalski	fe55dcb6a2	feat(repository): added hard size limit to the on-disk cache (#3238) * test(providers): added capacity limits to blobtesting.mapStorage * refactor(general): added mutex map which dynamically allocates and releases named mutexes * refactor(repository): refactored cache cleanup and limit enforcement * refactor(repository): plumb through cache size limits in the repository * feat(cli): added CLI options to set cache size limits * unified flag setting and field naming * Update cli/command_cache_set.go Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com> * pr feedback --------- Co-authored-by: Shikhar Mall <mall.shikhar.in@gmail.com>	2023-08-24 09:38:56 -07:00
Julio Lopez	a99e38c247	fix(lint): remove uses of deprecated rand.Read (#2858) Lint fixes in preparation for moving to Go 1.20. Remove deprecated calls to `rand.Seed`: in Go 1.20 the default generator is seeded randomly at program startup, which is the desired behavior for these tests. Remove uses of deprecated rand.Read: replace with calls to rand.Uint64(). Remove deprecated uses of rand.Read in content manager tests and S3 versioned tests. Adds concurrency-safe helpers to provide functionality similar to that provided by `rand.Read(b []byte) (int, error)`	2023-03-28 01:44:09 +00:00
Edward Betts	1e97574391	fix(general): correct spelling mistakes (#2684)	2023-01-21 07:37:15 -08:00
Jarek Kowalski	f8be8f6a56	refactor(repository): extract parts repo/content into packages (#2651) - repolog package - blobcrypto package - indexblob package Minor cleanups: - removed dead code - introduced New*() methods for object construction	2022-12-17 16:19:12 +00:00
Jarek Kowalski	65f295ed79	refactor(repository): replaced atomic values with Go 1.19 atomic wrappers (#2590) Almost all were easy to replace, except ones exposed via JSON which have been left as-is. The linter has a cool behavior where it flags attempts to pass `atomic.Int32` for example by value, which is always a mistake, say as an argument to `fmt.Sprintf()`	2022-11-19 18:39:04 +00:00
Jarek Kowalski	51dcaa985d	chore(ci): upgraded linter to 1.48.0 (#2294) Mechanically fixed all issues, added `lint-fix` make target.	2022-08-09 06:07:54 +00:00
Jarek Kowalski	70e24106ee	refactor(general): unified logging.Logger with *zap.SugaredLogger (#2090) - removed a bunch of hacks and should improve the logging performance by avoiding interfaces and data translation. This will allow the use of de-sugared loggers in performance-critical logging situations. - this will also allow using features of ZAP more directly without having to reimplement them. - moved logging.Printf() to testlogging - refactored `uitask` to store logs in a structural format and present them as JSON only in the UI - renamed printf_logger.go to printf.go so that fewer columns are used in the logs	2022-06-26 05:11:52 +00:00
Jarek Kowalski	9bf9cac7fb	refactor(repository): ensure we always parse content.ID and object.ID (#1960) * refactor(repository): ensure we always parse content.ID and object.ID This changes the types to be incompatible with string to prevent direct conversion to and from string. This has the additional benefit of reducing number of memory allocations and bytes for all IDs. content.ID went from 2 allocations to 1: typical case 32 characters + 16 bytes per-string overhead worst-case 65 characters + 16 bytes per-string overhead now: 34 bytes object.ID went from 2 allocations to 1: typical case 32 characters + 16 bytes per-string overhead worst-case 65 characters + 16 bytes per-string overhead now: 36 bytes * move index.{ID,IDRange} methods to separate files * replaced index.IDFromHash with content.IDFromHash externally * minor tweaks and additional tests * Update repo/content/index/id_test.go Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com> * Update repo/content/index/id_test.go Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com> * pr feedback * post-merge fixes * pr feedback * pr feedback * fixed subtle regression in sortedContents() This was actually not producing invalid results because of how base36 works, just not sorting as efficiently as it could. Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>	2022-05-25 14:15:56 +00:00
Jarek Kowalski	b81362d72c	testing: do not run randomized tests in code coverage mode (#1585)	2021-12-13 22:07:50 -08:00
Jarek Kowalski	cead806a3f	blob: changed default shards from {3,3} to {1,3} (#1513) * blob: changed default shards from {3,3} to {1,3} Turns out for very large repository around 100TB (5M blobs), we end up creating max ~16M directories which is way too much and slows down listing. Currently each leaf directory only has a handful of files. Simple sharding of {3} should work much better and will end up creating directories with meaningful shard sizes - 12 K files per directory should not be too slow and will reduce the overhead of listing by 4096 times. The change is done in a backwards-compatible way and will respect custom sharding (.shards) file written by previous 0.9 builds as well as older repositories that don't have the .shards file (which we assume to be {3,3}). * fixed compat tests	2021-11-16 06:02:04 -08:00
Jarek Kowalski	a0cfa2556f	introduced structural debug logging and optional JSON output (#1475) * logging: added Logger.Debugw(message, key1, value1, ..., keyN, valueN) This is based on ZAP and allows structural logs to be emitted. * cli: added --json-log-console and --json-log-file flags * logging: updated storage logging wrapper to use structural logging * pr feedback	2021-11-03 21:57:37 -07:00
Eng Zer Jun	73e492c9db	refactor: move from io/ioutil to io and os package (#1360) * refactor: move from io/ioutil to io and os package The io/ioutil package has been deprecated as of Go 1.16, see https://golang.org/doc/go1.16#ioutil. This commit replaces the existing io/ioutil functions with their new definitions in io and os packages. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com> * chore: remove //nolint:gosec for os.ReadFile At the time of this commit, the G304 rule of gosec does not include the `os.ReadFile` function. We remove `//nolint:gosec` temporarily until https://github.com/securego/gosec/pull/706 is merged. Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>	2021-10-06 08:39:10 -07:00
Jarek Kowalski	792cc874dc	repo: allow reusing of object writer buffers (#1315) This reduces memory consumption and speeds up backups. 1. Backing up kopia repository (3.5 GB files:133102 dirs:20074): before: 25s, 490 MB after: 21s, 445 MB 2. Large files (14.8 GB, 76 files) before: 30s, 597 MB after: 28s, 495 MB All tests repeated 5 times for clean local filesystem repo.	2021-09-25 14:54:31 -07:00
Jarek Kowalski	e64d5b8eab	Fixed a few subtle threading bugs uncovered by stress test and rewrote the test to be model-based (#1157) * testing: refactored logs directory management * content: fixed index mutex to be shared across all write sessions, added mutex protection during writecontent/refresh race * testing: upload log artifacts * content: bump revision number after index has been added This fixes a bug where manifest manager in another session for the same open repository may not see a content added, because they will prematurely cache the incomplete set of contents. This took 2 weeks to find. * manifest: improved log output, fixed unnecessary mutex release * testing: rewrote stress test to be model-based and more precise	2021-07-01 21:37:27 -07:00
Jarek Kowalski	4b251bdaac	mechanical: added ctx parameter to repo.{Direct}WriteSession callback (#1114)	2021-06-02 23:12:30 -07:00
Jarek Kowalski	40510c043d	Support for content-level compression (#1076) * cli: added a flag to create repository with v2 index features * content: plumb through compression.ID parameter to content.Manager.WriteContent() * content: expose content.Manager.SupportsContentCompression This allows object manager to decide whether to create compressed object or let the content manager do it. * object: if compression is requested and the repo supports it, pass compression ID to the content manager * cli: show compression status in 'repository status' * cli: output compression information in 'content list' and 'content stats' * content: compression and decompression support * content: unit tests for compression * object: compression tests * testing: added integration tests against v2 index * testing: run all e2e tests with and without content-level compression * htmlui: added UI for specifying index format on creation * cli: additional tests for 'content ls' and 'content stats' * applied pr suggestions	2021-05-22 05:35:27 -07:00
Jarek Kowalski	df430371b9	Refactored content.Info to be an interface and switched index parsing to be lazy (#1008)	2021-04-27 05:53:52 -07:00
Jarek Kowalski	de840547e6	Improved upload reporting (#832) * blob: refactored upload reporting Instead of plumbing this through blob storage context, we are passing an explicit callback that reports uploads as they happen. * htmlui: improved counter presentation * nit: added missing UI route which fixes Reload behavior on the Tasks page	2021-02-13 10:51:11 -08:00
Jarek Kowalski	fa7976599c	repo: refactored repository interfaces (#780) - `repo.Repository` is now read-only and only has methods that can be supported over kopia server - `repo.RepositoryWriter` has read-write methods that can be supported over kopia server - `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection - `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository` - `repo.Reader` removed and merged with `repo.Repository` - `repo.Writer` became `repo.RepositoryWriter` - `repo.DirectRepository` struct became `repo.DirectRepository` interface Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs. `repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`). content: removed implicit flush on content manager close * repo: added tests for WriteSession() and implicit flush behavior * invalidate manifest manager after write session * cli: disable maintenance in 'kopia server start' Server will close the repository before completing. * repo: unconditionally close RepositoryWriter in {Direct,}WriteSession * repo: added panic in case somebody tries to create RepositoryWriter after closing repository - used atomic to manage SharedManager.closed * removed stale example * linter: fixed spurious failures Co-authored-by: Julio López <julio+gh@kasten.io>	2021-01-20 11:41:47 -08:00
Jarek Kowalski	1f3b8d4da4	upgrade linter to 1.35 (#786) * lint: added test that enforces Makefile and GH action linter versions are in sync * workaround for linter gomnd problem - https://github.com/golangci/golangci-lint/issues/1653	2021-01-16 18:21:16 -08:00
Jarek Kowalski	9a6dea898b	Linter upgrade to v1.30.0 (#526) * fixed godot linter errors * reformatted source with gofumpt * disabled some linters * fixed nolintlint warnings * fixed gci warnings * lint: fixed 'nestif' warnings * lint: fixed 'exhaustive' warnings * lint: fixed 'gocritic' warnings * lint: fixed 'noctx' warnings * lint: fixed 'wsl' warnings * lint: fixed 'goerr113' warnings * lint: fixed 'gosec' warnings * lint: upgraded linter to 1.30.0 * lint: more 'exhaustive' warnings Co-authored-by: Nick <nick@kasten.io>	2020-08-12 19:28:53 -07:00
Jarek Kowalski	d68273a576	Improvements for dealing with eventually-consistent stores (S3) (#437) * content: added support for cache of own writes This keeps track of which blobs (n and m) have been written by the local repository client, so that even if the storage listing is eventually consistent (as in S3), we get somewhat sane behavior. Note that this is still assuming read-after-create semantics, which S3 also guarantees, otherwise it's very hard to do anything useful. * compaction: support for compaction logs Instead of compaction immediately deleting source index blobs, we now write log entries (with `m` prefix) which are merged on reads and applied only if the blob list includes all inputs and outputs, in which case the inputs are discarded since they are known to have been superseded by the outputs. This addresses eventual consistency issues in stores such as S3, which don't guarantee list-after-put or list-after-delete. With such stores the repository is ultimately eventually consistent and there's not much that can be done about it, unless we use a second strongly consistent storage (such as GCS) for the index only. * content: updated list cache to cache both `n` and `m` * repo: fixed cache clear on windows Clearing cache requires closing repository first, as Windows is holding the files locked. This requires ability to close the repository twice. * content: refactored index blob management into indexBlobManager * testing: fixed blobtesting.Map storage to allow overwrites * blob: added debug output String() to blob.Metadata * testing: added indexBlobManager stress test This works by using N parallel "actors", each repeatedly performing operations on indexBlobManagers all sharing single eventually consistent storage. Each actor runs in a loop and randomly selects between: - reading all contents in indexes and verifying that it includes all contents written by the actor so far and that contents are correctly marked as deleted - creating new contents - deleting one of previously-created contents (by the same actor) - compacting all index files into one The test runs on accelerated time (every read of time moves it by 0.1 seconds) and simulates several hours of running. In case of a failure, the log should provide enough debugging information to trace the exact sequence of events leading up to the failure - each log line is prefixed with actorID and all storage access is logged. * makefile: increase test timeout * content: fixed index blob manager race The race is where if we delete compaction log too early, it may lead to previously deleted contents becoming temporarily live again to an outside observer. Added test case that reproduces the issue, verified that it fails without the fix and passes with it. * testing: improvements to TestIndexBlobManagerStress test - better logging to be able to trace the root cause in case of a failure - prevented concurrent compaction which is unsafe: The sequence: 1. A creates contentA1 in INDEX-1 2. B creates contentB1 in INDEX-2 3. A deletes contentA1 in INDEX-3 4. B does compaction, but is not seeing INDEX-3 (due to EC or simply because B started read before #3 completed), so it writes INDEX-4==merge(INDEX-1,INDEX-2) * INDEX-4 has contentA1 as active 5. A does compaction but it's not seeing INDEX-4 yet (due to EC or because read started before #4), so it drops contentA1, writes INDEX-5=merge(INDEX-1,INDEX-2,INDEX-3) * INDEX-5 does not have contentA1 6. C sees INDEX-4 and INDEX-5 and merge(INDEX-4,INDEX-5) contains contentA1 which is wrong, because contentA1 has been deleted (and there's no record of it anywhere in the system) * content: when building pack index ensure index bytes are different each time by adding 32 random bytes	2020-05-31 17:11:20 -07:00
Jarek Kowalski	be4b897579	Support for remote repository (#427) Support for remote content repository where all contents and manifests are fetched over HTTP(S) instead of locally manipulating blob storage * server: implement content and manifest access APIs * apiclient: moved Kopia API client to separate package * content: exposed content.ValidatePrefix() * manifest: added JSON serialization attributes to EntryMetadata * repo: changed repo.Open() to return Repository instead of DirectRepository repo: added apiServerRepository * cli: added 'kopia repository connect server' This sets up repository connection via the API server instead of directly-manipulated storage. * server: add support for specifying a list of usernames/password via --htpasswd-file * tests: added API server repository E2E test * server: only return manifests (policies and snapshots) belonging to authenticated user	2020-05-02 21:41:49 -07:00
Jarek Kowalski	6cb9b8fa4f	repo: refactored public API (#318) * This is 99% mechanical: Extracted repo.Repository interface that only exposes high-level object and manifest management methods, but not blob nor content management. Renamed old repo.Repository to repo.DirectRepository Reviewed codebase to only depend on repo.Repository as much as possible, but added way for low-level CLI commands to use DirectRepository. * PR fixes	2020-03-26 08:04:01 -07:00
Jarek Kowalski	c8fcae93aa	logging: refactored logging This is mostly mechanical and changes how loggers are instantiated. Logger is now associated with a context, passed around all methods (most methods had ctx, but it had to be added in a few missing places). By default Kopia does not produce any logs, but this can be overridden. To override locally for a nested context, call ctx = logging.WithLogger(ctx, newLoggerFunc). To override logs globally, call logging.SetDefaultLogger(newLoggerFunc). This refactoring allowed removing the dependency of the Kopia repo on the go-logging library (the CLI still uses it, though). It is now also possible to have all test methods emit logs using t.Logf() so that they show up in failure reports, which should make debugging of test failures suck less.	2020-02-25 17:24:44 -08:00
Jarek Kowalski	edca1733b6	repo: moved password persistence to repository layer	2020-02-09 20:55:07 -08:00
Julio Lopez	4625e5ba9e	Remove content.CompactOptions.MinSmallBlobs Use MaxSmallBlobs instead. MaxSmallBlobs was not really being used. Replaced uses of MinSmallBlobs with MaxSmallBlobs and removed MinSmallBlobs	2020-02-06 21:51:51 -08:00
Jarek Kowalski	ac70a38101	lint: upgraded to 1.22.2 and make lint issues a build failure fixed or silenced linter warnings, mostly due to magic numeric constants	2020-01-03 16:39:30 -08:00
Jarek Kowalski	6217df1a87	lint: switched to 1.21 and fixed a ton of whitespace issues discovered by new wsl linter	2019-11-26 06:49:49 -08:00
Jarek Kowalski	4c3272dd94	content: fixed content.Manager.Flush() Previously, it was possible for Flush() to miss in-flight writes, but only when using repository manually since Uploader guarantees there are no in-flight writes when it completes. With this change, any pending writes completed before Flush() has started are guaranteed to be committed to the repository before Flush() returns. This was actually a regression introduced in #105. Added regression test to prevent it from reoccurring.	2019-09-02 19:13:36 -07:00
Jarek Kowalski	b365d3414c	removed content.Manager.ListContents*() API	2019-07-27 19:10:58 -07:00
Jarek Kowalski	72520029b0	golangci-lint: added more linters Also fixed pre-existing lint errors.	2019-06-02 22:56:57 -07:00
Jarek Kowalski	54edb97b3a	refactoring: renamed repo/block to repo/content Also introduced strongly typed content.ID and manifest.ID (instead of string) This aligns identifiers across all layers of repository: blob.ID content.ID object.ID manifest.ID	2019-06-01 22:24:19 -07:00
Jarek Kowalski	9e5d0beccd	refactoring: renamed storage.Storage to blob.Storage This updates the terminology everywhere - blocks become blobs and `storage.Storage` becomes `blob.Storage`. Also introduced blob.ID which is a specialized string type, that's different from CABS block ID. Also renamed CLI subcommands from `kopia storage` to `kopia blob`. While at it introduced `block.ErrBlockNotFound` and `object.ErrObjectNotFound` that do not leak from lower layers.	2019-06-01 14:10:35 -07:00
Jarek Kowalski	1a7a02ddbe	cleanup imports by grouping all local imports together	2019-06-01 10:57:55 -07:00
Jarek Kowalski	63303904e1	switched remaining fmt.Errorf to errors.Wrap()	2019-06-01 10:57:05 -07:00
Jarek Kowalski	0c41d41276	Fixed up paths after merge	2019-05-27 15:48:39 -07:00
Jarek Kowalski	bdafe117d9	Makefile: switched linter to golangci-lint and updated goveralls setup fixed lint errors & removed .gometalinter config	2019-04-01 19:22:01 -07:00
Jarek Kowalski	e458ee24d8	imported github.com/kopia/kopia/repo and renamed package path to github.com/kopia/repo/	2018-10-26 17:33:58 -07:00
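
The directory counts quoted in commit cead806a3f ({3,3} giving about 16M directories, and a 4096-fold reduction for {3}) can be checked with a small sketch. It assumes that every sharded character is a hexadecimal digit, and `maxDirs` and `shardPath` are hypothetical helpers written for illustration, not Kopia's implementation.

```go
package main

import (
	"fmt"
	"path"
)

// maxDirs returns the maximum number of leaf directories for a sharding
// scheme, ASSUMING each sharded character is a hex digit (16 values).
func maxDirs(shards []int) int {
	n := 1
	for _, s := range shards {
		for i := 0; i < s; i++ {
			n *= 16
		}
	}
	return n
}

// shardPath (hypothetical) turns the leading characters of id into one
// directory level per shard and keeps the remainder as the file name.
func shardPath(id string, shards []int) string {
	parts := []string{}
	rest := id
	for _, s := range shards {
		if len(rest) <= s {
			break // ID too short to shard further
		}
		parts = append(parts, rest[:s])
		rest = rest[s:]
	}
	parts = append(parts, rest)
	return path.Join(parts...)
}

func main() {
	fmt.Println(maxDirs([]int{3, 3}))                     // 16^6 = 16777216
	fmt.Println(maxDirs([]int{3}))                        // 16^3 = 4096
	fmt.Println(maxDirs([]int{3, 3}) / maxDirs([]int{3})) // 4096
	fmt.Println(shardPath("abcdef0123456789", []int{1, 3}))
}
```

Under the same assumption, the {1,3} scheme named in the commit title yields at most 16 * 4096 = 65536 leaf directories.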

44 Commits
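
The concurrent-compaction sequence described in commit d68273a576 can be replayed with a deliberately simplified model. This is a sketch under stated assumptions: an index is just a map from content ID to a deleted flag, compaction drops deleted entries outright, and merging has no timestamps. The names `index`, `merge`, `compact` and `replayRace` are hypothetical and do not come from Kopia.

```go
package main

import "fmt"

// index maps a content ID to its deleted flag (simplified model).
type index map[string]bool

// merge combines indexes in order; later entries override earlier ones.
func merge(indexes ...index) index {
	out := index{}
	for _, ix := range indexes {
		for id, deleted := range ix {
			out[id] = deleted
		}
	}
	return out
}

// compact merges the indexes it can see and drops deleted entries,
// which is what loses the deletion record in the scenario.
func compact(indexes ...index) index {
	out := merge(indexes...)
	for id, deleted := range out {
		if deleted {
			delete(out, id)
		}
	}
	return out
}

// replayRace follows the steps from the commit message and returns
// the view that observer C ends up with.
func replayRace() index {
	index1 := index{"contentA1": false} // 1. A creates contentA1
	index2 := index{"contentB1": false} // 2. B creates contentB1
	index3 := index{"contentA1": true}  // 3. A deletes contentA1
	index4 := compact(index1, index2)   // 4. B compacts without seeing INDEX-3
	index5 := compact(index1, index2, index3) // 5. A compacts without seeing INDEX-4
	return merge(index4, index5)        // final step: C merges INDEX-4 and INDEX-5
}

func main() {
	view := replayRace()
	deleted, present := view["contentA1"]
	fmt.Println(present, deleted)
}
```

In this model contentA1 is present and not marked deleted in C's view, which is the incorrect outcome that motivated preventing concurrent compaction in the stress test.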