* blob: changed default shards from {3,3} to {1,3}
Turns out for very large repository around 100TB (5M blobs),
we end up creating max ~16M directories which is way too much
and slows down listing. Currently each leaf directory only has a handful
of files.
Simple sharding of {3} should work much better and will end up creating
directories with meaningful shard sizes - 12 K files per directory
should not be too slow and will reduce the overhead of listing by
4096 times.
The change is done in a backwards-compatible way and will respect
custom sharding (.shards) file written by previous 0.9 builds
as well as older repositories that don't have the .shards file (which
we assume to be {3,3}).
* fixed compat tests
* content: fixed repo upgrade version
Previously upgrade would enable epoch manager and index v2 but would
not set the version of the format itself. Everything worked fine
but it would not protect from old kopia opening the repository.
* ci: added compatibility test that uses real 0.8 and current binaries
* repo: added 'enable password change' flag (defaults to true for new repositories), which prevents embedding replicas of kopia.repository in pack blobs
* cli: added 'repo change-password' which can change the password of a connected repository
* repo: nit - renamed variables and functions dealing with key derivation
* repo: fixed cache validation HMAC secret to use stored HMAC secret instead of password-derived one
* cli: added test for repo change-password
* repo: negative cases for attempting to change password in an old repository
* Update cli/command_repository_change_password.go
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
Co-authored-by: Julio Lopez <julio+gh@kasten.io>
* cli: added a flag to create repository with v2 index features
* content: plumb through compression.ID parameter to content.Manager.WriteContent()
* content: expose content.Manager.SupportsContentCompression
This allows object manager to decide whether to create compressed object
or let the content manager do it.
* object: if compression is requested and the repo supports it, pass compression ID to the content manager
* cli: show compression status in 'repository status'
* cli: output compression information in 'content list' and 'content stats'
* content: compression and decompression support
* content: unit tests for compression
* object: compression tests
* testing: added integration tests against v2 index
* testing: run all e2e tests with and without content-level compression
* htmlui: added UI for specifying index format on creation
* cli: additional tests for 'content ls' and 'content stats'
* applied pr suggestions
* testing: ensure tests are releasing all buffer pools to reduce memory usage, we had huge leaks
* object: reduced complexity and memory usage of TestEndToEndReadAndSeekWithCompression
* manifest: more test fixes
* trivial: update comment
Co-authored-by: Julio López <julio+gh@kasten.io>
* manifest: removed explicit refresh
Instead, content manager is exposing a revision counter that changes
on each mutation or index change. Manifest manager will be invalidated
whenever this is encountered.
* server: refactored initialization API
* server: added unit tests for repository server APIs (HTTP and REST)
* server: ensure we don't upload contents that already exist
This saves bandwidth, since the client can compute hash locally
and ask the server whether the object exists before starting the upload.
* blob: refactored upload reporting
Instead of plumbing this through blob storage context, we are passing
and explicit callback that reports uploads as they happen.
* htmlui: improved counter presentation
* nit: added missing UI route which fixes Reload behavior on the Tasks page
- `repo.Repository` is now read-only and only has methods that can be supported over kopia server
- `repo.RepositoryWriter` has read-write methods that can be supported over kopia server
- `repo.DirectRepository` is read-only and contains all methods of `repo.Repository` plus some low-level methods for data inspection
- `repo.DirectRepositoryWriter` contains write methods for `repo.DirectRepository`
- `repo.Reader` removed and merged with `repo.Repository`
- `repo.Writer` became `repo.RepositoryWriter`
- `*repo.DirectRepository` struct became `repo.DirectRepository`
interface
Getting `{Direct}RepositoryWriter` requires using `NewWriter()` or `NewDirectWriter()` on a read-only repository and multiple simultaneous writers are supported at the same time, each writing to their own indexes and pack blobs.
`repo.Open` returns `repo.Repository` (which is also `repo.RepositoryWriter`).
* content: removed implicit flush on content manager close
* repo: added tests for WriteSession() and implicit flush behavior
* invalidate manifest manager after write session
* cli: disable maintenance in 'kopia server start'
Server will close the repository before completing.
* repo: unconditionally close RepositoryWriter in {Direct,}WriteSession
* repo: added panic in case somebody tries to create RepositoryWriter after closing repository
- used atomic to manage SharedManager.closed
* removed stale example
* linter: fixed spurious failures
Co-authored-by: Julio López <julio+gh@kasten.io>
content:Allow returning deleted content in GetContent
maintenance: check deleted contents as well
maintenance: test for when a directory content is reused after deletion
testing: add support for repo open options in repotesting
* Allow passing repo options to MustReopen
* Add repotesting.Environment.MustConnectOpenAnother
* Remove kopia.config.mlock file
* snapshot create helper
* Fix content delete related and e2e tests
Support for remote content repository where all contents and
manifests are fetched over HTTP(S) instead of locally
manipulating blob storage
* server: implement content and manifest access APIs
* apiclient: moved Kopia API client to separate package
* content: exposed content.ValidatePrefix()
* manifest: added JSON serialization attributes to EntryMetadata
* repo: changed repo.Open() to return Repository instead of *DirectRepository
* repo: added apiServerRepository
* cli: added 'kopia repository connect server'
This sets up repository connection via the API server instead of
directly-manipulated storage.
* server: add support for specifying a list of usernames/password via --htpasswd-file
* tests: added API server repository E2E test
* server: only return manifests (policies and snapshots) belonging to authenticated user
* This is 99% mechanical:
Extracted repo.Repository interface that only exposes high-level object and manifest management methods, but not blob nor content management.
Renamed old *repo.Repository to *repo.DirectRepository
Reviewed codebase to only depend on repo.Repository as much as possible, but added way for low-level CLI commands to use DirectRepository.
* PR fixes
Motivation: Allow time injection for (unit) tests, to more easily test and
verify time-dependent invariants.
Add time injection support for:
* repo.Manager
* manifest.Manager
* snapshot.Uploader
Then, wire up to these components. The content.Manager already had support for
time injection, but was not wired up from the time function passed to repo creation.
Add an internal/faketime package for testing. Mainly code movement from testing
code in the repo/content package. Motivation: make it available to other packages
outside content Also, add simple tests for faketime functions.
This is mostly mechanical and changes how loggers are instantiated.
Logger is now associated with a context, passed around all methods,
(most methods had ctx, but had to add it in a few missing places).
By default Kopia does not produce any logs, but it can be overridden,
either locally for a nested context, by calling
ctx = logging.WithLogger(ctx, newLoggerFunc)
To override logs globally, call logging.SetDefaultLogger(newLoggerFunc)
This refactoring allowed removing dependency from Kopia repo
and go-logging library (the CLI still uses it, though).
It is now also possible to have all test methods emit logs using
t.Logf() so that they show up in failure reports, which should make
debugging of test failures suck less.
Also introduced strongly typed content.ID and manifest.ID (instead of string)
This aligns identifiers across all layers of repository:
blob.ID
content.ID
object.ID
manifest.ID
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.
Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.
Also renamed CLI subcommands from `kopia storage` to `kopia blob`.
While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
The splitter in question was depending on
github.com/silvasur/buzhash which is not licensed according to FOSSA bot
Switched to new faster implementation of buzhash, which is
unfortunately incompatible and will split the objects in different
places.
This change is be semi-breaking - old repositories can be read, but
when uploading large objects they will be re-uploaded where previously
they would be de-duped.
Also added 'benchmark splitters' subcommand and moved 'block cryptobenchmark'
subcommand to 'benchmark crypto'.