`snapshot gc` marks contents not reachable from the root of any snapshot
as soft-deleted
The algorithm is a mark-and-sweep with parallel iteration of objects.
Currently it stores content IDs and object IDs in a map, so won't scale
to huge repositories, but this can be fixed in the future.
This fixes#110 at least for reasonable repository sizes.
Previously, it was possible for Flush() to miss in-flight writes,
but only when using repository manually since Uploader guarantees
there are no in-flight writes when it completes.
With this change Flush() will guarantee that any pending writes
completed before Flush() has started are guaranteed to be committed
to the repository before Flush() returns.
This was actually a regression introduced in #105.
Added regression test to prevent it from reoccurring.
Repository.Token() generates a base64-encoded token that can
be stored in password manager that fully describes repository connection
information (blob.ConnectionInfo) and optionally a password.
Use `kopia repo status -t` to print the token.
Use `kopia repo status -t -s` to print the token that also includes
repository password.
Use `kopia repo connect from-config --token T` to reconnect using the
token.
Also introduced strongly typed content.ID and manifest.ID (instead of string)
This aligns identifiers across all layers of repository:
blob.ID
content.ID
object.ID
manifest.ID
This updates the terminology everywhere - blocks become blobs and
`storage.Storage` becomes `blob.Storage`.
Also introduced blob.ID which is a specialized string type, that's
different from CABS block ID.
Also renamed CLI subcommands from `kopia storage` to `kopia blob`.
While at it introduced `block.ErrBlockNotFound` and
`object.ErrObjectNotFound` that do not leak from lower layers.
previously the coverage for each package only included lines covered by tests in the same package,
excluding end-to-end tests
now, we get very different coverage numbers...
instead of having time-based naming, block manager will perform occasional compaction at startup time and delete unwanted blocks
The protocol is safe from concurrency standpoint and provides eventual consistency.
We only delete blocks after creating compacted blocks.
We retry loading index until (any) consistent index is fetched (possibly from cache) and all underlying blocks are also available, not necessarily the latest ones.
TODO - we need to periodically snapshot block index contents, so that if we have a bug somewhere in compaction code, we have a way of restoring working indexes.