Refactor `--profile-*` flags:
- Multiple profile types can now be enabled at once. Previously, only
a single profile type could be collected during a process execution.
- The new `--profiles-store-on-exit` enables all available profile
types, except for CPU profiling which needs to be explicitly enabled.
- Profiling parameters can now be set via new flags. This allows setting
the profile parameters for the pprof endpoint, as well as when saving
profiles to files on exit.
- Group profiling flags with other observability flags
- Add a `--diagnostics-output-directory` flag that unifies and
supersedes the `--profile-dir` and `--metrics-directory` flags
Enhancements and behavior changes:
- Profile flags now take effect for all kopia commands, including
`server start`. Previously, these flags had no effect
in a few commands.
The following flags have been removed:
- `--profile-dir`: superseded by the `--diagnostics-output-directory` flag
- `--profile-blocking`: the `--profile-store-on-exit` flag enables blocking
profiling. Use `--profile-blocking-rate=0` to explicitly disable it.
- `--profile-memory`: the `--profile-store-on-exit` flag enables memory
profiling. Use `--profile-memory-rate=0` to explicitly disable it.
- `--profile-mutex`: the `--profile-store-on-exit` flag enables mutex
profiling. Use `--profile-mutex-fraction=0` to explicitly disable it.
Add CLI test for profile flags.
- upgrade to golangci-lint 2.6.1
- updates for gosec
- updates for govet
- updates for perfsprint
- updates for modernize
Leaves out modernize:omitempty due to conflicts with tests
In kopia, "blob" is a generic term to refer to either
an object in an object storage provider, or a file
in a file system storage provider. There are various
types of blobs in a kopia repository.
In kopia, the term "pack" is used to refer to specific types
of blobs, namely 'p' and 'q' pack blobs, that store
"content" data, as opposed to, say, "index" blobs.
This change attempts to use the term "pack" consistently
in the functions and types used for pack deletion.
Note that the corresponding task names, shown below, remain
unchanged since these names are used in the persistent
maintenance run metadata, and that is used to make decisions
about the safety of the execution of those tasks.
```
TaskDeleteOrphanedBlobsQuick = "quick-delete-blobs"
TaskDeleteOrphanedBlobsFull = "full-delete-blobs"
```
Maintenance is critical for the health of the repository.
On the other hand, maintenance is complex because
it runs multiple sub-tasks, each of which may produce different
results depending on the maintenance policy.
The results may include deleting, combining, or adding
data and metadata in the repository.
It is worthwhile to add more observability for these
tasks for the following reasons:
- It helps with troubleshooting. Any data change
to the repository is critical, and the observability info
helps to understand what happened during
maintenance and why.
- It helps users understand and predict the
repository's behavior. The repository data may be stored
in a public cloud, where costs depend on
the amount and duration of data stored. At the same
time, the repository has its own policy to manage
the data, so data is not deleted until it is
safe to do so according to the policy.
The observability info helps users
understand how much data is in use,
how much data is no longer in use, and
when it is deleted.
* fix(cli): improve progress output control in repository sync and documentation
- Update progress flag description from "progress bar" to "progress output" for clarity
- Document progress control features in Logging, Synchronization, and Command-Line reference
- Support --no-progress flag for cleaner automation and scripting usage
- nit: rename function to repositoryAction.
It always calls the action with a repository
- move allocator stats functionality to observability
- rename observability functions to start/stop. They
start and stop more than just the metrics services.
- rename field to c.enablePProfEndpoint for clarity.
- add observability run function to make it explicit
where start and stop are called.
Ensure auto-maintenance errors are propagated.
This enables sending notifications for failed "auto-maintenances".
Preserve action callback error when closing the repository fails.
This is a breaking change for users who might be using Kopia as a library.
### Log Format
```json
{"t":"<timestamp-rfc-3339-microseconds>", "span:T1":"V1", "span:T2":"V2", "n":"<source>", "m":"<message>", /*parameters*/}
```
Where each record is associated with one or more spans that describe its scope:
* `"span:client": "<hash-of-username@hostname>"`
* `"span:repo": "<random>"` - random identifier of a repository connection (from `repo.Open`)
* `"span:maintenance": "<random>"` - random identifier of a maintenance session
* `"span:upload": "<hash-of-username@host:/path>"` - uniquely identifies upload session of a given directory
* `"span:checkpoint": "<random>"` - encapsulates each checkpoint operation during Upload
* `"span:server-session": "<random>"` - single client connection to the server
* `"span:flush": "<random>"` - encapsulates each Flush session
* `"span:maintenance": "<random>"` - encapsulates each maintenance operation
* `"span:loadIndex" : "<random>"` - encapsulates index loading operation
* `"span:emr" : "<random>"` - encapsulates epoch manager refresh
* `"span:writePack": "<pack-blob-ID>"` - encapsulates pack blob preparation and writing
(plus additional minor spans for various phases of the maintenance).
Notable points:
- uses an internal zero-allocation JSON writer for reduced memory usage.
- renamed `--disable-internal-log` to `--disable-repository-log` (controls saving blobs to repository)
- added `--disable-content-log` (controls writing of `content-log` files)
- all storage operations are also logged in a structured way and associated with the corresponding spans.
- all content IDs are logged in a truncated format (the first N bytes are usually enough to be unique) to improve compressibility of logs (blob IDs are frequently repeated but content IDs usually appear just once).
This format should make it possible to recreate the journey of any single content throughout pack blobs, indexes and compaction events.
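To show how a consumer might pull the span fields out of one record in the format described above, here is a small sketch using `encoding/json`. The `spansOf` function and the sample record are invented for this example; the record only follows the documented key layout and is not real kopia output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// spansOf decodes one JSON log record and returns its span fields,
// keyed by span type with the "span:" prefix removed.
func spansOf(line string) (map[string]string, error) {
	var rec map[string]any
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return nil, err
	}

	spans := map[string]string{}

	for k, v := range rec {
		name, ok := strings.CutPrefix(k, "span:")
		if !ok {
			continue
		}

		if s, isString := v.(string); isString {
			spans[name] = s
		}
	}

	return spans, nil
}

func main() {
	// Invented sample record that follows the documented key layout.
	line := `{"t":"2024-01-02T03:04:05.000006Z","span:repo":"abc123","span:maintenance":"def456","n":"maintenance","m":"example message"}`

	spans, err := spansOf(line)
	if err != nil {
		panic(err)
	}

	names := make([]string, 0, len(spans))
	for name := range spans {
		names = append(names, name)
	}

	sort.Strings(names) // map iteration order is random, so sort for stable output

	for _, name := range names {
		fmt.Printf("%s=%s\n", name, spans[name])
	}
}
```

Grouping records by a span value such as `repo` or `maintenance` is then a matter of using the returned map as a key source.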
* Remove unused return value from ListIndexBlobInfos
* Unexport index.Builder.buildStable
* Remove unnecessary OneUseBuilder.BuildStable
* Remove unnecessary `BuilderCreator` interface,
use a function type instead.
* Cleanup comment
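The interface-to-function-type simplification can be sketched as follows. All names here (`builder`, `builderCreatorInterface`, `builderCreator`, `buildWith`) are hypothetical stand-ins and do not reproduce the signatures in kopia's index package.

```go
package main

import "fmt"

// builder is a hypothetical stand-in for an index builder.
type builder struct {
	entries []string
}

func (b *builder) add(e string) {
	b.entries = append(b.entries, e)
}

// Before: a single-method interface forces callers to declare a named
// type just to supply one function.
type builderCreatorInterface interface {
	createBuilder() *builder
}

// After: a function type expresses the same contract, and callers can
// pass a function literal directly.
type builderCreator func() *builder

// buildWith creates a builder through the supplied function and fills it.
func buildWith(create builderCreator, entries ...string) *builder {
	b := create()
	for _, e := range entries {
		b.add(e)
	}

	return b
}

func main() {
	b := buildWith(func() *builder { return &builder{} }, "a", "b")
	fmt.Println(len(b.entries)) // prints 2
}
```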
Move general functionality from the `content verify` CLI
command implementation to helpers in the content package.
The primary motivation is to allow reusing the content
verification functionality during maintenance.
A separate followup change also extends content
verification to include additional stats useful for
debugging repository corruptions.
Overview of the changes:
- Relocation of the content verification functionality
to the content package. The entry point is
content.WriteManager.VerifyContents.
This is primarily code movement with no functional changes.
- Addition of unit tests for the content verification functionality
by exercising content.WriteManager.VerifyContents.
- Minor functional change: changing the logging level from
Error to Warn for the "inner loop" error messages. This allows
filtering out these messages if needed, while still observing the
error message that is logged for the overall operation.
Nits and cleanups:
- clarify log message to indicate the effect of advancing the deletion watermark;
- add omitzero JSON tag to appropriate fields in snapshot.Manifest struct;
- use maps.Clone instead of explicit loop;
- rename function to IterateUnreferencedPacks for clarity;
- use atomic.Int32 type;
- move a continue check to the beginning of the loop, no actual
work / side effects were performed before the check;
- reduce type requirement in blob.ReadBlobMap
Removes the following flags:
- `--caching`: no-op
- `--list-caching`: no-op
- `--enable-jaeger-collector`: errors out when specified.
Also removes the no-longer-used `deprecatedFlag` function.
- Add read stats to snapshot verifier output
- Add periodic JSON progress output.
- Refactor the use of directory summary.
- Use stats mutex for all stats.
- Add processedBytes to the snapshot verify output
- Output more frequently, when bytes processed changes
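A minimal sketch of stats guarded by a single mutex, with JSON progress emitted only when the processed byte count changes, might look like this. The `verifyStats` type, its fields, and the JSON shape are assumptions for illustration and are not taken from the snapshot verifier.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// verifyStats is a hypothetical stats holder; every field is guarded by mu.
type verifyStats struct {
	mu             sync.Mutex
	processedBytes int64
	reportedBytes  int64
}

// addBytes records n more processed bytes.
func (s *verifyStats) addBytes(n int64) {
	s.mu.Lock()
	defer s.mu.Unlock()

	s.processedBytes += n
}

// progressJSON returns a JSON progress record, or ok=false when
// processedBytes has not changed since the previous report.
func (s *verifyStats) progressJSON() (out string, ok bool) {
	s.mu.Lock()
	defer s.mu.Unlock()

	if s.processedBytes == s.reportedBytes {
		return "", false
	}

	s.reportedBytes = s.processedBytes

	b, err := json.Marshal(struct {
		ProcessedBytes int64 `json:"processedBytes"`
	}{s.processedBytes})
	if err != nil {
		return "", false
	}

	return string(b), true
}

func main() {
	var s verifyStats

	s.addBytes(10)

	if out, ok := s.progressJSON(); ok {
		fmt.Println(out)
	}

	// No new bytes were processed, so nothing is reported this time.
	_, ok := s.progressJSON()
	fmt.Println(ok)
}
```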
The `dirRelativePath` variable is actually the path to the file being
checked, but was treated as if it was the path to the parent directory,
causing the filename to be duplicated in log messages.
- enable `forcetypeassert` linter in non-test files
- add `//nolint` annotations
- add `testutil.EnsureType` helper for type assertions
- enable `forcetypeassert` linter in test files
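The `forcetypeassert` linter flags single-value type assertions such as `v.(string)`, which panic on a mismatch. A checked assertion can be wrapped in a small generic helper, as sketched below. The `ensureType` function is a hypothetical example; the signature of `testutil.EnsureType` is not shown in this text and may differ (for instance, it may take a testing handle and fail the test instead of returning an error).

```go
package main

import "fmt"

// ensureType performs a checked type assertion and returns an error
// instead of panicking when v does not hold a T.
func ensureType[T any](v any) (T, error) {
	t, ok := v.(T)
	if !ok {
		var zero T

		return zero, fmt.Errorf("unexpected type %T", v)
	}

	return t, nil
}

func main() {
	var v any = "pack"

	s, err := ensureType[string](v)
	fmt.Println(s, err) // pack <nil>

	_, err = ensureType[int](v)
	fmt.Println(err) // unexpected type string
}
```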
- nit: rename var to packCountByPrefix
- leverage impossible package
- use maps.Clone
- unexport indirectObjectID
- unexport compressed
- rename function to flushBufferLocked
- add checklocks annotations to functions that must be called under w.mu
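For context, `checklocks` annotations are comments read by the gVisor checklocks static analyzer; they have no effect at compile time or run time. The sketch below shows the general shape on a hypothetical `writer` type: fields are annotated with the mutex that guards them, and a function with the `Locked` suffix declares that the caller must already hold `w.mu`. The type and its fields are invented for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// writer is a hypothetical type; only the annotation style is the point.
type writer struct {
	mu sync.Mutex

	// +checklocks:mu
	buf []byte
	// +checklocks:mu
	flushed int
}

// flushBufferLocked must be called with w.mu held.
//
// +checklocks:w.mu
func (w *writer) flushBufferLocked() {
	w.flushed += len(w.buf)
	w.buf = w.buf[:0]
}

// write appends p and flushes once at least 4 bytes are buffered.
func (w *writer) write(p []byte) {
	w.mu.Lock()
	defer w.mu.Unlock()

	w.buf = append(w.buf, p...)
	if len(w.buf) >= 4 {
		w.flushBufferLocked()
	}
}

// flushedBytes returns the number of bytes flushed so far.
func (w *writer) flushedBytes() int {
	w.mu.Lock()
	defer w.mu.Unlock()

	return w.flushed
}

func main() {
	var w writer

	w.write([]byte("abcdef"))
	fmt.Println(w.flushedBytes()) // prints 6
}
```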