* fix(repository): fixed handling of content.Info
Previously content.Info was an interface which was implemented by:
* index.InfoStruct
* index.indexEntryInfoV1
* index.indexEntryInfoV2
The last two implementations relied on memory-mapped files,
which in rare cases could be closed while Kopia was still processing
them, leading to #2599.
This change fixes the bug and strictly separates content.Info (which
is now always a struct) from the other two (which were renamed to
index.InfoReader and are only used inside repo/content/...).
In addition to being safer, this _should_ reduce memory allocations.
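
A minimal sketch of the idea, not the actual Kopia code: every field is copied out of a reader into a plain struct while the backing data is still valid. The `InfoReader` methods, the `Info` fields, `ToInfo` and `mappedEntry` are simplified, hypothetical stand-ins.

```go
package main

import "fmt"

// InfoReader is a simplified stand-in for a reader whose implementations
// may decode fields lazily from a memory-mapped index file.
type InfoReader interface {
	GetContentID() string
	GetPackedLength() uint32
	GetDeleted() bool
}

// Info is a plain value that owns copies of all fields, so it stays valid
// after the underlying index file has been closed.
type Info struct {
	ContentID    string
	PackedLength uint32
	Deleted      bool
}

// ToInfo copies every field out of the reader while its backing data is valid.
func ToInfo(r InfoReader) Info {
	return Info{
		ContentID:    r.GetContentID(),
		PackedLength: r.GetPackedLength(),
		Deleted:      r.GetDeleted(),
	}
}

// mappedEntry simulates an entry backed by a buffer that can be invalidated.
type mappedEntry struct{ buf []byte }

func (e mappedEntry) GetContentID() string    { return string(e.buf[0:4]) }
func (e mappedEntry) GetPackedLength() uint32 { return uint32(e.buf[4]) }
func (e mappedEntry) GetDeleted() bool        { return e.buf[5] != 0 }

func main() {
	buf := []byte{'a', 'b', 'c', 'd', 42, 1}
	info := ToInfo(mappedEntry{buf})

	// Simulate the index file being closed by wiping the backing buffer.
	for i := range buf {
		buf[i] = 0
	}

	// The struct still holds the values copied before the wipe.
	fmt.Println(info.ContentID, info.PackedLength, info.Deleted) // abcd 42 true
}
```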
* reduce the size of content.Info with proper alignment.
* pr feedback
* renamed index.InfoStruct to index.Info
- removed a bunch of hacks; this should improve logging
performance by avoiding interfaces and data translation. This will
allow the use of de-sugared loggers in performance-critical
logging situations.
- this will also allow using features of ZAP more directly without
having to reimplement them.
- moved logging.Printf() to testlogging
- refactored `uitask` to store logs in a structural format and
present them as JSON only in the UI
- renamed printf_logger.go to printf.go so that fewer columns are used
in the logs
* refactor(repository): ensure we always parse content.ID and object.ID
This changes the types to be incompatible with string to prevent direct
conversion to and from string.
This has the additional benefit of reducing the number of memory
allocations and bytes for all IDs.
content.ID went from 2 allocations to 1:
  before, typical case: 32 characters + 16 bytes per-string overhead
  before, worst case: 65 characters + 16 bytes per-string overhead
  now: 34 bytes
object.ID went from 2 allocations to 1:
  before, typical case: 32 characters + 16 bytes per-string overhead
  before, worst case: 65 characters + 16 bytes per-string overhead
  now: 36 bytes
* move index.{ID,IDRange} methods to separate files
* replaced index.IDFromHash with content.IDFromHash externally
* minor tweaks and additional tests
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* Update repo/content/index/id_test.go
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
* pr feedback
* post-merge fixes
* pr feedback
* pr feedback
* fixed subtle regression in sortedContents()
This was actually not producing invalid results because of how base36
works, just not sorting as efficiently as it could.
Co-authored-by: Julio Lopez <1953782+julio-lopez@users.noreply.github.com>
This is done by protecting newly added cache items from being swept for
a minimum amount of time X, where X defaults to:
* `metadata` - 24 hours (new)
* `data` - 10 min (new)
* `indexes` - 1 hour (same as today)
Fixes #1540
* logging: added Logger.Debugw(message, key1, value1, ..., keyN, valueN)
This is based on ZAP and allows structural logs to be emitted.
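
The calling convention (a message followed by alternating keys and values) can be illustrated with a standard-library-only sketch. This is not zap and not the actual Logger; the `Logger` type, `encode` helper and JSON field names are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// Logger is a hypothetical logger that writes one JSON object per line.
type Logger struct{ out io.Writer }

// encode turns a message plus alternating key/value arguments into JSON.
// A trailing key without a value is dropped.
func encode(msg string, keysAndValues ...interface{}) string {
	fields := map[string]interface{}{"level": "debug", "msg": msg}

	for i := 0; i+1 < len(keysAndValues); i += 2 {
		key, ok := keysAndValues[i].(string)
		if !ok {
			key = fmt.Sprint(keysAndValues[i])
		}
		fields[key] = keysAndValues[i+1]
	}

	line, err := json.Marshal(fields)
	if err != nil {
		return `{"level":"debug","msg":"<unencodable log entry>"}`
	}

	return string(line)
}

// Debugw emits a structured debug entry.
func (l Logger) Debugw(msg string, keysAndValues ...interface{}) {
	fmt.Fprintln(l.out, encode(msg, keysAndValues...))
}

func main() {
	log := Logger{out: os.Stdout}
	log.Debugw("PutBlob", "blobID", "p0123", "length", 42)
	// {"blobID":"p0123","length":42,"level":"debug","msg":"PutBlob"}
}
```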
* cli: added --json-log-console and --json-log-file flags
* logging: updated storage logging wrapper to use structural logging
* pr feedback
* logging: added logger wrappers for Broadcast and Prefix
* nit: moved max hash size to a named constant
* content: added internal logger
* content: replaced context-based logging with explicit Loggers
This captures the logger.Logger associated with the context when
the repository is opened and reuses it for all logs instead of
creating a new logger for each log message.
The new logger will also write logs to the internal logger in addition
to writing to a log file/console.
* cli: allow decrypting all blobs whose names start with _
* maintenance: added logs cleanup
* cli: commands to view logs
* cli: log selected command on each write session
* content: added GetCompressionHeaderID and GetEncryptionKeyID to content.Info
Both must be zero in index v1 but will be non-zero in index v2 to
support in-content compression and key rotation in the future.
* content: cleaned up index v1 code
* content: added index v2 implementation
* content: updated index test to verify that we're able to store all supported values in all Info fields
* content: optimized sorting of content.Info by content ID using bucket sort and parallelization
For 10M contents this reduces sort time from 10s to ~2s
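
A rough sketch of bucket sort combined with parallel per-bucket sorting follows. It operates on plain strings rather than content.Info, and the bucket count and the `sortIDs` name are assumptions rather than details taken from the change.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// sortIDs distributes IDs into 256 buckets keyed by their first byte, sorts
// each bucket in its own goroutine, then concatenates the buckets in order.
// Buckets are disjoint slices, so the goroutines never share data.
func sortIDs(ids []string) []string {
	var buckets [256][]string

	for _, id := range ids {
		b := 0 // empty IDs go to bucket 0
		if len(id) > 0 {
			b = int(id[0])
		}
		buckets[b] = append(buckets[b], id)
	}

	var wg sync.WaitGroup

	for i := range buckets {
		if len(buckets[i]) < 2 {
			continue // nothing to sort
		}

		wg.Add(1)

		go func(bucket []string) {
			defer wg.Done()
			sort.Strings(bucket)
		}(buckets[i])
	}

	wg.Wait()

	// Bucket order matches byte order of the first character, so simple
	// concatenation yields a fully sorted result.
	result := make([]string, 0, len(ids))
	for i := range buckets {
		result = append(result, buckets[i]...)
	}

	return result
}

func main() {
	fmt.Println(sortIDs([]string{"kb2", "a9", "kb1", "0f", "a1"}))
	// [0f a1 a9 kb1 kb2]
}
```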
* content: fixed a bunch of off-by-one errors in index v2, added tests
* content: fixed test failures due to increased validation
* content: plumbed through index version (currently hardcoded to v1) in content manager