Jarek Kowalski 4bc69d05ef feat(snapshots): big upload performance improvements by parallelizing directory traversal (#1752)
* feat(general): added internal/workshare package

This introduces a work-sharing utility for walking trees of things
(such as a filesystem), which allows up to N goroutines to be
used.

Whenever a goroutine is visiting its children, it can share some of that
work with another idle goroutine in the pool (when available). If
no other goroutine is idle, we are already at capacity and the caller
simply does the work in its own goroutine.

The API introduced here is not the most beautiful, but allows us to
avoid allocations in most cases, which is critical for high-performance
data processing.

* feat(snapshots): speed up uploads by parallelizing directory traversal

Previously, directories were walked strictly sequentially, which meant
we could never upload data from multiple directories in parallel,
even if they had just a few files each.

This change switches to using the new `workshare` utility which improves
parallelism. It also reduces memory allocations, goroutine creations
and overall memory usage when taking large snapshots, while increasing
CPU utilization.

Tests on realistic directory structures show large speed-ups during cold
snapshots (without any metadata caching):

Photo library - 160 GB, files: 41717, dirs: 1350

    Before: 3m11s
    After: 1m50s
    Total time reduction: 43%

Working code directory - 30.7 GB, files: 194560, dirs: 42455

    Before: 55s
    After: 25s
    Total time reduction: 55%

* do not report multiple cancelation errors during parallel uploads

* pr feedback, clarified usage, added comments

* fixed flaky test
2022-02-18 21:16:30 -08:00

Kopia

n.

  1. copy, replica (Polish)
  2. lance, spear
  3. fast and secure backup tool

Kopia is a simple, cross-platform tool for managing encrypted backups in the cloud. It provides fast incremental backups, secure client-side end-to-end encryption, compression, and data deduplication.

Unlike other cloud backup solutions, the user is in full control of the backup storage and responsible for purchasing one of the cloud storage products (such as Google Cloud Storage), which offer great durability and availability for the data.

Kopia in action

Using the kopia command-line tool:

asciicast

Kopia UI - experimental user interface

Kopia UI Tutorial

Getting Started

See Documentation for more information.

Building Kopia

See Build Infrastructure for more information on building Kopia and working with the source code.

Licensing

Kopia is licensed under the Apache License, Version 2.0. See LICENSE for the full license text.

Contribution Guidelines

Kopia is open source and contributions are welcome. For more information on how to contribute, see the Contribution Guidelines.

Reporting Security Issues

If you find a security issue you'd like to disclose privately, please contact kopia-pmc@googlegroups.com or send a direct message to the maintainers on Slack.

Disclaimer

Kopia is a personal project and is not affiliated with, supported by, or endorsed by Google.

Cryptography Notice

This distribution includes cryptographic software. The country in which you currently reside may have restrictions on the import, possession, use, and/or re-export to another country, of encryption software. BEFORE using any encryption software, please check your country's laws, regulations and policies concerning the import, possession, or use, and re-export of encryption software, to see if this is permitted. See http://www.wassenaar.org/ for more information.

The U.S. Government Department of Commerce, Bureau of Industry and Security (BIS), has classified this software as Export Commodity Control Number (ECCN) 5D002.C.1, which includes information security software using or performing cryptographic functions with symmetric algorithms. The form and manner of this distribution makes it eligible for export under the License Exception ENC Technology Software Unrestricted (TSU) exception (see the BIS Export Administration Regulations, Section 740.13) for both object code and source code.
