mirror of
https://github.com/spacedriveapp/spacedrive.git
synced 2026-02-19 23:25:51 -05:00
99 lines
4.3 KiB
Plaintext
99 lines
4.3 KiB
Plaintext
---
|
|
index: 12
|
|
---
|
|
|
|
# Sync
|
|
|
|
Spacedrive synchronizes libraries by treating SQLite cells as last-write-wins CRDTs,
|
|
with sync metadata about each model being encoded into Prisma schema using [model attributes](#model-types).
|
|
|
|
In the cases where LWW CRDTs are used,
|
|
conflicts are resolved using a [Hybrid Logical Clock](https://github.com/atolab/uhlc-rs)
|
|
to determine the ordering of events.
|
|
|
|
We would be remiss to not credit credit [Actual Budget](https://actualbudget.com/)
|
|
with many of the CRDT concepts used in Spacedrive's sync system.
|
|
|
|
## Model Types
|
|
|
|
All data in a library conforms to one of the following types.
|
|
Each type uses a different strategy for syncing.
|
|
|
|
### Local
|
|
|
|
Local records exist entirely outside of the sync system.
|
|
They don't have Sync IDs and never leave the node they were created on.
|
|
|
|
Used for Nodes, Statistics, and Sync Events.
|
|
|
|
`@local`
|
|
|
|
### Shared
|
|
|
|
Shared records encompass most data synced in the CRDT fashion.
|
|
Updates are applied per-field using a last-write-wins strategy.
|
|
|
|
Used for Objects, Tags, Spaces, and Jobs.
|
|
|
|
`@shared(id?: String, modelId: Int)`
|
|
|
|
- `id` - Scalar field to override the default Sync ID.
|
|
- `modelId` - Integer to identify the model by. Helps save on bandwidth and storage as opposed to storing model names as strings.
|
|
|
|
### Relation
|
|
|
|
Similar to shared records, but represent a many-to-many relation between two records.
|
|
Sync ID is the combination of `item` and `group` Sync IDs.
|
|
|
|
Used for TagOnFile and FileInSpace.
|
|
|
|
`@relation(item: String, group: String, modelId: Int)`
|
|
|
|
- `item` - Field that identifies the item that the relation is connecting.
|
|
- `group` - Field that identifies the group that the item should be connected to.
|
|
- `modelId` - Integer to identify the model by. Helps save on bandwidth and storage as opposed to storing model names as strings.
|
|
|
|
## Sync Actors
|
|
|
|
The sync system is comprised of a number of actors that send, receive, and ingest sync operations.
|
|
|
|
### Ingest
|
|
|
|
The most important of these is the
|
|
[ingest actor](https://github.com/spacedriveapp/spacedrive/blob/main/core/crates/sync/src/ingest.rs).
|
|
Its existence entirely independent of both the core and the cloud sync actors allows it to be interacted with by many different systems,
|
|
while ensuring that only one process is responsible for providing it with operations to ingest at a time.
|
|
|
|
Its work loop goes something like this:
|
|
|
|
- Wait for a notification that new sync operations are available to ingest
|
|
- Request new operations from whatever process holds the communication channel with the ingester.
|
|
- This request is accompanied by timestamps of the last operation ingested for each instance of the library, to be used to fetch only the latest necessary sync operations.
|
|
- If the communication channel is dropped, the work loop will restart
|
|
- Ingested the sync operations in batches distinguished by instance, model, and record id
|
|
- If the batch contains a `Delete`, all other operations are ignored and the record is deleted.
|
|
- If the batch contains a `Create`, all `Update` operations are applied on top of it to reduce database load.
|
|
- If the batch only contains `Update` operations, they are applied on top of a fake `Create` operation to reduce datbase load.
|
|
- The latest timestamp of the instance is updated to the timestamp of the last operation ingested.
|
|
- Return to the beginning, waiting for a notification of new operations
|
|
|
|
### Cloud Send/Receive/Ingest
|
|
|
|
Each of these play a different role in getting sync operations to and from the cloud, and into the ingest actor.
|
|
|
|
#### [Send](https://github.com/spacedriveapp/spacedrive/blob/main/core/src/cloud/sync/send.rs)
|
|
|
|
When new sync operations are created on an instance will attempt to obtain a lock on cloud sync operations for that particular instance,
|
|
and will upload them as a compressed base64 format.
|
|
|
|
The lock on cloud sync operations is necessary to prevent multiple processes from attempting to upload operations at the same time,
|
|
and is implemented as a short-expiry redis entry.
|
|
|
|
#### [Receive](https://github.com/spacedriveapp/spacedrive/blob/main/core/src/cloud/sync/receive.rs)
|
|
|
|
Downloads sync operations from the cloud periodically, and stores them in the `cloud_crdt_operation` table.
|
|
|
|
#### [Ingest](https://github.com/spacedriveapp/spacedrive/blob/main/core/src/cloud/sync/ingest.rs)
|
|
|
|
Reads sync operations from the `cloud_crdt_operation` table, and sends them to the ingest actor.
|