spacedrive/docs/core/sync.md

Spacedrive Sync System

Status: Implementation Ready
Version: 2.0
Last Updated: 2025-10-08

Overview

Spacedrive's sync system enables real-time, multi-device synchronization of library metadata, ensuring that changes made on one device are reflected across all paired devices. This document provides the definitive specification for implementing sync.

Core Architecture

The Three Pillars

  1. TransactionManager (TM): Sole gatekeeper for all syncable database writes, ensuring atomic DB commits + sync log creation
  2. Sync Log: Append-only, sequentially-ordered log of all state changes per library, maintained only by the leader device
  3. Sync Service: Replicates sync log entries between paired devices using pull-based synchronization

Data Flow

┌─────────────────────────────────────────────────────────────────────┐
│ Device A                                                            │
│                                                                     │
│  User Action (e.g., create album)                                  │
│         ↓                                                           │
│  [ Action Layer ]                                                   │
│         ↓                                                           │
│  [ TransactionManager ]                                             │
│         ↓                                                           │
│  ┌─────────────────────────────┐                                   │
│  │  ATOMIC TRANSACTION         │                                   │
│  │  1. Write to database       │                                   │
│  │  2. Create sync log entry   │                                   │
│  │  COMMIT                     │                                   │
│  └─────────────────────────────┘                                   │
│         ↓                                                           │
│  [ Event Bus ] → Client cache updates                              │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘
                              ↓ Sync replication
┌─────────────────────────────────────────────────────────────────────┐
│ Device B                                                            │
│                                                                     │
│  [ Sync Service ]                                                   │
│         ↓ (polls for new entries)                                  │
│  Fetch sync log from Device A                                      │
│         ↓                                                           │
│  [ Apply Sync Entry ]                                               │
│         ↓                                                           │
│  [ TransactionManager ] (applies change)                            │
│         ↓                                                           │
│  Database updated + Event emitted                                   │
│         ↓                                                           │
│  Client cache updates → UI reflects change                          │
│                                                                     │
└─────────────────────────────────────────────────────────────────────┘

Syncable Trait

All database models that need to sync implement the Syncable trait:

/// Enables automatic sync log creation for database models
pub trait Syncable {
    /// Stable model identifier used in sync logs (e.g., "album", "tag", "entry")
    const SYNC_MODEL: &'static str;

    /// Globally unique ID for this resource across all devices
    fn sync_id(&self) -> Uuid;

    /// Version number for optimistic concurrency control
    fn version(&self) -> i64;

    /// Optional: Exclude platform-specific or derived fields from sync
    fn exclude_fields() -> Option<&'static [&'static str]> {
        None
    }

    /// Optional: Convert to sync-safe JSON (default: full serialization)
    fn to_sync_json(&self) -> serde_json::Value where Self: Serialize {
        serde_json::to_value(self).unwrap_or(serde_json::json!({}))
    }
}

Example Implementation:

// Database model
#[derive(Clone, Debug, DeriveEntityModel, Serialize, Deserialize)]
#[sea_orm(table_name = "albums")]
pub struct Model {
    pub id: i32,              // Database primary key
    pub uuid: Uuid,           // Sync identifier
    pub name: String,
    pub version: i64,         // For conflict resolution
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}

impl Syncable for albums::Model {
    const SYNC_MODEL: &'static str = "album";

    fn sync_id(&self) -> Uuid {
        self.uuid
    }

    fn version(&self) -> i64 {
        self.version
    }

    fn exclude_fields() -> Option<&'static [&'static str]> {
        // Don't sync database IDs or timestamps (platform-specific)
        Some(&["id", "created_at", "updated_at"])
    }
}

TransactionManager

The TM is the only component that performs state-changing writes. It guarantees atomicity and automatic sync log creation.

Core API

pub struct TransactionManager {
    event_bus: Arc<EventBus>,
    sync_sequence: Arc<Mutex<HashMap<Uuid, u64>>>, // library_id → sequence
}

impl TransactionManager {
    /// Commit single resource change (creates sync log)
    pub async fn commit<M, R>(
        &self,
        library: Arc<Library>,
        model: M,
    ) -> Result<R, TxError>
    where
        M: Syncable + IntoActiveModel,
        R: Identifiable + From<M>;

    /// Commit batch of changes (10-1K items, creates per-item sync logs)
    pub async fn commit_batch<M, R>(
        &self,
        library: Arc<Library>,
        models: Vec<M>,
    ) -> Result<Vec<R>, TxError>
    where
        M: Syncable + IntoActiveModel,
        R: Identifiable + From<M>;

    /// Commit bulk operation (1K+ items, creates ONE metadata sync log)
    pub async fn commit_bulk<M>(
        &self,
        library: Arc<Library>,
        changes: ChangeSet<M>,
    ) -> Result<BulkAck, TxError>
    where
        M: Syncable + IntoActiveModel;
}

Commit Strategies

Method           Use Case                Sync Log          Event          Example
commit()         Single user action      1 per item        Rich resource  User renames file
commit_batch()   Watcher events (10-1K)  1 per item        Batch          User copies folder
commit_bulk()    Initial indexing (1K+)  1 metadata only   Summary        Index 1M files
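The thresholds above can be sketched as a simple dispatch. This is illustrative only: the real TransactionManager exposes the three methods directly rather than a chooser, and treating 2-9 items as a batch is an assumption (the table only specifies 10-1K and 1K+).

```rust
/// Illustrative commit-strategy dispatch based on the item-count thresholds
/// in the table above.
#[derive(Debug, PartialEq)]
enum CommitStrategy {
    Single, // commit(): one user action, one sync log entry
    Batch,  // commit_batch(): per-item sync log entries
    Bulk,   // commit_bulk(): one metadata sync log entry
}

fn choose_strategy(item_count: usize) -> CommitStrategy {
    match item_count {
        1 => CommitStrategy::Single,
        2..=999 => CommitStrategy::Batch, // up to the 1K bulk threshold
        _ => CommitStrategy::Bulk,
    }
}
```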

Critical: Bulk Operations

Problem: Indexing 1M files shouldn't create 1M sync log entries.

Solution: Bulk operations create ONE metadata sync log:

{
  "sequence": 1234,
  "model_type": "bulk_operation",
  "operation": "InitialIndex",
  "location_id": "uuid-...",
  "affected_count": 1000000,
  "hints": {
    "location_path": "/Users/alice/Photos"
  }
}

Why: Each device indexes its own filesystem independently. The sync log just says "I indexed location X" — it does NOT replicate 1M entries. Other devices trigger their own local indexing jobs when they see this notification.

Performance Impact:

  • With per-entry sync logs: ~500MB, 10 minutes, 3M operations
  • With bulk metadata: ~500 bytes, 1 minute, 1M operations (10x faster!)

Usage Example

// Before: Manual DB write + event emission (error-prone)
let model = albums::ActiveModel { /* ... */ };
model.insert(db).await?;
event_bus.emit(Event::AlbumCreated { /* ... */ }); // Can forget this!

// After: TransactionManager (atomic, automatic)
let model = albums::ActiveModel { /* ... */ };
let album = tm.commit::<albums::Model, Album>(library, model).await?;
// ✅ DB write + sync log + event — all atomic!

Sync Log Schema

CREATE TABLE sync_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    sequence INTEGER NOT NULL,              -- Monotonic per library
    library_id TEXT NOT NULL,
    device_id TEXT NOT NULL,               -- Device that created this entry
    timestamp TEXT NOT NULL,

    -- Change details
    model_type TEXT NOT NULL,              -- "album", "tag", "entry", "bulk_operation"
    record_id TEXT NOT NULL,               -- UUID of changed record
    change_type TEXT NOT NULL,             -- "insert", "update", "delete", "bulk_insert"
    version INTEGER NOT NULL DEFAULT 1,     -- Optimistic concurrency

    -- Data payload (JSON)
    data TEXT NOT NULL,

    UNIQUE(library_id, sequence)
);

CREATE INDEX idx_sync_log_library_sequence ON sync_log(library_id, sequence);
CREATE INDEX idx_sync_log_device ON sync_log(device_id);
CREATE INDEX idx_sync_log_model_record ON sync_log(model_type, record_id);
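Against this schema, a follower's delta fetch might look like the following. This is a sketch: parameter binding is left to the caller, and the 100-entry cap matches the batch size described under Network Efficiency.

```sql
-- Hypothetical follower query: fetch entries created after the follower's
-- last synced sequence, in order, capped at the 100-entry batch size.
SELECT sequence, device_id, model_type, record_id, change_type, version, data
FROM sync_log
WHERE library_id = ?1
  AND sequence > ?2
ORDER BY sequence ASC
LIMIT 100;
```

The `(library_id, sequence)` index above makes this poll an indexed range scan rather than a full table scan.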

Leader Election

Each library requires a single leader device responsible for assigning sync log sequence numbers. This prevents sequence collisions.

Election Strategy

  1. Initial Leader: Device that creates the library
  2. Heartbeat: Leader sends heartbeat every 30 seconds
  3. Re-election: If leader offline >60s, devices elect new leader (highest device_id wins)
  4. Lease: Leader holds exclusive write lease
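The re-election rule above can be reduced to a pure decision function. This is a simplification: device ids are shown as u64 rather than Uuid, and the ">60s offline" rule is folded into a single lease expiry timestamp (heartbeat time plus the grace period).

```rust
use std::time::Instant;

/// Simplified lease record (a stand-in for the SyncLeader struct below).
struct Lease {
    leader_device_id: u64,
    expires_at: Instant,
}

/// Should `my_device_id` claim leadership? Per the rules above: the current
/// lease must have expired, and the highest device id among online
/// candidates wins, so we claim only if no online peer outranks us.
fn should_claim_leadership(
    lease: &Lease,
    my_device_id: u64,
    online_peers: &[u64],
    now: Instant,
) -> bool {
    // A still-valid lease blocks any election.
    if now < lease.expires_at {
        return false;
    }
    online_peers.iter().all(|&peer| peer <= my_device_id)
}
```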

Implementation

pub struct SyncLeader {
    library_id: Uuid,
    leader_device_id: Uuid,
    lease_expires_at: DateTime<Utc>,
}

impl TransactionManager {
    pub async fn request_leadership(&self, library_id: Uuid) -> Result<bool, TxError> {
        // Check if current leader is still valid
        // If not, attempt to become leader
        // Update leadership table with lease
    }

    pub async fn is_leader(&self, library_id: Uuid) -> bool {
        // Check if this device holds valid lease
    }

    async fn next_sequence(&self, library_id: Uuid) -> Result<u64, TxError> {
        if !self.is_leader(library_id).await {
            return Err(TxError::NotLeader);
        }

        let mut sequences = self.sync_sequence.lock().unwrap();
        let seq = sequences.entry(library_id).or_insert(0);
        *seq += 1;
        Ok(*seq)
    }
}

Sync Service (Follower)

Devices that are not the leader pull sync log entries and apply them locally.

pub struct SyncFollowerService {
    library_id: Uuid,
    leader_device_id: Uuid,
    last_synced_sequence: Arc<Mutex<u64>>,
    tx_manager: Arc<TransactionManager>,
    job_manager: Arc<JobManager>, // Queues local jobs for bulk operations
}

impl SyncFollowerService {
    /// Poll for new sync entries (called every 5 seconds)
    pub async fn sync_iteration(&mut self) -> Result<SyncResult, SyncError> {
        let last_seq = *self.last_synced_sequence.lock().unwrap();

        // Fetch entries from leader since last_seq
        let entries = self.fetch_entries_from_leader(last_seq).await?;

        if entries.is_empty() {
            return Ok(SyncResult::NoChanges);
        }

        // Apply each entry (count captured before the loop consumes the Vec)
        let count = entries.len();
        for entry in entries {
            self.apply_sync_entry(entry).await?;
        }

        Ok(SyncResult::Applied { count })
    }

    async fn apply_sync_entry(&mut self, entry: SyncLogEntry) -> Result<(), SyncError> {
        match entry.model_type.as_str() {
            "bulk_operation" => {
                // Parse metadata
                let metadata: BulkOperationMetadata = serde_json::from_value(entry.data)?;
                self.handle_bulk_operation(metadata).await?;
            }
            _ => {
                // Regular sync entry - deserialize and apply
                let model = self.deserialize_model(&entry)?;
                self.apply_model_change(model, entry.change_type).await?;
            }
        }

        // Update last synced sequence
        *self.last_synced_sequence.lock().unwrap() = entry.sequence;
        Ok(())
    }

    async fn handle_bulk_operation(&mut self, metadata: BulkOperationMetadata) -> Result<(), SyncError> {
        match metadata.operation {
            BulkOperation::InitialIndex { location_id, location_path } => {
                tracing::info!(
                    "Peer indexed location {} with {} entries",
                    location_id, metadata.affected_count
                );

                // Check if we have this location locally
                if let Some(local_location) = self.find_matching_location(&location_path).await? {
                    // Trigger our own indexing job
                    self.job_manager.queue(IndexerJob {
                        location_id: local_location.id,
                        mode: IndexMode::Full,
                    }).await?;
                }
            }
            _ => {}
        }
        Ok(())
    }
}

Library Sync Setup (Phase 1)

Before devices can sync, they must:

  1. Pair (cryptographic authentication)
  2. Discover libraries on remote device
  3. Register devices in each other's libraries

See sync-setup.md for complete implementation details.

Setup Flow

// 1. Discover remote libraries (after pairing)
let discovery = client.query(
    "query:network.sync_setup.discover.v1",
    DiscoverRemoteLibrariesInput { device_id: paired_device.id }
).await?;

// 2. Setup library sync (RegisterOnly in Phase 1)
let setup_result = client.action(
    "action:network.sync_setup.input.v1",
    LibrarySyncSetupInput {
        local_device_id: my_device_id,
        remote_device_id: paired_device.id,
        local_library_id: my_library.id,
        remote_library_id: discovery.libraries[0].id,
        action: LibrarySyncAction::RegisterOnly,
        leader_device_id: my_device_id, // This device becomes leader
    }
).await?;

// 3. Ready for sync!
// Sync service starts polling for changes

Sync Domains

Spacedrive syncs different types of data with different strategies:

Domain     What Syncs                 Strategy
Index      File/folder entries        Metadata only (each device indexes its own filesystem)
Metadata   Tags, albums, collections  Full replication across devices
Content    File content (future)      User-configured sync conduits
State      UI state, preferences      Device-specific, no sync

Conflict Resolution

Optimistic Concurrency

All Syncable models have a version field. When applying a sync entry:

async fn apply_model_change(&self, remote_model: Model, change_type: ChangeType) -> Result<()> {
    match change_type {
        ChangeType::Update => {
            // Fetch current local version
            let local_model = Model::find_by_uuid(remote_model.sync_id(), db).await?;

            if let Some(local) = local_model {
                if local.version >= remote_model.version {
                    // Local is newer or same - skip update
                    tracing::debug!("Skipping sync entry: local version is newer");
                    return Ok(());
                }
            }

            // Remote is newer - apply update
            remote_model.update(db).await?;
        }
        ChangeType::Insert => {
            remote_model.insert(db).await?;
        }
        ChangeType::Delete => {
            Model::delete_by_uuid(remote_model.sync_id(), db).await?;
        }
    }
    Ok(())
}

Conflict Strategy

  • Last-Write-Wins (LWW): Use version field to determine winner
  • No CRDTs: Simpler, sufficient for metadata sync
  • User Metadata: Tags, albums use union merge (both versions kept)
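Both rules reduce to small pure functions. A sketch with u64 ids standing in for Uuids; the union merge shown here is the "both versions kept" behavior described above, not the actual Spacedrive types.

```rust
/// Last-write-wins: a remote update is applied only when its version is
/// strictly newer (mirrors the `local.version >= remote.version` skip in
/// apply_model_change above).
fn remote_wins(local_version: i64, remote_version: i64) -> bool {
    remote_version > local_version
}

/// Union merge for user metadata such as tag assignments: entries from both
/// devices are kept, deduplicated by id, preserving local order first.
fn union_merge(local: &[u64], remote: &[u64]) -> Vec<u64> {
    let mut merged = local.to_vec();
    for &id in remote {
        if !merged.contains(&id) {
            merged.push(id);
        }
    }
    merged
}
```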

Raw SQL Compatibility

Reads: Unrestricted. Use SeaORM query builder or raw SQL freely.

Writes: Must go through TransactionManager. For advanced cases:

tm.with_tx(library, |txn| async move {
    // Raw SQL writes inside TM transaction
    txn.execute(Statement::from_sql_and_values(
        DbBackend::Sqlite,
        "UPDATE albums SET name = ? WHERE uuid = ?",
        vec![name.into(), uuid.into()],
    )).await?;

    // Tell TM to log this change
    tm.sync_log_for::<albums::Model>(txn, uuid).await?;

    Ok(())
}).await?;

Implementation Roadmap

Phase 1: Foundation (Current)

  • Device pairing protocol
  • Library sync setup (RegisterOnly)
  • TransactionManager core
  • Syncable trait + derives
  • Sync log schema
  • Leader election

Phase 2: Basic Sync

  • Sync follower service (pull-based)
  • Apply sync entries
  • Handle bulk operations
  • Conflict resolution
  • Album/Tag/Location sync

Phase 3: File Sync

  • Entry sync (metadata only)
  • Watcher integration
  • Bulk indexing with metadata logs
  • Cross-device file operations

Phase 4: Advanced Features

  • Content sync (via sync conduits)
  • Push-based sync (optional optimization)
  • Multi-leader support
  • Conflict resolution UI

Performance Considerations

Indexing Performance

  • 1M files, per-entry logs: 10 minutes, 500MB sync log
  • 1M files, bulk metadata: 1 minute, 500 bytes sync log
  • Result: 10x faster, 1 million times smaller sync log

Network Efficiency

  • Pull-based sync: Batch fetch (max 100 entries per request)
  • Compression: Gzip sync log JSON (typically 5x reduction)
  • Delta sync: Only fetch entries since last sequence

Database Optimization

  • Sync log: Append-only, no updates (fast writes)
  • Indexes on (library_id, sequence) for efficient polling
  • Vacuum old entries after successful sync (> 30 days)
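The retention rule above might be implemented as a periodic delete. A sketch: tracking the minimum sequence acknowledged by every paired device is assumed here and is not part of the schema shown earlier.

```sql
-- Hypothetical retention pass: remove entries older than 30 days that every
-- paired device has already pulled.
-- ?1 = library_id, ?2 = minimum acknowledged sequence across devices
DELETE FROM sync_log
WHERE library_id = ?1
  AND sequence <= ?2
  AND timestamp < datetime('now', '-30 days');
```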

Security

Encryption

  • All sync data transmitted over encrypted Iroh streams
  • Sync log contains full model data (no encryption at rest in Phase 1)
  • Future: Library-level encryption (see AT_REST_LIBRARY_ENCRYPTION.md)

Access Control

  • Only paired devices can sync
  • Device pairing uses cryptographic challenge/response
  • Leader election prevents unauthorized writes

Testing Strategy

Unit Tests

#[tokio::test]
async fn test_sync_log_creation() {
    let tm = TransactionManager::new(event_bus);
    let model = albums::Model { /* ... */ };

    let album = tm.commit::<albums::Model, Album>(library, model).await.unwrap();

    // Verify sync log entry created
    let entry = sync_log::Entity::find()
        .filter(sync_log::Column::RecordId.eq(album.id))
        .one(db)
        .await
        .unwrap()
        .unwrap();

    assert_eq!(entry.model_type, "album");
}

Integration Tests

  • Two-device sync simulation
  • Leader failover scenarios
  • Bulk operation handling
  • Conflict resolution

References

  • Sync Setup: docs/core/sync-setup.md
  • Event System: docs/core/events.md
  • Client Cache: docs/core/normalized_cache.md
  • Design Details: docs/core/design/sync/