Files
spacedrive/docs/core/design/SDPATH_REFACTOR.md
Jamie Pine d439d69590 feat: Finalize Spacedrive refactor and enhance session management
- Added a new documentation file outlining the finalization plan for the Spacedrive refactor, detailing remaining tasks and objectives for migrating to a CQRS architecture.
- Implemented session state management to track the current library context across operations, improving the overall architecture's modularity and maintainability.
- Refactored CLI commands and core components to utilize the new session state service, ensuring a more consistent and error-free user experience.
- Enhanced error handling and logging in various modules to facilitate debugging and improve reliability.
2025-09-13 16:08:07 -07:00

16 KiB

Of course. The whitepaper indeed specifies a more powerful, dual-mode SdPath that is crucial for enabling resilient and intelligent file operations. The current implementation in the codebase represents only the physical addressing portion of that vision.

Here is a design document detailing the refactor required to align the SdPath implementation with the whitepaper's architecture.


Refactor Design: Evolving SdPath to a Universal Content Address

1. Introduction & Motivation

[cite_start]The Spacedrive whitepaper, in section 4.1.3, introduces SdPath as a universal addressing system designed to make device boundaries transparent[cite: 172]. It explicitly defines SdPath as an enum supporting two distinct modes:

  • Physical: A direct pointer to a file at a specific path on a specific device.
  • [cite_start]Content: An abstract, location-independent handle that refers to file content via its unique ContentId[cite: 173].

[cite_start]The current codebase implements SdPath as a struct representing only the physical path[cite: 1159], which is fragile. If the target device is offline, any operation using this SdPath will fail.

This refactor will evolve the SdPath struct into the enum described in the whitepaper. [cite_start]This change is foundational to enabling many of Spacedrive's advanced features, including the Simulation Engine, resilient file operations, transparent failover, and optimal performance routing[cite: 182].


2. Current SdPath Implementation

The existing implementation in src/shared/types.rs is a simple struct:

[cite_start]// [cite: 1159]
#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub struct SdPath {
    pub device_id: Uuid,
    pub path: PathBuf,
}

Limitations:

  • Fragile: It's a direct pointer. If device_id is offline, the path is useless.
  • Not Content-Aware: It has no knowledge of the file's content, preventing intelligent operations like deduplication-aware transfers or sourcing identical content from a different online device.
  • Limited Abstraction: It tightly couples file operations to a specific physical location.

3. Proposed SdPath Refactor

[cite_start]We will replace the struct with the enum exactly as specified in the whitepaper[cite: 173]. This provides a single, unified type for all pathing operations.

3.1. The New SdPath Enum

The new implementation in src/shared/types.rs will be:

[cite_start]// As described in the whitepaper [cite: 173]
#[derive(Debug, Clone, PartialEq, Eq, Hash, Serialize, Deserialize)]
pub enum SdPath {
    Physical {
        device_id: Uuid,
        path: PathBuf,
    },
    Content {
        content_id: Uuid, // Or a dedicated ContentId type
    },
}

3.2. Adapting Existing Methods

The existing methods will be adapted to work on the enum:

  • new(device_id, path) becomes SdPath::physical(device_id, path).
  • local(path) remains a convenience function that creates a Physical variant with the current device's ID.
  • is_local() will now perform a match:
    pub fn is_local(&self) -> bool {
        match self {
            SdPath::Physical { device_id, .. } => *device_id == get_current_device_id(),
            SdPath::Content { .. } => false, // Content path is abstract, not inherently local
        }
    }
    
  • as_local_path() will similarly only return Some(&PathBuf) for a local Physical variant.
  • display() will format based on the variant, e.g., sd://<device_id>/path/to/file for Physical and sd://content/<content_id> for Content.

3.3. New Associated Functions

  • SdPath::content(content_id: Uuid) -> Self: A new constructor for creating content-aware paths.
  • SdPath::from_uri(uri: &str) -> Result<Self, ParseError>: A parser for string representations.
  • to_uri(&self) -> String: The inverse of from_uri.

4. The Path Resolution Service

The power of the Content variant is unlocked by a Path Resolution Service. [cite_start]This service is responsible for implementing the "optimal path resolution" described in the whitepaper[cite: 178].

4.1. Purpose

The resolver's goal is to take any SdPath and return the best available SdPath::Physical instance that can be used to perform a file operation.

4.2. Implementation

A new struct, PathResolver, will be introduced, and its methods will take the CoreContext to access the VDFS. A resolve method will be added directly to SdPath for convenience.

// In src/shared/types.rs
impl SdPath {
    pub async fn resolve(
        &self,
        context: &CoreContext
    ) -> Result<SdPath, PathResolutionError> {
        match self {
            // If already physical, just verify the device is online.
            SdPath::Physical { device_id, .. } => {
                // ... logic to check device status via context.networking ...
                if is_online { Ok(self.clone()) }
                else { Err(PathResolutionError::DeviceOffline(*device_id)) }
            }
            // If content-based, find the optimal physical path.
            SdPath::Content { content_id } => {
                resolve_optimal_path(context, *content_id).await
            }
        }
    }
}

// In a new module, e.g., src/vdfs/resolver.rs
async fn resolve_optimal_path(
    context: &CoreContext,
    content_id: Uuid
) -> Result<SdPath, PathResolutionError> {
    // 1. Get the current library's DB connection from context
    let library = context.library_manager.get_active_library().await
        .ok_or(PathResolutionError::NoActiveLibrary)?;
    let db = library.db().conn();

    // 2. Query the ContentIdentity table to find all Entries with this content_id
    // ... SeaORM query to join content_identities -> entries -> locations -> devices ...
    // This gives a list of all physical instances (device_id, path).

    [cite_start]// 3. Evaluate each candidate instance based on the cost function [cite: 179]
    let mut candidates = Vec::new();
    // for instance in query_results {
    //     let cost = calculate_path_cost(&instance, context).await;
    //     candidates.push((cost, instance));
    // }

    // 4. Select the lowest-cost, valid path
    candidates.sort_by(|a, b| a.0.cmp(&b.0));

    if let Some((_, best_instance)) = candidates.first() {
        Ok(SdPath::physical(best_instance.device_id, best_instance.path))
    } else {
        Err(PathResolutionError::NoOnlineInstancesFound(content_id))
    }
}

4.3. Error Handling

A new error enum, PathResolutionError, will be created to handle failures, such as:

  • NoOnlineInstancesFound(Uuid)
  • DeviceOffline(Uuid)
  • NoActiveLibrary
  • DatabaseError(String)

4.4. Performant Batch Resolution

Resolving paths one-by-one in a loop is inefficient and would lead to the "N+1 query problem." A performant implementation must handle batches of paths by gathering all necessary data in as few queries as possible.

Algorithm:

  1. Partition: Separate the input Vec<SdPath> into physical_paths and content_paths.
  2. Pre-computation: Before querying the database, fetch live and cached metrics from the relevant system managers.
    • Get a snapshot of all online devices and their network latencies from the DeviceManager and networking layer.
    • Get a snapshot of all volume metrics (e.g., PhysicalClass, benchmarked speed) from the VolumeManager.
  3. Database Query:
    • Collect all unique content_ids from the content_paths.
    • Execute a single database query using a WHERE ... IN clause to retrieve all physical instances for all requested content_ids. The query should join across tables to return tuples of (content_id, device_id, volume_id, path).
  4. In-Memory Cost Calculation:
    • Group the database results by content_id.
    • For each content_id, iterate through its potential physical instances.
    • Filter out any instance on a device that the pre-computation step identified as offline.
    • Calculate a cost for each remaining instance using the pre-computed device latencies and volume metrics.
    • Select the instance with the lowest cost for each content_id.
  5. Assembly: Combine the resolved Content paths with the verified Physical paths into the final result, perhaps returning a HashMap<SdPath, Result<SdPath, PathResolutionError>> to correlate original paths with their resolved states.

Implementation:

// In src/vdfs/resolver.rs

pub struct PathResolver {
    // ... context or manager handles ...
}

impl PathResolver {
    pub async fn resolve_batch(
        &self,
        paths: Vec<SdPath>
    ) -> HashMap<SdPath, Result<SdPath, PathResolutionError>> {
        // 1. Partition paths by variant (Physical vs. Content).
        // 2. Pre-compute device online status and volume metrics in batch.
        // 3. Collect all content_ids.
        // 4. Execute single DB query to get all physical instances for all content_ids.
        // 5. In memory, calculate costs and select the best instance for each content_id.
        // 6. Verify physical paths against online device status.
        // 7. Assemble and return the final HashMap.
        unimplemented!()
    }
}

This batch-oriented approach ensures that resolving many paths is highly efficient, avoiding repeated queries and leveraging in-memory lookups for the cost evaluation.


4.5. PathResolver Integration

The PathResolver is a core service and should be integrated into the application's central context.

  • Location: Create the new resolver at src/operations/indexing/path_resolver.rs. The existing PathResolver struct in that file, which only handles resolving entry_id to a PathBuf, should be merged into this new, more powerful service.
  • Integration: An instance of the new PathResolver should be added to the CoreContext in src/context.rs to make it accessible to all actions and jobs.
  • [cite_start]Cost Function Parameters: The "optimal path resolution" [cite: 178] should be guided by a cost function. The implementation should prioritize sources based on the following, in order:
    1. Is the source on the local device? (lowest cost)
    2. What is the network latency to the source's device? (from the NetworkingService)
    3. What is the benchmarked speed of the source's volume? (from the VolumeManager)

5. Impact on the Codebase (Expanded)

This refactor will touch every part of the codebase that handles file paths. The following instructions provide specific guidance for each affected area.

5.1. Action and Job Contracts

The fundamental principle is that Actions receive SdPaths, and Jobs resolve them.

  1. Action Definitions: All action structs that currently accept PathBuf for file operations must be changed to accept SdPath. For example, in src/operations/files/copy/action.rs, FileCopyAction should be changed:

    // src/operations/files/copy/action.rs
    pub struct FileCopyAction {
        // BEFORE: pub sources: Vec<PathBuf>,
        pub sources: Vec<SdPath>, // AFTER
        // BEFORE: pub destination: PathBuf,
        pub destination: SdPath, // AFTER
        pub options: CopyOptions,
    }
    

    This pattern applies to FileDeleteAction, ValidationAction, DuplicateDetectionAction, and others.

  2. Job Execution Flow: Any job that operates on files (e.g., FileCopyJob, DeleteJob) must begin its run method by resolving its SdPath members into physical paths.

    // Example in src/operations/files/copy/job.rs
    impl JobHandler for FileCopyJob {
        async fn run(&mut self, ctx: JobContext<'_>) -> JobResult<Self::Output> {
            // 1. RESOLVE PATHS FIRST
            let physical_destination = self.destination.resolve(&ctx).await?;
            let mut physical_sources = Vec::new();
            for source in &self.sources.paths {
                physical_sources.push(source.resolve(&ctx).await?);
            }
    
            // ... existing logic now uses physical_sources and physical_destination ...
        }
    }
    
  3. Operation Target Validity: Explicit rules must be enforced within jobs for SdPath variants:

    • Destination/Target: Operations like copy, move, delete, validate, and index require a physical target. The job's run method must ensure the destination SdPath is or resolves to a Physical variant. An attempt to use a Content variant as a final destination is a logical error and should fail.
    • Source: A source can be a Content variant, as the resolver will find a physical location for it.

5.2. API Layer (CLI Commands)

To allow users to specify content-based paths, the CLI command layer must be updated to accept string URIs instead of just PathBuf.

  • File: src/infrastructure/cli/daemon/types/commands.rs
  • Action: Change enums like DaemonCommand::Copy to use Vec<String> instead of Vec<PathBuf>.
    // src/infrastructure/cli/daemon/types/commands.rs
    pub enum DaemonCommand {
        // ...
        Copy {
            // BEFORE: sources: Vec<PathBuf>,
            sources: Vec<String>, // AFTER (as URIs)
            // BEFORE: destination: PathBuf,
            destination: String, // AFTER (as a URI)
            // ... options
        },
        // ...
    }
    
  • The command handlers in src/infrastructure/cli/daemon/handlers/ will then be responsible for parsing these string URIs into SdPath enums before creating and dispatching an Action.

5.3. Copy Strategy and Routing

The copy strategy logic must be updated to be SdPath variant-aware.

  • File: src/operations/files/copy/routing.rs

  • Action: The CopyStrategyRouter::select_strategy function must be refactored. The core logic should be:

    1. Resolve the source and destination SdPaths first.
    2. After resolution, both paths will be SdPath::Physical.
    3. Compare the device_id of the two Physical paths.
    4. If the device_ids are the same, use the VolumeManager to check if they are on the same volume and select LocalMoveStrategy or LocalStreamCopyStrategy.
    5. If the device_ids differ, select RemoteTransferStrategy.
  • File: src/operations/files/copy/strategy.rs

  • Action: The strategy implementations (LocalMoveStrategy, LocalStreamCopyStrategy) currently call .as_local_path(). This is unsafe. They should be modified to only accept resolved, physical paths. Their signatures can be changed, or they should match on the SdPath variant and return an error if it is not Physical.

6. Example Usage (Before & After)

[cite_start]This example, adapted from the whitepaper, shows how resilience is achieved[cite: 174].

Before:

// Fragile: Fails if source_path.device_id is offline
async fn copy_files(source_path: SdPath, target_path: SdPath) -> Result<()> {
    // ... direct p2p transfer logic using source_path ...
    Ok(())
}

After:

// Resilient: Finds an alternative online source automatically
async fn copy_files(
    source: SdPath,
    target: SdPath,
    context: &CoreContext
) -> Result<()> {
    // Resolve the source path to an optimal, available physical location
    let physical_source = source.resolve(context).await?;

    // ... p2p transfer logic using the resolved physical_source ...
    Ok(())
}

7. Conclusion

Refactoring SdPath from a simple struct to the dual-mode enum is a critical step in realizing the full architectural vision of Spacedrive. It replaces a fragile pointer system with a resilient, content-aware abstraction. This change directly enables the promised features of transparent failover and performance optimization, and it provides the necessary foundation for the Simulation Engine and other advanced, AI-native capabilities.