docs and plans

Jamie Pine
2022-04-11 21:42:32 -07:00
parent 9eb7c89591
commit e648dd2e2b
5 changed files with 54 additions and 33 deletions

1
.gitignore vendored

@@ -24,6 +24,7 @@ packages/*/data
apps/*/data
docs/public/*.st
docs/public/*.toml
dev.db
!cli/cmd/turbo
cli/npm/turbo-android-arm64/bin


@@ -20,8 +20,8 @@ mod sync {
// we can now impl specific CRDT traits for given resources
enum SyncResource {
FilePath(dyn Replicate),
File(dyn OperationalTransform),
Tag(dyn OperationalTransform),
File(dyn PropertyOperation),
Tag(dyn PropertyOperation),
TagOnFile(dyn LastWriteWin),
Jobs(dyn Replicate + OperationalTransform)
}
@@ -31,29 +31,19 @@ mod sync {
## Data Types
Data is divided into several kinds, Shared, Relational and Owned.
Data is divided into several kinds, Shared and Owned.
- **Shared data** - Can be created and modified by any client. Has a `uuid`.
*Sync Method:* `Operational transform*`
*Sync Method:* `Property operation*`
> Shared resources could be `files`, `tags`, `comments`, `albums` and `labels`, since these can be created, updated or deleted by any client at any time.
- **Relational data** - Can be created and modified by any client. Links two UUIDs by local IDs.
*Sync Method:* `Last write wins (LWW)`
> Many-to-many tables do not store UUIDs, so we have to handle this data specifically, querying for the resources' local IDs before creating or deleting the relation.
- **Owned data** - Can only be modified by the client that created it. Has a `client_id` and `uuid`.
*Sync Method:* `Replicate`
> Owned resources would be `file_paths`, `jobs`, `locations` and `media_data`, since a client is the single source of truth for this data. This means we can perform conflict free synchronization.
- **Offline data** - Not synchronized at all.
> For example `logs`, `pending_operations` and `_migrations`. These are static and not part of this system.
**Shared data doesn't always use this method; in some cases we can create shared resources in bulk, where conflicts are handled by simply merging. More on that in [Synchronization Strategy]()*.
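The mapping from data kind to sync method described above can be sketched as a pair of lookups. This is only an illustration: `DataKind`, `SyncMethod`, `kind_of_table` and `sync_method` are hypothetical names that do not appear in the docs, and the bulk-merge exception for shared data is deliberately left out.

```rust
/// Hypothetical names: the document defines these categories only in prose.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataKind {
    Shared,
    Owned,
    Offline,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SyncMethod {
    PropertyOperation,
    Replicate,
}

/// Classifies a table using the examples listed in the prose.
/// Unknown tables return `None` rather than guessing.
fn kind_of_table(table: &str) -> Option<DataKind> {
    match table {
        "files" | "tags" | "comments" | "albums" | "labels" => Some(DataKind::Shared),
        "file_paths" | "jobs" | "locations" | "media_data" => Some(DataKind::Owned),
        "logs" | "pending_operations" | "_migrations" => Some(DataKind::Offline),
        _ => None,
    }
}

/// Offline data is not synchronized, so it has no sync method.
fn sync_method(kind: DataKind) -> Option<SyncMethod> {
    match kind {
        DataKind::Shared => Some(SyncMethod::PropertyOperation),
        DataKind::Owned => Some(SyncMethod::Replicate),
        DataKind::Offline => None,
    }
}
```

Keeping the classification separate from the method lookup means a table can change category without touching the sync logic.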
@@ -120,17 +110,20 @@ Owned data → Bulk shared data → Shared data → Relational data
### Types of CRDT:
```rust
trait OperationalTransform;
trait LastWriteWin;
trait PropertyOperation;
trait Replicate;
```
- **Operational Transform** - Update Shared resources at a property level. Operations stored in `pending_operations` table.
- **Last Write Win** - The most recent event will always be applied, used for many-to-many datasets.
- **PropertyOperation** - Update Shared resources at a property level. Operations stored in `pending_operations` table.
- **Replicate** - Used exclusively for Owned data, clients will replicate with no questions asked.
- ~~**Last Write Win** - The most recent event will always be applied, used for many-to-many datasets.~~
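To illustrate the difference between a property-level update and replication, here is a minimal in-memory sketch. It assumes a resource can be modelled as a map of property names to values; `Resource`, `PropertyChange`, `apply_property_change` and `replicate` are hypothetical names, not the traits declared above.

```rust
use std::collections::HashMap;

/// Assumption for this sketch: a resource is a bag of named properties.
type Resource = HashMap<String, String>;

/// Hypothetical property-level change to a Shared resource.
struct PropertyChange {
    property: String,
    value: String,
}

/// Property-level update: only the named property is touched, so two clients
/// editing different properties of the same resource do not overwrite each other.
fn apply_property_change(resource: &mut Resource, change: PropertyChange) {
    resource.insert(change.property, change.value);
}

/// Replicate: the owner's copy replaces the local copy wholesale.
fn replicate(local: &mut Resource, from_owner: &Resource) {
    *local = from_owner.clone();
}
```

The sketch ignores ordering and persistence; in the design above, pending operations live in the `pending_operations` table until clients have received them.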
## Operations
@@ -139,7 +132,7 @@ Operations perform a Shared data change, they are cached in the database as `pen
Operations are removed once all online clients have received the payload.
```rust
struct OperationalTransform<V> {
struct PropertyOperation<V> {
method: OperationMethod,
// the name of the database table
resource_type: String,
@@ -272,18 +265,6 @@ Files also implement `OperationalMerge` would use
## Ingesting Sync Events


@@ -36,7 +36,7 @@ struct File {
}
```
- `partial_checksum` - A SHA256 checksum generated from 5 samples of 10,000 bytes throughout the file data, including the beginning and end. This is used to identify a file as *likely* unique in under 100µs.
- `partial_checksum` - A SHA256 checksum generated from 5 samples of 10,000 bytes throughout the file data, including the beginning and end, plus the total byte count. This is used to identify a file as *likely* unique in under 100µs.
> ~~It is impossible to have a unique constraint at a database level for the `partial_checksum`; however, we can asynchronously resolve conflicts by querying for duplicates and generating full checksums at a later date.~~
>
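
A rough sketch of how the input to `partial_checksum` might be gathered is shown below. Several things here are assumptions rather than the documented implementation: the samples are evenly spaced, small files are taken whole, the byte count is appended as a little-endian `u64`, and the data is an in-memory slice instead of a file read with seeks. The SHA256 step itself is omitted because it needs an external crate; `sample_ranges` and `checksum_input` are hypothetical names.

```rust
const SAMPLE_COUNT: u64 = 5;
const SAMPLE_SIZE: u64 = 10_000;

/// Returns `(offset, length)` for each sample.
/// Even spacing from the first to the last sample is an assumption.
fn sample_ranges(file_len: u64) -> Vec<(u64, u64)> {
    if file_len <= SAMPLE_COUNT * SAMPLE_SIZE {
        // Small file: a single range covering all of it.
        return vec![(0, file_len)];
    }
    // The last sample ends exactly at the end of the file.
    let last_offset = file_len - SAMPLE_SIZE;
    (0..SAMPLE_COUNT)
        .map(|i| (last_offset * i / (SAMPLE_COUNT - 1), SAMPLE_SIZE))
        .collect()
}

/// Collects the bytes that would be fed to SHA256:
/// the samples in order, followed by the total byte count.
fn checksum_input(data: &[u8]) -> Vec<u8> {
    let mut input = Vec::new();
    for (offset, len) in sample_ranges(data.len() as u64) {
        let start = offset as usize;
        input.extend_from_slice(&data[start..start + len as usize]);
    }
    // Appending the length separates files whose sampled bytes happen to match.
    input.extend_from_slice(&(data.len() as u64).to_le_bytes());
    input
}
```

Including the byte count means two files with identical sampled regions but different sizes still produce different hash inputs.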


@@ -0,0 +1,39 @@
This extension must first register an indexer context to prevent the indexer from scanning the photo library.
```rust
struct IndexerContext {
    key: String,
    is_dir: bool,
    extension: Option<String>,
    must_contain: Vec<String>,
    // a list, so that several names can be ignored at once
    always_ignored: Option<Vec<String>>,
    scan: bool,
}
```
```rust
core.register_context(IndexerContext {
    key: "apple-photo-library".to_string(),
    is_dir: false,
    extension: Some(".photoslibrary".to_string()),
    must_contain: vec!["database".to_string(), "originals".to_string()],
    always_ignored: None,
    scan: false, // apple-photos extension takes care of scan
});

core.register_context(IndexerContext {
    key: "github-repository".to_string(),
    is_dir: true,
    extension: None,
    must_contain: vec![".git".to_string()],
    always_ignored: Some(vec!["node_modules".to_string(), "target".to_string()]),
    scan: true,
});
```
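How the indexer decides that a path belongs to a registered context is not specified here. One possible matching rule, given purely as a sketch, is to compare the entry's name against `extension` and require every `must_contain` entry among its children. `ContextRule` and `matches_context` are hypothetical, `ContextRule` is a trimmed stand-in for `IndexerContext`, and the meaning of `is_dir` is left out because the docs do not define it.

```rust
/// Trimmed, hypothetical stand-in for `IndexerContext`:
/// only the fields this sketch uses for matching.
struct ContextRule {
    extension: Option<String>,
    must_contain: Vec<String>,
}

/// Assumed rule: the name must end with the extension (when one is set),
/// and every `must_contain` entry must be present among the children.
fn matches_context(rule: &ContextRule, name: &str, children: &[&str]) -> bool {
    if let Some(ext) = &rule.extension {
        if !name.ends_with(ext.as_str()) {
            return false;
        }
    }
    rule.must_contain
        .iter()
        .all(|required| children.contains(&required.as_str()))
}
```

Passing the child names in as a slice keeps the rule independent of the filesystem, which makes it easy to exercise without real directories.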
For Apple Photos we need:
- Hidden/Favorite items
- Live photo support
- Original creation date
- Edited photos
- Albums
