More P2P docs (#2492)

* Remove relay

* restructure p2p

* wip

* cleanup webrtc

* split up P2P docs

* wip

* more wip

* the fork has moved

* finish local network discovery

* Document the relay system

* be less stupid

* a

* remote ip from deploy script

* remove debug from deploy script

* Explain relay setup and usage

* Physical pain

* fix

* error handling for relay setup

* Listeners Relay state + merge it into NLM state

* `node_remote_identity`

* redo libraries hook

* toggle relay active in settings

* Dedicated network settings page

* Stablise P2P debug page

* warning for rspc remote

* Linear links in docs

* fix p2p settings switches

* fix typescript errors on general page

* fix ipv6 listener status

* discovery method in UI

* Remove p2p debug menu on the sidebar

* wip

* lol

* wat

* fix

* another attempt at fixing library hook

* fix

* Remove sync from sidebar

* fix load library code

* I hate this

* Detect connections over the relay

* fix

* fixes

* a

* fix mDNS

* a bunch o fixes

* a bunch of state management fixes

* Metadata sync on connection

* skill issue

* fix markdown

* Clippy cleanup

* Backport #2380

* Update interface/locales/en/common.json

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/local-network-discovery.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/local-network-discovery.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p_proto.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/overview.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/overview.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/relay.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p_proto.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p_proto.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/transport-layer.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p_proto.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/local-network-discovery.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* Update docs/developers/p2p/sd_p2p_proto.mdx

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>

* a

* Cleaning binario section

* cleanup Docker message

* idk

* Idempotent listeners

* Manual peers working????

* minor fixes

* crazy idea - don't panic in the event loop

* fixes

* debug

* debug

* LAN badge in network settings

* Use `dns_lookup` instead of `tokio::net::lookup_host`

* fix

* bruh sandwich

* proper dialing

* a

* remove logs

* fix

* Small cleanup

* manual peers state on connected device

* a

* Fix manual discovery state + give it a badge

* Clippy improvements

* flip discovery priority

* Add `addrs` to debug query

* connection candidates in debug

* Fix state

* Clippppppppppppy

* Manual discovery badge

* Flesh out ping example

* Usage guide

* `sd_p2p_proto` examples

* More discovery docs

* More docs work

* docs docs docs and more docs

* PONG

* rename

---------

Co-authored-by: Matthew Yung <117509016+myung03@users.noreply.github.com>
This commit is contained in:
Oscar Beaumont
2024-05-30 21:48:12 +08:00
committed by GitHub
parent 277d53094c
commit 0ea1333c83
57 changed files with 3068 additions and 1583 deletions

View File

@@ -0,0 +1,36 @@
---
title: Discovery
index: 25
---
# Discovery
Discovery is the process of finding other nodes to connect with.
We do this through the following 3 systems:
- Manual entry by the user
- [Local Network Discovery](/docs/developers/p2p/local-network-discovery) via mDNS
- [Relay](/docs/developers/p2p/relay) via an external relay server
## Overview of methods
From a technical perspective all of these methods operate very differently so it's important to understand the differences between them when making changes to the P2P system.
The relay and manual entry methods both require the P2P system to be given an upfront list of peers to connect to, whereas the mDNS system is able to discover them itself. For the relay these peers come from the instances table in the database and for manual entry these peers come from the node configuration file.
A quirk of manual entry is that when we attempt a connection we don't know the remote node's identity or metadata prior to connecting. We instead establish a full connection to ask for the remote node's information (`RemoteIdentity` and metadata), after which we treat it as discovered.
This table summarises the differences between the methods:
| | mDNS | Relay | Manual |
|-------------------------------------------------------------------------|---------|----------|-----------|
| Requires upfront knowledge of existence of peer | No | Yes | Yes |
| Knows connection info (metadata, RemoteIdentity) ahead of connection | Yes | Yes | No |
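The differences in the table can also be expressed in code. The following is a minimal sketch only: `DiscoveryMethod` and its methods are hypothetical names used for illustration and are not taken from the `sd_p2p` crate.

```rust
/// Hypothetical enum mirroring the three discovery methods described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiscoveryMethod {
    Mdns,
    Relay,
    Manual,
}

impl DiscoveryMethod {
    /// Whether the P2P system must be handed a list of peers up front.
    pub fn requires_upfront_peers(self) -> bool {
        matches!(self, Self::Relay | Self::Manual)
    }

    /// Whether metadata and `RemoteIdentity` are known before connecting.
    pub fn knows_peer_info_before_connecting(self) -> bool {
        !matches!(self, Self::Manual)
    }

    /// Where the upfront peer list comes from, if one is needed.
    pub fn peer_source(self) -> Option<&'static str> {
        match self {
            Self::Mdns => None,
            Self::Relay => Some("instances table in the database"),
            Self::Manual => Some("node configuration file"),
        }
    }
}
```

Keeping these properties on one type makes it harder to forget the manual entry special case when changing connection logic.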
## Manually provided peers
The user can manually provide a set of [`SocketAddr`](https://doc.rust-lang.org/std/net/enum.SocketAddr.html)s or [FQDN](https://en.wikipedia.org/wiki/Fully_qualified_domain_name)s and the P2P system will attempt to connect to them. If a domain is provided the P2P system will resolve it to an IP address and then attempt to connect to that address.
This feature primarily exists for usage with Docker, where mDNS discovery does not work correctly, however it could be useful for working around difficult network setups. It's important to note that you *must* use [port forwarding](https://en.wikipedia.org/wiki/Port_forwarding) and set a static port for the node when using this feature.
When you add a manual peer it will *not* show up in the nodes list until the P2P system is able to establish a connection. This is because without having established a connection the system is unable to determine the remote node's identity or metadata, which is required for it to be considered discovered.
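To illustrate the address handling described in this section, here is a minimal sketch using only the standard library. `ManualPeer`, `parse_manual_peer` and `resolve` are hypothetical names, and the blocking `ToSocketAddrs` lookup is a stand-in for whatever resolver the P2P system actually uses.

```rust
use std::net::{SocketAddr, ToSocketAddrs};

/// Hypothetical representation of a manually entered peer.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ManualPeer {
    /// A literal IP address and port, e.g. `192.168.1.10:7373`.
    Addr(SocketAddr),
    /// A domain name and port that still needs resolving.
    Domain { host: String, port: u16 },
}

/// Parses user input into a `ManualPeer`, or `None` if it has no valid port.
pub fn parse_manual_peer(input: &str) -> Option<ManualPeer> {
    let input = input.trim();
    if let Ok(addr) = input.parse::<SocketAddr>() {
        return Some(ManualPeer::Addr(addr));
    }
    let (host, port) = input.rsplit_once(':')?;
    if host.is_empty() {
        return None;
    }
    Some(ManualPeer::Domain {
        host: host.to_string(),
        port: port.parse().ok()?,
    })
}

/// Resolves a peer into candidate socket addresses.
/// For domains this performs a blocking DNS lookup.
pub fn resolve(peer: &ManualPeer) -> std::io::Result<Vec<SocketAddr>> {
    match peer {
        ManualPeer::Addr(addr) => Ok(vec![*addr]),
        ManualPeer::Domain { host, port } => {
            Ok((host.as_str(), *port).to_socket_addrs()?.collect())
        }
    }
}
```

A domain can resolve to several addresses, so the sketch returns all of them and leaves the choice of which to dial to the caller.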

View File

@@ -0,0 +1,91 @@
---
title: Local Network Discovery
index: 26
---
# Local Network Discovery
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/src/hooks/mdns.rs)
Our local network discovery uses [DNS-Based Service Discovery](https://www.rfc-editor.org/rfc/rfc6763.html) which itself is built around [Multicast DNS (mDNS)](https://datatracker.ietf.org/doc/html/rfc6762). This is a well-established technology and is used in [Spotify Connect](https://support.spotify.com/au/article/spotify-connect/), [Apple AirPlay](https://www.apple.com/au/airplay/) and many other services you use every day.
We make use of the [mdns-sd](https://docs.rs/mdns-sd) crate.
## Service Structure
The following is an example of what would be broadcast from a single Spacedrive node:
```toml
# {remote_identity_of_self}._sd._udp.local.
name=Oscars Laptop # Shown to the user to select a device
operating_system=macos # Used for UI purposes
device_model=MacBook Pro # Used for UI purposes
version=0.0.1 # Spacedrive version
# For each library that's active on the Spacedrive node:
# {library_uuid}={remote_identity_of_self}
d66ed0c3-03ac-4f9b-a374-a927830dfd5b=0l9vTOWu+5aJs0cyWxdfJEGtloEepGRAXcEuDeTDRPk
```
Within `sd-core` this is defined in two parts. The [`PeerMetadata` struct](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/metadata.rs#L9) takes care of the node metadata and libraries are inserted by the [`libraries_hook`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/libraries.rs#L13).
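As a rough illustration of how the two parts combine into one set of service properties, consider the sketch below. `NodeMetadata` and `service_properties` are hypothetical stand-ins, not the real `PeerMetadata` struct or `libraries_hook`.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the node metadata fields shown in the example above.
pub struct NodeMetadata {
    pub name: String,
    pub operating_system: String,
    pub device_model: String,
    pub version: String,
}

/// Builds the key/value pairs for the service: the node metadata plus one
/// `{library_uuid}={remote_identity}` entry per active library.
pub fn service_properties(
    metadata: &NodeMetadata,
    remote_identity: &str,
    active_libraries: &[&str],
) -> BTreeMap<String, String> {
    let mut props = BTreeMap::new();
    props.insert("name".to_string(), metadata.name.clone());
    props.insert("operating_system".to_string(), metadata.operating_system.clone());
    props.insert("device_model".to_string(), metadata.device_model.clone());
    props.insert("version".to_string(), metadata.version.clone());
    for library_uuid in active_libraries {
        props.insert((*library_uuid).to_string(), remote_identity.to_string());
    }
    props
}
```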
## Modes
<Notice
type="note"
text="This section discusses 'Contacts Only' which is not yet fully implemented (refer to ENG-1197)."
/>
Within Spacedrive's settings the user is able to choose between three modes for local network discovery:
- **Contacts only**: Only devices that are in your contacts list will be able to see your device.
- **Enabled**: All devices on the local network will be able to see your device.
- **Disabled**: No devices on the local network will be able to see your device.
**Enabled** and **Disabled** are implemented by spawning and shutting down the [`sd_p2p::Mdns`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/crates/p2p/src/mdns.rs#L17) service as required within `sd-core`.
In **Contacts only** mode the mDNS service will not contain the [`PeerMetadata`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/metadata.rs#L9) fields and instead will contain a hash of the user's Spacedrive identifier. If a Spacedrive node detects another node in the local network with a hash in its contacts, it can make a request to the node and if the remote node also has the current node in its contacts, it will respond with the full metadata.
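The decision logic for the three modes could look roughly like the following sketch. All names here are hypothetical, and since 'Contacts Only' is not yet fully implemented this only reflects the intended behaviour described above, with hashes treated as opaque strings.

```rust
/// Hypothetical mirror of the three modes the user can pick in settings.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DiscoveryMode {
    ContactsOnly,
    Enabled,
    Disabled,
}

/// What the mDNS service would advertise in each mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Advertisement {
    /// The mDNS service is shut down, so nothing is broadcast.
    Nothing,
    /// The full node metadata is broadcast.
    FullMetadata,
    /// Only a hash of the user's Spacedrive identifier is broadcast.
    IdentifierHash,
}

pub fn advertisement_for(mode: DiscoveryMode) -> Advertisement {
    match mode {
        DiscoveryMode::Disabled => Advertisement::Nothing,
        DiscoveryMode::Enabled => Advertisement::FullMetadata,
        DiscoveryMode::ContactsOnly => Advertisement::IdentifierHash,
    }
}

/// Whether to answer a metadata request from a peer that advertised `requester_hash`.
/// In contacts-only mode the requester must be in our contacts list.
pub fn should_share_metadata(
    mode: DiscoveryMode,
    contact_hashes: &[&str],
    requester_hash: &str,
) -> bool {
    match mode {
        DiscoveryMode::Enabled => true,
        DiscoveryMode::Disabled => false,
        DiscoveryMode::ContactsOnly => contact_hashes.contains(&requester_hash),
    }
}
```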
## Integration with Spacedrive accounts
P2P is currently *not* integrated with Spacedrive accounts and we will integrate it in the future for better security.
Right now we use a remote identity to identify the remote device, however this is not very user friendly. If a [MITM](https://en.wikipedia.org/wiki/Man-in-the-middle_attack)-style attack is performed the remote identity will show up with the attacker's device but this isn't going to be particularly noticeable by the user.
To combat this issue we can integrate with Spacedrive accounts so the user can be presented with the user's name and verified email. This allows the user to prove the remote device is who they expect in a more user friendly way.
The Spacedrive account information will need to be put into the peer metadata and we can use cryptographic signatures to verify the account is linked with the remote device without a network connection.
This issue is tracked as [ENG-1758](https://linear.app/spacedriveapp/issue/ENG-1758/p2p-🤝-spacedrive-accounts).
## Problems with Docker
Docker can be problematic with P2P due to it applying another level of [NAT](https://en.wikipedia.org/wiki/Network_address_translation). This shouldn't cause any issues for usage with the relay but it makes mDNS cease to work.
It is possible to [expose the mDNS daemon](https://medium.com/@andrejtaneski/using-mdns-from-a-docker-container-b516a408a66b) of the host machine into the container, and we could potentially implement something similar, however it's unclear if [`mdns-sd`](https://docs.rs/mdns-sd) uses the OS's daemon and if `avahi` is used by all host Linux distributions we want to support.
When `sd-server` is run from Docker the `Dockerfile` sets the environment variable `SD_DOCKER=true` which is picked up by the core and it exposes P2P on port `7373` instead of a random port like the default configuration.
This allows the administrator to use the `-p 7373:7373` flag when running the container to expose the P2P port to the host machine. This can then be paired with manually entering the IP address and port of the node into the P2P settings of the other node and a connection can be established. Although this is suboptimal, it serves as an alternative for the time being.
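The port selection described above can be sketched as follows. This is an assumption-laden illustration: the function names are hypothetical, the exact comparison against `"true"` is assumed, and `0` is used here as the conventional way of asking the OS for a random free port.

```rust
/// Port used when running inside Docker, as described above.
pub const DOCKER_P2P_PORT: u16 = 7373;

/// Picks the P2P port from the value of `SD_DOCKER`.
/// `0` asks the OS to assign a random free port.
pub fn p2p_port(sd_docker: Option<&str>) -> u16 {
    match sd_docker {
        Some("true") => DOCKER_P2P_PORT,
        _ => 0,
    }
}

/// Reads `SD_DOCKER` from the environment and picks the port accordingly.
pub fn p2p_port_from_env() -> u16 {
    p2p_port(std::env::var("SD_DOCKER").ok().as_deref())
}
```

Because the Docker port is fixed, it can be published with `-p 7373:7373` and entered as a manual peer on the other node.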
This issue is tracked as [ENG-1343](https://linear.app/spacedriveapp/issue/ENG-1343/docker-support-for-p2p).
## Problems with mobile
mDNS discovery does not work on mobile at the moment. To prevent this affecting other devices, we patch [`if-watch`](https://docs.rs/if-watch) using the fork [spacedriveapp/if-watch](https://github.com/spacedriveapp/if-watch). This fork basically implements a *no-op* for mobile so that the core is able to compile.
This issue is tracked as [ENG-1108](https://linear.app/spacedriveapp/issue/ENG-1108/mdns-working-on-ios).
## Problems on Linux
It was reported on Discord that opening Spacedrive would cause excessive network activity. This is possibly a bug with the mDNS system; it was not able to be reproduced on macOS.
This issue is tracked as [ENG-1319](https://linear.app/spacedriveapp/issue/ENG-1319/excessive-mdns-pings).
## Tracking
When information about the device is exposed to the local network we introduce the risk that this information is used for tracking users.
The intention is for the contacts only mode to mitigate this risk as the device will only be discoverable by other devices that are in the user's contacts and all information will be unintelligible.
Apple outlines some information about how they combat this for AirDrop [here](https://support.apple.com/en-au/guide/security/sec2261183f4/web) and we can do something similar.

View File

@@ -0,0 +1,22 @@
---
title: overview
index: 20
---
# Peer-to-peer
Our peer-to-peer technology works at the heart of Spacedrive allowing all of your devices to seamlessly communicate and share data. This documentation outlines the system's design and how to use it.
## Terminology
- **Node**: An application running Spacedrive's network stack.
- This could be the Spacedrive app or the P2P relay.
- If you have multiple Spacedrive installations open on your computer, each one is an independent node.
- **Library**: A logical collection of your data within Spacedrive.
- Conceptually, a library is the conflict-resolved state of one or more **instances**, although a lot of the time we don't strictly treat it that way.
- **Instance**: An instance of a library running on a particular node.
- An instance correlates directly to each SQLite file.
- You could *technically* have more than one instance for a library on a single node, although our core would fall apart as we identify traffic by library.
- [`Identity`](https://github.com/spacedriveapp/spacedrive/blob/518d5836f6585a5f597c3ae5a0d27d084adc0a63/crates/p2p/src/identity.rs#L29) - A public/private keypair which represents the library or node.
- [`RemoteIdentity`](https://github.com/spacedriveapp/spacedrive/blob/518d5836f6585a5f597c3ae5a0d27d084adc0a63/crates/p2p/src/identity.rs#L70) - A public key which represents the library or node.
- [`PeerId`](https://docs.rs/libp2p/latest/libp2p/struct.PeerId.html) - The identifier libp2p uses. Can be derived from a `RemoteIdentity`.

View File

@@ -0,0 +1,57 @@
---
title: Protocols
index: 29
---
# Protocols
## Ping
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/core/src/p2p/operations/ping.rs)
We have an implementation of a basic ping protocol. This is not actually used within Spacedrive but acts as a reference for implementing a new protocol.
## Spacedrop
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/core/src/p2p/operations/spacedrop.rs)
Spacedrop is a system for sending files quickly to other peers. It is intended for sending to peers that have not been paired into the library. It is great for sending a file to a friend running Spacedrive on the same network, whereas you can use the regular file manager for sharing a file with another node in your library.
This protocol works but some of the following are missing features:
- Pause/resumable transfers
- Transfer a folder - [ENG-1297](https://linear.app/spacedriveapp/issue/ENG-1297/spacedrop-create-folder-button-in-save-dialog-for-multiple-file)
- Usage with `sd-server` will result in bugs if you have multiple web clients - [ENG-1034](https://linear.app/spacedriveapp/issue/ENG-1034/spacedrop-on-multi-user-server-will-break) and [ENG-1522](https://linear.app/spacedriveapp/issue/ENG-1522/spacedrop-on-web)
The following are known bugs:
- [ENG-1298](https://linear.app/spacedriveapp/issue/ENG-1298/spacedrop-cancel-prior-to-toast)
- [ENG-1035](https://linear.app/spacedriveapp/issue/ENG-1035/spacedrop-show-toast-while-waiting-for-remote-to-acceptdeny-response)
- [ENG-1203](https://linear.app/spacedriveapp/issue/ENG-1203/spacedrop-ui-handle-timeouts)
- [ENG-1211](https://linear.app/spacedriveapp/issue/ENG-1211/spacedrop-what-if-file-content-changes-while-sending)
## rspc
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/core/src/p2p/operations/rspc.rs)
This protocol was an experiment to expose the rspc router of a node over P2P. Although it works this is a security nightmare so it has been disabled by default and hidden behind the `wipP2P` feature flag.
How to test this feature:
- Enable the `wipP2P` feature flag
- Enable the feature within the network page of settings
- Ensure "Enable remote access" is enabled on both nodes
- Click the "rspc remote" button on the node you want to connect to.
- If the connection fails you will be presented with a white screen, otherwise you will be given a library selection and once selected you will be given the Spacedrive UI running on the remote node.
Major problems with this feature:
- This protocol doesn't have any security (hence it being disabled by default). It's also a nightmare to secure as it's full access (including to do filesystem actions) or no access. - [ENG-1646](https://linear.app/spacedriveapp/issue/ENG-1646/rspc-over-p2p-handle-condition-in-ui-of-remote-offline-node)
- The rspc websocket connection established over the P2P system is leaked so it will never be cleaned up. Fixing this would require changes to rspc. - [ENG-1647](https://linear.app/spacedriveapp/issue/ENG-1647/stop-leaking-rspc-p2p-websockets)
- Any usage of rspc outside the React context will still be using the local node's rspc router. We don't do this often but we definitely do it. - [ENG-1648](https://linear.app/spacedriveapp/issue/ENG-1648/prevent-any-usage-of-rspc-out-of-the-react-context)
From my ([oscartbeaumont](https://github.com/oscartbeaumont)'s) perspective this was a cool experiment but not something we should ship because it's a nightmare to get security right.
## Sync
Unimplemented
In an earlier version of the P2P system we had a method for sending sync messages to other nodes over the peer to peer connection, however this was removed during some refactoring of the sync system.
The code for it could be taken from [here](https://github.com/spacedriveapp/spacedrive/blob/aa72c083c2e5f6cf33f3c1fb66283e5fe0d1ba3b/core/src/p2p/pairing/mod.rs) and upgraded to account for changes to the sync and P2P system to bring back this functionality.

View File

@@ -0,0 +1,102 @@
---
title: relay
index: 27
---
# Relay
To establish connections outside of your local network we rely on an external relay to help with coordinating connections and also to proxy traffic between peers if the network conditions are not favourable.
## Implementation
We make use of [libp2p](https://libp2p.io)'s [Direct Connection Upgrade through Relay](https://github.com/libp2p/specs/blob/master/relay/DCUtR.md) and [Circuit Relay](https://github.com/libp2p/specs/blob/master/relay/README.md) protocols for our relay system.
[Client Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/src/hooks/quic/transport.rs)
·
[Server Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/apps/p2p-relay)
## Relay discovery
Each client will regularly make requests to [https://app.spacedrive.com/api/p2p/relays](https://app.spacedrive.com/api/p2p/relays) to get the list of currently active relay servers.
Each relay server will register itself with the discovery server automatically when started. This requires an authentication token so it can only be done by servers run by Spacedrive.
We store the relays in Redis with a TTL. This is so that if a relay server is shut down and does not do its regular check-in, it will be automatically removed from the pool.
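The TTL behaviour can be illustrated with an in-memory sketch. `RelayRegistry` is a hypothetical stand-in for the Redis-backed pool, and the time is passed in explicitly so the expiry logic is easy to follow.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory stand-in for the Redis-backed relay pool.
pub struct RelayRegistry {
    ttl: Duration,
    last_check_in: HashMap<String, Instant>,
}

impl RelayRegistry {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, last_check_in: HashMap::new() }
    }

    /// Called by a relay on start-up and on every regular check-in.
    pub fn check_in(&mut self, relay_addr: &str, now: Instant) {
        self.last_check_in.insert(relay_addr.to_string(), now);
    }

    /// Relays whose last check-in is younger than the TTL, sorted by address.
    pub fn active(&self, now: Instant) -> Vec<&str> {
        let mut relays: Vec<&str> = self
            .last_check_in
            .iter()
            .filter(|(_, seen)| now.duration_since(**seen) < self.ttl)
            .map(|(addr, _)| addr.as_str())
            .collect();
        relays.sort();
        relays
    }
}
```

A relay that stops checking in simply ages out of `active`, which is the same effect the Redis TTL provides without any explicit deregistration step.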
## How it works
We register a listener for each relay that is returned from the discovery server. When a connection is established we will attempt to connect to the relay server. We also attempt to establish connections with peers that we already know about through the active libraries.
Currently we connect to every relay server that is returned from the discovery server. This is obviously not ideal but if two nodes were to connect to different relay servers we would need some way of communicating between them (which is a complicated problem to solve).
The issue of connecting to every relay server is tracked as [ENG-1672](https://linear.app/spacedriveapp/issue/ENG-1672/mesh-relays).
## Authentication
Currently the relay service is completely unauthenticated. To prevent abuse we are planning to restrict the relays to Spacedrive accounts.
libp2p doesn't have a ready-made solution for this as it's heavily designed around the [IPFS](https://ipfs.tech) use case, which is all open. This will likely require a custom network behavior to be implemented in libp2p which will be a decent undertaking.
This issue is tracked as [ENG-1652](https://linear.app/spacedriveapp/issue/ENG-1652/relay-authentication).
## Billing
Currently the relay service has no method of tracking usage based on the connected peers.
libp2p doesn't have a ready-made solution for this as it's heavily designed around the [IPFS](https://ipfs.tech) use case, which is all open. This will likely require a custom network behavior to be implemented in libp2p which will be a decent undertaking.
This issue is tracked as [ENG-1667](https://linear.app/spacedriveapp/issue/ENG-1667/relay-metering).
## Rate limiting
We should rate limit connections being opened with the relay to ensure denial of service attacks are not possible.
libp2p has a built-in [RateLimiter](https://docs.rs/libp2p/latest/libp2p/relay/trait.RateLimiter.html) trait which we can implement. The rate limiting information should be stored in Redis so it is shared between all relays.
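As a sketch of the kind of logic such a limiter would need, the following implements a simple fixed-window counter per peer. It deliberately does not implement libp2p's `RateLimiter` trait and keeps state in memory rather than Redis; `ConnectionLimiter` and its methods are hypothetical names.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical fixed-window limiter for new connections, keyed by peer.
pub struct ConnectionLimiter {
    max_per_window: u32,
    window: Duration,
    /// Start of the current window and number of connections seen in it.
    windows: HashMap<String, (Instant, u32)>,
}

impl ConnectionLimiter {
    pub fn new(max_per_window: u32, window: Duration) -> Self {
        Self { max_per_window, window, windows: HashMap::new() }
    }

    /// Returns `true` if the peer may open another connection at `now`.
    pub fn try_connect(&mut self, peer: &str, now: Instant) -> bool {
        let entry = self.windows.entry(peer.to_string()).or_insert((now, 0));
        // Start a fresh window once the previous one has elapsed.
        if now.duration_since(entry.0) >= self.window {
            *entry = (now, 0);
        }
        if entry.1 < self.max_per_window {
            entry.1 += 1;
            true
        } else {
            false
        }
    }
}
```

A fixed window is the simplest option; a sliding window or token bucket would smooth out bursts at window boundaries at the cost of more state per peer.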
## Alternative design
Our relay system is currently built on top of [libp2p](https://libp2p.io)'s system for relays. Given all of the limitations of the current design discussed above I don't think libp2p's relay system was really designed for private relays so it could be worth dropping it entirely and investigating another solution.
I have done some digging into [WebRTC](https://webrtc.org) (specifically [STUN](https://en.wikipedia.org/wiki/STUN) and [TURN](https://en.wikipedia.org/wiki/Traversal_Using_Relays_around_NAT)) and it does seem like a really solid alternative.
Given the core of the `sd_p2p` crate is decoupled from libp2p we could easily implement an alternative connection system based on WebRTC while keeping libp2p for the quic-based transport for local networks.
The major advantage to using WebRTC would be the ability to use a SaaS solution for hosting the relay infrastructure. WebRTC is based on [STUN](https://en.wikipedia.org/wiki/STUN) and [TURN](https://en.wikipedia.org/wiki/Traversal_Using_Relays_around_NAT), which are ubiquitous protocols. The following is a comparison of some WebRTC services:
| | Pricing (per GB) | Has Billing API |
|-------------------------------------------------------------------|-------------------|-----------------|
| [Cloudflare Calls](https://developers.cloudflare.com/calls/turn/) | 0.05$ | No |
| [Twilio](https://www.twilio.com/stun-turn) | 0.40$ to 0.80$ | No |
| [Metered](https://www.metered.ca/stun-turn) | 0.40$ to 0.10$ | Yes |
WebRTC also has a built-in system for authentication via the SDP object that needs to be exchanged between peers for a valid connection to be established. For an explanation of WebRTC check out [this Fireship video](https://www.youtube.com/watch?v=WmR9IMUD_CY).
libp2p *does* have a [WebRTC transport](https://docs.rs/libp2p-webrtc/0.7.1-alpha) but it seems to be only for browser to server communication, not server to server like we require, so I don't think it would be viable for our use case.
### Security
The relay works through encrypted communication so we cannot read the data that is being relayed. The relay servers currently must be owned by Spacedrive to work, however it's possible we can allow the community to run their own relays in the future.
### Hosting the Relay server
<Notice
type="note"
text="Be careful running this on your local machine as it will expose your public IP address to all Spacedrive users."
/>
1. Set up the server using the following command:
```bash
cargo run -p sd-p2p-relay init
# You will be prompted to enter the p2p secret.
# It can be found in the `spacedrive-api` Vercel project as the `P2P_SECRET` environment variable.
```
2. Now that you have set up the server, you can run the relay server using the following command:
```bash
cargo run -p sd-p2p-relay
```
*Note that you will need to ensure port `7373` is exposed through your firewall for this to work.*

View File

@@ -0,0 +1,34 @@
---
title: sd-p2p
index: 22
---
# `sd_p2p` crate
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p)
The P2P crate was designed from the ground up to be modular.
The `P2P` struct is the core of the system, but doesn't actually implement any P2P functionality. It's a state manager and event bus which exposes a hook system for other components of the P2P system to register themselves.
This modular design helps with separation of concerns which significantly helps with comprehending the entire system and streamlines testing.
## What are hooks?
A hook is very similar to an actor. It's a component which can be registered with the P2P system and it is allowed to listen and react to events.
A hook allows for processing events from the P2P system and also ensures that when the P2P system shuts down, the hook is also shut down.
There are special hooks called listeners. These are implemented as a superset of a regular hook and are able to create and accept connections.
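To make the hook idea more concrete, here is a minimal synchronous sketch of an event bus with registered hooks. Every name in it (`Event`, `Hook`, `EventBus`, `RecordingHook`) is hypothetical and the real `P2P` struct is asynchronous and considerably more involved.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical events the bus can deliver to hooks.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    PeerDiscovered(String),
    Shutdown,
}

/// A component that listens and reacts to events.
pub trait Hook {
    fn on_event(&mut self, event: &Event);
}

/// State manager and event bus that hooks register themselves with.
#[derive(Default)]
pub struct EventBus {
    hooks: Vec<Box<dyn Hook>>,
}

impl EventBus {
    pub fn register(&mut self, hook: Box<dyn Hook>) {
        self.hooks.push(hook);
    }

    /// Delivers the event to every registered hook in registration order.
    pub fn emit(&mut self, event: &Event) {
        for hook in &mut self.hooks {
            hook.on_event(event);
        }
    }

    /// Consumes the bus, telling every hook to shut down first.
    pub fn shutdown(mut self) {
        self.emit(&Event::Shutdown);
    }
}

/// Example hook that records every event it sees.
pub struct RecordingHook {
    pub seen: Rc<RefCell<Vec<Event>>>,
}

impl Hook for RecordingHook {
    fn on_event(&mut self, event: &Event) {
        self.seen.borrow_mut().push(event.clone());
    }
}
```

Because `shutdown` takes the bus by value, no hook can outlive it, which mirrors the guarantee that hooks are shut down together with the P2P system.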
## Default hooks?
The `sd_p2p` crate comes with a few default hooks:
- `Mdns` - Local network discovery using mDNS
- `Quic` - Quic transport layer built on top of `libp2p`
Spacedrive implements some of its own hooks within the `core/src/p2p` directory to deal with libraries correctly.
## Lazy vs eager connection
The P2P system is designed to lazily connect to peers. This is intentional to preserve battery life and reduce network usage. When the client attempts to connect to a remote peer it will establish a connection and automatically close it after a period of inactivity.
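The lazy connection lifecycle can be sketched as a small state machine. `LazyConnection` is a hypothetical type for illustration and the idle timeout value is left to the caller, since this document does not specify one.

```rust
use std::time::{Duration, Instant};

/// Hypothetical model of a connection that is opened on demand and
/// closed again after a period of inactivity.
pub struct LazyConnection {
    idle_timeout: Duration,
    /// `None` while disconnected, otherwise the time of the last activity.
    last_activity: Option<Instant>,
}

impl LazyConnection {
    pub fn new(idle_timeout: Duration) -> Self {
        Self { idle_timeout, last_activity: None }
    }

    pub fn is_connected(&self) -> bool {
        self.last_activity.is_some()
    }

    /// Marks the connection as used, connecting first if needed.
    /// Returns `true` if a new connection had to be established.
    pub fn use_connection(&mut self, now: Instant) -> bool {
        let newly_opened = self.last_activity.is_none();
        self.last_activity = Some(now);
        newly_opened
    }

    /// Closes the connection if it has been idle for at least the timeout.
    pub fn tick(&mut self, now: Instant) {
        if let Some(last) = self.last_activity {
            if now.duration_since(last) >= self.idle_timeout {
                self.last_activity = None;
            }
        }
    }
}
```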

View File

@@ -0,0 +1,27 @@
---
title: sd-p2p-block
index: 24
---
# `sd_p2p_block`
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/block)
A file block protocol based on [SyncThing Block Exchange Protocol v1](https://docs.syncthing.net/specs/bep-v1.html).
The goal of this protocol is to take bytes in and reliably and quickly transfer them to the other side.
## Example
```rust
// TODO
```
TODO - Outline my idea for a better implementation.
https://linear.app/spacedriveapp/issue/ENG-1760/block-protocol-v2
https://linear.app/spacedriveapp/issue/ENG-1292/spaceblock-abstract-name-from-spaceblockrequest
https://linear.app/spacedriveapp/issue/ENG-1312/spaceblock-file-checksum
https://linear.app/spacedriveapp/issue/ENG-563/spaceblock-error-handling
https://linear.app/spacedriveapp/issue/ENG-567/spaceblock-cancel-transfer
https://linear.app/spacedriveapp/issue/ENG-572/spaceblock-file-name-overflow

View File

@@ -0,0 +1,75 @@
---
title: sd-p2p-proto
index: 23
---
# `sd_p2p_proto`
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/proto)
This crate provides utilities for implementing asynchronous deserializers and matching synchronous serializers. The goal of these implementations is to rapidly send and receive Rust structs over the network.
This crate allows for creating implementations faster than other common options, at the cost of some developer experience.
Before building, the performance of both [msgpack](https://docs.rs/rmp-serde) and [bincode](https://docs.rs/bincode) was compared against manual implementations using `AsyncRead`. It was found that over the network using asynchronous deserialization was faster.
This logically follows because with a synchronous deserializer you will do the following:
- Send the total length of the message
- Allocate a buffer for the message
- Wait asynchronously for the buffer to be filled
- Synchronously copy from the buffer into each of the struct fields
When using an asynchronous deserializer you can skip sending the message's length and allocating the intermediate buffer, as we can rely on the known length of each field while decoding. This is a win for performance and memory usage.
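The field-by-field approach can be sketched as below. To keep the example dependency-free it uses the blocking `std::io::Read` as a stand-in for `AsyncRead`, and `Ping` with its two fields is a hypothetical message, not a type from this crate.

```rust
use std::io::{self, Read};

/// Hypothetical message made only of fixed-size fields.
#[derive(Debug, PartialEq, Eq)]
pub struct Ping {
    pub id: [u8; 16],
    pub sequence: u64,
}

impl Ping {
    /// Serializes the fields back to back, with no overall length prefix.
    pub fn to_bytes(&self) -> Vec<u8> {
        let mut bytes = Vec::with_capacity(24);
        bytes.extend_from_slice(&self.id);
        bytes.extend_from_slice(&self.sequence.to_le_bytes());
        bytes
    }

    /// Reads each field straight from the stream into its final location.
    /// No intermediate whole-message buffer is allocated because the size
    /// of every field is known up front.
    pub fn from_stream(stream: &mut impl Read) -> io::Result<Self> {
        let mut id = [0u8; 16];
        stream.read_exact(&mut id)?;
        let mut sequence = [0u8; 8];
        stream.read_exact(&mut sequence)?;
        Ok(Self { id, sequence: u64::from_le_bytes(sequence) })
    }
}
```

With an asynchronous reader the same structure applies, with each `read_exact` awaited so the task yields while waiting for the next field.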
This crate provides utilities to make the implementations less error prone. However, long term it would be great to replace this with a derive macro similar to how crates like [serde](https://serde.rs) work.
From my ([oscartbeaumont](https://github.com/oscartbeaumont)'s) research no crate exists that meets these requirements. I attempted the implementation of one called [binario](https://github.com/oscartbeaumont/binario), however it is still incomplete as juggling async and lifetimes is pretty challenging. The recent stabilisation of [RPITIT](https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html) would likely make this much easier if it were to be implemented today.
This issue is tracked as [ENG-431](https://linear.app/spacedriveapp/issue/ENG-431/binary-handling-abstraction).
## Example
### UUID
```rust
let uuid: uuid::Uuid = todo!();
// Encode
let mut bytes = vec![];
encode::uuid(&mut bytes, uuid);
// Decode
let mut stream: sd_p2p::UnicastStream = todo!(); // Any `AsyncRead + Unpin` works; this will commonly be a `sd_p2p::UnicastStream`
let uuid = decode::uuid(&mut stream).await.unwrap();
```
### String
```rust
let string = format!("Hello, World!");
// Encode
let mut bytes = vec![];
encode::string(&mut bytes, string);
// Decode
let mut stream: sd_p2p::UnicastStream = todo!(); // Any `AsyncRead + Unpin` works; this will commonly be a `sd_p2p::UnicastStream`
let string = decode::string(&mut stream).await.unwrap();
```
### Buffer
This may seem redundant, but it is required for dynamically sized buffers: the length of the buffer cannot be known in advance when decoding, so we must send the length of the buffer too.
```rust
let buf = b"Hello, World!".to_vec();
// Encode
let mut bytes = vec![];
encode::buf(&mut bytes, buf);
// Decode
let mut stream: sd_p2p::UnicastStream = todo!(); // Any `AsyncRead + Unpin` works; this will commonly be a `sd_p2p::UnicastStream`
let buf = decode::buf(&mut stream).await.unwrap();
```
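To show why the length has to travel with a dynamically sized buffer, here is a minimal sketch of length-prefixed encoding. It assumes a 4-byte little-endian prefix and uses the blocking `std::io::Read` in place of `AsyncRead`; `encode_buf` and `decode_buf` are illustrative names and the crate's real wire format may differ.

```rust
use std::io::{self, Read};

/// Appends a 4-byte little-endian length followed by the buffer itself.
pub fn encode_buf(bytes: &mut Vec<u8>, buf: &[u8]) {
    assert!(buf.len() <= u32::MAX as usize, "buffer too large for a u32 length");
    bytes.extend_from_slice(&(buf.len() as u32).to_le_bytes());
    bytes.extend_from_slice(buf);
}

/// Reads the length prefix first, then exactly that many bytes.
pub fn decode_buf(stream: &mut impl Read) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    stream.read_exact(&mut len)?;
    let mut buf = vec![0u8; u32::from_le_bytes(len) as usize];
    stream.read_exact(&mut buf)?;
    Ok(buf)
}
```

Without the prefix the decoder could not tell where one buffer ends and the next field begins when several values are written to the same stream. A real implementation would likely also cap the accepted length before allocating.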

View File

@@ -0,0 +1,18 @@
---
title: sd-p2p-tunnel
index: 24
---
# `sd_p2p_tunnel`
[Implementation](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/tunnel)
TODO
## Example
```rust
// TODO
```
TODO - https://linear.app/spacedriveapp/issue/ENG-753/spacetunnel-encryption

View File

@@ -0,0 +1,16 @@
---
title: Transport layer
index: 28
---
# Transport Layer
We use [QUIC](https://en.wikipedia.org/wiki/QUIC) as our transport layer for peer-to-peer communication. QUIC is a great fit for our use cases as it's fast, has built-in stream multiplexing and has TLS built in, so we get encryption out of the box.
## TLS authentication
QUIC comes with built-in TLS authentication which we make use of. Each node is issued a keypair ([`Identity`](https://github.com/spacedriveapp/spacedrive/blob/518d5836f6585a5f597c3ae5a0d27d084adc0a63/crates/p2p/src/identity.rs#L29)) which is stored in the node configuration.
This certificate ensures the communication between our node and the remote node can't be intercepted or tampered with, however it provides no assurances about the identity of the remote node.
An attacker could still do an [MITM](https://en.wikipedia.org/wiki/Man-in-the-middle_attack) by sitting in the middle and presenting it's certificate to each side. To combat this we also have library certificates that allow us to verify and encrypt the libraries traffic so it can only be decoded by another node within the library.


@@ -0,0 +1,77 @@
---
title: usage
index: 21
---
# Usage
This is a high-level guide of how to build features within Spacedrive on top of the peer-to-peer system. I would recommend referring to this [example PR](#todo) alongside this guide as a practical reference.
Start by adding a new variant to the [`Header` enum](https://github.com/spacedriveapp/spacedrive/blob/main/core/src/p2p/protocol.rs) and adjusting the `Header::from_stream` and `Header::to_bytes` implementations to support it.
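As a rough illustration of this step, the sketch below shows a header enum identified by a single discriminator byte. It is a simplified stand-in and not the real `Header` from `core/src/p2p/protocol.rs`: the `Ping` variant, the discriminator values, and the use of the synchronous `std::io::Read` in place of an async stream are all assumptions.

```rust
use std::io::{self, Read};

/// Hypothetical stand-in for the real `Header` enum.
#[derive(Debug, PartialEq)]
enum Header {
    Ping,
    NameOfYourNewHeaderVariant,
}

impl Header {
    /// Each variant maps to one discriminator byte (values are made up for this sketch).
    fn to_bytes(&self) -> Vec<u8> {
        match self {
            Header::Ping => vec![0],
            Header::NameOfYourNewHeaderVariant => vec![1],
        }
    }

    /// Read the discriminator byte and map it back to a variant.
    fn from_stream(stream: &mut impl Read) -> io::Result<Header> {
        let mut discriminator = [0u8; 1];
        stream.read_exact(&mut discriminator)?;
        match discriminator[0] {
            0 => Ok(Header::Ping),
            1 => Ok(Header::NameOfYourNewHeaderVariant),
            other => Err(io::Error::new(
                io::ErrorKind::InvalidData,
                format!("unknown header discriminator {}", other),
            )),
        }
    }
}
```

Whatever the real encoding looks like, the two functions must stay in sync: every value `to_bytes` can emit needs a matching arm in `from_stream`.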
Next, create a new file for the feature's code in [`core/src/p2p/operations`](https://github.com/spacedriveapp/spacedrive/tree/main/core/src/p2p/operations) like the following:
```rust
use std::{error::Error, sync::Arc};

use sd_p2p::{RemoteIdentity, UnicastStream, P2P};
use tokio::io::AsyncWriteExt;
use tracing::debug;

use crate::p2p::Header;

/// This method can be called to send a ping to a remote peer.
/// The P2P system will take care of finding the peer and establishing a connection.
pub async fn ping(p2p: Arc<P2P>, identity: RemoteIdentity) -> Result<(), Box<dyn Error>> {
    let peer = p2p
        .peers()
        .get(&identity)
        .ok_or("Peer not found, has it been discovered?")?
        .clone();

    let mut stream = peer.new_stream().await?;
    stream.write_all(&Header::NameOfYourNewHeaderVariant.to_bytes()).await?;

    Ok(())
}

/// This method is called when a ping `Header` is found on the incoming request.
/// You must call this from the `match header` on the incoming handler.
pub(crate) async fn receiver(stream: UnicastStream) {
    debug!("Received communication from peer '{}'", stream.remote_identity());
}
```
Next, you need to set up an incoming handler [here](https://github.com/spacedriveapp/spacedrive/blob/4a62d268efea7dd6ff573531b1e2b2970c7ba562/core/src/p2p/manager.rs#L306) to define how your new `Header` variant should be handled when received. It should look something like:
```rust
match header {
    // ... existing variants
    Header::NameOfYourNewHeaderVariant => operations::name_of_your_new_file::receiver(stream).await,
}
```
Finally, you can use the `UnicastStream`, which implements [`AsyncRead`](https://docs.rs/tokio/latest/tokio/io/trait.AsyncRead.html) + [`AsyncWrite`](https://docs.rs/tokio/latest/tokio/io/trait.AsyncWrite.html), to send data back and forth between peers and implement any application functionality.
## Version compatibility and breaking changes
It is the responsibility of the developer to ensure the protocol does not go through any breaking changes, as this would cause communication errors when multiple devices are running different versions of the software.
However, sometimes a breaking change may be required, so we keep track of the Spacedrive version of each node within the peer metadata, which can be used to coordinate breaking changes.
In the sending code you will already have access to the `Peer`, so you can access the metadata directly. If you're in the receiver code, you can use the following to get the `Peer`:
```rust
let peer = p2p.peers().get(&stream.remote_identity()).unwrap();
```
Then you can access the version from the metadata like so:
```rust
// `peer` is either the `Peer` you already hold in the sending code, or the one looked up from the P2P system in the receiver:
let metadata = PeerMetadata::from_hashmap(&peer.metadata()).unwrap();
// You could use the `semver` crate to compare versions
let is_running_version_0_1_0 = metadata.version.as_deref() == Some("0.1.0");
```
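String equality only detects one exact version. If you need an "at least this version" check and don't want to pull in the `semver` crate, a numeric comparison can be sketched with the standard library alone. The helper names `parse_version` and `peer_supports` are hypothetical, and this sketch assumes plain `major.minor.patch` strings with no pre-release or build suffixes.

```rust
/// Parse a `major.minor.patch` string into a tuple that orders numerically.
/// Returns `None` for anything that is not exactly three numeric components.
fn parse_version(version: &str) -> Option<(u64, u64, u64)> {
    let mut parts = version.split('.');
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    let patch = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Returns `true` only if the peer reported a version that is at least `minimum`.
/// A missing or unparsable version is treated as unsupported.
fn peer_supports(peer_version: Option<&str>, minimum: (u64, u64, u64)) -> bool {
    peer_version
        .and_then(parse_version)
        .map_or(false, |version| version >= minimum)
}
```

With the snippet above, this could be called as `peer_supports(metadata.version.as_deref(), (0, 2, 0))`. Comparing tuples avoids the trap of comparing strings, where `"0.10.0"` would sort before `"0.2.0"`.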


@@ -1,208 +0,0 @@
---
title: p2p
index: 14
---
# Peer-to-peer
Our peer-to-peer technology works at the heart of Spacedrive allowing all of your devices to seamlessly communicate and share data. This documentation outlines
## Implementing features with P2P
TODO:
- From frontend, to backend
- Including authentication
- Versioning/making breaking change
- Show using `sd_p2p_tunnel`
## Underlying technology
### Terminology
- **Node**: An application running Spacedrive's network stack.
  - This could be the Spacedrive app or the P2P relay.
  - If you have multiple Spacedrive installations open on your computer, each one is an independent node.
- **Library**: A logical collection of your data within Spacedrive.
  - From a theoretical perspective, a library is just the conflict-resolved state of one or more **instances**, although a lot of the time we don't strictly treat it that way.
- **Instance**: An instance of a library running on a particular node.
  - An instance correlates directly to each SQLite file.
  - You could *technically* have more than one instance for a library on a single node, although our core would fall apart as we identify traffic by library.
- [`Identity`](https://github.com/spacedriveapp/spacedrive/blob/518d5836f6585a5f597c3ae5a0d27d084adc0a63/crates/p2p/src/identity.rs#L29) - A public/private keypair which represents the library or node.
- [`RemoteIdentity`](https://github.com/spacedriveapp/spacedrive/blob/518d5836f6585a5f597c3ae5a0d27d084adc0a63/crates/p2p/src/identity.rs#L70) - A public key which represents the library or node.
- [`PeerId`](https://docs.rs/libp2p/latest/libp2p/struct.PeerId.html) - The identifier libp2p uses. Can be derived from a `RemoteIdentity`.
### `sd_p2p` crate
The P2P crate was designed from the ground up to be modular.
The `P2P` struct is the core of the system and, surprisingly, doesn't actually perform any P2P functionality. It's a state manager and event bus, and it provides a hook system for other components of the P2P system to register themselves.
This modular design helps with separating concerns, which makes the entire system easier to comprehend and easier to test.
The `sd_p2p` crate provides a hook for:
- `Mdns` - Local network discovery
- `Quic` - Quic transport layer built on top of `libp2p`
#### What are hooks?
A hook is very similar to an actor. It's a component which can be registered with the P2P system.
A hook allows for processing events from the P2P system and also ensures when the P2P system shuts down, the hook is also shutdown.
There are special hooks called listeners. These are implemented as a superset of a regular hook and are able to create and accept connections.
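The hook idea can be sketched in a few lines of standard-library Rust. This is a conceptual illustration only and not the `sd_p2p` API: `EventBus`, `Event`, and `register_hook` are hypothetical names, and a real hook would run as its own task instead of being polled.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical events broadcast to every registered hook.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    PeerDiscovered(String),
    Shutdown,
}

/// Hypothetical state manager and event bus: it does no networking itself.
struct EventBus {
    hooks: Vec<Sender<Event>>,
}

impl EventBus {
    fn new() -> Self {
        EventBus { hooks: Vec::new() }
    }

    /// Register a hook and hand back the receiving end of its event queue.
    fn register_hook(&mut self) -> Receiver<Event> {
        let (tx, rx) = channel();
        self.hooks.push(tx);
        rx
    }

    /// Deliver an event to every hook, ignoring hooks that have already gone away.
    fn emit(&self, event: Event) {
        for hook in &self.hooks {
            let _ = hook.send(event.clone());
        }
    }

    /// Shutting down the bus tells every hook to shut down too,
    /// then drops the senders so each hook's queue closes.
    fn shutdown(self) {
        self.emit(Event::Shutdown);
    }
}
```

The point of the design is that the bus knows nothing about mDNS or QUIC; each component only needs a queue of events and a guarantee that it hears about shutdown.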
Subcrates:
- [`sd_p2p_block`](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/block) - Block protocol based on [SyncThing Block Exchange Protocol v1](https://docs.syncthing.net/specs/bep-v1.html)
- [`sd_p2p_proto`](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/proto) - Utilities for zero fluff encoding and decoding.
- [`sd_p2p_tunnel`](https://github.com/spacedriveapp/spacedrive/tree/main/crates/p2p/crates/tunnel) - Encrypt a stream of data between two nodes
#### `sd_p2p_proto`
This crate provides utilities for implementing asynchronous deserializers and matching synchronous serializers. The goal of these implementations is to really quickly send and receive Rust structs over the network.
This crate allows for creating implementations faster than other common options, at the cost of some developer experience.
Before building this I originally compared the performance of both [msgpack](https://docs.rs/rmp-serde) and [bincode](https://docs.rs/bincode) against manual implementations using `AsyncRead` and I found that over the network using asynchronous deserialization was faster.
This makes sense logically, because if you want to use a synchronous serializer you will do the following:
- Send the total length of the message
- Allocate a buffer for the message
- Wait asynchronously for the buffer to be filled
- Synchronously copy from the buffer into each of the struct fields
When using an asynchronous deserializer, you can skip sending the message's length and allocating the intermediate buffer, because we can rely on the known length of each field while decoding. This is a win for both performance and memory usage.
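A minimal sketch of field-by-field decoding, under some assumptions: the `Announce` struct and its fields are hypothetical, the byte order is chosen arbitrarily, and the synchronous `std::io::Read` stands in for `AsyncRead` to keep the example free of dependencies. Each field is read straight from the stream into its final location, with no total-length prefix and no intermediate message buffer.

```rust
use std::io::{self, Read};

/// Hypothetical message made only of fixed-size fields.
#[derive(Debug, PartialEq)]
struct Announce {
    id: u64,
    port: u16,
}

impl Announce {
    /// Serialize synchronously: the fields are written back to back with no length prefix.
    fn to_bytes(&self) -> Vec<u8> {
        let mut buf = Vec::with_capacity(10);
        buf.extend_from_slice(&self.id.to_le_bytes());
        buf.extend_from_slice(&self.port.to_le_bytes());
        buf
    }

    /// Deserialize field by field: each read asks for exactly the bytes that field needs.
    fn from_stream(stream: &mut impl Read) -> io::Result<Self> {
        let mut id = [0u8; 8];
        stream.read_exact(&mut id)?;

        let mut port = [0u8; 2];
        stream.read_exact(&mut port)?;

        Ok(Announce {
            id: u64::from_le_bytes(id),
            port: u16::from_le_bytes(port),
        })
    }
}
```

In the asynchronous version each `read_exact` becomes an await point, so decoding can make progress as bytes arrive instead of waiting for the whole message.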
This crate provides utilities to make the implementations less error prone, however long term it would be great to replace this with a derive macro similar to how crates like [serde](https://serde.rs) work.
From my research, no crate exists that meets these requirements. It is also a difficult problem because you're juggling lifetimes and async, which is rough. I attempted an implementation called [binario](https://github.com/oscartbeaumont/binario); however, it is still incomplete, so we never adopted it. I suspect Rust's recent stabilization of [RPITIT](https://blog.rust-lang.org/2023/12/21/async-fn-rpit-in-traits.html) would make this much easier.
### Local Network Discovery
Our local network discovery uses [DNS-Based Service Discovery](https://www.rfc-editor.org/rfc/rfc6763.html), which itself is built around [Multicast DNS (mDNS)](https://datatracker.ietf.org/doc/html/rfc6762). This is a well-established technology and is used in [Spotify Connect](https://support.spotify.com/au/article/spotify-connect/), [Apple AirPlay](https://www.apple.com/au/airplay/) and many other services you use every day.
#### Service Structure
The following is an example of what would be broadcast from a single Spacedrive node:
```toml
# {remote_identity_of_self}._sd._udp.local.
name=Oscars Laptop # Shown to the user to select a device
operating_system=macos # Used for UI purposes
device_model=MacBook Pro # Used for UI purposes
version=0.0.1 # Spacedrive version
# For each library that's active on the Spacedrive node:
# {library_uuid}={remote_identity_of_self}
d66ed0c3-03ac-4f9b-a374-a927830dfd5b=0l9vTOWu+5aJs0cyWxdfJEGtloEepGRAXcEuDeTDRPk
```
Within `sd-core` this is defined in two parts. The [`PeerMetadata` struct](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/metadata.rs#L9) takes care of the node metadata and libraries are inserted by the [`libraries_hook`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/libraries.rs#L13).
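To make the key/value shape of the record concrete, here is a simplified sketch of converting node metadata to and from a string map. It is not the real `PeerMetadata`: the `NodeMetadata` struct, its choice of optional fields, and the error type are assumptions for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for the node metadata.
#[derive(Debug, PartialEq)]
struct NodeMetadata {
    name: String,
    operating_system: Option<String>,
    version: Option<String>,
}

impl NodeMetadata {
    /// Flatten the struct into the key/value pairs that would be broadcast.
    fn to_hashmap(&self) -> HashMap<String, String> {
        let mut map = HashMap::new();
        map.insert("name".to_string(), self.name.clone());
        if let Some(os) = &self.operating_system {
            map.insert("operating_system".to_string(), os.clone());
        }
        if let Some(version) = &self.version {
            map.insert("version".to_string(), version.clone());
        }
        map
    }

    /// Rebuild the struct from a received record.
    /// Unknown keys, such as the per-library entries, are simply ignored here.
    fn from_hashmap(map: &HashMap<String, String>) -> Result<Self, String> {
        Ok(NodeMetadata {
            name: map
                .get("name")
                .cloned()
                .ok_or_else(|| "missing required key 'name'".to_string())?,
            operating_system: map.get("operating_system").cloned(),
            version: map.get("version").cloned(),
        })
    }
}
```

Because the metadata parser ignores keys it doesn't know, a separate component can add the `{library_uuid}` entries to the same record without the two parts interfering with each other.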
#### Modes
<Notice
type="note"
text="This section discusses 'Contacts Only' which is not yet fully implemented (refer to ENG-1197)."
/>
Within Spacedrive's settings the user is able to choose between three modes for local network discovery:
- **Contacts only**: Only devices that are in your contacts list will be able to see your device.
- **Enabled**: All devices on the local network will be able to see your device.
- **Disabled**: No devices on the local network will be able to see your device.
**Enabled** and **Disabled** are implemented by spawning and shutting down the [`sd_p2p::Mdns`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/crates/p2p/src/mdns.rs#L17) service as required within `sd-core`.
In **Contacts only** mode, the mDNS service will not contain the [`PeerMetadata`](https://github.com/spacedriveapp/spacedrive/blob/44478207e72495b3777e294660d78939711b544f/core/src/p2p/metadata.rs#L9) fields and will instead contain a hash of the user's Spacedrive identifier. If a Spacedrive node detects another node on the local network whose hash matches one of its contacts, it can make a request to that node. If the remote node also has the current node in its contacts, it will respond with the full metadata.
#### Implementation
We make use of the [mdns-sd](https://docs.rs/mdns-sd) crate.
### Manual connection
The user can manually provide a set of [`SocketAddr`](https://doc.rust-lang.org/std/net/enum.SocketAddr.html)s for the P2P system to attempt to connect to.
This feature primarily exists for usage in combination with Docker but it could be useful for working around difficult network setups.
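As a small illustration, user-provided addresses can be validated with the standard library before they are handed to the P2P system. The `parse_manual_peers` helper and the comma-separated input format are assumptions for this sketch, not how Spacedrive stores the setting.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Parse a comma-separated list of `ip:port` entries.
/// Fails on the first entry that is not a valid socket address.
fn parse_manual_peers(input: &str) -> Result<Vec<SocketAddr>, AddrParseError> {
    input
        .split(',')
        .map(|entry| entry.trim())
        .filter(|entry| !entry.is_empty())
        .map(|entry| entry.parse())
        .collect()
}
```

Note that `SocketAddr` only accepts literal IP addresses with a port (IPv6 addresses in square brackets), so hostnames such as a Docker service name would need to be resolved separately.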
#### Implementation
TODO - TODO
#### Problems with Docker
TODO - MDNS daemon
TODO - Docker and why it's a pain mDNS. Explain the current stuff i've done with it.
### Transport layer
TODO - Quic
TODO - Explain authentication
### Relay
TODO
### Direct Connection via Relay
TODO
#### Authentication
TODO - How we gonna restrict this???
#### Billing
TODO - How we gonna bill for this???
### Design Decisions
TODO
### Things I would do differently?
TODO
### Crates
TODO
#### Security
##### Threat model
TODO - Risks of sharing IP's using discovery, risks of compromised relay, risks of compromised local network during pairing
##### Authentication
TODO
##### Authorization
TODO
##### Tracking
TODO - Link to Apple stuff
#### Version compatibility and breaking changes
TODO - Compatibility across versions of Spacedrive
#### libp2p
TODO - Why libp2p fork?, Why libp2p can be problematic for what we do
TODO - How we transpose our certificates to libp2p certificates
#### Major issues
TODO - mDNS issues on Linux
TODO - The double up of service discovery when using local and relay
TODO - Question? Why does remote_identity_of_self show up in metadata and the mDNS record itself.
{/* TODO */}
TODO - Request flow. Eg. incoming goes from Quic to mpsc to the users code, to the handlers.
TODO - Resumable uploads/transfers