From 1efbd26388c60bad451ccec210c1e135d1dd1091 Mon Sep 17 00:00:00 2001
From: Evan Quiney
Date: Mon, 22 Dec 2025 17:57:43 +0000
Subject: [PATCH] add architecture.md, move images to docs/imgs (#968)

## Motivation

Documentation will make contribution easier and communicate our development philosophy and decision process.

Closes #967

## Changes

Added `architecture.md` to docs/ and moved the images out of docs and into their own docs/imgs/ folder
---
 README.md | 6 +-
 docs/architecture.md | 64 ++++++++++++++++++
 docs/{ => imgs}/exo-logo-black-bg.jpg | Bin
 .../exo-logo-transparent-black-text.png | 0
 docs/{ => imgs}/exo-logo-transparent.png | 0
 docs/{ => imgs}/exo-rounded.png | 0
 docs/{ => imgs}/exo-screenshot.jpg | Bin
 docs/{ => imgs}/four-mac-studio-topology.png | Bin
 docs/{ => imgs}/macos-app-one-macbook.png | Bin
 9 files changed, 67 insertions(+), 3 deletions(-)
 create mode 100644 docs/architecture.md
 rename docs/{ => imgs}/exo-logo-black-bg.jpg (100%)
 rename docs/{ => imgs}/exo-logo-transparent-black-text.png (100%)
 rename docs/{ => imgs}/exo-logo-transparent.png (100%)
 rename docs/{ => imgs}/exo-rounded.png (100%)
 rename docs/{ => imgs}/exo-screenshot.jpg (100%)
 rename docs/{ => imgs}/four-mac-studio-topology.png (100%)
 rename docs/{ => imgs}/macos-app-one-macbook.png (100%)

diff --git a/README.md b/README.md
index a79c2317..242e23b5 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
-
- exo logo
+
+ exo logo
 exo: Run your own AI cluster at home with everyday devices. Maintained by [exo labs](https://x.com/exolabs).
@@ -99,7 +99,7 @@ This starts the exo dashboard and API at http://localhost:52415/
 exo ships a macOS app that runs in the background on your Mac.
-exo macOS App - running on a MacBook
+exo macOS App - running on a MacBook
 The macOS app requires macOS Tahoe 26.2 or later.
diff --git a/docs/architecture.md b/docs/architecture.md
new file mode 100644
index 00000000..6f60a7b9
--- /dev/null
+++ b/docs/architecture.md
@@ -0,0 +1,64 @@
+# EXO Architecture overview
+
+EXO uses an _Event Sourcing_ architecture and Erlang-style _message passing_. To facilitate this, we've written a channel library extending anyio channels with inspiration from tokio::sync::mpsc.
+
+Each logical module is designed to function independently of the others and communicates with the rest of the system by sending messages on topics.
+
+## Systems
+
+There are currently 5 major systems:
+
+- Master
+
+  Executes placement and orders events through a single writer.
+
+- Worker
+
+  Schedules work on a node, gathers system information, etc.
+
+- Runner
+
+  Executes inference jobs (for now) in a process isolated from the worker, for fault tolerance.
+
+- API
+
+  Runs a Python web server that exposes state and commands to client applications.
+
+- Election
+
+  Implements a distributed algorithm for master election in unstable networking conditions.
+
+## Topics
+
+There are currently 5 topics:
+
+- Commands
+
+  The API and Worker use this topic to instruct the master when the event log isn't sufficient. At the moment, placement and catch-up requests go through Commands.
+
+- Local Events
+
+  All nodes write events here; the master reads those events and orders them.
+
+- Global Events
+
+  The master writes events here; all nodes read from this topic and fold the produced events into their `State`.
+
+- Election Messages
+
+  Before establishing a cluster, nodes communicate here to negotiate a master node.
+
+- Connection Messages
+
+  The networking system writes mDNS-discovered hardware connections here.
+
+
+## Event Sourcing
+
+Much has been written about event sourcing; for our purposes, it lets us centralize the handling of faulty connections and message ACKing with the following model.
+
+Whenever a device produces side effects, it captures those side effects in an `Event`. `Event`s are then "applied" to the device's model of `State`, which is globally distributed across the cluster. Whenever a command is received, it is combined with state to produce side effects, which are captured in yet more events. The rule of thumb is "`Event`s are past tense, `Command`s are imperative". Telling a node to perform some action like "place this model" or "give me a copy of the event log" is represented by a command (the worker's `Task`s are also commands), while "this node is using 300 GB of RAM" is an event. Notably, `Event`s SHOULD never cause side effects on their own. There are a few exceptions to this; we're still working out the specifics of generalizing the distributed event sourcing model to better suit our needs.
+
+## Purity
+
+A significant goal of the current design is to make data flow explicit. Classes should either represent simple data (`CamelCaseModel`s typically, and `TaggedModel`s for unions) or active `System`s (Erlang `Actor`s), with all transformations of that data being "referentially transparent": destructure and construct new data, don't mutate in place. We have had varying degrees of success with this, and are still exploring where purity makes sense.
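As a rough illustration of the event sourcing and purity rules described above, the following is a minimal sketch and not code from the exo repository. Every name in it (`NodeMemoryReported`, `ModelPlaced`, `PlaceModel`, `State`, `apply`, `handle`, `fold`) is hypothetical, and the placement rule (pick the node reporting the least RAM in use) is an assumption made only so the example has something to decide. It shows past-tense events being folded into an immutable `State`, and an imperative command being combined with state to produce further events.

```python
from dataclasses import dataclass, field, replace
from functools import reduce
from typing import Iterable, List, Mapping, Union

# NOTE: all names here are hypothetical, for illustration only.


@dataclass(frozen=True)
class NodeMemoryReported:
    """Event (past tense): a node reported how much RAM it is using."""
    node_id: str
    ram_used_gb: int


@dataclass(frozen=True)
class ModelPlaced:
    """Event (past tense): a model was placed on a node."""
    node_id: str
    model_id: str


Event = Union[NodeMemoryReported, ModelPlaced]


@dataclass(frozen=True)
class PlaceModel:
    """Command (imperative): ask for a model to be placed somewhere."""
    model_id: str


@dataclass(frozen=True)
class State:
    """Simple data: never mutated in place, only rebuilt."""
    ram_used_gb: Mapping[str, int] = field(default_factory=dict)
    placements: Mapping[str, str] = field(default_factory=dict)


def apply(state: State, event: Event) -> State:
    """Fold one event into state by constructing a new State."""
    if isinstance(event, NodeMemoryReported):
        updated = {**state.ram_used_gb, event.node_id: event.ram_used_gb}
        return replace(state, ram_used_gb=updated)
    if isinstance(event, ModelPlaced):
        updated = {**state.placements, event.model_id: event.node_id}
        return replace(state, placements=updated)
    raise TypeError(f"unknown event: {event!r}")


def handle(state: State, command: PlaceModel) -> List[Event]:
    """Combine a command with state; the outcome is captured as events."""
    if not state.ram_used_gb:
        return []  # no known nodes, so nothing can be placed
    # Assumed rule for this sketch: choose the node using the least RAM.
    target = min(state.ram_used_gb, key=lambda node: state.ram_used_gb[node])
    return [ModelPlaced(node_id=target, model_id=command.model_id)]


def fold(events: Iterable[Event], initial: State = State()) -> State:
    """Replay an ordered event log on top of an initial state."""
    return reduce(apply, events, initial)


log = [NodeMemoryReported("node-a", 300), NodeMemoryReported("node-b", 120)]
state = fold(log)
new_events = handle(state, PlaceModel("some-model"))
next_state = fold(new_events, state)
```

Because `apply` only builds new values, replaying the same ordered log always yields the same `State`, which is what lets every node fold a shared event stream independently and still agree.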
diff --git a/docs/exo-logo-black-bg.jpg b/docs/imgs/exo-logo-black-bg.jpg
similarity index 100%
rename from docs/exo-logo-black-bg.jpg
rename to docs/imgs/exo-logo-black-bg.jpg
diff --git a/docs/exo-logo-transparent-black-text.png b/docs/imgs/exo-logo-transparent-black-text.png
similarity index 100%
rename from docs/exo-logo-transparent-black-text.png
rename to docs/imgs/exo-logo-transparent-black-text.png
diff --git a/docs/exo-logo-transparent.png b/docs/imgs/exo-logo-transparent.png
similarity index 100%
rename from docs/exo-logo-transparent.png
rename to docs/imgs/exo-logo-transparent.png
diff --git a/docs/exo-rounded.png b/docs/imgs/exo-rounded.png
similarity index 100%
rename from docs/exo-rounded.png
rename to docs/imgs/exo-rounded.png
diff --git a/docs/exo-screenshot.jpg b/docs/imgs/exo-screenshot.jpg
similarity index 100%
rename from docs/exo-screenshot.jpg
rename to docs/imgs/exo-screenshot.jpg
diff --git a/docs/four-mac-studio-topology.png b/docs/imgs/four-mac-studio-topology.png
similarity index 100%
rename from docs/four-mac-studio-topology.png
rename to docs/imgs/four-mac-studio-topology.png
diff --git a/docs/macos-app-one-macbook.png b/docs/imgs/macos-app-one-macbook.png
similarity index 100%
rename from docs/macos-app-one-macbook.png
rename to docs/imgs/macos-app-one-macbook.png