## Motivation
Replace confusing EXO_MODELS_DIR/EXO_MODELS_PATH with clearer
multi-directory support, enabling automatic download spillover across
volumes.
## Changes
- EXO_MODELS_DIRS: colon-separated writable dirs (default always
prepended, first with enough space wins)
- EXO_MODELS_READ_ONLY_DIRS: colon-separated read-only dirs (protected
from deletion)
- select_download_dir(): picks writable dir by free space
- resolve_existing_model(): unified lookup across all dirs
- is_read_only_model_dir(): path-based read-only detection instead of
hardcoded flag
- Updated coordinator, worker, model cards, tests
## Why It Works
Default dir always included so zero-config behavior is unchanged. Disk
space checked at download time for automatic spillover. Read-only status
derived from path, not hardcoded.
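Path-based read-only detection can be as simple as an ancestor check. A sketch under assumed names (uses `Path.is_relative_to`, Python ≥ 3.9):

```python
from pathlib import Path


def is_read_only_model_dir(model_path: Path, read_only_dirs: list[Path]) -> bool:
    """A model is protected from deletion iff it lives under any
    EXO_MODELS_READ_ONLY_DIRS root -- no hardcoded flag needed."""
    return any(model_path.is_relative_to(d) for d in read_only_dirs)
```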
## Test Plan
### Manual Testing
- No env vars set → identical behavior
- EXO_MODELS_DIRS=/Volumes/SSD/models → downloads to external storage
- EXO_MODELS_READ_ONLY_DIRS=/mnt/nfs → models found, deletion blocked
### Automated Testing
- 4 new tests in test_xdg_paths.py (prepend, default-only, overlap,
empty read-only)
- Existing tests updated to patch new constants
Updates macmon to an upstream fork that fixes M5 Max issues. We might
see if the upstream version gets merged before we release.
---------
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
## Motivation
Updated documentation for v1.0.68 release
## Changes
**docs/api.md:**
- Added documentation for new API endpoints: Claude Messages API
(`/v1/messages`), OpenAI Responses API (`/v1/responses`), and Ollama API
compatibility endpoints
- Documented custom model management endpoints (`POST /models/add`,
`DELETE /models/custom/{model_id}`)
- Added `enable_thinking` parameter documentation for thinking-capable
models (DeepSeek V3.1, Qwen3, GLM-4.7)
- Documented usage statistics in responses (prompt_tokens,
completion_tokens, total_tokens)
- Added streaming event format documentation for all API types
- Updated image generation section with FLUX.1-Kontext-dev support and
new dimensions (1024x1365, 1365x1024)
- Added request cancellation documentation
- Updated complete endpoint summary with all new endpoints
- Added security notes about trust_remote_code being opt-in
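As an illustration of the `enable_thinking` parameter, a Claude-style `/v1/messages` request body might look like the following. This is a hypothetical payload: the field shape follows Anthropic's Messages API, and the model name is an example.

```python
import json

# Hypothetical request body for the Claude-compatible /v1/messages endpoint.
payload = {
    "model": "deepseek-v3.1",  # example model name
    "max_tokens": 256,
    "enable_thinking": True,  # opt in to thinking output on capable models
    "messages": [{"role": "user", "content": "Why is the sky blue?"}],
}
body = json.dumps(payload)
```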
**README.md:**
- Updated Features section to highlight multiple API compatibility
options
- Added Environment Variables section documenting all configuration
options (EXO_MODELS_PATH, EXO_OFFLINE, EXO_ENABLE_IMAGE_MODELS,
EXO_LIBP2P_NAMESPACE, EXO_FAST_SYNCH, EXO_TRACING_ENABLED)
- Expanded "Using the API" section with examples for Claude Messages
API, OpenAI Responses API, and Ollama API
- Added custom model loading documentation with security notes
- Updated file locations to include log files and custom model cards
paths
**CONTRIBUTING.md:**
- Added documentation for TOML model cards format and the API adapter
pattern
**docs/architecture.md:**
- Documented the adapter architecture introduced in PR #1167

Closes #1653
---------
Co-authored-by: askmanu[bot] <192355599+askmanu[bot]@users.noreply.github.com>
Co-authored-by: Evan Quiney <evanev7@gmail.com>
## Motivation
There's no way to easily use the cancellation features we added! Also,
prefill can take ages so let's allow cancelling out of that.
## Changes
Wiring up our existing functionality to easily cancel during generation
(and adding stuff to do so during prefill)
## Test Plan
### Manual Testing
Tested it works during both prefill and decode.
### Automated Testing
This needs testing to see whether it causes a GPU timeout error on large
prefills of large models in pipeline parallel. However, from manually
testing a GLM 5 pipeline ring on 2 nodes, and from reading the code, it
does not seem like this will be the case.
## Motivation
There is an issue on Macs where an explicit synchronization is necessary
for memory to be updated from the L1 cache. Without it, GPU locks can
occur when a spin wait does not see the updated timestamp.
## Changes
Updated in my own personal fork.
## Why It Works
https://github.com/ARM-software/acle/releases
## Test Plan
### Manual Testing
Tested manually that no GPU locks occur (even with multiple simultaneous
instances running) and that the performance differential is negligible
(267 vs. 269 tps on Llama 3.2 1B at approximately 10k context).
------------------------------------------------------
I have seen a GPU lock, specifically when sending a particularly large
chat completion while the model was loading. However, I have since been
unable to reproduce and this may be something I did wrong. Please do
create an issue and tag me if any GPU locks do occur.
---------
Co-authored-by: Jake Hillion <jake@hillion.co.uk>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
## Motivation
Running RDMA from source is not well documented as-is; several
surprising things took time to debug internally, too.
The app should be updated to detect macOS versions in the future.
## Motivation
When `get_shard_download_status()` runs, it iterates over all models in
`MODEL_CARDS` and calls `build_full_shard()` → `build_base_shard()` →
`ModelCard.from_hf()`. This unconditionally tried to download
`config.json` from HuggingFace, but image models (FLUX, Qwen-Image)
don't have a root-level config.json file, causing errors:
```
Error downloading shard: File not found: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/config.json
Error downloading shard: File not found: https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/config.json
Error downloading shard: File not found: https://huggingface.co/Qwen/Qwen-Image/resolve/main/config.json
Error downloading shard: File not found: https://huggingface.co/Qwen/Qwen-Image-Edit-2509/resolve/main/config.json
```
## Changes
### ModelCard.load() fix
- `build_base_shard()` now uses `ModelCard.load()` instead of
`ModelCard.from_hf()`
- `ModelCard.load()` iterates through `MODEL_CARDS.values()` to find a
match by `model_id`
### exo-bench fixes
- Use `name` field instead of `id` for model resolution
- Pass `full_model_id` to `/instance/previews` endpoint
- Make model name matching case-insensitive
- Update README example model name
## Why It Works
`MODEL_CARDS` uses short names as keys (e.g., `"flux1-schnell"`) but the
`model_id` values are HuggingFace paths (e.g.,
`"black-forest-labs/FLUX.1-schnell"`). When `ModelCard.load()` was
called with the HF path, it didn't match any key and fell back to
`from_hf()` which tried to download config.json.
The fix iterates through `MODEL_CARDS.values()` to find a match by
`model_id`, ensuring predefined models (including image models) use
their registry entries directly without network calls. A key lookup is
unnecessary since `load()` is always called with HF paths which don't
match the short-name keys.
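The lookup change can be sketched with a toy registry. This is illustrative only: field and method names mirror the description above, not exo's exact code, and the `from_hf()` fallback is elided to a `None` return.

```python
from dataclasses import dataclass


@dataclass
class ModelCard:
    model_id: str  # HuggingFace path, e.g. "black-forest-labs/FLUX.1-schnell"

    @classmethod
    def load(cls, model_id: str) -> "ModelCard | None":
        # Match by model_id value rather than registry key, so predefined
        # models (including image models) resolve from the registry without
        # falling back to a network fetch of config.json.
        for card in MODEL_CARDS.values():
            if card.model_id == model_id:
                return card
        return None


# Registry keyed by short names, with values carrying HF paths.
MODEL_CARDS = {
    "flux1-schnell": ModelCard(model_id="black-forest-labs/FLUX.1-schnell"),
}
```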
## Test Plan
### Manual Testing
- Run exo and verify no more "Error downloading shard: File not found:
.../config.json" errors for image models
- Run exo-bench and verify model resolution works correctly
### Automated Testing
- `uv run basedpyright` - passes with 0 errors
- `uv run pytest` - all tests pass
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Updated README.md with documentation for four new features:
- Added a "Benchmarking" section documenting the exo-bench tool for
measuring model performance across different placement configurations
- Documented the custom namespace feature for cluster isolation in the
macOS app section
- Added a "Configuration Options" subsection explaining the `--no-worker`
CLI flag for coordinator-only nodes
- Added a "File Locations (Linux)" subsection documenting XDG Base
Directory Specification compliance on Linux systems
Issue #930
## Motivation
The README showcases exo's features and benchmarks but doesn't show what
the dashboard actually looks like. Adding a screenshot helps users
understand what they'll get when they run exo.
## Changes
- Added a dashboard screenshot at `docs/imgs/dashboard-cluster-view.png`:
shows the cluster topology view with 4 × 512GB M3 Ultra Mac Studios
running DeepSeek v3.1 (8-bit) and Kimi-K2-Thinking (4-bit)
- Added a new "Dashboard" section to README.md below Features,
displaying the screenshot with caption
## Why It Works
Visual documentation helps users understand what exo offers before they
install it. The screenshot demonstrates the cluster management
capabilities.
## Test Plan
### Manual Testing
- Verified image renders correctly in GitHub markdown preview
### Automated Testing
- N/A - documentation only change
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
https://github.com/exo-explore/exo/issues/1075
## Changes
- Added in-app "Uninstall" option under Advanced menu that cleanly
removes all system components
- Added NetworkSetupHelper.uninstall() to remove LaunchDaemon, scripts,
logs, and restore network settings
- Added LaunchAtLoginHelper.disable() to unregister from login items
- Created standalone uninstall-exo.sh script for users who already
deleted the app
- Added uninstall documentation to README
<img width="386" height="577" alt="image"
src="https://github.com/user-attachments/assets/6bbcd18a-992a-409d-8791-ed5e13bbcfe0"
/>
<img width="372" height="432" alt="image"
src="https://github.com/user-attachments/assets/ee76b45d-c111-4807-ab28-3f2f20e01140"
/>
## Why It Works
The in-app uninstaller runs a privileged shell script (via AppleScript)
to launchctl bootout the daemon, remove files, and restore the
"Automatic" network location. The standalone script provides the same
cleanup for users who already deleted the app.
## Test Plan
### Manual Testing
Hardware: MacBook Pro
- Built and ran app, verified LaunchDaemon and network location were
created
- Used in-app Uninstall, verified all components removed and network
restored to Automatic
- Rebuilt app, quit normally, ran sudo ./uninstall-exo.sh, verified same
cleanup
### Automated Testing
N/A
---------
Co-authored-by: Evan <evanev7@gmail.com>
## Motivation
Discord link expired.
## Changes
Replace discord invite link with permanent link.
## Why It Works
It's permanent now.
## Test Plan
Clicked the link. It works.
## Motivation
We didn't have instructions for enabling RDMA on macOS.
## Changes
I added instructions for enabling RDMA on macOS.
## Why It Works
Tried it on my M4 Max MacBook Pro and it works.
## Test Plan
### Manual Testing
Tried it on my M4 Max MacBook Pro and it works.
### Automated Testing
In the future, we could automate this from fresh macOS builds using KVM
over IP. See #1030
This PR updates the "Run from Source (Mac & Linux)" section in README.md
to clarify Linux instructions.
Changes include:
- Split the section into macOS and Linux subsections.
- Added native Linux package manager commands (apt, dnf, pacman) for
dependencies: uv, node, npm.
- Clarified that macmon is macOS-only.
- Noted that Homebrew on Linux is optional, with native package managers
preferred.
These changes improve clarity for Linux users and fix confusion from the
previous macOS-centric instructions.
## Motivation
README looks weird after the last update.
## Test Plan
### Manual Testing
I actually checked the file on GitHub this time.
## Motivation
Documentation will make contributing easier and communicate our
development philosophy and decision process. Closes #967
## Changes
Added `architecture.md` to docs/ and moved the images out of docs and
into their own docs/imgs/ folder
## Motivation
Addresses #974
```
INFO: pip is looking at multiple versions of exo to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement exo-pyo3-bindings (from exo) (from versions: none)
ERROR: No matching distribution found for exo-pyo3-bindings
```
## Changes
Describes the Rust dependency required for building from source.
## Test Plan
### Manual Testing
Tested locally; after this setup, exo runs without the
exo-pyo3-bindings error.
## Motivation
Users need to know what **prerequisites** they need in order to run exo.
Simple addition to docs prevents future raised issues.
## Changes
Updated `README.md`:
- Added installation instructions for
**[uv](https://github.com/astral-sh/uv)** and
**[macmon](https://github.com/vladkens/macmon)**.
Updated `CONTRIBUTING.md`:
- Added a step to verify these prerequisites are met before starting
development.
- Standardized on brew installation instructions for macOS users to keep
the guide simple.
## Why It Works
By listing these prerequisites upfront, users will set up their
environment correctly before attempting to run exo.
## Test Plan
### Manual Testing
Hardware: MacBook Pro M4
- Verified that `uv` and `macmon` were missing initially, causing
failures
- After installing them via brew (as documented), `uv run exo` starts
successfully
### Automated Testing
---------
Co-authored-by: Evan Quiney <evanev7@gmail.com>
## Motivation
Made a mistake on the merge of the last PR.