## Motivation

CI failures can be avoided by running checks locally before committing. This adds clear documentation to AGENTS.md so that AI agents (and humans) know exactly which checks must pass before pushing code.

## Changes

Added a new "Pre-Commit Checks (REQUIRED)" section to AGENTS.md that:

- Lists all 4 required checks (basedpyright, ruff, nix fmt, pytest)
- Provides a one-liner to run all checks in sequence
- Notes that `nix fmt` changes must be staged before committing
- Explains that CI runs `nix flake check` which verifies everything

## Why It Works

Clear documentation prevents CI failures by ensuring contributors run checks locally first. The one-liner command makes it easy to run all checks before committing.

## Test Plan

### Manual Testing

- Verified the documented commands work correctly

### Automated Testing

- N/A - documentation only change

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
AGENTS.md
This file provides guidance to AI coding agents when working with code in this repository.
Project Overview
exo is a distributed AI inference system that connects multiple devices into a cluster. It enables running large language models across multiple machines using MLX as the inference backend and libp2p for peer-to-peer networking.
Build & Run Commands
# Build the dashboard (required before running exo)
cd dashboard && npm install && npm run build && cd ..
# Run exo (starts both master and worker with API at http://localhost:52415)
uv run exo
# Run with verbose logging
uv run exo -v # or -vv for more verbose
# Run tests (excludes slow tests by default)
uv run pytest
# Run all tests including slow tests
uv run pytest -m ""
# Run a specific test file
uv run pytest src/exo/shared/tests/test_election.py
# Run a specific test function
uv run pytest src/exo/shared/tests/test_election.py::test_function_name
# Type checking (strict mode)
uv run basedpyright
# Linting
uv run ruff check
# Format code (using nix)
nix fmt
Pre-Commit Checks (REQUIRED)
IMPORTANT: Always run these checks before committing code. CI will fail if these don't pass.
# 1. Type checking - MUST pass with 0 errors
uv run basedpyright
# 2. Linting - MUST pass
uv run ruff check
# 3. Formatting - MUST be applied
nix fmt
# 4. Tests - MUST pass
uv run pytest
Run all checks in sequence:
uv run basedpyright && uv run ruff check && nix fmt && uv run pytest
If nix fmt changes any files, stage them before committing. CI runs nix flake check, which verifies formatting and linting and runs the Rust tests.
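The one-liner above can also be scripted. The following is a minimal sketch and not part of the repository: the names PRE_COMMIT_CHECKS, run_command, and run_checks are hypothetical, and only the four commands themselves come from this file. Like the && chain, it stops at the first failing check.

```python
import subprocess
from collections.abc import Callable, Sequence

# The four commands listed above, in the documented order.
PRE_COMMIT_CHECKS: tuple[tuple[str, ...], ...] = (
    ("uv", "run", "basedpyright"),
    ("uv", "run", "ruff", "check"),
    ("nix", "fmt"),
    ("uv", "run", "pytest"),
)


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command), check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = PRE_COMMIT_CHECKS,
    runner: Callable[[Sequence[str]], int] = run_command,
) -> int:
    """Run the checks in order, stopping at the first failure (like `&&`)."""
    for command in checks:
        exit_code = runner(command)
        if exit_code != 0:
            print(f"FAILED: {' '.join(command)} (exit code {exit_code})")
            return exit_code
    print("All pre-commit checks passed.")
    return 0
```

Calling run_checks() with no arguments runs the real commands. The runner parameter exists only so the sequencing logic can be exercised without invoking uv or nix.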
Architecture
Node Composition
A single exo Node (src/exo/main.py) runs multiple components:
- Router: libp2p-based pub/sub messaging via Rust bindings (exo_pyo3_bindings)
- Worker: Handles inference tasks, downloads models, manages runner processes
- Master: Coordinates cluster state, places model instances across nodes
- Election: Bully algorithm for master election
- API: FastAPI server for OpenAI-compatible chat completions
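As a rough illustration of the Bully algorithm named in the Election bullet: a node that starts an election challenges every node with a higher ID. If none responds, it becomes master. Otherwise a responder takes over and repeats the process, so the highest-ID reachable node wins. The sketch below simulates this in memory. It assumes integer node IDs and a known set of alive nodes, both of which are illustrative assumptions, and it is not exo's election code.

```python
from typing import NewType

# Hypothetical identifier type; the real node IDs may not be integers.
NodeId = NewType("NodeId", int)


def run_bully_election(initiator: NodeId, alive_nodes: frozenset[NodeId]) -> NodeId:
    """Return the elected master when `initiator` starts an election.

    The current candidate challenges every alive node with a higher ID.
    If nobody responds, the candidate wins. Otherwise a responder takes
    over the election and the process repeats.
    """
    if initiator not in alive_nodes:
        raise ValueError("the initiating node must be alive")
    candidate = initiator
    while True:
        higher_responders = [node for node in alive_nodes if node > candidate]
        if not higher_responders:
            return candidate
        # Any responder may take over; choosing the lowest shows the hand-off chain.
        candidate = min(higher_responders)
```

For example, with alive nodes 1, 3, and 7, an election started by node 1 hands off to node 3, then to node 7, which finds no higher node and becomes master.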
Message Flow
Components communicate via typed pub/sub topics (src/exo/routing/topics.py):
- GLOBAL_EVENTS: Master broadcasts indexed events to all workers
- LOCAL_EVENTS: Workers send events to master for indexing
- COMMANDS: Workers/API send commands to master
- ELECTION_MESSAGES: Election protocol messages
- CONNECTION_MESSAGES: libp2p connection updates
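To illustrate what a typed topic can look like, here is a minimal single-process sketch. It pairs each topic name with the one message type it carries and rejects anything else at publish time. TypedTopic, InMemoryRouter, and the ElectionMessage payload are hypothetical names for this example. The real definitions in src/exo/routing/topics.py and the libp2p-backed router may be structured differently.

```python
from collections import defaultdict
from collections.abc import Callable
from dataclasses import dataclass
from typing import Any, Generic, TypeVar

MessageT = TypeVar("MessageT")


@dataclass(frozen=True)
class TypedTopic(Generic[MessageT]):
    """A topic name paired with the only message type it carries."""

    name: str
    message_type: type[MessageT]


@dataclass(frozen=True)
class ElectionMessage:
    """Hypothetical payload; the real message types live in the repository."""

    sender: str


# Hypothetical topic definition mirroring the ELECTION_MESSAGES name above.
ELECTION_MESSAGES = TypedTopic("election_messages", ElectionMessage)


class InMemoryRouter:
    """Single-process stand-in for the libp2p-backed router."""

    def __init__(self) -> None:
        self._subscribers: defaultdict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: TypedTopic[MessageT], handler: Callable[[MessageT], None]) -> None:
        self._subscribers[topic.name].append(handler)

    def publish(self, topic: TypedTopic[MessageT], message: MessageT) -> None:
        # Reject payloads of the wrong type at the boundary.
        if not isinstance(message, topic.message_type):
            raise TypeError(f"{topic.name} only carries {topic.message_type.__name__}")
        for handler in self._subscribers[topic.name]:
            handler(message)
```

Binding the message type to the topic lets a strict type-checker verify that publishers and subscribers agree on the payload, which fits the strict typing rules listed under Code Style Requirements.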
Event Sourcing
The system uses event sourcing for state management:
- State (src/exo/shared/types/state.py): Immutable state object
- apply() (src/exo/shared/apply.py): Pure function that applies events to state
- Master indexes events and broadcasts; workers apply indexed events
Key Type Hierarchy
- src/exo/shared/types/: Pydantic models for all shared types
  - events.py: Event types (discriminated union)
  - commands.py: Command types
  - tasks.py: Task types for worker execution
  - state.py: Cluster state model
Rust Components
Rust code in rust/ provides:
- networking: libp2p networking (gossipsub, peer discovery)
- exo_pyo3_bindings: PyO3 bindings exposing Rust to Python
- system_custodian: System-level operations
Dashboard
Svelte 5 + TypeScript frontend in dashboard/. Build output goes to dashboard/build/ and is served by the API.
Code Style Requirements
From .cursorrules:
- Strict, exhaustive typing - never bypass the type-checker
- Use Literal[...] for enum-like sets, typing.NewType for primitives
- Pydantic models with frozen=True and strict=True
- Pure functions with injectable effect handlers for side-effects
- Descriptive names - no abbreviations or 3-letter acronyms
- Catch exceptions only where you can handle them meaningfully
- Use @final and immutability wherever applicable
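A short sketch of how several of these rules can combine in one place. All names here (ModelIdentifier, DownloadStatus, DownloadProgress, mark_complete) are hypothetical. To keep the example dependency-free, a stdlib frozen dataclass stands in for a Pydantic model configured with frozen=True and strict=True, so it shows immutability but not Pydantic's strict validation.

```python
from collections.abc import Callable
from dataclasses import dataclass
from typing import Literal, NewType, final

# NewType gives a primitive a distinct, checkable identity.
ModelIdentifier = NewType("ModelIdentifier", str)

# Literal[...] models a closed, enum-like set of values.
DownloadStatus = Literal["pending", "downloading", "complete"]


@final
@dataclass(frozen=True)
class DownloadProgress:
    """Immutable record; a stand-in for a frozen, strict Pydantic model."""

    model_identifier: ModelIdentifier
    status: DownloadStatus


def mark_complete(
    progress: DownloadProgress,
    notify: Callable[[str], None],
) -> DownloadProgress:
    """Return an updated copy; the only side effect goes through `notify`."""
    notify(f"{progress.model_identifier} finished downloading")
    return DownloadProgress(progress.model_identifier, "complete")
```

Passing notify in as a parameter keeps the function's own logic deterministic and lets a test substitute a list's append method for a real logger or message publisher.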
Testing
Tests use pytest-asyncio with asyncio_mode = "auto". Tests are in tests/ subdirectories alongside the code they test. The EXO_TESTS=1 env var is set during tests.
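A minimal sketch of what a test can look like under this setup. With asyncio_mode = "auto", pytest-asyncio runs plain async def test functions without a per-test marker. The fetch_answer coroutine and the running_under_tests helper are hypothetical and exist only for this example.

```python
import asyncio
import os


async def fetch_answer() -> int:
    """Hypothetical coroutine under test."""
    await asyncio.sleep(0)
    return 42


def running_under_tests() -> bool:
    """Hypothetical helper that reads the EXO_TESTS flag described above."""
    return os.environ.get("EXO_TESTS") == "1"


async def test_fetch_answer_returns_expected_value() -> None:
    # No @pytest.mark.asyncio marker is needed in auto mode.
    assert await fetch_answer() == 42
```

Following the layout described above, a file like this would sit in a tests/ subdirectory next to the module it covers.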