# Contributing to EXO

Thank you for your interest in contributing to EXO!

## Getting Started

To run EXO from source:

**Prerequisites:**

- [uv](https://github.com/astral-sh/uv) (for Python dependency management)
  ```bash
  brew install uv
  ```
- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings, nightly for now)
  ```bash
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  rustup toolchain install nightly
  ```
- [macmon](https://github.com/vladkens/macmon) (for hardware monitoring on Apple Silicon)
  Use the pinned fork revision used by this repo instead of Homebrew `macmon`.
  ```bash
  cargo install --git https://github.com/vladkens/macmon \
    --rev a1cd06b6cc0d5e61db24fd8832e74cd992097a7d \
    macmon \
    --force
  ```

```bash
git clone https://github.com/exo-explore/exo.git
cd exo/dashboard
npm install && npm run build && cd ..
uv run exo
```

## Development

EXO is built with a mix of Rust, Python, and TypeScript (Svelte for the dashboard), and the codebase is actively evolving. Before starting work:

- Pull the latest source to ensure you're working with the most recent code
- Keep your changes focused - implement one feature or fix per pull request
- Avoid combining unrelated changes, even if they seem small

This makes reviews faster and helps us maintain code quality as the project evolves.

## Code Style

Write pure functions where possible. When adding new code, prefer Rust unless there's a good reason otherwise. Leverage the type systems available to you - Rust's type system, Python type hints, and TypeScript types. Comments should explain why you're doing something, not what the code does - especially for non-obvious decisions.

Run `nix fmt` to auto-format your code before submitting.

## Model Cards

EXO uses TOML-based model cards to define model metadata and capabilities.
Model cards are stored in:

- `resources/inference_model_cards/` for text generation models
- `resources/image_model_cards/` for image generation models
- `~/.exo/custom_model_cards/` for user-added custom models

### Adding a Model Card

To add a new model, create a TOML file with the following structure:

```toml
model_id = "mlx-community/Llama-3.2-1B-Instruct-4bit"
n_layers = 16
hidden_size = 2048
supports_tensor = true
tasks = ["TextGeneration"]
family = "llama"
quantization = "4bit"
base_model = "Llama 3.2 1B"
capabilities = ["text"]

[storage_size]
in_bytes = 729808896
```

### Required Fields

- `model_id`: Hugging Face model identifier
- `n_layers`: Number of transformer layers
- `hidden_size`: Hidden dimension size
- `supports_tensor`: Whether the model supports tensor parallelism
- `tasks`: List of supported tasks (`TextGeneration`, `TextToImage`, `ImageToImage`)
- `family`: Model family (e.g., "llama", "deepseek", "qwen")
- `quantization`: Quantization level (e.g., "4bit", "8bit", "bf16")
- `base_model`: Human-readable base model name
- `capabilities`: List of capabilities (e.g., `["text"]`, `["text", "thinking"]`)

### Optional Fields

- `components`: For multi-component models (like image models with separate text encoders and transformers)
- `uses_cfg`: Whether the model uses classifier-free guidance (for image models)
- `trust_remote_code`: Whether to allow remote code execution (defaults to `false` for security)

### Capabilities

The `capabilities` field defines what the model can do:

- `text`: Standard text generation
- `thinking`: Model supports chain-of-thought reasoning
- `thinking_toggle`: Thinking can be enabled/disabled via `enable_thinking` parameter
- `image_edit`: Model supports image-to-image editing (FLUX.1-Kontext)

### Security Note

By default, `trust_remote_code` is set to `false` for security. Only enable it if the model explicitly requires remote code execution from the Hugging Face hub.

## API Adapters

EXO supports multiple API formats through an adapter pattern. Adapters convert API-specific request formats to the internal `TextGenerationTaskParams` format and convert internal token chunks back to API-specific responses.

### Adapter Architecture

All adapters live in `src/exo/master/adapters/` and follow the same pattern:

1. Convert API-specific requests to `TextGenerationTaskParams`
2. Handle both streaming and non-streaming response generation
3. Convert internal `TokenChunk` objects to API-specific formats
4. Manage error handling and edge cases

### Existing Adapters

- `chat_completions.py`: OpenAI Chat Completions API
- `claude.py`: Anthropic Claude Messages API
- `responses.py`: OpenAI Responses API
- `ollama.py`: Ollama API (for OpenWebUI compatibility)

### Adding a New API Adapter

To add support for a new API format:

1. Create a new adapter file in `src/exo/master/adapters/`

2. Implement a request conversion function:
   ```python
   def your_api_request_to_text_generation(
       request: YourAPIRequest,
   ) -> TextGenerationTaskParams:
       # Convert API request to internal format
       pass
   ```

3. Implement streaming response generation:
   ```python
   async def generate_your_api_stream(
       command_id: CommandId,
       chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
   ) -> AsyncGenerator[str, None]:
       # Convert internal chunks to API-specific streaming format
       pass
   ```

4. Implement non-streaming response collection:
   ```python
   async def collect_your_api_response(
       command_id: CommandId,
       chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
   ) -> AsyncGenerator[str]:
       # Collect all chunks and return single response
       pass
   ```

5. Register the adapter endpoints in `src/exo/master/api.py`

The adapter pattern keeps API-specific logic isolated from core inference systems. Internal systems (worker, runner, event sourcing) only see `TextGenerationTaskParams` and `TokenChunk` objects - no API-specific types cross the adapter boundary.
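
To make the adapter boundary concrete, here is a minimal, self-contained sketch of the three conversion steps for an imaginary "echo" API. Everything in it is an assumption for illustration: `SketchTaskParams` and `SketchTokenChunk` are simplified stand-ins for EXO's real `TextGenerationTaskParams` and `TokenChunk` (whose fields differ), and the request shape, the event format, and the `[DONE]` marker belong to the made-up API rather than to EXO.

```python
import asyncio
import json
from collections.abc import AsyncGenerator
from dataclasses import dataclass


# Hypothetical stand-ins; the real EXO types have different fields.
@dataclass
class SketchTaskParams:
    model: str
    prompt: str


@dataclass
class SketchTokenChunk:
    text: str


def echo_request_to_params(request: dict) -> SketchTaskParams:
    """Convert the API-specific request body into the internal format."""
    return SketchTaskParams(model=request["model"], prompt=request["input"])


async def generate_echo_stream(
    chunk_stream: AsyncGenerator[SketchTokenChunk, None],
) -> AsyncGenerator[str, None]:
    """Convert internal chunks into API-specific streaming events."""
    async for chunk in chunk_stream:
        payload = json.dumps({"delta": chunk.text})
        yield f"data: {payload}\n\n"
    yield "data: [DONE]\n\n"  # end marker of the imaginary API


async def collect_echo_response(
    chunk_stream: AsyncGenerator[SketchTokenChunk, None],
) -> str:
    """Collect all chunks into a single non-streaming response body."""
    parts = [chunk.text async for chunk in chunk_stream]
    return json.dumps({"output": "".join(parts)})


async def fake_chunks() -> AsyncGenerator[SketchTokenChunk, None]:
    """Stand-in for the chunk stream produced by the inference side."""
    for text in ("Hello", ", ", "world"):
        yield SketchTokenChunk(text)


async def main() -> None:
    print(echo_request_to_params({"model": "demo", "input": "hi"}))
    async for event in generate_echo_stream(fake_chunks()):
        print(event, end="")
    print(await collect_echo_response(fake_chunks()))


asyncio.run(main())
```

Only the three adapter functions know about the imaginary API's JSON shapes; the chunk producer deals purely in internal types, which is the isolation the adapter pattern is meant to provide.
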

For detailed API documentation, see [docs/api.md](docs/api.md).

## Testing

EXO relies heavily on manual testing at this point in the project, but this is evolving. Before submitting a change, test both before and after to demonstrate how your change improves behavior. Do the best you can with the hardware you have available - if you need help testing, ask and we'll do our best to assist. Add automated tests where possible - we're actively working to substantially improve our automated testing story.

## Submitting Changes

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/your-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin feature/your-feature`)
5. Open a Pull Request and follow the PR template

## Reporting Issues

If you find a bug or have a feature request, please open an issue on GitHub with:

- A clear description of the problem or feature
- Steps to reproduce (for bugs)
- Expected vs actual behavior
- Your environment (macOS version, hardware, etc.)

## Questions?

Join our community:

- [X](https://x.com/exolabs)