# Contributing to EXO

Thank you for your interest in contributing to EXO!

## Getting Started

To run EXO from source:

**Prerequisites:**

- [uv](https://github.com/astral-sh/uv) (for Python dependency management)

  ```bash
  brew install uv
  ```

- [rust](https://github.com/rust-lang/rustup) (to build the Rust bindings; the nightly toolchain is required for now)

  ```bash
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
  rustup toolchain install nightly
  ```

- [macmon](https://github.com/vladkens/macmon) (for hardware monitoring on Apple Silicon)

  Use the fork revision pinned by this repo instead of the Homebrew `macmon` package:

  ```bash
  cargo install --git https://github.com/vladkens/macmon \
    --rev a1cd06b6cc0d5e61db24fd8832e74cd992097a7d \
    macmon \
    --force
  ```

Once the prerequisites are installed, clone the repository, build the dashboard, and run EXO:

```bash
git clone https://github.com/exo-explore/exo.git
cd exo/dashboard
npm install && npm run build && cd ..
uv run exo
```

## Development

EXO is built with a mix of Rust, Python, and TypeScript (Svelte for the dashboard), and the codebase is actively evolving. Before starting work:

- Pull the latest source to ensure you're working with the most recent code
- Keep your changes focused: implement one feature or fix per pull request
- Avoid combining unrelated changes, even if they seem small

This makes reviews faster and helps us maintain code quality as the project evolves.

## Code Style

Write pure functions where possible. When adding new code, prefer Rust unless there's a good reason otherwise. Leverage the type systems available to you: Rust's type system, Python type hints, and TypeScript types. Comments should explain why you're doing something, not what the code does, especially for non-obvious decisions.

Run `nix fmt` to auto-format your code before submitting.
## Model Cards

EXO uses TOML-based model cards to define model metadata and capabilities. Model cards are stored in:

- `resources/inference_model_cards/` for text generation models
- `resources/image_model_cards/` for image generation models
- `~/.exo/custom_model_cards/` for user-added custom models

### Adding a Model Card

To add a new model, create a TOML file with the following structure:

```toml
model_id = "mlx-community/Llama-3.2-1B-Instruct-4bit"
n_layers = 16
hidden_size = 2048
supports_tensor = true
tasks = ["TextGeneration"]
family = "llama"
quantization = "4bit"
base_model = "Llama 3.2 1B"
capabilities = ["text"]

[storage_size]
in_bytes = 729808896
```

### Required Fields

- `model_id`: Hugging Face model identifier
- `n_layers`: Number of transformer layers
- `hidden_size`: Hidden dimension size
- `supports_tensor`: Whether the model supports tensor parallelism
- `tasks`: List of supported tasks (`TextGeneration`, `TextToImage`, `ImageToImage`)
- `family`: Model family (e.g., "llama", "deepseek", "qwen")
- `quantization`: Quantization level (e.g., "4bit", "8bit", "bf16")
- `base_model`: Human-readable base model name
- `capabilities`: List of capabilities (e.g., `["text"]`, `["text", "thinking"]`)

### Optional Fields

- `components`: For multi-component models (like image models with separate text encoders and transformers)
- `uses_cfg`: Whether the model uses classifier-free guidance (for image models)
- `trust_remote_code`: Whether to allow remote code execution (defaults to `false` for security)

### Capabilities

The `capabilities` field defines what the model can do:

- `text`: Standard text generation
- `thinking`: Model supports chain-of-thought reasoning
- `thinking_toggle`: Thinking can be enabled/disabled via the `enable_thinking` parameter
- `image_edit`: Model supports image-to-image editing (e.g., FLUX.1-Kontext)

### Security Note

By default, `trust_remote_code` is set to `false` for security. Only enable it if the model explicitly requires remote code execution from the Hugging Face hub.

## API Adapters

EXO supports multiple API formats through an adapter pattern. Adapters convert API-specific request formats to the internal `TextGenerationTaskParams` format and convert internal token chunks back to API-specific responses.

### Adapter Architecture

All adapters live in `src/exo/master/adapters/` and follow the same pattern:

1. Convert API-specific requests to `TextGenerationTaskParams`
2. Handle both streaming and non-streaming response generation
3. Convert internal `TokenChunk` objects to API-specific formats
4. Manage error handling and edge cases

### Existing Adapters

- `chat_completions.py`: OpenAI Chat Completions API

- `claude.py`: Anthropic Claude Messages API
- `responses.py`: OpenAI Responses API
- `ollama.py`: Ollama API (for OpenWebUI compatibility)

### Adding a New API Adapter

To add support for a new API format:

1. Create a new adapter file in `src/exo/master/adapters/`

2. Implement a request conversion function:

   ```python
   def your_api_request_to_text_generation(
       request: YourAPIRequest,
   ) -> TextGenerationTaskParams:
       # Convert API request to internal format
       pass
   ```

3. Implement streaming response generation:

   ```python
   async def generate_your_api_stream(
       command_id: CommandId,
       chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
   ) -> AsyncGenerator[str, None]:
       # Convert internal chunks to API-specific streaming format
       pass
   ```

4. Implement non-streaming response collection:

   ```python
   async def collect_your_api_response(
       command_id: CommandId,
       chunk_stream: AsyncGenerator[TokenChunk | ErrorChunk | ToolCallChunk, None],
   ) -> AsyncGenerator[str, None]:
       # Collect all chunks and yield a single complete response
       pass
   ```

5. Register the adapter endpoints in `src/exo/master/api.py`
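
To make the streaming conversion in step 3 concrete, here is a self-contained toy sketch. `TokenChunk` below is a minimal stand-in dataclass with a single `text` field, and the `data:` line framing is modeled on OpenAI-style server-sent events; EXO's real chunk types and wire formats live in the source under `src/exo/master/adapters/`.

```python
import asyncio
import json
from collections.abc import AsyncGenerator
from dataclasses import dataclass

@dataclass
class TokenChunk:
    """Stand-in for EXO's internal token chunk type (hypothetical)."""
    text: str

async def generate_toy_stream(
    command_id: str,
    chunk_stream: AsyncGenerator[TokenChunk, None],
) -> AsyncGenerator[str, None]:
    # Each internal chunk becomes one SSE "data:" line in the API's wire format.
    async for chunk in chunk_stream:
        payload = {"id": command_id, "delta": chunk.text}
        yield f"data: {json.dumps(payload)}\n\n"
    # Signal end-of-stream the way OpenAI-compatible APIs do.
    yield "data: [DONE]\n\n"

async def main() -> None:
    async def chunks() -> AsyncGenerator[TokenChunk, None]:
        for text in ["Hel", "lo"]:
            yield TokenChunk(text=text)

    async for line in generate_toy_stream("cmd-1", chunks()):
        print(line, end="")

asyncio.run(main())
```

Other formats follow the same shape: only the payload construction inside the loop changes, which is why the internal types never need to leak past the adapter.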

The adapter pattern keeps API-specific logic isolated from core inference systems. Internal systems (worker, runner, event sourcing) only see `TextGenerationTaskParams` and `TokenChunk` objects; no API-specific types cross the adapter boundary.

For detailed API documentation, see [docs/api.md](docs/api.md).

## Testing

EXO relies heavily on manual testing at this point in the project, but this is evolving. Before submitting a change, test both before and after to demonstrate how your change improves behavior. Do the best you can with the hardware you have available; if you need help testing, ask and we'll do our best to assist. Add automated tests where possible, as we're actively working to substantially improve our automated testing story.

## Submitting Changes

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/your-feature`)
3. Commit your changes (`git commit -am 'Add some feature'`)
4. Push to the branch (`git push origin feature/your-feature`)
5. Open a Pull Request and follow the PR template

## Reporting Issues

If you find a bug or have a feature request, please open an issue on GitHub with:

- A clear description of the problem or feature
- Steps to reproduce (for bugs)
- Expected vs actual behavior
- Your environment (macOS version, hardware, etc.)

## Questions?

Join our community:

- [X](https://x.com/exolabs)