Update staging 14

Co-authored-by: Evan <evanev7@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: David Munha Canas Correia <dmunha@MacBook-David.local>
Co-authored-by: github-actions bot <github-actions@users.noreply.github.com>
Author: rltakashige
Committed: 2025-11-04 17:44:24 -08:00 (committed by GitHub)
Parent: 3b409647ba
Commit: 16f724e24c
69 changed files with 5527 additions and 2484 deletions

.gitattributes vendored (1 change)

@@ -1 +0,0 @@
worker/utils/macmon/bin/macmon filter=lfs diff=lfs merge=lfs -text

.github/benchmark-dashboard/README.md vendored Normal file (159 changes)

@@ -0,0 +1,159 @@
# EXO Benchmark Dashboard
A fully self-contained, browser-based dashboard for tracking EXO benchmark performance over time.
## Features
- 📊 **Success Rate Tracking**: Monitor cluster reliability across commits
- ⏱️ **Response Time Analysis**: Track average request completion times
- 🎯 **Throughput Metrics**: Tokens per second visualization
- 📈 **Request Distribution**: Success/failure breakdown over time
- 🔄 **Auto-Refresh**: Updates every 60 seconds
- 📺 **TV-Ready**: Large, clear visualizations perfect for display
- 🔐 **Secure**: Credentials stored in browser localStorage only
- 🌐 **No Backend**: Directly accesses S3 from the browser
## Quick Start
### Option 1: Direct File Access (Simplest)
Just open the HTML file directly in your browser:
```bash
open .github/benchmark-dashboard/index.html
```
Then click "Configure AWS Credentials" and enter your keys.
### Option 2: URL Parameters (For Quick Setup)
```bash
# Open with credentials in the URL (they'll be moved to localStorage)
open ".github/benchmark-dashboard/index.html?accessKey=YOUR_KEY&secretKey=YOUR_SECRET&region=us-east-1"
```
The credentials will be saved to localStorage and removed from the URL immediately.
### Option 3: Simple HTTP Server
```bash
# From repo root
python3 -m http.server 8080
# Then open: http://localhost:8080/.github/benchmark-dashboard/
```
## AWS Credentials
The dashboard needs read-only access to the `exo-benchmark-results` S3 bucket.
### Required IAM Permissions
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::exo-benchmark-results",
        "arn:aws:s3:::exo-benchmark-results/*"
      ]
    }
  ]
}
```
### Security Notes
- ✅ Credentials stored in browser `localStorage` only
- ✅ Never sent to any server (except AWS)
- ✅ All S3 access happens client-side
- ✅ Use read-only IAM credentials
- ⚠️ Don't commit credentials to git
- ⚠️ Use a dedicated read-only IAM user
## TV/Kiosk Mode
For permanent display on a TV:
### macOS
```bash
open -a "Google Chrome" --args --kiosk ".github/benchmark-dashboard/index.html"
```
### Linux
```bash
chromium-browser --kiosk --app="file://$(pwd)/.github/benchmark-dashboard/index.html"
```
### Auto-start on Boot
Create a simple startup script:
```bash
#!/bin/bash
# /usr/local/bin/start-benchmark-dashboard.sh
cd /path/to/exo
python3 -m http.server 8080 &
sleep 2
chromium-browser --kiosk http://localhost:8080/.github/benchmark-dashboard/
```
## Data Displayed
### Summary Cards
- **Latest Success Rate**: Most recent benchmark success percentage with trend
- **Avg Response Time**: Latest average response time in ms with trend
- **Total Benchmarks**: Count of all benchmarks run
- **Active Configurations**: Number of unique benchmark configs
### Charts
1. **Success Rate Over Time**: Line chart showing reliability trends
2. **Average Response Time**: Performance over time (lower is better)
3. **Throughput**: Tokens/second metric (higher is better)
4. **Request Distribution**: Stacked bar chart of successes/failures
## How It Works
1. **Loads AWS SDK**: Uses AWS SDK for JavaScript (browser version)
2. **Lists S3 Objects**: Fetches all files from `s3://exo-benchmark-results/bench/`
3. **Downloads Results**: Fetches each JSON result file
4. **Parses & Visualizes**: Uses Chart.js to create interactive charts
5. **Auto-Refreshes**: Polls S3 every 60 seconds for new results
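For debugging outside the browser, the same data flow can be reproduced with a short script. The sketch below is an illustrative equivalent using Python and boto3, not code from the dashboard itself (which uses the browser AWS SDK):
```python
# Mirror the dashboard's data flow: list s3://exo-benchmark-results/bench/,
# download each JSON result file, and parse it. Illustrative sketch.
import json
import boto3

s3 = boto3.client("s3", region_name="us-east-1")
results = []
for page in s3.get_paginator("list_objects_v2").paginate(
    Bucket="exo-benchmark-results", Prefix="bench/"
):
    for obj in page.get("Contents", []):
        body = s3.get_object(Bucket="exo-benchmark-results", Key=obj["Key"])["Body"]
        results.append(json.loads(body.read()))

print(f"Loaded {len(results)} benchmark result files")
```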
## Customization
To modify the dashboard:
1. Edit `index.html`
2. Adjust `REFRESH_INTERVAL` for different polling frequency
3. Modify chart colors/styles in the Chart.js configuration
4. Add new metrics by extending the results parsing
## Troubleshooting
**"AWS credentials not configured"**
- Click "Configure AWS Credentials" and enter your keys
**"Error loading benchmark data"**
- Check AWS credentials are correct
- Verify S3 bucket name is `exo-benchmark-results`
- Ensure IAM user has read permissions
- Check browser console for detailed errors
**"No benchmark results found"**
- Wait for benchmark workflows to run
- Verify results are being uploaded to S3
- Check S3 bucket has files in `bench/` prefix
**Charts not updating**
- Check browser console for errors
- Verify network connectivity to S3
- Try refreshing the page manually

.github/benchmark-dashboard/index.html vendored Normal file (1601 changes)

File diff suppressed because it is too large.

.github/configs/README.md vendored Normal file (186 changes)

@@ -0,0 +1,186 @@
# EXO Benchmark Configurations
This directory contains configuration files for the EXO staged benchmark system.
## Overview
The staged benchmark system allows you to run complex, multi-stage load tests against EXO clusters. Each stage can have different characteristics:
- **Prompt Length**: Number of tokens in the input prompt
- **Generation Length**: Maximum tokens to generate in the response
- **Time Between Requests**: Delay (in seconds) between firing consecutive requests
- **Iterations**: Number of requests to send in this stage
Requests are **fire-and-forget** - they don't wait for the previous request to complete. This allows you to test overlapping request handling and measure success rates under load.
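The core of this pattern, as a minimal asyncio sketch (the helper names and arguments here are illustrative, not the actual `bench.py` API):
```python
# Fire-and-forget stage runner: schedule each request as a task without
# awaiting it, sleep between launches, then gather everything at the end.
import asyncio

async def fire_request(send, payload, outcomes):
    try:
        await send(payload)            # one chat-completion request
        outcomes.append("success")
    except Exception:
        outcomes.append("failure")     # failures are counted, not retried

async def run_stage(send, payload, iterations, time_between_requests):
    outcomes, tasks = [], []
    for _ in range(iterations):
        tasks.append(asyncio.create_task(fire_request(send, payload, outcomes)))
        await asyncio.sleep(time_between_requests)
    await asyncio.gather(*tasks)       # wait for in-flight requests before reporting
    return outcomes
```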
## Configuration Files
### `bench_simple.yaml`
A minimal configuration that replicates the behavior of the original `bench.py` script:
- Single stage with 1 iteration
- Short prompt (~20 tokens)
- Generates up to 100 tokens
This is useful for quick smoke tests.
### `bench_config.yaml`
A comprehensive multi-stage benchmark with:
1. **Warmup** (10 requests): Light load with short prompts
2. **Medium Load** (20 requests): Moderate load with medium prompts
3. **Stress Test** (30 requests): Heavy overlapping requests with long prompts
4. **Cooldown** (5 requests): Light load to wind down
This tests the cluster's behavior under varying load patterns.
## Configuration Schema
```yaml
# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  M3ULTRA_GPU80_512GB: 4

# Environment variables to set on each node (optional)
environment:
  OVERRIDE_MEMORY_MB: 512

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 600

# Model instances to run concurrently
model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

# Benchmark stages
stages:
  - name: "stage_name"            # Human-readable name for this stage
    prompt_length: 100            # Target prompt length in tokens
    generation_length: 200        # Max tokens to generate
    time_between_requests: 2.0    # Seconds between firing requests
    iterations: 10                # Number of requests in this stage
```
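A config in this shape can be loaded and validated in a few lines. The sketch below assumes pydantic v2 (the bench workflow installs pydantic and pyyaml); the field names are taken from the schema above, but this is not the actual `bench.py` model code:
```python
# Illustrative loader for the schema above.
import yaml
from pydantic import BaseModel

class Stage(BaseModel):
    name: str
    prompt_length: int
    generation_length: int
    time_between_requests: float
    iterations: int

class BenchConfig(BaseModel):
    hardware_plan: dict[str, int]
    environment: dict[str, str | int] = {}  # YAML may parse numeric values as ints
    timeout_seconds: int = 600
    model_ids: list[str]
    stages: list[Stage]

with open(".github/configs/bench_simple.yaml") as f:
    config = BenchConfig.model_validate(yaml.safe_load(f))
print(config.stages[0].name)
```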
## Running Benchmarks
### Via GitHub Actions
**Automatic (every commit):**
- The **`bench`** workflow runs automatically on every push
- Uses `bench_simple.yaml` as the default configuration
- All settings (hardware plan, timeout, environment variables, models, stages) are defined in the config file
**Manual (on-demand):**
1. Go to **Actions** → **bench** workflow
2. Click **Run workflow**
3. Configure:
- **Config File**: Path to your YAML config (default: `.github/configs/bench_simple.yaml`)
- `.github/configs/bench_simple.yaml` for quick tests
- `.github/configs/bench_config.yaml` for complex multi-stage tests
All other settings (hardware plan, timeout, environment variables, models, stages) are read from the specified config file.
### Via Command Line
```bash
# Start EXO on localhost:8000
uv run exo --api-port 8000
# Run simple benchmark (1 stage, 1 iteration)
python3 .github/scripts/bench.py \
--api-port 8000 \
--config .github/configs/bench_simple.yaml \
--expected-nodes 1 \
--is-primary true \
--timeout-seconds 600
# Run complex staged benchmark (4 stages, multiple iterations)
python3 .github/scripts/bench.py \
--api-port 8000 \
--config .github/configs/bench_config.yaml \
--expected-nodes 1 \
--is-primary true \
--timeout-seconds 600
```
## Output Metrics
For each stage, the benchmark reports:
- **Total Requests**: Number of requests fired
- **Successful Requests**: Requests that completed successfully
- **Failed Requests**: Requests that encountered errors
- **Success Rate**: Percentage of successful requests
- **Total Tokens**: Sum of all tokens generated across successful requests
- **Avg Tokens/Request**: Average tokens per successful request
- **Avg Time/Request**: Average completion time per successful request
A JSON summary is also printed for easy parsing and storage.
## Creating Custom Benchmarks
To create a custom benchmark:
1. Copy an existing config file (e.g., `bench_config.yaml`)
2. Modify the stages to match your test scenario
3. Save it in this directory with a descriptive name
4. Run it using the workflow or command line
### Example: Sustained Load Test
```yaml
hardware_plan:
  M3ULTRA_GPU80_512GB: 2

environment:
  OVERRIDE_MEMORY_MB: 1024

timeout_seconds: 600

model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

stages:
  - name: "sustained_load"
    prompt_length: 200
    generation_length: 150
    time_between_requests: 0.5  # Very fast - 2 requests/second
    iterations: 100             # Run for ~50 seconds
```
### Example: Varying Prompt Sizes
```yaml
hardware_plan:
  M4PRO_GPU16_24GB: 3

timeout_seconds: 900

model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

stages:
  - name: "tiny_prompts"
    prompt_length: 10
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10

  - name: "medium_prompts"
    prompt_length: 200
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10

  - name: "large_prompts"
    prompt_length: 1000
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10
```
## Tips
- **Overlapping Requests**: Set `time_between_requests` < expected completion time to test concurrent request handling (see the estimate after this list)
- **Sequential Requests**: Set `time_between_requests` > expected completion time to ensure requests don't overlap
- **Realistic Load**: Model real usage patterns by varying prompt/generation lengths across stages
- **Success Rate**: A 100% success rate indicates the cluster handled the load well; lower rates suggest capacity limits
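As a back-of-envelope check on overlap, the steady-state number of in-flight requests is roughly the expected per-request completion time divided by `time_between_requests`. The numbers below are illustrative:
```python
# Rough concurrency estimate for sizing a stage.
expected_completion_s = 4.0    # measured or estimated per-request latency
time_between_requests = 1.0    # from the stage config
in_flight = expected_completion_s / time_between_requests
print(f"~{in_flight:.0f} requests overlapping at steady state")  # ~4
```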

.github/configs/bench_config.yaml vendored Normal file (49 changes)

@@ -0,0 +1,49 @@
# EXO Staged Benchmark Configuration
# This configuration defines a multi-stage load test for EXO clusters
# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  M3ULTRA_GPU80_512GB: 4

# Environment variables to set on each node (optional)
environment:
  OVERRIDE_MEMORY_MB: 512

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 600

# Multiple instances run concurrently on the cluster
model_ids:
  - "mlx-community/Qwen3-0.6B-4bit"
  - "mlx-community/Qwen3-0.6B-4bit"

# Stages run sequentially, each with its own characteristics
stages:
  # Stage 1: Light load with short prompts
  - name: "warmup"
    prompt_length: 50           # Number of tokens in prompt
    generation_length: 100      # Max tokens to generate
    time_between_requests: 5.0  # Seconds between firing requests
    iterations: 10              # Number of requests to send in this stage

  # Stage 2: Medium load with medium prompts
  - name: "medium_load"
    prompt_length: 200
    generation_length: 150
    time_between_requests: 3.0
    iterations: 20

  # Stage 3: Heavy load with long prompts - requests will overlap
  - name: "stress_test"
    prompt_length: 500
    generation_length: 200
    time_between_requests: 1.0  # Fast firing - will definitely overlap
    iterations: 30

  # Stage 4: Cool down with simple prompts
  - name: "cooldown"
    prompt_length: 50
    generation_length: 50
    time_between_requests: 10.0
    iterations: 5

.github/configs/bench_simple.yaml vendored Normal file (36 changes)

@@ -0,0 +1,36 @@
# Simple single-shot benchmark
# Tests 2 instances concurrently on 2 nodes
# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  puffin4: 1
  puffin8: 1

# Environment variables to set on each node
environment:
  PLACEHOLDER: "placeholder"
  # OVERRIDE_MEMORY_MB: 30000
  # MLX_METAL_FAST_SYNCH: 1

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 900

# Model instances to run concurrently
model_ids:
  - "mlx-community/DeepSeek-V3.1-8bit"
  # - "mlx-community/Qwen3-235B-A22B-4bit"
  # - "mlx-community/Llama-3.3-70B-Instruct-4bit"

# Placement strategy: "tensor", "pipeline", "tensor_rdma", "pipeline_rdma", or "auto"
strategy: "tensor_rdma"

# If true, run requests sequentially (no overlap); if false, fire-and-forget (default: false)
no_overlap: true

# Benchmark stages
stages:
  - name: "simple"
    prompt_length: 512
    generation_length: 10
    time_between_requests: 2.0
    iterations: 10

.github/scripts/bench.py vendored Normal file (1190 changes)

File diff suppressed because it is too large.

.github/scripts/build_matrix.py vendored Normal file (68 changes)

@@ -0,0 +1,68 @@
#!/usr/bin/env python3
import json
import os
from typing import NotRequired, TypedDict, cast

import yaml


class MatrixEntry(TypedDict):
    label: str
    index: int


class MatrixInclude(TypedDict):
    label: str
    index: int
    is_primary: bool
    expected_nodes: int


class Config(TypedDict):
    hardware_plan: dict[str, int]
    timeout_seconds: NotRequired[int]
    environment: NotRequired[dict[str, str]]


# Read the config file
config_file: str = os.environ['CONFIG_FILE']
with open(config_file, 'r') as f:
    config: Config = cast(Config, yaml.safe_load(f))

# Extract hardware plan from config
plan: dict[str, int] = config['hardware_plan']
if not plan:
    raise ValueError(f"No hardware_plan found in {config_file}")

# Build matrix entries
entries: list[MatrixEntry] = []
for label, count in plan.items():
    for idx in range(count):
        entries.append({"label": label, "index": idx})

total_nodes: int = len(entries)
matrix: dict[str, list[MatrixInclude]] = {"include": [
    {
        "label": e["label"],
        "index": e["index"],
        "is_primary": (i == 0),
        "expected_nodes": total_nodes
    }
    for i, e in enumerate(entries)
]}

# Extract other config values
timeout_seconds: int = config.get('timeout_seconds', 600)
environment: dict[str, str] = config.get('environment', {})

# Output to GitHub Actions
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
    f.write(f"matrix={json.dumps(matrix)}\n")
    f.write(f"config_file={config_file}\n")
    f.write(f"timeout_seconds={timeout_seconds}\n")
    f.write(f"environment={json.dumps(environment)}\n")

print(f"Matrix: {json.dumps(matrix)}")
print(f"Config file: {config_file}")
print(f"Timeout: {timeout_seconds}")
print(f"Environment: {json.dumps(environment)}")

.github/workflows/BENCH_USAGE.md vendored Normal file (156 changes)

@@ -0,0 +1,156 @@
# Benchmark Workflow Usage
## Overview
The `bench_matrix.yml` workflow enables distributed benchmarking of models across multiple self-hosted macOS runners with different hardware configurations.
## Workflow Inputs
| Input | Description | Default | Required |
|-------|-------------|---------|----------|
| `model_id` | Model ID to benchmark | `mlx-community/Llama-3.2-1B-Instruct-4bit` | Yes |
| `hardware_plan` | JSON mapping of runner labels to counts | `{"M4PRO_GPU16_24GB": 1}` | Yes |
| `prompt` | Benchmark prompt text | `What is the capital of France?` | No |
| `timeout_seconds` | Timeout for instance/runner readiness | `600` | No |
## Hardware Plan Format
The `hardware_plan` input is a JSON object mapping runner labels to the number of machines:
```json
{
  "M4PRO_GPU16_24GB": 2,
  "M3ULTRA_GPU80_512GB": 1
}
```
This example would:
- Start 2 runners with the `M4PRO_GPU16_24GB` label
- Start 1 runner with the `M3ULTRA_GPU80_512GB` label
- Total of 3 runners coordinating on a single distributed inference instance
## How It Works
1. **Planning Job** (`plan`)
- Runs on `ubuntu-latest`
- Parses the `hardware_plan` JSON
- Generates a dynamic matrix with one entry per runner
- Only the first runner (index 0) is marked as `is_primary`
2. **Benchmark Worker Jobs** (`bench_worker`)
- Each job runs on a self-hosted macOS runner with the specified label
- All runners start EXO in parallel
- The primary runner creates the model instance
- All runners wait until the expected number of runners are ready (Loaded/Running status)
- The primary runner executes the benchmark and prints results
- The primary runner deletes the instance
## Example Usage
### Single Machine Benchmark
```yaml
model_id: mlx-community/Llama-3.2-1B-Instruct-4bit
hardware_plan: '{"M4PRO_GPU16_24GB": 1}'
prompt: What is the capital of France?
timeout_seconds: 600
```
### Multi-Machine Distributed Benchmark
```yaml
model_id: mlx-community/Llama-3.2-3B-Instruct-4bit
hardware_plan: '{"M4PRO_GPU16_24GB": 2, "M3ULTRA_GPU80_512GB": 1}'
prompt: Explain quantum computing in simple terms.
timeout_seconds: 900
```
## Benchmark Output
The primary runner outputs a JSON object with benchmark results:
```json
{
  "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
  "instance_id": "abc-123-def",
  "tokens": 42,
  "elapsed_s": 2.451,
  "tps": 17.136
}
```
Where:
- `tokens`: Number of chunks/tokens generated
- `elapsed_s`: Total elapsed time in seconds
- `tps`: Tokens per second (tokens / elapsed_s)
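For the example output above: 42 tokens / 2.451 s ≈ 17.136 tokens per second.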
## Runner Requirements
Each self-hosted runner must:
- Be labeled with appropriate hardware tags (e.g., `M4PRO_GPU16_24GB`)
- Have the `self-hosted` and `macOS` labels
- Have Nix installed with flakes enabled
- Have network connectivity to other runners in the same job
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ GitHub Actions Workflow (bench_matrix.yml) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌────────────────┐ │
│ │ Plan Job │ │
│ │ (ubuntu) │──┬─► Matrix: [{label, index, primary}] │
│ └────────────────┘ │ │
│ │ │
│ ┌───────────────────▼──────────────────────────────────┐ │
│ │ Bench Worker Jobs (Matrix) │ │
│ ├──────────────────────────────────────────────────────┤ │
│ │ │ │
│ │ Runner 0 (Primary) Runner 1 Runner 2 │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌──────────┐ │ │
│ │ │ Start EXO │ │ Start EXO │ │ Start EXO│ │ │
│ │ │ Create Inst │ │ Wait... │ │ Wait... │ │ │
│ │ │ Wait Ready │ │ Wait Ready │ │ Wait... │ │ │
│ │ │ Run Bench │ │ (idle) │ │ (idle) │ │ │
│ │ │ Print TPS │ │ │ │ │ │ │
│ │ │ Delete Inst │ │ │ │ │ │ │
│ │ └─────────────┘ └─────────────┘ └──────────┘ │ │
│ └───────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
```
## Implementation Details
### `scripts/bench.py`
A standalone Python script that:
- Creates instance (primary only)
- Polls `/state` endpoint until instance and all runners are ready
- Executes chat completion with timing (primary only)
- Parses SSE stream and counts tokens
- Computes TPS metrics
- Cleans up instance (primary only)
### Key Functions
- `wait_for_instance()`: Polls until instance with model_id appears
- `wait_for_runners_ready()`: Polls until expected number of runners reach Loaded/Running status
- `run_benchmark()`: Executes chat completion, measures time, counts tokens
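A minimal sketch of the polling pattern these helpers share (the `/state` path comes from above; the helper signature and state fields are assumptions, not the exact `bench.py` implementation):
```python
# Generic poll-until-ready loop in the style of the wait_* helpers.
import json
import time
import urllib.request

def poll_state(api_port, ready, timeout_seconds=600, interval=2.0):
    """Poll /state until ready(state) returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"http://localhost:{api_port}/state") as resp:
                state = json.load(resp)
            if ready(state):
                return state
        except OSError:
            pass  # API not up yet; keep polling
        time.sleep(interval)
    raise TimeoutError("cluster did not become ready in time")

# e.g. a wait_for_instance-style check could be (hypothetical state shape):
# poll_state(port, lambda s: any(i.get("model_id") == model_id
#                                for i in s.get("instances", [])))
```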
## Troubleshooting
### Instance never becomes ready
- Check EXO logs in the workflow output
- Verify model_id is valid and accessible
- Increase `timeout_seconds`
### Runner mismatch
- Ensure hardware_plan counts match available labeled runners
- Check runner labels match exactly (case-sensitive)
### Network issues
- Verify runners can communicate on the network
- Check firewall rules between runner hosts

.github/workflows/bench.yml vendored Normal file (292 changes)

@@ -0,0 +1,292 @@
name: bench

on: [push]

jobs:
  plan:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.build.outputs.matrix }}
      config_file: ${{ steps.build.outputs.config_file }}
      timeout_seconds: ${{ steps.build.outputs.timeout_seconds }}
      environment: ${{ steps.build.outputs.environment }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Build matrix from config file
        id: build
        shell: bash
        run: |
          set -euo pipefail
          CONFIG_FILE='.github/configs/bench_simple.yaml'
          export CONFIG_FILE
          echo "Config file: $CONFIG_FILE"
          python3 .github/scripts/build_matrix.py

  bench_worker:
    needs: plan
    strategy:
      fail-fast: false
      matrix: ${{ fromJSON(needs.plan.outputs.matrix) }}
    name: "bench on ${{ matrix.label }} [${{ matrix.index }}]"
    runs-on: [self-hosted, macOS, "${{ matrix.label }}"]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: false
      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash
      # TODO: this is mega hacky and I'd like a simpler solution.
      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."
          # Check if nix is already available
          if command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
          # Try sourcing profile scripts to set up environment properly
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Sourcing multi-user nix-daemon profile script"
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
          elif [ -f "$HOME/.nix-profile/etc/profile.d/nix.sh" ]; then
            echo "Sourcing single-user nix profile script"
            source "$HOME/.nix-profile/etc/profile.d/nix.sh"
          elif [ -f /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh ]; then
            echo "Sourcing per-user nix profile script"
            source /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh
          elif [ -f /etc/profile.d/nix.sh ]; then
            echo "Sourcing system-wide nix profile script"
            source /etc/profile.d/nix.sh
          # Fallback: manually add nix to PATH if binary exists
          elif [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary, manually adding to PATH"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
          elif [ -f "$HOME/.nix-profile/bin/nix" ]; then
            echo "Found nix binary in user profile, manually adding to PATH"
            export PATH="$HOME/.nix-profile/bin:$PATH"
          else
            echo "Nix not found. Debugging info:"
            echo "USER: $USER"
            echo "HOME: $HOME"
            echo "Current PATH: $PATH"
            echo ""
            echo "Checking common Nix locations:"
            echo "  /nix/var/nix/profiles/default/bin/nix:"
            ls -la /nix/var/nix/profiles/default/bin/nix 2>/dev/null || echo "  Not found"
            echo "  /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh:"
            ls -la /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh 2>/dev/null || echo "  Not found"
            echo "  ~/.nix-profile/etc/profile.d/nix.sh:"
            ls -la "$HOME/.nix-profile/etc/profile.d/nix.sh" 2>/dev/null || echo "  Not found"
            echo "  /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh:"
            ls -la "/nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh" 2>/dev/null || echo "  Not found"
            echo ""
            echo "/nix directory structure:"
            ls -la /nix 2>/dev/null || echo "  /nix directory not found"
            echo ""
            echo "/nix/var:"
            ls -la /nix/var 2>/dev/null || echo "  /nix/var not found"
            echo ""
            echo "/nix/store:"
            ls -la /nix/store 2>/dev/null | head -20 || echo "  /nix/store not found"
            echo ""
            echo "GitHub Actions runner is running as user '$USER'."
            echo "If Nix is installed for a different user, either:"
            echo "  1. Install Nix for user '$USER' (multi-user install recommended)"
            echo "  2. Configure the runner service to run as the user with Nix installed"
            echo "  3. Ensure Nix is installed system-wide with proper daemon setup"
            exit 1
          fi
          # Verify nix is available and persist to GITHUB_ENV
          if command -v nix >/dev/null 2>&1; then
            echo "✓ Nix is available"
            nix --version
            echo "PATH=$PATH" >> $GITHUB_ENV
            if [ -n "$NIX_PATH" ]; then
              echo "NIX_PATH=$NIX_PATH" >> $GITHUB_ENV
            fi
          else
            echo "ERROR: Failed to set up Nix"
            echo "PATH after setup attempt: $PATH"
            exit 1
          fi
        shell: bash
      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-XXXXXXXX)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          EXO_MODELS_DIR="$HOME/.exo/models"
          EXO_LIBP2P_NAMESPACE="bench-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
          echo "EXO_HOME=$EXO_HOME" >> "$GITHUB_ENV"
          echo "API_PORT=$API_PORT" >> "$GITHUB_ENV"
          echo "EXO_MODELS_DIR=$EXO_MODELS_DIR" >> "$GITHUB_ENV"
          echo "EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE" >> "$GITHUB_ENV"
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Using models from: $EXO_MODELS_DIR"
          echo "Using libp2p namespace: $EXO_LIBP2P_NAMESPACE"
        shell: bash
      - name: Configure local MLX if available
        run: |
          RUNNER_LABELS='${{ toJSON(runner.labels) }}'
          if echo "$RUNNER_LABELS" | grep -q "local_mlx"; then
            echo "Runner has 'local_mlx' tag, configuring local MLX paths..."
            MODIFIED=false
            if [ -d "/Users/Shared/mlx" ]; then
              echo "Found /Users/Shared/mlx, enabling local mlx path in pyproject.toml"
              sed -i.bak 's|^# mlx = { path = "/Users/Shared/mlx", editable=true }$|mlx = { path = "/Users/Shared/mlx", editable=true }|' pyproject.toml
              MODIFIED=true
            fi
            if [ -d "/Users/Shared/mlx-lm" ]; then
              echo "Found /Users/Shared/mlx-lm, enabling local mlx-lm path in pyproject.toml"
              sed -i.bak 's|^# mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }$|mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }|' pyproject.toml
              MODIFIED=true
            fi
            if [ "$MODIFIED" = true ]; then
              echo "Modified pyproject.toml [tool.uv.sources] section:"
              sed -n '/\[tool\.uv\.sources\]/,/^\[/p' pyproject.toml | sed '$d'
              echo "Regenerating uv.lock with local MLX paths..."
              nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command uv lock --upgrade-package mlx --upgrade-package mlx-lm
            fi
          else
            echo "Runner does not have 'local_mlx' tag, using default PyPI packages"
          fi
        shell: bash
      - name: Sync dependencies
        run: |
          if [ -d "/Users/Shared/test" ]; then
            pushd /Users/Shared/test
            uv sync --reinstall
            popd
          fi
          echo "Running just sync to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync
        shell: bash
      - name: Start EXO and run bench script
        shell: bash
        env:
          IS_PRIMARY: ${{ matrix.is_primary }}
          EXPECTED_NODES: ${{ matrix.expected_nodes }}
          HARDWARE_LABEL: ${{ matrix.label }}
          CONFIG_FILE: ${{ needs.plan.outputs.config_file }}
          TIMEOUT_SECONDS: ${{ needs.plan.outputs.timeout_seconds }}
          ENVIRONMENT_JSON: ${{ needs.plan.outputs.environment }}
        run: |
          set -euo pipefail
          # Parse environment variables from config
          ENV_VARS=""
          if [ -n "$ENVIRONMENT_JSON" ] && [ "$ENVIRONMENT_JSON" != "{}" ]; then
            ENV_VARS=$(echo "$ENVIRONMENT_JSON" | python3 -c "import sys, json; env = json.load(sys.stdin); print(' '.join([f'{k}={v}' for k, v in env.items()]))")
          fi
          echo "Starting EXO with API_PORT=${API_PORT} EXO_HOME=${EXO_HOME} EXO_LIBP2P_NAMESPACE=${EXO_LIBP2P_NAMESPACE}"
          echo "Environment variables from config: $ENV_VARS"
          LOG_FILE=/tmp/exo.log
          : > "$LOG_FILE"
          MASTER_FLAG=""
          if [ "$IS_PRIMARY" = "true" ]; then
            MASTER_FLAG="-m"
          fi
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
            "EXO_HOME=$EXO_HOME EXO_MODELS_DIR=$EXO_MODELS_DIR EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE $ENV_VARS PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run exo $MASTER_FLAG --api-port $API_PORT" \
            >> "$LOG_FILE" 2>&1 &
          EXO_PID=$!
          echo "Started EXO in background with PID: $EXO_PID"
          echo "Log file: $LOG_FILE"
          cleanup() {
            echo '=== EXO log (tail) ==='
            tail -n 300 "$LOG_FILE" || true
            if ps -p "$EXO_PID" >/dev/null 2>&1; then
              echo "Killing EXO (PID $EXO_PID)"
              kill "$EXO_PID" || true
            fi
          }
          trap cleanup EXIT
          for i in $(seq 1 60); do
            if curl -s "http://localhost:${API_PORT}/state" >/dev/null 2>&1; then
              echo "EXO API ready"
              break
            fi
            if ! ps -p "$EXO_PID" >/dev/null 2>&1; then
              echo "EXO terminated early"; sed -n '1,200p' "$LOG_FILE" || true; exit 1
            fi
            sleep 1
          done
          RESULTS_FILE="/tmp/bench_results_${GITHUB_RUN_ID}_${GITHUB_RUN_ATTEMPT}_$(date +%s).json"
          echo "Results will be saved to: $RESULTS_FILE"
          echo "RESULTS_FILE=$RESULTS_FILE" >> "$GITHUB_ENV"
          echo "Running bench script with config: $CONFIG_FILE, timeout: $TIMEOUT_SECONDS"
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
            "PYTHONUNBUFFERED=1 uv run --no-project --with pyyaml --with pydantic python .github/scripts/bench.py \
              --api-port $API_PORT \
              --config $CONFIG_FILE \
              --expected-nodes ${EXPECTED_NODES} \
              --is-primary ${IS_PRIMARY} \
              --timeout-seconds ${TIMEOUT_SECONDS} \
              --output $RESULTS_FILE \
              --git-commit ${GITHUB_SHA} \
              --hardware-labels ${HARDWARE_LABEL}"
      - name: Install AWS CLI
        if: always() && env.RESULTS_FILE && matrix.is_primary
        run: |
          if ! command -v aws &> /dev/null; then
            echo "AWS CLI not found, installing..."
            brew install awscli
          else
            echo "AWS CLI already installed"
          fi
        shell: bash
      - name: Upload results to S3
        if: always() && env.RESULTS_FILE && matrix.is_primary
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.S3_BENCHMARKS_AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.S3_BENCHMARKS_AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: us-east-1
        run: |
          echo "Checking for results file: $RESULTS_FILE"
          echo "Is primary: ${{ matrix.is_primary }}"
          if [ -f "$RESULTS_FILE" ]; then
            TIMESTAMP=$(date -u +%Y/%m/%d/%H%M%S)
            S3_KEY="bench/${TIMESTAMP}_${GITHUB_SHA:0:8}_${GITHUB_RUN_ID}.json"
            echo "Uploading results to s3://exo-benchmark-results/$S3_KEY"
            aws s3 cp "$RESULTS_FILE" "s3://exo-benchmark-results/$S3_KEY" \
              --content-type application/json \
              --metadata "commit=${GITHUB_SHA},run_id=${GITHUB_RUN_ID},branch=${GITHUB_REF_NAME}"
            echo "Results uploaded successfully"
            echo "View at: https://exo-benchmark-results.s3.amazonaws.com/$S3_KEY"
          else
            echo "Results file not found at: $RESULTS_FILE"
            echo "Skipping upload"
          fi
        shell: bash
      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()


@@ -1,360 +0,0 @@
name: macOS System Info

on:
  workflow_dispatch: # This allows manual triggering
  # push:
  #   branches: [ '*' ]
  #   tags: [ '*' ]

jobs:
  master:
    runs-on: ['self-hosted', 'macOS']
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: true
      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash
      - name: Pull LFS files
        run: |
          echo "Pulling Git LFS files..."
          git lfs pull
        shell: bash
      - name: Reset databases
        run: |
          if [ -d ~/.exo ]; then
            rm -rf ~/.exo/*.db*
          fi
      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-master-XXXXXXXX)
          # Generate random port (macOS compatible method)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          echo "EXO_HOME=$EXO_HOME" >> $GITHUB_ENV
          echo "API_PORT=$API_PORT" >> $GITHUB_ENV
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Verifying API_PORT is set: $API_PORT"
        shell: bash
      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."
          # Check if nix binary exists directly
          if [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary at /nix/var/nix/profiles/default/bin/nix"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
            echo "PATH=$PATH" >> $GITHUB_ENV
            nix --version
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Found nix profile script, sourcing..."
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
            nix --version
          elif command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
            nix --version
          else
            echo "Nix not found. Debugging info:"
            echo "Contents of /nix/var/nix/profiles/default/:"
            ls -la /nix/var/nix/profiles/default/ 2>/dev/null || echo "Directory not found"
            echo "Contents of /nix/var/nix/profiles/default/bin/:"
            ls -la /nix/var/nix/profiles/default/bin/ 2>/dev/null || echo "Directory not found"
            exit 1
          fi
        shell: bash
      - name: Print macOS system information
        run: |
          echo "=== macOS System Information ==="
          echo "OS Version:"
          sw_vers
          echo -e "\n=== Memory Information ==="
          system_profiler SPMemoryDataType
          echo -e "\n=== Memory Usage Summary ==="
          vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages free: (\d+)/ and printf "Free Memory: %.2f GB\n", $1 * $size / 1024 / 1024 / 1024'
          top -l 1 -s 0 | grep PhysMem
          echo -e "\n=== CPU Information ==="
          sysctl -n machdep.cpu.brand_string
          system_profiler SPHardwareDataType | grep -E "Cores|Processors"
          echo -e "\n=== Disk Space ==="
          df -h /
      # - name: Setup Hugging Face token
      #   run: |
      #     mkdir -p ~/.cache/huggingface
      #     echo "${{ secrets.HF_TOKEN }}" > ~/.cache/huggingface/token
      - name: Sync dependencies
        run: |
          echo "Running just sync-clean to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync-clean
        shell: bash
      - name: Build forwarder
        run: |
          echo "Building Go forwarder binary..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just build-forwarder
        shell: bash
      - name: Start node (master)
        run: |
          echo "Starting master node with debug enabled..."
          echo "Environment check - API_PORT: '$API_PORT'"
          echo "Environment check - EXO_HOME: '$EXO_HOME'"
          if [ -z "$API_PORT" ]; then
            echo "ERROR: API_PORT is not set!"
            exit 1
          fi
          # Run with Python unbuffered output and maximum debug level
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME API_PORT=$API_PORT PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run master/main.py" > /tmp/master_node.log 2>&1 &
          MASTER_PID=$!
          echo "Started master node in background with PID: $MASTER_PID"
          echo "Log file: /tmp/master_node.log"
          echo "Starting worker node..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run worker/main.py" > /tmp/worker_node.log 2>&1 &
          WORKER_PID=$!
          echo "Started worker node in background with PID: $WORKER_PID"
          echo "Log file: /tmp/worker_node.log"
          for i in {1..30}; do
            echo "Attempt $i: Checking if master node is ready..."
            if curl -s http://localhost:$API_PORT/state > /dev/null 2>&1; then
              echo "Master node is ready!"
              break
            fi
            if [ $i -eq 30 ]; then
              echo "Master node failed to start within 30 seconds. Checking logs..."
              echo "=== Master node log ==="
              cat /tmp/master_node.log || echo "No master log file found"
              echo "=== Worker node log ==="
              cat /tmp/worker_node.log || echo "No worker log file found"
              exit 1
            fi
            sleep 1
          done
          # wait for master to have a COMPLETE or FAILED task in the state
          for i in {1..30}; do
            if curl -s http://localhost:$API_PORT/state | jq -r '.tasks | any(.task_status == "COMPLETE" or .task_status == "FAILED")' | grep -q true; then
              echo "Master node has a COMPLETE or FAILED task in the state"
              break
            fi
            sleep 1
          done
          echo "=== Master node log ==="
          cat /tmp/master_node.log || echo "No master log file found"
          echo "=== Worker node log ==="
          cat /tmp/worker_node.log || echo "No worker log file found"
      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()

  worker:
    runs-on: ['self-hosted', 'macOS']
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: true
      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash
      - name: Pull LFS files
        run: |
          echo "Pulling Git LFS files..."
          git lfs pull
        shell: bash
      - name: Reset databases
        run: |
          if [ -d ~/.exo ]; then
            rm -rf ~/.exo/*.db*
          fi
      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-worker-XXXXXXXX)
          # Generate random port (macOS compatible method)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          echo "EXO_HOME=$EXO_HOME" >> $GITHUB_ENV
          echo "API_PORT=$API_PORT" >> $GITHUB_ENV
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Verifying API_PORT is set: $API_PORT"
        shell: bash
      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."
          # Check if nix binary exists directly
          if [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary at /nix/var/nix/profiles/default/bin/nix"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
            echo "PATH=$PATH" >> $GITHUB_ENV
            nix --version
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Found nix profile script, sourcing..."
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
            nix --version
          elif command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
            nix --version
          else
            echo "Nix not found. Debugging info:"
            echo "Contents of /nix/var/nix/profiles/default/:"
            ls -la /nix/var/nix/profiles/default/ 2>/dev/null || echo "Directory not found"
            echo "Contents of /nix/var/nix/profiles/default/bin/:"
            ls -la /nix/var/nix/profiles/default/bin/ 2>/dev/null || echo "Directory not found"
            exit 1
          fi
        shell: bash
      - name: Print macOS system information
        run: |
          echo "=== macOS System Information ==="
          echo "OS Version:"
          sw_vers
          echo -e "\n=== Memory Information ==="
          system_profiler SPMemoryDataType
          echo -e "\n=== Memory Usage Summary ==="
          vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages free: (\d+)/ and printf "Free Memory: %.2f GB\n", $1 * $size / 1024 / 1024 / 1024'
          top -l 1 -s 0 | grep PhysMem
          echo -e "\n=== CPU Information ==="
          sysctl -n machdep.cpu.brand_string
          system_profiler SPHardwareDataType | grep -E "Cores|Processors"
          echo -e "\n=== Disk Space ==="
          df -h /
      # - name: Setup Hugging Face token
      #   run: |
      #     mkdir -p ~/.cache/huggingface
      #     echo "${{ secrets.HF_TOKEN }}" > ~/.cache/huggingface/token
      - name: Sync dependencies
        run: |
          echo "Running just sync-clean to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync-clean
        shell: bash
      - name: Build forwarder
        run: |
          echo "Building Go forwarder binary..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just build-forwarder
        shell: bash
      - name: Start node (replica)
        run: |
          echo "Starting master node with debug enabled..."
          echo "Environment check - API_PORT: '$API_PORT'"
          echo "Environment check - EXO_HOME: '$EXO_HOME'"
          if [ -z "$API_PORT" ]; then
            echo "ERROR: API_PORT is not set!"
            exit 1
          fi
          # Run with Python unbuffered output and maximum debug level
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_RUN_AS_REPLICA=1 EXO_HOME=$EXO_HOME API_PORT=$API_PORT PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run master/main.py" > /tmp/master_node.log 2>&1 &
          MASTER_PID=$!
          echo "Started master node in background with PID: $MASTER_PID"
          echo "Log file: /tmp/master_node.log"
          echo "Starting worker node..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run worker/main.py" > /tmp/worker_node.log 2>&1 &
          WORKER_PID=$!
          echo "Started worker node in background with PID: $WORKER_PID"
          echo "Log file: /tmp/worker_node.log"
          echo "Waiting for master node to start on port $API_PORT..."
          # Wait for the master node to be ready (up to 30 seconds)
          for i in {1..30}; do
            echo "Attempt $i: Checking if master node is ready..."
            if curl -s http://localhost:$API_PORT/state > /dev/null 2>&1; then
              echo "Master node is ready!"
              break
            fi
            if [ $i -eq 30 ]; then
              echo "Master node failed to start within 30 seconds. Checking logs..."
              echo "=== Master node log ==="
              cat /tmp/master_node.log || echo "No master log file found"
              echo "=== Worker node log ==="
              cat /tmp/worker_node.log || echo "No worker log file found"
              exit 1
            fi
            sleep 1
          done
          resp=$(curl -X POST http://localhost:$API_PORT/instance -H "Content-Type: application/json" -d '{"model_id": "llama-3.2:1b"}')
          echo "Response: $resp"
          instance_id=$(echo $resp | jq -r '.instance_id')
          echo "Instance ID: $instance_id"
          for i in {1..50}; do
            resp=$(curl -s -w "%{http_code}" -X GET http://localhost:$API_PORT/instance/$instance_id -H "Content-Type: application/json")
            http_code="${resp: -3}"
            response_body="${resp%???}"
            echo "HTTP Code: $http_code"
            echo "Response: $response_body"
            if [ "$http_code" == "200" ]; then
              instance_status=$(echo $response_body | jq -r '.instance_type')
              if [ "$instance_status" == "ACTIVE" ]; then
                echo "Instance is ready"
                break
              fi
            elif [ "$http_code" == "404" ]; then
              echo "Instance not yet created, waiting..."
            else
              echo "Unexpected HTTP status: $http_code"
            fi
            sleep 1
          done
          resp=$(curl http://localhost:$API_PORT/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "llama-3.2:1b", "messages": [{"role": "user", "content": "What is the meaning of exo?"}], "temperature": 0.7}')
          echo "Response: $resp"
          resp=$(curl -X DELETE http://localhost:$API_PORT/instance/$instance_id -H "Content-Type: application/json")
          echo "Response: $resp"
          echo "=== Master node log ==="
          cat /tmp/master_node.log || echo "No master log file found"
          echo "=== Worker node log ==="
          cat /tmp/worker_node.log || echo "No worker log file found"
          kill $MASTER_PID
          kill $WORKER_PID
      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()


@@ -17,7 +17,7 @@ jobs:
       - name: Checkout repository
         uses: actions/checkout@v4
         with:
-          lfs: true
+          lfs: false
       - uses: cachix/install-nix-action@v31
         with:

TODO.md Normal file (25 changes)

@@ -0,0 +1,25 @@
1. Currently EXO just doesn't start cleanly a lot of the time. I see two kinds of issues:
b. EXO starts, but after creating an instance, that instance never loads (it gets stuck in Loading or Inactive).
2. Currently a lot of requests from the API are timing out, but we still process those requests internally. If an API request times out, we should cancel all tasks corresponding to that API request (why process a request when nobody is listening?).
4. I'd like to see profiled network latency / bandwidth.
5. I'd like to see how much bandwidth each link is using.
6. We should handle the case where one machine doesn't have the model downloaded and then other machines are waiting on it. In this case we get loads of timeout errors because the others are waiting for the one that needs to download the model.
7. Solve the problem in continuous batching where a newly arriving prompt blocks decode of the current batch until its prefill is complete.
8. We want people to be able to copy models over to a new device without ever connecting EXO to the internet. Right now EXO requires an internet connection once to cache some files to check whether a download is complete. Instead, we should simply check whether there is a non-empty model folder locally with no .partial files; this indicates a fully downloaded model that can be loaded.
10. More granular control over how to deploy instances.
12. Nix is great, but installing it is a pain, and we have ended up with a lot of PATH and installation issues. For example, after rebooting, mike seemed to no longer have a nix installation and needed reinstalling. It had a bunch of broken symlinks left over from nix that caused ssh to fail, making it even harder to debug. We need consistent environments (perhaps MDM) so we can guarantee nix is installed properly on each machine.
13. Memory pressure instead of memory used.
14. Show the type of each connection (TB5, Ethernet, etc.) in the UI. Refer to old exo: https://github.com/exo-explore/exo/blob/56f783b38dc6b08ce606b07a5386dc40dae00330/exo/helpers.py#L251
15. Prioritise certain connection types (or by latency). TB5 > Ethernet > WiFi. Refer to old exo: https://github.com/exo-explore/exo/blob/56f783b38dc6b08ce606b07a5386dc40dae00330/exo/helpers.py#L251
16. Dynamically switch to higher priority connection when it becomes available. Probably bring back InstanceReplacedAtomically.
17. Faster model loads by streaming model from other devices in cluster.
18. Add support for specifying the type of network connection to use in a test. Depends on 15/16.
19. Fix mx.distributed.Group typing.
20. Add chat completion cancellations (e.g. OpenWebUI has something for cancelling an ongoing request).
21. Make two separate things: tensor or pipeline, and ring or ibv.
Potential refactors:
1. Make ForwarderEvent typed
2. Topology can be simplified
3. Get rid of InstanceReplacedAtomically


@@ -1,43 +0,0 @@
#!/usr/bin/env bash

# Get the total memory in MB
TOTAL_MEM_MB=$(($(sysctl -n hw.memsize) / 1024 / 1024))

# Calculate 80% and TOTAL_MEM_GB-5GB in MB
EIGHTY_PERCENT=$(($TOTAL_MEM_MB * 80 / 100))
MINUS_5GB=$((($TOTAL_MEM_MB - 5120)))

# Calculate 70% and TOTAL_MEM_GB-8GB in MB
SEVENTY_PERCENT=$(($TOTAL_MEM_MB * 70 / 100))
MINUS_8GB=$((($TOTAL_MEM_MB - 8192)))

# Set WIRED_LIMIT_MB to higher value
if [ $EIGHTY_PERCENT -gt $MINUS_5GB ]; then
  WIRED_LIMIT_MB=$EIGHTY_PERCENT
else
  WIRED_LIMIT_MB=$MINUS_5GB
fi

# Set WIRED_LWM_MB to higher value
if [ $SEVENTY_PERCENT -gt $MINUS_8GB ]; then
  WIRED_LWM_MB=$SEVENTY_PERCENT
else
  WIRED_LWM_MB=$MINUS_8GB
fi

# Display the calculated values
echo "Total memory: $TOTAL_MEM_MB MB"
echo "Maximum limit (iogpu.wired_limit_mb): $WIRED_LIMIT_MB MB"
echo "Lower bound (iogpu.wired_lwm_mb): $WIRED_LWM_MB MB"

# Apply the values with sysctl, but check if we're already root
if [ "$EUID" -eq 0 ]; then
  sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB
  sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB
else
  # Try without sudo first, fall back to sudo if needed
  sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB 2>/dev/null || \
    sudo sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB
  sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB 2>/dev/null || \
    sudo sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB
fi


@@ -1,133 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail

# copy_model.sh: clone ~/.exo/models from SOURCE to one or more TARGETS using scp -3.
# Username defaults:
#   - If host is "aN" and no user given, username defaults to "aN".
#   - Otherwise defaults to $(whoami), unless you pass user@host.
#
# Examples:
#   ./copy_model.sh a1 a2 a3
#   ./copy_model.sh a1 frank@a2 192.168.1.3

if [ $# -lt 2 ]; then
  echo "Usage: $0 SOURCE TARGET [TARGET...]" >&2
  exit 2
fi

SOURCE="$1"
shift
TARGETS=("$@")

DEFAULT_USER="$(whoami)"
MODELS_REL=".exo/models" # relative under $HOME

timestamp() { date "+%Y-%m-%d %H:%M:%S"; }

split_user_host() {
  local in="$1"
  if [[ "$in" == *"@"* ]]; then
    printf "%s|%s" "${in%%@*}" "${in#*@}"
  else
    printf "|%s" "$in"
  fi
}

resolve_ip() {
  local hostish="$1"
  if [[ "$hostish" =~ ^a([0-9]+)$ ]]; then
    echo "192.168.1.${BASH_REMATCH[1]}"
  else
    echo "$hostish"
  fi
}

default_user_for() {
  local hostish="$1"
  if [[ "$hostish" =~ ^a([0-9]+)$ ]]; then
    echo "$hostish"
  else
    echo "$DEFAULT_USER"
  fi
}

SSH_OPTS=(-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=ERROR -o ConnectTimeout=10)
SSHPASS_BIN="$(command -v sshpass || true)"
SCP_BIN="${SCP_BIN:-scp}"

read -s -p "Password for all hosts: " PASS
echo
if [ -n "$SSHPASS_BIN" ]; then
  echo "$(timestamp) sshpass found: will provide the password non-interactively."
else
  echo "$(timestamp) WARNING: sshpass not found — you'll be prompted by scp/ssh per hop unless keys are set up."
fi

# Build source endpoint (default username logic)
IFS='|' read -r SRC_USER_RAW SRC_HOSTISH <<<"$(split_user_host "$SOURCE")"
SRC_USER="${SRC_USER_RAW:-$(default_user_for "$SRC_HOSTISH")}"
SRC_IP="$(resolve_ip "$SRC_HOSTISH")"
SRC_HOST="${SRC_USER}@${SRC_IP}"

echo "$(timestamp) Source: ${SRC_HOST}:~/${MODELS_REL}"
echo "$(timestamp) Targets: ${#TARGETS[@]}"

# Helper to run a simple remote command via ssh (for mkdir -p checks)
ssh_run() {
  local host="$1"
  shift
  if [ -n "$SSHPASS_BIN" ]; then
    sshpass -p "$PASS" ssh "${SSH_OPTS[@]}" "$host" "$@"
  else
    ssh "${SSH_OPTS[@]}" "$host" "$@"
  fi
}

# Ensure source dir exists (create if missing, per your request)
ssh_run "$SRC_HOST" "mkdir -p ~/${MODELS_REL}"

failures=0
count=0
for T in "${TARGETS[@]}"; do
  count=$((count + 1))
  IFS='|' read -r T_USER_RAW T_HOSTISH <<<"$(split_user_host "$T")"
  T_USER="${T_USER_RAW:-$(default_user_for "$T_HOSTISH")}"
  T_IP="$(resolve_ip "$T_HOSTISH")"
  T_HOST="${T_USER}@${T_IP}"

  echo "============================================================"
  echo "$(timestamp) [${count}/${#TARGETS[@]}] ${SRC_HOST} ==> ${T_HOST}"
  echo "$(timestamp) Ensuring destination directory exists…"
  ssh_run "$T_HOST" "mkdir -p ~/${MODELS_REL%/*}" # ~/.exo

  # Copy the whole "models" directory into ~/.exo on the target.
  # scp -3 = copy between two remotes via local; -r recursive; -p preserve times/modes
  if [ -n "$SSHPASS_BIN" ]; then
    echo "$(timestamp) Running: scp -3 -rp ${SRC_HOST}:~/${MODELS_REL} ${T_HOST}:~/.exo/"
    if sshpass -p "$PASS" "$SCP_BIN" "${SSH_OPTS[@]}" -3 -rp \
      "${SRC_HOST}:~/${MODELS_REL}" \
      "${T_HOST}:~/.exo/"; then
      echo "$(timestamp) [${count}] Done: ${T_HOST}"
    else
      echo "$(timestamp) [${count}] ERROR during scp to ${T_HOST}" >&2
      failures=$((failures + 1))
    fi
  else
    echo "$(timestamp) Running: scp -3 -rp ${SRC_HOST}:~/${MODELS_REL} ${T_HOST}:~/.exo/"
    if "$SCP_BIN" "${SSH_OPTS[@]}" -3 -rp \
      "${SRC_HOST}:~/${MODELS_REL}" \
      "${T_HOST}:~/.exo/"; then
      echo "$(timestamp) [${count}] Done: ${T_HOST}"
    else
      echo "$(timestamp) [${count}] ERROR during scp to ${T_HOST}" >&2
      failures=$((failures + 1))
    fi
  fi
done

echo "============================================================"
if [ "$failures" -eq 0 ]; then
  echo "$(timestamp) All transfers completed successfully."
else
  echo "$(timestamp) Completed with ${failures} failure(s)."
fi


@@ -461,6 +461,17 @@
margin-bottom: 8px;
}
.instance-strategy {
font-size: 13px;
color: var(--exo-light-gray);
margin-bottom: 8px;
}
.instance-strategy-value {
font-weight: 600;
color: var(--exo-yellow);
}
.instance-details {
font-size: 12px;
color: var(--exo-light-gray);
@@ -468,15 +479,6 @@
.download-progress {
font-size: 11px;
color: var(--exo-light-gray);
margin-top: 4px;
display: flex;
align-items: center;
gap: 8px;
}
.progress-bar-container {
background-color: var(--exo-black);
border-radius: 8px;
@@ -492,75 +494,96 @@
transition: width 0.3s ease;
}
/* Detailed download info */
.download-details {
margin-top: 8px;
padding: 12px;
background-color: #1a1a1a;
border: 1px solid var(--exo-medium-gray);
border-radius: 6px;
box-sizing: border-box;
width: 100%;
max-width: 100%;
overflow: visible;
}
.download-runner-header {
font-size: 11px;
color: var(--exo-light-gray);
opacity: 0.85;
margin-bottom: 4px;
}
.download-overview-row {
display: flex;
gap: 12px;
flex-wrap: wrap;
font-size: 12px;
/* Overall download summary styles */
.overall-download-summary {
margin-top: 10px;
margin-bottom: 8px;
}
.download-overview-item strong {
color: #E0E0E0;
font-weight: 600;
margin-right: 4px;
}
.progress-with-label {
.overall-download-header {
display: flex;
justify-content: space-between;
align-items: center;
gap: 8px;
margin-bottom: 10px;
margin-bottom: 4px;
}
.progress-with-label .progress-bar-container {
flex: 1 1 auto;
}
.progress-percent {
font-size: 12px;
.overall-download-label {
font-size: 11px;
font-weight: 500;
color: var(--exo-light-gray);
opacity: 0.7;
}
.overall-download-percent {
font-size: 11px;
font-weight: 500;
color: var(--exo-light-gray);
opacity: 0.7;
font-variant-numeric: tabular-nums;
white-space: nowrap;
}
.download-overview-combined {
font-size: 12px;
.overall-download-stats {
font-size: 10px;
color: var(--exo-light-gray);
opacity: 0.9;
margin-top: 4px;
opacity: 0.6;
}
.instance-download-summary {
/* Per-node download summary styles */
.node-download-summary {
margin-top: 12px;
padding: 10px;
background-color: rgba(0, 0, 0, 0.2);
border-radius: 6px;
border-left: 3px solid #3b82f6;
}
.node-download-header {
display: flex;
justify-content: space-between;
align-items: center;
margin-bottom: 6px;
}
.node-download-name {
font-size: 13px;
font-weight: 600;
color: var(--exo-yellow);
}
.node-download-percent {
font-size: 13px;
font-weight: 600;
color: #3b82f6;
font-variant-numeric: tabular-nums;
}
.node-download-stats {
font-size: 11px;
color: var(--exo-light-gray);
margin-top: 6px;
opacity: 0.95;
margin-bottom: 10px;
opacity: 0.9;
}
/* File-level download details */
.download-files-list {
display: grid;
gap: 8px;
margin-top: 10px;
}
.download-file {
padding: 8px;
background-color: var(--exo-dark-gray);
background-color: rgba(0, 0, 0, 0.3);
border: 1px solid var(--exo-medium-gray);
border-radius: 6px;
box-sizing: border-box;
width: 100%;
max-width: 100%;
}
.download-file-header {
display: flex;
justify-content: space-between;
@@ -572,6 +595,7 @@
max-width: 100%;
overflow: hidden;
}
.download-file-name {
color: #E0E0E0;
font-weight: 500;
@@ -581,11 +605,7 @@
min-width: 0;
flex: 1 1 auto;
}
.download-file-stats {
color: var(--exo-light-gray);
text-align: right;
white-space: nowrap;
}
.download-file-percent {
color: var(--exo-light-gray);
white-space: nowrap;
@@ -593,6 +613,7 @@
font-variant-numeric: tabular-nums;
flex: 0 0 auto;
}
.download-file-subtext {
color: var(--exo-light-gray);
font-size: 10px;
@@ -603,26 +624,20 @@
white-space: nowrap;
max-width: 100%;
}
.download-details, .download-files-list {
box-sizing: border-box;
width: 100%;
max-width: 100%;
}
.download-files-list {
overflow: visible;
padding-right: 2px; /* avoid edge clipping */
}
.download-file .progress-bar-container {
width: 100%;
max-width: 100%;
box-sizing: border-box;
height: 5px;
}
.completed-files-section {
margin-top: 12px;
padding-top: 8px;
border-top: 1px solid var(--exo-medium-gray);
border-top: 1px solid rgba(255, 255, 255, 0.1);
}
.completed-files-header {
font-size: 10px;
color: var(--exo-light-gray);
@@ -630,11 +645,13 @@
margin-bottom: 6px;
font-weight: 500;
}
.completed-files-list {
display: flex;
flex-direction: column;
gap: 3px;
}
.completed-file-item {
font-size: 10px;
color: var(--exo-light-gray);
@@ -772,6 +789,82 @@
cursor: not-allowed;
}
.strategy-selector {
display: flex;
flex-direction: column;
gap: 8px;
}
.strategy-options {
display: flex;
gap: 12px;
flex-wrap: wrap;
}
.strategy-option {
display: flex;
align-items: center;
gap: 6px;
cursor: pointer;
padding: 8px 12px;
border-radius: 6px;
background-color: var(--exo-dark-gray);
border: 2px solid var(--exo-medium-gray);
transition: all 0.2s ease;
user-select: none;
}
.strategy-option:hover {
background-color: var(--exo-medium-gray);
border-color: rgba(255, 215, 0, 0.5);
}
.strategy-option input[type="radio"] {
appearance: none;
width: 16px;
height: 16px;
border: 2px solid var(--exo-light-gray);
border-radius: 50%;
cursor: pointer;
position: relative;
margin: 0;
transition: all 0.2s ease;
}
.strategy-option input[type="radio"]:checked {
border-color: var(--exo-yellow);
background-color: var(--exo-yellow);
}
.strategy-option input[type="radio"]:checked::after {
content: '';
position: absolute;
top: 50%;
left: 50%;
transform: translate(-50%, -50%);
width: 6px;
height: 6px;
border-radius: 50%;
background-color: var(--exo-black);
}
.strategy-option:has(input[type="radio"]:checked) {
background-color: rgba(255, 215, 0, 0.15);
border-color: var(--exo-yellow);
}
.strategy-option label {
cursor: pointer;
font-size: 14px;
font-weight: 500;
color: var(--exo-light-gray);
margin: 0;
}
.strategy-option:has(input[type="radio"]:checked) label {
color: var(--exo-yellow);
}
.launch-status {
font-size: 12px;
padding: 8px;
@@ -850,6 +943,33 @@
<select id="modelSelect" class="model-select">
<option value="">Loading models...</option>
</select>
<div class="strategy-selector">
<label class="launch-label">Parallelization Strategy:</label>
<div class="strategy-options">
<div class="strategy-option">
<input type="radio" id="strategyAuto" name="strategy" value="auto" checked>
<label for="strategyAuto">Auto</label>
</div>
<div class="strategy-option">
<input type="radio" id="strategyPipeline" name="strategy" value="pipeline">
<label for="strategyPipeline">Pipeline</label>
</div>
<div class="strategy-option">
<input type="radio" id="strategyTensor" name="strategy" value="tensor">
<label for="strategyTensor">Tensor</label>
</div>
<div class="strategy-option">
<input type="radio" id="strategyPipelineRdma" name="strategy" value="pipeline_rdma">
<label for="strategyPipelineRdma">Pipeline RDMA</label>
</div>
<div class="strategy-option">
<input type="radio" id="strategyTensorRdma" name="strategy" value="tensor_rdma">
<label for="strategyTensorRdma">Tensor RDMA</label>
</div>
</div>
</div>
<button id="launchInstanceButton" class="launch-button" disabled>Launch Instance</button>
<div id="launchStatus" class="launch-status"></div>
</div>
@@ -1112,6 +1232,9 @@
return;
}
const selectedStrategy = document.querySelector('input[name="strategy"]:checked').value;
console.log("selectedStrategy", selectedStrategy);
try {
showLaunchStatus('Launching instance...', 'loading');
launchInstanceButton.disabled = true;
@@ -1121,7 +1244,10 @@
headers: {
'Content-Type': 'application/json',
},
body: JSON.stringify({ model_id: selectedModelId })
body: JSON.stringify({
model_id: selectedModelId,
strategy: selectedStrategy
})
});
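// Example request body for the launch call above (the model id is illustrative,
// not a real default):
//   { "model_id": "mlx-community/Llama-3.2-1B-4bit", "strategy": "tensor_rdma" }
// where strategy is one of: auto | pipeline | tensor | pipeline_rdma | tensor_rdma.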
if (!response.ok) {
@@ -1251,60 +1377,6 @@
return { isDownloading: isDownloadingAny, progress, details };
}
function buildDownloadDetailsHTML(details) {
if (!details || details.length === 0) return '';
function shortId(id) { return (id && id.length > 8) ? id.slice(0, 8) + '…' : (id || ''); }
return details.map(({ runnerId, nodeId, progress }) => {
const etaStr = formatDurationMs(progress.etaMs);
const pctStr = formatPercent(progress.percentage || 0, 2);
const bytesStr = `${formatBytes(progress.downloadedBytes)} / ${formatBytes(progress.totalBytes)}`;
const speedStr = formatBytesPerSecond(progress.speed);
const filesSummary = `${progress.completedFiles}/${progress.totalFiles}`;
const allFiles = progress.files || [];
const inProgressFiles = allFiles.filter(f => (f.percentage || 0) < 100);
const completedFiles = allFiles.filter(f => (f.percentage || 0) >= 100);
const inProgressHTML = inProgressFiles.map(f => {
const fPct = f.percentage || 0;
const fBytes = `${formatBytes(f.downloadedBytes)} / ${formatBytes(f.totalBytes)}`;
const fEta = formatDurationMs(f.etaMs);
const fSpeed = formatBytesPerSecond(f.speed);
const pctText = formatPercent(fPct, 2);
return `
<div class="download-file">
<div class="download-file-header">
<span class="download-file-name" title="${f.name}">${f.name}</span>
<span class="download-file-percent">${pctText}</span>
</div>
<div class="download-file-subtext">${fBytes} • ETA ${fEta}${fSpeed}</div>
<div class="progress-bar-container"><div class="progress-bar" style="width: ${Math.max(0, Math.min(100, fPct)).toFixed(2)}%;"></div></div>
</div>
`;
}).join('');
const completedHTML = completedFiles.length > 0 ? `
<div class="completed-files-section">
<div class="completed-files-header">Completed (${completedFiles.length})</div>
<div class="completed-files-list">
${completedFiles.map(f => `<div class="completed-file-item" title="${f.name}">${f.name}</div>`).join('')}
</div>
</div>
` : '';
const runnerName = (nodeId && nodeIdToFriendlyName[nodeId]) ? nodeIdToFriendlyName[nodeId] : '?';
const headerText = `${runnerName} (${shortId(nodeId || '')})`;
return `
<div class="download-details">
<div class="download-runner-header">${headerText}</div>
<div class="download-files-list">
${inProgressHTML}
</div>
${completedHTML}
</div>
`;
}).join('');
}
// Derive a display status for an instance from its runners.
// Priority: FAILED > DOWNLOADING > STARTING > RUNNING > LOADED > INACTIVE
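// A minimal sketch of that priority rule (illustrative only; the real
// deriveInstanceStatus body is not shown in this hunk):
// const STATUS_PRIORITY = ['FAILED', 'DOWNLOADING', 'STARTING', 'RUNNING', 'LOADED', 'INACTIVE'];
// const highestPriorityStatus = (statuses) =>
//   STATUS_PRIORITY.find(s => statuses.includes(s)) || 'INACTIVE';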
@@ -1383,9 +1455,37 @@
? instance.instanceId.substring(0, 8) + '...'
: instance.instanceId;
const hostsHTML = instance.hosts?.map(host =>
`<span class="instance-host">${host.ip}:${host.port}</span>`
).join('') || '';
// Create reverse mapping from runnerId to nodeId using nodeToRunner
const nodeToRunner = instance.shardAssignments?.nodeToRunner || {};
const runnerToNode = {};
Object.entries(nodeToRunner).forEach(([nodeId, runnerId]) => {
runnerToNode[runnerId] = nodeId;
});
// Extract parallelization strategy from the first shard
const runnerToShard = instance.shardAssignments?.runnerToShard || {};
const firstShardData = Object.values(runnerToShard)[0];
let parallelizationStrategy = 'Unknown';
if (firstShardData) {
const shardKeys = Object.keys(firstShardData);
if (shardKeys.length === 1) {
const shardPayload = firstShardData[shardKeys[0]];
parallelizationStrategy = shardPayload?.strategy || firstShardData.strategy || 'Unknown';
} else {
parallelizationStrategy = firstShardData.strategy || 'Unknown';
}
}
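// The two payload shapes handled above (assumed from the fallbacks, not verified
// here; 'someShardKey' is a hypothetical wrapper key name):
//   runnerToShard[id] = { someShardKey: { strategy: 'pipeline', ... } }  // single wrapped payload
//   runnerToShard[id] = { strategy: 'tensor', ... }                      // flat payload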
// Generate hosts HTML using runner IDs and friendly names
const runnerIds = Object.keys(runnerToShard);
const hostsHTML = runnerIds.map(runnerId => {
const nodeId = runnerToNode[runnerId];
const friendlyName = nodeId && nodeIdToFriendlyName[nodeId]
? nodeIdToFriendlyName[nodeId]
: 'Unknown Node';
const shortId = runnerId.slice(-4);
return `<span class="instance-host">${friendlyName} (${shortId})</span>`;
}).join('') || '';
// Calculate download status for this instance
const downloadStatus = calculateInstanceDownloadStatus(instance, runners);
@@ -1397,32 +1497,95 @@
({ statusText, statusClass } = deriveInstanceStatus(instance, runners));
}
// Generate download progress HTML
// Generate download progress HTML - overall + per node with file details
let downloadProgressHTML = '';
let instanceDownloadSummary = '';
if (downloadStatus.isDownloading) {
const detailsHTML = buildDownloadDetailsHTML(downloadStatus.details || []);
const pctText = (downloadStatus.progress || 0).toFixed(2);
// Aggregate a compact summary from the first runner (they should be consistent in aggregate)
const first = (downloadStatus.details || [])[0]?.progress;
const etaStr = first ? formatDurationMs(first.etaMs) : '—';
const bytesStr = first ? `${formatBytes(first.downloadedBytes)} / ${formatBytes(first.totalBytes)}` : '';
const speedStr = first ? formatBytesPerSecond(first.speed) : '';
const filesSummary = first ? `${first.completedFiles}/${first.totalFiles}` : '';
instanceDownloadSummary = `${etaStr} · ${bytesStr} · ${speedStr} · ${filesSummary} files`;
downloadProgressHTML = `
<div class="download-progress">
<span>${pctText}%</span>
<div class="progress-bar-container">
<div class="progress-bar" style="width: ${pctText}%;"></div>
// Calculate overall progress across all nodes
const overallPct = (downloadStatus.progress || 0).toFixed(2);
const totalBytesAll = downloadStatus.details.reduce((sum, d) => sum + (d.progress.totalBytes || 0), 0);
const downloadedBytesAll = downloadStatus.details.reduce((sum, d) => sum + (d.progress.downloadedBytes || 0), 0);
const nodeCount = downloadStatus.details.length;
// Overall progress section
const overallHTML = `
<div class="overall-download-summary">
<div class="overall-download-header">
<span class="overall-download-label">Overall</span>
<span class="overall-download-percent">${overallPct}%</span>
</div>
<div class="progress-bar-container">
<div class="progress-bar" style="width: ${overallPct}%;"></div>
</div>
<div class="overall-download-stats">${formatBytes(downloadedBytesAll)} / ${formatBytes(totalBytesAll)}${nodeCount} runner${nodeCount !== 1 ? 's' : ''}</div>
</div>
${detailsHTML}
`;
const perNodeHTML = (downloadStatus.details || []).map(({ runnerId, nodeId, progress }) => {
const nodeName = (nodeId && nodeIdToFriendlyName[nodeId])
? nodeIdToFriendlyName[nodeId]
: (nodeIdToFriendlyName[runnerId] || 'Unknown Node');
const pctText = (progress.percentage || 0).toFixed(2);
const etaStr = formatDurationMs(progress.etaMs);
const bytesStr = `${formatBytes(progress.downloadedBytes)} / ${formatBytes(progress.totalBytes)}`;
const speedStr = formatBytesPerSecond(progress.speed);
const filesSummary = `${progress.completedFiles}/${progress.totalFiles} files`;
// Separate files into in-progress and completed
const allFiles = progress.files || [];
const inProgressFiles = allFiles.filter(f => (f.percentage || 0) < 100);
const completedFiles = allFiles.filter(f => (f.percentage || 0) >= 100);
// Generate HTML for in-progress files
const inProgressHTML = inProgressFiles.map(f => {
const fPct = f.percentage || 0;
const fBytes = `${formatBytes(f.downloadedBytes)} / ${formatBytes(f.totalBytes)}`;
const fEta = formatDurationMs(f.etaMs);
const fSpeed = formatBytesPerSecond(f.speed);
const pctFormatted = formatPercent(fPct, 2);
return `
<div class="download-file">
<div class="download-file-header">
<span class="download-file-name" title="${f.name}">${f.name}</span>
<span class="download-file-percent">${pctFormatted}</span>
</div>
<div class="download-file-subtext">${fBytes} • ETA ${fEta}${fSpeed}</div>
<div class="progress-bar-container"><div class="progress-bar" style="width: ${Math.max(0, Math.min(100, fPct)).toFixed(2)}%;"></div></div>
</div>
`;
}).join('');
// Generate HTML for completed files
const completedHTML = completedFiles.length > 0 ? `
<div class="completed-files-section">
<div class="completed-files-header">Completed (${completedFiles.length})</div>
<div class="completed-files-list">
${completedFiles.map(f => `<div class="completed-file-item" title="${f.name}">${f.name}</div>`).join('')}
</div>
</div>
` : '';
return `
<div class="node-download-summary">
<div class="node-download-header">
<span class="node-download-name">${nodeName}</span>
<span class="node-download-percent">${pctText}%</span>
</div>
<div class="progress-bar-container">
<div class="progress-bar" style="width: ${pctText}%;"></div>
</div>
<div class="node-download-stats">${etaStr} · ${bytesStr} · ${speedStr} · ${filesSummary}</div>
<div class="download-files-list">
${inProgressHTML}
</div>
${completedHTML}
</div>
`;
}).join('');
downloadProgressHTML = overallHTML + perNodeHTML;
}
const shardCount = Object.keys(instance.shardAssignments?.runnerToShard || {}).length;
const shardCount = Object.keys(runnerToShard).length;
return `
<div class="instance-item">
<div class="instance-header">
@@ -1436,8 +1599,8 @@
</button>
</div>
</div>
<div class="instance-model">${modelId} <span style="color: var(--exo-light-gray); opacity: 0.8;">(${shardCount})</span></div>
${instanceDownloadSummary ? `<div class="instance-download-summary">${instanceDownloadSummary}</div>` : ''}
<div class="instance-model">${modelId} <span style="color: var(--exo-light-gray); opacity: 0.8;">(${shardCount} runner${shardCount !== 1 ? 's' : ''})</span></div>
<div class="instance-strategy">Strategy: <span class="instance-strategy-value">${parallelizationStrategy}</span></div>
${downloadProgressHTML}
${hostsHTML ? `<div class="instance-hosts">${hostsHTML}</div>` : ''}

18
flake.lock generated
View File

@@ -8,11 +8,11 @@
"rust-analyzer-src": "rust-analyzer-src"
},
"locked": {
"lastModified": 1755585599,
"narHash": "sha256-tl/0cnsqB/Yt7DbaGMel2RLa7QG5elA8lkaOXli6VdY=",
"lastModified": 1761893049,
"narHash": "sha256-1TtFDPhC+ZsrOOtBnry1EZC+WipTTvsOVjIEVugqji8=",
"owner": "nix-community",
"repo": "fenix",
"rev": "6ed03ef4c8ec36d193c18e06b9ecddde78fb7e42",
"rev": "c2ac9a5c0d6d16630c3b225b874bd14528d1abe6",
"type": "github"
},
"original": {
@@ -41,11 +41,11 @@
},
"nixpkgs": {
"locked": {
"lastModified": 1755615617,
"narHash": "sha256-HMwfAJBdrr8wXAkbGhtcby1zGFvs+StOp19xNsbqdOg=",
"lastModified": 1761672384,
"narHash": "sha256-o9KF3DJL7g7iYMZq9SWgfS1BFlNbsm6xplRjVlOCkXI=",
"owner": "NixOS",
"repo": "nixpkgs",
"rev": "20075955deac2583bb12f07151c2df830ef346b4",
"rev": "08dacfca559e1d7da38f3cf05f1f45ee9bfd213c",
"type": "github"
},
"original": {
@@ -65,11 +65,11 @@
"rust-analyzer-src": {
"flake": false,
"locked": {
"lastModified": 1755504847,
"narHash": "sha256-VX0B9hwhJypCGqncVVLC+SmeMVd/GAYbJZ0MiiUn2Pk=",
"lastModified": 1761849405,
"narHash": "sha256-igXdvC+WCUN+3gnfk+ptT7rMmxQuY6WbIg1rXMUN1DM=",
"owner": "rust-lang",
"repo": "rust-analyzer",
"rev": "a905e3b21b144d77e1b304e49f3264f6f8d4db75",
"rev": "f7de8ae045a5fe80f1203c5a1c3015b05f7c3550",
"type": "github"
},
"original": {

View File

@@ -61,6 +61,10 @@
# JUST
just
]
++ (pkgs.lib.optionals pkgs.stdenv.isLinux [
# IFCONFIG
unixtools.ifconfig
])
++ (pkgs.lib.optionals pkgs.stdenv.isDarwin [
# MACMON
macmon
@@ -68,8 +72,8 @@
shellHook = ''
# PYTHON
export DASHBOARD_DIR=$(git rev-parse --show-toplevel)/dashboard;
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${pkgs.python313}/lib
export DASHBOARD_DIR="$(git rev-parse --show-toplevel)/dashboard"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:${pkgs.python313}/lib"
echo
echo "🍎🍎 Run 'just <recipe>' to get started"
just --list

View File

@@ -16,6 +16,10 @@ sync:
sync-clean:
uv sync --all-packages --force-reinstall --no-cache
rust-rebuild:
cd rust && cargo run --bin stub_gen
just sync-clean
clean:
rm -rf **/__pycache__
rm -rf rust/target

View File

@@ -1,65 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
###############################################################################
# Args & prerequisites
###############################################################################
if [[ $# -gt 1 ]]; then
echo "Usage: $0 [hosts_file]" >&2
exit 1
fi
HOSTS_FILE=${1:-hosts.txt}
###############################################################################
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
###############################################################################
if [[ ! -f "$HOSTS_FILE" ]]; then
echo "Error: $HOSTS_FILE not found"
exit 1
fi
if builtin command -v mapfile >/dev/null 2>&1; then
mapfile -t HOSTS <"$HOSTS_FILE"
else
HOSTS=()
while IFS= read -r h; do
[[ -n "$h" ]] && HOSTS+=("$h")
done <"$HOSTS_FILE"
fi
[[ ${#HOSTS[@]} -gt 0 ]] || {
echo "No hosts found in $HOSTS_FILE"
exit 1
}
###############################################################################
# Helper run a remote command and capture rc/stderr/stdout
###############################################################################
ssh_opts=(-o StrictHostKeyChecking=no
-o LogLevel=ERROR)
run_remote() { # $1 host $2 command
local host=$1 cmd=$2 rc
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
rc=0
else
rc=$?
fi
return $rc
}
###############################################################################
# Kill exo everywhere (parallel)
###############################################################################
echo "=== Killing exo on ${#HOSTS[@]} host(s) ==="
fail=0
for h in "${HOSTS[@]}"; do
(
run_remote "$h" 'pkill -f exo || true'
) || fail=1 &
done
wait
((fail == 0)) || {
echo "❌ Some hosts could not be reached—check SSH access."
exit 1
}
echo "✓ exo processes killed on all reachable hosts."

View File

@@ -26,8 +26,6 @@ dependencies = [
"sqlalchemy[asyncio]>=2.0.43",
"greenlet>=3.2.4",
"huggingface-hub>=0.33.4",
"mlx==0.29.3",
"mlx-lm==0.28.3",
"psutil>=7.0.0",
"transformers>=4.55.2",
"cobs>=1.2.2",
@@ -36,6 +34,8 @@ dependencies = [
"exo_pyo3_bindings", # rust bindings
"anyio>=4.10.0",
"bidict>=0.23.1",
"mlx>=0.29.3",
"mlx-lm>=0.28.3",
]
[project.scripts]
@@ -52,7 +52,7 @@ dev = [
]
# mlx[cuda] requires a newer version of mlx. the ideal on linux is: default to mlx[cpu] unless [cuda] is specified.
# [project.optional-dependencies]
[project.optional-dependencies]
# cuda = [
# "mlx[cuda]==0.26.3",
# ]
@@ -69,6 +69,9 @@ members = [
[tool.uv.sources]
exo_pyo3_bindings = { workspace = true }
# Uncomment to use local mlx/mlx-lm development versions:
# mlx = { path = "/Users/Shared/mlx", editable=true }
# mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }
[build-system]
requires = ["uv_build>=0.8.9,<0.9.0"]
@@ -94,7 +97,7 @@ reportUnnecessaryTypeIgnoreComment = "error"
pythonVersion = "3.13"
pythonPlatform = "Darwin"
exclude = ["**/.venv", "**/venv", "**/__pycache__", "**/exo_scripts", "**/.direnv", "**/rust"]
exclude = ["**/.venv", "**/venv", "**/__pycache__", "**/exo_scripts", "**/.direnv", "**/rust", "mlx/*", "mlx-lm/*"]
stubPath = "typings"
[[tool.basedpyright.executionEnvironments]]

View File

@@ -1,82 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
###############################################################################
# Args & prerequisites
###############################################################################
if [[ $# -lt 1 ]]; then
echo "Usage: $0 <git_command> [git_args...]" >&2
echo "Examples:" >&2
echo " $0 pull" >&2
echo " $0 checkout main" >&2
echo " $0 status" >&2
echo " $0 fetch --all" >&2
exit 1
fi
GIT_CMD="$*" # All args form the git command
HOSTS_FILE=${HOSTS_FILE:-hosts.txt}
###############################################################################
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
###############################################################################
if [[ ! -f "$HOSTS_FILE" ]]; then
echo "Error: $HOSTS_FILE not found"
exit 1
fi
if builtin command -v mapfile >/dev/null 2>&1; then
mapfile -t HOSTS <"$HOSTS_FILE"
else
HOSTS=()
while IFS= read -r h; do
[[ -n "$h" ]] && HOSTS+=("$h")
done <"$HOSTS_FILE"
fi
[[ ${#HOSTS[@]} -gt 0 ]] || {
echo "No hosts found in $HOSTS_FILE"
exit 1
}
###############################################################################
# Helper run a remote command and capture rc/stderr/stdout
###############################################################################
ssh_opts=(-o StrictHostKeyChecking=no
-o LogLevel=ERROR)
run_remote() { # $1 host $2 command
local host=$1 cmd=$2 rc
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
rc=0
else
rc=$?
fi
return $rc
}
###############################################################################
# Run git command on remote hosts (parallel)
###############################################################################
echo ""
echo "=== Running 'git $GIT_CMD' on ${#HOSTS[@]} remote host(s) ==="
fail=0
for h in "${HOSTS[@]}"; do
(
echo "→ Running on $h..."
if run_remote "$h" "cd ~/exo && git $GIT_CMD"; then
echo "$h: success"
else
echo "$h: failed"
exit 1
fi
) || fail=1 &
done
wait
echo ""
if ((fail == 0)); then
echo "🎉 Git command executed successfully on all hosts!"
else
echo "⚠️ Some hosts failed—see above."
exit 1
fi

48
run.sh
View File

@@ -1,48 +0,0 @@
#!/bin/bash
DIR="$PWD"
# Initialize flags
REPLICA=false
CLEAN=false
# Parse command line arguments
while getopts "rc" opt; do
case $opt in
r)
REPLICA=true
;;
c)
CLEAN=true
;;
\?)
echo "Invalid option: -$OPTARG" >&2
echo "Usage: $0 [-r] [-c]"
echo " -r Run as replica"
echo " -c Clean databases before starting"
exit 1
;;
esac
done
# Clean if requested
if [ "$CLEAN" = true ]; then
echo "Cleaning databases..."
rm -f ~/.exo/*db*
fi
# Configure MLX
# ./configure_mlx.sh
# Second command (master) - changes based on replica flag
if [ "$REPLICA" = true ]; then
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export RUST_LOG=true EXO_RUN_AS_REPLICA=1 EXO_HOME=.exo API_PORT=8001; uv run exo-master'\""
else
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export RUST_LOG=true; uv run exo-master'\""
fi
# First command (worker) - changes based on replica flag
if [ "$REPLICA" = true ]; then
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export EXO_HOME=.exo; uv run exo-worker'\""
else
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c uv run exo-worker\""
fi

View File

@@ -1,99 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail
###############################################################################
# Args & prerequisites
###############################################################################
if [[ $# -gt 1 ]]; then
echo "Usage: $0 [hosts_file]" >&2
exit 1
fi
HOSTS_FILE=${1:-hosts.txt}
###############################################################################
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
###############################################################################
if [[ ! -f "$HOSTS_FILE" ]]; then
echo "Error: $HOSTS_FILE not found"
exit 1
fi
if builtin command -v mapfile >/dev/null 2>&1; then
mapfile -t HOSTS <"$HOSTS_FILE"
else
HOSTS=()
while IFS= read -r h; do
[[ -n "$h" ]] && HOSTS+=("$h")
done <"$HOSTS_FILE"
fi
[[ ${#HOSTS[@]} -gt 0 ]] || {
echo "No hosts found in $HOSTS_FILE"
exit 1
}
###############################################################################
# Helper run a remote command and capture rc/stderr/stdout
###############################################################################
ssh_opts=(-o StrictHostKeyChecking=no
-o LogLevel=ERROR)
run_remote() { # $1 host $2 command
local host=$1 cmd=$2 rc
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
rc=0
else
rc=$?
fi
return $rc
}
###############################################################################
# Phase 1 kill exo everywhere (parallel)
###############################################################################
echo "=== Stage 1: killing exo on ${#HOSTS[@]} host(s) ==="
fail=0
for h in "${HOSTS[@]}"; do
(
run_remote "$h" 'pkill -f exo || true'
) || fail=1 &
done
wait
((fail == 0)) || {
echo "❌ Some hosts could not be reached—check SSH access."
exit 1
}
echo "✓ exo processes killed on all reachable hosts."
#
###############################################################################
# Phase 2 cleanup database files (parallel)
###############################################################################
echo "=== Stage 2: cleaning up database files ==="
fail=0
for h in "${HOSTS[@]}"; do
(
run_remote "$h" 'rm -f ~/.exo/*db* || true'
) || fail=1 &
done
wait
((fail == 0)) || {
echo "❌ Some hosts failed database cleanup."
exit 1
}
echo "✓ Database files cleaned on all hosts."
###############################################################################
# Phase 3 start new exo processes in Terminal windows (parallel)
###############################################################################
echo "=== Stage 3: starting new exo processes ==="
fail=0
for h in "${HOSTS[@]}"; do
# Use osascript to open Terminal windows on remote Mac
remote_cmd="osascript -e \"tell app \\\"Terminal\\\" to do script \\\"cd ~/exo; nix develop --command uv run exo\\\"\""
(run_remote "$h" "$remote_cmd") || fail=1 &
done
wait
((fail == 0)) && echo "🎉 Deployment finished!" || {
echo "⚠️ Some starts failed—see above."
exit 1
}

View File

@@ -2,8 +2,16 @@
# ruff: noqa: E501, F401
import builtins
from enum import Enum
import enum
import typing
@typing.final
class AllQueuesFullError(builtins.Exception):
def __new__(cls, *args: typing.Any) -> AllQueuesFullError: ...
def __repr__(self) -> builtins.str: ...
def __str__(self) -> builtins.str: ...
@typing.final
class ConnectionUpdate:
@property
def update_type(self) -> ConnectionUpdateType:
@@ -26,6 +34,7 @@ class ConnectionUpdate:
Remote connection's TCP port.
"""
@typing.final
class Keypair:
r"""
Identity keypair of a node.
@@ -46,12 +55,12 @@ class Keypair:
Generate a new Secp256k1 keypair.
"""
@staticmethod
def from_protobuf_encoding(bytes:bytes) -> Keypair:
def from_protobuf_encoding(bytes: bytes) -> Keypair:
r"""
Decode a private key from a protobuf structure and parse it as a `Keypair`.
"""
@staticmethod
def rsa_from_pkcs8(bytes:bytes) -> Keypair:
def rsa_from_pkcs8(bytes: bytes) -> Keypair:
r"""
Decode a keypair from a DER-encoded secret key in PKCS#8 `PrivateKeyInfo`
format (i.e. unencrypted) as defined in [RFC5208].
@@ -59,7 +68,7 @@ class Keypair:
[RFC5208]: https://tools.ietf.org/html/rfc5208#section-5
"""
@staticmethod
def secp256k1_from_der(bytes:bytes) -> Keypair:
def secp256k1_from_der(bytes: bytes) -> Keypair:
r"""
Decode a keypair from a DER-encoded Secp256k1 secret key in an `ECPrivateKey`
structure as defined in [RFC5915].
@@ -67,7 +76,7 @@ class Keypair:
[RFC5915]: https://tools.ietf.org/html/rfc5915
"""
@staticmethod
def ed25519_from_bytes(bytes:bytes) -> Keypair: ...
def ed25519_from_bytes(bytes: bytes) -> Keypair: ...
def to_protobuf_encoding(self) -> bytes:
r"""
Encode a private key as protobuf structure.
@@ -77,6 +86,7 @@ class Keypair:
Convert the `Keypair` into the corresponding `PeerId`.
"""
@typing.final
class Multiaddr:
r"""
Representation of a Multiaddr.
@@ -87,17 +97,17 @@ class Multiaddr:
Create a new, empty multiaddress.
"""
@staticmethod
def with_capacity(n:builtins.int) -> Multiaddr:
def with_capacity(n: builtins.int) -> Multiaddr:
r"""
Create a new, empty multiaddress with the given capacity.
"""
@staticmethod
def from_bytes(bytes:bytes) -> Multiaddr:
def from_bytes(bytes: bytes) -> Multiaddr:
r"""
Parse a `Multiaddr` value from its byte slice representation.
"""
@staticmethod
def from_string(string:builtins.str) -> Multiaddr:
def from_string(string: builtins.str) -> Multiaddr:
r"""
Parse a `Multiaddr` value from its string representation.
"""
@@ -118,13 +128,14 @@ class Multiaddr:
Convert a Multiaddr to a string.
"""
@typing.final
class NetworkingHandle:
def __new__(cls, identity:Keypair) -> NetworkingHandle: ...
def __new__(cls, identity: Keypair) -> NetworkingHandle: ...
async def connection_update_recv(self) -> ConnectionUpdate:
r"""
Receives the next `ConnectionUpdate` from networking.
"""
async def connection_update_recv_many(self, limit:builtins.int) -> builtins.list[ConnectionUpdate]:
async def connection_update_recv_many(self, limit: builtins.int) -> builtins.list[ConnectionUpdate]:
r"""
Receives at most `limit` `ConnectionUpdate`s from networking and returns them.
@@ -132,19 +143,19 @@ class NetworkingHandle:
For `limit > 0`, if there are no `ConnectionUpdate`s in the channel's queue this method
will sleep until a `ConnectionUpdate` is sent.
"""
async def gossipsub_subscribe(self, topic:builtins.str) -> builtins.bool:
async def gossipsub_subscribe(self, topic: builtins.str) -> builtins.bool:
r"""
Subscribe to a `GossipSub` topic.
Returns `True` if the subscription worked. Returns `False` if we were already subscribed.
"""
async def gossipsub_unsubscribe(self, topic:builtins.str) -> builtins.bool:
async def gossipsub_unsubscribe(self, topic: builtins.str) -> builtins.bool:
r"""
Unsubscribes from a `GossipSub` topic.
Returns `True` if we were subscribed to this topic. Returns `False` if we were not subscribed.
"""
async def gossipsub_publish(self, topic:builtins.str, data:bytes) -> None:
async def gossipsub_publish(self, topic: builtins.str, data: bytes) -> None:
r"""
Publishes a message on the given topic to the `GossipSub` network.
@@ -154,7 +165,7 @@ class NetworkingHandle:
r"""
Receives the next message from the `GossipSub` network.
"""
async def gossipsub_recv_many(self, limit:builtins.int) -> builtins.list[tuple[builtins.str, bytes]]:
async def gossipsub_recv_many(self, limit: builtins.int) -> builtins.list[tuple[builtins.str, bytes]]:
r"""
Receives at most `limit` messages from the `GossipSub` network and returns them.
@@ -163,11 +174,13 @@ class NetworkingHandle:
will sleep until a message is sent.
"""
@typing.final
class NoPeersSubscribedToTopicError(builtins.Exception):
def __new__(cls, *args) -> NoPeersSubscribedToTopicError: ...
def __new__(cls, *args: typing.Any) -> NoPeersSubscribedToTopicError: ...
def __repr__(self) -> builtins.str: ...
def __str__(self) -> builtins.str: ...
@typing.final
class PeerId:
r"""
Identifier of a peer of the network.
@@ -183,7 +196,7 @@ class PeerId:
This is useful for randomly walking on a DHT, or for testing purposes.
"""
@staticmethod
def from_bytes(bytes:bytes) -> PeerId:
def from_bytes(bytes: bytes) -> PeerId:
r"""
Parses a `PeerId` from bytes.
"""
@@ -198,7 +211,8 @@ class PeerId:
def __repr__(self) -> builtins.str: ...
def __str__(self) -> builtins.str: ...
class ConnectionUpdateType(Enum):
@typing.final
class ConnectionUpdateType(enum.Enum):
r"""
Connection or disconnection event discriminant type.
"""

View File

@@ -65,6 +65,40 @@ mod exception {
Self::MSG.to_string()
}
}
#[gen_stub_pyclass]
#[pyclass(frozen, extends=PyException, name="AllQueuesFullError")]
pub struct PyAllQueuesFullError {}
impl PyAllQueuesFullError {
const MSG: &'static str = "All libp2p peers are unresponsive, resend the message or reconnect.";
/// Creates a new [`PyErr`] of this type.
///
/// [`PyErr`]: https://docs.rs/pyo3/latest/pyo3/struct.PyErr.html "PyErr in pyo3"
pub(crate) fn new_err() -> PyErr {
PyErr::new::<Self, _>(()) // TODO: check if this needs to be replaced???
}
}
#[gen_stub_pymethods]
#[pymethods]
impl PyAllQueuesFullError {
#[new]
#[pyo3(signature = (*args))]
#[allow(unused_variables)]
pub(crate) fn new(args: &Bound<'_, PyTuple>) -> Self {
Self {}
}
fn __repr__(&self) -> String {
format!("PeerId(\"{}\")", Self::MSG)
}
fn __str__(&self) -> String {
Self::MSG.to_string()
}
}
}
/// Connection or disconnection event discriminant type.
@@ -167,7 +201,7 @@ async fn networking_task(
let pyresult: PyResult<MessageId> = if let Err(PublishError::NoPeersSubscribedToTopic) = result {
Err(exception::PyNoPeersSubscribedToTopicError::new_err())
} else if let Err(PublishError::AllQueuesFull(_)) = result {
Err(exception::PyNoPeersSubscribedToTopicError::new_err())
Err(exception::PyAllQueuesFullError::new_err())
} else {
result.pyerr()
};
@@ -526,6 +560,7 @@ impl PyNetworkingHandle {
pub fn networking_submodule(m: &Bound<'_, PyModule>) -> PyResult<()> {
m.add_class::<exception::PyNoPeersSubscribedToTopicError>()?;
m.add_class::<exception::PyAllQueuesFullError>()?;
m.add_class::<PyConnectionUpdateType>()?;
m.add_class::<PyConnectionUpdate>()?;

View File

@@ -13,6 +13,7 @@ pub type Swarm = libp2p::Swarm<Behaviour>;
/// this be passed in as a parameter? What about rapidly changing versions in debug builds?
/// this is all VERY very hard to figure out and needs to be mulled over as a team.
pub const NETWORK_VERSION: &[u8] = b"v0.0.1";
pub const OVERRIDE_VERSION_ENV_VAR: &str = "EXO_LIBP2P_NAMESPACE";
/// Create and configure a swarm which listens to all ports on OS
pub fn create_swarm(keypair: identity::Keypair) -> alias::AnyResult<Swarm> {
@@ -29,20 +30,27 @@ pub fn create_swarm(keypair: identity::Keypair) -> alias::AnyResult<Swarm> {
mod transport {
use crate::alias;
use crate::swarm::NETWORK_VERSION;
use crate::swarm::{NETWORK_VERSION, OVERRIDE_VERSION_ENV_VAR};
use futures::{AsyncRead, AsyncWrite};
use keccak_const::Sha3_256;
use libp2p::core::muxing;
use libp2p::core::transport::Boxed;
use libp2p::pnet::{PnetError, PnetOutput};
use libp2p::{PeerId, Transport, identity, noise, pnet, yamux};
use std::{sync::LazyLock, env};
/// Key used for networking's private network; parametrized on the [`NETWORK_VERSION`].
/// See [`pnet_upgrade`] for more.
const PNET_PRESHARED_KEY: [u8; 32] = Sha3_256::new()
.update(b"exo_discovery_network")
.update(NETWORK_VERSION)
.finalize();
static PNET_PRESHARED_KEY: LazyLock<[u8; 32]> = LazyLock::new(|| {
let builder = Sha3_256::new().update(b"exo_discovery_network");
if let Ok(var) = env::var(OVERRIDE_VERSION_ENV_VAR) {
let bytes = var.into_bytes();
builder.update(&bytes)
} else {
builder.update(NETWORK_VERSION)
}.finalize()
});
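// Illustrative effect of the override (spelled out from the derivation above,
// not a documented guarantee): with EXO_LIBP2P_NAMESPACE=staging set, the
// preshared key becomes SHA3-256("exo_discovery_network" || "staging"), so
// nodes in different namespaces derive different pnet keys and never complete
// a handshake with each other.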
/// Make the Swarm run on a private network, so as not to clash with public libp2p nodes and
/// also with different-versioned instances of this same network.
@@ -55,7 +63,7 @@ mod transport {
TSocket: AsyncRead + AsyncWrite + Send + Unpin + 'static,
{
use pnet::{PnetConfig, PreSharedKey};
PnetConfig::new(PreSharedKey::new(PNET_PRESHARED_KEY))
PnetConfig::new(PreSharedKey::new(*PNET_PRESHARED_KEY))
.handshake(socket)
.await
}

View File

@@ -1,65 +0,0 @@
#!/usr/bin/env bash
# bulk_scp.sh — Sync a local repo to many hosts, respecting .gitignore and continuing even if
# some hosts fail. Tested on macOS Bash 3.x.
#
# ------------ User-tunable variables ------------
LOCAL_DIR="." # Local directory you want to send
REMOTE_DIR="~/exo" # Destination directory on the remote machines
HOSTS_FILE="hosts.json" # JSON array of hosts (["user@ip", ...])
# ------------ End of user-tunable section -------
set -uo pipefail # Treat unset vars as error; fail pipelines, but we handle exit codes ourselves
if [ "$#" -ne 1 ]; then
echo "Usage: $0 <password>" >&2
exit 1
fi
PASSWORD="$1"
# Dependency checks
for cmd in sshpass jq rsync git; do
if ! command -v "$cmd" >/dev/null 2>&1; then
echo "Error: $cmd is required but not installed." >&2
exit 1
fi
done
# Verify hosts file exists
if [ ! -f "$HOSTS_FILE" ]; then
echo "Error: Hosts file '$HOSTS_FILE' not found." >&2
exit 1
fi
# Build a temporary exclude file containing every Gitignored path
EXCLUDE_FILE=$(mktemp)
trap 'rm -f "$EXCLUDE_FILE"' EXIT
if git -C "$LOCAL_DIR" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
git -C "$LOCAL_DIR" ls-files -z -o -i --exclude-standard \
| tr '\0' '\n' > "$EXCLUDE_FILE"
else
# Fallback: just use toplevel .gitignore if present
[ -f "$LOCAL_DIR/.gitignore" ] && cat "$LOCAL_DIR/.gitignore" > "$EXCLUDE_FILE"
fi
# Iterate over hosts — process substitution keeps stdin free for rsync/ssh
while IFS= read -r TARGET || [ -n "$TARGET" ]; do
[ -z "$TARGET" ] && continue # skip blanks
echo "\n—— Syncing $LOCAL_DIR$TARGET:$REMOTE_DIR ——"
# # Ensure remote directory exists (ignore failure but report)
# if ! sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no "$TARGET" "mkdir -p $REMOTE_DIR" </dev/null; then
# echo "✗ Failed to create $REMOTE_DIR on $TARGET" >&2
# continue # move on to next host
# fi
# Rsync with checksums; redirect stdin so rsync/ssh can't eat host list
if sshpass -p "$PASSWORD" rsync -azc --delete --exclude-from="$EXCLUDE_FILE" \
-e "ssh -o StrictHostKeyChecking=no" \
"$LOCAL_DIR/" "$TARGET:$REMOTE_DIR/" </dev/null; then
echo "✓ Success: $TARGET"
else
echo "✗ Failed: $TARGET" >&2
fi
done < <(jq -r '.[]' "$HOSTS_FILE")

View File

View File

@@ -1,80 +0,0 @@
import hashlib
import os
import sys
EXCLUDE_DIRS = {".git", "build", "vendor", ".idea", ".vscode", "__pycache__"}
def norm_rel(path: str, base: str) -> str:
"""Forwarder-rootrelative path with '/' separators."""
abs_path = os.path.abspath(path)
abs_base = os.path.abspath(base)
rel = os.path.relpath(abs_path, abs_base)
return rel.replace(os.sep, "/")
def collect_files(arg_path: str) -> tuple[str, list[str]]:
# Resolve forwarder_root and src_root from the provided path
p = os.path.abspath(arg_path)
if not os.path.isdir(p):
sys.stderr.write(f"error: path must be a directory: {arg_path}\n")
sys.exit(2)
if os.path.basename(p) == "src":
forwarder_root = os.path.dirname(p)
src_root = p
else:
forwarder_root = p
src_root = os.path.join(forwarder_root, "src")
files = []
# 1) Include .go files under src, excluding *_test.go
if os.path.isdir(src_root):
for root, dirs, filenames in os.walk(src_root):
# prune excluded dirs
dirs[:] = [d for d in dirs if d not in EXCLUDE_DIRS]
for name in filenames:
# strict .go, exclude *_test.go
if not name.lower().endswith(".go"):
continue
if name.lower().endswith("_test.go"):
continue
files.append(os.path.join(root, name))
# 2) Add go.mod, go.sum, main.go from the forwarder root
for name in ("go.mod", "go.sum", "main.go"):
pth = os.path.join(forwarder_root, name)
if os.path.isfile(pth):
# defensive: exclude *_test.go at root too
if name.lower().endswith("_test.go"):
continue
files.append(pth)
# Deduplicate and sort deterministically by forwarder-root-relative path
files: list[str] = sorted(set(files), key=lambda f: norm_rel(f, forwarder_root))
return forwarder_root, files
def hash_files(forwarder_root: str, files: list[str]) -> str:
h = hashlib.sha256()
for fp in files:
rel = norm_rel(fp, forwarder_root)
h.update(b"F\x00")
h.update(rel.encode("utf-8"))
h.update(b"\x00")
with open(fp, "rb") as f:
for chunk in iter(lambda: f.read(256 * 1024), b""):
h.update(chunk)
h.update(b"\n")
return h.hexdigest()
def main():
if len(sys.argv) > 1:
arg = sys.argv[1]
else:
arg = os.path.join("networking", "forwarder", "src")
forwarder_root, files = collect_files(arg)
digest = hash_files(forwarder_root, files)
# print without trailing newline (easier to capture in shell)
sys.stdout.write(digest)
if __name__ == "__main__":
main()
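# Illustrative usage (this file's name is not shown in the diff, so the
# invocation below is hypothetical):
#   $ python hash_src.py networking/forwarder/src
# prints a single SHA-256 hex digest with no trailing newline, convenient for
#   $ DIGEST=$(python hash_src.py)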

View File

@@ -1,17 +0,0 @@
[project]
name = "exo-scripts"
version = "0.1.0"
description = "Scripts for the Exo project"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
"huggingface_hub>=0.33.4",
"exo"
]
[build-system]
requires = ["uv_build>=0.8.9,<0.9.0"]
build-backend = "uv_build"
[tool.uv.sources]
exo = { workspace = true }

View File

View File

@@ -1,511 +0,0 @@
import asyncio
import json
import argparse
import sys
import time
from dataclasses import is_dataclass, asdict
from logging import getLogger
from typing import List, Optional, Any, Sequence, Tuple
# Your existing imports — unchanged
from exo.shared.types.state import State
from exo.shared.apply import apply
from exo.shared.db.sqlite.event_log_manager import EventLogManager, EventLogConfig
from exo.shared.types.events.components import EventFromEventLog
from exo.shared.types.events import Event
# --- Third-party UI (new) ---
from rich.syntax import Syntax
from rich.text import Text
from rich.panel import Panel
from rich.console import RenderableType
from textual.app import App, ComposeResult
from textual.containers import Horizontal, Vertical
from textual.widgets import Static, ListView, ListItem, Input, Footer, Label
from textual.reactive import reactive
from textual import on
from textual.binding import Binding
from textual.message import Message
logger = getLogger("helper_log")
# Worker-related event types (same set)
WORKER_EVENT_TYPES = {
'TaskCreated', 'TaskStateUpdated', 'TaskFailed', 'TaskDeleted',
'ChunkGenerated',
'InstanceCreated', 'InstanceDeleted', 'InstanceActivated', 'InstanceDeactivated', 'InstanceReplacedAtomically',
'RunnerStatusUpdated', 'RunnerDeleted'
}
# ---------- Data / DB helpers (mostly your original logic) ----------
event_log_manager: Optional[EventLogManager] = None
async def init_db() -> None:
global event_log_manager
event_log_manager = EventLogManager(EventLogConfig())
await event_log_manager.initialize()
async def get_events_since(since: int) -> Sequence[EventFromEventLog[Event]]:
# type: ignore[attr-defined, return-value]
return await event_log_manager.global_events.get_events_since(since)
async def load_all_events() -> List[EventFromEventLog[Event]]:
events: List[EventFromEventLog[Event]] = []
since = 0
while True:
new_events = await get_events_since(since)
if not new_events:
break
events.extend(new_events)
since += len(new_events)
return events
def compute_states(events: List[EventFromEventLog[Event]]) -> List[State]:
states: List[State] = [State()]
state = states[0]
for event in events:
state = apply(state, event)
states.append(state)
return states
def filter_worker_state(state: State) -> dict:
state_dict = json.loads(state.model_dump_json())
return {
'node_status': state_dict.get('node_status', {}),
'instances': state_dict.get('instances', {}),
'runners': state_dict.get('runners', {}),
'tasks': state_dict.get('tasks', {}),
'last_event_applied_idx': state_dict.get('last_event_applied_idx', 0)
}
def event_type_name(e: EventFromEventLog[Event]) -> str:
return type(e.event).__name__
def is_worker_event(e: EventFromEventLog[Event]) -> bool:
return event_type_name(e) in WORKER_EVENT_TYPES
def safe_json(obj: Any) -> str:
"""Serialize unknown objects to JSON-ish string safely."""
def to_serializable(x: Any):
try:
if is_dataclass(x):
return asdict(x)
except Exception:
pass
if isinstance(x, (str, int, float, bool)) or x is None:
return x
if isinstance(x, dict):
return {str(k): to_serializable(v) for k, v in x.items()}
if isinstance(x, (list, tuple, set)):
return [to_serializable(v) for v in x]
try:
json.dumps(x) # type: ignore
return x
except Exception:
return repr(x)
try:
return json.dumps(to_serializable(obj), indent=2, ensure_ascii=False)
except Exception:
# Last resort
return repr(obj)
def summarize_event_line(e: EventFromEventLog[Event], max_len: int = 160) -> Text:
etype = event_type_name(e)
attrs = vars(e.event)
prefix = Text(f"[{e.idx_in_log}] ", style="bold dim")
t = Text(etype, style="bold cyan")
t = prefix + t + Text(": ", style="dim")
first = True
for k, v in attrs.items():
if not first:
t.append(", ", style="dim")
first = False
t.append(str(k), style="magenta")
t.append("=")
# Coarse coloring by type
if isinstance(v, str):
t.append(repr(v), style="green")
elif isinstance(v, (int, float)):
t.append(repr(v), style="yellow")
elif isinstance(v, bool):
t.append(repr(v), style="cyan")
else:
t.append(repr(v), style="")
if len(t.plain) > max_len:
t.truncate(max_len - 1)
t.append("", style="dim")
return t
def event_detail_renderable(e: EventFromEventLog[Event]) -> RenderableType:
payload = {
"idx_in_log": e.idx_in_log,
"event_type": event_type_name(e),
"attributes": vars(e.event)
}
return Syntax(safe_json(payload), "json", word_wrap=True)
# ---------- Non-TUI (stdout) mode, like your current script ----------
async def run_non_tui(worker_mode: bool) -> None:
await init_db()
events = await load_all_events()
states = compute_states(events)
final_state = states[-1]
if worker_mode:
filtered_events = [e for e in events if is_worker_event(e)]
events = filtered_events
filtered_state = filter_worker_state(final_state)
print("Final State (filtered):")
print(json.dumps(filtered_state, indent=2))
else:
print("Final State:")
print(final_state.model_dump_json(indent=2))
print("\nEvents:")
for e in events:
etype = event_type_name(e)
attrs = ', '.join(f"{k}={value!r}" for k, value in vars(e.event).items())
print(f"[{e.idx_in_log}] {etype}: {attrs}")
# ---------- Textual TUI ----------
class StateView(Static):
"""Left pane: shows state JSON, with optional worker filter."""
def update_state(self, state: State, worker_mode: bool, index_in_log_for_status: Optional[int]) -> None:
if worker_mode:
data = filter_worker_state(state)
json_str = json.dumps(data, indent=2, ensure_ascii=False)
else:
json_str = state.model_dump_json(indent=2)
syntax = Syntax(json_str, "json", word_wrap=True)
title = f"State after event #{index_in_log_for_status}" if index_in_log_for_status is not None else "Initial State"
self.update(Panel(syntax, title=title, border_style="cyan"))
class EventListItem(ListItem):
def __init__(self, e: EventFromEventLog[Event]) -> None:
super().__init__(Static(summarize_event_line(e)))
self._event = e
@property
def wrapped_event(self) -> EventFromEventLog[Event]:
return self._event
class EventDetail(Static):
"""Right-bottom: details of the selected event."""
def show_event(self, e: Optional[EventFromEventLog[Event]]) -> None:
if e is None:
self.update(Panel(Text("No event selected.", style="dim"), title="Event Details"))
else:
self.update(Panel(event_detail_renderable(e), title=f"Event #{e.idx_in_log} · {event_type_name(e)}", border_style="magenta"))
class StatusBar(Static):
def set_status(self, realtime: bool, total_events: int, current_idx_in_log: Optional[int]) -> None:
mode = "Realtime" if realtime else "Timetravel"
parts = [
f"[{mode}]",
f"Events: {total_events}",
]
if current_idx_in_log is not None:
parts.append(f"Current: #{current_idx_in_log}")
parts.append("Keys: ↑/↓ Select • PgUp/PgDn Scroll • Ctrl+↑/↓ ±5 • [/] State PgUp/PgDn • g Goto • r Realtime • q Quit")
self.update(Text(" ".join(parts), style="dim"))
class GotoPrompt(Static):
"""Simple inline goto prompt (appears above Footer)."""
class Submitted(Message):
def __init__(self, value: Optional[int]) -> None:
super().__init__()
self.value = value
def compose(self) -> ComposeResult:
yield Label("Go to event id (idx_in_log):", id="goto-label")
yield Input(placeholder="e.g., 123", id="goto-input")
def on_mount(self) -> None:
self.query_one(Input).focus()
@on(Input.Submitted)
def _submitted(self, event: Input.Submitted) -> None:
text = (event.value or "").strip()
try:
value = int(text)
except ValueError:
value = None
self.post_message(self.Submitted(value))
class EventLogApp(App):
CSS = """
Screen {
layout: vertical;
}
#main {
height: 1fr;
}
#left {
width: 60%;
}
#right {
width: 40%;
}
#events {
height: 3fr;
}
#detail {
height: 2fr;
border: tall;
}
#status {
height: 1;
padding: 0 1;
}
#goto {
dock: bottom;
height: 3;
padding: 1 2;
background: $panel;
border: round $accent;
}
"""
BINDINGS = [
Binding("q", "quit", "Quit"),
Binding("r", "toggle_realtime", "Realtime"),
Binding("[", "state_page_up", "State PgUp"),
Binding("]", "state_page_down", "State PgDn"),
Binding("g", "prompt_goto", "Goto"),
Binding("ctrl+up", "jump_up", "Jump Up"),
Binding("ctrl+down", "jump_down", "Jump Down"),
]
# Reactive state
realtime: reactive[bool] = reactive(False)
worker_mode: bool
# Data
wrapped_events: List[EventFromEventLog[Event]]
states: List[State]
filtered_indices: Optional[List[int]] # maps filtered idx -> original idx
update_interval: float = 1.0
_poll_timer = None
def __init__(self, worker_mode: bool) -> None:
super().__init__()
self.worker_mode = worker_mode
self.wrapped_events = []
self.states = [State()]
self.filtered_indices = None
async def on_mount(self) -> None:
await init_db()
await self._initial_load()
# periodic polling for new events
self._poll_timer = self.set_interval(self.update_interval, self._tick_poll)
# Put list selection at end (last event) by default
self._select_last()
async def _initial_load(self) -> None:
self.wrapped_events = await load_all_events()
self.states = compute_states(self.wrapped_events)
# Build filtered view if needed
if self.worker_mode:
self.filtered_indices = [i for i, e in enumerate(self.wrapped_events) if is_worker_event(e)]
else:
self.filtered_indices = None
# Populate the ListView
lv = self.query_one("#events", ListView)
lv.clear()
events_to_show = self._view_events()
for e in events_to_show:
lv.append(EventListItem(e))
# Update left state & details
self._refresh_views()
def compose(self) -> ComposeResult:
# Layout: [Header optional] -> main Horizontal -> Status bar + Footer
with Horizontal(id="main"):
with Vertical(id="left"):
yield StateView(id="state")
with Vertical(id="right"):
yield ListView(id="events")
yield EventDetail(id="detail")
yield StatusBar(id="status")
yield Footer()
def _current_original_index(self) -> int:
lv = self.query_one("#events", ListView)
idx = lv.index
if idx is None or idx < 0:
return -1
if self.filtered_indices is not None:
if idx >= len(self.filtered_indices):
return -1
return self.filtered_indices[idx]
return idx
def _view_events(self) -> List[EventFromEventLog[Event]]:
if self.filtered_indices is not None:
return [self.wrapped_events[i] for i in self.filtered_indices]
return self.wrapped_events
def _select_last(self) -> None:
lv = self.query_one("#events", ListView)
n = len(lv.children)
if n:
lv.index = n - 1
def _refresh_views(self) -> None:
# Update State pane and Detail pane and Status bar
original_idx = self._current_original_index()
state_idx = (original_idx + 1) if original_idx >= 0 else 0
state = self.states[state_idx]
state_view = self.query_one("#state", StateView)
idx_in_log = None
if original_idx >= 0:
idx_in_log = self.wrapped_events[original_idx].idx_in_log
state_view.update_state(state, self.worker_mode, idx_in_log)
# Detail pane
detail = self.query_one("#detail", EventDetail)
current_event = self.wrapped_events[original_idx] if original_idx >= 0 else None
detail.show_event(current_event)
# Status bar
status = self.query_one("#status", StatusBar)
total_events = len(self.wrapped_events)
status.set_status(self.realtime, total_events, current_event.idx_in_log if current_event else None)
async def _poll_once(self) -> bool:
"""Fetch and append new events; return True if updated."""
last_since = len(self.wrapped_events)
new_wrapped = await get_events_since(last_since)
if not new_wrapped:
return False
# Extend states incrementally (avoid recomputing all)
for nw in new_wrapped:
state = self.states[-1]
self.states.append(apply(state, nw))
start_len = len(self.wrapped_events)
self.wrapped_events.extend(new_wrapped)
# Update filtered mapping and UI list
lv = self.query_one("#events", ListView)
if self.worker_mode:
if self.filtered_indices is None:
self.filtered_indices = []
for k in range(start_len, len(self.wrapped_events)):
if is_worker_event(self.wrapped_events[k]):
self.filtered_indices.append(k)
lv.append(EventListItem(self.wrapped_events[k]))
else:
for k in range(start_len, len(self.wrapped_events)):
lv.append(EventListItem(self.wrapped_events[k]))
# Auto-follow the tail in realtime mode
if self.realtime:
self._select_last()
# Refresh panes
self._refresh_views()
return True
def _tick_poll(self) -> None:
# called by timer; schedule the async poll
asyncio.create_task(self._poll_once())
# ------ Actions / key handlers ------
def action_quit(self) -> None:
self.exit()
def action_toggle_realtime(self) -> None:
self.realtime = not self.realtime
if self.realtime:
self._select_last()
self._refresh_views()
def action_state_page_up(self) -> None:
state_view = self.query_one("#state", StateView)
state_view.scroll_page_up()
def action_state_page_down(self) -> None:
state_view = self.query_one("#state", StateView)
state_view.scroll_page_down()
def action_jump_up(self) -> None:
lv = self.query_one("#events", ListView)
if lv.children:
lv.index = max(0, (lv.index or 0) - 5)
self._refresh_views()
def action_jump_down(self) -> None:
lv = self.query_one("#events", ListView)
if lv.children:
lv.index = min(len(lv.children) - 1, (lv.index or 0) + 5)
self._refresh_views()
def action_prompt_goto(self) -> None:
# mount a small prompt near bottom
if self.query("#goto"):
return
prompt = GotoPrompt(id="goto")
self.mount(prompt)
@on(GotoPrompt.Submitted)
def _on_goto_submitted(self, msg: GotoPrompt.Submitted) -> None:
# Remove prompt
for node in self.query("#goto"):
node.remove()
if msg.value is None:
return
target = msg.value
# find in current view's idx_in_log
events_to_show = self._view_events()
lv = self.query_one("#events", ListView)
for i, e in enumerate(events_to_show):
if e.idx_in_log == target:
lv.index = i
self._refresh_views()
break
@on(ListView.Highlighted, "#events")
@on(ListView.Selected, "#events")
def _on_event_selected(self, *_: Any) -> None:
# Update panes when selection changes
self._refresh_views()
# ---------- Entrypoint ----------
def main() -> None:
parser = argparse.ArgumentParser(description='Read and display events from the event log (Textual UI)')
parser.add_argument('--worker', action='store_true',
help='Only show worker-related events (task, streaming, instance, runner status)')
parser.add_argument('--no-ui', action='store_true',
help='Print to stdout (non-interactive), like the original non-TUI mode')
args = parser.parse_args()
# Non-interactive fallback if no TTY or user requests it
if args.no_ui or not sys.stdout.isatty():
asyncio.run(run_non_tui(worker_mode=args.worker))
return
# TUI mode
app = EventLogApp(worker_mode=args.worker)
app.run()
if __name__ == "__main__":
main()

View File

@@ -1,12 +0,0 @@
import asyncio

from exo.worker.download.download_utils import *
async def main():
meta = await file_meta(
'mlx-community/DeepSeek-R1-4bit',
revision='main',
path='config.json',
redirected_location=None,
)
print(meta)
asyncio.run(main())

View File

@@ -1,284 +0,0 @@
#!/usr/bin/env python3
"""
watch-pull-restart.py — Unix-only
Runs a command, periodically checks git upstream, pulls if upstream is ahead,
and gracefully restarts the command. Watcher logs go to STDERR; your app's
output goes straight to the console (STDOUT/STDERR).
Assumptions:
- current branch tracks an upstream (i.e., @{u} exists)
- pulls must be fast-forward (remote-ahead workflow)
Arguments:
- cmd: Command to run/manage (e.g. './run.sh' or 'python -m app').
- restart-cmd: Optional hook to run after a successful pull (e.g., systemctl restart).
- sleep-secs: Poll interval while up-to-date.
- grace-secs: Seconds to wait after SIGTERM before SIGKILL.
- debounce-secs: Coalesce multiple pulls before restart.
Usage:
./watch-pull-restart.py --cmd "./run.sh" --sleep-secs 1
./watch-pull-restart.py --cmd "python -m app" --restart-cmd "systemctl --user restart myapp"
./watch-pull-restart.py --restart-cmd "systemctl --user restart myapp" # no managed child; only trigger hook
"""
import argparse
import os
import signal
import subprocess
import sys
import time
from types import FrameType
from typing import Optional
# ---------- logging helpers (to STDERR) ----------
def log(msg: str):
sys.stderr.write(msg.rstrip() + "\n")
sys.stderr.flush()
def sep(title: str = ""):
"""Big visual separator for state transitions (to STDERR)."""
sys.stderr.write("\n\n")
if title:
sys.stderr.write(f"===== [watch] {title} =====\n")
else:
sys.stderr.write("===== [watch] =====\n")
sys.stderr.flush()
def run_capture(cmd: str, check: bool = True) -> subprocess.CompletedProcess[str]:
"""Run and capture output; for git plumbing."""
return subprocess.run(
cmd,
shell=True,
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
text=True,
check=check,
)
# ---------- shell helpers ----------
def is_up_to_date() -> bool:
subprocess.run("git fetch --quiet",
shell=True) # Quiet fetch; ignore network errors (we'll just try again next tick)
try:
current = run_capture("git rev-parse HEAD", check=True).stdout.strip()
upstream = run_capture("git rev-parse @{u}", check=True).stdout.strip()
return current == upstream
except subprocess.CalledProcessError:
return True # No upstream or other git error; treat as up-to-date to avoid thrash
def pull_ff_only() -> bool:
"""Returns True if pull applied changes, False if already up-to-date."""
try:
cp = run_capture("git pull --ff-only --no-rebase", check=True)
return "Already up to date" not in cp.stdout and cp.returncode == 0 # Git prints "Already up to date." on no-op; cheap heuristic
except subprocess.CalledProcessError as e:
log("[watch] git pull failed:")
if e.stdout: # pyright: ignore[reportAny]
log(e.stdout) # pyright: ignore[reportAny]
if e.stderr: # pyright: ignore[reportAny]
log(e.stderr) # pyright: ignore[reportAny]
return False
# ---------- managed processes ----------
class ManagedProc:
def __init__(self, cmd: Optional[str], grace_secs: float):
self.cmd = cmd
self.grace = grace_secs
self.child: Optional[subprocess.Popen[bytes]] = None
def start(self):
if not self.cmd:
return
if self.child and self.child.poll() is None:
return
sep("starting main cmd")
log(f"[watch] starting: {self.cmd}")
# New process group so we can signal the entire tree (shell + children)
self.child = subprocess.Popen(
self.cmd,
shell=True, # allow shell features in --cmd
stdout=None, # inherit parent's stdout (your app prints normally)
stderr=None, # inherit parent's stderr
stdin=None,
preexec_fn=os.setsid, # create new session (PGID == child PID)
)
def stop_gracefully(self):
if not self.child:
return
if self.child.poll() is not None:
self.child = None
return
sep("stopping main cmd (SIGTERM)")
try:
os.killpg(self.child.pid, signal.SIGTERM)
except ProcessLookupError:
pass
deadline = time.time() + self.grace
while time.time() < deadline:
if self.child.poll() is not None:
self.child = None
return
time.sleep(0.1)
sep("main cmd unresponsive; SIGKILL")
try:
os.killpg(self.child.pid, signal.SIGKILL)
except ProcessLookupError:
pass
self.child = None
def forward_signal(self, sig: int):
if not self.child or self.child.poll() is not None:
return
try:
os.killpg(self.child.pid, sig)
except ProcessLookupError:
pass
class OneShotHook:
"""
One-shot hook command (e.g., systemctl restart).
Runs to completion with inherited stdio so its output is visible.
"""
def __init__(self, cmd: Optional[str], grace_secs: float):
self.cmd = cmd
self.grace = grace_secs
self.child: Optional[subprocess.Popen[bytes]] = None
def run(self) -> int:
if not self.cmd:
return 0
sep("running restart hook")
log(f"[watch] hook: {self.cmd}")
self.child = subprocess.Popen(
self.cmd,
shell=True,
stdout=None, # inherit stdio
stderr=None,
stdin=None,
preexec_fn=os.setsid,
)
# Wait with grace/kill if needed (rare for hooks, but symmetric)
deadline = time.time() + self.grace
while True:
rc = self.child.poll()
if rc is not None:
self.child = None
return rc
if time.time() > deadline:
sep("hook exceeded grace; SIGKILL")
try:
os.killpg(self.child.pid, signal.SIGKILL)
except ProcessLookupError:
pass
self.child = None
return 137 # killed
time.sleep(0.1)
def forward_signal(self, sig: int):
if not self.child or self.child.poll() is not None:
return
try:
os.killpg(self.child.pid, sig)
except ProcessLookupError:
pass
# ---------- main loop ----------
def main():
# CMD commands
ap = argparse.ArgumentParser(description="Auto-pull & restart on upstream changes (Unix).")
ap.add_argument("--cmd", help="Command to run/manage (e.g. './run.sh' or 'python -m app').")
ap.add_argument("--restart-cmd", help="Optional hook to run after a successful pull (e.g., systemctl restart).")
ap.add_argument("--sleep-secs", type=float, default=0.5, help="Poll interval while up-to-date.")
ap.add_argument("--grace-secs", type=float, default=5.0, help="Seconds to wait after SIGTERM before SIGKILL.")
ap.add_argument("--debounce-secs", type=float, default=0.5, help="Coalesce multiple pulls before restart.")
args = ap.parse_args()
# get CMD command values
cmd = args.cmd # pyright: ignore[reportAny]
assert cmd is None or isinstance(cmd, str)
restart_cmd = args.restart_cmd # pyright: ignore[reportAny]
assert restart_cmd is None or isinstance(restart_cmd, str)
sleep_secs = args.sleep_secs # pyright: ignore[reportAny]
assert sleep_secs is not None and isinstance(sleep_secs, float)
grace_secs = args.grace_secs # pyright: ignore[reportAny]
assert grace_secs is not None and isinstance(grace_secs, float)
debounce_secs = args.debounce_secs # pyright: ignore[reportAny]
assert debounce_secs is not None and isinstance(debounce_secs, float)
# start managed proc
proc = ManagedProc(cmd, grace_secs)
hook = OneShotHook(restart_cmd, grace_secs)
# signal handling for graceful exit
exiting = {"flag": False}
def _handle(sig_num: int, _frame: Optional[FrameType]):
sep(f"received signal {sig_num}; exiting")
exiting["flag"] = True
proc.forward_signal(sig_num)
hook.forward_signal(sig_num)
signal.signal(signal.SIGINT, _handle)
signal.signal(signal.SIGTERM, _handle)
# Initial start (if managing a process)
proc.start()
pending_restart = False
last_change = 0.0
while not exiting["flag"]:
try:
if not is_up_to_date():
sep("upstream ahead; pulling")
changed = pull_ff_only()
if changed:
last_change = time.time()
pending_restart = True
# handle debounce window
if pending_restart and (time.time() - last_change) >= debounce_secs:
# Optional hook first
if restart_cmd:
rc = hook.run()
if rc != 0:
sep(f"hook exited with {rc}")
# Then bounce managed process
if cmd:
proc.stop_gracefully()
proc.start()
pending_restart = False
sep("restart cycle complete")
# keep the child alive if it crashed without a pull
if cmd and (proc.child is None or proc.child.poll() is not None):
sep("main cmd exited; restarting")
proc.start()
time.sleep(sleep_secs)
except Exception as e:
sep("loop error")
log(f"[watch] {e}")
time.sleep(2.0)
# graceful shutdown on exit
proc.stop_gracefully()
sep("bye")
if __name__ == "__main__":
main()

View File

@@ -1,22 +0,0 @@
there are now 2 scripts:
1. scp_repo.sh, called like "./scp_repo.sh {password}",
where password is the password for the studios. call this from the
root of the repo and it will send any differences in your local repo
to the machines. this should only be needed when things have changed
2. run_remote.sh, also called like "./run_remote.sh {password}",
which kills all running exo processes and starts new ones with fresh dbs
both of these use the file hosts.json, which is a json list of strings
of the form user@ip, where you put the studios with their username
and THUNDERBOLT ips (get these manually from the machines after all of
them and your laptop are hooked up via tb5 and have ips on the thunderbolt
bridge in settings>network). the order here doesn't matter EXCEPT for the
first entry, which will be the master: the script runs ./run.sh -c on the
first entry in that list and ./run.sh -rc on all the others
separately, there is now a nodes.json, which is also a list of strings, but this
time of the node ids of the machines (the uuid that gets generated in python
and printed when the process starts). here you do need them in the exact
order the machines are connected in via thunderbolt. this is used to prefer
spawning models across machines 1-2 and then 3-4, in that order, if doable
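for reference, with made-up values, hosts.json could look like
["studio@169.254.1.2", "studio@169.254.1.3", "studio@169.254.1.4"]
and nodes.json like ["node-uuid-aaaa", "node-uuid-bbbb", "node-uuid-cccc"],
the latter in thunderbolt connection order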

View File

@@ -1,11 +1,32 @@
from abc import ABC, abstractmethod
from functools import partial
from typing import TYPE_CHECKING, Protocol, cast, override
from mlx_lm.models.deepseek_v3 import DeepseekV3MLP
from mlx_lm.models.deepseek_v3 import Model as DeepseekV3Model
from mlx_lm.models.llama import Model as LlamaModel
from mlx_lm.models.qwen3_moe import Model as Qwen3MoeModel
from mlx_lm.models.qwen3_moe import Qwen3MoeSparseMoeBlock
import mlx.core as mx
import mlx.nn as nn # pyright: ignore[reportMissingTypeStubs]
from exo.shared.types.worker.shards import PipelineShardMetadata
from exo.shared.types.worker.shards import (
PipelineShardMetadata,
ShardMetadata,
TensorShardMetadata,
)
from mlx.nn.layers.distributed import ( # type: ignore
shard_inplace, # type: ignore
shard_linear, # type: ignore
sum_gradients, # type: ignore
)
class IdentityLayer(nn.Module):
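"""Stand-in for layers outside this node's shard: passes activations through unchanged so layer indices stay aligned across pipeline ranks."""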
def __init__(self) -> None:
super().__init__()
self.use_sliding = False
@override
def __call__(self, x: mx.array, *args: object, **kwargs: object) -> mx.array:
return x
@@ -70,61 +91,270 @@ class PipelineLastLayer(CustomMlxLayer):
return output
def inner_model(model: nn.Module) -> nn.Module:
inner = getattr(model, "model", None)
if isinstance(inner, nn.Module):
return inner
inner = getattr(model, "transformer", None)
if isinstance(inner, nn.Module):
return inner
raise ValueError("Model must either have a 'model' or 'transformer' attribute")
class ParallelisationShardStrategy(Protocol):
def auto_parallel(
self, model: nn.Module, model_shard_meta: ShardMetadata
) -> nn.Module: ...
# def auto_parallel(model: nn.Module, rank: int, size: int, start_layer: int, end_layer: int) -> nn.Module:
def auto_parallel(
model: nn.Module, model_shard_meta: PipelineShardMetadata
) -> nn.Module:
"""
Automatically parallelize a model across multiple devices.
class PipelineParallelisationStrategy(ParallelisationShardStrategy):
def auto_parallel(
self, model: nn.Module, model_shard_meta: ShardMetadata
) -> nn.Module:
"""
Automatically parallelize a model across multiple devices.
Args:
model: The model to parallelize (must have a 'layers' or 'h' property)
model_shard_meta: The metadata for the model shard
Returns:
The parallelized model
"""
assert isinstance(model_shard_meta, PipelineShardMetadata)
Args:
model: The model to parallelize (must have a 'layers' or 'h' property)
model_shard_meta: The metadata for the model shard
inner_model_instance: nn.Module = PipelineParallelisationStrategy._inner_model(
model
)
Returns:
The parallelized model
"""
inner_model_instance: nn.Module = inner_model(model)
# Handle both model.layers and model.h cases
layers: list[_LayerCallable]
if hasattr(inner_model_instance, "layers"):
layers = cast(list[_LayerCallable], inner_model_instance.layers)
elif hasattr(inner_model_instance, "h"):
layers = cast(list[_LayerCallable], inner_model_instance.h)
else:
raise ValueError("Model must have either a 'layers' or 'h' attribute")
# Handle both model.layers and model.h cases
layers: list[_LayerCallable]
if hasattr(inner_model_instance, "layers"):
layers = cast(list[_LayerCallable], inner_model_instance.layers)
else:
layers = cast(list[_LayerCallable], inner_model_instance.h)
layers[: model_shard_meta.start_layer] = [
IdentityLayer() for _ in range(model_shard_meta.start_layer)
]
layers[model_shard_meta.end_layer :] = [
IdentityLayer() for _ in range(len(layers) - model_shard_meta.end_layer)
]
layers[model_shard_meta.start_layer] = PipelineFirstLayer(
layers[model_shard_meta.start_layer],
model_shard_meta.device_rank,
model_shard_meta.world_size,
)
layers[model_shard_meta.end_layer - 1] = PipelineLastLayer(
layers[model_shard_meta.end_layer - 1],
model_shard_meta.device_rank,
model_shard_meta.world_size,
)
layers[: model_shard_meta.start_layer] = [
IdentityLayer() for _ in range(model_shard_meta.start_layer)
]
layers[model_shard_meta.end_layer :] = [
IdentityLayer() for _ in range(len(layers) - model_shard_meta.end_layer)
]
layers[model_shard_meta.start_layer] = PipelineFirstLayer(
layers[model_shard_meta.start_layer],
model_shard_meta.device_rank,
model_shard_meta.world_size,
)
layers[model_shard_meta.end_layer - 1] = PipelineLastLayer(
layers[model_shard_meta.end_layer - 1],
model_shard_meta.device_rank,
model_shard_meta.world_size,
)
# At this point `layers` *must* be a concrete list.
assert isinstance(layers, list), (
"Expected a list of layers after auto-parallel initialisation"
)
# At this point `layers` *must* be a concrete list.
assert isinstance(layers, list), (
"Expected a list of layers after auto-parallel initialisation"
)
return model
return model
@staticmethod
def _inner_model(model: nn.Module) -> nn.Module:
inner = getattr(model, "model", None)
if isinstance(inner, nn.Module):
return inner
inner = getattr(model, "transformer", None)
if isinstance(inner, nn.Module):
return inner
raise ValueError("Model must either have a 'model' or 'transformer' attribute")
class TensorParallelisationStrategy(ParallelisationShardStrategy):
def __init__(self, group: mx.distributed.Group): # type: ignore
self.group = group # type: ignore
self.N = self.group.size # type: ignore
def auto_parallel(
self, model: nn.Module, model_shard_meta: ShardMetadata
) -> nn.Module:
assert isinstance(model_shard_meta, TensorShardMetadata)
all_to_sharded_linear = partial(
shard_linear,
sharding="all-to-sharded",
group=self.group, # pyright: ignore
)
sharded_to_all_linear = partial(
shard_linear,
sharding="sharded-to-all",
group=self.group, # type: ignore
)
all_to_sharded_linear_in_place = partial(
shard_inplace,
sharding="all-to-sharded",
group=self.group, # pyright: ignore
)
sharded_to_all_linear_in_place = partial(
shard_inplace,
sharding="sharded-to-all",
group=self.group, # type: ignore
)
if isinstance(model, LlamaModel):
tensor_parallel_sharding_strategy = LlamaShardingStrategy(
self.group, # type: ignore
all_to_sharded_linear,
sharded_to_all_linear,
all_to_sharded_linear_in_place,
sharded_to_all_linear_in_place,
)
elif isinstance(model, DeepseekV3Model):
tensor_parallel_sharding_strategy = DeepSeekShardingStrategy(
self.group, # type: ignore
all_to_sharded_linear,
sharded_to_all_linear,
all_to_sharded_linear_in_place,
sharded_to_all_linear_in_place,
)
elif isinstance(model, Qwen3MoeModel):
tensor_parallel_sharding_strategy = QwenShardingStrategy(
self.group, # type: ignore
all_to_sharded_linear,
sharded_to_all_linear,
all_to_sharded_linear_in_place,
sharded_to_all_linear_in_place,
)
else:
raise ValueError(f"Unsupported model type: {type(model)}")
return tensor_parallel_sharding_strategy.shard_model(model)
class TensorParallelShardingStrategy(ABC):
def __init__(
self,
group, # type: ignore
all_to_sharded_linear, # type: ignore
sharded_to_all_linear, # type: ignore
all_to_sharded_linear_in_place, # type: ignore
sharded_to_all_linear_in_place, # type: ignore
):
self.all_to_sharded_linear = all_to_sharded_linear
self.sharded_to_all_linear = sharded_to_all_linear
self.all_to_sharded_linear_in_place = all_to_sharded_linear_in_place
self.sharded_to_all_linear_in_place = sharded_to_all_linear_in_place
self.group = group or mx.distributed.init() # type: ignore
self.N = cast(int, group.size()) # type: ignore
@abstractmethod
def shard_model(self, model: nn.Module) -> nn.Module: ...
class LlamaShardingStrategy(TensorParallelShardingStrategy):
def shard_model(self, model: nn.Module) -> nn.Module:
model = cast(LlamaModel, model)
for layer in model.layers:
layer.self_attn.q_proj = self.all_to_sharded_linear(layer.self_attn.q_proj)
layer.self_attn.k_proj = self.all_to_sharded_linear(layer.self_attn.k_proj)
layer.self_attn.v_proj = self.all_to_sharded_linear(layer.self_attn.v_proj)
layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
layer.self_attn.n_heads //= self.N
if layer.self_attn.n_kv_heads is not None:
layer.self_attn.n_kv_heads //= self.N
layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)
return model
class DeepSeekShardingStrategy(TensorParallelShardingStrategy):
def shard_model(self, model: nn.Module) -> nn.Module:
model = cast(DeepseekV3Model, model)
for layer in model.layers:
# Shard the self attention
if layer.self_attn.q_lora_rank is None: # pyright: ignore[reportUnnecessaryComparison]
layer.self_attn.q_proj = self.all_to_sharded_linear(
layer.self_attn.q_proj
)
else:
layer.self_attn.q_b_proj = self.all_to_sharded_linear(
layer.self_attn.q_b_proj
)
layer.self_attn.kv_b_proj = self.all_to_sharded_linear(
layer.self_attn.kv_b_proj
)
layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
layer.self_attn.num_heads //= self.N
# Shard the MLP
if isinstance(layer.mlp, DeepseekV3MLP):
layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)
# Shard the MoE. Shard in place since the MoE should be responsible
# for aggregating the results.
else:
self.all_to_sharded_linear_in_place(layer.mlp.shared_experts.gate_proj)
self.sharded_to_all_linear_in_place(layer.mlp.shared_experts.down_proj)
self.all_to_sharded_linear_in_place(layer.mlp.shared_experts.up_proj)
self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.gate_proj)
self.sharded_to_all_linear_in_place(layer.mlp.switch_mlp.down_proj)
self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.up_proj)
layer.mlp = ShardedDeepseekV3MoE(layer.mlp) # type: ignore
layer.mlp.sharding_group = self.group # type: ignore
return model
class ShardedDeepseekV3MoE(CustomMlxLayer):
def __init__(self, layer: _LayerCallable):
super().__init__(layer)
self.sharding_group: mx.distributed.Group | None = None # type: ignore
def __call__(self, x: mx.array) -> mx.array:
if self.sharding_group is not None: # type: ignore
x = sum_gradients(self.sharding_group)(x) # type: ignore
y = self.original_layer.__call__(x) # type: ignore
if self.sharding_group is not None: # type: ignore
y = mx.distributed.all_sum(y, group=self.sharding_group) # type: ignore
return y
class QwenShardingStrategy(TensorParallelShardingStrategy):
def shard_model(self, model: nn.Module) -> nn.Module:
model = cast(Qwen3MoeModel, model)
for layer in model.layers:
# Shard the self attention
layer.self_attn.q_proj = self.all_to_sharded_linear(layer.self_attn.q_proj)
layer.self_attn.k_proj = self.all_to_sharded_linear(layer.self_attn.k_proj)
layer.self_attn.v_proj = self.all_to_sharded_linear(layer.self_attn.v_proj)
layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
layer.self_attn.n_heads //= self.N
layer.self_attn.n_kv_heads //= self.N
# Shard the MoE. Shard in place since the MoE should be responsible
# for aggregating the results.
if isinstance(layer.mlp, Qwen3MoeSparseMoeBlock):
self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.gate_proj)
self.sharded_to_all_linear_in_place(layer.mlp.switch_mlp.down_proj)
self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.up_proj)
layer.mlp = ShardedQwenMoE(layer.mlp) # type: ignore
layer.mlp.sharding_group = self.group # type:ignore
# Shard the MLP
else:
layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)
return model
class ShardedQwenMoE(CustomMlxLayer):
def __init__(self, layer: _LayerCallable):
super().__init__(layer)
self.sharding_group: mx.distributed.Group | None = None # type: ignore
def __call__(self, x: mx.array) -> mx.array:
if self.sharding_group is not None: # type: ignore
x = sum_gradients(self.sharding_group)(x) # type: ignore
y = self.original_layer.__call__(x) # type: ignore
if self.sharding_group is not None: # type: ignore
y = mx.distributed.all_sum(y, group=self.sharding_group) # type: ignore
return y

View File

@@ -1,24 +1,31 @@
import asyncio
import concurrent.futures
import contextlib
import os
import resource
from asyncio import AbstractEventLoop
from typing import Any, Callable, Optional, cast
from typing import Any, Callable, Optional
from loguru import logger
from mlx_lm.models.cache import KVCache
from mlx_lm.sample_utils import make_sampler
from mlx_lm.tokenizer_utils import TokenizerWrapper as _TokenizerWrapper
from mlx_lm.tokenizer_utils import load_tokenizer # type: ignore
try:
from mlx_lm.tokenizer_utils import load_tokenizer # type: ignore
except ImportError:
from mlx_lm.tokenizer_utils import load as load_tokenizer # type: ignore
from mlx_lm.utils import load_model # type: ignore
from pydantic import RootModel
import mlx.core as mx
import mlx.nn as nn # pyright: ignore[reportMissingTypeStubs]
from exo.engines.mlx import Model, TokenizerWrapper
from exo.engines.mlx.auto_parallel import IdentityLayer, auto_parallel
from exo.engines.mlx.auto_parallel import (
IdentityLayer,
PipelineParallelisationStrategy,
TensorParallelisationStrategy,
)
from exo.shared.types.common import Host
from exo.shared.types.memory import Memory
from exo.shared.types.tasks import ChatCompletionTaskParams
from exo.shared.types.worker.communication import runner_print
from exo.shared.types.worker.shards import ShardMetadata
@@ -31,15 +38,17 @@ mlx_rank: None | int = None
mlx_world_size: None | int = None
def mx_barrier():
def mx_barrier(group: mx.distributed.Group | None = None): # type: ignore
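# A scalar all_sum doubles as a barrier: every rank blocks on the collective until all ranks have reached this point.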
mx.eval( # type: ignore
mx.distributed.all_sum(
mx.array(1.0), stream=mx.default_stream(mx.Device(mx.cpu))
mx.array(1.0),
stream=mx.default_stream(mx.Device(mx.cpu)),
group=group, # type: ignore[type-arg]
)
)
def broadcast_from_zero(value: int) -> int:
def broadcast_from_zero(value: int, group: mx.distributed.Group | None = None): # type: ignore
if mlx_rank is None:
return value
@@ -48,7 +57,7 @@ def broadcast_from_zero(value: int) -> int:
else:
a = mx.array([0], dtype=mx.int32)
m = mx.distributed.all_sum(a, stream=mx.Device(mx.DeviceType.cpu))
m = mx.distributed.all_sum(a, stream=mx.Device(mx.DeviceType.cpu), group=group) # type: ignore
mx.eval(m) # type: ignore
return int(m.item())
@@ -59,68 +68,60 @@ class HostList(RootModel[list[str]]):
return cls(root=[str(host) for host in hosts])
def mlx_setup(
model_size_mb: int,
cache_frac_of_mrwss: float = 0.65, # main workhorse
wired_frac_of_mrwss: float = 0.00, # start with no wiring
) -> None:
if not mx.metal.is_available():
logger.warning(
"Metal is not available. Skipping MLX memory wired limits setup."
)
return
info = mx.metal.device_info()
mrwss = int(info["max_recommended_working_set_size"]) # bytes
memsize = int(info["memory_size"]) # bytes
runner_print(f"model size mb {model_size_mb}")
runner_print(f"{mrwss=}")
runner_print(f"{memsize=}")
model_bytes = int(model_size_mb * 1024**2)
kv_bytes = int(0.02 * model_bytes)
# Cache: keep most of weights+KV “on ice”, but don't starve the OS.
target_cache = int(1.10 * (model_bytes + kv_bytes)) # +10% slack
target_cache = min(target_cache, int(cache_frac_of_mrwss * mrwss))
target_cache = min(target_cache, memsize)
runner_print(f"{target_cache=}")
mx.set_cache_limit(max(target_cache, 0))
# Wiring: off by default; if you reenable, wire at most a small fraction.
if wired_frac_of_mrwss > 0.0:
target_wired = int(wired_frac_of_mrwss * mrwss)
target_wired = min(target_wired, target_cache) # don't wire more than the cache
runner_print(f"{target_wired=}")
with contextlib.suppress(Exception): # older macOS won't have this
mx.set_wired_limit(max(target_wired, 0))
def mlx_distributed_init(rank: int, hosts: list[Host]) -> mx.distributed.Group: # type: ignore
def mlx_distributed_init( # type: ignore[return]
rank: int,
hosts: list[Host] | None = None,
mlx_ibv_devices: list[list[str | None]] | None = None,
mlx_ibv_coordinator: str | None = None,
) -> mx.distributed.Group: # type: ignore
"""
Initialize the MLX distributed (runs in thread pool)
Initialize the MLX distributed group (runs in a thread pool).
Either hosts or mlx_ibv_devices must be provided:
- hosts: traditional host-based connectivity using MLX_HOSTFILE
- mlx_ibv_devices: RDMA connectivity matrix using MLX_IBV_DEVICES
- mlx_ibv_coordinator: coordinator address (IP:PORT) for RDMA setup
"""
global mlx_rank, mlx_world_size
runner_print(f"Starting initialization for rank {rank}")
# Setup distributed environment
hostfile = f"./hosts_{rank}.json" # TODO: this needs to be unique?
hosts_json = HostList.from_hosts(hosts).model_dump_json()
if mlx_ibv_devices is not None:
assert mlx_ibv_coordinator is not None, (
"To use ibv backend must set ibv coordinator"
)
import json
runner_print(f"rank {rank} hostfile: {hostfile} hosts: {hosts_json}")
# Use RDMA connectivity matrix
devices_file = f"./hosts_{rank}.json"
ibv_devices_json = json.dumps(mlx_ibv_devices)
runner_print(f"rank {rank} MLX_IBV_DEVICES: {ibv_devices_json}")
runner_print(f"rank {rank} MLX_IBV_COORDINATOR: {mlx_ibv_coordinator}")
with open(hostfile, "w") as f:
_ = f.write(hosts_json)
with open(devices_file, "w") as f:
_ = f.write(ibv_devices_json)
os.environ["MLX_HOSTFILE"] = hostfile
os.environ["MLX_RANK"] = str(rank)
os.environ["MLX_RING_VERBOSE"] = "1"
os.environ["MLX_IBV_DEVICES"] = devices_file
os.environ["MLX_RANK"] = str(rank)
os.environ["MLX_IBV_COORDINATOR"] = mlx_ibv_coordinator
group = mx.distributed.init(backend="ring", strict=True)
mlx_rank = group.rank()
mlx_world_size = group.size()
elif hosts is not None:
# Traditional host-based connectivity
hostfile = f"./hosts_{rank}.json"
hosts_json = HostList.from_hosts(hosts).model_dump_json()
runner_print(f"rank {rank} hostfile: {hostfile} hosts: {hosts_json}")
with open(hostfile, "w") as f:
_ = f.write(hosts_json)
os.environ["MLX_HOSTFILE"] = hostfile
os.environ["MLX_RANK"] = str(rank)
os.environ["MLX_RING_VERBOSE"] = "1"
else:
raise ValueError("Either hosts or mlx_ibv_devices must be provided")
group = mx.distributed.init(
backend="ring" if hosts is not None else "ibv", strict=True
)
runner_print(f"Rank {rank} mlx distributed initialization complete")
return group
@@ -128,40 +129,79 @@ def mlx_distributed_init(rank: int, hosts: list[Host]) -> mx.distributed.Group:
def initialize_mlx(
model_shard_meta: ShardMetadata,
hosts: list[Host],
) -> tuple[Model, TokenizerWrapper, Callable[[mx.array], mx.array]]:
hosts: list[Host] | None = None,
mlx_ibv_devices: list[list[str | None]] | None = None,
mlx_ibv_coordinator: str | None = None,
) -> tuple[Model, TokenizerWrapper, Callable[[mx.array], mx.array], Any]:
"""
Initialize the MLX model, tokenizer, and sampler. Runs in the MLX thread.
Either hosts or mlx_ibv_devices must be provided for distributed setups:
- hosts: traditional host-based connectivity
- mlx_ibv_devices: RDMA connectivity matrix
"""
mx.random.seed(42)
if len(hosts) > 1:
mlx_distributed_init(model_shard_meta.device_rank, hosts)
group = mlx_distributed_init( # type: ignore[misc]
model_shard_meta.device_rank,
hosts=hosts,
mlx_ibv_devices=mlx_ibv_devices,
mlx_ibv_coordinator=mlx_ibv_coordinator,
)
# set_wired_limit_for_model(get_weights_size(model_shard_meta))
# Determine world size from either hosts or mlx_ibv_devices
sampler: Callable[[mx.array], mx.array] = make_sampler(temp=0.7)
model, tokenizer = shard_and_load(model_shard_meta)
model = cast(Model, model)
model, tokenizer = shard_and_load(model_shard_meta, group=group) # type: ignore[reportUnknownArgumentType]
return model, tokenizer, sampler
return model, tokenizer, sampler, group # type: ignore[return-value]
def shard_and_load(
model_shard_meta: ShardMetadata,
group: mx.distributed.Group, # type: ignore
) -> tuple[nn.Module, TokenizerWrapper]:
model_path = build_model_path(model_shard_meta.model_meta.model_id)
runner_print(f"loading model from {model_path}")
runner_print(
f"loading model from {model_path} with strategy {model_shard_meta.strategy}"
)
model, config = load_model(model_path, lazy=True, strict=False) # type: ignore
runner_print(f"{config=}")
assert isinstance(model, nn.Module)
tokenizer = load_tokenizer(model_path)
tokenizer = load_tokenizer(model_path) # type: ignore
assert isinstance(tokenizer, _TokenizerWrapper)
model = auto_parallel(model, model_shard_meta)
if group:
runner_print(f"Group size: {group.size()}, group rank: {group.rank()}") # type: ignore
else:
runner_print("!!! No group")
match model_shard_meta.strategy:
case "auto" | "pipeline" | "pipeline_rdma":
strategy = PipelineParallelisationStrategy()
case "tensor" | "tensor_rdma":
strategy = TensorParallelisationStrategy(group) # type: ignore[reportUnknownArgumentType]
model = strategy.auto_parallel(model, model_shard_meta)
runner_print(f"Model after auto_parallel: {str(model)}")
mx.eval(model.parameters()) # type: ignore
mx.eval(model) # type: ignore
# Synchronize processes before generation to avoid timeout
mx_barrier()
mx_barrier(group) # type: ignore[reportUnknownArgumentType]
return model, tokenizer # type: ignore
@@ -257,3 +297,30 @@ def mlx_force_oom(size: int = 40000) -> None:
e = mx.matmul(b, c)
f = mx.sigmoid(d + e)
mx.eval(f) # type: ignore
def set_wired_limit_for_model(model_size: Memory):
"""
Raise the Metal wired limit so the model's weights can stay resident.
Note: the wired limit should not be changed during an async eval. If an
async eval could be running, synchronize with its streams first.
"""
if not mx.metal.is_available():
return
model_bytes = model_size.in_bytes
max_rec_size = int(mx.metal.device_info()["max_recommended_working_set_size"])
if model_bytes > 0.9 * max_rec_size:
model_mb = model_bytes // 2**20
max_rec_mb = max_rec_size // 2**20
runner_print(
f"[WARNING] Generating with a model that requires {model_mb} MB "
f"which is close to the maximum recommended size of {max_rec_mb} "
"MB. This can be slow. See the documentation for possible work-arounds: "
"https://github.com/ml-explore/mlx-lm/tree/main#large-models"
)
runner_print(f"Setting wired limit to {max_rec_size}")
mx.set_wired_limit(max_rec_size)
runner_print(f"Wired limit set to {max_rec_size}")

View File

@@ -153,7 +153,9 @@ class Node:
await self.master.shutdown()
self.master = None
else:
logger.info(f"Node {result.session_id.master_node_id} elected master")
logger.info(
f"Node {result.session_id.master_node_id} elected master"
)
if result.is_new_master:
await anyio.sleep(0)
if self.worker:
@@ -175,10 +177,10 @@ class Node:
)
self._tg.start_soon(self.worker.run)
if self.api:
self.api.reset(result.session_id)
self.api.reset(result.session_id, result.won_clock)
else:
if self.api:
self.api.unpause()
self.api.unpause(result.won_clock)
def main():

View File

@@ -93,6 +93,7 @@ class API:
self.event_buffer: OrderedBuffer[Event] = OrderedBuffer[Event]()
self.node_id: NodeId = node_id
self.session_id: SessionId = session_id
self.last_completed_election: int = 0
self.port = port
self.paused: bool = False
@@ -121,14 +122,15 @@ class API:
] = {}
self._tg: TaskGroup | None = None
def reset(self, new_session_id: SessionId):
def reset(self, new_session_id: SessionId, result_clock: int):
self.state = State()
self.session_id = new_session_id
self.event_buffer = OrderedBuffer[Event]()
self._chat_completion_queues = {}
self.unpause()
self.unpause(result_clock)
def unpause(self):
def unpause(self, result_clock: int):
self.last_completed_election = result_clock
self.paused = False
self.paused_ev.set()
self.paused_ev = AsyncTaskEvent()
@@ -155,6 +157,7 @@ class API:
self, payload: CreateInstanceTaskParams
) -> CreateInstanceResponse:
model_meta = await resolve_model_meta(payload.model_id)
strategy = payload.strategy
required_memory_bytes = model_meta.storage_size.in_bytes
available_memory_bytes = self._calculate_total_available_memory()
@@ -165,8 +168,7 @@ class API:
)
command = CreateInstance(
command_id=CommandId(),
model_meta=model_meta,
command_id=CommandId(), model_meta=model_meta, strategy=strategy
)
await self._send(command)
@@ -260,10 +262,10 @@ class API:
# Store thinking in the thinking field
message.thinking = thinking_match.group(1).strip()
for instance in self.state.instances.values():
if instance.shard_assignments.model_id == payload.model:
break
else:
if not any(
instance.shard_assignments.model_id == payload.model
for instance in self.state.instances.values()
):
await self._trigger_notify_user_to_download_model(payload.model)
raise HTTPException(
status_code=404, detail=f"No instance found for model {payload.model}"
@@ -334,7 +336,7 @@ class API:
async def _pause_on_new_election(self):
with self.election_receiver as ems:
async for message in ems:
if message.clock > self.session_id.election_clock:
if message.clock > self.last_completed_election:
self.paused = True
async def _send(self, command: Command):

View File

@@ -1,3 +1,5 @@
from datetime import datetime, timezone
from anyio import create_task_group
from anyio.abc import TaskGroup
from loguru import logger
@@ -202,6 +204,8 @@ class Master:
indexed = IndexedEvent(event=event, idx=len(self._event_log))
self.state = apply(self.state, indexed)
event._master_time_stamp = datetime.now(tz=timezone.utc) # pyright: ignore[reportPrivateUsage]
# TODO: SQL
self._event_log.append(event)
await self._send_event(indexed)

View File

@@ -6,6 +6,8 @@ from typing import Sequence
from exo.master.placement_utils import (
filter_cycles_by_memory,
get_hosts_from_subgraph,
get_mlx_ibv_coordinator,
get_mlx_ibv_devices_matrix,
get_shard_assignments,
get_smallest_cycles,
)
@@ -39,7 +41,6 @@ def get_instance_placements_after_create(
logger.info("finding cycles:")
cycles = topology.get_cycles()
logger.info(f"{cycles=}")
# we can also always just have a node on its own
singleton_cycles = [[node] for node in all_nodes]
candidate_cycles = cycles + singleton_cycles
cycles_with_sufficient_memory = filter_cycles_by_memory(
@@ -58,7 +59,7 @@ def get_instance_placements_after_create(
]
if tb_only and smallest_tb_cycles == []:
raise ValueError("No cycles found with sufficient memory")
raise ValueError("No TB cycles found with sufficient memory")
elif smallest_tb_cycles != []:
smallest_cycles = smallest_tb_cycles
@@ -80,29 +81,46 @@ def get_instance_placements_after_create(
),
)
shard_assignments = get_shard_assignments(command.model_meta, selected_cycle)
shard_assignments = get_shard_assignments(
command.model_meta, selected_cycle, command.strategy
)
cycle_digraph: Topology = topology.get_subgraph_from_nodes(selected_cycle)
hosts: list[Host] = get_hosts_from_subgraph(cycle_digraph)
instance_id = InstanceId()
target_instances = dict(deepcopy(current_instances))
target_instances[instance_id] = Instance(
instance_id=instance_id,
instance_type=InstanceStatus.Active,
shard_assignments=shard_assignments,
hosts=[
Host(
ip=host.ip,
# NOTE: this is stupid
# |
# v
# NOTE: it's fine to have non-deterministic ports here since this is in a command decision
port=random_ephemeral_port(),
)
for host in hosts
],
)
if command.strategy in ("tensor_rdma", "pipeline_rdma"):
mlx_ibv_devices = get_mlx_ibv_devices_matrix(
selected_cycle,
cycle_digraph,
)
mlx_ibv_coordinator = get_mlx_ibv_coordinator(
selected_cycle,
coordinator_port=random_ephemeral_port(),
)
target_instances[instance_id] = Instance(
instance_id=instance_id,
instance_type=InstanceStatus.Active,
shard_assignments=shard_assignments,
mlx_ibv_devices=mlx_ibv_devices,
mlx_ibv_coordinator=mlx_ibv_coordinator,
)
else:
hosts: list[Host] = get_hosts_from_subgraph(cycle_digraph)
target_instances[instance_id] = Instance(
instance_id=instance_id,
instance_type=InstanceStatus.Active,
shard_assignments=shard_assignments,
hosts=[
Host(
ip=host.ip,
port=random_ephemeral_port(),
)
for host in hosts
],
)
return target_instances

View File

@@ -1,5 +1,7 @@
from collections.abc import Generator
from typing import TypeGuard, cast
from loguru import logger
from pydantic import BaseModel
from exo.shared.topology import Topology
@@ -9,8 +11,13 @@ from exo.shared.types.models import ModelMetadata
from exo.shared.types.profiling import NodePerformanceProfile
from exo.shared.types.topology import NodeInfo
from exo.shared.types.worker.common import RunnerId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.shared.types.worker.runners import ShardAssignments
from exo.shared.types.worker.shards import PipelineShardMetadata
from exo.shared.types.worker.shards import (
PipelineShardMetadata,
ShardMetadata,
TensorShardMetadata,
)
class NodeWithProfile(BaseModel):
@@ -43,10 +50,11 @@ def get_smallest_cycles(cycles: list[list[NodeInfo]]) -> list[list[NodeInfo]]:
return [cycle for cycle in cycles if len(cycle) == min_nodes]
def get_shard_assignments(
def get_shard_assignments_for_pipeline_parallel(
model_meta: ModelMetadata,
selected_cycle: list[NodeInfo],
) -> ShardAssignments:
parallelisation_strategy: ParallelisationStrategyType,
):
if not narrow_all_nodes(selected_cycle):
raise ValueError("All nodes must have profiles to create shard assignments")
@@ -55,7 +63,8 @@ def get_shard_assignments(
start=Memory(),
)
total_layers = model_meta.n_layers
runner_to_shard: dict[RunnerId, PipelineShardMetadata] = {}
world_size = len(selected_cycle)
runner_to_shard: dict[RunnerId, ShardMetadata] = {}
node_to_runner: dict[NodeId, RunnerId] = {}
layers_assigned = 0
@@ -73,13 +82,15 @@ def get_shard_assignments(
node_layers = max(1, node_layers)
runner_id = RunnerId()
shard = PipelineShardMetadata(
model_meta=model_meta,
device_rank=i,
world_size=len(selected_cycle),
world_size=world_size,
start_layer=layers_assigned,
end_layer=layers_assigned + node_layers,
n_layers=total_layers,
strategy=parallelisation_strategy,
)
runner_to_shard[runner_id] = shard
@@ -95,6 +106,82 @@ def get_shard_assignments(
return shard_assignments
def get_shard_assignments_for_tensor_parallel(
model_meta: ModelMetadata,
selected_cycle: list[NodeInfo],
parallelisation_strategy: ParallelisationStrategyType,
):
if not narrow_all_nodes(selected_cycle):
raise ValueError("All nodes must have profiles to create shard assignments")
total_layers = model_meta.n_layers
world_size = len(selected_cycle)
runner_to_shard: dict[RunnerId, ShardMetadata] = {}
node_to_runner: dict[NodeId, RunnerId] = {}
for i, node in enumerate(selected_cycle):
shard = TensorShardMetadata(
model_meta=model_meta,
device_rank=i,
world_size=world_size,
start_layer=0,
end_layer=total_layers,
n_layers=total_layers,
strategy=parallelisation_strategy,
)
runner_id = RunnerId()
runner_to_shard[runner_id] = shard
node_to_runner[node.node_id] = runner_id
shard_assignments = ShardAssignments(
model_id=model_meta.model_id,
runner_to_shard=runner_to_shard,
node_to_runner=node_to_runner,
)
return shard_assignments
def get_shard_assignments(
model_meta: ModelMetadata,
selected_cycle: list[NodeInfo],
parallelisation_strategy: ParallelisationStrategyType,
) -> ShardAssignments:
match parallelisation_strategy:
case "auto" | "pipeline" | "pipeline_rdma":
return get_shard_assignments_for_pipeline_parallel(
model_meta=model_meta,
selected_cycle=selected_cycle,
parallelisation_strategy=parallelisation_strategy,
)
case "tensor" | "tensor_rdma":
return get_shard_assignments_for_tensor_parallel(
model_meta=model_meta,
selected_cycle=selected_cycle,
parallelisation_strategy=parallelisation_strategy,
)
def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
cycles = cycle_digraph.get_cycles()
if not cycles:
@@ -126,3 +213,109 @@ def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
break
return hosts
def get_mlx_ibv_devices_matrix(
selected_cycle: list[NodeInfo],
cycle_digraph: Topology,
) -> list[list[str | None]]:
"""Build connectivity matrix mapping device i to device j via RDMA interface names.
The matrix element [i][j] contains the interface name on device i that connects
to device j, or None if no connection exists or no interface name is found.
Diagonal elements are always None.
"""
num_nodes = len(selected_cycle)
matrix: list[list[str | None]] = [
[None for _ in range(num_nodes)] for _ in range(num_nodes)
]
for i, node_i in enumerate(selected_cycle):
for j, node_j in enumerate(selected_cycle):
if i == j:
continue
# just for debugging for now...
for connection_ip in _find_connection_ip(node_i, node_j, cycle_digraph):
interface_name = _find_interface_name_for_ip(connection_ip, node_i)
logger.info(
f"Interface name for {connection_ip} on {node_i.node_id}: {interface_name}"
)
matrix[i][j] = "rdma_en3" # TODO: hack, for now it's always en3
continue
for connection_ip in _find_connection_ip(node_i, node_j, cycle_digraph):
# Use the first valid RDMA i -> j connection; if there are multiple, the choice is effectively arbitrary. This is fine, since the connection does not have to be bidirectional.
if (
interface_name := _find_interface_name_for_ip(
connection_ip,
node_i,
)
) is not None:
matrix[i][j] = interface_name
break
else:
raise ValueError(
"Current ibv backend requires all-to-all rdma connections"
)
return matrix
def _find_connection_ip(
node_i: NodeInfo,
node_j: NodeInfo,
cycle_digraph: Topology,
) -> Generator[str]:
"""Find all IP addresses that connect node i to node j."""
for connection in cycle_digraph.list_connections():
if (
connection.local_node_id == node_j.node_id
and connection.send_back_node_id == node_i.node_id
and connection.send_back_multiaddr is not None
):
yield connection.send_back_multiaddr.ip_address
def _find_interface_name_for_ip(
ip_address: str,
node_info: NodeInfo,
) -> str | None:
if node_info.node_profile is None:
return None
for interface in node_info.node_profile.network_interfaces:
logger.info(
f"Checking interface {interface.name} for IP {interface.ip_address} == {ip_address}: {interface.ip_address == ip_address}"
)
if interface.name not in ["en2", "en3", "en4", "en5", "en6", "en7"]:
continue
if interface.ip_address == ip_address:
return f"rdma_{interface.name}"
return None
def get_mlx_ibv_coordinator(
selected_cycle: list[NodeInfo],
coordinator_port: int,
) -> str | None:
"""Get the coordinator address for MLX IBV (rank 0 device).
Selects the en0 IPv4 address of the rank 0 node as a heuristic for
ethernet accessibility. Returns the address in the format "X.X.X.X:PORT".
"""
if len(selected_cycle) == 0:
logger.warning("No nodes in selected cycle, cannot determine coordinator")
return None
rank_0_node = selected_cycle[0]
logger.info(f"Selecting coordinator from rank 0 node: {rank_0_node.node_id}")
assert rank_0_node.node_profile is not None
for iface in rank_0_node.node_profile.network_interfaces:
if iface.name == "en0" and "." in iface.ip_address:
return f"{iface.ip_address}:{coordinator_port}"
raise ValueError("No en0 iface found for device")

View File

@@ -118,6 +118,7 @@ async def test_master():
n_layers=16,
storage_size=Memory.from_bytes(678948),
),
strategy="auto",
)
),
)

View File

@@ -12,6 +12,7 @@ from exo.shared.types.common import CommandId, NodeId
from exo.shared.types.events import InstanceCreated, InstanceDeleted
from exo.shared.types.memory import Memory
from exo.shared.types.models import ModelId, ModelMetadata
from exo.shared.types.profiling import NetworkInterfaceInfo, NodePerformanceProfile
from exo.shared.types.topology import Connection, NodeInfo
from exo.shared.types.worker.common import InstanceId
from exo.shared.types.worker.instances import Instance, InstanceStatus
@@ -49,6 +50,7 @@ def create_instance_command(model_meta: ModelMetadata) -> CreateInstance:
return CreateInstance(
command_id=CommandId(),
model_meta=model_meta,
strategy="auto",
)
@@ -78,6 +80,7 @@ def test_get_instance_placements_create_instance(
create_instance_command = CreateInstance(
command_id=CommandId(),
model_meta=model_meta,
strategy="auto",
)
node_id_a = NodeId()
node_id_b = NodeId()
@@ -132,6 +135,7 @@ def test_get_instance_placements_one_node_exact_fit(
pretty_name="Test Model",
n_layers=10,
),
strategy="auto",
)
placements = get_instance_placements_after_create(
create_instance_command, topology, {}
@@ -160,6 +164,7 @@ def test_get_instance_placements_one_node_fits_with_extra_memory(
pretty_name="Test Model",
n_layers=10,
),
strategy="auto",
)
placements = get_instance_placements_after_create(
create_instance_command, topology, {}
@@ -188,6 +193,7 @@ def test_get_instance_placements_one_node_not_fit(
pretty_name="Test Model",
n_layers=10,
),
strategy="auto",
)
with pytest.raises(ValueError, match="No cycles found with sufficient memory"):
@@ -297,6 +303,7 @@ def test_placement_prioritizes_leaf_cycle_with_less_memory(
create_instance_command = CreateInstance(
command_id=CommandId(),
model_meta=model_meta,
strategy="auto",
)
# Act
@@ -316,3 +323,130 @@ def test_placement_prioritizes_leaf_cycle_with_less_memory(
assert expected_leaf_cycle_nodes.issubset(assigned_nodes)
assert assigned_nodes.isdisjoint(non_leaf_cycle_nodes)
def test_tensor_rdma_backend_connectivity_matrix(
topology: Topology,
model_meta: ModelMetadata,
create_node: Callable[[int, NodeId | None], NodeInfo],
create_connection: Callable[[NodeId, NodeId], Connection],
):
model_meta.n_layers = 12
model_meta.storage_size.in_bytes = 1500
node_id_a = NodeId()
node_id_b = NodeId()
node_id_c = NodeId()
node_a = create_node(500, node_id_a)
node_b = create_node(500, node_id_b)
node_c = create_node(500, node_id_c)
ethernet_interface = NetworkInterfaceInfo(
name="en0",
ip_address="192.168.1.100",
type="ethernet",
)
assert node_a.node_profile is not None
assert node_b.node_profile is not None
assert node_c.node_profile is not None
conn_a_b = create_connection(node_id_a, node_id_b)
conn_b_c = create_connection(node_id_b, node_id_c)
conn_c_a = create_connection(node_id_c, node_id_a)
assert conn_a_b.send_back_multiaddr is not None
assert conn_b_c.send_back_multiaddr is not None
assert conn_c_a.send_back_multiaddr is not None
node_a.node_profile = NodePerformanceProfile(
model_id="test",
chip_id="test",
friendly_name="test",
memory=node_a.node_profile.memory,
network_interfaces=[
NetworkInterfaceInfo(
name="en3",
ip_address=conn_a_b.send_back_multiaddr.ip_address,
type="rdma",
),
ethernet_interface,
],
system=node_a.node_profile.system,
)
node_b.node_profile = NodePerformanceProfile(
model_id="test",
chip_id="test",
friendly_name="test",
memory=node_b.node_profile.memory,
network_interfaces=[
NetworkInterfaceInfo(
name="en4",
ip_address=conn_b_c.send_back_multiaddr.ip_address,
type="rdma",
),
ethernet_interface,
],
system=node_b.node_profile.system,
)
node_c.node_profile = NodePerformanceProfile(
model_id="test",
chip_id="test",
friendly_name="test",
memory=node_c.node_profile.memory,
network_interfaces=[
NetworkInterfaceInfo(
name="en5",
ip_address=conn_c_a.send_back_multiaddr.ip_address,
type="rdma",
),
ethernet_interface,
],
system=node_c.node_profile.system,
)
topology.add_node(node_a)
topology.add_node(node_b)
topology.add_node(node_c)
topology.add_connection(conn_a_b)
topology.add_connection(conn_b_c)
topology.add_connection(conn_c_a)
create_instance_command = CreateInstance(
command_id=CommandId(),
model_meta=model_meta,
strategy="tensor_rdma",
)
placements = get_instance_placements_after_create(
create_instance_command, topology, {}
)
assert len(placements) == 1
instance_id = list(placements.keys())[0]
instance = placements[instance_id]
assert instance.hosts is None
assert instance.mlx_ibv_devices is not None
assert instance.mlx_ibv_coordinator is not None
matrix = instance.mlx_ibv_devices
assert len(matrix) == 3
for i in range(3):
assert matrix[i][i] is None
assigned_nodes = list(instance.shard_assignments.node_to_runner.keys())
node_to_idx = {node_id: idx for idx, node_id in enumerate(assigned_nodes)}
idx_a = node_to_idx[node_id_a]
idx_b = node_to_idx[node_id_b]
idx_c = node_to_idx[node_id_c]
assert matrix[idx_a][idx_b] == "rdma_en3"
assert matrix[idx_b][idx_c] == "rdma_en4"
assert matrix[idx_c][idx_a] == "rdma_en5"
assert ":" in instance.mlx_ibv_coordinator
assert not instance.mlx_ibv_coordinator.startswith("169.254")

View File

@@ -200,7 +200,7 @@ def test_get_shard_assignments(
selected_cycle = cycles[0]
# act
shard_assignments = get_shard_assignments(model_meta, selected_cycle)
shard_assignments = get_shard_assignments(model_meta, selected_cycle, "pipeline")
# assert
runner_id_a = shard_assignments.node_to_runner[node_a_id]

View File

@@ -12,7 +12,12 @@ from anyio import (
sleep_forever,
)
from anyio.abc import TaskGroup
from exo_pyo3_bindings import Keypair, NetworkingHandle, NoPeersSubscribedToTopicError
from exo_pyo3_bindings import (
AllQueuesFullError,
Keypair,
NetworkingHandle,
NoPeersSubscribedToTopicError,
)
from filelock import FileLock
from loguru import logger
@@ -207,7 +212,7 @@ class Router:
await self._net.gossipsub_publish(topic, data)
# As a hack, this also catches AllQueuesFull
# Need to fix that ASAP.
except NoPeersSubscribedToTopicError:
except (NoPeersSubscribedToTopicError, AllQueuesFullError):
pass

View File

@@ -16,8 +16,6 @@ from exo.shared.types.common import NodeId, SessionId
from exo.utils.channels import Receiver, Sender
from exo.utils.pydantic_ext import CamelCaseModel
ELECTION_TIMEOUT = 3.0
class ElectionMessage(CamelCaseModel):
clock: int
@@ -27,6 +25,8 @@ class ElectionMessage(CamelCaseModel):
# Could eventually include a list of neighbour nodes for centrality
def __lt__(self, other: Self) -> bool:
if self.clock != other.clock:
return self.clock < other.clock
if self.seniority != other.seniority:
return self.seniority < other.seniority
elif self.commands_seen != other.commands_seen:
@@ -40,6 +40,7 @@ class ElectionMessage(CamelCaseModel):
class ElectionResult(CamelCaseModel):
session_id: SessionId
won_clock: int
is_new_master: bool
historic_messages: list[ConnectionMessage]
@@ -90,19 +91,33 @@ class Election:
tg.start_soon(self._election_receiver)
tg.start_soon(self._connection_receiver)
tg.start_soon(self._command_counter)
await self._campaign(None)
# And start an election immediately, that instantly resolves
candidates: list[ElectionMessage] = []
logger.info("Starting initial campaign")
self._candidates = candidates
logger.info("Campaign started")
await self._campaign(candidates, campaign_timeout=0.0)
logger.info("Initial campaign finished")
# Cancel and wait for the last election to end
if self._campaign_cancel_scope is not None:
logger.info("Cancelling campaign")
self._campaign_cancel_scope.cancel()
# Only exit once the latest campaign has finished
if self._campaign_done is not None:
logger.info("Waiting for campaign to finish")
await self._campaign_done.wait()
logger.info("Campaign cancelled and finished")
logger.info("Election finished")
async def elect(self, em: ElectionMessage) -> None:
logger.info(f"Electing: {em}")
is_new_master = em.proposed_session != self.current_session
self.current_session = em.proposed_session
logger.info(f"Current session: {self.current_session}")
await self._er_sender.send(
ElectionResult(
won_clock=em.clock,
session_id=em.proposed_session,
is_new_master=is_new_master,
historic_messages=self._connection_messages,
@@ -120,16 +135,29 @@ class Election:
async def _election_receiver(self) -> None:
with self._em_receiver as election_messages:
async for message in election_messages:
logger.info(f"Election message received: {message}")
if message.proposed_session.master_node_id == self.node_id:
logger.info("Dropping message from ourselves")
# Drop messages from us (See exo.routing.router)
continue
# If a new round is starting, we participate
if message.clock > self.clock:
self.clock = message.clock
await self._campaign(message)
logger.info(f"New clock: {self.clock}")
assert self._tg is not None
logger.info("Starting new campaign")
candidates: list[ElectionMessage] = [message]
logger.info(f"Candidates: {candidates}")
logger.info(f"Current candidates: {self._candidates}")
self._candidates = candidates
logger.info(f"New candidates: {self._candidates}")
logger.info("Starting new campaign")
self._tg.start_soon(self._campaign, candidates)
logger.info("Campaign started")
continue
# Dismiss old messages
if message.clock < self.clock:
logger.info(f"Dropping old message: {message}")
continue
logger.debug(f"Election added candidate {message}")
# Now we are processing this rounds messages - including the message that triggered this round.
@@ -137,70 +165,97 @@ class Election:
async def _connection_receiver(self) -> None:
with self._cm_receiver as connection_messages:
async for msg in connection_messages:
async for first in connection_messages:
# Delay after connection message for time to symmetrically setup
await anyio.sleep(0.2)
rest = connection_messages.collect()
logger.info(f"Connection messages received: {first} followed by {rest}")
logger.info(f"Current clock: {self.clock}")
# These messages are strictly peer to peer
self.clock += 1
await self._campaign(None)
self._connection_messages.append(msg)
logger.info(f"New clock: {self.clock}")
assert self._tg is not None
candidates: list[ElectionMessage] = []
self._candidates = candidates
logger.info("Starting new campaign")
self._tg.start_soon(self._campaign, candidates)
logger.info("Campaign started")
self._connection_messages.append(first)
self._connection_messages.extend(rest)
logger.info("Connection message added")
async def _command_counter(self) -> None:
with self._co_receiver as commands:
async for _command in commands:
self.commands_seen += 1
async def _campaign(self, initial_message: ElectionMessage | None) -> None:
async def _campaign(
self, candidates: list[ElectionMessage], *, campaign_timeout: float = 3.0
) -> None:
clock = self.clock
# Kill the old campaign
if self._campaign_cancel_scope:
logger.info("Cancelling other campaign")
self._campaign_cancel_scope.cancel()
if self._campaign_done:
logger.info("Waiting for other campaign to finish")
await self._campaign_done.wait()
candidates: list[ElectionMessage] = []
if initial_message:
candidates.append(initial_message)
self._candidates = candidates
done = Event()
self._campaign_done = done
assert self._tg is not None, (
"Election campaign started before election service initialized"
)
# Spin off a new campaign
self._tg.start_soon(self._complete_campaign, self.clock, candidates, done)
async def _complete_campaign(
self, clock: int, candidates: list[ElectionMessage], done: Event
) -> None:
scope = CancelScope()
self._campaign_cancel_scope = scope
try:
with scope:
self._campaign_cancel_scope = scope
logger.info(f"Election {clock} started")
candidates.append(self._election_status(clock))
await self._em_sender.send(self._election_status(clock))
status = self._election_status(clock)
candidates.append(status)
await self._em_sender.send(status)
await anyio.sleep(ELECTION_TIMEOUT)
logger.info(f"Sleeping for {campaign_timeout} seconds")
await anyio.sleep(campaign_timeout)
# minor hack - rebroadcast status in case anyone has missed it.
await self._em_sender.send(status)
logger.info("Woke up from sleep")
# add an anyio checkpoint - anyio.lowlevel.checkpoint() or checkpoint_if_cancelled() is preferred, but wasn't typechecking last I checked
await anyio.sleep(0)
# Election finished!
candidates = sorted(candidates)
logger.debug(f"Election queue {candidates}")
elected = candidates[-1]
elected = max(candidates)
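# max() picks the winner under ElectionMessage.__lt__: compared by clock first, then seniority, then commands_seen.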
logger.info(f"Election queue {candidates}")
logger.info(f"Elected: {elected}")
if (
self.node_id == elected.proposed_session.master_node_id
and self.seniority >= 0
):
logger.info(
f"Node is a candidate and seniority is {self.seniority}"
)
self.seniority = max(self.seniority, len(candidates))
logger.info(f"New seniority: {self.seniority}")
else:
logger.info(
f"Node is not a candidate or seniority is not {self.seniority}"
)
logger.info(
f"Election finished, new SessionId({elected.proposed_session})"
f"Election finished, new SessionId({elected.proposed_session}) with queue {candidates}"
)
logger.info("Sending election result")
await self.elect(elected)
logger.info("Election result sent")
except get_cancelled_exc_class():
logger.info("Election cancelled")
logger.info(f"Election {clock} cancelled")
finally:
logger.info(f"Election {clock} finally")
if self._campaign_cancel_scope is scope:
self._campaign_cancel_scope = None
done.set()
logger.info("Setting done event")
done.set()
logger.info("Done event set")
def _election_status(self, clock: int | None = None) -> ElectionMessage:
c = self.clock if clock is None else clock

View File

@@ -166,7 +166,7 @@ MODEL_CARDS: dict[str, ModelCard] = {
"llama-3.3-70b": ModelCard(
short_id="llama-3.3-70b",
model_id="mlx-community/Llama-3.3-70B-Instruct-4bit",
name="Llama 3.3 70B",
name="Llama 3.3 70B (4-bit)",
description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
tags=[],
metadata=ModelMetadata(
@@ -176,6 +176,32 @@ MODEL_CARDS: dict[str, ModelCard] = {
n_layers=80,
),
),
"llama-3.3-70b-8bit": ModelCard(
short_id="llama-3.3-70b-8bit",
model_id="mlx-community/Llama-3.3-70B-Instruct-8bit",
name="Llama 3.3 70B (8-bit)",
description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
tags=[],
metadata=ModelMetadata(
model_id=ModelId("mlx-community/Llama-3.3-70B-Instruct-8bit"),
pretty_name="Llama 3.3 70B (8-bit)",
storage_size=Memory.from_kb(77516320),
n_layers=80,
),
),
"llama-3.3-70b-fp16": ModelCard(
short_id="llama-3.3-70b-fp16",
model_id="mlx-community/llama-3.3-70b-instruct-fp16",
name="Llama 3.3 70B (FP16)",
description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
tags=[],
metadata=ModelMetadata(
model_id=ModelId("mlx-community/llama-3.3-70b-instruct-fp16"),
pretty_name="Llama 3.3 70B (FP16)",
storage_size=Memory.from_kb(155032640),
n_layers=80,
),
),
# phi-3
"phi-3-mini": ModelCard(
short_id="phi-3-mini",
@@ -230,6 +256,32 @@ MODEL_CARDS: dict[str, ModelCard] = {
n_layers=48,
),
),
"qwen3-235b-a22b": ModelCard(
short_id="qwen3-235b-a22b",
model_id="mlx-community/Qwen3-235B-A22B-4bit",
name="Qwen3 235B, Active 22B (4-bit)",
description="""Qwen3 235B (Active 22B) is a large language model trained on the Qwen3 235B dataset.""",
tags=[],
metadata=ModelMetadata(
model_id=ModelId("mlx-community/Qwen3-235B-A22B-4bit"),
pretty_name="Qwen3 235B, Active 22B (4-bit)",
storage_size=Memory.from_kb(123207680),
n_layers=94,
),
),
"qwen3-235b-a22b-8bit": ModelCard(
short_id="qwen3-235b-a22b-8bit",
model_id="mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit",
name="Qwen3 235B, Active 22B (8-bit)",
description="""Qwen3 235B (Active 22B) is a large language model trained on the Qwen3 235B dataset.""",
tags=[],
metadata=ModelMetadata(
model_id=ModelId("mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit"),
pretty_name="Qwen3 235B, Active 22B (8-bit)",
storage_size=Memory.from_kb(246415360),
n_layers=94,
),
),
# granite
"granite-3.3-2b": ModelCard(
short_id="granite-3.3-2b",

View File

@@ -7,6 +7,7 @@ from exo.shared.openai_compat import FinishReason
from exo.shared.types.common import CommandId
from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.instances import InstanceId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
class ModelListModel(BaseModel):
@@ -123,6 +124,7 @@ class ChatCompletionTaskParams(BaseModel):
class CreateInstanceTaskParams(BaseModel):
# TODO: in future the user could specify a specific Instance, not just a model_id
model_id: str
strategy: ParallelisationStrategyType = "auto"
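# Hypothetical request body: {"model_id": "llama-3.3-70b", "strategy": "tensor_rdma"}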
class DeleteInstanceTaskParams(BaseModel):

View File

@@ -4,6 +4,7 @@ from exo.shared.types.api import ChatCompletionTaskParams
from exo.shared.types.common import CommandId, NodeId
from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.common import InstanceId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.utils.pydantic_ext import CamelCaseModel, TaggedModel
@@ -22,6 +23,7 @@ class ChatCompletion(BaseCommand):
class CreateInstance(BaseCommand):
model_meta: ModelMetadata
strategy: ParallelisationStrategyType
class SpinUpInstance(BaseCommand):

View File

@@ -1,3 +1,4 @@
from datetime import datetime
from enum import Enum
from pydantic import Field
@@ -60,6 +61,8 @@ class EventType(str, Enum):
class BaseEvent(TaggedModel):
event_id: EventId = Field(default_factory=EventId)
# Internal, for debugging. Please don't rely on this field for anything!
_master_time_stamp: None | datetime = None
class TestEvent(BaseEvent):

View File

@@ -11,7 +11,9 @@ class BaseRunnerMessage(TaggedModel):
class SetupMessage(BaseRunnerMessage):
model_shard_meta: ShardMetadata
hosts: list[Host]
hosts: list[Host] | None = None
mlx_ibv_devices: list[list[str | None]] | None = None
mlx_ibv_coordinator: str | None = None
# TODO: We probably want a general task message that can take any task type. Can be fixed later.

View File

@@ -17,4 +17,6 @@ class Instance(CamelCaseModel):
instance_id: InstanceId
instance_type: InstanceStatus
shard_assignments: ShardAssignments
hosts: list[Host]
hosts: list[Host] | None = None
mlx_ibv_devices: list[list[str | None]] | None = None
mlx_ibv_coordinator: str | None = None

View File

@@ -14,7 +14,9 @@ class AssignRunnerOp(BaseRunnerOp):
instance_id: InstanceId
runner_id: RunnerId
shard_metadata: ShardMetadata
hosts: list[Host]
hosts: list[Host] | None = None
mlx_ibv_devices: list[list[str | None]] | None = None
mlx_ibv_coordinator: str | None = None
class UnassignRunnerOp(BaseRunnerOp):

View File

@@ -0,0 +1,13 @@
from typing import Literal
ParallelisationStrategyType = Literal[
"auto",
"pipeline",
"tensor",
"tensor_rdma",
"pipeline_rdma",
]
def strategy_error() -> ValueError:
return ValueError("Unexpected strategy")

View File

@@ -1,6 +1,7 @@
from pydantic import Field
from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.utils.pydantic_ext import TaggedModel
@@ -19,19 +20,12 @@ class BaseShardMetadata(TaggedModel):
immediate_exception: bool = False
should_timeout: float | None = None
class PipelineShardMetadata(BaseShardMetadata):
"""
Pipeline parallelism shard meta.
Layers are represented as a half-open interval [start_layer, end_layer),
where start_layer is inclusive and end_layer is exclusive.
"""
start_layer: int = Field(ge=0)
end_layer: int = Field(ge=0)
n_layers: int = Field(ge=0)
strategy: ParallelisationStrategyType = "auto"
@property
def is_first_layer(self) -> bool:
return self.start_layer == 0
@@ -46,4 +40,19 @@ class PipelineShardMetadata(BaseShardMetadata):
)
ShardMetadata = PipelineShardMetadata
class PipelineShardMetadata(BaseShardMetadata):
"""
Pipeline parallelism shard meta.
Layers are represented as a half-open interval [start_layer, end_layer),
where start_layer is inclusive and end_layer is exclusive.
"""
strategy: ParallelisationStrategyType = "pipeline"
class TensorShardMetadata(BaseShardMetadata):
strategy: ParallelisationStrategyType = "tensor"
ShardMetadata = PipelineShardMetadata | TensorShardMetadata
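To make the half-open convention concrete, a standalone sketch (not part of this commit) of splitting `n_layers` into contiguous `[start_layer, end_layer)` shards:

```python
def split_layers(n_layers: int, world_size: int) -> list[tuple[int, int]]:
    """One contiguous half-open layer interval per pipeline stage."""
    base, rem = divmod(n_layers, world_size)
    bounds, start = [], 0
    for rank in range(world_size):
        end = start + base + (1 if rank < rem else 0)
        bounds.append((start, end))  # end_layer is exclusive
        start = end
    return bounds

# split_layers(32, 3) -> [(0, 11), (11, 22), (22, 32)]
```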

View File

@@ -1,4 +1,5 @@
from math import inf
from typing import Self
from anyio import ClosedResourceError, WouldBlock
from anyio.streams.memory import (
@@ -47,6 +48,9 @@ class Receiver[T](AnyioReceiver[T]):
out.extend(self.collect())
return out
def __enter__(self) -> Self:
return self
class channel[T]: # noqa: N801
def __new__(cls, max_buffer_size: float = inf) -> tuple[Sender[T], Receiver[T]]:
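A hedged usage sketch of the new sync context-manager support (assuming the matching `__exit__`/close behaviour comes from the anyio base classes):

```python
tx, rx = channel[int](max_buffer_size=8)
with rx:                   # __enter__ now returns the receiver itself
    tx.send_nowait(1)      # assumes Sender keeps anyio's send_nowait
    items = rx.collect()   # drain whatever is buffered
```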

View File

@@ -18,12 +18,14 @@ from exo.worker.runner.runner_supervisor import RunnerSupervisor
class AssignedRunner(BaseModel):
runner_id: RunnerId
instance_id: InstanceId
shard_metadata: ShardMetadata # just data
hosts: list[Host]
shard_metadata: ShardMetadata
hosts: list[Host] | None = None
mlx_ibv_devices: list[list[str | None]] | None = None
mlx_ibv_coordinator: str | None = None
status: RunnerStatus
failures: list[tuple[float, Exception]] = []
runner: RunnerSupervisor | None # set if the runner is 'up'
runner: RunnerSupervisor | None = None
model_config = ConfigDict(arbitrary_types_allowed=True)

View File

@@ -194,8 +194,8 @@ class Worker:
# run the op, synchronously blocking for now
if op is not None:
logger.info(f"Executing op {str(op)[:100]}")
logger.debug(f"Worker executing op: {str(op)[:100]}")
logger.info(f"Executing op {type(op)} {str(op)[:100]}")
logger.debug(f"Worker executing op: {type(op)} {str(op)[:100]}")
try:
async for event in self.execute_op(op):
await self.event_publisher(event)
@@ -285,6 +285,8 @@ class Worker:
instance_id=op.instance_id,
shard_metadata=op.shard_metadata,
hosts=op.hosts,
mlx_ibv_devices=op.mlx_ibv_devices,
mlx_ibv_coordinator=op.mlx_ibv_coordinator,
status=DownloadingRunnerStatus(
download_progress=DownloadPending(node_id=self.node_id)
),
@@ -439,6 +441,8 @@ class Worker:
assigned_runner.runner = await RunnerSupervisor.create(
model_shard_meta=assigned_runner.shard_metadata,
hosts=assigned_runner.hosts,
mlx_ibv_devices=assigned_runner.mlx_ibv_devices,
mlx_ibv_coordinator=assigned_runner.mlx_ibv_coordinator,
initialize_timeout=initialize_timeout,
)

View File

@@ -176,6 +176,8 @@ def assign_runners(
runner_id
],
hosts=instance.hosts,
mlx_ibv_devices=instance.mlx_ibv_devices,
mlx_ibv_coordinator=instance.mlx_ibv_coordinator,
)
return None

View File

@@ -21,6 +21,7 @@ def entrypoint(raw_conn: Connection, err_path: str) -> None:
It redirects fd=2 (stderr) to a pipe provided by the parent, *then* imports
the heavy runner module so that any C/C++ or MLX logs/crashes land in that pipe.
"""
# os.environ["MLX_METAL_FAST_SYNCH"] = "1"
_redirect_stderr_to_file(err_path)
faulthandler.enable(file=sys.stderr, all_threads=True)
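A minimal sketch of what the fd-level redirect has to do (an assumed shape; the real `_redirect_stderr_to_file` may differ):

```python
import os

def _redirect_stderr_to_file(path: str) -> None:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    os.dup2(fd, 2)  # repoint fd 2 itself so native C/C++/MLX writes land here
    os.close(fd)
```

Because the duplication happens at the file-descriptor level rather than by reassigning `sys.stderr`, output from native extensions that never touch the Python object is captured too.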

View File

@@ -1,9 +1,10 @@
import asyncio
import concurrent.futures
import functools
import time
from collections.abc import AsyncGenerator
from functools import partial
from typing import Callable, Generator, Optional, Tuple
from typing import Any, Callable, Generator, Optional, Tuple
import mlx.core as mx
from mlx.core import array
@@ -13,9 +14,9 @@ from mlx_lm.models.cache import KVCache
from exo.engines.mlx import Model, TokenizerWrapper
from exo.engines.mlx.utils_mlx import (
apply_chat_template,
broadcast_from_zero,
broadcast_from_zero, # type: ignore
make_kv_cache,
mx_barrier,
mx_barrier, # type: ignore
)
from exo.shared.types.api import ChatCompletionMessage
from exo.shared.types.tasks import ChatCompletionTaskParams
@@ -33,15 +34,35 @@ from exo.shared.types.worker.communication import (
generation_stream = mx.new_stream(mx.default_device())
def maybe_quantize_kv_cache(
prompt_cache: list[Any],
quantized_kv_start: int,
kv_group_size: int,
kv_bits: int | None,
) -> None:
if kv_bits is None:
return
for e, c in enumerate(prompt_cache): # type: ignore[type-arg]
if hasattr(c, "to_quantized") and c.offset >= quantized_kv_start: # type: ignore[type-arg]
prompt_cache[e] = c.to_quantized(group_size=kv_group_size, bits=kv_bits) # type: ignore[type-arg]
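# Hedged worked example of the gate above: with quantized_kv_start=512,
# kv_group_size=64 and kv_bits=4, a cache layer whose offset (tokens seen)
# has reached 600 is swapped for c.to_quantized(group_size=64, bits=4),
# while a layer still at offset 300 stays in full precision; kv_bits=None
# disables quantization entirely.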
def generate_step(
prompt: mx.array,
model: Model,
*,
max_tokens: int = 256,
sampler: Callable[[mx.array], mx.array],
logits_processors: list[Callable[[mx.array, mx.array], mx.array]] | None = None,
max_kv_size: Optional[int] = None,
prompt_cache: Optional[list[KVCache]] = None,
prefill_step_size: int = 2048,
kv_bits: int | None = None,
kv_group_size: int = 64,
quantized_kv_start: int = 0,
prompt_progress_callback: Callable[[int, int], None] | None = None,
input_embeddings: mx.array | None = None,
group: mx.distributed.Group | None = None, # type: ignore[type-arg]
) -> Generator[Tuple[int, mx.array], None, None]:
"""
A generator producing token ids based on the given prompt from the model.
@@ -51,85 +72,159 @@ def generate_step(
model (Model): The model to use for generation.
max_tokens (int): The maximum number of tokens. Use ``-1`` for an infinite
generator. Default: ``256``.
sampler (Callable[mx.array, mx.array], optional): A sampler for sampling a
token from a vector of log probabilities. Default: ``None``.
sampler (Callable[mx.array, mx.array]): A sampler for sampling a
token from a vector of log probabilities.
logits_processors (List[Callable[[mx.array, mx.array], mx.array]], optional):
A list of functions that take tokens and logits and return the processed
logits. Default: ``None``.
max_kv_size (int, optional): Maximum size of the key-value cache. Old
entries (except the first 4 tokens) will be overwritten.
prompt_cache (List[Any], optional): A pre-computed prompt cache. Note, if
provided, the cache will be updated in place.
prefill_step_size (int): Step size for processing the prompt.
kv_bits (int, optional): Number of bits to use for KV cache quantization.
None implies no cache quantization. Default: ``None``.
kv_group_size (int): Group size for KV cache quantization. Default: ``64``.
quantized_kv_start (int): Step to begin using a quantized KV cache
when ``kv_bits`` is non-None. Default: ``0``.
prompt_progress_callback (Callable[[int, int], None]): A callback which takes the
prompt tokens processed so far and the total number of prompt tokens.
input_embeddings (mx.array, optional): Input embeddings to use instead of or in
conjunction with prompt tokens. Default: ``None``.
Yields:
Tuple[int, mx.array]: One token and a vector of log probabilities.
"""
if input_embeddings is not None:
if len(prompt) > 0 and len(prompt) != len(input_embeddings):
raise ValueError(
f"When providing input_embeddings, their sequence length ({len(input_embeddings)}) "
f"must match the sequence length of the prompt ({len(prompt)}), or the "
"prompt must be empty."
)
elif len(prompt) == 0:
raise ValueError(
"Either input_embeddings or prompt (or both) must be provided."
)
tokens = None
# Create the KV cache for generation
if prompt_cache is None:
prompt_cache = cache.make_prompt_cache(
model,
max_kv_size=max_kv_size,
)
def _step(input_tokens: mx.array):
prompt_progress_callback = prompt_progress_callback or (lambda *_: None) # type: ignore[type-arg]
quantize_cache_fn = functools.partial(
maybe_quantize_kv_cache,
quantized_kv_start=quantized_kv_start,
kv_group_size=kv_group_size,
kv_bits=kv_bits,
)
def _model_call(
input_tokens: mx.array, input_embeddings: mx.array | None
) -> mx.array:
if input_embeddings is not None:
return model( # type: ignore[type-arg]
input_tokens,
cache=prompt_cache,
input_embeddings=input_embeddings, # type: ignore[type-arg]
)
else:
return model(input_tokens, cache=prompt_cache)
def _step(
input_tokens: mx.array, input_embeddings: mx.array | None = None
) -> tuple[mx.array, mx.array]:
nonlocal tokens
with mx.stream(generation_stream):
logits = model(
input_tokens[None],
cache=prompt_cache,
logits = _model_call(
input_tokens=input_tokens[None],
input_embeddings=(
input_embeddings[None] if input_embeddings is not None else None
),
)
logits = logits[:, -1, :]
if logits_processors and len(input_tokens) > 0:
tokens = (
mx.concat([tokens, input_tokens])
if tokens is not None
else input_tokens
)
for processor in logits_processors:
logits = processor(tokens, logits)
quantize_cache_fn(prompt_cache)
logprobs = logits - mx.logsumexp(logits, keepdims=True)
sampled = sampler(logprobs)
return sampled, logprobs.squeeze(0)
with mx.stream(generation_stream):
total_prompt_tokens = len(prompt)
total_prompt_tokens = (
len(input_embeddings) if input_embeddings is not None else len(prompt)
)
prompt_processed_tokens = 0
prompt_progress_callback(prompt_processed_tokens, total_prompt_tokens)
while total_prompt_tokens - prompt_processed_tokens > prefill_step_size:
runner_print(
f"Prefilling {min(prefill_step_size, len(prompt))} tokens. Remaining tokens: {len(prompt)}. Peak memory: {mx.get_peak_memory() // 2**30} GB"
)
logits = model(prompt[:prefill_step_size][None], cache=prompt_cache)
n_to_process = min(prefill_step_size, prompt.size)
_model_call(
input_tokens=prompt[:n_to_process][None],
input_embeddings=(
input_embeddings[:n_to_process][None]
if input_embeddings is not None
else None
),
)
quantize_cache_fn(prompt_cache)
start_time = time.time()
mx.eval([c.state for c in prompt_cache] + [logits]) # type: ignore
mx.eval([c.state for c in prompt_cache]) # type: ignore
eval_time = time.time() - start_time
prompt_processed_tokens += prefill_step_size
prompt_processed_tokens += n_to_process
prompt = prompt[prefill_step_size:]
prompt = prompt[n_to_process:]
input_embeddings = (
input_embeddings[n_to_process:]
if input_embeddings is not None
else input_embeddings
)
mx.clear_cache()
if eval_time > 7.0:
prefill_step_size = prefill_step_size // 2
prefill_step_size = broadcast_from_zero(prefill_step_size)
if group is not None:
prefill_step_size = broadcast_from_zero(prefill_step_size)
prefill_step_size = max(1, prefill_step_size)
prompt_progress_callback(prompt_processed_tokens, total_prompt_tokens)
if prompt_processed_tokens > 0:
runner_print("finished prefil stage.")
y, logprobs = _step(input_tokens=prompt)
y, logprobs = _step(input_tokens=prompt, input_embeddings=input_embeddings)
# TODO: Why on earth is this async_eval called twice?
# Also why is it async_eval, not eval?
mx.async_eval(y, logprobs) # type: ignore
n = 0
mx.async_eval(y, logprobs) # type: ignore[type-arg]
next_y: array | None = None
next_logprobs: array | None = None
mx.async_eval(y, logprobs) # type: ignore
n = 0
while True:
if n != max_tokens:
assert y is not None
next_y, next_logprobs = _step(y)
mx.async_eval(next_y, next_logprobs) # type: ignore
mx.async_eval(next_y, next_logprobs) # type: ignore[type-arg]
if n == 0:
mx.eval(y) # type: ignore
mx.eval(y) # type: ignore[type-arg]
prompt_progress_callback(total_prompt_tokens, total_prompt_tokens)
if n == max_tokens:
break
yield int(y.item()), logprobs # type: ignore
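The prefill loop above halves its chunk size whenever one chunk takes too long to evaluate, and keeps all ranks in agreement via the rank-zero broadcast. A standalone sketch of just that policy (illustrative only):

```python
def next_prefill_step(step: int, eval_time: float, threshold: float = 7.0) -> int:
    # A chunk took too long: halve the step, but never drop below one token.
    if eval_time > threshold:
        step //= 2
    return max(1, step)

# e.g. 2048 -> 1024 -> 512 on repeated slow chunks; the real loop then
# broadcasts the value from rank zero whenever a distributed group exists.
```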
@@ -146,8 +241,16 @@ def stream_generate(
max_tokens: int,
sampler: Callable[[mx.array], mx.array],
conn: AsyncConnection[RunnerResponse, RunnerMessage] | None,
logits_processors: list[Callable[[mx.array, mx.array], mx.array]] | None = None,
max_kv_size: int | None = None,
prompt_cache: Optional[list[KVCache]] = None,
prefill_step_size: int = 2048,
kv_bits: int | None = None,
kv_group_size: int = 64,
quantized_kv_start: int = 0,
prompt_progress_callback: Callable[[int, int], None] | None = None,
input_embeddings: mx.array | None = None,
group: mx.distributed.Group | None = None, # type: ignore[type-arg]
) -> Generator[GenerationResponse, None, None]:
# Try to infer if special tokens are needed
add_special_tokens = tokenizer.bos_token is None or not prompt.startswith(
@@ -166,8 +269,16 @@ def stream_generate(
model,
max_tokens=max_tokens,
sampler=sampler,
logits_processors=logits_processors,
max_kv_size=max_kv_size,
prompt_cache=prompt_cache,
prefill_step_size=prefill_step_size,
kv_bits=kv_bits,
kv_group_size=kv_group_size,
quantized_kv_start=quantized_kv_start,
prompt_progress_callback=prompt_progress_callback,
input_embeddings=input_embeddings,
group=group,
)
token = None
@@ -199,6 +310,7 @@ async def warmup_inference(
model: Model,
tokenizer: TokenizerWrapper,
sampler: Callable[[mx.array], mx.array],
group: mx.distributed.Group | None = None, # type: ignore
) -> int:
loop = asyncio.get_running_loop()
@@ -220,18 +332,21 @@ async def warmup_inference(
def _generate_warmup():
nonlocal tokens_generated
for token in stream_generate(
runner_print("Generating warmup tokens")
for _r in stream_generate(
model=model,
tokenizer=tokenizer,
prompt=warmup_prompt,
max_tokens=50,
sampler=sampler,
conn=None,
group=group,
):
runner_print("Generated warmup token: " + str(token.text))
runner_print("Generated warmup token: " + str(_r.text))
tokens_generated += 1
await loop.run_in_executor(mlx_executor, _generate_warmup)
runner_print("Generated ALL warmup tokens")
mx_barrier()
return tokens_generated
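The closing `mx_barrier()` presumably keeps the ranks in lockstep: no node reports warmup complete until every peer has finished generating its 50 tokens.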

View File

@@ -7,7 +7,6 @@ from multiprocessing.connection import Connection
from exo.engines.mlx.utils_mlx import (
initialize_mlx,
mlx_force_oom,
mlx_setup,
)
from exo.shared.global_conn import set_conn
from exo.shared.types.worker.commands_runner import (
@@ -26,8 +25,7 @@ from exo.shared.types.worker.communication import (
)
from exo.shared.types.worker.shards import ShardMetadata
from exo.utils import ensure_type
from exo.worker.runner.generate import mlx_generate, warmup_inference
from exo.worker.runner.utils import get_weights_size
from exo.worker.runner.generate import mlx_generate, warmup_inference # type: ignore
async def main(raw_conn: Connection):
@@ -40,33 +38,39 @@ async def main(raw_conn: Connection):
setup_message = ensure_type(init_message, SetupMessage)
model_shard_meta: ShardMetadata = setup_message.model_shard_meta
hosts = setup_message.hosts
mlx_ibv_devices = setup_message.mlx_ibv_devices
mlx_ibv_coordinator = setup_message.mlx_ibv_coordinator
if getattr(model_shard_meta, "immediate_exception", False):
raise Exception("Fake exception - runner failed to spin up.")
if timeout := getattr(model_shard_meta, "should_timeout", 0):
await asyncio.sleep(timeout)
mlx_setup(
int(get_weights_size(model_shard_meta).in_kb // 2**10),
cache_frac_of_mrwss=0.8,
wired_frac_of_mrwss=0.8,
)
setup_start_time = time.time()
mlx_executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
loop = asyncio.get_running_loop()
model, tokenizer, sampler = await loop.run_in_executor(
model, tokenizer, sampler, group = await loop.run_in_executor( # type: ignore[type-arg]
mlx_executor,
partial(initialize_mlx, model_shard_meta=model_shard_meta, hosts=hosts),
partial(
initialize_mlx,
model_shard_meta=model_shard_meta,
hosts=hosts,
mlx_ibv_devices=mlx_ibv_devices,
mlx_ibv_coordinator=mlx_ibv_coordinator,
),
)
runner_print(
f"Warming up inference for model_shard_meta: {model_shard_meta} hosts: {hosts}"
)
toks = await warmup_inference(
mlx_executor=mlx_executor,
model=model,
tokenizer=tokenizer,
sampler=sampler,
group=group, # type: ignore[type-arg]
)
runner_print(f"Warmed up by generating {toks} tokens")
await conn.send(InitializedResponse(time_taken=time.time() - setup_start_time))
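Note the widened return value: `initialize_mlx` now yields `(model, tokenizer, sampler, group)`, and that `mx.distributed.Group` is threaded through `warmup_inference` into `generate_step`, so the prefill step-size broadcast only runs when a distributed group actually exists.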

View File

@@ -34,18 +34,21 @@ from exo.shared.types.worker.common import RunnerError
from exo.shared.types.worker.shards import ShardMetadata
from exo.worker.runner.bootstrap import entrypoint
from exo.worker.runner.utils import (
get_init_timeout,
get_prefil_timeout,
get_token_generate_timeout,
get_weights_size,
)
INITIALIZE_TIMEOUT = 400
PREFILL_TIMEOUT_SECONDS = 60
DECODE_TIMEOUT_SECONDS = 5
class RunnerSupervisor:
def __init__(
self,
model_shard_meta: ShardMetadata,
hosts: list[Host],
hosts: list[Host] | None,
mlx_ibv_devices: list[list[str | None]] | None,
mlx_ibv_coordinator: str | None,
runner_process: Process,
conn: Connection,
read_queue: asyncio.Queue[RunnerResponse],
@@ -53,6 +56,8 @@ class RunnerSupervisor:
):
self.model_shard_meta = model_shard_meta
self.hosts = hosts
self.mlx_ibv_devices = mlx_ibv_devices
self.mlx_ibv_coordinator = mlx_ibv_coordinator
self.runner_process = runner_process
self.conn = AsyncConnection[RunnerMessage, RunnerResponse](conn)
@@ -67,7 +72,9 @@ class RunnerSupervisor:
async def create(
cls,
model_shard_meta: ShardMetadata,
hosts: list[Host],
hosts: list[Host] | None = None,
mlx_ibv_devices: list[list[str | None]] | None = None,
mlx_ibv_coordinator: str | None = None,
initialize_timeout: Optional[float] = None,
) -> "RunnerSupervisor":
"""
@@ -93,6 +100,8 @@ class RunnerSupervisor:
self = cls(
model_shard_meta=model_shard_meta,
hosts=hosts,
mlx_ibv_devices=mlx_ibv_devices,
mlx_ibv_coordinator=mlx_ibv_coordinator,
runner_process=runner_process,
read_queue=read_queue,
conn=parent_conn,
@@ -104,12 +113,12 @@ class RunnerSupervisor:
SetupMessage(
model_shard_meta=model_shard_meta,
hosts=hosts,
mlx_ibv_devices=mlx_ibv_devices,
mlx_ibv_coordinator=mlx_ibv_coordinator,
)
)
if not initialize_timeout:
initialize_timeout = get_init_timeout(model_shard_meta)
initialize_timeout = initialize_timeout or INITIALIZE_TIMEOUT
response = await self._read_with_error_check(timeout=initialize_timeout)
assert isinstance(response, InitializedResponse)
@@ -206,17 +215,13 @@ class RunnerSupervisor:
response = await self._read_with_error_check(5.0)
assert isinstance(response, TokenizedResponse)
prompt_tokens = response.prompt_tokens
if request_started_callback is not None:
await request_started_callback()
prefil_timeout = get_prefil_timeout(
self.model_shard_meta, prompt_tokens=prompt_tokens
)
token_timeout = get_token_generate_timeout(self.model_shard_meta)
timeout = prefil_timeout
logger.bind(user_facing=True).info(
timeout = PREFILL_TIMEOUT_SECONDS
logger.info(
f"Starting chat completion with timeout {timeout}"
)
@@ -224,8 +229,8 @@ class RunnerSupervisor:
try:
response = await self._read_with_error_check(timeout)
except asyncio.TimeoutError as e:
logger.bind(user_facing=True).error(
f"Generation timed out during {'prefil' if timeout == prefil_timeout else 'decoding stage'}"
logger.error(
f"Generation timed out during {'prefill' if timeout == PREFILL_TIMEOUT_SECONDS else 'decoding stage'}"
)
raise e
@@ -239,7 +244,7 @@ class RunnerSupervisor:
token_id=response.token,
finish_reason=response.finish_reason,
)
timeout = token_timeout
timeout = DECODE_TIMEOUT_SECONDS
case FinishedResponse():
break
case _:
@@ -322,7 +327,7 @@ class RunnerSupervisor:
except Exception:
cause = f"signal={sig}"
logger.bind(user_facing=True).error(f"Runner terminated ({cause}).\n{captured}")
logger.error(f"Runner terminated ({cause}).\n{captured}")
return RunnerError(
error_type="RunnerCrash",

View File

@@ -5,7 +5,6 @@ import sys
import psutil
from loguru import logger
from exo.shared.constants import LB_DISK_GBPS, LB_MEMBW_GBPS, LB_TFLOPS
from exo.shared.types.memory import Memory
from exo.shared.types.worker.shards import ShardMetadata
@@ -57,48 +56,9 @@ def get_weights_size(model_shard_meta: ShardMetadata) -> Memory:
(model_shard_meta.end_layer - model_shard_meta.start_layer)
/ model_shard_meta.n_layers
* model_shard_meta.model_meta.storage_size.in_kb
/ (
1
if model_shard_meta.strategy in ["auto", "pipeline", "pipeline_rdma"]
else model_shard_meta.world_size
)
)
def get_init_timeout(model_shard_meta: ShardMetadata) -> float:
weights_size = get_weights_size(model_shard_meta)
kbps_read = 1024 * 1024 * LB_DISK_GBPS / 3
return weights_size.in_kb / kbps_read + 30.0
def _prefill_flops_for_shard(model_shard_meta: ShardMetadata, s: int) -> float:
p = get_weights_size(model_shard_meta).in_bytes
flops = 2.0 * p * s # parameter-dependent GEMMs
# flops += _attention_flops(meta, S) # optional S^2 term
return flops
def get_prefil_timeout(
model_shard_meta: ShardMetadata,
prompt_tokens: int,
*,
effective_tflops: float = LB_TFLOPS,
safety_mult: float = 1.6,
base_pad_s: float = 5.0,
) -> float:
"""
Returns a conservative timeout (seconds) for the prefill stage.
"""
total_flops = _prefill_flops_for_shard(model_shard_meta, prompt_tokens)
# Convert to seconds using sustained throughput
time_seconds = total_flops / (effective_tflops * 1e12)
# Prefill across pipeline stages is largely sequential; summing FLOPs already accounts for it.
# Add a base pad (launch/IO) and a safety multiplier for variance.
return base_pad_s + safety_mult * time_seconds
def get_token_generate_timeout(model_shard_meta: ShardMetadata) -> float:
weights_size = get_weights_size(model_shard_meta)
kbps_read = 1024 * 1024 * LB_MEMBW_GBPS / 3
return weights_size.in_kb / kbps_read + 2.0
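For reference, a worked example of the FLOPs model removed here (assuming a sustained 10 TFLOP/s, since `LB_TFLOPS` is defined elsewhere): a 10 GB shard (`p = 1e10` bytes) and a 4096-token prompt give `total_flops = 2 * p * s ≈ 8.2e13`, about 8.2 s of compute, so the old timeout came to `5.0 + 1.6 * 8.2 ≈ 18 s`. The commit trades this adaptive estimate for the fixed `PREFILL_TIMEOUT_SECONDS = 60` in the supervisor.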

View File

@@ -1,7 +1,6 @@
import asyncio
import re
import sys
from typing import Dict, List, Optional
from loguru import logger
from pydantic import BaseModel, Field
@@ -72,20 +71,16 @@ async def get_mac_friendly_name_async() -> str | None:
return None
async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
async def get_network_interface_info_async() -> list[NetworkInterfaceInfo]:
"""
Retrieves detailed network interface information on macOS.
Parses output from 'networksetup -listallhardwareports' and 'ifconfig'
to determine interface names, IP addresses, and types (ethernet, wifi, vpn, other).
Returns a list of NetworkInterfaceInfo objects.
"""
if sys.platform != "darwin":
return []
interfaces_info: list[NetworkInterfaceInfo] = []
interfaces_info: List[NetworkInterfaceInfo] = []
device_to_type_map: Dict[str, str] = {}
async def _run_cmd_async(command_parts: List[str]) -> Optional[str]:
async def _run_cmd_async(command_parts: list[str]) -> str | None:
# Helper to run a command and return its stdout, or None on error.
try:
process = await asyncio.create_subprocess_exec(
@@ -118,37 +113,9 @@ async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
)
return None
# 1. Get hardware port types from networksetup
networksetup_output = await _run_cmd_async(
["networksetup", "-listallhardwareports"]
)
if networksetup_output:
current_hardware_port_type_raw: Optional[str] = None
for line in networksetup_output.splitlines():
line_stripped = line.strip()
if line_stripped.startswith("Hardware Port:"):
current_hardware_port_type_raw = line_stripped.split(":", 1)[1].strip()
elif line_stripped.startswith("Device:") and current_hardware_port_type_raw:
device_name = line_stripped.split(":", 1)[1].strip()
if device_name and device_name != "N/A":
if "Thunderbolt" in current_hardware_port_type_raw:
device_to_type_map[device_name] = "thunderbolt"
elif (
"Wi-Fi" in current_hardware_port_type_raw
or "AirPort" in current_hardware_port_type_raw
):
device_to_type_map[device_name] = "wifi"
elif (
"Ethernet" in current_hardware_port_type_raw
or "LAN" in current_hardware_port_type_raw
):
device_to_type_map[device_name] = "ethernet"
current_hardware_port_type_raw = None # Reset for the next block
# 2. Get interface names and IP addresses from ifconfig
# Get interface names and IP addresses from ifconfig
ifconfig_output = await _run_cmd_async(["ifconfig"])
if ifconfig_output:
current_if_name: Optional[str] = None
# Regex for interface name (e.g., en0:, utun0:, tailscale0.)
interface_header_pattern = re.compile(r"^([a-zA-Z0-9\._-]+):")
# Regex for IPv4 address (inet)
@@ -156,44 +123,30 @@ async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
# Regex for IPv6 address (inet6)
inet6_pattern = re.compile(r"^\s+inet6\s+([0-9a-fA-F:]+(?:%[a-zA-Z0-9._-]+)?)")
def _add_interface_entry(if_name: str, ip_addr: str):
_if_type = device_to_type_map.get(if_name)
if not _if_type: # Infer type if not found via networksetup
if if_name.startswith(("utun", "wg", "ppp")) or "tailscale" in if_name:
_if_type = "vpn"
elif if_name.startswith("bridge"):
_if_type = "virtual" # For non-Thunderbolt bridges (e.g., Docker)
else:
_if_type = "other"
interfaces_info.append(
NetworkInterfaceInfo(name=if_name, ip_address=ip_addr, type=_if_type)
)
current_if_name: str | None = None
for line in ifconfig_output.splitlines():
header_match = interface_header_pattern.match(line)
if header_match:
potential_if_name = header_match.group(1)
if potential_if_name == "lo0": # Skip loopback interface
current_if_name = None
else:
current_if_name = potential_if_name
continue
current_if_name = header_match.group(1)
if current_if_name:
inet_m = inet_pattern.match(line)
if inet_m:
ipv4_address = inet_m.group(1)
_add_interface_entry(
current_if_name, ipv4_address
) # Add all IPv4, including APIPA
continue
interfaces_info.append(
NetworkInterfaceInfo(
name=current_if_name, ip_address=ipv4_address, type=""
)
)
inet6_m = inet6_pattern.match(line)
if inet6_m:
ipv6_address = inet6_m.group(1)
# No specific filtering for IPv6 link-local (e.g., fe80::) for now.
_add_interface_entry(current_if_name, ipv6_address)
interfaces_info.append(
NetworkInterfaceInfo(
name=current_if_name, ip_address=ipv6_address, type=""
)
)
return interfaces_info
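An illustration of the simplified parse (the sample is invented but shaped like real `ifconfig` output):

```python
sample = (
    "en0: flags=8863<UP,BROADCAST,SMART,RUNNING>\n"
    "\tinet 192.168.1.10 netmask 0xffffff00 broadcast 192.168.1.255\n"
    "\tinet6 fe80::1%en0 prefixlen 64\n"
)
# interface_header_pattern matches "en0:"; the inet/inet6 patterns then
# capture "192.168.1.10" and "fe80::1%en0" for that interface, each match
# appending a NetworkInterfaceInfo with type="" now that typing via
# networksetup has been removed.
```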
@@ -203,7 +156,7 @@ async def get_mac_system_info_async() -> SystemInfo:
model_id_val = "Unknown Model"
chip_id_val = "Unknown Chip"
memory_val = 0
network_interfaces_info_list: List[NetworkInterfaceInfo] = []
network_interfaces_info_list: list[NetworkInterfaceInfo] = []
if sys.platform != "darwin":
return SystemInfo(

24
tmp/run_llm.sh Executable file
View File

@@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail
if [ $# -lt 2 ]; then
echo "Usage: $0 <hostname> <query>"
exit 1
fi
HOST="$1"
shift
QUERY="$*"
curl -sN -X POST "http://$HOST:8000/v1/chat/completions" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"mlx-community/DeepSeek-V3.1-8bit\",
\"stream\": true,
\"messages\": [{ \"role\": \"user\", \"content\": \"$QUERY\" }]
}" |
grep --line-buffered '^data:' |
grep --line-buffered -v 'data: \[DONE\]' |
cut -d' ' -f2- |
jq -r --unbuffered '.choices[].delta.content // empty' |
awk '{ORS=""; print; fflush()} END {print "\n"}'
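Example invocation: `./tmp/run_llm.sh some-host "Summarise pipeline parallelism"` (the hostname is a placeholder) streams the reply token by token. Note that the query is interpolated straight into the JSON body, so queries containing double quotes or backslashes will break the request; the `--line-buffered`/`--unbuffered` flags keep output flushing per token.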

184
uv.lock generated
View File

@@ -1,5 +1,5 @@
version = 1
revision = 2
revision = 3
requires-python = ">=3.13"
resolution-markers = [
"sys_platform == 'darwin'",
@@ -391,8 +391,8 @@ requires-dist = [
{ name = "greenlet", specifier = ">=3.2.4" },
{ name = "huggingface-hub", specifier = ">=0.33.4" },
{ name = "loguru", specifier = ">=0.7.3" },
{ name = "mlx", specifier = "==0.29.3" },
{ name = "mlx-lm", specifier = "==0.28.3" },
{ name = "mlx", specifier = ">=0.29.3" },
{ name = "mlx-lm", specifier = ">=0.28.3" },
{ name = "networkx", specifier = ">=3.5" },
{ name = "openai", specifier = ">=1.99.9" },
{ name = "pathlib", specifier = ">=1.0.1" },
@@ -455,7 +455,7 @@ requires-dist = [
[[package]]
name = "fastapi"
version = "0.120.3"
version = "0.121.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "annotated-doc", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -463,9 +463,9 @@ dependencies = [
{ name = "starlette", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/85/c6/f324c07f5ebe34237b56b6396a94568d2d4a705df8a2ff82fa45029e7252/fastapi-0.120.3.tar.gz", hash = "sha256:17db50718ee86c9e01e54f9d8600abf130f6f762711cd0d8f02eb392668271ba", size = 339363, upload-time = "2025-10-30T20:41:33.072Z" }
sdist = { url = "https://files.pythonhosted.org/packages/8c/e3/77a2df0946703973b9905fd0cde6172c15e0781984320123b4f5079e7113/fastapi-0.121.0.tar.gz", hash = "sha256:06663356a0b1ee93e875bbf05a31fb22314f5bed455afaaad2b2dad7f26e98fa", size = 342412, upload-time = "2025-11-03T10:25:54.818Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/37/3a/1eef3ab55ede5af09186723898545a94d0a32b7ac9ea4e7af7bcb95f132a/fastapi-0.120.3-py3-none-any.whl", hash = "sha256:bfee21c98db9128dc425a686eafd14899e26e4471aab33076bff2427fd6dcd22", size = 108255, upload-time = "2025-10-30T20:41:31.247Z" },
{ url = "https://files.pythonhosted.org/packages/dd/2c/42277afc1ba1a18f8358561eee40785d27becab8f80a1f945c0a3051c6eb/fastapi-0.121.0-py3-none-any.whl", hash = "sha256:8bdf1b15a55f4e4b0d6201033da9109ea15632cb76cf156e7b8b4019f2172106", size = 109183, upload-time = "2025-11-03T10:25:53.27Z" },
]
[[package]]
@@ -981,7 +981,7 @@ wheels = [
[[package]]
name = "openai"
version = "2.6.1"
version = "2.7.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -993,9 +993,9 @@ dependencies = [
{ name = "tqdm", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/c4/44/303deb97be7c1c9b53118b52825cbd1557aeeff510f3a52566b1fa66f6a2/openai-2.6.1.tar.gz", hash = "sha256:27ae704d190615fca0c0fc2b796a38f8b5879645a3a52c9c453b23f97141bb49", size = 593043, upload-time = "2025-10-24T13:29:52.79Z" }
sdist = { url = "https://files.pythonhosted.org/packages/84/2c/3ca91dbd1a5b80c20fbd1e21d601f6afd7fd51927a1b27b08226b67ebd61/openai-2.7.0.tar.gz", hash = "sha256:8c42c24d06afece19e69afcb6c2b23b8b90f603a81616d8a0be80b80fb527ed2", size = 595876, upload-time = "2025-11-03T23:52:07.935Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/15/0e/331df43df633e6105ff9cf45e0ce57762bd126a45ac16b25a43f6738d8a2/openai-2.6.1-py3-none-any.whl", hash = "sha256:904e4b5254a8416746a2f05649594fa41b19d799843cd134dac86167e094edef", size = 1005551, upload-time = "2025-10-24T13:29:50.973Z" },
{ url = "https://files.pythonhosted.org/packages/fc/0f/e9618a92a9497846a3071f2a7ed43409215947106c7e5ce7d082f784de10/openai-2.7.0-py3-none-any.whl", hash = "sha256:9fc44861a692b7e80a7ec1252c10af79612a3ef1581ecb192caf4585afca5363", size = 1008759, upload-time = "2025-11-03T23:52:05.322Z" },
]
[[package]]
@@ -1106,22 +1106,22 @@ wheels = [
[[package]]
name = "psutil"
version = "7.1.2"
version = "7.1.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/cd/ec/7b8e6b9b1d22708138630ef34c53ab2b61032c04f16adfdbb96791c8c70c/psutil-7.1.2.tar.gz", hash = "sha256:aa225cdde1335ff9684708ee8c72650f6598d5ed2114b9a7c5802030b1785018", size = 487424, upload-time = "2025-10-25T10:46:34.931Z" }
sdist = { url = "https://files.pythonhosted.org/packages/e1/88/bdd0a41e5857d5d703287598cbf08dad90aed56774ea52ae071bae9071b6/psutil-7.1.3.tar.gz", hash = "sha256:6c86281738d77335af7aec228328e944b30930899ea760ecf33a4dba66be5e74", size = 489059, upload-time = "2025-11-02T12:25:54.619Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/b8/d9/b56cc9f883140ac10021a8c9b0f4e16eed1ba675c22513cdcbce3ba64014/psutil-7.1.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0cc5c6889b9871f231ed5455a9a02149e388fffcb30b607fb7a8896a6d95f22e", size = 238575, upload-time = "2025-10-25T10:46:38.728Z" },
{ url = "https://files.pythonhosted.org/packages/36/eb/28d22de383888deb252c818622196e709da98816e296ef95afda33f1c0a2/psutil-7.1.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:8e9e77a977208d84aa363a4a12e0f72189d58bbf4e46b49aae29a2c6e93ef206", size = 239297, upload-time = "2025-10-25T10:46:41.347Z" },
{ url = "https://files.pythonhosted.org/packages/89/5d/220039e2f28cc129626e54d63892ab05c0d56a29818bfe7268dcb5008932/psutil-7.1.2-cp313-cp313t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7d9623a5e4164d2220ecceb071f4b333b3c78866141e8887c072129185f41278", size = 280420, upload-time = "2025-10-25T10:46:44.122Z" },
{ url = "https://files.pythonhosted.org/packages/ba/7a/286f0e1c167445b2ef4a6cbdfc8c59fdb45a5a493788950cf8467201dc73/psutil-7.1.2-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:364b1c10fe4ed59c89ec49e5f1a70da353b27986fa8233b4b999df4742a5ee2f", size = 283049, upload-time = "2025-10-25T10:46:47.095Z" },
{ url = "https://files.pythonhosted.org/packages/56/9e/f1c5c746b4ed5320952acd3002d3962fe36f30524c00ea79fdf954cc6779/psutil-7.1.2-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:e09cfe92aa8e22b1ec5e2d394820cf86c5dff6367ac3242366485dfa874d43bc", size = 238640, upload-time = "2025-10-25T10:46:54.089Z" },
{ url = "https://files.pythonhosted.org/packages/32/ee/fd26216a735395cc25c3899634e34aeb41fb1f3dbb44acc67d9e594be562/psutil-7.1.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:fa6342cf859c48b19df3e4aa170e4cfb64aadc50b11e06bb569c6c777b089c9e", size = 239303, upload-time = "2025-10-25T10:46:56.932Z" },
{ url = "https://files.pythonhosted.org/packages/3c/cd/7d96eaec4ef7742b845a9ce2759a2769ecce4ab7a99133da24abacbc9e41/psutil-7.1.2-cp314-cp314t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:625977443498ee7d6c1e63e93bacca893fd759a66c5f635d05e05811d23fb5ee", size = 281717, upload-time = "2025-10-25T10:46:59.116Z" },
{ url = "https://files.pythonhosted.org/packages/bc/1a/7f0b84bdb067d35fe7fade5fff888408688caf989806ce2d6dae08c72dd5/psutil-7.1.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a24bcd7b7f2918d934af0fb91859f621b873d6aa81267575e3655cd387572a7", size = 284575, upload-time = "2025-10-25T10:47:00.944Z" },
{ url = "https://files.pythonhosted.org/packages/ae/89/b9f8d47ddbc52d7301fc868e8224e5f44ed3c7f55e6d0f54ecaf5dd9ff5e/psutil-7.1.2-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:c9ba5c19f2d46203ee8c152c7b01df6eec87d883cfd8ee1af2ef2727f6b0f814", size = 237244, upload-time = "2025-10-25T10:47:07.086Z" },
{ url = "https://files.pythonhosted.org/packages/c8/7a/8628c2f6b240680a67d73d8742bb9ff39b1820a693740e43096d5dcb01e5/psutil-7.1.2-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:2a486030d2fe81bec023f703d3d155f4823a10a47c36784c84f1cc7f8d39bedb", size = 238101, upload-time = "2025-10-25T10:47:09.523Z" },
{ url = "https://files.pythonhosted.org/packages/30/28/5e27f4d5a0e347f8e3cc16cd7d35533dbce086c95807f1f0e9cd77e26c10/psutil-7.1.2-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3efd8fc791492e7808a51cb2b94889db7578bfaea22df931424f874468e389e3", size = 258675, upload-time = "2025-10-25T10:47:11.082Z" },
{ url = "https://files.pythonhosted.org/packages/e5/5c/79cf60c9acf36d087f0db0f82066fca4a780e97e5b3a2e4c38209c03d170/psutil-7.1.2-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e2aeb9b64f481b8eabfc633bd39e0016d4d8bbcd590d984af764d80bf0851b8a", size = 260203, upload-time = "2025-10-25T10:47:13.226Z" },
{ url = "https://files.pythonhosted.org/packages/bd/93/0c49e776b8734fef56ec9c5c57f923922f2cf0497d62e0f419465f28f3d0/psutil-7.1.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0005da714eee687b4b8decd3d6cc7c6db36215c9e74e5ad2264b90c3df7d92dc", size = 239751, upload-time = "2025-11-02T12:25:58.161Z" },
{ url = "https://files.pythonhosted.org/packages/6f/8d/b31e39c769e70780f007969815195a55c81a63efebdd4dbe9e7a113adb2f/psutil-7.1.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:19644c85dcb987e35eeeaefdc3915d059dac7bd1167cdcdbf27e0ce2df0c08c0", size = 240368, upload-time = "2025-11-02T12:26:00.491Z" },
{ url = "https://files.pythonhosted.org/packages/62/61/23fd4acc3c9eebbf6b6c78bcd89e5d020cfde4acf0a9233e9d4e3fa698b4/psutil-7.1.3-cp313-cp313t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:95ef04cf2e5ba0ab9eaafc4a11eaae91b44f4ef5541acd2ee91d9108d00d59a7", size = 287134, upload-time = "2025-11-02T12:26:02.613Z" },
{ url = "https://files.pythonhosted.org/packages/30/1c/f921a009ea9ceb51aa355cb0cc118f68d354db36eae18174bab63affb3e6/psutil-7.1.3-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1068c303be3a72f8e18e412c5b2a8f6d31750fb152f9cb106b54090296c9d251", size = 289904, upload-time = "2025-11-02T12:26:05.207Z" },
{ url = "https://files.pythonhosted.org/packages/2e/bb/6670bded3e3236eb4287c7bcdc167e9fae6e1e9286e437f7111caed2f909/psutil-7.1.3-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:b403da1df4d6d43973dc004d19cee3b848e998ae3154cc8097d139b77156c353", size = 239843, upload-time = "2025-11-02T12:26:11.968Z" },
{ url = "https://files.pythonhosted.org/packages/b8/66/853d50e75a38c9a7370ddbeefabdd3d3116b9c31ef94dc92c6729bc36bec/psutil-7.1.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ad81425efc5e75da3f39b3e636293360ad8d0b49bed7df824c79764fb4ba9b8b", size = 240369, upload-time = "2025-11-02T12:26:14.358Z" },
{ url = "https://files.pythonhosted.org/packages/41/bd/313aba97cb5bfb26916dc29cf0646cbe4dd6a89ca69e8c6edce654876d39/psutil-7.1.3-cp314-cp314t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8f33a3702e167783a9213db10ad29650ebf383946e91bc77f28a5eb083496bc9", size = 288210, upload-time = "2025-11-02T12:26:16.699Z" },
{ url = "https://files.pythonhosted.org/packages/c2/fa/76e3c06e760927a0cfb5705eb38164254de34e9bd86db656d4dbaa228b04/psutil-7.1.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fac9cd332c67f4422504297889da5ab7e05fd11e3c4392140f7370f4208ded1f", size = 291182, upload-time = "2025-11-02T12:26:18.848Z" },
{ url = "https://files.pythonhosted.org/packages/ef/94/46b9154a800253e7ecff5aaacdf8ebf43db99de4a2dfa18575b02548654e/psutil-7.1.3-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:2bdbcd0e58ca14996a42adf3621a6244f1bb2e2e528886959c72cf1e326677ab", size = 238359, upload-time = "2025-11-02T12:26:25.284Z" },
{ url = "https://files.pythonhosted.org/packages/68/3a/9f93cff5c025029a36d9a92fef47220ab4692ee7f2be0fba9f92813d0cb8/psutil-7.1.3-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:bc31fa00f1fbc3c3802141eede66f3a2d51d89716a194bf2cd6fc68310a19880", size = 239171, upload-time = "2025-11-02T12:26:27.23Z" },
{ url = "https://files.pythonhosted.org/packages/ce/b1/5f49af514f76431ba4eea935b8ad3725cdeb397e9245ab919dbc1d1dc20f/psutil-7.1.3-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3bb428f9f05c1225a558f53e30ccbad9930b11c3fc206836242de1091d3e7dd3", size = 263261, upload-time = "2025-11-02T12:26:29.48Z" },
{ url = "https://files.pythonhosted.org/packages/e0/95/992c8816a74016eb095e73585d747e0a8ea21a061ed3689474fabb29a395/psutil-7.1.3-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:56d974e02ca2c8eb4812c3f76c30e28836fffc311d55d979f1465c1feeb2b68b", size = 264635, upload-time = "2025-11-02T12:26:31.74Z" },
]
[[package]]
@@ -1254,54 +1254,54 @@ wheels = [
[[package]]
name = "regex"
version = "2025.10.23"
version = "2025.11.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/f8/c8/1d2160d36b11fbe0a61acb7c3c81ab032d9ec8ad888ac9e0a61b85ab99dd/regex-2025.10.23.tar.gz", hash = "sha256:8cbaf8ceb88f96ae2356d01b9adf5e6306fa42fa6f7eab6b97794e37c959ac26", size = 401266, upload-time = "2025-10-21T15:58:20.23Z" }
sdist = { url = "https://files.pythonhosted.org/packages/cc/a9/546676f25e573a4cf00fe8e119b78a37b6a8fe2dc95cda877b30889c9c45/regex-2025.11.3.tar.gz", hash = "sha256:1fedc720f9bb2494ce31a58a1631f9c82df6a09b49c19517ea5cc280b4541e01", size = 414669, upload-time = "2025-11-03T21:34:22.089Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/28/c6/195a6217a43719d5a6a12cc192a22d12c40290cecfa577f00f4fb822f07d/regex-2025.10.23-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:b7690f95404a1293923a296981fd943cca12c31a41af9c21ba3edd06398fc193", size = 488956, upload-time = "2025-10-21T15:55:42.887Z" },
{ url = "https://files.pythonhosted.org/packages/4c/93/181070cd1aa2fa541ff2d3afcf763ceecd4937b34c615fa92765020a6c90/regex-2025.10.23-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:1a32d77aeaea58a13230100dd8797ac1a84c457f3af2fdf0d81ea689d5a9105b", size = 290997, upload-time = "2025-10-21T15:55:44.53Z" },
{ url = "https://files.pythonhosted.org/packages/b6/c5/9d37fbe3a40ed8dda78c23e1263002497540c0d1522ed75482ef6c2000f0/regex-2025.10.23-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:b24b29402f264f70a3c81f45974323b41764ff7159655360543b7cabb73e7d2f", size = 288686, upload-time = "2025-10-21T15:55:46.186Z" },
{ url = "https://files.pythonhosted.org/packages/5f/e7/db610ff9f10c2921f9b6ac0c8d8be4681b28ddd40fc0549429366967e61f/regex-2025.10.23-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:563824a08c7c03d96856d84b46fdb3bbb7cfbdf79da7ef68725cda2ce169c72a", size = 798466, upload-time = "2025-10-21T15:55:48.24Z" },
{ url = "https://files.pythonhosted.org/packages/90/10/aab883e1fa7fe2feb15ac663026e70ca0ae1411efa0c7a4a0342d9545015/regex-2025.10.23-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a0ec8bdd88d2e2659c3518087ee34b37e20bd169419ffead4240a7004e8ed03b", size = 863996, upload-time = "2025-10-21T15:55:50.478Z" },
{ url = "https://files.pythonhosted.org/packages/a2/b0/8f686dd97a51f3b37d0238cd00a6d0f9ccabe701f05b56de1918571d0d61/regex-2025.10.23-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b577601bfe1d33913fcd9276d7607bbac827c4798d9e14d04bf37d417a6c41cb", size = 912145, upload-time = "2025-10-21T15:55:52.215Z" },
{ url = "https://files.pythonhosted.org/packages/a3/ca/639f8cd5b08797bca38fc5e7e07f76641a428cf8c7fca05894caf045aa32/regex-2025.10.23-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7c9f2c68ac6cb3de94eea08a437a75eaa2bd33f9e97c84836ca0b610a5804368", size = 803370, upload-time = "2025-10-21T15:55:53.944Z" },
{ url = "https://files.pythonhosted.org/packages/0d/1e/a40725bb76959eddf8abc42a967bed6f4851b39f5ac4f20e9794d7832aa5/regex-2025.10.23-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:89f8b9ea3830c79468e26b0e21c3585f69f105157c2154a36f6b7839f8afb351", size = 787767, upload-time = "2025-10-21T15:55:56.004Z" },
{ url = "https://files.pythonhosted.org/packages/3d/d8/8ee9858062936b0f99656dce390aa667c6e7fb0c357b1b9bf76fb5e2e708/regex-2025.10.23-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:98fd84c4e4ea185b3bb5bf065261ab45867d8875032f358a435647285c722673", size = 858335, upload-time = "2025-10-21T15:55:58.185Z" },
{ url = "https://files.pythonhosted.org/packages/d8/0a/ed5faaa63fa8e3064ab670e08061fbf09e3a10235b19630cf0cbb9e48c0a/regex-2025.10.23-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:1e11d3e5887b8b096f96b4154dfb902f29c723a9556639586cd140e77e28b313", size = 850402, upload-time = "2025-10-21T15:56:00.023Z" },
{ url = "https://files.pythonhosted.org/packages/79/14/d05f617342f4b2b4a23561da500ca2beab062bfcc408d60680e77ecaf04d/regex-2025.10.23-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:4f13450328a6634348d47a88367e06b64c9d84980ef6a748f717b13f8ce64e87", size = 789739, upload-time = "2025-10-21T15:56:01.967Z" },
{ url = "https://files.pythonhosted.org/packages/3e/b3/95b310605285573341fc062d1d30b19a54f857530e86c805f942c4ff7941/regex-2025.10.23-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:7d6606524fa77b3912c9ef52a42ef63c6cfbfc1077e9dc6296cd5da0da286044", size = 491850, upload-time = "2025-10-21T15:56:11.685Z" },
{ url = "https://files.pythonhosted.org/packages/a4/8f/207c2cec01e34e56db1eff606eef46644a60cf1739ecd474627db90ad90b/regex-2025.10.23-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:c037aadf4d64bdc38af7db3dbd34877a057ce6524eefcb2914d6d41c56f968cc", size = 292537, upload-time = "2025-10-21T15:56:13.963Z" },
{ url = "https://files.pythonhosted.org/packages/98/3b/025240af4ada1dc0b5f10d73f3e5122d04ce7f8908ab8881e5d82b9d61b6/regex-2025.10.23-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:99018c331fb2529084a0c9b4c713dfa49fafb47c7712422e49467c13a636c656", size = 290904, upload-time = "2025-10-21T15:56:16.016Z" },
{ url = "https://files.pythonhosted.org/packages/81/8e/104ac14e2d3450c43db18ec03e1b96b445a94ae510b60138f00ce2cb7ca1/regex-2025.10.23-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fd8aba965604d70306eb90a35528f776e59112a7114a5162824d43b76fa27f58", size = 807311, upload-time = "2025-10-21T15:56:17.818Z" },
{ url = "https://files.pythonhosted.org/packages/19/63/78aef90141b7ce0be8a18e1782f764f6997ad09de0e05251f0d2503a914a/regex-2025.10.23-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:238e67264b4013e74136c49f883734f68656adf8257bfa13b515626b31b20f8e", size = 873241, upload-time = "2025-10-21T15:56:19.941Z" },
{ url = "https://files.pythonhosted.org/packages/b3/a8/80eb1201bb49ae4dba68a1b284b4211ed9daa8e74dc600018a10a90399fb/regex-2025.10.23-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b2eb48bd9848d66fd04826382f5e8491ae633de3233a3d64d58ceb4ecfa2113a", size = 914794, upload-time = "2025-10-21T15:56:22.488Z" },
{ url = "https://files.pythonhosted.org/packages/f0/d5/1984b6ee93281f360a119a5ca1af6a8ca7d8417861671388bf750becc29b/regex-2025.10.23-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d36591ce06d047d0c0fe2fc5f14bfbd5b4525d08a7b6a279379085e13f0e3d0e", size = 812581, upload-time = "2025-10-21T15:56:24.319Z" },
{ url = "https://files.pythonhosted.org/packages/c4/39/11ebdc6d9927172a64ae237d16763145db6bd45ebb4055c17b88edab72a7/regex-2025.10.23-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:b5d4ece8628d6e364302006366cea3ee887db397faebacc5dacf8ef19e064cf8", size = 795346, upload-time = "2025-10-21T15:56:26.232Z" },
{ url = "https://files.pythonhosted.org/packages/3b/b4/89a591bcc08b5e436af43315284bd233ba77daf0cf20e098d7af12f006c1/regex-2025.10.23-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:39a7e8083959cb1c4ff74e483eecb5a65d3b3e1d821b256e54baf61782c906c6", size = 868214, upload-time = "2025-10-21T15:56:28.597Z" },
{ url = "https://files.pythonhosted.org/packages/3d/ff/58ba98409c1dbc8316cdb20dafbc63ed267380a07780cafecaf5012dabc9/regex-2025.10.23-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:842d449a8fefe546f311656cf8c0d6729b08c09a185f1cad94c756210286d6a8", size = 854540, upload-time = "2025-10-21T15:56:30.875Z" },
{ url = "https://files.pythonhosted.org/packages/9a/f2/4a9e9338d67626e2071b643f828a482712ad15889d7268e11e9a63d6f7e9/regex-2025.10.23-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:d614986dc68506be8f00474f4f6960e03e4ca9883f7df47744800e7d7c08a494", size = 799346, upload-time = "2025-10-21T15:56:32.725Z" },
{ url = "https://files.pythonhosted.org/packages/73/f6/0caf29fec943f201fbc8822879c99d31e59c1d51a983d9843ee5cf398539/regex-2025.10.23-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:5b5cb5b6344c4c4c24b2dc87b0bfee78202b07ef7633385df70da7fcf6f7cec6", size = 488960, upload-time = "2025-10-21T15:56:40.849Z" },
{ url = "https://files.pythonhosted.org/packages/8e/7d/ebb7085b8fa31c24ce0355107cea2b92229d9050552a01c5d291c42aecea/regex-2025.10.23-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:a6ce7973384c37bdf0f371a843f95a6e6f4e1489e10e0cf57330198df72959c5", size = 290932, upload-time = "2025-10-21T15:56:42.875Z" },
{ url = "https://files.pythonhosted.org/packages/27/41/43906867287cbb5ca4cee671c3cc8081e15deef86a8189c3aad9ac9f6b4d/regex-2025.10.23-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:2ee3663f2c334959016b56e3bd0dd187cbc73f948e3a3af14c3caaa0c3035d10", size = 288766, upload-time = "2025-10-21T15:56:44.894Z" },
{ url = "https://files.pythonhosted.org/packages/ab/9e/ea66132776700fc77a39b1056e7a5f1308032fead94507e208dc6716b7cd/regex-2025.10.23-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2003cc82a579107e70d013482acce8ba773293f2db534fb532738395c557ff34", size = 798884, upload-time = "2025-10-21T15:56:47.178Z" },
{ url = "https://files.pythonhosted.org/packages/d5/99/aed1453687ab63819a443930770db972c5c8064421f0d9f5da9ad029f26b/regex-2025.10.23-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:182c452279365a93a9f45874f7f191ec1c51e1f1eb41bf2b16563f1a40c1da3a", size = 864768, upload-time = "2025-10-21T15:56:49.793Z" },
{ url = "https://files.pythonhosted.org/packages/99/5d/732fe747a1304805eb3853ce6337eea16b169f7105a0d0dd9c6a5ffa9948/regex-2025.10.23-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b1249e9ff581c5b658c8f0437f883b01f1edcf424a16388591e7c05e5e9e8b0c", size = 911394, upload-time = "2025-10-21T15:56:52.186Z" },
{ url = "https://files.pythonhosted.org/packages/5e/48/58a1f6623466522352a6efa153b9a3714fc559d9f930e9bc947b4a88a2c3/regex-2025.10.23-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2b841698f93db3ccc36caa1900d2a3be281d9539b822dc012f08fc80b46a3224", size = 803145, upload-time = "2025-10-21T15:56:55.142Z" },
{ url = "https://files.pythonhosted.org/packages/ea/f6/7dea79be2681a5574ab3fc237aa53b2c1dfd6bd2b44d4640b6c76f33f4c1/regex-2025.10.23-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:956d89e0c92d471e8f7eee73f73fdff5ed345886378c45a43175a77538a1ffe4", size = 787831, upload-time = "2025-10-21T15:56:57.203Z" },
{ url = "https://files.pythonhosted.org/packages/3a/ad/07b76950fbbe65f88120ca2d8d845047c401450f607c99ed38862904671d/regex-2025.10.23-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:5c259cb363299a0d90d63b5c0d7568ee98419861618a95ee9d91a41cb9954462", size = 859162, upload-time = "2025-10-21T15:56:59.195Z" },
{ url = "https://files.pythonhosted.org/packages/41/87/374f3b2021b22aa6a4fc0b750d63f9721e53d1631a238f7a1c343c1cd288/regex-2025.10.23-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:185d2b18c062820b3a40d8fefa223a83f10b20a674bf6e8c4a432e8dfd844627", size = 849899, upload-time = "2025-10-21T15:57:01.747Z" },
{ url = "https://files.pythonhosted.org/packages/12/4a/7f7bb17c5a5a9747249807210e348450dab9212a46ae6d23ebce86ba6a2b/regex-2025.10.23-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:281d87fa790049c2b7c1b4253121edd80b392b19b5a3d28dc2a77579cb2a58ec", size = 789372, upload-time = "2025-10-21T15:57:04.018Z" },
{ url = "https://files.pythonhosted.org/packages/a6/d0/2025268315e8b2b7b660039824cb7765a41623e97d4cd421510925400487/regex-2025.10.23-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:1f5799ea1787aa6de6c150377d11afad39a38afd033f0c5247aecb997978c422", size = 491854, upload-time = "2025-10-21T15:57:12.526Z" },
{ url = "https://files.pythonhosted.org/packages/44/35/5681c2fec5e8b33454390af209c4353dfc44606bf06d714b0b8bd0454ffe/regex-2025.10.23-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:a9639ab7540cfea45ef57d16dcbea2e22de351998d614c3ad2f9778fa3bdd788", size = 292542, upload-time = "2025-10-21T15:57:15.158Z" },
{ url = "https://files.pythonhosted.org/packages/5d/17/184eed05543b724132e4a18149e900f5189001fcfe2d64edaae4fbaf36b4/regex-2025.10.23-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:08f52122c352eb44c3421dab78b9b73a8a77a282cc8314ae576fcaa92b780d10", size = 290903, upload-time = "2025-10-21T15:57:17.108Z" },
{ url = "https://files.pythonhosted.org/packages/25/d0/5e3347aa0db0de382dddfa133a7b0ae72f24b4344f3989398980b44a3924/regex-2025.10.23-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ebf1baebef1c4088ad5a5623decec6b52950f0e4d7a0ae4d48f0a99f8c9cb7d7", size = 807546, upload-time = "2025-10-21T15:57:19.179Z" },
{ url = "https://files.pythonhosted.org/packages/d2/bb/40c589bbdce1be0c55e9f8159789d58d47a22014f2f820cf2b517a5cd193/regex-2025.10.23-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:16b0f1c2e2d566c562d5c384c2b492646be0a19798532fdc1fdedacc66e3223f", size = 873322, upload-time = "2025-10-21T15:57:21.36Z" },
{ url = "https://files.pythonhosted.org/packages/fe/56/a7e40c01575ac93360e606278d359f91829781a9f7fb6e5aa435039edbda/regex-2025.10.23-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f7ada5d9dceafaab92646aa00c10a9efd9b09942dd9b0d7c5a4b73db92cc7e61", size = 914855, upload-time = "2025-10-21T15:57:24.044Z" },
{ url = "https://files.pythonhosted.org/packages/5c/4b/d55587b192763db3163c3f508b3b67b31bb6f5e7a0e08b83013d0a59500a/regex-2025.10.23-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3a36b4005770044bf08edecc798f0e41a75795b9e7c9c12fe29da8d792ef870c", size = 812724, upload-time = "2025-10-21T15:57:26.123Z" },
{ url = "https://files.pythonhosted.org/packages/33/20/18bac334955fbe99d17229f4f8e98d05e4a501ac03a442be8facbb37c304/regex-2025.10.23-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:af7b2661dcc032da1fae82069b5ebf2ac1dfcd5359ef8b35e1367bfc92181432", size = 795439, upload-time = "2025-10-21T15:57:28.497Z" },
{ url = "https://files.pythonhosted.org/packages/67/46/c57266be9df8549c7d85deb4cb82280cb0019e46fff677534c5fa1badfa4/regex-2025.10.23-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:1cb976810ac1416a67562c2e5ba0accf6f928932320fef302e08100ed681b38e", size = 868336, upload-time = "2025-10-21T15:57:30.867Z" },
{ url = "https://files.pythonhosted.org/packages/b8/f3/bd5879e41ef8187fec5e678e94b526a93f99e7bbe0437b0f2b47f9101694/regex-2025.10.23-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:1a56a54be3897d62f54290190fbcd754bff6932934529fbf5b29933da28fcd43", size = 854567, upload-time = "2025-10-21T15:57:33.062Z" },
{ url = "https://files.pythonhosted.org/packages/e6/57/2b6bbdbd2f24dfed5b028033aa17ad8f7d86bb28f1a892cac8b3bc89d059/regex-2025.10.23-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:8f3e6d202fb52c2153f532043bbcf618fd177df47b0b306741eb9b60ba96edc3", size = 799565, upload-time = "2025-10-21T15:57:35.153Z" },
{ url = "https://files.pythonhosted.org/packages/e1/a7/dda24ebd49da46a197436ad96378f17df30ceb40e52e859fc42cac45b850/regex-2025.11.3-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:c1e448051717a334891f2b9a620fe36776ebf3dd8ec46a0b877c8ae69575feb4", size = 489081, upload-time = "2025-11-03T21:31:55.9Z" },
{ url = "https://files.pythonhosted.org/packages/19/22/af2dc751aacf88089836aa088a1a11c4f21a04707eb1b0478e8e8fb32847/regex-2025.11.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:9b5aca4d5dfd7fbfbfbdaf44850fcc7709a01146a797536a8f84952e940cca76", size = 291123, upload-time = "2025-11-03T21:31:57.758Z" },
{ url = "https://files.pythonhosted.org/packages/a3/88/1a3ea5672f4b0a84802ee9891b86743438e7c04eb0b8f8c4e16a42375327/regex-2025.11.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:04d2765516395cf7dda331a244a3282c0f5ae96075f728629287dfa6f76ba70a", size = 288814, upload-time = "2025-11-03T21:32:01.12Z" },
{ url = "https://files.pythonhosted.org/packages/fb/8c/f5987895bf42b8ddeea1b315c9fedcfe07cadee28b9c98cf50d00adcb14d/regex-2025.11.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5d9903ca42bfeec4cebedba8022a7c97ad2aab22e09573ce9976ba01b65e4361", size = 798592, upload-time = "2025-11-03T21:32:03.006Z" },
{ url = "https://files.pythonhosted.org/packages/99/2a/6591ebeede78203fa77ee46a1c36649e02df9eaa77a033d1ccdf2fcd5d4e/regex-2025.11.3-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:639431bdc89d6429f6721625e8129413980ccd62e9d3f496be618a41d205f160", size = 864122, upload-time = "2025-11-03T21:32:04.553Z" },
{ url = "https://files.pythonhosted.org/packages/94/d6/be32a87cf28cf8ed064ff281cfbd49aefd90242a83e4b08b5a86b38e8eb4/regex-2025.11.3-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f117efad42068f9715677c8523ed2be1518116d1c49b1dd17987716695181efe", size = 912272, upload-time = "2025-11-03T21:32:06.148Z" },
{ url = "https://files.pythonhosted.org/packages/62/11/9bcef2d1445665b180ac7f230406ad80671f0fc2a6ffb93493b5dd8cd64c/regex-2025.11.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4aecb6f461316adf9f1f0f6a4a1a3d79e045f9b71ec76055a791affa3b285850", size = 803497, upload-time = "2025-11-03T21:32:08.162Z" },
{ url = "https://files.pythonhosted.org/packages/e5/a7/da0dc273d57f560399aa16d8a68ae7f9b57679476fc7ace46501d455fe84/regex-2025.11.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:3b3a5f320136873cc5561098dfab677eea139521cb9a9e8db98b7e64aef44cbc", size = 787892, upload-time = "2025-11-03T21:32:09.769Z" },
{ url = "https://files.pythonhosted.org/packages/da/4b/732a0c5a9736a0b8d6d720d4945a2f1e6f38f87f48f3173559f53e8d5d82/regex-2025.11.3-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:75fa6f0056e7efb1f42a1c34e58be24072cb9e61a601340cc1196ae92326a4f9", size = 858462, upload-time = "2025-11-03T21:32:11.769Z" },
{ url = "https://files.pythonhosted.org/packages/0c/f5/a2a03df27dc4c2d0c769220f5110ba8c4084b0bfa9ab0f9b4fcfa3d2b0fc/regex-2025.11.3-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:dbe6095001465294f13f1adcd3311e50dd84e5a71525f20a10bd16689c61ce0b", size = 850528, upload-time = "2025-11-03T21:32:13.906Z" },
{ url = "https://files.pythonhosted.org/packages/d6/09/e1cd5bee3841c7f6eb37d95ca91cdee7100b8f88b81e41c2ef426910891a/regex-2025.11.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:454d9b4ae7881afbc25015b8627c16d88a597479b9dea82b8c6e7e2e07240dc7", size = 789866, upload-time = "2025-11-03T21:32:15.748Z" },
{ url = "https://files.pythonhosted.org/packages/20/28/fd0c63357caefe5680b8ea052131acbd7f456893b69cc2a90cc3e0dc90d4/regex-2025.11.3-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:1eb1ebf6822b756c723e09f5186473d93236c06c579d2cc0671a722d2ab14281", size = 491984, upload-time = "2025-11-03T21:32:23.466Z" },
{ url = "https://files.pythonhosted.org/packages/df/ec/7014c15626ab46b902b3bcc4b28a7bae46d8f281fc7ea9c95e22fcaaa917/regex-2025.11.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:1e00ec2970aab10dc5db34af535f21fcf32b4a31d99e34963419636e2f85ae39", size = 292673, upload-time = "2025-11-03T21:32:25.034Z" },
{ url = "https://files.pythonhosted.org/packages/23/ab/3b952ff7239f20d05f1f99e9e20188513905f218c81d52fb5e78d2bf7634/regex-2025.11.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:a4cb042b615245d5ff9b3794f56be4138b5adc35a4166014d31d1814744148c7", size = 291029, upload-time = "2025-11-03T21:32:26.528Z" },
{ url = "https://files.pythonhosted.org/packages/21/7e/3dc2749fc684f455f162dcafb8a187b559e2614f3826877d3844a131f37b/regex-2025.11.3-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:44f264d4bf02f3176467d90b294d59bf1db9fe53c141ff772f27a8b456b2a9ed", size = 807437, upload-time = "2025-11-03T21:32:28.363Z" },
{ url = "https://files.pythonhosted.org/packages/1b/0b/d529a85ab349c6a25d1ca783235b6e3eedf187247eab536797021f7126c6/regex-2025.11.3-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:7be0277469bf3bd7a34a9c57c1b6a724532a0d235cd0dc4e7f4316f982c28b19", size = 873368, upload-time = "2025-11-03T21:32:30.4Z" },
{ url = "https://files.pythonhosted.org/packages/7d/18/2d868155f8c9e3e9d8f9e10c64e9a9f496bb8f7e037a88a8bed26b435af6/regex-2025.11.3-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0d31e08426ff4b5b650f68839f5af51a92a5b51abd8554a60c2fbc7c71f25d0b", size = 914921, upload-time = "2025-11-03T21:32:32.123Z" },
{ url = "https://files.pythonhosted.org/packages/2d/71/9d72ff0f354fa783fe2ba913c8734c3b433b86406117a8db4ea2bf1c7a2f/regex-2025.11.3-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e43586ce5bd28f9f285a6e729466841368c4a0353f6fd08d4ce4630843d3648a", size = 812708, upload-time = "2025-11-03T21:32:34.305Z" },
{ url = "https://files.pythonhosted.org/packages/e7/19/ce4bf7f5575c97f82b6e804ffb5c4e940c62609ab2a0d9538d47a7fdf7d4/regex-2025.11.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:0f9397d561a4c16829d4e6ff75202c1c08b68a3bdbfe29dbfcdb31c9830907c6", size = 795472, upload-time = "2025-11-03T21:32:36.364Z" },
{ url = "https://files.pythonhosted.org/packages/03/86/fd1063a176ffb7b2315f9a1b08d17b18118b28d9df163132615b835a26ee/regex-2025.11.3-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:dd16e78eb18ffdb25ee33a0682d17912e8cc8a770e885aeee95020046128f1ce", size = 868341, upload-time = "2025-11-03T21:32:38.042Z" },
{ url = "https://files.pythonhosted.org/packages/12/43/103fb2e9811205e7386366501bc866a164a0430c79dd59eac886a2822950/regex-2025.11.3-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:ffcca5b9efe948ba0661e9df0fa50d2bc4b097c70b9810212d6b62f05d83b2dd", size = 854666, upload-time = "2025-11-03T21:32:40.079Z" },
{ url = "https://files.pythonhosted.org/packages/7d/22/e392e53f3869b75804762c7c848bd2dd2abf2b70fb0e526f58724638bd35/regex-2025.11.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:c56b4d162ca2b43318ac671c65bd4d563e841a694ac70e1a976ac38fcf4ca1d2", size = 799473, upload-time = "2025-11-03T21:32:42.148Z" },
{ url = "https://files.pythonhosted.org/packages/31/e9/f6e13de7e0983837f7b6d238ad9458800a874bf37c264f7923e63409944c/regex-2025.11.3-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:9697a52e57576c83139d7c6f213d64485d3df5bf84807c35fa409e6c970801c6", size = 489089, upload-time = "2025-11-03T21:32:50.027Z" },
{ url = "https://files.pythonhosted.org/packages/a3/5c/261f4a262f1fa65141c1b74b255988bd2fa020cc599e53b080667d591cfc/regex-2025.11.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:e18bc3f73bd41243c9b38a6d9f2366cd0e0137a9aebe2d8ff76c5b67d4c0a3f4", size = 291059, upload-time = "2025-11-03T21:32:51.682Z" },
{ url = "https://files.pythonhosted.org/packages/8e/57/f14eeb7f072b0e9a5a090d1712741fd8f214ec193dba773cf5410108bb7d/regex-2025.11.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:61a08bcb0ec14ff4e0ed2044aad948d0659604f824cbd50b55e30b0ec6f09c73", size = 288900, upload-time = "2025-11-03T21:32:53.569Z" },
{ url = "https://files.pythonhosted.org/packages/3c/6b/1d650c45e99a9b327586739d926a1cd4e94666b1bd4af90428b36af66dc7/regex-2025.11.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c9c30003b9347c24bcc210958c5d167b9e4f9be786cb380a7d32f14f9b84674f", size = 799010, upload-time = "2025-11-03T21:32:55.222Z" },
{ url = "https://files.pythonhosted.org/packages/99/ee/d66dcbc6b628ce4e3f7f0cbbb84603aa2fc0ffc878babc857726b8aab2e9/regex-2025.11.3-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4e1e592789704459900728d88d41a46fe3969b82ab62945560a31732ffc19a6d", size = 864893, upload-time = "2025-11-03T21:32:57.239Z" },
{ url = "https://files.pythonhosted.org/packages/bf/2d/f238229f1caba7ac87a6c4153d79947fb0261415827ae0f77c304260c7d3/regex-2025.11.3-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:6538241f45eb5a25aa575dbba1069ad786f68a4f2773a29a2bd3dd1f9de787be", size = 911522, upload-time = "2025-11-03T21:32:59.274Z" },
{ url = "https://files.pythonhosted.org/packages/bd/3d/22a4eaba214a917c80e04f6025d26143690f0419511e0116508e24b11c9b/regex-2025.11.3-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bce22519c989bb72a7e6b36a199384c53db7722fe669ba891da75907fe3587db", size = 803272, upload-time = "2025-11-03T21:33:01.393Z" },
{ url = "https://files.pythonhosted.org/packages/84/b1/03188f634a409353a84b5ef49754b97dbcc0c0f6fd6c8ede505a8960a0a4/regex-2025.11.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:66d559b21d3640203ab9075797a55165d79017520685fb407b9234d72ab63c62", size = 787958, upload-time = "2025-11-03T21:33:03.379Z" },
{ url = "https://files.pythonhosted.org/packages/99/6a/27d072f7fbf6fadd59c64d210305e1ff865cc3b78b526fd147db768c553b/regex-2025.11.3-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:669dcfb2e38f9e8c69507bace46f4889e3abbfd9b0c29719202883c0a603598f", size = 859289, upload-time = "2025-11-03T21:33:05.374Z" },
{ url = "https://files.pythonhosted.org/packages/9a/70/1b3878f648e0b6abe023172dacb02157e685564853cc363d9961bcccde4e/regex-2025.11.3-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:32f74f35ff0f25a5021373ac61442edcb150731fbaa28286bbc8bb1582c89d02", size = 850026, upload-time = "2025-11-03T21:33:07.131Z" },
{ url = "https://files.pythonhosted.org/packages/dd/d5/68e25559b526b8baab8e66839304ede68ff6727237a47727d240006bd0ff/regex-2025.11.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:e6c7a21dffba883234baefe91bc3388e629779582038f75d2a5be918e250f0ed", size = 789499, upload-time = "2025-11-03T21:33:09.141Z" },
{ url = "https://files.pythonhosted.org/packages/c3/06/49b198550ee0f5e4184271cee87ba4dfd9692c91ec55289e6282f0f86ccf/regex-2025.11.3-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:ba0d8a5d7f04f73ee7d01d974d47c5834f8a1b0224390e4fe7c12a3a92a78ecc", size = 491985, upload-time = "2025-11-03T21:33:16.555Z" },
{ url = "https://files.pythonhosted.org/packages/ce/bf/abdafade008f0b1c9da10d934034cb670432d6cf6cbe38bbb53a1cfd6cf8/regex-2025.11.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:442d86cf1cfe4faabf97db7d901ef58347efd004934da045c745e7b5bd57ac49", size = 292669, upload-time = "2025-11-03T21:33:18.32Z" },
{ url = "https://files.pythonhosted.org/packages/f9/ef/0c357bb8edbd2ad8e273fcb9e1761bc37b8acbc6e1be050bebd6475f19c1/regex-2025.11.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:fd0a5e563c756de210bb964789b5abe4f114dacae9104a47e1a649b910361536", size = 291030, upload-time = "2025-11-03T21:33:20.048Z" },
{ url = "https://files.pythonhosted.org/packages/79/06/edbb67257596649b8fb088d6aeacbcb248ac195714b18a65e018bf4c0b50/regex-2025.11.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bf3490bcbb985a1ae97b2ce9ad1c0f06a852d5b19dde9b07bdf25bf224248c95", size = 807674, upload-time = "2025-11-03T21:33:21.797Z" },
{ url = "https://files.pythonhosted.org/packages/f4/d9/ad4deccfce0ea336296bd087f1a191543bb99ee1c53093dcd4c64d951d00/regex-2025.11.3-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3809988f0a8b8c9dcc0f92478d6501fac7200b9ec56aecf0ec21f4a2ec4b6009", size = 873451, upload-time = "2025-11-03T21:33:23.741Z" },
{ url = "https://files.pythonhosted.org/packages/13/75/a55a4724c56ef13e3e04acaab29df26582f6978c000ac9cd6810ad1f341f/regex-2025.11.3-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f4ff94e58e84aedb9c9fce66d4ef9f27a190285b451420f297c9a09f2b9abee9", size = 914980, upload-time = "2025-11-03T21:33:25.999Z" },
{ url = "https://files.pythonhosted.org/packages/67/1e/a1657ee15bd9116f70d4a530c736983eed997b361e20ecd8f5ca3759d5c5/regex-2025.11.3-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7eb542fd347ce61e1321b0a6b945d5701528dca0cd9759c2e3bb8bd57e47964d", size = 812852, upload-time = "2025-11-03T21:33:27.852Z" },
{ url = "https://files.pythonhosted.org/packages/b8/6f/f7516dde5506a588a561d296b2d0044839de06035bb486b326065b4c101e/regex-2025.11.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:d6c2d5919075a1f2e413c00b056ea0c2f065b3f5fe83c3d07d325ab92dce51d6", size = 795566, upload-time = "2025-11-03T21:33:32.364Z" },
{ url = "https://files.pythonhosted.org/packages/d9/dd/3d10b9e170cc16fb34cb2cef91513cf3df65f440b3366030631b2984a264/regex-2025.11.3-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:3f8bf11a4827cc7ce5a53d4ef6cddd5ad25595d3c1435ef08f76825851343154", size = 868463, upload-time = "2025-11-03T21:33:34.459Z" },
{ url = "https://files.pythonhosted.org/packages/f5/8e/935e6beff1695aa9085ff83195daccd72acc82c81793df480f34569330de/regex-2025.11.3-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:22c12d837298651e5550ac1d964e4ff57c3f56965fc1812c90c9fb2028eaf267", size = 854694, upload-time = "2025-11-03T21:33:36.793Z" },
{ url = "https://files.pythonhosted.org/packages/92/12/10650181a040978b2f5720a6a74d44f841371a3d984c2083fc1752e4acf6/regex-2025.11.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:62ba394a3dda9ad41c7c780f60f6e4a70988741415ae96f6d1bf6c239cf01379", size = 799691, upload-time = "2025-11-03T21:33:39.079Z" },
]

[[package]]
@@ -1334,25 +1334,25 @@ wheels = [
[[package]]
name = "ruff"
version = "0.14.2"
version = "0.14.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ee/34/8218a19b2055b80601e8fd201ec723c74c7fe1ca06d525a43ed07b6d8e85/ruff-0.14.2.tar.gz", hash = "sha256:98da787668f239313d9c902ca7c523fe11b8ec3f39345553a51b25abc4629c96", size = 5539663, upload-time = "2025-10-23T19:37:00.956Z" }
sdist = { url = "https://files.pythonhosted.org/packages/75/62/50b7727004dfe361104dfbf898c45a9a2fdfad8c72c04ae62900224d6ecf/ruff-0.14.3.tar.gz", hash = "sha256:4ff876d2ab2b161b6de0aa1f5bd714e8e9b4033dc122ee006925fbacc4f62153", size = 5558687, upload-time = "2025-10-31T00:26:26.878Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/16/dd/23eb2db5ad9acae7c845700493b72d3ae214dce0b226f27df89216110f2b/ruff-0.14.2-py3-none-linux_armv6l.whl", hash = "sha256:7cbe4e593505bdec5884c2d0a4d791a90301bc23e49a6b1eb642dd85ef9c64f1", size = 12533390, upload-time = "2025-10-23T19:36:18.044Z" },
{ url = "https://files.pythonhosted.org/packages/5a/8c/5f9acff43ddcf3f85130d0146d0477e28ccecc495f9f684f8f7119b74c0d/ruff-0.14.2-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:8d54b561729cee92f8d89c316ad7a3f9705533f5903b042399b6ae0ddfc62e11", size = 12887187, upload-time = "2025-10-23T19:36:22.664Z" },
{ url = "https://files.pythonhosted.org/packages/99/fa/047646491479074029665022e9f3dc6f0515797f40a4b6014ea8474c539d/ruff-0.14.2-py3-none-macosx_11_0_arm64.whl", hash = "sha256:5c8753dfa44ebb2cde10ce5b4d2ef55a41fb9d9b16732a2c5df64620dbda44a3", size = 11925177, upload-time = "2025-10-23T19:36:24.778Z" },
{ url = "https://files.pythonhosted.org/packages/15/8b/c44cf7fe6e59ab24a9d939493a11030b503bdc2a16622cede8b7b1df0114/ruff-0.14.2-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3d0bbeffb8d9f4fccf7b5198d566d0bad99a9cb622f1fc3467af96cb8773c9e3", size = 12358285, upload-time = "2025-10-23T19:36:26.979Z" },
{ url = "https://files.pythonhosted.org/packages/45/01/47701b26254267ef40369aea3acb62a7b23e921c27372d127e0f3af48092/ruff-0.14.2-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7047f0c5a713a401e43a88d36843d9c83a19c584e63d664474675620aaa634a8", size = 12303832, upload-time = "2025-10-23T19:36:29.192Z" },
{ url = "https://files.pythonhosted.org/packages/2d/5c/ae7244ca4fbdf2bee9d6405dcd5bc6ae51ee1df66eb7a9884b77b8af856d/ruff-0.14.2-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3bf8d2f9aa1602599217d82e8e0af7fd33e5878c4d98f37906b7c93f46f9a839", size = 13036995, upload-time = "2025-10-23T19:36:31.861Z" },
{ url = "https://files.pythonhosted.org/packages/27/4c/0860a79ce6fd4c709ac01173f76f929d53f59748d0dcdd662519835dae43/ruff-0.14.2-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:1c505b389e19c57a317cf4b42db824e2fca96ffb3d86766c1c9f8b96d32048a7", size = 14512649, upload-time = "2025-10-23T19:36:33.915Z" },
{ url = "https://files.pythonhosted.org/packages/7f/7f/d365de998069720a3abfc250ddd876fc4b81a403a766c74ff9bde15b5378/ruff-0.14.2-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a307fc45ebd887b3f26b36d9326bb70bf69b01561950cdcc6c0bdf7bb8e0f7cc", size = 14088182, upload-time = "2025-10-23T19:36:36.983Z" },
{ url = "https://files.pythonhosted.org/packages/6c/ea/d8e3e6b209162000a7be1faa41b0a0c16a133010311edc3329753cc6596a/ruff-0.14.2-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:61ae91a32c853172f832c2f40bd05fd69f491db7289fb85a9b941ebdd549781a", size = 13599516, upload-time = "2025-10-23T19:36:39.208Z" },
{ url = "https://files.pythonhosted.org/packages/fa/ea/c7810322086db68989fb20a8d5221dd3b79e49e396b01badca07b433ab45/ruff-0.14.2-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc1967e40286f63ee23c615e8e7e98098dedc7301568bd88991f6e544d8ae096", size = 13272690, upload-time = "2025-10-23T19:36:41.453Z" },
{ url = "https://files.pythonhosted.org/packages/a9/39/10b05acf8c45786ef501d454e00937e1b97964f846bf28883d1f9619928a/ruff-0.14.2-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:2877f02119cdebf52a632d743a2e302dea422bfae152ebe2f193d3285a3a65df", size = 13496497, upload-time = "2025-10-23T19:36:43.61Z" },
{ url = "https://files.pythonhosted.org/packages/59/a1/1f25f8301e13751c30895092485fada29076e5e14264bdacc37202e85d24/ruff-0.14.2-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e681c5bc777de5af898decdcb6ba3321d0d466f4cb43c3e7cc2c3b4e7b843a05", size = 12266116, upload-time = "2025-10-23T19:36:45.625Z" },
{ url = "https://files.pythonhosted.org/packages/5c/fa/0029bfc9ce16ae78164e6923ef392e5f173b793b26cc39aa1d8b366cf9dc/ruff-0.14.2-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:e21be42d72e224736f0c992cdb9959a2fa53c7e943b97ef5d081e13170e3ffc5", size = 12281345, upload-time = "2025-10-23T19:36:47.618Z" },
{ url = "https://files.pythonhosted.org/packages/a5/ab/ece7baa3c0f29b7683be868c024f0838770c16607bea6852e46b202f1ff6/ruff-0.14.2-py3-none-musllinux_1_2_i686.whl", hash = "sha256:b8264016f6f209fac16262882dbebf3f8be1629777cf0f37e7aff071b3e9b92e", size = 12629296, upload-time = "2025-10-23T19:36:49.789Z" },
{ url = "https://files.pythonhosted.org/packages/a4/7f/638f54b43f3d4e48c6a68062794e5b367ddac778051806b9e235dfb7aa81/ruff-0.14.2-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5ca36b4cb4db3067a3b24444463ceea5565ea78b95fe9a07ca7cb7fd16948770", size = 13371610, upload-time = "2025-10-23T19:36:51.882Z" },
{ url = "https://files.pythonhosted.org/packages/ce/8e/0c10ff1ea5d4360ab8bfca4cb2c9d979101a391f3e79d2616c9bf348cd26/ruff-0.14.3-py3-none-linux_armv6l.whl", hash = "sha256:876b21e6c824f519446715c1342b8e60f97f93264012de9d8d10314f8a79c371", size = 12535613, upload-time = "2025-10-31T00:25:44.302Z" },
{ url = "https://files.pythonhosted.org/packages/d3/c8/6724f4634c1daf52409fbf13fefda64aa9c8f81e44727a378b7b73dc590b/ruff-0.14.3-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:b6fd8c79b457bedd2abf2702b9b472147cd860ed7855c73a5247fa55c9117654", size = 12855812, upload-time = "2025-10-31T00:25:47.793Z" },
{ url = "https://files.pythonhosted.org/packages/de/03/db1bce591d55fd5f8a08bb02517fa0b5097b2ccabd4ea1ee29aa72b67d96/ruff-0.14.3-py3-none-macosx_11_0_arm64.whl", hash = "sha256:71ff6edca490c308f083156938c0c1a66907151263c4abdcb588602c6e696a14", size = 11944026, upload-time = "2025-10-31T00:25:49.657Z" },
{ url = "https://files.pythonhosted.org/packages/0b/75/4f8dbd48e03272715d12c87dc4fcaaf21b913f0affa5f12a4e9c6f8a0582/ruff-0.14.3-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:786ee3ce6139772ff9272aaf43296d975c0217ee1b97538a98171bf0d21f87ed", size = 12356818, upload-time = "2025-10-31T00:25:51.949Z" },
{ url = "https://files.pythonhosted.org/packages/ec/9b/506ec5b140c11d44a9a4f284ea7c14ebf6f8b01e6e8917734a3325bff787/ruff-0.14.3-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:cd6291d0061811c52b8e392f946889916757610d45d004e41140d81fb6cd5ddc", size = 12336745, upload-time = "2025-10-31T00:25:54.248Z" },
{ url = "https://files.pythonhosted.org/packages/c7/e1/c560d254048c147f35e7f8131d30bc1f63a008ac61595cf3078a3e93533d/ruff-0.14.3-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a497ec0c3d2c88561b6d90f9c29f5ae68221ac00d471f306fa21fa4264ce5fcd", size = 13101684, upload-time = "2025-10-31T00:25:56.253Z" },
{ url = "https://files.pythonhosted.org/packages/a5/32/e310133f8af5cd11f8cc30f52522a3ebccc5ea5bff4b492f94faceaca7a8/ruff-0.14.3-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:e231e1be58fc568950a04fbe6887c8e4b85310e7889727e2b81db205c45059eb", size = 14535000, upload-time = "2025-10-31T00:25:58.397Z" },
{ url = "https://files.pythonhosted.org/packages/a2/a1/7b0470a22158c6d8501eabc5e9b6043c99bede40fa1994cadf6b5c2a61c7/ruff-0.14.3-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:469e35872a09c0e45fecf48dd960bfbce056b5db2d5e6b50eca329b4f853ae20", size = 14156450, upload-time = "2025-10-31T00:26:00.889Z" },
{ url = "https://files.pythonhosted.org/packages/0a/96/24bfd9d1a7f532b560dcee1a87096332e461354d3882124219bcaff65c09/ruff-0.14.3-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3d6bc90307c469cb9d28b7cfad90aaa600b10d67c6e22026869f585e1e8a2db0", size = 13568414, upload-time = "2025-10-31T00:26:03.291Z" },
{ url = "https://files.pythonhosted.org/packages/a7/e7/138b883f0dfe4ad5b76b58bf4ae675f4d2176ac2b24bdd81b4d966b28c61/ruff-0.14.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0e2f8a0bbcffcfd895df39c9a4ecd59bb80dca03dc43f7fb63e647ed176b741e", size = 13315293, upload-time = "2025-10-31T00:26:05.708Z" },
{ url = "https://files.pythonhosted.org/packages/33/f4/c09bb898be97b2eb18476b7c950df8815ef14cf956074177e9fbd40b7719/ruff-0.14.3-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:678fdd7c7d2d94851597c23ee6336d25f9930b460b55f8598e011b57c74fd8c5", size = 13539444, upload-time = "2025-10-31T00:26:08.09Z" },
{ url = "https://files.pythonhosted.org/packages/9c/aa/b30a1db25fc6128b1dd6ff0741fa4abf969ded161599d07ca7edd0739cc0/ruff-0.14.3-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:1ec1ac071e7e37e0221d2f2dbaf90897a988c531a8592a6a5959f0603a1ecf5e", size = 12252581, upload-time = "2025-10-31T00:26:10.297Z" },
{ url = "https://files.pythonhosted.org/packages/da/13/21096308f384d796ffe3f2960b17054110a9c3828d223ca540c2b7cc670b/ruff-0.14.3-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afcdc4b5335ef440d19e7df9e8ae2ad9f749352190e96d481dc501b753f0733e", size = 12307503, upload-time = "2025-10-31T00:26:12.646Z" },
{ url = "https://files.pythonhosted.org/packages/cb/cc/a350bac23f03b7dbcde3c81b154706e80c6f16b06ff1ce28ed07dc7b07b0/ruff-0.14.3-py3-none-musllinux_1_2_i686.whl", hash = "sha256:7bfc42f81862749a7136267a343990f865e71fe2f99cf8d2958f684d23ce3dfa", size = 12675457, upload-time = "2025-10-31T00:26:15.044Z" },
{ url = "https://files.pythonhosted.org/packages/cb/76/46346029fa2f2078826bc88ef7167e8c198e58fe3126636e52f77488cbba/ruff-0.14.3-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a65e448cfd7e9c59fae8cf37f9221585d3354febaad9a07f29158af1528e165f", size = 13403980, upload-time = "2025-10-31T00:26:17.81Z" },
]

[[package]]
@@ -1443,19 +1443,19 @@ wheels = [
[[package]]
name = "starlette"
version = "0.49.1"
version = "0.49.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "anyio", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/1b/3f/507c21db33b66fb027a332f2cb3abbbe924cc3a79ced12f01ed8645955c9/starlette-0.49.1.tar.gz", hash = "sha256:481a43b71e24ed8c43b11ea02f5353d77840e01480881b8cb5a26b8cae64a8cb", size = 2654703, upload-time = "2025-10-28T17:34:10.928Z" }
sdist = { url = "https://files.pythonhosted.org/packages/de/1a/608df0b10b53b0beb96a37854ee05864d182ddd4b1156a22f1ad3860425a/starlette-0.49.3.tar.gz", hash = "sha256:1c14546f299b5901a1ea0e34410575bc33bbd741377a10484a54445588d00284", size = 2655031, upload-time = "2025-11-01T15:12:26.13Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/51/da/545b75d420bb23b5d494b0517757b351963e974e79933f01e05c929f20a6/starlette-0.49.1-py3-none-any.whl", hash = "sha256:d92ce9f07e4a3caa3ac13a79523bd18e3bc0042bb8ff2d759a8e7dd0e1859875", size = 74175, upload-time = "2025-10-28T17:34:09.13Z" },
{ url = "https://files.pythonhosted.org/packages/a3/e0/021c772d6a662f43b63044ab481dc6ac7592447605b5b35a957785363122/starlette-0.49.3-py3-none-any.whl", hash = "sha256:b579b99715fdc2980cf88c8ec96d3bf1ce16f5a8051a7c2b84ef9b1cdecaea2f", size = 74340, upload-time = "2025-11-01T15:12:24.387Z" },
]

[[package]]
name = "textual"
version = "6.4.0"
version = "6.5.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
{ name = "markdown-it-py", extra = ["linkify"], marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -1465,9 +1465,9 @@ dependencies = [
{ name = "rich", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/23/6c/565521dc6dd00fa857845483ae0c070575fda1f9a56d92d732554fecfea4/textual-6.4.0.tar.gz", hash = "sha256:f40df9165a001c10249698d532f2f5a71708b70f0e4ef3fce081a9dd93ffeaaa", size = 1573599, upload-time = "2025-10-22T17:29:51.357Z" }
sdist = { url = "https://files.pythonhosted.org/packages/af/90/59757aa887ddcea61428820274f1a2d1f986feb7880374a5420ab5d37132/textual-6.5.0.tar.gz", hash = "sha256:e5f152cdd47db48a635d23b839721bae4d0e8b6d855e3fede7285218289294e3", size = 1574116, upload-time = "2025-10-31T17:21:53.4Z" }
wheels = [
{ url = "https://files.pythonhosted.org/packages/37/20/6eed0e55bdd2576475e9cea49cc71c47f8e56ab54f04cbe04b2fb56440de/textual-6.4.0-py3-none-any.whl", hash = "sha256:b346dbb8e12f17cefb33ddfdf7f19bdc9e66c29daf82fc981a8db6b7d985e115", size = 711663, upload-time = "2025-10-22T17:29:49.346Z" },
{ url = "https://files.pythonhosted.org/packages/42/37/1deba011782a49ea249c73adcf703a39b0249ac9b0e17d1a2e4074df8d57/textual-6.5.0-py3-none-any.whl", hash = "sha256:c5505be7fe606b8054fb88431279885f88352bddca64832f6acd293ef7d9b54f", size = 711848, upload-time = "2025-10-31T17:21:51.134Z" },
]

[[package]]