Mirror of https://github.com/exo-explore/exo.git, synced 2025-12-23 22:27:50 -05:00

Update staging 14

Co-authored-by: Evan <evanev7@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: David Munha Canas Correia <dmunha@MacBook-David.local>
Co-authored-by: github-actions bot <github-actions@users.noreply.github.com>

1 .gitattributes (vendored)
@@ -1 +0,0 @@
worker/utils/macmon/bin/macmon filter=lfs diff=lfs merge=lfs -text

159 .github/benchmark-dashboard/README.md (vendored, new file)
@@ -0,0 +1,159 @@
# EXO Benchmark Dashboard

A fully self-contained, browser-based dashboard for tracking EXO benchmark performance over time.

## Features

- 📊 **Success Rate Tracking**: Monitor cluster reliability across commits
- ⚡ **Response Time Analysis**: Track average request completion times
- 🎯 **Throughput Metrics**: Tokens per second visualization
- 📈 **Request Distribution**: Success/failure breakdown over time
- 🔄 **Auto-Refresh**: Updates every 60 seconds
- 📺 **TV-Ready**: Large, clear visualizations perfect for display
- 🔐 **Secure**: Credentials stored in browser localStorage only
- 🌐 **No Backend**: Directly accesses S3 from the browser

## Quick Start

### Option 1: Direct File Access (Simplest)

Just open the HTML file directly in your browser:

```bash
open .github/benchmark-dashboard/index.html
```

Then click "Configure AWS Credentials" and enter your keys.

### Option 2: URL Parameters (For Quick Setup)

```bash
# Serve with credentials in URL (they'll be moved to localStorage)
open ".github/benchmark-dashboard/index.html?accessKey=YOUR_KEY&secretKey=YOUR_SECRET&region=us-east-1"
```

The credentials will be saved to localStorage and removed from the URL immediately.

### Option 3: Simple HTTP Server

```bash
# From repo root
python3 -m http.server 8080

# Then open: http://localhost:8080/.github/benchmark-dashboard/
```

## AWS Credentials

The dashboard needs read-only access to the `exo-benchmark-results` S3 bucket.

### Required IAM Permissions

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:ListBucket"
      ],
      "Resource": [
        "arn:aws:s3:::exo-benchmark-results",
        "arn:aws:s3:::exo-benchmark-results/*"
      ]
    }
  ]
}
```

### Security Notes

- ✅ Credentials stored in browser `localStorage` only
- ✅ Never sent to any server (except AWS)
- ✅ All S3 access happens client-side
- ✅ Use read-only IAM credentials
- ⚠️ Don't commit credentials to git
- ⚠️ Use a dedicated read-only IAM user

## TV/Kiosk Mode

For permanent display on a TV:

### macOS
```bash
open -a "Google Chrome" --args --kiosk ".github/benchmark-dashboard/index.html"
```

### Linux
```bash
chromium-browser --kiosk --app="file://$(pwd)/.github/benchmark-dashboard/index.html"
```

### Auto-start on Boot

Create a simple startup script:

```bash
#!/bin/bash
# /usr/local/bin/start-benchmark-dashboard.sh

cd /path/to/exo
python3 -m http.server 8080 &
sleep 2
chromium-browser --kiosk http://localhost:8080/.github/benchmark-dashboard/
```

## Data Displayed

### Summary Cards
- **Latest Success Rate**: Most recent benchmark success percentage with trend
- **Avg Response Time**: Latest average response time in ms with trend
- **Total Benchmarks**: Count of all benchmarks run
- **Active Configurations**: Number of unique benchmark configs

### Charts
1. **Success Rate Over Time**: Line chart showing reliability trends
2. **Average Response Time**: Performance over time (lower is better)
3. **Throughput**: Tokens/second metric (higher is better)
4. **Request Distribution**: Stacked bar chart of successes/failures

## How It Works

1. **Loads AWS SDK**: Uses AWS SDK for JavaScript (browser version)
2. **Lists S3 Objects**: Fetches all files from `s3://exo-benchmark-results/bench/`
3. **Downloads Results**: Fetches each JSON result file
4. **Parses & Visualizes**: Uses Chart.js to create interactive charts
5. **Auto-Refreshes**: Polls S3 every 60 seconds for new results
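
For reference, the same fetch flow can be reproduced outside the browser. Below is a minimal Python sketch using `boto3` (the dashboard itself uses the AWS SDK for JavaScript; the bucket name and `bench/` prefix come from this README, and read-only credentials are assumed to be configured in the environment):

```python
import json

import boto3  # assumes boto3 is installed and AWS credentials are configured

s3 = boto3.client("s3", region_name="us-east-1")
BUCKET = "exo-benchmark-results"

# Step 2: list every result object under the bench/ prefix (handles pagination).
paginator = s3.get_paginator("list_objects_v2")
keys = [
    obj["Key"]
    for page in paginator.paginate(Bucket=BUCKET, Prefix="bench/")
    for obj in page.get("Contents", [])
]

# Step 3: download and parse each JSON result file, as the dashboard does.
results = [
    json.loads(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())
    for key in keys
]

print(f"Loaded {len(results)} benchmark results")
```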

## Customization

To modify the dashboard:

1. Edit `index.html`
2. Adjust `REFRESH_INTERVAL` for a different polling frequency
3. Modify chart colors/styles in the Chart.js configuration
4. Add new metrics by extending the results parsing

## Troubleshooting

**"AWS credentials not configured"**
- Click "Configure AWS Credentials" and enter your keys

**"Error loading benchmark data"**
- Check AWS credentials are correct
- Verify the S3 bucket name is `exo-benchmark-results`
- Ensure the IAM user has read permissions
- Check the browser console for detailed errors

**"No benchmark results found"**
- Wait for benchmark workflows to run
- Verify results are being uploaded to S3
- Check that the S3 bucket has files under the `bench/` prefix

**Charts not updating**
- Check the browser console for errors
- Verify network connectivity to S3
- Try refreshing the page manually

1601 .github/benchmark-dashboard/index.html (vendored, new file)
File diff suppressed because it is too large.

186 .github/configs/README.md (vendored, new file)
@@ -0,0 +1,186 @@
# EXO Benchmark Configurations

This directory contains configuration files for the EXO staged benchmark system.

## Overview

The staged benchmark system allows you to run complex, multi-stage load tests against EXO clusters. Each stage can have different characteristics:

- **Prompt Length**: Number of tokens in the input prompt
- **Generation Length**: Maximum tokens to generate in the response
- **Time Between Requests**: Delay (in seconds) between firing consecutive requests
- **Iterations**: Number of requests to send in this stage

Requests are **fire-and-forget** - they don't wait for the previous request to complete. This allows you to test overlapping request handling and measure success rates under load.
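
To make the overlap behavior concrete, here is a minimal sketch of the fire-and-forget pattern (illustrative only, not the actual `bench.py` implementation; the payload shape is an assumption, though the `/v1/chat/completions` endpoint matches the API used elsewhere in this repo):

```python
import asyncio

import aiohttp  # assumed HTTP client; bench.py may use something else


async def run_stage(base_url: str, prompt: str, generation_length: int,
                    time_between_requests: float, iterations: int) -> list[bool]:
    """Fire one request per interval without waiting for earlier ones to finish."""

    async def one_request(session: aiohttp.ClientSession) -> bool:
        payload = {  # hypothetical payload shape
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": generation_length,
        }
        try:
            async with session.post(f"{base_url}/v1/chat/completions", json=payload) as resp:
                await resp.read()
                return resp.status == 200
        except aiohttp.ClientError:
            return False

    async with aiohttp.ClientSession() as session:
        tasks = []
        for _ in range(iterations):
            tasks.append(asyncio.create_task(one_request(session)))  # fire and move on
            await asyncio.sleep(time_between_requests)  # next request fires regardless
        return await asyncio.gather(*tasks)  # success/failure of every request
```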

## Configuration Files

### `bench_simple.yaml`
A minimal configuration that replicates the behavior of the original `bench.py` script:
- Single stage with 1 iteration
- Short prompt (~20 tokens)
- Generates up to 100 tokens

This is useful for quick smoke tests.

### `bench_config.yaml`
A comprehensive multi-stage benchmark with:
1. **Warmup** (10 requests): Light load with short prompts
2. **Medium Load** (20 requests): Moderate load with medium prompts
3. **Stress Test** (30 requests): Heavy overlapping requests with long prompts
4. **Cooldown** (5 requests): Light load to wind down

This tests the cluster's behavior under varying load patterns.

## Configuration Schema

```yaml
# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  M3ULTRA_GPU80_512GB: 4

# Environment variables to set on each node (optional)
environment:
  OVERRIDE_MEMORY_MB: 512

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 600

# Model instances to run concurrently
model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

# Benchmark stages
stages:
  - name: "stage_name"            # Human-readable name for this stage
    prompt_length: 100            # Target prompt length in tokens
    generation_length: 200        # Max tokens to generate
    time_between_requests: 2.0    # Seconds between firing requests
    iterations: 10                # Number of requests in this stage
```

## Running Benchmarks

### Via GitHub Actions

**Automatic (every commit):**
- The **`bench`** workflow runs automatically on every push
- Uses `bench_simple.yaml` as the default configuration
- All settings (hardware plan, timeout, environment variables, models, stages) are defined in the config file

**Manual (on-demand):**
1. Go to **Actions** → **bench** workflow
2. Click **Run workflow**
3. Configure:
   - **Config File**: Path to your YAML config (default: `.github/configs/bench_simple.yaml`)
     - `.github/configs/bench_simple.yaml` for quick tests
     - `.github/configs/bench_config.yaml` for complex multi-stage tests

All other settings (hardware plan, timeout, environment variables, models, stages) are read from the specified config file.

### Via Command Line

```bash
# Start EXO on localhost:8000
uv run exo --api-port 8000

# Run simple benchmark (1 stage, 1 iteration)
python3 .github/scripts/bench.py \
  --api-port 8000 \
  --config .github/configs/bench_simple.yaml \
  --expected-nodes 1 \
  --is-primary true \
  --timeout-seconds 600

# Run complex staged benchmark (4 stages, multiple iterations)
python3 .github/scripts/bench.py \
  --api-port 8000 \
  --config .github/configs/bench_config.yaml \
  --expected-nodes 1 \
  --is-primary true \
  --timeout-seconds 600
```

## Output Metrics

For each stage, the benchmark reports:

- **Total Requests**: Number of requests fired
- **Successful Requests**: Requests that completed successfully
- **Failed Requests**: Requests that encountered errors
- **Success Rate**: Percentage of successful requests
- **Total Tokens**: Sum of all tokens generated across successful requests
- **Avg Tokens/Request**: Average tokens per successful request
- **Avg Time/Request**: Average completion time per successful request

A JSON summary is also printed for easy parsing and storage.
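
A rough sketch of how such a summary could be consumed (the field names here are hypothetical, chosen to mirror the metric list above, not the script's actual schema):

```python
import json


def describe(stage: dict) -> str:
    """Recompute the headline metrics from raw per-stage counts."""
    total = stage["total_requests"]          # hypothetical field names
    ok = stage["successful_requests"]
    rate = 100.0 * ok / total if total else 0.0
    avg_tokens = stage["total_tokens"] / ok if ok else 0.0
    return f"{stage['name']}: {rate:.1f}% success, {avg_tokens:.1f} avg tokens/request"


with open("bench_results.json") as f:        # output path is an assumption
    for stage in json.load(f)["stages"]:
        print(describe(stage))
```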

## Creating Custom Benchmarks

To create a custom benchmark:

1. Copy an existing config file (e.g., `bench_config.yaml`)
2. Modify the stages to match your test scenario
3. Save it in this directory with a descriptive name
4. Run it using the workflow or command line

### Example: Sustained Load Test

```yaml
hardware_plan:
  M3ULTRA_GPU80_512GB: 2

environment:
  OVERRIDE_MEMORY_MB: 1024

timeout_seconds: 600

model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

stages:
  - name: "sustained_load"
    prompt_length: 200
    generation_length: 150
    time_between_requests: 0.5  # Very fast - 2 requests/second
    iterations: 100             # Run for ~50 seconds
```

### Example: Varying Prompt Sizes

```yaml
hardware_plan:
  M4PRO_GPU16_24GB: 3

timeout_seconds: 900

model_ids:
  - "mlx-community/Llama-3.2-1B-Instruct-4bit"

stages:
  - name: "tiny_prompts"
    prompt_length: 10
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10

  - name: "medium_prompts"
    prompt_length: 200
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10

  - name: "large_prompts"
    prompt_length: 1000
    generation_length: 100
    time_between_requests: 1.0
    iterations: 10
```

## Tips

- **Overlapping Requests**: Set `time_between_requests` < expected completion time to test concurrent request handling (see the sketch after this list)
- **Sequential Requests**: Set `time_between_requests` > expected completion time to ensure requests don't overlap
- **Realistic Load**: Model real usage patterns by varying prompt/generation lengths across stages
- **Success Rate**: A 100% success rate indicates the cluster handled the load well; lower rates suggest capacity limits
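
Two quick derived quantities can help when designing stages (simple arithmetic, not part of the benchmark script):

```python
def stage_estimates(time_between_requests: float, iterations: int,
                    expected_completion_s: float) -> tuple[float, float]:
    """Rough stage duration and steady-state concurrency for a stage."""
    duration_s = iterations * time_between_requests + expected_completion_s
    concurrency = expected_completion_s / time_between_requests  # >1 means overlap
    return duration_s, concurrency

# Example: the "sustained_load" stage above, assuming ~5 s per request.
print(stage_estimates(0.5, 100, 5.0))  # -> (55.0, 10.0): ~55 s, ~10 in flight
```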

49 .github/configs/bench_config.yaml (vendored, new file)
@@ -0,0 +1,49 @@
# EXO Staged Benchmark Configuration
# This configuration defines a multi-stage load test for EXO clusters

# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  M3ULTRA_GPU80_512GB: 4

# Environment variables to set on each node (optional)
environment:
  OVERRIDE_MEMORY_MB: 512

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 600

# Multiple instances run concurrently on the cluster
model_ids:
  - "mlx-community/Qwen3-0.6B-4bit"
  - "mlx-community/Qwen3-0.6B-4bit"

# Stages run sequentially, each with its own characteristics
stages:
  # Stage 1: Light load with short prompts
  - name: "warmup"
    prompt_length: 50             # Number of tokens in prompt
    generation_length: 100        # Max tokens to generate
    time_between_requests: 5.0    # Seconds between firing requests
    iterations: 10                # Number of requests to send in this stage

  # Stage 2: Medium load with medium prompts
  - name: "medium_load"
    prompt_length: 200
    generation_length: 150
    time_between_requests: 3.0
    iterations: 20

  # Stage 3: Heavy load with long prompts - requests will overlap
  - name: "stress_test"
    prompt_length: 500
    generation_length: 200
    time_between_requests: 1.0    # Fast firing - will definitely overlap
    iterations: 30

  # Stage 4: Cool down with simple prompts
  - name: "cooldown"
    prompt_length: 50
    generation_length: 50
    time_between_requests: 10.0
    iterations: 5

36 .github/configs/bench_simple.yaml (vendored, new file)
@@ -0,0 +1,36 @@
# Simple single-shot benchmark
# Tests 2 instances concurrently on 2 nodes

# Hardware configuration - maps runner labels to instance counts
hardware_plan:
  puffin4: 1
  puffin8: 1

# Environment variables to set on each node
environment:
  PLACEHOLDER: "placeholder"
  # OVERRIDE_MEMORY_MB: 30000
  # MLX_METAL_FAST_SYNCH: 1

# Timeout for instance and runner readiness (seconds)
timeout_seconds: 900

# Model instances to run concurrently
model_ids:
  - "mlx-community/DeepSeek-V3.1-8bit"
  # - "mlx-community/Qwen3-235B-A22B-4bit"
  # - "mlx-community/Llama-3.3-70B-Instruct-4bit"

# Placement strategy: "tensor", "pipeline", or "auto"
strategy: "tensor_rdma"

# If true, run requests sequentially (no overlap); if false, fire-and-forget (default: false)
no_overlap: true

# Benchmark stages
stages:
  - name: "simple"
    prompt_length: 512
    generation_length: 10
    time_between_requests: 2.0
    iterations: 10

1190 .github/scripts/bench.py (vendored, new file)
File diff suppressed because it is too large.

68 .github/scripts/build_matrix.py (vendored, new file)
@@ -0,0 +1,68 @@
#!/usr/bin/env python3
import json
import os
from typing import NotRequired, TypedDict, cast

import yaml


class MatrixEntry(TypedDict):
    label: str
    index: int


class MatrixInclude(TypedDict):
    label: str
    index: int
    is_primary: bool
    expected_nodes: int


class Config(TypedDict):
    hardware_plan: dict[str, int]
    timeout_seconds: NotRequired[int]
    environment: NotRequired[dict[str, str]]


# Read the config file
config_file: str = os.environ['CONFIG_FILE']
with open(config_file, 'r') as f:
    config: Config = cast(Config, yaml.safe_load(f))

# Extract hardware plan from config
plan: dict[str, int] = config['hardware_plan']
if not plan:
    raise ValueError(f"No hardware_plan found in {config_file}")

# Build matrix entries
entries: list[MatrixEntry] = []
for label, count in plan.items():
    for idx in range(count):
        entries.append({"label": label, "index": idx})

total_nodes: int = len(entries)
matrix: dict[str, list[MatrixInclude]] = {"include": [
    {
        "label": e["label"],
        "index": e["index"],
        "is_primary": (i == 0),
        "expected_nodes": total_nodes
    }
    for i, e in enumerate(entries)
]}

# Extract other config values
timeout_seconds: int = config.get('timeout_seconds', 600)
environment: dict[str, str] = config.get('environment', {})

# Output to GitHub Actions
with open(os.environ['GITHUB_OUTPUT'], 'a') as f:
    f.write(f"matrix={json.dumps(matrix)}\n")
    f.write(f"config_file={config_file}\n")
    f.write(f"timeout_seconds={timeout_seconds}\n")
    f.write(f"environment={json.dumps(environment)}\n")

print(f"Matrix: {json.dumps(matrix)}")
print(f"Config file: {config_file}")
print(f"Timeout: {timeout_seconds}")
print(f"Environment: {json.dumps(environment)}")
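
For example, with a config containing `hardware_plan: {M4PRO_GPU16_24GB: 2, M3ULTRA_GPU80_512GB: 1}`, the script above writes a matrix like the following to `GITHUB_OUTPUT` (derived directly from its loop logic; only the first entry overall is primary):

```json
{
  "include": [
    {"label": "M4PRO_GPU16_24GB", "index": 0, "is_primary": true, "expected_nodes": 3},
    {"label": "M4PRO_GPU16_24GB", "index": 1, "is_primary": false, "expected_nodes": 3},
    {"label": "M3ULTRA_GPU80_512GB", "index": 0, "is_primary": false, "expected_nodes": 3}
  ]
}
```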

156 .github/workflows/BENCH_USAGE.md (vendored, new file)
@@ -0,0 +1,156 @@
# Benchmark Workflow Usage

## Overview

The `bench_matrix.yml` workflow enables distributed benchmarking of models across multiple self-hosted macOS runners with different hardware configurations.

## Workflow Inputs

| Input | Description | Default | Required |
|-------|-------------|---------|----------|
| `model_id` | Model ID to benchmark | `mlx-community/Llama-3.2-1B-Instruct-4bit` | Yes |
| `hardware_plan` | JSON mapping of runner labels to counts | `{"M4PRO_GPU16_24GB": 1}` | Yes |
| `prompt` | Benchmark prompt text | `What is the capital of France?` | No |
| `timeout_seconds` | Timeout for instance/runner readiness | `600` | No |

## Hardware Plan Format

The `hardware_plan` input is a JSON object mapping runner labels to the number of machines:

```json
{
  "M4PRO_GPU16_24GB": 2,
  "M3ULTRA_GPU80_512GB": 1
}
```

This example would:
- Start 2 runners with the `M4PRO_GPU16_24GB` label
- Start 1 runner with the `M3ULTRA_GPU80_512GB` label
- Total of 3 runners coordinating on a single distributed inference instance

## How It Works

1. **Planning Job** (`plan`)
   - Runs on `ubuntu-latest`
   - Parses the `hardware_plan` JSON
   - Generates a dynamic matrix with one entry per runner
   - Only the first runner (index 0) is marked as `is_primary`

2. **Benchmark Worker Jobs** (`bench_worker`)
   - Each job runs on a self-hosted macOS runner with the specified label
   - All runners start EXO in parallel
   - The primary runner creates the model instance
   - All runners wait for their assigned runner to be ready (Loaded/Running status)
   - The primary runner executes the benchmark and prints results
   - The primary runner deletes the instance

## Example Usage

### Single Machine Benchmark

```yaml
model_id: mlx-community/Llama-3.2-1B-Instruct-4bit
hardware_plan: '{"M4PRO_GPU16_24GB": 1}'
prompt: What is the capital of France?
timeout_seconds: 600
```

### Multi-Machine Distributed Benchmark

```yaml
model_id: mlx-community/Llama-3.2-3B-Instruct-4bit
hardware_plan: '{"M4PRO_GPU16_24GB": 2, "M3ULTRA_GPU80_512GB": 1}'
prompt: Explain quantum computing in simple terms.
timeout_seconds: 900
```

## Benchmark Output

The primary runner outputs a JSON object with benchmark results:

```json
{
  "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
  "instance_id": "abc-123-def",
  "tokens": 42,
  "elapsed_s": 2.451,
  "tps": 17.136
}
```

Where:
- `tokens`: Number of chunks/tokens generated
- `elapsed_s`: Total elapsed time in seconds
- `tps`: Tokens per second (tokens / elapsed_s)

## Runner Requirements

Each self-hosted runner must:
- Be labeled with appropriate hardware tags (e.g., `M4PRO_GPU16_24GB`)
- Have the `self-hosted` and `macOS` labels
- Have Nix installed with flakes enabled
- Have network connectivity to other runners in the same job

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│         GitHub Actions Workflow (bench_matrix.yml)          │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌────────────────┐                                         │
│  │    Plan Job    │                                         │
│  │    (ubuntu)    │──┬─► Matrix: [{label, index, primary}]  │
│  └────────────────┘  │                                      │
│                      │                                      │
│  ┌───────────────────▼──────────────────────────────────┐   │
│  │             Bench Worker Jobs (Matrix)               │   │
│  ├──────────────────────────────────────────────────────┤   │
│  │                                                      │   │
│  │  Runner 0 (Primary)    Runner 1        Runner 2      │   │
│  │  ┌─────────────┐    ┌─────────────┐  ┌──────────┐    │   │
│  │  │ Start EXO   │    │ Start EXO   │  │ Start EXO│    │   │
│  │  │ Create Inst │    │ Wait...     │  │ Wait...  │    │   │
│  │  │ Wait Ready  │    │ Wait Ready  │  │ Wait...  │    │   │
│  │  │ Run Bench   │    │ (idle)      │  │ (idle)   │    │   │
│  │  │ Print TPS   │    │             │  │          │    │   │
│  │  │ Delete Inst │    │             │  │          │    │   │
│  │  └─────────────┘    └─────────────┘  └──────────┘    │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

## Implementation Details

### `scripts/bench.py`

A standalone Python script that:
- Creates instance (primary only)
- Polls `/state` endpoint until instance and all runners are ready
- Executes chat completion with timing (primary only)
- Parses SSE stream and counts tokens
- Computes TPS metrics
- Cleans up instance (primary only)

### Key Functions

- `wait_for_instance()`: Polls until instance with model_id appears
- `wait_for_runners_ready()`: Polls until expected number of runners reach Loaded/Running status
- `run_benchmark()`: Executes chat completion, measures time, counts tokens
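
A simplified sketch of the readiness-polling pattern these functions share (illustrative; the `/state` endpoint and the Loaded/Running statuses come from this document, while the response field names are assumptions):

```python
import json
import time
import urllib.request


def wait_for_runners_ready(api_port: int, expected: int, timeout_s: float) -> None:
    """Poll /state until `expected` runners report Loaded or Running status."""
    deadline = time.monotonic() + timeout_s
    ready_count = 0
    while time.monotonic() < deadline:
        with urllib.request.urlopen(f"http://localhost:{api_port}/state") as resp:
            state = json.load(resp)
        runners = state.get("runners", [])  # hypothetical field name
        ready_count = sum(1 for r in runners if r.get("status") in ("Loaded", "Running"))
        if ready_count >= expected:
            return
        time.sleep(2)
    raise TimeoutError(f"only {ready_count} of {expected} runners became ready")
```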

## Troubleshooting

### Instance never becomes ready
- Check EXO logs in the workflow output
- Verify model_id is valid and accessible
- Increase `timeout_seconds`

### Runner mismatch
- Ensure hardware_plan counts match available labeled runners
- Check runner labels match exactly (case-sensitive)

### Network issues
- Verify runners can communicate on the network
- Check firewall rules between runner hosts

292 .github/workflows/bench.yml (vendored, new file)
@@ -0,0 +1,292 @@
name: bench

on: [push]

jobs:
  plan:
    runs-on: ubuntu-latest
    outputs:
      matrix: ${{ steps.build.outputs.matrix }}
      config_file: ${{ steps.build.outputs.config_file }}
      timeout_seconds: ${{ steps.build.outputs.timeout_seconds }}
      environment: ${{ steps.build.outputs.environment }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Build matrix from config file
        id: build
        shell: bash
        run: |
          set -euo pipefail
          CONFIG_FILE='.github/configs/bench_simple.yaml'
          export CONFIG_FILE
          echo "Config file: $CONFIG_FILE"
          python3 .github/scripts/build_matrix.py

  bench_worker:
    needs: plan
    strategy:
      fail-fast: false
      matrix: ${{ fromJSON(needs.plan.outputs.matrix) }}
    name: "bench on ${{ matrix.label }} [${{ matrix.index }}]"
    runs-on: [self-hosted, macOS, "${{ matrix.label }}"]
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: false

      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash

      # TODO: this is mega hacky and I'd like a simpler solution.
      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."

          # Check if nix is already available
          if command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
          # Try sourcing profile scripts to set up environment properly
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Sourcing multi-user nix-daemon profile script"
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
          elif [ -f "$HOME/.nix-profile/etc/profile.d/nix.sh" ]; then
            echo "Sourcing single-user nix profile script"
            source "$HOME/.nix-profile/etc/profile.d/nix.sh"
          elif [ -f /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh ]; then
            echo "Sourcing per-user nix profile script"
            source /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh
          elif [ -f /etc/profile.d/nix.sh ]; then
            echo "Sourcing system-wide nix profile script"
            source /etc/profile.d/nix.sh
          # Fallback: manually add nix to PATH if binary exists
          elif [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary, manually adding to PATH"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
          elif [ -f "$HOME/.nix-profile/bin/nix" ]; then
            echo "Found nix binary in user profile, manually adding to PATH"
            export PATH="$HOME/.nix-profile/bin:$PATH"
          else
            echo "Nix not found. Debugging info:"
            echo "USER: $USER"
            echo "HOME: $HOME"
            echo "Current PATH: $PATH"
            echo ""
            echo "Checking common Nix locations:"
            echo "  /nix/var/nix/profiles/default/bin/nix:"
            ls -la /nix/var/nix/profiles/default/bin/nix 2>/dev/null || echo "  Not found"
            echo "  /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh:"
            ls -la /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh 2>/dev/null || echo "  Not found"
            echo "  ~/.nix-profile/etc/profile.d/nix.sh:"
            ls -la "$HOME/.nix-profile/etc/profile.d/nix.sh" 2>/dev/null || echo "  Not found"
            echo "  /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh:"
            ls -la "/nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh" 2>/dev/null || echo "  Not found"
            echo ""
            echo "/nix directory structure:"
            ls -la /nix 2>/dev/null || echo "  /nix directory not found"
            echo ""
            echo "/nix/var:"
            ls -la /nix/var 2>/dev/null || echo "  /nix/var not found"
            echo ""
            echo "/nix/store:"
            ls -la /nix/store 2>/dev/null | head -20 || echo "  /nix/store not found"
            echo ""
            echo "GitHub Actions runner is running as user '$USER'."
            echo "If Nix is installed for a different user, either:"
            echo "  1. Install Nix for user '$USER' (multi-user install recommended)"
            echo "  2. Configure the runner service to run as the user with Nix installed"
            echo "  3. Ensure Nix is installed system-wide with proper daemon setup"
            exit 1
          fi

          # Verify nix is available and persist to GITHUB_ENV
          if command -v nix >/dev/null 2>&1; then
            echo "✓ Nix is available"
            nix --version
            echo "PATH=$PATH" >> $GITHUB_ENV
            if [ -n "$NIX_PATH" ]; then
              echo "NIX_PATH=$NIX_PATH" >> $GITHUB_ENV
            fi
          else
            echo "ERROR: Failed to set up Nix"
            echo "PATH after setup attempt: $PATH"
            exit 1
          fi
        shell: bash

      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-XXXXXXXX)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          EXO_MODELS_DIR="$HOME/.exo/models"
          EXO_LIBP2P_NAMESPACE="bench-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
          echo "EXO_HOME=$EXO_HOME" >> "$GITHUB_ENV"
          echo "API_PORT=$API_PORT" >> "$GITHUB_ENV"
          echo "EXO_MODELS_DIR=$EXO_MODELS_DIR" >> "$GITHUB_ENV"
          echo "EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE" >> "$GITHUB_ENV"
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Using models from: $EXO_MODELS_DIR"
          echo "Using libp2p namespace: $EXO_LIBP2P_NAMESPACE"
        shell: bash

      - name: Configure local MLX if available
        run: |
          RUNNER_LABELS='${{ toJSON(runner.labels) }}'
          if echo "$RUNNER_LABELS" | grep -q "local_mlx"; then
            echo "Runner has 'local_mlx' tag, configuring local MLX paths..."
            MODIFIED=false
            if [ -d "/Users/Shared/mlx" ]; then
              echo "Found /Users/Shared/mlx, enabling local mlx path in pyproject.toml"
              sed -i.bak 's|^# mlx = { path = "/Users/Shared/mlx", editable=true }$|mlx = { path = "/Users/Shared/mlx", editable=true }|' pyproject.toml
              MODIFIED=true
            fi
            if [ -d "/Users/Shared/mlx-lm" ]; then
              echo "Found /Users/Shared/mlx-lm, enabling local mlx-lm path in pyproject.toml"
              sed -i.bak 's|^# mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }$|mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }|' pyproject.toml
              MODIFIED=true
            fi
            if [ "$MODIFIED" = true ]; then
              echo "Modified pyproject.toml [tool.uv.sources] section:"
              sed -n '/\[tool\.uv\.sources\]/,/^\[/p' pyproject.toml | head -n -1
              echo "Regenerating uv.lock with local MLX paths..."
              nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command uv lock --upgrade-package mlx --upgrade-package mlx-lm
            fi
          else
            echo "Runner does not have 'local_mlx' tag, using default PyPI packages"
          fi
        shell: bash

      - name: Sync dependencies
        run: |
          if [ -d "/Users/Shared/test" ]; then
            pushd /Users/Shared/test
            uv sync --reinstall
            popd
          fi
          echo "Running just sync to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync
        shell: bash

      - name: Start EXO and run bench script
        shell: bash
        env:
          IS_PRIMARY: ${{ matrix.is_primary }}
          EXPECTED_NODES: ${{ matrix.expected_nodes }}
          HARDWARE_LABEL: ${{ matrix.label }}
          CONFIG_FILE: ${{ needs.plan.outputs.config_file }}
          TIMEOUT_SECONDS: ${{ needs.plan.outputs.timeout_seconds }}
          ENVIRONMENT_JSON: ${{ needs.plan.outputs.environment }}
        run: |
          set -euo pipefail

          # Parse environment variables from config
          ENV_VARS=""
          if [ -n "$ENVIRONMENT_JSON" ] && [ "$ENVIRONMENT_JSON" != "{}" ]; then
            ENV_VARS=$(echo "$ENVIRONMENT_JSON" | python3 -c "import sys, json; env = json.load(sys.stdin); print(' '.join([f'{k}={v}' for k, v in env.items()]))")
          fi

          echo "Starting EXO with API_PORT=${API_PORT} EXO_HOME=${EXO_HOME} EXO_LIBP2P_NAMESPACE=${EXO_LIBP2P_NAMESPACE}"
          echo "Environment variables from config: $ENV_VARS"
          LOG_FILE=/tmp/exo.log
          : > "$LOG_FILE"

          MASTER_FLAG=""
          if [ "$IS_PRIMARY" = "true" ]; then
            MASTER_FLAG="-m"
          fi

          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
            "EXO_HOME=$EXO_HOME EXO_MODELS_DIR=$EXO_MODELS_DIR EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE $ENV_VARS PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run exo $MASTER_FLAG --api-port $API_PORT" \
            >> "$LOG_FILE" 2>&1 &

          EXO_PID=$!
          echo "Started EXO in background with PID: $EXO_PID"
          echo "Log file: $LOG_FILE"

          cleanup() {
            echo '=== EXO log (tail) ==='
            tail -n 300 "$LOG_FILE" || true
            if ps -p "$EXO_PID" >/dev/null 2>&1; then
              echo "Killing EXO (PID $EXO_PID)"
              kill "$EXO_PID" || true
            fi
          }
          trap cleanup EXIT

          for i in $(seq 1 60); do
            if curl -s "http://localhost:${API_PORT}/state" >/dev/null 2>&1; then
              echo "EXO API ready"
              break
            fi
            if ! ps -p "$EXO_PID" >/dev/null 2>&1; then
              echo "EXO terminated early"; sed -n '1,200p' "$LOG_FILE" || true; exit 1
            fi
            sleep 1
          done

          RESULTS_FILE="/tmp/bench_results_${GITHUB_RUN_ID}_${GITHUB_RUN_ATTEMPT}_$(date +%s).json"
          echo "Results will be saved to: $RESULTS_FILE"
          echo "RESULTS_FILE=$RESULTS_FILE" >> "$GITHUB_ENV"

          echo "Running bench script with config: $CONFIG_FILE, timeout: $TIMEOUT_SECONDS"
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
            "PYTHONUNBUFFERED=1 uv run --no-project --with pyyaml --with pydantic python .github/scripts/bench.py \
              --api-port $API_PORT \
              --config $CONFIG_FILE \
              --expected-nodes ${EXPECTED_NODES} \
              --is-primary ${IS_PRIMARY} \
              --timeout-seconds ${TIMEOUT_SECONDS} \
              --output $RESULTS_FILE \
              --git-commit ${GITHUB_SHA} \
              --hardware-labels ${HARDWARE_LABEL}"

      - name: Install AWS CLI
        if: always() && env.RESULTS_FILE && matrix.is_primary
        run: |
          if ! command -v aws &> /dev/null; then
            echo "AWS CLI not found, installing..."
            brew install awscli
          else
            echo "AWS CLI already installed"
          fi
        shell: bash

      - name: Upload results to S3
        if: always() && env.RESULTS_FILE && matrix.is_primary
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.S3_BENCHMARKS_AWS_ACCESS_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.S3_BENCHMARKS_AWS_SECRET_ACCESS_KEY }}
          AWS_DEFAULT_REGION: us-east-1
        run: |
          echo "Checking for results file: $RESULTS_FILE"
          echo "Is primary: ${{ matrix.is_primary }}"

          if [ -f "$RESULTS_FILE" ]; then
            TIMESTAMP=$(date -u +%Y/%m/%d/%H%M%S)
            S3_KEY="bench/${TIMESTAMP}_${GITHUB_SHA:0:8}_${GITHUB_RUN_ID}.json"
            echo "Uploading results to s3://exo-benchmark-results/$S3_KEY"

            aws s3 cp "$RESULTS_FILE" "s3://exo-benchmark-results/$S3_KEY" \
              --content-type application/json \
              --metadata "commit=${GITHUB_SHA},run_id=${GITHUB_RUN_ID},branch=${GITHUB_REF_NAME}"

            echo "Results uploaded successfully"
            echo "View at: https://exo-benchmark-results.s3.amazonaws.com/$S3_KEY"
          else
            echo "Results file not found at: $RESULTS_FILE"
            echo "Skipping upload"
          fi
        shell: bash

      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()

360 .github/workflows/e2e_test.yml (vendored)
@@ -1,360 +0,0 @@
name: macOS System Info

on:
  workflow_dispatch: # This allows manual triggering
  # push:
  #   branches: [ '*' ]
  #   tags: [ '*' ]

jobs:
  master:
    runs-on: ['self-hosted', 'macOS']

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: true

      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash

      - name: Pull LFS files
        run: |
          echo "Pulling Git LFS files..."
          git lfs pull
        shell: bash

      - name: Reset databases
        run: |
          if [ -d ~/.exo ]; then
            rm -rf ~/.exo/*.db*
          fi

      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-master-XXXXXXXX)
          # Generate random port (macOS compatible method)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          echo "EXO_HOME=$EXO_HOME" >> $GITHUB_ENV
          echo "API_PORT=$API_PORT" >> $GITHUB_ENV
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Verifying API_PORT is set: $API_PORT"
        shell: bash

      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."

          # Check if nix binary exists directly
          if [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary at /nix/var/nix/profiles/default/bin/nix"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
            echo "PATH=$PATH" >> $GITHUB_ENV
            nix --version
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Found nix profile script, sourcing..."
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
            nix --version
          elif command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
            nix --version
          else
            echo "Nix not found. Debugging info:"
            echo "Contents of /nix/var/nix/profiles/default/:"
            ls -la /nix/var/nix/profiles/default/ 2>/dev/null || echo "Directory not found"
            echo "Contents of /nix/var/nix/profiles/default/bin/:"
            ls -la /nix/var/nix/profiles/default/bin/ 2>/dev/null || echo "Directory not found"
            exit 1
          fi
        shell: bash

      - name: Print macOS system information
        run: |
          echo "=== macOS System Information ==="
          echo "OS Version:"
          sw_vers

          echo -e "\n=== Memory Information ==="
          system_profiler SPMemoryDataType

          echo -e "\n=== Memory Usage Summary ==="
          vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages free: (\d+)/ and printf "Free Memory: %.2f GB\n", $1 * $size / 1024 / 1024 / 1024'
          top -l 1 -s 0 | grep PhysMem

          echo -e "\n=== CPU Information ==="
          sysctl -n machdep.cpu.brand_string
          system_profiler SPHardwareDataType | grep -E "Cores|Processors"

          echo -e "\n=== Disk Space ==="
          df -h /

      # - name: Setup Hugging Face token
      #   run: |
      #     mkdir -p ~/.cache/huggingface
      #     echo "${{ secrets.HF_TOKEN }}" > ~/.cache/huggingface/token

      - name: Sync dependencies
        run: |
          echo "Running just sync-clean to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync-clean
        shell: bash

      - name: Build forwarder
        run: |
          echo "Building Go forwarder binary..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just build-forwarder
        shell: bash

      - name: Start node (master)
        run: |
          echo "Starting master node with debug enabled..."
          echo "Environment check - API_PORT: '$API_PORT'"
          echo "Environment check - EXO_HOME: '$EXO_HOME'"
          if [ -z "$API_PORT" ]; then
            echo "ERROR: API_PORT is not set!"
            exit 1
          fi
          # Run with Python unbuffered output and maximum debug level
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME API_PORT=$API_PORT PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run master/main.py" > /tmp/master_node.log 2>&1 &
          MASTER_PID=$!
          echo "Started master node in background with PID: $MASTER_PID"
          echo "Log file: /tmp/master_node.log"

          echo "Starting worker node..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run worker/main.py" > /tmp/worker_node.log 2>&1 &
          WORKER_PID=$!
          echo "Started worker node in background with PID: $WORKER_PID"
          echo "Log file: /tmp/worker_node.log"

          for i in {1..30}; do
            echo "Attempt $i: Checking if master node is ready..."
            if curl -s http://localhost:$API_PORT/state > /dev/null 2>&1; then
              echo "Master node is ready!"
              break
            fi
            if [ $i -eq 30 ]; then
              echo "Master node failed to start within 30 seconds. Checking logs..."
              echo "=== Master node log ==="
              cat /tmp/master_node.log || echo "No master log file found"
              echo "=== Worker node log ==="
              cat /tmp/worker_node.log || echo "No worker log file found"
              exit 1
            fi
            sleep 1
          done

          # wait for master to have a COMPLETE or FAILED task in the state
          for i in {1..30}; do
            if curl -s http://localhost:$API_PORT/state | jq -r '.tasks | any(.task_status == "COMPLETE" or .task_status == "FAILED")' > 0; then
              echo "Master node has a COMPLETE or FAILED task in the state"
              break
            fi
            sleep 1
          done

          echo "=== Master node log ==="
          cat /tmp/master_node.log || echo "No master log file found"
          echo "=== Worker node log ==="
          cat /tmp/worker_node.log || echo "No worker log file found"

      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()

  worker:
    runs-on: ['self-hosted', 'macOS']

    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
        with:
          lfs: true

      - name: Configure git user
        run: |
          git config --local user.email "github-actions@users.noreply.github.com"
          git config --local user.name "github-actions bot"
        shell: bash

      - name: Pull LFS files
        run: |
          echo "Pulling Git LFS files..."
          git lfs pull
        shell: bash

      - name: Reset databases
        run: |
          if [ -d ~/.exo ]; then
            rm -rf ~/.exo/*.db*
          fi

      - name: Setup EXO_HOME and API_PORT
        run: |
          EXO_HOME=$(mktemp -d -t exo-e2e-worker-XXXXXXXX)
          # Generate random port (macOS compatible method)
          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
          echo "EXO_HOME=$EXO_HOME" >> $GITHUB_ENV
          echo "API_PORT=$API_PORT" >> $GITHUB_ENV
          echo "Created EXO_HOME: $EXO_HOME"
          echo "Generated API_PORT: $API_PORT"
          echo "Verifying API_PORT is set: $API_PORT"
        shell: bash

      - name: Setup Nix Environment
        run: |
          echo "Checking for nix installation..."

          # Check if nix binary exists directly
          if [ -f /nix/var/nix/profiles/default/bin/nix ]; then
            echo "Found nix binary at /nix/var/nix/profiles/default/bin/nix"
            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
            echo "PATH=$PATH" >> $GITHUB_ENV
            nix --version
          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
            echo "Found nix profile script, sourcing..."
            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
            nix --version
          elif command -v nix >/dev/null 2>&1; then
            echo "Nix already in PATH"
            nix --version
          else
            echo "Nix not found. Debugging info:"
            echo "Contents of /nix/var/nix/profiles/default/:"
            ls -la /nix/var/nix/profiles/default/ 2>/dev/null || echo "Directory not found"
            echo "Contents of /nix/var/nix/profiles/default/bin/:"
            ls -la /nix/var/nix/profiles/default/bin/ 2>/dev/null || echo "Directory not found"
            exit 1
          fi
        shell: bash

      - name: Print macOS system information
        run: |
          echo "=== macOS System Information ==="
          echo "OS Version:"
          sw_vers

          echo -e "\n=== Memory Information ==="
          system_profiler SPMemoryDataType

          echo -e "\n=== Memory Usage Summary ==="
          vm_stat | perl -ne '/page size of (\d+)/ and $size=$1; /Pages free: (\d+)/ and printf "Free Memory: %.2f GB\n", $1 * $size / 1024 / 1024 / 1024'
          top -l 1 -s 0 | grep PhysMem

          echo -e "\n=== CPU Information ==="
          sysctl -n machdep.cpu.brand_string
          system_profiler SPHardwareDataType | grep -E "Cores|Processors"

          echo -e "\n=== Disk Space ==="
          df -h /

      # - name: Setup Hugging Face token
      #   run: |
      #     mkdir -p ~/.cache/huggingface
      #     echo "${{ secrets.HF_TOKEN }}" > ~/.cache/huggingface/token

      - name: Sync dependencies
        run: |
          echo "Running just sync-clean to ensure clean dependencies..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync-clean
        shell: bash

      - name: Build forwarder
        run: |
          echo "Building Go forwarder binary..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just build-forwarder
        shell: bash

      - name: Start node (replica)
        run: |
          echo "Starting master node with debug enabled..."
          echo "Environment check - API_PORT: '$API_PORT'"
          echo "Environment check - EXO_HOME: '$EXO_HOME'"
          if [ -z "$API_PORT" ]; then
            echo "ERROR: API_PORT is not set!"
            exit 1
          fi
          # Run with Python unbuffered output and maximum debug level
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_RUN_AS_REPLICA=1 EXO_HOME=$EXO_HOME API_PORT=$API_PORT PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run master/main.py" > /tmp/master_node.log 2>&1 &
          MASTER_PID=$!
          echo "Started master node in background with PID: $MASTER_PID"
          echo "Log file: /tmp/master_node.log"

          echo "Starting worker node..."
          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c "EXO_HOME=$EXO_HOME PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run worker/main.py" > /tmp/worker_node.log 2>&1 &
          WORKER_PID=$!
          echo "Started worker node in background with PID: $WORKER_PID"
          echo "Log file: /tmp/worker_node.log"

          echo "Waiting for master node to start on port $API_PORT..."
          # Wait for the master node to be ready (up to 30 seconds)
          for i in {1..30}; do
            echo "Attempt $i: Checking if master node is ready..."
            if curl -s http://localhost:$API_PORT/state > /dev/null 2>&1; then
              echo "Master node is ready!"
              break
            fi
            if [ $i -eq 30 ]; then
              echo "Master node failed to start within 30 seconds. Checking logs..."
              echo "=== Master node log ==="
              cat /tmp/master_node.log || echo "No master log file found"
              echo "=== Worker node log ==="
              cat /tmp/worker_node.log || echo "No worker log file found"
              exit 1
            fi
            sleep 1
          done

          resp=$(curl -X POST http://localhost:$API_PORT/instance -H "Content-Type: application/json" -d '{"model_id": "llama-3.2:1b"}')
          echo "Response: $resp"
          instance_id=$(echo $resp | jq -r '.instance_id')
          echo "Instance ID: $instance_id"

          for i in {1..50}; do
            resp=$(curl -s -w "%{http_code}" -X GET http://localhost:$API_PORT/instance/$instance_id -H "Content-Type: application/json")
            http_code="${resp: -3}"
            response_body="${resp%???}"
            echo "HTTP Code: $http_code"
            echo "Response: $response_body"

            if [ "$http_code" == "200" ]; then
              instance_status=$(echo $response_body | jq -r '.instance_type')
              if [ "$instance_status" == "ACTIVE" ]; then
                echo "Instance is ready"
                break
              fi
            elif [ "$http_code" == "404" ]; then
              echo "Instance not yet created, waiting..."
            else
              echo "Unexpected HTTP status: $http_code"
            fi
            sleep 1
          done

          resp=$(curl http://localhost:$API_PORT/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "llama-3.2:1b", "messages": [{"role": "user", "content": "What is the meaning of exo?"}], "temperature": 0.7}')
          echo "Response: $resp"

          resp=$(curl -X DELETE http://localhost:$API_PORT/instance/$instance_id -H "Content-Type: application/json")
          echo "Response: $resp"

          echo "=== Master node log ==="
          cat /tmp/master_node.log || echo "No master log file found"
          echo "=== Worker node log ==="
          cat /tmp/worker_node.log || echo "No worker log file found"

          kill $MASTER_PID
          kill $WORKER_PID

      - name: Cleanup EXO_HOME
        run: |
          echo "Cleaning up EXO_HOME: $EXO_HOME"
          rm -rf "$EXO_HOME"
        shell: bash
        if: always()

2 .github/workflows/pipeline.yml (vendored)
@@ -17,7 +17,7 @@ jobs:
       - name: Checkout repository
         uses: actions/checkout@v4
         with:
-          lfs: true
+          lfs: false

       - uses: cachix/install-nix-action@v31
         with:

25 TODO.md (new file)
@@ -0,0 +1,25 @@
1. Currently EXO just doesn't start cleanly a lot of the time. I see two kinds of issues:
   b. EXO starts but then after creating an instance that instance never loads (either gets stuck in Loading or Inactive).
2. Currently a lot of requests from the API are timing out, but we still process those requests internally. If an API request times out, we should cancel all tasks corresponding to that API request (why process a request with nobody listening?).
4. I'd like to see profiled network latency / bandwidth.
5. I'd like to see how much bandwidth each link is using.
6. We should handle the case where one machine doesn't have the model downloaded and then other machines are waiting on it. In this case we get loads of timeout errors because the others are waiting for the one that needs to download the model.
7. Solve the problem in continuous batching where a new incoming prompt blocks decode of the current batch until its prefill is complete.
8. We want people to be able to copy models over to a new device without ever connecting EXO to the internet. Right now EXO requires an internet connection once to cache some files to check if a download is complete. Instead, we should simply check if there is a non-empty model folder locally with no .partial files. This indicates it's a fully downloaded model that can be loaded.
10. More granular control over how to deploy instances.
12. Nix is great but installing it is a pain and we have ended up in a lot of cases having PATH issues or installation issues. For example, after rebooting mike it seemed to no longer have a nix installation and needed reinstalling. It has a bunch of broken symlinks left over from nix that caused ssh to fail, making it even harder to debug. We need consistent environments (perhaps MDM) so we can guarantee nix is installed properly on each machine.
13. Memory pressure instead of memory used.
14. Show the type of each connection (TB5, Ethernet, etc.) in the UI. Refer to old exo: https://github.com/exo-explore/exo/blob/56f783b38dc6b08ce606b07a5386dc40dae00330/exo/helpers.py#L251
15. Prioritise certain connection types (or by latency). TB5 > Ethernet > WiFi. Refer to old exo: https://github.com/exo-explore/exo/blob/56f783b38dc6b08ce606b07a5386dc40dae00330/exo/helpers.py#L251
16. Dynamically switch to a higher priority connection when it becomes available. Probably bring back InstanceReplacedAtomically.
17. Faster model loads by streaming the model from other devices in the cluster.
18. Add support for specifying the type of network connection to use in a test. Depends on 15/16.
19. Fix mx.distributed.Group typing.
20. Add chat completion cancellations (e.g. OpenWebUI has something for cancelling an ongoing request).
21. Make two separate things: tensor or pipeline, and ring or ibv.

Potential refactors:

1. Make ForwarderEvent typed
2. Topology can be simplified
3. Get rid of InstanceReplacedAtomically

@@ -1,43 +0,0 @@
#!/usr/bin/env bash

# Get the total memory in MB
TOTAL_MEM_MB=$(($(sysctl -n hw.memsize) / 1024 / 1024))

# Calculate 80% and TOTAL_MEM_GB-5GB in MB
EIGHTY_PERCENT=$(($TOTAL_MEM_MB * 80 / 100))
MINUS_5GB=$((($TOTAL_MEM_MB - 5120)))

# Calculate 70% and TOTAL_MEM_GB-8GB in MB
SEVENTY_PERCENT=$(($TOTAL_MEM_MB * 70 / 100))
MINUS_8GB=$((($TOTAL_MEM_MB - 8192)))

# Set WIRED_LIMIT_MB to higher value
if [ $EIGHTY_PERCENT -gt $MINUS_5GB ]; then
  WIRED_LIMIT_MB=$EIGHTY_PERCENT
else
  WIRED_LIMIT_MB=$MINUS_5GB
fi

# Set WIRED_LWM_MB to higher value
if [ $SEVENTY_PERCENT -gt $MINUS_8GB ]; then
  WIRED_LWM_MB=$SEVENTY_PERCENT
else
  WIRED_LWM_MB=$MINUS_8GB
fi

# Display the calculated values
echo "Total memory: $TOTAL_MEM_MB MB"
echo "Maximum limit (iogpu.wired_limit_mb): $WIRED_LIMIT_MB MB"
echo "Lower bound (iogpu.wired_lwm_mb): $WIRED_LWM_MB MB"

# Apply the values with sysctl, but check if we're already root
if [ "$EUID" -eq 0 ]; then
  sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB
  sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB
else
  # Try without sudo first, fall back to sudo if needed
  sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB 2>/dev/null || \
    sudo sysctl -w iogpu.wired_limit_mb=$WIRED_LIMIT_MB
  sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB 2>/dev/null || \
    sudo sysctl -w iogpu.wired_lwm_mb=$WIRED_LWM_MB
fi

133 copy_model.sh
@@ -1,133 +0,0 @@
#!/usr/bin/env bash
set -euo pipefail

# copy_model.sh: clone ~/.exo/models from SOURCE to one or more TARGETS using scp -3.
# Username defaults:
#   - If host is "aN" and no user given, username defaults to "aN".
#   - Otherwise defaults to $(whoami), unless you pass user@host.
#
# Examples:
#   ./copy_model.sh a1 a2 a3
#   ./copy_model.sh a1 frank@a2 192.168.1.3

if [ $# -lt 2 ]; then
  echo "Usage: $0 SOURCE TARGET [TARGET...]" >&2
  exit 2
fi

SOURCE="$1"
shift
TARGETS=("$@")

DEFAULT_USER="$(whoami)"
MODELS_REL=".exo/models"   # relative under $HOME

timestamp() { date "+%Y-%m-%d %H:%M:%S"; }

split_user_host() {
  local in="$1"
  if [[ "$in" == *"@"* ]]; then
    printf "%s|%s" "${in%%@*}" "${in#*@}"
  else
    printf "|%s" "$in"
  fi
}

resolve_ip() {
  local hostish="$1"
  if [[ "$hostish" =~ ^a([0-9]+)$ ]]; then
    echo "192.168.1.${BASH_REMATCH[1]}"
  else
    echo "$hostish"
  fi
}

default_user_for() {
  local hostish="$1"
  if [[ "$hostish" =~ ^a([0-9]+)$ ]]; then
    echo "$hostish"
  else
    echo "$DEFAULT_USER"
  fi
}

SSH_OPTS=(-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=ERROR -o ConnectTimeout=10)
SSHPASS_BIN="$(command -v sshpass || true)"
SCP_BIN="${SCP_BIN:-scp}"

read -s -p "Password for all hosts: " PASS
echo
if [ -n "$SSHPASS_BIN" ]; then
  echo "$(timestamp) sshpass found: will provide the password non-interactively."
else
  echo "$(timestamp) WARNING: sshpass not found - you'll be prompted by scp/ssh per hop unless keys are set up."
fi

# Build source endpoint (default username logic)
IFS='|' read -r SRC_USER_RAW SRC_HOSTISH <<<"$(split_user_host "$SOURCE")"
SRC_USER="${SRC_USER_RAW:-$(default_user_for "$SRC_HOSTISH")}"
SRC_IP="$(resolve_ip "$SRC_HOSTISH")"
SRC_HOST="${SRC_USER}@${SRC_IP}"

echo "$(timestamp) Source: ${SRC_HOST}:~/${MODELS_REL}"
echo "$(timestamp) Targets: ${#TARGETS[@]}"

# Helper to run a simple remote command via ssh (for mkdir -p checks)
ssh_run() {
  local host="$1"
  shift
  if [ -n "$SSHPASS_BIN" ]; then
    sshpass -p "$PASS" ssh "${SSH_OPTS[@]}" "$host" "$@"
  else
    ssh "${SSH_OPTS[@]}" "$host" "$@"
  fi
}

# Ensure source dir exists (create if missing, per your request)
ssh_run "$SRC_HOST" "mkdir -p ~/${MODELS_REL}"

failures=0
count=0
for T in "${TARGETS[@]}"; do
  count=$((count + 1))
  IFS='|' read -r T_USER_RAW T_HOSTISH <<<"$(split_user_host "$T")"
  T_USER="${T_USER_RAW:-$(default_user_for "$T_HOSTISH")}"
  T_IP="$(resolve_ip "$T_HOSTISH")"
  T_HOST="${T_USER}@${T_IP}"

  echo "============================================================"
  echo "$(timestamp) [${count}/${#TARGETS[@]}] ${SRC_HOST} ==> ${T_HOST}"
  echo "$(timestamp) Ensuring destination directory exists..."
  ssh_run "$T_HOST" "mkdir -p ~/${MODELS_REL%/*}"   # ~/.exo

  # Copy the whole "models" directory into ~/.exo on the target.
  # scp -3 = copy between two remotes via local; -r recursive; -p preserve times/modes
  if [ -n "$SSHPASS_BIN" ]; then
    echo "$(timestamp) Running: scp -3 -rp ${SRC_HOST}:~/${MODELS_REL} ${T_HOST}:~/.exo/"
    if sshpass -p "$PASS" "$SCP_BIN" "${SSH_OPTS[@]}" -3 -rp \
        "${SRC_HOST}:~/${MODELS_REL}" \
        "${T_HOST}:~/.exo/"; then
      echo "$(timestamp) [${count}] Done: ${T_HOST}"
    else
      echo "$(timestamp) [${count}] ERROR during scp to ${T_HOST}" >&2
      failures=$((failures + 1))
    fi
  else
    echo "$(timestamp) Running: scp -3 -rp ${SRC_HOST}:~/${MODELS_REL} ${T_HOST}:~/.exo/"
    if "$SCP_BIN" "${SSH_OPTS[@]}" -3 -rp \
        "${SRC_HOST}:~/${MODELS_REL}" \
        "${T_HOST}:~/.exo/"; then
      echo "$(timestamp) [${count}] Done: ${T_HOST}"
    else
      echo "$(timestamp) [${count}] ERROR during scp to ${T_HOST}" >&2
      failures=$((failures + 1))
    fi
  fi
done

echo "============================================================"
|
||||
if [ "$failures" -eq 0 ]; then
|
||||
echo "$(timestamp) All transfers completed successfully."
|
||||
else
|
||||
echo "$(timestamp) Completed with ${failures} failure(s)."
|
||||
fi
|
||||
@@ -461,6 +461,17 @@
|
||||
margin-bottom: 8px;
|
||||
}
|
||||
|
||||
.instance-strategy {
|
||||
font-size: 13px;
|
||||
color: var(--exo-light-gray);
|
||||
margin-bottom: 8px;
|
||||
}
|
||||
|
||||
.instance-strategy-value {
|
||||
font-weight: 600;
|
||||
color: var(--exo-yellow);
|
||||
}
|
||||
|
||||
.instance-details {
|
||||
font-size: 12px;
|
||||
color: var(--exo-light-gray);
|
||||
@@ -468,15 +479,6 @@
|
||||
|
||||
|
||||
|
||||
.download-progress {
|
||||
font-size: 11px;
|
||||
color: var(--exo-light-gray);
|
||||
margin-top: 4px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.progress-bar-container {
|
||||
background-color: var(--exo-black);
|
||||
border-radius: 8px;
|
||||
@@ -492,75 +494,96 @@
|
||||
transition: width 0.3s ease;
|
||||
}
|
||||
|
||||
/* Detailed download info */
|
||||
.download-details {
|
||||
margin-top: 8px;
|
||||
padding: 12px;
|
||||
background-color: #1a1a1a;
|
||||
border: 1px solid var(--exo-medium-gray);
|
||||
border-radius: 6px;
|
||||
box-sizing: border-box;
|
||||
width: 100%;
|
||||
max-width: 100%;
|
||||
overflow: visible;
|
||||
}
|
||||
.download-runner-header {
|
||||
font-size: 11px;
|
||||
color: var(--exo-light-gray);
|
||||
opacity: 0.85;
|
||||
margin-bottom: 4px;
|
||||
}
|
||||
.download-overview-row {
|
||||
display: flex;
|
||||
gap: 12px;
|
||||
flex-wrap: wrap;
|
||||
font-size: 12px;
|
||||
|
||||
/* Overall download summary styles */
|
||||
.overall-download-summary {
|
||||
margin-top: 10px;
|
||||
margin-bottom: 8px;
|
||||
}
|
||||
.download-overview-item strong {
|
||||
color: #E0E0E0;
|
||||
font-weight: 600;
|
||||
margin-right: 4px;
|
||||
}
|
||||
.progress-with-label {
|
||||
|
||||
.overall-download-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
margin-bottom: 10px;
|
||||
margin-bottom: 4px;
|
||||
}
|
||||
.progress-with-label .progress-bar-container {
|
||||
flex: 1 1 auto;
|
||||
}
|
||||
.progress-percent {
|
||||
font-size: 12px;
|
||||
|
||||
.overall-download-label {
|
||||
font-size: 11px;
|
||||
font-weight: 500;
|
||||
color: var(--exo-light-gray);
|
||||
opacity: 0.7;
|
||||
}
|
||||
|
||||
.overall-download-percent {
|
||||
font-size: 11px;
|
||||
font-weight: 500;
|
||||
color: var(--exo-light-gray);
|
||||
opacity: 0.7;
|
||||
font-variant-numeric: tabular-nums;
|
||||
white-space: nowrap;
|
||||
}
|
||||
.download-overview-combined {
|
||||
font-size: 12px;
|
||||
|
||||
.overall-download-stats {
|
||||
font-size: 10px;
|
||||
color: var(--exo-light-gray);
|
||||
opacity: 0.9;
|
||||
margin-top: 4px;
|
||||
opacity: 0.6;
|
||||
}
|
||||
.instance-download-summary {
|
||||
|
||||
/* Per-node download summary styles */
|
||||
.node-download-summary {
|
||||
margin-top: 12px;
|
||||
padding: 10px;
|
||||
background-color: rgba(0, 0, 0, 0.2);
|
||||
border-radius: 6px;
|
||||
border-left: 3px solid #3b82f6;
|
||||
}
|
||||
|
||||
.node-download-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
margin-bottom: 6px;
|
||||
}
|
||||
|
||||
.node-download-name {
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
color: var(--exo-yellow);
|
||||
}
|
||||
|
||||
.node-download-percent {
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
color: #3b82f6;
|
||||
font-variant-numeric: tabular-nums;
|
||||
}
|
||||
|
||||
.node-download-stats {
|
||||
font-size: 11px;
|
||||
color: var(--exo-light-gray);
|
||||
margin-top: 6px;
|
||||
opacity: 0.95;
|
||||
margin-bottom: 10px;
|
||||
opacity: 0.9;
|
||||
}
|
||||
|
||||
/* File-level download details */
|
||||
.download-files-list {
|
||||
display: grid;
|
||||
gap: 8px;
|
||||
margin-top: 10px;
|
||||
}
|
||||
|
||||
.download-file {
|
||||
padding: 8px;
|
||||
background-color: var(--exo-dark-gray);
|
||||
background-color: rgba(0, 0, 0, 0.3);
|
||||
border: 1px solid var(--exo-medium-gray);
|
||||
border-radius: 6px;
|
||||
box-sizing: border-box;
|
||||
width: 100%;
|
||||
max-width: 100%;
|
||||
}
|
||||
|
||||
.download-file-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
@@ -572,6 +595,7 @@
|
||||
max-width: 100%;
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.download-file-name {
|
||||
color: #E0E0E0;
|
||||
font-weight: 500;
|
||||
@@ -581,11 +605,7 @@
|
||||
min-width: 0;
|
||||
flex: 1 1 auto;
|
||||
}
|
||||
.download-file-stats {
|
||||
color: var(--exo-light-gray);
|
||||
text-align: right;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.download-file-percent {
|
||||
color: var(--exo-light-gray);
|
||||
white-space: nowrap;
|
||||
@@ -593,6 +613,7 @@
|
||||
font-variant-numeric: tabular-nums;
|
||||
flex: 0 0 auto;
|
||||
}
|
||||
|
||||
.download-file-subtext {
|
||||
color: var(--exo-light-gray);
|
||||
font-size: 10px;
|
||||
@@ -603,26 +624,20 @@
|
||||
white-space: nowrap;
|
||||
max-width: 100%;
|
||||
}
|
||||
.download-details, .download-files-list {
|
||||
box-sizing: border-box;
|
||||
width: 100%;
|
||||
max-width: 100%;
|
||||
}
|
||||
.download-files-list {
|
||||
overflow: visible;
|
||||
padding-right: 2px; /* avoid edge clipping */
|
||||
}
|
||||
|
||||
.download-file .progress-bar-container {
|
||||
width: 100%;
|
||||
max-width: 100%;
|
||||
box-sizing: border-box;
|
||||
height: 5px;
|
||||
}
|
||||
|
||||
.completed-files-section {
|
||||
margin-top: 12px;
|
||||
padding-top: 8px;
|
||||
border-top: 1px solid var(--exo-medium-gray);
|
||||
border-top: 1px solid rgba(255, 255, 255, 0.1);
|
||||
}
|
||||
|
||||
.completed-files-header {
|
||||
font-size: 10px;
|
||||
color: var(--exo-light-gray);
|
||||
@@ -630,11 +645,13 @@
|
||||
margin-bottom: 6px;
|
||||
font-weight: 500;
|
||||
}
|
||||
|
||||
.completed-files-list {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 3px;
|
||||
}
|
||||
|
||||
.completed-file-item {
|
||||
font-size: 10px;
|
||||
color: var(--exo-light-gray);
|
||||
@@ -772,6 +789,82 @@
|
||||
cursor: not-allowed;
|
||||
}
|
||||
|
||||
.strategy-selector {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.strategy-options {
|
||||
display: flex;
|
||||
gap: 12px;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.strategy-option {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
cursor: pointer;
|
||||
padding: 8px 12px;
|
||||
border-radius: 6px;
|
||||
background-color: var(--exo-dark-gray);
|
||||
border: 2px solid var(--exo-medium-gray);
|
||||
transition: all 0.2s ease;
|
||||
user-select: none;
|
||||
}
|
||||
|
||||
.strategy-option:hover {
|
||||
background-color: var(--exo-medium-gray);
|
||||
border-color: rgba(255, 215, 0, 0.5);
|
||||
}
|
||||
|
||||
.strategy-option input[type="radio"] {
|
||||
appearance: none;
|
||||
width: 16px;
|
||||
height: 16px;
|
||||
border: 2px solid var(--exo-light-gray);
|
||||
border-radius: 50%;
|
||||
cursor: pointer;
|
||||
position: relative;
|
||||
margin: 0;
|
||||
transition: all 0.2s ease;
|
||||
}
|
||||
|
||||
.strategy-option input[type="radio"]:checked {
|
||||
border-color: var(--exo-yellow);
|
||||
background-color: var(--exo-yellow);
|
||||
}
|
||||
|
||||
.strategy-option input[type="radio"]:checked::after {
|
||||
content: '';
|
||||
position: absolute;
|
||||
top: 50%;
|
||||
left: 50%;
|
||||
transform: translate(-50%, -50%);
|
||||
width: 6px;
|
||||
height: 6px;
|
||||
border-radius: 50%;
|
||||
background-color: var(--exo-black);
|
||||
}
|
||||
|
||||
.strategy-option:has(input[type="radio"]:checked) {
|
||||
background-color: rgba(255, 215, 0, 0.15);
|
||||
border-color: var(--exo-yellow);
|
||||
}
|
||||
|
||||
.strategy-option label {
|
||||
cursor: pointer;
|
||||
font-size: 14px;
|
||||
font-weight: 500;
|
||||
color: var(--exo-light-gray);
|
||||
margin: 0;
|
||||
}
|
||||
|
||||
.strategy-option:has(input[type="radio"]:checked) label {
|
||||
color: var(--exo-yellow);
|
||||
}
|
||||
|
||||
.launch-status {
|
||||
font-size: 12px;
|
||||
padding: 8px;
|
||||
@@ -850,6 +943,33 @@
|
||||
<select id="modelSelect" class="model-select">
|
||||
<option value="">Loading models...</option>
|
||||
</select>
|
||||
|
||||
<div class="strategy-selector">
|
||||
<label class="launch-label">Parallelization Strategy:</label>
|
||||
<div class="strategy-options">
|
||||
<div class="strategy-option">
|
||||
<input type="radio" id="strategyAuto" name="strategy" value="auto" checked>
|
||||
<label for="strategyAuto">Auto</label>
|
||||
</div>
|
||||
<div class="strategy-option">
|
||||
<input type="radio" id="strategyPipeline" name="strategy" value="pipeline">
|
||||
<label for="strategyPipeline">Pipeline</label>
|
||||
</div>
|
||||
<div class="strategy-option">
|
||||
<input type="radio" id="strategyTensor" name="strategy" value="tensor">
|
||||
<label for="strategyTensor">Tensor</label>
|
||||
</div>
|
||||
<div class="strategy-option">
|
||||
<input type="radio" id="strategyPipelineRdma" name="strategy" value="pipeline_rdma">
|
||||
<label for="strategyPipelineRdma">Pipeline RDMA</label>
|
||||
</div>
|
||||
<div class="strategy-option">
|
||||
<input type="radio" id="strategyTensorRdma" name="strategy" value="tensor_rdma">
|
||||
<label for="strategyTensorRdma">Tensor RDMA</label>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<button id="launchInstanceButton" class="launch-button" disabled>Launch Instance</button>
|
||||
<div id="launchStatus" class="launch-status"></div>
|
||||
</div>
|
||||
@@ -1112,6 +1232,9 @@
|
||||
return;
|
||||
}
|
||||
|
||||
const selectedStrategy = document.querySelector('input[name="strategy"]:checked').value;
|
||||
console.log("selectedStrategy", selectedStrategy);
|
||||
|
||||
try {
|
||||
showLaunchStatus('Launching instance...', 'loading');
|
||||
launchInstanceButton.disabled = true;
|
||||
@@ -1121,7 +1244,10 @@
|
||||
headers: {
|
||||
'Content-Type': 'application/json',
|
||||
},
|
||||
body: JSON.stringify({ model_id: selectedModelId })
|
||||
body: JSON.stringify({
|
||||
model_id: selectedModelId,
|
||||
strategy: selectedStrategy
|
||||
})
|
||||
});
|
||||
|
||||
if (!response.ok) {
|
||||
@@ -1251,60 +1377,6 @@
|
||||
return { isDownloading: isDownloadingAny, progress, details };
|
||||
}
|
||||
|
||||
function buildDownloadDetailsHTML(details) {
|
||||
if (!details || details.length === 0) return '';
|
||||
function shortId(id) { return (id && id.length > 8) ? id.slice(0, 8) + '…' : (id || ''); }
|
||||
return details.map(({ runnerId, nodeId, progress }) => {
|
||||
const etaStr = formatDurationMs(progress.etaMs);
|
||||
const pctStr = formatPercent(progress.percentage || 0, 2);
|
||||
const bytesStr = `${formatBytes(progress.downloadedBytes)} / ${formatBytes(progress.totalBytes)}`;
|
||||
const speedStr = formatBytesPerSecond(progress.speed);
|
||||
const filesSummary = `${progress.completedFiles}/${progress.totalFiles}`;
|
||||
|
||||
const allFiles = progress.files || [];
|
||||
const inProgressFiles = allFiles.filter(f => (f.percentage || 0) < 100);
|
||||
const completedFiles = allFiles.filter(f => (f.percentage || 0) >= 100);
|
||||
|
||||
const inProgressHTML = inProgressFiles.map(f => {
|
||||
const fPct = f.percentage || 0;
|
||||
const fBytes = `${formatBytes(f.downloadedBytes)} / ${formatBytes(f.totalBytes)}`;
|
||||
const fEta = formatDurationMs(f.etaMs);
|
||||
const fSpeed = formatBytesPerSecond(f.speed);
|
||||
const pctText = formatPercent(fPct, 2);
|
||||
return `
|
||||
<div class="download-file">
|
||||
<div class="download-file-header">
|
||||
<span class="download-file-name" title="${f.name}">${f.name}</span>
|
||||
<span class="download-file-percent">${pctText}</span>
|
||||
</div>
|
||||
<div class="download-file-subtext">${fBytes} • ETA ${fEta} • ${fSpeed}</div>
|
||||
<div class="progress-bar-container"><div class="progress-bar" style="width: ${Math.max(0, Math.min(100, fPct)).toFixed(2)}%;"></div></div>
|
||||
</div>
|
||||
`;
|
||||
}).join('');
|
||||
|
||||
const completedHTML = completedFiles.length > 0 ? `
|
||||
<div class="completed-files-section">
|
||||
<div class="completed-files-header">Completed (${completedFiles.length})</div>
|
||||
<div class="completed-files-list">
|
||||
${completedFiles.map(f => `<div class="completed-file-item" title="${f.name}">${f.name}</div>`).join('')}
|
||||
</div>
|
||||
</div>
|
||||
` : '';
|
||||
|
||||
const runnerName = (nodeId && nodeIdToFriendlyName[nodeId]) ? nodeIdToFriendlyName[nodeId] : '?';
|
||||
const headerText = `${runnerName} (${shortId(nodeId || '')})`;
|
||||
return `
|
||||
<div class="download-details">
|
||||
<div class="download-runner-header">${headerText}</div>
|
||||
<div class="download-files-list">
|
||||
${inProgressHTML}
|
||||
</div>
|
||||
${completedHTML}
|
||||
</div>
|
||||
`;
|
||||
}).join('');
|
||||
}
|
||||
|
||||
// Derive a display status for an instance from its runners.
|
||||
// Priority: FAILED > DOWNLOADING > STARTING > RUNNING > LOADED > INACTIVE
|
||||
@@ -1383,9 +1455,37 @@
|
||||
? instance.instanceId.substring(0, 8) + '...'
|
||||
: instance.instanceId;
|
||||
|
||||
const hostsHTML = instance.hosts?.map(host =>
|
||||
`<span class="instance-host">${host.ip}:${host.port}</span>`
|
||||
).join('') || '';
|
||||
// Create reverse mapping from runnerId to nodeId using nodeToRunner
|
||||
const nodeToRunner = instance.shardAssignments?.nodeToRunner || {};
|
||||
const runnerToNode = {};
|
||||
Object.entries(nodeToRunner).forEach(([nodeId, runnerId]) => {
|
||||
runnerToNode[runnerId] = nodeId;
|
||||
});
|
||||
|
||||
// Extract parallelization strategy from the first shard
|
||||
const runnerToShard = instance.shardAssignments?.runnerToShard || {};
|
||||
const firstShardData = Object.values(runnerToShard)[0];
|
||||
let parallelizationStrategy = 'Unknown';
|
||||
if (firstShardData) {
|
||||
const shardKeys = Object.keys(firstShardData);
|
||||
if (shardKeys.length === 1) {
|
||||
const shardPayload = firstShardData[shardKeys[0]];
|
||||
parallelizationStrategy = shardPayload?.strategy || firstShardData.strategy || 'Unknown';
|
||||
} else {
|
||||
parallelizationStrategy = firstShardData.strategy || 'Unknown';
|
||||
}
|
||||
}
|
||||
|
||||
// Generate hosts HTML using runner IDs and friendly names
|
||||
const runnerIds = Object.keys(runnerToShard);
|
||||
const hostsHTML = runnerIds.map(runnerId => {
|
||||
const nodeId = runnerToNode[runnerId];
|
||||
const friendlyName = nodeId && nodeIdToFriendlyName[nodeId]
|
||||
? nodeIdToFriendlyName[nodeId]
|
||||
: 'Unknown Node';
|
||||
const shortId = runnerId.slice(-4);
|
||||
return `<span class="instance-host">${friendlyName} (${shortId})</span>`;
|
||||
}).join('') || '';
|
||||
|
||||
// Calculate download status for this instance
|
||||
const downloadStatus = calculateInstanceDownloadStatus(instance, runners);
|
||||
@@ -1397,32 +1497,95 @@
|
||||
({ statusText, statusClass } = deriveInstanceStatus(instance, runners));
|
||||
}
|
||||
|
||||
// Generate download progress HTML
|
||||
// Generate download progress HTML - overall + per node with file details
|
||||
let downloadProgressHTML = '';
|
||||
let instanceDownloadSummary = '';
|
||||
if (downloadStatus.isDownloading) {
|
||||
const detailsHTML = buildDownloadDetailsHTML(downloadStatus.details || []);
|
||||
const pctText = (downloadStatus.progress || 0).toFixed(2);
|
||||
// Aggregate a compact summary from the first runner (they should be consistent in aggregate)
|
||||
const first = (downloadStatus.details || [])[0]?.progress;
|
||||
const etaStr = first ? formatDurationMs(first.etaMs) : '—';
|
||||
const bytesStr = first ? `${formatBytes(first.downloadedBytes)} / ${formatBytes(first.totalBytes)}` : '';
|
||||
const speedStr = first ? formatBytesPerSecond(first.speed) : '';
|
||||
const filesSummary = first ? `${first.completedFiles}/${first.totalFiles}` : '';
|
||||
instanceDownloadSummary = `${etaStr} · ${bytesStr} · ${speedStr} · ${filesSummary} files`;
|
||||
|
||||
downloadProgressHTML = `
|
||||
<div class="download-progress">
|
||||
<span>${pctText}%</span>
|
||||
<div class="progress-bar-container">
|
||||
<div class="progress-bar" style="width: ${pctText}%;"></div>
|
||||
// Calculate overall progress across all nodes
|
||||
const overallPct = (downloadStatus.progress || 0).toFixed(2);
|
||||
const totalBytesAll = downloadStatus.details.reduce((sum, d) => sum + (d.progress.totalBytes || 0), 0);
|
||||
const downloadedBytesAll = downloadStatus.details.reduce((sum, d) => sum + (d.progress.downloadedBytes || 0), 0);
|
||||
const nodeCount = downloadStatus.details.length;
|
||||
|
||||
// Overall progress section
|
||||
const overallHTML = `
|
||||
<div class="overall-download-summary">
|
||||
<div class="overall-download-header">
|
||||
<span class="overall-download-label">Overall</span>
|
||||
<span class="overall-download-percent">${overallPct}%</span>
|
||||
</div>
|
||||
<div class="progress-bar-container">
|
||||
<div class="progress-bar" style="width: ${overallPct}%;"></div>
|
||||
</div>
|
||||
<div class="overall-download-stats">${formatBytes(downloadedBytesAll)} / ${formatBytes(totalBytesAll)} • ${nodeCount} runner${nodeCount !== 1 ? 's' : ''}</div>
|
||||
</div>
|
||||
${detailsHTML}
|
||||
`;
|
||||
|
||||
const perNodeHTML = (downloadStatus.details || []).map(({ runnerId, nodeId, progress }) => {
|
||||
const nodeName = (nodeId && nodeIdToFriendlyName[nodeId])
|
||||
? nodeIdToFriendlyName[nodeId]
|
||||
: (nodeIdToFriendlyName[runnerId] || 'Unknown Node');
|
||||
const pctText = (progress.percentage || 0).toFixed(2);
|
||||
const etaStr = formatDurationMs(progress.etaMs);
|
||||
const bytesStr = `${formatBytes(progress.downloadedBytes)} / ${formatBytes(progress.totalBytes)}`;
|
||||
const speedStr = formatBytesPerSecond(progress.speed);
|
||||
const filesSummary = `${progress.completedFiles}/${progress.totalFiles} files`;
|
||||
|
||||
// Separate files into in-progress and completed
|
||||
const allFiles = progress.files || [];
|
||||
const inProgressFiles = allFiles.filter(f => (f.percentage || 0) < 100);
|
||||
const completedFiles = allFiles.filter(f => (f.percentage || 0) >= 100);
|
||||
|
||||
// Generate HTML for in-progress files
|
||||
const inProgressHTML = inProgressFiles.map(f => {
|
||||
const fPct = f.percentage || 0;
|
||||
const fBytes = `${formatBytes(f.downloadedBytes)} / ${formatBytes(f.totalBytes)}`;
|
||||
const fEta = formatDurationMs(f.etaMs);
|
||||
const fSpeed = formatBytesPerSecond(f.speed);
|
||||
const pctFormatted = formatPercent(fPct, 2);
|
||||
return `
|
||||
<div class="download-file">
|
||||
<div class="download-file-header">
|
||||
<span class="download-file-name" title="${f.name}">${f.name}</span>
|
||||
<span class="download-file-percent">${pctFormatted}</span>
|
||||
</div>
|
||||
<div class="download-file-subtext">${fBytes} • ETA ${fEta} • ${fSpeed}</div>
|
||||
<div class="progress-bar-container"><div class="progress-bar" style="width: ${Math.max(0, Math.min(100, fPct)).toFixed(2)}%;"></div></div>
|
||||
</div>
|
||||
`;
|
||||
}).join('');
|
||||
|
||||
// Generate HTML for completed files
|
||||
const completedHTML = completedFiles.length > 0 ? `
|
||||
<div class="completed-files-section">
|
||||
<div class="completed-files-header">Completed (${completedFiles.length})</div>
|
||||
<div class="completed-files-list">
|
||||
${completedFiles.map(f => `<div class="completed-file-item" title="${f.name}">${f.name}</div>`).join('')}
|
||||
</div>
|
||||
</div>
|
||||
` : '';
|
||||
|
||||
return `
|
||||
<div class="node-download-summary">
|
||||
<div class="node-download-header">
|
||||
<span class="node-download-name">${nodeName}</span>
|
||||
<span class="node-download-percent">${pctText}%</span>
|
||||
</div>
|
||||
<div class="progress-bar-container">
|
||||
<div class="progress-bar" style="width: ${pctText}%;"></div>
|
||||
</div>
|
||||
<div class="node-download-stats">${etaStr} · ${bytesStr} · ${speedStr} · ${filesSummary}</div>
|
||||
<div class="download-files-list">
|
||||
${inProgressHTML}
|
||||
</div>
|
||||
${completedHTML}
|
||||
</div>
|
||||
`;
|
||||
}).join('');
|
||||
|
||||
downloadProgressHTML = overallHTML + perNodeHTML;
|
||||
}
|
||||
|
||||
const shardCount = Object.keys(instance.shardAssignments?.runnerToShard || {}).length;
|
||||
const shardCount = Object.keys(runnerToShard).length;
|
||||
return `
|
||||
<div class="instance-item">
|
||||
<div class="instance-header">
|
||||
@@ -1436,8 +1599,8 @@
|
||||
</button>
|
||||
</div>
|
||||
</div>
|
||||
<div class="instance-model">${modelId} <span style="color: var(--exo-light-gray); opacity: 0.8;">(${shardCount})</span></div>
|
||||
${instanceDownloadSummary ? `<div class="instance-download-summary">${instanceDownloadSummary}</div>` : ''}
|
||||
<div class="instance-model">${modelId} <span style="color: var(--exo-light-gray); opacity: 0.8;">(${shardCount} runner${shardCount !== 1 ? 's' : ''})</span></div>
|
||||
<div class="instance-strategy">Strategy: <span class="instance-strategy-value">${parallelizationStrategy}</span></div>
|
||||
|
||||
${downloadProgressHTML}
|
||||
${hostsHTML ? `<div class="instance-hosts">${hostsHTML}</div>` : ''}
|
||||
|
||||
18
flake.lock
generated
18
flake.lock
generated
@@ -8,11 +8,11 @@
|
||||
"rust-analyzer-src": "rust-analyzer-src"
|
||||
},
|
||||
"locked": {
|
||||
"lastModified": 1755585599,
|
||||
"narHash": "sha256-tl/0cnsqB/Yt7DbaGMel2RLa7QG5elA8lkaOXli6VdY=",
|
||||
"lastModified": 1761893049,
|
||||
"narHash": "sha256-1TtFDPhC+ZsrOOtBnry1EZC+WipTTvsOVjIEVugqji8=",
|
||||
"owner": "nix-community",
|
||||
"repo": "fenix",
|
||||
"rev": "6ed03ef4c8ec36d193c18e06b9ecddde78fb7e42",
|
||||
"rev": "c2ac9a5c0d6d16630c3b225b874bd14528d1abe6",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
@@ -41,11 +41,11 @@
|
||||
},
|
||||
"nixpkgs": {
|
||||
"locked": {
|
||||
"lastModified": 1755615617,
|
||||
"narHash": "sha256-HMwfAJBdrr8wXAkbGhtcby1zGFvs+StOp19xNsbqdOg=",
|
||||
"lastModified": 1761672384,
|
||||
"narHash": "sha256-o9KF3DJL7g7iYMZq9SWgfS1BFlNbsm6xplRjVlOCkXI=",
|
||||
"owner": "NixOS",
|
||||
"repo": "nixpkgs",
|
||||
"rev": "20075955deac2583bb12f07151c2df830ef346b4",
|
||||
"rev": "08dacfca559e1d7da38f3cf05f1f45ee9bfd213c",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
@@ -65,11 +65,11 @@
|
||||
"rust-analyzer-src": {
|
||||
"flake": false,
|
||||
"locked": {
|
||||
"lastModified": 1755504847,
|
||||
"narHash": "sha256-VX0B9hwhJypCGqncVVLC+SmeMVd/GAYbJZ0MiiUn2Pk=",
|
||||
"lastModified": 1761849405,
|
||||
"narHash": "sha256-igXdvC+WCUN+3gnfk+ptT7rMmxQuY6WbIg1rXMUN1DM=",
|
||||
"owner": "rust-lang",
|
||||
"repo": "rust-analyzer",
|
||||
"rev": "a905e3b21b144d77e1b304e49f3264f6f8d4db75",
|
||||
"rev": "f7de8ae045a5fe80f1203c5a1c3015b05f7c3550",
|
||||
"type": "github"
|
||||
},
|
||||
"original": {
|
||||
|
||||
@@ -61,6 +61,10 @@
|
||||
# JUST
|
||||
just
|
||||
]
|
||||
++ (pkgs.lib.optionals pkgs.stdenv.isLinux [
|
||||
# IFCONFIG
|
||||
unixtools.ifconfig
|
||||
])
|
||||
++ (pkgs.lib.optionals pkgs.stdenv.isDarwin [
|
||||
# MACMON
|
||||
macmon
|
||||
@@ -68,8 +72,8 @@
|
||||
|
||||
shellHook = ''
|
||||
# PYTHON
|
||||
export DASHBOARD_DIR=$(git rev-parse --show-toplevel)/dashboard;
|
||||
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:${pkgs.python313}/lib
|
||||
export DASHBOARD_DIR="$(git rev-parse --show-toplevel)/dashboard"
|
||||
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:${pkgs.python313}/lib"
|
||||
echo
|
||||
echo "🍎🍎 Run 'just <recipe>' to get started"
|
||||
just --list
|
||||
|
||||
4
justfile
4
justfile
@@ -16,6 +16,10 @@ sync:
|
||||
sync-clean:
|
||||
uv sync --all-packages --force-reinstall --no-cache
|
||||
|
||||
rust-rebuild:
|
||||
cd rust && cargo run --bin stub_gen
|
||||
just sync-clean
|
||||
|
||||
clean:
|
||||
rm -rf **/__pycache__
|
||||
rm -rf rust/target
|
||||
|
||||
@@ -1,65 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
###############################################################################
|
||||
# Args & prerequisites
|
||||
###############################################################################
|
||||
if [[ $# -gt 1 ]]; then
|
||||
echo "Usage: $0 [hosts_file]" >&2
|
||||
exit 1
|
||||
fi
|
||||
HOSTS_FILE=${1:-hosts.txt}
|
||||
|
||||
###############################################################################
|
||||
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
|
||||
###############################################################################
|
||||
if [[ ! -f "$HOSTS_FILE" ]]; then
|
||||
echo "Error: $HOSTS_FILE not found"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if builtin command -v mapfile >/dev/null 2>&1; then
|
||||
mapfile -t HOSTS <"$HOSTS_FILE"
|
||||
else
|
||||
HOSTS=()
|
||||
while IFS= read -r h; do
|
||||
[[ -n "$h" ]] && HOSTS+=("$h")
|
||||
done <"$HOSTS_FILE"
|
||||
fi
|
||||
[[ ${#HOSTS[@]} -gt 0 ]] || {
|
||||
echo "No hosts found in $HOSTS_FILE"
|
||||
exit 1
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Helper – run a remote command and capture rc/stderr/stdout
|
||||
###############################################################################
|
||||
ssh_opts=(-o StrictHostKeyChecking=no
|
||||
-o LogLevel=ERROR)
|
||||
|
||||
run_remote() { # $1 host $2 command
|
||||
local host=$1 cmd=$2 rc
|
||||
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
|
||||
rc=0
|
||||
else
|
||||
rc=$?
|
||||
fi
|
||||
return $rc
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Kill exo everywhere (parallel)
|
||||
###############################################################################
|
||||
echo "=== Killing exo on ${#HOSTS[@]} host(s) ==="
|
||||
fail=0
|
||||
for h in "${HOSTS[@]}"; do
|
||||
(
|
||||
run_remote "$h" 'pkill -f exo || true'
|
||||
) || fail=1 &
|
||||
done
|
||||
wait
|
||||
((fail == 0)) || {
|
||||
echo "❌ Some hosts could not be reached—check SSH access."
|
||||
exit 1
|
||||
}
|
||||
echo "✓ exo processes killed on all reachable hosts."
|
||||
@@ -26,8 +26,6 @@ dependencies = [
|
||||
"sqlalchemy[asyncio]>=2.0.43",
|
||||
"greenlet>=3.2.4",
|
||||
"huggingface-hub>=0.33.4",
|
||||
"mlx==0.29.3",
|
||||
"mlx-lm==0.28.3",
|
||||
"psutil>=7.0.0",
|
||||
"transformers>=4.55.2",
|
||||
"cobs>=1.2.2",
|
||||
@@ -36,6 +34,8 @@ dependencies = [
|
||||
"exo_pyo3_bindings", # rust bindings
|
||||
"anyio>=4.10.0",
|
||||
"bidict>=0.23.1",
|
||||
"mlx>=0.29.3",
|
||||
"mlx-lm>=0.28.3",
|
||||
]
|
||||
|
||||
[project.scripts]
|
||||
@@ -52,7 +52,7 @@ dev = [
|
||||
]
|
||||
|
||||
# mlx[cuda] requires a newer version of mlx. the ideal on linux is: default to mlx[cpu] unless[cuda] specified.
|
||||
# [project.optional-dependencies]
|
||||
[project.optional-dependencies]
|
||||
# cuda = [
|
||||
# "mlx[cuda]==0.26.3",
|
||||
# ]
|
||||
@@ -69,6 +69,9 @@ members = [
|
||||
|
||||
[tool.uv.sources]
|
||||
exo_pyo3_bindings = { workspace = true }
|
||||
# Uncomment to use local mlx/mlx-lm development versions:
|
||||
# mlx = { path = "/Users/Shared/mlx", editable=true }
|
||||
# mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }
|
||||
|
||||
[build-system]
|
||||
requires = ["uv_build>=0.8.9,<0.9.0"]
|
||||
@@ -94,7 +97,7 @@ reportUnnecessaryTypeIgnoreComment = "error"
|
||||
pythonVersion = "3.13"
|
||||
pythonPlatform = "Darwin"
|
||||
|
||||
exclude = ["**/.venv", "**/venv", "**/__pycache__", "**/exo_scripts", "**/.direnv", "**/rust"]
|
||||
exclude = ["**/.venv", "**/venv", "**/__pycache__", "**/exo_scripts", "**/.direnv", "**/rust", "mlx/*", "mlx-lm/*"]
|
||||
stubPath = "typings"
|
||||
|
||||
[[tool.basedpyright.executionEnvironments]]
|
||||
|
||||
@@ -1,82 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
###############################################################################
|
||||
# Args & prerequisites
|
||||
###############################################################################
|
||||
if [[ $# -lt 1 ]]; then
|
||||
echo "Usage: $0 <git_command> [git_args...]" >&2
|
||||
echo "Examples:" >&2
|
||||
echo " $0 pull" >&2
|
||||
echo " $0 checkout main" >&2
|
||||
echo " $0 status" >&2
|
||||
echo " $0 fetch --all" >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
GIT_CMD="$*" # All args form the git command
|
||||
HOSTS_FILE=${HOSTS_FILE:-hosts.txt}
|
||||
|
||||
###############################################################################
|
||||
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
|
||||
###############################################################################
|
||||
if [[ ! -f "$HOSTS_FILE" ]]; then
|
||||
echo "Error: $HOSTS_FILE not found"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if builtin command -v mapfile >/dev/null 2>&1; then
|
||||
mapfile -t HOSTS <"$HOSTS_FILE"
|
||||
else
|
||||
HOSTS=()
|
||||
while IFS= read -r h; do
|
||||
[[ -n "$h" ]] && HOSTS+=("$h")
|
||||
done <"$HOSTS_FILE"
|
||||
fi
|
||||
[[ ${#HOSTS[@]} -gt 0 ]] || {
|
||||
echo "No hosts found in $HOSTS_FILE"
|
||||
exit 1
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Helper – run a remote command and capture rc/stderr/stdout
|
||||
###############################################################################
|
||||
ssh_opts=(-o StrictHostKeyChecking=no
|
||||
-o LogLevel=ERROR)
|
||||
|
||||
run_remote() { # $1 host $2 command
|
||||
local host=$1 cmd=$2 rc
|
||||
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
|
||||
rc=0
|
||||
else
|
||||
rc=$?
|
||||
fi
|
||||
return $rc
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Run git command on remote hosts (parallel)
|
||||
###############################################################################
|
||||
echo ""
|
||||
echo "=== Running 'git $GIT_CMD' on ${#HOSTS[@]} remote host(s) ==="
|
||||
fail=0
|
||||
for h in "${HOSTS[@]}"; do
|
||||
(
|
||||
echo "→ Running on $h..."
|
||||
if run_remote "$h" "cd ~/exo && git $GIT_CMD"; then
|
||||
echo " ✓ $h: success"
|
||||
else
|
||||
echo " ❌ $h: failed"
|
||||
exit 1
|
||||
fi
|
||||
) || fail=1 &
|
||||
done
|
||||
wait
|
||||
|
||||
echo ""
|
||||
if ((fail == 0)); then
|
||||
echo "🎉 Git command executed successfully on all hosts!"
|
||||
else
|
||||
echo "⚠️ Some hosts failed—see above."
|
||||
exit 1
|
||||
fi
|
||||
48
run.sh
48
run.sh
@@ -1,48 +0,0 @@
|
||||
#!/bin/bash
|
||||
DIR="$PWD"
|
||||
|
||||
# Initialize flags
|
||||
REPLICA=false
|
||||
CLEAN=false
|
||||
|
||||
# Parse command line arguments
|
||||
while getopts "rc" opt; do
|
||||
case $opt in
|
||||
r)
|
||||
REPLICA=true
|
||||
;;
|
||||
c)
|
||||
CLEAN=true
|
||||
;;
|
||||
\?)
|
||||
echo "Invalid option: -$OPTARG" >&2
|
||||
echo "Usage: $0 [-r] [-c]"
|
||||
echo " -r Run as replica"
|
||||
echo " -c Clean databases before starting"
|
||||
exit 1
|
||||
;;
|
||||
esac
|
||||
done
|
||||
|
||||
# Clean if requested
|
||||
if [ "$CLEAN" = true ]; then
|
||||
echo "Cleaning databases..."
|
||||
rm -f ~/.exo/*db*
|
||||
fi
|
||||
|
||||
# Configure MLX
|
||||
# ./configure_mlx.sh
|
||||
|
||||
# Second command (master) - changes based on replica flag
|
||||
if [ "$REPLICA" = true ]; then
|
||||
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export RUST_LOG=true EXO_RUN_AS_REPLICA=1 EXO_HOME=.exo API_PORT=8001; uv run exo-master'\""
|
||||
else
|
||||
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export RUST_LOG=true; uv run exo-master'\""
|
||||
fi
|
||||
|
||||
# First command (worker) - changes based on replica flag
|
||||
if [ "$REPLICA" = true ]; then
|
||||
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c bash -c 'export EXO_HOME=.exo; uv run exo-worker'\""
|
||||
else
|
||||
osascript -e "tell app \"Terminal\" to do script \"cd '$DIR'; nix develop -c uv run exo-worker\""
|
||||
fi
|
||||
@@ -1,99 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
set -euo pipefail
|
||||
|
||||
###############################################################################
|
||||
# Args & prerequisites
|
||||
###############################################################################
|
||||
if [[ $# -gt 1 ]]; then
|
||||
echo "Usage: $0 [hosts_file]" >&2
|
||||
exit 1
|
||||
fi
|
||||
HOSTS_FILE=${1:-hosts.txt}
|
||||
|
||||
###############################################################################
|
||||
# Load hosts.txt (works on macOS Bash 3.2 and Bash 4+)
|
||||
###############################################################################
|
||||
if [[ ! -f "$HOSTS_FILE" ]]; then
|
||||
echo "Error: $HOSTS_FILE not found"
|
||||
exit 1
|
||||
fi
|
||||
|
||||
if builtin command -v mapfile >/dev/null 2>&1; then
|
||||
mapfile -t HOSTS <"$HOSTS_FILE"
|
||||
else
|
||||
HOSTS=()
|
||||
while IFS= read -r h; do
|
||||
[[ -n "$h" ]] && HOSTS+=("$h")
|
||||
done <"$HOSTS_FILE"
|
||||
fi
|
||||
[[ ${#HOSTS[@]} -gt 0 ]] || {
|
||||
echo "No hosts found in $HOSTS_FILE"
|
||||
exit 1
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Helper – run a remote command and capture rc/stderr/stdout
|
||||
###############################################################################
|
||||
ssh_opts=(-o StrictHostKeyChecking=no
|
||||
-o LogLevel=ERROR)
|
||||
|
||||
run_remote() { # $1 host $2 command
|
||||
local host=$1 cmd=$2 rc
|
||||
if ssh "${ssh_opts[@]}" "$host" "$cmd"; then
|
||||
rc=0
|
||||
else
|
||||
rc=$?
|
||||
fi
|
||||
return $rc
|
||||
}
|
||||
|
||||
###############################################################################
|
||||
# Phase 1 – kill exo everywhere (parallel)
|
||||
###############################################################################
|
||||
echo "=== Stage 1: killing exo on ${#HOSTS[@]} host(s) ==="
|
||||
fail=0
|
||||
for h in "${HOSTS[@]}"; do
|
||||
(
|
||||
run_remote "$h" 'pkill -f exo || true'
|
||||
) || fail=1 &
|
||||
done
|
||||
wait
|
||||
((fail == 0)) || {
|
||||
echo "❌ Some hosts could not be reached—check SSH access."
|
||||
exit 1
|
||||
}
|
||||
echo "✓ exo processes killed on all reachable hosts."
|
||||
#
|
||||
###############################################################################
|
||||
# Phase 2 – cleanup database files (parallel)
|
||||
###############################################################################
|
||||
echo "=== Stage 2: cleaning up database files ==="
|
||||
fail=0
|
||||
for h in "${HOSTS[@]}"; do
|
||||
(
|
||||
run_remote "$h" 'rm -f ~/.exo/*db* || true'
|
||||
) || fail=1 &
|
||||
done
|
||||
wait
|
||||
((fail == 0)) || {
|
||||
echo "❌ Some hosts failed database cleanup."
|
||||
exit 1
|
||||
}
|
||||
echo "✓ Database files cleaned on all hosts."
|
||||
|
||||
###############################################################################
|
||||
# Phase 3 – start new exo processes in Terminal windows (parallel)
|
||||
###############################################################################
|
||||
echo "=== Stage 3: starting new exo processes ==="
|
||||
fail=0
|
||||
for h in "${HOSTS[@]}"; do
|
||||
# Use osascript to open Terminal windows on remote Mac
|
||||
remote_cmd="osascript -e \"tell app \\\"Terminal\\\" to do script \\\"cd ~/exo; nix develop --command uv run exo\\\"\""
|
||||
|
||||
(run_remote "$h" "$remote_cmd") || fail=1 &
|
||||
done
|
||||
wait
|
||||
((fail == 0)) && echo "🎉 Deployment finished!" || {
|
||||
echo "⚠️ Some starts failed—see above."
|
||||
exit 1
|
||||
}
|
||||
@@ -2,8 +2,16 @@
|
||||
# ruff: noqa: E501, F401
|
||||
|
||||
import builtins
|
||||
from enum import Enum
|
||||
import enum
|
||||
import typing
|
||||
|
||||
@typing.final
|
||||
class AllQueuesFullError(builtins.Exception):
|
||||
def __new__(cls, *args: typing.Any) -> AllQueuesFullError: ...
|
||||
def __repr__(self) -> builtins.str: ...
|
||||
def __str__(self) -> builtins.str: ...
|
||||
|
||||
@typing.final
|
||||
class ConnectionUpdate:
|
||||
@property
|
||||
def update_type(self) -> ConnectionUpdateType:
|
||||
@@ -26,6 +34,7 @@ class ConnectionUpdate:
|
||||
Remote connection's TCP port.
|
||||
"""
|
||||
|
||||
@typing.final
|
||||
class Keypair:
|
||||
r"""
|
||||
Identity keypair of a node.
|
||||
@@ -46,12 +55,12 @@ class Keypair:
|
||||
Generate a new Secp256k1 keypair.
|
||||
"""
|
||||
@staticmethod
|
||||
def from_protobuf_encoding(bytes:bytes) -> Keypair:
|
||||
def from_protobuf_encoding(bytes: bytes) -> Keypair:
|
||||
r"""
|
||||
Decode a private key from a protobuf structure and parse it as a `Keypair`.
|
||||
"""
|
||||
@staticmethod
|
||||
def rsa_from_pkcs8(bytes:bytes) -> Keypair:
|
||||
def rsa_from_pkcs8(bytes: bytes) -> Keypair:
|
||||
r"""
|
||||
Decode an keypair from a DER-encoded secret key in PKCS#8 `PrivateKeyInfo`
|
||||
format (i.e. unencrypted) as defined in [RFC5208].
|
||||
@@ -59,7 +68,7 @@ class Keypair:
|
||||
[RFC5208]: https://tools.ietf.org/html/rfc5208#section-5
|
||||
"""
|
||||
@staticmethod
|
||||
def secp256k1_from_der(bytes:bytes) -> Keypair:
|
||||
def secp256k1_from_der(bytes: bytes) -> Keypair:
|
||||
r"""
|
||||
Decode a keypair from a DER-encoded Secp256k1 secret key in an `ECPrivateKey`
|
||||
structure as defined in [RFC5915].
|
||||
@@ -67,7 +76,7 @@ class Keypair:
|
||||
[RFC5915]: https://tools.ietf.org/html/rfc5915
|
||||
"""
|
||||
@staticmethod
|
||||
def ed25519_from_bytes(bytes:bytes) -> Keypair: ...
|
||||
def ed25519_from_bytes(bytes: bytes) -> Keypair: ...
|
||||
def to_protobuf_encoding(self) -> bytes:
|
||||
r"""
|
||||
Encode a private key as protobuf structure.
|
||||
@@ -77,6 +86,7 @@ class Keypair:
|
||||
Convert the `Keypair` into the corresponding `PeerId`.
|
||||
"""
|
||||
|
||||
@typing.final
|
||||
class Multiaddr:
|
||||
r"""
|
||||
Representation of a Multiaddr.
|
||||
@@ -87,17 +97,17 @@ class Multiaddr:
|
||||
Create a new, empty multiaddress.
|
||||
"""
|
||||
@staticmethod
|
||||
def with_capacity(n:builtins.int) -> Multiaddr:
|
||||
def with_capacity(n: builtins.int) -> Multiaddr:
|
||||
r"""
|
||||
Create a new, empty multiaddress with the given capacity.
|
||||
"""
|
||||
@staticmethod
|
||||
def from_bytes(bytes:bytes) -> Multiaddr:
|
||||
def from_bytes(bytes: bytes) -> Multiaddr:
|
||||
r"""
|
||||
Parse a `Multiaddr` value from its byte slice representation.
|
||||
"""
|
||||
@staticmethod
|
||||
def from_string(string:builtins.str) -> Multiaddr:
|
||||
def from_string(string: builtins.str) -> Multiaddr:
|
||||
r"""
|
||||
Parse a `Multiaddr` value from its string representation.
|
||||
"""
|
||||
@@ -118,13 +128,14 @@ class Multiaddr:
|
||||
Convert a Multiaddr to a string.
|
||||
"""
|
||||
|
||||
@typing.final
|
||||
class NetworkingHandle:
|
||||
def __new__(cls, identity:Keypair) -> NetworkingHandle: ...
|
||||
def __new__(cls, identity: Keypair) -> NetworkingHandle: ...
|
||||
async def connection_update_recv(self) -> ConnectionUpdate:
|
||||
r"""
|
||||
Receives the next `ConnectionUpdate` from networking.
|
||||
"""
|
||||
async def connection_update_recv_many(self, limit:builtins.int) -> builtins.list[ConnectionUpdate]:
|
||||
async def connection_update_recv_many(self, limit: builtins.int) -> builtins.list[ConnectionUpdate]:
|
||||
r"""
|
||||
Receives at most `limit` `ConnectionUpdate`s from networking and returns them.
|
||||
|
||||
@@ -132,19 +143,19 @@ class NetworkingHandle:
|
||||
For `limit > 0`, if there are no `ConnectionUpdate`s in the channel's queue this method
|
||||
will sleep until a `ConnectionUpdate`s is sent.
|
||||
"""
|
||||
async def gossipsub_subscribe(self, topic:builtins.str) -> builtins.bool:
|
||||
async def gossipsub_subscribe(self, topic: builtins.str) -> builtins.bool:
|
||||
r"""
|
||||
Subscribe to a `GossipSub` topic.
|
||||
|
||||
Returns `True` if the subscription worked. Returns `False` if we were already subscribed.
|
||||
"""
|
||||
async def gossipsub_unsubscribe(self, topic:builtins.str) -> builtins.bool:
|
||||
async def gossipsub_unsubscribe(self, topic: builtins.str) -> builtins.bool:
|
||||
r"""
|
||||
Unsubscribes from a `GossipSub` topic.
|
||||
|
||||
Returns `True` if we were subscribed to this topic. Returns `False` if we were not subscribed.
|
||||
"""
|
||||
async def gossipsub_publish(self, topic:builtins.str, data:bytes) -> None:
|
||||
async def gossipsub_publish(self, topic: builtins.str, data: bytes) -> None:
|
||||
r"""
|
||||
Publishes a message with multiple topics to the `GossipSub` network.
|
||||
|
||||
@@ -154,7 +165,7 @@ class NetworkingHandle:
|
||||
r"""
|
||||
Receives the next message from the `GossipSub` network.
|
||||
"""
|
||||
async def gossipsub_recv_many(self, limit:builtins.int) -> builtins.list[tuple[builtins.str, bytes]]:
|
||||
async def gossipsub_recv_many(self, limit: builtins.int) -> builtins.list[tuple[builtins.str, bytes]]:
|
||||
r"""
|
||||
Receives at most `limit` messages from the `GossipSub` network and returns them.
|
||||
|
||||
@@ -163,11 +174,13 @@ class NetworkingHandle:
|
||||
will sleep until a message is sent.
|
||||
"""
|
||||
|
||||
@typing.final
|
||||
class NoPeersSubscribedToTopicError(builtins.Exception):
|
||||
def __new__(cls, *args) -> NoPeersSubscribedToTopicError: ...
|
||||
def __new__(cls, *args: typing.Any) -> NoPeersSubscribedToTopicError: ...
|
||||
def __repr__(self) -> builtins.str: ...
|
||||
def __str__(self) -> builtins.str: ...
|
||||
|
||||
@typing.final
|
||||
class PeerId:
|
||||
r"""
|
||||
Identifier of a peer of the network.
|
||||
@@ -183,7 +196,7 @@ class PeerId:
|
||||
This is useful for randomly walking on a DHT, or for testing purposes.
|
||||
"""
|
||||
@staticmethod
|
||||
def from_bytes(bytes:bytes) -> PeerId:
|
||||
def from_bytes(bytes: bytes) -> PeerId:
|
||||
r"""
|
||||
Parses a `PeerId` from bytes.
|
||||
"""
|
||||
@@ -198,7 +211,8 @@ class PeerId:
|
||||
def __repr__(self) -> builtins.str: ...
|
||||
def __str__(self) -> builtins.str: ...
|
||||
|
||||
class ConnectionUpdateType(Enum):
|
||||
@typing.final
|
||||
class ConnectionUpdateType(enum.Enum):
|
||||
r"""
|
||||
Connection or disconnection event discriminant type.
|
||||
"""
|
||||
|
||||
@@ -65,6 +65,40 @@ mod exception {
|
||||
Self::MSG.to_string()
|
||||
}
|
||||
}
|
||||
|
||||
#[gen_stub_pyclass]
|
||||
#[pyclass(frozen, extends=PyException, name="AllQueuesFullError")]
|
||||
pub struct PyAllQueuesFullError {}
|
||||
|
||||
impl PyAllQueuesFullError {
|
||||
const MSG: &'static str = "All libp2p peers are unresponsive, resend the message or reconnect.";
|
||||
|
||||
/// Creates a new [ `PyErr` ] of this type.
|
||||
///
|
||||
/// [`PyErr`] : https://docs.rs/pyo3/latest/pyo3/struct.PyErr.html "PyErr in pyo3"
|
||||
pub(crate) fn new_err() -> PyErr {
|
||||
PyErr::new::<Self, _>(()) // TODO: check if this needs to be replaced???
|
||||
}
|
||||
}
|
||||
|
||||
#[gen_stub_pymethods]
|
||||
#[pymethods]
|
||||
impl PyAllQueuesFullError {
|
||||
#[new]
|
||||
#[pyo3(signature = (*args))]
|
||||
#[allow(unused_variables)]
|
||||
pub(crate) fn new(args: &Bound<'_, PyTuple>) -> Self {
|
||||
Self {}
|
||||
}
|
||||
|
||||
fn __repr__(&self) -> String {
|
||||
format!("PeerId(\"{}\")", Self::MSG)
|
||||
}
|
||||
|
||||
fn __str__(&self) -> String {
|
||||
Self::MSG.to_string()
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Connection or disconnection event discriminant type.
|
||||
@@ -167,7 +201,7 @@ async fn networking_task(
|
||||
let pyresult: PyResult<MessageId> = if let Err(PublishError::NoPeersSubscribedToTopic) = result {
|
||||
Err(exception::PyNoPeersSubscribedToTopicError::new_err())
|
||||
} else if let Err(PublishError::AllQueuesFull(_)) = result {
|
||||
Err(exception::PyNoPeersSubscribedToTopicError::new_err())
|
||||
Err(exception::PyAllQueuesFullError::new_err())
|
||||
} else {
|
||||
result.pyerr()
|
||||
};
|
||||
@@ -526,6 +560,7 @@ impl PyNetworkingHandle {
|
||||
|
||||
pub fn networking_submodule(m: &Bound<'_, PyModule>) -> PyResult<()> {
|
||||
m.add_class::<exception::PyNoPeersSubscribedToTopicError>()?;
|
||||
m.add_class::<exception::PyAllQueuesFullError>()?;
|
||||
|
||||
m.add_class::<PyConnectionUpdateType>()?;
|
||||
m.add_class::<PyConnectionUpdate>()?;
|
||||
|
||||
@@ -13,6 +13,7 @@ pub type Swarm = libp2p::Swarm<Behaviour>;
|
||||
/// this be passed in as a parameter? What about rapidly changing versions in debug builds?
|
||||
/// this is all VERY very hard to figure out and needs to be mulled over as a team.
|
||||
pub const NETWORK_VERSION: &[u8] = b"v0.0.1";
|
||||
pub const OVERRIDE_VERSION_ENV_VAR: &str = "EXO_LIBP2P_NAMESPACE";
|
||||
|
||||
/// Create and configure a swarm which listens to all ports on OS
|
||||
pub fn create_swarm(keypair: identity::Keypair) -> alias::AnyResult<Swarm> {
|
||||
@@ -29,20 +30,27 @@ pub fn create_swarm(keypair: identity::Keypair) -> alias::AnyResult<Swarm> {
|
||||
|
||||
mod transport {
|
||||
use crate::alias;
|
||||
use crate::swarm::NETWORK_VERSION;
|
||||
use crate::swarm::{NETWORK_VERSION, OVERRIDE_VERSION_ENV_VAR};
|
||||
use futures::{AsyncRead, AsyncWrite};
|
||||
use keccak_const::Sha3_256;
|
||||
use libp2p::core::muxing;
|
||||
use libp2p::core::transport::Boxed;
|
||||
use libp2p::pnet::{PnetError, PnetOutput};
|
||||
use libp2p::{PeerId, Transport, identity, noise, pnet, yamux};
|
||||
use std::{sync::LazyLock, env};
|
||||
|
||||
/// Key used for networking's private network; parametrized on the [`NETWORK_VERSION`].
|
||||
/// See [`pnet_upgrade`] for more.
|
||||
const PNET_PRESHARED_KEY: [u8; 32] = Sha3_256::new()
|
||||
.update(b"exo_discovery_network")
|
||||
.update(NETWORK_VERSION)
|
||||
.finalize();
|
||||
static PNET_PRESHARED_KEY: LazyLock<[u8; 32]> = LazyLock::new(|| {
|
||||
let builder = Sha3_256::new().update(b"exo_discovery_network");
|
||||
|
||||
if let Ok(var) = env::var(OVERRIDE_VERSION_ENV_VAR) {
|
||||
let bytes = var.into_bytes();
|
||||
builder.update(&bytes)
|
||||
} else {
|
||||
builder.update(NETWORK_VERSION)
|
||||
}.finalize()
|
||||
});
|
||||
|
||||
/// Make the Swarm run on a private network, as to not clash with public libp2p nodes and
|
||||
/// also different-versioned instances of this same network.
|
||||
@@ -55,7 +63,7 @@ mod transport {
|
||||
TSocket: AsyncRead + AsyncWrite + Send + Unpin + 'static,
|
||||
{
|
||||
use pnet::{PnetConfig, PreSharedKey};
|
||||
PnetConfig::new(PreSharedKey::new(PNET_PRESHARED_KEY))
|
||||
PnetConfig::new(PreSharedKey::new(*PNET_PRESHARED_KEY))
|
||||
.handshake(socket)
|
||||
.await
|
||||
}
|
||||
|
||||
65
scp_repo.sh
65
scp_repo.sh
@@ -1,65 +0,0 @@
|
||||
#!/usr/bin/env bash
|
||||
# bulk_scp.sh — Sync a local repo to many hosts, respecting .gitignore and continuing even if
|
||||
# some hosts fail. Tested on macOS Bash 3.x.
|
||||
#
|
||||
# ------------ User-tunable variables ------------
|
||||
LOCAL_DIR="." # Local directory you want to send
|
||||
REMOTE_DIR="~/exo" # Destination directory on the remote machines
|
||||
HOSTS_FILE="hosts.json" # JSON array of hosts (["user@ip", ...])
|
||||
# ------------ End of user-tunable section -------
|
||||
|
||||
set -uo pipefail # Treat unset vars as error; fail pipelines, but we handle exit codes ourselves
|
||||
|
||||
if [ "$#" -ne 1 ]; then
|
||||
echo "Usage: $0 <password>" >&2
|
||||
exit 1
|
||||
fi
|
||||
PASSWORD="$1"
|
||||
|
||||
# Dependency checks
|
||||
for cmd in sshpass jq rsync git; do
|
||||
if ! command -v "$cmd" >/dev/null 2>&1; then
|
||||
echo "Error: $cmd is required but not installed." >&2
|
||||
exit 1
|
||||
fi
|
||||
done
|
||||
|
||||
# Verify hosts file exists
|
||||
if [ ! -f "$HOSTS_FILE" ]; then
|
||||
echo "Error: Hosts file '$HOSTS_FILE' not found." >&2
|
||||
exit 1
|
||||
fi
|
||||
|
||||
# Build a temporary exclude file containing every Git‑ignored path
|
||||
EXCLUDE_FILE=$(mktemp)
|
||||
trap 'rm -f "$EXCLUDE_FILE"' EXIT
|
||||
|
||||
if git -C "$LOCAL_DIR" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
|
||||
git -C "$LOCAL_DIR" ls-files -z -o -i --exclude-standard \
|
||||
| tr '\0' '\n' > "$EXCLUDE_FILE"
|
||||
else
|
||||
# Fallback: just use top‑level .gitignore if present
|
||||
[ -f "$LOCAL_DIR/.gitignore" ] && cat "$LOCAL_DIR/.gitignore" > "$EXCLUDE_FILE"
|
||||
fi
|
||||
|
||||
# Iterate over hosts — process substitution keeps stdin free for rsync/ssh
|
||||
while IFS= read -r TARGET || [ -n "$TARGET" ]; do
|
||||
[ -z "$TARGET" ] && continue # skip blanks
|
||||
echo "\n—— Syncing $LOCAL_DIR → $TARGET:$REMOTE_DIR ——"
|
||||
|
||||
# # Ensure remote directory exists (ignore failure but report)
|
||||
# if ! sshpass -p "$PASSWORD" ssh -o StrictHostKeyChecking=no "$TARGET" "mkdir -p $REMOTE_DIR" </dev/null; then
|
||||
# echo "✗ Failed to create $REMOTE_DIR on $TARGET" >&2
|
||||
# continue # move on to next host
|
||||
# fi
|
||||
|
||||
# Rsync with checksums; redirect stdin so rsync/ssh can't eat host list
|
||||
if sshpass -p "$PASSWORD" rsync -azc --delete --exclude-from="$EXCLUDE_FILE" \
|
||||
-e "ssh -o StrictHostKeyChecking=no" \
|
||||
"$LOCAL_DIR/" "$TARGET:$REMOTE_DIR/" </dev/null; then
|
||||
echo "✓ Success: $TARGET"
|
||||
else
|
||||
echo "✗ Failed: $TARGET" >&2
|
||||
fi
|
||||
|
||||
done < <(jq -r '.[]' "$HOSTS_FILE")
|
||||
@@ -1,80 +0,0 @@
|
||||
import hashlib
|
||||
import os
|
||||
import sys
|
||||
|
||||
EXCLUDE_DIRS = {".git", "build", "vendor", ".idea", ".vscode", "__pycache__"}
|
||||
|
||||
def norm_rel(path: str, base: str) -> str:
|
||||
"""Forwarder-root–relative path with '/' separators."""
|
||||
abs_path = os.path.abspath(path)
|
||||
abs_base = os.path.abspath(base)
|
||||
rel = os.path.relpath(abs_path, abs_base)
|
||||
return rel.replace(os.sep, "/")
|
||||
|
||||
def collect_files(arg_path: str) -> tuple[str, list[str]]:
|
||||
# Resolve forwarder_root and src_root from the provided path
|
||||
p = os.path.abspath(arg_path)
|
||||
if not os.path.isdir(p):
|
||||
sys.stderr.write(f"error: path must be a directory: {arg_path}\n")
|
||||
sys.exit(2)
|
||||
|
||||
if os.path.basename(p) == "src":
|
||||
forwarder_root = os.path.dirname(p)
|
||||
src_root = p
|
||||
else:
|
||||
forwarder_root = p
|
||||
src_root = os.path.join(forwarder_root, "src")
|
||||
|
||||
files = []
|
||||
|
||||
# 1) Include .go files under src, excluding *_test.go
|
||||
if os.path.isdir(src_root):
|
||||
for root, dirs, filenames in os.walk(src_root):
|
||||
# prune excluded dirs
|
||||
dirs[:] = [d for d in dirs if d not in EXCLUDE_DIRS]
|
||||
for name in filenames:
|
||||
# strict .go, exclude *_test.go
|
||||
if not name.lower().endswith(".go"):
|
||||
continue
|
||||
if name.lower().endswith("_test.go"):
|
||||
continue
|
||||
files.append(os.path.join(root, name))
|
||||
|
||||
# 2) Add go.mod, go.sum, main.go from the forwarder root
|
||||
for name in ("go.mod", "go.sum", "main.go"):
|
||||
pth = os.path.join(forwarder_root, name)
|
||||
if os.path.isfile(pth):
|
||||
# defensive: exclude *_test.go at root too
|
||||
if name.lower().endswith("_test.go"):
|
||||
continue
|
||||
files.append(pth)
|
||||
|
||||
# Deduplicate and sort deterministically by forwarder-root–relative path
|
||||
files: list[str] = sorted(set(files), key=lambda f: norm_rel(f, forwarder_root))
|
||||
return forwarder_root, files
|
||||
|
||||
def hash_files(forwarder_root: str, files: list[str]) -> str:
|
||||
h = hashlib.sha256()
|
||||
for fp in files:
|
||||
rel = norm_rel(fp, forwarder_root)
|
||||
h.update(b"F\x00")
|
||||
h.update(rel.encode("utf-8"))
|
||||
h.update(b"\x00")
|
||||
with open(fp, "rb") as f:
|
||||
for chunk in iter(lambda: f.read(256 * 1024), b""):
|
||||
h.update(chunk)
|
||||
h.update(b"\n")
|
||||
return h.hexdigest()
|
||||
|
||||
def main():
|
||||
if len(sys.argv) > 1:
|
||||
arg = sys.argv[1]
|
||||
else:
|
||||
arg = os.path.join("networking", "forwarder", "src")
|
||||
forwarder_root, files = collect_files(arg)
|
||||
digest = hash_files(forwarder_root, files)
|
||||
# print without trailing newline (easier to capture in shell)
|
||||
sys.stdout.write(digest)
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
@@ -1,17 +0,0 @@
|
||||
[project]
|
||||
name = "exo-scripts"
|
||||
version = "0.1.0"
|
||||
description = "Scripts for the Exo project"
|
||||
readme = "README.md"
|
||||
requires-python = ">=3.13"
|
||||
dependencies = [
|
||||
"huggingface_hub>=0.33.4",
|
||||
"exo"
|
||||
]
|
||||
|
||||
[build-system]
|
||||
requires = ["uv_build>=0.8.9,<0.9.0"]
|
||||
build-backend = "uv_build"
|
||||
|
||||
[tool.uv.sources]
|
||||
exo = { workspace = true }
|
||||

@@ -1,511 +0,0 @@
import asyncio
import json
import argparse
import sys
import time
from dataclasses import is_dataclass, asdict
from logging import getLogger
from typing import List, Optional, Any, Sequence, Tuple

# Your existing imports — unchanged
from exo.shared.types.state import State
from exo.shared.apply import apply
from exo.shared.db.sqlite.event_log_manager import EventLogManager, EventLogConfig
from exo.shared.types.events.components import EventFromEventLog
from exo.shared.types.events import Event

# --- Third-party UI (new) ---
from rich.syntax import Syntax
from rich.text import Text
from rich.panel import Panel
from rich.console import RenderableType

from textual.app import App, ComposeResult
from textual.containers import Horizontal, Vertical
from textual.widgets import Static, ListView, ListItem, Input, Footer, Label
from textual.reactive import reactive
from textual import on
from textual.binding import Binding
from textual.message import Message

logger = getLogger("helper_log")

# Worker-related event types (same set)
WORKER_EVENT_TYPES = {
    'TaskCreated', 'TaskStateUpdated', 'TaskFailed', 'TaskDeleted',
    'ChunkGenerated',
    'InstanceCreated', 'InstanceDeleted', 'InstanceActivated', 'InstanceDeactivated', 'InstanceReplacedAtomically',
    'RunnerStatusUpdated', 'RunnerDeleted'
}


# ---------- Data / DB helpers (mostly your original logic) ----------

event_log_manager: Optional[EventLogManager] = None


async def init_db() -> None:
    global event_log_manager
    event_log_manager = EventLogManager(EventLogConfig())
    await event_log_manager.initialize()


async def get_events_since(since: int) -> Sequence[EventFromEventLog[Event]]:
    # type: ignore[attr-defined, return-value]
    return await event_log_manager.global_events.get_events_since(since)


async def load_all_events() -> List[EventFromEventLog[Event]]:
    events: List[EventFromEventLog[Event]] = []
    since = 0
    while True:
        new_events = await get_events_since(since)
        if not new_events:
            break
        events.extend(new_events)
        since += len(new_events)
    return events


def compute_states(events: List[EventFromEventLog[Event]]) -> List[State]:
    states: List[State] = [State()]
    state = states[0]
    for event in events:
        state = apply(state, event)
        states.append(state)
    return states


def filter_worker_state(state: State) -> dict:
    state_dict = json.loads(state.model_dump_json())
    return {
        'node_status': state_dict.get('node_status', {}),
        'instances': state_dict.get('instances', {}),
        'runners': state_dict.get('runners', {}),
        'tasks': state_dict.get('tasks', {}),
        'last_event_applied_idx': state_dict.get('last_event_applied_idx', 0)
    }


def event_type_name(e: EventFromEventLog[Event]) -> str:
    return type(e.event).__name__


def is_worker_event(e: EventFromEventLog[Event]) -> bool:
    return event_type_name(e) in WORKER_EVENT_TYPES


def safe_json(obj: Any) -> str:
    """Serialize unknown objects to JSON-ish string safely."""
    def to_serializable(x: Any):
        try:
            if is_dataclass(x):
                return asdict(x)
        except Exception:
            pass
        if isinstance(x, (str, int, float, bool)) or x is None:
            return x
        if isinstance(x, dict):
            return {str(k): to_serializable(v) for k, v in x.items()}
        if isinstance(x, (list, tuple, set)):
            return [to_serializable(v) for v in x]
        try:
            json.dumps(x)  # type: ignore
            return x
        except Exception:
            return repr(x)
    try:
        return json.dumps(to_serializable(obj), indent=2, ensure_ascii=False)
    except Exception:
        # Last resort
        return repr(obj)


def summarize_event_line(e: EventFromEventLog[Event], max_len: int = 160) -> Text:
    etype = event_type_name(e)
    attrs = vars(e.event)
    prefix = Text(f"[{e.idx_in_log}] ", style="bold dim")
    t = Text(etype, style="bold cyan")
    t = prefix + t + Text(": ", style="dim")
    first = True
    for k, v in attrs.items():
        if not first:
            t.append(", ", style="dim")
        first = False
        t.append(str(k), style="magenta")
        t.append("=")
        # Coarse coloring by type; bool is checked before int/float because
        # bool is a subclass of int and its branch would otherwise be unreachable
        if isinstance(v, str):
            t.append(repr(v), style="green")
        elif isinstance(v, bool):
            t.append(repr(v), style="cyan")
        elif isinstance(v, (int, float)):
            t.append(repr(v), style="yellow")
        else:
            t.append(repr(v), style="")
    if len(t.plain) > max_len:
        t.truncate(max_len - 1)
        t.append("…", style="dim")
    return t


def event_detail_renderable(e: EventFromEventLog[Event]) -> RenderableType:
    payload = {
        "idx_in_log": e.idx_in_log,
        "event_type": event_type_name(e),
        "attributes": vars(e.event)
    }
    return Syntax(safe_json(payload), "json", word_wrap=True)


# ---------- Non-TUI (stdout) mode, like your current script ----------

async def run_non_tui(worker_mode: bool) -> None:
    await init_db()
    events = await load_all_events()
    states = compute_states(events)
    final_state = states[-1]

    if worker_mode:
        filtered_events = [e for e in events if is_worker_event(e)]
        events = filtered_events
        filtered_state = filter_worker_state(final_state)
        print("Final State (filtered):")
        print(json.dumps(filtered_state, indent=2))
    else:
        print("Final State:")
        print(final_state.model_dump_json(indent=2))

    print("\nEvents:")
    for e in events:
        etype = event_type_name(e)
        attrs = ', '.join(f"{k}={value!r}" for k, value in vars(e.event).items())
        print(f"[{e.idx_in_log}] {etype}: {attrs}")


# ---------- Textual TUI ----------

class StateView(Static):
    """Left pane: shows state JSON, with optional worker filter."""
    def update_state(self, state: State, worker_mode: bool, index_in_log_for_status: Optional[int]) -> None:
        if worker_mode:
            data = filter_worker_state(state)
            json_str = json.dumps(data, indent=2, ensure_ascii=False)
        else:
            json_str = state.model_dump_json(indent=2)
        syntax = Syntax(json_str, "json", word_wrap=True)
        title = f"State after event #{index_in_log_for_status}" if index_in_log_for_status is not None else "Initial State"
        self.update(Panel(syntax, title=title, border_style="cyan"))


class EventListItem(ListItem):
    def __init__(self, e: EventFromEventLog[Event]) -> None:
        super().__init__(Static(summarize_event_line(e)))
        self._event = e

    @property
    def wrapped_event(self) -> EventFromEventLog[Event]:
        return self._event


class EventDetail(Static):
    """Right-bottom: details of the selected event."""
    def show_event(self, e: Optional[EventFromEventLog[Event]]) -> None:
        if e is None:
            self.update(Panel(Text("No event selected.", style="dim"), title="Event Details"))
        else:
            self.update(Panel(event_detail_renderable(e), title=f"Event #{e.idx_in_log} • {event_type_name(e)}", border_style="magenta"))


class StatusBar(Static):
    def set_status(self, realtime: bool, total_events: int, current_idx_in_log: Optional[int]) -> None:
        mode = "Realtime" if realtime else "Timetravel"
        parts = [
            f"[{mode}]",
            f"Events: {total_events}",
        ]
        if current_idx_in_log is not None:
            parts.append(f"Current: #{current_idx_in_log}")
        parts.append("Keys: ↑/↓ Select • PgUp/PgDn Scroll • Ctrl+↑/↓ ±5 • [/] State PgUp/PgDn • g Goto • r Realtime • q Quit")
        self.update(Text(" ".join(parts), style="dim"))


class GotoPrompt(Static):
    """Simple inline goto prompt (appears above Footer)."""
    class Submitted(Message):
        def __init__(self, value: Optional[int]) -> None:
            super().__init__()
            self.value = value

    def compose(self) -> ComposeResult:
        yield Label("Go to event id (idx_in_log):", id="goto-label")
        yield Input(placeholder="e.g., 123", id="goto-input")

    def on_mount(self) -> None:
        self.query_one(Input).focus()

    @on(Input.Submitted)
    def _submitted(self, event: Input.Submitted) -> None:
        text = (event.value or "").strip()
        try:
            value = int(text)
        except ValueError:
            value = None
        self.post_message(self.Submitted(value))


class EventLogApp(App):
    CSS = """
    Screen {
        layout: vertical;
    }
    #main {
        height: 1fr;
    }
    #left {
        width: 60%;
    }
    #right {
        width: 40%;
    }
    #events {
        height: 3fr;
    }
    #detail {
        height: 2fr;
        border: tall;
    }
    #status {
        height: 1;
        padding: 0 1;
    }
    #goto {
        dock: bottom;
        height: 3;
        padding: 1 2;
        background: $panel;
        border: round $accent;
    }
    """

    BINDINGS = [
        Binding("q", "quit", "Quit"),
        Binding("r", "toggle_realtime", "Realtime"),
        Binding("[", "state_page_up", "State PgUp"),
        Binding("]", "state_page_down", "State PgDn"),
        Binding("g", "prompt_goto", "Goto"),
        Binding("ctrl+up", "jump_up", "Jump Up"),
        Binding("ctrl+down", "jump_down", "Jump Down"),
    ]

    # Reactive state
    realtime: reactive[bool] = reactive(False)
    worker_mode: bool

    # Data
    wrapped_events: List[EventFromEventLog[Event]]
    states: List[State]
    filtered_indices: Optional[List[int]]  # maps filtered idx -> original idx
    update_interval: float = 1.0
    _poll_timer = None

    def __init__(self, worker_mode: bool) -> None:
        super().__init__()
        self.worker_mode = worker_mode
        self.wrapped_events = []
        self.states = [State()]
        self.filtered_indices = None

    async def on_mount(self) -> None:
        await init_db()
        await self._initial_load()
        # periodic polling for new events
        self._poll_timer = self.set_interval(self.update_interval, self._tick_poll)
        # Put list selection at end (last event) by default
        self._select_last()

    async def _initial_load(self) -> None:
        self.wrapped_events = await load_all_events()
        self.states = compute_states(self.wrapped_events)

        # Build filtered view if needed
        if self.worker_mode:
            self.filtered_indices = [i for i, e in enumerate(self.wrapped_events) if is_worker_event(e)]
        else:
            self.filtered_indices = None

        # Populate the ListView
        lv = self.query_one("#events", ListView)
        lv.clear()
        events_to_show = self._view_events()
        for e in events_to_show:
            lv.append(EventListItem(e))

        # Update left state & details
        self._refresh_views()

    def compose(self) -> ComposeResult:
        # Layout: [Header optional] -> main Horizontal -> Status bar + Footer
        with Horizontal(id="main"):
            with Vertical(id="left"):
                yield StateView(id="state")
            with Vertical(id="right"):
                yield ListView(id="events")
                yield EventDetail(id="detail")
        yield StatusBar(id="status")
        yield Footer()

    def _current_original_index(self) -> int:
        lv = self.query_one("#events", ListView)
        idx = lv.index
        if idx is None or idx < 0:
            return -1
        if self.filtered_indices is not None:
            if idx >= len(self.filtered_indices):
                return -1
            return self.filtered_indices[idx]
        return idx

    def _view_events(self) -> List[EventFromEventLog[Event]]:
        if self.filtered_indices is not None:
            return [self.wrapped_events[i] for i in self.filtered_indices]
        return self.wrapped_events

    def _select_last(self) -> None:
        lv = self.query_one("#events", ListView)
        n = len(lv.children)
        if n:
            lv.index = n - 1

    def _refresh_views(self) -> None:
        # Update State pane and Detail pane and Status bar
        original_idx = self._current_original_index()
        state_idx = (original_idx + 1) if original_idx >= 0 else 0
        state = self.states[state_idx]
        state_view = self.query_one("#state", StateView)
        idx_in_log = None
        if original_idx >= 0:
            idx_in_log = self.wrapped_events[original_idx].idx_in_log
        state_view.update_state(state, self.worker_mode, idx_in_log)

        # Detail pane
        detail = self.query_one("#detail", EventDetail)
        current_event = self.wrapped_events[original_idx] if original_idx >= 0 else None
        detail.show_event(current_event)

        # Status bar
        status = self.query_one("#status", StatusBar)
        total_events = len(self.wrapped_events)
        status.set_status(self.realtime, total_events, current_event.idx_in_log if current_event else None)

    async def _poll_once(self) -> bool:
        """Fetch and append new events; return True if updated."""
        last_since = len(self.wrapped_events)
        new_wrapped = await get_events_since(last_since)
        if not new_wrapped:
            return False

        # Extend states incrementally (avoid recomputing all)
        for nw in new_wrapped:
            state = self.states[-1]
            self.states.append(apply(state, nw))

        start_len = len(self.wrapped_events)
        self.wrapped_events.extend(new_wrapped)

        # Update filtered mapping and UI list
        lv = self.query_one("#events", ListView)
        if self.worker_mode:
            if self.filtered_indices is None:
                self.filtered_indices = []
            for k in range(start_len, len(self.wrapped_events)):
                if is_worker_event(self.wrapped_events[k]):
                    self.filtered_indices.append(k)
                    lv.append(EventListItem(self.wrapped_events[k]))
        else:
            for k in range(start_len, len(self.wrapped_events)):
                lv.append(EventListItem(self.wrapped_events[k]))

        # Auto-follow the tail in realtime mode
        if self.realtime:
            self._select_last()

        # Refresh panes
        self._refresh_views()
        return True

    def _tick_poll(self) -> None:
        # called by timer; schedule the async poll
        asyncio.create_task(self._poll_once())

    # ------ Actions / key handlers ------
    def action_quit(self) -> None:
        self.exit()

    def action_toggle_realtime(self) -> None:
        self.realtime = not self.realtime
        if self.realtime:
            self._select_last()
        self._refresh_views()

    def action_state_page_up(self) -> None:
        state_view = self.query_one("#state", StateView)
        state_view.scroll_page_up()

    def action_state_page_down(self) -> None:
        state_view = self.query_one("#state", StateView)
        state_view.scroll_page_down()

    def action_jump_up(self) -> None:
        lv = self.query_one("#events", ListView)
        if lv.children:
            lv.index = max(0, (lv.index or 0) - 5)
            self._refresh_views()

    def action_jump_down(self) -> None:
        lv = self.query_one("#events", ListView)
        if lv.children:
            lv.index = min(len(lv.children) - 1, (lv.index or 0) + 5)
            self._refresh_views()

    def action_prompt_goto(self) -> None:
        # mount a small prompt near bottom
        if self.query("#goto"):
            return
        prompt = GotoPrompt(id="goto")
        self.mount(prompt)

    @on(GotoPrompt.Submitted)
    def _on_goto_submitted(self, msg: GotoPrompt.Submitted) -> None:
        # Remove prompt
        for node in self.query("#goto"):
            node.remove()

        if msg.value is None:
            return

        target = msg.value
        # find in current view's idx_in_log
        events_to_show = self._view_events()
        lv = self.query_one("#events", ListView)
        for i, e in enumerate(events_to_show):
            if e.idx_in_log == target:
                lv.index = i
                self._refresh_views()
                break

    @on(ListView.Highlighted, "#events")
    @on(ListView.Selected, "#events")
    def _on_event_selected(self, *_: Any) -> None:
        # Update panes when selection changes
        self._refresh_views()


# ---------- Entrypoint ----------

def main() -> None:
    parser = argparse.ArgumentParser(description='Read and display events from the event log (Textual UI)')
    parser.add_argument('--worker', action='store_true',
                        help='Only show worker-related events (task, streaming, instance, runner status)')
    parser.add_argument('--no-ui', action='store_true',
                        help='Print to stdout (non-interactive), like the original non-TUI mode')
    args = parser.parse_args()

    # Non-interactive fallback if no TTY or user requests it
    if args.no_ui or not sys.stdout.isatty():
        asyncio.run(run_non_tui(worker_mode=args.worker))
        return

    # TUI mode
    app = EventLogApp(worker_mode=args.worker)
    app.run()

if __name__ == "__main__":
    main()
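

# Added design note: compute_states keeps every prefix state, so states[i] is
# the state after applying the first i events. Selecting the event at original
# index k therefore displays states[k + 1], and "no selection" (index -1) maps
# to the initial State() at states[0]; this is what makes timetravel mode cheap.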

@@ -1,12 +0,0 @@
import asyncio  # explicit import; the star import below may not re-export it

from exo.worker.download.download_utils import *


async def main():
    meta = await file_meta(
        'mlx-community/DeepSeek-R1-4bit',
        revision='main',
        path='config.json',
        redirected_location=None,
    )
    print(meta)

asyncio.run(main())

@@ -1,284 +0,0 @@
#!/usr/bin/env python3

"""
watch-pull-restart.py — Unix-only

Runs a command, periodically checks git upstream, pulls if upstream is ahead,
and gracefully restarts the command. Watcher logs go to STDERR; your app's
output goes straight to the console (STDOUT/STDERR).

Assumptions:
- current branch tracks an upstream (i.e., @{u} exists)
- pulls must be fast-forward (remote-ahead workflow)

Arguments:
- cmd: Command to run/manage (e.g. './run.sh' or 'python -m app').
- restart-cmd: Optional hook to run after a successful pull (e.g., systemctl restart).
- sleep-secs: Poll interval while up-to-date.
- grace-secs: Seconds to wait after SIGTERM before SIGKILL.
- debounce-secs: Coalesce multiple pulls before restart.

Usage:
  ./watch-pull-restart.py --cmd "./run.sh" --sleep-secs 1
  ./watch-pull-restart.py --cmd "python -m app" --restart-cmd "systemctl --user restart myapp"
  ./watch-pull-restart.py --restart-cmd "systemctl --user restart myapp"  # no managed child; only trigger hook
"""
import argparse
import os
import signal
import subprocess
import sys
import time
from types import FrameType
from typing import Optional


# ---------- logging helpers (to STDERR) ----------
def log(msg: str):
    sys.stderr.write(msg.rstrip() + "\n")
    sys.stderr.flush()


def sep(title: str = ""):
    """Big visual separator for state transitions (to STDERR)."""
    sys.stderr.write("\n\n")
    if title:
        sys.stderr.write(f"===== [watch] {title} =====\n")
    else:
        sys.stderr.write("===== [watch] =====\n")
    sys.stderr.flush()


def run_capture(cmd: str, check: bool = True) -> subprocess.CompletedProcess[str]:
    """Run and capture output; for git plumbing."""
    return subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        check=check,
    )


# ---------- shell helpers ----------
def is_up_to_date() -> bool:
    subprocess.run("git fetch --quiet",
                   shell=True)  # Quiet fetch; ignore network errors (we'll just try again next tick)
    try:
        current = run_capture("git rev-parse HEAD", check=True).stdout.strip()
        upstream = run_capture("git rev-parse @{u}", check=True).stdout.strip()
        return current == upstream
    except subprocess.CalledProcessError:
        return True  # No upstream or other git error; treat as up-to-date to avoid thrash


def pull_ff_only() -> bool:
    """Returns True if pull applied changes, False if already up-to-date."""
    try:
        cp = run_capture("git pull --ff-only --no-rebase", check=True)
        return "Already up to date" not in cp.stdout and cp.returncode == 0  # Git prints "Already up to date." on no-op; cheap heuristic
    except subprocess.CalledProcessError as e:
        log("[watch] git pull failed:")
        if e.stdout:  # pyright: ignore[reportAny]
            log(e.stdout)  # pyright: ignore[reportAny]
        if e.stderr:  # pyright: ignore[reportAny]
            log(e.stderr)  # pyright: ignore[reportAny]
        return False


# ---------- managed processes ----------
class ManagedProc:
    def __init__(self, cmd: Optional[str], grace_secs: float):
        self.cmd = cmd
        self.grace = grace_secs
        self.child: Optional[subprocess.Popen[bytes]] = None

    def start(self):
        if not self.cmd:
            return
        if self.child and self.child.poll() is None:
            return
        sep("starting main cmd")
        log(f"[watch] starting: {self.cmd}")
        # New process group so we can signal the entire tree (shell + children)
        self.child = subprocess.Popen(
            self.cmd,
            shell=True,  # allow shell features in --cmd
            stdout=None,  # inherit parent's stdout (your app prints normally)
            stderr=None,  # inherit parent's stderr
            stdin=None,
            preexec_fn=os.setsid,  # create new session (PGID == child PID)
        )

    def stop_gracefully(self):
        if not self.child:
            return
        if self.child.poll() is not None:
            self.child = None
            return

        sep("stopping main cmd (SIGTERM)")
        try:
            os.killpg(self.child.pid, signal.SIGTERM)
        except ProcessLookupError:
            pass

        deadline = time.time() + self.grace
        while time.time() < deadline:
            if self.child.poll() is not None:
                self.child = None
                return
            time.sleep(0.1)

        sep("main cmd unresponsive; SIGKILL")
        try:
            os.killpg(self.child.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        self.child = None

    def forward_signal(self, sig: int):
        if not self.child or self.child.poll() is not None:
            return
        try:
            os.killpg(self.child.pid, sig)
        except ProcessLookupError:
            pass


class OneShotHook:
    """
    One-shot hook command (e.g., systemctl restart).
    Runs to completion with inherited stdio so its output is visible.
    """

    def __init__(self, cmd: Optional[str], grace_secs: float):
        self.cmd = cmd
        self.grace = grace_secs
        self.child: Optional[subprocess.Popen[bytes]] = None

    def run(self) -> int:
        if not self.cmd:
            return 0
        sep("running restart hook")
        log(f"[watch] hook: {self.cmd}")
        self.child = subprocess.Popen(
            self.cmd,
            shell=True,
            stdout=None,  # inherit stdio
            stderr=None,
            stdin=None,
            preexec_fn=os.setsid,
        )
        # Wait with grace/kill if needed (rare for hooks, but symmetric)
        deadline = time.time() + self.grace
        while True:
            rc = self.child.poll()
            if rc is not None:
                self.child = None
                return rc
            if time.time() > deadline:
                sep("hook exceeded grace; SIGKILL")
                try:
                    os.killpg(self.child.pid, signal.SIGKILL)
                except ProcessLookupError:
                    pass
                self.child = None
                return 137  # killed
            time.sleep(0.1)

    def forward_signal(self, sig: int):
        if not self.child or self.child.poll() is not None:
            return
        try:
            os.killpg(self.child.pid, sig)
        except ProcessLookupError:
            pass


# ---------- main loop ----------
def main():
    # CMD commands
    ap = argparse.ArgumentParser(description="Auto-pull & restart on upstream changes (Unix).")
    ap.add_argument("--cmd", help="Command to run/manage (e.g. './run.sh' or 'python -m app').")
    ap.add_argument("--restart-cmd", help="Optional hook to run after a successful pull (e.g., systemctl restart).")
    ap.add_argument("--sleep-secs", type=float, default=0.5, help="Poll interval while up-to-date.")
    ap.add_argument("--grace-secs", type=float, default=5.0, help="Seconds to wait after SIGTERM before SIGKILL.")
    ap.add_argument("--debounce-secs", type=float, default=0.5, help="Coalesce multiple pulls before restart.")
    args = ap.parse_args()

    # get CMD command values; each assert checks the value it names
    cmd = args.cmd  # pyright: ignore[reportAny]
    assert cmd is None or isinstance(cmd, str)
    restart_cmd = args.restart_cmd  # pyright: ignore[reportAny]
    assert restart_cmd is None or isinstance(restart_cmd, str)
    sleep_secs = args.sleep_secs  # pyright: ignore[reportAny]
    assert sleep_secs is not None and isinstance(sleep_secs, float)
    grace_secs = args.grace_secs  # pyright: ignore[reportAny]
    assert grace_secs is not None and isinstance(grace_secs, float)
    debounce_secs = args.debounce_secs  # pyright: ignore[reportAny]
    assert debounce_secs is not None and isinstance(debounce_secs, float)

    # start managed proc
    proc = ManagedProc(cmd, grace_secs)
    hook = OneShotHook(restart_cmd, grace_secs)

    # signal handling for graceful exit
    exiting = {"flag": False}

    def _handle(sig_num: int, _frame: Optional[FrameType]):
        sep(f"received signal {sig_num}; exiting")
        exiting["flag"] = True
        proc.forward_signal(sig_num)
        hook.forward_signal(sig_num)

    signal.signal(signal.SIGINT, _handle)
    signal.signal(signal.SIGTERM, _handle)

    # Initial start (if managing a process)
    proc.start()

    pending_restart = False
    last_change = 0.0
    while not exiting["flag"]:
        try:
            if not is_up_to_date():
                sep("upstream ahead; pulling")
                changed = pull_ff_only()
                if changed:
                    last_change = time.time()
                    pending_restart = True

            # handle debounce window
            if pending_restart and (time.time() - last_change) >= debounce_secs:
                # Optional hook first
                if restart_cmd:
                    rc = hook.run()
                    if rc != 0:
                        sep(f"hook exited with {rc}")
                # Then bounce managed process
                if cmd:
                    proc.stop_gracefully()
                    proc.start()
                pending_restart = False
                sep("restart cycle complete")

            # keep the child alive if it crashed without a pull
            if cmd and (proc.child is None or proc.child.poll() is not None):
                sep("main cmd exited; restarting")
                proc.start()

            time.sleep(sleep_secs)
        except Exception as e:
            sep("loop error")
            log(f"[watch] {e}")
            time.sleep(2.0)

    # graceful shutdown on exit
    proc.stop_gracefully()
    sep("bye")


if __name__ == "__main__":
    main()
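

# Added design note: the debounce window coalesces bursts of pushes. Every
# successful pull refreshes last_change, and the hook/restart only fires once
# no further pull has landed for debounce_secs, so N rapid pushes cost one
# restart rather than N.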

@@ -1,22 +0,0 @@
You have 2 scripts now added:
1. scp_repo.sh, which you call like "./scp_repo.sh {password}",
   where password is the password for the studios. Call this from the
   root of the repo and it will send any differences in your local repo
   to the machines; this should only be needed when things have changed.
2. run_remote.sh, also called like "./run_remote.sh {password}",
   which kills all running exo processes and starts new ones with fresh dbs.

Both of these use the file hosts.json, which is a JSON list of strings
of the form user@ip, where you need to put the studios with their username
and THUNDERBOLT ips (get these manually from the machines after all of
them and your laptop are hooked up via tb5 and have ips on the thunderbolt
bridge in Settings > Network). The order here doesn't matter EXCEPT for the
first entry, which will be the master: the script runs ./run.sh -c on the
first entry in that list and ./run.sh -rc on all the others.

Separately, there is now a nodes.json, which is also a list of strings, but this
time of the node ids of the machines (the uuid that gets generated in python
and printed when the process starts etc). Here you DO need them in the exact
order the machines are connected in via thunderbolt; this is used to prefer
spawning models across machines 1-2 and then 3-4, in that order, if doable.
A sketch of both files is shown below.
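
As a rough sketch (usernames, IPs, and node ids below are made-up
placeholders, not values from this repo):

hosts.json (first entry is the master):
    ["studio1@169.254.10.1", "studio2@169.254.10.2", "studio3@169.254.10.3"]

nodes.json (in exact thunderbolt connection order):
    ["0b9c2f1e-example-uuid-node-1", "7d41a6c3-example-uuid-node-2"]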

@@ -1,11 +1,32 @@
from abc import ABC, abstractmethod
from functools import partial
from typing import TYPE_CHECKING, Protocol, cast, override

from mlx_lm.models.deepseek_v3 import DeepseekV3MLP
from mlx_lm.models.deepseek_v3 import Model as DeepseekV3Model
from mlx_lm.models.llama import Model as LlamaModel
from mlx_lm.models.qwen3_moe import Model as Qwen3MoeModel
from mlx_lm.models.qwen3_moe import Qwen3MoeSparseMoeBlock

import mlx.core as mx
import mlx.nn as nn  # pyright: ignore[reportMissingTypeStubs]
from exo.shared.types.worker.shards import PipelineShardMetadata
from exo.shared.types.worker.shards import (
    PipelineShardMetadata,
    ShardMetadata,
    TensorShardMetadata,
)
from mlx.nn.layers.distributed import (  # type: ignore
    shard_inplace,  # type: ignore
    shard_linear,  # type: ignore
    sum_gradients,  # type: ignore
)


class IdentityLayer(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.use_sliding = False

    @override
    def __call__(self, x: mx.array, *args: object, **kwargs: object) -> mx.array:
        return x

@@ -70,61 +91,270 @@ class PipelineLastLayer(CustomMlxLayer):
        return output


def inner_model(model: nn.Module) -> nn.Module:
    inner = getattr(model, "model", None)
    if isinstance(inner, nn.Module):
        return inner

    inner = getattr(model, "transformer", None)
    if isinstance(inner, nn.Module):
        return inner

    raise ValueError("Model must either have a 'model' or 'transformer' attribute")
class ParallelisationShardStrategy(Protocol):
    def auto_parallel(
        self, model: nn.Module, model_shard_meta: ShardMetadata
    ) -> nn.Module: ...


# def auto_parallel(model: nn.Module, rank: int, size: int, start_layer: int, end_layer: int) -> nn.Module:
def auto_parallel(
    model: nn.Module, model_shard_meta: PipelineShardMetadata
) -> nn.Module:
    """
    Automatically parallelize a model across multiple devices.
class PipelineParallelisationStrategy(ParallelisationShardStrategy):
    def auto_parallel(
        self, model: nn.Module, model_shard_meta: ShardMetadata
    ) -> nn.Module:
        """
        Automatically parallelize a model across multiple devices.
        Args:
            model: The model to parallelize (must have a 'layers' or 'h' property)
            model_shard_meta: The metadata for the model shard
        Returns:
            The parallelized model
        """
        assert isinstance(model_shard_meta, PipelineShardMetadata)

    Args:
        model: The model to parallelize (must have a 'layers' or 'h' property)
        model_shard_meta: The metadata for the model shard
        inner_model_instance: nn.Module = PipelineParallelisationStrategy._inner_model(
            model
        )

    Returns:
        The parallelized model
    """
    inner_model_instance: nn.Module = inner_model(model)
        # Handle both model.layers and model.h cases
        layers: list[_LayerCallable]
        if hasattr(inner_model_instance, "layers"):
            layers = cast(list[_LayerCallable], inner_model_instance.layers)
        elif hasattr(inner_model_instance, "h"):
            layers = cast(list[_LayerCallable], inner_model_instance.h)
        else:
            raise ValueError("Model must have either a 'layers' or 'h' attribute")

    # Handle both model.layers and model.h cases
    layers: list[_LayerCallable]
    if hasattr(inner_model_instance, "layers"):
        layers = cast(list[_LayerCallable], inner_model_instance.layers)
    else:
        layers = cast(list[_LayerCallable], inner_model_instance.h)
        layers[: model_shard_meta.start_layer] = [
            IdentityLayer() for _ in range(model_shard_meta.start_layer)
        ]
        layers[model_shard_meta.end_layer :] = [
            IdentityLayer() for _ in range(len(layers) - model_shard_meta.end_layer)
        ]
        layers[model_shard_meta.start_layer] = PipelineFirstLayer(
            layers[model_shard_meta.start_layer],
            model_shard_meta.device_rank,
            model_shard_meta.world_size,
        )
        layers[model_shard_meta.end_layer - 1] = PipelineLastLayer(
            layers[model_shard_meta.end_layer - 1],
            model_shard_meta.device_rank,
            model_shard_meta.world_size,
        )

    layers[: model_shard_meta.start_layer] = [
        IdentityLayer() for _ in range(model_shard_meta.start_layer)
    ]
    layers[model_shard_meta.end_layer :] = [
        IdentityLayer() for _ in range(len(layers) - model_shard_meta.end_layer)
    ]
    layers[model_shard_meta.start_layer] = PipelineFirstLayer(
        layers[model_shard_meta.start_layer],
        model_shard_meta.device_rank,
        model_shard_meta.world_size,
    )
    layers[model_shard_meta.end_layer - 1] = PipelineLastLayer(
        layers[model_shard_meta.end_layer - 1],
        model_shard_meta.device_rank,
        model_shard_meta.world_size,
    )
        # At this point `layers` *must* be a concrete list.
        assert isinstance(layers, list), (
            "Expected a list of layers after auto-parallel initialisation"
        )

    # At this point `layers` *must* be a concrete list.
    assert isinstance(layers, list), (
        "Expected a list of layers after auto-parallel initialisation"
    )
        return model

    return model
    @staticmethod
    def _inner_model(model: nn.Module) -> nn.Module:
        inner = getattr(model, "model", None)
        if isinstance(inner, nn.Module):
            return inner

        inner = getattr(model, "transformer", None)
        if isinstance(inner, nn.Module):
            return inner

        raise ValueError("Model must either have a 'model' or 'transformer' attribute")


class TensorParallelisationStrategy(ParallelisationShardStrategy):
    def __init__(self, group: mx.distributed.Group):  # type: ignore
        self.group = group  # type: ignore
        self.N = self.group.size  # type: ignore

    def auto_parallel(
        self, model: nn.Module, model_shard_meta: ShardMetadata
    ) -> nn.Module:
        assert isinstance(model_shard_meta, TensorShardMetadata)

        all_to_sharded_linear = partial(
            shard_linear,
            sharding="all-to-sharded",
            group=self.group,  # pyright: ignore
        )
        sharded_to_all_linear = partial(
            shard_linear,
            sharding="sharded-to-all",
            group=self.group,  # type: ignore
        )

        all_to_sharded_linear_in_place = partial(
            shard_inplace,
            sharding="all-to-sharded",
            group=self.group,  # pyright: ignore
        )
        sharded_to_all_linear_in_place = partial(
            shard_inplace,
            sharding="sharded-to-all",
            group=self.group,  # type: ignore
        )

        if isinstance(model, LlamaModel):
            tensor_parallel_sharding_strategy = LlamaShardingStrategy(
                self.group,  # type: ignore
                all_to_sharded_linear,
                sharded_to_all_linear,
                all_to_sharded_linear_in_place,
                sharded_to_all_linear_in_place,
            )
        elif isinstance(model, DeepseekV3Model):
            tensor_parallel_sharding_strategy = DeepSeekShardingStrategy(
                self.group,  # type: ignore
                all_to_sharded_linear,
                sharded_to_all_linear,
                all_to_sharded_linear_in_place,
                sharded_to_all_linear_in_place,
            )
        elif isinstance(model, Qwen3MoeModel):
            tensor_parallel_sharding_strategy = QwenShardingStrategy(
                self.group,  # type: ignore
                all_to_sharded_linear,
                sharded_to_all_linear,
                all_to_sharded_linear_in_place,
                sharded_to_all_linear_in_place,
            )
        else:
            raise ValueError(f"Unsupported model type: {type(model)}")

        return tensor_parallel_sharding_strategy.shard_model(model)


class TensorParallelShardingStrategy(ABC):
    def __init__(
        self,
        group,  # type: ignore
        all_to_sharded_linear,  # type: ignore
        sharded_to_all_linear,  # type: ignore
        all_to_sharded_linear_in_place,  # type: ignore
        sharded_to_all_linear_in_place,  # type: ignore
    ):
        self.all_to_sharded_linear = all_to_sharded_linear
        self.sharded_to_all_linear = sharded_to_all_linear
        self.all_to_sharded_linear_in_place = all_to_sharded_linear_in_place
        self.sharded_to_all_linear_in_place = sharded_to_all_linear_in_place
        self.group = group or mx.distributed.init()  # type: ignore
        self.N = cast(int, group.size())  # type: ignore

    @abstractmethod
    def shard_model(self, model: nn.Module) -> nn.Module: ...


class LlamaShardingStrategy(TensorParallelShardingStrategy):
    def shard_model(self, model: nn.Module) -> nn.Module:
        model = cast(LlamaModel, model)
        for layer in model.layers:
            layer.self_attn.q_proj = self.all_to_sharded_linear(layer.self_attn.q_proj)
            layer.self_attn.k_proj = self.all_to_sharded_linear(layer.self_attn.k_proj)
            layer.self_attn.v_proj = self.all_to_sharded_linear(layer.self_attn.v_proj)
            layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
            layer.self_attn.n_heads //= self.N
            if layer.self_attn.n_kv_heads is not None:
                layer.self_attn.n_kv_heads //= self.N

            layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
            layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
            layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)

        return model


class DeepSeekShardingStrategy(TensorParallelShardingStrategy):
    def shard_model(self, model: nn.Module) -> nn.Module:
        model = cast(DeepseekV3Model, model)
        for layer in model.layers:
            # Shard the self attention
            if layer.self_attn.q_lora_rank is None:  # pyright: ignore[reportUnnecessaryComparison]
                layer.self_attn.q_proj = self.all_to_sharded_linear(
                    layer.self_attn.q_proj
                )
            else:
                layer.self_attn.q_b_proj = self.all_to_sharded_linear(
                    layer.self_attn.q_b_proj
                )
            layer.self_attn.kv_b_proj = self.all_to_sharded_linear(
                layer.self_attn.kv_b_proj
            )
            layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
            layer.self_attn.num_heads //= self.N

            # Shard the MLP
            if isinstance(layer.mlp, DeepseekV3MLP):
                layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
                layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
                layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)

            # Shard the MoE. Shard in place since the MoE should be responsible
            # for aggregating the results.
            else:
                self.all_to_sharded_linear_in_place(layer.mlp.shared_experts.gate_proj)
                self.sharded_to_all_linear_in_place(layer.mlp.shared_experts.down_proj)
                self.all_to_sharded_linear_in_place(layer.mlp.shared_experts.up_proj)
                self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.gate_proj)
                self.sharded_to_all_linear_in_place(layer.mlp.switch_mlp.down_proj)
                self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.up_proj)
                layer.mlp = ShardedDeepseekV3MoE(layer.mlp)  # type: ignore
                layer.mlp.sharding_group = self.group  # type: ignore

        return model


class ShardedDeepseekV3MoE(CustomMlxLayer):
    def __init__(self, layer: _LayerCallable):
        super().__init__(layer)
        self.sharding_group: mx.distributed.Group | None = None  # type: ignore

    def __call__(self, x: mx.array) -> mx.array:
        if self.sharding_group is not None:  # type: ignore
            x = sum_gradients(self.sharding_group)(x)  # type: ignore
        y = self.original_layer.__call__(x)  # type: ignore
        if self.sharding_group is not None:  # type: ignore
            y = mx.distributed.all_sum(y, group=self.sharding_group)  # type: ignore
        return y


class QwenShardingStrategy(TensorParallelShardingStrategy):
    def shard_model(self, model: nn.Module) -> nn.Module:
        model = cast(Qwen3MoeModel, model)
        for layer in model.layers:
            # Shard the self attention
            layer.self_attn.q_proj = self.all_to_sharded_linear(layer.self_attn.q_proj)
            layer.self_attn.k_proj = self.all_to_sharded_linear(layer.self_attn.k_proj)
            layer.self_attn.v_proj = self.all_to_sharded_linear(layer.self_attn.v_proj)
            layer.self_attn.o_proj = self.sharded_to_all_linear(layer.self_attn.o_proj)
            layer.self_attn.n_heads //= self.N
            layer.self_attn.n_kv_heads //= self.N

            # Shard the MoE. Shard in place since the MoE should be responsible
            # for aggregating the results.
            if isinstance(layer.mlp, Qwen3MoeSparseMoeBlock):
                self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.gate_proj)
                self.sharded_to_all_linear_in_place(layer.mlp.switch_mlp.down_proj)
                self.all_to_sharded_linear_in_place(layer.mlp.switch_mlp.up_proj)
                layer.mlp = ShardedQwenMoE(layer.mlp)  # type: ignore
                layer.mlp.sharding_group = self.group  # type:ignore

            # Shard the MLP
            else:
                layer.mlp.gate_proj = self.all_to_sharded_linear(layer.mlp.gate_proj)
                layer.mlp.down_proj = self.sharded_to_all_linear(layer.mlp.down_proj)
                layer.mlp.up_proj = self.all_to_sharded_linear(layer.mlp.up_proj)

        return model


class ShardedQwenMoE(CustomMlxLayer):
    def __init__(self, layer: _LayerCallable):
        super().__init__(layer)
        self.sharding_group: mx.distributed.Group | None = None  # type: ignore

    def __call__(self, x: mx.array) -> mx.array:
        if self.sharding_group is not None:  # type: ignore
            x = sum_gradients(self.sharding_group)(x)  # type: ignore
        y = self.original_layer.__call__(x)  # type: ignore
        if self.sharding_group is not None:  # type: ignore
            y = mx.distributed.all_sum(y, group=self.sharding_group)  # type: ignore
        return y
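

# Added note on the sharded-MoE wrappers above: each rank holds a shard of the
# expert weights, and mx.distributed.all_sum over the sharding group
# reconstitutes the full MoE output on every rank. sum_gradients is, as I
# understand the MLX distributed layers, an identity in the forward pass that
# all-sums gradients in the backward pass.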

@@ -1,24 +1,31 @@
import asyncio
import concurrent.futures
import contextlib
import os
import resource
from asyncio import AbstractEventLoop
from typing import Any, Callable, Optional, cast
from typing import Any, Callable, Optional

from loguru import logger
from mlx_lm.models.cache import KVCache
from mlx_lm.sample_utils import make_sampler
from mlx_lm.tokenizer_utils import TokenizerWrapper as _TokenizerWrapper
from mlx_lm.tokenizer_utils import load_tokenizer  # type: ignore

try:
    from mlx_lm.tokenizer_utils import load_tokenizer  # type: ignore
except ImportError:
    from mlx_lm.tokenizer_utils import load as load_tokenizer  # type: ignore
from mlx_lm.utils import load_model  # type: ignore
from pydantic import RootModel

import mlx.core as mx
import mlx.nn as nn  # pyright: ignore[reportMissingTypeStubs]
from exo.engines.mlx import Model, TokenizerWrapper
from exo.engines.mlx.auto_parallel import IdentityLayer, auto_parallel
from exo.engines.mlx.auto_parallel import (
    IdentityLayer,
    PipelineParallelisationStrategy,
    TensorParallelisationStrategy,
)
from exo.shared.types.common import Host
from exo.shared.types.memory import Memory
from exo.shared.types.tasks import ChatCompletionTaskParams
from exo.shared.types.worker.communication import runner_print
from exo.shared.types.worker.shards import ShardMetadata

@@ -31,15 +38,17 @@ mlx_rank: None | int = None
mlx_world_size: None | int = None


def mx_barrier():
def mx_barrier(group: mx.distributed.Group | None = None):  # type: ignore
    mx.eval(  # type: ignore
        mx.distributed.all_sum(
            mx.array(1.0), stream=mx.default_stream(mx.Device(mx.cpu))
            mx.array(1.0),
            stream=mx.default_stream(mx.Device(mx.cpu)),
            group=group,  # type: ignore[type-arg]
        )
    )


def broadcast_from_zero(value: int) -> int:
def broadcast_from_zero(value: int, group: mx.distributed.Group | None = None):  # type: ignore
    if mlx_rank is None:
        return value

@@ -48,7 +57,7 @@ def broadcast_from_zero(value: int) -> int:
    else:
        a = mx.array([0], dtype=mx.int32)

    m = mx.distributed.all_sum(a, stream=mx.Device(mx.DeviceType.cpu))
    m = mx.distributed.all_sum(a, stream=mx.Device(mx.DeviceType.cpu), group=group)  # type: ignore
    mx.eval(m)  # type: ignore
    return int(m.item())
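

# Added note: broadcast_from_zero is built from all_sum. Presumably rank 0
# contributes the actual value in the branch elided from this hunk, while every
# other rank contributes the zero array above, so the sum equals the rank-0
# value everywhere.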

@@ -59,68 +68,60 @@ class HostList(RootModel[list[str]]):
        return cls(root=[str(host) for host in hosts])


def mlx_setup(
    model_size_mb: int,
    cache_frac_of_mrwss: float = 0.65,  # main workhorse
    wired_frac_of_mrwss: float = 0.00,  # start with no wiring
) -> None:
    if not mx.metal.is_available():
        logger.warning(
            "Metal is not available. Skipping MLX memory wired limits setup."
        )
        return
    info = mx.metal.device_info()
    mrwss = int(info["max_recommended_working_set_size"])  # bytes
    memsize = int(info["memory_size"])  # bytes

    runner_print(f"model size mb {model_size_mb}")
    runner_print(f"{mrwss=}")
    runner_print(f"{memsize=}")

    model_bytes = int(model_size_mb * 1024**2)
    kv_bytes = int(0.02 * model_bytes)

    # Cache: keep most of weights+KV “on ice”, but don’t starve the OS.
    target_cache = int(1.10 * (model_bytes + kv_bytes))  # +10% slack
    target_cache = min(target_cache, int(cache_frac_of_mrwss * mrwss))
    target_cache = min(target_cache, memsize)

    runner_print(f"{target_cache=}")
    mx.set_cache_limit(max(target_cache, 0))

    # Wiring: off by default; if you re‑enable, wire at most a small fraction.
    if wired_frac_of_mrwss > 0.0:
        target_wired = int(wired_frac_of_mrwss * mrwss)
        target_wired = min(target_wired, target_cache)  # don’t wire more than cache

        runner_print(f"{target_wired=}")
        with contextlib.suppress(Exception):  # older macOS won’t have this
            mx.set_wired_limit(max(target_wired, 0))


def mlx_distributed_init(rank: int, hosts: list[Host]) -> mx.distributed.Group:  # type: ignore
def mlx_distributed_init(  # type: ignore[return]
    rank: int,
    hosts: list[Host] | None = None,
    mlx_ibv_devices: list[list[str | None]] | None = None,
    mlx_ibv_coordinator: str | None = None,
) -> mx.distributed.Group:  # type: ignore
    """
    Initialize the MLX distributed (runs in thread pool)
    Initialize the MLX distributed (runs in thread pool).

    Either hosts or mlx_ibv_devices must be provided:
    - hosts: traditional host-based connectivity using MLX_HOSTFILE
    - mlx_ibv_devices: RDMA connectivity matrix using MLX_IBV_DEVICES
    - mlx_ibv_coordinator: coordinator address (IP:PORT) for RDMA setup
    """
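    # Added illustration (a guess at the shape, not taken from this diff): the
    # connectivity matrix has one row per rank with one entry per peer, naming
    # the RDMA device used to reach that peer and None where there is no direct
    # link (e.g. the diagonal). For two nodes it might look like:
    #   mlx_ibv_devices = [[None, "rdma0"], ["rdma0", None]]
    #   mlx_ibv_coordinator = "192.168.1.10:5000"  # placeholder IP:PORT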
    global mlx_rank, mlx_world_size
    runner_print(f"Starting initialization for rank {rank}")

    # Setup distributed environment
    hostfile = f"./hosts_{rank}.json"  # TODO: this needs to be unique?
    hosts_json = HostList.from_hosts(hosts).model_dump_json()
    if mlx_ibv_devices is not None:
        assert mlx_ibv_coordinator is not None, (
            "To use ibv backend must set ibv coordinator"
        )
        import json

    runner_print(f"rank {rank} hostfile: {hostfile} hosts: {hosts_json}")
        # Use RDMA connectivity matrix
        devices_file = f"./hosts_{rank}.json"
        ibv_devices_json = json.dumps(mlx_ibv_devices)
        runner_print(f"rank {rank} MLX_IBV_DEVICES: {ibv_devices_json}")
        runner_print(f"rank {rank} MLX_IBV_COORDINATOR: {mlx_ibv_coordinator}")

    with open(hostfile, "w") as f:
        _ = f.write(hosts_json)
        with open(devices_file, "w") as f:
            _ = f.write(ibv_devices_json)

    os.environ["MLX_HOSTFILE"] = hostfile
    os.environ["MLX_RANK"] = str(rank)
    os.environ["MLX_RING_VERBOSE"] = "1"
        os.environ["MLX_IBV_DEVICES"] = devices_file
        os.environ["MLX_RANK"] = str(rank)
        os.environ["MLX_IBV_COORDINATOR"] = mlx_ibv_coordinator

        group = mx.distributed.init(backend="ring", strict=True)
        mlx_rank = group.rank()
        mlx_world_size = group.size()
    elif hosts is not None:
        # Traditional host-based connectivity
        hostfile = f"./hosts_{rank}.json"
        hosts_json = HostList.from_hosts(hosts).model_dump_json()

        runner_print(f"rank {rank} hostfile: {hostfile} hosts: {hosts_json}")

        with open(hostfile, "w") as f:
            _ = f.write(hosts_json)

        os.environ["MLX_HOSTFILE"] = hostfile
        os.environ["MLX_RANK"] = str(rank)
        os.environ["MLX_RING_VERBOSE"] = "1"
    else:
        raise ValueError("Either hosts or mlx_ibv_devices must be provided")

    group = mx.distributed.init(
        backend="ring" if hosts is not None else "ibv", strict=True
    )
    runner_print(f"Rank {rank} mlx distributed initialization complete")

    return group

@@ -128,40 +129,79 @@ def mlx_distributed_init(rank: int, hosts: list[Host]) -> mx.distributed.Group:

def initialize_mlx(
    model_shard_meta: ShardMetadata,
    hosts: list[Host],
) -> tuple[Model, TokenizerWrapper, Callable[[mx.array], mx.array]]:
    hosts: list[Host] | None = None,
    mlx_ibv_devices: list[list[str | None]] | None = None,
    mlx_ibv_coordinator: str | None = None,
) -> tuple[Model, TokenizerWrapper, Callable[[mx.array], mx.array], Any]:
    """
    Initialize the MLX model, tokenizer, and sampler. Runs in the MLX thread.

    Either hosts or mlx_ibv_devices must be provided for distributed setups:
    - hosts: traditional host-based connectivity
    - mlx_ibv_devices: RDMA connectivity matrix
    """
    mx.random.seed(42)
    if len(hosts) > 1:
        mlx_distributed_init(model_shard_meta.device_rank, hosts)
    group = mlx_distributed_init(  # type: ignore[misc]
        model_shard_meta.device_rank,
        hosts=hosts,
        mlx_ibv_devices=mlx_ibv_devices,
        mlx_ibv_coordinator=mlx_ibv_coordinator,
    )

    # set_wired_limit_for_model(get_weights_size(model_shard_meta))

    # Determine world size from either hosts or mlx_ibv_devices

    sampler: Callable[[mx.array], mx.array] = make_sampler(temp=0.7)

    model, tokenizer = shard_and_load(model_shard_meta)
    model = cast(Model, model)
    model, tokenizer = shard_and_load(model_shard_meta, group=group)  # type: ignore[reportUnknownArgumentType]

    return model, tokenizer, sampler
    return model, tokenizer, sampler, group  # type: ignore[return-value]


def shard_and_load(
    model_shard_meta: ShardMetadata,
    group: mx.distributed.Group,  # type: ignore
) -> tuple[nn.Module, TokenizerWrapper]:
    model_path = build_model_path(model_shard_meta.model_meta.model_id)

    runner_print(f"loading model from {model_path}")
    runner_print(
        f"loading model from {model_path} with strategy {model_shard_meta.strategy}"
    )

    model, config = load_model(model_path, lazy=True, strict=False)  # type: ignore
    runner_print(f"{config=}")
    assert isinstance(model, nn.Module)

    tokenizer = load_tokenizer(model_path)
    tokenizer = load_tokenizer(model_path)  # type: ignore
    assert isinstance(tokenizer, _TokenizerWrapper)
    model = auto_parallel(model, model_shard_meta)

    if group:
        runner_print(f"Group size: {group.size()}, group rank: {group.rank()}")  # type: ignore
    else:
        runner_print("!!! No group")

    match model_shard_meta.strategy:
        case "auto":
            strategy = PipelineParallelisationStrategy()
        case "pipeline":
            strategy = PipelineParallelisationStrategy()
        case "pipeline_rdma":
            strategy = PipelineParallelisationStrategy()
        case "tensor":
            strategy = TensorParallelisationStrategy(group)  # type: ignore[reportUnknownArgumentType]
        case "tensor_rdma":
            strategy = TensorParallelisationStrategy(group)  # type: ignore[reportUnknownArgumentType]

    model = strategy.auto_parallel(model, model_shard_meta)

    runner_print(f"Model after auto_parallel: {str(model)}")

    mx.eval(model.parameters())  # type: ignore
    mx.eval(model)  # type: ignore

    # Synchronize processes before generation to avoid timeout
    mx_barrier()
    mx_barrier(group)  # type: ignore[reportUnknownArgumentType]

    return model, tokenizer  # type: ignore

@@ -257,3 +297,30 @@ def mlx_force_oom(size: int = 40000) -> None:
    e = mx.matmul(b, c)
    f = mx.sigmoid(d + e)
    mx.eval(f)  # type: ignore


def set_wired_limit_for_model(model_size: Memory):
    """
    Set the wired limit for a (large) model.

    Note, the wired limit should not be changed during an async eval. If an
    async eval could be running, synchronize with the relevant streams first.
    """
    if not mx.metal.is_available():
        return

    model_bytes = model_size.in_bytes
    max_rec_size = int(mx.metal.device_info()["max_recommended_working_set_size"])
    if model_bytes > 0.9 * max_rec_size:
        model_mb = model_bytes // 2**20
        max_rec_mb = max_rec_size // 2**20
        runner_print(
            f"[WARNING] Generating with a model that requires {model_mb} MB "
            f"which is close to the maximum recommended size of {max_rec_mb} "
            "MB. This can be slow. See the documentation for possible work-arounds: "
            "https://github.com/ml-explore/mlx-lm/tree/main#large-models"
        )
    runner_print(f"Setting wired limit to {max_rec_size}")
    mx.set_wired_limit(max_rec_size)
    runner_print(f"Wired limit set to {max_rec_size}")

@@ -153,7 +153,9 @@ class Node:
            await self.master.shutdown()
            self.master = None
        else:
            logger.info(f"Node {result.session_id.master_node_id} elected master")
            logger.info(
                f"Node {result.session_id.master_node_id} elected master"
            )
        if result.is_new_master:
            await anyio.sleep(0)
            if self.worker:
@@ -175,10 +177,10 @@ class Node:
                )
                self._tg.start_soon(self.worker.run)
            if self.api:
                self.api.reset(result.session_id)
                self.api.reset(result.session_id, result.won_clock)
        else:
            if self.api:
                self.api.unpause()
                self.api.unpause(result.won_clock)


def main():
@@ -93,6 +93,7 @@ class API:
|
||||
self.event_buffer: OrderedBuffer[Event] = OrderedBuffer[Event]()
|
||||
self.node_id: NodeId = node_id
|
||||
self.session_id: SessionId = session_id
|
||||
self.last_completed_election: int = 0
|
||||
self.port = port
|
||||
|
||||
self.paused: bool = False
|
||||
@@ -121,14 +122,15 @@ class API:
|
||||
] = {}
|
||||
self._tg: TaskGroup | None = None
|
||||
|
||||
def reset(self, new_session_id: SessionId):
|
||||
def reset(self, new_session_id: SessionId, result_clock: int):
|
||||
self.state = State()
|
||||
self.session_id = new_session_id
|
||||
self.event_buffer = OrderedBuffer[Event]()
|
||||
self._chat_completion_queues = {}
|
||||
self.unpause()
|
||||
self.unpause(result_clock)
|
||||
|
||||
def unpause(self):
|
||||
def unpause(self, result_clock: int):
|
||||
self.last_completed_election = result_clock
|
||||
self.paused = False
|
||||
self.paused_ev.set()
|
||||
self.paused_ev = AsyncTaskEvent()
|
||||
@@ -155,6 +157,7 @@ class API:
|
||||
self, payload: CreateInstanceTaskParams
|
||||
) -> CreateInstanceResponse:
|
||||
model_meta = await resolve_model_meta(payload.model_id)
|
||||
strategy = payload.strategy
|
||||
required_memory_bytes = model_meta.storage_size.in_kb
|
||||
available_memory_bytes = self._calculate_total_available_memory()
|
||||
|
||||
@@ -165,8 +168,7 @@ class API:
|
||||
)
|
||||
|
||||
command = CreateInstance(
|
||||
command_id=CommandId(),
|
||||
model_meta=model_meta,
|
||||
command_id=CommandId(), model_meta=model_meta, strategy=strategy
|
||||
)
|
||||
await self._send(command)
|
||||
|
||||
@@ -260,10 +262,10 @@ class API:
|
||||
# Store thinking in the thinking field
|
||||
message.thinking = thinking_match.group(1).strip()
|
||||
|
||||
for instance in self.state.instances.values():
|
||||
if instance.shard_assignments.model_id == payload.model:
|
||||
break
|
||||
else:
|
||||
if not any(
|
||||
instance.shard_assignments.model_id == payload.model
|
||||
for instance in self.state.instances.values()
|
||||
):
|
||||
await self._trigger_notify_user_to_download_model(payload.model)
|
||||
raise HTTPException(
|
||||
status_code=404, detail=f"No instance found for model {payload.model}"
|
||||
@@ -334,7 +336,7 @@ class API:
|
||||
async def _pause_on_new_election(self):
|
||||
with self.election_receiver as ems:
|
||||
async for message in ems:
|
||||
if message.clock > self.session_id.election_clock:
|
||||
if message.clock > self.last_completed_election:
|
||||
self.paused = True
|
||||
|
||||
async def _send(self, command: Command):
|
||||
|
||||
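The switch from `session_id.election_clock` to `last_completed_election` is what lets the API ignore re-delivered messages for an election it has already finished. A toy sketch of that gating, with plain attributes standing in for exo's async events and receivers:

```python
# Minimal sketch of the pause/unpause gating above; PausableApi is a
# hypothetical stand-in, keeping only the clock comparison the diff changes.
class PausableApi:
    def __init__(self) -> None:
        self.last_completed_election = 0
        self.paused = False

    def on_election_message(self, clock: int) -> None:
        # Pause only for elections newer than the last one we completed.
        if clock > self.last_completed_election:
            self.paused = True

    def unpause(self, result_clock: int) -> None:
        self.last_completed_election = result_clock
        self.paused = False

api = PausableApi()
api.on_election_message(clock=1)
assert api.paused
api.unpause(result_clock=1)
# A re-delivered message for the already-completed election is ignored.
api.on_election_message(clock=1)
assert not api.paused
```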
@@ -1,3 +1,5 @@
from datetime import datetime, timezone

from anyio import create_task_group
from anyio.abc import TaskGroup
from loguru import logger

@@ -202,6 +204,8 @@ class Master:
        indexed = IndexedEvent(event=event, idx=len(self._event_log))
        self.state = apply(self.state, indexed)

        event._master_time_stamp = datetime.now(tz=timezone.utc)  # pyright: ignore[reportPrivateUsage]

        # TODO: SQL
        self._event_log.append(event)
        await self._send_event(indexed)

@@ -6,6 +6,8 @@ from typing import Sequence
from exo.master.placement_utils import (
    filter_cycles_by_memory,
    get_hosts_from_subgraph,
    get_mlx_ibv_coordinator,
    get_mlx_ibv_devices_matrix,
    get_shard_assignments,
    get_smallest_cycles,
)

@@ -39,7 +41,6 @@ def get_instance_placements_after_create(
    logger.info("finding cycles:")
    cycles = topology.get_cycles()
    logger.info(f"{cycles=}")
    # we can also always just have a node on its own
    singleton_cycles = [[node] for node in all_nodes]
    candidate_cycles = cycles + singleton_cycles
    cycles_with_sufficient_memory = filter_cycles_by_memory(

@@ -58,7 +59,7 @@ def get_instance_placements_after_create(
    ]

    if tb_only and smallest_tb_cycles == []:
        raise ValueError("No cycles found with sufficient memory")
        raise ValueError("No TB cycles found with sufficient memory")
    elif smallest_tb_cycles != []:
        smallest_cycles = smallest_tb_cycles

@@ -80,29 +81,46 @@ def get_instance_placements_after_create(
        ),
    )

    shard_assignments = get_shard_assignments(command.model_meta, selected_cycle)
    shard_assignments = get_shard_assignments(
        command.model_meta, selected_cycle, command.strategy
    )

    cycle_digraph: Topology = topology.get_subgraph_from_nodes(selected_cycle)
    hosts: list[Host] = get_hosts_from_subgraph(cycle_digraph)

    instance_id = InstanceId()
    target_instances = dict(deepcopy(current_instances))
    target_instances[instance_id] = Instance(
        instance_id=instance_id,
        instance_type=InstanceStatus.Active,
        shard_assignments=shard_assignments,
        hosts=[
            Host(
                ip=host.ip,
                # NOTE: this is stupid
                # |
                # v
                # NOTE: it's fine to have non-deterministic ports here since this is in a command decision
                port=random_ephemeral_port(),
            )
            for host in hosts
        ],
    )

    if command.strategy in ("tensor_rdma", "pipeline_rdma"):
        mlx_ibv_devices = get_mlx_ibv_devices_matrix(
            selected_cycle,
            cycle_digraph,
        )
        mlx_ibv_coordinator = get_mlx_ibv_coordinator(
            selected_cycle,
            coordinator_port=random_ephemeral_port(),
        )
        target_instances[instance_id] = Instance(
            instance_id=instance_id,
            instance_type=InstanceStatus.Active,
            shard_assignments=shard_assignments,
            mlx_ibv_devices=mlx_ibv_devices,
            mlx_ibv_coordinator=mlx_ibv_coordinator,
        )
    else:
        hosts: list[Host] = get_hosts_from_subgraph(cycle_digraph)
        target_instances[instance_id] = Instance(
            instance_id=instance_id,
            instance_type=InstanceStatus.Active,
            shard_assignments=shard_assignments,
            hosts=[
                Host(
                    ip=host.ip,
                    port=random_ephemeral_port(),
                )
                for host in hosts
            ],
        )

    return target_instances

@@ -1,5 +1,7 @@
from collections.abc import Generator
from typing import TypeGuard, cast

from loguru import logger
from pydantic import BaseModel

from exo.shared.topology import Topology

@@ -9,8 +11,13 @@ from exo.shared.types.models import ModelMetadata
from exo.shared.types.profiling import NodePerformanceProfile
from exo.shared.types.topology import NodeInfo
from exo.shared.types.worker.common import RunnerId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.shared.types.worker.runners import ShardAssignments
from exo.shared.types.worker.shards import PipelineShardMetadata
from exo.shared.types.worker.shards import (
    PipelineShardMetadata,
    ShardMetadata,
    TensorShardMetadata,
)


class NodeWithProfile(BaseModel):

@@ -43,10 +50,11 @@ def get_smallest_cycles(cycles: list[list[NodeInfo]]) -> list[list[NodeInfo]]:
    return [cycle for cycle in cycles if len(cycle) == min_nodes]


def get_shard_assignments(
def get_shard_assignments_for_pipeline_parallel(
    model_meta: ModelMetadata,
    selected_cycle: list[NodeInfo],
) -> ShardAssignments:
    parallelisation_strategy: ParallelisationStrategyType,
):
    if not narrow_all_nodes(selected_cycle):
        raise ValueError("All nodes must have profiles to create shard assignments")

@@ -55,7 +63,8 @@ def get_shard_assignments(
        start=Memory(),
    )
    total_layers = model_meta.n_layers
    runner_to_shard: dict[RunnerId, PipelineShardMetadata] = {}
    world_size = len(selected_cycle)
    runner_to_shard: dict[RunnerId, ShardMetadata] = {}
    node_to_runner: dict[NodeId, RunnerId] = {}

    layers_assigned = 0
@@ -73,13 +82,15 @@ def get_shard_assignments(
        node_layers = max(1, node_layers)

        runner_id = RunnerId()

        shard = PipelineShardMetadata(
            model_meta=model_meta,
            device_rank=i,
            world_size=len(selected_cycle),
            world_size=world_size,
            start_layer=layers_assigned,
            end_layer=layers_assigned + node_layers,
            n_layers=total_layers,
            strategy=parallelisation_strategy,
        )

        runner_to_shard[runner_id] = shard
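The pipeline assigner above walks the cycle once, sizing each node's slice of `n_layers` by its share of the cycle's memory and clamping to at least one layer. A standalone sketch of that split, under the assumption that the last node absorbs rounding leftovers (the real code's exact rounding may differ):

```python
# Sketch of the proportional pipeline split, with memory in arbitrary units.
# Half-open [start, end) intervals tile the model without gaps or overlap.
def split_layers(total_layers: int, memories: list[int]) -> list[tuple[int, int]]:
    total_mem = sum(memories)
    shards: list[tuple[int, int]] = []
    assigned = 0
    for i, mem in enumerate(memories):
        if i == len(memories) - 1:
            node_layers = total_layers - assigned  # remainder to the last node
        else:
            node_layers = max(1, round(total_layers * mem / total_mem))
        shards.append((assigned, assigned + node_layers))
        assigned += node_layers
    return shards

# 32 layers over a 3-node cycle with a 2:1:1 memory ratio.
print(split_layers(32, [32, 16, 16]))  # [(0, 16), (16, 24), (24, 32)]
```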
@@ -95,6 +106,82 @@ def get_shard_assignments(
    return shard_assignments


def get_shard_assignments_for_tensor_parallel(
    model_meta: ModelMetadata,
    selected_cycle: list[NodeInfo],
    parallelisation_strategy: ParallelisationStrategyType,
):
    if not narrow_all_nodes(selected_cycle):
        raise ValueError("All nodes must have profiles to create shard assignments")

    total_layers = model_meta.n_layers
    world_size = len(selected_cycle)
    runner_to_shard: dict[RunnerId, ShardMetadata] = {}
    node_to_runner: dict[NodeId, RunnerId] = {}

    for i, node in enumerate(selected_cycle):
        shard = TensorShardMetadata(
            model_meta=model_meta,
            device_rank=i,
            world_size=world_size,
            start_layer=0,
            end_layer=total_layers,
            n_layers=total_layers,
            strategy=parallelisation_strategy,
        )

        runner_id = RunnerId()

        runner_to_shard[runner_id] = shard
        node_to_runner[node.node_id] = runner_id

    shard_assignments = ShardAssignments(
        model_id=model_meta.model_id,
        runner_to_shard=runner_to_shard,
        node_to_runner=node_to_runner,
    )

    return shard_assignments


def get_shard_assignments(
    model_meta: ModelMetadata,
    selected_cycle: list[NodeInfo],
    parallelisation_strategy: ParallelisationStrategyType,
) -> ShardAssignments:
    match parallelisation_strategy:
        case "auto":
            return get_shard_assignments_for_pipeline_parallel(
                model_meta=model_meta,
                selected_cycle=selected_cycle,
                parallelisation_strategy=parallelisation_strategy,
            )
        case "pipeline":
            return get_shard_assignments_for_pipeline_parallel(
                model_meta=model_meta,
                selected_cycle=selected_cycle,
                parallelisation_strategy=parallelisation_strategy,
            )
        case "pipeline_rdma":
            return get_shard_assignments_for_pipeline_parallel(
                model_meta=model_meta,
                selected_cycle=selected_cycle,
                parallelisation_strategy=parallelisation_strategy,
            )
        case "tensor":
            return get_shard_assignments_for_tensor_parallel(
                model_meta=model_meta,
                selected_cycle=selected_cycle,
                parallelisation_strategy=parallelisation_strategy,
            )
        case "tensor_rdma":
            return get_shard_assignments_for_tensor_parallel(
                model_meta=model_meta,
                selected_cycle=selected_cycle,
                parallelisation_strategy=parallelisation_strategy,
            )


def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
    cycles = cycle_digraph.get_cycles()
    if not cycles:
@@ -126,3 +213,109 @@ def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
            break

    return hosts


def get_mlx_ibv_devices_matrix(
    selected_cycle: list[NodeInfo],
    cycle_digraph: Topology,
) -> list[list[str | None]]:
    """Build connectivity matrix mapping device i to device j via RDMA interface names.

    The matrix element [i][j] contains the interface name on device i that connects
    to device j, or None if no connection exists or no interface name is found.
    Diagonal elements are always None.
    """
    num_nodes = len(selected_cycle)
    matrix: list[list[str | None]] = [
        [None for _ in range(num_nodes)] for _ in range(num_nodes)
    ]

    for i, node_i in enumerate(selected_cycle):
        for j, node_j in enumerate(selected_cycle):
            if i == j:
                continue

            # just for debugging for now...
            for connection_ip in _find_connection_ip(node_i, node_j, cycle_digraph):
                interface_name = _find_interface_name_for_ip(connection_ip, node_i)
                logger.info(
                    f"Interface name for {connection_ip} on {node_i.node_id}: {interface_name}"
                )

                matrix[i][j] = "rdma_en3"  # TODO: hack, for now it's always en3
                continue

            for connection_ip in _find_connection_ip(node_i, node_j, cycle_digraph):
                # Set the first valid rdma i -> j connection - if there are multiple, we pick one essentially at random - this is fine, the connection doesn't appear to have to be bidirectional
                if (
                    interface_name := _find_interface_name_for_ip(
                        connection_ip,
                        node_i,
                    )
                ) is not None:
                    matrix[i][j] = interface_name
                    break
            else:
                raise ValueError(
                    "Current ibv backend requires all-to-all rdma connections"
                )

    return matrix


def _find_connection_ip(
    node_i: NodeInfo,
    node_j: NodeInfo,
    cycle_digraph: Topology,
) -> Generator[str]:
    """Find all IP addresses that connect node i to node j."""
    for connection in cycle_digraph.list_connections():
        if (
            connection.local_node_id == node_j.node_id
            and connection.send_back_node_id == node_i.node_id
            and connection.send_back_multiaddr is not None
        ):
            yield connection.send_back_multiaddr.ip_address


def _find_interface_name_for_ip(
    ip_address: str,
    node_info: NodeInfo,
) -> str | None:
    if node_info.node_profile is None:
        return None

    for interface in node_info.node_profile.network_interfaces:
        logger.info(
            f"Checking interface {interface.name} for IP {interface.ip_address} == {ip_address}: {interface.ip_address == ip_address}"
        )
        if interface.name not in ["en2", "en3", "en4", "en5", "en6", "en7"]:
            continue
        if interface.ip_address == ip_address:
            return f"rdma_{interface.name}"

    return None


def get_mlx_ibv_coordinator(
    selected_cycle: list[NodeInfo],
    coordinator_port: int,
) -> str | None:
    """Get the coordinator address for MLX IBV (rank 0 device).

    Selects a non-thunderbolt IP address from the rank 0 node as a heuristic for
    ethernet accessibility. Returns the address in the format "X.X.X.X:PORT".
    """

    if len(selected_cycle) == 0:
        logger.warning("No nodes in selected cycle, cannot determine coordinator")
        return None

    rank_0_node = selected_cycle[0]
    logger.info(f"Selecting coordinator from rank 0 node: {rank_0_node.node_id}")
    assert rank_0_node.node_profile is not None
    for iface in rank_0_node.node_profile.network_interfaces:
        if iface.name == "en0" and "." in iface.ip_address:
            return f"{iface.ip_address}:{coordinator_port}"

    raise ValueError("No en0 iface found for device")
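For a concrete picture of the structure `get_mlx_ibv_devices_matrix` returns, here is a toy 3-node ring (interface names invented). Note that the real function demands all-to-all RDMA reachability and would raise on a sparse ring like this; the sketch only shows the matrix layout:

```python
# Toy illustration of the connectivity-matrix shape: matrix[i][j] names the
# RDMA interface on device i that reaches device j, None otherwise.
ring = {(0, 1): "rdma_en3", (1, 2): "rdma_en4", (2, 0): "rdma_en5"}
n = 3
matrix: list[list[str | None]] = [
    [ring.get((i, j)) if i != j else None for j in range(n)] for i in range(n)
]
assert all(matrix[i][i] is None for i in range(n))  # diagonal stays None
print(matrix)
# [[None, 'rdma_en3', None], [None, None, 'rdma_en4'], ['rdma_en5', None, None]]
```

The `test_tensor_rdma_backend_connectivity_matrix` test below asserts exactly this shape against real topology objects.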
@@ -118,6 +118,7 @@ async def test_master():
                    n_layers=16,
                    storage_size=Memory.from_bytes(678948),
                ),
                strategy="auto",
            )
        ),
    )

@@ -12,6 +12,7 @@ from exo.shared.types.common import CommandId, NodeId
from exo.shared.types.events import InstanceCreated, InstanceDeleted
from exo.shared.types.memory import Memory
from exo.shared.types.models import ModelId, ModelMetadata
from exo.shared.types.profiling import NetworkInterfaceInfo, NodePerformanceProfile
from exo.shared.types.topology import Connection, NodeInfo
from exo.shared.types.worker.common import InstanceId
from exo.shared.types.worker.instances import Instance, InstanceStatus

@@ -49,6 +50,7 @@ def create_instance_command(model_meta: ModelMetadata) -> CreateInstance:
    return CreateInstance(
        command_id=CommandId(),
        model_meta=model_meta,
        strategy="auto",
    )


@@ -78,6 +80,7 @@ def test_get_instance_placements_create_instance(
    create_instance_command = CreateInstance(
        command_id=CommandId(),
        model_meta=model_meta,
        strategy="auto",
    )
    node_id_a = NodeId()
    node_id_b = NodeId()

@@ -132,6 +135,7 @@ test_get_instance_placements_one_node_exact_fit(
            pretty_name="Test Model",
            n_layers=10,
        ),
        strategy="auto",
    )
    placements = get_instance_placements_after_create(
        create_instance_command, topology, {}

@@ -160,6 +164,7 @@ test_get_instance_placements_one_node_fits_with_extra_memory(
            pretty_name="Test Model",
            n_layers=10,
        ),
        strategy="auto",
    )
    placements = get_instance_placements_after_create(
        create_instance_command, topology, {}

@@ -188,6 +193,7 @@ test_get_instance_placements_one_node_not_fit(
            pretty_name="Test Model",
            n_layers=10,
        ),
        strategy="auto",
    )

    with pytest.raises(ValueError, match="No cycles found with sufficient memory"):

@@ -297,6 +303,7 @@ test_placement_prioritizes_leaf_cycle_with_less_memory(
    create_instance_command = CreateInstance(
        command_id=CommandId(),
        model_meta=model_meta,
        strategy="auto",
    )

    # Act

@@ -316,3 +323,130 @@

    assert expected_leaf_cycle_nodes.issubset(assigned_nodes)
    assert assigned_nodes.isdisjoint(non_leaf_cycle_nodes)


def test_tensor_rdma_backend_connectivity_matrix(
    topology: Topology,
    model_meta: ModelMetadata,
    create_node: Callable[[int, NodeId | None], NodeInfo],
    create_connection: Callable[[NodeId, NodeId], Connection],
):
    model_meta.n_layers = 12
    model_meta.storage_size.in_bytes = 1500

    node_id_a = NodeId()
    node_id_b = NodeId()
    node_id_c = NodeId()

    node_a = create_node(500, node_id_a)
    node_b = create_node(500, node_id_b)
    node_c = create_node(500, node_id_c)

    ethernet_interface = NetworkInterfaceInfo(
        name="en0",
        ip_address="192.168.1.100",
        type="ethernet",
    )

    assert node_a.node_profile is not None
    assert node_b.node_profile is not None
    assert node_c.node_profile is not None

    conn_a_b = create_connection(node_id_a, node_id_b)
    conn_b_c = create_connection(node_id_b, node_id_c)
    conn_c_a = create_connection(node_id_c, node_id_a)

    assert conn_a_b.send_back_multiaddr is not None
    assert conn_b_c.send_back_multiaddr is not None
    assert conn_c_a.send_back_multiaddr is not None

    node_a.node_profile = NodePerformanceProfile(
        model_id="test",
        chip_id="test",
        friendly_name="test",
        memory=node_a.node_profile.memory,
        network_interfaces=[
            NetworkInterfaceInfo(
                name="en3",
                ip_address=conn_a_b.send_back_multiaddr.ip_address,
                type="rdma",
            ),
            ethernet_interface,
        ],
        system=node_a.node_profile.system,
    )
    node_b.node_profile = NodePerformanceProfile(
        model_id="test",
        chip_id="test",
        friendly_name="test",
        memory=node_b.node_profile.memory,
        network_interfaces=[
            NetworkInterfaceInfo(
                name="en4",
                ip_address=conn_b_c.send_back_multiaddr.ip_address,
                type="rdma",
            ),
            ethernet_interface,
        ],
        system=node_b.node_profile.system,
    )
    node_c.node_profile = NodePerformanceProfile(
        model_id="test",
        chip_id="test",
        friendly_name="test",
        memory=node_c.node_profile.memory,
        network_interfaces=[
            NetworkInterfaceInfo(
                name="en5",
                ip_address=conn_c_a.send_back_multiaddr.ip_address,
                type="rdma",
            ),
            ethernet_interface,
        ],
        system=node_c.node_profile.system,
    )

    topology.add_node(node_a)
    topology.add_node(node_b)
    topology.add_node(node_c)
    topology.add_connection(conn_a_b)
    topology.add_connection(conn_b_c)
    topology.add_connection(conn_c_a)

    create_instance_command = CreateInstance(
        command_id=CommandId(),
        model_meta=model_meta,
        strategy="tensor_rdma",
    )

    placements = get_instance_placements_after_create(
        create_instance_command, topology, {}
    )

    assert len(placements) == 1
    instance_id = list(placements.keys())[0]
    instance = placements[instance_id]

    assert instance.hosts is None
    assert instance.mlx_ibv_devices is not None
    assert instance.mlx_ibv_coordinator is not None

    matrix = instance.mlx_ibv_devices
    assert len(matrix) == 3

    for i in range(3):
        assert matrix[i][i] is None

    assigned_nodes = list(instance.shard_assignments.node_to_runner.keys())
    node_to_idx = {node_id: idx for idx, node_id in enumerate(assigned_nodes)}

    idx_a = node_to_idx[node_id_a]
    idx_b = node_to_idx[node_id_b]
    idx_c = node_to_idx[node_id_c]

    assert matrix[idx_a][idx_b] == "rdma_en3"
    assert matrix[idx_b][idx_c] == "rdma_en4"
    assert matrix[idx_c][idx_a] == "rdma_en5"

    assert ":" in instance.mlx_ibv_coordinator
    assert not instance.mlx_ibv_coordinator.startswith("169.254")

@@ -200,7 +200,7 @@ def test_get_shard_assignments(
    selected_cycle = cycles[0]

    # act
    shard_assignments = get_shard_assignments(model_meta, selected_cycle)
    shard_assignments = get_shard_assignments(model_meta, selected_cycle, "pipeline")

    # assert
    runner_id_a = shard_assignments.node_to_runner[node_a_id]

@@ -12,7 +12,12 @@ from anyio import (
    sleep_forever,
)
from anyio.abc import TaskGroup
from exo_pyo3_bindings import Keypair, NetworkingHandle, NoPeersSubscribedToTopicError
from exo_pyo3_bindings import (
    AllQueuesFullError,
    Keypair,
    NetworkingHandle,
    NoPeersSubscribedToTopicError,
)
from filelock import FileLock
from loguru import logger

@@ -207,7 +212,7 @@ class Router:
            await self._net.gossipsub_publish(topic, data)
        # As a hack, this also catches AllQueuesFull
        # Need to fix that ASAP.
        except NoPeersSubscribedToTopicError:
        except (NoPeersSubscribedToTopicError, AllQueuesFullError):
            pass

@@ -16,8 +16,6 @@ from exo.shared.types.common import NodeId, SessionId
from exo.utils.channels import Receiver, Sender
from exo.utils.pydantic_ext import CamelCaseModel

ELECTION_TIMEOUT = 3.0


class ElectionMessage(CamelCaseModel):
    clock: int
@@ -27,6 +25,8 @@ class ElectionMessage(CamelCaseModel):

    # Could eventually include a list of neighbour nodes for centrality
    def __lt__(self, other: Self) -> bool:
        if self.clock != other.clock:
            return self.clock < other.clock
        if self.seniority != other.seniority:
            return self.seniority < other.seniority
        elif self.commands_seen != other.commands_seen:
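The `__lt__` above gives candidates a total order: clock first, then seniority, then commands seen, so `max(candidates)` picks the winner deterministically. A self-contained sketch, with a dataclass standing in for the pydantic model:

```python
# Sketch of the candidate ordering used by the election. total_ordering
# derives the remaining comparisons from __lt__ and the dataclass __eq__.
from dataclasses import dataclass
from functools import total_ordering

@total_ordering
@dataclass
class Candidate:
    clock: int
    seniority: int
    commands_seen: int

    def __lt__(self, other: "Candidate") -> bool:
        if self.clock != other.clock:
            return self.clock < other.clock
        if self.seniority != other.seniority:
            return self.seniority < other.seniority
        return self.commands_seen < other.commands_seen

candidates = [Candidate(2, 1, 5), Candidate(2, 3, 0), Candidate(1, 9, 9)]
# max() picks the winner exactly as the campaign's `elected = max(candidates)` does:
assert max(candidates) == Candidate(2, 3, 0)  # newest clock, highest seniority
```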
@@ -40,6 +40,7 @@

class ElectionResult(CamelCaseModel):
    session_id: SessionId
    won_clock: int
    is_new_master: bool
    historic_messages: list[ConnectionMessage]

@@ -90,19 +91,33 @@ class Election:
            tg.start_soon(self._election_receiver)
            tg.start_soon(self._connection_receiver)
            tg.start_soon(self._command_counter)
            await self._campaign(None)

            # And start an election immediately, that instantly resolves
            candidates: list[ElectionMessage] = []
            logger.info("Starting initial campaign")
            self._candidates = candidates
            logger.info("Campaign started")
            await self._campaign(candidates, campaign_timeout=0.0)
            logger.info("Initial campaign finished")

        # Cancel and wait for the last election to end
        if self._campaign_cancel_scope is not None:
            logger.info("Cancelling campaign")
            self._campaign_cancel_scope.cancel()
        # Only exit once the latest campaign has finished
        if self._campaign_done is not None:
            logger.info("Waiting for campaign to finish")
            await self._campaign_done.wait()
            logger.info("Campaign cancelled and finished")
        logger.info("Election finished")

    async def elect(self, em: ElectionMessage) -> None:
        logger.info(f"Electing: {em}")
        is_new_master = em.proposed_session != self.current_session
        self.current_session = em.proposed_session
        logger.info(f"Current session: {self.current_session}")
        await self._er_sender.send(
            ElectionResult(
                won_clock=em.clock,
                session_id=em.proposed_session,
                is_new_master=is_new_master,
                historic_messages=self._connection_messages,
@@ -120,16 +135,29 @@ class Election:
    async def _election_receiver(self) -> None:
        with self._em_receiver as election_messages:
            async for message in election_messages:
                logger.info(f"Election message received: {message}")
                if message.proposed_session.master_node_id == self.node_id:
                    logger.info("Dropping message from ourselves")
                    # Drop messages from us (See exo.routing.router)
                    continue
                # If a new round is starting, we participate
                if message.clock > self.clock:
                    self.clock = message.clock
                    await self._campaign(message)
                    logger.info(f"New clock: {self.clock}")
                    assert self._tg is not None
                    logger.info("Starting new campaign")
                    candidates: list[ElectionMessage] = [message]
                    logger.info(f"Candidates: {candidates}")
                    logger.info(f"Current candidates: {self._candidates}")
                    self._candidates = candidates
                    logger.info(f"New candidates: {self._candidates}")
                    logger.info("Starting new campaign")
                    self._tg.start_soon(self._campaign, candidates)
                    logger.info("Campaign started")
                    continue
                # Dismiss old messages
                if message.clock < self.clock:
                    logger.info(f"Dropping old message: {message}")
                    continue
                logger.debug(f"Election added candidate {message}")
                # Now we are processing this round's messages - including the message that triggered this round.
@@ -137,70 +165,97 @@ class Election:

    async def _connection_receiver(self) -> None:
        with self._cm_receiver as connection_messages:
            async for msg in connection_messages:
            async for first in connection_messages:
                # Delay after connection message for time to symmetrically set up
                await anyio.sleep(0.2)
                rest = connection_messages.collect()

                logger.info(f"Connection messages received: {first} followed by {rest}")
                logger.info(f"Current clock: {self.clock}")
                # These messages are strictly peer to peer
                self.clock += 1
                await self._campaign(None)
                self._connection_messages.append(msg)
                logger.info(f"New clock: {self.clock}")
                assert self._tg is not None
                candidates: list[ElectionMessage] = []
                self._candidates = candidates
                logger.info("Starting new campaign")
                self._tg.start_soon(self._campaign, candidates)
                logger.info("Campaign started")
                self._connection_messages.append(first)
                self._connection_messages.extend(rest)
                logger.info("Connection message added")

    async def _command_counter(self) -> None:
        with self._co_receiver as commands:
            async for _command in commands:
                self.commands_seen += 1

    async def _campaign(self, initial_message: ElectionMessage | None) -> None:
    async def _campaign(
        self, candidates: list[ElectionMessage], *, campaign_timeout: float = 3.0
    ) -> None:
        clock = self.clock

        # Kill the old campaign
        if self._campaign_cancel_scope:
            logger.info("Cancelling other campaign")
            self._campaign_cancel_scope.cancel()
        if self._campaign_done:
            logger.info("Waiting for other campaign to finish")
            await self._campaign_done.wait()

        candidates: list[ElectionMessage] = []
        if initial_message:
            candidates.append(initial_message)
        self._candidates = candidates
        done = Event()
        self._campaign_done = done

        assert self._tg is not None, (
            "Election campaign started before election service initialized"
        )
        # Spin off a new campaign
        self._tg.start_soon(self._complete_campaign, self.clock, candidates, done)

    async def _complete_campaign(
        self, clock: int, candidates: list[ElectionMessage], done: Event
    ) -> None:
        scope = CancelScope()
        self._campaign_cancel_scope = scope

        try:
            with scope:
                self._campaign_cancel_scope = scope
                logger.info(f"Election {clock} started")

                candidates.append(self._election_status(clock))
                await self._em_sender.send(self._election_status(clock))
                status = self._election_status(clock)
                candidates.append(status)
                await self._em_sender.send(status)

                await anyio.sleep(ELECTION_TIMEOUT)
                logger.info(f"Sleeping for {campaign_timeout} seconds")
                await anyio.sleep(campaign_timeout)
                # minor hack - rebroadcast status in case anyone has missed it.
                await self._em_sender.send(status)
                logger.info("Woke up from sleep")
                # add an anyio checkpoint - anyio.lowlevel.checkpoint() or checkpoint_if_cancelled() is preferred, but wasn't typechecking last I checked
                await anyio.sleep(0)

            # Election finished!
            candidates = sorted(candidates)
            logger.debug(f"Election queue {candidates}")
            elected = candidates[-1]
            elected = max(candidates)
            logger.info(f"Election queue {candidates}")
            logger.info(f"Elected: {elected}")
            if (
                self.node_id == elected.proposed_session.master_node_id
                and self.seniority >= 0
            ):
                logger.info(
                    f"Node is a candidate and seniority is {self.seniority}"
                )
                self.seniority = max(self.seniority, len(candidates))
                logger.info(f"New seniority: {self.seniority}")
            else:
                logger.info(
                    f"Node is not a candidate or seniority is not {self.seniority}"
                )
            logger.info(
                f"Election finished, new SessionId({elected.proposed_session})"
                f"Election finished, new SessionId({elected.proposed_session}) with queue {candidates}"
            )
            logger.info("Sending election result")
            await self.elect(elected)
            logger.info("Election result sent")
        except get_cancelled_exc_class():
            logger.info("Election cancelled")
            logger.info(f"Election {clock} cancelled")
        finally:
            logger.info(f"Election {clock} finally")
            if self._campaign_cancel_scope is scope:
                self._campaign_cancel_scope = None
            done.set()
            logger.info("Setting done event")
            done.set()
            logger.info("Done event set")

    def _election_status(self, clock: int | None = None) -> ElectionMessage:
        c = self.clock if clock is None else clock
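The rewritten `_connection_receiver` coalesces a burst of connection messages into a single election: take the first message, sleep briefly so the peer has time to symmetrically set up, then drain whatever else has queued. A sketch of that pattern with plain anyio memory streams (exo's `Receiver.collect()` plays the role of `drain` here; the message strings are invented):

```python
# Sketch of burst coalescing: one campaign covers the whole burst instead of
# restarting an election per connection message.
import anyio
from anyio.streams.memory import MemoryObjectReceiveStream

def drain(receive: MemoryObjectReceiveStream) -> list:
    rest = []
    while True:
        try:
            rest.append(receive.receive_nowait())
        except anyio.WouldBlock:
            return rest

async def main() -> None:
    send, receive = anyio.create_memory_object_stream(16)
    for msg in ("conn-a", "conn-b", "conn-c"):
        send.send_nowait(msg)
    first = await receive.receive()
    await anyio.sleep(0.2)  # give the peer time to symmetrically set up
    rest = drain(receive)
    print(first, rest)  # conn-a ['conn-b', 'conn-c'] -> a single new campaign

anyio.run(main)
```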
@@ -166,7 +166,7 @@ MODEL_CARDS: dict[str, ModelCard] = {
    "llama-3.3-70b": ModelCard(
        short_id="llama-3.3-70b",
        model_id="mlx-community/Llama-3.3-70B-Instruct-4bit",
        name="Llama 3.3 70B",
        name="Llama 3.3 70B (4-bit)",
        description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
        tags=[],
        metadata=ModelMetadata(
@@ -176,6 +176,32 @@ MODEL_CARDS: dict[str, ModelCard] = {
            n_layers=80,
        ),
    ),
    "llama-3.3-70b-8bit": ModelCard(
        short_id="llama-3.3-70b-8bit",
        model_id="mlx-community/Llama-3.3-70B-Instruct-8bit",
        name="Llama 3.3 70B (8-bit)",
        description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
        tags=[],
        metadata=ModelMetadata(
            model_id=ModelId("mlx-community/Llama-3.3-70B-Instruct-8bit"),
            pretty_name="Llama 3.3 70B (8-bit)",
            storage_size=Memory.from_kb(77516320),
            n_layers=80,
        ),
    ),
    "llama-3.3-70b-fp16": ModelCard(
        short_id="llama-3.3-70b-fp16",
        model_id="mlx-community/llama-3.3-70b-instruct-fp16",
        name="Llama 3.3 70B (FP16)",
        description="""The Meta Llama 3.3 multilingual large language model (LLM) is an instruction tuned generative model in 70B (text in/text out)""",
        tags=[],
        metadata=ModelMetadata(
            model_id=ModelId("mlx-community/llama-3.3-70b-instruct-fp16"),
            pretty_name="Llama 3.3 70B (FP16)",
            storage_size=Memory.from_kb(155032640),
            n_layers=80,
        ),
    ),
    # phi-3
    "phi-3-mini": ModelCard(
        short_id="phi-3-mini",
@@ -230,6 +256,32 @@ MODEL_CARDS: dict[str, ModelCard] = {
            n_layers=48,
        ),
    ),
    "qwen3-235b-a22b": ModelCard(
        short_id="qwen3-235b-a22b",
        model_id="mlx-community/Qwen3-235B-A22B-4bit",
        name="Qwen3 235B, Active 22B (4-bit)",
        description="""Qwen3 235B (Active 22B) is a large language model trained on the Qwen3 235B dataset.""",
        tags=[],
        metadata=ModelMetadata(
            model_id=ModelId("mlx-community/Qwen3-235B-A22B-4bit"),
            pretty_name="Qwen3 235B, Active 22B (4-bit)",
            storage_size=Memory.from_kb(123207680),
            n_layers=94,
        ),
    ),
    "qwen3-235b-a22b-8bit": ModelCard(
        short_id="qwen3-235b-a22b-8bit",
        model_id="mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit",
        name="Qwen3 235B, Active 22B (8-bit)",
        description="""Qwen3 235B (Active 22B) is a large language model trained on the Qwen3 235B dataset.""",
        tags=[],
        metadata=ModelMetadata(
            model_id=ModelId("mlx-community/Qwen3-235B-A22B-Instruct-2507-8bit"),
            pretty_name="Qwen3 235B, Active 22B (8-bit)",
            storage_size=Memory.from_kb(246415360),
            n_layers=94,
        ),
    ),
    # granite
    "granite-3.3-2b": ModelCard(
        short_id="granite-3.3-2b",

@@ -7,6 +7,7 @@ from exo.shared.openai_compat import FinishReason
from exo.shared.types.common import CommandId
from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.instances import InstanceId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType


class ModelListModel(BaseModel):

@@ -123,6 +124,7 @@ class ChatCompletionTaskParams(BaseModel):
class CreateInstanceTaskParams(BaseModel):
    # TODO: in future the user could specify a specific Instance, not just a model_id
    model_id: str
    strategy: ParallelisationStrategyType = "auto"


class DeleteInstanceTaskParams(BaseModel):

@@ -4,6 +4,7 @@ from exo.shared.types.api import ChatCompletionTaskParams
from exo.shared.types.common import CommandId, NodeId
from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.common import InstanceId
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.utils.pydantic_ext import CamelCaseModel, TaggedModel


@@ -22,6 +23,7 @@ class ChatCompletion(BaseCommand):

class CreateInstance(BaseCommand):
    model_meta: ModelMetadata
    strategy: ParallelisationStrategyType


class SpinUpInstance(BaseCommand):

@@ -1,3 +1,4 @@
from datetime import datetime
from enum import Enum

from pydantic import Field
@@ -60,6 +61,8 @@ class EventType(str, Enum):

class BaseEvent(TaggedModel):
    event_id: EventId = Field(default_factory=EventId)
    # Internal, for debugging. Please don't rely on this field for anything!
    _master_time_stamp: None | datetime = None


class TestEvent(BaseEvent):

@@ -11,7 +11,9 @@ class BaseRunnerMessage(TaggedModel):

class SetupMessage(BaseRunnerMessage):
    model_shard_meta: ShardMetadata
    hosts: list[Host]
    hosts: list[Host] | None = None
    mlx_ibv_devices: list[list[str | None]] | None = None
    mlx_ibv_coordinator: str | None = None


# TODO: We probably want a general task message that can take any task type. Can be fixed later.

@@ -17,4 +17,6 @@ class Instance(CamelCaseModel):
    instance_id: InstanceId
    instance_type: InstanceStatus
    shard_assignments: ShardAssignments
    hosts: list[Host]
    hosts: list[Host] | None = None
    mlx_ibv_devices: list[list[str | None]] | None = None
    mlx_ibv_coordinator: str | None = None

@@ -14,7 +14,9 @@ class AssignRunnerOp(BaseRunnerOp):
    instance_id: InstanceId
    runner_id: RunnerId
    shard_metadata: ShardMetadata
    hosts: list[Host]
    hosts: list[Host] | None = None
    mlx_ibv_devices: list[list[str | None]] | None = None
    mlx_ibv_coordinator: str | None = None


class UnassignRunnerOp(BaseRunnerOp):

13
src/exo/shared/types/worker/parallelisation_strategy.py
Normal file

@@ -0,0 +1,13 @@
from typing import Literal

ParallelisationStrategyType = Literal[
    "auto",
    "pipeline",
    "tensor",
    "tensor_rdma",
    "pipeline_rdma",
]


def strategy_error() -> ValueError:
    return ValueError("Unexpected strategy")
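Because `ParallelisationStrategyType` is a `Literal`, a `match` over it can be checked for exhaustiveness by the type checker, with `strategy_error()` covering the should-be-unreachable arm. A sketch; note that or-patterns could also condense the five near-identical `case` arms seen in the dispatch code earlier in this commit:

```python
# Sketch of exhaustive dispatch over the strategy Literal; family() is a
# hypothetical helper, not part of the commit.
from typing import Literal

ParallelisationStrategyType = Literal[
    "auto", "pipeline", "tensor", "tensor_rdma", "pipeline_rdma"
]

def family(strategy: ParallelisationStrategyType) -> str:
    match strategy:
        case "auto" | "pipeline" | "pipeline_rdma":
            return "pipeline"
        case "tensor" | "tensor_rdma":
            return "tensor"
        case _:
            # mirrors strategy_error(): unreachable for well-typed callers
            raise ValueError("Unexpected strategy")

assert family("pipeline_rdma") == "pipeline"
assert family("tensor") == "tensor"
```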
@@ -1,6 +1,7 @@
from pydantic import Field

from exo.shared.types.models import ModelMetadata
from exo.shared.types.worker.parallelisation_strategy import ParallelisationStrategyType
from exo.utils.pydantic_ext import TaggedModel


@@ -19,19 +20,12 @@ class BaseShardMetadata(TaggedModel):
    immediate_exception: bool = False
    should_timeout: float | None = None


class PipelineShardMetadata(BaseShardMetadata):
    """
    Pipeline parallelism shard meta.

    Layers are represented as a half-open interval [start_layer, end_layer),
    where start_layer is inclusive and end_layer is exclusive.
    """

    start_layer: int = Field(ge=0)
    end_layer: int = Field(ge=0)
    n_layers: int = Field(ge=0)

    strategy: ParallelisationStrategyType = "auto"

    @property
    def is_first_layer(self) -> bool:
        return self.start_layer == 0
@@ -46,4 +40,19 @@ class PipelineShardMetadata(BaseShardMetadata):
    )


ShardMetadata = PipelineShardMetadata
class PipelineShardMetadata(BaseShardMetadata):
    """
    Pipeline parallelism shard meta.

    Layers are represented as a half-open interval [start_layer, end_layer),
    where start_layer is inclusive and end_layer is exclusive.
    """

    strategy: ParallelisationStrategyType = "pipeline"


class TensorShardMetadata(BaseShardMetadata):
    strategy: ParallelisationStrategyType = "tensor"


ShardMetadata = PipelineShardMetadata | TensorShardMetadata
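The half-open `[start_layer, end_layer)` convention means shard boundaries never overlap and every layer has exactly one owner. A tiny illustration, with tuples standing in for `PipelineShardMetadata`:

```python
# Half-open interval semantics: end is exclusive, so adjacent shards share
# a boundary number without sharing a layer.
shards = [(0, 16), (16, 24), (24, 32)]  # world_size=3, n_layers=32

def owner(layer: int) -> int:
    for rank, (start, end) in enumerate(shards):
        if start <= layer < end:  # end is exclusive
            return rank
    raise ValueError(f"layer {layer} not covered")

assert owner(15) == 0 and owner(16) == 1  # boundary layer goes to the next shard
assert sum(end - start for start, end in shards) == 32  # intervals tile exactly
```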
@@ -1,4 +1,5 @@
from math import inf
from typing import Self

from anyio import ClosedResourceError, WouldBlock
from anyio.streams.memory import (
@@ -47,6 +48,9 @@ class Receiver[T](AnyioReceiver[T]):
        out.extend(self.collect())
        return out

    def __enter__(self) -> Self:
        return self


class channel[T]:  # noqa: N801
    def __new__(cls, max_buffer_size: float = inf) -> tuple[Sender[T], Receiver[T]]:

@@ -18,12 +18,14 @@ from exo.worker.runner.runner_supervisor import RunnerSupervisor
class AssignedRunner(BaseModel):
    runner_id: RunnerId
    instance_id: InstanceId
    shard_metadata: ShardMetadata  # just data
    shard_metadata: ShardMetadata
    hosts: list[Host]
    hosts: list[Host] | None = None
    mlx_ibv_devices: list[list[str | None]] | None = None
    mlx_ibv_coordinator: str | None = None

    status: RunnerStatus
    failures: list[tuple[float, Exception]] = []
    runner: RunnerSupervisor | None  # set if the runner is 'up'
    runner: RunnerSupervisor | None = None

    model_config = ConfigDict(arbitrary_types_allowed=True)


@@ -194,8 +194,8 @@ class Worker:

        # run the op, synchronously blocking for now
        if op is not None:
            logger.info(f"Executing op {str(op)[:100]}")
            logger.debug(f"Worker executing op: {str(op)[:100]}")
            logger.info(f"Executing op {type(op)} {str(op)[:100]}")
            logger.debug(f"Worker executing op: {type(op)} {str(op)[:100]}")
            try:
                async for event in self.execute_op(op):
                    await self.event_publisher(event)
@@ -285,6 +285,8 @@ class Worker:
                instance_id=op.instance_id,
                shard_metadata=op.shard_metadata,
                hosts=op.hosts,
                mlx_ibv_devices=op.mlx_ibv_devices,
                mlx_ibv_coordinator=op.mlx_ibv_coordinator,
                status=DownloadingRunnerStatus(
                    download_progress=DownloadPending(node_id=self.node_id)
                ),
@@ -439,6 +441,8 @@ class Worker:
            assigned_runner.runner = await RunnerSupervisor.create(
                model_shard_meta=assigned_runner.shard_metadata,
                hosts=assigned_runner.hosts,
                mlx_ibv_devices=assigned_runner.mlx_ibv_devices,
                mlx_ibv_coordinator=assigned_runner.mlx_ibv_coordinator,
                initialize_timeout=initialize_timeout,
            )


@@ -176,6 +176,8 @@ def assign_runners(
                runner_id
            ],
            hosts=instance.hosts,
            mlx_ibv_devices=instance.mlx_ibv_devices,
            mlx_ibv_coordinator=instance.mlx_ibv_coordinator,
        )
    return None


@@ -21,6 +21,7 @@ def entrypoint(raw_conn: Connection, err_path: str) -> None:
    It redirects fd=2 (stderr) to a pipe provided by the parent, *then* imports
    the heavy runner module so that any C/C++ or MLX logs/crashes land in that pipe.
    """
    # os.environ["MLX_METAL_FAST_SYNCH"] = "1"
    _redirect_stderr_to_file(err_path)
    faulthandler.enable(file=sys.stderr, all_threads=True)


@@ -1,9 +1,10 @@
import asyncio
import concurrent.futures
import functools
import time
from collections.abc import AsyncGenerator
from functools import partial
from typing import Callable, Generator, Optional, Tuple
from typing import Any, Callable, Generator, Optional, Tuple

import mlx.core as mx
from mlx.core import array
@@ -13,9 +14,9 @@ from mlx_lm.models.cache import KVCache
from exo.engines.mlx import Model, TokenizerWrapper
from exo.engines.mlx.utils_mlx import (
    apply_chat_template,
    broadcast_from_zero,
    broadcast_from_zero,  # type: ignore
    make_kv_cache,
    mx_barrier,
    mx_barrier,  # type: ignore
)
from exo.shared.types.api import ChatCompletionMessage
from exo.shared.types.tasks import ChatCompletionTaskParams
@@ -33,15 +34,35 @@ from exo.shared.types.worker.communication import (
generation_stream = mx.new_stream(mx.default_device())


def maybe_quantize_kv_cache(
    prompt_cache: list[Any],
    quantized_kv_start: int,
    kv_group_size: int,
    kv_bits: int | None,
) -> None:
    if kv_bits is None:
        return
    for e, c in enumerate(prompt_cache):  # type: ignore[type-arg]
        if hasattr(c, "to_quantized") and c.offset >= quantized_kv_start:  # type: ignore[type-arg]
            prompt_cache[e] = c.to_quantized(group_size=kv_group_size, bits=kv_bits)  # type: ignore[type-arg]


def generate_step(
    prompt: mx.array,
    model: Model,
    *,
    max_tokens: int = 256,
    sampler: Callable[[mx.array], mx.array],
    logits_processors: list[Callable[[mx.array, mx.array], mx.array]] | None = None,
    max_kv_size: Optional[int] = None,
    prompt_cache: Optional[list[KVCache]] = None,
    prefill_step_size: int = 2048,
    kv_bits: int | None = None,
    kv_group_size: int = 64,
    quantized_kv_start: int = 0,
    prompt_progress_callback: Callable[[int, int], None] | None = None,
    input_embeddings: mx.array | None = None,
    group: mx.distributed.Group | None = None,  # type: ignore[type-arg]
) -> Generator[Tuple[int, mx.array], None, None]:
    """
    A generator producing token ids based on the given prompt from the model.
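`maybe_quantize_kv_cache` swaps cache entries in place for quantized versions once their offset passes `quantized_kv_start`. A standalone sketch of that mechanic, with a fake cache class in place of mlx_lm's `KVCache`:

```python
# Sketch of in-place KV-cache quantization; FakeCache is a stand-in for
# mlx_lm's KVCache, which exposes .offset and .to_quantized().
class FakeCache:
    def __init__(self, offset: int) -> None:
        self.offset = offset

    def to_quantized(self, group_size: int, bits: int) -> str:
        return f"quantized(offset={self.offset}, g={group_size}, b={bits})"

def maybe_quantize(cache: list, start: int, group_size: int, bits: int | None) -> None:
    if bits is None:
        return  # quantization disabled
    for i, c in enumerate(cache):
        if hasattr(c, "to_quantized") and c.offset >= start:
            cache[i] = c.to_quantized(group_size=group_size, bits=bits)

cache = [FakeCache(0), FakeCache(4096)]
maybe_quantize(cache, start=2048, group_size=64, bits=4)
print(cache[0], cache[1], sep="\n")  # first entry untouched, second quantized
```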
@@ -51,85 +72,159 @@ def generate_step(
        model (Model): The model to use for generation.
        max_tokens (int): The maximum number of tokens. Use ``-1`` for an infinite
            generator. Default: ``256``.
        sampler (Callable[mx.array, mx.array], optional): A sampler for sampling a
            token from a vector of log probabilities. Default: ``None``.
        sampler (Callable[mx.array, mx.array]): A sampler for sampling a
            token from a vector of log probabilities.
        logits_processors (List[Callable[[mx.array, mx.array], mx.array]], optional):
            A list of functions that take tokens and logits and return the processed
            logits. Default: ``None``.
        max_kv_size (int, optional): Maximum size of the key-value cache. Old
            entries (except the first 4 tokens) will be overwritten.
        prompt_cache (List[Any], optional): A pre-computed prompt cache. Note, if
            provided, the cache will be updated in place.
        prefill_step_size (int): Step size for processing the prompt.
        kv_bits (int, optional): Number of bits to use for KV cache quantization.
            None implies no cache quantization. Default: ``None``.
        kv_group_size (int): Group size for KV cache quantization. Default: ``64``.
        quantized_kv_start (int): Step to begin using a quantized KV cache
            when ``kv_bits`` is non-None. Default: ``0``.
        prompt_progress_callback (Callable[[int, int], None]): A callback which takes the
            prompt tokens processed so far and the total number of prompt tokens.
        input_embeddings (mx.array, optional): Input embeddings to use instead of or in
            conjunction with prompt tokens. Default: ``None``.

    Yields:
        Tuple[int, mx.array]: One token and a vector of log probabilities.
    """
    if input_embeddings is not None:
        if len(prompt) > 0 and len(prompt) != len(input_embeddings):
            raise ValueError(
                f"When providing input_embeddings, their sequence length ({len(input_embeddings)}) "
                f"must match the sequence length of the prompt ({len(prompt)}), or the "
                "prompt must be empty."
            )
    elif len(prompt) == 0:
        raise ValueError(
            "Either input_embeddings or prompt (or both) must be provided."
        )

    tokens = None

    # Create the KV cache for generation
    if prompt_cache is None:
        prompt_cache = cache.make_prompt_cache(
            model,
            max_kv_size=max_kv_size,
        )

    def _step(input_tokens: mx.array):
    prompt_progress_callback = prompt_progress_callback or (lambda *_: None)  # type: ignore[type-arg]

    quantize_cache_fn = functools.partial(
        maybe_quantize_kv_cache,
        quantized_kv_start=quantized_kv_start,
        kv_group_size=kv_group_size,
        kv_bits=kv_bits,
    )

    def _model_call(
        input_tokens: mx.array, input_embeddings: mx.array | None
    ) -> mx.array:
        if input_embeddings is not None:
            return model(  # type: ignore[type-arg]
                input_tokens,
                cache=prompt_cache,
                input_embeddings=input_embeddings,  # type: ignore[type-arg]
            )
        else:
            return model(input_tokens, cache=prompt_cache)

    def _step(
        input_tokens: mx.array, input_embeddings: mx.array | None = None
    ) -> tuple[mx.array, mx.array]:
        nonlocal tokens

        with mx.stream(generation_stream):
            logits = model(
                input_tokens[None],
                cache=prompt_cache,
            logits = _model_call(
                input_tokens=input_tokens[None],
                input_embeddings=(
                    input_embeddings[None] if input_embeddings is not None else None
                ),
            )

            logits = logits[:, -1, :]

            if logits_processors and len(input_tokens) > 0:
                tokens = (
                    mx.concat([tokens, input_tokens])
                    if tokens is not None
                    else input_tokens
                )
                for processor in logits_processors:
                    logits = processor(tokens, logits)

            quantize_cache_fn(prompt_cache)

            logprobs = logits - mx.logsumexp(logits, keepdims=True)
            sampled = sampler(logprobs)
            return sampled, logprobs.squeeze(0)

    with mx.stream(generation_stream):
        total_prompt_tokens = len(prompt)
        total_prompt_tokens = (
            len(input_embeddings) if input_embeddings is not None else len(prompt)
        )
        prompt_processed_tokens = 0
        prompt_progress_callback(prompt_processed_tokens, total_prompt_tokens)

        while total_prompt_tokens - prompt_processed_tokens > prefill_step_size:
            runner_print(
                f"Prefilling {min(prefill_step_size, len(prompt))} tokens. Remaining tokens: {len(prompt)}. Peak memory: {mx.get_peak_memory() // 2**30} GB"
            )
            logits = model(prompt[:prefill_step_size][None], cache=prompt_cache)
            n_to_process = min(prefill_step_size, prompt.size)
            _model_call(
                input_tokens=prompt[:n_to_process][None],
                input_embeddings=(
                    input_embeddings[:n_to_process][None]
                    if input_embeddings is not None
                    else None
                ),
            )
            quantize_cache_fn(prompt_cache)

            start_time = time.time()
            mx.eval([c.state for c in prompt_cache] + [logits])  # type: ignore
            mx.eval([c.state for c in prompt_cache])  # type: ignore
            eval_time = time.time() - start_time
            prompt_processed_tokens += prefill_step_size
            prompt_processed_tokens += n_to_process

            prompt = prompt[prefill_step_size:]
            prompt = prompt[n_to_process:]
            input_embeddings = (
                input_embeddings[n_to_process:]
                if input_embeddings is not None
                else input_embeddings
            )

            mx.clear_cache()
            if eval_time > 7.0:
                prefill_step_size = prefill_step_size // 2
            prefill_step_size = broadcast_from_zero(prefill_step_size)
            if group is not None:
                prefill_step_size = broadcast_from_zero(prefill_step_size)
            prefill_step_size = max(1, prefill_step_size)
            prompt_progress_callback(prompt_processed_tokens, total_prompt_tokens)

        if prompt_processed_tokens > 0:
            runner_print("finished prefill stage.")

        y, logprobs = _step(input_tokens=prompt)
        y, logprobs = _step(input_tokens=prompt, input_embeddings=input_embeddings)

    # TODO: Why on earth is this async_eval called twice?
    # Also why is it async_eval not eval?
    mx.async_eval(y, logprobs)  # type: ignore
    n = 0
    mx.async_eval(y, logprobs)  # type: ignore[type-arg]
    next_y: array | None = None
    next_logprobs: array | None = None

    mx.async_eval(y, logprobs)  # type: ignore
    n = 0
    while True:
        if n != max_tokens:
            assert y is not None
            next_y, next_logprobs = _step(y)
            mx.async_eval(next_y, next_logprobs)  # type: ignore
            mx.async_eval(next_y, next_logprobs)  # type: ignore[type-arg]
        if n == 0:
            mx.eval(y)  # type: ignore
            mx.eval(y)  # type: ignore[type-arg]
            prompt_progress_callback(total_prompt_tokens, total_prompt_tokens)
        if n == max_tokens:
            break
        yield int(y.item()), logprobs  # type: ignore
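The prefill loop above consumes the prompt in `prefill_step_size` chunks and halves the step (floored at 1) whenever a chunk evaluates slowly; with a distributed group, rank 0 broadcasts the new step so every rank keeps identical chunking. A sketch of just the chunking arithmetic, with an invented "slow chunk" in place of the real `eval_time > 7.0` measurement:

```python
# Sketch of adaptive prefill chunking; the slow_chunk trigger is made up.
def prefill_steps(total: int, step: int, slow_chunk: int) -> list[int]:
    chunks: list[int] = []
    processed = 0
    while total - processed > step:
        n = min(step, total - processed)
        chunks.append(n)
        processed += n
        if len(chunks) == slow_chunk:  # pretend this chunk's eval_time > 7.0
            step = max(1, step // 2)
    return chunks

# 10k-token prompt, 2048-token steps, second chunk evaluates slowly:
print(prefill_steps(10_000, 2048, slow_chunk=2))
# [2048, 2048, 1024, 1024, 1024, 1024, 1024]; the 784-token tail is left
# for the final _step call, as in generate_step above.
```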
@@ -146,8 +241,16 @@ def stream_generate(
|
||||
max_tokens: int,
|
||||
sampler: Callable[[mx.array], mx.array],
|
||||
conn: AsyncConnection[RunnerResponse, RunnerMessage] | None,
|
||||
logits_processors: list[Callable[[mx.array, mx.array], mx.array]] | None = None,
|
||||
max_kv_size: int | None = None,
|
||||
prompt_cache: Optional[list[KVCache]] = None,
|
||||
prefill_step_size: int = 2048,
|
||||
kv_bits: int | None = None,
|
||||
kv_group_size: int = 64,
|
||||
quantized_kv_start: int = 0,
|
||||
prompt_progress_callback: Callable[[int, int], None] | None = None,
|
||||
input_embeddings: mx.array | None = None,
|
||||
group: mx.distributed.Group | None = None, # type: ignore[type-arg]
|
||||
) -> Generator[GenerationResponse, None, None]:
|
||||
# Try to infer if special tokens are needed
|
||||
add_special_tokens = tokenizer.bos_token is None or not prompt.startswith(
|
||||
@@ -166,8 +269,16 @@ def stream_generate(
|
||||
model,
|
||||
max_tokens=max_tokens,
|
||||
sampler=sampler,
|
||||
logits_processors=logits_processors,
|
||||
max_kv_size=max_kv_size,
|
||||
prompt_cache=prompt_cache,
|
||||
prefill_step_size=prefill_step_size,
|
||||
kv_bits=kv_bits,
|
||||
kv_group_size=kv_group_size,
|
||||
quantized_kv_start=quantized_kv_start,
|
||||
prompt_progress_callback=prompt_progress_callback,
|
||||
input_embeddings=input_embeddings,
|
||||
group=group,
|
||||
)
|
||||
|
||||
token = None
|
||||
@@ -199,6 +310,7 @@ async def warmup_inference(
|
||||
model: Model,
|
||||
tokenizer: TokenizerWrapper,
|
||||
sampler: Callable[[mx.array], mx.array],
|
||||
group: mx.distributed.Group | None = None, # type: ignore
|
||||
) -> int:
|
||||
loop = asyncio.get_running_loop()
|
||||
|
||||
@@ -220,18 +332,21 @@ async def warmup_inference(
|
||||
|
||||
def _generate_warmup():
|
||||
nonlocal tokens_generated
|
||||
for token in stream_generate(
|
||||
runner_print("Generating warmup tokens")
|
||||
for _r in stream_generate(
|
||||
model=model,
|
||||
tokenizer=tokenizer,
|
||||
prompt=warmup_prompt,
|
||||
max_tokens=50,
|
||||
sampler=sampler,
|
||||
conn=None,
|
||||
group=group,
|
||||
):
|
||||
runner_print("Generated warmup token: " + str(token.text))
|
||||
runner_print("Generated warmup token: " + str(_r.text))
|
||||
tokens_generated += 1
|
||||
|
||||
await loop.run_in_executor(mlx_executor, _generate_warmup)
|
||||
runner_print("Generated ALL warmup tokens")
|
||||
mx_barrier()
|
||||
|
||||
return tokens_generated
|
||||
|
||||
@@ -7,7 +7,6 @@ from multiprocessing.connection import Connection
from exo.engines.mlx.utils_mlx import (
    initialize_mlx,
    mlx_force_oom,
    mlx_setup,
)
from exo.shared.global_conn import set_conn
from exo.shared.types.worker.commands_runner import (
@@ -26,8 +25,7 @@ from exo.shared.types.worker.communication import (
)
from exo.shared.types.worker.shards import ShardMetadata
from exo.utils import ensure_type
from exo.worker.runner.generate import mlx_generate, warmup_inference
from exo.worker.runner.utils import get_weights_size
from exo.worker.runner.generate import mlx_generate, warmup_inference  # type: ignore


async def main(raw_conn: Connection):
@@ -40,33 +38,39 @@ async def main(raw_conn: Connection):
    setup_message = ensure_type(init_message, SetupMessage)
    model_shard_meta: ShardMetadata = setup_message.model_shard_meta
    hosts = setup_message.hosts
    mlx_ibv_devices = setup_message.mlx_ibv_devices
    mlx_ibv_coordinator = setup_message.mlx_ibv_coordinator

    if getattr(model_shard_meta, "immediate_exception", False):
        raise Exception("Fake exception - runner failed to spin up.")
    if timeout := getattr(model_shard_meta, "should_timeout", 0):
        await asyncio.sleep(timeout)

    mlx_setup(
        int(get_weights_size(model_shard_meta).in_kb // 2**10),
        cache_frac_of_mrwss=0.8,
        wired_frac_of_mrwss=0.8,
    )

    setup_start_time = time.time()

    mlx_executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    loop = asyncio.get_running_loop()

    model, tokenizer, sampler = await loop.run_in_executor(
    model, tokenizer, sampler, group = await loop.run_in_executor(  # type: ignore[type-arg]
        mlx_executor,
        partial(initialize_mlx, model_shard_meta=model_shard_meta, hosts=hosts),
        partial(
            initialize_mlx,
            model_shard_meta=model_shard_meta,
            hosts=hosts,
            mlx_ibv_devices=mlx_ibv_devices,
            mlx_ibv_coordinator=mlx_ibv_coordinator,
        ),
    )

    runner_print(
        f"Warming up inference for model_shard_meta: {model_shard_meta} hosts: {hosts}"
    )
    toks = await warmup_inference(
        mlx_executor=mlx_executor,
        model=model,
        tokenizer=tokenizer,
        sampler=sampler,
        group=group,  # type: ignore[type-arg]
    )
    runner_print(f"Warmed up by generating {toks} tokens")
    await conn.send(InitializedResponse(time_taken=time.time() - setup_start_time))

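For readers following the new arguments: `mlx_ibv_devices` and `mlx_ibv_coordinator` flow from `SetupMessage` straight into `initialize_mlx`. The values below only illustrate the declared types (`list[list[str | None]] | None` and `str | None`); the device names and the coordinator address format are assumptions, not values taken from exo:

```python
# Hypothetical per-host connectivity matrix: entry [i][j] names the
# InfiniBand device host i uses to reach host j, None where no direct link.
mlx_ibv_devices: list[list[str | None]] | None = [
    [None, "ibv0"],
    ["ibv0", None],
]
mlx_ibv_coordinator: str | None = "10.0.0.1:5000"  # assumed host:port format
```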
@@ -34,18 +34,21 @@ from exo.shared.types.worker.common import RunnerError
from exo.shared.types.worker.shards import ShardMetadata
from exo.worker.runner.bootstrap import entrypoint
from exo.worker.runner.utils import (
    get_init_timeout,
    get_prefil_timeout,
    get_token_generate_timeout,
    get_weights_size,
)

INITIALIZE_TIMEOUT = 400
PREFILL_TIMEOUT_SECONDS = 60
DECODE_TIMEOUT_SECONDS = 5


class RunnerSupervisor:
    def __init__(
        self,
        model_shard_meta: ShardMetadata,
        hosts: list[Host],
        hosts: list[Host] | None,
        mlx_ibv_devices: list[list[str | None]] | None,
        mlx_ibv_coordinator: str | None,
        runner_process: Process,
        conn: Connection,
        read_queue: asyncio.Queue[RunnerResponse],
@@ -53,6 +56,8 @@ class RunnerSupervisor:
    ):
        self.model_shard_meta = model_shard_meta
        self.hosts = hosts
        self.mlx_ibv_devices = mlx_ibv_devices
        self.mlx_ibv_coordinator = mlx_ibv_coordinator
        self.runner_process = runner_process

        self.conn = AsyncConnection[RunnerMessage, RunnerResponse](conn)
@@ -67,7 +72,9 @@ class RunnerSupervisor:
    async def create(
        cls,
        model_shard_meta: ShardMetadata,
        hosts: list[Host],
        hosts: list[Host] | None = None,
        mlx_ibv_devices: list[list[str | None]] | None = None,
        mlx_ibv_coordinator: str | None = None,
        initialize_timeout: Optional[float] = None,
    ) -> "RunnerSupervisor":
        """
@@ -93,6 +100,8 @@ class RunnerSupervisor:
        self = cls(
            model_shard_meta=model_shard_meta,
            hosts=hosts,
            mlx_ibv_devices=mlx_ibv_devices,
            mlx_ibv_coordinator=mlx_ibv_coordinator,
            runner_process=runner_process,
            read_queue=read_queue,
            conn=parent_conn,
@@ -104,12 +113,12 @@ class RunnerSupervisor:
            SetupMessage(
                model_shard_meta=model_shard_meta,
                hosts=hosts,
                mlx_ibv_devices=mlx_ibv_devices,
                mlx_ibv_coordinator=mlx_ibv_coordinator,
            )
        )

        if not initialize_timeout:
            initialize_timeout = get_init_timeout(model_shard_meta)

        initialize_timeout = initialize_timeout or INITIALIZE_TIMEOUT
        response = await self._read_with_error_check(timeout=initialize_timeout)

        assert isinstance(response, InitializedResponse)
@@ -206,17 +215,13 @@ class RunnerSupervisor:

        response = await self._read_with_error_check(5.0)
        assert isinstance(response, TokenizedResponse)
        prompt_tokens = response.prompt_tokens

        if request_started_callback is not None:
            await request_started_callback()

        prefil_timeout = get_prefil_timeout(
            self.model_shard_meta, prompt_tokens=prompt_tokens
        )
        token_timeout = get_token_generate_timeout(self.model_shard_meta)
        timeout = prefil_timeout
        logger.bind(user_facing=True).info(
        timeout = PREFILL_TIMEOUT_SECONDS

        logger.info(
            f"Starting chat completion with timeout {timeout}"
        )

@@ -224,8 +229,8 @@ class RunnerSupervisor:
        try:
            response = await self._read_with_error_check(timeout)
        except asyncio.TimeoutError as e:
            logger.bind(user_facing=True).error(
                f"Generation timed out during {'prefil' if timeout == prefil_timeout else 'decoding stage'}"
            logger.error(
                f"Generation timed out during {'prefill' if timeout == PREFILL_TIMEOUT_SECONDS else 'decoding stage'}"
            )
            raise e

@@ -239,7 +244,7 @@ class RunnerSupervisor:
                    token_id=response.token,
                    finish_reason=response.finish_reason,
                )
                timeout = token_timeout
                timeout = DECODE_TIMEOUT_SECONDS
            case FinishedResponse():
                break
            case _:
@@ -322,7 +327,7 @@ class RunnerSupervisor:
        except Exception:
            cause = f"signal={sig}"

        logger.bind(user_facing=True).error(f"Runner terminated ({cause}).\n{captured}")
        logger.error(f"Runner terminated ({cause}).\n{captured}")

        return RunnerError(
            error_type="RunnerCrash",

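The supervisor now uses fixed constants instead of the per-shard heuristics: a generous budget while waiting for the first token (prefill) and a short one for every token after it. A condensed sketch of that timeout switch, with an `asyncio.Queue` standing in for the runner connection:

```python
import asyncio

PREFILL_TIMEOUT_SECONDS = 60
DECODE_TIMEOUT_SECONDS = 5


async def read_tokens(queue: "asyncio.Queue[str | None]") -> list[str]:
    tokens: list[str] = []
    timeout = PREFILL_TIMEOUT_SECONDS  # first read covers prefill
    while True:
        item = await asyncio.wait_for(queue.get(), timeout)
        if item is None:  # plays the role of FinishedResponse
            break
        tokens.append(item)
        timeout = DECODE_TIMEOUT_SECONDS  # later reads only wait one decode step
    return tokens
```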
@@ -5,7 +5,6 @@ import sys
import psutil
from loguru import logger

from exo.shared.constants import LB_DISK_GBPS, LB_MEMBW_GBPS, LB_TFLOPS
from exo.shared.types.memory import Memory
from exo.shared.types.worker.shards import ShardMetadata

@@ -57,48 +56,9 @@ def get_weights_size(model_shard_meta: ShardMetadata) -> Memory:
        (model_shard_meta.end_layer - model_shard_meta.start_layer)
        / model_shard_meta.n_layers
        * model_shard_meta.model_meta.storage_size.in_kb
        / (
            1
            if model_shard_meta.strategy in ["auto", "pipeline", "pipeline_rdma"]
            else model_shard_meta.world_size
        )
    )


def get_init_timeout(model_shard_meta: ShardMetadata) -> float:
    weights_size = get_weights_size(model_shard_meta)

    kbps_read = 1024 * 1024 * LB_DISK_GBPS / 3

    return weights_size.in_kb / kbps_read + 30.0


def _prefill_flops_for_shard(model_shard_meta: ShardMetadata, s: int) -> float:
    p = get_weights_size(model_shard_meta).in_bytes
    flops = 2.0 * p * s  # parameter-dependent GEMMs
    # flops += _attention_flops(meta, S)  # optional S^2 term
    return flops


def get_prefil_timeout(
    model_shard_meta: ShardMetadata,
    prompt_tokens: int,
    *,
    effective_tflops: float = LB_TFLOPS,
    safety_mult: float = 1.6,
    base_pad_s: float = 5.0,
) -> float:
    """
    Returns a conservative timeout (seconds) for the prefill stage.
    """
    total_flops = _prefill_flops_for_shard(model_shard_meta, prompt_tokens)

    # Convert to seconds using sustained throughput
    time_seconds = total_flops / (effective_tflops * 1e12)

    # Prefill across pipeline stages is largely sequential; summing FLOPs already accounts for it.
    # Add a base pad (launch/IO) and a safety multiplier for variance.
    return base_pad_s + safety_mult * time_seconds


def get_token_generate_timeout(model_shard_meta: ShardMetadata) -> float:
    weights_size = get_weights_size(model_shard_meta)

    kbps_read = 1024 * 1024 * LB_MEMBW_GBPS / 3

    return weights_size.in_kb / kbps_read + 2.0

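For context on what this hunk deletes: the removed prefill heuristic priced the stage at roughly two FLOPs per weight byte per prompt token, converted that to seconds at a sustained throughput, then padded it. A worked example with assumed numbers (the real `LB_TFLOPS` lives in `exo.shared.constants`; 20 TFLOPS here is only illustrative):

```python
weights_bytes = 10 * 10**9    # assume a 10 GB shard
prompt_tokens = 1_000
effective_tflops = 20.0       # assumed sustained throughput

total_flops = 2.0 * weights_bytes * prompt_tokens       # 2.0e13 FLOPs
time_seconds = total_flops / (effective_tflops * 1e12)  # 1.0 s
timeout = 5.0 + 1.6 * time_seconds                      # 6.6 s budget
print(f"prefill timeout: {timeout:.1f}s")
```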
@@ -1,7 +1,6 @@
import asyncio
import re
import sys
from typing import Dict, List, Optional

from loguru import logger
from pydantic import BaseModel, Field
@@ -72,20 +71,16 @@ async def get_mac_friendly_name_async() -> str | None:
    return None


async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
async def get_network_interface_info_async() -> list[NetworkInterfaceInfo]:
    """
    Retrieves detailed network interface information on macOS.
    Parses output from 'networksetup -listallhardwareports' and 'ifconfig'
    to determine interface names, IP addresses, and types (ethernet, wifi, vpn, other).
    Returns a list of NetworkInterfaceInfo objects.
    """
    if sys.platform != "darwin":
        return []
    interfaces_info: list[NetworkInterfaceInfo] = []

    interfaces_info: List[NetworkInterfaceInfo] = []
    device_to_type_map: Dict[str, str] = {}

    async def _run_cmd_async(command_parts: List[str]) -> Optional[str]:
    async def _run_cmd_async(command_parts: list[str]) -> str | None:
        # Helper to run a command and return its stdout, or None on error.
        try:
            process = await asyncio.create_subprocess_exec(
@@ -118,37 +113,9 @@ async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
            )
            return None

    # 1. Get hardware port types from networksetup
    networksetup_output = await _run_cmd_async(
        ["networksetup", "-listallhardwareports"]
    )
    if networksetup_output:
        current_hardware_port_type_raw: Optional[str] = None
        for line in networksetup_output.splitlines():
            line_stripped = line.strip()
            if line_stripped.startswith("Hardware Port:"):
                current_hardware_port_type_raw = line_stripped.split(":", 1)[1].strip()
            elif line_stripped.startswith("Device:") and current_hardware_port_type_raw:
                device_name = line_stripped.split(":", 1)[1].strip()
                if device_name and device_name != "N/A":
                    if "Thunderbolt" in current_hardware_port_type_raw:
                        device_to_type_map[device_name] = "thunderbolt"
                    elif (
                        "Wi-Fi" in current_hardware_port_type_raw
                        or "AirPort" in current_hardware_port_type_raw
                    ):
                        device_to_type_map[device_name] = "wifi"
                    elif (
                        "Ethernet" in current_hardware_port_type_raw
                        or "LAN" in current_hardware_port_type_raw
                    ):
                        device_to_type_map[device_name] = "ethernet"
                current_hardware_port_type_raw = None  # Reset for the next block

    # 2. Get interface names and IP addresses from ifconfig
    # Get interface names and IP addresses from ifconfig
    ifconfig_output = await _run_cmd_async(["ifconfig"])
    if ifconfig_output:
        current_if_name: Optional[str] = None
        # Regex for interface name (e.g., en0:, utun0:, tailscale0.)
        interface_header_pattern = re.compile(r"^([a-zA-Z0-9\._-]+):")
        # Regex for IPv4 address (inet)
@@ -156,44 +123,30 @@ async def get_network_interface_info_async() -> List[NetworkInterfaceInfo]:
        # Regex for IPv6 address (inet6)
        inet6_pattern = re.compile(r"^\s+inet6\s+([0-9a-fA-F:]+(?:%[a-zA-Z0-9._-]+)?)")

        def _add_interface_entry(if_name: str, ip_addr: str):
            _if_type = device_to_type_map.get(if_name)
            if not _if_type:  # Infer type if not found via networksetup
                if if_name.startswith(("utun", "wg", "ppp")) or "tailscale" in if_name:
                    _if_type = "vpn"
                elif if_name.startswith("bridge"):
                    _if_type = "virtual"  # For non-Thunderbolt bridges (e.g., Docker)
                else:
                    _if_type = "other"

            interfaces_info.append(
                NetworkInterfaceInfo(name=if_name, ip_address=ip_addr, type=_if_type)
            )

        current_if_name: str | None = None
        for line in ifconfig_output.splitlines():
            header_match = interface_header_pattern.match(line)
            if header_match:
                potential_if_name = header_match.group(1)
                if potential_if_name == "lo0":  # Skip loopback interface
                    current_if_name = None
                else:
                    current_if_name = potential_if_name
                continue
                current_if_name = header_match.group(1)

            if current_if_name:
                inet_m = inet_pattern.match(line)
                if inet_m:
                    ipv4_address = inet_m.group(1)
                    _add_interface_entry(
                        current_if_name, ipv4_address
                    )  # Add all IPv4, including APIPA
                    continue
                    interfaces_info.append(
                        NetworkInterfaceInfo(
                            name=current_if_name, ip_address=ipv4_address, type=""
                        )
                    )

                inet6_m = inet6_pattern.match(line)
                if inet6_m:
                    ipv6_address = inet6_m.group(1)
                    # No specific filtering for IPv6 link-local (e.g., fe80::) for now.
                    _add_interface_entry(current_if_name, ipv6_address)
                    interfaces_info.append(
                        NetworkInterfaceInfo(
                            name=current_if_name, ip_address=ipv6_address, type=""
                        )
                    )

    return interfaces_info

@@ -203,7 +156,7 @@ async def get_mac_system_info_async() -> SystemInfo:
    model_id_val = "Unknown Model"
    chip_id_val = "Unknown Chip"
    memory_val = 0
    network_interfaces_info_list: List[NetworkInterfaceInfo] = []
    network_interfaces_info_list: list[NetworkInterfaceInfo] = []

    if sys.platform != "darwin":
        return SystemInfo(

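The rewritten parser leans on line-anchored regexes: one matching an interface header (`en0:`) and one per address family. A small sketch exercising the header pattern from the diff on sample `ifconfig` text (the sample output and the IPv4 pattern are illustrative; the diff's actual `inet` regex is cut off by the hunk boundary):

```python
import re

interface_header_pattern = re.compile(r"^([a-zA-Z0-9\._-]+):")
inet_pattern = re.compile(r"^\s+inet\s+(\d+\.\d+\.\d+\.\d+)")  # assumed form

sample = (
    "en0: flags=8863<UP,BROADCAST,SMART,RUNNING> mtu 1500\n"
    "\tinet 192.168.1.10 netmask 0xffffff00 broadcast 192.168.1.255\n"
)

current_if = None
for line in sample.splitlines():
    if (m := interface_header_pattern.match(line)):
        current_if = m.group(1)
    elif current_if and (m := inet_pattern.match(line)):
        print(current_if, m.group(1))  # -> en0 192.168.1.10
```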
24
tmp/run_llm.sh
Executable file
@@ -0,0 +1,24 @@
#!/usr/bin/env bash
set -euo pipefail

if [ $# -lt 2 ]; then
  echo "Usage: $0 <hostname> <query>"
  exit 1
fi

HOST="$1"
shift
QUERY="$*"

curl -sN -X POST "http://$HOST:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"mlx-community/DeepSeek-V3.1-8bit\",
    \"stream\": true,
    \"messages\": [{ \"role\": \"user\", \"content\": \"$QUERY\" }]
  }" |
  grep --line-buffered '^data:' |
  grep --line-buffered -v 'data: \[DONE\]' |
  cut -d' ' -f2- |
  jq -r --unbuffered '.choices[].delta.content // empty' |
  awk '{ORS=""; print; fflush()} END {print "\n"}'
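A quick way to try the new helper (hostname and prompt are placeholders); it streams the completion to stdout as plain text, one SSE chunk at a time:

```bash
./tmp/run_llm.sh my-mac.local "Write a haiku about distributed inference"
```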
184
uv.lock
generated
184
uv.lock
generated
@@ -1,5 +1,5 @@
|
||||
version = 1
|
||||
revision = 2
|
||||
revision = 3
|
||||
requires-python = ">=3.13"
|
||||
resolution-markers = [
|
||||
"sys_platform == 'darwin'",
|
||||
@@ -391,8 +391,8 @@ requires-dist = [
|
||||
{ name = "greenlet", specifier = ">=3.2.4" },
|
||||
{ name = "huggingface-hub", specifier = ">=0.33.4" },
|
||||
{ name = "loguru", specifier = ">=0.7.3" },
|
||||
{ name = "mlx", specifier = "==0.29.3" },
|
||||
{ name = "mlx-lm", specifier = "==0.28.3" },
|
||||
{ name = "mlx", specifier = ">=0.29.3" },
|
||||
{ name = "mlx-lm", specifier = ">=0.28.3" },
|
||||
{ name = "networkx", specifier = ">=3.5" },
|
||||
{ name = "openai", specifier = ">=1.99.9" },
|
||||
{ name = "pathlib", specifier = ">=1.0.1" },
|
||||
@@ -455,7 +455,7 @@ requires-dist = [
|
||||
|
||||
[[package]]
|
||||
name = "fastapi"
|
||||
version = "0.120.3"
|
||||
version = "0.121.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "annotated-doc", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
@@ -463,9 +463,9 @@ dependencies = [
|
||||
{ name = "starlette", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/85/c6/f324c07f5ebe34237b56b6396a94568d2d4a705df8a2ff82fa45029e7252/fastapi-0.120.3.tar.gz", hash = "sha256:17db50718ee86c9e01e54f9d8600abf130f6f762711cd0d8f02eb392668271ba", size = 339363, upload-time = "2025-10-30T20:41:33.072Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/8c/e3/77a2df0946703973b9905fd0cde6172c15e0781984320123b4f5079e7113/fastapi-0.121.0.tar.gz", hash = "sha256:06663356a0b1ee93e875bbf05a31fb22314f5bed455afaaad2b2dad7f26e98fa", size = 342412, upload-time = "2025-11-03T10:25:54.818Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/37/3a/1eef3ab55ede5af09186723898545a94d0a32b7ac9ea4e7af7bcb95f132a/fastapi-0.120.3-py3-none-any.whl", hash = "sha256:bfee21c98db9128dc425a686eafd14899e26e4471aab33076bff2427fd6dcd22", size = 108255, upload-time = "2025-10-30T20:41:31.247Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/dd/2c/42277afc1ba1a18f8358561eee40785d27becab8f80a1f945c0a3051c6eb/fastapi-0.121.0-py3-none-any.whl", hash = "sha256:8bdf1b15a55f4e4b0d6201033da9109ea15632cb76cf156e7b8b4019f2172106", size = 109183, upload-time = "2025-11-03T10:25:53.27Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -981,7 +981,7 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "openai"
|
||||
version = "2.6.1"
|
||||
version = "2.7.0"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
dependencies = [
|
||||
{ name = "anyio", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
@@ -993,9 +993,9 @@ dependencies = [
|
||||
{ name = "tqdm", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
{ name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
|
||||
]
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/c4/44/303deb97be7c1c9b53118b52825cbd1557aeeff510f3a52566b1fa66f6a2/openai-2.6.1.tar.gz", hash = "sha256:27ae704d190615fca0c0fc2b796a38f8b5879645a3a52c9c453b23f97141bb49", size = 593043, upload-time = "2025-10-24T13:29:52.79Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/84/2c/3ca91dbd1a5b80c20fbd1e21d601f6afd7fd51927a1b27b08226b67ebd61/openai-2.7.0.tar.gz", hash = "sha256:8c42c24d06afece19e69afcb6c2b23b8b90f603a81616d8a0be80b80fb527ed2", size = 595876, upload-time = "2025-11-03T23:52:07.935Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/15/0e/331df43df633e6105ff9cf45e0ce57762bd126a45ac16b25a43f6738d8a2/openai-2.6.1-py3-none-any.whl", hash = "sha256:904e4b5254a8416746a2f05649594fa41b19d799843cd134dac86167e094edef", size = 1005551, upload-time = "2025-10-24T13:29:50.973Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/fc/0f/e9618a92a9497846a3071f2a7ed43409215947106c7e5ce7d082f784de10/openai-2.7.0-py3-none-any.whl", hash = "sha256:9fc44861a692b7e80a7ec1252c10af79612a3ef1581ecb192caf4585afca5363", size = 1008759, upload-time = "2025-11-03T23:52:05.322Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -1106,22 +1106,22 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "psutil"
|
||||
version = "7.1.2"
|
||||
version = "7.1.3"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/cd/ec/7b8e6b9b1d22708138630ef34c53ab2b61032c04f16adfdbb96791c8c70c/psutil-7.1.2.tar.gz", hash = "sha256:aa225cdde1335ff9684708ee8c72650f6598d5ed2114b9a7c5802030b1785018", size = 487424, upload-time = "2025-10-25T10:46:34.931Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/e1/88/bdd0a41e5857d5d703287598cbf08dad90aed56774ea52ae071bae9071b6/psutil-7.1.3.tar.gz", hash = "sha256:6c86281738d77335af7aec228328e944b30930899ea760ecf33a4dba66be5e74", size = 489059, upload-time = "2025-11-02T12:25:54.619Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/b8/d9/b56cc9f883140ac10021a8c9b0f4e16eed1ba675c22513cdcbce3ba64014/psutil-7.1.2-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0cc5c6889b9871f231ed5455a9a02149e388fffcb30b607fb7a8896a6d95f22e", size = 238575, upload-time = "2025-10-25T10:46:38.728Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/36/eb/28d22de383888deb252c818622196e709da98816e296ef95afda33f1c0a2/psutil-7.1.2-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:8e9e77a977208d84aa363a4a12e0f72189d58bbf4e46b49aae29a2c6e93ef206", size = 239297, upload-time = "2025-10-25T10:46:41.347Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/89/5d/220039e2f28cc129626e54d63892ab05c0d56a29818bfe7268dcb5008932/psutil-7.1.2-cp313-cp313t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7d9623a5e4164d2220ecceb071f4b333b3c78866141e8887c072129185f41278", size = 280420, upload-time = "2025-10-25T10:46:44.122Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ba/7a/286f0e1c167445b2ef4a6cbdfc8c59fdb45a5a493788950cf8467201dc73/psutil-7.1.2-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:364b1c10fe4ed59c89ec49e5f1a70da353b27986fa8233b4b999df4742a5ee2f", size = 283049, upload-time = "2025-10-25T10:46:47.095Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/56/9e/f1c5c746b4ed5320952acd3002d3962fe36f30524c00ea79fdf954cc6779/psutil-7.1.2-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:e09cfe92aa8e22b1ec5e2d394820cf86c5dff6367ac3242366485dfa874d43bc", size = 238640, upload-time = "2025-10-25T10:46:54.089Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/32/ee/fd26216a735395cc25c3899634e34aeb41fb1f3dbb44acc67d9e594be562/psutil-7.1.2-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:fa6342cf859c48b19df3e4aa170e4cfb64aadc50b11e06bb569c6c777b089c9e", size = 239303, upload-time = "2025-10-25T10:46:56.932Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3c/cd/7d96eaec4ef7742b845a9ce2759a2769ecce4ab7a99133da24abacbc9e41/psutil-7.1.2-cp314-cp314t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:625977443498ee7d6c1e63e93bacca893fd759a66c5f635d05e05811d23fb5ee", size = 281717, upload-time = "2025-10-25T10:46:59.116Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bc/1a/7f0b84bdb067d35fe7fade5fff888408688caf989806ce2d6dae08c72dd5/psutil-7.1.2-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:4a24bcd7b7f2918d934af0fb91859f621b873d6aa81267575e3655cd387572a7", size = 284575, upload-time = "2025-10-25T10:47:00.944Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ae/89/b9f8d47ddbc52d7301fc868e8224e5f44ed3c7f55e6d0f54ecaf5dd9ff5e/psutil-7.1.2-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:c9ba5c19f2d46203ee8c152c7b01df6eec87d883cfd8ee1af2ef2727f6b0f814", size = 237244, upload-time = "2025-10-25T10:47:07.086Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c8/7a/8628c2f6b240680a67d73d8742bb9ff39b1820a693740e43096d5dcb01e5/psutil-7.1.2-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:2a486030d2fe81bec023f703d3d155f4823a10a47c36784c84f1cc7f8d39bedb", size = 238101, upload-time = "2025-10-25T10:47:09.523Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/30/28/5e27f4d5a0e347f8e3cc16cd7d35533dbce086c95807f1f0e9cd77e26c10/psutil-7.1.2-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3efd8fc791492e7808a51cb2b94889db7578bfaea22df931424f874468e389e3", size = 258675, upload-time = "2025-10-25T10:47:11.082Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e5/5c/79cf60c9acf36d087f0db0f82066fca4a780e97e5b3a2e4c38209c03d170/psutil-7.1.2-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:e2aeb9b64f481b8eabfc633bd39e0016d4d8bbcd590d984af764d80bf0851b8a", size = 260203, upload-time = "2025-10-25T10:47:13.226Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bd/93/0c49e776b8734fef56ec9c5c57f923922f2cf0497d62e0f419465f28f3d0/psutil-7.1.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:0005da714eee687b4b8decd3d6cc7c6db36215c9e74e5ad2264b90c3df7d92dc", size = 239751, upload-time = "2025-11-02T12:25:58.161Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/6f/8d/b31e39c769e70780f007969815195a55c81a63efebdd4dbe9e7a113adb2f/psutil-7.1.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:19644c85dcb987e35eeeaefdc3915d059dac7bd1167cdcdbf27e0ce2df0c08c0", size = 240368, upload-time = "2025-11-02T12:26:00.491Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/62/61/23fd4acc3c9eebbf6b6c78bcd89e5d020cfde4acf0a9233e9d4e3fa698b4/psutil-7.1.3-cp313-cp313t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:95ef04cf2e5ba0ab9eaafc4a11eaae91b44f4ef5541acd2ee91d9108d00d59a7", size = 287134, upload-time = "2025-11-02T12:26:02.613Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/30/1c/f921a009ea9ceb51aa355cb0cc118f68d354db36eae18174bab63affb3e6/psutil-7.1.3-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:1068c303be3a72f8e18e412c5b2a8f6d31750fb152f9cb106b54090296c9d251", size = 289904, upload-time = "2025-11-02T12:26:05.207Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/2e/bb/6670bded3e3236eb4287c7bcdc167e9fae6e1e9286e437f7111caed2f909/psutil-7.1.3-cp314-cp314t-macosx_10_15_x86_64.whl", hash = "sha256:b403da1df4d6d43973dc004d19cee3b848e998ae3154cc8097d139b77156c353", size = 239843, upload-time = "2025-11-02T12:26:11.968Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b8/66/853d50e75a38c9a7370ddbeefabdd3d3116b9c31ef94dc92c6729bc36bec/psutil-7.1.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:ad81425efc5e75da3f39b3e636293360ad8d0b49bed7df824c79764fb4ba9b8b", size = 240369, upload-time = "2025-11-02T12:26:14.358Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/41/bd/313aba97cb5bfb26916dc29cf0646cbe4dd6a89ca69e8c6edce654876d39/psutil-7.1.3-cp314-cp314t-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:8f33a3702e167783a9213db10ad29650ebf383946e91bc77f28a5eb083496bc9", size = 288210, upload-time = "2025-11-02T12:26:16.699Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c2/fa/76e3c06e760927a0cfb5705eb38164254de34e9bd86db656d4dbaa228b04/psutil-7.1.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fac9cd332c67f4422504297889da5ab7e05fd11e3c4392140f7370f4208ded1f", size = 291182, upload-time = "2025-11-02T12:26:18.848Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ef/94/46b9154a800253e7ecff5aaacdf8ebf43db99de4a2dfa18575b02548654e/psutil-7.1.3-cp36-abi3-macosx_10_9_x86_64.whl", hash = "sha256:2bdbcd0e58ca14996a42adf3621a6244f1bb2e2e528886959c72cf1e326677ab", size = 238359, upload-time = "2025-11-02T12:26:25.284Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/68/3a/9f93cff5c025029a36d9a92fef47220ab4692ee7f2be0fba9f92813d0cb8/psutil-7.1.3-cp36-abi3-macosx_11_0_arm64.whl", hash = "sha256:bc31fa00f1fbc3c3802141eede66f3a2d51d89716a194bf2cd6fc68310a19880", size = 239171, upload-time = "2025-11-02T12:26:27.23Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ce/b1/5f49af514f76431ba4eea935b8ad3725cdeb397e9245ab919dbc1d1dc20f/psutil-7.1.3-cp36-abi3-manylinux2010_x86_64.manylinux_2_12_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3bb428f9f05c1225a558f53e30ccbad9930b11c3fc206836242de1091d3e7dd3", size = 263261, upload-time = "2025-11-02T12:26:29.48Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e0/95/992c8816a74016eb095e73585d747e0a8ea21a061ed3689474fabb29a395/psutil-7.1.3-cp36-abi3-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:56d974e02ca2c8eb4812c3f76c30e28836fffc311d55d979f1465c1feeb2b68b", size = 264635, upload-time = "2025-11-02T12:26:31.74Z" },
|
||||
]
|
||||
|
||||
[[package]]
|
||||
@@ -1254,54 +1254,54 @@ wheels = [
|
||||
|
||||
[[package]]
|
||||
name = "regex"
|
||||
version = "2025.10.23"
|
||||
version = "2025.11.3"
|
||||
source = { registry = "https://pypi.org/simple" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/f8/c8/1d2160d36b11fbe0a61acb7c3c81ab032d9ec8ad888ac9e0a61b85ab99dd/regex-2025.10.23.tar.gz", hash = "sha256:8cbaf8ceb88f96ae2356d01b9adf5e6306fa42fa6f7eab6b97794e37c959ac26", size = 401266, upload-time = "2025-10-21T15:58:20.23Z" }
|
||||
sdist = { url = "https://files.pythonhosted.org/packages/cc/a9/546676f25e573a4cf00fe8e119b78a37b6a8fe2dc95cda877b30889c9c45/regex-2025.11.3.tar.gz", hash = "sha256:1fedc720f9bb2494ce31a58a1631f9c82df6a09b49c19517ea5cc280b4541e01", size = 414669, upload-time = "2025-11-03T21:34:22.089Z" }
|
||||
wheels = [
|
||||
{ url = "https://files.pythonhosted.org/packages/28/c6/195a6217a43719d5a6a12cc192a22d12c40290cecfa577f00f4fb822f07d/regex-2025.10.23-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:b7690f95404a1293923a296981fd943cca12c31a41af9c21ba3edd06398fc193", size = 488956, upload-time = "2025-10-21T15:55:42.887Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/4c/93/181070cd1aa2fa541ff2d3afcf763ceecd4937b34c615fa92765020a6c90/regex-2025.10.23-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:1a32d77aeaea58a13230100dd8797ac1a84c457f3af2fdf0d81ea689d5a9105b", size = 290997, upload-time = "2025-10-21T15:55:44.53Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b6/c5/9d37fbe3a40ed8dda78c23e1263002497540c0d1522ed75482ef6c2000f0/regex-2025.10.23-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:b24b29402f264f70a3c81f45974323b41764ff7159655360543b7cabb73e7d2f", size = 288686, upload-time = "2025-10-21T15:55:46.186Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/5f/e7/db610ff9f10c2921f9b6ac0c8d8be4681b28ddd40fc0549429366967e61f/regex-2025.10.23-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:563824a08c7c03d96856d84b46fdb3bbb7cfbdf79da7ef68725cda2ce169c72a", size = 798466, upload-time = "2025-10-21T15:55:48.24Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/90/10/aab883e1fa7fe2feb15ac663026e70ca0ae1411efa0c7a4a0342d9545015/regex-2025.10.23-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:a0ec8bdd88d2e2659c3518087ee34b37e20bd169419ffead4240a7004e8ed03b", size = 863996, upload-time = "2025-10-21T15:55:50.478Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a2/b0/8f686dd97a51f3b37d0238cd00a6d0f9ccabe701f05b56de1918571d0d61/regex-2025.10.23-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b577601bfe1d33913fcd9276d7607bbac827c4798d9e14d04bf37d417a6c41cb", size = 912145, upload-time = "2025-10-21T15:55:52.215Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a3/ca/639f8cd5b08797bca38fc5e7e07f76641a428cf8c7fca05894caf045aa32/regex-2025.10.23-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7c9f2c68ac6cb3de94eea08a437a75eaa2bd33f9e97c84836ca0b610a5804368", size = 803370, upload-time = "2025-10-21T15:55:53.944Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/0d/1e/a40725bb76959eddf8abc42a967bed6f4851b39f5ac4f20e9794d7832aa5/regex-2025.10.23-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:89f8b9ea3830c79468e26b0e21c3585f69f105157c2154a36f6b7839f8afb351", size = 787767, upload-time = "2025-10-21T15:55:56.004Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3d/d8/8ee9858062936b0f99656dce390aa667c6e7fb0c357b1b9bf76fb5e2e708/regex-2025.10.23-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:98fd84c4e4ea185b3bb5bf065261ab45867d8875032f358a435647285c722673", size = 858335, upload-time = "2025-10-21T15:55:58.185Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d8/0a/ed5faaa63fa8e3064ab670e08061fbf09e3a10235b19630cf0cbb9e48c0a/regex-2025.10.23-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:1e11d3e5887b8b096f96b4154dfb902f29c723a9556639586cd140e77e28b313", size = 850402, upload-time = "2025-10-21T15:56:00.023Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/79/14/d05f617342f4b2b4a23561da500ca2beab062bfcc408d60680e77ecaf04d/regex-2025.10.23-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:4f13450328a6634348d47a88367e06b64c9d84980ef6a748f717b13f8ce64e87", size = 789739, upload-time = "2025-10-21T15:56:01.967Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3e/b3/95b310605285573341fc062d1d30b19a54f857530e86c805f942c4ff7941/regex-2025.10.23-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:7d6606524fa77b3912c9ef52a42ef63c6cfbfc1077e9dc6296cd5da0da286044", size = 491850, upload-time = "2025-10-21T15:56:11.685Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a4/8f/207c2cec01e34e56db1eff606eef46644a60cf1739ecd474627db90ad90b/regex-2025.10.23-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:c037aadf4d64bdc38af7db3dbd34877a057ce6524eefcb2914d6d41c56f968cc", size = 292537, upload-time = "2025-10-21T15:56:13.963Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/98/3b/025240af4ada1dc0b5f10d73f3e5122d04ce7f8908ab8881e5d82b9d61b6/regex-2025.10.23-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:99018c331fb2529084a0c9b4c713dfa49fafb47c7712422e49467c13a636c656", size = 290904, upload-time = "2025-10-21T15:56:16.016Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/81/8e/104ac14e2d3450c43db18ec03e1b96b445a94ae510b60138f00ce2cb7ca1/regex-2025.10.23-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:fd8aba965604d70306eb90a35528f776e59112a7114a5162824d43b76fa27f58", size = 807311, upload-time = "2025-10-21T15:56:17.818Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/19/63/78aef90141b7ce0be8a18e1782f764f6997ad09de0e05251f0d2503a914a/regex-2025.10.23-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:238e67264b4013e74136c49f883734f68656adf8257bfa13b515626b31b20f8e", size = 873241, upload-time = "2025-10-21T15:56:19.941Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b3/a8/80eb1201bb49ae4dba68a1b284b4211ed9daa8e74dc600018a10a90399fb/regex-2025.10.23-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b2eb48bd9848d66fd04826382f5e8491ae633de3233a3d64d58ceb4ecfa2113a", size = 914794, upload-time = "2025-10-21T15:56:22.488Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f0/d5/1984b6ee93281f360a119a5ca1af6a8ca7d8417861671388bf750becc29b/regex-2025.10.23-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:d36591ce06d047d0c0fe2fc5f14bfbd5b4525d08a7b6a279379085e13f0e3d0e", size = 812581, upload-time = "2025-10-21T15:56:24.319Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c4/39/11ebdc6d9927172a64ae237d16763145db6bd45ebb4055c17b88edab72a7/regex-2025.10.23-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:b5d4ece8628d6e364302006366cea3ee887db397faebacc5dacf8ef19e064cf8", size = 795346, upload-time = "2025-10-21T15:56:26.232Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3b/b4/89a591bcc08b5e436af43315284bd233ba77daf0cf20e098d7af12f006c1/regex-2025.10.23-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:39a7e8083959cb1c4ff74e483eecb5a65d3b3e1d821b256e54baf61782c906c6", size = 868214, upload-time = "2025-10-21T15:56:28.597Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3d/ff/58ba98409c1dbc8316cdb20dafbc63ed267380a07780cafecaf5012dabc9/regex-2025.10.23-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:842d449a8fefe546f311656cf8c0d6729b08c09a185f1cad94c756210286d6a8", size = 854540, upload-time = "2025-10-21T15:56:30.875Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/9a/f2/4a9e9338d67626e2071b643f828a482712ad15889d7268e11e9a63d6f7e9/regex-2025.10.23-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:d614986dc68506be8f00474f4f6960e03e4ca9883f7df47744800e7d7c08a494", size = 799346, upload-time = "2025-10-21T15:56:32.725Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/73/f6/0caf29fec943f201fbc8822879c99d31e59c1d51a983d9843ee5cf398539/regex-2025.10.23-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:5b5cb5b6344c4c4c24b2dc87b0bfee78202b07ef7633385df70da7fcf6f7cec6", size = 488960, upload-time = "2025-10-21T15:56:40.849Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/7d/ebb7085b8fa31c24ce0355107cea2b92229d9050552a01c5d291c42aecea/regex-2025.10.23-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:a6ce7973384c37bdf0f371a843f95a6e6f4e1489e10e0cf57330198df72959c5", size = 290932, upload-time = "2025-10-21T15:56:42.875Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/27/41/43906867287cbb5ca4cee671c3cc8081e15deef86a8189c3aad9ac9f6b4d/regex-2025.10.23-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:2ee3663f2c334959016b56e3bd0dd187cbc73f948e3a3af14c3caaa0c3035d10", size = 288766, upload-time = "2025-10-21T15:56:44.894Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ab/9e/ea66132776700fc77a39b1056e7a5f1308032fead94507e208dc6716b7cd/regex-2025.10.23-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:2003cc82a579107e70d013482acce8ba773293f2db534fb532738395c557ff34", size = 798884, upload-time = "2025-10-21T15:56:47.178Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d5/99/aed1453687ab63819a443930770db972c5c8064421f0d9f5da9ad029f26b/regex-2025.10.23-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:182c452279365a93a9f45874f7f191ec1c51e1f1eb41bf2b16563f1a40c1da3a", size = 864768, upload-time = "2025-10-21T15:56:49.793Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/99/5d/732fe747a1304805eb3853ce6337eea16b169f7105a0d0dd9c6a5ffa9948/regex-2025.10.23-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:b1249e9ff581c5b658c8f0437f883b01f1edcf424a16388591e7c05e5e9e8b0c", size = 911394, upload-time = "2025-10-21T15:56:52.186Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/5e/48/58a1f6623466522352a6efa153b9a3714fc559d9f930e9bc947b4a88a2c3/regex-2025.10.23-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:2b841698f93db3ccc36caa1900d2a3be281d9539b822dc012f08fc80b46a3224", size = 803145, upload-time = "2025-10-21T15:56:55.142Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ea/f6/7dea79be2681a5574ab3fc237aa53b2c1dfd6bd2b44d4640b6c76f33f4c1/regex-2025.10.23-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:956d89e0c92d471e8f7eee73f73fdff5ed345886378c45a43175a77538a1ffe4", size = 787831, upload-time = "2025-10-21T15:56:57.203Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3a/ad/07b76950fbbe65f88120ca2d8d845047c401450f607c99ed38862904671d/regex-2025.10.23-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:5c259cb363299a0d90d63b5c0d7568ee98419861618a95ee9d91a41cb9954462", size = 859162, upload-time = "2025-10-21T15:56:59.195Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/41/87/374f3b2021b22aa6a4fc0b750d63f9721e53d1631a238f7a1c343c1cd288/regex-2025.10.23-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:185d2b18c062820b3a40d8fefa223a83f10b20a674bf6e8c4a432e8dfd844627", size = 849899, upload-time = "2025-10-21T15:57:01.747Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/12/4a/7f7bb17c5a5a9747249807210e348450dab9212a46ae6d23ebce86ba6a2b/regex-2025.10.23-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:281d87fa790049c2b7c1b4253121edd80b392b19b5a3d28dc2a77579cb2a58ec", size = 789372, upload-time = "2025-10-21T15:57:04.018Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a6/d0/2025268315e8b2b7b660039824cb7765a41623e97d4cd421510925400487/regex-2025.10.23-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:1f5799ea1787aa6de6c150377d11afad39a38afd033f0c5247aecb997978c422", size = 491854, upload-time = "2025-10-21T15:57:12.526Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/44/35/5681c2fec5e8b33454390af209c4353dfc44606bf06d714b0b8bd0454ffe/regex-2025.10.23-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:a9639ab7540cfea45ef57d16dcbea2e22de351998d614c3ad2f9778fa3bdd788", size = 292542, upload-time = "2025-10-21T15:57:15.158Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/5d/17/184eed05543b724132e4a18149e900f5189001fcfe2d64edaae4fbaf36b4/regex-2025.10.23-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:08f52122c352eb44c3421dab78b9b73a8a77a282cc8314ae576fcaa92b780d10", size = 290903, upload-time = "2025-10-21T15:57:17.108Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/25/d0/5e3347aa0db0de382dddfa133a7b0ae72f24b4344f3989398980b44a3924/regex-2025.10.23-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:ebf1baebef1c4088ad5a5623decec6b52950f0e4d7a0ae4d48f0a99f8c9cb7d7", size = 807546, upload-time = "2025-10-21T15:57:19.179Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d2/bb/40c589bbdce1be0c55e9f8159789d58d47a22014f2f820cf2b517a5cd193/regex-2025.10.23-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:16b0f1c2e2d566c562d5c384c2b492646be0a19798532fdc1fdedacc66e3223f", size = 873322, upload-time = "2025-10-21T15:57:21.36Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/fe/56/a7e40c01575ac93360e606278d359f91829781a9f7fb6e5aa435039edbda/regex-2025.10.23-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f7ada5d9dceafaab92646aa00c10a9efd9b09942dd9b0d7c5a4b73db92cc7e61", size = 914855, upload-time = "2025-10-21T15:57:24.044Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/5c/4b/d55587b192763db3163c3f508b3b67b31bb6f5e7a0e08b83013d0a59500a/regex-2025.10.23-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:3a36b4005770044bf08edecc798f0e41a75795b9e7c9c12fe29da8d792ef870c", size = 812724, upload-time = "2025-10-21T15:57:26.123Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/33/20/18bac334955fbe99d17229f4f8e98d05e4a501ac03a442be8facbb37c304/regex-2025.10.23-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:af7b2661dcc032da1fae82069b5ebf2ac1dfcd5359ef8b35e1367bfc92181432", size = 795439, upload-time = "2025-10-21T15:57:28.497Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/67/46/c57266be9df8549c7d85deb4cb82280cb0019e46fff677534c5fa1badfa4/regex-2025.10.23-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:1cb976810ac1416a67562c2e5ba0accf6f928932320fef302e08100ed681b38e", size = 868336, upload-time = "2025-10-21T15:57:30.867Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b8/f3/bd5879e41ef8187fec5e678e94b526a93f99e7bbe0437b0f2b47f9101694/regex-2025.10.23-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:1a56a54be3897d62f54290190fbcd754bff6932934529fbf5b29933da28fcd43", size = 854567, upload-time = "2025-10-21T15:57:33.062Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e6/57/2b6bbdbd2f24dfed5b028033aa17ad8f7d86bb28f1a892cac8b3bc89d059/regex-2025.10.23-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:8f3e6d202fb52c2153f532043bbcf618fd177df47b0b306741eb9b60ba96edc3", size = 799565, upload-time = "2025-10-21T15:57:35.153Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e1/a7/dda24ebd49da46a197436ad96378f17df30ceb40e52e859fc42cac45b850/regex-2025.11.3-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:c1e448051717a334891f2b9a620fe36776ebf3dd8ec46a0b877c8ae69575feb4", size = 489081, upload-time = "2025-11-03T21:31:55.9Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/19/22/af2dc751aacf88089836aa088a1a11c4f21a04707eb1b0478e8e8fb32847/regex-2025.11.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:9b5aca4d5dfd7fbfbfbdaf44850fcc7709a01146a797536a8f84952e940cca76", size = 291123, upload-time = "2025-11-03T21:31:57.758Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a3/88/1a3ea5672f4b0a84802ee9891b86743438e7c04eb0b8f8c4e16a42375327/regex-2025.11.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:04d2765516395cf7dda331a244a3282c0f5ae96075f728629287dfa6f76ba70a", size = 288814, upload-time = "2025-11-03T21:32:01.12Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/fb/8c/f5987895bf42b8ddeea1b315c9fedcfe07cadee28b9c98cf50d00adcb14d/regex-2025.11.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:5d9903ca42bfeec4cebedba8022a7c97ad2aab22e09573ce9976ba01b65e4361", size = 798592, upload-time = "2025-11-03T21:32:03.006Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/99/2a/6591ebeede78203fa77ee46a1c36649e02df9eaa77a033d1ccdf2fcd5d4e/regex-2025.11.3-cp313-cp313-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:639431bdc89d6429f6721625e8129413980ccd62e9d3f496be618a41d205f160", size = 864122, upload-time = "2025-11-03T21:32:04.553Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/94/d6/be32a87cf28cf8ed064ff281cfbd49aefd90242a83e4b08b5a86b38e8eb4/regex-2025.11.3-cp313-cp313-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f117efad42068f9715677c8523ed2be1518116d1c49b1dd17987716695181efe", size = 912272, upload-time = "2025-11-03T21:32:06.148Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/62/11/9bcef2d1445665b180ac7f230406ad80671f0fc2a6ffb93493b5dd8cd64c/regex-2025.11.3-cp313-cp313-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:4aecb6f461316adf9f1f0f6a4a1a3d79e045f9b71ec76055a791affa3b285850", size = 803497, upload-time = "2025-11-03T21:32:08.162Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e5/a7/da0dc273d57f560399aa16d8a68ae7f9b57679476fc7ace46501d455fe84/regex-2025.11.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:3b3a5f320136873cc5561098dfab677eea139521cb9a9e8db98b7e64aef44cbc", size = 787892, upload-time = "2025-11-03T21:32:09.769Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/da/4b/732a0c5a9736a0b8d6d720d4945a2f1e6f38f87f48f3173559f53e8d5d82/regex-2025.11.3-cp313-cp313-musllinux_1_2_ppc64le.whl", hash = "sha256:75fa6f0056e7efb1f42a1c34e58be24072cb9e61a601340cc1196ae92326a4f9", size = 858462, upload-time = "2025-11-03T21:32:11.769Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/0c/f5/a2a03df27dc4c2d0c769220f5110ba8c4084b0bfa9ab0f9b4fcfa3d2b0fc/regex-2025.11.3-cp313-cp313-musllinux_1_2_s390x.whl", hash = "sha256:dbe6095001465294f13f1adcd3311e50dd84e5a71525f20a10bd16689c61ce0b", size = 850528, upload-time = "2025-11-03T21:32:13.906Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d6/09/e1cd5bee3841c7f6eb37d95ca91cdee7100b8f88b81e41c2ef426910891a/regex-2025.11.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:454d9b4ae7881afbc25015b8627c16d88a597479b9dea82b8c6e7e2e07240dc7", size = 789866, upload-time = "2025-11-03T21:32:15.748Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/20/28/fd0c63357caefe5680b8ea052131acbd7f456893b69cc2a90cc3e0dc90d4/regex-2025.11.3-cp313-cp313t-macosx_10_13_universal2.whl", hash = "sha256:1eb1ebf6822b756c723e09f5186473d93236c06c579d2cc0671a722d2ab14281", size = 491984, upload-time = "2025-11-03T21:32:23.466Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/df/ec/7014c15626ab46b902b3bcc4b28a7bae46d8f281fc7ea9c95e22fcaaa917/regex-2025.11.3-cp313-cp313t-macosx_10_13_x86_64.whl", hash = "sha256:1e00ec2970aab10dc5db34af535f21fcf32b4a31d99e34963419636e2f85ae39", size = 292673, upload-time = "2025-11-03T21:32:25.034Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/23/ab/3b952ff7239f20d05f1f99e9e20188513905f218c81d52fb5e78d2bf7634/regex-2025.11.3-cp313-cp313t-macosx_11_0_arm64.whl", hash = "sha256:a4cb042b615245d5ff9b3794f56be4138b5adc35a4166014d31d1814744148c7", size = 291029, upload-time = "2025-11-03T21:32:26.528Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/21/7e/3dc2749fc684f455f162dcafb8a187b559e2614f3826877d3844a131f37b/regex-2025.11.3-cp313-cp313t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:44f264d4bf02f3176467d90b294d59bf1db9fe53c141ff772f27a8b456b2a9ed", size = 807437, upload-time = "2025-11-03T21:32:28.363Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/1b/0b/d529a85ab349c6a25d1ca783235b6e3eedf187247eab536797021f7126c6/regex-2025.11.3-cp313-cp313t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:7be0277469bf3bd7a34a9c57c1b6a724532a0d235cd0dc4e7f4316f982c28b19", size = 873368, upload-time = "2025-11-03T21:32:30.4Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7d/18/2d868155f8c9e3e9d8f9e10c64e9a9f496bb8f7e037a88a8bed26b435af6/regex-2025.11.3-cp313-cp313t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:0d31e08426ff4b5b650f68839f5af51a92a5b51abd8554a60c2fbc7c71f25d0b", size = 914921, upload-time = "2025-11-03T21:32:32.123Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/2d/71/9d72ff0f354fa783fe2ba913c8734c3b433b86406117a8db4ea2bf1c7a2f/regex-2025.11.3-cp313-cp313t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:e43586ce5bd28f9f285a6e729466841368c4a0353f6fd08d4ce4630843d3648a", size = 812708, upload-time = "2025-11-03T21:32:34.305Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/e7/19/ce4bf7f5575c97f82b6e804ffb5c4e940c62609ab2a0d9538d47a7fdf7d4/regex-2025.11.3-cp313-cp313t-musllinux_1_2_aarch64.whl", hash = "sha256:0f9397d561a4c16829d4e6ff75202c1c08b68a3bdbfe29dbfcdb31c9830907c6", size = 795472, upload-time = "2025-11-03T21:32:36.364Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/03/86/fd1063a176ffb7b2315f9a1b08d17b18118b28d9df163132615b835a26ee/regex-2025.11.3-cp313-cp313t-musllinux_1_2_ppc64le.whl", hash = "sha256:dd16e78eb18ffdb25ee33a0682d17912e8cc8a770e885aeee95020046128f1ce", size = 868341, upload-time = "2025-11-03T21:32:38.042Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/12/43/103fb2e9811205e7386366501bc866a164a0430c79dd59eac886a2822950/regex-2025.11.3-cp313-cp313t-musllinux_1_2_s390x.whl", hash = "sha256:ffcca5b9efe948ba0661e9df0fa50d2bc4b097c70b9810212d6b62f05d83b2dd", size = 854666, upload-time = "2025-11-03T21:32:40.079Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/7d/22/e392e53f3869b75804762c7c848bd2dd2abf2b70fb0e526f58724638bd35/regex-2025.11.3-cp313-cp313t-musllinux_1_2_x86_64.whl", hash = "sha256:c56b4d162ca2b43318ac671c65bd4d563e841a694ac70e1a976ac38fcf4ca1d2", size = 799473, upload-time = "2025-11-03T21:32:42.148Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/31/e9/f6e13de7e0983837f7b6d238ad9458800a874bf37c264f7923e63409944c/regex-2025.11.3-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:9697a52e57576c83139d7c6f213d64485d3df5bf84807c35fa409e6c970801c6", size = 489089, upload-time = "2025-11-03T21:32:50.027Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/a3/5c/261f4a262f1fa65141c1b74b255988bd2fa020cc599e53b080667d591cfc/regex-2025.11.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:e18bc3f73bd41243c9b38a6d9f2366cd0e0137a9aebe2d8ff76c5b67d4c0a3f4", size = 291059, upload-time = "2025-11-03T21:32:51.682Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/8e/57/f14eeb7f072b0e9a5a090d1712741fd8f214ec193dba773cf5410108bb7d/regex-2025.11.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:61a08bcb0ec14ff4e0ed2044aad948d0659604f824cbd50b55e30b0ec6f09c73", size = 288900, upload-time = "2025-11-03T21:32:53.569Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/3c/6b/1d650c45e99a9b327586739d926a1cd4e94666b1bd4af90428b36af66dc7/regex-2025.11.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:c9c30003b9347c24bcc210958c5d167b9e4f9be786cb380a7d32f14f9b84674f", size = 799010, upload-time = "2025-11-03T21:32:55.222Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/99/ee/d66dcbc6b628ce4e3f7f0cbbb84603aa2fc0ffc878babc857726b8aab2e9/regex-2025.11.3-cp314-cp314-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:4e1e592789704459900728d88d41a46fe3969b82ab62945560a31732ffc19a6d", size = 864893, upload-time = "2025-11-03T21:32:57.239Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bf/2d/f238229f1caba7ac87a6c4153d79947fb0261415827ae0f77c304260c7d3/regex-2025.11.3-cp314-cp314-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:6538241f45eb5a25aa575dbba1069ad786f68a4f2773a29a2bd3dd1f9de787be", size = 911522, upload-time = "2025-11-03T21:32:59.274Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/bd/3d/22a4eaba214a917c80e04f6025d26143690f0419511e0116508e24b11c9b/regex-2025.11.3-cp314-cp314-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:bce22519c989bb72a7e6b36a199384c53db7722fe669ba891da75907fe3587db", size = 803272, upload-time = "2025-11-03T21:33:01.393Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/84/b1/03188f634a409353a84b5ef49754b97dbcc0c0f6fd6c8ede505a8960a0a4/regex-2025.11.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:66d559b21d3640203ab9075797a55165d79017520685fb407b9234d72ab63c62", size = 787958, upload-time = "2025-11-03T21:33:03.379Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/99/6a/27d072f7fbf6fadd59c64d210305e1ff865cc3b78b526fd147db768c553b/regex-2025.11.3-cp314-cp314-musllinux_1_2_ppc64le.whl", hash = "sha256:669dcfb2e38f9e8c69507bace46f4889e3abbfd9b0c29719202883c0a603598f", size = 859289, upload-time = "2025-11-03T21:33:05.374Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/9a/70/1b3878f648e0b6abe023172dacb02157e685564853cc363d9961bcccde4e/regex-2025.11.3-cp314-cp314-musllinux_1_2_s390x.whl", hash = "sha256:32f74f35ff0f25a5021373ac61442edcb150731fbaa28286bbc8bb1582c89d02", size = 850026, upload-time = "2025-11-03T21:33:07.131Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/dd/d5/68e25559b526b8baab8e66839304ede68ff6727237a47727d240006bd0ff/regex-2025.11.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:e6c7a21dffba883234baefe91bc3388e629779582038f75d2a5be918e250f0ed", size = 789499, upload-time = "2025-11-03T21:33:09.141Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/c3/06/49b198550ee0f5e4184271cee87ba4dfd9692c91ec55289e6282f0f86ccf/regex-2025.11.3-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:ba0d8a5d7f04f73ee7d01d974d47c5834f8a1b0224390e4fe7c12a3a92a78ecc", size = 491985, upload-time = "2025-11-03T21:33:16.555Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/ce/bf/abdafade008f0b1c9da10d934034cb670432d6cf6cbe38bbb53a1cfd6cf8/regex-2025.11.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:442d86cf1cfe4faabf97db7d901ef58347efd004934da045c745e7b5bd57ac49", size = 292669, upload-time = "2025-11-03T21:33:18.32Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f9/ef/0c357bb8edbd2ad8e273fcb9e1761bc37b8acbc6e1be050bebd6475f19c1/regex-2025.11.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:fd0a5e563c756de210bb964789b5abe4f114dacae9104a47e1a649b910361536", size = 291030, upload-time = "2025-11-03T21:33:20.048Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/79/06/edbb67257596649b8fb088d6aeacbcb248ac195714b18a65e018bf4c0b50/regex-2025.11.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:bf3490bcbb985a1ae97b2ce9ad1c0f06a852d5b19dde9b07bdf25bf224248c95", size = 807674, upload-time = "2025-11-03T21:33:21.797Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f4/d9/ad4deccfce0ea336296bd087f1a191543bb99ee1c53093dcd4c64d951d00/regex-2025.11.3-cp314-cp314t-manylinux2014_ppc64le.manylinux_2_17_ppc64le.manylinux_2_28_ppc64le.whl", hash = "sha256:3809988f0a8b8c9dcc0f92478d6501fac7200b9ec56aecf0ec21f4a2ec4b6009", size = 873451, upload-time = "2025-11-03T21:33:23.741Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/13/75/a55a4724c56ef13e3e04acaab29df26582f6978c000ac9cd6810ad1f341f/regex-2025.11.3-cp314-cp314t-manylinux2014_s390x.manylinux_2_17_s390x.manylinux_2_28_s390x.whl", hash = "sha256:f4ff94e58e84aedb9c9fce66d4ef9f27a190285b451420f297c9a09f2b9abee9", size = 914980, upload-time = "2025-11-03T21:33:25.999Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/67/1e/a1657ee15bd9116f70d4a530c736983eed997b361e20ecd8f5ca3759d5c5/regex-2025.11.3-cp314-cp314t-manylinux2014_x86_64.manylinux_2_17_x86_64.manylinux_2_28_x86_64.whl", hash = "sha256:7eb542fd347ce61e1321b0a6b945d5701528dca0cd9759c2e3bb8bd57e47964d", size = 812852, upload-time = "2025-11-03T21:33:27.852Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/b8/6f/f7516dde5506a588a561d296b2d0044839de06035bb486b326065b4c101e/regex-2025.11.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:d6c2d5919075a1f2e413c00b056ea0c2f065b3f5fe83c3d07d325ab92dce51d6", size = 795566, upload-time = "2025-11-03T21:33:32.364Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/d9/dd/3d10b9e170cc16fb34cb2cef91513cf3df65f440b3366030631b2984a264/regex-2025.11.3-cp314-cp314t-musllinux_1_2_ppc64le.whl", hash = "sha256:3f8bf11a4827cc7ce5a53d4ef6cddd5ad25595d3c1435ef08f76825851343154", size = 868463, upload-time = "2025-11-03T21:33:34.459Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/f5/8e/935e6beff1695aa9085ff83195daccd72acc82c81793df480f34569330de/regex-2025.11.3-cp314-cp314t-musllinux_1_2_s390x.whl", hash = "sha256:22c12d837298651e5550ac1d964e4ff57c3f56965fc1812c90c9fb2028eaf267", size = 854694, upload-time = "2025-11-03T21:33:36.793Z" },
|
||||
{ url = "https://files.pythonhosted.org/packages/92/12/10650181a040978b2f5720a6a74d44f841371a3d984c2083fc1752e4acf6/regex-2025.11.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:62ba394a3dda9ad41c7c780f60f6e4a70988741415ae96f6d1bf6c239cf01379", size = 799691, upload-time = "2025-11-03T21:33:39.079Z" },
|
||||
]
|
||||
|
||||
[[package]]
@@ -1334,25 +1334,25 @@ wheels = [

[[package]]
name = "ruff"
version = "0.14.2"
version = "0.14.3"
source = { registry = "https://pypi.org/simple" }
sdist = { url = "https://files.pythonhosted.org/packages/ee/34/8218a19b2055b80601e8fd201ec723c74c7fe1ca06d525a43ed07b6d8e85/ruff-0.14.2.tar.gz", hash = "sha256:98da787668f239313d9c902ca7c523fe11b8ec3f39345553a51b25abc4629c96", size = 5539663, upload-time = "2025-10-23T19:37:00.956Z" }
sdist = { url = "https://files.pythonhosted.org/packages/75/62/50b7727004dfe361104dfbf898c45a9a2fdfad8c72c04ae62900224d6ecf/ruff-0.14.3.tar.gz", hash = "sha256:4ff876d2ab2b161b6de0aa1f5bd714e8e9b4033dc122ee006925fbacc4f62153", size = 5558687, upload-time = "2025-10-31T00:26:26.878Z" }
wheels = [
    { url = "https://files.pythonhosted.org/packages/16/dd/23eb2db5ad9acae7c845700493b72d3ae214dce0b226f27df89216110f2b/ruff-0.14.2-py3-none-linux_armv6l.whl", hash = "sha256:7cbe4e593505bdec5884c2d0a4d791a90301bc23e49a6b1eb642dd85ef9c64f1", size = 12533390, upload-time = "2025-10-23T19:36:18.044Z" },
    { url = "https://files.pythonhosted.org/packages/5a/8c/5f9acff43ddcf3f85130d0146d0477e28ccecc495f9f684f8f7119b74c0d/ruff-0.14.2-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:8d54b561729cee92f8d89c316ad7a3f9705533f5903b042399b6ae0ddfc62e11", size = 12887187, upload-time = "2025-10-23T19:36:22.664Z" },
    { url = "https://files.pythonhosted.org/packages/99/fa/047646491479074029665022e9f3dc6f0515797f40a4b6014ea8474c539d/ruff-0.14.2-py3-none-macosx_11_0_arm64.whl", hash = "sha256:5c8753dfa44ebb2cde10ce5b4d2ef55a41fb9d9b16732a2c5df64620dbda44a3", size = 11925177, upload-time = "2025-10-23T19:36:24.778Z" },
    { url = "https://files.pythonhosted.org/packages/15/8b/c44cf7fe6e59ab24a9d939493a11030b503bdc2a16622cede8b7b1df0114/ruff-0.14.2-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3d0bbeffb8d9f4fccf7b5198d566d0bad99a9cb622f1fc3467af96cb8773c9e3", size = 12358285, upload-time = "2025-10-23T19:36:26.979Z" },
    { url = "https://files.pythonhosted.org/packages/45/01/47701b26254267ef40369aea3acb62a7b23e921c27372d127e0f3af48092/ruff-0.14.2-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:7047f0c5a713a401e43a88d36843d9c83a19c584e63d664474675620aaa634a8", size = 12303832, upload-time = "2025-10-23T19:36:29.192Z" },
    { url = "https://files.pythonhosted.org/packages/2d/5c/ae7244ca4fbdf2bee9d6405dcd5bc6ae51ee1df66eb7a9884b77b8af856d/ruff-0.14.2-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:3bf8d2f9aa1602599217d82e8e0af7fd33e5878c4d98f37906b7c93f46f9a839", size = 13036995, upload-time = "2025-10-23T19:36:31.861Z" },
    { url = "https://files.pythonhosted.org/packages/27/4c/0860a79ce6fd4c709ac01173f76f929d53f59748d0dcdd662519835dae43/ruff-0.14.2-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:1c505b389e19c57a317cf4b42db824e2fca96ffb3d86766c1c9f8b96d32048a7", size = 14512649, upload-time = "2025-10-23T19:36:33.915Z" },
    { url = "https://files.pythonhosted.org/packages/7f/7f/d365de998069720a3abfc250ddd876fc4b81a403a766c74ff9bde15b5378/ruff-0.14.2-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:a307fc45ebd887b3f26b36d9326bb70bf69b01561950cdcc6c0bdf7bb8e0f7cc", size = 14088182, upload-time = "2025-10-23T19:36:36.983Z" },
    { url = "https://files.pythonhosted.org/packages/6c/ea/d8e3e6b209162000a7be1faa41b0a0c16a133010311edc3329753cc6596a/ruff-0.14.2-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:61ae91a32c853172f832c2f40bd05fd69f491db7289fb85a9b941ebdd549781a", size = 13599516, upload-time = "2025-10-23T19:36:39.208Z" },
    { url = "https://files.pythonhosted.org/packages/fa/ea/c7810322086db68989fb20a8d5221dd3b79e49e396b01badca07b433ab45/ruff-0.14.2-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bc1967e40286f63ee23c615e8e7e98098dedc7301568bd88991f6e544d8ae096", size = 13272690, upload-time = "2025-10-23T19:36:41.453Z" },
    { url = "https://files.pythonhosted.org/packages/a9/39/10b05acf8c45786ef501d454e00937e1b97964f846bf28883d1f9619928a/ruff-0.14.2-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:2877f02119cdebf52a632d743a2e302dea422bfae152ebe2f193d3285a3a65df", size = 13496497, upload-time = "2025-10-23T19:36:43.61Z" },
    { url = "https://files.pythonhosted.org/packages/59/a1/1f25f8301e13751c30895092485fada29076e5e14264bdacc37202e85d24/ruff-0.14.2-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:e681c5bc777de5af898decdcb6ba3321d0d466f4cb43c3e7cc2c3b4e7b843a05", size = 12266116, upload-time = "2025-10-23T19:36:45.625Z" },
    { url = "https://files.pythonhosted.org/packages/5c/fa/0029bfc9ce16ae78164e6923ef392e5f173b793b26cc39aa1d8b366cf9dc/ruff-0.14.2-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:e21be42d72e224736f0c992cdb9959a2fa53c7e943b97ef5d081e13170e3ffc5", size = 12281345, upload-time = "2025-10-23T19:36:47.618Z" },
    { url = "https://files.pythonhosted.org/packages/a5/ab/ece7baa3c0f29b7683be868c024f0838770c16607bea6852e46b202f1ff6/ruff-0.14.2-py3-none-musllinux_1_2_i686.whl", hash = "sha256:b8264016f6f209fac16262882dbebf3f8be1629777cf0f37e7aff071b3e9b92e", size = 12629296, upload-time = "2025-10-23T19:36:49.789Z" },
    { url = "https://files.pythonhosted.org/packages/a4/7f/638f54b43f3d4e48c6a68062794e5b367ddac778051806b9e235dfb7aa81/ruff-0.14.2-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:5ca36b4cb4db3067a3b24444463ceea5565ea78b95fe9a07ca7cb7fd16948770", size = 13371610, upload-time = "2025-10-23T19:36:51.882Z" },
    { url = "https://files.pythonhosted.org/packages/ce/8e/0c10ff1ea5d4360ab8bfca4cb2c9d979101a391f3e79d2616c9bf348cd26/ruff-0.14.3-py3-none-linux_armv6l.whl", hash = "sha256:876b21e6c824f519446715c1342b8e60f97f93264012de9d8d10314f8a79c371", size = 12535613, upload-time = "2025-10-31T00:25:44.302Z" },
    { url = "https://files.pythonhosted.org/packages/d3/c8/6724f4634c1daf52409fbf13fefda64aa9c8f81e44727a378b7b73dc590b/ruff-0.14.3-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:b6fd8c79b457bedd2abf2702b9b472147cd860ed7855c73a5247fa55c9117654", size = 12855812, upload-time = "2025-10-31T00:25:47.793Z" },
    { url = "https://files.pythonhosted.org/packages/de/03/db1bce591d55fd5f8a08bb02517fa0b5097b2ccabd4ea1ee29aa72b67d96/ruff-0.14.3-py3-none-macosx_11_0_arm64.whl", hash = "sha256:71ff6edca490c308f083156938c0c1a66907151263c4abdcb588602c6e696a14", size = 11944026, upload-time = "2025-10-31T00:25:49.657Z" },
    { url = "https://files.pythonhosted.org/packages/0b/75/4f8dbd48e03272715d12c87dc4fcaaf21b913f0affa5f12a4e9c6f8a0582/ruff-0.14.3-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:786ee3ce6139772ff9272aaf43296d975c0217ee1b97538a98171bf0d21f87ed", size = 12356818, upload-time = "2025-10-31T00:25:51.949Z" },
    { url = "https://files.pythonhosted.org/packages/ec/9b/506ec5b140c11d44a9a4f284ea7c14ebf6f8b01e6e8917734a3325bff787/ruff-0.14.3-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:cd6291d0061811c52b8e392f946889916757610d45d004e41140d81fb6cd5ddc", size = 12336745, upload-time = "2025-10-31T00:25:54.248Z" },
    { url = "https://files.pythonhosted.org/packages/c7/e1/c560d254048c147f35e7f8131d30bc1f63a008ac61595cf3078a3e93533d/ruff-0.14.3-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a497ec0c3d2c88561b6d90f9c29f5ae68221ac00d471f306fa21fa4264ce5fcd", size = 13101684, upload-time = "2025-10-31T00:25:56.253Z" },
    { url = "https://files.pythonhosted.org/packages/a5/32/e310133f8af5cd11f8cc30f52522a3ebccc5ea5bff4b492f94faceaca7a8/ruff-0.14.3-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:e231e1be58fc568950a04fbe6887c8e4b85310e7889727e2b81db205c45059eb", size = 14535000, upload-time = "2025-10-31T00:25:58.397Z" },
    { url = "https://files.pythonhosted.org/packages/a2/a1/7b0470a22158c6d8501eabc5e9b6043c99bede40fa1994cadf6b5c2a61c7/ruff-0.14.3-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:469e35872a09c0e45fecf48dd960bfbce056b5db2d5e6b50eca329b4f853ae20", size = 14156450, upload-time = "2025-10-31T00:26:00.889Z" },
    { url = "https://files.pythonhosted.org/packages/0a/96/24bfd9d1a7f532b560dcee1a87096332e461354d3882124219bcaff65c09/ruff-0.14.3-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:3d6bc90307c469cb9d28b7cfad90aaa600b10d67c6e22026869f585e1e8a2db0", size = 13568414, upload-time = "2025-10-31T00:26:03.291Z" },
    { url = "https://files.pythonhosted.org/packages/a7/e7/138b883f0dfe4ad5b76b58bf4ae675f4d2176ac2b24bdd81b4d966b28c61/ruff-0.14.3-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0e2f8a0bbcffcfd895df39c9a4ecd59bb80dca03dc43f7fb63e647ed176b741e", size = 13315293, upload-time = "2025-10-31T00:26:05.708Z" },
    { url = "https://files.pythonhosted.org/packages/33/f4/c09bb898be97b2eb18476b7c950df8815ef14cf956074177e9fbd40b7719/ruff-0.14.3-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:678fdd7c7d2d94851597c23ee6336d25f9930b460b55f8598e011b57c74fd8c5", size = 13539444, upload-time = "2025-10-31T00:26:08.09Z" },
    { url = "https://files.pythonhosted.org/packages/9c/aa/b30a1db25fc6128b1dd6ff0741fa4abf969ded161599d07ca7edd0739cc0/ruff-0.14.3-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:1ec1ac071e7e37e0221d2f2dbaf90897a988c531a8592a6a5959f0603a1ecf5e", size = 12252581, upload-time = "2025-10-31T00:26:10.297Z" },
    { url = "https://files.pythonhosted.org/packages/da/13/21096308f384d796ffe3f2960b17054110a9c3828d223ca540c2b7cc670b/ruff-0.14.3-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:afcdc4b5335ef440d19e7df9e8ae2ad9f749352190e96d481dc501b753f0733e", size = 12307503, upload-time = "2025-10-31T00:26:12.646Z" },
    { url = "https://files.pythonhosted.org/packages/cb/cc/a350bac23f03b7dbcde3c81b154706e80c6f16b06ff1ce28ed07dc7b07b0/ruff-0.14.3-py3-none-musllinux_1_2_i686.whl", hash = "sha256:7bfc42f81862749a7136267a343990f865e71fe2f99cf8d2958f684d23ce3dfa", size = 12675457, upload-time = "2025-10-31T00:26:15.044Z" },
    { url = "https://files.pythonhosted.org/packages/cb/76/46346029fa2f2078826bc88ef7167e8c198e58fe3126636e52f77488cbba/ruff-0.14.3-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:a65e448cfd7e9c59fae8cf37f9221585d3354febaad9a07f29158af1528e165f", size = 13403980, upload-time = "2025-10-31T00:26:17.81Z" },
]

[[package]]
@@ -1443,19 +1443,19 @@ wheels = [

[[package]]
name = "starlette"
version = "0.49.1"
version = "0.49.3"
source = { registry = "https://pypi.org/simple" }
dependencies = [
    { name = "anyio", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/1b/3f/507c21db33b66fb027a332f2cb3abbbe924cc3a79ced12f01ed8645955c9/starlette-0.49.1.tar.gz", hash = "sha256:481a43b71e24ed8c43b11ea02f5353d77840e01480881b8cb5a26b8cae64a8cb", size = 2654703, upload-time = "2025-10-28T17:34:10.928Z" }
sdist = { url = "https://files.pythonhosted.org/packages/de/1a/608df0b10b53b0beb96a37854ee05864d182ddd4b1156a22f1ad3860425a/starlette-0.49.3.tar.gz", hash = "sha256:1c14546f299b5901a1ea0e34410575bc33bbd741377a10484a54445588d00284", size = 2655031, upload-time = "2025-11-01T15:12:26.13Z" }
wheels = [
    { url = "https://files.pythonhosted.org/packages/51/da/545b75d420bb23b5d494b0517757b351963e974e79933f01e05c929f20a6/starlette-0.49.1-py3-none-any.whl", hash = "sha256:d92ce9f07e4a3caa3ac13a79523bd18e3bc0042bb8ff2d759a8e7dd0e1859875", size = 74175, upload-time = "2025-10-28T17:34:09.13Z" },
    { url = "https://files.pythonhosted.org/packages/a3/e0/021c772d6a662f43b63044ab481dc6ac7592447605b5b35a957785363122/starlette-0.49.3-py3-none-any.whl", hash = "sha256:b579b99715fdc2980cf88c8ec96d3bf1ce16f5a8051a7c2b84ef9b1cdecaea2f", size = 74340, upload-time = "2025-11-01T15:12:24.387Z" },
]

[[package]]
name = "textual"
version = "6.4.0"
version = "6.5.0"
source = { registry = "https://pypi.org/simple" }
dependencies = [
    { name = "markdown-it-py", extra = ["linkify"], marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
@@ -1465,9 +1465,9 @@ dependencies = [
    { name = "rich", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
    { name = "typing-extensions", marker = "sys_platform == 'darwin' or sys_platform == 'linux'" },
]
sdist = { url = "https://files.pythonhosted.org/packages/23/6c/565521dc6dd00fa857845483ae0c070575fda1f9a56d92d732554fecfea4/textual-6.4.0.tar.gz", hash = "sha256:f40df9165a001c10249698d532f2f5a71708b70f0e4ef3fce081a9dd93ffeaaa", size = 1573599, upload-time = "2025-10-22T17:29:51.357Z" }
sdist = { url = "https://files.pythonhosted.org/packages/af/90/59757aa887ddcea61428820274f1a2d1f986feb7880374a5420ab5d37132/textual-6.5.0.tar.gz", hash = "sha256:e5f152cdd47db48a635d23b839721bae4d0e8b6d855e3fede7285218289294e3", size = 1574116, upload-time = "2025-10-31T17:21:53.4Z" }
wheels = [
    { url = "https://files.pythonhosted.org/packages/37/20/6eed0e55bdd2576475e9cea49cc71c47f8e56ab54f04cbe04b2fb56440de/textual-6.4.0-py3-none-any.whl", hash = "sha256:b346dbb8e12f17cefb33ddfdf7f19bdc9e66c29daf82fc981a8db6b7d985e115", size = 711663, upload-time = "2025-10-22T17:29:49.346Z" },
    { url = "https://files.pythonhosted.org/packages/42/37/1deba011782a49ea249c73adcf703a39b0249ac9b0e17d1a2e4074df8d57/textual-6.5.0-py3-none-any.whl", hash = "sha256:c5505be7fe606b8054fb88431279885f88352bddca64832f6acd293ef7d9b54f", size = 711848, upload-time = "2025-10-31T17:21:51.134Z" },
]

[[package]]