update log message + assertion

add a test to gather TB connectivity data
fix: dashboard TypeScript errors and friendly name showing "Unknown"
2026-01-18 10:58:35 -05:00 · 2026-01-12 16:30:14 +00:00 · 2026-01-12 16:26:40 +00:00 · 2026-01-12 11:59:44 +00:00 · 2026-01-12 11:44:17 +00:00 · 2026-01-12 11:40:26 +00:00
139 changed files with 8764 additions and 7373 deletions
--- a/.github/benchmark-dashboard/README.md
+++ b/.github/benchmark-dashboard/README.md
@@ -1,159 +0,0 @@
-# EXO Benchmark Dashboard
-
-A fully self-contained, browser-based dashboard for tracking EXO benchmark performance over time.
-
-## Features
-
-- 📊 **Success Rate Tracking**: Monitor cluster reliability across commits
-- ⚡ **Response Time Analysis**: Track average request completion times  
-- 🎯 **Throughput Metrics**: Tokens per second visualization
-- 📈 **Request Distribution**: Success/failure breakdown over time
-- 🔄 **Auto-Refresh**: Updates every 60 seconds
-- 📺 **TV-Ready**: Large, clear visualizations perfect for display
-- 🔐 **Secure**: Credentials stored in browser localStorage only
-- 🌐 **No Backend**: Directly accesses S3 from the browser
-
-## Quick Start
-
-### Option 1: Direct File Access (Simplest)
-
-Just open the HTML file directly in your browser:
-
-```bash
-open .github/benchmark-dashboard/index.html
-```
-
-Then click "Configure AWS Credentials" and enter your keys.
-
-### Option 2: URL Parameters (For Quick Setup)
-
-```bash
-# Serve with credentials in URL (they'll be moved to localStorage)
-open ".github/benchmark-dashboard/index.html?accessKey=YOUR_KEY&secretKey=YOUR_SECRET&region=us-east-1"
-```
-
-The credentials will be saved to localStorage and removed from the URL immediately.
-
-### Option 3: Simple HTTP Server
-
-```bash
-# From repo root
-python3 -m http.server 8080
-
-# Then open: http://localhost:8080/.github/benchmark-dashboard/
-```
-
-## AWS Credentials
-
-The dashboard needs read-only access to the `exo-benchmark-results` S3 bucket.
-
-### Required IAM Permissions
-
-```json
-{
-  "Version": "2012-10-17",
-  "Statement": [
-    {
-      "Effect": "Allow",
-      "Action": [
-        "s3:GetObject",
-        "s3:ListBucket"
-      ],
-      "Resource": [
-        "arn:aws:s3:::exo-benchmark-results",
-        "arn:aws:s3:::exo-benchmark-results/*"
-      ]
-    }
-  ]
-}
-```
-
-### Security Notes
-
-- ✅ Credentials stored in browser `localStorage` only
-- ✅ Never sent to any server (except AWS)
-- ✅ All S3 access happens client-side
-- ✅ Use read-only IAM credentials
-- ⚠️ Don't commit credentials to git
-- ⚠️ Use a dedicated read-only IAM user
-
-## TV/Kiosk Mode
-
-For permanent display on a TV:
-
-### macOS
-```bash
-open -a "Google Chrome" --args --kiosk ".github/benchmark-dashboard/index.html"
-```
-
-### Linux
-```bash
-chromium-browser --kiosk --app="file://$(pwd)/.github/benchmark-dashboard/index.html"
-```
-
-### Auto-start on Boot
-
-Create a simple startup script:
-
-```bash
-#!/bin/bash
-# /usr/local/bin/start-benchmark-dashboard.sh
-
-cd /path/to/exo
-python3 -m http.server 8080 &
-sleep 2
-chromium-browser --kiosk http://localhost:8080/.github/benchmark-dashboard/
-```
-
-## Data Displayed
-
-### Summary Cards
-- **Latest Success Rate**: Most recent benchmark success percentage with trend
-- **Avg Response Time**: Latest average response time in ms with trend
-- **Total Benchmarks**: Count of all benchmarks run
-- **Active Configurations**: Number of unique benchmark configs
-
-### Charts
-1. **Success Rate Over Time**: Line chart showing reliability trends
-2. **Average Response Time**: Performance over time (lower is better)
-3. **Throughput**: Tokens/second metric (higher is better)
-4. **Request Distribution**: Stacked bar chart of successes/failures
-
-## How It Works
-
-1. **Loads AWS SDK**: Uses AWS SDK for JavaScript (browser version)
-2. **Lists S3 Objects**: Fetches all files from `s3://exo-benchmark-results/bench/`
-3. **Downloads Results**: Fetches each JSON result file
-4. **Parses & Visualizes**: Uses Chart.js to create interactive charts
-5. **Auto-Refreshes**: Polls S3 every 60 seconds for new results
-
-## Customization
-
-To modify the dashboard:
-
-1. Edit `index.html` 
-2. Adjust `REFRESH_INTERVAL` for different polling frequency
-3. Modify chart colors/styles in the Chart.js configuration
-4. Add new metrics by extending the results parsing
-
-## Troubleshooting
-
-**"AWS credentials not configured"**
-- Click "Configure AWS Credentials" and enter your keys
-
-**"Error loading benchmark data"**
-- Check AWS credentials are correct
-- Verify S3 bucket name is `exo-benchmark-results`
-- Ensure IAM user has read permissions
-- Check browser console for detailed errors
-
-**"No benchmark results found"**
-- Wait for benchmark workflows to run
-- Verify results are being uploaded to S3
-- Check S3 bucket has files in `bench/` prefix
-
-**Charts not updating**
-- Check browser console for errors
-- Verify network connectivity to S3
-- Try refreshing the page manually
-
--- a/.github/benchmark-dashboard/index.html
+++ b/.github/benchmark-dashboard/index.html
--- a/.github/configs/README.md
+++ b/.github/configs/README.md
@@ -1,186 +0,0 @@
-# EXO Benchmark Configurations
-
-This directory contains configuration files for the EXO staged benchmark system.
-
-## Overview
-
-The staged benchmark system allows you to run complex, multi-stage load tests against EXO clusters. Each stage can have different characteristics:
-
-- **Prompt Length**: Number of tokens in the input prompt
-- **Generation Length**: Maximum tokens to generate in the response
-- **Time Between Requests**: Delay (in seconds) between firing consecutive requests
-- **Iterations**: Number of requests to send in this stage
-
-Requests are **fire-and-forget** - they don't wait for the previous request to complete. This allows you to test overlapping request handling and measure success rates under load.
-
-## Configuration Files
-
-### `bench_simple.yaml`
-A minimal configuration that replicates the behavior of the original `bench.py` script:
-- Single stage with 1 iteration
-- Short prompt (~20 tokens)
-- Generates up to 100 tokens
-
-This is useful for quick smoke tests.
-
-### `bench_config.yaml`
-A comprehensive multi-stage benchmark with:
-1. **Warmup** (10 requests): Light load with short prompts
-2. **Medium Load** (20 requests): Moderate load with medium prompts
-3. **Stress Test** (30 requests): Heavy overlapping requests with long prompts
-4. **Cooldown** (5 requests): Light load to wind down
-
-This tests the cluster's behavior under varying load patterns.
-
-## Configuration Schema
-
-```yaml
-# Hardware configuration - maps runner labels to instance counts
-hardware_plan:
-  M3ULTRA_GPU80_512GB: 4
-
-# Environment variables to set on each node (optional)
-environment:
-  OVERRIDE_MEMORY_MB: 512
-
-# Timeout for instance and runner readiness (seconds)
-timeout_seconds: 600
-
-# Model instances to run concurrently
-model_ids:
-  - "mlx-community/Llama-3.2-1B-Instruct-4bit"
-
-# Benchmark stages
-stages:
-  - name: "stage_name"              # Human-readable name for this stage
-    prompt_length: 100               # Target prompt length in tokens
-    generation_length: 200           # Max tokens to generate
-    time_between_requests: 2.0       # Seconds between firing requests
-    iterations: 10                   # Number of requests in this stage
-```
-
-## Running Benchmarks
-
-### Via GitHub Actions
-
-**Automatic (every commit):**
-- The **`bench`** workflow runs automatically on every push
-- Uses `bench_simple.yaml` as the default configuration
-- All settings (hardware plan, timeout, environment variables, models, stages) are defined in the config file
-
-**Manual (on-demand):**
-1. Go to **Actions** → **bench** workflow
-2. Click **Run workflow**
-3. Configure:
-   - **Config File**: Path to your YAML config (default: `.github/configs/bench_simple.yaml`)
-     - `.github/configs/bench_simple.yaml` for quick tests
-     - `.github/configs/bench_config.yaml` for complex multi-stage tests
-   
-All other settings (hardware plan, timeout, environment variables, models, stages) are read from the specified config file.
-
-### Via Command Line
-
-```bash
-# Start EXO on localhost:8000
-uv run exo --api-port 8000
-
-# Run simple benchmark (1 stage, 1 iteration)
-python3 .github/scripts/bench.py \
-  --api-port 8000 \
-  --config .github/configs/bench_simple.yaml \
-  --expected-nodes 1 \
-  --is-primary true \
-  --timeout-seconds 600
-
-# Run complex staged benchmark (4 stages, multiple iterations)
-python3 .github/scripts/bench.py \
-  --api-port 8000 \
-  --config .github/configs/bench_config.yaml \
-  --expected-nodes 1 \
-  --is-primary true \
-  --timeout-seconds 600
-```
-
-## Output Metrics
-
-For each stage, the benchmark reports:
-
-- **Total Requests**: Number of requests fired
-- **Successful Requests**: Requests that completed successfully
-- **Failed Requests**: Requests that encountered errors
-- **Success Rate**: Percentage of successful requests
-- **Total Tokens**: Sum of all tokens generated across successful requests
-- **Avg Tokens/Request**: Average tokens per successful request
-- **Avg Time/Request**: Average completion time per successful request
-
-A JSON summary is also printed for easy parsing and storage.
-
-## Creating Custom Benchmarks
-
-To create a custom benchmark:
-
-1. Copy an existing config file (e.g., `bench_config.yaml`)
-2. Modify the stages to match your test scenario
-3. Save it in this directory with a descriptive name
-4. Run it using the workflow or command line
-
-### Example: Sustained Load Test
-
-```yaml
-hardware_plan:
-  M3ULTRA_GPU80_512GB: 2
-
-environment:
-  OVERRIDE_MEMORY_MB: 1024
-
-timeout_seconds: 600
-
-model_ids:
-  - "mlx-community/Llama-3.2-1B-Instruct-4bit"
-
-stages:
-  - name: "sustained_load"
-    prompt_length: 200
-    generation_length: 150
-    time_between_requests: 0.5     # Very fast - 2 requests/second
-    iterations: 100                 # Run for ~50 seconds
-```
-
-### Example: Varying Prompt Sizes
-
-```yaml
-hardware_plan:
-  M4PRO_GPU16_24GB: 3
-
-timeout_seconds: 900
-
-model_ids:
-  - "mlx-community/Llama-3.2-1B-Instruct-4bit"
-
-stages:
-  - name: "tiny_prompts"
-    prompt_length: 10
-    generation_length: 100
-    time_between_requests: 1.0
-    iterations: 10
-    
-  - name: "medium_prompts"
-    prompt_length: 200
-    generation_length: 100
-    time_between_requests: 1.0
-    iterations: 10
-    
-  - name: "large_prompts"
-    prompt_length: 1000
-    generation_length: 100
-    time_between_requests: 1.0
-    iterations: 10
-```
-
-## Tips
-
-- **Overlapping Requests**: Set `time_between_requests` < expected completion time to test concurrent request handling
-- **Sequential Requests**: Set `time_between_requests` > expected completion time to ensure requests don't overlap
-- **Realistic Load**: Model real usage patterns by varying prompt/generation lengths across stages
-- **Success Rate**: A 100% success rate indicates the cluster handled the load well; lower rates suggest capacity limits
-
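Reviewer note on the removed configs README: its tips about overlapping versus sequential requests can be checked with a little arithmetic. The sketch below is not part of the removed `bench.py`. It is a minimal illustration, assuming a stage dictionary shaped like the documented schema and a caller-supplied `expected_completion_s` estimate (a hypothetical parameter; the README gives no such figure).

```python
import math
from typing import TypedDict


class Stage(TypedDict):
    # Field names follow the stage schema in the removed README.
    name: str
    prompt_length: int
    generation_length: int
    time_between_requests: float
    iterations: int


def firing_window_seconds(stage: Stage) -> float:
    """Approximate time spent firing all requests of a fire-and-forget stage."""
    return stage["iterations"] * stage["time_between_requests"]


def estimated_in_flight(stage: Stage, expected_completion_s: float) -> int:
    """Rough estimate of how many requests run at once.

    expected_completion_s is a hypothetical estimate supplied by the caller.
    """
    interval = stage["time_between_requests"]
    if interval <= 0:
        # With no delay, every request is fired back to back.
        return stage["iterations"]
    overlap = math.ceil(expected_completion_s / interval)
    return max(1, min(stage["iterations"], overlap))


sustained_load: Stage = {
    "name": "sustained_load",
    "prompt_length": 200,
    "generation_length": 150,
    "time_between_requests": 0.5,
    "iterations": 100,
}

print(firing_window_seconds(sustained_load))     # about 50 seconds of firing
print(estimated_in_flight(sustained_load, 3.0))  # several requests overlap
```

An estimate of 1 corresponds to the "Sequential Requests" tip; anything larger corresponds to "Overlapping Requests".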
--- a/.github/configs/bench_config.yaml
+++ b/.github/configs/bench_config.yaml
@@ -1,49 +0,0 @@
-# EXO Staged Benchmark Configuration
-# This configuration defines a multi-stage load test for EXO clusters
-
-# Hardware configuration - maps runner labels to instance counts
-hardware_plan:
-  M3ULTRA_GPU80_512GB: 4
-
-# Environment variables to set on each node (optional)
-environment:
-  OVERRIDE_MEMORY_MB: 512
-
-# Timeout for instance and runner readiness (seconds)
-timeout_seconds: 600
-
-# Multiple instances run concurrently on the cluster
-model_ids:
-  - "mlx-community/Qwen3-0.6B-4bit"
-  - "mlx-community/Qwen3-0.6B-4bit"
-
-# Stages run sequentially, each with its own characteristics
-stages:
-  # Stage 1: Light load with short prompts
-  - name: "warmup"
-    prompt_length: 50          # Number of tokens in prompt
-    generation_length: 100     # Max tokens to generate
-    time_between_requests: 5.0 # Seconds between firing requests
-    iterations: 10             # Number of requests to send in this stage
-    
-  # Stage 2: Medium load with medium prompts
-  - name: "medium_load"
-    prompt_length: 200
-    generation_length: 150
-    time_between_requests: 3.0
-    iterations: 20
-    
-  # Stage 3: Heavy load with long prompts - requests will overlap
-  - name: "stress_test"
-    prompt_length: 500
-    generation_length: 200
-    time_between_requests: 1.0  # Fast firing - will definitely overlap
-    iterations: 30
-    
-  # Stage 4: Cool down with simple prompts
-  - name: "cooldown"
-    prompt_length: 50
-    generation_length: 50
-    time_between_requests: 10.0
-    iterations: 5
-
--- a/.github/configs/bench_simple.yaml
+++ b/.github/configs/bench_simple.yaml
@@ -1,125 +0,0 @@
-# Simple single-shot benchmark
-# Tests 2 instances concurrently on 2 nodes
-
-# Hardware configuration - maps runner labels to instance counts
-hardware_plan:
-  puffin4: 1
-  puffin8: 1
-
-# Environment variables to set on each node
-environment:
-  PLACEHOLDER: "placeholder"
-  # OVERRIDE_MEMORY_MB: 50000
-  MLX_METAL_FAST_SYNCH: 1
-
-# Timeout for instance and runner readiness (seconds)
-timeout_seconds: 1800
-
-# Model instances to run concurrently
-model_ids:
-  # - "mlx-community/DeepSeek-V3.1-8bit"
-  # - "mlx-community/Kimi-K2-Instruct-4bit"
-  - "mlx-community/Kimi-K2-Thinking"
-  # - "mlx-community/Qwen3-235B-A22B-4bit"
-  # - "mlx-community/Llama-3.3-70B-Instruct-4bit"
-  # - "mlx-community/Llama-3.3-70B-Instruct-8bit"
-  # - "mlx-community/Llama-3.2-1B-Instruct-4bit"
-
-# Sharding strategy: "Pipeline" or "Tensor"
-sharding: "Tensor"
-
-# Instance type: "MlxRing" or "MlxIbv"
-instance_meta: "MlxIbv"
-
-# If true, run requests sequentially (no overlap); if false, fire-and-forget (default: false)
-no_overlap: true
-
-# Benchmark stages
-# pp: 64, 256, 1024, 2048, 4096, 8192, 16384
-# g: 64, 512
-stages:
-  # - name: "simple"
-  #   prompt_length: 512
-  #   generation_length: 10
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp64_g64"
-  #   prompt_length: 64
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp64_g64"
-  #   prompt_length: 64
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp64_g512"
-  #   prompt_length: 64
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp256_g64"
-  #   prompt_length: 256
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  - name: "pp256_g64"
-    prompt_length: 256
-    generation_length: 64
-    time_between_requests: 2.0
-    iterations: 5
-  # - name: "pp256_g512"
-  #   prompt_length: 256
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp1024_g64"
-  #   prompt_length: 1024
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp1024_g512"
-  #   prompt_length: 1024
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp2048_g64"
-  #   prompt_length: 2048
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp2048_g512"
-  #   prompt_length: 2048
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp4096_g64"
-  #   prompt_length: 4096
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 4
-  # - name: "pp4096_g512"
-  #   prompt_length: 4096
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp8192_g64"
-  #   prompt_length: 8192
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp8192_g512"
-  #   prompt_length: 8192
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 5
-  # - name: "pp16384_g64"
-  #   prompt_length: 16384
-  #   generation_length: 64
-  #   time_between_requests: 2.0
-  #   iterations: 10
-  # - name: "pp16384_g512"
-  #   prompt_length: 16384
-  #   generation_length: 512
-  #   time_between_requests: 2.0
-  #   iterations: 10
--- a/.github/scripts/bench.py
+++ b/.github/scripts/bench.py
--- a/.github/scripts/build_matrix.py
+++ b/.github/scripts/build_matrix.py
@@ -1,70 +0,0 @@
-#!/usr/bin/env python3
-import json
-import os
-from typing import NotRequired, TypedDict, cast
-
-import yaml
-
-
-class MatrixEntry(TypedDict):
-    label: str
-    index: int
-
-
-class MatrixInclude(TypedDict):
-    label: str
-    index: int
-    is_primary: bool
-    expected_nodes: int
-
-
-class Config(TypedDict):
-    hardware_plan: dict[str, int]
-    timeout_seconds: NotRequired[int]
-    environment: NotRequired[dict[str, str]]
-
-
-# Read the config file
-config_file: str = os.environ["CONFIG_FILE"]
-with open(config_file, "r") as f:
-    config: Config = cast(Config, yaml.safe_load(f))
-
-# Extract hardware plan from config
-plan: dict[str, int] = config["hardware_plan"]
-if not plan:
-    raise ValueError(f"No hardware_plan found in {config_file}")
-
-# Build matrix entries
-entries: list[MatrixEntry] = []
-for label, count in plan.items():
-    for idx in range(count):
-        entries.append({"label": label, "index": idx})
-
-total_nodes: int = len(entries)
-matrix: dict[str, list[MatrixInclude]] = {
-    "include": [
-        {
-            "label": e["label"],
-            "index": e["index"],
-            "is_primary": (i == 0),
-            "expected_nodes": total_nodes,
-        }
-        for i, e in enumerate(entries)
-    ]
-}
-
-# Extract other config values
-timeout_seconds: int = config.get("timeout_seconds", 600)
-environment: dict[str, str] = config.get("environment", {})
-
-# Output to GitHub Actions
-with open(os.environ["GITHUB_OUTPUT"], "a") as f:
-    f.write(f"matrix={json.dumps(matrix)}\n")
-    f.write(f"config_file={config_file}\n")
-    f.write(f"timeout_seconds={timeout_seconds}\n")
-    f.write(f"environment={json.dumps(environment)}\n")
-
-print(f"Matrix: {json.dumps(matrix)}")
-print(f"Config file: {config_file}")
-print(f"Timeout: {timeout_seconds}")
-print(f"Environment: {json.dumps(environment)}")
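Reviewer note on the removed `build_matrix.py`: the expansion from `hardware_plan` to the Actions matrix is easier to follow as a pure function. The sketch below restates the same logic without the `CONFIG_FILE` and `GITHUB_OUTPUT` handling; the function name `build_matrix` is introduced here for illustration and does not appear in the removed script.

```python
import json


def build_matrix(hardware_plan: dict[str, int]) -> dict[str, list[dict[str, object]]]:
    """Expand {label: count} into one matrix entry per runner."""
    if not hardware_plan:
        raise ValueError("hardware_plan is empty")

    # One entry per machine, indexed within its label.
    entries = [
        {"label": label, "index": idx}
        for label, count in hardware_plan.items()
        for idx in range(count)
    ]
    total_nodes = len(entries)

    # Only the first entry overall is the primary; every entry
    # carries the total node count.
    return {
        "include": [
            {**entry, "is_primary": i == 0, "expected_nodes": total_nodes}
            for i, entry in enumerate(entries)
        ]
    }


matrix = build_matrix({"M4PRO_GPU16_24GB": 2, "M3ULTRA_GPU80_512GB": 1})
print(json.dumps(matrix, indent=2))
```

With the plan above, the result has three entries, and only the first `M4PRO_GPU16_24GB` runner is marked `is_primary`.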
--- a/.github/workflows/BENCH_USAGE.md
+++ b/.github/workflows/BENCH_USAGE.md
@@ -1,156 +0,0 @@
-# Benchmark Workflow Usage
-
-## Overview
-
-The `bench_matrix.yml` workflow enables distributed benchmarking of models across multiple self-hosted macOS runners with different hardware configurations.
-
-## Workflow Inputs
-
-| Input | Description | Default | Required |
-|-------|-------------|---------|----------|
-| `model_id` | Model ID to benchmark | `mlx-community/Llama-3.2-1B-Instruct-4bit` | Yes |
-| `hardware_plan` | JSON mapping of runner labels to counts | `{"M4PRO_GPU16_24GB": 1}` | Yes |
-| `prompt` | Benchmark prompt text | `What is the capital of France?` | No |
-| `timeout_seconds` | Timeout for instance/runner readiness | `600` | No |
-
-## Hardware Plan Format
-
-The `hardware_plan` input is a JSON object mapping runner labels to the number of machines:
-
-```json
-{
-  "M4PRO_GPU16_24GB": 2,
-  "M3ULTRA_GPU80_512GB": 1
-}
-```
-
-This example would:
-- Start 2 runners with the `M4PRO_GPU16_24GB` label
-- Start 1 runner with the `M3ULTRA_GPU80_512GB` label
-- Total of 3 runners coordinating on a single distributed inference instance
-
-## How It Works
-
-1. **Planning Job** (`plan`)
-   - Runs on `ubuntu-latest`
-   - Parses the `hardware_plan` JSON
-   - Generates a dynamic matrix with one entry per runner
-   - Only the first runner (index 0) is marked as `is_primary`
-
-2. **Benchmark Worker Jobs** (`bench_worker`)
-   - Each job runs on a self-hosted macOS runner with the specified label
-   - All runners start EXO in parallel
-   - The primary runner creates the model instance
-   - All runners wait for their assigned runner to be ready (Loaded/Running status)
-   - The primary runner executes the benchmark and prints results
-   - The primary runner deletes the instance
-
-## Example Usage
-
-### Single Machine Benchmark
-
-```yaml
-model_id: mlx-community/Llama-3.2-1B-Instruct-4bit
-hardware_plan: '{"M4PRO_GPU16_24GB": 1}'
-prompt: What is the capital of France?
-timeout_seconds: 600
-```
-
-### Multi-Machine Distributed Benchmark
-
-```yaml
-model_id: mlx-community/Llama-3.2-3B-Instruct-4bit
-hardware_plan: '{"M4PRO_GPU16_24GB": 2, "M3ULTRA_GPU80_512GB": 1}'
-prompt: Explain quantum computing in simple terms.
-timeout_seconds: 900
-```
-
-## Benchmark Output
-
-The primary runner outputs a JSON object with benchmark results:
-
-```json
-{
-  "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
-  "instance_id": "abc-123-def",
-  "tokens": 42,
-  "elapsed_s": 2.451,
-  "tps": 17.136
-}
-```
-
-Where:
-- `tokens`: Number of chunks/tokens generated
-- `elapsed_s`: Total elapsed time in seconds
-- `tps`: Tokens per second (tokens / elapsed_s)
-
-## Runner Requirements
-
-Each self-hosted runner must:
-- Be labeled with appropriate hardware tags (e.g., `M4PRO_GPU16_24GB`)
-- Have the `self-hosted` and `macOS` labels
-- Have Nix installed with flakes enabled
-- Have network connectivity to other runners in the same job
-
-## Architecture
-
-```
-┌─────────────────────────────────────────────────────────────┐
-│ GitHub Actions Workflow (bench_matrix.yml)                  │
-├─────────────────────────────────────────────────────────────┤
-│                                                              │
-│  ┌────────────────┐                                         │
-│  │  Plan Job      │                                         │
-│  │  (ubuntu)      │──┬─► Matrix: [{label, index, primary}] │
-│  └────────────────┘  │                                      │
-│                      │                                      │
-│  ┌───────────────────▼──────────────────────────────────┐  │
-│  │  Bench Worker Jobs (Matrix)                         │  │
-│  ├──────────────────────────────────────────────────────┤  │
-│  │                                                       │  │
-│  │  Runner 0 (Primary)     Runner 1         Runner 2    │  │
-│  │  ┌─────────────┐       ┌─────────────┐ ┌──────────┐ │  │
-│  │  │ Start EXO   │       │ Start EXO   │ │ Start EXO│ │  │
-│  │  │ Create Inst │       │ Wait...     │ │ Wait...  │ │  │
-│  │  │ Wait Ready  │       │ Wait Ready  │ │ Wait...  │ │  │
-│  │  │ Run Bench   │       │ (idle)      │ │ (idle)   │ │  │
-│  │  │ Print TPS   │       │             │ │          │ │  │
-│  │  │ Delete Inst │       │             │ │          │ │  │
-│  │  └─────────────┘       └─────────────┘ └──────────┘ │  │
-│  └───────────────────────────────────────────────────────┘  │
-└─────────────────────────────────────────────────────────────┘
-```
-
-## Implementation Details
-
-### `scripts/bench.py`
-
-A standalone Python script that:
-- Creates instance (primary only)
-- Polls `/state` endpoint until instance and all runners are ready
-- Executes chat completion with timing (primary only)
-- Parses SSE stream and counts tokens
-- Computes TPS metrics
-- Cleans up instance (primary only)
-
-### Key Functions
-
-- `wait_for_instance()`: Polls until instance with model_id appears
-- `wait_for_runners_ready()`: Polls until expected number of runners reach Loaded/Running status
-- `run_benchmark()`: Executes chat completion, measures time, counts tokens
-
-## Troubleshooting
-
-### Instance never becomes ready
-- Check EXO logs in the workflow output
-- Verify model_id is valid and accessible
-- Increase `timeout_seconds`
-
-### Runner mismatch
-- Ensure hardware_plan counts match available labeled runners
-- Check runner labels match exactly (case-sensitive)
-
-### Network issues
-- Verify runners can communicate on the network
-- Check firewall rules between runner hosts
-
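Reviewer note on the removed `BENCH_USAGE.md`: the `tps` field is defined as `tokens / elapsed_s`, with `tokens` counted from SSE chunks. The sketch below shows that calculation under stated assumptions: the stream uses `data:` lines and ends with a `[DONE]` sentinel (a common convention for chat-completion streams, not confirmed by this diff), and both function names are hypothetical rather than taken from `bench.py`.

```python
from collections.abc import Iterable


def count_sse_chunks(lines: Iterable[str]) -> int:
    """Count non-empty `data:` payloads, stopping at an assumed [DONE] sentinel."""
    count = 0
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        if payload:
            count += 1
    return count


def tokens_per_second(tokens: int, elapsed_s: float) -> float:
    """tps = tokens / elapsed_s, rounded to three decimals as in the sample output."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return round(tokens / elapsed_s, 3)


stream = ['data: {"delta": "Par"}', "", 'data: {"delta": "is"}', "data: [DONE]"]
print(count_sse_chunks(stream))
print(tokens_per_second(42, 2.451))
```

The second call uses the figures from the sample JSON output in the removed document (`tokens: 42`, `elapsed_s: 2.451`).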
--- a/.github/workflows/bench.yml
+++ b/.github/workflows/bench.yml
@@ -1,305 +0,0 @@
-name: bench
-
-on: [push]
-
-jobs:
-  plan:
-    if: contains(github.event.head_commit.message, '/bench')
-    runs-on: ubuntu-latest
-    outputs:
-      matrix: ${{ steps.build.outputs.matrix }}
-      config_file: ${{ steps.build.outputs.config_file }}
-      timeout_seconds: ${{ steps.build.outputs.timeout_seconds }}
-      environment: ${{ steps.build.outputs.environment }}
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Build matrix from config file
-        id: build
-        shell: bash
-        run: |
-          set -euo pipefail
-          CONFIG_FILE='.github/configs/bench_simple.yaml'
-          export CONFIG_FILE
-          echo "Config file: $CONFIG_FILE"
-          python3 .github/scripts/build_matrix.py
-
-  bench_worker:
-    needs: plan
-    strategy:
-      fail-fast: false
-      matrix: ${{ fromJSON(needs.plan.outputs.matrix) }}
-    name: "bench on ${{ matrix.label }} [${{ matrix.index }}]"
-    runs-on: [self-hosted, macOS, "${{ matrix.label }}"]
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-        with:
-          lfs: false
-
-      - name: Configure git user
-        run: |
-          git config --local user.email "github-actions@users.noreply.github.com"
-          git config --local user.name  "github-actions bot"
-        shell: bash
-
-      # TODO: this is mega hacky and I'd like a simpler solution.
-      - name: Setup Nix Environment
-        run: |
-          echo "Checking for nix installation..."
-          
-          # Check if nix is already available
-          if command -v nix >/dev/null 2>&1; then
-            echo "Nix already in PATH"
-          # Try sourcing profile scripts to set up environment properly
-          elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
-            echo "Sourcing multi-user nix-daemon profile script"
-            source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
-          elif [ -f "$HOME/.nix-profile/etc/profile.d/nix.sh" ]; then
-            echo "Sourcing single-user nix profile script"
-            source "$HOME/.nix-profile/etc/profile.d/nix.sh"
-          elif [ -f /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh ]; then
-            echo "Sourcing per-user nix profile script"
-            source /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh
-          elif [ -f /etc/profile.d/nix.sh ]; then
-            echo "Sourcing system-wide nix profile script"
-            source /etc/profile.d/nix.sh
-          # Fallback: manually add nix to PATH if binary exists
-          elif [ -f /nix/var/nix/profiles/default/bin/nix ]; then
-            echo "Found nix binary, manually adding to PATH"
-            export PATH="/nix/var/nix/profiles/default/bin:$PATH"
-          elif [ -f "$HOME/.nix-profile/bin/nix" ]; then
-            echo "Found nix binary in user profile, manually adding to PATH"
-            export PATH="$HOME/.nix-profile/bin:$PATH"
-          else
-            echo "Nix not found. Debugging info:"
-            echo "USER: $USER"
-            echo "HOME: $HOME"
-            echo "Current PATH: $PATH"
-            echo ""
-            echo "Checking common Nix locations:"
-            echo "  /nix/var/nix/profiles/default/bin/nix:"
-            ls -la /nix/var/nix/profiles/default/bin/nix 2>/dev/null || echo "    Not found"
-            echo "  /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh:"
-            ls -la /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh 2>/dev/null || echo "    Not found"
-            echo "  ~/.nix-profile/etc/profile.d/nix.sh:"
-            ls -la "$HOME/.nix-profile/etc/profile.d/nix.sh" 2>/dev/null || echo "    Not found"
-            echo "  /nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh:"
-            ls -la "/nix/var/nix/profiles/per-user/$USER/profile/etc/profile.d/nix.sh" 2>/dev/null || echo "    Not found"
-            echo ""
-            echo "/nix directory structure:"
-            ls -la /nix 2>/dev/null || echo "    /nix directory not found"
-            echo ""
-            echo "/nix/var:"
-            ls -la /nix/var 2>/dev/null || echo "    /nix/var not found"
-            echo ""
-            echo "/nix/store:"
-            ls -la /nix/store 2>/dev/null | head -20 || echo "    /nix/store not found"
-            echo ""
-            echo "GitHub Actions runner is running as user '$USER'."
-            echo "If Nix is installed for a different user, either:"
-            echo "  1. Install Nix for user '$USER' (multi-user install recommended)"
-            echo "  2. Configure the runner service to run as the user with Nix installed"
-            echo "  3. Ensure Nix is installed system-wide with proper daemon setup"
-            exit 1
-          fi
-          
-          # Verify nix is available and persist to GITHUB_ENV
-          if command -v nix >/dev/null 2>&1; then
-            echo "✓ Nix is available"
-            nix --version
-            echo "PATH=$PATH" >> $GITHUB_ENV
-            if [ -n "$NIX_PATH" ]; then
-              echo "NIX_PATH=$NIX_PATH" >> $GITHUB_ENV
-            fi
-          else
-            echo "ERROR: Failed to set up Nix"
-            echo "PATH after setup attempt: $PATH"
-            exit 1
-          fi
-        shell: bash
-
-      - name: Setup EXO_HOME and API_PORT
-        run: |
-          EXO_HOME=$(mktemp -d -t exo-e2e-XXXXXXXX)
-          API_PORT=$((49152 + RANDOM % (65535 - 49152 + 1)))
-          EXO_MODELS_DIR="$HOME/.exo/models"
-          EXO_LIBP2P_NAMESPACE="bench-${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}"
-          echo "EXO_HOME=$EXO_HOME" >> "$GITHUB_ENV"
-          echo "API_PORT=$API_PORT" >> "$GITHUB_ENV"
-          echo "EXO_MODELS_DIR=$EXO_MODELS_DIR" >> "$GITHUB_ENV"
-          echo "EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE" >> "$GITHUB_ENV"
-          echo "Created EXO_HOME: $EXO_HOME"
-          echo "Generated API_PORT: $API_PORT"
-          echo "Using models from: $EXO_MODELS_DIR"
-          echo "Using libp2p namespace: $EXO_LIBP2P_NAMESPACE"
-        shell: bash
-
-      - name: Configure local MLX if available
-        run: |
-          echo "=== DEBUG: Checking for local MLX configuration ==="
-          MODIFIED=false
-          
-          echo "Checking for /Users/Shared/mlx directory..."
-          if [ -d "/Users/Shared/mlx" ]; then
-            echo "✓ Found /Users/Shared/mlx"
-            ls -la /Users/Shared/mlx | head -5
-            echo "Enabling local mlx path in pyproject.toml"
-            sed -i.bak 's|^# mlx = { path = "/Users/Shared/mlx", editable=true }$|mlx = { path = "/Users/Shared/mlx", editable=true }|' pyproject.toml
-            MODIFIED=true
-          else
-            echo "✗ /Users/Shared/mlx not found, will use PyPI version"
-          fi
-          
-          echo "Checking for /Users/Shared/mlx-lm directory..."
-          if [ -d "/Users/Shared/mlx-lm" ]; then
-            echo "✓ Found /Users/Shared/mlx-lm"
-            ls -la /Users/Shared/mlx-lm | head -5
-            echo "Enabling local mlx-lm path in pyproject.toml"
-            sed -i.bak 's|^# mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }$|mlx-lm = { path = "/Users/Shared/mlx-lm", editable=true }|' pyproject.toml
-            MODIFIED=true
-          else
-            echo "✗ /Users/Shared/mlx-lm not found, will use PyPI version"
-          fi
-          
-          if [ "$MODIFIED" = true ]; then
-            echo "=== Modified pyproject.toml [tool.uv.sources] section: ==="
-            sed -n '/\[tool\.uv\.sources\]/,/^\[/{/^\[tool\.uv\.sources\]/p; /^\[/!p;}' pyproject.toml
-            echo "=== Regenerating uv.lock with local MLX paths... ==="
-            nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command uv lock --upgrade-package mlx --upgrade-package mlx-lm
-            echo "✓ Lock file regenerated"
-          else
-            echo "⚠ No local MLX directories found, using PyPI packages"
-          fi
-          echo "=== DEBUG: Local MLX configuration complete ==="
-        shell: bash
-
-      - name: Sync dependencies
-        run: |
-          if [ -d "/Users/Shared/test" ]; then
-            pushd /Users/Shared/test
-            uv sync --reinstall
-            popd
-          fi
-          echo "Running just sync to ensure clean dependencies..."
-          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command just sync
-        shell: bash
-
-      - name: Start EXO and run bench script
-        shell: bash
-        env:
-          IS_PRIMARY: ${{ matrix.is_primary }}
-          EXPECTED_NODES: ${{ matrix.expected_nodes }}
-          HARDWARE_LABEL: ${{ matrix.label }}
-          CONFIG_FILE: ${{ needs.plan.outputs.config_file }}
-          TIMEOUT_SECONDS: ${{ needs.plan.outputs.timeout_seconds }}
-          ENVIRONMENT_JSON: ${{ needs.plan.outputs.environment }}
-        run: |
-          set -euo pipefail
-
-          # Parse environment variables from config
-          ENV_VARS=""
-          if [ -n "$ENVIRONMENT_JSON" ] && [ "$ENVIRONMENT_JSON" != "{}" ]; then
-            ENV_VARS=$(echo "$ENVIRONMENT_JSON" | python3 -c "import sys, json; env = json.load(sys.stdin); print(' '.join([f'{k}={v}' for k, v in env.items()]))")
-          fi
-
-          echo "Starting EXO with API_PORT=${API_PORT} EXO_HOME=${EXO_HOME} EXO_LIBP2P_NAMESPACE=${EXO_LIBP2P_NAMESPACE}"
-          echo "Environment variables from config: $ENV_VARS"
-          LOG_FILE=/tmp/exo.log
-          : > "$LOG_FILE"
-
-          MASTER_FLAG=""
-          if [ "$IS_PRIMARY" = "true" ]; then
-            MASTER_FLAG="-m"
-          fi
-
-          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
-            "EXO_HOME=$EXO_HOME EXO_MODELS_DIR=$EXO_MODELS_DIR EXO_LIBP2P_NAMESPACE=$EXO_LIBP2P_NAMESPACE $ENV_VARS PYTHONUNBUFFERED=1 PYTHONDEBUG=1 PYTHONPATH=. uv run exo $MASTER_FLAG --api-port $API_PORT" \
-            >> "$LOG_FILE" 2>&1 &
-
-          EXO_PID=$!
-          echo "Started EXO in background with PID: $EXO_PID"
-          echo "Log file: $LOG_FILE"
-
-          cleanup() {
-            echo '=== EXO log (tail) ==='
-            tail -n 300 "$LOG_FILE" || true
-            if ps -p "$EXO_PID" >/dev/null 2>&1; then
-              echo "Killing EXO (PID $EXO_PID)"
-              kill "$EXO_PID" || true
-            fi
-          }
-          trap cleanup EXIT
-
-          for i in $(seq 1 60); do
-            if curl -s "http://localhost:${API_PORT}/state" >/dev/null 2>&1; then
-              echo "EXO API ready"
-              break
-            fi
-            if ! ps -p "$EXO_PID" >/dev/null 2>&1; then
-              echo "EXO terminated early"; sed -n '1,200p' "$LOG_FILE" || true; exit 1
-            fi
-            sleep 1
-          done
-
-          RESULTS_FILE="/tmp/bench_results_${GITHUB_RUN_ID}_${GITHUB_RUN_ATTEMPT}_$(date +%s).json"
-          echo "Results will be saved to: $RESULTS_FILE"
-          echo "RESULTS_FILE=$RESULTS_FILE" >> "$GITHUB_ENV"
-
-          echo "Running bench script with config: $CONFIG_FILE, timeout: $TIMEOUT_SECONDS"
-          nix --extra-experimental-features nix-command --extra-experimental-features flakes develop --command bash -c \
-            "PYTHONUNBUFFERED=1 uv run --no-project --with pyyaml --with pydantic python .github/scripts/bench.py \
-              --api-port $API_PORT \
-              --config $CONFIG_FILE \
-              --expected-nodes ${EXPECTED_NODES} \
-              --is-primary ${IS_PRIMARY} \
-              --timeout-seconds ${TIMEOUT_SECONDS} \
-              --output $RESULTS_FILE \
-              --git-commit ${GITHUB_SHA} \
-              --hardware-labels ${HARDWARE_LABEL}"
-
-      - name: Install AWS CLI
-        if: always() && env.RESULTS_FILE && matrix.is_primary
-        run: |
-          if ! command -v aws &> /dev/null; then
-            echo "AWS CLI not found, installing..."
-            brew install awscli
-          else
-            echo "AWS CLI already installed"
-          fi
-        shell: bash
-
-      - name: Upload results to S3
-        if: always() && env.RESULTS_FILE && matrix.is_primary
-        env:
-          AWS_ACCESS_KEY_ID: ${{ secrets.S3_BENCHMARKS_AWS_ACCESS_KEY_ID }}
-          AWS_SECRET_ACCESS_KEY: ${{ secrets.S3_BENCHMARKS_AWS_SECRET_ACCESS_KEY }}
-          AWS_DEFAULT_REGION: us-east-1
-        run: |
-          echo "Checking for results file: $RESULTS_FILE"
-          echo "Is primary: ${{ matrix.is_primary }}"
-
-          if [ -f "$RESULTS_FILE" ]; then
-            TIMESTAMP=$(date -u +%Y/%m/%d/%H%M%S)
-            S3_KEY="bench/${TIMESTAMP}_${GITHUB_SHA:0:8}_${GITHUB_RUN_ID}.json"
-            echo "Uploading results to s3://exo-benchmark-results/$S3_KEY"
-
-            aws s3 cp "$RESULTS_FILE" "s3://exo-benchmark-results/$S3_KEY" \
-              --content-type application/json \
-              --metadata "commit=${GITHUB_SHA},run_id=${GITHUB_RUN_ID},branch=${GITHUB_REF_NAME}"
-
-            echo "Results uploaded successfully"
-            echo "View at: https://exo-benchmark-results.s3.amazonaws.com/$S3_KEY"
-          else
-            echo "Results file not found at: $RESULTS_FILE"
-            echo "Skipping upload"
-          fi
-        shell: bash
-
-      - name: Cleanup EXO_HOME
-        run: |
-          echo "Cleaning up EXO_HOME: $EXO_HOME"
-          rm -rf "$EXO_HOME"
-        shell: bash
-        if: always()
--- a/.github/workflows/build-app.yml
+++ b/.github/workflows/build-app.yml
@@ -0,0 +1,327 @@
+name: Build EXO macOS DMG
+
+on:
+  workflow_dispatch:
+  push:
+    tags:
+      - "v*"
+    branches:
+      - "test-app"
+
+jobs:
+  build-macos-app:
+    runs-on: "macos-26"
+    env:
+      SPARKLE_VERSION: 2.8.1
+      SPARKLE_DOWNLOAD_PREFIX: ${{ secrets.SPARKLE_DOWNLOAD_PREFIX }}
+      SPARKLE_FEED_URL: ${{ secrets.SPARKLE_FEED_URL }}
+      SPARKLE_ED25519_PUBLIC: ${{ secrets.SPARKLE_ED25519_PUBLIC }}
+      SPARKLE_ED25519_PRIVATE: ${{ secrets.SPARKLE_ED25519_PRIVATE }}
+      SPARKLE_S3_BUCKET: ${{ secrets.SPARKLE_S3_BUCKET }}
+      SPARKLE_S3_PREFIX: ${{ secrets.SPARKLE_S3_PREFIX }}
+      EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT: ${{ secrets.EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT }}
+      AWS_REGION: ${{ secrets.AWS_REGION }}
+      EXO_BUILD_NUMBER: ${{ github.run_number }}
+      EXO_LIBP2P_NAMESPACE: ${{ github.ref_name }}
+
+    steps:
+      # ============================================================
+      # Checkout and tag validation
+      # ============================================================
+
+      - name: Checkout
+        uses: actions/checkout@v4
+        with:
+          fetch-depth: 0
+
+      - name: Derive release version from tag
+        run: |
+          if [[ "$GITHUB_REF_NAME" == "test-app" || "${{ github.event_name }}" == "workflow_dispatch" ]]; then
+            VERSION="0.0.0-alpha.0"
+            echo "IS_ALPHA=true" >> $GITHUB_ENV
+          else
+            VERSION="${GITHUB_REF_NAME#v}"
+            if [[ "$VERSION" == *-alpha* ]]; then
+              echo "IS_ALPHA=true" >> $GITHUB_ENV
+            else
+              echo "IS_ALPHA=false" >> $GITHUB_ENV
+            fi
+          fi
+          echo "RELEASE_VERSION=$VERSION" >> $GITHUB_ENV
+
+      - name: Compute build version from semver
+        run: |
+          VERSION="$RELEASE_VERSION"
+          # Extract major.minor.patch (strip prerelease suffix)
+          BASE_VERSION="${VERSION%%-*}"
+          MAJOR=$(echo "$BASE_VERSION" | cut -d. -f1)
+          MINOR=$(echo "$BASE_VERSION" | cut -d. -f2)
+          PATCH=$(echo "$BASE_VERSION" | cut -d. -f3)
+
+          # Extract prerelease number (e.g., "alpha.2" -> 2, or 999 for releases)
+          if [[ "$VERSION" == *-* ]]; then
+            PRERELEASE_PART="${VERSION#*-}"
+            PRERELEASE_NUM="${PRERELEASE_PART##*.}"
+            # Default to 0 if not a number
+            if ! [[ "$PRERELEASE_NUM" =~ ^[0-9]+$ ]]; then
+              PRERELEASE_NUM=0
+            fi
+          else
+            PRERELEASE_NUM=999
+          fi
+
+          # Compute: PRERELEASE + (1000 * PATCH) + (1_000_000 * MINOR) + (1_000_000_000 * MAJOR)
+          BUILD_VERSION=$((PRERELEASE_NUM + 1000 * PATCH + 1000000 * MINOR + 1000000000 * MAJOR))
+          echo "EXO_BUILD_VERSION=$BUILD_VERSION" >> $GITHUB_ENV
+          echo "Computed build version: $BUILD_VERSION from $VERSION"
+
+      - name: Ensure tag commit is on main
+        if: github.ref_type == 'tag'
+        run: |
+          git fetch origin main
+          # Alpha tags can be on any branch, production tags must be on main
+          if [[ "$IS_ALPHA" == "true" ]]; then
+            echo "Alpha tag detected, skipping main branch check"
+          elif ! git merge-base --is-ancestor HEAD origin/main; then
+            echo "Production tag must point to a commit on main"
+            exit 1
+          fi
+
+      # ============================================================
+      # Install dependencies
+      # ============================================================
+
+      - name: Select Xcode 26.2
+        run: |
+          sudo xcode-select -s /Applications/Xcode_26.2.app
+          if ! xcrun -f metal >/dev/null 2>&1; then
+            echo "Metal toolchain is not installed."
+            exit 1
+          fi
+
+      - name: Install Homebrew packages
+        run: brew install just awscli macmon
+
+      - name: Install UV
+        uses: astral-sh/setup-uv@v6
+        with:
+          enable-cache: true
+          cache-dependency-glob: uv.lock
+
+      - name: Setup Python
+        run: |
+          uv python install
+          uv sync --locked
+
+      - name: Build dashboard
+        run: |
+          cd dashboard
+          npm ci
+          npm run build
+
+      - name: Install Sparkle CLI
+        run: |
+          CLI_URL="${SPARKLE_CLI_URL:-https://github.com/sparkle-project/Sparkle/releases/download/${SPARKLE_VERSION}/Sparkle-${SPARKLE_VERSION}.tar.xz}"
+          echo "Downloading Sparkle CLI from: $CLI_URL"
+          mkdir -p /tmp/sparkle
+          curl --fail --location --output /tmp/sparkle.tar.xz "$CLI_URL"
+          tar -xJf /tmp/sparkle.tar.xz -C /tmp/sparkle --strip-components=1
+          echo "SPARKLE_BIN=/tmp/sparkle/bin" >> $GITHUB_ENV
+
+      - name: Prepare code-signing keychain
+        env:
+          MACOS_CERTIFICATE: ${{ secrets.MACOS_CERTIFICATE }}
+          MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
+          PROVISIONING_PROFILE: ${{ secrets.PROVISIONING_PROFILE }}
+        run: |
+          KEYCHAIN_PATH="$HOME/Library/Keychains/build.keychain-db"
+
+          # Create fresh keychain
+          security create-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
+
+          # Disable auto-lock (no timeout, no lock-on-sleep)
+          security set-keychain-settings "$KEYCHAIN_PATH"
+
+          # Add to search list while preserving existing keychains
+          security list-keychains -d user -s "$KEYCHAIN_PATH" $(security list-keychains -d user | tr -d '"')
+
+          # Set as default and unlock
+          security default-keychain -s "$KEYCHAIN_PATH"
+          security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
+
+          # Import certificate with full access for codesign
+          echo "$MACOS_CERTIFICATE" | base64 --decode > /tmp/cert.p12
+          security import /tmp/cert.p12 -k "$KEYCHAIN_PATH" -P "$MACOS_CERTIFICATE_PASSWORD" \
+            -T /usr/bin/codesign -T /usr/bin/security -T /usr/bin/productbuild
+          rm /tmp/cert.p12
+
+          # Allow codesign to access the key without prompting
+          security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
+
+          # Verify keychain is unlocked and identity is available
+          echo "Verifying signing identity..."
+          security find-identity -v -p codesigning "$KEYCHAIN_PATH"
+
+          # Setup provisioning profile
+          mkdir -p "$HOME/Library/Developer/Xcode/UserData/Provisioning Profiles"
+          echo "$PROVISIONING_PROFILE" | base64 --decode > "$HOME/Library/Developer/Xcode/UserData/Provisioning Profiles/EXO.provisionprofile"
+
+          # Export keychain path for other steps
+          echo "BUILD_KEYCHAIN_PATH=$KEYCHAIN_PATH" >> $GITHUB_ENV
+
+      # ============================================================
+      # Build the bundle
+      # ============================================================
+
+      - name: Build PyInstaller bundle
+        run: uv run pyinstaller packaging/pyinstaller/exo.spec
+
+      - name: Build Swift app
+        env:
+          MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
+          SPARKLE_FEED_URL: ${{ secrets.SPARKLE_FEED_URL }}
+          SPARKLE_ED25519_PUBLIC: ${{ secrets.SPARKLE_ED25519_PUBLIC }}
+        run: |
+          cd app/EXO
+          security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
+          SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
+          xcodebuild clean build \
+            -scheme EXO \
+            -configuration Release \
+            -derivedDataPath build \
+            MARKETING_VERSION="$RELEASE_VERSION" \
+            CURRENT_PROJECT_VERSION="$EXO_BUILD_VERSION" \
+            EXO_BUILD_TAG="$RELEASE_VERSION" \
+            EXO_BUILD_COMMIT="$GITHUB_SHA" \
+            SPARKLE_FEED_URL="$SPARKLE_FEED_URL" \
+            SPARKLE_ED25519_PUBLIC="$SPARKLE_ED25519_PUBLIC" \
+            EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT="$EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT" \
+            CODE_SIGNING_IDENTITY="$SIGNING_IDENTITY" \
+            CODE_SIGN_INJECT_BASE_ENTITLEMENTS=YES
+          mkdir -p ../../output
+          cp -R build/Build/Products/Release/EXO.app ../../output/EXO.app
+
+      - name: Inject PyInstaller runtime
+        run: |
+          rm -rf output/EXO.app/Contents/Resources/exo
+          mkdir -p output/EXO.app/Contents/Resources
+          cp -R dist/exo output/EXO.app/Contents/Resources/exo
+
+      - name: Codesign PyInstaller runtime
+        env:
+          MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
+        run: |
+          cd output
+          security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
+          SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
+          RUNTIME_DIR="EXO.app/Contents/Resources/exo"
+          find "$RUNTIME_DIR" -type f \( -perm -111 -o -name "*.dylib" -o -name "*.so" \) -print0 |
+            while IFS= read -r -d '' file; do
+              /usr/bin/codesign --force --timestamp --options runtime \
+                --sign "$SIGNING_IDENTITY" "$file"
+            done
+
+      - name: Sign, notarize, and create DMG
+        env:
+          MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
+          APPLE_NOTARIZATION_USERNAME: ${{ secrets.APPLE_NOTARIZATION_USERNAME }}
+          APPLE_NOTARIZATION_PASSWORD: ${{ secrets.APPLE_NOTARIZATION_PASSWORD }}
+          APPLE_NOTARIZATION_TEAM: ${{ secrets.APPLE_NOTARIZATION_TEAM }}
+        run: |
+          cd output
+          security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
+          SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
+          /usr/bin/codesign --deep --force --timestamp --options runtime \
+            --sign "$SIGNING_IDENTITY" EXO.app
+          mkdir -p dmg-root
+          cp -R EXO.app dmg-root/
+          ln -s /Applications dmg-root/Applications
+          DMG_NAME="EXO-${RELEASE_VERSION}.dmg"
+          hdiutil create -volname "EXO" -srcfolder dmg-root -ov -format UDZO "$DMG_NAME"
+          /usr/bin/codesign --force --timestamp --options runtime \
+            --sign "$SIGNING_IDENTITY" "$DMG_NAME"
+          if [[ -n "$APPLE_NOTARIZATION_USERNAME" ]]; then
+            SUBMISSION_OUTPUT=$(xcrun notarytool submit "$DMG_NAME" \
+              --apple-id "$APPLE_NOTARIZATION_USERNAME" \
+              --password "$APPLE_NOTARIZATION_PASSWORD" \
+              --team-id "$APPLE_NOTARIZATION_TEAM" \
+              --wait --timeout 15m 2>&1)
+            echo "$SUBMISSION_OUTPUT"
+
+            SUBMISSION_ID=$(echo "$SUBMISSION_OUTPUT" | awk 'tolower($1)=="id:" && $2 ~ /^[0-9a-fA-F-]+$/ {print $2; exit}')
+            STATUS=$(echo "$SUBMISSION_OUTPUT" | awk 'tolower($1)=="status:" {print $2; exit}')
+
+            if [[ -n "$SUBMISSION_ID" ]]; then
+              xcrun notarytool log "$SUBMISSION_ID" \
+                --apple-id "$APPLE_NOTARIZATION_USERNAME" \
+                --password "$APPLE_NOTARIZATION_PASSWORD" \
+                --team-id "$APPLE_NOTARIZATION_TEAM" > notarization-log.txt || true
+              echo "===== Notarization Log ====="
+              cat notarization-log.txt
+              echo "============================"
+            fi
+
+            if [[ "$STATUS" != "Accepted" ]]; then
+              echo "Notarization failed with status: ${STATUS:-Unknown}"
+              exit 1
+            fi
+
+            xcrun stapler staple "$DMG_NAME"
+          fi
+
+      - name: Generate Sparkle appcast
+        env:
+          SPARKLE_DOWNLOAD_PREFIX: ${{ env.SPARKLE_DOWNLOAD_PREFIX }}
+          SPARKLE_ED25519_PRIVATE: ${{ secrets.SPARKLE_ED25519_PRIVATE }}
+          IS_ALPHA: ${{ env.IS_ALPHA }}
+        run: |
+          set -euo pipefail
+          cd output
+          DOWNLOAD_PREFIX="${SPARKLE_DOWNLOAD_PREFIX:-https://assets.exolabs.net}"
+          echo "$SPARKLE_ED25519_PRIVATE" > sparkle_ed25519.key
+          chmod 600 sparkle_ed25519.key
+
+          CHANNEL_FLAG=""
+          if [[ "$IS_ALPHA" == "true" ]]; then
+            CHANNEL_FLAG="--channel alpha"
+            echo "Generating appcast for alpha channel"
+          fi
+
+          $SPARKLE_BIN/generate_appcast \
+            --ed-key-file sparkle_ed25519.key \
+            --download-url-prefix "$DOWNLOAD_PREFIX" \
+            $CHANNEL_FLAG \
+            .
+
+      # ============================================================
+      # Upload artifacts
+      # ============================================================
+
+      - name: Upload DMG
+        uses: actions/upload-artifact@v4
+        with:
+          name: EXO-dmg-${{ env.RELEASE_VERSION }}
+          path: output/EXO-${{ env.RELEASE_VERSION }}.dmg
+
+      - name: Upload to S3
+        if: env.SPARKLE_S3_BUCKET != '' && github.ref_type == 'tag'
+        env:
+          AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
+          AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
+          AWS_REGION: ${{ env.AWS_REGION }}
+          SPARKLE_S3_BUCKET: ${{ env.SPARKLE_S3_BUCKET }}
+          SPARKLE_S3_PREFIX: ${{ env.SPARKLE_S3_PREFIX }}
+          IS_ALPHA: ${{ env.IS_ALPHA }}
+        run: |
+          set -euo pipefail
+          cd output
+          PREFIX="${SPARKLE_S3_PREFIX:-}"
+          if [[ -n "$PREFIX" && "${PREFIX: -1}" != "/" ]]; then
+            PREFIX="${PREFIX}/"
+          fi
+          DMG_NAME="EXO-${RELEASE_VERSION}.dmg"
+          aws s3 cp "$DMG_NAME" "s3://${SPARKLE_S3_BUCKET}/${PREFIX}${DMG_NAME}"
+          if [[ "$IS_ALPHA" != "true" ]]; then
+            aws s3 cp "$DMG_NAME" "s3://${SPARKLE_S3_BUCKET}/${PREFIX}EXO-latest.dmg"
+            aws s3 cp appcast.xml "s3://${SPARKLE_S3_BUCKET}/${PREFIX}appcast.xml" --content-type application/xml --cache-control no-cache
+          fi
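The "Compute build version from semver" step above packs a semantic version into a single integer, so that a later version always gets a larger `CURRENT_PROJECT_VERSION`. The arithmetic is easier to check outside the shell, so here is a minimal Python sketch of the same formula. The function name `compute_build_version` is hypothetical and not part of the repository. The sketch assumes a well-formed `MAJOR.MINOR.PATCH[-prerelease]` string and, unlike bash arithmetic, does not treat numbers with leading zeros as octal.

```python
import re


def compute_build_version(version: str) -> int:
    """Mirror the workflow's formula (hypothetical helper, not in the repo):
    PRERELEASE + 1000 * PATCH + 1_000_000 * MINOR + 1_000_000_000 * MAJOR.
    """
    # Split at the first "-", like ${VERSION%%-*} and ${VERSION#*-} in the workflow.
    base, dash, prerelease = version.partition("-")
    major, minor, patch = (int(part) for part in base.split("."))

    if dash:
        # Take the text after the last ".", like ${PRERELEASE_PART##*.}.
        last = prerelease.rsplit(".", 1)[-1]
        # Non-numeric suffixes (e.g. plain "alpha") fall back to 0.
        prerelease_num = int(last) if re.fullmatch(r"[0-9]+", last) else 0
    else:
        # Final releases use 999 so they sort above their own prereleases.
        prerelease_num = 999

    return prerelease_num + 1000 * patch + 1_000_000 * minor + 1_000_000_000 * major


if __name__ == "__main__":
    for v in ("0.0.0-alpha.0", "1.0.0-alpha.2", "1.0.0", "1.2.3"):
        print(v, compute_build_version(v))
```

Each component gets three decimal digits, so the ordering only holds while the patch, minor, and prerelease numbers stay below 1000. The prerelease number must in fact stay below 999, or it will collide with the final release of the same version.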
--- a/.gitignore
+++ b/.gitignore
@@ -7,6 +7,8 @@ digest.txt
 # nix
 .direnv/

+# IDEA (PyCharm)
+.idea

 # xcode / macos
 *.xcuserstate
@@ -14,6 +16,7 @@ digest.txt
 *.xcuserdatad/
 **/.DS_Store
 app/EXO/build/
+dist/


 # rust
--- a/.prettierrc
+++ b/.prettierrc
@@ -0,0 +1,3 @@
+{
+  "useTabs": true
+}
--- a/.swift-format
+++ b/.swift-format
@@ -0,0 +1,6 @@
+{
+  "version": 1,
+  "indentation": {
+    "spaces": 4
+  }
+}
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -5,10 +5,21 @@ Thank you for your interest in contributing to EXO!
 ## Getting Started

 To run EXO from source:
+
+**Prerequisites:**
+- [uv](https://github.com/astral-sh/uv) (for Python dependency management)
+  ```bash
+  brew install uv
+  ```
+- [macmon](https://github.com/vladkens/macmon) (for hardware monitoring on Apple Silicon)
+  ```bash
+  brew install macmon
+  ```
+
 ```bash
 git clone https://github.com/exo-explore/exo.git
 cd exo/dashboard
-npm install && npm run build
+npm install && npm run build && cd ..
 uv run exo
 ```

--- a/README.md
+++ b/README.md
@@ -1,55 +1,323 @@
 <div align="center">

 <picture>
-  <source media="(prefers-color-scheme: light)" srcset="/docs/exo-logo-black-bg.jpg">
-  <img alt="exo logo" src="/docs/exo-logo-transparent.png" width="50%" height="50%">
+  <source media="(prefers-color-scheme: light)" srcset="/docs/imgs/exo-logo-black-bg.jpg">
+  <img alt="exo logo" src="/docs/imgs/exo-logo-transparent.png" width="50%" height="50%">
 </picture>

 exo: Run your own AI cluster at home with everyday devices. Maintained by [exo labs](https://x.com/exolabs).

-
-[![GitHub Repo stars](https://img.shields.io/github/stars/exo-explore/exo)](https://github.com/exo-explore/exo/stargazers)
-[![License: Apache-2.0](https://img.shields.io/badge/License-Apache2.0-blue.svg)](https://www.apache.org/licenses/LICENSE-2.0.html)
-
-<a href="https://trendshift.io/repositories/11849" target="_blank"><img src="https://trendshift.io/api/badge/repositories/11849" alt="exo-explore%2Fexo | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+<p align="center">
+  <a href="https://discord.gg/TJ4P57arEm" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white" alt="Discord"></a>
+  <a href="https://x.com/exolabs" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/twitter/follow/exolabs?style=social" alt="X"></a>
+  <a href="https://www.apache.org/licenses/LICENSE-2.0.html" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/License-Apache2.0-blue.svg" alt="License: Apache-2.0"></a>
+</p>

 </div>

 ---

+exo connects all your devices into an AI cluster. Not only does exo enable running models larger than would fit on a single device, but with [day-0 support for RDMA over Thunderbolt](https://x.com/exolabs/status/2001817749744476256?s=20), it also makes models run faster as you add more devices.
+
 ## Features

- **Automatic Device Discovery**: Devices running EXO automatically discover each other on your local network - no manual configuration.
- **RDMA over Thunderbolt**: EXO ships with Day-0 support for RDMA over Thunderbolt 5, enabling 99% reduction in latency between devices.
- **Auto Parallel**: EXO automatically splits up models to run distributed across devices.
- **Tensor Parallelism**: EXO supports sharding models, for up to 1.8x speedup on 2 devices and 3.2x speedup on 4 devices.
- **MLX Support**: EXO uses [ml-explore/mlx](https://github.com/ml-explore/mlx) as an inference backend and [MLX distributed](https://ml-explore.github.io/mlx/build/html/usage/distributed.html) for distributed communication.
+- **Automatic Device Discovery**: Devices running exo automatically discover each other - no manual configuration.
+- **RDMA over Thunderbolt**: exo ships with [day-0 support for RDMA over Thunderbolt 5](https://x.com/exolabs/status/2001817749744476256?s=20), enabling 99% reduction in latency between devices.
+- **Topology-Aware Auto Parallel**: exo figures out the best way to split your model across all available devices based on a realtime view of your device topology. It takes into account device resources and network latency/bandwidth between each link.
+- **Tensor Parallelism**: exo supports sharding models, for up to 1.8x speedup on 2 devices and 3.2x speedup on 4 devices.
+- **MLX Support**: exo uses [MLX](https://github.com/ml-explore/mlx) as an inference backend and [MLX distributed](https://ml-explore.github.io/mlx/build/html/usage/distributed.html) for distributed communication.
+
+## Benchmarks
+
+<details>
+  <summary>Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA</summary>
+  <img src="docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-1-qwen3-235b.jpeg" alt="Benchmark - Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA" width="80%" />
+  <p>
+    <strong>Source:</strong> <a href="https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5</a>
+  </p>
+</details>
+
+<details>
+  <summary>DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA</summary>
+  <img src="docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-2-deepseek-3.1-671b.jpeg" alt="Benchmark - DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA" width="80%" />
+  <p>
+    <strong>Source:</strong> <a href="https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5</a>
+  </p>
+</details>
+
+<details>
+  <summary>Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA</summary>
+  <img src="docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-3-kimi-k2-thinking.jpeg" alt="Benchmark - Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA" width="80%" />
+  <p>
+    <strong>Source:</strong> <a href="https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5</a>
+  </p>
+</details>

 ---

 ## Quick Start

-You need at least one Mac device running macOS Tahoe 26.2 (released December 12th 2025).
+Devices running exo automatically discover each other, without needing any manual configuration. Each device provides an API and a dashboard for interacting with your cluster (runs at `http://localhost:52415`).

-You can download the latest build here: [EXO-latest.dmg](https://assets.exolabs.net/EXO-latest.dmg). It will ask for permission to modify system settings and install a new Network profile. We hope to make this smoother in the future!
+There are two ways to run exo:

-To run from source, clone the repo, build the dashboard with `cd dashboard && npm install && npm run build` and run `uv run exo`.
+### Run from Source (macOS)

-After starting with either of these methods go to `http://localhost:52415` in your browser, and you'll have EXO.
+**Prerequisites:**
+- [brew](https://github.com/Homebrew/brew) (for simple package management on macOS)
+  
+  ```bash
+  /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
+  ```
+- [uv](https://github.com/astral-sh/uv) (for Python dependency management)
+- [macmon](https://github.com/vladkens/macmon) (for hardware monitoring on Apple Silicon)
+- [node](https://github.com/nodejs/node) (for building the dashboard)
+  
+  ```bash
+  brew install uv macmon node
+  ```
+- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings, nightly for now)
+
+  ```bash
+  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+  rustup toolchain install nightly
+  ```
+
+Clone the repo, build the dashboard, and run exo:
+
+```bash
+# Clone exo
+git clone https://github.com/exo-explore/exo
+
+# Build dashboard
+cd exo/dashboard && npm install && npm run build && cd ..
+
+# Run exo
+uv run exo
+```
+
+This starts the exo dashboard and API at http://localhost:52415/
+
+### Run from Source (Linux)
+
+**Prerequisites:**
+
+- [uv](https://github.com/astral-sh/uv) (for Python dependency management)
+- [node](https://github.com/nodejs/node) (for building the dashboard) - version 18 or higher
+- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings, nightly for now)
+
+**Installation methods:**
+
+**Option 1: Using system package manager (Ubuntu/Debian example):**
+```bash
+# Install Node.js and npm
+sudo apt update
+sudo apt install nodejs npm
+
+# Install uv
+curl -LsSf https://astral.sh/uv/install.sh | sh
+
+# Install Rust (using rustup)
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+rustup toolchain install nightly
+```
+
+**Option 2: Using Homebrew on Linux (if preferred):**
+```bash
+# Install Homebrew on Linux
+/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
+
+# Install dependencies
+brew install uv node
+
+# Install Rust (using rustup)
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+rustup toolchain install nightly
+```
+
+**Note:** The `macmon` package is macOS-only and not required for Linux.
+
+Clone the repo, build the dashboard, and run exo:
+
+```bash
+# Clone exo
+git clone https://github.com/exo-explore/exo
+
+# Build dashboard
+cd exo/dashboard && npm install && npm run build && cd ..
+
+# Run exo
+uv run exo
+```
+
+This starts the exo dashboard and API at http://localhost:52415/
+
+**Important note for Linux users:** Currently, exo runs on CPU on Linux. GPU support for Linux platforms is under development. If you'd like to see support for your specific Linux hardware, please [search for existing feature requests](https://github.com/exo-explore/exo/issues) or create a new one.
+
+### macOS App
+
+exo ships a macOS app that runs in the background on your Mac.
+
+<img src="docs/imgs/macos-app-one-macbook.png" alt="exo macOS App - running on a MacBook" width="35%" />
+
+The macOS app requires macOS Tahoe 26.2 or later.
+
+Download the latest build here: [EXO-latest.dmg](https://assets.exolabs.net/EXO-latest.dmg).
+
+The app will ask for permission to modify system settings and install a new Network profile. Improvements to this are being worked on.
+
+#### Uninstalling the macOS App
+
+The recommended way to uninstall is through the app itself: click the menu bar icon → Advanced → Uninstall. This cleanly removes all system components.
+
+If you've already deleted the app, you can run the standalone uninstaller script:
+
+```bash
+sudo ./app/EXO/uninstall-exo.sh
+```
+
+This removes:
+- Network setup LaunchDaemon
+- Network configuration script
+- Log files
+- The "exo" network location
+
+**Note:** You'll need to manually remove EXO from Login Items in System Settings → General → Login Items.

 ---

-## Requirements
+### Enabling RDMA on macOS

- Mac devices with Apple Silicon (M-series chips)
- macOS Tahoe 26.2 or later (released December 12th 2025)
-  - Older macOS versions may work without RDMA, but only 26.2+ is officially supported
- For RDMA over Thunderbolt: a high quality Thunderbolt 5 cable
+RDMA is a new capability added in macOS 26.2. It works on any Mac with Thunderbolt 5 (M4 Pro Mac Mini, M4 Max Mac Studio, M4 Max MacBook Pro, M3 Ultra Mac Studio).

-We intend to add support for other hardware platforms [like the DGX Spark](https://x.com/exolabs/status/1978525767739883736) in the future, but they are not currently supported. If you'd like support for a new hardware platform, please search for an existing feature request and add a thumbs up so we know what hardware is important to the community.
+Note that on Mac Studio, you cannot use the Thunderbolt 5 port next to the Ethernet port.
+
+To enable RDMA on macOS, follow these steps:
+
+1. Shut down your Mac.
+2. Hold down the power button for 10 seconds until the boot menu appears.
+3. Select "Options" to enter Recovery mode.
+4. When the Recovery UI appears, open the Terminal from the Utilities menu.
+5. In the Terminal, type:
+   ```
+   rdma_ctl enable
+   ```
+   and press Enter.
+6. Reboot your Mac.
+
+After that, RDMA will be enabled in macOS and exo will take care of the rest.
+
+---
+
+### Using the API
+
+If you prefer to interact with exo via the API, the following example creates an instance of a small model (`mlx-community/Llama-3.2-1B-Instruct-4bit`), sends a chat completion request, and deletes the instance.
+
+---
+
+**1. Preview instance placements**
+
+The `/instance/previews` endpoint will preview all valid placements for your model.
+
+```bash
+curl "http://localhost:52415/instance/previews?model_id=llama-3.2-1b"
+```
+
+Sample response:
+
+```json
+{
+  "previews": [
+    {
+      "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
+      "sharding": "Pipeline",
+      "instance_meta": "MlxRing",
+      "instance": {...},
+      "memory_delta_by_node": {"local": 729808896},
+      "error": null
+    }
+    // ...possibly more placements...
+  ]
+}
+```
+
+The response lists every valid placement for the model; pick the one you want.
+To select the first placement that has no error, pipe the output into `jq`:
+
+```bash
+curl "http://localhost:52415/instance/previews?model_id=llama-3.2-1b" | jq -c '.previews[] | select(.error == null) | .instance' | head -n1
+```
+
+---
+
+**2. Create a model instance**
+
+Send a POST request to `/instance` with your chosen placement, copied from step 1, in the `instance` field. The full payload must match the `CreateInstanceParams` type:
+
+```bash
+curl -X POST http://localhost:52415/instance \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "instance": {...}
+  }'
+```
+
+
+Sample response:
+
+```json
+{
+  "message": "Command received.",
+  "command_id": "e9d1a8ab-...."
+}
+```
+
+---
+
+**3. Send a chat completion**
+
+Now, make a POST to `/v1/chat/completions` (the same format as OpenAI's API):
+
+```bash
+curl -N -X POST http://localhost:52415/v1/chat/completions \
+  -H 'Content-Type: application/json' \
+  -d '{
+    "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",
+    "messages": [
+      {"role": "user", "content": "What is Llama 3.2 1B?"}
+    ],
+    "stream": true
+  }'
+```
+
+---
+
+**4. Delete the instance**
+
+When you're done, delete the instance by its ID (find it via the `/state` or `/instance` endpoints):
+
+```bash
+curl -X DELETE http://localhost:52415/instance/YOUR_INSTANCE_ID
+```
+
+**Other useful API endpoints:**
+
+- List all models: `curl http://localhost:52415/models`
+- Inspect instance IDs and deployment state: `curl http://localhost:52415/state`
+
+For further details, see:
+
+- Basic API documentation in [docs/api.md](docs/api.md).
+- API types and endpoints in [src/exo/master/api.py](src/exo/master/api.py).
+
+---
+
+## Hardware Accelerator Support
+
+On macOS, exo uses the GPU. On Linux, exo currently runs on CPU. We are working on extending hardware accelerator support. If you'd like support for a new hardware platform, please [search for an existing feature request](https://github.com/exo-explore/exo/issues) and add a thumbs up so we know what hardware is important to the community.

 ---

 ## Contributing

-See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to EXO.
+See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to exo.
--- a/TODO.md
+++ b/TODO.md
@@ -19,6 +19,7 @@
 25. Rethink retry logic
 26. Task cancellation. When API http request gets cancelled, it should cancel corresponding task.
 27. Log cleanup - per-module log filters and default to DEBUG log levels
+28. Validate RDMA connections with ibv_devinfo in the info gatherer

 Potential refactors:

--- a/app/EXO/EXO/ContentView.swift
+++ b/app/EXO/EXO/ContentView.swift
@@ -12,18 +12,25 @@ struct ContentView: View {
    @EnvironmentObject private var controller: ExoProcessController
    @EnvironmentObject private var stateService: ClusterStateService
    @EnvironmentObject private var networkStatusService: NetworkStatusService
+    @EnvironmentObject private var localNetworkChecker: LocalNetworkChecker
    @EnvironmentObject private var updater: SparkleUpdater
    @State private var focusedNode: NodeViewModel?
    @State private var deletingInstanceIDs: Set<String> = []
    @State private var showAllNodes = false
    @State private var showAllInstances = false
+    @State private var showAdvanced = false
    @State private var showDebugInfo = false
    @State private var bugReportInFlight = false
    @State private var bugReportMessage: String?
+    @State private var uninstallInProgress = false
+    @State private var pendingNamespace: String = ""

    var body: some View {
        VStack(alignment: .leading, spacing: 12) {
            statusSection
+            if shouldShowLocalNetworkWarning {
+                localNetworkWarningBanner
+            }
            if shouldShowClusterDetails {
                Divider()
                overviewSection
@@ -38,6 +45,7 @@ struct ContentView: View {
        }
        .animation(.easeInOut(duration: 0.3), value: shouldShowClusterDetails)
        .animation(.easeInOut(duration: 0.3), value: shouldShowInstances)
+        .animation(.easeInOut(duration: 0.3), value: shouldShowLocalNetworkWarning)
        .padding()
        .frame(width: 340)
        .onAppear {
@@ -47,9 +55,62 @@ struct ContentView: View {
        }
    }

+    private var shouldShowLocalNetworkWarning: Bool {
+        if case .notWorking = localNetworkChecker.status {
+            return controller.status != .stopped
+        }
+        return false
+    }
+
+    private var localNetworkWarningBanner: some View {
+        VStack(alignment: .leading, spacing: 6) {
+            HStack(spacing: 6) {
+                Image(systemName: "exclamationmark.triangle.fill")
+                    .foregroundColor(.orange)
+                Text("Local Network Access Issue")
+                    .font(.caption)
+                    .fontWeight(.semibold)
+            }
+            Text(
+                "Device discovery won't work. To fix:\n1. Quit EXO\n2. Open System Settings → Privacy & Security → Local Network\n3. Toggle EXO off, then back on\n4. Relaunch EXO"
+            )
+            .font(.caption2)
+            .foregroundColor(.secondary)
+            .fixedSize(horizontal: false, vertical: true)
+            Button {
+                openLocalNetworkSettings()
+            } label: {
+                Text("Open Settings")
+                    .font(.caption2)
+            }
+            .buttonStyle(.bordered)
+            .controlSize(.small)
+        }
+        .padding(8)
+        .background(
+            RoundedRectangle(cornerRadius: 8)
+                .fill(Color.orange.opacity(0.1))
+        )
+        .overlay(
+            RoundedRectangle(cornerRadius: 8)
+                .stroke(Color.orange.opacity(0.3), lineWidth: 1)
+        )
+    }
+
+    private func openLocalNetworkSettings() {
+        // Open Privacy & Security settings - Local Network section
+        if let url = URL(
+            string: "x-apple.systempreferences:com.apple.preference.security?Privacy_LocalNetwork")
+        {
+            NSWorkspace.shared.open(url)
+        }
+    }
+
    private var topologySection: some View {
        Group {
-            if let topology = stateService.latestSnapshot?.topologyViewModel(), !topology.nodes.isEmpty {
+            if let topology = stateService.latestSnapshot?.topologyViewModel(
+                localNodeId: stateService.localNodeId), !topology.nodes.isEmpty
+            {
                TopologyMiniView(topology: topology)
            }
        }
@@ -83,8 +144,10 @@ struct ContentView: View {
                VStack(alignment: .leading, spacing: 4) {
                    HStack {
                        VStack(alignment: .leading) {
-                            Text("\(overview.usedRam, specifier: "%.0f") / \(overview.totalRam, specifier: "%.0f") GB")
-                                .font(.headline)
+                            Text(
+                                "\(overview.usedRam, specifier: "%.0f") / \(overview.totalRam, specifier: "%.0f") GB"
+                            )
+                            .font(.headline)
                            Text("Memory")
                                .font(.caption)
                                .foregroundColor(.secondary)
@@ -193,11 +256,7 @@ struct ContentView: View {
                Divider()
                    .padding(.vertical, 4)
            }
-            controlButton(title: "Check for Updates") {
-                updater.checkForUpdates()
-            }
-            .padding(.bottom, 8)
-            debugSection
+            advancedSection
                .padding(.bottom, 8)
            controlButton(title: "Quit", tint: .secondary) {
                controller.stop()
@@ -206,13 +265,63 @@ struct ContentView: View {
        }
    }

-    private func controlButton(title: String, tint: Color = .primary, action: @escaping () -> Void) -> some View {
+    private var advancedSection: some View {
+        VStack(alignment: .leading, spacing: 6) {
+            HStack {
+                Text("Advanced")
+                    .font(.caption)
+                    .foregroundColor(.secondary)
+                Spacer()
+                collapseButton(isExpanded: $showAdvanced)
+            }
+            .animation(nil, value: showAdvanced)
+            if showAdvanced {
+                VStack(alignment: .leading, spacing: 8) {
+                    VStack(alignment: .leading, spacing: 4) {
+                        Text("Cluster Namespace")
+                            .font(.caption2)
+                            .foregroundColor(.secondary)
+                        HStack {
+                            TextField("optional", text: $pendingNamespace)
+                                .textFieldStyle(.roundedBorder)
+                                .font(.caption2)
+                                .onAppear {
+                                    pendingNamespace = controller.customNamespace
+                                }
+                            Button("Save & Restart") {
+                                controller.customNamespace = pendingNamespace
+                                if controller.status == .running || controller.status == .starting {
+                                    controller.restart()
+                                }
+                            }
+                            .font(.caption2)
+                            .disabled(pendingNamespace == controller.customNamespace)
+                        }
+                    }
+                    HoverButton(title: "Check for Updates", small: true) {
+                        updater.checkForUpdates()
+                    }
+                    debugSection
+                    HoverButton(title: "Uninstall", tint: .red, small: true) {
+                        showUninstallConfirmationAlert()
+                    }
+                    .disabled(uninstallInProgress)
+                }
+                .transition(.opacity)
+            }
+        }
+        .animation(.easeInOut(duration: 0.25), value: showAdvanced)
+    }
+
+    private func controlButton(title: String, tint: Color = .primary, action: @escaping () -> Void)
+        -> some View
+    {
        HoverButton(title: title, tint: tint, trailingSystemImage: nil, action: action)
    }

    private var dashboardButton: some View {
        Button {
-            guard let url = URL(string: "http://localhost:8000/") else { return }
+            guard let url = URL(string: "http://localhost:52415/") else { return }
            NSWorkspace.shared.open(url)
        } label: {
            HStack {
@@ -237,9 +346,12 @@ struct ContentView: View {
        Button {
            isExpanded.wrappedValue.toggle()
        } label: {
-            Label(isExpanded.wrappedValue ? "Hide" : "Show All", systemImage: isExpanded.wrappedValue ? "chevron.up" : "chevron.down")
-                .labelStyle(.titleAndIcon)
-                .contentTransition(.symbolEffect(.replace))
+            Label(
+                isExpanded.wrappedValue ? "Hide" : "Show All",
+                systemImage: isExpanded.wrappedValue ? "chevron.up" : "chevron.down"
+            )
+            .labelStyle(.titleAndIcon)
+            .contentTransition(.symbolEffect(.replace))
        }
        .buttonStyle(.plain)
        .font(.caption2)
@@ -328,15 +440,15 @@ struct ContentView: View {
    }

    private var debugSection: some View {
-        VStack(alignment: .leading, spacing: 6) {
-            HStack {
-                Text("Debug Info")
-                    .font(.caption)
-                    .foregroundColor(.secondary)
-                Spacer()
-                collapseButton(isExpanded: $showDebugInfo)
+        VStack(alignment: .leading, spacing: 4) {
+            HoverButton(
+                title: "Debug Info",
+                tint: .primary,
+                trailingSystemImage: showDebugInfo ? "chevron.up" : "chevron.down",
+                small: true
+            ) {
+                showDebugInfo.toggle()
            }
-            .animation(nil, value: showDebugInfo)
            if showDebugInfo {
                VStack(alignment: .leading, spacing: 4) {
                    Text("Version: \(buildTag)")
@@ -349,15 +461,63 @@ struct ContentView: View {
                        .font(.caption2)
                        .foregroundColor(thunderboltStatusColor)
                    interfaceIpList
+                    rdmaStatusView
                    sendBugReportButton
                        .padding(.top, 6)
                }
+                .padding(.leading, 8)
                .transition(.opacity)
            }
        }
        .animation(.easeInOut(duration: 0.25), value: showDebugInfo)
    }

+    private var rdmaStatusView: some View {
+        let rdma = networkStatusService.status.rdmaStatus
+        return VStack(alignment: .leading, spacing: 1) {
+            Text("RDMA: \(rdmaStatusText(rdma))")
+                .font(.caption2)
+                .foregroundColor(rdmaStatusColor(rdma))
+            if !rdma.devices.isEmpty {
+                Text("  Devices: \(rdma.devices.joined(separator: ", "))")
+                    .font(.caption2)
+                    .foregroundColor(.secondary)
+            }
+            if !rdma.activePorts.isEmpty {
+                Text("  Active Ports:")
+                    .font(.caption2)
+                    .foregroundColor(.secondary)
+                ForEach(rdma.activePorts, id: \.device) { port in
+                    Text("    \(port.device) port \(port.port): \(port.state)")
+                        .font(.caption2)
+                        .foregroundColor(.green)
+                }
+            }
+        }
+    }
+
+    private func rdmaStatusText(_ rdma: RDMAStatus) -> String {
+        switch rdma.rdmaCtlEnabled {
+        case .some(true):
+            return "Enabled"
+        case .some(false):
+            return "Disabled"
+        case nil:
+            return rdma.devices.isEmpty ? "Not Available" : "Available"
+        }
+    }
+
+    private func rdmaStatusColor(_ rdma: RDMAStatus) -> Color {
+        switch rdma.rdmaCtlEnabled {
+        case .some(true):
+            return .green
+        case .some(false):
+            return .orange
+        case nil:
+            return rdma.devices.isEmpty ? .secondary : .green
+        }
+    }
+
    private var sendBugReportButton: some View {
        VStack(alignment: .leading, spacing: 4) {
            Button {
@@ -447,6 +607,88 @@ struct ContentView: View {
        bugReportInFlight = false
    }

+    private func showUninstallConfirmationAlert() {
+        let alert = NSAlert()
+        alert.messageText = "Uninstall EXO"
+        alert.informativeText = """
+            This will remove EXO and all its system components:
+
+            • Network configuration daemon
+            • Launch at login registration
+            • EXO network location
+
+            The app will be moved to Trash.
+            """
+        alert.alertStyle = .warning
+        alert.addButton(withTitle: "Uninstall")
+        alert.addButton(withTitle: "Cancel")
+
+        // Style the Uninstall button as destructive
+        if let uninstallButton = alert.buttons.first {
+            uninstallButton.hasDestructiveAction = true
+        }
+
+        let response = alert.runModal()
+        if response == .alertFirstButtonReturn {
+            performUninstall()
+        }
+    }
+
+    private func performUninstall() {
+        uninstallInProgress = true
+
+        // Stop EXO process first
+        controller.cancelPendingLaunch()
+        controller.stop()
+        stateService.stopPolling()
+
+        // Run the privileged uninstall on a background thread
+        // Using .utility QoS to avoid priority inversion with NSAppleScript's subprocess
+        DispatchQueue.global(qos: .utility).async {
+            do {
+                // Remove network setup daemon and components (requires admin privileges)
+                try NetworkSetupHelper.uninstall()
+
+                DispatchQueue.main.async {
+                    // Unregister from launch at login
+                    LaunchAtLoginHelper.disable()
+
+                    // Move app to trash
+                    self.moveAppToTrash()
+
+                    // Quit the app
+                    DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
+                        NSApplication.shared.terminate(nil)
+                    }
+                }
+            } catch {
+                DispatchQueue.main.async {
+                    self.showErrorAlert(message: error.localizedDescription)
+                    self.uninstallInProgress = false
+                }
+            }
+        }
+    }
+
+    private func showErrorAlert(message: String) {
+        let alert = NSAlert()
+        alert.messageText = "Uninstall Failed"
+        alert.informativeText = message
+        alert.alertStyle = .critical
+        alert.addButton(withTitle: "OK")
+        alert.runModal()
+    }
+
+    private func moveAppToTrash() {
+        let appURL = Bundle.main.bundleURL
+        do {
+            try FileManager.default.trashItem(at: appURL, resultingItemURL: nil)
+        } catch {
+            // If we can't trash the app, that's OK - user can do it manually
+            // The important system components have already been cleaned up
+        }
+    }
+
    private var buildTag: String {
        Bundle.main.infoDictionary?["EXOBuildTag"] as? String ?? "unknown"
    }
@@ -460,14 +702,27 @@ private struct HoverButton: View {
    let title: String
    let tint: Color
    let trailingSystemImage: String?
+    let small: Bool
    let action: () -> Void

+    init(
+        title: String, tint: Color = .primary, trailingSystemImage: String? = nil,
+        small: Bool = false, action: @escaping () -> Void
+    ) {
+        self.title = title
+        self.tint = tint
+        self.trailingSystemImage = trailingSystemImage
+        self.small = small
+        self.action = action
+    }
+
    @State private var isHovering = false

    var body: some View {
        Button(action: action) {
            HStack {
                Text(title)
+                    .font(small ? .caption : nil)
                Spacer()
                if let systemName = trailingSystemImage {
                    Image(systemName: systemName)
@@ -475,8 +730,8 @@ private struct HoverButton: View {
                }
            }
            .frame(maxWidth: .infinity, alignment: .leading)
-            .padding(.vertical, 6)
-            .padding(.horizontal, 8)
+            .padding(.vertical, small ? 4 : 6)
+            .padding(.horizontal, small ? 6 : 8)
            .background(
                RoundedRectangle(cornerRadius: 6)
                    .fill(
@@ -491,4 +746,3 @@ private struct HoverButton: View {
        .onHover { isHovering = $0 }
    }
 }
-
--- a/app/EXO/EXO/EXOApp.swift
+++ b/app/EXO/EXO/EXOApp.swift
@@ -8,9 +8,9 @@
 import AppKit
 import CoreImage
 import CoreImage.CIFilterBuiltins
+import ServiceManagement
 import Sparkle
 import SwiftUI
-import ServiceManagement
 import UserNotifications
 import os.log

@@ -19,6 +19,7 @@ struct EXOApp: App {
    @StateObject private var controller: ExoProcessController
    @StateObject private var stateService: ClusterStateService
    @StateObject private var networkStatusService: NetworkStatusService
+    @StateObject private var localNetworkChecker: LocalNetworkChecker
    @StateObject private var updater: SparkleUpdater
    private let terminationObserver: TerminationObserver
    private let ciContext = CIContext(options: nil)
@@ -37,9 +38,13 @@ struct EXOApp: App {
        _stateService = StateObject(wrappedValue: service)
        let networkStatus = NetworkStatusService()
        _networkStatusService = StateObject(wrappedValue: networkStatus)
+        let localNetwork = LocalNetworkChecker()
+        _localNetworkChecker = StateObject(wrappedValue: localNetwork)
        _updater = StateObject(wrappedValue: updater)
        enableLaunchAtLoginIfNeeded()
        NetworkSetupHelper.ensureLaunchDaemonInstalled()
+        // Check local network access BEFORE launching exo
+        localNetwork.check()
        controller.scheduleLaunch(after: 15)
        service.startPolling()
        networkStatus.startPolling()
@@ -51,6 +56,7 @@ struct EXOApp: App {
                .environmentObject(controller)
                .environmentObject(stateService)
                .environmentObject(networkStatusService)
+                .environmentObject(localNetworkChecker)
                .environmentObject(updater)
        } label: {
            menuBarIcon
@@ -107,7 +113,7 @@ struct EXOApp: App {
        filter.contrast = 0.9

        guard let output = filter.outputImage,
-              let rendered = ciContext.createCGImage(output, from: output.extent)
+            let rendered = ciContext.createCGImage(output, from: output.extent)
        else {
            return nil
        }
@@ -120,7 +126,26 @@ struct EXOApp: App {
        do {
            try SMAppService.mainApp.register()
        } catch {
-            Logger().error("Failed to register EXO for launch at login: \(error.localizedDescription)")
+            Logger().error(
+                "Failed to register EXO for launch at login: \(error.localizedDescription)")
+        }
+    }
+}
+
+/// Helper for managing EXO's launch-at-login registration
+enum LaunchAtLoginHelper {
+    private static let logger = Logger(subsystem: "io.exo.EXO", category: "LaunchAtLogin")
+
+    /// Unregisters EXO from launching at login
+    static func disable() {
+        guard SMAppService.mainApp.status == .enabled else { return }
+        do {
+            try SMAppService.mainApp.unregister()
+            logger.info("Unregistered EXO from launch at login")
+        } catch {
+            logger.error(
+                "Failed to unregister EXO from launch at login: \(error.localizedDescription, privacy: .public)"
+            )
        }
    }
 }
@@ -145,7 +170,7 @@ final class SparkleUpdater: NSObject, ObservableObject {
        center.requestAuthorization(options: [.alert, .sound]) { _, _ in }
        controller.updater.automaticallyChecksForUpdates = true
        controller.updater.automaticallyDownloadsUpdates = false
-        controller.updater.updateCheckInterval = 900 // 15 minutes
+        controller.updater.updateCheckInterval = 900  // 15 minutes
        DispatchQueue.main.asyncAfter(deadline: .now() + 5) { [weak controller] in
            controller?.updater.checkForUpdatesInBackground()
        }
@@ -212,7 +237,8 @@ private final class ExoNotificationDelegate: NSObject, UNUserNotificationCenterD
    func userNotificationCenter(
        _ center: UNUserNotificationCenter,
        willPresent notification: UNNotification,
-        withCompletionHandler completionHandler: @escaping (UNNotificationPresentationOptions) -> Void
+        withCompletionHandler completionHandler: @escaping (UNNotificationPresentationOptions) ->
+            Void
    ) {
        completionHandler([.banner, .list, .sound])
    }
--- a/app/EXO/EXO/ExoProcessController.swift
+++ b/app/EXO/EXO/ExoProcessController.swift
@@ -2,6 +2,8 @@ import AppKit
 import Combine
 import Foundation

+private let customNamespaceKey = "EXOCustomNamespace"
+
 @MainActor
 final class ExoProcessController: ObservableObject {
    enum Status: Equatable {
@@ -27,6 +29,14 @@ final class ExoProcessController: ObservableObject {
    @Published private(set) var status: Status = .stopped
    @Published private(set) var lastError: String?
    @Published private(set) var launchCountdownSeconds: Int?
+    @Published var customNamespace: String = {
+        return UserDefaults.standard.string(forKey: customNamespaceKey) ?? ""
+    }()
+    {
+        didSet {
+            UserDefaults.standard.set(customNamespace, forKey: customNamespaceKey)
+        }
+    }

    private var process: Process?
    private var runtimeDirectoryURL: URL?
@@ -180,7 +190,7 @@ final class ExoProcessController: ObservableObject {
    private func makeEnvironment(for runtimeURL: URL) -> [String: String] {
        var environment = ProcessInfo.processInfo.environment
        environment["EXO_RUNTIME_DIR"] = runtimeURL.path
-        environment["EXO_LIBP2P_NAMESPACE"] = buildTag()
+        environment["EXO_LIBP2P_NAMESPACE"] = computeNamespace()

        var paths: [String] = []
        if let existing = environment["PATH"], !existing.isEmpty {
@@ -212,11 +222,19 @@ final class ExoProcessController: ObservableObject {
        if let tag = Bundle.main.infoDictionary?["EXOBuildTag"] as? String, !tag.isEmpty {
            return tag
        }
-        if let short = Bundle.main.infoDictionary?["CFBundleShortVersionString"] as? String, !short.isEmpty {
+        if let short = Bundle.main.infoDictionary?["CFBundleShortVersionString"] as? String,
+            !short.isEmpty
+        {
            return short
        }
        return "dev"
    }
+
+    private func computeNamespace() -> String {
+        let base = buildTag()
+        let custom = customNamespace.trimmingCharacters(in: .whitespaces)
+        return custom.isEmpty ? base : custom
+    }
 }

 struct RuntimeError: LocalizedError {
--- a/app/EXO/EXO/Info.plist
+++ b/app/EXO/EXO/Info.plist
@@ -8,5 +8,15 @@
 	<string>$(EXO_BUILD_TAG)</string>
 	<key>EXOBuildCommit</key>
 	<string>$(EXO_BUILD_COMMIT)</string>
+	<key>EXOBugReportPresignedUrlEndpoint</key>
+	<string>$(EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT)</string>
+	<key>NSLocalNetworkUsageDescription</key>
+	<string>EXO needs local network access to discover and connect to other devices in your cluster for distributed AI inference.</string>
+	<key>NSBonjourServices</key>
+	<array>
+		<string>_p2p._tcp</string>
+		<string>_p2p._udp</string>
+		<string>_libp2p._udp</string>
+	</array>
 </dict>
 </plist>
--- a/app/EXO/EXO/Models/ClusterState.swift
+++ b/app/EXO/EXO/Models/ClusterState.swift
@@ -16,10 +16,13 @@ struct ClusterState: Decodable {
        self.instances = rawInstances.mapValues(\.instance)
        self.runners = try container.decode([String: RunnerStatusSummary].self, forKey: .runners)
        self.nodeProfiles = try container.decode([String: NodeProfile].self, forKey: .nodeProfiles)
-        let rawTasks = try container.decodeIfPresent([String: TaggedTask].self, forKey: .tasks) ?? [:]
+        let rawTasks =
+            try container.decodeIfPresent([String: TaggedTask].self, forKey: .tasks) ?? [:]
        self.tasks = rawTasks.compactMapValues(\.task)
        self.topology = try container.decodeIfPresent(Topology.self, forKey: .topology)
-        let rawDownloads = try container.decodeIfPresent([String: [TaggedNodeDownload]].self, forKey: .downloads) ?? [:]
+        let rawDownloads =
+            try container.decodeIfPresent([String: [TaggedNodeDownload]].self, forKey: .downloads)
+            ?? [:]
        self.downloads = rawDownloads.mapValues { $0.compactMap(\.status) }
    }

@@ -41,7 +44,8 @@ private struct TaggedInstance: Decodable {
        let payloads = try container.decode([String: ClusterInstancePayload].self)
        guard let entry = payloads.first else {
            throw DecodingError.dataCorrupted(
-                DecodingError.Context(codingPath: decoder.codingPath, debugDescription: "Empty instance payload")
+                DecodingError.Context(
+                    codingPath: decoder.codingPath, debugDescription: "Empty instance payload")
            )
        }
        self.instance = ClusterInstance(
@@ -77,7 +81,8 @@ struct RunnerStatusSummary: Decodable {
        let payloads = try container.decode([String: RunnerStatusDetail].self)
        guard let entry = payloads.first else {
            throw DecodingError.dataCorrupted(
-                DecodingError.Context(codingPath: decoder.codingPath, debugDescription: "Empty runner status payload")
+                DecodingError.Context(
+                    codingPath: decoder.codingPath, debugDescription: "Empty runner status payload")
            )
        }
        self.status = entry.key
@@ -257,7 +262,9 @@ struct ChatCompletionTaskParameters: Decodable, Equatable {

    func promptPreview() -> String? {
        guard let messages else { return nil }
-        if let userMessage = messages.last(where: { $0.role?.lowercased() == "user" && ($0.content?.isEmpty == false) }) {
+        if let userMessage = messages.last(where: {
+            $0.role?.lowercased() == "user" && ($0.content?.isEmpty == false)
+        }) {
            return userMessage.content
        }
        return messages.last?.content
@@ -365,5 +372,3 @@ extension ClusterState {

    func availableModels() -> [ModelOption] { [] }
 }
-
-
--- a/app/EXO/EXO/Services/BugReportService.swift
+++ b/app/EXO/EXO/Services/BugReportService.swift
@@ -1,4 +1,3 @@
-import CryptoKit
 import Foundation

 struct BugReportOutcome: Equatable {
@@ -7,17 +6,17 @@ struct BugReportOutcome: Equatable {
 }

 enum BugReportError: LocalizedError {
-    case missingCredentials
    case invalidEndpoint
+    case presignedUrlFailed(String)
    case uploadFailed(String)
    case collectFailed(String)

    var errorDescription: String? {
        switch self {
-        case .missingCredentials:
-            return "Bug report upload credentials are not set."
        case .invalidEndpoint:
            return "Bug report endpoint is invalid."
+        case .presignedUrlFailed(let message):
+            return "Failed to get presigned URLs: \(message)"
        case .uploadFailed(let message):
            return "Bug report upload failed: \(message)"
        case .collectFailed(let message):
@@ -27,21 +26,23 @@ enum BugReportError: LocalizedError {
 }

 struct BugReportService {
-    struct AWSConfig {
-        let accessKey: String
-        let secretKey: String
-        let region: String
-        let bucket: String
+    private struct PresignedUrlsRequest: Codable {
+        let keys: [String]
+    }
+
+    private struct PresignedUrlsResponse: Codable {
+        let urls: [String: String]
+        let expiresIn: Int?
    }

    func sendReport(
-        baseURL: URL = URL(string: "http://127.0.0.1:8000")!,
+        baseURL: URL = URL(string: "http://127.0.0.1:52415")!,
        now: Date = Date(),
        isManual: Bool = false
    ) async throws -> BugReportOutcome {
-        let credentials = try loadCredentials()
-        let timestamp = ISO8601DateFormatter().string(from: now)
-        let prefix = "reports/\(timestamp)/"
+        let timestamp = Self.runTimestampString(now)
+        let dayPrefix = Self.dayPrefixString(now)
+        let prefix = "reports/\(dayPrefix)/\(timestamp)/"

        let logData = readLog()
        let ifconfigText = try await captureIfconfig()
@@ -66,29 +67,82 @@ struct BugReportService {
            ("\(prefix)exo.log", logData),
            ("\(prefix)state.json", stateData),
            ("\(prefix)events.json", eventsData),
-            ("\(prefix)report.json", reportJSON)
+            ("\(prefix)report.json", reportJSON),
        ]

-        let uploader = try S3Uploader(config: credentials)
-        for item in uploads {
-            guard let data = item.data else { continue }
-            try await uploader.upload(
-                objectPath: item.path,
-                body: data
-            )
+        let uploadItems: [(key: String, body: Data)] = uploads.compactMap { item in
+            guard let body = item.data else { return nil }
+            return (key: item.path, body: body)
        }

-        return BugReportOutcome(success: true, message: "Bug Report sent. Thank you for helping to improve EXO 1.0.")
+        guard !uploadItems.isEmpty else {
+            return BugReportOutcome(success: false, message: "No data to upload")
+        }
+
+        let presignedUrls = try await fetchPresignedUploadUrls(keys: uploadItems.map(\.key))
+        for item in uploadItems {
+            guard let urlString = presignedUrls[item.key], let url = URL(string: urlString) else {
+                throw BugReportError.uploadFailed("Missing presigned URL for \(item.key)")
+            }
+            try await uploadToPresignedUrl(url: url, body: item.body)
+        }
+
+        return BugReportOutcome(
+            success: true, message: "Bug Report sent. Thank you for helping to improve EXO 1.0.")
    }

-    private func loadCredentials() throws -> AWSConfig {
-        // These credentials are write-only and necessary to receive bug reports from users
-        return AWSConfig(
-            accessKey: "AKIAYEKP5EMXTOBYDGHX",
-            secretKey: "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",
-            region: "us-east-1",
-            bucket: "exo-bug-reports"
-        )
+    private static func dayPrefixString(_ date: Date) -> String {
+        var calendar = Calendar(identifier: .gregorian)
+        calendar.timeZone = TimeZone(secondsFromGMT: 0) ?? .current
+        let components = calendar.dateComponents([.year, .month, .day], from: date)
+        let year = components.year ?? 0
+        let month = components.month ?? 0
+        let day = components.day ?? 0
+        return String(format: "%04d/%02d/%02d", year, month, day)
+    }
+
+    private static func runTimestampString(_ date: Date) -> String {
+        let formatter = DateFormatter()
+        formatter.locale = Locale(identifier: "en_US_POSIX")
+        formatter.timeZone = TimeZone(secondsFromGMT: 0) ?? .current
+        formatter.dateFormat = "yyyy-MM-dd'T'HHmmss.SSS'Z'"
+        return formatter.string(from: date)
+    }
+
+    private func fetchPresignedUploadUrls(keys: [String], bundle: Bundle = .main) async throws
+        -> [String: String]
+    {
+        guard
+            let endpointString = bundle.infoDictionary?["EXOBugReportPresignedUrlEndpoint"]
+                as? String
+        else {
+            throw BugReportError.invalidEndpoint
+        }
+        let trimmedEndpointString = endpointString.trimmingCharacters(in: .whitespacesAndNewlines)
+        guard !trimmedEndpointString.isEmpty, let endpoint = URL(string: trimmedEndpointString)
+        else {
+            throw BugReportError.invalidEndpoint
+        }
+
+        var request = URLRequest(url: endpoint)
+        request.httpMethod = "POST"
+        request.timeoutInterval = 10
+        request.setValue("application/json", forHTTPHeaderField: "Content-Type")
+
+        let encoder = JSONEncoder()
+        request.httpBody = try encoder.encode(PresignedUrlsRequest(keys: keys))
+
+        let (data, response) = try await URLSession.shared.data(for: request)
+        guard let http = response as? HTTPURLResponse else {
+            throw BugReportError.presignedUrlFailed("Non-HTTP response")
+        }
+        guard (200..<300).contains(http.statusCode) else {
+            throw BugReportError.presignedUrlFailed("HTTP status \(http.statusCode)")
+        }
+
+        let decoder = JSONDecoder()
+        let decoded = try decoder.decode(PresignedUrlsResponse.self, from: data)
+        return decoded.urls
    }

    private func readLog() -> Data? {
@@ -101,7 +155,8 @@ struct BugReportService {
    private func captureIfconfig() async throws -> String {
        let result = runCommand(["/sbin/ifconfig"])
        guard result.exitCode == 0 else {
-            throw BugReportError.collectFailed(result.error.isEmpty ? "ifconfig failed" : result.error)
+            throw BugReportError.collectFailed(
+                result.error.isEmpty ? "ifconfig failed" : result.error)
        }
        return result.output
    }
@@ -109,12 +164,23 @@ struct BugReportService {
    private func readDebugInfo() -> DebugInfo {
        DebugInfo(
            thunderboltBridgeDisabled: readThunderboltBridgeDisabled(),
-            interfaces: readInterfaces()
+            interfaces: readInterfaces(),
+            rdma: readRDMADebugInfo()
+        )
+    }
+
+    private func readRDMADebugInfo() -> DebugInfo.RDMADebugInfo {
+        DebugInfo.RDMADebugInfo(
+            rdmaCtlStatus: safeRunCommand(["/usr/bin/rdma_ctl", "status"]),
+            ibvDevices: safeRunCommand(["/usr/bin/ibv_devices"]),
+            ibvDevinfo: safeRunCommand(["/usr/bin/ibv_devinfo"])
        )
    }

    private func readThunderboltBridgeDisabled() -> Bool? {
-        let result = runCommand(["/usr/sbin/networksetup", "-getnetworkserviceenabled", "Thunderbolt Bridge"])
+        let result = runCommand([
+            "/usr/sbin/networksetup", "-getnetworkserviceenabled", "Thunderbolt Bridge",
+        ])
        guard result.exitCode == 0 else { return nil }
        let output = result.output.lowercased()
        if output.contains("enabled") {
@@ -157,7 +223,8 @@ struct BugReportService {
        request.timeoutInterval = 5
        do {
            let (data, response) = try await URLSession.shared.data(for: request)
-            guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
+            guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode)
+            else {
                return nil
            }
            return data
@@ -166,6 +233,36 @@ struct BugReportService {
        }
    }

+    private func uploadToPresignedUrl(url: URL, body: Data) async throws {
+        let maxAttempts = 2
+        var lastError: Error?
+
+        for attempt in 1...maxAttempts {
+            do {
+                var request = URLRequest(url: url)
+                request.httpMethod = "PUT"
+                request.httpBody = body
+                request.timeoutInterval = 30
+
+                let (_, response) = try await URLSession.shared.data(for: request)
+                guard let http = response as? HTTPURLResponse else {
+                    throw BugReportError.uploadFailed("Non-HTTP response")
+                }
+                guard (200..<300).contains(http.statusCode) else {
+                    throw BugReportError.uploadFailed("HTTP status \(http.statusCode)")
+                }
+                return
+            } catch {
+                lastError = error
+                if attempt < maxAttempts {
+                    try await Task.sleep(nanoseconds: 400_000_000)
+                }
+            }
+        }
+
+        throw BugReportError.uploadFailed(lastError?.localizedDescription ?? "Unknown error")
+    }
+
    private func makeReportJson(
        timestamp: String,
        hostName: String,
@@ -183,7 +280,7 @@ struct BugReportService {
            "system": system,
            "exo_version": exo.version as Any,
            "exo_commit": exo.commit as Any,
-            "report_type": isManual ? "manual" : "automated"
+            "report_type": isManual ? "manual" : "automated",
        ]
        return try? JSONSerialization.data(withJSONObject: payload, options: [.prettyPrinted])
    }
@@ -214,10 +311,13 @@ struct BugReportService {
        let user = safeRunCommand(["/usr/bin/whoami"])
        let consoleUser = safeRunCommand(["/usr/bin/stat", "-f%Su", "/dev/console"])
        let uptime = safeRunCommand(["/usr/bin/uptime"])
-        let diskRoot = safeRunCommand(["/bin/sh", "-c", "/bin/df -h / | awk 'NR==2 {print $1, $2, $3, $4, $5}'"])
+        let diskRoot = safeRunCommand([
+            "/bin/sh", "-c", "/bin/df -h / | awk 'NR==2 {print $1, $2, $3, $4, $5}'",
+        ])

        let interfacesList = safeRunCommand(["/usr/sbin/ipconfig", "getiflist"])
-        let interfacesAndIPs = interfacesList?
+        let interfacesAndIPs =
+            interfacesList?
            .split(whereSeparator: { $0 == " " || $0 == "\n" })
            .compactMap { iface -> [String: Any]? in
                let name = String(iface)
@@ -228,7 +328,8 @@ struct BugReportService {
            } ?? []

        let wifiSSID: String?
-        let airportPath = "/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport"
+        let airportPath =
+            "/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport"
        if FileManager.default.isExecutableFile(atPath: airportPath) {
            wifiSSID = safeRunCommand([airportPath, "-I"]).flatMap(parseWifiSSID)
        } else {
@@ -256,7 +357,7 @@ struct BugReportService {
            "disk_root": diskRoot as Any,
            "interfaces_and_ips": interfacesAndIPs,
            "ipconfig_getiflist": interfacesList as Any,
-            "wifi_ssid": wifiSSID as Any
+            "wifi_ssid": wifiSSID as Any,
        ]
    }

@@ -314,7 +415,8 @@ struct BugReportService {
        for line in airportOutput.split(separator: "\n") {
            let trimmed = line.trimmingCharacters(in: .whitespaces)
            if trimmed.hasPrefix("SSID:") {
-                return trimmed.replacingOccurrences(of: "SSID:", with: "").trimmingCharacters(in: .whitespaces)
+                return trimmed.replacingOccurrences(of: "SSID:", with: "").trimmingCharacters(
+                    in: .whitespaces)
            }
        }
        return nil
@@ -351,6 +453,7 @@ struct BugReportService {
 private struct DebugInfo {
    let thunderboltBridgeDisabled: Bool?
    let interfaces: [InterfaceStatus]
+    let rdma: RDMADebugInfo

    struct InterfaceStatus {
        let name: String
@@ -359,7 +462,21 @@ private struct DebugInfo {
        func toDictionary() -> [String: Any] {
            [
                "name": name,
-                "ip": ip as Any
+                "ip": ip as Any,
+            ]
+        }
+    }
+
+    struct RDMADebugInfo {
+        let rdmaCtlStatus: String?
+        let ibvDevices: String?
+        let ibvDevinfo: String?
+
+        func toDictionary() -> [String: Any] {
+            [
+                "rdma_ctl_status": rdmaCtlStatus as Any,
+                "ibv_devices": ibvDevices as Any,
+                "ibv_devinfo": ibvDevinfo as Any,
            ]
        }
    }
@@ -367,7 +484,8 @@ private struct DebugInfo {
    func toDictionary() -> [String: Any] {
        [
            "thunderbolt_bridge_disabled": thunderboltBridgeDisabled as Any,
-            "interfaces": interfaces.map { $0.toDictionary() }
+            "interfaces": interfaces.map { $0.toDictionary() },
+            "rdma": rdma.toDictionary(),
        ]
    }
 }
@@ -377,163 +495,3 @@ private struct CommandResult {
    let output: String
    let error: String
 }
-
-private struct S3Uploader {
-    let config: BugReportService.AWSConfig
-
-    init(config: BugReportService.AWSConfig) throws {
-        self.config = config
-    }
-
-    func upload(objectPath: String, body: Data) async throws {
-        let host = "\(config.bucket).s3.amazonaws.com"
-        guard let url = URL(string: "https://\(host)/\(objectPath)") else {
-            throw BugReportError.invalidEndpoint
-        }
-
-        let now = Date()
-        let amzDate = awsTimestamp(now)
-        let dateStamp = dateStamp(now)
-        let payloadHash = sha256Hex(body)
-
-        let headers = [
-            "host": host,
-            "x-amz-content-sha256": payloadHash,
-            "x-amz-date": amzDate
-        ]
-
-        let canonicalRequest = buildCanonicalRequest(
-            method: "PUT",
-            url: url,
-            headers: headers,
-            payloadHash: payloadHash
-        )
-
-        let stringToSign = buildStringToSign(
-            amzDate: amzDate,
-            dateStamp: dateStamp,
-            canonicalRequestHash: sha256Hex(canonicalRequest.data(using: .utf8) ?? Data())
-        )
-
-        let signingKey = deriveKey(secret: config.secretKey, dateStamp: dateStamp, region: config.region, service: "s3")
-        let signature = hmacHex(key: signingKey, data: Data(stringToSign.utf8))
-
-        let signedHeaders = "host;x-amz-content-sha256;x-amz-date"
-        let authorization = """
-AWS4-HMAC-SHA256 Credential=\(config.accessKey)/\(dateStamp)/\(config.region)/s3/aws4_request, SignedHeaders=\(signedHeaders), Signature=\(signature)
-"""
-
-        var request = URLRequest(url: url)
-        request.httpMethod = "PUT"
-        request.httpBody = body
-        request.setValue(headers["x-amz-content-sha256"], forHTTPHeaderField: "x-amz-content-sha256")
-        request.setValue(headers["x-amz-date"], forHTTPHeaderField: "x-amz-date")
-        request.setValue(host, forHTTPHeaderField: "Host")
-        request.setValue(authorization, forHTTPHeaderField: "Authorization")
-
-        let (data, response) = try await URLSession.shared.data(for: request)
-        guard let http = response as? HTTPURLResponse, (200..<300).contains(http.statusCode) else {
-            let statusText = (response as? HTTPURLResponse)?.statusCode ?? -1
-            _ = data // ignore response body for UX
-            throw BugReportError.uploadFailed("HTTP status \(statusText)")
-        }
-    }
-
-    private func buildCanonicalRequest(
-        method: String,
-        url: URL,
-        headers: [String: String],
-        payloadHash: String
-    ) -> String {
-        let canonicalURI = encodePath(url.path)
-        let canonicalQuery = url.query ?? ""
-        let sortedHeaders = headers.sorted { $0.key < $1.key }
-        let canonicalHeaders = sortedHeaders
-            .map { "\($0.key.lowercased()):\($0.value)\n" }
-            .joined()
-        let signedHeaders = sortedHeaders.map { $0.key.lowercased() }.joined(separator: ";")
-
-        return [
-            method,
-            canonicalURI,
-            canonicalQuery,
-            canonicalHeaders,
-            signedHeaders,
-            payloadHash
-        ].joined(separator: "\n")
-    }
-
-    private func encodePath(_ path: String) -> String {
-        return path
-            .split(separator: "/")
-            .map { segment in
-                segment.addingPercentEncoding(withAllowedCharacters: Self.rfc3986) ?? String(segment)
-            }
-            .joined(separator: "/")
-            .prependSlashIfNeeded()
-    }
-
-    private func buildStringToSign(
-        amzDate: String,
-        dateStamp: String,
-        canonicalRequestHash: String
-    ) -> String {
-        """
-AWS4-HMAC-SHA256
-\(amzDate)
-\(dateStamp)/\(config.region)/s3/aws4_request
-\(canonicalRequestHash)
-"""
-    }
-
-    private func deriveKey(secret: String, dateStamp: String, region: String, service: String) -> Data {
-        let kDate = hmac(key: Data(("AWS4" + secret).utf8), data: Data(dateStamp.utf8))
-        let kRegion = hmac(key: kDate, data: Data(region.utf8))
-        let kService = hmac(key: kRegion, data: Data(service.utf8))
-        return hmac(key: kService, data: Data("aws4_request".utf8))
-    }
-
-    private func hmac(key: Data, data: Data) -> Data {
-        let keySym = SymmetricKey(data: key)
-        let mac = HMAC<SHA256>.authenticationCode(for: data, using: keySym)
-        return Data(mac)
-    }
-
-    private func hmacHex(key: Data, data: Data) -> String {
-        hmac(key: key, data: data).map { String(format: "%02x", $0) }.joined()
-    }
-
-    private func sha256Hex(_ data: Data) -> String {
-        let digest = SHA256.hash(data: data)
-        return digest.compactMap { String(format: "%02x", $0) }.joined()
-    }
-
-    private func awsTimestamp(_ date: Date) -> String {
-        let formatter = DateFormatter()
-        formatter.dateFormat = "yyyyMMdd'T'HHmmss'Z'"
-        formatter.timeZone = TimeZone(abbreviation: "UTC")
-        return formatter.string(from: date)
-    }
-
-    private func dateStamp(_ date: Date) -> String {
-        let formatter = DateFormatter()
-        formatter.dateFormat = "yyyyMMdd"
-        formatter.timeZone = TimeZone(abbreviation: "UTC")
-        return formatter.string(from: date)
-    }
-
-    private static let rfc3986: CharacterSet = {
-        var set = CharacterSet.alphanumerics
-        set.insert(charactersIn: "-._~")
-        return set
-    }()
-}
-
-private extension String {
-    func prependSlashIfNeeded() -> String {
-        if hasPrefix("/") {
-            return self
-        }
-        return "/" + self
-    }
-}
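
Reviewer note on the BugReportService.swift changes above: the object keys are now grouped by UTC day (`reports/YYYY/MM/DD/<timestamp>/...`), and the app swaps the embedded AWS credentials for a POST that returns one presigned URL per key. The sketch below reproduces the two formatting helpers and the Codable shapes from the diff so the resulting key layout and JSON contract can be checked in isolation. The free function `reportPrefix(for:)`, the sample reply, and the `example.invalid` URL are assumptions made for illustration; the real endpoint comes from the `EXOBugReportPresignedUrlEndpoint` Info.plist entry and its reply format is not shown in this diff.

```swift
import Foundation

// Request/response shapes copied from the diff (non-private here so the sketch compiles alone).
struct PresignedUrlsRequest: Codable {
    let keys: [String]
}

struct PresignedUrlsResponse: Codable {
    let urls: [String: String]
    let expiresIn: Int?
}

// Same formatting as dayPrefixString in the diff: UTC calendar day as YYYY/MM/DD.
func dayPrefixString(_ date: Date) -> String {
    var calendar = Calendar(identifier: .gregorian)
    calendar.timeZone = TimeZone(secondsFromGMT: 0) ?? .current
    let parts = calendar.dateComponents([.year, .month, .day], from: date)
    return String(format: "%04d/%02d/%02d", parts.year ?? 0, parts.month ?? 0, parts.day ?? 0)
}

// Same formatting as runTimestampString in the diff: no colons, millisecond precision.
func runTimestampString(_ date: Date) -> String {
    let formatter = DateFormatter()
    formatter.locale = Locale(identifier: "en_US_POSIX")
    formatter.timeZone = TimeZone(secondsFromGMT: 0) ?? .current
    formatter.dateFormat = "yyyy-MM-dd'T'HHmmss.SSS'Z'"
    return formatter.string(from: date)
}

// Hypothetical helper: builds the prefix the way sendReport does inline.
func reportPrefix(for date: Date) -> String {
    "reports/\(dayPrefixString(date))/\(runTimestampString(date))/"
}

// Hypothetical server reply; the real service may return additional fields.
let sampleReply = """
{"urls": {"reports/1970/01/01/1970-01-01T000000.000Z/exo.log": "https://example.invalid/put?sig=abc"}}
"""

let prefix = reportPrefix(for: Date(timeIntervalSince1970: 0))
let requestBody = try JSONEncoder().encode(PresignedUrlsRequest(keys: ["\(prefix)exo.log"]))
let reply = try JSONDecoder().decode(PresignedUrlsResponse.self, from: Data(sampleReply.utf8))

print(prefix)
print(String(decoding: requestBody, as: UTF8.self))
print(reply.urls["\(prefix)exo.log"] ?? "missing")
```

One thing worth confirming against the server: with a default `JSONDecoder`, `expiresIn` only populates from a key spelled exactly `expiresIn`. Because the property is optional, a reply that uses `expires_in` would still decode, just with the value silently left `nil`.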
--- a/app/EXO/EXO/Services/ClusterStateService.swift
+++ b/app/EXO/EXO/Services/ClusterStateService.swift
@@ -7,6 +7,7 @@ final class ClusterStateService: ObservableObject {
    @Published private(set) var lastError: String?
    @Published private(set) var lastActionMessage: String?
    @Published private(set) var modelOptions: [ModelOption] = []
+    @Published private(set) var localNodeId: String?

    private var timer: Timer?
    private let decoder: JSONDecoder
@@ -15,7 +16,7 @@ final class ClusterStateService: ObservableObject {
    private let endpoint: URL

    init(
-        baseURL: URL = URL(string: "http://127.0.0.1:8000")!,
+        baseURL: URL = URL(string: "http://127.0.0.1:52415")!,
        session: URLSession = .shared
    ) {
        self.baseURL = baseURL
@@ -29,6 +30,7 @@ final class ClusterStateService: ObservableObject {
    func startPolling(interval: TimeInterval = 0.5) {
        stopPolling()
        Task {
+            await fetchLocalNodeId()
            await fetchModels()
            await fetchSnapshot()
        }
@@ -46,9 +48,33 @@ final class ClusterStateService: ObservableObject {
        latestSnapshot = nil
        lastError = nil
        lastActionMessage = nil
+        localNodeId = nil
+    }
+
+    private func fetchLocalNodeId() async {
+        do {
+            let url = baseURL.appendingPathComponent("node_id")
+            var request = URLRequest(url: url)
+            request.cachePolicy = .reloadIgnoringLocalCacheData
+            let (data, response) = try await session.data(for: request)
+            guard let httpResponse = response as? HTTPURLResponse,
+                (200..<300).contains(httpResponse.statusCode)
+            else {
+                return
+            }
+            if let nodeId = try? decoder.decode(String.self, from: data) {
+                localNodeId = nodeId
+            }
+        } catch {
+            // Silently ignore - localNodeId will remain nil and retry on next poll
+        }
    }

    private func fetchSnapshot() async {
+        // Retry fetching local node ID if not yet set
+        if localNodeId == nil {
+            await fetchLocalNodeId()
+        }
        do {
            var request = URLRequest(url: endpoint)
            request.cachePolicy = .reloadIgnoringLocalCacheData
@@ -89,7 +115,9 @@ final class ClusterStateService: ObservableObject {
        }
    }

-    func launchInstance(modelId: String, sharding: String, instanceMeta: String, minNodes: Int) async {
+    func launchInstance(modelId: String, sharding: String, instanceMeta: String, minNodes: Int)
+        async
+    {
        do {
            var request = URLRequest(url: baseURL.appendingPathComponent("instance"))
            request.httpMethod = "POST"
@@ -98,7 +126,7 @@ final class ClusterStateService: ObservableObject {
                "model_id": modelId,
                "sharding": sharding,
                "instance_meta": instanceMeta,
-                "min_nodes": minNodes
+                "min_nodes": minNodes,
            ]
            request.httpBody = try JSONSerialization.data(withJSONObject: payload, options: [])
            let (_, response) = try await session.data(for: request)
@@ -119,7 +147,9 @@ final class ClusterStateService: ObservableObject {
        do {
            let url = baseURL.appendingPathComponent("models")
            let (data, response) = try await session.data(from: url)
-            guard let httpResponse = response as? HTTPURLResponse, (200..<300).contains(httpResponse.statusCode) else {
+            guard let httpResponse = response as? HTTPURLResponse,
+                (200..<300).contains(httpResponse.statusCode)
+            else {
                throw URLError(.badServerResponse)
            }
            let list = try decoder.decode(ModelListResponse.self, from: data)
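
Reviewer note on `fetchLocalNodeId` above: the body is decoded with `decoder.decode(String.self, from: data)`, so the `node_id` endpoint has to answer with a JSON string literal (quotes included) rather than bare text. The following is a minimal sketch of that behavior, assuming a default `JSONDecoder` and a Foundation version that accepts top-level JSON fragments; `parseNodeId` and the sample node ID are hypothetical names used only for illustration.

```swift
import Foundation

// Hypothetical helper mirroring the decode step in fetchLocalNodeId.
func parseNodeId(_ body: Data) -> String? {
    try? JSONDecoder().decode(String.self, from: body)
}

// A JSON string literal, as a JSON API would typically serialize a plain string.
let quoted = parseNodeId(Data("\"node-1234\"".utf8))

// Bare text is not valid JSON, so decoding fails and the helper yields nil.
let bare = parseNodeId(Data("node-1234".utf8))

print(quoted ?? "nil", bare ?? "nil")
```

Since the diff uses `try?` and ignores failures, a format mismatch would not surface as an error: `localNodeId` would stay `nil` and `fetchSnapshot` would retry the request on every poll.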
--- a/app/EXO/EXO/Services/LocalNetworkChecker.swift
+++ b/app/EXO/EXO/Services/LocalNetworkChecker.swift
@@ -0,0 +1,150 @@
+import Foundation
+import Network
+import os.log
+
+/// Checks if the app's local network permission is actually functional.
+///
+/// macOS local network permission can appear enabled in System Preferences but not
+/// actually work after a restart. This service detects this by creating a UDP
+/// connection to the mDNS multicast address (224.0.0.251:5353).
+@MainActor
+final class LocalNetworkChecker: ObservableObject {
+    enum Status: Equatable {
+        case unknown
+        case checking
+        case working
+        case notWorking(reason: String)
+
+        var isHealthy: Bool {
+            if case .working = self { return true }
+            return false
+        }
+
+        var displayText: String {
+            switch self {
+            case .unknown:
+                return "Unknown"
+            case .checking:
+                return "Checking..."
+            case .working:
+                return "Working"
+            case .notWorking(let reason):
+                return reason
+            }
+        }
+    }
+
+    private static let logger = Logger(subsystem: "io.exo.EXO", category: "LocalNetworkChecker")
+
+    @Published private(set) var status: Status = .unknown
+    @Published private(set) var lastConnectionState: String = "none"
+
+    private var connection: NWConnection?
+    private var checkTask: Task<Void, Never>?
+
+    /// Checks if local network access is working.
+    func check() {
+        checkTask?.cancel()
+        status = .checking
+        lastConnectionState = "connecting"
+
+        checkTask = Task { [weak self] in
+            guard let self else { return }
+            let result = await self.performCheck()
+            self.status = result
+            Self.logger.info("Local network check complete: \(result.displayText)")
+        }
+    }
+
+    private func performCheck() async -> Status {
+        Self.logger.info("Checking local network access via UDP multicast")
+
+        connection?.cancel()
+        connection = nil
+
+        // mDNS multicast address - same as libp2p uses for peer discovery
+        let host = NWEndpoint.Host("224.0.0.251")
+        let port = NWEndpoint.Port(integerLiteral: 5353)
+
+        let params = NWParameters.udp
+        params.allowLocalEndpointReuse = true
+
+        let conn = NWConnection(host: host, port: port, using: params)
+        connection = conn
+
+        return await withCheckedContinuation { continuation in
+            var hasResumed = false
+            let lock = NSLock()
+
+            let resumeOnce: (Status) -> Void = { status in
+                lock.lock()
+                defer { lock.unlock() }
+                guard !hasResumed else { return }
+                hasResumed = true
+                continuation.resume(returning: status)
+            }
+
+            conn.stateUpdateHandler = { [weak self] state in
+                let stateStr: String
+                switch state {
+                case .setup: stateStr = "setup"
+                case .preparing: stateStr = "preparing"
+                case .ready: stateStr = "ready"
+                case .waiting(let e): stateStr = "waiting(\(e))"
+                case .failed(let e): stateStr = "failed(\(e))"
+                case .cancelled: stateStr = "cancelled"
+                @unknown default: stateStr = "unknown"
+                }
+
+                Task { @MainActor in
+                    self?.lastConnectionState = stateStr
+                }
+
+                switch state {
+                case .ready:
+                    resumeOnce(.working)
+                case .waiting(let error):
+                    let errorStr = "\(error)"
+                    if errorStr.contains("54") || errorStr.contains("ECONNRESET") {
+                        resumeOnce(.notWorking(reason: "Connection blocked"))
+                    }
+                case .failed(let error):
+                    let errorStr = "\(error)"
+                    if errorStr.contains("65") || errorStr.contains("EHOSTUNREACH")
+                        || errorStr.contains("permission") || errorStr.contains("denied")
+                    {
+                        resumeOnce(.notWorking(reason: "Permission denied"))
+                    } else {
+                        resumeOnce(.notWorking(reason: "Failed: \(error.localizedDescription)"))
+                    }
+                case .cancelled, .setup, .preparing:
+                    break
+                @unknown default:
+                    break
+                }
+            }
+
+            conn.start(queue: .main)
+
+            Task {
+                try? await Task.sleep(nanoseconds: 3_000_000_000)
+                let state = conn.state
+                switch state {
+                case .ready:
+                    resumeOnce(.working)
+                case .waiting, .preparing, .setup:
+                    resumeOnce(.notWorking(reason: "Timeout (may be blocked)"))
+                default:
+                    resumeOnce(.notWorking(reason: "Timeout"))
+                }
+            }
+        }
+    }
+
+    func stop() {
+        checkTask?.cancel()
+        checkTask = nil
+        connection?.cancel()
+        connection = nil
+    }
+}
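
Reviewer note on `performCheck` above: both the connection state handler and the 3-second timeout task can try to finish the same checked continuation, and resuming a checked continuation twice is a runtime error. The diff guards against that with a lock-protected `hasResumed` flag inside the `resumeOnce` closure. The sketch below isolates that guard as a small class so it can be exercised without the Network framework; `ResumeOnce` and `claim()` are hypothetical names, not part of the PR.

```swift
import Foundation

/// Hypothetical helper: grants permission to deliver a result exactly once.
final class ResumeOnce: @unchecked Sendable {
    private let lock = NSLock()
    private var hasResumed = false

    /// Returns true for the first caller only; every later caller gets false.
    func claim() -> Bool {
        lock.lock()
        defer { lock.unlock() }
        if hasResumed { return false }
        hasResumed = true
        return true
    }
}

// Simulate the state handler and the timeout both trying to deliver a status.
let gate = ResumeOnce()
var delivered: [String] = []
for outcome in ["working", "Timeout"] {
    if gate.claim() {
        delivered.append(outcome)  // stands in for continuation.resume(returning:)
    }
}
print(delivered)
```

In the real checker the winning caller would invoke `continuation.resume(returning:)` where this sketch appends to `delivered`. Wrapping the flag in a class rather than capturing a local `var` is a design choice that tends to sit better with stricter concurrency checking; the closure-based version in the diff behaves the same way.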
--- a/app/EXO/EXO/Services/NetworkSetupHelper.swift
+++ b/app/EXO/EXO/Services/NetworkSetupHelper.swift
@@ -5,64 +5,37 @@ import os.log
 enum NetworkSetupHelper {
    private static let logger = Logger(subsystem: "io.exo.EXO", category: "NetworkSetup")
    private static let daemonLabel = "io.exo.networksetup"
-    private static let scriptDestination = "/Library/Application Support/EXO/disable_bridge_enable_dhcp.sh"
+    private static let scriptDestination =
+        "/Library/Application Support/EXO/disable_bridge.sh"
    private static let plistDestination = "/Library/LaunchDaemons/io.exo.networksetup.plist"
    private static let requiredStartInterval: Int = 1791

    private static let setupScript = """
-#!/usr/bin/env bash
+        #!/usr/bin/env bash

-set -euo pipefail
+        set -euo pipefail

-PREFS="/Library/Preferences/SystemConfiguration/preferences.plist"
+        PREFS="/Library/Preferences/SystemConfiguration/preferences.plist"

-# Remove bridge0 interface
-ifconfig bridge0 &>/dev/null && {
-  ifconfig bridge0 | grep -q 'member' && {
-    ifconfig bridge0 | awk '/member/ {print $2}' | xargs -n1 ifconfig bridge0 deletem 2>/dev/null || true
-  }
-  ifconfig bridge0 destroy 2>/dev/null || true
-}
+        # Remove bridge0 interface
+        ifconfig bridge0 &>/dev/null && {
+          ifconfig bridge0 | grep -q 'member' && {
+            ifconfig bridge0 | awk '/member/ {print $2}' | xargs -n1 ifconfig bridge0 deletem 2>/dev/null || true
+          }
+          ifconfig bridge0 destroy 2>/dev/null || true
+        }

-# Remove Thunderbolt Bridge from VirtualNetworkInterfaces in preferences.plist
-/usr/libexec/PlistBuddy -c "Delete :VirtualNetworkInterfaces:Bridge:bridge0" "$PREFS" 2>/dev/null || true
+        # Remove Thunderbolt Bridge from VirtualNetworkInterfaces in preferences.plist
+        /usr/libexec/PlistBuddy -c "Delete :VirtualNetworkInterfaces:Bridge:bridge0" "$PREFS" 2>/dev/null || true

-networksetup -listlocations | grep -q exo || {
-  networksetup -createlocation exo
-}
-
-networksetup -switchtolocation exo
-networksetup -listallhardwareports \\
-  | awk -F': ' '/Hardware Port: / {print $2}' \\
-  | while IFS=":" read -r name; do
-      case "$name" in
-        "Ethernet Adapter"*)
-                ;;
-        "Thunderbolt Bridge")
-                ;;
-        "Thunderbolt "*)
-          networksetup -listallnetworkservices \\
-            | grep -q "EXO $name" \\
-              || networksetup -createnetworkservice "EXO $name" "$name" 2>/dev/null \\
-              || continue
-          networksetup -setdhcp "EXO $name"
-                ;;
-        *)
-          networksetup -listallnetworkservices \\
-            | grep -q "$name" \\
-              || networksetup -createnetworkservice "$name" "$name" 2>/dev/null \\
-              || continue
-                ;;
-      esac
-    done
-
-networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
-  networksetup -setnetworkserviceenabled "Thunderbolt Bridge" off
-} || true
-"""
+        networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
+          networksetup -setnetworkserviceenabled "Thunderbolt Bridge" off
+        } || true
+        """

    static func ensureLaunchDaemonInstalled() {
-        Task.detached {
+        // Use .utility priority to match NSAppleScript's internal QoS and avoid priority inversion
+        Task.detached(priority: .utility) {
            do {
                if daemonAlreadyInstalled() {
                    return
@@ -70,11 +43,70 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
                try await installLaunchDaemon()
                logger.info("Network setup launch daemon installed and started")
            } catch {
-                logger.error("Network setup launch daemon failed: \(error.localizedDescription, privacy: .public)")
+                logger.error(
+                    "Network setup launch daemon failed: \(error.localizedDescription, privacy: .public)"
+                )
            }
        }
    }

+    /// Removes all EXO network setup components from the system.
+    /// This includes the LaunchDaemon, scripts, logs, and network location.
+    /// Requires admin privileges.
+    static func uninstall() throws {
+        let uninstallScript = makeUninstallScript()
+        try runShellAsAdmin(uninstallScript)
+        logger.info("EXO network setup components removed successfully")
+    }
+
+    /// Checks if there are any EXO network components installed that need cleanup
+    static func hasInstalledComponents() -> Bool {
+        let manager = FileManager.default
+        let scriptExists = manager.fileExists(atPath: scriptDestination)
+        let plistExists = manager.fileExists(atPath: plistDestination)
+        return scriptExists || plistExists
+    }
+
+    private static func makeUninstallScript() -> String {
+        """
+        set -euo pipefail
+
+        LABEL="\(daemonLabel)"
+        SCRIPT_DEST="\(scriptDestination)"
+        PLIST_DEST="\(plistDestination)"
+        LOG_OUT="/var/log/\(daemonLabel).log"
+        LOG_ERR="/var/log/\(daemonLabel).err.log"
+
+        # Unload the LaunchDaemon if running
+        launchctl bootout system/"$LABEL" 2>/dev/null || true
+
+        # Remove LaunchDaemon plist
+        rm -f "$PLIST_DEST"
+
+        # Remove the script and parent directory if empty
+        rm -f "$SCRIPT_DEST"
+        rmdir "$(dirname "$SCRIPT_DEST")" 2>/dev/null || true
+
+        # Remove log files
+        rm -f "$LOG_OUT" "$LOG_ERR"
+
+        # Switch back to Automatic network location
+        networksetup -switchtolocation Automatic 2>/dev/null || true
+
+        # Delete the exo network location if it exists
+        networksetup -listlocations | grep -q '^exo$' && {
+          networksetup -deletelocation exo 2>/dev/null || true
+        } || true
+
+        # Re-enable Thunderbolt Bridge if it exists
+        networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
+          networksetup -setnetworkserviceenabled "Thunderbolt Bridge" on 2>/dev/null || true
+        } || true
+
+        echo "EXO network components removed successfully"
+        """
+    }
+
    private static func daemonAlreadyInstalled() -> Bool {
        let manager = FileManager.default
        let scriptExists = manager.fileExists(atPath: scriptDestination)
@@ -82,7 +114,8 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
        guard scriptExists, plistExists else { return false }
        guard
            let data = try? Data(contentsOf: URL(fileURLWithPath: plistDestination)),
-            let plist = try? PropertyListSerialization.propertyList(from: data, options: [], format: nil) as? [String: Any]
+            let plist = try? PropertyListSerialization.propertyList(
+                from: data, options: [], format: nil) as? [String: Any]
        else {
            return false
        }
@@ -92,7 +125,9 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
        else {
            return false
        }
-        if let programArgs = plist["ProgramArguments"] as? [String], programArgs.contains(scriptDestination) == false {
+        if let programArgs = plist["ProgramArguments"] as? [String],
+            programArgs.contains(scriptDestination) == false
+        {
            return false
        }
        return true
@@ -105,58 +140,59 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {

    private static func makeInstallerScript() -> String {
        """
-set -euo pipefail
+        set -euo pipefail

-LABEL="\(daemonLabel)"
-SCRIPT_DEST="\(scriptDestination)"
-PLIST_DEST="\(plistDestination)"
+        LABEL="\(daemonLabel)"
+        SCRIPT_DEST="\(scriptDestination)"
+        PLIST_DEST="\(plistDestination)"

-mkdir -p "$(dirname "$SCRIPT_DEST")"
+        mkdir -p "$(dirname "$SCRIPT_DEST")"

-cat > "$SCRIPT_DEST" <<'EOF_SCRIPT'
-\(setupScript)
-EOF_SCRIPT
-chmod 755 "$SCRIPT_DEST"
+        cat > "$SCRIPT_DEST" <<'EOF_SCRIPT'
+        \(setupScript)
+        EOF_SCRIPT
+        chmod 755 "$SCRIPT_DEST"

-cat > "$PLIST_DEST" <<'EOF_PLIST'
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
-<plist version="1.0">
-<dict>
-  <key>Label</key>
-  <string>\(daemonLabel)</string>
-  <key>ProgramArguments</key>
-  <array>
-    <string>/bin/bash</string>
-    <string>\(scriptDestination)</string>
-  </array>
-  <key>StartInterval</key>
-  <integer>\(requiredStartInterval)</integer>
-  <key>RunAtLoad</key>
-  <true/>
-  <key>StandardOutPath</key>
-  <string>/var/log/\(daemonLabel).log</string>
-  <key>StandardErrorPath</key>
-  <string>/var/log/\(daemonLabel).err.log</string>
-</dict>
-</plist>
-EOF_PLIST
+        cat > "$PLIST_DEST" <<'EOF_PLIST'
+        <?xml version="1.0" encoding="UTF-8"?>
+        <!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+        <plist version="1.0">
+        <dict>
+          <key>Label</key>
+          <string>\(daemonLabel)</string>
+          <key>ProgramArguments</key>
+          <array>
+            <string>/bin/bash</string>
+            <string>\(scriptDestination)</string>
+          </array>
+          <key>StartInterval</key>
+          <integer>\(requiredStartInterval)</integer>
+          <key>RunAtLoad</key>
+          <true/>
+          <key>StandardOutPath</key>
+          <string>/var/log/\(daemonLabel).log</string>
+          <key>StandardErrorPath</key>
+          <string>/var/log/\(daemonLabel).err.log</string>
+        </dict>
+        </plist>
+        EOF_PLIST

-launchctl bootout system/"$LABEL" >/dev/null 2>&1 || true
-launchctl bootstrap system "$PLIST_DEST"
-launchctl enable system/"$LABEL"
-launchctl kickstart -k system/"$LABEL"
-"""
+        launchctl bootout system/"$LABEL" >/dev/null 2>&1 || true
+        launchctl bootstrap system "$PLIST_DEST"
+        launchctl enable system/"$LABEL"
+        launchctl kickstart -k system/"$LABEL"
+        """
    }

    private static func runShellAsAdmin(_ script: String) throws {
-        let escapedScript = script
+        let escapedScript =
+            script
            .replacingOccurrences(of: "\\", with: "\\\\")
            .replacingOccurrences(of: "\"", with: "\\\"")

        let appleScriptSource = """
-do shell script "\(escapedScript)" with administrator privileges
-"""
+            do shell script "\(escapedScript)" with administrator privileges
+            """

        guard let appleScript = NSAppleScript(source: appleScriptSource) else {
            throw NetworkSetupError.scriptCreationFailed
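
Review note on `runShellAsAdmin`: the order of the two `replacingOccurrences` calls matters. Backslashes have to be doubled before quotes are escaped, otherwise the backslashes introduced for the quotes would themselves be doubled. The sketch below mirrors that logic in Python purely for illustration; the function name `escape_for_applescript` is hypothetical and is not part of the app.

```python
def escape_for_applescript(script: str) -> str:
    """Escape a shell script so it can sit inside an AppleScript string literal.

    Mirrors the two-step replacement in the Swift helper:
    1. double every backslash, 2. escape every double quote.
    """
    escaped = script.replace("\\", "\\\\")  # step 1: backslashes first
    escaped = escaped.replace('"', '\\"')   # step 2: then double quotes
    return escaped


def wrap_as_admin_command(script: str) -> str:
    """Build the AppleScript source line used to run the script with privileges."""
    return f'do shell script "{escape_for_applescript(script)}" with administrator privileges'
```

Reversing the two steps would turn `"` into `\\"`, which closes the AppleScript string early.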
--- a/app/EXO/EXO/Services/NetworkStatusService.swift
+++ b/app/EXO/EXO/Services/NetworkStatusService.swift
@@ -35,14 +35,34 @@ struct NetworkStatus: Equatable {
    let thunderboltBridgeState: ThunderboltState?
    let bridgeInactive: Bool?
    let interfaceStatuses: [InterfaceIpStatus]
+    let rdmaStatus: RDMAStatus

    static let empty = NetworkStatus(
        thunderboltBridgeState: nil,
        bridgeInactive: nil,
-        interfaceStatuses: []
+        interfaceStatuses: [],
+        rdmaStatus: .empty
    )
 }

+struct RDMAStatus: Equatable {
+    let rdmaCtlEnabled: Bool?
+    let devices: [String]
+    let activePorts: [RDMAPort]
+
+    var isAvailable: Bool {
+        rdmaCtlEnabled == true || !devices.isEmpty
+    }
+
+    static let empty = RDMAStatus(rdmaCtlEnabled: nil, devices: [], activePorts: [])
+}
+
+struct RDMAPort: Equatable {
+    let device: String
+    let port: String
+    let state: String
+}
+
 struct InterfaceIpStatus: Equatable {
    let interfaceName: String
    let ipAddress: String?
@@ -59,10 +79,79 @@ private struct NetworkStatusFetcher {
        NetworkStatus(
            thunderboltBridgeState: readThunderboltBridgeState(),
            bridgeInactive: readBridgeInactive(),
-            interfaceStatuses: readInterfaceStatuses()
+            interfaceStatuses: readInterfaceStatuses(),
+            rdmaStatus: readRDMAStatus()
        )
    }

+    private func readRDMAStatus() -> RDMAStatus {
+        let rdmaCtlEnabled = readRDMACtlEnabled()
+        let devices = readRDMADevices()
+        let activePorts = readRDMAActivePorts()
+        return RDMAStatus(
+            rdmaCtlEnabled: rdmaCtlEnabled, devices: devices, activePorts: activePorts)
+    }
+
+    private func readRDMACtlEnabled() -> Bool? {
+        let result = runCommand(["rdma_ctl", "status"])
+        guard result.exitCode == 0 else { return nil }
+        let output = result.output.lowercased().trimmingCharacters(in: .whitespacesAndNewlines)
+        if output.contains("enabled") {
+            return true
+        }
+        if output.contains("disabled") {
+            return false
+        }
+        return nil
+    }
+
+    private func readRDMADevices() -> [String] {
+        let result = runCommand(["ibv_devices"])
+        guard result.exitCode == 0 else { return [] }
+        var devices: [String] = []
+        for line in result.output.split(separator: "\n") {
+            let trimmed = line.trimmingCharacters(in: .whitespaces)
+            if trimmed.hasPrefix("---") || trimmed.lowercased().hasPrefix("device")
+                || trimmed.isEmpty
+            {
+                continue
+            }
+            let parts = trimmed.split(separator: " ", maxSplits: 1)
+            if let deviceName = parts.first {
+                devices.append(String(deviceName))
+            }
+        }
+        return devices
+    }
+
+    private func readRDMAActivePorts() -> [RDMAPort] {
+        let result = runCommand(["ibv_devinfo"])
+        guard result.exitCode == 0 else { return [] }
+        var ports: [RDMAPort] = []
+        var currentDevice: String?
+        var currentPort: String?
+
+        for line in result.output.split(separator: "\n") {
+            let trimmed = line.trimmingCharacters(in: .whitespaces)
+            if trimmed.hasPrefix("hca_id:") {
+                currentDevice = trimmed.replacingOccurrences(of: "hca_id:", with: "")
+                    .trimmingCharacters(in: .whitespaces)
+            } else if trimmed.hasPrefix("port:") {
+                currentPort = trimmed.replacingOccurrences(of: "port:", with: "")
+                    .trimmingCharacters(in: .whitespaces)
+            } else if trimmed.hasPrefix("state:") {
+                let state = trimmed.replacingOccurrences(of: "state:", with: "").trimmingCharacters(
+                    in: .whitespaces)
+                if let device = currentDevice, let port = currentPort {
+                    if state.lowercased().contains("active") {
+                        ports.append(RDMAPort(device: device, port: port, state: state))
+                    }
+                }
+            }
+        }
+        return ports
+    }
+
    private func readThunderboltBridgeState() -> ThunderboltState? {
        let result = runCommand(["networksetup", "-getnetworkserviceenabled", "Thunderbolt Bridge"])
        guard result.exitCode == 0 else {
@@ -85,10 +174,11 @@ private struct NetworkStatusFetcher {
    private func readBridgeInactive() -> Bool? {
        let result = runCommand(["ifconfig", "bridge0"])
        guard result.exitCode == 0 else { return nil }
-        guard let statusLine = result.output
-            .components(separatedBy: .newlines)
-            .first(where: { $0.contains("status:") })?
-            .lowercased()
+        guard
+            let statusLine = result.output
+                .components(separatedBy: .newlines)
+                .first(where: { $0.contains("status:") })?
+                .lowercased()
        else {
            return nil
        }
@@ -171,4 +261,3 @@ private struct NetworkStatusFetcher {
        )
    }
 }
-
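
Review note on `readRDMAActivePorts`: the parser is a small state machine that remembers the most recent `hca_id:` and `port:` values and emits a port only when a following `state:` line contains "active". The Python sketch below reproduces that behaviour for illustration. The sample text and the device name `rdma_en2` are assumptions made up for this example, not captured output from a real machine.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RDMAPort:
    device: str
    port: str
    state: str


def parse_active_ports(output: str) -> list[RDMAPort]:
    """Collect (device, port, state) entries whose state mentions 'active'."""
    ports: list[RDMAPort] = []
    current_device: str | None = None
    current_port: str | None = None

    for line in output.split("\n"):
        trimmed = line.strip()
        if trimmed.startswith("hca_id:"):
            current_device = trimmed.removeprefix("hca_id:").strip()
        elif trimmed.startswith("port:"):
            current_port = trimmed.removeprefix("port:").strip()
        elif trimmed.startswith("state:"):
            state = trimmed.removeprefix("state:").strip()
            # Like the Swift code, the port is not reset when a new device starts.
            if current_device is not None and current_port is not None:
                if "active" in state.lower():
                    ports.append(RDMAPort(current_device, current_port, state))
    return ports


# Hypothetical input shaped like the fields the parser looks for.
SAMPLE = """hca_id: rdma_en2
    port: 1
        state: PORT_ACTIVE (4)
        phys_state: LINK_UP (5)
    port: 2
        state: PORT_DOWN (1)
"""
```

Lines such as `phys_state:` are ignored because the prefix check runs on the trimmed line, which starts with `phys_state:` and not `state:`.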
--- a/app/EXO/EXO/ViewModels/InstanceViewModel.swift
+++ b/app/EXO/EXO/ViewModels/InstanceViewModel.swift
@@ -57,7 +57,7 @@ struct InstanceViewModel: Identifiable, Equatable {
        case waiting
        case failed
        case idle
-        case unknown
+        case preparing

        var label: String {
            switch self {
@@ -68,7 +68,7 @@ struct InstanceViewModel: Identifiable, Equatable {
            case .waiting: return "Waiting"
            case .failed: return "Failed"
            case .idle: return "Idle"
-            case .unknown: return "Unknown"
+            case .preparing: return "Preparing"
            }
        }
    }
@@ -107,10 +107,13 @@ extension ClusterState {
            let nodeToRunner = instance.shardAssignments.nodeToRunner
            let nodeIds = Array(nodeToRunner.keys)
            let runnerIds = Array(nodeToRunner.values)
-            let nodeNames = nodeIds.compactMap { nodeProfiles[$0]?.friendlyName ?? nodeProfiles[$0]?.modelId ?? $0 }
+            let nodeNames = nodeIds.compactMap {
+                nodeProfiles[$0]?.friendlyName ?? nodeProfiles[$0]?.modelId ?? $0
+            }
            let statuses = runnerIds.compactMap { runners[$0]?.status.lowercased() }
            let downloadProgress = aggregateDownloadProgress(for: nodeIds)
-            let state = InstanceViewModel.State(statuses: statuses, hasActiveDownload: downloadProgress != nil)
+            let state = InstanceViewModel.State(
+                statuses: statuses, hasActiveDownload: downloadProgress != nil)
            let chatTasks = (chatTasksByInstance[entry.key] ?? [])
                .sorted(by: { $0.sortPriority < $1.sortPriority })
                .map { InstanceTaskViewModel(task: $0) }
@@ -165,8 +168,8 @@ extension ClusterState {
    }
 }

-private extension InstanceViewModel.State {
-    init(statuses: [String], hasActiveDownload: Bool = false) {
+extension InstanceViewModel.State {
+    fileprivate init(statuses: [String], hasActiveDownload: Bool = false) {
        if statuses.contains(where: { $0.contains("failed") }) {
            self = .failed
        } else if hasActiveDownload || statuses.contains(where: { $0.contains("downloading") }) {
@@ -182,7 +185,7 @@ private extension InstanceViewModel.State {
        } else if statuses.isEmpty {
            self = .idle
        } else {
-            self = .unknown
+            self = .preparing
        }
    }
 }
@@ -243,4 +246,3 @@ extension InstanceTaskViewModel {
        self.parameters = task.parameters
    }
 }
-
--- a/app/EXO/EXO/ViewModels/NodeViewModel.swift
+++ b/app/EXO/EXO/ViewModels/NodeViewModel.swift
@@ -85,9 +85,11 @@ struct TopologyViewModel {
 }

 extension ClusterState {
-    func topologyViewModel() -> TopologyViewModel? {
+    func topologyViewModel(localNodeId: String?) -> TopologyViewModel? {
        let topologyNodeIds = Set(topology?.nodes.map(\.nodeId) ?? [])
-        let allNodes = nodeViewModels().filter { topologyNodeIds.isEmpty || topologyNodeIds.contains($0.id) }
+        let allNodes = nodeViewModels().filter {
+            topologyNodeIds.isEmpty || topologyNodeIds.contains($0.id)
+        }
        guard !allNodes.isEmpty else { return nil }

        let nodesById = Dictionary(uniqueKeysWithValues: allNodes.map { ($0.id, $0) })
@@ -105,17 +107,25 @@ extension ClusterState {
            orderedNodes = allNodes
        }

+        // Rotate so the local node (from /node_id API) is first
+        if let localId = localNodeId,
+            let index = orderedNodes.firstIndex(where: { $0.id == localId })
+        {
+            orderedNodes = Array(orderedNodes[index...]) + Array(orderedNodes[..<index])
+        }
+
        let nodeIds = Set(orderedNodes.map(\.id))
-        let edgesArray: [TopologyEdgeViewModel] = topology?.connections?.compactMap { connection in
-            guard nodeIds.contains(connection.localNodeId), nodeIds.contains(connection.sendBackNodeId) else { return nil }
-            return TopologyEdgeViewModel(sourceId: connection.localNodeId, targetId: connection.sendBackNodeId)
-        } ?? []
+        let edgesArray: [TopologyEdgeViewModel] =
+            topology?.connections?.compactMap { connection in
+                guard nodeIds.contains(connection.localNodeId),
+                    nodeIds.contains(connection.sendBackNodeId)
+                else { return nil }
+                return TopologyEdgeViewModel(
+                    sourceId: connection.localNodeId, targetId: connection.sendBackNodeId)
+            } ?? []
        let edges = Set(edgesArray)

-        let topologyRootId = topology?.nodes.first?.nodeId
-        let currentId = orderedNodes.first(where: { $0.id == topologyRootId })?.id ?? orderedNodes.first?.id
-
-        return TopologyViewModel(nodes: orderedNodes, edges: Array(edges), currentNodeId: currentId)
+        return TopologyViewModel(
+            nodes: orderedNodes, edges: Array(edges), currentNodeId: localNodeId)
    }
 }
-
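
Review note on the rotation in `topologyViewModel(localNodeId:)`: the node order is rotated, not re-sorted, so the relative ring order is preserved while the local node moves to the front. A minimal Python sketch of the same slicing, with the hypothetical name `rotate_to_local`:

```python
from __future__ import annotations


def rotate_to_local(ordered_ids: list[str], local_id: str | None) -> list[str]:
    """Rotate the list so local_id comes first, keeping the cyclic order intact.

    If local_id is None or not present, the order is returned unchanged,
    matching the `if let ... firstIndex(where:)` guard in the Swift code.
    """
    if local_id is None or local_id not in ordered_ids:
        return list(ordered_ids)
    index = ordered_ids.index(local_id)
    return ordered_ids[index:] + ordered_ids[:index]
```

One behavioural difference from the removed code is worth checking in review: `currentNodeId` is now passed through as `localNodeId` directly, so it can be `nil` or an id that is absent from `orderedNodes`, whereas the old code fell back to the first ordered node.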
--- a/app/EXO/EXO/Views/InstanceRowView.swift
+++ b/app/EXO/EXO/Views/InstanceRowView.swift
@@ -20,8 +20,8 @@ struct InstanceRowView: View {
                if let progress = instance.downloadProgress {
                    downloadStatusView(progress: progress)
                } else {
-                statusChip(label: instance.state.label.uppercased(), color: statusColor)
-            }
+                    statusChip(label: instance.state.label.uppercased(), color: statusColor)
+                }
            }
            if let progress = instance.downloadProgress {
                GeometryReader { geometry in
@@ -83,7 +83,7 @@ struct InstanceRowView: View {
        case .ready: return .teal
        case .waiting, .idle: return .gray
        case .failed: return .red
-        case .unknown: return .secondary
+        case .preparing: return .secondary
        }
    }

@@ -97,7 +97,8 @@ struct InstanceRowView: View {
                        .font(.caption)
                        .fontWeight(.semibold)
                    if let subtitle = task.subtitle,
-                       subtitle.caseInsensitiveCompare(parentModelName) != .orderedSame {
+                        subtitle.caseInsensitiveCompare(parentModelName) != .orderedSame
+                    {
                        Text(subtitle)
                            .font(.caption2)
                            .foregroundColor(.secondary)
@@ -234,9 +235,12 @@ struct InstanceRowView: View {
        Button {
            isExpanded.wrappedValue.toggle()
        } label: {
-            Label(isExpanded.wrappedValue ? "Hide" : "Show", systemImage: isExpanded.wrappedValue ? "chevron.up" : "chevron.down")
-                .labelStyle(.titleAndIcon)
-                .contentTransition(.symbolEffect(.replace))
+            Label(
+                isExpanded.wrappedValue ? "Hide" : "Show",
+                systemImage: isExpanded.wrappedValue ? "chevron.up" : "chevron.down"
+            )
+            .labelStyle(.titleAndIcon)
+            .contentTransition(.symbolEffect(.replace))
        }
        .buttonStyle(.plain)
        .font(.caption2)
@@ -311,7 +315,9 @@ struct InstanceRowView: View {
        }

        @ViewBuilder
-        private func detailRow(icon: String? = nil, title: String, value: String, tint: Color = .secondary) -> some View {
+        private func detailRow(
+            icon: String? = nil, title: String, value: String, tint: Color = .secondary
+        ) -> some View {
            HStack(alignment: .firstTextBaseline, spacing: 6) {
                if let icon {
                    Image(systemName: icon)
@@ -329,4 +335,3 @@ struct InstanceRowView: View {
        }
    }
 }
-
--- a/app/EXO/EXO/Views/NodeDetailView.swift
+++ b/app/EXO/EXO/Views/NodeDetailView.swift
@@ -32,4 +32,3 @@ struct NodeDetailView: View {
        }
    }
 }
-
--- a/app/EXO/EXO/Views/NodeRowView.swift
+++ b/app/EXO/EXO/Views/NodeRowView.swift
@@ -28,4 +28,3 @@ struct NodeRowView: View {
        .padding(.vertical, 4)
    }
 }
-
--- a/app/EXO/EXO/Views/TopologyMiniView.swift
+++ b/app/EXO/EXO/Views/TopologyMiniView.swift
@@ -76,30 +76,33 @@ struct TopologyMiniView: View {

    private func connectionLines(in size: CGSize) -> some View {
        let positions = positionedNodes(in: size)
-        let positionById = Dictionary(uniqueKeysWithValues: positions.map { ($0.node.id, $0.point) })
+        let positionById = Dictionary(
+            uniqueKeysWithValues: positions.map { ($0.node.id, $0.point) })
        return Canvas { context, _ in
            guard !topology.edges.isEmpty else { return }
            let nodeRadius: CGFloat = 32
            let arrowLength: CGFloat = 10
            let arrowSpread: CGFloat = .pi / 7
            for edge in topology.edges {
-                guard let start = positionById[edge.sourceId], let end = positionById[edge.targetId] else { continue }
+                guard let start = positionById[edge.sourceId], let end = positionById[edge.targetId]
+                else { continue }
                let dx = end.x - start.x
                let dy = end.y - start.y
                let distance = max(CGFloat(hypot(dx, dy)), 1)
                let ux = dx / distance
                let uy = dy / distance
-                let adjustedStart = CGPoint(x: start.x + ux * nodeRadius, y: start.y + uy * nodeRadius)
+                let adjustedStart = CGPoint(
+                    x: start.x + ux * nodeRadius, y: start.y + uy * nodeRadius)
                let adjustedEnd = CGPoint(x: end.x - ux * nodeRadius, y: end.y - uy * nodeRadius)

                var linePath = Path()
                linePath.move(to: adjustedStart)
                linePath.addLine(to: adjustedEnd)
-            context.stroke(
+                context.stroke(
                    linePath,
                    with: .color(.secondary.opacity(0.3)),
-                style: StrokeStyle(lineWidth: 1, dash: [4, 4])
-            )
+                    style: StrokeStyle(lineWidth: 1, dash: [4, 4])
+                )

                let angle = atan2(uy, ux)
                let tip = adjustedEnd
@@ -168,5 +171,3 @@ private struct NodeGlyphView: View {
        .frame(width: 95)
    }
 }
-
-
--- a/app/EXO/EXOTests/EXOTests.swift
+++ b/app/EXO/EXOTests/EXOTests.swift
@@ -6,6 +6,7 @@
 //

 import Testing
+
 @testable import EXO

 struct EXOTests {
--- a/app/EXO/uninstall-exo.sh
+++ b/app/EXO/uninstall-exo.sh
@@ -0,0 +1,154 @@
+#!/usr/bin/env bash
+#
+# EXO Uninstaller Script
+#
+# This script removes all EXO system components that persist after deleting the app.
+# Run with: sudo ./uninstall-exo.sh
+#
+# Components removed:
+# - LaunchDaemon: /Library/LaunchDaemons/io.exo.networksetup.plist
+# - Network script: /Library/Application Support/EXO/
+# - Log files: /var/log/io.exo.networksetup.*
+# - Network location: "exo"
+# - Launch at login registration (not removed by this script; see the manual step below)
+#
+
+set -euo pipefail
+
+LABEL="io.exo.networksetup"
+SCRIPT_DEST="/Library/Application Support/EXO/disable_bridge_enable_dhcp.sh"
+PLIST_DEST="/Library/LaunchDaemons/io.exo.networksetup.plist"
+LOG_OUT="/var/log/${LABEL}.log"
+LOG_ERR="/var/log/${LABEL}.err.log"
+APP_BUNDLE_ID="io.exo.EXO"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+NC='\033[0m' # No Color
+
+echo_info() {
+    echo -e "${GREEN}[INFO]${NC} $1"
+}
+
+echo_warn() {
+    echo -e "${YELLOW}[WARN]${NC} $1"
+}
+
+echo_error() {
+    echo -e "${RED}[ERROR]${NC} $1"
+}
+
+# Check if running as root
+if [[ $EUID -ne 0 ]]; then
+    echo_error "This script must be run as root (use sudo)"
+    exit 1
+fi
+
+echo ""
+echo "========================================"
+echo "        EXO Uninstaller"
+echo "========================================"
+echo ""
+
+# Unload the LaunchDaemon if running
+echo_info "Stopping network setup daemon..."
+if launchctl list | grep -q "$LABEL"; then
+    launchctl bootout system/"$LABEL" 2>/dev/null || true
+    echo_info "Daemon stopped"
+else
+    echo_warn "Daemon was not running"
+fi
+
+# Remove LaunchDaemon plist
+if [[ -f "$PLIST_DEST" ]]; then
+    rm -f "$PLIST_DEST"
+    echo_info "Removed LaunchDaemon plist"
+else
+    echo_warn "LaunchDaemon plist not found (already removed?)"
+fi
+
+# Remove the network setup script
+if [[ -f "$SCRIPT_DEST" ]]; then
+    rm -f "$SCRIPT_DEST"
+    echo_info "Removed network setup script"
+else
+    echo_warn "Network setup script not found (already removed?)"
+fi
+
+# Remove EXO directory if empty
+if [[ -d "/Library/Application Support/EXO" ]]; then
+    rmdir "/Library/Application Support/EXO" 2>/dev/null && \
+        echo_info "Removed EXO support directory" || \
+        echo_warn "EXO support directory not empty, leaving in place"
+fi
+
+# Remove log files
+if [[ -f "$LOG_OUT" ]] || [[ -f "$LOG_ERR" ]]; then
+    rm -f "$LOG_OUT" "$LOG_ERR"
+    echo_info "Removed log files"
+else
+    echo_warn "Log files not found (already removed?)"
+fi
+
+# Switch back to Automatic network location
+echo_info "Restoring network configuration..."
+if networksetup -listlocations | grep -q "^Automatic$"; then
+    networksetup -switchtolocation Automatic 2>/dev/null || true
+    echo_info "Switched to Automatic network location"
+else
+    echo_warn "Automatic network location not found"
+fi
+
+# Delete the exo network location if it exists
+if networksetup -listlocations | grep -q "^exo$"; then
+    networksetup -deletelocation exo 2>/dev/null || true
+    echo_info "Deleted 'exo' network location"
+else
+    echo_warn "'exo' network location not found (already removed?)"
+fi
+
+# Re-enable Thunderbolt Bridge if it exists
+if networksetup -listnetworkservices 2>/dev/null | grep -q "Thunderbolt Bridge"; then
+    networksetup -setnetworkserviceenabled "Thunderbolt Bridge" on 2>/dev/null || true
+    echo_info "Re-enabled Thunderbolt Bridge"
+fi
+
+# Note about launch at login registration
+# SMAppService-based login items cannot be removed from a shell script.
+# They can only be unregistered from within the app itself or manually via System Settings.
+echo_warn "Launch at login must be removed manually:"
+echo_warn "  System Settings → General → Login Items → Remove EXO"
+
+# Check if EXO.app exists in common locations
+APP_FOUND=false
+for app_path in "/Applications/EXO.app" "$HOME/Applications/EXO.app"; do
+    if [[ -d "$app_path" ]]; then
+        if [[ "$APP_FOUND" == false ]]; then
+            echo ""
+            APP_FOUND=true
+        fi
+        echo_warn "EXO.app found at: $app_path"
+        echo_warn "You may want to move it to Trash manually."
+    fi
+done
+
+echo ""
+echo "========================================"
+echo_info "EXO uninstall complete!"
+echo "========================================"
+echo ""
+echo "The following have been removed:"
+echo "  • Network setup LaunchDaemon"
+echo "  • Network configuration script"
+echo "  • Log files"
+echo "  • 'exo' network location"
+echo ""
+echo "Your network has been restored to use the 'Automatic' location."
+echo "Thunderbolt Bridge has been re-enabled (if present)."
+echo ""
+echo "Manual step required:"
+echo "  Remove EXO from Login Items in System Settings → General → Login Items"
+echo ""
+
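
Review note on verifying the uninstaller: after running `uninstall-exo.sh`, the four file paths it targets should no longer exist. The Python sketch below is one way to check that, in the spirit of `hasInstalledComponents()` on the Swift side. The paths are copied from the script; the function name `remaining_components` is hypothetical.

```python
from __future__ import annotations

import os

# Paths taken from the variables at the top of uninstall-exo.sh.
EXO_PATHS = [
    "/Library/LaunchDaemons/io.exo.networksetup.plist",
    "/Library/Application Support/EXO/disable_bridge_enable_dhcp.sh",
    "/var/log/io.exo.networksetup.log",
    "/var/log/io.exo.networksetup.err.log",
]


def remaining_components(paths: list[str] | None = None) -> list[str]:
    """Return the subset of paths that still exist on disk."""
    candidates = EXO_PATHS if paths is None else paths
    return [p for p in candidates if os.path.exists(p)]


if __name__ == "__main__":
    leftovers = remaining_components()
    if leftovers:
        print("Still present:")
        for path in leftovers:
            print(f"  {path}")
    else:
        print("No EXO network setup files found.")
```

This only covers files. The `exo` network location and the login item are not checked here.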
--- a/bench/exo_bench.py
+++ b/bench/exo_bench.py
@@ -0,0 +1,526 @@
+#!/usr/bin/env python3
+# pyright: reportAny=false, reportUnknownMemberType=false, reportUnknownVariableType=false, reportUnknownArgumentType=false
+from __future__ import annotations
+
+import argparse
+import http.client
+import json
+import os
+import time
+from collections.abc import Callable
+from statistics import mean
+from typing import Any
+from urllib.parse import urlencode
+
+from loguru import logger
+from transformers import AutoTokenizer
+
+from exo.shared.models.model_cards import MODEL_CARDS
+from exo.shared.types.memory import Memory
+
+
+class ExoHttpError(RuntimeError):
+    def __init__(self, status: int, reason: str, body_preview: str):
+        super().__init__(f"HTTP {status} {reason}: {body_preview}")
+        self.status = status
+
+
+class ExoClient:
+    def __init__(self, host: str, port: int, timeout_s: float = 2400.0):
+        self.host = host
+        self.port = port
+        self.timeout_s = timeout_s
+
+    def request_json(
+        self,
+        method: str,
+        path: str,
+        params: dict[str, Any] | None = None,
+        body: dict[str, Any] | None = None,
+        headers: dict[str, str] | None = None,
+    ) -> Any:
+        if not path.startswith("/"):
+            path = "/" + path
+        if params:
+            path = path + "?" + urlencode(params)
+
+        conn = http.client.HTTPConnection(self.host, self.port, timeout=self.timeout_s)
+        try:
+            payload: bytes | None = None
+            hdrs: dict[str, str] = {"Accept": "application/json"}
+
+            if body is not None:
+                payload = json.dumps(body).encode("utf-8")
+                hdrs["Content-Type"] = "application/json"
+            if headers:
+                hdrs.update(headers)
+
+            conn.request(method.upper(), path, body=payload, headers=hdrs)
+            resp = conn.getresponse()
+            raw = resp.read()
+            text = raw.decode("utf-8", errors="replace") if raw else ""
+
+            if resp.status >= 400:
+                raise ExoHttpError(resp.status, resp.reason, text[:300])
+
+            if not text:
+                return None
+            return json.loads(text)
+        finally:
+            conn.close()
+
+    def post_bench_chat_completions(self, payload: dict[str, Any]) -> dict[str, Any]:
+        return self.request_json("POST", "/bench/chat/completions", body=payload)
+
+
+def unwrap_instance(instance: dict[str, Any]) -> dict[str, Any]:
+    if len(instance) != 1:
+        raise KeyError(f"Expected 1 key, got keys={list(instance.keys())}")
+
+    tag = next(iter(instance))
+    inner = instance[tag]
+    if not isinstance(inner, dict):
+        raise TypeError(f"payload for {tag} must be dict, got {type(inner)}")
+    return inner
+
+
+def instance_id_from_instance(instance: dict[str, Any]) -> str:
+    inner = unwrap_instance(instance)
+    return str(inner["instanceId"])
+
+
+def nodes_used_in_instance(instance: dict[str, Any]) -> int:
+    inner = unwrap_instance(instance)
+    return len(inner["shardAssignments"]["nodeToRunner"])
+
+
+def runner_ids_from_instance(instance: dict[str, Any]) -> list[str]:
+    inner = unwrap_instance(instance)
+    runner_to_shard = inner["shardAssignments"]["runnerToShard"]
+    return list(runner_to_shard.keys())
+
+
+def runner_ready(runner: dict[str, Any]) -> bool:
+    return "RunnerReady" in runner
+
+
+def wait_for_instance_ready(
+    client: ExoClient, instance_id: str, timeout: float = 24000.0
+) -> None:
+    start_time = time.time()
+    while time.time() - start_time < timeout:
+        state = client.request_json("GET", "/state")
+        instances = state.get("instances", {})
+
+        if instance_id not in instances:
+            time.sleep(0.1)
+            continue
+
+        instance = instances[instance_id]
+        runner_ids = runner_ids_from_instance(instance)
+        runners = state.get("runners", {})
+
+        if all(runner_ready(runners.get(rid, {})) for rid in runner_ids):
+            return
+
+        time.sleep(0.1)
+
+    raise TimeoutError(f"Instance {instance_id} did not become ready within {timeout=}")
+
+
+def wait_for_instance_gone(
+    client: ExoClient, instance_id: str, timeout: float = 3.0
+) -> None:
+    start_time = time.time()
+    while time.time() - start_time < timeout:
+        try:
+            client.request_json("GET", f"/instance/{instance_id}")
+        except ExoHttpError as e:
+            if e.status == 404:
+                return
+        time.sleep(0.4)
+
+    raise TimeoutError(f"Instance {instance_id} did not get deleted within {timeout=}")
+
+
+def format_peak_memory(b: float) -> str:
+    for unit in ["B", "KB", "MB", "GB", "TB"]:
+        if b < 1024.0:
+            return f"{b:.2f}{unit}"
+        b /= 1024.0
+    raise ValueError("You're using petabytes of memory. Something went wrong...")
+
+
+def parse_int_list(values: list[str]) -> list[int]:
+    items: list[int] = []
+    for v in values:
+        for part in v.split(","):
+            part = part.strip()
+            if part:
+                items.append(int(part))
+
+    seen: set[int] = set()
+    out: list[int] = []
+    for x in items:
+        if x not in seen:
+            out.append(x)
+            seen.add(x)
+    return out
+
+
+def resolve_model_short_id(client: ExoClient, model_arg: str) -> tuple[str, str]:
+    models = client.request_json("GET", "/models") or {}
+    data = models.get("data") or []
+
+    for m in data:
+        if m.get("id") == model_arg:
+            short_id = str(m["id"])
+            full_id = str(m.get("hugging_face_id") or m["id"])
+            return short_id, full_id
+
+    for m in data:
+        if m.get("hugging_face_id") == model_arg:
+            short_id = str(m["id"])
+            full_id = str(m["hugging_face_id"])
+            return short_id, full_id
+
+    raise ValueError(f"Model not found in /models: {model_arg}")
+
+
+def placement_filter(instance_meta: str, wanted: str) -> bool:
+    s = (instance_meta or "").lower()
+    if wanted == "both":
+        return ("ring" in s) or ("jaccl" in s)
+    return wanted in s
+
+
+def sharding_filter(sharding: str, wanted: str) -> bool:
+    s = (sharding or "").lower()
+    if wanted == "both":
+        return ("pipeline" in s) or ("tensor" in s)
+    return wanted in s
+
+
+def run_one_completion(
+    client: ExoClient, model_id: str, pp_hint: int, tg: int, prompt_sizer: PromptSizer
+) -> tuple[dict[str, Any], int]:
+    content, pp_tokens = prompt_sizer.build(pp_hint)
+    payload: dict[str, Any] = {
+        "model": model_id,
+        "messages": [{"role": "user", "content": content}],
+        "stream": False,
+        "max_tokens": tg,
+    }
+
+    t0 = time.perf_counter()
+    out = client.post_bench_chat_completions(payload)
+    elapsed = time.perf_counter() - t0
+
+    stats = out.get("generation_stats")
+
+    preview = (((out.get("choices") or [{}])[0].get("message") or {}).get("content") or "")[:200]
+
+    return {
+        "elapsed_s": elapsed,
+        "output_text_preview": preview,
+        "stats": stats,
+    }, pp_tokens
+
+
+class PromptSizer:
+    def __init__(self, tokenizer: Any, atom: str = "a "):
+        self.tokenizer = tokenizer
+        self.atom = atom
+        self.count_fn = PromptSizer._make_counter(tokenizer)
+        self.base_tokens = self.count_fn("")
+
+    @staticmethod
+    def _make_counter(tokenizer: Any) -> Callable[[str], int]:
+        def count_fn(user_content: str) -> int:
+            messages = [{"role": "user", "content": user_content}]
+            ids = tokenizer.apply_chat_template(
+                messages, tokenize=True, add_generation_prompt=True
+            )
+            return int(len(ids))
+
+        return count_fn
+
+    def build(self, target_prompt_tokens: int) -> tuple[str, int]:
+        target = int(target_prompt_tokens)
+        if target < self.base_tokens:
+            raise RuntimeError(
+                f"Target ({target}) is smaller than template overhead ({self.base_tokens})."
+            )
+
+        content = ""
+        tok = self.count_fn(content)
+
+        while tok < target:
+            content += self.atom
+            tok = self.count_fn(content)
+
+        if tok != target:
+            raise RuntimeError(
+                f"Overshot: got {tok} tokens (target {target}). "
+                f"Pick a different atom (try ' a' or '\\n' or '0 ')."
+            )
+
+        return content, tok
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(
+        prog="exo-bench",
+        description="Benchmark exo model throughput across placement previews.",
+    )
+    ap.add_argument("--host", default=os.environ.get("EXO_HOST", "localhost"))
+    ap.add_argument(
+        "--port", type=int, default=int(os.environ.get("EXO_PORT", "52415"))
+    )
+    ap.add_argument("--model", required=True, help="Model short id or Hugging Face id")
+    ap.add_argument(
+        "--pp",
+        nargs="+",
+        required=True,
+        help="Prompt-size hints (ints). Accepts commas.",
+    )
+    ap.add_argument(
+        "--tg",
+        nargs="+",
+        required=True,
+        help="Generation lengths (ints). Accepts commas.",
+    )
+    ap.add_argument(
+        "--max-nodes",
+        type=int,
+        default=4,
+        help="Only consider placements using <= this many nodes.",
+    )
+    ap.add_argument(
+        "--instance-meta", choices=["ring", "jaccl", "both"], default="both"
+    )
+    ap.add_argument(
+        "--sharding", choices=["pipeline", "tensor", "both"], default="both"
+    )
+    ap.add_argument(
+        "--skip-pipeline-jaccl",
+        action="store_true",
+        help="Skip pipeline/jaccl placements, which are often pointless (off unless this flag is set).",
+    )
+    ap.add_argument(
+        "--repeat", type=int, default=1, help="Repetitions per (pp,tg) pair."
+    )
+    ap.add_argument(
+        "--warmup",
+        type=int,
+        default=0,
+        help="Warmup runs per placement (uses first pp/tg).",
+    )
+    ap.add_argument(
+        "--timeout", type=float, default=2400.0, help="HTTP timeout (seconds)."
+    )
+    ap.add_argument(
+        "--json-out",
+        default="bench/results.json",
+        help="Write raw per-run results JSON to this path.",
+    )
+    ap.add_argument(
+        "--dry-run", action="store_true", help="List selected placements and exit."
+    )
+    args = ap.parse_args()
+
+    pp_list = parse_int_list(args.pp)
+    tg_list = parse_int_list(args.tg)
+    if not pp_list or not tg_list:
+        logger.error("pp and tg lists must be non-empty")
+        return 2
+    if args.repeat <= 0:
+        logger.error("--repeat must be >= 1")
+        return 2
+
+    client = ExoClient(args.host, args.port, timeout_s=args.timeout)
+    short_id, full_model_id = resolve_model_short_id(client, args.model)
+
+    previews_resp = client.request_json(
+        "GET", "/instance/previews", params={"model_id": short_id}
+    )
+    previews = previews_resp.get("previews") or []
+
+    tokenizer = AutoTokenizer.from_pretrained(
+        full_model_id,
+        trust_remote_code=True,
+    )
+    if tokenizer is None:
+        raise RuntimeError("[exo-bench] tokenizer load failed")
+
+    try:
+        prompt_sizer = PromptSizer(tokenizer)
+        logger.debug(f"[exo-bench] loaded tokenizer: {full_model_id} for prompt sizer")
+    except Exception:
+        logger.error("[exo-bench] tokenizer usable but prompt sizing failed")
+        raise
+
+    selected: list[dict[str, Any]] = []
+    for p in previews:
+        if p.get("error") is not None:
+            continue
+        if not placement_filter(str(p.get("instance_meta", "")), args.instance_meta):
+            continue
+        if not sharding_filter(str(p.get("sharding", "")), args.sharding):
+            continue
+
+        instance = p.get("instance")
+        if not isinstance(instance, dict):
+            continue
+
+        n = nodes_used_in_instance(instance)
+        # On a single node, tensor sharding and jaccl add nothing over pipeline ring, so skip them when "both" is selected.
+        if n == 1 and (
+            (args.sharding == "both" and "tensor" in p.get("sharding", "").lower())
+            or (
+                args.instance_meta == "both"
+                and "jaccl" in p.get("instance_meta", "").lower()
+            )
+        ):
+            continue
+
+        if (
+            args.skip_pipeline_jaccl
+            and (
+                args.instance_meta == "both"
+                and "jaccl" in p.get("instance_meta", "").lower()
+            )
+            and (
+                args.sharding == "both" and "pipeline" in p.get("sharding", "").lower()
+            )
+        ):
+            continue
+
+        if 0 < n <= args.max_nodes:
+            selected.append(p)
+
+    if not selected:
+        logger.error("No valid placements matched your filters.")
+        return 1
+
+    selected.sort(
+        key=lambda p: (
+            str(p.get("instance_meta", "")),
+            str(p.get("sharding", "")),
+            -nodes_used_in_instance(p["instance"]),
+        ),
+        reverse=True,
+    )
+
+    logger.debug(f"exo-bench model: short_id={short_id} full_id={full_model_id}")
+    logger.info(f"placements: {len(selected)}")
+    for p in selected:
+        logger.info(
+            f"  - {p['sharding']} / {p['instance_meta']} / nodes={nodes_used_in_instance(p['instance'])}"
+        )
+
+    if args.dry_run:
+        return 0
+
+    all_rows: list[dict[str, Any]] = []
+
+    for preview in selected:
+        instance = preview["instance"]
+        instance_id = instance_id_from_instance(instance)
+
+        sharding = str(preview["sharding"])
+        instance_meta = str(preview["instance_meta"])
+        n_nodes = nodes_used_in_instance(instance)
+
+        logger.info("=" * 80)
+        logger.info(
+            f"PLACEMENT: {sharding} / {instance_meta} / nodes={n_nodes} / instance_id={instance_id}"
+        )
+
+        client.request_json("POST", "/instance", body={"instance": instance})
+        wait_for_instance_ready(client, instance_id)
+
+        time.sleep(1)
+
+        try:
+            for i in range(args.warmup):
+                run_one_completion(
+                    client, full_model_id, pp_list[0], tg_list[0], prompt_sizer
+                )
+                logger.debug(f"  warmup {i + 1}/{args.warmup} done")
+
+            for pp in pp_list:
+                if (
+                    pp * n_nodes > 2048
+                    and "ring" in instance_meta.lower()
+                    and "tensor" in sharding.lower()
+                ):
+                    model_card = MODEL_CARDS[short_id]
+                    if model_card.metadata.storage_size > Memory.from_gb(10):
+                        logger.info(
+                            f"Skipping tensor ring as this is too slow for model of size {model_card.metadata.storage_size} on {n_nodes=}"
+                        )
+                        continue
+                for tg in tg_list:
+                    runs: list[dict[str, Any]] = []
+                    for r in range(args.repeat):
+                        time.sleep(3)
+                        try:
+                            row, actual_pp_tokens = run_one_completion(
+                                client, full_model_id, pp, tg, prompt_sizer
+                            )
+                        except Exception as e:
+                            logger.error(e)
+                            continue
+                        row.update(
+                            {
+                                "model_short_id": short_id,
+                                "model_id": full_model_id,
+                                "placement_sharding": sharding,
+                                "placement_instance_meta": instance_meta,
+                                "placement_nodes": n_nodes,
+                                "instance_id": instance_id,
+                                "pp_tokens": actual_pp_tokens,
+                                "tg": tg,
+                                "repeat_index": r,
+                            }
+                        )
+                        runs.append(row)
+                        all_rows.append(row)
+
+                    if runs and all(x["stats"] for x in runs):
+                        prompt_tps = mean(x["stats"]["prompt_tps"] for x in runs)
+                        gen_tps = mean(x["stats"]["generation_tps"] for x in runs)
+                        ptok = mean(x["stats"]["prompt_tokens"] for x in runs)
+                        gtok = mean(x["stats"]["generation_tokens"] for x in runs)
+                        peak = mean(
+                            x["stats"]["peak_memory_usage"]["inBytes"] for x in runs
+                        )
+
+                        logger.info(
+                            f"prompt_tps={prompt_tps:.2f} gen_tps={gen_tps:.2f}    "
+                            f"prompt_tokens={ptok} gen_tokens={gtok}    "
+                            f"peak_memory={format_peak_memory(peak)}\n"
+                        )
+                    time.sleep(2)
+        finally:
+            try:
+                client.request_json("DELETE", f"/instance/{instance_id}")
+            except ExoHttpError as e:
+                if e.status != 404:
+                    raise
+            wait_for_instance_gone(client, instance_id)
+            logger.debug(f"Deleted instance {instance_id}")
+
+            time.sleep(5)
+
+    if args.json_out:
+        with open(args.json_out, "w", encoding="utf-8") as f:
+            json.dump(all_rows, f, indent=2, ensure_ascii=False)
+        logger.debug(f"\nWrote results JSON: {args.json_out}")
+
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
--- a/dashboard/package-lock.json
+++ b/dashboard/package-lock.json
@@ -9,6 +9,8 @@
 			"version": "1.0.0",
 			"dependencies": {
 				"highlight.js": "^11.11.1",
+				"katex": "^0.16.27",
+				"marked": "^17.0.1",
 				"mode-watcher": "^1.1.0"
 			},
 			"devDependencies": {
@@ -2249,6 +2251,31 @@
 				"jiti": "lib/jiti-cli.mjs"
 			}
 		},
+		"node_modules/katex": {
+			"version": "0.16.27",
+			"resolved": "https://registry.npmjs.org/katex/-/katex-0.16.27.tgz",
+			"integrity": "sha512-aeQoDkuRWSqQN6nSvVCEFvfXdqo1OQiCmmW1kc9xSdjutPv7BGO7pqY9sQRJpMOGrEdfDgF2TfRXe5eUAD2Waw==",
+			"funding": [
+				"https://opencollective.com/katex",
+				"https://github.com/sponsors/katex"
+			],
+			"license": "MIT",
+			"dependencies": {
+				"commander": "^8.3.0"
+			},
+			"bin": {
+				"katex": "cli.js"
+			}
+		},
+		"node_modules/katex/node_modules/commander": {
+			"version": "8.3.0",
+			"resolved": "https://registry.npmjs.org/commander/-/commander-8.3.0.tgz",
+			"integrity": "sha512-OkTL9umf+He2DZkUq8f8J9of7yL6RJKI24dVITBmNfZBmri9zYZQrKkuXiKhyfPSu8tUhnVBB1iKXevvnlR4Ww==",
+			"license": "MIT",
+			"engines": {
+				"node": ">= 12"
+			}
+		},
 		"node_modules/kleur": {
 			"version": "4.1.5",
 			"resolved": "https://registry.npmjs.org/kleur/-/kleur-4.1.5.tgz",
@@ -2535,6 +2562,18 @@
 				"@jridgewell/sourcemap-codec": "^1.5.5"
 			}
 		},
+		"node_modules/marked": {
+			"version": "17.0.1",
+			"resolved": "https://registry.npmjs.org/marked/-/marked-17.0.1.tgz",
+			"integrity": "sha512-boeBdiS0ghpWcSwoNm/jJBwdpFaMnZWRzjA6SkUMYb40SVaN1x7mmfGKp0jvexGcx+7y2La5zRZsYFZI6Qpypg==",
+			"license": "MIT",
+			"bin": {
+				"marked": "bin/marked.js"
+			},
+			"engines": {
+				"node": ">= 20"
+			}
+		},
 		"node_modules/mode-watcher": {
 			"version": "1.1.0",
 			"resolved": "https://registry.npmjs.org/mode-watcher/-/mode-watcher-1.1.0.tgz",
--- a/dashboard/package.json
+++ b/dashboard/package.json
@@ -27,7 +27,8 @@
 	},
 	"dependencies": {
 		"highlight.js": "^11.11.1",
+		"katex": "^0.16.27",
+		"marked": "^17.0.1",
 		"mode-watcher": "^1.1.0"
 	}
 }
-
--- a/dashboard/src/app.d.ts
+++ b/dashboard/src/app.d.ts
@@ -11,4 +11,3 @@ declare global {
 }

 export {};
-
--- a/dashboard/src/lib/components/ChatForm.svelte
+++ b/dashboard/src/lib/components/ChatForm.svelte
@@ -139,6 +139,11 @@
 	}

 	function handleKeydown(event: KeyboardEvent) {
+		// Prevent form submission during IME composition (e.g., Chinese, Japanese, Korean input)
+		if (event.isComposing || event.keyCode === 229) {
+			return;
+		}
+		
 		if (event.key === 'Enter' && !event.shiftKey) {
 			event.preventDefault();
 			handleSubmit();
--- a/dashboard/src/lib/components/ChatMessages.svelte
+++ b/dashboard/src/lib/components/ChatMessages.svelte
@@ -8,89 +8,80 @@
 		regenerateLastResponse
 	} from '$lib/stores/app.svelte';
 	import type { MessageAttachment } from '$lib/stores/app.svelte';
-import { tick, onDestroy } from 'svelte';
+	import MarkdownContent from './MarkdownContent.svelte';

-interface Props {
-	class?: string;
-	scrollParent?: HTMLElement | null;
-}
+	interface Props {
+		class?: string;
+		scrollParent?: HTMLElement | null;
+	}

-let { class: className = '', scrollParent = null }: Props = $props();
+	let { class: className = '', scrollParent = null }: Props = $props();

 	const messageList = $derived(messages());
 	const response = $derived(currentResponse());
 	const loading = $derived(isLoading());

-// Ref for scroll anchor at bottom
-let scrollAnchorRef: HTMLDivElement | undefined = $state();
+	// Scroll management - user controls scroll, show button when not at bottom
+	const SCROLL_THRESHOLD = 100;
+	let showScrollButton = $state(false);
+	let lastMessageCount = 0;
+	let containerRef: HTMLDivElement | undefined = $state();

-// Scroll management
-const SCROLL_BOTTOM_THRESHOLD = 120;
-let autoScrollEnabled = true;
-let currentScrollEl: HTMLElement | null = null;
-
-function resolveScrollElement(): HTMLElement | null {
-	if (scrollParent) return scrollParent;
-	let node: HTMLElement | null = scrollAnchorRef?.parentElement as HTMLElement | null;
-	while (node) {
-		const isScrollable = node.scrollHeight > node.clientHeight + 1;
-		if (isScrollable) return node;
-		node = node.parentElement;
+	function getScrollContainer(): HTMLElement | null {
+		if (scrollParent) return scrollParent;
+		return containerRef?.parentElement ?? null;
 	}
-	return null;
-}

-function handleScroll() {
-	if (!currentScrollEl) return;
-	const distanceFromBottom = currentScrollEl.scrollHeight - currentScrollEl.scrollTop - currentScrollEl.clientHeight;
-	const isNearBottom = distanceFromBottom < SCROLL_BOTTOM_THRESHOLD;
-	autoScrollEnabled = isNearBottom;
-}
-
-function attachScrollListener() {
-	const nextEl = resolveScrollElement();
-	if (currentScrollEl === nextEl) return;
-	if (currentScrollEl) {
-		currentScrollEl.removeEventListener('scroll', handleScroll);
+	function isNearBottom(el: HTMLElement): boolean {
+		return el.scrollHeight - el.scrollTop - el.clientHeight < SCROLL_THRESHOLD;
 	}
-	currentScrollEl = nextEl;
-	if (currentScrollEl) {
-		currentScrollEl.addEventListener('scroll', handleScroll);
-		// Initialize state based on current position
-		handleScroll();
-	}
-}

-onDestroy(() => {
-	if (currentScrollEl) {
-		currentScrollEl.removeEventListener('scroll', handleScroll);
-	}
-});
-
-$effect(() => {
-	// Re-evaluate scroll container if prop changes or after mount
-	scrollParent;
-	attachScrollListener();
-});
-
-// Auto-scroll to bottom when messages change or response updates, but only if user is near bottom
-$effect(() => {
-	// Track these values to trigger effect
-	const _ = messageList.length;
-	const __ = response;
-	const ___ = loading;
-	
-	tick().then(() => {
-		const el = currentScrollEl ?? resolveScrollElement();
-		if (!el || !scrollAnchorRef) return;
-		const distanceFromBottom = el.scrollHeight - el.scrollTop - el.clientHeight;
-		const isNearBottom = distanceFromBottom < SCROLL_BOTTOM_THRESHOLD;
-		if (autoScrollEnabled || isNearBottom) {
-			scrollAnchorRef.scrollIntoView({ behavior: 'smooth', block: 'end' });
-			autoScrollEnabled = true;
+	function scrollToBottom() {
+		const el = getScrollContainer();
+		if (el) {
+			el.scrollTo({ top: el.scrollHeight, behavior: 'smooth' });
 		}
+	}
+
+	function updateScrollButtonVisibility() {
+		const el = getScrollContainer();
+		if (!el) return;
+		showScrollButton = !isNearBottom(el);
+	}
+
+	// Attach scroll listener
+	$effect(() => {
+		const el = scrollParent ?? containerRef?.parentElement;
+		if (!el) return;
+		
+		el.addEventListener('scroll', updateScrollButtonVisibility, { passive: true });
+		// Initial check
+		updateScrollButtonVisibility();
+		return () => el.removeEventListener('scroll', updateScrollButtonVisibility);
+	});
+
+	// Auto-scroll whenever the message count grows (e.g. the user sends a new message)
+	$effect(() => {
+		const count = messageList.length;
+		if (count > lastMessageCount) {
+			const el = getScrollContainer();
+			if (el) {
+				requestAnimationFrame(() => {
+					el.scrollTo({ top: el.scrollHeight, behavior: 'smooth' });
+				});
+			}
+		}
+		lastMessageCount = count;
+	});
+
+	// Update scroll button visibility when content changes
+	$effect(() => {
+		// Track response to trigger re-check during streaming
+		const _ = response;
+		
+		// Small delay to let DOM update
+		requestAnimationFrame(() => updateScrollButtonVisibility());
 	});
-});

 	// Edit state
 	let editingMessageId = $state<string | null>(null);
@@ -231,7 +222,7 @@ function isThinkingExpanded(messageId: string): boolean {
 <div class="flex flex-col gap-4 sm:gap-6 {className}">
 	{#each messageList as message (message.id)}
 		<div class="group flex {message.role === 'user' ? 'justify-end' : 'justify-start'}">
-			<div class="{message.role === 'user' ? 'max-w-[85%] sm:max-w-[70%] flex flex-col items-end' : 'max-w-[95%] sm:max-w-[85%]'}">
+			<div class="{message.role === 'user' ? 'max-w-[85%] sm:max-w-[70%] flex flex-col items-end' : 'w-full max-w-[98%] sm:max-w-[95%]'}">
 				{#if message.role === 'assistant'}
 					<!-- Assistant message header -->
 					<div class="flex items-center gap-1.5 sm:gap-2 mb-1.5 sm:mb-2">
@@ -305,7 +296,7 @@ function isThinkingExpanded(messageId: string): boolean {
 				{:else}
 					<div class="{message.role === 'user' 
 						? 'command-panel rounded-lg rounded-tr-sm inline-block' 
-						: 'command-panel rounded-lg rounded-tl-sm border-l-2 border-l-exo-yellow/50 inline-block'}">
+						: 'command-panel rounded-lg rounded-tl-sm border-l-2 border-l-exo-yellow/50 block w-full'}">
 						
 						{#if message.role === 'user'}
 							<!-- User message styling -->
@@ -331,7 +322,7 @@ function isThinkingExpanded(messageId: string): boolean {
 								{/if}
 								
 								{#if message.content}
-									<div class="text-sm text-foreground font-mono tracking-wide whitespace-pre-wrap break-words leading-relaxed">
+									<div class="text-xs text-foreground font-mono tracking-wide whitespace-pre-wrap break-words leading-relaxed">
 										{message.content}
 									</div>
 								{/if}
@@ -360,7 +351,7 @@ function isThinkingExpanded(messageId: string): boolean {
 												</svg>
 												<span>Thinking...</span>
 											</span>
-											<span class="text-[10px] tracking-[0.2em] text-exo-light-gray/60">
+											<span class="text-[10px] tracking-[0.2em] text-exo-light-gray/60 ml-4">
 												{isThinkingExpanded(message.id) ? 'HIDE' : 'SHOW'}
 											</span>
 										</button>
@@ -374,8 +365,8 @@ function isThinkingExpanded(messageId: string): boolean {
 										{/if}
 									</div>
 								{/if}
-								<div class="text-sm text-foreground font-mono tracking-wide whitespace-pre-wrap break-words leading-relaxed">
-									{message.content || (loading ? response : '')}
+								<div class="text-xs text-foreground">
+									<MarkdownContent content={message.content || (loading ? response : '')} />
 									{#if loading && !message.content}
 										<span class="inline-block w-2 h-4 bg-exo-yellow/70 ml-1 cursor-blink"></span>
 									{/if}
@@ -457,6 +448,20 @@ function isThinkingExpanded(messageId: string): boolean {
 		</div>
 	{/if}
 	
-	<!-- Scroll anchor for auto-scroll -->
-	<div bind:this={scrollAnchorRef}></div>
+	<!-- Invisible element for container reference -->
+	<div bind:this={containerRef}></div>
+
+	<!-- Scroll to bottom button -->
+	{#if showScrollButton}
+		<button
+			type="button"
+			onclick={scrollToBottom}
+			class="sticky bottom-4 left-1/2 -translate-x-1/2 w-10 h-10 rounded-full bg-exo-dark-gray/90 border border-exo-medium-gray/50 flex items-center justify-center text-exo-light-gray hover:text-exo-yellow hover:border-exo-yellow/50 transition-all shadow-lg cursor-pointer z-10"
+			title="Scroll to bottom"
+		>
+			<svg class="w-5 h-5" fill="none" viewBox="0 0 24 24" stroke="currentColor">
+				<path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 14l-7 7m0 0l-7-7m7 7V3" />
+			</svg>
+		</button>
+	{/if}
 </div>
--- a/dashboard/src/lib/components/ChatSidebar.svelte
+++ b/dashboard/src/lib/components/ChatSidebar.svelte
@@ -10,7 +10,9 @@ import {
 		clearChat,
 		instances,
 		debugMode,
-		toggleDebugMode
+		toggleDebugMode,
+		topologyOnlyMode,
+		toggleTopologyOnlyMode
 	} from '$lib/stores/app.svelte';

 	interface Props {
@@ -23,6 +25,7 @@ import {
 	const activeId = $derived(activeConversationId());
 const instanceData = $derived(instances());
 const debugEnabled = $derived(debugMode());
+const topologyOnlyEnabled = $derived(topologyOnlyMode());

 	let searchQuery = $state('');
 	let editingId = $state<string | null>(null);
@@ -424,6 +427,19 @@ const debugEnabled = $derived(debugMode());
 		<div class="text-xs text-white/60 font-mono tracking-wider text-center">
 			{conversationList.length} CONVERSATION{conversationList.length !== 1 ? 'S' : ''}
 		</div>
+		<button
+			type="button"
+			onclick={toggleTopologyOnlyMode}
+			class="p-1.5 rounded border border-exo-medium-gray/40 hover:border-exo-yellow/50 transition-colors cursor-pointer"
+			title="Toggle topology only mode"
+		>
+			<svg class="w-4 h-4 {topologyOnlyEnabled ? 'text-exo-yellow' : 'text-exo-medium-gray'}" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+				<circle cx="12" cy="5" r="2" fill="currentColor" />
+				<circle cx="5" cy="19" r="2" fill="currentColor" />
+				<circle cx="19" cy="19" r="2" fill="currentColor" />
+				<path stroke-linecap="round" d="M12 7v5m0 0l-5 5m5-5l5 5" />
+			</svg>
+		</button>
 	</div>
 	</div>
 </aside>
--- a/dashboard/src/lib/components/HeaderNav.svelte
+++ b/dashboard/src/lib/components/HeaderNav.svelte
@@ -3,6 +3,9 @@

 	export let showHome = true;
 	export let onHome: (() => void) | null = null;
+	export let showSidebarToggle = false;
+	export let sidebarVisible = true;
+	export let onToggleSidebar: (() => void) | null = null;

 	function handleHome(): void {
 		if (onHome) {
@@ -14,13 +17,38 @@
 			window.location.hash = '/';
 		}
 	}
+
+	function handleToggleSidebar(): void {
+		if (onToggleSidebar) {
+			onToggleSidebar();
+		}
+	}
 </script>

 <header class="relative z-20 flex items-center justify-center px-6 pt-8 pb-4 bg-exo-dark-gray">
+	<!-- Left: Sidebar Toggle -->
+	{#if showSidebarToggle}
+	<div class="absolute left-6 top-1/2 -translate-y-1/2">
+		<button
+			onclick={handleToggleSidebar}
+			class="p-2 rounded border border-exo-medium-gray/40 hover:border-exo-yellow/50 transition-colors cursor-pointer"
+			title={sidebarVisible ? 'Hide sidebar' : 'Show sidebar'}
+		>
+			<svg class="w-5 h-5 {sidebarVisible ? 'text-exo-yellow' : 'text-exo-medium-gray'}" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+				{#if sidebarVisible}
+					<path stroke-linecap="round" stroke-linejoin="round" d="M11 19l-7-7 7-7m8 14l-7-7 7-7" />
+				{:else}
+					<path stroke-linecap="round" stroke-linejoin="round" d="M13 5l7 7-7 7M5 5l7 7-7 7" />
+				{/if}
+			</svg>
+		</button>
+	</div>
+	{/if}
+
 	<!-- Center: Logo (clickable to go home) -->
 	<button
 		onclick={handleHome}
-		class="hover:opacity-80 transition-opacity {showHome ? 'cursor-pointer' : 'cursor-default'}"
+		class="bg-transparent border-none outline-none focus:outline-none transition-opacity duration-200 hover:opacity-90 {showHome ? 'cursor-pointer' : 'cursor-default'}"
 		title={showHome ? 'Go to home' : ''}
 		disabled={!showHome}
 	>
--- a/dashboard/src/lib/components/MarkdownContent.svelte
+++ b/dashboard/src/lib/components/MarkdownContent.svelte
@@ -0,0 +1,451 @@
+<script lang="ts">
+	import { marked } from 'marked';
+	import hljs from 'highlight.js';
+	import katex from 'katex';
+	import 'katex/dist/katex.min.css';
+	import { browser } from '$app/environment';
+
+	interface Props {
+		content: string;
+		class?: string;
+	}
+
+	let { content, class: className = '' }: Props = $props();
+
+	let containerRef = $state<HTMLDivElement>();
+	let processedHtml = $state('');
+
+	// Configure marked: GitHub-flavored markdown, with single newlines rendered as <br>
+	marked.setOptions({
+		gfm: true,
+		breaks: true
+	});
+
+	// Custom renderer for code blocks
+	const renderer = new marked.Renderer();
+
+	renderer.code = function ({ text, lang }: { text: string; lang?: string }) {
+		const language = lang && hljs.getLanguage(lang) ? lang : 'plaintext';
+		const highlighted = hljs.highlight(text, { language }).value;
+		const codeId = `code-${Date.now()}-${Math.random().toString(36).slice(2, 9)}`;
+
+		return `
+			<div class="code-block-wrapper">
+				<div class="code-block-header">
+					<span class="code-language">${language}</span>
+					<button type="button" class="copy-code-btn" data-code="${encodeURIComponent(text)}" title="Copy code">
+						<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+							<rect width="14" height="14" x="8" y="8" rx="2" ry="2"/>
+							<path d="M4 16c-1.1 0-2-.9-2-2V4c0-1.1.9-2 2-2h10c1.1 0 2 .9 2 2"/>
+						</svg>
+					</button>
+				</div>
+				<pre><code class="hljs language-${language}" data-code-id="${codeId}">${highlighted}</code></pre>
+			</div>
+		`;
+	};
+
+	// Inline code
+	renderer.codespan = function ({ text }: { text: string }) {
+		return `<code class="inline-code">${text}</code>`;
+	};
+
+	marked.use({ renderer });
+
+	/**
+	 * Preprocess LaTeX: convert \(...\) to $...$ and \[...\] to $$...$$
+	 * Also protect code blocks from LaTeX processing
+	 */
+	function preprocessLaTeX(text: string): string {
+		// Protect code blocks
+		const codeBlocks: string[] = [];
+		let processed = text.replace(/```[\s\S]*?```|`[^`]+`/g, (match) => {
+			codeBlocks.push(match);
+			return `<<CODE_${codeBlocks.length - 1}>>`;
+		});
+
+		// Convert \(...\) to $...$
+		processed = processed.replace(/\\\((.+?)\\\)/g, '$$$1$');
+		
+		// Convert \[...\] to $$...$$
+		processed = processed.replace(/\\\[([\s\S]*?)\\\]/g, '$$$$$1$$$$');
+
+		// Restore code blocks
+		processed = processed.replace(/<<CODE_(\d+)>>/g, (_, index) => codeBlocks[parseInt(index)]);
+
+		return processed;
+	}
+
+	/**
+	 * Render math expressions with KaTeX after HTML is generated
+	 */
+	function renderMath(html: string): string {
+		// Render display math ($$...$$)
+		html = html.replace(/\$\$([\s\S]*?)\$\$/g, (_, math) => {
+			try {
+				return katex.renderToString(math.trim(), {
+					displayMode: true,
+					throwOnError: false,
+					output: 'html'
+				});
+			} catch {
+				return `<span class="math-error">$$${math}$$</span>`;
+			}
+		});
+
+		// Render inline math ($...$) but avoid matching currency like $5
+		html = html.replace(/\$([^\$\n]+?)\$/g, (match, math) => {
+			// Skip if it looks like currency ($ followed by number)
+			if (/^\d/.test(math.trim())) {
+				return match;
+			}
+			try {
+				return katex.renderToString(math.trim(), {
+					displayMode: false,
+					throwOnError: false,
+					output: 'html'
+				});
+			} catch {
+				return `<span class="math-error">$${math}$</span>`;
+			}
+		});
+
+		return html;
+	}
+
+	function processMarkdown(text: string): string {
+		try {
+			// Preprocess LaTeX notation
+			const preprocessed = preprocessLaTeX(text);
+			// Parse markdown
+			let html = marked.parse(preprocessed) as string;
+			// Render math expressions
+			html = renderMath(html);
+			return html;
+		} catch (error) {
+			console.error('Markdown processing error:', error);
+			return text.replace(/\n/g, '<br>');
+		}
+	}
+
+	async function handleCopyClick(event: Event) {
+		const target = event.currentTarget as HTMLButtonElement;
+		const encodedCode = target.getAttribute('data-code');
+		if (!encodedCode) return;
+
+		const code = decodeURIComponent(encodedCode);
+
+		try {
+			await navigator.clipboard.writeText(code);
+			// Show copied feedback
+			const originalHtml = target.innerHTML;
+			target.innerHTML = `
+				<svg width="16" height="16" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+					<path d="M20 6L9 17l-5-5"/>
+				</svg>
+			`;
+			target.classList.add('copied');
+			setTimeout(() => {
+				target.innerHTML = originalHtml;
+				target.classList.remove('copied');
+			}, 2000);
+		} catch (error) {
+			console.error('Failed to copy:', error);
+		}
+	}
+
+	function setupCopyButtons() {
+		if (!containerRef || !browser) return;
+
+		const buttons = containerRef.querySelectorAll<HTMLButtonElement>('.copy-code-btn');
+		for (const button of buttons) {
+			if (button.dataset.listenerBound !== 'true') {
+				button.dataset.listenerBound = 'true';
+				button.addEventListener('click', handleCopyClick);
+			}
+		}
+	}
+
+	$effect(() => {
+		if (content) {
+			processedHtml = processMarkdown(content);
+		} else {
+			processedHtml = '';
+		}
+	});
+
+	$effect(() => {
+		if (containerRef && processedHtml) {
+			setupCopyButtons();
+		}
+	});
+</script>
+
+<div bind:this={containerRef} class="markdown-content {className}">
+	{@html processedHtml}
+</div>
+
+<style>
+	.markdown-content {
+		line-height: 1.6;
+	}
+
+	/* Paragraphs */
+	.markdown-content :global(p) {
+		margin-bottom: 1rem;
+	}
+
+	.markdown-content :global(p:last-child) {
+		margin-bottom: 0;
+	}
+
+	/* Headers */
+	.markdown-content :global(h1) {
+		font-size: 1.5rem;
+		font-weight: 700;
+		margin: 1.5rem 0 0.75rem 0;
+		color: var(--exo-yellow, #ffd700);
+	}
+
+	.markdown-content :global(h2) {
+		font-size: 1.25rem;
+		font-weight: 600;
+		margin: 1.25rem 0 0.5rem 0;
+		color: var(--exo-yellow, #ffd700);
+	}
+
+	.markdown-content :global(h3) {
+		font-size: 1.125rem;
+		font-weight: 600;
+		margin: 1rem 0 0.5rem 0;
+	}
+
+	.markdown-content :global(h4),
+	.markdown-content :global(h5),
+	.markdown-content :global(h6) {
+		font-size: 1rem;
+		font-weight: 600;
+		margin: 0.75rem 0 0.25rem 0;
+	}
+
+	/* Bold and italic */
+	.markdown-content :global(strong) {
+		font-weight: 600;
+	}
+
+	.markdown-content :global(em) {
+		font-style: italic;
+	}
+
+	/* Inline code */
+	.markdown-content :global(.inline-code) {
+		background: rgba(255, 215, 0, 0.1);
+		color: var(--exo-yellow, #ffd700);
+		padding: 0.125rem 0.375rem;
+		border-radius: 0.25rem;
+		font-family: ui-monospace, SFMono-Regular, 'SF Mono', Monaco, Consolas, monospace;
+		font-size: 0.875em;
+	}
+
+	/* Links */
+	.markdown-content :global(a) {
+		color: var(--exo-yellow, #ffd700);
+		text-decoration: underline;
+		text-underline-offset: 2px;
+	}
+
+	.markdown-content :global(a:hover) {
+		opacity: 0.8;
+	}
+
+	/* Lists */
+	.markdown-content :global(ul) {
+		list-style-type: disc;
+		margin-left: 1.5rem;
+		margin-bottom: 1rem;
+	}
+
+	.markdown-content :global(ol) {
+		list-style-type: decimal;
+		margin-left: 1.5rem;
+		margin-bottom: 1rem;
+	}
+
+	.markdown-content :global(li) {
+		margin-bottom: 0.25rem;
+	}
+
+	.markdown-content :global(li::marker) {
+		color: var(--exo-light-gray, #9ca3af);
+	}
+
+	/* Blockquotes */
+	.markdown-content :global(blockquote) {
+		border-left: 3px solid var(--exo-yellow, #ffd700);
+		padding: 0.5rem 1rem;
+		margin: 1rem 0;
+		background: rgba(255, 215, 0, 0.05);
+		border-radius: 0 0.25rem 0.25rem 0;
+	}
+
+	/* Tables */
+	.markdown-content :global(table) {
+		width: 100%;
+		margin: 1rem 0;
+		border-collapse: collapse;
+		font-size: 0.875rem;
+	}
+
+	.markdown-content :global(th) {
+		background: rgba(255, 215, 0, 0.1);
+		border: 1px solid rgba(255, 215, 0, 0.2);
+		padding: 0.5rem;
+		text-align: left;
+		font-weight: 600;
+	}
+
+	.markdown-content :global(td) {
+		border: 1px solid rgba(255, 255, 255, 0.1);
+		padding: 0.5rem;
+	}
+
+	/* Horizontal rule */
+	.markdown-content :global(hr) {
+		border: none;
+		border-top: 1px solid rgba(255, 255, 255, 0.1);
+		margin: 1.5rem 0;
+	}
+
+	/* Code block wrapper */
+	.markdown-content :global(.code-block-wrapper) {
+		margin: 1rem 0;
+		border-radius: 0.5rem;
+		overflow: hidden;
+		border: 1px solid rgba(255, 215, 0, 0.2);
+		background: rgba(0, 0, 0, 0.4);
+	}
+
+	.markdown-content :global(.code-block-header) {
+		display: flex;
+		justify-content: space-between;
+		align-items: center;
+		padding: 0.5rem 0.75rem;
+		background: rgba(255, 215, 0, 0.05);
+		border-bottom: 1px solid rgba(255, 215, 0, 0.1);
+	}
+
+	.markdown-content :global(.code-language) {
+		color: var(--exo-yellow, #ffd700);
+		font-size: 0.7rem;
+		font-weight: 500;
+		text-transform: uppercase;
+		letter-spacing: 0.1em;
+		font-family: ui-monospace, SFMono-Regular, 'SF Mono', Monaco, Consolas, monospace;
+	}
+
+	.markdown-content :global(.copy-code-btn) {
+		display: flex;
+		align-items: center;
+		justify-content: center;
+		padding: 0.25rem;
+		background: transparent;
+		border: none;
+		color: var(--exo-light-gray, #9ca3af);
+		cursor: pointer;
+		transition: color 0.2s;
+		border-radius: 0.25rem;
+	}
+
+	.markdown-content :global(.copy-code-btn:hover) {
+		color: var(--exo-yellow, #ffd700);
+	}
+
+	.markdown-content :global(.copy-code-btn.copied) {
+		color: #22c55e;
+	}
+
+	.markdown-content :global(.code-block-wrapper pre) {
+		margin: 0;
+		padding: 1rem;
+		overflow-x: auto;
+		background: transparent;
+	}
+
+	.markdown-content :global(.code-block-wrapper code) {
+		font-family: ui-monospace, SFMono-Regular, 'SF Mono', Monaco, Consolas, monospace;
+		font-size: 0.8125rem;
+		line-height: 1.5;
+		background: transparent;
+	}
+
+	/* Syntax highlighting - dark theme matching EXO style */
+	.markdown-content :global(.hljs) {
+		color: #e5e7eb;
+	}
+
+	.markdown-content :global(.hljs-keyword),
+	.markdown-content :global(.hljs-selector-tag),
+	.markdown-content :global(.hljs-literal),
+	.markdown-content :global(.hljs-section),
+	.markdown-content :global(.hljs-link) {
+		color: #c084fc;
+	}
+
+	.markdown-content :global(.hljs-string),
+	.markdown-content :global(.hljs-title),
+	.markdown-content :global(.hljs-name),
+	.markdown-content :global(.hljs-type),
+	.markdown-content :global(.hljs-attribute),
+	.markdown-content :global(.hljs-symbol),
+	.markdown-content :global(.hljs-bullet),
+	.markdown-content :global(.hljs-addition),
+	.markdown-content :global(.hljs-variable),
+	.markdown-content :global(.hljs-template-tag),
+	.markdown-content :global(.hljs-template-variable) {
+		color: #fbbf24;
+	}
+
+	.markdown-content :global(.hljs-comment),
+	.markdown-content :global(.hljs-quote),
+	.markdown-content :global(.hljs-deletion),
+	.markdown-content :global(.hljs-meta) {
+		color: #6b7280;
+	}
+
+	.markdown-content :global(.hljs-number),
+	.markdown-content :global(.hljs-regexp),
+	.markdown-content :global(.hljs-literal),
+	.markdown-content :global(.hljs-built_in) {
+		color: #34d399;
+	}
+
+	.markdown-content :global(.hljs-function),
+	.markdown-content :global(.hljs-class .hljs-title) {
+		color: #60a5fa;
+	}
+
+	/* KaTeX math styling */
+	.markdown-content :global(.katex) {
+		font-size: 1.1em;
+	}
+
+	.markdown-content :global(.katex-display) {
+		margin: 1rem 0;
+		overflow-x: auto;
+		overflow-y: hidden;
+		padding: 0.5rem 0;
+	}
+
+	.markdown-content :global(.katex-display > .katex) {
+		text-align: center;
+	}
+
+	.markdown-content :global(.math-error) {
+		color: #f87171;
+		font-family: ui-monospace, SFMono-Regular, 'SF Mono', Monaco, Consolas, monospace;
+		font-size: 0.875em;
+		background: rgba(248, 113, 113, 0.1);
+		padding: 0.125rem 0.25rem;
+		border-radius: 0.25rem;
+	}
+</style>
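Review note on `setupCopyButtons` in MarkdownContent.svelte: the `data-listener-bound` guard keeps a rerun of the `$effect` from attaching a second click handler to a button that is still in the DOM. A minimal sketch of that guard follows, assuming a hypothetical `bindOnce` helper and a hypothetical structural `BindableButton` type so it runs without a DOM. Neither name is part of this diff.

```typescript
// Hypothetical structural stand-in for HTMLButtonElement (not in the diff).
interface BindableButton {
	dataset: Record<string, string | undefined>;
	addEventListener(type: string, listener: () => void): void;
}

// Attach `listener` only if the element has not been marked as bound yet.
// Returns true when a listener was actually added.
function bindOnce(button: BindableButton, listener: () => void): boolean {
	if (button.dataset['listenerBound'] === 'true') return false;
	button.dataset['listenerBound'] = 'true';
	button.addEventListener('click', listener);
	return true;
}
```

Calling `bindOnce` twice on the same element adds one listener. A freshly rendered button carries no marker, so it is bound again on the next pass.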
--- a/dashboard/src/lib/components/ModelCard.svelte
+++ b/dashboard/src/lib/components/ModelCard.svelte
@@ -1,5 +1,6 @@
 <script lang="ts">
-	import type { DownloadProgress, NodeInfo, PlacementPreview } from '$lib/stores/app.svelte';
+	import type { DownloadProgress, NodeInfo, PlacementPreview, TopologyEdge } from '$lib/stores/app.svelte';
+	import { debugMode, topologyData } from '$lib/stores/app.svelte';

 interface Props {
 		model: { id: string; name?: string; storage_size_megabytes?: number };
@@ -196,7 +197,7 @@ function toggleNodeDetails(nodeId: string): void {
 	// Uses API preview data when available, falls back to local estimation
 	const placementPreview = $derived(() => {
 		const nodeArray = nodeList();
-		if (nodeArray.length === 0) return { nodes: [], canFit: false, totalAvailable: 0, error: null };
+		if (nodeArray.length === 0) return { nodes: [], canFit: false, totalAvailable: 0, topoWidth: 260, topoHeight: 90, error: null };
 		
 		const numNodes = nodeArray.length;
 		const iconSize = numNodes === 1 ? 50 : 36;
@@ -206,12 +207,8 @@ function toggleNodeDetails(nodeId: string): void {
 		const centerY = topoHeight / 2;
 		const radius = numNodes === 1 ? 0 : numNodes === 2 ? 45 : Math.min(topoWidth, topoHeight) * 0.32;
 		
-		// Use API preview data if available
+		// Only use API preview data - no local estimation
 		const hasApiPreview = apiPreview !== null && apiPreview.error === null && apiPreview.memory_delta_by_node !== null;
-		const canFit = hasApiPreview ? true : (() => {
-			const totalAvailable = nodeArray.reduce((sum, n) => sum + n.availableGB, 0);
-			return totalAvailable >= estimatedMemory;
-		})();
 		const error = apiPreview?.error ?? null;
 		
 		let placementNodes: Array<{ 
@@ -232,135 +229,140 @@ function toggleNodeDetails(nodeId: string): void {
 			modelFillHeight: number;
 		}> = [];
 		
-		if (hasApiPreview && apiPreview.memory_delta_by_node) {
-			// Use API placement data
-			const memoryDelta = apiPreview.memory_delta_by_node;
-			placementNodes = nodeArray.map((n, i) => {
-				const deltaBytes = memoryDelta[n.id] ?? 0;
-				const modelUsageGB = deltaBytes / (1024 * 1024 * 1024);
-				const isUsed = deltaBytes > 0;
-				const angle = numNodes === 1 ? 0 : (i / numNodes) * Math.PI * 2 - Math.PI / 2;
-				const safeTotal = Math.max(n.totalGB, 0.001);
-				const currentPercent = clampPercent((n.usedGB / safeTotal) * 100);
-				const newPercent = clampPercent(((n.usedGB + modelUsageGB) / safeTotal) * 100);
-				const screenHeight = iconSize * 0.58;
-				
-				return {
-					id: n.id,
-					deviceName: n.deviceName,
-					deviceType: n.deviceType,
-					totalGB: n.totalGB,
-					currentUsedGB: n.usedGB,
-					modelUsageGB,
-					currentPercent,
-					newPercent,
-					isUsed,
-					x: centerX + Math.cos(angle) * radius,
-					y: centerY + Math.sin(angle) * radius,
-					iconSize,
-					screenHeight,
-					currentFillHeight: screenHeight * (currentPercent / 100),
-					modelFillHeight: screenHeight * ((newPercent - currentPercent) / 100)
-				};
-			});
-		} else if (apiPreview?.error) {
-			// API returned an error - model can't fit, show all nodes as unused
-			placementNodes = nodeArray.map((n, i) => {
-				const angle = numNodes === 1 ? 0 : (i / numNodes) * Math.PI * 2 - Math.PI / 2;
-				const safeTotal = Math.max(n.totalGB, 0.001);
-				const currentPercent = clampPercent((n.usedGB / safeTotal) * 100);
-				const screenHeight = iconSize * 0.58;
-				
-				return {
-					id: n.id,
-					deviceName: n.deviceName,
-					deviceType: n.deviceType,
-					totalGB: n.totalGB,
-					currentUsedGB: n.usedGB,
-					modelUsageGB: 0,
-					currentPercent,
-					newPercent: currentPercent,
-					isUsed: false,
-					x: centerX + Math.cos(angle) * radius,
-					y: centerY + Math.sin(angle) * radius,
-					iconSize,
-					screenHeight,
-					currentFillHeight: screenHeight * (currentPercent / 100),
-					modelFillHeight: 0
-				};
-			});
-		} else {
-			// Fallback: local estimation based on sharding strategy
-			const memoryNeeded = estimatedMemory;
+		// Use API placement data directly
+		const memoryDelta = apiPreview?.memory_delta_by_node ?? {};
+		placementNodes = nodeArray.map((n, i) => {
+			const deltaBytes = memoryDelta[n.id] ?? 0;
+			const modelUsageGB = deltaBytes / (1024 * 1024 * 1024);
+			const isUsed = deltaBytes > 0;
+			const angle = numNodes === 1 ? 0 : (i / numNodes) * Math.PI * 2 - Math.PI / 2;
+			const safeTotal = Math.max(n.totalGB, 0.001);
+			const currentPercent = clampPercent((n.usedGB / safeTotal) * 100);
+			const newPercent = clampPercent(((n.usedGB + modelUsageGB) / safeTotal) * 100);
+			const screenHeight = iconSize * 0.58;
 			
-			if (sharding === 'Pipeline') {
-				const memoryPerNode = memoryNeeded / numNodes;
-				placementNodes = nodeArray.map((n, i) => {
-					const angle = numNodes === 1 ? 0 : (i / numNodes) * Math.PI * 2 - Math.PI / 2;
-					const safeTotal = Math.max(n.totalGB, 0.001);
-					const currentPercent = clampPercent((n.usedGB / safeTotal) * 100);
-					const newPercent = clampPercent(((n.usedGB + memoryPerNode) / safeTotal) * 100);
-					const screenHeight = iconSize * 0.58;
-					
-					return {
-						id: n.id,
-						deviceName: n.deviceName,
-						deviceType: n.deviceType,
-						totalGB: n.totalGB,
-						currentUsedGB: n.usedGB,
-						modelUsageGB: memoryPerNode,
-						currentPercent,
-						newPercent,
-						isUsed: true,
-						x: centerX + Math.cos(angle) * radius,
-						y: centerY + Math.sin(angle) * radius,
-						iconSize,
-						screenHeight,
-						currentFillHeight: screenHeight * (currentPercent / 100),
-						modelFillHeight: screenHeight * ((newPercent - currentPercent) / 100)
-					};
-				});
-			} else {
-				let remaining = memoryNeeded;
-				placementNodes = nodeArray.map((n, i) => {
-					const allocated = Math.min(remaining, n.availableGB);
-					remaining -= allocated;
-					const isUsed = allocated > 0;
-					const angle = numNodes === 1 ? 0 : (i / numNodes) * Math.PI * 2 - Math.PI / 2;
-					const safeTotal = Math.max(n.totalGB, 0.001);
-					const currentPercent = clampPercent((n.usedGB / safeTotal) * 100);
-					const newPercent = clampPercent(((n.usedGB + allocated) / safeTotal) * 100);
-					const screenHeight = iconSize * 0.58;
-					
-					return {
-						id: n.id,
-						deviceName: n.deviceName,
-						deviceType: n.deviceType,
-						totalGB: n.totalGB,
-						currentUsedGB: n.usedGB,
-						modelUsageGB: allocated,
-						currentPercent,
-						newPercent,
-						isUsed,
-						x: centerX + Math.cos(angle) * radius,
-						y: centerY + Math.sin(angle) * radius,
-						iconSize,
-						screenHeight,
-						currentFillHeight: screenHeight * (currentPercent / 100),
-						modelFillHeight: screenHeight * ((newPercent - currentPercent) / 100)
-					};
-				});
-			}
-		}
+			return {
+				id: n.id,
+				deviceName: n.deviceName,
+				deviceType: n.deviceType,
+				totalGB: n.totalGB,
+				currentUsedGB: n.usedGB,
+				modelUsageGB,
+				currentPercent,
+				newPercent,
+				isUsed,
+				x: centerX + Math.cos(angle) * radius,
+				y: centerY + Math.sin(angle) * radius,
+				iconSize,
+				screenHeight,
+				currentFillHeight: screenHeight * (currentPercent / 100),
+				modelFillHeight: screenHeight * ((newPercent - currentPercent) / 100)
+			};
+		});
 		
 		const totalAvailable = nodeArray.reduce((sum, n) => sum + n.availableGB, 0);
-		return { nodes: placementNodes, canFit: hasApiPreview || canFit, totalAvailable, topoWidth, topoHeight, error };
+		return { nodes: placementNodes, canFit: hasApiPreview, totalAvailable, topoWidth, topoHeight, error };
 	});
 	
 	const canFit = $derived(apiPreview ? apiPreview.error === null : placementPreview().canFit);
 	const placementError = $derived(apiPreview?.error ?? null);
 	const nodeCount = $derived(nodeList().length);
 	const filterId = $derived(model.id.replace(/[^a-zA-Z0-9]/g, ''));
+	
+	// Debug mode state
+	const isDebugMode = $derived(debugMode());
+	const topology = $derived(topologyData());
+	const isRdma = $derived(runtime === 'MlxIbv' || runtime === 'MlxJaccl');
+	
+	// Get interface name for an IP from node data
+	function getInterfaceForIp(nodeId: string, ip?: string): string | null {
+		if (!ip || !topology?.nodes) return null;
+		
+		// Strip port if present (IPv4 "host:port" only; the raw ip is also tried below for other forms)
+		const cleanIp = ip.includes(':') && !ip.includes('[') ? ip.split(':')[0] : ip;
+		
+		// Check specified node first
+		const node = topology.nodes[nodeId];
+		if (node) {
+			const match = node.network_interfaces?.find((iface) =>
+				(iface.addresses || []).some((addr) => addr === cleanIp || addr === ip)
+			);
+			if (match?.name) return match.name;
+			
+			const mapped = node.ip_to_interface?.[cleanIp] || node.ip_to_interface?.[ip];
+			if (mapped) return mapped;
+		}
+		
+		// Fallback: check all nodes
+		for (const [, otherNode] of Object.entries(topology.nodes)) {
+			if (!otherNode) continue;
+			const match = otherNode.network_interfaces?.find((iface) =>
+				(iface.addresses || []).some((addr) => addr === cleanIp || addr === ip)
+			);
+			if (match?.name) return match.name;
+			
+			const mapped = otherNode.ip_to_interface?.[cleanIp] || otherNode.ip_to_interface?.[ip];
+			if (mapped) return mapped;
+		}
+		
+		return null;
+	}
+	
+	// Get directional arrow based on node positions
+	function getArrow(fromNode: { x: number; y: number }, toNode: { x: number; y: number }): string {
+		const dx = toNode.x - fromNode.x;
+		const dy = toNode.y - fromNode.y;
+		const absX = Math.abs(dx);
+		const absY = Math.abs(dy);
+		
+		if (absX > absY * 2) {
+			return dx > 0 ? '→' : '←';
+		} else if (absY > absX * 2) {
+			return dy > 0 ? '↓' : '↑';
+		} else {
+			if (dx > 0 && dy > 0) return '↘';
+			if (dx > 0 && dy < 0) return '↗';
+			if (dx < 0 && dy > 0) return '↙';
+			return '↖';
+		}
+	}
+
+	// Get connection info for edges between two nodes
+	// Returns at most one connection per direction (A→B and B→A), preferring non-loopback
+	function getConnectionInfo(nodeId1: string, nodeId2: string): Array<{ ip: string; iface: string | null; from: string; to: string }> {
+		if (!topology?.edges) return [];
+		
+		// Collect candidates for each direction
+		const aToBCandidates: Array<{ ip: string; iface: string | null }> = [];
+		const bToACandidates: Array<{ ip: string; iface: string | null }> = [];
+		
+		for (const edge of topology.edges) {
+			const ip = edge.sendBackIp || '?';
+			const iface = edge.sendBackInterface || getInterfaceForIp(edge.source, ip);
+			
+			if (edge.source === nodeId1 && edge.target === nodeId2) {
+				aToBCandidates.push({ ip, iface });
+			} else if (edge.source === nodeId2 && edge.target === nodeId1) {
+				bToACandidates.push({ ip, iface });
+			}
+		}
+		
+		// Pick best (prefer non-loopback)
+		const pickBest = (candidates: Array<{ ip: string; iface: string | null }>) => {
+			if (candidates.length === 0) return null;
+			return candidates.find(c => !c.ip.startsWith('127.')) || candidates[0];
+		};
+		
+		const result: Array<{ ip: string; iface: string | null; from: string; to: string }> = [];
+		
+		const bestAtoB = pickBest(aToBCandidates);
+		if (bestAtoB) result.push({ ...bestAtoB, from: nodeId1, to: nodeId2 });
+		
+		const bestBtoA = pickBest(bToACandidates);
+		if (bestBtoA) result.push({ ...bestBtoA, from: nodeId2, to: nodeId1 });
+		
+		return result;
+	}
 </script>

 <div class="relative group">
@@ -453,6 +455,26 @@ function toggleNodeDetails(nodeId: string): void {
 					
 					<!-- Connection lines between nodes (if multiple) -->
 					{#if preview.nodes.length > 1}
+						{@const usedNodes = preview.nodes.filter(n => n.isUsed)}
+						{@const nodePositions = Object.fromEntries(preview.nodes.map(n => [n.id, { x: n.x, y: n.y }]))}
+						{@const allConnections = isDebugMode && usedNodes.length > 1 ? (() => {
+							const conns: Array<{ ip: string; iface: string | null; from: string; to: string; midX: number; midY: number; arrow: string }> = [];
+							for (let i = 0; i < usedNodes.length; i++) {
+								for (let j = i + 1; j < usedNodes.length; j++) {
+									const n1 = usedNodes[i];
+									const n2 = usedNodes[j];
+									const midX = (n1.x + n2.x) / 2;
+									const midY = (n1.y + n2.y) / 2;
+									for (const c of getConnectionInfo(n1.id, n2.id)) {
+										const fromPos = nodePositions[c.from];
+										const toPos = nodePositions[c.to];
+										const arrow = fromPos && toPos ? getArrow(fromPos, toPos) : '→';
+										conns.push({ ...c, midX, midY, arrow });
+									}
+								}
+							}
+							return conns;
+						})() : []}
 						{#each preview.nodes as node, i}
 							{#each preview.nodes.slice(i + 1) as node2}
 								<line 
@@ -464,6 +486,43 @@ function toggleNodeDetails(nodeId: string): void {
 								/>
 							{/each}
 						{/each}
+						<!-- Debug: Show connection IPs/interfaces in corners -->
+						{#if isDebugMode && allConnections.length > 0}
+							{@const centerX = preview.topoWidth / 2}
+							{@const centerY = preview.topoHeight / 2}
+							{@const quadrants = {
+								topLeft: allConnections.filter(c => c.midX < centerX && c.midY < centerY),
+								topRight: allConnections.filter(c => c.midX >= centerX && c.midY < centerY),
+								bottomLeft: allConnections.filter(c => c.midX < centerX && c.midY >= centerY),
+								bottomRight: allConnections.filter(c => c.midX >= centerX && c.midY >= centerY)
+							}}
+							{@const padding = 4}
+							{@const lineHeight = 8}
+							<!-- Top Left -->
+							{#each quadrants.topLeft as conn, idx}
+								<text x={padding} y={padding + idx * lineHeight} text-anchor="start" dominant-baseline="hanging" font-size="6" font-family="SF Mono, Monaco, monospace" fill={conn.iface ? 'rgba(255,255,255,0.85)' : 'rgba(248,113,113,0.85)'}>
+									{conn.arrow} {isRdma ? (conn.iface || '?') : `${conn.ip}${conn.iface ? ` (${conn.iface})` : ''}`}
+								</text>
+							{/each}
+							<!-- Top Right -->
+							{#each quadrants.topRight as conn, idx}
+								<text x={preview.topoWidth - padding} y={padding + idx * lineHeight} text-anchor="end" dominant-baseline="hanging" font-size="6" font-family="SF Mono, Monaco, monospace" fill={conn.iface ? 'rgba(255,255,255,0.85)' : 'rgba(248,113,113,0.85)'}>
+									{conn.arrow} {isRdma ? (conn.iface || '?') : `${conn.ip}${conn.iface ? ` (${conn.iface})` : ''}`}
+								</text>
+							{/each}
+							<!-- Bottom Left -->
+							{#each quadrants.bottomLeft as conn, idx}
+								<text x={padding} y={preview.topoHeight - padding - (quadrants.bottomLeft.length - 1 - idx) * lineHeight} text-anchor="start" dominant-baseline="auto" font-size="6" font-family="SF Mono, Monaco, monospace" fill={conn.iface ? 'rgba(255,255,255,0.85)' : 'rgba(248,113,113,0.85)'}>
+									{conn.arrow} {isRdma ? (conn.iface || '?') : `${conn.ip}${conn.iface ? ` (${conn.iface})` : ''}`}
+								</text>
+							{/each}
+							<!-- Bottom Right -->
+							{#each quadrants.bottomRight as conn, idx}
+								<text x={preview.topoWidth - padding} y={preview.topoHeight - padding - (quadrants.bottomRight.length - 1 - idx) * lineHeight} text-anchor="end" dominant-baseline="auto" font-size="6" font-family="SF Mono, Monaco, monospace" fill={conn.iface ? 'rgba(255,255,255,0.85)' : 'rgba(248,113,113,0.85)'}>
+									{conn.arrow} {isRdma ? (conn.iface || '?') : `${conn.ip}${conn.iface ? ` (${conn.iface})` : ''}`}
+								</text>
+							{/each}
+						{/if}
 					{/if}
 					
 					{#each preview.nodes as node}
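Review note on the port stripping in `getInterfaceForIp` (and `getInterfaceLabel` in TopologyGraph.svelte): the expression `ip.split(':')[0]` runs for any string that contains `:` and no `[`, so an unbracketed IPv6 address such as `fe80::1` is reduced to `fe80`. The lookups also try the raw `ip`, which limits the impact. A stricter alternative might look like the sketch below. `stripPort` is a hypothetical name, not something in this diff, and unlike the diff it also unwraps bracketed IPv6.

```typescript
// Hypothetical helper: return the host part of an address string.
// Handles "1.2.3.4:8080" and "[fe80::1]:8080", and leaves bare IPv6 untouched.
function stripPort(address: string): string {
	// Bracketed IPv6, optionally followed by ":port".
	const bracketed = address.match(/^\[([^\]]+)\](?::\d+)?$/);
	if (bracketed) return bracketed[1] ?? address;

	// Exactly one colon means "host:port"; more than one means bare IPv6.
	const firstColon = address.indexOf(':');
	if (firstColon !== -1 && firstColon === address.lastIndexOf(':')) {
		return address.slice(0, firstColon);
	}
	return address;
}
```

With this shape, a single normalized value could be matched against `network_interfaces` and `ip_to_interface`. Whether the topology data ever carries bracketed or IPv6 addresses is an assumption the diff does not confirm.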
--- a/dashboard/src/lib/components/TopologyGraph.svelte
+++ b/dashboard/src/lib/components/TopologyGraph.svelte
@@ -1,7 +1,7 @@
 <script lang="ts">
 	import { onMount, onDestroy } from 'svelte';
 	import * as d3 from 'd3';
-import { topologyData, isTopologyMinimized, debugMode } from '$lib/stores/app.svelte';
+import { topologyData, isTopologyMinimized, debugMode, type NodeInfo } from '$lib/stores/app.svelte';

 	interface Props {
 		class?: string;
@@ -24,19 +24,38 @@ function getNodeLabel(nodeId: string): string {

 function getInterfaceLabel(nodeId: string, ip?: string): { label: string; missing: boolean } {
 	if (!ip) return { label: '?', missing: true };
-	const node = data?.nodes?.[nodeId];
-	if (!node) return { label: '?', missing: true };

-	const matchFromInterfaces = node.network_interfaces?.find((iface) =>
-		(iface.addresses || []).some((addr) => addr === ip)
-	);
-	if (matchFromInterfaces?.name) {
-		return { label: matchFromInterfaces.name, missing: false };
+	// Strip port if present (e.g., "192.168.1.1:8080" -> "192.168.1.1"); the raw ip is also tried below
+	const cleanIp = ip.includes(':') && !ip.includes('[') ? ip.split(':')[0] : ip;
+
+	// Helper to check a node's interfaces
+	function checkNode(node: NodeInfo | undefined): string | null {
+		if (!node) return null;
+
+		const matchFromInterfaces = node.network_interfaces?.find((iface) =>
+			(iface.addresses || []).some((addr) => addr === cleanIp || addr === ip)
+		);
+		if (matchFromInterfaces?.name) {
+			return matchFromInterfaces.name;
+		}
+
+		if (node.ip_to_interface) {
+			const mapped = node.ip_to_interface[cleanIp] || (ip ? node.ip_to_interface[ip] : undefined);
+			if (mapped && mapped.trim().length > 0) {
+				return mapped;
+			}
+		}
+		return null;
 	}

-	const mapped = node.ip_to_interface?.[ip];
-	if (mapped && mapped.trim().length > 0) {
-		return { label: mapped, missing: false };
+	// Try specified node first
+	const result = checkNode(data?.nodes?.[nodeId]);
+	if (result) return { label: result, missing: false };
+
+	// Fallback: search all nodes for this IP
+	for (const [, otherNode] of Object.entries(data?.nodes || {})) {
+		const otherResult = checkNode(otherNode);
+		if (otherResult) return { label: otherResult, missing: false };
 	}

 	return { label: '?', missing: true };
@@ -67,6 +86,7 @@ function wrapLine(text: string, maxLen: number): string[] {
 	return lines;
 }

+
 	// Apple logo path for MacBook Pro screen
 	const APPLE_LOGO_PATH = "M788.1 340.9c-5.8 4.5-108.2 62.2-108.2 190.5 0 148.4 130.3 200.9 134.2 202.2-.6 3.2-20.7 71.9-68.7 141.9-42.8 61.6-87.5 123.1-155.5 123.1s-85.5-39.5-164-39.5c-76.5 0-103.7 40.8-165.9 40.8s-105.6-57-155.5-127C46.7 790.7 0 663 0 541.8c0-194.4 126.4-297.5 250.8-297.5 66.1 0 121.2 43.4 162.7 43.4 39.5 0 101.1-46 176.3-46 28.5 0 130.9 2.6 198.3 99.2zm-234-181.5c31.1-36.9 53.1-88.1 53.1-139.3 0-7.1-.6-14.3-1.9-20.1-50.6 1.9-110.8 33.7-147.1 75.8-28.5 32.4-55.1 83.6-55.1 135.5 0 7.8 1.3 15.6 1.9 18.1 3.2.6 8.4 1.3 13.6 1.3 45.4 0 102.5-30.4 135.5-71.3z";
 	const LOGO_NATIVE_WIDTH = 814;
@@ -237,20 +257,24 @@ function wrapLine(text: string, maxLen: number): string[] {
 		const arrowsGroup = svg.append('g').attr('class', 'arrows-group');
 		const debugLabelsGroup = svg.append('g').attr('class', 'debug-edge-labels');

-		const pairMap = new Map<string, { a: string; b: string; aToB: boolean; bToA: boolean; connections: Array<{ from: string; to: string; ip: string; ifaceLabel: string; missingIface: boolean }> }>();
+		type ConnectionInfo = { from: string; to: string; ip: string; ifaceLabel: string; missingIface: boolean };
+		type PairEntry = { a: string; b: string; aToB: boolean; bToA: boolean; connections: ConnectionInfo[] };
+		type DebugEdgeLabelEntry = { connections: ConnectionInfo[]; isLeft: boolean; isTop: boolean; mx: number; my: number };
+		const pairMap = new Map<string, PairEntry>();
+		const debugEdgeLabels: DebugEdgeLabelEntry[] = [];
 		edges.forEach(edge => {
 			if (!edge.source || !edge.target || edge.source === edge.target) return;
 			if (!positionById[edge.source] || !positionById[edge.target]) return;
-			
+
 			const a = edge.source < edge.target ? edge.source : edge.target;
 			const b = edge.source < edge.target ? edge.target : edge.source;
 			const key = `${a}|${b}`;
 			const entry = pairMap.get(key) || { a, b, aToB: false, bToA: false, connections: [] };
-			
+
 			if (edge.source === a) entry.aToB = true;
 			else entry.bToA = true;

-			const ip = edge.sendBackIp || edge.sendBackMultiaddr?.ip_address || '?';
+			const ip = edge.sendBackIp || '?';
 			const ifaceInfo = getInterfaceLabel(edge.source, ip);
 			entry.connections.push({
 				from: edge.source,
@@ -314,110 +338,97 @@ function wrapLine(text: string, maxLen: number): string[] {
 					.attr('marker-end', 'url(#arrowhead)');
 			}

+			// Collect debug labels for later positioning at edges
 			if (debugEnabled && entry.connections.length > 0) {
-				const maxBoxes = 6;
-				const fontSize = isMinimized ? 8 : 9;
-				const lineGap = 2;
-				const labelOffsetOut = Math.max(140, minDimension * 0.38);
-				const labelOffsetSide = isMinimized ? 16 : 20;
-				const boxWidth = 170;
-				const maxLineLen = 26;
+				// Determine which side of viewport based on edge midpoint
+				const isLeft = mx < centerX;
+				const isTop = my < safeCenterY;

-				const connections = entry.connections.slice(0, maxBoxes);
-				if (entry.connections.length > maxBoxes) {
-					const remaining = entry.connections.length - maxBoxes;
-					connections.push({
-						from: '',
-						to: '',
-						ip: `(+${remaining} more)`,
-						ifaceLabel: '',
-						missingIface: false
-					});
-				}
-
-				let dirX = mx - centerX;
-				let dirY = my - centerY;
-				const dirLen = Math.hypot(dirX, dirY);
-				if (dirLen < 1) {
-					dirX = -uy;
-					dirY = ux;
-				} else {
-					dirX /= dirLen;
-					dirY /= dirLen;
-				}
-
-				const nx = -dirY;
-				const ny = dirX;
-
-				const labelXRaw = mx + dirX * labelOffsetOut + nx * labelOffsetSide;
-				const labelYRaw = my + dirY * labelOffsetOut + ny * labelOffsetSide;
-				const clampPad = Math.min(120, minDimension * 0.12);
-				const labelX = Math.max(clampPad, Math.min(width - clampPad, labelXRaw));
-				const labelY = Math.max(clampPad, Math.min(height - clampPad, labelYRaw));
-
-				const labelGroup = debugLabelsGroup.append('g')
-					.attr('transform', `translate(${labelX}, ${labelY})`);
-
-				const textGroup = labelGroup.append('g');
-
-				connections.forEach((conn, idx) => {
-					const rawLines = conn.from && conn.to
-						? [
-							`${getNodeLabel(conn.from)}→${getNodeLabel(conn.to)}`,
-							`${conn.ip}`,
-							`${conn.ifaceLabel}`
-						]
-						: [conn.ip];
-
-					const wrapped = rawLines.flatMap(line => wrapLine(line, maxLineLen));
-
-					wrapped.forEach((line, lineIdx) => {
-						textGroup.append('text')
-							.attr('x', 0)
-							.attr('y', (idx * (wrapped.length * (fontSize + lineGap))) + lineIdx * (fontSize + lineGap))
-							.attr('text-anchor', 'middle')
-							.attr('dominant-baseline', 'hanging')
-							.attr('font-size', fontSize)
-							.attr('font-family', 'SF Mono, monospace')
-							.attr('fill', conn.missingIface ? 'rgba(248,113,113,0.9)' : 'rgba(255,255,255,0.9)')
-							.text(line);
-					});
+				// Store for batch rendering after all edges processed
+				debugEdgeLabels.push({
+					connections: entry.connections,
+					isLeft,
+					isTop,
+					mx,
+					my
 				});
-
-				const bbox = textGroup.node()?.getBBox();
-				if (bbox) {
-					const paddedWidth = Math.max(boxWidth, bbox.width + 14);
-					const boxHeight = bbox.height + 8;
-					const boxMinX = labelX - paddedWidth / 2;
-					const boxMaxX = labelX + paddedWidth / 2;
-					const boxMinY = labelY + bbox.y - 4;
-					const boxMaxY = boxMinY + boxHeight;
-
-					const clampPadDynamic = Math.min(140, minDimension * 0.18);
-					let shiftX = 0;
-					let shiftY = 0;
-					if (boxMinX < clampPadDynamic) shiftX = clampPadDynamic - boxMinX;
-					if (boxMaxX > width - clampPadDynamic) shiftX = (width - clampPadDynamic) - boxMaxX;
-					if (boxMinY < clampPadDynamic) shiftY = clampPadDynamic - boxMinY;
-					if (boxMaxY > height - clampPadDynamic) shiftY = (height - clampPadDynamic) - boxMaxY;
-
-					const finalX = labelX + shiftX;
-					const finalY = labelY + shiftY;
-					labelGroup.attr('transform', `translate(${finalX}, ${finalY})`);
-
-					labelGroup.insert('rect', 'g')
-						.attr('x', -paddedWidth / 2)
-						.attr('y', bbox.y - 4)
-						.attr('width', paddedWidth)
-						.attr('height', boxHeight)
-						.attr('rx', 4)
-						.attr('fill', 'rgba(0,0,0,0.75)')
-						.attr('stroke', 'rgba(255,255,255,0.12)')
-						.attr('stroke-width', 0.6);
-				}
 			}
 		});

+		// Render debug labels at viewport edges/corners
+		if (debugEdgeLabels.length > 0) {
+			const fontSize = isMinimized ? 10 : 12;
+			const lineHeight = fontSize + 4;
+			const padding = 10;
+			
+			// Helper to get arrow based on direction vector
+			function getArrow(fromId: string, toId: string): string {
+				const fromPos = positionById[fromId];
+				const toPos = positionById[toId];
+				if (!fromPos || !toPos) return '→';
+				
+				const dirX = toPos.x - fromPos.x;
+				const dirY = toPos.y - fromPos.y;
+				const absX = Math.abs(dirX);
+				const absY = Math.abs(dirY);
+				
+				if (absX > absY * 2) {
+					return dirX > 0 ? '→' : '←';
+				} else if (absY > absX * 2) {
+					return dirY > 0 ? '↓' : '↑';
+				} else {
+					if (dirX > 0 && dirY > 0) return '↘';
+					if (dirX > 0 && dirY < 0) return '↗';
+					if (dirX < 0 && dirY > 0) return '↙';
+					return '↖';
+				}
+			}
+			
+			// Group by quadrant: topLeft, topRight, bottomLeft, bottomRight
+			const quadrants: Record<string, DebugEdgeLabelEntry[]> = {
+				topLeft: [],
+				topRight: [],
+				bottomLeft: [],
+				bottomRight: []
+			};
+
+			debugEdgeLabels.forEach(edge => {
+				const key = (edge.isTop ? 'top' : 'bottom') + (edge.isLeft ? 'Left' : 'Right');
+				quadrants[key].push(edge);
+			});
+
+			// Render each quadrant
+			Object.entries(quadrants).forEach(([quadrant, quadrantEdges]) => {
+				if (quadrantEdges.length === 0) return;
+
+				const isLeft = quadrant.includes('Left');
+				const isTop = quadrant.includes('top');
+
+				const baseX = isLeft ? padding : width - padding;
+				const baseY = isTop ? padding : height - padding;
+				const textAnchor = isLeft ? 'start' : 'end';
+
+				let currentY = baseY;
+
+				quadrantEdges.forEach(edge => {
+					edge.connections.forEach(conn => {
+						const arrow = getArrow(conn.from, conn.to);
+						const label = `${arrow} ${conn.ip} ${conn.ifaceLabel}`;
+						debugLabelsGroup.append('text')
+							.attr('x', baseX)
+							.attr('y', currentY)
+							.attr('text-anchor', textAnchor)
+							.attr('dominant-baseline', isTop ? 'hanging' : 'auto')
+							.attr('font-size', fontSize)
+							.attr('font-family', 'SF Mono, monospace')
+							.attr('fill', conn.missingIface ? 'rgba(248,113,113,0.9)' : 'rgba(255,255,255,0.85)')
+							.text(label);
+						currentY += isTop ? lineHeight : -lineHeight;
+					});
+				});
+			});
+		}
+
 		// Draw nodes
 		const nodesGroup = svg.append('g').attr('class', 'nodes-group');

@@ -968,4 +979,5 @@ function wrapLine(text: string, maxLen: number): string[] {
 		from { stroke-dashoffset: 0; }
 		to { stroke-dashoffset: -10; }
 	}
+
 </style>
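
Reviewer note on `getArrow` in the hunk above: the arrow is chosen from the displacement between the two node positions. An axis counts as dominant when its magnitude is more than twice the other; otherwise the sign pair selects a diagonal. The following is a minimal standalone sketch of that rule, not the component code. `Point` and `arrowFor` are hypothetical names introduced here for illustration; the patch itself looks positions up in `positionById`.

```typescript
// Hypothetical standalone version of the arrow-selection rule from the patch.
// `Point` and `arrowFor` are illustrative names, not part of the change.
interface Point {
  x: number;
  y: number;
}

function arrowFor(from: Point | undefined, to: Point | undefined): string {
  // Same fallback as the patch when either position is missing.
  if (!from || !to) return "→";

  const dx = to.x - from.x;
  const dy = to.y - from.y; // SVG coordinates: y grows downward
  const absX = Math.abs(dx);
  const absY = Math.abs(dy);

  // One axis dominates when it is more than twice the other.
  if (absX > absY * 2) return dx > 0 ? "→" : "←";
  if (absY > absX * 2) return dy > 0 ? "↓" : "↑";

  // Otherwise pick a diagonal from the sign pair.
  if (dx > 0 && dy > 0) return "↘";
  if (dx > 0 && dy < 0) return "↗";
  if (dx < 0 && dy > 0) return "↙";
  return "↖";
}

// Example: a mostly horizontal edge and a true diagonal.
const horizontal = arrowFor({ x: 0, y: 0 }, { x: 10, y: 1 });
const diagonal = arrowFor({ x: 0, y: 0 }, { x: 5, y: -5 });
```

One consequence of the final fall-through: two nodes at the same position (both deltas zero) fail both dominance checks and every signed diagonal test, so they are labelled `↖`. That is probably harmless for a debug overlay, but worth knowing when reading the labels.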
--- a/dashboard/src/lib/components/index.ts
+++ b/dashboard/src/lib/components/index.ts
@@ -1,7 +1,7 @@
-export { default as TopologyGraph } from './TopologyGraph.svelte';
-export { default as ChatForm } from './ChatForm.svelte';
-export { default as ChatMessages } from './ChatMessages.svelte';
-export { default as ChatAttachments } from './ChatAttachments.svelte';
-export { default as ChatSidebar } from './ChatSidebar.svelte';
-export { default as ModelCard } from './ModelCard.svelte';
-
+export { default as TopologyGraph } from "./TopologyGraph.svelte";
+export { default as ChatForm } from "./ChatForm.svelte";
+export { default as ChatMessages } from "./ChatMessages.svelte";
+export { default as ChatAttachments } from "./ChatAttachments.svelte";
+export { default as ChatSidebar } from "./ChatSidebar.svelte";
+export { default as ModelCard } from "./ModelCard.svelte";
+export { default as MarkdownContent } from "./MarkdownContent.svelte";
--- a/dashboard/src/lib/stores/app.svelte.ts
+++ b/dashboard/src/lib/stores/app.svelte.ts
--- a/dashboard/src/lib/types/files.ts
+++ b/dashboard/src/lib/types/files.ts
@@ -13,55 +13,124 @@ export interface ChatUploadedFile {
 }

 export interface ChatAttachment {
-	type: 'image' | 'text' | 'pdf' | 'audio';
+	type: "image" | "text" | "pdf" | "audio";
 	name: string;
 	content?: string;
 	base64Url?: string;
 	mimeType?: string;
 }

-export type FileCategory = 'image' | 'text' | 'pdf' | 'audio' | 'unknown';
+export type FileCategory = "image" | "text" | "pdf" | "audio" | "unknown";

-export const IMAGE_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.gif', '.webp', '.svg'];
-export const IMAGE_MIME_TYPES = ['image/jpeg', 'image/png', 'image/gif', 'image/webp', 'image/svg+xml'];
+export const IMAGE_EXTENSIONS = [
+	".jpg",
+	".jpeg",
+	".png",
+	".gif",
+	".webp",
+	".svg",
+];
+export const IMAGE_MIME_TYPES = [
+	"image/jpeg",
+	"image/png",
+	"image/gif",
+	"image/webp",
+	"image/svg+xml",
+];

 export const TEXT_EXTENSIONS = [
-	'.txt', '.md', '.json', '.xml', '.yaml', '.yml', '.csv', '.log',
-	'.js', '.ts', '.jsx', '.tsx', '.py', '.java', '.cpp', '.c', '.h',
-	'.css', '.html', '.htm', '.sql', '.sh', '.bat', '.rs', '.go',
-	'.rb', '.php', '.swift', '.kt', '.scala', '.r', '.dart', '.vue', '.svelte'
+	".txt",
+	".md",
+	".json",
+	".xml",
+	".yaml",
+	".yml",
+	".csv",
+	".log",
+	".js",
+	".ts",
+	".jsx",
+	".tsx",
+	".py",
+	".java",
+	".cpp",
+	".c",
+	".h",
+	".css",
+	".html",
+	".htm",
+	".sql",
+	".sh",
+	".bat",
+	".rs",
+	".go",
+	".rb",
+	".php",
+	".swift",
+	".kt",
+	".scala",
+	".r",
+	".dart",
+	".vue",
+	".svelte",
 ];
 export const TEXT_MIME_TYPES = [
-	'text/plain', 'text/markdown', 'text/csv', 'text/html', 'text/css',
-	'application/json', 'application/xml', 'text/xml', 'application/javascript',
-	'text/javascript', 'application/typescript'
+	"text/plain",
+	"text/markdown",
+	"text/csv",
+	"text/html",
+	"text/css",
+	"application/json",
+	"application/xml",
+	"text/xml",
+	"application/javascript",
+	"text/javascript",
+	"application/typescript",
 ];

-export const PDF_EXTENSIONS = ['.pdf'];
-export const PDF_MIME_TYPES = ['application/pdf'];
+export const PDF_EXTENSIONS = [".pdf"];
+export const PDF_MIME_TYPES = ["application/pdf"];

-export const AUDIO_EXTENSIONS = ['.mp3', '.wav', '.ogg', '.m4a'];
-export const AUDIO_MIME_TYPES = ['audio/mpeg', 'audio/wav', 'audio/ogg', 'audio/mp4'];
+export const AUDIO_EXTENSIONS = [".mp3", ".wav", ".ogg", ".m4a"];
+export const AUDIO_MIME_TYPES = [
+	"audio/mpeg",
+	"audio/wav",
+	"audio/ogg",
+	"audio/mp4",
+];

 /**
 * Get file category based on MIME type and extension
 */
-export function getFileCategory(mimeType: string, fileName: string): FileCategory {
-	const extension = fileName.toLowerCase().slice(fileName.lastIndexOf('.'));
-	
-	if (IMAGE_MIME_TYPES.includes(mimeType) || IMAGE_EXTENSIONS.includes(extension)) {
-		return 'image';
+export function getFileCategory(
+	mimeType: string,
+	fileName: string,
+): FileCategory {
+	const extension = fileName.toLowerCase().slice(fileName.lastIndexOf("."));
+
+	if (
+		IMAGE_MIME_TYPES.includes(mimeType) ||
+		IMAGE_EXTENSIONS.includes(extension)
+	) {
+		return "image";
 	}
 	if (PDF_MIME_TYPES.includes(mimeType) || PDF_EXTENSIONS.includes(extension)) {
-		return 'pdf';
+		return "pdf";
 	}
-	if (AUDIO_MIME_TYPES.includes(mimeType) || AUDIO_EXTENSIONS.includes(extension)) {
-		return 'audio';
+	if (
+		AUDIO_MIME_TYPES.includes(mimeType) ||
+		AUDIO_EXTENSIONS.includes(extension)
+	) {
+		return "audio";
 	}
-	if (TEXT_MIME_TYPES.includes(mimeType) || TEXT_EXTENSIONS.includes(extension) || mimeType.startsWith('text/')) {
-		return 'text';
+	if (
+		TEXT_MIME_TYPES.includes(mimeType) ||
+		TEXT_EXTENSIONS.includes(extension) ||
+		mimeType.startsWith("text/")
+	) {
+		return "text";
 	}
-	return 'unknown';
+	return "unknown";
 }

 /**
@@ -69,36 +138,36 @@ export function getFileCategory(mimeType: string, fileName: string): FileCategor
 */
 export function getAcceptString(categories: FileCategory[]): string {
 	const accepts: string[] = [];
-	
+
 	for (const category of categories) {
 		switch (category) {
-			case 'image':
+			case "image":
 				accepts.push(...IMAGE_EXTENSIONS, ...IMAGE_MIME_TYPES);
 				break;
-			case 'text':
+			case "text":
 				accepts.push(...TEXT_EXTENSIONS, ...TEXT_MIME_TYPES);
 				break;
-			case 'pdf':
+			case "pdf":
 				accepts.push(...PDF_EXTENSIONS, ...PDF_MIME_TYPES);
 				break;
-			case 'audio':
+			case "audio":
 				accepts.push(...AUDIO_EXTENSIONS, ...AUDIO_MIME_TYPES);
 				break;
 		}
 	}
-	
-	return accepts.join(',');
+
+	return accepts.join(",");
 }

 /**
 * Format file size for display
 */
 export function formatFileSize(bytes: number): string {
-	if (bytes === 0) return '0 B';
+	if (bytes === 0) return "0 B";
 	const k = 1024;
-	const sizes = ['B', 'KB', 'MB', 'GB'];
+	const sizes = ["B", "KB", "MB", "GB"];
 	const i = Math.floor(Math.log(bytes) / Math.log(k));
-	return parseFloat((bytes / Math.pow(k, i)).toFixed(1)) + ' ' + sizes[i];
+	return parseFloat((bytes / Math.pow(k, i)).toFixed(1)) + " " + sizes[i];
 }

 /**
@@ -128,42 +197,44 @@ export function readFileAsText(file: File): Promise<string> {
 /**
 * Process uploaded files into ChatUploadedFile format
 */
-export async function processUploadedFiles(files: File[]): Promise<ChatUploadedFile[]> {
+export async function processUploadedFiles(
+	files: File[],
+): Promise<ChatUploadedFile[]> {
 	const results: ChatUploadedFile[] = [];
-	
+
 	for (const file of files) {
-		const id = Date.now().toString() + Math.random().toString(36).substring(2, 9);
+		const id =
+			Date.now().toString() + Math.random().toString(36).substring(2, 9);
 		const category = getFileCategory(file.type, file.name);
-		
+
 		const base: ChatUploadedFile = {
 			id,
 			name: file.name,
 			size: file.size,
 			type: file.type,
-			file
+			file,
 		};
-		
+
 		try {
-			if (category === 'image') {
+			if (category === "image") {
 				const preview = await readFileAsDataURL(file);
 				results.push({ ...base, preview });
-			} else if (category === 'text' || category === 'unknown') {
+			} else if (category === "text" || category === "unknown") {
 				const textContent = await readFileAsText(file);
 				results.push({ ...base, textContent });
-			} else if (category === 'pdf') {
+			} else if (category === "pdf") {
 				results.push(base);
-			} else if (category === 'audio') {
+			} else if (category === "audio") {
 				const preview = await readFileAsDataURL(file);
 				results.push({ ...base, preview });
 			} else {
 				results.push(base);
 			}
 		} catch (error) {
-			console.error('Error processing file:', file.name, error);
+			console.error("Error processing file:", file.name, error);
 			results.push(base);
 		}
 	}
-	
+
 	return results;
 }
-
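
Reviewer note on `getFileCategory` in `files.ts` above: the reformatting does not change behaviour, but the lookup order is easy to miss in the wrapped form. Each category matches on MIME type or extension, and categories are tested in the order image, pdf, audio, text. The sketch below reproduces that order with deliberately trimmed tables. The shortened constant names, `extensionOf`, and `categorize` are hypothetical and exist only for this illustration.

```typescript
// Illustrative sketch only: trimmed tables and hypothetical names
// (IMAGE_EXT, extensionOf, categorize) standing in for the real exports.
type FileCategory = "image" | "text" | "pdf" | "audio" | "unknown";

const IMAGE_EXT = [".png", ".svg"];
const IMAGE_MIME = ["image/png", "image/svg+xml"];
const PDF_EXT = [".pdf"];
const PDF_MIME = ["application/pdf"];
const AUDIO_EXT = [".mp3"];
const AUDIO_MIME = ["audio/mpeg"];
const TEXT_EXT = [".txt", ".md"];
const TEXT_MIME = ["text/plain", "application/json"];

// Same expression as the patch. With no "." in the name, lastIndexOf returns -1
// and slice(-1) yields the last character, which can never equal a dotted entry.
function extensionOf(fileName: string): string {
  return fileName.toLowerCase().slice(fileName.lastIndexOf("."));
}

function categorize(mimeType: string, fileName: string): FileCategory {
  const extension = extensionOf(fileName);
  if (IMAGE_MIME.includes(mimeType) || IMAGE_EXT.includes(extension)) return "image";
  if (PDF_MIME.includes(mimeType) || PDF_EXT.includes(extension)) return "pdf";
  if (AUDIO_MIME.includes(mimeType) || AUDIO_EXT.includes(extension)) return "audio";
  if (
    TEXT_MIME.includes(mimeType) ||
    TEXT_EXT.includes(extension) ||
    mimeType.startsWith("text/")
  ) {
    return "text";
  }
  return "unknown";
}

// A PDF MIME type paired with a .png name is classified as an image,
// because the image check runs first and matches on the extension.
const mixed = categorize("application/pdf", "scan.png");
```

Because earlier categories win, a mismatched MIME type and extension resolve to whichever category is tested first. Files that end up as `"unknown"` are still read as text by `processUploadedFiles`, as the hunk at line 197 shows.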
--- a/dashboard/src/routes/+page.svelte
+++ b/dashboard/src/routes/+page.svelte
@@ -18,6 +18,10 @@
 		selectedChatModel,
 	debugMode,
 	toggleDebugMode,
+	topologyOnlyMode,
+	toggleTopologyOnlyMode,
+	chatSidebarVisible,
+	toggleChatSidebarVisible,
 		type DownloadProgress,
 		type PlacementPreview
 	} from '$lib/stores/app.svelte';
@@ -37,6 +41,8 @@
 	const selectedModelId = $derived(selectedPreviewModelId());
 	const loadingPreviews = $derived(isLoadingPreviews());
 const debugEnabled = $derived(debugMode());
+const topologyOnlyEnabled = $derived(topologyOnlyMode());
+const sidebarVisible = $derived(chatSidebarVisible());

 	let mounted = $state(false);

@@ -45,6 +51,59 @@ const debugEnabled = $derived(debugMode());
 	let selectedSharding = $state<'Pipeline' | 'Tensor'>('Pipeline');
 	type InstanceMeta = 'MlxRing' | 'MlxIbv' | 'MlxJaccl';
 	
+	// Launch defaults persistence
+	const LAUNCH_DEFAULTS_KEY = 'exo-launch-defaults';
+	interface LaunchDefaults {
+		modelId: string | null;
+		sharding: 'Pipeline' | 'Tensor';
+		instanceType: InstanceMeta;
+		minNodes: number;
+	}
+	
+	function saveLaunchDefaults(): void {
+		const defaults: LaunchDefaults = {
+			modelId: selectedPreviewModelId(),
+			sharding: selectedSharding,
+			instanceType: selectedInstanceType,
+			minNodes: selectedMinNodes,
+		};
+		try {
+			localStorage.setItem(LAUNCH_DEFAULTS_KEY, JSON.stringify(defaults));
+		} catch (e) {
+			console.warn('Failed to save launch defaults:', e);
+		}
+	}
+	
+	function loadLaunchDefaults(): LaunchDefaults | null {
+		try {
+			const stored = localStorage.getItem(LAUNCH_DEFAULTS_KEY);
+			if (!stored) return null;
+			return JSON.parse(stored) as LaunchDefaults;
+		} catch (e) {
+			console.warn('Failed to load launch defaults:', e);
+			return null;
+		}
+	}
+	
+	function applyLaunchDefaults(availableModels: Array<{id: string}>, maxNodes: number): void {
+		const defaults = loadLaunchDefaults();
+		if (!defaults) return;
+		
+		// Apply sharding and instance type unconditionally
+		selectedSharding = defaults.sharding;
+		selectedInstanceType = defaults.instanceType;
+		
+		// Apply minNodes if valid (between 1 and maxNodes)
+		if (defaults.minNodes && defaults.minNodes >= 1 && defaults.minNodes <= maxNodes) {
+			selectedMinNodes = defaults.minNodes;
+		}
+		
+		// Only apply model if it exists in the available models
+		if (defaults.modelId && availableModels.some(m => m.id === defaults.modelId)) {
+			selectPreviewModel(defaults.modelId);
+		}
+	}
+	
 	let selectedInstanceType = $state<InstanceMeta>('MlxRing');
 	let selectedMinNodes = $state<number>(1);
 	let minNodesInitialized = $state(false);
@@ -292,6 +351,9 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 				const data = await response.json();
 				// API returns { data: [{ id, name }] } format
 				models = data.data || [];
+				// Restore last launch defaults if available
+				const currentNodeCount = topologyData() ? Object.keys(topologyData()!.nodes).length : 1;
+				applyLaunchDefaults(models, currentNodeCount);
 			}
 		} catch (error) {
 			console.error('Failed to fetch models:', error);
@@ -472,6 +534,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 				
 				const progress = parseDownloadProgress(downloadPayload);
 				if (progress) {
+					// Sum all values across nodes - each node downloads independently
 					totalBytes += progress.totalBytes;
 					downloadedBytes += progress.downloadedBytes;
 					totalSpeed += progress.speed;
@@ -489,13 +552,17 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 			return { isDownloading: false, progress: null, perNode: [] };
 		}

+		// ETA = total remaining bytes / total speed across all nodes
+		const remainingBytes = totalBytes - downloadedBytes;
+		const etaMs = totalSpeed > 0 ? (remainingBytes / totalSpeed) * 1000 : 0;
+
 		return {
 			isDownloading: true,
 			progress: {
 				totalBytes,
 				downloadedBytes,
 				speed: totalSpeed,
-				etaMs: totalSpeed > 0 ? ((totalBytes - downloadedBytes) / totalSpeed) * 1000 : 0,
+				etaMs,
 				percentage: totalBytes > 0 ? (downloadedBytes / totalBytes) * 100 : 0,
 				completedFiles,
 				totalFiles,
@@ -526,7 +593,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 		// Unwrap the instance
 		const [instanceTag, instance] = getTagged(instanceWrapped);
 		if (!instance || typeof instance !== 'object') {
-			return { isDownloading: false, progress: null, statusText: 'UNKNOWN', perNode: [] };
+			return { isDownloading: false, progress: null, statusText: 'PREPARING', perNode: [] };
 		}

 		const inst = instance as { shardAssignments?: { nodeToRunner?: Record<string, string>; runnerToShard?: Record<string, unknown>; modelId?: string } };
@@ -576,6 +643,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 					
 					const progress = parseDownloadProgress(downloadPayload);
 					if (progress) {
+						// Sum all values across nodes - each node downloads independently
 						totalBytes += progress.totalBytes;
 						downloadedBytes += progress.downloadedBytes;
 						totalSpeed += progress.speed;
@@ -596,13 +664,17 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 			return { isDownloading: false, progress: null, statusText: statusInfo.statusText, perNode: [] };
 		}

+		// ETA = total remaining bytes / total speed across all nodes
+		const remainingBytes = totalBytes - downloadedBytes;
+		const etaMs = totalSpeed > 0 ? (remainingBytes / totalSpeed) * 1000 : 0;
+
 		return {
 			isDownloading: true,
 			progress: {
 				totalBytes,
 				downloadedBytes,
 				speed: totalSpeed,
-				etaMs: totalSpeed > 0 ? ((totalBytes - downloadedBytes) / totalSpeed) * 1000 : 0,
+				etaMs,
 				percentage: totalBytes > 0 ? (downloadedBytes / totalBytes) * 100 : 0,
 				completedFiles,
 				totalFiles,
@@ -618,10 +690,12 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 	function getStatusColor(statusText: string): string {
 		switch (statusText) {
 			case 'FAILED': return 'text-red-400';
+			case 'SHUTDOWN': return 'text-gray-400';
 			case 'DOWNLOADING': return 'text-blue-400';
 			case 'LOADING': 
 			case 'WARMING UP': 
-			case 'WAITING': return 'text-yellow-400';
+			case 'WAITING':
+			case 'INITIALIZING': return 'text-yellow-400';
 			case 'RUNNING': return 'text-teal-400';
 			case 'READY': 
 			case 'LOADED': return 'text-green-400';
@@ -632,7 +706,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 	function deriveInstanceStatus(instanceWrapped: unknown): { statusText: string; statusClass: string } {
 		const [, instance] = getTagged(instanceWrapped);
 		if (!instance || typeof instance !== 'object') {
-			return { statusText: 'UNKNOWN', statusClass: 'inactive' };
+			return { statusText: 'PREPARING', statusClass: 'inactive' };
 		}
 		
 		const inst = instance as { shardAssignments?: { runnerToShard?: Record<string, unknown> } };
@@ -644,12 +718,15 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 				if (!r) return null;
 				const [kind] = getTagged(r);
 				const statusMap: Record<string, string> = {
+					RunnerWaitingForInitialization: 'WaitingForInitialization',
+					RunnerInitializingBackend: 'InitializingBackend',
 					RunnerWaitingForModel: 'WaitingForModel',
 					RunnerLoading: 'Loading',
 					RunnerLoaded: 'Loaded',
 					RunnerWarmingUp: 'WarmingUp',
 					RunnerReady: 'Ready',
 					RunnerRunning: 'Running',
+					RunnerShutdown: 'Shutdown',
 					RunnerFailed: 'Failed',
 				};
 				return kind ? statusMap[kind] || null : null;
@@ -658,14 +735,17 @@ function toggleInstanceDownloadDetails(nodeId: string): void {

 		const has = (s: string) => statuses.includes(s);

-		if (statuses.length === 0) return { statusText: 'UNKNOWN', statusClass: 'inactive' };
+		if (statuses.length === 0) return { statusText: 'PREPARING', statusClass: 'inactive' };
 		if (has('Failed')) return { statusText: 'FAILED', statusClass: 'failed' };
+		if (has('Shutdown')) return { statusText: 'SHUTDOWN', statusClass: 'inactive' };
 		if (has('Loading')) return { statusText: 'LOADING', statusClass: 'starting' };
 		if (has('WarmingUp')) return { statusText: 'WARMING UP', statusClass: 'starting' };
 		if (has('Running')) return { statusText: 'RUNNING', statusClass: 'running' };
 		if (has('Ready')) return { statusText: 'READY', statusClass: 'loaded' };
 		if (has('Loaded')) return { statusText: 'LOADED', statusClass: 'loaded' };
 		if (has('WaitingForModel')) return { statusText: 'WAITING', statusClass: 'starting' };
+		if (has('InitializingBackend')) return { statusText: 'INITIALIZING', statusClass: 'starting' };
+		if (has('WaitingForInitialization')) return { statusText: 'INITIALIZING', statusClass: 'starting' };

 		return { statusText: 'RUNNING', statusClass: 'active' };
 	}
@@ -815,7 +895,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 		const runnerEntries = Object.entries(runnerToShard).map(([runnerId, shardWrapped]) => {
 			const [tag, shard] = getTagged(shardWrapped);
 			const meta = (shard as { modelMeta?: { worldSize?: number; nLayers?: number; deviceRank?: number } } | undefined);
-			const deviceRank = (meta?.deviceRank as number | undefined) ?? 0;
+			const deviceRank = meta?.modelMeta?.deviceRank ?? 0;
 			return { runnerId, tag, deviceRank };
 		});

@@ -964,6 +1044,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {

 	function handleSliderMouseUp() {
 		isDraggingSlider = false;
+		saveLaunchDefaults();
 	}

 	// Handle touch events for mobile
@@ -983,6 +1064,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {

 	function handleSliderTouchEnd() {
 		isDraggingSlider = false;
+		saveLaunchDefaults();
 	}

 	const nodeCount = $derived(data ? Object.keys(data.nodes).length : 0);
@@ -1107,16 +1189,47 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 		<div class="shooting-star" style="top: 50%; left: 40%; --duration: 45s; --delay: 30s;"></div>
 	</div>

-	<HeaderNav showHome={chatStarted} onHome={handleGoHome} />
+	{#if !topologyOnlyEnabled}
+	<HeaderNav 
+		showHome={chatStarted} 
+		onHome={handleGoHome} 
+		showSidebarToggle={true}
+		sidebarVisible={sidebarVisible}
+		onToggleSidebar={toggleChatSidebarVisible}
+	/>
+	{/if}

 	<!-- Main Content -->
 	<main class="flex-1 flex overflow-hidden relative">
-		<!-- Left: Conversation History Sidebar (always visible) -->
+		<!-- Left: Conversation History Sidebar (hidden in topology-only mode or when toggled off) -->
+		{#if !topologyOnlyEnabled && sidebarVisible}
 		<div class="w-80 flex-shrink-0 border-r border-exo-yellow/10">
 			<ChatSidebar class="h-full" />
 		</div>
+		{/if}

-		{#if !chatStarted}
+		{#if topologyOnlyEnabled}
+			<!-- TOPOLOGY ONLY MODE: Full-screen topology -->
+			<div class="flex-1 flex flex-col min-h-0 min-w-0 p-4" in:fade={{ duration: 300 }}>
+				<div class="flex-1 relative bg-exo-dark-gray/40 rounded-lg overflow-hidden">
+					<TopologyGraph class="w-full h-full" highlightedNodes={highlightedNodes()} />
+					<!-- Exit topology-only mode button -->
+					<button
+						type="button"
+						onclick={toggleTopologyOnlyMode}
+						class="absolute bottom-4 right-4 p-2 rounded border border-exo-yellow/30 bg-exo-dark-gray/80 hover:border-exo-yellow/50 hover:bg-exo-dark-gray transition-colors cursor-pointer backdrop-blur-sm"
+						title="Exit topology only mode"
+					>
+						<svg class="w-5 h-5 text-exo-yellow" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2">
+							<circle cx="12" cy="5" r="2" fill="currentColor" />
+							<circle cx="5" cy="19" r="2" fill="currentColor" />
+							<circle cx="19" cy="19" r="2" fill="currentColor" />
+							<path stroke-linecap="round" d="M12 7v5m0 0l-5 5m5-5l5 5" />
+						</svg>
+					</button>
+				</div>
+			</div>
+		{:else if !chatStarted}
 			<!-- WELCOME STATE: Topology + Instance Controls (no left sidebar for cleaner look) -->
 			<div class="flex-1 flex overflow-visible relative" in:fade={{ duration: 300 }} out:fade={{ duration: 200 }}>
 				
@@ -1154,9 +1267,9 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 							<div class="flex-1 h-px bg-gradient-to-r from-exo-yellow/30 to-transparent"></div>
 						</div>
 						
-						<div 
+						<div
 							bind:this={instancesContainerRef}
-							class="max-h-72 space-y-3 overflow-y-auto"
+							class="max-h-72 xl:max-h-96 space-y-3 overflow-y-auto overflow-x-hidden py-px"
 						>
 								{#each Object.entries(instanceData) as [id, instance]}
 									{@const downloadInfo = getInstanceDownloadStatus(id, instance)}
@@ -1300,14 +1413,15 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 																			{:else}
 																				{#each nodeProg.progress.files as f}
 																					{@const filePercent = Math.min(100, Math.max(0, f.percentage ?? 0))}
+																					{@const isFileComplete = filePercent >= 100}
 																					<div class="rounded border border-exo-medium-gray/30 bg-exo-black/40 p-2">
 																						<div class="flex items-center justify-between text-[10px] font-mono text-exo-light-gray/90">
 																							<span class="truncate pr-2">{f.name}</span>
-																							<span class="text-white/80">{filePercent.toFixed(1)}%</span>
+																							<span class={isFileComplete ? 'text-green-400' : 'text-white/80'}>{filePercent.toFixed(1)}%</span>
 																						</div>
 																						<div class="relative h-1 bg-exo-black/60 rounded-sm overflow-hidden mt-1">
 																							<div 
-																								class="absolute inset-y-0 left-0 bg-gradient-to-r from-exo-yellow to-exo-yellow/70 transition-all duration-300"
+																								class="absolute inset-y-0 left-0 bg-gradient-to-r {isFileComplete ? 'from-green-500 to-green-400' : 'from-exo-yellow to-exo-yellow/70'} transition-all duration-300"
 																								style="width: {filePercent.toFixed(1)}%"
 																							></div>
 																						</div>
@@ -1408,6 +1522,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 												onclick={() => {
 													if (modelCanFit) {
 														selectPreviewModel(model.id);
+														saveLaunchDefaults();
 														isModelDropdownOpen = false;
 														modelDropdownSearch = '';
 													}
@@ -1441,7 +1556,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 								<div class="text-xs text-white/70 font-mono mb-2">Sharding:</div>
 								<div class="flex gap-2">
 									<button 
-										onclick={() => selectedSharding = 'Pipeline'}
+										onclick={() => { selectedSharding = 'Pipeline'; saveLaunchDefaults(); }}
 										class="flex items-center gap-2 py-2 px-4 text-sm font-mono border rounded transition-all duration-200 cursor-pointer {selectedSharding === 'Pipeline' ? 'bg-transparent text-exo-yellow border-exo-yellow' : 'bg-transparent text-white/70 border-exo-medium-gray/50 hover:border-exo-yellow/50'}"
 									>
 										<span class="w-4 h-4 rounded-full border-2 flex items-center justify-center {selectedSharding === 'Pipeline' ? 'border-exo-yellow' : 'border-exo-medium-gray'}">
@@ -1452,7 +1567,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 										Pipeline
 									</button>
 									<button 
-										onclick={() => selectedSharding = 'Tensor'}
+										onclick={() => { selectedSharding = 'Tensor'; saveLaunchDefaults(); }}
 										class="flex items-center gap-2 py-2 px-4 text-sm font-mono border rounded transition-all duration-200 cursor-pointer {selectedSharding === 'Tensor' ? 'bg-transparent text-exo-yellow border-exo-yellow' : 'bg-transparent text-white/70 border-exo-medium-gray/50 hover:border-exo-yellow/50'}"
 									>
 										<span class="w-4 h-4 rounded-full border-2 flex items-center justify-center {selectedSharding === 'Tensor' ? 'border-exo-yellow' : 'border-exo-medium-gray'}">
@@ -1470,7 +1585,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 								<div class="text-xs text-white/70 font-mono mb-2">Instance Type:</div>
 								<div class="flex gap-2">
 									<button 
-										onclick={() => selectedInstanceType = 'MlxRing'}
+										onclick={() => { selectedInstanceType = 'MlxRing'; saveLaunchDefaults(); }}
 										class="flex items-center gap-2 py-2 px-4 text-sm font-mono border rounded transition-all duration-200 cursor-pointer {selectedInstanceType === 'MlxRing' ? 'bg-transparent text-exo-yellow border-exo-yellow' : 'bg-transparent text-white/70 border-exo-medium-gray/50 hover:border-exo-yellow/50'}"
 									>
 										<span class="w-4 h-4 rounded-full border-2 flex items-center justify-center {selectedInstanceType === 'MlxRing' ? 'border-exo-yellow' : 'border-exo-medium-gray'}">
@@ -1481,7 +1596,7 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 										MLX Ring
 									</button>
 									<button 
-										onclick={() => selectedInstanceType = 'MlxIbv'}
+										onclick={() => { selectedInstanceType = 'MlxIbv'; saveLaunchDefaults(); }}
 										class="flex items-center gap-2 py-2 px-4 text-sm font-mono border rounded transition-all duration-200 cursor-pointer {selectedInstanceType === 'MlxIbv' ? 'bg-transparent text-exo-yellow border-exo-yellow' : 'bg-transparent text-white/70 border-exo-medium-gray/50 hover:border-exo-yellow/50'}"
 									>
 										<span class="w-4 h-4 rounded-full border-2 flex items-center justify-center {selectedInstanceType === 'MlxIbv' ? 'border-exo-yellow' : 'border-exo-medium-gray'}">
@@ -1611,13 +1726,13 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 					in:fade={{ duration: 300, delay: 100 }}
 				>
 					<div class="flex-1 overflow-y-auto px-8 py-6" bind:this={chatScrollRef}>
-						<div class="max-w-3xl mx-auto">
+						<div class="max-w-7xl mx-auto">
 							<ChatMessages scrollParent={chatScrollRef} />
 						</div>
 					</div>
 					
 					<div class="flex-shrink-0 px-8 pb-6 pt-4 bg-gradient-to-t from-exo-black via-exo-black to-transparent">
-						<div class="max-w-3xl mx-auto">
+						<div class="max-w-7xl mx-auto">
 							<ChatForm placeholder="Ask anything" showModelSelector={true} />
 						</div>
 					</div>
@@ -1655,10 +1770,10 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 							<!-- Panel Header -->
 							<div class="flex items-center gap-2 mb-4">
 								<div class="w-2 h-2 bg-exo-yellow rounded-full shadow-[0_0_8px_rgba(255,215,0,0.6)] animate-pulse"></div>
-								<h3 class="text-sm text-exo-yellow font-mono tracking-[0.2em] uppercase">Instances</h3>
+								<h3 class="text-xs text-exo-yellow font-mono tracking-[0.2em] uppercase">Instances</h3>
 								<div class="flex-1 h-px bg-gradient-to-r from-exo-yellow/30 to-transparent"></div>
 							</div>
-								<div class="space-y-3 max-h-72 overflow-y-auto pr-1">
+								<div class="space-y-3 max-h-72 xl:max-h-96 overflow-y-auto overflow-x-hidden py-px pr-1">
 									{#each Object.entries(instanceData) as [id, instance]}
 										{@const downloadInfo = getInstanceDownloadStatus(id, instance)}
 										{@const statusText = downloadInfo.statusText}
@@ -1701,28 +1816,28 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 											<div class="flex justify-between items-start mb-2 pl-2">
 												<div class="flex items-center gap-2">
 													<div class="w-1.5 h-1.5 {isDownloading ? 'bg-blue-400 animate-pulse' : isFailed ? 'bg-red-400' : isLoading ? 'bg-yellow-400 animate-pulse' : isReady ? 'bg-green-400' : 'bg-teal-400'} rounded-full shadow-[0_0_6px_currentColor]"></div>
-													<span class="text-exo-light-gray font-mono text-xs tracking-wider">{id.slice(0, 8).toUpperCase()}</span>
+													<span class="text-exo-light-gray font-mono text-sm tracking-wider">{id.slice(0, 8).toUpperCase()}</span>
 												</div>
 												<button 
 													onclick={() => deleteInstance(id)}
-													class="text-xs px-2 py-1 font-mono tracking-wider uppercase border border-red-500/30 text-red-400/80 hover:bg-red-500/20 hover:text-red-400 hover:border-red-500/50 transition-all duration-200 cursor-pointer"
+													class="text-xs px-2 py-1 font-mono tracking-wider uppercase border border-red-500/30 text-red-400 hover:bg-red-500/20 hover:text-red-400 hover:border-red-500/50 transition-all duration-200 cursor-pointer"
 												>
 													DELETE
 												</button>
 												</div>
 												<div class="pl-2">
-													<div class="text-exo-yellow text-sm font-mono tracking-wide truncate">{getInstanceModelId(instance)}</div>
+													<div class="text-exo-yellow text-xs font-mono tracking-wide truncate">{getInstanceModelId(instance)}</div>
 													<div class="text-white/60 text-xs font-mono">Strategy: <span class="text-white/80">{instanceInfo.sharding} ({instanceInfo.instanceType})</span></div>
 														{#if instanceModelId && instanceModelId !== 'Unknown' && instanceModelId !== 'Unknown Model'}
 															<a
-																class="inline-flex items-center gap-1 text-[10px] text-white/60 hover:text-exo-yellow transition-colors mt-0.5"
+																class="inline-flex items-center gap-1 text-[11px] text-white/60 hover:text-exo-yellow transition-colors mt-1"
 																href={`https://huggingface.co/${instanceModelId}`}
 																target="_blank"
 																rel="noreferrer noopener"
 																aria-label="View model on Hugging Face"
 															>
 																<span>Hugging Face</span>
-																<svg class="w-3 h-3" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
+																<svg class="w-3.5 h-3.5" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
 																	<path d="M14 3h7v7"/>
 																	<path d="M10 14l11-11"/>
 																	<path d="M21 14v6a1 1 0 0 1-1 1h-16a1 1 0 0 1-1-1v-16a1 1 0 0 1 1-1h6"/>
@@ -1733,68 +1848,84 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 														<div class="text-white/60 text-xs font-mono">{instanceInfo.nodeNames.join(', ')}</div>
 													{/if}
 													{#if debugEnabled && instanceConnections.length > 0}
-														<div class="mt-1 space-y-0.5">
-															{#each instanceConnections as conn}
-																<div class="text-[10px] leading-snug font-mono text-white/70">
-																	<span>{conn.from} -> {conn.to}: {conn.ip}</span>
-																	<span class="{conn.missingIface ? 'text-red-400' : 'text-white/60'}"> ({conn.ifaceLabel})</span>
-																</div>
-															{/each}
+													<div class="mt-2 space-y-1">
+														{#each instanceConnections as conn}
+															<div class="text-[11px] leading-snug font-mono text-white/70">
+																<span>{conn.from} -> {conn.to}: {conn.ip}</span>
+																<span class="{conn.missingIface ? 'text-red-400' : 'text-white/60'}"> ({conn.ifaceLabel})</span>
+															</div>
+														{/each}
+													</div>
+												{/if}
+												
+												<!-- Download Progress -->
+												{#if downloadInfo.isDownloading && downloadInfo.progress}
+													<div class="mt-2 space-y-1">
+														<div class="flex justify-between text-xs font-mono">
+															<span class="text-blue-400">{downloadInfo.progress.percentage.toFixed(1)}%</span>
+															<span class="text-exo-light-gray">{formatBytes(downloadInfo.progress.downloadedBytes)}/{formatBytes(downloadInfo.progress.totalBytes)}</span>
 														</div>
-													{/if}
-													
-													<!-- Download Progress -->
-													{#if downloadInfo.isDownloading && downloadInfo.progress}
-														<div class="mt-2 space-y-1">
-															<div class="flex justify-between text-sm font-mono">
-																<span class="text-blue-400">{downloadInfo.progress.percentage.toFixed(1)}%</span>
-																<span class="text-exo-light-gray">{formatBytes(downloadInfo.progress.downloadedBytes)}/{formatBytes(downloadInfo.progress.totalBytes)}</span>
-															</div>
-															<div class="relative h-1 bg-exo-black/60 rounded-sm overflow-hidden">
-																<div 
-																	class="absolute inset-y-0 left-0 bg-gradient-to-r from-blue-500 to-blue-400 transition-all duration-300"
-																	style="width: {downloadInfo.progress.percentage}%"
-																></div>
-															</div>
-															<div class="flex justify-between text-xs font-mono text-exo-light-gray">
-																<span>{formatSpeed(downloadInfo.progress.speed)}</span>
-																<span>ETA: {formatEta(downloadInfo.progress.etaMs)}</span>
-																<span>{downloadInfo.progress.completedFiles}/{downloadInfo.progress.totalFiles} files</span>
-															</div>
+														<div class="relative h-1.5 bg-exo-black/60 rounded-sm overflow-hidden">
+															<div 
+																class="absolute inset-y-0 left-0 bg-gradient-to-r from-blue-500 to-blue-400 transition-all duration-300"
+																style="width: {downloadInfo.progress.percentage}%"
+															></div>
 														</div>
-														{#if downloadInfo.perNode.length > 0}
-															<div class="mt-2 space-y-1.5 max-h-48 overflow-y-auto pr-1">
-																{#each downloadInfo.perNode as nodeProg}
-																	<div class="rounded border border-exo-medium-gray/40 bg-exo-black/30 p-2">
-																		<div class="flex items-center justify-between text-[11px] font-mono text-exo-light-gray mb-1">
+														<div class="flex justify-between text-xs font-mono text-exo-light-gray">
+															<span>{formatSpeed(downloadInfo.progress.speed)}</span>
+															<span>ETA: {formatEta(downloadInfo.progress.etaMs)}</span>
+															<span>{downloadInfo.progress.completedFiles}/{downloadInfo.progress.totalFiles} files</span>
+														</div>
+													</div>
+													{#if downloadInfo.perNode.length > 0}
+														<div class="mt-2 space-y-2 max-h-48 overflow-y-auto pr-1">
+															{#each downloadInfo.perNode as nodeProg}
+																{@const nodePercent = Math.min(100, Math.max(0, nodeProg.progress.percentage))}
+																{@const isExpanded = instanceDownloadExpandedNodes.has(nodeProg.nodeId)}
+																<div class="rounded border border-exo-medium-gray/40 bg-exo-black/30 p-2">
+																	<button
+																		type="button"
+																		class="w-full text-left space-y-1.5"
+																		onclick={() => toggleInstanceDownloadDetails(nodeProg.nodeId)}
+																	>
+																		<div class="flex items-center justify-between text-[11px] font-mono text-exo-light-gray">
 																			<span class="text-white/80 truncate pr-2">{nodeProg.nodeName}</span>
-																			<span class="text-blue-300">{Math.min(100, Math.max(0, nodeProg.progress.percentage)).toFixed(1)}%</span>
+																			<span class="flex items-center gap-1 text-blue-300">
+																				{nodePercent.toFixed(1)}%
+																				<svg class="w-3 h-3 text-exo-light-gray" viewBox="0 0 20 20" fill="none" stroke="currentColor" stroke-width="2">
+																					<path d="M6 8l4 4 4-4" class={isExpanded ? 'transform rotate-180 origin-center transition-transform duration-150' : 'transition-transform duration-150'}></path>
+																				</svg>
+																			</span>
 																		</div>
-																		<div class="relative h-1 bg-exo-black/60 rounded-sm overflow-hidden mb-1.5">
+																		<div class="relative h-1.5 bg-exo-black/60 rounded-sm overflow-hidden">
 																			<div 
-																				class="absolute inset-y-0 left-0 bg-blue-500/80 transition-all duration-300"
-																				style="width: {Math.min(100, Math.max(0, nodeProg.progress.percentage)).toFixed(1)}%"
+																				class="absolute inset-y-0 left-0 bg-gradient-to-r from-blue-500 to-blue-400 transition-all duration-300"
+																				style="width: {nodePercent.toFixed(1)}%"
 																			></div>
 																		</div>
-																		<div class="flex items-center justify-between text-[11px] font-mono text-exo-light-gray mb-1">
+																		<div class="flex items-center justify-between text-[11px] font-mono text-exo-light-gray">
 																			<span>{formatBytes(nodeProg.progress.downloadedBytes)} / {formatBytes(nodeProg.progress.totalBytes)}</span>
 																			<span>{formatSpeed(nodeProg.progress.speed)} • ETA {formatEta(nodeProg.progress.etaMs)}</span>
 																		</div>
-																	{#if nodeProg.progress.files.length > 0}
-																		{@const inProgressFiles = nodeProg.progress.files.filter(f => (f.percentage ?? 0) < 100)}
-																		{@const completedFiles = nodeProg.progress.files.filter(f => (f.percentage ?? 0) >= 100)}
-																		{#if inProgressFiles.length > 0}
-																			<div class="space-y-1">
-																				{#each inProgressFiles as f}
-																					<div class="text-[10px] font-mono text-exo-light-gray/80">
-																						<div class="flex items-center justify-between">
+																	</button>
+
+																	{#if isExpanded}
+																		<div class="mt-2 space-y-1.5">
+																			{#if nodeProg.progress.files.length === 0}
+																				<div class="text-[11px] font-mono text-exo-light-gray/70">No file details reported.</div>
+																			{:else}
+																				{#each nodeProg.progress.files as f}
+																					{@const filePercent = Math.min(100, Math.max(0, f.percentage ?? 0))}
+																					{@const isFileComplete = filePercent >= 100}
+																					<div class="rounded border border-exo-medium-gray/30 bg-exo-black/40 p-2">
+																						<div class="flex items-center justify-between text-[10px] font-mono text-exo-light-gray/90">
 																							<span class="truncate pr-2">{f.name}</span>
-																							<span class="text-white/70">{Math.min(100, Math.max(0, f.percentage)).toFixed(1)}%</span>
+																							<span class={isFileComplete ? 'text-green-400' : 'text-white/80'}>{filePercent.toFixed(1)}%</span>
 																						</div>
-																						<div class="relative h-1 bg-exo-black/50 rounded-sm overflow-hidden mt-0.5">
+																						<div class="relative h-1 bg-exo-black/60 rounded-sm overflow-hidden mt-1">
 																							<div 
-																								class="absolute inset-y-0 left-0 bg-gradient-to-r from-exo-yellow to-exo-yellow/70"
-																								style="width: {Math.min(100, Math.max(0, f.percentage)).toFixed(1)}%"
+																								class="absolute inset-y-0 left-0 bg-gradient-to-r {isFileComplete ? 'from-green-500 to-green-400' : 'from-exo-yellow to-exo-yellow/70'} transition-all duration-300"
+																								style="width: {filePercent.toFixed(1)}%"
 																							></div>
 																						</div>
 																						<div class="flex items-center justify-between text-[10px] text-exo-light-gray/70 mt-0.5">
@@ -1803,27 +1934,17 @@ function toggleInstanceDownloadDetails(nodeId: string): void {
 																						</div>
 																					</div>
 																				{/each}
-																			</div>
-																		{/if}
-																		{#if completedFiles.length > 0}
-																			<div class="pt-1 space-y-0.5">
-																				{#each completedFiles as f}
-																					<div class="text-[10px] font-mono text-exo-light-gray/70 flex items-center justify-between">
-																						<span class="truncate pr-2">{f.name}</span>
-																						<span class="text-white/60">100%</span>
-																					</div>
-																				{/each}
-																			</div>
-																		{/if}
+																			{/if}
+																		</div>
 																	{/if}
-																	</div>
-																{/each}
-															</div>
-														{/if}
-														<div class="text-sm text-blue-400 font-mono tracking-wider mt-1">DOWNLOADING</div>
-													{:else}
-														<div class="text-sm {getStatusColor(downloadInfo.statusText)} font-mono tracking-wider mt-1">{downloadInfo.statusText}</div>
+																</div>
+															{/each}
+														</div>
 													{/if}
+													<div class="text-xs text-blue-400 font-mono tracking-wider mt-1">DOWNLOADING</div>
+												{:else}
+													<div class="text-xs {getStatusColor(downloadInfo.statusText)} font-mono tracking-wider mt-1">{downloadInfo.statusText}</div>
+												{/if}
 												</div>
 											</div>
 										</div>
--- a/dashboard/src/routes/downloads/+page.svelte
+++ b/dashboard/src/routes/downloads/+page.svelte
@@ -345,13 +345,19 @@
 							<div class="rounded border border-exo-medium-gray/30 bg-exo-dark-gray/60 p-3 space-y-2">
 								<div class="flex items-center justify-between gap-3">
 									<div class="min-w-0 space-y-0.5">
-										<div class="text-sm font-mono text-white truncate">{model.prettyName ?? model.modelId}</div>
-										<div class="text-[11px] text-exo-light-gray font-mono truncate">
-											{model.modelId}
-										</div>
-										<div class="text-[11px] text-exo-light-gray font-mono">
-											{formatBytes(model.downloadedBytes)} / {formatBytes(model.totalBytes)}
-										</div>
+										<div 
+											class="text-xs font-mono text-white truncate"
+											title={model.prettyName ?? model.modelId}
+										>{model.prettyName ?? model.modelId}</div>
+										<div 
+											class="text-[10px] text-exo-light-gray font-mono truncate"
+											title={model.modelId}
+										>{model.modelId}</div>
+										{#if model.status !== 'completed'}
+											<div class="text-[11px] text-exo-light-gray font-mono">
+												{formatBytes(model.downloadedBytes)} / {formatBytes(model.totalBytes)}
+											</div>
+										{/if}
 									</div>
 									<div class="flex items-center gap-2">
 										<span class="text-xs font-mono {pct >= 100 ? 'text-green-400' : pct <= 0 ? 'text-red-400' : 'text-exo-yellow'}">
@@ -426,14 +432,14 @@
 <style>
 	.downloads-grid {
 		display: grid;
-		grid-template-columns: repeat(auto-fill, minmax(260px, 1fr));
+		grid-template-columns: repeat(auto-fill, minmax(320px, 1fr));
 	}
 	@media (min-width: 1024px) {
 		.downloads-grid {
 			grid-template-columns: repeat(3, minmax(0, 1fr));
 		}
 	}
-	@media (min-width: 1440px) {
+	@media (min-width: 1600px) {
 		.downloads-grid {
 			grid-template-columns: repeat(4, minmax(0, 1fr));
 		}
--- a/dashboard/vite.config.ts
+++ b/dashboard/vite.config.ts
@@ -1,16 +1,15 @@
-import tailwindcss from '@tailwindcss/vite';
-import { sveltekit } from '@sveltejs/kit/vite';
-import { defineConfig } from 'vite';
+import tailwindcss from "@tailwindcss/vite";
+import { sveltekit } from "@sveltejs/kit/vite";
+import { defineConfig } from "vite";

 export default defineConfig({
 	plugins: [tailwindcss(), sveltekit()],
 	server: {
 		proxy: {
-			'/v1': 'http://localhost:52415',
-			'/state': 'http://localhost:52415',
-			'/models': 'http://localhost:52415',
-			'/instance': 'http://localhost:52415'
-		}
-	}
+			"/v1": "http://localhost:52415",
+			"/state": "http://localhost:52415",
+			"/models": "http://localhost:52415",
+			"/instance": "http://localhost:52415",
+		},
+	},
 });
-
--- a/docs/api.md
+++ b/docs/api.md
@@ -0,0 +1,212 @@
+# EXO API – Technical Reference
+
+This document describes the REST API exposed by the **EXO** service, as implemented in:
+
+`src/exo/master/api.py`
+
+The API is used to manage model instances in the cluster, inspect cluster state, and perform inference using an OpenAI-compatible interface.
+
+Base URL example:
+
+```
+http://localhost:52415
+```
+
+## 1. General / Meta Endpoints
+
+### Get Master Node ID
+
+**GET** `/node_id`
+
+Returns the identifier of the current master node.
+
+**Response (example):**
+
+```json
+{
+  "node_id": "node-1234"
+}
+```
+
+### Get Cluster State
+
+**GET** `/state`
+
+Returns the current state of the cluster, including nodes and active instances.
+
+**Response:**
+JSON object describing topology, nodes, and instances.
+
+### Get Events
+
+**GET** `/events`
+
+Returns the list of internal events recorded by the master (mainly for debugging and observability).
+
+**Response:**
+Array of event objects.
+
+## 2. Model Instance Management
+
+### Create Instance
+
+**POST** `/instance`
+
+Creates a new model instance in the cluster.
+
+**Request body (example):**
+
+```json
+{
+  "instance": {
+    "model_id": "llama-3.2-1b",
+    "placement": { }
+  }
+}
+```
+
+**Response:**
+JSON description of the created instance.
+
+### Delete Instance
+
+**DELETE** `/instance/{instance_id}`
+
+Deletes an existing instance by ID.
+
+**Path parameters:**
+
+* `instance_id`: string, ID of the instance to delete
+
+**Response:**
+Status / confirmation JSON.
+
+### Get Instance
+
+**GET** `/instance/{instance_id}`
+
+Returns details of a specific instance.
+
+**Path parameters:**
+
+* `instance_id`: string
+
+**Response:**
+JSON description of the instance.
+
+### Preview Placements
+
+**GET** `/instance/previews?model_id=...`
+
+Returns possible placement previews for a given model.
+
+**Query parameters:**
+
+* `model_id`: string, required
+
+**Response:**
+Array of placement preview objects.
+
+### Compute Placement
+
+**GET** `/instance/placement`
+
+Computes a placement for a potential instance without creating it.
+
+**Query parameters (typical):**
+
+* `model_id`: string
+* `sharding`: string or config
+* `instance_meta`: JSON-encoded metadata
+* `min_nodes`: integer
+
+**Response:**
+JSON object describing the proposed placement / instance configuration.
+
+### Place Instance (Dry Operation)
+
+**POST** `/place_instance`
+
+Performs a placement operation for an instance (planning step), without necessarily creating it.
+
+**Request body:**
+JSON describing the instance to be placed.
+
+**Response:**
+Placement result.
+
+## 3. Models
+
+### List Models
+
+**GET** `/models`
+**GET** `/v1/models` (alias)
+
+Returns the list of available models and their metadata.
+
+**Response:**
+Array of model descriptors.
+
+## 4. Inference / Chat Completions
+
+### OpenAI-Compatible Chat Completions
+
+**POST** `/v1/chat/completions`
+
+Executes a chat completion request using an OpenAI-compatible schema. Supports streaming and non-streaming modes.
+
+**Request body (example):**
+
+```json
+{
+  "model": "llama-3.2-1b",
+  "messages": [
+    { "role": "system", "content": "You are a helpful assistant." },
+    { "role": "user", "content": "Hello" }
+  ],
+  "stream": false
+}
+```
+
+**Response:**
+OpenAI-compatible chat completion response.
+
+### Benchmarked Chat Completions
+
+**POST** `/bench/chat/completions`
+
+Same as `/v1/chat/completions`, but also returns performance and generation statistics.
+
+**Request body:**
+Same schema as `/v1/chat/completions`.
+
+**Response:**
+Chat completion plus benchmarking metrics.
+
+## 5. Complete Endpoint Summary
+
+```
+GET     /node_id
+GET     /state
+GET     /events
+
+POST    /instance
+GET     /instance/{instance_id}
+DELETE  /instance/{instance_id}
+
+GET     /instance/previews
+GET     /instance/placement
+POST    /place_instance
+
+GET     /models
+GET     /v1/models
+
+POST    /v1/chat/completions
+POST    /bench/chat/completions
+```
+
+## 6. Notes
+
+* The `/v1/chat/completions` endpoint is compatible with the OpenAI API format, so existing OpenAI clients can be pointed to EXO by changing the base URL.
+* The instance placement endpoints allow you to plan and preview cluster allocations before actually creating instances.
+* The `/events` and `/state` endpoints are primarily intended for operational visibility and debugging.
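As a companion to the `docs/api.md` notes above, here is a minimal sketch of what a client call to `/v1/chat/completions` might look like using only the Python standard library. The helper name `build_chat_request` is hypothetical and not part of EXO. The base URL and request body are taken from the examples in the document. The request is only constructed here; sending it requires a running node.

```python
import json
from urllib import request

# Base URL example from docs/api.md; adjust for your deployment.
BASE_URL = "http://localhost:52415"


def build_chat_request(
    model: str, user_text: str, base_url: str = BASE_URL
) -> request.Request:
    """Build (but do not send) a non-streaming chat completion request.

    Hypothetical helper for illustration only.
    """
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_text},
        ],
        "stream": False,
    }
    return request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("llama-3.2-1b", "Hello")

# To send it against a running node:
#     with request.urlopen(req) as resp:
#         print(json.load(resp))
```

Because the doc describes `/bench/chat/completions` as taking the same schema, the same payload should presumably work there by changing only the path.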
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -0,0 +1,64 @@
+# EXO Architecture Overview
+
+EXO uses an _Event Sourcing_ architecture and Erlang-style _message passing_. To facilitate this, we've written a channel library that extends anyio channels, with inspiration from tokio::sync::mpsc.
+
+Each logical module - designed to be functional independently of the others - communicates with the rest of the system by sending messages on topics.
+
+## Systems
+
+There are currently 5 major systems:
+
+- Master
+    
+    Executes placement and orders events through a single writer
+
+- Worker
+    
+    Schedules work on a node, gathers system information, etc.
+
+- Runner
+    
+    Executes inference jobs (for now) in an isolated process from the worker for fault-tolerance.
+
+- API
+    
+    Runs a Python web server that exposes state and commands to client applications
+
+- Election
+    
+    Implements a distributed algorithm for master election in unstable networking conditions
+
+## Topics
+
+There are currently 5 topics:
+
+- Commands
+
+    The API and Worker instruct the master through this topic when the event log isn't sufficient. Namely, placement and catch-up requests currently go through Commands.
+
+- Local Events
+
+    All nodes write events here; the master reads those events and orders them
+
+- Global Events
+
+    The master writes events here, all nodes read from this topic and fold the produced events into their `State`
+
+- Election Messages
+
+    Before establishing a cluster, nodes communicate here to negotiate a master node.
+
+- Connection Messages
+
+    The networking system writes mDNS-discovered hardware connections here.
+
+
+## Event Sourcing
+
+Much has been written about event sourcing; for our purposes, it lets us centralize the handling of faulty connections and message ACKing with the following model.
+
+Whenever a device produces side effects, it captures those side effects in an `Event`. `Event`s are then "applied" to their model of `State`, which is globally distributed across the cluster. Whenever a command is received, it is combined with state to produce side effects, captured in yet more events. The rule of thumb is "`Event`s are past tense, `Command`s are imperative". Telling a node to perform some action like "place this model" or "give me a copy of the event log" is represented by a command (the worker's `Task`s are also commands), while "this node is using 300GB of RAM" is an event. Notably, `Event`s SHOULD never cause side effects on their own. There are a few exceptions to this; we're still working out the specifics of generalizing the distributed event sourcing model to better suit our needs.
+
+## Purity
+
+A significant goal of the current design is to make data flow explicit. Classes should either represent simple data (`CamelCaseModel`s typically, and `TaggedModel`s for unions) or active `System`s (Erlang `Actor`s), with all transformations of that data being "referentially transparent" - destructure and construct new data, don't mutate in place. We have had varying degrees of success with this, and are still exploring where purity makes sense.
--- a/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-1-qwen3-235b.jpeg
+++ b/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-1-qwen3-235b.jpeg
--- a/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-2-deepseek-3.1-671b.jpeg
+++ b/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-2-deepseek-3.1-671b.jpeg
--- a/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-3-kimi-k2-thinking.jpeg
+++ b/docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-3-kimi-k2-thinking.jpeg
--- a/docs/imgs/exo-logo-black-bg.jpg
+++ b/docs/imgs/exo-logo-black-bg.jpg
--- a/docs/imgs/exo-logo-transparent-black-text.png
+++ b/docs/imgs/exo-logo-transparent-black-text.png
--- a/docs/imgs/exo-logo-transparent.png
+++ b/docs/imgs/exo-logo-transparent.png
--- a/docs/imgs/exo-rounded.png
+++ b/docs/imgs/exo-rounded.png
--- a/docs/imgs/exo-screenshot.jpg
+++ b/docs/imgs/exo-screenshot.jpg
--- a/docs/imgs/four-mac-studio-topology.png
+++ b/docs/imgs/four-mac-studio-topology.png
--- a/docs/imgs/macos-app-one-macbook.png
+++ b/docs/imgs/macos-app-one-macbook.png
--- a/flake.nix
+++ b/flake.nix
@@ -42,11 +42,22 @@
        };
        treefmtEval = inputs.treefmt-nix.lib.evalModule pkgs {
          projectRootFile = "flake.nix";
-          programs.ruff-format.enable = true;
-          programs.ruff-format.excludes = [ "rust/exo_pyo3_bindings/exo_pyo3_bindings.pyi" ];
-          programs.rustfmt.enable = true;
-          programs.rustfmt.package = (fenixToolchain system).rustfmt;
-          programs.nixpkgs-fmt.enable = true;
+          programs = {
+            nixpkgs-fmt.enable = true;
+            ruff-format = {
+              enable = true;
+              excludes = [ "rust/exo_pyo3_bindings/exo_pyo3_bindings.pyi" ];
+            };
+            rustfmt = {
+              enable = true;
+              package = (fenixToolchain system).rustfmt;
+            };
+            prettier = {
+              enable = true;
+              includes = [ "*.ts" ];
+            };
+            swift-format.enable = true;
+          };
        };
      in
      {
@@ -62,6 +73,9 @@
          packages =
            with pkgs;
            [
+              # FORMATTING
+              treefmtEval.config.build.wrapper
+
              # PYTHON
              python313
              uv
@@ -91,6 +105,10 @@
            ++ (pkgs.lib.optionals pkgs.stdenv.isLinux [
              # IFCONFIG
              unixtools.ifconfig
+
+              # Build dependencies for Linux
+              pkg-config
+              openssl
            ])
            ++ (pkgs.lib.optionals pkgs.stdenv.isDarwin [
              # MACMON
@@ -100,6 +118,11 @@
          shellHook = ''
            # PYTHON
            export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:${pkgs.python313}/lib"
+            ${pkgs.lib.optionalString pkgs.stdenv.isLinux ''
+              # Build environment for Linux
+              export PKG_CONFIG_PATH="${pkgs.openssl.dev}/lib/pkgconfig:$PKG_CONFIG_PATH"
+              export LD_LIBRARY_PATH="${pkgs.openssl.out}/lib:$LD_LIBRARY_PATH"
+            ''}
            echo
            echo "🍎🍎 Run 'just <recipe>' to get started"
            just --list
--- a/packaging/pyinstaller/exo.spec
+++ b/packaging/pyinstaller/exo.spec
@@ -0,0 +1,118 @@
+# -*- mode: python ; coding: utf-8 -*-
+
+import importlib.util
+import shutil
+from pathlib import Path
+
+from PyInstaller.utils.hooks import collect_submodules
+
+PROJECT_ROOT = Path.cwd()
+SOURCE_ROOT = PROJECT_ROOT / "src"
+ENTRYPOINT = SOURCE_ROOT / "exo" / "__main__.py"
+DASHBOARD_DIR = PROJECT_ROOT / "dashboard" / "build"
+EXO_SHARED_MODELS_DIR = SOURCE_ROOT / "exo" / "shared" / "models"
+
+if not ENTRYPOINT.is_file():
+    raise SystemExit(f"Unable to locate Exo entrypoint: {ENTRYPOINT}")
+
+if not DASHBOARD_DIR.is_dir():
+    raise SystemExit(f"Dashboard assets are missing: {DASHBOARD_DIR}")
+
+if not EXO_SHARED_MODELS_DIR.is_dir():
+    raise SystemExit(f"Shared model assets are missing: {EXO_SHARED_MODELS_DIR}")
+
+block_cipher = None
+
+
+def _module_directory(module_name: str) -> Path:
+    spec = importlib.util.find_spec(module_name)
+    if spec is None:
+        raise SystemExit(f"Module '{module_name}' is not available in the current environment.")
+    if spec.submodule_search_locations:
+        return Path(next(iter(spec.submodule_search_locations))).resolve()
+    if spec.origin:
+        return Path(spec.origin).resolve().parent
+    raise SystemExit(f"Unable to determine installation directory for '{module_name}'.")
+
+
+MLX_PACKAGE_DIR = _module_directory("mlx")
+MLX_LIB_DIR = MLX_PACKAGE_DIR / "lib"
+if not MLX_LIB_DIR.is_dir():
+    raise SystemExit(f"mlx Metal libraries are missing: {MLX_LIB_DIR}")
+
+
+def _safe_collect(package_name: str) -> list[str]:
+    try:
+        return collect_submodules(package_name)
+    except ImportError:
+        return []
+
+
+HIDDEN_IMPORTS = sorted(
+    set(
+        collect_submodules("mlx")
+        + _safe_collect("mlx_lm")
+        + _safe_collect("transformers")
+    )
+)
+
+DATAS: list[tuple[str, str]] = [
+    (str(DASHBOARD_DIR), "dashboard"),
+    (str(MLX_LIB_DIR), "mlx/lib"),
+    (str(EXO_SHARED_MODELS_DIR), "exo/shared/models"),
+]
+
+MACMON_PATH = shutil.which("macmon")
+if MACMON_PATH is None:
+    raise SystemExit(
+        "macmon binary not found in PATH. "
+        "Install it via: brew install macmon"
+    )
+
+BINARIES: list[tuple[str, str]] = [
+    (MACMON_PATH, "."),
+]
+
+a = Analysis(
+    [str(ENTRYPOINT)],
+    pathex=[str(SOURCE_ROOT)],
+    binaries=BINARIES,
+    datas=DATAS,
+    hiddenimports=HIDDEN_IMPORTS,
+    hookspath=[],
+    hooksconfig={},
+    runtime_hooks=[],
+    excludes=[],
+    win_no_prefer_redirects=False,
+    win_private_assemblies=False,
+    noarchive=False,
+)
+pyz = PYZ(a.pure, a.zipped_data, cipher=block_cipher)
+exe = EXE(
+    pyz,
+    a.scripts,
+    [],
+    exclude_binaries=True,
+    name="exo",
+    debug=False,
+    bootloader_ignore_signals=False,
+    strip=False,
+    upx=False,
+    console=True,
+    disable_windowed_traceback=False,
+    argv_emulation=False,
+    target_arch=None,
+    codesign_identity=None,
+    entitlements_file=None,
+)
+coll = COLLECT(
+    exe,
+    a.binaries,
+    a.zipfiles,
+    a.datas,
+    strip=False,
+    upx=False,
+    upx_exclude=[],
+    name="exo",
+)
+
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -5,8 +5,10 @@ description = "Exo"
 readme = "README.md"
 requires-python = ">=3.13"
 dependencies = [
+    "aiofiles>=24.1.0",
+    "aiohttp>=3.12.14",
+    "types-aiofiles>=24.1.0.20250708",
    "pydantic>=2.11.7",
-    "httpx>=0.28.1",
    "fastapi>=0.116.1",
    "filelock>=3.18.0",
    "rustworkx>=0.17.1",
@@ -15,10 +17,12 @@ dependencies = [
    "loguru>=0.7.3",
    "exo_pyo3_bindings", # rust bindings
    "anyio==4.11.0",
-    "mlx>=0.29.3",
+    "mlx>=0.30.1; sys_platform == 'darwin'",
+    "mlx[cpu]>=0.30.1; sys_platform == 'linux'",
    "mlx-lm>=0.28.3",
    "tiktoken>=0.12.0", # required for kimi k2 tokenizer
    "hypercorn>=0.18.0",
+    "openai-harmony>=0.0.8",
 ]

 [project.scripts]
@@ -29,9 +33,11 @@ exo = "exo.main:main"
 # dependencies only required for development
 [dependency-groups]
 dev = [
+    "pyinstaller>=6.17.0",
    "pytest>=8.4.0",
+    "pytest-asyncio>=1.0.0",
+    "pytest-env",
    "ruff>=0.11.13",
-    "trio>=0.32.0",
 ]

 # mlx[cuda] requires a newer version of mlx. the ideal on linux is: default to mlx[cpu] unless[cuda] specified.
@@ -64,7 +70,7 @@ build-backend = "uv_build"
 ###

 [tool.basedpyright]
-include = [".venv/lib/mlx", ".venv/lib/mlx_lm", "src"]
+include = [".venv/lib/mlx", ".venv/lib/mlx_lm", "src", "bench"]
 typeCheckingMode = "strict"
 failOnWarnings = true

@@ -108,4 +114,12 @@ extend-exclude = ["shared/protobufs/**", "*mlx_typings/**", "rust/exo_pyo3_bindi
 extend-select = ["I", "N", "B", "A", "PIE", "SIM"]

 [tool.pytest.ini_options]
-anyio_mode = "auto"
+pythonpath = "."
+asyncio_mode = "auto"
+markers = [
+    "slow: marks tests as slow (deselected by default)"
+]
+env = [
+  "EXO_TESTS=1"
+]
+addopts = "-m 'not slow'"
--- a/src/exo/main.py
+++ b/src/exo/main.py
@@ -1,4 +1,39 @@
+from __future__ import annotations
+
+import sys
+from collections.abc import Sequence
+from multiprocessing import freeze_support
+from typing import Final
+
 from exo.main import main

+INLINE_CODE_FLAG: Final[str] = "-c"
+
+
+def _maybe_run_inline_code(argv: Sequence[str]) -> bool:
+    """
+    Reproduce the bare minimum of Python's `-c` flag so multiprocessing
+    helper processes (for example the resource tracker) can execute.
+    """
+
+    try:
+        flag_index = argv.index(INLINE_CODE_FLAG)
+    except ValueError:
+        return False
+
+    code_index = flag_index + 1
+    if code_index >= len(argv):
+        return False
+
+    inline_code = argv[code_index]
+    sys.argv = ["-c", *argv[code_index + 1 :]]
+    namespace: dict[str, object] = {"__name__": "__main__"}
+    exec(inline_code, namespace, namespace)
+    return True
+
+
 if __name__ == "__main__":
+    if _maybe_run_inline_code(sys.argv):
+        sys.exit(0)
+    freeze_support()
    main()
--- a/src/exo/main.py
+++ b/src/exo/main.py
@@ -1,5 +1,6 @@
 import argparse
 import multiprocessing as mp
+import os
 import signal
 from dataclasses import dataclass, field
 from typing import Self
@@ -27,7 +28,7 @@ from exo.worker.main import Worker
@dataclass
 class Node:
    router: Router
-    worker: Worker
+    worker: Worker | None
    election: Election  # Every node participates in election, as we do want a node to become master even if it isn't a master candidate if no master candidates are present.
    election_result_receiver: Receiver[ElectionResult]
    master: Master | None
@@ -61,15 +62,19 @@ class Node:
        else:
            api = None

-        worker = Worker(
-            node_id,
-            session_id,
-            exo_shard_downloader(),
-            connection_message_receiver=router.receiver(topics.CONNECTION_MESSAGES),
-            global_event_receiver=router.receiver(topics.GLOBAL_EVENTS),
-            local_event_sender=router.sender(topics.LOCAL_EVENTS),
-            command_sender=router.sender(topics.COMMANDS),
-        )
+        if not args.no_worker:
+            worker = Worker(
+                node_id,
+                session_id,
+                exo_shard_downloader(),
+                connection_message_receiver=router.receiver(topics.CONNECTION_MESSAGES),
+                global_event_receiver=router.receiver(topics.GLOBAL_EVENTS),
+                local_event_sender=router.sender(topics.LOCAL_EVENTS),
+                command_sender=router.sender(topics.COMMANDS),
+            )
+        else:
+            worker = None
+
        # We start every node with a master
        master = Master(
            node_id,
@@ -99,8 +104,9 @@ class Node:
        async with self._tg as tg:
            signal.signal(signal.SIGINT, lambda _, __: self.shutdown())
            tg.start_soon(self.router.run)
-            tg.start_soon(self.worker.run)
            tg.start_soon(self.election.run)
+            if self.worker:
+                tg.start_soon(self.worker.run)
            if self.master:
                tg.start_soon(self.master.run)
            if self.api:
@@ -194,6 +200,7 @@ def main():
    # TODO: Refactor the current verbosity system
    logger_setup(EXO_LOG, args.verbosity)
    logger.info("Starting EXO")
+    logger.info(f"EXO_LIBP2P_NAMESPACE: {os.getenv('EXO_LIBP2P_NAMESPACE')}")

    node = anyio.run(Node.create, args)
    anyio.run(node.run)
@@ -207,6 +214,7 @@ class Args(CamelCaseModel):
    spawn_api: bool = False
    api_port: PositiveInt = 52415
    tb_only: bool = False
+    no_worker: bool = False

    @classmethod
    def parse(cls) -> Self:
@@ -244,6 +252,10 @@ class Args(CamelCaseModel):
            dest="api_port",
            default=52415,
        )
+        parser.add_argument(
+            "--no-worker",
+            action="store_true",
+        )

        args = parser.parse_args()
        return cls(**vars(args))  # pyright: ignore[reportAny] - We are intentionally validating here, we can't do it statically
--- a/src/exo/master/api.py
+++ b/src/exo/master/api.py
@@ -13,6 +13,12 @@ from hypercorn.asyncio import serve  # pyright: ignore[reportUnknownVariableType
 from hypercorn.config import Config
 from hypercorn.typing import ASGIFramework
 from loguru import logger
+from openai_harmony import (  # pyright: ignore[reportMissingTypeStubs]
+    HarmonyEncodingName,
+    Role,
+    StreamableParser,
+    load_harmony_encoding,
+)

 from exo.master.placement import place_instance as get_instance_placements
 from exo.shared.apply import apply
@@ -21,11 +27,16 @@ from exo.shared.logging import InterceptLogger
 from exo.shared.models.model_cards import MODEL_CARDS
 from exo.shared.models.model_meta import get_model_meta
 from exo.shared.types.api import (
+    BenchChatCompletionResponse,
+    BenchChatCompletionTaskParams,
+    ChatCompletionChoice,
    ChatCompletionMessage,
    ChatCompletionResponse,
    CreateInstanceParams,
    CreateInstanceResponse,
    DeleteInstanceResponse,
+    FinishReason,
+    GenerationStats,
    ModelList,
    ModelListModel,
    PlaceInstanceParams,
@@ -56,7 +67,7 @@ from exo.utils.channels import Receiver, Sender, channel
 from exo.utils.dashboard_path import find_dashboard
 from exo.utils.event_buffer import OrderedBuffer

-HIDE_THINKING = False
+encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)


 def chunk_to_response(
@@ -161,7 +172,10 @@ class API:
        self.app.delete("/instance/{instance_id}")(self.delete_instance)
        self.app.get("/models")(self.get_models)
        self.app.get("/v1/models")(self.get_models)
-        self.app.post("/v1/chat/completions")(self.chat_completions)
+        self.app.post("/v1/chat/completions", response_model=None)(
+            self.chat_completions
+        )
+        self.app.post("/bench/chat/completions")(self.bench_chat_completions)
        self.app.get("/state")(lambda: self.state)
        self.app.get("/events")(lambda: self._event_log)

@@ -177,17 +191,32 @@ class API:
        return CreateInstanceResponse(
            message="Command received.",
            command_id=command.command_id,
+            model_meta=command.model_meta,
        )

    async def create_instance(
        self, payload: CreateInstanceParams
    ) -> CreateInstanceResponse:
-        command = CreateInstance(instance=payload.instance)
+        instance = payload.instance
+        model_meta = await resolve_model_meta(instance.shard_assignments.model_id)
+        required_memory = model_meta.storage_size
+        available_memory = self._calculate_total_available_memory()
+
+        if required_memory > available_memory:
+            raise HTTPException(
+                status_code=400,
+                detail=f"Insufficient memory to create instance. Required: {required_memory.in_gb:.1f}GB, Available: {available_memory.in_gb:.1f}GB",
+            )
+
+        command = CreateInstance(
+            instance=instance,
+        )
        await self._send(command)

        return CreateInstanceResponse(
            message="Command received.",
            command_id=command.command_id,
+            model_meta=model_meta,
        )

    async def get_placement(
@@ -207,6 +236,7 @@ class API:
                    instance_meta=instance_meta,
                    min_nodes=min_nodes,
                ),
+                node_profiles=self.state.node_profiles,
                topology=self.state.topology,
                current_instances=self.state.instances,
            )
@@ -262,6 +292,7 @@ class API:
                            instance_meta=instance_meta,
                            min_nodes=min_nodes,
                        ),
+                        node_profiles=self.state.node_profiles,
                        topology=self.state.topology,
                        current_instances=self.state.instances,
                    )
@@ -352,32 +383,52 @@ class API:
            instance_id=instance_id,
        )

-    async def _generate_chat_stream(
-        self, command_id: CommandId
-    ) -> AsyncGenerator[str, None]:
-        """Generate chat completion stream as JSON strings."""
+    async def _process_gpt_oss(self, token_chunks: Receiver[TokenChunk]):
+        stream = StreamableParser(encoding, role=Role.ASSISTANT)
+        thinking = False
+
+        async for chunk in token_chunks:
+            stream.process(chunk.token_id)
+
+            delta = stream.last_content_delta
+            ch = stream.current_channel
+
+            if ch == "analysis" and not thinking:
+                thinking = True
+                yield chunk.model_copy(update={"text": "<think>"})
+
+            if ch != "analysis" and thinking:
+                thinking = False
+                yield chunk.model_copy(update={"text": "</think>"})
+
+            if delta:
+                yield chunk.model_copy(update={"text": delta})
+
+            if chunk.finish_reason is not None:
+                if thinking:
+                    yield chunk.model_copy(update={"text": "</think>"})
+                yield chunk
+                break
+
+    async def _chat_chunk_stream(
+        self, command_id: CommandId, parse_gpt_oss: bool
+    ) -> AsyncGenerator[TokenChunk, None]:
+        """Yield `TokenChunk`s for a given command until completion."""

        try:
            self._chat_completion_queues[command_id], recv = channel[TokenChunk]()

-            is_thinking = False
            with recv as token_chunks:
-                async for chunk in token_chunks:
-                    if HIDE_THINKING:
-                        if chunk.text == "<think>":
-                            is_thinking = True
-                        if chunk.text == "</think>":
-                            is_thinking = False
-                    chunk_response: ChatCompletionResponse = chunk_to_response(
-                        chunk, command_id
-                    )
-                    if not (is_thinking and HIDE_THINKING):
-                        logger.debug(f"chunk_response: {chunk_response}")
-                        yield f"data: {chunk_response.model_dump_json()}\n\n"
-
-                    if chunk.finish_reason is not None:
-                        yield "data: [DONE]\n\n"
-                        break
+                if parse_gpt_oss:
+                    async for chunk in self._process_gpt_oss(token_chunks):
+                        yield chunk
+                        if chunk.finish_reason is not None:
+                            break
+                else:
+                    async for chunk in token_chunks:
+                        yield chunk
+                        if chunk.finish_reason is not None:
+                            break

        except anyio.get_cancelled_exc_class():
            # TODO: TaskCancelled
@@ -392,6 +443,98 @@ class API:
            await self._send(command)
            del self._chat_completion_queues[command_id]

+    async def _generate_chat_stream(
+        self, command_id: CommandId, parse_gpt_oss: bool
+    ) -> AsyncGenerator[str, None]:
+        """Generate chat completion stream as JSON strings."""
+
+        async for chunk in self._chat_chunk_stream(command_id, parse_gpt_oss):
+            chunk_response: ChatCompletionResponse = chunk_to_response(
+                chunk, command_id
+            )
+            logger.debug(f"chunk_response: {chunk_response}")
+
+            yield f"data: {chunk_response.model_dump_json()}\n\n"
+
+            if chunk.finish_reason is not None:
+                yield "data: [DONE]\n\n"
+
+    async def _collect_chat_completion(
+        self, command_id: CommandId, parse_gpt_oss: bool
+    ) -> ChatCompletionResponse:
+        """Collect all token chunks for a chat completion and return a single response."""
+
+        text_parts: list[str] = []
+        model: str | None = None
+        finish_reason: FinishReason | None = None
+
+        async for chunk in self._chat_chunk_stream(command_id, parse_gpt_oss):
+            if model is None:
+                model = chunk.model
+
+            text_parts.append(chunk.text)
+
+            if chunk.finish_reason is not None:
+                finish_reason = chunk.finish_reason
+
+        combined_text = "".join(text_parts)
+        assert model is not None
+
+        return ChatCompletionResponse(
+            id=command_id,
+            created=int(time.time()),
+            model=model,
+            choices=[
+                ChatCompletionChoice(
+                    index=0,
+                    message=ChatCompletionMessage(
+                        role="assistant",
+                        content=combined_text,
+                    ),
+                    finish_reason=finish_reason,
+                )
+            ],
+        )
+
+    async def _collect_chat_completion_with_stats(
+        self, command_id: CommandId, parse_gpt_oss: bool
+    ) -> BenchChatCompletionResponse:
+        text_parts: list[str] = []
+        model: str | None = None
+        finish_reason: FinishReason | None = None
+
+        stats: GenerationStats | None = None
+
+        async for chunk in self._chat_chunk_stream(command_id, parse_gpt_oss):
+            if model is None:
+                model = chunk.model
+
+            text_parts.append(chunk.text)
+            stats = chunk.stats or stats
+
+            if chunk.finish_reason is not None:
+                finish_reason = chunk.finish_reason
+
+        combined_text = "".join(text_parts)
+        assert model is not None
+
+        resp = BenchChatCompletionResponse(
+            id=command_id,
+            created=int(time.time()),
+            model=model,
+            choices=[
+                ChatCompletionChoice(
+                    index=0,
+                    message=ChatCompletionMessage(
+                        role="assistant", content=combined_text
+                    ),
+                    finish_reason=finish_reason,
+                )
+            ],
+            generation_stats=stats,
+        )
+        return resp
+
    async def _trigger_notify_user_to_download_model(self, model_id: str) -> None:
        logger.warning(
            "TODO: we should send a notification to the user to download the model"
@@ -399,10 +542,12 @@ class API:

    async def chat_completions(
        self, payload: ChatCompletionTaskParams
-    ) -> StreamingResponse:
-        """Handle chat completions with proper streaming response."""
+    ) -> ChatCompletionResponse | StreamingResponse:
+        """Handle chat completions, supporting both streaming and non-streaming responses."""
        model_meta = await resolve_model_meta(payload.model)
        payload.model = model_meta.model_id
+        parse_gpt_oss = "gpt-oss" in model_meta.model_id.lower()
+        logger.info(f"{parse_gpt_oss=}")

        if not any(
            instance.shard_assignments.model_id == payload.model
@@ -417,18 +562,47 @@ class API:
            request_params=payload,
        )
        await self._send(command)
-        return StreamingResponse(
-            self._generate_chat_stream(command.command_id),
-            media_type="text/event-stream",
+        if payload.stream:
+            return StreamingResponse(
+                self._generate_chat_stream(command.command_id, parse_gpt_oss),
+                media_type="text/event-stream",
+            )
+
+        return await self._collect_chat_completion(command.command_id, parse_gpt_oss)
+
+    async def bench_chat_completions(
+        self, payload: BenchChatCompletionTaskParams
+    ) -> BenchChatCompletionResponse:
+        model_meta = await resolve_model_meta(payload.model)
+        parse_gpt_oss = "gpt-oss" in model_meta.model_id.lower()
+        payload.model = model_meta.model_id
+
+        if not any(
+            instance.shard_assignments.model_id == payload.model
+            for instance in self.state.instances.values()
+        ):
+            await self._trigger_notify_user_to_download_model(payload.model)
+            raise HTTPException(
+                status_code=404, detail=f"No instance found for model {payload.model}"
+            )
+
+        payload.stream = False
+
+        command = ChatCompletion(request_params=payload)
+        await self._send(command)
+
+        response = await self._collect_chat_completion_with_stats(
+            command.command_id,
+            parse_gpt_oss,
        )
+        return response

    def _calculate_total_available_memory(self) -> Memory:
        """Calculate total available memory across all nodes in bytes."""
        total_available = Memory()

-        for node in self.state.topology.list_nodes():
-            if node.node_profile is not None:
-                total_available += node.node_profile.memory.ram_available
+        for profile in self.state.node_profiles.values():
+            total_available += profile.memory.ram_available

        return total_available

@@ -442,6 +616,8 @@ class API:
                    name=card.name,
                    description=card.description,
                    tags=card.tags,
+                    storage_size_megabytes=int(card.metadata.storage_size.in_mb),
+                    supports_tensor=card.metadata.supports_tensor,
                )
                for card in MODEL_CARDS.values()
            ]
@@ -458,7 +634,7 @@ class API:
        async with create_task_group() as tg:
            self._tg = tg
            logger.info("Starting API")
-            tg.start_soon(self._applystate)
+            tg.start_soon(self._apply_state)
            tg.start_soon(self._pause_on_new_election)
            print_startup_banner(self.port)
            await serve(
@@ -470,7 +646,7 @@ class API:
        self.command_sender.close()
        self.global_event_receiver.close()

-    async def _applystate(self):
+    async def _apply_state(self):
        with self.global_event_receiver as events:
            async for f_event in events:
                if f_event.origin != self.session_id.master_node_id:
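
Reviewer note on `_process_gpt_oss` in the api.py changes above: the channel handling reduces to a small state machine that opens a `<think>` tag when the parser enters the `analysis` channel and closes it when the parser leaves that channel or the stream ends. The sketch below is a minimal standalone illustration, not the code in this diff. It assumes a hypothetical `events` iterable of `(current_channel, last_content_delta)` pairs standing in for the `StreamableParser` state after each token, and the function name `wrap_analysis_in_think_tags` is invented for the example.

```python
from collections.abc import Iterable, Iterator
from typing import Optional


def wrap_analysis_in_think_tags(
    events: Iterable[tuple[Optional[str], str]],
) -> Iterator[str]:
    """Yield text pieces, wrapping "analysis" channel output in <think> tags.

    `events` is a hypothetical stand-in for the parser state after each
    token: (current_channel, last_content_delta).
    """
    thinking = False
    for channel, delta in events:
        # Entering the analysis channel opens the tag once.
        if channel == "analysis" and not thinking:
            thinking = True
            yield "<think>"
        # Leaving the analysis channel closes it before any further text.
        if channel != "analysis" and thinking:
            thinking = False
            yield "</think>"
        if delta:
            yield delta
    # The stream ended while still thinking: close the dangling tag.
    if thinking:
        yield "</think>"


events = [("analysis", "plan"), ("analysis", " more"), ("final", "Hi"), ("final", "!")]
result = "".join(wrap_analysis_in_think_tags(events))
```

One difference from the diff worth keeping in mind: the real method closes the tag when it sees a chunk with a `finish_reason`, whereas this sketch closes it when the iterable is exhausted.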
--- a/src/exo/master/main.py
+++ b/src/exo/master/main.py
@@ -158,6 +158,7 @@ class Master:
                                command,
                                self.state.topology,
                                self.state.instances,
+                                self.state.node_profiles,
                            )
                            transition_events = get_transition_events(
                                self.state.instances, placement
@@ -200,9 +201,7 @@ class Master:
    async def _plan(self) -> None:
        while True:
            # kill broken instances
-            connected_node_ids = set(
-                [x.node_id for x in self.state.topology.list_nodes()]
-            )
+            connected_node_ids = set(self.state.topology.list_nodes())
            for instance_id, instance in self.state.instances.items():
                for node_id in instance.shard_assignments.node_to_runner:
                    if node_id not in connected_node_ids:
--- a/src/exo/master/placement.py
+++ b/src/exo/master/placement.py
@@ -6,10 +6,11 @@ from typing import Sequence
 from loguru import logger

 from exo.master.placement_utils import (
+    NodeWithProfile,
    filter_cycles_by_memory,
-    get_hosts_from_subgraph,
-    get_mlx_ibv_coordinators,
-    get_mlx_ibv_devices_matrix,
+    get_mlx_jaccl_coordinators,
+    get_mlx_jaccl_devices_matrix,
+    get_mlx_ring_hosts_by_node,
    get_shard_assignments,
    get_smallest_cycles,
 )
@@ -19,10 +20,11 @@ from exo.shared.types.commands import (
    DeleteInstance,
    PlaceInstance,
 )
-from exo.shared.types.common import Host
+from exo.shared.types.common import NodeId
 from exo.shared.types.events import Event, InstanceCreated, InstanceDeleted
 from exo.shared.types.memory import Memory
-from exo.shared.types.topology import NodeInfo
+from exo.shared.types.models import ModelId
+from exo.shared.types.profiling import NodePerformanceProfile
 from exo.shared.types.worker.instances import (
    Instance,
    InstanceId,
@@ -30,6 +32,7 @@ from exo.shared.types.worker.instances import (
    MlxJacclInstance,
    MlxRingInstance,
 )
+from exo.shared.types.worker.shards import Sharding


 def random_ephemeral_port() -> int:
@@ -51,33 +54,54 @@ def place_instance(
    command: PlaceInstance,
    topology: Topology,
    current_instances: Mapping[InstanceId, Instance],
+    node_profiles: Mapping[NodeId, NodePerformanceProfile],
 ) -> dict[InstanceId, Instance]:
    all_nodes = list(topology.list_nodes())

-    logger.info("finding cycles:")
-    cycles = topology.get_cycles()
-    singleton_cycles = [[node] for node in all_nodes]
-    candidate_cycles = list(
-        filter(lambda it: len(it) >= command.min_nodes, cycles + singleton_cycles)
-    )
+    cycles = topology.get_cycles() + [[node] for node in all_nodes]
+    candidate_cycles = list(filter(lambda it: len(it) >= command.min_nodes, cycles))
    cycles_with_sufficient_memory = filter_cycles_by_memory(
-        candidate_cycles, command.model_meta.storage_size
+        candidate_cycles, node_profiles, command.model_meta.storage_size
    )
-    if not cycles_with_sufficient_memory:
+    if len(cycles_with_sufficient_memory) == 0:
        raise ValueError("No cycles found with sufficient memory")

+    if command.sharding == Sharding.Tensor:
+        if not command.model_meta.supports_tensor:
+            raise ValueError(
+                f"Requested Tensor sharding but this model does not support tensor parallelism: {command.model_meta.model_id}"
+            )
+        # TODO: the condition here for tensor parallel is not correct, but it works well enough for now.
+        cycles_with_sufficient_memory = [
+            cycle
+            for cycle in cycles_with_sufficient_memory
+            if command.model_meta.hidden_size % len(cycle) == 0
+        ]
+        if not cycles_with_sufficient_memory:
+            raise ValueError(
+                f"No candidate cycles support tensor sharding for model with hidden_size {command.model_meta.hidden_size}"
+            )
+    if command.sharding == Sharding.Pipeline and command.model_meta.model_id == ModelId(
+        "mlx-community/DeepSeek-V3.1-8bit"
+    ):
+        raise ValueError(
+            "Pipeline parallelism is not supported for DeepSeek V3.1 (8-bit)"
+        )
+
    smallest_cycles = get_smallest_cycles(cycles_with_sufficient_memory)

    smallest_tb_cycles = [
        cycle
        for cycle in smallest_cycles
-        if topology.get_subgraph_from_nodes(cycle).is_thunderbolt_cycle(cycle)
+        if topology.get_subgraph_from_nodes(
+            [node.node_id for node in cycle]
+        ).is_thunderbolt_cycle([node.node_id for node in cycle])
    ]

    if smallest_tb_cycles != []:
        smallest_cycles = smallest_tb_cycles

-    cycles_with_leaf_nodes: list[list[NodeInfo]] = [
+    cycles_with_leaf_nodes: list[list[NodeWithProfile]] = [
        cycle
        for cycle in smallest_cycles
        if any(topology.node_is_leaf(node.node_id) for node in cycle)
@@ -86,11 +110,7 @@ def place_instance(
    selected_cycle = max(
        cycles_with_leaf_nodes if cycles_with_leaf_nodes != [] else smallest_cycles,
        key=lambda cycle: sum(
-            (
-                node.node_profile.memory.ram_available
-                for node in cycle
-                if node.node_profile is not None
-            ),
+            (node.node_profile.memory.ram_available for node in cycle),
            start=Memory(),
        ),
    )
@@ -99,14 +119,16 @@ def place_instance(
        command.model_meta, selected_cycle, command.sharding
    )

-    cycle_digraph: Topology = topology.get_subgraph_from_nodes(selected_cycle)
+    cycle_digraph: Topology = topology.get_subgraph_from_nodes(
+        [node.node_id for node in selected_cycle]
+    )

    instance_id = InstanceId()
    target_instances = dict(deepcopy(current_instances))

    if len(selected_cycle) == 1:
        logger.warning(
-            "You have likely selected ibv for a single node instance; falling back to MlxRing"
+            "You have likely selected jaccl for a single node instance; falling back to MlxRing"
        )

        command.instance_meta = InstanceMeta.MlxRing
@@ -114,33 +136,32 @@ def place_instance(
    # TODO: Single node instances
    match command.instance_meta:
        case InstanceMeta.MlxJaccl:
-            mlx_ibv_devices = get_mlx_ibv_devices_matrix(
-                selected_cycle,
+            mlx_jaccl_devices = get_mlx_jaccl_devices_matrix(
                cycle_digraph,
            )
-            mlx_ibv_coordinators = get_mlx_ibv_coordinators(
-                selected_cycle,
+            mlx_jaccl_coordinators = get_mlx_jaccl_coordinators(
+                coordinator=selected_cycle[0].node_id,
                coordinator_port=random_ephemeral_port(),
                cycle_digraph=cycle_digraph,
            )
            target_instances[instance_id] = MlxJacclInstance(
                instance_id=instance_id,
                shard_assignments=shard_assignments,
-                ibv_devices=mlx_ibv_devices,
-                ibv_coordinators=mlx_ibv_coordinators,
+                jaccl_devices=mlx_jaccl_devices,
+                jaccl_coordinators=mlx_jaccl_coordinators,
            )
        case InstanceMeta.MlxRing:
-            hosts: list[Host] = get_hosts_from_subgraph(cycle_digraph)
+            ephemeral_port = random_ephemeral_port()
+            hosts_by_node = get_mlx_ring_hosts_by_node(
+                selected_cycle=selected_cycle,
+                cycle_digraph=cycle_digraph,
+                ephemeral_port=ephemeral_port,
+            )
            target_instances[instance_id] = MlxRingInstance(
                instance_id=instance_id,
                shard_assignments=shard_assignments,
-                hosts=[
-                    Host(
-                        ip=host.ip,
-                        port=random_ephemeral_port(),
-                    )
-                    for host in hosts
-                ],
+                hosts_by_node=hosts_by_node,
+                ephemeral_port=ephemeral_port,
            )

    return target_instances
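
Reviewer note on the cycle filtering in `place_instance` above: for tensor sharding the candidates are narrowed in three steps (enough combined memory, `hidden_size` divisible by the cycle length, then smallest cycle size). The following is a simplified sketch of that ordering only. It uses plain strings for node IDs and gigabyte integers for memory, and `pick_tensor_cycles` and its parameters are hypothetical names, not part of this diff.

```python
def pick_tensor_cycles(
    cycles: list[list[str]],
    available_gb: dict[str, int],
    required_gb: int,
    hidden_size: int,
    min_nodes: int = 1,
) -> list[list[str]]:
    """Return the smallest cycles that fit the model and divide hidden_size."""
    candidates = [c for c in cycles if len(c) >= min_nodes]
    # Step 1: keep cycles whose combined available memory fits the model.
    with_memory = [
        c for c in candidates if sum(available_gb[n] for n in c) >= required_gb
    ]
    if not with_memory:
        raise ValueError("No cycles found with sufficient memory")
    # Step 2: tensor sharding needs hidden_size divisible by the node count.
    divisible = [c for c in with_memory if hidden_size % len(c) == 0]
    if not divisible:
        raise ValueError("No candidate cycles support tensor sharding")
    # Step 3: prefer the cycles with the fewest nodes.
    smallest = min(len(c) for c in divisible)
    return [c for c in divisible if len(c) == smallest]


cycles = [["a"], ["b"], ["c"], ["a", "b"], ["a", "b", "c"]]
available = {"a": 16, "b": 16, "c": 16}
picked = pick_tensor_cycles(cycles, available, required_gb=30, hidden_size=4096)
```

Here the single-node cycles fail the memory check, and the three-node cycle is dropped because 4096 is not divisible by 3, so only `["a", "b"]` remains.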
--- a/src/exo/master/placement_utils.py
+++ b/src/exo/master/placement_utils.py
@@ -1,5 +1,4 @@
-from collections.abc import Generator
-from typing import TypeGuard, cast
+from collections.abc import Generator, Mapping

 from loguru import logger
 from pydantic import BaseModel
@@ -9,7 +8,7 @@ from exo.shared.types.common import Host, NodeId
 from exo.shared.types.memory import Memory
 from exo.shared.types.models import ModelMetadata
 from exo.shared.types.profiling import NodePerformanceProfile
-from exo.shared.types.topology import NodeInfo
+from exo.shared.types.topology import RDMAConnection, SocketConnection
 from exo.shared.types.worker.runners import RunnerId, ShardAssignments
 from exo.shared.types.worker.shards import (
    PipelineShardMetadata,
@@ -24,27 +23,32 @@ class NodeWithProfile(BaseModel):
    node_profile: NodePerformanceProfile


-def narrow_all_nodes(nodes: list[NodeInfo]) -> TypeGuard[list[NodeWithProfile]]:
-    return all(node.node_profile is not None for node in nodes)
-
-
 def filter_cycles_by_memory(
-    cycles: list[list[NodeInfo]], required_memory: Memory
-) -> list[list[NodeInfo]]:
-    filtered_cycles: list[list[NodeInfo]] = []
+    cycles: list[list[NodeId]],
+    node_profiles: Mapping[NodeId, NodePerformanceProfile],
+    required_memory: Memory,
+) -> list[list[NodeWithProfile]]:
+    filtered_cycles: list[list[NodeWithProfile]] = []
    for cycle in cycles:
-        if not narrow_all_nodes(cycle):
+        if not all(node in node_profiles for node in cycle):
            continue

        total_mem = sum(
-            (node.node_profile.memory.ram_available for node in cycle), start=Memory()
+            (node_profiles[node].memory.ram_available for node in cycle), start=Memory()
        )
        if total_mem >= required_memory:
-            filtered_cycles.append(cast(list[NodeInfo], cycle))
+            filtered_cycles.append(
+                [
+                    NodeWithProfile(node_id=node, node_profile=node_profiles[node])
+                    for node in cycle
+                ]
+            )
    return filtered_cycles


-def get_smallest_cycles(cycles: list[list[NodeInfo]]) -> list[list[NodeInfo]]:
+def get_smallest_cycles(
+    cycles: list[list[NodeWithProfile]],
+) -> list[list[NodeWithProfile]]:
    min_nodes = min(len(cycle) for cycle in cycles)
    return [cycle for cycle in cycles if len(cycle) == min_nodes]

@@ -135,11 +139,9 @@ def get_shard_assignments_for_tensor_parallel(

 def get_shard_assignments(
    model_meta: ModelMetadata,
-    selected_cycle: list[NodeInfo],
+    selected_cycle: list[NodeWithProfile],
    sharding: Sharding,
 ) -> ShardAssignments:
-    if not narrow_all_nodes(selected_cycle):
-        raise ValueError("All nodes must have profiles to create shard assignments")
    match sharding:
        case Sharding.Pipeline:
            return get_shard_assignments_for_pipeline_parallel(
@@ -176,17 +178,16 @@ def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
        current_node = cycle[i]
        next_node = cycle[(i + 1) % len(cycle)]

-        for connection in cycle_digraph.list_connections():
-            if (
-                connection.local_node_id == current_node.node_id
-                and connection.send_back_node_id == next_node.node_id
-            ):
+        for src, sink, connection in cycle_digraph.list_connections():
+            if not isinstance(connection, SocketConnection):
+                continue
+
+            if src == current_node and sink == next_node:
                if get_thunderbolt and not connection.is_thunderbolt():
                    continue
-                assert connection.send_back_multiaddr is not None
                host = Host(
-                    ip=connection.send_back_multiaddr.ip_address,
-                    port=connection.send_back_multiaddr.port,
+                    ip=connection.sink_multiaddr.ip_address,
+                    port=connection.sink_multiaddr.port,
                )
                hosts.append(host)
                break
@@ -194,8 +195,7 @@ def get_hosts_from_subgraph(cycle_digraph: Topology) -> list[Host]:
    return hosts


-def get_mlx_ibv_devices_matrix(
-    selected_cycle: list[NodeInfo],
+def get_mlx_jaccl_devices_matrix(
    cycle_digraph: Topology,
 ) -> list[list[str | None]]:
    """Build connectivity matrix mapping device i to device j via RDMA interface names.
@@ -204,6 +204,7 @@ def get_mlx_ibv_devices_matrix(
    to device j, or None if no connection exists or no interface name is found.
    Diagonal elements are always None.
    """
+    selected_cycle = list(cycle_digraph.list_nodes())
    num_nodes = len(selected_cycle)
    matrix: list[list[str | None]] = [
        [None for _ in range(num_nodes)] for _ in range(num_nodes)
@@ -214,86 +215,158 @@ def get_mlx_ibv_devices_matrix(
            if i == j:
                continue

-            # Find the IP J uses to talk to I
-            for connection_ip in _find_connection_ip(node_j, node_i, cycle_digraph):
-                # This is a local IP on I, which is attached to an interface: find that interface
-                if interface_name := _find_interface_name_for_ip(connection_ip, node_i):
-                    matrix[i][j] = interface_name
-                    logger.info(
-                        f"Interface name for {connection_ip} on {node_i.node_id}: {interface_name}"
-                    )
+            for conn in cycle_digraph.get_all_connections_between(node_i, node_j):
+                if isinstance(conn, RDMAConnection):
+                    matrix[i][j] = conn.source_rdma_iface
                    break
            else:
                logger.warning(
-                    f"Failed to find interface name between {node_i.node_id} and {node_j.node_id}"
+                    f"Failed to find interface name between {node_i} and {node_j}"
                )
                raise ValueError(
-                    "Current ibv backend requires all-to-all rdma connections"
+                    "Current jaccl backend requires all-to-all RDMA connections"
                )

    return matrix


 def _find_connection_ip(
-    node_i: NodeInfo,
-    node_j: NodeInfo,
+    node_i: NodeId,
+    node_j: NodeId,
    cycle_digraph: Topology,
-) -> Generator[str]:
+) -> Generator[tuple[str, bool]]:
    """Find all IP addresses that connect node i to node j."""
-    for connection in cycle_digraph.list_connections():
-        if (
-            connection.local_node_id == node_i.node_id
-            and connection.send_back_node_id == node_j.node_id
-        ):
-            yield connection.send_back_multiaddr.ip_address
+    # TODO: Prioritise ETHERNET > ??WIFI > TB for coordinator
+    for connection in cycle_digraph.get_all_connections_between(node_i, node_j):
+        if isinstance(connection, SocketConnection):
+            yield connection.sink_multiaddr.ip_address, connection.is_thunderbolt()


 def _find_interface_name_for_ip(
    ip_address: str,
-    node_info: NodeInfo,
+    node_info: NodeWithProfile,
 ) -> str | None:
-    if node_info.node_profile is None:
-        return None
-
-    logger.info(f"Searching {node_info.node_id} for ip {ip_address}:")
+    """Find the interface name for an IP address on a node (any interface)."""
    for interface in node_info.node_profile.network_interfaces:
-        if interface.name not in ["en2", "en3", "en4", "en5", "en6", "en7"]:
-            continue
-        logger.info(f" | {interface.name}: {interface.ip_address}")
-        if interface.ip_address != ip_address:
-            continue
-
-        logger.info("Found")
-        return f"rdma_{interface.name}"
+        if interface.ip_address == ip_address:
+            return interface.name

    return None


-def get_mlx_ibv_coordinators(
-    selected_cycle: list[NodeInfo],
+def _find_ip_prioritised(
+    node: NodeWithProfile, other_node: NodeWithProfile, cycle_digraph: Topology
+) -> str | None:
+    # TODO: Actually prioritize in the correct Ethernet > Wifi > Non-TB > TB order.
+    """Find an IP address between nodes with prioritization.
+
+    Priority order:
+    1. en0 (Ethernet on Mac Studio, WiFi on MacBook)
+    2. en1 (WiFi on Mac Studio, Ethernet on MacBook)
+    3. Non-Thunderbolt connections
+    4. Any other IP address
+    """
+    ips = list(_find_connection_ip(node.node_id, other_node.node_id, cycle_digraph))
+    # We expect a unique iface -> ip mapping
+    iface_map = {_find_interface_name_for_ip(ip, other_node): ip for ip, _ in ips}
+
+    en0_ip = iface_map.get("en0")
+    if en0_ip:
+        return en0_ip
+
+    en1_ip = iface_map.get("en1")
+    if en1_ip:
+        return en1_ip
+
+    non_thunderbolt_ip = next(
+        (ip for (ip, is_thunderbolt) in ips if not is_thunderbolt), None
+    )
+
+    if non_thunderbolt_ip:
+        return non_thunderbolt_ip
+
+    if ips:
+        return ips[0][0]
+
+    return None
+
+
+def get_mlx_ring_hosts_by_node(
+    selected_cycle: list[NodeWithProfile],
+    cycle_digraph: Topology,
+    ephemeral_port: int,
+) -> dict[NodeId, list[Host]]:
+    """Generate per-node host lists for MLX ring backend.
+
+    Each node gets a list where:
+    - Self position: Host(ip="0.0.0.0", port=ephemeral_port)
+    - Left/right neighbors: actual connection IPs
+    - Non-neighbors: Host(ip="198.51.100.1", port=0) placeholder (RFC 5737 TEST-NET-2)
+    """
+    world_size = len(selected_cycle)
+    if world_size == 0:
+        return {}
+
+    hosts_by_node: dict[NodeId, list[Host]] = {}
+
+    for rank, node in enumerate(selected_cycle):
+        node_id = node.node_id
+        left_rank = (rank - 1) % world_size
+        right_rank = (rank + 1) % world_size
+
+        hosts_for_node: list[Host] = []
+
+        for idx, other_node in enumerate(selected_cycle):
+            if idx == rank:
+                hosts_for_node.append(Host(ip="0.0.0.0", port=ephemeral_port))
+                continue
+
+            if idx not in {left_rank, right_rank}:
+                # Placeholder IP from RFC 5737 TEST-NET-2
+                hosts_for_node.append(Host(ip="198.51.100.1", port=0))
+                continue
+
+            connection_ip = _find_ip_prioritised(node, other_node, cycle_digraph)
+            if connection_ip is None:
+                logger.warning(
+                    f"Failed to find prioritised connection IP between {node_id} and {other_node.node_id}"
+                )
+                raise ValueError(
+                    "MLX ring backend requires connectivity between neighbouring nodes"
+                )
+
+            hosts_for_node.append(Host(ip=connection_ip, port=ephemeral_port))
+
+        hosts_by_node[node_id] = hosts_for_node
+
+    return hosts_by_node
+
+
+def get_mlx_jaccl_coordinators(
+    coordinator: NodeId,
    coordinator_port: int,
    cycle_digraph: Topology,
 ) -> dict[NodeId, str]:
-    """Get the coordinator addresses for MLX IBV (rank 0 device).
+    """Get the coordinator addresses for MLX JACCL (rank 0 device).

    Select an IP address that each node can reach for the rank 0 node. Returns
    address in format "X.X.X.X:PORT" per node.
    """
-    rank_0_node = selected_cycle[0]
-    logger.info(f"Selecting coordinator from rank 0 node: {rank_0_node.node_id}")
+    selected_cycle = list(cycle_digraph.list_nodes())
+    logger.info(f"Selecting coordinator: {coordinator}")

-    def get_ip_for_node(n: NodeInfo) -> str:
-        if n.node_id == rank_0_node.node_id:
+    def get_ip_for_node(n: NodeId) -> str:
+        if n == coordinator:
            return "0.0.0.0"

-        for ip in _find_connection_ip(n, rank_0_node, cycle_digraph):
+        for ip, _ in _find_connection_ip(n, coordinator, cycle_digraph):
            return ip

        logger.warning(
-            f"Failed to find directly connected ip between {n.node_id} and {rank_0_node.node_id}"
+            f"Failed to find directly connected ip between {n} and {coordinator}"
+        )
+        raise ValueError(
+            "Current jaccl backend requires all participating devices to be able to communicate"
        )
-        raise ValueError("Current ibv backend requires all-to-all rdma connections")

-    return {
-        n.node_id: f"{get_ip_for_node(n)}:{coordinator_port}" for n in selected_cycle
-    }
+    return {n: f"{get_ip_for_node(n)}:{coordinator_port}" for n in selected_cycle}
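Reviewer note: the per-node host layout described in the `get_mlx_ring_hosts_by_node` docstring can be illustrated with a small standalone sketch. It is not the code from this change. `Host` here is a hypothetical stand-in for `exo.shared.types.common.Host`, node ids are plain strings, and the `neighbour_ip` lookup table is an assumption that replaces the topology-based `_find_ip_prioritised` call.

```python
from __future__ import annotations

from dataclasses import dataclass

# Placeholder address from RFC 5737 TEST-NET-2, the same value the diff uses.
PLACEHOLDER_IP = "198.51.100.1"


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for exo.shared.types.common.Host."""

    ip: str
    port: int


def ring_hosts(
    node_ids: list[str],
    neighbour_ip: dict[tuple[str, str], str],
    port: int,
) -> dict[str, list[Host]]:
    """Build one host list per node, indexed by rank.

    neighbour_ip maps (node, neighbour) to the IP the node should dial.
    It is an assumed input that stands in for the topology lookup.
    """
    world_size = len(node_ids)
    hosts_by_node: dict[str, list[Host]] = {}
    for rank, node in enumerate(node_ids):
        neighbours = {(rank - 1) % world_size, (rank + 1) % world_size}
        row: list[Host] = []
        for idx, other in enumerate(node_ids):
            if idx == rank:
                # Own slot: bind on all interfaces.
                row.append(Host("0.0.0.0", port))
            elif idx not in neighbours:
                # Non-neighbours are never dialled in a ring.
                row.append(Host(PLACEHOLDER_IP, 0))
            else:
                ip = neighbour_ip.get((node, other))
                if ip is None:
                    raise ValueError(f"no route from {node} to {other}")
                row.append(Host(ip, port))
        hosts_by_node[node] = row
    return hosts_by_node
```

With a single node the list collapses to one self-binding entry, which is the shape the updated `test_master` asserts on (`hosts_by_node[node_id][0].ip == "0.0.0.0"`).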
--- a/src/exo/master/tests/conftest.py
+++ b/src/exo/master/tests/conftest.py
@@ -1,67 +1,36 @@
-from typing import Callable
-
-import pytest
-
-from exo.shared.types.common import NodeId
 from exo.shared.types.multiaddr import Multiaddr
 from exo.shared.types.profiling import (
-    MemoryPerformanceProfile,
+    MemoryUsage,
    NodePerformanceProfile,
    SystemPerformanceProfile,
 )
-from exo.shared.types.topology import Connection, ConnectionProfile, NodeInfo
+from exo.shared.types.topology import RDMAConnection, SocketConnection


-@pytest.fixture
-def create_node():
-    def _create_node(memory: int, node_id: NodeId | None = None) -> NodeInfo:
-        if node_id is None:
-            node_id = NodeId()
-        return NodeInfo(
-            node_id=node_id,
-            node_profile=NodePerformanceProfile(
-                model_id="test",
-                chip_id="test",
-                friendly_name="test",
-                memory=MemoryPerformanceProfile.from_bytes(
-                    ram_total=1000,
-                    ram_available=memory,
-                    swap_total=1000,
-                    swap_available=1000,
-                ),
-                network_interfaces=[],
-                system=SystemPerformanceProfile(),
-            ),
-        )
-
-    return _create_node
+def create_node_profile(memory: int) -> NodePerformanceProfile:
+    return NodePerformanceProfile(
+        model_id="test",
+        chip_id="test",
+        friendly_name="test",
+        memory=MemoryUsage.from_bytes(
+            ram_total=1000,
+            ram_available=memory,
+            swap_total=1000,
+            swap_available=1000,
+        ),
+        network_interfaces=[],
+        system=SystemPerformanceProfile(),
+    )


 # TODO: this is a hack to get the port for the send_back_multiaddr
-@pytest.fixture
-def create_connection() -> Callable[[NodeId, NodeId, int | None], Connection]:
-    port_counter = 1235
-    ip_counter = 1
+def create_connection(ip: int, sink_port: int = 1234) -> SocketConnection:
+    return SocketConnection(
+        sink_multiaddr=Multiaddr(address=f"/ip4/169.254.0.{ip}/tcp/{sink_port}"),
+    )

-    def _create_connection(
-        source_node_id: NodeId, sink_node_id: NodeId, send_back_port: int | None = None
-    ) -> Connection:
-        nonlocal port_counter
-        nonlocal ip_counter
-        # assign unique ips
-        ip_counter += 1
-        if send_back_port is None:
-            send_back_port = port_counter
-            port_counter += 1
-        return Connection(
-            local_node_id=source_node_id,
-            send_back_node_id=sink_node_id,
-            send_back_multiaddr=Multiaddr(
-                address=f"/ip4/169.254.0.{ip_counter}/tcp/{send_back_port}"
-            ),
-            connection_profile=ConnectionProfile(
-                throughput=1000, latency=1000, jitter=1000
-            ),
-        )

-    return _create_connection
+def create_rdma_connection(iface: int) -> RDMAConnection:
+    return RDMAConnection(
+        source_rdma_iface=f"rdma_en{iface}", sink_rdma_iface=f"rdma_en{iface}"
+    )
--- a/src/exo/master/tests/test_master.py
+++ b/src/exo/master/tests/test_master.py
@@ -2,6 +2,7 @@ from datetime import datetime, timezone
 from typing import Sequence

 import anyio
+import pytest
 from loguru import logger

 from exo.master.main import Master
@@ -18,15 +19,13 @@ from exo.shared.types.events import (
    ForwarderEvent,
    IndexedEvent,
    InstanceCreated,
-    NodePerformanceMeasured,
+    NodeGatheredInfo,
    TaskCreated,
 )
 from exo.shared.types.memory import Memory
 from exo.shared.types.models import ModelId, ModelMetadata
 from exo.shared.types.profiling import (
-    MemoryPerformanceProfile,
-    NodePerformanceProfile,
-    SystemPerformanceProfile,
+    MemoryUsage,
 )
 from exo.shared.types.tasks import ChatCompletion as ChatCompletionTask
 from exo.shared.types.tasks import TaskStatus
@@ -39,6 +38,7 @@ from exo.shared.types.worker.shards import PipelineShardMetadata, Sharding
 from exo.utils.channels import channel


+@pytest.mark.asyncio
 async def test_master():
    keypair = get_node_id_keypair()
    node_id = NodeId(keypair.to_peer_id().to_base58())
@@ -81,21 +81,14 @@ async def test_master():
                origin=sender_node_id,
                session=session_id,
                event=(
-                    NodePerformanceMeasured(
+                    NodeGatheredInfo(
                        when=str(datetime.now(tz=timezone.utc)),
                        node_id=node_id,
-                        node_profile=NodePerformanceProfile(
-                            model_id="maccy",
-                            chip_id="arm",
-                            friendly_name="test",
-                            memory=MemoryPerformanceProfile(
-                                ram_total=Memory.from_bytes(678948 * 1024),
-                                ram_available=Memory.from_bytes(678948 * 1024),
-                                swap_total=Memory.from_bytes(0),
-                                swap_available=Memory.from_bytes(0),
-                            ),
-                            network_interfaces=[],
-                            system=SystemPerformanceProfile(),
+                        info=MemoryUsage(
+                            ram_total=Memory.from_bytes(678948 * 1024),
+                            ram_available=Memory.from_bytes(678948 * 1024),
+                            swap_total=Memory.from_bytes(0),
+                            swap_available=Memory.from_bytes(0),
                        ),
                    )
                ),
@@ -121,6 +114,8 @@ async def test_master():
                            pretty_name="Llama 3.2 1B",
                            n_layers=16,
                            storage_size=Memory.from_bytes(678948),
+                            hidden_size=7168,
+                            supports_tensor=True,
                        ),
                        sharding=Sharding.Pipeline,
                        instance_meta=InstanceMeta.MlxRing,
@@ -159,34 +154,40 @@ async def test_master():
        assert events[0].idx == 0
        assert events[1].idx == 1
        assert events[2].idx == 2
-        assert isinstance(events[0].event, NodePerformanceMeasured)
+        assert isinstance(events[0].event, NodeGatheredInfo)
        assert isinstance(events[1].event, InstanceCreated)
-        runner_id = list(
-            events[1].event.instance.shard_assignments.runner_to_shard.keys()
-        )[0]
-        assert events[1].event.instance == MlxRingInstance(
-            instance_id=events[1].event.instance.instance_id,
-            shard_assignments=ShardAssignments(
-                model_id=ModelId("llama-3.2-1b"),
-                runner_to_shard={
-                    (runner_id): PipelineShardMetadata(
-                        start_layer=0,
-                        end_layer=16,
+        created_instance = events[1].event.instance
+        assert isinstance(created_instance, MlxRingInstance)
+        runner_id = list(created_instance.shard_assignments.runner_to_shard.keys())[0]
+        # Validate the shard assignments
+        expected_shard_assignments = ShardAssignments(
+            model_id=ModelId("llama-3.2-1b"),
+            runner_to_shard={
+                (runner_id): PipelineShardMetadata(
+                    start_layer=0,
+                    end_layer=16,
+                    n_layers=16,
+                    model_meta=ModelMetadata(
+                        model_id=ModelId("llama-3.2-1b"),
+                        pretty_name="Llama 3.2 1B",
                        n_layers=16,
-                        model_meta=ModelMetadata(
-                            model_id=ModelId("llama-3.2-1b"),
-                            pretty_name="Llama 3.2 1B",
-                            n_layers=16,
-                            storage_size=Memory.from_bytes(678948),
-                        ),
-                        device_rank=0,
-                        world_size=1,
-                    )
-                },
-                node_to_runner={node_id: runner_id},
-            ),
-            hosts=[],
+                        storage_size=Memory.from_bytes(678948),
+                        hidden_size=7168,
+                        supports_tensor=True,
+                    ),
+                    device_rank=0,
+                    world_size=1,
+                )
+            },
+            node_to_runner={node_id: runner_id},
        )
+        assert created_instance.shard_assignments == expected_shard_assignments
+        # For single-node, hosts_by_node should have one entry with self-binding
+        assert len(created_instance.hosts_by_node) == 1
+        assert node_id in created_instance.hosts_by_node
+        assert len(created_instance.hosts_by_node[node_id]) == 1
+        assert created_instance.hosts_by_node[node_id][0].ip == "0.0.0.0"
+        assert created_instance.ephemeral_port > 0
        assert isinstance(events[2].event, TaskCreated)
        assert events[2].event.task.task_status == TaskStatus.Pending
        assert isinstance(events[2].event.task, ChatCompletionTask)
--- a/src/exo/master/tests/test_placement.py
+++ b/src/exo/master/tests/test_placement.py
@@ -1,5 +1,3 @@
-from typing import Callable
-
 import pytest
 from loguru import logger

@@ -7,14 +5,20 @@ from exo.master.placement import (
    get_transition_events,
    place_instance,
 )
+from exo.master.tests.conftest import (
+    create_connection,
+    create_node_profile,
+    create_rdma_connection,
+)
 from exo.shared.topology import Topology
 from exo.shared.types.commands import PlaceInstance
 from exo.shared.types.common import CommandId, NodeId
 from exo.shared.types.events import InstanceCreated, InstanceDeleted
 from exo.shared.types.memory import Memory
 from exo.shared.types.models import ModelId, ModelMetadata
-from exo.shared.types.profiling import NetworkInterfaceInfo, NodePerformanceProfile
-from exo.shared.types.topology import Connection, NodeInfo
+from exo.shared.types.multiaddr import Multiaddr
+from exo.shared.types.profiling import NetworkInterfaceInfo
+from exo.shared.types.topology import SocketConnection
 from exo.shared.types.worker.instances import (
    Instance,
    InstanceId,
@@ -26,11 +30,6 @@ from exo.shared.types.worker.runners import ShardAssignments
 from exo.shared.types.worker.shards import Sharding


-@pytest.fixture
-def topology() -> Topology:
-    return Topology()
-
-
 @pytest.fixture
 def instance() -> Instance:
    return MlxRingInstance(
@@ -38,7 +37,8 @@ def instance() -> Instance:
        shard_assignments=ShardAssignments(
            model_id=ModelId("test-model"), runner_to_shard={}, node_to_runner={}
        ),
-        hosts=[],
+        hosts_by_node={},
+        ephemeral_port=50000,
    )


@@ -49,6 +49,8 @@ def model_meta() -> ModelMetadata:
        storage_size=Memory.from_kb(1000),
        pretty_name="Test Model",
        n_layers=10,
+        hidden_size=30,
+        supports_tensor=True,
    )


@@ -74,30 +76,36 @@ def test_get_instance_placements_create_instance(
    available_memory: tuple[int, int, int],
    total_layers: int,
    expected_layers: tuple[int, int, int],
-    topology: Topology,
    model_meta: ModelMetadata,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
 ):
    # arrange
    model_meta.n_layers = total_layers
    model_meta.storage_size.in_bytes = sum(
        available_memory
    )  # make it exactly fit across all nodes
+    topology = Topology()

    cic = place_instance_command(model_meta)
    node_id_a = NodeId()
    node_id_b = NodeId()
    node_id_c = NodeId()
-    topology.add_node(create_node(available_memory[0], node_id_a))
-    topology.add_node(create_node(available_memory[1], node_id_b))
-    topology.add_node(create_node(available_memory[2], node_id_c))
-    topology.add_connection(create_connection(node_id_a, node_id_b))
-    topology.add_connection(create_connection(node_id_b, node_id_c))
-    topology.add_connection(create_connection(node_id_c, node_id_a))
+    profiles = {
+        node_id_a: create_node_profile(available_memory[0]),
+        node_id_b: create_node_profile(available_memory[1]),
+        node_id_c: create_node_profile(available_memory[2]),
+    }
+    topology.add_node(node_id_a)
+    topology.add_node(node_id_b)
+    topology.add_node(node_id_c)
+    topology.add_connection(node_id_a, node_id_b, create_connection(1))
+    topology.add_connection(node_id_b, node_id_c, create_connection(2))
+    topology.add_connection(node_id_c, node_id_a, create_connection(3))
+    topology.add_connection(node_id_c, node_id_b, create_connection(4))
+    topology.add_connection(node_id_a, node_id_c, create_connection(5))
+    topology.add_connection(node_id_b, node_id_a, create_connection(6))

    # act
-    placements = place_instance(cic, topology, {})
+    placements = place_instance(cic, topology, {}, profiles)

    # assert
    assert len(placements) == 1
@@ -123,21 +131,22 @@ def test_get_instance_placements_create_instance(
    assert shards_sorted[-1].end_layer == total_layers


-def test_get_instance_placements_one_node_exact_fit(
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-) -> None:
+def test_get_instance_placements_one_node_exact_fit() -> None:
    topology = Topology()
    node_id = NodeId()
-    topology.add_node(create_node(1000 * 1024, node_id))
+    topology.add_node(node_id)
+    profiles = {node_id: create_node_profile(1000 * 1024)}
    cic = place_instance_command(
        ModelMetadata(
            model_id=ModelId("test-model"),
            storage_size=Memory.from_kb(1000),
            pretty_name="Test Model",
            n_layers=10,
+            hidden_size=1000,
+            supports_tensor=True,
        ),
    )
-    placements = place_instance(cic, topology, {})
+    placements = place_instance(cic, topology, {}, profiles)

    assert len(placements) == 1
    instance_id = list(placements.keys())[0]
@@ -148,21 +157,22 @@ def test_get_instance_placements_one_node_exact_fit(
    assert len(instance.shard_assignments.runner_to_shard) == 1


-def test_get_instance_placements_one_node_fits_with_extra_memory(
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-) -> None:
+def test_get_instance_placements_one_node_fits_with_extra_memory() -> None:
    topology = Topology()
    node_id = NodeId()
-    topology.add_node(create_node(1001 * 1024, node_id))
+    topology.add_node(node_id)
+    profiles = {node_id: create_node_profile(1001 * 1024)}
    cic = place_instance_command(
        ModelMetadata(
            model_id=ModelId("test-model"),
            storage_size=Memory.from_kb(1000),
            pretty_name="Test Model",
            n_layers=10,
+            hidden_size=1000,
+            supports_tensor=True,
        ),
    )
-    placements = place_instance(cic, topology, {})
+    placements = place_instance(cic, topology, {}, profiles)

    assert len(placements) == 1
    instance_id = list(placements.keys())[0]
@@ -173,23 +183,24 @@ def test_get_instance_placements_one_node_fits_with_extra_memory(
    assert len(instance.shard_assignments.runner_to_shard) == 1


-def test_get_instance_placements_one_node_not_fit(
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-) -> None:
+def test_get_instance_placements_one_node_not_fit() -> None:
    topology = Topology()
    node_id = NodeId()
-    topology.add_node(create_node(1000 * 1024, node_id))
+    topology.add_node(node_id)
+    profiles = {node_id: create_node_profile(1000 * 1024)}
    cic = place_instance_command(
        model_meta=ModelMetadata(
            model_id=ModelId("test-model"),
            storage_size=Memory.from_kb(1001),
            pretty_name="Test Model",
            n_layers=10,
+            hidden_size=1000,
+            supports_tensor=True,
        ),
    )

    with pytest.raises(ValueError, match="No cycles found with sufficient memory"):
-        place_instance(cic, topology, {})
+        place_instance(cic, topology, {}, profiles)


 def test_get_transition_events_no_change(instance: Instance):
@@ -234,191 +245,103 @@ def test_get_transition_events_delete_instance(instance: Instance):
    assert events[0].instance_id == instance_id


-def test_placement_prioritizes_leaf_cycle_with_less_memory(
-    topology: Topology,
+def test_placement_selects_leaf_nodes(
    model_meta: ModelMetadata,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
 ):
-    # Arrange two 3-node cycles. The A-B-C cycle has a leaf node (only one outgoing
-    # neighbor per node). The D-E-F cycle has extra outgoing edges making its nodes
-    # non-leaves. Ensure both cycles have sufficient total memory, with the A-B-C
-    # cycle having LESS total memory than D-E-F. The algorithm should still choose
-    # the cycle that contains a leaf node.
+    # arrange
+    topology = Topology()

-    # Model requires more than any single node but fits within a 3-node cycle
-    model_meta.storage_size.in_bytes = 1500
-    model_meta.n_layers = 12
+    model_meta.storage_size = Memory.from_bytes(1000)

-    # Create node ids
    node_id_a = NodeId()
    node_id_b = NodeId()
    node_id_c = NodeId()
    node_id_d = NodeId()
-    node_id_e = NodeId()
-    node_id_f = NodeId()

-    # Extra sink nodes to make D/E/F non-leaf via additional outgoing edges
-    node_id_x = NodeId()
-    node_id_y = NodeId()
-    node_id_z = NodeId()
+    profiles = {
+        node_id_a: create_node_profile(500),
+        node_id_b: create_node_profile(600),
+        node_id_c: create_node_profile(600),
+        node_id_d: create_node_profile(500),
+    }

-    # A-B-C cycle total memory = 1600 (< D-E-F total)
-    topology.add_node(create_node(400, node_id_a))
-    topology.add_node(create_node(400, node_id_b))
-    topology.add_node(create_node(800, node_id_c))
+    topology.add_node(node_id_a)
+    topology.add_node(node_id_b)
+    topology.add_node(node_id_c)
+    topology.add_node(node_id_d)

-    # D-E-F cycle total memory = 1800 (> A-B-C total)
-    topology.add_node(create_node(600, node_id_d))
-    topology.add_node(create_node(600, node_id_e))
-    topology.add_node(create_node(600, node_id_f))
+    # Daisy chain topology
+    topology.add_connection(node_id_a, node_id_b, create_connection(1))
+    topology.add_connection(node_id_b, node_id_a, create_connection(1))
+    topology.add_connection(node_id_b, node_id_c, create_connection(1))
+    topology.add_connection(node_id_c, node_id_b, create_connection(1))
+    topology.add_connection(node_id_c, node_id_d, create_connection(1))
+    topology.add_connection(node_id_d, node_id_c, create_connection(1))

-    # Extra nodes with tiny memory so they can't form singleton placements
-    topology.add_node(create_node(10, node_id_x))
-    topology.add_node(create_node(10, node_id_y))
-    topology.add_node(create_node(10, node_id_z))
-
-    # Build directed cycles
-    topology.add_connection(create_connection(node_id_a, node_id_b))
-    topology.add_connection(create_connection(node_id_b, node_id_c))
-    topology.add_connection(create_connection(node_id_c, node_id_a))
-
-    topology.add_connection(create_connection(node_id_d, node_id_e))
-    topology.add_connection(create_connection(node_id_e, node_id_f))
-    topology.add_connection(create_connection(node_id_f, node_id_d))
-
-    # Add extra outgoing edges from D/E/F so none of them are leaves
-    topology.add_connection(create_connection(node_id_d, node_id_x))
-    topology.add_connection(create_connection(node_id_e, node_id_y))
-    topology.add_connection(create_connection(node_id_f, node_id_z))
+    logger.info(list(topology.list_connections()))

    cic = place_instance_command(
        model_meta=model_meta,
    )

-    # Act
-    placements = place_instance(cic, topology, {})
+    # act
+    placements = place_instance(cic, topology, {}, profiles)

-    # Assert the chosen cycle is A-B-C (contains at least one leaf node), even though
-    # D-E-F has more total memory.
+    # assert
    assert len(placements) == 1
-    instance_id = list(placements.keys())[0]
-    instance = placements[instance_id]
+    instance = list(placements.values())[0]

    assigned_nodes = set(instance.shard_assignments.node_to_runner.keys())
-    expected_leaf_cycle_nodes = {node_id_a, node_id_b, node_id_c}
-    non_leaf_cycle_nodes = {node_id_d, node_id_e, node_id_f}
-
-    assert expected_leaf_cycle_nodes.issubset(assigned_nodes)
-    assert assigned_nodes.isdisjoint(non_leaf_cycle_nodes)
+    assert assigned_nodes == set((node_id_a, node_id_b)) or assigned_nodes == set(
+        (node_id_c, node_id_d)
+    )


 def test_tensor_rdma_backend_connectivity_matrix(
-    topology: Topology,
    model_meta: ModelMetadata,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
 ):
+    topology = Topology()
    model_meta.n_layers = 12
    model_meta.storage_size.in_bytes = 1500

-    node_id_a = NodeId()
-    node_id_b = NodeId()
-    node_id_c = NodeId()
+    node_a = NodeId()
+    node_b = NodeId()
+    node_c = NodeId()

-    node_a = create_node(500, node_id_a)
-    node_b = create_node(500, node_id_b)
-    node_c = create_node(500, node_id_c)
+    profiles = {
+        node_a: create_node_profile(500),
+        node_b: create_node_profile(500),
+        node_c: create_node_profile(500),
+    }

    ethernet_interface = NetworkInterfaceInfo(
        name="en0",
        ip_address="192.168.1.100",
    )
-
-    assert node_a.node_profile is not None
-    assert node_b.node_profile is not None
-    assert node_c.node_profile is not None
-
-    conn_a_b = create_connection(node_id_a, node_id_b)
-    conn_b_c = create_connection(node_id_b, node_id_c)
-    conn_c_a = create_connection(node_id_c, node_id_a)
-
-    conn_b_a = create_connection(node_id_b, node_id_a)
-    conn_c_b = create_connection(node_id_c, node_id_b)
-    conn_a_c = create_connection(node_id_a, node_id_c)
-
-    assert conn_a_b.send_back_multiaddr is not None
-    assert conn_b_c.send_back_multiaddr is not None
-    assert conn_c_a.send_back_multiaddr is not None
-
-    assert conn_b_a.send_back_multiaddr is not None
-    assert conn_c_b.send_back_multiaddr is not None
-    assert conn_a_c.send_back_multiaddr is not None
-
-    node_a.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_a.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_c_a.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_b_a.send_back_multiaddr.ip_address,
-            ),
-            ethernet_interface,
-        ],
-        system=node_a.node_profile.system,
-    )
-    node_b.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_b.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_c_b.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_a_b.send_back_multiaddr.ip_address,
-            ),
-            ethernet_interface,
-        ],
-        system=node_b.node_profile.system,
-    )
-    node_c.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_c.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_a_c.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_b_c.send_back_multiaddr.ip_address,
-            ),
-            ethernet_interface,
-        ],
-        system=node_c.node_profile.system,
+    ethernet_conn = SocketConnection(
+        sink_multiaddr=Multiaddr(address="/ip4/192.168.1.100/tcp/8000")
    )

+    profiles[node_a].network_interfaces = [ethernet_interface]
+    profiles[node_b].network_interfaces = [ethernet_interface]
+    profiles[node_c].network_interfaces = [ethernet_interface]
+
    topology.add_node(node_a)
    topology.add_node(node_b)
    topology.add_node(node_c)
-    topology.add_connection(conn_a_b)
-    topology.add_connection(conn_b_c)
-    topology.add_connection(conn_c_a)
-    topology.add_connection(conn_b_a)
-    topology.add_connection(conn_c_b)
-    topology.add_connection(conn_a_c)
+    topology.add_connection(node_a, node_b, create_rdma_connection(3))
+    topology.add_connection(node_b, node_c, create_rdma_connection(4))
+    topology.add_connection(node_c, node_a, create_rdma_connection(5))
+    topology.add_connection(node_b, node_a, create_rdma_connection(3))
+    topology.add_connection(node_c, node_b, create_rdma_connection(4))
+    topology.add_connection(node_a, node_c, create_rdma_connection(5))
+
+    topology.add_connection(node_a, node_b, ethernet_conn)
+    topology.add_connection(node_b, node_c, ethernet_conn)
+    topology.add_connection(node_c, node_a, ethernet_conn)
+    topology.add_connection(node_a, node_c, ethernet_conn)
+    topology.add_connection(node_b, node_a, ethernet_conn)
+    topology.add_connection(node_c, node_b, ethernet_conn)

    cic = PlaceInstance(
        sharding=Sharding.Tensor,
@@ -428,7 +351,7 @@ def test_tensor_rdma_backend_connectivity_matrix(
        min_nodes=1,
    )

-    placements = place_instance(cic, topology, {})
+    placements = place_instance(cic, topology, {}, profiles)

    assert len(placements) == 1
    instance_id = list(placements.keys())[0]
@@ -436,10 +359,10 @@ def test_tensor_rdma_backend_connectivity_matrix(

    assert isinstance(instance, MlxJacclInstance)

-    assert instance.ibv_devices is not None
-    assert instance.ibv_coordinators is not None
+    assert instance.jaccl_devices is not None
+    assert instance.jaccl_coordinators is not None

-    matrix = instance.ibv_devices
+    matrix = instance.jaccl_devices
    assert len(matrix) == 3

    for i in range(3):
@@ -448,21 +371,21 @@ def test_tensor_rdma_backend_connectivity_matrix(
    assigned_nodes = list(instance.shard_assignments.node_to_runner.keys())
    node_to_idx = {node_id: idx for idx, node_id in enumerate(assigned_nodes)}

-    idx_a = node_to_idx[node_id_a]
-    idx_b = node_to_idx[node_id_b]
-    idx_c = node_to_idx[node_id_c]
+    idx_a = node_to_idx[node_a]
+    idx_b = node_to_idx[node_b]
+    idx_c = node_to_idx[node_c]

    logger.info(matrix)

-    assert matrix[idx_a][idx_b] == "rdma_en4"
-    assert matrix[idx_b][idx_c] == "rdma_en3"
-    assert matrix[idx_c][idx_a] == "rdma_en3"
+    assert matrix[idx_a][idx_b] == "rdma_en3"
+    assert matrix[idx_b][idx_c] == "rdma_en4"
+    assert matrix[idx_c][idx_a] == "rdma_en5"

    # Verify coordinators are set for all nodes
-    assert len(instance.ibv_coordinators) == 3
+    assert len(instance.jaccl_coordinators) == 3
    for node_id in assigned_nodes:
-        assert node_id in instance.ibv_coordinators
-        coordinator = instance.ibv_coordinators[node_id]
+        assert node_id in instance.jaccl_coordinators
+        coordinator = instance.jaccl_coordinators[node_id]
        assert ":" in coordinator
        # Rank 0 node should use 0.0.0.0, others should use connection-specific IPs
        if node_id == assigned_nodes[0]:
--- a/src/exo/master/tests/test_placement_utils.py
+++ b/src/exo/master/tests/test_placement_utils.py
@@ -1,56 +1,48 @@
-from typing import Callable
-
 import pytest

 from exo.master.placement_utils import (
+    NodeWithProfile,
    filter_cycles_by_memory,
    get_hosts_from_subgraph,
-    get_mlx_ibv_coordinators,
+    get_mlx_jaccl_coordinators,
    get_shard_assignments,
    get_smallest_cycles,
 )
+from exo.master.tests.conftest import create_connection, create_node_profile
 from exo.shared.topology import Topology
 from exo.shared.types.common import Host, NodeId
 from exo.shared.types.memory import Memory
 from exo.shared.types.models import ModelId, ModelMetadata
-from exo.shared.types.profiling import NetworkInterfaceInfo, NodePerformanceProfile
-from exo.shared.types.topology import Connection, NodeInfo
 from exo.shared.types.worker.shards import Sharding


-@pytest.fixture
-def topology() -> Topology:
-    topology = Topology()
-    return topology
-
-
-def test_filter_cycles_by_memory(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
-):
+def test_filter_cycles_by_memory():
    # arrange
    node1_id = NodeId()
    node2_id = NodeId()
+    topology = Topology()

-    node1 = create_node(1000 * 1024, node1_id)
-    node2 = create_node(1000 * 1024, node2_id)
+    node1 = create_node_profile(1000 * 1024)
+    node2 = create_node_profile(1000 * 1024)
+    node_profiles = {node1_id: node1, node2_id: node2}

-    topology.add_node(node1)
-    topology.add_node(node2)
+    topology.add_node(node1_id)
+    topology.add_node(node2_id)

-    connection1 = create_connection(node1_id, node2_id)
-    connection2 = create_connection(node2_id, node1_id)
+    connection1 = create_connection(1)
+    connection2 = create_connection(2)

-    topology.add_connection(connection1)
-    topology.add_connection(connection2)
+    topology.add_connection(node1_id, node2_id, connection1)
+    topology.add_connection(node2_id, node1_id, connection2)

    cycles = topology.get_cycles()
    assert len(cycles) == 1
    assert len(cycles[0]) == 2

    # act
-    filtered_cycles = filter_cycles_by_memory(cycles, Memory.from_bytes(1))
+    filtered_cycles = filter_cycles_by_memory(
+        cycles, node_profiles, Memory.from_bytes(1)
+    )

    # assert
    assert len(filtered_cycles) == 1
@@ -58,64 +50,65 @@ def test_filter_cycles_by_memory(
    assert set(n.node_id for n in filtered_cycles[0]) == {node1_id, node2_id}


-def test_filter_cycles_by_insufficient_memory(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
-):
+def test_filter_cycles_by_insufficient_memory():
    # arrange
    node1_id = NodeId()
    node2_id = NodeId()
+    topology = Topology()

-    node1 = create_node(1000 * 1024, node1_id)
-    node2 = create_node(1000 * 1024, node2_id)
+    node1 = create_node_profile(1000 * 1024)
+    node2 = create_node_profile(1000 * 1024)
+    node_profiles = {node1_id: node1, node2_id: node2}

-    topology.add_node(node1)
-    topology.add_node(node2)
+    topology.add_node(node1_id)
+    topology.add_node(node2_id)

-    connection1 = create_connection(node1_id, node2_id)
-    connection2 = create_connection(node2_id, node1_id)
+    connection1 = create_connection(1)
+    connection2 = create_connection(2)

-    topology.add_connection(connection1)
-    topology.add_connection(connection2)
+    topology.add_connection(node1_id, node2_id, connection1)
+    topology.add_connection(node2_id, node1_id, connection2)

    # act
    filtered_cycles = filter_cycles_by_memory(
-        topology.get_cycles(), Memory.from_kb(2001)
+        topology.get_cycles(), node_profiles, Memory.from_kb(2001)
    )

    # assert
    assert len(filtered_cycles) == 0


-def test_filter_multiple_cycles_by_memory(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
-):
+def test_filter_multiple_cycles_by_memory():
    # arrange
    node_a_id = NodeId()
    node_b_id = NodeId()
    node_c_id = NodeId()
+    topology = Topology()

-    node_a = create_node(500 * 1024, node_a_id)
-    node_b = create_node(500 * 1024, node_b_id)
-    node_c = create_node(1000 * 1024, node_c_id)
+    node_a = create_node_profile(500 * 1024)
+    node_b = create_node_profile(500 * 1024)
+    node_c = create_node_profile(1000 * 1024)
+    node_profiles = {
+        node_a_id: node_a,
+        node_b_id: node_b,
+        node_c_id: node_c,
+    }

-    topology.add_node(node_a)
-    topology.add_node(node_b)
-    topology.add_node(node_c)
+    topology.add_node(node_a_id)
+    topology.add_node(node_b_id)
+    topology.add_node(node_c_id)

-    topology.add_connection(create_connection(node_a_id, node_b_id))
-    topology.add_connection(create_connection(node_b_id, node_a_id))
-
-    topology.add_connection(create_connection(node_a_id, node_c_id))
-    topology.add_connection(create_connection(node_c_id, node_b_id))
+    topology.add_connection(node_a_id, node_b_id, create_connection(1))
+    topology.add_connection(node_b_id, node_a_id, create_connection(2))
+    topology.add_connection(node_a_id, node_c_id, create_connection(3))
+    topology.add_connection(node_c_id, node_b_id, create_connection(4))

    cycles = topology.get_cycles()

    # act
-    filtered_cycles = filter_cycles_by_memory(cycles, Memory.from_kb(1500))
+    filtered_cycles = filter_cycles_by_memory(
+        cycles, node_profiles, Memory.from_kb(1500)
+    )

    # assert
    assert len(filtered_cycles) == 1
@@ -127,31 +120,38 @@ def test_filter_multiple_cycles_by_memory(
    }


-def test_get_smallest_cycles(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
-):
+def test_get_smallest_cycles():
    # arrange
    node_a_id = NodeId()
    node_b_id = NodeId()
    node_c_id = NodeId()
+    topology = Topology()

-    node_a = create_node(500 * 1024, node_a_id)
-    node_b = create_node(500 * 1024, node_b_id)
-    node_c = create_node(1000 * 1024, node_c_id)
+    node_a = create_node_profile(500 * 1024)
+    node_b = create_node_profile(500 * 1024)
+    node_c = create_node_profile(1000 * 1024)
+    node_profiles = {
+        node_a_id: node_a,
+        node_b_id: node_b,
+        node_c_id: node_c,
+    }

-    topology.add_node(node_a)
-    topology.add_node(node_b)
-    topology.add_node(node_c)
+    topology.add_node(node_a_id)
+    topology.add_node(node_b_id)
+    topology.add_node(node_c_id)

-    topology.add_connection(create_connection(node_a_id, node_b_id))
-    topology.add_connection(create_connection(node_b_id, node_c_id))
-    topology.add_connection(create_connection(node_c_id, node_a_id))
-    topology.add_connection(create_connection(node_b_id, node_a_id))
+    topology.add_connection(node_a_id, node_b_id, create_connection(1))
+    topology.add_connection(node_b_id, node_a_id, create_connection(2))
+    topology.add_connection(node_a_id, node_c_id, create_connection(3))
+    topology.add_connection(node_c_id, node_b_id, create_connection(4))
+
+    cycles = [
+        [NodeWithProfile(node_id=nid, node_profile=node_profiles[nid]) for nid in cycle]
+        for cycle in topology.get_cycles()
+    ]

    # act
-    smallest_cycles = get_smallest_cycles(topology.get_cycles())
+    smallest_cycles = get_smallest_cycles(cycles)

    # assert
    assert len(smallest_cycles) == 1
@@ -168,9 +168,6 @@ def test_get_smallest_cycles(
    ],
 )
 def test_get_shard_assignments(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId], Connection],
    available_memory: tuple[int, int, int],
    total_layers: int,
    expected_layers: tuple[int, int, int],
@@ -179,27 +176,39 @@ def test_get_shard_assignments(
    node_a_id = NodeId()
    node_b_id = NodeId()
    node_c_id = NodeId()
+    topology = Topology()

-    node_a = create_node(available_memory[0] * 1024, node_a_id)
-    node_b = create_node(available_memory[1] * 1024, node_b_id)
-    node_c = create_node(available_memory[2] * 1024, node_c_id)
+    node_a = create_node_profile(available_memory[0] * 1024)
+    node_b = create_node_profile(available_memory[1] * 1024)
+    node_c = create_node_profile(available_memory[2] * 1024)
+    node_profiles = {
+        node_a_id: node_a,
+        node_b_id: node_b,
+        node_c_id: node_c,
+    }

-    topology.add_node(node_a)
-    topology.add_node(node_b)
-    topology.add_node(node_c)
+    topology.add_node(node_a_id)
+    topology.add_node(node_b_id)
+    topology.add_node(node_c_id)

-    topology.add_connection(create_connection(node_a_id, node_b_id))
-    topology.add_connection(create_connection(node_b_id, node_c_id))
-    topology.add_connection(create_connection(node_c_id, node_a_id))
-    topology.add_connection(create_connection(node_b_id, node_a_id))
+    topology.add_connection(node_a_id, node_b_id, create_connection(1))
+    topology.add_connection(node_b_id, node_c_id, create_connection(2))
+    topology.add_connection(node_c_id, node_a_id, create_connection(3))
+    topology.add_connection(node_b_id, node_a_id, create_connection(4))

    model_meta = ModelMetadata(
        model_id=ModelId("test-model"),
        pretty_name="Test Model",
        n_layers=total_layers,
        storage_size=Memory.from_kb(1000),
+        hidden_size=1000,
+        supports_tensor=True,
    )
-    cycles = topology.get_cycles()
+
+    cycles = [
+        [NodeWithProfile(node_id=nid, node_profile=node_profiles[nid]) for nid in cycle]
+        for cycle in topology.get_cycles()
+    ]
    selected_cycle = cycles[0]

    # act
@@ -228,28 +237,21 @@ def test_get_shard_assignments(
    )


-def test_get_hosts_from_subgraph(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId, int | None], Connection],
-):
+def test_get_hosts_from_subgraph():
    # arrange
    node_a_id = NodeId()
    node_b_id = NodeId()
    node_c_id = NodeId()
+    topology = Topology()

-    node_a = create_node(500, node_a_id)
-    node_b = create_node(500, node_b_id)
-    node_c = create_node(1000, node_c_id)
+    topology.add_node(node_a_id)
+    topology.add_node(node_b_id)
+    topology.add_node(node_c_id)

-    topology.add_node(node_a)
-    topology.add_node(node_b)
-    topology.add_node(node_c)
-
-    topology.add_connection(create_connection(node_a_id, node_b_id, 5001))
-    topology.add_connection(create_connection(node_b_id, node_c_id, 5002))
-    topology.add_connection(create_connection(node_c_id, node_a_id, 5003))
-    topology.add_connection(create_connection(node_b_id, node_a_id, 5004))
+    topology.add_connection(node_a_id, node_b_id, create_connection(1))
+    topology.add_connection(node_b_id, node_a_id, create_connection(2))
+    topology.add_connection(node_a_id, node_c_id, create_connection(3))
+    topology.add_connection(node_c_id, node_b_id, create_connection(4))

    # act
    hosts = get_hosts_from_subgraph(topology)
@@ -257,108 +259,47 @@ def test_get_hosts_from_subgraph(
    # assert
    assert len(hosts) == 3
    expected_hosts = [
-        Host(ip=("169.254.0.2"), port=5001),
-        Host(ip=("169.254.0.3"), port=5002),
-        Host(ip=("169.254.0.4"), port=5003),
+        Host(ip=("169.254.0.2"), port=1234),
+        Host(ip=("169.254.0.3"), port=1234),
+        Host(ip=("169.254.0.4"), port=1234),
    ]
    for expected_host in expected_hosts:
        assert expected_host in hosts


-def test_get_mlx_ibv_coordinators(
-    topology: Topology,
-    create_node: Callable[[int, NodeId | None], NodeInfo],
-    create_connection: Callable[[NodeId, NodeId, int | None], Connection],
-):
+def test_get_mlx_jaccl_coordinators():
    # arrange
    node_a_id = NodeId()
    node_b_id = NodeId()
    node_c_id = NodeId()
+    topology = Topology()

-    node_a = create_node(500 * 1024, node_a_id)
-    node_b = create_node(500 * 1024, node_b_id)
-    node_c = create_node(1000 * 1024, node_c_id)
+    topology.add_node(node_a_id)
+    topology.add_node(node_b_id)
+    topology.add_node(node_c_id)

-    conn_a_b = create_connection(node_a_id, node_b_id, 5001)
-    conn_b_a = create_connection(node_b_id, node_a_id, 5002)
-    conn_b_c = create_connection(node_b_id, node_c_id, 5003)
-    conn_c_b = create_connection(node_c_id, node_b_id, 5004)
-    conn_c_a = create_connection(node_c_id, node_a_id, 5005)
-    conn_a_c = create_connection(node_a_id, node_c_id, 5006)
+    topology.add_connection(node_a_id, node_b_id, create_connection(1))
+    topology.add_connection(node_b_id, node_a_id, create_connection(2))
+    topology.add_connection(node_a_id, node_c_id, create_connection(3))
+    topology.add_connection(node_c_id, node_b_id, create_connection(4))

-    # Update node profiles with network interfaces before adding to topology
-    assert node_a.node_profile is not None
-    assert node_b.node_profile is not None
-    assert node_c.node_profile is not None
+    conn_a_b = create_connection(1)
+    conn_b_a = create_connection(2)
+    conn_b_c = create_connection(3)
+    conn_c_b = create_connection(4)
+    conn_c_a = create_connection(5)
+    conn_a_c = create_connection(6)

-    node_a.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_a.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_a_b.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_a_c.send_back_multiaddr.ip_address,
-            ),
-        ],
-        system=node_a.node_profile.system,
-    )
-    node_b.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_b.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_b_a.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_b_c.send_back_multiaddr.ip_address,
-            ),
-        ],
-        system=node_b.node_profile.system,
-    )
-    node_c.node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=node_c.node_profile.memory,
-        network_interfaces=[
-            NetworkInterfaceInfo(
-                name="en3",
-                ip_address=conn_c_b.send_back_multiaddr.ip_address,
-            ),
-            NetworkInterfaceInfo(
-                name="en4",
-                ip_address=conn_c_a.send_back_multiaddr.ip_address,
-            ),
-        ],
-        system=node_c.node_profile.system,
-    )
-
-    topology.add_node(node_a)
-    topology.add_node(node_b)
-    topology.add_node(node_c)
-
-    topology.add_connection(conn_a_b)
-    topology.add_connection(conn_b_a)
-    topology.add_connection(conn_b_c)
-    topology.add_connection(conn_c_b)
-    topology.add_connection(conn_c_a)
-    topology.add_connection(conn_a_c)
-
-    cycle = [node_a, node_b, node_c]
+    topology.add_connection(node_a_id, node_b_id, conn_a_b)
+    topology.add_connection(node_b_id, node_a_id, conn_b_a)
+    topology.add_connection(node_b_id, node_c_id, conn_b_c)
+    topology.add_connection(node_c_id, node_b_id, conn_c_b)
+    topology.add_connection(node_c_id, node_a_id, conn_c_a)
+    topology.add_connection(node_a_id, node_c_id, conn_a_c)

    # act
-    coordinators = get_mlx_ibv_coordinators(
-        cycle, coordinator_port=5000, cycle_digraph=topology
+    coordinators = get_mlx_jaccl_coordinators(
+        node_a_id, coordinator_port=5000, cycle_digraph=topology
    )

    # assert
@@ -387,11 +328,11 @@ def test_get_mlx_ibv_coordinators(

    # Non-rank-0 nodes should use the specific IP from their connection to rank 0
    # node_b uses the IP from conn_b_a (node_b -> node_a)
-    assert coordinators[node_b_id] == (
-        f"{conn_b_a.send_back_multiaddr.ip_address}:5000"
-    ), "node_b should use the IP from conn_b_a"
+    assert coordinators[node_b_id] == (f"{conn_b_a.sink_multiaddr.ip_address}:5000"), (
+        "node_b should use the IP from conn_b_a"
+    )

    # node_c uses the IP from conn_c_a (node_c -> node_a)
-    assert coordinators[node_c_id] == (
-        f"{conn_c_a.send_back_multiaddr.ip_address}:5000"
-    ), "node_c should use the IP from conn_c_a"
+    assert coordinators[node_c_id] == (f"{conn_c_a.sink_multiaddr.ip_address}:5000"), (
+        "node_c should use the IP from conn_c_a"
+    )
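
The memory-filter tests in this file pin down one observable rule: a cycle survives only when the available memory of its nodes, added together, covers the requested amount. Below is a minimal sketch of that rule, assuming plain byte counts in place of `Memory` and `NodePerformanceProfile`. The function name `filter_cycles_by_available_bytes` and its arguments are hypothetical, and the summation criterion is inferred from the test values rather than taken from `placement_utils`.

```python
from collections.abc import Mapping, Sequence


def filter_cycles_by_available_bytes(
    cycles: Sequence[Sequence[str]],
    available_bytes: Mapping[str, int],
    required_bytes: int,
) -> list[list[str]]:
    """Keep cycles whose nodes together offer at least required_bytes."""
    kept: list[list[str]] = []
    for cycle in cycles:
        # Nodes without a known profile contribute nothing (an assumption of this sketch).
        total = sum(available_bytes.get(node_id, 0) for node_id in cycle)
        if total >= required_bytes:
            kept.append(list(cycle))
    return kept


KB = 1024
available = {"a": 500 * KB, "b": 500 * KB, "c": 1000 * KB}
cycles = [["a", "b"], ["a", "c", "b"]]

# Same numbers as test_filter_multiple_cycles_by_memory: only the
# three-node cycle reaches 1500 KB.
assert filter_cycles_by_available_bytes(cycles, available, 1500 * KB) == [
    ["a", "c", "b"]
]
```

Passing the profiles as a separate mapping, as the updated tests do, keeps the graph itself free of per-node state.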
--- a/src/exo/master/tests/test_topology.py
+++ b/src/exo/master/tests/test_topology.py
@@ -1,13 +1,14 @@
 import pytest

 from exo.shared.topology import Topology
+from exo.shared.types.common import NodeId
 from exo.shared.types.multiaddr import Multiaddr
 from exo.shared.types.profiling import (
-    MemoryPerformanceProfile,
+    MemoryUsage,
    NodePerformanceProfile,
    SystemPerformanceProfile,
 )
-from exo.shared.types.topology import Connection, ConnectionProfile, NodeId, NodeInfo
+from exo.shared.types.topology import SocketConnection


 @pytest.fixture
@@ -16,20 +17,15 @@ def topology() -> Topology:


 @pytest.fixture
-def connection() -> Connection:
-    return Connection(
-        local_node_id=NodeId(),
-        send_back_node_id=NodeId(),
-        send_back_multiaddr=Multiaddr(address="/ip4/127.0.0.1/tcp/1235"),
-        connection_profile=ConnectionProfile(
-            throughput=1000, latency=1000, jitter=1000
-        ),
+def connection() -> SocketConnection:
+    return SocketConnection(
+        sink_multiaddr=Multiaddr(address="/ip4/127.0.0.1/tcp/1235"),
    )


 @pytest.fixture
 def node_profile() -> NodePerformanceProfile:
-    memory_profile = MemoryPerformanceProfile.from_bytes(
+    memory_profile = MemoryUsage.from_bytes(
        ram_total=1000, ram_available=1000, swap_total=1000, swap_available=1000
    )
    system_profile = SystemPerformanceProfile()
@@ -43,162 +39,85 @@ def node_profile() -> NodePerformanceProfile:
    )


-@pytest.fixture
-def connection_profile() -> ConnectionProfile:
-    return ConnectionProfile(throughput=1000, latency=1000, jitter=1000)
-
-
-def test_add_node(topology: Topology, node_profile: NodePerformanceProfile):
+def test_add_node(topology: Topology):
    # arrange
    node_id = NodeId()

    # act
-    topology.add_node(NodeInfo(node_id=node_id, node_profile=node_profile))
+    topology.add_node(node_id)

    # assert
-    data = topology.get_node_profile(node_id)
-    assert data == node_profile
+    assert topology.node_is_leaf(node_id)


-def test_add_connection(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
-):
+def test_add_connection(topology: Topology, connection: SocketConnection):
    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
+    node_a = NodeId()
+    node_b = NodeId()
+
+    topology.add_node(node_a)
+    topology.add_node(node_b)
+    topology.add_connection(node_a, node_b, connection)

    # act
-    data = topology.get_connection_profile(connection)
+    data = list(conn for _, _, conn in topology.list_connections())

    # assert
-    assert data == connection.connection_profile
+    assert data == [connection]

-
-def test_update_node_profile(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
-):
-    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
-
-    new_node_profile = NodePerformanceProfile(
-        model_id="test",
-        chip_id="test",
-        friendly_name="test",
-        memory=MemoryPerformanceProfile.from_bytes(
-            ram_total=1000, ram_available=1000, swap_total=1000, swap_available=1000
-        ),
-        network_interfaces=[],
-        system=SystemPerformanceProfile(),
-    )
-
-    # act
-    topology.update_node_profile(
-        connection.local_node_id, node_profile=new_node_profile
-    )
-
-    # assert
-    data = topology.get_node_profile(connection.local_node_id)
-    assert data == new_node_profile
-
-
-def test_update_connection_profile(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
-):
-    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
-
-    new_connection_profile = ConnectionProfile(
-        throughput=2000, latency=2000, jitter=2000
-    )
-    connection = Connection(
-        local_node_id=connection.local_node_id,
-        send_back_node_id=connection.send_back_node_id,
-        send_back_multiaddr=connection.send_back_multiaddr,
-        connection_profile=new_connection_profile,
-    )
-
-    # act
-    topology.update_connection_profile(connection)
-
-    # assert
-    data = topology.get_connection_profile(connection)
-    assert data == new_connection_profile
+    assert topology.node_is_leaf(node_a)
+    assert topology.node_is_leaf(node_b)


 def test_remove_connection_still_connected(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
+    topology: Topology, connection: SocketConnection
 ):
    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
+    node_a = NodeId()
+    node_b = NodeId()
+
+    topology.add_node(node_a)
+    topology.add_node(node_b)
+    topology.add_connection(node_a, node_b, connection)

    # act
-    topology.remove_connection(connection)
+    topology.remove_connection(node_a, node_b, connection)

    # assert
-    assert topology.get_connection_profile(connection) is None
+    assert list(topology.get_all_connections_between(node_a, node_b)) == []


-def test_remove_node_still_connected(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
-):
+def test_remove_node_still_connected(topology: Topology, connection: SocketConnection):
    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
+    node_a = NodeId()
+    node_b = NodeId()
+
+    topology.add_node(node_a)
+    topology.add_node(node_b)
+    topology.add_connection(node_a, node_b, connection)
+    assert list(topology.out_edges(node_a)) == [(node_b, connection)]

    # act
-    topology.remove_node(connection.local_node_id)
+    topology.remove_node(node_b)

    # assert
-    assert topology.get_node_profile(connection.local_node_id) is None
+    assert list(topology.out_edges(node_a)) == []


-def test_list_nodes(
-    topology: Topology, node_profile: NodePerformanceProfile, connection: Connection
-):
+def test_list_nodes(topology: Topology, connection: SocketConnection):
    # arrange
-    topology.add_node(
-        NodeInfo(node_id=connection.local_node_id, node_profile=node_profile)
-    )
-    topology.add_node(
-        NodeInfo(node_id=connection.send_back_node_id, node_profile=node_profile)
-    )
-    topology.add_connection(connection)
+    node_a = NodeId()
+    node_b = NodeId()
+
+    topology.add_node(node_a)
+    topology.add_node(node_b)
+    topology.add_connection(node_a, node_b, connection)
+    assert list(topology.out_edges(node_a)) == [(node_b, connection)]

    # act
    nodes = list(topology.list_nodes())

    # assert
    assert len(nodes) == 2
-    assert all(isinstance(node, NodeInfo) for node in nodes)
-    assert {node.node_id for node in nodes} == {
-        connection.local_node_id,
-        connection.send_back_node_id,
-    }
+    assert all(isinstance(node, NodeId) for node in nodes)
+    assert {node for node in nodes} == {node_a, node_b}
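
The rewritten topology tests treat the graph as a set of bare node IDs with connections attached as `(source, sink, connection)` triples. Below is a minimal sketch of that shape, assuming string IDs and opaque connection payloads. `SketchTopology` is a hypothetical stand-in, not the `Topology` class from `exo.shared.topology`, and it leaves out `node_is_leaf`, `get_cycles` and the other methods whose semantics the tests do not spell out.

```python
from collections import defaultdict


class SketchTopology:
    """Directed multigraph keyed by node id; edge payloads are opaque objects."""

    def __init__(self) -> None:
        self._nodes: set[str] = set()
        # (source, sink) -> connections carried by that directed edge
        self._edges: dict[tuple[str, str], list[object]] = defaultdict(list)

    def add_node(self, node_id: str) -> None:
        self._nodes.add(node_id)  # adding twice is harmless

    def add_connection(self, source: str, sink: str, conn: object) -> None:
        self._nodes.update((source, sink))
        if conn not in self._edges[(source, sink)]:
            self._edges[(source, sink)].append(conn)

    def remove_connection(self, source: str, sink: str, conn: object) -> None:
        conns = self._edges.get((source, sink), [])
        if conn in conns:
            conns.remove(conn)

    def remove_node(self, node_id: str) -> None:
        self._nodes.discard(node_id)
        # Drop every edge that starts or ends at the removed node.
        for key in [k for k in self._edges if node_id in k]:
            del self._edges[key]

    def out_edges(self, source: str) -> list[tuple[str, object]]:
        return [
            (sink, conn)
            for (src, sink), conns in self._edges.items()
            if src == source
            for conn in conns
        ]

    def list_nodes(self) -> list[str]:
        return sorted(self._nodes)


topo = SketchTopology()
topo.add_connection("a", "b", "tcp/1235")
assert topo.out_edges("a") == [("b", "tcp/1235")]

# Removing the sink also removes the edge, as in test_remove_node_still_connected.
topo.remove_node("b")
assert topo.out_edges("a") == []
assert topo.list_nodes() == ["a"]
```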
--- a/src/exo/routing/tests/test_event_buffer.py
+++ b/src/exo/routing/tests/test_event_buffer.py
@@ -15,6 +15,7 @@ def buffer() -> OrderedBuffer[Event]:
    return OrderedBuffer[Event]()


+@pytest.mark.asyncio
 async def test_initial_state(buffer: OrderedBuffer[Event]):
-    """Tests that a new buffer is empty and starts at index 1."""
+    """Tests that a new buffer is empty and starts at index 0."""
    assert buffer.next_idx_to_release == 0
@@ -22,6 +23,7 @@ async def test_initial_state(buffer: OrderedBuffer[Event]):
    assert buffer.drain() == []


+@pytest.mark.asyncio
 async def test_ingest_and_drain_sequential_events(buffer: OrderedBuffer[Event]):
    """Tests ingesting and draining a simple, ordered sequence of events."""
    events = [make_indexed_event(0), make_indexed_event(1), make_indexed_event(2)]
@@ -33,6 +35,7 @@ async def test_ingest_and_drain_sequential_events(buffer: OrderedBuffer[Event]):
    assert not buffer.store


+@pytest.mark.asyncio
 async def test_ingest_out_of_order_events(buffer: OrderedBuffer[Event]):
    """Tests that out-of-order events are buffered and drained in the correct sequence."""
    event1 = make_indexed_event(0)
@@ -48,6 +51,7 @@ async def test_ingest_out_of_order_events(buffer: OrderedBuffer[Event]):
    assert buffer.next_idx_to_release == 3


+@pytest.mark.asyncio
 async def test_drain_with_gap_in_sequence(buffer: OrderedBuffer[Event]):
    """Tests that draining stops when there is a gap in the event indices."""
    event1 = make_indexed_event(0)
@@ -64,6 +68,7 @@ async def test_drain_with_gap_in_sequence(buffer: OrderedBuffer[Event]):
    assert 2 in buffer.store


+@pytest.mark.asyncio
 async def test_fill_gap_and_drain_remaining(buffer: OrderedBuffer[Event]):
    """Tests that once a gap is filled, the rest of the sequence is drained."""
    event0 = make_indexed_event(0)
@@ -82,6 +87,7 @@ async def test_fill_gap_and_drain_remaining(buffer: OrderedBuffer[Event]):
    assert buffer.next_idx_to_release == 3


+@pytest.mark.asyncio
 async def test_ingest_drops_duplicate_indices(buffer: OrderedBuffer[Event]):
    """Tests that if multiple events for the same index are ingested, the first one wins."""
    event2_first = make_indexed_event(1)
@@ -100,6 +106,7 @@ async def test_ingest_drops_duplicate_indices(buffer: OrderedBuffer[Event]):
    assert drained[1][1].event_id != event2_second[1].event_id


+@pytest.mark.asyncio
 async def test_ingest_drops_stale_events(buffer: OrderedBuffer[Event]):
    """Tests that events with an index lower than next_idx_to_release are dropped."""
    buffer.ingest(*make_indexed_event(0))
@@ -117,6 +124,7 @@ async def test_ingest_drops_stale_events(buffer: OrderedBuffer[Event]):
    assert buffer.drain() == []


+@pytest.mark.asyncio
 async def test_drain_and_ingest_with_new_sequence(buffer: OrderedBuffer[Event]):
    """Tests reusing the buffer after it has been fully drained."""
    buffer.ingest(*make_indexed_event(0))
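
The docstrings in this test file describe the buffer's contract: release events in index order starting at 0, stop draining at the first gap, keep the first event seen for an index, and drop anything older than the next index to release. Below is a minimal sketch of a buffer with that contract. `SketchOrderedBuffer` is a hypothetical name, and the `(index, item)` shape of `ingest` and `drain` is inferred from the test calls rather than from the real `OrderedBuffer`.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class SketchOrderedBuffer(Generic[T]):
    """Releases items strictly in index order, starting from index 0."""

    def __init__(self) -> None:
        self.store: dict[int, T] = {}
        self.next_idx_to_release = 0

    def ingest(self, idx: int, item: T) -> None:
        if idx < self.next_idx_to_release:
            return  # stale: this index has already been released
        # setdefault keeps the first item seen for an index
        self.store.setdefault(idx, item)

    def drain(self) -> list[tuple[int, T]]:
        released: list[tuple[int, T]] = []
        # Stop at the first missing index so order is never violated.
        while self.next_idx_to_release in self.store:
            idx = self.next_idx_to_release
            released.append((idx, self.store.pop(idx)))
            self.next_idx_to_release += 1
        return released


buf: SketchOrderedBuffer[str] = SketchOrderedBuffer()
buf.ingest(2, "c")
buf.ingest(0, "a")
assert buf.drain() == [(0, "a")]  # index 1 is missing, so draining stops

buf.ingest(1, "b")
buf.ingest(1, "duplicate")  # ignored: the first item for index 1 wins
assert buf.drain() == [(1, "b"), (2, "c")]
assert buf.next_idx_to_release == 3
```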
--- a/src/exo/shared/apply.py
+++ b/src/exo/shared/apply.py
@@ -11,10 +11,8 @@ from exo.shared.types.events import (
    IndexedEvent,
    InstanceCreated,
    InstanceDeleted,
-    NodeCreated,
    NodeDownloadProgress,
-    NodeMemoryMeasured,
-    NodePerformanceMeasured,
+    NodeGatheredInfo,
    NodeTimedOut,
    RunnerDeleted,
    RunnerStatusUpdated,
@@ -27,13 +25,23 @@ from exo.shared.types.events import (
    TopologyEdgeCreated,
    TopologyEdgeDeleted,
 )
-from exo.shared.types.profiling import NodePerformanceProfile, SystemPerformanceProfile
+from exo.shared.types.profiling import NodePerformanceProfile
 from exo.shared.types.state import State
 from exo.shared.types.tasks import Task, TaskId, TaskStatus
-from exo.shared.types.topology import NodeInfo
+from exo.shared.types.topology import RDMAConnection
 from exo.shared.types.worker.downloads import DownloadProgress
 from exo.shared.types.worker.instances import Instance, InstanceId
 from exo.shared.types.worker.runners import RunnerId, RunnerStatus
+from exo.utils.info_gatherer.info_gatherer import (
+    MacmonMetrics,
+    MacTBConnections,
+    MacTBIdentifiers,
+    MemoryUsage,
+    MiscData,
+    NodeConfig,
+    NodeNetworkInterfaces,
+    StaticNodeInformation,
+)


 def event_apply(event: Event, state: State) -> State:
@@ -47,16 +55,12 @@ def event_apply(event: Event, state: State) -> State:
            return apply_instance_created(event, state)
        case InstanceDeleted():
            return apply_instance_deleted(event, state)
-        case NodeCreated():
-            return apply_topology_node_created(event, state)
        case NodeTimedOut():
            return apply_node_timed_out(event, state)
-        case NodePerformanceMeasured():
-            return apply_node_performance_measured(event, state)
        case NodeDownloadProgress():
            return apply_node_download_progress(event, state)
-        case NodeMemoryMeasured():
-            return apply_node_memory_measured(event, state)
+        case NodeGatheredInfo():
+            return apply_node_gathered_info(event, state)
        case RunnerDeleted():
            return apply_runner_deleted(event, state)
        case RunnerStatusUpdated():
@@ -188,7 +192,7 @@ def apply_runner_deleted(event: RunnerDeleted, state: State) -> State:


 def apply_node_timed_out(event: NodeTimedOut, state: State) -> State:
-    topology = copy.copy(state.topology)
+    topology = copy.deepcopy(state.topology)
-    state.topology.remove_node(event.node_id)
+    topology.remove_node(event.node_id)
    node_profiles = {
        key: value for key, value in state.node_profiles.items() if key != event.node_id
@@ -196,8 +200,12 @@ def apply_node_timed_out(event: NodeTimedOut, state: State) -> State:
    last_seen = {
        key: value for key, value in state.last_seen.items() if key != event.node_id
    }
+    downloads = {
+        key: value for key, value in state.downloads.items() if key != event.node_id
+    }
    return state.model_copy(
        update={
+            "downloads": downloads,
            "topology": topology,
            "node_profiles": node_profiles,
            "last_seen": last_seen,
@@ -205,103 +213,69 @@ def apply_node_timed_out(event: NodeTimedOut, state: State) -> State:
    )


-def apply_node_performance_measured(
-    event: NodePerformanceMeasured, state: State
-) -> State:
-    new_profiles: Mapping[NodeId, NodePerformanceProfile] = {
-        **state.node_profiles,
-        event.node_id: event.node_profile,
-    }
-    last_seen: Mapping[NodeId, datetime] = {
-        **state.last_seen,
-        event.node_id: datetime.fromisoformat(event.when),
-    }
-    state = state.model_copy(update={"node_profiles": new_profiles})
-    topology = copy.copy(state.topology)
-    # TODO: NodeCreated
-    if not topology.contains_node(event.node_id):
-        topology.add_node(NodeInfo(node_id=event.node_id))
-    topology.update_node_profile(event.node_id, event.node_profile)
+def apply_node_gathered_info(event: NodeGatheredInfo, state: State) -> State:
+    topology = copy.deepcopy(state.topology)
+    topology.add_node(event.node_id)
+    info = event.info
+    profile = state.node_profiles.get(event.node_id, NodePerformanceProfile())
+    # TODO: should be broken up into individual events instead of this monster
+    match info:
+        case MacmonMetrics():
+            profile.system = info.system_profile
+            profile.memory = info.memory
+        case MemoryUsage():
+            profile.memory = info
+        case NodeConfig():
+            pass
+        case MiscData():
+            profile.friendly_name = info.friendly_name
+        case StaticNodeInformation():
+            profile.model_id = info.model
+            profile.chip_id = info.chip
+        # TODO: makes me slightly sad
+        case NodeNetworkInterfaces():
+            profile.network_interfaces = info.ifaces
+        case MacTBIdentifiers():
+            profile.tb_interfaces = info.idents
+        case MacTBConnections():
+            conn_map = {
+                tb_ident.domain_uuid: (nid, tb_ident.rdma_interface)
+                for nid in state.node_profiles
+                for tb_ident in state.node_profiles[nid].tb_interfaces
+            }
+            as_rdma_conns = [
+                (
+                    conn_map[tb_conn.sink_uuid][0],
+                    RDMAConnection(
+                        source_rdma_iface=conn_map[tb_conn.source_uuid][1],
+                        sink_rdma_iface=conn_map[tb_conn.sink_uuid][1],
+                    ),
+                )
+                for tb_conn in info.conns
+                if tb_conn.source_uuid in conn_map
+                if tb_conn.sink_uuid in conn_map
+            ]
+            topology.replace_all_out_tb_connections(event.node_id, as_rdma_conns)
+
+    last_seen = {**state.last_seen, event.node_id: datetime.fromisoformat(event.when)}
+    new_profiles = {**state.node_profiles, event.node_id: profile}
    return state.model_copy(
        update={
            "node_profiles": new_profiles,
-            "topology": topology,
            "last_seen": last_seen,
+            "topology": topology,
        }
    )


-def apply_node_memory_measured(event: NodeMemoryMeasured, state: State) -> State:
-    existing = state.node_profiles.get(event.node_id)
-    topology = copy.copy(state.topology)
-
-    if existing is None:
-        created = NodePerformanceProfile(
-            model_id="unknown",
-            chip_id="unknown",
-            friendly_name="Unknown",
-            memory=event.memory,
-            network_interfaces=[],
-            system=SystemPerformanceProfile(
-                # TODO: flops_fp16=0.0,
-                gpu_usage=0.0,
-                temp=0.0,
-                sys_power=0.0,
-                pcpu_usage=0.0,
-                ecpu_usage=0.0,
-                ane_power=0.0,
-            ),
-        )
-        created_profiles: Mapping[NodeId, NodePerformanceProfile] = {
-            **state.node_profiles,
-            event.node_id: created,
-        }
-        last_seen: Mapping[NodeId, datetime] = {
-            **state.last_seen,
-            event.node_id: datetime.fromisoformat(event.when),
-        }
-        if not topology.contains_node(event.node_id):
-            topology.add_node(NodeInfo(node_id=event.node_id))
-            # TODO: NodeCreated
-        topology.update_node_profile(event.node_id, created)
-        return state.model_copy(
-            update={
-                "node_profiles": created_profiles,
-                "topology": topology,
-                "last_seen": last_seen,
-            }
-        )
-
-    updated = existing.model_copy(update={"memory": event.memory})
-    updated_profiles: Mapping[NodeId, NodePerformanceProfile] = {
-        **state.node_profiles,
-        event.node_id: updated,
-    }
-    # TODO: NodeCreated
-    if not topology.contains_node(event.node_id):
-        topology.add_node(NodeInfo(node_id=event.node_id))
-    topology.update_node_profile(event.node_id, updated)
-    return state.model_copy(
-        update={"node_profiles": updated_profiles, "topology": topology}
-    )
-
-
-def apply_topology_node_created(event: NodeCreated, state: State) -> State:
-    topology = copy.copy(state.topology)
-    topology.add_node(NodeInfo(node_id=event.node_id))
-    return state.model_copy(update={"topology": topology})
-
-
 def apply_topology_edge_created(event: TopologyEdgeCreated, state: State) -> State:
-    topology = copy.copy(state.topology)
-    topology.add_connection(event.edge)
+    topology = copy.deepcopy(state.topology)
+    topology.add_connection(event.source, event.sink, event.edge)
    return state.model_copy(update={"topology": topology})


 def apply_topology_edge_deleted(event: TopologyEdgeDeleted, state: State) -> State:
-    topology = copy.copy(state.topology)
-    if not topology.contains_connection(event.edge):
-        return state
-    topology.remove_connection(event.edge)
+    topology = copy.deepcopy(state.topology)
+    topology.remove_connection(event.sink, event.source, event.edge)
    # TODO: Clean up removing the reverse connection
    return state.model_copy(update={"topology": topology})
--- a/src/exo/shared/constants.py
+++ b/src/exo/shared/constants.py
@@ -1,35 +1,47 @@
 import os
+import sys
 from pathlib import Path

-EXO_HOME_RELATIVE_PATH = os.environ.get("EXO_HOME", ".exo")
-EXO_HOME = Path.home() / EXO_HOME_RELATIVE_PATH
+_EXO_HOME_ENV = os.environ.get("EXO_HOME", None)

-EXO_MODELS_DIR_ENV = os.environ.get("EXO_MODELS_DIR")
-EXO_MODELS_DIR = Path(EXO_MODELS_DIR_ENV) if EXO_MODELS_DIR_ENV else EXO_HOME / "models"

-EXO_GLOBAL_EVENT_DB = EXO_HOME / "global_events.db"
-EXO_WORKER_EVENT_DB = EXO_HOME / "worker_events.db"
-EXO_MASTER_STATE = EXO_HOME / "master_state.json"
-EXO_WORKER_STATE = EXO_HOME / "worker_state.json"
-EXO_MASTER_LOG = EXO_HOME / "master.log"
-EXO_WORKER_LOG = EXO_HOME / "worker.log"
-EXO_LOG = EXO_HOME / "exo.log"
-EXO_TEST_LOG = EXO_HOME / "exo_test.log"
+def _get_xdg_dir(env_var: str, fallback: str) -> Path:
+    """Get the XDG directory, prioritising the EXO_HOME environment variable if it's set. On non-Linux platforms, default to ~/.exo."""

-EXO_NODE_ID_KEYPAIR = EXO_HOME / "node_id.keypair"
+    if _EXO_HOME_ENV is not None:
+        return Path.home() / _EXO_HOME_ENV

-EXO_WORKER_KEYRING_FILE = EXO_HOME / "worker_keyring"
-EXO_MASTER_KEYRING_FILE = EXO_HOME / "master_keyring"
+    if sys.platform != "linux":
+        return Path.home() / ".exo"

-EXO_IPC_DIR = EXO_HOME / "ipc"
+    xdg_value = os.environ.get(env_var, None)
+    if xdg_value is not None:
+        return Path(xdg_value) / "exo"
+    return Path.home() / fallback / "exo"
+
+
+EXO_CONFIG_HOME = _get_xdg_dir("XDG_CONFIG_HOME", ".config")
+EXO_DATA_HOME = _get_xdg_dir("XDG_DATA_HOME", ".local/share")
+EXO_CACHE_HOME = _get_xdg_dir("XDG_CACHE_HOME", ".cache")
+
+# Models directory (data)
+_EXO_MODELS_DIR_ENV = os.environ.get("EXO_MODELS_DIR", None)
+EXO_MODELS_DIR = (
+    EXO_DATA_HOME / "models"
+    if _EXO_MODELS_DIR_ENV is None
+    else Path.home() / _EXO_MODELS_DIR_ENV
+)
+
+# Log files (data/logs or cache)
+EXO_LOG = EXO_CACHE_HOME / "exo.log"
+EXO_TEST_LOG = EXO_CACHE_HOME / "exo_test.log"
+
+# Identity (config)
+EXO_NODE_ID_KEYPAIR = EXO_CONFIG_HOME / "node_id.keypair"
+EXO_CONFIG_FILE = EXO_CONFIG_HOME / "config.toml"

 # libp2p topics for event forwarding
 LIBP2P_LOCAL_EVENTS_TOPIC = "worker_events"
 LIBP2P_GLOBAL_EVENTS_TOPIC = "global_events"
 LIBP2P_ELECTION_MESSAGES_TOPIC = "election_message"
 LIBP2P_COMMANDS_TOPIC = "commands"
-
-# lower bounds define timeouts for flops and memory bandwidth - these are the values for the M1 chip.
-LB_TFLOPS = 2.3
-LB_MEMBW_GBPS = 68
-LB_DISK_GBPS = 1.5
--- a/src/exo/shared/logging.py
+++ b/src/exo/shared/logging.py
@@ -24,6 +24,8 @@ class _InterceptHandler(logging.Handler):
        except ValueError:
            level = record.levelno

+        return
+
        logger.opt(depth=3, exception=record.exc_info).log(level, record.getMessage())


--- a/src/exo/shared/models/model_cards.py
+++ b/src/exo/shared/models/model_cards.py
@@ -51,6 +51,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="DeepSeek V3.1 (4-bit)",
            storage_size=Memory.from_gb(378),
            n_layers=61,
+            hidden_size=7168,
+            supports_tensor=True,
        ),
    ),
    "deepseek-v3.1-8bit": ModelCard(
@@ -64,6 +66,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="DeepSeek V3.1 (8-bit)",
            storage_size=Memory.from_gb(713),
            n_layers=61,
+            hidden_size=7168,
+            supports_tensor=True,
        ),
    ),
    # "deepseek-v3.2": ModelCard(
@@ -135,6 +139,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Kimi K2 Instruct (4-bit)",
            storage_size=Memory.from_gb(578),
            n_layers=61,
+            hidden_size=7168,
+            supports_tensor=True,
        ),
    ),
    "kimi-k2-thinking": ModelCard(
@@ -148,6 +154,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Kimi K2 Thinking (4-bit)",
            storage_size=Memory.from_gb(658),
            n_layers=61,
+            hidden_size=7168,
+            supports_tensor=True,
        ),
    ),
    # llama-3.1
@@ -162,6 +170,38 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.1 8B (4-bit)",
            storage_size=Memory.from_mb(4423),
            n_layers=32,
+            hidden_size=4096,
+            supports_tensor=True,
+        ),
+    ),
+    "llama-3.1-8b-8bit": ModelCard(
+        short_id="llama-3.1-8b-8bit",
+        model_id=ModelId("mlx-community/Meta-Llama-3.1-8B-Instruct-8bit"),
+        name="Llama 3.1 8B (8-bit)",
+        description="""Llama 3.1 is a large language model trained on the Llama 3.1 dataset.""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Meta-Llama-3.1-8B-Instruct-8bit"),
+            pretty_name="Llama 3.1 8B (8-bit)",
+            storage_size=Memory.from_mb(8540),
+            n_layers=32,
+            hidden_size=4096,
+            supports_tensor=True,
+        ),
+    ),
+    "llama-3.1-8b-bf16": ModelCard(
+        short_id="llama-3.1-8b-bf16",
+        model_id=ModelId("mlx-community/Meta-Llama-3.1-8B-Instruct-bf16"),
+        name="Llama 3.1 8B (BF16)",
+        description="""Llama 3.1 is a large language model trained on the Llama 3.1 dataset.""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Meta-Llama-3.1-8B-Instruct-bf16"),
+            pretty_name="Llama 3.1 8B (BF16)",
+            storage_size=Memory.from_mb(16100),
+            n_layers=32,
+            hidden_size=4096,
+            supports_tensor=True,
        ),
    ),
    "llama-3.1-70b": ModelCard(
@@ -175,6 +215,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.1 70B (4-bit)",
            storage_size=Memory.from_mb(38769),
            n_layers=80,
+            hidden_size=8192,
+            supports_tensor=True,
        ),
    ),
    # llama-3.2
@@ -189,6 +231,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.2 1B (4-bit)",
            storage_size=Memory.from_mb(696),
            n_layers=16,
+            hidden_size=2048,
+            supports_tensor=True,
        ),
    ),
    "llama-3.2-3b": ModelCard(
@@ -202,6 +246,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.2 3B (4-bit)",
            storage_size=Memory.from_mb(1777),
            n_layers=28,
+            hidden_size=3072,
+            supports_tensor=True,
        ),
    ),
    "llama-3.2-3b-8bit": ModelCard(
@@ -215,6 +261,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.2 3B (8-bit)",
            storage_size=Memory.from_mb(3339),
            n_layers=28,
+            hidden_size=3072,
+            supports_tensor=True,
        ),
    ),
    # llama-3.3
@@ -229,6 +277,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.3 70B",
            storage_size=Memory.from_mb(38769),
            n_layers=80,
+            hidden_size=8192,
+            supports_tensor=True,
        ),
    ),
    "llama-3.3-70b-8bit": ModelCard(
@@ -242,6 +292,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.3 70B (8-bit)",
            storage_size=Memory.from_mb(73242),
            n_layers=80,
+            hidden_size=8192,
+            supports_tensor=True,
        ),
    ),
    "llama-3.3-70b-fp16": ModelCard(
@@ -255,20 +307,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Llama 3.3 70B (FP16)",
            storage_size=Memory.from_mb(137695),
            n_layers=80,
-        ),
-    ),
-    # phi-3
-    "phi-3-mini": ModelCard(
-        short_id="phi-3-mini",
-        model_id=ModelId("mlx-community/Phi-3-mini-128k-instruct-4bit"),
-        name="Phi 3 Mini 128k (4-bit)",
-        description="""Phi 3 Mini is a large language model trained on the Phi 3 Mini dataset.""",
-        tags=[],
-        metadata=ModelMetadata(
-            model_id=ModelId("mlx-community/Phi-3-mini-128k-instruct-4bit"),
-            pretty_name="Phi 3 Mini 128k (4-bit)",
-            storage_size=Memory.from_mb(2099),
-            n_layers=32,
+            hidden_size=8192,
+            supports_tensor=True,
        ),
    ),
    # qwen3
@@ -283,6 +323,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 0.6B (4-bit)",
            storage_size=Memory.from_mb(327),
            n_layers=28,
+            hidden_size=1024,
+            supports_tensor=False,
        ),
    ),
    "qwen3-0.6b-8bit": ModelCard(
@@ -296,6 +338,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 0.6B (8-bit)",
            storage_size=Memory.from_mb(666),
            n_layers=28,
+            hidden_size=1024,
+            supports_tensor=False,
        ),
    ),
    "qwen3-30b": ModelCard(
@@ -309,6 +353,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 30B A3B (4-bit)",
            storage_size=Memory.from_mb(16797),
            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
        ),
    ),
    "qwen3-30b-8bit": ModelCard(
@@ -322,6 +368,68 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 30B A3B (8-bit)",
            storage_size=Memory.from_mb(31738),
            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
+        ),
+    ),
+    "qwen3-80b-a3B-4bit": ModelCard(
+        short_id="qwen3-80b-a3B-4bit",
+        model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit"),
+        name="Qwen3 80B A3B (4-bit)",
+        description="""Qwen3 80B""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Instruct-4bit"),
+            pretty_name="Qwen3 80B A3B (4-bit)",
+            storage_size=Memory.from_mb(44800),
+            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
+        ),
+    ),
+    "qwen3-80b-a3B-8bit": ModelCard(
+        short_id="qwen3-80b-a3B-8bit",
+        model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit"),
+        name="Qwen3 80B A3B (8-bit)",
+        description="""Qwen3 80B""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Instruct-8bit"),
+            pretty_name="Qwen3 80B A3B (8-bit)",
+            storage_size=Memory.from_mb(84700),
+            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
+        ),
+    ),
+    "qwen3-80b-a3B-thinking-4bit": ModelCard(
+        short_id="qwen3-80b-a3B-thinking-4bit",
+        model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Thinking-4bit"),
+        name="Qwen3 80B A3B Thinking (4-bit)",
+        description="""Qwen3 80B Reasoning model""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Thinking-4bit"),
+            pretty_name="Qwen3 80B A3B Thinking (4-bit)",
+            storage_size=Memory.from_mb(84700),
+            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
+        ),
+    ),
+    "qwen3-80b-a3B-thinking-8bit": ModelCard(
+        short_id="qwen3-80b-a3B-thinking-8bit",
+        model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Thinking-8bit"),
+        name="Qwen3 80B A3B Thinking (8-bit)",
+        description="""Qwen3 80B Reasoning model""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/Qwen3-Next-80B-A3B-Thinking-8bit"),
+            pretty_name="Qwen3 80B A3B Thinking (8-bit)",
+            storage_size=Memory.from_mb(84700),
+            n_layers=48,
+            hidden_size=2048,
+            supports_tensor=True,
        ),
    ),
    "qwen3-235b-a22b-4bit": ModelCard(
@@ -335,6 +443,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 235B A22B (4-bit)",
            storage_size=Memory.from_gb(132),
            n_layers=94,
+            hidden_size=4096,
+            supports_tensor=True,
        ),
    ),
    "qwen3-235b-a22b-8bit": ModelCard(
@@ -348,6 +458,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 235B A22B (8-bit)",
            storage_size=Memory.from_gb(250),
            n_layers=94,
+            hidden_size=4096,
+            supports_tensor=True,
        ),
    ),
    "qwen3-coder-480b-a35b-4bit": ModelCard(
@@ -361,6 +473,8 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 Coder 480B A35B (4-bit)",
            storage_size=Memory.from_gb(270),
            n_layers=62,
+            hidden_size=6144,
+            supports_tensor=True,
        ),
    ),
    "qwen3-coder-480b-a35b-8bit": ModelCard(
@@ -374,77 +488,84 @@ MODEL_CARDS: dict[str, ModelCard] = {
            pretty_name="Qwen3 Coder 480B A35B (8-bit)",
            storage_size=Memory.from_gb(540),
            n_layers=62,
+            hidden_size=6144,
+            supports_tensor=True,
        ),
    ),
-    # granite
-    "granite-3.3-2b": ModelCard(
-        short_id="granite-3.3-2b",
-        model_id=ModelId("mlx-community/granite-3.3-2b-instruct-fp16"),
-        name="Granite 3.3 2B (FP16)",
-        description="""Granite-3.3-2B-Instruct is a 2-billion parameter 128K context length language model fine-tuned for improved reasoning and instruction-following capabilities.""",
+    # gpt-oss
+    "gpt-oss-120b-MXFP4-Q8": ModelCard(
+        short_id="gpt-oss-120b-MXFP4-Q8",
+        model_id=ModelId("mlx-community/gpt-oss-120b-MXFP4-Q8"),
+        name="GPT-OSS 120B (MXFP4-Q8, MLX)",
+        description="""OpenAI's GPT-OSS 120B is a 117B-parameter Mixture-of-Experts model designed for high-reasoning and general-purpose use; this variant is a 4-bit MLX conversion for Apple Silicon.""",
        tags=[],
        metadata=ModelMetadata(
-            model_id=ModelId("mlx-community/granite-3.3-2b-instruct-fp16"),
-            pretty_name="Granite 3.3 2B (FP16)",
-            storage_size=Memory.from_mb(4951),
-            n_layers=40,
+            model_id=ModelId("mlx-community/gpt-oss-120b-MXFP4-Q8"),
+            pretty_name="GPT-OSS 120B (MXFP4-Q8, MLX)",
+            storage_size=Memory.from_kb(68_996_301),
+            n_layers=36,
+            hidden_size=2880,
+            supports_tensor=True,
        ),
    ),
-    # "granite-3.3-8b": ModelCard(
-    #     short_id="granite-3.3-8b",
-    #     model_id=ModelId("mlx-community/granite-3.3-8b-instruct-fp16"),
-    #     name="Granite 3.3 8B",
-    #     description="""Granite-3.3-8B-Instruct is a 8-billion parameter 128K context length language model fine-tuned for improved reasoning and instruction-following capabilities.""",
+    "gpt-oss-20b-4bit": ModelCard(
+        short_id="gpt-oss-20b-4bit",
+        model_id=ModelId("mlx-community/gpt-oss-20b-MXFP4-Q4"),
+        name="GPT-OSS 20B (MXFP4-Q4, MLX)",
+        description="""OpenAI's GPT-OSS 20B is a medium-sized MoE model for lower-latency and local or specialized use cases; this MLX variant uses MXFP4 4-bit quantization.""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/gpt-oss-20b-MXFP4-Q4"),
+            pretty_name="GPT-OSS 20B (MXFP4-Q4, MLX)",
+            storage_size=Memory.from_kb(11_744_051),
+            n_layers=24,
+            hidden_size=2880,
+            supports_tensor=True,
+        ),
+    ),
+    # Needs to be quantized g32 or g16.
+    "glm-4.5-air-8bit": ModelCard(
+        short_id="glm-4.5-air-8bit",
+        model_id=ModelId("mlx-community/GLM-4.5-Air-8bit"),
+        name="GLM 4.5 Air 8bit",
+        description="""GLM 4.5 Air 8bit""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/GLM-4.5-Air-8bit"),
+            pretty_name="GLM 4.5 Air 8bit",
+            storage_size=Memory.from_gb(114),
+            n_layers=46,
+            hidden_size=4096,
+            supports_tensor=False,
+        ),
+    ),
+    "glm-4.5-air-bf16": ModelCard(
+        short_id="glm-4.5-air-bf16",
+        model_id=ModelId("mlx-community/GLM-4.5-Air-bf16"),
+        name="GLM 4.5 Air bf16",
+        description="""GLM 4.5 Air bf16""",
+        tags=[],
+        metadata=ModelMetadata(
+            model_id=ModelId("mlx-community/GLM-4.5-Air-bf16"),
+            pretty_name="GLM 4.5 Air bf16",
+            storage_size=Memory.from_gb(214),
+            n_layers=46,
+            hidden_size=4096,
+            supports_tensor=True,
+        ),
+    ),
+    # "devstral-2-123b-instruct-2512-8bit": ModelCard(
+    #     short_id="devstral-2-123b-instruct-2512-8bit",
+    #     model_id=ModelId("mlx-community/Devstral-2-123B-Instruct-2512-8bit"),
+    #     name="Devstral 2 123B Instruct 2512 (8-bit, MLX)",
+    #     description="""Mistral AI's Devstral 2 123B Instruct (2512) is an agentic coding model.""",
    #     tags=[],
    #     metadata=ModelMetadata(
-    #         model_id=ModelId("mlx-community/granite-3.3-8b-instruct-fp16"),
-    #         pretty_name="Granite 3.3 8B",
-    #         storage_size=Memory.from_kb(15958720),
-    #         n_layers=40,
-    #     ),
-    # ),
-    # smol-lm
-    # "smol-lm-135m": ModelCard(
-    #     short_id="smol-lm-135m",
-    #     model_id="mlx-community/SmolLM-135M-4bit",
-    #     name="Smol LM 135M",
-    #     description="""SmolLM is a series of state-of-the-art small language models available in three sizes: 135M, 360M, and 1.7B parameters. """,
-    #     tags=[],
-    #     metadata=ModelMetadata(
-    #         model_id=ModelId("mlx-community/SmolLM-135M-4bit"),
-    #         pretty_name="Smol LM 135M",
-    #         storage_size=Memory.from_kb(73940),
-    #         n_layers=30,
-    #     ),
-    # ),
-    # gpt-oss
-    # "gpt-oss-120b-MXFP4-Q8": ModelCard(
-    #     short_id="gpt-oss-120b-MXFP4-Q8",
-    #     model_id=ModelId("mlx-community/gpt-oss-120b-MXFP4-Q8"),
-    #     name="GPT-OSS 120B (MXFP4-Q8, MLX)",
-    #     description="""OpenAI's GPT-OSS 120B is a 117B-parameter Mixture-of-Experts model designed for high-reasoning and general-purpose use; this variant is a 4-bit MLX conversion for Apple Silicon.""",
-    #     tags=[],
-    #     metadata=ModelMetadata(
-    #         model_id=ModelId("mlx-community/gpt-oss-120b-MXFP4-Q8"),
-    #         pretty_name="GPT-OSS 120B (MXFP4-Q8, MLX)",
-    #         storage_size=Memory.from_kb(68_996_301),
-    #         n_layers=36,
-    #         hidden_size=2880,
-    #         supports_tensor=True,
-    #     ),
-    # ),
-    # "gpt-oss-20b-4bit": ModelCard(
-    #     short_id="gpt-oss-20b-4bit",
-    #     model_id=ModelId("mlx-community/gpt-oss-20b-MXFP4-Q4"),
-    #     name="GPT-OSS 20B (MXFP4-Q4, MLX)",
-    #     description="""OpenAI's GPT-OSS 20B is a medium-sized MoE model for lower-latency and local or specialized use cases; this MLX variant uses MXFP4 4-bit quantization.""",
-    #     tags=[],
-    #     metadata=ModelMetadata(
-    #         model_id=ModelId("mlx-community/gpt-oss-20b-MXFP4-Q4"),
-    #         pretty_name="GPT-OSS 20B (MXFP4-Q4, MLX)",
-    #         storage_size=Memory.from_kb(11_744_051),
-    #         n_layers=24,
-    #         hidden_size=2880,
+    #         model_id=ModelId("mlx-community/Devstral-2-123B-Instruct-2512-8bit"),
+    #         pretty_name="Devstral 2 123B Instruct 2512 (8-bit, MLX)",
+    #         storage_size=Memory.from_kb(133_000_000),
+    #         n_layers=88,
+    #         hidden_size=12288,
    #         supports_tensor=True,
    #     ),
    # ),
--- a/src/exo/shared/models/model_meta.py
+++ b/src/exo/shared/models/model_meta.py
@@ -1,9 +1,12 @@
 from typing import Annotated

+import aiofiles
+import aiofiles.os as aios
 from huggingface_hub import model_info
 from loguru import logger
 from pydantic import BaseModel, Field

+from exo.shared.models.model_cards import MODEL_CARDS
 from exo.shared.types.memory import Memory
 from exo.shared.types.models import ModelId, ModelMetadata
 from exo.worker.download.download_utils import (
@@ -23,6 +26,7 @@ class ConfigData(BaseModel):
    n_layers: Annotated[int, Field(ge=0)] | None = None  # Sometimes used
    num_decoder_layers: Annotated[int, Field(ge=0)] | None = None  # Transformer models
    decoder_layers: Annotated[int, Field(ge=0)] | None = None  # Some architectures
+    hidden_size: Annotated[int, Field(ge=0)] | None = None

    @property
    def layer_count(self) -> int:
@@ -48,7 +52,7 @@ class ConfigData(BaseModel):
 async def get_config_data(model_id: str) -> ConfigData:
    """Downloads and parses config.json for a model."""
    target_dir = (await ensure_models_dir()) / str(model_id).replace("/", "--")
-    await target_dir.mkdir(parents=True, exist_ok=True)
+    await aios.makedirs(target_dir, exist_ok=True)
    config_path = await download_file_with_retry(
        model_id,
        "main",
@@ -58,14 +62,14 @@ async def get_config_data(model_id: str) -> ConfigData:
            f"Downloading config.json for {model_id}: {curr_bytes}/{total_bytes} ({is_renamed=})"
        ),
    )
-    async with await config_path.open("r") as f:
+    async with aiofiles.open(config_path, "r") as f:
        return ConfigData.model_validate_json(await f.read())


 async def get_safetensors_size(model_id: str) -> Memory:
    """Gets model size from safetensors index or falls back to HF API."""
    target_dir = (await ensure_models_dir()) / str(model_id).replace("/", "--")
-    await target_dir.mkdir(parents=True, exist_ok=True)
+    await aios.makedirs(target_dir, exist_ok=True)
    index_path = await download_file_with_retry(
        model_id,
        "main",
@@ -75,7 +79,7 @@ async def get_safetensors_size(model_id: str) -> Memory:
            f"Downloading model.safetensors.index.json for {model_id}: {curr_bytes}/{total_bytes} ({is_renamed=})"
        ),
    )
-    async with await index_path.open("r") as f:
+    async with aiofiles.open(index_path, "r") as f:
        index_data = ModelSafetensorsIndex.model_validate_json(await f.read())

    metadata = index_data.metadata
@@ -104,10 +108,19 @@ async def _get_model_meta(model_id: str) -> ModelMetadata:
    config_data = await get_config_data(model_id)
    num_layers = config_data.layer_count
    mem_size_bytes = await get_safetensors_size(model_id)
+    model_card = next(
+        (card for card in MODEL_CARDS.values() if card.model_id == ModelId(model_id)),
+        None,
+    )

    return ModelMetadata(
        model_id=ModelId(model_id),
-        pretty_name=model_id,
+        pretty_name=model_card.name if model_card is not None else model_id,
        storage_size=mem_size_bytes,
        n_layers=num_layers,
+        hidden_size=config_data.hidden_size or 0,
+        # TODO: custom models (no matching model card) currently default to supports_tensor=False. We could add a dynamic test for this.
+        supports_tensor=model_card.metadata.supports_tensor
+        if model_card is not None
+        else False,
    )
--- a/src/exo/shared/tests/conftest.py
+++ b/src/exo/shared/tests/conftest.py
@@ -1,5 +1,8 @@
 """Pytest configuration and shared fixtures for shared package tests."""

+import asyncio
+from typing import Generator
+
 import pytest
 from _pytest.logging import LogCaptureFixture
 from loguru import logger
@@ -9,6 +12,21 @@ from exo.shared.types.models import ModelId, ModelMetadata
 from exo.shared.types.worker.shards import PipelineShardMetadata, ShardMetadata


+@pytest.fixture(scope="session")
+def event_loop() -> Generator[asyncio.AbstractEventLoop, None, None]:
+    """Create an event loop for the test session."""
+    loop = asyncio.new_event_loop()
+    asyncio.set_event_loop(loop)
+    yield loop
+    loop.close()
+
+
+@pytest.fixture(autouse=True)
+def reset_event_loop():
+    """Placeholder for resetting event loop state between tests; currently a no-op."""
+    # TODO: no reset is performed yet; the fixture body is empty.
+
+
 def get_pipeline_shard_metadata(
    model_id: ModelId, device_rank: int, world_size: int = 1
 ) -> ShardMetadata:
@@ -18,6 +36,8 @@ def get_pipeline_shard_metadata(
            pretty_name=str(model_id),
            storage_size=Memory.from_mb(100000),
            n_layers=32,
+            hidden_size=1000,
+            supports_tensor=True,
        ),
        device_rank=device_rank,
        world_size=world_size,
@@ -34,7 +54,7 @@ def caplog(caplog: LogCaptureFixture):
        format="{message}",
        level=0,
        filter=lambda record: record["level"].no >= caplog.handler.level,
-        enqueue=True,
+        enqueue=True,  # Set to 'True' if your test is spawning child processes.
    )
    yield caplog
    logger.remove(handler_id)
--- a/src/exo/shared/tests/test_apply/test_apply_node_download.py
+++ b/src/exo/shared/tests/test_apply/test_apply_node_download.py
@@ -19,7 +19,7 @@ def test_apply_node_download_progress():
        NodeDownloadProgress(download_progress=event), state
    )

-    assert new_state == State(downloads={NodeId("node-1"): [event]})
+    assert new_state.downloads == {NodeId("node-1"): [event]}


 def test_apply_two_node_download_progress():
@@ -39,7 +39,4 @@ def test_apply_two_node_download_progress():
        NodeDownloadProgress(download_progress=event2), state
    )

-    # TODO: This test is failing. We should support the following:
-    # 1. Downloading multiple models concurrently on the same node (one per runner is fine).
-    # 2. Downloading a model, it completes, then downloading a different model on the same node.
-    assert new_state == State(downloads={NodeId("node-1"): [event1, event2]})
+    assert new_state.downloads == {NodeId("node-1"): [event1, event2]}
--- a/src/exo/shared/tests/test_election.py
+++ b/src/exo/shared/tests/test_election.py
@@ -46,6 +46,7 @@ def fast_election_timeout(monkeypatch: pytest.MonkeyPatch):
    monkeypatch.setattr("exo.shared.election.DEFAULT_ELECTION_TIMEOUT", 0.1)


+@pytest.mark.anyio
 async def test_single_round_broadcasts_and_updates_seniority_on_self_win() -> None:
    """
    Start a round by injecting an ElectionMessage with higher clock.
@@ -101,6 +102,7 @@ async def test_single_round_broadcasts_and_updates_seniority_on_self_win() -> No
    assert election.seniority == 2


+@pytest.mark.anyio
 async def test_peer_with_higher_seniority_wins_and_we_switch_master() -> None:
    """
    If a peer with clearly higher seniority participates in the round, they should win.
@@ -154,6 +156,7 @@ async def test_peer_with_higher_seniority_wins_and_we_switch_master() -> None:
    assert election.seniority == 0


+@pytest.mark.anyio
 async def test_ignores_older_messages() -> None:
    """
    Messages with a lower clock than the current round are ignored by the receiver.
@@ -202,6 +205,7 @@ async def test_ignores_older_messages() -> None:
    # Not asserting on the result; focus is on ignore behavior.


+@pytest.mark.anyio
 async def test_two_rounds_emit_two_broadcasts_and_increment_clock() -> None:
    """
    Two successive rounds → two broadcasts. Second round triggered by a higher-clock message.
@@ -247,6 +251,7 @@ async def test_two_rounds_emit_two_broadcasts_and_increment_clock() -> None:
    # Not asserting on who won; just that both rounds were broadcast.


+@pytest.mark.anyio
 async def test_promotion_new_seniority_counts_participants() -> None:
    """
    When we win against two peers in the same round, our seniority becomes
@@ -295,6 +300,7 @@ async def test_promotion_new_seniority_counts_participants() -> None:
    assert election.seniority == 3


+@pytest.mark.anyio
 async def test_connection_message_triggers_new_round_broadcast() -> None:
    """
    A connection message increments the clock and starts a new campaign.
@@ -346,6 +352,7 @@ async def test_connection_message_triggers_new_round_broadcast() -> None:
    # After cancellation (before election finishes), no seniority changes asserted here.


+@pytest.mark.anyio
 async def test_tie_breaker_prefers_node_with_more_commands_seen() -> None:
    """
    With equal seniority, the node that has seen more commands should win the election.
--- a/src/exo/shared/tests/test_state_serialization.py
+++ b/src/exo/shared/tests/test_state_serialization.py
@@ -1,7 +1,7 @@
 from exo.shared.types.common import NodeId
 from exo.shared.types.multiaddr import Multiaddr
 from exo.shared.types.state import State
-from exo.shared.types.topology import Connection
+from exo.shared.types.topology import SocketConnection


 def test_state_serialization_roundtrip() -> None:
@@ -11,17 +11,16 @@ def test_state_serialization_roundtrip() -> None:
    node_a = NodeId("node-a")
    node_b = NodeId("node-b")

-    connection = Connection(
-        local_node_id=node_a,
-        send_back_node_id=node_b,
-        send_back_multiaddr=Multiaddr(address="/ip4/127.0.0.1/tcp/10001"),
+    connection = SocketConnection(
+        sink_multiaddr=Multiaddr(address="/ip4/127.0.0.1/tcp/10001"),
    )

    state = State()
-    state.topology.add_connection(connection)
+    state.topology.add_connection(node_a, node_b, connection)

    json_repr = state.model_dump_json()
    restored_state = State.model_validate_json(json_repr)

-    assert state.topology.to_snapshot() == restored_state.topology.to_snapshot()
+    assert state.topology.to_snapshot().nodes == restored_state.topology.to_snapshot().nodes
+    assert set(state.topology.to_snapshot().connections) == set(restored_state.topology.to_snapshot().connections)
    assert restored_state.model_dump_json() == json_repr
--- a/src/exo/shared/tests/test_xdg_paths.py
+++ b/src/exo/shared/tests/test_xdg_paths.py
@@ -0,0 +1,118 @@
+"""Tests for XDG Base Directory Specification compliance."""
+
+import os
+import sys
+from pathlib import Path
+from unittest import mock
+
+
+def test_xdg_paths_on_linux():
+    """Test that XDG paths are used on Linux when XDG env vars are set."""
+    with (
+        mock.patch.dict(
+            os.environ,
+            {
+                "XDG_CONFIG_HOME": "/tmp/test-config",
+                "XDG_DATA_HOME": "/tmp/test-data",
+                "XDG_CACHE_HOME": "/tmp/test-cache",
+            },
+            clear=False,
+        ),
+        mock.patch.object(sys, "platform", "linux"),
+    ):
+        # Re-import to pick up mocked values
+        import importlib
+
+        import exo.shared.constants as constants
+
+        importlib.reload(constants)
+
+        assert Path("/tmp/test-config/exo") == constants.EXO_CONFIG_HOME
+        assert Path("/tmp/test-data/exo") == constants.EXO_DATA_HOME
+        assert Path("/tmp/test-cache/exo") == constants.EXO_CACHE_HOME
+
+
+def test_xdg_default_paths_on_linux():
+    """Test that XDG default paths are used on Linux when env vars are not set."""
+    # Remove XDG env vars and EXO_HOME
+    env = {
+        k: v
+        for k, v in os.environ.items()
+        if not k.startswith("XDG_") and k != "EXO_HOME"
+    }
+    with (
+        mock.patch.dict(os.environ, env, clear=True),
+        mock.patch.object(sys, "platform", "linux"),
+    ):
+        import importlib
+
+        import exo.shared.constants as constants
+
+        importlib.reload(constants)
+
+        home = Path.home()
+        assert home / ".config" / "exo" == constants.EXO_CONFIG_HOME
+        assert home / ".local/share" / "exo" == constants.EXO_DATA_HOME
+        assert home / ".cache" / "exo" == constants.EXO_CACHE_HOME
+
+
+def test_legacy_exo_home_takes_precedence():
+    """Test that EXO_HOME environment variable takes precedence for backward compatibility."""
+    with mock.patch.dict(
+        os.environ,
+        {
+            "EXO_HOME": ".custom-exo",
+            "XDG_CONFIG_HOME": "/tmp/test-config",
+        },
+        clear=False,
+    ):
+        import importlib
+
+        import exo.shared.constants as constants
+
+        importlib.reload(constants)
+
+        home = Path.home()
+        assert home / ".custom-exo" == constants.EXO_CONFIG_HOME
+        assert home / ".custom-exo" == constants.EXO_DATA_HOME
+
+
+def test_macos_uses_traditional_paths():
+    """Test that macOS uses traditional ~/.exo directory."""
+    # Remove EXO_HOME to ensure we test the default behavior
+    env = {k: v for k, v in os.environ.items() if k != "EXO_HOME"}
+    with (
+        mock.patch.dict(os.environ, env, clear=True),
+        mock.patch.object(sys, "platform", "darwin"),
+    ):
+        import importlib
+
+        import exo.shared.constants as constants
+
+        importlib.reload(constants)
+
+        home = Path.home()
+        assert home / ".exo" == constants.EXO_CONFIG_HOME
+        assert home / ".exo" == constants.EXO_DATA_HOME
+        assert home / ".exo" == constants.EXO_CACHE_HOME
+
+
+def test_node_id_in_config_dir():
+    """Test that node ID keypair is in the config directory."""
+    import exo.shared.constants as constants
+
+    assert constants.EXO_NODE_ID_KEYPAIR.parent == constants.EXO_CONFIG_HOME
+
+
+def test_models_in_data_dir():
+    """Test that models directory is in the data directory."""
+    # Clear EXO_MODELS_DIR to test default behavior
+    env = {k: v for k, v in os.environ.items() if k != "EXO_MODELS_DIR"}
+    with mock.patch.dict(os.environ, env, clear=True):
+        import importlib
+
+        import exo.shared.constants as constants
+
+        importlib.reload(constants)
+
+        assert constants.EXO_MODELS_DIR.parent == constants.EXO_DATA_HOME
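
Reviewer note: the resolution order these tests exercise can be sketched as a pure function. This is a minimal sketch, not the contents of `exo.shared.constants`, which this diff does not show. The names `resolve_exo_dirs` and `ExoDirs` are hypothetical. The tests assert only the config and data directories under `EXO_HOME`, so applying it to the cache directory as well is an assumption.

```python
from pathlib import Path
from typing import Mapping, NamedTuple


class ExoDirs(NamedTuple):
    # Hypothetical container for the three directories the tests check.
    config: Path
    data: Path
    cache: Path


def resolve_exo_dirs(env: Mapping[str, str], platform: str, home: Path) -> ExoDirs:
    """Resolve config/data/cache directories in the order the tests imply."""
    # 1. Legacy override: EXO_HOME is taken relative to the home directory.
    legacy = env.get("EXO_HOME")
    if legacy:
        root = home / legacy
        # Assumption: the cache dir follows EXO_HOME too (not asserted by the tests).
        return ExoDirs(root, root, root)

    # 2. Non-Linux platforms (e.g. "darwin") keep the traditional ~/.exo.
    if platform != "linux":
        root = home / ".exo"
        return ExoDirs(root, root, root)

    # 3. Linux: XDG variables, falling back to the XDG spec defaults.
    #    An empty variable is treated as unset, as the spec requires.
    config = Path(env.get("XDG_CONFIG_HOME") or home / ".config")
    data = Path(env.get("XDG_DATA_HOME") or home / ".local/share")
    cache = Path(env.get("XDG_CACHE_HOME") or home / ".cache")
    return ExoDirs(config / "exo", data / "exo", cache / "exo")
```

Passing `env`, `platform`, and `home` explicitly would let the tests avoid `importlib.reload` and patching `sys.platform`. Whether that fits the real module layout is a design question for the author.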
--- a/src/exo/shared/topology.py
+++ b/src/exo/shared/topology.py
@@ -1,203 +1,215 @@
 import contextlib
+from collections.abc import Mapping, Sequence
+from dataclasses import dataclass, field
 from typing import Iterable

 import rustworkx as rx
 from pydantic import BaseModel, ConfigDict

 from exo.shared.types.common import NodeId
-from exo.shared.types.profiling import ConnectionProfile, NodePerformanceProfile
-from exo.shared.types.topology import Connection, NodeInfo
+from exo.shared.types.topology import RDMAConnection, SocketConnection


 class TopologySnapshot(BaseModel):
-    nodes: list[NodeInfo]
-    connections: list[Connection]
+    nodes: Sequence[NodeId]
+    connections: Iterable[tuple[NodeId, NodeId, SocketConnection | RDMAConnection]]

-    model_config = ConfigDict(frozen=True, extra="forbid", strict=True)
+    model_config = ConfigDict(frozen=True, extra="forbid")


+@dataclass
 class Topology:
-    def __init__(self) -> None:
-        self._graph: rx.PyDiGraph[NodeInfo, Connection] = rx.PyDiGraph()
-        self._node_id_to_rx_id_map: dict[NodeId, int] = dict()
-        self._rx_id_to_node_id_map: dict[int, NodeId] = dict()
-        self._edge_id_to_rx_id_map: dict[Connection, int] = dict()
+    # the _graph can be used as an int -> NodeId map.
+    _graph: rx.PyDiGraph[NodeId, SocketConnection | RDMAConnection] = field(
+        init=False, default_factory=rx.PyDiGraph
+    )
+    _vertex_indices: dict[NodeId, int] = field(init=False, default_factory=dict)

    def to_snapshot(self) -> TopologySnapshot:
        return TopologySnapshot(
-            nodes=list(self.list_nodes()),
-            connections=list(self.list_connections()),
+            nodes=list(self.list_nodes()), connections=self.list_connections()
        )

    @classmethod
    def from_snapshot(cls, snapshot: TopologySnapshot) -> "Topology":
        topology = cls()

-        for node in snapshot.nodes:
+        for node_id in snapshot.nodes:
            with contextlib.suppress(ValueError):
-                topology.add_node(node)
+                topology.add_node(node_id)

-        for connection in snapshot.connections:
-            topology.add_connection(connection)
+        for source, sink, conn in snapshot.connections:
+            topology.add_connection(source, sink, conn)

        return topology

-    def add_node(self, node: NodeInfo) -> None:
-        if node.node_id in self._node_id_to_rx_id_map:
+    def add_node(self, node_id: NodeId) -> None:
+        if node_id in self._vertex_indices:
            return
-        rx_id = self._graph.add_node(node)
-        self._node_id_to_rx_id_map[node.node_id] = rx_id
-        self._rx_id_to_node_id_map[rx_id] = node.node_id
+        rx_id = self._graph.add_node(node_id)
+        self._vertex_indices[node_id] = rx_id

    def node_is_leaf(self, node_id: NodeId) -> bool:
        return (
-            node_id in self._node_id_to_rx_id_map
-            and len(self._graph.neighbors(self._node_id_to_rx_id_map[node_id])) == 1
+            node_id in self._vertex_indices
+            and len(self._graph.neighbors(self._vertex_indices[node_id])) <= 1
        )

    def neighbours(self, node_id: NodeId) -> list[NodeId]:
        return [
-            self._rx_id_to_node_id_map[rx_id]
-            for rx_id in self._graph.neighbors(self._node_id_to_rx_id_map[node_id])
+            self._graph[rx_id]
+            for rx_id in self._graph.neighbors(self._vertex_indices[node_id])
        ]

-    def out_edges(self, node_id: NodeId) -> list[tuple[NodeId, Connection]]:
-        if node_id not in self._node_id_to_rx_id_map:
+    def out_edges(
+        self, node_id: NodeId
+    ) -> Iterable[tuple[NodeId, SocketConnection | RDMAConnection]]:
+        if node_id not in self._vertex_indices:
            return []
-        return [
-            (self._rx_id_to_node_id_map[nid], conn)
-            for _, nid, conn in self._graph.out_edges(
-                self._node_id_to_rx_id_map[node_id]
-            )
-        ]
+        return (
+            (self._graph[nid], conn)
+            for _, nid, conn in self._graph.out_edges(self._vertex_indices[node_id])
+        )

    def contains_node(self, node_id: NodeId) -> bool:
-        return node_id in self._node_id_to_rx_id_map
-
-    def contains_connection(self, connection: Connection) -> bool:
-        return connection in self._edge_id_to_rx_id_map
+        return node_id in self._vertex_indices

    def add_connection(
        self,
-        connection: Connection,
+        source: NodeId,
+        sink: NodeId,
+        connection: SocketConnection | RDMAConnection,
    ) -> None:
-        if connection.local_node_id not in self._node_id_to_rx_id_map:
-            self.add_node(NodeInfo(node_id=connection.local_node_id))
-        if connection.send_back_node_id not in self._node_id_to_rx_id_map:
-            self.add_node(NodeInfo(node_id=connection.send_back_node_id))
-
-        if connection in self._edge_id_to_rx_id_map:
+        if connection in self.get_all_connections_between(source, sink):
            return

-        src_id = self._node_id_to_rx_id_map[connection.local_node_id]
-        sink_id = self._node_id_to_rx_id_map[connection.send_back_node_id]
+        if source not in self._vertex_indices:
+            self.add_node(source)
+        if sink not in self._vertex_indices:
+            self.add_node(sink)

-        rx_id = self._graph.add_edge(src_id, sink_id, connection)
-        self._edge_id_to_rx_id_map[connection] = rx_id
+        src_id = self._vertex_indices[source]
+        sink_id = self._vertex_indices[sink]

-    def list_nodes(self) -> Iterable[NodeInfo]:
-        return (self._graph[i] for i in self._graph.node_indices())
+        _ = self._graph.add_edge(src_id, sink_id, connection)

-    def list_connections(self) -> Iterable[Connection]:
-        return (connection for _, _, connection in self._graph.weighted_edge_list())
+    def get_all_connections_between(
+        self, source: NodeId, sink: NodeId
+    ) -> Iterable[SocketConnection | RDMAConnection]:
+        if source not in self._vertex_indices:
+            return []
+        if sink not in self._vertex_indices:
+            return []

-    def get_node_profile(self, node_id: NodeId) -> NodePerformanceProfile | None:
+        src_id = self._vertex_indices[source]
+        sink_id = self._vertex_indices[sink]
        try:
-            rx_idx = self._node_id_to_rx_id_map[node_id]
-            return self._graph.get_node_data(rx_idx).node_profile
-        except KeyError:
-            return None
+            return self._graph.get_all_edge_data(src_id, sink_id)
+        except rx.NoEdgeBetweenNodes:
+            return []

-    def update_node_profile(
-        self, node_id: NodeId, node_profile: NodePerformanceProfile
-    ) -> None:
-        rx_idx = self._node_id_to_rx_id_map[node_id]
-        self._graph[rx_idx].node_profile = node_profile
+    def list_nodes(self) -> Iterable[NodeId]:
+        return self._graph.nodes()

-    def update_connection_profile(self, connection: Connection) -> None:
-        rx_idx = self._edge_id_to_rx_id_map[connection]
-        self._graph.update_edge_by_index(rx_idx, connection)
+    def map_connections(
+        self,
+    ) -> Mapping[NodeId, Mapping[NodeId, Sequence[SocketConnection | RDMAConnection]]]:
+        base: dict[NodeId, dict[NodeId, list[SocketConnection | RDMAConnection]]] = {}
+        for src_id, sink_id, connection in self._graph.weighted_edge_list():
+            source = self._graph[src_id]
+            sink = self._graph[sink_id]
+            if source not in base:
+                base[source] = {}
+            if sink not in base[source]:
+                base[source][sink] = []
+            base[source][sink].append(connection)
+        return base

-    def get_connection_profile(
-        self, connection: Connection
-    ) -> ConnectionProfile | None:
-        try:
-            rx_idx = self._edge_id_to_rx_id_map[connection]
-            return self._graph.get_edge_data_by_index(rx_idx).connection_profile
-        except KeyError:
-            return None
+    def list_connections(
+        self,
+    ) -> Iterable[tuple[NodeId, NodeId, SocketConnection | RDMAConnection]]:
+        return (
+            (
+                self._graph[src_id],
+                self._graph[sink_id],
+                connection,
+            )
+            for src_id, sink_id, connection in self._graph.weighted_edge_list()
+        )

    def remove_node(self, node_id: NodeId) -> None:
-        if node_id not in self._node_id_to_rx_id_map:
+        if node_id not in self._vertex_indices:
            return

-        for connection in self.list_connections():
-            if (
-                connection.local_node_id == node_id
-                or connection.send_back_node_id == node_id
-            ):
-                self.remove_connection(connection)
-
-        rx_idx = self._node_id_to_rx_id_map[node_id]
+        rx_idx = self._vertex_indices[node_id]
        self._graph.remove_node(rx_idx)

-        del self._node_id_to_rx_id_map[node_id]
-        del self._rx_id_to_node_id_map[rx_idx]
+        del self._vertex_indices[node_id]

-    def remove_connection(self, connection: Connection) -> None:
-        if connection not in self._edge_id_to_rx_id_map:
+    def replace_all_out_tb_connections(
+        self, source: NodeId, new_connections: Sequence[tuple[NodeId, RDMAConnection]]
+    ) -> None:
+        for conn_idx in self._graph.out_edge_indices(self._vertex_indices[source]):
+            if isinstance(self._graph.get_edge_data_by_index(conn_idx), RDMAConnection):
+                self._graph.remove_edge_from_index(conn_idx)
+        for sink, conn in new_connections:
+            self.add_connection(source, sink, conn)
+
+    def remove_connection(
+        self, source: NodeId, sink: NodeId, edge: SocketConnection | RDMAConnection
+    ) -> None:
+        if source not in self._vertex_indices or sink not in self._vertex_indices:
            return
-        rx_idx = self._edge_id_to_rx_id_map[connection]
-        self._graph.remove_edge_from_index(rx_idx)
-        del self._edge_id_to_rx_id_map[connection]
+        for conn_idx in self._graph.edge_indices_from_endpoints(
+            self._vertex_indices[source], self._vertex_indices[sink]
+        ):
+            if self._graph.get_edge_data_by_index(conn_idx) == edge:
+                self._graph.remove_edge_from_index(conn_idx)

-    def get_cycles(self) -> list[list[NodeInfo]]:
+    def get_cycles(self) -> list[list[NodeId]]:
        cycle_idxs = rx.simple_cycles(self._graph)
-        cycles: list[list[NodeInfo]] = []
+        cycles: list[list[NodeId]] = []
        for cycle_idx in cycle_idxs:
            cycle = [self._graph[idx] for idx in cycle_idx]
            cycles.append(cycle)

        return cycles

-    def get_cycles_tb(self) -> list[list[NodeInfo]]:
+    def get_cycles_tb(self) -> list[list[NodeId]]:
        tb_edges = [
            (u, v, conn)
            for u, v, conn in self._graph.weighted_edge_list()
            if conn.is_thunderbolt()
        ]

-        tb_graph: rx.PyDiGraph[NodeInfo, Connection] = rx.PyDiGraph()
+        tb_graph: rx.PyDiGraph[NodeId, SocketConnection] = rx.PyDiGraph()
        tb_graph.add_nodes_from(self._graph.nodes())

        for u, v, conn in tb_edges:
-            tb_graph.add_edge(u, v, conn)
+            if isinstance(conn, SocketConnection):
+                tb_graph.add_edge(u, v, conn)

        cycle_idxs = rx.simple_cycles(tb_graph)
-        cycles: list[list[NodeInfo]] = []
+        cycles: list[list[NodeId]] = []
        for cycle_idx in cycle_idxs:
            cycle = [tb_graph[idx] for idx in cycle_idx]
            cycles.append(cycle)

        return cycles

-    def get_subgraph_from_nodes(self, nodes: list[NodeInfo]) -> "Topology":
-        node_idxs = [node.node_id for node in nodes]
-        rx_idxs = [self._node_id_to_rx_id_map[idx] for idx in node_idxs]
+    def get_subgraph_from_nodes(self, node_ids: list[NodeId]) -> "Topology":
+        rx_idxs = [self._vertex_indices[idx] for idx in node_ids]
        topology = Topology()
        for rx_idx in rx_idxs:
            topology.add_node(self._graph[rx_idx])
-        for connection in self.list_connections():
-            if (
-                connection.local_node_id in node_idxs
-                and connection.send_back_node_id in node_idxs
-            ):
-                topology.add_connection(connection)
+        for source, sink, connection in self.list_connections():
+            if source in node_ids and sink in node_ids:
+                topology.add_connection(source, sink, connection)
        return topology

-    def is_thunderbolt_cycle(self, cycle: list[NodeInfo]) -> bool:
-        node_idxs = [node.node_id for node in cycle]
-        rx_idxs = [self._node_id_to_rx_id_map[idx] for idx in node_idxs]
+    def is_thunderbolt_cycle(self, cycle: list[NodeId]) -> bool:
+        node_idxs = list(cycle)
+        rx_idxs = [self._vertex_indices[idx] for idx in node_idxs]
        for rid in rx_idxs:
            for neighbor_rid in self._graph.neighbors(rid):
                if neighbor_rid not in rx_idxs:
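
Reviewer note: the new `add_connection(source, sink, connection)` contract moves the endpoints out of the edge object and deduplicates per `(source, sink)` pair. The sketch below shows that contract with plain dictionaries in place of `rustworkx`. `MiniTopology` and `Conn` are hypothetical stand-ins, not the classes in this diff.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Conn:
    """Hypothetical stand-in for SocketConnection: carries only the sink address."""

    sink_addr: str


@dataclass
class MiniTopology:
    # (source, sink) -> edge payloads; parallel edges between a pair are allowed.
    _edges: dict[tuple[str, str], list[Conn]] = field(
        default_factory=lambda: defaultdict(list)
    )

    def add_connection(self, source: str, sink: str, conn: Conn) -> None:
        # An equal payload on the same directed pair is a no-op.
        if conn in self._edges[(source, sink)]:
            return
        self._edges[(source, sink)].append(conn)

    def list_connections(self) -> list[tuple[str, str, Conn]]:
        # Flatten to (source, sink, payload) triples, like the new snapshot format.
        return [(s, k, c) for (s, k), conns in self._edges.items() for c in conns]
```

Because the payload no longer names its endpoints, the same `Conn` value can sit on both `a -> b` and `b -> a`. Code that compares edge lists should therefore compare the full triples, as the updated serialization test does with `set(...)`.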
--- a/src/exo/shared/types/api.py
+++ b/src/exo/shared/types/api.py
@@ -5,7 +5,8 @@ from pydantic import BaseModel, Field, field_validator
 from pydantic_core import PydanticUseDefault

 from exo.shared.types.common import CommandId
-from exo.shared.types.models import ModelId
+from exo.shared.types.memory import Memory
+from exo.shared.types.models import ModelId, ModelMetadata
 from exo.shared.types.worker.instances import Instance, InstanceId, InstanceMeta
 from exo.shared.types.worker.shards import Sharding

@@ -51,6 +52,10 @@ class ChatCompletionMessage(BaseModel):
    function_call: dict[str, Any] | None = None


+class BenchChatCompletionMessage(ChatCompletionMessage):
+    pass
+
+
 class TopLogprobItem(BaseModel):
    token: str
    logprob: float
@@ -113,6 +118,18 @@ class ChatCompletionResponse(BaseModel):
    service_tier: str | None = None


+class GenerationStats(BaseModel):
+    prompt_tps: float
+    generation_tps: float
+    prompt_tokens: int
+    generation_tokens: int
+    peak_memory_usage: Memory
+
+
+class BenchChatCompletionResponse(ChatCompletionResponse):
+    generation_stats: GenerationStats | None = None
+
+
 class ChatCompletionTaskParams(BaseModel):
    model: str
    frequency_penalty: float | None = None
@@ -135,6 +152,10 @@ class ChatCompletionTaskParams(BaseModel):
    user: str | None = None


+class BenchChatCompletionTaskParams(ChatCompletionTaskParams):
+    pass
+
+
 class PlaceInstanceParams(BaseModel):
    model_id: str
    sharding: Sharding = Sharding.Pipeline
@@ -174,6 +195,7 @@ class DeleteInstanceTaskParams(BaseModel):
 class CreateInstanceResponse(BaseModel):
    message: str
    command_id: CommandId
+    model_meta: ModelMetadata


 class DeleteInstanceResponse(BaseModel):
--- a/src/exo/shared/types/chunks.py
+++ b/src/exo/shared/types/chunks.py
@@ -1,5 +1,6 @@
 from enum import Enum

+from exo.shared.types.api import GenerationStats
 from exo.utils.pydantic_ext import TaggedModel

 from .api import FinishReason
@@ -20,6 +21,7 @@ class TokenChunk(BaseChunk):
    text: str
    token_id: int
    finish_reason: FinishReason | None = None
+    stats: GenerationStats | None = None


 class ImageChunk(BaseChunk):
--- a/src/exo/shared/types/events.py
+++ b/src/exo/shared/types/events.py
@@ -2,14 +2,14 @@ from datetime import datetime

 from pydantic import Field

-from exo.shared.topology import Connection, NodePerformanceProfile
+from exo.shared.topology import SocketConnection
 from exo.shared.types.chunks import GenerationChunk
 from exo.shared.types.common import CommandId, Id, NodeId, SessionId
-from exo.shared.types.profiling import MemoryPerformanceProfile
 from exo.shared.types.tasks import Task, TaskId, TaskStatus
 from exo.shared.types.worker.downloads import DownloadProgress
 from exo.shared.types.worker.instances import Instance, InstanceId
 from exo.shared.types.worker.runners import RunnerId, RunnerStatus
+from exo.utils.info_gatherer.info_gatherer import GatheredInfo
 from exo.utils.pydantic_ext import CamelCaseModel, TaggedModel


@@ -76,25 +76,15 @@ class RunnerDeleted(BaseEvent):
    runner_id: RunnerId


-# TODO
-class NodeCreated(BaseEvent):
-    node_id: NodeId
-
-
 class NodeTimedOut(BaseEvent):
    node_id: NodeId


-class NodePerformanceMeasured(BaseEvent):
+# TODO: bikeshed this name
+class NodeGatheredInfo(BaseEvent):
    node_id: NodeId
    when: str  # this is a manually cast datetime overrode by the master when the event is indexed, rather than the local time on the device
-    node_profile: NodePerformanceProfile
-
-
-class NodeMemoryMeasured(BaseEvent):
-    node_id: NodeId
-    when: str  # this is a manually cast datetime overrode by the master when the event is indexed, rather than the local time on the device
-    memory: MemoryPerformanceProfile
+    info: GatheredInfo  # NB: this model is UNTAGGED!!! be warned for ser/de errors.


 class NodeDownloadProgress(BaseEvent):
@@ -107,11 +97,15 @@ class ChunkGenerated(BaseEvent):


 class TopologyEdgeCreated(BaseEvent):
-    edge: Connection
+    source: NodeId
+    sink: NodeId
+    edge: SocketConnection


 class TopologyEdgeDeleted(BaseEvent):
-    edge: Connection
+    source: NodeId
+    sink: NodeId
+    edge: SocketConnection


 Event = (
@@ -125,10 +119,8 @@ Event = (
    | InstanceDeleted
    | RunnerStatusUpdated
    | RunnerDeleted
-    | NodeCreated
    | NodeTimedOut
-    | NodePerformanceMeasured
-    | NodeMemoryMeasured
+    | NodeGatheredInfo
    | NodeDownloadProgress
    | ChunkGenerated
    | TopologyEdgeCreated
--- a/src/exo/shared/types/models.py
+++ b/src/exo/shared/types/models.py
@@ -14,3 +14,5 @@ class ModelMetadata(CamelCaseModel):
    pretty_name: str
    storage_size: Memory
    n_layers: PositiveInt
+    hidden_size: PositiveInt
+    supports_tensor: bool
--- a/src/exo/shared/types/multiaddr.py
+++ b/src/exo/shared/types/multiaddr.py
@@ -1,10 +1,11 @@
 import re
 from typing import ClassVar

-from pydantic import BaseModel, computed_field, field_validator
+from pydantic import BaseModel, ConfigDict, computed_field, field_validator


 class Multiaddr(BaseModel):
+    model_config = ConfigDict(frozen=True)
    address: str

    PATTERNS: ClassVar[list[str]] = [
--- a/src/exo/shared/types/profiling.py
+++ b/src/exo/shared/types/profiling.py
@@ -1,12 +1,14 @@
+from collections.abc import Sequence
 from typing import Self

 import psutil

 from exo.shared.types.memory import Memory
+from exo.shared.types.thunderbolt import TBIdentifier
 from exo.utils.pydantic_ext import CamelCaseModel


-class MemoryPerformanceProfile(CamelCaseModel):
+class MemoryUsage(CamelCaseModel):
    ram_total: Memory
    ram_available: Memory
    swap_total: Memory
@@ -44,7 +46,6 @@ class SystemPerformanceProfile(CamelCaseModel):
    sys_power: float = 0.0
    pcpu_usage: float = 0.0
    ecpu_usage: float = 0.0
-    ane_power: float = 0.0


 class NetworkInterfaceInfo(CamelCaseModel):
@@ -53,15 +54,16 @@ class NetworkInterfaceInfo(CamelCaseModel):


 class NodePerformanceProfile(CamelCaseModel):
-    model_id: str
-    chip_id: str
-    friendly_name: str
-    memory: MemoryPerformanceProfile
-    network_interfaces: list[NetworkInterfaceInfo] = []
-    system: SystemPerformanceProfile
+    model_id: str = "Unknown"
+    chip_id: str = "Unknown"
+    friendly_name: str = "Unknown"
+    memory: MemoryUsage = MemoryUsage.from_bytes(
+        ram_total=0, ram_available=0, swap_total=0, swap_available=0
+    )
+    network_interfaces: Sequence[NetworkInterfaceInfo] = []
+    tb_interfaces: Sequence[TBIdentifier] = []
+    system: SystemPerformanceProfile = SystemPerformanceProfile()


 class ConnectionProfile(CamelCaseModel):
-    throughput: float
-    latency: float
-    jitter: float
+    pass
--- a/src/exo/shared/types/tasks.py
+++ b/src/exo/shared/types/tasks.py
@@ -40,6 +40,10 @@ class LoadModel(BaseTask):  # emitted by Worker
    pass


+class ConnectToGroup(BaseTask):  # emitted by Worker
+    pass
+
+
 class StartWarmup(BaseTask):  # emitted by Worker
    pass

@@ -57,5 +61,11 @@ class Shutdown(BaseTask):  # emitted by Worker


 Task = (
-    CreateRunner | DownloadModel | LoadModel | StartWarmup | ChatCompletion | Shutdown
+    CreateRunner
+    | DownloadModel
+    | ConnectToGroup
+    | LoadModel
+    | StartWarmup
+    | ChatCompletion
+    | Shutdown
 )
--- a/src/exo/shared/types/thunderbolt.py
+++ b/src/exo/shared/types/thunderbolt.py
@@ -0,0 +1,75 @@
+import anyio
+from pydantic import BaseModel, Field
+
+from exo.utils.pydantic_ext import CamelCaseModel
+
+
+class TBConnection(CamelCaseModel):
+    source_uuid: str
+    sink_uuid: str
+
+
+class TBIdentifier(CamelCaseModel):
+    rdma_interface: str
+    domain_uuid: str
+
+
+## Intentionally minimal, only collecting data we care about - there's a lot more
+
+
+class TBReceptacleTag(BaseModel, extra="ignore"):
+    receptacle_id_key: str | None = None
+
+
+class TBConnectivityItem(BaseModel, extra="ignore"):
+    domain_uuid_key: str | None = None
+
+
+class TBConnectivityData(BaseModel, extra="ignore"):
+    domain_uuid_key: str | None = None
+    items: list[TBConnectivityItem] | None = Field(None, alias="_items")
+    receptacle_1_tag: TBReceptacleTag | None = None
+
+    def ident(self, ifaces: dict[str, str]) -> TBIdentifier | None:
+        if (
+            self.domain_uuid_key is None
+            or self.receptacle_1_tag is None
+            or self.receptacle_1_tag.receptacle_id_key is None
+        ):
+            return None
+        tag = f"Thunderbolt {self.receptacle_1_tag.receptacle_id_key}"
+        assert tag in ifaces  # doesn't need to be an assertion, but I'm confident
+        # if tag not in ifaces: return None
+        iface = f"rdma_{ifaces[tag]}"
+        return TBIdentifier(rdma_interface=iface, domain_uuid=self.domain_uuid_key)
+
+    def conn(self) -> TBConnection | None:
+        if self.domain_uuid_key is None or self.items is None:
+            return None
+
+        sink_key = next(
+            (
+                item.domain_uuid_key
+                for item in self.items
+                if item.domain_uuid_key is not None
+            ),
+            None,
+        )
+        if sink_key is None:
+            return None
+
+        return TBConnection(source_uuid=self.domain_uuid_key, sink_uuid=sink_key)
+
+
+class TBConnectivity(BaseModel, extra="ignore"):
+    SPThunderboltDataType: list[TBConnectivityData] = []
+
+    @classmethod
+    async def gather(cls) -> list[TBConnectivityData] | None:
+        proc = await anyio.run_process(
+            ["system_profiler", "SPThunderboltDataType", "-json"], check=False
+        )
+        if proc.returncode != 0:
+            return None
+        # Saving you from PascalCase while avoiding too much pydantic
+        return TBConnectivity.model_validate_json(proc.stdout).SPThunderboltDataType
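
Reviewer note: the pairing logic in `TBConnectivityData.conn()` can be checked without `system_profiler` or pydantic. The sketch below uses only the standard library and the key names from this diff (`domain_uuid_key`, `_items`). The `SAMPLE` payload is a made-up placeholder, not captured `system_profiler` output, and `tb_link` is a hypothetical helper name.

```python
import json
from typing import Optional, Tuple

# Placeholder payload shaped like the fields the models above read.
SAMPLE = """
{"SPThunderboltDataType": [
  {"domain_uuid_key": "AAAA", "_items": [{"name": "hub"}, {"domain_uuid_key": "BBBB"}]},
  {"domain_uuid_key": "CCCC"}
]}
"""


def tb_link(entry: dict) -> Optional[Tuple[str, str]]:
    """Return (source_uuid, sink_uuid) for one bus entry, or None if unconnected."""
    source = entry.get("domain_uuid_key")
    items = entry.get("_items")
    if source is None or items is None:
        return None
    # Take the first attached item that reports a domain UUID.
    sink = next(
        (
            item["domain_uuid_key"]
            for item in items
            if item.get("domain_uuid_key") is not None
        ),
        None,
    )
    return None if sink is None else (source, sink)


links = []
for entry in json.loads(SAMPLE)["SPThunderboltDataType"]:
    link = tb_link(entry)
    if link is not None:
        links.append(link)
```

With this placeholder input, only the first bus yields a link, because the second has no `_items`. A test fixture holding real `system_profiler` output would be the more convincing check.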
--- a/src/exo/shared/types/topology.py
+++ b/src/exo/shared/types/topology.py
@@ -1,37 +1,32 @@
-from exo.shared.types.common import NodeId
+from enum import Enum
+
+from loguru import logger
+
 from exo.shared.types.multiaddr import Multiaddr
-from exo.shared.types.profiling import ConnectionProfile, NodePerformanceProfile
-from exo.utils.pydantic_ext import CamelCaseModel
+from exo.utils.pydantic_ext import FrozenModel


-class NodeInfo(CamelCaseModel):
-    node_id: NodeId
-    node_profile: NodePerformanceProfile | None = None
-
-
-class Connection(CamelCaseModel):
-    local_node_id: NodeId
-    send_back_node_id: NodeId
-    send_back_multiaddr: Multiaddr
-    connection_profile: ConnectionProfile | None = None
-
-    def __hash__(self) -> int:
-        return hash(
-            (
-                self.local_node_id,
-                self.send_back_node_id,
-                self.send_back_multiaddr.address,
-            )
-        )
-
-    def __eq__(self, other: object) -> bool:
-        if not isinstance(other, Connection):
-            raise ValueError("Cannot compare Connection with non-Connection")
-        return (
-            self.local_node_id == other.local_node_id
-            and self.send_back_node_id == other.send_back_node_id
-            and self.send_back_multiaddr == other.send_back_multiaddr
-        )
+class RDMAConnection(FrozenModel):
+    source_rdma_iface: str
+    sink_rdma_iface: str

    def is_thunderbolt(self) -> bool:
-        return str(self.send_back_multiaddr.ipv4_address).startswith("169.254")
+        logger.warning("duh")
+        return True
+
+
+# TODO
+class LinkType(str, Enum):
+    Thunderbolt = "Thunderbolt"
+    Ethernet = "Ethernet"
+    WiFi = "WiFi"
+
+
+class SocketConnection(FrozenModel):
+    sink_multiaddr: Multiaddr
+
+    def __hash__(self):
+        return hash(self.sink_multiaddr.ip_address)
+
+    def is_thunderbolt(self) -> bool:
+        return str(self.sink_multiaddr.ipv4_address).startswith("169.254")
--- a/src/exo/shared/types/worker/instances.py
+++ b/src/exo/shared/types/worker/instances.py
@@ -25,12 +25,13 @@ class BaseInstance(TaggedModel):


 class MlxRingInstance(BaseInstance):
-    hosts: list[Host]
+    hosts_by_node: dict[NodeId, list[Host]]
+    ephemeral_port: int


 class MlxJacclInstance(BaseInstance):
-    ibv_devices: list[list[str | None]]
-    ibv_coordinators: dict[NodeId, str]
+    jaccl_devices: list[list[str | None]]
+    jaccl_coordinators: dict[NodeId, str]


 # TODO: Single node instance
--- a/src/exo/shared/types/worker/runner_response.py
+++ b/src/exo/shared/types/worker/runner_response.py
@@ -1,4 +1,4 @@
-from exo.shared.types.api import FinishReason
+from exo.shared.types.api import FinishReason, GenerationStats
 from exo.utils.pydantic_ext import TaggedModel


@@ -15,6 +15,7 @@ class GenerationResponse(BaseRunnerResponse):
    token: int
    # logprobs: list[float] | None = None # too big. we can change to be top-k
    finish_reason: FinishReason | None = None
+    stats: GenerationStats | None = None


 class FinishedResponse(BaseRunnerResponse):