Compare commits

..

806 Commits

Author SHA1 Message Date
Evan
879d178057 add kimi tool parsing 2026-01-21 16:53:38 +00:00
Evan
daa31b4472 implement mlx-lm tool calling 2026-01-21 16:53:38 +00:00
ciaranbor
6a9251b920 Add mflux type stubs (#1234)
## Motivation

Simplify image generation review
2026-01-21 15:07:42 +00:00
rltakashige
758464703d Fix GPT OSS tensor sharding with upstream MLX LM (#1223)
## Motivation
MLX LM has given GPT OSS a shard method, but MLX does not have an update
to match.

2026-01-20 18:24:54 +00:00
rltakashige
9e2179c848 Register original layer in CustomMlxLayer (#1229)
## Motivation
Kimi K2 Thinking Pipeline RDMA was broken before.

## Why It Works
No clue tbh

## Test Plan

### Manual Testing
Kimi K2 Thinking and GPT OSS work at the same time on Pipeline RDMA.
Needs exo bench to check more thoroughly

### Automated Testing
Layer composition tests still pass.
2026-01-20 18:20:01 +00:00
Evan Quiney
22b5d836ef swap all instances of model_id: str for model_id: ModelId (#1221)
This change uses the stronger typed ModelId, and introduces some
convenience methods. It also cleans up some code left over from #1204.

## Changes

`model_id: str -> model_id: ModelId`
`repo_id: str -> model_id: ModelId`

Introduces methods on ModelId, in particular ModelId.normalize() to
replace `/` with `--`.
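
A minimal sketch of the idea (field and method names here are illustrative; the real `ModelId` in exo may differ):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelId:
    """Sketch of a strongly typed model id; the real class has more helpers."""

    value: str  # e.g. "mlx-community/Llama-3.2-3B-Instruct-4bit"

    def normalize(self) -> str:
        # Replace "/" with "--" so the id can be used as a directory name.
        return self.value.replace("/", "--")
```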

This PR did introduce some circular imports, so it moves some code
around to limit them.

## Test Plan

Tests still pass, types still check. As this is about metadata, I
haven't tested inference.
2026-01-20 17:38:06 +00:00
Alex Cheema
ea9c6d6bdf Remove dead local paths code from download_shard (#1227)
## Motivation

The `download_progress_for_local_path` function and the "Handle local
paths" code block in `download_shard` are dead code that cannot be
reached in normal usage. The code checks if `model_id` (e.g.,
"mlx-community/Llama-3.2-3B-Instruct-4bit") exists as a filesystem path,
but model IDs are constrained to HuggingFace repo format and there's no
API pathway to pass local paths.

## Changes

- Removed `download_progress_for_local_path()` function (45 lines)
- Removed the "Handle local paths" block in `download_shard()` (7 lines)

## Why It Works

This code was added in PR #669 as part of a "feature-local-models"
branch, but the feature was never fully integrated. The check
`aios.path.exists(str(shard.model_card.model_id))` would only return
true if a directory literally named
"mlx-community/Llama-3.2-3B-Instruct-4bit" existed in the cwd, which
doesn't happen in practice. Offline caching is already handled by
`fetch_file_list_with_cache`.

## Test Plan

### Manual Testing
- Run exo normally and verify downloads still work

### Automated Testing
- Existing tests pass (this code had no test coverage)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 17:07:27 +00:00
Alex Cheema
4ea66d427b Reduce download log spam (#1225)
## Motivation

When `skip_download=True`, exo was logging a lot of unnecessary messages during periodic download status checks. This resulted in spammy logs that made it hard to see important messages.

## Changes

- Only log "Downloading ... with allow_patterns=..." when actually downloading (not when skip_download is true)
- Changed periodic download progress check logs from INFO to DEBUG level

## Why It Works

The `skip_download=True` parameter is used when checking download status without actually downloading. By guarding the log behind `if not skip_download:`, we avoid logging on every status check. Changing the periodic emitting logs to DEBUG level reduces noise while still keeping them available for debugging.
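
A rough sketch of the guard (the function signature and logger here are illustrative, not the exact code):

```python
import logging

logger = logging.getLogger(__name__)

async def check_or_download(shard, allow_patterns, skip_download: bool) -> None:
    if not skip_download:
        # Only announce a real download, not a periodic status check.
        logger.info("Downloading %s with allow_patterns=%s", shard, allow_patterns)
    else:
        # Status-only checks are demoted to DEBUG so -v/-vv can still surface them.
        logger.debug("Checking download progress for %s", shard)
```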

## Test Plan

### Manual Testing
- Run exo and observe that logs are less spammy during normal operation
- Use -v or -vv flags to see DEBUG logs when needed

### Automated Testing
- Existing tests cover this code path
2026-01-20 16:57:05 +00:00
rltakashige
8b709e68b2 Mark slow tests as slow (#1220)
2026-01-20 15:03:46 +00:00
Evan Quiney
4da6eeb11f fix a test broken by #1204 (#1219)
bad merge broke a test - fix it
2026-01-20 14:56:20 +00:00
Evan
3d2eee4884 quiet localhost log
this log is just noise - remove it
2026-01-20 14:51:26 +00:00
Evan
116558839e don't clear mdns discovered connections
the pingers currently remove mdns-discovered connections - these systems
should be independent
2026-01-20 14:46:20 +00:00
Evan Quiney
d4f551c602 Simplify model cards (#1204)
## Motivation

We have a lot of unneeded data in the model card - let's just keep the
necessary stuff and add more data back when we need it

## Test Plan

EXO still runs! (pipeline on 2)

Co-authored-by: rltakashige <rl.takashige@gmail.com>
2026-01-20 11:01:19 +00:00
Alex Cheema
176ab5ba40 Add GLM-4.7-Flash model cards (4bit, 5bit, 6bit, 8bit) (#1214)
## Motivation

Add support for GLM-4.7-Flash, a lighter variant of GLM-4.7 with the
`glm4_moe_lite` architecture. These models are smaller and faster while
maintaining good performance.

## Changes

1. **Added 4 new model cards** for GLM-4.7-Flash variants:
   - `glm-4.7-flash-4bit` (~18 GB)
   - `glm-4.7-flash-5bit` (~21 GB)
   - `glm-4.7-flash-6bit` (~25 GB)
   - `glm-4.7-flash-8bit` (~32 GB)

   All variants have:
   - `n_layers`: 47 (vs 91 in GLM-4.7)
   - `hidden_size`: 2048 (vs 5120 in GLM-4.7)
   - `supports_tensor`: True (native `shard()` method)

2. **Bumped mlx from 0.30.1 to 0.30.3** - required by mlx-lm 0.30.4

3. **Updated mlx-lm from 0.30.2 to 0.30.4** - adds `glm4_moe_lite`
architecture support

4. **Added type ignores** in `auto_parallel.py` for stricter type
annotations in new mlx-lm

5. **Fixed EOS token IDs** for GLM-4.7-Flash - uses different tokenizer
with IDs `[154820, 154827, 154829]` vs other GLM models' `[151336,
151329, 151338]`

6. **Renamed `MLX_IBV_DEVICES` to `MLX_JACCL_DEVICES`** - env var name
changed in new mlx

## Why It Works

The model cards follow the same pattern as existing GLM-4.7 models.
Tensor parallel support is enabled because GLM-4.7-Flash implements the
native `shard()` method in mlx-lm 0.30.4, which is automatically
detected in `auto_parallel.py`.

GLM-4.7-Flash uses a new tokenizer with different special token IDs.
Without the correct EOS tokens, generation wouldn't stop properly.

## Test Plan

### Manual Testing
Tested generation with GLM-4.7-Flash-4bit - now correctly stops at EOS
tokens.

### Automated Testing
- `basedpyright`: 0 errors
- `ruff check`: All checks passed
- `pytest`: 162/162 tests pass (excluding pre-existing
`test_distributed_fix.py` timeout failures)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-20 03:58:09 +00:00
rltakashige
f5e6aa82d2 Load layers individually (#1211)
## Motivation

Certain models hang at model loading in tensor parallel. 

Hopefully closes #1205 

## Changes

- Load layer by layer for tensor parallel sharding
- Move eval_with_timeout to auto_parallel.py to resolve circular import.

## Why It Works

The naive way to fix this is to load the model with lazy=False and
then shard in tensor parallel. However, this requires the entire model
to be loaded into memory.

Instead, we can load layer by layer and shard after loading. There is a
small extra memory footprint from this, but it is negligible.

I tried loading layer by layer after the sharding, and this allowed
model loading but got stuck at warming up.
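
A sketch of the layer-by-layer approach, assuming a `shard_fn` helper that wraps a single layer for tensor parallelism (helper names are hypothetical):

```python
import mlx.core as mx

def load_then_shard(model, shard_fn):
    # Hypothetical sketch: materialize one layer at a time, then replace it
    # with its tensor-parallel shard, so the full dense model never needs to
    # be resident in memory at once.
    for i, layer in enumerate(model.layers):
        mx.eval(layer.parameters())        # force lazy weights for this layer only
        model.layers[i] = shard_fn(layer)  # shard after loading, as in the PR
    return model
```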

## Test Plan

### Manual Testing
GPT OSS loads with TP and FAST SYNCH. Kimi does too.

### Automated Testing
We need to run a suite of exo_bench before merging this!
2026-01-20 03:26:51 +00:00
Alex Cheema
39f0ed6018 Prepend <think> tag to stream for thinking models like GLM-4.7 (#1186)
## Motivation

For thinking models like GLM-4.7, the `<think>` tag is inserted by the
tokenizer's `apply_chat_template()` into the **prompt** (input). The
model generates tokens starting *after* this tag, so `<think>` never
appears in the streamed output. The frontend expects
`<think>...</think>` tags to extract and display thinking content.

**Log evidence:**
```
[gMASK]<sop><|system|>...<|user|>...<|assistant|><think>
```
The prompt ends with `<think>`, so the model generates content after it,
never returning the opening tag.

## Changes

- Added `detect_thinking_prompt_suffix()` helper function in
`utils_mlx.py` to detect if a prompt ends with `<think>` tag
- Added `parse_thinking_models()` generator wrapper in `runner.py` that
prepends the thinking tag to the output stream
- Modified the main generation loop to use the thinking wrapper for
non-GptOssModel models when a thinking prefix is detected
- Updated test mocks to handle the new `apply_chat_template` call

## Why It Works

The solution follows the same pattern as `parse_gpt_oss()` - a generator
wrapper that transforms the output stream. When the chat template ends
with `<think>`, we prepend this tag to the first generated token so the
frontend receives the complete `<think>...</think>` structure it
expects.
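
A sketch of the wrapper (the real `parse_thinking_models()` in runner.py may have a different signature):

```python
from typing import AsyncIterator

async def parse_thinking_models(
    tokens: AsyncIterator[str], prefix: str = "<think>"
) -> AsyncIterator[str]:
    # Prepend the tag that apply_chat_template() already put in the prompt,
    # so the streamed output contains the full <think>...</think> structure.
    first = True
    async for token in tokens:
        yield (prefix + token) if first else token
        first = False
```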

## Test Plan

### Manual Testing
- Run exo: `uv run exo`
- Send a chat request to GLM-4.7:
  ```bash
curl http://localhost:52415/v1/chat/completions -H "Content-Type:
application/json" -d '{
    "model": "mlx-community/GLM-4.7-8bit-gs32",
    "messages": [{"role": "user", "content": "What is 2+2?"}],
    "stream": true
  }'
  ```
- Verify the streamed response starts with `<think>` tag
- Verify the frontend dashboard correctly shows the thinking section
collapsed

### Automated Testing
- All 72 worker tests pass: `uv run pytest src/exo/worker/`
- Type checker passes: `uv run basedpyright`
- Linter passes: `uv run ruff check`

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Ryuichi Leo Takashige <leo@exolabs.net>
2026-01-19 19:44:51 +00:00
Alex Cheema
ee43b598fe Split NodePerformanceProfile into granular state mappings (#1209)
## Motivation

The current `NodePerformanceProfile` is a monolithic object where every
update (even 1-second memory updates) replaces the entire profile,
touching unrelated data. Different fields update at vastly different
frequencies:

| Data | Update Frequency |
|------|------------------|
| Memory, System | 1 second |
| Thunderbolt | 5 seconds |
| Network interfaces | 10 seconds |
| Friendly name | 60 seconds |
| Model/Chip ID | Once at startup |

## Changes

Split into separate state mappings so each data type updates
independently:

- `node_identities`: Static and slow-changing data (model_id, chip_id,
friendly_name)
- `node_memory`: RAM and swap usage
- `node_system`: GPU usage, temperature, power, CPU metrics
- `node_network`: Network interface information
- `node_thunderbolt`: Thunderbolt interface identifiers

Added a backwards-compatible `node_profiles` property that reconstructs
`NodePerformanceProfile` from the granular mappings for dashboard
compatibility.

**Files modified:**
- `src/exo/shared/types/profiling.py` - Added `NodeIdentity`,
`NodeNetworkInfo`, `NodeThunderboltInfo` types
- `src/exo/shared/types/state.py` - Added 5 new mappings +
`node_profiles` property
- `src/exo/shared/apply.py` - Updated `apply_node_gathered_info` and
`apply_node_timed_out`

## Why It Works

Each info type now writes only to its specific mapping, avoiding
unnecessary updates to unrelated data. The `MacThunderboltConnections`
handler reads from `node_thunderbolt` instead of the old `node_profiles`
for RDMA connection mapping. The backwards-compatible property ensures
the dashboard continues to work unchanged.
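
A condensed sketch of the shape (types trimmed to a couple of fields; the real mappings carry more data):

```python
from dataclasses import dataclass, field

@dataclass
class NodeIdentity:
    model_id: str = "Unknown"
    chip_id: str = "Unknown"
    friendly_name: str = ""

@dataclass
class NodeMemory:
    ram_used: int = 0
    ram_total: int = 0

@dataclass
class NodePerformanceProfile:
    identity: NodeIdentity
    memory: NodeMemory | None

@dataclass
class State:
    # Each mapping is keyed by node id and updated independently.
    node_identities: dict[str, NodeIdentity] = field(default_factory=dict)
    node_memory: dict[str, NodeMemory] = field(default_factory=dict)

    @property
    def node_profiles(self) -> dict[str, NodePerformanceProfile]:
        # Backwards-compatible view: rebuild the old monolithic profile
        # from the granular mappings for dashboard consumers.
        return {
            node_id: NodePerformanceProfile(identity, self.node_memory.get(node_id))
            for node_id, identity in self.node_identities.items()
        }
```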

## Test Plan

### Manual Testing
- Start exo and verify dashboard shows node info
- Verify memory/GPU updates stream correctly
- Check that node timeout properly cleans up all mappings

### Automated Testing
- All 162 existing tests pass
- basedpyright: 0 errors
- ruff check: All checks passed
- nix fmt: Applied

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 18:24:15 +00:00
rltakashige
5fd55594c9 Wrap pipeline models for explicit mx.depends between cache and logits (#1206)
## Motivation

GPU timeouts happen often when prompt size > prefill_step_size. It also
happens for seemingly random models.

## Changes

Add mx.depends for cache on the logits.
All gather at the model level rather than the layer level, reducing the
amount of data sent.

## Why It Works

mlx_lm's prefill loop only evaluates cache state, not logits.
When prompt > prefill_step_size, the all_gather is never evaluated,
causing GPU timeout.
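
A sketch of the idea, assuming `mx.depends(inputs, dependencies)` returns the inputs with an extra graph edge on the dependencies (check the exact MLX signature before copying this):

```python
import mlx.core as mx

class DependsOnLogits:
    # Sketch: wrap a pipeline model so its cache entries depend on the logits.
    # Evaluating the cache during mlx_lm's prefill then also evaluates the
    # all_gather that produced the logits, avoiding the GPU timeout.
    def __init__(self, model):
        self.model = model

    def __call__(self, inputs, cache):
        logits = self.model(inputs, cache=cache)
        for c in cache:
            # Assumed API: ties evaluation of c.keys/c.values to the logits.
            c.keys, c.values = mx.depends([c.keys, c.values], [logits])
        return logits
```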

## Test Plan

### Automated Testing
Added failing test cases and then resolved them.
2026-01-19 17:49:42 +00:00
Jake Hillion
5ab1f8b3e2 NetworkSetupHelper: detect stale startup script content
The daemonAlreadyInstalled() function checked that the startup script
file existed and validated plist properties, but did not compare the
actual script content. If the setupScript constant was updated in a new
app version, the stale on-disk script would not be detected or replaced.

Added a guard clause that reads the installed script from disk and
compares it against the expected setupScript content (with whitespace
normalization). When content differs, the function returns false,
triggering the reinstallation flow with an admin privileges prompt.

Test plan:
- Installed on a cluster that had the previous network config. Got the
  popup asking for permissions. After accepting I could run Kimi K2
  Thinking Tensor RDMA on all 4 nodes.
2026-01-19 17:36:15 +00:00
Evan Quiney
2202685c3e refactor all information sources (including ipless rdma discovery) (#928)
## Motivation

Information gathering is tightly coupled to MacMon - we should start
generalizing our information sources so we can add more in future.

## Changes

Added a new system to gather any information. Currently, it is attached
to the Worker - though this is mostly to keep the data processing logic
simple. It could be made independent quite easily.

I also refactored topology to include different kinds of connections as
we can gather RDMA connections without having a pre-existing socket
connection, and made the relevant placement updates. We should no longer
need the network locations script in the app.

Other sources of information now include:
- static node information like "model" and "chip" (macos, "Unknown"
fallback)
- device friendly name (macos, falls back to device hostname)
- network interfaces + ips (cross platform)
- thunderbolt interfaces (macos)
- thunderbolt connections (macos)
- RAM usage (cross platform)
- per-device configuration written to EXO_HOME/config.toml

## Limitations

Model and Chip are not cross platform concepts.

We do not differentiate between unified and non-unified memory systems.

A lot of this data collection is based on simple timers. Watching the SC
store on macos is the correct way to gather some of this information,
but requires a detour into rust for macos.

## Why It Works

The InfoGatherer is a generic subsystem which returns a union of metric
datatypes. It writes them to an event, which is applied to state. It is
currently re-spawned with the worker so each cluster receives the
correct information.

As for topology, macOS identifies TB ports with a uuid in
SPThunderboltDataType, and also stores remote uuids if it can find them.
These changes read that data with the system_profiler, hopefully not so
often as to cause notable performance impacts (though this should be
tuned) but frequently enough for moderate responsiveness.
As we can identify TB connections between devices without needing ips
attached to each interface, we can remove the network setup script
(almost) completely.
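
For the Thunderbolt side, the gathering boils down to something like this sketch (the JSON parsing in exo is more involved):

```python
import json
import subprocess

def read_thunderbolt_info() -> list[dict]:
    """Sketch: query macOS for Thunderbolt data; exo's gatherer parses the
    local and remote port UUIDs out of this structure."""
    out = subprocess.run(
        ["system_profiler", "SPThunderboltDataType", "-json"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Top-level key matches the requested data type; each entry describes one
    # Thunderbolt bus, including UUIDs for local and (if connected) remote ports.
    return json.loads(out).get("SPThunderboltDataType", [])
```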

## Test Plan

### Manual Testing
Spawn RDMA instances without enabling DHCP on the RDMA interfaces.

### Automated Testing
Updated the current master and shared tests to cover the topology
refactor and new events.

---------

Co-authored-by: Sami Khan <smsak99@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Jake Hillion <jake@hillion.co.uk>
2026-01-19 16:58:09 +00:00
Andrei Onel
ce3ad391b1 Update README.md with some changes from release 1.0.61 (#1157)
Updated README.md with documentation for four new features:

- added a "Benchmarking" section documenting the exo-bench tool for
measuring model performance across different placement configurations
- documented the custom namespace feature for cluster isolation in the
macOS app section
- added a "Configuration Options" subsection explaining the --no-worker
CLI flag for coordinator-only nodes
- added a "File Locations (Linux)" subsection documenting XDG Base
Directory Specification compliance on Linux systems

Issue #930
2026-01-19 16:43:18 +00:00
Jake Hillion
fb0151630d shard_downloader: make on_progress callback async
The on_progress callback was synchronous but always invoked from async
contexts, forcing the use of send_nowait() which could raise WouldBlock
if the channel buffer was full, potentially dropping progress updates.

Changed the callback type from `Callable[[ShardMetadata,
RepoDownloadProgress], None]` to return a coroutine, updated all
implementations to be async, and replaced send_nowait() with await
send() in the worker's download progress handler.

This allows proper backpressure handling when sending download progress
events through the channel, eliminating the "Footgun!" that was
previously documented in the code.
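
A sketch of the new callback shape (type names follow the commit message; the actual wiring in exo may differ):

```python
from typing import Awaitable, Callable

from anyio.streams.memory import MemoryObjectSendStream

# Before: Callable[[ShardMetadata, RepoDownloadProgress], None] + send_nowait().
# After (sketch): the callback is async, so the sender can apply backpressure.
OnProgress = Callable[["ShardMetadata", "RepoDownloadProgress"], Awaitable[None]]

def make_on_progress(send: MemoryObjectSendStream) -> OnProgress:
    async def on_progress(shard, progress) -> None:
        # Blocks when the channel is full instead of raising WouldBlock.
        await send.send((shard, progress))
    return on_progress
```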

Test plan:
- Built a DMG and ran it on one node. All existing models showed as
  downloaded.
- Downloaded a new model. The progress bar on the download page worked.
- Downloaded another new model. The progress bar on the home page
  worked.
2026-01-19 16:19:37 +00:00
Alex Cheema
346b13e2c9 Enhance LaTeX rendering in dashboard markdown (#1197)
## Motivation

When models output LaTeX-formatted math proofs, the dashboard was not
rendering them correctly. Issues included:
- `\documentclass`, `\begin{document}`, `\usepackage` showing as raw
text
- `$...$` inline math with complex expressions (like `\frac`, `\ldots`)
not rendering due to markdown escaping backslashes
- `\begin{align*}...\end{align*}` and other math environments showing as
raw text
- `\emph{...}`, `\textbf{...}` LaTeX formatting commands not being
converted
- `$\require{...}$` (MathJax-specific) causing KaTeX errors
- `\begin{proof}...\end{proof}` showing as raw text

## Changes

Enhanced `MarkdownContent.svelte` with comprehensive LaTeX support:

**Math extraction before markdown processing:**
- Extract `$...$`, `$$...$$`, `\(...\)`, `\[...\]` into placeholders
before markdown processes the text
- Use alphanumeric placeholders (`MATHPLACEHOLDERINLINE0END`) that won't
be interpreted as HTML tags
- Restore and render with KaTeX after markdown processing

**LaTeX document command removal:**
- Strip `\documentclass{...}`, `\usepackage{...}`, `\begin{document}`,
`\end{document}`
- Strip `\maketitle`, `\title{...}`, `\author{...}`, `\date{...}`
- Strip `\require{...}` (MathJax-specific, not KaTeX)
- Replace `tikzpicture` environments with `[diagram]` placeholder
- Strip `\label{...}` cross-reference commands

**LaTeX math environments:**
- Convert `\begin{align*}`, `\begin{equation}`, `\begin{gather}`, etc.
to display math blocks

**LaTeX text formatting:**
- `\emph{...}` and `\textit{...}` → `<em>...</em>`
- `\textbf{...}` → `<strong>...</strong>`
- `\texttt{...}` → `<code>...</code>`
- `\underline{...}` → `<u>...</u>`

**LaTeX environments styling:**
- `\begin{proof}...\end{proof}` → styled proof block with QED symbol
- `\begin{theorem}`, `\begin{lemma}`, etc. → styled theorem blocks

**Display math enhancements:**
- Wrapped in styled container with subtle gold border
- "LaTeX" label and copy button appear on hover
- Dark theme KaTeX color overrides for better readability
- Custom scrollbar for overflow

## Why It Works

The key insight is that markdown processing was escaping backslashes in
LaTeX before KaTeX could see them. By extracting all math expressions
into alphanumeric placeholders *before* markdown runs, then restoring
them *after*, the LaTeX content passes through to KaTeX unmodified.

Using purely alphanumeric placeholders like `MATHPLACEHOLDERINLINE0END`
instead of `<<MATH_INLINE_0>>` prevents markdown from interpreting them
as HTML tags and stripping them.
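
The extract-then-restore trick, shown here as a small Python sketch of the same idea implemented in `MarkdownContent.svelte`:

```python
import re

def protect_math(text: str) -> tuple[str, list[str]]:
    """Replace $...$ spans with alphanumeric placeholders so a markdown pass
    cannot escape the backslashes inside them."""
    stored: list[str] = []
    def stash(match: re.Match) -> str:
        stored.append(match.group(0))
        return f"MATHPLACEHOLDERINLINE{len(stored) - 1}END"
    return re.sub(r"\$[^$]+\$", stash, text), stored

def restore_math(text: str, stored: list[str]) -> str:
    for i, original in enumerate(stored):
        # After markdown runs, swap the untouched LaTeX back in for KaTeX.
        text = text.replace(f"MATHPLACEHOLDERINLINE{i}END", original)
    return text
```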

## Test Plan

### Manual Testing
- Hardware: Any machine with the dashboard
- What you did:
  - Ask model to "write a proof in latex"
  - Verify inline math like `$x \in S$` renders correctly
- Verify display math like `\begin{align*}...\end{align*}` renders as
block
  - Verify `\documentclass`, `\begin{document}` are stripped (not shown)
  - Verify `\emph{...}` converts to italics
  - Verify copy button works on display math blocks
- Test edge cases: `$5` (currency) stays as text, `\$50` (escaped)
becomes `$50`

Before:
<img width="799" height="637" alt="Screenshot 2026-01-19 at 11 51 22 AM"
src="https://github.com/user-attachments/assets/62a705b8-b3c2-47b8-afd0-5d0c1b240e44"
/>

After:
<img width="809" height="642" alt="Screenshot 2026-01-19 at 11 46 58 AM"
src="https://github.com/user-attachments/assets/4f35fa1d-333c-4285-bc68-58a50f8f148e"
/>


### Automated Testing
- Dashboard builds successfully with `npm run build`
- Existing functionality preserved

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 14:50:41 +00:00
rltakashige
ea0588429b Custom mlx layer composition (#1201)
## Motivation

With a single pipeline layer, PipelineFirstLayer gets composed with
PipelineLastLayer.

## Test Plan

### Manual Testing


### Automated Testing
Made failing tests. Fixed them!
2026-01-19 12:36:25 +00:00
rltakashige
73b3f87e07 Set swa_idx and ga_idx for single layer (#1202)
## Motivation

Layer types do not contain either "sliding_attention" or
"full_attention" for pipeline parallel (single layer).

## Test Plan

### Manual Testing
Manually tested single layer of GPT OSS. Doesn't crash

2026-01-19 12:31:11 +00:00
Evan Quiney
746589ba6b tidy: remove context manager from api (#1199) 2026-01-19 11:58:13 +00:00
rltakashige
f82f862fd7 Fix several issues with placement (#1200)
## Motivation

Uneven placements were causing issues for some users with lopsided
setups. While fixing, I ran into another issue with impossible
allocation of memory.

## Changes

- Allocate at least 1 layer per device.
- Catch overallocation of memory with an error.

## Test Plan

### Manual Testing
Tested that GPT OSS is placed correctly.

### Automated Testing
Added breaking tests in the first commit. Resolved with new placement
algorithm in the second one.
2026-01-19 11:52:35 +00:00
Alex Cheema
7ff937d8a1 Add dashboard screenshots to README (#1185)
## Motivation

The README showcases exo's features and benchmarks but doesn't show what
the dashboard actually looks like. Adding a screenshot helps users
understand what they'll get when they run exo.

## Changes

- Added dashboard screenshot to `docs/imgs/dashboard-cluster-view.png`:
Shows the cluster topology view with 4 × 512GB M3 Ultra Mac Studio
running DeepSeek v3.1 (8-bit) and Kimi-K2-Thinking (4-bit)
- Added a new "Dashboard" section to README.md below Features,
displaying the screenshot with caption

## Why It Works

Visual documentation helps users understand what exo offers before they
install it. The screenshot demonstrates the cluster management
capabilities.

## Test Plan

### Manual Testing
- Verified image renders correctly in GitHub markdown preview

### Automated Testing
- N/A - documentation only change

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 10:43:27 +00:00
Evan Quiney
d19bf02404 re-raise exceptions in the runner (#1198)
## Motivation

Runners that crash can swallow errors - we should re-raise. Also the
exception handler annoyed me.

## Changes

The try: except in the runner's chat now re-raises.
2026-01-19 10:35:23 +00:00
rltakashige
618cee5223 Resolve test event ordering flakiness (#1194)
## Motivation

mp sender occasionally does not have time to flush its events before
collect() is called, making the event ordering test fail.

## Changes

- Replace mp_channel with simple collector for event ordering test
- Also suppress the warning `<frozen importlib._bootstrap>:488:
DeprecationWarning: builtin type SwigPyObject has no __module__ attribute`


## Test Plan

### Automated Testing
Ran the test 100 times without it failing.
2026-01-18 20:33:20 +00:00
Antonio Lujano Luna
9c29eb7d48 Add proxy and custom SSL certificate support for corporate networks (#1189)
Support HTTPS_PROXY/HTTP_PROXY environment variables for proxy
configuration and SSL_CERT_FILE for custom CA certificates, enabling use
in corporate environments with SSL inspection.

## Motivation
Users in corporate environments often need to route traffic through HTTP
proxies and use custom CA certificates for SSL inspection. Without this
support, exo cannot download models in these network configurations.

## Changes
- Added `HTTPS_PROXY`/`HTTP_PROXY` environment variable support to
`create_http_session()` in `download_utils.py`
- Added `SSL_CERT_FILE` environment variable support for custom CA
certificate bundles, falling back to certifi's default bundle

## Why It Works
- `aiohttp.ClientSession` natively supports the `proxy` parameter for
routing requests through HTTP proxies
- `ssl.create_default_context(cafile=...)` accepts a custom CA bundle
path, allowing corporate CAs to be trusted
- Using environment variables is consistent with the codebase's existing
configuration patterns (e.g., `EXO_HOME`, `HF_ENDPOINT`)
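
One way to wire this up (a sketch, not the exact code: the PR passes the proxy explicitly, while `trust_env=True` is an aiohttp alternative that reads the same variables):

```python
import os
import ssl

import aiohttp
import certifi

def create_http_session() -> aiohttp.ClientSession:
    """Sketch of the environment-driven configuration; the real
    create_http_session() in download_utils.py may differ in detail."""
    # Custom CA bundle (e.g. corporate SSL inspection), else certifi's default.
    cafile = os.environ.get("SSL_CERT_FILE", certifi.where())
    ssl_context = ssl.create_default_context(cafile=cafile)
    connector = aiohttp.TCPConnector(ssl=ssl_context)
    # trust_env=True makes aiohttp honour HTTPS_PROXY / HTTP_PROXY.
    return aiohttp.ClientSession(connector=connector, trust_env=True)
```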

## Test Plan
### Manual Testing
- Set `HTTPS_PROXY` environment variable and verified model downloads
route through proxy
- Set `SSL_CERT_FILE` to custom CA bundle and verified SSL verification
succeeds with corporate SSL inspection

### Automated Testing
- No automated tests added; this change is configuration-only and does
not alter existing behavior when environment variables are unset
2026-01-18 12:05:50 +00:00
Alex Cheema
c5158bee53 Add pre-commit checks documentation to AGENTS.md (#1184)
## Motivation

CI failures can be avoided by running checks locally before committing.
This adds clear documentation to AGENTS.md so that AI agents (and
humans) know exactly which checks must pass before pushing code.

## Changes

Added a new "Pre-Commit Checks (REQUIRED)" section to AGENTS.md that:
- Lists all 4 required checks (basedpyright, ruff, nix fmt, pytest)
- Provides a one-liner to run all checks in sequence
- Notes that `nix fmt` changes must be staged before committing
- Explains that CI runs `nix flake check` which verifies everything

## Why It Works

Clear documentation prevents CI failures by ensuring contributors run
checks locally first. The one-liner command makes it easy to run all
checks before committing.

## Test Plan

### Manual Testing
- Verified the documented commands work correctly

### Automated Testing
- N/A - documentation only change

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-17 21:50:24 +00:00
rltakashige
5c8a237940 Handle model timeouts (#1177)
- Add eval with a timeout.
- Add fast synch flag

## Motivation

Because of the experimental FAST SYNCH flag, some models may not work.
This PR catches when this occurs and allows users to specify a run
without fast synch

## Changes

- Adds a flag to enable or disable fast synch (--fast-synch and
--no-fast-synch)
- Adds a heuristic timeout
- Reduces exo_bench default timeout to 10 minutes.

## Why It Works

Heuristic timeout assumes normal loading times on Mac devices
(60 + model size in GB / 5): e.g. DeepSeek takes up to 120 seconds to
load in tensor parallel, so the timeout is set to 60 + 120 = 180s.

We could raise this value if necessary.
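
The heuristic as a one-liner (a sketch of the formula described above):

```python
def model_load_timeout_s(model_size_gb: float) -> float:
    # 60s base + 1s per 5 GB of weights; a large model like DeepSeek ends up
    # around 60 + 120 = 180s, per the example above.
    return 60.0 + model_size_gb / 5.0
```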

## Test Plan

### Manual Testing
Catches that GPT OSS fails to load in Tensor RDMA
Can launch with --no-fast-synch flag to launch GPT OSS.

**GPT OSS 20B**
TP with fast synch
<img width="3064" height="456" alt="image"
src="https://github.com/user-attachments/assets/f6e25cd8-8621-4e99-99fe-292ee05c4035"
/>

TP without fast synch
<img width="3098" height="496" alt="image"
src="https://github.com/user-attachments/assets/d36453d9-6686-4cfe-aa7c-a7d458369d4d"
/>
[Note: the performance is really not great as fast synch is off]

(As a sanity check)
PP with fast synch
<img width="3124" height="496" alt="image"
src="https://github.com/user-attachments/assets/e97d4547-c6fa-483d-badb-4b371b900b4c"
/>

PP without fast synch
<img width="3078" height="508" alt="image"
src="https://github.com/user-attachments/assets/b2e20dfd-4b0e-4295-8a92-417dfe745c28"
/>

PP without RDMA
<img width="3070" height="498" alt="image"
src="https://github.com/user-attachments/assets/a8509d68-0aef-4cda-bca5-a67d39a0801e"
/>

TP without RDMA
<img width="3068" height="496" alt="image"
src="https://github.com/user-attachments/assets/b5691429-89f4-4369-bcf2-8fde2ad7154a"
/>
2026-01-16 20:25:12 +00:00
rltakashige
745343c705 Return error responses for Chat Completions (#1173)
- Error chunks
- Use error handling in exo_bench.py

## Motivation

Return when an error occurs so that generation stops. Adding timeouts is
a separate TODO for model loading and chat completions.

## Changes

- Return HTTP exceptions as JSON responses in an OpenAI compatible
format.
- Context manager for generation to catch and return error messages.
- Use error handling in exo_bench.py.

## Test Plan

### Manual Testing
Manually tested that exo_bench returns on failures within and outside
generation

2026-01-16 19:24:37 +00:00
Alex Cheema
5e28664c41 Fix draft release detection (attempt 3) (#1176)
## Motivation

Previous fix still failed in CI. Suspecting permissions issue with
GITHUB_TOKEN not being able to see draft releases via API.

## Changes

1. Add explicit `permissions: contents: write` to the job
2. Use `gh release list` first to check if draft exists (this uses a
different code path that might work better)
3. Add debug echo statements

## Test Plan

Delete v1.0.63 tag and re-push after merging.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 17:26:06 +00:00
Alex Cheema
ae0a804ccb Fix draft release detection query (#1175)
## Motivation

Fixes the draft release detection that failed on the v1.0.63 release
attempt.

## Changes

The jq query was piped to `head -1` which truncated multi-line JSON
output to just `{`, causing the empty check to fail.

Changed to use `first // empty` in jq instead.

## Test Plan

Tested locally:
```bash
GITHUB_REF_NAME="v1.0.63"
gh api repos/exo-explore/exo/releases --jq "[.[] | select(.draft == true) | select(.name == \"$GITHUB_REF_NAME\")] | first // empty"
# Returns the full draft release JSON (2711 chars)
```

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 17:05:24 +00:00
Alex Cheema
07cf2c1aa1 Add GitHub releases with Sparkle release notes integration (#1172)
## Motivation

Closes #1140

Currently releases are uploaded to S3 for Sparkle updates but there's no
GitHub Release created, and Sparkle update dialogs don't show release
notes. Users have no visibility into what changed.

## Changes

- Added release workflow documentation comment at top of `build-app.yml`
- Added "Fetch release notes for Sparkle" step that converts markdown
from draft GitHub release to HTML
- Added "Inject release notes into appcast" step that embeds HTML in
appcast.xml with CDATA
- Added "Publish GitHub Release" step that attaches DMG and publishes
the draft

## Why It Works

- Sparkle's `<description>` tag supports HTML wrapped in CDATA for
rendering in update dialogs
- GitHub's markdown API (`/markdown`) converts the release notes to HTML
with proper formatting
- Draft releases allow writing polished notes before the build, then the
workflow publishes them automatically
- The workflow fails if no draft release exists, ensuring release notes
are always provided

## Test Plan

### Manual Testing
1. Create a draft GitHub release for a new tag with markdown release
notes
2. Push the tag to trigger the workflow
3. Verify the GitHub release is published with DMG attached
4. Download appcast.xml from S3 and verify
`<description><![CDATA[...]]></description>` contains HTML
5. Test Sparkle update dialog on macOS to confirm release notes appear

### Automated Testing
No automated tests added - this is CI workflow configuration.

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 16:47:33 +00:00
Evan
83c5285a80 reduce logs
previous commits' logs were too verbose - this tones them down a bit
2026-01-16 14:05:47 +00:00
Evan Quiney
39ee2bf7bd switch from synchronous threaded pinging to an async implementation (#1170)
still seeing churn in our networking - let's properly rate limit it

## changes

added a persistent httpx AsyncClient with a cap on max connections

## testing

deployed on cluster, discovery VASTLY more stable (the only deleted
edges were those discovered by mdns)
2026-01-16 13:20:03 +00:00
Sami Khan
991adfbd6f fix local network warning (#1136)
## Motivation

Local network warning banner was showing on fresh install even though
mDNS was working. The check would fail before the user had a chance to
grant permission via the macOS prompt.

## Changes

- Added `hasWorkedBefore` flag persisted in UserDefaults
- Only show warning if permission previously worked but now doesn't

## Why It Works

On fresh install, the check may fail (no permission yet), but
`hasWorkedBefore` is false so no warning shows. Once the user grants
permission and a check succeeds, we record it. Future failures (zombie
permission after restart) will show the warning since `hasWorkedBefore`
is now true.

## Test Plan

### Manual Testing
Run locally

### Automated Testing
N/A
2026-01-16 13:10:50 +00:00
rltakashige
4b3de6b984 Fix exo bench for transformers 5.x (#1168)
## Motivation
Prompt Sizer was broken as transformers 5.x tokenizers create
BatchEncodings which are essentially a dictionary of {input_ids: []}
instead of the list of input ids.
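
A sketch of the compatibility handling (the actual call site in exo bench's prompt sizer may look different):

```python
from typing import Any

def prompt_length(encoded: Any) -> int:
    """Sketch: transformers 5.x tokenizers return a dict-like BatchEncoding
    ({"input_ids": [...]}) where older versions gave a plain list of input
    ids. Accept either shape."""
    if isinstance(encoded, list):
        return len(encoded)
    return len(encoded["input_ids"])

# Works for both shapes:
assert prompt_length([1, 2, 3]) == 3
assert prompt_length({"input_ids": [1, 2, 3]}) == 3
```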

## Test Plan

### Manual Testing
Tested that exo bench runs as expected.

2026-01-16 12:39:22 +00:00
Evan
c8de3b90ea quiet rust logs
rust logs were too verbose - now only warnings propagate to python

entirely happy not to merge this and to clean up rust logging instead,
but this felt saner right now
2026-01-16 12:34:28 +00:00
Sami Khan
6e6567a802 resolve issue #1070 (#1076)
## Motivation

https://github.com/exo-explore/exo/issues/1070

## Changes

Added check in ChatForm.svelte to reset selectedChatModel when it no
longer matches any running instance.

## Why It Works

The $effect now detects when the selected model is stale (not in
availableModels()) and resets to the first available model.

## Test Plan

### Manual Testing

1. Create instance of Model A → Delete it → Create instance of Model B →
Chat
2. Verify request goes to Model B (not Model A)

---------

Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>
2026-01-15 20:00:41 +00:00
rltakashige
a735dad667 Parse GPT OSS in runner (#1160)
## Motivation

Simplification of API + moving model specific code to the runner

## Test Plan

### Manual Testing
Tested that GPT OSS outputs are parsed correctly on the dashboard.

2026-01-15 19:53:55 +00:00
rltakashige
aaf4e36bc3 FIX GPT OSS (#1165)
## Motivation

Adds several unmerged fixes for GPT OSS.
Also adds GPT OSS 20B MXFP4 Q8 instead of Q4 for numerical stability (as
this is unstable for MLX LM too)

## Test Plan

### Manual Testing
Manually tested. No further gibberish responses.

### Automated Testing
Ran EXO Bench - pipeline, tensor and single node work on both 20B and
120B models
2026-01-15 19:20:17 +00:00
Evan Quiney
3e623ccf0d up http timeout to 3 seconds and retry on BadStatusLine (#1164)
we're seeing a lot of network churn - perhaps this is a connection
timing out issue? let's also re-try after a second

## testing
none yet

---------

Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-15 18:15:12 +00:00
Evan Quiney
c22dad8a7d dashboard: add peer: true to package lock (#1162)
this happens every time I run npm install - let's upstream it

## testing
dashboard builds and renders
2026-01-15 17:01:43 +00:00
Evan
4bc4d50685 rust: remove dead code
the system custodian has been made unnecessary with the swift app - we
can remove it

## testing
everything still builds
2026-01-15 16:51:46 +00:00
Jake Hillion
e0aab46fd8 model_cards.py: clean up commented out code
Clean up the commented out code and make sure the comments are unified.
Carrying around the commented out code means people making changes to
model_cards are supposed to update it, but that's not clear and won't be
picked up by type checking etc. Drop it for now - it's in the git
history.

Also make the rest of the comments a bit more uniform, and place
comments about a specific model card inside the model card (instead of
above) so they don't get lost when code is added/moved around.

Test plan:
- my eyes
2026-01-15 13:21:58 +00:00
Evan Quiney
82ba42bae9 add glm-47, minimax-m21 (#1147)
Adds support for GLM 4.7 and MiniMax M2.1

Manual testing:
Tensor + Pipeline execution of both models.

Closes #1141 and #1142
2026-01-14 16:33:17 +00:00
Jake Hillion
3671528fa4 nix: add dashboard build with dream2nix
Continue working towards a fully Nix based build by building the
dashboard with Nix. Continuing the theme of using the existing lock
files, use dream2nix to parse the lock file and build the tree of
dependency derivations.

dream2nix doesn't like the bundleDependencies, so we apply a small patch
to the lock file that drops all dependencies that are bundled. This
should ideally be contributed upstream but that can be done later.

Use this new dashboard build in the build-app CI workflow, meaning
future macOS apps will include this reproducible dashboard.

Test plan:
- Built a DMG, shipped to a cluster, loaded in a browser with no cache
  and the dashboard looks good.

- Directory layout is as expected:
```
$ nix build .#dashboard
$ find result/
...
result/_app/immutable/entry
result/_app/immutable/entry/app.CTPAnMjf.js
result/_app/immutable/entry/start.fUSEa-2O.js
result/_app/immutable/nodes
result/_app/immutable/nodes/3.DqQr1Obm.js
result/_app/immutable/nodes/0.DgEY44RO.js
result/_app/immutable/nodes/2.BjZg_lJh.js
result/_app/immutable/nodes/1.D6vGUYYT.js
result/_app/env.js
result/_app/version.json
result/exo-logo.png
result/favicon.ico
result/index.html
```
2026-01-14 15:58:16 +01:00
Jake Hillion
e6434ec446 nix: add Rust builds with crane and fenix
The Rust workspace lacked Nix build support, making it difficult to
build packages reproducibly or run checks in CI.

Added a flake-parts module at rust/parts.nix that uses crane for Rust
builds and fenix for the nightly toolchain. The source filter isolates
rust/ and root Cargo files to prevent Python/docs changes from
triggering Rust rebuilds. Exports packages (system_custodian,
exo_pyo3_bindings wheel, exo-rust-workspace) and checks (cargo-nextest,
cargo-doc) for all three target platforms.

The devShell now uses inputsFrom to inherit build dependencies from the
workspace package, removing the need for manual pkg-config/openssl setup.

Test plan:
- Ran `nix flake check` successfully
- Built `nix build ".#checks.x86_64-linux.cargo-nextest"` and tests pass
- Built `nix build ".#exo_pyo3_bindings"` and wheel is produced
2026-01-14 11:52:29 +00:00
Jake Hillion
bdb43e1dbb nix: drop noisy echos from devshell
Drop all the printing when entering a devshell. It's annoying, and not a
super accurate description of how to develop exo anyway.
2026-01-14 10:04:57 +00:00
Jake Hillion
e4a01e2b0e chore(deps): nix lock file maintenance
Update nix flake inputs. Add a second input as Swift is currently broken
in nixpkgs on Linux for `swift-format` as we want `nix fmt` to continue
being reproducible everywhere.
2026-01-13 19:57:14 +01:00
Evan Quiney
1200a7db64 Add tensor sharding for GPT-OSS (#1144)
## Motivation

GPT OSS did not previously support tensor sharding

## Changes

Add GPT sharding support in tensor_auto_parallel.
Code is mostly @rltakashige's

## Test Plan

### Manual Testing
Tested GPT-OSS - MLX Fast Sync causes issues in Tensor RDMA - this is a general problem at the moment.
2026-01-13 17:25:52 +00:00
Evan Quiney
47ceb54bc1 up the rlimit (#1148)
Fixes #1117 

Manual testing:
Launched 100 instances. worked. yay.
2026-01-13 15:00:54 +00:00
Jake Hillion
f8112fdf25 nix: convert to flake-parts
Preparing to add a flake-parts module for Rust builds. The flake-utils
library doesn't support the module system needed for cleanly separating
the Rust build configuration.

Converted from flake-utils to flake-parts, switching to the treefmt-nix
flakeModule import pattern. The devShell and formatter outputs remain
functionally equivalent.

Test plan:
- Ran `nix flake check` successfully
- Verified `nix develop` provides the same environment
2026-01-13 15:06:44 +01:00
Alex Cheema
e388f59480 docs: add AGENTS.md for AI coding agents guidance (#1132)
## Motivation

Add documentation to help AI coding agents (Claude Code, Cursor, GitHub
Copilot, etc.) understand the exo codebase and contribute effectively.

## Changes

- Add `AGENTS.md` with guidance for AI agents working on the codebase
- Add symlink `CLAUDE.md -> AGENTS.md` for backwards compatibility with
Claude Code

## Why It Works

`AGENTS.md` is becoming a standard convention for AI agent instructions.
The symlink ensures Claude Code (which looks for `CLAUDE.md`) continues
to work while supporting the broader `AGENTS.md` convention.

## Test Plan

### Manual Testing
- Verified symlink works correctly

### Automated Testing
- N/A (documentation only)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-13 13:05:47 +00:00
Alex Cheema
e5e74e1eef Upgrade mlx-lm to 0.30.2 with transformers 5.x compatibility (#1125)
## Motivation

Upgrade mlx-lm to version 0.30.2 which requires transformers 5.0.0rc2 as
a prerelease dependency. This enables support for newer models like Kimi
K2 Thinking while maintaining compatibility with existing models.

The transformers 5.x release includes breaking changes that affect
custom tokenizers like Kimi's TikTokenTokenizer, requiring compatibility
fixes.

## Changes

### Core Changes
- **mlx-lm upgrade**: Bump to 0.30.2 with locked exact versions for
mlx/mlx-lm to prevent breaking changes
- **transformers 5.x compatibility**: Enable prerelease transformers
dependency

### Kimi K2 Tokenizer Fixes
- Add `bytes_to_unicode` monkey-patch to restore function moved in
transformers 5.0.0rc2
- Load `TikTokenTokenizer` directly instead of via `AutoTokenizer` to
bypass transformers 5.x bug with `auto_map` fallback
- Patch `encode()` to use tiktoken directly with `allowed_special="all"`
to handle special tokens from chat templates

### Other Changes
- Dashboard: Show disk usage for completed model downloads
- CI: Add `workflow_dispatch` trigger to build-app workflow
- Docs: Add basic API documentation

### Testing
- Add comprehensive tokenizer unit tests for all supported models
- Tests verify encode/decode, special token handling, and chat template
encoding

## Why It Works

**bytes_to_unicode issue**: transformers 5.0.0rc2 moved
`bytes_to_unicode` from `transformers.models.gpt2.tokenization_gpt2` to
`transformers.convert_slow_tokenizer`. Kimi's `tokenization_kimi.py`
imports from the old location. The monkey-patch restores it at module
load time.
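
The shim boils down to something like this sketch (run before tokenization_kimi is imported):

```python
# Restore the symbol at its old location so Kimi's tokenization_kimi.py,
# which still imports from transformers.models.gpt2.tokenization_gpt2,
# keeps working on transformers 5.x.
import transformers.models.gpt2.tokenization_gpt2 as gpt2_tokenization
from transformers.convert_slow_tokenizer import bytes_to_unicode

if not hasattr(gpt2_tokenization, "bytes_to_unicode"):
    gpt2_tokenization.bytes_to_unicode = bytes_to_unicode
```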

**AutoTokenizer issue**: transformers 5.x has a bug where
`tokenizer_class_from_name('TikTokenTokenizer')` returns `None` for
custom tokenizers with `auto_map`. Loading the tokenizer directly
bypasses this.

**encode() issue**: transformers 5.x's `pad()` method fails for slow
tokenizers. Using tiktoken's encode directly with
`allowed_special="all"` avoids this path and properly handles special
tokens like `<|im_user|>` from chat templates.

## Test Plan

### Manual Testing
- Hardware: 2x Mac Studios connected via Thunderbolt 5 (mike22 and
james21)
- Tested Kimi K2 Thinking, GPT-OSS-120B, GPT-OSS-20B, LLama-3.1-8B-bf16, qwen3-30B-A3B-8bit model with pipeline parallelism across both
nodes
- Verified warmup inference completes successfully
- Verified chat completions work with special tokens

### Automated Testing
- Added `test_tokenizers.py` with 31 tests covering:
- Basic encode/decode for all model families (deepseek, kimi, llama,
qwen, gpt-oss, glm)
  - Special token encoding (critical for chat templates)
  - Chat template application and encoding
  - Kimi-specific and GLM-specific edge cases
- All tests pass: `uv run pytest
src/exo/worker/tests/unittests/test_mlx/test_tokenizers.py`

### Failing Tests
RDMA with all models.

---------

Co-authored-by: Evan <evanev7@gmail.com>
2026-01-13 12:06:04 +00:00
Jake Hillion
b968d6f0a0 ci: remove old commented out job 2026-01-13 12:42:04 +01:00
Jake Hillion
3bfffd9b4f ci: build all Nix outputs on all platforms and push to cachix
The CI was only running `nix flake check` on ubuntu-latest, missing
builds for other platforms and not caching packages or devShells.

Added a matrix-based `nix-build` job that runs on macos-26 (aarch64-darwin),
ubuntu-latest (x86_64-linux), and ubuntu-24.04-arm (aarch64-linux). Each
job enumerates all packages and devShells via `nix flake show --json`,
builds them in a single `nix build` call for parallelization, then runs
`nix flake check`. The cachix-action pushes all built outputs automatically.

This ensures all Nix outputs are built and cached for every supported
platform, speeding up local development and CI runs.

Test plan:
- Tested jq enumeration command locally, correctly outputs devShell paths
- Verified xargs pipeline works with the enumerated outputs
2026-01-13 12:37:12 +01:00
Jake Hillion
007eb80029 nix: enable cachix
Enable cachix and push to it in the pipeline.yml workflow. This won't
cache a huge amount yet but will automatically extend our caching as we
build more of the repo with Nix in CI. It can also be used by local
users by accepting our cache to improve the speed of local builds.

Test plan:
- CI
2026-01-12 17:24:59 +01:00
Jake Hillion
8d7b6789b3 dashboard: show disk usage for completed models
The downloads dashboard showed "Completed" for finished model downloads
but provided no indication of how much disk space each model or the
total models on a node were using.

Added total_bytes field to DownloadCompleted type so the size is
preserved when a download completes. Updated the dashboard to display
the model size next to "Completed" status (e.g., "Completed (251.1GB)")
and a total disk usage line below the model count for each node (e.g.,
"502.2GB on disk").

Test plan:
- Ran unit tests for download apply and planning logic
- Type checked all modified files with basedpyright
2026-01-12 16:34:29 +01:00
Jake Hillion
3c5b7ea670 ci: add workflow_dispatch trigger to build-app
Build app is the most convenient way to get a DMG for testing, but
currently it's a bit limited. You have to push to test-app every time
which is far from ideal and requires a bit too much force pushing for my
liking.

Add the workflow_dispatch trigger. This adds a button in the actions UI
to trigger a workflow for a named branch, which means you can use your
normal dev branch instead of having to push to test-app. We'll leave
that behaviour there for now too, though it may change in future.

Filter on `"${{ github.event_name }}" == "workflow_dispatch"` and set
those to alpha as well. Will verify by pushing the first version from
`main` just in case. Unfortunately we do have to merge this before we
can test it.

Test plan:
- Looking really hard.
2026-01-12 12:14:21 +01:00
PG
b74a610537 Add a basic documentation to the api interface (#1122)
## Motivation

Adds basic api documentation

## Changes

- Add docs/api.md
- Modify README.md
2026-01-11 18:44:40 +00:00
Jake Hillion
18c4e49f91 nix: put treefmt in devshell
treefmt is a useful to be able to access directly for some formatters like
`jj fix`. Expose it in the devshell.

Test plan:
- Used with `jj fix` on a large branch. It worked.
2026-01-09 17:53:50 +01:00
Sami Khan
d85b5d3781 feat: uninstall button (#1077)
## Motivation

https://github.com/exo-explore/exo/issues/1075

## Changes

- Added in-app "Uninstall" option under Advanced menu that cleanly
removes all system components
- Added NetworkSetupHelper.uninstall() to remove LaunchDaemon, scripts,
logs, and restore network settings
- Added LaunchAtLoginHelper.disable() to unregister from login items
- Created standalone uninstall-exo.sh script for users who already
deleted the app
- Added uninstall documentation to README

<img width="386" height="577" alt="image"
src="https://github.com/user-attachments/assets/6bbcd18a-992a-409d-8791-ed5e13bbcfe0"
/>
<img width="372" height="432" alt="image"
src="https://github.com/user-attachments/assets/ee76b45d-c111-4807-ab28-3f2f20e01140"
/>


## Why It Works

The in-app uninstaller runs a privileged shell script (via AppleScript)
to launchctl bootout the daemon, remove files, and restore the
"Automatic" network location. The standalone script provides the same
cleanup for users who already deleted the app.

## Test Plan

### Manual Testing
Hardware: MacBook Pro
- Built and ran app, verified LaunchDaemon and network location were
created
- Used in-app Uninstall, verified all components removed and network
restored to Automatic
- Rebuilt app, quit normally, ran sudo ./uninstall-exo.sh, verified same
cleanup

### Automated Testing
N/A

---------

Co-authored-by: Evan <evanev7@gmail.com>
2026-01-09 14:49:08 +00:00
Evan Quiney
caafc48693 Forward tools to the models chat template properly (#1106)
We did not properly forward tools to the chat template before. This is not a full tool calling impl - but it should improve things slightly.

## Changes made

Pass tools to the hf tokenizers chat template
Join message chunks into a larger message (opencode does this sometimes - we were ignoring before)

## Future work

We need to parse the model output and normalise the return format to be compatible with the openai api.
2026-01-09 13:28:41 +00:00
Evan
cca8c9984a cleanup unused dependencies
we have a lot of dependencies we have no intent of using. kill them with
fire!

## testing
exo still launches and does the worst inference known to man on my Qwen3
instance. tests pass too!!
2026-01-09 13:11:58 +00:00
Sami Khan
d1e88def42 scrollbars fixed (#1113)
## Motivation

Fixes https://github.com/exo-explore/exo/issues/1107 - Horizontal
scrollbar always appears in instances section, and vertical scrollbar
appears too early (with just 1-2 instances on large screens).


## Changes

- Added overflow-x-hidden to remove horizontal scrollbar
- Added xl:max-h-96 for responsive vertical height (384px on xl+ screens
vs 288px default)
- Added py-px to accommodate corner accent decorations that extend 1px
outside cards

## Why It Works

- overflow-x-hidden prevents horizontal scroll regardless of content
- Larger max-height on xl screens fits 2 instances without scrollbar;
3rd triggers it
- 1px vertical padding accommodates the -top-px/-bottom-px positioned
corner accents that caused tiny overflow

## Test Plan

### Manual Testing
<img width="1190" height="868" alt="image"
src="https://github.com/user-attachments/assets/2a582328-5b4f-4490-a488-52106f2e85ef"
/>

### Automated Testing
N/A
2026-01-09 12:51:05 +00:00
Sami Khan
59e7594e34 UNKNOWN to PREPARING (#1112)
## Motivation

The "UNKNOWN" status shown when first launching an instance is confusing
and unhelpful. "PREPARING" better describes what's actually happening.

![telegram-cloud-photo-size-4-5981245965962251168-x](https://github.com/user-attachments/assets/65b0802b-fb64-4fa7-bff7-c13757035b3a)


## Changes

- Renamed status from "UNKNOWN" to "PREPARING" in dashboard
(+page.svelte)
- Renamed unknown state to preparing in macOS app
(InstanceViewModel.swift, InstanceRowView.swift)

## Why It Works

The status appears when an instance exists but runners haven't reported
status yet. "PREPARING" accurately describes this transitional state.

## Test Plan

### Manual Testing
Hardware: MacBook Pro
<img width="319" height="200" alt="image"
src="https://github.com/user-attachments/assets/9a1c3caf-026d-47ea-80d1-63c6e41d93aa"
/>

### Automated Testing
N/A
2026-01-09 11:46:51 +00:00
Chris A
c65320acd3 Fix mlx seed (#1094)

---------

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: Ryuichi Leo Takashige <leo@exolabs.net>
2026-01-09 01:40:15 +00:00
Jake Hillion
b9a78f6f3a ci: compute CURRENT_PROJECT_VERSION from semver
Previous Sparkle builds were cut from a different repo with different
build numbers, breaking version ordering. Users aren't receiving updates
because CFBundleVersion values don't reflect the actual version sequence.

Added a step to compute the build version deterministically from semver:
PRERELEASE + (1000 * PATCH) + (1_000_000 * MINOR) + (1_000_000_000 * MAJOR).
Release versions use prerelease=999 to ensure they're always higher than
their prereleases (e.g., 1.0.61 > 1.0.61-alpha.3).

This ensures consistent version ordering across repos, allowing Sparkle
to correctly identify and deliver updates to users.

Test plan:
- Verified formula with test script:

```sh
compute_version() {
  VERSION="$1"
  BASE_VERSION="${VERSION%%-*}"
  MAJOR=$(echo "$BASE_VERSION" | cut -d. -f1)
  MINOR=$(echo "$BASE_VERSION" | cut -d. -f2)
  PATCH=$(echo "$BASE_VERSION" | cut -d. -f3)

  if [[ "$VERSION" == *-* ]]; then
    PRERELEASE_PART="${VERSION#*-}"
    PRERELEASE_NUM="${PRERELEASE_PART##*.}"
    if ! [[ "$PRERELEASE_NUM" =~ ^[0-9]+$ ]]; then
      PRERELEASE_NUM=0
    fi
  else
    PRERELEASE_NUM=999
  fi

  BUILD_VERSION=$((PRERELEASE_NUM + 1000 * PATCH + 1000000 * MINOR + 1000000000 * MAJOR))
  printf "%-20s -> %12s\n" "$VERSION" "$BUILD_VERSION"
}

compute_version "1.0.61-alpha.2"
compute_version "1.0.61-alpha.3"
compute_version "1.0.61"
compute_version "1.0.62-alpha.1"
compute_version "1.1.0-alpha.1"
compute_version "2.0.0-alpha.1"
compute_version "0.0.0-alpha.0"
compute_version "0.0.1-alpha.1"
compute_version "1.2.3"
compute_version "1.2.3-beta.5"
```

- Output:

```sh
Version              -> Build Number
----------------------------------------
1.0.61-alpha.2       ->   1000061002
1.0.61-alpha.3       ->   1000061003
1.0.61               ->   1000061999
1.0.62-alpha.1       ->   1000062001
1.1.0-alpha.1        ->   1001000001
2.0.0-alpha.1        ->   2000000001
0.0.0-alpha.0        ->            0
0.0.1-alpha.1        ->         1001
1.2.3                ->   1002003999
1.2.3-beta.5         ->   1002003005
```

- Confirmed ordering: alpha.2 < alpha.3 < release < next-alpha
2026-01-08 19:52:33 +01:00
Jake Hillion
8f7f0e893a ci: avoid uploading alpha appcasts
Currently alpha appcasts get uploaded. It turns out these overwrite the
standard appcast, so even though no one will update to the alpha
channel, everyone misses regular updates whenever the latest build is
an alpha one.

Ideally we should combine the source of truth for both the alpha and
release channels, but as no one is using the alpha channel yet, let's
stop uploading it for now.

Test plan:

![eyes](https://media1.giphy.com/media/v1.Y2lkPTc5MGI3NjExeGNwdDk0dmdscjlkZnd6eGxhcjJzdDBsYndmc2t2cnlpZDNxZnZhYSZlcD12MV9pbnRlcm5hbF9naWZfYnlfaWQmY3Q9Zw/gKHGnB1ml0moQdjhEJ/giphy.gif)
2026-01-08 18:52:10 +01:00
Alex Cheema
4759b09d4c Use presigned URLs for bug report uploads (#1109)
## Motivation

Previously we hardcoded AWS credentials into the app.
This is not good practice.

## Changes

Use presigned URLs instead.

## Why It Works

Presigned URLs are an S3 feature made for exactly this: an expiring URL
scoped to specific permissions. In this case we generate a presigned
URL with `s3:PutObject` permission that expires after 5 minutes. The
client uses this presigned URL to upload a bug report instead of
signing the request with its own credentials. This also simplifies a
lot of the Swift code.
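
A minimal sketch of the server-side flow, assuming boto3; the bucket and key
names here are made up:

```python
# Hypothetical sketch: generate a short-lived presigned PUT URL for a bug report.
# Assumes server-side boto3 credentials; bucket and key naming are illustrative.
import boto3

s3 = boto3.client("s3")

def presign_bug_report_upload(report_id: str) -> str:
    # s3:PutObject permission only, expires after 5 minutes (300 seconds)
    return s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": "example-bug-reports", "Key": f"reports/{report_id}.zip"},
        ExpiresIn=300,
    )

# The client then uploads with a plain HTTP PUT to the returned URL,
# e.g. `curl -X PUT --data-binary @report.zip "<presigned-url>"`.
```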

## Test Plan

### Manual Testing
On a single MacBook, I downloaded the app and sent a bug report. It
worked and appeared in the bucket.
2026-01-08 17:17:48 +00:00
Alex Cheema
ca680185f3 Display RDMA debug info in macOS app. (#1072)
## Motivation

Often users are running into issues with RDMA. See
https://github.com/exo-explore/exo/issues?q=is%3Aissue%20rdma
Having some debug info in the macOS app will help to debug these issues.

## Changes

Displays output of the following commands in the debug info section of
the macOS app:

1. `rdma_ctl status`
2. `ibv_devices`
3. `ibv_devinfo`

## Why It Works

It displays RDMA debug info in the debug info section of the macOS app.

## Test Plan

### Manual Testing
We need to make a new build of the macOS app and check the output under
the following conditions:

1. No RDMA enabled.
2. RDMA enabled but no devices connected over TB5.
3. RDMA enabled and devices connected over TB5.
2026-01-08 15:17:00 +00:00
Jake Hillion
383309e24e fmt: add typescript formatting
Add TypeScript auto formatting with Prettier and treefmt-nix. Added a
.prettierrc that enables useTabs, which isn't the default, to reduce
churn. The rest looks okay and will be checked by CI.

Test plan:
- CI
2026-01-08 13:47:27 +00:00
Jake Hillion
55463a9806 fmt: add swift formatting
Swift code currently has no auto formatting. Add `swift-format` to the
`treefmt-nix` config to get this formatted.

As our existing Swift code uses 4-space formatting instead of the
default 2-space, this also adds a custom `.swift-format`.

Test plan:
- CI
2026-01-08 13:34:45 +00:00
Evan Quiney
56af61fac9 add a server for distributed testing in /tests until we work out a stable solution. (#1098)
## Motivation

Testing multiple devices simultaneously requires coordination, and we
don't necessarily want to run a full EXO to test single components. We
need a mid-scale integration testing framework for distributed tests.

## Changes

Add a simple Python server + bash query that runs Jaccl and Ring tests
without constructing a worker/master/networking. The query currently
relies on all devices being accessible over Tailscale.

## Test Plan

Manually tested RDMA + Ring inference on 2 nodes.
2026-01-08 12:50:04 +00:00
Evan Quiney
f76d543d98 We shouldn't fail on an HTTPException in the tier-2 discovery system. (#1104)
## Motivation

Fixed a crash we found

## Changes

Wrap the call in try/except and return None if we get an exception, instead of
crashing exo.
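
A rough sketch of the shape of the fix; the actual function and exception
types in the discovery code may differ:

```python
# Hypothetical sketch: swallow HTTP errors from a tier-2 discovery lookup
# instead of letting them crash exo. Names are illustrative.
import aiohttp

async def fetch_peer_info(session: aiohttp.ClientSession, url: str) -> dict | None:
    try:
        async with session.get(url) as resp:
            resp.raise_for_status()
            return await resp.json()
    except aiohttp.ClientError:
        # Discovery is best-effort: returning None lets the caller skip this peer.
        return None
```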

## Test Plan

### Manual Testing
Exo launches. Couldn't repro the original case where this arose.
2026-01-08 12:43:34 +00:00
Sami Khan
ea841aca37 local network check (#1103)
## Motivation

After machine restart, macOS local network permission can appear enabled
in System Settings but not actually work. EXO fails to discover other
machines, and the only fix is manually toggling the permission off/on
and relaunching. Users had no way to know this was happening.

## Changes

- Added LocalNetworkChecker service that detects if local network access
is actually functional
- Added warning banner with instructions and "Open Settings" button when
blocked
- Added NSLocalNetworkUsageDescription and NSBonjourServices to
Info.plist (required by macOS)

<img width="386" height="712" alt="image"
src="https://github.com/user-attachments/assets/c6fc873d-2c6a-4c9b-89cb-f7bc7322e25b"
/>

## Why It Works

Uses NWConnection to UDP multicast address 224.0.0.251:5353 (mDNS),
which is subject to the app's actual TCC permission state. Other
approaches (NWBrowser, dns-sd subprocess) either require additional
entitlements or run with their own permissions, giving false results.

## Test Plan

### Manual Testing
Hardware: MacBook Pro
  - Toggle local network OFF in System Settings → warning banner appears
  - Toggle local network ON → warning disappears
  - Verified detection correctly reflects actual permission state

### Automated Testing
N/A
2026-01-08 12:24:46 +00:00
rltakashige
077b1bc732 exo-bench (Benchmark model pp & tg speed) (#1099)
## Motivation

This PR implements benchmarking in the style of llama-bench. The main
difficulty here is the fact that exo is not a library - it exposes an
endpoint. This means that benchmarking numbers will be inaccurate if the
API is measured.

The solution assumes nodes are set up with uv run exo (or via the app),
and then hits the new endpoint /bench/chat/completions to retrieve
generation statistics directly from mlx_lm.
<!-- Why is this change needed? What problem does it solve? -->

This will allow us to release benchmarks for models and perform
regression tests.

TODO: Performance benchmarking.
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->
- Adds /bench/chat/completions endpoint
- Adds BenchChatCompletion/Response
- Adds a logits processor to prevent response from ending early
- Adds a "Prompt Sizer" which downloads the tokenizer and dynamically
adjusts the prompt of "a" to fit the desired prompt size.
- Reduce prefill step size to 2048 for now (in future, dynamically
adjust this value)

<!-- Explain why your approach solves the problem -->
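
A hedged sketch of how a client might hit the endpoint described above; the
request fields shown are assumptions, not the exact schema:

```python
# Hypothetical sketch: query the /bench/chat/completions endpoint of a running
# exo node and read back generation statistics. Field names are assumptions.
import requests

def bench_once(base_url: str = "http://localhost:52415") -> None:
    resp = requests.post(
        f"{base_url}/bench/chat/completions",
        json={
            "model": "llama-3.2-1b",       # illustrative model id
            "prompt_tokens": 2048,         # desired prompt size for the prompt sizer
            "generation_tokens": 128,      # desired generation length
        },
        timeout=600,
    )
    resp.raise_for_status()
    # e.g. prompt processing (pp) and token generation (tg) speeds from mlx_lm
    print(resp.json())

if __name__ == "__main__":
    bench_once()
```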

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->
Benchmarked Llama, Qwen, DeepSeek and Kimi models. Will require several
fixes to run consistently on all configurations (to be done in the
future).
Manually tested the normal API to verify chat requests complete as
expected.

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
Not really possible. Type checker passes.
2026-01-06 17:39:09 +00:00
Alex Cheema
4963c33162 Fix Discord link in README.md. Fixes #1096 (#1097)
## Motivation

Discord link expired.

## Changes

Replace discord invite link with permanent link.

## Why It Works

It's permanent now.

## Test Plan

Clicked the link. It works.
2026-01-06 14:05:09 +00:00
madanlalit
4f6fcd9e93 feat(macos-app): add custom namespace UI for cluster isolation
Add an Advanced Options section with a custom namespace field that allows
users to override the EXO_LIBP2P_NAMESPACE environment variable. This
enables splitting machines that can see each other into separate
clusters.

- Added customNamespace property with UserDefaults persistence
- Added Advanced Options collapsible section with text field
- Added Save & Restart button that auto-restarts exo process
- Namespace replaces buildTag when custom value is set
- Falls back to buildTag (version) when namespace is empty
2026-01-05 15:25:00 +01:00
Evan Quiney
839b67f318 [feat] Add an option to disable the worker (#1091)
## Motivation

Workerless machines can be used for networking without running any GPU
jobs - add a CLI flag that adds this basic functionality.

## Changes

Adds the --no-worker cli flag

## Test Plan

### Manual Testing

Exo starts as expected

### Automated Testing

None
2026-01-05 12:05:03 +00:00
Drifter4242
47b8e0ce12 feat: remember last launch settings (model, sharding, instance type) (#1028)
## Motivation

Saves the last launch settings, so that the next time you run exo it
will default to the same launch settings.
This is just a small quality of life improvement.

## Changes

When you launch, it saves the settings to the browser's local storage.
When it fills out the model list, it reads the saved settings and sets
the defaults.

I reviewed, tested and edited the code, but some of the code was written
by Claude Opus. I hope that's ok.

## Why It Works

See above

## Test Plan

### Manual Testing

I have two Mac Studio M3 Ultras, each with 512GB RAM, connected with
Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split.
I ran exo multiple times to confirm that the default works.

### Automated Testing

No changes to automated testing.
2026-01-05 11:27:14 +00:00
Evan Quiney
17f9b583a4 Task Deduplication (#1062) 2026-01-03 20:01:49 +00:00
RickyChen / 陳昭儒
844bcc7ce6 fix: prevent form submission during IME composition (#1069)
## Problem
When typing in Chinese (or other IME-based languages like
Japanese/Korean), pressing Enter to select a character from the IME
candidate list would incorrectly submit the message instead of
confirming the character selection.

## Solution
Added IME composition state detection in the `handleKeydown` function in
`ChatForm.svelte`:
- Check `event.isComposing` to detect active IME composition
- Fallback to `event.keyCode === 229` for broader browser compatibility
- Return early when IME is active, allowing normal character selection

## Changes
- Modified `dashboard/src/lib/components/ChatForm.svelte` 
- Added IME composition check before Enter key handling

Co-authored-by: Ricky Chen <rickychen@Rickys-MacBook-Pro.local>
2025-12-31 17:11:04 +00:00
Evan Quiney
c1be5184b2 Fix tests broken by 283c (#1063)
Some tests were broken by #1058 and #1046 - this fixes them.
2025-12-31 01:53:55 +00:00
Alex Cheema
1ec550dff1 Emit download progress on start, and change downloads to be keyed by model_id (#1044)
## Motivation

We added a download page to the dashboard which shows the current
download status of each model on each node. Users have reported this to
be extremely useful.

However, we don't currently fetch the download progress on start, so it
doesn't show any model's download status.

## Changes

Fetch and emit model download status on worker start, and then
periodically every 5 mins.
Also, to support this, I changed download_status to be keyed by model_id
instead of shard, since we want the download status of each model, not
each shard.

## Why It Works

The dashboard already implements the correct functionality; we just
weren't populating the download status in the state. Now it gets
populated and shows correctly.
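
A minimal sketch of the emit-on-start plus periodic refresh described above
(helper names are placeholders):

```python
# Hypothetical sketch: emit download status for every model on worker start,
# then refresh periodically every 5 minutes. Helper names are placeholders.
import asyncio

REFRESH_INTERVAL_SECONDS = 5 * 60

async def emit_download_status_loop(worker) -> None:
    while True:
        # Keyed by model_id rather than shard, so each model appears once.
        for model_id, status in (await worker.get_download_status_by_model()).items():
            worker.emit_download_status(model_id, status)
        await asyncio.sleep(REFRESH_INTERVAL_SECONDS)
```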

## Test Plan

### Manual Testing
On a cluster of 2 x 512GB M3 Ultra Mac Studios, I launched an instance
onto one node of a model that hadn't been downloaded. I checked the
download page and it showed the in-progress download. I downloaded it to
completion, restarted exo on both nodes, and then opened the download
page: it showed the model as 100% downloaded and the other models that
hadn't been downloaded as 0%.

---------

Co-authored-by: Evan <evanev7@gmail.com>
2025-12-31 01:18:10 +00:00
Alex Cheema
283c0e39e4 Placement filters for tensor parallel supports_tensor, tensor dimension and pipeline parallel deepseek v3.1 (#1058)
## Motivation

Certain placements are not valid. Added filters to exclude these placements. Invalid placement previews were being shown in the dashboard, which would then fail when the user actually tried to launch an instance with that placement.


## Changes

Three filters added:

1. Certain models do not support tensor parallel at all. Checks `supports_tensor` on the model_meta.
2. For models that do support tensor parallelism, certain tensor parallel sizes are not valid. This check is not strictly correct yet, but it works fine for now; the fully correct check is more involved.
3. For unknown reasons, deepseek v3.1 (8-bit) does not work with tensor parallelism.

## Why It Works

`place_instance` now raises an `Exception` for invalid placements.
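
A rough sketch of the kind of checks `place_instance` now performs; the
metadata fields and exact conditions are illustrative:

```python
# Hypothetical sketch of placement validation. Model metadata fields and the
# deepseek special case are illustrative of the filters listed above.
def validate_placement(model_meta, strategy: str, tensor_parallel_size: int) -> None:
    if strategy == "tensor":
        # 1. Some models do not support tensor parallelism at all.
        if not getattr(model_meta, "supports_tensor", False):
            raise Exception(f"{model_meta.model_id} does not support tensor parallel")
        # 2. Only certain tensor parallel sizes are valid (approximate check for now).
        if model_meta.num_attention_heads % tensor_parallel_size != 0:
            raise Exception(f"invalid tensor parallel size {tensor_parallel_size}")
        # 3. Known-bad combination: deepseek v3.1 (8-bit) with tensor parallelism.
        if "deepseek-v3.1" in str(model_meta.model_id).lower():
            raise Exception("deepseek v3.1 (8-bit) does not work with tensor parallel")
```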

## Test Plan

### Manual Testing
Since `/instance/previews` enumerates all possible placements and runs `place_instance`, I checked the dashboard to see if invalid placements are still shown.
2025-12-31 00:33:40 +00:00
Alex Cheema
35be4c55c3 prioritise mlx jaccl coordinator ip (en0 -> en1 -> non-TB5 -> other) 2025-12-31 00:10:19 +00:00
Alex Cheema
31d4cd8409 set KV_CACHE_BITS to None to disable quantized kv cache 2025-12-31 00:03:30 +00:00
Alex Cheema
8a6da58404 remove mx.set_cache_limit 2025-12-30 23:58:15 +00:00
Alex Cheema
16e2bfd3b3 log EXO_LIBP2P_NAMESPACE on start 2025-12-30 04:08:47 +00:00
Alex Cheema
ade3ee7ec5 fix warmup order. should be rank!=0 then rank=0 2025-12-30 03:29:34 +00:00
Evan Quiney
fea42473dd Place local node at the top of the dashboard. (#1033)
@samiamjidkhan and @AlexCheema's work moving the topology to place the
local node at the top of the topology in the app dashboard.
2025-12-28 21:12:47 +00:00
Alex Cheema
ca7adcc2a8 Update README.md with instructions to enable RDMA. (#1031)
## Motivation

We didn't have instructions for enabling RDMA on macOS.

## Changes

I added instructions for enabling RDMA on macOS.

## Why It Works

Tried it on my M4 Max MacBook Pro and works.

## Test Plan

### Manual Testing
Tried it on my M4 Max MacBook Pro and works.

### Automated Testing
In the future, we could automate this from fresh macOS builds using KVM
over IP. See #1030
2025-12-28 20:56:26 +00:00
Evan Quiney
9d9e24f969 some dashboard updates (#1017)
Mostly @samiamjidkhan and @AlexCheema's work in progress.

---------

Co-authored-by: Sami Khan <smsak99@gmail.com>
Co-authored-by: Alex Cheema
2025-12-28 20:50:23 +00:00
Jake Hillion
b5d424b658 placement: generate per-node host lists for MLX ring backend
Pipeline + MLX Ring worked with 2 nodes but failed to initialize with
3 or more nodes. The MLX ring backend requires each node to know its
specific left and right neighbors in the ring, but the previous
implementation provided a single flat host list shared by all nodes.

With 2 nodes, a flat list [host0, host1] accidentally worked because
each node could find its only neighbor. With 3+ nodes, each node needs
a customized view:
- Rank 0: [self, right_neighbor, placeholder]
- Rank 1: [left_neighbor, self, right_neighbor]
- Rank 2: [placeholder, left_neighbor, self]

Changed MlxRingInstance from `hosts: list[Host]` to
`hosts_by_node: dict[NodeId, list[Host]]` with `ephemeral_port: int`.

Added `get_mlx_ring_hosts_by_node()` which generates per-node host
lists where:
- Self position uses 0.0.0.0 for local binding
- Left/right neighbors use actual connection IPs
- Non-neighbors use 198.51.100.1 (RFC 5737 TEST-NET-2 placeholder)
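
A simplified sketch of the per-node generation described above; real code
works with Host/NodeId types and per-interface IP selection, this only shows
the neighbour logic:

```python
# Hypothetical sketch of per-node host list generation for the MLX ring backend,
# following the per-rank views listed above. Names and types are illustrative.
PLACEHOLDER = "198.51.100.1"  # RFC 5737 TEST-NET-2, never actually contacted

def ring_hosts_for_rank(rank: int, node_ips: list[str]) -> list[str]:
    hosts = []
    for i, ip in enumerate(node_ips):
        if i == rank:
            hosts.append("0.0.0.0")      # self: bind locally
        elif abs(i - rank) == 1:
            hosts.append(ip)             # left/right neighbour: real connection IP
        else:
            hosts.append(PLACEHOLDER)    # non-neighbour placeholder
    return hosts

# For 3 nodes this yields the views listed above, e.g. rank 1 sees
# [left_ip, "0.0.0.0", right_ip].
```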

Also added IP prioritization (en0 > en1 > non-Thunderbolt > any) to
prefer stable network interfaces.

Fixed topology discovery recording loopback addresses (127.0.0.1) as
valid connections to remote nodes. The reachability check now verifies
node identity via HTTP GET /node_id rather than just checking if the
port is open.

Test plan:

- Built a DMG [0]
- Installed on all Macs and started cluster.
- Requested a 3 node Pipeline + MLX Ring Llama 3.3 70B (FP16).
- It started and I was able to send a few chat messages.

Eventually my instance seemed to get into a broken state and chat
stopped working, but this commit is a clear step forward.

[0] https://github.com/exo-explore/exo/actions/runs/20473983471/job/58834969418
2025-12-28 20:38:20 +00:00
Drifter4242
b465134012 Fix Kimi K2 Thinking download by adding tiktoken.model to download patterns (#1024)
Kimi-K2 Thinking uses tiktoken.model for its tokenizer, which wasn't
being downloaded. This adds it to the default_patterns alongside
tokenizer.model.
I'm a bit confused why this isn't a problem for other people - I know
that others have used Kimi K2 (I wonder if they manually fixed the
download).

## Motivation

I downloaded Kimi K2 Thinking and it didn't work because it didn't
download tiktoken.model file.

## Changes

Added tiktoken.model to the default patterns.

## Why It Works

Now downloads the file.
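
For illustration, the shape of the change; the real default pattern list in
exo is longer, and the surrounding entries here are assumptions:

```python
# Hypothetical sketch: tokenizer-related files included in the default
# download patterns. Only the tiktoken.model entry is the actual addition.
default_patterns = [
    "tokenizer.model",
    "tiktoken.model",   # needed by Kimi K2 Thinking's tokenizer
    "tokenizer.json",
    "tokenizer_config.json",
]
```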

## Test Plan

### Manual Testing

I have two Mac Studio M3 Ultras, each with 512GB RAM, connected with
Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split. It
ran successfully.

### Automated Testing
No automated test changes. I don't think they are needed.
2025-12-28 19:30:31 +00:00
Matiwos Kebede
eabdcab978 Fix linux docs (#1022)
This PR updates the "Run from Source (Mac & Linux)" section in README.md
to clarify Linux instructions.

Changes include:
- Split the section into macOS and Linux subsections.
- Added native Linux package manager commands (apt, dnf, pacman) for
dependencies: uv, node, npm.
- Clarified that macmon is macOS-only.
- Noted that Homebrew on Linux is optional, with native package managers
preferred.

These changes improve clarity for Linux users and fix confusion from the
previous macOS-centric instructions.
2025-12-27 19:56:44 +00:00
Evan Quiney
8e9332d6a7 Separate out the Runner's behaviour into a "connect" phase and a "load" phase (#1006)
## Motivation

We should ensure all runners are connected before loading the model -
this gives the worker's planning mechanism finer-grained control over
the runners' state in the future.

## Changes

- Introduced task ConnectToGroup, preceding LoadModel
- Introduced runner statuses Idle, Connecting, Connected (see the sketch below)
- Separated out initialize_mlx from shard_and_load
- Single instances never go through the connecting phase
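
A rough sketch of the ordering introduced here, using hypothetical method
names alongside the statuses listed above:

```python
# Hypothetical sketch of the connect-then-load ordering for a runner.
# Enum values mirror the statuses above; runner methods are placeholders.
from enum import Enum, auto

class RunnerStatus(Enum):
    IDLE = auto()
    CONNECTING = auto()
    CONNECTED = auto()

async def run_tasks(runner, is_single_instance: bool) -> None:
    runner.status = RunnerStatus.IDLE
    if not is_single_instance:
        # ConnectToGroup precedes LoadModel so all ranks are present before loading.
        runner.status = RunnerStatus.CONNECTING
        await runner.connect_to_group()
        runner.status = RunnerStatus.CONNECTED
    # Single instances skip the connecting phase and go straight to loading.
    await runner.load_model()
```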

## Test Plan

### Automated Testing
Added a test for checking event ordering in a standard workflow.

### Manual Testing
Tested that Llama 3.2 1B and Kimi K2 Thinking load and shut down
repeatedly on multiple configurations.
Not exhaustive, however.

---------

Co-authored-by: rltakashige <rl.takashige@gmail.com>
2025-12-27 16:28:42 +00:00
Heath Dutton🕴️
4b65d5f896 Fix race condition in mlx_distributed_init with concurrent instances (#1012)
## Motivation

Fixes #1005

When multiple instances initialize concurrently with the same rank, they
overwrite each other's coordination files (hosts_{rank}.json), causing
"[jaccl] Malformed device file" errors and initialization failures.

## Changes

- Changed coordination filename from `./hosts_{rank}.json` to
`./hosts_{instance_id}_{rank}.json` to make it unique per instance
- Added cleanup in a finally block to remove coordination files after
initialization completes
- Applied fix to both MlxRingInstance and MlxJacclInstance cases

## Why It Works

Each instance now gets a unique coordination file based on its
instance_id, preventing concurrent instances from overwriting each
other's files. The cleanup logic ensures files are removed after use,
preventing accumulation and handling both success and failure cases.
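
A minimal sketch of the fix's shape, with hypothetical helper names around
the coordination file handling:

```python
# Hypothetical sketch: per-instance coordination file with cleanup, so
# concurrent instances with the same rank no longer clobber each other.
import json
import os

def write_coordination_file(instance_id: str, rank: int, hosts: list[dict]) -> str:
    path = f"./hosts_{instance_id}_{rank}.json"   # unique per instance *and* rank
    with open(path, "w") as f:
        json.dump(hosts, f)
    return path

def init_with_coordination(instance_id: str, rank: int, hosts: list[dict]) -> None:
    path = write_coordination_file(instance_id, rank, hosts)
    try:
        ...  # mlx distributed init reads the coordination file here
    finally:
        # Remove the file whether init succeeded or failed.
        if os.path.exists(path):
            os.remove(path)
```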

## Test Plan

### Manual Testing
Code review and logic verification. The fix prevents the race condition
by ensuring filename uniqueness per instance.

### Automated Testing
No new tests added. Existing tests continue to pass.

---------

Co-authored-by: Ryuichi Leo Takashige <rl.takashige@gmail.com>
2025-12-27 16:13:26 +00:00
Jake Hillion
1c1792f5e8 mlx: update to 0.30.1 and align coordinator naming with MLX conventions
The Jaccl distributed backend requires MLX 0.30.1+, which includes the
RDMA over Thunderbolt support. The previous minimum version (0.29.3)
would fail at runtime with "The only valid values for backend are
'any', 'mpi' and 'ring' but 'jaccl' was provided."

Bump MLX dependency to >=0.30.1 and rename ibv_coordinators to
jaccl_coordinators to match MLX's naming conventions. This includes
the environment variable change from MLX_IBV_COORDINATOR to
MLX_JACCL_COORDINATOR.

Test plan:

Hardware setup: 3x Mac Studio M3 Ultra connected all-to-all with TB5

- Built a DMG [0]
- Installed on all Macs and started cluster.
- Requested a 2 node Tensor + MLX RDMA instance of Llama 3.3 70B (FP16).
- It started successfully.
- Queried the chat a few times. All was good. This didn't work
  previously.
- Killed the instance and spawned Pipeline + MLX Ring Llama 3.3 70B (FP16).
  Also started successfully on two nodes and could be queried.

Still not working:
- Pipeline + MLX Ring on 3 nodes is failing. Haven't debugged that yet.

[0] https://github.com/exo-explore/exo/actions/runs/20467656904/job/58815275013
2025-12-24 16:47:01 +00:00
Jake Hillion
9afc1043ef exo: handle -c flag for multiprocessing helpers in frozen apps
When Python's multiprocessing spawns child processes on macOS (using the
"spawn" method), it also spawns helper processes like the resource tracker
by executing:

    ./frozen_app -c "from multiprocessing.resource_tracker import main; main()"

A frozen PyInstaller app doesn't understand `-c` natively - it just runs
main(). This causes the resource tracker to fail silently.

This adds a minimal `-c` handler that intercepts the flag, extracts the
inline code, and exec()s it before main() runs. This is required for the
Process() spawn in runner_supervisor.py to work correctly in the DMG.
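
A minimal sketch of such a handler, assuming it runs at the very top of the
frozen entry point (names and exact placement are illustrative):

```python
# Hypothetical sketch: emulate `python -c "<code>"` in a frozen PyInstaller app
# so multiprocessing helpers (e.g. the resource tracker) can start.
import sys

def handle_dash_c() -> None:
    if len(sys.argv) >= 3 and sys.argv[1] == "-c":
        code = sys.argv[2]
        # Drop "-c <code>" so the inline snippet sees a normal argv.
        sys.argv = [sys.argv[0]] + sys.argv[3:]
        exec(code)
        sys.exit(0)

if __name__ == "__main__":
    handle_dash_c()  # must run before main() so helper spawns never reach it
    # main()
```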

Note that the pyinstaller docs say `freeze_support` is supposed to make
this work, but it doesn't.

Test plan:

Hardware setup: 3x Mac Studio M3 Ultra connected all-to-all with TB5

- Built a DMG[0].
- Installed on the Macs.
- Started an instance. Got an error this time in ~/.exo/exo.log. The
  last DMG from main doesn't show anything when an instance starts;
  this now shows the errors.

[0] https://github.com/exo-explore/exo/actions/runs/20464409279/job/58804485197
2025-12-23 17:08:50 +00:00
Evan Quiney
70c423f5e0 feat: conform to XDG Base Directory Specification on Linux (#988)
This is an extension of #964 with some cleanup.

---------

Co-authored-by: majiayu000 <1835304752@qq.com>
2025-12-23 17:02:55 +00:00
Jake Hillion
a24bdf7680 exo: enable multiprocessing support in PyInstaller bundles
Model loading fails silently when running from the DMG-packaged app,
despite working correctly with `uv run exo`. The bundled app spawns
child processes for model inference via multiprocessing, but these
processes fail to start in a frozen (PyInstaller) environment.

Add `freeze_support()` which is required for multiprocessing to work
in frozen applications.
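
For reference, the standard idiom (this is the documented multiprocessing
pattern, not exo's exact entry point):

```python
# Standard multiprocessing idiom for frozen (PyInstaller) executables:
# freeze_support() must run first under `if __name__ == "__main__"`.
import multiprocessing

def main() -> None:
    ...  # start exo

if __name__ == "__main__":
    multiprocessing.freeze_support()
    main()
```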

Test plan:

Hardware setup: 3x Mac Studio M3 Ultra connected all-to-all with TB5

- Built a DMG using a modified .github/workflows/build-app.yml[0] to avoid
  publishing it.
- Installed on all 3 Macs, replacing the existing Exo.
- Downloaded Llama 3.3 70B (FP16).
- Downloaded Qwen3 Coder 235B A22B (8-bit).

Things that work now but didn't on the previous app:
- Topology looks good, previously there was no discovery.

What didn't work:
- Started an instance with Pipeline + MLX Ring + 3 Nodes. Failed.
- Started an instance with Tensor + MLX RDMA + 2 Nodes. Failed.

Will continue debugging the instance starting issues separately.

[0] https://github.com/exo-explore/exo/actions/runs/20461320368
2025-12-23 14:34:21 +00:00
Jake Hillion
e8855959c1 build-app: add branch trigger from named branch
As I've been working on the .dmg, it's become clear we need a way to
test changes to the app. Reproducing the full DMG locally is too hard to
be reasonable, and it's much more convenient to test when it's signed.

Add a feature to the build-app workflow where if you push specifically
to the `test-app` branch it'll perform a build. The version is stubbed
to `0.0.0-alpha.0`, which is about as low as it gets in semver so you'll
always update away from it automatically with Sparkle. The resulting DMG
won't be pushed to S3 but will be uploaded as a GitHub Actions artifact.

I've been using similar commits to this for a while for testing. It's
worked well and not interfered with auto updating at all.

Test plan:
- Pushed this change to `test-app`.
- Generated action at
  https://github.com/exo-explore/exo/actions/runs/20447213358/job/58752909332
- Installed the DMG on a Mac. It worked as intended.
2025-12-23 12:53:30 +00:00
Jake Hillion
0a7fe5d943 ci: migrate build-app to github hosted runners 2025-12-22 19:51:48 +00:00
rltakashige
51a5191ff3 format readme (#978)
## Motivation

README looks weird after the last update.
<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->
I actually checked the file on GitHub this time.

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-22 18:06:27 +00:00
Evan Quiney
1efbd26388 add architecture.md, move images to docs/imgs (#968)
## Motivation

Documentation will make contribution easier and communicate our
development philosophy and decision process. Closes #967

## Changes

Added `architecture.md` to docs/ and moved the images out of docs and
into their own docs/imgs/ folder
2025-12-22 17:57:43 +00:00
Jake Hillion
02c915a88d pyproject: drop pathlib dependency 2025-12-22 17:52:44 +00:00
rltakashige
fc41bfa1f1 Add all prerequisites to README (#975)
## Motivation

Addresses #974 
```
INFO: pip is looking at multiple versions of exo to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement exo-pyo3-bindings (from exo) (from versions: none)
ERROR: No matching distribution found for exo-pyo3-bindings
```

## Changes

Describes the Rust dependency required for building from source.

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->
Tested locally; exo runs after this setup without the exo-pyo3-bindings error.

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-22 17:38:51 +00:00
Jake Hillion
dd0638b74d pyproject: add pyinstaller to dev-dependencies 2025-12-22 15:49:27 +00:00
majiayu000
e06830ce0b fix: update macOS app to use correct API port (52415)
Fixes #960

The macOS app was incorrectly using port 8000 instead of the default
exo API port 52415. This caused confusion as the README correctly
documents port 52415 but the app was connecting to a different port.
2025-12-22 13:24:09 +00:00
Jake Hillion
1df5079b98 ci: avoid pushing alpha build as latest 2025-12-22 13:00:49 +00:00
Nightguarder
1e75aeb2c2 Add Prerequisites to Readme (#936)
## Motivation
Users need to know what **prerequisites** they need in order to run exo.
A simple addition to the docs prevents future issues from being raised.

## Changes

Updated ``README.md``:
- to include installation instructions for
**[uv](https://github.com/astral-sh/uv)** and
**[macmon](https://github.com/vladkens/macmon)**.

Updated ``CONTRIBUTING.md``:
- to verify these prerequisites are met before starting development.

- Standardized on brew installation instructions for macOS users to keep
the guide simple.

## Why It Works

By listing these prerequisites upfront, users will set up their
environment correctly before attempting to run exo.

## Test Plan

### Manual Testing
MacBook Pro M4
- Verified that ``uv`` and ``macmon`` were missing initially, causing
failures
- After installing them via brew (as documented), ``uv run exo`` starts
successfully.

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->

---------

Co-authored-by: Evan Quiney <evanev7@gmail.com>
2025-12-22 02:28:08 +00:00
Heath Dutton🕴️
c582bdd673 bugfix: Handle MacMon errors gracefully 2025-12-22 02:21:29 +00:00
Jake Hillion
1bae8ebbf6 ci: add build-app workflow 2025-12-22 02:12:30 +00:00
Alex Cheema
abaeb0323d Update README.md. (#956)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
Made a mistake on the merge of the last PR.
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-21 23:09:44 +00:00
Alex Cheema
7d15fbdaab readme tweaks5 (#954)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-21 22:48:35 +00:00
Alex Cheema
4a6e0fe171 Update README.md. (#949)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-21 18:31:23 +00:00
Olimbek Nizomov
f4792dce14 fix(downloads): use certifi for robust SSL certificate verification (#941)

## Description
This change updates the SSL context creation in `download_utils.py` to
explicitly use the `certifi` CA bundle. This ensures that the
application has access to a reliable, up-to-date set of root
certificates, which is critical for verifying SSL connections to
external services like Hugging Face.

## Problem
On macOS environments (and potentially others), Python's default SSL
context often fails to locate the system's root certificates. This leads
to `aiohttp.client_exceptions.ClientConnectorCertificateError` errors
when attempting to download models.

## Solution
By passing `cafile=certifi.where()` to
`ssl.create_default_context()`, we force the application to use the
trusted certificate store provided by the `certifi` package. This is a
standard best practice for cross-platform Python applications and
resolves the verification failure.
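
A minimal sketch of the described change, assuming an aiohttp-based download path:

```python
# Sketch: build an SSL context from certifi's CA bundle and hand it to aiohttp.
# The surrounding download code is illustrative; the ssl/certifi calls are standard.
import ssl

import aiohttp
import certifi

ssl_context = ssl.create_default_context(cafile=certifi.where())

async def fetch(url: str) -> bytes:
    connector = aiohttp.TCPConnector(ssl=ssl_context)
    async with aiohttp.ClientSession(connector=connector) as session:
        async with session.get(url) as resp:
            resp.raise_for_status()
            return await resp.read()
```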
2025-12-21 12:03:52 +00:00
rltakashige
a1b14a272e Extend eos_token_id fix for other models (#938)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
We currently use mlx_lm's load_tokenizer instead of load. This means
that some models are missing some configurations, such as eos_token_id.
This is clear for a model like GLM, which does not finish token
generation.

## Changes

<!-- Describe what you changed in detail -->
A small stopgap, to allow eos_token_ids to be added, and a TODO for us
to migrate to load. The reason we don't want to do this now is that a
solid testing framework is not configured in this repo yet.

## Why It Works

<!-- Explain why your approach solves the problem -->
It just uses the eos_token_ids I obtained from loading a tokenizer in
mlx_lm and calling `tokenizer.eos_token_ids`.
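
A hedged sketch of the stopgap's shape; the mapping contents and attribute
handling are illustrative, not the exact code:

```python
# Hypothetical sketch: patch known-missing eos token ids onto a tokenizer
# loaded with load_tokenizer, until we migrate to mlx_lm's full load().
EXTRA_EOS_TOKEN_IDS: dict[str, set[int]] = {
    # "some-org/some-glm-model": {1234, 5678},  # ids from mlx_lm's tokenizer.eos_token_ids
}

def patch_eos_token_ids(model_id: str, tokenizer) -> None:
    extra = EXTRA_EOS_TOKEN_IDS.get(model_id)
    if extra:
        existing = set(getattr(tokenizer, "eos_token_ids", None) or [])
        tokenizer.eos_token_ids = existing | extra
```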

## Test Plan

### Manual Testing
Tested on several Macs.

### Automated Testing
None yet, as described.

---------

Co-authored-by: Evan <evanev7@gmail.com>
2025-12-20 20:18:17 +00:00
Alex Cheema
f8483cfc18 Update README.md. (#932)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-19 21:23:25 +00:00
Alex Cheema
8bafd6fe68 Update README.md (#925)
## Motivation

<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->

## Changes

<!-- Describe what you changed in detail -->

## Why It Works

<!-- Explain why your approach solves the problem -->

## Test Plan

### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->

### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
2025-12-19 14:38:40 +00:00
Jake Hillion
f16afd723d nix: get rust build working on linux 2025-12-19 13:51:15 +00:00
Alex Cheema
4da0043253 Update README.md (#917) 2025-12-18 20:38:00 +00:00
Jake Hillion
9e2bdeef92 LICENSE: Fix company name/year 2025-12-18 20:24:44 +00:00
Jake Hillion
379744fe5c exo: open source mac app and build process 2025-12-18 20:06:03 +00:00
Jake Hillion
74bae3ba6d Update README.md 2025-12-18 19:18:59 +00:00
Evan Quiney
9815283a82 8000 -> 52415 (#915)
* 8000 -> 52415

* dont grab the api port for placement

---------

Co-authored-by: rltakashige <rl.takashige@gmail.com>
2025-12-18 18:39:44 +00:00
Evan Quiney
5bd39e84d9 Merge pull request #914 from exo-explore/remove-old-cli-flag
remove old tb_only flag from master
2025-12-18 18:30:45 +00:00
Evan
658cf5ccf9 remove tb_only from master 2025-12-18 17:39:02 +00:00
rltakashige
170d2dcbaf Add Windows as a potential planned platform 2025-12-18 17:33:25 +00:00
Evan Quiney
ba66f14299 Merge pull request #912 from exo-explore/update-dashboard-error-message 2025-12-18 17:12:28 +00:00
Evan
274e35f926 update readme 2025-12-18 17:05:35 +00:00
Evan
3fe7bd250f update error message 2025-12-18 17:02:52 +00:00
Evan
004fea6935 clarify platform support 2025-12-18 16:27:43 +00:00
Evan
5c2d254fd1 add platform support information 2025-12-18 15:45:53 +00:00
Jake Hillion
19ca48c4f1 more readme fixups 2025-12-18 14:47:04 +00:00
Jake Hillion
57d3813692 re-add LICENSE 2025-12-18 14:35:40 +00:00
Evan
7cd1527ce3 update CONTRIBUTING 2025-12-18 14:35:20 +00:00
Evan Quiney
423c066ecc Merge pull request #906 from exo-explore/jj/sluxkvlmwons
re-add logos
2025-12-18 14:29:29 +00:00
Jake Hillion
ebf0e18c0e re-add logos 2025-12-18 14:26:27 +00:00
Evan
28a6151b8e remove discord link from README 2025-12-18 14:02:38 +00:00
Jake Hillion
2c16e00be9 github docs 2025-12-18 13:49:07 +00:00
Jake Hillion
f64d17fac0 exo v1 2025-12-18 13:46:40 +00:00
Jake Hillion
0fcee70833 prep repo for v1 2025-12-17 15:31:02 +00:00
Evan Quiney
09593c5e85 backport the dashboard to staging 2025-12-17 12:22:22 +00:00
Evan Quiney
880a18d205 fix disconnects
Co-authored-by: Ryuichi Leo Takashige <leo@exolabs.net>
2025-12-15 15:23:13 +00:00
rltakashige
70298ce0a9 Negative index nack request 2025-12-09 07:57:28 -08:00
Jake Hillion
ac3a0a6b47 ci: enable ruff check in CI through nix 2025-12-09 12:26:56 +00:00
rltakashige
859233a279 Reduce RequestEventLog spam 2025-12-09 11:43:54 +00:00
Evan Quiney
c9e2062f6e switch from uvicorn to hypercorn 2025-12-05 17:29:06 +00:00
Jake Hillion
e8566a3f95 placement: pass different ibv_coordinator per node 2025-12-05 17:23:22 +00:00
Jake Hillion
39d76aa0a5 nix: move formatting checks to nix and enable in ci 2025-12-05 17:00:33 +00:00
Jake Hillion
5629983809 fmt: format all python/rust/nix files 2025-12-05 16:58:55 +00:00
Evan Quiney
7312a7e000 plan fix 2025-12-05 16:43:11 +00:00
Evan Quiney
9e0a1c23ef rename ibv to jaccl inline with mlx 2025-12-05 16:42:43 +00:00
Evan Quiney
f5783d6455 proper collection of rdma ports in placement 2025-12-05 16:42:20 +00:00
Evan Quiney
e702313b32 pingers
Co-authored-by: Jake Hillion <jake@hillion.co.uk>
2025-12-05 16:41:19 +00:00
Evan
a3f8ecba9e prioritise LL4 2025-12-05 15:08:18 +00:00
Jake Hillion
5ef1df1e10 rust: move Cargo.toml to the root 2025-12-05 12:01:44 +00:00
Evan
40a0d47de8 jaccl 2025-12-03 13:53:12 +00:00
rltakashige
2b243bd80e Consolidate!!! Fixes 2025-12-03 12:19:25 +00:00
Evan Quiney
10c905c8dd worker no longer gets stuck after shutdown 2025-12-02 11:35:02 +00:00
Evan
93f699b660 add aarch64-linux for the spark 2025-11-28 11:08:18 +00:00
Alex Cheema
b43d30563d todo for layer-independent parameters in get_allow_patterns 2025-11-27 19:26:02 +00:00
Alex Cheema
20d73e90cd fix dashboard case sensitive model id 2025-11-26 18:16:32 +00:00
Alex Cheema
e56daa7c23 render download progress properly 2025-11-26 11:48:30 +00:00
Alex Cheema
63c85e1298 get rid of spammy Finished tokenizing log 2025-11-25 13:02:06 +00:00
Evan
7088988a65 bump pyo3 stub-gen 2025-11-25 12:13:53 +00:00
rltakashige
7b3e3fd66c Worker tests 2 2025-11-21 16:42:52 +00:00
rltakashige
de50811313 Worker tests on staging 1
Test plan
2025-11-21 15:22:40 +00:00
rltakashige
b45cbdeecd Consolidate cleanup 2025-11-21 14:54:02 +00:00
rltakashige
28a91787e8 Demo
Co-authored-by: Evan <evanev7@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-11-20 20:03:51 +00:00
Alex Cheema
d793f5f96c fix kimi eos token ids 2025-11-13 18:39:14 +00:00
Evan Quiney
b62f68474a improved master error handling
Co-authored-by: Ryuichi Leo Takashige <rl.takashige@gmail.com>
2025-11-11 18:04:40 +00:00
Alex Cheema
631cb81009 kimi k2 thinking 2025-11-11 18:03:39 +00:00
Evan Quiney
364087b91f five billion percent better shutdown handling 2025-11-11 17:43:53 +00:00
Evan Quiney
aa519b8c03 Worker refactor
Co-authored-by: rltakashige <rl.takashige@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-11-10 23:31:53 +00:00
Alex Cheema
9058b117c0 pipeline parallel fix 2025-11-08 02:19:19 +00:00
rltakashige
612f58c78d Revert dumb merge mistake 2025-11-07 02:39:08 +00:00
Evan
6bcac37d98 stop benching on all pushes 2025-11-06 22:26:30 +00:00
rltakashige
ff00b165c5 MLX LM type stubs 2025-11-06 21:59:29 +00:00
Alex Cheema
19e90572e6 set max_transmit_size on gossipsub to 1MB. Fixes large message error 2025-11-06 19:18:48 +00:00
Alex Cheema
e60681963f show ips on dashboard 2025-11-06 19:18:07 +00:00
rltakashige
0bb621b653 Add mlx nn stubs 2025-11-06 11:59:37 +00:00
Alex Cheema
699fd9591e fix exo scripts 2025-11-05 21:47:08 -08:00
rltakashige
6bbb6344b6 mlx.distributed.Group type stubs 2025-11-06 05:26:04 +00:00
rltakashige
16f724e24c Update staging 14
Co-authored-by: Evan <evanev7@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: David Munha Canas Correia <dmunha@MacBook-David.local>
Co-authored-by: github-actions bot <github-actions@users.noreply.github.com>
2025-11-05 01:44:24 +00:00
Evan Quiney
3b409647ba Squash merge merging_clusters into tensor_parallel94 2025-10-31 17:41:57 +00:00
Alex Cheema
d46c7e6a76 fix race condition with downloads where it cancels the download before renaming 2025-10-30 19:03:23 -07:00
rltakashige
91c635ca7a Update mlx and mlx-lm packages
Co-authored-by: Evan <evanev7@gmail.com>
2025-10-31 01:34:43 +00:00
Alex Cheema
5f18faec17 Update. 2025-10-30 11:59:59 -07:00
Alex Cheema
a346af3477 download fixes 2025-10-22 11:56:52 +01:00
Alex Cheema
56f783b38d Update. 2025-10-21 17:29:48 +01:00
Evan Quiney
363c98a872 leaf placement
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-10-15 12:47:26 +01:00
Evan Quiney
f25689d9c2 fix a race condition 2025-10-15 10:49:53 +01:00
Evan Quiney
1c6b5ce911 new tagged union
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Sorry Andrei!
2025-10-10 16:22:09 +01:00
Alex Cheema
76ed8a516b typecheck on ubuntu with install-nix-action
Co-authored-by: Evan <evanev7@gmail.com>
2025-10-10 16:15:39 +01:00
Evan Quiney
e8a6efe281 add kimi k2 2025-10-07 17:17:06 +01:00
Evan Quiney
a4e8335241 add just clean 2025-10-07 16:29:51 +01:00
Alex Cheema
84dfc8a738 Fast memory profiling
Co-authored-by: Evan <evanev7@gmail.com>
2025-10-07 16:23:51 +01:00
Alex Cheema
e01f9cf739 Disable build macos app 2025-10-07 15:39:15 +01:00
Alex Cheema
35ab6b376e fix: master tests
Co-authored-by: Evan <evanev7@gmail.com>
2025-10-07 15:36:05 +01:00
Evan Quiney
962e5ef40d version bump for brew consistency 2025-10-07 15:18:54 +01:00
Evan Quiney
b1721e941b nix cleanup 2025-10-01 09:47:00 +01:00
Evan Quiney
22f0ca2a59 FIX: OpenWebUI compat 2025-09-30 16:28:38 +01:00
Evan Quiney
57486a4305 kill go
Farewell Gelu, Chief Lunch Officer
2025-09-30 11:10:55 +01:00
Evan Quiney
38ff949bf4 big refactor
Fix. Everything.

Co-authored-by: Andrei Cravtov <the.andrei.cravtov@gmail.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-09-30 11:03:04 +01:00
Matt Beton
7040c9508f Multiprocessing Runner 2025-09-17 09:31:49 +01:00
Matt Beton
35c4311587 Dashboard Status & Bugfixes 2025-08-29 17:34:17 +01:00
Matt Beton
a33787f5fd Prompt length 2025-08-29 16:07:36 +01:00
Matt Beton
1b8b456ced full mlx caching implementation 2025-08-26 17:15:08 +01:00
Matt Beton
84c90a6d35 feat: mlx memory cache for faster ttft
Co-authored-by: Evan <evanev7@gmail.com>
Co-authored-by: s17 <s17@s17s-Mac-Studio.local>
2025-08-26 13:05:42 +01:00
Evan Quiney
5efe5562d7 feat: single entrypoint and logging rework 2025-08-26 11:08:09 +01:00
Andrei Cravtov
ef5c5b9654 changes include: ipc, general utilities, flakes stuff w/ just, autopull script 2025-08-25 17:33:40 +01:00
Alex Cheema
5bfc99b415 add EXO logo to dashboard 2025-08-25 16:41:13 +01:00
Evan Quiney
11f8b4ef33 tidy: fix justfile, run.sh, run formatter 2025-08-21 18:44:53 +01:00
Evan Quiney
be6f5ae7f1 feat: build system and homebrew compatibility 2025-08-21 16:07:37 +01:00
Evan Quiney
40efed4436 unvendored macmon 2025-08-20 13:04:46 +01:00
Gelu Vrabie
ea9e573409 Refactor runner supervisor
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-08-18 18:37:52 +01:00
Gelu Vrabie
345fafd80d Forwarder versioning
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-08-18 15:08:50 +01:00
Evan Quiney
ea3eeea826 improved go caching with nix
Co-authored-by: Gelu Vrabie <gelu.vrabie.univ@gmail.com>
2025-08-15 15:24:58 +01:00
Gelu Vrabie
a2a37c0ebe discovery fixed
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-08-15 15:23:20 +01:00
Gelu Vrabie
57073f35c3 collection of fixes for Shanghai demo
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-08-15 15:21:51 +01:00
Andrei Cravtov
7e19804aa5 Integrate flake parts 2025-08-13 09:55:22 +01:00
Matt Beton
dbcd09aa53 No 70b 2025-08-12 18:42:27 +01:00
Matt Beton
c1d5b381f4 70B model unit test only runs if its downloaded 2025-08-07 10:41:56 +01:00
Alex Cheema
473512ddd0 r1 size 2025-08-04 22:57:31 +08:00
Alex Cheema
817c5993f0 fix dem model cards yo 2025-08-04 22:56:06 +08:00
Gelu Vrabie
75ecda55a9 fix gitignore
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-08-04 13:49:49 +01:00
Alex Cheema
c560c55c4e build and release on staging 2025-08-04 07:41:09 +08:00
Sami Khan
f51f8f72f8 app launches python modules 2025-08-04 06:18:31 +08:00
Seth Howes
407796d18f Minor dashboard fixes 2025-08-04 06:15:01 +08:00
Alex Cheema
6daf7f31f7 clean model cards 2025-08-04 05:31:30 +08:00
Alex Cheema
f352ddfc5f run configure_mlx.sh in run.sh 2025-08-04 03:59:42 +08:00
Alex Cheema
6855a7727d set a 15 sec timeout for getting initial download progress 2025-08-03 20:37:20 +08:00
Matt Beton
1fe4ed3442 Worker Exception & Timeout Refactor
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-08-02 08:28:37 -07:00
Alex Cheema
92c9688bf0 Remove rust 2025-08-02 08:16:39 -07:00
Sami Khan
a46f8c3cd1 app
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-08-01 19:14:27 -07:00
Seth Howes
71bafabc63 Dashboard with instances 2025-08-01 14:38:07 +01:00
Gelu Vrabie
0e32599e71 fix libp2p + other prs that were wrongly overwritten before (111,112,117,118,1119 + misc commits from Alex)
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>
Co-authored-by: Seth Howes <71157822+sethhowes@users.noreply.github.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-31 20:36:47 +01:00
Alex Cheema
2031d9481d fix api get_state 2025-07-30 07:15:15 -07:00
Matt Beton
b350ededb2 Test Supervisor Errors. 2025-07-30 13:30:54 +01:00
Gelu Vrabie
ff3d11c748 just run
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-29 16:58:27 +01:00
Gelu Vrabie
25fa46c6f6 Update CODEOWNERS 2025-07-29 13:08:29 +01:00
Seth Howes
3f192f20cc Reinstate dashboard 2025-07-28 23:18:23 +01:00
Alex Cheema
a2b4093d25 add metrics: gpu_usage, temp, sys_power, pcpu_usage, ecpu_usage, ane_… 2025-07-28 23:02:33 +01:00
Alex Cheema
12566865d5 better profiling 2025-07-28 22:15:04 +01:00
Gelu Vrabie
b88abf1cc2 fix topology disconnects and add heartbeat
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-28 22:00:05 +01:00
Alex Cheema
dbd0bdc34b fix ci linter 2025-07-28 20:12:48 +01:00
Alex Cheema
20241e3290 some finishing touches to get this working e2e 2025-07-28 13:07:29 +01:00
Seth Howes
176d077c87 Fix IPv4 serialisation for topology 2025-07-28 13:07:10 +01:00
Gelu Vrabie
c3c8ddbce8 fix forwarder supervisor tests
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-28 13:03:43 +01:00
Matt Beton
36a5d75efd Fix download tests 2025-07-28 12:51:10 +01:00
Seth Howes
e9b803604b Add Multiaddr type and refactor Hosts type for creating shard placement 2025-07-28 11:39:46 +01:00
Alex Cheema
b285a9f0b7 fix placement tests 2025-07-28 11:18:32 +01:00
Alex Cheema
57ca487fde Fixes for running this end to end
Co-authored-by: Gelu Vrabie <gelu.vrabie.univ@gmail.com>
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-28 10:51:03 +01:00
Andrei Cravtov
b687dec6b2 Discovery integration master
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-27 13:43:59 +01:00
Alex Cheema
98f204d14a Fix placement single node 2025-07-26 20:08:37 +01:00
Matt Beton
93330f0283 Inference Integration Test
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-26 20:08:25 +01:00
Gelu Vrabie
2e4635a8f5 add node started event
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-26 19:12:26 +01:00
Gelu Vrabie
261e575262 Serialize topology
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-25 15:09:03 +01:00
Alex Cheema
a97fb27c64 Glue TWO 2025-07-25 14:32:34 +01:00
Gelu Vrabie
9be08ec7dd add resource monitor
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-25 13:10:53 +01:00
Alex Cheema
a241c92dd1 Glue 2025-07-25 13:10:29 +01:00
Seth Howes
6f8e3419d5 Placement strategy
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-24 20:22:40 +01:00
Gelu Vrabie
4c0e4ef853 Go build
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-24 19:45:45 +01:00
Matt Beton
f41531d945 Worker Loop
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
2025-07-24 18:44:31 +01:00
Alex Cheema
67c70b22e4 Best master 2025-07-24 17:12:52 +01:00
Andrei Cravtov
3730160477 Fix the node-ID test
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-24 17:09:12 +01:00
Gelu Vrabie
df1fe3af26 Topology apply
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-24 14:27:09 +01:00
Matt Beton
5097493a42 Fix tests 2025-07-24 13:22:58 +01:00
Alex Cheema
a6b3ab6332 Worker plan
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Seth Howes <71157822+sethhowes@users.noreply.github.com>
Co-authored-by: Gelu Vrabie <gelu.vrabie.univ@gmail.com>
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Andrei Cravtov <the.andrei.cravtov@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-07-24 12:45:27 +01:00
Gelu Vrabie
56d3565781 Add apply functions
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-24 11:02:20 +01:00
Andrei Cravtov
3ab5609289 wrote race-condition-free persistent NodeID-getting function 2025-07-23 20:18:56 +01:00
Matt Beton
7a452c3351 Fix tests 2025-07-23 18:25:50 +01:00
Seth Howes
7ac23ce96b Refactor tasks / commands / api 2025-07-23 15:52:29 +01:00
Andrei Cravtov
81060b7062 Made basedpyright work with Jetbrains environment
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-23 14:12:11 +01:00
Andrei Cravtov
8d2536d926 Implemented basic discovery library in Rust + python bindings
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-23 13:11:29 +01:00
Gelu Vrabie
76f903504c fix
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-22 22:29:35 +01:00
Seth Howes
cd9a1a9192 Topology update 2025-07-22 22:29:17 +01:00
Matt Beton
14b3c4a6be New API! 2025-07-22 21:21:12 +01:00
Gelu Vrabie
596d9fc9d0 add forwarder service
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-22 20:53:26 +01:00
Matt Beton
53c652c307 Fix tests! 2025-07-22 15:20:32 +01:00
Matt Beton
5adad08e09 New events 2025-07-22 15:16:06 +01:00
Gelu Vrabie
108128b620 fix sqlite connector
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-21 22:43:09 +01:00
Alex Cheema
449fdac27a Downloads 2025-07-21 22:42:37 +01:00
Seth Howes
cb101e3d24 Refactor model types 2025-07-21 20:35:27 +01:00
Gelu Vrabie
54efd01d77 add forwarder supervisor
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-21 20:21:43 +01:00
Seth Howes
bae58dd368 Refactor worker + master state into single state 2025-07-21 19:36:54 +01:00
Seth Howes
d19aa4f95a Simplify Task type + merge control & data plane types into single type 2025-07-21 17:10:09 +01:00
Gelu Vrabie
2f64e30dd1 Add sqlite connector
Co-authored-by: Gelu Vrabie <gelu@exolabs.net>
2025-07-21 14:10:29 +01:00
Alex Cheema
bb7f1ae994 New worker
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
2025-07-18 10:08:56 +01:00
Matt Beton
cc45c7e9b9 Fixed events issue. 2025-07-17 12:21:01 +01:00
Arbion Halili
038cc4cdfa fix: Normalize Naming 2025-07-16 16:11:51 +01:00
Arbion Halili
e2a7935019 fix: Fix incorrect logic 2025-07-16 14:39:20 +01:00
Arbion Halili
6a671908a3 fix: FrozenSet Related Bits 2025-07-16 13:45:57 +01:00
Arbion Halili
520b1122a3 fix: Many Fixes 2025-07-16 13:35:31 +01:00
Arbion Halili
d9b9aa7ad2 Merge branch 'master-node' into staging 2025-07-15 16:32:08 +01:00
Arbion Halili
7fa7de8e83 more incomplete trash 2025-07-15 13:42:17 +01:00
Arbion Halili
9f96b6791f fix: Some, still broken 2025-07-15 13:11:21 +01:00
Arbion Halili
9b3c105bea fix: Save Andrei's sanity 2025-07-15 13:11:20 +01:00
Arbion Halili
8060120136 tweak 2025-07-14 22:37:53 +01:00
Arbion Halili
df6626fa31 fix: Event definitions, state definitions 2025-07-14 21:41:14 +01:00
Arbion Halili
70f0f09c05 Tweaked, Still Broken tho 2025-07-14 21:19:39 +01:00
Arbion Halili
8799c288b0 BROKEN: work thus far 2025-07-14 21:09:08 +01:00
Arbion Halili
4e4dbf52ec fix: Use Nix-compatible LSP set-up 2025-07-14 21:08:43 +01:00
Matt Beton
21acd3794a New Runner! 2025-07-10 16:34:35 +01:00
Arbion Halili
b0bd951005 Merge Basic Interfaces
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Seth Howes <sethshowes@gmail.com>
Co-authored-by: Matt Beton <matthew.beton@gmail.com>
Co-authored-by: Andrei Cravtov <the.andrei.cravtov@gmail.com>
2025-07-09 19:04:21 +01:00
Arbion Halili
74d56e52ff fix: Improve naming 2025-07-07 20:22:27 +01:00
Arbion Halili
fe17aaf9f8 fix: Make master hold a queue of task data 2025-07-07 20:22:00 +01:00
Arbion Halili
e1894bc106 refactor: A Lot 2025-07-07 20:19:08 +01:00
Arbion Halili
81cf6bce64 refactor: Simplify networking 2025-07-07 19:33:14 +01:00
Andrei Cravtov
6c8b8b30ae added rust to flake 2025-07-07 18:11:40 +01:00
Matt Beton
0425422f55 Simple fix 2025-07-07 17:18:43 +01:00
Matt Beton
03a1cf59a6 Matt's interfaces
Added interfaces for chunks, worker, runner, supervisor, resourcemonitor, etc.
2025-07-07 16:42:52 +01:00
Arbion Halili
367e76c8fa fix: Fix validation over Task types 2025-07-04 17:25:14 +01:00
Arbion Halili
cda3de2a28 fix: Use state for tasks 2025-07-04 15:08:54 +01:00
Arbion Halili
10224d09de refactor: Distinguish the topology of the control plane from that of the data plane 2025-07-03 15:45:54 +01:00
Arbion Halili
c456934342 refactor: Remove timestamp from Wrapped Events 2025-07-03 13:05:35 +01:00
Arbion Halili
0b6aadf576 refactor: Add safe state mutation method .apply() 2025-07-03 12:33:29 +01:00
Arbion Halili
f8039e20e0 feature: Add pretty_name to ModelMetadata 2025-07-03 12:32:32 +01:00
Arbion Halili
4bb3a995a4 feature: Interfaces for graph interfaces 2025-07-02 22:44:55 +01:00
Arbion Halili
7dd8a979d2 feature: Simplest utilities for logging 2025-07-02 22:13:42 +01:00
Arbion Halili
40793f1d86 refactor: Refactor most things 2025-07-02 21:11:49 +01:00
Arbion Halili
8596d5c5b1 refactor: Fix UUID implementation 2025-07-02 11:04:52 +01:00
Arbion Halili
6de1f2883f feat: Update Interfaces 2025-07-01 18:41:37 +01:00
Arbion Halili
73ac8969bc feat: Add ResourceGraph, runner types, etc. 2025-07-01 13:14:26 +01:00
Arbion Halili
df824e2e87 fix: Ensure MasterState inherits from SharedState 2025-07-01 12:18:54 +01:00
Seth Howes
d5033e658c refactor: Replace Literal with Enum in sources.py 2025-07-01 12:15:28 +01:00
Arbion Halili
c0df8e5463 feat: Implement Many Interfaces 2025-07-01 01:37:00 +01:00
Arbion Halili
899d8820dd Merge Seth's Control Plane API Work into Alex's Events Branch
Co-authored-by: Seth Howes <sethshowes@gmail.com>
2025-06-30 23:54:41 +01:00
Arbion Halili
53d5d23898 refactor: Use enums 2025-06-30 23:45:27 +01:00
Arbion Halili
b758df83cf Chore: Tweak CI 2025-06-30 22:41:33 +01:00
Alex Cheema
133ab70d67 chore: Run formatter 2025-06-30 09:48:03 +01:00
Alex Cheema
aae3e4a82d refactor: Put type defs on one line 2025-06-30 09:46:44 +01:00
Alex Cheema
596b069f84 chore: Fail pipeline if working tree changes instead of committing them in CI 2025-06-30 09:40:47 +01:00
Alex Cheema
c0b8bb9c98 chore: Rename conditional-commit.yml to action.yml 2025-06-29 22:34:04 +01:00
Alex Cheema
0c46adc298 refactor: Use official OpenAI types 2025-06-29 22:30:18 +01:00
Alex Cheema
4b3e60f899 refactor: Add types for model downloading 2025-06-29 21:59:06 +01:00
Alex Cheema
784f0ec423 chore: Skip protobuf generation if no .proto files exist 2025-06-29 21:52:46 +01:00
Alex Cheema
38dcf698eb chore: Fix typecheck job in GitHub workflow 2025-06-29 21:47:23 +01:00
Alex Cheema
c9d44a1658 chore: Fix typecheck job in GitHub workflow 2025-06-29 21:45:41 +01:00
Alex Cheema
bbdfdac7be refactor: Remove redundant comment 2025-06-29 21:42:00 +01:00
Alex Cheema
5ba230ed16 refactor: Add all event types with Event implementations 2025-06-29 21:41:00 +01:00
Arbion Halili
5abf03e31b Scaffold Event Sourcing 2025-06-29 19:44:58 +01:00
Arbion Halili
d8459358cf Refactor CI 2025-06-28 14:42:53 +01:00
Arbion Halili
c977ce9419 Ensure exo-shared is a Dependency of exo-master and exo-worker 2025-06-28 14:34:49 +01:00
Arbion Halili
74adbc4280 Remove PoeThePoet 2025-06-28 14:33:01 +01:00
Arbion Halili
587a52a944 Remove Bad UUID Implementation 2025-06-28 14:08:18 +01:00
Arbion Halili
885c7d5cd8 Add RULES.md and .cursorrules 2025-06-28 14:03:01 +01:00
Arbion Halili
e4c4b3e95a Overhaul CI Design 2025-06-28 12:29:01 +01:00
Arbion Halili
f7f779da19 Fix Type Checker; Improve Protobuf Generation 2025-06-28 12:28:26 +01:00
Arbion Halili
38bc8ea7e4 Keep Protobuf Directories 2025-06-28 01:32:10 +01:00
Arbion Halili
b53c1ba999 Use Hatch Build System 2025-06-28 01:28:52 +01:00
Arbion Halili
423efe10b8 Add Protobuf Support 2025-06-28 01:27:25 +01:00
Arbion Halili
61b8b1cb18 Add Protobuf Support 2025-06-28 01:26:49 +01:00
Arbion Halili
7f0f71b9eb Add .gitignore 2025-06-28 01:25:51 +01:00
Arbion Halili
da50da2b43 Add Simple env.py 2025-06-27 11:57:03 +01:00
Arbion Halili
3564d77e58 Add Sync to Runner 2025-06-27 11:56:02 +01:00
Arbion Halili
77546b951e Update pyproject.toml 2025-06-17 22:28:48 +01:00
Arbion Halili
c15e402f3b Add Simple Groundwork 2025-06-17 22:23:01 +01:00
Arbion Halili
c57ed32fc5 Add Initial Contribution Rules 2025-06-17 16:11:15 +01:00
Arbion Halili
41085eef7b Prepare Environment Parser 2025-06-17 16:10:58 +01:00
Arbion Halili
685c8eff58 Configure Runner Tasks to Cover "engines/" 2025-06-17 07:37:08 +01:00
Arbion Halili
13b6043c09 Add Linter 2025-06-17 07:32:33 +01:00
Arbion Halili
180748ee83 Update Workspace Configuration, Configure Build Backend 2025-06-17 06:45:25 +01:00
Arbion Halili
043253a55d Add ML Engines (Backend) 2025-06-17 05:55:43 +01:00
Arbion Halili
090265a374 Add Formatter To CI 2025-06-17 05:46:33 +01:00
Arbion Halili
e2508f3419 Add Type Checker In CI 2025-06-17 05:46:08 +01:00
Arbion Halili
ac2dfa6565 Initial Structure 2025-06-17 03:55:41 +01:00
Alex Cheema
db1a5252a2 Add CODEOWNERS. 2025-06-14 23:32:30 -07:00
Alex Cheema
e4238f9ef3 Merge pull request #800 from exo-explore/grpcio1.71.0
downgrade grpcio, grpcio-tools to 1.70.0
2025-03-21 15:23:32 -07:00
Alex Cheema
ad3bc6ceaa downgrade grpcio, grpcio-tools to 1.70.0 2025-03-21 15:23:11 -07:00
Alex Cheema
04d5dca18f Merge pull request #778 from exo-explore/grpcio1.71.0
upgrade grpcio and grpcio-tools to 1.71.0
2025-03-12 06:24:57 +00:00
Alex Cheema
50b6800a61 m3 ultra flops estimates based on some quick profiling 2025-03-11 22:51:23 -07:00
Alex Cheema
2857975bf3 upgrade grpcio and grpcio-tools to 1.71.0 2025-03-11 17:23:37 -07:00
Alex Cheema
854f515cf5 Merge pull request #763 from deftdawg/amdgpu
AMD/ROCm: Changes required to detect and inference on AMD GPUs
2025-03-06 16:07:05 +00:00
DeftDawg
f98d9bac53 Changes required to detect AMD GPUs 2025-03-05 22:49:29 -05:00
Alex Cheema
017bf93cf5 Merge pull request #753 from mags0ft/patch-1
remove dead links in README
2025-03-03 23:01:34 +00:00
mags0ft
013d2573e7 remove dead links in README 2025-03-02 18:37:59 +01:00
Alex Cheema
2702975762 Merge pull request #746 from exo-explore/grpcio1.70.0
downgrade grpc to 1.67.0. waiting for fix
2025-02-28 21:26:11 +00:00
Alex Cheema
30c3f58a00 downgrade grpc to 1.67.0. waiting for fix bd8f8a86e0 2025-02-28 21:25:11 +00:00
Alex Cheema
1bbbb1e1d8 Merge pull request #745 from exo-explore/grpcio1.70.0
Grpcio1.70.0
2025-02-28 21:05:41 +00:00
Alex Cheema
4081305e60 adjust grpc settings, ensure connected before sending any grpc commands 2025-02-28 20:52:12 +00:00
Alex Cheema
52a21645c6 Merge pull request #742 from samiamjidkhan/main
build fix
2025-02-28 12:29:58 +00:00
Sami Khan
63570c7b8b Merge pull request #1 from samiamjidkhan/build-fix
build fix
2025-02-28 15:47:36 +05:00
Sami Khan
971f5240bf build fix 2025-02-28 15:45:57 +05:00
Alex Cheema
36a6389af0 bump grpcio and grpcio-tools to 1.70.0 2025-02-27 01:40:04 +00:00
Alex Cheema
af734f1bf6 Merge pull request #737 from exo-explore/handlegzipdownload
handle -gzip suffix in etag for integrity check fixes #633
2025-02-25 22:10:05 +00:00
Alex Cheema
ee095766d9 handle -gzip suffix in etag for integrity check fixes #633 2025-02-25 22:08:15 +00:00
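
The `-gzip` ETag fix above targets a subtle failure: some proxies gzip responses and rewrite the ETag with a `-gzip` suffix, so a plain string comparison against a locally computed digest never matches. A rough sketch of the normalization step, with illustrative names; the MD5 comparison is an assumption, not taken from the exo code:

```python
import hashlib
from pathlib import Path


def normalize_etag(etag: str) -> str:
    """Strip the weak-validator prefix, surrounding quotes, and a trailing '-gzip' suffix."""
    value = etag.strip()
    if value.startswith("W/"):
        value = value[2:]
    value = value.strip('"')
    if value.endswith("-gzip"):
        value = value[: -len("-gzip")]
    return value


def matches_local_file(etag: str, path: Path) -> bool:
    # Assumes the server's ETag is the MD5 of the file body, which is not
    # always true (LFS objects use other digests); illustrative only.
    return normalize_etag(etag) == hashlib.md5(path.read_bytes()).hexdigest()
```
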
Alex Cheema
a605e233ad Merge pull request #709 from exo-explore/notice
update notice in README
2025-02-18 11:43:14 +00:00
Alex Cheema
f9a1e5342b update notice in README 2025-02-18 11:41:09 +00:00
Alex Cheema
7a374a74cd Merge pull request #708 from exo-explore/notice
add notice to README
2025-02-17 22:55:44 +00:00
Alex Cheema
5a00899d73 Merge pull request #705 from cadenmackenzie/addingModelNameInputContainer
adding current model name to input container information
2025-02-17 22:55:29 +00:00
Alex Cheema
cb4bee2694 add notice to README 2025-02-17 22:54:56 +00:00
Caden MacKenzie
9078d094b9 adding current model name to input container information 2025-02-16 18:34:38 -08:00
Alex Cheema
ed70d47cfd Merge pull request #702 from exo-explore/alwayslogdownloaderror
make max_parallel_downloads configurable, increase download chunk size to 8MB
2025-02-14 21:27:12 +00:00
Alex Cheema
477e3a5e4c make max_parallel_downloads configurable, increase download chunk size to 8MB 2025-02-14 21:26:41 +00:00
Alex Cheema
be3b9ee973 Merge pull request #698 from exo-explore/alwayslogdownloaderror
always log download errors. some people e.g cant access huggingface
2025-02-13 22:56:33 +00:00
Alex Cheema
b4e6f8acad always log download errors. some people eg cant access huggingface which causes confusion 2025-02-13 22:55:09 +00:00
Alex Cheema
de99da7c75 Merge pull request #684 from divinity76/patch-1
workaround f16 cast ambiguity
2025-02-08 12:45:10 +00:00
Alex Cheema
76d1bd95f5 Merge pull request #688 from exo-explore/readmeupdate
apt-get debian noninteractive in circleci
2025-02-08 02:41:19 +00:00
Alex Cheema
928214d479 apt-get debian noninteractive in circleci 2025-02-08 02:40:51 +00:00
Alex Cheema
ce34a886c2 Merge pull request #687 from exo-explore/readmeupdate
README updates
2025-02-08 02:15:50 +00:00
Alex Cheema
d8c3aed0cc update discovery / peer networking modules 2025-02-08 02:15:13 +00:00
Alex Cheema
2c982d9295 update README to better reflect support for other devices like NVIDIA and Pi's 2025-02-08 02:13:04 +00:00
divinity76
5fe241ec61 code-breaking typo
oops
2025-02-06 19:02:02 +01:00
divinity76
05ff20fa89 workaround f16 cast ambiguity
for unknown reasons, without this, when trying to execute "Llama 3.2 1B", I get the error below. Fwiw I do not know the performance impact for this change. I can't even get exo running, but this change allows me to /get further/ (before running into a second issue with vram allocation? story for another day i suppose)


error: 
Failed to fetch completions: Error processing prompt (see logs with DEBUG>=2): Nvrtc Error 6, NVRTC_ERROR_COMPILATION
<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                 ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                               ^

<null>(18): error: more than one user-defined conversion from "nv_bfloat16" to "half" applies:
            function "__half::__half(float)" (declared at line 214 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(short)" (declared at line 227 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned short)" (declared at line 228 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(int)" (declared at line 229 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned int)" (declared at line 230 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(long long)" (declared at line 231 of /usr/include/cuda_fp16.hpp)
            function "__half::__half(unsigned long long)" (declared at line 232 of /usr/include/cuda_fp16.hpp)
    *((half4*)((data0+(alu0+(gidx1<<14)+(lidx0<<11)+alu1)))) = make_half4(((half)(val0)),((half)(val1)),((half)(val2)),((half)(val3)));
                                                                                                                              ^

4 errors detected in the compilation of "<null>".
2025-02-06 18:54:15 +01:00
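
The patch itself is not shown in this log. As a rough illustration of the kind of workaround described, and assuming a tinygrad-style `Tensor`/`dtypes` API (the real change may differ), routing bfloat16 through float32 gives the generated CUDA kernel a single unambiguous conversion path to half:

```python
# Sketch only, not the actual patch; assumes tinygrad's public Tensor API.
from tinygrad import Tensor, dtypes

x = Tensor.randn(4, 4, dtype=dtypes.bfloat16)

# Potentially ambiguous on some CUDA toolchains: nv_bfloat16 has no direct
# __half constructor, so several user-defined conversions apply (the NVRTC
# errors quoted above).
# y = x.cast(dtypes.float16)

# Unambiguous: cast to float32 first, so the kernel only converts float -> half.
y = x.cast(dtypes.float32).cast(dtypes.float16)
print(y.dtype)
```
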
Alex Cheema
b5fc4bc288 Merge pull request #675 from exo-explore/rmtenacity
remove tenacity dependency, implement simple retry logic instead
2025-02-03 21:58:08 +00:00
Alex Cheema
5157d80a46 remove tenacity dependency, implement simple retry logic instead 2025-02-03 21:56:38 +00:00
Alex Cheema
75914b4de8 Merge pull request #669 from pavel-rodionov/feature-local-models
Add toggle to show only models downloaded locally
2025-02-03 21:45:27 +00:00
Rodionov Pavel
d084dbe574 Add toggle to show only models downloaded locally 2025-02-01 23:45:19 -08:00
Alex Cheema
1a77a52d71 Merge pull request #666 from exo-explore/patchmanualdiscovery
patch for manual discovery, set known_peers
2025-02-01 23:07:21 +00:00
Alex Cheema
72329ba984 patch for manual discovery, set known_peers 2025-02-01 23:06:57 +00:00
Alex Cheema
f663b0afa2 Merge pull request #665 from exo-explore/resumedownload
add model downloading section to README
2025-02-01 20:23:58 +00:00
Alex Cheema
51b5c2ca9b add model downloading section to README 2025-02-01 20:23:05 +00:00
Alex Cheema
9a1f0a85e6 Merge pull request #664 from exo-explore/resumedownload
resumable downloads with integrity checks
2025-02-01 18:34:36 +00:00
Alex Cheema
2c0d17c336 beautiful download 2025-02-01 17:29:19 +00:00
Alex Cheema
7034ee0fcb resumable downloads with integrity checks 2025-02-01 13:22:51 +00:00
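
"Resumable downloads with integrity checks" boils down to appending to a partial file via an HTTP `Range` request and verifying a digest before renaming it into place. A minimal sketch of that pattern, assuming `aiohttp` and illustrative names rather than the exact exo implementation:

```python
import hashlib
from pathlib import Path

import aiohttp


async def download_resumable(url: str, dest: Path, expected_sha256: str | None = None) -> None:
    part = dest.parent / (dest.name + ".partial")
    start = part.stat().st_size if part.exists() else 0
    headers = {"Range": f"bytes={start}-"} if start else {}
    async with aiohttp.ClientSession() as session:
        async with session.get(url, headers=headers) as resp:
            resp.raise_for_status()
            # 206 means the server honoured the Range header; otherwise start over.
            mode = "ab" if resp.status == 206 else "wb"
            with part.open(mode) as f:
                async for chunk in resp.content.iter_chunked(1 << 20):
                    f.write(chunk)
    if expected_sha256 and hashlib.sha256(part.read_bytes()).hexdigest() != expected_sha256:
        part.unlink()
        raise ValueError("integrity check failed; partial file discarded")
    part.rename(dest)
```
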
Alex Cheema
7a75fb09b2 Merge pull request #660 from exo-explore/robustdownload
cleanup tmp files on failed download
2025-01-30 20:25:15 +00:00
Alex Cheema
0bebf8dfde fix indent 2025-01-30 20:21:28 +00:00
Alex Cheema
55c4385db5 cleanup tmp files on failed download 2025-01-30 20:11:06 +00:00
Alex Cheema
90690a7d10 Merge pull request #647 from deftdawg/patch-1
Add 4-bit to the end of DeepSeek V3/R1 model descriptions
2025-01-30 19:49:38 +00:00
Alex Cheema
130d998d36 Merge pull request #659 from exo-explore/robustdownload
ensure exo dir on start, retry with exp backoff on file downloads
2025-01-30 19:49:00 +00:00
Alex Cheema
788c49784c retry fetch_file_list also 2025-01-30 19:45:12 +00:00
Alex Cheema
6b1c8635fc ensure exo dir on start, retry with exp backoff on file downloads 2025-01-30 19:40:35 +00:00
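
The "retry with exp backoff" mentioned here (and in the tenacity-removal commit further up) is the usual pattern of doubling the delay between attempts and adding jitter. A small illustrative helper, not the exact exo code:

```python
import asyncio
import random
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def retry_with_backoff(op: Callable[[], Awaitable[T]], attempts: int = 5, base_delay: float = 1.0) -> T:
    for attempt in range(attempts):
        try:
            return await op()
        except Exception:
            if attempt == attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus jitter so many nodes don't retry in lockstep.
            await asyncio.sleep(base_delay * 2**attempt + random.random())
    raise RuntimeError("unreachable")
```
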
Alex Cheema
24c410c19c Merge pull request #653 from exo-explore/tinyfixes
Tiny fixes
2025-01-29 19:08:05 +00:00
Alex Cheema
f6ed830ba6 Merge pull request #651 from exo-explore/parallelise_model_loadin
parallelise model loading
2025-01-29 19:07:25 +00:00
Alex Cheema
e6b4f2993c fix prompt output spacing in tui 2025-01-29 19:01:30 +00:00
DeftDawg
a25e02c913 Add 4-bit to the end of DeepSeek V3/R1 model descriptions 2025-01-29 14:00:13 -05:00
Alex Cheema
3675804f4d throttle repo progress events and only send them out if something changed 2025-01-29 18:55:54 +00:00
Alex Cheema
96f1aecb05 only in_progress if any given file is in_progress 2025-01-29 18:43:43 +00:00
Alex Cheema
23a5030604 even if part of a file is downloaded it may not be in_progress 2025-01-29 18:39:23 +00:00
Alex Cheema
31b56e862f make a singleton thread pool executor for tinygrad since we always want it to run on the same thread 2025-01-29 18:37:09 +00:00
Alex Cheema
9f6c688d62 update tinygrad 2025-01-29 18:06:38 +00:00
Alex Cheema
4887be5103 parallelise model loading 2025-01-29 02:32:59 +00:00
Alex Cheema
75091e206b Merge pull request #650 from exo-explore/chatgpttimeout
increase chatgpt api response timeout to 900 seconds
2025-01-29 02:03:52 +00:00
Alex Cheema
141de0d011 increase chatgpt api response timeout to 900 seconds 2025-01-29 02:03:00 +00:00
Alex Cheema
263b18a31e Merge pull request #649 from eclecticc/amd_fix
Fix AMD device capabilities fields
2025-01-29 02:01:06 +00:00
Nirav Patel
9cf6818f10 Fix AMD device capabilities fields 2025-01-28 16:58:58 -08:00
Alex Cheema
837ed5d980 Merge pull request #648 from exo-explore/modelasyncload
Fixes
2025-01-28 23:39:11 +00:00
Alex Cheema
9c1bea97e8 fix embed_tokens for last layer in qwen models 2025-01-28 23:09:45 +00:00
Alex Cheema
af171f06fa propagate prompts to other nodes so they can display them, cleaner prompt/output output 2025-01-28 21:50:49 +00:00
Alex Cheema
edfa53a4c2 Merge pull request #646 from exo-explore/modelasyncload
make sure mlx stuff is on separate thread non blocking
2025-01-28 18:56:19 +00:00
Alex Cheema
4a5b80a958 make sure mlx stuff is on separate thread non blocking 2025-01-28 18:56:00 +00:00
Alex Cheema
92d1bc01de Merge pull request #645 from exo-explore/modelasyncload
load mlx model shard on mlx thread so it doesnt block
2025-01-28 18:49:47 +00:00
Alex Cheema
6662d5668c load mlx model shard on mlx thread so it doesnt block 2025-01-28 18:49:19 +00:00
Alex Cheema
a0d673fa3a Merge pull request #640 from exo-explore/simpledownload
Simple download
2025-01-27 19:38:11 +00:00
Alex Cheema
7c649085a1 fix eta/speed for resuming an existing download, using the session downloaded bytes 2025-01-27 19:23:18 +00:00
Alex Cheema
90e0e2761f ignore not_started progress updates 2025-01-27 06:05:59 +00:00
Alex Cheema
265586f7b4 set timeout on get too 2025-01-27 06:05:40 +00:00
Alex Cheema
4748bb7dc7 increase file download timeout to 30min 2025-01-27 05:49:17 +00:00
Alex Cheema
ae770db4f3 increase download chunks to 1MB 2025-01-27 05:37:50 +00:00
Alex Cheema
82f75d0ccf increase hf download http timeout 15 mins for large downloads 2025-01-27 05:20:30 +00:00
Alex Cheema
295f41c5cc increase bench job timeout to give enough time to download 2025-01-27 05:03:35 +00:00
Alex Cheema
19a27c5bfd HF_HOME -> EXO_HOME 2025-01-27 02:59:23 +00:00
Alex Cheema
d7ca9b7732 show each node id in the tinychat topology viz 2025-01-27 02:20:22 +00:00
Alex Cheema
b349e48b0d fix visual bug where frontend would show the full hf repo size, but in some cases that includes redundant files so we should use the model index in those cases too 2025-01-27 02:13:05 +00:00
Alex Cheema
21586063f6 use llama-3.2-1b in tinygrad test 2025-01-27 01:35:33 +00:00
Alex Cheema
277d63d860 special case when a model doesnt have a model index file, then use wildcard for allow_patterns 2025-01-27 01:26:15 +00:00
Alex Cheema
74379ef671 log download logs with DEBUG>=6 very verbose 2025-01-27 01:11:54 +00:00
Alex Cheema
3c7bd48aa3 get rid of some more hf bloat 2025-01-27 01:08:46 +00:00
Alex Cheema
1df023023e remove a lot of hf bloat 2025-01-27 01:06:47 +00:00
Alex Cheema
b89495f444 rewrite ShardDownloader, simplify significantly 2025-01-27 00:37:57 +00:00
Alex Cheema
903950f64e Merge pull request #638 from exo-explore/deepseekv3fix
add exception for mlx-community/DeepSeek-R1-3bit and mlx-community/DeepSeek-V3-3bit in tokenizers test
2025-01-26 20:33:22 +00:00
Alex Cheema
a3766f538a add exception for mlx-community/DeepSeek-R1-3bit and mlx-community/DeepSeek-V3-3bit in tokenizers test 2025-01-26 20:32:48 +00:00
Alex Cheema
9711d632e0 Merge pull request #637 from exo-explore/deepseekv3fix
fix post_init deepseek v3
2025-01-26 20:31:53 +00:00
Alex Cheema
82ef086010 add deepseek-v3-3bit and deepseek-r1-3bit 2025-01-26 20:31:28 +00:00
Alex Cheema
55ea366932 fix post_init deepseek v3 2025-01-26 20:27:31 +00:00
Alex Cheema
63318983de Merge pull request #631 from sigseg5/main
Some adaptivity fixes in tinychat
2025-01-26 19:20:58 +00:00
sigseg5
fb841a1f50 Adjust truncate size in history list for text without any spaces 2025-01-26 00:38:58 +03:00
sigseg5
4512366580 Fix bubble behavior when user passes long text without any spaces 2025-01-26 00:02:17 +03:00
sigseg5
9525c0e7a7 Add adaptive padding for user and assistant messages on width <= 1480px 2025-01-26 00:01:54 +03:00
Alex Cheema
66f73768cc Merge pull request #627 from exo-explore/deepseek
Deepseek, tinychat group models, latex formatting, thinking boxes
2025-01-24 18:14:57 +00:00
Alex Cheema
fdd05baddb fix tokenizer tests 2025-01-24 18:13:36 +00:00
Alex Cheema
59174bdc62 we have a lot of models so group them nicely 2025-01-24 18:02:00 +00:00
Alex Cheema
cfdaaef8e6 handle thinking outputs nicely, format latex beautifully 2025-01-24 17:49:25 +00:00
Alex Cheema
d8ffa59dba add deepseek v1, v3 and all the distills 2025-01-24 16:39:38 +00:00
Alex Cheema
aa1ce21f82 Merge pull request #625 from eltociear/patch-1
chore: update manual_discovery.py
2025-01-23 16:51:32 +00:00
Ikko Eltociear Ashimine
4fb01f516d chore: update manual_discovery.py
occured -> occurred
2025-01-24 00:18:42 +09:00
Alex Cheema
a635b23044 Merge pull request #619 from exo-explore/runners2
fix readme images
2025-01-23 02:18:33 +00:00
Alex Cheema
ad0e0d02d8 fix readme images 2025-01-23 02:17:58 +00:00
Alex Cheema
2644fd02c8 Merge pull request #617 from exo-explore/runners2
Lots of fixes and QoL improvements.
2025-01-23 02:05:17 +00:00
Alex Cheema
88ac12df6c install clang test 2025-01-23 01:55:14 +00:00
Alex Cheema
dfd9d3eb48 linux install 2025-01-23 01:44:57 +00:00
Alex Cheema
200ff4d713 linux install 2025-01-23 01:43:00 +00:00
Alex Cheema
b2764f177f linux install 2025-01-23 01:40:59 +00:00
Alex Cheema
e57fa1dfa0 xlarge 2025-01-23 01:40:13 +00:00
Alex Cheema
209163c595 add linux tinygrad test 2025-01-23 01:38:10 +00:00
Alex Cheema
495987b50b beef up the instance 2025-01-23 01:37:38 +00:00
Alex Cheema
8484eb4165 fix config 2025-01-23 01:37:01 +00:00
Alex Cheema
790c08afd4 add linux tinygrad test 2025-01-23 01:31:44 +00:00
Alex Cheema
a8a9e3ffa1 explicitly enable TOKENIZERS_PARALLELISM=true 2025-01-23 01:26:27 +00:00
Alex Cheema
5c9bcb8620 set GRPC_VERBOSITY=error; TRANSFORMERS_VERBOSITY=error 2025-01-23 01:22:19 +00:00
Alex Cheema
d54e19c20a runners back 2025-01-23 00:55:52 +00:00
Alex Cheema
cc78738e24 remove kern scan intervals 2025-01-23 00:49:32 +00:00
Alex Cheema
2391051c11 remove kern.timer.scan_interval from bootstrap.sh 2025-01-23 00:41:40 +00:00
Alex Cheema
112dea1582 add back the benchmarks baby 2025-01-23 00:15:54 +00:00
Alex Cheema
dc5cdc4d78 add back opaque 2025-01-22 23:59:39 +00:00
Alex Cheema
f8db4e131e fix check for sd2.1 2025-01-22 23:53:42 +00:00
Alex Cheema
bbb6856988 fix check for sd2.1 2025-01-22 23:51:09 +00:00
Alex Cheema
9ba8bbbcf8 fix filter to include 169.254.* since thats what mac uses for ethernet 2025-01-22 23:47:43 +00:00
Alex Cheema
8ab9977f01 fix stable diffusion case for tui, make mlx run on its own thread again and non-blocking 2025-01-22 23:22:53 +00:00
Alex Cheema
3a4bae0dab fix issue with eos_token_id 2025-01-22 22:58:09 +00:00
Alex Cheema
87d1271d33 fix stream: false completion 2025-01-22 22:46:04 +00:00
Alex Cheema
55d1846f5e clean up DEBUG=2 logs, a few fixes for token 2025-01-22 22:27:02 +00:00
Alex Cheema
9954ce8e4d fix treating token as a list 2025-01-22 22:13:13 +00:00
Alex Cheema
09e12d8673 temporarily disable github runner benchmarks 2025-01-22 22:00:13 +00:00
Alex Cheema
98d6e986bd add back .circleci 2025-01-22 21:58:46 +00:00
Alex Cheema
d80324fe20 disable test-m3-single-node 2025-01-22 21:58:40 +00:00
Alex Cheema
97f3bad38f fix peer_handle 2025-01-22 21:07:49 +00:00
Alex Cheema
461e4f37cb Merge remote-tracking branch 'origin/main' into runners2 2025-01-22 21:06:12 +00:00
Alex Cheema
07ceb19f0a Merge pull request #614 from samiamjidkhan/main
animation fix
2025-01-22 14:59:54 +00:00
Sami Khan
27b4577f38 directory for images 2025-01-22 05:47:25 -05:00
Sami Khan
a70943f8d2 base images for animation 2025-01-22 05:46:38 -05:00
Alex Cheema
410d901505 Merge pull request #613 from samiamjidkhan/dmg-backend
image and text mode fix
2025-01-21 13:12:08 +00:00
Sami Khan
5c4ce5392c image and text mode fix 2025-01-21 04:33:54 -05:00
Alex Cheema
819ec7626e Merge pull request #611 from exo-explore/fixbuildname
fix scripts/build_exo.py: com.exolabs.exo -> net.exolabs.exo
2025-01-21 05:36:34 +00:00
Alex Cheema
ba5bb3e171 fix scripts/build_exo.py: com.exolabs.exo -> net.exolabs.exo 2025-01-21 05:36:02 +00:00
Alex Cheema
f4bbcf4c8f Merge pull request #607 from tensorsofthewall/smol_fix
Fixes for cross-platform operability
2025-01-21 02:21:18 +00:00
Alex Cheema
6b8cd0577e fix some issues with results 2025-01-20 16:30:16 +00:00
Alex Cheema
218c1e79d9 Merge branch 'main' into runners2 2025-01-20 16:12:55 +00:00
Sandesh Bharadwaj
b9eccedc3d Formatting 2025-01-17 05:40:42 -05:00
Sandesh Bharadwaj
5f06aa2759 Replace netifaces (unmaintained,outdated) with scapy + add dependencies for previous fixes 2025-01-17 05:37:01 -05:00
Sandesh Bharadwaj
349b5344eb Minor fix for Shard typing 2025-01-16 14:36:46 -05:00
Sandesh Bharadwaj
df3624d27a Add AMD GPU querying + Windows device capabilities 2025-01-14 20:37:02 -05:00
Sandesh Bharadwaj
6737e36e23 Fixed MLX import blocking native Windows execution of exo. (Not Final) 2025-01-14 20:35:21 -05:00
Alex Cheema
c260689a06 Merge pull request #602 from exo-explore/fixexodir
fix exo folder
2025-01-12 03:46:14 +00:00
Alex Cheema
fcc699a55f fix 2025-01-12 03:40:59 +00:00
Alex Cheema
e7b98f5ae5 fix unit tests 2025-01-12 03:35:24 +00:00
Alex Cheema
ffe78f6d0b fix dummy test 2025-01-12 03:30:06 +00:00
Alex Cheema
ce5041ee1b types 2025-01-12 03:24:42 +00:00
Alex Cheema
9b2c01c873 ensure dir exists 2025-01-12 03:15:49 +00:00
Alex Cheema
2aed3f3518 handle inference_state properly 2025-01-12 03:13:17 +00:00
Alex Cheema
2af5ee02e4 fix exo folder 2025-01-12 03:10:11 +00:00
Alex Cheema
b5cbcbc7a2 Merge pull request #474 from pranav4501/stable-stable-diffusion-mlx
Stable diffusion mlx
2025-01-12 02:57:21 +00:00
Alex Cheema
5f3d000a7b Merge branch 'main' into stable-stable-diffusion-mlx 2025-01-12 02:56:34 +00:00
Alex Cheema
bd2e8e7a5a Merge pull request #598 from exo-explore/fixphitest
typo in phi test
2025-01-08 22:09:38 +00:00
Alex Cheema
40696b21f7 typo in phi test 2025-01-08 22:09:04 +00:00
Alex Cheema
4937fb3df8 Merge pull request #597 from exo-explore/tuioverflow
Tui overflow
2025-01-08 16:40:16 +00:00
Alex Cheema
2d631ea53d Merge pull request #596 from exo-explore/phi4
add phi 3.5, phi 4
2025-01-08 16:39:32 +00:00
Alex Cheema
2846a9122f tok tests 2025-01-08 16:39:11 +00:00
Alex Cheema
553ccce728 fix prompt and output overflow in tui 2025-01-08 16:36:56 +00:00
Alex Cheema
c587593364 add phi 3.5, phi 4 2025-01-08 16:19:43 +00:00
Alex Cheema
3c9efe103d Merge pull request #590 from metaspartan/fix-models-api
Fix the /v1/models API to output proper OpenAI compatible endpoint
2025-01-07 02:32:06 +00:00
Carsen Klock
627bfcae7c Fix the /v1/models API to output proper OpenAI compatible endpoint
Modify the `/v1/models` API to output a proper OpenAI compatible endpoint with an object and a `data` object containing the models list.

* Change the `handle_get_models` method in `exo/api/chatgpt_api.py` to wrap the models list in an object with a `data` field.
* Add an `object` field with the value "list" to the response format.

---

For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/metaspartan/exo?shareId=XXXX-XXXX-XXXX-XXXX).
2025-01-06 01:20:30 -07:00
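
The commit above describes the standard OpenAI `/v1/models` shape: a top-level object with `"object": "list"` and the models under a `"data"` array. A sketch of that response, reusing the `handle_get_models` name from the commit message but with an otherwise illustrative aiohttp handler:

```python
from aiohttp import web

MODELS = ["llama-3.2-1b", "deepseek-r1"]  # placeholder ids


async def handle_get_models(request: web.Request) -> web.Response:
    return web.json_response({
        "object": "list",
        "data": [{"id": m, "object": "model", "owned_by": "exo"} for m in MODELS],
    })
```
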
Alex Cheema
d9a836f152 Merge pull request #588 from exo-explore/betterdl
better download
2025-01-05 02:35:04 +00:00
Alex Cheema
29244c6369 fix args for ensure_shard 2025-01-05 02:33:25 +00:00
Alex Cheema
8c191050a2 download status in parallel, support async ensure shard with using shard_downloader instead 2025-01-05 02:31:59 +00:00
Alex Cheema
7b1656140e Merge pull request #585 from pepebruari/main
Add --system-prompt to exo cli
2025-01-03 23:49:50 +00:00
pepebruari
fe50d4d34d Add --system-prompt to exo cli 2025-01-03 16:16:22 -05:00
Alex Cheema
03aa6cecf1 Merge pull request #584 from exo-explore/AlexCheema-patch-1
add trending badge to README.md
2024-12-31 17:51:10 +00:00
Alex Cheema
178cc4d961 add trending badge to README.md 2024-12-31 17:50:29 +00:00
Pranav Veldurthi
b13e368368 fix inference engine 2024-12-30 19:41:19 -05:00
Pranav Veldurthi
9986fb86d4 remove prints and fix download progress for SD 2024-12-30 19:07:37 -05:00
Pranav Veldurthi
3475be9e9e Remove build 2024-12-30 18:39:17 -05:00
Pranav Veldurthi
fff8a1a690 fix inference engine for inference state 2024-12-30 18:36:53 -05:00
Pranav Veldurthi
54605299b8 Merge Latest 2024-12-30 18:36:23 -05:00
Alex Cheema
023ddc207e support different network interface tests 2024-12-17 21:03:00 +00:00
Alex Cheema
2f0b543a1e add peer connection info to tinychat 2024-12-17 17:37:40 +00:00
Alex Cheema
7ac4004392 change it back to collecting topology periodically even if peers dont change 2024-12-17 17:32:18 +00:00
Alex Cheema
198308b1eb more robust udp broadcast 2024-12-17 17:28:55 +00:00
Alex Cheema
1f108a06ff remove test sleep 2024-12-17 16:47:05 +00:00
Alex Cheema
3a58576f8c make sure this is actually doing something 2024-12-17 16:22:22 +00:00
Alex Cheema
0a07223074 switch to uvloop (faster asyncio event loop) and optimise grpc settings 2024-12-17 16:10:56 +00:00
Alex Cheema
58f0a0f547 optimise grpc parameters 2024-12-17 14:50:52 +00:00
Pranav Veldurthi
5c0cd1839b Update strength image to image gen 2024-12-16 18:40:36 -05:00
Alex Cheema
e2474c3f15 fail if we never get the desired node count 2024-12-16 21:59:02 +00:00
Alex Cheema
1b14be6013 make device_capabilities async running on a thread pool 2024-12-16 21:17:30 +00:00
Alex Cheema
036224f877 add topology to tinychat ui 2024-12-16 21:17:12 +00:00
Alex Cheema
b17faa8199 dont broadcast every single process_tensor 2024-12-16 20:54:38 +00:00
Alex Cheema
35d90d947c Merge remote-tracking branch 'origin/main' into runners 2024-12-16 20:04:03 +00:00
Alex Cheema
8d94b8ae12 trigger test 2024-12-16 20:03:22 +00:00
Alex Cheema
99a70f1045 Merge commit: trigger test 2024-12-16 20:01:23 +00:00
Alex Cheema
bd0febe35f Merge commit: trigger test 2024-12-16 20:01:09 +00:00
Alex Cheema
34ecbbe01c Merge commit: trigger test 2024-12-16 20:00:50 +00:00
Alex Cheema
427d0718b3 Merge commit: trigger test 2024-12-16 20:00:39 +00:00
Alex Cheema
b49c4ca0e5 Merge commit: trigger test 2024-12-16 20:00:21 +00:00
Alex Cheema
41eaaec5a9 Merge commit: trigger test 2024-12-16 20:00:10 +00:00
Alex Cheema
bf1aafdea7 Merge commit: trigger test 2024-12-16 19:59:51 +00:00
Alex Cheema
bfa06ee9f3 Merge commit: trigger test 2024-12-16 19:59:39 +00:00
Alex Cheema
c0534b67c3 Merge commit: trigger test 2024-12-16 19:59:08 +00:00
Alex Cheema
063964aab3 remove redundant sample_logits, put back opaque status for process_prompt so we have a way of preemptively starting downloads 2024-12-16 19:50:36 +00:00
Alex Cheema
804ad4705a upgrade mlx 2024-12-16 19:50:33 +00:00
Alex Cheema
c9ded9ba96 optimise networking, remove bloat 2024-12-16 19:50:29 +00:00
Alex Cheema
64365d684f one two and three m4 pro clusters 2024-12-16 19:50:24 +00:00
Alex Cheema
9397464fad add commit to results 2024-12-16 19:50:19 +00:00
Nel Nibcord
08912d1b64 Only collect topology if peers changed 2024-12-16 19:50:18 +00:00
Alex Cheema
06c2e236b8 rip out stats bloat 2024-12-16 19:50:17 +00:00
Alex Cheema
cb4615c95d fix SendNewToken 2024-12-16 19:50:14 +00:00
Alex Cheema
f55a53ae7e one token at a time 2024-12-16 19:49:52 +00:00
Gary
25b4af70e0 Merge branch 'main' into runners 2024-12-14 20:48:58 +00:00
Alex Cheema
a93092105c set max-generate-tokens to 250 2024-12-14 19:10:03 +00:00
Alex Cheema
0c6ab35333 increase timeout of http request in bench.py up to 10 mins 2024-12-14 18:33:41 +00:00
Alex Cheema
e5d54c77a9 add llama-3.3-70b to 3 M4 Pro cluster 2024-12-12 18:51:26 +00:00
Alex Cheema
2ff4638122 Merge remote-tracking branch 'origin/main' into runners 2024-12-12 17:14:40 +00:00
Alex Cheema
b6f2385c41 run llama-3.1-8b on 3 m4 pro cluster 2024-12-12 15:13:10 +00:00
Alex Cheema
9472ab0d2c t 2024-12-12 15:05:55 +00:00
Alex Cheema
dbb7ad3c08 run with three m4 pro 2024-12-12 14:36:18 +00:00
Alex Cheema
2abe57be21 grasping at straws 2024-12-12 12:03:20 +00:00
Alex Cheema
eeecdcb409 try a different taskpolicy 2024-12-12 11:45:01 +00:00
Alex Cheema
f9f76129a1 better bench system info 2024-12-12 11:34:37 +00:00
Alex Cheema
8c6d37d9b8 m4 cluster test 2024-12-12 11:13:13 +00:00
Alex Cheema
1194db6e65 m3 2024-12-12 00:02:20 +00:00
Alex Cheema
8cb7327da2 re-enable m4 cluster run 2024-12-12 00:01:14 +00:00
Alex Cheema
bba0aa0877 single node test 20 2024-12-11 22:58:44 +00:00
Alex Cheema
279354a1fd single node test 19 2024-12-11 22:58:38 +00:00
Alex Cheema
92e2b74902 single node test 18 2024-12-11 22:58:33 +00:00
Alex Cheema
76196b8c2f single node test 17 2024-12-11 22:58:27 +00:00
Alex Cheema
8408c8499f single node test 16 2024-12-11 22:58:21 +00:00
Alex Cheema
c65d1d9141 single node test 15 2024-12-11 22:58:16 +00:00
Alex Cheema
0bd44c0f78 single node test 14 2024-12-11 22:58:10 +00:00
Alex Cheema
f22bc99f2c single node test 13 2024-12-11 22:58:04 +00:00
Alex Cheema
3fda05aa39 single node test 12 2024-12-11 22:57:58 +00:00
Alex Cheema
6c322ac070 single node test 11 2024-12-11 22:57:53 +00:00
Alex Cheema
c5c27a32af single node test 10 2024-12-11 22:57:47 +00:00
Alex Cheema
9f1393dc7f single node test 9 2024-12-11 22:57:42 +00:00
Alex Cheema
32ff3ef9af single node test 8 2024-12-11 22:57:36 +00:00
Alex Cheema
b23c3fdaad single node test 7 2024-12-11 22:57:31 +00:00
Alex Cheema
8b47a9d017 single node test 6 2024-12-11 22:57:25 +00:00
Alex Cheema
f89b85b3f2 single node test 5 2024-12-11 22:57:19 +00:00
Alex Cheema
6f097c9321 single node test 4 2024-12-11 22:57:14 +00:00
Alex Cheema
fb7a0defe1 single node test 3 2024-12-11 22:57:08 +00:00
Alex Cheema
fe506a53d9 single node test 2 2024-12-11 22:57:02 +00:00
Alex Cheema
3f6ef1c763 single node test 1 2024-12-11 22:56:56 +00:00
Alex Cheema
e63c224c71 testtt 2024-12-11 22:53:02 +00:00
Alex Cheema
20e3065e57 les goh 2024-12-11 22:49:29 +00:00
Alex Cheema
83892d5b7e t 2024-12-11 22:45:59 +00:00
Alex Cheema
83470a98b4 t 2024-12-11 22:42:02 +00:00
Alex Cheema
92edfa5efc t 2024-12-11 22:40:47 +00:00
Alex Cheema
225dcba788 t 2024-12-11 22:37:11 +00:00
Alex Cheema
6249bee793 tes 2024-12-11 22:35:30 +00:00
Alex Cheema
741c31836e test 2024-12-11 22:27:10 +00:00
Alex Cheema
d0b7f1b4bb t 2024-12-11 22:11:01 +00:00
Alex Cheema
90677415c7 t 2024-12-11 22:01:29 +00:00
Alex Cheema
6cf2af39e8 t 2024-12-11 21:55:24 +00:00
Alex Cheema
5a1a0f5fd2 t 2024-12-11 21:45:53 +00:00
Alex Cheema
dd3fd279dc t 2024-12-11 21:42:01 +00:00
Alex Cheema
61c09631c0 t 2024-12-11 21:40:47 +00:00
Alex Cheema
e698ef6ab1 t 2024-12-11 21:39:27 +00:00
Alex Cheema
26351e719d t 2024-12-11 21:36:59 +00:00
Alex Cheema
5dee5e55fe t 2024-12-11 21:33:03 +00:00
Alex Cheema
6acfb81860 t 2024-12-11 20:28:07 +00:00
Alex Cheema
b1142d4ff4 t 2024-12-11 19:39:58 +00:00
Alex Cheema
a932afc01c oi 2024-12-11 19:30:28 +00:00
Alex Cheema
cdae702673 t 2024-12-11 19:24:43 +00:00
Alex Cheema
d95f40b6c8 a 2024-12-11 19:07:36 +00:00
Alex Cheema
97ffb83e86 t 2024-12-11 19:01:24 +00:00
Alex Cheema
9a11e27c93 ttt 2024-12-11 18:54:51 +00:00
Alex Cheema
d6c2146dd9 t 2024-12-11 18:34:35 +00:00
Alex Cheema
63da9fc194 a 2024-12-11 18:30:02 +00:00
Alex Cheema
7c0c5ef7fc ttttttt 2024-12-11 18:23:59 +00:00
Alex Cheema
739b7d178e tttttt 2024-12-11 18:02:22 +00:00
Alex Cheema
cacf50cd57 tttt 2024-12-11 18:00:28 +00:00
Alex Cheema
0904cda3ac ttt 2024-12-11 17:58:59 +00:00
Alex Cheema
6bb38939ec tt 2024-12-11 17:56:22 +00:00
Alex Cheema
1dbe11caf9 t 2024-12-11 17:54:41 +00:00
Alex Cheema
8d9e3b88d3 t 2024-12-11 17:52:07 +00:00
Alex Cheema
9dd33d37f2 t 2024-12-11 17:44:14 +00:00
Alex Cheema
a4bb4bb6ac update bootstrap 2024-12-11 17:37:38 +00:00
Alex Cheema
7b99cb4a12 t 2024-12-11 17:30:50 +00:00
Alex Cheema
9848a45da5 TT 2024-12-11 17:27:53 +00:00
Alex Cheema
378975813c t 2024-12-11 17:15:39 +00:00
Alex Cheema
e680e8a1ed fix name 2024-12-11 17:07:45 +00:00
Alex Cheema
7b2282d300 run without debug flag 2024-12-11 17:07:19 +00:00
Alex Cheema
3b1ea1933b use .venv exo 2024-12-11 17:02:58 +00:00
Alex Cheema
668766fc4b t 2024-12-11 16:55:57 +00:00
Alex Cheema
e501eeaf91 tweak install 2024-12-11 16:52:07 +00:00
Alex Cheema
41902f716f tweaks 2024-12-11 16:40:21 +00:00
Alex Cheema
b7bab80ec8 test2 2024-12-11 16:36:50 +00:00
Alex Cheema
6169996c70 test 2024-12-11 16:35:26 +00:00
Alex Cheema
bbb58460f8 Test on m4 2024-12-11 16:29:52 +00:00
Alex Cheema
cff03fc6c5 perf diag 2024-12-11 16:19:47 +00:00
Alex Cheema
f7122d400d add system_status check to bench 2024-12-11 16:13:53 +00:00
Alex Cheema
c938efb531 t 2024-12-11 16:06:14 +00:00
Alex Cheema
e2d3a90832 runner-token typo 2024-12-11 15:47:10 +00:00
Alex Cheema
ba96413a63 bootstrap script tweaks 2024-12-11 15:45:05 +00:00
Alex Cheema
cb40eb23ce more robust configure_mlx.sh 2024-12-11 15:38:45 +00:00
Alex Cheema
afe71c01da check gpu usage 2024-12-11 15:28:57 +00:00
Alex Cheema
a84cba4e3a Merge remote-tracking branch 'origin/main' into runners 2024-12-11 15:22:35 +00:00
Alex Cheema
23158a42ad add branch name to results 2024-12-11 12:59:55 +00:00
Alex Cheema
18e7919971 test 30 2024-12-11 12:55:05 +00:00
Alex Cheema
0e32a625d7 test 29 2024-12-11 12:54:59 +00:00
Alex Cheema
04bc163fea test 28 2024-12-11 12:54:52 +00:00
Alex Cheema
949055dec0 test 27 2024-12-11 12:54:45 +00:00
Alex Cheema
070b163cc7 test 26 2024-12-11 12:54:38 +00:00
Alex Cheema
fc26ad4006 test 25 2024-12-11 12:54:27 +00:00
Alex Cheema
5d3be3c6ed test 24 2024-12-11 12:54:20 +00:00
Alex Cheema
23dd5de3ae test 23 2024-12-11 12:54:14 +00:00
Alex Cheema
6030b39964 test 22 2024-12-11 12:54:08 +00:00
Alex Cheema
4f4ac0fa52 test 21 2024-12-11 12:54:01 +00:00
Alex Cheema
16d9839071 test {i} 2024-12-11 12:53:55 +00:00
Alex Cheema
8269b4b190 t 2024-12-11 12:38:51 +00:00
Alex Cheema
1e869a0f15 trigger test 2024-12-10 02:04:52 +00:00
Alex Cheema
5a4d128db6 trigger test 2024-12-09 08:02:29 +00:00
Alex Cheema
8a5d212cfc test 20 2024-12-08 23:38:30 +00:00
Alex Cheema
53edb8508b test 19 2024-12-08 23:38:24 +00:00
Alex Cheema
29d9df04bf test 18 2024-12-08 23:38:18 +00:00
Alex Cheema
4d6af6e6ca test 17 2024-12-08 23:38:13 +00:00
Alex Cheema
8c7c156f57 test 16 2024-12-08 23:38:07 +00:00
Alex Cheema
310843487f test 15 2024-12-08 23:38:01 +00:00
Alex Cheema
a4b221d0a0 test 14 2024-12-08 23:37:55 +00:00
Alex Cheema
286db875de test 13 2024-12-08 23:37:49 +00:00
Alex Cheema
d714e40f62 test 12 2024-12-08 23:37:43 +00:00
Alex Cheema
e78ef75531 test 11 2024-12-08 23:37:37 +00:00
Alex Cheema
38eaecf087 test 10 2024-12-08 23:37:31 +00:00
Alex Cheema
3cf28f8452 test 9 2024-12-08 23:37:26 +00:00
Alex Cheema
9ba8bbdd70 test 8 2024-12-08 23:37:20 +00:00
Alex Cheema
af6048e373 test 7 2024-12-08 23:37:14 +00:00
Alex Cheema
d93b8e8948 test 6 2024-12-08 23:37:08 +00:00
Alex Cheema
b69cb49a46 test 5 2024-12-08 23:37:02 +00:00
Alex Cheema
cc74b1f9b3 test 4 2024-12-08 23:36:57 +00:00
Alex Cheema
e78a52de5f test 3 2024-12-08 23:36:51 +00:00
Alex Cheema
f6c2c37c4b test 2 2024-12-08 23:36:45 +00:00
Alex Cheema
314a5d9781 test 1 2024-12-08 23:36:22 +00:00
Alex Cheema
b4e885bbd2 test range 2024-12-08 23:36:14 +00:00
Alex Cheema
bd9d11861b sleep before bench 2024-12-08 23:24:46 +00:00
Alex Cheema
571b26c50e allowed interface types 2024-12-08 23:20:08 +00:00
Glen
b21681931d remove 2024-12-08 23:13:10 +00:00
Alex Cheema
f584e86d8e get rid of lfs stuff 2024-12-08 22:55:19 +00:00
Alex Cheema
fd05bca1c8 lfs 2024-12-08 22:46:49 +00:00
Alex Cheema
cbac4d6a3e git version 2024-12-08 22:44:32 +00:00
Alex Cheema
b0977f97ab t 2024-12-08 22:43:23 +00:00
Glen
1716f637f7 test 2024-12-08 22:32:03 +00:00
Glen
903a5aabf7 fix 2024-12-08 22:26:44 +00:00
Glen
b4f86496ea bootstrap 2024-12-08 22:23:28 +00:00
Alex Cheema
8e57f3385c trigger test 2024-12-08 22:14:23 +00:00
Alex Cheema
3ccbdf19de add DEBUG_DISCOVERY 2024-12-08 22:07:48 +00:00
Alex Cheema
3687ba18df bench logs 2024-12-08 22:02:39 +00:00
Alex Cheema
6bb7c11bbb enable debug 2024-12-08 21:54:24 +00:00
Glen
c8f93721c5 model matrix 2024-12-08 21:14:36 +00:00
Alex Cheema
fb8d87025f t 2024-12-08 21:02:42 +00:00
Alex Cheema
87865f0cd9 list exo processes before test, warmup req in bench 2024-12-08 20:58:44 +00:00
Glen
755dd477dd jobname 2024-12-08 20:37:50 +00:00
Alex Cheema
fb44eb086c simplify bench 2024-12-08 20:30:07 +00:00
Alex Cheema
be8cbc0f56 trigger test 2024-12-08 19:28:55 +00:00
Glen
fe8074929f fix 2024-12-08 19:08:47 +00:00
Glen
c3c80c61c9 name 2024-12-08 19:02:53 +00:00
Glen
c138de0875 job_name 2024-12-08 18:56:37 +00:00
Glen
38bd00390c fix 2024-12-08 18:32:38 +00:00
Glen
732ba915aa new_conf 2024-12-08 18:32:06 +00:00
Glen
785710355f aws 2024-12-07 19:28:54 +00:00
Glen
320892dccc maxtok 2024-12-07 19:28:54 +00:00
Glen
6dae3a4719 conf 2024-12-07 19:28:54 +00:00
Glen
7b77ef000e flush 2024-12-07 19:28:54 +00:00
Glen
6c08b32350 nodebug 2024-12-07 19:28:54 +00:00
Glen
4dd617ad37 shorter 2024-12-07 19:28:54 +00:00
Glen
acdee16aee debug 2024-12-07 19:28:54 +00:00
Glen
9fc33587da path 2024-12-07 19:28:54 +00:00
Glen
f087c0ac99 fix 2024-12-07 19:28:54 +00:00
Glen
16b126d890 fix 2024-12-07 19:28:54 +00:00
Glen
faf0aaedba jq 2024-12-07 19:28:54 +00:00
Glen
4cac1bb151 quotes 2024-12-07 19:28:54 +00:00
Glen
cb3c1477bb fix 2024-12-07 19:28:54 +00:00
Glen
19a7d5a5cf fix 2024-12-07 19:28:54 +00:00
Glen
f7e0348f62 activate 2024-12-07 19:28:54 +00:00
Glen
c3dfac60a6 debug 2024-12-07 19:28:54 +00:00
Glen
64954aacfe fixed 2024-12-07 19:28:54 +00:00
Glen
ccc5415cc6 try 2024-12-07 19:28:54 +00:00
Glen
1dcc731b43 fix 2024-12-07 19:28:54 +00:00
Glen
3662ec402a fix 2024-12-07 19:28:54 +00:00
Glen
0739dc9564 fix 2024-12-07 19:28:54 +00:00
Glen
d16280ddfc debug 2024-12-07 19:28:54 +00:00
Glen
f9c23617a7 fix3 2024-12-07 19:28:54 +00:00
Glen
ce2ccddc93 fix2 2024-12-07 19:28:54 +00:00
Glen
1af28cb5a1 fix 2024-12-07 19:28:54 +00:00
Glen
6b61fc6660 tweak python install 2024-12-07 19:28:54 +00:00
Glen
bdf417f25e tweak 2024-12-07 19:28:54 +00:00
Glen
d154d37ac4 add exo run 2024-12-07 19:28:54 +00:00
Glen
90fd5c13a4 matrix 2024-12-07 19:28:54 +00:00
Glen
7d223a0095 matrix 2024-12-07 19:28:54 +00:00
Glen
cb3d89eb48 test runner 2024-12-07 19:28:54 +00:00
Glen
8302fd0aae test runner 2024-12-07 19:28:54 +00:00
Alex Cheema
deb80d2577 clang for tinygrad 2024-12-07 19:28:54 +00:00
Alex Cheema
976e5f2fdb disable mlx test for now..plan to run this on a self-hosted runner 2024-12-07 19:28:54 +00:00
Alex Cheema
9dc76ef03b tooonygrad 2024-12-07 19:28:54 +00:00
Alex Cheema
32cd1f1d72 give this a goh 2024-12-07 19:28:54 +00:00
Alex Cheema
6b54188140 cond 2024-12-07 19:28:54 +00:00
Alex Cheema
58bcf5b429 check discovery on integration tests too 2024-12-07 19:28:54 +00:00
Alex Cheema
3c0297c3e9 more robust discovery log check 2024-12-07 19:28:54 +00:00
Alex Cheema
8d433e6579 run tinygrad and discovery integratrion tests on linux 2024-12-07 19:28:54 +00:00
Alex Cheema
676125bfe6 job 2024-12-07 19:28:54 +00:00
Alex Cheema
902e0d35e1 github env vars 2024-12-07 19:28:54 +00:00
Alex Cheema
972aea446c macos 15 2024-12-07 19:28:53 +00:00
Alex Cheema
0d0338f871 migrate from circleci to github actions 2024-12-07 19:28:53 +00:00
Pranav Veldurthi
0f10244900 Merge latest 2024-12-04 22:52:48 -05:00
Pranav Veldurthi
686e139508 Merge Latest 2024-12-04 22:52:25 -05:00
Pranav Veldurthi
ca0caad0ae Image to image generation 2024-12-04 22:40:12 -05:00
Alex Cheema
f94c9067e2 trigger test 2024-12-04 03:09:12 +00:00
Alex Cheema
f0bb515d1d trigger test 2024-12-02 11:20:21 +00:00
Alex Cheema
71db641fe4 trigger test 2024-12-02 04:11:43 +00:00
Pranav Veldurthi
4b8c4a795f Images stored in system 2024-12-01 19:31:51 -05:00
Alex Cheema
f339f74fe3 trigger test 2024-12-01 17:39:53 +00:00
Alex Cheema
7dc0a7467b trigger test 2024-12-01 14:31:23 +00:00
Pranav Veldurthi
497756f7c8 merge latest main 2024-11-25 17:50:33 -05:00
Pranav Veldurthi
4874295b34 Image streaming while generation 2024-11-20 18:08:54 -05:00
Alex Cheema
fece3f0cef gitignore tinychat pngs 2024-11-20 10:01:06 +04:00
Alex Cheema
38ee815107 static images dir 2024-11-20 09:55:36 +04:00
Pranav Veldurthi
3d5746f16f Merge 2024-11-19 23:17:21 -05:00
Pranav Veldurthi
6b28ef0349 Stable stable diffusion mlx 2024-11-19 23:13:22 -05:00
679 changed files with 62860 additions and 18427 deletions


@@ -1,346 +0,0 @@
version: 2.1
orbs:
python: circleci/python@2
commands:
run_chatgpt_api_test:
parameters:
inference_engine:
type: string
model_id:
type: string
expected_output:
type: string
prompt:
type: string
steps:
- run:
name: Run chatgpt api integration test (<<parameters.inference_engine>>, <<parameters.model_id>>)
command: |
source env/bin/activate
# Set CLANG=1 for tinygrad only
if [ "<<parameters.inference_engine>>" = "tinygrad" ]; then
pip install llvmlite
export TOKENIZERS_PARALLELISM=true SUPPORT_BF16=0 CLANG=1
fi
# Start first instance
HF_HOME="$(pwd)/.hf_cache_node1" DEBUG_DISCOVERY=7 DEBUG=7 exo --inference-engine <<parameters.inference_engine>> \
--node-id "node1" --listen-port 5678 --broadcast-port 5679 --chatgpt-api-port 8000 \
--chatgpt-api-response-timeout 900 --disable-tui > output1.log &
PID1=$!
tail -f output1.log &
TAIL1=$!
# Start second instance
HF_HOME="$(pwd)/.hf_cache_node2" DEBUG_DISCOVERY=7 DEBUG=7 exo --inference-engine <<parameters.inference_engine>> \
--node-id "node2" --listen-port 5679 --broadcast-port 5678 --chatgpt-api-port 8001 \
--chatgpt-api-response-timeout 900 --disable-tui > output2.log &
PID2=$!
tail -f output2.log &
TAIL2=$!
# Remember to kill the tail processes at the end
trap 'kill $TAIL1 $TAIL2' EXIT
# Wait for discovery
sleep 10
# Function to check if processes are still running
check_processes() {
if ! kill -0 $PID1 2>/dev/null; then
echo "First instance (PID $PID1) died unexpectedly. Log output:"
cat output1.log
exit 1
fi
if ! kill -0 $PID2 2>/dev/null; then
echo "Second instance (PID $PID2) died unexpectedly. Log output:"
cat output2.log
exit 1
fi
}
# Check processes before proceeding
check_processes
echo "Sending request to first instance..."
response_1=$(curl -s http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "<<parameters.model_id>>",
"messages": [{"role": "user", "content": "<<parameters.prompt>>"}],
"temperature": 0.7
}')
echo "Response 1: $response_1"
# Check processes after first response
check_processes
echo "Sending request to second instance..."
response_2=$(curl -s http://localhost:8001/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "<<parameters.model_id>>",
"messages": [{"role": "user", "content": "<<parameters.prompt>>"}],
"temperature": 0.7
}')
echo "Response 2: $response_2"
# Check processes after second response
check_processes
# Stop both instances
kill $PID1 $PID2
echo ""
# Extract content using jq and check if it contains expected output
content1=$(echo "$response_1" | jq -r '.choices[0].message.content')
content2=$(echo "$response_2" | jq -r '.choices[0].message.content')
if [[ "$content1" != *"<<parameters.expected_output>>"* ]] || [[ "$content2" != *"<<parameters.expected_output>>"* ]]; then
echo "Test failed: Response does not match '<<parameters.expected_output>>'"
echo "Response 1 content: $content1"
echo ""
echo "Response 2 content: $content2"
echo "Output of first instance:"
cat output1.log
echo "Output of second instance:"
cat output2.log
exit 1
else
echo "Test passed: Response from both nodes matches '<<parameters.expected_output>>'"
fi
jobs:
unit_test:
macos:
xcode: "16.0.0"
resource_class: m2pro.large
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
- run:
name: Run tests
command: |
source env/bin/activate
# set TEMPERATURE to 0 for deterministic sampling
echo "Running inference engine tests..."
METAL_DEVICE_WRAPPER_TYPE=1 METAL_DEBUG_ERROR_MODE=0 METAL_XCODE=1 TEMPERATURE=0 python3 -m exo.inference.test_inference_engine
echo "Running tokenizer tests..."
python3 ./test/test_tokenizers.py
python3 ./test/test_model_helpers.py
discovery_integration_test:
macos:
xcode: "16.0.0"
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
- run:
name: Run discovery integration test
command: |
source env/bin/activate
DEBUG_DISCOVERY=7 DEBUG=7 exo --node-id "node1" --listen-port 5678 --broadcast-port 5679 --chatgpt-api-port 8000 --disable-tui > output1.log 2>&1 &
PID1=$!
DEBUG_DISCOVERY=7 DEBUG=7 exo --node-id "node2" --listen-port 5679 --broadcast-port 5678 --chatgpt-api-port 8001 --disable-tui > output2.log 2>&1 &
PID2=$!
sleep 10
kill $PID1 $PID2
if grep -q "Peer statuses: {\\'node2\\': \\'is_connected=True, health_check=True" output1.log && ! grep -q "Failed to connect peers:" output1.log && grep -q "Peer statuses: {\\'node1\\': \\'is_connected=True, health_check=True" output2.log && ! grep -q "Failed to connect peers:" output2.log; then
echo "Test passed: Both instances discovered each other"
exit 0
else
echo "Test failed: Devices did not discover each other"
echo "Output of first instance:"
cat output1.log
echo "Output of second instance:"
cat output2.log
exit 1
fi
chatgpt_api_integration_test_mlx:
macos:
xcode: "16.0.0"
resource_class: m2pro.large
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
- run_chatgpt_api_test:
inference_engine: mlx
model_id: llama-3.2-1b
prompt: "Keep responses concise. Who was the king of pop?"
expected_output: "Michael Jackson"
chatgpt_api_integration_test_dummy:
macos:
xcode: "16.0.0"
resource_class: m2pro.large
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
- run_chatgpt_api_test:
inference_engine: dummy
model_id: dummy
prompt: "Dummy prompt."
expected_output: "dummy"
chatgpt_api_integration_test_tinygrad:
macos:
xcode: "16.0.0"
resource_class: m2pro.large
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
- run_chatgpt_api_test:
inference_engine: tinygrad
model_id: llama-3.2-1b
prompt: "Keep responses concise. Who was the king of pop?"
expected_output: "Michael Jackson"
measure_pip_sizes:
macos:
xcode: "16.0.0"
steps:
- checkout
- run:
name: Set up Python
command: |
brew install python@3.12
python3.12 -m venv env
source env/bin/activate
- run:
name: Install dependencies and measure sizes
command: |
source env/bin/activate
pip install --upgrade pip
pip install .
python ./extra/pipsize.py --json ./pipsize.json
- store_artifacts:
path: ./pipsize.json
destination: pip-sizes.json
check_line_count:
docker:
- image: cimg/python:3.10
steps:
- checkout
- run:
name: Setup git for PR comparison
command: |
if [[ -n "$CIRCLE_PULL_REQUEST" ]]; then
PR_NUMBER=$(echo $CIRCLE_PULL_REQUEST | rev | cut -d'/' -f1 | rev)
BASE_BRANCH=$(curl -s -H "Circle-Token: $CIRCLE_TOKEN" \
"https://circleci.com/api/v2/project/github/$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME/pipeline/$CIRCLE_WORKFLOW_ID" \
| jq -r '.target_branch')
git clone -b $BASE_BRANCH --single-branch \
https://github.com/$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME.git \
base_branch
fi
- run:
name: Install dependencies
command: |
python -m pip install --upgrade pip
pip install tabulate
- run:
name: Run line count check
command: |
if [[ -n "$CIRCLE_PULL_REQUEST" ]]; then
python extra/line_counter.py base_branch .
else
python extra/line_counter.py .
fi
- store_artifacts:
path: line-count-snapshot.json
destination: line-count-snapshot.json
- store_artifacts:
path: line-count-diff.json
destination: line-count-diff.json
- run:
name: Create test results directory
command: |
mkdir -p test-results/line-count
cp line-count-*.json test-results/line-count/
- store_test_results:
path: test-results
workflows:
version: 2
build_and_test:
jobs:
- check_line_count:
filters:
branches:
only: /.*/
tags:
only: /.*/
- unit_test
- discovery_integration_test
- chatgpt_api_integration_test_mlx
- chatgpt_api_integration_test_tinygrad
- chatgpt_api_integration_test_dummy
- measure_pip_sizes

63
.clauderules Normal file

@@ -0,0 +1,63 @@
# Claude Code Rules - Follow Every Rule Exactly
You must prioritize straightforward code semantics, well-named types, clear function signatures, and robust, carefully-chosen abstractions. Think about how your decisions might impact these aspects of code quality before proposing any changes.
You have access to all modern Python features from Python 3.13, 3.12, 3.11...
**When you're done making changes, remove any redundant comments; remaining comments should only apply to complex code segments, adding relevant context.**
## 1. Code Discipline
* Eliminate superfluous `try`/`catch` and `if` branches through strict typing and static analysis.
* Use pure functions unless you must mutate fixed state—then wrap that state in a class.
* Every function is **referentially transparent**: same inputs ⇒ same outputs, no hidden state, no unintended I/O.
* Put side-effects in injectable "effect handlers"; keep core logic pure.
## 2. Naming
* Choose descriptive, non-abbreviated names—no 3-letter acronyms or non-standard contractions.
* Anyone reading a function's type signature alone should grasp its purpose without extra context.
## 3. Typing
* Maintain **strict, exhaustive** typing; never bypass the type-checker.
* Default to `Literal[...]` when an enum-like set is needed.
* Prefer built-in types; when two values share structure but differ in meaning, enforce separation:
* Use `typing.NewType` for primitives (zero runtime cost).
* For serializable objects, add a `type: str` field that states the object's identity.
## 4. Pydantic
* Read, respect, and rely on Pydantic documentation.
* Centralize a common `ConfigDict` with `frozen=True` and `strict=True` (or stricter) and reuse it everywhere.
* For hierarchies of `BaseModel` variants, declare a discriminated union with `typing.Annotated[Base, Field(discriminator='variant')]`; publish a single `TypeAdapter[Base]` so all variants share one strict validator.
## 5. IDs & UUIDs
* Subclass Pydantic's `UUID4` for custom ID types.
* Generate fresh IDs with `uuid.uuid4()`.
* Create idempotency keys by hashing *persisted* state plus a **function-specific salt** to avoid collisions after crashes.
## 6. Error Handling
* Catch an exception **only** where you can handle or transform it meaningfully.
* State in the docstring **where** each exception is expected to be handled and **why**.
## 7. Dependencies
* Introduce new external dependencies only after approval.
* Request only libraries common in production environments.
## 8. Use of `@final` & Freezing
* Mark classes, methods, and variables as `@final` or otherwise immutable wherever applicable.
## 9. Repository Workflow
If you spot a rule violation within code that you've not been asked to work on directly, inform the user rather than patching it ad-hoc.
---
### One-Sentence Summary
Write strictly-typed, pure, self-describing Python that uses Pydantic, well-scoped side-effects, immutable state, approved dependencies, and explicit error handling.
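
Sections 3-5 of the rules above pack in several concrete techniques: a shared strict, frozen `ConfigDict`, `typing.NewType` for primitive IDs, a discriminated union published through one `TypeAdapter`, and salted idempotency keys. A compact sketch of what following them looks like, with hypothetical names that are not taken from the exo codebase:

```python
import hashlib
import uuid
from typing import Annotated, Literal, NewType, final

from pydantic import BaseModel, ConfigDict, Field, TypeAdapter

STRICT_FROZEN = ConfigDict(frozen=True, strict=True)  # centralised once, reused everywhere

NodeId = NewType("NodeId", str)  # zero-runtime-cost distinction from a plain str


class TaskBase(BaseModel):
    model_config = STRICT_FROZEN
    task_id: uuid.UUID


@final
class DownloadTask(TaskBase):
    variant: Literal["download"] = "download"
    model_id: str


@final
class InferenceTask(TaskBase):
    variant: Literal["inference"] = "inference"
    prompt: str


# One strict validator shared by every variant in the hierarchy.
Task = Annotated[DownloadTask | InferenceTask, Field(discriminator="variant")]
task_adapter = TypeAdapter(Task)


def idempotency_key(persisted_state: bytes, function_salt: bytes) -> str:
    # Hash persisted state plus a function-specific salt (rule 5).
    return hashlib.sha256(function_salt + persisted_state).hexdigest()


task = task_adapter.validate_python(
    {"variant": "download", "task_id": uuid.uuid4(), "model_id": "llama-3.2-1b"}
)
```
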

64
.cursorrules Normal file

@@ -0,0 +1,64 @@
# follow **every** rule exactly; report any violation instead of silently fixing it.
You must prioritize straightforward code semantics, well-named types, clear function signatures, and robust, carefully-chosen abstractions. Think about how your decisions might impact these aspects of code quality before proposing any changes.
You can use the advanced features of `typing`. You have access to all of the new features from Python 3.13, 3.12, 3.11...
**When you're done making your changes, remove any redundant comments that you may have left; the comments that remain should only apply to complex segments of code, adding relevant context.**
## 1. Code Discipline
* Eliminate superfluous `try` / `catch` and `if` branches through strict typing and static analysis.
* Use pure functions unless you must mutate fixed state—then wrap that state in a class.
* Every function is **referentially transparent**: same inputs ⇒ same outputs, no hidden state, no unintended I/O.
* Put side-effects in injectable “effect handlers”; keep core logic pure.
## 2. Naming
* Choose descriptive, non-abbreviated names—no 3-letter acronyms or non-standard contractions.
* Anyone reading a function's type signature alone should grasp its purpose without extra context.
## 3. Typing
* Maintain **strict, exhaustive** typing; never bypass the type-checker.
* Default to `Literal[...]` when an enum-like set is needed.
* Prefer built-in types; when two values share structure but differ in meaning, enforce separation:
* Use `typing.NewType` for primitives (zero runtime cost).
* For serialisable objects, add a `type: str` field that states the object's identity.
## 4. Pydantic
* Read, respect, and rely on Pydantic docs.
* Centralise a common `ConfigDict` with `frozen=True` and `strict=True` (or stricter) and reuse it everywhere.
* For hierarchies of `BaseModel` variants, declare a discriminated union with `typing.Annotated[Base, Field(discriminator='variant')]`; publish a single `TypeAdapter[Base]` so all variants share one strict validator.
## 5. IDs & UUIDs
* Subclass Pydantic's `UUID4` for custom ID types.
* Generate fresh IDs with `uuid.uuid4()`.
* Create idempotency keys by hashing *persisted* state plus a **function-specific salt** to avoid collisions after crashes.
## 6. Error Handling
* Catch an exception **only** where you can handle or transform it meaningfully.
* State in the docstring **where** each exception is expected to be handled and **why**.
## 7. Dependencies
* Introduce new external dependencies only after approval.
* Request only libraries common in production environments.
## 8. Use of `@final` & Freezing
* Mark classes, methods, and variables as `@final` or otherwise immutable wherever applicable.
## 9. Repository Workflow
If you spot a rule violation within code that you've not been asked to work on directly, inform the user rather than patching it ad-hoc.
---
### One-Sentence Summary
Write strictly-typed, pure, self-describing Python that uses Pydantic, well-scoped side-effects, immutable state, approved dependencies, and explicit error handling.
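A minimal sketch of the typing and Pydantic rules above (sections 3 and 4), assuming hypothetical `Circle`/`Square` variants; the names and the shared `STRICT_FROZEN` config are illustrative, not taken from the repository:

```python
from typing import Annotated, Literal, NewType

from pydantic import BaseModel, ConfigDict, Field, TypeAdapter

# Rule 4: one shared strict, frozen ConfigDict reused by every model.
STRICT_FROZEN = ConfigDict(frozen=True, strict=True)

# Rule 3: NewType separates values that share structure but differ in meaning.
UserId = NewType("UserId", str)


class Circle(BaseModel):
    model_config = STRICT_FROZEN
    type: Literal["circle"] = "circle"  # rule 3: explicit identity field
    radius: float


class Square(BaseModel):
    model_config = STRICT_FROZEN
    type: Literal["square"] = "square"
    side: float


# Rule 4: one discriminated union and a single TypeAdapter validate all variants.
Shape = Annotated[Circle | Square, Field(discriminator="type")]
shape_adapter: TypeAdapter[Circle | Square] = TypeAdapter(Shape)

circle = shape_adapter.validate_python({"type": "circle", "radius": 2.0})
```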

1
.envrc Normal file
View File

@@ -0,0 +1 @@
use flake

2
.gitattributes vendored
View File

@@ -1,2 +0,0 @@
*.mp3 filter=lfs diff=lfs merge=lfs -text
*.png filter=lfs diff=lfs merge=lfs -text

3
.githooks/post-checkout Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { printf >&2 "\n%s\n\n" "This repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting the 'post-checkout' file in the hooks directory (set by 'core.hookspath'; usually '.git/hooks')."; exit 2; }
git lfs post-checkout "$@"

3
.githooks/post-commit Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { printf >&2 "\n%s\n\n" "This repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting the 'post-commit' file in the hooks directory (set by 'core.hookspath'; usually '.git/hooks')."; exit 2; }
git lfs post-commit "$@"

3
.githooks/post-merge Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { printf >&2 "\n%s\n\n" "This repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting the 'post-merge' file in the hooks directory (set by 'core.hookspath'; usually '.git/hooks')."; exit 2; }
git lfs post-merge "$@"

3
.githooks/pre-push Executable file
View File

@@ -0,0 +1,3 @@
#!/bin/sh
command -v git-lfs >/dev/null 2>&1 || { printf >&2 "\n%s\n\n" "This repository is configured for Git LFS but 'git-lfs' was not found on your path. If you no longer wish to use Git LFS, remove this hook by deleting the 'pre-push' file in the hooks directory (set by 'core.hookspath'; usually '.git/hooks')."; exit 2; }
git lfs pre-push "$@"

3
.github/CODEOWNERS vendored Normal file
View File

@@ -0,0 +1,3 @@
* @ToxicPine
* @AlexCheema
* @GeluVrabie

43
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,43 @@
---
name: Bug Report
about: Create a report to help us improve
title: '[BUG] '
labels: bug
assignees: ''
---
## Describe the bug
A clear and concise description of what the bug is.
## To Reproduce
Steps to reproduce the behavior:
1.
2.
3.
## Expected behavior
A clear and concise description of what you expected to happen.
## Actual behavior
A clear and concise description of what actually happened.
## Environment
- macOS Version:
- EXO Version:
- Hardware:
- Device 1: (e.g., MacBook Pro M1 Max, 32GB RAM)
- Device 2: (e.g., Mac Mini M2, 16GB RAM)
- Additional devices:
- Interconnection:
- (e.g., Thunderbolt 4 cable between Device 1 and 2)
- (e.g., WiFi 6 for Device 3)
- (e.g., 10GbE Ethernet between all devices)
## Additional context
Add any other context about the problem here.

View File

@@ -0,0 +1,11 @@
---
name: Feature Request
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: ''
---
<!-- Please use a clear, descriptive title above -->
Describe what you'd like to see added to EXO.

View File

@@ -0,0 +1,16 @@
name: Commit if changed
description: "Create a commit when the working tree is dirty"
inputs:
message:
description: "Commit message"
required: true
runs:
using: composite
steps:
- name: Commit changed files
shell: bash
run: |
git diff --quiet && exit 0
git commit -am "${{ inputs.message }}"

10
.github/actions/format/action.yml vendored Normal file
View File

@@ -0,0 +1,10 @@
name: Format Code
description: "Run code formatter"
runs:
using: "composite"
steps:
- name: Format code
run: nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just fmt
shell: bash

10
.github/actions/lint-check/action.yml vendored Normal file
View File

@@ -0,0 +1,10 @@
name: Lint Check
description: "Check for lint errors"
runs:
using: "composite"
steps:
- name: Lint check
run: nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just lint-check
shell: bash

10
.github/actions/lint/action.yml vendored Normal file
View File

@@ -0,0 +1,10 @@
name: Lint Code
description: "Run code linter"
runs:
using: "composite"
steps:
- name: Lint code
run: nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just lint
shell: bash

View File

@@ -0,0 +1,10 @@
name: Regenerate Protobufs
description: "Regenerate protobuf files"
runs:
using: "composite"
steps:
- name: Regenerate protobufs
run: nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just regenerate-protobufs
shell: bash

View File

@@ -0,0 +1,20 @@
name: Setup Python & uv
description: "Regenerate Python environment from uv.lock"
runs:
using: "composite"
steps:
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
cache-dependency-glob: uv.lock
- name: Install Python
run: uv python install
shell: bash
- name: Sync
run: uv sync --locked --all-extras --dev
shell: bash

12
.github/actions/typecheck/action.yml vendored Normal file
View File

@@ -0,0 +1,12 @@
name: Type Check
description: "Run type checker"
runs:
using: "composite"
steps:
- name: Run type checker
run: |
nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just sync
nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just check
shell: bash

12
.github/actions/unit-test/action.yml vendored Normal file
View File

@@ -0,0 +1,12 @@
name: Unit Test
description: "Run unit tests"
runs:
using: "composite"
steps:
- name: Run unit tests
run: |
nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just sync-clean
nix --extra-experimental-features nix-command --extra-experimental-features flakes develop -c just test-fast
shell: bash

20
.github/actions/verify-clean/action.yml vendored Normal file
View File

@@ -0,0 +1,20 @@
name: Verify Clean Working Tree
description: "Fail the job if the previous step left the working tree dirty"
inputs:
step:
description: "The name of the step that just executed"
required: true
runs:
using: composite
steps:
- name: Check git diff
shell: bash
run: |
if ! git diff --quiet; then
echo "Error: ${{ inputs.step }} left working tree dirty." >&2
git --no-pager diff >&2
exit 1
fi

23
.github/pull_request_template.md vendored Normal file
View File

@@ -0,0 +1,23 @@
## Motivation
<!-- Why is this change needed? What problem does it solve? -->
<!-- If it fixes an open issue, please link to the issue here -->
## Changes
<!-- Describe what you changed in detail -->
## Why It Works
<!-- Explain why your approach solves the problem -->
## Test Plan
### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB, connected via Thunderbolt 4) -->
<!-- What you did: -->
<!-- - -->
### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover this change -->
<!-- - -->

442
.github/workflows/build-app.yml vendored Normal file
View File

@@ -0,0 +1,442 @@
name: Build EXO macOS DMG
# Release workflow:
# 1. Create a draft GitHub Release with the tag name (e.g. v1.0.0) and write release notes in markdown
# 2. Push the tag: git tag v1.0.0 && git push origin v1.0.0
# 3. This workflow builds, signs, and notarizes the DMG
# 4. Release notes are embedded in appcast.xml for Sparkle (rendered as markdown)
# 5. DMG and appcast.xml are uploaded to S3
# 6. The draft GitHub Release is published with the DMG attached
#
# For alpha releases (e.g. v1.0.0-alpha.1): draft release and notes are optional.
# If no draft exists, a release is auto-created with generated notes.
on:
workflow_dispatch:
push:
tags:
- "v*"
branches:
- "test-app"
jobs:
build-macos-app:
runs-on: "macos-26"
permissions:
contents: write
env:
SPARKLE_VERSION: 2.9.0-beta.1
SPARKLE_DOWNLOAD_PREFIX: ${{ secrets.SPARKLE_DOWNLOAD_PREFIX }}
SPARKLE_FEED_URL: ${{ secrets.SPARKLE_FEED_URL }}
SPARKLE_ED25519_PUBLIC: ${{ secrets.SPARKLE_ED25519_PUBLIC }}
SPARKLE_ED25519_PRIVATE: ${{ secrets.SPARKLE_ED25519_PRIVATE }}
SPARKLE_S3_BUCKET: ${{ secrets.SPARKLE_S3_BUCKET }}
SPARKLE_S3_PREFIX: ${{ secrets.SPARKLE_S3_PREFIX }}
EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT: ${{ secrets.EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT }}
AWS_REGION: ${{ secrets.AWS_REGION }}
EXO_BUILD_NUMBER: ${{ github.run_number }}
EXO_LIBP2P_NAMESPACE: ${{ github.ref_name }}
steps:
# ============================================================
# Checkout and tag validation
# ============================================================
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Derive release version from tag
run: |
if [[ "$GITHUB_REF_NAME" == "test-app" || "${{ github.event_name }}" == "workflow_dispatch" ]]; then
VERSION="0.0.0-alpha.0"
echo "IS_ALPHA=true" >> $GITHUB_ENV
else
VERSION="${GITHUB_REF_NAME#v}"
if [[ "$VERSION" == *-alpha* ]]; then
echo "IS_ALPHA=true" >> $GITHUB_ENV
else
echo "IS_ALPHA=false" >> $GITHUB_ENV
fi
fi
echo "RELEASE_VERSION=$VERSION" >> $GITHUB_ENV
- name: Compute build version from semver
run: |
VERSION="$RELEASE_VERSION"
# Extract major.minor.patch (strip prerelease suffix)
BASE_VERSION="${VERSION%%-*}"
MAJOR=$(echo "$BASE_VERSION" | cut -d. -f1)
MINOR=$(echo "$BASE_VERSION" | cut -d. -f2)
PATCH=$(echo "$BASE_VERSION" | cut -d. -f3)
# Extract prerelease number (e.g., "alpha.2" -> 2, or 999 for releases)
if [[ "$VERSION" == *-* ]]; then
PRERELEASE_PART="${VERSION#*-}"
PRERELEASE_NUM="${PRERELEASE_PART##*.}"
# Default to 0 if not a number
if ! [[ "$PRERELEASE_NUM" =~ ^[0-9]+$ ]]; then
PRERELEASE_NUM=0
fi
else
PRERELEASE_NUM=999
fi
# Compute: PRERELEASE + (1000 * PATCH) + (1_000_000 * MINOR) + (1_000_000_000 * MAJOR)
BUILD_VERSION=$((PRERELEASE_NUM + 1000 * PATCH + 1000000 * MINOR + 1000000000 * MAJOR))
echo "EXO_BUILD_VERSION=$BUILD_VERSION" >> $GITHUB_ENV
echo "Computed build version: $BUILD_VERSION from $VERSION"
- name: Ensure tag commit is on main
if: github.ref_type == 'tag'
run: |
git fetch origin main
# Alpha tags can be on any branch, production tags must be on main
if [[ "$IS_ALPHA" == "true" ]]; then
echo "Alpha tag detected, skipping main branch check"
elif ! git merge-base --is-ancestor origin/main HEAD; then
echo "Production tag must point to a commit on main"
exit 1
fi
- name: Fetch and validate release notes
if: github.ref_type == 'tag'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
# Find draft release by name using gh release list (more reliable with default token)
echo "Looking for draft release named '$GITHUB_REF_NAME'..."
DRAFT_EXISTS=$(gh release list --json name,isDraft --jq ".[] | select(.isDraft == true) | select(.name == \"$GITHUB_REF_NAME\") | .name" 2>/dev/null || echo "")
if [[ -z "$DRAFT_EXISTS" ]]; then
if [[ "$IS_ALPHA" == "true" ]]; then
echo "No draft release found for alpha tag $GITHUB_REF_NAME (optional for alphas)"
echo "HAS_RELEASE_NOTES=false" >> $GITHUB_ENV
exit 0
fi
echo "ERROR: No draft release found for tag $GITHUB_REF_NAME"
echo "Please create a draft release with release notes before pushing the tag."
exit 1
fi
# Fetch full release details via API to get body and ID
echo "Found draft release, fetching details..."
RELEASE_JSON=$(gh api repos/${{ github.repository }}/releases --jq ".[] | select(.draft == true) | select(.name == \"$GITHUB_REF_NAME\")" 2>/dev/null || echo "")
# Extract release notes
NOTES=$(echo "$RELEASE_JSON" | jq -r '.body // ""')
if [[ -z "$NOTES" || "$NOTES" == "null" ]]; then
if [[ "$IS_ALPHA" == "true" ]]; then
echo "Draft release has no notes (optional for alphas)"
echo "HAS_RELEASE_NOTES=false" >> $GITHUB_ENV
exit 0
fi
echo "ERROR: Draft release exists but has no release notes"
echo "Please add release notes to the draft release before pushing the tag."
exit 1
fi
# Save release ID for later publishing
RELEASE_ID=$(echo "$RELEASE_JSON" | jq -r '.id')
echo "DRAFT_RELEASE_ID=$RELEASE_ID" >> $GITHUB_ENV
echo "HAS_RELEASE_NOTES=true" >> $GITHUB_ENV
echo "Found draft release (ID: $RELEASE_ID), saving release notes..."
echo "$NOTES" > /tmp/release_notes.md
echo "RELEASE_NOTES_FILE=/tmp/release_notes.md" >> $GITHUB_ENV
# ============================================================
# Install dependencies
# ============================================================
- name: Select Xcode 26.2
run: |
sudo xcode-select -s /Applications/Xcode_26.2.app
if ! xcrun -f metal >/dev/null 2>&1; then
echo "Metal toolchain is not installed."
exit 1
fi
- name: Install Homebrew packages
run: brew install just awscli macmon
- name: Install UV
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
cache-dependency-glob: uv.lock
- name: Setup Python
run: |
uv python install
uv sync --locked
- name: Install Nix
uses: cachix/install-nix-action@v31
with:
nix_path: nixpkgs=channel:nixos-unstable
- name: Configure Cachix
uses: cachix/cachix-action@v14
with:
name: exo
authToken: "${{ secrets.CACHIX_AUTH_TOKEN }}"
- name: Build dashboard
run: |
DASHBOARD_OUT=$(nix build .#dashboard --print-build-logs --no-link --print-out-paths)
mkdir -p dashboard/build
cp -r "$DASHBOARD_OUT"/* dashboard/build/
- name: Install Sparkle CLI
run: |
CLI_URL="${SPARKLE_CLI_URL:-https://github.com/sparkle-project/Sparkle/releases/download/${SPARKLE_VERSION}/Sparkle-${SPARKLE_VERSION}.tar.xz}"
echo "Downloading Sparkle CLI from: $CLI_URL"
mkdir -p /tmp/sparkle
curl --fail --location --output /tmp/sparkle.tar.xz "$CLI_URL"
tar -xJf /tmp/sparkle.tar.xz -C /tmp/sparkle --strip-components=1
echo "SPARKLE_BIN=/tmp/sparkle/bin" >> $GITHUB_ENV
- name: Prepare code-signing keychain
env:
MACOS_CERTIFICATE: ${{ secrets.MACOS_CERTIFICATE }}
MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
PROVISIONING_PROFILE: ${{ secrets.PROVISIONING_PROFILE }}
run: |
KEYCHAIN_PATH="$HOME/Library/Keychains/build.keychain-db"
# Create fresh keychain
security create-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
# Disable auto-lock (no timeout, no lock-on-sleep)
security set-keychain-settings "$KEYCHAIN_PATH"
# Add to search list while preserving existing keychains
security list-keychains -d user -s "$KEYCHAIN_PATH" $(security list-keychains -d user | tr -d '"')
# Set as default and unlock
security default-keychain -s "$KEYCHAIN_PATH"
security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
# Import certificate with full access for codesign
echo "$MACOS_CERTIFICATE" | base64 --decode > /tmp/cert.p12
security import /tmp/cert.p12 -k "$KEYCHAIN_PATH" -P "$MACOS_CERTIFICATE_PASSWORD" \
-T /usr/bin/codesign -T /usr/bin/security -T /usr/bin/productbuild
rm /tmp/cert.p12
# Allow codesign to access the key without prompting
security set-key-partition-list -S apple-tool:,apple:,codesign: -s -k "$MACOS_CERTIFICATE_PASSWORD" "$KEYCHAIN_PATH"
# Verify keychain is unlocked and identity is available
echo "Verifying signing identity..."
security find-identity -v -p codesigning "$KEYCHAIN_PATH"
# Setup provisioning profile
mkdir -p "$HOME/Library/Developer/Xcode/UserData/Provisioning Profiles"
echo "$PROVISIONING_PROFILE" | base64 --decode > "$HOME/Library/Developer/Xcode/UserData/Provisioning Profiles/EXO.provisionprofile"
# Export keychain path for other steps
echo "BUILD_KEYCHAIN_PATH=$KEYCHAIN_PATH" >> $GITHUB_ENV
# ============================================================
# Build the bundle
# ============================================================
- name: Build PyInstaller bundle
run: uv run pyinstaller packaging/pyinstaller/exo.spec
- name: Build Swift app
env:
MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
SPARKLE_FEED_URL: ${{ secrets.SPARKLE_FEED_URL }}
SPARKLE_ED25519_PUBLIC: ${{ secrets.SPARKLE_ED25519_PUBLIC }}
run: |
cd app/EXO
security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
xcodebuild clean build \
-scheme EXO \
-configuration Release \
-derivedDataPath build \
MARKETING_VERSION="$RELEASE_VERSION" \
CURRENT_PROJECT_VERSION="$EXO_BUILD_VERSION" \
EXO_BUILD_TAG="$RELEASE_VERSION" \
EXO_BUILD_COMMIT="$GITHUB_SHA" \
SPARKLE_FEED_URL="$SPARKLE_FEED_URL" \
SPARKLE_ED25519_PUBLIC="$SPARKLE_ED25519_PUBLIC" \
EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT="$EXO_BUG_REPORT_PRESIGNED_URL_ENDPOINT" \
CODE_SIGNING_IDENTITY="$SIGNING_IDENTITY" \
CODE_SIGN_INJECT_BASE_ENTITLEMENTS=YES
mkdir -p ../../output
cp -R build/Build/Products/Release/EXO.app ../../output/EXO.app
- name: Inject PyInstaller runtime
run: |
rm -rf output/EXO.app/Contents/Resources/exo
mkdir -p output/EXO.app/Contents/Resources
cp -R dist/exo output/EXO.app/Contents/Resources/exo
- name: Codesign PyInstaller runtime
env:
MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
run: |
cd output
security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
RUNTIME_DIR="EXO.app/Contents/Resources/exo"
find "$RUNTIME_DIR" -type f \( -perm -111 -o -name "*.dylib" -o -name "*.so" \) -print0 |
while IFS= read -r -d '' file; do
/usr/bin/codesign --force --timestamp --options runtime \
--sign "$SIGNING_IDENTITY" "$file"
done
- name: Sign, notarize, and create DMG
env:
MACOS_CERTIFICATE_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PASSWORD }}
APPLE_NOTARIZATION_USERNAME: ${{ secrets.APPLE_NOTARIZATION_USERNAME }}
APPLE_NOTARIZATION_PASSWORD: ${{ secrets.APPLE_NOTARIZATION_PASSWORD }}
APPLE_NOTARIZATION_TEAM: ${{ secrets.APPLE_NOTARIZATION_TEAM }}
run: |
cd output
security unlock-keychain -p "$MACOS_CERTIFICATE_PASSWORD" "$BUILD_KEYCHAIN_PATH"
SIGNING_IDENTITY=$(security find-identity -v -p codesigning "$BUILD_KEYCHAIN_PATH" | awk -F '"' '{print $2}')
/usr/bin/codesign --deep --force --timestamp --options runtime \
--sign "$SIGNING_IDENTITY" EXO.app
mkdir -p dmg-root
cp -R EXO.app dmg-root/
ln -s /Applications dmg-root/Applications
DMG_NAME="EXO-${RELEASE_VERSION}.dmg"
hdiutil create -volname "EXO" -srcfolder dmg-root -ov -format UDZO "$DMG_NAME"
/usr/bin/codesign --force --timestamp --options runtime \
--sign "$SIGNING_IDENTITY" "$DMG_NAME"
if [[ -n "$APPLE_NOTARIZATION_USERNAME" ]]; then
SUBMISSION_OUTPUT=$(xcrun notarytool submit "$DMG_NAME" \
--apple-id "$APPLE_NOTARIZATION_USERNAME" \
--password "$APPLE_NOTARIZATION_PASSWORD" \
--team-id "$APPLE_NOTARIZATION_TEAM" \
--wait --timeout 15m 2>&1)
echo "$SUBMISSION_OUTPUT"
SUBMISSION_ID=$(echo "$SUBMISSION_OUTPUT" | awk 'tolower($1)=="id:" && $2 ~ /^[0-9a-fA-F-]+$/ {print $2; exit}')
STATUS=$(echo "$SUBMISSION_OUTPUT" | awk 'tolower($1)=="status:" {print $2; exit}')
if [[ -n "$SUBMISSION_ID" ]]; then
xcrun notarytool log "$SUBMISSION_ID" \
--apple-id "$APPLE_NOTARIZATION_USERNAME" \
--password "$APPLE_NOTARIZATION_PASSWORD" \
--team-id "$APPLE_NOTARIZATION_TEAM" > notarization-log.txt || true
echo "===== Notarization Log ====="
cat notarization-log.txt
echo "============================"
fi
if [[ "$STATUS" != "Accepted" ]]; then
echo "Notarization failed with status: ${STATUS:-Unknown}"
exit 1
fi
xcrun stapler staple "$DMG_NAME"
fi
- name: Generate Sparkle appcast
env:
SPARKLE_DOWNLOAD_PREFIX: ${{ env.SPARKLE_DOWNLOAD_PREFIX }}
SPARKLE_ED25519_PRIVATE: ${{ secrets.SPARKLE_ED25519_PRIVATE }}
IS_ALPHA: ${{ env.IS_ALPHA }}
run: |
set -euo pipefail
cd output
DOWNLOAD_PREFIX="${SPARKLE_DOWNLOAD_PREFIX:-https://assets.exolabs.net}"
echo "$SPARKLE_ED25519_PRIVATE" > sparkle_ed25519.key
chmod 600 sparkle_ed25519.key
CHANNEL_FLAG=""
if [[ "$IS_ALPHA" == "true" ]]; then
CHANNEL_FLAG="--channel alpha"
echo "Generating appcast for alpha channel"
fi
$SPARKLE_BIN/generate_appcast \
--ed-key-file sparkle_ed25519.key \
--download-url-prefix "$DOWNLOAD_PREFIX" \
$CHANNEL_FLAG \
.
- name: Inject release notes into appcast
if: github.ref_type == 'tag' && env.HAS_RELEASE_NOTES == 'true'
env:
RELEASE_VERSION: ${{ env.RELEASE_VERSION }}
run: |
# Inject markdown release notes with sparkle:format="markdown" (Sparkle 2.9+)
export NOTES=$(cat "$RELEASE_NOTES_FILE")
# Insert description after the enclosure tag for this version
awk '
/<enclosure[^>]*>/ && index($0, ENVIRON["RELEASE_VERSION"]) {
print
print " <description sparkle:format=\"markdown\"><![CDATA["
print ENVIRON["NOTES"]
print " ]]></description>"
next
}
{ print }
' output/appcast.xml > output/appcast.xml.tmp && mv output/appcast.xml.tmp output/appcast.xml
echo "Injected markdown release notes for version $RELEASE_VERSION"
# ============================================================
# Upload artifacts
# ============================================================
- name: Upload DMG
uses: actions/upload-artifact@v4
with:
name: EXO-dmg-${{ env.RELEASE_VERSION }}
path: output/EXO-${{ env.RELEASE_VERSION }}.dmg
- name: Upload to S3
if: env.SPARKLE_S3_BUCKET != '' && github.ref_type == 'tag'
env:
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_REGION: ${{ env.AWS_REGION }}
SPARKLE_S3_BUCKET: ${{ env.SPARKLE_S3_BUCKET }}
SPARKLE_S3_PREFIX: ${{ env.SPARKLE_S3_PREFIX }}
IS_ALPHA: ${{ env.IS_ALPHA }}
run: |
set -euo pipefail
cd output
PREFIX="${SPARKLE_S3_PREFIX:-}"
if [[ -n "$PREFIX" && "${PREFIX: -1}" != "/" ]]; then
PREFIX="${PREFIX}/"
fi
DMG_NAME="EXO-${RELEASE_VERSION}.dmg"
aws s3 cp "$DMG_NAME" "s3://${SPARKLE_S3_BUCKET}/${PREFIX}${DMG_NAME}"
if [[ "$IS_ALPHA" != "true" ]]; then
aws s3 cp "$DMG_NAME" "s3://${SPARKLE_S3_BUCKET}/${PREFIX}EXO-latest.dmg"
aws s3 cp appcast.xml "s3://${SPARKLE_S3_BUCKET}/${PREFIX}appcast.xml" --content-type application/xml --cache-control no-cache
fi
- name: Publish GitHub Release
if: github.ref_type == 'tag'
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
DMG_PATH="output/EXO-${RELEASE_VERSION}.dmg"
if [[ "$HAS_RELEASE_NOTES" == "true" ]]; then
# Update the draft release with the tag and upload DMG
gh api --method PATCH "repos/${{ github.repository }}/releases/$DRAFT_RELEASE_ID" \
-f tag_name="$GITHUB_REF_NAME" \
-F draft=false
gh release upload "$GITHUB_REF_NAME" "$DMG_PATH" --clobber
echo "Published release $GITHUB_REF_NAME with DMG attached"
else
# Alpha without draft release - create one with auto-generated notes
gh release create "$GITHUB_REF_NAME" "$DMG_PATH" \
--title "$GITHUB_REF_NAME" \
--generate-notes \
--prerelease
echo "Created alpha release $GITHUB_REF_NAME with auto-generated notes"
fi
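As a worked check of the "Compute build version from semver" step above: a hypothetical tag `v1.2.3-alpha.4` yields 4 + 1000*3 + 1000000*2 + 1000000000*1 = 1002003004. A small Python sketch of the same formula (the function name is illustrative, not part of the workflow):

```python
def build_version(major: int, minor: int, patch: int, prerelease: int = 999) -> int:
    # Mirrors BUILD_VERSION=$((PRERELEASE_NUM + 1000*PATCH + 1000000*MINOR + 1000000000*MAJOR))
    return prerelease + 1_000 * patch + 1_000_000 * minor + 1_000_000_000 * major


assert build_version(1, 2, 3, prerelease=4) == 1_002_003_004  # hypothetical v1.2.3-alpha.4
assert build_version(1, 2, 3) == 1_002_003_999                # hypothetical v1.2.3 release
```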

136
.github/workflows/pipeline.yml vendored Normal file
View File

@@ -0,0 +1,136 @@
name: ci-pipeline
on:
push:
pull_request:
branches:
- staging
- main
jobs:
typecheck:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
lfs: false
- uses: cachix/install-nix-action@v31
with:
nix_path: nixpkgs=channel:nixos-unstable
- uses: cachix/cachix-action@v14
name: Configure Cachix
with:
name: exo
authToken: "${{ secrets.CACHIX_AUTH_TOKEN }}"
- name: Configure git user
run: |
git config --local user.email "github-actions@users.noreply.github.com"
git config --local user.name "github-actions bot"
shell: bash
- name: Pull LFS files
run: |
echo "Pulling Git LFS files..."
git lfs pull
shell: bash
- name: Setup Nix Environment
run: |
echo "Checking for nix installation..."
# Check if nix binary exists directly
if [ -f /nix/var/nix/profiles/default/bin/nix ]; then
echo "Found nix binary at /nix/var/nix/profiles/default/bin/nix"
export PATH="/nix/var/nix/profiles/default/bin:$PATH"
echo "PATH=$PATH" >> $GITHUB_ENV
nix --version
elif [ -f /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh ]; then
echo "Found nix profile script, sourcing..."
source /nix/var/nix/profiles/default/etc/profile.d/nix-daemon.sh
nix --version
elif command -v nix >/dev/null 2>&1; then
echo "Nix already in PATH"
nix --version
else
echo "Nix not found. Debugging info:"
echo "Contents of /nix/var/nix/profiles/default/:"
ls -la /nix/var/nix/profiles/default/ 2>/dev/null || echo "Directory not found"
echo "Contents of /nix/var/nix/profiles/default/bin/:"
ls -la /nix/var/nix/profiles/default/bin/ 2>/dev/null || echo "Directory not found"
exit 1
fi
shell: bash
- name: Configure basedpyright include for local MLX
run: |
RUNNER_LABELS='${{ toJSON(runner.labels) }}'
if echo "$RUNNER_LABELS" | grep -q "local_mlx"; then
if [ -d "/Users/Shared/mlx" ]; then
echo "Updating [tool.basedpyright].include to use /Users/Shared/mlx"
awk '
BEGIN { insec=0 }
/^\[tool\.basedpyright\]/ { insec=1; print; next }
insec && /^\[/ { insec=0 } # next section
insec && /^[ \t]*include[ \t]*=/ {
print "include = [\"/Users/Shared/mlx\"]"
next
}
{ print }
' pyproject.toml > pyproject.toml.tmp && mv pyproject.toml.tmp pyproject.toml
echo "New [tool.basedpyright] section:"
sed -n '/^\[tool\.basedpyright\]/,/^\[/p' pyproject.toml | sed '$d' || true
else
echo "local_mlx tag present but /Users/Shared/mlx not found; leaving pyproject unchanged."
fi
else
echo "Runner does not have 'local_mlx' tag; leaving pyproject unchanged."
fi
shell: bash
- uses: ./.github/actions/typecheck
nix:
name: Build and check (${{ matrix.system }})
runs-on: ${{ matrix.runner }}
strategy:
fail-fast: false
matrix:
include:
- runner: macos-26
system: aarch64-darwin
- runner: ubuntu-latest
system: x86_64-linux
- runner: ubuntu-24.04-arm
system: aarch64-linux
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
lfs: false
- uses: cachix/install-nix-action@v31
with:
nix_path: nixpkgs=channel:nixos-unstable
- uses: cachix/cachix-action@v14
name: Configure Cachix
with:
name: exo
authToken: "${{ secrets.CACHIX_AUTH_TOKEN }}"
- name: Build all Nix outputs
run: |
nix flake show --json | jq -r '
[
(.packages."${{ matrix.system }}" // {} | keys[] | ".#packages.${{ matrix.system }}.\(.)"),
(.devShells."${{ matrix.system }}" // {} | keys[] | ".#devShells.${{ matrix.system }}.\(.)")
] | .[]
' | xargs nix build
- name: Run nix flake check
run: nix flake check

191
.gitignore vendored
View File

@@ -1,173 +1,30 @@
__pycache__/
.venv*
test_weights.npz
.exo_used_ports
.exo_node_id
# gitingest
digest.txt
# python
**/__pycache__
# nix
.direnv/
# IDEA (PyCharm)
.idea
.DS_Store
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# xcode / macos
*.xcuserstate
*.xcuserdata
*.xcuserdatad/
**/.DS_Store
app/EXO/build/
dist/
# C extensions
*.so
# Distribution / packaging
/.Python
/develop-eggs/
/dist/
/downloads/
/eggs/
/.eggs/
/lib/
/lib64/
/parts/
/sdist/
/var/
/wheels/
/share/python-wheels/
/*.egg-info/
/.installed.cfg
/*.egg
/MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
# rust
target/
**/*.rs.bk
*.pdb
# Jupyter Notebook
.ipynb_checkpoints
Untitled.ipynb
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
**/*.xcodeproj/*
.aider*
# svelte
dashboard/build/
dashboard/node_modules/
dashboard/.svelte-kit/

9
.idea/.gitignore generated vendored Normal file
View File

@@ -0,0 +1,9 @@
# Default ignored files
/shelf/
/workspace.xml
# Editor-based HTTP Client requests
/httpRequests/
# Datasource local storage ignored files
/dataSources/
/dataSources.local.xml
workspace.xml

16
.idea/LanguageServersSettings.xml generated Normal file
View File

@@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="LanguageServerSettingsState">
<state>
<map>
<entry key="com.insyncwithfoo.pyright">
<value>
<LanguageServerDefinitionSettings>
<option name="errorReportingKind" value="in_log" />
</LanguageServerDefinitionSettings>
</value>
</entry>
</map>
</state>
</component>
</project>

31
.idea/exo-v2.iml generated Normal file
View File

@@ -0,0 +1,31 @@
<?xml version="1.0" encoding="UTF-8"?>
<module type="EMPTY_MODULE" version="4">
<component name="FacetManager">
<facet type="Python" name="Python facet">
<configuration sdkName="Python 3.13 virtualenv at ~/Desktop/exo/.venv" />
</facet>
</component>
<component name="Go" enabled="true" />
<component name="NewModuleRootManager">
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/scripts/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/rust/exo_pyo3_bindings/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/rust/exo_pyo3_bindings/tests" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/rust/util/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/rust/networking/examples" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/rust/networking/src" isTestSource="false" />
<sourceFolder url="file://$MODULE_DIR$/rust/networking/tests" isTestSource="true" />
<sourceFolder url="file://$MODULE_DIR$/rust/system_custodian/src" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/.venv" />
<excludeFolder url="file://$MODULE_DIR$/.direnv" />
<excludeFolder url="file://$MODULE_DIR$/build" />
<excludeFolder url="file://$MODULE_DIR$/dist" />
<excludeFolder url="file://$MODULE_DIR$/.go_cache" />
<excludeFolder url="file://$MODULE_DIR$/rust/target" />
</content>
<orderEntry type="jdk" jdkName="Python 3.13 (exo)" jdkType="Python SDK" />
<orderEntry type="sourceFolder" forTests="false" />
<orderEntry type="library" name="Python 3.13 virtualenv at ~/Desktop/exo/.venv interpreter library" level="application" />
</component>
</module>

6
.idea/externalDependencies.xml generated Normal file
View File

@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ExternalDependencies">
<plugin id="systems.fehn.intellijdirenv" />
</component>
</project>

View File

@@ -0,0 +1,14 @@
<component name="InspectionProjectProfileManager">
<profile version="1.0">
<option name="myName" value="Project Default" />
<inspection_tool class="PyCompatibilityInspection" enabled="true" level="WARNING" enabled_by_default="true">
<option name="ourVersions">
<value>
<list size="1">
<item index="0" class="java.lang.String" itemvalue="3.14" />
</list>
</value>
</option>
</inspection_tool>
</profile>
</component>

10
.idea/misc.xml generated Normal file
View File

@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="Black">
<option name="sdkName" value="Python 3.13 (exo)" />
</component>
<component name="ProjectRootManager" version="2" project-jdk-name="Python 3.13 (exo)" project-jdk-type="Python SDK" />
<component name="PythonCompatibilityInspectionAdvertiser">
<option name="version" value="3" />
</component>
</project>

8
.idea/modules.xml generated Normal file
View File

@@ -0,0 +1,8 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="ProjectModuleManager">
<modules>
<module fileurl="file://$PROJECT_DIR$/.idea/exo.iml" filepath="$PROJECT_DIR$/.idea/exo.iml" />
</modules>
</component>
</project>

18
.idea/pyright-overrides.xml generated Normal file
View File

@@ -0,0 +1,18 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="com.insyncwithfoo.pyright.configurations.Override">
<option name="names">
<map>
<entry key="configurationFile" value="true" />
<entry key="diagnosticMode" value="true" />
<entry key="inlayHintsGenericTypes" value="true" />
<entry key="prefixTooltipMessages" value="true" />
<entry key="runningMode" value="true" />
<entry key="smartExecutableResolution" value="true" />
<entry key="smartLanguageServerExecutableResolution" value="true" />
<entry key="useEditorFontForTooltips" value="true" />
<entry key="useTypingExtensions" value="true" />
</map>
</option>
</component>
</project>

9
.idea/pyright.xml generated Normal file
View File

@@ -0,0 +1,9 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="com.insyncwithfoo.pyright.configurations.Local">
<option name="diagnosticMode" value="WORKSPACE" />
<option name="inlayHintsGenericTypes" value="true" />
<option name="prefixTooltipMessages" value="true" />
<option name="useEditorFontForTooltips" value="true" />
</component>
</project>

6
.idea/vcs.xml generated Normal file
View File

@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
<project version="4">
<component name="VcsDirectoryMappings">
<mapping directory="" vcs="Git" />
</component>
</project>

View File

@@ -0,0 +1,7 @@
"""
This type stub file was generated by pyright.
"""
import os
if "TOKENIZERS_PARALLELISM" not in os.environ: ...

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,47 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import PIL.Image
import tqdm
from typing import Protocol
from mflux.models.common.config.config import Config
class BeforeLoopCallback(Protocol):
def call_before_loop(
self,
seed: int,
prompt: str,
latents: mx.array,
config: Config,
canny_image: PIL.Image.Image | None = ...,
depth_image: PIL.Image.Image | None = ...,
) -> None: ...
class InLoopCallback(Protocol):
def call_in_loop(
self,
t: int,
seed: int,
prompt: str,
latents: mx.array,
config: Config,
time_steps: tqdm,
) -> None: ...
class AfterLoopCallback(Protocol):
def call_after_loop(
self, seed: int, prompt: str, latents: mx.array, config: Config
) -> None: ...
class InterruptCallback(Protocol):
def call_interrupt(
self,
t: int,
seed: int,
prompt: str,
latents: mx.array,
config: Config,
time_steps: tqdm,
) -> None: ...
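A hedged sketch of a class that structurally satisfies the `InLoopCallback` protocol stubbed above; `StepLogger` and its logging behaviour are illustrative, not part of mflux or exo:

```python
import mlx.core as mx
import tqdm

from mflux.models.common.config.config import Config


class StepLogger:
    """Logs a latent statistic at each denoising step via call_in_loop."""

    def call_in_loop(
        self,
        t: int,
        seed: int,
        prompt: str,
        latents: mx.array,
        config: Config,
        time_steps: tqdm.tqdm,
    ) -> None:
        print(f"step={t} seed={seed} mean_abs_latent={float(mx.abs(latents).mean()):.4f}")
```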

View File

@@ -0,0 +1,24 @@
"""
This type stub file was generated by pyright.
"""
from typing import TYPE_CHECKING
from mflux.callbacks.callback import (
AfterLoopCallback,
BeforeLoopCallback,
InLoopCallback,
InterruptCallback,
)
from mflux.callbacks.generation_context import GenerationContext
from mflux.models.common.config.config import Config
if TYPE_CHECKING: ...
class CallbackRegistry:
def __init__(self) -> None: ...
def register(self, callback) -> None: ...
def start(self, seed: int, prompt: str, config: Config) -> GenerationContext: ...
def before_loop_callbacks(self) -> list[BeforeLoopCallback]: ...
def in_loop_callbacks(self) -> list[InLoopCallback]: ...
def after_loop_callbacks(self) -> list[AfterLoopCallback]: ...
def interrupt_callbacks(self) -> list[InterruptCallback]: ...

View File

@@ -0,0 +1,29 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import PIL.Image
import tqdm
from typing import TYPE_CHECKING
from mflux.callbacks.callback_registry import CallbackRegistry
from mflux.models.common.config.config import Config
if TYPE_CHECKING: ...
class GenerationContext:
def __init__(
self, registry: CallbackRegistry, seed: int, prompt: str, config: Config
) -> None: ...
def before_loop(
self,
latents: mx.array,
*,
canny_image: PIL.Image.Image | None = ...,
depth_image: PIL.Image.Image | None = ...,
) -> None: ...
def in_loop(self, t: int, latents: mx.array, time_steps: tqdm = ...) -> None: ...
def after_loop(self, latents: mx.array) -> None: ...
def interruption(
self, t: int, latents: mx.array, time_steps: tqdm = ...
) -> None: ...

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,22 @@
"""
This type stub file was generated by pyright.
"""
import os
BATTERY_PERCENTAGE_STOP_LIMIT = ...
CONTROLNET_STRENGTH = ...
DEFAULT_DEV_FILL_GUIDANCE = ...
DEFAULT_DEPTH_GUIDANCE = ...
DIMENSION_STEP_PIXELS = ...
GUIDANCE_SCALE = ...
GUIDANCE_SCALE_KONTEXT = ...
IMAGE_STRENGTH = ...
MODEL_CHOICES = ...
MODEL_INFERENCE_STEPS = ...
QUANTIZE_CHOICES = ...
if os.environ.get("MFLUX_CACHE_DIR"):
MFLUX_CACHE_DIR = ...
else:
MFLUX_CACHE_DIR = ...
MFLUX_LORA_CACHE_DIR = ...

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,8 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.common.config.config import Config
from mflux.models.common.config.model_config import ModelConfig
__all__ = ["Config", "ModelConfig"]

View File

@@ -0,0 +1,66 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from pathlib import Path
from typing import Any
from tqdm import tqdm
from mflux.models.common.config.model_config import ModelConfig
logger = ...
class Config:
def __init__(
self,
model_config: ModelConfig,
num_inference_steps: int = ...,
height: int = ...,
width: int = ...,
guidance: float = ...,
image_path: Path | str | None = ...,
image_strength: float | None = ...,
depth_image_path: Path | str | None = ...,
redux_image_paths: list[Path | str] | None = ...,
redux_image_strengths: list[float] | None = ...,
masked_image_path: Path | str | None = ...,
controlnet_strength: float | None = ...,
scheduler: str = ...,
) -> None: ...
@property
def height(self) -> int: ...
@property
def width(self) -> int: ...
@width.setter
def width(self, value): # -> None:
...
@property
def image_seq_len(self) -> int: ...
@property
def guidance(self) -> float: ...
@property
def num_inference_steps(self) -> int: ...
@property
def precision(self) -> mx.Dtype: ...
@property
def num_train_steps(self) -> int: ...
@property
def image_path(self) -> Path | None: ...
@property
def image_strength(self) -> float | None: ...
@property
def depth_image_path(self) -> Path | None: ...
@property
def redux_image_paths(self) -> list[Path] | None: ...
@property
def redux_image_strengths(self) -> list[float] | None: ...
@property
def masked_image_path(self) -> Path | None: ...
@property
def init_time_step(self) -> int: ...
@property
def time_steps(self) -> tqdm: ...
@property
def controlnet_strength(self) -> float | None: ...
@property
def scheduler(self) -> Any: ...

View File

@@ -0,0 +1,86 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from functools import lru_cache
from typing import Literal
class ModelConfig:
precision: mx.Dtype = ...
def __init__(
self,
priority: int,
aliases: list[str],
model_name: str,
base_model: str | None,
controlnet_model: str | None,
custom_transformer_model: str | None,
num_train_steps: int | None,
max_sequence_length: int | None,
supports_guidance: bool | None,
requires_sigma_shift: bool | None,
transformer_overrides: dict | None = ...,
) -> None: ...
@staticmethod
@lru_cache
def dev() -> ModelConfig: ...
@staticmethod
@lru_cache
def schnell() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_kontext() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_fill() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_redux() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_depth() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_controlnet_canny() -> ModelConfig: ...
@staticmethod
@lru_cache
def schnell_controlnet_canny() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_controlnet_upscaler() -> ModelConfig: ...
@staticmethod
@lru_cache
def dev_fill_catvton() -> ModelConfig: ...
@staticmethod
@lru_cache
def krea_dev() -> ModelConfig: ...
@staticmethod
@lru_cache
def flux2_klein_4b() -> ModelConfig: ...
@staticmethod
@lru_cache
def flux2_klein_9b() -> ModelConfig: ...
@staticmethod
@lru_cache
def qwen_image() -> ModelConfig: ...
@staticmethod
@lru_cache
def qwen_image_edit() -> ModelConfig: ...
@staticmethod
@lru_cache
def fibo() -> ModelConfig: ...
@staticmethod
@lru_cache
def z_image_turbo() -> ModelConfig: ...
@staticmethod
@lru_cache
def seedvr2_3b() -> ModelConfig: ...
def x_embedder_input_dim(self) -> int: ...
def is_canny(self) -> bool: ...
@staticmethod
def from_name(
model_name: str, base_model: Literal["dev", "schnell", "krea-dev"] | None = ...
) -> ModelConfig: ...
AVAILABLE_MODELS = ...

View File

@@ -0,0 +1,7 @@
"""
This type stub file was generated by pyright.
"""
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,49 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from pathlib import Path
from typing import TYPE_CHECKING, TypeAlias
from mlx import nn
from mflux.models.common.vae.tiling_config import TilingConfig
from mflux.models.fibo.latent_creator.fibo_latent_creator import FiboLatentCreator
from mflux.models.flux.latent_creator.flux_latent_creator import FluxLatentCreator
from mflux.models.qwen.latent_creator.qwen_latent_creator import QwenLatentCreator
from mflux.models.z_image.latent_creator.z_image_latent_creator import (
ZImageLatentCreator,
)
if TYPE_CHECKING:
LatentCreatorType: TypeAlias = type[
FiboLatentCreator | FluxLatentCreator | QwenLatentCreator | ZImageLatentCreator
]
class Img2Img:
def __init__(
self,
vae: nn.Module,
latent_creator: LatentCreatorType,
sigmas: mx.array,
init_time_step: int,
image_path: str | Path | None,
tiling_config: TilingConfig | None = ...,
) -> None: ...
class LatentCreator:
@staticmethod
def create_for_txt2img_or_img2img(
seed: int, height: int, width: int, img2img: Img2Img
) -> mx.array: ...
@staticmethod
def encode_image(
vae: nn.Module,
image_path: str | Path,
height: int,
width: int,
tiling_config: TilingConfig | None = ...,
) -> mx.array: ...
@staticmethod
def add_noise_by_interpolation(
clean: mx.array, noise: mx.array, sigma: float
) -> mx.array: ...

View File

@@ -0,0 +1,3 @@
"""
This type stub file was generated by pyright.
"""

View File

@@ -0,0 +1,13 @@
"""
This type stub file was generated by pyright.
"""
from mlx import nn
from mflux.models.common.lora.layer.linear_lora_layer import LoRALinear
class FusedLoRALinear(nn.Module):
def __init__(
self, base_linear: nn.Linear | nn.QuantizedLinear, loras: list[LoRALinear]
) -> None: ...
def __call__(self, x): # -> array:
...

View File

@@ -0,0 +1,22 @@
"""
This type stub file was generated by pyright.
"""
from mlx import nn
class LoRALinear(nn.Module):
@staticmethod
def from_linear(
linear: nn.Linear | nn.QuantizedLinear, r: int = ..., scale: float = ...
): # -> LoRALinear:
...
def __init__(
self,
input_dims: int,
output_dims: int,
r: int = ...,
scale: float = ...,
bias: bool = ...,
) -> None: ...
def __call__(self, x): # -> array:
...

View File

@@ -0,0 +1,26 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
from collections.abc import Callable
from dataclasses import dataclass
from mflux.models.common.lora.mapping.lora_mapping import LoRATarget
@dataclass
class PatternMatch:
source_pattern: str
target_path: str
matrix_name: str
transpose: bool
transform: Callable[[mx.array], mx.array] | None = ...
class LoRALoader:
@staticmethod
def load_and_apply_lora(
lora_mapping: list[LoRATarget],
transformer: nn.Module,
lora_paths: list[str] | None = ...,
lora_scales: list[float] | None = ...,
) -> tuple[list[str], list[float]]: ...

View File

@@ -0,0 +1,21 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from collections.abc import Callable
from dataclasses import dataclass
from typing import List, Protocol
@dataclass
class LoRATarget:
model_path: str
possible_up_patterns: List[str]
possible_down_patterns: List[str]
possible_alpha_patterns: List[str] = ...
up_transform: Callable[[mx.array], mx.array] | None = ...
down_transform: Callable[[mx.array], mx.array] | None = ...
class LoRAMapping(Protocol):
@staticmethod
def get_mapping() -> List[LoRATarget]: ...

View File

@@ -0,0 +1,9 @@
"""
This type stub file was generated by pyright.
"""
import mlx.nn as nn
class LoRASaver:
@staticmethod
def bake_and_strip_lora(module: nn.Module) -> nn.Module: ...

View File

@@ -0,0 +1,35 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
class LoraTransforms:
@staticmethod
def split_q_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_k_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_v_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_q_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_k_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_v_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_q_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_k_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_v_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_mlp_up(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_q_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_k_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_v_down(tensor: mx.array) -> mx.array: ...
@staticmethod
def split_single_mlp_down(tensor: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,17 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.common.resolution.config_resolution import ConfigResolution
from mflux.models.common.resolution.lora_resolution import LoraResolution
from mflux.models.common.resolution.path_resolution import PathResolution
from mflux.models.common.resolution.quantization_resolution import (
QuantizationResolution,
)
__all__ = [
"ConfigResolution",
"LoraResolution",
"PathResolution",
"QuantizationResolution",
]

View File

@@ -0,0 +1,39 @@
"""
This type stub file was generated by pyright.
"""
from enum import Enum
from typing import NamedTuple
class QuantizationAction(Enum):
NONE = ...
STORED = ...
REQUESTED = ...
class PathAction(Enum):
LOCAL = ...
HUGGINGFACE_CACHED = ...
HUGGINGFACE = ...
ERROR = ...
class LoraAction(Enum):
LOCAL = ...
REGISTRY = ...
HUGGINGFACE_COLLECTION_CACHED = ...
HUGGINGFACE_COLLECTION = ...
HUGGINGFACE_REPO_CACHED = ...
HUGGINGFACE_REPO = ...
ERROR = ...
class ConfigAction(Enum):
EXACT_MATCH = ...
EXPLICIT_BASE = ...
INFER_SUBSTRING = ...
ERROR = ...
class Rule(NamedTuple):
priority: int
name: str
check: str
action: QuantizationAction | PathAction | LoraAction | ConfigAction
...

View File

@@ -0,0 +1,14 @@
"""
This type stub file was generated by pyright.
"""
from typing import TYPE_CHECKING
from mflux.models.common.config.model_config import ModelConfig
if TYPE_CHECKING: ...
logger = ...
class ConfigResolution:
RULES = ...
@staticmethod
def resolve(model_name: str, base_model: str | None = ...) -> ModelConfig: ...

View File

@@ -0,0 +1,21 @@
"""
This type stub file was generated by pyright.
"""
from pathlib import Path
logger = ...
class LoraResolution:
RULES = ...
_registry: dict[str, Path] = ...
@staticmethod
def resolve(path: str) -> str: ...
@staticmethod
def resolve_paths(paths: list[str] | None) -> list[str]: ...
@staticmethod
def resolve_scales(scales: list[float] | None, num_paths: int) -> list[float]: ...
@staticmethod
def get_registry() -> dict[str, Path]: ...
@staticmethod
def discover_files(library_paths: list[Path]) -> dict[str, Path]: ...

View File

@@ -0,0 +1,12 @@
"""
This type stub file was generated by pyright.
"""
from pathlib import Path
logger = ...
class PathResolution:
RULES = ...
@staticmethod
def resolve(path: str | None, patterns: list[str] | None = ...) -> Path | None: ...

View File

@@ -0,0 +1,12 @@
"""
This type stub file was generated by pyright.
"""
logger = ...
class QuantizationResolution:
RULES = ...
@staticmethod
def resolve(
stored: int | None, requested: int | None
) -> tuple[int | None, str | None]: ...

View File

@@ -0,0 +1,26 @@
"""
This type stub file was generated by pyright.
"""
from .flow_match_euler_discrete_scheduler import FlowMatchEulerDiscreteScheduler
from .linear_scheduler import LinearScheduler
from .seedvr2_euler_scheduler import SeedVR2EulerScheduler
__all__ = [
"LinearScheduler",
"FlowMatchEulerDiscreteScheduler",
"SeedVR2EulerScheduler",
]
class SchedulerModuleNotFound(ValueError): ...
class SchedulerClassNotFound(ValueError): ...
class InvalidSchedulerType(TypeError): ...
SCHEDULER_REGISTRY = ...
def register_contrib(scheduler_object, scheduler_name=...): # -> None:
...
def try_import_external_scheduler(
scheduler_object_path: str,
): # -> type[BaseScheduler]:
...

View File

@@ -0,0 +1,16 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from abc import ABC, abstractmethod
class BaseScheduler(ABC):
@property
@abstractmethod
def sigmas(self) -> mx.array: ...
@abstractmethod
def step(
self, noise: mx.array, timestep: int, latents: mx.array, **kwargs
) -> mx.array: ...
def scale_model_input(self, latents: mx.array, t: int) -> mx.array: ...
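A hedged sketch of a concrete scheduler against the `BaseScheduler` interface stubbed above; `LinearSigmaScheduler` is illustrative and not part of mflux, and the linearly spaced sigma schedule is an assumption:

```python
import mlx.core as mx

from mflux.models.common.schedulers.base_scheduler import BaseScheduler


class LinearSigmaScheduler(BaseScheduler):
    """Trivial Euler-style update over a linearly spaced sigma schedule."""

    def __init__(self, num_steps: int) -> None:
        self._sigmas = mx.linspace(1.0, 0.0, num_steps + 1)

    @property
    def sigmas(self) -> mx.array:
        return self._sigmas

    def step(self, noise: mx.array, timestep: int, latents: mx.array, **kwargs) -> mx.array:
        # One Euler step: move the latents along the noise direction by the sigma delta.
        dt = self._sigmas[timestep + 1] - self._sigmas[timestep]
        return latents + dt * noise
```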

View File

@@ -0,0 +1,26 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from typing import TYPE_CHECKING
from mflux.models.common.config.config import Config
from mflux.models.common.schedulers.base_scheduler import BaseScheduler
if TYPE_CHECKING: ...
class FlowMatchEulerDiscreteScheduler(BaseScheduler):
def __init__(self, config: Config) -> None: ...
@property
def sigmas(self) -> mx.array: ...
@property
def timesteps(self) -> mx.array: ...
def set_image_seq_len(self, image_seq_len: int) -> None: ...
@staticmethod
def get_timesteps_and_sigmas(
image_seq_len: int, num_inference_steps: int, num_train_timesteps: int = ...
) -> tuple[mx.array, mx.array]: ...
def step(
self, noise: mx.array, timestep: int, latents: mx.array, **kwargs
) -> mx.array: ...
def scale_model_input(self, latents: mx.array, t: int) -> mx.array: ...

View File

@@ -0,0 +1,20 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from typing import TYPE_CHECKING
from mflux.models.common.config.config import Config
from mflux.models.common.schedulers.base_scheduler import BaseScheduler
if TYPE_CHECKING: ...
class LinearScheduler(BaseScheduler):
def __init__(self, config: Config) -> None: ...
@property
def sigmas(self) -> mx.array: ...
@property
def timesteps(self) -> mx.array: ...
def step(
self, noise: mx.array, timestep: int, latents: mx.array, **kwargs
) -> mx.array: ...

View File

@@ -0,0 +1,20 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from typing import TYPE_CHECKING
from mflux.models.common.config.config import Config
from mflux.models.common.schedulers.base_scheduler import BaseScheduler
if TYPE_CHECKING: ...
class SeedVR2EulerScheduler(BaseScheduler):
def __init__(self, config: Config) -> None: ...
@property
def timesteps(self) -> mx.array: ...
@property
def sigmas(self) -> mx.array: ...
def step(
self, noise: mx.array, timestep: int, latents: mx.array, **kwargs
) -> mx.array: ...

View File

@@ -0,0 +1,24 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.common.tokenizer.tokenizer import (
BaseTokenizer,
LanguageTokenizer,
Tokenizer,
VisionLanguageTokenizer,
)
from mflux.models.common.tokenizer.tokenizer_loader import TokenizerLoader
from mflux.models.common.tokenizer.tokenizer_output import TokenizerOutput
"""
This type stub file was generated by pyright.
"""
__all__ = [
"Tokenizer",
"BaseTokenizer",
"LanguageTokenizer",
"VisionLanguageTokenizer",
"TokenizerLoader",
"TokenizerOutput",
]

View File

@@ -0,0 +1,74 @@
"""
This type stub file was generated by pyright.
"""
from abc import ABC, abstractmethod
from typing import Protocol, runtime_checkable
from PIL import Image
from transformers import PreTrainedTokenizer
from mflux.models.common.tokenizer.tokenizer_output import TokenizerOutput
"""
This type stub file was generated by pyright.
"""
@runtime_checkable
class Tokenizer(Protocol):
tokenizer: PreTrainedTokenizer
def tokenize(
self,
prompt: str | list[str],
images: list[Image.Image] | None = ...,
max_length: int | None = ...,
**kwargs,
) -> TokenizerOutput: ...
class BaseTokenizer(ABC):
def __init__(
self, tokenizer: PreTrainedTokenizer, max_length: int = ...
) -> None: ...
@abstractmethod
def tokenize(
self,
prompt: str | list[str],
images: list[Image.Image] | None = ...,
max_length: int | None = ...,
**kwargs,
) -> TokenizerOutput: ...
class LanguageTokenizer(BaseTokenizer):
def __init__(
self,
tokenizer: PreTrainedTokenizer,
max_length: int = ...,
padding: str = ...,
return_attention_mask: bool = ...,
template: str | None = ...,
use_chat_template: bool = ...,
chat_template_kwargs: dict | None = ...,
add_special_tokens: bool = ...,
) -> None: ...
def tokenize(
self,
prompt: str | list[str],
images: list[Image.Image] | None = ...,
max_length: int | None = ...,
**kwargs,
) -> TokenizerOutput: ...
class VisionLanguageTokenizer(BaseTokenizer):
def __init__(
self,
tokenizer: PreTrainedTokenizer,
processor,
max_length: int = ...,
template: str | None = ...,
image_token: str = ...,
) -> None: ...
def tokenize(
self,
prompt: str | list[str],
images: list[Image.Image] | None = ...,
max_length: int | None = ...,
**kwargs,
) -> TokenizerOutput: ...

View File

@@ -0,0 +1,22 @@
"""
This type stub file was generated by pyright.
"""
from typing import TYPE_CHECKING
from mflux.models.common.tokenizer.tokenizer import BaseTokenizer
from mflux.models.common.weights.loading.weight_definition import TokenizerDefinition
"""
This type stub file was generated by pyright.
"""
if TYPE_CHECKING: ...
class TokenizerLoader:
@staticmethod
def load(definition: TokenizerDefinition, model_path: str) -> BaseTokenizer: ...
@staticmethod
def load_all(
definitions: list[TokenizerDefinition],
model_path: str,
max_length_overrides: dict[str, int] | None = ...,
) -> dict[str, BaseTokenizer]: ...

View File

@@ -0,0 +1,17 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from dataclasses import dataclass
"""
This type stub file was generated by pyright.
"""
@dataclass
class TokenizerOutput:
input_ids: mx.array
attention_mask: mx.array
pixel_values: mx.array | None = ...
image_grid_thw: mx.array | None = ...

View File

@@ -0,0 +1,8 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.common.vae.tiling_config import TilingConfig
from mflux.models.common.vae.vae_tiler import VAETiler
__all__ = ["TilingConfig", "VAETiler"]

View File

@@ -0,0 +1,13 @@
"""
This type stub file was generated by pyright.
"""
from dataclasses import dataclass
@dataclass(frozen=True, slots=True)
class TilingConfig:
vae_decode_tiles_per_dim: int | None = ...
vae_decode_overlap: int = ...
vae_encode_tiled: bool = ...
vae_encode_tile_size: int = ...
vae_encode_tile_overlap: int = ...

View File

@@ -0,0 +1,27 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from typing import Callable
class VAETiler:
@staticmethod
def encode_image_tiled(
*,
image: mx.array,
encode_fn: Callable[[mx.array], mx.array],
latent_channels: int,
tile_size: tuple[int, int] = ...,
tile_overlap: tuple[int, int] = ...,
spatial_scale: int = ...,
) -> mx.array: ...
@staticmethod
def decode_image_tiled(
*,
latent: mx.array,
decode_fn: Callable[[mx.array], mx.array],
tile_size: tuple[int, int] = ...,
tile_overlap: tuple[int, int] = ...,
spatial_scale: int = ...,
) -> mx.array: ...
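A hedged sketch of `VAETiler.decode_image_tiled` with a stand-in `decode_fn`; the latent layout, shapes, and 8x spatial scale are assumptions, only the keyword-only signature comes from the stub.

```python
# Sketch with a dummy decode_fn; the NCHW layout, shapes, and 8x scale are assumptions.
import mlx.core as mx
from mflux.models.common.vae.vae_tiler import VAETiler

latent = mx.zeros((1, 16, 64, 64))  # placeholder latent

def decode_fn(tile: mx.array) -> mx.array:
    # Stand-in for a real VAE decode: map latent channels to RGB at 8x resolution.
    n, _, h, w = tile.shape
    return mx.zeros((n, 3, h * 8, w * 8))

image = VAETiler.decode_image_tiled(
    latent=latent,
    decode_fn=decode_fn,
    tile_size=(32, 32),
    tile_overlap=(8, 8),
    spatial_scale=8,
)
```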

View File

@@ -0,0 +1,17 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from mlx import nn
from mflux.models.common.vae.tiling_config import TilingConfig
class VAEUtil:
@staticmethod
def encode(
vae: nn.Module, image: mx.array, tiling_config: TilingConfig | None = ...
) -> mx.array: ...
@staticmethod
def decode(
vae: nn.Module, latent: mx.array, tiling_config: TilingConfig | None = ...
) -> mx.array: ...

View File

@@ -0,0 +1,18 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.common.weights.loading.loaded_weights import LoadedWeights, MetaData
from mflux.models.common.weights.loading.weight_applier import WeightApplier
from mflux.models.common.weights.loading.weight_definition import ComponentDefinition
from mflux.models.common.weights.loading.weight_loader import WeightLoader
from mflux.models.common.weights.saving.model_saver import ModelSaver
__all__ = [
"ComponentDefinition",
"LoadedWeights",
"MetaData",
"ModelSaver",
"WeightApplier",
"WeightLoader",
]

View File

@@ -0,0 +1,18 @@
"""
This type stub file was generated by pyright.
"""
from dataclasses import dataclass
@dataclass
class MetaData:
quantization_level: int | None = ...
mflux_version: str | None = ...
@dataclass
class LoadedWeights:
components: dict[str, dict]
meta_data: MetaData
def __getattr__(self, name: str) -> dict | None: ...
def num_transformer_blocks(self, component_name: str = ...) -> int: ...
def num_single_transformer_blocks(self, component_name: str = ...) -> int: ...
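`LoadedWeights` groups per-component weight dicts plus metadata, with `__getattr__` apparently forwarding attribute access to `components`; a small sketch with placeholder weights follows (the component name and key layout are assumptions).

```python
# Sketch with placeholder weights; component/key names are assumptions, the attribute
# forwarding and block-count helpers are read off the stub above.
from mflux.models.common.weights.loading.loaded_weights import LoadedWeights, MetaData

weights = LoadedWeights(
    components={"transformer": {"transformer_blocks.0.attn.q_proj.weight": None}},  # placeholder
    meta_data=MetaData(quantization_level=8, mflux_version="0.x"),                  # placeholder
)

print(weights.transformer)                             # presumably resolved via __getattr__
print(weights.num_transformer_blocks("transformer"))   # block count inferred from the keys (assumed)
```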

View File

@@ -0,0 +1,30 @@
"""
This type stub file was generated by pyright.
"""
import mlx.nn as nn
from typing import TYPE_CHECKING
from mflux.models.common.weights.loading.loaded_weights import LoadedWeights
from mflux.models.common.weights.loading.weight_definition import (
ComponentDefinition,
WeightDefinitionType,
)
if TYPE_CHECKING: ...
class WeightApplier:
@staticmethod
def apply_and_quantize_single(
weights: LoadedWeights,
model: nn.Module,
component: ComponentDefinition,
quantize_arg: int | None,
quantization_predicate=...,
) -> int | None: ...
@staticmethod
def apply_and_quantize(
weights: LoadedWeights,
models: dict[str, nn.Module],
quantize_arg: int | None,
weight_definition: WeightDefinitionType,
) -> int | None: ...
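A hedged sketch of the apply-and-quantize path; the model instances are placeholder Ellipses in stub style, and `FluxWeightDefinition` is used only because its import path appears in these stubs.

```python
# Sketch only: model instances are placeholder Ellipses; in real use they would be
# constructed nn.Module components matching the weight definition.
from mflux.models.common.weights.loading.weight_applier import WeightApplier
from mflux.models.common.weights.loading.weight_loader import WeightLoader
from mflux.models.flux.weights.flux_weight_definition import FluxWeightDefinition

weights = WeightLoader.load(FluxWeightDefinition, model_path=None)  # None: resolve remotely (assumed)

bits = WeightApplier.apply_and_quantize(
    weights=weights,
    models={"transformer": ..., "vae": ...},  # placeholders for constructed nn.Module instances
    quantize_arg=8,                           # requested quantization bits (assumed meaning)
    weight_definition=FluxWeightDefinition,
)
```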

View File

@@ -0,0 +1,73 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from dataclasses import dataclass
from typing import Callable, List, TYPE_CHECKING, TypeAlias
from mflux.models.common.weights.mapping.weight_mapping import WeightTarget
from mflux.models.common.tokenizer.tokenizer import BaseTokenizer
from mflux.models.depth_pro.weights.depth_pro_weight_definition import (
DepthProWeightDefinition,
)
from mflux.models.fibo.weights.fibo_weight_definition import FIBOWeightDefinition
from mflux.models.fibo_vlm.weights.fibo_vlm_weight_definition import (
FIBOVLMWeightDefinition,
)
from mflux.models.flux.weights.flux_weight_definition import FluxWeightDefinition
from mflux.models.qwen.weights.qwen_weight_definition import QwenWeightDefinition
from mflux.models.seedvr2.weights.seedvr2_weight_definition import (
SeedVR2WeightDefinition,
)
from mflux.models.z_image.weights.z_image_weight_definition import (
ZImageWeightDefinition,
)
"""
This type stub file was generated by pyright.
"""
if TYPE_CHECKING:
WeightDefinitionType: TypeAlias = type[
FluxWeightDefinition
| FIBOWeightDefinition
| FIBOVLMWeightDefinition
| QwenWeightDefinition
| ZImageWeightDefinition
| SeedVR2WeightDefinition
| DepthProWeightDefinition
]
@dataclass
class ComponentDefinition:
name: str
hf_subdir: str
mapping_getter: Callable[[], List[WeightTarget]] | None = ...
model_attr: str | None = ...
num_blocks: int | None = ...
num_layers: int | None = ...
loading_mode: str = ...
precision: mx.Dtype | None = ...
skip_quantization: bool = ...
bulk_transform: Callable[[mx.array], mx.array] | None = ...
weight_subkey: str | None = ...
download_url: str | None = ...
weight_prefix_filters: List[str] | None = ...
weight_files: List[str] | None = ...
@dataclass
class TokenizerDefinition:
name: str
hf_subdir: str
tokenizer_class: str = ...
fallback_subdirs: List[str] | None = ...
download_patterns: List[str] | None = ...
encoder_class: type[BaseTokenizer] | None = ...
max_length: int = ...
padding: str = ...
template: str | None = ...
use_chat_template: bool = ...
chat_template_kwargs: dict | None = ...
add_special_tokens: bool = ...
processor_class: type | None = ...
image_token: str = ...
chat_template: str | None = ...
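A sketch of a `ComponentDefinition`; the field values are illustrative, and the `mapping_getter` shape follows the `WeightTarget`/`WeightMapping` stubs further down.

```python
# Illustrative ComponentDefinition; names, subdirs, and flags are placeholders,
# only the field set comes from the stub.
from mflux.models.common.weights.loading.weight_definition import ComponentDefinition
from mflux.models.common.weights.mapping.weight_mapping import WeightTarget

def _vae_mapping() -> list[WeightTarget]:
    # Placeholder mapping: copy one decoder weight through unchanged.
    return [WeightTarget(to_pattern="decoder.conv_in.weight", from_pattern=["decoder.conv_in.weight"])]

vae = ComponentDefinition(
    name="vae",
    hf_subdir="vae",
    mapping_getter=_vae_mapping,
    model_attr="vae",        # attribute on the composed model this component binds to (assumed)
    skip_quantization=True,  # e.g. keep the VAE in full precision (illustrative)
)
```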

View File

@@ -0,0 +1,23 @@
"""
This type stub file was generated by pyright.
"""
from typing import TYPE_CHECKING
from mflux.models.common.weights.loading.loaded_weights import LoadedWeights
from mflux.models.common.weights.loading.weight_definition import (
ComponentDefinition,
WeightDefinitionType,
)
if TYPE_CHECKING: ...
logger = ...
class WeightLoader:
@staticmethod
def load_single(
component: ComponentDefinition, repo_id: str, file_pattern: str = ...
) -> LoadedWeights: ...
@staticmethod
def load(
weight_definition: WeightDefinitionType, model_path: str | None = ...
) -> LoadedWeights: ...
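A hedged sketch of `WeightLoader.load_single`, which pulls one component's weights from a Hugging Face repo; the component fields and repo id are placeholders, only the signature is from the stub.

```python
# Sketch: load a single component's weights from a repo; definition and repo id are placeholders.
from mflux.models.common.weights.loading.weight_definition import ComponentDefinition
from mflux.models.common.weights.loading.weight_loader import WeightLoader

vae_component = ComponentDefinition(name="vae", hf_subdir="vae")  # placeholder definition
weights = WeightLoader.load_single(vae_component, repo_id="black-forest-labs/FLUX.1-schnell")  # placeholder repo
print(weights.meta_data)
```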

View File

@@ -0,0 +1,16 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from typing import Dict, List, Optional
from mflux.models.common.weights.mapping.weight_mapping import WeightTarget
class WeightMapper:
@staticmethod
def apply_mapping(
hf_weights: Dict[str, mx.array],
mapping: List[WeightTarget],
num_blocks: Optional[int] = ...,
num_layers: Optional[int] = ...,
) -> Dict: ...

View File

@@ -0,0 +1,23 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol
"""
This type stub file was generated by pyright.
"""
@dataclass
class WeightTarget:
to_pattern: str
from_pattern: List[str]
transform: Optional[Callable[[mx.array], mx.array]] = ...
required: bool = ...
max_blocks: Optional[int] = ...
class WeightMapping(Protocol):
@staticmethod
def get_mapping() -> List[WeightTarget]: ...
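A sketch of a `WeightMapping` implementation; the pattern strings and transform are illustrative, since the real mappings live in each model's weights package.

```python
# Illustrative WeightMapping implementation; patterns and the transform are assumptions.
import mlx.core as mx
from mflux.models.common.weights.mapping.weight_mapping import WeightMapping, WeightTarget

class ToyMapping:
    @staticmethod
    def get_mapping() -> list[WeightTarget]:
        return [
            WeightTarget(
                to_pattern="proj.weight",
                from_pattern=["projection.weight"],
                transform=lambda t: mx.transpose(t),  # e.g. source stores the projection transposed (assumed)
            ),
            WeightTarget(to_pattern="proj.bias", from_pattern=["projection.bias"], required=False),
        ]

mapping: WeightMapping = ToyMapping()  # satisfies the Protocol structurally
```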

View File

@@ -0,0 +1,17 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
class WeightTransforms:
@staticmethod
def reshape_gamma_to_1d(tensor: mx.array) -> mx.array: ...
@staticmethod
def transpose_patch_embed(tensor: mx.array) -> mx.array: ...
@staticmethod
def transpose_conv3d_weight(tensor: mx.array) -> mx.array: ...
@staticmethod
def transpose_conv2d_weight(tensor: mx.array) -> mx.array: ...
@staticmethod
def transpose_conv_transpose2d_weight(tensor: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,14 @@
"""
This type stub file was generated by pyright.
"""
from typing import Any, TYPE_CHECKING
from mflux.models.common.weights.loading.weight_definition import WeightDefinitionType
if TYPE_CHECKING: ...
class ModelSaver:
@staticmethod
def save_model(
model: Any, bits: int, base_path: str, weight_definition: WeightDefinitionType
) -> None: ...
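A hedged sketch of `ModelSaver.save_model`; the model object is a placeholder Ellipsis in stub style, and the output path and bit width are illustrative.

```python
# Sketch only: the model argument is a placeholder; path and bits are illustrative.
from mflux.models.common.weights.saving.model_saver import ModelSaver
from mflux.models.flux.weights.flux_weight_definition import FluxWeightDefinition

ModelSaver.save_model(
    model=...,                    # placeholder: an already-constructed/loaded model
    bits=4,                       # quantization level to record/apply (assumed meaning)
    base_path="/tmp/flux-4bit",   # placeholder output directory
    weight_definition=FluxWeightDefinition,
)
```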

View File

@@ -0,0 +1,9 @@
"""
This type stub file was generated by pyright.
"""
from mflux.models.depth_pro.model.depth_pro_model import DepthProModel
class DepthProInitializer:
@staticmethod
def init(model: DepthProModel, quantize: int | None = ...) -> None: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class FeatureFusionBlock2d(nn.Module):
def __init__(self, num_features: int, deconv: bool = ...) -> None: ...
def __call__(self, x0: mx.array, x1: mx.array | None = ...) -> mx.array: ...

View File

@@ -0,0 +1,17 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class MultiresConvDecoder(nn.Module):
def __init__(self) -> None: ...
def __call__(
self,
x0_latent: mx.array,
x1_latent: mx.array,
x0_features: mx.array,
x1_features: mx.array,
x_global_features: mx.array,
) -> mx.array: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class ResidualBlock(nn.Module):
def __init__(self, num_features: int) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,20 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from dataclasses import dataclass
from pathlib import Path
from PIL import Image
@dataclass
class DepthResult:
depth_image: Image.Image
depth_array: mx.array
min_depth: float
max_depth: float
...
class DepthPro:
def __init__(self, quantize: int | None = ...) -> None: ...
def create_depth_map(self, image_path: str | Path) -> DepthResult: ...
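A hedged sketch of the `DepthPro` convenience class; its import path is an assumption (only `DepthProModel`'s path appears in these stubs), and the image path is a placeholder.

```python
# Sketch only: the DepthPro import path is assumed; the image path is a placeholder.
from mflux.models.depth_pro.depth_pro import DepthPro  # assumed module path

depth = DepthPro(quantize=8)                   # optional quantization per the stub
result = depth.create_depth_map("input.jpg")   # placeholder image path
result.depth_image.save("depth.png")           # DepthResult.depth_image is a PIL Image
print(result.min_depth, result.max_depth)
```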

View File

@@ -0,0 +1,12 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class DepthProModel(nn.Module):
def __init__(self) -> None: ...
def __call__(
self, x0: mx.array, x1: mx.array, x2: mx.array
) -> tuple[mx.array, mx.array]: ...

View File

@@ -0,0 +1,15 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class DepthProUtil:
@staticmethod
def split(x: mx.array, overlap_ratio: float = ...) -> mx.array: ...
@staticmethod
def interpolate(x: mx.array, size=..., scale_factor=...) -> mx.array: ...
@staticmethod
def apply_conv(x: mx.array, conv_module: nn.Module) -> mx.array: ...

View File

@@ -0,0 +1,12 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
from mlx import nn
class Attention(nn.Module):
def __init__(
self, dim: int = ..., head_dim: int = ..., num_heads: int = ...
) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class DinoVisionTransformer(nn.Module):
def __init__(self) -> None: ...
def __call__(self, x: mx.array) -> tuple[mx.array, mx.array, mx.array]: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class LayerScale(nn.Module):
def __init__(self, dims: int, init_values: float = ...) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class MLP(nn.Module):
def __init__(self) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class PatchEmbed(nn.Module):
def __init__(self) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,10 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class TransformerBlock(nn.Module):
def __init__(self) -> None: ...
def __call__(self, x: mx.array) -> mx.array: ...

View File

@@ -0,0 +1,12 @@
"""
This type stub file was generated by pyright.
"""
import mlx.core as mx
import mlx.nn as nn
class DepthProEncoder(nn.Module):
def __init__(self) -> None: ...
def __call__(
self, x0: mx.array, x1: mx.array, x2: mx.array
) -> tuple[mx.array, mx.array, mx.array, mx.array, mx.array]: ...

Some files were not shown because too many files have changed in this diff.