## Motivation
We have a lot of unneeded data in the model card - lets just keep the
necessary stuff and add back more data when we need it
## Test Plan
EXO still runs! (pipeline on 2)
Co-authored-by: rltakashige <rl.takashige@gmail.com>
## Motivation
Add support for GLM-4.7-Flash, a lighter variant of GLM-4.7 with the
`glm4_moe_lite` architecture. These models are smaller and faster while
maintaining good performance.
## Changes
1. **Added 4 new model cards** for GLM-4.7-Flash variants:
- `glm-4.7-flash-4bit` (~18 GB)
- `glm-4.7-flash-5bit` (~21 GB)
- `glm-4.7-flash-6bit` (~25 GB)
- `glm-4.7-flash-8bit` (~32 GB)
All variants have:
- `n_layers`: 47 (vs 91 in GLM-4.7)
- `hidden_size`: 2048 (vs 5120 in GLM-4.7)
- `supports_tensor`: True (native `shard()` method)
2. **Bumped mlx from 0.30.1 to 0.30.3** - required by mlx-lm 0.30.4
3. **Updated mlx-lm from 0.30.2 to 0.30.4** - adds `glm4_moe_lite`
architecture support
4. **Added type ignores** in `auto_parallel.py` for stricter type
annotations in new mlx-lm
5. **Fixed EOS token IDs** for GLM-4.7-Flash - uses different tokenizer
with IDs `[154820, 154827, 154829]` vs other GLM models' `[151336,
151329, 151338]`
6. **Renamed `MLX_IBV_DEVICES` to `MLX_JACCL_DEVICES`** - env var name
changed in new mlx
## Why It Works
The model cards follow the same pattern as existing GLM-4.7 models.
Tensor parallel support is enabled because GLM-4.7-Flash implements the
native `shard()` method in mlx-lm 0.30.4, which is automatically
detected in `auto_parallel.py`.
GLM-4.7-Flash uses a new tokenizer with different special token IDs.
Without the correct EOS tokens, generation wouldn't stop properly.
## Test Plan
### Manual Testing
Tested generation with GLM-4.7-Flash-4bit - now correctly stops at EOS
tokens.
### Automated Testing
- `basedpyright`: 0 errors
- `ruff check`: All checks passed
- `pytest`: 162/162 tests pass (excluding pre-existing
`test_distributed_fix.py` timeout failures)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
The current `NodePerformanceProfile` is a monolithic object where every
update (even 1-second memory updates) replaces the entire profile,
touching unrelated data. Different fields update at vastly different
frequencies:
| Data | Update Frequency |
|------|------------------|
| Memory, System | 1 second |
| Thunderbolt | 5 seconds |
| Network interfaces | 10 seconds |
| Friendly name | 60 seconds |
| Model/Chip ID | Once at startup |
## Changes
Split into separate state mappings so each data type updates
independently:
- `node_identities`: Static and slow-changing data (model_id, chip_id,
friendly_name)
- `node_memory`: RAM and swap usage
- `node_system`: GPU usage, temperature, power, CPU metrics
- `node_network`: Network interface information
- `node_thunderbolt`: Thunderbolt interface identifiers
Added a backwards-compatible `node_profiles` property that reconstructs
`NodePerformanceProfile` from the granular mappings for dashboard
compatibility.
**Files modified:**
- `src/exo/shared/types/profiling.py` - Added `NodeIdentity`,
`NodeNetworkInfo`, `NodeThunderboltInfo` types
- `src/exo/shared/types/state.py` - Added 5 new mappings +
`node_profiles` property
- `src/exo/shared/apply.py` - Updated `apply_node_gathered_info` and
`apply_node_timed_out`
## Why It Works
Each info type now writes only to its specific mapping, avoiding
unnecessary updates to unrelated data. The `MacThunderboltConnections`
handler reads from `node_thunderbolt` instead of the old `node_profiles`
for RDMA connection mapping. The backwards-compatible property ensures
the dashboard continues to work unchanged.
## Test Plan
### Manual Testing
<!-- Hardware: (e.g., MacBook Pro M1 Max 32GB, Mac Mini M2 16GB,
connected via Thunderbolt 4) -->
<!-- What you did: -->
- Start exo and verify dashboard shows node info
- Verify memory/GPU updates stream correctly
- Check that node timeout properly cleans up all mappings
### Automated Testing
- All 162 existing tests pass
- basedpyright: 0 errors
- ruff check: All checks passed
- nix fmt: Applied
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
Information gathering is tightly coupled to MacMon - we should start
generalizing our information sources so we can add more in future.
## Changes
Added a new system to gather any information. Currently, it is attached
to the Worker - though this is mostly to keep the data processing logic
simple. It could be made independent quite easily.
I also refactored topology to include different kinds of connections as
we can gather RDMA connections without having a pre-existing socket
connection, and made the relevant placement updates. We should no longer
need the network locations script in the app.
Other sources of information now include:
- static node information like "model" and "chip" (macos, "Unknown"
fallback)
- device friendly name (macos, falls back to device hostname)
- network interfaces + ips (cross platform)
- thunderbolt interfaces (macos)
- thunderbolt connections (macos)
- RAM usage (cross platform)
- per-device configuration written to EXO_HOME/config.toml
## Limitations
Model and Chip are not cross platform concepts.
We do not differentiate between unified and non-unified memory systems.
A lot of this data collection is based on simple timers. Watching the SC
store on macos is the correct way to gather some of this information,
but requires a detour into rust for macos.
## Why It Works
The InfoGatherer is a generic subsystem which returns a union of metric
datatypes. It writes them to an event, which is applied to state. It is
currently re-spawned with the worker so each cluster receives the
correct information.
As for topology, macOS identifies TB ports with a uuid in
SPThunderboltDataType, and also stores remote uuids if it can find them.
These changes read that data with the system_profiler, hopefully not so
often as to cause notable performance impacts (though this should be
tuned) but frequently enough for moderate responsiveness.
As we can identify TB connections between devices without needing ips
attached to each interface, we can remove the network setup script
(almost) completely.
## Test Plan
### Manual Testing
Spawn RDMA instances without enabling DHCP on the RDMA interfaces.
### Automated Testing
Updated the current master and shared tests to cover the topology
refactor and new events.
---------
Co-authored-by: Sami Khan <smsak99@gmail.com>
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
Co-authored-by: Jake Hillion <jake@hillion.co.uk>
## Motivation
When models output LaTeX-formatted math proofs, the dashboard was not
rendering them correctly. Issues included:
- `\documentclass`, `\begin{document}`, `\usepackage` showing as raw
text
- `$...$` inline math with complex expressions (like `\frac`, `\ldots`)
not rendering due to markdown escaping backslashes
- `\begin{align*}...\end{align*}` and other math environments showing as
raw text
- `\emph{...}`, `\textbf{...}` LaTeX formatting commands not being
converted
- `$\require{...}$` (MathJax-specific) causing KaTeX errors
- `\begin{proof}...\end{proof}` showing as raw text
## Changes
Enhanced `MarkdownContent.svelte` with comprehensive LaTeX support:
**Math extraction before markdown processing:**
- Extract `$...$`, `$$...$$`, `\(...\)`, `\[...\]` into placeholders
before markdown processes the text
- Use alphanumeric placeholders (`MATHPLACEHOLDERINLINE0END`) that won't
be interpreted as HTML tags
- Restore and render with KaTeX after markdown processing
**LaTeX document command removal:**
- Strip `\documentclass{...}`, `\usepackage{...}`, `\begin{document}`,
`\end{document}`
- Strip `\maketitle`, `\title{...}`, `\author{...}`, `\date{...}`
- Strip `\require{...}` (MathJax-specific, not KaTeX)
- Replace `tikzpicture` environments with `[diagram]` placeholder
- Strip `\label{...}` cross-reference commands
**LaTeX math environments:**
- Convert `\begin{align*}`, `\begin{equation}`, `\begin{gather}`, etc.
to display math blocks
**LaTeX text formatting:**
- `\emph{...}` and `\textit{...}` → `<em>...</em>`
- `\textbf{...}` → `<strong>...</strong>`
- `\texttt{...}` → `<code>...</code>`
- `\underline{...}` → `<u>...</u>`
**LaTeX environments styling:**
- `\begin{proof}...\end{proof}` → styled proof block with QED symbol
- `\begin{theorem}`, `\begin{lemma}`, etc. → styled theorem blocks
**Display math enhancements:**
- Wrapped in styled container with subtle gold border
- "LaTeX" label and copy button appear on hover
- Dark theme KaTeX color overrides for better readability
- Custom scrollbar for overflow
## Why It Works
The key insight is that markdown processing was escaping backslashes in
LaTeX before KaTeX could see them. By extracting all math expressions
into alphanumeric placeholders *before* markdown runs, then restoring
them *after*, the LaTeX content passes through to KaTeX unmodified.
Using purely alphanumeric placeholders like `MATHPLACEHOLDERINLINE0END`
instead of `<<MATH_INLINE_0>>` prevents markdown from interpreting them
as HTML tags and stripping them.
## Test Plan
### Manual Testing
- Hardware: Any machine with the dashboard
- What you did:
- Ask model to "write a proof in latex"
- Verify inline math like `$x \in S$` renders correctly
- Verify display math like `\begin{align*}...\end{align*}` renders as
block
- Verify `\documentclass`, `\begin{document}` are stripped (not shown)
- Verify `\emph{...}` converts to italics
- Verify copy button works on display math blocks
- Test edge cases: `$5` (currency) stays as text, `\$50` (escaped)
becomes `$50`
Before:
<img width="799" height="637" alt="Screenshot 2026-01-19 at 11 51 22 AM"
src="https://github.com/user-attachments/assets/62a705b8-b3c2-47b8-afd0-5d0c1b240e44"
/>
After:
<img width="809" height="642" alt="Screenshot 2026-01-19 at 11 46 58 AM"
src="https://github.com/user-attachments/assets/4f35fa1d-333c-4285-bc68-58a50f8f148e"
/>
### Automated Testing
- Dashboard builds successfully with `npm run build`
- Existing functionality preserved
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
https://github.com/exo-explore/exo/issues/1070
## Changes
Added check in ChatForm.svelte to reset selectedChatModel when it no
longer matches any running instance.
## Why It Works
The $effect now detects when the selected model is stale (not in
availableModels()) and resets to the first available model.
## Test Plan
### Manual Testing
1. Create instance of Model A → Delete it → Create instance of Model B →
Chat
2. Verify request goes to Model B (not Model A)
---------
Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>
Continue working towards a fully Nix based build by building the
dashboard with Nix. Continuing the theme of using the existing lock
files, use dream2nix to parse the lock file and build the tree of
dependency derivations.
dream2nix doesn't like the bundleDependencies, so we apply a small patch
to the lock file that drops all dependencies that are bundled. This
should ideally be contributed upstream but that can be done later.
Use this new dashboard build in the build-app CI workflow, meaning
future macOS apps will include this reproducible dashboard.
Test plan:
- Built a DMG, shipped to a cluster, loaded in a browser with no cache
and the dashboard looks good.
- Directory layout is as expected:
```
$ nix build .#dashboard
$ find result/
...
result/_app/immutable/entry
result/_app/immutable/entry/app.CTPAnMjf.js
result/_app/immutable/entry/start.fUSEa-2O.js
result/_app/immutable/nodes
result/_app/immutable/nodes/3.DqQr1Obm.js
result/_app/immutable/nodes/0.DgEY44RO.js
result/_app/immutable/nodes/2.BjZg_lJh.js
result/_app/immutable/nodes/1.D6vGUYYT.js
result/_app/env.js
result/_app/version.json
result/exo-logo.png
result/favicon.ico
result/index.html
```
The downloads dashboard showed "Completed" for finished model downloads
but provided no indication of how much disk space each model or the
total models on a node were using.
Added total_bytes field to DownloadCompleted type so the size is
preserved when a download completes. Updated the dashboard to display
the model size next to "Completed" status (e.g., "Completed (251.1GB)")
and a total disk usage line below the model count for each node (e.g.,
"502.2GB on disk").
Test plan:
- Ran unit tests for download apply and planning logic
- Type checked all modified files with basedpyright
## Motivation
The "UNKNOWN" status shown when first launching an instance is confusing
and unhelpful. "PREPARING" better describes what's actually happening.

## Changes
- Renamed status from "UNKNOWN" to "PREPARING" in dashboard
(+page.svelte)
- Renamed unknown state to preparing in macOS app
(InstanceViewModel.swift, InstanceRowView.swift)
## Why It Works
The status appears when an instance exists but runners haven't reported
status yet. "PREPARING" accurately describes this transitional state.
## Test Plan
### Manual Testing
Hardware: MacBook Pro
<img width="319" height="200" alt="image"
src="https://github.com/user-attachments/assets/9a1c3caf-026d-47ea-80d1-63c6e41d93aa"
/>
### Automated Testing
N/A
Add typescript auto formatting with Prettier and treefmt-nix. Added a
.prettierrc to useTabs, which isn't the default, to reduce churn. The
rest looks okay and will be checked by CI.
Test plan:
- CI
## Motivation
Saves the last launch settings, so that the next time you run exo it
will default to the same launch settings.
This is just a small quality of life improvement.
## Changes
When you launch it saves the settings to the web browser local storage.
When it fills out the model list, it reads the settings and sets the
default.
I reviewed, tested and edited the code, but some of the code was written
by Claude Opus. I hope that's ok.
## Why It Works
See above
## Test Plan
### Manual Testing
I have two Macbook Studio M3 Ultras, each with 512Gb ram, connected with
Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split.
I ran exo multiple times to confirm that the default works.
### Automated Testing
No changes to automated testing.
## Problem
When typing in Chinese (or other IME-based languages like
Japanese/Korean), pressing Enter to select a character from the IME
candidate list would incorrectly submit the message instead of
confirming the character selection.
## Solution
Added IME composition state detection in the `handleKeydown` function in
`ChatForm.svelte`:
- Check `event.isComposing` to detect active IME composition
- Fallback to `event.keyCode === 229` for broader browser compatibility
- Return early when IME is active, allowing normal character selection
## Changes
- Modified `dashboard/src/lib/components/ChatForm.svelte`
- Added IME composition check before Enter key handling
Co-authored-by: Ricky Chen <rickychen@Rickys-MacBook-Pro.local>
## Motivation
Users need to know what **prerequisites** they need in order to run exo.
Simple addition to docs prevents future raised issues.
## Changes
Updated ``README.md``:
- to include installation instructions for
**[uv](https://github.com/astral-sh/uv)** and
**[macmon](https://github.com/vladkens/macmon)**.
Updated ``CONTRIBUTING.md``:
- to verify these prerequisites are met before starting development.
- Standardized on brew installation instructions for macOS users to keep
the guide simple.
## Why It Works
By listing these prerequisites upfront, users will set up their
environment correctly before attempting to run exo.
## Test Plan
### Manual Testing
MacBook Pro M4
- Verified that ``uv`` and ``macmon`` were missing initially, causing
failures
- after installing them via brew (as documented), uv run exo starts
successfully.
### Automated Testing
<!-- Describe changes to automated tests, or how existing tests cover
this change -->
<!-- - -->
---------
Co-authored-by: Evan Quiney <evanev7@gmail.com>