mirror/exo

mirror of https://github.com/exo-explore/exo.git synced 2026-01-30 16:51:02 -05:00

Author	SHA1	Message	Date
Evan Quiney	d4f551c602	Simplify model cards (#1204) ## Motivation We have a lot of unneeded data in the model card - let's just keep the necessary stuff and add back more data when we need it ## Test Plan EXO still runs! (pipeline on 2) Co-authored-by: rltakashige <rl.takashige@gmail.com>	2026-01-20 11:01:19 +00:00
Alex Cheema	176ab5ba40	Add GLM-4.7-Flash model cards (4bit, 5bit, 6bit, 8bit) (#1214 ) ## Motivation Add support for GLM-4.7-Flash, a lighter variant of GLM-4.7 with the `glm4_moe_lite` architecture. These models are smaller and faster while maintaining good performance. ## Changes 1. Added 4 new model cards for GLM-4.7-Flash variants: - `glm-4.7-flash-4bit` (~18 GB) - `glm-4.7-flash-5bit` (~21 GB) - `glm-4.7-flash-6bit` (~25 GB) - `glm-4.7-flash-8bit` (~32 GB) All variants have: - `n_layers`: 47 (vs 91 in GLM-4.7) - `hidden_size`: 2048 (vs 5120 in GLM-4.7) - `supports_tensor`: True (native `shard()` method) 2. Bumped mlx from 0.30.1 to 0.30.3 - required by mlx-lm 0.30.4 3. Updated mlx-lm from 0.30.2 to 0.30.4 - adds `glm4_moe_lite` architecture support 4. Added type ignores in `auto_parallel.py` for stricter type annotations in new mlx-lm 5. Fixed EOS token IDs for GLM-4.7-Flash - uses different tokenizer with IDs `[154820, 154827, 154829]` vs other GLM models' `[151336, 151329, 151338]` 6. Renamed `MLX_IBV_DEVICES` to `MLX_JACCL_DEVICES` - env var name changed in new mlx ## Why It Works The model cards follow the same pattern as existing GLM-4.7 models. Tensor parallel support is enabled because GLM-4.7-Flash implements the native `shard()` method in mlx-lm 0.30.4, which is automatically detected in `auto_parallel.py`. GLM-4.7-Flash uses a new tokenizer with different special token IDs. Without the correct EOS tokens, generation wouldn't stop properly. ## Test Plan ### Manual Testing Tested generation with GLM-4.7-Flash-4bit - now correctly stops at EOS tokens. ### Automated Testing - `basedpyright`: 0 errors - `ruff check`: All checks passed - `pytest`: 162/162 tests pass (excluding pre-existing `test_distributed_fix.py` timeout failures) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-20 03:58:09 +00:00
Alex Cheema	ee43b598fe	Split NodePerformanceProfile into granular state mappings (#1209) ## Motivation The current `NodePerformanceProfile` is a monolithic object where every update (even 1-second memory updates) replaces the entire profile, touching unrelated data. Different fields update at vastly different frequencies: \| Data \| Update Frequency \| \|------\|------------------\| \| Memory, System \| 1 second \| \| Thunderbolt \| 5 seconds \| \| Network interfaces \| 10 seconds \| \| Friendly name \| 60 seconds \| \| Model/Chip ID \| Once at startup \| ## Changes Split into separate state mappings so each data type updates independently: - `node_identities`: Static and slow-changing data (model_id, chip_id, friendly_name) - `node_memory`: RAM and swap usage - `node_system`: GPU usage, temperature, power, CPU metrics - `node_network`: Network interface information - `node_thunderbolt`: Thunderbolt interface identifiers Added a backwards-compatible `node_profiles` property that reconstructs `NodePerformanceProfile` from the granular mappings for dashboard compatibility. Files modified: - `src/exo/shared/types/profiling.py` - Added `NodeIdentity`, `NodeNetworkInfo`, `NodeThunderboltInfo` types - `src/exo/shared/types/state.py` - Added 5 new mappings + `node_profiles` property - `src/exo/shared/apply.py` - Updated `apply_node_gathered_info` and `apply_node_timed_out` ## Why It Works Each info type now writes only to its specific mapping, avoiding unnecessary updates to unrelated data. The `MacThunderboltConnections` handler reads from `node_thunderbolt` instead of the old `node_profiles` for RDMA connection mapping. The backwards-compatible property ensures the dashboard continues to work unchanged. ## Test Plan ### Manual Testing - Start exo and verify dashboard shows node info - Verify memory/GPU updates stream correctly - Check that node timeout properly cleans up all mappings ### Automated Testing - All 162 existing tests pass - basedpyright: 0 errors - ruff check: All checks passed - nix fmt: Applied 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 18:24:15 +00:00
Evan Quiney	2202685c3e	refactor all information sources (including ipless rdma discovery) (#928) ## Motivation Information gathering is tightly coupled to MacMon - we should start generalizing our information sources so we can add more in future. ## Changes Added a new system to gather any information. Currently, it is attached to the Worker - though this is mostly to keep the data processing logic simple. It could be made independent quite easily. I also refactored topology to include different kinds of connections as we can gather RDMA connections without having a pre-existing socket connection, and made the relevant placement updates. We should no longer need the network locations script in the app. Other sources of information now include: - static node information like "model" and "chip" (macOS, "Unknown" fallback) - device friendly name (macOS, falls back to device hostname) - network interfaces + IPs (cross platform) - thunderbolt interfaces (macOS) - thunderbolt connections (macOS) - RAM usage (cross platform) - per-device configuration written to EXO_HOME/config.toml ## Limitations Model and Chip are not cross platform concepts. We do not differentiate between unified and non-unified memory systems. A lot of this data collection is based on simple timers. Watching the SC store on macOS is the correct way to gather some of this information, but requires a detour into Rust for macOS. ## Why It Works The InfoGatherer is a generic subsystem which returns a union of metric datatypes. It writes them to an event, which is applied to state. It is currently re-spawned with the worker so each cluster receives the correct information. As for topology, macOS identifies TB ports with a uuid in SPThunderboltDataType, and also stores remote uuids if it can find them. These changes read that data with the system_profiler, hopefully not so often as to cause notable performance impacts (though this should be tuned) but frequently enough for moderate responsiveness. As we can identify TB connections between devices without needing IPs attached to each interface, we can remove the network setup script (almost) completely. ## Test Plan ### Manual Testing Spawn RDMA instances without enabling DHCP on the RDMA interfaces. ### Automated Testing Updated the current master and shared tests to cover the topology refactor and new events. --------- Co-authored-by: Sami Khan <smsak99@gmail.com> Co-authored-by: Alex Cheema <alexcheema123@gmail.com> Co-authored-by: Jake Hillion <jake@hillion.co.uk>	2026-01-19 16:58:09 +00:00
Alex Cheema	346b13e2c9	Enhance LaTeX rendering in dashboard markdown (#1197) ## Motivation When models output LaTeX-formatted math proofs, the dashboard was not rendering them correctly. Issues included: - `\documentclass`, `\begin{document}`, `\usepackage` showing as raw text - `$...$` inline math with complex expressions (like `\frac`, `\ldots`) not rendering due to markdown escaping backslashes - `\begin{align}...\end{align}` and other math environments showing as raw text - `\emph{...}`, `\textbf{...}` LaTeX formatting commands not being converted - `$\require{...}$` (MathJax-specific) causing KaTeX errors - `\begin{proof}...\end{proof}` showing as raw text ## Changes Enhanced `MarkdownContent.svelte` with comprehensive LaTeX support: Math extraction before markdown processing: - Extract `$...$`, `$$...$$`, `$...$`, `\[...\]` into placeholders before markdown processes the text - Use alphanumeric placeholders (`MATHPLACEHOLDERINLINE0END`) that won't be interpreted as HTML tags - Restore and render with KaTeX after markdown processing LaTeX document command removal: - Strip `\documentclass{...}`, `\usepackage{...}`, `\begin{document}`, `\end{document}` - Strip `\maketitle`, `\title{...}`, `\author{...}`, `\date{...}` - Strip `\require{...}` (MathJax-specific, not KaTeX) - Replace `tikzpicture` environments with `[diagram]` placeholder - Strip `\label{...}` cross-reference commands LaTeX math environments: - Convert `\begin{align}`, `\begin{equation}`, `\begin{gather}`, etc. to display math blocks LaTeX text formatting: - `\emph{...}` and `\textit{...}` → `<em>...</em>` - `\textbf{...}` → `<strong>...</strong>` - `\texttt{...}` → `<code>...</code>` - `\underline{...}` → `<u>...</u>` LaTeX environments styling: - `\begin{proof}...\end{proof}` → styled proof block with QED symbol - `\begin{theorem}`, `\begin{lemma}`, etc. → styled theorem blocks Display math enhancements: - Wrapped in styled container with subtle gold border - "LaTeX" label and copy button appear on hover - Dark theme KaTeX color overrides for better readability - Custom scrollbar for overflow ## Why It Works The key insight is that markdown processing was escaping backslashes in LaTeX before KaTeX could see them. By extracting all math expressions into alphanumeric placeholders before markdown runs, then restoring them after, the LaTeX content passes through to KaTeX unmodified. Using purely alphanumeric placeholders like `MATHPLACEHOLDERINLINE0END` instead of `<<MATH_INLINE_0>>` prevents markdown from interpreting them as HTML tags and stripping them. ## Test Plan ### Manual Testing - Hardware: Any machine with the dashboard - What you did: - Ask model to "write a proof in latex" - Verify inline math like `$x \in S$` renders correctly - Verify display math like `\begin{align}...\end{align}` renders as block - Verify `\documentclass`, `\begin{document}` are stripped (not shown) - Verify `\emph{...}` converts to italics - Verify copy button works on display math blocks - Test edge cases: `$5` (currency) stays as text, `\$50` (escaped) becomes `$50` Before: <img width="799" height="637" alt="Screenshot 2026-01-19 at 11 51 22 AM" src="https://github.com/user-attachments/assets/62a705b8-b3c2-47b8-afd0-5d0c1b240e44" /> After: <img width="809" height="642" alt="Screenshot 2026-01-19 at 11 46 58 AM" src="https://github.com/user-attachments/assets/4f35fa1d-333c-4285-bc68-58a50f8f148e" /> ### Automated Testing - Dashboard builds successfully with `npm run build` - Existing functionality preserved 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-19 14:50:41 +00:00
Sami Khan	6e6567a802	resolve issue #1070 (#1076 ) ## Motivation https://github.com/exo-explore/exo/issues/1070 ## Changes Added check in ChatForm.svelte to reset selectedChatModel when it no longer matches any running instance. ## Why It Works The $effect now detects when the selected model is stale (not in availableModels()) and resets to the first available model. ## Test Plan ### Manual Testing 1. Create instance of Model A → Delete it → Create instance of Model B → Chat 2. Verify request goes to Model B (not Model A) --------- Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>	2026-01-15 20:00:41 +00:00
Evan Quiney	c22dad8a7d	dashboard: add peer: true to package lock (#1162 ) this happens every time i run npm install - lets upstream it ## testing dashboard builds and renders	2026-01-15 17:01:43 +00:00
Jake Hillion	3671528fa4	nix: add dashboard build with dream2nix Continue working towards a fully Nix based build by building the dashboard with Nix. Continuing the theme of using the existing lock files, use dream2nix to parse the lock file and build the tree of dependency derivations. dream2nix doesn't like the bundleDependencies, so we apply a small patch to the lock file that drops all dependencies that are bundled. This should ideally be contributed upstream but that can be done later. Use this new dashboard build in the build-app CI workflow, meaning future macOS apps will include this reproducible dashboard. Test plan: - Built a DMG, shipped to a cluster, loaded in a browser with no cache and the dashboard looks good. - Directory layout is as expected: ``` $ nix build .#dashboard $ find result/ ... result/_app/immutable/entry result/_app/immutable/entry/app.CTPAnMjf.js result/_app/immutable/entry/start.fUSEa-2O.js result/_app/immutable/nodes result/_app/immutable/nodes/3.DqQr1Obm.js result/_app/immutable/nodes/0.DgEY44RO.js result/_app/immutable/nodes/2.BjZg_lJh.js result/_app/immutable/nodes/1.D6vGUYYT.js result/_app/env.js result/_app/version.json result/exo-logo.png result/favicon.ico result/index.html ```	2026-01-14 15:58:16 +01:00
Jake Hillion	8d7b6789b3	dashboard: show disk usage for completed models The downloads dashboard showed "Completed" for finished model downloads but provided no indication of how much disk space each model or the total models on a node were using. Added total_bytes field to DownloadCompleted type so the size is preserved when a download completes. Updated the dashboard to display the model size next to "Completed" status (e.g., "Completed (251.1GB)") and a total disk usage line below the model count for each node (e.g., "502.2GB on disk"). Test plan: - Ran unit tests for download apply and planning logic - Type checked all modified files with basedpyright	2026-01-12 16:34:29 +01:00
Sami Khan	d1e88def42	scrollbars fixed (#1113 ) ## Motivation Fixes https://github.com/exo-explore/exo/issues/1107 - Horizontal scrollbar always appears in instances section, and vertical scrollbar appears too early (with just 1-2 instances on large screens). ## Changes - Added overflow-x-hidden to remove horizontal scrollbar - Added xl:max-h-96 for responsive vertical height (384px on xl+ screens vs 288px default) - Added py-px to accommodate corner accent decorations that extend 1px outside cards ## Why It Works - overflow-x-hidden prevents horizontal scroll regardless of content - Larger max-height on xl screens fits 2 instances without scrollbar; 3rd triggers it - 1px vertical padding accommodates the -top-px/-bottom-px positioned corner accents that caused tiny overflow ## Test Plan ### Manual Testing <img width="1190" height="868" alt="image" src="https://github.com/user-attachments/assets/2a582328-5b4f-4490-a488-52106f2e85ef" /> ### Automated Testing N/A	2026-01-09 12:51:05 +00:00
Sami Khan	59e7594e34	UNKNOWN to PREPARING (#1112 ) ## Motivation The "UNKNOWN" status shown when first launching an instance is confusing and unhelpful. "PREPARING" better describes what's actually happening. ![telegram-cloud-photo-size-4-5981245965962251168-x](https://github.com/user-attachments/assets/65b0802b-fb64-4fa7-bff7-c13757035b3a) ## Changes - Renamed status from "UNKNOWN" to "PREPARING" in dashboard (+page.svelte) - Renamed unknown state to preparing in macOS app (InstanceViewModel.swift, InstanceRowView.swift) ## Why It Works The status appears when an instance exists but runners haven't reported status yet. "PREPARING" accurately describes this transitional state. ## Test Plan ### Manual Testing Hardware: MacBook Pro <img width="319" height="200" alt="image" src="https://github.com/user-attachments/assets/9a1c3caf-026d-47ea-80d1-63c6e41d93aa" /> ### Automated Testing N/A	2026-01-09 11:46:51 +00:00
Jake Hillion	383309e24e	fmt: add typescript formatting Add typescript auto formatting with Prettier and treefmt-nix. Added a .prettierrc to useTabs, which isn't the default, to reduce churn. The rest looks okay and will be checked by CI. Test plan: - CI	2026-01-08 13:47:27 +00:00
Drifter4242	47b8e0ce12	feat: remember last launch settings (model, sharding, instance type) (#1028) ## Motivation Saves the last launch settings, so that the next time you run exo it will default to the same launch settings. This is just a small quality of life improvement. ## Changes When you launch, it saves the settings to the web browser's local storage. When it fills out the model list, it reads the settings and sets the default. I reviewed, tested and edited the code, but some of the code was written by Claude Opus. I hope that's ok. ## Why It Works See above ## Test Plan ### Manual Testing I have two Mac Studio M3 Ultras, each with 512GB RAM, connected with Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split. I ran exo multiple times to confirm that the default works. ### Automated Testing No changes to automated testing.	2026-01-05 11:27:14 +00:00
RickyChen / 陳昭儒	844bcc7ce6	fix: prevent form submission during IME composition (#1069 ) ## Problem When typing in Chinese (or other IME-based languages like Japanese/Korean), pressing Enter to select a character from the IME candidate list would incorrectly submit the message instead of confirming the character selection. ## Solution Added IME composition state detection in the `handleKeydown` function in `ChatForm.svelte`: - Check `event.isComposing` to detect active IME composition - Fallback to `event.keyCode === 229` for broader browser compatibility - Return early when IME is active, allowing normal character selection ## Changes - Modified `dashboard/src/lib/components/ChatForm.svelte` - Added IME composition check before Enter key handling Co-authored-by: Ricky Chen <rickychen@Rickys-MacBook-Pro.local>	2025-12-31 17:11:04 +00:00
Evan Quiney	9d9e24f969	some dashboard updates (#1017 ) Mostly @samiamjidkhan and @AlexCheema's work in progress. --------- Co-authored-by: Sami Khan <smsak99@gmail.com> Co-authored-by: Alex Cheema	2025-12-28 20:50:23 +00:00
Nightguarder	1e75aeb2c2	Add Prerequisites to Readme (#936 ) ## Motivation Users need to know what prerequisites they need in order to run exo. Simple addition to docs prevents future raised issues. ## Changes Updated ``README.md``: - to include installation instructions for [uv](https://github.com/astral-sh/uv) and [macmon](https://github.com/vladkens/macmon). Updated ``CONTRIBUTING.md``: - to verify these prerequisites are met before starting development. - Standardized on brew installation instructions for macOS users to keep the guide simple. ## Why It Works By listing these prerequisites upfront, users will set up their environment correctly before attempting to run exo. ## Test Plan ### Manual Testing MacBook Pro M4 - Verified that ``uv`` and ``macmon`` were missing initially, causing failures - after installing them via brew (as documented), uv run exo starts successfully. ### Automated Testing <!-- Describe changes to automated tests, or how existing tests cover this change --> <!-- - --> --------- Co-authored-by: Evan Quiney <evanev7@gmail.com>	2025-12-22 02:28:08 +00:00
Evan Quiney	9815283a82	8000 -> 52415 (#915 ) * 8000 -> 52415 * dont grab the api port for placement --------- Co-authored-by: rltakashige <rl.takashige@gmail.com>	2025-12-18 18:39:44 +00:00
Evan Quiney	09593c5e85	backport the dashboard to staging	2025-12-17 12:22:22 +00:00
Alex Cheema	20d73e90cd	fix dashboard case sensitive model id	2025-11-26 18:16:32 +00:00
Alex Cheema	e56daa7c23	render download progress properly	2025-11-26 11:48:30 +00:00
rltakashige	28a91787e8	Demo Co-authored-by: Evan <evanev7@gmail.com> Co-authored-by: Alex Cheema <alexcheema123@gmail.com>	2025-11-20 20:03:51 +00:00
Evan Quiney	aa519b8c03	Worker refactor Co-authored-by: rltakashige <rl.takashige@gmail.com> Co-authored-by: Alex Cheema <alexcheema123@gmail.com>	2025-11-10 23:31:53 +00:00
Alex Cheema	e60681963f	show ips on dashboard	2025-11-06 19:18:07 +00:00
rltakashige	16f724e24c	Update staging 14 Co-authored-by: Evan <evanev7@gmail.com> Co-authored-by: Alex Cheema <alexcheema123@gmail.com> Co-authored-by: David Munha Canas Correia <dmunha@MacBook-David.local> Co-authored-by: github-actions bot <github-actions@users.noreply.github.com>	2025-11-05 01:44:24 +00:00
Alex Cheema	a346af3477	download fixes	2025-10-22 11:56:52 +01:00
Evan Quiney	1c6b5ce911	new tagged union Co-authored-by: Alex Cheema <alexcheema123@gmail.com> Sorry Andrei!	2025-10-10 16:22:09 +01:00
Alex Cheema	84dfc8a738	Fast memory profiling Co-authored-by: Evan <evanev7@gmail.com>	2025-10-07 16:23:51 +01:00
Evan Quiney	38ff949bf4	big refactor Fix. Everything. Co-authored-by: Andrei Cravtov <the.andrei.cravtov@gmail.com> Co-authored-by: Matt Beton <matthew.beton@gmail.com> Co-authored-by: Alex Cheema <alexcheema123@gmail.com> Co-authored-by: Seth Howes <sethshowes@gmail.com>	2025-09-30 11:03:04 +01:00
Matt Beton	35c4311587	Dashboard Status & Bugfixes	2025-08-29 17:34:17 +01:00
Alex Cheema	5bfc99b415	add EXO logo to dashboard	2025-08-25 16:41:13 +01:00
Seth Howes	407796d18f	Minor dashboard fixes	2025-08-04 06:15:01 +08:00
Matt Beton	1fe4ed3442	Worker Exception & Timeout Refactor Co-authored-by: Gelu Vrabie <gelu@exolabs.net> Co-authored-by: Alex Cheema <alexcheema123@gmail.com> Co-authored-by: Seth Howes <sethshowes@gmail.com>	2025-08-02 08:28:37 -07:00
Seth Howes	71bafabc63	Dashboard with instances	2025-08-01 14:38:07 +01:00
Seth Howes	3f192f20cc	Reinstate dashboard	2025-07-28 23:18:23 +01:00
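
Note on 346b13e2c9 (Enhance LaTeX rendering in dashboard markdown): the commit message describes extracting math into purely alphanumeric placeholders before markdown runs and restoring it afterwards. The real change lives in `MarkdownContent.svelte`; the Python below is only a rough sketch of that idea. The regex, the function names `extract_math` and `restore_math`, and the `render` callback (standing in for KaTeX) are all assumptions made for illustration, and only `$...$` and `$$...$$` are handled.

```python
import re

# Assumed pattern, not the dashboard's: $$...$$ display math first, then
# $...$ inline math that is not escaped and has no space just inside the
# delimiters, so a lone "$5" or an escaped "\$50" is left alone.
MATH_RE = re.compile(
    r"\$\$(.+?)\$\$|(?<!\\)\$(?=\S)([^$\n]+?)(?<=\S)\$",
    re.DOTALL,
)
PLACEHOLDER_RE = re.compile(r"MATHPLACEHOLDER(INLINE|DISPLAY)(\d+)END")


def extract_math(text):
    """Swap math spans for alphanumeric placeholders; return (text, stash)."""
    stash = []

    def repl(match):
        display = match.group(1) is not None
        kind = "DISPLAY" if display else "INLINE"
        stash.append((kind, match.group(1) if display else match.group(2)))
        return f"MATHPLACEHOLDER{kind}{len(stash) - 1}END"

    return MATH_RE.sub(repl, text), stash


def restore_math(text, stash, render):
    """Replace each placeholder with render(latex_source, is_display)."""

    def repl(match):
        kind, body = stash[int(match.group(2))]
        return render(body, kind == "DISPLAY")

    return PLACEHOLDER_RE.sub(repl, text)


masked, stash = extract_math(r"Let $x \in S$ and $$a = \frac{1}{2}$$.")
# A markdown renderer would run on `masked` here; the backslashes are
# already out of its reach.
html = restore_math(masked, stash, lambda body, display: f"<math>{body}</math>")
```

Because the placeholders contain only letters and digits, a markdown pass has nothing to escape or strip in them, which is the property the commit message relies on.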

34 Commits
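
Note on ee43b598fe (Split NodePerformanceProfile into granular state mappings): the commit message describes per-topic mappings plus a backwards-compatible `node_profiles` property that rebuilds the combined profile. The sketch below shows that shape with only two of the five mappings. The field layouts, the `NodeMemory` type, and the use of plain dataclasses are assumptions for illustration; the actual types in `src/exo/shared/types/` are not shown in this log and may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(frozen=True)
class NodeIdentity:
    # Field names taken from the commit message; types are assumed.
    model_id: str
    chip_id: str
    friendly_name: str


@dataclass(frozen=True)
class NodeMemory:
    # Hypothetical type and fields standing in for "RAM and swap usage".
    ram_used: int
    swap_used: int


@dataclass(frozen=True)
class NodePerformanceProfile:
    # Assumed layout of the combined, legacy view.
    identity: NodeIdentity
    memory: Optional[NodeMemory]


@dataclass
class State:
    node_identities: Dict[str, NodeIdentity] = field(default_factory=dict)
    node_memory: Dict[str, NodeMemory] = field(default_factory=dict)

    @property
    def node_profiles(self) -> Dict[str, NodePerformanceProfile]:
        """Rebuild the combined view on demand from the granular mappings."""
        return {
            node_id: NodePerformanceProfile(identity, self.node_memory.get(node_id))
            for node_id, identity in self.node_identities.items()
        }


state = State()
state.node_identities["node-a"] = NodeIdentity("Mac Studio", "M3 Ultra", "studio-1")
state.node_memory["node-a"] = NodeMemory(ram_used=1024, swap_used=0)
# A frequent memory update writes only to node_memory; identity is untouched.
state.node_memory["node-a"] = NodeMemory(ram_used=2048, swap_used=0)
profile = state.node_profiles["node-a"]
```

Computing `node_profiles` on read keeps existing consumers working while each writer touches only its own mapping.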