mirror/exo

mirror of https://github.com/exo-explore/exo.git synced 2026-02-18 14:55:13 -05:00

Author	SHA1	Message	Date
Sami Khan	ffacabe7e4	Fix uninstall button error (#1306 ) ## Motivation Fix "Network setup script failed" error when clicking uninstall button and resolve Xcode compiler warnings. ## Changes - NetworkSetupHelper.swift: Add \|\| true guards and explicit return 0 in find_and_enable_thunderbolt_bridge to prevent script failures with set -euo pipefail - ThunderboltBridgeService.swift: Use withCString and withUnsafeMutablePointer for Authorization API calls to fix pointer lifetime warnings - EXOApp.swift: Mark showNotification as nonisolated to fix main actor isolation warning ## Why It Works - The uninstall script's Thunderbolt re-enable function could exit non-zero in edge cases (no bridges, no matches). Since this is a cleanup step, failures should not abort uninstall. - Swift requires explicit pointer lifetime management when passing strings/structs to C APIs. - showNotification is called from a nonisolated delegate method and uses thread-safe APIs. ## Test Plan ### Manual Testing Hardware: MacBook Pro - Clicked Uninstall button, verified it completes without error - Built in Xcode, verified no warnings ### Automated Testing N/A	2026-01-29 12:57:48 +00:00
rltakashige	9e58a57599	Add RDMA caveats to README.md (#1316) ## Motivation Running RDMA from source is not well documented as is. Several surprising behaviours took time to debug internally too. The app should be updated to detect macOS versions in future.	2026-01-28 18:44:00 +00:00
Evan Quiney	748a026071	fix configdata validation for kimi-k2 (#1314 ) ## motivation our shard downloader could not correctly fetch data for kimi-k2, as it deferred some values to a text_config field. ## changes config_data now prioritizes this field if it exists in information like layer_count	2026-01-28 14:29:36 +00:00
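The fallback described in the commit above can be sketched roughly as follows. This is a minimal illustration, not exo's actual `config_data` code: the helper name `resolve_layer_count` and the key `num_hidden_layers` are assumptions chosen for the example.

```python
from __future__ import annotations

from typing import Any


def resolve_layer_count(config: dict[str, Any]) -> int:
    """Return the layer count, preferring a nested text_config when present."""
    # Hypothetical key name; a real model config may use a different field.
    key = "num_hidden_layers"

    # Some configs defer values to a nested "text_config" mapping,
    # so check there first.
    text_config = config.get("text_config")
    if isinstance(text_config, dict) and key in text_config:
        return int(text_config[key])

    # Otherwise fall back to the top-level field.
    if key in config:
        return int(config[key])

    raise ValueError(f"{key} not found in config or text_config")
```

Checking the nested field first means a top-level placeholder value cannot shadow the value the text model actually uses.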
Alex Cheema	f1a2d054ec	Update tagline to "Run frontier AI locally" (#1313 ) - Update README tagline from "Run your own AI cluster at home with everyday devices" to "Run frontier AI locally"	2026-01-28 12:38:14 +00:00
Alex Cheema	b3c8f85fc8	Update MLX to 0.30.4 (#1311 ) ## Summary - Bump mlx from 0.30.3 to 0.30.4 ## Test plan - [x] `uv lock` succeeds - [x] Type checking passes (`uv run basedpyright`) - [x] Run inference tests 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-28 04:30:21 -08:00
rltakashige	a562114ba5	Add Kimi K2.5 support (#1302) --------- Co-authored-by: Alex Cheema <41707476+AlexCheema@users.noreply.github.com>	2026-01-28 05:44:19 +00:00
Evan Quiney	991d278119	replace nix fmt with treefmt in just lint (#1301 ) man evaluating the nix flake is so slow. treefmt speeeedy	2026-01-27 17:03:01 +00:00
rltakashige	c55cbf6739	Add mlx lm style tensor sharding for Minimax (#1299) ## Motivation Broken right now. We'll potentially add a better one later ## Test Plan ### Manual Testing Used for evals without any issue.	2026-01-27 15:29:06 +00:00
Alex Cheema	bd4f0bf048	Fix download speed/ETA display for re-downloads (#1294 ) ## Motivation After the download verification fix, when files are re-downloaded due to upstream changes (size mismatch), the download progress displays correctly (completion %, bytes, file counts), but speed shows 0 B/s and ETA shows "--" for both overall and per-file progress. ## Changes - Modified `on_progress_wrapper` in `src/exo/download/download_utils.py` to detect re-download scenarios - Added re-download detection: when `curr_bytes < previous_downloaded`, the file was deleted and download restarted - On re-download: reset `start_time` to current time and set `downloaded_this_session = curr_bytes` - Added two tests to `test_download_verification.py` covering re-download and continuing download scenarios ## Why It Works The bug occurred because: 1. `file_progress` is initialized with the OLD local file size (e.g., 1.5GB) 2. When `_download_file` detects size mismatch, it deletes the file and starts fresh 3. Progress callback receives small `curr_bytes` (e.g., 8KB) but compares against old size 4. `downloaded_this_session = 0 + (8KB - 1.5GB) = -1.5GB` (negative!) 5. Negative session bytes → 0 or negative speed → ETA shows "--" The fix detects when `curr_bytes < previous_downloaded` (indicating re-download started) and resets tracking to treat it as a fresh download. 
## Test Plan ### Manual Testing - Download a model, modify a file to change its size, restart exo, verify speed/ETA display correctly during re-download ### Automated Testing - Added `TestProgressResetOnRedownload` class with two tests: - `test_progress_resets_correctly_on_redownload`: Verifies progress resets correctly when re-download starts - `test_progress_accumulates_on_continuing_download`: Verifies continuing downloads still accumulate correctly - All 11 download tests pass - Type checking (basedpyright): 0 errors - Linting (ruff): All checks passed 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 21:56:58 +00:00
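The reset logic described in the commit above can be sketched as below. This is a simplified stand-in rather than the real `on_progress_wrapper`: the `FileProgress` class, `update_progress`, and `speed_bytes_per_sec` are hypothetical names used only to illustrate the `curr_bytes < previous_downloaded` check.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileProgress:
    downloaded: int                      # bytes on disk as of the last callback
    downloaded_this_session: int = 0     # bytes fetched since start_time
    start_time: float = field(default_factory=time.time)


def update_progress(p: FileProgress, curr_bytes: int, now: Optional[float] = None) -> FileProgress:
    """Update tracking for one progress callback."""
    now = time.time() if now is None else now
    if curr_bytes < p.downloaded:
        # The byte count went backwards: the stale file was deleted and the
        # download restarted, so treat this as a fresh download.
        p.start_time = now
        p.downloaded_this_session = curr_bytes
    else:
        # Normal case: accumulate only the newly received bytes.
        p.downloaded_this_session += curr_bytes - p.downloaded
    p.downloaded = curr_bytes
    return p


def speed_bytes_per_sec(p: FileProgress, now: float) -> float:
    """Average speed for this session; 0.0 if no time has elapsed."""
    elapsed = now - p.start_time
    return p.downloaded_this_session / elapsed if elapsed > 0 else 0.0
```

Without the first branch, a restart from a 1.5GB stale file would add a large negative delta to the session total, which is the zero-speed symptom the commit describes.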
rltakashige	cd8c01b7c8	Fix kv prefix cache (#1262) ## Motivation OpenCode sends very large prompts, most of which are repeated on the next call. ## Changes Add prefix caching, reducing average prefill time (in testing) from 40 seconds to 4. This massively improves user experience. Also evicts KV caches from this prefix cache in an LRU-style manner. ## Why It Works We no longer prefill repeatedly but instead reuse the KV cache stored in memory. A future update may want to use storage to make the prefix cache larger. ## Test Plan ### Manual Testing Tested speedup on OpenCode ### Automated Testing Added a lot of tests --------- Co-authored-by: David Hind <davehind@yahoo.co.uk>	2026-01-26 20:13:58 +00:00
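To make the idea in the commit above concrete, here is a toy prefix cache with LRU-style eviction. It is a sketch under stated assumptions, not exo's implementation: the `PrefixCache` class and its methods are hypothetical, the KV state is an opaque placeholder object, and a real cache would also have to trim the stored KV state to the matched length.

```python
from collections import OrderedDict
from typing import Any, Optional, Sequence, Tuple


def common_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading tokens shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


class PrefixCache:
    """Maps token sequences to cached KV state, evicting least recently used."""

    def __init__(self, max_entries: int = 4) -> None:
        self.max_entries = max_entries
        self._entries = OrderedDict()  # tuple of tokens -> kv_state

    def put(self, tokens: Sequence[int], kv_state: Any) -> None:
        key = tuple(tokens)
        self._entries[key] = kv_state
        self._entries.move_to_end(key)          # mark as most recently used
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # drop least recently used

    def lookup(self, tokens: Sequence[int]) -> Tuple[int, Optional[Any]]:
        """Return (matched_len, kv_state) for the entry sharing the longest prefix."""
        best_key, best_len = None, 0
        for key in self._entries:
            n = common_prefix_len(key, tokens)
            if n > best_len:
                best_key, best_len = key, n
        if best_key is None:
            return 0, None
        self._entries.move_to_end(best_key)     # a hit refreshes recency
        return best_len, self._entries[best_key]
```

On a hit, only the tokens after `matched_len` need to be prefilled, which is where the reported speedup on repeated prompts would come from.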
rltakashige	59e991ce15	Only ignore message if actually empty (#1292)	2026-01-26 19:33:23 +00:00
ciaranbor	ffba340e70	Ciaran/image quantization (#1272 ) ## Motivation Enable users to select and use quantized variants (8-bit, 4-bit) of image models ## Changes Use exolabs HF org for image models ## Why It Works Quantized versions have been uploaded to exolabs HF org ## Test Plan Loaded and ran different quantized variants. Confirmed lower memory usage and different outputs for the same seed. Verified chat completion still works.	2026-01-26 19:25:05 +00:00
rltakashige	9968abe816	Leo/fix basic model shard (#1291 ) ## Motivation Some models, on some configurations, would have several issues that caused the model to be stuck on loading. ## Changes Several loading issues were with upstream mlx lm shard loading for tensor parallel. GLM 4.7 Flash now uses GLM 4.7 Lite. A final portion of the issues were from mlx memory not being properly released before calling mx.eval(model), causing the system to run out of memory. ## Test Plan ### Manual Testing Done a bunch (thanks @AlexCheema), hopefully exhaustive. ### Automated Testing A bunch of automated testing is imminent but not landed yet. --------- Co-authored-by: Alex Cheema <alexcheema123@gmail.com>	2026-01-26 17:49:09 +00:00
Alex Cheema	0e30b0830f	Fix download system for upstream file changes (#1290 ) ## Motivation When upstream files change on Hugging Face, exo's download system doesn't detect the change and downloads get stuck. The only workaround is deleting `~/.exo/models/` and the cache. Root causes: 1. Existing files are never re-verified against remote metadata 2. File list cache is never invalidated, causing stale sizes to be used ## Changes 1. Verify existing files against remote size (`_download_file`): Before returning early for existing files, verify the local file size matches remote. If mismatched, delete and re-download. If network fails (offline), fall back to trusting local file. 2. Always try fresh file list first (`fetch_file_list_with_cache`): Always attempt to fetch fresh data from Hugging Face. On success, update the cache. On failure, fall back to cached data if available. 3. Clear cache on model delete (`delete_model`): When a model is deleted, also delete its cache entry to prevent stale metadata. ## Why It Works - Online: Stale local files are detected via size mismatch and re-downloaded. Fresh file list is always fetched and cache is updated. - Offline with cache: Existing files are trusted. Cached file list is used as fallback. - Offline without cache: Fails gracefully (can't download without knowing what files to get). The size check is O(1) so there's no performance impact. Hash verification still happens after download completes (existing behavior). 
## Test Plan ### Manual Testing - Download a model, manually modify a local file's content, restart exo, verify it re-downloads ### Automated Testing Added 9 new tests in `src/exo/download/tests/test_download_verification.py`: - Re-download when file size changes upstream - Skip download when file size matches - Offline fallback uses local file - Fetch fresh file list and update cache - Fall back to cache when fetch fails - Error propagates when no cache exists - Model delete clears cache - Delete when only cache exists - Delete nonexistent model All tests pass: `uv run pytest src/exo/download/tests/ -v` Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 09:14:58 -08:00
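The size check with offline fallback described in the commit above might look roughly like this. It is a sketch, not the actual `_download_file` logic: `needs_redownload` and the injected `fetch_remote_size` callable are hypothetical, and network failure is modelled as any `OSError`.

```python
from pathlib import Path
from typing import Callable


def needs_redownload(local_path: Path, fetch_remote_size: Callable[[], int]) -> bool:
    """Decide whether a file must be downloaded (again).

    fetch_remote_size is a hypothetical callable that returns the remote
    file size in bytes and raises OSError when the network is unavailable.
    """
    if not local_path.exists():
        return True
    try:
        remote_size = fetch_remote_size()
    except OSError:
        # Offline: we cannot verify, so trust the existing local file.
        return False
    if local_path.stat().st_size != remote_size:
        # Stale file: remove it so the download starts from scratch.
        local_path.unlink()
        return True
    return False
```

Comparing sizes is a single `stat` call per file, so the check stays cheap; content hashing is left to the post-download step the commit mentions.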
Alex Cheema	44453c4c8b	Remove change-detection checks from info gatherer monitors (#1283 ) ## Summary - When a node times out, its info gets cleared from state. The monitor functions only sent data when something changed, leaving no mechanism to re-populate this info after a timeout. - Removes change-detection checks from `_monitor_misc`, `_monitor_system_profiler_thunderbolt_data`, `_watch_system_info`, and `_monitor_thunderbolt_bridge_status` so data is sent periodically regardless of whether it changed. ## Test plan - [ ] Verify type checker passes: `uv run basedpyright` - [ ] Verify linter passes: `uv run ruff check` - [ ] Verify tests pass: `uv run pytest` - [ ] Manually test that node info is re-populated after a timeout by observing cluster behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-26 12:23:22 +00:00
Jake Hillion	1290e8ed9f	dashboard: fix prettier-svelte rebuilding on every file change The prettier-svelte package was rebuilding whenever any file in the repository changed because dashboardStubSrc referenced inputs.self directly. Since inputs.self's store path hash is computed from the entire repository contents, any file modification invalidated the derivation. Added dashboardLockfileSrc using lib.cleanSourceWith to filter inputs.self to only include package.json and package-lock.json from the dashboard directory. Updated dashboardStubSrc to reference this filtered source instead of inputs.self directly. This ensures prettier-svelte only rebuilds when the lockfiles actually change, significantly improving build caching for unrelated changes. Test plan: - Built prettier-svelte with nix build .#prettier-svelte - Modified src/exo/main.py and rebuilt - same store path (no rebuild) - Modified dashboard/package.json and rebuilt - different store path (rebuild triggered) - Ran nix flake check successfully	2026-01-26 12:02:05 +00:00
Evan Quiney	d93db3d6bf	re enable the evil network script (#1277 ) seems like we still need the interfaces to be routable for mdns. at least we're not dependent on this behaviour anymore.	2026-01-24 13:36:06 +00:00
Alex Cheema	ff4a2022f7	Revert state compaction (#1259 ) (#1275 ) ## Summary Reverts the state compaction feature (#1259) to investigate issues with nodes staying as "unknown" after joining a cluster. ## Test plan - [ ] Verify nodes properly show up after joining cluster - [ ] Verify state catchup works correctly without compaction 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-01-23 16:29:48 -08:00
rltakashige	cee48f6f34	Parse GPT OSS tool calling (#1271 ) ## Motivation <img width="3162" height="858" alt="image" src="https://github.com/user-attachments/assets/e552f373-620a-4522-894b-6f93fd7f1e50" /> ## Changes OpenAI Harmony StreamableParser does parsing for us. ## Why It Works <img width="3230" height="588" alt="image" src="https://github.com/user-attachments/assets/81f8a43e-c04b-4bd0-9fd0-65e9b5f6ea1d" />	2026-01-23 20:43:53 +00:00
Evan Quiney	2b67e84a03	state compaction (#1259) ## motivation a node joining a long-running cluster would bring down networking. this attempts to mitigate that issue by compacting the state for catching up new devices ## changes introduces a new topic ("state_catchup") over which a full state can be sent. currently the master sends the worker + api this new state, and they update only if they have no other events applied - otherwise usual NACK systems function ## testing manually tested on two and eight nodes - it's an improvement, not a fix Co-authored-by: rltakashige <rl.takashige@gmail.com>	2026-01-23 20:32:49 +00:00
Alex Cheema	7204fdeb4a	Restore Thunderbolt Bridge LaunchDaemon (#1270 ) ## Motivation The LaunchDaemon approach for disabling Thunderbolt Bridge was removed in commit `43f12f5d` and replaced with dynamic cycle detection. However, the LaunchDaemon runs automatically on reboot, ensuring the bridge is always disabled before it can cause packet storms. ## Changes - Restore `NetworkSetupHelper.promptAndInstallIfNeeded()` to install a LaunchDaemon that disables Thunderbolt Bridge on startup - Show user prompt explaining what will be installed before requesting admin password - Remove old cleanup-only logic from `EXOApp.swift` - Installer removes any existing installation before installing fresh (handles upgrades) ## Why It Works The LaunchDaemon runs at boot with `RunAtLoad=true` and periodically (every ~30 min), destroying bridge0 and disabling Thunderbolt Bridge before it can cause packet storms. The daemon is only installed once—`daemonAlreadyInstalled()` checks script content and plist config match before prompting. ## Test Plan ### Manual Testing - Run app first time → should see prompt → click Install → enter admin password → daemon installed - Run app again → no prompt (already installed) - Reboot → bridge0 should be destroyed/disabled automatically - Check daemon: `launchctl list \| grep io.exo.networksetup` - Check files: `/Library/LaunchDaemons/io.exo.networksetup.plist`, `/Library/Application Support/EXO/disable_bridge.sh` ### Automated Testing N/A - requires admin privileges and system-level changes Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-23 20:25:37 +00:00
Evan Quiney	ec345a4315	fix: deprioritise uncertain ethernet devices (#1267 ) we were placing coordinators on uncertain devices (enX+) that are listed as "USB LAN" - these could be thunderbolt ports breaking RDMA instances	2026-01-23 20:13:28 +00:00
ciaranbor	9967dfa734	Prevent conversation collision (#1266 ) ## Motivation When a user switched conversations while a response was still streaming, the streaming content would be written to the currently selected conversation instead of the original one. For streamed image generation, each partial image would be written to the open conversation ## Changes Added helper methods to track and update the correct conversation during streaming: - updateConversationMessage() - Update a message in a specific conversation by ID - syncActiveMessagesIfNeeded() - Sync this.messages from target conversation only if it's active - conversationExists() - Check if a conversation still exists (handles mid-stream deletion) - persistConversation() - Persist a specific conversation to storage - addMessageToConversation() - Add a message directly to a specific conversation ## Why It Works Capturing the conversation ID at the start of the request ensures we know which conversation to update ## Test Plan ### Manual Testing Tested switching conversation during generation across each model type	2026-01-23 19:59:08 +00:00
ciaranbor	23fd37fe4d	Add FLUX.1-Krea-dev model (#1269 ) ## Why It Works Same implementation as FLUX.1-dev, just different weights	2026-01-23 19:48:24 +00:00
Alex Cheema	d229df38f9	Fix placement filter to use subset matching instead of exact match (#1265 ) ## Motivation When using the dashboard's instance placement filter (clicking nodes in the topology), it was filtering to placements that use exactly the selected nodes. This isn't the expected behavior - users want to see placements that include all selected nodes, but may also include additional nodes. For example, selecting nodes [A, B] should show placements using [A, B], [A, B, C], [A, B, C, D], etc. - not just [A, B]. ## Changes - Added `required_nodes` parameter to `place_instance()` in `placement.py` - Filter cycles early in placement to only those containing all required nodes (subset matching) - Simplified `api.py` by removing the subgraph topology filtering and passing `required_nodes` directly to placement - Renamed internal `node_ids` variable to `placement_node_ids` to avoid shadowing the parameter ## Why It Works By filtering cycles at the placement level using `required_nodes.issubset(cycle.node_ids)`, we ensure that only cycles containing all the user-selected nodes are considered. This happens early in the placement algorithm, so we don't waste time computing placements that would be filtered out later. ## Test Plan ### Manual Testing - Select nodes in the dashboard topology view - Verify that placements shown include all selected nodes (but may include additional nodes) - Verify that placements not containing the selected nodes are filtered out ### Automated Testing - Existing placement tests pass - `uv run pytest src/exo/master/tests/ -v` - 37 tests pass Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-23 19:40:31 +00:00
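The subset filter in the commit above boils down to a single `issubset` check per cycle. The sketch below assumes a simplified `Cycle` type with a `node_ids` set; `filter_cycles` is a hypothetical helper, and the real `place_instance()` does considerably more.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cycle:
    node_ids: frozenset[str]


def filter_cycles(cycles: list[Cycle], required_nodes: set[str]) -> list[Cycle]:
    """Keep cycles that contain every required node (extra nodes are allowed)."""
    return [c for c in cycles if required_nodes.issubset(c.node_ids)]


cycles = [
    Cycle(frozenset({"A", "B"})),
    Cycle(frozenset({"A", "B", "C"})),
    Cycle(frozenset({"A", "C"})),
]
# Selecting A and B keeps the first two cycles and drops the one without B.
selected = filter_cycles(cycles, {"A", "B"})
```

An empty `required_nodes` set is a subset of every cycle, so passing no filter leaves the candidate list unchanged.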
Alex Cheema	8a595fee2f	Fix Thunderbolt bridge cycle detection to include 2-node cycles (#1261) ## Motivation Packet storms occur with Thunderbolt bridge enabled on 2 machines connected by Thunderbolt, not just 3+ node cycles as previously assumed. The cycle detection was too conservative and missed this case. ## Changes - Changed the minimum cycle length from >2 (3+ nodes) to >=2 (2+ nodes) - Updated the early return threshold from `< 3` to `< 2` enabled nodes - Updated docstring to reflect the new behavior ## Why It Works A Thunderbolt bridge loop between just 2 machines can still create broadcast storms when both have the bridge enabled. The previous threshold of 3+ was based on an incorrect assumption that 2-node connections wouldn't cause this problem. ## Test Plan ### Manual Testing - Tested with 2 machines connected via Thunderbolt with bridge enabled - Confirmed packet storms occur in this configuration - Verified the fix correctly detects and handles 2-node cycles ### Automated Testing - Existing topology tests cover cycle detection logic Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-23 19:34:48 +00:00
ciaranbor	c8571a17a3	Fix guidance (#1264 ) ## Motivation Previously, we only handled user-provided guidance parameter for CFG models. ## Changes Just pass the parameter to model setup	2026-01-23 19:13:45 +00:00
Evan Quiney	771a86331b	fix instance port assignment (#1268 ) we were overassigning the port 52414 to instances because of an error in placement	2026-01-23 18:37:40 +00:00
Jake Hillion	6dbbe7797b	downloads: add download and delete buttons to downloads UI The downloads page showed model download progress but provided no way for users to trigger downloads or remove completed models from disk. Added API endpoints (POST /download/start, DELETE /download/{node_id}/{model_id}) that send StartDownload and DeleteDownload commands via the download_command_sender. Updated the dashboard downloads page with per-model buttons: a download button for incomplete downloads and a delete button for completed ones. This allows users to manage downloads directly from the UI without needing to trigger downloads through other means. Test plan: - Deployed on a 3 machine cluster. Did several downloads/deletions - all work and the dashboard updates relatively fluently. It takes roughly 5 seconds to render a 131GB model deletion which isn't too bad.	2026-01-23 18:11:17 +00:00
Jake Hillion	9357503c6f	downloads: refactor to run at node level The Worker previously owned the ShardDownloader directly via dependency injection, which prevented --no-worker nodes from downloading and made it impossible for multiple Workers to share a single downloader instance. Moved download functionality to a new DownloadCoordinator component at the Node level that communicates via the DOWNLOAD_COMMANDS pub/sub topic. Workers now send StartDownload commands instead of calling the downloader directly, and receive progress updates through the event-sourced state. This decouples downloads from the Worker lifecycle and enables future features like UI-triggered downloads to specific nodes and multi-worker download sharing. Test plan: - Mostly tested in the next PR that adds explicit downloads/deletions to the dashboard. - Started a model that isn't downloaded - it works.	2026-01-23 18:04:09 +00:00
ciaranbor	ba19940828	Fix regenerate for image models (#1263) ## Motivation The 'regenerate' button was hardcoded to chat completion. Clicking 'regenerate' for an image request would result in an error after the model is loaded ## Changes Store the request type and dispatch to the appropriate request upon regeneration ## Why It Works We make sure to repeat the same request type as was performed originally ## Test Plan ### Manual Testing Checked 'regenerate' works for chat completion, image generation, image editing	2026-01-23 16:33:01 +00:00
Jake Hillion	f255345a1a	dashboard: decouple prettier-svelte from dashboard source The prettier-svelte formatter depended on the full dashboard build (dashboardFull), causing the devshell to rebuild whenever any dashboard source file changed. Created a deps-only dream2nix derivation (deps.nix) that uses a stub source containing only package.json, package-lock.json, and minimal files for vite to succeed. Updated prettier-svelte to use this derivation instead of dashboardFull. The stub source is constant unless lockfiles change, so prettier-svelte and the devshell no longer rebuild when dashboard source files are modified. Test plan: - nix flake check passed - nix fmt successfully formatted svelte files	2026-01-23 15:16:48 +00:00
ciaranbor	a1939c89f2	Enable UI settings for image editing (#1258) ## Motivation Image editing was missing UI controls for quality, output format, and advanced parameters that text-to-image generation already supported. ## Changes - Added quality, output_format, and advanced_params to image edit API endpoints - Extended isImageModel check to include image editing models ## Why It Works The API now accepts and forwards these settings for image edits, and the UI displays the appropriate controls for image editing models. ## Test Plan ### Manual Testing Verified parameters can be set in UI and that they propagate through to model inference	2026-01-23 13:37:25 +00:00
ciaranbor	cb9c9ee55c	Enable generating multiple images. Optionally stream partial images (#1251 ) ## Motivation Support OpenAI API `n` setting ## Changes - Users can select `n` to generate more than one image with the same prompt - each image uses a different seed -> different results - `stream` and `partial_images` settings can be overwritten in UI	2026-01-23 11:19:58 +00:00
Alex Cheema	df240f834d	Fix GLM and Kimi tool calling crashes (#1255 ) ## Motivation Fixes tool calling crashes with GLM-4.7-Flash and Kimi-K2 models. Related: #1254 Two distinct issues were causing crashes: 1. Tool parser crashes - The upstream GLM47 and Kimi tool parsers call `.group()` on regex matches without checking for `None`, causing `AttributeError` when the model outputs malformed tool calls 2. Chat template crashes - GLM's chat template expects `tool_calls[].function.arguments` to be a dict, but OpenAI format provides it as a JSON string, causing `'str object' has no attribute 'items'` ## Changes `src/exo/worker/runner/runner.py`: - Add `patch_glm_tokenizer()` - fixed version of mlx_lm's glm47 parser with None checks - Fix `patch_kimi_tokenizer()` - add None checks before calling `.group()` on regex matches - Add `ValueError` and `AttributeError` to exception handling in `parse_tool_calls()` `src/exo/worker/engines/mlx/utils_mlx.py`: - Add `_normalize_tool_calls()` - parses `tool_calls[].function.arguments` from JSON string to dict for templates that expect dicts (like GLM-4.7-Flash) ## Why It Works 1. Parser fixes: By checking if regex matches are `None` before calling `.group()`, we can raise a proper `ValueError` instead of crashing with `AttributeError` 2. Template fix: The GLM-4.7-Flash chat template iterates over arguments with `.items()`: ```jinja2 {% set _args = tc.arguments %}{% for k, v in _args.items() %} ``` OpenAI format has `arguments` as a JSON string. `_normalize_tool_calls()` parses this to a dict before passing to the template. ## Test Plan ### Manual Testing - Hardware: Mac with GLM-4.7-Flash-4bit model - Tested tool calling with GLM model - no longer crashes ### Automated Testing - Existing tests pass (`uv run pytest`) - Type checking passes (`uv run basedpyright`) - Linting passes (`uv run ruff check`) --------- Co-authored-by: Claude <noreply@anthropic.com>	2026-01-23 01:39:59 +00:00
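The template fix in the commit above can be illustrated with a small normaliser. This is a sketch of the idea, not the real `_normalize_tool_calls()`: the function name below has no leading underscore, copying the messages and leaving malformed JSON untouched are design choices assumed for the example.

```python
from __future__ import annotations

import copy
import json
from typing import Any


def normalize_tool_calls(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a copy of messages where tool call arguments are dicts.

    OpenAI-format messages carry `function.arguments` as a JSON string;
    some chat templates iterate over it with `.items()` and need a dict.
    """
    out = copy.deepcopy(messages)  # do not mutate the caller's messages
    for msg in out:
        for call in msg.get("tool_calls") or []:
            fn = call.get("function", {})
            args = fn.get("arguments")
            if isinstance(args, str):
                try:
                    parsed = json.loads(args)
                except json.JSONDecodeError:
                    continue  # assumption: leave malformed arguments as-is
                if isinstance(parsed, dict):
                    fn["arguments"] = parsed
    return out
```

Arguments that are already dicts pass through unchanged, so the function is safe to apply to messages from either format.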
ciaranbor	cd125b3b8c	Use icon for image editing models (#1252 ) ## Motivation Visual indicator for image editing models ## Changes Add pencil icon to edit models in model list	2026-01-22 22:37:34 +00:00
Alex Cheema	b783a21399	dashboard: add placement filter by clicking topology nodes (#1248 ) ## Motivation When selecting a model for placement, users often want to see placements that utilize specific nodes in their cluster. Currently there's no way to filter the placement previews to focus on configurations that include particular machines. ## Changes - Backend: Added `node_ids` query parameter to the `/placement-previews` API endpoint. When provided, the endpoint filters the topology to only include the specified nodes before generating placements using the new `Topology.filter_to_nodes()` method. - Topology class: Added `filter_to_nodes(node_ids)` method that creates a new topology containing only the specified nodes and edges between them. - App store: Added `previewNodeFilter` state to track selected nodes, with methods to toggle/clear the filter. Automatically cleans up filter when nodes are removed from the cluster and re-fetches previews when topology changes. - TopologyGraph component: Added click handlers to toggle node filter selection, hover effects to indicate clickable nodes, and visual styling (yellow highlight for selected, dimmed for filtered-out nodes). - Main page: Added filter indicator in top-right corner of topology showing active filter count with a clear button. ## Why It Works The filtering happens at the backend/placement generation level rather than just filtering the results. This ensures we see all valid placement combinations for the selected nodes, not just a subset that happened to be generated for the full topology. The visual feedback uses the same rendering approach as the existing highlight system - state is tracked in Svelte and applied during render, so it persists across data updates without flickering. 
## Test Plan ### Manual Testing - Click a node in topology → should show yellow highlight and filter indicator - Click another node → indicator shows "2 nodes", previews update to show only placements using both - Hover over nodes → subtle yellow highlight indicates they're clickable - Click X on filter indicator → clears filter, shows all placements again - Disconnect a node while it's in filter → filter auto-removes that node ### Automated Testing - Existing tests cover the Topology class; the new `filter_to_nodes` method follows the same patterns --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 22:12:57 +00:00
Alex Cheema	43f12f5d08	Replace LaunchDaemon with dynamic Thunderbolt Bridge loop detection (#1222 ) ## Motivation The previous approach installed a LaunchDaemon plist that ran periodically to disable Thunderbolt Bridge. This required full admin privileges upfront and ran regardless of whether a problematic loop existed. This change replaces that with dynamic detection - only prompting the user when an actual TB bridge loop with 3+ machines is detected, and using fine-grained SCPreferences authorization instead of full admin. ## Changes Backend (Python): - Added `ThunderboltBridgeStatus` model to track bridge enabled/exists state per node - Added `node_thunderbolt_bridge` and `thunderbolt_bridge_cycles` fields to State - Added `get_thunderbolt_bridge_cycles()` method to Topology class - Robust TB bridge detection: - Finds bridge network services from `-listnetworkserviceorder` (not `-listallhardwareports` which can miss bridges) - Checks each bridge's member interfaces via `ifconfig` to verify it contains Thunderbolt interfaces - Handles varying service names (e.g., "TB Bridge", "Thunderbolt Bridge", "Bridge (bridge0)") - Includes `service_name` in status for correct disable commands - Added warning logs for all error cases in detection - Updated `apply.py` to handle the new event type and recompute cycles on node timeout Swift App: - New `ThunderboltBridgeService` that monitors for cycles from cluster state - Shows NSAlert when a cycle with >2 machines is detected - Uses `SCPreferencesCreateWithAuthorization` with `system.services.systemconfiguration.network` right for targeted permissions - Auto-cleanup of legacy LaunchDaemon: On app startup, checks for and removes old plist/scripts (non-fatal if user cancels) - Periodic local network checking: Re-checks every 10s so the warning disappears when user grants permission - Fixed ClusterState model: Updated to decode new granular state fields (`nodeIdentities`, `nodeMemory`, `nodeSystem`, `nodeThunderboltBridge`) with 
computed `nodeProfiles` property for backwards compatibility - Fixed Topology model: Updated to match actual JSON structure where `nodes` is an array of strings (not objects) and `connections` is a nested map (not flat array) - Cleaned up `NetworkSetupHelper` by removing daemon installation code (now only handles uninstall) Dashboard: - Added yellow warning badge on topology when TB bridge cycle detected - On hover: highlights affected nodes in yellow on the topology graph - Shows which machines are in the cycle with friendly names - Provides copy-paste terminal command with the correct service name: ``` sudo networksetup -setnetworkserviceenabled "<service-name>" off ``` - Warning appears in all topology views (full, welcome, and minimized chat sidebar) - Debug mode: Shows "TB:ON" or "TB:OFF" status next to each node in the topology ## Why It Works - Cycle detection happens on the backend where we have full topology information - Only cycles with 3+ machines are flagged (2-node connections are fine) - TB bridge detection is robust: - Uses `-listnetworkserviceorder` to find bridges (works on all machines tested) - Verifies bridge membership via `ifconfig` to confirm Thunderbolt interfaces - Handles different service names across machines - The Swift app reacts to detected cycles and prompts the user once per cycle - The dashboard provides visual feedback and actionable instructions - `SCPreferencesCreateWithAuthorization` provides the minimal permissions needed to modify network service state - Legacy LaunchDaemon is automatically cleaned up on first launch with this version ## Test Plan ### Manual Testing Here EXO detected a TB bridge cycle: #### Dashboard: <img width="1363" height="884" alt="Screenshot 2026-01-21 at 10 07 30 PM" src="https://github.com/user-attachments/assets/7da9c621-0c91-42c4-898e-4952188a1f61" /> #### Hovering the warning: <img width="359" height="279" alt="Screenshot 2026-01-21 at 16 30 57" 
src="https://github.com/user-attachments/assets/05501dcf-3d4a-4704-9f38-257748c05a53" /> #### macOS app warning popup: <img width="270" height="410" alt="Screenshot 2026-01-21 at 16 29 08" src="https://github.com/user-attachments/assets/45714427-08c3-4fb4-9e61-144925c51adf" /> ### Which then asks for the user's password: <img width="263" height="372" alt="Screenshot 2026-01-21 at 16 29 28" src="https://github.com/user-attachments/assets/7502e591-596d-4128-8cf5-6a12674e27bc" /> Which when entered, successfully disables bridge and no longer shows the warning on dashboard. #### When it fails it shows the error message: <img width="263" height="234" alt="Screenshot 2026-01-21 at 14 45 38" src="https://github.com/user-attachments/assets/2d10b3d5-69d7-46ea-b631-d52d8651ab41" /> ### Automated Testing - Type checker: 0 errors (`uv run basedpyright`) - Linter: All checks passed (`uv run ruff check`) - Tests: 118 passed (`uv run pytest`) - Dashboard: Builds successfully (`npm run build`) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-22 21:53:05 +00:00
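The rule described in 43f12f5d08 (flag only loops of three or more bridge-enabled machines, since 2-node connections are fine) can be sketched roughly as below. This is not exo's `get_thunderbolt_bridge_cycles()`, whose implementation is not shown in this log. The function name `bridge_cycle_groups`, its inputs, and the pruning approach (repeatedly dropping nodes with fewer than two bridge-enabled neighbours) are assumptions made for illustration.

```python
from collections import defaultdict


def bridge_cycle_groups(links, bridge_enabled):
    """Group nodes that sit on a loop of bridge-enabled machines.

    links: iterable of (node_a, node_b) undirected connections (hypothetical input shape).
    bridge_enabled: dict mapping node -> bool (hypothetical input shape).
    """
    # Build an undirected graph over nodes whose bridge is enabled.
    neighbours = defaultdict(set)
    for a, b in links:
        if a != b and bridge_enabled.get(a) and bridge_enabled.get(b):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Repeatedly remove nodes with fewer than two neighbours.
    # What survives lies on a loop (or on a path between two loops),
    # and any surviving group necessarily has at least three nodes.
    stack = [n for n, ns in neighbours.items() if len(ns) < 2]
    while stack:
        node = stack.pop()
        if node not in neighbours:
            continue  # already removed
        for other in neighbours.pop(node):
            neighbours[other].discard(node)
            if len(neighbours[other]) < 2:
                stack.append(other)

    # Collect the connected components of the remaining graph.
    groups, seen = [], set()
    for start in neighbours:
        if start in seen:
            continue
        component, todo = set(), [start]
        while todo:
            node = todo.pop()
            if node in component:
                continue
            component.add(node)
            todo.extend(neighbours[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups
```

With a triangle `a-b-c` plus a spur `a-d`, the spur is pruned and only the three looped machines are reported; a plain two-machine link yields no groups.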
ciaranbor	8027d7933f	Ciaran/hf token (#1250 ) ## Motivation black-forest-labs models require hf auth and signup to download. We don't handle this gracefully. https://github.com/exo-explore/exo/issues/1242 ## Changes - Handle auth errors - Surface error to UI and suggest resolution - Support using HF_TOKEN env variable for auth - Hide image functionality behind `EXO_ENABLE_IMAGE_MODELS=true` for now ## Why It Works Users are presented with actionable feedback when the issue occurs ## Test Plan ### Manual Testing Confirmed loading black-forest-labs model in UI presents the issue in the UI. Confirmed both `hf auth login` and setting `HF_TOKEN` resolve the issue	2026-01-22 20:39:53 +00:00
Evan	ac6efa747b	add kimi tool parsing this patches the kimi tokenizer to add tool calling - it can be reverted once upstream support is added for kimi-k2	2026-01-22 11:49:25 +00:00
Evan	2e3c33db6d	implement mlx-lm tool calling splits up the runner's generation chunks into tool calls, tokens and errors, and writes tool call chunks when the upstream parser detects them.	2026-01-22 11:49:25 +00:00
rltakashige	fc8e6ad06b	Reduce download log spam (#1249 )	2026-01-22 11:28:36 +00:00
Alex Cheema	023108a19d	Disable image model cards temporarily (#1247 ) ## Motivation Image generation feature is not stable and causing issues for users. Fixes #1242 ## Changes - Commented out image model cards (flux1-schnell, flux1-dev, qwen-image, qwen-image-edit-2509) in `src/exo/shared/models/model_cards.py` - Added reference to issue #1242 in the comment explaining why they are disabled ## Why It Works By commenting out the model cards, these image models will no longer appear in the model list, preventing users from attempting to use the unstable feature until it is stabilized. ## Test Plan ### Manual Testing - Run exo and verify image models no longer appear in the model list ### Automated Testing - No changes to automated tests needed - this simply removes models from the available list Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 22:39:59 +00:00
Jake Hillion	c9818c30b4	dashboard: show model total size on downloads page for pending downloads The downloads page showed "0B / 0B" for models that haven't started downloading yet because the download progress data only gets populated after the file list is fetched from HuggingFace. Added a fetch to the /models API endpoint on page mount and created a helper function that falls back to storage_size_megabytes when the download's totalBytes is 0. This allows users to see the actual model size (e.g., "0 / 25GB") before a download begins, which is helpful for a future feature that lets you download models explicitly. Test plan: - Deployed to a cluster, the previous 0B now show sensible values.	2026-01-21 21:53:54 +00:00
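The fallback described in c9818c30b4 can be sketched as below. The dashboard helper itself is not shown in this log and presumably lives in the Svelte/TypeScript dashboard code, so this Python version only illustrates the logic. The function name `display_total_bytes` and the 1024 * 1024 bytes-per-megabyte conversion are assumptions.

```python
from typing import Optional

# Assumption: a "megabyte" here is 1024 * 1024 bytes; the real helper may use 10**6.
BYTES_PER_MEGABYTE = 1024 * 1024


def display_total_bytes(total_bytes: int, storage_size_megabytes: Optional[int]) -> int:
    """Pick the size to show for a download row (hypothetical helper)."""
    # Once the file list has been fetched, the real total is known: use it.
    if total_bytes > 0:
        return total_bytes
    # Before the download starts, fall back to the model's advertised size.
    if storage_size_megabytes:
        return storage_size_megabytes * BYTES_PER_MEGABYTE
    # Nothing known yet.
    return 0
```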
Alex Cheema	8f6726d6be	Fix config.json download errors for image models (#1245 ) ## Motivation When `get_shard_download_status()` runs, it iterates over all models in `MODEL_CARDS` and calls `build_full_shard()` → `build_base_shard()` → `ModelCard.from_hf()`. This unconditionally tried to download `config.json` from HuggingFace, but image models (FLUX, Qwen-Image) don't have a root-level config.json file, causing errors: ``` Error downloading shard: File not found: https://huggingface.co/black-forest-labs/FLUX.1-dev/resolve/main/config.json Error downloading shard: File not found: https://huggingface.co/black-forest-labs/FLUX.1-schnell/resolve/main/config.json Error downloading shard: File not found: https://huggingface.co/Qwen/Qwen-Image/resolve/main/config.json Error downloading shard: File not found: https://huggingface.co/Qwen/Qwen-Image-Edit-2509/resolve/main/config.json ``` ## Changes ### ModelCard.load() fix - `build_base_shard()` now uses `ModelCard.load()` instead of `ModelCard.from_hf()` - `ModelCard.load()` iterates through `MODEL_CARDS.values()` to find a match by `model_id` ### exo-bench fixes - Use `name` field instead of `id` for model resolution - Pass `full_model_id` to `/instance/previews` endpoint - Make model name matching case-insensitive - Update README example model name ## Why It Works `MODEL_CARDS` uses short names as keys (e.g., `"flux1-schnell"`) but the `model_id` values are HuggingFace paths (e.g., `"black-forest-labs/FLUX.1-schnell"`). When `ModelCard.load()` was called with the HF path, it didn't match any key and fell back to `from_hf()` which tried to download config.json. The fix iterates through `MODEL_CARDS.values()` to find a match by `model_id`, ensuring predefined models (including image models) use their registry entries directly without network calls. A key lookup is unnecessary since `load()` is always called with HF paths which don't match the short-name keys. 
## Test Plan ### Manual Testing - Run exo and verify no more "Error downloading shard: File not found: .../config.json" errors for image models - Run exo-bench and verify model resolution works correctly ### Automated Testing - `uv run basedpyright` - passes with 0 errors - `uv run pytest` - all tests pass 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 21:30:48 +00:00
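The lookup described in 8f6726d6be (match on each card's `model_id` rather than on the short-name registry key) can be sketched as follows. The `ModelCard` class here is a trimmed-down stand-in, and `find_predefined_card` is a hypothetical name; the actual `ModelCard.load()` body is not shown in this log.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelCard:
    # Hypothetical stand-in: the real ModelCard carries more fields.
    model_id: str


# Keys are short names; values carry the HuggingFace path, as the commit describes.
MODEL_CARDS = {
    "flux1-schnell": ModelCard(model_id="black-forest-labs/FLUX.1-schnell"),
    "flux1-dev": ModelCard(model_id="black-forest-labs/FLUX.1-dev"),
}


def find_predefined_card(model_id: str) -> Optional[ModelCard]:
    """Return the registry entry whose model_id matches, or None."""
    # Scan the values: callers pass HF paths, which never equal the short-name keys.
    for card in MODEL_CARDS.values():
        if card.model_id == model_id:
            return card
    # No predefined card; a caller could fall back to a network fetch here.
    return None
```

Returning `None` instead of fetching keeps the sketch free of network calls; per the commit, the real code falls back to `from_hf()` only when no predefined card matches.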
rltakashige	ede779219c	Reduce log spam (#1241 ) ## Motivation Way too much spam. Some logs were also obsolete, leading to users thinking there was something wrong during expected behaviour.	2026-01-21 19:08:30 +00:00
Jake Hillion	a7e205e489	treefmt: add Svelte file formatting Package prettier with Svelte support and add it to treefmt-nix to format the dashboard. This change is brutal, I spent a long time trying to get it nicer but it doesn't seem there's a good way to make this minimal. Sorry for the noise! This will make it easier for new contributors to get the formatting right first time. Also removes the `.prettierrc` because it turns out treefmt-nix was ignoring it. Test plan: - CI	2026-01-21 18:51:55 +00:00
rltakashige	a354aaa3e5	Fix tests broken in recent commits (#1239 ) We'll have good CI soon... ## Test Plan ### Automated Testing Works	2026-01-21 18:32:49 +00:00
ciaranbor	307f454b96	feat: initial image generation support (#1095 ) ## Motivation Enable distributed image generation across exo clusters ## Changes - Added OpenAI-compatible /v1/images/generations and /v1/images/edits API endpoints - Added /bench/images/generations and /bench/images/edits endpoints that return generation statistics (timing, throughput metrics) - Implemented PipeFusion distributed inference for diffusion models, enabling patch-based parallelism across nodes - Added model adapters for Flux (schnell, dev) and Qwen image models ## Why It Works https://arxiv.org/abs/2405.14430 ## Test Plan ### Manual Testing - Generate images using /v1/images/generations endpoint with single and multi-node clusters - Test image editing via /v1/images/edits with source images - Verify streaming partial images appear progressively in the dashboard - Use /bench/images/generations to measure generation performance - Test both Flux and Qwen model families --------- Co-authored-by: Sami Khan <smsak99@gmail.com>	2026-01-21 18:21:58 +00:00
rltakashige	a31b6ee045	Import download utils once all modules are loaded (#1238 ) ## Motivation Test failed due to circular import ## Test Plan ### Manual Testing Tried importing and calling the functions, worked fine. ### Automated Testing Tests pass again	2026-01-21 17:58:06 +00:00

1986 Commits