Update nix flake inputs. Add a second input for `swift-format`, since Swift is
currently broken in nixpkgs on Linux and we want `nix fmt` to remain
reproducible everywhere.
## Motivation
GPT-OSS did not previously support tensor sharding.
## Changes
Add GPT-OSS sharding support in tensor_auto_parallel.
The code is mostly @rltakashige's.
## Test Plan
### Manual Testing
Tested GPT-OSS. MLX Fast Sync causes issues with Tensor RDMA; this is a general problem at the moment.
Preparing to add a flake-parts module for Rust builds. The flake-utils
library doesn't support the module system needed for cleanly separating
the Rust build configuration.
Converted from flake-utils to flake-parts, switching to the treefmt-nix
flakeModule import pattern. The devShell and formatter outputs remain
functionally equivalent.
Test plan:
- Ran `nix flake check` successfully
- Verified `nix develop` provides the same environment
## Motivation
Add documentation to help AI coding agents (Claude Code, Cursor, GitHub
Copilot, etc.) understand the exo codebase and contribute effectively.
## Changes
- Add `AGENTS.md` with guidance for AI agents working on the codebase
- Add symlink `CLAUDE.md -> AGENTS.md` for backwards compatibility with
Claude Code
## Why It Works
`AGENTS.md` is becoming a standard convention for AI agent instructions.
The symlink ensures Claude Code (which looks for `CLAUDE.md`) continues
to work while supporting the broader `AGENTS.md` convention.
## Test Plan
### Manual Testing
- Verified symlink works correctly
### Automated Testing
- N/A (documentation only)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Motivation
Upgrade mlx-lm to version 0.30.2 which requires transformers 5.0.0rc2 as
a prerelease dependency. This enables support for newer models like Kimi
K2 Thinking while maintaining compatibility with existing models.
The transformers 5.x release includes breaking changes that affect
custom tokenizers like Kimi's TikTokenTokenizer, requiring compatibility
fixes.
## Changes
### Core Changes
- **mlx-lm upgrade**: Bump to 0.30.2 with locked exact versions for
mlx/mlx-lm to prevent breaking changes
- **transformers 5.x compatibility**: Enable prerelease transformers
dependency
### Kimi K2 Tokenizer Fixes
- Add `bytes_to_unicode` monkey-patch to restore function moved in
transformers 5.0.0rc2
- Load `TikTokenTokenizer` directly instead of via `AutoTokenizer` to
bypass transformers 5.x bug with `auto_map` fallback
- Patch `encode()` to use tiktoken directly with `allowed_special="all"`
to handle special tokens from chat templates
### Other Changes
- Dashboard: Show disk usage for completed model downloads
- CI: Add `workflow_dispatch` trigger to build-app workflow
- Docs: Add basic API documentation
### Testing
- Add comprehensive tokenizer unit tests for all supported models
- Tests verify encode/decode, special token handling, and chat template
encoding
## Why It Works
**bytes_to_unicode issue**: transformers 5.0.0rc2 moved
`bytes_to_unicode` from `transformers.models.gpt2.tokenization_gpt2` to
`transformers.convert_slow_tokenizer`. Kimi's `tokenization_kimi.py`
imports from the old location. The monkey-patch restores it at module
load time.
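As a minimal sketch of that shim (the exact patch location in exo may differ), the idea is to re-export the function at its pre-5.0 location before the Kimi tokenizer module is imported:

```python
# Sketch only, not the exact exo code: restore bytes_to_unicode at its old
# location so Kimi's tokenization_kimi.py can still import it.
import transformers.models.gpt2.tokenization_gpt2 as gpt2_tokenization

if not hasattr(gpt2_tokenization, "bytes_to_unicode"):
    # transformers 5.0.0rc2 moved the function here:
    from transformers.convert_slow_tokenizer import bytes_to_unicode

    gpt2_tokenization.bytes_to_unicode = bytes_to_unicode
```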
**AutoTokenizer issue**: transformers 5.x has a bug where
`tokenizer_class_from_name('TikTokenTokenizer')` returns `None` for
custom tokenizers with `auto_map`. Loading the tokenizer directly
bypasses this.
**encode() issue**: transformers 5.x's `pad()` method fails for slow
tokenizers. Using tiktoken's encode directly with
`allowed_special="all"` avoids this path and properly handles special
tokens like `<|im_user|>` from chat templates.
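A rough sketch of that encode() override (names are illustrative; the real patch lives in exo's tokenizer loading code):

```python
import tiktoken


def make_direct_encode(enc: tiktoken.Encoding):
    """Build an encode() that calls tiktoken directly, sidestepping the
    transformers 5.x slow-tokenizer pad() path described above."""

    def encode(text: str) -> list[int]:
        # allowed_special="all" lets chat-template special tokens such as
        # <|im_user|> be encoded as single tokens instead of raising.
        return enc.encode(text, allowed_special="all")

    return encode
```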
## Test Plan
### Manual Testing
- Hardware: 2x Mac Studios connected via Thunderbolt 5 (mike22 and
james21)
- Tested Kimi K2 Thinking, GPT-OSS-120B, GPT-OSS-20B, Llama-3.1-8B-bf16, and Qwen3-30B-A3B-8bit models with pipeline parallelism across both
nodes
- Verified warmup inference completes successfully
- Verified chat completions work with special tokens
### Automated Testing
- Added `test_tokenizers.py` with 31 tests covering:
- Basic encode/decode for all model families (deepseek, kimi, llama,
qwen, gpt-oss, glm)
- Special token encoding (critical for chat templates)
- Chat template application and encoding
- Kimi-specific and GLM-specific edge cases
- All tests pass: `uv run pytest
src/exo/worker/tests/unittests/test_mlx/test_tokenizers.py`
### Failing Tests
RDMA with all models.
---------
Co-authored-by: Evan <evanev7@gmail.com>
The CI was only running `nix flake check` on ubuntu-latest, missing
builds for other platforms and not caching packages or devShells.
Added a matrix-based `nix-build` job that runs on macos-26 (aarch64-darwin),
ubuntu-latest (x86_64-linux), and ubuntu-24.04-arm (aarch64-linux). Each
job enumerates all packages and devShells via `nix flake show --json`,
builds them in a single `nix build` call for parallelization, then runs
`nix flake check`. The cachix-action pushes all built outputs automatically.
This ensures all Nix outputs are built and cached for every supported
platform, speeding up local development and CI runs.
Test plan:
- Tested jq enumeration command locally, correctly outputs devShell paths
- Verified xargs pipeline works with the enumerated outputs
Enable cachix and push to it in the pipeline.yml workflow. This won't
cache a huge amount yet, but it will automatically extend our caching as
we build more of the repo with Nix in CI. Local users can also benefit
by trusting our cache, which speeds up local builds.
Test plan:
- CI
The downloads dashboard showed "Completed" for finished model downloads
but gave no indication of how much disk space each model, or all the
models on a node combined, was using.
Added total_bytes field to DownloadCompleted type so the size is
preserved when a download completes. Updated the dashboard to display
the model size next to "Completed" status (e.g., "Completed (251.1GB)")
and a total disk usage line below the model count for each node (e.g.,
"502.2GB on disk").
Test plan:
- Ran unit tests for download apply and planning logic
- Type checked all modified files with basedpyright
The build-app workflow is the most convenient way to get a DMG for
testing, but it's currently a bit limited: you have to push to test-app
every time, which is far from ideal and requires a bit too much force
pushing for my liking.
Add the workflow_dispatch trigger. This adds a button in the actions UI
to trigger a workflow for a named branch, which means you can use your
normal dev branch instead of having to push to test-app. We'll leave
that behaviour there for now too, though it may change in future.
Filter on `"${{ github.event_name }}" == "workflow_dispatch"` and set
those to alpha as well. Will verify by pushing the first version from
`main` just in case. Unfortunately we do have to merge this before we
can test it.
Test plan:
- Looking really hard.
It's useful to be able to invoke treefmt directly, for example from
tools like `jj fix`. Expose it in the devshell.
Test plan:
- Used with `jj fix` on a large branch. It worked.
## Motivation
https://github.com/exo-explore/exo/issues/1075
## Changes
- Added in-app "Uninstall" option under Advanced menu that cleanly
removes all system components
- Added NetworkSetupHelper.uninstall() to remove LaunchDaemon, scripts,
logs, and restore network settings
- Added LaunchAtLoginHelper.disable() to unregister from login items
- Created standalone uninstall-exo.sh script for users who already
deleted the app
- Added uninstall documentation to README
<img width="386" height="577" alt="image"
src="https://github.com/user-attachments/assets/6bbcd18a-992a-409d-8791-ed5e13bbcfe0"
/>
<img width="372" height="432" alt="image"
src="https://github.com/user-attachments/assets/ee76b45d-c111-4807-ab28-3f2f20e01140"
/>
## Why It Works
The in-app uninstaller runs a privileged shell script (via AppleScript)
to launchctl bootout the daemon, remove files, and restore the
"Automatic" network location. The standalone script provides the same
cleanup for users who already deleted the app.
## Test Plan
### Manual Testing
Hardware: MacBook Pro
- Built and ran app, verified LaunchDaemon and network location were
created
- Used in-app Uninstall, verified all components removed and network
restored to Automatic
- Rebuilt app, quit normally, ran sudo ./uninstall-exo.sh, verified same
cleanup
### Automated Testing
N/A
---------
Co-authored-by: Evan <evanev7@gmail.com>
We did not properly forward tools to the chat template before. This is not a full tool-calling implementation, but it should improve things slightly.
## Changes made
Pass tools to the Hugging Face tokenizer's chat template, as sketched below.
Join consecutive message chunks into a single message (opencode sometimes sends these; we were ignoring them before).
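Roughly, the first change amounts to forwarding the request's tools into `apply_chat_template`. The model id and surrounding code below are illustrative, not the actual exo call site:

```python
from transformers import AutoTokenizer

# Placeholder model id; in exo this is whatever tokenizer the instance loaded.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-0.6B")

messages = [{"role": "user", "content": "What's the weather in Paris?"}]
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city",
        "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
    },
}]

# Previously `tools` was dropped; now it is passed through so the chat
# template can render the tool schemas into the prompt.
prompt = tokenizer.apply_chat_template(
    messages, tools=tools, add_generation_prompt=True, tokenize=False
)
```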
## Future work
We need to parse the model output and normalise the return format to be compatible with the openai api.
we have a lot of dependencies we have no intent of using. kill them with
fire!
## testing
exo still launches and does the worst inference known to man on my Qwen3
instance. tests pass too!!
## Motivation
The "UNKNOWN" status shown when first launching an instance is confusing
and unhelpful. "PREPARING" better describes what's actually happening.

## Changes
- Renamed status from "UNKNOWN" to "PREPARING" in dashboard
(+page.svelte)
- Renamed unknown state to preparing in macOS app
(InstanceViewModel.swift, InstanceRowView.swift)
## Why It Works
The status appears when an instance exists but runners haven't reported
status yet. "PREPARING" accurately describes this transitional state.
## Test Plan
### Manual Testing
Hardware: MacBook Pro
<img width="319" height="200" alt="image"
src="https://github.com/user-attachments/assets/9a1c3caf-026d-47ea-80d1-63c6e41d93aa"
/>
### Automated Testing
N/A
## Motivation
## Changes
## Why It Works
## Test Plan
### Manual Testing
### Automated Testing
---------
Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: Ryuichi Leo Takashige <leo@exolabs.net>
## Motivation
Previously we hardcoded AWS credentials into the app.
This is not good practice.
## Changes
Use presigned URLs instead.
## Why It Works
Presigned URLs are an S3 feature for exactly this kind of thing: an
expiring URL scoped to specific permissions. In this case we generate a
presigned URL with `s3:PutObject` permission that expires after 5
minutes. The client uses this presigned URL to upload a bug report
instead of signing a request with its own credentials. This also
simplifies a lot of the Swift code.
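A rough sketch of the flow, with made-up bucket and key names (the real URL is generated server-side and handed to the app; the app's upload code itself is Swift, not Python):

```python
import boto3
import requests

# Server side: mint a presigned PUT URL scoped to s3:PutObject, valid 5 minutes.
s3 = boto3.client("s3")
upload_url = s3.generate_presigned_url(
    "put_object",
    Params={"Bucket": "example-bug-reports", "Key": "report-1234.zip"},
    ExpiresIn=300,
)

# Client side: upload with a plain HTTP PUT, no AWS credentials required.
with open("report-1234.zip", "rb") as f:
    requests.put(upload_url, data=f).raise_for_status()
```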
## Test Plan
### Manual Testing
On a single MacBook, I downloaded the app and sent a bug report. It
worked and appeared in the bucket.
## Motivation
Often users are running into issues with RDMA. See
https://github.com/exo-explore/exo/issues?q=is%3Aissue%20rdma
Having some debug info in the macOS app will help to debug these issues.
## Changes
Displays output of the following commands in the debug info section of
the macOS app:
1. `rdma_ctl status`
2. `ibv_devices`
3. `ibv_devinfo`
## Why It Works
It displays RDMA debug info in the debug info section of the macOS app.
## Test Plan
### Manual Testing
We need to make a new build of the macOS app and check the output under
the following conditions:
1. No RDMA enabled.
2. RDMA enabled but no devices connected over TB5.
3. RDMA enabled and devices connected over TB5.
Add TypeScript auto-formatting with Prettier and treefmt-nix. Added a
.prettierrc that enables useTabs, which isn't the default, to reduce
churn. The rest looks okay and will be checked by CI.
Test plan:
- CI
Swift code currently has no auto-formatting. Add `swift-format` to the
`treefmt-nix` config to get it formatted.
As our existing Swift code uses 4-space indentation instead of the
default 2-space, this also adds a custom `.swift-format` config.
Test plan:
- CI
## Motivation
Testing multiple devices simultaneously requires coordination, and we
don't necessarily want to run a full EXO to test single components. We
need a mid-scale integration testing framework for distributed tests.
## Changes
Add a simple Python server + bash query that runs Jaccl and Ring tests
without constructing a worker, master, or networking stack. The query
currently relies on all devices being accessible over Tailscale.
## Test Plan
Manually tested RDMA + Ring inference on 2 nodes.
## Motivation
Fixed a crash we found
## Changes
Wrap the call in try/except and return None on exception instead of crashing exo.
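The pattern, in generic form (names here are placeholders, not the actual exo call site):

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def try_or_none(fn: Callable[[], T]) -> Optional[T]:
    """Run fn and return None on any exception instead of crashing exo."""
    try:
        return fn()
    except Exception:
        return None
```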
## Test Plan
### Manual Testing
Exo launches. Couldn't repro the original case where this arose.
## Motivation
After machine restart, macOS local network permission can appear enabled
in System Settings but not actually work. EXO fails to discover other
machines, and the only fix is manually toggling the permission off/on
and relaunching. Users had no way to know this was happening.
## Changes
- Added LocalNetworkChecker service that detects if local network access
is actually functional
- Added warning banner with instructions and "Open Settings" button when
blocked
- Added NSLocalNetworkUsageDescription and NSBonjourServices to
Info.plist (required by macOS)
<img width="386" height="712" alt="image"
src="https://github.com/user-attachments/assets/c6fc873d-2c6a-4c9b-89cb-f7bc7322e25b"
/>
## Why It Works
Uses NWConnection to UDP multicast address 224.0.0.251:5353 (mDNS),
which is subject to the app's actual TCC permission state. Other
approaches (NWBrowser, dns-sd subprocess) either require additional
entitlements or run with their own permissions, giving false results.
## Test Plan
### Manual Testing
Hardware: MacBook Pro
- Toggle local network OFF in System Settings → warning banner appears
- Toggle local network ON → warning disappears
- Verified detection correctly reflects actual permission state
### Automated Testing
N/A
## Motivation
This PR implements benchmarking in the style of llama-bench. The main
difficulty is that exo is not a library: it exposes an endpoint, which
means benchmark numbers will be inaccurate if they are measured through
the API.
The solution assumes nodes are set up with `uv run exo` (or via the
app), and then hits the new `/bench/chat/completions` endpoint to
retrieve generation statistics directly from mlx_lm.
This will allow us to release benchmarks for models and perform
regression tests.
TODO: Performance benchmarking.
## Changes
- Adds the /bench/chat/completions endpoint
- Adds BenchChatCompletion/Response
- Adds a logits processor to prevent the response from ending early (see
the sketch after this list)
- Adds a "prompt sizer" that downloads the tokenizer and dynamically
builds a prompt of repeated "a" tokens to fit the desired prompt size
- Reduces the prefill step size to 2048 for now (in future, this value
will be adjusted dynamically)
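A rough sketch of the two less obvious pieces, the EOS-suppressing logits processor and the prompt sizer. This is shown with numpy-style arrays and illustrative names; the real code hooks into mlx_lm's generation:

```python
import numpy as np


def make_suppress_eos(eos_token_id: int):
    """Logits processor: force the EOS logit to -inf so generation cannot
    stop early and the benchmark always emits the requested token count."""

    def suppress_eos(tokens: np.ndarray, logits: np.ndarray) -> np.ndarray:
        logits[..., eos_token_id] = -np.inf
        return logits

    return suppress_eos


def size_prompt(tokenizer, target_tokens: int) -> str:
    """Prompt sizer: repeat "a" and trim until the tokenized prompt fits the
    desired prompt size (a simple linear trim; the real adjustment may differ)."""
    words = ["a"] * target_tokens
    while words and len(tokenizer.encode(" ".join(words))) > target_tokens:
        words.pop()
    return " ".join(words)
```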
## Test Plan
### Manual Testing
Benchmarked Llama, Qwen, DeepSeek and Kimi models. Will require several
fixes to run consistently on all configurations (to be done in the
future).
Manually tested the normal API to verify chat requests complete as
expected.
### Automated Testing
Not really possible. Type checker passes.
## Motivation
Discord link expired.
## Changes
Replace discord invite link with permanent link.
## Why It Works
It's permanent now.
## Test Plan
Clicked the link. It works.
Add an Advanced Options section with a custom namespace field that
allows users to override the EXO_LIBP2P_NAMESPACE environment variable.
This enables splitting machines that can see each other into separate
clusters.
- Added customNamespace property with UserDefaults persistence
- Added Advanced Options collapsible section with text field
- Added Save & Restart button that auto-restarts exo process
- Namespace replaces buildTag when custom value is set
- Falls back to buildTag (version) when namespace is empty
## Motivation
Workerless machines can be used for networking without running any GPU
jobs; add a CLI flag that adds this basic functionality.
## Changes
Adds the `--no-worker` CLI flag.
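Conceptually, the flag just gates worker startup. A minimal sketch with argparse (exo's actual CLI wiring differs):

```python
import argparse

parser = argparse.ArgumentParser(prog="exo")
parser.add_argument(
    "--no-worker",
    action="store_true",
    help="join the network for routing/discovery only; do not start a worker",
)
args = parser.parse_args()

# Hypothetical startup logic: networking always runs, the worker only runs
# when --no-worker was not passed.
if not args.no_worker:
    print("starting worker")  # stand-in for the real worker startup
```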
## Test Plan
### Manual Testing
Exo starts as expected
### Automated Testing
None
## Motivation
Saves the last launch settings, so that the next time you run exo it
will default to the same launch settings.
This is just a small quality of life improvement.
## Changes
When you launch an instance, the settings are saved to the browser's
local storage. When the model list is filled out, the saved settings are
read and used as the default.
I reviewed, tested, and edited the code, but some of it was written by
Claude Opus. I hope that's OK.
## Why It Works
See above
## Test Plan
### Manual Testing
I have two Mac Studio M3 Ultras, each with 512GB RAM, connected with
Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split.
I ran exo multiple times to confirm that the default works.
### Automated Testing
No changes to automated testing.
## Problem
When typing in Chinese (or other IME-based languages like
Japanese/Korean), pressing Enter to select a character from the IME
candidate list would incorrectly submit the message instead of
confirming the character selection.
## Solution
Added IME composition state detection in the `handleKeydown` function in
`ChatForm.svelte`:
- Check `event.isComposing` to detect active IME composition
- Fall back to `event.keyCode === 229` for broader browser compatibility
- Return early when IME is active, allowing normal character selection
## Changes
- Modified `dashboard/src/lib/components/ChatForm.svelte`
- Added IME composition check before Enter key handling
Co-authored-by: Ricky Chen <rickychen@Rickys-MacBook-Pro.local>
## Motivation
We added a download page to the dashboard which shows the current
download status of each model on each node. Users have reported this to
be extremely useful.
However, we don't currently fetch the download progress on start, so it
doesn't show any model's download status.
## Changes
Fetch and emit model download status on worker start, and then
periodically every 5 minutes.
To support this, I also changed download_status to be keyed by model_id
instead of by shard, since we want the download status of each model,
not each shard.
## Why It Works
The dashboard already implements the correct functionality; we just
weren't populating the download status in the state. Now it gets
populated and shows correctly.
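In simplified terms, the keying change looks like this (type and field names are illustrative, not exo's actual ones):

```python
from dataclasses import dataclass


@dataclass
class ModelDownloadStatus:
    downloaded_bytes: int = 0
    total_bytes: int = 0


# Before: keyed by shard, so one model produced several unrelated entries.
# download_status: dict[ShardId, ModelDownloadStatus]

# After: keyed by model_id, one aggregated entry per model per node.
download_status: dict[str, ModelDownloadStatus] = {}


def record_shard_progress(model_id: str, downloaded: int, total: int) -> None:
    """Fold a shard's progress into its model's single entry."""
    entry = download_status.setdefault(model_id, ModelDownloadStatus())
    entry.downloaded_bytes += downloaded
    entry.total_bytes += total
```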
## Test Plan
### Manual Testing
On a cluster of 2 x 512GB M3 Ultra Mac Studios, I launched an instance
of a model that hadn't been downloaded onto one node. I checked the
download page and it showed the in-progress download. I let the download
complete, restarted exo on both nodes, then opened the download page: it
showed that model as 100% downloaded and the models that hadn't been
downloaded as 0%.
---------
Co-authored-by: Evan <evanev7@gmail.com>
## Motivation
Certain placements are not valid, and invalid placement previews were being shown in the dashboard that would then fail when the user actually tried to launch an instance with that placement. Added filters to exclude these placements.
## Changes
Three filters added:
1. Certain models do not support tensor parallel at all. Checks `supports_tensor` on the model_meta.
2. For models that do support tensor parallelism, certain tensor parallel sizes are not valid. This check is not strictly correct yet, but it works well enough for now; the fully correct check is more involved.
3. For unknown reasons, deepseek v3.1 (8-bit) does not work with tensor parallelism.
## Why It Works
`place_instance` now raises an `Exception` for invalid placements.
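A rough sketch of how the filters fit into `place_instance` (only `supports_tensor` comes from the description above; the other names and the denylist id are made up for illustration):

```python
# Filter 3: specific models known to fail with tensor parallelism.
TENSOR_PARALLEL_DENYLIST = {"deepseek-v3.1-8bit"}  # illustrative id


def check_tensor_parallel(model_meta, tensor_parallel_size: int) -> None:
    """Raise for placements known not to work; called from place_instance."""
    if tensor_parallel_size <= 1:
        return
    # Filter 1: some models do not support tensor parallelism at all.
    if not model_meta.supports_tensor:
        raise Exception(f"{model_meta.id} does not support tensor parallelism")
    # Filter 3: explicit exclusions.
    if model_meta.id in TENSOR_PARALLEL_DENYLIST:
        raise Exception(f"{model_meta.id} is excluded from tensor parallelism")
    # Filter 2 (not shown): reject tensor parallel sizes that are invalid
    # for this particular model.
```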
## Test Plan
### Manual Testing
Since `/instance/previews` enumerates all possible placements and runs `place_instance`, I checked the dashboard to see if invalid placements are still shown.
## Motivation
We didn't have instructions for enabling RDMA on macOS.
## Changes
I added instructions for enabling RDMA on macOS.
## Why It Works
Tried it on my M4 Max MacBook Pro and it works.
## Test Plan
### Manual Testing
Tried it on my M4 Max MacBook Pro and it works.
### Automated Testing
In the future, we could automate this from fresh macOS builds using KVM
over IP. See #1030
Pipeline + MLX Ring worked with 2 nodes but failed to initialize with
3 or more nodes. The MLX ring backend requires each node to know its
specific left and right neighbors in the ring, but the previous
implementation provided a single flat host list shared by all nodes.
With 2 nodes, a flat list [host0, host1] accidentally worked because
each node could find its only neighbor. With 3+ nodes, each node needs
a customized view:
- Rank 0: [self, right_neighbor, placeholder]
- Rank 1: [left_neighbor, self, right_neighbor]
- Rank 2: [placeholder, left_neighbor, self]
Changed MlxRingInstance from `hosts: list[Host]` to
`hosts_by_node: dict[NodeId, list[Host]]` with `ephemeral_port: int`.
Added `get_mlx_ring_hosts_by_node()` which generates per-node host
lists where:
- Self position uses 0.0.0.0 for local binding
- Left/right neighbors use actual connection IPs
- Non-neighbors use 198.51.100.1 (RFC 5737 TEST-NET-2 placeholder)
Also added IP prioritization (en0 > en1 > non-Thunderbolt > any) to
prefer stable network interfaces.
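A simplified sketch of the per-node host-list construction (hosts are reduced to plain IP strings here; the real `Host` type, port handling, and IP selection differ):

```python
PLACEHOLDER_IP = "198.51.100.1"  # RFC 5737 TEST-NET-2, never routed
SELF_BIND_IP = "0.0.0.0"         # bind locally on all interfaces


def ring_hosts_by_node(
    node_ids: list[str], node_ips: dict[str, str]
) -> dict[str, list[str]]:
    """Give each rank its own view of the ring: itself as a bind address,
    its immediate neighbours as real IPs, everyone else as a placeholder."""
    n = len(node_ids)
    hosts_by_node: dict[str, list[str]] = {}
    for rank, node_id in enumerate(node_ids):
        hosts = [PLACEHOLDER_IP] * n
        hosts[rank] = SELF_BIND_IP
        if rank > 0:
            hosts[rank - 1] = node_ips[node_ids[rank - 1]]  # left neighbour
        if rank < n - 1:
            hosts[rank + 1] = node_ips[node_ids[rank + 1]]  # right neighbour
        hosts_by_node[node_id] = hosts
    return hosts_by_node
```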
Fixed topology discovery recording loopback addresses (127.0.0.1) as
valid connections to remote nodes. The reachability check now verifies
node identity via HTTP GET /node_id rather than just checking if the
port is open.
Test plan:
- Built a DMG [0]
- Installed on all Macs and started cluster.
- Requested a 3 node Pipeline + MLX Ring Llama 3.3 70B (FP16).
- It started and I was able to send a few chat messages.
Eventually my instance seemed to get into a broken state and chat
stopped working, but this commit is a clear step forward.
[0] https://github.com/exo-explore/exo/actions/runs/20473983471/job/58834969418
Kimi-K2 Thinking uses tiktoken.model for its tokenizer, which wasn't
being downloaded. This adds it to the default_patterns alongside
tokenizer.model.
I'm a bit confused why this isn't a problem for other people - I know
that others have used Kimi K2 (I wonder if they manually fixed the
download).
## Motivation
I downloaded Kimi K2 Thinking and it didn't work because the
tiktoken.model file wasn't downloaded.
## Changes
Added tiktoken.model to the default patterns.
## Why It Works
Now downloads the file.
## Test Plan
### Manual Testing
I have two Mac Studio M3 Ultras, each with 512GB RAM, connected with
Thunderbolt 5. I ran Kimi K2 Thinking with MLX Ring and Tensor Split. It
ran successfully.
### Automated Testing
No automated test changes. I don't think they are needed.
This PR updates the "Run from Source (Mac & Linux)" section in README.md
to clarify Linux instructions.
Changes include:
- Split the section into macOS and Linux subsections.
- Added native Linux package manager commands (apt, dnf, pacman) for
dependencies: uv, node, npm.
- Clarified that macmon is macOS-only.
- Noted that Homebrew on Linux is optional, with native package managers
preferred.
These changes improve clarity for Linux users and fix confusion from the
previous macOS-centric instructions.