Compare commits

...

8 Commits

Author SHA1 Message Date
Sami Khan
4e17c4acb5 feat: issue #1075 2026-01-01 08:43:47 +05:00
RickyChen / 陳昭儒
844bcc7ce6 fix: prevent form submission during IME composition (#1069)
## Problem
When typing in Chinese (or other IME-based languages like
Japanese/Korean), pressing Enter to select a character from the IME
candidate list would incorrectly submit the message instead of
confirming the character selection.

## Solution
Added IME composition state detection in the `handleKeydown` function in
`ChatForm.svelte`:
- Check `event.isComposing` to detect active IME composition
- Fall back to `event.keyCode === 229` for broader browser compatibility (browsers report keyCode 229 while an IME is processing a keystroke)
- Return early when IME composition is active, allowing normal character selection

## Changes
- Modified `dashboard/src/lib/components/ChatForm.svelte` 
- Added IME composition check before Enter key handling

Co-authored-by: Ricky Chen <rickychen@Rickys-MacBook-Pro.local>
2025-12-31 17:11:04 +00:00
Evan Quiney
c1be5184b2 Fix tests broken by 283c (#1063)
Some tests were broken by #1058 and #1046 - this fixes them.
2025-12-31 01:53:55 +00:00
Alex Cheema
1ec550dff1 Emit download progress on start, and change downloads to be keyed by model_id (#1044)
## Motivation

We added a download page to the dashboard which shows the current
download status of each model on each node. Users have reported this to
be extremely useful.

However, we don't currently fetch the download progress on start, so it
doesn't show any model's download status.

## Changes

Fetch and emit model download status when the worker starts, and
periodically every 5 minutes.
To support this, I changed download_status to be keyed by model_id
instead of by shard, since we want the download status of each model,
not of each shard.

## Why It Works

The dashboard already implements the correct functionality; we just
weren't populating the download status in the state. Now it gets
populated and displays correctly.

## Test Plan

### Manual Testing
On a cluster of 2 x 512GB M3 Ultra Mac Studios, I launched an instance
of a model that hadn't been downloaded onto one node. I checked the
download page and it showed the download in progress. I let it download
to completion, restarted exo on both nodes, and then opened the download
page: it showed that model as 100% downloaded, and the models that
hadn't been downloaded as 0%.

---------

Co-authored-by: Evan <evanev7@gmail.com>
2025-12-31 01:18:10 +00:00
Alex Cheema
283c0e39e4 Placement filters for tensor parallel supports_tensor, tensor dimension and pipeline parallel deepseek v3.1 (#1058)
## Motivation

Certain placements are not valid, so filters were added to exclude them. Invalid placement previews were being shown in the dashboard, and they would then fail when the user actually tried to launch an instance with that placement.


## Changes

Three filters added:

1. Certain models do not support tensor parallelism at all. This filter checks `supports_tensor` on the model_meta.
2. For models that do support tensor parallelism, certain tensor parallel sizes are not valid. This check is not strictly correct, but it works well enough for now; the fully correct check is more involved.
3. For unknown reasons, DeepSeek V3.1 (8-bit) does not work with pipeline parallelism.

## Why It Works

`place_instance` now raises a `ValueError` for invalid placements.

## Test Plan

### Manual Testing
Since `/instance/previews` enumerates all possible placements and runs `place_instance`, I checked the dashboard to confirm that invalid placements are no longer shown.
2025-12-31 00:33:40 +00:00
Alex Cheema
35be4c55c3 prioritise mlx jaccl coordinator ip (en0 -> en1 -> non-TB5 -> other) 2025-12-31 00:10:19 +00:00
Alex Cheema
31d4cd8409 set KV_CACHE_BITS to None to disable quantized kv cache 2025-12-31 00:03:30 +00:00
Alex Cheema
8a6da58404 remove mx.set_cache_limit 2025-12-30 23:58:15 +00:00
18 changed files with 518 additions and 62 deletions

1
.gitignore vendored
View File

@@ -16,6 +16,7 @@ digest.txt
*.xcuserdatad/
**/.DS_Store
app/EXO/build/
dist/
# rust

View File

@@ -166,6 +166,24 @@ Download the latest build here: [EXO-latest.dmg](https://assets.exolabs.net/EXO-
The app will ask for permission to modify system settings and install a new Network profile. Improvements to this are being worked on.
#### Uninstalling the macOS App
The recommended way to uninstall is through the app itself: click the menu bar icon → Advanced → Uninstall. This cleanly removes all system components.
If you've already deleted the app, you can run the standalone uninstaller script:
```bash
sudo ./app/EXO/uninstall-exo.sh
```
This removes:
- Network setup LaunchDaemon
- Network configuration script
- Log files
- The "exo" network location
**Note:** You'll need to manually remove EXO from Login Items in System Settings → General → Login Items.
---
### Enabling RDMA on macOS

View File

@@ -17,9 +17,11 @@ struct ContentView: View {
@State private var deletingInstanceIDs: Set<String> = []
@State private var showAllNodes = false
@State private var showAllInstances = false
@State private var showAdvanced = false
@State private var showDebugInfo = false
@State private var bugReportInFlight = false
@State private var bugReportMessage: String?
@State private var uninstallInProgress = false
var body: some View {
VStack(alignment: .leading, spacing: 12) {
@@ -193,11 +195,7 @@ struct ContentView: View {
Divider()
.padding(.vertical, 4)
}
- controlButton(title: "Check for Updates") {
- updater.checkForUpdates()
- }
- .padding(.bottom, 8)
- debugSection
+ advancedSection
+ .padding(.bottom, 8)
controlButton(title: "Quit", tint: .secondary) {
controller.stop()
@@ -206,6 +204,33 @@ struct ContentView: View {
}
}
private var advancedSection: some View {
VStack(alignment: .leading, spacing: 6) {
HStack {
Text("Advanced")
.font(.caption)
.foregroundColor(.secondary)
Spacer()
collapseButton(isExpanded: $showAdvanced)
}
.animation(nil, value: showAdvanced)
if showAdvanced {
VStack(alignment: .leading, spacing: 2) {
HoverButton(title: "Check for Updates", small: true) {
updater.checkForUpdates()
}
debugSection
HoverButton(title: "Uninstall", tint: .red, small: true) {
showUninstallConfirmationAlert()
}
.disabled(uninstallInProgress)
}
.transition(.opacity)
}
}
.animation(.easeInOut(duration: 0.25), value: showAdvanced)
}
private func controlButton(title: String, tint: Color = .primary, action: @escaping () -> Void) -> some View {
HoverButton(title: title, tint: tint, trailingSystemImage: nil, action: action)
}
@@ -328,15 +353,15 @@ struct ContentView: View {
}
private var debugSection: some View {
- VStack(alignment: .leading, spacing: 6) {
- HStack {
- Text("Debug Info")
- .font(.caption)
- .foregroundColor(.secondary)
- Spacer()
- collapseButton(isExpanded: $showDebugInfo)
+ VStack(alignment: .leading, spacing: 4) {
+ HoverButton(
+ title: "Debug Info",
+ tint: .primary,
+ trailingSystemImage: showDebugInfo ? "chevron.up" : "chevron.down",
+ small: true
+ ) {
+ showDebugInfo.toggle()
+ }
.animation(nil, value: showDebugInfo)
if showDebugInfo {
VStack(alignment: .leading, spacing: 4) {
Text("Version: \(buildTag)")
@@ -352,6 +377,7 @@ struct ContentView: View {
sendBugReportButton
.padding(.top, 6)
}
.padding(.leading, 8)
.transition(.opacity)
}
}
@@ -447,6 +473,88 @@ struct ContentView: View {
bugReportInFlight = false
}
private func showUninstallConfirmationAlert() {
let alert = NSAlert()
alert.messageText = "Uninstall EXO"
alert.informativeText = """
This will remove EXO and all its system components:
• Network configuration daemon
• Launch at login registration
• EXO network location
The app will be moved to Trash.
"""
alert.alertStyle = .warning
alert.addButton(withTitle: "Uninstall")
alert.addButton(withTitle: "Cancel")
// Style the Uninstall button as destructive
if let uninstallButton = alert.buttons.first {
uninstallButton.hasDestructiveAction = true
}
let response = alert.runModal()
if response == .alertFirstButtonReturn {
performUninstall()
}
}
private func performUninstall() {
uninstallInProgress = true
// Stop EXO process first
controller.cancelPendingLaunch()
controller.stop()
stateService.stopPolling()
// Run the privileged uninstall on a background thread
// Using .utility QoS to avoid priority inversion with NSAppleScript's subprocess
DispatchQueue.global(qos: .utility).async {
do {
// Remove network setup daemon and components (requires admin privileges)
try NetworkSetupHelper.uninstall()
DispatchQueue.main.async {
// Unregister from launch at login
LaunchAtLoginHelper.disable()
// Move app to trash
self.moveAppToTrash()
// Quit the app
DispatchQueue.main.asyncAfter(deadline: .now() + 0.5) {
NSApplication.shared.terminate(nil)
}
}
} catch {
DispatchQueue.main.async {
self.showErrorAlert(message: error.localizedDescription)
self.uninstallInProgress = false
}
}
}
}
private func showErrorAlert(message: String) {
let alert = NSAlert()
alert.messageText = "Uninstall Failed"
alert.informativeText = message
alert.alertStyle = .critical
alert.addButton(withTitle: "OK")
alert.runModal()
}
private func moveAppToTrash() {
guard let appURL = Bundle.main.bundleURL as URL? else { return }
do {
try FileManager.default.trashItem(at: appURL, resultingItemURL: nil)
} catch {
// If we can't trash the app, that's OK - user can do it manually
// The important system components have already been cleaned up
}
}
private var buildTag: String {
Bundle.main.infoDictionary?["EXOBuildTag"] as? String ?? "unknown"
}
@@ -460,14 +568,24 @@ private struct HoverButton: View {
let title: String
let tint: Color
let trailingSystemImage: String?
let small: Bool
let action: () -> Void
init(title: String, tint: Color = .primary, trailingSystemImage: String? = nil, small: Bool = false, action: @escaping () -> Void) {
self.title = title
self.tint = tint
self.trailingSystemImage = trailingSystemImage
self.small = small
self.action = action
}
@State private var isHovering = false
var body: some View {
Button(action: action) {
HStack {
Text(title)
.font(small ? .caption : nil)
Spacer()
if let systemName = trailingSystemImage {
Image(systemName: systemName)
@@ -475,8 +593,8 @@ private struct HoverButton: View {
}
}
.frame(maxWidth: .infinity, alignment: .leading)
- .padding(.vertical, 6)
- .padding(.horizontal, 8)
+ .padding(.vertical, small ? 4 : 6)
+ .padding(.horizontal, small ? 6 : 8)
.background(
RoundedRectangle(cornerRadius: 6)
.fill(

View File

@@ -125,6 +125,22 @@ struct EXOApp: App {
}
}
/// Helper for managing EXO's launch-at-login registration
enum LaunchAtLoginHelper {
private static let logger = Logger(subsystem: "io.exo.EXO", category: "LaunchAtLogin")
/// Unregisters EXO from launching at login
static func disable() {
guard SMAppService.mainApp.status == .enabled else { return }
do {
try SMAppService.mainApp.unregister()
logger.info("Unregistered EXO from launch at login")
} catch {
logger.error("Failed to unregister EXO from launch at login: \(error.localizedDescription, privacy: .public)")
}
}
}
final class SparkleUpdater: NSObject, ObservableObject {
private let controller: SPUStandardUpdaterController
private let delegateProxy: ExoUpdaterDelegate

View File

@@ -62,7 +62,8 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
"""
static func ensureLaunchDaemonInstalled() {
- Task.detached {
+ // Use .utility priority to match NSAppleScript's internal QoS and avoid priority inversion
+ Task.detached(priority: .utility) {
do {
if daemonAlreadyInstalled() {
return
@@ -75,6 +76,63 @@ networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
}
}
/// Removes all EXO network setup components from the system.
/// This includes the LaunchDaemon, scripts, logs, and network location.
/// Requires admin privileges.
static func uninstall() throws {
let uninstallScript = makeUninstallScript()
try runShellAsAdmin(uninstallScript)
logger.info("EXO network setup components removed successfully")
}
/// Checks if there are any EXO network components installed that need cleanup
static func hasInstalledComponents() -> Bool {
let manager = FileManager.default
let scriptExists = manager.fileExists(atPath: scriptDestination)
let plistExists = manager.fileExists(atPath: plistDestination)
return scriptExists || plistExists
}
private static func makeUninstallScript() -> String {
"""
set -euo pipefail
LABEL="\(daemonLabel)"
SCRIPT_DEST="\(scriptDestination)"
PLIST_DEST="\(plistDestination)"
LOG_OUT="/var/log/\(daemonLabel).log"
LOG_ERR="/var/log/\(daemonLabel).err.log"
# Unload the LaunchDaemon if running
launchctl bootout system/"$LABEL" 2>/dev/null || true
# Remove LaunchDaemon plist
rm -f "$PLIST_DEST"
# Remove the script and parent directory if empty
rm -f "$SCRIPT_DEST"
rmdir "$(dirname "$SCRIPT_DEST")" 2>/dev/null || true
# Remove log files
rm -f "$LOG_OUT" "$LOG_ERR"
# Switch back to Automatic network location
networksetup -switchtolocation Automatic 2>/dev/null || true
# Delete the exo network location if it exists
networksetup -listlocations | grep -q '^exo$' && {
networksetup -deletelocation exo 2>/dev/null || true
} || true
# Re-enable Thunderbolt Bridge if it exists
networksetup -listnetworkservices | grep -q "Thunderbolt Bridge" && {
networksetup -setnetworkserviceenabled "Thunderbolt Bridge" on 2>/dev/null || true
} || true
echo "EXO network components removed successfully"
"""
}
private static func daemonAlreadyInstalled() -> Bool {
let manager = FileManager.default
let scriptExists = manager.fileExists(atPath: scriptDestination)

154
app/EXO/uninstall-exo.sh Executable file
View File

@@ -0,0 +1,154 @@
#!/usr/bin/env bash
#
# EXO Uninstaller Script
#
# This script removes all EXO system components that persist after deleting the app.
# Run with: sudo ./uninstall-exo.sh
#
# Components removed:
# - LaunchDaemon: /Library/LaunchDaemons/io.exo.networksetup.plist
# - Network script: /Library/Application Support/EXO/
# - Log files: /var/log/io.exo.networksetup.*
# - Network location: "exo"
# - Launch at login registration
#
set -euo pipefail
LABEL="io.exo.networksetup"
SCRIPT_DEST="/Library/Application Support/EXO/disable_bridge_enable_dhcp.sh"
PLIST_DEST="/Library/LaunchDaemons/io.exo.networksetup.plist"
LOG_OUT="/var/log/${LABEL}.log"
LOG_ERR="/var/log/${LABEL}.err.log"
APP_BUNDLE_ID="io.exo.EXO"
# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m' # No Color
echo_info() {
echo -e "${GREEN}[INFO]${NC} $1"
}
echo_warn() {
echo -e "${YELLOW}[WARN]${NC} $1"
}
echo_error() {
echo -e "${RED}[ERROR]${NC} $1"
}
# Check if running as root
if [[ $EUID -ne 0 ]]; then
echo_error "This script must be run as root (use sudo)"
exit 1
fi
echo ""
echo "========================================"
echo " EXO Uninstaller"
echo "========================================"
echo ""
# Unload the LaunchDaemon if running
echo_info "Stopping network setup daemon..."
if launchctl list | grep -q "$LABEL"; then
launchctl bootout system/"$LABEL" 2>/dev/null || true
echo_info "Daemon stopped"
else
echo_warn "Daemon was not running"
fi
# Remove LaunchDaemon plist
if [[ -f "$PLIST_DEST" ]]; then
rm -f "$PLIST_DEST"
echo_info "Removed LaunchDaemon plist"
else
echo_warn "LaunchDaemon plist not found (already removed?)"
fi
# Remove the script and parent directory
if [[ -f "$SCRIPT_DEST" ]]; then
rm -f "$SCRIPT_DEST"
echo_info "Removed network setup script"
else
echo_warn "Network setup script not found (already removed?)"
fi
# Remove EXO directory if empty
if [[ -d "/Library/Application Support/EXO" ]]; then
rmdir "/Library/Application Support/EXO" 2>/dev/null && \
echo_info "Removed EXO support directory" || \
echo_warn "EXO support directory not empty, leaving in place"
fi
# Remove log files
if [[ -f "$LOG_OUT" ]] || [[ -f "$LOG_ERR" ]]; then
rm -f "$LOG_OUT" "$LOG_ERR"
echo_info "Removed log files"
else
echo_warn "Log files not found (already removed?)"
fi
# Switch back to Automatic network location
echo_info "Restoring network configuration..."
if networksetup -listlocations | grep -q "^Automatic$"; then
networksetup -switchtolocation Automatic 2>/dev/null || true
echo_info "Switched to Automatic network location"
else
echo_warn "Automatic network location not found"
fi
# Delete the exo network location if it exists
if networksetup -listlocations | grep -q "^exo$"; then
networksetup -deletelocation exo 2>/dev/null || true
echo_info "Deleted 'exo' network location"
else
echo_warn "'exo' network location not found (already removed?)"
fi
# Re-enable Thunderbolt Bridge if it exists
if networksetup -listnetworkservices 2>/dev/null | grep -q "Thunderbolt Bridge"; then
networksetup -setnetworkserviceenabled "Thunderbolt Bridge" on 2>/dev/null || true
echo_info "Re-enabled Thunderbolt Bridge"
fi
# Note about launch at login registration
# SMAppService-based login items cannot be removed from a shell script.
# They can only be unregistered from within the app itself or manually via System Settings.
echo_warn "Launch at login must be removed manually:"
echo_warn " System Settings → General → Login Items → Remove EXO"
# Check if EXO.app exists in common locations
APP_FOUND=false
for app_path in "/Applications/EXO.app" "$HOME/Applications/EXO.app"; do
if [[ -d "$app_path" ]]; then
if [[ "$APP_FOUND" == false ]]; then
echo ""
APP_FOUND=true
fi
echo_warn "EXO.app found at: $app_path"
echo_warn "You may want to move it to Trash manually."
fi
done
echo ""
echo "========================================"
echo_info "EXO uninstall complete!"
echo "========================================"
echo ""
echo "The following have been removed:"
echo " • Network setup LaunchDaemon"
echo " • Network configuration script"
echo " • Log files"
echo " • 'exo' network location"
echo ""
echo "Your network has been restored to use the 'Automatic' location."
echo "Thunderbolt Bridge has been re-enabled (if present)."
echo ""
echo "Manual step required:"
echo " Remove EXO from Login Items in System Settings → General → Login Items"
echo ""

View File

@@ -139,6 +139,11 @@
}
function handleKeydown(event: KeyboardEvent) {
// Prevent form submission during IME composition (e.g., Chinese, Japanese, Korean input)
if (event.isComposing || event.keyCode === 229) {
return;
}
if (event.key === 'Enter' && !event.shiftKey) {
event.preventDefault();
handleSubmit();

View File

@@ -21,6 +21,7 @@ from exo.shared.types.commands import (
)
from exo.shared.types.events import Event, InstanceCreated, InstanceDeleted
from exo.shared.types.memory import Memory
from exo.shared.types.models import ModelId
from exo.shared.types.topology import NodeInfo
from exo.shared.types.worker.instances import (
Instance,
@@ -29,6 +30,7 @@ from exo.shared.types.worker.instances import (
MlxJacclInstance,
MlxRingInstance,
)
from exo.shared.types.worker.shards import Sharding
def random_ephemeral_port() -> int:
@@ -65,6 +67,28 @@ def place_instance(
if not cycles_with_sufficient_memory:
raise ValueError("No cycles found with sufficient memory")
if command.sharding == Sharding.Tensor:
if not command.model_meta.supports_tensor:
raise ValueError(
f"Requested Tensor sharding but this model does not support tensor parallelism: {command.model_meta.model_id}"
)
# TODO: the condition here for tensor parallel is not correct, but it works good enough for now.
cycles_with_sufficient_memory = [
cycle
for cycle in cycles_with_sufficient_memory
if command.model_meta.hidden_size % len(cycle) == 0
]
if not cycles_with_sufficient_memory:
raise ValueError(
f"No tensor sharding found for model with hidden_size {command.model_meta.hidden_size} candidate cycles"
)
if command.sharding == Sharding.Pipeline and command.model_meta.model_id == ModelId(
"mlx-community/DeepSeek-V3.1-8bit"
):
raise ValueError(
"Pipeline parallelism is not supported for DeepSeek V3.1 (8-bit)"
)
smallest_cycles = get_smallest_cycles(cycles_with_sufficient_memory)
smallest_tb_cycles = [
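For context on how these `ValueError`s surface: `/instance/previews` enumerates candidate placements and runs `place_instance` on each (see the test plan for #1058 above), so invalid placements can be dropped by catching the error. A minimal sketch, with assumed names and signature rather than the actual exo handler:

```python
# Hypothetical preview loop (names and signature assumed): placements whose
# place_instance call raises are excluded from the dashboard previews.
def enumerate_valid_previews(candidate_commands):
    previews = []
    for command in candidate_commands:
        try:
            previews.append(place_instance(command))  # signature assumed
        except ValueError:
            continue  # invalid placement: filtered out of the preview list
    return previews
```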

View File

@@ -385,13 +385,14 @@ def get_mlx_jaccl_coordinators(
address in format "X.X.X.X:PORT" per node.
"""
rank_0_node = selected_cycle[0]
- logger.info(f"Selecting coordinator from rank 0 node: {rank_0_node.node_id}")
+ logger.debug(f"Selecting coordinator from rank 0 node: {rank_0_node.node_id}")
def get_ip_for_node(n: NodeInfo) -> str:
if n.node_id == rank_0_node.node_id:
return "0.0.0.0"
- for ip, _ in _find_connection_ip(n, rank_0_node, cycle_digraph):
+ ip = _find_ip_prioritised(n, rank_0_node, cycle_digraph)
+ if ip:
return ip
logger.warning(
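The new `_find_ip_prioritised` helper is outside this hunk. A minimal sketch of the en0 -> en1 -> non-TB5 -> other ordering from the commit message, assuming `_find_connection_ip` yields `(ip, interface_name)` pairs and that an `is_tb5` predicate exists (both assumptions; only the priority order comes from the commit):

```python
# Sketch only: choose the candidate IP whose interface ranks best in the
# en0 -> en1 -> non-TB5 -> other priority order.
def _find_ip_prioritised(n, rank_0_node, cycle_digraph) -> str | None:
    def priority(interface: str) -> int:
        if interface == "en0":
            return 0
        if interface == "en1":
            return 1
        if not is_tb5(interface):  # is_tb5: hypothetical TB5-detection helper
            return 2
        return 3

    candidates = list(_find_connection_ip(n, rank_0_node, cycle_digraph))
    if not candidates:
        return None
    ip, _interface = min(candidates, key=lambda c: priority(c[1]))
    return ip
```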

View File

@@ -50,7 +50,7 @@ def model_meta() -> ModelMetadata:
storage_size=Memory.from_kb(1000),
pretty_name="Test Model",
n_layers=10,
- hidden_size=10,
+ hidden_size=30,
supports_tensor=True,
)
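The bump from `hidden_size=10` to `hidden_size=30` presumably keeps this fixture valid under the new tensor-parallel filter in `place_instance`, which keeps only cycles whose node count divides `hidden_size`:

```python
# The divisibility check from place_instance, applied to this fixture:
assert 10 % 3 != 0  # hidden_size=10 would rule out 3-node tensor placements
assert 30 % 2 == 0 and 30 % 3 == 0  # hidden_size=30 admits 2- and 3-node cycles
```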

View File

@@ -450,6 +450,11 @@ async def get_weight_map(repo_id: str, revision: str = "main") -> dict[str, str]
async def resolve_allow_patterns(shard: ShardMetadata) -> list[str]:
# TODO: 'Smart' downloads are disabled because:
# (i) We don't handle all kinds of files;
# (ii) We don't have sticky sessions.
# (iii) Tensor parallel requires all files.
return ["*"]
try:
weight_map = await get_weight_map(str(shard.model_meta.model_id))
return get_allow_patterns(weight_map, shard)

View File

@@ -9,7 +9,7 @@ MAX_KV_SIZE: int | None = 3200
KEEP_KV_SIZE: int | None = 1600
QUANTIZE_MODEL_MODE: str | None = "affine"
CACHE_GROUP_SIZE: int = 64
- KV_CACHE_BITS: int | None = 8
+ KV_CACHE_BITS: int | None = None
# TODO: We should really make this opt-in, but Kimi requires trust_remote_code=True
TRUST_REMOTE_CODE: bool = True
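For illustration, a plausible consumer of this setting (hypothetical sketch, not the actual exo code): with `KV_CACHE_BITS = None`, the runner would fall back to mlx-lm's full-precision KV cache instead of a quantized one.

```python
# Hypothetical: a None KV_CACHE_BITS disables KV cache quantization entirely.
from mlx_lm.models.cache import KVCache, QuantizedKVCache

def make_kv_cache():
    if KV_CACHE_BITS is None:
        return KVCache()  # default full-precision cache
    return QuantizedKVCache(group_size=CACHE_GROUP_SIZE, bits=KV_CACHE_BITS)
```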

View File

@@ -395,11 +395,5 @@ def set_wired_limit_for_model(model_size: Memory):
"MB. This can be slow. See the documentation for possible work-arounds: "
"https://github.com/ml-explore/mlx-lm/tree/main#large-models"
)
- kv_bytes = int(0.02 * model_bytes)
- target_cache = int(1.10 * (model_bytes + kv_bytes))
- target_cache = min(target_cache, max_rec_size)
- mx.set_cache_limit(target_cache)
mx.set_wired_limit(max_rec_size)
- logger.info(
- f"Wired limit set to {max_rec_size}. Cache limit set to {target_cache}."
- )
+ logger.info(f"Wired limit set to {max_rec_size}.")

View File

@@ -23,6 +23,7 @@ from exo.shared.types.events import (
TopologyEdgeCreated,
TopologyEdgeDeleted,
)
from exo.shared.types.models import ModelId
from exo.shared.types.multiaddr import Multiaddr
from exo.shared.types.profiling import MemoryPerformanceProfile, NodePerformanceProfile
from exo.shared.types.state import State
@@ -83,7 +84,7 @@ class Worker:
self.out_for_delivery: dict[EventId, ForwarderEvent] = {}
self.state: State = State()
- self.download_status: dict[ShardMetadata, DownloadProgress] = {}
+ self.download_status: dict[ModelId, DownloadProgress] = {}
self.runners: dict[RunnerId, RunnerSupervisor] = {}
self._tg: TaskGroup | None = None
@@ -128,6 +129,7 @@ class Worker:
tg.start_soon(start_polling_node_metrics, resource_monitor_callback)
tg.start_soon(start_polling_memory_metrics, memory_monitor_callback)
tg.start_soon(self._emit_existing_download_progress)
tg.start_soon(self._connection_message_event_writer)
tg.start_soon(self._resend_out_for_delivery)
tg.start_soon(self._event_applier)
@@ -200,11 +202,11 @@ class Worker:
)
)
case DownloadModel(shard_metadata=shard):
- if shard not in self.download_status:
+ if shard.model_meta.model_id not in self.download_status:
progress = DownloadPending(
shard_metadata=shard, node_id=self.node_id
)
- self.download_status[shard] = progress
+ self.download_status[shard.model_meta.model_id] = progress
await self.event_sender.send(
NodeDownloadProgress(download_progress=progress)
)
@@ -217,7 +219,7 @@ class Worker:
progress = DownloadCompleted(
shard_metadata=shard, node_id=self.node_id
)
- self.download_status[shard] = progress
+ self.download_status[shard.model_meta.model_id] = progress
await self.event_sender.send(
NodeDownloadProgress(download_progress=progress)
)
@@ -349,7 +351,7 @@ class Worker:
initial_progress
),
)
- self.download_status[task.shard_metadata] = status
+ self.download_status[task.shard_metadata.model_meta.model_id] = status
self.event_sender.send_nowait(NodeDownloadProgress(download_progress=status))
last_progress_time = 0.0
@@ -363,7 +365,7 @@ class Worker:
nonlocal last_progress_time
if progress.status == "complete":
status = DownloadCompleted(shard_metadata=shard, node_id=self.node_id)
- self.download_status[shard] = status
+ self.download_status[shard.model_meta.model_id] = status
# Footgun!
self.event_sender.send_nowait(
NodeDownloadProgress(download_progress=status)
@@ -384,7 +386,7 @@ class Worker:
progress
),
)
- self.download_status[shard] = status
+ self.download_status[shard.model_meta.model_id] = status
self.event_sender.send_nowait(
NodeDownloadProgress(download_progress=status)
)
@@ -444,3 +446,40 @@ class Worker:
await self.event_sender.send(TopologyEdgeDeleted(edge=conn))
await anyio.sleep(10)
async def _emit_existing_download_progress(self) -> None:
try:
while True:
logger.info("Fetching and emitting existing download progress...")
async for (
_,
progress,
) in self.shard_downloader.get_shard_download_status():
if progress.status == "complete":
status = DownloadCompleted(
node_id=self.node_id, shard_metadata=progress.shard
)
elif progress.status in ["in_progress", "not_started"]:
if progress.downloaded_bytes_this_session.in_bytes == 0:
status = DownloadPending(
node_id=self.node_id, shard_metadata=progress.shard
)
else:
status = DownloadOngoing(
node_id=self.node_id,
shard_metadata=progress.shard,
download_progress=map_repo_download_progress_to_download_progress_data(
progress
),
)
else:
continue
self.download_status[progress.shard.model_meta.model_id] = status
await self.event_sender.send(
NodeDownloadProgress(download_progress=status)
)
logger.info("Done emitting existing download progress.")
await anyio.sleep(5 * 60) # 5 minutes
except Exception as e:
logger.error(f"Error emitting existing download progress: {e}")

View File

@@ -3,6 +3,7 @@
from collections.abc import Mapping, Sequence
from exo.shared.types.common import NodeId
+ from exo.shared.types.models import ModelId
from exo.shared.types.tasks import (
ChatCompletion,
ConnectToGroup,
@@ -34,7 +35,6 @@ from exo.shared.types.worker.runners import (
RunnerStatus,
RunnerWarmingUp,
)
- from exo.shared.types.worker.shards import ShardMetadata
from exo.worker.runner.runner_supervisor import RunnerSupervisor
@@ -43,7 +43,7 @@ def plan(
# Runners is expected to be FRESH and so should not come from state
runners: Mapping[RunnerId, RunnerSupervisor],
# DL_status is expected to be FRESH and so should not come from state
- download_status: Mapping[ShardMetadata, DownloadProgress],
+ download_status: Mapping[ModelId, DownloadProgress],
# gdls is not expected to be fresh
global_download_status: Mapping[NodeId, Sequence[DownloadProgress]],
instances: Mapping[InstanceId, Instance],
@@ -111,13 +111,14 @@ def _create_runner(
def _model_needs_download(
runners: Mapping[RunnerId, RunnerSupervisor],
- download_status: Mapping[ShardMetadata, DownloadProgress],
+ download_status: Mapping[ModelId, DownloadProgress],
) -> DownloadModel | None:
for runner in runners.values():
model_id = runner.bound_instance.bound_shard.model_meta.model_id
if isinstance(runner.status, RunnerIdle) and (
- not isinstance(
- download_status.get(runner.bound_instance.bound_shard, None),
- (DownloadOngoing, DownloadCompleted),
+ model_id not in download_status
+ or not isinstance(
+ download_status[model_id], (DownloadOngoing, DownloadCompleted)
)
):
# We don't invalidate download_status randomly in case a file gets deleted on disk

View File

@@ -9,9 +9,11 @@ MASTER_NODE_ID = NodeId("ffffffff-aaaa-4aaa-8aaa-aaaaaaaaaaaa")
NODE_A: Final[NodeId] = NodeId("aaaaaaaa-aaaa-4aaa-8aaa-aaaaaaaaaaaa")
NODE_B: Final[NodeId] = NodeId("bbbbbbbb-bbbb-4bbb-8bbb-bbbbbbbbbbbb")
NODE_C: Final[NodeId] = NodeId("cccccccc-cccc-4ccc-8ccc-cccccccccccc")
RUNNER_1_ID: Final[RunnerId] = RunnerId("11111111-1111-4111-8111-111111111111")
RUNNER_2_ID: Final[RunnerId] = RunnerId("33333333-3333-4333-8333-333333333333")
RUNNER_3_ID: Final[RunnerId] = RunnerId("Runner3")
INSTANCE_1_ID: Final[InstanceId] = InstanceId("22222222-2222-4222-8222-222222222222")
INSTANCE_2_ID: Final[InstanceId] = InstanceId("44444444-4444-4444-8444-444444444444")

View File

@@ -1,5 +1,6 @@
import exo.worker.plan as plan_mod
from exo.shared.types.common import NodeId
+ from exo.shared.types.models import ModelId
from exo.shared.types.tasks import LoadModel
from exo.shared.types.worker.downloads import DownloadCompleted, DownloadProgress
from exo.shared.types.worker.instances import BoundInstance
@@ -7,7 +8,6 @@ from exo.shared.types.worker.runners import (
RunnerConnected,
RunnerIdle,
)
- from exo.shared.types.worker.shards import ShardMetadata
from exo.worker.tests.constants import (
INSTANCE_1_ID,
MODEL_A_ID,
@@ -46,7 +46,7 @@ def test_plan_requests_download_when_waiting_and_shard_not_downloaded():
all_runners = {RUNNER_1_ID: RunnerIdle()}
# No entry for this shard -> should trigger DownloadModel
- download_status: dict[ShardMetadata, DownloadProgress] = {}
+ download_status: dict[ModelId, DownloadProgress] = {}
result = plan_mod.plan(
node_id=NODE_A,
@@ -94,7 +94,7 @@ def test_plan_loads_model_when_all_shards_downloaded_and_waiting():
# Local node has already marked its shard as downloaded (not actually used by _load_model)
local_download_status = {
- shard1: DownloadCompleted(shard_metadata=shard1, node_id=NODE_A) # type: ignore[reportUnhashable]
+ MODEL_A_ID: DownloadCompleted(shard_metadata=shard1, node_id=NODE_A)
}
# Global view has completed downloads for both nodes
@@ -140,7 +140,7 @@ def test_plan_does_not_request_download_when_shard_already_downloaded():
# Local status claims the shard is downloaded already
local_download_status = {
- shard: DownloadCompleted(shard_metadata=shard, node_id=NODE_A) # type: ignore[reportUnhashable]
+ MODEL_A_ID: DownloadCompleted(shard_metadata=shard, node_id=NODE_A)
}
# Global view hasn't caught up yet (no completed shards recorded for NODE_A)
@@ -192,7 +192,7 @@ def test_plan_does_not_load_model_until_all_shards_downloaded_globally():
# Only NODE_A's shard is recorded as downloaded globally
local_download_status = {
- shard1: DownloadCompleted(shard_metadata=shard1, node_id=NODE_A) # type: ignore[reportUnhashable]
+ MODEL_A_ID: DownloadCompleted(shard_metadata=shard1, node_id=NODE_A)
}
global_download_status = {
NODE_A: [DownloadCompleted(shard_metadata=shard1, node_id=NODE_A)],

View File

@@ -12,8 +12,10 @@ from exo.worker.tests.constants import (
MODEL_A_ID,
NODE_A,
NODE_B,
NODE_C,
RUNNER_1_ID,
RUNNER_2_ID,
RUNNER_3_ID,
)
from exo.worker.tests.unittests.conftest import (
FakeRunnerSupervisor,
@@ -24,37 +26,39 @@ from exo.worker.tests.unittests.conftest import (
def test_plan_starts_warmup_for_accepting_rank_when_all_loaded_or_warming():
"""
- For non-final device_rank shards, StartWarmup should be emitted when all
+ For non-zero device_rank shards, StartWarmup should be emitted when all
shards in the instance are Loaded/WarmingUp.
"""
- shard0 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=0, world_size=2)
- shard1 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=1, world_size=2)
+ shard0 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=0, world_size=3)
+ shard1 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=1, world_size=3)
+ shard2 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=2, world_size=3)
instance = get_mlx_ring_instance(
instance_id=INSTANCE_1_ID,
model_id=MODEL_A_ID,
- node_to_runner={NODE_A: RUNNER_1_ID, NODE_B: RUNNER_2_ID},
- runner_to_shard={RUNNER_1_ID: shard0, RUNNER_2_ID: shard1},
+ node_to_runner={NODE_A: RUNNER_1_ID, NODE_B: RUNNER_2_ID, NODE_C: RUNNER_3_ID},
+ runner_to_shard={RUNNER_1_ID: shard0, RUNNER_2_ID: shard1, RUNNER_3_ID: shard2},
)
bound_instance = BoundInstance(
- instance=instance, bound_runner_id=RUNNER_1_ID, bound_node_id=NODE_A
+ instance=instance, bound_runner_id=RUNNER_2_ID, bound_node_id=NODE_B
)
local_runner = FakeRunnerSupervisor(
bound_instance=bound_instance, status=RunnerLoaded()
)
- runners = {RUNNER_1_ID: local_runner}
+ runners = {RUNNER_2_ID: local_runner}
instances = {INSTANCE_1_ID: instance}
all_runners = {
RUNNER_1_ID: RunnerLoaded(),
RUNNER_2_ID: RunnerLoaded(),
+ RUNNER_3_ID: RunnerWarmingUp(),
}
result = plan_mod.plan(
- node_id=NODE_A,
+ node_id=NODE_B,
runners=runners, # type: ignore
download_status={},
- global_download_status={NODE_B: []},
+ global_download_status={NODE_A: []},
instances=instances,
all_runners=all_runners,
tasks={},
@@ -150,9 +154,9 @@ def test_plan_does_not_start_warmup_for_rank_zero_until_others_warming():
"""
Rank-zero shard should not start warmup until all non-zero ranks are
already WarmingUp.
- For accepting ranks (device_rank != world_size - 1), StartWarmup should be
+ For accepting ranks (device_rank != 0), StartWarmup should be
emitted when all shards in the instance are Loaded/WarmingUp.
- In a 2-node setup, rank 0 is the accepting rank.
+ In a 2-node setup, rank 1 is the accepting rank.
"""
shard0 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=0, world_size=2)
shard1 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=1, world_size=2)
@@ -163,7 +167,7 @@ def test_plan_does_not_start_warmup_for_rank_zero_until_others_warming():
runner_to_shard={RUNNER_1_ID: shard0, RUNNER_2_ID: shard1},
)
- # Rank 0 is the accepting rank
+ # Rank 1 is the accepting rank
bound_instance = BoundInstance(
instance=instance, bound_runner_id=RUNNER_1_ID, bound_node_id=NODE_A
)
@@ -188,6 +192,23 @@ def test_plan_does_not_start_warmup_for_rank_zero_until_others_warming():
tasks={},
)
assert result is None
all_runners = {
RUNNER_1_ID: RunnerLoaded(),
RUNNER_2_ID: RunnerWarmingUp(),
}
result = plan_mod.plan(
node_id=NODE_A,
runners=runners, # type: ignore
download_status={},
global_download_status={NODE_A: []},
instances=instances,
all_runners=all_runners,
tasks={},
)
assert isinstance(result, StartWarmup)
assert result.instance_id == INSTANCE_1_ID
@@ -280,9 +301,8 @@ def test_plan_does_not_start_warmup_for_accepting_rank_until_all_loaded_or_warmi
def test_plan_does_not_start_warmup_for_connecting_rank_until_others_warming():
"""
- Connecting rank (device_rank == world_size - 1) should not start warmup
+ Connecting rank (device_rank == 0) should not start warmup
until all other ranks are already WarmingUp.
- In a 2-node setup, rank 1 is the connecting rank.
"""
shard0 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=0, world_size=2)
shard1 = get_pipeline_shard_metadata(MODEL_A_ID, device_rank=1, world_size=2)
@@ -295,13 +315,13 @@ def test_plan_does_not_start_warmup_for_connecting_rank_until_others_warming():
# Rank 1 is the connecting rank
bound_instance = BoundInstance(
- instance=instance, bound_runner_id=RUNNER_2_ID, bound_node_id=NODE_B
+ instance=instance, bound_runner_id=RUNNER_1_ID, bound_node_id=NODE_A
)
local_runner = FakeRunnerSupervisor(
bound_instance=bound_instance, status=RunnerLoaded()
)
- runners = {RUNNER_2_ID: local_runner}
+ runners = {RUNNER_1_ID: local_runner}
instances = {INSTANCE_1_ID: instance}
all_runners = {
RUNNER_1_ID: RunnerLoaded(),
@@ -309,7 +329,7 @@ def test_plan_does_not_start_warmup_for_connecting_rank_until_others_warming():
}
result = plan_mod.plan(
- node_id=NODE_B,
+ node_id=NODE_A,
runners=runners, # type: ignore
download_status={},
global_download_status={NODE_A: [], NODE_B: []},