## Motivation
`mlx-community` has just published the new **Qwen3.6-35B-A3B**
multimodal MoE family on HuggingFace. Without static model cards, exo
neither surfaces these models in the dashboard picker nor matches them
against its placement / prefill logic, so users can't one-click launch
them. This PR adds cards for the three quants whose safetensors indexes
are already live on HF (4bit / 5bit / bf16).
## Changes
Three new TOML files in `resources/inference_model_cards/`:
- `mlx-community--Qwen3.6-35B-A3B-4bit.toml` (~19 GB)
- `mlx-community--Qwen3.6-35B-A3B-5bit.toml` (~23 GB)
- `mlx-community--Qwen3.6-35B-A3B-bf16.toml` (~65 GB)
All three share the same architectural fields (`n_layers = 40`,
`hidden_size = 2048`, `num_key_value_heads = 2`, `context_length =
262144`, capabilities `text, thinking, thinking_toggle, vision`,
`base_model = "Qwen3.6 35B A3B"`) — only `model_id`, `quantization`, and
`storage_size.in_bytes` differ between variants.
## Why It Works
- Qwen3.6-35B-A3B reuses the `qwen3_5_moe` architecture
(`Qwen3_5MoeForConditionalGeneration`) — the same one already wired into
exo's MLX runner at `src/exo/worker/engines/mlx/auto_parallel.py:47` via
`Qwen3_5MoeModel`. The architectural fields are taken verbatim from the
HF `config.json.text_config` and match the existing `Qwen3.5-35B-A3B-*`
cards.
- Storage sizes are the exact `metadata.total_size` read from each
variant's `model.safetensors.index.json` on HF, so download progress and
cluster-memory-fit checks are accurate.
- Vision support is flagged in `capabilities`; the `[vision]` block is
auto-detected by `ModelCard._autodetect_vision` from the upstream
`config.json`, so no hand-written vision config is required.
- The card loader (`_refresh_card_cache` in
`src/exo/shared/models/model_cards.py`) globs every `.toml` in
`resources/inference_model_cards/` on startup, so nothing else needs to
change — the `/models` endpoint and the dashboard picker pick them up
automatically.
The `mxfp4` / `mxfp8` / `nvfp4` variants are still uploading upstream
(index JSONs currently 404) and can be added in a follow-up PR once the
uploads complete.
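
For reference, reading `metadata.total_size` out of a safetensors index looks roughly like the sketch below. The helper name `total_size_from_index` is hypothetical, and the sample index uses a made-up size and tensor name rather than values from any of the Qwen3.6 repos.

```python
import json


def total_size_from_index(index_text: str) -> int:
    """Return metadata.total_size from the text of a model.safetensors.index.json."""
    index = json.loads(index_text)
    return int(index["metadata"]["total_size"])


# Made-up stand-in for an index file; real ones list every tensor in weight_map.
SAMPLE_INDEX = json.dumps(
    {
        "metadata": {"total_size": 123456},
        "weight_map": {"example.weight": "model-00001-of-00001.safetensors"},
    }
)

print(total_size_from_index(SAMPLE_INDEX))
# → 123456
```

A variant whose index is not yet published has no such file to read, which is why the three remaining quants are left for later.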
## Test Plan
### Manual Testing
Hardware: MacBook Pro M4 Max, 48 GB unified memory.
- Built the dashboard, ran `uv run exo`, waited for the API to come up
on `http://localhost:52415`.
- `curl -s http://localhost:52415/models` returns the three new model
ids (`mlx-community/Qwen3.6-35B-A3B-{4bit,5bit,bf16}`) alongside
existing models.
- Opened the dashboard, clicked SELECT MODEL, typed "Qwen3.6" into the
search box. A single **"Qwen3.6 35B A3B"** group appears showing `3
variants (19GB-65GB)`. Expanding it lists the `4bit` / `5bit` / `bf16`
quants with sizes `19GB` / `23GB` / `65GB`, as expected.
- Programmatically loaded each TOML via `ModelCard.load_from_path(...)`
and confirmed the parsed fields (layers / hidden / KV heads / context /
quant / base_model / caps / bytes) match what's written in the files.
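
The `/models` check above could be scripted along these lines. The response schema is not described here, so the sketch avoids assuming one: it walks the parsed JSON and looks for the expected ids among all string values. `collect_strings`, `missing_model_ids`, and the sample payload shape are all hypothetical.

```python
import json

EXPECTED_IDS = [
    f"mlx-community/Qwen3.6-35B-A3B-{quant}" for quant in ("4bit", "5bit", "bf16")
]


def collect_strings(node):
    """Yield every string value found anywhere in a parsed JSON document."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from collect_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_strings(item)


def missing_model_ids(response_text, expected=EXPECTED_IDS):
    """Return the expected ids that do not appear as a string value in the response."""
    found = set(collect_strings(json.loads(response_text)))
    return [model_id for model_id in expected if model_id not in found]


# Made-up payload shape, with the 5bit card deliberately left out.
SAMPLE_RESPONSE = json.dumps(
    {"data": [{"id": EXPECTED_IDS[0]}, {"id": EXPECTED_IDS[2]}]}
)

print(missing_model_ids(SAMPLE_RESPONSE))
# → ['mlx-community/Qwen3.6-35B-A3B-5bit']
```

Against a running node, the text passed in would be the body returned by `http://localhost:52415/models`; an empty list means all three cards are being served.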
### Automated Testing
No code paths were touched — these are pure TOML data files that plug
into the existing model-card loader. The existing pytest suite covers
TOML parsing and card serving; adding new TOMLs doesn't require new test
scaffolding. `uv run ruff check` and `nix fmt` are clean.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ryuichi Leo Takashige <rl.takashige@gmail.com>
Updates macmon to a fork that fixes M5 Max issues.
We may wait to see whether the fix gets merged upstream before we release.
---------
Co-authored-by: Alex Cheema <alexcheema123@gmail.com>
## Summary
- **Complete onboarding wizard**: 7-step flow guiding new users from
Welcome → Your Devices (topology) → Add More Devices (animation) →
Choose Model → Download → Load → Chat
- **Native macOS integration**: NSPopover welcome callout anchored to
menu bar icon on first launch, polished DMG installer with
drag-to-Applications arrow
- **Dashboard UX polish**: auto-download on model select, toast
notifications, connection banner, skeleton loading, download progress in
header, recommended model tags, sidebar hidden in home state for cleaner
first impression
- **Settings & menu bar overhaul**: native Settings window with Advanced
tab, onboarding reset, chat sidebar toggle
## Test plan
- [ ] Fresh install: verify onboarding wizard appears and flows Welcome
→ Topology → Animation → Model → Download → Load → Chat
- [ ] Verify topology shows real device data in onboarding step 2
- [ ] Verify selecting a model in the main dashboard picker
auto-triggers download
- [ ] Verify chat sidebar is hidden on home view, appears when chat is
active
- [ ] Verify DMG installer has white background with curved arrow
- [ ] Verify NSPopover appears anchored to menu bar icon on first launch
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Ryuichi Leo Takashige <leo@exolabs.net>
## Summary
- **Auto-open dashboard** in browser on first launch (uses
`~/.exo/.dashboard_opened` marker)
- **Welcome overlay** with "Choose a Model" CTA button when no model
instance is running
- **Tutorial progress messages** during model download → loading → ready
lifecycle stages
- **Fix conversation sidebar** text contrast — bumped to white text,
added active state background
- **Simplify technical jargon** — sharding/instance type/min nodes
hidden behind collapsible "Advanced Options" toggle; strategy display
hidden behind debug mode
- **Polished DMG installer** with drag-to-Applications layout, custom
branded background, and AppleScript-configured window positioning
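
The marker-file approach behind the auto-open bullet can be sketched as below. This is a minimal illustration rather than exo's implementation: `open_dashboard_once` is a hypothetical name, the `opener` parameter exists only so the demo does not launch a real browser, and writing the marker before opening is an assumed ordering.

```python
import tempfile
import webbrowser
from pathlib import Path


def open_dashboard_once(url: str, marker: Path, opener=webbrowser.open) -> bool:
    """Open url unless marker already exists; return True if it was opened."""
    if marker.exists():
        return False
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()  # record that the first launch has happened
    opener(url)
    return True


def demo():
    """Call the helper twice against a temporary marker with a fake opener."""
    opened = []
    with tempfile.TemporaryDirectory() as tmp:
        marker = Path(tmp) / ".exo" / ".dashboard_opened"
        url = "http://localhost:52415"
        first = open_dashboard_once(url, marker, opener=opened.append)
        second = open_dashboard_once(url, marker, opener=opened.append)
    return first, second, opened


print(demo())
# → (True, False, ['http://localhost:52415'])
```

Deleting the marker file resets the behaviour, which is what the first item of the test plan relies on.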
## Test plan
- [ ] Launch exo for the first time (delete `~/.exo/.dashboard_opened`
to simulate) — browser should auto-open
- [ ] Verify welcome overlay appears on topology when no model is loaded
- [ ] Launch a model and verify download/loading/ready messages appear
in instance cards
- [ ] Check conversation sidebar text is readable (white on dark, yellow
when active)
- [ ] Verify "Advanced Options" toggle hides/shows sharding controls
- [ ] Build DMG with `packaging/dmg/create-dmg.sh` and verify
drag-to-Applications layout
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>