Mirror of https://github.com/exo-explore/exo.git (synced 2026-02-05 11:43:17 -05:00)
## Motivation

Qwen3-Coder-Next just dropped on mlx-community in several quantizations. It's an 80B MoE model (`Qwen3NextForCausalLM`), an architecture we already support for tensor parallelism via `QwenShardingStrategy`; it just needs model cards.

## Changes

Added model cards for all five available quantizations:

- `mlx-community/Qwen3-Coder-Next-4bit` (~46 GB)
- `mlx-community/Qwen3-Coder-Next-5bit` (~58 GB)
- `mlx-community/Qwen3-Coder-Next-6bit` (~69 GB)
- `mlx-community/Qwen3-Coder-Next-8bit` (~89 GB)
- `mlx-community/Qwen3-Coder-Next-bf16` (~158 GB)

All have `supports_tensor = true`, since the architecture is already supported.

## Why It Works

`Qwen3NextForCausalLM` is already handled by `QwenShardingStrategy` in `auto_parallel.py` and is in the `supports_tensor` allowlist in `model_cards.py`. No code changes were needed, only the TOML card files.

## Test Plan

### Manual Testing

<!-- n/a - model card addition only -->

### Automated Testing

- `basedpyright`: 0 errors
- `ruff check`: passes
- `nix fmt`: no changes
- `pytest`: 173 passed, 1 skipped

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
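For illustration, one of the added card files might look roughly like the sketch below. This is an assumption-laden mock-up, not the actual file from the PR: exo's real model-card schema is not shown here, so every field name except `supports_tensor` (which the PR text mentions explicitly) is hypothetical.

```toml
# Hypothetical sketch of a model card such as Qwen3-Coder-Next-4bit.toml.
# Field names other than supports_tensor are assumptions; check exo's
# existing card files in the repo for the real schema.
model_id = "mlx-community/Qwen3-Coder-Next-4bit"

# Architecture string that QwenShardingStrategy in auto_parallel.py
# already knows how to shard.
architecture = "Qwen3NextForCausalLM"

# Enables tensor parallelism for this card, per the PR description.
supports_tensor = true
```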