mirror of
https://github.com/exo-explore/exo.git
synced 2026-04-17 12:30:29 -04:00
## Motivation

The mlx-community [MiniMax-M2.7 collection](https://huggingface.co/collections/mlx-community/minimax-m27) landed, but exo had no model cards for any of the variants yet, so they were not selectable from the dashboard model picker. Adding cards also makes them discoverable under the existing MiniMax family entry.

## Changes

Added 6 new model cards in `resources/inference_model_cards/`, one per quant of MiniMax M2.7:

- `mlx-community--MiniMax-M2.7.toml` (bf16, full precision, 457 GB)
- `mlx-community--MiniMax-M2.7-4bit.toml` (128 GB)
- `mlx-community--MiniMax-M2.7-4bit-mxfp4.toml` (121 GB)
- `mlx-community--MiniMax-M2.7-5bit.toml` (157 GB)
- `mlx-community--MiniMax-M2.7-6bit.toml` (185 GB)
- `mlx-community--MiniMax-M2.7-8bit.toml` (243 GB)

All six use `family = "minimax"` and share `base_model = "MiniMax M2.7"`, so they collapse into a single group in the picker with the existing MiniMax logo. Architecture fields (`n_layers = 62`, `hidden_size = 3072`, `num_key_value_heads = 8`, `context_length = 196608`) were read from each repo's `config.json`; `storage_size.in_bytes` was summed from the HF tree API per repo.

`capabilities = ["text", "thinking"]` follows the existing MiniMax M2.5 cards: the chat template always emits `<think>` tags (no toggle), matching M2.5 behavior.

## Why It Works

Model cards in `resources/inference_model_cards/` are auto-loaded by `src/exo/shared/models/model_cards.py::get_model_cards`. The dashboard picker groups by `base_model` and filters by `family`, so sharing both across all six variants gives a single "MiniMax M2.7" group under the MiniMax sidebar entry, with the quant variants exposed as selectable sub-options.

## Test Plan

### Manual Testing

<!-- Hardware: MacBook Pro M3 Max -->

- Ran `uv run python -c "…await get_model_cards()…"` and confirmed all 6 new cards load with `family=minimax`, `base_model="MiniMax M2.7"`, and correct quant + byte sizes.
- Ran `cd dashboard && npm run build` then `uv run exo`, and opened the model picker → **MiniMax** family → **MiniMax M2.7** group, which shows all six quant variants.

### Automated Testing

- No new automated tests: these are data files validated by the existing Pydantic `ModelCard` schema at load time.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
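
A minimal sketch of the grouping described under "Why It Works", for illustration only. It is not exo's dashboard code: the dict layout, the `card_file` key, the `group_for_picker` helper, and the extra non-MiniMax card are all assumptions made for this example. Only the file names and the `family` / `base_model` values come from the cards listed above.

```python
from collections import defaultdict

# Stand-ins for the six new cards. `card_file` is a hypothetical key used
# here to identify each card; `family` and `base_model` mirror the values
# stated in the description.
CARD_FILES = [
    "mlx-community--MiniMax-M2.7.toml",
    "mlx-community--MiniMax-M2.7-4bit.toml",
    "mlx-community--MiniMax-M2.7-4bit-mxfp4.toml",
    "mlx-community--MiniMax-M2.7-5bit.toml",
    "mlx-community--MiniMax-M2.7-6bit.toml",
    "mlx-community--MiniMax-M2.7-8bit.toml",
]
cards = [
    {"card_file": name, "family": "minimax", "base_model": "MiniMax M2.7"}
    for name in CARD_FILES
]

# A made-up card from another family, included only to show the filter.
cards.append(
    {"card_file": "example--other.toml", "family": "other", "base_model": "Other"}
)


def group_for_picker(all_cards, family):
    """Keep cards of one family, then group them by their base_model."""
    groups = defaultdict(list)
    for card in all_cards:
        if card["family"] == family:
            groups[card["base_model"]].append(card["card_file"])
    return dict(groups)


groups = group_for_picker(cards, "minimax")
for base_model, variants in groups.items():
    print(base_model, len(variants))
```

Because every variant carries the same `family` and `base_model`, the filter keeps all six and the grouping step puts them under one key, which is the behavior the picker relies on.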