
mirror/exo

mirror of https://github.com/exo-explore/exo.git synced 2026-04-17 20:40:35 -04:00

Author	SHA1	Message	Date
rltakashige	5757c27dd5	Add download utility script (#1855)	2026-04-08 00:58:39 +00:00
Mustafa Alp Yılmaz	2994b41089	fix: validate num_key_value_heads in tensor sharding placement (#1669)	2026-03-11 13:46:33 +00:00

## Problem

Models with fewer KV heads than nodes crash during tensor parallelism. For example, Qwen3.5 MoE models have only 2 KV heads, so trying to shard across 4 nodes produces empty tensors and a reshape error at runtime.

The placement system already validates `hidden_size % num_nodes == 0` but doesn't check KV heads, so it creates configurations that look valid but blow up when the worker tries to split the attention heads.

Affected models include Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, Qwen3.5-397B-A17B, Qwen3-Next-80B-A3B, and Qwen3-Coder-Next (all have 2 KV heads).

## Changes

Placement validation (`src/exo/master/placement.py`):
- Combined KV heads divisibility check with the existing hidden_size filter in a single pass
- Cycles where `num_key_value_heads % len(cycle) != 0` are now excluded for tensor sharding
- Error message includes both constraints when no valid cycle is found

Model card schema (`src/exo/shared/models/model_cards.py`):
- Added optional `num_key_value_heads` field to `ModelCard` and `ConfigData`
- Extracted from HuggingFace `config.json` (handles both top-level and `text_config` nesting)
- Passed through in `fetch_from_hf()` for dynamically fetched cards

All 68 inference model cards (`resources/inference_model_cards/.toml`):
- Populated `num_key_value_heads` from each model's HuggingFace config

Utility script (`scripts/fetch_kv_heads.py`):
- Fetches `num_key_value_heads` from HuggingFace and updates TOML cards
- `--missing`: only fills in cards that don't have the field yet
- `--all`: re-fetches and overwrites everything
- Uses tomlkit for safe TOML editing and ThreadPoolExecutor for parallel fetches

## Behavior

- Instance previews no longer show tensor options for models that can't split their KV heads across the cluster size
- `place_instance()` rejects with a clear error instead of crash-looping
- Pipeline parallelism is unaffected
- 2-node tensor still works for 2-KV-head models (2 ÷ 2 = 1)
- Field is optional: existing custom cards without it continue to work (validation is skipped when `None`)
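
The divisibility rule described in this commit can be sketched roughly as follows. This is an illustration only, not the code in `src/exo/master/placement.py`: the function names `extract_num_kv_heads` and `valid_tensor_cycles`, the plain-list representation of a cycle, and the `hidden_size` value of 2048 are all assumptions made for the example.

```python
from typing import Any, Optional, Sequence


def extract_num_kv_heads(config: dict[str, Any]) -> Optional[int]:
    """Hypothetical helper: read num_key_value_heads from a config.json dict.

    Checks the top level first, then a nested `text_config` section,
    and returns None when neither carries the field.
    """
    value = config.get("num_key_value_heads")
    if value is None:
        nested = config.get("text_config") or {}
        value = nested.get("num_key_value_heads")
    return value


def valid_tensor_cycles(
    cycles: Sequence[Sequence[str]],
    hidden_size: int,
    num_key_value_heads: Optional[int],
) -> list[Sequence[str]]:
    """Hypothetical filter: keep cycles whose node count divides both
    hidden_size and (when known) num_key_value_heads, in a single pass."""
    valid = []
    for cycle in cycles:
        n = len(cycle)
        if n == 0 or hidden_size % n != 0:
            continue
        # The KV-head check is skipped when the field is missing (None).
        if num_key_value_heads is not None and num_key_value_heads % n != 0:
            continue
        valid.append(cycle)
    return valid


# Illustrative data: a 2-node and a 4-node cycle, a model with 2 KV heads.
cycles = [["a", "b"], ["a", "b", "c", "d"]]
kv_heads = extract_num_kv_heads({"text_config": {"num_key_value_heads": 2}})
allowed = valid_tensor_cycles(cycles, hidden_size=2048, num_key_value_heads=kv_heads)
unchecked = valid_tensor_cycles(cycles, hidden_size=2048, num_key_value_heads=None)
```

With these inputs, `allowed` keeps only the 2-node cycle (2 % 4 != 0 rules out the 4-node one), while `unchecked` keeps both because the KV-head check is skipped when the field is `None`.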
Jake Hillion	0fcee70833	prep repo for v1	2025-12-17 15:31:02 +00:00
Sami Khan	971f5240bf	build fix	2025-02-28 15:45:57 +05:00
Sami Khan	a70943f8d2	base images for animation	2025-01-22 05:46:38 -05:00
Alex Cheema	ba5bb3e171	fix scripts/build_exo.py: com.exolabs.exo -> net.exolabs.exo	2025-01-21 05:36:02 +00:00
DeftDawg	cde912deef	- Use `#!/usr/bin/env bash` instead of `#!/bin/bash` for better portability	2024-12-22 01:14:54 -05:00
Alex Cheema	e8ece1158f	tweak sed, make compile_grpc.sh executable	2024-12-06 13:23:06 +00:00
josh	0996bcc3b6	Merge branch 'main' into package-exo-fixes	2024-11-22 10:39:56 -08:00
Nel Nibcord	e3ec9eaa44	Fixed GRPC issues	2024-11-21 17:28:44 -08:00
josh	f5afa4db4d	compile error fix	2024-11-21 08:47:45 -08:00
josh	5269629d76	removed unused code	2024-11-21 05:22:17 -08:00
josh	90765922c8	added one file	2024-11-20 00:12:12 -08:00
josh	41697431dc	error fix	2024-11-19 20:47:56 -08:00
josh	44118252e9	build error fix	2024-11-19 20:34:48 -08:00
josh	3a1871c84b	typo fix	2024-11-19 08:00:41 -08:00
josh	97ed990a98	macos sign	2024-11-19 07:58:01 -08:00
josh	8bc823229a	missing lib	2024-11-19 07:57:22 -08:00
josh	ce9231ad3d	move model fix	2024-11-19 07:56:20 -08:00
josh	e1519246ee	error fix	2024-11-19 05:54:05 -08:00
Alex Cheema	1fa42f3063	typo	2024-11-19 17:02:07 +04:00
josh	6fc0b04479	error fix	2024-11-19 04:55:50 -08:00
josh	520d9d1164	error fix	2024-11-19 04:49:02 -08:00
josh	bcd885dcc9	cleaned code	2024-11-19 01:02:46 -08:00
josh	8ce0fe2bb3	pr suggestion	2024-11-19 00:59:33 -08:00
josh	867f348e71	moving models	2024-11-19 00:49:10 -08:00
josh	00d4bda5bd	fix build script	2024-11-18 23:29:53 -08:00
josh	e991438e72	pr suggestions fix	2024-11-18 23:02:03 -08:00
josh	fea1c0fc29	clean branch	2024-11-18 08:47:17 -08:00
Nel Nibcord	9712d696a9	Added a small script to compile grpc	2024-11-12 23:20:55 -08:00

30 Commits