Default Branch

63b8e64715 · Add model cards for Qwen3.6-35B-A3B variants (#1907) · Updated 2026-04-16 18:25:26 -04:00

Branches

8f61dd5970 · cuda support · Updated 2026-04-16 16:37:05 -04:00

6
2

c4936c124d · Revert "Update mlx and mlx lm to latest (#1906)" · Updated 2026-04-16 16:33:46 -04:00

1
1

4de70d2598 · qwen3_5_moe_split: all_gather + eval every 4th layer · Updated 2026-04-16 15:23:10 -04:00

35
47

04614b734c · hot fix · Updated 2026-04-16 12:47:56 -04:00

13
22

9cbf14ec5f · More thermal results · Updated 2026-04-16 11:48:18 -04:00

1
2

ca10d21228 · Auto-detect existing instance unless --fresh-instance · Updated 2026-04-16 10:21:32 -04:00

1
14

bfb7975ab8 · Fallback to HF mirror as a mitigation for China HF limitation · Updated 2026-04-15 18:04:15 -04:00

3
1

7ca01c5ba8 · Add graph · Updated 2026-04-15 13:01:03 -04:00

5
10

65b9c9df81 · remove layer loading callback · Updated 2026-04-15 06:49:39 -04:00

6
1

63f0f0cfde · add assertion that receivers are only created before the main loop begins · Updated 2026-04-15 06:44:26 -04:00

6
1

76d77b6b57 · wuff · Updated 2026-04-15 05:57:21 -04:00

6
2

e81d245f8e · Handle fp8 · Updated 2026-04-14 14:08:09 -04:00

9
2

051d60059e · DFlash warmup: full S_ctx sweep to pre-compile every drafter kernel · Updated 2026-04-14 13:35:10 -04:00

35
8

1c90aa0f5f · No more model load timeout, no more crazy sigkills, try harder to clean up nicely · Updated 2026-04-14 11:04:57 -04:00

10
1

a618d02277 · more fixes · Updated 2026-04-14 07:45:27 -04:00

13
63

a534596063 · add assertion to runner supervisor · Updated 2026-04-14 06:53:24 -04:00

10
1

aacb82e418 · wah · Updated 2026-04-13 11:57:31 -04:00

13
1

e1f13fc8b5 · :( · Updated 2026-04-10 06:48:55 -04:00

24
5

f78d08dfec · Disable server side tools · Updated 2026-04-10 06:18:31 -04:00

24
6

2e4d996ac3 · Add free space and retry button for failed downloads · Updated 2026-04-08 13:32:49 -04:00

26
20