Default Branch

099a0f18ef · build: fix Dockerfile mlx directory (#14131) · Updated 2026-02-06 20:08:53 -05:00

Branches

9319f13ff5 · compile expert select · Updated 2026-02-06 19:00:31 -05:00 · 3 behind, 11 ahead

bd5d3b0ebd · quant · Updated 2026-02-06 17:51:32 -05:00 · 3 behind, 9 ahead

20299cb1da · nil keys · Updated 2026-02-06 17:49:31 -05:00 · 3 behind, 8 ahead

1ec216fe0a · append vector · Updated 2026-02-06 17:47:34 -05:00 · 3 behind, 4 ahead

d5ac80125f · colour changes + feed the linter · Updated 2026-02-06 15:26:47 -05:00 · 3 behind, 9 ahead

d5a2849d1f · cmd: set codex env vars on launch and handle zstd request bodies · Updated 2026-02-06 13:42:18 -05:00 · 4 behind, 1 ahead

f92a82db15 · app: match model picker to server models · Updated 2026-02-05 18:43:22 -05:00 · 13 behind, 1 ahead

2e9d9acf18 · add ability to turn on debug request logging · Updated 2026-02-05 18:14:35 -05:00 · 12 behind, 1 ahead

c330ea33ed · qwen3next: handle mixed recurrent batches · Updated 2026-02-05 14:50:00 -05:00 · 13 behind, 1 ahead

52f757d8a2 · cmd: fix gofmt formatting in pi integration · Updated 2026-02-04 22:27:34 -05:00 · 16 behind, 2 ahead

55746e31fa · ggml: add MLA flash attention config for GLM-4.7-flash · Updated 2026-02-03 15:57:48 -05:00 · 24 behind, 1 ahead

846f3fbcc8 · app: expose server's default context length to UI · Updated 2026-02-02 19:25:29 -05:00 · 31 behind, 1 ahead

b202a9b4ce · qwen3-coder parser: allow missing opening tool call tag · Updated 2026-02-02 15:53:45 -05:00 · 42 behind, 1 ahead

c0496e6125 · fix lint · Updated 2026-01-28 16:16:52 -05:00 · 45 behind, 3 ahead

e6f5a982d3 · cmd: add usage cmd to chat to see token consumption · Updated 2026-01-27 20:14:25 -05:00 · 45 behind, 1 ahead

8e22b09e2c · ggml-cuda: fix fattn build for GLM 4.7 flash support · Updated 2026-01-23 22:47:16 -05:00 · 59 behind, 1 ahead

8b4410633d · Add image generation documentation · Updated 2026-01-22 17:09:58 -05:00 · 70 behind, 1 ahead

4bd2afd22f · address comment · Updated 2026-01-22 14:25:52 -05:00 · 70 behind, 18 ahead

c73feaf73d · Clean up the manifest and modelpath (#13807) · Updated 2026-01-21 22:08:21 -05:00 · 72 behind, 2 ahead

e9dac1aaef · docs: integration overview · Updated 2026-01-21 20:20:49 -05:00 · 70 behind, 1 ahead