Default Branch

fba8c9c498 · fix(distributed): track in-flight for non-LLM inference methods (VAD, diarize, voice, ...) (#10238) · Updated 2026-06-10 10:29:50 -04:00

Branches

50580a84ae · fix(ci): switch apt mirror per runner — azure on github-hosted, kernel.org on self-hosted · Updated 2026-05-03 18:59:26 -04:00

411 behind · 0 ahead · Included

3280b9a287 · fix(distributed): per-replica backend logs (store aggregation + UI) · Updated 2026-04-27 16:55:24 -04:00

448 behind · 0 ahead · Included

9787bee48b · fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles · Updated 2026-04-24 16:09:36 -04:00

493 behind · 8 ahead

d9d7b5c29b · docs(readme): add April 2026 highlights to Latest News · Updated 2026-04-23 16:47:06 -04:00

501 behind · 0 ahead · Included

5f7a0c3b26 · chore(turboquant): bump fork pin to rebase/upstream-sync-april-2026 · Updated 2026-04-22 16:01:49 -04:00

515 behind · 1 ahead

798b5b2d84 · chore(turboquant): bump fork to 4d24ad87 and patch ggml-hip for new f16-turbo fattn-vec instances · Updated 2026-04-22 03:13:47 -04:00

519 behind · 1 ahead

b27de08fff · chore(gallery): fixup wan · Updated 2026-04-19 17:31:22 -04:00

558 behind · 0 ahead · Included

44e7d9806b · fix(distributed): stop queue loops on agent nodes + dead-letter cap · Updated 2026-04-19 17:27:05 -04:00

563 behind · 8 ahead

fbc93b0a34 · fix(llama-cpp): default rms_norm_eps for Gemma 3 GGUFs missing the key · Updated 2026-04-19 12:15:26 -04:00

561 behind · 1 ahead

cd56a05c3e · ci(vllm): disable tests-vllm-grpc job (heterogeneous runners) · Updated 2026-04-13 03:46:57 -04:00

628 behind · 16 ahead

5fe87cb0d5 · feat: upgrade banner with Upgrade All button, detect pre-existing backends · Updated 2026-04-11 18:11:23 -04:00

637 behind · 8 ahead

6e11f882f7 · feat(turboquant.cpp): add new backend · Updated 2026-04-03 16:57:15 -04:00

702 behind · 1 ahead

659636195c · deterministic builds · Updated 2026-04-01 15:45:31 -04:00

712 behind · 3 ahead

8997ff6042 · Fix tests · Updated 2026-03-20 21:04:28 -04:00

795 behind · 6 ahead

2aaddbb3b8 · chore(ci): wire external backend for tests · Updated 2026-03-01 16:33:20 -05:00

975 behind · 1 ahead

e169492543 · fix: this backend is CUDA only · Updated 2026-02-26 18:08:24 -05:00

998 behind · 2 ahead

1f0110368d · step-flash fixes · Updated 2026-02-12 17:36:16 -05:00

1079 behind · 1 ahead

6a1e44c8ff · Fix markdown parsing to handle multi-line constructs correctly · Updated 2026-02-03 06:42:51 -05:00

1144 behind · 2 ahead

5041294265 · Initial plan · Updated 2026-02-02 17:01:37 -05:00

1154 behind · 1 ahead

805654fa23 · Initial plan · Updated 2026-02-02 16:38:30 -05:00

1154 behind · 1 ahead