Files
LocalAI/backend/cpp/llama-cpp
Ettore Di Giacinto ce60737fc5 kernel(doc): dense scope resolved - two FP4 kernels (dense first, then grouped)
Benchmark confirms dense prefill 7.6-32x behind too, so the kernel track needs a
non-grouped FP4 dense GEMM (simpler, land first) + the MoE grouped GEMM. Both
share the e2m1 block-scaled collective; dense is grouped-with-one-group.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-20 03:56:33 +00:00
..
2026-04-12 08:51:30 +02:00