LocalAI/backend/cpp/llama-cpp/patches/paged
Ettore Di Giacinto c1d7f336cb docs(paged): enrich track-B scope with code-level FP4-GEMM inefficiencies
Add the source-read kernel-mechanism map (no cp.async weight pipeline,
mmq_x tile-maximizing selector vs GB10 occupancy, MoE per-expert M-tile
waste, iter_k=512 coupling, ruled-out non-levers) and strip the stray
trailing tags from the prior write.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-24 14:11:41 +00:00