LocalAI/backend/cpp/llama-cpp
Ettore Di Giacinto ee78ae4a11 docs(paged): Qwen3.6 NVFP4 h2h bench doc - MoE llama.cpp table
First crash-resilient slab of the apples-to-apples NVFP4-vs-NVFP4
llama.cpp-vs-vLLM benchmark on GB10. MoE Qwen3.6-35B-A3B paged
llama.cpp (patch 0015) decode/prefill/TTFT/VRAM at npl 8/32/64/128.
vLLM and dense tables append as the sweeps land.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-23 19:43:55 +00:00