Mirror of https://github.com/mudler/LocalAI.git (synced 2026-06-27 18:06:58 -04:00)
Synthesizes the four ARCH_GENERALITY_AUDIT sections (build-matrix, gguf-gallery-targeting, optimization-generality, patch-arch-safety) into a single cross-arch ship decision:

- a build-safety table per target
- every patch bucketed (SAFE-EVERYWHERE / BLACKWELL-ONLY-clean-fallback / GB10-TUNED / RISKY)
- the NVFP4 gallery recommendation
- a per-arch roadmap ranked by value/effort
- the empirical-verification matrix (GB10 + M4 cover all but non-Blackwell NVIDIA)
- the ship verdict

Verdict: SAFE to ship as Blackwell/Linux today. The build is arch-general (no GB10 pin; FP4 code is default-off and #if-guarded), and NVFP4 GGUFs run everywhere via dequant.

The one hard prerequisite before extending paged to Metal/Vulkan/SYCL is closing the backend-ungated, default-on emission of the fused GDN/conv op. That op is a GGML_OP_SSM_CONV distinguished from the plain op only by a non-null src[3]; it is implemented on CUDA and CPU only and has no supports_op guard. The gap is latent on current Linux targets, but it would cause silent miscompute on a future non-CUDA paged build of a gated-DeltaNet model.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
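
The fused-op gap described above can be modelled as a small guard. What follows is a minimal sketch in Python, not the actual ggml or LocalAI code: `Node`, `is_fused_ssm_conv`, `supports_op` and the backend name strings are hypothetical stand-ins, and the only facts taken from this message are that the fused variant is a GGML_OP_SSM_CONV with a non-null src[3] and that only CUDA and CPU implement it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Backends this commit message names as implementing the fused op.
FUSED_SSM_CONV_BACKENDS = frozenset({"cuda", "cpu"})


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a graph node: an op name plus four source slots."""
    op: str
    src: Tuple[Optional[str], ...] = (None, None, None, None)


def is_fused_ssm_conv(node: Node) -> bool:
    # The fused variant is told apart from the plain op only by a non-null src[3].
    return (
        node.op == "GGML_OP_SSM_CONV"
        and len(node.src) > 3
        and node.src[3] is not None
    )


def supports_op(backend: str, node: Node) -> bool:
    # Reject the fused variant on backends without a kernel for it, so a
    # scheduler can fall back instead of silently computing wrong results.
    if is_fused_ssm_conv(node):
        return backend in FUSED_SSM_CONV_BACKENDS
    # Sketch assumption: every other op is treated as supported.
    return True


if __name__ == "__main__":
    fused = Node("GGML_OP_SSM_CONV", ("x", "w", "b", "state"))
    plain = Node("GGML_OP_SSM_CONV", ("x", "w", None, None))
    for backend in ("cuda", "cpu", "metal", "vulkan", "sycl"):
        print(backend, supports_op(backend, fused), supports_op(backend, plain))
```

The same predicate could also gate emission at graph-build time, so that the fused node is never created for a backend that would reject it; which of the two places is the right one for the real fix is left open here.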