LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-27 09:57:14 -04:00

Files

Ettore Di Giacinto 34abf392fc docs(paged): ARCH audit - NVFP4 GGUF off-Blackwell portability + gallery targeting gap

NVFP4 (GGML_TYPE_NVFP4=40 / MOSTLY_NVFP4) GGUFs are portable: full CPU/CUDA-DP4A/
generic-MMA/Vulkan dequant coverage. FP4-MMA is a runtime Blackwell-only speed
tier (mmq.cu use_native_fp4 flag), not a load/run gate. Off-Blackwell = runs via
dequant, correct-but-slow, never fail/garbage. Gallery has no microarch-gating
primitive (tags are search-only, capabilities map is family-level nvidia/amd/
metal, model struct has no hardware field), so the 6 -paged entries can only
express Blackwell-targeting via description prose + tags.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-27 07:00:34 +00:00

ds4

chore: ⬆️ Update antirez/ds4 to 80ebbc396aee40eedc1d829222f3362d10fa4c6c (#10378 )

2026-06-18 00:32:13 +02:00

grpc

fix: speedup git submodule update with --single-branch (#2847 )

2024-07-13 22:32:25 +02:00

ik-llama-cpp

fix(backends): quote $CURDIR in run.sh (fixes backends in paths with spaces) (#10519 )

2026-06-26 01:02:48 +02:00

llama-cpp

docs(paged): ARCH audit - NVFP4 GGUF off-Blackwell portability + gallery targeting gap