Files
LocalAI/.github
Ettore Di Giacinto 9bb8994c4e chore(paged): drop CUDA-12 variants of llama-cpp-localai-paged, keep CUDA-13 only
The paged backend targets Blackwell sm_121a, which CUDA 12.0 cannot target
at all, so the CUDA-12 variants were pointless. They were also broken: the
cublas-12 / nvidia-l4t / arm64 build failed to compile paged-kv-manager.cpp
("no declaration matches ...", a ~10-function mismatch the older
cuda-12-base gcc rejects). CUDA-13 compiles it fine (confirmed on GB10).

Removed (config-only, scoped to the paged backend):
- backend-matrix.yml: the two CUDA-12 paged rows
  (-gpu-nvidia-cuda-12-llama-cpp-localai-paged,
   -nvidia-l4t-arm64-llama-cpp-localai-paged)
- backend/index.yaml: CUDA-12 capability keys (nvidia-cuda-12,
  nvidia-l4t-cuda-12, nvidia-l4t) on both meta-backends, repointed
  default/nvidia to the cuda13 amd64 variant, and dropped the orphaned
  cuda12-* / nvidia-l4t-arm64-* variant definitions (latest + -development).

Kept CUDA-13 only: cuda13-llama-cpp-localai-paged (amd64) and
cuda13-nvidia-l4t-arm64-llama-cpp-localai-paged (l4t arm64). Matrix
tag-suffixes <-> index variant URIs form a clean 2:2 bijection.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-28 01:37:54 +00:00
..
2023-07-09 13:39:00 +02:00
2025-02-13 09:58:19 +01:00