feat(paged): restrict llama-cpp-localai-paged to CUDA-only build targets

The paged backend previously built for cublas/cuda, cpu, vulkan, sycl,
hipblas and darwin/metal. On non-CUDA the patchset's wins are inert: the
GDN fusions are gated off (patch 0030) and NVFP4 falls back to dequant,
so the backend is neutral-to-negative there (README section 4c). The
darwin grpc-server link also fails on undefined upstream server symbols,
turning CI red. Both broken and pointless off-CUDA, so ship CUDA-only.

- backend-matrix.yml: drop the hipblas, sycl f32/f16, cpu amd64/arm64,
  vulkan amd64/arm64 and metal-darwin rows for this backend; keep the
  four cublas rows (cuda-12, cuda-13, nvidia-l4t cuda-12 and cuda-13).
- index.yaml: meta-backend (and -development) capabilities are now
  CUDA-only with default pointing at cuda12 (mirrors faster-qwen3-tts);
  removed the orphaned cpu/rocm/sycl/vulkan/metal variant entries.
- Removed the now-unused darwin build script and its Makefile target /
  .NOTPARALLEL entry / backend_build_darwin.yml step.
- Documented the CUDA-only build coverage in the patch README and plan.

Non-CUDA users should use the stock llama-cpp backend.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This commit is contained in:
Ettore Di Giacinto
2026-06-27 12:29:15 +00:00
parent 9115c2c52c
commit a4e730979d
7 changed files with 25 additions and 299 deletions

View File

@@ -230,16 +230,6 @@ jobs:
make protogen-go
make backends/llama-cpp-darwin
# llama-cpp-localai-paged reuses the same bespoke llama-cpp darwin build path
# (CPU_ALL_VARIANTS + Metal + otool dylib bundling) via its own wrapper script,
# so it gets a dedicated step like stock llama-cpp rather than the generic
# build-darwin-go-backend mold.
- name: Build ${{ inputs.backend }}-darwin (llama-cpp-localai-paged)
if: inputs.backend == 'llama-cpp-localai-paged'
run: |
make protogen-go
make backends/llama-cpp-localai-paged-darwin
- name: Build ds4 backend (Darwin Metal)
if: inputs.backend == 'ds4'
run: |
@@ -255,7 +245,7 @@ jobs:
make backends/privacy-filter-darwin
- name: Build ${{ inputs.backend }}-darwin
if: inputs.backend != 'llama-cpp' && inputs.backend != 'llama-cpp-localai-paged' && inputs.backend != 'ds4' && inputs.backend != 'privacy-filter'
if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4' && inputs.backend != 'privacy-filter'
run: |
make protogen-go
BACKEND=${{ inputs.backend }} BUILD_TYPE=${{ inputs.build-type }} USE_PIP=${{ inputs.use-pip }} make build-darwin-${{ inputs.lang }}-backend