LocalAI/.github at a4e730979dec876eb6913445a6a568797dc47731

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-27 09:57:14 -04:00

Ettore Di Giacinto a4e730979d feat(paged): restrict llama-cpp-localai-paged to CUDA-only build targets

The paged backend previously built for cublas/cuda, cpu, vulkan, sycl,
hipblas and darwin/metal. On non-CUDA targets the patchset's wins are
inert: the GDN fusions are gated off (patch 0030) and NVFP4 falls back
to dequant, so the backend is neutral-to-negative there (README section
4c). The darwin grpc-server link also fails on undefined upstream server
symbols, turning CI red. The backend is both broken and pointless
off-CUDA, so ship it CUDA-only.

- backend-matrix.yml: drop the hipblas, sycl f32/f16, cpu amd64/arm64,
  vulkan amd64/arm64 and metal-darwin rows for this backend; keep the
  four cublas rows (cuda-12, cuda-13, nvidia-l4t cuda-12 and cuda-13).
- index.yaml: meta-backend (and -development) capabilities are now
  CUDA-only with default pointing at cuda12 (mirrors faster-qwen3-tts);
  removed the orphaned cpu/rocm/sycl/vulkan/metal variant entries.
- Removed the now-unused darwin build script and its Makefile target /
  .NOTPARALLEL entry / backend_build_darwin.yml step.
- Documented the CUDA-only build coverage in the patch README and plan.

Non-CUDA users should use the stock llama-cpp backend.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-27 12:29:15 +00:00
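
The matrix change described in the commit message above amounts to a filter: for one backend, keep only the cublas rows and leave every other backend alone. A minimal sketch of that idea follows. The row fields (`backend`, `build_type`, `tag`), the tag strings, and the helper `restrict_to_cuda` are hypothetical names chosen for illustration; they are not the actual schema of backend-matrix.yml, and the commit edits the YAML directly rather than running a script like this.

```python
# Hypothetical matrix rows. Field names and tag values are assumptions for
# illustration only, not the real layout of backend-matrix.yml.
MATRIX = [
    {"backend": "llama-cpp-localai-paged", "build_type": "cublas", "tag": "cuda-12"},
    {"backend": "llama-cpp-localai-paged", "build_type": "cublas", "tag": "cuda-13"},
    {"backend": "llama-cpp-localai-paged", "build_type": "cublas", "tag": "nvidia-l4t-cuda-12"},
    {"backend": "llama-cpp-localai-paged", "build_type": "cublas", "tag": "nvidia-l4t-cuda-13"},
    {"backend": "llama-cpp-localai-paged", "build_type": "hipblas", "tag": "rocm"},
    {"backend": "llama-cpp-localai-paged", "build_type": "vulkan", "tag": "vulkan-amd64"},
    {"backend": "llama-cpp", "build_type": "vulkan", "tag": "vulkan-amd64"},
]


def restrict_to_cuda(rows, backend="llama-cpp-localai-paged"):
    """Drop non-cublas rows for one backend; rows of other backends pass through."""
    return [
        row for row in rows
        if row["backend"] != backend or row["build_type"] == "cublas"
    ]


kept = restrict_to_cuda(MATRIX)
paged_tags = [r["tag"] for r in kept if r["backend"] == "llama-cpp-localai-paged"]
```

In this sketch the stock `llama-cpp` vulkan row survives the filter, which mirrors the commit's advice that non-CUDA users should use the stock backend.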

ci: phase 1-3 of GHA free tier migration (path filter, multi-arch split prep, /mnt disk relief) (#9726)

2026-05-08 23:43:41 +02:00

fix: roll out bluemonday Sanitize more widely (#3794)

2024-10-12 09:45:47 +02:00

Harden gallery-agent Hugging Face fetches against transient rate limiting (#10187)

2026-06-05 23:43:06 +02:00

docs/examples: enhancements (#1572)

2024-01-18 19:41:08 +01:00

refactor(paged): stock llama-cpp is patch-free; paged backend owns its patch series

2026-06-27 11:01:22 +00:00

feat(paged): restrict llama-cpp-localai-paged to CUDA-only build targets

2026-06-27 12:29:15 +00:00

backend-matrix.yml

feat(paged): restrict llama-cpp-localai-paged to CUDA-only build targets

2026-06-27 12:29:15 +00:00

bump_deps.sh

feat: do not bundle llama-cpp anymore (#5790)

2025-07-18 13:24:12 +02:00

bump_docs.sh

fix: github bump_docs.sh regex to drop emoji and other text (#2180)

2024-04-29 03:55:29 +00:00

bump_vllm_metal.sh

feat(vllm): macOS/Metal support via vllm-metal (MLX) (#10489)

2026-06-25 15:46:19 +02:00

bump_vllm_wheel.sh

feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563)

2026-04-29 00:49:28 +02:00

check_and_update.py

fix(ci): fixup checksum scanning pipeline (#3631)

2024-09-23 10:56:10 +02:00

checksum_checker.sh

fix(ci): fixup correct path for check_and_update.py (#2777)

2024-07-11 23:05:43 +02:00

dependabot.yml

feat: Add backend gallery (#5607)

2025-06-15 14:56:52 +02:00

FUNDING.yml

Create FUNDING.yml (#725)

2023-07-09 13:39:00 +02:00

labeler.yml

chore(ci): update labels

2025-02-13 09:58:19 +01:00

PULL_REQUEST_TEMPLATE.md

feat(vllm): Allow to set quantization (#1094)

2023-09-22 15:52:38 +02:00

release.yml

feat(p2p): Federation and AI swarms (#2723)

2024-07-08 22:04:06 +02:00

stale.yml

feat: add PR template and stale configuration (#316)

2023-05-20 09:10:20 +02:00