LocalAI/.github at 23b11a5239cd076ca7aeab61b067f7b846ba3fce

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-28 10:27:30 -04:00

Ettore Di Giacinto 9bb8994c4e chore(paged): drop CUDA-12 variants of llama-cpp-localai-paged, keep CUDA-13 only

The paged backend targets Blackwell sm_121a, which CUDA 12.0 cannot target
at all, so the CUDA-12 variants were pointless. They were also broken: the
cublas-12 / nvidia-l4t / arm64 build failed to compile paged-kv-manager.cpp
("no declaration matches ...", a ~10-function mismatch the older
cuda-12-base gcc rejects). CUDA-13 compiles it fine (confirmed on GB10).

Removed (config-only, scoped to the paged backend):
- backend-matrix.yml: the two CUDA-12 paged rows
  (-gpu-nvidia-cuda-12-llama-cpp-localai-paged,
   -nvidia-l4t-arm64-llama-cpp-localai-paged)
- backend/index.yaml: CUDA-12 capability keys (nvidia-cuda-12,
  nvidia-l4t-cuda-12, nvidia-l4t) on both meta-backends, repointed
  default/nvidia to the cuda13 amd64 variant, and dropped the orphaned
  cuda12-* / nvidia-l4t-arm64-* variant definitions (latest + -development).

Kept CUDA-13 only: cuda13-llama-cpp-localai-paged (amd64) and
cuda13-nvidia-l4t-arm64-llama-cpp-localai-paged (l4t arm64). Matrix
tag-suffixes <-> index variant URIs form a clean 2:2 bijection.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-28 01:37:54 +00:00
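
The commit message's claim that matrix tag-suffixes and index variant URIs form a 2:2 bijection can be checked mechanically. The following is a minimal sketch, not the project's own tooling: the suffix strings and the `registry.example` URIs are hypothetical placeholders, and in practice the two lists would be extracted from `backend-matrix.yml` and `backend/index.yaml`.

```python
def match_suffixes(tag_suffixes, variant_uris):
    """Map each tag suffix to the list of variant URIs that end with it."""
    return {s: [u for u in variant_uris if u.endswith(s)] for s in tag_suffixes}


def is_bijection(tag_suffixes, variant_uris):
    """Return True when suffixes and URIs pair up one-to-one with no orphans."""
    matches = match_suffixes(tag_suffixes, variant_uris)
    # Every suffix must resolve to exactly one URI.
    if any(len(uris) != 1 for uris in matches.values()):
        return False
    matched = [uris[0] for uris in matches.values()]
    # No URI may be shared by two suffixes, and no URI may be left unmatched.
    return len(set(matched)) == len(matched) and set(matched) == set(variant_uris)


# Hypothetical values for illustration only; the real ones live in the repo's
# backend-matrix.yml (tag-suffix) and backend/index.yaml (variant uri).
SUFFIXES = [
    "-gpu-nvidia-cuda-13-llama-cpp-localai-paged",
    "-nvidia-l4t-cuda-13-arm64-llama-cpp-localai-paged",
]
URIS = [
    "registry.example/backends:latest-gpu-nvidia-cuda-13-llama-cpp-localai-paged",
    "registry.example/backends:latest-nvidia-l4t-cuda-13-arm64-llama-cpp-localai-paged",
]

if __name__ == "__main__":
    print("bijection holds:", is_bijection(SUFFIXES, URIS))
```

A leftover variant definition, such as an orphaned CUDA-12 URI with no matching matrix row, would make `is_bijection` return `False`, which is the kind of drift the commit describes cleaning up.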

ci: phase 1-3 of GHA free tier migration (path filter, multi-arch split prep, /mnt disk relief) (#9726)

2026-05-08 23:43:41 +02:00

fix: roll out bluemonday Sanitize more widely (#3794)

2024-10-12 09:45:47 +02:00

Harden gallery-agent Hugging Face fetches against transient rate limiting (#10187)

2026-06-05 23:43:06 +02:00

docs/examples: enhancements (#1572)

2024-01-18 19:41:08 +01:00

docs(paged): drop moot PIN_SYNC_c299a92c record, repoint to README sec 7

2026-06-27 21:34:10 +00:00

docs(paged): drop moot PIN_SYNC_c299a92c record, repoint to README sec 7

2026-06-27 21:34:10 +00:00

backend-matrix.yml

chore(paged): drop CUDA-12 variants of llama-cpp-localai-paged, keep CUDA-13 only

2026-06-28 01:37:54 +00:00

bump_deps.sh

feat: do not bundle llama-cpp anymore (#5790)

2025-07-18 13:24:12 +02:00

bump_docs.sh

fix: github bump_docs.sh regex to drop emoji and other text (#2180)

2024-04-29 03:55:29 +00:00

bump_vllm_metal.sh

feat(vllm): macOS/Metal support via vllm-metal (MLX) (#10489)

2026-06-25 15:46:19 +02:00

bump_vllm_wheel.sh

feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563)

2026-04-29 00:49:28 +02:00

check_and_update.py

fix(ci): fixup checksum scanning pipeline (#3631)

2024-09-23 10:56:10 +02:00

checksum_checker.sh

fix(ci): fixup correct path for check_and_update.py (#2777)

2024-07-11 23:05:43 +02:00

dependabot.yml

feat: Add backend gallery (#5607)

2025-06-15 14:56:52 +02:00

FUNDING.yml

Create FUNDING.yml (#725)

2023-07-09 13:39:00 +02:00

labeler.yml

chore(ci): update labels

2025-02-13 09:58:19 +01:00

PULL_REQUEST_TEMPLATE.md

feat(vllm): Allow to set quantization (#1094)

2023-09-22 15:52:38 +02:00

release.yml

feat(p2p): Federation and AI swarms (#2723)

2024-07-08 22:04:06 +02:00

stale.yml

feat: add PR template and stale configuration (#316)

2023-05-20 09:10:20 +02:00