Yesterday two PRs (#9724 llama.cpp bump, #9731 llama-cpp-darwin
consolidation) merged 11 seconds apart. Both shared the same
backend.yml concurrency group (ci-backends-refs/heads/master-...) due
to "${{ github.head_ref || github.ref }}" — empty head_ref on push
events falls through to the static refs/heads/master. With
cancel-in-progress: true that meant the second merge cancelled the
first's in-flight backend builds. The first PR's CI never finished;
the second PR only touched CI files so its run was a no-op.
Two changes per workflow:
- group: replace "${{ github.head_ref || github.ref }}" with
"${{ github.event.pull_request.number || github.sha }}". On PRs
this groups by PR number (same as before, just keyed on number not
branch name); on push events it groups per-commit, so two master
pushes never share a group.
- cancel-in-progress: gate on github.event_name == 'pull_request' so
rapid pushes to a PR still cancel old runs (newer push wins) but
master pushes never cancel each other.
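For reference, the resulting concurrency block looks roughly like this
(the ci-backends prefix is taken from the old group name quoted above;
the other workflows use their own prefixes):

    concurrency:
      # PRs group by PR number; push events group per-commit (SHA), so
      # two master pushes never share a group
      group: ci-backends-${{ github.event.pull_request.number || github.sha }}
      # only PR runs cancel their predecessors; master pushes run to completion
      cancel-in-progress: ${{ github.event_name == 'pull_request' }}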
Trade-off vs alternatives:
- Merge queue would also solve this and additionally test the merged
commit before it lands. Heavier process change; out of scope here.
- Allowing per-commit master concurrency means two simultaneous master
runs may overlap and race on tag pushes, but each commit's manifest
digest is unique and the registry is last-writer-wins on tags: the
newer commit's tag overwrites the older one.
Applied to 11 workflows that share the same concurrency pattern:
backend.yml, backend_pr.yml, image.yml, image-pr.yml, lint.yml,
test.yml, test-extra.yml, tests-e2e.yml, tests-aio.yml,
tests-ui-e2e.yml, generate_intel_image.yaml.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: add per-arch + manifest-merge support for LocalAI server image
Mirror the backend_build.yml + backend_merge.yml pattern shipped in
PR #9726 for the LocalAI server image:
- image_build.yml accepts optional platform-tag (default ''), scopes
registry cache to cache-localai<suffix>-<platform-tag>, and pushes
by canonical digest only on push events. Digests upload as artifacts
named digests-localai<suffix>-<platform-tag>, with a "-core"
placeholder when tag-suffix is empty so the merge job's download
pattern doesn't over-match across multiple suffixes.
- image_merge.yml is a new reusable workflow that downloads matching
digest artifacts and assembles the final tagged manifest list via
docker buildx imagetools create.
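The merge step boils down to the documented buildx pattern, roughly as
follows (the /tmp/digests layout and the TAG variable are illustrative;
the real workflow derives its tags from inputs):

    - name: Create and push manifest list
      working-directory: /tmp/digests
      run: |
        # each downloaded digest artifact is an empty file named after a
        # per-arch image digest; stitch them into one tagged manifest list
        docker buildx imagetools create \
          --tag quay.io/go-skynet/local-ai:${TAG} \
          $(printf 'quay.io/go-skynet/local-ai@sha256:%s ' *)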
Image names differ from backend_*.yml: the LocalAI server is published
under quay.io/go-skynet/local-ai and localai/localai (not -backends).
Not yet wired into image.yml / image-pr.yml — Commit C does that.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: fan out per-arch split to remaining 34 backends
Convert all remaining linux/amd64,linux/arm64 entries in
backend-matrix.yml to per-arch + manifest-merge form. Each was a
single matrix entry running both arches on x86 under QEMU emulation;
it now becomes two: amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm
(native).
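Schematically, for one backend (field names illustrative; "foo" is a
placeholder and real entries carry more keys):

    # before: one entry, arm64 built under QEMU on x86
    - backend: "foo"
      platforms: "linux/amd64,linux/arm64"
      runs-on: "ubuntu-latest"

    # after: two native legs, merged into one manifest list downstream
    - backend: "foo"
      platforms: "linux/amd64"
      runs-on: "ubuntu-latest"
    - backend: "foo"
      platforms: "linux/arm64"
      runs-on: "ubuntu-24.04-arm"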
Four backends that were on bigger-runner (-cpu-llama-cpp,
-cpu-turboquant, -gpu-vulkan-llama-cpp, -gpu-vulkan-turboquant) have
both legs moved to free tier as part of the same change. They are
compile-only (no torch/CUDA install) and fit comfortably with the
setup-build-disk /mnt relocation. Phase 4 (next commit) retires the
remaining 5 single-arch bigger-runner entries.
After this commit:
- 271 total matrix entries (was 237)
- 0 multi-arch entries left
- 36 per-arch pairs (34 new + 2 pilots from PR #9727)
- 5 bigger-runner entries remaining (single-arch, Phase 4 target)
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: split LocalAI image multi-arch entries per arch + merge
Mirror the backend per-arch split for the main LocalAI image:
- image.yml's core-image-build matrix: split the core ('') and
-gpu-vulkan entries into amd64 + arm64 legs each. amd64 on
ubuntu-latest, arm64 on ubuntu-24.04-arm (native).
- New top-level core-image-merge and gpu-vulkan-image-merge jobs
call image_merge.yml after core-image-build completes (see the sketch
after this list).
- image-pr.yml's image-build matrix: split the -vulkan-core entry.
No merge job is added on the PR side: image_build.yml only pushes
digests on push events, so a PR-side merge would have nothing to
download.
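Roughly, in image.yml (job names from this commit; input plumbing
abbreviated):

    core-image-merge:
      needs: core-image-build
      uses: ./.github/workflows/image_merge.yml
      with:
        # must match the suffix the build legs used so the artifact
        # download pattern picks up only the matching digests
        tag-suffix: ""
      secrets: inherit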
After this commit, no workflow file references
linux/amd64,linux/arm64 in a single matrix slot.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: retire bigger-runner from backend matrix (Phase 4)
Migrate the remaining 5 single-arch bigger-runner entries to
ubuntu-latest. Combined with the Phase 3 setup-build-disk /mnt
relocation (PR #9726), free-tier ubuntu-latest now has ~100 GB of
working space — enough for ROCm dev image (~16 GB), CUDA toolkit
(~5 GB), and the per-backend compile/install steps these entries do.
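The /mnt relocation itself (PR #9726) amounts to pointing Docker's
data-root at the runner's large ephemeral disk. A minimal sketch of the
idea as a plain workflow step (the repo wraps this in the
setup-build-disk action, whose internals may differ):

    - name: Relocate Docker storage to /mnt
      run: |
        # GitHub-hosted runners mount a large ephemeral disk at /mnt;
        # moving Docker's data-root there frees up working space
        sudo systemctl stop docker
        sudo mkdir -p /mnt/docker
        echo '{"data-root": "/mnt/docker"}' | sudo tee /etc/docker/daemon.json
        sudo systemctl start docker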
Backends migrated:
- -gpu-nvidia-cuda-12-llama-cpp
- -gpu-nvidia-cuda-12-turboquant
- -gpu-rocm-hipblas-faster-whisper
- -gpu-rocm-hipblas-coqui
- -cpu-ik-llama-cpp
After this commit, .github/backend-matrix.yml has zero bigger-runner
references. The bigger-runner used in tests-vibevoice-cpp-grpc-
transcription (test-extra.yml) is a separate concern handled in a
follow-up.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: migrate 9 Intel oneAPI backends to free tier (Phase 5.1)
Intel oneAPI base image is ~6 GB; each backend's wheel install
stays well within the ~100 GB working space provided by Phase 3's
setup-build-disk /mnt relocation. Lowest-risk batch of the
arc-runner-set retirement.
Backends migrated:
vllm, sglang, vibevoice, qwen-asr, nemo, qwen-tts,
fish-speech, voxcpm, pocket-tts (all -gpu-intel-* variants).
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: migrate 15 ROCm Python backends to free tier (Phase 5.2)
ROCm dev image (~16 GB) plus per-backend torch/wheels install fits
on ubuntu-latest with the /mnt-relocated Docker root. These entries
include the heavier vLLM/sglang/transformers/diffusers stack on
ROCm; if any specific backend OOMs or runs out of disk, its entry
can be flipped back to arc-runner-set individually.
Backends migrated: all 15 -gpu-rocm-hipblas-* entries previously on
arc-runner-set (vllm/vllm-omni/sglang/transformers/diffusers/
ace-step/kokoro/vibevoice/qwen-asr/nemo/qwen-tts/fish-speech/
voxcpm/pocket-tts/neutts).
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: migrate 6 CUDA Python backends to free tier (Phase 5.3)
vLLM/sglang stacks on CUDA 12 and CUDA 13 are the heaviest
backends in the matrix — flash-attn intermediate layers can spike
disk usage during build. setup-build-disk's /mnt relocation gives
~100 GB working space which fits the documented peak.
Highest-risk batch of the arc-runner-set retirement; if any
backend fails to build on free tier, the per-entry runs-on flip
is the unit of revert.
Backends migrated: -gpu-nvidia-cuda-{12,13}-{vllm,vllm-omni,sglang}.
After this commit, .github/backend-matrix.yml has zero references
to arc-runner-set or bigger-runner. The migration is complete.
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: disable provenance on multi-registry digest pushes
Root-caused on master via PR #9727's pilot: when docker/build-push-action@v7
pushes a single build to TWO registries simultaneously with
push-by-digest=true, buildx generates a per-registry provenance
attestation manifest (because mode=max — the default for push:true —
includes the runner ID). That makes the resulting manifest-list digest
diverge across registries:
arm64 -cpu-faster-whisper build:
image manifest: sha256:d3bdd34b... (identical, content-only)
quay manifest list: sha256:66b4cfc8... (with quay attestation)
dockerhub manifest list: sha256:e0733c3b... (with dockerhub attestation)
steps.build.outputs.digest returns only one of the list digests
(empirically the dockerhub one). The merge job then requests
"quay.io/...@sha256:e0733c3b...", which doesn't exist on quay (that
list has digest 66b4cfc8 there). Result: imagetools create fails with
"not found" and the merge job fails (run 25581983094, job 75110021491).
Setting provenance: false drops the per-registry attestation; the
manifest-list digest becomes pure content, identical across both
registries, and steps.build.outputs.digest works on either lookup.
Applied to backend_build.yml and image_build.yml — both refactored
to use the same multi-registry digest-push pattern in the prior PRs.
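The fix in the build step, sketched (inputs per docker/build-push-action;
fields abbreviated, the matrix.platform reference is illustrative, and
only one registry is shown):

    - name: Build and push by digest
      uses: docker/build-push-action@v7
      with:
        platforms: ${{ matrix.platform }}
        # push by digest only; tags are applied later by the merge job
        outputs: type=image,name=quay.io/go-skynet/local-ai,push-by-digest=true,push=true
        # no provenance attestation: the manifest-list digest stays pure
        # content and therefore identical across registries
        provenance: false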
Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
- Switch cache-from/cache-to in backend_build.yml and image_build.yml
from the unused gha cache to type=registry pointing at
quay.io/go-skynet/ci-cache:cache<tag-suffix>, mode=max with
ignore-error=true. Master/tag builds populate their own
per-matrix-entry cache; PR builds are read-only (sketch after this list).
- Drop the broken generate_grpc_cache.yaml cron. It targeted a `grpc`
Dockerfile stage that was removed by b1fc5acd in July 2025, has been
failing every night since, and never populated the gha cache. The new
registry-cache scheme is self-warming, so no separate populator is
needed.
- Remove the dead GRPC_VERSION / GRPC_BASE_IMAGE / GRPC_MAKEFLAGS
build-args from image_build.yml and the orphan ARG GRPC_BASE_IMAGE in
the root Dockerfile (the root Dockerfile no longer compiles gRPC; the
source build now lives in backend/Dockerfile.{llama-cpp,
ik-llama-cpp, turboquant} only and uses its own ARG defaults).
- Drop the unused grpc-base-image input from image_build.yml plus the
matrix passthroughs in image.yml / image-pr.yml.
- Drop the unused GRPC_VERSION env in test.yml.
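Sketch of the new cache wiring in the build step (the tag-suffix input
name is assumed; the gating of cache-to to master/tag builds is omitted):

    - name: Build
      uses: docker/build-push-action@v7
      with:
        # all builds read the per-matrix-entry cache
        cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:cache${{ inputs.tag-suffix }}
        # only master/tag builds write it back; ignore-error keeps a
        # missing or unwritable cache from failing the build
        cache-to: type=registry,ref=quay.io/go-skynet/ci-cache:cache${{ inputs.tag-suffix }},mode=max,ignore-error=true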
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude-code:claude-opus-4-7-1m
AIO images have fallen behind, and maintaining them takes effort. The
wizard and model installation have been simplified massively, so AIO
images have lost their purpose.
This allows us to be more laser-focused on the main images and relieves
stress on CI.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Some datacenter setups might be stuck with the 5.x kernel, which doesn't
play well with CUDA >=12.9. To increase compatibility with the CUDA 12.x
branch, downgrade to 12.8. For newer systems, it is still suggested to
use CUDA 13.x wherever compatible.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This image is for HW prior to Jetpack 7. Jetpack 7 broke compatibility
with older devices that are still in use, such as AGX Orin and other
Jetsons. While we do have l4t-cuda-13 images with sbsa support for newer
Nvidia devices (Thor, DGX, etc.), for older HW we are forced to keep old
images around, as 24.04 does not seem to be supported.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: use ubuntu 24.04 for cuda13 l4t images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop openblas from containers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): add cuda13 jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add to pipelines and to capabilities. Start to work on the gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* capabilities: try to detect by looking at /usr/local
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* neutts
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* backends.yaml
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add cuda13 l4t requirements.txt
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add cuda13 requirements.txt
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Pin vllm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Not all backends are compatible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add vllm to requirements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* vllm is not pre-compiled for cuda 13
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: split remaining backends and drop embedded backends
- Drop silero-vad, huggingface, and stores backend from embedded
binaries
- Refactor Makefile and Dockerfile to avoid building grpc backends
- Drop golang code that was used to embed backends
- Simplify building by using goreleaser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(gallery): be specific with llama-cpp backend templates
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(docs): update
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): minor fixes
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: drop all ffmpeg references
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: run protogen-go
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Always enable p2p mode
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Update gorelease file
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(stores): do not always load
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix linting issues
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Mac OS fixup
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: Add backend gallery
This PR adds support for managing backends much like models: a backend
gallery is now available which can be used to install and remove extra
backends.
The backend gallery can be configured similarly to a model gallery, and
API calls allow installing and removing backends at runtime as well as
during LocalAI's startup phase.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip: Backend Dockerfile for python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: drop extras images, build python backends separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup on all backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* test CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tweaks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop old backends leftovers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move dockerfile upper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix proto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Feature dropped for consistency - we prefer model galleries
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing packages in the build image
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* exllama is only available on cublas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* pin torch on chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups to index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
* Install accelerator deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add target arch
* Add cuda minor version
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use self-hosted runners
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: use quay for test images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups for vllm and chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups on CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chatterbox is only available for nvidia
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify CI builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt test, use qwen3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use reranker from llama.cpp in AIO images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Limit concurrent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* fix(cuda): downgrade to 12.0 to increase compatibility range
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* improve messaging
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>