LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-17 13:10:23 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	f0374aa0e8	ci: finish GHA free-tier migration (per-arch fan-out, image splits, retire self-hosted, fix provenance) (#9730 ) * ci: add per-arch + manifest-merge support for LocalAI server image Mirror the backend_build.yml + backend_merge.yml pattern shipped in PR #9726 for the LocalAI server image: - image_build.yml accepts optional platform-tag (default ''), scopes registry cache to cache-localai<suffix>-<platform-tag>, and pushes by canonical digest only on push events. Digests upload as artifacts named digests-localai<suffix>-<platform-tag>, with a "-core" placeholder when tag-suffix is empty so the merge job's download pattern doesn't over-match across multiple suffixes. - image_merge.yml is a new reusable workflow that downloads matching digest artifacts and assembles the final tagged manifest list via docker buildx imagetools create. Image names differ from backend_.yml: the LocalAI server is published under quay.io/go-skynet/local-ai and localai/localai (not -backends). Not yet wired into image.yml / image-pr.yml — Commit C does that. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> ci: fan out per-arch split to remaining 34 backends Convert all remaining linux/amd64,linux/arm64 entries in backend-matrix.yml to per-arch + manifest-merge form. Each was a single matrix entry running both arches on x86 under QEMU emulation; each becomes two entries — amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native). Four backends that were on bigger-runner (-cpu-llama-cpp, -cpu-turboquant, -gpu-vulkan-llama-cpp, -gpu-vulkan-turboquant) have both legs moved to free tier as part of the same change. They are compile-only (no torch/CUDA install) and fit comfortably with the setup-build-disk /mnt relocation. Phase 4 (next commit) retires the remaining 5 single-arch bigger-runner entries. 
After this commit: - 271 total matrix entries (was 237) - 0 multi-arch entries left - 36 per-arch pairs (34 new + 2 pilots from PR #9727) - 5 bigger-runner entries remaining (single-arch, Phase 4 target) Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: split LocalAI image multi-arch entries per arch + merge Mirror the backend per-arch split for the main LocalAI image: - image.yml's core-image-build matrix: split the core ('') and -gpu-vulkan entries into amd64 + arm64 legs each. amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native). - New top-level core-image-merge and gpu-vulkan-image-merge jobs call image_merge.yml after core-image-build completes. - image-pr.yml's image-build matrix: split the -vulkan-core entry. No merge job added on the PR side — image_build.yml's digest-push is push-only-event-gated, so a PR-side merge would have nothing to download. After this commit, no workflow file references linux/amd64,linux/arm64 in a single matrix slot. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: retire bigger-runner from backend matrix (Phase 4) Migrate the remaining 5 single-arch bigger-runner entries to ubuntu-latest. Combined with the Phase 3 setup-build-disk /mnt relocation (PR #9726), free-tier ubuntu-latest now has ~100 GB of working space — enough for ROCm dev image (~16 GB), CUDA toolkit (~5 GB), and the per-backend compile/install steps these entries do. Backends migrated: - -gpu-nvidia-cuda-12-llama-cpp - -gpu-nvidia-cuda-12-turboquant - -gpu-rocm-hipblas-faster-whisper - -gpu-rocm-hipblas-coqui - -cpu-ik-llama-cpp After this commit, .github/backend-matrix.yml has zero bigger-runner references. The bigger-runner used in tests-vibevoice-cpp-grpc- transcription (test-extra.yml) is a separate concern handled in a follow-up. 
Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 9 Intel oneAPI backends to free tier (Phase 5.1) Intel oneAPI base image is ~6 GB; each backend's wheel install stays well within the ~100 GB working space provided by Phase 3's setup-build-disk /mnt relocation. Lowest-risk batch of the arc-runner-set retirement. Backends migrated: vllm, sglang, vibevoice, qwen-asr, nemo, qwen-tts, fish-speech, voxcpm, pocket-tts (all -gpu-intel-* variants). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 15 ROCm Python backends to free tier (Phase 5.2) ROCm dev image (~16 GB) plus per-backend torch/wheels install fits on ubuntu-latest with the /mnt-relocated Docker root. These entries include the heavier vLLM/sglang/transformers/diffusers stack on ROCm; if any specific backend OOMs or runs out of disk, individual flips back to arc-runner-set are revertable per-entry. Backends migrated: all 15 -gpu-rocm-hipblas-* entries previously on arc-runner-set (vllm/vllm-omni/sglang/transformers/diffusers/ ace-step/kokoro/vibevoice/qwen-asr/nemo/qwen-tts/fish-speech/ voxcpm/pocket-tts/neutts). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 6 CUDA Python backends to free tier (Phase 5.3) vLLM/sglang stacks on CUDA 12 and CUDA 13 are the heaviest backends in the matrix — flash-attn intermediate layers can spike disk usage during build. setup-build-disk's /mnt relocation gives ~100 GB working space which fits the documented peak. Highest-risk batch of the arc-runner-set retirement; if any backend fails to build on free tier, the per-entry runs-on flip is the unit of revert. Backends migrated: -gpu-nvidia-cuda-{12,13}-{vllm,vllm-omni,sglang}. After this commit, .github/backend-matrix.yml has zero references to arc-runner-set or bigger-runner. The migration is complete. 
Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: disable provenance on multi-registry digest pushes Root-caused on master via PR #9727's pilot: when docker/build-push-action@v7 pushes a single build to TWO registries simultaneously with push-by-digest=true, buildx generates a per-registry provenance attestation manifest (because mode=max — the default for push:true — includes the runner ID). That makes the resulting manifest-list digest diverge across registries: arm64 -cpu-faster-whisper build: image manifest: sha256:d3bdd34b... (identical, content-only) quay manifest list: sha256:66b4cfc8... (with quay attestation) dockerhub manifest list: sha256:e0733c3b... (with dockerhub attestation) steps.build.outputs.digest returns only one of the list digests (empirically the dockerhub one). The merge job then asks "quay.io/...@sha256:e0733c3b..." which doesn't exist on quay — that list has digest 66b4cfc8 there. Result: imagetools create fails with "not found" and the merge job fails (run 25581983094, job 75110021491). Setting provenance: false drops the per-registry attestation; the manifest-list digest becomes pure content, identical across both registries, and steps.build.outputs.digest works on either lookup. Applied to backend_build.yml and image_build.yml — both refactored to use the same multi-registry digest-push pattern in the prior PRs. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-09 09:37:00 +02:00
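The provenance fix in the last commit of this PR can be illustrated with a toy model. The sketch below is not the OCI manifest format and not buildx's actual behaviour; it only assumes, for illustration, that the attestation entry carries something specific to the registry it was pushed to. Under that assumption the manifest-list digest diverges per registry, while dropping the attestation makes it depend on content alone. All field names (`pushed_to`, `platform`) are hypothetical.

```python
import hashlib
import json


def digest(obj) -> str:
    """sha256 over a canonical JSON encoding, formatted like an OCI digest."""
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()
    return "sha256:" + hashlib.sha256(payload).hexdigest()


def manifest_list(image_digest: str, registry: str, provenance: bool) -> dict:
    """Toy manifest list: the image manifest plus an optional attestation."""
    manifests = [{"digest": image_digest, "platform": "linux/arm64"}]
    if provenance:
        # Assumption for illustration only: the attestation embeds a
        # registry-specific value, so its digest differs per push target.
        attestation = {"type": "provenance", "pushed_to": registry}
        manifests.append({"digest": digest(attestation), "platform": "unknown"})
    return {"manifests": manifests}


# The image manifest is content-only, so it is the same on every registry.
image = digest({"layers": ["layer-a", "layer-b"]})

registries = ("quay", "dockerhub")
with_prov = {r: digest(manifest_list(image, r, True)) for r in registries}
without_prov = {r: digest(manifest_list(image, r, False)) for r in registries}

print(with_prov["quay"] == with_prov["dockerhub"])        # False
print(without_prov["quay"] == without_prov["dockerhub"])  # True
```

With diverging list digests, a single `steps.build.outputs.digest` value can only resolve on one of the two registries, which matches the "not found" failure described in the commit message.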
LocalAI [bot]	1f313cfdb0	ci: phase 1-3 of GHA free tier migration (path filter, multi-arch split prep, /mnt disk relief) (#9726 ) * ci: extract free-disk-space composite action Consolidate the apt-clean + dotnet/android/ghc/boost removal blocks from backend_build.yml, image_build.yml, and test.yml into a single composite action. The three callers had slightly different inline blocks; the composite uses the more aggressive backend_build/image_build variant for all three callers — test.yml jobs now also purge snapd, edge/firefox/ powershell/r-base-core, and sweep /opt/ghc + /usr/local/share/boost + $AGENT_TOOLSDIRECTORY. Idempotent and skipped on self-hosted runners. In test.yml, actions/checkout now runs before the composite action call because the composite lives at ./.github/actions/free-disk-space and requires a checked-out repo. The original ordering relied on jlumbroso/free-disk-space@main being a remote action; this is the minimum-invasive change to support a local composite. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: path-filter backend.yml master push Run scripts/changed-backends.js on master pushes too (not just PRs) so unrelated commits don't rebuild all ~210 backend container images. Tag pushes still build the full matrix via FORCE_ALL. Push events use the GitHub Compare API to diff event.before..event.after. Edge cases (first push with zero base, API truncation beyond 300 files, missing fields, network failure) fall back to "run everything" — better safe than silently miss a backend. The matrix literal moves from .github/workflows/backend.yml into a new data-only file at .github/backend-matrix.yml (outside workflows/ so actionlint doesn't try to parse it as a workflow). Both backend.yml and backend_pr.yml now consume the dynamic matrix output uniformly via fromJson(needs.generate-matrix.outputs.matrix); the script reads the matrix from the new location. 
Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: bound max-parallel on backend-jobs matrices Cap to 8 concurrent jobs to avoid queue starvation on the shared GHA free pool while migration is in flight. Lift after Phases 4-5 retire the self-hosted runners. Also drops a leftover commented-out max-parallel line that lived in backend.yml since the previous matrix shape. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: scope backend cache per arch, push by digest Prepare backend_build.yml for the multi-arch split. The reusable workflow now accepts a `platform-tag` input ("amd64" / "arm64") that scopes the registry cache to cache<suffix>-<platform-tag> and (on push events) pushes the resulting image by canonical digest only. Digests are uploaded as artifacts named digests<suffix>-<platform-tag> for the merge job (Task 2.2) to consume. `platform-tag` is optional with empty default during the migration — existing callers continue to work unchanged (their cache key just becomes `cache<suffix>-`, an orphaned but valid key). Tasks 2.3+ will update callers to pass an explicit "amd64" / "arm64" value. Phase 6 flips the input to required: true once every caller is wired. PR builds keep their existing tag-based push to ci-tests but pick up the per-arch cache key. Multi-arch PR builds remain emulated in this commit; they migrate when the matrix entries split (Tasks 2.3+). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: add backend_merge.yml reusable workflow Joins per-arch digest artifacts (uploaded by backend_build.yml when called with platform-tag) into a single tagged multi-arch manifest list via `docker buildx imagetools create`. Called once per backend by backend.yml after both per-arch build jobs succeed. 
The workflow generates final tags identically to the previous monolithic build job (same docker/metadata-action invocation), so consumers of quay.io/go-skynet/local-ai-backends and localai/localai-backends see no tag-shape change. Two imagetools calls (one per registry) reference the same per-arch digests under different image names. Not yet wired into backend.yml — Tasks 2.3+ rewrite individual matrix entries to expand into per-arch + merge jobs that call this workflow. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: relocate Docker data-root to /mnt on hosted runners GHA hosted ubuntu-latest runners ship a ~75 GB /mnt drive that's unused by default. Stopping Docker, rsync'ing /var/lib/docker to /mnt, and restarting with data-root pointing there yields ~100 GB of working space (combined with the apt-clean from Task 1.1) — enough for ROCm dev image + vLLM torch install + flash-attn intermediate layers. This is the structural change that lets Phases 4 and 5 of the migration plan move the bigger-runner and arc-runner-set jobs onto ubuntu-latest. The composite action is no-op on self-hosted runners (where /mnt isn't expected) and on non-X64 runners (Task 3.2 verifies the arm64 hosted pool's /mnt shape separately before enabling). Wired into both backend_build.yml and image_build.yml between free-disk-space and the first Docker operation. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(setup-build-disk): chmod 1777 /mnt/docker-tmp buildx CLI runs as the unprivileged 'runner' user and creates config dirs under TMPDIR before binding them into the buildkit container. 
/mnt is root-owned by default, so the original mkdir produced a permission-denied when buildx tried to write there: ERROR: mkdir /mnt/docker-tmp/buildkitd-config2740457204: permission denied Mirror /tmp's permission mode (1777 — world-writable with sticky bit) on /mnt/docker-tmp so non-root processes can stage their config. Caught by the first PR run (image-build hipblas job) on PR #9726. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: weekly full-matrix rebuild via cron Path-filtering backend.yml master push (the previous commit's main optimization) skips backends whose source didn't change. That broke the DEPS_REFRESH cache-buster's coverage: the build-arg keyed on %Y-W%V busts the install layer's cache on a new ISO week, but only when the build actually runs. Untouched Python backends (torch, transformers, vllm with no version pin) would otherwise ship stale wheels indefinitely. Add a Sunday 06:00 UTC cron that fires the full matrix. Schedule events have no event.ref / event.before, so the script's changedFiles == null fallback (scripts/changed-backends.js) emits the full matrix automatically — no script change needed. C++/Go backends with pinned deps cache-hit and complete fast, so the weekly cost is dominated by Python re-resolves which is exactly what we want. workflow_dispatch added so a maintainer can trigger an ad-hoc full-matrix rebuild without faking a tag push. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-08 23:43:41 +02:00
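The path-filter behaviour described in this PR (build only the backends whose files changed, but fall back to the full matrix when the changed-file list is unavailable) can be sketched as follows. This is a Python illustration, not the repository's `scripts/changed-backends.js`; the `context` field, the example paths, and the function name `select_entries` are hypothetical.

```python
from typing import Iterable, List, Optional


def select_entries(matrix: List[dict],
                   changed_files: Optional[Iterable[str]],
                   force_all: bool = False) -> List[dict]:
    """Pick the matrix entries to build for one workflow run."""
    # None stands for "diff unavailable": zero base commit, truncated API
    # response, network failure, or a schedule event with no before/after.
    # In all of these cases it is safer to build everything.
    if force_all or changed_files is None:
        return list(matrix)
    changed = list(changed_files)
    return [
        entry for entry in matrix
        if any(path.startswith(entry["context"].rstrip("/") + "/")
               for path in changed)
    ]


# Hypothetical two-entry matrix; the real file is .github/backend-matrix.yml.
matrix = [
    {"backend": "vllm", "context": "backend/python/vllm"},
    {"backend": "llama-cpp", "context": "backend/cpp/llama-cpp"},
]

print([e["backend"] for e in select_entries(matrix, ["backend/python/vllm/install.sh"])])
print(len(select_entries(matrix, None)))  # cron or failed diff: full matrix
```

Treating "unknown" as "everything" is what lets the weekly cron reuse the same code path without any script change, as the commit message notes.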
Ettore Di Giacinto	50580a84ae	fix(ci): switch apt mirror per runner — azure on github-hosted, kernel.org on self-hosted Self-hosted runners (arc-runner-set, bigger-runner) cannot reach azure.archive.ubuntu.com — they live in different networks (e.g. our arc-runner-set Kubernetes cluster) where Azure's mirror IP is not routable. Symptom: "Connection failed [IP: 51.11.236.225 80]" with each Ign:/Err: cycle taking 60s, hanging the build for ~16 minutes before exit 100. Pick the mirror based on `runner.environment`: * github-hosted (ubuntu-latest, ubuntu-24.04-arm) → Azure (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com) — same VPC as the runner. * self-hosted (arc-runner-set, bigger-runner) → kernel.org (https://mirrors.edge.kernel.org for both archive and ports) — publicly reachable from any network. The choice now lives in one place: the .github/actions/configure-apt-mirror composite action exposes `effective-mirror` / `effective-ports-mirror` outputs so the reusable workflows can forward the same value as Docker build-args without duplicating the per-runner-environment branch. The now-redundant `apt-mirror` / `apt-ports-mirror` workflow inputs on image_build.yml and backend_build.yml are dropped — defaults live in the composite action and are visible there. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-03 22:59:26 +00:00
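The per-runner mirror choice above amounts to a two-way branch on `runner.environment`. A minimal Python sketch of that decision follows; the real logic lives in the `.github/actions/configure-apt-mirror` composite action, and the function name `pick_mirrors` is hypothetical. The URLs are the ones quoted in the commit message.

```python
from typing import Tuple

# (archive mirror, ports mirror), as named in the commit message.
AZURE = ("http://azure.archive.ubuntu.com", "http://azure.ports.ubuntu.com")
KERNEL_ORG = ("https://mirrors.edge.kernel.org", "https://mirrors.edge.kernel.org")


def pick_mirrors(runner_environment: str) -> Tuple[str, str]:
    """Return (effective-mirror, effective-ports-mirror) for a runner."""
    # GitHub-hosted runners can reach the Azure mirror; anything else is
    # treated as self-hosted and gets the publicly reachable kernel.org mirror.
    if runner_environment == "github-hosted":
        return AZURE
    return KERNEL_ORG


mirror, ports_mirror = pick_mirrors("self-hosted")
print(mirror, ports_mirror)
```

Returning both values from one place mirrors the stated goal of the change: the same pair can be forwarded as Docker build-args without repeating the branch in each workflow.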
Ettore Di Giacinto	8edac61e57	feat(ci): allow routing apt traffic through an alternate Ubuntu mirror (#9650 ) * feat(ci): allow routing apt traffic through an alternate Ubuntu mirror Adds opt-in APT_MIRROR / APT_PORTS_MIRROR knobs to all Dockerfiles, the Makefile, and CI workflows so we can fail over to a non-canonical Ubuntu mirror when archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com are degraded (recently observed: multi-day DDoS against the default pool). Defaults are empty everywhere — behavior is unchanged unless a mirror is configured. To enable in CI, set the repo-level GitHub Actions variables APT_MIRROR (and APT_PORTS_MIRROR for arm64 builds). Locally: make docker APT_MIRROR=http://azure.archive.ubuntu.com A small POSIX-sh helper in .docker/apt-mirror.sh rewrites both DEB822 (/etc/apt/sources.list.d/ubuntu.sources, Ubuntu 24.04+) and the legacy /etc/apt/sources.list before the first apt-get update. Dockerfile stages load it via RUN --mount=type=bind, so there is no extra layer and no cache invalidation when the script is unchanged. Reusable workflows also rewrite the runner's own /etc/apt sources before any sudo apt-get call. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(apt-mirror): default to the Azure mirror, visible in the workflow source Bakes Azure (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com) in as the default for both Docker builds and runner-side apt — rather than hiding the URL behind a GitHub Actions repo variable that's not visible from the source tree. A new composite action at .github/actions/configure-apt-mirror is the single source of truth for runner-side rewrites. Five standalone workflows (build-test, release, tests-e2e, tests-ui-e2e, update_swagger) just `uses: ./.github/actions/configure-apt-mirror`. 
Three workflows (image_build, backend_build, checksum_checker) keep an inline bash rewrite, because they install/upgrade git via apt before the checkout step (so the local composite action isn't loadable yet). The Azure URL is visible in those files too. The `apt-mirror` / `apt-ports-mirror` inputs of the reusable workflows keep their now-Azure defaults — they still feed the Docker build-args block in addition to the inline runner-side rewrite. Callers (image.yml, image-pr.yml, backend.yml, backend_pr.yml) drop the previous `vars.APT_MIRROR` plumbing and rely on those defaults. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(apt-mirror): drop Force Install GIT, consolidate on the composite action The PPA git upgrade ran add-apt-repository ppa:git-core/ppa, which talks to api.launchpad.net — also part of Canonical's infrastructure and currently returning HTTP 504. The Azure mirror only covers archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com, not PPAs. The system git that ubuntu-latest already ships is sufficient for actions/checkout and the build pipeline, so just drop the upgrade. With that gone, the apt-before-checkout constraint disappears too — all three holdouts (image_build, backend_build, checksum_checker) can now switch to ./.github/actions/configure-apt-mirror like the other five. Net: 0 inline apt-mirror blocks, all 8 workflows route through the composite action. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-03 23:50:13 +02:00
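The source rewrite performed by the mirror helper can be sketched in a few lines. The actual helper is the POSIX-sh script `.docker/apt-mirror.sh`; the Python below is only an illustration, and the exact substitution rules (which hosts are replaced, and that `security.ubuntu.com` maps to the same mirror as the archive) are assumptions. The function name `rewrite_sources` is hypothetical.

```python
import re

ARCHIVE_HOSTS = re.compile(r"https?://(?:archive|security)\.ubuntu\.com")
PORTS_HOST = re.compile(r"https?://ports\.ubuntu\.com")


def rewrite_sources(text: str, mirror: str = "", ports_mirror: str = "") -> str:
    """Point apt sources at a mirror; empty values leave the text unchanged."""
    if mirror:
        # A function replacement avoids any escape processing of the URL.
        text = ARCHIVE_HOSTS.sub(lambda _m: mirror.rstrip("/"), text)
    if ports_mirror:
        text = PORTS_HOST.sub(lambda _m: ports_mirror.rstrip("/"), text)
    return text


# The same function handles a DEB822 stanza and a legacy one-line entry,
# because it only touches the URL, not the surrounding syntax.
deb822 = "Types: deb\nURIs: http://archive.ubuntu.com/ubuntu/\nSuites: noble\n"
legacy = "deb http://security.ubuntu.com/ubuntu noble-security main\n"

print(rewrite_sources(deb822, mirror="http://azure.archive.ubuntu.com"))
print(rewrite_sources(legacy, mirror="http://azure.archive.ubuntu.com"))
```

Because the patterns only match the canonical hosts, running the rewrite a second time on already-rewritten text is a no-op in this sketch.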
Ettore Di Giacinto	bdfa5e934a	ci: switch image/backend build cache to a dedicated registry image - Switch cache-from/cache-to in backend_build.yml and image_build.yml from the unused gha cache to type=registry pointing at quay.io/go-skynet/ci-cache:cache<tag-suffix>, mode=max with ignore-error=true. Master/tag builds populate their own per-matrix-entry cache; PR builds read-only. - Drop the broken generate_grpc_cache.yaml cron. It targeted a `grpc` Dockerfile stage that was removed by `b1fc5acd` in July 2025, has been failing every night since, and never populated the gha cache. The new registry-cache scheme is self-warming, so no separate populator is needed. - Remove the dead GRPC_VERSION / GRPC_BASE_IMAGE / GRPC_MAKEFLAGS build-args from image_build.yml and the orphan ARG GRPC_BASE_IMAGE in the root Dockerfile (the root Dockerfile no longer compiles gRPC; the source build now lives in backend/Dockerfile.{llama-cpp, ik-llama-cpp, turboquant} only and uses its own ARG defaults). - Drop the unused grpc-base-image input from image_build.yml plus the matrix passthroughs in image.yml / image-pr.yml. - Drop the unused GRPC_VERSION env in test.yml. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7-1m	2026-04-27 13:13:04 +00:00
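The cache scheme in this commit (every build reads from a per-entry registry ref, only master/tag builds write to it) can be sketched as a small helper that produces the `cache-from` / `cache-to` strings. The ref layout `quay.io/go-skynet/ci-cache:cache<tag-suffix>` and the `mode=max` / `ignore-error=true` options are taken from the commit message; the function name `cache_args` and the `is_pull_request` flag are hypothetical, and the real workflows express this in YAML rather than Python.

```python
from typing import Dict

CACHE_IMAGE = "quay.io/go-skynet/ci-cache"


def cache_args(tag_suffix: str, is_pull_request: bool) -> Dict[str, str]:
    """Build buildx cache settings for one matrix entry."""
    ref = f"{CACHE_IMAGE}:cache{tag_suffix}"
    args = {"cache-from": f"type=registry,ref={ref}"}
    if not is_pull_request:
        # Only master/tag builds populate the cache; PR builds are read-only.
        args["cache-to"] = f"type=registry,ref={ref},mode=max,ignore-error=true"
    return args


print(cache_args("-gpu-vulkan", is_pull_request=False))
print(cache_args("-gpu-vulkan", is_pull_request=True))
```

Keeping PR builds read-only means untrusted branches cannot overwrite the cache that master builds rely on, while still benefiting from warm layers.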
Ettore Di Giacinto	5affb747a9	chore: drop AIO images (#9004 ) AIO images are lagging behind and take effort to maintain. The wizard and model installation have been simplified massively, so AIO images have lost their purpose. Dropping them lets us focus on the main images and relieves stress on CI. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-14 17:49:36 +01:00
dependabot[bot]	01bd3d8212	chore(deps): bump docker/login-action from 3 to 4 (#8918 ) Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4. - [Release notes](https://github.com/docker/login-action/releases) - [Commits](https://github.com/docker/login-action/compare/v3...v4) --- updated-dependencies: - dependency-name: docker/login-action dependency-version: '4' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-03-09 22:30:11 +01:00
dependabot[bot]	7f11f66b44	chore(deps): bump docker/build-push-action from 6 to 7 (#8919 ) Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/v6...v7) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-03-09 22:29:51 +01:00
dependabot[bot]	2a351e1f0c	chore(deps): bump docker/metadata-action from 5 to 6 (#8917 ) Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5 to 6. - [Release notes](https://github.com/docker/metadata-action/releases) - [Commits](https://github.com/docker/metadata-action/compare/v5...v6) --- updated-dependencies: - dependency-name: docker/metadata-action dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-03-09 22:27:02 +01:00
Richard Palethorpe	e6ba26c3e7	chore: Update to Ubuntu24.04 (cont #7423 ) (#7769 ) * ci(workflows): bump GitHub Actions images to Ubuntu 24.04 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): remove CUDA 11.x support from GitHub Actions (incompatible with ubuntu:24.04) Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): bump GitHub Actions CUDA support to 12.9 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): bump base image to ubuntu:24.04 and adjust Vulkan SDK/packages Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix(backend): correct context paths for Python backends in workflows, Makefile and Dockerfile Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): disable parallel backend builds to avoid race conditions Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): export CUDA_MAJOR_VERSION and CUDA_MINOR_VERSION for override Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(backend): update backend Dockerfiles to Ubuntu 24.04 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(backend): add ROCm env vars and default AMDGPU_TARGETS for hipBLAS builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(chatterbox): bump ROCm PyTorch to 2.9.1+rocm6.4 and update index URL; align hipblas requirements Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore: add local-ai-launcher to .gitignore Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): fix backends GitHub Actions workflows after rebase Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): use build-time UBUNTU_VERSION variable Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(docker): remove libquadmath0 from requirements-stage base 
image Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(make): add backends/vllm to .NOTPARALLEL to prevent parallel builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix(docker): correct CUDA installation steps in backend Dockerfiles Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * chore(backend): update ROCm to 6.4 and align Python hipblas requirements Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for CUDA on arm64 builds Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(docker): update base image and backend Dockerfiles for Ubuntu 24.04 compatibility on arm64 Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * build(backend): increase timeout for uv installs behind slow networks on backend/Dockerfile.python Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for vibevoice backend Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * ci(workflows): fix failing GitHub Actions runners Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> * fix: Allow FROM_SOURCE to be unset, use upstream Intel images etc. Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(build): rm all traces of CUDA 11 Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(build): Add Ubuntu codename as an argument Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com> Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>	2026-01-06 15:26:42 +01:00
Ettore Di Giacinto	8dfeea2f55	fix: use ubuntu 24.04 for cuda13 l4t images (#7418 ) * fix: use ubuntu 24.04 for cuda13 l4t images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop openblas from containers Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-03 09:47:03 +01:00
dependabot[bot]	91248da09e	chore(deps): bump actions/checkout from 5 to 6 (#7339 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v5...v6) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-11-24 21:18:15 +01:00
dependabot[bot]	0ca1765c17	chore(deps): bump actions/checkout from 4 to 5 (#6014 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v4...v5) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '5' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-08-12 18:54:39 +02:00
Ettore Di Giacinto	98e5291afc	feat: refactor build process, drop embedded backends (#5875 ) * feat: split remaining backends and drop embedded backends - Drop silero-vad, huggingface, and stores backend from embedded binaries - Refactor Makefile and Dockerfile to avoid building grpc backends - Drop golang code that was used to embed backends - Simplify building by using goreleaser Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(gallery): be specific with llama-cpp backend templates Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(docs): update Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ci): minor fixes Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: drop all ffmpeg references Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: run protogen-go Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Always enable p2p mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update gorelease file Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(stores): do not always load Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix linting issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Simplify Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Mac OS fixup Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-07-22 16:31:04 +02:00
Ettore Di Giacinto	7c4a2e9b85	chore(ci): ⚠️ fix latest tag by using docker meta action (#5722 ) chore(ci): fix latest tag by using docker meta action Also uniform tagging names Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-06-26 18:40:25 +02:00
Ettore Di Giacinto	be3ff482d0	chore(ci): try to optimize disk space when tagging latest (#5695 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-06-20 15:54:14 +02:00
Ettore Di Giacinto	89040ff6f7	fix: add python symlink, use absolute python env path when running backends (#5664 ) * fix: add python symlink, use absolute python env path when running backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ci): do not push images when building PRs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-06-16 23:00:53 +02:00
Ettore Di Giacinto	912c8eff04	chore(ci): use public runner for extra backends (#5657 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-06-16 08:21:18 +02:00
Ettore Di Giacinto	2d64269763	feat: Add backend gallery (#5607 ) * feat: Add backend gallery This PR adds support for managing backends in the same way as models. A backend gallery is now available and can be used to install and remove extra backends. The backend gallery can be configured similarly to a model gallery, and API calls allow installing and removing backends at runtime as well as during the startup phase of LocalAI. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add backends docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * wip: Backend Dockerfile for python backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: drop extras images, build python backends separately Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixup on all backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Tweaks Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop old backends leftovers Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixup CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Move dockerfile upper Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix proto Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Feature dropped for consistency - we prefer model galleries Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add missing packages in the build image Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * exllama is only available on cublas Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * pin torch on chatterbox Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups to index Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Debug CI * Install accelerator deps Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add target arch * Add cuda minor version Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use 
self-hosted runners Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: use quay for test images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups for vllm and chatterbox Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups on CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chatterbox is only available for nvidia Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Simplify CI builds Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Adapt test, use qwen3 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(model gallery): add jina-reranker-v1-tiny-en-gguf Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use reranker from llama.cpp in AIO images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Limit concurrent jobs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-06-15 14:56:52 +02:00
Ettore Di Giacinto	e84081769e	chore(ci): cleanup before pulling images again	2025-02-16 09:20:22 +01:00
Ettore Di Giacinto	0a748b009e	chore(ci): avoit cache hits until the ci gRPC job is fixed Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-02-12 09:11:40 +01:00
Ettore Di Giacinto	fe3ced2919	chore(ci): try again to bump parallelism in grpc jobs As we moved these out to self-hosted Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-02-11 09:31:00 +01:00
Ettore Di Giacinto	516cd660f1	chore(grpcio): reduce parallelism (#4799 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-02-10 18:56:13 +01:00
Ettore Di Giacinto	8fd3ace9a1	chore(grpcio): bump to 1.70 (#4798 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-02-10 18:38:53 +01:00
Ettore Di Giacinto	099469cb05	chore(tests): decrease parallelism for gRPC builds (#4797 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-02-10 12:59:59 +01:00
Ettore Di Giacinto	8864156300	chore(nvidia-l4t): add l4t arm64 images (#4449 ) chore(nvidia-l4t): add nvidia-l4t arm64 images Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-12-22 21:29:33 +01:00
dependabot[bot]	ce035416aa	build(deps): bump docker/build-push-action from 5 to 6 (#2592 ) Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/v5...v6) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-07-13 21:08:59 +00:00
Rene Leonhardt	fc87507012	chore(deps): Update Dependencies (#2538 ) * chore(deps): Update dependencies Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com> * chore(deps): Upgrade github.com/imdario/mergo to dario.cat/mergo Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com> * remove version identifiers for MeloTTS Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com> --------- Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com> Signed-off-by: Dave <dave@gray101.com> Co-authored-by: Dave <dave@gray101.com>	2024-07-12 19:54:08 +00:00
Ettore Di Giacinto	2845baecd5	fix(cuda): downgrade default version from 12.5 to 12.4 (#2707 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-09 23:13:29 +02:00
Rene Leonhardt	43f0688a95	feat: Upgrade to CUDA 12.5 (#2601 ) Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>	2024-06-19 17:50:49 +02:00
Ettore Di Giacinto	d075dc44dd	ci: push test images when building PRs (#2424 ) ci: try to push image Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-27 22:07:35 +02:00
Ettore Di Giacinto	e0187c2a1a	ci: do not tag latest on AIO automatically Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-24 09:41:13 +02:00
Ettore Di Giacinto	1a3dedece0	dependencies(grpcio): bump to fix CI issues (#2362 ) feat(grpcio): bump to fix CI issues Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-05-21 14:33:47 +02:00
cryptk	f7aabf1b50	fix: bring everything onto the same GRPC version to fix tests (#2199 ) fix: more places where we are installing grpc that need a version specified fix: attempt to fix metal tests fix: metal/brew is forcing an update, they don't have 1.58 available anymore Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-30 19:12:15 +00:00
cryptk	987b7ad42d	feat: only keep the build artifacts from the grpc build (#2172 ) * feat: only keep the build artifacts from the grpc build Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: remove separate Cache GRPC build step Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: remove docker inspect step, it is leftover from previous debugging Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-28 19:24:16 +00:00
cryptk	9fc0135991	feat: cleanup Dockerfile and make final image a little smaller (#2146 ) * feat: cleanup Dockerfile and make final image a little smaller Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: add build-essential to final stage Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: more GRPC cache misses Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: correct for another cause of GRPC cache misses Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * feat: generate new GRPC cache automatically if needed Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> * fix: use new GRPC_MAKEFLAGS build arg in GRPC cache generation Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com> --------- Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-27 19:48:20 +02:00
cryptk	13012cfa70	feat: better control of GRPC docker cache (#2070 ) Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>	2024-04-18 16:19:36 -04:00
Ettore Di Giacinto	d692b2c32a	ci: push latest images for dockerhub (#1984 ) Fixes: #1983 Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-10 10:31:59 +02:00
Ettore Di Giacinto	cc3d601836	ci: fixup latest image push Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-04-09 09:49:11 +02:00
Ettore Di Giacinto	93cfec3c32	ci: correctly tag latest and aio images	2024-04-03 11:30:23 +02:00
Ettore Di Giacinto	89560ef87f	fix(ci): manually tag latest images (#1948 ) fix(ci): manually tag images Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-04-02 19:25:46 +02:00
cryptk	93702e39d4	feat(build): adjust number of parallel make jobs (#1915 ) * feat(build): adjust number of parallel make jobs * fix: update make on MacOS from brew to support --output-sync argument * fix: cache grpc with version as part of key to improve validity of cache hits * fix: use gmake for tests-apple to use the updated GNU make version * fix: actually use the new make version for tests-apple * feat: parallelize tests-extra * feat: attempt to cache grpc build for docker images * fix: don't quote GRPC version * fix: don't cache go modules, we have limited cache space, better used elsewhere * fix: release with the same version of go that we test with * fix: don't fail on exporting cache layers * fix: remove deprecated BUILD_GRPC docker arg from Makefile	2024-03-29 22:32:40 +01:00
Ettore Di Giacinto	49cec7fd61	ci(aio): add latest tag images (#1884 ) Tangentially also fixes #1868	2024-03-23 16:08:32 +01:00
Ettore Di Giacinto	418ba02025	ci: fix typo Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-03-22 09:14:17 +01:00
Ettore Di Giacinto	abc9360dc6	feat(aio): entrypoint, update workflows (#1872 )	2024-03-21 22:09:04 +01:00
cryptk	020ce29cd8	fix(make): allow to parallelize jobs (#1845 ) * fix: clean up Makefile dependencies to allow for parallel builds * refactor: remove old unused backend from Makefile * fix: finish removing legacy backend, update piper * fix: I broke llama... I fixed llama * feat: give the tests and builds a few threads * fix: ensure libraries are replaced before build, add dropreplace target * Fix image build workflows	2024-03-17 15:39:20 +01:00
Ettore Di Giacinto	ddd21f1644	feat: Use ubuntu as base for container images, drop deprecated ggml-transformers backends (#1689 ) * cleanup backends * switch image to ubuntu 22.04 * adapt commands for ubuntu * transformers cleanup * no contrib on ubuntu * Change test model to gguf * ci: disable bark tests (too cpu-intensive) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * cleanup * refinements * use intel base image * Makefile: Add docker targets * Change test model --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-02-08 20:12:51 +01:00
Ettore Di Giacinto	d168c7c9dc	ci: cleanup worker before run (#1685 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-02-06 19:42:27 +01:00
Ettore Di Giacinto	bcf02449b3	ci(dockerhub): push images also to dockerhub (#1542 ) Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-01-04 08:32:29 +01:00
Ettore Di Giacinto	c3fb4b1d8e	ci: rename workflow Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2023-11-30 19:25:33 +01:00

51 Commits