LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-14 19:58:44 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	05e8e1e9f4	ci(images): publish chronologically-orderable master-<epoch>-<sha> tags The existing master push pipeline produces `master` (rolling) and `sha-<short>` tags. Neither is orderable by build time, so downstream GitOps that want to auto-bump to the newest master build (e.g. Flux ImagePolicy) can't pick the latest from the tag list — alphabetical sort over hex shas is effectively random, and the rolling `master` tag can't be referenced as an immutable bump target. Add a third tag of the form `master-<epoch>-<sha>` (Unix epoch in seconds + short sha), gated on default-branch pushes via metadata- action's `is_default_branch` predicate. The sha is retained for traceability; the epoch makes the tags numerically orderable, so a Flux ImagePolicy like filterTags: pattern: '^master-(?P<ts>[0-9]+)-[a-f0-9]+$' extract: '$ts' policy: numerical: order: asc will reliably bump to the newest master build. Applied to both image_build.yml (OCI labels stay consistent) and image_merge.yml (the actual tag publisher via buildx imagetools).	2026-05-21 17:18:30 +00:00
LocalAI [bot]	42a8db3573	ci(image): publish missing :latest-* and :v<X>-* singleton image tags (#9812 ) * ci(image): wire singleton merges + `--` artifact separator Closes the same singletons gap on the LocalAI server image workflow that PR #9781 closed for backends. The user observed it as missing :latest-gpu-nvidia-cuda-12 etc. on quay.io/go-skynet/local-ai — the build matrix has six single-arch entries with no corresponding merge step, so their per-arch digests push (push-by-digest=true) and never get tagged: - -gpu-hipblas (hipblas-jobs) - -gpu-nvidia-cuda-12 (core-image-build) - -gpu-nvidia-cuda-13 (core-image-build) - -gpu-intel (core-image-build) - -nvidia-l4t-arm64 (gh-runner) - -nvidia-l4t-arm64-cuda-13 (gh-runner) Only :latest, :v<X>, :latest-gpu-vulkan and :v<X>-gpu-vulkan were actually being published before this commit (the two multiarch suffixes that had merge jobs). Changes: 1. image.yml: add six new merge jobs, one per single-arch entry. Each `needs:` only its parent build job (matching the existing pattern for core-image-merge / gpu-vulkan-image-merge). 2. image_build.yml: switch artifact name to `digests-localai<suffix>--<platform-tag-or-"single">`. The `--` separator anchors the merge-side glob so a singleton tag-suffix doesn't over-match a longer suffix that shares its prefix (-nvidia-l4t-arm64 vs -nvidia-l4t-arm64-cuda-13). Same convention as backend_build.yml's fix. 3. image_merge.yml: update the download pattern to match. Next master push or tag release should produce :latest-gpu-hipblas, :latest-gpu-nvidia-cuda-12, :latest-gpu-nvidia-cuda-13, :latest-gpu-intel, :latest-nvidia-l4t-arm64, :latest-nvidia-l4t-arm64-cuda-13 (and their :v<X>-* equivalents) for the first time on the post-#9781 workflow. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(image): add !cancelled() guard to all 8 image merge jobs Parity pass with backend.yml's merge jobs (`8521af14`). Without !cancelled(), GHA's default `needs:` cascade skips the merge when ANY matrix cell of the parent build job fails or is cancelled — so a single flaky leg would suppress publication of every other tag-suffix's manifest list. Same fix the backend got after v4.2.1 showed 2 failed singlearch builds cascade-skip 199 singlearch merge entries. Applied to all 8 image merges: - core-image-merge - gpu-vulkan-image-merge - gpu-nvidia-cuda-12-image-merge (added in `e5300f1a`) - gpu-nvidia-cuda-13-image-merge (added in `e5300f1a`) - gpu-intel-image-merge (added in `e5300f1a`) - gpu-hipblas-image-merge (added in `e5300f1a`) - nvidia-l4t-arm64-image-merge (added in `e5300f1a`) - nvidia-l4t-arm64-cuda-13-image-merge (added in `e5300f1a`) Build jobs (hipblas-jobs, core-image-build, gh-runner) are intentionally NOT changed — they have no upstream `needs:` to cascade- skip from. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-14 00:28:48 +02:00
Ettore Di Giacinto	6bfe7f8c05	ci(image-merge): apply the keepalive+ci-cache source fix to image_merge Mirror of `8521af14` (which fixed backend_merge.yml) for image_merge.yml. Today's master-push run 25823024353 failed the gpu-vulkan-image-merge job with the exact same error pattern the backend merge had on v4.2.2: ERROR: quay.io/go-skynet/local-ai@sha256:68b22611...: not found Same root cause: image_build.yml pushes the per-arch manifest to quay.io/go-skynet/local-ai with push-by-digest=true (no tag), then the merge runs minutes-to-hours later, by which time quay's per-repo manifest GC has reaped the untagged digest from local-ai. The blob still lives in quay's storage but local-ai@<digest> no longer resolves. Three matching edits: 1. image_build.yml: anchor each per-arch digest into ci-cache immediately after the push, reusing .github/scripts/anchor-digest-in-cache.sh with SOURCE_IMAGE=quay.io/go-skynet/local-ai and TAG_SUFFIX defaulting to "-core" for the core image (matches the artifact-name convention). 2. image_merge.yml: change the quay merge source from local-ai@<digest> to ci-cache@<digest>. Same correctness argument as backend_merge.yml — the manifest content is alive in ci-cache; buildx imagetools create republishes it into local-ai and writes the user-facing manifest list pointing at it. End state in local-ai is self-contained. 3. image_merge.yml: add a sparse `actions/checkout@v6` (only .github/scripts) so cleanup-keepalive-tags.sh is available, plus the cleanup step itself with TAG_SUFFIX matching the anchor's "-core" placeholder. v4.2.3's image.yml run completed successfully (~50 min between push and merge — beat quay's GC). This commit closes the race for future releases and master pushes regardless of run length. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-13 21:37:03 +00:00
dependabot[bot]	d75173dd2a	chore(deps): bump actions/download-artifact from 4 to 8 (#9771 ) Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 8. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/v4...v8) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-version: '8' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-12 09:20:14 +02:00
LocalAI [bot]	f0374aa0e8	ci: finish GHA free-tier migration (per-arch fan-out, image splits, retire self-hosted, fix provenance) (#9730 ) * ci: add per-arch + manifest-merge support for LocalAI server image Mirror the backend_build.yml + backend_merge.yml pattern shipped in PR #9726 for the LocalAI server image: - image_build.yml accepts optional platform-tag (default ''), scopes registry cache to cache-localai<suffix>-<platform-tag>, and pushes by canonical digest only on push events. Digests upload as artifacts named digests-localai<suffix>-<platform-tag>, with a "-core" placeholder when tag-suffix is empty so the merge job's download pattern doesn't over-match across multiple suffixes. - image_merge.yml is a new reusable workflow that downloads matching digest artifacts and assembles the final tagged manifest list via docker buildx imagetools create. Image names differ from backend_.yml: the LocalAI server is published under quay.io/go-skynet/local-ai and localai/localai (not -backends). Not yet wired into image.yml / image-pr.yml — Commit C does that. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> ci: fan out per-arch split to remaining 34 backends Convert all remaining linux/amd64,linux/arm64 entries in backend-matrix.yml to per-arch + manifest-merge form. Each was a single matrix entry running both arches on x86 under QEMU emulation; each becomes two entries — amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native). Four backends that were on bigger-runner (-cpu-llama-cpp, -cpu-turboquant, -gpu-vulkan-llama-cpp, -gpu-vulkan-turboquant) have both legs moved to free tier as part of the same change. They are compile-only (no torch/CUDA install) and fit comfortably with the setup-build-disk /mnt relocation. Phase 4 (next commit) retires the remaining 5 single-arch bigger-runner entries. After this commit: - 271 total matrix entries (was 237) - 0 multi-arch entries left - 36 per-arch pairs (34 new + 2 pilots from PR #9727) - 5 bigger-runner entries remaining (single-arch, Phase 4 target) Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: split LocalAI image multi-arch entries per arch + merge Mirror the backend per-arch split for the main LocalAI image: - image.yml's core-image-build matrix: split the core ('') and -gpu-vulkan entries into amd64 + arm64 legs each. amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native). - New top-level core-image-merge and gpu-vulkan-image-merge jobs call image_merge.yml after core-image-build completes. - image-pr.yml's image-build matrix: split the -vulkan-core entry. No merge job added on the PR side — image_build.yml's digest-push is push-only-event-gated, so a PR-side merge would have nothing to download. After this commit, no workflow file references linux/amd64,linux/arm64 in a single matrix slot. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: retire bigger-runner from backend matrix (Phase 4) Migrate the remaining 5 single-arch bigger-runner entries to ubuntu-latest. Combined with the Phase 3 setup-build-disk /mnt relocation (PR #9726), free-tier ubuntu-latest now has ~100 GB of working space — enough for ROCm dev image (~16 GB), CUDA toolkit (~5 GB), and the per-backend compile/install steps these entries do. Backends migrated: - -gpu-nvidia-cuda-12-llama-cpp - -gpu-nvidia-cuda-12-turboquant - -gpu-rocm-hipblas-faster-whisper - -gpu-rocm-hipblas-coqui - -cpu-ik-llama-cpp After this commit, .github/backend-matrix.yml has zero bigger-runner references. The bigger-runner used in tests-vibevoice-cpp-grpc- transcription (test-extra.yml) is a separate concern handled in a follow-up. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 9 Intel oneAPI backends to free tier (Phase 5.1) Intel oneAPI base image is ~6 GB; each backend's wheel install stays well within the ~100 GB working space provided by Phase 3's setup-build-disk /mnt relocation. Lowest-risk batch of the arc-runner-set retirement. Backends migrated: vllm, sglang, vibevoice, qwen-asr, nemo, qwen-tts, fish-speech, voxcpm, pocket-tts (all -gpu-intel-* variants). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 15 ROCm Python backends to free tier (Phase 5.2) ROCm dev image (~16 GB) plus per-backend torch/wheels install fits on ubuntu-latest with the /mnt-relocated Docker root. These entries include the heavier vLLM/sglang/transformers/diffusers stack on ROCm; if any specific backend OOMs or runs out of disk, individual flips back to arc-runner-set are revertable per-entry. Backends migrated: all 15 -gpu-rocm-hipblas-* entries previously on arc-runner-set (vllm/vllm-omni/sglang/transformers/diffusers/ ace-step/kokoro/vibevoice/qwen-asr/nemo/qwen-tts/fish-speech/ voxcpm/pocket-tts/neutts). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: migrate 6 CUDA Python backends to free tier (Phase 5.3) vLLM/sglang stacks on CUDA 12 and CUDA 13 are the heaviest backends in the matrix — flash-attn intermediate layers can spike disk usage during build. setup-build-disk's /mnt relocation gives ~100 GB working space which fits the documented peak. Highest-risk batch of the arc-runner-set retirement; if any backend fails to build on free tier, the per-entry runs-on flip is the unit of revert. Backends migrated: -gpu-nvidia-cuda-{12,13}-{vllm,vllm-omni,sglang}. After this commit, .github/backend-matrix.yml has zero references to arc-runner-set or bigger-runner. The migration is complete. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: disable provenance on multi-registry digest pushes Root-caused on master via PR #9727's pilot: when docker/build-push-action@v7 pushes a single build to TWO registries simultaneously with push-by-digest=true, buildx generates a per-registry provenance attestation manifest (because mode=max — the default for push:true — includes the runner ID). That makes the resulting manifest-list digest diverge across registries: arm64 -cpu-faster-whisper build: image manifest: sha256:d3bdd34b... (identical, content-only) quay manifest list: sha256:66b4cfc8... (with quay attestation) dockerhub manifest list: sha256:e0733c3b... (with dockerhub attestation) steps.build.outputs.digest returns only one of the list digests (empirically the dockerhub one). The merge job then asks "quay.io/...@sha256:e0733c3b..." which doesn't exist on quay — that list has digest 66b4cfc8 there. Result: imagetools create fails with "not found" and the merge job fails (run 25581983094, job 75110021491). Setting provenance: false drops the per-registry attestation; the manifest-list digest becomes pure content, identical across both registries, and steps.build.outputs.digest works on either lookup. Applied to backend_build.yml and image_build.yml — both refactored to use the same multi-registry digest-push pattern in the prior PRs. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-09 09:37:00 +02:00

5 Commits