Commit Graph

54 Commits

Author SHA1 Message Date
LocalAI [bot]
42a8db3573 ci(image): publish missing :latest-* and :v<X>-* singleton image tags (#9812)
* ci(image): wire singleton merges + `--` artifact separator

Closes the same singletons gap on the LocalAI server image workflow that
PR #9781 closed for backends. The user observed it as missing
:latest-gpu-nvidia-cuda-12 etc. on quay.io/go-skynet/local-ai — the
build matrix has six single-arch entries with no corresponding merge
step, so their per-arch digests push (push-by-digest=true) and never
get tagged:

  - -gpu-hipblas              (hipblas-jobs)
  - -gpu-nvidia-cuda-12       (core-image-build)
  - -gpu-nvidia-cuda-13       (core-image-build)
  - -gpu-intel                (core-image-build)
  - -nvidia-l4t-arm64         (gh-runner)
  - -nvidia-l4t-arm64-cuda-13 (gh-runner)

Only :latest, :v<X>, :latest-gpu-vulkan and :v<X>-gpu-vulkan were
actually being published before this commit (the two multiarch suffixes
that had merge jobs).

Changes:

1. image.yml: add six new merge jobs, one per single-arch entry. Each
   `needs:` only its parent build job (matching the existing pattern
   for core-image-merge / gpu-vulkan-image-merge).
2. image_build.yml: switch artifact name to
   `digests-localai<suffix>--<platform-tag-or-"single">`. The `--`
   separator anchors the merge-side glob so a singleton tag-suffix
   doesn't over-match a longer suffix that shares its prefix
   (-nvidia-l4t-arm64 vs -nvidia-l4t-arm64-cuda-13). Same convention
   as backend_build.yml's fix.
3. image_merge.yml: update the download pattern to match.

Next master push or tag release should produce :latest-gpu-hipblas,
:latest-gpu-nvidia-cuda-12, :latest-gpu-nvidia-cuda-13, :latest-gpu-intel,
:latest-nvidia-l4t-arm64, :latest-nvidia-l4t-arm64-cuda-13 (and their
:v<X>-* equivalents) for the first time on the post-#9781 workflow.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci(image): add !cancelled() guard to all 8 image merge jobs

Parity pass with backend.yml's merge jobs (8521af14). Without
!cancelled(), GHA's default `needs:` cascade skips the merge when ANY
matrix cell of the parent build job fails or is cancelled — so a single
flaky leg would suppress publication of every other tag-suffix's
manifest list. Same fix the backend got after v4.2.1 showed 2 failed
singlearch builds cascade-skip 199 singlearch merge entries.

Applied to all 8 image merges:

  - core-image-merge
  - gpu-vulkan-image-merge
  - gpu-nvidia-cuda-12-image-merge       (added in e5300f1a)
  - gpu-nvidia-cuda-13-image-merge       (added in e5300f1a)
  - gpu-intel-image-merge                (added in e5300f1a)
  - gpu-hipblas-image-merge              (added in e5300f1a)
  - nvidia-l4t-arm64-image-merge         (added in e5300f1a)
  - nvidia-l4t-arm64-cuda-13-image-merge (added in e5300f1a)

Build jobs (hipblas-jobs, core-image-build, gh-runner) are
intentionally NOT changed — they have no upstream `needs:` to cascade-
skip from.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-14 00:28:48 +02:00
Ettore Di Giacinto
6bfe7f8c05 ci(image-merge): apply the keepalive+ci-cache source fix to image_merge
Mirror of 8521af14 (which fixed backend_merge.yml) for image_merge.yml.
Today's master-push run 25823024353 failed the gpu-vulkan-image-merge job
with the exact same error pattern the backend merge had on v4.2.2:

  ERROR: quay.io/go-skynet/local-ai@sha256:68b22611...: not found

Same root cause: image_build.yml pushes the per-arch manifest to
quay.io/go-skynet/local-ai with push-by-digest=true (no tag), then the
merge runs minutes-to-hours later, by which time quay's per-repo manifest
GC has reaped the untagged digest from local-ai. The blob still lives in
quay's storage but local-ai@<digest> no longer resolves.

Three matching edits:

1. image_build.yml: anchor each per-arch digest into ci-cache immediately
   after the push, reusing .github/scripts/anchor-digest-in-cache.sh with
   SOURCE_IMAGE=quay.io/go-skynet/local-ai and TAG_SUFFIX defaulting to
   "-core" for the core image (matches the artifact-name convention).
2. image_merge.yml: change the quay merge source from local-ai@<digest>
   to ci-cache@<digest>. Same correctness argument as backend_merge.yml —
   the manifest content is alive in ci-cache; buildx imagetools create
   republishes it into local-ai and writes the user-facing manifest list
   pointing at it. End state in local-ai is self-contained.
3. image_merge.yml: add a sparse `actions/checkout@v6` (only
   .github/scripts) so cleanup-keepalive-tags.sh is available, plus the
   cleanup step itself with TAG_SUFFIX matching the anchor's "-core"
   placeholder.

v4.2.3's image.yml run completed successfully (~50 min between push and
merge — beat quay's GC). This commit closes the race for future releases
and master pushes regardless of run length.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-13 21:37:03 +00:00
dependabot[bot]
9be5310394 chore(deps): bump actions/upload-artifact from 4 to 7 (#9770)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 7.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v7)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-12 09:20:03 +02:00
LocalAI [bot]
f0374aa0e8 ci: finish GHA free-tier migration (per-arch fan-out, image splits, retire self-hosted, fix provenance) (#9730)
* ci: add per-arch + manifest-merge support for LocalAI server image

Mirror the backend_build.yml + backend_merge.yml pattern shipped in
PR #9726 for the LocalAI server image:

- image_build.yml accepts optional platform-tag (default ''), scopes
  registry cache to cache-localai<suffix>-<platform-tag>, and pushes
  by canonical digest only on push events. Digests upload as artifacts
  named digests-localai<suffix>-<platform-tag>, with a "-core"
  placeholder when tag-suffix is empty so the merge job's download
  pattern doesn't over-match across multiple suffixes.
- image_merge.yml is a new reusable workflow that downloads matching
  digest artifacts and assembles the final tagged manifest list via
  docker buildx imagetools create.

Image names differ from backend_*.yml: the LocalAI server is published
under quay.io/go-skynet/local-ai and localai/localai (not -backends).

Not yet wired into image.yml / image-pr.yml — Commit C does that.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: fan out per-arch split to remaining 34 backends

Convert all remaining linux/amd64,linux/arm64 entries in
backend-matrix.yml to per-arch + manifest-merge form. Each was a
single matrix entry running both arches on x86 under QEMU emulation;
each becomes two entries — amd64 on ubuntu-latest, arm64 on
ubuntu-24.04-arm (native).

Four backends that were on bigger-runner (-cpu-llama-cpp,
-cpu-turboquant, -gpu-vulkan-llama-cpp, -gpu-vulkan-turboquant) have
both legs moved to free tier as part of the same change. They are
compile-only (no torch/CUDA install) and fit comfortably with the
setup-build-disk /mnt relocation. Phase 4 (next commit) retires the
remaining 5 single-arch bigger-runner entries.

After this commit:
- 271 total matrix entries (was 237)
- 0 multi-arch entries left
- 36 per-arch pairs (34 new + 2 pilots from PR #9727)
- 5 bigger-runner entries remaining (single-arch, Phase 4 target)

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: split LocalAI image multi-arch entries per arch + merge

Mirror the backend per-arch split for the main LocalAI image:

- image.yml's core-image-build matrix: split the core ('') and
  -gpu-vulkan entries into amd64 + arm64 legs each. amd64 on
  ubuntu-latest, arm64 on ubuntu-24.04-arm (native).
- New top-level core-image-merge and gpu-vulkan-image-merge jobs
  call image_merge.yml after core-image-build completes.
- image-pr.yml's image-build matrix: split the -vulkan-core entry.
  No merge job added on the PR side — image_build.yml's digest-push
  is push-only-event-gated, so a PR-side merge would have nothing
  to download.

After this commit, no workflow file references
linux/amd64,linux/arm64 in a single matrix slot.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: retire bigger-runner from backend matrix (Phase 4)

Migrate the remaining 5 single-arch bigger-runner entries to
ubuntu-latest. Combined with the Phase 3 setup-build-disk /mnt
relocation (PR #9726), free-tier ubuntu-latest now has ~100 GB of
working space — enough for ROCm dev image (~16 GB), CUDA toolkit
(~5 GB), and the per-backend compile/install steps these entries do.

Backends migrated:
- -gpu-nvidia-cuda-12-llama-cpp
- -gpu-nvidia-cuda-12-turboquant
- -gpu-rocm-hipblas-faster-whisper
- -gpu-rocm-hipblas-coqui
- -cpu-ik-llama-cpp

After this commit, .github/backend-matrix.yml has zero bigger-runner
references. The bigger-runner used in tests-vibevoice-cpp-grpc-
transcription (test-extra.yml) is a separate concern handled in a
follow-up.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 9 Intel oneAPI backends to free tier (Phase 5.1)

Intel oneAPI base image is ~6 GB; each backend's wheel install
stays well within the ~100 GB working space provided by Phase 3's
setup-build-disk /mnt relocation. Lowest-risk batch of the
arc-runner-set retirement.

Backends migrated:
  vllm, sglang, vibevoice, qwen-asr, nemo, qwen-tts,
  fish-speech, voxcpm, pocket-tts (all -gpu-intel-* variants).

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 15 ROCm Python backends to free tier (Phase 5.2)

ROCm dev image (~16 GB) plus per-backend torch/wheels install fits
on ubuntu-latest with the /mnt-relocated Docker root. These entries
include the heavier vLLM/sglang/transformers/diffusers stack on
ROCm; if any specific backend OOMs or runs out of disk, individual
flips back to arc-runner-set are revertable per-entry.

Backends migrated: all 15 -gpu-rocm-hipblas-* entries previously on
arc-runner-set (vllm/vllm-omni/sglang/transformers/diffusers/
ace-step/kokoro/vibevoice/qwen-asr/nemo/qwen-tts/fish-speech/
voxcpm/pocket-tts/neutts).

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 6 CUDA Python backends to free tier (Phase 5.3)

vLLM/sglang stacks on CUDA 12 and CUDA 13 are the heaviest
backends in the matrix — flash-attn intermediate layers can spike
disk usage during build. setup-build-disk's /mnt relocation gives
~100 GB working space which fits the documented peak.

Highest-risk batch of the arc-runner-set retirement; if any
backend fails to build on free tier, the per-entry runs-on flip
is the unit of revert.

Backends migrated: -gpu-nvidia-cuda-{12,13}-{vllm,vllm-omni,sglang}.

After this commit, .github/backend-matrix.yml has zero references
to arc-runner-set or bigger-runner. The migration is complete.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: disable provenance on multi-registry digest pushes

Root-caused on master via PR #9727's pilot: when docker/build-push-action@v7
pushes a single build to TWO registries simultaneously with
push-by-digest=true, buildx generates a per-registry provenance
attestation manifest (because mode=max — the default for push:true —
includes the runner ID). That makes the resulting manifest-list digest
diverge across registries:

  arm64 -cpu-faster-whisper build:
    image manifest:        sha256:d3bdd34b... (identical, content-only)
    quay manifest list:    sha256:66b4cfc8... (with quay attestation)
    dockerhub manifest list: sha256:e0733c3b... (with dockerhub attestation)

steps.build.outputs.digest returns only one of the list digests
(empirically the dockerhub one). The merge job then asks
"quay.io/...@sha256:e0733c3b..." which doesn't exist on quay — that
list has digest 66b4cfc8 there. Result: imagetools create fails with
"not found" and the merge job fails (run 25581983094, job 75110021491).

Setting provenance: false drops the per-registry attestation; the
manifest-list digest becomes pure content, identical across both
registries, and steps.build.outputs.digest works on either lookup.

Applied to backend_build.yml and image_build.yml — both refactored
to use the same multi-registry digest-push pattern in the prior PRs.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-09 09:37:00 +02:00
LocalAI [bot]
1f313cfdb0 ci: phase 1-3 of GHA free tier migration (path filter, multi-arch split prep, /mnt disk relief) (#9726)
* ci: extract free-disk-space composite action

Consolidate the apt-clean + dotnet/android/ghc/boost removal blocks from
backend_build.yml, image_build.yml, and test.yml into a single composite
action. The three callers had slightly different inline blocks; the
composite uses the more aggressive backend_build/image_build variant for
all three callers — test.yml jobs now also purge snapd, edge/firefox/
powershell/r-base-core, and sweep /opt/ghc + /usr/local/share/boost +
$AGENT_TOOLSDIRECTORY. Idempotent and skipped on self-hosted runners.

In test.yml, actions/checkout now runs before the composite action call
because the composite lives at ./.github/actions/free-disk-space and
requires a checked-out repo. The original ordering relied on
jlumbroso/free-disk-space@main being a remote action; this is the
minimum-invasive change to support a local composite.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: path-filter backend.yml master push

Run scripts/changed-backends.js on master pushes too (not just PRs) so
unrelated commits don't rebuild all ~210 backend container images. Tag
pushes still build the full matrix via FORCE_ALL.

Push events use the GitHub Compare API to diff event.before..event.after.
Edge cases (first push with zero base, API truncation beyond 300 files,
missing fields, network failure) fall back to "run everything" — better
safe than silently miss a backend.

The matrix literal moves from .github/workflows/backend.yml into a new
data-only file at .github/backend-matrix.yml (outside workflows/ so
actionlint doesn't try to parse it as a workflow). Both backend.yml and
backend_pr.yml now consume the dynamic matrix output uniformly via
fromJson(needs.generate-matrix.outputs.matrix); the script reads the
matrix from the new location.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: bound max-parallel on backend-jobs matrices

Cap to 8 concurrent jobs to avoid queue starvation on the shared GHA free
pool while migration is in flight. Lift after Phases 4-5 retire the
self-hosted runners. Also drops a leftover commented-out max-parallel
line that lived in backend.yml since the previous matrix shape.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: scope backend cache per arch, push by digest

Prepare backend_build.yml for the multi-arch split. The reusable
workflow now accepts a `platform-tag` input ("amd64" / "arm64") that
scopes the registry cache to cache<suffix>-<platform-tag> and (on push
events) pushes the resulting image by canonical digest only. Digests
are uploaded as artifacts named digests<suffix>-<platform-tag> for the
merge job (Task 2.2) to consume.

`platform-tag` is optional with empty default during the migration —
existing callers continue to work unchanged (their cache key just
becomes `cache<suffix>-`, an orphaned but valid key). Tasks 2.3+ will
update callers to pass an explicit "amd64" / "arm64" value. Phase 6
flips the input to required: true once every caller is wired.

PR builds keep their existing tag-based push to ci-tests but pick up
the per-arch cache key. Multi-arch PR builds remain emulated in this
commit; they migrate when the matrix entries split (Tasks 2.3+).

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: add backend_merge.yml reusable workflow

Joins per-arch digest artifacts (uploaded by backend_build.yml when
called with platform-tag) into a single tagged multi-arch manifest list
via `docker buildx imagetools create`. Called once per backend by
backend.yml after both per-arch build jobs succeed.

The workflow generates final tags identically to the previous monolithic
build job (same docker/metadata-action invocation), so consumers of
quay.io/go-skynet/local-ai-backends and localai/localai-backends see no
tag-shape change. Two imagetools calls (one per registry) reference the
same per-arch digests under different image names.

Not yet wired into backend.yml — Tasks 2.3+ rewrite individual matrix
entries to expand into per-arch + merge jobs that call this workflow.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: relocate Docker data-root to /mnt on hosted runners

GHA hosted ubuntu-latest runners ship a ~75 GB /mnt drive that's unused
by default. Stopping Docker, rsync'ing /var/lib/docker to /mnt, and
restarting with data-root pointing there yields ~100 GB of working
space (combined with the apt-clean from Task 1.1) — enough for ROCm
dev image + vLLM torch install + flash-attn intermediate layers.

This is the structural change that lets Phases 4 and 5 of the migration
plan move the bigger-runner and arc-runner-set jobs onto ubuntu-latest.

The composite action is no-op on self-hosted runners (where /mnt isn't
expected) and on non-X64 runners (Task 3.2 verifies the arm64 hosted
pool's /mnt shape separately before enabling). Wired into both
backend_build.yml and image_build.yml between free-disk-space and the
first Docker operation.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci(setup-build-disk): chmod 1777 /mnt/docker-tmp

buildx CLI runs as the unprivileged 'runner' user and creates config
dirs under TMPDIR before binding them into the buildkit container.
/mnt is root-owned by default, so the original mkdir produced a
permission-denied when buildx tried to write there:

  ERROR: mkdir /mnt/docker-tmp/buildkitd-config2740457204: permission denied

Mirror /tmp's permission mode (1777 — world-writable with sticky bit)
on /mnt/docker-tmp so non-root processes can stage their config.

Caught by the first PR run (image-build hipblas job) on PR #9726.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: weekly full-matrix rebuild via cron

Path-filtering backend.yml master push (the previous commit's main
optimization) skips backends whose source didn't change. That broke
the DEPS_REFRESH cache-buster's coverage: the build-arg keyed on
%Y-W%V busts the install layer's cache on a new ISO week, but only
when the build actually runs. Untouched Python backends (torch,
transformers, vllm with no version pin) would otherwise ship stale
wheels indefinitely.

Add a Sunday 06:00 UTC cron that fires the full matrix. Schedule
events have no event.ref / event.before, so the script's changedFiles
== null fallback (scripts/changed-backends.js) emits the full matrix
automatically — no script change needed.

C++/Go backends with pinned deps cache-hit and complete fast, so the
weekly cost is dominated by Python re-resolves which is exactly what
we want.

workflow_dispatch added so a maintainer can trigger an ad-hoc
full-matrix rebuild without faking a tag push.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-08 23:43:41 +02:00
Ettore Di Giacinto
50580a84ae fix(ci): switch apt mirror per runner — azure on github-hosted, kernel.org on self-hosted
Self-hosted runners (arc-runner-set, bigger-runner) cannot reach
azure.archive.ubuntu.com — they live in different networks (e.g. our
arc-runner-set Kubernetes cluster) where Azure's mirror IP is not
routable. Symptom: "Connection failed [IP: 51.11.236.225 80]" with each
Ign:/Err: cycle taking 60s, hanging the build for ~16 minutes before
exit 100.

Pick the mirror based on `runner.environment`:

  * github-hosted (ubuntu-latest, ubuntu-24.04-arm) → Azure
    (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com)
    — same VPC as the runner.
  * self-hosted (arc-runner-set, bigger-runner)    → kernel.org
    (https://mirrors.edge.kernel.org for both archive and ports)
    — publicly reachable from any network.

The choice now lives in one place: the .github/actions/configure-apt-mirror
composite action exposes `effective-mirror` / `effective-ports-mirror`
outputs so the reusable workflows can forward the same value as Docker
build-args without duplicating the per-runner-environment branch.

The now-redundant `apt-mirror` / `apt-ports-mirror` workflow inputs on
image_build.yml and backend_build.yml are dropped — defaults live in the
composite action and are visible there.

Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-03 22:59:26 +00:00
Ettore Di Giacinto
8edac61e57 feat(ci): allow routing apt traffic through an alternate Ubuntu mirror (#9650)
* feat(ci): allow routing apt traffic through an alternate Ubuntu mirror

Adds opt-in APT_MIRROR / APT_PORTS_MIRROR knobs to all Dockerfiles, the
Makefile, and CI workflows so we can fail over to a non-canonical Ubuntu
mirror when archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com
are degraded (recently observed: multi-day DDoS against the default pool).

Defaults are empty everywhere — behavior is unchanged unless a mirror is
configured. To enable in CI, set the repo-level GitHub Actions variables
APT_MIRROR (and APT_PORTS_MIRROR for arm64 builds). Locally:
    make docker APT_MIRROR=http://azure.archive.ubuntu.com

A small POSIX-sh helper in .docker/apt-mirror.sh rewrites both DEB822
(/etc/apt/sources.list.d/ubuntu.sources, Ubuntu 24.04+) and the legacy
/etc/apt/sources.list before the first apt-get update. Dockerfile stages
load it via RUN --mount=type=bind, so there is no extra layer and no
cache invalidation when the script is unchanged. Reusable workflows also
rewrite the runner's own /etc/apt sources before any sudo apt-get call.

Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci(apt-mirror): default to the Azure mirror, visible in the workflow source

Bakes Azure (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com)
in as the default for both Docker builds and runner-side apt — rather than
hiding the URL behind a GitHub Actions repo variable that's not visible
from the source tree.

A new composite action at .github/actions/configure-apt-mirror is the
single source of truth for runner-side rewrites. Five standalone
workflows (build-test, release, tests-e2e, tests-ui-e2e, update_swagger)
just `uses: ./.github/actions/configure-apt-mirror`.

Three workflows (image_build, backend_build, checksum_checker) keep an
inline bash rewrite, because they install/upgrade git via apt *before*
the checkout step (so the local composite action isn't loadable yet).
The Azure URL is visible in those files too.

The `apt-mirror` / `apt-ports-mirror` inputs of the reusable workflows
keep their now-Azure defaults — they still feed the Docker build-args
block in addition to the inline runner-side rewrite. Callers (image.yml,
image-pr.yml, backend.yml, backend_pr.yml) drop the previous
`vars.APT_MIRROR` plumbing and rely on those defaults.

Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci(apt-mirror): drop Force Install GIT, consolidate on the composite action

The PPA git upgrade ran add-apt-repository ppa:git-core/ppa, which talks
to api.launchpad.net — also part of Canonical's infrastructure and
currently returning HTTP 504. The Azure mirror only covers
archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com, not PPAs.

The system git that ubuntu-latest already ships is sufficient for
actions/checkout and the build pipeline, so just drop the upgrade. With
that gone, the apt-before-checkout constraint disappears too — all three
holdouts (image_build, backend_build, checksum_checker) can now switch
to ./.github/actions/configure-apt-mirror like the other five.

Net: 0 inline apt-mirror blocks, all 8 workflows route through the
composite action.

Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-03 23:50:13 +02:00
Ettore Di Giacinto
bdfa5e934a ci: switch image/backend build cache to a dedicated registry image
- Switch cache-from/cache-to in backend_build.yml and image_build.yml
  from the unused gha cache to type=registry pointing at
  quay.io/go-skynet/ci-cache:cache<tag-suffix>, mode=max with
  ignore-error=true. Master/tag builds populate their own
  per-matrix-entry cache; PR builds read-only.
- Drop the broken generate_grpc_cache.yaml cron. It targeted a `grpc`
  Dockerfile stage that was removed by b1fc5acd in July 2025, has been
  failing every night since, and never populated the gha cache. The new
  registry-cache scheme is self-warming, so no separate populator is
  needed.
- Remove the dead GRPC_VERSION / GRPC_BASE_IMAGE / GRPC_MAKEFLAGS
  build-args from image_build.yml and the orphan ARG GRPC_BASE_IMAGE in
  the root Dockerfile (the root Dockerfile no longer compiles gRPC; the
  source build now lives in backend/Dockerfile.{llama-cpp,
  ik-llama-cpp, turboquant} only and uses its own ARG defaults).
- Drop the unused grpc-base-image input from image_build.yml plus the
  matrix passthroughs in image.yml / image-pr.yml.
- Drop the unused GRPC_VERSION env in test.yml.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude-code:claude-opus-4-7-1m
2026-04-27 13:13:04 +00:00
Ettore Di Giacinto
5affb747a9 chore: drop AIO images (#9004)
AIO images are behind, and takes effort to maintain these. Wizard and
installation of models have been semplified massively, so AIO images
lost their purpose.

This allows us to be more laser focused on main images and reliefes
stress from CI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-14 17:49:36 +01:00
dependabot[bot]
01bd3d8212 chore(deps): bump docker/login-action from 3 to 4 (#8918)
Bumps [docker/login-action](https://github.com/docker/login-action) from 3 to 4.
- [Release notes](https://github.com/docker/login-action/releases)
- [Commits](https://github.com/docker/login-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/login-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-09 22:30:11 +01:00
dependabot[bot]
7f11f66b44 chore(deps): bump docker/build-push-action from 6 to 7 (#8919)
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v6...v7)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-09 22:29:51 +01:00
dependabot[bot]
2a351e1f0c chore(deps): bump docker/metadata-action from 5 to 6 (#8917)
Bumps [docker/metadata-action](https://github.com/docker/metadata-action) from 5 to 6.
- [Release notes](https://github.com/docker/metadata-action/releases)
- [Commits](https://github.com/docker/metadata-action/compare/v5...v6)

---
updated-dependencies:
- dependency-name: docker/metadata-action
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-09 22:27:02 +01:00
Richard Palethorpe
e6ba26c3e7 chore: Update to Ubuntu24.04 (cont #7423) (#7769)
* ci(workflows): bump GitHub Actions images to Ubuntu 24.04

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): remove CUDA 11.x support from GitHub Actions (incompatible with ubuntu:24.04)

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): bump GitHub Actions CUDA support to 12.9

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): bump base image to ubuntu:24.04 and adjust Vulkan SDK/packages

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix(backend): correct context paths for Python backends in workflows, Makefile and Dockerfile

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): disable parallel backend builds to avoid race conditions

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): export CUDA_MAJOR_VERSION and CUDA_MINOR_VERSION for override

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(backend): update backend Dockerfiles to Ubuntu 24.04

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(backend): add ROCm env vars and default AMDGPU_TARGETS for hipBLAS builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(chatterbox): bump ROCm PyTorch to 2.9.1+rocm6.4 and update index URL; align hipblas requirements

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore: add local-ai-launcher to .gitignore

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): fix backends GitHub Actions workflows after rebase

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): use build-time UBUNTU_VERSION variable

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(docker): remove libquadmath0 from requirements-stage base image

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): add backends/vllm to .NOTPARALLEL to prevent parallel builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix(docker): correct CUDA installation steps in backend Dockerfiles

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(backend): update ROCm to 6.4 and align Python hipblas requirements

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for CUDA on arm64 builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): update base image and backend Dockerfiles for Ubuntu 24.04 compatibility on arm64

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(backend): increase timeout for uv installs behind slow networks on backend/Dockerfile.python

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for vibevoice backend

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): fix failing GitHub Actions runners

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix: Allow FROM_SOURCE to be unset, use upstream Intel images etc.

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): rm all traces of CUDA 11

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): Add Ubuntu codename as an argument

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>
2026-01-06 15:26:42 +01:00
Ettore Di Giacinto
8dfeea2f55 fix: use ubuntu 24.04 for cuda13 l4t images (#7418)
* fix: use ubuntu 24.04 for cuda13 l4t images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop openblas from containers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-03 09:47:03 +01:00
dependabot[bot]
91248da09e chore(deps): bump actions/checkout from 5 to 6 (#7339)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 21:18:15 +01:00
dependabot[bot]
0ca1765c17 chore(deps): bump actions/checkout from 4 to 5 (#6014)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-12 18:54:39 +02:00
Ettore Di Giacinto
98e5291afc feat: refactor build process, drop embedded backends (#5875)
* feat: split remaining backends and drop embedded backends

- Drop silero-vad, huggingface, and stores backend from embedded
  binaries
- Refactor Makefile and Dockerfile to avoid building grpc backends
- Drop golang code that was used to embed backends
- Simplify building by using goreleaser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): be specific with llama-cpp backend templates

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(docs): update

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): minor fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: drop all ffmpeg references

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: run protogen-go

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Always enable p2p mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update gorelease file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(stores): do not always load

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix linting issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Mac OS fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-22 16:31:04 +02:00
Ettore Di Giacinto
7c4a2e9b85 chore(ci): ⚠️ fix latest tag by using docker meta action (#5722)
chore(ci): fix latest tag by using docker meta action

Also uniform tagging names

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-26 18:40:25 +02:00
Ettore Di Giacinto
be3ff482d0 chore(ci): try to optimize disk space when tagging latest (#5695)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-20 15:54:14 +02:00
Ettore Di Giacinto
89040ff6f7 fix: add python symlink, use absolute python env path when running backends (#5664)
* fix: add python symlink, use absolute python env path when running backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ci): do not push images when building PRs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-16 23:00:53 +02:00
Ettore Di Giacinto
912c8eff04 chore(ci): use public runner for extra backends (#5657)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-16 08:21:18 +02:00
Ettore Di Giacinto
2d64269763 feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is ponly available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accellerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00
Ettore Di Giacinto
e84081769e chore(ci): cleanup before pulling images again 2025-02-16 09:20:22 +01:00
Ettore Di Giacinto
0a748b009e chore(ci): avoit cache hits until the ci gRPC job is fixed
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-12 09:11:40 +01:00
Ettore Di Giacinto
fe3ced2919 chore(ci): try again to bump parallelism in grpc jobs
As we moved these out to self-hosted

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-11 09:31:00 +01:00
Ettore Di Giacinto
516cd660f1 chore(grpcio): reduce parallelism (#4799)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-10 18:56:13 +01:00
Ettore Di Giacinto
8fd3ace9a1 chore(grpcio): bump to 1.70 (#4798)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-10 18:38:53 +01:00
Ettore Di Giacinto
099469cb05 chore(tests): decrease parallelism for gRPC builds (#4797)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-10 12:59:59 +01:00
Ettore Di Giacinto
8864156300 chore(nvidia-l4t): add l4t arm64 images (#4449)
chore(nvidia-l4t): add nvidia-l4t arm64 images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-22 21:29:33 +01:00
dependabot[bot]
ce035416aa build(deps): bump docker/build-push-action from 5 to 6 (#2592)
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 5 to 6.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v5...v6)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-07-13 21:08:59 +00:00
Rene Leonhardt
fc87507012 chore(deps): Update Dependencies (#2538)
* chore(deps): Update dependencies

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

* chore(deps): Upgrade github.com/imdario/mergo to dario.cat/mergo

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

* remove version identifiers for MeloTTS

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

---------

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-07-12 19:54:08 +00:00
Ettore Di Giacinto
2845baecd5 fix(cuda): downgrade default version from 12.5 to 12.4 (#2707)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-09 23:13:29 +02:00
Rene Leonhardt
43f0688a95 feat: Upgrade to CUDA 12.5 (#2601)
Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
2024-06-19 17:50:49 +02:00
Ettore Di Giacinto
d075dc44dd ci: push test images when building PRs (#2424)
ci: try to push image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-27 22:07:35 +02:00
Ettore Di Giacinto
e0187c2a1a ci: do not tag latest on AIO automatically
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-24 09:41:13 +02:00
Ettore Di Giacinto
1a3dedece0 dependencies(grpcio): bump to fix CI issues (#2362)
feat(grpcio): bump to fix CI issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-21 14:33:47 +02:00
cryptk
f7aabf1b50 fix: bring everything onto the same GRPC version to fix tests (#2199)
fix: more places where we are installing grpc that need a version specified
fix: attempt to fix metal tests
fix: metal/brew is forcing an update, they don't have 1.58 available anymore

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-30 19:12:15 +00:00
cryptk
987b7ad42d feat: only keep the build artifacts from the grpc build (#2172)
* feat: only keep the build artifacts from the grpc build

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove separate Cache GRPC build step

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: remove docker inspect step, it is leftover from previous debugging

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-28 19:24:16 +00:00
cryptk
9fc0135991 feat: cleanup Dockerfile and make final image a little smaller (#2146)
* feat: cleanup Dockerfile and make final image a little smaller

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: add build-essential to final stage

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: more GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: correct for another cause of GRPC cache misses

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* feat: generate new GRPC cache automatically if needed

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

* fix: use new GRPC_MAKEFLAGS build arg in GRPC cache generation

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>

---------

Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-27 19:48:20 +02:00
cryptk
13012cfa70 feat: better control of GRPC docker cache (#2070)
Signed-off-by: Chris Jowett <421501+cryptk@users.noreply.github.com>
2024-04-18 16:19:36 -04:00
Ettore Di Giacinto
d692b2c32a ci: push latest images for dockerhub (#1984)
Fixes: #1983

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-10 10:31:59 +02:00
Ettore Di Giacinto
cc3d601836 ci: fixup latest image push
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-04-09 09:49:11 +02:00
Ettore Di Giacinto
93cfec3c32 ci: correctly tag latest and aio images 2024-04-03 11:30:23 +02:00
Ettore Di Giacinto
89560ef87f fix(ci): manually tag latest images (#1948)
fix(ci): manually tag images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-02 19:25:46 +02:00
cryptk
93702e39d4 feat(build): adjust number of parallel make jobs (#1915)
* feat(build): adjust number of parallel make jobs

* fix: update make on MacOS from brew to support --output-sync argument

* fix: cache grpc with version as part of key to improve validity of cache hits

* fix: use gmake for tests-apple to use the updated GNU make version

* fix: actually use the new make version for tests-apple

* feat: parallelize tests-extra

* feat: attempt to cache grpc build for docker images

* fix: don't quote GRPC version

* fix: don't cache go modules, we have limited cache space, better used elsewhere

* fix: release with the same version of go that we test with

* fix: don't fail on exporting cache layers

* fix: remove deprecated BUILD_GRPC docker arg from Makefile
2024-03-29 22:32:40 +01:00
Ettore Di Giacinto
49cec7fd61 ci(aio): add latest tag images (#1884)
Tangentially also fixes #1868
2024-03-23 16:08:32 +01:00
Ettore Di Giacinto
418ba02025 ci: fix typo
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-03-22 09:14:17 +01:00
Ettore Di Giacinto
abc9360dc6 feat(aio): entrypoint, update workflows (#1872) 2024-03-21 22:09:04 +01:00
cryptk
020ce29cd8 fix(make): allow to parallelize jobs (#1845)
* fix: clean up Makefile dependencies to allow for parallel builds

* refactor: remove old unused backend from Makefile

* fix: finish removing legacy backend, update piper

* fix: I broke llama... I fixed llama

* feat: give the tests and builds a few threads

* fix: ensure libraries are replaced before build, add dropreplace target

* Fix image build workflows
2024-03-17 15:39:20 +01:00
Ettore Di Giacinto
ddd21f1644 feat: Use ubuntu as base for container images, drop deprecated ggml-transformers backends (#1689)
* cleanup backends

* switch image to ubuntu 22.04

* adapt commands for ubuntu

* transformers cleanup

* no contrib on ubuntu

* Change test model to gguf

* ci: disable bark tests (too cpu-intensive)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cleanup

* refinements

* use intel base image

* Makefile: Add docker targets

* Change test model

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-02-08 20:12:51 +01:00