mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-17 04:56:52 -04:00
* ci: add per-arch + manifest-merge support for LocalAI server image

Mirror the backend_build.yml + backend_merge.yml pattern shipped in PR #9726 for the LocalAI server image:

- image_build.yml accepts optional platform-tag (default ''), scopes registry cache to cache-localai<suffix>-<platform-tag>, and pushes by canonical digest only on push events. Digests upload as artifacts named digests-localai<suffix>-<platform-tag>, with a "-core" placeholder when tag-suffix is empty so the merge job's download pattern doesn't over-match across multiple suffixes.
- image_merge.yml is a new reusable workflow that downloads matching digest artifacts and assembles the final tagged manifest list via docker buildx imagetools create.

Image names differ from backend_*.yml: the LocalAI server is published under quay.io/go-skynet/local-ai and localai/localai (not -backends).

Not yet wired into image.yml / image-pr.yml; Commit C does that.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: fan out per-arch split to remaining 34 backends

Convert all remaining linux/amd64,linux/arm64 entries in backend-matrix.yml to per-arch + manifest-merge form. Each was a single matrix entry running both arches on x86 under QEMU emulation; each becomes two entries: amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native).

Four backends that were on bigger-runner (-cpu-llama-cpp, -cpu-turboquant, -gpu-vulkan-llama-cpp, -gpu-vulkan-turboquant) have both legs moved to free tier as part of the same change. They are compile-only (no torch/CUDA install) and fit comfortably with the setup-build-disk /mnt relocation. Phase 4 (next commit) retires the remaining 5 single-arch bigger-runner entries.

After this commit:

- 271 total matrix entries (was 237)
- 0 multi-arch entries left
- 36 per-arch pairs (34 new + 2 pilots from PR #9727)
- 5 bigger-runner entries remaining (single-arch, Phase 4 target)

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: split LocalAI image multi-arch entries per arch + merge

Mirror the backend per-arch split for the main LocalAI image:

- image.yml's core-image-build matrix: split the core ('') and -gpu-vulkan entries into amd64 + arm64 legs each. amd64 on ubuntu-latest, arm64 on ubuntu-24.04-arm (native).
- New top-level core-image-merge and gpu-vulkan-image-merge jobs call image_merge.yml after core-image-build completes.
- image-pr.yml's image-build matrix: split the -vulkan-core entry. No merge job added on the PR side: image_build.yml's digest-push is push-only-event-gated, so a PR-side merge would have nothing to download.

After this commit, no workflow file references linux/amd64,linux/arm64 in a single matrix slot.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: retire bigger-runner from backend matrix (Phase 4)

Migrate the remaining 5 single-arch bigger-runner entries to ubuntu-latest. Combined with the Phase 3 setup-build-disk /mnt relocation (PR #9726), free-tier ubuntu-latest now has ~100 GB of working space, enough for ROCm dev image (~16 GB), CUDA toolkit (~5 GB), and the per-backend compile/install steps these entries do.

Backends migrated:

- -gpu-nvidia-cuda-12-llama-cpp
- -gpu-nvidia-cuda-12-turboquant
- -gpu-rocm-hipblas-faster-whisper
- -gpu-rocm-hipblas-coqui
- -cpu-ik-llama-cpp

After this commit, .github/backend-matrix.yml has zero bigger-runner references. The bigger-runner used in tests-vibevoice-cpp-grpc-transcription (test-extra.yml) is a separate concern handled in a follow-up.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 9 Intel oneAPI backends to free tier (Phase 5.1)

Intel oneAPI base image is ~6 GB; each backend's wheel install stays well within the ~100 GB working space provided by Phase 3's setup-build-disk /mnt relocation. Lowest-risk batch of the arc-runner-set retirement.

Backends migrated: vllm, sglang, vibevoice, qwen-asr, nemo, qwen-tts, fish-speech, voxcpm, pocket-tts (all -gpu-intel-* variants).

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 15 ROCm Python backends to free tier (Phase 5.2)

ROCm dev image (~16 GB) plus per-backend torch/wheels install fits on ubuntu-latest with the /mnt-relocated Docker root. These entries include the heavier vLLM/sglang/transformers/diffusers stack on ROCm; if any specific backend OOMs or runs out of disk, individual flips back to arc-runner-set are revertable per-entry.

Backends migrated: all 15 -gpu-rocm-hipblas-* entries previously on arc-runner-set (vllm/vllm-omni/sglang/transformers/diffusers/ace-step/kokoro/vibevoice/qwen-asr/nemo/qwen-tts/fish-speech/voxcpm/pocket-tts/neutts).

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: migrate 6 CUDA Python backends to free tier (Phase 5.3)

vLLM/sglang stacks on CUDA 12 and CUDA 13 are the heaviest backends in the matrix: flash-attn intermediate layers can spike disk usage during build. setup-build-disk's /mnt relocation gives ~100 GB working space which fits the documented peak. Highest-risk batch of the arc-runner-set retirement; if any backend fails to build on free tier, the per-entry runs-on flip is the unit of revert.

Backends migrated: -gpu-nvidia-cuda-{12,13}-{vllm,vllm-omni,sglang}.

After this commit, .github/backend-matrix.yml has zero references to arc-runner-set or bigger-runner. The migration is complete.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: disable provenance on multi-registry digest pushes

Root-caused on master via PR #9727's pilot: when docker/build-push-action@v7 pushes a single build to TWO registries simultaneously with push-by-digest=true, buildx generates a per-registry provenance attestation manifest (because mode=max, the default for push:true, includes the runner ID). That makes the resulting manifest-list digest diverge across registries:

arm64 -cpu-faster-whisper build:
  image manifest:          sha256:d3bdd34b... (identical, content-only)
  quay manifest list:      sha256:66b4cfc8... (with quay attestation)
  dockerhub manifest list: sha256:e0733c3b... (with dockerhub attestation)

steps.build.outputs.digest returns only one of the list digests (empirically the dockerhub one). The merge job then asks "quay.io/...@sha256:e0733c3b..." which doesn't exist on quay; that list has digest 66b4cfc8 there. Result: imagetools create fails with "not found" and the merge job fails (run 25581983094, job 75110021491).

Setting provenance: false drops the per-registry attestation; the manifest-list digest becomes pure content, identical across both registries, and steps.build.outputs.digest works on either lookup.

Applied to backend_build.yml and image_build.yml, both refactored to use the same multi-registry digest-push pattern in the prior PRs.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
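The naming scheme from the first commit (cache-localai<suffix>-<platform-tag> and digests-localai<suffix>-<platform-tag>, with "-core" standing in for an empty suffix) can be sketched in Python. This is an illustration only: the function names are hypothetical, and the glob used for the merge job's download pattern is an assumption, since image_merge.yml is not shown here.

```python
from fnmatch import fnmatch


def cache_ref(tag_suffix: str, platform_tag: str) -> str:
    """Registry cache reference, scoped per suffix and per architecture."""
    return f"quay.io/go-skynet/ci-cache:cache-localai{tag_suffix}-{platform_tag}"


def artifact_name(tag_suffix: str, platform_tag: str) -> str:
    """Digest artifact name; an empty suffix becomes the "-core" placeholder."""
    suffix = tag_suffix if tag_suffix != "" else "-core"
    return f"digests-localai{suffix}-{platform_tag}"


def download_pattern(tag_suffix: str) -> str:
    """Assumed glob a merge job could use to collect every arch for one suffix."""
    suffix = tag_suffix if tag_suffix != "" else "-core"
    return f"digests-localai{suffix}-*"


if __name__ == "__main__":
    names = [
        artifact_name("", "amd64"),
        artifact_name("", "arm64"),
        artifact_name("-gpu-vulkan", "amd64"),
    ]
    # Without the placeholder, the core pattern would be "digests-localai-*",
    # which also matches the -gpu-vulkan artifact.
    print([n for n in names if fnmatch(n, "digests-localai-*")])
    # With the placeholder, only the two core artifacts match.
    print([n for n in names if fnmatch(n, download_pattern(""))])
```

The first list printed contains all three names, the second only the two core ones, which is the over-match the placeholder is meant to avoid.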
224 lines
8.0 KiB
YAML
---
name: 'build container images (reusable)'

on:
  workflow_call:
    inputs:
      base-image:
        description: 'Base image'
        required: true
        type: string
      build-type:
        description: 'Build type'
        default: ''
        type: string
      cuda-major-version:
        description: 'CUDA major version'
        default: "12"
        type: string
      cuda-minor-version:
        description: 'CUDA minor version'
        default: "9"
        type: string
      platforms:
        description: 'Platforms'
        default: ''
        type: string
      platform-tag:
        description: |
          Short tag identifying the platform leg, e.g. "amd64" or "arm64".
          Used to scope the per-arch registry cache and the digest artifact name.
          Optional during the migration; will be flipped to required: true once
          every caller passes an explicit value.
        required: false
        default: ''
        type: string
      tag-latest:
        description: 'Tag latest'
        default: ''
        type: string
      tag-suffix:
        description: 'Tag suffix'
        default: ''
        type: string
      skip-drivers:
        description: 'Skip drivers by default'
        default: 'false'
        type: string
      runs-on:
        description: 'Runs on'
        required: true
        default: ''
        type: string
      makeflags:
        description: 'Make Flags'
        required: false
        default: '--jobs=4 --output-sync=target'
        type: string
      ubuntu-version:
        description: 'Ubuntu version'
        required: false
        default: '2204'
        type: string
      ubuntu-codename:
        description: 'Ubuntu codename'
        required: false
        default: 'noble'
        type: string
    secrets:
      dockerUsername:
        required: true
      dockerPassword:
        required: true
      quayUsername:
        required: true
      quayPassword:
        required: true
jobs:
  reusable_image-build:
    runs-on: ${{ inputs.runs-on }}
    steps:

      - name: Checkout
        uses: actions/checkout@v6

      - name: Configure apt mirror on runner
        id: apt_mirror
        uses: ./.github/actions/configure-apt-mirror

      - name: Free disk space
        uses: ./.github/actions/free-disk-space
        with:
          mode: ${{ inputs.runs-on == 'ubuntu-latest' && 'hosted' || 'skip' }}

      - name: Set up build disk
        uses: ./.github/actions/setup-build-disk

      - name: Docker meta
        id: meta
        if: github.event_name != 'pull_request'
        uses: docker/metadata-action@v6
        with:
          images: |
            quay.io/go-skynet/local-ai
            localai/localai
          tags: |
            type=ref,event=branch
            type=semver,pattern={{raw}}
            type=sha
          flavor: |
            latest=${{ inputs.tag-latest }}
            suffix=${{ inputs.tag-suffix }},onlatest=true
      - name: Docker meta for PR
        id: meta_pull_request
        if: github.event_name == 'pull_request'
        uses: docker/metadata-action@v6
        with:
          images: |
            quay.io/go-skynet/ci-tests
          tags: |
            type=ref,event=branch,suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
            type=semver,pattern={{raw}},suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
            type=sha,suffix=localai${{ github.event.number }}-${{ inputs.build-type }}-${{ inputs.cuda-major-version }}-${{ inputs.cuda-minor-version }}
          flavor: |
            latest=${{ inputs.tag-latest }}
            suffix=${{ inputs.tag-suffix }}
      - name: Set up QEMU
        uses: docker/setup-qemu-action@master
        with:
          platforms: all

      - name: Set up Docker Buildx
        id: buildx
        uses: docker/setup-buildx-action@master

      - name: Login to DockerHub
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v4
        with:
          username: ${{ secrets.dockerUsername }}
          password: ${{ secrets.dockerPassword }}

      - name: Login to Quay.io
        if: github.event_name != 'pull_request'
        uses: docker/login-action@v4
        with:
          registry: quay.io
          username: ${{ secrets.quayUsername }}
          password: ${{ secrets.quayPassword }}

      - name: Build and push by digest
        id: build
        uses: docker/build-push-action@v7
        if: github.event_name != 'pull_request'
        with:
          builder: ${{ steps.buildx.outputs.name }}
          build-args: |
            BUILD_TYPE=${{ inputs.build-type }}
            CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
            CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
            BASE_IMAGE=${{ inputs.base-image }}
            MAKEFLAGS=${{ inputs.makeflags }}
            SKIP_DRIVERS=${{ inputs.skip-drivers }}
            UBUNTU_VERSION=${{ inputs.ubuntu-version }}
            UBUNTU_CODENAME=${{ inputs.ubuntu-codename }}
            APT_MIRROR=${{ steps.apt_mirror.outputs.effective-mirror }}
            APT_PORTS_MIRROR=${{ steps.apt_mirror.outputs.effective-ports-mirror }}
          context: .
          file: ./Dockerfile
          cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:cache-localai${{ inputs.tag-suffix }}-${{ inputs.platform-tag }}
          cache-to: type=registry,ref=quay.io/go-skynet/ci-cache:cache-localai${{ inputs.tag-suffix }}-${{ inputs.platform-tag }},mode=max,ignore-error=true
          platforms: ${{ inputs.platforms }}
          outputs: |
            type=image,name=quay.io/go-skynet/local-ai,push-by-digest=true,name-canonical=true,push=true
            type=image,name=localai/localai,push-by-digest=true,name-canonical=true,push=true
          # See backend_build.yml for the rationale: provenance=mode=max
          # diverges the manifest-list digest per registry, breaking the
          # downstream imagetools create lookup.
          provenance: false
          labels: ${{ steps.meta.outputs.labels }}

      - name: Export digest
        if: github.event_name != 'pull_request'
        run: |
          mkdir -p /tmp/digests
          digest="${{ steps.build.outputs.digest }}"
          touch "/tmp/digests/${digest#sha256:}"

      - name: Upload digest artifact
        if: github.event_name != 'pull_request'
        uses: actions/upload-artifact@v4
        with:
          name: digests-localai${{ inputs.tag-suffix == '' && '-core' || inputs.tag-suffix }}-${{ inputs.platform-tag }}
          path: /tmp/digests/*
          if-no-files-found: error
          retention-days: 1
      ### Start testing image
      - name: Build and push
        uses: docker/build-push-action@v7
        if: github.event_name == 'pull_request'
        with:
          builder: ${{ steps.buildx.outputs.name }}
          build-args: |
            BUILD_TYPE=${{ inputs.build-type }}
            CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
            CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
            BASE_IMAGE=${{ inputs.base-image }}
            MAKEFLAGS=${{ inputs.makeflags }}
            SKIP_DRIVERS=${{ inputs.skip-drivers }}
            UBUNTU_VERSION=${{ inputs.ubuntu-version }}
            UBUNTU_CODENAME=${{ inputs.ubuntu-codename }}
            APT_MIRROR=${{ steps.apt_mirror.outputs.effective-mirror }}
            APT_PORTS_MIRROR=${{ steps.apt_mirror.outputs.effective-ports-mirror }}
          context: .
          file: ./Dockerfile
          cache-from: type=registry,ref=quay.io/go-skynet/ci-cache:cache-localai${{ inputs.tag-suffix }}-${{ inputs.platform-tag }}
          platforms: ${{ inputs.platforms }}
          #push: true
          tags: ${{ steps.meta_pull_request.outputs.tags }}
          labels: ${{ steps.meta_pull_request.outputs.labels }}
      ## End testing image
      - name: job summary
        run: |
          echo "Built image: ${{ steps.meta.outputs.labels }}" >> $GITHUB_STEP_SUMMARY
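The Export digest step above writes one empty file per build, named after the digest with the sha256: prefix removed. A merge job could turn those file names back into a docker buildx imagetools create invocation roughly as follows. This is a Python sketch under stated assumptions: image_merge.yml is not shown here, and the helper name, the tag value and the placeholder digests are all hypothetical.

```python
import tempfile
from pathlib import Path


def imagetools_create_args(image: str, tags: list[str], digest_dir: str) -> list[str]:
    """Build the argv for `docker buildx imagetools create` (hypothetical helper)."""
    # Each file name is a digest with the "sha256:" prefix stripped.
    digests = sorted(p.name for p in Path(digest_dir).iterdir() if p.is_file())
    if not digests:
        raise ValueError(f"no digest files found in {digest_dir}")
    args = ["docker", "buildx", "imagetools", "create"]
    for tag in tags:
        args += ["-t", f"{image}:{tag}"]
    # Reference every per-arch image by its canonical digest.
    args += [f"{image}@sha256:{digest}" for digest in digests]
    return args


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for fake in ("aaa111", "bbb222"):  # placeholder digests, not real ones
            (Path(tmp) / fake).touch()
        # Print the command instead of running it; no registry is contacted.
        print(" ".join(imagetools_create_args("quay.io/go-skynet/local-ai", ["master"], tmp)))
```

Because the build sets provenance: false, the same digest resolves on both registries, so the helper could be called once per image name with the same digest directory.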