Adding a New Backend
When adding a new backend to LocalAI, you need to update several files to ensure the backend is properly built, tested, and registered. Here's a step-by-step guide based on the pattern used for adding backends like moonshine:
1. Create Backend Directory Structure
Create the backend directory under the appropriate location:
- Python backends: backend/python/<backend-name>/
- Go backends: backend/go/<backend-name>/
- C++ backends: backend/cpp/<backend-name>/
- Rust backends: backend/rust/<backend-name>/
For Python backends, you'll typically need:
- backend.py - Main gRPC server implementation
- Makefile - Build configuration
- install.sh - Installation script for dependencies
- protogen.sh - Protocol buffer generation script
- requirements.txt - Python dependencies
- run.sh - Runtime script
- test.py / test.sh - Test files
For Rust backends, you'll typically need (see backend/rust/kokoros/ as a reference):
- Cargo.toml - Crate manifest; depend on the upstream project as a submodule under sources/
- build.rs - Invokes tonic_build to generate gRPC stubs from backend/backend.proto (use the BACKEND_PROTO_PATH env var so the Makefile can inject the canonical copy)
- src/ - The gRPC server implementation (implement Backend via tonic)
- Makefile - Copies backend.proto into the crate, runs cargo build --release, then package.sh
- package.sh - Uses ldd to bundle the binary's dynamic deps and ld.so into package/lib/
- run.sh - Sets LD_LIBRARY_PATH / SSL_CERT_DIR and execs the binary via the bundled lib/ld.so
- sources/<UpstreamProject>/ - Git submodule with the upstream Rust crate
2. Add Build Configurations to .github/backend-matrix.yml
The build matrix is data-only YAML at .github/backend-matrix.yml (not inside backend.yml itself). backend.yml (master push) and backend_pr.yml (PR) load it via scripts/changed-backends.js, which also handles per-file path filtering so only touched backends rebuild on PRs and master pushes alike. Add build matrix entries to .github/backend-matrix.yml for each platform/GPU type you want to support. Look at similar backends for reference — chatterbox/faster-whisper for Python, piper/silero-vad for Go, kokoros for Rust.
Without an entry here no image is ever built or pushed, and the gallery entry in backend/index.yaml will point at a tag that does not exist. The dockerfile: field must point at ./backend/Dockerfile.<lang> matching the language bucket from step 1 (e.g. Dockerfile.python, Dockerfile.golang, Dockerfile.rust). The tag-suffix must match the uri: in the corresponding backend/index.yaml image entry exactly.
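The tag-suffix rule above can be checked mechanically. The following is a minimal sketch, not part of the repository: unmatchedUris is a hypothetical helper, the registry.example URIs are placeholders, and it assumes image URIs have the shape <registry>/<repo>:<latest|master><tag-suffix>, which you should confirm against existing entries in backend/index.yaml.

```javascript
// Hypothetical consistency check between matrix tag-suffix values and gallery URIs.
// Assumption: URIs look like "<registry>/<repo>:<latest|master><tag-suffix>".
function unmatchedUris(tagSuffixes, uris) {
  const known = new Set(tagSuffixes);
  return uris.filter((uri) => {
    const tag = uri.slice(uri.lastIndexOf(":") + 1); // text after the last colon
    const suffix = tag.replace(/^(latest|master)/, ""); // drop the channel prefix
    return !known.has(suffix); // keep only URIs with no matching matrix entry
  });
}

const suffixes = ["-cpu-moonshine", "-gpu-nvidia-cuda-12-moonshine"];
const uris = [
  "registry.example/backends:latest-cpu-moonshine",
  "registry.example/backends:master-cpu-moonshine",
  "registry.example/backends:latest-gpu-nvidia-cuda12-moonshine", // typo: cuda12 vs cuda-12
];
console.log(unmatchedUris(suffixes, uris));
```

Only the third URI is reported, because its tag differs from the matrix suffix by a single dash. That is the kind of mismatch that leaves a gallery entry pointing at a tag that was never pushed.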
scripts/changed-backends.js registration — REQUIRED for any new dockerfile suffix. This is the single most common omission, because it has no effect on the PR that adds the backend (when no prior path filter could catch it anyway) — it only breaks the next PR that touches your backend's directory, which then gets zero CI jobs and looks broken for unrelated reasons. Edit scripts/changed-backends.js:inferBackendPath and add a branch BEFORE the more-generic suffixes:
if (item.dockerfile.endsWith("<your-dockerfile-suffix>")) {
  return `backend/cpp/<your-backend>/`; // or backend/python|go|rust/...
}
The endsWith() test is against the matrix entry's dockerfile: value (e.g. ./backend/Dockerfile.ds4 → endsWith("ds4")). Specificity order matters here just like it does for importers: more-specific suffixes go BEFORE more-generic ones (e.g. ds4 before llama-cpp even though both end with letters, because some upstream might one day call itself super-ds4-llama-cpp). Verify locally before pushing:
# Confirm your dockerfile suffix is unique enough
node -e "
  const yaml = require('js-yaml'); const fs = require('fs');
  const m = yaml.load(fs.readFileSync('.github/backend-matrix.yml','utf8'));
  for (const e of m.include.filter(e => e.backend === '<your-backend>')) {
    console.log(e.dockerfile, '->', e.dockerfile.endsWith('<suffix>'));
  }"
A quick way to find the right insertion point: grep -n 'item.dockerfile.endsWith' scripts/changed-backends.js.
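To see why branch order matters, here is a simplified stand-in for the suffix matching. It is a sketch only: this inferBackendPath takes a rule list as a second argument (the real function in scripts/changed-backends.js uses hard-coded branches), and the backend/cpp/ik-llama-cpp/ path is assumed for illustration.

```javascript
// Hypothetical, simplified model of the endsWith() branches in inferBackendPath.
// Each rule is [dockerfile suffix, backend directory]; the first match wins.
function inferBackendPath(item, rules) {
  for (const [suffix, path] of rules) {
    if (item.dockerfile.endsWith(suffix)) {
      return path;
    }
  }
  return null; // no branch matched, so path filtering cannot map this entry
}

const specificFirst = [
  ["ik-llama-cpp", "backend/cpp/ik-llama-cpp/"], // assumed path, for illustration
  ["llama-cpp", "backend/cpp/llama-cpp/"],
];
const genericFirst = [...specificFirst].reverse();

const entry = { dockerfile: "./backend/Dockerfile.ik-llama-cpp" };
console.log(inferBackendPath(entry, specificFirst)); // resolves to the ik-llama-cpp directory
console.log(inferBackendPath(entry, genericFirst)); // shadowed: resolves to llama-cpp
```

With the generic suffix listed first, changes under the more specific backend's directory would be attributed to the wrong backend, which is the failure mode described above.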
bump_deps.yaml registration — REQUIRED for any backend pinning an upstream commit. If your backend's Makefile has a *_VERSION?=<sha> pin to a third-party repo, the daily auto-bump bot at .github/workflows/bump_deps.yaml won't notice it unless you register the backend in its matrix. The bot runs .github/bump_deps.sh which greps for ^$VAR?= in the Makefile you list — so the pin MUST live in the Makefile (not in a separate shell script). The bump for ds4 (#9761) had to walk this back because the original landed the pin in prepare.sh, which the bot can't see. Pattern (for antirez/ds4):
# .github/workflows/bump_deps.yaml
matrix:
  include:
    - repository: "antirez/ds4"
      variable: "DS4_VERSION"
      branch: "main"
      file: "backend/cpp/ds4/Makefile"
And the corresponding Makefile shape (mirror backend/cpp/llama-cpp/Makefile):
DS4_VERSION?=ae302c2fa18cc6d9aefc021d0f27ae03c9ad2fc0
DS4_REPO?=https://github.com/antirez/ds4
...
ds4:
	mkdir -p ds4
	cd ds4 && git init -q && \
	git remote add origin $(DS4_REPO) && \
	git fetch --depth 1 origin $(DS4_VERSION) && \
	git checkout FETCH_HEAD
If you have a prepare.sh doing the clone, delete it — the recipe belongs in the Makefile target so make purge && make works as a clean-and-rebuild and so the bump bot finds the pin.
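The reason the pin has to sit in the Makefile can be illustrated with a rough JavaScript equivalent of the ^$VAR?= match. This is a sketch under assumptions: findPin is a hypothetical helper, and the real bot uses grep inside .github/bump_deps.sh rather than anything like this.

```javascript
// Hypothetical helper approximating a grep for "^VAR?=" in Makefile text.
function findPin(makefileText, variable) {
  // Escape regex metacharacters in the variable name, then anchor at line start.
  const escaped = variable.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const match = makefileText.match(new RegExp(`^${escaped}\\?=(.*)$`, "m"));
  return match ? match[1].trim() : null;
}

const makefile = [
  "DS4_VERSION?=ae302c2fa18cc6d9aefc021d0f27ae03c9ad2fc0",
  "DS4_REPO?=https://github.com/antirez/ds4",
].join("\n");

console.log(findPin(makefile, "DS4_VERSION")); // the pinned SHA
// A line that merely uses the variable has no "VAR?=" at line start, so nothing is found.
console.log(findPin("git fetch origin $DS4_VERSION", "DS4_VERSION")); // null
```

The second call models a file that only references the variable without defining it at the start of a line: there is nothing for a line-anchored match to find, so the bump is skipped.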
Placement in file:
- CPU builds: Add after other CPU builds (e.g., after cpu-chatterbox)
- CUDA 12 builds: Add after other CUDA 12 builds (e.g., after gpu-nvidia-cuda-12-chatterbox)
- CUDA 13 builds: Add after other CUDA 13 builds (e.g., after gpu-nvidia-cuda-13-chatterbox)
Additional build types you may need:
- ROCm/HIP: Use build-type: 'hipblas' with base-image: "rocm/dev-ubuntu-24.04:7.2.1"
- Intel/SYCL: Use build-type: 'intel' or build-type: 'sycl_f16'/sycl_f32 with base-image: "intel/oneapi-basekit:2025.3.2-0-devel-ubuntu24.04"
- L4T (ARM): Use build-type: 'l4t' with platforms: 'linux/arm64' and runs-on: 'ubuntu-24.04-arm'
Per-arch native builds (linux/amd64 + linux/arm64):
Multi-arch backends are NOT a single matrix entry with platforms: 'linux/amd64,linux/arm64'. Instead, add two entries — one with platforms: 'linux/amd64' + platform-tag: 'amd64' + runs-on: 'ubuntu-latest', one with platforms: 'linux/arm64' + platform-tag: 'arm64' + runs-on: 'ubuntu-24.04-arm' — both sharing the same tag-suffix. The script detects the shared tag-suffix and emits a merge-matrix entry, so backend-merge-jobs (in backend.yml/backend_pr.yml) automatically assembles the manifest list from per-arch digest artifacts. See -cpu-faster-whisper in .github/backend-matrix.yml for a reference shape.
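The shared tag-suffix detection can be pictured as a group-by over the matrix entries. The sketch below is an assumption about the general shape, not a copy of scripts/changed-backends.js: mergeGroups and the platform-tags output key are hypothetical names, and the leading dash on the chatterbox suffix is assumed.

```javascript
// Hypothetical sketch: find tag-suffix values that are built by more than one entry.
function mergeGroups(include) {
  const bySuffix = new Map();
  for (const entry of include) {
    const key = entry["tag-suffix"];
    if (!bySuffix.has(key)) bySuffix.set(key, []);
    bySuffix.get(key).push(entry["platform-tag"]);
  }
  // Only suffixes with several per-arch entries need a manifest-list merge.
  return [...bySuffix]
    .filter(([, tags]) => tags.length > 1)
    .map(([suffix, tags]) => ({ "tag-suffix": suffix, "platform-tags": tags }));
}

const include = [
  { "tag-suffix": "-cpu-faster-whisper", platforms: "linux/amd64", "platform-tag": "amd64", "runs-on": "ubuntu-latest" },
  { "tag-suffix": "-cpu-faster-whisper", platforms: "linux/arm64", "platform-tag": "arm64", "runs-on": "ubuntu-24.04-arm" },
  { "tag-suffix": "-gpu-nvidia-cuda-12-chatterbox", platforms: "linux/amd64" }, // single-arch entry
];
console.log(JSON.stringify(mergeGroups(include)));
```

Only the faster-whisper suffix produces a merge group here. The single-arch entry is left alone, which matches the rule that multi-arch support comes from two entries rather than one entry listing two platforms.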
llama-cpp / ik-llama-cpp / turboquant variants only — builder-base-image:
Entries whose dockerfile is ./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant} must also set a builder-base-image field pointing at a prebuilt base from quay.io/go-skynet/ci-cache:base-grpc-* (CI builds these via .github/workflows/base-images.yml). The mapping is by (build-type, platforms) — see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~25–35 min cold). Local make backends/<name> ignores builder-base-image and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.
3. Add Backend Metadata to backend/index.yaml
Step 3a: Add Meta Definition
Add a YAML anchor definition in the ## metas section (around lines 2-300). Look for a similar backend to use as a template, such as diffusers or chatterbox.
Step 3b: Add Image Entries
Add image entries at the end of the file, following the pattern of similar backends such as diffusers or chatterbox. Include both latest (production) and master (development) tags.
Note on integrity: OCI backends installed from a gallery whose verification: block is set are verified against a keyless-cosign policy before extraction; tarball/HTTP backends use the optional sha256: field. New backends do not need any extra YAML — the gallery-level verification: block covers every entry. See .agents/backend-signing.md for the producer-side CI step.
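For tarball/HTTP backends, the optional sha256: check amounts to hashing the downloaded bytes and comparing hex digests. The following is a minimal sketch using Node's built-in crypto module: checkSha256 and its return values are hypothetical and do not mirror LocalAI's Go implementation, and the three-byte buffer stands in for a real archive.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: compare downloaded bytes against an optional sha256 value.
function checkSha256(bytes, expected) {
  if (!expected) return "unverified"; // sha256: is optional, so there is nothing to compare
  const actual = createHash("sha256").update(bytes).digest("hex");
  return actual === expected.toLowerCase() ? "ok" : "mismatch";
}

const archive = Buffer.from("abc"); // stand-in for tarball bytes
const pinned = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad";
console.log(checkSha256(archive, pinned)); // digest matches the pinned value
console.log(checkSha256(archive, "")); // no digest configured
console.log(checkSha256(Buffer.from("abd"), pinned)); // tampered bytes
```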
4. Update the Makefile
The Makefile needs to be updated in several places to support building and testing the new backend:
Step 4a: Add to .NOTPARALLEL
Add backends/<backend-name> to the .NOTPARALLEL line (around line 2) to prevent parallel execution conflicts:
.NOTPARALLEL: ... backends/<backend-name>
Step 4b: Add to prepare-test-extra
Add the backend to the prepare-test-extra target to prepare it for testing. Use the path matching your language bucket (backend/python/, backend/go/, backend/rust/, …):
prepare-test-extra: protogen-python
	...
	$(MAKE) -C backend/<lang>/<backend-name>
For Rust backends the target is usually the crate build target itself (e.g. $(MAKE) -C backend/rust/<backend-name> <backend-name>-grpc) so the binary is in place before test runs.
Step 4c: Add to test-extra
Add the backend to the test-extra target to run its tests — applies to Go and Rust backends too, not only Python:
test-extra: prepare-test-extra
	...
	$(MAKE) -C backend/<lang>/<backend-name> test
Each backend's own Makefile should define a test target so this line works regardless of language. Integration tests that need large model downloads should be gated behind an env var (see backend/rust/kokoros/'s KOKOROS_MODEL_PATH pattern) so CI only runs unit tests.
Step 4d: Add Backend Definition
Add a backend definition variable in the backend definitions section (around line 428-457). The format depends on the backend type:
For Python backends with root context (like faster-whisper, coqui):
BACKEND_<BACKEND_NAME> = <backend-name>|python|.|false|true
For Python backends with ./backend context (like chatterbox, moonshine):
BACKEND_<BACKEND_NAME> = <backend-name>|python|./backend|false|true
For Go backends:
BACKEND_<BACKEND_NAME> = <backend-name>|golang|.|false|true
For Rust backends:
BACKEND_<BACKEND_NAME> = <backend-name>|rust|.|false|true
The language field (python/golang/rust/…) must match a backend/Dockerfile.<lang> file.
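The pipe-separated definition can be read as five positional fields. Here is a small sketch that splits one and derives the Dockerfile it must correspond to. parseBackendDefinition is a hypothetical helper, and because this guide does not describe the last two fields they are kept as opaque values.

```javascript
// Hypothetical parser for "<name>|<lang>|<context>|<field4>|<field5>" definitions.
function parseBackendDefinition(value) {
  const fields = value.split("|");
  if (fields.length !== 5) {
    throw new Error(`expected 5 fields, got ${fields.length}: ${value}`);
  }
  const [name, language, context, fourth, fifth] = fields;
  return {
    name,
    language,
    context,
    dockerfile: `backend/Dockerfile.${language}`, // must exist for the language field to be valid
    rest: [fourth, fifth], // meaning not covered in this guide
  };
}

console.log(parseBackendDefinition("moonshine|python|./backend|false|true"));
```

Running this on a new definition is a quick way to catch a missing field or a language value such as go where golang is expected.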
Step 4e: Generate Docker Build Target
Add an eval call to generate the docker-build target (around line 480-501):
$(eval $(call generate-docker-build-target,$(BACKEND_<BACKEND_NAME>)))
Step 4f: Add to docker-build-backends
Add docker-build-<backend-name> to the docker-build-backends target (around line 507):
docker-build-backends: ... docker-build-<backend-name>
Determining the Context:
- If the backend is in backend/python/<backend-name>/ and uses ./backend as context in the workflow file, use ./backend context
- If the backend is in backend/python/<backend-name>/ but uses . as context in the workflow file, use . context
- Check similar backends to determine the correct context
5. Verification Checklist
After adding a new backend, verify:
- Backend directory structure is complete with all necessary files
- Build configurations added to .github/backend-matrix.yml for all desired platforms (per-arch entries with platform-tag for multi-arch; builder-base-image for llama-cpp / ik-llama-cpp / turboquant)
- Meta definition added to backend/index.yaml in the ## metas section
- Image entries added to backend/index.yaml for all build variants (latest + development)
- Tag suffixes match between workflow file and index.yaml
- Makefile updated with all 6 required changes (.NOTPARALLEL, prepare-test-extra, test-extra, backend definition, docker-build target eval, docker-build-backends)
- No YAML syntax errors (check with linter)
- No Makefile syntax errors (check with linter)
- Follows the same pattern as similar backends (e.g., if it's a transcription backend, follow faster-whisper pattern)
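The six Makefile changes in the checklist can be spot-checked with plain string matching. This is a rough sketch, not a real linter: missingMakefileChanges is a hypothetical helper, it assumes the variable name is the upper-cased backend name with dashes turned into underscores, it does not confirm which target a recipe line belongs to, and it expects the plain $(MAKE) -C line in prepare-test-extra (a Rust backend that passes an explicit build target would need a looser match).

```javascript
// Hypothetical spot-check for the six Makefile edits described in step 4.
function missingMakefileChanges(makefile, name, lang) {
  const varName = "BACKEND_" + name.toUpperCase().replace(/-/g, "_"); // assumed naming
  const lines = makefile.split("\n");
  const lineStarting = (prefix) => lines.find((l) => l.startsWith(prefix)) || "";
  const makeCall = `$(MAKE) -C backend/${lang}/${name}`;
  const checks = {
    ".NOTPARALLEL": lineStarting(".NOTPARALLEL:").split(/\s+/).includes(`backends/${name}`),
    "prepare-test-extra": lines.some((l) => l.trim() === makeCall),
    "test-extra": lines.some((l) => l.trim() === `${makeCall} test`),
    "backend definition": lines.some((l) => l.startsWith(`${varName} = ${name}|`)),
    "docker-build target eval": makefile.includes(`generate-docker-build-target,$(${varName})`),
    "docker-build-backends": lineStarting("docker-build-backends:").split(/\s+/).includes(`docker-build-${name}`),
  };
  return Object.keys(checks).filter((k) => !checks[k]);
}

const sample = [
  ".NOTPARALLEL: backends/diffusers backends/moonshine",
  "prepare-test-extra: protogen-python",
  "\t$(MAKE) -C backend/python/moonshine",
  "test-extra: prepare-test-extra",
  "\t$(MAKE) -C backend/python/moonshine test",
  "BACKEND_MOONSHINE = moonshine|python|./backend|false|true",
  "$(eval $(call generate-docker-build-target,$(BACKEND_MOONSHINE)))",
  "docker-build-backends: docker-build-diffusers", // moonshine not added yet
].join("\n");

console.log(missingMakefileChanges(sample, "moonshine", "python"));
```

On the sample text the only item reported is docker-build-backends, since the other five edits are present.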
Bundling runtime shared libraries (package.sh)
The final Dockerfile.python stage is FROM scratch: there is no system libc, no apt, no fallback library path. Only files explicitly copied from the builder stage end up in the backend image. That means any library your backend (or its Python deps) loads with dlopen at run time must be packaged into ${BACKEND}/lib/.
Pattern:
- Make sure the library is installed in the builder stage of backend/Dockerfile.python (add it to the top-level apt-get install).
- Drop a package.sh in your backend directory that copies the library, and its soname symlinks, into $(dirname $0)/lib. See backend/python/vllm/package.sh for a reference implementation that walks /usr/lib/x86_64-linux-gnu, /usr/lib/aarch64-linux-gnu, etc.
- Dockerfile.python already runs package.sh automatically if it exists, after package-gpu-libs.sh.
- libbackend.sh automatically prepends ${EDIR}/lib to LD_LIBRARY_PATH at run time, so anything packaged this way is found by dlopen.
How to find missing libs: when a Python module silently fails to register torch ops or you see AttributeError: '_OpNamespace' '...' object has no attribute '...', run the backend image's Python with LD_DEBUG=libs to see which dlopen failed. The filename in the error message (e.g. libnuma.so.1) is what you need to package.
To verify packaging works without trusting the host:
make docker-build-<backend>
CID=$(docker create --entrypoint=/run.sh local-ai-backend:<backend>)
docker cp $CID:/lib /tmp/check && docker rm $CID
ls /tmp/check # expect the bundled .so files + symlinks
Then boot it inside a fresh ubuntu:24.04 (which intentionally does not have the lib installed) to confirm it actually loads from the backend dir.
Importer integration
When you add a new backend, you MUST also make it importable via the model import form (/import-model). The import form dropdown is sourced dynamically from GET /backends/known — it reads the importer registry at core/gallery/importers/importers.go, so the steps below are the ONLY way to make your backend show up.
Required steps:
- If your backend has unambiguous detection signals (unique file extension, HF pipeline_tag, unique repo name pattern, unique artefact like modules.json):
  - Create an importer file at core/gallery/importers/<backend>.go following the Match/Import pattern in llama-cpp.go.
  - Register it in importers.go:defaultImporters in specificity order: more specific detectors must appear BEFORE more generic ones (e.g. sentencetransformers before transformers, stablediffusion-ggml before llama-cpp, vllm-omni before vllm). First match wins.
- If your backend is a drop-in replacement (same artefacts as another backend, e.g. ik-llama-cpp and turboquant both consume GGUF the same way llama-cpp does):
  - Do NOT create a new importer. Extend the existing importer's Import() to swap the emitted backend: field when preferences.backend matches. See llama-cpp.go for the pattern.
- If your backend has no reliable auto-detect signal (preference-only, e.g. sglang, tinygrad, whisperx):
  - Do NOT create an importer. Instead add the backend name to the curated pref-only slice in core/http/endpoints/localai/backend.go that feeds /backends/known. A single line addition.
- Always add a table-driven test in core/gallery/importers/importers_test.go (Ginkgo/Gomega):
  - Use a real public HuggingFace repo URI as the test fixture (existing tests already hit the live HF API; follow that pattern).
  - Cover detection (auto-match without preferences), preference-override (explicit backend: in preferences wins), and, if the backend's modality has a common pipeline_tag but ambiguous artefacts, an ambiguity test asserting errors.Is(err, importers.ErrAmbiguousImport).
Rules of thumb:
- When in doubt, lean pref-only. A wrong auto-detect is worse than a forced preference.
- Never silently emit a modality mismatch (e.g. emit llama-cpp for a TTS repo because .gguf is present). Return ErrAmbiguousImport instead.
- Registration order is the single most common source of bugs. Check by running go test ./core/gallery/importers/... (the existing suite will fail if you've shadowed a pre-existing detector).
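The first-match-wins rule and the preference override can be illustrated with a toy registry. The real registry is Go code in core/gallery/importers/importers.go; the JavaScript below is only a language-agnostic sketch. pickBackend, the detector objects, and the use of config.json as the generic signal are all hypothetical.

```javascript
// Toy model of an ordered importer registry; not the real Go implementation.
function pickBackend(importers, repo, preferences = {}) {
  if (preferences.backend) return preferences.backend; // explicit preference wins
  for (const importer of importers) {
    if (importer.match(repo)) return importer.name; // first match wins
  }
  return null; // nothing matched; a real importer would report an error here
}

// Hypothetical detectors: modules.json is the specific signal, config.json the generic one.
const sentencetransformers = { name: "sentencetransformers", match: (r) => r.files.includes("modules.json") };
const transformers = { name: "transformers", match: (r) => r.files.includes("config.json") };

const repo = { files: ["config.json", "modules.json"] };
console.log(pickBackend([sentencetransformers, transformers], repo)); // specific detector wins
console.log(pickBackend([transformers, sentencetransformers], repo)); // generic detector shadows it
console.log(pickBackend([transformers, sentencetransformers], repo, { backend: "vllm" })); // preference wins
```

The second call shows the shadowing bug that registration order causes: the repo carries the specific artefact, but the generic detector is consulted first and claims it.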
6. Example: Adding a Python Backend
For reference, when moonshine was added:
- Files created: backend/python/moonshine/{backend.py, Makefile, install.sh, protogen.sh, requirements.txt, run.sh, test.py, test.sh}
- Workflow entries: 3 build configurations (CPU, CUDA 12, CUDA 13)
- Index entries: 1 meta definition + 6 image entries (cpu, cuda12, cuda13 x latest/development)
- Makefile updates:
  - Added to .NOTPARALLEL line
  - Added to prepare-test-extra and test-extra targets
  - Added BACKEND_MOONSHINE = moonshine|python|./backend|false|true
  - Added eval for docker-build target generation
  - Added docker-build-moonshine to docker-build-backends