Files
LocalAI/core/gallery/backends.go
Richard Palethorpe 670259ce43 chore: Security hardening (#9719)
* fix(http): close 0.0.0.0/[::] SSRF bypass in /api/cors-proxy

The CORS proxy carried its own private-network blocklist (RFC 1918 + a
handful of IPv6 ranges) instead of using the same classification as
pkg/utils/urlfetch.go. The hand-rolled list missed 0.0.0.0/8 and ::/128,
both of which Linux routes to localhost — so any user with FeatureMCP
(default-on for new users) could reach LocalAI's own listener and any
other service bound to 0.0.0.0:port via:

  GET /api/cors-proxy?url=http://0.0.0.0:8080/...
  GET /api/cors-proxy?url=http://[::]:8080/...

Replace the custom check with utils.IsPublicIP (Go stdlib IsLoopback /
IsLinkLocalUnicast / IsPrivate / IsUnspecified, plus IPv4-mapped IPv6
unmasking) and add an upfront hostname rejection for localhost, *.local,
and the cloud metadata aliases so split-horizon DNS can't paper over the
IP check.

The IP-pinning DialContext is unchanged: the validated IP from the
single resolution is reused for the connection, so DNS rebinding still
cannot swap a public answer for a private one between validate and dial.

Regression tests cover 0.0.0.0, 0.0.0.0:PORT, [::], ::ffff:127.0.0.1,
::ffff:10.0.0.1, file://, gopher://, ftp://, localhost, 127.0.0.1,
10.0.0.1, 169.254.169.254, metadata.google.internal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): verify SHA before promoting temp file to final path

DownloadFileWithContext renamed the .partial file to its final name
*before* checking the streamed SHA, so a hash mismatch returned an
error but left the tampered file at filePath. Subsequent code that
operated on filePath (a backend launcher, a YAML loader, a re-download
that finds the file already present and skips) would consume the
attacker-supplied bytes.

Reorder: verify the streamed hash first, remove the .partial on
mismatch, then rename. The streamed hash is computed during io.Copy
so no second read is needed.

While here, raise the empty-SHA case from a Debug log to a Warn so
"this download had no integrity check" is visible at the default log
level. Backend installs currently pass through with no digest; the
warning makes that footprint observable without changing behaviour.

Regression test asserts os.IsNotExist on the destination after a
deliberate SHA mismatch.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(auth): require email_verified for OIDC admin promotion

extractOIDCUserInfo read the ID token's "email" claim but never
inspected "email_verified". With LOCALAI_ADMIN_EMAIL set, an attacker
who could register on the configured OIDC IdP under that email (some
IdPs accept self-supplied unverified emails) inherited admin role:

  - first login:  AssignRole(tx, email, adminEmail) → RoleAdmin
  - re-login:     MaybePromote(db, user, adminEmail) → flip to RoleAdmin

Add EmailVerified to oauthUserInfo, parse email_verified from the OIDC
claims (default false on absence so an IdP that omits the claim cannot
short-circuit the gate), and substitute "" for the role-decision email
when verified=false via emailForRoleDecision. The user record still
stores the unverified email for display.

GitHub's path defaults EmailVerified=true: GitHub only returns a public
profile email after verification, and fetchGitHubPrimaryEmail explicitly
filters to Verified=true.

Regression tests cover both the helper contract and integration with
AssignRole, including the bootstrap "first user" branch that would
otherwise mask the gate.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(cli): refuse public bind when no auth backend is configured

When neither an auth DB nor a static API key is set, the auth
middleware passes every request through. That is fine for a developer
laptop, a home LAN, or a Tailnet — the network itself is the trust
boundary. It is not fine on a public IP, where every model install,
settings change, and admin endpoint becomes reachable from the
internet.

Refuse to start in that exact configuration. Loopback, RFC 1918,
RFC 4193 ULA, link-local, and RFC 6598 CGNAT (Tailscale's default
range) all count as trusted; wildcard binds (`:port`, `0.0.0.0`,
`[::]`) are accepted only when every host interface is in one of those
ranges. Hostnames are resolved and treated as trusted only when every
answer is.

A new --allow-insecure-public-bind / LOCALAI_ALLOW_INSECURE_PUBLIC_BIND
flag opts out for deployments that gate access externally (a reverse
proxy enforcing auth, a mesh ACL, etc.). The error message lists this
plus the three constructive alternatives (bind a private interface,
enable --auth, set --api-keys).

The interface enumeration goes through a package-level interfaceAddrsFn
var so tests can simulate cloud-VM, home-LAN, Tailscale-only, and
enumeration-failure topologies without poking at the real network
stack.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): regression-test the localai_assistant admin gate

ChatEndpoint already rejects metadata.localai_assistant=true from a
non-admin caller, but the gate was open-coded inline with no direct
test coverage. The chat route is FeatureChat-gated (default-on), and
the assistant's in-process MCP server can install/delete models and
edit configs — the wrong handler change would silently turn the LLM
into a confused deputy.

Extract the gate into requireAssistantAccess(c, authEnabled) and pin
its behaviour: auth disabled is a no-op, unauthenticated is 403,
RoleUser is 403, RoleAdmin and the synthetic legacy-key admin are
admitted.

No behaviour change in the production path.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): assert every API route is auth-classified

The auth middleware classifies path prefixes (/api/, /v1/, /models/,
etc.) as protected and treats anything else as a static-asset
passthrough. A new endpoint shipped under a brand-new prefix — or a
new path that simply isn't on the prefix allowlist — would be
reachable anonymously.

Walk every route registered by API() with auth enabled and a fresh
in-memory database (no users, no keys), and assert each API-prefixed
route returns 401 / 404 / 405 to an anonymous request. Public surfaces
(/api/auth/*, /api/branding, /api/node/* token-authenticated routes,
/healthz, branding asset server, generated-content server, static
assets) are explicit allowlist entries with comments justifying them.

Build-tagged 'auth' so it runs against the SQLite-backed auth DB
(matches the existing auth suite).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin agent endpoint per-user isolation contract

agents.go's getUserID / effectiveUserID / canImpersonateUser /
wantsAllUsers helpers are the single trust boundary for cross-user
access on agent, agent-jobs, collections, and skills routes. A
regression there is the difference between "regular user reads their
own data" and "regular user reads anyone's data via ?user_id=victim".

Lock in the contract:
  - effectiveUserID ignores ?user_id= for unauthenticated and RoleUser
  - effectiveUserID honours it for RoleAdmin and ProviderAgentWorker
  - wantsAllUsers requires admin AND the literal "true" string
  - canImpersonateUser is admin OR agent-worker, never plain RoleUser

No production change — this commit only adds tests.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): drop redundant stat in removePartialFile

The stat-then-remove pattern is a TOCTOU window and a wasted syscall —
os.Remove already returns ErrNotExist for the missing-file case, so trust
that and treat it as a no-op.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): redact secrets from trace buffer and distribution-token logs

The /api/traces buffer captured Authorization, Cookie, Set-Cookie, and
API-key headers verbatim from every request when tracing was enabled. The
endpoint is admin-only but the buffer is reachable via any heap-style
introspection and the captured tokens otherwise outlive the request.
Strip those header values at capture time. Body redaction is left to a
follow-up — the prompts are usually the operator's own and JSON-walking
is invasive.

Distribution tokens were also logged in plaintext from
core/explorer/discovery.go; logs forward to syslog/journald and outlive
the token. Redact those to a short prefix/suffix instead.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): rate-limit OAuth callbacks separately from password endpoints

The shared 5/min/IP limit on auth endpoints is right for password-style
flows but too tight for OAuth callbacks: corporate SSO funnels many real
users through one outbound IP and would trip the limit. Add a separate
60/min/IP limiter for /api/auth/{github,oidc}/callback so callbacks are
bounded against floods without breaking shared-IP deployments.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(gallery): verify backend tarball sha256 when set in gallery entry

GalleryBackend gained an optional sha256 field; the install path now
threads it through to the existing downloader hash-verify (which already
streams, verifies, and rolls back on mismatch). Galleries without sha256
keep working; the empty-SHA path still emits the existing
"downloading without integrity check" warning.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin CSRF coverage on multipart endpoints

The CSRF middleware in app.go is global (e.Use) so it covers every
multipart upload route — branding assets, fine-tune datasets, audio
transforms, agent collections. Pin that contract: cross-site multipart
POSTs are rejected; same-origin / same-site / API-key clients are not.
Also pins the SameSite=Lax fallback path the skipper relies on when
Sec-Fetch-Site is absent.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): XSS hardening — CSP headers, safe href, base-href escape, SVG sandbox

Several closely related XSS-prevention changes spanning the SPA shell, the
React UI, and the branding asset server:

- New SecurityHeaders middleware sets CSP, X-Content-Type-Options,
  X-Frame-Options, and Referrer-Policy on every response. The CSP keeps
  script-src permissive because the Vite bundle relies on inline + eval'd
  scripts; tightening that requires moving to a nonce-based policy.

- The <base href> injection in the SPA shell escaped attacker-controllable
  Host / X-Forwarded-Host headers — a single quote in the host header
  broke out of the attribute. Pass through SecureBaseHref (html.EscapeString).

- Three React sinks rendering untrusted content via dangerouslySetInnerHTML
  switch to text-node rendering with whiteSpace: pre-wrap: user message
  bodies in Chat.jsx and AgentChat.jsx, and the agent activity log in
  AgentChat.jsx. The hand-rolled escape on the agent user-message variant
  is replaced by the same plain-text path.

- New safeHref util collapses non-allowlisted URI schemes (most
  importantly javascript:) to '#'. Applied to gallery `<a href={url}>`
  links in Models / Backends / Manage and to canvas artifact links —
  these come from gallery JSON or assistant tool calls and must be treated
  as untrusted.

- The branding asset server attaches a sandbox CSP plus same-origin CORP
  to .svg responses. The React UI loads logos via <img>, but the same URL
  is also reachable via direct navigation; this prevents script
  execution if a hostile SVG slipped past upload validation.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): bound HTTP server with read-header and idle timeouts

A net/http server with no timeouts is trivially Slowloris-able and leaks
idle keep-alive connections. Set ReadHeaderTimeout (30s) to plug the
slow-headers attack and IdleTimeout (120s) to cap keep-alive sockets.

ReadTimeout and WriteTimeout stay at 0 because request bodies can be
multi-GB model uploads and SSE / chat completions stream for many
minutes; operators who need tighter per-request bounds should terminate
slow clients at a reverse proxy.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(auth): pin PUT /api/auth/profile field-tampering contract

The handler uses an explicit local body struct (only name and avatar_url)
plus a gorm Updates(map) with a column allowlist, so an attacker posting
{"role":"admin","email":"...","password_hash":"..."} can't mass-assign
those fields. Lock that down with a regression test so a future
"let's just c.Bind(&user)" refactor breaks loudly.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(services): strip directory components from multipart upload filenames

UploadDataset and UploadToCollectionForUser took the raw multipart
file.Filename and joined it into a destination path. The fine-tune
upload was incidentally safe because of a UUID prefix that fused any
leading '..' to a literal segment, but the protection is fragile.
UploadToCollectionForUser handed the filename to a vendored backend
without sanitising at all.

Strip to filepath.Base at both boundaries and reject the trivial
unsafe values ("", ".", "..", "/").

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): validate persisted MCP server entries on load

localStorage is shared across same-origin pages; an XSS that lands once
can poison persisted MCP server config to attempt header injection or
to feed a non-http URL into the fetch path on subsequent loads.
Validate every entry: types must match, URL must parse with http(s)
scheme, header keys/values must be control-char-free. Drop anything
that doesn't fit.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): close X-Forwarded-Prefix open redirect

The reverse-proxy support concatenated X-Forwarded-Prefix into the
redirect target without validation, so a forged header value of
"//evil.com" turned the SPA-shell redirect helper at /, /browse, and
/browse/* into a 301 to //evil.com/app. The path-strip middleware had
the same shape on its prefix-trailing-slash redirect.

Add SafeForwardedPrefix at the middleware boundary: must start with
a single '/', no protocol-relative '//' opener, no scheme, no
backslash, no control characters. Apply at both consumers; misconfig
trips the validator and the header is dropped.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): refuse wildcard CORS when LOCALAI_CORS=true with empty allowlist

When LOCALAI_CORS=true but LOCALAI_CORS_ALLOW_ORIGINS was empty, Echo's
CORSWithConfig saw an empty allow-list and fell back to its default
AllowOrigins=["*"]. An operator who flipped the strict-CORS feature
flag without populating the list got the opposite of what they asked
for. Echo never sets Allow-Credentials: true so this isn't directly
exploitable (cookies aren't sent under wildcard CORS), but the
misconfiguration trap is worth closing. Skip the registration and warn.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): zxcvbn password strength check with user-acknowledged override

The previous policy was len < 8, which let through "Password1" and the
rest of the credential-stuffing corpus. LocalAI has no second factor
yet, so the bar needs to sit higher.

Add ValidatePasswordStrength using github.com/timbutler/zxcvbn (an
actively-maintained fork of the trustelem port; v1.0.4, April 2024):
- min 12 chars, max 72 (bcrypt's truncation point)
- reject NUL bytes (some bcrypt callers truncate at the first NUL)
- require zxcvbn score >= 3 ("safely unguessable, ~10^8 guesses to
  break"); the hint list ["localai", "local-ai", "admin"] penalises
  passwords built from the app's own branding

zxcvbn produces false positives sometimes (a strong-looking password
that happens to match a dictionary word) and operators occasionally
need to set a known-weak password (kiosk demos, CI rigs). Add an
acknowledgement path: PasswordPolicy{AllowWeak: true} skips the
entropy check while still enforcing the hard rules. The structured
PasswordErrorResponse marks weak-password rejections as Overridable
so the UI can surface a "use this anyway" checkbox.

Wired through register, self-service password change, and admin
password reset on both the server and the React UI.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): drop HTML5 minLength on new-password inputs

minLength={12} on the new-password input let the browser block the
form submit silently before any JS or network call ran. The browser
focused the field, showed a brief native tooltip, and that was that —
no toast, no fetch, no clue. Reproducible by typing fewer than 12
chars on the second password change of a session.

The JS-level length check in handleSubmit already shows a toast and
the server rejects with a structured error, so the HTML5 attribute
was redundant defence anyway. Drop it.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): bundle Geist fonts locally instead of fetching from Google

The new CSP correctly refused to apply styles from
fonts.googleapis.com because style-src is locked to 'self' and
'unsafe-inline'. Loosening the CSP would defeat its purpose; the
right fix is to stop reaching out to a third-party CDN for fonts on
every page load.

Add @fontsource-variable/geist and @fontsource-variable/geist-mono as
npm deps and import them once at boot. Drop the <link rel="preconnect">
and external stylesheet from index.html.

Side benefit: no third-party tracking via Referer / IP on every UI
load, no failure mode when offline / behind a captive portal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): refresh i18n strings to reflect 12-char password minimum

The translations still said "at least 8 characters" everywhere — the
client-side toast on a too-short password change told the user the
wrong floor. Update tooShort and newPasswordPlaceholder /
newPasswordDescription across all five locales (en, es, it, de,
zh-CN) to match the real ValidatePasswordStrength rule.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): make password length-floor overridable like the entropy check

The 12-char minimum was a policy choice, not a technical invariant —
only "non-empty", "<= 72 bytes", and "no NUL bytes" are real bcrypt
constraints. Treating length-12 as a hard rule was inconsistent with
the entropy check (already overridable) and friction for use cases
where the account is just a name on a session, not a security
boundary (single-user kiosk, CI rig, lab demo).

Restructure ValidatePasswordStrength:
- Hard rules (always enforced): non-empty, <= MaxPasswordLength, no NUL byte
- Policy rules (skipped when AllowWeak=true): length >= 12, zxcvbn score >= 3

PasswordError now marks password_too_short as Overridable too. The
React forms generalised from `error_code === 'password_too_weak'` to
`overridable === true`, and the JS-side preflight length checks were
removed (server is source of truth, returns the same checkbox flow).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-05-08 16:25:45 +02:00

622 lines
20 KiB
Go

// Package gallery provides installation and registration utilities for LocalAI backends,
// including meta-backend resolution based on system capabilities.
package gallery
import (
"context"
"encoding/json"
"errors"
"fmt"
"os"
"path/filepath"
"strings"
"time"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/pkg/downloader"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/oci"
"github.com/mudler/LocalAI/pkg/system"
"github.com/mudler/xlog"
cp "github.com/otiai10/copy"
)
// ErrBackendNotFound is returned when a backend is not found in the system.
var ErrBackendNotFound = errors.New("backend not found")
const (
metadataFile = "metadata.json"
runFile = "run.sh"
)
// Default fallback tag values
const (
defaultLatestTag = "latest"
defaultMasterTag = "master"
defaultDevSuffix = "development"
)
// getFallbackTagValues returns the configurable fallback tag values from SystemState
func getFallbackTagValues(systemState *system.SystemState) (latestTag, masterTag, devSuffix string) {
// Use SystemState fields if set, otherwise use defaults
if systemState.BackendImagesReleaseTag != "" {
latestTag = systemState.BackendImagesReleaseTag
} else {
latestTag = defaultLatestTag
}
if systemState.BackendImagesBranchTag != "" {
masterTag = systemState.BackendImagesBranchTag
} else {
masterTag = defaultMasterTag
}
if systemState.BackendDevSuffix != "" {
devSuffix = systemState.BackendDevSuffix
} else {
devSuffix = defaultDevSuffix
}
return latestTag, masterTag, devSuffix
}
// backendCandidate represents an installed concrete backend option for a given alias
type backendCandidate struct {
name string
runFile string
}
// readBackendMetadata reads the metadata JSON file for a backend
func readBackendMetadata(backendPath string) (*BackendMetadata, error) {
metadataPath := filepath.Join(backendPath, metadataFile)
// If metadata file doesn't exist, return nil (for backward compatibility)
if _, err := os.Stat(metadataPath); os.IsNotExist(err) {
return nil, nil
}
data, err := os.ReadFile(metadataPath)
if err != nil {
return nil, fmt.Errorf("failed to read metadata file %q: %v", metadataPath, err)
}
var metadata BackendMetadata
if err := json.Unmarshal(data, &metadata); err != nil {
return nil, fmt.Errorf("failed to unmarshal metadata file %q: %v", metadataPath, err)
}
return &metadata, nil
}
// writeBackendMetadata writes the metadata JSON file for a backend
func writeBackendMetadata(backendPath string, metadata *BackendMetadata) error {
metadataPath := filepath.Join(backendPath, metadataFile)
data, err := json.MarshalIndent(metadata, "", " ")
if err != nil {
return fmt.Errorf("failed to marshal metadata: %v", err)
}
if err := os.WriteFile(metadataPath, data, 0644); err != nil {
return fmt.Errorf("failed to write metadata file %q: %v", metadataPath, err)
}
return nil
}
// InstallBackendFromGallery installs a backend from the gallery.
func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, name string, downloadStatus func(string, string, string, float64), force bool) error {
if !force {
// check if we already have the backend installed
backends, err := ListSystemBackends(systemState)
if err != nil {
return err
}
// Only short-circuit if the install is *actually usable*. An orphaned
// meta entry whose concrete was removed still shows up in
// ListSystemBackends with a RunFile pointing at a path that no longer
// exists; returning early there leaves the caller with a broken
// alias and the worker fails with "backend not found after install
// attempt" on every retry. Re-install in that case.
if existing, ok := backends.Get(name); ok && isBackendRunnable(existing) {
return nil
}
}
if name == "" {
return fmt.Errorf("backend name is empty")
}
xlog.Debug("Installing backend from gallery", "galleries", galleries, "name", name)
backends, err := AvailableBackends(galleries, systemState)
if err != nil {
return err
}
backend := FindGalleryElement(backends, name)
if backend == nil {
return fmt.Errorf("no backend found with name %q", name)
}
if backend.IsMeta() {
xlog.Debug("Backend is a meta backend", "systemState", systemState, "name", name)
// Then, let's try to find the best backend based on the capabilities map
bestBackend := backend.FindBestBackendFromMeta(systemState, backends)
if bestBackend == nil {
return fmt.Errorf("no backend found with capabilities %q", backend.CapabilitiesMap)
}
xlog.Debug("Installing backend from meta backend", "name", name, "bestBackend", bestBackend.Name)
// Then, let's install the best backend
if err := InstallBackend(ctx, systemState, modelLoader, bestBackend, downloadStatus); err != nil {
return err
}
// we need now to create a path for the meta backend, with the alias to the installed ones so it can be used to remove it
metaBackendPath := filepath.Join(systemState.Backend.BackendsPath, name)
if err := os.MkdirAll(metaBackendPath, 0750); err != nil {
return fmt.Errorf("failed to create meta backend path %q: %v", metaBackendPath, err)
}
// Create metadata for the meta backend
metaMetadata := &BackendMetadata{
MetaBackendFor: bestBackend.Name,
Name: name,
GalleryURL: backend.Gallery.URL,
InstalledAt: time.Now().Format(time.RFC3339),
Version: bestBackend.Version,
}
if err := writeBackendMetadata(metaBackendPath, metaMetadata); err != nil {
return fmt.Errorf("failed to write metadata for meta backend %q: %v", name, err)
}
return nil
}
return InstallBackend(ctx, systemState, modelLoader, backend, downloadStatus)
}
func InstallBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, config *GalleryBackend, downloadStatus func(string, string, string, float64)) error {
// Get configurable fallback tag values from SystemState
latestTag, masterTag, devSuffix := getFallbackTagValues(systemState)
// Create base path if it doesn't exist
err := os.MkdirAll(systemState.Backend.BackendsPath, 0750)
if err != nil {
return fmt.Errorf("failed to create base path: %v", err)
}
if config.IsMeta() {
return fmt.Errorf("meta backends cannot be installed directly")
}
name := config.Name
backendPath := filepath.Join(systemState.Backend.BackendsPath, name)
// Clean up legacy flat-layout artefacts: earlier dev builds of the
// golang backends dropped the compiled binary directly at
// `<backendsPath>/<name>` (a plain file) instead of
// `<backendsPath>/<name>/<name>` (the nested layout the current code
// expects). MkdirAll below returns ENOTDIR when such a stale file
// exists, permanently blocking any reinstall or upgrade. Remove the
// file first so the install can proceed; the new install will write
// the correct nested layout, including metadata.json + run.sh.
if fi, statErr := os.Lstat(backendPath); statErr == nil && !fi.IsDir() {
xlog.Warn("removing stale non-directory backend artefact to make room for fresh install", "path", backendPath)
if rmErr := os.Remove(backendPath); rmErr != nil {
return fmt.Errorf("failed to remove stale backend artefact at %s: %w", backendPath, rmErr)
}
}
err = os.MkdirAll(backendPath, 0750)
if err != nil {
return fmt.Errorf("failed to create base path: %v", err)
}
uri := downloader.URI(config.URI)
// Check if it is a directory
if uri.LooksLikeDir() {
// It is a directory, we just copy it over in the backend folder
if err := cp.Copy(config.URI, backendPath); err != nil {
return fmt.Errorf("failed copying: %w", err)
}
} else {
xlog.Debug("Downloading backend", "uri", config.URI, "backendPath", backendPath)
if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err != nil {
xlog.Debug("Backend download failed, trying fallback", "backendPath", backendPath, "error", err)
// resetBackendPath cleans up partial state from a failed OCI extraction
// so the next download attempt starts fresh. The directory is re-created
// because OCI image extractors need it to exist for writing files into.
resetBackendPath := func() {
os.RemoveAll(backendPath)
os.MkdirAll(backendPath, 0750)
}
success := false
// Try to download from mirrors
for _, mirror := range config.Mirrors {
// Check for cancellation before trying next mirror
select {
case <-ctx.Done():
return ctx.Err()
default:
}
resetBackendPath()
if err := downloader.URI(mirror).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
success = true
xlog.Debug("Downloaded backend from mirror", "uri", config.URI, "backendPath", backendPath)
break
}
}
if !success {
// Try fallback: replace latestTag + "-" with masterTag + "-" in the URI
fallbackURI := strings.Replace(string(config.URI), latestTag+"-", masterTag+"-", 1)
if fallbackURI != string(config.URI) {
resetBackendPath()
xlog.Info("Trying fallback URI", "original", config.URI, "fallback", fallbackURI)
if err := downloader.URI(fallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
xlog.Info("Downloaded backend using fallback URI", "uri", fallbackURI, "backendPath", backendPath)
success = true
} else {
xlog.Info("Fallback URI failed", "fallback", fallbackURI, "error", err)
if !strings.Contains(fallbackURI, "-"+devSuffix) {
resetBackendPath()
devFallbackURI := fallbackURI + "-" + devSuffix
xlog.Info("Trying development fallback URI", "fallback", devFallbackURI)
if err := downloader.URI(devFallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
xlog.Info("Downloaded backend using development fallback URI", "uri", devFallbackURI, "backendPath", backendPath)
success = true
} else {
xlog.Info("Development fallback URI failed", "fallback", devFallbackURI, "error", err)
}
}
}
}
}
if !success {
// Clean up backend directory only when all download attempts have failed
if cleanupErr := os.RemoveAll(backendPath); cleanupErr != nil {
xlog.Warn("Failed to clean up backend directory", "backendPath", backendPath, "error", cleanupErr)
}
xlog.Error("Failed to download backend", "uri", config.URI, "backendPath", backendPath, "error", err)
return fmt.Errorf("failed to download backend %q: %v", config.URI, err)
}
} else {
xlog.Debug("Downloaded backend", "uri", config.URI, "backendPath", backendPath)
}
}
// sanity check - check if runfile is present
runFile := filepath.Join(backendPath, runFile)
if _, err := os.Stat(runFile); os.IsNotExist(err) {
xlog.Error("Run file not found", "runFile", runFile)
return fmt.Errorf("not a valid backend: run file not found %q", runFile)
}
// Create metadata for the backend
metadata := &BackendMetadata{
Name: name,
GalleryURL: config.Gallery.URL,
InstalledAt: time.Now().Format(time.RFC3339),
Version: config.Version,
URI: string(uri),
}
// Record the OCI digest for upgrade detection (non-fatal on failure)
if uri.LooksLikeOCI() {
digest, digestErr := oci.GetImageDigest(string(uri), "", nil, nil)
if digestErr != nil {
xlog.Warn("Failed to get OCI image digest for backend", "uri", string(uri), "error", digestErr)
} else {
metadata.Digest = digest
}
}
if config.Alias != "" {
metadata.Alias = config.Alias
}
if err := writeBackendMetadata(backendPath, metadata); err != nil {
return fmt.Errorf("failed to write metadata for backend %q: %v", name, err)
}
return RegisterBackends(systemState, modelLoader)
}
func DeleteBackendFromSystem(systemState *system.SystemState, name string) error {
backends, err := ListSystemBackends(systemState)
if err != nil {
return err
}
backend, ok := backends.Get(name)
if !ok {
// Not found by direct key — try matching by gallery name (metadata.Name)
// The UI may send gallery-style names like "localai@llama-cpp" which
// don't match the directory-based keys used in the backends map.
for _, b := range backends {
if b.Metadata != nil && b.Metadata.Name == name && !b.IsMeta {
backend = b
ok = true
break
}
}
if !ok {
return fmt.Errorf("backend %q: %w", name, ErrBackendNotFound)
}
}
if backend.IsSystem {
return fmt.Errorf("system backend %q cannot be deleted", name)
}
// Use the backend's actual Name (directory key) for path resolution,
// not the caller-supplied name which may be a gallery-style name.
dirName := backend.Name
backendDirectory := filepath.Join(systemState.Backend.BackendsPath, dirName)
// check if the backend dir exists
if _, err := os.Stat(backendDirectory); os.IsNotExist(err) {
// if doesn't exist, it might be an alias, so we need to check if we have a matching alias in
// all the backends in the basePath
backends, err := os.ReadDir(systemState.Backend.BackendsPath)
if err != nil {
return err
}
foundBackend := false
for _, backend := range backends {
if backend.IsDir() {
metadata, err := readBackendMetadata(filepath.Join(systemState.Backend.BackendsPath, backend.Name()))
if err != nil {
return err
}
if metadata != nil && (metadata.Alias == name || metadata.Alias == dirName) {
backendDirectory = filepath.Join(systemState.Backend.BackendsPath, backend.Name())
foundBackend = true
break
}
}
}
// If no backend found, return successfully (idempotent behavior)
if !foundBackend {
return fmt.Errorf("no backend found with name %q", name)
}
}
// If it's a meta backend, delete also associated backend
metadata, err := readBackendMetadata(backendDirectory)
if err != nil {
return err
}
if metadata != nil && metadata.MetaBackendFor != "" {
concreteDirectory := filepath.Join(systemState.Backend.BackendsPath, metadata.MetaBackendFor)
xlog.Debug("Deleting concrete backend referenced by meta", "concreteDirectory", concreteDirectory)
// If the concrete the meta points to is already gone (earlier delete,
// partial install, or manual cleanup), keep going and remove the
// orphaned meta dir. Previously we returned an error here, which made
// the orphaned meta impossible to uninstall from the UI — the delete
// kept failing and every subsequent install short-circuited because
// the stale meta metadata made ListSystemBackends.Exists(name) true.
if _, statErr := os.Stat(concreteDirectory); statErr == nil {
os.RemoveAll(concreteDirectory)
} else if os.IsNotExist(statErr) {
xlog.Warn("Concrete backend referenced by meta not found — removing orphaned meta only",
"meta", name, "concrete", metadata.MetaBackendFor)
} else {
return statErr
}
}
return os.RemoveAll(backendDirectory)
}
// isBackendRunnable reports whether the given backend entry can actually be
// invoked. A meta backend is runnable only if its concrete's run.sh still
// exists on disk; concrete backends are considered runnable as long as their
// RunFile is set (ListSystemBackends only emits them when the runfile is
// present). Used to guard the "already installed" short-circuit so an
// orphaned meta pointing at a missing concrete triggers a real reinstall
// rather than being silently skipped.
func isBackendRunnable(b SystemBackend) bool {
if b.RunFile == "" {
return false
}
if fi, err := os.Stat(b.RunFile); err != nil || fi.IsDir() {
return false
}
return true
}
type SystemBackend struct {
Name string
RunFile string
IsMeta bool
IsSystem bool
Metadata *BackendMetadata
UpgradeAvailable bool `json:"upgrade_available,omitempty"`
AvailableVersion string `json:"available_version,omitempty"`
// Nodes holds per-node attribution in distributed mode. Empty in single-node.
// Each entry describes a node that has this backend installed, with the
// version/digest it reports. Lets the UI surface drift and per-node status.
Nodes []NodeBackendRef `json:"nodes,omitempty"`
}
// NodeBackendRef describes one node's view of an installed backend. Used both
// for per-node attribution in the UI and for drift detection during upgrade
// checks (a cluster with mismatched versions/digests is flagged upgradeable).
type NodeBackendRef struct {
NodeID string `json:"node_id"`
NodeName string `json:"node_name"`
NodeStatus string `json:"node_status"` // healthy | unhealthy | offline | draining | pending
Version string `json:"version,omitempty"`
Digest string `json:"digest,omitempty"`
URI string `json:"uri,omitempty"`
InstalledAt string `json:"installed_at,omitempty"`
}
type SystemBackends map[string]SystemBackend
func (b SystemBackends) Exists(name string) bool {
_, ok := b[name]
return ok
}
func (b SystemBackends) Get(name string) (SystemBackend, bool) {
backend, ok := b[name]
return backend, ok
}
func (b SystemBackends) GetAll() []SystemBackend {
backends := make([]SystemBackend, 0)
for _, backend := range b {
backends = append(backends, backend)
}
return backends
}
func ListSystemBackends(systemState *system.SystemState) (SystemBackends, error) {
// Gather backends from system and user paths, then resolve alias conflicts by capability.
backends := make(SystemBackends)
// System-provided backends
if systemBackends, err := os.ReadDir(systemState.Backend.BackendsSystemPath); err == nil {
for _, systemBackend := range systemBackends {
if systemBackend.IsDir() {
run := filepath.Join(systemState.Backend.BackendsSystemPath, systemBackend.Name(), runFile)
if _, err := os.Stat(run); err == nil {
backends[systemBackend.Name()] = SystemBackend{
Name: systemBackend.Name(),
RunFile: run,
IsMeta: false,
IsSystem: true,
Metadata: nil,
}
}
}
}
} else if !errors.Is(err, os.ErrNotExist) {
xlog.Warn("Failed to read system backends, proceeding with user-managed backends", "error", err)
} else if errors.Is(err, os.ErrNotExist) {
xlog.Debug("No system backends found")
}
// User-managed backends and alias collection
entries, err := os.ReadDir(systemState.Backend.BackendsPath)
if err != nil {
return nil, err
}
aliasGroups := make(map[string][]backendCandidate)
metaMap := make(map[string]*BackendMetadata)
for _, e := range entries {
if !e.IsDir() {
continue
}
dir := e.Name()
run := filepath.Join(systemState.Backend.BackendsPath, dir, runFile)
var metadata *BackendMetadata
metadataPath := filepath.Join(systemState.Backend.BackendsPath, dir, metadataFile)
if _, err := os.Stat(metadataPath); os.IsNotExist(err) {
metadata = &BackendMetadata{Name: dir}
} else {
m, rerr := readBackendMetadata(filepath.Join(systemState.Backend.BackendsPath, dir))
if rerr != nil {
return nil, rerr
}
if m == nil {
metadata = &BackendMetadata{Name: dir}
} else {
metadata = m
}
}
metaMap[dir] = metadata
// Concrete-backend entry
if _, err := os.Stat(run); err == nil {
backends[dir] = SystemBackend{
Name: dir,
RunFile: run,
IsMeta: false,
Metadata: metadata,
}
}
// Alias candidates
if metadata.Alias != "" {
aliasGroups[metadata.Alias] = append(aliasGroups[metadata.Alias], backendCandidate{name: dir, runFile: run})
}
// Meta backends indirection
if metadata.MetaBackendFor != "" {
backends[metadata.Name] = SystemBackend{
Name: metadata.Name,
RunFile: filepath.Join(systemState.Backend.BackendsPath, metadata.MetaBackendFor, runFile),
IsMeta: true,
Metadata: metadata,
}
}
}
// Resolve aliases using system capability preferences
tokens := systemState.BackendPreferenceTokens()
for alias, cands := range aliasGroups {
chosen := backendCandidate{}
// Try preference tokens
for _, t := range tokens {
for _, c := range cands {
if strings.Contains(strings.ToLower(c.name), t) && c.runFile != "" {
chosen = c
break
}
}
if chosen.runFile != "" {
break
}
}
// Fallback: first runnable
if chosen.runFile == "" {
for _, c := range cands {
if c.runFile != "" {
chosen = c
break
}
}
}
if chosen.runFile == "" {
continue
}
md := metaMap[chosen.name]
backends[alias] = SystemBackend{
Name: alias,
RunFile: chosen.runFile,
IsMeta: false,
Metadata: md,
}
}
return backends, nil
}
func RegisterBackends(systemState *system.SystemState, modelLoader *model.ModelLoader) error {
backends, err := ListSystemBackends(systemState)
if err != nil {
return err
}
for _, backend := range backends {
xlog.Debug("Registering backend", "name", backend.Name, "runFile", backend.RunFile)
modelLoader.SetExternalBackend(backend.Name, backend.RunFile)
}
return nil
}