mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-17 04:56:52 -04:00
* fix(http): close 0.0.0.0/[::] SSRF bypass in /api/cors-proxy

The CORS proxy carried its own private-network blocklist (RFC 1918 + a handful of IPv6 ranges) instead of using the same classification as pkg/utils/urlfetch.go. The hand-rolled list missed 0.0.0.0/8 and ::/128, both of which Linux routes to localhost, so any user with FeatureMCP (default-on for new users) could reach LocalAI's own listener and any other service bound to 0.0.0.0:port via:

  GET /api/cors-proxy?url=http://0.0.0.0:8080/...
  GET /api/cors-proxy?url=http://[::]:8080/...

Replace the custom check with utils.IsPublicIP (Go stdlib IsLoopback / IsLinkLocalUnicast / IsPrivate / IsUnspecified, plus IPv4-mapped IPv6 unmasking) and add an upfront hostname rejection for localhost, *.local, and the cloud metadata aliases so split-horizon DNS can't paper over the IP check.

The IP-pinning DialContext is unchanged: the validated IP from the single resolution is reused for the connection, so DNS rebinding still cannot swap a public answer for a private one between validate and dial.

Regression tests cover 0.0.0.0, 0.0.0.0:PORT, [::], ::ffff:127.0.0.1, ::ffff:10.0.0.1, file://, gopher://, ftp://, localhost, 127.0.0.1, 10.0.0.1, 169.254.169.254, metadata.google.internal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): verify SHA before promoting temp file to final path

DownloadFileWithContext renamed the .partial file to its final name *before* checking the streamed SHA, so a hash mismatch returned an error but left the tampered file at filePath. Subsequent code that operated on filePath (a backend launcher, a YAML loader, a re-download that finds the file already present and skips) would consume the attacker-supplied bytes.

Reorder: verify the streamed hash first, remove the .partial on mismatch, then rename. The streamed hash is computed during io.Copy so no second read is needed.

While here, raise the empty-SHA case from a Debug log to a Warn so "this download had no integrity check" is visible at the default log level. Backend installs currently pass through with no digest; the warning makes that footprint observable without changing behaviour.

Regression test asserts os.IsNotExist on the destination after a deliberate SHA mismatch.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(auth): require email_verified for OIDC admin promotion

extractOIDCUserInfo read the ID token's "email" claim but never inspected "email_verified". With LOCALAI_ADMIN_EMAIL set, an attacker who could register on the configured OIDC IdP under that email (some IdPs accept self-supplied unverified emails) inherited admin role:

- first login: AssignRole(tx, email, adminEmail) → RoleAdmin
- re-login: MaybePromote(db, user, adminEmail) → flip to RoleAdmin

Add EmailVerified to oauthUserInfo, parse email_verified from the OIDC claims (default false on absence so an IdP that omits the claim cannot short-circuit the gate), and substitute "" for the role-decision email when verified=false via emailForRoleDecision. The user record still stores the unverified email for display.

GitHub's path defaults EmailVerified=true: GitHub only returns a public profile email after verification, and fetchGitHubPrimaryEmail explicitly filters to Verified=true.

Regression tests cover both the helper contract and integration with AssignRole, including the bootstrap "first user" branch that would otherwise mask the gate.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(cli): refuse public bind when no auth backend is configured

When neither an auth DB nor a static API key is set, the auth middleware passes every request through. That is fine for a developer laptop, a home LAN, or a Tailnet: the network itself is the trust boundary. It is not fine on a public IP, where every model install, settings change, and admin endpoint becomes reachable from the internet.

Refuse to start in that exact configuration. Loopback, RFC 1918, RFC 4193 ULA, link-local, and RFC 6598 CGNAT (Tailscale's default range) all count as trusted; wildcard binds (`:port`, `0.0.0.0`, `[::]`) are accepted only when every host interface is in one of those ranges. Hostnames are resolved and treated as trusted only when every answer is.

A new --allow-insecure-public-bind / LOCALAI_ALLOW_INSECURE_PUBLIC_BIND flag opts out for deployments that gate access externally (a reverse proxy enforcing auth, a mesh ACL, etc.). The error message lists this plus the three constructive alternatives (bind a private interface, enable --auth, set --api-keys).

The interface enumeration goes through a package-level interfaceAddrsFn var so tests can simulate cloud-VM, home-LAN, Tailscale-only, and enumeration-failure topologies without poking at the real network stack.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): regression-test the localai_assistant admin gate

ChatEndpoint already rejects metadata.localai_assistant=true from a non-admin caller, but the gate was open-coded inline with no direct test coverage. The chat route is FeatureChat-gated (default-on), and the assistant's in-process MCP server can install/delete models and edit configs; the wrong handler change would silently turn the LLM into a confused deputy.

Extract the gate into requireAssistantAccess(c, authEnabled) and pin its behaviour: auth disabled is a no-op, unauthenticated is 403, RoleUser is 403, RoleAdmin and the synthetic legacy-key admin are admitted.

No behaviour change in the production path.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): assert every API route is auth-classified

The auth middleware classifies path prefixes (/api/, /v1/, /models/, etc.) as protected and treats anything else as a static-asset passthrough. A new endpoint shipped under a brand-new prefix (or a new path that simply isn't on the prefix allowlist) would be reachable anonymously.

Walk every route registered by API() with auth enabled and a fresh in-memory database (no users, no keys), and assert each API-prefixed route returns 401 / 404 / 405 to an anonymous request. Public surfaces (/api/auth/*, /api/branding, /api/node/* token-authenticated routes, /healthz, branding asset server, generated-content server, static assets) are explicit allowlist entries with comments justifying them.

Build-tagged 'auth' so it runs against the SQLite-backed auth DB (matches the existing auth suite).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin agent endpoint per-user isolation contract

agents.go's getUserID / effectiveUserID / canImpersonateUser / wantsAllUsers helpers are the single trust boundary for cross-user access on agent, agent-jobs, collections, and skills routes. A regression there is the difference between "regular user reads their own data" and "regular user reads anyone's data via ?user_id=victim".

Lock in the contract:

- effectiveUserID ignores ?user_id= for unauthenticated and RoleUser
- effectiveUserID honours it for RoleAdmin and ProviderAgentWorker
- wantsAllUsers requires admin AND the literal "true" string
- canImpersonateUser is admin OR agent-worker, never plain RoleUser

No production change; this commit only adds tests.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): drop redundant stat in removePartialFile

The stat-then-remove pattern is a TOCTOU window and a wasted syscall: os.Remove already returns ErrNotExist for the missing-file case, so trust that and treat it as a no-op.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): redact secrets from trace buffer and distribution-token logs

The /api/traces buffer captured Authorization, Cookie, Set-Cookie, and API-key headers verbatim from every request when tracing was enabled. The endpoint is admin-only but the buffer is reachable via any heap-style introspection and the captured tokens otherwise outlive the request. Strip those header values at capture time. Body redaction is left to a follow-up; the prompts are usually the operator's own and JSON-walking is invasive.

Distribution tokens were also logged in plaintext from core/explorer/discovery.go; logs forward to syslog/journald and outlive the token. Redact those to a short prefix/suffix instead.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): rate-limit OAuth callbacks separately from password endpoints

The shared 5/min/IP limit on auth endpoints is right for password-style flows but too tight for OAuth callbacks: corporate SSO funnels many real users through one outbound IP and would trip the limit. Add a separate 60/min/IP limiter for /api/auth/{github,oidc}/callback so callbacks are bounded against floods without breaking shared-IP deployments.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(gallery): verify backend tarball sha256 when set in gallery entry

GalleryBackend gained an optional sha256 field; the install path now threads it through to the existing downloader hash-verify (which already streams, verifies, and rolls back on mismatch). Galleries without sha256 keep working; the empty-SHA path still emits the existing "downloading without integrity check" warning.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin CSRF coverage on multipart endpoints

The CSRF middleware in app.go is global (e.Use) so it covers every multipart upload route: branding assets, fine-tune datasets, audio transforms, agent collections. Pin that contract: cross-site multipart POSTs are rejected; same-origin / same-site / API-key clients are not. Also pins the SameSite=Lax fallback path the skipper relies on when Sec-Fetch-Site is absent.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): XSS hardening: CSP headers, safe href, base-href escape, SVG sandbox

Several closely related XSS-prevention changes spanning the SPA shell, the React UI, and the branding asset server:

- New SecurityHeaders middleware sets CSP, X-Content-Type-Options, X-Frame-Options, and Referrer-Policy on every response. The CSP keeps script-src permissive because the Vite bundle relies on inline + eval'd scripts; tightening that requires moving to a nonce-based policy.
- The <base href> injection in the SPA shell escaped attacker-controllable Host / X-Forwarded-Host headers; a single quote in the host header broke out of the attribute. Pass through SecureBaseHref (html.EscapeString).
- Three React sinks rendering untrusted content via dangerouslySetInnerHTML switch to text-node rendering with whiteSpace: pre-wrap: user message bodies in Chat.jsx and AgentChat.jsx, and the agent activity log in AgentChat.jsx. The hand-rolled escape on the agent user-message variant is replaced by the same plain-text path.
- New safeHref util collapses non-allowlisted URI schemes (most importantly javascript:) to '#'. Applied to gallery `<a href={url}>` links in Models / Backends / Manage and to canvas artifact links; these come from gallery JSON or assistant tool calls and must be treated as untrusted.
- The branding asset server attaches a sandbox CSP plus same-origin CORP to .svg responses. The React UI loads logos via <img>, but the same URL is also reachable via direct navigation; this prevents script execution if a hostile SVG slipped past upload validation.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): bound HTTP server with read-header and idle timeouts

A net/http server with no timeouts is trivially Slowloris-able and leaks idle keep-alive connections. Set ReadHeaderTimeout (30s) to plug the slow-headers attack and IdleTimeout (120s) to cap keep-alive sockets. ReadTimeout and WriteTimeout stay at 0 because request bodies can be multi-GB model uploads and SSE / chat completions stream for many minutes; operators who need tighter per-request bounds should terminate slow clients at a reverse proxy.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(auth): pin PUT /api/auth/profile field-tampering contract

The handler uses an explicit local body struct (only name and avatar_url) plus a gorm Updates(map) with a column allowlist, so an attacker posting {"role":"admin","email":"...","password_hash":"..."} can't mass-assign those fields. Lock that down with a regression test so a future "let's just c.Bind(&user)" refactor breaks loudly.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(services): strip directory components from multipart upload filenames

UploadDataset and UploadToCollectionForUser took the raw multipart file.Filename and joined it into a destination path. The fine-tune upload was incidentally safe because of a UUID prefix that fused any leading '..' to a literal segment, but the protection is fragile. UploadToCollectionForUser handed the filename to a vendored backend without sanitising at all.

Strip to filepath.Base at both boundaries and reject the trivial unsafe values ("", ".", "..", "/").

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): validate persisted MCP server entries on load

localStorage is shared across same-origin pages; an XSS that lands once can poison persisted MCP server config to attempt header injection or to feed a non-http URL into the fetch path on subsequent loads. Validate every entry: types must match, URL must parse with http(s) scheme, header keys/values must be control-char-free. Drop anything that doesn't fit.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): close X-Forwarded-Prefix open redirect

The reverse-proxy support concatenated X-Forwarded-Prefix into the redirect target without validation, so a forged header value of "//evil.com" turned the SPA-shell redirect helper at /, /browse, and /browse/* into a 301 to //evil.com/app. The path-strip middleware had the same shape on its prefix-trailing-slash redirect.

Add SafeForwardedPrefix at the middleware boundary: must start with a single '/', no protocol-relative '//' opener, no scheme, no backslash, no control characters. Apply at both consumers; misconfig trips the validator and the header is dropped.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): refuse wildcard CORS when LOCALAI_CORS=true with empty allowlist

When LOCALAI_CORS=true but LOCALAI_CORS_ALLOW_ORIGINS was empty, Echo's CORSWithConfig saw an empty allow-list and fell back to its default AllowOrigins=["*"]. An operator who flipped the strict-CORS feature flag without populating the list got the opposite of what they asked for.

Echo never sets Allow-Credentials: true so this isn't directly exploitable (cookies aren't sent under wildcard CORS), but the misconfiguration trap is worth closing. Skip the registration and warn.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): zxcvbn password strength check with user-acknowledged override

The previous policy was len < 8, which let through "Password1" and the rest of the credential-stuffing corpus. LocalAI has no second factor yet, so the bar needs to sit higher.

Add ValidatePasswordStrength using github.com/timbutler/zxcvbn (an actively-maintained fork of the trustelem port; v1.0.4, April 2024):

- min 12 chars, max 72 (bcrypt's truncation point)
- reject NUL bytes (some bcrypt callers truncate at the first NUL)
- require zxcvbn score >= 3 ("safely unguessable, ~10^8 guesses to break"); the hint list ["localai", "local-ai", "admin"] penalises passwords built from the app's own branding

zxcvbn produces false positives sometimes (a strong-looking password that happens to match a dictionary word) and operators occasionally need to set a known-weak password (kiosk demos, CI rigs). Add an acknowledgement path: PasswordPolicy{AllowWeak: true} skips the entropy check while still enforcing the hard rules. The structured PasswordErrorResponse marks weak-password rejections as Overridable so the UI can surface a "use this anyway" checkbox.

Wired through register, self-service password change, and admin password reset on both the server and the React UI.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): drop HTML5 minLength on new-password inputs

minLength={12} on the new-password input let the browser block the form submit silently before any JS or network call ran. The browser focused the field, showed a brief native tooltip, and that was that: no toast, no fetch, no clue. Reproducible by typing fewer than 12 chars on the second password change of a session.

The JS-level length check in handleSubmit already shows a toast and the server rejects with a structured error, so the HTML5 attribute was redundant defence anyway. Drop it.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): bundle Geist fonts locally instead of fetching from Google

The new CSP correctly refused to apply styles from fonts.googleapis.com because style-src is locked to 'self' and 'unsafe-inline'. Loosening the CSP would defeat its purpose; the right fix is to stop reaching out to a third-party CDN for fonts on every page load.

Add @fontsource-variable/geist and @fontsource-variable/geist-mono as npm deps and import them once at boot. Drop the <link rel="preconnect"> and external stylesheet from index.html.

Side benefit: no third-party tracking via Referer / IP on every UI load, no failure mode when offline / behind a captive portal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): refresh i18n strings to reflect 12-char password minimum

The translations still said "at least 8 characters" everywhere; the client-side toast on a too-short password change told the user the wrong floor. Update tooShort and newPasswordPlaceholder / newPasswordDescription across all five locales (en, es, it, de, zh-CN) to match the real ValidatePasswordStrength rule.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): make password length-floor overridable like the entropy check

The 12-char minimum was a policy choice, not a technical invariant: only "non-empty", "<= 72 bytes", and "no NUL bytes" are real bcrypt constraints. Treating length-12 as a hard rule was inconsistent with the entropy check (already overridable) and friction for use cases where the account is just a name on a session, not a security boundary (single-user kiosk, CI rig, lab demo).

Restructure ValidatePasswordStrength:

- Hard rules (always enforced): non-empty, <= MaxPasswordLength, no NUL byte
- Policy rules (skipped when AllowWeak=true): length >= 12, zxcvbn score >= 3

PasswordError now marks password_too_short as Overridable too. The React forms generalised from `error_code === 'password_too_weak'` to `overridable === true`, and the JS-side preflight length checks were removed (server is source of truth, returns the same checkbox flow).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
622 lines
20 KiB
Go
// Package gallery provides installation and registration utilities for LocalAI backends,
// including meta-backend resolution based on system capabilities.
package gallery

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"time"

	"github.com/mudler/LocalAI/core/config"
	"github.com/mudler/LocalAI/pkg/downloader"
	"github.com/mudler/LocalAI/pkg/model"
	"github.com/mudler/LocalAI/pkg/oci"
	"github.com/mudler/LocalAI/pkg/system"
	"github.com/mudler/xlog"
	cp "github.com/otiai10/copy"
)

// ErrBackendNotFound is returned when a backend is not found in the system.
var ErrBackendNotFound = errors.New("backend not found")

const (
	metadataFile = "metadata.json"
	runFile      = "run.sh"
)

// Default fallback tag values
const (
	defaultLatestTag = "latest"
	defaultMasterTag = "master"
	defaultDevSuffix = "development"
)

// getFallbackTagValues returns the configurable fallback tag values from SystemState
func getFallbackTagValues(systemState *system.SystemState) (latestTag, masterTag, devSuffix string) {
	// Use SystemState fields if set, otherwise use defaults
	if systemState.BackendImagesReleaseTag != "" {
		latestTag = systemState.BackendImagesReleaseTag
	} else {
		latestTag = defaultLatestTag
	}
	if systemState.BackendImagesBranchTag != "" {
		masterTag = systemState.BackendImagesBranchTag
	} else {
		masterTag = defaultMasterTag
	}
	if systemState.BackendDevSuffix != "" {
		devSuffix = systemState.BackendDevSuffix
	} else {
		devSuffix = defaultDevSuffix
	}

	return latestTag, masterTag, devSuffix
}

// backendCandidate represents an installed concrete backend option for a given alias
type backendCandidate struct {
	name    string
	runFile string
}

// readBackendMetadata reads the metadata JSON file for a backend
func readBackendMetadata(backendPath string) (*BackendMetadata, error) {
	metadataPath := filepath.Join(backendPath, metadataFile)

	// If metadata file doesn't exist, return nil (for backward compatibility)
	if _, err := os.Stat(metadataPath); os.IsNotExist(err) {
		return nil, nil
	}

	data, err := os.ReadFile(metadataPath)
	if err != nil {
		return nil, fmt.Errorf("failed to read metadata file %q: %v", metadataPath, err)
	}

	var metadata BackendMetadata
	if err := json.Unmarshal(data, &metadata); err != nil {
		return nil, fmt.Errorf("failed to unmarshal metadata file %q: %v", metadataPath, err)
	}

	return &metadata, nil
}

// writeBackendMetadata writes the metadata JSON file for a backend
func writeBackendMetadata(backendPath string, metadata *BackendMetadata) error {
	metadataPath := filepath.Join(backendPath, metadataFile)

	data, err := json.MarshalIndent(metadata, "", " ")
	if err != nil {
		return fmt.Errorf("failed to marshal metadata: %v", err)
	}

	if err := os.WriteFile(metadataPath, data, 0644); err != nil {
		return fmt.Errorf("failed to write metadata file %q: %v", metadataPath, err)
	}

	return nil
}

// InstallBackendFromGallery installs a backend from the gallery.
func InstallBackendFromGallery(ctx context.Context, galleries []config.Gallery, systemState *system.SystemState, modelLoader *model.ModelLoader, name string, downloadStatus func(string, string, string, float64), force bool) error {
	if !force {
		// check if we already have the backend installed
		backends, err := ListSystemBackends(systemState)
		if err != nil {
			return err
		}
		// Only short-circuit if the install is *actually usable*. An orphaned
		// meta entry whose concrete was removed still shows up in
		// ListSystemBackends with a RunFile pointing at a path that no longer
		// exists; returning early there leaves the caller with a broken
		// alias and the worker fails with "backend not found after install
		// attempt" on every retry. Re-install in that case.
		if existing, ok := backends.Get(name); ok && isBackendRunnable(existing) {
			return nil
		}
	}

	if name == "" {
		return fmt.Errorf("backend name is empty")
	}

	xlog.Debug("Installing backend from gallery", "galleries", galleries, "name", name)

	backends, err := AvailableBackends(galleries, systemState)
	if err != nil {
		return err
	}

	backend := FindGalleryElement(backends, name)
	if backend == nil {
		return fmt.Errorf("no backend found with name %q", name)
	}

	if backend.IsMeta() {
		xlog.Debug("Backend is a meta backend", "systemState", systemState, "name", name)

		// Then, let's try to find the best backend based on the capabilities map
		bestBackend := backend.FindBestBackendFromMeta(systemState, backends)
		if bestBackend == nil {
			return fmt.Errorf("no backend found with capabilities %q", backend.CapabilitiesMap)
		}

		xlog.Debug("Installing backend from meta backend", "name", name, "bestBackend", bestBackend.Name)

		// Then, let's install the best backend
		if err := InstallBackend(ctx, systemState, modelLoader, bestBackend, downloadStatus); err != nil {
			return err
		}

		// we need now to create a path for the meta backend, with the alias to the installed ones so it can be used to remove it
		metaBackendPath := filepath.Join(systemState.Backend.BackendsPath, name)
		if err := os.MkdirAll(metaBackendPath, 0750); err != nil {
			return fmt.Errorf("failed to create meta backend path %q: %v", metaBackendPath, err)
		}

		// Create metadata for the meta backend
		metaMetadata := &BackendMetadata{
			MetaBackendFor: bestBackend.Name,
			Name:           name,
			GalleryURL:     backend.Gallery.URL,
			InstalledAt:    time.Now().Format(time.RFC3339),
			Version:        bestBackend.Version,
		}

		if err := writeBackendMetadata(metaBackendPath, metaMetadata); err != nil {
			return fmt.Errorf("failed to write metadata for meta backend %q: %v", name, err)
		}

		return nil
	}

	return InstallBackend(ctx, systemState, modelLoader, backend, downloadStatus)
}

func InstallBackend(ctx context.Context, systemState *system.SystemState, modelLoader *model.ModelLoader, config *GalleryBackend, downloadStatus func(string, string, string, float64)) error {
|
|
// Get configurable fallback tag values from SystemState
|
|
latestTag, masterTag, devSuffix := getFallbackTagValues(systemState)
|
|
|
|
// Create base path if it doesn't exist
|
|
err := os.MkdirAll(systemState.Backend.BackendsPath, 0750)
|
|
if err != nil {
|
|
return fmt.Errorf("failed to create base path: %v", err)
|
|
}
|
|
|
|
if config.IsMeta() {
|
|
return fmt.Errorf("meta backends cannot be installed directly")
|
|
}
|
|
|
|
name := config.Name
|
|
backendPath := filepath.Join(systemState.Backend.BackendsPath, name)
|
|
// Clean up legacy flat-layout artefacts: earlier dev builds of the
|
|
// golang backends dropped the compiled binary directly at
|
|
// `<backendsPath>/<name>` (a plain file) instead of
|
|
// `<backendsPath>/<name>/<name>` (the nested layout the current code
|
|
// expects). MkdirAll below returns ENOTDIR when such a stale file
|
|
// exists, permanently blocking any reinstall or upgrade. Remove the
|
|
// file first so the install can proceed; the new install will write
|
|
// the correct nested layout, including metadata.json + run.sh.
|
|
if fi, statErr := os.Lstat(backendPath); statErr == nil && !fi.IsDir() {
|
|
xlog.Warn("removing stale non-directory backend artefact to make room for fresh install", "path", backendPath)
|
|
if rmErr := os.Remove(backendPath); rmErr != nil {
|
|
return fmt.Errorf("failed to remove stale backend artefact at %s: %w", backendPath, rmErr)
|
|
}
|
|
}
|
|
err = os.MkdirAll(backendPath, 0750)
|
|
if err != nil {
|
|
return fmt.Errorf("failed to create base path: %v", err)
|
|
}
|
|
|
|
uri := downloader.URI(config.URI)
|
|
// Check if it is a directory
|
|
if uri.LooksLikeDir() {
|
|
// It is a directory, we just copy it over in the backend folder
|
|
if err := cp.Copy(config.URI, backendPath); err != nil {
|
|
return fmt.Errorf("failed copying: %w", err)
|
|
}
|
|
} else {
|
|
xlog.Debug("Downloading backend", "uri", config.URI, "backendPath", backendPath)
|
|
if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err != nil {
|
|
xlog.Debug("Backend download failed, trying fallback", "backendPath", backendPath, "error", err)
|
|
|
|
// resetBackendPath cleans up partial state from a failed OCI extraction
|
|
// so the next download attempt starts fresh. The directory is re-created
|
|
// because OCI image extractors need it to exist for writing files into.
|
|
resetBackendPath := func() {
|
|
os.RemoveAll(backendPath)
|
|
os.MkdirAll(backendPath, 0750)
|
|
}
|
|
|
|
success := false
|
|
// Try to download from mirrors
|
|
for _, mirror := range config.Mirrors {
|
|
// Check for cancellation before trying next mirror
|
|
select {
|
|
case <-ctx.Done():
|
|
return ctx.Err()
|
|
default:
|
|
}
|
|
resetBackendPath()
|
|
if err := downloader.URI(mirror).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
|
|
success = true
|
|
xlog.Debug("Downloaded backend from mirror", "uri", config.URI, "backendPath", backendPath)
|
|
break
|
|
}
|
|
}
|
|
|
|
if !success {
|
|
// Try fallback: replace latestTag + "-" with masterTag + "-" in the URI
|
|
fallbackURI := strings.Replace(string(config.URI), latestTag+"-", masterTag+"-", 1)
|
|
if fallbackURI != string(config.URI) {
|
|
resetBackendPath()
|
|
xlog.Info("Trying fallback URI", "original", config.URI, "fallback", fallbackURI)
|
|
if err := downloader.URI(fallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
|
|
						xlog.Info("Downloaded backend using fallback URI", "uri", fallbackURI, "backendPath", backendPath)
						success = true
					} else {
						xlog.Info("Fallback URI failed", "fallback", fallbackURI, "error", err)
						if !strings.Contains(fallbackURI, "-"+devSuffix) {
							resetBackendPath()
							devFallbackURI := fallbackURI + "-" + devSuffix
							xlog.Info("Trying development fallback URI", "fallback", devFallbackURI)
							if err := downloader.URI(devFallbackURI).DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus); err == nil {
								xlog.Info("Downloaded backend using development fallback URI", "uri", devFallbackURI, "backendPath", backendPath)
								success = true
							} else {
								xlog.Info("Development fallback URI failed", "fallback", devFallbackURI, "error", err)
							}
						}
					}
				}
			}

			if !success {
				// Clean up backend directory only when all download attempts have failed
				if cleanupErr := os.RemoveAll(backendPath); cleanupErr != nil {
					xlog.Warn("Failed to clean up backend directory", "backendPath", backendPath, "error", cleanupErr)
				}
				xlog.Error("Failed to download backend", "uri", config.URI, "backendPath", backendPath, "error", err)
				return fmt.Errorf("failed to download backend %q: %v", config.URI, err)
			}
		} else {
			xlog.Debug("Downloaded backend", "uri", config.URI, "backendPath", backendPath)
		}
	}

	// Sanity check: the run file must be present.
	// The local is named runFilePath so it does not shadow the package-level runFile.
	runFilePath := filepath.Join(backendPath, runFile)
	if _, err := os.Stat(runFilePath); os.IsNotExist(err) {
		xlog.Error("Run file not found", "runFile", runFilePath)
		return fmt.Errorf("not a valid backend: run file not found %q", runFilePath)
	}

	// Create metadata for the backend
	metadata := &BackendMetadata{
		Name:        name,
		GalleryURL:  config.Gallery.URL,
		InstalledAt: time.Now().Format(time.RFC3339),
		Version:     config.Version,
		URI:         string(uri),
	}

	// Record the OCI digest for upgrade detection (non-fatal on failure)
	if uri.LooksLikeOCI() {
		digest, digestErr := oci.GetImageDigest(string(uri), "", nil, nil)
		if digestErr != nil {
			xlog.Warn("Failed to get OCI image digest for backend", "uri", string(uri), "error", digestErr)
		} else {
			metadata.Digest = digest
		}
	}

	if config.Alias != "" {
		metadata.Alias = config.Alias
	}

	if err := writeBackendMetadata(backendPath, metadata); err != nil {
		return fmt.Errorf("failed to write metadata for backend %q: %v", name, err)
	}

	return RegisterBackends(systemState, modelLoader)
}

func DeleteBackendFromSystem(systemState *system.SystemState, name string) error {
	backends, err := ListSystemBackends(systemState)
	if err != nil {
		return err
	}

	backend, ok := backends.Get(name)
	if !ok {
		// Not found by direct key, so try matching by gallery name (metadata.Name).
		// The UI may send gallery-style names like "localai@llama-cpp" which
		// don't match the directory-based keys used in the backends map.
		for _, b := range backends {
			if b.Metadata != nil && b.Metadata.Name == name && !b.IsMeta {
				backend = b
				ok = true
				break
			}
		}
		if !ok {
			return fmt.Errorf("backend %q: %w", name, ErrBackendNotFound)
		}
	}

	if backend.IsSystem {
		return fmt.Errorf("system backend %q cannot be deleted", name)
	}

	// Use the backend's actual Name (directory key) for path resolution,
	// not the caller-supplied name which may be a gallery-style name.
	dirName := backend.Name
	backendDirectory := filepath.Join(systemState.Backend.BackendsPath, dirName)

	// Check whether the backend directory exists.
	if _, err := os.Stat(backendDirectory); os.IsNotExist(err) {
		// If it doesn't exist, the name might be an alias, so look for a
		// matching alias among all the backends in the backends path.
		entries, err := os.ReadDir(systemState.Backend.BackendsPath)
		if err != nil {
			return err
		}
		foundBackend := false

		for _, entry := range entries {
			if entry.IsDir() {
				metadata, err := readBackendMetadata(filepath.Join(systemState.Backend.BackendsPath, entry.Name()))
				if err != nil {
					return err
				}
				if metadata != nil && (metadata.Alias == name || metadata.Alias == dirName) {
					backendDirectory = filepath.Join(systemState.Backend.BackendsPath, entry.Name())
					foundBackend = true
					break
				}
			}
		}

		// Neither a directory nor an alias matched, so report the failure to the caller.
		if !foundBackend {
			return fmt.Errorf("no backend found with name %q", name)
		}
	}

	// If it's a meta backend, also delete the associated concrete backend.
	metadata, err := readBackendMetadata(backendDirectory)
	if err != nil {
		return err
	}

	if metadata != nil && metadata.MetaBackendFor != "" {
		concreteDirectory := filepath.Join(systemState.Backend.BackendsPath, metadata.MetaBackendFor)
		xlog.Debug("Deleting concrete backend referenced by meta", "concreteDirectory", concreteDirectory)
		// If the concrete backend the meta points to is already gone (earlier
		// delete, partial install, or manual cleanup), keep going and remove the
		// orphaned meta dir. Previously we returned an error here, which made
		// the orphaned meta impossible to uninstall from the UI: the delete
		// kept failing and every subsequent install short-circuited because
		// the stale meta metadata made ListSystemBackends.Exists(name) true.
		if _, statErr := os.Stat(concreteDirectory); statErr == nil {
			// Log a failed removal instead of silently dropping the error;
			// the meta directory is still removed below.
			if rmErr := os.RemoveAll(concreteDirectory); rmErr != nil {
				xlog.Warn("Failed to remove concrete backend directory", "concreteDirectory", concreteDirectory, "error", rmErr)
			}
		} else if os.IsNotExist(statErr) {
			xlog.Warn("Concrete backend referenced by meta not found - removing orphaned meta only",
				"meta", name, "concrete", metadata.MetaBackendFor)
		} else {
			return statErr
		}
	}

	return os.RemoveAll(backendDirectory)
}

// isBackendRunnable reports whether the given backend entry can actually be
// invoked. A meta backend is runnable only if its concrete's run.sh still
// exists on disk; concrete backends are considered runnable as long as their
// RunFile is set (ListSystemBackends only emits them when the runfile is
// present). Used to guard the "already installed" short-circuit so an
// orphaned meta pointing at a missing concrete triggers a real reinstall
// rather than being silently skipped.
func isBackendRunnable(b SystemBackend) bool {
	if b.RunFile == "" {
		return false
	}
	if fi, err := os.Stat(b.RunFile); err != nil || fi.IsDir() {
		return false
	}
	return true
}

type SystemBackend struct {
	Name             string
	RunFile          string
	IsMeta           bool
	IsSystem         bool
	Metadata         *BackendMetadata
	UpgradeAvailable bool   `json:"upgrade_available,omitempty"`
	AvailableVersion string `json:"available_version,omitempty"`
	// Nodes holds per-node attribution in distributed mode. Empty in single-node.
	// Each entry describes a node that has this backend installed, with the
	// version/digest it reports. Lets the UI surface drift and per-node status.
	Nodes []NodeBackendRef `json:"nodes,omitempty"`
}

// NodeBackendRef describes one node's view of an installed backend. Used both
// for per-node attribution in the UI and for drift detection during upgrade
// checks (a cluster with mismatched versions/digests is flagged upgradeable).
type NodeBackendRef struct {
	NodeID      string `json:"node_id"`
	NodeName    string `json:"node_name"`
	NodeStatus  string `json:"node_status"` // healthy | unhealthy | offline | draining | pending
	Version     string `json:"version,omitempty"`
	Digest      string `json:"digest,omitempty"`
	URI         string `json:"uri,omitempty"`
	InstalledAt string `json:"installed_at,omitempty"`
}

type SystemBackends map[string]SystemBackend

func (b SystemBackends) Exists(name string) bool {
	_, ok := b[name]
	return ok
}

func (b SystemBackends) Get(name string) (SystemBackend, bool) {
	backend, ok := b[name]
	return backend, ok
}

func (b SystemBackends) GetAll() []SystemBackend {
	// Preallocate: the result holds exactly one entry per map key.
	backends := make([]SystemBackend, 0, len(b))
	for _, backend := range b {
		backends = append(backends, backend)
	}
	return backends
}

func ListSystemBackends(systemState *system.SystemState) (SystemBackends, error) {
	// Gather backends from system and user paths, then resolve alias conflicts by capability.
	backends := make(SystemBackends)

	// System-provided backends
	if systemBackends, err := os.ReadDir(systemState.Backend.BackendsSystemPath); err == nil {
		for _, systemBackend := range systemBackends {
			if systemBackend.IsDir() {
				run := filepath.Join(systemState.Backend.BackendsSystemPath, systemBackend.Name(), runFile)
				if _, err := os.Stat(run); err == nil {
					backends[systemBackend.Name()] = SystemBackend{
						Name:     systemBackend.Name(),
						RunFile:  run,
						IsMeta:   false,
						IsSystem: true,
						Metadata: nil,
					}
				}
			}
		}
	} else if !errors.Is(err, os.ErrNotExist) {
		xlog.Warn("Failed to read system backends, proceeding with user-managed backends", "error", err)
	} else {
		xlog.Debug("No system backends found")
	}

	// User-managed backends and alias collection
	entries, err := os.ReadDir(systemState.Backend.BackendsPath)
	if err != nil {
		return nil, err
	}

	aliasGroups := make(map[string][]backendCandidate)
	metaMap := make(map[string]*BackendMetadata)

	for _, e := range entries {
		if !e.IsDir() {
			continue
		}
		dir := e.Name()
		run := filepath.Join(systemState.Backend.BackendsPath, dir, runFile)

		var metadata *BackendMetadata
		metadataPath := filepath.Join(systemState.Backend.BackendsPath, dir, metadataFile)
		if _, err := os.Stat(metadataPath); os.IsNotExist(err) {
			metadata = &BackendMetadata{Name: dir}
		} else {
			m, rerr := readBackendMetadata(filepath.Join(systemState.Backend.BackendsPath, dir))
			if rerr != nil {
				return nil, rerr
			}
			if m == nil {
				metadata = &BackendMetadata{Name: dir}
			} else {
				metadata = m
			}
		}

		metaMap[dir] = metadata

		// Concrete-backend entry
		if _, err := os.Stat(run); err == nil {
			backends[dir] = SystemBackend{
				Name:     dir,
				RunFile:  run,
				IsMeta:   false,
				Metadata: metadata,
			}
		}

		// Alias candidates
		if metadata.Alias != "" {
			aliasGroups[metadata.Alias] = append(aliasGroups[metadata.Alias], backendCandidate{name: dir, runFile: run})
		}

		// Meta backends indirection
		if metadata.MetaBackendFor != "" {
			backends[metadata.Name] = SystemBackend{
				Name:     metadata.Name,
				RunFile:  filepath.Join(systemState.Backend.BackendsPath, metadata.MetaBackendFor, runFile),
				IsMeta:   true,
				Metadata: metadata,
			}
		}
	}

	// Resolve aliases using system capability preferences
	tokens := systemState.BackendPreferenceTokens()
	for alias, cands := range aliasGroups {
		chosen := backendCandidate{}
		// Try preference tokens
		for _, t := range tokens {
			for _, c := range cands {
				if strings.Contains(strings.ToLower(c.name), t) && c.runFile != "" {
					chosen = c
					break
				}
			}
			if chosen.runFile != "" {
				break
			}
		}
		// Fallback: first runnable
		if chosen.runFile == "" {
			for _, c := range cands {
				if c.runFile != "" {
					chosen = c
					break
				}
			}
		}
		if chosen.runFile == "" {
			continue
		}
		md := metaMap[chosen.name]
		backends[alias] = SystemBackend{
			Name:     alias,
			RunFile:  chosen.runFile,
			IsMeta:   false,
			Metadata: md,
		}
	}

	return backends, nil
}
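
The alias resolution at the end of `ListSystemBackends` (walk the preference tokens in priority order, take the first candidate whose name contains the token, otherwise fall back to the first candidate with a run file) can be isolated as a small function. This is a sketch under assumptions: `candidate` and `pickCandidate` are hypothetical stand-ins, and the token values and directory names are invented for the example; the real tokens come from `systemState.BackendPreferenceTokens()`.

```go
package main

import (
	"fmt"
	"strings"
)

// candidate is a hypothetical stand-in for backendCandidate.
type candidate struct {
	name    string
	runFile string
}

// pickCandidate tries each preference token in order and returns the first
// candidate whose lowercased name contains it. If no token matches, it falls
// back to the first candidate that has a run file. The boolean is false when
// no candidate qualifies at all.
func pickCandidate(tokens []string, cands []candidate) (candidate, bool) {
	for _, t := range tokens {
		for _, c := range cands {
			if c.runFile != "" && strings.Contains(strings.ToLower(c.name), t) {
				return c, true
			}
		}
	}
	for _, c := range cands {
		if c.runFile != "" {
			return c, true
		}
	}
	return candidate{}, false
}

func main() {
	// Directory names and paths are made up for illustration.
	cands := []candidate{
		{name: "cpu-llama-cpp", runFile: "/backends/cpu-llama-cpp/run.sh"},
		{name: "cuda12-llama-cpp", runFile: "/backends/cuda12-llama-cpp/run.sh"},
	}

	chosen, ok := pickCandidate([]string{"cuda", "cpu"}, cands)
	fmt.Println(chosen.name, ok) // cuda12-llama-cpp true

	chosen, ok = pickCandidate([]string{"metal"}, cands)
	fmt.Println(chosen.name, ok) // cpu-llama-cpp true (fallback to first runnable)
}
```

Because the outer loop runs over tokens rather than candidates, token order decides the winner, not directory listing order; listing order only matters in the fallback.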

func RegisterBackends(systemState *system.SystemState, modelLoader *model.ModelLoader) error {
	backends, err := ListSystemBackends(systemState)
	if err != nil {
		return err
	}

	for _, backend := range backends {
		xlog.Debug("Registering backend", "name", backend.Name, "runFile", backend.RunFile)
		modelLoader.SetExternalBackend(backend.Name, backend.RunFile)
	}

	return nil
}