mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-16 20:52:08 -04:00
* fix(http): close 0.0.0.0/[::] SSRF bypass in /api/cors-proxy

The CORS proxy carried its own private-network blocklist (RFC 1918 + a handful of IPv6 ranges) instead of using the same classification as pkg/utils/urlfetch.go. The hand-rolled list missed 0.0.0.0/8 and ::/128, both of which Linux routes to localhost — so any user with FeatureMCP (default-on for new users) could reach LocalAI's own listener and any other service bound to 0.0.0.0:port via:

  GET /api/cors-proxy?url=http://0.0.0.0:8080/...
  GET /api/cors-proxy?url=http://[::]:8080/...

Replace the custom check with utils.IsPublicIP (Go stdlib IsLoopback / IsLinkLocalUnicast / IsPrivate / IsUnspecified, plus IPv4-mapped IPv6 unmasking) and add an upfront hostname rejection for localhost, *.local, and the cloud metadata aliases so split-horizon DNS can't paper over the IP check.

The IP-pinning DialContext is unchanged: the validated IP from the single resolution is reused for the connection, so DNS rebinding still cannot swap a public answer for a private one between validate and dial.

Regression tests cover 0.0.0.0, 0.0.0.0:PORT, [::], ::ffff:127.0.0.1, ::ffff:10.0.0.1, file://, gopher://, ftp://, localhost, 127.0.0.1, 10.0.0.1, 169.254.169.254, metadata.google.internal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): verify SHA before promoting temp file to final path

DownloadFileWithContext renamed the .partial file to its final name *before* checking the streamed SHA, so a hash mismatch returned an error but left the tampered file at filePath. Subsequent code that operated on filePath (a backend launcher, a YAML loader, a re-download that finds the file already present and skips) would consume the attacker-supplied bytes.

Reorder: verify the streamed hash first, remove the .partial on mismatch, then rename. The streamed hash is computed during io.Copy so no second read is needed.
While here, raise the empty-SHA case from a Debug log to a Warn so "this download had no integrity check" is visible at the default log level. Backend installs currently pass through with no digest; the warning makes that footprint observable without changing behaviour.

Regression test asserts os.IsNotExist on the destination after a deliberate SHA mismatch.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(auth): require email_verified for OIDC admin promotion

extractOIDCUserInfo read the ID token's "email" claim but never inspected "email_verified". With LOCALAI_ADMIN_EMAIL set, an attacker who could register on the configured OIDC IdP under that email (some IdPs accept self-supplied unverified emails) inherited admin role:

- first login: AssignRole(tx, email, adminEmail) → RoleAdmin
- re-login: MaybePromote(db, user, adminEmail) → flip to RoleAdmin

Add EmailVerified to oauthUserInfo, parse email_verified from the OIDC claims (default false on absence so an IdP that omits the claim cannot short-circuit the gate), and substitute "" for the role-decision email when verified=false via emailForRoleDecision. The user record still stores the unverified email for display.

GitHub's path defaults EmailVerified=true: GitHub only returns a public profile email after verification, and fetchGitHubPrimaryEmail explicitly filters to Verified=true.

Regression tests cover both the helper contract and integration with AssignRole, including the bootstrap "first user" branch that would otherwise mask the gate.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(cli): refuse public bind when no auth backend is configured

When neither an auth DB nor a static API key is set, the auth middleware passes every request through. That is fine for a developer laptop, a home LAN, or a Tailnet — the network itself is the trust boundary.
It is not fine on a public IP, where every model install, settings change, and admin endpoint becomes reachable from the internet. Refuse to start in that exact configuration.

Loopback, RFC 1918, RFC 4193 ULA, link-local, and RFC 6598 CGNAT (Tailscale's default range) all count as trusted; wildcard binds (`:port`, `0.0.0.0`, `[::]`) are accepted only when every host interface is in one of those ranges. Hostnames are resolved and treated as trusted only when every answer is.

A new --allow-insecure-public-bind / LOCALAI_ALLOW_INSECURE_PUBLIC_BIND flag opts out for deployments that gate access externally (a reverse proxy enforcing auth, a mesh ACL, etc.). The error message lists this plus the three constructive alternatives (bind a private interface, enable --auth, set --api-keys).

The interface enumeration goes through a package-level interfaceAddrsFn var so tests can simulate cloud-VM, home-LAN, Tailscale-only, and enumeration-failure topologies without poking at the real network stack.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): regression-test the localai_assistant admin gate

ChatEndpoint already rejects metadata.localai_assistant=true from a non-admin caller, but the gate was open-coded inline with no direct test coverage. The chat route is FeatureChat-gated (default-on), and the assistant's in-process MCP server can install/delete models and edit configs — the wrong handler change would silently turn the LLM into a confused deputy.

Extract the gate into requireAssistantAccess(c, authEnabled) and pin its behaviour: auth disabled is a no-op, unauthenticated is 403, RoleUser is 403, RoleAdmin and the synthetic legacy-key admin are admitted. No behaviour change in the production path.
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): assert every API route is auth-classified

The auth middleware classifies path prefixes (/api/, /v1/, /models/, etc.) as protected and treats anything else as a static-asset passthrough. A new endpoint shipped under a brand-new prefix — or a new path that simply isn't on the prefix allowlist — would be reachable anonymously.

Walk every route registered by API() with auth enabled and a fresh in-memory database (no users, no keys), and assert each API-prefixed route returns 401 / 404 / 405 to an anonymous request. Public surfaces (/api/auth/*, /api/branding, /api/node/* token-authenticated routes, /healthz, branding asset server, generated-content server, static assets) are explicit allowlist entries with comments justifying them.

Build-tagged 'auth' so it runs against the SQLite-backed auth DB (matches the existing auth suite).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin agent endpoint per-user isolation contract

agents.go's getUserID / effectiveUserID / canImpersonateUser / wantsAllUsers helpers are the single trust boundary for cross-user access on agent, agent-jobs, collections, and skills routes. A regression there is the difference between "regular user reads their own data" and "regular user reads anyone's data via ?user_id=victim".

Lock in the contract:

- effectiveUserID ignores ?user_id= for unauthenticated and RoleUser
- effectiveUserID honours it for RoleAdmin and ProviderAgentWorker
- wantsAllUsers requires admin AND the literal "true" string
- canImpersonateUser is admin OR agent-worker, never plain RoleUser

No production change — this commit only adds tests.
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(downloader): drop redundant stat in removePartialFile

The stat-then-remove pattern is a TOCTOU window and a wasted syscall — os.Remove already returns ErrNotExist for the missing-file case, so trust that and treat it as a no-op.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): redact secrets from trace buffer and distribution-token logs

The /api/traces buffer captured Authorization, Cookie, Set-Cookie, and API-key headers verbatim from every request when tracing was enabled. The endpoint is admin-only but the buffer is reachable via any heap-style introspection and the captured tokens otherwise outlive the request. Strip those header values at capture time. Body redaction is left to a follow-up — the prompts are usually the operator's own and JSON-walking is invasive.

Distribution tokens were also logged in plaintext from core/explorer/discovery.go; logs forward to syslog/journald and outlive the token. Redact those to a short prefix/suffix instead.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): rate-limit OAuth callbacks separately from password endpoints

The shared 5/min/IP limit on auth endpoints is right for password-style flows but too tight for OAuth callbacks: corporate SSO funnels many real users through one outbound IP and would trip the limit. Add a separate 60/min/IP limiter for /api/auth/{github,oidc}/callback so callbacks are bounded against floods without breaking shared-IP deployments.
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(gallery): verify backend tarball sha256 when set in gallery entry

GalleryBackend gained an optional sha256 field; the install path now threads it through to the existing downloader hash-verify (which already streams, verifies, and rolls back on mismatch). Galleries without sha256 keep working; the empty-SHA path still emits the existing "downloading without integrity check" warning.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(http): pin CSRF coverage on multipart endpoints

The CSRF middleware in app.go is global (e.Use) so it covers every multipart upload route — branding assets, fine-tune datasets, audio transforms, agent collections. Pin that contract: cross-site multipart POSTs are rejected; same-origin / same-site / API-key clients are not. Also pins the SameSite=Lax fallback path the skipper relies on when Sec-Fetch-Site is absent.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): XSS hardening — CSP headers, safe href, base-href escape, SVG sandbox

Several closely related XSS-prevention changes spanning the SPA shell, the React UI, and the branding asset server:

- New SecurityHeaders middleware sets CSP, X-Content-Type-Options, X-Frame-Options, and Referrer-Policy on every response. The CSP keeps script-src permissive because the Vite bundle relies on inline + eval'd scripts; tightening that requires moving to a nonce-based policy.
- The <base href> injection in the SPA shell interpolated attacker-controllable Host / X-Forwarded-Host headers without escaping them — a single quote in the host header broke out of the attribute. Pass through SecureBaseHref (html.EscapeString).
- Three React sinks rendering untrusted content via dangerouslySetInnerHTML switch to text-node rendering with whiteSpace: pre-wrap: user message bodies in Chat.jsx and AgentChat.jsx, and the agent activity log in AgentChat.jsx. The hand-rolled escape on the agent user-message variant is replaced by the same plain-text path.
- New safeHref util collapses non-allowlisted URI schemes (most importantly javascript:) to '#'. Applied to gallery `<a href={url}>` links in Models / Backends / Manage and to canvas artifact links — these come from gallery JSON or assistant tool calls and must be treated as untrusted.
- The branding asset server attaches a sandbox CSP plus same-origin CORP to .svg responses. The React UI loads logos via <img>, but the same URL is also reachable via direct navigation; this prevents script execution if a hostile SVG slipped past upload validation.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(http): bound HTTP server with read-header and idle timeouts

A net/http server with no timeouts is trivially Slowloris-able and leaks idle keep-alive connections. Set ReadHeaderTimeout (30s) to plug the slow-headers attack and IdleTimeout (120s) to cap keep-alive sockets.

ReadTimeout and WriteTimeout stay at 0 because request bodies can be multi-GB model uploads and SSE / chat completions stream for many minutes; operators who need tighter per-request bounds should terminate slow clients at a reverse proxy.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* test(auth): pin PUT /api/auth/profile field-tampering contract

The handler uses an explicit local body struct (only name and avatar_url) plus a gorm Updates(map) with a column allowlist, so an attacker posting {"role":"admin","email":"...","password_hash":"..."} can't mass-assign those fields.
Lock that down with a regression test so a future "let's just c.Bind(&user)" refactor breaks loudly.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(services): strip directory components from multipart upload filenames

UploadDataset and UploadToCollectionForUser took the raw multipart file.Filename and joined it into a destination path. The fine-tune upload was incidentally safe because of a UUID prefix that fused any leading '..' to a literal segment, but the protection is fragile. UploadToCollectionForUser handed the filename to a vendored backend without sanitising at all.

Strip to filepath.Base at both boundaries and reject the trivial unsafe values ("", ".", "..", "/").

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): validate persisted MCP server entries on load

localStorage is shared across same-origin pages; an XSS that lands once can poison persisted MCP server config to attempt header injection or to feed a non-http URL into the fetch path on subsequent loads.

Validate every entry: types must match, URL must parse with http(s) scheme, header keys/values must be control-char-free. Drop anything that doesn't fit.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): close X-Forwarded-Prefix open redirect

The reverse-proxy support concatenated X-Forwarded-Prefix into the redirect target without validation, so a forged header value of "//evil.com" turned the SPA-shell redirect helper at /, /browse, and /browse/* into a 301 to //evil.com/app. The path-strip middleware had the same shape on its prefix-trailing-slash redirect.

Add SafeForwardedPrefix at the middleware boundary: must start with a single '/', no protocol-relative '//' opener, no scheme, no backslash, no control characters. Apply at both consumers; misconfig trips the validator and the header is dropped.
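The prefix rules listed above can be illustrated with a small validator. This is a sketch of the stated rules only; the function name safePrefix, its (string, bool) return shape, and the use of "://" as the scheme test are assumptions and may differ from the real SafeForwardedPrefix.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// safePrefix (hypothetical) returns the prefix unchanged and true when it is
// safe to splice into a redirect target, or "" and false when the header
// should be dropped.
func safePrefix(p string) (string, bool) {
	// Must start with exactly one '/', which also rules out "https:..." forms.
	if p == "" || p[0] != '/' || strings.HasPrefix(p, "//") {
		return "", false
	}
	// No embedded scheme and no backslash (some browsers treat '\' like '/').
	if strings.Contains(p, "://") || strings.ContainsRune(p, '\\') {
		return "", false
	}
	// No control characters, which would allow header or log injection.
	for _, r := range p {
		if unicode.IsControl(r) {
			return "", false
		}
	}
	return p, true
}

func main() {
	for _, candidate := range []string{"/localai", "//evil.com", "/\\evil.com", "https://evil.com", "/a\r\nb"} {
		_, ok := safePrefix(candidate)
		fmt.Printf("%q accepted=%v\n", candidate, ok)
	}
}
```

Rejecting the value outright, instead of trying to repair it, keeps the behaviour predictable: a misconfigured proxy simply loses the prefix.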
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(http): refuse wildcard CORS when LOCALAI_CORS=true with empty allowlist

When LOCALAI_CORS=true but LOCALAI_CORS_ALLOW_ORIGINS was empty, Echo's CORSWithConfig saw an empty allow-list and fell back to its default AllowOrigins=["*"]. An operator who flipped the strict-CORS feature flag without populating the list got the opposite of what they asked for.

Echo never sets Allow-Credentials: true so this isn't directly exploitable (cookies aren't sent under wildcard CORS), but the misconfiguration trap is worth closing. Skip the registration and warn.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): zxcvbn password strength check with user-acknowledged override

The previous policy was len < 8, which let through "Password1" and the rest of the credential-stuffing corpus. LocalAI has no second factor yet, so the bar needs to sit higher.

Add ValidatePasswordStrength using github.com/timbutler/zxcvbn (an actively-maintained fork of the trustelem port; v1.0.4, April 2024):

- min 12 chars, max 72 (bcrypt's truncation point)
- reject NUL bytes (some bcrypt callers truncate at the first NUL)
- require zxcvbn score >= 3 ("safely unguessable, ~10^8 guesses to break"); the hint list ["localai", "local-ai", "admin"] penalises passwords built from the app's own branding

zxcvbn produces false positives sometimes (a strong-looking password that happens to match a dictionary word) and operators occasionally need to set a known-weak password (kiosk demos, CI rigs). Add an acknowledgement path: PasswordPolicy{AllowWeak: true} skips the entropy check while still enforcing the hard rules. The structured PasswordErrorResponse marks weak-password rejections as Overridable so the UI can surface a "use this anyway" checkbox.
Wired through register, self-service password change, and admin password reset on both the server and the React UI.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): drop HTML5 minLength on new-password inputs

minLength={12} on the new-password input let the browser block the form submit silently before any JS or network call ran. The browser focused the field, showed a brief native tooltip, and that was that — no toast, no fetch, no clue. Reproducible by typing fewer than 12 chars on the second password change of a session.

The JS-level length check in handleSubmit already shows a toast and the server rejects with a structured error, so the HTML5 attribute was redundant defence anyway. Drop it.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): bundle Geist fonts locally instead of fetching from Google

The new CSP correctly refused to apply styles from fonts.googleapis.com because style-src is locked to 'self' and 'unsafe-inline'. Loosening the CSP would defeat its purpose; the right fix is to stop reaching out to a third-party CDN for fonts on every page load.

Add @fontsource-variable/geist and @fontsource-variable/geist-mono as npm deps and import them once at boot. Drop the <link rel="preconnect"> and external stylesheet from index.html.

Side benefit: no third-party tracking via Referer / IP on every UI load, no failure mode when offline / behind a captive portal.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(react-ui): refresh i18n strings to reflect 12-char password minimum

The translations still said "at least 8 characters" everywhere — the client-side toast on a too-short password change told the user the wrong floor.
Update tooShort and newPasswordPlaceholder / newPasswordDescription across all five locales (en, es, it, de, zh-CN) to match the real ValidatePasswordStrength rule.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(auth): make password length-floor overridable like the entropy check

The 12-char minimum was a policy choice, not a technical invariant — only "non-empty", "<= 72 bytes", and "no NUL bytes" are real bcrypt constraints. Treating length-12 as a hard rule was inconsistent with the entropy check (already overridable) and friction for use cases where the account is just a name on a session, not a security boundary (single-user kiosk, CI rig, lab demo).

Restructure ValidatePasswordStrength:

- Hard rules (always enforced): non-empty, <= MaxPasswordLength, no NUL byte
- Policy rules (skipped when AllowWeak=true): length >= 12, zxcvbn score >= 3

PasswordError now marks password_too_short as Overridable too. The React forms generalised from `error_code === 'password_too_weak'` to `overridable === true`, and the JS-side preflight length checks were removed (server is source of truth, returns the same checkbox flow).

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
530 lines
19 KiB
Go
package http

import (
	"embed"
	"errors"
	"fmt"
	"io/fs"
	"mime"
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"time"

	"github.com/labstack/echo/v4"
	"github.com/labstack/echo/v4/middleware"

	"github.com/mudler/LocalAI/core/http/auth"
	"github.com/mudler/LocalAI/core/http/endpoints/localai"

	httpMiddleware "github.com/mudler/LocalAI/core/http/middleware"
	"github.com/mudler/LocalAI/core/http/routes"

	"github.com/mudler/LocalAI/core/application"
	"github.com/mudler/LocalAI/core/schema"
	"github.com/mudler/LocalAI/core/services/finetune"
	"github.com/mudler/LocalAI/core/services/galleryop"
	"github.com/mudler/LocalAI/core/services/monitoring"
	"github.com/mudler/LocalAI/core/services/nodes"
	"github.com/mudler/LocalAI/core/services/quantization"

	"github.com/mudler/xlog"
)

// Embed a directory
//
//go:embed static/*
var embedDirStatic embed.FS

// Embed React UI build output
//
//go:embed react-ui/dist/*
var reactUI embed.FS

var quietPaths = []string{"/api/operations", "/api/resources", "/healthz", "/readyz"}

// @title LocalAI API
// @version 2.0.0
// @description The LocalAI Rest API.
// @termsOfService
// @contact.name LocalAI
// @contact.url https://localai.io
// @license.name MIT
// @license.url https://raw.githubusercontent.com/mudler/LocalAI/master/LICENSE
// @BasePath /
// @schemes http https
// @securityDefinitions.apikey BearerAuth
// @in header
// @name Authorization
// @tag.name inference
// @tag.description Chat completions, text completions, edits, and responses (OpenAI-compatible)
// @tag.name embeddings
// @tag.description Vector embeddings (OpenAI-compatible)
// @tag.name audio
// @tag.description Text-to-speech, transcription, voice activity detection, sound generation
// @tag.name images
// @tag.description Image generation and inpainting
// @tag.name video
// @tag.description Video generation from prompts
// @tag.name detection
// @tag.description Object detection in images
// @tag.name tokenize
// @tag.description Tokenization and token metrics
// @tag.name models
// @tag.description Model gallery browsing, installation, deletion, and listing
// @tag.name backends
// @tag.description Backend gallery browsing, installation, deletion, and listing
// @tag.name config
// @tag.description Model configuration metadata, autocomplete, PATCH updates, VRAM estimation
// @tag.name monitoring
// @tag.description Prometheus metrics, backend status, system information
// @tag.name mcp
// @tag.description Model Context Protocol — tool-augmented chat with MCP servers
// @tag.name agent-jobs
// @tag.description Agent task and job management
// @tag.name p2p
// @tag.description Peer-to-peer networking nodes and tokens
// @tag.name rerank
// @tag.description Document reranking
// @tag.name instructions
// @tag.description API instruction discovery — browse instruction areas and get endpoint guides

func API(application *application.Application) (*echo.Echo, error) {
	e := echo.New()

	// Set body limit
	if application.ApplicationConfig().UploadLimitMB > 0 {
		e.Use(middleware.BodyLimit(fmt.Sprintf("%dM", application.ApplicationConfig().UploadLimitMB)))
	}

	// SPA fallback handler, set later when React UI is available
	var spaFallback func(echo.Context) error

	// Set error handler
	if !application.ApplicationConfig().OpaqueErrors {
		e.HTTPErrorHandler = func(err error, c echo.Context) {
			code := http.StatusInternalServerError
			var he *echo.HTTPError
			if errors.As(err, &he) {
				code = he.Code
			}

			// Handle 404 errors: serve React SPA for HTML requests, JSON otherwise
			if code == http.StatusNotFound {
				if spaFallback != nil {
					accept := c.Request().Header.Get("Accept")
					contentType := c.Request().Header.Get("Content-Type")
					if strings.Contains(accept, "text/html") && !strings.Contains(contentType, "application/json") {
						spaFallback(c)
						return
					}
				}
				notFoundHandler(c)
				return
			}

			// Send custom error page
			c.JSON(code, schema.ErrorResponse{
				Error: &schema.APIError{Message: err.Error(), Code: code},
			})
		}
	} else {
		e.HTTPErrorHandler = func(err error, c echo.Context) {
			code := http.StatusInternalServerError
			var he *echo.HTTPError
			if errors.As(err, &he) {
				code = he.Code
			}
			c.NoContent(code)
		}
	}

	// Set renderer
	e.Renderer = renderEngine()

	// Hide banner
	e.HideBanner = true
	e.HidePort = true

	// Middleware - StripPathPrefix must be registered early as it uses Rewrite which runs before routing
	e.Pre(httpMiddleware.StripPathPrefix())

	e.Pre(middleware.RemoveTrailingSlash())

	if application.ApplicationConfig().MachineTag != "" {
		e.Use(func(next echo.HandlerFunc) echo.HandlerFunc {
			return func(c echo.Context) error {
				c.Response().Header().Set("Machine-Tag", application.ApplicationConfig().MachineTag)
				return next(c)
			}
		})
	}

	// Security headers (CSP, X-Content-Type-Options, X-Frame-Options,
	// Referrer-Policy). Set early so every response — including 404s and
	// errors — picks them up.
	e.Use(httpMiddleware.SecurityHeaders())

	// Custom logger middleware using xlog
	e.Use(func(next echo.HandlerFunc) echo.HandlerFunc {
		return func(c echo.Context) error {
			req := c.Request()
			res := c.Response()
			err := next(c)

			// Echo's central HTTPErrorHandler runs *after* this middleware
			// returns, so res.Status still reads the default 200 here when a
			// handler returned an error without writing a response. Mirror
			// echo.DefaultHTTPErrorHandler's status derivation so the access
			// log reflects the status the client actually receives — without
			// this, every silent handler error logs as 200.
			status := res.Status
			if err != nil && !res.Committed {
				status = http.StatusInternalServerError
				var he *echo.HTTPError
				if errors.As(err, &he) {
					status = he.Code
				}
			}

			// Fix for #7989: Reduce log verbosity of Web UI polling, resources API, and health checks
			// These paths are logged at DEBUG level (hidden by default) instead of INFO.
			isQuietPath := false
			for _, path := range quietPaths {
				if req.URL.Path == path {
					isQuietPath = true
					break
				}
			}

			if isQuietPath && status == 200 {
				xlog.Debug("HTTP request", "method", req.Method, "path", req.URL.Path, "status", status)
			} else {
				xlog.Info("HTTP request", "method", req.Method, "path", req.URL.Path, "status", status)
			}
			return err
		}
	})

	// Recover middleware
	if !application.ApplicationConfig().Debug {
		e.Use(middleware.Recover())
	}

	// Metrics middleware
	if !application.ApplicationConfig().DisableMetrics {
		metricsService, err := monitoring.NewLocalAIMetricsService()
		if err != nil {
			return nil, err
		}

		if metricsService != nil {
			e.Use(localai.LocalAIMetricsAPIMiddleware(metricsService))
			e.Server.RegisterOnShutdown(func() {
				metricsService.Shutdown()
			})
		}
	}

	// Health Checks should always be exempt from auth, so register these first
	routes.HealthRoutes(e)

	// Build auth middleware: use the new auth.Middleware when auth is enabled or
	// as a unified replacement for the legacy key-auth middleware.
	authMiddleware := auth.Middleware(application.AuthDB(), application.ApplicationConfig())

	// Favicon handler
	e.GET("/favicon.svg", func(c echo.Context) error {
		data, err := embedDirStatic.ReadFile("static/favicon.svg")
		if err != nil {
			return c.NoContent(http.StatusNotFound)
		}
		c.Response().Header().Set("Content-Type", "image/svg+xml")
		return c.Blob(http.StatusOK, "image/svg+xml", data)
	})

	// Static files - use fs.Sub to create a filesystem rooted at "static"
	staticFS, err := fs.Sub(embedDirStatic, "static")
	if err != nil {
		return nil, fmt.Errorf("failed to create static filesystem: %w", err)
	}
	e.StaticFS("/static", staticFS)

	// Generated content directories
	if application.ApplicationConfig().GeneratedContentDir != "" {
		os.MkdirAll(application.ApplicationConfig().GeneratedContentDir, 0750)
		audioPath := filepath.Join(application.ApplicationConfig().GeneratedContentDir, "audio")
		imagePath := filepath.Join(application.ApplicationConfig().GeneratedContentDir, "images")
		videoPath := filepath.Join(application.ApplicationConfig().GeneratedContentDir, "videos")

		os.MkdirAll(audioPath, 0750)
		os.MkdirAll(imagePath, 0750)
		os.MkdirAll(videoPath, 0750)

		e.Static("/generated-audio", audioPath)
		e.Static("/generated-images", imagePath)
		e.Static("/generated-videos", videoPath)
	}

	// Initialize usage recording when auth DB is available
	if application.AuthDB() != nil {
		httpMiddleware.InitUsageRecorder(application.AuthDB())
	}

	// Auth is applied to _all_ endpoints. Filtering out endpoints to bypass is
	// the role of the exempt-path logic inside the middleware.
	e.Use(authMiddleware)

	// Feature and model access control (after auth middleware, before routes)
	if application.AuthDB() != nil {
		e.Use(auth.RequireRouteFeature(application.AuthDB()))
		e.Use(auth.RequireModelAccess(application.AuthDB()))
		e.Use(auth.RequireQuota(application.AuthDB()))
	}

	// CORS middleware. When CORS=true the operator must also specify the
	// allowed origins; an empty allowlist would otherwise let Echo fall back
	// to AllowOrigins=["*"], which is almost never what someone enabling
	// "strict CORS" intended.
	if application.ApplicationConfig().CORS {
		if application.ApplicationConfig().CORSAllowOrigins == "" {
			xlog.Warn("LOCALAI_CORS=true but LOCALAI_CORS_ALLOW_ORIGINS is empty; refusing to register a wildcard CORS policy. Set the allowlist or unset LOCALAI_CORS.")
		} else {
			corsConfig := middleware.CORSConfig{
				AllowOrigins: strings.Split(application.ApplicationConfig().CORSAllowOrigins, ","),
			}
			e.Use(middleware.CORSWithConfig(corsConfig))
		}
	} else {
		e.Use(middleware.CORS())
	}

	// CSRF middleware (enabled by default, disable with LOCALAI_DISABLE_CSRF=true)
	//
	// Protection relies on Echo's Sec-Fetch-Site header check (supported by all
	// modern browsers). The legacy cookie+token approach is removed because
	// Echo's Sec-Fetch-Site short-circuit never sets the cookie, so the frontend
	// could never read a token to send back.
	if !application.ApplicationConfig().DisableCSRF {
		xlog.Debug("Enabling CSRF middleware (Sec-Fetch-Site mode)")
		e.Use(middleware.CSRFWithConfig(middleware.CSRFConfig{
			Skipper: func(c echo.Context) bool {
				// Skip CSRF for API clients using auth headers (may be cross-origin)
				if c.Request().Header.Get("Authorization") != "" {
					return true
				}
				if c.Request().Header.Get("x-api-key") != "" || c.Request().Header.Get("xi-api-key") != "" {
					return true
				}
				// Skip when Sec-Fetch-Site header is absent (older browsers, reverse
				// proxies that strip the header). The SameSite=Lax cookie attribute
				// provides baseline CSRF protection for these clients.
				if c.Request().Header.Get("Sec-Fetch-Site") == "" {
					return true
				}
				return false
			},
			// Allow same-site requests (subdomains / different ports) in addition
			// to same-origin which Echo already permits by default.
			AllowSecFetchSiteFunc: func(c echo.Context) (bool, error) {
				secFetchSite := c.Request().Header.Get("Sec-Fetch-Site")
				if secFetchSite == "same-site" {
					return true, nil
				}
				// cross-site: block
				return false, nil
			},
		}))
	}

	// Admin middleware: enforces admin role when auth is enabled, no-op otherwise
	var adminMiddleware echo.MiddlewareFunc
	if application.AuthDB() != nil {
		adminMiddleware = auth.RequireAdmin()
	} else {
		adminMiddleware = auth.NoopMiddleware()
	}

	// Feature middlewares: per-feature access control
	agentsMw := auth.RequireFeature(application.AuthDB(), auth.FeatureAgents)
	skillsMw := auth.RequireFeature(application.AuthDB(), auth.FeatureSkills)
	collectionsMw := auth.RequireFeature(application.AuthDB(), auth.FeatureCollections)
	mcpJobsMw := auth.RequireFeature(application.AuthDB(), auth.FeatureMCPJobs)

	requestExtractor := httpMiddleware.NewRequestExtractor(application.ModelConfigLoader(), application.ModelLoader(), application.ApplicationConfig())

	// Register auth routes (login, callback, API keys, user management)
	routes.RegisterAuthRoutes(e, application)

	routes.RegisterElevenLabsRoutes(e, requestExtractor, application.ModelConfigLoader(), application.ModelLoader(), application.ApplicationConfig())

	// Create opcache for tracking UI operations (used by both UI and LocalAI routes)
	var opcache *galleryop.OpCache
	if !application.ApplicationConfig().DisableWebUI {
		opcache = galleryop.NewOpCache(application.GalleryService())
	}

	mcpMw := auth.RequireFeature(application.AuthDB(), auth.FeatureMCP)
	routes.RegisterLocalAIRoutes(e, requestExtractor, application.ModelConfigLoader(), application.ModelLoader(), application.ApplicationConfig(), application.GalleryService(), opcache, application.TemplatesEvaluator(), application, adminMiddleware, mcpJobsMw, mcpMw)
	routes.RegisterAgentPoolRoutes(e, application, agentsMw, skillsMw, collectionsMw)
	// Fine-tuning routes
	fineTuningMw := auth.RequireFeature(application.AuthDB(), auth.FeatureFineTuning)
	ftService := finetune.NewFineTuneService(
		application.ApplicationConfig(),
		application.ModelLoader(),
		application.ModelConfigLoader(),
	)
	if d := application.Distributed(); d != nil {
		ftService.SetNATSClient(d.Nats)
		if d.DistStores != nil && d.DistStores.FineTune != nil {
			ftService.SetFineTuneStore(d.DistStores.FineTune)
		}
	}
	routes.RegisterFineTuningRoutes(e, ftService, application.ApplicationConfig(), fineTuningMw)

	// Quantization routes
	quantizationMw := auth.RequireFeature(application.AuthDB(), auth.FeatureQuantization)
	qService := quantization.NewQuantizationService(
		application.ApplicationConfig(),
		application.ModelLoader(),
		application.ModelConfigLoader(),
	)
	routes.RegisterQuantizationRoutes(e, qService, application.ApplicationConfig(), quantizationMw)

	// Node management routes (distributed mode)
	distCfg := application.ApplicationConfig().Distributed
	var registry *nodes.NodeRegistry
	var remoteUnloader nodes.NodeCommandSender
	if d := application.Distributed(); d != nil {
		registry = d.Registry
		if d.Router != nil {
			remoteUnloader = d.Router.Unloader()
		}
	}
	routes.RegisterNodeSelfServiceRoutes(e, registry, distCfg.RegistrationToken, distCfg.AutoApproveNodes, application.AuthDB(), application.ApplicationConfig().Auth.APIKeyHMACSecret)
	routes.RegisterNodeAdminRoutes(e, registry, remoteUnloader, adminMiddleware, application.AuthDB(), application.ApplicationConfig().Auth.APIKeyHMACSecret, application.ApplicationConfig().Distributed.RegistrationToken)

	// Distributed SSE routes (job progress + agent events via NATS)
	if d := application.Distributed(); d != nil {
		if d.Dispatcher != nil {
			e.GET("/api/agent/jobs/:id/progress", d.Dispatcher.SSEHandler(), mcpJobsMw)
		}
		if d.AgentBridge != nil {
			e.GET("/api/agents/:name/sse/distributed", d.AgentBridge.SSEHandler(), agentsMw)
		}
	}

	routes.RegisterOpenAIRoutes(e, requestExtractor, application)
	routes.RegisterAnthropicRoutes(e, requestExtractor, application)
	routes.RegisterOpenResponsesRoutes(e, requestExtractor, application)
	routes.RegisterOllamaRoutes(e, requestExtractor, application)
	if application.ApplicationConfig().OllamaAPIRootEndpoint {
		routes.RegisterOllamaRootEndpoint(e)
	}
	if !application.ApplicationConfig().DisableWebUI {
		routes.RegisterUIAPIRoutes(e, application.ModelConfigLoader(), application.ModelLoader(), application.ApplicationConfig(), application.GalleryService(), opcache, application, adminMiddleware)
		routes.RegisterUIRoutes(e, application.ModelConfigLoader(), application.ApplicationConfig(), application.GalleryService(), adminMiddleware)

		// Serve React SPA from / with SPA fallback via 404 handler
		reactFS, fsErr := fs.Sub(reactUI, "react-ui/dist")
		if fsErr != nil {
			xlog.Warn("React UI not available (build with 'make core/http/react-ui/dist')", "error", fsErr)
		} else {
			serveIndex := func(c echo.Context) error {
				indexHTML, err := reactUI.ReadFile("react-ui/dist/index.html")
				if err != nil {
					return c.String(http.StatusNotFound, "React UI not built")
				}
				// Inject <base href> for reverse-proxy support; baseURL comes
				// from attacker-controllable Host / X-Forwarded-Host headers.
				baseURL := httpMiddleware.BaseURL(c)
				if baseURL != "" {
					baseTag := `<base href="` + httpMiddleware.SecureBaseHref(baseURL) + `" />`
					indexHTML = []byte(strings.Replace(string(indexHTML), "<head>", "<head>\n "+baseTag, 1))
				}
				return c.HTMLBlob(http.StatusOK, indexHTML)
			}

			// Enable SPA fallback in the 404 handler for client-side routing
			spaFallback = serveIndex

			// Serve React SPA at /app
			e.GET("/app", serveIndex)
			e.GET("/app/*", serveIndex)

			// prefixRedirect performs a redirect that preserves X-Forwarded-Prefix
			// for reverse-proxy support. The prefix is forgeable on misconfigured
			// proxy chains, so reject anything that isn't a same-origin path.
			prefixRedirect := func(c echo.Context, target string) error {
				if prefix, ok := httpMiddleware.SafeForwardedPrefix(c.Request().Header.Get("X-Forwarded-Prefix")); ok {
					target = strings.TrimSuffix(prefix, "/") + target
				}
				return c.Redirect(http.StatusMovedPermanently, target)
			}

			// Redirect / to /app
			e.GET("/", func(c echo.Context) error {
				return prefixRedirect(c, "/app")
			})

			// Backward compatibility: redirect /browse/* to /app/*
			e.GET("/browse", func(c echo.Context) error {
				return prefixRedirect(c, "/app")
			})
			e.GET("/browse/*", func(c echo.Context) error {
				p := c.Param("*")
				return prefixRedirect(c, "/app/"+p)
			})

			// Serve React static assets (JS, CSS, etc.) and i18n locale JSONs
			// from the embedded React build.
			serveReactSubdir := func(subdir string) echo.HandlerFunc {
				return func(c echo.Context) error {
					p := subdir + "/" + c.Param("*")
					f, err := reactFS.Open(p)
					if err == nil {
						defer f.Close()
						stat, statErr := f.Stat()
						if statErr == nil && !stat.IsDir() {
							contentType := mime.TypeByExtension(filepath.Ext(p))
							if contentType == "" {
								contentType = echo.MIMEOctetStream
							}
							return c.Stream(http.StatusOK, contentType, f)
						}
					}
					return echo.NewHTTPError(http.StatusNotFound)
				}
			}
			e.GET("/assets/*", serveReactSubdir("assets"))
			e.GET("/locales/*", serveReactSubdir("locales"))
		}
	}
	routes.RegisterJINARoutes(e, requestExtractor, application.ModelConfigLoader(), application.ModelLoader(), application.ApplicationConfig())

	// Note: 404 handling is done via HTTPErrorHandler above, no need for catch-all route

	// HTTP server timeouts.
	//
	//   - ReadHeaderTimeout: bounds the slow-headers Slowloris case. 30s is
	//     enough for a real client on a poor connection but cuts off a
	//     drip-feeding attacker.
	//   - IdleTimeout: bounds idle keep-alive connections.
	//
	// We deliberately leave ReadTimeout and WriteTimeout at 0:
	//   - Request bodies can be multi-GB model/dataset uploads.
	//   - Chat-completion and SSE responses can stream for many minutes.
	// Operators who need stricter limits should front the server with a
	// reverse proxy that terminates slow clients per-request.
	e.Server.ReadHeaderTimeout = 30 * time.Second
	e.Server.IdleTimeout = 120 * time.Second

	// Log a message when the server shuts down
	e.Server.RegisterOnShutdown(func() {
		xlog.Info("LocalAI API server shutting down")
	})

	return e, nil
}