LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-31 04:00:05 -04:00

Author	SHA1	Message	Date
Siddharth More	c9a1a7e6a0	UI: add 'Fits in my GPU' filter on Install Models (#10017 ) * feat(ui): add GPU fit filter on models install page * Delete docs/vram-fits-filter-backend-optionals.md Signed-off-by: Siddharth More <siddimore@gmail.com> --------- Signed-off-by: Siddharth More <siddimore@gmail.com>	2026-05-27 15:17:44 +02:00
Richard Palethorpe	6a80e23733	feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802 ) Add a routing middleware stack and a cloud-proxy backend. * cloud-proxy: a Go gRPC backend that forwards OpenAI- and Anthropic-shaped chat requests to upstream providers, with an optional translate mode (OpenAI request -> Anthropic /v1/messages -> OpenAI response) and full tool-calling support. * routing: admission control, content-aware model routing (embedding cache + classifier + rerank + Arch-Router score), PII detection/redaction (regex + NER) with streaming filter and OpenAI/Anthropic adapters, and a per-user/per-key billing recorder backed by GORM or in-memory storage. * middleware: UsageMiddleware records usage via the billing recorder, plus admission, route-model, usage-stamp and trace middlewares. * observability: BackendTrace ring buffer stores full request bodies (capped), MITM proxy emits structured trace events, and router classifier decisions surface at /api/router/decide. * gallery: Arch-Router-1.5B (Q4_K_M and Q8_0). * UI: cloud-proxy model-editor fields, classifier system-prompt and score-normalization config, and a Traces page rendering request bodies. Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-25 09:28:27 +02:00
LocalAI [bot]	f0cb02afb8	feat(usage): attribute Sources rows to user accounts in admin view (#9935 ) The merged feature (#9920) let admins see per-API-key and per-source totals but did not surface which user owned each key, and lumped every user's Web UI traffic into a single global Web UI row. This makes the admin Sources tab properly per-user attributable: - KeyTotal gains UserID + UserName, populated from the snapshot the usage middleware already records. The by_key roll-up now groups by (api_key_id, api_key_name, user_id, user_name). - New SourceTotals.ByUserSource roll-up groups (source, user_id, user_name) for sources without a key identity (web, legacy). Only populated on the admin path (includeLegacy=true); the non-admin endpoint stays unchanged for backwards compatibility. - SourcesTable accepts showUserColumn={isAdmin}; admin view renders a User column, makes the search match user name/id, and expands Web UI / legacy pseudo-rows from the global aggregate to one row per user using by_user_source. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 23:23:06 +02:00
LocalAI [bot]	f15b9178ec	feat(usage): track and visualise usage per API key (#9920 ) * feat(usage): add Source, APIKeyID, APIKeyName columns to UsageRecord Adds three additive columns plus UsageSource* constants. The columns are auto-migrated by InitDB. APIKeyID is a nullable foreign reference to UserAPIKey.ID; APIKeyName is snapshotted on each row so revoked keys keep showing their name in history. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): backfill Source on pre-feature usage rows InitDB now classifies any pre-existing usage_record with an empty source: 'legacy-api-key' user -> legacy, everything else -> web. The backfill is idempotent (only touches NULL/empty rows). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add GetUserUsageBySource aggregator Groups by (bucket, source, api_key_id, api_key_name). Filters out legacy by default. Returns both per-bucket detail and roll-ups (by_source, by_key sorted desc and capped at 200, grand_total). The MAX(created_at) projection is iterated via Rows().Scan into a string column and parsed manually because the SQLite driver surfaces the aggregated timestamp as a string, which database/sql refuses to scan directly into time.Time. Postgres returns a real timestamp; the same string path handles its RFC3339 form too. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(usage): log Rows() errors and assert LastUsed in tests Adds rows.Err() and Rows() open-failure logging in computeSourceTotals so silent data drops surface in logs. Logs on parseLastUsedString format misses for the same reason. Strengthens the snapshot-survival test to assert LastUsed is a recent timestamp, locking the SQLite time-string parser behaviour. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add admin GetAllUsageBySource with filters and truncation Optional user_id and api_key_id filters (composed with AND). Legacy bucket is included for admin callers. truncated=true when more than 200 distinct keys would be in the by_key roll-up. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(auth): plumb auth_source and auth_apikey through Echo context tryAuthenticate now sets auth_source on every successful branch (web for session/Bearer-session, apikey for Bearer-key/x-api-key/ token-cookie, legacy for legacy env key match). For named-key branches it also stores the resolved UserAPIKey under auth_apikey so downstream middlewares can snapshot id+name without re-validating. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> fix(auth): expand tryAuthenticate godoc and cover Bearer-session branch Documents all three context-keys side effects (auth_source, auth_apikey, _auth_session) plus the split of responsibilities with the parent Middleware. Adds a test for the Bearer-as-session-token classification so future regressions there fail loudly. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): UsageMiddleware records source + snapshots key name Reads auth_source and auth_apikey from the Echo context (set by auth.Middleware in the previous task). Snapshots UserAPIKey.ID and Name onto each row so revoked keys remain readable in history. Falls back to source=web when no auth_source is set (auth disabled or unrecognised path). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add /api/auth/usage/sources and admin variant Self endpoint filters legacy server-side; admin endpoint includes legacy and accepts user_id + api_key_id filters. Response includes buckets, totals.{by_source, by_key, grand_total}, and a truncated flag set when the per-key roll-up was capped at 200. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(routes): mark test mirror handlers as keep-in-sync with production The newTestAuthApp helper duplicates production route handlers inline because it cannot use RegisterAuthRoutes (which requires a application.Application). Naming the source path on each mirror makes the drift contract explicit for future maintainers. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> feat(ui): add usageApi.getMySources/getAdminSources + i18n strings Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add Sources tab skeleton with data fetch Adds Usage page tab that fetches /api/auth/usage/sources (or the admin variant). Renders raw totals plus a placeholder key list; real visualisations land in subsequent commits. Restructures the existing tab button block so Models and Sources are visible to non-admins (Users remains admin-only). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): source mix ribbon + searchable/sortable sources table Replaces the SourcesTab placeholder rendering with two reusable components: SourceMixRibbon (one segmented bar per source class) and SourcesTable (search + sort + revoked-key dim). Pulls the current API key list to detect revoked keys. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ui): skip revoked-key detection until the key list is known existingKeyIds defaulted to an empty Set, which made every live api_key row render as (revoked) during the brief window before apiKeysApi.list() resolved, and permanently after a fetch failure. Use null as the unknown state and suppress the revoked badge until the parent provides a real Set. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): top-N stacked time chart and drill-in chip for Sources tab Top 7 sources by total tokens get distinct colours; the rest roll up into 'Other'. Clicking a row in the SourcesTable dims everything except that series in the chart; the chip is the canonical clear. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(usage): document per-API-key Sources tab and endpoints Extends features/authentication.md Usage Tracking section with: - A 'Sources' tab description and source-class taxonomy - Endpoint documentation for /api/auth/usage/sources and the admin variant - Response shape example with by_source / by_key / grand_total - Migration note about pre-feature row backfill Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(usage): silence errcheck on deferred rows.Close CI errcheck flagged the bare 'defer rows.Close()' in computeSourceTotals. Wrap in a closure that discards the close error explicitly; an error here is non-actionable since we have already drained the rows and logged any iteration failure. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(usage): bound batcher intake and add Shutdown/FlushNow hooks The pre-existing usage batcher had no cap on its add() path; the usageMaxPending=5000 constant only guarded the re-queue path after a failed write, leaving memory growth unbounded if the DB fell behind. This commit: - Adds the cap to add() so saturation drops new records (rate-limited warn at 1/1024) instead of growing unbounded. - Raises usageMaxPending to 50000 to absorb realistic inference bursts. - Replaces the package-level batcher global with a mutex-guarded pair plus a currentBatcher() accessor so Init / Shutdown cycles are race-free. - Adds ShutdownUsageRecorder() for graceful drain on process exit (not yet wired into app shutdown, just published). - Adds FlushNow() for deterministic tests; the middleware suite no longer needs 6s sleeps per spec and now runs in ~50ms instead of 18s. - Re-queue on failed flush is now cap-aware: prepends as much of the failed batch as fits alongside concurrent arrivals, instead of dropping the whole batch when full. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): drain usage batcher on graceful shutdown Registers ShutdownUsageRecorder with the existing signals.RegisterGracefulTerminationHandler so SIGINT/SIGTERM synchronously flushes any in-memory usage records before the process exits. Without this, up to one flush interval (5s) of recorded usage was lost when LocalAI restarted. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 16:34:02 +02:00
LocalAI [bot]	11d5bd0cc3	fix(react-ui/chat): stop wiping selection on every /api/operations poll (#9904 ) (#9917 ) useOperations() was calling setOperations() with a fresh array on every 1s poll, even when the payload was identical. In React 19 the DOM diff no longer short-circuits dangerouslySetInnerHTML on equal __html, so the forced Chat re-render re-assigned innerHTML on every assistant message once per second — wiping any text the user had selected. Skip the state update when the serialised operations payload is unchanged, and switch loading/error to functional setters so they also short-circuit at the source. Also fixes the chat copy button on plain HTTP: navigator.clipboard is undefined in non-secure contexts (a common LXC+Docker deployment), but the previous code called it unconditionally and showed a success toast regardless. Routed Chat, AgentChat and CanvasPanel through a new copyToClipboard() helper that uses navigator.clipboard when available and falls back to a hidden-textarea + execCommand('copy') trick that browsers still honour outside secure contexts. The fallback preserves the user's existing selection. Regression coverage in e2e/chat-polling-selection.spec.js: a MutationObserver counts mutations on the assistant content node across 3s of polling (must be 0); the copy test stubs out navigator.clipboard and asserts that execCommand('copy') is invoked. Assisted-by: claude-opus-4-7-1m Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 12:17:51 +02:00
Richard Palethorpe	0245b33eab	feat(realtime): Add Liquid Audio s2s model and assistant mode on talk page (#9801 ) * feat(liquid-audio): add LFM2.5-Audio any-to-any backend + realtime_audio usecase Wires LiquidAI's LFM2.5-Audio-1.5B as a self-contained Realtime API model: single engine handles VAD, transcription, LLM, and TTS in one bidirectional stream — drop-in alternative to a VAD+STT+LLM+TTS pipeline. Backend - backend/python/liquid-audio/ — new Python gRPC backend wrapping the `liquid-audio` package. Modes: chat / asr / tts / s2s, voice presets, Load/Predict/PredictStream/AudioTranscription/TTS/VAD/AudioToAudioStream/ Free and StartFineTune/FineTuneProgress/StopFineTune. Runtime monkey-patch on `liquid_audio.utils.snapshot_download` so absolute local paths from LocalAI's gallery resolve without a HF round-trip. soundfile in place of torchaudio.load/save (torchcodec drags NVIDIA NPP we don't bundle). - backend/backend.proto + pkg/grpc/{backend,client,server,base,embed, interface}.go — new AudioToAudioStream RPC mirroring AudioTransformStream (config/frame/control oneof in; typed event+pcm+meta out). - core/services/nodes/{health_mock,inflight}_test.go — add stubs for the new RPC to the test fakes. Config + capabilities - core/config/backend_capabilities.go — UsecaseRealtimeAudio, MethodAudio ToAudioStream, UsecaseInfoMap entry, liquid-audio BackendCapability row. - core/config/model_config.go — FLAG_REALTIME_AUDIO bitmask, ModalityGroups membership in both speech-input and audio-output groups so a lone flag still reads as multimodal, GetAllModelConfigUsecases entry, GuessUsecases branch. Realtime endpoint - core/http/endpoints/openai/realtime.go — extract prepareRealtimeConfig() so the gate is unit-testable; accept realtime_audio models and self-fill empty pipeline slots with the model's own name (user-pinned slots win). - core/http/endpoints/openai/realtime_gate_test.go — six specs covering nil cfg, empty pipeline, legacy pipeline, self-contained realtime_audio, user-pinned VAD slot, and partial legacy pipeline. UI + endpoints - core/http/routes/ui.go — /api/pipeline-models accepts either a legacy VAD+STT+LLM+TTS pipeline or a realtime_audio model; surfaces a self_contained flag so the Talk page can collapse the four cards. - core/http/routes/ui_api.go — realtime_audio in usecaseFilters. - core/http/routes/ui_pipeline_models_test.go — covers both code paths. - core/http/react-ui/src/pages/Talk.jsx — self-contained badge instead of the four-slot grid; rename Edit Pipeline → Edit Model Config; less pipeline-specific wording. - core/http/react-ui/src/pages/Models.jsx + locales/en/models.json — new realtime_audio filter button + i18n. - core/http/react-ui/src/utils/capabilities.js — CAP_REALTIME_AUDIO. - core/http/react-ui/src/pages/FineTune.jsx — voice + validation-dataset fields, surfaced when backend === liquid-audio, plumbed via extra_options on submit/export/import. Gallery + importer - gallery/liquid-audio.yaml — config template with known_usecases: [realtime_audio, chat, tts, transcript, vad]. - gallery/index.yaml — four model entries (realtime/chat/asr/tts) keyed by mode option. Fixed pre-existing `transcribe` typo on the asr entry (loader silently dropped the unknown string → entry never surfaced as a transcript model). - gallery/lfm.yaml — function block for the LFM2 Pythonic tool-call format `<\|tool_call_start\|>[name(k="v")]<\|tool_call_end\|>` matching common_chat_params_init_lfm2 in vendored llama.cpp. - core/gallery/importers/{liquid-audio,liquid-audio_test}.go — detector matches LFM2-Audio HF repos (excludes -gguf mirrors); mode/voice preferences plumbed through to options. - core/gallery/importers/importers.go — register LiquidAudioImporter before LlamaCPPImporter. - pkg/functions/parse_lfm2_test.go — seven specs for the response/argument regex pair on the LFM2 pythonic format. Build matrix - .github/backend-matrix.yml — seven liquid-audio targets (cuda12, cuda13, l4t-cuda-13, hipblas, intel, cpu amd64, cpu arm64). Jetpack r36 cuda-12 is skipped (Ubuntu 22.04 / Python 3.10 incompatible with liquid-audio's 3.12 floor). - backend/index.yaml — anchor + 13 image entries. - Makefile — .NOTPARALLEL, prepare-test-extra, test-extra, docker-build-liquid-audio. Docs - .agents/plans/liquid-audio-integration.md — phased plan; PR-D (real any-to-any wiring via AudioToAudioStream), PR-E (mid-audio tool-call detector), PR-G (GGUF entries once upstream llama.cpp PR #18641 lands) remain. - .agents/api-endpoints-and-auth.md — expand the capability-surface checklist with every place a new FLAG_* needs to be registered. Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): function calling + history cap for any-to-any models Three pieces, all on the realtime_audio path that just landed: 1. liquid-audio backend (backend/python/liquid-audio/backend.py): - _build_chat_state grows a `tools_prelude` arg. - new _render_tools_prelude parses request.Tools (the OpenAI Chat Completions function array realtime.go already serialises) and emits an LFM2 `<\|tool_list_start\|>…<\|tool_list_end\|>` system turn ahead of the user history. Mirrors gallery/lfm.yaml's `function:` template so the model sees the same prompt shape whether served via llama-cpp or here. Without this the backend silently dropped tools — function calling was wired end-to-end on the Go side but the model never saw a tool list. 2. Realtime history cap (core/http/endpoints/openai/realtime.go): - Session grows MaxHistoryItems int; default picked by new defaultMaxHistoryItems(cfg) — 6 for realtime_audio models (LFM2.5 1.5B degrades quickly past a handful of turns), 0/unlimited for legacy pipelines composing larger LLMs. - triggerResponse runs conv.Items through trimRealtimeItems before building conversationHistory. Helper walks the cut left if it would orphan a function_call_output, so tool result + call pairs stay intact. - realtime_gate_test.go: specs for defaultMaxHistoryItems and trimRealtimeItems (zero cap, under cap, over cap, tool-call pair preservation). 3. Talk page (core/http/react-ui/src/pages/Talk.jsx): - Reuses the chat page's MCP plumbing — useMCPClient hook, ClientMCPDropdown component, same auto-connect/disconnect effect pattern. No bespoke tool registry, no new REST endpoints; tools come from whichever MCP servers the user toggles on, exactly as on the chat page. - sendSessionUpdate now passes session.tools=getToolsForLLM(); the update re-fires when the active server set changes mid-session. - New response.function_call_arguments.done handler executes via the hook's executeTool (which round-trips through the MCP client SDK), then replies with conversation.item.create {type:function_call_output} + response.create so the model completes its turn with the tool output. Mirrors chat's client-side agentic loop, translated to the realtime wire shape. UI changes require a LocalAI image rebuild (Dockerfile:308-313 bakes react-ui/dist into the runtime image). Backend.py changes can be swapped live in /backends/<id>/backend.py + /backend/shutdown. Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): LocalAI Assistant ("Manage Mode") for the Talk page Mirrors the chat-page metadata.localai_assistant flow so users can ask the realtime model what's loaded / installed / configured. Tools are run server-side via the same in-process MCP holder that powers the chat modality — no transport switch, no proxy, no new wire protocol. Wire: - core/http/endpoints/openai/realtime.go: - RealtimeSessionOptions{LocalAIAssistant,IsAdmin}; isCurrentUserAdmin helper mirrors chat.go's requireAssistantAccess (no-op when auth disabled, else requires auth.RoleAdmin). - Session grows AssistantExecutor mcpTools.ToolExecutor. - runRealtimeSession, when opts.LocalAIAssistant is set: gate on admin, fail closed if DisableLocalAIAssistant or the holder has no tools, DiscoverTools and inject into session.Tools, prepend holder.SystemPrompt() to instructions. - Tool-call dispatch loop: when AssistantExecutor.IsTool(name), run ExecuteTool inproc, append a FunctionCallOutput to conv.Items, skip the function_call_arguments client emit (the client can't execute these — it doesn't know about them). After the loop, if any assistant tool ran, trigger another response so the model speaks the result. Mirrors chat's agentic loop, driven server-side rather than via client round-trip. - core/http/endpoints/openai/realtime_webrtc.go: RealtimeCallRequest gains `localai_assistant` (JSON omitempty). Handshake calls isCurrentUserAdmin and builds RealtimeSessionOptions. - core/http/react-ui/src/pages/Talk.jsx: admin-only "Manage Mode" checkbox under the Tools dropdown; passes localai_assistant: true to realtimeApi.call's body, captured in the connect callback's deps. Mirroring chat's pattern means the in-process MCP tools surface "just works" for the Talk page without exposing a Streamable-HTTP MCP endpoint (which was the alternative). Clients with their own MCP servers can still use the existing ClientMCPDropdown path in parallel; the realtime handler distinguishes them by AssistantExecutor.IsTool() at dispatch time. Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): render Manage Mode tool calls in the Talk transcript Previously the realtime endpoint only emitted response.output_item.added for the FunctionCall item, and Talk.jsx's switch ignored the event — so server-side tool runs were invisible in the UI. The model would speak the result but the user had no way to see what tool was actually called. realtime.go: after executing an assistant tool inproc, emit a second output_item.added/.done pair for the FunctionCallOutput item. Mirrors the way the chat page displays tool_call + tool_result blocks. Talk.jsx: handle both response.output_item.added and .done. Render FunctionCall (with arguments) and FunctionCallOutput (pretty-printed JSON when possible) as two transcript entries — `tool_call` with the wrench icon, `tool_result` with the clipboard icon, both in mono-space secondary-colour. Resets streamingRef after the result so the next assistant text delta starts a fresh transcript entry instead of appending to the previous turn. Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * refactor(realtime): bound the Manage Mode tool-loop + preserve assistant tools Fallout from a review pass on the Manage Mode patches: - Bound the server-side agentic loop. triggerResponse used to recurse on executedAssistantTool with no cap — a model that kept calling tools would blow the goroutine stack. New maxAssistantToolTurns = 10 (mirrors useChat.js's maxToolTurns). Public triggerResponse is now a thin shim over triggerResponseAtTurn(toolTurn int); recursion increments the counter and stops at the cap with an xlog.Warn. - Preserve Manage Mode tools across client session.update. The handler used to blindly overwrite session.Tools, so toggling a client MCP server mid-session silently wiped the in-process admin tools. Session now caches the original AssistantTools slice at session creation and the session.update handler merges them back in (client names win on collision — the client is explicit). - strconv.ParseBool for the localai_assistant query param instead of hand-rolled "1" \|\| "true". Mirrors LocalAIAssistantFromMetadata. - Talk.jsx: render both tool_call and tool_result on response.output_item.done instead of splitting them across .added and .done. The server's event pairing (added → done) stays correct; the UI just doesn't need to inspect both phases of the same item. One switch case instead of two, no behavioural change. Out of scope (noted for follow-ups): extract a shared assistant-tools helper between chat.go and realtime.go (duplication is small enough that two parallel implementations stay readable for now), and an i18n key for the Manage Mode helper text (Talk.jsx doesn't use i18n anywhere else yet). Assisted-by: claude-code:claude-opus-4-7-1m [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * ci(test-extra): wire liquid-audio backend smoke test The backend ships test.py + a `make test` target and is listed in backend-matrix.yml, so scripts/changed-backends.js already writes a `liquid-audio=true\|false` output when files under backend/python/liquid-audio/ change. The workflow just wasn't reading it. - Expose the `liquid-audio` output on the detect-changes job - Add a tests-liquid-audio job that runs `make` + `make test` in backend/python/liquid-audio, gated on the per-backend detect flag The smoke covers Health() and LoadModel(mode:finetune); fine-tune mode short-circuits before any HuggingFace download (backend.py:192), so the job needs neither weights nor a GPU. The full-inference path remains gated on LIQUID_AUDIO_MODEL_ID, which CI doesn't set. The four new Go test files (core/gallery/importers/liquid-audio_test.go, core/http/endpoints/openai/realtime_gate_test.go, core/http/routes/ui_pipeline_models_test.go, pkg/functions/parse_lfm2_test.go) are already picked up by the existing test.yml workflow via `make test` → `ginkgo -r ./pkg/... ./core/...`; their packages all carry RunSpecs entries. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-13 21:57:27 +02:00
Richard Palethorpe	670259ce43	chore: Security hardening (#9719 ) * fix(http): close 0.0.0.0/[::] SSRF bypass in /api/cors-proxy The CORS proxy carried its own private-network blocklist (RFC 1918 + a handful of IPv6 ranges) instead of using the same classification as pkg/utils/urlfetch.go. The hand-rolled list missed 0.0.0.0/8 and ::/128, both of which Linux routes to localhost — so any user with FeatureMCP (default-on for new users) could reach LocalAI's own listener and any other service bound to 0.0.0.0:port via: GET /api/cors-proxy?url=http://0.0.0.0:8080/... GET /api/cors-proxy?url=http://[::]:8080/... Replace the custom check with utils.IsPublicIP (Go stdlib IsLoopback / IsLinkLocalUnicast / IsPrivate / IsUnspecified, plus IPv4-mapped IPv6 unmasking) and add an upfront hostname rejection for localhost, .local, and the cloud metadata aliases so split-horizon DNS can't paper over the IP check. The IP-pinning DialContext is unchanged: the validated IP from the single resolution is reused for the connection, so DNS rebinding still cannot swap a public answer for a private one between validate and dial. Regression tests cover 0.0.0.0, 0.0.0.0:PORT, [::], ::ffff:127.0.0.1, ::ffff:10.0.0.1, file://, gopher://, ftp://, localhost, 127.0.0.1, 10.0.0.1, 169.254.169.254, metadata.google.internal. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> fix(downloader): verify SHA before promoting temp file to final path DownloadFileWithContext renamed the .partial file to its final name before checking the streamed SHA, so a hash mismatch returned an error but left the tampered file at filePath. Subsequent code that operated on filePath (a backend launcher, a YAML loader, a re-download that finds the file already present and skips) would consume the attacker-supplied bytes. Reorder: verify the streamed hash first, remove the .partial on mismatch, then rename. The streamed hash is computed during io.Copy so no second read is needed. While here, raise the empty-SHA case from a Debug log to a Warn so "this download had no integrity check" is visible at the default log level. Backend installs currently pass through with no digest; the warning makes that footprint observable without changing behaviour. Regression test asserts os.IsNotExist on the destination after a deliberate SHA mismatch. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(auth): require email_verified for OIDC admin promotion extractOIDCUserInfo read the ID token's "email" claim but never inspected "email_verified". With LOCALAI_ADMIN_EMAIL set, an attacker who could register on the configured OIDC IdP under that email (some IdPs accept self-supplied unverified emails) inherited admin role: - first login: AssignRole(tx, email, adminEmail) → RoleAdmin - re-login: MaybePromote(db, user, adminEmail) → flip to RoleAdmin Add EmailVerified to oauthUserInfo, parse email_verified from the OIDC claims (default false on absence so an IdP that omits the claim cannot short-circuit the gate), and substitute "" for the role-decision email when verified=false via emailForRoleDecision. The user record still stores the unverified email for display. GitHub's path defaults EmailVerified=true: GitHub only returns a public profile email after verification, and fetchGitHubPrimaryEmail explicitly filters to Verified=true. Regression tests cover both the helper contract and integration with AssignRole, including the bootstrap "first user" branch that would otherwise mask the gate. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(cli): refuse public bind when no auth backend is configured When neither an auth DB nor a static API key is set, the auth middleware passes every request through. That is fine for a developer laptop, a home LAN, or a Tailnet — the network itself is the trust boundary. It is not fine on a public IP, where every model install, settings change, and admin endpoint becomes reachable from the internet. Refuse to start in that exact configuration. Loopback, RFC 1918, RFC 4193 ULA, link-local, and RFC 6598 CGNAT (Tailscale's default range) all count as trusted; wildcard binds (`:port`, `0.0.0.0`, `[::]`) are accepted only when every host interface is in one of those ranges. Hostnames are resolved and treated as trusted only when every answer is. A new --allow-insecure-public-bind / LOCALAI_ALLOW_INSECURE_PUBLIC_BIND flag opts out for deployments that gate access externally (a reverse proxy enforcing auth, a mesh ACL, etc.). The error message lists this plus the three constructive alternatives (bind a private interface, enable --auth, set --api-keys). The interface enumeration goes through a package-level interfaceAddrsFn var so tests can simulate cloud-VM, home-LAN, Tailscale-only, and enumeration-failure topologies without poking at the real network stack. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * test(http): regression-test the localai_assistant admin gate ChatEndpoint already rejects metadata.localai_assistant=true from a non-admin caller, but the gate was open-coded inline with no direct test coverage. The chat route is FeatureChat-gated (default-on), and the assistant's in-process MCP server can install/delete models and edit configs — the wrong handler change would silently turn the LLM into a confused deputy. Extract the gate into requireAssistantAccess(c, authEnabled) and pin its behaviour: auth disabled is a no-op, unauthenticated is 403, RoleUser is 403, RoleAdmin and the synthetic legacy-key admin are admitted. No behaviour change in the production path. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * test(http): assert every API route is auth-classified The auth middleware classifies path prefixes (/api/, /v1/, /models/, etc.) as protected and treats anything else as a static-asset passthrough. A new endpoint shipped under a brand-new prefix — or a new path that simply isn't on the prefix allowlist — would be reachable anonymously. Walk every route registered by API() with auth enabled and a fresh in-memory database (no users, no keys), and assert each API-prefixed route returns 401 / 404 / 405 to an anonymous request. Public surfaces (/api/auth/, /api/branding, /api/node/ token-authenticated routes, /healthz, branding asset server, generated-content server, static assets) are explicit allowlist entries with comments justifying them. Build-tagged 'auth' so it runs against the SQLite-backed auth DB (matches the existing auth suite). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * test(http): pin agent endpoint per-user isolation contract agents.go's getUserID / effectiveUserID / canImpersonateUser / wantsAllUsers helpers are the single trust boundary for cross-user access on agent, agent-jobs, collections, and skills routes. A regression there is the difference between "regular user reads their own data" and "regular user reads anyone's data via ?user_id=victim". Lock in the contract: - effectiveUserID ignores ?user_id= for unauthenticated and RoleUser - effectiveUserID honours it for RoleAdmin and ProviderAgentWorker - wantsAllUsers requires admin AND the literal "true" string - canImpersonateUser is admin OR agent-worker, never plain RoleUser No production change — this commit only adds tests. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(downloader): drop redundant stat in removePartialFile The stat-then-remove pattern is a TOCTOU window and a wasted syscall — os.Remove already returns ErrNotExist for the missing-file case, so trust that and treat it as a no-op. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(http): redact secrets from trace buffer and distribution-token logs The /api/traces buffer captured Authorization, Cookie, Set-Cookie, and API-key headers verbatim from every request when tracing was enabled. The endpoint is admin-only but the buffer is reachable via any heap-style introspection and the captured tokens otherwise outlive the request. Strip those header values at capture time. Body redaction is left to a follow-up — the prompts are usually the operator's own and JSON-walking is invasive. Distribution tokens were also logged in plaintext from core/explorer/discovery.go; logs forward to syslog/journald and outlive the token. Redact those to a short prefix/suffix instead. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(auth): rate-limit OAuth callbacks separately from password endpoints The shared 5/min/IP limit on auth endpoints is right for password-style flows but too tight for OAuth callbacks: corporate SSO funnels many real users through one outbound IP and would trip the limit. Add a separate 60/min/IP limiter for /api/auth/{github,oidc}/callback so callbacks are bounded against floods without breaking shared-IP deployments. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(gallery): verify backend tarball sha256 when set in gallery entry GalleryBackend gained an optional sha256 field; the install path now threads it through to the existing downloader hash-verify (which already streams, verifies, and rolls back on mismatch). Galleries without sha256 keep working; the empty-SHA path still emits the existing "downloading without integrity check" warning. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * test(http): pin CSRF coverage on multipart endpoints The CSRF middleware in app.go is global (e.Use) so it covers every multipart upload route — branding assets, fine-tune datasets, audio transforms, agent collections. Pin that contract: cross-site multipart POSTs are rejected; same-origin / same-site / API-key clients are not. Also pins the SameSite=Lax fallback path the skipper relies on when Sec-Fetch-Site is absent. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(http): XSS hardening — CSP headers, safe href, base-href escape, SVG sandbox Several closely related XSS-prevention changes spanning the SPA shell, the React UI, and the branding asset server: - New SecurityHeaders middleware sets CSP, X-Content-Type-Options, X-Frame-Options, and Referrer-Policy on every response. The CSP keeps script-src permissive because the Vite bundle relies on inline + eval'd scripts; tightening that requires moving to a nonce-based policy. - The <base href> injection in the SPA shell escaped attacker-controllable Host / X-Forwarded-Host headers — a single quote in the host header broke out of the attribute. Pass through SecureBaseHref (html.EscapeString). - Three React sinks rendering untrusted content via dangerouslySetInnerHTML switch to text-node rendering with whiteSpace: pre-wrap: user message bodies in Chat.jsx and AgentChat.jsx, and the agent activity log in AgentChat.jsx. The hand-rolled escape on the agent user-message variant is replaced by the same plain-text path. - New safeHref util collapses non-allowlisted URI schemes (most importantly javascript:) to '#'. Applied to gallery `<a href={url}>` links in Models / Backends / Manage and to canvas artifact links — these come from gallery JSON or assistant tool calls and must be treated as untrusted. - The branding asset server attaches a sandbox CSP plus same-origin CORP to .svg responses. The React UI loads logos via <img>, but the same URL is also reachable via direct navigation; this prevents script execution if a hostile SVG slipped past upload validation. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(http): bound HTTP server with read-header and idle timeouts A net/http server with no timeouts is trivially Slowloris-able and leaks idle keep-alive connections. Set ReadHeaderTimeout (30s) to plug the slow-headers attack and IdleTimeout (120s) to cap keep-alive sockets. ReadTimeout and WriteTimeout stay at 0 because request bodies can be multi-GB model uploads and SSE / chat completions stream for many minutes; operators who need tighter per-request bounds should terminate slow clients at a reverse proxy. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * test(auth): pin PUT /api/auth/profile field-tampering contract The handler uses an explicit local body struct (only name and avatar_url) plus a gorm Updates(map) with a column allowlist, so an attacker posting {"role":"admin","email":"...","password_hash":"..."} can't mass-assign those fields. Lock that down with a regression test so a future "let's just c.Bind(&user)" refactor breaks loudly. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(services): strip directory components from multipart upload filenames UploadDataset and UploadToCollectionForUser took the raw multipart file.Filename and joined it into a destination path. The fine-tune upload was incidentally safe because of a UUID prefix that fused any leading '..' to a literal segment, but the protection is fragile. UploadToCollectionForUser handed the filename to a vendored backend without sanitising at all. Strip to filepath.Base at both boundaries and reject the trivial unsafe values ("", ".", "..", "/"). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(react-ui): validate persisted MCP server entries on load localStorage is shared across same-origin pages; an XSS that lands once can poison persisted MCP server config to attempt header injection or to feed a non-http URL into the fetch path on subsequent loads. Validate every entry: types must match, URL must parse with http(s) scheme, header keys/values must be control-char-free. Drop anything that doesn't fit. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(http): close X-Forwarded-Prefix open redirect The reverse-proxy support concatenated X-Forwarded-Prefix into the redirect target without validation, so a forged header value of "//evil.com" turned the SPA-shell redirect helper at /, /browse, and /browse/* into a 301 to //evil.com/app. The path-strip middleware had the same shape on its prefix-trailing-slash redirect. Add SafeForwardedPrefix at the middleware boundary: must start with a single '/', no protocol-relative '//' opener, no scheme, no backslash, no control characters. Apply at both consumers; misconfig trips the validator and the header is dropped. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(http): refuse wildcard CORS when LOCALAI_CORS=true with empty allowlist When LOCALAI_CORS=true but LOCALAI_CORS_ALLOW_ORIGINS was empty, Echo's CORSWithConfig saw an empty allow-list and fell back to its default AllowOrigins=[""]. An operator who flipped the strict-CORS feature flag without populating the list got the opposite of what they asked for. Echo never sets Allow-Credentials: true so this isn't directly exploitable (cookies aren't sent under wildcard CORS), but the misconfiguration trap is worth closing. Skip the registration and warn. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> feat(auth): zxcvbn password strength check with user-acknowledged override The previous policy was len < 8, which let through "Password1" and the rest of the credential-stuffing corpus. LocalAI has no second factor yet, so the bar needs to sit higher. Add ValidatePasswordStrength using github.com/timbutler/zxcvbn (an actively-maintained fork of the trustelem port; v1.0.4, April 2024): - min 12 chars, max 72 (bcrypt's truncation point) - reject NUL bytes (some bcrypt callers truncate at the first NUL) - require zxcvbn score >= 3 ("safely unguessable, ~10^8 guesses to break"); the hint list ["localai", "local-ai", "admin"] penalises passwords built from the app's own branding zxcvbn produces false positives sometimes (a strong-looking password that happens to match a dictionary word) and operators occasionally need to set a known-weak password (kiosk demos, CI rigs). Add an acknowledgement path: PasswordPolicy{AllowWeak: true} skips the entropy check while still enforcing the hard rules. The structured PasswordErrorResponse marks weak-password rejections as Overridable so the UI can surface a "use this anyway" checkbox. Wired through register, self-service password change, and admin password reset on both the server and the React UI. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(react-ui): drop HTML5 minLength on new-password inputs minLength={12} on the new-password input let the browser block the form submit silently before any JS or network call ran. The browser focused the field, showed a brief native tooltip, and that was that — no toast, no fetch, no clue. Reproducible by typing fewer than 12 chars on the second password change of a session. The JS-level length check in handleSubmit already shows a toast and the server rejects with a structured error, so the HTML5 attribute was redundant defence anyway. Drop it. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(react-ui): bundle Geist fonts locally instead of fetching from Google The new CSP correctly refused to apply styles from fonts.googleapis.com because style-src is locked to 'self' and 'unsafe-inline'. Loosening the CSP would defeat its purpose; the right fix is to stop reaching out to a third-party CDN for fonts on every page load. Add @fontsource-variable/geist and @fontsource-variable/geist-mono as npm deps and import them once at boot. Drop the <link rel="preconnect"> and external stylesheet from index.html. Side benefit: no third-party tracking via Referer / IP on every UI load, no failure mode when offline / behind a captive portal. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(react-ui): refresh i18n strings to reflect 12-char password minimum The translations still said "at least 8 characters" everywhere — the client-side toast on a too-short password change told the user the wrong floor. Update tooShort and newPasswordPlaceholder / newPasswordDescription across all five locales (en, es, it, de, zh-CN) to match the real ValidatePasswordStrength rule. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(auth): make password length-floor overridable like the entropy check The 12-char minimum was a policy choice, not a technical invariant — only "non-empty", "<= 72 bytes", and "no NUL bytes" are real bcrypt constraints. Treating length-12 as a hard rule was inconsistent with the entropy check (already overridable) and friction for use cases where the account is just a name on a session, not a security boundary (single-user kiosk, CI rig, lab demo). Restructure ValidatePasswordStrength: - Hard rules (always enforced): non-empty, <= MaxPasswordLength, no NUL byte - Policy rules (skipped when AllowWeak=true): length >= 12, zxcvbn score >= 3 PasswordError now marks password_too_short as Overridable too. The React forms generalised from `error_code === 'password_too_weak'` to `overridable === true`, and the JS-side preflight length checks were removed (server is source of truth, returns the same checkbox flow). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-08 16:25:45 +02:00
Richard Palethorpe	969005b2a1	feat(gallery): Speed up load times and clean gallery entries (#9211 ) * feat: Rework VRAM estimation and use known_usecases in gallery Signed-off-by: Richard Palethorpe <io@richiejp.com> Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] * chore(gallery): regenerate gallery index and add known_usecases to model entries Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-06 14:51:38 +02:00
Richard Palethorpe	bb033b16a9	feat: add LocalVQE backend and audio transformations UI (#9640 ) feat(audio-transform): add LocalVQE backend, bidi gRPC RPC, Studio UI Introduce a generic "audio transform" capability for any audio-in / audio-out operation (echo cancellation, noise suppression, dereverberation, voice conversion, etc.) and ship LocalVQE as the first backend implementation. Backend protocol: - Two new gRPC RPCs in backend.proto: unary AudioTransform for batch and bidirectional AudioTransformStream for low-latency frame-by-frame use. This is the first bidi stream in the proto; per-frame unary at LocalVQE's 16 ms hop would be RTT-bound. Wire it through pkg/grpc/{client,server, embed,interface,base} with paired-channel ergonomics. LocalVQE backend (backend/go/localvqe/): - Go-Purego wrapper around upstream liblocalvqe.so. CMake builds the upstream shared lib + its libggml-cpu-.so runtime variants directly — no MODULE wrapper needed because LocalVQE handles CPU feature selection internally via GGML_BACKEND_DL. - Sets GGML_NTHREADS from opts.Threads (or runtime.NumCPU()-1) — without it LocalVQE runs single-threaded at ~1× realtime instead of the documented ~9.6×. - Reference-length policy: zero-pad short refs, truncate long ones (the trailing portion can't have leaked into a mic that wasn't recording). - Ginkgo test suite (9 always-on specs + 2 model-gated). HTTP layer: - POST /audio/transformations (alias /audio/transform): multipart batch endpoint, accepts audio + optional reference + params[]=v form fields. Persists inputs alongside the output in GeneratedContentDir/audio so the React UI history can replay past (audio, reference, output) triples. - GET /audio/transformations/stream: WebSocket bidi, 16 ms PCM frames (interleaved stereo mic+ref in, mono out). JSON session.update envelope for config; constants hoisted in core/schema/audio_transform.go. - ffmpeg-based input normalisation to 16 kHz mono s16 WAV via the existing utils.AudioToWav (with passthrough fast-path), so the user can upload any format / rate without seeing the model's strict 16 kHz constraint. - BackendTraceAudioTransform integration so /api/backend-traces and the Traces UI light up with audio_snippet base64 and timing. - Routes registered under routes/localai.go (LocalAI extension; OpenAI has no /audio/transformations endpoint), traced via TraceMiddleware. Auth + capability + importer: - FLAG_AUDIO_TRANSFORM (model_config.go), FeatureAudioTransform (default-on, in APIFeatures), three RouteFeatureRegistry rows. - localvqe added to knownPrefOnlyBackends with modality "audio-transform". - Gallery entry localvqe-v1-1.3m (sha256-pinned, hosted on huggingface.co/LocalAI-io/LocalVQE). React UI: - New /app/transform page surfaced via a dedicated "Enhance" sidebar section (sibling of Tools / Biometrics) — the page is enhancement, not generation, so it lives outside Studio. Two AudioInput components (Upload + Record tabs, drag-drop, mic capture). - Echo-test button: records mic while playing the loaded reference through the speakers — the mic naturally picks up speaker bleed, giving a real (mic, ref) pair for AEC testing without leaving the UI. - Reusable WaveformPlayer (canvas peaks + click-to-seek + audio controls) and useAudioPeaks hook (shared module-scoped AudioContext to avoid hitting browser context limits with three players on one page); migrated TTS, Sound, Traces audio blocks to use it. - Past runs saved in localStorage via useMediaHistory('audio-transform') — the history entry stores all three URLs so clicking re-renders the full triple, not just the output. Build + e2e: - 11 matrix entries removed from .github/workflows/backend.yml (CUDA, ROCm, SYCL, Metal, L4T): upstream supports only CPU + Vulkan, so we ship those two and let GPU-class hardware route through Vulkan in the gallery capabilities map. - tests-localvqe-grpc-transform job in test-extra.yml (gated on detect-changes.outputs.localvqe). - New audio_transform capability + 4 specs in tests/e2e-backends. - Playwright spec suite in core/http/react-ui/e2e/audio-transform.spec.js (8 specs covering tabs, file upload, multipart shape, history, errors). Docs: - New docs/content/features/audio-transform.md covering the (audio, reference) mental model, batch + WebSocket wire formats, LocalVQE param keys, and a YAML config example. Cross-links from text-to-audio and audio-to-text feature pages. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit Write Agent TaskCreate] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-04 22:07:11 +02:00
Ettore Di Giacinto	87cf736068	feat(react-ui): add multilingual (i18n) support (#9642 ) Adds end-to-end internationalization to the React UI with five seed languages (English, Italian, Spanish, German, Simplified Chinese) and a sidebar-footer language switcher next to the existing theme toggle. Library: react-i18next + i18next + i18next-http-backend + i18next-browser-languagedetector. The detector caches the user's choice in localStorage (key `localai-language`, mirroring the existing `localai-theme` convention) and updates the `<html lang>` attribute on change. fallbackLng is `en`, so any missing translation in another locale falls back transparently. Translation files live under `public/locales/<lng>/<ns>.json`. They ride along with the existing `//go:embed react-ui/dist/` directive, but the previous SPA route in core/http/app.go only exposed `/assets/` from the embedded React build. This commit generalizes the asset handler into a `serveReactSubdir(subdir)` helper and adds a matching `/locales/*` route so i18next-http-backend can fetch the JSONs at runtime. The http-backend `loadPath` is built via the existing `apiUrl()` helper so instances served under a sub-path (e.g. `<base href="/ui/">`) resolve correctly. Namespaces (13): common, nav, errors, auth, home, models, importModel, chat, agents, skills, collections, media, admin. Translated UI surfaces include the sidebar/header/footer chrome, login + account flows, the Home dashboard (incl. the manage-by-chat assistant CTA), the model gallery + import flow, the chat experience (Chat.jsx + ChatsMenu), agents/skills/collections list pages, the studio media tabs (Image, Video, TTS), and the admin page-headers (Settings incl. its section nav, Manage, Backends, Traces, Nodes, P2P, Users, Usage). Shared components (ConfirmDialog, Toast) take their default labels from the common namespace so callers don't need to pass strings explicitly. Tooling for incremental adoption is included: - `i18next-parser.config.js` + `npm run i18n:extract` to sweep `t()` keys into the JSON skeletons. - `scripts/translate-locales.mjs` (one-off helper) to bootstrap non-English locales from English source via OpenAI or Anthropic APIs, with --copy mode as a placeholder fallback. Idempotent; preserves existing translations unless --overwrite is passed. Larger config-driven pages (ModelEditor, Settings deep field forms, AgentChat/AgentCreate, SkillEdit, CollectionDetails, Talk, Sound, biometrics, FineTune/Quantize, Users modals, Nodes/P2P install pickers, BackendLogs, Traces deep filters, Explorer) intentionally keep their inner content untranslated for now — they fall back to English via fallbackLng so functionality is unaffected, and the extracted-strings pattern + the bootstrap script make follow-up extraction straightforward. The initial Suspense fallback at the root in main.jsx covers the first JSON fetch on cold load. A simple `.app-boot-spinner` styled in App.css provides a non-empty paint while the first namespace loads. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit Write Agent] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-02 22:42:08 +02:00

10 Commits