LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-13 17:54:02 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	61bf34ea2f	fix(traces): cap captured body size to keep admin Traces UI responsive (#9946 ) The trace middleware buffered the full request and response bodies for every JSON exchange. With a chatty agent-pool RAG workload, /embeddings responses (large vector arrays) accumulated to tens of MB in the in-memory buffer; the admin Traces page would then download and parse 40+ MB on every load and on every 5s auto-refresh, locking the UI in a loading state. Add LOCALAI_TRACING_MAX_BODY_BYTES (default 64 KiB) that caps each captured body. The full payload still flows through to the real client; only the trace copy is bounded. Exchanges record body_truncated and original body_bytes so the dashboard can show that truncation happened. The cap is configurable via env, CLI, and runtime_settings.json. Also unblock recovery: the Traces page now keeps the Clear button enabled while loading, since "buffer too large to render" is exactly when the user needs to clear it. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-22 15:29:24 +02:00
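The body cap described in the commit above can be sketched as a writer that stores only the first N bytes while still counting everything that passes through. This is a minimal illustration, not the project's middleware: `cappedBuffer` and its fields are hypothetical names, and only the idea (bounded trace copy, `body_truncated` flag, original byte count) comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
)

// cappedBuffer is a hypothetical helper: it keeps at most limit bytes of
// what is written to it, but counts the full size of the stream.
type cappedBuffer struct {
	limit int          // maximum number of bytes retained for the trace copy
	buf   bytes.Buffer // the retained prefix
	total int          // original body size in bytes
}

// Write never fails and always reports len(p), so a caller that tees the
// real response through this buffer is not interrupted by the cap.
func (c *cappedBuffer) Write(p []byte) (int, error) {
	c.total += len(p)
	if room := c.limit - c.buf.Len(); room > 0 {
		if len(p) > room {
			p2 := p[:room]
			c.buf.Write(p2)
		} else {
			c.buf.Write(p)
		}
	}
	return len(p), nil
}

// Truncated reports whether more bytes were seen than were retained.
func (c *cappedBuffer) Truncated() bool { return c.total > c.limit }

func main() {
	c := &cappedBuffer{limit: 8}
	c.Write([]byte("hello "))
	c.Write([]byte("world"))
	// Prints the retained prefix, the original size, and the truncation flag.
	fmt.Println(c.buf.String(), c.total, c.Truncated())
}
```

Because `Write` always reports the full length, such a buffer could sit behind an `io.MultiWriter` next to the real response writer without affecting what the client receives.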
LocalAI [bot]	0b2ae3c6ca	fix(openai): stream usage non-zero when tools are enabled (#9941 ) * chore: ignore local .worktrees directory Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(openai): stream usage non-zero when tools are enabled The streaming chat-completions worker for tool-bearing requests (processTools in core/http/endpoints/openai/chat.go) never forwarded the cumulative TokenUsage from ComputeChoices to the chunks it placed on the responses channel. The outer streaming loop's running usage tracker therefore stayed at the zero value, and the include_usage trailer reported {prompt_tokens:0, completion_tokens:0, total_tokens:0} whenever the request carried a `tools` array. Without tools, the alternative `process` path stamps Usage on every chunk, so that path was unaffected. Forward the final TokenUsage via a usage-only sentinel chunk (empty Choices, populated Usage) emitted right before close(responses). The outer loop's per-chunk Usage capture moves above the empty-Choices skip so the sentinel updates the tracker without ever reaching the wire, keeping the existing OpenAI spec contract (intermediate chunks carry no `usage` field, and the deferred-final-chunk helpers remain Usage-free per the regression test for issue #8546). Adds streamUsageFromTokenUsage, usageSentinelChunk, and applyChunkToUsage helpers with focused Ginkgo coverage plus a flow-level test that mirrors the outer-loop sequence. Fixes #9927 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4-7 [Claude Code] * refactor(openai): return final TokenUsage from stream workers Replace the usage-only sentinel SSE chunk introduced in the previous commit with a plain return value. 
The streaming workers process and processTools (now extracted as package-level processStream and processStreamWithTools) return (backend.TokenUsage, error); the outer ChatEndpoint loop reads the cumulative counts off the existing `ended` channel (now carrying streamWorkerResult{usage, err}) and builds the include_usage trailer from a normal Go value after the LOOP exits. This drops the empty-Choices "skip but capture Usage" rule from the outer loop and removes the usageSentinelChunk / applyChunkToUsage helpers entirely. The SSE responses channel is back to a single purpose: wire chunks only. processStream and processStreamWithTools move into chat_stream_workers.go so they can be exercised directly from tests. The chat_stream_usage_test.go suite now drives the workers with a mocked backend.ModelInferenceFunc and asserts on the returned TokenUsage. The regression coverage for issue #9927 is therefore behavioral: reverting the fix (discarding ComputeChoices' usage return) makes the assertions fail with concrete count mismatches. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4-7 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-22 10:13:41 +02:00
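The refactor described above (workers return the final usage as a value, delivered over the `ended` channel, instead of smuggling it through the chunk channel) can be sketched roughly as follows. This is a simplified illustration under assumptions: the field names of `TokenUsage`, the `worker` function, and the `runStream` loop are invented for the example, and the real endpoint uses a more involved loop.

```go
package main

import "fmt"

// TokenUsage and streamWorkerResult echo the names in the commit message;
// their fields here are assumptions made for this sketch.
type TokenUsage struct{ Prompt, Completion int }

type streamWorkerResult struct {
	usage TokenUsage
	err   error
}

// worker (hypothetical) sends wire chunks on responses and returns the
// cumulative usage as an ordinary Go value.
func worker(responses chan<- string) (TokenUsage, error) {
	defer close(responses)
	for _, tok := range []string{"Hel", "lo"} {
		responses <- tok
	}
	return TokenUsage{Prompt: 5, Completion: 2}, nil
}

// runStream drains the chunk channel, then reads the worker result.
// The responses channel carries wire chunks only.
func runStream() ([]string, TokenUsage, error) {
	responses := make(chan string)
	ended := make(chan streamWorkerResult, 1)
	go func() {
		u, err := worker(responses)
		ended <- streamWorkerResult{usage: u, err: err}
	}()
	var chunks []string
	for c := range responses {
		chunks = append(chunks, c)
	}
	res := <-ended // the usage trailer is built after the loop exits
	return chunks, res.usage, res.err
}

func main() {
	chunks, usage, err := runStream()
	fmt.Println(chunks, usage.Prompt+usage.Completion, err)
}
```

Returning the usage keeps the channel single-purpose and lets a test assert directly on the worker's return value, which is the testing benefit the commit message points to.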
LocalAI [bot]	4735345105	chore: ⬆️ Update ggml-org/llama.cpp to `bb28c1fe246b72276ee1d00ce89306be7b865766` (#9934) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-22 09:49:33 +02:00
LocalAI [bot]	7384fd800b	chore: ⬆️ Update antirez/ds4 to `8d576642c39b9a2d782a80159ba84ef5a81c0b81` (#9932) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-22 08:31:49 +02:00
LocalAI [bot]	6942713d85	chore: ⬆️ Update leejet/stable-diffusion.cpp to `3a8788cb7d74f185d6b18688e9563015524ecaf5` (#9933) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-22 00:31:19 +02:00
LocalAI [bot]	0cf52c44d4	chore: ⬆️ Update ggml-org/whisper.cpp to `8443cf05e3fa8ce1b32348e1bcbcf8fc31f7f3ae` (#9929) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-21 23:24:01 +02:00
LocalAI [bot]	0d34cf7cbd	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `48a55f74e4c6e2aeda363dd386c1ac9170a0af71` (#9930) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-21 23:23:37 +02:00
LocalAI [bot]	f0cb02afb8	feat(usage): attribute Sources rows to user accounts in admin view (#9935 ) The merged feature (#9920) let admins see per-API-key and per-source totals but did not surface which user owned each key, and lumped every user's Web UI traffic into a single global Web UI row. This makes the admin Sources tab properly per-user attributable: - KeyTotal gains UserID + UserName, populated from the snapshot the usage middleware already records. The by_key roll-up now groups by (api_key_id, api_key_name, user_id, user_name). - New SourceTotals.ByUserSource roll-up groups (source, user_id, user_name) for sources without a key identity (web, legacy). Only populated on the admin path (includeLegacy=true); the non-admin endpoint stays unchanged for backwards compatibility. - SourcesTable accepts showUserColumn={isAdmin}; admin view renders a User column, makes the search match user name/id, and expands Web UI / legacy pseudo-rows from the global aggregate to one row per user using by_user_source. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 23:23:06 +02:00
LocalAI [bot]	a39e025d64	fix(nodes): make per-node backend install async via gallery job queue (#9928 ) * feat(galleryop): add TargetNodeID to ManagementOp for single-node installs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(galleryop): add NodeScopedKey helpers for per-node opcache rows Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(galleryop): use strings.Cut for NodeScopedKey parsing, reject empty nodeID Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(nodes): scope DistributedBackendManager.InstallBackend to single node via TargetNodeID Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(http): make /api/nodes/:id/backends/install async via gallery service job queue The handler previously called unloader.InstallBackend synchronously and blocked the browser for up to 3 minutes waiting on the NATS reply. It now enqueues a TargetNodeID-scoped ManagementOp on BackendGalleryChannel and returns HTTP 202 + jobID immediately, matching /api/backends/install/:id. The opcache key is built via NodeScopedKey(nodeID, backend) so concurrent installs of the same backend across different nodes do not stomp each other. galleryService/opcache/appConfig are threaded through RegisterNodeAdminRoutes for this. Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(http): log malformed backend_galleries override and stop test drain goroutine Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(api): expose nodeID for node-scoped backend ops in /api/operations Node-scoped backend installs land in opcache under "node:<nodeID>:<backend>" keys. Without splitting that prefix back out, the operations panel renders the full key as the display name and has no structured way to label which worker an install is targeting. 
Detect the prefix, surface nodeID as its own response field, and reduce the display name back to the bare backend slug. Bare (non-scoped) ops are left untouched so legacy installs do not gain a misleading empty nodeID. Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(react-ui): poll job status for node-targeted backend installs Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(react-ui): make NodeInstallPicker state updates pure and surface cancellations as errors Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(react-ui): clarify async semantics in handleInstallOnTarget Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(http): use statusUrl casing for node install response to match codebase precedent Assisted-by: Claude:opus-4-7 [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 22:25:53 +02:00
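The `node:<nodeID>:<backend>` key scheme described above lends itself to a small build/parse pair. The sketch below is an assumption about how such helpers might look, not the project's code: the function names and return shapes are hypothetical, and only the key layout, the use of `strings.Cut`, the rejection of an empty nodeID, and the "bare keys stay untouched" rule come from the commit messages.

```go
package main

import (
	"fmt"
	"strings"
)

const nodePrefix = "node:"

// nodeScopedKey (hypothetical name) builds the per-node opcache key.
func nodeScopedKey(nodeID, backend string) string {
	return nodePrefix + nodeID + ":" + backend
}

// parseNodeScopedKey splits a scoped key back into nodeID and backend.
// Bare keys and keys with an empty nodeID are returned unchanged with
// ok=false, so legacy installs never gain a misleading empty nodeID.
func parseNodeScopedKey(key string) (nodeID, backend string, ok bool) {
	rest, found := strings.CutPrefix(key, nodePrefix)
	if !found {
		return "", key, false
	}
	id, name, found := strings.Cut(rest, ":")
	if !found || id == "" {
		return "", key, false
	}
	return id, name, true
}

func main() {
	key := nodeScopedKey("worker-1", "llama-cpp")
	id, name, ok := parseNodeScopedKey(key)
	fmt.Println(key, id, name, ok)

	// A bare (non-scoped) key passes through as the display name.
	_, bare, ok := parseNodeScopedKey("llama-cpp")
	fmt.Println(bare, ok)
}
```

`strings.Cut` splits on the first separator only, so a backend slug that itself contains a colon would survive the round trip in this sketch.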
Ettore Di Giacinto	05e8e1e9f4	ci(images): publish chronologically-orderable master-<epoch>-<sha> tags The existing master push pipeline produces `master` (rolling) and `sha-<short>` tags. Neither is orderable by build time, so downstream GitOps that want to auto-bump to the newest master build (e.g. Flux ImagePolicy) can't pick the latest from the tag list — alphabetical sort over hex shas is effectively random, and the rolling `master` tag can't be referenced as an immutable bump target. Add a third tag of the form `master-<epoch>-<sha>` (Unix epoch in seconds + short sha), gated on default-branch pushes via metadata- action's `is_default_branch` predicate. The sha is retained for traceability; the epoch makes the tags numerically orderable, so a Flux ImagePolicy like filterTags: pattern: '^master-(?P<ts>[0-9]+)-[a-f0-9]+$' extract: '$ts' policy: numerical: order: asc will reliably bump to the newest master build. Applied to both image_build.yml (OCI labels stay consistent) and image_merge.yml (the actual tag publisher via buildx imagetools).	2026-05-21 17:18:30 +00:00
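To illustrate why the epoch component makes these tags orderable, here is a small sketch that applies the same pattern as the ImagePolicy quoted above and picks the tag with the largest timestamp. It is not Flux's implementation; `newestMasterTag` is a hypothetical helper and the tag values in `main` are made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// Same pattern as the filterTags example in the commit message.
var tagRe = regexp.MustCompile(`^master-(?P<ts>[0-9]+)-[a-f0-9]+$`)

// newestMasterTag (hypothetical) returns the matching tag with the highest
// epoch. Non-matching tags such as "master" or "sha-<short>" are ignored.
func newestMasterTag(tags []string) (string, bool) {
	best, bestTS, found := "", int64(0), false
	tsIdx := tagRe.SubexpIndex("ts")
	for _, t := range tags {
		m := tagRe.FindStringSubmatch(t)
		if m == nil {
			continue
		}
		ts, err := strconv.ParseInt(m[tsIdx], 10, 64)
		if err != nil {
			continue // timestamp too large to parse; skip the tag
		}
		if !found || ts > bestTS {
			best, bestTS, found = t, ts, true
		}
	}
	return best, found
}

func main() {
	// Illustrative tag list; the epochs and shas are invented.
	tags := []string{
		"master",
		"sha-05e8e1e",
		"master-1700000000-a7f6cc8",
		"master-1700000500-05e8e1e",
	}
	fmt.Println(newestMasterTag(tags))
}
```

Comparing the extracted number rather than the whole string is what makes the choice independent of the sha suffix, whose lexical order says nothing about build time.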
Rin	a7f6cc8956	[utils] Fail immediately on extraction errors (#9926) utils: fail immediately on extraction errors Setting ContinueOnError to false ensures that ExtractArchive does not leave the model or backend directory in an inconsistent state if a partial failure occurs. This improves robustness against malformed archives or unexpected I/O issues during installation. Signed-off-by: RinZ27 <222222878+RinZ27@users.noreply.github.com>	2026-05-21 19:00:33 +02:00
LocalAI [bot]	f15b9178ec	feat(usage): track and visualise usage per API key (#9920 ) * feat(usage): add Source, APIKeyID, APIKeyName columns to UsageRecord Adds three additive columns plus UsageSource* constants. The columns are auto-migrated by InitDB. APIKeyID is a nullable foreign reference to UserAPIKey.ID; APIKeyName is snapshotted on each row so revoked keys keep showing their name in history. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): backfill Source on pre-feature usage rows InitDB now classifies any pre-existing usage_record with an empty source: 'legacy-api-key' user -> legacy, everything else -> web. The backfill is idempotent (only touches NULL/empty rows). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add GetUserUsageBySource aggregator Groups by (bucket, source, api_key_id, api_key_name). Filters out legacy by default. Returns both per-bucket detail and roll-ups (by_source, by_key sorted desc and capped at 200, grand_total). The MAX(created_at) projection is iterated via Rows().Scan into a string column and parsed manually because the SQLite driver surfaces the aggregated timestamp as a string, which database/sql refuses to scan directly into time.Time. Postgres returns a real timestamp; the same string path handles its RFC3339 form too. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(usage): log Rows() errors and assert LastUsed in tests Adds rows.Err() and Rows() open-failure logging in computeSourceTotals so silent data drops surface in logs. Logs on parseLastUsedString format misses for the same reason. Strengthens the snapshot-survival test to assert LastUsed is a recent timestamp, locking the SQLite time-string parser behaviour. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add admin GetAllUsageBySource with filters and truncation Optional user_id and api_key_id filters (composed with AND). 
Legacy bucket is included for admin callers. truncated=true when more than 200 distinct keys would be in the by_key roll-up. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(auth): plumb auth_source and auth_apikey through Echo context tryAuthenticate now sets auth_source on every successful branch (web for session/Bearer-session, apikey for Bearer-key/x-api-key/ token-cookie, legacy for legacy env key match). For named-key branches it also stores the resolved UserAPIKey under auth_apikey so downstream middlewares can snapshot id+name without re-validating. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> fix(auth): expand tryAuthenticate godoc and cover Bearer-session branch Documents all three context-keys side effects (auth_source, auth_apikey, _auth_session) plus the split of responsibilities with the parent Middleware. Adds a test for the Bearer-as-session-token classification so future regressions there fail loudly. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): UsageMiddleware records source + snapshots key name Reads auth_source and auth_apikey from the Echo context (set by auth.Middleware in the previous task). Snapshots UserAPIKey.ID and Name onto each row so revoked keys remain readable in history. Falls back to source=web when no auth_source is set (auth disabled or unrecognised path). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): add /api/auth/usage/sources and admin variant Self endpoint filters legacy server-side; admin endpoint includes legacy and accepts user_id + api_key_id filters. Response includes buckets, totals.{by_source, by_key, grand_total}, and a truncated flag set when the per-key roll-up was capped at 200. 
Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(routes): mark test mirror handlers as keep-in-sync with production The newTestAuthApp helper duplicates production route handlers inline because it cannot use RegisterAuthRoutes (which requires a application.Application). Naming the source path on each mirror makes the drift contract explicit for future maintainers. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> feat(ui): add usageApi.getMySources/getAdminSources + i18n strings Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add Sources tab skeleton with data fetch Adds Usage page tab that fetches /api/auth/usage/sources (or the admin variant). Renders raw totals plus a placeholder key list; real visualisations land in subsequent commits. Restructures the existing tab button block so Models and Sources are visible to non-admins (Users remains admin-only). Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): source mix ribbon + searchable/sortable sources table Replaces the SourcesTab placeholder rendering with two reusable components: SourceMixRibbon (one segmented bar per source class) and SourcesTable (search + sort + revoked-key dim). Pulls the current API key list to detect revoked keys. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ui): skip revoked-key detection until the key list is known existingKeyIds defaulted to an empty Set, which made every live api_key row render as (revoked) during the brief window before apiKeysApi.list() resolved, and permanently after a fetch failure. Use null as the unknown state and suppress the revoked badge until the parent provides a real Set. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): top-N stacked time chart and drill-in chip for Sources tab Top 7 sources by total tokens get distinct colours; the rest roll up into 'Other'. 
Clicking a row in the SourcesTable dims everything except that series in the chart; the chip is the canonical clear. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(usage): document per-API-key Sources tab and endpoints Extends features/authentication.md Usage Tracking section with: - A 'Sources' tab description and source-class taxonomy - Endpoint documentation for /api/auth/usage/sources and the admin variant - Response shape example with by_source / by_key / grand_total - Migration note about pre-feature row backfill Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(usage): silence errcheck on deferred rows.Close CI errcheck flagged the bare 'defer rows.Close()' in computeSourceTotals. Wrap in a closure that discards the close error explicitly; an error here is non-actionable since we have already drained the rows and logged any iteration failure. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(usage): bound batcher intake and add Shutdown/FlushNow hooks The pre-existing usage batcher had no cap on its add() path; the usageMaxPending=5000 constant only guarded the re-queue path after a failed write, leaving memory growth unbounded if the DB fell behind. This commit: - Adds the cap to add() so saturation drops new records (rate-limited warn at 1/1024) instead of growing unbounded. - Raises usageMaxPending to 50000 to absorb realistic inference bursts. - Replaces the package-level batcher global with a mutex-guarded pair plus a currentBatcher() accessor so Init / Shutdown cycles are race-free. - Adds ShutdownUsageRecorder() for graceful drain on process exit (not yet wired into app shutdown, just published). - Adds FlushNow() for deterministic tests; the middleware suite no longer needs 6s sleeps per spec and now runs in ~50ms instead of 18s. 
- Re-queue on failed flush is now cap-aware: prepends as much of the failed batch as fits alongside concurrent arrivals, instead of dropping the whole batch when full. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(usage): drain usage batcher on graceful shutdown Registers ShutdownUsageRecorder with the existing signals.RegisterGracefulTerminationHandler so SIGINT/SIGTERM synchronously flushes any in-memory usage records before the process exits. Without this, up to one flush interval (5s) of recorded usage was lost when LocalAI restarted. Refs: #9862 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 16:34:02 +02:00
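The bounded-intake and cap-aware re-queue behaviour described in the batcher refactor above can be sketched as follows. This is a simplified model with hypothetical names (`batcher`, `take`, `requeue`); which part of an oversized failed batch is kept is an assumption here (the oldest records), since the commit message only says that as much as fits is prepended.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher is a hypothetical, simplified stand-in for the usage batcher.
type batcher struct {
	mu         sync.Mutex
	pending    []string
	maxPending int
	dropped    int
}

// add enforces the cap on intake: when the queue is full the new record is
// dropped instead of letting memory grow without bound.
func (b *batcher) add(rec string) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	if len(b.pending) >= b.maxPending {
		b.dropped++
		return false
	}
	b.pending = append(b.pending, rec)
	return true
}

// take removes and returns everything currently pending (one flush batch).
func (b *batcher) take() []string {
	b.mu.Lock()
	defer b.mu.Unlock()
	batch := b.pending
	b.pending = nil
	return batch
}

// requeue puts a failed batch back in front of records that arrived while
// the flush was in flight, keeping only as many as still fit under the cap.
func (b *batcher) requeue(failed []string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	room := b.maxPending - len(b.pending)
	if room <= 0 {
		b.dropped += len(failed)
		return
	}
	if len(failed) > room {
		b.dropped += len(failed) - room
		failed = failed[:room] // assumption: keep the oldest records
	}
	b.pending = append(append([]string{}, failed...), b.pending...)
}

func main() {
	b := &batcher{maxPending: 3}
	for _, r := range []string{"a", "b", "c", "d"} {
		b.add(r)
	}
	batch := b.take() // pretend the database write for this batch failed
	b.add("e")
	b.add("f")
	b.requeue(batch)
	fmt.Println(b.pending, b.dropped)
}
```

A real implementation would also rate-limit the drop warning and expose a synchronous flush for tests, as the commit message describes; both are left out to keep the sketch short.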
LocalAI [bot]	959de86761	feat(llama-cpp): make server-side prompt cache work by default (#9925 ) Aligns LocalAI's llama-cpp gRPC backend with upstream's auto-on prompt cache path so repeated system prompts (agents, OpenAI/Anthropic-compatible CLIs, coding assistants) skip prefill on subsequent calls without any YAML changes. Reported in #9921. Upstream's server enables `kv_unified=true` (and bumps `n_parallel` to 4) when slot count is auto, which unlocks `cache_idle_slots`. LocalAI hardcodes `n_parallel=1` and so far also hardcoded `kv_unified=false`, which silently force-disables idle-slot saving at server init. The host prompt cache was allocated but never written across requests. Changes in backend/cpp/llama-cpp/grpc-server.cpp: - params.kv_unified: false -> true (single-slot path now benefits from the prompt cache; users can opt out with `kv_unified:false`) - params.n_ctx_checkpoints: 8 -> 32 (match upstream default) - params.cache_idle_slots = true initialized explicitly (upstream default) - params.checkpoint_every_nt = 8192 initialized explicitly (upstream default) - New option parsers: cache_idle_slots / idle_slots_cache, checkpoint_every_nt / checkpoint_every_n_tokens Docs: - features/text-generation.md: fix misleading `cache_ram` description (it's the host-side prompt cache, not the KV cache), document the kv_unified + cache_ram + cache_idle_slots interaction, add rows for the two newly-exposed options, and add a worked example for the agent/CLI workload from the issue. - advanced/model-configuration.md: mark the legacy `prompt_cache_path` / `prompt_cache_all` / `prompt_cache_ro` YAML fields as unused by the llama-cpp gRPC backend (they target upstream's CLI completion tool and are not consumed by grpc-server.cpp) and point readers at the new prompt-cache explainer. Closes #9921 Assisted-by: claude:opus-4.7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 16:31:48 +02:00
LocalAI [bot]	4c234abc2c	refactor(agents): bump skillserver, drop redundant Name from list_skills output (#9916 ) refactor(agents): bump skillserver, drop redundant Name from list_skills/search_skills skillserver's list_skills MCP tool used to ship every entry with name="" (field was commented out), while search_skills populated it - two tools with inconsistent shape for the same data. skill.Name and skill.ID are populated from the same source string anyway (the directory name), so returning both was pure duplication. Bumps github.com/mudler/skillserver to a7317cb, which drops the Name field from both SkillInfo and SearchResult and leaves ID as the single canonical identifier (already what read_skill consumes). Adds core/services/skills/skills_mcp_test.go, a regression that drives the LocalAI FilesystemManager through an in-process MCP session and asserts a newly-created skill is visible by ID on the still-open session. This is a cleanup, not the root cause of #9868 - the reporter likely sees something deeper than a cosmetic JSON shape issue. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 14:45:53 +02:00
Richard Palethorpe	c68818a62e	fix(llama-cpp): terminate tensor_buft_overrides with sentinel (#9919 ) llama.cpp's model loader asserts back().pattern == nullptr on params.tensor_buft_overrides (and on params.kv_overrides.back().key[0] == 0) before binding them into llama_model_params. PR #8560 attempted to satisfy llama_params_fit's placeholder requirement by pre-filling params.tensor_buft_overrides up to llama_max_tensor_buft_overrides() before the option-parse loop. Any subsequent push_back from override_tensor / draft_cpu_moe / draft_n_cpu_moe / draft_override_tensor then appended real entries after the placeholders, leaving back() with a real pattern and tripping the assert. The draft override vector likewise had no terminator at all. Mirror upstream common/arg.cpp:645-658 instead: real entries are pushed during option parsing, and after parsing we pad the main vector up to ntbo (placeholders land at the end, so back() is always nullptr) and append a single {nullptr, nullptr} to the draft vector when it is non-empty. The existing kv_overrides terminator block already matches upstream and stays. Verified against ggml-org/llama.cpp@5cbaa5e: only tensor_buft_overrides (main + draft) and kv_overrides are sentinel-terminated common_params fields; everything else is size-driven std::vector. Assisted-by: claude-code:claude-opus-4-7 Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-21 12:55:06 +02:00
LocalAI [bot]	11d5bd0cc3	fix(react-ui/chat): stop wiping selection on every /api/operations poll (#9904 ) (#9917 ) useOperations() was calling setOperations() with a fresh array on every 1s poll, even when the payload was identical. In React 19 the DOM diff no longer short-circuits dangerouslySetInnerHTML on equal __html, so the forced Chat re-render re-assigned innerHTML on every assistant message once per second — wiping any text the user had selected. Skip the state update when the serialised operations payload is unchanged, and switch loading/error to functional setters so they also short-circuit at the source. Also fixes the chat copy button on plain HTTP: navigator.clipboard is undefined in non-secure contexts (a common LXC+Docker deployment), but the previous code called it unconditionally and showed a success toast regardless. Routed Chat, AgentChat and CanvasPanel through a new copyToClipboard() helper that uses navigator.clipboard when available and falls back to a hidden-textarea + execCommand('copy') trick that browsers still honour outside secure contexts. The fallback preserves the user's existing selection. Regression coverage in e2e/chat-polling-selection.spec.js: a MutationObserver counts mutations on the assistant content node across 3s of polling (must be 0); the copy test stubs out navigator.clipboard and asserts that execCommand('copy') is invoked. Assisted-by: claude-opus-4-7-1m Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-21 12:17:51 +02:00
LocalAI [bot]	12e056e96d	chore: ⬆️ Update ggml-org/llama.cpp to `ad277572619fcfb6ddd38f4c6437283a4b2b8636` (#9915) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-21 09:07:31 +02:00
LocalAI [bot]	308aa8908a	chore: ⬆️ Update ace-step/acestep.cpp to `ed53caf164e4492a5620b2e3f2264629cf66da24` (#9913) ⬆️ Update ace-step/acestep.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-21 00:15:57 +02:00
LocalAI [bot]	b2d68a53a2	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `11a1fea9e291f12ce2c803a9d7812c30ca806bcf` (#9914) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 22:04:06 +00:00
LocalAI [bot]	e3706c0512	chore(model-gallery): ⬆️ update checksum (#9910) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 23:38:45 +02:00
LocalAI [bot]	1ffd82a050	chore: ⬆️ Update antirez/ds4 to `2606543be7a8c125a32cee37f5d1d85dc78f2fcf` (#9909) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 21:22:26 +00:00
LocalAI [bot]	f515168dbe	chore(acestep-cpp): bump pin to ed53caf and adapt wrapper to new API (#9908) The new ace-step.cpp revision moves backend initialization inside each `_load` call and drops the separate `DiTGGMLConfig` argument from `dit_ggml_load` (config now lives in `DiTGGML::cfg`, populated from GGUF metadata at load time). Drop the now-removed `_init_backend` calls and replace `g_dit_cfg` accesses with `g_dit.cfg`. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-20 21:05:32 +00:00
LocalAI [bot]	ef6ca34513	chore: ⬆️ Update leejet/stable-diffusion.cpp to `5b0267e941cade15bd80089d89838795d9f4baa6` (#9907) Adapt the C++ wrapper to the new `generate_video()` signature: upstream now returns `bool` and writes frames/audio via out-parameters (`sd_image_t`, `sd_audio_t`). Also set `p->fps` on the params struct (new upstream field) and free the returned audio handle on both the success and error paths. Assisted-by: claude-code:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-20 20:53:19 +00:00
dependabot[bot]	9413c3767f	chore(deps): update transformers requirement from >=5.8.0 to >=5.8.1 in /backend/python/transformers (#9883) chore(deps): update transformers requirement Updates the requirements on [transformers](https://github.com/huggingface/transformers) to permit the latest version. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v5.8.0...v5.8.1) --- updated-dependencies: - dependency-name: transformers dependency-version: 5.8.1 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 22:16:02 +02:00
dependabot[bot]	3bf3cce232	chore(deps): bump sentence-transformers from 5.4.0 to 5.5.0 in /backend/python/transformers (#9888) chore(deps): bump sentence-transformers in /backend/python/transformers Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.4.0 to 5.5.0. - [Release notes](https://github.com/huggingface/sentence-transformers/releases) - [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.4.0...v5.5.0) --- updated-dependencies: - dependency-name: sentence-transformers dependency-version: 5.5.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-20 22:13:39 +02:00
LocalAI [bot]	06f8159035	chore: ⬆️ Update ggml-org/llama.cpp to `67ace021da905e27ecbdf1176b0eef578a5288c0` (#9897) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 22:05:58 +02:00
LocalAI [bot]	f6a73f54fa	feat(swagger): update swagger (#9872) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 22:05:35 +02:00
LocalAI [bot]	24e04d8e81	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `77413bc900f9a2bfd8a5407f184427bcc0825f6c` (#9899) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 01:02:53 +02:00
LocalAI [bot]	b9a49449ae	chore: ⬆️ Update ggml-org/whisper.cpp to `afa2ea544fb4b0448916b4a31ecd33c8685bd482` (#9898) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 01:02:25 +02:00
LocalAI [bot]	1879e11042	chore: ⬆️ Update antirez/ds4 to `599e49d253971451f710cb8323344e789906ed6c` (#9900) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 01:01:45 +02:00
LocalAI [bot]	403d391316	chore(model-gallery): ⬆️ update checksum (#9901) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-20 01:01:20 +02:00
Daniel Liljeberg	fc3980dadd	fix: inject text-file content into chat completions messages (#9896 ) Non-image/non-audio file attachments (txt, md, csv, json) were being stored in the 'files' metadata field but never added to the message content array sent to /v1/chat/completions. Images and audio correctly received content blocks; files did not. Fix: push a text content block into messageContent when textContent is present, matching the pattern used for image_url and audio_url. Also fixes Home.jsx addFiles which never called file.text() at all, meaning files attached on the home screen had empty textContent even before reaching useChat.js. Note: PDF files use file.text() which returns raw bytes rather than parsed text. Proper PDF support would require PDF.js or server-side extraction and is not part of this fix. Signed-off-by: Daniel Liljeberg <damien_@hotmail.com>	2026-05-20 01:00:32 +02:00
Richard Palethorpe	2009544b44	fix(nix): correct flake src path and add dev shell (#9894 ) The flake set `src = ./sources;` referencing a non-existent subdirectory, so `nix build` and `nix develop` both failed evaluation. Point `src` at the repo root and refresh `vendorHash` accordingly. Add `devShells.default` with the Go toolchain, protobuf generators, Node.js/bun for the React UI (`make react-ui`), and the linters used by `make lint` (golangci-lint, gofumpt, goimports, staticcheck). Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-19 19:28:30 +02:00
dependabot[bot]	e859345b12	chore(deps): bump github.com/alecthomas/kong from 1.14.0 to 1.15.0 (#9881 ) Bumps [github.com/alecthomas/kong](https://github.com/alecthomas/kong) from 1.14.0 to 1.15.0. - [Commits](https://github.com/alecthomas/kong/compare/v1.14.0...v1.15.0) --- updated-dependencies: - dependency-name: github.com/alecthomas/kong dependency-version: 1.15.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-19 08:07:07 +02:00
dependabot[bot]	f30712f8e8	chore(deps): bump github.com/aws/aws-sdk-go-v2 from 1.41.6 to 1.41.7 (#9892 ) Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.41.6 to 1.41.7. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.41.6...v1.41.7) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2 dependency-version: 1.41.7 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-19 08:06:50 +02:00
dependabot[bot]	a19c77c5f8	chore(deps): bump github.com/onsi/ginkgo/v2 from 2.28.2 to 2.29.0 (#9882 ) Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.28.2 to 2.29.0. - [Release notes](https://github.com/onsi/ginkgo/releases) - [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md) - [Commits](https://github.com/onsi/ginkgo/compare/v2.28.2...v2.29.0) --- updated-dependencies: - dependency-name: github.com/onsi/ginkgo/v2 dependency-version: 2.29.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-19 08:06:34 +02:00
LocalAI [bot]	4b02d23c0c	chore: ⬆️ Update ggml-org/llama.cpp to `5cbaa5e69e09bde3334cd8c355570553a0dca027` (#9876 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-19 08:06:16 +02:00
LocalAI [bot]	21140e96b2	chore: ⬆️ Update ggml-org/whisper.cpp to `47b9eb37a33c5031a1b667ace64477330b9f36c1` (#9877 ) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-19 08:05:56 +02:00
dependabot[bot]	fc803e8d48	chore(deps): bump golang.org/x/crypto from 0.50.0 to 0.51.0 (#9886 ) Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.50.0 to 0.51.0. - [Commits](https://github.com/golang/crypto/compare/v0.50.0...v0.51.0) --- updated-dependencies: - dependency-name: golang.org/x/crypto dependency-version: 0.51.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-19 08:04:15 +02:00
LocalAI [bot]	ca51606bfe	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `40aae0b6d86d50c0ee7011b3ce59a233203e430a` (#9875 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-19 08:01:41 +02:00
Azteczek	cb502de309	feat: add flake.nix for dockerless setup (#9851 ) * Add flake.nix Signed-off-by: Azteczek <243776410+Azteczek@users.noreply.github.com> * Add flake.lock Signed-off-by: Azteczek <243776410+Azteczek@users.noreply.github.com> --------- Signed-off-by: Azteczek <243776410+Azteczek@users.noreply.github.com>	2026-05-18 15:23:10 +01:00
Richard Palethorpe	5d0b549049	feat(gallery): verify backend OCI images with keyless cosign (#9823 ) * feat(gallery): verify backend OCI images with keyless cosign Close a trust gap where a registry compromise or MITM could silently replace a backend image: the gallery YAML tells LocalAI which image to pull, but until now nothing verified the bytes came from our CI. Consumer (pkg/oci/cosignverify): - New package using sigstore-go to verify keyless-cosign signatures. - OCI 1.1 referrers API + new bundle format (no legacy :tag.sig). - Policy fields: Issuer / IssuerRegex / Identity / IdentityRegex / NotBefore. NotBefore is the revocation lever — keyless Fulcio certs are ephemeral so revocation is policy-side; advancing not_before in the gallery YAML invalidates every signature predating the cutoff. - TUF trusted root cached process-wide so N backends from one gallery do 1 fetch, not N. Plumbing: - pkg/downloader: ImageVerifier interface + WithImageVerifier option threaded through DownloadFileWithContext. Verification runs between oci.GetImage and oci.ExtractOCIImage, with digest pinning via pinnedImageRef to close the TOCTOU window. Skips the verifier's HEAD when the ref is already digest-pinned. - core/config: Gallery.Verification YAML block. - core/gallery: backendDownloadOptions builds the verifier from the policy; applied on initial URI, mirrors, and tag fallbacks. - core/gallery/upgrade: the upgrade path now routes through the same options builder. A regression Ginkgo spec pins this contract — without it, UpgradeBackend silently bypassed verification. - core/cli: --require-backend-integrity (LOCALAI_REQUIRE_BACKEND_INTEGRITY) escalates missing policy / empty SHA256 from warn to hard-fail. Producer (.github/workflows/backend_merge.yml): - id-token: write at job scope (PR-fork-safe via existing event gate). - sigstore/cosign-installer@v3 pinned to v2.4.1. 
- After each docker buildx imagetools create, resolve the manifest list digest and run cosign sign --recursive --new-bundle-format --registry-referrers-mode=oci-1-1 against repo@digest. --recursive signs the index and every per-arch entry, matching how the consumer resolves a tag to a platform-specific manifest before verifying. Rollout: backend/index.yaml has no `verification:` block yet, so this PR is backward-compatible — installs proceed with a warning until the gallery is populated. Strict mode is opt-in. Assisted-by: claude-code:claude-opus-4-7 [Bash] [Edit] [Read] [Write] [WebSearch] [WebFetch] Signed-off-by: Richard Palethorpe <io@richiejp.com> * refactor(gallery): plumb RequireBackendIntegrity through config instead of env The previous implementation re-exported the --require-backend-integrity CLI flag into LOCALAI_REQUIRE_BACKEND_INTEGRITY via os.Setenv, then re-read it in core/gallery via os.Getenv. This leaked process state into the gallery package and made the flag impossible to override per-call or test without touching the env. Add RequireBackendIntegrity to ApplicationConfig (with a matching WithRequireBackendIntegrity AppOption) and thread the bool through every install/upgrade path: InstallBackend, InstallBackendFromGallery, UpgradeBackend, InstallModelFromGallery, InstallExternalBackend, ApplyGalleryFromString/File, startup.InstallModels. Worker subcommands gain the same env-bound flag on WorkerFlags so distributed-worker installs honor it consistently with the worker daemon path. Add a forbidigo lint rule against os.Getenv / os.LookupEnv / os.Environ to keep the env-leak pattern from creeping back. Existing offenders (p2p, config loaders, etc.) are baseline-grandfathered by the existing new-from-merge-base: origin/master setting; targeted path exclusions cover the legitimate cases — kong CLI entry points, backend subprocesses, system capability probes, gRPC AUTH_TOKEN inheritance, test gating env vars. 
Assisted-by: claude-code:claude-opus-4-7 Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-18 08:02:20 +02:00
LocalAI [bot]	11cff1b309	chore: ⬆️ Update ggml-org/llama.cpp to `87589042cac2c390cec8d68fb2fad64e0a2a252a` (#9855 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-18 08:01:30 +02:00
LocalAI [bot]	4ca3d2cdc0	docs: ⬆️ update docs version mudler/LocalAI (#9863 ) ⬆️ Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-17 23:20:16 +02:00
LocalAI [bot]	3cba35ed32	chore: ⬆️ Update antirez/ds4 to `c9dd9499bfa57c1bbfbb4446eff963330ab5329b` (#9864 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-17 23:19:58 +02:00
LocalAI [bot]	265ae35231	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `c35189d83c91aad780aba62b89f2830cb2916223` (#9866 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-17 23:19:43 +02:00
LocalAI [bot]	6a48157a80	chore: ⬆️ Update leejet/stable-diffusion.cpp to `bd17f53b7386fb5f60e8587b75e73c4b2fed3426` (#9854 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> v4.2.6	2026-05-16 23:12:05 +02:00
LocalAI [bot]	41c838b2df	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `3e573cfea6e0a332eff822ffbdb1dd3b112e9051` (#9856 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-16 22:44:08 +02:00
LocalAI [bot]	21e793ad2a	chore: ⬆️ Update antirez/ds4 to `ef0a4905d05263df8e63689f2dd1efac618a752c` (#9857 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-16 22:43:46 +02:00
LocalAI [bot]	7c190bb4b9	docs: ⬆️ update docs version mudler/LocalAI (#9853 ) ⬆️ Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-16 22:43:06 +02:00

6422 Commits