LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-25 00:59:28 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	46b76cb4ac	test(http): cover parseForwarded edge cases; clarify base-url flag group Adds direct unit coverage for quoted/malformed/multi-element Forwarded headers and regroups the external base URL flag away from auth-only. Refs #10482 Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 22:14:25 +00:00
Ettore Di Giacinto	15c7ce059a	docs: document LOCALAI_BASE_URL and reverse-proxy headers Refs #10482 Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 22:08:58 +00:00
Ettore Di Giacinto	975b54dfc5	feat(config): generalize LOCALAI_BASE_URL to ExternalBaseURL LOCALAI_BASE_URL now sets a single instance-wide external base URL used for OAuth callbacks and all self-referential links. A Pre middleware stamps it into the request context for middleware.BaseURL. Refs #10482 Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 22:03:56 +00:00
Ettore Di Giacinto	2eec8bfeb9	feat(http): honor explicit external base URL in BaseURL When _external_base_url is set in the request context it dictates the origin (scheme+host+port); the proxy path prefix is still appended. Refs #10482 Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 21:58:21 +00:00
Ettore Di Giacinto	d9feac54dc	fix(http): harden BaseURL proxy scheme/host detection Split comma-separated X-Forwarded-Proto and honor the RFC 7239 Forwarded header so generated links use https behind common reverse-proxy setups. Refs #10482 Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 21:57:43 +00:00
LocalAI [bot]	5c3d48ab50	feat(ui): usage & UX enhancements (last-used model, polling, starter models, usage cost, a11y) (#10496) * feat(ui): remember last-used model per capability ModelSelector auto-selected the first option whenever the bound value was empty or stale, so every visit to the Home chat box, Image, TTS or Talk pages reset the choice to whatever sorted first. Persist the user's pick in localStorage keyed by capability and prefer it on auto-select when the model is still available, falling back to the first option otherwise. Because every modality picker funnels through ModelSelector, this fixes the friction everywhere at once. External-options callers pass no capability and keep the previous first-item behaviour. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): add visibility-aware polling hook The app had 26 hand-rolled setInterval polls, none of which paused when the browser tab was hidden, so backgrounded dashboards kept hitting the server every few seconds for data nobody was looking at. Add usePolling: runs immediately, polls on a fixed interval, pauses while document.hidden, fires a catch-up poll on return, and guards against overlapping slow requests. Route useResources (the highest-frequency shared poll) through it. Further callers can be migrated incrementally. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): hardware-aware starter models on empty home A fresh install dropped admins straight into a 1000+ model gallery with no guidance. Add a StarterModels widget to the empty-state wizard that recommends a small, curated set tuned to the detected hardware: - CPU-only machines (no GPU VRAM) are steered to genuinely small models (1-4B, Q4) that stay responsive without a GPU. - GPU machines get suggestions scaled to available VRAM. Curated names are real gallery entries, intersected against the live gallery at render time so a trimmed/custom gallery degrades gracefully. Install is one click via the existing model-install API. Also routes Home's cluster and system-info polls through usePolling so a backgrounded home page stops fetching. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): optional token-cost estimates on usage dashboard The usage dashboard tracked tokens but had no monetary view. Multi-user deployments that bill back or budget compute had to export and compute cost elsewhere. Add an opt-in pricing control: admins set $ per 1M prompt/completion tokens (stored per-browser). When set, an estimated-cost summary card and per-model / per-user cost columns appear, computed from recorded token counts. The entire cost surface stays hidden until a price is entered, so the default view is unchanged. Cost is clearly labelled an estimate - LocalAI itself has no notion of price. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * fix(ui): label icon-only send buttons for screen readers The chat and agent-chat send buttons were a bare paper-plane icon with no accessible name, so screen readers announced only "button". Add an aria-label/title ("Send message") and mark the icon aria-hidden. An audit of all icon-only buttons found these were the only two unlabeled controls; the rest already carry visible text. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 23:30:08 +02:00
LocalAI [bot]	764b0352b9	docs: ⬆️ update docs version mudler/LocalAI (#10491) ⬆️ Update docs version mudler/LocalAI Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 23:18:24 +02:00
LocalAI [bot]	75ba2daba1	chore(model-gallery): ⬆️ update checksum (#10495) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 23:18:04 +02:00
LocalAI [bot]	62b14fd635	feat(backends): add darwin/metal build for liquid-audio (#10486 ) * feat(backends): add darwin/metal build for liquid-audio Wire the already-MPS-ready liquid-audio backend (it ships requirements-mps.txt) into the darwin CI matrix and the gallery so metal-darwin-arm64 images are built and selectable. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4.8 [Claude Code] * ci(liquid-audio): trigger darwin build via requirements-mps note The changed-backends path filter only builds a backend when a file under its directory changes. The metal wiring lived in index.yaml + the matrix, so the darwin job was skipped. Add a documenting comment to the MPS requirements so CI actually exercises the darwin build. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4.8 [Claude Code] * fix(liquid-audio): guard uv-only --index-strategy for the pip/darwin path Same fix as trl: the darwin/MPS build installs with pip (USE_PIP=true), which rejects the uv-only --index-strategy flag and failed the darwin backend build. Add it only on the uv path; Linux/CUDA resolution is unchanged. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4.8 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 23:16:27 +02:00
LocalAI [bot]	193d0e6aef	fix(backends): darwin/metal support for supertonic (#10488 ) The supertonic Go TTS backend dlopens ONNX Runtime, but its runtime and packaging scripts were Linux-only: run.sh exported LD_LIBRARY_PATH, pointed ONNXRUNTIME_LIB_PATH at libonnxruntime.so, and always tried the ld.so exec path, while package.sh hard-failed on any non-Linux host. On macOS dyld has no ld.so loader, uses DYLD_LIBRARY_PATH, and ONNX Runtime ships as a .dylib. This applies the same purego .dylib/DYLD_LIBRARY_PATH fix that PR #10481 landed for 15 other ONNX/purego backends (sherpa-onnx, silero-vad, etc.) but which omitted supertonic: - run.sh: on darwin export DYLD_LIBRARY_PATH and point ONNXRUNTIME_LIB_PATH at libonnxruntime.dylib; guard the ld.so exec path to Linux only. - package.sh: recognize Darwin instead of erroring out; the bundled .dylib is resolved via DYLD_LIBRARY_PATH, no glibc/ld.so to bundle. - helper.go: platform-native default library extension (dylib on darwin) for the last-resort dlopen fallback. It also wires the darwin CI build and gallery entries, resolving the inconsistency where backend/index.yaml advertised metal for supertonic but no includeDarwin matrix entry built the image: - .github/backend-matrix.yml: add the -metal-darwin-arm64-supertonic Go entry. - backend/index.yaml: declare metal capabilities and add the concrete metal-supertonic / metal-supertonic-development child entries. The Makefile already detects Darwin/osx/arm64 and stages the per-OS ONNX Runtime tarball, mirroring sherpa-onnx, so no Makefile change is required. Assisted-by: Claude:opus-4.8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 22:19:03 +02:00
LocalAI [bot]	482314c623	fix(realtime): resolve model aliases for pipeline sub-models (#10484 ) Realtime pipeline sub-models (llm/transcription/tts/vad/sound-detection) were loaded via cl.LoadModelConfigFileByName without alias resolution, unlike top-level API requests which resolve aliases in core/http/middleware/request.go. So a pipeline that references an alias (e.g. `pipeline.llm: default`, where `default` is an alias for a real LLM) reached model loading as the alias stub with an empty Backend. This was silently broken on a single host (it failed downstream) and a hard error in distributed/p2p mode: routing model : loading model default: ... installing backend on node X: backend name is empty Fix by routing every pipeline sub-model load through a small helper that follows a single alias hop (mirroring the top-level resolution), so non-alias sub-models behave identically and aliased ones get the target's full config (Backend, Model, ...). Assisted-by: Claude:claude-opus-4-8 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 21:50:44 +02:00
Dedy F. Setyawan	e8ae88a2a0	i18n(id): update and complete Indonesian translations (#10480 ) - translate remaining English strings in chat, common, home, and media locales. - fix typo and improve wording consistency (e.g., klaster -> kluster, otomasi -> automasi). Signed-off-by: Dedy F. Setyawan <dedyfajars@gmail.com>	2026-06-24 18:35:21 +02:00
Richard Palethorpe	e1994579f8	fix(pii): load default detectors at startup + add LOCALAI_PII_DEFAULT_DETECTORS (#10474 ) pii_default_detectors was applied to the live config only by a live POST /api/settings (ApplyRuntimeSettings) — neither the startup loader nor the config file watcher read it back. So after a restart the persisted default detectors were dropped, and the cloud-proxy MITM listener (which resolves each intercept host's detectors once at start via ResolvePIIPolicy) came up with an empty set and forwarded intercepted traffic unredacted, even though the MITM model had pii.enabled:true and the defaults were on disk. Request-side default redaction broke the same way. - startup.go: loadRuntimeSettingsFromFile now applies pii_default_detectors, before startMITMIfConfigured, with env > file precedence. - config_file_watcher.go: apply pii_default_detectors on live file edits, matching the existing env-guard pattern used for the other fields. - settings endpoint: rebuild the MITM listener when pii_default_detectors changes (its per-host detector map is frozen at listener start), not only on a mitm_listen change — so toggling a default detector takes effect on cloud-proxy traffic immediately. - new LOCALAI_PII_DEFAULT_DETECTORS env var / CLI flag (WithPIIDefaultDetectors) so the default detector set can be pinned at boot for immutable deployments. Assisted-by: Claude:claude-opus-4-8 Claude-Code Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-06-24 11:08:57 +02:00
LocalAI [bot]	e5620989dd	refactor(distributed): make in-flight tracking coverage a compile-time contract (#10476 ) PR #10475 fixed SoundDetection in-flight tracking, but the underlying trap remains: InFlightTrackingClient embedded the whole grpc.Backend interface "for passthrough of untracked methods", so any newly added inference method is silently satisfied by the embedded passthrough and never wrapped with track(). That leaves onFirstComplete unfired and in-flight stuck at 1 - the exact SoundDetection bug, waiting to recur for the next backend method. Close the gap at the type level instead of relying on reviewers to remember: - Split grpc.Backend into two composed sub-interfaces: InferenceBackend (methods that are one discrete inference call and must be tracked) and ControlBackend (control-plane calls plus the streaming constructors whose work spans the returned stream, safe to pass through). The classification now lives next to the interface it documents. - InFlightTrackingClient embeds only grpc.ControlBackend and implements every InferenceBackend method explicitly, delegating to an inner InferenceBackend. A `var _ grpc.Backend = (*InFlightTrackingClient)(nil)` assertion makes the package fail to compile if any inference method is left unwrapped. Now adding a method to InferenceBackend is a build error (at the assertion and every call site: "does not implement grpc.Backend (missing method X)"), not a silent runtime leak - and the obvious fix is to copy a neighbouring wrapper, which calls track(). No runtime guard or reviewer vigilance required. Pure refactor: the composed Backend interface is identical to the old flat one, so all implementers and consumers are unaffected (verified with a full `go build ./...`). Behaviour is unchanged; the existing nodes suite passes. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 11:08:29 +02:00
LocalAI [bot]	fc618dcee6	fix(distributed): track in-flight for SoundDetection requests (#10475 ) The distributed router wraps backend clients in InFlightTrackingClient so the eviction logic knows which replicas are actively serving. Every inference method must be wrapped: track() increments in-flight on entry and decrements (plus fires onFirstComplete, which releases the load-time reservation) on return. SoundDetection was added after the tracking client and never got a wrapper, so its calls fell through to the embedded passthrough Backend. The increment/decrement never ran and, critically, onFirstComplete never fired, so the reservation set at model load was never released - leaving in-flight stuck at 1 and the replica permanently ineligible for eviction. Wrap SoundDetection like the other non-LLM methods and cover it in the "non-LLM inference methods track in-flight" table test. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 10:13:37 +02:00
LocalAI [bot]	e6042080c0	fix(agents): URL-decode collection/agent name path params (#10443 ) (#10471 ) fix(agents): URL-decode collection/agent name path params Collection and agent names carry a "legacy-api-key:" prefix, so the ':' arrives percent-encoded as %3A in the request path. Echo routes such paths via URL.RawPath and stores the matched path-param value still escaped, so c.Param("name") returned "legacy-api-key%3ALiteraryResearch" and the store lookup 404'd ("collection not found"). This was second-order fallout of #10375/#10387: once colons became valid in names, the URL-decode gap surfaced on every name-bearing endpoint. Add a decodedParam helper that url.PathUnescape's the param (falling back to the raw value on invalid encoding) and wire it into all collection endpoints and the agent :name endpoints, which share the identical prefix. The entry endpoints already unescaped c.Param("*"); this closes the same gap for :name. Fixes #10443 Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-24 09:42:09 +02:00
LocalAI [bot]	0f3b24436d	chore: ⬆️ Update mudler/parakeet.cpp to `89f5e2977b4d8bccd45e7bcc6f2ef7c4ed49e89a` (#10468) ⬆️ Update mudler/parakeet.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 09:41:43 +02:00
LocalAI [bot]	4b6f911835	chore: ⬆️ Update ggml-org/whisper.cpp to `43d78af5be58f41d6ffbc227d608f104577741ea` (#10466) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 09:41:14 +02:00
LocalAI [bot]	a5e28942a6	chore: ⬆️ Update ggml-org/llama.cpp to `be4a6a63eb2b848e19c277bdcf2bd399e8af76d9` (#10467) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 09:40:54 +02:00
LocalAI [bot]	dba9cd7ca4	chore: ⬆️ Update CrispStrobe/CrispASR to `96b2a6ee31d30389fed8a7ef1a54239b75231ddc` (#10465) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 09:40:34 +02:00
LocalAI [bot]	c93190de50	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `7ccf1d209588962b96eacca325b37e9b3e8faf5e` (#10456) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 09:40:13 +02:00
LocalAI [bot]	4dbf69f889	chore(model gallery): 🤖 add 1 new models via gallery agent (#10472) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-24 00:00:26 +02:00
LocalAI [bot]	deb430f3ec	chore(model-gallery): ⬆️ update checksum (#10469) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> [tag: v4.5.0]	2026-06-23 23:15:47 +02:00
LocalAI [bot]	dd8c8778e2	chore(model gallery): 🤖 add 1 new models via gallery agent (#10464) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 15:43:21 +02:00
LocalAI [bot]	06a7b6cadb	chore: ⬆️ Update leejet/stable-diffusion.cpp to `f440ad9c29dd8bc34e5d1f4b863832b96d6ea05f` (#10457) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 13:29:07 +02:00
LocalAI [bot]	67c8889866	chore: ⬆️ Update CrispStrobe/CrispASR to `63b57289255267edf66e43e33bc3911e04a2e92d` (#10455) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 13:28:49 +02:00
LocalAI [bot]	1d49041c85	chore: ⬆️ Update ggml-org/llama.cpp to `73618f27a801c0b8614ceaf3547d3c2a99baae14` (#10458) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 13:28:09 +02:00
LocalAI [bot]	2edc4e25b3	chore: ⬆️ Update ggml-org/whisper.cpp to `bae6bc02b1940bbfb87b6a0299c565e563b916d1` (#10459) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 13:27:51 +02:00
Richard Palethorpe	7888067914	fix(settings): merge partial /api/settings updates instead of overwriting (#10463 ) POST /api/settings rebuilt runtime_settings.json from only the request body, so a focused admin page that submits a single field wiped every other persisted setting. The Middleware proxy tab (mitm_listen) and detector table (pii_default_detectors), plus the MCP SetBranding tool (instance_name/instance_tagline), all POST partial bodies; the no-omitempty api_keys and pii_default_detectors fields even round-tripped as JSON null. Read the persisted settings and overlay only the fields the request set (RuntimeSettings.MergeNonNil) before writing. Every field is a pointer, so the reflection-based merge is total over the struct and any field added later is preserved automatically. Absent or null fields are now kept; clearing a setting is done by sending its explicit empty/zero value (api_keys [], mitm_listen "", etc.), unchanged from before. The full Settings page sends every field, so its Save behaves identically. Assisted-by: Claude:claude-opus-4-8 Claude-Code Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-06-23 13:27:34 +02:00
LocalAI [bot]	9eedbf537a	chore(model gallery): 🤖 add 1 new models via gallery agent (#10461) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-23 08:04:46 +02:00
LocalAI [bot]	69c16481c8	fix(test): update e2e UpdateProgress calls for new cancellable arg (#10460 ) PR #10454 added a `cancellable bool` parameter to GalleryStore.UpdateProgress but missed two callers under tests/e2e/distributed, breaking the build on master (golangci-lint and tests-e2e-backend both failed to compile with "not enough arguments in call to ... UpdateProgress"). Pass cancellable=true (both ops are downloading installs, which are cancellable) and assert the flag is persisted, exercising the new behavior. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 23:45:22 +02:00
LocalAI [bot]	56f8a6623f	fix(galleryop): persist cancellable so restarted in-flight ops stay cancellable (#10454 ) In distributed mode a model/backend install marks OpStatus.Cancellable=true while downloading, but the gallery_operations row never recorded it: UpdateStatus persisted only progress/status and Create left the cancellable column at its zero value. After a replica restart Hydrate rebuilt the op with cancellable=false, /api/operations reported false, and the UI hid the cancel button - the orphaned op then lingered until the 30-minute stale reaper expired it ("stays there on restart, can't cancel, after a bit it expires"). Persist the flag on every progress tick and at row creation (installs are cancellable, deletes are not), and clear it on terminal transitions. A rehydrated in-flight op is now cancellable, so an admin can dismiss the orphaned op immediately instead of waiting out the reaper. The functional cancel path already survived restart (CancelOperation persists store.Cancel even with no live CancelFunc); this restores the UI affordance that drives it. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 22:41:16 +02:00
Ettore Di Giacinto	4755d676a3	Revert "feat(ui): role and deployment-mode adaptive UI (landing, sidebar, top navbar)" (#10453) Revert "feat(ui): role and deployment-mode adaptive UI (landing, sidebar, top…" This reverts commit `9d54a599b0`.	2026-06-22 21:59:05 +02:00
dependabot[bot]	10184b5e28	chore(deps): bump actions/checkout from 6 to 7 (#10451 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 6 to 7. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v6...v7) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-22 21:38:37 +02:00
LocalAI [bot]	fdf475ec5f	feat(realtime): conversation compaction (summarize-then-drop) + OpenAI item.delete/truncate/clear (#10446) * feat(realtime): add pipeline.compaction config + resolution Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(realtime): extract itemID helper, reuse in item.retrieve Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(realtime): drop duplicate Ginkgo bootstrap, fold specs into openai suite Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): implement conversation.item.delete Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): implement input_audio_buffer.clear Add a handler for the input_audio_buffer.clear client event that discards a partially-captured utterance (raw PCM + buffered Opus frames) via a unit-tested clearInputAudio helper, then acks with input_audio_buffer.cleared. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): implement conversation.item.truncate (text) Clears both .Text and .Transcript of the assistant content part at contentIndex so barge-in truncation also works for audio turns whose spoken words live in .Transcript. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): add Conversation.Memory + pair-safe compactionCut Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(realtime): compactionCut returns 0 for keep<=0 (no-cap sentinel, avoids panic) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * style(realtime): gofmt compaction test helper closures Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): inject rolling memory into the prompt + summary builders Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): server-side summarize-then-drop compactor Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(realtime): unit-test prefixMatches eviction-safety predicate Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(realtime): resolve summarizer model + schedule compaction per turn Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(realtime): document conversation compaction + new item events Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(realtime): resolve summary model inside compaction goroutine (lazy, off-path) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(realtime): reuse reasoning.ExtractReasoningComplete for summary stripping Replace the bespoke <think> regex in the compactor with the shared pkg/reasoning extractor (via spokenReasoningConfig), matching the rest of the realtime path and covering all reasoning tag families, not just <think>. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(config): register pipeline.compaction fields in meta registry TestAllFieldsHaveRegistryEntries requires every ModelConfig field to have a UI/meta registry entry; add the four pipeline.compaction.* leaves so they render with proper labels/descriptions instead of the reflection fallback. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 21:28:49 +02:00
LocalAI [bot]	9d54a599b0	feat(ui): role and deployment-mode adaptive UI (landing, sidebar, top navbar) (#10449 ) * feat(ui): add shared DeploymentContext (features + p2p signal) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(ui): extract launchAssistantChat shared helper Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): role/mode-aware landing redirect at /app Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): pin Cluster group and collapse Create for cluster admins Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): desktop top navbar with mode pill and admin-via-chat jump Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): admin token-usage meter in the top navbar Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ui): top-navbar breakpoint handoff + assistant jump from chat page M1: the desktop .top-navbar was hidden at max-width 768px while the .mobile-header only appears at max-width 639px, leaving 640-768px with neither bar so admins lost the mode pill, token meter and admin-via-chat jump. Hide the top bar at 639px instead so it covers every width the rail sidebar is shown and hands off to the mobile-header exactly at 639px. M2: the navbar 'Admin via chat' button wrote localStorage and called navigate('/app/chat'), but when already on the chat page Chat does not remount so its mount-time payload reader never fired and the click was a no-op until reload. The payload consume logic is factored into a shared callback; the launcher now dispatches a localai-open-assistant event that the mounted Chat listens for to re-consume the payload. Mount behavior is unchanged. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 21:27:43 +02:00
Richard Palethorpe	63bcbf6c12	fix(pii): post-merge review fixes + live NER e2e for the privacy-filter tier (#10401 ) * fix(pii): post-merge review fixes + live NER e2e for the privacy-filter tier Follow-up to the NER tier engine (#10360), already on master. This carries only the incremental review fixes and tests that postdate that merge — the feature itself is not re-introduced. Review fixes: - openai_completion.go: remove the dead `elem >= 0` conjunct in applyAnyText (the `elem < 0` guard above already returns). - application.go: collapse ResolvePIIPolicy's inline re-implementation of PIIIsEnabled to a single cfg.PIIIsEnabled() call (sole source of the "explicit pii.enabled wins, else cloud-proxy default" rule) and return true past the !enabled guard where it is provable. - pattern.go: hoist the triple `appConfig != nil && EnableTracing` check in patternDetector.Detect into one local. - grammar.go: MaxQuantifier was 4096, but Go's regexp/syntax rejects repeat bounds above 1000 at Parse time, so walk()'s {n,m} guard could never fire — dead code shadowed by the parser. Lower it to 512 so a bound in (512,1000] is rejected here with an actionable error; >1000 still fails closed via Parse. Specs pin the relationship so the guard can't silently revert. - PatternListEditor.jsx: clamp a directly-typed negative min_len to >=0 and force the DOM value back when clamping (min={0} only constrained the spinner, so a negative reached saved config and silently disabled the length filter). Tests: - piipattern_test.go: MaxQuantifier guard specs (must stay live, not dead). - model-config.spec.js: assert the min_len clamp, and that entity_actions collapses a duplicate group to a single row (map semantics; regression guard against emitting an array that drops a row on save). - tests/e2e-backends: token_classify capability driving the TokenClassify gRPC RPC against the backend image, asserting byte-correct, UTF-8 rune-aligned spans (entity.Text == text[start:end]) at threshold 0. 
Verified on CPU via `make test-extra-backend-privacy-filter` (3/3 specs). - Makefile: test-extra-backend-privacy-filter wrapper. - tests/e2e: e2e_pii_ner_test.go drives /api/pii/analyze + /api/pii/redact (mask + block) through the full HTTP -> detector -> redactor path; gated on PII_NER_MODEL_GGUF so the default suite is unaffected. - .github/workflows/tests-pii-ner-e2e.yml: path-filtered / nightly CI job running the container harness on CPU. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(gallery): add privacy-filter-nemotron (f16 + q8) GGUF conversions of OpenMed/privacy-filter-nemotron — a fine-grained English PII token-classifier (55 categories / 221 BIOES classes), fine-tuned from openai/privacy-filter on NVIDIA's Nemotron-PII dataset. Sibling to the existing privacy-filter-multilingual entry, trading language breadth for category depth. - privacy-filter-nemotron: F16 reference artifact (~2.8 GB). - privacy-filter-nemotron-q8: Q8_0 quant (~1.64 GB) for RAM-constrained / edge use; description notes the size/speed tradeoff and to validate on your own data (a single dropped span is a PII leak). Both run on the privacy-filter backend with known_usecases [token_classify] and a default mask policy (min_score 0.5); operators add per-category entity_actions as needed. sha256s taken from the HF repo's LFS object ids. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-06-22 18:26:19 +02:00
LocalAI [bot]	95b058e1c5	feat(ui): restructure Cluster Nodes view (pulse + panel roster + detail page) (#10447 ) * chore: gitignore SDD scratch directory Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(nodes): add GET /api/nodes/models cluster-wide loaded-models endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): add nodesApi.allModels() for cluster-wide model roster Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): move Scheduling to its own page and nav item Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): replace nodes stat-card strip with cluster pulse + attention callout Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): node-panel roster with inline model chips and segmented filter Replace the Nodes table with a full-width node-panel roster that shows each backend node's running-model chips without an expand click, plus an All/Backend/Agent segmented filter. Per-node detail (models, backends, labels, capacity) moves to the node detail page. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(ui): add deep-linkable node detail page at /app/nodes/:id Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * fix(ui): remove em-dash from CapacityEditor comment; align detail spec backend mock Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * chore(ui): nodes page cleanup, hover/chip polish, docs for restructured cluster view Nodes.jsx dead-code sweep confirmed clean (no StatCard/table/expand state/scheduling-form leftovers). 
Two App.css polish fixes: move the node-panel hover border-color onto the bordered element so hover gives real feedback, and add the missing .model-chip__state rule the ModelChip component already emits. Update distributed-mode docs prose to describe the restructured cluster view (cluster pulse, attention callout, node-panel roster with inline model chips, All/Backend/Agent filter, node detail page at /app/nodes/:id, Scheduling as its own page). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * chore(ui): drop unused gpuVendorLabel export from nodeStatus Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 18:24:29 +02:00
LocalAI [bot]	f2abcc7503	chore(model gallery): 🤖 add 1 new models via gallery agent (#10445) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 16:09:16 +02:00
Adira	62c99c10b3	fix(diffusers): pin diffusers and transformers to a known-good pair (#9979) (#10442) fix(diffusers): pin diffusers and transformers to a known-good pair The diffusers backend tracked git+https://github.com/huggingface/diffusers (main) with an unpinned transformers. transformers v5 restructured CLIPTextModel and removed the .text_model attribute that diffusers' single-file loader reads, so loading any single-file Stable Diffusion checkpoint fails: create_diffusers_clip_model_from_ldm (single_file_utils.py) position_embedding_dim = model.text_model.embeddings.position_embedding... AttributeError: 'CLIPTextModel' object has no attribute 'text_model' No released diffusers (<=0.38.0) supports transformers v5 - only unreleased diffusers main does. Because the requirements tracked main plus an unpinned transformers, every backend image froze whichever pair existed at build time, and images built once transformers v5 shipped but before diffusers main caught up are permanently broken. Pin the last known-good released pair across all requirements files: diffusers==0.38.0 and transformers==4.57.6. 0.38.0 still exposes every pipeline backend.py imports (Flux, Wan, Sana, LTX2, Qwen, GGUF), so no functionality is lost, and builds become reproducible instead of drifting into the broken window. Fixes #9979 Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>	2026-06-22 12:38:06 +02:00
LocalAI [bot]	7226bb9f30	chore: ⬆️ Update CrispStrobe/CrispASR to `7a8cb80907341c0204bd0488c1244764f4163883` (#10315) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 12:21:58 +02:00
LocalAI [bot]	569d9bbd9e	fix(distributed): broadcast file-staging progress across replicas (#10440 ) File-staging progress lived only in the SmartRouter's in-memory StagingTracker on the replica performing the transfer. In a multi-replica deployment behind a round-robin load balancer, a /api/operations poll that lands on any other replica saw no staging row, so the progress ("processing file ... Total ... Current ...") flickered in and out as polls rotated between frontends. Mirror the pattern already used for gallery-install progress: the origin replica broadcasts staging ticks over NATS (SubjectStagingProgress, a new staging.<model>.progress subject), and peers merge them via ApplyRemote (SubscribeBroadcasts on the wildcard). Byte-level ticks are leading-edge debounced (~1/s); Start/FileComplete/Complete always publish. A locally-owned op stays authoritative so the origin's own echo and stray peer events can't clobber it, and mirrored remote ops expire after a TTL so a missed Done event can't leave a phantom row. The UI read path (StagingTracker.GetAll) is unchanged. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 09:28:07 +02:00
LocalAI [bot]	682fb2718c	fix(distributed): detach cold-load staging from the request context (#10438 ) A model not yet loaded on a worker is staged lazily on the inference request path. Staging a multi-GB model takes minutes - far longer than any client keeps its HTTP request open - so a browser refresh, an ingress/LB idle-timeout, or a round-robined retry landing on another frontend replica cancels the request context and aborts the upload with "context canceled" mid-transfer. Large models then never finish staging, so they never load (observed in a 2-replica deployment: both frontends repeatedly failed to stage a 15.7 GB GGUF, each attempt dying at a different offset). Bind the cold load (staging + LoadModel + the per-model advisory lock) to context.WithoutCancel(ctx): it keeps the request's values (prefix chain) but drops cancellation/deadline. Each long step keeps its own bound (the file stager's resume budget, LoadModel's 5m timeout), and the advisory lock still de-dupes concurrent loaders across replicas. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 09:06:20 +02:00
LocalAI [bot]	20c643e1f6	chore(model gallery): 🤖 add 1 new models via gallery agent (#10439) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 08:46:34 +02:00
VJSai	64a4351f3a	feat: send a LocalAI User-Agent on registry pulls (#10434 ) LocalAI pulls models from OCI registries (via go-containerregistry), the Ollama registry, and OCI blob stores (via oras), but every request went out with the underlying library's generic User-Agent, so registry operators had no way to attribute traffic to LocalAI. Add an oci.UserAgent() helper that returns "LocalAI" (or "LocalAI/<version>" when the binary is built with a version stamp via internal.Version) and wire it into all three pull paths: - pkg/oci/image.go: remote.WithUserAgent on the go-containerregistry image and digest requests - pkg/oci/ollama.go: a User-Agent header on the Ollama manifest request - pkg/oci/blob.go: a LocalAI User-Agent on the oras blob client. This mirrors oras' auth.DefaultClient (same retry.DefaultClient policy); only the advertised User-Agent changes. Implements #6258. Assisted-by: Claude:claude-opus-4-8 golangci-lint Signed-off-by: Vijay Sai <vijaysaijnv@gmail.com>	2026-06-22 08:44:12 +02:00
LocalAI [bot]	b7d67f5779	chore: ⬆️ Update ggml-org/llama.cpp to `7c082bc417bbe53210a83df4ba5b49e18ce6193c` (#10417) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 08:43:40 +02:00
LocalAI [bot]	600dafd20b	feat(ced): sound-event classification backend (CED audio tagger) (#10425 ) * feat(ced): sketch sound-classification backend (CED audio tagger) Wires ced.cpp (CED, 527-class AudioSet sound-event tagger; baby cry, footsteps, glass, alarms, dog bark) into LocalAI as a Go/purego backend. SKETCH (backend skeleton real; core REST wiring + CI/gallery is a checklist in DESIGN.md): - backend/backend.proto: new SoundDetection rpc + SoundClass messages (run `make protogen-go` to regenerate pkg/grpc/proto). - backend/go/ced: main.go (purego dlopen libced.so + ced_capi.h), goced.go (Ced gRPC backend: Load + SoundDetection), Makefile (clone-at-pin CED_VERSION, ggml static-PIC shared build), run.sh, package.sh, .gitignore. - DESIGN.md: REST /v1/audio/classification wiring (handler/route/capability registration checklist), gallery/index + CI registration, and a scoping note for the realtime/websocket live-recognition path (sliding-window classify over the existing ws transport + voicegate; the ced C-API per-PCM entry point is already window-friendly). Backend code does not compile until protogen-go regenerates the pb types and a libced.so is built (Makefile clones+builds it). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): REST /v1/audio/classification endpoint + capability registration Wires the ced sound-event classification backend (AudioSet audio tagger) end to end through the REST surface, mirroring the transcription path. - Handler: core/http/endpoints/openai/sound_classification.go parses the multipart audio upload, temp-files it, resolves the model config and calls the SoundDetection RPC; returns {model, detections[]} JSON. - Backend wrapper: core/backend/sound_classification.go (ModelSoundDetection) loads the model and normalizes the proto response into schema types. - Schema: core/schema/sound_classification.go (SoundClassificationResult). 
- gRPC layer: SoundDetection wired through the LocalAI wrapper (interface, Backend client, Client, embed, server, base default) so the loader-typed client exposes the RPC; proto regenerated via make protogen-go. - Route: POST /v1/audio/classification (+ /audio/classification alias) with the audio/multipart default-model middleware in routes/openai.go. - Capability surfaces: swagger @Tags/@Router on the handler; FLAG_SOUND_CLASSIFICATION usecase flag + UsecaseSoundClassification + UsecaseInfoMap + GuessUsecases + ModalityGroups + GetAllModelConfigUsecases; meta usecase option; /api/instructions audio area updated; auth RouteFeatureRegistry + FeatureAudioClassification (APIFeatures, default ON) + FeatureMetas; UI usecaseFilters, capabilities.js CAP_SOUND_CLASSIFICATION, Models.jsx filter + i18n; docs page features/audio-classification.md + whats-new + crosslink. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): realtime sound-event detection over the websocket API When a realtime pipeline configures a sound-classification model, each VAD-committed utterance (the same window the transcription path produces) is also run through the CED sound-event classifier and the scored AudioSet tags are emitted as a new server event. No new backend rpc is needed: the SoundDetection gRPC method already exists on this branch. - config: add Pipeline.SoundDetection (yaml/json sound_detection,omitempty) beside Transcription/VAD. - realtime: add Model.SoundDetection(ctx, audio, topK, threshold) to the ModelInterface; implement it on wrappedModel and transcriptOnlyModel by calling backend.ModelSoundDetection with the session's sound-classification model config (mirrors how Transcribe dispatches). Load the optional config in newModel / newTranscriptionOnlyModel; nil config keeps it additive. 
- types: add ConversationItemSoundDetectionEvent (item_id, content_index, detections[]{label,score,index}) with type conversation.item.sound_detection, its ServerEventType constant and MarshalJSON, mirroring the transcription completed event. - realtime: add emitSoundDetection (unary path: classify the committed window, build the event, t.SendEvent) and wire it at the utterance-commit hook right after emitTranscription; gated on session.SoundDetectionEnabled (resolved from Pipeline.SoundDetection at session setup, defaults top_k=5, threshold=0). Its error is logged via xlog but never aborts the turn. - test: Ginkgo specs for emitSoundDetection (tags emitted, empty detections, classifier error) plus a SoundDetection method on the fakeModel double. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ced): implement SoundDetection in nodes backend test doubles The SoundDetection method added to the grpc backend interface left two test doubles (fakeBackendClient, fakeGRPCBackend) incomplete, so core/services/nodes failed to compile under `go vet`/`go test` (go build missed it: the doubles live in _test.go). Add the method to both, mirroring their existing Detect mock. Repairs CI for the nodes package. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): decouple realtime sound detection from VAD (sound-only sessions) Sound-event detection must activate on sounds, not speech, so it no longer runs through the voice VAD/transcription path. A sound-detection-only pipeline (sound_detection set, no transcription/LLM) now: - is accepted by prepareRealtimeConfig (sound_detection counts as a pipeline stage), - builds a lightweight model via newSoundDetectionOnlyModel (no VAD/STT/LLM/TTS loaded), and - defaults the session to turn_detection none (no VAD) with no transcription stage, so the client drives windowing via input_audio_buffer.commit (option A: client-side sliding window). The per-PCM C-API already supports arbitrary windows. 
commitUtterance gains a sound-only branch: it emits the conversation.item.sound_detection event (scored AudioSet tags) and stops - no transcription, no LLM response. generateResponse is now guarded on a transcription stage being present, so a sound-only turn never invokes the LLM. Existing transcription/VAD sessions are unchanged (additive). Added a commitUtterance sound-only Ginkgo spec asserting it emits the sound event and neither transcribes nor generates a response. go vet + golangci-lint (new-from-merge-base) clean; openai suite green. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): register sound-classification backend in gallery + CI Mechanical backend-image registration for the ced sound-event classifier, mirroring the parakeet-cpp Go/purego backend everywhere it is wired up. - .github/backend-matrix.yml: add the ced build matrix, field-for-field copies of the parakeet-cpp entries (cpu amd64/arm64, cublas cuda 12/13 amd64, l4t cuda-13 arm64, l4t-jetpack cuda-12 arm64, sycl f32/f16, vulkan amd64/arm64, rocm hipblas, and the metal darwin entry), changing only backend and tag-suffix. dockerfile stays ./backend/Dockerfile.golang. - backend/index.yaml: add the &ced meta anchor (capabilities map per platform) plus ced-development and the per-arch image entries, each uri/mirror tag-suffix matching the matrix exactly. The model gallery (GGUF) entry is intentionally deferred pending the HuggingFace publish (TODO note inline). - scripts/changed-backends.js: add an explicit item.backend === "ced" branch in inferBackendPath mapping to backend/go/ced/, same mechanism and ordering as the parakeet-cpp branch (before the generic golang fallthrough). - .github/workflows/bump_deps.yaml: register mudler/ced.cpp -> CED_VERSION in backend/go/ced/Makefile so the daily bot bumps the pin. - swagger/{docs.go,swagger.json,swagger.yaml}: regenerated via make swagger so the existing /v1/audio/classification annotations land in the generated spec. 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): server-side windowing for realtime sound detection (option B) Adds an optional server-driven sliding-window classifier so a sound-only realtime client only has to stream audio (no input_audio_buffer.commit): - Pipeline.sound_detection_window_ms / sound_detection_hop_ms config knobs. When both > 0 on a sound-only session, the server classifies the last window of streamed audio every hop and emits a conversation.item.sound_detection event; the input buffer is trimmed to one window so a long stream stays bounded. When unset, the session stays client-driven (option A). Runs independent of VAD (sound events are not speech). - handleSoundWindow (ticker) + classifySoundWindow (one tick, extracted so it is unit-testable) + writeWindowWAV, which declares the true InputSampleRate (NewWAVHeaderWithRate) so the classifier resamples correctly. Goroutine is started after toggleVAD and torn down with the session (close + wg.Wait). - Register pipeline.sound_detection (+window_ms/hop_ms) in the config meta registry; the earlier realtime commit added pipeline.sound_detection without a registry entry, failing TestAllFieldsHaveRegistryEntries. This fixes that and covers the two new knobs. Tests: classifySoundWindow emits an event + trims the buffer to one window, no-ops on too-little audio; writeWindowWAV declares the given sample rate. go build/vet + golangci-lint (new-from-merge-base) clean; config + openai suites green. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): add ced-base GGUF model gallery entries (f16 + q8_0) The ced-base weights are now published at mudler/ced-base-gguf (Apache-2.0, converted from mispeech/ced-base). Adds gallery/ced.yaml (backend: ced + known_usecases: sound_classification) and two gallery/index.yaml entries (ced-base-f16 default, ced-base-q8 smallest) with sha256-pinned files, and removes the now-resolved TODO from backend/index.yaml. 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ced): add tiny/mini/small GGUF model gallery entries Publishes the rest of the CED family (same architecture, metadata-driven port verified end-to-end on ced-tiny) to mudler/ced-{tiny,mini,small}-gguf and adds their f16 + q8_0 gallery entries: ced-tiny (5.5M, edge/Pi-class) f16 11MB / q8_0 6MB ced-mini (9.6M) f16 19MB / q8_0 11MB ced-small (22M) f16 42MB / q8_0 23MB All sha256-pinned. ced-base remains the accuracy default. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ced): point gallery entries at the consolidated mudler/ced-gguf repo All CED quantizations (tiny/mini/small/base, f16/q8_0) now live in a single HuggingFace repo, mudler/ced-gguf, instead of per-model repos. Repoint the 8 gallery model entries' urls + file uris accordingly. sha256 and filenames are unchanged. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ced): bump CED_VERSION to the short-clip fix Pin the ced backend to ced.cpp 99c6ed3, which fixes a crash on any clip shorter than target_length (~10.11s): time_pos_embed was added at its full 63-frame grid instead of being sliced to the clip's actual time grid, tripping ggml_can_repeat in ggml_add. Surfaced by the live realtime e2e (sub-10s windows) and gated with a short-clip parity test upstream. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(ced): list ced.cpp as a LocalAI-team engine + backend-guide directive - README.md: add ced.cpp to the "native C/C++/GGML engines developed and maintained by the LocalAI project" table. - docs/content/features/backends.md: add a Sound Classification backend category (sound-event classification / audio tagging) listing ced.cpp. - .agents/adding-backends.md: add a "Documenting the backend" section and two verification-checklist items requiring new backends to be documented in the backends.md category list, and in-house native engines to be added to the README maintained-engines table. 
This directive was missing. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ced): repin CED_VERSION to the v0.1.0 release commit ced.cpp history was squashed into a single release commit (tagged v0.1.0), so the previous pin (99c6ed3) no longer exists upstream. Pin to c04ac14, the v0.1.0 release commit, so the backend builds against a commit that exists. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(ced): silence gosec G304/G103 + govet unsafeptr on audited paths - sound_classification.go: os.Create(dst) where dst = temp dir + path.Base of the upload (no traversal). #nosec G304, matching the depth-anything-cpp handler. - goced.go: reading a NUL-terminated C string from a libced-owned buffer. #nosec G103 (gosec) + //nolint:govet (golangci-lint's unsafeptr check), since the uintptr is a C-owned malloc'd buffer, not Go-GC memory. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-22 01:00:28 +02:00
LocalAI [bot]	ce8a3e9266	chore: ⬆️ Update ServeurpersoCom/qwentts.cpp to `4536dcdce27c3764a93a06d6bf64026b124962f5` (#10431) ⬆️ Update ServeurpersoCom/qwentts.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 01:00:10 +02:00
LocalAI [bot]	a88d9d2de3	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `6c00e87ac84404af588ad2e65935bd6f079c696f` (#10430) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 00:57:49 +02:00
LocalAI [bot]	1cf1bf32e1	chore: ⬆️ Update leejet/stable-diffusion.cpp to `b12098f5d09fc83da36e65c784f7bdb16a5a5ebf` (#10429) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-22 00:57:33 +02:00

6815 Commits