Commit Graph

222 Commits

Author SHA1 Message Date
Ettore Di Giacinto
5b0196c7d0 fix(whisper): scrub invalid UTF-8 from segment text before protobuf marshal
whisper.cpp can emit bytes that are not valid UTF-8 — typically a
multibyte codepoint split across token boundaries. protobuf string
fields reject those at marshal time, which would surface as a transcribe
failure. Run strings.ToValidUTF8 on the segment text before it leaves
the cgo boundary so the bad byte gets replaced with U+FFFD.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7 [Claude Code]
2026-04-26 19:35:39 +00:00
LocalAI [bot]
1c9592c77f chore: ⬆️ Update leejet/stable-diffusion.cpp to b8bdffc19962be7e5a84bfefeb2e31bd885b571a (#9521)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-24 15:15:15 +02:00
Richard Palethorpe
13734ae9fa feat: Add Sherpa ONNX backend for ASR and TTS (#8523)
feat(backend): Add Sherpa ONNX backend and Omnilingual ASR

Adds a new Go backend wrapping sherpa-onnx via purego (no cgo). Same
approach as opus/stablediffusion-ggml/whisper — a thin C shim
(csrc/shim.c + shim.h → libsherpa-shim.so) wraps the bits purego
can't reach directly: nested struct config writes, result-struct field
reads, and the streaming TTS callback trampoline. The Go side uses
opaque uintptr handles and purego.NewCallback for the TTS callback.

Supports:
- VAD via sherpa-onnx's Silero VAD
- Offline ASR: Whisper, Paraformer, SenseVoice, Omnilingual CTC
- Online/streaming ASR: zipformer transducer with endpoint detection
  (AudioTranscriptionStream emits delta events during decode)
- Offline TTS: VITS (LJS, etc.)
- Streaming TTS: sherpa-onnx's callback API → PCM chunks on a channel,
  prefixed by a streaming WAV header

Gallery entries: omnilingual-0.3b-ctc-q8-sherpa (1600-language offline
ASR), streaming-zipformer-en-sherpa (low-latency streaming ASR),
silero-vad-sherpa, vits-ljs-sherpa.

E2E coverage: tests/e2e-backends for offline + streaming ASR,
tests/e2e for the full realtime pipeline (VAD + STT + TTS).

Assisted-by: claude-opus-4-7-1M [Claude Code]

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-24 14:40:06 +02:00
Ettore Di Giacinto
4906cbad04 feat: add biometrics UI (#9524)
* feat(react-ui): add Face & Voice Recognition pages

Expose the face and voice biometrics endpoints
(/v1/face/*, /v1/voice/*) through the React UI. Each page has four
tabs driving the six endpoints per modality: Analyze (demographics
with bounding boxes / waveform segments), Compare (verify with a
match gauge and live threshold slider), Enrollment (register /
identify / forget with a top-K matches view), Embedding (raw
vector inspector with sparkline + copy).

MediaInput supports file upload plus live capture: webcam
snap-to-canvas for face, MediaRecorder -> AudioContext ->
16-bit PCM mono WAV transcode for voice (libsndfile on the
backend only handles WAV/FLAC/OGG natively).

Sidebar gets a new Biometrics section feature-gated on
face_recognition / voice_recognition; routes are wrapped in
<RequireFeature>. No new dependencies -- Font Awesome icons
picked from the Free set.

Assisted-by: Claude:Opus 4.7

* fix(localai): accept data URI prefixes with codec/charset params

Browser MediaRecorder produces data URIs like
  data:audio/webm;codecs=opus;base64,...
so the pre-';base64,' section can carry multiple parameter
segments. The `^data:([^;]+);base64,` regex in pkg/utils/base64.go
and core/http/endpoints/localai/audio.go only matched exactly one
segment, so recordings straight from the React UI's live-capture
tab failed the strip and then tripped the base64 decoder on the
leading 'data:' literal, surfacing as
  "invalid audio base64: illegal base64 data at input byte 4"

Widened both regexes to `^data:[^,]+?;base64,` so any number of
';param=value' segments between the mime type and ';base64,' are
tolerated. Added a regression test covering the MediaRecorder
shape.

Assisted-by: Claude:Opus 4.7

* fix(insightface): scope pack ONNX loading to known manifests

LocalAI's gallery extracts buffalo_* zips flat into the models
directory, which inevitably mixes with ONNX files from other
backends (opencv face engine, MiniFASNet antispoof, WeSpeaker
voice embedding) and older buffalo pack installs. Feeding those
foreign files into insightface's model_zoo.get_model() blows up
inside the router -- it assumes a 4-D NCHW input and indexes
`input_shape[2]` on tensors that aren't shaped like a face model,
raising IndexError mid-load and leaving the backend unusable.

The router's dispatch isn't amenable to per-file try/except alone
(first-file-wins picks det_10g.onnx from buffalo_l even when the
user asked for buffalo_sc -- alphabetical order happens to favour
the wrong pack). Instead, ship an explicit manifest of the
upstream v0.7 pack contents and scope the glob to that when the
requested pack is known. The manifest is small and stable; future
packs can be added alongside or fall through to the tolerance
loop, which also swallows any remaining IndexError / ValueError
from foreign files with a clear `[insightface] skipped` stderr
line for diagnostics.

Assisted-by: Claude:Opus 4.7

* fix(speaker-recognition): extract FBank features for rank-3 ONNX encoders

Pre-exported speaker-encoder ONNX graphs come in two shapes:

  rank-2  [batch, samples]           -- some 3D-Speaker exports,
                                        take raw waveform directly.
  rank-3  [batch, frames, n_mels]    -- WeSpeaker and most Kaldi-
                                        lineage encoders, expect
                                        pre-computed Kaldi FBank.

OnnxDirectEngine unconditionally fed `audio.reshape(1, -1)` --
correct for rank-2, IndexError-on-input_shape[3] on rank-3, which
surfaced to the UI as
  "Invalid rank for input: feats Got: 2 Expected: 3"

Detect the input rank at session init and run Kaldi FBank
(80-dim, 25ms/10ms frames, dither=0.0, per-utterance CMN) before
the forward pass when rank>=3. All knobs are configurable via
backend options for encoders that deviate from defaults.

torchaudio.compliance.kaldi is already in the backend's
requirements (SpeechBrain pulls torchaudio in), so no new
dependency.

Assisted-by: Claude:Opus 4.7

* fix(biometrics): isolate face and voice vector stores

Face (ArcFace, 512-D) and voice (ECAPA-TDNN 192-D / WeSpeaker
256-D) biometric embeddings were colliding inside a single
in-memory local-store instance. Enrolling one after the other
failed with
  "Try to add key with length N when existing length is M"
because local-store correctly refuses to mix dimensions in one
keyspace.

The registries were constructed with `storeName=""`, which in
StoreBackend() is just a WithModel() call. But ModelLoader's
cache is keyed on `modelID`, not `model` -- so both registries
collapsed to the same `modelID=""` slot and reused the same
backend process despite looking isolated on paper.

Three complementary fixes:

  1. application.go -- give each registry a distinct default
     namespace ("localai-face-biometrics" /
     "localai-voice-biometrics"). The comment claimed
     isolation, now it's actually enforced.

  2. stores.go -- pass the storeName as both WithModelID and
     WithModel so the ModelLoader cache key separates
     namespaces and the loader spawns distinct processes.

  3. local-store/store.go -- drop the Load() `opts.Model != ""`
     guard. It was there to prevent generic model-loading loops
     from picking up local-store by accident, but that auto-load
     path is being retired; the guard now just blocks legitimate
     namespace isolation. opts.Model is treated as a tag; the
     per-tuple process isolation upstream handles discrimination.

Assisted-by: Claude:Opus 4.7

* fix(gallery): stale-file cleanup and upgrade-tmp directory safety

Two related robustness fixes for backend install/upgrade:

pkg/downloader/uri.go
  OCI downloads passed through
      if filepath.Ext(filePath) != "" ...
          filePath = filepath.Dir(filePath)
  which was intended to redirect file-shaped download targets
  into their parent directory for OCI extraction. The heuristic
  misfires on directory-shaped paths with a dot-suffix --
  gallery.UpgradeBackend uses
      tmpPath = "<backendsPath>/<name>.upgrade-tmp"
  and Go's filepath.Ext treats ".upgrade-tmp" as an extension.
  The rewrite landed the extraction at "<backendsPath>/", which
  then **overwrote the real install** (backends/<name>/) with a
  flat-layout file and left a stray run.sh at the top level. The
  tmp dir itself stayed empty, so the validation step that
  checked "<tmpPath>/run.sh" predictably failed with
      "upgrade validation failed: run.sh not found in new backend"
  Every manual upgrade silently corrupted the backends tree this
  way. Guard the rewrite behind "target isn't already an existing
  directory" -- InstallBackend / UpgradeBackend both pre-create
  the target as a directory, so they get the correct behaviour;
  existing file-path callers with a genuine dot-extension still
  get the parent redirect.

core/gallery/backends.go
  InstallBackend's MkdirAll returned ENOTDIR when something at
  the target path was already a file (legacy dev builds dropped
  golang backend binaries directly at `<backendsPath>/<name>`
  instead of nesting them under their own subdir). That
  permanently blocked reinstall and upgrade for anyone carrying
  that state, since every retry hit the same error. Detect a
  pre-existing non-directory, warn, and remove it before the
  MkdirAll so the fresh install can write the correct nested
  layout with metadata.json + run.sh.

Assisted-by: Claude:Opus 4.7

* fix(galleryop): refresh upgrade cache after backend ops

UpgradeChecker caches the last upgrade-check result and only
refreshes on the 6-hour tick or after an auto-upgrade cycle.
Manual upgrades (POST /api/backends/upgrade/:name) go through
the async galleryop worker, which completes the upgrade
correctly but never tells UpgradeChecker to re-check -- so
/api/backends/upgrades continued to list a just-upgraded backend
as upgradeable, indistinguishable from a failed upgrade, for up
to six hours.

Add an optional `OnBackendOpCompleted func()` hook on
GalleryService that fires after every successful install /
upgrade / delete on the backend channel (async, so a slow
callback doesn't stall the queue). startup.go wires it to
UpgradeChecker.TriggerCheck after both services exist. Result:
the upgrade banner clears within milliseconds of the worker
finishing.

Assisted-by: Claude:Opus 4.7

* build: prepend GOPATH/bin to PATH for protogen-go

install-go-tools runs `go install` for protoc-gen-go and
protoc-gen-go-grpc, which writes them into `go env GOPATH`/bin.
That directory isn't on every dev's PATH, and protoc resolves
its code-gen plugins via PATH, so the immediately-following
protoc invocation fails with
  "protoc-gen-go: program not found"
which in turn blocks `make build` and any
`make backends/%` target that depends on build.

Prepend `go env GOPATH`/bin to PATH for the protoc invocation
so the freshly-installed plugins are found without requiring a
shell-profile change.

Assisted-by: Claude:Opus 4.7

* refactor(ui-api): non-blocking backend upgrade handler with opcache

POST /api/backends/upgrade/:name used to send the ManagementOp
directly onto the unbuffered BackendGalleryChannel, which blocked
the HTTP request whenever the galleryop worker was busy with a
prior operation. The op also didn't show up in /api/operations,
so the Backends UI couldn't reflect upgrade progress on the
affected row.

Register the op in opcache immediately, wrap it in a cancellable
context, store the cancellation function on the GalleryService,
and push onto the channel from a goroutine so the handler
returns right away. Response gains a `jobID` field and a
`message` string so clients have a consistent handle regardless
of whether the op is queued or running.

Pairs with the OnBackendOpCompleted hook added in the galleryop
commit — together the UI sees the upgrade start, watches
progress via /api/operations, and drops the "upgradeable" flag
the moment the worker finishes.

Assisted-by: Claude:Opus 4.7
2026-04-24 08:50:34 +02:00
LocalAI [bot]
eb00d9b178 chore: ⬆️ Update leejet/stable-diffusion.cpp to c97702e1057c2fe13a7074cd9069cb9dd6edc1bf (#9495)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-23 09:32:21 +02:00
LocalAI [bot]
cd94a0b61a chore: ⬆️ Update ggml-org/whisper.cpp to fc674574ca27cac59a15e5b22a09b9d9ad62aafe (#9450)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-21 11:09:05 +02:00
Ettore Di Giacinto
60633c4dd5 fix(stable-diffusion.ggml): force mp4 container in ffmpeg mux (#9435)
gen_video's ffmpeg subprocess was relying on the filename extension to
choose the output container. Distributed LocalAI hands the backend a
staging path (e.g. /staging/localai-output-NNN.tmp) that is renamed to
.mp4 only after the backend returns, so ffmpeg saw a .tmp extension and
bailed with "Unable to choose an output format". Inference had already
completed and the frames were piped in, producing the cryptic
"video inference failed (code 1)" at the API layer.

Pass -f mp4 explicitly so the container is selected by flag instead of
by filename suffix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 00:41:54 +02:00
LocalAI [bot]
3804497186 chore: ⬆️ Update leejet/stable-diffusion.cpp to 44cca3d626d301e2215d5e243277e8f0e65bfa78 (#9428)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-19 23:39:07 +02:00
LocalAI [bot]
e94a9a8f10 chore: ⬆️ Update leejet/stable-diffusion.cpp to 7d33d4b2ddeafa672761a5880ec33bdff452504d (#9417)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-04-19 09:26:58 +02:00
Ettore Di Giacinto
054c4b4b45 feat(stable-diffusion.ggml): add support for video generation (#9420)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-19 09:26:33 +02:00
LocalAI [bot]
86c673fd94 chore: ⬆️ Update ggml-org/whisper.cpp to 166c20b473d5f4d04052e699f992f625ea2a2fdd (#9403)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-18 00:42:32 +02:00
LocalAI [bot]
b6a68e5df4 chore: ⬆️ Update leejet/stable-diffusion.cpp to a564fdf642780d1df123f1c413b19961375b8346 (#9383)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-17 08:11:55 +02:00
LocalAI [bot]
7f88a3ba30 chore: ⬆️ Update leejet/stable-diffusion.cpp to c41c5ded7af85e01b7fe442ff7950c720706d53a (#9366)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-16 09:04:33 +02:00
Ettore Di Giacinto
87e6de1989 feat: wire transcription for llama.cpp, add streaming support (#9353)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-14 16:13:40 +02:00
Ettore Di Giacinto
151ad271f2 feat(rocm): bump to 7.x (#9323)
feat(rocm): bump to 7.2.1

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-12 08:51:30 +02:00
Ettore Di Giacinto
7a0e6ae6d2 feat(qwen3tts.cpp): add new backend (#9316)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-11 23:14:26 +02:00
LocalAI [bot]
e4bfc42a2d chore: ⬆️ Update leejet/stable-diffusion.cpp to 6b675a5ede9b0edf0a0f44191e8b79d7ef27615a (#9320)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-11 23:07:30 +02:00
LocalAI [bot]
606f462da4 chore: ⬆️ Update PABannier/sam3.cpp to 01832ef85fcc8eb6488f1d01cd247f07e96ff5a9 (#9311)
⬆️ Update PABannier/sam3.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-11 08:38:30 +02:00
Ettore Di Giacinto
706cf5d43c feat(sam.cpp): add sam.cpp detection backend (#9288)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-09 21:49:11 +02:00
LocalAI [bot]
7081b54c09 chore: ⬆️ Update leejet/stable-diffusion.cpp to e8323cabb0e4511ba18a50b1cb34cf1f87fc71ef (#9281)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-09 08:12:23 +02:00
LocalAI [bot]
8e59346091 chore: ⬆️ Update leejet/stable-diffusion.cpp to 8afbeb6ba9702c15d41a38296f2ab1fe5c829fa0 (#9262)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:39:38 +02:00
LocalAI [bot]
e6e4e19633 chore: ⬆️ Update ace-step/acestep.cpp to e0c8d75a672fca5684c88c68dbf6d12f58754258 (#9261)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:39:24 +02:00
LocalAI [bot]
11637b5a1b chore: ⬆️ Update leejet/stable-diffusion.cpp to 7397ddaa86f4e8837d5261724678cde0f36d4d89 (#9242)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-06 10:52:51 +02:00
LocalAI [bot]
2d40725ca2 chore: ⬆️ Update leejet/stable-diffusion.cpp to 87ecb95cbc65dc8e58e3d88f4f4a59a0939796f5 (#9200)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-02 08:53:04 +02:00
LocalAI [bot]
ba7cdd532a chore: ⬆️ Update leejet/stable-diffusion.cpp to 09b12d5f6d51d862749e8e0ee8baac8f012089e2 (#9195)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-01 00:48:25 +02:00
Copilot
87a63316c7 stablediffusion-ggml: replace hand-maintained enum string arrays with upstream API calls (#9192)
* Initial plan

* Remove hand-maintained enum string arrays in gosd.cpp, use upstream API functions

Agent-Logs-Url: https://github.com/mudler/LocalAI/sessions/561fb489-89ed-4588-8f1e-7b967d91ba37

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-31 14:53:38 +02:00
LocalAI [bot]
56db76599a chore: ⬆️ Update ggml-org/whisper.cpp to 95ea8f9bfb03a15db08a8989966fd1ae3361e20d (#9168)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-30 08:27:11 +02:00
LocalAI [bot]
ad57cdfefe chore: ⬆️ Update leejet/stable-diffusion.cpp to f16a110f8776398ef23a2a6b7b57522c2471637a (#9167)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-30 08:26:45 +02:00
Ettore Di Giacinto
59108fbe32 feat: add distributed mode (#9124)
* feat: add distributed mode (experimental)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix data races, mutexes, transactions

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix events and tool stream in agent chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* use ginkgo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(cron): compute correctly time boundaries avoiding re-triggering

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not flood of healthy checks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not list obvious backends as text backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop redundant healthcheck

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-30 00:47:27 +02:00
LocalAI [bot]
7209457f53 chore: ⬆️ Update ace-step/acestep.cpp to 6f35c874ee11e86d511b860019b84976f5b52d3a (#9128)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-25 07:52:31 +01:00
LocalAI [bot]
bf92117259 chore: ⬆️ Update ggml-org/whisper.cpp to 76684141a5d059be71cbe23dc2f0ed552213ba2d (#9094)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-22 00:57:28 +01:00
LocalAI [bot]
8036d22ec6 chore: ⬆️ Update ace-step/acestep.cpp to 7326a7bea0c2037982ec924f7364e998df70450c (#9086)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-22 00:56:52 +01:00
LocalAI [bot]
7d81bf0aa3 chore: ⬆️ Update ggml-org/whisper.cpp to 9386f239401074690479731c1e41683fbbeac557 (#9077)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-19 23:27:35 +01:00
LocalAI [bot]
dd1a8b174f chore: ⬆️ Update ggml-org/whisper.cpp to ef3463bb29ef90d25dfabfd1e75993111c52412d (#9062)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-19 07:58:11 +01:00
LocalAI [bot]
8560a1e571 chore: ⬆️ Update ace-step/acestep.cpp to ab020a9aefcd364423e0665da12babc6b0c7b507 (#9046)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-18 08:54:15 +01:00
LocalAI [bot]
29c33e6a6a chore: ⬆️ Update ggml-org/whisper.cpp to dc9611662265870df22a7230b7586176a99c1955 (#9045)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-18 08:46:35 +01:00
LocalAI [bot]
e21ad5cfaa chore: ⬆️ Update leejet/stable-diffusion.cpp to 545fac4f3fb0117a4e962b1a04cf933a7e635933 (#9036)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-17 18:07:30 +01:00
LocalAI [bot]
5c5e537b31 chore: ⬆️ Update ace-step/acestep.cpp to 15740f4301b3ec3020875f1fb975a6cfdb2f6767 (#9038)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-17 10:22:53 +01:00
LocalAI [bot]
3eabd6d1d0 chore: ⬆️ Update ggml-org/whisper.cpp to 79218f51d02ffe70575ef7fba3496dfc7adda027 (#9037)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-17 08:25:31 +01:00
LocalAI [bot]
9f903ec06e chore: ⬆️ Update leejet/stable-diffusion.cpp to 862a6586cb6fcec037c14f9ed902329ecec7d990 (#9019)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-16 00:09:59 +01:00
Ettore Di Giacinto
c596d8a5d9 fix: Change baseDir assignment to use ModelPath (#9010)
Fixes: https://github.com/mudler/LocalAI/issues/9005

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-15 09:45:58 +01:00
LocalAI [bot]
0ec3ea4a46 fix(acestep-cpp): resolve relative model paths in options (#8993)
* fix(acestep-cpp): resolve relative model paths in options

The acestep-cpp backend was failing to load models because the model
paths in options (text_encoder_model, dit_model, vae_model) were being
passed to the C++ code without resolving their relative paths.

When a user configures acestep-cpp-turbo-4b, the model paths are specified
as relative paths like 'acestep-cpp/acestep-v15-turbo-Q8_0.gguf'. The
backend was passing these paths directly to the C++ code without joining
them with the model directory.

This fix:
1. Gets the base directory from the ModelFile path
2. Resolves all relative paths in options to be absolute paths
3. Adds debug logging to show resolved paths for troubleshooting

Fixes #8991

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* test: fix acestep tests to not join modeldir in options

According to code review feedback, the Options array in TestLoadModel
and TestSoundGeneration should contain just the model filenames without
filepath.Join with modelDir. The model paths are handled internally by
the backend.

* fix: change bpm parameter type to float32 to match C++ API signature

* test: fix TestLoadModel and TestSoundGeneration to use baseDir for model paths

- Modified TestLoadModel to compute baseDir from main model path and use it for relative model paths
- Modified TestSoundGeneration similarly to use baseDir for model paths
- Changed bpm parameter type from int32 to float32 to match C++ API

* Apply suggestions from code review

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-14 01:16:13 +01:00
Richard Palethorpe
f9a850c02a feat(realtime): WebRTC support (#8790)
* feat(realtime): WebRTC support

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(tracing): Show full LLM opts and deltas

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-13 21:37:15 +01:00
LocalAI [bot]
ff21bc6cbb chore: ⬆️ Update ace-step/acestep.cpp to 5aa065445541094cba934299cd498bbb9fa5c434 (#8984)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-13 07:59:34 +01:00
LocalAI [bot]
c0351b8e6a Remove HuggingFace backend support (#8971)
* Remove HuggingFace backend support, restore other backends

- Removed backend/go/huggingface directory and all related files
- Removed pkg/langchain/huggingface.go
- Removed LCHuggingFaceBackend from pkg/model/initializers.go
- Removed huggingface backend entries from backend/index.yaml
- Updated backend/README.md to remove HuggingFace backend reference
- Restored kitten-tts, local-store, silero-vad, piper backends that were incorrectly removed

This change removes only HuggingFace backend support from LocalAI
as per the P0 priority request in issue #8963, while preserving
other backends (kitten-tts, local-store, silero-vad, piper).

Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>

* Remove huggingface backend from test.yml build command

The tests-linux CI job was failing because it was trying to build the
huggingface backend which no longer exists after the backend removal.
This removes huggingface from the build command in test.yml.

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: team-coding-agent-1 <team-coding-agent-1@localai.dev>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-13 01:09:30 +01:00
Ettore Di Giacinto
a738f8b0e4 feat(backends): add ace-step.cpp (#8965)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-12 18:56:26 +01:00
LocalAI [bot]
515cd968ae chore: ⬆️ Update leejet/stable-diffusion.cpp to d6dd6d7b555c233bb9bc9f20b4751eb8c9269743 (#8925)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-10 07:29:54 +01:00
LocalAI [bot]
85e4871d4d chore: ⬆️ Update leejet/stable-diffusion.cpp to c8fb3d245858d495be1f140efdcfaa0d49de41e5 (#8841)
* chore: ⬆️ update stable-diffusion.cpp to `c8fb3d245858d495be1f140efdcfaa0d49de41e5`

Update stablediffusion-ggml to include fix for SD1 Pix2Pix issue
(leejet/stable-diffusion.cpp#1329).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: localai-bot <localai-bot@noreply.github.com>

* fix: address CI failures in stablediffusion update

Signed-off-by: localai-bot <localai-bot@noreply.github.com>

* fix: resolve remaining CI failures in stablediffusion update

- Move flow_shift to global scope so gen_image() can access the value
  set during load_model() (was causing compilation error)
- Fix sd_type_str array: TQ1_0 should be at index 34, TQ2_0 at index 35
  to match upstream SD_TYPE_TQ1_0=34, SD_TYPE_TQ2_0=35 enum values

Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 09:53:08 +01:00
LocalAI [bot]
ac91413eb2 chore: ⬆️ Update ggml-org/whisper.cpp to 30c5194c9691e4e9a98b3dea9f19727397d3f46e (#8796)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-05 22:49:32 +01:00
LocalAI [bot]
6e5a58ca70 feat: Add Free RPC to backend.proto for VRAM cleanup (#8751)
* fix: Add VRAM cleanup when stopping models

- Add Free() method to AIModel interface for proper GPU resource cleanup
- Implement Free() in llama backend to release llama.cpp model resources
- Add Free() stub implementations in base and SingleThread backends
- Modify deleteProcess() to call Free() before stopping the process
  to ensure VRAM is properly released when models are unloaded

Fixes issue where VRAM was not freed when stopping models, which
could lead to memory exhaustion when running multiple models
sequentially.

* feat: Add Free RPC to backend.proto for VRAM cleanup\n\n- Add rpc Free(HealthMessage) returns (Result) {} to backend.proto\n- This RPC is required to properly expose the Free() method\n  through the gRPC interface for VRAM resource cleanup\n\nRefs: PR #8739

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-03 12:39:06 +01:00