LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-16 12:49:08 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	3d295adfa8	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `2f524850a1f67716bc0ba80ffa30ce39c5b8bd5f` (#10336) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-06-16 09:04:35 +02:00
LocalAI [bot]	4fa2064875	chore: ⬆️ Update ggml-org/llama.cpp to `7dad2f1a17d65b5e2034c277125bc9f97573a779` (#10337) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-16 08:22:26 +02:00
LocalAI [bot]	cb74399b3a	chore: ⬆️ Update ggml-org/whisper.cpp to `0ec0845110dc934911dc48e8c5beb5ad3189b3f3` (#10349) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-16 08:22:10 +02:00
dependabot[bot]	2388686369	chore(deps): bump grpcio from 1.81.0 to 1.81.1 in /backend/python/vllm (#10347) Bumps [grpcio](https://github.com/grpc/grpc) from 1.81.0 to 1.81.1. - [Release notes](https://github.com/grpc/grpc/releases) - [Commits](https://github.com/grpc/grpc/compare/v1.81.0...v1.81.1) --- updated-dependencies: - dependency-name: grpcio dependency-version: 1.81.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-15 22:57:38 +02:00
LocalAI [bot]	2df2876db2	feat(supertonic): add Supertonic ONNX TTS backend (CPU) (#10342 ) * feat(supertonic): vendor upstream Go TTS pipeline (helper.go) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(supertonic): add gRPC backend (Load/TTS/TTSStream, CPU) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(supertonic): satisfy unused linter (use onnxProvider; exclude vendored helper.go) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(supertonic): unit tests for resolvers + gated end-to-end synthesis Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * style(supertonic): gofmt backend.go comment block Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(supertonic): add Makefile, run.sh, package.sh (CPU build) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * build(supertonic): wire backend into root Makefile Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(supertonic): check ort.DestroyEnvironment return (errcheck) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(supertonic): resolve voice_styles as sibling of onnx dir; guard trim; test voice Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(supertonic): add CPU build matrix + gallery index entries Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(supertonic): expose as pref-only importable backend Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(supertonic): add 
Supertonic/supertonic-3 TTS model to the gallery 16 files (4 onnx + tts.json + unicode_indexer.json + 10 voice styles) from HF Supertone/supertonic-3, served via the supertonic backend. Defaults to voice F1; onnx/ + sibling voice_styles/ layout matches the backend's resolveVoicesDir. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(meta): register pipeline.max_history_items config field Pre-existing on master: the field was added without a registry entry, failing TestAllFieldsHaveRegistryEntries (core/config/meta). Add the entry so it renders properly in the model-config UI. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(secscan): exclude vendored supertonic backend from gosec helper.go is vendored from supertone-inc/supertonic; its G304/G404/G104 findings are inherent to upstream and the math/rand use is correct for flow-matching noise (crypto/rand would be wrong). Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-15 16:54:11 +02:00
LocalAI [bot]	f648f07b13	chore: ⬆️ Update ggml-org/llama.cpp to `4988f6e866057afd130c1515ecef0c9bab9a15f8` (#10280) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-14 21:53:25 +02:00
LocalAI [bot]	61cde6fd77	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `5f917a64b391b7d31839845153a473a65f630458` (#10240) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-14 16:46:49 +02:00
LocalAI [bot]	692970e507	chore: ⬆️ Update leejet/stable-diffusion.cpp to `276025e054555166ec419413c6748ca79986ee93` (#10313) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-14 16:44:05 +02:00
LocalAI [bot]	36e3419203	chore: ⬆️ Update vllm-project/vllm cu130 wheel to `0.23.0` (#10314) ⬆️ Update vllm-project/vllm cu130 wheel Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-13 23:39:10 +02:00
LocalAI [bot]	4bb592cf91	feat(qwen3-tts-cpp): migrate to ServeurpersoCom/qwentts.cpp (streaming, speakers, voice design) (#10316 ) * feat(qwen3-tts-cpp): repoint upstream to ServeurpersoCom/qwentts.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): flatten qt_* ABI into qt3_* purego shim Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): build shim against upstream qwen-core static lib Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): add option/language/voice/sampling parsing Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): add 24kHz WAV encode/decode/stream-header helpers Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): purego backend with streaming, speakers, voice design Map TTSRequest onto qwentts.cpp: instructions->instruct, voice->named speaker or clone-reference path, params map->ref_text + sampling. Add TTSStream over the qt chunk callback. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * test(qwen3-tts-cpp): unit specs + build-gated TTS/TTSStream e2e Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * fix(qwen3-tts-cpp): close defensive PCM-free gap on zero-sample result Register CppPCMFree before the n<=0 guard so a non-null buffer with zero samples cannot leak (the C contract returns NULL on failure, so this is defensive). Raised in code review. 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(qwen3-tts-cpp): advertise TTSStream capability Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * chore(qwen3-tts-cpp): update backend index metadata for qwentts.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * feat(gallery): qwentts.cpp models - base/customvoice/voicedesign, Q8_0 & Q4_K_M Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * docs(qwen3-tts-cpp): release note for qwentts.cpp migration Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * test(qwen3-tts-cpp): cover audio_path voice-cloning fallback Add resolveRequest unit specs (config audio_path used as the clone reference when Voice is empty; per-request audio Voice overrides it; a named-speaker Voice does not trigger cloning) plus a real-inference e2e that clones from audio_path (confirmed ref_spk_emb=yes in the pipeline). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * chore(qwen3-tts-cpp): drop the release-note doc Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 23:09:59 +02:00
LocalAI [bot]	0854932a25	feat(omnivoice-cpp): add OmniVoice TTS backend (file + streaming, voice cloning + voice design) (#10310 ) * feat(omnivoice-cpp): add C wrapper + CMake/Makefile build over OmniVoice ov_* ABI Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(omnivoice-cpp): add option/language parsing + WAV framing helpers with tests Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(omnivoice-cpp): wire purego binding with TTS + streaming TTSStream Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * build(omnivoice-cpp): wire backend into root Makefile Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(omnivoice-cpp): add build matrix entries + dep-bump registration Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(omnivoice-cpp): register backend meta + image entries Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(omnivoice-cpp): expose as preference-only importable backend Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add omnivoice-cpp TTS models (Q8_0 default + BF16 HQ) Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(omnivoice-cpp): document the OmniVoice TTS backend Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(omnivoice-cpp): add env-gated e2e for TTS + streaming Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(omnivoice-cpp): honor tts.audio_path/tts.voice config as default cloning reference The model config tts.audio_path 
(ModelOptions.AudioPath) and tts.voice now provide a default voice-cloning reference used when a request omits Voice, so a cloned voice can be pinned in the model YAML instead of passed per request. A per-request voice still overrides. Paths resolve relative to the model dir. Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(omnivoice-cpp): add missing omnivoice-cpp-development backend meta Mirrors the whisper/vibevoice convention: a -development meta aggregating the master-tagged image variants (the production meta and per-variant prod+dev image entries already existed; only the development meta aggregator was missing). Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 21:28:46 +02:00
LocalAI [bot]	203410871b	feat(sherpa-onnx): add Kokoro TTS + multilingual Piper voices (#10309 ) Wire the Kokoro model family into the sherpa-onnx backend (which only supported VITS/Piper before) and add gallery voices for Italian, English, Spanish, French and German plus a multilingual Kokoro model. - csrc/shim.{c,h}: kokoro_* config setters (model/voices/tokens/data_dir/ dict_dir/lexicon/lang/length_scale) mirroring the VITS path, with the matching frees in tts_config_free. - backend.go: loadTTS now detects a Kokoro model (a voices.bin beside the ONNX) and routes to configureKokoroTTS, otherwise configureVitsTTS. Kokoro picks up espeak-ng-data, the jieba dict and the per-language lexicons (only one English variant, to avoid tens of thousands of duplicate-word warnings at load); the language= option hints the lang. - backend_test.go: functional test for isKokoroModel detection. - gallery: 5 Piper VITS voices (it_IT-paola, en_US-amy, es_ES-davefx, fr_FR-siwis, de_DE-thorsten) + kokoro-multi-lang-v1.0, served through sherpa-onnx-tts.yaml with native streaming TTS. Verified by building the backend and synthesizing with a real Piper and Kokoro model (31/31 specs pass, including real-model synth smokes). Assisted-by: Claude:claude-opus-4-8 gofmt golangci-lint go-test Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 21:27:27 +02:00
LocalAI [bot]	0eca930b8d	fix(gallery): correct meta-backend definitions for platform auto-selection (#10299 ) fix(gallery): correct meta-backend definitions in backend/index.yaml Backends that ship per-platform images must be meta backends (a capabilities map and NO uri) so the right variant is auto-selected per platform - mirroring llama-cpp/whisper. Several entries were misdefined; fixed here: - Concrete base + metal sibling (could not select the Apple Silicon variant): silero-vad, piper, kitten-tts, local-store (+ their -development). Converted each anchor to a meta and added the cpu-<name> concrete. - mlx family (mlx, mlx-vlm, mlx-audio, mlx-distributed + -development): anchor had both a uri AND a capabilities map, so IsMeta() was false and the map was ignored (always resolved to the metal-darwin image); the metal-<name> target did not exist. Removed the uri and added the missing metal-<name> concretes. - Dangling capability targets: diffusers/kokoro nvidia-l4t-cuda-12 repointed to the existing nvidia-l4t-<name> concrete; coqui nvidia-cuda-13 key removed (no cuda13-coqui image). - locate-anything: the meta existed but its concrete entries were never added, so it was un-installable on every platform. Added the full concrete set plus the locate-anything-development meta, mirroring rfdetr-cpp. Image tags grounded against the published quay.io tags. - trl (cuda12/13): repointed the stale 'cublas-cuda12/13-trl' image tags to the actually-published 'gpu-nvidia-cuda-12/13-trl' tags (fixes #9236). Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 10:43:14 +02:00
LocalAI [bot]	0413fc03f8	fix(gallery): make opus a meta backend for platform auto-selection (#9813 ) (#10291 ) fix(gallery): make opus a meta backend so the platform variant is auto-selected (#9813) The realtime/WebRTC path loads the "opus" codec backend by name, but on macOS arm64 only "metal-opus" is installable, so Load("opus") failed with "opus backend not available". The root cause: unlike llama-cpp and whisper, the opus entry was a concrete CPU backend (it carried a uri and no capabilities map) rather than a meta backend, so nothing mapped "opus" to the platform-appropriate variant. Restructure opus to mirror llama-cpp/whisper: "opus" becomes a meta backend with a capabilities map (default -> cpu-opus, metal -> metal-opus) and no uri; the CPU image moves to a new "cpu-opus" concrete (and its dev variant to "cpu-opus-development"). Installing "opus" now resolves to metal-opus on Apple Silicon and cpu-opus elsewhere, and Load("opus") works on every platform via the meta pointer - so the realtime endpoint needs no special casing. This reverts the realtime_webrtc.go resolution helper from the earlier approach in favor of the gallery-level fix. Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 09:51:02 +02:00
LocalAI [bot]	7088572f75	fix(neutts): pin torchaudio to match torch (fixes undefined symbol) (#9798) (#10292) fix(neutts): pin torchaudio to match torch to avoid ABI mismatch (#9798) neucodec pulls torchaudio transitively but it was unpinned, so an incompatible torchaudio could be resolved against the pinned torch==2.8.0, producing the 'undefined symbol: torch_library_impl' load failure. Pin torchaudio==2.8.0 alongside torch in the cpu and cublas12 requirements. Assisted-by: claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-13 09:28:41 +02:00
LocalAI [bot]	d28a5b6da1	chore: ⬆️ Update mudler/locate-anything.cpp to `92c1682da792c1e8a5dec91acc2be4b02c742ded` (#10282) ⬆️ Update mudler/locate-anything.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-13 09:01:17 +02:00
LocalAI [bot]	cf71e291b4	fix(darwin): fix vibevoice-cpp build linkage + fail-safe go backend packaging (#10276 ) * fix(darwin): never package a go backend build tree as a working image The darwin/arm64 vibevoice-cpp image shipped the source tree with a half-built CMake directory (build-libgovibevoicecpp-fallback.so/) and no backend binary, so the backend could never start: run.sh exec'd a vibevoice-cpp binary that was not in the package and LocalAI timed out waiting for the gRPC service. Two durable, backend-agnostic defenses: - backend/go/vibevoice-cpp/Makefile: mirror whisper's cleanup discipline so a partial CMake tree cannot survive into packaging. Run `make purge` before each variant build and `rm -rfv build` after. The old recipe only removed its build dir after a successful `mv`, so a failed build left the half-built tree behind. - scripts/build/golang-darwin.sh: before creating the OCI image, remove any stray build- directory and assert that the binary run.sh launches actually exists. A build that produced no binary now fails the job loudly instead of publishing a source tree as a working backend. The binary name is derived from run.sh's `exec $CURDIR/<binary>` line (parakeet-cpp launches parakeet-cpp-grpc, so it is not always ${BACKEND}) with a ${BACKEND} fallback. The underlying native build failure that left vibevoice-cpp half-built still needs to be reproduced and fixed on Apple Silicon; this change ensures such a failure can never again be published as a working image. Refs #10267 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] * fix(vibevoice-cpp): build libvibevoice.a on darwin (link target, not path) The darwin build failed with: No rule to make target 'vibevoice/libvibevoice.a', needed by 'libgovibevoicecpp.so'. Stop. The upstream vibevoice project is added with add_subdirectory(... EXCLUDE_FROM_ALL), so its `vibevoice` static-library target is only built when something links it as a target. 
The Apple branch linked only `$<TARGET_FILE:vibevoice>` - a bare archive path with no target reference - so CMake never emitted a rule to build libvibevoice.a, while the Linux branch worked because it passes the `vibevoice` target name inside the --whole-archive flags. Link the `vibevoice` target on Apple (establishing the build dependency) and apply -force_load as a separate link option to keep whole-archive semantics so purego can dlsym the vv_capi_* symbols. Refs #10267 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-8 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 23:13:50 +02:00
LocalAI [bot]	a7a7bd646b	fix(mlx): route vision-language models to the mlx-vlm backend (#10274 ) Vision-language checkpoints such as mlx-community/gemma-4-E4B-it-qat-4bit declare the "image-text-to-text" pipeline tag on HuggingFace. The mlx importer hardcoded backend "mlx" for every mlx-community model, so these VLMs were served by the text-only mlx-lm backend whose tokenizer does not carry the processor chat template. The template was never applied and the model produced degenerate, looping output that echoed the prompt. Detect the "image-text-to-text" pipeline tag in the importer and route those models to mlx-vlm, which applies the processor-aware chat template. An explicit backend preference still wins. As a defensive backstop, the mlx backend now warns loudly when the loaded model has no chat template, so a misrouted VLM surfaces the problem instead of silently looping. Fixes #10269 Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 23:12:42 +02:00
LocalAI [bot]	722bdb87e9	chore: ⬆️ Update mudler/parakeet.cpp to `b8012f11e5269126eddb7f4fd02f891a2ccc29b0` (#10281 ) * ⬆️ Update mudler/parakeet.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(parakeet-cpp): close streaming segments on <EOB> after ABI v5 eou/eob split parakeet.cpp ABI v5 (the pin this PR bumps to) splits the streaming JSON "eou" flag: in v4 "eou":1 fired for either <EOU> (end of utterance) or <EOB> (backchannel); in v5 "eou" means <EOU> only, with a new separate "eob" field for the backchannel token. The streamSegmenter closed a segment on "eou" alone, so after the bump a backchannel token would silently stop ending a segment and merge into the next utterance. Read the new "eob" field and flush on either signal to preserve the v4 segmentation boundaries. The flat stream_feed eou_out path is unaffected: its mask is still non-zero for either event. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 23:12:04 +02:00
LocalAI [bot]	50dea8c983	feat(crispasr): bundle espeak-ng and add piper TTS voices to the gallery (#10283 ) CrispASR's piper backend phonemizes non-English text via espeak-ng (dlopen, the MIT-clean path; English uses a built-in G2P). The FROM scratch crispasr image shipped none of it, so non-English piper voices loaded but failed synthesis with "phonemization failed". Bundle the espeak-ng runtime so they work: - Dockerfile.golang: install espeak-ng-data + libespeak-ng1 and its libpcaudio0 / libsonic0 deps in the crispasr builder (espeak's dlopen fails without the latter two). - package.sh: copy libespeak-ng.so.1, libpcaudio.so.0, libsonic.so.0 into package/lib/ and the espeak-ng-data dir into the package root. - run.sh: export CRISPASR_ESPEAK_DATA_PATH so the bundled data is found. Add 9 single-speaker piper voices (de/en/it, incl. Italian paola + riccardo) to the gallery, run through backend:piper, hosted at LocalAI-Community/piper-voices-GGUF (converted from rhasspy/piper-voices with CrispASR's convert-piper-to-gguf.py). Only single-speaker low/medium voices are included; the engine does not yet support multi-speaker or high-quality piper decoders. All 9 verified end-to-end: each synthesizes a WAV at the model's native sample rate using only the image-bundled espeak payload. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 23:10:30 +02:00
LocalAI [bot]	46ba70632b	fix(crispasr): write piper TTS WAV at the model's native sample rate (#10277 ) CrispASR's piper backend returns PCM at the voice's native rate (from the GGUF piper.sample_rate key: 16 kHz for x_low/low, 22.05 kHz for medium/high) and does not resample, but the Go WAV encoder hardcoded 24000 Hz. Every piper voice was therefore written with a wrong header and played back at the wrong pitch/speed. Read piper.sample_rate from the model's GGUF metadata at Load via the vendored gguf-parser-go and use it for the WAV header, falling back to the 24 kHz default for the other CrispASR TTS engines (vibevoice/orpheus/chatterbox/qwen3-tts) that emit 24 kHz and carry no such key. Adds unit specs (minimal crafted GGUFs + WAV-header decode) and an env-gated end-to-end spec (CRISPASR_PIPER_MODEL_PATH). Verified e2e: en_GB-cori-medium synthesizes a 22050 Hz WAV through backend:piper. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 23:10:17 +02:00
LocalAI [bot]	60facc7252	fix(darwin): publish sherpa-onnx and speaker-recognition images for darwin/arm64 (#10275 ) Neither the sherpa-onnx nor the speaker-recognition backend had a darwin/arm64 image, so `local-ai backends install` failed with "no child with platform darwin/arm64" on macOS. This left /v1/audio/diarization (the sherpa-onnx path) and /v1/voice/embed without any usable backend on Apple Silicon. Both backends build on darwin/arm64: - sherpa-onnx (Go) already fetches the onnxruntime osx-arm64 runtime in its Makefile; it only needed a darwin matrix entry (build-type metal, lang go, like whisper and silero-vad). - speaker-recognition (Python) needed a requirements-mps.txt so the mps build installs plain onnxruntime (which ships a macOS arm64 wheel) instead of the onnxruntime-gpu pulled by its base requirements (which does not). Add both to the includeDarwin build matrix, wire the metal capability and metal image aliases into the gallery, and add the speaker-recognition requirements-mps.txt. Fixes #10268 Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 22:32:42 +02:00
LocalAI [bot]	8c8204d3c4	feat(parakeet-cpp): enable GGML_CUDA_GRAPHS in the cublas build (#10273) ggml leaves GGML_CUDA_GRAPHS off by default. Passing -DGGML_CUDA_GRAPHS=ON for cublas builds lets the CUDA backend capture and replay the compute graph for a small free speedup (about 1% measured on a GB10, never negative). It is not gated by parakeet.cpp's CMake options, so it passes straight through to ggml. Assisted-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-12 18:47:36 +02:00
Richard Palethorpe	085fc53bbc	fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104 ) * fix(router): score classifier production-readiness Conversation trimming runs through the classifier model's chat template and trims by exact token count, sized to the model's n_batch which is now scaled to context so long probes can't crash the backend. Missing chat_message templates are a hard error at router build time. Router- facing factories (Embedder/Scorer/Reranker/TokenCounter) re-resolve ModelConfig per call so a model installed post-startup doesn't bind a stub Backend="" config and silently fall into the loader's auto- iterate path. New 'vector_store' backend trace recorded inside localVectorStore on every Search/Insert — including the backend-load-failure path that previously vanished into an xlog.Warn — with outcome tagging (hit/miss/empty_store/backend_load_error/find_error/insert_error/ok). Companion cleanup drops misleading similarity:0 and input_tokens_count:0 from non-hit and text-mode traces. Gallery local-store-development aliases to 'local-store' so the master image satisfies pkg/model.LocalStoreBackend lookups from the embedding cache. Misc: llama-cpp TokenizeString reads the correct 'prompt' JSON key (the original bug); ModelTokenize nil-guard; non-fatal mitm proxy startup; PII 'route_local' renamed to 'allow' with docs/UI in sync; model-editor footer no longer eats the edit area on small screens; several config-editor template/dropdown/section fixes. Tests: e2e router specs (casual/code-hint + long-conversation trim), vector_store trace specs, lazy-factory specs, gallery dev-alias resolution, Playwright trace badge + scroll regression. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(backend): auto-size batch to context for embedding and rerank models Embedding and rerank models pool over the whole input in a single physical batch (n_ubatch). 
With batch left at the 512 default, the backend rejects longer inputs with "input is too large to process", silently capping a large-context embedder (e.g. 8k/32k) at 512 tokens. Size n_batch to the context for these single-pass usecases, mirroring the existing FLAG_SCORE behaviour; an explicit batch: still wins. Extracts EffectiveContextSize/EffectiveBatchSize from grpcModelOpts so the effective decode window has one home for other callers to reuse. Adds an e2e-aio regression test that embeds a >512-token input. The AIO embedding model is switched to nomic-embed-text-v1.5 (2048 context) because the previous granite model was capped at 512 tokens and could not exercise the larger batch. Assisted-by: claude-code:claude-opus-4-8 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(gallery): raise arch-router scoring output cap via parallel:64 Scoring decodes the whole prompt+candidate in a single llama_decode and reads one logit row per candidate token. The vendored llama.cpp server caps causal output rows at n_parallel, so the default of 1 aborts with GGML_ASSERT(n_outputs_max <= cparams.n_outputs_max) on multi-token route labels. Set options: [parallel:64] on both arch-router quant entries to lift the cap; kv_unified (the grpc-server default) keeps the full context per sequence, so this does not split the KV cache. Assisted-by: claude-code:claude-opus-4-8 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-06-12 16:21:15 +02:00
LocalAI [bot]	56cc4f63fc	feat(backend): locate-anything-cpp (open-vocabulary object detection via ggml) (#10264 ) * feat(backend): add locate-anything-cpp backend (open-vocab detection via la_capi) A Go/purego backend wrapping locate-anything.cpp's la_capi C ABI, implementing the gRPC Detect RPC: image + open-vocabulary text prompt -> labeled boxes. Mirrors backend/go/rfdetr-cpp; static-links ggml into a per-CPU-variant .so. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(backend): register locate-anything-cpp in build matrix Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): locate-anything gallery entry + model importer Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(backend): locate-anything-cpp Load+Detect wire test Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add locate-anything-3b model to the gallery index Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(backend): register locate-anything.cpp in bump_deps auto-bump Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: mudler <mudler@localai.io> * ci(test): e2e smoke for locate-anything-cpp in test-extra (loads the 3B + image, runs Detect) Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: mudler <mudler@localai.io>	2026-06-12 14:59:07 +02:00
LocalAI [bot]	a53f34e78f	chore: ⬆️ Update ggml-org/llama.cpp to `4c6595503fe45d5a39f88d194e270f64c7424677` (#10261 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-12 14:57:52 +02:00
LocalAI [bot]	006a9d38c7	chore: ⬆️ Update mudler/parakeet.cpp to `9db92be63179a27201d3b88d5d40c545b2ac48ae` (#10263 ) ⬆️ Update mudler/parakeet.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-12 09:18:21 +02:00
LocalAI [bot]	892ce951ce	chore: ⬆️ Update antirez/ds4 to `d881f2a05e8ff6bec001315a36b794b4aa310173` (#10262 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-12 09:18:07 +02:00
LocalAI [bot]	9a88eb81e7	chore: ⬆️ Update CrispStrobe/CrispASR to `d745bda4386ae0f9d1d2f23fff8ec95d76428221` (#10260 ) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-12 09:17:34 +02:00
pos-ei-don	58cdc050e9	fix(cuda): install cuda-nvrtc-dev alongside the other CUDA dev packages (#10257 ) Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>	2026-06-11 23:57:00 +02:00
pos-ei-don	b962f4a192	fix(vllm): parse tool_call function arguments before applying the chat template (#10256 ) Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com>	2026-06-11 23:55:38 +02:00
LocalAI [bot]	b6fcb3e1db	chore: ⬆️ Update CrispStrobe/CrispASR to `4b27392ffd0991a857594652cbb8b57e585bcd7b` (#10241 ) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-11 18:33:58 +02:00
LocalAI [bot]	ff09683d84	chore: ⬆️ Update ggml-org/llama.cpp to `ac4cddeb0dbd778f650bf568f6f08344a06abe3a` (#10239 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-11 18:33:38 +02:00
pos-ei-don	228a6dfe79	fix(vllm): restore compatibility with vLLM >= 0.22 (get_tokenizer moved to vllm.tokenizers) (#10252 ) fix(vllm): restore compatibility with vLLM >= 0.22 (get_tokenizer moved) vLLM 0.22 moved get_tokenizer from vllm.transformers_utils.tokenizer to vllm.tokenizers. Since the backend requirements install vllm unpinned, freshly built/installed vllm backends currently fail to start with ModuleNotFoundError: No module named 'vllm.transformers_utils.tokenizer' (surfacing as 'grpc service not ready' when loading a model). Use the same try/except version-compat import pattern already used elsewhere in this file: try the new vllm.tokenizers location first and fall back to the pre-0.22 path. Tested on a DGX Spark (GB10, ARM64) with the cuda13-nvidia-l4t-arm64-vllm backend and vllm 0.22.0: model load, chat completions and tool calls all work with this patch applied. Signed-off-by: pos-ei-don <1822533+pos-ei-don@users.noreply.github.com> Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 09:05:23 +02:00
LocalAI [bot]	51a92b6093	chore: ⬆️ Update antirez/ds4 to `8384adf0f9fa0f3bb342dd925372de778b95b263` (#10242 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-11 00:10:34 +02:00
LocalAI [bot]	6b2badb837	chore: ⬆️ Update CrispStrobe/CrispASR to `c29f6653a516a3001d923944dad8892072cc7334` (#10236 ) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 16:16:24 +02:00
LocalAI [bot]	8b8506d01a	chore: ⬆️ Update ggml-org/llama.cpp to `039e20a2db9e87b2477c76cc04905f3e1acad77f` (#10223 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 12:22:03 +02:00
LocalAI [bot]	6910a0bb48	chore: ⬆️ Update antirez/ds4 to `91bafb5acd5a6cf00b1e55ef68bf40ddd207bee7` (#10234 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 12:08:19 +02:00
LocalAI [bot]	cffd03b522	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `e6f8112f3ba126eed3ff5b30cdd08085414a7516` (#10233 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 12:07:49 +02:00
LocalAI [bot]	bf448d3794	chore: ⬆️ Update ggml-org/whisper.cpp to `df7638d8229a243af8a4b5a8ae557e0d74e0a0ae` (#10220 ) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 01:16:29 +02:00
LocalAI [bot]	1d4a12f7c0	chore: ⬆️ Update CrispStrobe/CrispASR to `97cad527d247edefc904e6c40c4cf5ee78bed055` (#10221 ) ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 01:16:17 +02:00
LocalAI [bot]	186d62801d	chore: ⬆️ Update leejet/stable-diffusion.cpp to `19bdfe22d255d5b4dff39d449318b9bc5ea2317f` (#10222 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 01:16:06 +02:00
LocalAI [bot]	da4ed05429	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `2768b6251548b78b6610e95edad13f888ad95982` (#10219 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 01:15:54 +02:00
LocalAI [bot]	ec1eea4f45	chore: ⬆️ Update antirez/ds4 to `512d07cb08f234b704b5a5959aa9e2d4c466eeb0` (#10224 ) ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-10 01:15:42 +02:00
LocalAI [bot]	9323f4b5ca	feat(llama-cpp): video input support (mtmd #24269 ) (#10216 ) * chore(llama-cpp): bump to 8f83d6c for mtmd video input support Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(llama-cpp): forward video input to mtmd (template + non-template paths) Wire request->videos() into grpc-server.cpp mirroring the existing image and audio handling: a video_data build + non-template files extraction, and input_video chat chunks on the tokenizer-template path. allow_video is auto-set at model load by the vendored upstream chat_params. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add video attachment support to the chat UI Mirror the image/audio attachment path for video: emit video_url content parts, accept video/* in the picker, keep video files as base64, show a film icon badge, and render attached video inline with a <video> player. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(llama-cpp): patch mtmd video stdin double-close (heap crash) Upstream mtmd video input (ggml-org/llama.cpp#24269) double-fcloses the ffmpeg/ffprobe stdin FILE: feed_stdin() fclose()s the FILE returned by subprocess_stdin() (which is sp->stdin_file), then subprocess_destroy() fclose()s the same pointer again -> heap corruption that aborts the backend on any base64 input_video request (the CLI --video file path is unaffected). Vendor a one-line fix (null sp->stdin_file after fclose) via prepare.sh's patches/ until upstream merges it. Verified e2e with gemma-4-e2b-it-qat-q4_0: video frames decode via ffmpeg and the model answers correctly (red clip -> 'Red', blue -> 'Blue'). 
Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(llama-cpp): re-pin to upstream #24316, drop vendored stdin patch Upstream replaced the ad-hoc video stdin handling with a proper RAII refactor (ggml-org/llama.cpp#24316, "mtmd: refactor video subproc handling"), which includes the same `sp->stdin_file = nullptr` guard our patch added (plus join-before-destroy ordering). Re-pin LLAMA_VERSION to that branch head and drop patches/0001 - it's now redundant. Verified e2e with gemma-4-e2b-it-qat-q4_0: no crash, video frames decode and the model answers correctly (red clip -> "Red", blue -> "Blue"). NOTE: #24316 is not yet merged, so this pins to its branch-head commit (28ca1e60). Re-pin to the squash-merge commit on master once it lands, otherwise `git fetch` may lose the commit after the branch is deleted. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-08 23:17:50 +02:00
LocalAI [bot]	c20225fc13	chore: ⬆️ Update CrispStrobe/CrispASR to `f7838a306687f22c281d29c250f879a4ab3df2d7` (#10177 ) * ⬆️ Update CrispStrobe/CrispASR Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(crispasr): link crispasr-lib CMake target instead of crispasr The dependency-bump regeneration of this branch reset CMakeLists.txt to master and dropped the prior link-target fix, reintroducing the `cannot find -lcrispasr` failure. Upstream CrispASR (f7838a3) defines the library as the CMake target `crispasr-lib` (with OUTPUT_NAME crispasr); there is no target named `crispasr`, so target_link_libraries falls back to a bare `-lcrispasr` linker flag that cannot be resolved. Point the link at the real target name. Verified locally: CPU cmake-configure of the bumped source generates a gocrispasr link line referencing sources/CrispASR/src/libcrispasr.a with no dangling -lcrispasr. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4.8 [Claude Code] --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-08 16:01:19 +02:00
LocalAI [bot]	337acc4c37	chore: ⬆️ Update antirez/ds4 to `c463029c205c2ec8d7ab6c0df4a3f52979091286` (#10189 ) * ⬆️ Update antirez/ds4 Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * fix(ds4): link ds4_ssd.o into the backend build Upstream antirez/ds4 splits the SSD expert-cache into its own ds4_ssd.c translation unit, whose symbols (ds4_ssd_memory_lock_acquire/release, ds4_ssd_cache_experts_for_byte_budget, ds4_ssd_auto_cache_plan) are referenced by ds4.c/ds4_cpu.o. The dependency-bump automation regenerated this branch from clean master and dropped the prior linkage fix, so the cpu-ds4 / cublas-ds4 backend builds fail again with undefined references. Re-apply the ds4_ssd.o linkage GPU-agnostically (mirroring ds4_distributed.o) in both the backend Makefile (DS4_OBJ_TARGET + the engine-object build rule for every GPU mode) and CMakeLists.txt (list(APPEND DS4_OBJS ds4_ssd.o)). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:opus-4.8 [Claude Code] --------- Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-06-08 11:15:32 +02:00
LocalAI [bot]	2e93186043	chore: ⬆️ Update ggml-org/llama.cpp to `9e3b928fd8c9d14dbf15a8768b9fdd7e5c721d66` (#10210 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-08 09:35:17 +02:00
LocalAI [bot]	d07037e817	chore: ⬆️ Update leejet/stable-diffusion.cpp to `b3d56d0ba1bd437886079e339118e8e75bb79ee7` (#10211 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-08 09:03:57 +02:00
LocalAI [bot]	f6cc90d258	chore: ⬆️ Update mudler/parakeet.cpp to `e270af73b94c9a5c37ec516230219ed4580e1db6` (#10212 ) ⬆️ Update mudler/parakeet.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-06-07 23:52:44 +02:00
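
Note on commit 228a6dfe79 (vLLM >= 0.22 compatibility): the message describes trying the new `vllm.tokenizers` location for `get_tokenizer` first and falling back to the pre-0.22 `vllm.transformers_utils.tokenizer` path. The following is a minimal sketch of that import-fallback pattern, not the code from the commit. The helper names `import_attr_with_fallback` and `load_get_tokenizer` are hypothetical and introduced only for illustration; the two module paths are the ones named in the commit message.

```python
import importlib


def import_attr_with_fallback(attr, module_paths):
    """Return `attr` from the first module in `module_paths` that provides it.

    Hypothetical helper: tries each dotted module path in order and moves on
    when the module is missing or does not define the attribute.
    """
    last_error = None
    for path in module_paths:
        try:
            module = importlib.import_module(path)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            # ModuleNotFoundError is a subclass of ImportError, so a moved
            # or missing module falls through to the next candidate.
            last_error = exc
    raise ImportError(
        f"{attr!r} not found in any of {list(module_paths)}"
    ) from last_error


# Module paths as stated in the commit message: new location first,
# then the pre-0.22 location.
VLLM_TOKENIZER_PATHS = (
    "vllm.tokenizers",
    "vllm.transformers_utils.tokenizer",
)


def load_get_tokenizer():
    """Resolve vLLM's get_tokenizer lazily; raises ImportError without vLLM."""
    return import_attr_with_fallback("get_tokenizer", VLLM_TOKENIZER_PATHS)
```

Resolving the import lazily inside a function means a missing vLLM install surfaces as a clear ImportError at the call site. The commit message itself describes a plain try/except around the import, which is the simpler choice when there are only two candidate locations.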

1455 Commits