LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-22 15:49:12 -04:00

Author	SHA1	Message	Date
eureka928	e91d83b171	fix(whisperx): pin torch CPU variant to fix uv resolution failure Pin torch==2.8.0+cpu so uv resolves the CPU wheel from the extra index instead of picking torch==2.8.0+cu128 from PyPI, which pulls unresolvable CUDA dependencies. Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	53c18e17cd	fix(whisperx): pin torch ROCm variant to fix CI build failure Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	2d42c30f87	fix(whisperx): unpin torch versions and use CPU index for cpu requirements Address review feedback: - Use --extra-index-url for CPU torch wheels to reduce size - Remove torch version pins, let uv resolve compatible versions Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	70a64fe9b8	ci(whisperx): add build matrix entries for CPU, CUDA 12/13, and ROCm Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	8e51db3cab	feat(whisperx): add whisperx meta and image entries to index.yaml Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	a87c507030	feat(whisperx): register whisperx backend in Makefile Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	c8245d069d	feat(whisperx): add whisperx backend for transcription with diarization Add Python gRPC backend using WhisperX for speech-to-text with word-level timestamps, forced alignment, and speaker diarization via pyannote-audio when HF_TOKEN is provided. Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
eureka928	4553ee02c7	feat(proto): add speaker field to TranscriptSegment for diarization Add speaker field to the gRPC TranscriptSegment message and map it through the Go schema, enabling backends to return speaker labels. Signed-off-by: eureka928 <meobius123@gmail.com>	2026-02-02 11:36:32 +01:00
Alex O'Connell	b7585ca738	fix(api): Add missing field in initial OpenAI streaming response (#8341 ) Add missing field in initial OpenAI streaming response Signed-off-by: Alex O'Connell <35843486+acon96@users.noreply.github.com>	2026-02-02 08:30:04 +01:00
LocalAI [bot]	8cae99229c	chore: ⬆️ Update ggml-org/llama.cpp to `2634ed207a17db1a54bd8df0555bd8499a6ab691` (#8336 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-02-01 21:23:57 +00:00
rampa3	04e0f444e1	chore(model gallery): Add Qwen 3 VL 8B thinking & instruct (#8329 ) Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-02-01 17:34:31 +01:00
rampa3	6f410f4cbe	chore(model gallery): Rename downloaded filename for Magistral Small mmproj (#8327 ) Rename downloaded filename for Magistral Small mmproj Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-02-01 17:33:48 +01:00
Ettore Di Giacinto	800f749c7b	fix: drop gguf VRAM estimation (now redundant) (#8325 ) fix: drop gguf VRAM estimation Cleanup. This is now handled directly in llama.cpp, no need to estimate from Go. VRAM estimation in general is tricky, but llama.cpp ( `41ea26144e/src/llama.cpp (L168)` ) lately has added an automatic "fitting" of models to VRAM, so we can drop backend-specific GGUF VRAM estimation from our code instead of trying to guess as we already enable it `397f7f0862/backend/cpp/llama-cpp/grpc-server.cpp (L393)` Fixes: https://github.com/mudler/LocalAI/issues/8302 See: https://github.com/mudler/LocalAI/issues/8302#issuecomment-3830773472	2026-02-01 17:33:28 +01:00
Andres	b6459ddd57	feat(api): Add transcribe response format request parameter & adjust STT backends (#8318 ) * WIP response format implementation for audio transcriptions (cherry picked from commit e271dd764bbc13846accf3beb8b6522153aa276f) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Rework transcript response_format and add more formats (cherry picked from commit 6a93a8f63e2ee5726bca2980b0c9cf4ef8b7aeb8) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Add test and replace go-openai package with official openai go client (cherry picked from commit f25d1a04e46526429c89db4c739e1e65942ca893) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Fix faster-whisper backend and refactor transcription formatting to also work on CLI Signed-off-by: Andres Smith <andressmithdev@pm.me> (cherry picked from commit 69a93977d5e113eb7172bd85a0f918592d3d2168) Signed-off-by: Andres Smith <andressmithdev@pm.me> --------- Signed-off-by: Andres Smith <andressmithdev@pm.me> Co-authored-by: nanoandrew4 <nanoandrew4@gmail.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-02-01 17:33:17 +01:00
Ettore Di Giacinto	397f7f0862	fix(ui): take account of reasoning in token count calculation (#8324 ) We were skipping reasoning traces when counting tokens, yielding a wrong total count. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-01 10:48:31 +01:00
LocalAI [bot]	234072769c	chore(model gallery): 🤖 add 1 new models via gallery agent (#8321 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-02-01 09:02:05 +01:00
LocalAI [bot]	3445415b3d	chore: ⬆️ Update ggml-org/llama.cpp to `41ea26144e55d23f37bb765f88c07588d786567f` (#8317 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-31 21:18:31 +00:00
LocalAI [bot]	b05e110aa6	chore: ⬆️ Update ggml-org/llama.cpp to `1488339138d609139c4400d1b80f8a5b1a9a203c` (#8306 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-31 08:59:09 +01:00
LocalAI [bot]	e69cba2444	chore: ⬆️ Update ggml-org/whisper.cpp to `aa1bc0d1a6dfd70dbb9f60c11df12441e03a9075` (#8305 ) ⬆️ Update ggml-org/whisper.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-31 08:58:54 +01:00
LocalAI [bot]	f7903597ac	chore(model-gallery): ⬆️ update checksum (#8307 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-30 21:27:48 +01:00
LocalAI [bot]	ee76a0cd1c	feat(swagger): update swagger (#8304 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-30 21:27:10 +01:00
Ettore Di Giacinto	4ca5b737bf	chore(cuda): target 12.8 for 12 to increase compatibility (#8297 ) Some datacenter setups might be stuck with the 5.x kernel which doesn't play well with CUDA >=12.9. To increase compatibility with the CUDA 12.x branch, downgrade to 12.8. For newer systems, it is still suggested to use CUDA 13.x wherever compatible. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-30 12:58:44 +01:00
Ettore Di Giacinto	4077aaf978	chore: re-enable e2e tests, fixups anthropic API tools support (#8296 ) * chore(tests): add mock backend e2e tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixup anthropic tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * prepare e2e tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop repetitive tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop specific CI workflow Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixup anthropic issues, move all e2e tests to use mocked backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-30 12:41:50 +01:00
Ettore Di Giacinto	68dd9765a0	feat(tts): add support for streaming mode (#8291 ) * feat(tts): add support for streaming mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Send first audio, make sure it's 16 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-30 11:58:01 +01:00
LocalAI [bot]	2c44b06a67	chore: ⬆️ Update ggml-org/llama.cpp to `4fdbc1e4dba428ce0cf9d2ac22232dc170bbca82` (#8283 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-29 23:43:29 +01:00
LocalAI [bot]	7cc90db3e5	chore(model-gallery): ⬆️ update checksum (#8285 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-29 21:51:18 +01:00
Ettore Di Giacinto	1e08e02598	feat(qwen-asr): add support to qwen-asr (#8281 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-29 21:50:35 +01:00
Richard Palethorpe	dd8e74a486	feat(realtime): Add audio conversations (#6245 ) * feat(realtime): Add audio conversations Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(realtime): Vendor the updated API and modify for server side Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): Update to the GA realtime API Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore: Document realtime API and add docs to AGENTS.md Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat: Filter reasoning from spoken output Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Send delta and done events for tool calls and audio transcripts Ensure that content is sent in both deltas and done events for function call arguments and audio transcripts. This fixes compatibility with clients that rely on delta events for parsing. 💘 Generated with Crush Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Improve tool call handling and error reporting - Refactor Model interface to accept []types.ToolUnion and *types.ToolChoiceUnion instead of JSON strings, eliminating unnecessary marshal/unmarshal cycles - Fix Parameters field handling: support both map[string]any and JSON string formats - Add PredictConfig() method to Model interface for accessing model configuration - Add comprehensive debug logging for tool call parsing and function config - Add missing return statement after prediction error (critical bug fix) - Add warning logs for NoAction function argument parsing failures - Improve error visibility throughout generateResponse function 💘 Generated with Crush Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land> Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-01-29 08:44:53 +01:00
Ettore Di Giacinto	48e08772f3	chore(llama.cpp): bump to 'f6b533d898ce84bae8d9fa8dfc6697ac087800bf' (#8275 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-29 00:22:25 +01:00
LocalAI [bot]	c28c0227c6	chore: ⬆️ Update leejet/stable-diffusion.cpp to `e411520407663e1ddf8ff2e5ed4ff3a116fbbc97` (#8274 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-28 21:23:05 +00:00
Richard Palethorpe	856ca2d6b1	fix(qwen3): Be explicit with function calling format (#8265 ) Qwen3 4b was using the wrong function format (i.e. using "function" instead of "name") within the realtime API. If we specify the function calling format explicitly then it stops it. Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-01-28 14:44:29 +01:00
Ettore Di Giacinto	9b973b79f6	feat: add VoxCPM tts backend (#8109 ) * feat: add VoxCPM tts backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Disable voxcpm on arm64 cpu Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-28 14:44:04 +01:00
Ettore Di Giacinto	cba8ef4e38	chore: fix backend icons Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-28 09:09:00 +01:00
Ettore Di Giacinto	f729e300d6	chore: fix backend icons Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-28 09:08:03 +01:00
LocalAI [bot]	9916811a79	chore: ⬆️ Update ggml-org/llama.cpp to `2b4cbd2834e427024bc7f935a1f232aecac6679b` (#8258 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-01-28 08:50:16 +01:00
Ettore Di Giacinto	2f7c595cd1	chore(model gallery): add z-image and z-image-turbo for diffusers (#8260 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-27 22:42:10 +01:00
rampa3	73decac746	chore(model gallery): Add mistral-community/pixtral-12b with mmproj (#8245 ) Rebased branch add_pixtral on master Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-01-27 21:43:31 +01:00
Ettore Di Giacinto	ec1598868b	feat(vibevoice): add ASR support (#8222 ) * feat(vibevoice): add ASR support Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): download voice files Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Try to run on bigger runner Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * debug Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * CI can't hold vibevoice Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-27 20:19:22 +01:00
rampa3	93d7e5d4b8	chore(model gallery): Add entry for Magistral Small 1.2 with mmproj (#8248 ) Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-01-27 16:55:00 +01:00
rampa3	ff5a54b9d1	chore(model gallery): Add entry for Mistral Small 3.1 with mmproj (#8247 ) * chore(model gallery): Add entry for Mistral Small 3.1 with mmproj Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com> * Use llama-cpp subfolder structure akin to Qwen 3 VL Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com> --------- Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-01-27 16:54:14 +01:00
LocalAI [bot]	3c1f823c47	chore: ⬆️ Update ggml-org/llama.cpp to `8f80d1b254aef70a0959e314be368d05debe7294` (#8229 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-26 21:19:43 +00:00
LocalAI [bot]	4024220d00	chore(model gallery): 🤖 add 1 new models via gallery agent (#8220 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-26 12:11:24 +01:00
LocalAI [bot]	f76958d761	chore: ⬆️ Update ggml-org/llama.cpp to `0440bfd1605333726ea0fb7a836942660bf2f9a6` (#8216 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-26 00:50:35 +01:00
LocalAI [bot]	2bd5ca45de	chore: ⬆️ Update leejet/stable-diffusion.cpp to `43e829f21966abb96b08c712bccee872dc820914` (#8215 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-26 00:50:16 +01:00
Ettore Di Giacinto	6804ce1c39	chore(docs): change MEMORY_FILE_PATH to MEMORY_INDEX_PATH Updated MEMORY_FILE_PATH to MEMORY_INDEX_PATH in memory configuration. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-01-25 22:14:11 +01:00
Dedy F. Setyawan	d499071bff	fix(ui): correctly display selected image model (#8208 ) Signed-off-by: Dedy F. Setyawan <dedyfajars@gmail.com>	2026-01-25 14:54:40 +01:00
Ettore Di Giacinto	26a374b717	chore: drop bark which is unmaintained (#8207 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-25 09:26:40 +01:00
rampa3	980de0e25b	chore(model gallery): Add most of not yet present Piper voices from Hugging Face (#8202 ) Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>	2026-01-25 08:56:53 +01:00
Ettore Di Giacinto	4767371aee	chore(README): Add links Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-24 22:49:27 +01:00
Ettore Di Giacinto	131d247b78	chore(README): Update and simplify links Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-24 22:46:40 +01:00

5478 Commits