Commit Graph

5502 Commits

Author SHA1 Message Date
Ettore Di Giacinto
697f6aa71c feat(audio): set audio content type (#8416)
* feat(audio): set audio content type

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 19:14:12 +01:00
Ettore Di Giacinto
218d0526cb fix(qwen-tts): add six dependency
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 18:05:31 +01:00
Ettore Di Giacinto
9bc5ab18fa fix(voxcpm): make sed call unix-compliant
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 17:15:58 +01:00
Ettore Di Giacinto
a9267f391c fix(huggingface): add clean target
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 16:54:41 +01:00
Ettore Di Giacinto
029ae3420d fix(package.sh): drop redundant -a and -R
-a already implies -R

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 16:39:38 +01:00
Ettore Di Giacinto
c0461f32a1 fix: add missing clean targets
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 16:38:16 +01:00
Ettore Di Giacinto
8989d2944e fix: add clean target to local-store
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 14:55:34 +01:00
Ettore Di Giacinto
7aea2add44 Revert "chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/rerankers in the pip group across 1 directory" (#8412)
Revert "chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/re…"

This reverts commit 55e43b3f92.
2026-02-05 14:17:33 +01:00
dependabot[bot]
55e43b3f92 chore(deps): bump torch from 2.4.1 to 2.7.1+xpu in /backend/python/rerankers in the pip group across 1 directory (#8407)
chore(deps): bump torch

Bumps the pip group with 1 update in the /backend/python/rerankers directory: torch.


Updates `torch` from 2.4.1 to 2.7.1+xpu

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1+xpu
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-05 12:37:52 +00:00
Ettore Di Giacinto
53276d28e7 feat(musicgen): add ace-step and UI interface (#8396)
* feat(musicgen): add ace-step and UI interface

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Correctly handle model dir

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop auto-download

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to models, fixup UIs icons

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* l4t13 is incompatible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* avoid pinning version for cuda12

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop l4t12

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 12:04:53 +01:00
Yaroslav98214
6dbcdb0b9e fix: filter GGUF and GGML files from model list (#8397)
Filter GGUF and GGML files from model list

Skip .gguf/.ggml loose files when listing models and add a test
for .gguf exclusion.

Closes #1077

Signed-off-by: Yaroslav98214 <diakovichyaroslav30@gmail.com>
2026-02-05 10:17:46 +01:00
LocalAI [bot]
c30866ba95 chore: ⬆️ Update ggml-org/llama.cpp to b536eb023368701fe3564210440e2df6151c3e65 (#8399)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-04 23:08:08 +01:00
LocalAI [bot]
b413beba2d chore: ⬆️ Update ggml-org/whisper.cpp to 941bdabbe4561bc6de68981aea01bc5ab05781c5 (#8398)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-04 21:20:59 +00:00
Ettore Di Giacinto
9db4df22f3 chore: update torch and torchaudio version specifications for qwen-tts in MPS
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-04 16:55:42 +01:00
Jonas Bernard
5ac50c9348 fix(docs): Promote DEBUG=false in production docker compose (#8390)
fix(docs): Use DEBUG=false in production docker compose

Signed-off-by: Jonas Bernard <public.jbernard@web.de>
2026-02-04 09:35:32 +01:00
Ettore Di Giacinto
5201b58d3e feat(mlx): Add support for CUDA12, CUDA13, L4T, SBSA and CPU (#8380)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-03 23:53:34 +01:00
LocalAI [bot]
8fa6737bdc chore(model gallery): 🤖 add 1 new model via gallery agent (#8381)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-03 22:40:22 +01:00
Ettore Di Giacinto
3039ced287 chore(ci): enlarge sleep startup time
Even if suboptimal, as we should poll to wait for the service to become available, this should at least make the tests less flaky for now

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-03 22:07:07 +01:00
Ettore Di Giacinto
e7fc604dbc feat(metal): try to extend support to remaining backends (#8374)
* feat(metal): try to extend support to remaining backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* neutts doesn't work

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* split outetts out of transformers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Remove torch pin to whisperx

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-03 21:57:50 +01:00
Richard Palethorpe
5195062e12 fix(realtime): Include noAction function in prompt template and handle tool_choice (#8372)
The realtime endpoint was not passing the noAction "answer" function to the
model in the prompt template, causing the model to always call user-provided
tools even when a direct response was appropriate.

Root cause:
- User tools were added to the funcs list
- TemplateMessages() was called to generate the prompt
- noAction function was only added AFTER templating
- This meant the prompt didn't include the "answer" function, even though
  the grammar did

Fix:
- Move noAction function creation before TemplateMessages() call so it's
  included in both the prompt and grammar
- Add proper tool_choice parameter handling to support "auto", "required",
  "none", and specific function selection
- Match behavior of the standard chat endpoint

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land>

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-03 14:30:37 +01:00
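
A rough sketch of the tool_choice semantics referenced in the commit above, in Python for illustration only (the names are hypothetical and do not mirror LocalAI's actual Go code): "auto" lets the model pick a tool or answer directly, "required" forces a tool call, "none" disables tools, and a function object forces one specific tool, with the built-in noAction/"answer" function kept visible for the direct-response path.

    # Hypothetical illustration of OpenAI-style tool_choice handling; names are
    # made up for this sketch and are not LocalAI's implementation.
    NO_ACTION = "answer"  # built-in "reply directly, no tool call" function

    def allowed_functions(user_tools, tool_choice="auto"):
        """Return the function names the model may call for a request."""
        if tool_choice == "none":
            # Tools are ignored; only the direct-answer path remains.
            return [NO_ACTION]
        if tool_choice == "required":
            # The model must call one of the user-provided tools.
            return list(user_tools)
        if isinstance(tool_choice, dict):
            # Force one specific function, e.g.
            # {"type": "function", "function": {"name": "get_weather"}}.
            return [tool_choice["function"]["name"]]
        # "auto": the model may call a tool or answer directly, so the
        # noAction function must appear in both the prompt and the grammar.
        return list(user_tools) + [NO_ACTION]
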
dependabot[bot]
c86edf06f2 chore(deps): bump github.com/onsi/gomega from 1.39.0 to 1.39.1 (#8357)
Bumps [github.com/onsi/gomega](https://github.com/onsi/gomega) from 1.39.0 to 1.39.1.
- [Release notes](https://github.com/onsi/gomega/releases)
- [Changelog](https://github.com/onsi/gomega/blob/master/CHANGELOG.md)
- [Commits](https://github.com/onsi/gomega/compare/v1.39.0...v1.39.1)

---
updated-dependencies:
- dependency-name: github.com/onsi/gomega
  dependency-version: 1.39.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 09:00:59 +01:00
dependabot[bot]
5d14c2fe4d chore(deps): bump sentence-transformers from 5.2.0 to 5.2.2 in /backend/python/transformers (#8358)
chore(deps): bump sentence-transformers in /backend/python/transformers

Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.2.0 to 5.2.2.
- [Release notes](https://github.com/huggingface/sentence-transformers/releases)
- [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.2.0...v5.2.2)

---
updated-dependencies:
- dependency-name: sentence-transformers
  dependency-version: 5.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:37:05 +01:00
dependabot[bot]
4601143998 chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus from 0.61.0 to 0.62.0 (#8359)
chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus

Bumps [go.opentelemetry.io/otel/exporters/prometheus](https://github.com/open-telemetry/opentelemetry-go) from 0.61.0 to 0.62.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/exporters/prometheus/v0.61.0...exporters/prometheus/v0.62.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/prometheus
  dependency-version: 0.62.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:36:34 +01:00
dependabot[bot]
7913ea2bfb chore(deps): bump go.opentelemetry.io/otel/sdk/metric from 1.39.0 to 1.40.0 (#8354)
chore(deps): bump go.opentelemetry.io/otel/sdk/metric

Bumps [go.opentelemetry.io/otel/sdk/metric](https://github.com/open-telemetry/opentelemetry-go) from 1.39.0 to 1.40.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/sdk/metric
  dependency-version: 1.40.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:36:10 +01:00
Ettore Di Giacinto
d6409bd2eb Revert "chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vllm in the pip group across 1 directory" (#8367)
Revert "chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vl…"

This reverts commit 4c0e70086d.
2026-02-03 08:34:54 +01:00
dependabot[bot]
98872791e5 chore(deps): bump protobuf from 6.33.4 to 6.33.5 in /backend/python/transformers (#8356)
chore(deps): bump protobuf in /backend/python/transformers

Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 6.33.4 to 6.33.5.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Commits](https://github.com/protocolbuffers/protobuf/commits)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-version: 6.33.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:33:45 +01:00
dependabot[bot]
c6d47cb4e5 chore(deps): bump github.com/anthropics/anthropic-sdk-go from 1.19.0 to 1.20.0 (#8355)
chore(deps): bump github.com/anthropics/anthropic-sdk-go

Bumps [github.com/anthropics/anthropic-sdk-go](https://github.com/anthropics/anthropic-sdk-go) from 1.19.0 to 1.20.0.
- [Release notes](https://github.com/anthropics/anthropic-sdk-go/releases)
- [Changelog](https://github.com/anthropics/anthropic-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/anthropics/anthropic-sdk-go/compare/v1.19.0...v1.20.0)

---
updated-dependencies:
- dependency-name: github.com/anthropics/anthropic-sdk-go
  dependency-version: 1.20.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:33:24 +01:00
dependabot[bot]
f754a8edb1 chore(deps): bump go.opentelemetry.io/otel/metric from 1.39.0 to 1.40.0 (#8353)
Bumps [go.opentelemetry.io/otel/metric](https://github.com/open-telemetry/opentelemetry-go) from 1.39.0 to 1.40.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/v1.39.0...v1.40.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/metric
  dependency-version: 1.40.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 08:33:04 +01:00
dependabot[bot]
4c0e70086d chore(deps): bump torch from 2.7.0 to 2.7.1+xpu in /backend/python/vllm in the pip group across 1 directory (#8360)
chore(deps): bump torch

Bumps the pip group with 1 update in the /backend/python/vllm directory: torch.


Updates `torch` from 2.7.0 to 2.7.1+xpu

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.7.1+xpu
  dependency-type: direct:production
  dependency-group: pip
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 03:07:02 +00:00
dependabot[bot]
f8bd527dfe chore(deps): bump appleboy/ssh-action from 1.2.4 to 1.2.5 (#8352)
Bumps [appleboy/ssh-action](https://github.com/appleboy/ssh-action) from 1.2.4 to 1.2.5.
- [Release notes](https://github.com/appleboy/ssh-action/releases)
- [Commits](https://github.com/appleboy/ssh-action/compare/v1.2.4...v1.2.5)

---
updated-dependencies:
- dependency-name: appleboy/ssh-action
  dependency-version: 1.2.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-03 01:53:20 +00:00
Ettore Di Giacinto
08b2b8d755 fix(libbackend): do not inject --index-strategy unsafe-best-match to all
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-02 17:13:45 +01:00
Dream
10a1e6c74d feat(whisperx): add whisperx backend for transcription with speaker diarization (#8299)
* feat(proto): add speaker field to TranscriptSegment for diarization

Add speaker field to the gRPC TranscriptSegment message and map it
through the Go schema, enabling backends to return speaker labels.

Signed-off-by: eureka928 <meobius123@gmail.com>

* feat(whisperx): add whisperx backend for transcription with diarization

Add Python gRPC backend using WhisperX for speech-to-text with
word-level timestamps, forced alignment, and speaker diarization
via pyannote-audio when HF_TOKEN is provided.

Signed-off-by: eureka928 <meobius123@gmail.com>

* feat(whisperx): register whisperx backend in Makefile

Signed-off-by: eureka928 <meobius123@gmail.com>

* feat(whisperx): add whisperx meta and image entries to index.yaml

Signed-off-by: eureka928 <meobius123@gmail.com>

* ci(whisperx): add build matrix entries for CPU, CUDA 12/13, and ROCm

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(whisperx): unpin torch versions and use CPU index for cpu requirements

Address review feedback:
- Use --extra-index-url for CPU torch wheels to reduce size
- Remove torch version pins, let uv resolve compatible versions

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(whisperx): pin torch ROCm variant to fix CI build failure

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(whisperx): pin torch CPU variant to fix uv resolution failure

Pin torch==2.8.0+cpu so uv resolves the CPU wheel from the extra
index instead of picking torch==2.8.0+cu128 from PyPI, which pulls
unresolvable CUDA dependencies.

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(whisperx): use unsafe-best-match index strategy to fix uv resolution failure

uv's default first-match strategy finds torch on PyPI before checking
the extra index, causing it to pick torch==2.8.0+cu128 instead of the
CPU variant. This makes whisperx's transitive torch dependency
unresolvable. Using unsafe-best-match lets uv consider all indexes.

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(whisperx): drop +cpu local version suffix to fix uv resolution failure

PEP 440 ==2.8.0 matches 2.8.0+cpu from the extra index, avoiding the
issue where uv cannot locate an explicit +cpu local version specifier.
This aligns with the pattern used by all other CPU backends.

Signed-off-by: eureka928 <meobius123@gmail.com>

* fix(backends): drop +rocm local version suffixes from hipblas requirements to fix uv resolution

uv cannot resolve PEP 440 local version specifiers (e.g. +rocm6.4,
+rocm6.3) in pinned requirements. The --extra-index-url already points
to the correct ROCm wheel index and --index-strategy unsafe-best-match
(set in libbackend.sh) ensures the ROCm variant is preferred.

Applies the same fix as 7f5d72e8 (which resolved this for +cpu) across
all 14 hipblas requirements files.

Signed-off-by: eureka928 <meobius123@gmail.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: eureka928 <meobius123@gmail.com>

* revert: scope hipblas suffix fix to whisperx only

Reverts changes to non-whisperx hipblas requirements files per
maintainer review — other backends are building fine with the +rocm
local version suffix.

Signed-off-by: eureka928 <meobius123@gmail.com>

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: eureka928 <meobius123@gmail.com>

---------

Signed-off-by: eureka928 <meobius123@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 16:33:12 +01:00
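
The PEP 440 behaviour relied on in the last two fixes above (a bare ==2.8.0 pin matching the 2.8.0+cpu wheel from the extra index, while an explicit +cpu local version is what uv fails to locate) can be checked with the packaging library; a small illustration, separate from the whisperx backend itself:

    # Checking the PEP 440 local-version matching rule with the `packaging` library.
    # A specifier without a local segment ignores the candidate's local segment,
    # so "==2.8.0" accepts both the +cpu and the +cu128 wheels; which one gets
    # installed then depends on index priority / index strategy.
    from packaging.specifiers import SpecifierSet
    from packaging.version import Version

    print(SpecifierSet("==2.8.0").contains(Version("2.8.0+cpu")))        # True
    print(SpecifierSet("==2.8.0").contains(Version("2.8.0+cu128")))      # True
    print(SpecifierSet("==2.8.0+cpu").contains(Version("2.8.0+cu128")))  # False
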
Alex O'Connell
b7585ca738 fix(api): Add missing field in initial OpenAI streaming response (#8341)
Add missing field in initial OpenAI streaming response

Signed-off-by: Alex O'Connell <35843486+acon96@users.noreply.github.com>
2026-02-02 08:30:04 +01:00
LocalAI [bot]
8cae99229c chore: ⬆️ Update ggml-org/llama.cpp to 2634ed207a17db1a54bd8df0555bd8499a6ab691 (#8336)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-01 21:23:57 +00:00
rampa3
04e0f444e1 chore(model gallery): Add Qwen 3 VL 8B thinking & instruct (#8329)
Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>
2026-02-01 17:34:31 +01:00
rampa3
6f410f4cbe chore(model gallery): Rename downloaded filename for Magistral Small mmproj (#8327)
Rename downloaded filename for Magistral Small mmproj

Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>
2026-02-01 17:33:48 +01:00
Ettore Di Giacinto
800f749c7b fix: drop gguf VRAM estimation (now redundant) (#8325)
fix: drop gguf VRAM estimation

Cleanup. This is now handled directly in llama.cpp, no need to estimate from Go.

VRAM estimation in general is tricky, but llama.cpp ( 41ea26144e/src/llama.cpp (L168) ) has lately added an automatic "fitting" of models to VRAM, so we can drop the backend-specific GGUF VRAM estimation from our code instead of trying to guess, since we already enable it:

 397f7f0862/backend/cpp/llama-cpp/grpc-server.cpp (L393)

Fixes: https://github.com/mudler/LocalAI/issues/8302
See: https://github.com/mudler/LocalAI/issues/8302#issuecomment-3830773472
2026-02-01 17:33:28 +01:00
Andres
b6459ddd57 feat(api): Add transcribe response format request parameter & adjust STT backends (#8318)
* WIP response format implementation for audio transcriptions

(cherry picked from commit e271dd764bbc13846accf3beb8b6522153aa276f)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Rework transcript response_format and add more formats

(cherry picked from commit 6a93a8f63e2ee5726bca2980b0c9cf4ef8b7aeb8)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Add test and replace go-openai package with official openai go client

(cherry picked from commit f25d1a04e46526429c89db4c739e1e65942ca893)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Fix faster-whisper backend and refactor transcription formatting to also work on CLI

Signed-off-by: Andres Smith <andressmithdev@pm.me>
(cherry picked from commit 69a93977d5e113eb7172bd85a0f918592d3d2168)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
Co-authored-by: nanoandrew4 <nanoandrew4@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-01 17:33:17 +01:00
Ettore Di Giacinto
397f7f0862 fix(ui): take account of reasoning in token count calculation (#8324)
We were skipping reasoning traces when counting tokens, which yielded a
wrong total count.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-01 10:48:31 +01:00
LocalAI [bot]
234072769c chore(model gallery): 🤖 add 1 new model via gallery agent (#8321)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-01 09:02:05 +01:00
LocalAI [bot]
3445415b3d chore: ⬆️ Update ggml-org/llama.cpp to 41ea26144e55d23f37bb765f88c07588d786567f (#8317)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-31 21:18:31 +00:00
LocalAI [bot]
b05e110aa6 chore: ⬆️ Update ggml-org/llama.cpp to 1488339138d609139c4400d1b80f8a5b1a9a203c (#8306)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-31 08:59:09 +01:00
LocalAI [bot]
e69cba2444 chore: ⬆️ Update ggml-org/whisper.cpp to aa1bc0d1a6dfd70dbb9f60c11df12441e03a9075 (#8305)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-31 08:58:54 +01:00
LocalAI [bot]
f7903597ac chore(model-gallery): ⬆️ update checksum (#8307)
⬆️ Checksum updates in gallery/index.yaml

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-30 21:27:48 +01:00
LocalAI [bot]
ee76a0cd1c feat(swagger): update swagger (#8304)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-30 21:27:10 +01:00
Ettore Di Giacinto
4ca5b737bf chore(cuda): target 12.8 for 12 to increase compatibility (#8297)
Some datacenter setups might be stuck on the 5.x kernel, which doesn't
play well with CUDA >=12.9. To increase compatibility with the CUDA 12.x
branch, downgrade to 12.8. For newer systems, it is still suggested to
use CUDA 13.x wherever compatible.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-30 12:58:44 +01:00
Ettore Di Giacinto
4077aaf978 chore: re-enable e2e tests, fixups anthropic API tools support (#8296)
* chore(tests): add mock backend e2e tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup anthropic tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* prepare e2e tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop repetitive tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop specific CI workflow

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup anthropic issues, move all e2e tests to use mocked backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-30 12:41:50 +01:00
Ettore Di Giacinto
68dd9765a0 feat(tts): add support for streaming mode (#8291)
* feat(tts): add support for streaming mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Send first audio, make sure it's 16

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-30 11:58:01 +01:00
LocalAI [bot]
2c44b06a67 chore: ⬆️ Update ggml-org/llama.cpp to 4fdbc1e4dba428ce0cf9d2ac22232dc170bbca82 (#8283)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-29 23:43:29 +01:00
LocalAI [bot]
7cc90db3e5 chore(model-gallery): ⬆️ update checksum (#8285)
⬆️ Checksum updates in gallery/index.yaml

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-01-29 21:51:18 +01:00