Compare commits

...

9 Commits

Author SHA1 Message Date
dependabot[bot]
cc14aaad28 chore(deps): bump grpcio in /backend/python/transformers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.0)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-02-09 22:31:10 +00:00
LocalAI [bot]
0c040beb59 chore: ⬆️ Update antirez/voxtral.c to c9e8773a2042d67c637fc492c8a655c485354080 (#8477)
⬆️ Update antirez/voxtral.c

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-09 22:20:03 +01:00
Ettore Di Giacinto
bf5a1dd840 feat(voxtral): add voxtral backend (#8451)
* feat(voxtral): add voxtral backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-09 09:12:05 +01:00
rampa3
f44200bec8 chore(model gallery): Add Ministral 3 family of models (aside from base versions) (#8467)
Signed-off-by: rampa3 <68955305+rampa3@users.noreply.github.com>
2026-02-09 09:10:37 +01:00
LocalAI [bot]
3b1b08efd6 chore: ⬆️ Update ggml-org/llama.cpp to e06088da0fa86aa444409f38dff274904931c507 (#8464)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-09 09:09:32 +01:00
LocalAI [bot]
3d8791067f chore: ⬆️ Update ggml-org/whisper.cpp to 4b23ff249e7f93137cb870b28fb27818e074c255 (#8463)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-09 09:08:55 +01:00
Austen
da8207b73b feat(stablediffusion-ggml): Improve legacy CPU support for stablediffusion-ggml backend (#8461)
* Port AVX logic from whisper to stablediffusion-ggml

Signed-off-by: Austen Dicken <cvpcsm@gmail.com>

* disable BMI2 on AVX builds

Signed-off-by: Austen Dicken <cvpcsm@gmail.com>

---------

Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-08 23:11:33 +00:00
Varun Chawla
aa9ca401fa docs: update model gallery documentation to reference main repository (#8452)
Fixes #8212 - Updated the note about reporting broken models to
reference the main LocalAI repository instead of the outdated,
separate gallery repository.
2026-02-08 22:14:23 +01:00
LocalAI [bot]
e43c0c3ffc docs: ⬆️ update docs version mudler/LocalAI (#8462)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-08 21:12:50 +00:00
29 changed files with 1261 additions and 26 deletions

View File

@@ -1674,6 +1674,20 @@ jobs:
dockerfile: "./backend/Dockerfile.golang"
context: "./"
ubuntu-version: '2404'
# voxtral
- build-type: ''
cuda-major-version: ""
cuda-minor-version: ""
platforms: 'linux/amd64,linux/arm64'
tag-latest: 'auto'
tag-suffix: '-cpu-voxtral'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:24.04"
skip-drivers: 'false'
backend: "voxtral"
dockerfile: "./backend/Dockerfile.golang"
context: "./"
ubuntu-version: '2404'
#silero-vad
- build-type: ''
cuda-major-version: ""
@@ -1945,6 +1959,10 @@ jobs:
tag-suffix: "-metal-darwin-arm64-whisper"
build-type: "metal"
lang: "go"
- backend: "voxtral"
tag-suffix: "-metal-darwin-arm64-voxtral"
build-type: "metal"
lang: "go"
- backend: "vibevoice"
tag-suffix: "-metal-darwin-arm64-vibevoice"
build-type: "mps"

View File

@@ -30,6 +30,10 @@ jobs:
variable: "PIPER_VERSION"
branch: "master"
file: "backend/go/piper/Makefile"
- repository: "antirez/voxtral.c"
variable: "VOXTRAL_VERSION"
branch: "main"
file: "backend/go/voxtral/Makefile"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6

View File

@@ -361,3 +361,34 @@ jobs:
run: |
make --jobs=5 --output-sync=target -C backend/python/voxcpm
make --jobs=5 --output-sync=target -C backend/python/voxcpm test
tests-voxtral:
runs-on: ubuntu-latest
steps:
- name: Clone
uses: actions/checkout@v6
with:
submodules: true
- name: Dependencies
run: |
sudo apt-get update
sudo apt-get install -y build-essential cmake curl libopenblas-dev ffmpeg
- name: Setup Go
uses: actions/setup-go@v5
# You can test your matrix by printing the current Go version
- name: Display Go version
run: go version
- name: Proto Dependencies
run: |
# Install protoc
curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
rm protoc.zip
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
PATH="$PATH:$HOME/go/bin" make protogen-go
- name: Build voxtral
run: |
make --jobs=5 --output-sync=target -C backend/go/voxtral
- name: Test voxtral
run: |
make --jobs=5 --output-sync=target -C backend/go/voxtral test

View File

@@ -1,5 +1,5 @@
# Disable parallel execution for backend builds
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/moonshine backends/pocket-tts backends/qwen-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/moonshine backends/pocket-tts backends/qwen-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/voxtral
GOCMD=go
GOTEST=$(GOCMD) test
@@ -453,6 +453,7 @@ BACKEND_HUGGINGFACE = huggingface|golang|.|false|true
BACKEND_SILERO_VAD = silero-vad|golang|.|false|true
BACKEND_STABLEDIFFUSION_GGML = stablediffusion-ggml|golang|.|--progress=plain|true
BACKEND_WHISPER = whisper|golang|.|false|true
BACKEND_VOXTRAL = voxtral|golang|.|false|true
# Python backends with root context
BACKEND_RERANKERS = rerankers|python|.|false|true
@@ -506,6 +507,7 @@ $(eval $(call generate-docker-build-target,$(BACKEND_HUGGINGFACE)))
$(eval $(call generate-docker-build-target,$(BACKEND_SILERO_VAD)))
$(eval $(call generate-docker-build-target,$(BACKEND_STABLEDIFFUSION_GGML)))
$(eval $(call generate-docker-build-target,$(BACKEND_WHISPER)))
$(eval $(call generate-docker-build-target,$(BACKEND_VOXTRAL)))
$(eval $(call generate-docker-build-target,$(BACKEND_RERANKERS)))
$(eval $(call generate-docker-build-target,$(BACKEND_TRANSFORMERS)))
$(eval $(call generate-docker-build-target,$(BACKEND_OUTETTS)))
@@ -533,7 +535,7 @@ $(eval $(call generate-docker-build-target,$(BACKEND_ACE_STEP)))
docker-save-%: backend-images
docker save local-ai-backend:$* -o backend-images/$*.tar
docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-vllm-omni docker-build-transformers docker-build-outetts docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-chatterbox docker-build-vibevoice docker-build-moonshine docker-build-pocket-tts docker-build-qwen-tts docker-build-qwen-asr docker-build-nemo docker-build-voxcpm docker-build-whisperx docker-build-ace-step
docker-build-backends: docker-build-llama-cpp docker-build-rerankers docker-build-vllm docker-build-vllm-omni docker-build-transformers docker-build-outetts docker-build-diffusers docker-build-kokoro docker-build-faster-whisper docker-build-coqui docker-build-chatterbox docker-build-vibevoice docker-build-moonshine docker-build-pocket-tts docker-build-qwen-tts docker-build-qwen-asr docker-build-nemo docker-build-voxcpm docker-build-whisperx docker-build-ace-step docker-build-voxtral
########################################################
### Mock Backend for E2E Tests

View File

@@ -20,7 +20,7 @@ RUN apt-get update && \
build-essential \
git ccache \
ca-certificates \
make cmake wget \
make cmake wget libopenblas-dev \
curl unzip \
libssl-dev && \
apt-get clean && \

View File

@@ -1,5 +1,5 @@
LLAMA_VERSION?=8872ad2125336d209a9911a82101f80095a9831d
LLAMA_VERSION?=e06088da0fa86aa444409f38dff274904931c507
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?=

View File

@@ -2,5 +2,5 @@ package/
sources/
.cache/
build/
libgosd.so
*.so
stablediffusion-ggml

View File

@@ -66,15 +66,18 @@ sources/stablediffusion-ggml.cpp:
git checkout $(STABLEDIFFUSION_GGML_VERSION) && \
git submodule update --init --recursive --depth 1 --single-branch
libgosd.so: sources/stablediffusion-ggml.cpp CMakeLists.txt gosd.cpp gosd.h
mkdir -p build && \
cd build && \
cmake .. $(CMAKE_ARGS) && \
cmake --build . --config Release -j$(JOBS) && \
cd .. && \
mv build/libgosd.so ./
# Detect OS
UNAME_S := $(shell uname -s)
stablediffusion-ggml: main.go gosd.go libgosd.so
# Only build CPU variants on Linux
ifeq ($(UNAME_S),Linux)
VARIANT_TARGETS = libgosd-avx.so libgosd-avx2.so libgosd-avx512.so libgosd-fallback.so
else
# On non-Linux (e.g., Darwin), build only fallback variant
VARIANT_TARGETS = libgosd-fallback.so
endif
stablediffusion-ggml: main.go gosd.go $(VARIANT_TARGETS)
CGO_ENABLED=0 $(GOCMD) build -tags "$(GO_TAGS)" -o stablediffusion-ggml ./
package: stablediffusion-ggml
@@ -82,5 +85,46 @@ package: stablediffusion-ggml
build: package
clean:
rm -rf libgosd.so build stablediffusion-ggml package sources
clean: purge
rm -rf libgosd*.so stablediffusion-ggml package sources
purge:
rm -rf build*
# Build all variants (Linux only)
ifeq ($(UNAME_S),Linux)
libgosd-avx.so: sources/stablediffusion-ggml.cpp
$(MAKE) purge
$(info ${GREEN}I stablediffusion-ggml build info:avx${RESET})
SO_TARGET=libgosd-avx.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgosd-custom
rm -rfv build*
libgosd-avx2.so: sources/stablediffusion-ggml.cpp
$(MAKE) purge
$(info ${GREEN}I stablediffusion-ggml build info:avx2${RESET})
SO_TARGET=libgosd-avx2.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=on -DGGML_AVX512=off -DGGML_FMA=on -DGGML_F16C=on -DGGML_BMI2=on" $(MAKE) libgosd-custom
rm -rfv build*
libgosd-avx512.so: sources/stablediffusion-ggml.cpp
$(MAKE) purge
$(info ${GREEN}I stablediffusion-ggml build info:avx512${RESET})
SO_TARGET=libgosd-avx512.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=on -DGGML_AVX512=on -DGGML_FMA=on -DGGML_F16C=on -DGGML_BMI2=on" $(MAKE) libgosd-custom
rm -rfv build*
endif
# Build fallback variant (all platforms)
libgosd-fallback.so: sources/stablediffusion-ggml.cpp
$(MAKE) purge
$(info ${GREEN}I stablediffusion-ggml build info:fallback${RESET})
SO_TARGET=libgosd-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgosd-custom
rm -rfv build*
libgosd-custom: CMakeLists.txt gosd.cpp gosd.h
mkdir -p build-$(SO_TARGET) && \
cd build-$(SO_TARGET) && \
cmake .. $(CMAKE_ARGS) && \
cmake --build . --config Release -j$(JOBS) && \
cd .. && \
mv build-$(SO_TARGET)/libgosd.so ./$(SO_TARGET)
all: stablediffusion-ggml package

View File

@@ -2,6 +2,7 @@ package main
import (
"flag"
"os"
"github.com/ebitengine/purego"
grpc "github.com/mudler/LocalAI/pkg/grpc"
@@ -17,7 +18,13 @@ type LibFuncs struct {
}
func main() {
gosd, err := purego.Dlopen("./libgosd.so", purego.RTLD_NOW|purego.RTLD_GLOBAL)
// Get library name from environment variable, default to fallback
libName := os.Getenv("SD_LIBRARY")
if libName == "" {
libName = "./libgosd-fallback.so"
}
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
if err != nil {
panic(err)
}

View File

@@ -11,7 +11,7 @@ REPO_ROOT="${CURDIR}/../../.."
# Create lib directory
mkdir -p $CURDIR/package/lib
cp -avf $CURDIR/libgosd.so $CURDIR/package/
cp -avf $CURDIR/libgosd-*.so $CURDIR/package/
cp -avf $CURDIR/stablediffusion-ggml $CURDIR/package/
cp -fv $CURDIR/run.sh $CURDIR/package/

View File

@@ -1,14 +1,52 @@
#!/bin/bash
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
cd /
echo "CPU info:"
if [ "$(uname)" != "Darwin" ]; then
grep -e "model\sname" /proc/cpuinfo | head -1
grep -e "flags" /proc/cpuinfo | head -1
fi
LIBRARY="$CURDIR/libgosd-fallback.so"
if [ "$(uname)" != "Darwin" ]; then
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgosd-avx.so ]; then
LIBRARY="$CURDIR/libgosd-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgosd-avx2.so ]; then
LIBRARY="$CURDIR/libgosd-avx2.so"
fi
fi
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgosd-avx512.so ]; then
LIBRARY="$CURDIR/libgosd-avx512.so"
fi
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export SD_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/stablediffusion-ggml "$@"
fi
exec $CURDIR/stablediffusion-ggml "$@"
echo "Using library: $LIBRARY"
exec $CURDIR/stablediffusion-ggml "$@"
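For reference, the backend binary now resolves its shared library from the SD_LIBRARY environment variable (defaulting to ./libgosd-fallback.so), and run.sh fills it in from the CPU flags above. A minimal sketch of forcing a specific variant by hand, assuming the variant libraries have already been built in the backend directory:

# force the AVX2 variant instead of relying on the run.sh CPU detection
export SD_LIBRARY=./libgosd-avx2.so
./stablediffusion-ggml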

backend/go/voxtral/.gitignore (new file, 9 lines, vendored)
View File

@@ -0,0 +1,9 @@
.cache/
sources/
build/
build-*/
package/
voxtral
*.so
*.dylib
compile_commands.json

View File

@@ -0,0 +1,84 @@
cmake_minimum_required(VERSION 3.12)
if(USE_METAL)
project(govoxtral LANGUAGES C OBJC)
else()
project(govoxtral LANGUAGES C)
endif()
set(CMAKE_POSITION_INDEPENDENT_CODE ON)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
# Workaround: CMake + GCC linker depfile generation fails for MODULE libraries
set(CMAKE_C_LINKER_DEPFILE_SUPPORTED FALSE)
# Build voxtral.c as a library
set(VOXTRAL_SOURCES
sources/voxtral.c/voxtral.c
sources/voxtral.c/voxtral_kernels.c
sources/voxtral.c/voxtral_audio.c
sources/voxtral.c/voxtral_encoder.c
sources/voxtral.c/voxtral_decoder.c
sources/voxtral.c/voxtral_tokenizer.c
sources/voxtral.c/voxtral_safetensors.c
)
# Metal GPU acceleration (macOS arm64 only)
if(USE_METAL)
# Generate embedded shader header from .metal source via xxd
add_custom_command(
OUTPUT ${CMAKE_CURRENT_SOURCE_DIR}/sources/voxtral.c/voxtral_shaders_source.h
COMMAND xxd -i voxtral_shaders.metal > voxtral_shaders_source.h
WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/sources/voxtral.c
DEPENDS sources/voxtral.c/voxtral_shaders.metal
COMMENT "Generating embedded Metal shaders header"
)
list(APPEND VOXTRAL_SOURCES sources/voxtral.c/voxtral_metal.m)
set_source_files_properties(sources/voxtral.c/voxtral_metal.m PROPERTIES
COMPILE_FLAGS "-fobjc-arc"
)
endif()
add_library(govoxtral MODULE csrc/govoxtral.c ${VOXTRAL_SOURCES})
target_include_directories(govoxtral PRIVATE sources/voxtral.c csrc)
target_compile_options(govoxtral PRIVATE -O3 -ffast-math)
if(USE_METAL)
target_compile_definitions(govoxtral PRIVATE USE_BLAS USE_METAL ACCELERATE_NEW_LAPACK)
target_link_libraries(govoxtral PRIVATE
"-framework Accelerate"
"-framework Metal"
"-framework MetalPerformanceShaders"
"-framework MetalPerformanceShadersGraph"
"-framework Foundation"
"-framework AudioToolbox"
"-framework CoreFoundation"
m
)
# Ensure the generated shader header is built before compiling
target_sources(govoxtral PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/sources/voxtral.c/voxtral_shaders_source.h
)
elseif(USE_OPENBLAS)
# Try to find OpenBLAS; use it if available, otherwise fall back to pure C
find_package(BLAS)
if(BLAS_FOUND)
target_compile_definitions(govoxtral PRIVATE USE_BLAS USE_OPENBLAS)
target_link_libraries(govoxtral PRIVATE ${BLAS_LIBRARIES} m)
target_include_directories(govoxtral PRIVATE /usr/include/openblas)
else()
message(WARNING "OpenBLAS requested but not found, building without BLAS")
target_link_libraries(govoxtral PRIVATE m)
endif()
elseif(APPLE)
# macOS without Metal: use Accelerate framework
target_compile_definitions(govoxtral PRIVATE USE_BLAS ACCELERATE_NEW_LAPACK)
target_link_libraries(govoxtral PRIVATE "-framework Accelerate" m)
else()
target_link_libraries(govoxtral PRIVATE m)
endif()
set_property(TARGET govoxtral PROPERTY C_STANDARD 11)
set_target_properties(govoxtral PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR})

backend/go/voxtral/Makefile (new file, 107 lines)
View File

@@ -0,0 +1,107 @@
.NOTPARALLEL:
CMAKE_ARGS?=
BUILD_TYPE?=
NATIVE?=true
GOCMD?=go
GO_TAGS?=
JOBS?=$(shell nproc --ignore=1 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)
# voxtral.c version
VOXTRAL_REPO?=https://github.com/antirez/voxtral.c
VOXTRAL_VERSION?=c9e8773a2042d67c637fc492c8a655c485354080
# Detect OS
UNAME_S := $(shell uname -s)
# Shared library extension
ifeq ($(UNAME_S),Darwin)
SO_EXT=dylib
else
SO_EXT=so
endif
SO_TARGET?=libgovoxtral.$(SO_EXT)
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
ifeq ($(NATIVE),false)
ifneq ($(UNAME_S),Darwin)
CMAKE_ARGS+=-DCMAKE_C_FLAGS="-march=x86-64"
endif
endif
ifeq ($(BUILD_TYPE),cublas)
CMAKE_ARGS+=-DUSE_OPENBLAS=OFF
else ifeq ($(BUILD_TYPE),hipblas)
CMAKE_ARGS+=-DUSE_OPENBLAS=OFF
else ifeq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DUSE_OPENBLAS=OFF -DUSE_METAL=ON
else ifeq ($(UNAME_S),Darwin)
# Default on macOS: use Accelerate (no OpenBLAS needed)
CMAKE_ARGS+=-DUSE_OPENBLAS=OFF
else
CMAKE_ARGS+=-DUSE_OPENBLAS=ON
endif
# Single library target
ifeq ($(UNAME_S),Darwin)
VARIANT_TARGETS = libgovoxtral.dylib
else
VARIANT_TARGETS = libgovoxtral.so
endif
sources/voxtral.c:
mkdir -p sources/voxtral.c
cd sources/voxtral.c && \
git init && \
git remote add origin $(VOXTRAL_REPO) && \
git fetch origin && \
git checkout $(VOXTRAL_VERSION) && \
git submodule update --init --recursive --depth 1 --single-branch
voxtral: main.go govoxtral.go $(VARIANT_TARGETS)
CGO_ENABLED=0 $(GOCMD) build -tags "$(GO_TAGS)" -o voxtral ./
package: voxtral
bash package.sh
build: package
clean: purge
rm -rf libgovoxtral.so libgovoxtral.dylib package sources/voxtral.c voxtral
purge:
rm -rf build*
# Build single library
ifeq ($(UNAME_S),Darwin)
libgovoxtral.dylib: sources/voxtral.c
$(MAKE) purge
$(info Building voxtral: darwin)
SO_TARGET=libgovoxtral.dylib NATIVE=true $(MAKE) libgovoxtral-custom
rm -rfv build*
else
libgovoxtral.so: sources/voxtral.c
$(MAKE) purge
$(info Building voxtral)
SO_TARGET=libgovoxtral.so $(MAKE) libgovoxtral-custom
rm -rfv build*
endif
libgovoxtral-custom: CMakeLists.txt csrc/govoxtral.c csrc/govoxtral.h
mkdir -p build-$(SO_TARGET) && \
cd build-$(SO_TARGET) && \
cmake .. $(CMAKE_ARGS) && \
cmake --build . --config Release -j$(JOBS) && \
cd .. && \
(mv build-$(SO_TARGET)/libgovoxtral.so ./$(SO_TARGET) 2>/dev/null || \
mv build-$(SO_TARGET)/libgovoxtral.dylib ./$(SO_TARGET) 2>/dev/null)
test: voxtral
@echo "Running voxtral tests..."
bash test.sh
@echo "voxtral tests completed."
all: voxtral package
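As a usage note: the Makefile derives its BLAS/Metal CMake flags from BUILD_TYPE, so local builds reduce to the same make invocations the CI job above runs. A hedged sketch, assuming the dependencies installed in that workflow (build-essential, cmake, libopenblas-dev):

# Linux default: OpenBLAS-accelerated CPU build, plus the test target
make --jobs=5 --output-sync=target -C backend/go/voxtral
make --jobs=5 --output-sync=target -C backend/go/voxtral test

# macOS arm64: Metal build (an assumption based on the BUILD_TYPE=metal branch above)
BUILD_TYPE=metal make -C backend/go/voxtral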

View File

@@ -0,0 +1,62 @@
#include "govoxtral.h"
#include "voxtral.h"
#include "voxtral_audio.h"
#ifdef USE_METAL
#include "voxtral_metal.h"
#endif
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
static vox_ctx_t *ctx = NULL;
static char *last_result = NULL;
static int metal_initialized = 0;
int load_model(const char *model_dir) {
if (ctx != NULL) {
vox_free(ctx);
ctx = NULL;
}
#ifdef USE_METAL
if (!metal_initialized) {
vox_metal_init();
metal_initialized = 1;
}
#endif
ctx = vox_load(model_dir);
if (ctx == NULL) {
fprintf(stderr, "error: failed to load voxtral model from %s\n", model_dir);
return 1;
}
return 0;
}
const char *transcribe(const char *wav_path) {
if (ctx == NULL) {
fprintf(stderr, "error: model not loaded\n");
return "";
}
if (last_result != NULL) {
free(last_result);
last_result = NULL;
}
last_result = vox_transcribe(ctx, wav_path);
if (last_result == NULL) {
fprintf(stderr, "error: transcription failed for %s\n", wav_path);
return "";
}
return last_result;
}
void free_result(void) {
if (last_result != NULL) {
free(last_result);
last_result = NULL;
}
}

View File

@@ -0,0 +1,8 @@
#ifndef GOVOXTRAL_H
#define GOVOXTRAL_H
extern int load_model(const char *model_dir);
extern const char *transcribe(const char *wav_path);
extern void free_result(void);
#endif /* GOVOXTRAL_H */

View File

@@ -0,0 +1,60 @@
package main
import (
"fmt"
"os"
"strings"
"github.com/mudler/LocalAI/pkg/grpc/base"
pb "github.com/mudler/LocalAI/pkg/grpc/proto"
"github.com/mudler/LocalAI/pkg/utils"
)
var (
CppLoadModel func(modelDir string) int
CppTranscribe func(wavPath string) string
CppFreeResult func()
)
type Voxtral struct {
base.SingleThread
}
func (v *Voxtral) Load(opts *pb.ModelOptions) error {
if ret := CppLoadModel(opts.ModelFile); ret != 0 {
return fmt.Errorf("failed to load Voxtral model from %s", opts.ModelFile)
}
return nil
}
func (v *Voxtral) AudioTranscription(opts *pb.TranscriptRequest) (pb.TranscriptResult, error) {
dir, err := os.MkdirTemp("", "voxtral")
if err != nil {
return pb.TranscriptResult{}, err
}
defer os.RemoveAll(dir)
convertedPath := dir + "/converted.wav"
if err := utils.AudioToWav(opts.Dst, convertedPath); err != nil {
return pb.TranscriptResult{}, err
}
result := strings.Clone(CppTranscribe(convertedPath))
CppFreeResult()
text := strings.TrimSpace(result)
segments := []*pb.TranscriptSegment{}
if text != "" {
segments = append(segments, &pb.TranscriptSegment{
Id: 0,
Text: text,
})
}
return pb.TranscriptResult{
Segments: segments,
Text: text,
}, nil
}
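Downstream, this AudioTranscription implementation is what LocalAI's OpenAI-compatible audio endpoint ends up calling. A hedged sketch of exercising it over HTTP once a model is configured to use this backend (the model name here is a placeholder, not something defined in this changeset):

# placeholder model name; any model whose config points at the voxtral backend
curl http://localhost:8080/v1/audio/transcriptions \
  -F model="voxtral-mini" \
  -F file=@sample.wav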

View File

@@ -0,0 +1,53 @@
package main
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
"os"
"runtime"
"github.com/ebitengine/purego"
grpc "github.com/mudler/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
type LibFuncs struct {
FuncPtr any
Name string
}
func main() {
// Get library name from environment variable, default to fallback
libName := os.Getenv("VOXTRAL_LIBRARY")
if libName == "" {
if runtime.GOOS == "darwin" {
libName = "./libgovoxtral.dylib"
} else {
libName = "./libgovoxtral.so"
}
}
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
if err != nil {
panic(err)
}
libFuncs := []LibFuncs{
{&CppLoadModel, "load_model"},
{&CppTranscribe, "transcribe"},
{&CppFreeResult, "free_result"},
}
for _, lf := range libFuncs {
purego.RegisterLibFunc(lf.FuncPtr, gosd, lf.Name)
}
flag.Parse()
if err := grpc.StartServer(*addr, &Voxtral{}); err != nil {
panic(err)
}
}

View File

@@ -0,0 +1,68 @@
#!/bin/bash
# Script to copy the appropriate libraries based on architecture
set -e
CURDIR=$(dirname "$(realpath $0)")
REPO_ROOT="${CURDIR}/../../.."
# Create lib directory
mkdir -p $CURDIR/package/lib
cp -avf $CURDIR/voxtral $CURDIR/package/
cp -fv $CURDIR/libgovoxtral-*.so $CURDIR/package/ 2>/dev/null || true
cp -fv $CURDIR/libgovoxtral-*.dylib $CURDIR/package/ 2>/dev/null || true
cp -fv $CURDIR/run.sh $CURDIR/package/
# Detect architecture and copy appropriate libraries
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
# x86_64 architecture
echo "Detected x86_64 architecture, copying x86_64 libraries..."
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so
cp -arfLv /lib/x86_64-linux-gnu/libc.so.6 $CURDIR/package/lib/libc.so.6
cp -arfLv /lib/x86_64-linux-gnu/libgcc_s.so.1 $CURDIR/package/lib/libgcc_s.so.1
cp -arfLv /lib/x86_64-linux-gnu/libstdc++.so.6 $CURDIR/package/lib/libstdc++.so.6
cp -arfLv /lib/x86_64-linux-gnu/libm.so.6 $CURDIR/package/lib/libm.so.6
cp -arfLv /lib/x86_64-linux-gnu/libgomp.so.1 $CURDIR/package/lib/libgomp.so.1
cp -arfLv /lib/x86_64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
cp -arfLv /lib/x86_64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
cp -arfLv /lib/x86_64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
# OpenBLAS if available
if [ -f /usr/lib/x86_64-linux-gnu/libopenblas.so.0 ]; then
cp -arfLv /usr/lib/x86_64-linux-gnu/libopenblas.so.0 $CURDIR/package/lib/
fi
elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
# ARM64 architecture
echo "Detected ARM64 architecture, copying ARM64 libraries..."
cp -arfLv /lib/ld-linux-aarch64.so.1 $CURDIR/package/lib/ld.so
cp -arfLv /lib/aarch64-linux-gnu/libc.so.6 $CURDIR/package/lib/libc.so.6
cp -arfLv /lib/aarch64-linux-gnu/libgcc_s.so.1 $CURDIR/package/lib/libgcc_s.so.1
cp -arfLv /lib/aarch64-linux-gnu/libstdc++.so.6 $CURDIR/package/lib/libstdc++.so.6
cp -arfLv /lib/aarch64-linux-gnu/libm.so.6 $CURDIR/package/lib/libm.so.6
cp -arfLv /lib/aarch64-linux-gnu/libgomp.so.1 $CURDIR/package/lib/libgomp.so.1
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
# OpenBLAS if available
if [ -f /usr/lib/aarch64-linux-gnu/libopenblas.so.0 ]; then
cp -arfLv /usr/lib/aarch64-linux-gnu/libopenblas.so.0 $CURDIR/package/lib/
fi
elif [ $(uname -s) = "Darwin" ]; then
echo "Detected Darwin — system frameworks linked dynamically, no bundled libs needed"
else
echo "Error: Could not detect architecture"
exit 1
fi
# Package GPU libraries based on BUILD_TYPE
GPU_LIB_SCRIPT="${REPO_ROOT}/scripts/build/package-gpu-libs.sh"
if [ -f "$GPU_LIB_SCRIPT" ]; then
echo "Packaging GPU libraries for BUILD_TYPE=${BUILD_TYPE:-cpu}..."
source "$GPU_LIB_SCRIPT" "$CURDIR/package/lib"
package_gpu_libs
fi
echo "Packaging completed successfully"
ls -liah $CURDIR/package/
ls -liah $CURDIR/package/lib/

backend/go/voxtral/run.sh (new file, 49 lines)
View File

@@ -0,0 +1,49 @@
#!/bin/bash
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
cd /
echo "CPU info:"
if [ "$(uname)" != "Darwin" ]; then
grep -e "model\sname" /proc/cpuinfo | head -1
grep -e "flags" /proc/cpuinfo | head -1
fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgovoxtral-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgovoxtral-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgovoxtral-avx.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgovoxtral-avx2.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx2.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export VOXTRAL_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it (Linux only)
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/voxtral "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/voxtral "$@"

View File

@@ -0,0 +1,48 @@
#!/bin/bash
set -e
CURDIR=$(dirname "$(realpath $0)")
echo "Running voxtral backend tests..."
# The test requires:
# - VOXTRAL_MODEL_DIR: path to directory containing consolidated.safetensors + tekken.json
# - VOXTRAL_BINARY: path to the voxtral binary (defaults to ./voxtral)
#
# Tests that require the model will be skipped if VOXTRAL_MODEL_DIR is not set.
cd "$CURDIR"
export VOXTRAL_MODEL_DIR="${VOXTRAL_MODEL_DIR:-./voxtral-model}"
if [ ! -d "$VOXTRAL_MODEL_DIR" ]; then
echo "Creating voxtral-model directory for tests..."
mkdir -p "$VOXTRAL_MODEL_DIR"
MODEL_ID="mistralai/Voxtral-Mini-4B-Realtime-2602"
echo "Model: ${MODEL_ID}"
echo ""
# Files to download
FILES=(
"consolidated.safetensors"
"params.json"
"tekken.json"
)
BASE_URL="https://huggingface.co/${MODEL_ID}/resolve/main"
for file in "${FILES[@]}"; do
dest="${VOXTRAL_MODEL_DIR}/${file}"
if [ -f "${dest}" ]; then
echo " [skip] ${file} (already exists)"
else
echo " [download] ${file}..."
curl -L -o "${dest}" "${BASE_URL}/${file}" --progress-bar
echo " [done] ${file}"
fi
done
fi
# Run Go tests
go test -v -timeout 300s ./...
echo "All voxtral tests passed."

View File

@@ -0,0 +1,201 @@
package main
import (
"context"
"fmt"
"io"
"net/http"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"time"
pb "github.com/mudler/LocalAI/pkg/grpc/proto"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
)
const (
testAddr = "localhost:50051"
sampleAudio = "https://qianwen-res.oss-cn-beijing.aliyuncs.com/Qwen3-ASR-Repo/asr_en.wav"
startupWait = 5 * time.Second
)
func skipIfNoModel(t *testing.T) string {
t.Helper()
modelDir := os.Getenv("VOXTRAL_MODEL_DIR")
if modelDir == "" {
t.Skip("VOXTRAL_MODEL_DIR not set, skipping test (set to voxtral model directory)")
}
if _, err := os.Stat(filepath.Join(modelDir, "consolidated.safetensors")); os.IsNotExist(err) {
t.Skipf("Model file not found in %s, skipping", modelDir)
}
return modelDir
}
func startServer(t *testing.T) *exec.Cmd {
t.Helper()
binary := os.Getenv("VOXTRAL_BINARY")
if binary == "" {
binary = "./voxtral"
}
if _, err := os.Stat(binary); os.IsNotExist(err) {
t.Skipf("Backend binary not found at %s, skipping", binary)
}
cmd := exec.Command(binary, "--addr", testAddr)
cmd.Stdout = os.Stderr
cmd.Stderr = os.Stderr
if err := cmd.Start(); err != nil {
t.Fatalf("Failed to start server: %v", err)
}
time.Sleep(startupWait)
return cmd
}
func stopServer(cmd *exec.Cmd) {
if cmd != nil && cmd.Process != nil {
cmd.Process.Kill()
cmd.Wait()
}
}
func dialGRPC(t *testing.T) *grpc.ClientConn {
t.Helper()
conn, err := grpc.Dial(testAddr,
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultCallOptions(
grpc.MaxCallRecvMsgSize(50*1024*1024),
grpc.MaxCallSendMsgSize(50*1024*1024),
),
)
if err != nil {
t.Fatalf("Failed to dial gRPC: %v", err)
}
return conn
}
func downloadFile(url, dest string) error {
resp, err := http.Get(url)
if err != nil {
return fmt.Errorf("HTTP GET failed: %w", err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("bad status: %s", resp.Status)
}
f, err := os.Create(dest)
if err != nil {
return err
}
defer f.Close()
_, err = io.Copy(f, resp.Body)
return err
}
func TestServerHealth(t *testing.T) {
cmd := startServer(t)
defer stopServer(cmd)
conn := dialGRPC(t)
defer conn.Close()
client := pb.NewBackendClient(conn)
resp, err := client.Health(context.Background(), &pb.HealthMessage{})
if err != nil {
t.Fatalf("Health check failed: %v", err)
}
if string(resp.Message) != "OK" {
t.Fatalf("Expected OK, got %s", string(resp.Message))
}
}
func TestLoadModel(t *testing.T) {
modelDir := skipIfNoModel(t)
cmd := startServer(t)
defer stopServer(cmd)
conn := dialGRPC(t)
defer conn.Close()
client := pb.NewBackendClient(conn)
resp, err := client.LoadModel(context.Background(), &pb.ModelOptions{
ModelFile: modelDir,
})
if err != nil {
t.Fatalf("LoadModel failed: %v", err)
}
if !resp.Success {
t.Fatalf("LoadModel returned failure: %s", resp.Message)
}
}
func TestAudioTranscription(t *testing.T) {
modelDir := skipIfNoModel(t)
tmpDir, err := os.MkdirTemp("", "voxtral-test")
if err != nil {
t.Fatal(err)
}
defer os.RemoveAll(tmpDir)
// Download the sample English speech clip (asr_en.wav) used by the assertions below
audioFile := filepath.Join(tmpDir, "sample.wav")
t.Log("Downloading sample audio...")
if err := downloadFile(sampleAudio, audioFile); err != nil {
t.Fatalf("Failed to download sample audio: %v", err)
}
cmd := startServer(t)
defer stopServer(cmd)
conn := dialGRPC(t)
defer conn.Close()
client := pb.NewBackendClient(conn)
// Load model
loadResp, err := client.LoadModel(context.Background(), &pb.ModelOptions{
ModelFile: modelDir,
})
if err != nil {
t.Fatalf("LoadModel failed: %v", err)
}
if !loadResp.Success {
t.Fatalf("LoadModel returned failure: %s", loadResp.Message)
}
// Transcribe
transcriptResp, err := client.AudioTranscription(context.Background(), &pb.TranscriptRequest{
Dst: audioFile,
})
if err != nil {
t.Fatalf("AudioTranscription failed: %v", err)
}
if transcriptResp == nil {
t.Fatal("AudioTranscription returned nil")
}
t.Logf("Transcribed text: %s", transcriptResp.Text)
t.Logf("Number of segments: %d", len(transcriptResp.Segments))
if transcriptResp.Text == "" {
t.Fatal("Transcription returned empty text")
}
allText := strings.ToLower(transcriptResp.Text)
for _, seg := range transcriptResp.Segments {
allText += " " + strings.ToLower(seg.Text)
}
t.Logf("All text: %s", allText)
if !strings.Contains(allText, "big") {
t.Errorf("Expected 'big' in transcription, got: %s", allText)
}
// The sample audio should contain recognizable speech
if len(allText) < 10 {
t.Errorf("Transcription too short: %q", allText)
}
}

View File

@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
# whisper.cpp version
WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
WHISPER_CPP_VERSION?=941bdabbe4561bc6de68981aea01bc5ab05781c5
WHISPER_CPP_VERSION?=4b23ff249e7f93137cb870b28fb27818e074c255
SO_TARGET?=libgowhisper.so
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
@@ -88,19 +88,19 @@ ifeq ($(UNAME_S),Linux)
libgowhisper-avx.so: sources/whisper.cpp
$(MAKE) purge
$(info ${GREEN}I whisper build info:avx${RESET})
SO_TARGET=libgowhisper-avx.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off" $(MAKE) libgowhisper-custom
SO_TARGET=libgowhisper-avx.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgowhisper-custom
rm -rfv build*
libgowhisper-avx2.so: sources/whisper.cpp
$(MAKE) purge
$(info ${GREEN}I whisper build info:avx2${RESET})
SO_TARGET=libgowhisper-avx2.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=on -DGGML_AVX512=off -DGGML_FMA=on -DGGML_F16C=on" $(MAKE) libgowhisper-custom
SO_TARGET=libgowhisper-avx2.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=on -DGGML_AVX512=off -DGGML_FMA=on -DGGML_F16C=on -DGGML_BMI2=on" $(MAKE) libgowhisper-custom
rm -rfv build*
libgowhisper-avx512.so: sources/whisper.cpp
$(MAKE) purge
$(info ${GREEN}I whisper build info:avx512${RESET})
SO_TARGET=libgowhisper-avx512.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=off -DGGML_AVX512=on -DGGML_FMA=on -DGGML_F16C=on" $(MAKE) libgowhisper-custom
SO_TARGET=libgowhisper-avx512.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=on -DGGML_AVX2=on -DGGML_AVX512=on -DGGML_FMA=on -DGGML_F16C=on -DGGML_BMI2=on" $(MAKE) libgowhisper-custom
rm -rfv build*
endif
@@ -108,7 +108,7 @@ endif
libgowhisper-fallback.so: sources/whisper.cpp
$(MAKE) purge
$(info ${GREEN}I whisper build info:fallback${RESET})
SO_TARGET=libgowhisper-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off" $(MAKE) libgowhisper-custom
SO_TARGET=libgowhisper-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgowhisper-custom
rm -rfv build*
libgowhisper-custom: CMakeLists.txt gowhisper.cpp gowhisper.h

View File

@@ -56,6 +56,21 @@
nvidia-cuda-12: "cuda12-whisper"
nvidia-l4t-cuda-12: "nvidia-l4t-arm64-whisper"
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-whisper"
- &voxtral
name: "voxtral"
alias: "voxtral"
license: mit
description: |
Voxtral Realtime 4B Pure C speech-to-text inference engine
urls:
- https://github.com/mudler/voxtral.c
tags:
- audio-transcription
- CPU
- Metal
capabilities:
default: "cpu-voxtral"
metal-darwin-arm64: "metal-voxtral"
- &stablediffusionggml
name: "stablediffusion-ggml"
alias: "stablediffusion-ggml"
@@ -2594,3 +2609,24 @@
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-pocket-tts"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-pocket-tts
## voxtral
- !!merge <<: *voxtral
name: "cpu-voxtral"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-voxtral"
mirrors:
- localai/localai-backends:latest-cpu-voxtral
- !!merge <<: *voxtral
name: "cpu-voxtral-development"
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-voxtral"
mirrors:
- localai/localai-backends:master-cpu-voxtral
- !!merge <<: *voxtral
name: "metal-voxtral"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-voxtral"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-voxtral
- !!merge <<: *voxtral
name: "metal-voxtral-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-voxtral"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-voxtral
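These gallery entries map directly to prebuilt backend images; a hedged sketch of pulling the development CPU image by hand, with the tag copied verbatim from the cpu-voxtral-development entry above:

docker pull quay.io/go-skynet/local-ai-backends:master-cpu-voxtral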

View File

@@ -1,4 +1,4 @@
grpcio==1.76.0
grpcio==1.78.0
protobuf==6.33.5
certifi
setuptools

View File

@@ -122,3 +122,4 @@ LocalAI supports various types of backends:
- **Diffusion Backends**: For image generation
- **TTS Backends**: For text-to-speech conversion
- **Whisper Backends**: For speech-to-text conversion
- **Sound Generation Backends**: For music and audio generation (e.g., ACE-Step)

View File

@@ -14,7 +14,7 @@ LocalAI to ease out installations of models provide a way to preload models on s
{{% notice note %}}
The models in this gallery are not directly maintained by LocalAI. If you find a model that is not working, please open an issue on the model gallery repository.
The models in this gallery are not directly maintained by LocalAI. If you find a model that is not working, please open an issue on the [main LocalAI repository](https://github.com/mudler/LocalAI/issues).
{{% /notice %}}
{{% notice note %}}

View File

@@ -1,3 +1,3 @@
{
"version": "v3.10.1"
"version": "v3.11.0"
}

View File

@@ -12398,6 +12398,311 @@
- filename: llama-cpp/mmproj/mmproj-mistral-community_pixtral-12b-f16.gguf
sha256: a0b21e5a3b0f9b0b604385c45bb841142e7a5ac7660fa6a397dbc87c66b2083e
uri: huggingface://bartowski/mistral-community_pixtral-12b-GGUF/mmproj-mistral-community_pixtral-12b-f16.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-14b-instruct-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-14B-Instruct-2512
- https://huggingface.co/unsloth/Ministral-3-14B-Instruct-2512-GGUF
description: |
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, capable of fitting in 24GB of VRAM in FP8, and less if further quantized.
Key Features:
Ministral 3 14B consists of two main architectural components:
- 13.5B Language Model
- 0.4B Vision Encoder
The Ministral 3 14B Instruct model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 16384
parameters:
model: llama-cpp/models/mistralai_Ministral-3-14B-Instruct-2512-Q4_K_M.gguf
temperature: 0.15
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-14B-Instruct-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-14B-Instruct-2512-Q4_K_M.gguf
sha256: 76ce697c065f2e40f1e8e958118b02cab38e2c10a6015f7d7908036a292dc8c8
uri: huggingface://unsloth/Ministral-3-14B-Instruct-2512-GGUF/Ministral-3-14B-Instruct-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-14B-Instruct-2512-f32.gguf
sha256: 2740ba9e9b30b09be4282a9a9f617ec43dc47b89aed416cb09b5f698f90783b5
uri: huggingface://unsloth/Ministral-3-14B-Instruct-2512-GGUF/mmproj-F32.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-14b-reasoning-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-14B-Reasoning-2512
- https://huggingface.co/unsloth/Ministral-3-14B-Reasoning-2512-GGUF
description: |
The largest model in the Ministral 3 family, Ministral 3 14B offers frontier capabilities and performance comparable to its larger Mistral Small 3.2 24B counterpart. A powerful and efficient language model with vision capabilities.
This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding, and STEM-related use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 14B can even be deployed locally, capable of fitting in 32GB of VRAM in BF16, and less than 24GB of RAM/VRAM when quantized.
Key Features:
Ministral 3 14B consists of two main architectural components:
- 13.5B Language Model
- 0.4B Vision Encoder
The Ministral 3 14B Reasoning model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 32768
parameters:
model: llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf
temperature: 0.7
top_p: 0.95
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-14B-Reasoning-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf
sha256: f577390559b89ebdbfe52cc234ea334649c24e6003ffa4b6a2474c5e2a47aa17
uri: huggingface://unsloth/Ministral-3-14B-Reasoning-2512-GGUF/Ministral-3-14B-Reasoning-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-14B-Reasoning-2512-f32.gguf
sha256: 891bf262a032968f6e5b3d4e9ffc84cf6381890033c2f5204fbdf4817af4ab9b
uri: huggingface://unsloth/Ministral-3-14B-Reasoning-2512-GGUF/mmproj-F32.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-8b-instruct-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-8B-Instruct-2512
- https://huggingface.co/unsloth/Ministral-3-8B-Instruct-2512-GGUF
description: |
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 12GB of VRAM in FP8, and less if further quantized.
Key Features:
Ministral 3 8B consists of two main architectural components:
- 8.4B Language Model
- 0.4B Vision Encoder
The Ministral 3 8B Instruct model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 16384
parameters:
model: llama-cpp/models/mistralai_Ministral-3-8B-Instruct-2512-Q4_K_M.gguf
temperature: 0.15
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-8B-Instruct-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-8B-Instruct-2512-Q4_K_M.gguf
sha256: 5dbc3647eb563b9f8d3c70ec3d906cce84b86bb35c5e0b8a36e7df3937ab7174
uri: huggingface://unsloth/Ministral-3-8B-Instruct-2512-GGUF/Ministral-3-8B-Instruct-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-8B-Instruct-2512-f32.gguf
sha256: 242d11ff65ef844b0aac4e28d4b1318813370608845f17b3ef5826fd7e7fd015
uri: huggingface://unsloth/Ministral-3-8B-Instruct-2512-GGUF/mmproj-F32.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-8b-reasoning-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-8B-Reasoning-2512
- https://huggingface.co/unsloth/Ministral-3-8B-Reasoning-2512-GGUF
description: |
A balanced model in the Ministral 3 family, Ministral 3 8B is a powerful, efficient tiny language model with vision capabilities.
This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding, and STEM-related use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 8B can even be deployed locally, capable of fitting in 24GB of VRAM in BF16, and less than 12GB of RAM/VRAM when quantized.
Key Features:
Ministral 3 8B consists of two main architectural components:
- 8.4B Language Model
- 0.4B Vision Encoder
The Ministral 3 8B Reasoning model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 32768
parameters:
model: llama-cpp/models/mistralai_Ministral-3-8B-Reasoning-2512-Q4_K_M.gguf
temperature: 0.7
top_p: 0.95
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-8B-Reasoning-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-8B-Reasoning-2512-Q4_K_M.gguf
sha256: c3d1c5ab7406a0fc9d50ad2f0d15d34d5693db00bf953e8a9cd9a243b81cb1b2
uri: huggingface://unsloth/Ministral-3-8B-Reasoning-2512-GGUF/Ministral-3-8B-Reasoning-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-8B-Reasoning-2512-f32.gguf
sha256: 92252621cb957949379ff81ee14b15887d37eade3845a6e937e571b98c2c84c2
uri: huggingface://unsloth/Ministral-3-8B-Reasoning-2512-GGUF/mmproj-F32.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-3b-instruct-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512
- https://huggingface.co/unsloth/Ministral-3-3B-Instruct-2512-GGUF
description: |
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, capable of fitting in 8GB of VRAM in FP8, and less if further quantized.
Key Features:
Ministral 3 3B consists of two main architectural components:
- 3.4B Language Model
- 0.4B Vision Encoder
The Ministral 3 3B Instruct model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 16384
parameters:
model: llama-cpp/models/mistralai_Ministral-3-3B-Instruct-2512-Q4_K_M.gguf
temperature: 0.15
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-3B-Instruct-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-3B-Instruct-2512-Q4_K_M.gguf
sha256: fd46fc371ff0509bfa8657ac956b7de8534d7d9baaa4947975c0648c3aa397f4
uri: huggingface://unsloth/Ministral-3-3B-Instruct-2512-GGUF/Ministral-3-3B-Instruct-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-3B-Instruct-2512-f32.gguf
sha256: 57bb4e6f01166985ca2fc16061be4023fcb95cb8e60f445b8d0bf1ee30268636
uri: huggingface://unsloth/Ministral-3-3B-Instruct-2512-GGUF/mmproj-F32.gguf
- !!merge <<: *mistral03
name: "mistralai_ministral-3-3b-reasoning-2512-multimodal"
urls:
- https://huggingface.co/mistralai/Ministral-3-3B-Reasoning-2512
- https://huggingface.co/unsloth/Ministral-3-3B-Reasoning-2512-GGUF
description: |
The smallest model in the Ministral 3 family, Ministral 3 3B is a powerful, efficient tiny language model with vision capabilities.
This model is the reasoning post-trained version, trained for reasoning tasks, making it ideal for math, coding, and STEM-related use cases.
The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware. Ministral 3 3B can even be deployed locally, fitting in 16GB of VRAM in BF16, and less than 8GB of RAM/VRAM when quantized.
Key Features:
Ministral 3 3B consists of two main architectural components:
- 3.4B Language Model
- 0.4B Vision Encoder
The Ministral 3 3B Reasoning model offers the following capabilities:
- Vision: Enables the model to analyze images and provide insights based on visual content, in addition to text.
- Multilingual: Supports dozens of languages, including English, French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic.
- System Prompt: Maintains strong adherence and support for system prompts.
- Agentic: Offers best-in-class agentic capabilities with native function calling and JSON outputting.
- Reasoning: Excels at complex, multi-step reasoning and dynamic problem-solving.
- Edge-Optimized: Delivers best-in-class performance at a small scale, deployable anywhere.
- Apache 2.0 License: Open-source license allowing usage and modification for both commercial and non-commercial purposes.
- Large Context Window: Supports a 256k context window.
This gallery entry includes mmproj for multimodality and uses Unsloth recommended defaults.
tags:
- llm
- gguf
- gpu
- mistral
- cpu
- function-calling
- multimodal
overrides:
context_size: 32768
parameters:
model: llama-cpp/models/mistralai_Ministral-3-3B-Reasoning-2512-Q4_K_M.gguf
temperature: 0.7
top_p: 0.95
mmproj: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-3B-Reasoning-2512-f32.gguf
files:
- filename: llama-cpp/models/mistralai_Ministral-3-3B-Reasoning-2512-Q4_K_M.gguf
sha256: a2648395d533b6d1408667d00e0b778f3823f3f3179ba371f89355f2e957e42e
uri: huggingface://unsloth/Ministral-3-3B-Reasoning-2512-GGUF/Ministral-3-3B-Reasoning-2512-Q4_K_M.gguf
- filename: llama-cpp/mmproj/mmproj-mistralai_Ministral-3-3B-Reasoning-2512-f32.gguf
sha256: 8035a6a10dfc6250f50c62764fae3ac2ef6d693fc9252307c7093198aabba812
uri: huggingface://unsloth/Ministral-3-3B-Reasoning-2512-GGUF/mmproj-F32.gguf
- &mudler
url: "github:mudler/LocalAI/gallery/mudler.yaml@master" ### START mudler's LocalAI specific-models
name: "LocalAI-llama3-8b-function-call-v0.2"