mirror of https://github.com/mudler/LocalAI.git synced 2026-07-30 18:09:05 -04:00

Files

LocalAI [bot] 60facc7252 fix(darwin): publish sherpa-onnx and speaker-recognition images for darwin/arm64 (#10275)

Neither the sherpa-onnx nor the speaker-recognition backend had a
darwin/arm64 image, so `local-ai backends install` failed with "no child
with platform darwin/arm64" on macOS. This left /v1/audio/diarization (the
sherpa-onnx path) and /v1/voice/embed without any usable backend on Apple
Silicon.

Both backends build on darwin/arm64:
- sherpa-onnx (Go) already fetches the onnxruntime osx-arm64 runtime in its
  Makefile; it only needed a darwin matrix entry (build-type metal, lang go,
  like whisper and silero-vad).
- speaker-recognition (Python) needed a requirements-mps.txt so the mps build
  installs plain onnxruntime (which ships a macOS arm64 wheel) instead of the
  onnxruntime-gpu pulled by its base requirements (which does not).

Add both to the includeDarwin build matrix, wire the metal capability and
metal image aliases into the gallery, and add the speaker-recognition
requirements-mps.txt.

Fixes #10268


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-12 22:32:42 +02:00

| Name | Last commit | Date |
| --- | --- | --- |
| ace-step | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| chatterbox | feat(tts): support per-request instructions and params (#10172) | 2026-06-04 11:45:02 +02:00 |
| common | fix(python-backend): make JIT subprocesses work on hosts of any size (#9679) | 2026-05-06 00:28:01 +02:00 |
| coqui | chore(deps): bump packaging from 24.1 to 26.2 in /backend/python/coqui (#9594) | 2026-04-28 08:44:53 +02:00 |
| diffusers | fix(diffusers): drop compel from requirements to unblock pip resolver (#9632) | 2026-05-01 14:45:14 +02:00 |
| faster-qwen3-tts | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| faster-whisper | test(ci): trigger faster-whisper rebuild to observe per-arch+merge | 2026-05-08 22:09:46 +00:00 |
| fish-speech | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| insightface | feat: add biometrics UI (#9524) | 2026-04-24 08:50:34 +02:00 |
| kitten-tts | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| kokoro | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| liquid-audio | feat(realtime): Add Liquid Audio s2s model and assistant mode on talk page (#9801) | 2026-05-13 21:57:27 +02:00 |
| llama-cpp-quantization | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| mlx | feat: refactor shared helpers and enhance MLX backend functionality (#9335) | 2026-04-13 18:44:03 +02:00 |
| mlx-audio | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| mlx-distributed | feat: refactor shared helpers and enhance MLX backend functionality (#9335) | 2026-04-13 18:44:03 +02:00 |
| mlx-vlm | fix(mlx-vlm): pin upstream to v0.4.4 to unblock CUDA builds (#9568) | 2026-04-25 22:06:01 +02:00 |
| moonshine | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| nemo | fix(nemo): pin texterrors to 1.1.6 for GLIBCXX compatibility (#10134) | 2026-06-02 14:48:27 +02:00 |
| neutts | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| outetts | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| pocket-tts | feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629) | 2026-05-01 10:56:24 +02:00 |
| qwen-asr | fix(qwen-asr): enable timestamp output when forced_aligner is configured (#10013) | 2026-05-26 20:34:21 +00:00 |
| qwen-tts | feat(tts): support per-request instructions and params (#10172) | 2026-06-04 11:45:02 +02:00 |
| rerankers | fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688) | 2026-05-06 17:07:24 +02:00 |
| rfdetr | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| sglang | fix(L4T13 backends): switch vllm/sglang/vllm-omni to PyPI aarch64+cu130 wheels (#9950) | 2026-05-22 23:01:22 +02:00 |
| speaker-recognition | fix(darwin): publish sherpa-onnx and speaker-recognition images for darwin/arm64 (#10275) | 2026-06-12 22:32:42 +02:00 |
| tinygrad | feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629) | 2026-05-01 10:56:24 +02:00 |
| transformers | chore(deps): bump grpcio from 1.80.0 to 1.81.0 in /backend/python/transformers (#10158) | 2026-06-03 10:38:43 +02:00 |
| trl | feat: add distributed mode (#9124) | 2026-03-30 00:47:27 +02:00 |
| vibevoice | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| vllm | fix(vllm): parse tool_call function arguments before applying the chat template (#10256) | 2026-06-11 23:55:38 +02:00 |
| vllm-omni | fix(L4T13 backends): switch vllm/sglang/vllm-omni to PyPI aarch64+cu130 wheels (#9950) | 2026-05-22 23:01:22 +02:00 |
| voxcpm | feat(rocm): bump to 7.x (#9323) | 2026-04-12 08:51:30 +02:00 |
| whisperx | chore(whisperx): drop ROCm/hipblas build target (#9474) | 2026-04-21 21:50:18 +02:00 |
| README.md | chore: drop bark which is unmaintained (#8207) | 2026-01-25 09:26:40 +01:00 |

README.md

Python Backends for LocalAI

This directory contains Python-based AI backends for LocalAI, providing support for various AI models and hardware acceleration targets.

Overview

The Python backends use a unified build system based on libbackend.sh that provides:

Automatic virtual environment management with support for both uv and pip
Hardware-specific dependency installation (CPU, CUDA, Intel, MLX, etc.)
Portable Python support for standalone deployments
Consistent backend execution across different environments
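
As an illustration of the uv/pip choice mentioned above, the following is a minimal sketch of how a script could pick an installer from the USE_PIP variable. The function name choose_installer is hypothetical and is not taken from libbackend.sh, whose actual logic may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of libbackend.sh): print the installer
# command to use, based on the USE_PIP environment variable.
choose_installer() {
    if [ "${USE_PIP:-false}" = "true" ]; then
        # Plain pip was requested explicitly.
        echo "pip install"
    else
        # Default: uv's pip-compatible interface.
        echo "uv pip install"
    fi
}
```

A caller could then run something like `$(choose_installer) -r requirements.txt` inside an activated virtual environment.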

Available Backends

Core AI Models

transformers - Hugging Face Transformers framework (PyTorch-based)
vllm - High-performance LLM inference engine
mlx - Apple Silicon optimized ML framework

Audio & Speech

coqui - Coqui TTS models
faster-whisper - Fast Whisper speech recognition
kitten-tts - Lightweight TTS
mlx-audio - Apple Silicon audio processing
chatterbox - TTS model
kokoro - TTS models

Computer Vision

diffusers - Stable Diffusion and image generation
mlx-vlm - Vision-language models for Apple Silicon
rfdetr - Object detection models

Specialized

rerankers - Text reranking models

Quick Start

Prerequisites

Python 3.10+ (default: 3.10.18)
uv package manager (recommended) or pip
Appropriate hardware drivers for your target (CUDA, Intel, etc.)

Installation

Each backend can be installed individually:

# Navigate to a specific backend
cd backend/python/transformers

# Install dependencies
make transformers
# or
bash install.sh

# Run the backend
make run
# or
bash run.sh

Using the Unified Build System

The libbackend.sh script provides consistent commands across all backends:

# Source the library in your backend script
source "$(dirname "$0")/../common/libbackend.sh"

# Install requirements (automatically handles hardware detection)
installRequirements

# Start the backend server
startBackend "$@"

# Run tests
runUnittests

Hardware Targets

The build system automatically detects the available hardware and configures the build for one of the following targets:

CPU - Standard CPU-only builds
CUDA - NVIDIA GPU acceleration (supports CUDA 12/13)
Intel - Intel XPU/GPU optimization
MLX - Apple Silicon (M1/M2/M3) optimization
HIP - AMD GPU acceleration

Target-Specific Requirements

Backends can specify hardware-specific dependencies:

requirements.txt - Base requirements
requirements-cpu.txt - CPU-specific packages
requirements-cublas12.txt - CUDA 12 packages
requirements-cublas13.txt - CUDA 13 packages
requirements-intel.txt - Intel-optimized packages
requirements-mps.txt - Apple Silicon packages
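
To make the naming scheme above concrete, here is a rough sketch that lists which requirements files would apply to a given build type. The function requirement_files is a hypothetical illustration and assumes the base file is combined with a single requirements-<build-type>.txt file; the actual resolution in libbackend.sh may consider more files or a different order.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: list the requirements files that apply to one build type.
# Usage: requirement_files <backend-dir> <build-type>
requirement_files() {
    local dir="$1" build_type="$2" file
    for file in "requirements.txt" "requirements-${build_type}.txt"; do
        # Only report files that actually exist in the backend directory.
        if [ -f "${dir}/${file}" ]; then
            echo "${file}"
        fi
    done
}
```

For example, `requirement_files backend/python/transformers cublas12` would print the base file followed by the CUDA 12 file, if both are present.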

Configuration Options

Environment Variables

PYTHON_VERSION - Python version (default: 3.10)
PYTHON_PATCH - Python patch version (default: 18)
BUILD_TYPE - Force a specific build target instead of relying on detection
USE_PIP - Use pip instead of uv (default: false)
PORTABLE_PYTHON - Enable portable Python builds
LIMIT_TARGETS - Restrict the backend to specific targets
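
The two version variables presumably combine into the full default version quoted under Prerequisites (3.10.18). A minimal sketch of that composition, using a hypothetical function name and assuming the values are simply joined with a dot:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: combine PYTHON_VERSION and PYTHON_PATCH into a full
# version string, falling back to the defaults documented in this README.
python_full_version() {
    local minor="${PYTHON_VERSION:-3.10}"
    local patch="${PYTHON_PATCH:-18}"
    echo "${minor}.${patch}"
}
```

Calling it with `PYTHON_VERSION=3.12` and `PYTHON_PATCH=4` exported would print 3.12.4 under this assumption.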

Example: CUDA 12 Only Backend

# In your backend script
LIMIT_TARGETS="cublas12"
source "$(dirname "$0")/../common/libbackend.sh"

Example: Intel-Optimized Backend

# In your backend script
LIMIT_TARGETS="intel"
source "$(dirname "$0")/../common/libbackend.sh"
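
For intuition, restricting targets can be thought of as a membership check on the build type. The sketch below is an illustration only: target_allowed is a hypothetical name, and treating LIMIT_TARGETS as a comma- or space-separated list is an assumption, not something this README specifies.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: succeed if the given build type is permitted by
# LIMIT_TARGETS. The list format (commas or spaces) is an assumption.
target_allowed() {
    local build_type="$1" target
    # An empty or unset LIMIT_TARGETS means no restriction.
    if [ -z "${LIMIT_TARGETS:-}" ]; then
        return 0
    fi
    # Turn commas into spaces, then rely on word splitting to iterate.
    for target in ${LIMIT_TARGETS//,/ }; do
        if [ "${target}" = "${build_type}" ]; then
            return 0
        fi
    done
    return 1
}
```

With `LIMIT_TARGETS="cublas12"`, `target_allowed cublas12` succeeds while `target_allowed intel` fails.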

Development

Adding a New Backend

Create a new directory in backend/python/
Copy the template structure from common/template/
Implement your backend.py with the required gRPC interface
Add appropriate requirements files for your target hardware
Use libbackend.sh for consistent build and execution
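
The first two steps above can be sketched as a small helper. scaffold_backend is a hypothetical function written for illustration, and it assumes the template lives at common/template/ relative to the directory passed in, as described in the list.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: create a new backend directory from the template.
# Usage: scaffold_backend <path-to-backend/python> <new-backend-name>
scaffold_backend() {
    local python_dir="$1" name="$2"
    local template="${python_dir}/common/template"
    local target="${python_dir}/${name}"
    if [ ! -d "${template}" ]; then
        echo "template directory not found: ${template}" >&2
        return 1
    fi
    if [ -e "${target}" ]; then
        echo "refusing to overwrite existing path: ${target}" >&2
        return 1
    fi
    mkdir -p "${target}"
    # Copy the template contents, including dotfiles, into the new directory.
    cp -R "${template}/." "${target}/"
}
```

For example, `scaffold_backend backend/python my-backend` would leave a copy of the template in backend/python/my-backend, ready for the remaining steps.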

Testing

# Run backend tests
make test
# or
bash test.sh

Building

# Install dependencies
make <backend-name>

# Clean build artifacts
make clean

Architecture

Each backend follows a consistent structure:

backend-name/
├── backend.py          # Main backend implementation
├── requirements.txt    # Base dependencies
├── requirements-*.txt  # Hardware-specific dependencies
├── install.sh          # Installation script
├── run.sh              # Execution script
├── test.sh             # Test script
├── Makefile            # Build targets
└── test.py             # Unit tests
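
As a quick sanity check against the layout above, a script could report which of the expected files a backend directory lacks. This is a sketch only: missing_backend_files is a hypothetical helper, and it skips the optional requirements-*.txt files because which of them exist depends on the targets a backend supports.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print each expected file that is missing from a
# backend directory (one name per line; no output means nothing is missing).
missing_backend_files() {
    local dir="$1" file
    for file in backend.py requirements.txt install.sh run.sh test.sh Makefile test.py; do
        [ -e "${dir}/${file}" ] || echo "${file}"
    done
}
```

Running `missing_backend_files backend/python/transformers` from the repository root would print nothing if that backend follows the structure shown.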

Troubleshooting

Common Issues

Missing dependencies: Ensure the base requirements file and the hardware-specific requirements file for your target both exist and list the packages you need
Hardware detection: Check that BUILD_TYPE matches your system
Python version: Verify Python 3.10+ is available
Virtual environment: Use ensureVenv to create/activate environments
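
For the Python version item, a small comparison helper can confirm that an interpreter meets the 3.10 minimum. The function python_at_least below is a hypothetical sketch, not part of libbackend.sh, and it compares only the major and minor components.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: succeed if version <have> is at least <want>,
# comparing major and minor components only.
# Usage: python_at_least <have> <want>   e.g. python_at_least 3.10.18 3.10
python_at_least() {
    local have="$1" want="$2"
    local have_major="${have%%.*}" want_major="${want%%.*}"
    local have_rest="${have#*.}" want_rest="${want#*.}"
    local have_minor="${have_rest%%.*}" want_minor="${want_rest%%.*}"
    if [ "${have_major}" -ne "${want_major}" ]; then
        # Majors differ, so the major version alone decides.
        [ "${have_major}" -gt "${want_major}" ]
        return
    fi
    [ "${have_minor}" -ge "${want_minor}" ]
}
```

One way to use it is `python_at_least "$(python3 -c 'import platform; print(platform.python_version())')" 3.10`, which fails when the interpreter on the PATH is older than 3.10.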

Contributing

When adding new backends or modifying existing ones:

Follow the established directory structure
Use libbackend.sh for consistent behavior
Include appropriate requirements files for all target hardware
Add comprehensive tests
Update this README if adding new backend types