mirror of https://github.com/mudler/LocalAI.git synced 2026-05-17 21:21:23 -04:00

Files

Attila Györffy 5a67b5d73c Fix image upload processing and img2img pipeline in diffusers backend (#8879)

* fix: add missing bufio.Flush in processImageFile

The processImageFile function writes decoded image data (from base64
or URL download) through a bufio.NewWriter but never calls Flush()
before closing the underlying file. Since bufio's default buffer is
4096 bytes, small images produce 0-byte files and large images are
truncated — causing PIL to fail with "cannot identify image file".

This breaks all image input paths: file, files, and ref_images
parameters in /v1/images/generations, making img2img, inpainting,
and reference image features non-functional.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
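
The failure mode is not specific to Go. The following is a minimal Python analogue, not the LocalAI code: `io.BufferedWriter` stands in for Go's `bufio.Writer`, and the 4096-byte buffer size is passed explicitly to mirror the default mentioned above. The function name `sizes_before_and_after_flush` is hypothetical.

```python
import io
import os
import tempfile


def sizes_before_and_after_flush(payload: bytes, buffer_size: int = 4096):
    """Return the on-disk file size before and after flushing a buffered writer."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "image.bin")
        raw = io.FileIO(path, "w")  # unbuffered file handle
        writer = io.BufferedWriter(raw, buffer_size=buffer_size)
        try:
            writer.write(payload)  # a small payload stays in the in-memory buffer
            before = os.path.getsize(path)
            writer.flush()  # analogous to the Flush() call the fix adds
            after = os.path.getsize(path)
        finally:
            writer.close()
        return before, after


if __name__ == "__main__":
    print(sizes_before_and_after_flush(b"x" * 100))
```

Unlike the Go case, closing a Python `BufferedWriter` flushes implicitly, which is why the sketch measures the file size before `close()` rather than after it.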

* fix: merge options into kwargs in diffusers GenerateImage

The GenerateImage method builds a local `options` dict containing the
source image (PIL), negative_prompt, and num_inference_steps, but
never merges it into `kwargs` before calling self.pipe(**kwargs).
This causes img2img to fail with "Input is in incorrect format"
because the pipeline never receives the image parameter.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
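
A minimal sketch of the bug and the fix, assuming a stand-in pipeline. `fake_pipe`, `generate_image`, and the `merge` flag are hypothetical names used only for illustration and do not come from the backend source.

```python
def fake_pipe(**kwargs):
    """Hypothetical stand-in for the diffusers pipeline: echoes its arguments."""
    return kwargs


def generate_image(prompt, image=None, negative_prompt=None, steps=None, merge=True):
    """Build the pipeline call; merge=False reproduces the pre-fix behaviour."""
    kwargs = {"prompt": prompt}
    # Per-request values are collected in a separate dict, as described above.
    options = {"negative_prompt": negative_prompt, "num_inference_steps": steps}
    if image is not None:
        options["image"] = image
    if merge:
        kwargs.update(options)  # the fix: hand the options to the pipeline
    return fake_pipe(**kwargs)


if __name__ == "__main__":
    print(sorted(generate_image("a cat", image="img", steps=20, merge=False)))
    print(sorted(generate_image("a cat", image="img", steps=20, merge=True)))
```

With `merge=False` the stand-in pipeline only ever sees `prompt`, which matches the symptom described: the source image never reaches the pipeline.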

* test: add unit test for processImageFile base64 decoding

Verifies that a base64-encoded PNG survives the write path
(encode → decode → bufio.Write → Flush → file on disk) with
byte-for-byte fidelity. The test image is small enough to fit
entirely in bufio's 4096-byte buffer, which is the exact scenario
where the missing Flush() produced a 0-byte file.

Also tests that invalid base64 input is handled gracefully.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
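
The actual test is written in Go. As a rough Python analogue of the same round trip (encode, decode, buffered write, flush, read back), the sketch below uses two hypothetical helpers, `save_base64_image` and `round_trip`. Rejecting malformed input via `validate=True` is an assumption about what "handled gracefully" could look like, not the project's behaviour.

```python
import base64
import binascii
import os
import tempfile


def save_base64_image(encoded: str, path: str) -> int:
    """Decode base64 text and write it to path; return the number of bytes written."""
    # validate=True makes malformed input raise binascii.Error instead of being skipped.
    data = base64.b64decode(encoded, validate=True)
    with open(path, "wb") as fh:
        fh.write(data)
        fh.flush()  # explicit flush, mirroring the step under test
    return len(data)


def round_trip(data: bytes) -> bytes:
    """Encode data, save it through save_base64_image, and read the file back."""
    encoded = base64.b64encode(data).decode("ascii")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "img.png")
        save_base64_image(encoded, path)
        with open(path, "rb") as fh:
            return fh.read()


if __name__ == "__main__":
    sample = b"\x89PNG\r\n\x1a\n" + bytes(64)  # PNG signature plus padding, not a real image
    print(round_trip(sample) == sample)
    try:
        save_base64_image("not base64!!", "unused.png")
    except binascii.Error as exc:
        print("rejected:", exc)
```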

* test: verify GenerateImage merges options into pipeline kwargs

Mocks the diffusers pipeline and calls GenerateImage with a source
image and negative prompt. Asserts that the pipeline receives the
image, negative_prompt, and num_inference_steps via kwargs — the
exact parameters that were silently dropped before the fix.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
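
A test of this shape can be sketched with `unittest.mock`. `TinyBackend` below is a hypothetical stand-in for the backend class, not the LocalAI servicer, and the string used for the image is a placeholder for a PIL object.

```python
from unittest.mock import MagicMock


class TinyBackend:
    """Hypothetical stand-in for the backend; only the kwargs handling is modelled."""

    def __init__(self, pipe):
        self.pipe = pipe

    def generate_image(self, prompt, image=None, negative_prompt=None, steps=None):
        kwargs = {"prompt": prompt}
        options = {"negative_prompt": negative_prompt, "num_inference_steps": steps}
        if image is not None:
            options["image"] = image
        kwargs.update(options)
        return self.pipe(**kwargs)


def pipeline_call_kwargs():
    """Call the backend with a mocked pipeline and return the kwargs it received."""
    pipe = MagicMock()
    backend = TinyBackend(pipe)
    backend.generate_image(
        "a cat", image="pil-image-placeholder", negative_prompt="blurry", steps=20
    )
    pipe.assert_called_once()
    return pipe.call_args.kwargs


if __name__ == "__main__":
    print(pipeline_call_kwargs())
```

Asserting on `call_args.kwargs` checks what the pipeline actually received, so a regression that drops the merge would fail the test instead of passing silently.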

* fix: move kwargs.update(options) earlier in GenerateImage

Move the options merge right after self.options merge (L742) so that
image, negative_prompt, and num_inference_steps are available to all
downstream code paths including img2vid and txt2vid.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>

* test: convert processImageFile tests to ginkgo

Replace standard testing with ginkgo/gomega to be consistent with
the rest of the test suites in the project.

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>

---------

Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2026-03-11 08:05:50 +01:00

| Name | Last commit | Date |
| --- | --- | --- |
| ace-step | feat(musicgen): add ace-step and UI interface (#8396) | 2026-02-05 12:04:53 +01:00 |
| chatterbox | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| common | chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/common/template (#8641) | 2026-02-25 08:14:43 +01:00 |
| coqui | chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/coqui (#8642) | 2026-02-25 08:14:30 +01:00 |
| diffusers | Fix image upload processing and img2img pipeline in diffusers backend (#8879) | 2026-03-11 08:05:50 +01:00 |
| faster-qwen3-tts | chore(faster-qwen3-tts): Add anyio to requirements.txt | 2026-03-03 09:43:29 +01:00 |
| faster-whisper | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| kitten-tts | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| kokoro | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| mlx | feat(mlx): Add support for CUDA12, CUDA13, L4T, SBSA and CPU (#8380) | 2026-02-03 23:53:34 +01:00 |
| mlx-audio | feat(mlx): Add support for CUDA12, CUDA13, L4T, SBSA and CPU (#8380) | 2026-02-03 23:53:34 +01:00 |
| mlx-distributed | feat(mlx-distributed): add new MLX-distributed backend (#8801) | 2026-03-09 17:29:32 +01:00 |
| mlx-vlm | feat(mlx): Add support for CUDA12, CUDA13, L4T, SBSA and CPU (#8380) | 2026-02-03 23:53:34 +01:00 |
| moonshine | fix: update moonshine API, add setuptools to voxcpm requirements (#8541) | 2026-02-12 23:22:37 +01:00 |
| nemo | feat(nemo): add Nemo (only asr for now) backend (#8436) | 2026-02-07 08:19:37 +01:00 |
| neutts | fix: pin neutts-air to known working commit (#8566) | 2026-02-14 21:16:37 +01:00 |
| outetts | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| pocket-tts | feat: Add debug logging for pocket-tts voice issue #8244 (#8715) | 2026-03-02 09:24:59 +01:00 |
| qwen-asr | fix(ci): try to fix deps for l4t13 on qwen-* | 2026-02-14 10:21:23 +01:00 |
| qwen-tts | fix(qwen-tts): duplicate instruct argument in voice design mode (#8842) | 2026-03-08 08:48:22 +01:00 |
| rerankers | chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/rerankers (#8636) | 2026-02-25 08:16:57 +01:00 |
| rfdetr | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| transformers | chore(deps): bump sentence-transformers from 5.2.2 to 5.2.3 in /backend/python/transformers (#8638) | 2026-02-25 08:16:41 +01:00 |
| vibevoice | feat(vibevoice): add ASR support (#8222) | 2026-01-27 20:19:22 +01:00 |
| vllm | chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/vllm (#8635) | 2026-02-25 08:17:32 +01:00 |
| vllm-omni | feat(vllm-omni): add new backend (#8188) | 2026-01-24 22:23:30 +01:00 |
| voxcpm | fix(voxcpm): pin setuptools (#8556) | 2026-02-13 23:44:35 +01:00 |
| whisperx | feat(metal): try to extend support to remaining backends (#8374) | 2026-02-03 21:57:50 +01:00 |
| README.md | chore: drop bark which is unmaintained (#8207) | 2026-01-25 09:26:40 +01:00 |

README.md

Python Backends for LocalAI

This directory contains Python-based AI backends for LocalAI, providing support for various AI models and hardware acceleration targets.

Overview

The Python backends use a unified build system based on libbackend.sh that provides:

Automatic virtual environment management with support for both uv and pip
Hardware-specific dependency installation (CPU, CUDA, Intel, MLX, etc.)
Portable Python support for standalone deployments
Consistent backend execution across different environments

Available Backends

Core AI Models

transformers - Hugging Face Transformers framework (PyTorch-based)
vllm - High-performance LLM inference engine
mlx - Apple Silicon optimized ML framework

Audio & Speech

coqui - Coqui TTS models
faster-whisper - Fast Whisper speech recognition
kitten-tts - Lightweight TTS
mlx-audio - Apple Silicon audio processing
chatterbox - TTS model
kokoro - TTS models

Computer Vision

diffusers - Stable Diffusion and image generation
mlx-vlm - Vision-language models for Apple Silicon
rfdetr - Object detection models

Specialized

rerankers - Text reranking models

Quick Start

Prerequisites

Python 3.10+ (default: 3.10.18)
uv package manager (recommended) or pip
Appropriate hardware drivers for your target (CUDA, Intel, etc.)

Installation

Each backend can be installed individually:

# Navigate to a specific backend
cd backend/python/transformers

# Install dependencies
make transformers
# or
bash install.sh

# Run the backend
make run
# or
bash run.sh

Using the Unified Build System

The libbackend.sh script provides consistent commands across all backends:

# Source the library in your backend script
# (quoted so that paths containing spaces still resolve)
source "$(dirname "$0")/../common/libbackend.sh"

# Install requirements (automatically handles hardware detection)
installRequirements

# Start the backend server, forwarding all arguments unchanged
startBackend "$@"

# Run tests
runUnittests

Hardware Targets

The build system automatically detects and configures for different hardware:

CPU - Standard CPU-only builds
CUDA - NVIDIA GPU acceleration (supports CUDA 12/13)
Intel - Intel XPU/GPU optimization
MLX - Apple Silicon (M1/M2/M3) optimization
HIP - AMD GPU acceleration

Target-Specific Requirements

Backends can specify hardware-specific dependencies:

requirements.txt - Base requirements
requirements-cpu.txt - CPU-specific packages
requirements-cublas12.txt - CUDA 12 packages
requirements-cublas13.txt - CUDA 13 packages
requirements-intel.txt - Intel-optimized packages
requirements-mps.txt - Apple Silicon packages
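
How these files combine for a given target can be pictured with a short sketch. This is an illustration of the naming scheme only, under the assumption that the base file is installed first and a `requirements-<target>.txt` file is added when it exists. The real resolution logic lives in libbackend.sh and may differ, and `requirement_files` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def requirement_files(backend_dir, build_type):
    """List requirements files to install: base first, then the target-specific one."""
    backend_dir = Path(backend_dir)
    candidates = [backend_dir / "requirements.txt"]
    if build_type:
        # Assumed naming scheme: requirements-<target>.txt
        candidates.append(backend_dir / f"requirements-{build_type}.txt")
    return [path.name for path in candidates if path.is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("requirements.txt", "requirements-cublas12.txt"):
            Path(tmp, name).write_text("# placeholder\n")
        print(requirement_files(tmp, "cublas12"))
        print(requirement_files(tmp, "intel"))
```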

Configuration Options

Environment Variables

PYTHON_VERSION - Python version (default: 3.10)
PYTHON_PATCH - Python patch version (default: 18)
BUILD_TYPE - Force specific build target
USE_PIP - Use pip instead of uv (default: false)
PORTABLE_PYTHON - Enable portable Python builds
LIMIT_TARGETS - Restrict backend to specific targets
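
As a sketch of how the documented defaults fit together, the hypothetical helper below reads three of these variables and derives the full interpreter version and the installer. Treating `USE_PIP` as the literal string `true` is an assumption; the build scripts may parse it differently.

```python
import os


def python_settings(env=None):
    """Resolve the Python version and installer from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    version = env.get("PYTHON_VERSION", "3.10")
    patch = env.get("PYTHON_PATCH", "18")
    # Assumption: only the literal string "true" (any case) enables pip.
    use_pip = env.get("USE_PIP", "false").lower() == "true"
    return {
        "python": f"{version}.{patch}",
        "installer": "pip" if use_pip else "uv",
    }


if __name__ == "__main__":
    print(python_settings({}))
    print(python_settings({"PYTHON_VERSION": "3.11", "PYTHON_PATCH": "9", "USE_PIP": "true"}))
```

With no overrides, the resolved version is 3.10.18, which matches the default listed under Prerequisites.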

Example: CUDA 12 Only Backend

# In your backend script
LIMIT_TARGETS="cublas12"
source "$(dirname "$0")/../common/libbackend.sh"

Example: Intel-Optimized Backend

# In your backend script
LIMIT_TARGETS="intel"
source "$(dirname "$0")/../common/libbackend.sh"

Development

Adding a New Backend

Create a new directory in backend/python/
Copy the template structure from common/template/
Implement your backend.py with the required gRPC interface
Add appropriate requirements files for your target hardware
Use libbackend.sh for consistent build and execution

Testing

# Run backend tests
make test
# or
bash test.sh

Building

# Install dependencies
make <backend-name>

# Clean build artifacts
make clean

Architecture

Each backend follows a consistent structure:

backend-name/
├── backend.py          # Main backend implementation
├── requirements.txt    # Base dependencies
├── requirements-*.txt  # Hardware-specific dependencies
├── install.sh         # Installation script
├── run.sh            # Execution script
├── test.sh           # Test script
├── Makefile          # Build targets
└── test.py           # Unit tests
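
When scaffolding a new backend, a quick check against this layout can catch missing files early. The sketch below is a hypothetical helper, not part of the repository. It checks only the fixed file names from the tree above and skips the `requirements-*.txt` pattern, since the set of hardware-specific files varies per backend.

```python
import tempfile
from pathlib import Path

# Fixed file names taken from the layout above.
EXPECTED_FILES = (
    "backend.py",
    "requirements.txt",
    "install.sh",
    "run.sh",
    "test.sh",
    "Makefile",
    "test.py",
)


def missing_backend_files(backend_dir):
    """Return the expected file names that are absent from backend_dir."""
    backend_dir = Path(backend_dir)
    return [name for name in EXPECTED_FILES if not (backend_dir / name).is_file()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("backend.py", "requirements.txt", "run.sh"):
            Path(tmp, name).touch()
        print(missing_backend_files(tmp))
```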

Troubleshooting

Common Issues

Missing dependencies: Ensure all requirements files are properly configured
Hardware detection: Check that BUILD_TYPE matches your system
Python version: Verify Python 3.10+ is available
Virtual environment: Use ensureVenv to create/activate environments

Contributing

When adding new backends or modifying existing ones:

Follow the established directory structure
Use libbackend.sh for consistent behavior
Include appropriate requirements files for all target hardware
Add comprehensive tests
Update this README if adding new backend types