fix(diffusers): pin diffusers and transformers to a known-good pair
The diffusers backend tracked git+https://github.com/huggingface/diffusers
(main) with an unpinned transformers. transformers v5 restructured
CLIPTextModel and removed the .text_model attribute that diffusers' single
-file loader reads, so loading any single-file Stable Diffusion checkpoint
fails:
create_diffusers_clip_model_from_ldm (single_file_utils.py)
position_embedding_dim = model.text_model.embeddings.position_embedding...
AttributeError: 'CLIPTextModel' object has no attribute 'text_model'
No released diffusers (<=0.38.0) supports transformers v5 - only unreleased
diffusers main does. Because the requirements tracked main plus an unpinned
transformers, every backend image froze whichever pair existed at build
time, and images built once transformers v5 shipped but before diffusers
main caught up are permanently broken.
Pin the last known-good released pair across all requirements files:
diffusers==0.38.0 and transformers==4.57.6. 0.38.0 still exposes every
pipeline backend.py imports (Flux, Wan, Sana, LTX2, Qwen, GGUF), so no
functionality is lost, and builds become reproducible instead of drifting
into the broken window.
Fixes#9979
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>
compel 2.3.1 (latest, Nov 2025) declares transformers~=4.25 in its
metadata, i.e. >=4.25,<5.0. After transformers 5.0 (2026-01-26) and
huggingface-hub 1.0 (2025-10-27) shipped, the weekly DEPS_REFRESH
cache rotation in CI started seeing the new majors and pip's resolver
went into multi-hour backtracking storms walking every transformers
4.x candidate against every accelerate/hf-hub/tokenizers combination
to find a set compel would accept. The 2026-04-29 backend-build for
the diffusers backend (darwin-mps + l4t + cublas13-turboquant matrix
cells) hit the GitHub Actions 6h job timeout still inside pip
install — the build itself never started.
compel is the only hard upper bound on transformers in this stack
(diffusers, accelerate, peft, optimum-quanto are all flexible), and
upstream support for transformers 5 is still in flight: damian0815/
compel#129 ("Modernize Compel for Transformers 5") and #128 ("Bump
transformers version to >5.0") are both open as of today.
backend.py only constructs Compel() when COMPEL=1 is set in the env
(default off), so make compel a true optional extra:
- Wrap the top-level `from compel import ...` in try/except
ImportError, mirroring the existing sd_embed pattern.
- Auto-disable COMPEL with a warning when the module isn't
installed, instead of crashing on module load.
- Drop compel from all eight requirements-*.txt variants so the
resolver no longer has to satisfy its transformers cap.
- Leave a TODO in backend.py and in each requirements file
pointing at the upstream PR/issue, so the dependency can be
reinstated once compel supports transformers >= 5.
Users who rely on weighted-prompt embeddings can opt in with a
manual `pip install compel` alongside COMPEL=1; the warning emitted
on startup tells them how.
Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit WebFetch]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add distributed mode (experimental)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix data races, mutexes, transactions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix events and tool stream in agent chat
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* use ginkgo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(cron): compute correctly time boundaries avoiding re-triggering
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not flood of healthy checks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not list obvious backends as text backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* tests fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop redundant healthcheck
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: add missing bufio.Flush in processImageFile
The processImageFile function writes decoded image data (from base64
or URL download) through a bufio.NewWriter but never calls Flush()
before closing the underlying file. Since bufio's default buffer is
4096 bytes, small images produce 0-byte files and large images are
truncated — causing PIL to fail with "cannot identify image file".
This breaks all image input paths: file, files, and ref_images
parameters in /v1/images/generations, making img2img, inpainting,
and reference image features non-functional.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: merge options into kwargs in diffusers GenerateImage
The GenerateImage method builds a local `options` dict containing the
source image (PIL), negative_prompt, and num_inference_steps, but
never merges it into `kwargs` before calling self.pipe(**kwargs).
This causes img2img to fail with "Input is in incorrect format"
because the pipeline never receives the image parameter.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: add unit test for processImageFile base64 decoding
Verifies that a base64-encoded PNG survives the write path
(encode → decode → bufio.Write → Flush → file on disk) with
byte-for-byte fidelity. The test image is small enough to fit
entirely in bufio's 4096-byte buffer, which is the exact scenario
where the missing Flush() produced a 0-byte file.
Also tests that invalid base64 input is handled gracefully.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: verify GenerateImage merges options into pipeline kwargs
Mocks the diffusers pipeline and calls GenerateImage with a source
image and negative prompt. Asserts that the pipeline receives the
image, negative_prompt, and num_inference_steps via kwargs — the
exact parameters that were silently dropped before the fix.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* fix: move kwargs.update(options) earlier in GenerateImage
Move the options merge right after self.options merge (L742) so that
image, negative_prompt, and num_inference_steps are available to all
downstream code paths including img2vid and txt2vid.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
* test: convert processImageFile tests to ginkgo
Replace standard testing with ginkgo/gomega to be consistent with
the rest of the test suites in the project.
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
---------
Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
* add experimental support for sd_embed-style prompt embedding
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
* add doc equivalent to compel
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
* need to use flux1 embedding function for flux model
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
---------
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
* Debug
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop openai video endpoint (is not complete)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add download button
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(diffusers): add support to LTX-2
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add to the gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: install only torch/torchvision from jetson index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: use pip for l4t-12
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Revert "fix: install only torch/torchvision from jetson index"
This reverts commit 2d2b020078
* chatterbox needs wheel
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): add cuda13 jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add to pipelines and to capabilities. Start to work on the gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* capabilities: try to detect by looking at /usr/local
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* neutts
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* backends.yaml
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add cuda13 l4t requirements.txt
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add cuda13 requirements.txt
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Pin vllm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Not all backends are compatible
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* add vllm to requirements
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* vllm is not pre-compiled for cuda 13
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(diffusers): add support for wan2.2
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): use ttl.sh for PRs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add ftfy deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Revert "chore(ci): use ttl.sh for PRs"
This reverts commit c9fc3ecf28.
* Simplify
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: do not pin torch/torchvision on cuda12
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat(backends): bundle python
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* test ci
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* vllm on self-hosted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add clang
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to fix it for Mac
* Relocate links only when is portable
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make sure to call macosPortableEnv
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use self-hosted for vllm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore: allow to install with pip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make the backend to build and actually work
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* List models from system only
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add script to build darwin python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Run protogen in libbackend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Detect if mps is available across python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI: try to build backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Index mlx-vlm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Remove mlx-vlm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop CI test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): add backend build tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(deps): bump torch and diffusers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): run diffusers/hipblas on self-hosted
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(ci): do not publish darwin if building from PRs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(README): Add device flags for Intel/XPU
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* fix(diffusers/xpu): Set device to XPU and ignore CUDA request when on Intel
Signed-off-by: Richard Palethorpe <io@richiejp.com>
---------
Signed-off-by: Richard Palethorpe <io@richiejp.com>
* feat: Add backend gallery
This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip: Backend Dockerfile for python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* feat: drop extras images, build python backends separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixup on all backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* test CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Tweaks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop old backends leftovers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Move dockerfile upper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix proto
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Feature dropped for consistency - we prefer model galleries
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add missing packages in the build image
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* exllama is ponly available on cublas
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* pin torch on chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups to index
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
* Install accellerators deps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add target arch
* Add cuda minor version
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use self-hosted runners
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* ci: use quay for test images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups for vllm and chatterbox
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups on CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chatterbox is only available for nvidia
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify CI builds
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt test, use qwen3
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* chore(model gallery): add jina-reranker-v1-tiny-en-gguf
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use reranker from llama.cpp in AIO images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Limit concurrent jobs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>