From 62c99c10b3dcd312112f6555aeb231bc727266c3 Mon Sep 17 00:00:00 2001
From: Adira
Date: Mon, 22 Jun 2026 13:38:06 +0300
Subject: [PATCH] fix(diffusers): pin diffusers and transformers to a known-good pair (#9979) (#10442)

fix(diffusers): pin diffusers and transformers to a known-good pair

The diffusers backend tracked git+https://github.com/huggingface/diffusers
(main) with an unpinned transformers. transformers v5 restructured
CLIPTextModel and removed the .text_model attribute that diffusers'
single-file loader reads, so loading any single-file Stable Diffusion
checkpoint fails:

    create_diffusers_clip_model_from_ldm (single_file_utils.py)
      position_embedding_dim = model.text_model.embeddings.position_embedding...
    AttributeError: 'CLIPTextModel' object has no attribute 'text_model'

No released diffusers (<=0.38.0) supports transformers v5 - only unreleased
diffusers main does. Because the requirements tracked main plus an unpinned
transformers, every backend image froze whichever pair existed at build
time, and images built once transformers v5 shipped but before diffusers
main caught up are permanently broken.

Pin the last known-good released pair across all requirements files:
diffusers==0.38.0 and transformers==4.57.6. 0.38.0 still exposes every
pipeline backend.py imports (Flux, Wan, Sana, LTX2, Qwen, GGUF), so no
functionality is lost, and builds become reproducible instead of drifting
into the broken window.
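
For illustration only, and not part of this patch: a minimal Python sketch
of how a build step could check that a requirements file carries the pinned
pair described above. The names exact_pins and is_known_good are
hypothetical, and the "transformers major version below 5" rule is an
assumption taken from this commit message, not from either library.

```python
import re

# Hypothetical helper (not in the LocalAI tree): matches "name==version".
PIN_RE = re.compile(r"^([A-Za-z0-9_.\-]+)==([^\s#;]+)")


def exact_pins(requirements_text, packages=("diffusers", "transformers")):
    """Return {package: version} for exact pins, or None when unpinned."""
    pins = {name: None for name in packages}
    for raw in requirements_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and pip options such as --extra-index-url.
        if not line or line.startswith(("#", "-")):
            continue
        match = PIN_RE.match(line)
        if match and match.group(1).lower() in pins:
            pins[match.group(1).lower()] = match.group(2)
    return pins


def is_known_good(pins):
    """Assumed rule: both packages pinned and transformers below v5."""
    diffusers = pins.get("diffusers")
    transformers = pins.get("transformers")
    if diffusers is None or transformers is None:
        return False
    return int(transformers.split(".")[0]) < 5


# Requirement lines before and after this patch.
before = "git+https://github.com/huggingface/diffusers\ntransformers\n"
after = "diffusers==0.38.0\ntransformers==4.57.6\n"

print(exact_pins(before))
print(exact_pins(after), is_known_good(exact_pins(after)))
```

A git+https requirement and a bare package name both count as unpinned
here, which is the situation this patch removes.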

Fixes #9979

Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Adira Denis Muhando
---
 backend/python/diffusers/requirements-cpu.txt | 22 ++++++++++++-------
 .../diffusers/requirements-cublas12.txt       | 22 ++++++++++++-------
 .../diffusers/requirements-cublas13.txt       | 22 ++++++++++++-------
 .../python/diffusers/requirements-hipblas.txt | 22 ++++++++++++-------
 .../python/diffusers/requirements-intel.txt   | 22 ++++++++++++-------
 .../python/diffusers/requirements-l4t12.txt   | 22 ++++++++++++-------
 .../python/diffusers/requirements-l4t13.txt   | 22 ++++++++++++-------
 backend/python/diffusers/requirements-mps.txt | 22 ++++++++++++-------
 8 files changed, 112 insertions(+), 64 deletions(-)

diff --git a/backend/python/diffusers/requirements-cpu.txt b/backend/python/diffusers/requirements-cpu.txt
index 8db419b29..46959222c 100644
--- a/backend/python/diffusers/requirements-cpu.txt
+++ b/backend/python/diffusers/requirements-cpu.txt
@@ -1,7 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cpu
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 torchvision==0.22.1
 accelerate
 git+https://github.com/xhinker/sd_embed
@@ -10,9 +10,15 @@ sentencepiece
 torch==2.7.1
 optimum-quanto
 ftfy
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
\ No newline at end of file
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
\ No newline at end of file
diff --git a/backend/python/diffusers/requirements-cublas12.txt b/backend/python/diffusers/requirements-cublas12.txt
index e3351ae75..5e6852cc7 100644
--- a/backend/python/diffusers/requirements-cublas12.txt
+++ b/backend/python/diffusers/requirements-cublas12.txt
@@ -1,7 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cu121
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 torchvision
 accelerate
 git+https://github.com/xhinker/sd_embed
@@ -10,9 +10,15 @@ sentencepiece
 torch
 ftfy
 optimum-quanto
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
diff --git a/backend/python/diffusers/requirements-cublas13.txt b/backend/python/diffusers/requirements-cublas13.txt
index 546998ba4..ce77b6e6e 100644
--- a/backend/python/diffusers/requirements-cublas13.txt
+++ b/backend/python/diffusers/requirements-cublas13.txt
@@ -1,7 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cu130
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 torchvision
 accelerate
 git+https://github.com/xhinker/sd_embed
@@ -10,9 +10,15 @@ sentencepiece
 torch
 ftfy
 optimum-quanto
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
diff --git a/backend/python/diffusers/requirements-hipblas.txt b/backend/python/diffusers/requirements-hipblas.txt
index 3480d1fd6..f3666d5f5 100644
--- a/backend/python/diffusers/requirements-hipblas.txt
+++ b/backend/python/diffusers/requirements-hipblas.txt
@@ -1,17 +1,23 @@
 --extra-index-url https://download.pytorch.org/whl/rocm7.0
 torch==2.10.0+rocm7.0
 torchvision==0.25.0+rocm7.0
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 accelerate
 peft
 sentencepiece
 optimum-quanto
 ftfy
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
\ No newline at end of file
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
\ No newline at end of file
diff --git a/backend/python/diffusers/requirements-intel.txt b/backend/python/diffusers/requirements-intel.txt
index c78f5ef23..73ab5b3b8 100644
--- a/backend/python/diffusers/requirements-intel.txt
+++ b/backend/python/diffusers/requirements-intel.txt
@@ -3,18 +3,24 @@ torch
 torchvision
 optimum[openvino]
 setuptools
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 accelerate
 git+https://github.com/xhinker/sd_embed
 peft
 sentencepiece
 optimum-quanto
 ftfy
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
\ No newline at end of file
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
\ No newline at end of file
diff --git a/backend/python/diffusers/requirements-l4t12.txt b/backend/python/diffusers/requirements-l4t12.txt
index 15857c4b0..9a9cdb0df 100644
--- a/backend/python/diffusers/requirements-l4t12.txt
+++ b/backend/python/diffusers/requirements-l4t12.txt
@@ -1,7 +1,7 @@
 --extra-index-url https://pypi.jetson-ai-lab.io/jp6/cu129/
 torch
-git+https://github.com/huggingface/diffusers
-transformers
+diffusers==0.38.0
+transformers==4.57.6
 accelerate
 peft
 optimum-quanto
@@ -9,9 +9,15 @@ numpy<2
 sentencepiece
 torchvision
 ftfy
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
diff --git a/backend/python/diffusers/requirements-l4t13.txt b/backend/python/diffusers/requirements-l4t13.txt
index 226033a61..964c9c9f2 100644
--- a/backend/python/diffusers/requirements-l4t13.txt
+++ b/backend/python/diffusers/requirements-l4t13.txt
@@ -1,7 +1,7 @@
 --extra-index-url https://download.pytorch.org/whl/cu130
 torch
-git+https://github.com/huggingface/diffusers
-transformers
+diffusers==0.38.0
+transformers==4.57.6
 accelerate
 peft
 optimum-quanto
@@ -10,9 +10,15 @@ sentencepiece
 torchvision
 ftfy
 chardet
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
diff --git a/backend/python/diffusers/requirements-mps.txt b/backend/python/diffusers/requirements-mps.txt
index 58eb65f02..eeea59ddd 100644
--- a/backend/python/diffusers/requirements-mps.txt
+++ b/backend/python/diffusers/requirements-mps.txt
@@ -1,16 +1,22 @@
 torch==2.7.1
 torchvision==0.22.1
-git+https://github.com/huggingface/diffusers
+diffusers==0.38.0
 opencv-python
-transformers
+transformers==4.57.6
 accelerate
 peft
 sentencepiece
 optimum-quanto
 ftfy
-# TODO: re-add compel once it supports transformers >= 5.
-# Tracking: https://github.com/damian0815/compel/pull/129
-# https://github.com/damian0815/compel/issues/128
-# compel currently pins transformers~=4.25, which forced pip into multi-hour
-# resolver backtracking storms in CI. backend.py imports it lazily and gates
-# the COMPEL=1 env var on the import succeeding, so dropping it here is safe.
\ No newline at end of file
+# diffusers and transformers are pinned together on purpose. transformers v5
+# restructured CLIPTextModel and dropped the `.text_model` attribute, which
+# breaks single-file Stable Diffusion loading on every released diffusers
+# (<=0.38.0); only unreleased diffusers main supports transformers v5. Tracking
+# main via git froze whichever broken pair existed at image-build time. Pin the
+# last known-good released pair so builds are reproducible and can't drift into
+# the broken window. See https://github.com/mudler/LocalAI/issues/9979
+#
+# compel is intentionally omitted: it pins transformers~=4.25, which conflicts
+# with this pin and previously forced pip into multi-hour resolver backtracking
+# storms in CI. backend.py imports it lazily and gates the COMPEL=1 env var on
+# the import succeeding, so dropping it here is safe.
\ No newline at end of file