fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688)

Two unrelated CI breakages bundled together since both are one-liners:

- rerankers: bump torch 2.4.1 -> 2.7.1 on cpu/cublas12. The unpinned
  transformers now resolves to 5.x, whose moe.py registers a custom_op
  with string-typed `'torch.Tensor'` annotations that torch 2.4.1's
  infer_schema rejects, blocking the gRPC server from starting and
  failing all 5 backend tests with "Connection refused" on :50051 (see
  the first sketch below). 2.7.1 matches the torch version already
  used by the transformers backend.

- vllm-omni: strip fa3-fwd from the upstream requirements/cuda.txt
  before resolving on aarch64. fa3-fwd 0.0.3 ships only an x86_64
  wheel and has no sdist, making the cuda profile unsatisfiable on
  Jetson/SBSA. fa3-fwd is a soft runtime dep: vllm-omni's attention
  backends fall back to FA2 and then to SDPA when it is missing (see
  the second sketch below).
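
A minimal sketch of the failing registration pattern, for reference.
The op name and body here are hypothetical; the real registration
lives in transformers 5.x's moe.py:

    import torch

    # torch 2.4.1's infer_schema cannot resolve string-typed annotations
    # such as "torch.Tensor" and raises at registration time, i.e. at
    # import, so the backend never gets as far as binding :50051;
    # torch 2.7.x resolves them and the op registers normally.
    @torch.library.custom_op("demo::moe_route", mutates_args=())
    def moe_route(x: "torch.Tensor") -> "torch.Tensor":
        return x.clone()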

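And a sketch of the soft-dependency fallback that makes dropping
fa3-fwd safe. Module and variable names are illustrative, not
vllm-omni's actual layout:

    # Probe for the fused FA3 kernel first, then degrade gracefully.
    try:
        import fa3_fwd  # assumed import name; x86_64-only wheel
        ATTN_BACKEND = "fa3"
    except ImportError:
        try:
            import flash_attn  # FA2
            ATTN_BACKEND = "fa2"
        except ImportError:
            # always available: torch's scaled_dot_product_attention
            ATTN_BACKEND = "sdpa"
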
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>


@@ -1,4 +1,4 @@
 transformers
 accelerate
-torch==2.4.1
+torch==2.7.1
 rerankers[transformers]


@@ -1,4 +1,4 @@
 transformers
 accelerate
-torch==2.4.1
+torch==2.7.1
 rerankers[transformers]


@@ -79,6 +79,14 @@ fi
 cd vllm-omni/
+# fa3-fwd ships no aarch64 wheels and there is no source distribution, so on
+# aarch64 (e.g. l4t13 / SBSA cu130) the upstream requirements/cuda.txt is
+# unsatisfiable. Drop it before resolving — vllm-omni does not hard-require
+# the fused FA3 kernel at import time on Jetson/SBSA targets.
+if [ "$(uname -m)" = "aarch64" ] && [ -f requirements/cuda.txt ]; then
+    sed -i '/^fa3-fwd[[:space:]]*==/d' requirements/cuda.txt
+fi
+
 if [ "x${USE_PIP}" == "xtrue" ]; then
     pip install ${EXTRA_PIP_INSTALL_FLAGS:-} -e .
 else