mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-08 00:36:37 -04:00

Ettore Di Giacinto 2de6ca51d4 fix(vllm): switch L4T13 backend to PyPI aarch64+cu130 wheels

The L4T13 vllm backend pulled torch / torchvision / torchaudio / vllm from
pypi.jetson-ai-lab.io's sbsa/cu130 mirror via [tool.uv.sources] with no
version pins. That mirror started shipping torch 2.11.0 next to a
vllm-0.20.0+cu130 wheel that was still compiled against torch 2.10's c10
ABI, so uv landed on the mismatched pair and vllm crashed at import:

  ImportError: vllm/_C.abi3.so: undefined symbol:
  _ZN3c1013MessageLoggerC1EPKciib

(c10::MessageLogger's constructor signature changed between torch 2.10 and
2.11; the vllm wheel referenced the 2.10 form, the installed libc10.so
exported only the 2.11 form.)

Since torch 2.11 (April 2026) PyPI publishes its own aarch64 + cu130
manylinux wheels, and vllm 0.20.0 ships an aarch64 wheel whose Requires-
Dist locks torch==2.11.0 / torchvision==0.26.0 / torchaudio==2.11.0. That
makes uv's resolver produce an ABI-consistent set automatically, so the
mirror and the [tool.uv.sources] pinning are no longer needed.
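
The consistency that the resolver now provides can also be checked after install. The following is a minimal sketch, not part of this commit: it reads the `torch` pin from vllm's Requires-Dist metadata and compares it with the installed torch version. The helper names `pinned_version` and `check_torch_matches_vllm` are hypothetical, and the parser only handles plain `name==version` requirement strings.

```python
import re
from importlib import metadata


def pinned_version(requirements, name):
    """Return the version pinned with '==' for `name`, or None if not pinned."""
    # Anchor on the exact distribution name so "torch" does not match "torchvision".
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*==\s*([^\s;,]+)", re.IGNORECASE
    )
    for req in requirements or []:
        match = pattern.match(req)
        if match:
            return match.group(1)
    return None


def check_torch_matches_vllm():
    """Compare installed torch with the pin in vllm's Requires-Dist.

    Returns True/False when both are known, or None when vllm or torch
    is not installed or vllm carries no exact torch pin.
    """
    try:
        reqs = metadata.requires("vllm")
        installed = metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None
    pinned = pinned_version(reqs, "torch")
    if pinned is None:
        return None
    # Drop a local version suffix such as "+cu130" before comparing.
    return installed.split("+")[0] == pinned.split("+")[0]


if __name__ == "__main__":
    print("torch matches vllm pin:", check_torch_matches_vllm())
```

A `False` result would point at the kind of mismatched pair described above; `None` only means the check could not be made.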

flash-attn is dropped from the dep list: PyPI has no aarch64 wheel, but
vLLM 0.20+ already bundles its own vllm_flash_attn (fa2 + fa3) inside the
main wheel, so the Dao-AILab package isn't required at runtime.
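
One way to confirm which attention package is present in an environment is to probe for the modules without using them. This is a hedged sketch: the helper `module_available` is hypothetical, and the module path `vllm.vllm_flash_attn` is an assumption based on the package name mentioned above. Note that probing a dotted name imports the parent package, so a broken vllm install surfaces here as `False`.

```python
import importlib.util


def module_available(name):
    """Return True if `name` can be located, False otherwise."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when the parent package is missing or fails to import.
        return False


if __name__ == "__main__":
    # "flash_attn" is the import name of the Dao-AILab package;
    # "vllm.vllm_flash_attn" is an assumed path for the bundled copy.
    for mod in ("vllm.vllm_flash_attn", "flash_attn"):
        print(mod, module_available(mod))
```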

Reference: https://pytorch.org/blog/vllm-and-pytorch-work-together-to-improve-the-developer-experience-on-aarch64/

Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] [WebFetch]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-05-22 19:32:04 +00:00

backend.py

feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629)

2026-05-01 10:56:24 +02:00

install.sh

fix(vllm): switch L4T13 backend to PyPI aarch64+cu130 wheels

2026-05-22 19:32:04 +00:00

Makefile

feat(mlx): add mlx backend (#6049)

2025-08-22 08:42:29 +02:00

package.sh

feat(vllm, distributed): tensor parallel distributed workers (#9612)

2026-05-06 00:22:50 +02:00

pyproject.toml

fix(vllm): switch L4T13 backend to PyPI aarch64+cu130 wheels

2026-05-22 19:32:04 +00:00

README.md

refactor: move backends into the backends directory (#1279)

2023-11-13 22:40:16 +01:00

requirements-after.txt

feat(vllm): parity with llama.cpp backend (#9328)

2026-04-13 11:00:29 +02:00

requirements-cpu-after.txt

feat(vllm): parity with llama.cpp backend (#9328)

2026-04-13 11:00:29 +02:00

requirements-cpu.txt

feat(vllm): parity with llama.cpp backend (#9328)

2026-04-13 11:00:29 +02:00

requirements-cublas12-after.txt

fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557)

2026-04-25 15:38:13 +00:00

requirements-cublas12.txt

fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557)

2026-04-25 15:38:13 +00:00

requirements-cublas13-after.txt

chore: ⬆️ Update vllm-project/vllm cu130 wheel to 0.21.0 (#9846)

2026-05-15 23:45:41 +02:00

requirements-cublas13.txt

feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553)

2026-04-25 12:26:29 +02:00

requirements-hipblas-after.txt

feat(vllm): parity with llama.cpp backend (#9328)

2026-04-13 11:00:29 +02:00

requirements-hipblas.txt

feat(rocm): bump to 7.x (#9323)

2026-04-12 08:51:30 +02:00

requirements-install.txt

fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation

2026-04-28 20:08:26 +00:00

requirements-intel-after.txt

feat(vllm, distributed): tensor parallel distributed workers (#9612)

2026-05-06 00:22:50 +02:00

requirements-intel.txt

feat(vllm, distributed): tensor parallel distributed workers (#9612)

2026-05-06 00:22:50 +02:00

requirements.txt

chore(deps): update charset-normalizer requirement from >=3.4.0 to >=3.4.7 in /backend/python/vllm (#9779)

2026-05-12 09:22:23 +02:00

run.sh

fix(python-backend): make JIT subprocesses work on hosts of any size (#9679)

2026-05-06 00:28:01 +02:00

test.py

feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563)

2026-04-29 00:49:28 +02:00

test.sh

feat: Add backend gallery (#5607)

2025-06-15 14:56:52 +02:00

README.md

Creating a separate environment for the vllm project

make vllm