LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-03 21:07:33 -04:00

Files

Ettore Di Giacinto 2aedd2cf44 fix(backends): enable ROCm/HIP GPU offload for ggml audio backends (#10666)

qwen3-tts-cpp, omnivoice-cpp, acestep-cpp and vibevoice-cpp shipped
rocm-* variants that silently ran on CPU (the load log reported
"[Load] backend: CPU"). Two coupled defects caused this:

- The Makefiles passed -DGGML_HIPBLAS=ON, but the vendored ggml only
  understands -DGGML_HIP=ON (GGML_HIPBLAS was removed upstream), so the
  ggml-hip backend target was never created and no GPU code was built.
- The CMake foreach that links the ggml GPU backends into the module
  listed blas/cuda/metal/vulkan but not hip, so even a built ggml-hip
  would not have been linked and its static backend registration would
  never run.

CUDA users were unaffected because the cublas build passes the correct
GGML_CUDA=ON and the foreach already links cuda. The fix mirrors the
proven llama-cpp hipblas block (ROCm clang CC/CXX + AMDGPU_TARGETS) and
adds hip to each foreach. Upstream picks the best device via
ggml_backend_init_best(), so no runtime flag is needed once HIP is
compiled and linked.
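
As a rough illustration of the second defect, the linking loop might look
like the sketch below. The module target name (audio_backend) and the loop
body are assumptions about the general shape of such a CMakeLists.txt, not
a copy of the real file. Only the backend list and the ggml-hip target name
come from the description above.

```cmake
# Hypothetical module target; the real backends use their own target names.
# Assumes ggml was configured with -DGGML_HIP=ON so that ggml-hip exists.
foreach(backend blas cuda metal vulkan hip)  # "hip" was missing before the fix
  # Link a GPU backend only if ggml actually created its target.
  if(TARGET ggml-${backend})
    target_link_libraries(audio_backend PRIVATE ggml-${backend})
  endif()
endforeach()
```

With hip absent from the list, a successfully built ggml-hip would still be
left out of the link step, which matches the CPU-only behavior described.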

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8[1m] [Claude Code]

2026-07-03 20:28:11 +00:00

cpp

chore(acestep-cpp): bump pin to ed53caf and adapt wrapper to new API (#9908)

2026-05-20 21:05:32 +00:00

acestepcpp_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

CMakeLists.txt

fix(backends): enable ROCm/HIP GPU offload for ggml audio backends (#10666)

2026-07-03 20:28:11 +00:00

goacestepcpp.go

feat: add distributed mode (#9124)