ggml emits its loadable backends (per-microarch CPU variants, metal, blas) with a
.so suffix even on darwin, while the core libraries (ggml-base/ggml/llama/
llama-common/mtmd) use .dylib. Split the distribution by suffix: .so DL backends
go in the package root for ggml's executable-directory scan, .dylib core libs go
in lib/ for DYLD_LIBRARY_PATH. The previous .dylib name-pattern matched none of the
variants.
Verified on an M4: ggml loads the apple_m4 CPU variant (SME=1) and Metal, model
loads and generates correct tokens.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
- arm64: ggml CPU_ALL_VARIANTS builds armv9.2 SME variants whose -march=...+sme
is rejected by the Ubuntu 24.04 default gcc-13. Build the arm64 variants with
gcc-14 (installed in the compile step). The host only selects a variant it
actually supports at runtime, but every variant must still compile.
- darwin: scripts/build/llama-cpp-darwin.sh builds llama-cpp-cpu-all instead of
the fallback binary, keeps Metal (GGML_METAL stays ON; --target ggml also builds
ggml-metal). The per-microarch libggml-cpu-*.dylib are placed in the package
root next to the binary (darwin has no bundled ld.so, so ggml's executable-dir
scan looks there), while the other shared dylibs go in lib/ for DYLD_LIBRARY_PATH.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
* chore: allow to install with pip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Make the backend to build and actually work
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* List models from system only
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add script to build darwin python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Run protogen in libbackend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Detect if mps is available across python backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* CI: try to build backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Debug CI
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Index mlx-vlm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Remove mlx-vlm
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop CI test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>