fix(llama-cpp-darwin): distribute ggml backends by suffix (.so root, .dylib lib)

ggml emits its loadable backends (per-microarch CPU variants, metal, blas) with a
.so suffix even on darwin, while the core libraries (ggml-base/ggml/llama/
llama-common/mtmd) use .dylib. Split the distribution by suffix: the .so loadable
backends go in the package root, where ggml's executable-directory scan finds them,
and the .dylib core libs go in lib/, resolved via DYLD_LIBRARY_PATH. The previous
libggml-cpu-*.dylib name pattern matched none of the CPU variants.
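
For illustration, the suffix rule can be sketched as a small shell function. This
is only a sketch: dest_for is a hypothetical helper, not part of the packaging
script, and the file names are illustrative guesses at what a build might emit.

```shell
#!/bin/sh
# Sketch only: dest_for is a hypothetical helper, not part of the packaging script.
# It prints where one library file would land relative to build/darwin/.
dest_for() {
  case "$1" in
    *.so)    echo "root" ;;  # loadable ggml backend: package root, next to the binary
    *.dylib) echo "lib" ;;   # core library: lib/, resolved via DYLD_LIBRARY_PATH
    *)       echo "skip" ;;  # anything else is not distributed by this step
  esac
}

# Illustrative file names (assumptions); real names depend on the build.
for f in libggml-cpu-apple_m4.so libggml-metal.so libggml-base.dylib libllama.dylib; do
  printf '%s -> %s\n' "$f" "$(dest_for "$f")"
done
```

Matching on the suffix alone means a new backend or core library is placed
correctly without listing its name.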

Verified on an M4: ggml loads the apple_m4 CPU variant (SME=1) and Metal, and the
model loads and generates correct tokens.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Ettore Di Giacinto
2026-06-24 21:59:29 +00:00
parent 3b47122e54
commit 4e9bb4f879

@@ -24,17 +24,19 @@ cp -rf backend/cpp/llama-cpp/llama-cpp-cpu-all build/darwin/
cp -rf backend/cpp/llama-cpp/llama-cpp-grpc build/darwin/
cp -rf backend/cpp/llama-cpp/llama-cpp-rpc-server build/darwin/
-# Distribute the shared ggml/llama dylibs from the CPU_ALL_VARIANTS build. Unlike the old
-# fully-static fallback build, these are real dylibs with @rpath install names, so the
-# otool loop below (which only copies deps that exist on disk) will not pick them up.
-# - the per-microarch libggml-cpu-*.dylib go in the package ROOT, next to the binary,
-# because on darwin run.sh execs the binary directly (no bundled ld.so) and ggml
-# discovers CPU backends by scanning the executable's own directory.
-# - everything else (libggml-base/libggml/libllama/libmtmd/libggml-metal/...) goes in
-# lib/, resolved at load time via the DYLD_LIBRARY_PATH=lib that run.sh exports.
+# Distribute the shared ggml/llama libraries from the CPU_ALL_VARIANTS build. Unlike the
+# old fully-static fallback build, these have @rpath install names, so the otool loop below
+# (which only copies deps that exist on disk) will not pick them up. The split is by suffix:
+# - ggml emits its loadable backends (per-microarch CPU variants, metal, blas) with a .so
+# suffix EVEN ON DARWIN. These go in the package ROOT next to the binary, because darwin
+# run.sh execs the binary directly (no bundled ld.so) so ggml's executable-directory
+# scan looks there.
+# - the core libraries (libggml-base/libggml/libllama/libllama-common/libmtmd) use the
+# platform .dylib suffix and are NEEDED deps; they go in lib/, resolved at load time via
+# the DYLD_LIBRARY_PATH=lib that run.sh exports. -a preserves the version symlinks.
SHLIBS=backend/cpp/llama-cpp/ggml-shared-libs
-cp -rfv $SHLIBS/libggml-cpu-*.dylib build/darwin/
-find $SHLIBS -name '*.dylib' ! -name 'libggml-cpu-*.dylib' -exec cp -rfv {} build/darwin/lib/ \;
+cp -a $SHLIBS/*.so build/darwin/
+cp -a $SHLIBS/*.dylib build/darwin/lib/
# Set default additional libs only for Darwin on M chips (arm64)
if [[ "$(uname -s)" == "Darwin" && "$(uname -m)" == "arm64" ]]; then