Mirror of https://github.com/mudler/LocalAI.git, synced 2026-06-27 01:47:18 -04:00
Compare commits
58 Commits
| SHA1 |
|---|
| 14b29ebf4e |
| f0d0bff232 |
| 64150ca7ab |
| f98b0f1c1e |
| 2c96c2d08e |
| f01a969f7b |
| 56600eec3e |
| c4fa256cdf |
| 17c1fc74b2 |
| 068d397acf |
| 5b3572f8b8 |
| 6afe127cd4 |
| f58dcefed4 |
| 11b062f8f4 |
| 114eeaae81 |
| d388f874de |
| 86677495a2 |
| 253aedff06 |
| 74f07ecc35 |
| ae0da454a7 |
| 179210b970 |
| 6c03e46390 |
| f2ed63e39a |
| 286c508ce0 |
| d1a9d59917 |
| f72046b5b5 |
| 79783120dd |
| 4ac67d255d |
| 3a87d9e48f |
| 693e3eec05 |
| f1e5071321 |
| 93d6255de3 |
| fe4f425fb5 |
| fae9f6356f |
| 066abf82c0 |
| a7fec9a49d |
| c678530cf0 |
| 3c63431e46 |
| 3f647a2764 |
| f88981cdce |
| 0d6de15ae9 |
| 5c3d48ab50 |
| 764b0352b9 |
| 75ba2daba1 |
| 62b14fd635 |
| 193d0e6aef |
| 482314c623 |
| e8ae88a2a0 |
| e1994579f8 |
| e5620989dd |
| fc618dcee6 |
| e6042080c0 |
| 0f3b24436d |
| 4b6f911835 |
| a5e28942a6 |
| dba9cd7ca4 |
| c93190de50 |
| 4dbf69f889 |
@@ -102,6 +102,24 @@ Multi-arch backends are NOT a single matrix entry with `platforms: 'linux/amd64,
 
 Entries whose `dockerfile` is `./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` must also set a `builder-base-image` field pointing at a prebuilt base from `quay.io/go-skynet/ci-cache:base-grpc-*` (CI builds these via `.github/workflows/base-images.yml`). The mapping is by `(build-type, platforms)` — see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~25–35 min cold). Local `make backends/<name>` ignores `builder-base-image` and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.
 
+### Cover every OS the project supports (Linux **and** Darwin)
+
+`.github/backend-matrix.yml` has two matrices, and they are the source of truth for which OS a backend ships on:
+
+- `include:` — the **Linux** matrix (x86_64 + arm64; CPU and CUDA / ROCm / SYCL / Vulkan).
+- `includeDarwin:` — the **macOS / Apple Silicon** matrix (arm64; Metal where the engine supports it, otherwise a native arm64 CPU build).
+
+**A new backend must target every OS it can build for — do not ship Linux-only by default.** A backend that appears only under `include:` is silently unavailable on macOS even when its code would run there. Most C/C++/GGML engines build on Darwin out of the box (ggml defaults `GGML_METAL=ON` on Apple, so a plain build is Metal-enabled), and many Python backends do too (CPU / MPS wheels). If a backend genuinely cannot support an OS (e.g. CUDA-only, no CPU variant), state that in the PR description instead of omitting it silently.
+
+Wiring a backend into `includeDarwin:` is more than the matrix entry:
+
+1. **`includeDarwin:` entry** — `tag-suffix: "-metal-darwin-arm64-<backend>"`, `build-type: "metal"`, `lang: "go"` for go+ggml backends; omit `build-type` for the bespoke C++ ones (llama-cpp / ds4 / privacy-filter). Match an existing entry of the same shape.
+2. **`backend/index.yaml`** — add `metal:` to the backend's `capabilities` map (main and `-development`) and concrete `metal-<backend>` / `metal-<backend>-development` image entries pointing at the `-metal-darwin-arm64-<backend>` images.
+3. **C/C++ backends only** — add an `inferBackendPathDarwin` case in `scripts/changed-backends.js` returning `backend/cpp/<backend>/` (the generic fallthrough assumes `backend/<lang>/`, which is wrong for a C++ source tree driven with `lang: go`), and give `run.sh` a Darwin branch that exports `DYLD_LIBRARY_PATH` instead of `LD_LIBRARY_PATH`. If the build is bespoke (single `grpc-server` + dylib bundling), model it on `scripts/build/ds4-darwin.sh` and add a `backends/<backend>-darwin` make target plus a gated step in `.github/workflows/backend_build_darwin.yml`.
+4. **C++ proto gotcha** — if the backend compiles the generated gRPC/protobuf in a separate CMake target (e.g. `hw_grpc_proto`), that target must link `protobuf::libprotobuf` + `gRPC::grpc++` so the Homebrew include dirs propagate; otherwise macOS fails with `google/protobuf/runtime_version.h not found` (Linux hides this because apt headers sit in `/usr/include`).
+
+The CI path filter only builds a backend on a PR when a file under its directory changes, so a darwin-only YAML edit builds nothing — touch a file under `backend/<lang>/<backend>/` (a one-line comment is enough) in the same PR.
+
 ## 3. Add Backend Metadata to `backend/index.yaml`
 
 **Step 3a: Add Meta Definition**
@@ -225,6 +243,7 @@ After adding a new backend, verify:
 
 - [ ] Backend directory structure is complete with all necessary files
 - [ ] Build configurations added to `.github/backend-matrix.yml` for all desired platforms (per-arch entries with `platform-tag` for multi-arch; `builder-base-image` for llama-cpp / ik-llama-cpp / turboquant)
+- [ ] **OS coverage considered**: added to `includeDarwin:` (macOS/Apple Silicon) if the backend can build there — with the `backend/index.yaml` `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD branch and `inferBackendPathDarwin` case for C++ backends — or the PR explains why an OS is unsupported. Do not ship Linux-only by default.
 - [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
 - [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
 - [ ] Tag suffixes match between workflow file and index.yaml
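Step 3 of the Darwin checklist above asks for a `run.sh` branch that exports `DYLD_LIBRARY_PATH` instead of `LD_LIBRARY_PATH`. A minimal sketch of what such a branch could look like, assuming the bundled shared libraries sit in a `lib/` directory; the function names `lib_path_var` and `export_lib_path` and the `lib/` layout are illustrative assumptions, not taken from the repository:

```shell
#!/bin/sh
# Sketch only: lib_path_var, export_lib_path and the lib/ layout are assumptions.

# Print the dynamic-loader search-path variable for a kernel name
# (the value `uname -s` reports: "Darwin" on macOS, "Linux" on Linux).
lib_path_var() {
    case "$1" in
        Darwin) echo "DYLD_LIBRARY_PATH" ;;
        *)      echo "LD_LIBRARY_PATH" ;;
    esac
}

# Prepend a directory to the right variable for this host,
# keeping whatever value the variable already had.
export_lib_path() {
    libdir="$1"
    var="$(lib_path_var "$(uname -s)")"
    eval "current=\${$var:-}"
    export "$var=$libdir${current:+:$current}"
}

# Hypothetical usage: libraries bundled in ./lib of the current directory.
export_lib_path "$PWD/lib"
```

Keeping the variable choice in one small function means the rest of `run.sh` stays identical on both operating systems.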
@@ -17,19 +17,29 @@ if [[ -n "${CUDA_DOCKER_ARCH:-}" ]]; then
     rm -rf /LocalAI/backend/cpp/llama-cpp-*-build
 fi
 
-if [ "${TARGETARCH}" = "arm64" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then
-    cd /LocalAI/backend/cpp/llama-cpp
-    make llama-cpp-fallback
-    make llama-cpp-grpc
-    make llama-cpp-rpc-server
+cd /LocalAI/backend/cpp/llama-cpp
+if [ -z "${BUILD_TYPE:-}" ]; then
+    # Pure CPU image (BUILD_TYPE empty): one build with ggml CPU_ALL_VARIANTS replaces the
+    # per-microarch binaries (x86: avx/avx2/avx512/fallback; arm64: armv8.x/armv9.x). ggml
+    # dlopens the best libggml-cpu-*.so at runtime by probing host CPU features.
+    #
+    # arm64: the CPU_ALL_VARIANTS table includes armv9.2 SME variants whose -march=...+sme is
+    # rejected by the Ubuntu 24.04 default gcc-13. gcc-14 accepts it, so build the arm64
+    # variants with it (the host never *selects* SME unless it has it, but every variant must
+    # still compile).
+    if [ "${TARGETARCH}" = "arm64" ]; then
+        apt-get update -qq && apt-get install -y -qq gcc-14 g++-14
+        export CC=gcc-14 CXX=g++-14
+    fi
+    make llama-cpp-cpu-all
 else
-    cd /LocalAI/backend/cpp/llama-cpp
-    make llama-cpp-avx
-    make llama-cpp-avx2
-    make llama-cpp-avx512
+    # GPU build (cublas/hipblas/sycl/vulkan/...): the accelerator does the compute, so a
+    # single fallback CPU build is enough - no per-microarch CPU variants needed. (This also
+    # keeps the heavy GPU backend compile from also building the whole CPU variant matrix,
+    # and avoids the gcc-14 apt step on GPU base images such as nvidia l4t.)
     make llama-cpp-fallback
-    make llama-cpp-grpc
-    make llama-cpp-rpc-server
 fi
+make llama-cpp-grpc
+make llama-cpp-rpc-server
 
 ccache -s || true
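The new branch structure in the hunk above can be read as a small decision table over `BUILD_TYPE` and `TARGETARCH`. The sketch below is a dry run that only prints what the script would do; `plan_targets` is a hypothetical helper written for this illustration, not part of the build script:

```shell
#!/bin/sh
# Dry-run sketch of the build script's branching. plan_targets is a
# hypothetical helper: it prints the steps instead of running them.
plan_targets() {
    build_type="$1"   # empty means a pure CPU image
    arch="$2"         # e.g. amd64 or arm64
    if [ -z "$build_type" ]; then
        # CPU image: a single CPU_ALL_VARIANTS build; arm64 needs gcc-14 first.
        if [ "$arch" = "arm64" ]; then
            echo "toolchain: gcc-14"
        fi
        echo "make llama-cpp-cpu-all"
    else
        # GPU image: one fallback CPU build is enough.
        echo "make llama-cpp-fallback"
    fi
    # Built in every case, after the branch.
    echo "make llama-cpp-grpc"
    echo "make llama-cpp-rpc-server"
}

plan_targets "" arm64
plan_targets cublas amd64
```

Note that the old condition keyed on `arm64` or `hipblas`, while the new one keys only on whether `BUILD_TYPE` is empty, so an arm64 GPU build now takes the fallback path without the gcc-14 step.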
@@ -19,17 +19,21 @@ fi
 
 cd /LocalAI/backend/cpp/turboquant
 
-if [ "${TARGETARCH}" = "arm64" ] || [ "${BUILD_TYPE}" = "hipblas" ]; then
-    make turboquant-fallback
-    make turboquant-grpc
-    make turboquant-rpc-server
+if [ -z "${BUILD_TYPE:-}" ]; then
+    # Pure CPU image: one ggml CPU_ALL_VARIANTS build replaces the per-microarch binaries.
+    # arm64: the armv9.2 SME variants need gcc-14 (gcc-13 rejects +sme).
+    if [ "${TARGETARCH}" = "arm64" ]; then
+        apt-get update -qq && apt-get install -y -qq gcc-14 g++-14
+        export CC=gcc-14 CXX=g++-14
+    fi
+    make turboquant-cpu-all
 else
-    make turboquant-avx
-    make turboquant-avx2
-    make turboquant-avx512
+    # GPU build (cublas/hipblas/sycl/vulkan/...): single fallback CPU build, the accelerator
+    # does the compute. Keeps the GPU compile from also building the CPU variant matrix and
+    # avoids the gcc-14 apt step on GPU base images such as nvidia l4t.
     make turboquant-fallback
-    make turboquant-grpc
-    make turboquant-rpc-server
 fi
+make turboquant-grpc
+make turboquant-rpc-server
 
 ccache -s || true
.github/backend-matrix.yml (vendored, 70 changed lines)
@@ -2,6 +2,28 @@
 # Matrix data for backend container image builds.
 # Consumed by scripts/changed-backends.js for both backend.yml and backend_pr.yml.
 # This file is NOT a workflow — it has no top-level 'on:' or 'jobs:'.
+#
+# OS / platform coverage — READ THIS WHEN ADDING A BACKEND
+# --------------------------------------------------------
+# This file is the source of truth for which OS each backend is built and
+# published for. A backend ships ONLY for the matrices it appears in:
+#   - Linux -> the `include:` matrix below (x86_64 + arm64; CPU and
+#     CUDA / ROCm / SYCL / Vulkan variants).
+#   - macOS -> the `includeDarwin:` matrix (Apple Silicon / arm64; Metal where
+#     the engine supports it, otherwise a native arm64 CPU build).
+#
+# New backends must target EVERY OS they can build for, not just Linux. A backend
+# listed only under `include:` is silently unavailable on macOS even when its code
+# would run there. Most C/C++/GGML engines build on Darwin (ggml defaults
+# GGML_METAL=ON on Apple, so a plain build is Metal-enabled), and many Python
+# backends do too (CPU / MPS). If a backend genuinely cannot support an OS, say so
+# in its PR description rather than silently omitting it.
+#
+# Adding a backend to `includeDarwin:` is more than one line — see the darwin
+# checklist in .agents/adding-backends.md (includeDarwin entry, the index.yaml
+# `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD
+# branch for C/C++ backends, and the inferBackendPathDarwin case in
+# scripts/changed-backends.js so the path filter actually builds it).
 
 # Linux matrix (consumed by backend-jobs).
 include:
@@ -4922,6 +4944,37 @@ includeDarwin:
     tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
     build-type: "metal"
     lang: "go"
+  # Vision/utility C++/ggml backends (go+cgo). Their Makefiles already carry a
+  # Darwin/Metal path (GGML_METAL=ON when build-type=metal); this just builds and
+  # publishes the metal image so Apple Silicon can install them.
+  - backend: "depth-anything-cpp"
+    tag-suffix: "-metal-darwin-arm64-depth-anything-cpp"
+    build-type: "metal"
+    lang: "go"
+  - backend: "locate-anything-cpp"
+    tag-suffix: "-metal-darwin-arm64-locate-anything-cpp"
+    build-type: "metal"
+    lang: "go"
+  - backend: "rfdetr-cpp"
+    tag-suffix: "-metal-darwin-arm64-rfdetr-cpp"
+    build-type: "metal"
+    lang: "go"
+  - backend: "sam3-cpp"
+    tag-suffix: "-metal-darwin-arm64-sam3-cpp"
+    build-type: "metal"
+    lang: "go"
+  # privacy-filter (PII/NER) is a C++/ggml backend built by a bespoke darwin
+  # script (make backends/privacy-filter-darwin); ggml defaults Metal ON on Apple
+  # so the build is Metal-enabled. lang=go drives runner/toolchain selection only.
+  - backend: "privacy-filter"
+    tag-suffix: "-metal-darwin-arm64-privacy-filter"
+    lang: "go"
+  # LocalVQE has no Metal path; on Apple Silicon it builds CPU-only (GGML_METAL
+  # OFF) but is still a native arm64 image. Uses the darwin/metal build profile.
+  - backend: "localvqe"
+    tag-suffix: "-metal-darwin-arm64-localvqe"
+    build-type: "metal"
+    lang: "go"
   - backend: "voxtral"
     tag-suffix: "-metal-darwin-arm64-voxtral"
     build-type: "metal"
@@ -4974,6 +5027,19 @@ includeDarwin:
   - backend: "kitten-tts"
     tag-suffix: "-metal-darwin-arm64-kitten-tts"
     build-type: "mps"
+  # vLLM on Apple Silicon via vllm-metal (MLX). The install is custom
+  # (backend/python/vllm/install.sh has a darwin branch); lang stays python so
+  # backend_build_darwin.yml drives it through build-darwin-python-backend ->
+  # scripts/build/python-darwin.sh, which runs the backend's install.sh.
+  - backend: "vllm"
+    tag-suffix: "-metal-darwin-arm64-vllm"
+    build-type: "mps"
+  - backend: "trl"
+    tag-suffix: "-metal-darwin-arm64-trl"
+    build-type: "mps"
+  - backend: "liquid-audio"
+    tag-suffix: "-metal-darwin-arm64-liquid-audio"
+    build-type: "mps"
   - backend: "piper"
     tag-suffix: "-metal-darwin-arm64-piper"
     build-type: "metal"
@@ -4990,6 +5056,10 @@ includeDarwin:
     tag-suffix: "-metal-darwin-arm64-sherpa-onnx"
     build-type: "metal"
     lang: "go"
+  - backend: "supertonic"
+    tag-suffix: "-metal-darwin-arm64-supertonic"
+    build-type: "metal"
+    lang: "go"
   - backend: "local-store"
     tag-suffix: "-metal-darwin-arm64-local-store"
     build-type: "metal"
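Since a backend listed only under `include:` is silently missing on macOS, a quick local check for that condition may be useful when reviewing changes to this file. The following is a rough sketch, not an existing repository script: `linux_only_backends` is a hypothetical helper, and it assumes the file keeps the shape shown in the hunks above (top-level `include:` and `includeDarwin:` keys with `- backend: "<name>"` entries).

```shell
#!/bin/sh
# Hypothetical helper: print backends that appear under `include:` but not
# under `includeDarwin:`. A line-based awk scan, not a real YAML parser.
linux_only_backends() {
    awk '
        /^include:/        { section = "linux";  next }
        /^includeDarwin:/  { section = "darwin"; next }
        /^[^ \t#]/         { section = "";       next }  # any other top-level key
        /^[ \t]*-[ \t]*backend:/ {
            name = $0
            sub(/^[^:]*:[ \t]*/, "", name)   # drop everything up to "backend:"
            gsub(/"/, "", name)              # strip the quotes
            sub(/[ \t]+$/, "", name)         # strip trailing whitespace
            if (section == "linux")  linux[name] = 1
            if (section == "darwin") darwin[name] = 1
        }
        END { for (b in linux) if (!(b in darwin)) print b }
    ' "$1" | sort
}

# Usage: only runs when the matrix file is present in the working tree.
if [ -f .github/backend-matrix.yml ]; then
    linux_only_backends .github/backend-matrix.yml
fi
```

Anything it prints is a candidate for either an `includeDarwin:` entry or a note in the PR description explaining why macOS is unsupported.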
.github/bump_vllm_metal.sh (vendored, executable file, 55 changed lines)
@@ -0,0 +1,55 @@
+#!/bin/bash
+# Bump the single vllm-metal pin (VLLM_METAL_VERSION) in the vLLM backend's
+# darwin (Apple Silicon) install path. The macOS/Metal build
+# (backend/python/vllm/install.sh, Darwin branch) installs vllm-metal, which is
+# version-locked to a specific vLLM source release. install.sh derives that vLLM
+# version at build time from vllm-metal's own installer (`vllm_v=`) at the pinned
+# tag, so there is only ONE value to bump here -- mirroring bump_vllm_wheel.sh,
+# which bumps the Linux cu130 wheel pin.
+#
+# This deliberately tracks vllm-project/vllm-metal, NOT vllm-project/vllm: the
+# darwin build can only use the exact vLLM version vllm-metal supports, so it may
+# lag the Linux pin (requirements-cublas13-after.txt) until vllm-metal catches up.
+set -xe
+REPO=$1 # vllm-project/vllm-metal
+FILE=$2 # backend/python/vllm/install.sh
+VAR=$3 # VLLM_METAL_VERSION (used for the workflow's output file names)
+
+if [ -z "$FILE" ] || [ -z "$REPO" ] || [ -z "$VAR" ]; then
+    echo "usage: $0 <repo> <install-file> <var-name>" >&2
+    exit 1
+fi
+
+# vllm-metal ships frequent dev releases, all flagged as non-prerelease, so
+# /releases/latest returns the newest one (with its cp312 wheel asset).
+LATEST_TAG=$(curl -sS -H "Accept: application/vnd.github+json" \
+    "https://api.github.com/repos/$REPO/releases/latest" \
+    | python3 -c "import json,sys; print(json.load(sys.stdin)['tag_name'])")
+
+# The coupled vLLM source version lives in vllm-metal's installer at that tag.
+NEW_VLLM_VERSION=$(curl -fsSL \
+    "https://raw.githubusercontent.com/$REPO/$LATEST_TAG/install.sh" \
+    | grep -oE 'vllm_v="[0-9]+\.[0-9]+\.[0-9]+"' | head -1 | cut -d'"' -f2)
+
+if [ -z "$LATEST_TAG" ] || [ -z "$NEW_VLLM_VERSION" ]; then
+    echo "Could not resolve vllm-metal tag ($LATEST_TAG) or its vllm_v ($NEW_VLLM_VERSION)." >&2
+    exit 1
+fi
+
+set +e
+CURRENT_TAG=$(grep -oE 'VLLM_METAL_VERSION="[^"]*"' "$FILE" | head -1 | cut -d'"' -f2)
+set -e
+
+# Rewrite the single pin. install.sh derives VLLM_VERSION from this tag at build
+# time, so there is nothing else to touch. peter-evans/create-pull-request opens
+# no PR on a clean tree, so a no-op rewrite (already current) is safe.
+sed -i "$FILE" \
+    -e "s|VLLM_METAL_VERSION=\"[^\"]*\"|VLLM_METAL_VERSION=\"$LATEST_TAG\"|"
+
+if [ -z "$CURRENT_TAG" ]; then
+    echo "Could not find VLLM_METAL_VERSION=\"...\" in $FILE." >&2
+    exit 0
+fi
+
+echo "vllm-metal ${CURRENT_TAG} -> ${LATEST_TAG} (builds vLLM ${NEW_VLLM_VERSION}): https://github.com/$REPO/releases/tag/${LATEST_TAG}" >> "${VAR}_message.txt"
+echo "${LATEST_TAG}" >> "${VAR}_commit.txt"
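The two text transformations in the script above (pulling `vllm_v` out of an installer and rewriting the `VLLM_METAL_VERSION` pin) can be tried offline. The sketch below wraps the same `grep`/`cut` and `sed` expressions in two hypothetical functions that read standard input; the function names and all version strings are invented for illustration:

```shell
#!/bin/sh
# Offline sketch of the script's two text transformations.
# extract_vllm_version and rewrite_pin are hypothetical wrappers;
# the sample version strings below are made up.

# Print the first three-part numeric vllm_v="X.Y.Z" value found on stdin.
extract_vllm_version() {
    grep -oE 'vllm_v="[0-9]+\.[0-9]+\.[0-9]+"' | head -1 | cut -d'"' -f2
}

# Replace the quoted value of VLLM_METAL_VERSION on stdin with $1.
rewrite_pin() {
    sed -e "s|VLLM_METAL_VERSION=\"[^\"]*\"|VLLM_METAL_VERSION=\"$1\"|"
}

printf 'set -e\nvllm_v="1.2.3"\n' | extract_vllm_version        # prints 1.2.3
printf 'VLLM_METAL_VERSION="v0.0.1"\n' | rewrite_pin "v0.0.2"   # prints VLLM_METAL_VERSION="v0.0.2"
```

One consequence of the pattern is worth knowing: it only matches versions made of exactly three numeric parts followed by the closing quote, so a value with a suffix (for example a release-candidate tag) would produce an empty result and trip the script's `exit 1` guard.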
.github/workflows/backend_build_darwin.yml (vendored, 22 changed lines)
@@ -99,6 +99,7 @@ jobs:
             /opt/homebrew/Cellar/xxhash
             /opt/homebrew/Cellar/zstd
             /opt/homebrew/Cellar/nlohmann-json
+            /opt/homebrew/Cellar/opus
           key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
 
       - name: Dependencies
@@ -113,7 +114,12 @@ jobs:
           # nlohmann-json is header-only and required by the ds4 backend
           # (dsml_renderer.cpp includes <nlohmann/json.hpp>); on Linux it comes
           # from the apt-installed nlohmann-json3-dev in the build image.
-          brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json
+          # opus + pkg-config are required by the opus go backend: its
+          # Makefile/package.sh call `pkg-config --cflags/--libs opus` to build
+          # libopusshim.dylib and to locate libopus.dylib for bundling. brew's
+          # pkg-config defaults its search path to the Homebrew prefix so the
+          # opus.pc is found.
+          brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config
           # Force-reinstall ccache so brew re-validates its full runtime-dep
           # closure on every run. This is the durable fix: when the upstream
           # ccache formula gains a new transitive dep (as it has multiple times
@@ -132,7 +138,7 @@ jobs:
           # and decides "already installed" without re-linking, so on a cache-
           # hit run the formulas aren't on PATH. Force-link them; --overwrite
           # tolerates pre-existing symlinks from earlier installs.
-          brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json 2>/dev/null || true
+          brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config 2>/dev/null || true
 
       - name: Save Homebrew cache
         if: github.event_name != 'pull_request' && steps.brew-cache.outputs.cache-hit != 'true'
@@ -153,6 +159,7 @@ jobs:
             /opt/homebrew/Cellar/xxhash
             /opt/homebrew/Cellar/zstd
             /opt/homebrew/Cellar/nlohmann-json
+            /opt/homebrew/Cellar/opus
           key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
 
       # ---- ccache for llama.cpp CMake builds ----
@@ -228,8 +235,17 @@ jobs:
         run: |
           make backends/ds4-darwin
 
+      # privacy-filter is a C++/ggml backend like ds4 - a single grpc-server with
+      # otool dylib bundling - so it gets its own bespoke darwin script rather than
+      # the generic build-darwin-go-backend path.
+      - name: Build privacy-filter backend (Darwin Metal)
+        if: inputs.backend == 'privacy-filter'
+        run: |
+          make protogen-go
+          make backends/privacy-filter-darwin
+
       - name: Build ${{ inputs.backend }}-darwin
-        if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4'
+        if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4' && inputs.backend != 'privacy-filter'
         run: |
           make protogen-go
           BACKEND=${{ inputs.backend }} BUILD_TYPE=${{ inputs.build-type }} USE_PIP=${{ inputs.use-pip }} make build-darwin-${{ inputs.lang }}-backend
.github/workflows/bump_deps.yaml (vendored, 36 changed lines)
@@ -154,3 +154,39 @@ jobs:
           branch: "update/VLLM_VERSION"
           body: ${{ steps.bump.outputs.message }}
           signoff: true
+
+  bump-vllm-metal:
+    # The darwin (Apple Silicon) vLLM build installs vllm-metal, which is locked
+    # to a specific vLLM source release. install.sh pins both VLLM_METAL_VERSION
+    # (the wheel release) and VLLM_VERSION (the vLLM it builds against); this job
+    # tracks vllm-project/vllm-metal and rewrites both atomically. Separate from
+    # bump-vllm-wheel because darwin follows vllm-metal, not vllm/vllm latest.
+    if: github.repository == 'mudler/LocalAI'
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v7
+      - name: Bump vllm-metal pin 🔧
+        id: bump
+        run: |
+          bash .github/bump_vllm_metal.sh vllm-project/vllm-metal backend/python/vllm/install.sh VLLM_METAL_VERSION
+          {
+            echo 'message<<EOF'
+            cat "VLLM_METAL_VERSION_message.txt"
+            echo EOF
+          } >> "$GITHUB_OUTPUT"
+          {
+            echo 'commit<<EOF'
+            cat "VLLM_METAL_VERSION_commit.txt"
+            echo EOF
+          } >> "$GITHUB_OUTPUT"
+          rm -rfv VLLM_METAL_VERSION_message.txt VLLM_METAL_VERSION_commit.txt
+      - name: Create Pull Request
+        uses: peter-evans/create-pull-request@v8
+        with:
+          token: ${{ secrets.UPDATE_BOT_TOKEN }}
+          push-to-fork: ci-forks/LocalAI
+          commit-message: ':arrow_up: Update vllm-project/vllm-metal (darwin)'
+          title: 'chore: :arrow_up: Update vllm-metal (darwin) to `${{ steps.bump.outputs.commit }}`'
+          branch: "update/VLLM_METAL_VERSION"
+          body: ${{ steps.bump.outputs.message }}
+          signoff: true
.github/workflows/release.yaml (vendored, 21 changed lines)
@@ -24,6 +24,11 @@ jobs:
           args: release --clean
         env:
           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+          MACOS_SIGN_P12: ${{ secrets.MACOS_CERTIFICATE }}
+          MACOS_SIGN_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
+          MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
+          MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
+          MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
   launcher-build-darwin:
     runs-on: macos-latest
     steps:
@@ -35,9 +40,19 @@ jobs:
         uses: actions/setup-go@v5
         with:
           go-version: 1.23
-      - name: Build launcher for macOS ARM64
-        run: |
-          make build-launcher-darwin
+      - name: Import signing certificate
+        env:
+          MACOS_CERTIFICATE: ${{ secrets.MACOS_CERTIFICATE }}
+          MACOS_CERTIFICATE_PWD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
+          MACOS_CI_KEYCHAIN_PWD: ${{ secrets.MACOS_CI_KEYCHAIN_PWD }}
+        run: bash contrib/macos/sign-and-notarize.sh import-cert
+      - name: Build, sign and notarize the DMG
+        env:
+          MACOS_SIGN_IDENTITY: ${{ secrets.MACOS_SIGN_IDENTITY }}
+          MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
+          MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
+          MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
+        run: make release-launcher-darwin
       - name: Upload DMG to Release
         uses: softprops/action-gh-release@v3
         with:
.github/workflows/test.yml (vendored, 16 changed lines)
@@ -121,3 +121,19 @@ jobs:
           detached: true
           connect-timeout-seconds: 180
           limit-access-to-actor: true
+
+  # Fast standalone unit tests for the backends' pure C++ helpers - currently the
+  # llama-cpp message reconstruction (backend/cpp/llama-cpp/message_content.h),
+  # which guards the OpenAI chat content normalization (mudler/LocalAI#10524,
+  # #7324, #7528). The runner discovers every *_test.cpp under backend/cpp/, so
+  # new pure-C++ unit tests are picked up with no CI changes. These need only the
+  # C++ stdlib + nlohmann/json, so they run on every PR without the full
+  # llama.cpp + gRPC backend build. (The same suite is also wired as an opt-in
+  # CMake/ctest target, -DLLAMA_GRPC_BUILD_TESTS=ON, for in-backend-build runs.)
+  tests-backend-cpp:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Clone
+        uses: actions/checkout@v7
+      - name: Run backend C++ unit tests
+        run: make test-backend-cpp
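The job comment above says the runner discovers every `*_test.cpp` under `backend/cpp/`. The implementation of `make test-backend-cpp` is not part of this diff, so the following is only a guess at how such discovery could work; `discover_cpp_tests` is a hypothetical helper, and the compile-and-run loop a real runner would need is deliberately left out:

```shell
#!/bin/sh
# Hypothetical sketch of test discovery; not the repository's runner.
# Prints every *_test.cpp below the given root, one path per line, sorted.
discover_cpp_tests() {
    find "$1" -type f -name '*_test.cpp' | sort
}

# Usage: only runs when the source tree is present.
if [ -d backend/cpp ]; then
    discover_cpp_tests backend/cpp
fi
```

Discovery by filename pattern is what lets a new unit test be picked up without touching the workflow: dropping a file that ends in `_test.cpp` anywhere under the root is enough.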
.gitignore (vendored, 3 changed lines)
@@ -94,3 +94,6 @@ core/http/react-ui/test-results/
 
 # SDD / brainstorm scratch (agent-driven development)
 .superpowers/
+
+# Local Apple signing material (never commit)
+.certs/
@@ -9,7 +9,8 @@ source:
   enabled: true
   name_template: '{{ .ProjectName }}-{{ .Tag }}-source'
 builds:
-  - main: ./cmd/local-ai
+  - id: local-ai
+    main: ./cmd/local-ai
     env:
       - CGO_ENABLED=0
     ldflags:
@@ -35,3 +36,19 @@ snapshot:
   version_template: "{{ .Tag }}-next"
 changelog:
   use: github-native
+# Sign + notarize the macOS server binary via the quill backend (runs on Linux,
+# no macOS runner needed). Disabled automatically when MACOS_SIGN_P12 is unset
+# (forks / PRs), so those builds stay unsigned and green.
+notarize:
+  macos:
+    - enabled: '{{ isEnvSet "MACOS_SIGN_P12" }}'
+      ids:
+        - local-ai
+      sign:
+        certificate: "{{.Env.MACOS_SIGN_P12}}"
+        password: "{{.Env.MACOS_SIGN_PASSWORD}}"
+      notarize:
+        issuer_id: "{{.Env.MACOS_NOTARY_ISSUER_ID}}"
+        key_id: "{{.Env.MACOS_NOTARY_KEY_ID}}"
+        key: "{{.Env.MACOS_NOTARY_KEY}}"
+        wait: true
@@ -43,4 +43,5 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
 - **New API endpoints**: LocalAI advertises its capability surface in several independent places — swagger `@Tags`, `/api/instructions` registry, auth `RouteFeatureRegistry`, React UI `capabilities.js`, docs. Read [.agents/api-endpoints-and-auth.md](.agents/api-endpoints-and-auth.md) and follow its checklist — missing any surface means clients, admins, and the UI won't know the endpoint exists.
 - **Admin endpoints → MCP tool**: every admin endpoint that an admin would manage conversationally (install/list/edit/toggle/upgrade) MUST also be exposed as an MCP tool in `pkg/mcp/localaitools/`. The LocalAI Assistant chat modality and the standalone `local-ai mcp-server` consume that package; drift between REST and MCP is a real risk. Read [.agents/localai-assistant-mcp.md](.agents/localai-assistant-mcp.md) — the `TestToolHTTPRouteMappingComplete` test fails until you wire the new tool and update the route map.
 - **Build**: Inspect `Makefile` and `.github/workflows/` — ask the user before running long builds
|
||||||
|
- **Backend OS coverage**: a new backend must target every OS it can build for, not just Linux. `.github/backend-matrix.yml` has two matrices — `include:` (Linux) and `includeDarwin:` (macOS / Apple Silicon). Most C/C++/GGML and many Python backends build on Darwin too — wire the `includeDarwin` entry + `backend/index.yaml` `metal:` entries, or say in the PR why an OS is unsupported. See the darwin checklist in [.agents/adding-backends.md](.agents/adding-backends.md).
|
||||||
- **UI**: The active UI is the React app in `core/http/react-ui/`. The older Alpine.js/HTML UI in `core/http/static/` is pending deprecation — all new UI work goes in the React UI
|
- **UI**: The active UI is the React app in `core/http/react-ui/`. The older Alpine.js/HTML UI in `core/http/static/` is pending deprecation — all new UI work goes in the React UI
|
||||||
|
|||||||
50
Makefile
@@ -1,5 +1,5 @@
|
|||||||
# Disable parallel execution for backend builds
|
# Disable parallel execution for backend builds
|
||||||
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter
|
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter backends/privacy-filter-darwin
|
||||||
|
|
||||||
GOCMD=go
|
GOCMD=go
|
||||||
GOTEST=$(GOCMD) test
|
GOTEST=$(GOCMD) test
|
||||||
@@ -103,7 +103,7 @@ COVERAGE_E2E_LABELS?=!real-models
|
|||||||
COVERAGE_EXCLUDE_RE?=grpc/proto/.*[.]pb[.]go
|
COVERAGE_EXCLUDE_RE?=grpc/proto/.*[.]pb[.]go
|
||||||
|
|
||||||
|
|
||||||
.PHONY: all test test-coverage test-coverage-baseline test-coverage-check test-ui test-ui-coverage-baseline test-ui-coverage-check install-hooks build vendor lint lint-all
|
.PHONY: all test test-coverage test-coverage-baseline test-coverage-check test-backend-cpp test-ui test-ui-coverage-baseline test-ui-coverage-check install-hooks build vendor lint lint-all
|
||||||
|
|
||||||
all: help
|
all: help
|
||||||
|
|
||||||
@@ -201,6 +201,13 @@ test: prepare-test
|
|||||||
OPUS_SHIM_LIBRARY=$(abspath ./pkg/opus/shim/libopusshim.so) \
|
OPUS_SHIM_LIBRARY=$(abspath ./pkg/opus/shim/libopusshim.so) \
|
||||||
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts $(TEST_FLAKES) --fail-fast -v -r $(TEST_PATHS)
|
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts $(TEST_FLAKES) --fail-fast -v -r $(TEST_PATHS)
|
||||||
|
|
||||||
|
## Compiles and runs the standalone C++ unit tests for the backends (pure
|
||||||
|
## helpers that depend only on the stdlib + nlohmann/json, no full backend
|
||||||
|
## build). Discovers every *_test.cpp under backend/cpp/ - see
|
||||||
|
## backend/cpp/run-unit-tests.sh. Set NLOHMANN_INCLUDE to skip the header fetch.
|
||||||
|
test-backend-cpp:
|
||||||
|
bash backend/cpp/run-unit-tests.sh
|
||||||
|
|
||||||
## Runs the core suite ($(TEST_PATHS)) with statement-coverage instrumentation
|
## Runs the core suite ($(TEST_PATHS)) with statement-coverage instrumentation
|
||||||
## and writes a merged profile to $(COVERAGE_PROFILE). Deliberately omits
|
## and writes a merged profile to $(COVERAGE_PROFILE). Deliberately omits
|
||||||
## --fail-fast so a single failure doesn't truncate the coverage number, and
|
## --fail-fast so a single failure doesn't truncate the coverage number, and
|
||||||
@@ -1129,6 +1136,10 @@ backends/ds4-darwin: build
|
|||||||
bash ./scripts/build/ds4-darwin.sh
|
bash ./scripts/build/ds4-darwin.sh
|
||||||
./local-ai backends install "ocifile://$(abspath ./backend-images/ds4.tar)"
|
./local-ai backends install "ocifile://$(abspath ./backend-images/ds4.tar)"
|
||||||
|
|
||||||
|
backends/privacy-filter-darwin: build
|
||||||
|
bash ./scripts/build/privacy-filter-darwin.sh
|
||||||
|
./local-ai backends install "ocifile://$(abspath ./backend-images/privacy-filter.tar)"
|
||||||
|
|
||||||
build-darwin-python-backend: build
|
build-darwin-python-backend: build
|
||||||
bash ./scripts/build/python-darwin.sh
|
bash ./scripts/build/python-darwin.sh
|
||||||
|
|
||||||
@@ -1449,13 +1460,32 @@ docs: docs/static/gallery.html
|
|||||||
########################################################
|
########################################################
|
||||||
|
|
||||||
## fyne cross-platform build
|
## fyne cross-platform build
|
||||||
build-launcher-darwin: build-launcher
|
# Build LocalAI.app from the launcher via fyne (metadata read from cmd/launcher/FyneApp.toml).
|
||||||
go run github.com/tiagomelo/macos-dmg-creator/cmd/createdmg@latest \
|
# Signing happens via contrib/macos/sign-and-notarize.sh, which is a no-op when the signing
|
||||||
--appName "LocalAI" \
|
# secrets are unset, so unsigned local/fork builds keep working.
|
||||||
--appBinaryPath "$(LAUNCHER_BINARY_NAME)" \
|
build-launcher-darwin:
|
||||||
--bundleIdentifier "com.localai.launcher" \
|
rm -rf dist/LocalAI.app cmd/launcher/LocalAI.app
|
||||||
--iconPath "core/http/static/logo.png" \
|
mkdir -p dist
|
||||||
--outputDir "dist/"
|
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os darwin -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)
|
||||||
|
mv cmd/launcher/LocalAI.app dist/LocalAI.app
|
||||||
|
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.app
|
||||||
|
|
||||||
|
# Wrap the (signed) app into a drag-to-Applications DMG via hdiutil, then sign the DMG.
|
||||||
|
dmg-launcher-darwin: build-launcher-darwin
|
||||||
|
rm -rf dist/dmg dist/LocalAI.dmg
|
||||||
|
mkdir -p dist/dmg
|
||||||
|
cp -R dist/LocalAI.app dist/dmg/LocalAI.app
|
||||||
|
ln -s /Applications dist/dmg/Applications
|
||||||
|
hdiutil create -volname "LocalAI" -srcfolder dist/dmg -ov -format UDZO dist/LocalAI.dmg
|
||||||
|
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.dmg
|
||||||
|
|
||||||
|
# Submit the DMG to Apple notarization and staple the ticket (no-op without notary secrets).
|
||||||
|
notarize-launcher-darwin: dmg-launcher-darwin
|
||||||
|
bash contrib/macos/sign-and-notarize.sh notarize dist/LocalAI.dmg
|
||||||
|
|
||||||
|
# Single entrypoint for CI: build -> sign app -> dmg -> sign dmg -> notarize -> staple.
|
||||||
|
release-launcher-darwin: notarize-launcher-darwin
|
||||||
|
@echo "dist/LocalAI.dmg is ready"
|
||||||
|
|
||||||
build-launcher-linux:
|
build-launcher-linux:
|
||||||
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv launcher.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz
|
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv LocalAI.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz
|
||||||
|
|||||||
@@ -1,5 +1,5 @@
|
|||||||
|
|
||||||
IK_LLAMA_VERSION?=6c00e87ac84404af588ad2e65935bd6f079c696f
|
IK_LLAMA_VERSION?=b84902d2ad27c34f989f23947200c4b91b1568fd
|
||||||
LLAMA_REPO?=https://github.com/ikawrakow/ik_llama.cpp
|
LLAMA_REPO?=https://github.com/ikawrakow/ik_llama.cpp
|
||||||
|
|
||||||
CMAKE_ARGS?=
|
CMAKE_ARGS?=
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -13,28 +13,28 @@ grep -e "flags" /proc/cpuinfo | head -1
|
|||||||
# ik_llama.cpp requires AVX2 — default to avx2 binary
|
# ik_llama.cpp requires AVX2 — default to avx2 binary
|
||||||
BINARY=ik-llama-cpp-avx2
|
BINARY=ik-llama-cpp-avx2
|
||||||
|
|
||||||
if [ -e $CURDIR/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if [ -e "$CURDIR"/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 NOT found, using fallback"
|
echo "CPU: AVX2 NOT found, using fallback"
|
||||||
BINARY=ik-llama-cpp-fallback
|
BINARY=ik-llama-cpp-fallback
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Extend ld library path with the dir where this script is located/lib
|
# Extend ld library path with the dir where this script is located/lib
|
||||||
if [ "$(uname)" == "Darwin" ]; then
|
if [ "$(uname)" == "Darwin" ]; then
|
||||||
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
#export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
|
#export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
|
||||||
else
|
else
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using binary: $BINARY"
|
echo "Using binary: $BINARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using binary: $BINARY"
|
echo "Using binary: $BINARY"
|
||||||
exec $CURDIR/$BINARY "$@"
|
exec "$CURDIR"/$BINARY "$@"
|
||||||
|
|
||||||
# We should never reach this point, however just in case we do, run fallback
|
# We should never reach this point, however just in case we do, run fallback
|
||||||
exec $CURDIR/ik-llama-cpp-fallback "$@"
|
exec "$CURDIR"/ik-llama-cpp-fallback "$@"
|
||||||
|
|||||||
@@ -50,8 +50,13 @@ add_custom_command(
|
|||||||
"${hw_proto}"
|
"${hw_proto}"
|
||||||
DEPENDS "${hw_proto}")
|
DEPENDS "${hw_proto}")
|
||||||
|
|
||||||
# hw_grpc_proto
|
# hw_grpc_proto: force STATIC. Under the CPU_ALL_VARIANTS build BUILD_SHARED_LIBS=ON
|
||||||
add_library(hw_grpc_proto
|
# (ggml/llama become shared), which would otherwise make this glue library a DSO. As a
|
||||||
|
# DSO it references the hidden-visibility symbols in the static libprotobuf.a, which the
|
||||||
|
# linker cannot satisfy ("hidden symbol ... in libprotobuf.a is referenced by DSO").
|
||||||
|
# Keeping it STATIC links protobuf/gRPC directly into the grpc-server executable while
|
||||||
|
# only ggml/llama stay shared. No effect on the static variants (already BUILD_SHARED_LIBS=OFF).
|
||||||
|
add_library(hw_grpc_proto STATIC
|
||||||
${hw_grpc_srcs}
|
${hw_grpc_srcs}
|
||||||
${hw_grpc_hdrs}
|
${hw_grpc_hdrs}
|
||||||
${hw_proto_srcs}
|
${hw_proto_srcs}
|
||||||
@@ -82,3 +87,18 @@ target_compile_features(${TARGET} PRIVATE cxx_std_11)
|
|||||||
if(TARGET BUILD_INFO)
|
if(TARGET BUILD_INFO)
|
||||||
add_dependencies(${TARGET} BUILD_INFO)
|
add_dependencies(${TARGET} BUILD_INFO)
|
||||||
endif()
|
endif()
|
||||||
|
|
||||||
|
# Unit test for the message-content normalization helper (message_content.h).
|
||||||
|
# Off by default so the normal backend build is untouched; enable with
|
||||||
|
# -DLLAMA_GRPC_BUILD_TESTS=ON and run via ctest. It reuses llama.cpp's vendored
|
||||||
|
# <nlohmann/json.hpp> (propagated by the common helpers library) so it has no
|
||||||
|
# extra dependency beyond what the backend already builds against.
|
||||||
|
option(LLAMA_GRPC_BUILD_TESTS "Build grpc-server unit tests" OFF)
|
||||||
|
if(LLAMA_GRPC_BUILD_TESTS)
|
||||||
|
enable_testing()
|
||||||
|
add_executable(message_content_test message_content_test.cpp message_content.h)
|
||||||
|
target_include_directories(message_content_test PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
|
||||||
|
target_link_libraries(message_content_test PRIVATE ${_LLAMA_COMMON_TARGET})
|
||||||
|
target_compile_features(message_content_test PRIVATE cxx_std_17)
|
||||||
|
add_test(NAME message_content_test COMMAND message_content_test)
|
||||||
|
endif()
|
||||||
|
|||||||
@@ -1,5 +1,5 @@
|
|||||||
|
|
||||||
LLAMA_VERSION?=73618f27a801c0b8614ceaf3547d3c2a99baae14
|
LLAMA_VERSION?=9d5d882d8cd0f0a9283d87ed5e6fe3ee0d925fb1
|
||||||
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
|
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
|
||||||
|
|
||||||
CMAKE_ARGS?=
|
CMAKE_ARGS?=
|
||||||
@@ -10,8 +10,16 @@ TARGET?=--target grpc-server
|
|||||||
JOBS?=$(shell nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1)
|
JOBS?=$(shell nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 1)
|
||||||
ARCH?=$(shell uname -m)
|
ARCH?=$(shell uname -m)
|
||||||
|
|
||||||
# Disable Shared libs as we are linking on static gRPC and we can't mix shared and static
|
# Shared libs default to OFF: we link static gRPC and the avx/avx2/avx512/fallback
|
||||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF -DLLAMA_CURL=OFF
|
# variants are fully static. The CPU_ALL_VARIANTS build flips SHARED_LIBS=ON (ggml/llama
|
||||||
|
# become shared so the dynamic CPU backends work; gRPC stays static via its imported
|
||||||
|
# targets). SHARED_LIBS is a make variable, not an appended -D, so it survives the
|
||||||
|
# recursive sub-make into the VARIANT build dir (which re-parses this Makefile) instead
|
||||||
|
# of being re-clobbered by a second -DBUILD_SHARED_LIBS=OFF. EXTRA_CMAKE_ARGS is the hook
|
||||||
|
# the CPU_ALL_VARIANTS target uses to inject -DGGML_BACKEND_DL/-DGGML_CPU_ALL_VARIANTS.
|
||||||
|
SHARED_LIBS?=OFF
|
||||||
|
EXTRA_CMAKE_ARGS?=
|
||||||
|
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=$(SHARED_LIBS) -DLLAMA_CURL=OFF $(EXTRA_CMAKE_ARGS)
|
||||||
|
|
||||||
CURRENT_MAKEFILE_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))
|
CURRENT_MAKEFILE_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))
|
||||||
ifeq ($(NATIVE),false)
|
ifeq ($(NATIVE),false)
|
||||||
@@ -120,6 +128,30 @@ llama-cpp-fallback: llama.cpp
|
|||||||
CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) VARIANT="llama-cpp-fallback-build" build-llama-cpp-grpc-server
|
CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) VARIANT="llama-cpp-fallback-build" build-llama-cpp-grpc-server
|
||||||
cp -rfv $(CURRENT_MAKEFILE_DIR)/../llama-cpp-fallback-build/grpc-server llama-cpp-fallback
|
cp -rfv $(CURRENT_MAKEFILE_DIR)/../llama-cpp-fallback-build/grpc-server llama-cpp-fallback
|
||||||
|
|
||||||
|
# Single-build CPU backend using ggml's CPU_ALL_VARIANTS. Produces ONE grpc-server
|
||||||
|
# plus a set of dlopen-able libggml-cpu-*.so (sandybridge/haswell/skylakex/...) that
|
||||||
|
# ggml's backend registry selects from at runtime by probing host CPU features.
|
||||||
|
# Replaces the avx/avx2/avx512/fallback multi-binary build on x86.
|
||||||
|
#
|
||||||
|
# CPU_ALL_VARIANTS requires GGML_BACKEND_DL, which requires BUILD_SHARED_LIBS=ON, so we
|
||||||
|
# pass SHARED_LIBS=ON and the DL flags as make variables (NOT pre-expanded into the
|
||||||
|
# CMAKE_ARGS env string): command-line make variables propagate through every recursive
|
||||||
|
# sub-make, so the deepest VARIANT-dir build computes BUILD_SHARED_LIBS=ON consistently.
|
||||||
|
# Only ggml/llama go shared - gRPC is found via its static imported targets, so the
|
||||||
|
# grpc-server binary keeps static gRPC and only dynamically links ggml.
|
||||||
|
#
|
||||||
|
# TARGET adds "ggml": the per-microarch backends are runtime-dlopened, not link deps of
|
||||||
|
# grpc-server, so they only build because each is an add_dependencies() of the ggml target.
|
||||||
|
llama-cpp-cpu-all: llama.cpp
|
||||||
|
cp -rf $(CURRENT_MAKEFILE_DIR)/../llama-cpp $(CURRENT_MAKEFILE_DIR)/../llama-cpp-cpu-all-build
|
||||||
|
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../llama-cpp-cpu-all-build purge
|
||||||
|
$(info ${GREEN}I llama-cpp build info:cpu-all-variants${RESET})
|
||||||
|
$(MAKE) SHARED_LIBS=ON EXTRA_CMAKE_ARGS="-DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON" TARGET="--target grpc-server --target ggml" VARIANT="llama-cpp-cpu-all-build" build-llama-cpp-grpc-server
|
||||||
|
cp -rfv $(CURRENT_MAKEFILE_DIR)/../llama-cpp-cpu-all-build/grpc-server llama-cpp-cpu-all
|
||||||
|
rm -rf ggml-shared-libs && mkdir -p ggml-shared-libs
|
||||||
|
find $(CURRENT_MAKEFILE_DIR)/../llama-cpp-cpu-all-build/llama.cpp/build \( -name '*.so*' -o -name '*.dylib' \) -exec cp -av {} ggml-shared-libs/ \;
|
||||||
|
@echo "Collected ggml shared backends:" && ls -la ggml-shared-libs/
|
||||||
|
|
||||||
llama-cpp-grpc: llama.cpp
|
llama-cpp-grpc: llama.cpp
|
||||||
cp -rf $(CURRENT_MAKEFILE_DIR)/../llama-cpp $(CURRENT_MAKEFILE_DIR)/../llama-cpp-grpc-build
|
cp -rf $(CURRENT_MAKEFILE_DIR)/../llama-cpp $(CURRENT_MAKEFILE_DIR)/../llama-cpp-grpc-build
|
||||||
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../llama-cpp-grpc-build purge
|
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../llama-cpp-grpc-build purge
|
||||||
|
|||||||
@@ -37,7 +37,9 @@
|
|||||||
#include "backend.pb.h"
|
#include "backend.pb.h"
|
||||||
#include "backend.grpc.pb.h"
|
#include "backend.grpc.pb.h"
|
||||||
#include "common.h"
|
#include "common.h"
|
||||||
|
#include "arg.h"
|
||||||
#include "chat-auto-parser.h"
|
#include "chat-auto-parser.h"
|
||||||
|
#include "message_content.h"
|
||||||
#include <getopt.h>
|
#include <getopt.h>
|
||||||
#include <grpcpp/ext/proto_server_reflection_plugin.h>
|
#include <grpcpp/ext/proto_server_reflection_plugin.h>
|
||||||
#include <grpcpp/grpcpp.h>
|
#include <grpcpp/grpcpp.h>
|
||||||
@@ -592,6 +594,10 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
|
|||||||
params.checkpoint_min_step = 256;
|
params.checkpoint_min_step = 256;
|
||||||
#endif
|
#endif
|
||||||
|
|
||||||
|
// Raw upstream llama-server flags collected from any option entry that
|
||||||
|
// starts with '-'. Applied once after the loop via common_params_parse.
|
||||||
|
std::vector<std::string> extra_argv;
|
||||||
|
|
||||||
// decode options. Options are in the form optname:optval, or just optname for booleans.
|
// decode options. Options are in the form optname:optval, or just optname for booleans.
|
||||||
for (int i = 0; i < request->options_size(); i++) {
|
for (int i = 0; i < request->options_size(); i++) {
|
||||||
std::string opt = request->options(i);
|
std::string opt = request->options(i);
|
||||||
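Review note: the new `extra_argv` vector is later handed to a C-style `argc`/`argv` parser. The sketch below shows one way to build such an `argv` from a vector of strings. It is a minimal illustration, not the patch's code: `make_argv` is a hypothetical helper name, and the only assumption is that the strings outlive the pointer array and are not modified while it is in use.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of the patch): builds a C-style argv whose
// pointers refer to the storage of `prog` and `args`. Both must outlive the
// returned vector and must not be resized or reassigned while it is in use.
std::vector<char *> make_argv(std::string & prog, std::vector<std::string> & args) {
    std::vector<char *> argv;
    argv.reserve(args.size() + 1);
    argv.push_back(&prog[0]);        // argv[0] is the program name
    for (auto & a : args) {
        argv.push_back(&a[0]);       // same pointer as a.data() in C++17
    }
    assert(argv.size() == args.size() + 1);
    return argv;
}
```

Keeping the strings in a vector that is fully populated before `make_argv` runs matters: pushing more strings afterwards could reallocate the vector and leave the `char *` entries dangling.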
@@ -1080,6 +1086,31 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
|
|||||||
} catch (...) {}
|
} catch (...) {}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// --- main model MoE on CPU (upstream --cpu-moe / --n-cpu-moe) ---
|
||||||
|
} else if (!strcmp(optname, "cpu_moe")) {
|
||||||
|
// Bool-style flag: keep all MoE expert weights on CPU.
|
||||||
|
const bool enable = (optval == NULL) ||
|
||||||
|
optval_str == "true" || optval_str == "1" || optval_str == "yes" ||
|
||||||
|
optval_str == "on" || optval_str == "enabled";
|
||||||
|
if (enable) {
|
||||||
|
params.tensor_buft_overrides.push_back(llm_ffn_exps_cpu_override());
|
||||||
|
}
|
||||||
|
} else if (!strcmp(optname, "n_cpu_moe")) {
|
||||||
|
if (optval != NULL) {
|
||||||
|
try {
|
||||||
|
int n = std::stoi(optval_str);
|
||||||
|
if (n < 0) n = 0;
|
||||||
|
// Keep override-name storage alive for the lifetime of the
|
||||||
|
// params struct (mirrors upstream arg.cpp's function-local static).
|
||||||
|
static std::list<std::string> buft_overrides_main;
|
||||||
|
for (int i = 0; i < n; ++i) {
|
||||||
|
buft_overrides_main.push_back(llm_ffn_exps_block_regex(i));
|
||||||
|
params.tensor_buft_overrides.push_back(
|
||||||
|
{buft_overrides_main.back().c_str(), ggml_backend_cpu_buffer_type()});
|
||||||
|
}
|
||||||
|
} catch (...) {}
|
||||||
|
}
|
||||||
|
|
||||||
// --- draft model tensor buffer overrides (upstream --spec-draft-override-tensor) ---
|
// --- draft model tensor buffer overrides (upstream --spec-draft-override-tensor) ---
|
||||||
} else if (!strcmp(optname, "draft_override_tensor") || !strcmp(optname, "spec_draft_override_tensor")) {
|
} else if (!strcmp(optname, "draft_override_tensor") || !strcmp(optname, "spec_draft_override_tensor")) {
|
||||||
// Format: <tensor regex>=<buffer type>,<tensor regex>=<buffer type>,...
|
// Format: <tensor regex>=<buffer type>,<tensor regex>=<buffer type>,...
|
||||||
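Review note: the `cpu_moe` / `n_cpu_moe` branches in this hunk rely on two small ideas: a bool-style option where a missing value means "enabled", and a `std::list<std::string>` that owns the pattern strings so the stored `c_str()` pointers stay valid. The sketch below isolates both under stated assumptions: `name_override`, `flag_enabled`, and `add_block_overrides` are hypothetical names, and the `"blk.N.ffn_exps"` pattern is a placeholder, not the regex upstream actually generates.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for an override entry that only borrows its pattern.
struct name_override {
    const char * pattern;
};

// Accepts the same spellings as the cpu_moe branch; a missing value (nullptr)
// counts as "enabled".
bool flag_enabled(const char * optval) {
    if (optval == nullptr) return true;
    const std::string v = optval;
    return v == "true" || v == "1" || v == "yes" || v == "on" || v == "enabled";
}

// Appends n entries whose patterns are owned by `storage`. std::list never
// relocates existing elements, so pointers taken earlier remain valid while
// more names are appended. Negative counts are clamped to zero.
void add_block_overrides(int n, std::list<std::string> & storage,
                         std::vector<name_override> & out) {
    if (n < 0) n = 0;
    for (int i = 0; i < n; ++i) {
        storage.push_back("blk." + std::to_string(i) + ".ffn_exps");  // placeholder pattern
        out.push_back({storage.back().c_str()});
    }
}
```

A `std::vector<std::string>` would not be a safe owner here: growing it can move the strings, and with small-string optimization the character buffers move with them.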
@@ -1111,6 +1142,30 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
|
|||||||
else { cur.push_back(c); }
|
else { cur.push_back(c); }
|
||||||
}
|
}
|
||||||
if (!cur.empty()) flush(cur);
|
if (!cur.empty()) flush(cur);
|
||||||
|
|
||||||
|
// --- generic passthrough: any entry starting with '-' is a raw
|
||||||
|
// upstream llama-server flag, forwarded verbatim to the parser. ---
|
||||||
|
} else if (optname[0] == '-') {
|
||||||
|
std::string flag = optname;
|
||||||
|
// These flags make upstream's parser exit() (printing usage /
|
||||||
|
// completion), which would kill the backend process. Skip them.
|
||||||
|
if (flag == "-h" || flag == "--help" || flag == "--usage" ||
|
||||||
|
flag == "--version" || flag == "--license" ||
|
||||||
|
flag == "--list-devices" || flag == "-cl" ||
|
||||||
|
flag == "--cache-list" ||
|
||||||
|
flag.rfind("--completion", 0) == 0) {
|
||||||
|
fprintf(stderr,
|
||||||
|
"[llama-cpp] ignoring passthrough flag that would exit: %s\n",
|
||||||
|
flag.c_str());
|
||||||
|
} else {
|
||||||
|
extra_argv.push_back(flag);
|
||||||
|
// Preserve the whole value after the first ':' so embedded
|
||||||
|
// colons (e.g. host:port) survive strtok's truncation of optval.
|
||||||
|
auto colon = opt.find(':');
|
||||||
|
if (colon != std::string::npos) {
|
||||||
|
extra_argv.push_back(opt.substr(colon + 1));
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
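Review note: the passthrough branch in this hunk does three things: it recognizes entries that start with `-`, drops flags that would make the parser exit, and splits at the first `:` only so values with embedded colons survive. A standalone sketch of that logic follows. `would_exit` and `collect_passthrough` are hypothetical names, the skip list is copied from the hunk, and the sketch uses `std::string::find` for the flag name where the patch uses the already-tokenized `optname`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// True for flags that would make a CLI parser print something and exit.
// The list mirrors the one in the hunk above.
bool would_exit(const std::string & flag) {
    return flag == "-h" || flag == "--help" || flag == "--usage" ||
           flag == "--version" || flag == "--license" ||
           flag == "--list-devices" || flag == "-cl" ||
           flag == "--cache-list" ||
           flag.rfind("--completion", 0) == 0;   // prefix match
}

// Splits "flag:value" at the FIRST ':' only and appends the pieces to `out`.
// Entries that do not start with '-' are left for the named-option handling.
void collect_passthrough(const std::string & opt, std::vector<std::string> & out) {
    if (opt.empty() || opt[0] != '-') return;
    const auto colon = opt.find(':');
    const std::string flag = opt.substr(0, colon);   // whole string if no ':'
    if (would_exit(flag)) return;
    out.push_back(flag);
    if (colon != std::string::npos) {
        out.push_back(opt.substr(colon + 1));        // keeps any later ':' intact
    }
}
```

With placeholder inputs such as `--some-flag:host:8080`, the collected vector holds `--some-flag` followed by `host:8080`, while `--help` contributes nothing.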
@@ -1146,27 +1201,6 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (!params.kv_overrides.empty()) {
|
|
||||||
params.kv_overrides.emplace_back();
|
|
||||||
params.kv_overrides.back().key[0] = 0;
|
|
||||||
}
|
|
||||||
|
|
||||||
// tensor_buft_overrides sentinel termination (mirrors upstream common/arg.cpp).
|
|
||||||
// Real entries are pushed during option parsing; here we pad/terminate so the
|
|
||||||
// model loader sees back().pattern == nullptr (GGML_ASSERT at common.cpp:1543)
|
|
||||||
// and so llama_params_fit has the placeholder slots it requires.
|
|
||||||
{
|
|
||||||
const size_t ntbo = llama_max_tensor_buft_overrides();
|
|
||||||
while (params.tensor_buft_overrides.size() < ntbo) {
|
|
||||||
params.tensor_buft_overrides.push_back({nullptr, nullptr});
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Terminate the draft tensor_buft_overrides list with a sentinel, mirroring
|
|
||||||
// the main-model handling above.
|
|
||||||
if (!params.speculative.draft.tensor_buft_overrides.empty()) {
|
|
||||||
params.speculative.draft.tensor_buft_overrides.push_back({nullptr, nullptr});
|
|
||||||
}
|
|
||||||
|
|
||||||
// TODO: Add yarn
|
// TODO: Add yarn
|
||||||
|
|
||||||
if (!request->tensorsplit().empty()) {
|
if (!request->tensorsplit().empty()) {
|
||||||
@@ -1259,6 +1293,69 @@ static void params_parse(server_context& /*ctx_server*/, const backend::ModelOpt
|
|||||||
params.sampling.grammar_triggers.push_back(std::move(trigger));
|
params.sampling.grammar_triggers.push_back(std::move(trigger));
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Apply any raw upstream flags last so an explicit passthrough flag wins
|
||||||
|
// over the LocalAI-resolved field it maps to (e.g. --ctx-size beats
|
||||||
|
// context_size). This is the same parser llama-server itself uses.
|
||||||
|
if (!extra_argv.empty()) {
|
||||||
|
// common_params_parser_init resets a few fields for the SERVER example
|
||||||
|
// (n_parallel -> -1, use_color). Snapshot n_parallel so an unrelated
|
||||||
|
// passthrough flag can't silently clobber LocalAI's resolved value.
|
||||||
|
const int saved_n_parallel = params.n_parallel;
|
||||||
|
|
||||||
|
std::vector<char *> argv;
|
||||||
|
std::string prog = "llama-server";
|
||||||
|
argv.push_back(prog.data());
|
||||||
|
for (auto & a : extra_argv) {
|
||||||
|
argv.push_back(a.data());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ctx_arg.params is a reference, so this overlays the given flags onto
|
||||||
|
// `params` in place. Returns false on a recoverable parse error (and
|
||||||
|
// self-restores params); may exit() on a hard error, exactly as
|
||||||
|
// passing the same bad flag to llama-server would.
|
||||||
|
if (!common_params_parse((int)argv.size(), argv.data(), params,
|
||||||
|
LLAMA_EXAMPLE_SERVER)) {
|
||||||
|
fprintf(stderr,
|
||||||
|
"[llama-cpp] failed to parse passthrough options; ignoring them\n");
|
||||||
|
}
|
||||||
|
|
||||||
|
// Restore n_parallel unless a passthrough flag explicitly set it
|
||||||
|
// (parser_init's reset sentinel for SERVER is -1).
|
||||||
|
if (params.n_parallel == -1) {
|
||||||
|
params.n_parallel = saved_n_parallel;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Terminate/pad the override vectors only after BOTH the named-option loop
|
||||||
|
// and the generic passthrough (common_params_parse above) have pushed their
|
||||||
|
// real entries, so back() is the null sentinel the model loader asserts on.
|
||||||
|
// Running these before the passthrough let a passthrough flag (--cpu-moe,
|
||||||
|
// --override-tensor, --override-kv, ...) append a real entry after the
|
||||||
|
// sentinel: a GGML_ASSERT crash for tensor_buft_overrides, a silent drop for
|
||||||
|
// kv_overrides. Double-termination is harmless (the while is a no-op if the
|
||||||
|
// passthrough parse already padded; an extra trailing null is ignored).
|
||||||
|
|
||||||
|
if (!params.kv_overrides.empty()) {
|
||||||
|
params.kv_overrides.emplace_back();
|
||||||
|
params.kv_overrides.back().key[0] = 0;
|
||||||
|
}
|
||||||
|
|
||||||
|
// tensor_buft_overrides sentinel termination (mirrors upstream common/arg.cpp).
|
||||||
|
// Real entries are pushed during option parsing; here we pad/terminate so the
|
||||||
|
// model loader sees back().pattern == nullptr (GGML_ASSERT at common.cpp:1543)
|
||||||
|
// and so llama_params_fit has the placeholder slots it requires.
|
||||||
|
{
|
||||||
|
const size_t ntbo = llama_max_tensor_buft_overrides();
|
||||||
|
while (params.tensor_buft_overrides.size() < ntbo) {
|
||||||
|
params.tensor_buft_overrides.push_back({nullptr, nullptr});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Terminate the draft tensor_buft_overrides list with a sentinel, mirroring
|
||||||
|
// the main-model handling above.
|
||||||
|
if (!params.speculative.draft.tensor_buft_overrides.empty()) {
|
||||||
|
params.speculative.draft.tensor_buft_overrides.push_back({nullptr, nullptr});
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
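Review note: the comment in this hunk explains why the sentinel termination moved below the passthrough parse: a real entry appended after the null terminator either trips an assertion or is silently ignored. The sketch below reproduces that ordering problem with a hypothetical `buft_override` struct and hypothetical helpers. Unlike the patch, `terminate_overrides` also appends a sentinel when the list is already at or above the slot count, which is an extra guard added for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a tensor override; a null pattern marks the end.
struct buft_override {
    const char * pattern;
    const void * buft;
};

// Pads the list with null entries up to `slots` and guarantees that the last
// element is a null sentinel. Calling it again is harmless.
void terminate_overrides(std::vector<buft_override> & v, std::size_t slots) {
    while (v.size() < slots) {
        v.push_back({nullptr, nullptr});
    }
    if (v.empty() || v.back().pattern != nullptr) {
        v.push_back({nullptr, nullptr});
    }
    assert(v.back().pattern == nullptr);
}

// Counts real entries the way a null-terminated consumer would see them.
std::size_t count_until_sentinel(const std::vector<buft_override> & v) {
    std::size_t n = 0;
    while (n < v.size() && v[n].pattern != nullptr) ++n;
    return n;
}
```

Terminating once, after every producer has pushed its entries, keeps `count_until_sentinel` equal to the number of real overrides. Terminating early and appending afterwards leaves the late entry invisible to any consumer that stops at the first null pattern.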
@@ -1520,242 +1617,20 @@ public:
|
|||||||
|
|
||||||
for (int i = 0; i < request->messages_size(); i++) {
|
for (int i = 0; i < request->messages_size(); i++) {
|
||||||
const auto& msg = request->messages(i);
|
const auto& msg = request->messages(i);
|
||||||
json msg_json;
|
llama_grpc::ReconstructedMessageInput rin;
|
||||||
msg_json["role"] = msg.role();
|
rin.role = msg.role();
|
||||||
|
rin.content = msg.content();
|
||||||
bool is_last_user_msg = (i == last_user_msg_idx);
|
rin.name = msg.name();
|
||||||
bool has_images_or_audio = (request->images_size() > 0 || request->audios_size() > 0 || request->videos_size() > 0);
|
rin.tool_call_id = msg.tool_call_id();
|
||||||
|
rin.reasoning_content = msg.reasoning_content();
|
||||||
// Handle content - can be string, null, or array
|
rin.tool_calls = msg.tool_calls();
|
||||||
// For multimodal content, we'll embed images/audio from separate fields
|
rin.is_last_user_msg = (i == last_user_msg_idx);
|
||||||
if (!msg.content().empty()) {
|
if (rin.is_last_user_msg) {
|
||||||
// Try to parse content as JSON to see if it's already an array
|
for (int j = 0; j < request->images_size(); j++) rin.images.push_back(request->images(j));
|
||||||
json content_val;
|
for (int j = 0; j < request->audios_size(); j++) rin.audios.push_back(request->audios(j));
|
||||||
try {
|
for (int j = 0; j < request->videos_size(); j++) rin.videos.push_back(request->videos(j));
|
||||||
content_val = json::parse(msg.content());
|
|
||||||
// Handle null values - convert to empty string to avoid template errors
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
content_val = "";
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error&) {
|
|
||||||
// Not JSON, treat as plain string
|
|
||||||
content_val = msg.content();
|
|
||||||
}
|
|
||||||
|
|
||||||
// If content is an object (e.g., from tool call failures), convert to string
|
|
||||||
if (content_val.is_object()) {
|
|
||||||
content_val = content_val.dump();
|
|
||||||
}
|
|
||||||
|
|
||||||
// If content is a string and this is the last user message with images/audio, combine them
|
|
||||||
if (content_val.is_string() && is_last_user_msg && has_images_or_audio) {
|
|
||||||
json content_array = json::array();
|
|
||||||
// Add text first
|
|
||||||
content_array.push_back({{"type", "text"}, {"text", content_val.get<std::string>()}});
|
|
||||||
// Add images
|
|
||||||
if (request->images_size() > 0) {
|
|
||||||
for (int j = 0; j < request->images_size(); j++) {
|
|
||||||
json image_chunk;
|
|
||||||
image_chunk["type"] = "image_url";
|
|
||||||
json image_url;
|
|
||||||
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
|
|
||||||
image_chunk["image_url"] = image_url;
|
|
||||||
content_array.push_back(image_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Add audios
|
|
||||||
if (request->audios_size() > 0) {
|
|
||||||
for (int j = 0; j < request->audios_size(); j++) {
|
|
||||||
json audio_chunk;
|
|
||||||
audio_chunk["type"] = "input_audio";
|
|
||||||
json input_audio;
|
|
||||||
input_audio["data"] = request->audios(j);
|
|
||||||
input_audio["format"] = "wav"; // default, could be made configurable
|
|
||||||
audio_chunk["input_audio"] = input_audio;
|
|
||||||
content_array.push_back(audio_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->videos_size() > 0) {
|
|
||||||
for (int j = 0; j < request->videos_size(); j++) {
|
|
||||||
json video_chunk;
|
|
||||||
video_chunk["type"] = "input_video";
|
|
||||||
json input_video;
|
|
||||||
input_video["data"] = request->videos(j);
|
|
||||||
video_chunk["input_video"] = input_video;
|
|
||||||
content_array.push_back(video_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
msg_json["content"] = content_array;
|
|
||||||
} else {
|
|
||||||
// Use content as-is (already array or not last user message)
|
|
||||||
// Ensure null values are converted to empty string
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
msg_json["content"] = "";
|
|
||||||
} else {
|
|
||||||
msg_json["content"] = content_val;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else if (is_last_user_msg && has_images_or_audio) {
|
|
||||||
// If no content but this is the last user message with images/audio, create content array
|
|
||||||
json content_array = json::array();
|
|
||||||
if (request->images_size() > 0) {
|
|
||||||
for (int j = 0; j < request->images_size(); j++) {
|
|
||||||
json image_chunk;
|
|
||||||
image_chunk["type"] = "image_url";
|
|
||||||
json image_url;
|
|
||||||
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
|
|
||||||
image_chunk["image_url"] = image_url;
|
|
||||||
content_array.push_back(image_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->audios_size() > 0) {
|
|
||||||
for (int j = 0; j < request->audios_size(); j++) {
|
|
||||||
json audio_chunk;
|
|
||||||
audio_chunk["type"] = "input_audio";
|
|
||||||
json input_audio;
|
|
||||||
input_audio["data"] = request->audios(j);
|
|
||||||
input_audio["format"] = "wav"; // default, could be made configurable
|
|
||||||
audio_chunk["input_audio"] = input_audio;
|
|
||||||
content_array.push_back(audio_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->videos_size() > 0) {
|
|
||||||
for (int j = 0; j < request->videos_size(); j++) {
|
|
||||||
json video_chunk;
|
|
||||||
video_chunk["type"] = "input_video";
|
|
||||||
json input_video;
|
|
||||||
input_video["data"] = request->videos(j);
|
|
||||||
video_chunk["input_video"] = input_video;
|
|
||||||
content_array.push_back(video_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
msg_json["content"] = content_array;
|
|
||||||
} else if (msg.role() == "tool") {
|
|
||||||
// Tool role messages must have content field set, even if empty
|
|
||||||
// Jinja templates expect content to be a string, not null or object
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d is tool role, content_empty=%d\n", i, msg.content().empty() ? 1 : 0);
|
|
||||||
if (msg.content().empty()) {
|
|
||||||
msg_json["content"] = "";
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): empty content, set to empty string\n", i);
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): content exists: %s\n",
|
|
||||||
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
|
|
||||||
// Content exists, parse and ensure it's a string
|
|
||||||
json content_val;
|
|
||||||
try {
|
|
||||||
content_val = json::parse(msg.content());
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): parsed JSON, type=%s\n",
|
|
||||||
i, content_val.is_null() ? "null" :
|
|
||||||
content_val.is_object() ? "object" :
|
|
||||||
content_val.is_string() ? "string" :
|
|
||||||
content_val.is_array() ? "array" : "other");
|
|
||||||
// Handle null values - Jinja templates expect content to be a string, not null
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
msg_json["content"] = "";
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): null content, converted to empty string\n", i);
|
|
||||||
} else if (content_val.is_object()) {
|
|
||||||
// If content is an object (e.g., from tool call failures/errors), convert to string
|
|
||||||
msg_json["content"] = content_val.dump();
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): object content, converted to string: %s\n",
|
|
||||||
i, content_val.dump().substr(0, std::min<size_t>(200, content_val.dump().size())).c_str());
|
|
||||||
} else if (content_val.is_string()) {
|
|
||||||
msg_json["content"] = content_val.get<std::string>();
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): string content, using as-is\n", i);
|
|
||||||
} else {
|
|
||||||
// For arrays or other types, convert to string
|
|
||||||
msg_json["content"] = content_val.dump();
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): %s content, converted to string\n",
|
|
||||||
i, content_val.is_array() ? "array" : "other type");
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error&) {
|
|
||||||
// Not JSON, treat as plain string
|
|
||||||
msg_json["content"] = msg.content();
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): not JSON, using as string\n", i);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
// Ensure all messages have content set (fallback for any unhandled cases)
|
|
||||||
// Jinja templates expect content to be present, default to empty string if not set
|
|
||||||
if (!msg_json.contains("content")) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (role=%s): no content field, adding empty string\n",
|
|
||||||
i, msg.role().c_str());
|
|
||||||
msg_json["content"] = "";
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
messages_json.push_back(llama_grpc::build_reconstructed_message(rin));
|
||||||
// Add optional fields for OpenAI-compatible message format
|
|
||||||
if (!msg.name().empty()) {
|
|
||||||
msg_json["name"] = msg.name();
|
|
||||||
}
|
|
||||||
if (!msg.tool_call_id().empty()) {
|
|
||||||
msg_json["tool_call_id"] = msg.tool_call_id();
|
|
||||||
}
|
|
||||||
if (!msg.reasoning_content().empty()) {
|
|
||||||
msg_json["reasoning_content"] = msg.reasoning_content();
|
|
||||||
}
|
|
||||||
if (!msg.tool_calls().empty()) {
|
|
||||||
// Parse tool_calls JSON string and add to message
|
|
||||||
try {
|
|
||||||
json tool_calls = json::parse(msg.tool_calls());
|
|
||||||
msg_json["tool_calls"] = tool_calls;
|
|
||||||
SRV_INF("[TOOL CALLS DEBUG] PredictStream: Message %d has tool_calls: %s\n", i, tool_calls.dump().c_str());
|
|
||||||
// IMPORTANT: If message has tool_calls but content is empty or not set,
|
|
||||||
// set content to space " " instead of empty string "", because llama.cpp's
|
|
||||||
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
|
|
||||||
// which causes template errors when accessing message.content[:tool_start_length]
|
|
||||||
if (!msg_json.contains("content") || (msg_json.contains("content") && msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d has tool_calls but empty content, setting to space\n", i);
|
|
||||||
msg_json["content"] = " ";
|
|
||||||
}
|
|
||||||
// Log each tool call with name and arguments
|
|
||||||
if (tool_calls.is_array()) {
|
|
||||||
for (size_t tc_idx = 0; tc_idx < tool_calls.size(); tc_idx++) {
|
|
||||||
const auto& tc = tool_calls[tc_idx];
|
|
||||||
std::string tool_name = "unknown";
|
|
||||||
std::string tool_args = "{}";
|
|
||||||
if (tc.contains("function")) {
|
|
||||||
const auto& func = tc["function"];
|
|
||||||
if (func.contains("name")) {
|
|
||||||
tool_name = func["name"].get<std::string>();
|
|
||||||
}
|
|
||||||
if (func.contains("arguments")) {
|
|
||||||
tool_args = func["arguments"].is_string() ?
|
|
||||||
func["arguments"].get<std::string>() :
|
|
||||||
func["arguments"].dump();
|
|
||||||
}
|
|
||||||
} else if (tc.contains("name")) {
|
|
||||||
tool_name = tc["name"].get<std::string>();
|
|
||||||
if (tc.contains("arguments")) {
|
|
||||||
tool_args = tc["arguments"].is_string() ?
|
|
||||||
tc["arguments"].get<std::string>() :
|
|
||||||
tc["arguments"].dump();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
SRV_INF("[TOOL CALLS DEBUG] PredictStream: Message %d, tool_call %zu: name=%s, arguments=%s\n",
|
|
||||||
i, tc_idx, tool_name.c_str(), tool_args.c_str());
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error& e) {
|
|
||||||
SRV_WRN("Failed to parse tool_calls JSON: %s\n", e.what());
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Debug: Log final content state before adding to array
|
|
||||||
if (msg_json.contains("content")) {
|
|
||||||
if (msg_json["content"].is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: content is NULL - THIS WILL CAUSE ERROR!\n", i);
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: content type=%s, has_value=%d\n",
|
|
||||||
i, msg_json["content"].is_string() ? "string" :
|
|
||||||
msg_json["content"].is_array() ? "array" :
|
|
||||||
msg_json["content"].is_object() ? "object" : "other",
|
|
||||||
msg_json["content"].is_null() ? 0 : 1);
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: NO CONTENT FIELD - THIS WILL CAUSE ERROR!\n", i);
|
|
||||||
}
|
|
||||||
|
|
||||||
messages_json.push_back(msg_json);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Final safety check: Ensure no message has null content (Jinja templates require strings)
|
// Final safety check: Ensure no message has null content (Jinja templates require strings)
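The right-hand side of this hunk attaches images, audios, and videos only to the last user message before handing the input to `llama_grpc::build_reconstructed_message`. A minimal sketch of that selection step follows, under stated assumptions: `MessageInputSketch`, `find_last_user_index`, `build_inputs`, and `image_data_url` are hypothetical names, only images are modelled, and `roles` and `contents` are assumed to have the same length. How `last_user_msg_idx` is actually computed is not shown in this diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, trimmed-down stand-in for llama_grpc::ReconstructedMessageInput.
struct MessageInputSketch {
    std::string role;
    std::string content;
    bool is_last_user_msg = false;
    std::vector<std::string> images;  // base64 payloads
};

// Returns the index of the last message whose role is "user", or -1 if none.
inline int find_last_user_index(const std::vector<std::string>& roles) {
    for (int i = static_cast<int>(roles.size()) - 1; i >= 0; --i) {
        if (roles[i] == "user") return i;
    }
    return -1;
}

// Builds one input per message; media is copied onto the last user message only.
// Assumes roles.size() == contents.size().
inline std::vector<MessageInputSketch> build_inputs(
        const std::vector<std::string>& roles,
        const std::vector<std::string>& contents,
        const std::vector<std::string>& images) {
    const int last_user = find_last_user_index(roles);
    std::vector<MessageInputSketch> out;
    for (int i = 0; i < static_cast<int>(roles.size()); ++i) {
        MessageInputSketch in;
        in.role = roles[i];
        in.content = contents[i];
        in.is_last_user_msg = (i == last_user);
        if (in.is_last_user_msg) in.images = images;
        out.push_back(in);
    }
    return out;
}

// Same data-URL prefix that the removed left-hand code used for images.
inline std::string image_data_url(const std::string& base64) {
    return "data:image/jpeg;base64," + base64;
}
```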
|
||||||
@@ -1976,36 +1851,7 @@ public:
|
|||||||
if (body_json.contains("messages") && body_json["messages"].is_array()) {
|
if (body_json.contains("messages") && body_json["messages"].is_array()) {
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
|
SRV_INF("[CONTENT DEBUG] PredictStream: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
|
||||||
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
|
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
|
||||||
auto& msg = body_json["messages"][idx];
|
llama_grpc::normalize_template_message(body_json["messages"][idx]);
|
||||||
std::string role_str = msg.contains("role") ? msg["role"].get<std::string>() : "unknown";
|
|
||||||
if (msg.contains("content")) {
|
|
||||||
if (msg["content"].is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) has NULL content - FIXING!\n", idx, role_str.c_str());
|
|
||||||
msg["content"] = ""; // Fix null content
|
|
||||||
} else if (role_str == "tool" && msg["content"].is_array()) {
|
|
||||||
// Tool messages must have string content, not array
|
|
||||||
// oaicompat_chat_params_parse expects tool messages to have string content
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=tool) has array content, converting to string\n", idx);
|
|
||||||
msg["content"] = msg["content"].dump();
|
|
||||||
} else if (!msg["content"].is_string() && !msg["content"].is_array()) {
|
|
||||||
// If content is object or other non-string type, convert to string for templates
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) content is not string/array, converting\n", idx, role_str.c_str());
|
|
||||||
if (msg["content"].is_object()) {
|
|
||||||
msg["content"] = msg["content"].dump();
|
|
||||||
} else {
|
|
||||||
msg["content"] = "";
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s): content type=%s\n",
|
|
||||||
idx, role_str.c_str(),
|
|
||||||
msg["content"].is_string() ? "string" :
|
|
||||||
msg["content"].is_array() ? "array" :
|
|
||||||
msg["content"].is_object() ? "object" : "other");
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) MISSING content field - ADDING!\n", idx, role_str.c_str());
|
|
||||||
msg["content"] = ""; // Add missing content
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
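The removed left-hand block in this hunk is effectively a decision table over the role and the JSON type of `content`. The sketch below restates that table without a JSON dependency; `ContentKind`, `Fix`, and `template_content_fix` are hypothetical names. It describes the removed code only; whether `llama_grpc::normalize_template_message` keeps exactly these rules is not visible in this part of the diff.

```cpp
#include <cassert>
#include <string>

// JSON type of a message's "content" field, as seen by the removed code.
enum class ContentKind { Missing, Null, String, Array, Object, Other };

// What the removed code did to the field before templating.
enum class Fix { Keep, EmptyString, DumpToString };

inline Fix template_content_fix(const std::string& role, ContentKind kind) {
    switch (kind) {
        case ContentKind::Missing:  // no content field: add ""
        case ContentKind::Null:     // null content: replace with ""
            return Fix::EmptyString;
        case ContentKind::Array:    // only tool messages get arrays stringified
            return role == "tool" ? Fix::DumpToString : Fix::Keep;
        case ContentKind::Object:   // objects are serialized to a string
            return Fix::DumpToString;
        case ContentKind::Other:    // numbers, booleans, etc. become ""
            return Fix::EmptyString;
        case ContentKind::String:
            return Fix::Keep;
    }
    return Fix::Keep;  // not reached; keeps compilers quiet
}
```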
|
||||||
|
|
||||||
@@ -2337,264 +2183,20 @@ public:
|
|||||||
SRV_INF("[CONTENT DEBUG] Predict: Processing %d messages\n", request->messages_size());
|
SRV_INF("[CONTENT DEBUG] Predict: Processing %d messages\n", request->messages_size());
|
||||||
for (int i = 0; i < request->messages_size(); i++) {
|
for (int i = 0; i < request->messages_size(); i++) {
|
||||||
const auto& msg = request->messages(i);
|
const auto& msg = request->messages(i);
|
||||||
json msg_json;
|
llama_grpc::ReconstructedMessageInput rin;
|
||||||
msg_json["role"] = msg.role();
|
rin.role = msg.role();
|
||||||
|
rin.content = msg.content();
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d: role=%s, content_empty=%d, content_length=%zu\n",
|
rin.name = msg.name();
|
||||||
i, msg.role().c_str(), msg.content().empty() ? 1 : 0, msg.content().size());
|
rin.tool_call_id = msg.tool_call_id();
|
||||||
if (!msg.content().empty()) {
|
rin.reasoning_content = msg.reasoning_content();
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d content (first 200 chars): %s\n",
|
rin.tool_calls = msg.tool_calls();
|
||||||
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
|
rin.is_last_user_msg = (i == last_user_msg_idx);
|
||||||
|
if (rin.is_last_user_msg) {
|
||||||
|
for (int j = 0; j < request->images_size(); j++) rin.images.push_back(request->images(j));
|
||||||
|
for (int j = 0; j < request->audios_size(); j++) rin.audios.push_back(request->audios(j));
|
||||||
|
for (int j = 0; j < request->videos_size(); j++) rin.videos.push_back(request->videos(j));
|
||||||
}
|
}
|
||||||
|
messages_json.push_back(llama_grpc::build_reconstructed_message(rin));
|
||||||
bool is_last_user_msg = (i == last_user_msg_idx);
|
|
||||||
bool has_images_or_audio = (request->images_size() > 0 || request->audios_size() > 0 || request->videos_size() > 0);
|
|
||||||
|
|
||||||
// Handle content - can be string, null, or array
|
|
||||||
// For multimodal content, we'll embed images/audio from separate fields
|
|
||||||
if (!msg.content().empty()) {
|
|
||||||
// Try to parse content as JSON to see if it's already an array
|
|
||||||
json content_val;
|
|
||||||
try {
|
|
||||||
content_val = json::parse(msg.content());
|
|
||||||
// Handle null values - convert to empty string to avoid template errors
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d parsed JSON is null, converting to empty string\n", i);
|
|
||||||
content_val = "";
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error&) {
|
|
||||||
// Not JSON, treat as plain string
|
|
||||||
content_val = msg.content();
|
|
||||||
}
|
|
||||||
|
|
||||||
// If content is an object (e.g., from tool call failures), convert to string
|
|
||||||
if (content_val.is_object()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d content is object, converting to string\n", i);
|
|
||||||
content_val = content_val.dump();
|
|
||||||
}
|
|
||||||
|
|
||||||
// If content is a string and this is the last user message with images/audio, combine them
|
|
||||||
if (content_val.is_string() && is_last_user_msg && has_images_or_audio) {
|
|
||||||
json content_array = json::array();
|
|
||||||
// Add text first
|
|
||||||
content_array.push_back({{"type", "text"}, {"text", content_val.get<std::string>()}});
|
|
||||||
// Add images
|
|
||||||
if (request->images_size() > 0) {
|
|
||||||
for (int j = 0; j < request->images_size(); j++) {
|
|
||||||
json image_chunk;
|
|
||||||
image_chunk["type"] = "image_url";
|
|
||||||
json image_url;
|
|
||||||
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
|
|
||||||
image_chunk["image_url"] = image_url;
|
|
||||||
content_array.push_back(image_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
// Add audios
|
|
||||||
if (request->audios_size() > 0) {
|
|
||||||
for (int j = 0; j < request->audios_size(); j++) {
|
|
||||||
json audio_chunk;
|
|
||||||
audio_chunk["type"] = "input_audio";
|
|
||||||
json input_audio;
|
|
||||||
input_audio["data"] = request->audios(j);
|
|
||||||
input_audio["format"] = "wav"; // default, could be made configurable
|
|
||||||
audio_chunk["input_audio"] = input_audio;
|
|
||||||
content_array.push_back(audio_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->videos_size() > 0) {
|
|
||||||
for (int j = 0; j < request->videos_size(); j++) {
|
|
||||||
json video_chunk;
|
|
||||||
video_chunk["type"] = "input_video";
|
|
||||||
json input_video;
|
|
||||||
input_video["data"] = request->videos(j);
|
|
||||||
video_chunk["input_video"] = input_video;
|
|
||||||
content_array.push_back(video_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
msg_json["content"] = content_array;
|
|
||||||
} else {
|
|
||||||
// Use content as-is (already array or not last user message)
|
|
||||||
// Ensure null values are converted to empty string
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d content_val was null, setting to empty string\n", i);
|
|
||||||
msg_json["content"] = "";
|
|
||||||
} else {
|
|
||||||
msg_json["content"] = content_val;
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d content set, type=%s\n",
|
|
||||||
i, content_val.is_string() ? "string" :
|
|
||||||
content_val.is_array() ? "array" :
|
|
||||||
content_val.is_object() ? "object" : "other");
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else if (is_last_user_msg && has_images_or_audio) {
|
|
||||||
// If no content but this is the last user message with images/audio, create content array
|
|
||||||
json content_array = json::array();
|
|
||||||
if (request->images_size() > 0) {
|
|
||||||
for (int j = 0; j < request->images_size(); j++) {
|
|
||||||
json image_chunk;
|
|
||||||
image_chunk["type"] = "image_url";
|
|
||||||
json image_url;
|
|
||||||
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
|
|
||||||
image_chunk["image_url"] = image_url;
|
|
||||||
content_array.push_back(image_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->audios_size() > 0) {
|
|
||||||
for (int j = 0; j < request->audios_size(); j++) {
|
|
||||||
json audio_chunk;
|
|
||||||
audio_chunk["type"] = "input_audio";
|
|
||||||
json input_audio;
|
|
||||||
input_audio["data"] = request->audios(j);
|
|
||||||
input_audio["format"] = "wav"; // default, could be made configurable
|
|
||||||
audio_chunk["input_audio"] = input_audio;
|
|
||||||
content_array.push_back(audio_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
if (request->videos_size() > 0) {
|
|
||||||
for (int j = 0; j < request->videos_size(); j++) {
|
|
||||||
json video_chunk;
|
|
||||||
video_chunk["type"] = "input_video";
|
|
||||||
json input_video;
|
|
||||||
input_video["data"] = request->videos(j);
|
|
||||||
video_chunk["input_video"] = input_video;
|
|
||||||
content_array.push_back(video_chunk);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
msg_json["content"] = content_array;
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d created content array with media\n", i);
|
|
||||||
} else if (!msg.tool_calls().empty()) {
|
|
||||||
// Tool call messages may have null content, but templates expect string
|
|
||||||
// IMPORTANT: Set to space " " instead of empty string "", because llama.cpp's
|
|
||||||
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
|
|
||||||
// which causes template errors when accessing message.content[:tool_start_length]
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d has tool_calls, setting content to space (not empty string)\n", i);
|
|
||||||
msg_json["content"] = " ";
|
|
||||||
} else if (msg.role() == "tool") {
|
|
||||||
// Tool role messages must have content field set, even if empty
|
|
||||||
// Jinja templates expect content to be a string, not null or object
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d is tool role, content_empty=%d\n", i, msg.content().empty() ? 1 : 0);
|
|
||||||
if (msg.content().empty()) {
|
|
||||||
msg_json["content"] = "";
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): empty content, set to empty string\n", i);
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): content exists: %s\n",
|
|
||||||
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
|
|
||||||
// Content exists, parse and ensure it's a string
|
|
||||||
json content_val;
|
|
||||||
try {
|
|
||||||
content_val = json::parse(msg.content());
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): parsed JSON, type=%s\n",
|
|
||||||
i, content_val.is_null() ? "null" :
|
|
||||||
content_val.is_object() ? "object" :
|
|
||||||
content_val.is_string() ? "string" :
|
|
||||||
content_val.is_array() ? "array" : "other");
|
|
||||||
// Handle null values - Jinja templates expect content to be a string, not null
|
|
||||||
if (content_val.is_null()) {
|
|
||||||
msg_json["content"] = "";
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): null content, converted to empty string\n", i);
|
|
||||||
} else if (content_val.is_object()) {
|
|
||||||
// If content is an object (e.g., from tool call failures/errors), convert to string
|
|
||||||
msg_json["content"] = content_val.dump();
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): object content, converted to string: %s\n",
|
|
||||||
i, content_val.dump().substr(0, std::min<size_t>(200, content_val.dump().size())).c_str());
|
|
||||||
} else if (content_val.is_string()) {
|
|
||||||
msg_json["content"] = content_val.get<std::string>();
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): string content, using as-is\n", i);
|
|
||||||
} else {
|
|
||||||
// For arrays or other types, convert to string
|
|
||||||
msg_json["content"] = content_val.dump();
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): %s content, converted to string\n",
|
|
||||||
i, content_val.is_array() ? "array" : "other type");
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error&) {
|
|
||||||
// Not JSON, treat as plain string
|
|
||||||
msg_json["content"] = msg.content();
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): not JSON, using as string\n", i);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
// Ensure all messages have content set (fallback for any unhandled cases)
|
|
||||||
// Jinja templates expect content to be present, default to empty string if not set
|
|
||||||
if (!msg_json.contains("content")) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d (role=%s): no content field, adding empty string\n",
|
|
||||||
i, msg.role().c_str());
|
|
||||||
msg_json["content"] = "";
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Add optional fields for OpenAI-compatible message format
|
|
||||||
if (!msg.name().empty()) {
|
|
||||||
msg_json["name"] = msg.name();
|
|
||||||
}
|
|
||||||
if (!msg.tool_call_id().empty()) {
|
|
||||||
msg_json["tool_call_id"] = msg.tool_call_id();
|
|
||||||
}
|
|
||||||
if (!msg.reasoning_content().empty()) {
|
|
||||||
msg_json["reasoning_content"] = msg.reasoning_content();
|
|
||||||
}
|
|
||||||
if (!msg.tool_calls().empty()) {
|
|
||||||
// Parse tool_calls JSON string and add to message
|
|
||||||
try {
|
|
||||||
json tool_calls = json::parse(msg.tool_calls());
|
|
||||||
msg_json["tool_calls"] = tool_calls;
|
|
||||||
SRV_INF("[TOOL CALLS DEBUG] Predict: Message %d has tool_calls: %s\n", i, tool_calls.dump().c_str());
|
|
||||||
// IMPORTANT: If message has tool_calls but content is empty or not set,
|
|
||||||
// set content to space " " instead of empty string "", because llama.cpp's
|
|
||||||
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
|
|
||||||
// which causes template errors when accessing message.content[:tool_start_length]
|
|
||||||
if (!msg_json.contains("content") || (msg_json.contains("content") && msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d has tool_calls but empty content, setting to space\n", i);
|
|
||||||
msg_json["content"] = " ";
|
|
||||||
}
|
|
||||||
// Log each tool call with name and arguments
|
|
||||||
if (tool_calls.is_array()) {
|
|
||||||
for (size_t tc_idx = 0; tc_idx < tool_calls.size(); tc_idx++) {
|
|
||||||
const auto& tc = tool_calls[tc_idx];
|
|
||||||
std::string tool_name = "unknown";
|
|
||||||
std::string tool_args = "{}";
|
|
||||||
if (tc.contains("function")) {
|
|
||||||
const auto& func = tc["function"];
|
|
||||||
if (func.contains("name")) {
|
|
||||||
tool_name = func["name"].get<std::string>();
|
|
||||||
}
|
|
||||||
if (func.contains("arguments")) {
|
|
||||||
tool_args = func["arguments"].is_string() ?
|
|
||||||
func["arguments"].get<std::string>() :
|
|
||||||
func["arguments"].dump();
|
|
||||||
}
|
|
||||||
} else if (tc.contains("name")) {
|
|
||||||
tool_name = tc["name"].get<std::string>();
|
|
||||||
if (tc.contains("arguments")) {
|
|
||||||
tool_args = tc["arguments"].is_string() ?
|
|
||||||
tc["arguments"].get<std::string>() :
|
|
||||||
tc["arguments"].dump();
|
|
||||||
}
|
|
||||||
}
|
|
||||||
SRV_INF("[TOOL CALLS DEBUG] Predict: Message %d, tool_call %zu: name=%s, arguments=%s\n",
|
|
||||||
i, tc_idx, tool_name.c_str(), tool_args.c_str());
|
|
||||||
}
|
|
||||||
}
|
|
||||||
} catch (const json::parse_error& e) {
|
|
||||||
SRV_WRN("Failed to parse tool_calls JSON: %s\n", e.what());
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Debug: Log final content state before adding to array
|
|
||||||
if (msg_json.contains("content")) {
|
|
||||||
if (msg_json["content"].is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: content is NULL - THIS WILL CAUSE ERROR!\n", i);
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: content type=%s, has_value=%d\n",
|
|
||||||
i, msg_json["content"].is_string() ? "string" :
|
|
||||||
msg_json["content"].is_array() ? "array" :
|
|
||||||
msg_json["content"].is_object() ? "object" : "other",
|
|
||||||
msg_json["content"].is_null() ? 0 : 1);
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: NO CONTENT FIELD - THIS WILL CAUSE ERROR!\n", i);
|
|
||||||
}
|
|
||||||
|
|
||||||
messages_json.push_back(msg_json);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// Final safety check: Ensure no message has null content (Jinja templates require strings)
|
// Final safety check: Ensure no message has null content (Jinja templates require strings)
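The removed comments explain that a message carrying `tool_calls` gets a single space as content instead of an empty string, because an empty string would later be turned into null and break templates that slice `message.content`. A minimal sketch of that rule, with the hypothetical helper name `content_for_template`:

```cpp
#include <cassert>
#include <string>

// Returns the content to hand to the chat template. An empty content on a
// message that has tool calls becomes " " so it cannot collapse to null later.
inline std::string content_for_template(const std::string& content,
                                        bool has_tool_calls) {
    if (content.empty() && has_tool_calls) {
        return " ";
    }
    return content;
}
```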
|
||||||
@@ -2815,36 +2417,7 @@ public:
|
|||||||
if (body_json.contains("messages") && body_json["messages"].is_array()) {
|
if (body_json.contains("messages") && body_json["messages"].is_array()) {
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
|
SRV_INF("[CONTENT DEBUG] Predict: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
|
||||||
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
|
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
|
||||||
auto& msg = body_json["messages"][idx];
|
llama_grpc::normalize_template_message(body_json["messages"][idx]);
|
||||||
std::string role_str = msg.contains("role") ? msg["role"].get<std::string>() : "unknown";
|
|
||||||
if (msg.contains("content")) {
|
|
||||||
if (msg["content"].is_null()) {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) has NULL content - FIXING!\n", idx, role_str.c_str());
|
|
||||||
msg["content"] = ""; // Fix null content
|
|
||||||
} else if (role_str == "tool" && msg["content"].is_array()) {
|
|
||||||
// Tool messages must have string content, not array
|
|
||||||
// oaicompat_chat_params_parse expects tool messages to have string content
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=tool) has array content, converting to string\n", idx);
|
|
||||||
msg["content"] = msg["content"].dump();
|
|
||||||
} else if (!msg["content"].is_string() && !msg["content"].is_array()) {
|
|
||||||
// If content is object or other non-string type, convert to string for templates
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) content is not string/array, converting\n", idx, role_str.c_str());
|
|
||||||
if (msg["content"].is_object()) {
|
|
||||||
msg["content"] = msg["content"].dump();
|
|
||||||
} else {
|
|
||||||
msg["content"] = "";
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s): content type=%s\n",
|
|
||||||
idx, role_str.c_str(),
|
|
||||||
msg["content"].is_string() ? "string" :
|
|
||||||
msg["content"].is_array() ? "array" :
|
|
||||||
msg["content"].is_object() ? "object" : "other");
|
|
||||||
}
|
|
||||||
} else {
|
|
||||||
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) MISSING content field - ADDING!\n", idx, role_str.c_str());
|
|
||||||
msg["content"] = ""; // Add missing content
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
192
backend/cpp/llama-cpp/message_content.h
Normal file
@@ -0,0 +1,192 @@
+#pragma once
+
+#include <string>
+#include <vector>
+
+#include <nlohmann/json.hpp>
+
+namespace llama_grpc {
+
+// Normalizes a proto message's content string into the JSON value used when
+// reconstructing OpenAI-format messages for the tokenizer (jinja) template.
+//
+// Shared by the streaming (PredictStream) and non-streaming (Predict) message
+// reconstruction paths so the two cannot drift.
+//
+// LocalAI's Go layer (schema.Messages.ToProto) always sends content as a plain
+// text string; multimodal media travels in separate proto fields, never inside
+// content. So user/system/developer content is *only ever* opaque text and must
+// NOT be JSON-sniffed: a prompt that merely looks like JSON (e.g. an ingredient
+// list ["1/4 cup sugar", ...]) would otherwise be reinterpreted as structured
+// content parts and rejected by oaicompat_chat_params_parse with
+// "unsupported content[].type" (https://github.com/mudler/LocalAI/issues/10524).
+// (developer is OpenAI's modern system alias - same "human-authored text" nature.)
+//
+// For assistant/tool messages we still collapse a literal JSON null/object
+// (tool-call bookkeeping) to a string, but we never turn a plain string into an
+// array/scalar. The array defense is therefore role-independent (arrays/scalars
+// fall through for every role); the role gate only governs the null/object case.
+inline nlohmann::ordered_json normalize_message_content(const std::string& role,
+                                                        const std::string& content) {
+    nlohmann::ordered_json content_val = content;
+    if (role != "user" && role != "system" && role != "developer") {
+        try {
+            nlohmann::ordered_json parsed = nlohmann::ordered_json::parse(content);
+            if (parsed.is_null()) {
+                content_val = "";
+            } else if (parsed.is_object()) {
+                content_val = parsed.dump();
+            }
+            // arrays / scalars: keep the original plain-text string as-is
+        } catch (const nlohmann::ordered_json::parse_error&) {
+            // Not JSON, already the plain string
+        }
+    }
+    return content_val;
+}
+
+// Final safety pass applied to each reconstructed OpenAI message right before it
+// is handed to oaicompat_chat_params_parse (jinja templating). Jinja templates
+// assume content is a string: a literal null breaks slicing such as
+// message.content[:N] (#7324), and a tool message with array content is rejected
+// (#7528). A multimodal user message legitimately carries a typed-part array
+// ({type:text}, {type:image_url}, ...), which must be left intact. Shared by the
+// streaming and non-streaming paths so this invariant cannot drift between them.
+inline void normalize_template_message(nlohmann::ordered_json& msg) {
+    if (!msg.contains("content")) {
+        msg["content"] = ""; // templates expect the field to exist
+        return;
+    }
+    nlohmann::ordered_json& content = msg["content"];
+    const std::string role = (msg.contains("role") && msg["role"].is_string())
+                                 ? msg["role"].get<std::string>()
+                                 : std::string();
+    if (content.is_null()) {
+        content = ""; // #7324: null would crash content[:N] slicing
+    } else if (role == "tool" && content.is_array()) {
+        content = content.dump(); // #7528: tool messages must have string content
+    } else if (!content.is_string() && !content.is_array()) {
+        if (content.is_object()) {
+            content = content.dump(); // tool-call bookkeeping object -> string
+        } else {
+            content = ""; // other scalar (number/bool) -> empty
+        }
+    }
+    // string, or a non-tool (multimodal) typed-part array: leave untouched
+}
+
+// One proto message's data, flattened to plain types so the reconstruction logic
+// can be shared and unit-tested without protobuf. The streaming and non-streaming
+// predict paths both populate this from proto::Message + the request's media.
+struct ReconstructedMessageInput {
+    std::string role;
+    std::string content;              // proto.Message.content (always a plain string)
+    std::string name;
+    std::string tool_call_id;
+    std::string reasoning_content;
+    std::string tool_calls;           // tool_calls as a JSON string, or empty
+    bool is_last_user_msg = false;    // attach request media to this message
+    std::vector<std::string> images;  // base64 (jpeg)
+    std::vector<std::string> audios;  // base64 (wav)
+    std::vector<std::string> videos;  // base64
+};
+
+// Appends the request's media as OpenAI typed content parts. Imperative (not
+// brace-init) to avoid nlohmann's object-vs-array initializer-list ambiguity.
+inline void append_media_parts(nlohmann::ordered_json& content_array,
+                               const std::vector<std::string>& images,
+                               const std::vector<std::string>& audios,
+                               const std::vector<std::string>& videos) {
+    for (const auto& img : images) {
+        nlohmann::ordered_json image_chunk;
+        image_chunk["type"] = "image_url";
+        nlohmann::ordered_json image_url;
+        image_url["url"] = "data:image/jpeg;base64," + img;
+        image_chunk["image_url"] = image_url;
+        content_array.push_back(image_chunk);
+    }
+    for (const auto& aud : audios) {
+        nlohmann::ordered_json audio_chunk;
+        audio_chunk["type"] = "input_audio";
+        nlohmann::ordered_json input_audio;
+        input_audio["data"] = aud;
+        input_audio["format"] = "wav"; // default; could be made configurable
+        audio_chunk["input_audio"] = input_audio;
+        content_array.push_back(audio_chunk);
+    }
+    for (const auto& vid : videos) {
+        nlohmann::ordered_json video_chunk;
+        video_chunk["type"] = "input_video";
+        nlohmann::ordered_json input_video;
+        input_video["data"] = vid;
+        video_chunk["input_video"] = input_video;
+        content_array.push_back(video_chunk);
+    }
+}
+
+// Reconstructs a single OpenAI-format message (the object fed to
+// oaicompat_chat_params_parse) from a proto message. Shared by PredictStream and
+// Predict so the content/multimodal/tool_calls handling cannot drift between the
+// two stream modes (it previously lived as two ~150-line copies with a redundant
+// Predict-only tool_calls->" " branch). Guarantees content is always a string or
+// a typed-part array, never null/missing.
+inline nlohmann::ordered_json build_reconstructed_message(const ReconstructedMessageInput& in) {
+    nlohmann::ordered_json msg_json;
+    msg_json["role"] = in.role;
+    const bool has_media = !in.images.empty() || !in.audios.empty() || !in.videos.empty();
+
+    if (!in.content.empty()) {
+        nlohmann::ordered_json content_val = normalize_message_content(in.role, in.content);
+        if (content_val.is_string() && in.is_last_user_msg && has_media) {
+            // Last user message + media: build a typed-part array (text first).
+            nlohmann::ordered_json content_array = nlohmann::ordered_json::array();
+            nlohmann::ordered_json text_part;
+            text_part["type"] = "text";
+            text_part["text"] = content_val.get<std::string>();
+            content_array.push_back(text_part);
+            append_media_parts(content_array, in.images, in.audios, in.videos);
+            msg_json["content"] = content_array;
+        } else if (content_val.is_null()) {
+            msg_json["content"] = "";
+        } else {
+            msg_json["content"] = content_val;
+        }
+    } else if (in.is_last_user_msg && has_media) {
+        // No text but media on the last user message: media-only typed array.
+        nlohmann::ordered_json content_array = nlohmann::ordered_json::array();
+        append_media_parts(content_array, in.images, in.audios, in.videos);
+        msg_json["content"] = content_array;
+    } else {
+        // Empty content (any role, incl. tool/assistant): templates need a string.
+        msg_json["content"] = "";
+    }
+
+    if (!in.name.empty()) {
+        msg_json["name"] = in.name;
+    }
+    if (!in.tool_call_id.empty()) {
+        msg_json["tool_call_id"] = in.tool_call_id;
+    }
+    if (!in.reasoning_content.empty()) {
+        msg_json["reasoning_content"] = in.reasoning_content;
+    }
+    if (!in.tool_calls.empty()) {
+        try {
+            nlohmann::ordered_json tool_calls = nlohmann::ordered_json::parse(in.tool_calls);
+            msg_json["tool_calls"] = tool_calls;
+            // tool_calls + empty/blank content: use " " not "", because llama.cpp's
+            // common_chat_msgs_to_json_oaicompat turns "" into null, which breaks
+            // templates that slice message.content[:tool_start_length] (#7324).
+            if (!msg_json.contains("content") ||
+                (msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
+                msg_json["content"] = " ";
+            }
+        } catch (const nlohmann::ordered_json::parse_error&) {
+            // Malformed tool_calls JSON: leave content as-is (prior behavior).
+        }
+    }
+
+    return msg_json;
+}
+
+} // namespace llama_grpc
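The role gate documented in `message_content.h` can be restated as a small decision table. The sketch below is illustrative only and deliberately leaves out the JSON parser: `ParsedKind`, `ContentAction`, `is_human_authored_role` and `decide_content_action` are hypothetical names that do not exist in the patch, and the caller is assumed to have already classified what the content text would parse as.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for "what would a JSON parser make of this text?".
enum class ParsedKind { NotJson, Null, Object, Array, Scalar };

// Hypothetical names for the three outcomes the header's comments describe.
enum class ContentAction { KeepVerbatim, CollapseToEmpty, Redump };

// user/system/developer content is treated as opaque text and never parsed.
inline bool is_human_authored_role(const std::string& role) {
    return role == "user" || role == "system" || role == "developer";
}

inline ContentAction decide_content_action(const std::string& role, ParsedKind kind) {
    if (is_human_authored_role(role)) {
        return ContentAction::KeepVerbatim;  // role gate: no JSON sniffing at all
    }
    switch (kind) {
        case ParsedKind::Null:   return ContentAction::CollapseToEmpty;  // "null" -> ""
        case ParsedKind::Object: return ContentAction::Redump;           // object -> dumped string
        default:                 return ContentAction::KeepVerbatim;     // arrays, scalars, non-JSON
    }
}

// Small self-check mirroring a few of the cases the unit tests cover.
inline void check_decision_table() {
    assert(decide_content_action("user", ParsedKind::Array) == ContentAction::KeepVerbatim);
    assert(decide_content_action("assistant", ParsedKind::Null) == ContentAction::CollapseToEmpty);
    assert(decide_content_action("tool", ParsedKind::Object) == ContentAction::Redump);
}
```

Read this way, only the null and object rows depend on the role; arrays and scalars stay plain text for every role, which is the invariant the #10524 regression tests rely on.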
234
backend/cpp/llama-cpp/message_content_test.cpp
Normal file
@@ -0,0 +1,234 @@
+// Unit tests for the shared message-reconstruction helpers (message_content.h).
+//
+// Build & run standalone (nlohmann/json single header on the include path):
+//   g++ -std=c++17 -I<dir-with-nlohmann> message_content_test.cpp -o t && ./t
+// or via CMake: -DLLAMA_GRPC_BUILD_TESTS=ON then ctest.
+//
+// Regression coverage for:
+//   #10524 - a user/system prompt that is itself a JSON-array string must stay
+//            plain text, never be reinterpreted as OpenAI structured parts.
+//   #7324  - assistant/tool null content -> "" (templates slice content[:N]);
+//            assistant+tool_calls+empty content -> " " (not "", which becomes null).
+//   #7528  - tool message array content must reach the template as a string.
+//   multimodal - last user message text + media -> typed-part array, media kept.
+
+#include <cassert>
+#include <iostream>
+#include <string>
+
+#include "message_content.h"
+
+using nlohmann::ordered_json;
+using llama_grpc::normalize_message_content;
+using llama_grpc::normalize_template_message;
+using llama_grpc::build_reconstructed_message;
+using llama_grpc::ReconstructedMessageInput;
+
+static int failures = 0;
+
+static void check(bool ok, const std::string& name, const std::string& detail = "") {
+    if (!ok) {
+        std::cerr << "FAIL " << name << (detail.empty() ? "" : ": " + detail) << "\n";
+        failures++;
+    }
+}
+
+// ---- normalize_message_content -------------------------------------------
+
+static void expect_norm_string(const char* name, const std::string& role,
+                               const std::string& content, const std::string& want) {
+    auto got = normalize_message_content(role, content);
+    if (!got.is_string()) {
+        check(false, name, "expected a JSON string, got " +
+              std::string(got.is_array() ? "array" : got.is_object() ? "object" : "other") +
+              " (" + got.dump() + ")");
+        return;
+    }
+    check(got.get<std::string>() == want, name, "expected \"" + want + "\", got \"" + got.get<std::string>() + "\"");
+}
+
+static void test_normalize() {
+    const std::string ingredients = R"(["1/4 cup brown sugar, packed","1 pound ground beef"])";
+
+    // #10524 - JSON-array text must stay a string. Role-INDEPENDENT array defense.
+    for (const char* role : {"user", "system", "developer", "function", "assistant", "tool"}) {
+        expect_norm_string((std::string("json_array_stays_text:") + role).c_str(), role, ingredients, ingredients);
+    }
+
+    // #10524 - user/system/developer JSON-object text stays verbatim (NOT re-dumped).
+    expect_norm_string("user_json_object_verbatim", "user", R"({"a":1})", R"({"a":1})");
+    expect_norm_string("system_json_object_verbatim", "system", R"({"a":1})", R"({"a":1})");
+    expect_norm_string("developer_json_object_verbatim", "developer", R"({"a":1})", R"({"a":1})");
+
+    // Plain text unchanged for all roles.
+    expect_norm_string("user_plain_text", "user", "hello world", "hello world");
+    expect_norm_string("assistant_non_json_text_kept", "assistant", "hi [unclosed", "hi [unclosed");
+
+    // #7324 boundary - user/system/developer literal "null" preserved (never parsed).
+    expect_norm_string("user_literal_null_stays", "user", "null", "null");
+    expect_norm_string("system_literal_null_stays", "system", "null", "null");
+    expect_norm_string("developer_literal_null_stays", "developer", "null", "null");
+
+    // #7324 - assistant/tool literal null collapses to empty string.
+    expect_norm_string("assistant_null_to_empty", "assistant", "null", "");
+    expect_norm_string("tool_null_to_empty", "tool", "null", "");
+
+    // #7324/#7528 - assistant/tool object bookkeeping stringified (stays a string).
+    check(normalize_message_content("assistant", R"({"tool":"x"})").is_string(), "assistant_object_stringified");
+    check(normalize_message_content("tool", R"({"error":"boom"})").is_string(), "tool_object_stringified");
+
+    // #10524-family - a bare scalar that parses as a JSON number stays the string.
+    expect_norm_string("assistant_scalar_number_stays_string", "assistant", "42", "42");
+
+    // baseline - empty content stays empty.
+    expect_norm_string("user_empty_stays_empty", "user", "", "");
+}
+
+// ---- normalize_template_message (BEFORE TEMPLATE sanitizer) ---------------
+
+static void test_template_sanitizer() {
+    // #7528 - a tool message with an ACTUAL array becomes a string.
+    {
+        ordered_json msg = {{"role", "tool"}, {"content", ordered_json::array({{{"type", "text"}, {"text", "r"}}})}};
+        normalize_template_message(msg);
+        check(msg["content"].is_string(), "before_template_tool_array_to_string", "got " + msg["content"].dump());
+    }
+    // #7324 - null content -> "" for any role.
+    {
+        ordered_json msg = {{"role", "assistant"}, {"content", nullptr}};
+        normalize_template_message(msg);
+        check(msg["content"].is_string() && msg["content"] == "", "before_template_null_to_empty");
+    }
+    // object content -> dumped string (would otherwise throw at the template).
+    {
+        ordered_json msg = {{"role", "assistant"}, {"content", {{"x", 1}}}};
+        normalize_template_message(msg);
+        check(msg["content"].is_string(), "before_template_object_to_string", "got " + msg["content"].dump());
+    }
+    // missing content field -> "".
+    {
+        ordered_json msg = {{"role", "user"}};
+        normalize_template_message(msg);
+        check(msg.contains("content") && msg["content"] == "", "before_template_missing_to_empty");
+    }
+    // multimodal: a well-typed user array must be left UNTOUCHED (role!=tool).
+    {
+        ordered_json parts = ordered_json::array();
+        parts.push_back({{"type", "text"}, {"text", "x"}});
+        ordered_json img; img["type"] = "image_url"; img["image_url"] = {{"url", "data:..."}};
+        parts.push_back(img);
+        ordered_json msg = {{"role", "user"}, {"content", parts}};
+        normalize_template_message(msg);
+        check(msg["content"].is_array() && msg["content"].size() == 2, "before_template_user_typed_array_preserved",
+              "got " + msg["content"].dump());
+    }
+    // a plain string is left untouched.
+    {
+        ordered_json msg = {{"role", "user"}, {"content", "hello"}};
+        normalize_template_message(msg);
+        check(msg["content"] == "hello", "before_template_string_untouched");
+    }
+}
+
+// ---- build_reconstructed_message ----------------------------------------
+
+static void test_reconstruction() {
+    const std::string ingredients = R"(["1/4 cup brown sugar","1 pound ground beef"])";
+
+    // #10524 end-state - user JSON-array text, no media -> string content.
+    {
+        ReconstructedMessageInput in;
+        in.role = "user"; in.content = ingredients;
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_string() && m["content"] == ingredients, "recon_user_json_array_string",
+              "got " + m["content"].dump());
+    }
+    // multimodal - user text + one image on last user msg -> typed array, image kept.
+    {
+        ReconstructedMessageInput in;
+        in.role = "user"; in.content = ingredients; in.is_last_user_msg = true;
+        in.images.push_back("BASE64IMG");
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_array() && m["content"].size() == 2, "recon_multimodal_text_plus_image",
+              "got " + m["content"].dump());
+        check(m["content"][0]["type"] == "text" && m["content"][0]["text"] == ingredients, "recon_multimodal_text_first");
+        check(m["content"][1]["type"] == "image_url", "recon_multimodal_image_kept");
+    }
+    // multimodal media-only - empty text + image on last user msg.
+    {
+        ReconstructedMessageInput in;
+        in.role = "user"; in.content = ""; in.is_last_user_msg = true;
+        in.images.push_back("BASE64IMG");
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_array() && m["content"].size() == 1 && m["content"][0]["type"] == "image_url",
+              "recon_media_only", "got " + m["content"].dump());
+    }
+    // #7528 - tool array-string content stays a string.
+    {
+        ReconstructedMessageInput in;
+        in.role = "tool"; in.content = R"(["a","b"])"; in.tool_call_id = "call_1";
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_string() && m["content"] == R"(["a","b"])", "recon_tool_array_string",
+              "got " + m["content"].dump());
+        check(m["tool_call_id"] == "call_1", "recon_tool_call_id_set");
+    }
+    // tool empty content -> "".
+    {
+        ReconstructedMessageInput in;
+        in.role = "tool"; in.content = "";
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_string() && m["content"] == "", "recon_tool_empty_to_string");
+    }
+    // #7324 - assistant + tool_calls + empty content -> " " (single space, not "").
+    {
+        ReconstructedMessageInput in;
+        in.role = "assistant"; in.content = "";
+        in.tool_calls = R"([{"id":"c1","type":"function","function":{"name":"f","arguments":"{}"}}])";
+        auto m = build_reconstructed_message(in);
+        check(m["content"].is_string() && m["content"] == " ", "recon_toolcalls_empty_content_space",
+              "got " + m["content"].dump());
+        check(m["tool_calls"].is_array() && m["tool_calls"].size() == 1, "recon_toolcalls_parsed");
+    }
+    // assistant + tool_calls + real content keeps the content.
+    {
+        ReconstructedMessageInput in;
+        in.role = "assistant"; in.content = "I'll call f";
+        in.tool_calls = R"([{"id":"c1","type":"function","function":{"name":"f","arguments":"{}"}}])";
+        auto m = build_reconstructed_message(in);
+        check(m["content"] == "I'll call f", "recon_toolcalls_with_content_kept");
+    }
+    // assistant null content -> "".
+    {
+        ReconstructedMessageInput in;
+        in.role = "assistant"; in.content = "null";
+        auto m = build_reconstructed_message(in);
+        check(m["content"] == "", "recon_assistant_null_to_empty");
+    }
+    // malformed tool_calls JSON must not throw; content preserved.
+    {
+        ReconstructedMessageInput in;
+        in.role = "assistant"; in.content = "hi"; in.tool_calls = "{not json";
+        auto m = build_reconstructed_message(in);
+        check(m["content"] == "hi" && !m.contains("tool_calls"), "recon_malformed_toolcalls_safe");
+    }
+    // optional fields: name + reasoning carried through.
+    {
+        ReconstructedMessageInput in;
+        in.role = "tool"; in.content = "result"; in.name = "get_weather"; in.reasoning_content = "thinking";
+        auto m = build_reconstructed_message(in);
+        check(m["name"] == "get_weather" && m["reasoning_content"] == "thinking", "recon_optional_fields");
+    }
+}
+
+int main() {
+    test_normalize();
+    test_template_sanitizer();
+    test_reconstruction();
+
+    if (failures == 0) {
+        std::cout << "OK: all message_content tests passed\n";
+        return 0;
+    }
+    std::cerr << failures << " test(s) failed\n";
+    return 1;
+}
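The `recon_toolcalls_empty_content_space` expectation in the test file above is easy to misread, so here is a minimal restatement of just that rule using plain strings. `content_for_tool_call_message` is a hypothetical helper written for illustration; the patch implements the rule inline in `build_reconstructed_message`, and only after the `tool_calls` JSON has parsed successfully.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: chooses the content string for a message that carries
// valid tool_calls. Per the patch's comment, an empty string is later turned
// into null by llama.cpp's common_chat_msgs_to_json_oaicompat, which breaks
// templates that slice message.content[:N], so a single space is used instead.
inline std::string content_for_tool_call_message(const std::string& content) {
    return content.empty() ? std::string(" ") : content;
}

// Self-check for the two cases the unit tests distinguish.
inline void check_tool_call_content_rule() {
    assert(content_for_tool_call_message("") == " ");                      // empty -> single space
    assert(content_for_tool_call_message("I'll call f") == "I'll call f"); // real text is kept
}
```

Messages without `tool_calls` are not affected by this rule: their empty content stays `""`, as the `recon_tool_empty_to_string` case checks.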
@@ -14,6 +14,22 @@ mkdir -p $CURDIR/package/lib
 cp -avrf $CURDIR/llama-cpp-* $CURDIR/package/
 cp -rfv $CURDIR/run.sh $CURDIR/package/
 
+# Bundle the ggml shared backends produced by the CPU_ALL_VARIANTS build (libggml-base.so,
+# libggml.so, libllama.so and the per-microarch libggml-cpu-*.so), all into package/lib.
+#
+# Two distinct resolution mechanisms both land here:
+# - NEEDED deps (libggml-base/libggml/libllama): resolved by the dynamic linker via the
+#   LD_LIBRARY_PATH=$CURDIR/lib that run.sh exports.
+# - The per-microarch libggml-cpu-*.so are NOT linked; ggml *discovers* them at runtime by
+#   scanning the executable's own directory (readlink /proc/self/exe). run.sh launches via
+#   the bundled $CURDIR/lib/ld.so, so /proc/self/exe -> .../lib/ld.so and ggml scans lib/.
+#   That is why the variants must sit in lib/ (next to ld.so), not just on the link path.
+# No-op on builds (arm64/darwin) that don't produce the all-variants set.
+if [ -d "$CURDIR/ggml-shared-libs" ]; then
+    echo "Bundling ggml shared backends (CPU_ALL_VARIANTS)..."
+    cp -avf $CURDIR/ggml-shared-libs/*.so* $CURDIR/package/lib/
+fi
+
 # Detect architecture and copy appropriate libraries
 if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
     # x86_64 architecture
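The packaging comment above explains that the per-microarch libraries are found by scanning the directory of whatever `/proc/self/exe` resolves to. The sketch below only models that path arithmetic; it does not call ggml or `readlink`, and `scan_dir_for` is a hypothetical helper name. The `/pkg/...` paths are made-up examples, not paths from the repository.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: given the path that /proc/self/exe resolves to, return
// the directory a "look next to the executable" scan would search.
inline std::string scan_dir_for(const std::string& resolved_exe) {
    const std::string::size_type slash = resolved_exe.rfind('/');
    if (slash == std::string::npos) return ".";  // bare name: current directory
    if (slash == 0) return "/";                  // file directly under the root
    return resolved_exe.substr(0, slash);
}

// When the process is started through the bundled loader, the "executable" is
// the loader itself, so the scan lands in lib/ rather than the package root.
inline void check_scan_dir() {
    assert(scan_dir_for("/pkg/lib/ld.so") == "/pkg/lib");
    assert(scan_dir_for("/pkg/llama-cpp-cpu-all") == "/pkg");
}
```

Under that assumption, copying the `libggml-cpu-*.so` variants anywhere other than `lib/` would leave them undiscovered even if `LD_LIBRARY_PATH` pointed at them.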
@@ -18,6 +18,10 @@ done
 
 cp -r CMakeLists.txt llama.cpp/tools/grpc-server/
 cp -r grpc-server.cpp llama.cpp/tools/grpc-server/
+# Shared message-reconstruction helpers (included by grpc-server.cpp) and their
+# unit test (compiled only when -DLLAMA_GRPC_BUILD_TESTS=ON).
+cp -r message_content.h llama.cpp/tools/grpc-server/
+cp -r message_content_test.cpp llama.cpp/tools/grpc-server/
 cp -rfv llama.cpp/vendor/nlohmann/json.hpp llama.cpp/tools/grpc-server/
 cp -rfv llama.cpp/vendor/cpp-httplib/httplib.h llama.cpp/tools/grpc-server/
 
@@ -2,7 +2,7 @@
 set -ex
 
 # Get the absolute current dir where the script is located
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
 cd /
 
@@ -12,55 +12,41 @@ grep -e "flags" /proc/cpuinfo | head -1
 
 BINARY=llama-cpp-fallback
 
-if grep -q -e "\savx\s" /proc/cpuinfo ; then
-    echo "CPU: AVX found OK"
-    if [ -e $CURDIR/llama-cpp-avx ]; then
-        BINARY=llama-cpp-avx
-    fi
-fi
-
-if grep -q -e "\savx2\s" /proc/cpuinfo ; then
-    echo "CPU: AVX2 found OK"
-    if [ -e $CURDIR/llama-cpp-avx2 ]; then
-        BINARY=llama-cpp-avx2
-    fi
-fi
-
-# Check avx 512
-if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
-    echo "CPU: AVX512F found OK"
-    if [ -e $CURDIR/llama-cpp-avx512 ]; then
-        BINARY=llama-cpp-avx512
-    fi
+# CPU images (x86, arm64, darwin) ship a single llama-cpp-cpu-all built with ggml
+# CPU_ALL_VARIANTS: ggml's backend registry dlopens the best libggml-cpu-*.so for this
+# host, so no shell-side AVX probing. GPU images (cublas/sycl/vulkan/hipblas) ship only
+# llama-cpp-fallback (the accelerator does the compute), so fall back to it when absent.
+if [ -e "$CURDIR"/llama-cpp-cpu-all ]; then
+    BINARY=llama-cpp-cpu-all
 fi
 
 if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
-    if [ -e $CURDIR/llama-cpp-grpc ]; then
+    if [ -e "$CURDIR"/llama-cpp-grpc ]; then
         BINARY=llama-cpp-grpc
     fi
 fi
 
 # Extend ld library path with the dir where this script is located/lib
 if [ "$(uname)" == "Darwin" ]; then
-    export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
-    #export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+    #export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
 else
-    export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
     # Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
     if [ -d "$CURDIR/lib/rocblas/library" ]; then
-        export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
+        export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
     fi
 fi
 
 # If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
     echo "Using lib/ld.so"
     echo "Using binary: $BINARY"
-    exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
 fi
 
 echo "Using binary: $BINARY"
-exec $CURDIR/$BINARY "$@"
+exec "$CURDIR"/$BINARY "$@"
 
 # We should never reach this point, however just in case we do, run fallback
-exec $CURDIR/llama-cpp-fallback "$@"
+exec "$CURDIR"/llama-cpp-fallback "$@"
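After this change the launcher's binary choice reduces to three inputs. The sketch below restates that order as a pure function so the precedence is easy to see; it is written in C++ purely for illustration (the real logic is the shell shown above), and `select_binary` and its parameter names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical restatement of the launcher's selection order:
//   1. start from llama-cpp-fallback,
//   2. prefer llama-cpp-cpu-all when that file is present,
//   3. let llama-cpp-grpc override both when LLAMACPP_GRPC_SERVERS is set
//      and the grpc binary is present.
inline std::string select_binary(bool has_cpu_all, bool grpc_servers_set, bool has_grpc_binary) {
    std::string binary = "llama-cpp-fallback";
    if (has_cpu_all) {
        binary = "llama-cpp-cpu-all";
    }
    if (grpc_servers_set && has_grpc_binary) {
        binary = "llama-cpp-grpc";
    }
    return binary;
}

// Self-check covering a GPU-style image, a CPU image, and the grpc override.
inline void check_select_binary() {
    assert(select_binary(false, false, false) == "llama-cpp-fallback");
    assert(select_binary(true, false, false) == "llama-cpp-cpu-all");
    assert(select_binary(true, true, true) == "llama-cpp-grpc");
}
```

Note that setting `LLAMACPP_GRPC_SERVERS` without shipping `llama-cpp-grpc` leaves the earlier choice in place, which matches the nested `if` in the script.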
@@ -51,6 +51,14 @@ add_library(hw_grpc_proto STATIC
   ${HW_GRPC_SRCS} ${HW_GRPC_HDRS}
   ${HW_PROTO_SRCS} ${HW_PROTO_HDRS})
 target_include_directories(hw_grpc_proto PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
+# The generated proto/grpc sources include protobuf and grpc++ headers, so this
+# library must see their include dirs. Linking the imported targets propagates
+# them. On Linux the apt headers live in /usr/include (default search path) so
+# this was a no-op; on macOS the Homebrew headers are under /opt/homebrew and
+# would otherwise be missed (runtime_version.h not found).
+target_link_libraries(hw_grpc_proto PUBLIC
+  protobuf::libprotobuf
+  gRPC::grpc++)
 
 # Build only the pf static lib (+ ggml) from the engine tree — no CLI/bench/tests.
 # PF_VULKAN is honored when passed on the cmake command line (it lands in the
@@ -2,7 +2,13 @@
 # Entry point for the privacy-filter backend image / BACKEND_BINARY mode.
 set -e
 CURDIR=$(dirname "$(realpath "$0")")
-export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
+# macOS has no bundled ld.so; the darwin package ships only dylibs under lib/,
+# resolved via DYLD_LIBRARY_PATH (the ld.so branch below is skipped there).
+if [ "$(uname)" = "Darwin" ]; then
+    export DYLD_LIBRARY_PATH="$CURDIR/lib:$DYLD_LIBRARY_PATH"
+else
+    export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
+fi
 if [ -f "$CURDIR/lib/ld.so" ]; then
     exec "$CURDIR/lib/ld.so" "$CURDIR/grpc-server" "$@"
 fi
71
backend/cpp/run-unit-tests.sh
Executable file
@@ -0,0 +1,71 @@
+#!/bin/bash
+#
+# Discovers and runs every standalone C++ unit test under backend/cpp/.
+#
+# A "standalone" unit test is a *_test.cpp that depends only on the C++ standard
+# library and nlohmann/json (single header) - i.e. it exercises pure helpers and
+# does not need the full llama.cpp + gRPC backend build. Tests that DO need the
+# backend build use the CMake/ctest path (e.g. -DLLAMA_GRPC_BUILD_TESTS=ON)
+# instead and are skipped here.
+#
+# This keeps CI generic: adding a new pure-C++ unit test file named *_test.cpp in
+# an active backend source dir is picked up automatically, with no CI edits.
+#
+# Env:
+#   NLOHMANN_INCLUDE  include dir that contains nlohmann/json.hpp. If unset, the
+#                     nlohmann/json single header is fetched to a temp dir.
+#   CXX               compiler (default: g++).
+#   JSON_VERSION      nlohmann/json tag to fetch when NLOHMANN_INCLUDE is unset
+#                     (default: v3.11.3).
+set -uo pipefail
+
+ROOT="$(cd "$(dirname "$0")" && pwd)"
+CXX="${CXX:-g++}"
+JSON_VERSION="${JSON_VERSION:-v3.11.3}"
+
+JSON_INC="${NLOHMANN_INCLUDE:-}"
+if [ -z "$JSON_INC" ]; then
+    JSON_INC="$(mktemp -d)"
+    mkdir -p "$JSON_INC/nlohmann"
+    echo "Fetching nlohmann/json ${JSON_VERSION} single header..."
+    if ! curl -L -sf \
+        "https://raw.githubusercontent.com/nlohmann/json/${JSON_VERSION}/single_include/nlohmann/json.hpp" \
+        -o "$JSON_INC/nlohmann/json.hpp"; then
+        echo "ERROR: failed to fetch nlohmann/json header" >&2
+        exit 1
+    fi
+fi
+
+# Active source dirs only - exclude per-variant build copies, dev snapshots and
+# the vendored upstream llama.cpp tree.
+mapfile -t tests < <(find "$ROOT" -name '*_test.cpp' \
+    -not -path '*/llama.cpp/*' \
+    -not -path '*-build/*' \
+    -not -path '*-dev/*' \
+    -not -path '*fallback*' | sort)
+
+if [ "${#tests[@]}" -eq 0 ]; then
+    echo "No standalone C++ unit tests found under $ROOT"
+    exit 0
+fi
+
+fail=0
+for test_src in "${tests[@]}"; do
+    name="$(basename "$test_src" .cpp)"
+    bin="$(mktemp -d)/$name"
+    echo "==> $test_src"
+    if ! "$CXX" -std=c++17 -Wall -Wextra \
+        -I"$JSON_INC" -I"$(dirname "$test_src")" \
+        "$test_src" -o "$bin"; then
+        echo "COMPILE FAILED: $test_src" >&2
+        fail=1
+        continue
|
||||||
|
fi
|
||||||
|
if ! "$bin"; then
|
||||||
|
echo "TEST FAILED: $test_src" >&2
|
||||||
|
fail=1
|
||||||
|
fi
|
||||||
|
done
|
||||||
|
|
||||||
|
echo "Ran ${#tests[@]} standalone C++ unit test file(s)"
|
||||||
|
exit "$fail"
|
||||||
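The discovery step in `run-unit-tests.sh` above can be tried in isolation. The following is a minimal sketch, assuming the same `find` exclusions as the script; the function name `discover_tests` and every path in the throwaway tree are made up for illustration and are not part of the commit.

```shell
#!/bin/bash
# Hypothetical helper: list standalone test sources under a root directory,
# applying the same exclusions as the find invocation in run-unit-tests.sh.
discover_tests() {
    local root="$1"
    find "$root" -name '*_test.cpp' \
        -not -path '*/llama.cpp/*' \
        -not -path '*-build/*' \
        -not -path '*-dev/*' \
        -not -path '*fallback*' | sort
}

# Throwaway tree: one active source dir, one vendored upstream tree,
# one per-variant build copy and one fallback copy (all names invented).
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/llama-cpp/llama.cpp/tests" \
         "$demo_root/llama-cpp-avx2-build" \
         "$demo_root/llama-cpp-fallback"
touch "$demo_root/llama-cpp/json_helpers_test.cpp" \
      "$demo_root/llama-cpp/llama.cpp/tests/upstream_test.cpp" \
      "$demo_root/llama-cpp-avx2-build/json_helpers_test.cpp" \
      "$demo_root/llama-cpp-fallback/json_helpers_test.cpp"

# Only the file in the active source dir should be listed.
discover_tests "$demo_root"
```

Note that `*fallback*` matches anywhere in the path, so a test file whose own name contains "fallback" would also be skipped.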
@@ -65,6 +65,29 @@ turboquant-avx:
|
|||||||
turboquant-fallback:
|
turboquant-fallback:
|
||||||
$(call turboquant-build,fallback,-DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off,--target grpc-server)
|
$(call turboquant-build,fallback,-DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off,--target grpc-server)
|
||||||
|
|
||||||
|
# Single-build CPU backend via ggml CPU_ALL_VARIANTS (mirrors llama-cpp-cpu-all).
|
||||||
|
# turboquant reuses backend/cpp/llama-cpp's CMakeLists.txt (hw_grpc_proto STATIC) and
|
||||||
|
# Makefile (SHARED_LIBS make-var + EXTRA_CMAKE_ARGS), so this passes the same overrides
|
||||||
|
# through to the copied build: SHARED_LIBS=ON, the DL flags, and --target ggml (which
|
||||||
|
# pulls in the per-microarch libggml-cpu-*.so via ggml's add_dependencies). The .so set
|
||||||
|
# is collected for package.sh to bundle into package/lib.
|
||||||
|
turboquant-cpu-all:
|
||||||
|
rm -rf $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build
|
||||||
|
cp -rf $(LLAMA_CPP_DIR) $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build
|
||||||
|
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build purge
|
||||||
|
bash $(CURRENT_MAKEFILE_DIR)/patch-grpc-server.sh $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build/grpc-server.cpp
|
||||||
|
$(info $(GREEN)I turboquant build info:cpu-all-variants$(RESET))
|
||||||
|
LLAMA_REPO=$(LLAMA_REPO) LLAMA_VERSION=$(TURBOQUANT_VERSION) \
|
||||||
|
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build llama.cpp
|
||||||
|
bash $(CURRENT_MAKEFILE_DIR)/apply-patches.sh $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build/llama.cpp $(PATCHES_DIR)
|
||||||
|
SHARED_LIBS=ON EXTRA_CMAKE_ARGS="-DGGML_BACKEND_DL=ON -DGGML_CPU_ALL_VARIANTS=ON" TARGET="--target grpc-server --target ggml" \
|
||||||
|
LLAMA_REPO=$(LLAMA_REPO) LLAMA_VERSION=$(TURBOQUANT_VERSION) \
|
||||||
|
$(MAKE) -C $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build grpc-server
|
||||||
|
cp -rfv $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build/grpc-server turboquant-cpu-all
|
||||||
|
rm -rf ggml-shared-libs && mkdir -p ggml-shared-libs
|
||||||
|
find $(CURRENT_MAKEFILE_DIR)/../turboquant-cpu-all-build/llama.cpp/build \( -name '*.so*' -o -name '*.dylib' \) -exec cp -av {} ggml-shared-libs/ \;
|
||||||
|
@echo "Collected ggml shared backends:" && ls -la ggml-shared-libs/
|
||||||
|
|
||||||
turboquant-grpc:
|
turboquant-grpc:
|
||||||
$(call turboquant-build,grpc,-DGGML_RPC=ON -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off,--target grpc-server --target rpc-server)
|
$(call turboquant-build,grpc,-DGGML_RPC=ON -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off,--target grpc-server --target rpc-server)
|
||||||
|
|
||||||
|
|||||||
@@ -14,6 +14,15 @@ mkdir -p $CURDIR/package/lib
|
|||||||
cp -avrf $CURDIR/turboquant-* $CURDIR/package/
|
cp -avrf $CURDIR/turboquant-* $CURDIR/package/
|
||||||
cp -rfv $CURDIR/run.sh $CURDIR/package/
|
cp -rfv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
|
# Bundle the ggml shared backends from the CPU_ALL_VARIANTS build into package/lib. ggml
|
||||||
|
# discovers the per-microarch libggml-cpu-*.so by scanning the executable directory, which
|
||||||
|
# (via the bundled lib/ld.so that run.sh launches through) resolves to lib/. See the
|
||||||
|
# matching comment in backend/cpp/llama-cpp/package.sh. No-op on the fallback/ROCm builds.
|
||||||
|
if [ -d "$CURDIR/ggml-shared-libs" ]; then
|
||||||
|
echo "Bundling ggml shared backends (CPU_ALL_VARIANTS)..."
|
||||||
|
cp -avf $CURDIR/ggml-shared-libs/*.so* $CURDIR/package/lib/
|
||||||
|
fi
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
||||||
# x86_64 architecture
|
# x86_64 architecture
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,54 +12,39 @@ grep -e "flags" /proc/cpuinfo | head -1
|
|||||||
|
|
||||||
BINARY=turboquant-fallback
|
BINARY=turboquant-fallback
|
||||||
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
# x86/arm64 ship a single turboquant-cpu-all built with ggml CPU_ALL_VARIANTS: ggml's
|
||||||
echo "CPU: AVX found OK"
|
# backend registry dlopens the best libggml-cpu-*.so for this host, so no shell-side
|
||||||
if [ -e $CURDIR/turboquant-avx ]; then
|
# probing. ROCm ships only turboquant-fallback, so fall back to it when cpu-all is absent.
|
||||||
BINARY=turboquant-avx
|
if [ -e "$CURDIR"/turboquant-cpu-all ]; then
|
||||||
fi
|
BINARY=turboquant-cpu-all
|
||||||
fi
|
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
|
||||||
echo "CPU: AVX2 found OK"
|
|
||||||
if [ -e $CURDIR/turboquant-avx2 ]; then
|
|
||||||
BINARY=turboquant-avx2
|
|
||||||
fi
|
|
||||||
fi
|
|
||||||
|
|
||||||
# Check avx 512
|
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
|
||||||
echo "CPU: AVX512F found OK"
|
|
||||||
if [ -e $CURDIR/turboquant-avx512 ]; then
|
|
||||||
BINARY=turboquant-avx512
|
|
||||||
fi
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
|
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
|
||||||
if [ -e $CURDIR/turboquant-grpc ]; then
|
if [ -e "$CURDIR"/turboquant-grpc ]; then
|
||||||
BINARY=turboquant-grpc
|
BINARY=turboquant-grpc
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# Extend ld library path with the dir where this script is located/lib
|
# Extend ld library path with the dir where this script is located/lib
|
||||||
if [ "$(uname)" == "Darwin" ]; then
|
if [ "$(uname)" == "Darwin" ]; then
|
||||||
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
else
|
else
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
|
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
|
||||||
if [ -d "$CURDIR/lib/rocblas/library" ]; then
|
if [ -d "$CURDIR/lib/rocblas/library" ]; then
|
||||||
export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
|
export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using binary: $BINARY"
|
echo "Using binary: $BINARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using binary: $BINARY"
|
echo "Using binary: $BINARY"
|
||||||
exec $CURDIR/$BINARY "$@"
|
exec "$CURDIR"/$BINARY "$@"
|
||||||
|
|
||||||
# We should never reach this point; if we somehow do, run the fallback binary
|
# We should never reach this point; if we somehow do, run the fallback binary
|
||||||
exec $CURDIR/turboquant-fallback "$@"
|
exec "$CURDIR"/turboquant-fallback "$@"
|
||||||
|
|||||||
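The binary selection in the new turboquant `run.sh` reduces to three checks. Below is a minimal sketch of that order, assuming the file names shown in the diff; the function `select_binary` and its arguments are hypothetical and only make the logic testable outside the script.

```shell
#!/bin/bash
# Hypothetical helper mirroring the selection order in run.sh:
#   1. start from turboquant-fallback
#   2. prefer turboquant-cpu-all when it is shipped
#   3. prefer turboquant-grpc when gRPC servers are configured and it is shipped
# Usage: select_binary DIR [GRPC_SERVERS]
select_binary() {
    local dir="$1" grpc_servers="${2:-}"
    local binary=turboquant-fallback
    if [ -e "$dir/turboquant-cpu-all" ]; then
        binary=turboquant-cpu-all
    fi
    if [ -n "$grpc_servers" ] && [ -e "$dir/turboquant-grpc" ]; then
        binary=turboquant-grpc
    fi
    printf '%s\n' "$binary"
}
```

In the real script the second argument corresponds to `$LLAMACPP_GRPC_SERVERS`, so a call would look like `select_binary "$CURDIR" "$LLAMACPP_GRPC_SERVERS"`.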
@@ -117,7 +117,8 @@ libgoacestepcpp-custom: CMakeLists.txt cpp/goacestepcpp.cpp cpp/goacestepcpp.h
|
|||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) --target goacestepcpp && \
|
cmake --build . --config Release -j$(JOBS) --target goacestepcpp && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgoacestepcpp.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libgoacestepcpp.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgoacestepcpp.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
test: acestep-cpp
|
test: acestep-cpp
|
||||||
@echo "Running acestep-cpp tests..."
|
@echo "Running acestep-cpp tests..."
|
||||||
|
|||||||
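The `mv ... || mv ...` change in `libgoacestepcpp-custom` accepts whichever suffix the build produced. A minimal sketch of the same pattern as a shell function follows; `collect_built_library` and its parameters are hypothetical names, not something the Makefile defines.

```shell
#!/bin/bash
# Hypothetical helper: move the built library to its final name, trying the
# .so output first and the .dylib output second, as the Makefile recipe does.
# Returns non-zero when neither file exists.
# Usage: collect_built_library BUILD_DIR BASENAME TARGET
collect_built_library() {
    local build_dir="$1" base="$2" target="$3"
    mv "$build_dir/$base.so" "$target" 2>/dev/null || \
        mv "$build_dir/$base.dylib" "$target" 2>/dev/null
}
```

Because both `mv` calls discard stderr, a missing library surfaces only as a non-zero exit status; a caller that wants a readable error would need to print one itself.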
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -22,7 +23,11 @@ func main() {
|
|||||||
// Get library name from environment variable, default to fallback
|
// Get library name from environment variable, default to fallback
|
||||||
libName := os.Getenv("ACESTEP_LIBRARY")
|
libName := os.Getenv("ACESTEP_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgoacestepcpp-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgoacestepcpp-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgoacestepcpp-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -13,6 +13,7 @@ mkdir -p $CURDIR/package/lib
|
|||||||
|
|
||||||
cp -avf $CURDIR/acestep-cpp $CURDIR/package/
|
cp -avf $CURDIR/acestep-cpp $CURDIR/package/
|
||||||
cp -fv $CURDIR/libgoacestepcpp-*.so $CURDIR/package/
|
cp -fv $CURDIR/libgoacestepcpp-*.so $CURDIR/package/
|
||||||
|
cp -fv $CURDIR/libgoacestepcpp-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,29 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single library variant (Metal or Accelerate). On Apple platforms
# CMake emits a .dylib for a SHARED library but a .so for a MODULE library,
# and the goacestepcpp target is built as a MODULE, so prefer .dylib and
# fall back to .so.
|
||||||
|
LIBRARY="$CURDIR/libgoacestepcpp-fallback.dylib"
|
||||||
|
if [ ! -e "$LIBRARY" ]; then
|
||||||
|
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
|
||||||
|
fi
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgoacestepcpp-avx.so ]; then
|
if [ -e "$CURDIR"/libgoacestepcpp-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgoacestepcpp-avx.so"
|
LIBRARY="$CURDIR/libgoacestepcpp-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgoacestepcpp-avx2.so ]; then
|
if [ -e "$CURDIR"/libgoacestepcpp-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgoacestepcpp-avx2.so"
|
LIBRARY="$CURDIR/libgoacestepcpp-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,21 +42,22 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgoacestepcpp-avx512.so ]; then
|
if [ -e "$CURDIR"/libgoacestepcpp-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgoacestepcpp-avx512.so"
|
LIBRARY="$CURDIR/libgoacestepcpp-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export ACESTEP_LIBRARY=$LIBRARY
|
export ACESTEP_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/acestep-cpp "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/acestep-cpp "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/acestep-cpp "$@"
|
exec "$CURDIR"/acestep-cpp "$@"
|
||||||
|
|||||||
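The Darwin branch of the acestep-cpp `run.sh` prefers the `.dylib` and falls back to the `.so`. Here is a minimal sketch of that choice, assuming the file names from the diff; the function `pick_fallback_library` and its explicit OS argument are hypothetical, added so the branch can be exercised on any host.

```shell
#!/bin/bash
# Hypothetical helper: choose the fallback library path for a given OS name.
# On Darwin, prefer the .dylib and fall back to the .so when it is missing;
# on every other OS, use the .so.
# Usage: pick_fallback_library DIR [OS]   (OS defaults to `uname`)
pick_fallback_library() {
    local dir="$1" os="${2:-$(uname)}"
    local lib="$dir/libgoacestepcpp-fallback.so"
    if [ "$os" = "Darwin" ]; then
        lib="$dir/libgoacestepcpp-fallback.dylib"
        if [ ! -e "$lib" ]; then
            lib="$dir/libgoacestepcpp-fallback.so"
        fi
    fi
    printf '%s\n' "$lib"
}
```

The sketch leaves out the AVX/AVX2/AVX512 probing that the Linux branch performs afterwards.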
@@ -57,6 +57,7 @@ libced.so: sources/ced.cpp
|
|||||||
cmake -B sources/ced.cpp/build-shared -S sources/ced.cpp $(CMAKE_ARGS)
|
cmake -B sources/ced.cpp/build-shared -S sources/ced.cpp $(CMAKE_ARGS)
|
||||||
cmake --build sources/ced.cpp/build-shared --config Release -j$(JOBS)
|
cmake --build sources/ced.cpp/build-shared --config Release -j$(JOBS)
|
||||||
cp -fv sources/ced.cpp/build-shared/libced.so* ./ 2>/dev/null || true
|
cp -fv sources/ced.cpp/build-shared/libced.so* ./ 2>/dev/null || true
|
||||||
|
cp -fv sources/ced.cpp/build-shared/libced.dylib ./ 2>/dev/null || true
|
||||||
cp -fv sources/ced.cpp/include/ced_capi.h ./
|
cp -fv sources/ced.cpp/include/ced_capi.h ./
|
||||||
|
|
||||||
ced-grpc: libced.so main.go goced.go
|
ced-grpc: libced.so main.go goced.go
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ import (
|
|||||||
"flag"
|
"flag"
|
||||||
"fmt"
|
"fmt"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -27,7 +28,11 @@ type libFunc struct {
|
|||||||
func main() {
|
func main() {
|
||||||
libName := os.Getenv("CED_LIBRARY")
|
libName := os.Getenv("CED_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "libced.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "libced.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "libced.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
if err != nil {
|
if err != nil {
|
||||||
|
|||||||
@@ -15,10 +15,12 @@ mkdir -p "$CURDIR/package/lib"
|
|||||||
cp -avf "$CURDIR/ced-grpc" "$CURDIR/package/"
|
cp -avf "$CURDIR/ced-grpc" "$CURDIR/package/"
|
||||||
cp -avf "$CURDIR/run.sh" "$CURDIR/package/"
|
cp -avf "$CURDIR/run.sh" "$CURDIR/package/"
|
||||||
|
|
||||||
cp -avf "$CURDIR"/libced.so* "$CURDIR/package/lib/" 2>/dev/null || {
|
cp -avf "$CURDIR"/libced.so* "$CURDIR/package/lib/" 2>/dev/null || true
|
||||||
echo "ERROR: libced.so not found in $CURDIR, run 'make' first" >&2
|
cp -avf "$CURDIR"/libced.dylib "$CURDIR/package/lib/" 2>/dev/null || true
|
||||||
|
if ! ls "$CURDIR"/package/lib/libced.* >/dev/null 2>&1; then
|
||||||
|
echo "ERROR: libced shared library not found in $CURDIR, run 'make' first" >&2
|
||||||
exit 1
|
exit 1
|
||||||
}
|
fi
|
||||||
|
|
||||||
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
||||||
echo "Detected x86_64 architecture, copying x86_64 libraries..."
|
echo "Detected x86_64 architecture, copying x86_64 libraries..."
|
||||||
|
|||||||
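The ced `package.sh` change replaces a hard failure on a missing `libced.so` with "copy whichever exists, then verify". A minimal sketch of that pattern follows; `bundle_libced` and its two directory arguments are hypothetical, and the copy flags are reduced to `-f` for brevity.

```shell
#!/bin/bash
# Hypothetical helper: copy libced.so* and/or libced.dylib from SRC into DEST,
# tolerating either being absent, then fail if nothing was copied at all.
# Usage: bundle_libced SRC DEST
bundle_libced() {
    local src="$1" dest="$2"
    mkdir -p "$dest"
    cp -f "$src"/libced.so* "$dest/" 2>/dev/null || true
    cp -f "$src"/libced.dylib "$dest/" 2>/dev/null || true
    if ! ls "$dest"/libced.* >/dev/null 2>&1; then
        echo "ERROR: libced shared library not found in $src" >&2
        return 1
    fi
}
```

Checking the destination rather than the source keeps the two copies independent: either platform's library satisfies the check, and a stale file already present in the destination would satisfy it too.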
@@ -3,7 +3,12 @@ set -e
|
|||||||
|
|
||||||
CURDIR=$(dirname "$(realpath "$0")")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${DYLD_LIBRARY_PATH:-}"
|
||||||
|
export CED_LIBRARY="$CURDIR/lib/libced.dylib"
|
||||||
|
else
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
|
||||||
|
fi
|
||||||
|
|
||||||
# If a self-contained ld.so was packaged, route through it so the packaged
|
# If a self-contained ld.so was packaged, route through it so the packaged
|
||||||
# libc / libstdc++ are used instead of the host's (matches the sibling backends).
|
# libc / libstdc++ are used instead of the host's (matches the sibling backends).
|
||||||
|
|||||||
@@ -1,6 +1,6 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
exec $CURDIR/cloud-proxy "$@"
|
exec "$CURDIR"/cloud-proxy "$@"
|
||||||
|
|||||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
|||||||
|
|
||||||
# CrispASR version (release tag)
|
# CrispASR version (release tag)
|
||||||
CRISPASR_REPO?=https://github.com/CrispStrobe/CrispASR
|
CRISPASR_REPO?=https://github.com/CrispStrobe/CrispASR
|
||||||
CRISPASR_VERSION?=63b57289255267edf66e43e33bc3911e04a2e92d
|
CRISPASR_VERSION?=8f1218141b792b8868861c1af17ba1e361b05dc0
|
||||||
SO_TARGET?=libgocrispasr.so
|
SO_TARGET?=libgocrispasr.so
|
||||||
|
|
||||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
||||||
@@ -75,7 +75,8 @@ UNAME_S := $(shell uname -s)
|
|||||||
ifeq ($(UNAME_S),Linux)
|
ifeq ($(UNAME_S),Linux)
|
||||||
VARIANT_TARGETS = libgocrispasr-avx.so libgocrispasr-avx2.so libgocrispasr-avx512.so libgocrispasr-fallback.so
|
VARIANT_TARGETS = libgocrispasr-avx.so libgocrispasr-avx2.so libgocrispasr-avx512.so libgocrispasr-fallback.so
|
||||||
else
|
else
|
||||||
VARIANT_TARGETS = libgocrispasr-fallback.so
|
# On non-Linux (e.g., Darwin), build only fallback variant (as a dylib)
|
||||||
|
VARIANT_TARGETS = libgocrispasr-fallback.dylib
|
||||||
endif
|
endif
|
||||||
|
|
||||||
crispasr: main.go gocrispasr.go $(VARIANT_TARGETS)
|
crispasr: main.go gocrispasr.go $(VARIANT_TARGETS)
|
||||||
@@ -87,7 +88,7 @@ package: crispasr
|
|||||||
build: package
|
build: package
|
||||||
|
|
||||||
clean: purge
|
clean: purge
|
||||||
rm -rf libgocrispasr*.so package sources/CrispASR crispasr
|
rm -rf libgocrispasr*.so libgocrispasr*.dylib package sources/CrispASR crispasr
|
||||||
|
|
||||||
purge:
|
purge:
|
||||||
rm -rf build*
|
rm -rf build*
|
||||||
@@ -118,13 +119,21 @@ libgocrispasr-fallback.so: sources/CrispASR
|
|||||||
SO_TARGET=libgocrispasr-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgocrispasr-custom
|
SO_TARGET=libgocrispasr-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgocrispasr-custom
|
||||||
rm -rfv build*
|
rm -rfv build*
|
||||||
|
|
||||||
|
# Build fallback variant as a dylib (Darwin)
|
||||||
|
libgocrispasr-fallback.dylib: sources/CrispASR
|
||||||
|
$(MAKE) purge
|
||||||
|
$(info ${GREEN}I crispasr build info:fallback (dylib)${RESET})
|
||||||
|
SO_TARGET=libgocrispasr-fallback.dylib CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgocrispasr-custom
|
||||||
|
rm -rfv build*
|
||||||
|
|
||||||
libgocrispasr-custom: CMakeLists.txt cpp/crispasr_shim.cpp cpp/crispasr_shim.h
|
libgocrispasr-custom: CMakeLists.txt cpp/crispasr_shim.cpp cpp/crispasr_shim.h
|
||||||
mkdir -p build-$(SO_TARGET) && \
|
mkdir -p build-$(SO_TARGET) && \
|
||||||
cd build-$(SO_TARGET) && \
|
cd build-$(SO_TARGET) && \
|
||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) && \
|
cmake --build . --config Release -j$(JOBS) && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgocrispasr.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libgocrispasr.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgocrispasr.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
test: crispasr
|
test: crispasr
|
||||||
CGO_ENABLED=0 $(GOCMD) test -v ./...
|
CGO_ENABLED=0 $(GOCMD) test -v ./...
|
||||||
|
|||||||
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ type LibFuncs struct {
|
|||||||
func main() {
|
func main() {
|
||||||
libName := os.Getenv("CRISPASR_LIBRARY")
|
libName := os.Getenv("CRISPASR_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgocrispasr-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgocrispasr-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgocrispasr-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -12,7 +12,8 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/crispasr $CURDIR/package/
|
cp -avf $CURDIR/crispasr $CURDIR/package/
|
||||||
cp -fv $CURDIR/libgocrispasr-*.so $CURDIR/package/
|
cp -fv $CURDIR/libgocrispasr-*.so $CURDIR/package/ 2>/dev/null || true
|
||||||
|
cp -fv $CURDIR/libgocrispasr-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgocrispasr-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
|
LIBRARY="$CURDIR/libgocrispasr-fallback.dylib"
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgocrispasr-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgocrispasr-avx.so ]; then
|
if [ -e "$CURDIR"/libgocrispasr-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgocrispasr-avx.so"
|
LIBRARY="$CURDIR/libgocrispasr-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgocrispasr-avx2.so ]; then
|
if [ -e "$CURDIR"/libgocrispasr-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgocrispasr-avx2.so"
|
LIBRARY="$CURDIR/libgocrispasr-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,26 +36,27 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgocrispasr-avx512.so ]; then
|
if [ -e "$CURDIR"/libgocrispasr-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgocrispasr-avx512.so"
|
LIBRARY="$CURDIR/libgocrispasr-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export CRISPASR_LIBRARY=$LIBRARY
|
export CRISPASR_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# Point piper's espeak-ng phonemizer at the bundled voice data. The variable
|
# Point piper's espeak-ng phonemizer at the bundled voice data. The variable
|
||||||
# names the directory CONTAINING espeak-ng-data (package.sh drops it next to
|
# names the directory CONTAINING espeak-ng-data (package.sh drops it next to
|
||||||
# this script). Harmless when espeak-ng wasn't bundled.
|
# this script). Harmless when espeak-ng wasn't bundled.
|
||||||
export CRISPASR_ESPEAK_DATA_PATH=$CURDIR
|
export CRISPASR_ESPEAK_DATA_PATH="$CURDIR"
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/crispasr "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/crispasr "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/crispasr "$@"
|
exec "$CURDIR"/crispasr "$@"
|
||||||
|
|||||||
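The Linux branch of the crispasr `run.sh` walks the CPU flags in ascending order (AVX, AVX2, AVX512F) and keeps the last variant whose library is actually shipped. The sketch below reproduces that order under two assumptions: the flags are passed in as a space-separated string instead of being grepped from `/proc/cpuinfo`, and the function name `pick_variant` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: pick the best shipped library for a set of CPU flags.
# Later matches override earlier ones, as in run.sh; a flag only counts when
# the corresponding library file exists in DIR.
# Usage: pick_variant DIR "flag1 flag2 ..."
pick_variant() {
    local dir="$1" flags=" $2 "   # pad so every flag is space-delimited
    local lib="$dir/libgocrispasr-fallback.so"
    local feature suffix
    for feature in avx avx2 avx512f; do
        suffix="${feature%f}"     # avx512f -> avx512; avx and avx2 unchanged
        case "$flags" in
            *" $feature "*)
                if [ -e "$dir/libgocrispasr-$suffix.so" ]; then
                    lib="$dir/libgocrispasr-$suffix.so"
                fi
                ;;
        esac
    done
    printf '%s\n' "$lib"
}
```

Padding the flag list with spaces plays the role of the `\s...\s` anchors in the script's `grep` patterns, so `avx` does not match inside `avx2`.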
@@ -40,6 +40,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
|
|||||||
else ifeq ($(BUILD_TYPE),vulkan)
|
else ifeq ($(BUILD_TYPE),vulkan)
|
||||||
CMAKE_ARGS+=-DGGML_VULKAN=ON -DDA_GGML_VULKAN=ON
|
CMAKE_ARGS+=-DGGML_VULKAN=ON -DDA_GGML_VULKAN=ON
|
||||||
else ifeq ($(OS),Darwin)
|
else ifeq ($(OS),Darwin)
|
||||||
|
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
|
||||||
|
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
|
||||||
ifneq ($(BUILD_TYPE),metal)
|
ifneq ($(BUILD_TYPE),metal)
|
||||||
CMAKE_ARGS+=-DGGML_METAL=OFF
|
CMAKE_ARGS+=-DGGML_METAL=OFF
|
||||||
else
|
else
|
||||||
@@ -77,7 +79,7 @@ ifeq ($(UNAME_S),Linux)
|
|||||||
VARIANT_TARGETS = libdepthanythingcpp-avx.so libdepthanythingcpp-avx2.so libdepthanythingcpp-avx512.so libdepthanythingcpp-fallback.so
|
VARIANT_TARGETS = libdepthanythingcpp-avx.so libdepthanythingcpp-avx2.so libdepthanythingcpp-avx512.so libdepthanythingcpp-fallback.so
|
||||||
else
|
else
|
||||||
# On non-Linux (e.g., Darwin), build only fallback variant
|
# On non-Linux (e.g., Darwin), build only fallback variant
|
||||||
VARIANT_TARGETS = libdepthanythingcpp-fallback.so
|
VARIANT_TARGETS = libdepthanythingcpp-fallback.dylib
|
||||||
endif
|
endif
|
||||||
|
|
||||||
depth-anything-cpp: main.go godepthanythingcpp.go $(VARIANT_TARGETS)
|
depth-anything-cpp: main.go godepthanythingcpp.go $(VARIANT_TARGETS)
|
||||||
@@ -89,7 +91,7 @@ package: depth-anything-cpp
|
|||||||
build: package
|
build: package
|
||||||
|
|
||||||
clean: purge
|
clean: purge
|
||||||
rm -rf libdepthanythingcpp*.so depth-anything-cpp package sources
|
rm -rf libdepthanythingcpp*.so libdepthanythingcpp*.dylib depth-anything-cpp package sources
|
||||||
|
|
||||||
purge:
|
purge:
|
||||||
rm -rf build*
|
rm -rf build*
|
||||||
@@ -116,11 +118,19 @@ libdepthanythingcpp-avx512.so: sources/depth-anything.cpp
|
|||||||
endif
|
endif
|
||||||
|
|
||||||
# Build fallback variant (all platforms)
|
# Build fallback variant (all platforms)
|
||||||
|
ifeq ($(UNAME_S),Darwin)
|
||||||
|
libdepthanythingcpp-fallback.dylib: sources/depth-anything.cpp
|
||||||
|
rm -rfv build-$@
|
||||||
|
$(info ${GREEN}I depth-anything-cpp build info:fallback${RESET})
|
||||||
|
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libdepthanythingcpp-custom
|
||||||
|
rm -rfv build-$@
|
||||||
|
else
|
||||||
libdepthanythingcpp-fallback.so: sources/depth-anything.cpp
|
libdepthanythingcpp-fallback.so: sources/depth-anything.cpp
|
||||||
rm -rfv build-$@
|
rm -rfv build-$@
|
||||||
$(info ${GREEN}I depth-anything-cpp build info:fallback${RESET})
|
$(info ${GREEN}I depth-anything-cpp build info:fallback${RESET})
|
||||||
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libdepthanythingcpp-custom
|
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libdepthanythingcpp-custom
|
||||||
rm -rfv build-$@
|
rm -rfv build-$@
|
||||||
|
endif
|
||||||
|
|
||||||
libdepthanythingcpp-custom: CMakeLists.txt
|
libdepthanythingcpp-custom: CMakeLists.txt
|
||||||
mkdir -p build-$(SO_TARGET) && \
|
mkdir -p build-$(SO_TARGET) && \
|
||||||
@@ -128,7 +138,8 @@ libdepthanythingcpp-custom: CMakeLists.txt
|
|||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) && \
|
cmake --build . --config Release -j$(JOBS) && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libdepthanything.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libdepthanything.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libdepthanything.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
all: depth-anything-cpp package
|
all: depth-anything-cpp package
|
||||||
|
|
||||||
|
|||||||
@@ -9,6 +9,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -27,7 +28,11 @@ func main() {
 // Get library name from environment variable, default to fallback
 libName := os.Getenv("DEPTHANYTHING_LIBRARY")
 if libName == "" {
-libName = "./libdepthanythingcpp-fallback.so"
+if runtime.GOOS == "darwin" {
+libName = "./libdepthanythingcpp-fallback.dylib"
+} else {
+libName = "./libdepthanythingcpp-fallback.so"
+}
 }

 lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
|||||||
@@ -10,7 +10,8 @@ REPO_ROOT="${CURDIR}/../../.."
 # Create lib directory
 mkdir -p $CURDIR/package/lib

-cp -avf $CURDIR/libdepthanythingcpp-*.so $CURDIR/package/
+cp -fv $CURDIR/libdepthanythingcpp-*.so $CURDIR/package/ 2>/dev/null || true
+cp -fv $CURDIR/libdepthanythingcpp-*.dylib $CURDIR/package/ 2>/dev/null || true
 cp -avf $CURDIR/depth-anything-cpp $CURDIR/package/
 cp -fv $CURDIR/run.sh $CURDIR/package/

|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
 grep -e "flags" /proc/cpuinfo | head -1
 fi

-LIBRARY="$CURDIR/libdepthanythingcpp-fallback.so"
+if [ "$(uname)" = "Darwin" ]; then
+# macOS: single dylib variant (Metal or Accelerate)
+LIBRARY="$CURDIR/libdepthanythingcpp-fallback.dylib"
+export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+else
+LIBRARY="$CURDIR/libdepthanythingcpp-fallback.so"

-if [ "$(uname)" != "Darwin" ]; then
 if grep -q -e "\savx\s" /proc/cpuinfo ; then
 echo "CPU: AVX found OK"
-if [ -e $CURDIR/libdepthanythingcpp-avx.so ]; then
+if [ -e "$CURDIR"/libdepthanythingcpp-avx.so ]; then
 LIBRARY="$CURDIR/libdepthanythingcpp-avx.so"
 fi
 fi

 if grep -q -e "\savx2\s" /proc/cpuinfo ; then
 echo "CPU: AVX2 found OK"
-if [ -e $CURDIR/libdepthanythingcpp-avx2.so ]; then
+if [ -e "$CURDIR"/libdepthanythingcpp-avx2.so ]; then
 LIBRARY="$CURDIR/libdepthanythingcpp-avx2.so"
 fi
 fi
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
 # Check avx 512
 if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
 echo "CPU: AVX512F found OK"
-if [ -e $CURDIR/libdepthanythingcpp-avx512.so ]; then
+if [ -e "$CURDIR"/libdepthanythingcpp-avx512.so ]; then
 LIBRARY="$CURDIR/libdepthanythingcpp-avx512.so"
 fi
 fi
+
+export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
 fi

-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
 export DEPTHANYTHING_LIBRARY=$LIBRARY

 # If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
 echo "Using lib/ld.so"
 echo "Using library: $LIBRARY"
-exec $CURDIR/lib/ld.so $CURDIR/depth-anything-cpp "$@"
+exec "$CURDIR"/lib/ld.so "$CURDIR"/depth-anything-cpp "$@"
 fi

 echo "Using library: $LIBRARY"
-exec $CURDIR/depth-anything-cpp "$@"
+exec "$CURDIR"/depth-anything-cpp "$@"
|
|||||||
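Reviewer sketch, not part of the patch: the run.sh selection logic above (fallback first, then avx, avx2 and avx512 each overriding the previous choice when the flag is present and the file exists, with a single fallback dylib on macOS) can be expressed as one function. The name `pick_variant` and its argument order are hypothetical; the CPU flags are passed in as a string so the function can be exercised without /proc/cpuinfo.

```shell
# Hypothetical helper (assumption, not in the patch): choose the library variant.
# Usage: pick_variant DIR PREFIX OS "CPU FLAGS"
pick_variant() {
    dir=$1
    prefix=$2
    os=$3
    flags=" $4 "   # pad with spaces so each flag matches as a whole word

    if [ "$os" = "Darwin" ]; then
        # macOS: a single fallback dylib, no CPU probing.
        printf '%s\n' "$dir/$prefix-fallback.dylib"
        return 0
    fi

    lib="$dir/$prefix-fallback.so"
    # Each entry is cpuflag:variant. Later entries override earlier ones,
    # mirroring the order of the checks in run.sh.
    for pair in avx:avx avx2:avx2 avx512f:avx512; do
        flag=${pair%%:*}
        variant=${pair##*:}
        case "$flags" in
            *" $flag "*)
                if [ -e "$dir/$prefix-$variant.so" ]; then
                    lib="$dir/$prefix-$variant.so"
                fi
                ;;
        esac
    done
    printf '%s\n' "$lib"
}
```

A call such as `pick_variant "$CURDIR" libdepthanythingcpp "$(uname)" "$(grep -m1 flags /proc/cpuinfo 2>/dev/null)"` would then stand in for the inline checks; whether that refactor is worth it across the three near-identical run.sh files is a judgment call.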
@@ -1,6 +1,6 @@
 #!/bin/bash
 set -ex

-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")

-exec $CURDIR/local-store "$@"
+exec "$CURDIR"/local-store "$@"
@@ -32,6 +32,8 @@ endif
|
|||||||
ifeq ($(BUILD_TYPE),vulkan)
|
ifeq ($(BUILD_TYPE),vulkan)
|
||||||
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLOCALVQE_VULKAN=ON
|
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLOCALVQE_VULKAN=ON
|
||||||
else ifeq ($(OS),Darwin)
|
else ifeq ($(OS),Darwin)
|
||||||
|
# Apple Silicon: CPU-only (no Metal upstream); built + published as an arm64
|
||||||
|
# image by CI (includeDarwin in .github/backend-matrix.yml) for macOS install.
|
||||||
CMAKE_ARGS+=-DGGML_METAL=OFF
|
CMAKE_ARGS+=-DGGML_METAL=OFF
|
||||||
endif
|
endif
|
||||||
|
|
||||||
@@ -67,8 +69,9 @@ $(LIB_SENTINEL): sources/LocalVQE
|
|||||||
# that the loader picks at runtime. We must build every target — the
|
# that the loader picks at runtime. We must build every target — the
|
||||||
# default `--target localvqe_shared` drops these. CMAKE_LIBRARY_OUTPUT_DIRECTORY
|
# default `--target localvqe_shared` drops these. CMAKE_LIBRARY_OUTPUT_DIRECTORY
|
||||||
# routes all of them into build/bin; copy them out next to the binary.
|
# routes all of them into build/bin; copy them out next to the binary.
|
||||||
cp -P build/bin/liblocalvqe.so* . 2>/dev/null || cp -P build/liblocalvqe.so* .
|
cp -P build/bin/liblocalvqe.so* . 2>/dev/null || cp -P build/bin/liblocalvqe.dylib . 2>/dev/null || cp -P build/liblocalvqe.so* . 2>/dev/null || cp -P build/liblocalvqe.dylib .
|
||||||
cp -P build/bin/libggml*.so* . 2>/dev/null || true
|
cp -P build/bin/libggml*.so* . 2>/dev/null || true
|
||||||
|
cp -P build/bin/libggml*.dylib . 2>/dev/null || true
|
||||||
touch $(LIB_SENTINEL)
|
touch $(LIB_SENTINEL)
|
||||||
|
|
||||||
liblocalvqe.so: $(LIB_SENTINEL)
|
liblocalvqe.so: $(LIB_SENTINEL)
|
||||||
|
|||||||
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ type LibFuncs struct {
 func main() {
 libName := os.Getenv("LOCALVQE_LIBRARY")
 if libName == "" {
-libName = "./liblocalvqe.so"
+if runtime.GOOS == "darwin" {
+libName = "./liblocalvqe.dylib"
+} else {
+libName = "./liblocalvqe.so"
+}
 }

 lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
|||||||
@@ -15,7 +15,9 @@ cp -avf $CURDIR/localvqe $CURDIR/package/
 # liblocalvqe.so* (with SOVERSION symlinks) and the libggml-*.so runtime
 # variants — LocalVQE picks the matching CPU variant at load time.
 cp -P $CURDIR/liblocalvqe.so* $CURDIR/package/ 2>/dev/null || true
+cp -P $CURDIR/liblocalvqe.dylib $CURDIR/package/ 2>/dev/null || true
 cp -P $CURDIR/libggml*.so* $CURDIR/package/ 2>/dev/null || true
+cp -P $CURDIR/libggml*.dylib $CURDIR/package/ 2>/dev/null || true
 cp -fv $CURDIR/run.sh $CURDIR/package/

 # Detect architecture and copy appropriate libraries
|
|||||||
@@ -1,23 +1,34 @@
 #!/bin/bash
 set -ex

-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")

 # LocalVQE's runtime CPU-variant loader (ggml_backend_load_all) searches
 # get_executable_path() and current_path() — the second one is what saves us
 # when /proc/self/exe resolves to lib/ld.so under the bundled-loader path.
-# So we cd into $CURDIR (where all the libggml-cpu-*.so files live) before
+# So we cd into "$CURDIR" (where all the libggml-cpu-*.so files live) before
 # exec'ing the binary.
 cd "$CURDIR"

-export LD_LIBRARY_PATH=$CURDIR:$CURDIR/lib:$LD_LIBRARY_PATH
-export LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.so
+if [ "$(uname)" = "Darwin" ]; then
+# macOS: LocalVQE is built as a SHARED library, so dyld needs the .dylib +
+# DYLD_LIBRARY_PATH. Prefer .dylib and fall back to .so just in case.
+export DYLD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$DYLD_LIBRARY_PATH
+LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.dylib
+if [ ! -e "$LOCALVQE_LIBRARY" ]; then
+LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
+fi
+export LOCALVQE_LIBRARY
+else
+export LD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$LD_LIBRARY_PATH
+export LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
+fi

-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
 echo "Using lib/ld.so"
 echo "Using library: $LOCALVQE_LIBRARY"
-exec $CURDIR/lib/ld.so $CURDIR/localvqe "$@"
+exec "$CURDIR"/lib/ld.so "$CURDIR"/localvqe "$@"
 fi

 echo "Using library: $LOCALVQE_LIBRARY"
-exec $CURDIR/localvqe "$@"
+exec "$CURDIR"/localvqe "$@"
|
|||||||
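Reviewer sketch, not part of the patch: the LocalVQE run.sh change above prefers liblocalvqe.dylib on macOS and falls back to liblocalvqe.so when the dylib is missing, while other platforms always use the .so. A minimal way to state that rule in isolation, assuming a hypothetical helper name `resolve_localvqe_lib` that takes the directory and the `uname` value as arguments:

```shell
# Hypothetical helper (assumption, not in the patch).
# Usage: resolve_localvqe_lib DIR OS
resolve_localvqe_lib() {
    dir=$1
    os=$2
    # Only macOS looks for the dylib; if it is absent, or on any other OS,
    # the .so name is returned, matching the branches in run.sh.
    if [ "$os" = "Darwin" ] && [ -e "$dir/liblocalvqe.dylib" ]; then
        printf '%s\n' "$dir/liblocalvqe.dylib"
    else
        printf '%s\n' "$dir/liblocalvqe.so"
    fi
}
```

Note that, like the script itself, the helper does not check that the .so exists; a missing library would only surface later when the Go binary calls `purego.Dlopen`.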
@@ -33,6 +33,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
|
|||||||
else ifeq ($(BUILD_TYPE),vulkan)
|
else ifeq ($(BUILD_TYPE),vulkan)
|
||||||
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLA_GGML_VULKAN=ON
|
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLA_GGML_VULKAN=ON
|
||||||
else ifeq ($(OS),Darwin)
|
else ifeq ($(OS),Darwin)
|
||||||
|
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
|
||||||
|
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
|
||||||
ifneq ($(BUILD_TYPE),metal)
|
ifneq ($(BUILD_TYPE),metal)
|
||||||
CMAKE_ARGS+=-DGGML_METAL=OFF
|
CMAKE_ARGS+=-DGGML_METAL=OFF
|
||||||
else
|
else
|
||||||
@@ -70,7 +72,7 @@ ifeq ($(UNAME_S),Linux)
|
|||||||
VARIANT_TARGETS = liblocateanythingcpp-avx.so liblocateanythingcpp-avx2.so liblocateanythingcpp-avx512.so liblocateanythingcpp-fallback.so
|
VARIANT_TARGETS = liblocateanythingcpp-avx.so liblocateanythingcpp-avx2.so liblocateanythingcpp-avx512.so liblocateanythingcpp-fallback.so
|
||||||
else
|
else
|
||||||
# On non-Linux (e.g., Darwin), build only fallback variant
|
# On non-Linux (e.g., Darwin), build only fallback variant
|
||||||
VARIANT_TARGETS = liblocateanythingcpp-fallback.so
|
VARIANT_TARGETS = liblocateanythingcpp-fallback.dylib
|
||||||
endif
|
endif
|
||||||
|
|
||||||
locate-anything-cpp: main.go golocateanythingcpp.go $(VARIANT_TARGETS)
|
locate-anything-cpp: main.go golocateanythingcpp.go $(VARIANT_TARGETS)
|
||||||
@@ -82,7 +84,7 @@ package: locate-anything-cpp
|
|||||||
build: package
|
build: package
|
||||||
|
|
||||||
clean: purge
|
clean: purge
|
||||||
rm -rf liblocateanythingcpp*.so locate-anything-cpp package sources
|
rm -rf liblocateanythingcpp*.so liblocateanythingcpp*.dylib locate-anything-cpp package sources
|
||||||
|
|
||||||
purge:
|
purge:
|
||||||
rm -rf build*
|
rm -rf build*
|
||||||
@@ -109,11 +111,19 @@ liblocateanythingcpp-avx512.so: sources/locate-anything.cpp
|
|||||||
endif
|
endif
|
||||||
|
|
||||||
# Build fallback variant (all platforms)
|
# Build fallback variant (all platforms)
|
||||||
|
ifeq ($(UNAME_S),Darwin)
|
||||||
|
liblocateanythingcpp-fallback.dylib: sources/locate-anything.cpp
|
||||||
|
rm -rfv build-$@
|
||||||
|
$(info ${GREEN}I locate-anything-cpp build info:fallback${RESET})
|
||||||
|
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) liblocateanythingcpp-custom
|
||||||
|
rm -rfv build-$@
|
||||||
|
else
|
||||||
liblocateanythingcpp-fallback.so: sources/locate-anything.cpp
|
liblocateanythingcpp-fallback.so: sources/locate-anything.cpp
|
||||||
rm -rfv build-$@
|
rm -rfv build-$@
|
||||||
$(info ${GREEN}I locate-anything-cpp build info:fallback${RESET})
|
$(info ${GREEN}I locate-anything-cpp build info:fallback${RESET})
|
||||||
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) liblocateanythingcpp-custom
|
SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) liblocateanythingcpp-custom
|
||||||
rm -rfv build-$@
|
rm -rfv build-$@
|
||||||
|
endif
|
||||||
|
|
||||||
liblocateanythingcpp-custom: CMakeLists.txt
|
liblocateanythingcpp-custom: CMakeLists.txt
|
||||||
mkdir -p build-$(SO_TARGET) && \
|
mkdir -p build-$(SO_TARGET) && \
|
||||||
@@ -121,7 +131,8 @@ liblocateanythingcpp-custom: CMakeLists.txt
|
|||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) && \
|
cmake --build . --config Release -j$(JOBS) && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/liblocateanythingcpp.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/liblocateanythingcpp.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/liblocateanythingcpp.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
all: locate-anything-cpp package
|
all: locate-anything-cpp package
|
||||||
|
|
||||||
|
|||||||
@@ -9,6 +9,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -27,7 +28,11 @@ func main() {
|
|||||||
// Get library name from environment variable, default to fallback
|
// Get library name from environment variable, default to fallback
|
||||||
libName := os.Getenv("LOCATEANYTHING_LIBRARY")
|
libName := os.Getenv("LOCATEANYTHING_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./liblocateanythingcpp-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./liblocateanythingcpp-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./liblocateanythingcpp-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -10,7 +10,8 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
# Create lib directory
|
# Create lib directory
|
||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/liblocateanythingcpp-*.so $CURDIR/package/
|
cp -fv $CURDIR/liblocateanythingcpp-*.so $CURDIR/package/ 2>/dev/null || true
|
||||||
|
cp -fv $CURDIR/liblocateanythingcpp-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -avf $CURDIR/locate-anything-cpp $CURDIR/package/
|
cp -avf $CURDIR/locate-anything-cpp $CURDIR/package/
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
|
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.dylib"
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/liblocateanythingcpp-avx.so ]; then
|
if [ -e "$CURDIR"/liblocateanythingcpp-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/liblocateanythingcpp-avx.so"
|
LIBRARY="$CURDIR/liblocateanythingcpp-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/liblocateanythingcpp-avx2.so ]; then
|
if [ -e "$CURDIR"/liblocateanythingcpp-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/liblocateanythingcpp-avx2.so"
|
LIBRARY="$CURDIR/liblocateanythingcpp-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/liblocateanythingcpp-avx512.so ]; then
|
if [ -e "$CURDIR"/liblocateanythingcpp-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/liblocateanythingcpp-avx512.so"
|
LIBRARY="$CURDIR/liblocateanythingcpp-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export LOCATEANYTHING_LIBRARY=$LIBRARY
|
export LOCATEANYTHING_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/locate-anything-cpp "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/locate-anything-cpp "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/locate-anything-cpp "$@"
|
exec "$CURDIR"/locate-anything-cpp "$@"
|
||||||
|
|||||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
|||||||
|
|
||||||
# omnivoice.cpp version
|
# omnivoice.cpp version
|
||||||
OMNIVOICE_REPO?=https://github.com/ServeurpersoCom/omnivoice.cpp
|
OMNIVOICE_REPO?=https://github.com/ServeurpersoCom/omnivoice.cpp
|
||||||
OMNIVOICE_VERSION?=96d30169afd5e6bb3fd6a0e9be0eb505bfe81fcd
|
OMNIVOICE_VERSION?=0f37401bebe9b20c0160a888e592108fc1d17607
|
||||||
SO_TARGET?=libgomnivoicecpp.so
|
SO_TARGET?=libgomnivoicecpp.so
|
||||||
|
|
||||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
||||||
@@ -65,7 +65,8 @@ UNAME_S := $(shell uname -s)
|
|||||||
ifeq ($(UNAME_S),Linux)
|
ifeq ($(UNAME_S),Linux)
|
||||||
VARIANT_TARGETS = libgomnivoicecpp-avx.so libgomnivoicecpp-avx2.so libgomnivoicecpp-avx512.so libgomnivoicecpp-fallback.so
|
VARIANT_TARGETS = libgomnivoicecpp-avx.so libgomnivoicecpp-avx2.so libgomnivoicecpp-avx512.so libgomnivoicecpp-fallback.so
|
||||||
else
|
else
|
||||||
VARIANT_TARGETS = libgomnivoicecpp-fallback.so
|
# On non-Linux (e.g., Darwin), build only fallback variant (as a dylib)
|
||||||
|
VARIANT_TARGETS = libgomnivoicecpp-fallback.dylib
|
||||||
endif
|
endif
|
||||||
|
|
||||||
omnivoice-cpp: main.go gomnivoicecpp.go $(VARIANT_TARGETS)
|
omnivoice-cpp: main.go gomnivoicecpp.go $(VARIANT_TARGETS)
|
||||||
@@ -77,7 +78,7 @@ package: omnivoice-cpp
|
|||||||
build: package
|
build: package
|
||||||
|
|
||||||
clean: purge
|
clean: purge
|
||||||
rm -rf libgomnivoicecpp*.so package sources/omnivoice.cpp omnivoice-cpp
|
rm -rf libgomnivoicecpp*.so libgomnivoicecpp*.dylib package sources/omnivoice.cpp omnivoice-cpp
|
||||||
|
|
||||||
purge:
|
purge:
|
||||||
rm -rf build*
|
rm -rf build*
|
||||||
@@ -106,13 +107,20 @@ libgomnivoicecpp-fallback.so: sources/omnivoice.cpp
|
|||||||
SO_TARGET=libgomnivoicecpp-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgomnivoicecpp-custom
|
SO_TARGET=libgomnivoicecpp-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgomnivoicecpp-custom
|
||||||
rm -rf build-libgomnivoicecpp-fallback.so
|
rm -rf build-libgomnivoicecpp-fallback.so
|
||||||
|
|
||||||
|
# Build fallback variant as a dylib (Darwin)
|
||||||
|
libgomnivoicecpp-fallback.dylib: sources/omnivoice.cpp
|
||||||
|
$(info ${GREEN}I omnivoice-cpp build info:fallback (dylib)${RESET})
|
||||||
|
SO_TARGET=libgomnivoicecpp-fallback.dylib CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgomnivoicecpp-custom
|
||||||
|
rm -rf build-libgomnivoicecpp-fallback.dylib
|
||||||
|
|
||||||
libgomnivoicecpp-custom: CMakeLists.txt cpp/gomnivoicecpp.cpp cpp/gomnivoicecpp.h
|
libgomnivoicecpp-custom: CMakeLists.txt cpp/gomnivoicecpp.cpp cpp/gomnivoicecpp.h
|
||||||
mkdir -p build-$(SO_TARGET) && \
|
mkdir -p build-$(SO_TARGET) && \
|
||||||
cd build-$(SO_TARGET) && \
|
cd build-$(SO_TARGET) && \
|
||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) --target gomnivoicecpp && \
|
cmake --build . --config Release -j$(JOBS) --target gomnivoicecpp && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgomnivoicecpp.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libgomnivoicecpp.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgomnivoicecpp.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
test: omnivoice-cpp
|
test: omnivoice-cpp
|
||||||
@echo "Running omnivoice-cpp tests..."
|
@echo "Running omnivoice-cpp tests..."
|
||||||
|
|||||||
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ type LibFuncs struct {
|
|||||||
func main() {
|
func main() {
|
||||||
libName := os.Getenv("OMNIVOICE_LIBRARY")
|
libName := os.Getenv("OMNIVOICE_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgomnivoicecpp-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgomnivoicecpp-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgomnivoicecpp-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -12,7 +12,8 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/omnivoice-cpp $CURDIR/package/
|
cp -avf $CURDIR/omnivoice-cpp $CURDIR/package/
|
||||||
cp -fv $CURDIR/libgomnivoicecpp-*.so $CURDIR/package/
|
cp -fv $CURDIR/libgomnivoicecpp-*.so $CURDIR/package/ 2>/dev/null || true
|
||||||
|
cp -fv $CURDIR/libgomnivoicecpp-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
|
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.dylib"
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgomnivoicecpp-avx.so ]; then
|
if [ -e "$CURDIR"/libgomnivoicecpp-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgomnivoicecpp-avx.so"
|
LIBRARY="$CURDIR/libgomnivoicecpp-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgomnivoicecpp-avx2.so ]; then
|
if [ -e "$CURDIR"/libgomnivoicecpp-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgomnivoicecpp-avx2.so"
|
LIBRARY="$CURDIR/libgomnivoicecpp-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgomnivoicecpp-avx512.so ]; then
|
if [ -e "$CURDIR"/libgomnivoicecpp-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgomnivoicecpp-avx512.so"
|
LIBRARY="$CURDIR/libgomnivoicecpp-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export OMNIVOICE_LIBRARY=$LIBRARY
|
export OMNIVOICE_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/omnivoice-cpp "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/omnivoice-cpp "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/omnivoice-cpp "$@"
|
exec "$CURDIR"/omnivoice-cpp "$@"
|
||||||
|
|||||||
@@ -1,13 +1,30 @@
 GOCMD?=go
 GO_TAGS?=

+# The opus shim is a small C wrapper around libopus' variadic
+# opus_encoder_ctl (see csrc/opus_shim.c). It is built as a shared library
+# and dlopen'd at runtime by the Go backend (codec.go). The extension is
+# OS-specific: Linux uses .so, macOS uses .dylib. OS is exported by the root
+# Makefile (`export OS := $(shell uname -s)`).
+SHIM_EXT=so
+
 OPUS_CFLAGS := $(shell pkg-config --cflags opus)
 OPUS_LIBS := $(shell pkg-config --libs opus)
+SHIM_LDFLAGS := $(OPUS_LIBS)

-libopusshim.so: csrc/opus_shim.c
-$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(OPUS_LIBS)
+ifeq ($(OS),Darwin)
+SHIM_EXT=dylib
+# Resolve libopus symbols lazily from the already globally-loaded
+# libopus (codec.go dlopens it RTLD_GLOBAL before the shim) rather than
+# recording an absolute Homebrew path in the dylib. This keeps the
+# packaged shim relocatable on machines that have no Homebrew.
+SHIM_LDFLAGS := -undefined dynamic_lookup
+endif

-opus: libopusshim.so
+libopusshim.$(SHIM_EXT): csrc/opus_shim.c
+$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(SHIM_LDFLAGS)
+
+opus: libopusshim.$(SHIM_EXT)
 $(GOCMD) build -tags "$(GO_TAGS)" -o opus ./

 package: opus
@@ -16,4 +33,7 @@ package: opus
 build: package

 clean:
-rm -f opus libopusshim.so
+rm -f opus libopusshim.$(SHIM_EXT)
+rm -rf package
+
+.PHONY: build package clean
|
|||||||
@@ -8,13 +8,23 @@ mkdir -p $CURDIR/package/lib
 cp -avf $CURDIR/opus $CURDIR/package/
 cp -avf $CURDIR/run.sh $CURDIR/package/

-# Copy the opus shim library
-cp -avf $CURDIR/libopusshim.so $CURDIR/package/lib/
+# The shim extension is OS-specific (.so on Linux, .dylib on macOS).
+SHIM_EXT=so
+if [ "$(uname)" = "Darwin" ]; then
+SHIM_EXT=dylib
+fi

-# Copy system libopus
+# Copy the opus shim library
+cp -avf $CURDIR/libopusshim.$SHIM_EXT $CURDIR/package/lib/
+
+# Copy system libopus so the backend is self-contained: the runtime base
+# image has neither libopus-dev (Linux) nor Homebrew (macOS), so codec.go's
+# dlopen would otherwise fail. Both name patterns are attempted; only the
+# host's matching one exists.
 if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists opus; then
 LIBOPUS_DIR=$(pkg-config --variable=libdir opus)
-cp -avfL $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
+cp -avf $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
+cp -avf $LIBOPUS_DIR/libopus*.dylib $CURDIR/package/lib/ 2>/dev/null || true
 fi

 # Detect architecture and copy appropriate libraries
@@ -38,6 +48,8 @@ elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
|
|||||||
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
|
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
|
||||||
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
|
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
|
||||||
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
|
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
|
||||||
|
elif [ "$(uname -s)" = "Darwin" ]; then
|
||||||
|
echo "Detected Darwin — system libraries linked dynamically, no bundled loader needed"
|
||||||
else
|
else
|
||||||
echo "Warning: Could not detect architecture for system library bundling"
|
echo "Warning: Could not detect architecture for system library bundling"
|
||||||
fi
|
fi
|
||||||
|
|||||||
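The extension choice and the tolerant copy in the packaging hunks above can be sketched as a standalone shell snippet. `shim_ext` and `copy_existing` are hypothetical helper names introduced only for illustration; they do not exist in the repository. The kernel name is passed as an argument so that both branches can be exercised on a single host.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): map a kernel name, as printed
# by `uname`, to the shared-library extension used for the shim.
shim_ext() {
    case "$1" in
        Darwin) echo "dylib" ;;
        *)      echo "so" ;;
    esac
}

# Hypothetical helper: copy each existing file into a destination directory.
# A glob that matched nothing is passed through literally by the shell, fails
# the -e test, and is skipped, which mirrors the `|| true` in the diff.
copy_existing() {
    dest=$1
    shift
    for f in "$@"; do
        [ -e "$f" ] && cp -f "$f" "$dest/"
    done
    return 0
}

echo "shim extension on this host: $(shim_ext "$(uname)")"
```

A call such as `copy_existing "$CURDIR/package/lib" "$LIBOPUS_DIR"/libopus.so* "$LIBOPUS_DIR"/libopus*.dylib` would then copy whichever pattern exists on the build host.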
@@ -1,15 +1,20 @@
 #!/bin/bash
 set -ex
 
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
-export OPUS_SHIM_LIBRARY=$CURDIR/lib/libopusshim.so
-
-# If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
-    echo "Using lib/ld.so"
-    exec $CURDIR/lib/ld.so $CURDIR/opus "$@"
+if [ "$(uname)" = "Darwin" ]; then
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+    export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.dylib
+else
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
+    export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.so
 fi
 
-exec $CURDIR/opus "$@"
+# If there is a lib/ld.so, use it
+if [ -f "$CURDIR"/lib/ld.so ]; then
+    echo "Using lib/ld.so"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/opus "$@"
+fi
+
+exec "$CURDIR"/opus "$@"
|
|||||||
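The launcher change above switches between two loader search-path variables depending on the host OS. A minimal sketch of that decision follows, assuming hypothetical helper names `library_path_var` and `prepend_path` that are not part of the repository. Unlike the diff, `prepend_path` avoids a trailing colon when the existing value is empty; on glibc's loader an empty entry in `LD_LIBRARY_PATH` is treated as the current directory, which is usually not intended.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): name of the dynamic-loader
# search-path variable for a given kernel name.
library_path_var() {
    case "$1" in
        Darwin) echo "DYLD_LIBRARY_PATH" ;;
        *)      echo "LD_LIBRARY_PATH" ;;
    esac
}

# Hypothetical helper: prepend a directory to a colon-separated list without
# leaving a trailing colon when the existing list is empty.
prepend_path() {
    if [ -n "$2" ]; then
        echo "$1:$2"
    else
        echo "$1"
    fi
}

echo "this host uses: $(library_path_var "$(uname)")"
```

This is an illustration of the branching only; the scripts in the diff export the variables inline rather than through helpers.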
@@ -1,6 +1,6 @@
 # parakeet-cpp backend Makefile.
 #
-# Upstream pin lives below as PARAKEET_VERSION?=db755a78d39f789bb7d4e3935158a9e8105dbe36
+# Upstream pin lives below as PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
 # (.github/bump_deps.sh) can find and update it - matches the
 # whisper.cpp / ds4 / vibevoice-cpp convention.
 #
@@ -15,7 +15,7 @@
 # That's what the L0 smoke test uses. The default target below does the
 # proper clone-at-pin + cmake build so CI doesn't need a side-checkout.
 
-PARAKEET_VERSION?=db755a78d39f789bb7d4e3935158a9e8105dbe36
+PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
 PARAKEET_REPO?=https://github.com/mudler/parakeet.cpp
 
 GOCMD?=go
@@ -74,6 +74,7 @@ libparakeet.so: sources/parakeet.cpp
 	cmake -B sources/parakeet.cpp/build-shared -S sources/parakeet.cpp $(CMAKE_ARGS)
 	cmake --build sources/parakeet.cpp/build-shared --config Release -j$(JOBS)
 	cp -fv sources/parakeet.cpp/build-shared/libparakeet.so* ./ 2>/dev/null || true
+	cp -fv sources/parakeet.cpp/build-shared/libparakeet.dylib ./ 2>/dev/null || true
 	cp -fv sources/parakeet.cpp/include/parakeet_capi.h ./
 
 parakeet-cpp-grpc: libparakeet.so main.go goparakeetcpp.go
@@ -2,15 +2,17 @@ package main
 
 // Started internally by LocalAI - one gRPC server per loaded model.
 //
-// Loads libparakeet.so via purego and registers the flat C-API entry
-// points declared in parakeet_capi.h. The library name can be overridden
-// with PARAKEET_LIBRARY (mirrors the WHISPER_LIBRARY / VIBEVOICECPP_LIBRARY
-// convention in the sibling backends); the default looks for the .so next
-// to this binary.
+// Loads the parakeet shared library via purego and registers the flat
+// C-API entry points declared in parakeet_capi.h. The library name can be
+// overridden with PARAKEET_LIBRARY (mirrors the WHISPER_LIBRARY /
+// VIBEVOICECPP_LIBRARY convention in the sibling backends); the default
+// looks next to this binary for libparakeet.so on Linux and
+// libparakeet.dylib on macOS.
 import (
 	"flag"
 	"fmt"
 	"os"
+	"runtime"
 
 	"github.com/ebitengine/purego"
 	grpc "github.com/mudler/LocalAI/pkg/grpc"
@@ -28,7 +30,11 @@ type LibFuncs struct {
 func main() {
 	libName := os.Getenv("PARAKEET_LIBRARY")
 	if libName == "" {
-		libName = "libparakeet.so"
+		if runtime.GOOS == "darwin" {
+			libName = "libparakeet.dylib"
+		} else {
+			libName = "libparakeet.so"
+		}
 	}
 
 	lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
@@ -16,12 +16,15 @@ mkdir -p "$CURDIR/package/lib"
 cp -avf "$CURDIR/parakeet-cpp-grpc" "$CURDIR/package/"
 cp -avf "$CURDIR/run.sh" "$CURDIR/package/"
 
-# libparakeet.so + any soname symlinks (libparakeet.so.X[.Y]). purego.Dlopen
-# resolves it via LD_LIBRARY_PATH, which run.sh points at lib/.
-cp -avf "$CURDIR"/libparakeet.so* "$CURDIR/package/lib/" 2>/dev/null || {
-    echo "ERROR: libparakeet.so not found in $CURDIR, run 'make' first" >&2
+# libparakeet shared lib + any soname symlinks. On Linux this is
+# libparakeet.so[.X.Y]; on macOS it is libparakeet.dylib. purego.Dlopen
+# resolves it via the *_LIBRARY_PATH that run.sh points at lib/.
+cp -avf "$CURDIR"/libparakeet.so* "$CURDIR/package/lib/" 2>/dev/null || true
+cp -avf "$CURDIR"/libparakeet.dylib "$CURDIR/package/lib/" 2>/dev/null || true
+if ! ls "$CURDIR"/package/lib/libparakeet.* >/dev/null 2>&1; then
+    echo "ERROR: libparakeet shared library not found in $CURDIR, run 'make' first" >&2
     exit 1
-}
+fi
 
 # Detect architecture and copy the core runtime libs libparakeet.so links
 # against, plus the matching dynamic loader as lib/ld.so.
@@ -48,7 +51,7 @@ elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
     cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 "$CURDIR/package/lib/librt.so.1"
     cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 "$CURDIR/package/lib/libpthread.so.0"
 elif [ "$(uname -s)" = "Darwin" ]; then
-    echo "Detected Darwin"
+    echo "Detected Darwin — system frameworks linked dynamically, no bundled libs needed"
 else
     echo "Error: Could not detect architecture"
     exit 1
|
|||||||
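Because both copy commands in the packaging hunk above now end in `|| true`, a missing library is only caught by the explicit check that follows them. A minimal sketch of such a check, written without parsing `ls` output, is shown below. The function name `has_shared_lib` is a hypothetical helper introduced for illustration, not something the repository defines.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): succeed if at least one file
# named <prefix>.* exists in the given directory. When the glob matches
# nothing the shell passes the pattern through literally, and the -e test
# rejects it.
has_shared_lib() {
    dir=$1
    prefix=$2
    for f in "$dir"/"$prefix".*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example: report whether the current directory holds a libparakeet build.
if has_shared_lib . libparakeet; then
    echo "libparakeet shared library present"
else
    echo "libparakeet shared library missing, run 'make' first" >&2
fi
```

The same helper accepts either `libparakeet.so`, a versioned `libparakeet.so.X.Y`, or `libparakeet.dylib`, matching the intent of the `ls`-based test in the diff.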
@@ -3,11 +3,17 @@ set -e
 
 CURDIR=$(dirname "$(realpath "$0")")
 
-export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
+if [ "$(uname)" = "Darwin" ]; then
+    export DYLD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${DYLD_LIBRARY_PATH:-}"
+    export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.dylib"
+else
+    export LD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${LD_LIBRARY_PATH:-}"
+    export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.so"
+fi
 
 # If a self-contained ld.so was packaged, route through it so the
 # packaged libc / libstdc++ are used instead of the host's (matches the
-# whisper backend's runtime layout).
+# whisper backend's runtime layout). Linux only.
 if [ -f "$CURDIR/lib/ld.so" ]; then
     echo "Using lib/ld.so"
     exec "$CURDIR/lib/ld.so" "$CURDIR/parakeet-cpp-grpc" "$@"
@@ -16,7 +16,15 @@ cp -rfv $CURDIR/run.sh $CURDIR/package/
 cp -rfLv $CURDIR/sources/go-piper/piper-phonemize/pi/lib/* $CURDIR/package/lib/
 
 # Detect architecture and copy appropriate libraries
-if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
+if [ "$(uname)" = "Darwin" ]; then
+    # macOS has no glibc loader to bundle. The piper binary links its bundled
+    # libs (libucd, libespeak-ng, libpiper_phonemize, libonnxruntime) via
+    # @rpath but ships with no LC_RPATH, so dyld aborts at launch with
+    # "Library not loaded: @rpath/libucd.dylib ... no LC_RPATH's found".
+    # Add an @loader_path/lib rpath so @rpath resolves to package/lib/.
+    echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
+    install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/piper"
+elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
     # x86_64 architecture
     echo "Detected x86_64 architecture, copying x86_64 libraries..."
     cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so
@@ -1,15 +1,20 @@
 #!/bin/bash
 set -ex
 
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
-export ESPEAK_NG_DATA=$CURDIR/espeak-ng-data
-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
+export ESPEAK_NG_DATA="$CURDIR"/espeak-ng-data
 
-# If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
-    echo "Using lib/ld.so"
-    exec $CURDIR/lib/ld.so $CURDIR/piper "$@"
+if [ "$(uname)" = "Darwin" ]; then
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+else
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
 fi
 
-exec $CURDIR/piper "$@"
+# If there is a lib/ld.so, use it
+if [ -f "$CURDIR"/lib/ld.so ]; then
+    echo "Using lib/ld.so"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/piper "$@"
+fi
+
+exec "$CURDIR"/piper "$@"
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
 
 # qwentts.cpp version
 QWEN3TTS_REPO?=https://github.com/ServeurpersoCom/qwentts.cpp
-QWEN3TTS_CPP_VERSION?=4536dcdce27c3764a93a06d6bf64026b124962f5
+QWEN3TTS_CPP_VERSION?=9dbe7ea26a01b30fccb117ae5e86807c1dc23d42
 SO_TARGET?=libgoqwen3ttscpp.so
 
 CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
@@ -65,8 +65,8 @@ UNAME_S := $(shell uname -s)
 ifeq ($(UNAME_S),Linux)
 	VARIANT_TARGETS = libgoqwen3ttscpp-avx.so libgoqwen3ttscpp-avx2.so libgoqwen3ttscpp-avx512.so libgoqwen3ttscpp-fallback.so
 else
-	# On non-Linux (e.g., Darwin), build only fallback variant
-	VARIANT_TARGETS = libgoqwen3ttscpp-fallback.so
+	# On non-Linux (e.g., Darwin), build only fallback variant (as a dylib)
+	VARIANT_TARGETS = libgoqwen3ttscpp-fallback.dylib
 endif
 
 qwen3-tts-cpp: main.go goqwen3ttscpp.go $(VARIANT_TARGETS)
@@ -78,7 +78,7 @@ package: qwen3-tts-cpp
 build: package
 
 clean: purge
-	rm -rf libgoqwen3ttscpp*.so package sources/qwentts.cpp qwen3-tts-cpp
+	rm -rf libgoqwen3ttscpp*.so libgoqwen3ttscpp*.dylib package sources/qwentts.cpp qwen3-tts-cpp
 
 purge:
 	rm -rf build*
@@ -110,13 +110,20 @@ libgoqwen3ttscpp-fallback.so: sources/qwentts.cpp
 	SO_TARGET=libgoqwen3ttscpp-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgoqwen3ttscpp-custom
 	rm -rf build-libgoqwen3ttscpp-fallback.so
 
+# Build fallback variant as a dylib (Darwin)
+libgoqwen3ttscpp-fallback.dylib: sources/qwentts.cpp
+	$(info ${GREEN}I qwen3-tts-cpp build info:fallback (dylib)${RESET})
+	SO_TARGET=libgoqwen3ttscpp-fallback.dylib CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgoqwen3ttscpp-custom
+	rm -rf build-libgoqwen3ttscpp-fallback.dylib
+
 libgoqwen3ttscpp-custom: CMakeLists.txt cpp/goqwen3ttscpp.cpp cpp/goqwen3ttscpp.h
 	mkdir -p build-$(SO_TARGET) && \
 	cd build-$(SO_TARGET) && \
 	cmake .. $(CMAKE_ARGS) && \
 	cmake --build . --config Release -j$(JOBS) --target goqwen3ttscpp && \
 	cd .. && \
-	mv build-$(SO_TARGET)/libgoqwen3ttscpp.so ./$(SO_TARGET)
+	(mv build-$(SO_TARGET)/libgoqwen3ttscpp.so ./$(SO_TARGET) 2>/dev/null || \
+	 mv build-$(SO_TARGET)/libgoqwen3ttscpp.dylib ./$(SO_TARGET) 2>/dev/null)
 
 test: qwen3-tts-cpp
 	@echo "Running qwen3-tts-cpp tests..."
@@ -4,6 +4,7 @@ package main
 import (
 	"flag"
 	"os"
+	"runtime"
 
 	"github.com/ebitengine/purego"
 	grpc "github.com/mudler/LocalAI/pkg/grpc"
@@ -21,7 +22,11 @@ type LibFuncs struct {
 func main() {
 	libName := os.Getenv("QWEN3TTS_LIBRARY")
 	if libName == "" {
-		libName = "./libgoqwen3ttscpp-fallback.so"
+		if runtime.GOOS == "darwin" {
+			libName = "./libgoqwen3ttscpp-fallback.dylib"
+		} else {
+			libName = "./libgoqwen3ttscpp-fallback.so"
+		}
 	}
 
 	lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
@@ -12,7 +12,8 @@ REPO_ROOT="${CURDIR}/../../.."
 mkdir -p $CURDIR/package/lib
 
 cp -avf $CURDIR/qwen3-tts-cpp $CURDIR/package/
-cp -fv $CURDIR/libgoqwen3ttscpp-*.so $CURDIR/package/
+cp -fv $CURDIR/libgoqwen3ttscpp-*.so $CURDIR/package/ 2>/dev/null || true
+cp -fv $CURDIR/libgoqwen3ttscpp-*.dylib $CURDIR/package/ 2>/dev/null || true
 cp -fv $CURDIR/run.sh $CURDIR/package/
 
 # Detect architecture and copy appropriate libraries
@@ -2,7 +2,7 @@
 set -ex
 
 # Get the absolute current dir where the script is located
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
 cd /
 
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
     grep -e "flags" /proc/cpuinfo | head -1
 fi
 
-LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.so"
+if [ "$(uname)" = "Darwin" ]; then
+    # macOS: single dylib variant (Metal or Accelerate)
+    LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.dylib"
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+else
+    LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.so"
 
-if [ "$(uname)" != "Darwin" ]; then
     if grep -q -e "\savx\s" /proc/cpuinfo ; then
         echo "CPU: AVX found OK"
-        if [ -e $CURDIR/libgoqwen3ttscpp-avx.so ]; then
+        if [ -e "$CURDIR"/libgoqwen3ttscpp-avx.so ]; then
             LIBRARY="$CURDIR/libgoqwen3ttscpp-avx.so"
         fi
     fi
 
     if grep -q -e "\savx2\s" /proc/cpuinfo ; then
         echo "CPU: AVX2 found OK"
-        if [ -e $CURDIR/libgoqwen3ttscpp-avx2.so ]; then
+        if [ -e "$CURDIR"/libgoqwen3ttscpp-avx2.so ]; then
             LIBRARY="$CURDIR/libgoqwen3ttscpp-avx2.so"
         fi
     fi
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
     # Check avx 512
     if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
         echo "CPU: AVX512F found OK"
-        if [ -e $CURDIR/libgoqwen3ttscpp-avx512.so ]; then
+        if [ -e "$CURDIR"/libgoqwen3ttscpp-avx512.so ]; then
             LIBRARY="$CURDIR/libgoqwen3ttscpp-avx512.so"
         fi
     fi
+
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
 fi
 
-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
 export QWEN3TTS_LIBRARY=$LIBRARY
 
 # If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
     echo "Using lib/ld.so"
     echo "Using library: $LIBRARY"
-    exec $CURDIR/lib/ld.so $CURDIR/qwen3-tts-cpp "$@"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/qwen3-tts-cpp "$@"
 fi
 
 echo "Using library: $LIBRARY"
-exec $CURDIR/qwen3-tts-cpp "$@"
+exec "$CURDIR"/qwen3-tts-cpp "$@"
|
|||||||
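The launcher above picks the most capable library variant by checking CPU flags in ascending order, so a later match overrides an earlier one. The selection rule on its own can be sketched as follows. `select_variant` is a hypothetical helper name, and the sketch deliberately omits the `-e` file-existence checks that the real script performs before switching variants.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository): given a space-separated CPU
# flag list, print the best variant suffix. Checks run in ascending order so
# the last match wins, as in the launcher script.
select_variant() {
    cpu_flags=" $1 "
    variant="fallback"
    case "$cpu_flags" in *" avx "*)     variant="avx" ;; esac
    case "$cpu_flags" in *" avx2 "*)    variant="avx2" ;; esac
    case "$cpu_flags" in *" avx512f "*) variant="avx512" ;; esac
    echo "$variant"
}

# Example on Linux: read the first "flags" line, if /proc/cpuinfo exists.
if [ -r /proc/cpuinfo ]; then
    host_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    echo "selected variant: $(select_variant "$host_flags")"
fi
```

Padding the flag list with spaces keeps `avx` from matching inside `avx2` or `avx512f`, which is the role the `\s` anchors play in the script's `grep` patterns.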
@@ -34,6 +34,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
 else ifeq ($(BUILD_TYPE),vulkan)
 	CMAKE_ARGS+=-DGGML_VULKAN=ON -DRFDETR_GGML_VULKAN=ON
 else ifeq ($(OS),Darwin)
+	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
+	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
 	ifneq ($(BUILD_TYPE),metal)
 		CMAKE_ARGS+=-DGGML_METAL=OFF
 	else
@@ -71,7 +73,7 @@ ifeq ($(UNAME_S),Linux)
 	VARIANT_TARGETS = librfdetrcpp-avx.so librfdetrcpp-avx2.so librfdetrcpp-avx512.so librfdetrcpp-fallback.so
 else
 	# On non-Linux (e.g., Darwin), build only fallback variant
-	VARIANT_TARGETS = librfdetrcpp-fallback.so
+	VARIANT_TARGETS = librfdetrcpp-fallback.dylib
 endif
 
 rfdetr-cpp: main.go gorfdetrcpp.go $(VARIANT_TARGETS)
@@ -83,7 +85,7 @@ package: rfdetr-cpp
 build: package
 
 clean: purge
-	rm -rf librfdetrcpp*.so rfdetr-cpp package sources
+	rm -rf librfdetrcpp*.so librfdetrcpp*.dylib rfdetr-cpp package sources
 
 purge:
 	rm -rf build*
@@ -110,11 +112,19 @@ librfdetrcpp-avx512.so: sources/rt-detr.cpp
 endif
 
 # Build fallback variant (all platforms)
+ifeq ($(UNAME_S),Darwin)
+librfdetrcpp-fallback.dylib: sources/rt-detr.cpp
+	rm -rfv build-$@
+	$(info ${GREEN}I rfdetr-cpp build info:fallback${RESET})
+	SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) librfdetrcpp-custom
+	rm -rfv build-$@
+else
 librfdetrcpp-fallback.so: sources/rt-detr.cpp
 	rm -rfv build-$@
 	$(info ${GREEN}I rfdetr-cpp build info:fallback${RESET})
 	SO_TARGET=$@ CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) librfdetrcpp-custom
 	rm -rfv build-$@
+endif
 
 librfdetrcpp-custom: CMakeLists.txt
 	mkdir -p build-$(SO_TARGET) && \
@@ -122,7 +132,8 @@ librfdetrcpp-custom: CMakeLists.txt
 	cmake .. $(CMAKE_ARGS) && \
 	cmake --build . --config Release -j$(JOBS) && \
 	cd .. && \
-	mv build-$(SO_TARGET)/librfdetrcpp.so ./$(SO_TARGET)
+	(mv build-$(SO_TARGET)/librfdetrcpp.so ./$(SO_TARGET) 2>/dev/null || \
+	 mv build-$(SO_TARGET)/librfdetrcpp.dylib ./$(SO_TARGET) 2>/dev/null)
 
 all: rfdetr-cpp package
 
@@ -9,6 +9,7 @@ package main
 import (
 	"flag"
 	"os"
+	"runtime"
 
 	"github.com/ebitengine/purego"
 	grpc "github.com/mudler/LocalAI/pkg/grpc"
@@ -27,7 +28,11 @@ func main() {
 	// Get library name from environment variable, default to fallback
 	libName := os.Getenv("RFDETR_LIBRARY")
 	if libName == "" {
-		libName = "./librfdetrcpp-fallback.so"
+		if runtime.GOOS == "darwin" {
+			libName = "./librfdetrcpp-fallback.dylib"
+		} else {
+			libName = "./librfdetrcpp-fallback.so"
+		}
 	}
 
 	rfdetrLib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
@@ -10,7 +10,8 @@ REPO_ROOT="${CURDIR}/../../.."
 # Create lib directory
 mkdir -p $CURDIR/package/lib
 
-cp -avf $CURDIR/librfdetrcpp-*.so $CURDIR/package/
+cp -fv $CURDIR/librfdetrcpp-*.so $CURDIR/package/ 2>/dev/null || true
+cp -fv $CURDIR/librfdetrcpp-*.dylib $CURDIR/package/ 2>/dev/null || true
 cp -avf $CURDIR/rfdetr-cpp $CURDIR/package/
 cp -fv $CURDIR/run.sh $CURDIR/package/
 
@@ -2,7 +2,7 @@
 set -ex
 
 # Get the absolute current dir where the script is located
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
 cd /
 
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
     grep -e "flags" /proc/cpuinfo | head -1
 fi
 
-LIBRARY="$CURDIR/librfdetrcpp-fallback.so"
+if [ "$(uname)" = "Darwin" ]; then
+    # macOS: single dylib variant (Metal or Accelerate)
+    LIBRARY="$CURDIR/librfdetrcpp-fallback.dylib"
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+else
+    LIBRARY="$CURDIR/librfdetrcpp-fallback.so"
 
-if [ "$(uname)" != "Darwin" ]; then
     if grep -q -e "\savx\s" /proc/cpuinfo ; then
         echo "CPU: AVX found OK"
-        if [ -e $CURDIR/librfdetrcpp-avx.so ]; then
+        if [ -e "$CURDIR"/librfdetrcpp-avx.so ]; then
             LIBRARY="$CURDIR/librfdetrcpp-avx.so"
         fi
     fi
 
     if grep -q -e "\savx2\s" /proc/cpuinfo ; then
         echo "CPU: AVX2 found OK"
-        if [ -e $CURDIR/librfdetrcpp-avx2.so ]; then
+        if [ -e "$CURDIR"/librfdetrcpp-avx2.so ]; then
             LIBRARY="$CURDIR/librfdetrcpp-avx2.so"
         fi
     fi
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
     # Check avx 512
     if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
         echo "CPU: AVX512F found OK"
-        if [ -e $CURDIR/librfdetrcpp-avx512.so ]; then
+        if [ -e "$CURDIR"/librfdetrcpp-avx512.so ]; then
             LIBRARY="$CURDIR/librfdetrcpp-avx512.so"
         fi
     fi
+
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
 fi
 
-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
 export RFDETR_LIBRARY=$LIBRARY
 
 # If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
     echo "Using lib/ld.so"
     echo "Using library: $LIBRARY"
-    exec $CURDIR/lib/ld.so $CURDIR/rfdetr-cpp "$@"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/rfdetr-cpp "$@"
 fi
 
 echo "Using library: $LIBRARY"
-exec $CURDIR/rfdetr-cpp "$@"
+exec "$CURDIR"/rfdetr-cpp "$@"
@@ -31,6 +31,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
 else ifeq ($(BUILD_TYPE),vulkan)
 	CMAKE_ARGS+=-DGGML_VULKAN=ON
 else ifeq ($(OS),Darwin)
+	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
+	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
 	ifneq ($(BUILD_TYPE),metal)
 		CMAKE_ARGS+=-DGGML_METAL=OFF
 	else
@@ -66,7 +68,7 @@ ifeq ($(UNAME_S),Linux)
 	VARIANT_TARGETS = libgosam3-avx.so libgosam3-avx2.so libgosam3-avx512.so libgosam3-fallback.so
 else
 	# On non-Linux (e.g., Darwin), build only fallback variant
-	VARIANT_TARGETS = libgosam3-fallback.so
+	VARIANT_TARGETS = libgosam3-fallback.dylib
 endif
 
 sam3-cpp: main.go gosam3.go $(VARIANT_TARGETS)
@@ -78,7 +80,7 @@ package: sam3-cpp
 build: package
 
 clean: purge
-	rm -rf libgosam3*.so sam3-cpp package sources
+	rm -rf libgosam3*.so libgosam3*.dylib sam3-cpp package sources
 
 purge:
 	rm -rf build*
@@ -105,11 +107,19 @@ libgosam3-avx512.so: sources/sam3.cpp
 endif
 
 # Build fallback variant (all platforms)
+ifeq ($(UNAME_S),Darwin)
+libgosam3-fallback.dylib: sources/sam3.cpp
+	$(MAKE) purge
+	$(info ${GREEN}I sam3-cpp build info:fallback${RESET})
+	SO_TARGET=libgosam3-fallback.dylib CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgosam3-custom
+	rm -rfv build*
+else
 libgosam3-fallback.so: sources/sam3.cpp
 	$(MAKE) purge
 	$(info ${GREEN}I sam3-cpp build info:fallback${RESET})
 	SO_TARGET=libgosam3-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgosam3-custom
 	rm -rfv build*
+endif
 
 libgosam3-custom: CMakeLists.txt cpp/gosam3.cpp cpp/gosam3.h
 	mkdir -p build-$(SO_TARGET) && \
@@ -117,6 +127,7 @@ libgosam3-custom: CMakeLists.txt cpp/gosam3.cpp cpp/gosam3.h
 	cmake .. $(CMAKE_ARGS) && \
 	cmake --build . --config Release -j$(JOBS) && \
 	cd .. && \
-	mv build-$(SO_TARGET)/libgosam3.so ./$(SO_TARGET)
+	(mv build-$(SO_TARGET)/libgosam3.so ./$(SO_TARGET) 2>/dev/null || \
+	 mv build-$(SO_TARGET)/libgosam3.dylib ./$(SO_TARGET) 2>/dev/null)
 
 all: sam3-cpp package
|
|||||||
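Note on the `libgosam3-custom` recipe above: both `mv` attempts redirect stderr to `/dev/null`, so if CMake produced neither `libgosam3.so` nor `libgosam3.dylib`, the recipe fails without saying why. A minimal sketch of a more explicit variant follows. `move_built_lib` is a hypothetical helper name and is not part of this change.

```shell
# Hypothetical helper (not in the Makefile): move whichever shared library
# CMake produced into its final location and report clearly if none exists.
# Arguments: build directory, library base name, destination path.
move_built_lib() {
    local build_dir="$1" base="$2" target="$3"
    local ext
    for ext in so dylib; do
        if [ -e "$build_dir/$base.$ext" ]; then
            mv "$build_dir/$base.$ext" "$target"
            return 0
        fi
    done
    echo "error: neither $base.so nor $base.dylib found in $build_dir" >&2
    return 1
}
```

In a recipe this might be called as `move_built_lib build-$(SO_TARGET) libgosam3 ./$(SO_TARGET)`, assuming the function is made available to the recipe's shell.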
@@ -3,6 +3,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ func main() {
|
|||||||
// Get library name from environment variable, default to fallback
|
// Get library name from environment variable, default to fallback
|
||||||
libName := os.Getenv("SAM3_LIBRARY")
|
libName := os.Getenv("SAM3_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgosam3-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgosam3-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgosam3-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
gosamLib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
gosamLib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -10,7 +10,8 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
# Create lib directory
|
# Create lib directory
|
||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/libgosam3-*.so $CURDIR/package/
|
cp -fv $CURDIR/libgosam3-*.so $CURDIR/package/ 2>/dev/null || true
|
||||||
|
cp -fv $CURDIR/libgosam3-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -avf $CURDIR/sam3-cpp $CURDIR/package/
|
cp -avf $CURDIR/sam3-cpp $CURDIR/package/
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgosam3-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
|
LIBRARY="$CURDIR/libgosam3-fallback.dylib"
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgosam3-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgosam3-avx.so ]; then
|
if [ -e "$CURDIR"/libgosam3-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosam3-avx.so"
|
LIBRARY="$CURDIR/libgosam3-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgosam3-avx2.so ]; then
|
if [ -e "$CURDIR"/libgosam3-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosam3-avx2.so"
|
LIBRARY="$CURDIR/libgosam3-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgosam3-avx512.so ]; then
|
if [ -e "$CURDIR"/libgosam3-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosam3-avx512.so"
|
LIBRARY="$CURDIR/libgosam3-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export SAM3_LIBRARY=$LIBRARY
|
export SAM3_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/sam3-cpp "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/sam3-cpp "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/sam3-cpp "$@"
|
exec "$CURDIR"/sam3-cpp "$@"
|
||||||
|
|||||||
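The selection logic in the sam3 `run.sh` above can be summarised as a single function: on Darwin use the one fallback `.dylib`, otherwise start from the fallback `.so` and upgrade to the best AVX variant that the CPU supports and that is present on disk. The sketch below is an illustration only. `pick_library` is a hypothetical name, and it takes the OS name and the CPU flag list as arguments instead of calling `uname` and reading `/proc/cpuinfo`, so the logic can be exercised anywhere.

```shell
# Hypothetical sketch of the variant selection done in run.sh.
# Arguments: directory, library prefix, OS name, space-separated CPU flags.
pick_library() {
    local dir="$1" prefix="$2" os="$3" flags="$4"
    if [ "$os" = "Darwin" ]; then
        # macOS ships a single dylib variant.
        echo "$dir/$prefix-fallback.dylib"
        return 0
    fi
    local library="$dir/$prefix-fallback.so"
    local variant flag
    for variant in avx avx2 avx512; do
        flag="$variant"
        # run.sh checks the avx512f flag for the avx512 variant.
        if [ "$variant" = "avx512" ]; then
            flag="avx512f"
        fi
        case " $flags " in
            *" $flag "*)
                # Later (wider) variants override earlier ones when present.
                if [ -e "$dir/$prefix-$variant.so" ]; then
                    library="$dir/$prefix-$variant.so"
                fi
                ;;
        esac
    done
    echo "$library"
}
```

The same shape applies to the stablediffusion-ggml, vibevoice-cpp, voxtral and whisper scripts in this diff, which differ only in the library prefix and the environment variable that is exported.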
@@ -7,6 +7,7 @@ import (
|
|||||||
"fmt"
|
"fmt"
|
||||||
"os"
|
"os"
|
||||||
"path/filepath"
|
"path/filepath"
|
||||||
|
"runtime"
|
||||||
"strconv"
|
"strconv"
|
||||||
"strings"
|
"strings"
|
||||||
"sync"
|
"sync"
|
||||||
@@ -238,11 +239,19 @@ func loadSherpaLibs() error {
|
|||||||
func loadSherpaLibsOnce() error {
|
func loadSherpaLibsOnce() error {
|
||||||
shimLib := os.Getenv("SHERPA_SHIM_LIBRARY")
|
shimLib := os.Getenv("SHERPA_SHIM_LIBRARY")
|
||||||
if shimLib == "" {
|
if shimLib == "" {
|
||||||
shimLib = "libsherpa-shim.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
shimLib = "libsherpa-shim.dylib"
|
||||||
|
} else {
|
||||||
|
shimLib = "libsherpa-shim.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
capiLib := os.Getenv("SHERPA_ONNX_LIBRARY")
|
capiLib := os.Getenv("SHERPA_ONNX_LIBRARY")
|
||||||
if capiLib == "" {
|
if capiLib == "" {
|
||||||
capiLib = "libsherpa-onnx-c-api.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
capiLib = "libsherpa-onnx-c-api.dylib"
|
||||||
|
} else {
|
||||||
|
capiLib = "libsherpa-onnx-c-api.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
shim, err := purego.Dlopen(shimLib, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
shim, err := purego.Dlopen(shimLib, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -1,13 +1,19 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
export SHERPA_SHIM_LIBRARY="$CURDIR"/lib/libsherpa-shim.dylib
|
||||||
echo "Using lib/ld.so"
|
export SHERPA_ONNX_LIBRARY="$CURDIR"/lib/libsherpa-onnx-c-api.dylib
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/sherpa-onnx "$@"
|
else
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
exec $CURDIR/sherpa-onnx "$@"
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
|
echo "Using lib/ld.so"
|
||||||
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/sherpa-onnx "$@"
|
||||||
|
fi
|
||||||
|
|
||||||
|
exec "$CURDIR"/sherpa-onnx "$@"
|
||||||
|
|||||||
@@ -15,7 +15,14 @@ cp -avf $CURDIR/run.sh $CURDIR/package/
|
|||||||
cp -rfLv $CURDIR/backend-assets/lib/* $CURDIR/package/lib/
|
cp -rfLv $CURDIR/backend-assets/lib/* $CURDIR/package/lib/
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS has no glibc loader to bundle. silero-vad links its bundled
|
||||||
|
# libonnxruntime via @rpath but ships with no LC_RPATH, so dyld can't find
|
||||||
|
# it at runtime. Add an @loader_path/lib rpath so @rpath resolves to
|
||||||
|
# package/lib/ (matching the piper darwin fix, #10525).
|
||||||
|
echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
|
||||||
|
install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/silero-vad"
|
||||||
|
elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
|
||||||
# x86_64 architecture
|
# x86_64 architecture
|
||||||
echo "Detected x86_64 architecture, copying x86_64 libraries..."
|
echo "Detected x86_64 architecture, copying x86_64 libraries..."
|
||||||
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so
|
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so
|
||||||
|
|||||||
@@ -1,14 +1,18 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
# If there is a lib/ld.so, use it
|
else
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
echo "Using lib/ld.so"
|
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/silero-vad "$@"
|
|
||||||
fi
|
fi
|
||||||
|
|
||||||
exec $CURDIR/silero-vad "$@"
|
# If there is a lib/ld.so, use it
|
||||||
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
|
echo "Using lib/ld.so"
|
||||||
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/silero-vad "$@"
|
||||||
|
fi
|
||||||
|
|
||||||
|
exec "$CURDIR"/silero-vad "$@"
|
||||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
|||||||
|
|
||||||
# stablediffusion.cpp (ggml)
|
# stablediffusion.cpp (ggml)
|
||||||
STABLEDIFFUSION_GGML_REPO?=https://github.com/leejet/stable-diffusion.cpp
|
STABLEDIFFUSION_GGML_REPO?=https://github.com/leejet/stable-diffusion.cpp
|
||||||
STABLEDIFFUSION_GGML_VERSION?=f440ad9c29dd8bc34e5d1f4b863832b96d6ea05f
|
STABLEDIFFUSION_GGML_VERSION?=8caa3f908ae6d4a4bef531e73b9a969f266a3d1f
|
||||||
|
|
||||||
CMAKE_ARGS+=-DGGML_MAX_NAME=128
|
CMAKE_ARGS+=-DGGML_MAX_NAME=128
|
||||||
|
|
||||||
@@ -131,6 +131,7 @@ libgosd-custom: CMakeLists.txt cpp/gosd.cpp cpp/gosd.h
|
|||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) && \
|
cmake --build . --config Release -j$(JOBS) && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgosd.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libgosd.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgosd.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
all: stablediffusion-ggml package
|
all: stablediffusion-ggml package
|
||||||
@@ -3,6 +3,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ func main() {
|
|||||||
// Get library name from environment variable, default to fallback
|
// Get library name from environment variable, default to fallback
|
||||||
libName := os.Getenv("SD_LIBRARY")
|
libName := os.Getenv("SD_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgosd-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgosd-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgosd-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/libgosd-*.so $CURDIR/package/
|
cp -avf $CURDIR/libgosd-*.so $CURDIR/package/
|
||||||
|
cp -fv $CURDIR/libgosd-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -avf $CURDIR/stablediffusion-ggml $CURDIR/package/
|
cp -avf $CURDIR/stablediffusion-ggml $CURDIR/package/
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -12,19 +12,28 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgosd-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single library variant (Metal or Accelerate). The gosd target is
|
||||||
|
# built as a CMake MODULE, which emits a .dylib for a SHARED build but a
|
||||||
|
# .so for a MODULE build on Apple, so prefer .dylib and fall back to .so.
|
||||||
|
LIBRARY="$CURDIR/libgosd-fallback.dylib"
|
||||||
|
if [ ! -e "$LIBRARY" ]; then
|
||||||
|
LIBRARY="$CURDIR/libgosd-fallback.so"
|
||||||
|
fi
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgosd-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgosd-avx.so ]; then
|
if [ -e "$CURDIR"/libgosd-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosd-avx.so"
|
LIBRARY="$CURDIR/libgosd-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgosd-avx2.so ]; then
|
if [ -e "$CURDIR"/libgosd-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosd-avx2.so"
|
LIBRARY="$CURDIR/libgosd-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
@@ -32,21 +41,22 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
# Check avx 512
|
# Check avx 512
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgosd-avx512.so ]; then
|
if [ -e "$CURDIR"/libgosd-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgosd-avx512.so"
|
LIBRARY="$CURDIR/libgosd-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export SD_LIBRARY=$LIBRARY
|
export SD_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it
|
# If there is a lib/ld.so, use it
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/stablediffusion-ggml "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/stablediffusion-ggml "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/stablediffusion-ggml "$@"
|
exec "$CURDIR"/stablediffusion-ggml "$@"
|
||||||
|
|||||||
@@ -16,6 +16,7 @@ import (
|
|||||||
"os"
|
"os"
|
||||||
"path/filepath"
|
"path/filepath"
|
||||||
"regexp"
|
"regexp"
|
||||||
|
"runtime"
|
||||||
"strings"
|
"strings"
|
||||||
"time"
|
"time"
|
||||||
"unicode"
|
"unicode"
|
||||||
@@ -943,7 +944,13 @@ func InitializeONNXRuntime() error {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
if libPath == "" {
|
if libPath == "" {
|
||||||
libPath = "/usr/local/lib/libonnxruntime.so"
|
// LocalAI: default to the platform-native shared library
|
||||||
|
// extension when nothing else is found (dyld vs ld.so).
|
||||||
|
if runtime.GOOS == "darwin" {
|
||||||
|
libPath = "/usr/local/lib/libonnxruntime.dylib"
|
||||||
|
} else {
|
||||||
|
libPath = "/usr/local/lib/libonnxruntime.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
ort.SetSharedLibraryPath(libPath)
|
ort.SetSharedLibraryPath(libPath)
|
||||||
|
|||||||
@@ -32,6 +32,10 @@ elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
|
|||||||
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
|
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
|
||||||
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
|
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
|
||||||
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
|
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
|
||||||
|
elif [ $(uname -s) = "Darwin" ]; then
|
||||||
|
# macOS: dyld resolves the bundled .dylib via DYLD_LIBRARY_PATH (set in
|
||||||
|
# run.sh); there is no ld.so loader nor glibc to bundle.
|
||||||
|
echo "Detected Darwin"
|
||||||
else
|
else
|
||||||
echo "Error: Could not detect architecture"
|
echo "Error: Could not detect architecture"
|
||||||
exit 1
|
exit 1
|
||||||
|
|||||||
@@ -1,14 +1,21 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
export ONNXRUNTIME_LIB_PATH=$CURDIR/lib/libonnxruntime.so
|
# macOS uses dyld: there is no ld.so loader, and the search path env
|
||||||
|
# var is DYLD_LIBRARY_PATH. ONNX Runtime ships as a .dylib here.
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.dylib
|
||||||
|
else
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
|
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.so
|
||||||
|
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/supertonic "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/supertonic "$@"
|
||||||
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
exec $CURDIR/supertonic "$@"
|
exec "$CURDIR"/supertonic "$@"
|
||||||
|
|||||||
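A side note on the `export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH` pattern used in these `run.sh` scripts: when the variable is unset, the result ends with a trailing colon, and the glibc loader treats an empty entry as the current working directory. A possible way to avoid the empty entry is sketched below. `prepend_lib_path` is a hypothetical helper, not something this change introduces.

```shell
# Hypothetical helper: prepend a directory to a colon-separated search path
# without leaving a trailing colon when the existing value is empty.
# Arguments: directory to prepend, current value of the path variable.
prepend_lib_path() {
    local dir="$1" current="$2"
    # ${current:+:$current} expands to ":$current" only if current is non-empty.
    echo "$dir${current:+:$current}"
}
```

A script could then use `export LD_LIBRARY_PATH="$(prepend_lib_path "$CURDIR/lib" "$LD_LIBRARY_PATH")"`, and likewise for `DYLD_LIBRARY_PATH` on macOS.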
@@ -70,8 +70,8 @@ UNAME_S := $(shell uname -s)
|
|||||||
ifeq ($(UNAME_S),Linux)
|
ifeq ($(UNAME_S),Linux)
|
||||||
VARIANT_TARGETS = libgovibevoicecpp-avx.so libgovibevoicecpp-avx2.so libgovibevoicecpp-avx512.so libgovibevoicecpp-fallback.so
|
VARIANT_TARGETS = libgovibevoicecpp-avx.so libgovibevoicecpp-avx2.so libgovibevoicecpp-avx512.so libgovibevoicecpp-fallback.so
|
||||||
else
|
else
|
||||||
# On non-Linux (e.g., Darwin), build only fallback variant
|
# On non-Linux (e.g., Darwin), build only fallback variant (as a dylib)
|
||||||
VARIANT_TARGETS = libgovibevoicecpp-fallback.so
|
VARIANT_TARGETS = libgovibevoicecpp-fallback.dylib
|
||||||
endif
|
endif
|
||||||
|
|
||||||
vibevoice-cpp: main.go govibevoicecpp.go $(VARIANT_TARGETS)
|
vibevoice-cpp: main.go govibevoicecpp.go $(VARIANT_TARGETS)
|
||||||
@@ -83,7 +83,7 @@ package: vibevoice-cpp
|
|||||||
build: package
|
build: package
|
||||||
|
|
||||||
clean: purge
|
clean: purge
|
||||||
rm -rf libgovibevoicecpp*.so package sources/vibevoice.cpp vibevoice-cpp
|
rm -rf libgovibevoicecpp*.so libgovibevoicecpp*.dylib package sources/vibevoice.cpp vibevoice-cpp
|
||||||
|
|
||||||
purge:
|
purge:
|
||||||
rm -rf build*
|
rm -rf build*
|
||||||
@@ -119,13 +119,21 @@ libgovibevoicecpp-fallback.so: sources/vibevoice.cpp
|
|||||||
SO_TARGET=libgovibevoicecpp-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgovibevoicecpp-custom
|
SO_TARGET=libgovibevoicecpp-fallback.so CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgovibevoicecpp-custom
|
||||||
rm -rfv build*
|
rm -rfv build*
|
||||||
|
|
||||||
|
# Build fallback variant as a dylib (Darwin)
|
||||||
|
libgovibevoicecpp-fallback.dylib: sources/vibevoice.cpp
|
||||||
|
$(MAKE) purge
|
||||||
|
$(info ${GREEN}I vibevoice-cpp build info:fallback (dylib)${RESET})
|
||||||
|
SO_TARGET=libgovibevoicecpp-fallback.dylib CMAKE_ARGS="$(CMAKE_ARGS) -DGGML_AVX=off -DGGML_AVX2=off -DGGML_AVX512=off -DGGML_FMA=off -DGGML_F16C=off -DGGML_BMI2=off" $(MAKE) libgovibevoicecpp-custom
|
||||||
|
rm -rfv build*
|
||||||
|
|
||||||
libgovibevoicecpp-custom: CMakeLists.txt cpp/govibevoicecpp.cpp cpp/govibevoicecpp.h
|
libgovibevoicecpp-custom: CMakeLists.txt cpp/govibevoicecpp.cpp cpp/govibevoicecpp.h
|
||||||
mkdir -p build-$(SO_TARGET) && \
|
mkdir -p build-$(SO_TARGET) && \
|
||||||
cd build-$(SO_TARGET) && \
|
cd build-$(SO_TARGET) && \
|
||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) --target govibevoicecpp && \
|
cmake --build . --config Release -j$(JOBS) --target govibevoicecpp && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgovibevoicecpp.so ./$(SO_TARGET)
|
(mv build-$(SO_TARGET)/libgovibevoicecpp.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgovibevoicecpp.dylib ./$(SO_TARGET) 2>/dev/null)
|
||||||
|
|
||||||
test: vibevoice-cpp
|
test: vibevoice-cpp
|
||||||
@echo "Running vibevoice-cpp tests..."
|
@echo "Running vibevoice-cpp tests..."
|
||||||
|
|||||||
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -21,7 +22,11 @@ type LibFuncs struct {
|
|||||||
func main() {
|
func main() {
|
||||||
libName := os.Getenv("VIBEVOICECPP_LIBRARY")
|
libName := os.Getenv("VIBEVOICECPP_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgovibevoicecpp-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgovibevoicecpp-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgovibevoicecpp-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
lib, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -12,7 +12,8 @@ REPO_ROOT="${CURDIR}/../../.."
|
|||||||
mkdir -p $CURDIR/package/lib
|
mkdir -p $CURDIR/package/lib
|
||||||
|
|
||||||
cp -avf $CURDIR/vibevoice-cpp $CURDIR/package/
|
cp -avf $CURDIR/vibevoice-cpp $CURDIR/package/
|
||||||
cp -fv $CURDIR/libgovibevoicecpp-*.so $CURDIR/package/
|
cp -fv $CURDIR/libgovibevoicecpp-*.so $CURDIR/package/ 2>/dev/null || true
|
||||||
|
cp -fv $CURDIR/libgovibevoicecpp-*.dylib $CURDIR/package/ 2>/dev/null || true
|
||||||
cp -fv $CURDIR/run.sh $CURDIR/package/
|
cp -fv $CURDIR/run.sh $CURDIR/package/
|
||||||
|
|
||||||
# Detect architecture and copy appropriate libraries
|
# Detect architecture and copy appropriate libraries
|
||||||
|
|||||||
@@ -1,7 +1,7 @@
|
|||||||
#!/bin/bash
|
#!/bin/bash
|
||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -11,39 +11,44 @@ if [ "$(uname)" != "Darwin" ]; then
|
|||||||
grep -e "flags" /proc/cpuinfo | head -1
|
grep -e "flags" /proc/cpuinfo | head -1
|
||||||
fi
|
fi
|
||||||
|
|
||||||
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.so"
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
|
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.dylib"
|
||||||
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
|
else
|
||||||
|
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.so"
|
||||||
|
|
||||||
if [ "$(uname)" != "Darwin" ]; then
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgovibevoicecpp-avx.so ]; then
|
if [ -e "$CURDIR"/libgovibevoicecpp-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgovibevoicecpp-avx.so"
|
LIBRARY="$CURDIR/libgovibevoicecpp-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgovibevoicecpp-avx2.so ]; then
|
if [ -e "$CURDIR"/libgovibevoicecpp-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgovibevoicecpp-avx2.so"
|
LIBRARY="$CURDIR/libgovibevoicecpp-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX512F found OK"
|
echo "CPU: AVX512F found OK"
|
||||||
if [ -e $CURDIR/libgovibevoicecpp-avx512.so ]; then
|
if [ -e "$CURDIR"/libgovibevoicecpp-avx512.so ]; then
|
||||||
LIBRARY="$CURDIR/libgovibevoicecpp-avx512.so"
|
LIBRARY="$CURDIR/libgovibevoicecpp-avx512.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
|
||||||
export VIBEVOICECPP_LIBRARY=$LIBRARY
|
export VIBEVOICECPP_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/vibevoice-cpp "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/vibevoice-cpp "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/vibevoice-cpp "$@"
|
exec "$CURDIR"/vibevoice-cpp "$@"
|
||||||
|
|||||||
@@ -2,7 +2,7 @@
|
|||||||
set -ex
|
set -ex
|
||||||
|
|
||||||
# Get the absolute current dir where the script is located
|
# Get the absolute current dir where the script is located
|
||||||
CURDIR=$(dirname "$(realpath $0)")
|
CURDIR=$(dirname "$(realpath "$0")")
|
||||||
|
|
||||||
cd /
|
cd /
|
||||||
|
|
||||||
@@ -15,35 +15,35 @@ fi
|
|||||||
if [ "$(uname)" = "Darwin" ]; then
|
if [ "$(uname)" = "Darwin" ]; then
|
||||||
# macOS: single dylib variant (Metal or Accelerate)
|
# macOS: single dylib variant (Metal or Accelerate)
|
||||||
LIBRARY="$CURDIR/libgovoxtral-fallback.dylib"
|
LIBRARY="$CURDIR/libgovoxtral-fallback.dylib"
|
||||||
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
|
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
|
||||||
else
|
else
|
||||||
LIBRARY="$CURDIR/libgovoxtral-fallback.so"
|
LIBRARY="$CURDIR/libgovoxtral-fallback.so"
|
||||||
|
|
||||||
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX found OK"
|
echo "CPU: AVX found OK"
|
||||||
if [ -e $CURDIR/libgovoxtral-avx.so ]; then
|
if [ -e "$CURDIR"/libgovoxtral-avx.so ]; then
|
||||||
LIBRARY="$CURDIR/libgovoxtral-avx.so"
|
LIBRARY="$CURDIR/libgovoxtral-avx.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
|
||||||
echo "CPU: AVX2 found OK"
|
echo "CPU: AVX2 found OK"
|
||||||
if [ -e $CURDIR/libgovoxtral-avx2.so ]; then
|
if [ -e "$CURDIR"/libgovoxtral-avx2.so ]; then
|
||||||
LIBRARY="$CURDIR/libgovoxtral-avx2.so"
|
LIBRARY="$CURDIR/libgovoxtral-avx2.so"
|
||||||
fi
|
fi
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
|
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
|
||||||
fi
|
fi
|
||||||
|
|
||||||
export VOXTRAL_LIBRARY=$LIBRARY
|
export VOXTRAL_LIBRARY=$LIBRARY
|
||||||
|
|
||||||
# If there is a lib/ld.so, use it (Linux only)
|
# If there is a lib/ld.so, use it (Linux only)
|
||||||
if [ -f $CURDIR/lib/ld.so ]; then
|
if [ -f "$CURDIR"/lib/ld.so ]; then
|
||||||
echo "Using lib/ld.so"
|
echo "Using lib/ld.so"
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/lib/ld.so $CURDIR/voxtral "$@"
|
exec "$CURDIR"/lib/ld.so "$CURDIR"/voxtral "$@"
|
||||||
fi
|
fi
|
||||||
|
|
||||||
echo "Using library: $LIBRARY"
|
echo "Using library: $LIBRARY"
|
||||||
exec $CURDIR/voxtral "$@"
|
exec "$CURDIR"/voxtral "$@"
|
||||||
|
|||||||
@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
|
|||||||
|
|
||||||
# whisper.cpp version
|
# whisper.cpp version
|
||||||
WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
|
WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
|
||||||
WHISPER_CPP_VERSION?=bae6bc02b1940bbfb87b6a0299c565e563b916d1
|
WHISPER_CPP_VERSION?=43d78af5be58f41d6ffbc227d608f104577741ea
|
||||||
SO_TARGET?=libgowhisper.so
|
SO_TARGET?=libgowhisper.so
|
||||||
|
|
||||||
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
|
||||||
@@ -117,6 +117,7 @@ libgowhisper-custom: CMakeLists.txt cpp/gowhisper.cpp cpp/gowhisper.h
|
|||||||
cmake .. $(CMAKE_ARGS) && \
|
cmake .. $(CMAKE_ARGS) && \
|
||||||
cmake --build . --config Release -j$(JOBS) && \
|
cmake --build . --config Release -j$(JOBS) && \
|
||||||
cd .. && \
|
cd .. && \
|
||||||
mv build-$(SO_TARGET)/libgowhisper.so ./$(SO_TARGET)
|
mv build-$(SO_TARGET)/libgowhisper.so ./$(SO_TARGET) 2>/dev/null || \
|
||||||
|
mv build-$(SO_TARGET)/libgowhisper.dylib ./$(SO_TARGET:.so=.dylib)
|
||||||
|
|
||||||
all: whisper package
|
all: whisper package
|
||||||
|
|||||||
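The whisper recipe above differs from the sam3, stablediffusion-ggml and vibevoice-cpp ones: rather than passing a `SO_TARGET` that already ends in `.dylib`, it keeps the `.so` name and rewrites the suffix with the make substitution reference `$(SO_TARGET:.so=.dylib)` when only a `.dylib` was built. For readers less familiar with that syntax, here is a rough shell equivalent. `to_dylib_name` is a hypothetical name used purely for illustration.

```shell
# Hypothetical illustration of what $(SO_TARGET:.so=.dylib) does to one name:
# replace a trailing ".so" with ".dylib", leave any other name unchanged.
to_dylib_name() {
    local name="$1"
    case "$name" in
        *.so) echo "${name%.so}.dylib" ;;
        *)    echo "$name" ;;
    esac
}
```

For example, `to_dylib_name libgowhisper-fallback.so` yields the `libgowhisper-fallback.dylib` name that `main.go` and `run.sh` look for on Darwin.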
@@ -4,6 +4,7 @@ package main
|
|||||||
import (
|
import (
|
||||||
"flag"
|
"flag"
|
||||||
"os"
|
"os"
|
||||||
|
"runtime"
|
||||||
|
|
||||||
"github.com/ebitengine/purego"
|
"github.com/ebitengine/purego"
|
||||||
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
grpc "github.com/mudler/LocalAI/pkg/grpc"
|
||||||
@@ -22,7 +23,11 @@ func main() {
|
|||||||
// Get library name from environment variable, default to fallback
|
// Get library name from environment variable, default to fallback
|
||||||
libName := os.Getenv("WHISPER_LIBRARY")
|
libName := os.Getenv("WHISPER_LIBRARY")
|
||||||
if libName == "" {
|
if libName == "" {
|
||||||
libName = "./libgowhisper-fallback.so"
|
if runtime.GOOS == "darwin" {
|
||||||
|
libName = "./libgowhisper-fallback.dylib"
|
||||||
|
} else {
|
||||||
|
libName = "./libgowhisper-fallback.so"
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
gosd, err := purego.Dlopen(libName, purego.RTLD_NOW|purego.RTLD_GLOBAL)
|
||||||
|
|||||||
@@ -12,7 +12,8 @@ REPO_ROOT="${CURDIR}/../../.."
 mkdir -p $CURDIR/package/lib
 
 cp -avf $CURDIR/whisper $CURDIR/package/
-cp -fv $CURDIR/libgowhisper-*.so $CURDIR/package/
+cp -fv $CURDIR/libgowhisper-*.so $CURDIR/package/ 2>/dev/null || true
+cp -fv $CURDIR/libgowhisper-*.dylib $CURDIR/package/ 2>/dev/null || true
 cp -fv $CURDIR/run.sh $CURDIR/package/
 
 # Detect architecture and copy appropriate libraries
@@ -2,7 +2,7 @@
 set -ex
 
 # Get the absolute current dir where the script is located
-CURDIR=$(dirname "$(realpath $0)")
+CURDIR=$(dirname "$(realpath "$0")")
 
 cd /
 
@@ -12,19 +12,23 @@ if [ "$(uname)" != "Darwin" ]; then
     grep -e "flags" /proc/cpuinfo | head -1
 fi
 
-LIBRARY="$CURDIR/libgowhisper-fallback.so"
+if [ "$(uname)" = "Darwin" ]; then
+    # macOS: single dylib variant (Metal or Accelerate)
+    LIBRARY="$CURDIR/libgowhisper-fallback.dylib"
+    export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
+else
+    LIBRARY="$CURDIR/libgowhisper-fallback.so"
 
-if [ "$(uname)" != "Darwin" ]; then
     if grep -q -e "\savx\s" /proc/cpuinfo ; then
         echo "CPU: AVX found OK"
-        if [ -e $CURDIR/libgowhisper-avx.so ]; then
+        if [ -e "$CURDIR"/libgowhisper-avx.so ]; then
             LIBRARY="$CURDIR/libgowhisper-avx.so"
         fi
     fi
 
     if grep -q -e "\savx2\s" /proc/cpuinfo ; then
         echo "CPU: AVX2 found OK"
-        if [ -e $CURDIR/libgowhisper-avx2.so ]; then
+        if [ -e "$CURDIR"/libgowhisper-avx2.so ]; then
             LIBRARY="$CURDIR/libgowhisper-avx2.so"
         fi
     fi
@@ -32,21 +36,22 @@ if [ "$(uname)" != "Darwin" ]; then
     # Check avx 512
     if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
         echo "CPU: AVX512F found OK"
-        if [ -e $CURDIR/libgowhisper-avx512.so ]; then
+        if [ -e "$CURDIR"/libgowhisper-avx512.so ]; then
             LIBRARY="$CURDIR/libgowhisper-avx512.so"
         fi
     fi
+
+    export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
 fi
 
-export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
 export WHISPER_LIBRARY=$LIBRARY
 
 # If there is a lib/ld.so, use it
-if [ -f $CURDIR/lib/ld.so ]; then
+if [ -f "$CURDIR"/lib/ld.so ]; then
     echo "Using lib/ld.so"
     echo "Using library: $LIBRARY"
-    exec $CURDIR/lib/ld.so $CURDIR/whisper "$@"
+    exec "$CURDIR"/lib/ld.so "$CURDIR"/whisper "$@"
 fi
 
 echo "Using library: $LIBRARY"
-exec $CURDIR/whisper "$@"
+exec "$CURDIR"/whisper "$@"
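The library selection that the run.sh hunks above implement can be sketched as a small, side-effect-free function. This is only an illustration under stated assumptions: `pick_library` and its three arguments (directory, OS name, CPU flag string) are hypothetical names that do not appear in the diff, and the real script reads `uname` and `/proc/cpuinfo` directly instead of taking them as parameters.

```shell
#!/bin/sh
# Hypothetical helper (not part of the diff): pick_library DIR OS CPU_FLAGS
# Prints the library path that the selection logic above would settle on.
pick_library() {
    dir=$1
    os=$2
    flags=" $3 "    # pad with spaces so every flag is space-delimited

    if [ "$os" = "Darwin" ]; then
        # macOS: a single dylib variant, no CPU feature probing
        echo "$dir/libgowhisper-fallback.dylib"
        return 0
    fi

    lib="$dir/libgowhisper-fallback.so"
    # Checked in ascending order; a later match overrides an earlier one,
    # but only when the matching variant file actually exists.
    for isa in avx avx2 avx512; do
        flag=$isa
        if [ "$isa" = "avx512" ]; then
            flag=avx512f    # the cpuinfo flag name differs from the file suffix
        fi
        case "$flags" in
            *" $flag "*)
                if [ -e "$dir/libgowhisper-$isa.so" ]; then
                    lib="$dir/libgowhisper-$isa.so"
                fi
                ;;
        esac
    done
    echo "$lib"
}
```

Passing the OS name and flag string as arguments keeps the function testable on any host; on Linux the caller would supply the first `flags` line from `/proc/cpuinfo`.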