Compare commits

1 Commits

Author SHA1 Message Date
dependabot[bot]
40ff385c05 chore(deps): bump torch in /backend/python/vllm
Bumps torch from 2.9.1+cpu to 2.12.1+xpu.

---
updated-dependencies:
- dependency-name: torch
  dependency-version: 2.12.1+xpu
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-06-25 14:01:50 +00:00
103 changed files with 840 additions and 2401 deletions

View File

@@ -102,24 +102,6 @@ Multi-arch backends are NOT a single matrix entry with `platforms: 'linux/amd64,
Entries whose `dockerfile` is `./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` must also set a `builder-base-image` field pointing at a prebuilt base from `quay.io/go-skynet/ci-cache:base-grpc-*` (CI builds these via `.github/workflows/base-images.yml`). The mapping is by `(build-type, platforms)` — see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~25-35 min cold). Local `make backends/<name>` ignores `builder-base-image` and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.
### Cover every OS the project supports (Linux **and** Darwin)
`.github/backend-matrix.yml` has two matrices, and they are the source of truth for which OS a backend ships on:
- `include:` — the **Linux** matrix (x86_64 + arm64; CPU and CUDA / ROCm / SYCL / Vulkan).
- `includeDarwin:` — the **macOS / Apple Silicon** matrix (arm64; Metal where the engine supports it, otherwise a native arm64 CPU build).
**A new backend must target every OS it can build for — do not ship Linux-only by default.** A backend that appears only under `include:` is silently unavailable on macOS even when its code would run there. Most C/C++/GGML engines build on Darwin out of the box (ggml defaults `GGML_METAL=ON` on Apple, so a plain build is Metal-enabled), and many Python backends do too (CPU / MPS wheels). If a backend genuinely cannot support an OS (e.g. CUDA-only, no CPU variant), state that in the PR description instead of omitting it silently.
Wiring a backend into `includeDarwin:` is more than the matrix entry:
1. **`includeDarwin:` entry** — `tag-suffix: "-metal-darwin-arm64-<backend>"`, `build-type: "metal"`, `lang: "go"` for go+ggml backends; omit `build-type` for the bespoke C++ ones (llama-cpp / ds4 / privacy-filter). Match an existing entry of the same shape.
2. **`backend/index.yaml`** — add `metal:` to the backend's `capabilities` map (main and `-development`) and concrete `metal-<backend>` / `metal-<backend>-development` image entries pointing at the `-metal-darwin-arm64-<backend>` images.
3. **C/C++ backends only** — add an `inferBackendPathDarwin` case in `scripts/changed-backends.js` returning `backend/cpp/<backend>/` (the generic fallthrough assumes `backend/<lang>/`, which is wrong for a C++ source tree driven with `lang: go`), and give `run.sh` a Darwin branch that exports `DYLD_LIBRARY_PATH` instead of `LD_LIBRARY_PATH`. If the build is bespoke (single `grpc-server` + dylib bundling), model it on `scripts/build/ds4-darwin.sh` and add a `backends/<backend>-darwin` make target plus a gated step in `.github/workflows/backend_build_darwin.yml`.
4. **C++ proto gotcha** — if the backend compiles the generated gRPC/protobuf in a separate CMake target (e.g. `hw_grpc_proto`), that target must link `protobuf::libprotobuf` + `gRPC::grpc++` so the Homebrew include dirs propagate; otherwise macOS fails with `google/protobuf/runtime_version.h not found` (Linux hides this because apt headers sit in `/usr/include`).
The CI path filter only builds a backend on a PR when a file under its directory changes, so a darwin-only YAML edit builds nothing — touch a file under `backend/<lang>/<backend>/` (a one-line comment is enough) in the same PR.
## 3. Add Backend Metadata to `backend/index.yaml`
**Step 3a: Add Meta Definition**
@@ -243,7 +225,6 @@ After adding a new backend, verify:
- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/backend-matrix.yml` for all desired platforms (per-arch entries with `platform-tag` for multi-arch; `builder-base-image` for llama-cpp / ik-llama-cpp / turboquant)
- [ ] **OS coverage considered**: added to `includeDarwin:` (macOS/Apple Silicon) if the backend can build there — with the `backend/index.yaml` `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD branch and `inferBackendPathDarwin` case for C++ backends — or the PR explains why an OS is unsupported. Do not ship Linux-only by default.
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between workflow file and index.yaml
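
The last checklist item (tag suffixes matching between the matrix file and `index.yaml`) can be spot-checked from a checkout. The following is a minimal sketch, not a script from the repository: `darwin_tag_suffix` is a hypothetical helper, `sam3-cpp` is just one backend name taken from the matrix, and it assumes the suffix appears literally in both files.

```shell
#!/bin/sh
# Hypothetical helper: builds the Darwin tag suffix for a backend name,
# following the "-metal-darwin-arm64-<backend>" pattern described above.
darwin_tag_suffix() {
    printf '%s\n' "-metal-darwin-arm64-$1"
}

suffix=$(darwin_tag_suffix sam3-cpp)

# Report whether each file mentions the suffix. `--` stops option parsing,
# which matters because the suffix itself starts with a dash.
for f in .github/backend-matrix.yml backend/index.yaml; do
    if [ -f "$f" ] && grep -qF -- "$suffix" "$f"; then
        echo "$f: ok"
    else
        echo "$f: missing $suffix"
    fi
done
```

Run it from the repository root; outside a checkout both files are simply reported as missing.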

View File

@@ -2,28 +2,6 @@
# Matrix data for backend container image builds.
# Consumed by scripts/changed-backends.js for both backend.yml and backend_pr.yml.
# This file is NOT a workflow — it has no top-level 'on:' or 'jobs:'.
#
# OS / platform coverage — READ THIS WHEN ADDING A BACKEND
# --------------------------------------------------------
# This file is the source of truth for which OS each backend is built and
# published for. A backend ships ONLY for the matrices it appears in:
# - Linux -> the `include:` matrix below (x86_64 + arm64; CPU and
# CUDA / ROCm / SYCL / Vulkan variants).
# - macOS -> the `includeDarwin:` matrix (Apple Silicon / arm64; Metal where
# the engine supports it, otherwise a native arm64 CPU build).
#
# New backends must target EVERY OS they can build for, not just Linux. A backend
# listed only under `include:` is silently unavailable on macOS even when its code
# would run there. Most C/C++/GGML engines build on Darwin (ggml defaults
# GGML_METAL=ON on Apple, so a plain build is Metal-enabled), and many Python
# backends do too (CPU / MPS). If a backend genuinely cannot support an OS, say so
# in its PR description rather than silently omitting it.
#
# Adding a backend to `includeDarwin:` is more than one line — see the darwin
# checklist in .agents/adding-backends.md (includeDarwin entry, the index.yaml
# `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD
# branch for C/C++ backends, and the inferBackendPathDarwin case in
# scripts/changed-backends.js so the path filter actually builds it).
# Linux matrix (consumed by backend-jobs).
include:
@@ -4944,37 +4922,6 @@ includeDarwin:
tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
build-type: "metal"
lang: "go"
# Vision/utility C++/ggml backends (go+cgo). Their Makefiles already carry a
# Darwin/Metal path (GGML_METAL=ON when build-type=metal); this just builds and
# publishes the metal image so Apple Silicon can install them.
- backend: "depth-anything-cpp"
tag-suffix: "-metal-darwin-arm64-depth-anything-cpp"
build-type: "metal"
lang: "go"
- backend: "locate-anything-cpp"
tag-suffix: "-metal-darwin-arm64-locate-anything-cpp"
build-type: "metal"
lang: "go"
- backend: "rfdetr-cpp"
tag-suffix: "-metal-darwin-arm64-rfdetr-cpp"
build-type: "metal"
lang: "go"
- backend: "sam3-cpp"
tag-suffix: "-metal-darwin-arm64-sam3-cpp"
build-type: "metal"
lang: "go"
# privacy-filter (PII/NER) is a C++/ggml backend built by a bespoke darwin
# script (make backends/privacy-filter-darwin); ggml defaults Metal ON on Apple
# so the build is Metal-enabled. lang=go drives runner/toolchain selection only.
- backend: "privacy-filter"
tag-suffix: "-metal-darwin-arm64-privacy-filter"
lang: "go"
# LocalVQE has no Metal path; on Apple Silicon it builds CPU-only (GGML_METAL
# OFF) but is still a native arm64 image. Uses the darwin/metal build profile.
- backend: "localvqe"
tag-suffix: "-metal-darwin-arm64-localvqe"
build-type: "metal"
lang: "go"
- backend: "voxtral"
tag-suffix: "-metal-darwin-arm64-voxtral"
build-type: "metal"

View File

@@ -99,7 +99,6 @@ jobs:
/opt/homebrew/Cellar/xxhash
/opt/homebrew/Cellar/zstd
/opt/homebrew/Cellar/nlohmann-json
/opt/homebrew/Cellar/opus
key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
- name: Dependencies
@@ -114,12 +113,7 @@ jobs:
# nlohmann-json is header-only and required by the ds4 backend
# (dsml_renderer.cpp includes <nlohmann/json.hpp>); on Linux it comes
# from the apt-installed nlohmann-json3-dev in the build image.
# opus + pkg-config are required by the opus go backend: its
# Makefile/package.sh call `pkg-config --cflags/--libs opus` to build
# libopusshim.dylib and to locate libopus.dylib for bundling. brew's
# pkg-config defaults its search path to the Homebrew prefix so the
# opus.pc is found.
brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config
brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json
# Force-reinstall ccache so brew re-validates its full runtime-dep
# closure on every run. This is the durable fix: when the upstream
# ccache formula gains a new transitive dep (as it has multiple times
@@ -138,7 +132,7 @@ jobs:
# and decides "already installed" without re-linking, so on a cache-
# hit run the formulas aren't on PATH. Force-link them; --overwrite
# tolerates pre-existing symlinks from earlier installs.
brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config 2>/dev/null || true
brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json 2>/dev/null || true
- name: Save Homebrew cache
if: github.event_name != 'pull_request' && steps.brew-cache.outputs.cache-hit != 'true'
@@ -159,7 +153,6 @@ jobs:
/opt/homebrew/Cellar/xxhash
/opt/homebrew/Cellar/zstd
/opt/homebrew/Cellar/nlohmann-json
/opt/homebrew/Cellar/opus
key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
# ---- ccache for llama.cpp CMake builds ----
@@ -235,17 +228,8 @@ jobs:
run: |
make backends/ds4-darwin
# privacy-filter is a C++/ggml backend like ds4 - a single grpc-server with
# otool dylib bundling - so it gets its own bespoke darwin script rather than
# the generic build-darwin-go-backend path.
- name: Build privacy-filter backend (Darwin Metal)
if: inputs.backend == 'privacy-filter'
run: |
make protogen-go
make backends/privacy-filter-darwin
- name: Build ${{ inputs.backend }}-darwin
if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4' && inputs.backend != 'privacy-filter'
if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4'
run: |
make protogen-go
BACKEND=${{ inputs.backend }} BUILD_TYPE=${{ inputs.build-type }} USE_PIP=${{ inputs.use-pip }} make build-darwin-${{ inputs.lang }}-backend

View File

@@ -24,11 +24,6 @@ jobs:
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
MACOS_SIGN_P12: ${{ secrets.MACOS_CERTIFICATE }}
MACOS_SIGN_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
launcher-build-darwin:
runs-on: macos-latest
steps:
@@ -40,19 +35,9 @@ jobs:
uses: actions/setup-go@v5
with:
go-version: 1.23
- name: Import signing certificate
env:
MACOS_CERTIFICATE: ${{ secrets.MACOS_CERTIFICATE }}
MACOS_CERTIFICATE_PWD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
MACOS_CI_KEYCHAIN_PWD: ${{ secrets.MACOS_CI_KEYCHAIN_PWD }}
run: bash contrib/macos/sign-and-notarize.sh import-cert
- name: Build, sign and notarize the DMG
env:
MACOS_SIGN_IDENTITY: ${{ secrets.MACOS_SIGN_IDENTITY }}
MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
run: make release-launcher-darwin
- name: Build launcher for macOS ARM64
run: |
make build-launcher-darwin
- name: Upload DMG to Release
uses: softprops/action-gh-release@v3
with:

View File

@@ -121,19 +121,3 @@ jobs:
detached: true
connect-timeout-seconds: 180
limit-access-to-actor: true
# Fast standalone unit tests for the backends' pure C++ helpers - currently the
# llama-cpp message reconstruction (backend/cpp/llama-cpp/message_content.h),
# which guards the OpenAI chat content normalization (mudler/LocalAI#10524,
# #7324, #7528). The runner discovers every *_test.cpp under backend/cpp/, so
# new pure-C++ unit tests are picked up with no CI changes. These need only the
# C++ stdlib + nlohmann/json, so they run on every PR without the full
# llama.cpp + gRPC backend build. (The same suite is also wired as an opt-in
# CMake/ctest target, -DLLAMA_GRPC_BUILD_TESTS=ON, for in-backend-build runs.)
tests-backend-cpp:
runs-on: ubuntu-latest
steps:
- name: Clone
uses: actions/checkout@v7
- name: Run backend C++ unit tests
run: make test-backend-cpp

.gitignore (vendored, 3 changed lines)
View File

@@ -94,6 +94,3 @@ core/http/react-ui/test-results/
# SDD / brainstorm scratch (agent-driven development)
.superpowers/
# Local Apple signing material (never commit)
.certs/

View File

@@ -9,8 +9,7 @@ source:
enabled: true
name_template: '{{ .ProjectName }}-{{ .Tag }}-source'
builds:
- id: local-ai
main: ./cmd/local-ai
- main: ./cmd/local-ai
env:
- CGO_ENABLED=0
ldflags:
@@ -36,19 +35,3 @@ snapshot:
version_template: "{{ .Tag }}-next"
changelog:
use: github-native
# Sign + notarize the macOS server binary via the quill backend (runs on Linux,
# no macOS runner needed). Disabled automatically when MACOS_SIGN_P12 is unset
# (forks / PRs), so those builds stay unsigned and green.
notarize:
macos:
- enabled: '{{ isEnvSet "MACOS_SIGN_P12" }}'
ids:
- local-ai
sign:
certificate: "{{.Env.MACOS_SIGN_P12}}"
password: "{{.Env.MACOS_SIGN_PASSWORD}}"
notarize:
issuer_id: "{{.Env.MACOS_NOTARY_ISSUER_ID}}"
key_id: "{{.Env.MACOS_NOTARY_KEY_ID}}"
key: "{{.Env.MACOS_NOTARY_KEY}}"
wait: true
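
The `enabled: '{{ isEnvSet "MACOS_SIGN_P12" }}'` gate in this hunk keeps fork and PR builds unsigned but passing. The same idea can be sketched in shell under stated assumptions: `signing_enabled` is a hypothetical function, and the real `contrib/macos/sign-and-notarize.sh` is not shown in this diff, so this is only an illustration of the "no-op when secrets are unset" pattern.

```shell
#!/bin/sh
# Hypothetical guard, for illustration only; the real
# contrib/macos/sign-and-notarize.sh is not part of this diff.
signing_enabled() {
    # True only when the certificate variable is set and non-empty.
    [ -n "${MACOS_SIGN_P12:-}" ]
}

if signing_enabled; then
    echo "signing: enabled"
else
    echo "signing: skipped (MACOS_SIGN_P12 is unset)"
fi
```

Exiting successfully on the skip path, rather than failing, is what lets builds without secrets stay green.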

View File

@@ -43,5 +43,4 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
- **New API endpoints**: LocalAI advertises its capability surface in several independent places — swagger `@Tags`, `/api/instructions` registry, auth `RouteFeatureRegistry`, React UI `capabilities.js`, docs. Read [.agents/api-endpoints-and-auth.md](.agents/api-endpoints-and-auth.md) and follow its checklist — missing any surface means clients, admins, and the UI won't know the endpoint exists.
- **Admin endpoints → MCP tool**: every admin endpoint that an admin would manage conversationally (install/list/edit/toggle/upgrade) MUST also be exposed as an MCP tool in `pkg/mcp/localaitools/`. The LocalAI Assistant chat modality and the standalone `local-ai mcp-server` consume that package; drift between REST and MCP is a real risk. Read [.agents/localai-assistant-mcp.md](.agents/localai-assistant-mcp.md) — the `TestToolHTTPRouteMappingComplete` test fails until you wire the new tool and update the route map.
- **Build**: Inspect `Makefile` and `.github/workflows/` — ask the user before running long builds
- **Backend OS coverage**: a new backend must target every OS it can build for, not just Linux. `.github/backend-matrix.yml` has two matrices — `include:` (Linux) and `includeDarwin:` (macOS / Apple Silicon). Most C/C++/GGML and many Python backends build on Darwin too — wire the `includeDarwin` entry + `backend/index.yaml` `metal:` entries, or say in the PR why an OS is unsupported. See the darwin checklist in [.agents/adding-backends.md](.agents/adding-backends.md).
- **UI**: The active UI is the React app in `core/http/react-ui/`. The older Alpine.js/HTML UI in `core/http/static/` is pending deprecation — all new UI work goes in the React UI

View File

@@ -1,5 +1,5 @@
# Disable parallel execution for backend builds
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter backends/privacy-filter-darwin
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter
GOCMD=go
GOTEST=$(GOCMD) test
@@ -103,7 +103,7 @@ COVERAGE_E2E_LABELS?=!real-models
COVERAGE_EXCLUDE_RE?=grpc/proto/.*[.]pb[.]go
.PHONY: all test test-coverage test-coverage-baseline test-coverage-check test-backend-cpp test-ui test-ui-coverage-baseline test-ui-coverage-check install-hooks build vendor lint lint-all
.PHONY: all test test-coverage test-coverage-baseline test-coverage-check test-ui test-ui-coverage-baseline test-ui-coverage-check install-hooks build vendor lint lint-all
all: help
@@ -201,13 +201,6 @@ test: prepare-test
OPUS_SHIM_LIBRARY=$(abspath ./pkg/opus/shim/libopusshim.so) \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts $(TEST_FLAKES) --fail-fast -v -r $(TEST_PATHS)
## Compiles and runs the standalone C++ unit tests for the backends (pure
## helpers that depend only on the stdlib + nlohmann/json, no full backend
## build). Discovers every *_test.cpp under backend/cpp/ - see
## backend/cpp/run-unit-tests.sh. Set NLOHMANN_INCLUDE to skip the header fetch.
test-backend-cpp:
bash backend/cpp/run-unit-tests.sh
## Runs the core suite ($(TEST_PATHS)) with statement-coverage instrumentation
## and writes a merged profile to $(COVERAGE_PROFILE). Deliberately omits
## --fail-fast so a single failure doesn't truncate the coverage number, and
@@ -1136,10 +1129,6 @@ backends/ds4-darwin: build
bash ./scripts/build/ds4-darwin.sh
./local-ai backends install "ocifile://$(abspath ./backend-images/ds4.tar)"
backends/privacy-filter-darwin: build
bash ./scripts/build/privacy-filter-darwin.sh
./local-ai backends install "ocifile://$(abspath ./backend-images/privacy-filter.tar)"
build-darwin-python-backend: build
bash ./scripts/build/python-darwin.sh
@@ -1460,32 +1449,13 @@ docs: docs/static/gallery.html
########################################################
## fyne cross-platform build
# Build LocalAI.app from the launcher via fyne (metadata read from cmd/launcher/FyneApp.toml).
# Signing happens via contrib/macos/sign-and-notarize.sh, which is a no-op when the signing
# secrets are unset, so unsigned local/fork builds keep working.
build-launcher-darwin:
rm -rf dist/LocalAI.app cmd/launcher/LocalAI.app
mkdir -p dist
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os darwin -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)
mv cmd/launcher/LocalAI.app dist/LocalAI.app
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.app
# Wrap the (signed) app into a drag-to-Applications DMG via hdiutil, then sign the DMG.
dmg-launcher-darwin: build-launcher-darwin
rm -rf dist/dmg dist/LocalAI.dmg
mkdir -p dist/dmg
cp -R dist/LocalAI.app dist/dmg/LocalAI.app
ln -s /Applications dist/dmg/Applications
hdiutil create -volname "LocalAI" -srcfolder dist/dmg -ov -format UDZO dist/LocalAI.dmg
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.dmg
# Submit the DMG to Apple notarization and staple the ticket (no-op without notary secrets).
notarize-launcher-darwin: dmg-launcher-darwin
bash contrib/macos/sign-and-notarize.sh notarize dist/LocalAI.dmg
# Single entrypoint for CI: build -> sign app -> dmg -> sign dmg -> notarize -> staple.
release-launcher-darwin: notarize-launcher-darwin
@echo "dist/LocalAI.dmg is ready"
build-launcher-darwin: build-launcher
go run github.com/tiagomelo/macos-dmg-creator/cmd/createdmg@latest \
--appName "LocalAI" \
--appBinaryPath "$(LAUNCHER_BINARY_NAME)" \
--bundleIdentifier "com.localai.launcher" \
--iconPath "core/http/static/logo.png" \
--outputDir "dist/"
build-launcher-linux:
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv LocalAI.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv launcher.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz

View File

@@ -1,5 +1,5 @@
IK_LLAMA_VERSION?=b84902d2ad27c34f989f23947200c4b91b1568fd
IK_LLAMA_VERSION?=d5507e33ae7ee2b7b41475f08044d3bde3b839ee
LLAMA_REPO?=https://github.com/ikawrakow/ik_llama.cpp
CMAKE_ARGS?=

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -13,28 +13,28 @@ grep -e "flags" /proc/cpuinfo | head -1
# ik_llama.cpp requires AVX2 — default to avx2 binary
BINARY=ik-llama-cpp-avx2
if [ -e "$CURDIR"/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
if [ -e $CURDIR/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 NOT found, using fallback"
BINARY=ik-llama-cpp-fallback
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec "$CURDIR"/$BINARY "$@"
exec $CURDIR/$BINARY "$@"
# We should never reach this point, however just in case we do, run fallback
exec "$CURDIR"/ik-llama-cpp-fallback "$@"
exec $CURDIR/ik-llama-cpp-fallback "$@"
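
The paired lines in this hunk differ only in whether `$CURDIR` is quoted. A minimal sketch of why that matters, assuming a hypothetical install directory whose name contains a space (the directory name is an assumption, not taken from the repository):

```shell
#!/bin/sh
# Hypothetical install directory with a space in its name.
tmp=$(mktemp -d)
CURDIR="$tmp/my backend"
mkdir -p "$CURDIR/lib"
touch "$CURDIR/lib/ld.so"

# Quoted: the whole path reaches `[` as a single word.
if [ -f "$CURDIR"/lib/ld.so ]; then
    quoted=found
else
    quoted=missing
fi

# Unquoted: the shell splits the path at the space, `[` receives two
# operands after -f, reports an error, and the test fails.
unquoted=$( [ -f $CURDIR/lib/ld.so ] 2>/dev/null && echo found || echo missing )

rm -rf "$tmp"
echo "quoted=$quoted unquoted=$unquoted"
```

With the unquoted form, the `lib/ld.so` check would silently take the wrong branch for such a path, and the `exec` lines would try to run a truncated path.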

View File

@@ -87,18 +87,3 @@ target_compile_features(${TARGET} PRIVATE cxx_std_11)
if(TARGET BUILD_INFO)
add_dependencies(${TARGET} BUILD_INFO)
endif()
# Unit test for the message-content normalization helper (message_content.h).
# Off by default so the normal backend build is untouched; enable with
# -DLLAMA_GRPC_BUILD_TESTS=ON and run via ctest. It reuses llama.cpp's vendored
# <nlohmann/json.hpp> (propagated by the common helpers library) so it has no
# extra dependency beyond what the backend already builds against.
option(LLAMA_GRPC_BUILD_TESTS "Build grpc-server unit tests" OFF)
if(LLAMA_GRPC_BUILD_TESTS)
enable_testing()
add_executable(message_content_test message_content_test.cpp message_content.h)
target_include_directories(message_content_test PRIVATE ${CMAKE_CURRENT_SOURCE_DIR})
target_link_libraries(message_content_test PRIVATE ${_LLAMA_COMMON_TARGET})
target_compile_features(message_content_test PRIVATE cxx_std_17)
add_test(NAME message_content_test COMMAND message_content_test)
endif()

View File

@@ -1,5 +1,5 @@
LLAMA_VERSION?=9d5d882d8cd0f0a9283d87ed5e6fe3ee0d925fb1
LLAMA_VERSION?=8be759e6f70d629638a7eb70db3824cbdcea370b
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?=

View File

@@ -39,7 +39,6 @@
#include "common.h"
#include "arg.h"
#include "chat-auto-parser.h"
#include "message_content.h"
#include <getopt.h>
#include <grpcpp/ext/proto_server_reflection_plugin.h>
#include <grpcpp/grpcpp.h>
@@ -1617,20 +1616,242 @@ public:
for (int i = 0; i < request->messages_size(); i++) {
const auto& msg = request->messages(i);
llama_grpc::ReconstructedMessageInput rin;
rin.role = msg.role();
rin.content = msg.content();
rin.name = msg.name();
rin.tool_call_id = msg.tool_call_id();
rin.reasoning_content = msg.reasoning_content();
rin.tool_calls = msg.tool_calls();
rin.is_last_user_msg = (i == last_user_msg_idx);
if (rin.is_last_user_msg) {
for (int j = 0; j < request->images_size(); j++) rin.images.push_back(request->images(j));
for (int j = 0; j < request->audios_size(); j++) rin.audios.push_back(request->audios(j));
for (int j = 0; j < request->videos_size(); j++) rin.videos.push_back(request->videos(j));
json msg_json;
msg_json["role"] = msg.role();
bool is_last_user_msg = (i == last_user_msg_idx);
bool has_images_or_audio = (request->images_size() > 0 || request->audios_size() > 0 || request->videos_size() > 0);
// Handle content - can be string, null, or array
// For multimodal content, we'll embed images/audio from separate fields
if (!msg.content().empty()) {
// Try to parse content as JSON to see if it's already an array
json content_val;
try {
content_val = json::parse(msg.content());
// Handle null values - convert to empty string to avoid template errors
if (content_val.is_null()) {
content_val = "";
}
} catch (const json::parse_error&) {
// Not JSON, treat as plain string
content_val = msg.content();
}
// If content is an object (e.g., from tool call failures), convert to string
if (content_val.is_object()) {
content_val = content_val.dump();
}
// If content is a string and this is the last user message with images/audio, combine them
if (content_val.is_string() && is_last_user_msg && has_images_or_audio) {
json content_array = json::array();
// Add text first
content_array.push_back({{"type", "text"}, {"text", content_val.get<std::string>()}});
// Add images
if (request->images_size() > 0) {
for (int j = 0; j < request->images_size(); j++) {
json image_chunk;
image_chunk["type"] = "image_url";
json image_url;
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
image_chunk["image_url"] = image_url;
content_array.push_back(image_chunk);
}
}
// Add audios
if (request->audios_size() > 0) {
for (int j = 0; j < request->audios_size(); j++) {
json audio_chunk;
audio_chunk["type"] = "input_audio";
json input_audio;
input_audio["data"] = request->audios(j);
input_audio["format"] = "wav"; // default, could be made configurable
audio_chunk["input_audio"] = input_audio;
content_array.push_back(audio_chunk);
}
}
if (request->videos_size() > 0) {
for (int j = 0; j < request->videos_size(); j++) {
json video_chunk;
video_chunk["type"] = "input_video";
json input_video;
input_video["data"] = request->videos(j);
video_chunk["input_video"] = input_video;
content_array.push_back(video_chunk);
}
}
msg_json["content"] = content_array;
} else {
// Use content as-is (already array or not last user message)
// Ensure null values are converted to empty string
if (content_val.is_null()) {
msg_json["content"] = "";
} else {
msg_json["content"] = content_val;
}
}
} else if (is_last_user_msg && has_images_or_audio) {
// If no content but this is the last user message with images/audio, create content array
json content_array = json::array();
if (request->images_size() > 0) {
for (int j = 0; j < request->images_size(); j++) {
json image_chunk;
image_chunk["type"] = "image_url";
json image_url;
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
image_chunk["image_url"] = image_url;
content_array.push_back(image_chunk);
}
}
if (request->audios_size() > 0) {
for (int j = 0; j < request->audios_size(); j++) {
json audio_chunk;
audio_chunk["type"] = "input_audio";
json input_audio;
input_audio["data"] = request->audios(j);
input_audio["format"] = "wav"; // default, could be made configurable
audio_chunk["input_audio"] = input_audio;
content_array.push_back(audio_chunk);
}
}
if (request->videos_size() > 0) {
for (int j = 0; j < request->videos_size(); j++) {
json video_chunk;
video_chunk["type"] = "input_video";
json input_video;
input_video["data"] = request->videos(j);
video_chunk["input_video"] = input_video;
content_array.push_back(video_chunk);
}
}
msg_json["content"] = content_array;
} else if (msg.role() == "tool") {
// Tool role messages must have content field set, even if empty
// Jinja templates expect content to be a string, not null or object
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d is tool role, content_empty=%d\n", i, msg.content().empty() ? 1 : 0);
if (msg.content().empty()) {
msg_json["content"] = "";
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): empty content, set to empty string\n", i);
} else {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): content exists: %s\n",
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
// Content exists, parse and ensure it's a string
json content_val;
try {
content_val = json::parse(msg.content());
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): parsed JSON, type=%s\n",
i, content_val.is_null() ? "null" :
content_val.is_object() ? "object" :
content_val.is_string() ? "string" :
content_val.is_array() ? "array" : "other");
// Handle null values - Jinja templates expect content to be a string, not null
if (content_val.is_null()) {
msg_json["content"] = "";
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): null content, converted to empty string\n", i);
} else if (content_val.is_object()) {
// If content is an object (e.g., from tool call failures/errors), convert to string
msg_json["content"] = content_val.dump();
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): object content, converted to string: %s\n",
i, content_val.dump().substr(0, std::min<size_t>(200, content_val.dump().size())).c_str());
} else if (content_val.is_string()) {
msg_json["content"] = content_val.get<std::string>();
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): string content, using as-is\n", i);
} else {
// For arrays or other types, convert to string
msg_json["content"] = content_val.dump();
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): %s content, converted to string\n",
i, content_val.is_array() ? "array" : "other type");
}
} catch (const json::parse_error&) {
// Not JSON, treat as plain string
msg_json["content"] = msg.content();
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (tool): not JSON, using as string\n", i);
}
}
} else {
// Ensure all messages have content set (fallback for any unhandled cases)
// Jinja templates expect content to be present, default to empty string if not set
if (!msg_json.contains("content")) {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d (role=%s): no content field, adding empty string\n",
i, msg.role().c_str());
msg_json["content"] = "";
}
}
messages_json.push_back(llama_grpc::build_reconstructed_message(rin));
// Add optional fields for OpenAI-compatible message format
if (!msg.name().empty()) {
msg_json["name"] = msg.name();
}
if (!msg.tool_call_id().empty()) {
msg_json["tool_call_id"] = msg.tool_call_id();
}
if (!msg.reasoning_content().empty()) {
msg_json["reasoning_content"] = msg.reasoning_content();
}
if (!msg.tool_calls().empty()) {
// Parse tool_calls JSON string and add to message
try {
json tool_calls = json::parse(msg.tool_calls());
msg_json["tool_calls"] = tool_calls;
SRV_INF("[TOOL CALLS DEBUG] PredictStream: Message %d has tool_calls: %s\n", i, tool_calls.dump().c_str());
// IMPORTANT: If message has tool_calls but content is empty or not set,
// set content to space " " instead of empty string "", because llama.cpp's
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
// which causes template errors when accessing message.content[:tool_start_length]
if (!msg_json.contains("content") || (msg_json.contains("content") && msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d has tool_calls but empty content, setting to space\n", i);
msg_json["content"] = " ";
}
// Log each tool call with name and arguments
if (tool_calls.is_array()) {
for (size_t tc_idx = 0; tc_idx < tool_calls.size(); tc_idx++) {
const auto& tc = tool_calls[tc_idx];
std::string tool_name = "unknown";
std::string tool_args = "{}";
if (tc.contains("function")) {
const auto& func = tc["function"];
if (func.contains("name")) {
tool_name = func["name"].get<std::string>();
}
if (func.contains("arguments")) {
tool_args = func["arguments"].is_string() ?
func["arguments"].get<std::string>() :
func["arguments"].dump();
}
} else if (tc.contains("name")) {
tool_name = tc["name"].get<std::string>();
if (tc.contains("arguments")) {
tool_args = tc["arguments"].is_string() ?
tc["arguments"].get<std::string>() :
tc["arguments"].dump();
}
}
SRV_INF("[TOOL CALLS DEBUG] PredictStream: Message %d, tool_call %zu: name=%s, arguments=%s\n",
i, tc_idx, tool_name.c_str(), tool_args.c_str());
}
}
} catch (const json::parse_error& e) {
SRV_WRN("Failed to parse tool_calls JSON: %s\n", e.what());
}
}
// Debug: Log final content state before adding to array
if (msg_json.contains("content")) {
if (msg_json["content"].is_null()) {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: content is NULL - THIS WILL CAUSE ERROR!\n", i);
} else {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: content type=%s, has_value=%d\n",
i, msg_json["content"].is_string() ? "string" :
msg_json["content"].is_array() ? "array" :
msg_json["content"].is_object() ? "object" : "other",
msg_json["content"].is_null() ? 0 : 1);
}
} else {
SRV_INF("[CONTENT DEBUG] PredictStream: Message %d FINAL STATE: NO CONTENT FIELD - THIS WILL CAUSE ERROR!\n", i);
}
messages_json.push_back(msg_json);
}
// Final safety check: Ensure no message has null content (Jinja templates require strings)
@@ -1851,7 +2072,36 @@ public:
if (body_json.contains("messages") && body_json["messages"].is_array()) {
SRV_INF("[CONTENT DEBUG] PredictStream: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
llama_grpc::normalize_template_message(body_json["messages"][idx]);
auto& msg = body_json["messages"][idx];
std::string role_str = msg.contains("role") ? msg["role"].get<std::string>() : "unknown";
if (msg.contains("content")) {
if (msg["content"].is_null()) {
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) has NULL content - FIXING!\n", idx, role_str.c_str());
msg["content"] = ""; // Fix null content
} else if (role_str == "tool" && msg["content"].is_array()) {
// Tool messages must have string content, not array
// oaicompat_chat_params_parse expects tool messages to have string content
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=tool) has array content, converting to string\n", idx);
msg["content"] = msg["content"].dump();
} else if (!msg["content"].is_string() && !msg["content"].is_array()) {
// If content is object or other non-string type, convert to string for templates
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) content is not string/array, converting\n", idx, role_str.c_str());
if (msg["content"].is_object()) {
msg["content"] = msg["content"].dump();
} else {
msg["content"] = "";
}
} else {
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s): content type=%s\n",
idx, role_str.c_str(),
msg["content"].is_string() ? "string" :
msg["content"].is_array() ? "array" :
msg["content"].is_object() ? "object" : "other");
}
} else {
SRV_INF("[CONTENT DEBUG] PredictStream: BEFORE TEMPLATE - Message %zu (role=%s) MISSING content field - ADDING!\n", idx, role_str.c_str());
msg["content"] = ""; // Add missing content
}
}
}
@@ -2183,20 +2433,264 @@ public:
SRV_INF("[CONTENT DEBUG] Predict: Processing %d messages\n", request->messages_size());
for (int i = 0; i < request->messages_size(); i++) {
const auto& msg = request->messages(i);
llama_grpc::ReconstructedMessageInput rin;
rin.role = msg.role();
rin.content = msg.content();
rin.name = msg.name();
rin.tool_call_id = msg.tool_call_id();
rin.reasoning_content = msg.reasoning_content();
rin.tool_calls = msg.tool_calls();
rin.is_last_user_msg = (i == last_user_msg_idx);
if (rin.is_last_user_msg) {
for (int j = 0; j < request->images_size(); j++) rin.images.push_back(request->images(j));
for (int j = 0; j < request->audios_size(); j++) rin.audios.push_back(request->audios(j));
for (int j = 0; j < request->videos_size(); j++) rin.videos.push_back(request->videos(j));
json msg_json;
msg_json["role"] = msg.role();
SRV_INF("[CONTENT DEBUG] Predict: Message %d: role=%s, content_empty=%d, content_length=%zu\n",
i, msg.role().c_str(), msg.content().empty() ? 1 : 0, msg.content().size());
if (!msg.content().empty()) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d content (first 200 chars): %s\n",
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
}
messages_json.push_back(llama_grpc::build_reconstructed_message(rin));
bool is_last_user_msg = (i == last_user_msg_idx);
bool has_images_or_audio = (request->images_size() > 0 || request->audios_size() > 0 || request->videos_size() > 0);
// Handle content - can be string, null, or array
// For multimodal content, we'll embed images/audio from separate fields
if (!msg.content().empty()) {
// Try to parse content as JSON to see if it's already an array
json content_val;
try {
content_val = json::parse(msg.content());
// Handle null values - convert to empty string to avoid template errors
if (content_val.is_null()) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d parsed JSON is null, converting to empty string\n", i);
content_val = "";
}
} catch (const json::parse_error&) {
// Not JSON, treat as plain string
content_val = msg.content();
}
// If content is an object (e.g., from tool call failures), convert to string
if (content_val.is_object()) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d content is object, converting to string\n", i);
content_val = content_val.dump();
}
// If content is a string and this is the last user message with images/audio, combine them
if (content_val.is_string() && is_last_user_msg && has_images_or_audio) {
json content_array = json::array();
// Add text first
content_array.push_back({{"type", "text"}, {"text", content_val.get<std::string>()}});
// Add images
if (request->images_size() > 0) {
for (int j = 0; j < request->images_size(); j++) {
json image_chunk;
image_chunk["type"] = "image_url";
json image_url;
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
image_chunk["image_url"] = image_url;
content_array.push_back(image_chunk);
}
}
// Add audios
if (request->audios_size() > 0) {
for (int j = 0; j < request->audios_size(); j++) {
json audio_chunk;
audio_chunk["type"] = "input_audio";
json input_audio;
input_audio["data"] = request->audios(j);
input_audio["format"] = "wav"; // default, could be made configurable
audio_chunk["input_audio"] = input_audio;
content_array.push_back(audio_chunk);
}
}
if (request->videos_size() > 0) {
for (int j = 0; j < request->videos_size(); j++) {
json video_chunk;
video_chunk["type"] = "input_video";
json input_video;
input_video["data"] = request->videos(j);
video_chunk["input_video"] = input_video;
content_array.push_back(video_chunk);
}
}
msg_json["content"] = content_array;
} else {
// Use content as-is (already array or not last user message)
// Ensure null values are converted to empty string
if (content_val.is_null()) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d content_val was null, setting to empty string\n", i);
msg_json["content"] = "";
} else {
msg_json["content"] = content_val;
SRV_INF("[CONTENT DEBUG] Predict: Message %d content set, type=%s\n",
i, content_val.is_string() ? "string" :
content_val.is_array() ? "array" :
content_val.is_object() ? "object" : "other");
}
}
} else if (is_last_user_msg && has_images_or_audio) {
// If no content but this is the last user message with images/audio, create content array
json content_array = json::array();
if (request->images_size() > 0) {
for (int j = 0; j < request->images_size(); j++) {
json image_chunk;
image_chunk["type"] = "image_url";
json image_url;
image_url["url"] = "data:image/jpeg;base64," + request->images(j);
image_chunk["image_url"] = image_url;
content_array.push_back(image_chunk);
}
}
if (request->audios_size() > 0) {
for (int j = 0; j < request->audios_size(); j++) {
json audio_chunk;
audio_chunk["type"] = "input_audio";
json input_audio;
input_audio["data"] = request->audios(j);
input_audio["format"] = "wav"; // default, could be made configurable
audio_chunk["input_audio"] = input_audio;
content_array.push_back(audio_chunk);
}
}
if (request->videos_size() > 0) {
for (int j = 0; j < request->videos_size(); j++) {
json video_chunk;
video_chunk["type"] = "input_video";
json input_video;
input_video["data"] = request->videos(j);
video_chunk["input_video"] = input_video;
content_array.push_back(video_chunk);
}
}
msg_json["content"] = content_array;
SRV_INF("[CONTENT DEBUG] Predict: Message %d created content array with media\n", i);
} else if (!msg.tool_calls().empty()) {
// Tool call messages may have null content, but templates expect string
// IMPORTANT: Set to space " " instead of empty string "", because llama.cpp's
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
// which causes template errors when accessing message.content[:tool_start_length]
SRV_INF("[CONTENT DEBUG] Predict: Message %d has tool_calls, setting content to space (not empty string)\n", i);
msg_json["content"] = " ";
} else if (msg.role() == "tool") {
// Tool role messages must have content field set, even if empty
// Jinja templates expect content to be a string, not null or object
SRV_INF("[CONTENT DEBUG] Predict: Message %d is tool role, content_empty=%d\n", i, msg.content().empty() ? 1 : 0);
if (msg.content().empty()) {
msg_json["content"] = "";
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): empty content, set to empty string\n", i);
} else {
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): content exists: %s\n",
i, msg.content().substr(0, std::min<size_t>(200, msg.content().size())).c_str());
// Content exists, parse and ensure it's a string
json content_val;
try {
content_val = json::parse(msg.content());
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): parsed JSON, type=%s\n",
i, content_val.is_null() ? "null" :
content_val.is_object() ? "object" :
content_val.is_string() ? "string" :
content_val.is_array() ? "array" : "other");
// Handle null values - Jinja templates expect content to be a string, not null
if (content_val.is_null()) {
msg_json["content"] = "";
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): null content, converted to empty string\n", i);
} else if (content_val.is_object()) {
// If content is an object (e.g., from tool call failures/errors), convert to string
msg_json["content"] = content_val.dump();
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): object content, converted to string: %s\n",
i, content_val.dump().substr(0, std::min<size_t>(200, content_val.dump().size())).c_str());
} else if (content_val.is_string()) {
msg_json["content"] = content_val.get<std::string>();
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): string content, using as-is\n", i);
} else {
// For arrays or other types, convert to string
msg_json["content"] = content_val.dump();
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): %s content, converted to string\n",
i, content_val.is_array() ? "array" : "other type");
}
} catch (const json::parse_error&) {
// Not JSON, treat as plain string
msg_json["content"] = msg.content();
SRV_INF("[CONTENT DEBUG] Predict: Message %d (tool): not JSON, using as string\n", i);
}
}
} else {
// Ensure all messages have content set (fallback for any unhandled cases)
// Jinja templates expect content to be present, default to empty string if not set
if (!msg_json.contains("content")) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d (role=%s): no content field, adding empty string\n",
i, msg.role().c_str());
msg_json["content"] = "";
}
}
// Add optional fields for OpenAI-compatible message format
if (!msg.name().empty()) {
msg_json["name"] = msg.name();
}
if (!msg.tool_call_id().empty()) {
msg_json["tool_call_id"] = msg.tool_call_id();
}
if (!msg.reasoning_content().empty()) {
msg_json["reasoning_content"] = msg.reasoning_content();
}
if (!msg.tool_calls().empty()) {
// Parse tool_calls JSON string and add to message
try {
json tool_calls = json::parse(msg.tool_calls());
msg_json["tool_calls"] = tool_calls;
SRV_INF("[TOOL CALLS DEBUG] Predict: Message %d has tool_calls: %s\n", i, tool_calls.dump().c_str());
// IMPORTANT: If message has tool_calls but content is empty or not set,
// set content to space " " instead of empty string "", because llama.cpp's
// common_chat_msgs_to_json_oaicompat converts empty strings to null (line 312),
// which causes template errors when accessing message.content[:tool_start_length]
if (!msg_json.contains("content") || (msg_json.contains("content") && msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d has tool_calls but empty content, setting to space\n", i);
msg_json["content"] = " ";
}
// Log each tool call with name and arguments
if (tool_calls.is_array()) {
for (size_t tc_idx = 0; tc_idx < tool_calls.size(); tc_idx++) {
const auto& tc = tool_calls[tc_idx];
std::string tool_name = "unknown";
std::string tool_args = "{}";
if (tc.contains("function")) {
const auto& func = tc["function"];
if (func.contains("name")) {
tool_name = func["name"].get<std::string>();
}
if (func.contains("arguments")) {
tool_args = func["arguments"].is_string() ?
func["arguments"].get<std::string>() :
func["arguments"].dump();
}
} else if (tc.contains("name")) {
tool_name = tc["name"].get<std::string>();
if (tc.contains("arguments")) {
tool_args = tc["arguments"].is_string() ?
tc["arguments"].get<std::string>() :
tc["arguments"].dump();
}
}
SRV_INF("[TOOL CALLS DEBUG] Predict: Message %d, tool_call %zu: name=%s, arguments=%s\n",
i, tc_idx, tool_name.c_str(), tool_args.c_str());
}
}
} catch (const json::parse_error& e) {
SRV_WRN("Failed to parse tool_calls JSON: %s\n", e.what());
}
}
// Debug: Log final content state before adding to array
if (msg_json.contains("content")) {
if (msg_json["content"].is_null()) {
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: content is NULL - THIS WILL CAUSE ERROR!\n", i);
} else {
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: content type=%s, has_value=%d\n",
i, msg_json["content"].is_string() ? "string" :
msg_json["content"].is_array() ? "array" :
msg_json["content"].is_object() ? "object" : "other",
msg_json["content"].is_null() ? 0 : 1);
}
} else {
SRV_INF("[CONTENT DEBUG] Predict: Message %d FINAL STATE: NO CONTENT FIELD - THIS WILL CAUSE ERROR!\n", i);
}
messages_json.push_back(msg_json);
}
// Final safety check: Ensure no message has null content (Jinja templates require strings)
@@ -2417,7 +2911,36 @@ public:
if (body_json.contains("messages") && body_json["messages"].is_array()) {
SRV_INF("[CONTENT DEBUG] Predict: Before oaicompat_chat_params_parse - checking %zu messages\n", body_json["messages"].size());
for (size_t idx = 0; idx < body_json["messages"].size(); idx++) {
llama_grpc::normalize_template_message(body_json["messages"][idx]);
auto& msg = body_json["messages"][idx];
std::string role_str = msg.contains("role") ? msg["role"].get<std::string>() : "unknown";
if (msg.contains("content")) {
if (msg["content"].is_null()) {
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) has NULL content - FIXING!\n", idx, role_str.c_str());
msg["content"] = ""; // Fix null content
} else if (role_str == "tool" && msg["content"].is_array()) {
// Tool messages must have string content, not array
// oaicompat_chat_params_parse expects tool messages to have string content
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=tool) has array content, converting to string\n", idx);
msg["content"] = msg["content"].dump();
} else if (!msg["content"].is_string() && !msg["content"].is_array()) {
// If content is object or other non-string type, convert to string for templates
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) content is not string/array, converting\n", idx, role_str.c_str());
if (msg["content"].is_object()) {
msg["content"] = msg["content"].dump();
} else {
msg["content"] = "";
}
} else {
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s): content type=%s\n",
idx, role_str.c_str(),
msg["content"].is_string() ? "string" :
msg["content"].is_array() ? "array" :
msg["content"].is_object() ? "object" : "other");
}
} else {
SRV_INF("[CONTENT DEBUG] Predict: BEFORE TEMPLATE - Message %zu (role=%s) MISSING content field - ADDING!\n", idx, role_str.c_str());
msg["content"] = ""; // Add missing content
}
}
}
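
The hunks above show the same pre-template check in two forms: a single call to `llama_grpc::normalize_template_message` and an inline loop over `body_json["messages"]`. The decision table both forms apply can be sketched as below. This is a minimal sketch, not the project's code: `ContentKind`, `Action`, and `sketch_template_action` are hypothetical names introduced only for illustration, and the real code inspects a JSON value rather than an enum.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of a message's "content" field (illustration only).
enum class ContentKind { Missing, Null, String, Array, Object, Scalar };

// Hypothetical outcome of the pre-template check (illustration only).
enum class Action {
    SetEmpty,  // content becomes ""
    Dump,      // content becomes its serialized JSON text
    Keep       // content is left untouched
};

// Mirrors the branch order shown in the diff: missing/null -> "",
// tool + array -> dumped string, object -> dumped string,
// other scalar -> "", string or non-tool array -> unchanged.
static Action sketch_template_action(const std::string& role, ContentKind kind) {
    switch (kind) {
        case ContentKind::Missing:
        case ContentKind::Null:
            return Action::SetEmpty;
        case ContentKind::String:
            return Action::Keep;
        case ContentKind::Array:
            // Only tool messages are forced to string content; a multimodal
            // user message keeps its typed-part array.
            return role == "tool" ? Action::Dump : Action::Keep;
        case ContentKind::Object:
            return Action::Dump;
        case ContentKind::Scalar:
            return Action::SetEmpty;
    }
    return Action::Keep;  // not reached; keeps compilers quiet
}
```

For example, `sketch_template_action("tool", ContentKind::Array)` yields `Action::Dump`, while the same array on a `user` message yields `Action::Keep`.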


@@ -1,192 +0,0 @@
#pragma once
#include <string>
#include <vector>
#include <nlohmann/json.hpp>
namespace llama_grpc {
// Normalizes a proto message's content string into the JSON value used when
// reconstructing OpenAI-format messages for the tokenizer (jinja) template.
//
// Shared by the streaming (PredictStream) and non-streaming (Predict) message
// reconstruction paths so the two cannot drift.
//
// LocalAI's Go layer (schema.Messages.ToProto) always sends content as a plain
// text string; multimodal media travels in separate proto fields, never inside
// content. So user/system/developer content is *only ever* opaque text and must
// NOT be JSON-sniffed: a prompt that merely looks like JSON (e.g. an ingredient
// list ["1/4 cup sugar", ...]) would otherwise be reinterpreted as structured
// content parts and rejected by oaicompat_chat_params_parse with
// "unsupported content[].type" (https://github.com/mudler/LocalAI/issues/10524).
// (developer is OpenAI's modern system alias - same "human-authored text" nature.)
//
// For assistant/tool messages we still collapse a literal JSON null/object
// (tool-call bookkeeping) to a string, but we never turn a plain string into an
// array/scalar. The array defense is therefore role-independent (arrays/scalars
// fall through for every role); the role gate only governs the null/object case.
inline nlohmann::ordered_json normalize_message_content(const std::string& role,
const std::string& content) {
nlohmann::ordered_json content_val = content;
if (role != "user" && role != "system" && role != "developer") {
try {
nlohmann::ordered_json parsed = nlohmann::ordered_json::parse(content);
if (parsed.is_null()) {
content_val = "";
} else if (parsed.is_object()) {
content_val = parsed.dump();
}
// arrays / scalars: keep the original plain-text string as-is
} catch (const nlohmann::ordered_json::parse_error&) {
// Not JSON, already the plain string
}
}
return content_val;
}
// Final safety pass applied to each reconstructed OpenAI message right before it
// is handed to oaicompat_chat_params_parse (jinja templating). Jinja templates
// assume content is a string: a literal null breaks slicing such as
// message.content[:N] (#7324), and a tool message with array content is rejected
// (#7528). A multimodal user message legitimately carries a typed-part array
// ({type:text}, {type:image_url}, ...), which must be left intact. Shared by the
// streaming and non-streaming paths so this invariant cannot drift between them.
inline void normalize_template_message(nlohmann::ordered_json& msg) {
if (!msg.contains("content")) {
msg["content"] = ""; // templates expect the field to exist
return;
}
nlohmann::ordered_json& content = msg["content"];
const std::string role = (msg.contains("role") && msg["role"].is_string())
? msg["role"].get<std::string>()
: std::string();
if (content.is_null()) {
content = ""; // #7324: null would crash content[:N] slicing
} else if (role == "tool" && content.is_array()) {
content = content.dump(); // #7528: tool messages must have string content
} else if (!content.is_string() && !content.is_array()) {
if (content.is_object()) {
content = content.dump(); // tool-call bookkeeping object -> string
} else {
content = ""; // other scalar (number/bool) -> empty
}
}
// string, or a non-tool (multimodal) typed-part array: leave untouched
}
// One proto message's data, flattened to plain types so the reconstruction logic
// can be shared and unit-tested without protobuf. The streaming and non-streaming
// predict paths both populate this from proto::Message + the request's media.
struct ReconstructedMessageInput {
std::string role;
std::string content; // proto.Message.content (always a plain string)
std::string name;
std::string tool_call_id;
std::string reasoning_content;
std::string tool_calls; // tool_calls as a JSON string, or empty
bool is_last_user_msg = false; // attach request media to this message
std::vector<std::string> images; // base64 (jpeg)
std::vector<std::string> audios; // base64 (wav)
std::vector<std::string> videos; // base64
};
// Appends the request's media as OpenAI typed content parts. Imperative (not
// brace-init) to avoid nlohmann's object-vs-array initializer-list ambiguity.
inline void append_media_parts(nlohmann::ordered_json& content_array,
const std::vector<std::string>& images,
const std::vector<std::string>& audios,
const std::vector<std::string>& videos) {
for (const auto& img : images) {
nlohmann::ordered_json image_chunk;
image_chunk["type"] = "image_url";
nlohmann::ordered_json image_url;
image_url["url"] = "data:image/jpeg;base64," + img;
image_chunk["image_url"] = image_url;
content_array.push_back(image_chunk);
}
for (const auto& aud : audios) {
nlohmann::ordered_json audio_chunk;
audio_chunk["type"] = "input_audio";
nlohmann::ordered_json input_audio;
input_audio["data"] = aud;
input_audio["format"] = "wav"; // default; could be made configurable
audio_chunk["input_audio"] = input_audio;
content_array.push_back(audio_chunk);
}
for (const auto& vid : videos) {
nlohmann::ordered_json video_chunk;
video_chunk["type"] = "input_video";
nlohmann::ordered_json input_video;
input_video["data"] = vid;
video_chunk["input_video"] = input_video;
content_array.push_back(video_chunk);
}
}
// Reconstructs a single OpenAI-format message (the object fed to
// oaicompat_chat_params_parse) from a proto message. Shared by PredictStream and
// Predict so the content/multimodal/tool_calls handling cannot drift between the
// two stream modes (it previously lived as two ~150-line copies with a redundant
// Predict-only tool_calls->" " branch). Guarantees content is always a string or
// a typed-part array, never null/missing.
inline nlohmann::ordered_json build_reconstructed_message(const ReconstructedMessageInput& in) {
nlohmann::ordered_json msg_json;
msg_json["role"] = in.role;
const bool has_media = !in.images.empty() || !in.audios.empty() || !in.videos.empty();
if (!in.content.empty()) {
nlohmann::ordered_json content_val = normalize_message_content(in.role, in.content);
if (content_val.is_string() && in.is_last_user_msg && has_media) {
// Last user message + media: build a typed-part array (text first).
nlohmann::ordered_json content_array = nlohmann::ordered_json::array();
nlohmann::ordered_json text_part;
text_part["type"] = "text";
text_part["text"] = content_val.get<std::string>();
content_array.push_back(text_part);
append_media_parts(content_array, in.images, in.audios, in.videos);
msg_json["content"] = content_array;
} else if (content_val.is_null()) {
msg_json["content"] = "";
} else {
msg_json["content"] = content_val;
}
} else if (in.is_last_user_msg && has_media) {
// No text but media on the last user message: media-only typed array.
nlohmann::ordered_json content_array = nlohmann::ordered_json::array();
append_media_parts(content_array, in.images, in.audios, in.videos);
msg_json["content"] = content_array;
} else {
// Empty content (any role, incl. tool/assistant): templates need a string.
msg_json["content"] = "";
}
if (!in.name.empty()) {
msg_json["name"] = in.name;
}
if (!in.tool_call_id.empty()) {
msg_json["tool_call_id"] = in.tool_call_id;
}
if (!in.reasoning_content.empty()) {
msg_json["reasoning_content"] = in.reasoning_content;
}
if (!in.tool_calls.empty()) {
try {
nlohmann::ordered_json tool_calls = nlohmann::ordered_json::parse(in.tool_calls);
msg_json["tool_calls"] = tool_calls;
// tool_calls + empty/blank content: use " " not "", because llama.cpp's
// common_chat_msgs_to_json_oaicompat turns "" into null, which breaks
// templates that slice message.content[:tool_start_length] (#7324).
if (!msg_json.contains("content") ||
(msg_json["content"].is_string() && msg_json["content"].get<std::string>().empty())) {
msg_json["content"] = " ";
}
} catch (const nlohmann::ordered_json::parse_error&) {
// Malformed tool_calls JSON: leave content as-is (prior behavior).
}
}
return msg_json;
}
} // namespace llama_grpc
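
The role gate described in the header comments for `normalize_message_content` can be illustrated without a JSON library. The sketch below is a deliberately reduced stand-in, not the header's implementation: `sketch_normalize_content` and `is_json_null_literal` are hypothetical names, and only the literal `null` case is modeled. The real helper also re-serializes a parsed JSON object for assistant/tool roles, which is omitted here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (illustration only): true when the text is the JSON
// literal null, allowing surrounding JSON whitespace.
static bool is_json_null_literal(const std::string& s) {
    const std::string ws = " \t\n\r";
    const std::size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return false;  // empty or all whitespace
    const std::size_t e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1) == "null";
}

// Reduced stand-in for the role gate: user/system/developer content is
// opaque text and is never inspected; for other roles a literal null
// collapses to an empty string. Everything else is returned unchanged.
static std::string sketch_normalize_content(const std::string& role,
                                            const std::string& content) {
    const bool human_authored =
        role == "user" || role == "system" || role == "developer";
    if (human_authored) {
        return content;  // never JSON-sniffed, even if it looks like JSON
    }
    if (is_json_null_literal(content)) {
        return "";  // templates expect a string, not null
    }
    return content;  // arrays, scalars, and plain text stay as text
}
```

With this sketch, a user prompt that happens to be a JSON array string, or the word `null`, comes back byte-for-byte, while an assistant or tool message whose content is the literal `null` becomes an empty string.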


@@ -1,234 +0,0 @@
// Unit tests for the shared message-reconstruction helpers (message_content.h).
//
// Build & run standalone (nlohmann/json single header on the include path):
// g++ -std=c++17 -I<dir-with-nlohmann> message_content_test.cpp -o t && ./t
// or via CMake: -DLLAMA_GRPC_BUILD_TESTS=ON then ctest.
//
// Regression coverage for:
// #10524 - a user/system prompt that is itself a JSON-array string must stay
// plain text, never be reinterpreted as OpenAI structured parts.
// #7324 - assistant/tool null content -> "" (templates slice content[:N]);
// assistant+tool_calls+empty content -> " " (not "", which becomes null).
// #7528 - tool message array content must reach the template as a string.
// multimodal - last user message text + media -> typed-part array, media kept.
#include <cassert>
#include <iostream>
#include <string>
#include "message_content.h"
using nlohmann::ordered_json;
using llama_grpc::normalize_message_content;
using llama_grpc::normalize_template_message;
using llama_grpc::build_reconstructed_message;
using llama_grpc::ReconstructedMessageInput;
static int failures = 0;
static void check(bool ok, const std::string& name, const std::string& detail = "") {
if (!ok) {
std::cerr << "FAIL " << name << (detail.empty() ? "" : ": " + detail) << "\n";
failures++;
}
}
// ---- normalize_message_content -------------------------------------------
static void expect_norm_string(const char* name, const std::string& role,
const std::string& content, const std::string& want) {
auto got = normalize_message_content(role, content);
if (!got.is_string()) {
check(false, name, "expected a JSON string, got " +
std::string(got.is_array() ? "array" : got.is_object() ? "object" : "other") +
" (" + got.dump() + ")");
return;
}
check(got.get<std::string>() == want, name, "expected \"" + want + "\", got \"" + got.get<std::string>() + "\"");
}
static void test_normalize() {
const std::string ingredients = R"(["1/4 cup brown sugar, packed","1 pound ground beef"])";
// #10524 - JSON-array text must stay a string. Role-INDEPENDENT array defense.
for (const char* role : {"user", "system", "developer", "function", "assistant", "tool"}) {
expect_norm_string((std::string("json_array_stays_text:") + role).c_str(), role, ingredients, ingredients);
}
// #10524 - user/system/developer JSON-object text stays verbatim (NOT re-dumped).
expect_norm_string("user_json_object_verbatim", "user", R"({"a":1})", R"({"a":1})");
expect_norm_string("system_json_object_verbatim", "system", R"({"a":1})", R"({"a":1})");
expect_norm_string("developer_json_object_verbatim", "developer", R"({"a":1})", R"({"a":1})");
// Plain text unchanged for all roles.
expect_norm_string("user_plain_text", "user", "hello world", "hello world");
expect_norm_string("assistant_non_json_text_kept", "assistant", "hi [unclosed", "hi [unclosed");
// #7324 boundary - user/system/developer literal "null" preserved (never parsed).
expect_norm_string("user_literal_null_stays", "user", "null", "null");
expect_norm_string("system_literal_null_stays", "system", "null", "null");
expect_norm_string("developer_literal_null_stays", "developer", "null", "null");
// #7324 - assistant/tool literal null collapses to empty string.
expect_norm_string("assistant_null_to_empty", "assistant", "null", "");
expect_norm_string("tool_null_to_empty", "tool", "null", "");
// #7324/#7528 - assistant/tool object bookkeeping stringified (stays a string).
check(normalize_message_content("assistant", R"({"tool":"x"})").is_string(), "assistant_object_stringified");
check(normalize_message_content("tool", R"({"error":"boom"})").is_string(), "tool_object_stringified");
// #10524-family - a bare scalar that parses as a JSON number stays the string.
expect_norm_string("assistant_scalar_number_stays_string", "assistant", "42", "42");
// baseline - empty content stays empty.
expect_norm_string("user_empty_stays_empty", "user", "", "");
}
// ---- normalize_template_message (BEFORE TEMPLATE sanitizer) ---------------
static void test_template_sanitizer() {
// #7528 - a tool message with an ACTUAL array becomes a string.
{
ordered_json msg = {{"role", "tool"}, {"content", ordered_json::array({{{"type", "text"}, {"text", "r"}}})}};
normalize_template_message(msg);
check(msg["content"].is_string(), "before_template_tool_array_to_string", "got " + msg["content"].dump());
}
// #7324 - null content -> "" for any role.
{
ordered_json msg = {{"role", "assistant"}, {"content", nullptr}};
normalize_template_message(msg);
check(msg["content"].is_string() && msg["content"] == "", "before_template_null_to_empty");
}
// object content -> dumped string (would otherwise throw at the template).
{
ordered_json msg = {{"role", "assistant"}, {"content", {{"x", 1}}}};
normalize_template_message(msg);
check(msg["content"].is_string(), "before_template_object_to_string", "got " + msg["content"].dump());
}
// missing content field -> "".
{
ordered_json msg = {{"role", "user"}};
normalize_template_message(msg);
check(msg.contains("content") && msg["content"] == "", "before_template_missing_to_empty");
}
// multimodal: a well-typed user array must be left UNTOUCHED (role!=tool).
{
ordered_json parts = ordered_json::array();
parts.push_back({{"type", "text"}, {"text", "x"}});
ordered_json img; img["type"] = "image_url"; img["image_url"] = {{"url", "data:..."}};
parts.push_back(img);
ordered_json msg = {{"role", "user"}, {"content", parts}};
normalize_template_message(msg);
check(msg["content"].is_array() && msg["content"].size() == 2, "before_template_user_typed_array_preserved",
"got " + msg["content"].dump());
}
// a plain string is left untouched.
{
ordered_json msg = {{"role", "user"}, {"content", "hello"}};
normalize_template_message(msg);
check(msg["content"] == "hello", "before_template_string_untouched");
}
}
// ---- build_reconstructed_message ----------------------------------------
static void test_reconstruction() {
const std::string ingredients = R"(["1/4 cup brown sugar","1 pound ground beef"])";
// #10524 end-state - user JSON-array text, no media -> string content.
{
ReconstructedMessageInput in;
in.role = "user"; in.content = ingredients;
auto m = build_reconstructed_message(in);
check(m["content"].is_string() && m["content"] == ingredients, "recon_user_json_array_string",
"got " + m["content"].dump());
}
// multimodal - user text + one image on last user msg -> typed array, image kept.
{
ReconstructedMessageInput in;
in.role = "user"; in.content = ingredients; in.is_last_user_msg = true;
in.images.push_back("BASE64IMG");
auto m = build_reconstructed_message(in);
check(m["content"].is_array() && m["content"].size() == 2, "recon_multimodal_text_plus_image",
"got " + m["content"].dump());
check(m["content"][0]["type"] == "text" && m["content"][0]["text"] == ingredients, "recon_multimodal_text_first");
check(m["content"][1]["type"] == "image_url", "recon_multimodal_image_kept");
}
// multimodal media-only - empty text + image on last user msg.
{
ReconstructedMessageInput in;
in.role = "user"; in.content = ""; in.is_last_user_msg = true;
in.images.push_back("BASE64IMG");
auto m = build_reconstructed_message(in);
check(m["content"].is_array() && m["content"].size() == 1 && m["content"][0]["type"] == "image_url",
"recon_media_only", "got " + m["content"].dump());
}
// #7528 - tool array-string content stays a string.
{
ReconstructedMessageInput in;
in.role = "tool"; in.content = R"(["a","b"])"; in.tool_call_id = "call_1";
auto m = build_reconstructed_message(in);
check(m["content"].is_string() && m["content"] == R"(["a","b"])", "recon_tool_array_string",
"got " + m["content"].dump());
check(m["tool_call_id"] == "call_1", "recon_tool_call_id_set");
}
// tool empty content -> "".
{
ReconstructedMessageInput in;
in.role = "tool"; in.content = "";
auto m = build_reconstructed_message(in);
check(m["content"].is_string() && m["content"] == "", "recon_tool_empty_to_string");
}
// #7324 - assistant + tool_calls + empty content -> " " (single space, not "").
{
ReconstructedMessageInput in;
in.role = "assistant"; in.content = "";
in.tool_calls = R"([{"id":"c1","type":"function","function":{"name":"f","arguments":"{}"}}])";
auto m = build_reconstructed_message(in);
check(m["content"].is_string() && m["content"] == " ", "recon_toolcalls_empty_content_space",
"got " + m["content"].dump());
check(m["tool_calls"].is_array() && m["tool_calls"].size() == 1, "recon_toolcalls_parsed");
}
// assistant + tool_calls + real content keeps the content.
{
ReconstructedMessageInput in;
in.role = "assistant"; in.content = "I'll call f";
in.tool_calls = R"([{"id":"c1","type":"function","function":{"name":"f","arguments":"{}"}}])";
auto m = build_reconstructed_message(in);
check(m["content"] == "I'll call f", "recon_toolcalls_with_content_kept");
}
// assistant null content -> "".
{
ReconstructedMessageInput in;
in.role = "assistant"; in.content = "null";
auto m = build_reconstructed_message(in);
check(m["content"] == "", "recon_assistant_null_to_empty");
}
// malformed tool_calls JSON must not throw; content preserved.
{
ReconstructedMessageInput in;
in.role = "assistant"; in.content = "hi"; in.tool_calls = "{not json";
auto m = build_reconstructed_message(in);
check(m["content"] == "hi" && !m.contains("tool_calls"), "recon_malformed_toolcalls_safe");
}
// optional fields: name + reasoning carried through.
{
ReconstructedMessageInput in;
in.role = "tool"; in.content = "result"; in.name = "get_weather"; in.reasoning_content = "thinking";
auto m = build_reconstructed_message(in);
check(m["name"] == "get_weather" && m["reasoning_content"] == "thinking", "recon_optional_fields");
}
}
int main() {
test_normalize();
test_template_sanitizer();
test_reconstruction();
if (failures == 0) {
std::cout << "OK: all message_content tests passed\n";
return 0;
}
std::cerr << failures << " test(s) failed\n";
return 1;
}

View File

@@ -18,10 +18,6 @@ done
cp -r CMakeLists.txt llama.cpp/tools/grpc-server/
cp -r grpc-server.cpp llama.cpp/tools/grpc-server/
# Shared message-reconstruction helpers (included by grpc-server.cpp) and their
# unit test (compiled only when -DLLAMA_GRPC_BUILD_TESTS=ON).
cp -r message_content.h llama.cpp/tools/grpc-server/
cp -r message_content_test.cpp llama.cpp/tools/grpc-server/
cp -rfv llama.cpp/vendor/nlohmann/json.hpp llama.cpp/tools/grpc-server/
cp -rfv llama.cpp/vendor/cpp-httplib/httplib.h llama.cpp/tools/grpc-server/

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -16,37 +16,37 @@ BINARY=llama-cpp-fallback
# CPU_ALL_VARIANTS: ggml's backend registry dlopens the best libggml-cpu-*.so for this
# host, so no shell-side AVX probing. GPU images (cublas/sycl/vulkan/hipblas) ship only
# llama-cpp-fallback (the accelerator does the compute), so fall back to it when absent.
if [ -e "$CURDIR"/llama-cpp-cpu-all ]; then
if [ -e $CURDIR/llama-cpp-cpu-all ]; then
BINARY=llama-cpp-cpu-all
fi
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
if [ -e "$CURDIR"/llama-cpp-grpc ]; then
if [ -e $CURDIR/llama-cpp-grpc ]; then
BINARY=llama-cpp-grpc
fi
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
if [ -d "$CURDIR/lib/rocblas/library" ]; then
export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
fi
fi
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec "$CURDIR"/$BINARY "$@"
exec $CURDIR/$BINARY "$@"
# We should never reach this point; if we do, run the fallback.
exec "$CURDIR"/llama-cpp-fallback "$@"
exec $CURDIR/llama-cpp-fallback "$@"
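
The two sides of this hunk differ only in whether `$0` and `$CURDIR` are double-quoted. A minimal sketch of why that can matter, assuming a hypothetical install directory whose path contains a space; `count_args` and the path are illustrative only and are not part of the backend scripts.

```shell
#!/bin/bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical install directory containing a space.
dir="/tmp/my backends/llama-cpp"

# Unquoted: the shell word-splits the expansion at the space.
unquoted=$(count_args $dir/llama-cpp-fallback)

# Quoted: the whole path is passed as a single argument.
quoted=$(count_args "$dir"/llama-cpp-fallback)

echo "unquoted=$unquoted quoted=$quoted"
```

With the default `IFS`, the unquoted form reaches the command as two words, so an `exec` or `[ -e ... ]` on such a path would operate on the wrong operands. For paths without whitespace or glob characters the two forms behave the same.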

View File

@@ -51,14 +51,6 @@ add_library(hw_grpc_proto STATIC
${HW_GRPC_SRCS} ${HW_GRPC_HDRS}
${HW_PROTO_SRCS} ${HW_PROTO_HDRS})
target_include_directories(hw_grpc_proto PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
# The generated proto/grpc sources include protobuf and grpc++ headers, so this
# library must see their include dirs. Linking the imported targets propagates
# them. On Linux the apt headers live in /usr/include (default search path) so
# this was a no-op; on macOS the Homebrew headers are under /opt/homebrew and
# would otherwise be missed (runtime_version.h not found).
target_link_libraries(hw_grpc_proto PUBLIC
protobuf::libprotobuf
gRPC::grpc++)
# Build only the pf static lib (+ ggml) from the engine tree — no CLI/bench/tests.
# PF_VULKAN is honored when passed on the cmake command line (it lands in the

View File

@@ -2,13 +2,7 @@
# Entry point for the privacy-filter backend image / BACKEND_BINARY mode.
set -e
CURDIR=$(dirname "$(realpath "$0")")
# macOS has no bundled ld.so; the darwin package ships only dylibs under lib/,
# resolved via DYLD_LIBRARY_PATH (the ld.so branch below is skipped there).
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:$DYLD_LIBRARY_PATH"
else
export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
fi
export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
if [ -f "$CURDIR/lib/ld.so" ]; then
exec "$CURDIR/lib/ld.so" "$CURDIR/grpc-server" "$@"
fi

View File

@@ -1,71 +0,0 @@
#!/bin/bash
#
# Discovers and runs every standalone C++ unit test under backend/cpp/.
#
# A "standalone" unit test is a *_test.cpp that depends only on the C++ standard
# library and nlohmann/json (single header) - i.e. it exercises pure helpers and
# does not need the full llama.cpp + gRPC backend build. Tests that DO need the
# backend build use the CMake/ctest path (e.g. -DLLAMA_GRPC_BUILD_TESTS=ON)
# instead and are skipped here.
#
# This keeps CI generic: adding a new pure-C++ unit test file named *_test.cpp in
# an active backend source dir is picked up automatically, with no CI edits.
#
# Env:
# NLOHMANN_INCLUDE include dir that contains nlohmann/json.hpp. If unset, the
# nlohmann/json single header is fetched to a temp dir.
# CXX compiler (default: g++).
# JSON_VERSION nlohmann/json tag to fetch when NLOHMANN_INCLUDE is unset
# (default: v3.11.3).
set -uo pipefail
ROOT="$(cd "$(dirname "$0")" && pwd)"
CXX="${CXX:-g++}"
JSON_VERSION="${JSON_VERSION:-v3.11.3}"
JSON_INC="${NLOHMANN_INCLUDE:-}"
if [ -z "$JSON_INC" ]; then
JSON_INC="$(mktemp -d)"
mkdir -p "$JSON_INC/nlohmann"
echo "Fetching nlohmann/json ${JSON_VERSION} single header..."
if ! curl -L -sf \
"https://raw.githubusercontent.com/nlohmann/json/${JSON_VERSION}/single_include/nlohmann/json.hpp" \
-o "$JSON_INC/nlohmann/json.hpp"; then
echo "ERROR: failed to fetch nlohmann/json header" >&2
exit 1
fi
fi
# Active source dirs only - exclude per-variant build copies, dev snapshots and
# the vendored upstream llama.cpp tree.
mapfile -t tests < <(find "$ROOT" -name '*_test.cpp' \
-not -path '*/llama.cpp/*' \
-not -path '*-build/*' \
-not -path '*-dev/*' \
-not -path '*fallback*' | sort)
if [ "${#tests[@]}" -eq 0 ]; then
echo "No standalone C++ unit tests found under $ROOT"
exit 0
fi
fail=0
for test_src in "${tests[@]}"; do
name="$(basename "$test_src" .cpp)"
bin="$(mktemp -d)/$name"
echo "==> $test_src"
if ! "$CXX" -std=c++17 -Wall -Wextra \
-I"$JSON_INC" -I"$(dirname "$test_src")" \
"$test_src" -o "$bin"; then
echo "COMPILE FAILED: $test_src" >&2
fail=1
continue
fi
if ! "$bin"; then
echo "TEST FAILED: $test_src" >&2
fail=1
fi
done
echo "Ran ${#tests[@]} standalone C++ unit test file(s)"
exit "$fail"
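
The discovery step of the script above can be tried in isolation. The sketch below rebuilds the same `find` filter against a throwaway tree; the directory layout is an assumption made up for the demo (only the file name `message_content_test.cpp` comes from this diff), and `find` is run from inside the tree so the temp directory's own name cannot interfere with the `-path` patterns.

```shell
#!/bin/bash
# Hypothetical layout: one active source dir plus three kinds of excluded copies.
root=$(mktemp -d)
mkdir -p "$root/llama-cpp/llama.cpp/tests" \
         "$root/llama-cpp-avx-build" \
         "$root/llama-cpp-fallback"
touch "$root/llama-cpp/message_content_test.cpp" \
      "$root/llama-cpp/llama.cpp/tests/upstream_test.cpp" \
      "$root/llama-cpp-avx-build/message_content_test.cpp" \
      "$root/llama-cpp-fallback/message_content_test.cpp"

# Same exclusions as the runner: vendored tree, build copies, dev snapshots, fallback copies.
found=$(cd "$root" && find . -name '*_test.cpp' \
    -not -path '*/llama.cpp/*' \
    -not -path '*-build/*' \
    -not -path '*-dev/*' \
    -not -path '*fallback*' | sort)

echo "$found"
rm -rf "$root"
```

Only the file in the active source directory survives the filter; the vendored, per-variant build, and fallback copies are skipped.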

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,36 +15,36 @@ BINARY=turboquant-fallback
# x86/arm64 ship a single turboquant-cpu-all built with ggml CPU_ALL_VARIANTS: ggml's
# backend registry dlopens the best libggml-cpu-*.so for this host, so no shell-side
# probing. ROCm ships only turboquant-fallback, so fall back to it when cpu-all is absent.
if [ -e "$CURDIR"/turboquant-cpu-all ]; then
if [ -e $CURDIR/turboquant-cpu-all ]; then
BINARY=turboquant-cpu-all
fi
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
if [ -e "$CURDIR"/turboquant-grpc ]; then
if [ -e $CURDIR/turboquant-grpc ]; then
BINARY=turboquant-grpc
fi
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
if [ -d "$CURDIR/lib/rocblas/library" ]; then
export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
fi
fi
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec "$CURDIR"/$BINARY "$@"
exec $CURDIR/$BINARY "$@"
# We should never reach this point; if we do, run the fallback.
exec "$CURDIR"/turboquant-fallback "$@"
exec $CURDIR/turboquant-fallback "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -21,20 +21,20 @@ if [ "$(uname)" = "Darwin" ]; then
if [ ! -e "$LIBRARY" ]; then
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
fi
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgoacestepcpp-avx.so ]; then
if [ -e $CURDIR/libgoacestepcpp-avx.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgoacestepcpp-avx2.so ]; then
if [ -e $CURDIR/libgoacestepcpp-avx2.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx2.so"
fi
fi
@@ -42,22 +42,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgoacestepcpp-avx512.so ]; then
if [ -e $CURDIR/libgoacestepcpp-avx512.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export ACESTEP_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/acestep-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/acestep-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/acestep-cpp "$@"
exec $CURDIR/acestep-cpp "$@"
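
The Linux branch of this script checks AVX, AVX2 and AVX512F in that order and lets each later match override the earlier choice, provided the matching library file exists. A minimal sketch of that selection logic, written as a function that takes the CPU flags as a string so it can be exercised without `/proc/cpuinfo`; `pick_variant` and the `libdemo` prefix are hypothetical names, not part of the backend.

```shell
#!/bin/bash
# pick_variant FLAGS DIR PREFIX   (hypothetical helper)
#   FLAGS  : space-separated CPU flags, e.g. the "flags" line of /proc/cpuinfo
#   DIR    : directory holding the candidate libraries
#   PREFIX : library name prefix, e.g. libgoacestepcpp
pick_variant() {
    local flags=" $1 "
    local dir="$2"
    local prefix="$3"
    local chosen="$dir/$prefix-fallback.so"
    local pair flag suffix
    # Ordered weakest to strongest so that a later match overrides an earlier one.
    for pair in avx:avx avx2:avx2 avx512f:avx512; do
        flag=${pair%%:*}
        suffix=${pair##*:}
        case "$flags" in
            *" $flag "*)
                # Only switch if that variant was actually shipped.
                if [ -e "$dir/$prefix-$suffix.so" ]; then
                    chosen="$dir/$prefix-$suffix.so"
                fi
                ;;
        esac
    done
    echo "$chosen"
}

# Demo: a throwaway directory that ships only the fallback and AVX2 builds.
demo_dir=$(mktemp -d)
touch "$demo_dir/libdemo-fallback.so" "$demo_dir/libdemo-avx2.so"

with_avx512=$(pick_variant "fpu sse2 avx avx2 avx512f" "$demo_dir" libdemo)
without_avx=$(pick_variant "fpu sse2" "$demo_dir" libdemo)
echo "$with_avx512"
echo "$without_avx"
```

In the demo the AVX512-capable host still ends up on the AVX2 library because no AVX512 build is present, which mirrors the `[ -e ... ]` guards in the script.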

View File

@@ -4,10 +4,10 @@ set -e
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${DYLD_LIBRARY_PATH:-}"
export DYLD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${DYLD_LIBRARY_PATH:-}"
export CED_LIBRARY="$CURDIR/lib/libced.dylib"
else
export LD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
fi
# If a self-contained ld.so was packaged, route through it so the packaged

View File

@@ -1,6 +1,6 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
exec "$CURDIR"/cloud-proxy "$@"
exec $CURDIR/cloud-proxy "$@"

View File

@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
# CrispASR version (pinned upstream commit SHA)
CRISPASR_REPO?=https://github.com/CrispStrobe/CrispASR
CRISPASR_VERSION?=8f1218141b792b8868861c1af17ba1e361b05dc0
CRISPASR_VERSION?=96b2a6ee31d30389fed8a7ef1a54239b75231ddc
SO_TARGET?=libgocrispasr.so
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgocrispasr-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgocrispasr-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgocrispasr-avx.so ]; then
if [ -e $CURDIR/libgocrispasr-avx.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgocrispasr-avx2.so ]; then
if [ -e $CURDIR/libgocrispasr-avx2.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx2.so"
fi
fi
@@ -36,12 +36,12 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgocrispasr-avx512.so ]; then
if [ -e $CURDIR/libgocrispasr-avx512.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export CRISPASR_LIBRARY=$LIBRARY
@@ -49,14 +49,14 @@ export CRISPASR_LIBRARY=$LIBRARY
# Point piper's espeak-ng phonemizer at the bundled voice data. The variable
# names the directory CONTAINING espeak-ng-data (package.sh drops it next to
# this script). Harmless when espeak-ng wasn't bundled.
export CRISPASR_ESPEAK_DATA_PATH="$CURDIR"
export CRISPASR_ESPEAK_DATA_PATH=$CURDIR
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/crispasr "$@"
exec $CURDIR/lib/ld.so $CURDIR/crispasr "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/crispasr "$@"
exec $CURDIR/crispasr "$@"

View File

@@ -40,8 +40,6 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DDA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libdepthanythingcpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libdepthanythingcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libdepthanythingcpp-avx.so ]; then
if [ -e $CURDIR/libdepthanythingcpp-avx.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libdepthanythingcpp-avx2.so ]; then
if [ -e $CURDIR/libdepthanythingcpp-avx2.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libdepthanythingcpp-avx512.so ]; then
if [ -e $CURDIR/libdepthanythingcpp-avx512.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export DEPTHANYTHING_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/depth-anything-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/depth-anything-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/depth-anything-cpp "$@"
exec $CURDIR/depth-anything-cpp "$@"

View File

@@ -1,6 +1,6 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
exec "$CURDIR"/local-store "$@"
exec $CURDIR/local-store "$@"

View File

@@ -32,8 +32,6 @@ endif
ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLOCALVQE_VULKAN=ON
else ifeq ($(OS),Darwin)
# Apple Silicon: CPU-only (no Metal upstream); built + published as an arm64
# image by CI (includeDarwin in .github/backend-matrix.yml) for macOS install.
CMAKE_ARGS+=-DGGML_METAL=OFF
endif

View File

@@ -1,34 +1,34 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
# LocalVQE's runtime CPU-variant loader (ggml_backend_load_all) searches
# get_executable_path() and current_path() — the second one is what saves us
# when /proc/self/exe resolves to lib/ld.so under the bundled-loader path.
# So we cd into "$CURDIR" (where all the libggml-cpu-*.so files live) before
# So we cd into $CURDIR (where all the libggml-cpu-*.so files live) before
# exec'ing the binary.
cd "$CURDIR"
if [ "$(uname)" = "Darwin" ]; then
# macOS: LocalVQE is built as a SHARED library, so dyld needs the .dylib +
# DYLD_LIBRARY_PATH. Prefer .dylib and fall back to .so just in case.
export DYLD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$DYLD_LIBRARY_PATH
LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.dylib
export DYLD_LIBRARY_PATH=$CURDIR:$CURDIR/lib:$DYLD_LIBRARY_PATH
LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.dylib
if [ ! -e "$LOCALVQE_LIBRARY" ]; then
LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.so
fi
export LOCALVQE_LIBRARY
else
export LD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$LD_LIBRARY_PATH
export LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
export LD_LIBRARY_PATH=$CURDIR:$CURDIR/lib:$LD_LIBRARY_PATH
export LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.so
fi
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LOCALVQE_LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/localvqe "$@"
exec $CURDIR/lib/ld.so $CURDIR/localvqe "$@"
fi
echo "Using library: $LOCALVQE_LIBRARY"
exec "$CURDIR"/localvqe "$@"
exec $CURDIR/localvqe "$@"

View File

@@ -33,8 +33,6 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/liblocateanythingcpp-avx.so ]; then
if [ -e $CURDIR/liblocateanythingcpp-avx.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/liblocateanythingcpp-avx2.so ]; then
if [ -e $CURDIR/liblocateanythingcpp-avx2.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/liblocateanythingcpp-avx512.so ]; then
if [ -e $CURDIR/liblocateanythingcpp-avx512.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export LOCATEANYTHING_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/locate-anything-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/locate-anything-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/locate-anything-cpp "$@"
exec $CURDIR/locate-anything-cpp "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgomnivoicecpp-avx.so ]; then
if [ -e $CURDIR/libgomnivoicecpp-avx.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgomnivoicecpp-avx2.so ]; then
if [ -e $CURDIR/libgomnivoicecpp-avx2.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgomnivoicecpp-avx512.so ]; then
if [ -e $CURDIR/libgomnivoicecpp-avx512.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export OMNIVOICE_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/omnivoice-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/omnivoice-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/omnivoice-cpp "$@"
exec $CURDIR/omnivoice-cpp "$@"

View File

@@ -1,30 +1,13 @@
GOCMD?=go
GO_TAGS?=
# The opus shim is a small C wrapper around libopus' variadic
# opus_encoder_ctl (see csrc/opus_shim.c). It is built as a shared library
# and dlopen'd at runtime by the Go backend (codec.go). The extension is
# OS-specific: Linux uses .so, macOS uses .dylib. OS is exported by the root
# Makefile (`export OS := $(shell uname -s)`).
SHIM_EXT=so
OPUS_CFLAGS := $(shell pkg-config --cflags opus)
OPUS_LIBS := $(shell pkg-config --libs opus)
SHIM_LDFLAGS := $(OPUS_LIBS)
ifeq ($(OS),Darwin)
SHIM_EXT=dylib
# Resolve libopus symbols lazily from the already globally-loaded
# libopus (codec.go dlopens it RTLD_GLOBAL before the shim) rather than
# recording an absolute Homebrew path in the dylib. This keeps the
# packaged shim relocatable on machines that have no Homebrew.
SHIM_LDFLAGS := -undefined dynamic_lookup
endif
libopusshim.so: csrc/opus_shim.c
$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(OPUS_LIBS)
libopusshim.$(SHIM_EXT): csrc/opus_shim.c
$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(SHIM_LDFLAGS)
opus: libopusshim.$(SHIM_EXT)
opus: libopusshim.so
$(GOCMD) build -tags "$(GO_TAGS)" -o opus ./
package: opus
@@ -33,7 +16,4 @@ package: opus
build: package
clean:
rm -f opus libopusshim.$(SHIM_EXT)
rm -rf package
.PHONY: build package clean
rm -f opus libopusshim.so

View File

@@ -8,23 +8,13 @@ mkdir -p $CURDIR/package/lib
cp -avf $CURDIR/opus $CURDIR/package/
cp -avf $CURDIR/run.sh $CURDIR/package/
# The shim extension is OS-specific (.so on Linux, .dylib on macOS).
SHIM_EXT=so
if [ "$(uname)" = "Darwin" ]; then
SHIM_EXT=dylib
fi
# Copy the opus shim library
cp -avf $CURDIR/libopusshim.$SHIM_EXT $CURDIR/package/lib/
cp -avf $CURDIR/libopusshim.so $CURDIR/package/lib/
# Copy system libopus so the backend is self-contained: the runtime base
# image has neither libopus-dev (Linux) nor Homebrew (macOS), so codec.go's
# dlopen would otherwise fail. Both name patterns are attempted; only the
# host's matching one exists.
# Copy system libopus
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists opus; then
LIBOPUS_DIR=$(pkg-config --variable=libdir opus)
cp -avf $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
cp -avf $LIBOPUS_DIR/libopus*.dylib $CURDIR/package/lib/ 2>/dev/null || true
cp -avfL $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
fi
# Detect architecture and copy appropriate libraries
@@ -48,8 +38,6 @@ elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
elif [ "$(uname -s)" = "Darwin" ]; then
echo "Detected Darwin — system libraries linked dynamically, no bundled loader needed"
else
echo "Warning: Could not detect architecture for system library bundling"
fi
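
One side of this hunk picks the shim's file extension from `uname`. A small sketch of that choice as a function, so both branches can be checked on any host; `shim_ext` is a hypothetical helper name, while `libopusshim` comes from the script.

```shell
#!/bin/bash
# shim_ext KERNEL_NAME   (hypothetical helper)
# Maps the output of `uname -s` to the shared-library extension used for the shim.
shim_ext() {
    case "$1" in
        Darwin) echo dylib ;;
        *)      echo so ;;
    esac
}

# On the current host:
shim_name="libopusshim.$(shim_ext "$(uname -s)")"
echo "$shim_name"
```

Passing the kernel name as an argument, rather than calling `uname` inside the function, keeps the mapping testable without a macOS machine.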

View File

@@ -1,20 +1,15 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.dylib
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.so
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY=$CURDIR/lib/libopusshim.so
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/opus "$@"
exec $CURDIR/lib/ld.so $CURDIR/opus "$@"
fi
exec "$CURDIR"/opus "$@"
exec $CURDIR/opus "$@"

View File

@@ -1,6 +1,6 @@
# parakeet-cpp backend Makefile.
#
# Upstream pin lives below as PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
# Upstream pin lives below as PARAKEET_VERSION?=89f5e2977b4d8bccd45e7bcc6f2ef7c4ed49e89a
# (.github/bump_deps.sh) can find and update it - matches the
# whisper.cpp / ds4 / vibevoice-cpp convention.
#
@@ -15,7 +15,7 @@
# That's what the L0 smoke test uses. The default target below does the
# proper clone-at-pin + cmake build so CI doesn't need a side-checkout.
PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
PARAKEET_VERSION?=89f5e2977b4d8bccd45e7bcc6f2ef7c4ed49e89a
PARAKEET_REPO?=https://github.com/mudler/parakeet.cpp
GOCMD?=go

View File

@@ -4,10 +4,10 @@ set -e
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${DYLD_LIBRARY_PATH:-}"
export DYLD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${DYLD_LIBRARY_PATH:-}"
export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.dylib"
else
export LD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.so"
fi
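
Both sides of this hunk append `${LD_LIBRARY_PATH:-}` (or the `DYLD_` equivalent), which expands to an empty string when the variable is unset. The sketch below shows what that produces and, as an alternative that is not used in these scripts, the `${VAR:+:$VAR}` form that omits the separator when there is nothing to append. `DEMO_LIBRARY_PATH` and `/opt/backend` are placeholders so the demo does not touch the real loader variables.

```shell
#!/bin/bash
set -u   # make unset-variable mistakes fatal, to show that both forms are safe

unset DEMO_LIBRARY_PATH

# Form used in the script: leaves a trailing ':' when the variable is unset.
plain="/opt/backend/lib:${DEMO_LIBRARY_PATH:-}"

# Alternative: add ':' plus the old value only if the old value is non-empty.
trimmed="/opt/backend/lib${DEMO_LIBRARY_PATH:+:$DEMO_LIBRARY_PATH}"

DEMO_LIBRARY_PATH=/usr/local/lib
extended="/opt/backend/lib${DEMO_LIBRARY_PATH:+:$DEMO_LIBRARY_PATH}"

echo "plain=$plain"
echo "trimmed=$trimmed"
echo "extended=$extended"
```

The trailing colon matters mostly on glibc, where the `ld.so` manual describes a zero-length entry in `LD_LIBRARY_PATH` as the current working directory; whether that is a concern for a given backend depends on where it is launched from.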

View File

@@ -16,15 +16,7 @@ cp -rfv $CURDIR/run.sh $CURDIR/package/
cp -rfLv $CURDIR/sources/go-piper/piper-phonemize/pi/lib/* $CURDIR/package/lib/
# Detect architecture and copy appropriate libraries
if [ "$(uname)" = "Darwin" ]; then
# macOS has no glibc loader to bundle. The piper binary links its bundled
# libs (libucd, libespeak-ng, libpiper_phonemize, libonnxruntime) via
# @rpath but ships with no LC_RPATH, so dyld aborts at launch with
# "Library not loaded: @rpath/libucd.dylib ... no LC_RPATH's found".
# Add an @loader_path/lib rpath so @rpath resolves to package/lib/.
echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/piper"
elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
# x86_64 architecture
echo "Detected x86_64 architecture, copying x86_64 libraries..."
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so

View File

@@ -1,20 +1,15 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
export ESPEAK_NG_DATA="$CURDIR"/espeak-ng-data
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export ESPEAK_NG_DATA=$CURDIR/espeak-ng-data
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/piper "$@"
exec $CURDIR/lib/ld.so $CURDIR/piper "$@"
fi
exec "$CURDIR"/piper "$@"
exec $CURDIR/piper "$@"
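Review note: most line pairs in this hunk, and in the other `run.sh` hunks of this comparison, differ only in whether `$CURDIR` and `$0` are quoted. The difference shows up when the install path contains whitespace. A small sketch, using a made-up path and a hypothetical `count_args` helper, of how the two forms are split by the shell:

```bash
#!/bin/bash
# Hypothetical helper: print how many arguments the shell passed.
count_args() { echo "$#"; }

# Made-up install location containing a space.
dir="/opt/local ai/backend"

count_args $dir/lib/ld.so     # unquoted: word splitting yields 2 arguments
count_args "$dir"/lib/ld.so   # quoted: stays a single argument, prints 1
```

With the unquoted form, `[ -f $CURDIR/lib/ld.so ]` and `exec $CURDIR/piper "$@"` would receive the split words, so the test errors out and `exec` targets the wrong path on such installs.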

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx.so ]; then
if [ -e $CURDIR/libgoqwen3ttscpp-avx.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx2.so ]; then
if [ -e $CURDIR/libgoqwen3ttscpp-avx2.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx512.so ]; then
if [ -e $CURDIR/libgoqwen3ttscpp-avx512.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export QWEN3TTS_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/qwen3-tts-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/qwen3-tts-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/qwen3-tts-cpp "$@"
exec $CURDIR/qwen3-tts-cpp "$@"
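Review note: the Linux branch above picks the library variant by grepping `/proc/cpuinfo` for `avx`, `avx2` and `avx512f` in turn, so the last match wins. The selection logic can be sketched as a pure function that is easy to test in isolation. `pick_variant` is a hypothetical name, and unlike the script it does not check that the matching `.so` exists on disk.

```bash
#!/bin/bash
# Hypothetical helper: given a space-separated CPU flag list, print the most
# capable variant name, mirroring the avx -> avx2 -> avx512 order of the script.
pick_variant() {
    local flags=" $1 " variant="fallback"
    # Pad with spaces so "avx" does not match inside "avx2" or "avx512f".
    case "$flags" in *" avx "*)     variant="avx" ;;    esac
    case "$flags" in *" avx2 "*)    variant="avx2" ;;   esac
    case "$flags" in *" avx512f "*) variant="avx512" ;; esac
    echo "$variant"
}

# Possible use on x86 Linux, where /proc/cpuinfo has a "flags" line:
#   pick_variant "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)"
```

A run script would still need the existence check before assigning `LIBRARY`, since not every variant is necessarily shipped in every package.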

View File

@@ -34,8 +34,6 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DRFDETR_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/librfdetrcpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/librfdetrcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/librfdetrcpp-avx.so ]; then
if [ -e $CURDIR/librfdetrcpp-avx.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/librfdetrcpp-avx2.so ]; then
if [ -e $CURDIR/librfdetrcpp-avx2.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/librfdetrcpp-avx512.so ]; then
if [ -e $CURDIR/librfdetrcpp-avx512.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export RFDETR_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/rfdetr-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/rfdetr-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/rfdetr-cpp "$@"
exec $CURDIR/rfdetr-cpp "$@"

View File

@@ -31,8 +31,6 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgosam3-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgosam3-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgosam3-avx.so ]; then
if [ -e $CURDIR/libgosam3-avx.so ]; then
LIBRARY="$CURDIR/libgosam3-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgosam3-avx2.so ]; then
if [ -e $CURDIR/libgosam3-avx2.so ]; then
LIBRARY="$CURDIR/libgosam3-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgosam3-avx512.so ]; then
if [ -e $CURDIR/libgosam3-avx512.so ]; then
LIBRARY="$CURDIR/libgosam3-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export SAM3_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/sam3-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/sam3-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/sam3-cpp "$@"
exec $CURDIR/sam3-cpp "$@"

View File

@@ -1,19 +1,19 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export SHERPA_SHIM_LIBRARY="$CURDIR"/lib/libsherpa-shim.dylib
export SHERPA_ONNX_LIBRARY="$CURDIR"/lib/libsherpa-onnx-c-api.dylib
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export SHERPA_SHIM_LIBRARY=$CURDIR/lib/libsherpa-shim.dylib
export SHERPA_ONNX_LIBRARY=$CURDIR/lib/libsherpa-onnx-c-api.dylib
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/sherpa-onnx "$@"
exec $CURDIR/lib/ld.so $CURDIR/sherpa-onnx "$@"
fi
exec "$CURDIR"/sherpa-onnx "$@"
exec $CURDIR/sherpa-onnx "$@"

View File

@@ -15,14 +15,7 @@ cp -avf $CURDIR/run.sh $CURDIR/package/
cp -rfLv $CURDIR/backend-assets/lib/* $CURDIR/package/lib/
# Detect architecture and copy appropriate libraries
if [ "$(uname)" = "Darwin" ]; then
# macOS has no glibc loader to bundle. silero-vad links its bundled
# libonnxruntime via @rpath but ships with no LC_RPATH, so dyld can't find
# it at runtime. Add an @loader_path/lib rpath so @rpath resolves to
# package/lib/ (matching the piper darwin fix, #10525).
echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/silero-vad"
elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
# x86_64 architecture
echo "Detected x86_64 architecture, copying x86_64 libraries..."
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so

View File

@@ -1,18 +1,14 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/silero-vad "$@"
exec $CURDIR/lib/ld.so $CURDIR/silero-vad "$@"
fi
exec "$CURDIR"/silero-vad "$@"
exec $CURDIR/silero-vad "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -20,20 +20,20 @@ if [ "$(uname)" = "Darwin" ]; then
if [ ! -e "$LIBRARY" ]; then
LIBRARY="$CURDIR/libgosd-fallback.so"
fi
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgosd-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgosd-avx.so ]; then
if [ -e $CURDIR/libgosd-avx.so ]; then
LIBRARY="$CURDIR/libgosd-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgosd-avx2.so ]; then
if [ -e $CURDIR/libgosd-avx2.so ]; then
LIBRARY="$CURDIR/libgosd-avx2.so"
fi
fi
@@ -41,22 +41,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgosd-avx512.so ]; then
if [ -e $CURDIR/libgosd-avx512.so ]; then
LIBRARY="$CURDIR/libgosd-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export SD_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/stablediffusion-ggml "$@"
exec $CURDIR/lib/ld.so $CURDIR/stablediffusion-ggml "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/stablediffusion-ggml "$@"
exec $CURDIR/stablediffusion-ggml "$@"

View File

@@ -1,21 +1,21 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
if [ "$(uname)" = "Darwin" ]; then
# macOS uses dyld: there is no ld.so loader, and the search path env
# var is DYLD_LIBRARY_PATH. ONNX Runtime ships as a .dylib here.
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.dylib
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH=$CURDIR/lib/libonnxruntime.dylib
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.so
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH=$CURDIR/lib/libonnxruntime.so
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/supertonic "$@"
exec $CURDIR/lib/ld.so $CURDIR/supertonic "$@"
fi
fi
exec "$CURDIR"/supertonic "$@"
exec $CURDIR/supertonic "$@"

View File

@@ -1,7 +1,7 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -14,41 +14,41 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgovibevoicecpp-avx.so ]; then
if [ -e $CURDIR/libgovibevoicecpp-avx.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgovibevoicecpp-avx2.so ]; then
if [ -e $CURDIR/libgovibevoicecpp-avx2.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx2.so"
fi
fi
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgovibevoicecpp-avx512.so ]; then
if [ -e $CURDIR/libgovibevoicecpp-avx512.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export VIBEVOICECPP_LIBRARY=$LIBRARY
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/vibevoice-cpp "$@"
exec $CURDIR/lib/ld.so $CURDIR/vibevoice-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/vibevoice-cpp "$@"
exec $CURDIR/vibevoice-cpp "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -15,35 +15,35 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgovoxtral-fallback.dylib"
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgovoxtral-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgovoxtral-avx.so ]; then
if [ -e $CURDIR/libgovoxtral-avx.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgovoxtral-avx2.so ]; then
if [ -e $CURDIR/libgovoxtral-avx2.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx2.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export VOXTRAL_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it (Linux only)
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/voxtral "$@"
exec $CURDIR/lib/ld.so $CURDIR/voxtral "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/voxtral "$@"
exec $CURDIR/voxtral "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
cd /
@@ -13,28 +13,22 @@ if [ "$(uname)" != "Darwin" ]; then
fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single fallback variant (Metal/Accelerate). The cmake build emits a
# Mach-O named .so, but tolerate .dylib too — pick whichever exists so the Go
# loader doesn't panic on a hardcoded name that isn't on disk.
if [ -e "$CURDIR/libgowhisper-fallback.dylib" ]; then
LIBRARY="$CURDIR/libgowhisper-fallback.dylib"
else
LIBRARY="$CURDIR/libgowhisper-fallback.so"
fi
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgowhisper-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgowhisper-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e "$CURDIR"/libgowhisper-avx.so ]; then
if [ -e $CURDIR/libgowhisper-avx.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e "$CURDIR"/libgowhisper-avx2.so ]; then
if [ -e $CURDIR/libgowhisper-avx2.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx2.so"
fi
fi
@@ -42,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e "$CURDIR"/libgowhisper-avx512.so ]; then
if [ -e $CURDIR/libgowhisper-avx512.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx512.so"
fi
fi
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
fi
export WHISPER_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec "$CURDIR"/lib/ld.so "$CURDIR"/whisper "$@"
exec $CURDIR/lib/ld.so $CURDIR/whisper "$@"
fi
echo "Using library: $LIBRARY"
exec "$CURDIR"/whisper "$@"
exec $CURDIR/whisper "$@"

View File

@@ -340,7 +340,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp"
intel: "intel-sycl-f32-sam3-cpp"
vulkan: "vulkan-sam3-cpp"
metal: "metal-sam3-cpp"
- &rfdetrcpp
name: "rfdetr-cpp"
alias: "rfdetr-cpp"
@@ -369,7 +368,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp"
intel: "intel-sycl-f32-rfdetr-cpp"
vulkan: "vulkan-rfdetr-cpp"
metal: "metal-rfdetr-cpp"
- &locateanything
name: "locate-anything"
alias: "locate-anything"
@@ -399,7 +397,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp"
intel: "intel-sycl-f32-locate-anything-cpp"
vulkan: "vulkan-locate-anything-cpp"
metal: "metal-locate-anything-cpp"
- !!merge <<: *locateanything
name: "locate-anything-development"
capabilities:
@@ -412,7 +409,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp-development"
intel: "intel-sycl-f32-locate-anything-cpp-development"
vulkan: "vulkan-locate-anything-cpp-development"
metal: "metal-locate-anything-cpp-development"
- !!merge <<: *locateanything
name: "cpu-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-locate-anything-cpp"
@@ -423,16 +419,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-locate-anything-cpp"
mirrors:
- localai/localai-backends:master-cpu-locate-anything-cpp
- !!merge <<: *locateanything
name: "metal-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-locate-anything-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
name: "metal-locate-anything-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-locate-anything-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
name: "cuda12-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-locate-anything-cpp"
@@ -531,7 +517,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp"
intel: "intel-sycl-f32-depth-anything-cpp"
vulkan: "vulkan-depth-anything-cpp"
metal: "metal-depth-anything-cpp"
- !!merge <<: *depthanything
name: "depth-anything-development"
capabilities:
@@ -544,7 +529,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp-development"
intel: "intel-sycl-f32-depth-anything-cpp-development"
vulkan: "vulkan-depth-anything-cpp-development"
metal: "metal-depth-anything-cpp-development"
- !!merge <<: *depthanything
name: "cpu-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-depth-anything-cpp"
@@ -555,16 +539,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-depth-anything-cpp"
mirrors:
- localai/localai-backends:master-cpu-depth-anything-cpp
- !!merge <<: *depthanything
name: "metal-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-depth-anything-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
name: "metal-depth-anything-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-depth-anything-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
name: "cuda12-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-depth-anything-cpp"
@@ -1057,8 +1031,6 @@
nvidia-l4t: "vulkan-localvqe"
nvidia-l4t-cuda-12: "vulkan-localvqe"
nvidia-l4t-cuda-13: "vulkan-localvqe"
# Apple Silicon: CPU build (LocalVQE has no Metal path); still arm64-native.
metal: "metal-localvqe"
- &privacyfilter
name: "privacy-filter"
alias: "privacy-filter"
@@ -1095,7 +1067,6 @@
amd: "vulkan-privacy-filter"
intel: "vulkan-privacy-filter"
vulkan: "vulkan-privacy-filter"
metal: "metal-privacy-filter"
- &faster-whisper
icon: https://avatars.githubusercontent.com/u/1520500?s=200&v=4
description: |
@@ -2938,16 +2909,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-vulkan-privacy-filter"
mirrors:
- localai/localai-backends:master-gpu-vulkan-privacy-filter
- !!merge <<: *privacyfilter
name: "metal-privacy-filter"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-privacy-filter"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-privacy-filter
- !!merge <<: *privacyfilter
name: "metal-privacy-filter-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-privacy-filter"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-privacy-filter
- !!merge <<: *privacyfilter
name: "cuda13-privacy-filter"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-13-privacy-filter"
@@ -3259,7 +3220,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp-development"
intel: "intel-sycl-f32-sam3-cpp-development"
vulkan: "vulkan-sam3-cpp-development"
metal: "metal-sam3-cpp-development"
- !!merge <<: *sam3cpp
name: "cpu-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-sam3-cpp"
@@ -3270,16 +3230,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-sam3-cpp"
mirrors:
- localai/localai-backends:master-cpu-sam3-cpp
- !!merge <<: *sam3cpp
name: "metal-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-sam3-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
name: "metal-sam3-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-sam3-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
name: "cuda12-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-sam3-cpp"
@@ -3353,7 +3303,6 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp-development"
intel: "intel-sycl-f32-rfdetr-cpp-development"
vulkan: "vulkan-rfdetr-cpp-development"
metal: "metal-rfdetr-cpp-development"
- !!merge <<: *rfdetrcpp
name: "cpu-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-rfdetr-cpp"
@@ -3364,16 +3313,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-rfdetr-cpp"
mirrors:
- localai/localai-backends:master-cpu-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "metal-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-rfdetr-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "metal-rfdetr-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-rfdetr-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "cuda12-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-rfdetr-cpp"
@@ -4162,16 +4101,6 @@
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-vulkan-localvqe"
mirrors:
- localai/localai-backends:master-gpu-vulkan-localvqe
- !!merge <<: *localvqecpp
name: "metal-localvqe"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-localvqe"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-localvqe
- !!merge <<: *localvqecpp
name: "metal-localvqe-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-localvqe"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-localvqe
## kokoro
- !!merge <<: *kokoro
name: "kokoro-development"

View File

@@ -7,7 +7,3 @@ setuptools
six
scipy
numpy
# fish-speech is installed editable with --no-build-isolation, so the build
# backends of its transitive deps must already be in the venv. One of them
# builds a Rust extension and needs setuptools-rust present at metadata time.
setuptools-rust

View File

@@ -11,31 +11,14 @@ fi
EXTRA_PIP_INSTALL_FLAGS+=" --upgrade "
installRequirements
# Fetch convert_hf_to_gguf.py from llama.cpp.
# Upstream split the model-specific logic out of the single file into a
# sibling `conversion/` package (convert_hf_to_gguf.py now does
# `from conversion import ...`), so a single-file download no longer runs —
# it fails with `ModuleNotFoundError: No module named 'conversion'`. We clone
# the repo and copy both the script and the package; Python puts the script's
# own directory on sys.path[0], so the package resolves when placed beside it.
# Fetch convert_hf_to_gguf.py from llama.cpp
LLAMA_CPP_CONVERT_VERSION="${LLAMA_CPP_CONVERT_VERSION:-master}"
LLAMA_CPP_SRC="${EDIR}/llama.cpp"
CONVERT_SCRIPT="${EDIR}/convert_hf_to_gguf.py"
cloneLlamaCpp() {
if [ ! -d "${LLAMA_CPP_SRC}/.git" ]; then
git clone --depth 1 --branch "${LLAMA_CPP_CONVERT_VERSION}" \
https://github.com/ggml-org/llama.cpp.git "${LLAMA_CPP_SRC}" 2>/dev/null || \
git clone --depth 1 https://github.com/ggml-org/llama.cpp.git "${LLAMA_CPP_SRC}"
fi
}
if [ ! -f "${CONVERT_SCRIPT}" ] || [ ! -d "${EDIR}/conversion" ]; then
echo "Fetching convert_hf_to_gguf.py + conversion/ from llama.cpp (${LLAMA_CPP_CONVERT_VERSION})..."
cloneLlamaCpp
cp "${LLAMA_CPP_SRC}/convert_hf_to_gguf.py" "${CONVERT_SCRIPT}"
rm -rf "${EDIR}/conversion"
cp -r "${LLAMA_CPP_SRC}/conversion" "${EDIR}/conversion"
if [ ! -f "${CONVERT_SCRIPT}" ]; then
echo "Downloading convert_hf_to_gguf.py from llama.cpp (${LLAMA_CPP_CONVERT_VERSION})..."
curl -L --fail --retry 3 \
"https://raw.githubusercontent.com/ggml-org/llama.cpp/${LLAMA_CPP_CONVERT_VERSION}/convert_hf_to_gguf.py" \
-o "${CONVERT_SCRIPT}" || echo "Warning: Failed to download convert_hf_to_gguf.py."
fi
# Install gguf package from the same llama.cpp commit to keep them in sync
@@ -58,7 +41,12 @@ QUANTIZE_BIN="${EDIR}/llama-quantize"
if [ ! -x "${QUANTIZE_BIN}" ] && ! command -v llama-quantize &>/dev/null; then
if command -v cmake &>/dev/null; then
echo "Building llama-quantize from llama.cpp (${LLAMA_CPP_CONVERT_VERSION})..."
cloneLlamaCpp # reuses the clone fetched for convert_hf_to_gguf.py
LLAMA_CPP_SRC="${EDIR}/llama.cpp"
if [ ! -d "${LLAMA_CPP_SRC}" ]; then
git clone --depth 1 --branch "${LLAMA_CPP_CONVERT_VERSION}" \
https://github.com/ggml-org/llama.cpp.git "${LLAMA_CPP_SRC}" 2>/dev/null || \
git clone --depth 1 https://github.com/ggml-org/llama.cpp.git "${LLAMA_CPP_SRC}"
fi
cmake -B "${LLAMA_CPP_SRC}/build" -S "${LLAMA_CPP_SRC}" -DGGML_NATIVE=OFF -DBUILD_SHARED_LIBS=OFF
cmake --build "${LLAMA_CPP_SRC}/build" --target llama-quantize -j"$(nproc 2>/dev/null || echo 2)"
cp "${LLAMA_CPP_SRC}/build/bin/llama-quantize" "${QUANTIZE_BIN}"

View File

@@ -85,15 +85,9 @@ if [ "x${BUILD_TYPE}" == "x" ] || [ "x${FROM_SOURCE:-}" == "xtrue" ]; then
# The resulting binary still requires an AVX-512 capable CPU at runtime,
# same constraint sglang upstream documents in docker/xeon.Dockerfile.
# Pin the source build to the same release the GPU path floors on
# (0.5.11, see requirements-cublas12-after.txt). An unpinned master clone
# pulls in newer CPU kernels (e.g. mamba/fla.cpp) that fail to compile
# (constexpr non-constant + kineto_LIBRARY-NOTFOUND). Bump deliberately.
SGLANG_VERSION="${SGLANG_VERSION:-v0.5.11}"
_sgl_src=$(mktemp -d)
trap 'rm -rf "${_sgl_src}"' EXIT
git clone --depth 1 --branch "${SGLANG_VERSION}" \
https://github.com/sgl-project/sglang "${_sgl_src}/sglang"
git clone --depth 1 https://github.com/sgl-project/sglang "${_sgl_src}/sglang"
# Patch -march=native → -march=sapphirerapids in the CPU kernel CMakeLists
sed -i 's/-march=native/-march=sapphirerapids/g' \

View File

@@ -1,6 +1,6 @@
--extra-index-url https://download.pytorch.org/whl/cpu
accelerate
torch==2.9.1+cpu
torch==2.12.1+xpu
torchvision
torchaudio
transformers

View File

@@ -1,23 +1,23 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath "$0")")
CURDIR=$(dirname "$(realpath $0)")
export LD_LIBRARY_PATH="$CURDIR"/lib:${LD_LIBRARY_PATH:-}
export LD_LIBRARY_PATH=$CURDIR/lib:${LD_LIBRARY_PATH:-}
# SSL certificates for model auto-download
if [ -d "$CURDIR/etc/ssl/certs" ]; then
export SSL_CERT_DIR="$CURDIR"/etc/ssl/certs
export SSL_CERT_DIR=$CURDIR/etc/ssl/certs
fi
# espeak-ng data directory
if [ -d "$CURDIR/espeak-ng-data" ]; then
export ESPEAK_NG_DATA="$CURDIR"/espeak-ng-data
export ESPEAK_NG_DATA=$CURDIR/espeak-ng-data
fi
# Use bundled ld.so if present (portability)
if [ -f "$CURDIR"/lib/ld.so ]; then
exec "$CURDIR"/lib/ld.so "$CURDIR"/kokoros-grpc "$@"
if [ -f $CURDIR/lib/ld.so ]; then
exec $CURDIR/lib/ld.so $CURDIR/kokoros-grpc "$@"
fi
exec "$CURDIR"/kokoros-grpc "$@"
exec $CURDIR/kokoros-grpc "$@"

View File

@@ -570,43 +570,6 @@ impl Backend for KokorosService {
) -> Result<Response<backend::Result>, Status> {
Err(Status::unimplemented("Not supported"))
}
async fn sound_detection(
&self,
_: Request<backend::SoundDetectionRequest>,
) -> Result<Response<backend::SoundDetectionResponse>, Status> {
Err(Status::unimplemented("Not supported"))
}
async fn depth(
&self,
_: Request<backend::DepthRequest>,
) -> Result<Response<backend::DepthResponse>, Status> {
Err(Status::unimplemented("Not supported"))
}
async fn token_classify(
&self,
_: Request<backend::TokenClassifyRequest>,
) -> Result<Response<backend::TokenClassifyResponse>, Status> {
Err(Status::unimplemented("Not supported"))
}
async fn score(
&self,
_: Request<backend::ScoreRequest>,
) -> Result<Response<backend::ScoreResponse>, Status> {
Err(Status::unimplemented("Not supported"))
}
type ForwardStream = ReceiverStream<Result<backend::ForwardReply, Status>>;
async fn forward(
&self,
_: Request<tonic::Streaming<backend::ForwardRequest>>,
) -> Result<Response<Self::ForwardStream>, Status> {
Err(Status::unimplemented("Not supported"))
}
}
#[cfg(test)]

View File

@@ -1,8 +0,0 @@
Website = "https://localai.io"
[Details]
Icon = "../../core/http/static/logo.png"
Name = "LocalAI"
ID = "com.localai.launcher"
Version = "0.0.0"
Build = 1

View File

@@ -1,14 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.cs.allow-jit</key>
<true/>
<key>com.apple.security.cs.allow-unsigned-executable-memory</key>
<true/>
</dict>
</plist>

View File

@@ -1,84 +0,0 @@
#!/usr/bin/env bash
# Code-sign and notarize macOS artifacts for LocalAI.
# Every sub-command is a no-op (exit 0) when its required secret is unset,
# so unsigned builds (forks, local dev, PRs) keep working.
set -euo pipefail
ENTITLEMENTS="contrib/macos/Launcher.entitlements"
KEYCHAIN="localai-ci.keychain-db"
cmd_import_cert() {
if [ -z "${MACOS_CERTIFICATE:-}" ]; then
echo "[sign] MACOS_CERTIFICATE unset: skipping cert import (unsigned build)"
return 0
fi
local certfile keychain_pwd default_keychain
certfile="$(mktemp).p12"
keychain_pwd="${MACOS_CI_KEYCHAIN_PWD:?MACOS_CI_KEYCHAIN_PWD required when signing}"
echo "$MACOS_CERTIFICATE" | base64 --decode > "$certfile"
security create-keychain -p "$keychain_pwd" "$KEYCHAIN"
security set-keychain-settings -lut 21600 "$KEYCHAIN"
security unlock-keychain -p "$keychain_pwd" "$KEYCHAIN"
security import "$certfile" -k "$KEYCHAIN" -P "${MACOS_CERTIFICATE_PWD:?}" \
-T /usr/bin/codesign -T /usr/bin/security
security set-key-partition-list -S apple-tool:,apple:,codesign: \
-s -k "$keychain_pwd" "$KEYCHAIN" >/dev/null
default_keychain="$(security default-keychain | tr -d ' "')"
security list-keychains -d user -s "$KEYCHAIN" "$default_keychain"
rm -f "$certfile"
echo "[sign] certificate imported into $KEYCHAIN"
}
cmd_sign() {
local target="$1"
if [ -z "${MACOS_SIGN_IDENTITY:-}" ]; then
echo "[sign] MACOS_SIGN_IDENTITY unset: skipping codesign of $target"
return 0
fi
case "$target" in
*.app)
# Hardened runtime + entitlements are required for notarizing the app bundle.
codesign --deep --force --options runtime --timestamp \
--entitlements "$ENTITLEMENTS" \
--sign "$MACOS_SIGN_IDENTITY" "$target"
;;
*)
# A disk image carries no entitlements/runtime; just sign the container.
codesign --force --timestamp --sign "$MACOS_SIGN_IDENTITY" "$target"
;;
esac
codesign --verify --strict --verbose=2 "$target"
echo "[sign] signed $target"
}
cmd_notarize() {
local dmg="$1"
if [ -z "${MACOS_NOTARY_KEY:-}" ]; then
echo "[notarize] MACOS_NOTARY_KEY unset: skipping notarization of $dmg"
return 0
fi
local keyfile
keyfile="$(mktemp).p8"
echo "$MACOS_NOTARY_KEY" | base64 --decode > "$keyfile"
xcrun notarytool submit "$dmg" \
--key "$keyfile" \
--key-id "${MACOS_NOTARY_KEY_ID:?}" \
--issuer "${MACOS_NOTARY_ISSUER_ID:?}" \
--wait
rm -f "$keyfile"
xcrun stapler staple "$dmg"
xcrun stapler validate "$dmg"
echo "[notarize] notarized and stapled $dmg"
}
main() {
local sub="${1:-}"; shift || true
case "$sub" in
import-cert) cmd_import_cert ;;
sign) cmd_sign "$@" ;;
notarize) cmd_notarize "$@" ;;
*) echo "usage: $0 {import-cert|sign <path>|notarize <dmg>}" >&2; exit 2 ;;
esac
}
main "$@"

View File

@@ -16,7 +16,6 @@ import (
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/jobs"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/core/services/monitoring"
"github.com/mudler/LocalAI/core/services/nodes"
"github.com/mudler/LocalAI/core/services/routing/admission"
@@ -331,14 +330,9 @@ func New(opts ...config.AppOption) (*Application, error) {
gs := application.galleryService
sys := options.SystemState
cfgLoaderOpts := options.ToConfigLoaderOptions()
gs.OnModelsChanged = func(evt messaging.CacheInvalidateEvent) {
// ApplyRemoteChange honors the op: a "delete" prunes the element
// (a reload-from-path is additive and cannot drop it), anything
// else reloads from disk; a named element's running instance is
// shut down so the new config takes effect. The originating
// replica reloads inline and never depends on this path.
if err := modeladmin.ApplyRemoteChange(application.ModelConfigLoader(), application.modelLoader, sys.Model.ModelsPath, evt, cfgLoaderOpts...); err != nil {
xlog.Warn("Failed to apply peer model config change", "error", err)
gs.OnModelsChanged = func(_ messaging.CacheInvalidateEvent) {
if err := application.ModelConfigLoader().LoadModelConfigsFromPath(sys.Model.ModelsPath, cfgLoaderOpts...); err != nil {
xlog.Warn("Failed to reload model configs after peer invalidation", "error", err)
}
}
if err := application.galleryService.SubscribeBroadcasts(); err != nil {

View File

@@ -203,7 +203,6 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
system.WithBackendImagesReleaseTag(r.BackendImagesReleaseTag),
system.WithBackendImagesBranchTag(r.BackendImagesBranchTag),
system.WithBackendDevSuffix(r.BackendDevSuffix),
system.WithPreferDevelopmentBackends(r.PreferDevelopmentBackends),
)
if err != nil {
return err

View File

@@ -59,22 +59,6 @@ func getFallbackTagValues(systemState *system.SystemState) (latestTag, masterTag
return latestTag, masterTag, devSuffix
}
// developmentURI returns the development image URI for a released backend URI by
// swapping the released tag for the branch tag (e.g.
// latest-metal-darwin-arm64-llama-cpp -> master-metal-darwin-arm64-llama-cpp).
// The branch image tracks development. ok is false when uri has no released tag
// to swap or already uses the branch tag.
func developmentURI(uri, latestTag, masterTag string) (string, bool) {
if strings.Contains(uri, masterTag+"-") {
return "", false
}
branchURI := strings.Replace(uri, latestTag+"-", masterTag+"-", 1)
if branchURI == uri {
return "", false
}
return branchURI, true
}
// backendCandidate represents an installed concrete backend option for a given alias
type backendCandidate struct {
name string
@@ -311,28 +295,15 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
return fmt.Errorf("backend %q: %w", config.Name, optsErr)
}
// PreferDevelopmentBackends installs the development image as the primary URI,
// keeping the released image reachable as the first fallback — instead of only
// reaching development when the released image is missing.
primaryURI := string(config.URI)
mirrors := config.Mirrors
if systemState.PreferDevelopmentBackends {
if devURI, ok := developmentURI(string(config.URI), latestTag, masterTag); ok {
xlog.Info("PreferDevelopmentBackends: installing development image first", "development", devURI, "released", config.URI)
primaryURI = devURI
mirrors = append([]string{string(config.URI)}, config.Mirrors...)
}
}
uri := downloader.URI(primaryURI)
uri := downloader.URI(config.URI)
// Check if it is a directory
if uri.LooksLikeDir() {
// It is a directory, we just copy it over in the backend folder
if err := cp.Copy(string(uri), backendPath); err != nil {
if err := cp.Copy(config.URI, backendPath); err != nil {
return fmt.Errorf("failed copying: %w", err)
}
} else {
xlog.Debug("Downloading backend", "uri", primaryURI, "backendPath", backendPath)
xlog.Debug("Downloading backend", "uri", config.URI, "backendPath", backendPath)
if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err != nil {
xlog.Debug("Backend download failed, trying fallback", "backendPath", backendPath, "error", err)
@@ -345,9 +316,8 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
}
success := false
// Try to download from mirrors (when development is preferred, the
// released image is prepended here as the first fallback).
for _, mirror := range mirrors {
// Try to download from mirrors
for _, mirror := range config.Mirrors {
// Check for cancellation before trying next mirror
select {
case <-ctx.Done():

View File

@@ -1,26 +0,0 @@
package gallery
import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
var _ = Describe("developmentURI", func() {
const latest, master = "latest", "master"
It("rewrites a released image to its branch (development) image", func() {
got, ok := developmentURI("quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-llama-cpp", latest, master)
Expect(ok).To(BeTrue())
Expect(got).To(Equal("quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-llama-cpp"))
})
It("leaves an image already on the branch tag untouched", func() {
_, ok := developmentURI("quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-llama-cpp", latest, master)
Expect(ok).To(BeFalse())
})
It("returns ok=false when there is no released tag to swap", func() {
_, ok := developmentURI("oci://localhost/custom-backend:edge", latest, master)
Expect(ok).To(BeFalse())
})
})
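
The ordering that the removed `PreferDevelopmentBackends` path produced can be sketched in isolation. This is a minimal, hedged illustration rather than the project's code: `swapToBranchTag` mirrors the removed `developmentURI` helper, while `downloadOrder` and the mirror host `mirror.example` are hypothetical names introduced only for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// swapToBranchTag mirrors the removed developmentURI helper: it swaps the
// released tag prefix for the branch tag prefix, and reports ok=false when
// the URI already uses the branch tag or has no released tag to swap.
func swapToBranchTag(uri, latestTag, masterTag string) (string, bool) {
	if strings.Contains(uri, masterTag+"-") {
		return "", false
	}
	branchURI := strings.Replace(uri, latestTag+"-", masterTag+"-", 1)
	if branchURI == uri {
		return "", false
	}
	return branchURI, true
}

// downloadOrder (hypothetical) returns the URIs in the order they would be
// tried: development image first when preferred, then the released image,
// then the configured mirrors.
func downloadOrder(uri string, mirrors []string, preferDev bool) []string {
	order := []string{uri}
	if preferDev {
		if devURI, ok := swapToBranchTag(uri, "latest", "master"); ok {
			order = []string{devURI, uri}
		}
	}
	return append(order, mirrors...)
}

func main() {
	released := "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-llama-cpp"
	for _, u := range downloadOrder(released, []string{"mirror.example/llama-cpp"}, true) {
		fmt.Println(u)
	}
}
```

With this change applied, only the released URI and `config.Mirrors` remain in the order; the development image is no longer tried first.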

View File

@@ -3,51 +3,10 @@
package auth
import (
"net/url"
"strings"
"gorm.io/driver/sqlite"
"gorm.io/gorm"
)
func openSQLiteDialector(path string) (gorm.Dialector, error) {
return sqlite.Open(buildSQLiteDSN(path)), nil
}
// buildSQLiteDSN augments a SQLite file path with connection pragmas that make
// the auth DB resilient on slow or contended storage.
//
// - _busy_timeout=5000 makes SQLite retry for up to 5s on SQLITE_BUSY instead
// of failing immediately. Network-backed storage (SMB/CIFS/NFS, e.g. Azure
// Files) is prone to transient lock contention during migration (see #10506).
// - _txlock=immediate takes the write lock at BEGIN, avoiding deadlocks when a
// read transaction later upgrades to a write during AutoMigrate.
//
// We deliberately do NOT set WAL journal mode: WAL relies on a shared-memory
// mmap that does not work over SMB/NFS, which is exactly the failing case here.
//
// Caller-supplied values for either pragma are preserved.
func buildSQLiteDSN(path string) string {
base := path
rawQuery := ""
if i := strings.IndexByte(path, '?'); i >= 0 {
base = path[:i]
rawQuery = path[i+1:]
}
values, err := url.ParseQuery(rawQuery)
if err != nil {
// An unparseable query string means a hand-crafted DSN we should not
// risk corrupting; leave it untouched.
return path
}
if values.Get("_busy_timeout") == "" {
values.Set("_busy_timeout", "5000")
}
if values.Get("_txlock") == "" {
values.Set("_txlock", "immediate")
}
return base + "?" + values.Encode()
return sqlite.Open(path), nil
}
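
For reference, the effect of the removed DSN augmentation can be reproduced with the standard library alone. The sketch below is an assumption-labelled stand-in: `withSQLiteDefaults` is a hypothetical name, it uses `strings.Cut` instead of the original `IndexByte` split, and it only shows how defaults are added without overriding caller-supplied values.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// withSQLiteDefaults (hypothetical name) appends _busy_timeout and _txlock
// to a SQLite DSN unless the caller already set them.
func withSQLiteDefaults(path string) string {
	base, rawQuery, _ := strings.Cut(path, "?")
	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		// Unparseable query: return the DSN untouched rather than risk corrupting it.
		return path
	}
	if values.Get("_busy_timeout") == "" {
		values.Set("_busy_timeout", "5000")
	}
	if values.Get("_txlock") == "" {
		values.Set("_txlock", "immediate")
	}
	// Encode sorts by key, so the output order is deterministic.
	return base + "?" + values.Encode()
}

func main() {
	fmt.Println(withSQLiteDefaults("/data/database.db"))
	fmt.Println(withSQLiteDefaults("/data/database.db?_txlock=deferred"))
}
```

After this change the path is handed to `sqlite.Open` as-is, so any such pragmas have to be supplied by the caller in the DSN.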

View File

@@ -1,57 +0,0 @@
//go:build auth
package auth
import (
"net/url"
"strings"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
// parseDSN splits a "base?query" DSN into its base and decoded query values so
// assertions don't depend on url.Values.Encode()'s key ordering.
func parseDSN(dsn string) (string, url.Values) {
base := dsn
rawQuery := ""
if i := strings.IndexByte(dsn, '?'); i >= 0 {
base = dsn[:i]
rawQuery = dsn[i+1:]
}
values, err := url.ParseQuery(rawQuery)
Expect(err).ToNot(HaveOccurred())
return base, values
}
var _ = Describe("buildSQLiteDSN", func() {
It("adds busy_timeout and txlock to a plain file path", func() {
base, values := parseDSN(buildSQLiteDSN("/data/database.db"))
Expect(base).To(Equal("/data/database.db"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("adds pragmas to an in-memory database", func() {
base, values := parseDSN(buildSQLiteDSN(":memory:"))
Expect(base).To(Equal(":memory:"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("preserves an existing query string", func() {
base, values := parseDSN(buildSQLiteDSN("/data/database.db?cache=shared"))
Expect(base).To(Equal("/data/database.db"))
Expect(values.Get("cache")).To(Equal("shared"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("does not override a caller-supplied busy_timeout or txlock", func() {
_, values := parseDSN(buildSQLiteDSN("/data/database.db?_busy_timeout=1000&_txlock=deferred"))
Expect(values["_busy_timeout"]).To(HaveLen(1), "_busy_timeout should not be duplicated")
Expect(values.Get("_busy_timeout")).To(Equal("1000"))
Expect(values["_txlock"]).To(HaveLen(1), "_txlock should not be duplicated")
Expect(values.Get("_txlock")).To(Equal("deferred"))
})
})

View File

@@ -155,7 +155,7 @@ func AutocompleteEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, a
// @Param name path string true "Model name"
// @Success 200 {object} map[string]any "success message"
// @Router /api/models/config-json/{name} [patch]
func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -173,14 +173,6 @@ func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, gs
if _, err := svc.PatchConfig(c.Request().Context(), modelName, patchMap); err != nil {
return c.JSON(httpStatusForModelAdminError(err), map[string]any{"error": err.Error()})
}
// Patch rewrites the config on disk and reloads only the local loader;
// tell peers to refresh so the change is consistent across replicas.
// No-op in standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelName, "install")
}
return c.JSON(http.StatusOK, map[string]any{
"success": true,
"message": fmt.Sprintf("Model '%s' updated successfully", modelName),

View File

@@ -45,7 +45,7 @@ var _ = Describe("Config Metadata Endpoints", func() {
app = echo.New()
app.GET("/api/models/config-metadata", ConfigMetadataEndpoint())
app.GET("/api/models/config-metadata/autocomplete/:provider", AutocompleteEndpoint(configLoader, modelLoader, appConfig))
app.PATCH("/api/models/config-json/:name", PatchConfigEndpoint(configLoader, modelLoader, nil, appConfig))
app.PATCH("/api/models/config-json/:name", PatchConfigEndpoint(configLoader, modelLoader, appConfig))
})
AfterEach(func() {

View File

@@ -10,7 +10,6 @@ import (
"github.com/labstack/echo/v4"
"github.com/mudler/LocalAI/core/config"
httpUtils "github.com/mudler/LocalAI/core/http/middleware"
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/internal"
"github.com/mudler/LocalAI/pkg/model"
@@ -56,7 +55,7 @@ func GetEditModelPage(cl *config.ModelConfigLoader, appConfig *config.Applicatio
}
// EditModelEndpoint handles updating existing model configurations
func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -71,17 +70,6 @@ func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, gs *
if err != nil {
return c.JSON(httpStatusForModelAdminError(err), ModelResponse{Success: false, Error: err.Error()})
}
// Tell peer replicas to refresh their in-memory config: this endpoint
// only reloaded the local loader. A rename is a delete of the old name
// plus an install of the new one. No-op in standalone mode.
if gs != nil {
if result.Renamed {
gs.BroadcastModelsChanged(result.OldName, "delete")
}
gs.BroadcastModelsChanged(result.NewName, "install")
}
msg := fmt.Sprintf("Model '%s' updated successfully. Model has been reloaded with new configuration.", result.NewName)
if result.Renamed {
msg = fmt.Sprintf("Model '%s' renamed to '%s' and updated successfully.", result.OldName, result.NewName)

View File

@@ -56,7 +56,7 @@ var _ = Describe("Edit Model test", func() {
app := echo.New()
// Set up a simple renderer for the test
app.Renderer = &testRenderer{}
app.POST("/import-model", ImportModelEndpoint(modelConfigLoader, nil, applicationConfig))
app.POST("/import-model", ImportModelEndpoint(modelConfigLoader, applicationConfig))
app.GET("/edit-model/:name", GetEditModelPage(modelConfigLoader, applicationConfig))
requestBody := bytes.NewBufferString(`{"name": "foo", "backend": "foo", "model": "foo"}`)
@@ -106,7 +106,7 @@ var _ = Describe("Edit Model test", func() {
Expect(exists).To(BeTrue())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
newYAML := "name: newname\nbackend: llama\nmodel: foo\n"
req := httptest.NewRequest("POST", "/models/edit/oldname", bytes.NewBufferString(newYAML))
@@ -163,7 +163,7 @@ var _ = Describe("Edit Model test", func() {
Expect(modelConfigLoader.LoadModelConfigsFromPath(tempDir)).To(Succeed())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
req := httptest.NewRequest(
"POST",
@@ -204,7 +204,7 @@ var _ = Describe("Edit Model test", func() {
Expect(modelConfigLoader.LoadModelConfigsFromPath(tempDir)).To(Succeed())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
req := httptest.NewRequest(
"POST",

View File

@@ -125,7 +125,7 @@ func ImportModelURIEndpoint(cl *config.ModelConfigLoader, appConfig *config.Appl
}
// ImportModelEndpoint handles creating new model configurations
func ImportModelEndpoint(cl *config.ModelConfigLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func ImportModelEndpoint(cl *config.ModelConfigLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
return func(c echo.Context) error {
// Get the raw body
body, err := io.ReadAll(c.Request().Body)
@@ -245,13 +245,6 @@ func ImportModelEndpoint(cl *config.ModelConfigLoader, gs *galleryop.GalleryServ
}
return c.JSON(http.StatusInternalServerError, response)
}
// Tell peer replicas to load the newly-created config from the shared
// models dir: this endpoint only reloaded the local loader. No-op in
// standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelConfig.Name, "install")
}
// Return success response
response := ModelResponse{
Success: true,

View File

@@ -60,10 +60,7 @@ func GetNodeEndpoint(registry *nodes.NodeRegistry) echo.HandlerFunc {
return func(c echo.Context) error {
ctx := c.Request().Context()
id := c.Param("id")
// GetWithExtras (not Get) so the response carries the node's labels,
// loaded-model count, and in-flight total — the bare BackendNode keeps
// labels in a separate table, leaving the detail view's label list empty.
node, err := registry.GetWithExtras(ctx, id)
node, err := registry.Get(ctx, id)
if err != nil {
return c.JSON(http.StatusNotFound, nodeError(http.StatusNotFound, "node not found"))
}

View File

@@ -7,7 +7,6 @@ import (
"github.com/labstack/echo/v4"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/pkg/model"
)
@@ -25,7 +24,7 @@ import (
// @Failure 404 {object} ModelResponse
// @Failure 500 {object} ModelResponse
// @Router /api/models/{name}/{action} [put]
func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -37,14 +36,6 @@ func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoade
if err != nil {
return c.JSON(httpStatusForModelAdminError(err), ModelResponse{Success: false, Error: err.Error()})
}
// Enabling/disabling rewrites the config on disk and reloads only the
// local loader; tell peers to refresh so the model's availability is
// consistent across replicas. No-op in standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelName, "install")
}
msg := fmt.Sprintf("Model '%s' has been %sd successfully.", modelName, action)
if action == modeladmin.ActionDisable {
msg += " The model will not be loaded on demand until re-enabled."

View File

@@ -72,19 +72,19 @@ func RegisterLocalAIRoutes(router *echo.Echo,
router.POST("/backends/upgrades/check", backendGalleryEndpointService.CheckUpgradesEndpoint(), adminMiddleware)
router.POST("/backends/upgrade/:name", backendGalleryEndpointService.UpgradeBackendEndpoint(), adminMiddleware)
// Custom model import endpoint
router.POST("/models/import", localai.ImportModelEndpoint(cl, galleryService, appConfig), adminMiddleware)
router.POST("/models/import", localai.ImportModelEndpoint(cl, appConfig), adminMiddleware)
// URI model import endpoint
router.POST("/models/import-uri", localai.ImportModelURIEndpoint(cl, appConfig, galleryService, opcache), adminMiddleware)
// Custom model edit endpoint
router.POST("/models/edit/:name", localai.EditModelEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
router.POST("/models/edit/:name", localai.EditModelEndpoint(cl, ml, appConfig), adminMiddleware)
// List model aliases endpoint
router.GET("/api/aliases", localai.ListAliasesEndpoint(cl), adminMiddleware)
// Toggle model enable/disable endpoint
router.PUT("/models/toggle-state/:name/:action", localai.ToggleStateModelEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
router.PUT("/models/toggle-state/:name/:action", localai.ToggleStateModelEndpoint(cl, ml, appConfig), adminMiddleware)
// Toggle model pinned status endpoint
router.PUT("/models/toggle-pinned/:name/:action", localai.TogglePinnedModelEndpoint(cl, appConfig, func() {

View File

@@ -922,7 +922,7 @@ func RegisterUIAPIRoutes(app *echo.Echo, cl *config.ModelConfigLoader, ml *model
app.GET("/api/models/config-metadata/autocomplete/:provider", localai.AutocompleteEndpoint(cl, ml, appConfig), adminMiddleware)
// PATCH config endpoint - partial update using nested JSON merge
app.PATCH("/api/models/config-json/:name", localai.PatchConfigEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
app.PATCH("/api/models/config-json/:name", localai.PatchConfigEndpoint(cl, ml, appConfig), adminMiddleware)
// VRAM estimation endpoint
app.POST("/api/models/vram-estimate", localai.VRAMEstimateEndpoint(cl, appConfig), adminMiddleware)

View File

@@ -68,32 +68,6 @@ var _ = Describe("LLM tests", func() {
Expect(protoMessages[0].Content).To(Equal("Hello World"))
})
// Regression for mudler/LocalAI#10524: a text part whose inner text is
// itself a JSON-array string (mealie sends an ingredient list) must
// flatten to that exact string verbatim. ToProto must NOT escape or
// restructure it - the C++ backend then treats it as opaque text. This
// pins the precise Go-side input that produced the "unsupported
// content[].type" gRPC error before the backend stopped re-parsing it.
It("flattens a JSON-array-looking text part to the verbatim string (#10524)", func() {
ingredients := `["1/4 cup brown sugar, packed","1 pound ground beef"]`
messages := Messages{
{
Role: "user",
Content: []any{
map[string]any{
"type": "text",
"text": ingredients,
},
},
},
}
protoMessages := messages.ToProto()
Expect(protoMessages).To(HaveLen(1))
Expect(protoMessages[0].Content).To(Equal(ingredients))
})
It("should convert message with tool_calls", func() {
messages := Messages{
{

View File

@@ -4,59 +4,14 @@ import (
"context"
"fmt"
"hash/fnv"
"strings"
"sync"
"gorm.io/gorm"
)
// localLocks holds one buffered channel (capacity 1) per lock key, used as an
// in-process mutex for non-PostgreSQL dialects (SQLite). A SQLite auth DB is
// effectively single-process, so serializing guarded sections within this
// process is sufficient - we cannot and need not coordinate across processes
// the way a PostgreSQL advisory lock does.
var (
localLocksMu sync.Mutex
localLocks = map[int64]chan struct{}{}
)
// localLockChan returns the per-key buffered channel, creating it on first use.
func localLockChan(key int64) chan struct{} {
localLocksMu.Lock()
defer localLocksMu.Unlock()
ch, ok := localLocks[key]
if !ok {
ch = make(chan struct{}, 1)
localLocks[key] = ch
}
return ch
}
// isPostgres reports whether the gorm dialect is PostgreSQL. Anything else
// (SQLite and any non-postgres dialect) uses the in-process fallback, because
// the pg_* advisory lock functions only exist on PostgreSQL.
func isPostgres(db *gorm.DB) bool {
return strings.Contains(db.Dialector.Name(), "postgres")
}
// TryWithLockCtx attempts to acquire a lock and run fn without blocking.
// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if
// the lock was already held, or (false, error) on failure.
//
// On PostgreSQL it uses pg_try_advisory_lock (cross-process). On other dialects
// (SQLite) it uses a non-blocking in-process lock keyed by key.
// TryWithLockCtx attempts to acquire a PostgreSQL advisory lock using the provided context.
// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if the lock
// was already held, or (false, error) on failure.
func TryWithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) (bool, error) {
if !isPostgres(db) {
ch := localLockChan(key)
select {
case ch <- struct{}{}:
defer func() { <-ch }()
return true, fn()
default:
return false, nil
}
}
sqlDB, err := db.DB()
if err != nil {
return false, fmt.Errorf("get sql.DB: %w", err)
@@ -95,31 +50,9 @@ func KeyFromString(s string) int64 {
return int64(h.Sum64()>>1) | 0x100000000
}
// WithLockCtx acquires a lock for key, runs fn, then releases it, respecting
// context cancellation. If ctx is cancelled while waiting for the lock, the
// function returns ctx.Err().
//
// On PostgreSQL it uses pg_advisory_lock (cross-process). On other dialects
// (SQLite) it falls back to a blocking in-process lock keyed by key, which is
// sufficient because a SQLite auth DB is effectively single-process.
// WithLockCtx is like WithLock but respects context cancellation.
// If ctx is cancelled while waiting for the lock, the function returns ctx.Err().
func WithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) error {
if !isPostgres(db) {
// Honor an already-cancelled context before attempting acquisition:
// select picks a ready case at random, so without this an already-free
// lock could be taken despite a cancelled ctx.
if err := ctx.Err(); err != nil {
return err
}
ch := localLockChan(key)
select {
case ch <- struct{}{}:
defer func() { <-ch }()
return fn()
case <-ctx.Done():
return ctx.Err()
}
}
sqlDB, err := db.DB()
if err != nil {
return fmt.Errorf("advisorylock: getting sql.DB: %w", err)

View File

@@ -1,129 +0,0 @@
package advisorylock
import (
"context"
"sync"
"sync/atomic"
"time"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"gorm.io/driver/sqlite"
"gorm.io/gorm"
)
// These specs run against an in-memory SQLite DB and therefore do NOT require
// Docker, unlike the PostgreSQL testcontainer specs.
var _ = Describe("AdvisoryLock (SQLite fallback)", Label("sqlite"), func() {
var db *gorm.DB
BeforeEach(func() {
var err error
db, err = gorm.Open(sqlite.Open("file::memory:?cache=shared"), &gorm.Config{})
Expect(err).ToNot(HaveOccurred())
Expect(db.Dialector.Name()).To(ContainSubstring("sqlite"))
})
It("WithLockCtx executes fn and returns no error on SQLite", func() {
const lockKey int64 = 12001
executed := false
err := WithLockCtx(context.Background(), db, lockKey, func() error {
executed = true
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(executed).To(BeTrue(), "function should have run under the in-process lock")
})
It("WithLockCtx serializes concurrent goroutines on the same key", func() {
const lockKey int64 = 12002
var (
mu sync.Mutex
maxRunning int32
running int32
concurrency int32
)
var wg sync.WaitGroup
for range 2 {
wg.Go(func() {
defer GinkgoRecover()
err := WithLockCtx(context.Background(), db, lockKey, func() error {
cur := atomic.AddInt32(&running, 1)
mu.Lock()
if cur > maxRunning {
maxRunning = cur
}
if cur > 1 {
atomic.AddInt32(&concurrency, 1)
}
mu.Unlock()
time.Sleep(50 * time.Millisecond)
atomic.AddInt32(&running, -1)
return nil
})
Expect(err).ToNot(HaveOccurred())
})
}
wg.Wait()
Expect(maxRunning).To(BeNumerically("<=", 1), "expected max 1 goroutine inside lock at a time")
Expect(concurrency).To(BeZero(), "detected concurrent execution inside advisory lock")
})
It("WithLockCtx returns an error and does not run fn with an already-cancelled context", func() {
const lockKey int64 = 12003
ctx, cancel := context.WithCancel(context.Background())
cancel()
err := WithLockCtx(ctx, db, lockKey, func() error {
Fail("function should not run with a cancelled context")
return nil
})
Expect(err).To(HaveOccurred())
})
It("TryWithLockCtx returns (true, nil) when free and (false, nil) when held", func() {
const lockKey int64 = 12004
acquired, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(acquired).To(BeTrue(), "expected TryWithLockCtx to acquire the free lock")
// Hold the lock in one goroutine while a concurrent TryWithLockCtx
// attempts to acquire the same key.
held := make(chan struct{})
release := make(chan struct{})
var wg sync.WaitGroup
wg.Go(func() {
defer GinkgoRecover()
ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
close(held)
<-release
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(ok).To(BeTrue())
})
<-held
ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
Fail("function should not run while lock is held")
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(ok).To(BeFalse(), "expected TryWithLockCtx to fail to acquire a held lock")
close(release)
wg.Wait()
})
})

View File

@@ -404,36 +404,6 @@ var _ = Describe("GalleryService cache invalidation broadcasts", func() {
Element: "x", Op: "install",
})).To(Succeed())
})
It("BroadcastModelsChanged delivers the element and op to a peer's OnModelsChanged", func() {
var (
mu sync.Mutex
seen []messaging.CacheInvalidateEvent
)
svcB.OnModelsChanged = func(evt messaging.CacheInvalidateEvent) {
mu.Lock()
seen = append(seen, evt)
mu.Unlock()
}
Expect(svcA.SubscribeBroadcasts()).To(Succeed())
Expect(svcB.SubscribeBroadcasts()).To(Succeed())
// An admin edit on replica A must reach replica B over the same subject
// the gallery path uses, so B refreshes its in-memory config loader.
svcA.BroadcastModelsChanged("my-alias", "install")
mu.Lock()
defer mu.Unlock()
Expect(seen).To(ContainElement(messaging.CacheInvalidateEvent{
Element: "my-alias", Op: "install",
}))
})
It("BroadcastModelsChanged is a no-op when NATS is not wired (standalone)", func() {
standalone := galleryop.NewGalleryService(&config.ApplicationConfig{}, nil)
// No SetNATSClient: must not panic and must simply do nothing.
Expect(func() { standalone.BroadcastModelsChanged("x", "delete") }).ToNot(Panic())
})
})
var _ = Describe("GalleryService PostgreSQL hydration", func() {

View File

@@ -201,24 +201,6 @@ func (g *GalleryService) publishCacheInvalidate(subject string, evt messaging.Ca
}
}
// BroadcastModelsChanged notifies peer replicas that a model config was
// created, edited, or removed out-of-band of the gallery install/delete
// channel (e.g. the admin /models/edit, /models/import and
// /models/toggle-state endpoints, which write the YAML and reload only the
// local in-memory loader). Peers receive it via OnModelsChanged and refresh
// their own ModelConfigLoader so a request load-balanced to any replica sees
// the same config. No-op in standalone mode (no NATS client).
//
// op is "install" for a create/edit (the element must be (re)loaded from
// disk) or "delete" for a removal (the element must be pruned from memory,
// which a reload-from-path cannot do because the loader is additive).
func (g *GalleryService) BroadcastModelsChanged(element, op string) {
g.publishCacheInvalidate(messaging.SubjectCacheInvalidateModels, messaging.CacheInvalidateEvent{
Element: element,
Op: op,
})
}
// mergeStatus is the broadcast-side merge: it updates the in-memory map from
// a peer's GalleryProgressEvent without re-publishing to NATS or re-writing
// to PostgreSQL. UpdateStatus is the local-write entry point and does both;

View File

@@ -1,24 +0,0 @@
//go:build auth
package jobs_test
import (
"github.com/mudler/LocalAI/core/http/auth"
"github.com/mudler/LocalAI/core/services/jobs"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
// Reproduces the #10506 caller chain: auth.InitDB(sqlite) -> jobs.NewJobStore,
// which previously failed with "no such function: pg_advisory_lock".
var _ = Describe("NewJobStore on a SQLite auth DB (#10506)", func() {
It("migrates without pg_advisory_lock errors", func() {
db, err := auth.InitDB(":memory:")
Expect(err).ToNot(HaveOccurred())
store, err := jobs.NewJobStore(db)
Expect(err).ToNot(HaveOccurred())
Expect(store).ToNot(BeNil())
})
})

View File

@@ -1,53 +0,0 @@
package modeladmin
import (
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/xlog"
)
// opDelete is the CacheInvalidateEvent.Op value the gallery delete path and the
// admin delete endpoint use; a delete must prune (a reload-from-path cannot).
const opDelete = "delete"
// ApplyRemoteChange refreshes this replica's in-memory model state from a peer
// replica's model-config change broadcast (messaging.CacheInvalidateEvent on
// SubjectCacheInvalidateModels). It is the subscriber-side counterpart to
// GalleryService.BroadcastModelsChanged.
//
// The op matters because LoadModelConfigsFromPath is additive: it loads every
// YAML on disk into the loader but never removes an entry whose file is gone.
// So a delete cannot be propagated by a plain reload - the deleted element must
// be explicitly pruned. Specifically:
//
// - op == "delete" with a named element: prune that element from the loader.
// - otherwise: reload all configs from disk (picks up creates and edits).
//
// In both cases, when an element is named, any running instance on this replica
// is shut down (best-effort) so the next request rebuilds it from the new
// config instead of serving the stale one - mirroring what the originating
// replica does on a local edit/delete.
//
// ml may be nil (no running instances to shut down). modelsPath and opts are
// forwarded to LoadModelConfigsFromPath.
func ApplyRemoteChange(cl *config.ModelConfigLoader, ml *model.ModelLoader, modelsPath string, evt messaging.CacheInvalidateEvent, opts ...config.ConfigLoaderOption) error {
if evt.Op == opDelete && evt.Element != "" {
cl.RemoveModelConfig(evt.Element)
} else if err := cl.LoadModelConfigsFromPath(modelsPath, opts...); err != nil {
return err
}
// Drop any running instance of the affected model so the next request
// rebuilds it from the refreshed config instead of serving the stale one.
// Best-effort: the model may not be loaded on this replica, which surfaces
// as a benign error here.
if ml != nil && evt.Element != "" {
if err := ml.ShutdownModel(evt.Element); err != nil {
xlog.Debug("ApplyRemoteChange: could not shut down model instance (likely not loaded)",
"model", evt.Element, "error", err)
}
}
return nil
}
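
The comment on `ApplyRemoteChange` rests on one property: a reload from disk is additive, so it can never drop an entry whose file is gone. Below is a minimal, self-contained sketch of that behaviour. The `loader` type and its methods are hypothetical stand-ins for `config.ModelConfigLoader`, and a plain map stands in for the shared models directory; none of this is the project's actual code.

```go
package main

import "fmt"

// loader is a hypothetical stand-in for config.ModelConfigLoader.
type loader struct {
	configs map[string]string
}

// loadFromDisk is additive, like LoadModelConfigsFromPath is described to be:
// it adds or overwrites entries but never removes one whose file is gone.
func (l *loader) loadFromDisk(disk map[string]string) {
	for name, cfg := range disk {
		l.configs[name] = cfg
	}
}

// remove explicitly prunes one entry (the role RemoveModelConfig plays above).
func (l *loader) remove(name string) { delete(l.configs, name) }

// applyRemoteChange mirrors the branch in ApplyRemoteChange: a named delete
// prunes, anything else reloads everything from disk.
func applyRemoteChange(l *loader, disk map[string]string, op, element string) {
	if op == "delete" && element != "" {
		l.remove(element)
		return
	}
	l.loadFromDisk(disk)
}

func main() {
	disk := map[string]string{"doomed": "alias: qwen"}
	l := &loader{configs: map[string]string{}}
	l.loadFromDisk(disk)

	// A peer deletes the file from the shared directory.
	delete(disk, "doomed")

	// A plain reload does not notice the deletion.
	applyRemoteChange(l, disk, "install", "")
	_, present := l.configs["doomed"]
	fmt.Println("after plain reload, still present:", present)

	// Only the explicit delete event prunes the stale entry.
	applyRemoteChange(l, disk, "delete", "doomed")
	_, present = l.configs["doomed"]
	fmt.Println("after delete event, still present:", present)
}
```

The sketch leaves out the second half of the real function, the best-effort shutdown of a running instance, because that step depends on `model.ModelLoader`.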

View File

@@ -1,80 +0,0 @@
package modeladmin
import (
"os"
"path/filepath"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"gopkg.in/yaml.v3"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/messaging"
)
var _ = Describe("ApplyRemoteChange", func() {
var (
dir string
loader *config.ModelConfigLoader
)
BeforeEach(func() {
dir = GinkgoT().TempDir()
loader = config.NewModelConfigLoader(dir)
})
writeYAML := func(name string, body map[string]any) {
body["name"] = name
data, err := yaml.Marshal(body)
Expect(err).ToNot(HaveOccurred())
Expect(os.WriteFile(filepath.Join(dir, name+".yaml"), data, 0644)).To(Succeed())
}
It("loads a peer-created config from disk on an install event", func() {
// Peer wrote the YAML to the shared models dir; this replica has not
// loaded it yet (empty in-memory loader).
writeYAML("peer-alias", map[string]any{"alias": "qwen"})
_, ok := loader.GetModelConfig("peer-alias")
Expect(ok).To(BeFalse(), "precondition: not yet in memory")
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{
Element: "peer-alias", Op: "install",
})
Expect(err).ToNot(HaveOccurred())
_, ok = loader.GetModelConfig("peer-alias")
Expect(ok).To(BeTrue(), "install event must reload the new config from disk")
})
It("prunes a peer-deleted config that a reload-from-path cannot drop", func() {
// Model is present in memory (loaded earlier) but its file is now gone
// from the shared dir. LoadModelConfigsFromPath is additive, so only an
// explicit prune can remove it - this is the cross-replica delete bug.
writeYAML("doomed", map[string]any{"alias": "qwen"})
Expect(loader.LoadModelConfigsFromPath(dir)).To(Succeed())
_, ok := loader.GetModelConfig("doomed")
Expect(ok).To(BeTrue(), "precondition: in memory")
Expect(os.Remove(filepath.Join(dir, "doomed.yaml"))).To(Succeed())
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{
Element: "doomed", Op: "delete",
})
Expect(err).ToNot(HaveOccurred())
_, ok = loader.GetModelConfig("doomed")
Expect(ok).To(BeFalse(), "delete event must prune the element from memory")
})
It("does a full reload when no element is named", func() {
writeYAML("m1", map[string]any{"alias": "qwen"})
writeYAML("m2", map[string]any{"alias": "qwen"})
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{})
Expect(err).ToNot(HaveOccurred())
_, ok1 := loader.GetModelConfig("m1")
_, ok2 := loader.GetModelConfig("m2")
Expect(ok1).To(BeTrue())
Expect(ok2).To(BeTrue())
})
})

View File

@@ -673,49 +673,6 @@ func (r *NodeRegistry) Get(ctx context.Context, nodeID string) (*BackendNode, er
return &node, nil
}
// GetWithExtras returns a single node enriched with the same computed fields as
// ListWithExtras (labels, loaded-model count, in-flight total). The plain Get
// returns a bare BackendNode whose Labels live in a separate table, so the node
// detail view needs this to show a node's existing labels and live counts.
func (r *NodeRegistry) GetWithExtras(ctx context.Context, nodeID string) (*NodeWithExtras, error) {
node, err := r.Get(ctx, nodeID)
if err != nil {
return nil, err
}
labels := make(map[string]string)
nodeLabels, err := r.GetNodeLabels(ctx, nodeID)
if err != nil {
xlog.Warn("GetWithExtras: failed to get labels", "node", nodeID, "error", err)
} else {
for _, l := range nodeLabels {
labels[l.Key] = l.Value
}
}
var modelCount int64
if err := r.db.WithContext(ctx).Model(&NodeModel{}).
Where("node_id = ? AND state = ?", nodeID, "loaded").
Count(&modelCount).Error; err != nil {
xlog.Warn("GetWithExtras: failed to get model count", "node", nodeID, "error", err)
}
var inFlight struct{ Total int }
if err := r.db.WithContext(ctx).Model(&NodeModel{}).
Select("COALESCE(SUM(in_flight), 0) as total").
Where("node_id = ? AND state IN ?", nodeID, []string{"loaded", "unloading"}).
Scan(&inFlight).Error; err != nil {
xlog.Warn("GetWithExtras: failed to get in-flight count", "node", nodeID, "error", err)
}
return &NodeWithExtras{
BackendNode: *node,
ModelCount: int(modelCount),
InFlightCount: inFlight.Total,
Labels: labels,
}, nil
}
// GetByName returns a single node by name.
func (r *NodeRegistry) GetByName(ctx context.Context, name string) (*BackendNode, error) {
var node BackendNode

View File

@@ -646,38 +646,6 @@ var _ = Describe("NodeRegistry", func() {
})
})
Describe("GetWithExtras", func() {
It("returns the node enriched with its labels map", func() {
node := makeNode("extras-node", "10.0.0.80:50051", 8_000_000_000)
Expect(registry.Register(context.Background(), node, true)).To(Succeed())
Expect(registry.SetNodeLabel(context.Background(), node.ID, "env", "prod")).To(Succeed())
Expect(registry.SetNodeLabel(context.Background(), node.ID, "region", "us-east")).To(Succeed())
got, err := registry.GetWithExtras(context.Background(), node.ID)
Expect(err).ToNot(HaveOccurred())
Expect(got).ToNot(BeNil())
Expect(got.ID).To(Equal(node.ID))
Expect(got.Name).To(Equal("extras-node"))
Expect(got.Labels).To(Equal(map[string]string{"env": "prod", "region": "us-east"}))
})
It("returns an empty (non-nil) labels map when the node has none", func() {
node := makeNode("extras-no-labels", "10.0.0.81:50051", 8_000_000_000)
Expect(registry.Register(context.Background(), node, true)).To(Succeed())
got, err := registry.GetWithExtras(context.Background(), node.ID)
Expect(err).ToNot(HaveOccurred())
Expect(got).ToNot(BeNil())
Expect(got.Labels).ToNot(BeNil())
Expect(got.Labels).To(BeEmpty())
})
It("returns an error for an unknown node", func() {
_, err := registry.GetWithExtras(context.Background(), "does-not-exist")
Expect(err).To(HaveOccurred())
})
})
Describe("FindNodesBySelector", func() {
It("returns nodes matching all labels in selector", func() {
n1 := makeNode("sel-match", "10.0.0.80:50051", 8_000_000_000)

View File

@@ -86,18 +86,6 @@ LOCALAI_AGENT_POOL_DATABASE_URL=postgresql://localrecall:localrecall@postgres:54
The PostgreSQL image `quay.io/mudler/localrecall:v0.5.2-postgresql` is pre-configured with pgvector and ready to use.
#### Connection safety timeouts (PostgreSQL only)
The embedded vector store sets per-connection timeouts so a single stuck or corrupt index can never hold a lock indefinitely and stall every other collection operation. Safe defaults are applied automatically — you only need to set these to override them:
| Variable | Default | Description |
|----------|---------|-------------|
| `POSTGRES_LOCK_TIMEOUT` | `30s` | Bounds how long a statement waits to acquire a lock, so queued statements fail fast instead of piling up. Set `0`/`off` to disable. |
| `POSTGRES_IDLE_IN_TRANSACTION_TIMEOUT` | `300s` | Reaps abandoned transactions that would otherwise pin locks. Set `0`/`off` to disable. |
| `POSTGRES_STATEMENT_TIMEOUT` | _(unset)_ | Bounds total statement runtime, auto-aborting a wedged query. Off by default since a large vector index build can exceed any fixed limit; index builds are exempted, so it is safe to enable. |
These are read directly from the LocalAI process environment by the embedded store (the same as `DATABASE_URL` and `HYBRID_SEARCH_*`).
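
To make the table's semantics concrete, here is a rough sketch of how a default-with-override lookup for these variables could behave. The helper `timeoutSetting` is hypothetical and is not the embedded store's actual code; treating an empty value as unset is also an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// timeoutSetting resolves one timeout variable (hypothetical helper).
// It returns the value to apply and whether a timeout should be set at all.
func timeoutSetting(lookup func(string) (string, bool), name, def string) (string, bool) {
	v, set := lookup(name)
	v = strings.TrimSpace(v)
	if !set || v == "" {
		// Unset: fall back to the default; an empty default means "off".
		return def, def != ""
	}
	if v == "0" || strings.EqualFold(v, "off") {
		// Explicitly disabled, as the table describes.
		return "", false
	}
	return v, true
}

func main() {
	settings := []struct{ env, def string }{
		{"POSTGRES_LOCK_TIMEOUT", "30s"},
		{"POSTGRES_IDLE_IN_TRANSACTION_TIMEOUT", "300s"},
		{"POSTGRES_STATEMENT_TIMEOUT", ""}, // off unless set
	}
	for _, s := range settings {
		if v, ok := timeoutSetting(os.LookupEnv, s.env, s.def); ok {
			fmt.Printf("%s = %s\n", s.env, v)
		} else {
			fmt.Printf("%s disabled\n", s.env)
		}
	}
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, keeps the helper testable without touching the process environment.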
### Docker Compose Example
Basic setup with in-memory vector store:

View File

@@ -85,8 +85,6 @@ localai run
| `LOCALAI_REGISTRATION_MODE` | `approval` | Registration mode: `open`, `approval`, or `invite` |
| `LOCALAI_DISABLE_LOCAL_AUTH` | `false` | Disable local email/password registration and login (for OAuth/OIDC-only deployments) |
> **Note: network-backed storage.** File-based SQLite relies on POSIX file locking, which is unreliable over network filesystems (SMB/CIFS/NFS, e.g. Azure Files / Azure Container Apps shared volumes). On such storage the auth DB can fail to migrate with `database is locked`. Use PostgreSQL (`LOCALAI_AUTH_DATABASE_URL=postgres://...`) when the data directory lives on shared or network storage, or place `database.db` on a local volume.
### Disabling Local Authentication
If you want to enforce OAuth/OIDC-only login and prevent users from registering or logging in with email/password, set `LOCALAI_DISABLE_LOCAL_AUTH=true` (or pass `--disable-local-auth`):

View File

@@ -22,16 +22,13 @@ Download the latest DMG from GitHub releases:
3. Drag the LocalAI application to your Applications folder
4. Launch LocalAI from your Applications folder
## Verification
## Known Issues
The `LocalAI.dmg` (and the app inside it) and the `local-ai` server binary are
signed with an Apple Developer ID and notarized by Apple, so they launch with no
quarantine prompt or workaround. To inspect the signature yourself:
```bash
spctl --assess --type open --context context:primary-signature -v /Applications/LocalAI.app
codesign --verify --deep --strict --verbose=2 /Applications/LocalAI.app
```
> **Note**: The DMGs are not signed by Apple and may show as quarantined.
>
> **Workaround**: See [this issue](https://github.com/mudler/LocalAI/issues/6268) for details on how to bypass the quarantine.
>
> **Fix tracking**: The signing issue is being tracked in [this issue](https://github.com/mudler/LocalAI/issues/6244).
## Next Steps

View File

@@ -1,3 +1,3 @@
{
"version": "v4.5.2"
"version": "v4.5.0"
}

View File

@@ -1,160 +1,4 @@
---
- name: "qwen-agentworld-35b-a3b"
url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
urls:
- https://huggingface.co/unsloth/Qwen-AgentWorld-35B-A3B-GGUF
description: |
# Qwen-AgentWorld-35B-A3B
📑 Technical Report |
📖 Blog |
🤗 Hugging Face |
🤖 ModelScope |
💻 GitHub |
🖥️ Demo
> [!Note]
> This repository contains the model weights and configuration files for **Qwen-AgentWorld-35B-A3B**, a native language world model trained for agentic environment simulation.
>
> These artifacts are compatible with Hugging Face Transformers, vLLM, SGLang, etc.
**Qwen-AgentWorld** is the first language world model to cover seven agent interaction domains within a single model. It simulates agentic environments via long chain-of-thought reasoning, predicting the next environment state given an agent's action and interaction history. Trained through a three-stage pipeline — CPT injects environment knowledge, SFT activates next-state-prediction reasoning, RL sharpens simulation fidelity — Qwen-AgentWorld is a **native world model**: environment modeling is the training objective from the CPT stage onward, not a post-hoc add-on.
## Highlights
...
license: "apache-2.0"
tags:
- llm
- gguf
- qwen
icon: https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen-AgentWorld/logo.png
overrides:
backend: llama-cpp
function:
automatic_tool_parsing_fallback: true
grammar:
disable: true
known_usecases:
- chat
options:
- use_jinja:true
parameters:
model: llama-cpp/models/Qwen-AgentWorld-35B-A3B-GGUF/Qwen-AgentWorld-35B-A3B-UD-Q4_K_M.gguf
template:
use_tokenizer_template: true
files:
- filename: llama-cpp/models/Qwen-AgentWorld-35B-A3B-GGUF/Qwen-AgentWorld-35B-A3B-UD-Q4_K_M.gguf
sha256: e7a8eafdd8013443b6bcc4b6fb47b2d2025f772d359650b9ceb7d75971e22cad
uri: https://huggingface.co/unsloth/Qwen-AgentWorld-35B-A3B-GGUF/resolve/main/Qwen-AgentWorld-35B-A3B-UD-Q4_K_M.gguf
- name: "ornith-1.0-9b"
url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
urls:
- https://huggingface.co/deepreinforce-ai/Ornith-1.0-9B-GGUF
description: |
[](https://deep-reinforce.com/ornith.html)
# Ornith-1.0-9B-GGUF
Aloha! 🌺 Today, we are releasing Ornith-1.0, a self-improving family of open-source models for agentic coding.
Highlights:
- **State-of-the-Art Coding Agents**: Available in 9B-Dense, 31B-Dense, 35B-MoE, and 397B-MoE (post-trained on top of Gemma 4 and Qwen 3.5), achieving state-of-the-art performance among open-source models of comparable size on coding benchmarks such as Terminal-Bench 2.1, SWE-Bench, NL2Repo and OpenClaw.
- **Self-Improving Training Framework**: Ornith-1.0 employs RL to learn to generate not only solution rollouts, but also the scaffold that drives those rollouts. By jointly optimizing the scaffold and the resulting solution, the model discovers better search trajectories and generates higher-quality solutions.
- **Licence**: MIT licensed, globally accessible, and free from regional limitations.
## Ornith 1.0 9B
This model card documents **Ornith-1.0-9B**, the most lightweight member of the Ornith family, designed for efficient single-GPU deployment.
### Benchmarks
Ornith-1.0-9B
Qwen3.5-9B
Qwen3.5-35B
Gemma4-12B
Gemma4-31B
Agentic Coding
...
license: "mit"
tags:
- llm
- gguf
overrides:
backend: llama-cpp
function:
automatic_tool_parsing_fallback: true
grammar:
disable: true
known_usecases:
- chat
options:
- use_jinja:true
parameters:
model: llama-cpp/models/Ornith-1.0-9B-GGUF/ornith-1.0-9b-Q4_K_M.gguf
template:
use_tokenizer_template: true
files:
- filename: llama-cpp/models/Ornith-1.0-9B-GGUF/ornith-1.0-9b-Q4_K_M.gguf
sha256: 5720d1f671b4996481274fffe01868c3c36e87c135cc8538471cc7bd6087b106
uri: https://huggingface.co/deepreinforce-ai/Ornith-1.0-9B-GGUF/resolve/main/ornith-1.0-9b-Q4_K_M.gguf
- name: "ornith-1.0-35b"
url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
urls:
- https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B-GGUF
description: |
[](https://deep-reinforce.com/ornith.html)
# Ornith-1.0-35B-GGUF
Aloha! 🌺 Today, we are releasing Ornith-1.0, a self-improving family of open-source models for agentic coding.
Highlights:
- **State-of-the-Art Coding Agents**: Available in 9B-Dense, 31B-Dense, 35B-MoE, and 397B-MoE (post-trained on top of Gemma 4 and Qwen 3.5), achieving state-of-the-art performance among open-source models of comparable size on coding benchmarks such as Terminal-Bench 2.1, SWE-Bench, NL2Repo and OpenClaw.
- **Self-Improving Training Framework**: Ornith-1.0 employs RL to learn to generate not only solution rollouts, but also the scaffold that drives those rollouts. By jointly optimizing the scaffold and the resulting solution, the model discovers better search trajectories and generates higher-quality solutions.
- **Licence**: MIT licensed, globally accessible, and free from regional limitations.
## Ornith 1.0 35B
This model card documents **Ornith-1.0-35B**, the lightweight member of the Ornith family, designed for efficient single-GPU deployment.
### Benchmarks
Ornith-1.0-35B
Qwen3.5-35B
Qwen3.6-35B
Gemma4-31B
Qwen3.5-397B
Agentic Coding
...
license: "mit"
tags:
- llm
- gguf
overrides:
backend: llama-cpp
function:
automatic_tool_parsing_fallback: true
grammar:
disable: true
known_usecases:
- chat
options:
- use_jinja:true
parameters:
model: llama-cpp/models/Ornith-1.0-35B-GGUF/ornith-1.0-35b-Q4_K_M.gguf
template:
use_tokenizer_template: true
files:
- filename: llama-cpp/models/Ornith-1.0-35B-GGUF/ornith-1.0-35b-Q4_K_M.gguf
sha256: ff25291b2599fb927a835e624d2b3540106af61761c3fa57ac4264046dbec002
uri: https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B-GGUF/resolve/main/ornith-1.0-35b-Q4_K_M.gguf
- name: "gemmable-4-12b-mtp"
url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
urls:

go.mod
View File

@@ -221,7 +221,7 @@ require (
github.com/labstack/gommon v0.4.2 // indirect
github.com/mschoch/smat v0.2.0 // indirect
github.com/mudler/LocalAGI v0.0.0-20260606071251-14aed1ae4336
github.com/mudler/localrecall v0.6.3 // indirect
github.com/mudler/localrecall v0.6.3-0.20260618142827-d0073dd5dc32 // indirect
github.com/mudler/skillserver v0.0.7-0.20260520220837-a7317cbf9145
github.com/olekukonko/tablewriter v0.0.5 // indirect
github.com/oxffaa/gopher-parse-sitemap v0.0.0-20191021113419-005d2eb1def4 // indirect

go.sum
View File

@@ -976,8 +976,10 @@ github.com/mudler/go-piper v0.0.0-20241023091659-2494246fd9fc h1:RxwneJl1VgvikiX
github.com/mudler/go-piper v0.0.0-20241023091659-2494246fd9fc/go.mod h1:O7SwdSWMilAWhBZMK9N9Y/oBDyMMzshE3ju8Xkexwig=
github.com/mudler/go-processmanager v0.1.1 h1:c/1NRZOZpW8HuFv9RhBG57nQu1oDMRomEHedwBFMlrw=
github.com/mudler/go-processmanager v0.1.1/go.mod h1:h6kmHUZeafr+k5hRYpGLMzJFH4hItHffgpRo2QIkP+o=
github.com/mudler/localrecall v0.6.3 h1:uXOrP9JmetzxgVKzSrawviyBHZfAcvPBBIrvVUdZjDA=
github.com/mudler/localrecall v0.6.3/go.mod h1:28k5n19raUrkuwXkacdNsBlj8yuSnGhpT16tu+2+4dU=
github.com/mudler/localrecall v0.6.3-0.20260606070048-9a3b3321a9cd h1:trn9D5UHAE6zdRyD2uX04W1tLSslAwozVwcyNTd72Ak=
github.com/mudler/localrecall v0.6.3-0.20260606070048-9a3b3321a9cd/go.mod h1:28k5n19raUrkuwXkacdNsBlj8yuSnGhpT16tu+2+4dU=
github.com/mudler/localrecall v0.6.3-0.20260618142827-d0073dd5dc32 h1:RP4BVGTHHpJIrGAwqRD3Wq1wmURmc1SxhwacnIWgI+g=
github.com/mudler/localrecall v0.6.3-0.20260618142827-d0073dd5dc32/go.mod h1:28k5n19raUrkuwXkacdNsBlj8yuSnGhpT16tu+2+4dU=
github.com/mudler/memory v0.0.0-20260406210934-424c1ecf2cf8 h1:Ry8RiWy8fZ6Ff4E7dPmjRsBrnHOnPeOOj2LhCgyjQu0=
github.com/mudler/memory v0.0.0-20260406210934-424c1ecf2cf8/go.mod h1:EA8Ashhd56o32qN7ouPKFSRUs/Z+LrRCF4v6R2Oarm8=
github.com/mudler/skillserver v0.0.7-0.20260520220837-a7317cbf9145 h1:z59tA3IDYPt71nzH1jpxeaA1LuDw8aZfpTQFNU43Zb8=

View File

@@ -26,10 +26,6 @@ type SystemState struct {
BackendImagesReleaseTag string
BackendImagesBranchTag string
BackendDevSuffix string
// PreferDevelopmentBackends installs the development image as the primary
// backend URI (the released image becomes a fallback) rather than only using
// development as a download fallback when the released image is missing.
PreferDevelopmentBackends bool
}
type SystemStateOptions func(*SystemState)
@@ -70,12 +66,6 @@ func WithBackendDevSuffix(suffix string) SystemStateOptions {
}
}
func WithPreferDevelopmentBackends(prefer bool) SystemStateOptions {
return func(s *SystemState) {
s.PreferDevelopmentBackends = prefer
}
}
func GetSystemState(opts ...SystemStateOptions) (*SystemState, error) {
state := &SystemState{}
for _, opt := range opts {

View File

@@ -17,31 +17,13 @@ rm -rf "${BACKEND_DIR}"/build-*
# run.sh's final `exec $CURDIR/<binary>` is the contract for what gets launched;
# the binary is not always named after the backend (e.g. parakeet-cpp launches
# parakeet-cpp-grpc), so derive it from run.sh and fall back to ${BACKEND}.
#
# Only scan the `exec` line(s): many run.sh select a runtime CPU variant via
# unquoted `LIBRARY=$CURDIR/libgo<x>-avx512.so` lines, and a whole-file grep
# would pick the last of those (avx512, which Darwin never builds) instead of
# the binary — failing the check below for whisper/sam3-cpp/vibevoice-cpp/...
# Also tolerate the exec being quoted (`exec "$CURDIR"/<binary>`).
RUN_BINARY=""
if [ -f "${BACKEND_DIR}/run.sh" ]; then
RUN_BINARY=$(grep -E '^[[:space:]]*exec[[:space:]]' "${BACKEND_DIR}/run.sh" | grep -oE '"?\$CURDIR"?/[A-Za-z0-9._-]+' | grep -v 'ld\.so' | tail -1 | sed -E 's|"?\$CURDIR"?/||')
RUN_BINARY=$(grep -oE '\$CURDIR/[A-Za-z0-9._-]+' "${BACKEND_DIR}/run.sh" | grep -v 'ld\.so' | tail -1 | sed 's|\$CURDIR/||')
fi
RUN_BINARY="${RUN_BINARY:-${BACKEND}}"
# Ship the self-contained package/ dir (run.sh + binary + lib/), matching the
# Linux Dockerfile.golang (`COPY .../package/. ./`). Packaging the whole backend
# dir instead left the runtime libraries under package/lib while run.sh looks in
# $CURDIR/lib, so backends such as sherpa-onnx could not dlopen their libs at
# runtime (they started fine only when run from inside package/). Backends that
# don't assemble a package/ fall back to the backend dir.
OCI_ROOT="${BACKEND_DIR}"
if [ -d "${BACKEND_DIR}/package" ]; then
OCI_ROOT="${BACKEND_DIR}/package"
fi
if [ ! -x "${OCI_ROOT}/${RUN_BINARY}" ]; then
echo "ERROR: ${OCI_ROOT}/${RUN_BINARY} not found after build; refusing to package a broken backend image (see issue #10267)." >&2
if [ ! -x "${BACKEND_DIR}/${RUN_BINARY}" ]; then
echo "ERROR: ${BACKEND_DIR}/${RUN_BINARY} not found after build; refusing to package a broken backend image (see issue #10267)." >&2
exit 1
fi
@@ -49,7 +31,7 @@ PLATFORMARCH="${PLATFORMARCH:-darwin/arm64}"
IMAGE_NAME="${IMAGE_NAME:-localai/${BACKEND}-darwin}"
./local-ai util create-oci-image \
"${OCI_ROOT}/." \
backend/go/${BACKEND}/. \
--output ./backend-images/${BACKEND}.tar \
--image-name $IMAGE_NAME \
--platform $PLATFORMARCH
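
The exec-only scan described in the comments above (look only at `exec` lines, tolerate a quoted `$CURDIR`, skip the loader, take the last match, fall back to the backend name) can be restated as a small Go sketch. This is an illustrative re-implementation of the grep/sed pipeline, not code from the repository; the `runBinary` helper and the sample `run.sh` content are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Only lines whose first word is `exec` are considered.
	execLine = regexp.MustCompile(`^\s*exec\s`)
	// Matches $CURDIR/<name>, with or without quotes around $CURDIR.
	curdirRef = regexp.MustCompile(`"?\$CURDIR"?/([A-Za-z0-9._-]+)`)
)

// runBinary returns the last $CURDIR/<name> on an exec line, skipping any
// match that mentions ld.so, and falls back to the backend name otherwise.
func runBinary(script, fallback string) string {
	name := ""
	for _, line := range strings.Split(script, "\n") {
		if !execLine.MatchString(line) {
			continue // e.g. LIBRARY=$CURDIR/lib...-avx512.so is ignored
		}
		for _, m := range curdirRef.FindAllStringSubmatch(line, -1) {
			if strings.Contains(m[0], "ld.so") {
				continue
			}
			name = m[1]
		}
	}
	if name == "" {
		return fallback
	}
	return name
}

func main() {
	// Hypothetical run.sh: a CPU-variant library line followed by the exec.
	script := `LIBRARY=$CURDIR/libgowhisper-avx512.so
exec "$CURDIR"/parakeet-cpp-grpc "$@"`
	fmt.Println(runBinary(script, "parakeet-cpp"))
}
```

A whole-file scan of the same sample would end on whichever `$CURDIR/...` reference comes last in the file, which is the failure mode the comments warn about when a library line follows the `exec`.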

View File

@@ -141,38 +141,6 @@ copy_elf_deps() {
done < <(ldd "$elf" 2>/dev/null | awk '/=>/ && $3 ~ /^\// {print $3}')
}
# Sweep the transitive shared-library dependencies of everything already
# bundled in a lib dir. The per-vendor packagers below copy an explicit
# allowlist of top-level runtime libs, but those libs pull in transitive deps
# that aren't in the list (e.g. ROCm's librocprofiler-register.so.0, libnuma,
# libdrm_amdgpu). Because backends run through the bundled lib/ld.so with
# LD_LIBRARY_PATH=lib (see run.sh), an unbundled transitive dep is a hard load
# failure (issue #10537: "librocprofiler-register.so.0: cannot open shared
# object file"). ldd resolves the full recursive closure, so a single pass over
# the already-bundled libs is enough; core libc-family deps are skipped via
# copy_elf_deps/is_core_lib so we never shadow the loader's own libc/libstdc++.
sweep_transitive_deps() {
local dir="${1:-$TARGET_LIB_DIR}"
command -v ldd >/dev/null 2>&1 || return 0
# Snapshot the current set first: copy_elf_deps adds files as it runs, and
# ldd already returns the full recursive closure, so we only need to sweep
# the libs that were present before the sweep started.
# `local x=$(...)` keeps set -e from tripping on shopt -p's nonzero exit.
local old_nullglob=$(shopt -p nullglob)
shopt -s nullglob
local libs=("$dir"/*.so*)
eval "$old_nullglob"
local lib
for lib in "${libs[@]}"; do
[ -e "$lib" ] || continue
# Skip symlinks: their real target is in the snapshot and gets swept.
[ -L "$lib" ] && continue
copy_elf_deps "$lib"
done
}
# Package NVIDIA CUDA libraries
package_cuda_libs() {
echo "Packaging CUDA libraries for BUILD_TYPE=${BUILD_TYPE}..."
@@ -217,10 +185,6 @@ package_cuda_libs() {
# cp -arfL /usr/local/cuda/targets "$TARGET_LIB_DIR/../cuda/" 2>/dev/null || true
# fi
# Pull in transitive deps the allowlist misses so the backend is
# self-contained (same class of failure as #10537).
sweep_transitive_deps "$TARGET_LIB_DIR"
echo "CUDA libraries packaged successfully"
}
@@ -297,10 +261,6 @@ package_rocm_libs() {
fi
done
# Pull in transitive deps the allowlist misses (librocprofiler-register.so.0,
# libnuma, libdrm_amdgpu, ...) so the backend is self-contained. See #10537.
sweep_transitive_deps "$TARGET_LIB_DIR"
echo "ROCm libraries packaged successfully"
}
@@ -343,10 +303,6 @@ package_intel_libs() {
fi
done
# Pull in transitive deps the allowlist misses so the backend is
# self-contained (same class of failure as #10537).
sweep_transitive_deps "$TARGET_LIB_DIR"
echo "Intel oneAPI libraries packaged successfully"
}
@@ -476,7 +432,6 @@ export -f copy_lib
export -f copy_libs_glob
export -f is_core_lib
export -f copy_elf_deps
export -f sweep_transitive_deps
export -f package_cuda_libs
export -f package_rocm_libs
export -f package_intel_libs

Some files were not shown because too many files have changed in this diff.