mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-25 17:12:10 -04:00
Compare commits
4 Commits
dependabot
...
docs/backe
| Author | SHA1 | Date | |
|---|---|---|---|
| | 00288b21cc | | |
| | 286c508ce0 | | |
| | d1a9d59917 | | |
| | f72046b5b5 | | |
@@ -102,6 +102,24 @@ Multi-arch backends are NOT a single matrix entry with `platforms: 'linux/amd64,

Entries whose `dockerfile` is `./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` must also set a `builder-base-image` field pointing at a prebuilt base from `quay.io/go-skynet/ci-cache:base-grpc-*` (CI builds these via `.github/workflows/base-images.yml`). The mapping is keyed by `(build-type, platforms)`; see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~25–35 min cold). Local `make backends/<name>` ignores `builder-base-image` and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.

### Cover every OS the project supports (Linux **and** Darwin)

`.github/backend-matrix.yml` has two matrices, and they are the source of truth for which OS a backend ships on:

- `include:` is the **Linux** matrix (x86_64 + arm64; CPU and CUDA / ROCm / SYCL / Vulkan).
- `includeDarwin:` is the **macOS / Apple Silicon** matrix (arm64; Metal where the engine supports it, otherwise a native arm64 CPU build).

**A new backend must target every OS it can build for; do not ship Linux-only by default.** A backend that appears only under `include:` is silently unavailable on macOS even when its code would run there. Most C/C++/GGML engines build on Darwin out of the box (ggml defaults `GGML_METAL=ON` on Apple, so a plain build is Metal-enabled), and many Python backends do too (CPU / MPS wheels). If a backend genuinely cannot support an OS (e.g. CUDA-only, no CPU variant), state that in the PR description instead of omitting it silently.

Wiring a backend into `includeDarwin:` is more than the matrix entry:

1. **`includeDarwin:` entry**: set `tag-suffix: "-metal-darwin-arm64-<backend>"`, `build-type: "metal"`, and `lang: "go"` for go+ggml backends; omit `build-type` for the bespoke C++ ones (llama-cpp / ds4 / privacy-filter). Match an existing entry of the same shape.
2. **`backend/index.yaml`**: add `metal:` to the backend's `capabilities` map (main and `-development`) and concrete `metal-<backend>` / `metal-<backend>-development` image entries pointing at the `-metal-darwin-arm64-<backend>` images.
3. **C/C++ backends only**: add an `inferBackendPathDarwin` case in `scripts/changed-backends.js` returning `backend/cpp/<backend>/` (the generic fallthrough assumes `backend/<lang>/`, which is wrong for a C++ source tree driven with `lang: go`), and give `run.sh` a Darwin branch that exports `DYLD_LIBRARY_PATH` instead of `LD_LIBRARY_PATH`. If the build is bespoke (single `grpc-server` + dylib bundling), model it on `scripts/build/ds4-darwin.sh` and add a `backends/<backend>-darwin` make target plus a gated step in `.github/workflows/backend_build_darwin.yml`.
4. **C++ proto gotcha**: if the backend compiles the generated gRPC/protobuf in a separate CMake target (e.g. `hw_grpc_proto`), that target must link `protobuf::libprotobuf` + `gRPC::grpc++` so the Homebrew include dirs propagate; otherwise macOS fails with `google/protobuf/runtime_version.h not found` (Linux hides this because apt headers sit in `/usr/include`).

The CI path filter only builds a backend on a PR when a file under its directory changes, so a Darwin-only YAML edit builds nothing. Touch a file under `backend/<lang>/<backend>/` (a one-line comment is enough) in the same PR.

## 3. Add Backend Metadata to `backend/index.yaml`

**Step 3a: Add Meta Definition**
@@ -225,6 +243,7 @@ After adding a new backend, verify:

- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/backend-matrix.yml` for all desired platforms (per-arch entries with `platform-tag` for multi-arch; `builder-base-image` for llama-cpp / ik-llama-cpp / turboquant)
- [ ] **OS coverage considered**: added to `includeDarwin:` (macOS/Apple Silicon) if the backend can build there (with the `backend/index.yaml` `metal:` capability + `metal-<backend>` image entries, plus a `run.sh` Darwin/DYLD branch and `inferBackendPathDarwin` case for C++ backends), or the PR explains why an OS is unsupported. Do not ship Linux-only by default.
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between workflow file and index.yaml
47
.github/backend-matrix.yml
vendored
@@ -2,6 +2,28 @@
# Matrix data for backend container image builds.
# Consumed by scripts/changed-backends.js for both backend.yml and backend_pr.yml.
# This file is NOT a workflow; it has no top-level 'on:' or 'jobs:'.
#
# OS / platform coverage: READ THIS WHEN ADDING A BACKEND
# --------------------------------------------------------
# This file is the source of truth for which OS each backend is built and
# published for. A backend ships ONLY for the matrices it appears in:
#   - Linux  -> the `include:` matrix below (x86_64 + arm64; CPU and
#               CUDA / ROCm / SYCL / Vulkan variants).
#   - macOS  -> the `includeDarwin:` matrix (Apple Silicon / arm64; Metal where
#               the engine supports it, otherwise a native arm64 CPU build).
#
# New backends must target EVERY OS they can build for, not just Linux. A backend
# listed only under `include:` is silently unavailable on macOS even when its code
# would run there. Most C/C++/GGML engines build on Darwin (ggml defaults
# GGML_METAL=ON on Apple, so a plain build is Metal-enabled), and many Python
# backends do too (CPU / MPS). If a backend genuinely cannot support an OS, say so
# in its PR description rather than silently omitting it.
#
# Adding a backend to `includeDarwin:` is more than one line; see the darwin
# checklist in .agents/adding-backends.md (includeDarwin entry, the index.yaml
# `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD
# branch for C/C++ backends, and the inferBackendPathDarwin case in
# scripts/changed-backends.js so the path filter actually builds it).

# Linux matrix (consumed by backend-jobs).
include:
@@ -4922,6 +4944,31 @@ includeDarwin:
    tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
    build-type: "metal"
    lang: "go"
  # Vision/utility C++/ggml backends (go+cgo). Their Makefiles already carry a
  # Darwin/Metal path (GGML_METAL=ON when build-type=metal); this just builds and
  # publishes the metal image so Apple Silicon can install them.
  - backend: "depth-anything-cpp"
    tag-suffix: "-metal-darwin-arm64-depth-anything-cpp"
    build-type: "metal"
    lang: "go"
  - backend: "locate-anything-cpp"
    tag-suffix: "-metal-darwin-arm64-locate-anything-cpp"
    build-type: "metal"
    lang: "go"
  - backend: "rfdetr-cpp"
    tag-suffix: "-metal-darwin-arm64-rfdetr-cpp"
    build-type: "metal"
    lang: "go"
  - backend: "sam3-cpp"
    tag-suffix: "-metal-darwin-arm64-sam3-cpp"
    build-type: "metal"
    lang: "go"
  # LocalVQE has no Metal path; on Apple Silicon it builds CPU-only (GGML_METAL
  # OFF) but is still a native arm64 image. Uses the darwin/metal build profile.
  - backend: "localvqe"
    tag-suffix: "-metal-darwin-arm64-localvqe"
    build-type: "metal"
    lang: "go"
  - backend: "voxtral"
    tag-suffix: "-metal-darwin-arm64-voxtral"
    build-type: "metal"
@@ -43,4 +43,5 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
- **New API endpoints**: LocalAI advertises its capability surface in several independent places: swagger `@Tags`, `/api/instructions` registry, auth `RouteFeatureRegistry`, React UI `capabilities.js`, and docs. Read [.agents/api-endpoints-and-auth.md](.agents/api-endpoints-and-auth.md) and follow its checklist; missing any surface means clients, admins, and the UI won't know the endpoint exists.
- **Admin endpoints → MCP tool**: every admin endpoint that an admin would manage conversationally (install/list/edit/toggle/upgrade) MUST also be exposed as an MCP tool in `pkg/mcp/localaitools/`. The LocalAI Assistant chat modality and the standalone `local-ai mcp-server` consume that package; drift between REST and MCP is a real risk. Read [.agents/localai-assistant-mcp.md](.agents/localai-assistant-mcp.md): the `TestToolHTTPRouteMappingComplete` test fails until you wire the new tool and update the route map.
- **Build**: Inspect `Makefile` and `.github/workflows/`, and ask the user before running long builds
- **Backend OS coverage**: a new backend must target every OS it can build for, not just Linux. `.github/backend-matrix.yml` has two matrices: `include:` (Linux) and `includeDarwin:` (macOS / Apple Silicon). Most C/C++/GGML and many Python backends build on Darwin too, so wire the `includeDarwin` entry + `backend/index.yaml` `metal:` entries, or say in the PR why an OS is unsupported. See the darwin checklist in [.agents/adding-backends.md](.agents/adding-backends.md).
- **UI**: The active UI is the React app in `core/http/react-ui/`. The older Alpine.js/HTML UI in `core/http/static/` is pending deprecation, so all new UI work goes in the React UI
@@ -40,6 +40,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
	CMAKE_ARGS+=-DGGML_VULKAN=ON -DDA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
	ifneq ($(BUILD_TYPE),metal)
		CMAKE_ARGS+=-DGGML_METAL=OFF
	else
@@ -32,6 +32,8 @@ endif
ifeq ($(BUILD_TYPE),vulkan)
	CMAKE_ARGS+=-DGGML_VULKAN=ON -DLOCALVQE_VULKAN=ON
else ifeq ($(OS),Darwin)
	# Apple Silicon: CPU-only (no Metal upstream); built + published as an arm64
	# image by CI (includeDarwin in .github/backend-matrix.yml) for macOS install.
	CMAKE_ARGS+=-DGGML_METAL=OFF
endif

@@ -33,6 +33,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
	CMAKE_ARGS+=-DGGML_VULKAN=ON -DLA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
	ifneq ($(BUILD_TYPE),metal)
		CMAKE_ARGS+=-DGGML_METAL=OFF
	else
@@ -34,6 +34,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
	CMAKE_ARGS+=-DGGML_VULKAN=ON -DRFDETR_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
	ifneq ($(BUILD_TYPE),metal)
		CMAKE_ARGS+=-DGGML_METAL=OFF
	else
@@ -31,6 +31,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
	CMAKE_ARGS+=-DGGML_VULKAN=ON
else ifeq ($(OS),Darwin)
	# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
	# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
	ifneq ($(BUILD_TYPE),metal)
		CMAKE_ARGS+=-DGGML_METAL=OFF
	else
@@ -340,6 +340,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp"
    intel: "intel-sycl-f32-sam3-cpp"
    vulkan: "vulkan-sam3-cpp"
    metal: "metal-sam3-cpp"
- &rfdetrcpp
  name: "rfdetr-cpp"
  alias: "rfdetr-cpp"
@@ -368,6 +369,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp"
    intel: "intel-sycl-f32-rfdetr-cpp"
    vulkan: "vulkan-rfdetr-cpp"
    metal: "metal-rfdetr-cpp"
- &locateanything
  name: "locate-anything"
  alias: "locate-anything"
@@ -397,6 +399,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp"
    intel: "intel-sycl-f32-locate-anything-cpp"
    vulkan: "vulkan-locate-anything-cpp"
    metal: "metal-locate-anything-cpp"
- !!merge <<: *locateanything
  name: "locate-anything-development"
  capabilities:
@@ -409,6 +412,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp-development"
    intel: "intel-sycl-f32-locate-anything-cpp-development"
    vulkan: "vulkan-locate-anything-cpp-development"
    metal: "metal-locate-anything-cpp-development"
- !!merge <<: *locateanything
  name: "cpu-locate-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-locate-anything-cpp"
@@ -419,6 +423,16 @@
  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-locate-anything-cpp"
  mirrors:
    - localai/localai-backends:master-cpu-locate-anything-cpp
- !!merge <<: *locateanything
  name: "metal-locate-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-locate-anything-cpp"
  mirrors:
    - localai/localai-backends:latest-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
  name: "metal-locate-anything-cpp-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-locate-anything-cpp"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
  name: "cuda12-locate-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-locate-anything-cpp"
@@ -517,6 +531,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp"
    intel: "intel-sycl-f32-depth-anything-cpp"
    vulkan: "vulkan-depth-anything-cpp"
    metal: "metal-depth-anything-cpp"
- !!merge <<: *depthanything
  name: "depth-anything-development"
  capabilities:
@@ -529,6 +544,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp-development"
    intel: "intel-sycl-f32-depth-anything-cpp-development"
    vulkan: "vulkan-depth-anything-cpp-development"
    metal: "metal-depth-anything-cpp-development"
- !!merge <<: *depthanything
  name: "cpu-depth-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-depth-anything-cpp"
@@ -539,6 +555,16 @@
  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-depth-anything-cpp"
  mirrors:
    - localai/localai-backends:master-cpu-depth-anything-cpp
- !!merge <<: *depthanything
  name: "metal-depth-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-depth-anything-cpp"
  mirrors:
    - localai/localai-backends:latest-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
  name: "metal-depth-anything-cpp-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-depth-anything-cpp"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
  name: "cuda12-depth-anything-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-depth-anything-cpp"
@@ -1031,6 +1057,8 @@
    nvidia-l4t: "vulkan-localvqe"
    nvidia-l4t-cuda-12: "vulkan-localvqe"
    nvidia-l4t-cuda-13: "vulkan-localvqe"
    # Apple Silicon: CPU build (LocalVQE has no Metal path); still arm64-native.
    metal: "metal-localvqe"
- &privacyfilter
  name: "privacy-filter"
  alias: "privacy-filter"
@@ -3220,6 +3248,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp-development"
    intel: "intel-sycl-f32-sam3-cpp-development"
    vulkan: "vulkan-sam3-cpp-development"
    metal: "metal-sam3-cpp-development"
- !!merge <<: *sam3cpp
  name: "cpu-sam3-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-sam3-cpp"
@@ -3230,6 +3259,16 @@
  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-sam3-cpp"
  mirrors:
    - localai/localai-backends:master-cpu-sam3-cpp
- !!merge <<: *sam3cpp
  name: "metal-sam3-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-sam3-cpp"
  mirrors:
    - localai/localai-backends:latest-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
  name: "metal-sam3-cpp-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-sam3-cpp"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
  name: "cuda12-sam3-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-sam3-cpp"
@@ -3303,6 +3342,7 @@
    nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp-development"
    intel: "intel-sycl-f32-rfdetr-cpp-development"
    vulkan: "vulkan-rfdetr-cpp-development"
    metal: "metal-rfdetr-cpp-development"
- !!merge <<: *rfdetrcpp
  name: "cpu-rfdetr-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-rfdetr-cpp"
@@ -3313,6 +3353,16 @@
  uri: "quay.io/go-skynet/local-ai-backends:master-cpu-rfdetr-cpp"
  mirrors:
    - localai/localai-backends:master-cpu-rfdetr-cpp
- !!merge <<: *rfdetrcpp
  name: "metal-rfdetr-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-rfdetr-cpp"
  mirrors:
    - localai/localai-backends:latest-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
  name: "metal-rfdetr-cpp-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-rfdetr-cpp"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
  name: "cuda12-rfdetr-cpp"
  uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-rfdetr-cpp"
@@ -4101,6 +4151,16 @@
  uri: "quay.io/go-skynet/local-ai-backends:master-gpu-vulkan-localvqe"
  mirrors:
    - localai/localai-backends:master-gpu-vulkan-localvqe
- !!merge <<: *localvqecpp
  name: "metal-localvqe"
  uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-localvqe"
  mirrors:
    - localai/localai-backends:latest-metal-darwin-arm64-localvqe
- !!merge <<: *localvqecpp
  name: "metal-localvqe-development"
  uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-localvqe"
  mirrors:
    - localai/localai-backends:master-metal-darwin-arm64-localvqe
## kokoro
- !!merge <<: *kokoro
  name: "kokoro-development"
@@ -3,10 +3,51 @@
 package auth
 
 import (
+	"net/url"
+	"strings"
+
 	"gorm.io/driver/sqlite"
 	"gorm.io/gorm"
 )
 
 func openSQLiteDialector(path string) (gorm.Dialector, error) {
-	return sqlite.Open(path), nil
+	return sqlite.Open(buildSQLiteDSN(path)), nil
 }
+
+// buildSQLiteDSN augments a SQLite file path with connection pragmas that make
+// the auth DB resilient on slow or contended storage.
+//
+//   - _busy_timeout=5000 makes SQLite retry for up to 5s on SQLITE_BUSY instead
+//     of failing immediately. Network-backed storage (SMB/CIFS/NFS, e.g. Azure
+//     Files) is prone to transient lock contention during migration (see #10506).
+//   - _txlock=immediate takes the write lock at BEGIN, avoiding deadlocks when a
+//     read transaction later upgrades to a write during AutoMigrate.
+//
+// We deliberately do NOT set WAL journal mode: WAL relies on a shared-memory
+// mmap that does not work over SMB/NFS, which is exactly the failing case here.
+//
+// Caller-supplied values for either pragma are preserved.
+func buildSQLiteDSN(path string) string {
+	base := path
+	rawQuery := ""
+	if i := strings.IndexByte(path, '?'); i >= 0 {
+		base = path[:i]
+		rawQuery = path[i+1:]
+	}
+
+	values, err := url.ParseQuery(rawQuery)
+	if err != nil {
+		// An unparseable query string means a hand-crafted DSN we should not
+		// risk corrupting; leave it untouched.
+		return path
+	}
+
+	if values.Get("_busy_timeout") == "" {
+		values.Set("_busy_timeout", "5000")
+	}
+	if values.Get("_txlock") == "" {
+		values.Set("_txlock", "immediate")
+	}
+
+	return base + "?" + values.Encode()
+}
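
To make the effect of `buildSQLiteDSN` concrete, here is a minimal standalone sketch. It repeats the function body from the diff so it can run outside the `auth` package; the `main` function and the sample inputs are illustrative assumptions, not part of the change.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// buildSQLiteDSN is a standalone copy of the helper shown in the diff,
// duplicated here only so this example compiles on its own.
func buildSQLiteDSN(path string) string {
	base, rawQuery := path, ""
	if i := strings.IndexByte(path, '?'); i >= 0 {
		base, rawQuery = path[:i], path[i+1:]
	}

	values, err := url.ParseQuery(rawQuery)
	if err != nil {
		// Unparseable query string: return the DSN unchanged.
		return path
	}

	if values.Get("_busy_timeout") == "" {
		values.Set("_busy_timeout", "5000")
	}
	if values.Get("_txlock") == "" {
		values.Set("_txlock", "immediate")
	}

	// url.Values.Encode sorts keys, so the output order is deterministic.
	return base + "?" + values.Encode()
}

func main() {
	// Sample inputs chosen for illustration (hypothetical paths).
	inputs := []string{
		"/data/database.db",
		"/data/database.db?cache=shared",
		"/data/database.db?_busy_timeout=1000&_txlock=deferred",
		"/data/database.db?bad=%zz", // invalid escape: left untouched
	}
	for _, in := range inputs {
		fmt.Printf("%s\n  -> %s\n", in, buildSQLiteDSN(in))
	}
}
```

Because `Encode` sorts keys, a plain path becomes `/data/database.db?_busy_timeout=5000&_txlock=immediate`, while caller-supplied values for either pragma pass through unchanged.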
|
||||
|
||||
57
core/http/auth/db_sqlite_test.go
Normal file
@@ -0,0 +1,57 @@
//go:build auth

package auth

import (
	"net/url"
	"strings"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

// parseDSN splits a "base?query" DSN into its base and decoded query values so
// assertions don't depend on url.Values.Encode()'s key ordering.
func parseDSN(dsn string) (string, url.Values) {
	base := dsn
	rawQuery := ""
	if i := strings.IndexByte(dsn, '?'); i >= 0 {
		base = dsn[:i]
		rawQuery = dsn[i+1:]
	}
	values, err := url.ParseQuery(rawQuery)
	Expect(err).ToNot(HaveOccurred())
	return base, values
}

var _ = Describe("buildSQLiteDSN", func() {
	It("adds busy_timeout and txlock to a plain file path", func() {
		base, values := parseDSN(buildSQLiteDSN("/data/database.db"))
		Expect(base).To(Equal("/data/database.db"))
		Expect(values.Get("_busy_timeout")).To(Equal("5000"))
		Expect(values.Get("_txlock")).To(Equal("immediate"))
	})

	It("adds pragmas to an in-memory database", func() {
		base, values := parseDSN(buildSQLiteDSN(":memory:"))
		Expect(base).To(Equal(":memory:"))
		Expect(values.Get("_busy_timeout")).To(Equal("5000"))
		Expect(values.Get("_txlock")).To(Equal("immediate"))
	})

	It("preserves an existing query string", func() {
		base, values := parseDSN(buildSQLiteDSN("/data/database.db?cache=shared"))
		Expect(base).To(Equal("/data/database.db"))
		Expect(values.Get("cache")).To(Equal("shared"))
		Expect(values.Get("_busy_timeout")).To(Equal("5000"))
		Expect(values.Get("_txlock")).To(Equal("immediate"))
	})

	It("does not override a caller-supplied busy_timeout or txlock", func() {
		_, values := parseDSN(buildSQLiteDSN("/data/database.db?_busy_timeout=1000&_txlock=deferred"))
		Expect(values["_busy_timeout"]).To(HaveLen(1), "_busy_timeout should not be duplicated")
		Expect(values.Get("_busy_timeout")).To(Equal("1000"))
		Expect(values["_txlock"]).To(HaveLen(1), "_txlock should not be duplicated")
		Expect(values.Get("_txlock")).To(Equal("deferred"))
	})
})
@@ -4,14 +4,59 @@ import (
 	"context"
 	"fmt"
 	"hash/fnv"
+	"strings"
+	"sync"
 
 	"gorm.io/gorm"
 )
 
-// TryWithLockCtx attempts to acquire a PostgreSQL advisory lock using the provided context.
-// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if the lock
-// was already held, or (false, error) on failure.
+// localLocks holds one buffered channel (capacity 1) per lock key, used as an
+// in-process mutex for non-PostgreSQL dialects (SQLite). A SQLite auth DB is
+// effectively single-process, so serializing guarded sections within this
+// process is sufficient - we cannot and need not coordinate across processes
+// the way a PostgreSQL advisory lock does.
+var (
+	localLocksMu sync.Mutex
+	localLocks   = map[int64]chan struct{}{}
+)
+
+// localLockChan returns the per-key buffered channel, creating it on first use.
+func localLockChan(key int64) chan struct{} {
+	localLocksMu.Lock()
+	defer localLocksMu.Unlock()
+	ch, ok := localLocks[key]
+	if !ok {
+		ch = make(chan struct{}, 1)
+		localLocks[key] = ch
+	}
+	return ch
+}
+
+// isPostgres reports whether the gorm dialect is PostgreSQL. Anything else
+// (SQLite and any non-postgres dialect) uses the in-process fallback, because
+// the pg_* advisory lock functions only exist on PostgreSQL.
+func isPostgres(db *gorm.DB) bool {
+	return strings.Contains(db.Dialector.Name(), "postgres")
+}
+
+// TryWithLockCtx attempts to acquire a lock and run fn without blocking.
+// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if
+// the lock was already held, or (false, error) on failure.
+//
+// On PostgreSQL it uses pg_try_advisory_lock (cross-process). On other dialects
+// (SQLite) it uses a non-blocking in-process lock keyed by key.
 func TryWithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) (bool, error) {
+	if !isPostgres(db) {
+		ch := localLockChan(key)
+		select {
+		case ch <- struct{}{}:
+			defer func() { <-ch }()
+			return true, fn()
+		default:
+			return false, nil
+		}
+	}
+
 	sqlDB, err := db.DB()
 	if err != nil {
 		return false, fmt.Errorf("get sql.DB: %w", err)
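
The non-blocking SQLite fallback in `TryWithLockCtx` rests on a common Go idiom: a buffered channel of capacity 1 acts as a mutex, and a `select` with a `default` case turns the acquire into a try-lock. The sketch below isolates that idiom without gorm. The `keyedLocks` type and its `tryLock` method are hypothetical names invented for this example; the diff itself uses the package-level `localLocks` map and `localLockChan`.

```go
package main

import (
	"fmt"
	"sync"
)

// keyedLocks is a hypothetical wrapper for this sketch: one capacity-1
// channel per key, guarded by a mutex for map access.
type keyedLocks struct {
	mu    sync.Mutex
	chans map[int64]chan struct{}
}

func newKeyedLocks() *keyedLocks {
	return &keyedLocks{chans: map[int64]chan struct{}{}}
}

// chanFor returns the channel for key, creating it on first use.
func (k *keyedLocks) chanFor(key int64) chan struct{} {
	k.mu.Lock()
	defer k.mu.Unlock()
	ch, ok := k.chans[key]
	if !ok {
		ch = make(chan struct{}, 1)
		k.chans[key] = ch
	}
	return ch
}

// tryLock attempts to take the lock for key without blocking. On success it
// returns a release function and true; if the lock is held it returns false.
func (k *keyedLocks) tryLock(key int64) (release func(), ok bool) {
	ch := k.chanFor(key)
	select {
	case ch <- struct{}{}: // the buffer slot was free: lock acquired
		return func() { <-ch }, true
	default: // the buffer is full: someone else holds the lock
		return nil, false
	}
}

func main() {
	locks := newKeyedLocks()

	release, ok := locks.tryLock(42)
	fmt.Println("first attempt acquired:", ok)

	_, ok = locks.tryLock(42)
	fmt.Println("second attempt acquired:", ok)

	release()
	_, ok = locks.tryLock(42)
	fmt.Println("after release acquired:", ok)
}
```

A channel is used instead of `sync.Mutex` because the same channel also works in a `select` next to `ctx.Done()`, which the blocking variant needs.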
|
||||
@@ -50,9 +95,31 @@ func KeyFromString(s string) int64 {
 	return int64(h.Sum64()>>1) | 0x100000000
 }
 
-// WithLockCtx is like WithLock but respects context cancellation.
-// If ctx is cancelled while waiting for the lock, the function returns ctx.Err().
+// WithLockCtx acquires a lock for key, runs fn, then releases it, respecting
+// context cancellation. If ctx is cancelled while waiting for the lock, the
+// function returns ctx.Err().
+//
+// On PostgreSQL it uses pg_advisory_lock (cross-process). On other dialects
+// (SQLite) it falls back to a blocking in-process lock keyed by key, which is
+// sufficient because a SQLite auth DB is effectively single-process.
 func WithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) error {
+	if !isPostgres(db) {
+		// Honor an already-cancelled context before attempting acquisition:
+		// select picks a ready case at random, so without this an already-free
+		// lock could be taken despite a cancelled ctx.
+		if err := ctx.Err(); err != nil {
+			return err
+		}
+		ch := localLockChan(key)
+		select {
+		case ch <- struct{}{}:
+			defer func() { <-ch }()
+			return fn()
+		case <-ctx.Done():
+			return ctx.Err()
+		}
+	}
+
 	sqlDB, err := db.DB()
 	if err != nil {
 		return fmt.Errorf("advisorylock: getting sql.DB: %w", err)
|
||||
|
||||
129
core/services/advisorylock/advisorylock_sqlite_test.go
Normal file
@@ -0,0 +1,129 @@
package advisorylock

import (
	"context"
	"sync"
	"sync/atomic"
	"time"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"

	"gorm.io/driver/sqlite"
	"gorm.io/gorm"
)

// These specs run against an in-memory SQLite DB and therefore do NOT require
// Docker, unlike the PostgreSQL testcontainer specs.
var _ = Describe("AdvisoryLock (SQLite fallback)", Label("sqlite"), func() {
	var db *gorm.DB

	BeforeEach(func() {
		var err error
		db, err = gorm.Open(sqlite.Open("file::memory:?cache=shared"), &gorm.Config{})
		Expect(err).ToNot(HaveOccurred())
		Expect(db.Dialector.Name()).To(ContainSubstring("sqlite"))
	})

	It("WithLockCtx executes fn and returns no error on SQLite", func() {
		const lockKey int64 = 12001
		executed := false

		err := WithLockCtx(context.Background(), db, lockKey, func() error {
			executed = true
			return nil
		})
		Expect(err).ToNot(HaveOccurred())
		Expect(executed).To(BeTrue(), "function should have run under the in-process lock")
	})

	It("WithLockCtx serializes concurrent goroutines on the same key", func() {
		const lockKey int64 = 12002

		var (
			mu          sync.Mutex
			maxRunning  int32
			running     int32
			concurrency int32
		)

		var wg sync.WaitGroup

		for range 2 {
			wg.Go(func() {
				defer GinkgoRecover()
				err := WithLockCtx(context.Background(), db, lockKey, func() error {
					cur := atomic.AddInt32(&running, 1)
					mu.Lock()
					if cur > maxRunning {
						maxRunning = cur
					}
					if cur > 1 {
						atomic.AddInt32(&concurrency, 1)
					}
					mu.Unlock()

					time.Sleep(50 * time.Millisecond)

					atomic.AddInt32(&running, -1)
					return nil
				})
				Expect(err).ToNot(HaveOccurred())
			})
		}

		wg.Wait()

		Expect(maxRunning).To(BeNumerically("<=", 1), "expected max 1 goroutine inside lock at a time")
		Expect(concurrency).To(BeZero(), "detected concurrent execution inside advisory lock")
	})

	It("WithLockCtx returns an error and does not run fn with an already-cancelled context", func() {
		const lockKey int64 = 12003
		ctx, cancel := context.WithCancel(context.Background())
		cancel()

		err := WithLockCtx(ctx, db, lockKey, func() error {
			Fail("function should not run with a cancelled context")
			return nil
		})
		Expect(err).To(HaveOccurred())
	})

	It("TryWithLockCtx returns (true, nil) when free and (false, nil) when held", func() {
		const lockKey int64 = 12004

		acquired, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
			return nil
		})
		Expect(err).ToNot(HaveOccurred())
		Expect(acquired).To(BeTrue(), "expected TryWithLockCtx to acquire the free lock")

		// Hold the lock in one goroutine while a concurrent TryWithLockCtx
		// attempts to acquire the same key.
		held := make(chan struct{})
		release := make(chan struct{})
		var wg sync.WaitGroup
		wg.Go(func() {
			defer GinkgoRecover()
			ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
				close(held)
				<-release
				return nil
			})
			Expect(err).ToNot(HaveOccurred())
			Expect(ok).To(BeTrue())
		})

		<-held
		ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
			Fail("function should not run while lock is held")
			return nil
		})
		Expect(err).ToNot(HaveOccurred())
		Expect(ok).To(BeFalse(), "expected TryWithLockCtx to fail to acquire a held lock")

		close(release)
		wg.Wait()
	})
})
24
core/services/jobs/sqlite_e2e_test.go
Normal file
@@ -0,0 +1,24 @@
//go:build auth

package jobs_test

import (
	"github.com/mudler/LocalAI/core/http/auth"
	"github.com/mudler/LocalAI/core/services/jobs"

	. "github.com/onsi/ginkgo/v2"
	. "github.com/onsi/gomega"
)

// Reproduces the #10506 caller chain: auth.InitDB(sqlite) -> jobs.NewJobStore,
// which previously failed with "no such function: pg_advisory_lock".
var _ = Describe("NewJobStore on a SQLite auth DB (#10506)", func() {
	It("migrates without pg_advisory_lock errors", func() {
		db, err := auth.InitDB(":memory:")
		Expect(err).ToNot(HaveOccurred())

		store, err := jobs.NewJobStore(db)
		Expect(err).ToNot(HaveOccurred())
		Expect(store).ToNot(BeNil())
	})
})
@@ -85,6 +85,8 @@ localai run
| `LOCALAI_REGISTRATION_MODE` | `approval` | Registration mode: `open`, `approval`, or `invite` |
| `LOCALAI_DISABLE_LOCAL_AUTH` | `false` | Disable local email/password registration and login (for OAuth/OIDC-only deployments) |

> **Note: network-backed storage.** File-based SQLite relies on POSIX file locking, which is unreliable over network filesystems (SMB/CIFS/NFS, e.g. Azure Files / Azure Container Apps shared volumes). On such storage the auth DB can fail to migrate with `database is locked`. Use PostgreSQL (`LOCALAI_AUTH_DATABASE_URL=postgres://...`) when the data directory lives on shared or network storage, or place `database.db` on a local volume.

### Disabling Local Authentication

If you want to enforce OAuth/OIDC-only login and prevent users from registering or logging in with email/password, set `LOCALAI_DISABLE_LOCAL_AUTH=true` (or pass `--disable-local-auth`):