Compare commits

...

29 Commits

Author SHA1 Message Date
Ettore Di Giacinto
50eb252003 fix(syncstate): annotate gosec G118 false positive on lifeCtx
gosec flagged the WithCancel in Start as "cancellation function not called"
because the returned cancel is stored on the struct rather than called/deferred
in scope. It is invoked in Close (covered by tests), and lifeCtx must outlive
Start to drive the reconnect/reconcile goroutines. Suppress the verified false
positive with a justified #nosec G118.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-27 07:05:25 +00:00
Ettore Di Giacinto
a0bdfc23b6 refactor(quantization): back jobs with SyncedMap + durable QuantStore
QuantizationService kept jobs in a process-local map persisted only to a local
state.json, so in distributed mode jobs were neither visible across replicas nor
durable cluster-wide. Back jobs with a syncstate.SyncedMap keyed by job ID
(value *schema.QuantizationJob, the exact REST shape).

- New distributed.QuantStore (GORM, table quantization_jobs) mirroring
  FineTuneStore: Create/Get/ListAll/Upsert(idempotent)/Delete, registered for
  AutoMigrate via distributed.InitStores (Stores.Quant).
- New adapter (quantization/syncstore.go) over QuantStore implementing
  syncstate.Store, with record<->schema conversion.
- Reads go through List/Get, writes through Set/Delete (write-through +
  broadcast); state.json is kept as the standalone Loader for single-node restart
  recovery (stale-job fixups preserved).
- app.go passes the distributed NATS client + QuantStore when distributed, nil
  otherwise; Start/Close lifecycle mirrors finetune.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-27 00:30:05 +00:00
Ettore Di Giacinto
c894336898 refactor(agentpool): back agent tasks with SyncedMap for cross-replica consistency
AgentJobService.ListTasks read the process-local tasks map only, while ListJobs
already read through the DB persister + dispatcher NATS - so in distributed mode
a task created on one replica was invisible to the others. Back tasks with a
syncstate.SyncedMap keyed by task ID (value schema.Task, the exact REST shape);
jobs are left untouched.

- Store adapter (task_syncstore.go) over the existing JobPersister
  (LoadTasks/SaveTask/DeleteTask); reads svc.persister/userID live so a persister
  swap needs no rebuild. No new persister methods required.
- Task reads -> SyncedMap.List/Get; create/update -> Set (write-through +
  broadcast); delete -> Delete. The file persister now owns its own task set so
  the write-through path does not re-enter the SyncedMap lock (deadlock guard).
- The distributed NATS client is not available at construction (start() precedes
  initDistributed), so it is injected via SetTaskSyncNATS, which rebuilds the
  still-empty map before Start/hydrate. Wired at the main, restart, and per-user
  (UserServicesManager) distributed sites.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-27 00:23:02 +00:00
Ettore Di Giacinto
b3d1c3b4a7 refactor(finetune): back jobs with SyncedMap for cross-replica consistency
FineTuneService kept jobs in a process-local map and, although it wrote them to
Postgres, ListJobs/GetJob never read the store back and the wired natsClient was
never used - so in distributed mode a job created on one replica was invisible to
the others. Replace the map and the dead client with a syncstate.SyncedMap keyed
by job ID, value *schema.FineTuneJob (the exact REST shape, so responses are
unchanged).

- Add a Store adapter (core/services/finetune/syncstore.go) over FineTuneStore,
  plus FineTuneStore.ListAll (global hydrate; per-user List kept) and an
  idempotent Upsert (create-or-update; Create alone fails on dup key).
- Writes go through SyncedMap.Set/Delete (write-through + broadcast); reads use
  List/Get. The on-disk state.json path becomes the standalone Loader, keeping
  single-node restart recovery (stale->stopped / exporting->failed fixups).
- Fold SetNATSClient/SetFineTuneStore into NewFineTuneService; app.go passes the
  distributed NATS client + store when distributed, nil otherwise.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-27 00:04:07 +00:00
Ettore Di Giacinto
e4e3fde68b feat(distributed): add SyncedMap cross-replica in-memory state component
Introduce core/services/syncstate.SyncedMap[K,V]: a thread-safe in-memory map
that keeps itself consistent across frontend replicas via NATS, with an optional
pluggable durable Store and hydrate-from-source convergence.

Several features keep process-local state surfaced to the API (finetune/quant
jobs, agent tasks, model configs) and each hand-wired the same in-memory + NATS
broadcast + read-through-store legs - or forgot to, reintroducing cross-replica
staleness. SyncedMap makes that consistency a configuration choice:

- local writes mutate the map, write through the Store, then broadcast a delta;
- the apply path is memory-only and never re-publishes or re-writes the Store
  (structural echo-loop guard, mirroring galleryop.mergeStatus);
- on Start and on NATS reconnect the map re-hydrates from the source (Store, else
  Loader); an optional periodic Reconcile repairs silent drift;
- standalone mode (nil NATS client) is a strict in-memory no-op.

Reconnect re-hydrate is wired via a new *messaging.Client.OnReconnect callback,
consumed through an optional type-assertion so MessagingClient stays minimal.
Adds messaging.SubjectSyncStateDelta and a reusable testutil.FakeBus (synchronous
in-process MessagingClient with wildcard matching) for adopter tests.

Component only; service migrations follow in subsequent commits.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
2026-06-26 23:49:41 +00:00
LocalAI [bot]
64150ca7ab fix(distributed): broadcast admin model-config changes across replicas (#10540)
In distributed mode the admin model endpoints (/models/edit, /models/import,
/models/toggle-state and the PATCH config-json endpoint) wrote the YAML to the
shared models dir but reloaded only the local replica's in-memory
ModelConfigLoader. With multiple frontend replicas behind one service, a save
landed on whichever replica handled the request; peers kept serving their stale
in-memory view, so a load-balanced request was a coin-flip between old and new
config (a created alias visible on one replica and missing on the other, an
edited alias target diverging, etc.).

The NATS cache-invalidation channel (SubjectCacheInvalidateModels +
OnModelsChanged) already existed for the gallery install/delete path; these
admin endpoints simply never published on it. Wire them up via a new
GalleryService.BroadcastModelsChanged helper (no-op in standalone mode).

Also fix delete propagation: LoadModelConfigsFromPath is additive and never
drops an entry whose file is gone, so the subscriber hook (which only reloaded
from disk) could not propagate a removal. ApplyRemoteChange now honors the
event op - pruning the element on "delete" and reloading otherwise - and shuts
down any running instance of the affected model so the new config takes effect.
This closes the same latent gap on the gallery delete path.


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-27 01:36:57 +02:00
LocalAI [bot]
f98b0f1c1e fix(gpu-libs): bundle transitive deps of GPU runtime libs (#10537) (#10539)
fix(gpu-libs): bundle transitive deps of GPU runtime libs

The per-vendor packagers in package-gpu-libs.sh copy an explicit allowlist
of top-level GPU runtime libraries (libamdhip64, libhipblas, librocblas, the
CUDA/Intel equivalents, ...) but never resolved their transitive
dependencies. Backends run through the bundled lib/ld.so with
LD_LIBRARY_PATH=lib, so any transitive dep not in the allowlist is a fatal
"cannot open shared object file" at load time.

On recent ROCm (base image rocm 7.2.1) the runtime libs link against
librocprofiler-register.so.0, which is not in the allowlist, so the rocm
llama-cpp backend (and every other GPU backend sharing this script) failed
to load with:

  librocprofiler-register.so.0: cannot open shared object file

The Vulkan path already solved this class of problem with copy_elf_deps
(ldd-based transitive resolution), but that sweep was only wired into the
Vulkan ICD path. This adds a generic sweep_transitive_deps that runs the
same ldd resolution over everything the allowlist already bundled, and wires
it into the ROCm, CUDA and Intel packagers. ldd returns the full recursive
closure, so one pass suffices; core libc-family deps are skipped via
is_core_lib so we never shadow the loader's own libc/libstdc++.

Adds a self-contained regression test (gcc + ldd) that fabricates a primary
lib linking a transitive lib and asserts the sweep bundles the dependency.

Fixes #10537

Assisted-by: Claude:opus-4.8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-27 01:36:33 +02:00
LocalAI [bot]
2c96c2d08e chore: ⬆️ Update mudler/parakeet.cpp to f469a57270a1cc4554acb15febf60e56619673b9 (#10530)
⬆️ Update mudler/parakeet.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-27 00:50:51 +02:00
LocalAI [bot]
f01a969f7b docs: ⬆️ update docs version mudler/LocalAI (#10531)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-27 00:29:29 +02:00
LocalAI [bot]
56600eec3e fix(nodes): show a node's existing labels on the detail view (#10529)
fix(nodes): return labels in single-node GET so the detail view shows them

The node detail view (/app/nodes/:id) reads `node.labels` to render a
node's existing labels, but the single-node GET endpoint returned a bare
BackendNode whose Labels live in a separate table - so the list was always
empty and operators could only add labels, never see what was already set
(#10527). The same response also lacked in_flight_count and model_count.

Add NodeRegistry.GetWithExtras, mirroring the existing List vs ListWithExtras
split: bare Get stays cheap for the routing hot paths and existence checks,
while the detail endpoint uses the enriched variant to attach the labels map
and live counts. No frontend change is needed - the UI already renders
existing labels once the data is present.

Closes #10527


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 23:06:42 +02:00
LocalAI [bot]
c4fa256cdf chore(model gallery): 🤖 add 1 new models via gallery agent (#10526)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-26 22:31:22 +02:00
LocalAI [bot]
17c1fc74b2 fix(backends): darwin packaging for silero-vad (last Linux-only Go backend) (#10528)
fix(backends): darwin packaging for silero-vad

silero-vad was the last Go backend with Linux-only darwin packaging:
- package.sh fell through to "Could not detect architecture" -> exit 1 on
  macOS (no Darwin branch), so its darwin image never packaged.
- run.sh exported LD_LIBRARY_PATH, which macOS dyld ignores, so the bundled
  libonnxruntime.dylib couldn't be found at runtime.

Add a Darwin branch to package.sh (skip the glibc/ld.so bundling; add an
@loader_path/lib rpath so @rpath resolves to package/lib/) and a
DYLD_LIBRARY_PATH branch to run.sh — mirroring the piper darwin fix (#10525).

Assisted-by: Claude:claude-opus-4-8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 22:31:06 +02:00
LocalAI [bot]
068d397acf fix(backends): set rpath on the piper darwin binary so it can load its bundled libs (#10525)
The metal-darwin-arm64-piper backend crashed at launch on macOS:

    DYLD "Library missing"
      Library not loaded: @rpath/libucd.dylib
      Referenced from: .../piper
      Reason: no LC_RPATH's found

The piper binary links libucd, libespeak-ng, libpiper_phonemize and
libonnxruntime via @rpath, but ships with no LC_RPATH, so dyld cannot
expand @rpath and aborts before piper runs. The libraries themselves are
already bundled in package/lib/ by package.sh.

Additionally, package.sh's architecture detection only handled the Linux
glibc loaders (/lib64/ld-linux-x86-64.so.2, /lib/ld-linux-aarch64.so.1)
and otherwise hit `echo "Error: Could not detect architecture"; exit 1`,
so on macOS packaging failed outright.

Add a Darwin branch (before the Linux checks) that skips the glibc/ld.so
bundling macOS has no use for and instead runs
`install_name_tool -add_rpath @loader_path/lib` on the piper binary, so
@rpath resolves to the bundled package/lib/ directory.

Also mirror sherpa-onnx/opus in run.sh: export DYLD_LIBRARY_PATH on
Darwin (LD_LIBRARY_PATH is Linux-only) as a defensive fallback.

Validated by hand on Apple Silicon: with the rpath added, piper
synthesized a real WAV. The darwin build is validated in CI.

Assisted-by: Claude:claude-opus-4-8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 15:10:15 +02:00
LocalAI [bot]
5b3572f8b8 feat(macos): sign and notarize the DMG, app, and server binary (#10510)
Produce a Gatekeeper-clean macOS distribution with no user workaround:

- Launcher DMG + the LocalAI.app inside it are built via fyne, codesigned
  with the Developer ID under the hardened runtime, then the DMG is signed,
  notarized (notarytool) and stapled. Replaces macos-dmg-creator (which had
  no signing hook) with fyne package + hdiutil so we control the .app before
  packaging.
- The bare local-ai darwin server binary is signed + notarized via
  GoReleaser's native notarize block (quill backend, runs on Linux).
- All signing is gated on secrets being present, so forks/PRs/local builds
  stay unsigned and green (contrib/macos/sign-and-notarize.sh no-ops).
- Add hardened-runtime entitlements and FyneApp.toml for deterministic
  packaging; update macOS install docs to drop the quarantine workaround.

Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 12:45:51 +02:00
LocalAI [bot]
6afe127cd4 fix(backends): make the opus backend build and package on macOS/Darwin (#10523)
The opus Go backend (WebRTC audio codec) never built on macOS, so the
published master-metal-darwin-arm64-opus image shipped source only — no
opus binary and no libopusshim — because every step assumed Linux.

- Makefile: hardcoded libopusshim.so with no OS handling. Mirror
  sherpa-onnx: SHIM_EXT=so / dylib on Darwin and build
  libopusshim.$(SHIM_EXT). On Darwin link the shim with
  -undefined dynamic_lookup so it resolves opus_encoder_ctl from the
  already globally-loaded libopus (codec.go dlopens it RTLD_GLOBAL
  first) instead of baking an absolute Homebrew path into the dylib,
  keeping the packaged shim relocatable.
- run.sh: hardcoded LD_LIBRARY_PATH + libopusshim.so even on macOS. Add
  a Darwin branch exporting DYLD_LIBRARY_PATH and the .dylib shim, like
  sherpa-onnx/run.sh.
- package.sh: bundle libopusshim.$(SHIM_EXT) and libopus*.dylib (not
  just .so) into package/lib so the OCI image (which ships package/.)
  is self-contained on a runtime with no Homebrew; add a Darwin arch
  branch so it doesn't warn/skip.
- backend_build_darwin.yml: install + link opus and pkg-config via brew
  so the Makefile's `pkg-config opus` resolves on the macOS runner, and
  cache opus' Cellar dir.

Go code is unchanged; darwin build is validated in CI.

Assisted-by: Claude:claude-opus-4-8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 11:19:50 +02:00
LocalAI [bot]
f58dcefed4 fix(backends): ship the package/ dir for darwin go backend images (#10522)
fix(backends): ship the package/ dir for darwin go backends

golang-darwin.sh packaged the whole backend source/build dir as the OCI
image (backend/go/$BACKEND/.), so the runtime dylibs ended up under
package/lib and backend-assets/lib while run.sh looks in $CURDIR/lib. As a
result a backend like sherpa-onnx could not dlopen its libsherpa-shim.dylib
at runtime and exited immediately (the model then 500s with "grpc service
not ready"); it started fine only when run from inside package/.

Ship package/. instead — the self-contained run.sh + binary + lib/ bundle —
matching the Linux Dockerfile.golang (`COPY .../package/. ./`). Backends
that don't assemble a package/ fall back to the backend dir, and the
binary-existence guard now checks the directory actually shipped.

Assisted-by: Claude:claude-opus-4-8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 08:48:27 +02:00
LocalAI [bot]
11b062f8f4 chore(model gallery): 🤖 add 1 new models via gallery agent (#10521)
chore(model gallery): 🤖 add new models via gallery agent

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-26 07:43:29 +02:00
LocalAI [bot]
114eeaae81 feat(backends): make PreferDevelopmentBackends install the development image as primary (#10520)
When LOCALAI_PREFER_DEV_BACKENDS is set, install the -development image as the
primary backend URI (keeping the released image reachable as the first
fallback), instead of only reaching development as a download fallback when the
released image is missing. This lets an operator force backends built from the
development branch — e.g. to pick up a fix already on master before a release.

Threads PreferDevelopmentBackends through SystemState so InstallBackend can see
it, and reuses the same development-URI convention as the existing failure-path
fallback (released tag -> branch tag + dev suffix). The unexported developmentURI
helper is covered by a Ginkgo spec.

Assisted-by: Claude:claude-opus-4-8

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 07:42:45 +02:00
LocalAI [bot]
d388f874de feat(backends): darwin/Metal build for the privacy-filter backend (#10513)
* feat(backends): darwin/Metal build for the privacy-filter backend (timeboxed try)

The privacy-filter.cpp engine is already Metal-capable on Apple Silicon: it pulls
ggml and never forces GGML_METAL=OFF, and ggml defaults Metal ON on Apple, so a
plain Darwin build is Metal-enabled. grpc++/protobuf resolve from Homebrew via
find_package(... CONFIG). It just had no darwin build path - the existing
package.sh and run.sh are Linux-only and there was no make target / workflow step.

Adds the bespoke darwin path, modeled on the ds4 one:
- scripts/build/privacy-filter-darwin.sh: native make grpc-server, otool -L dylib
  bundling, create-oci-image (no Linux package.sh).
- Makefile: backends/privacy-filter-darwin target (+ .NOTPARALLEL).
- .github/workflows/backend_build_darwin.yml: gated build step for privacy-filter.
- scripts/changed-backends.js: inferBackendPathDarwin special-case -> backend/cpp.
- .github/backend-matrix.yml: includeDarwin entry (lang go, like ds4/llama-cpp).
- backend/index.yaml: metal: capability + metal-privacy-filter(-development) entries.
- backend/cpp/privacy-filter/run.sh: DYLD_LIBRARY_PATH branch on Darwin.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:opus-4.8 [Claude Code]

* fix(privacy-filter): macOS proto include + bundle ggml dylibs

Validated natively on an M4 (the build/package/load chain now works with Metal):

- CMakeLists.txt: hw_grpc_proto compiles the generated proto/grpc sources but
  only linked the binary dir, so on macOS it could not find protobuf's headers
  (runtime_version.h) - Homebrew puts them under /opt/homebrew, not /usr/include.
  Link protobuf::libprotobuf + gRPC::grpc++ so their include dirs propagate. No-op
  on Linux (apt headers are already on the default search path).
- privacy-filter-darwin.sh: bundle the ggml shared libs the binary @rpath-links
  (libggml{,-base,-cpu,-blas,-metal}); the otool -L walk only catches on-disk
  absolute deps and missed them. Resolved at runtime by run.sh's DYLD_LIBRARY_PATH.

M4 check: arm64 grpc-server links @rpath/libggml-metal.0.dylib; with the 15 ggml
dylibs + grpc/protobuf bundled, it loads clean (no dyld errors) and prints usage.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:opus-4.8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 01:18:41 +02:00
LocalAI [bot]
86677495a2 chore: ⬆️ Update ggml-org/llama.cpp to 9d5d882d8cd0f0a9283d87ed5e6fe3ee0d925fb1 (#10514)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-26 01:15:40 +02:00
LocalAI [bot]
253aedff06 chore: ⬆️ Update CrispStrobe/CrispASR to 8f1218141b792b8868861c1af17ba1e361b05dc0 (#10502)
⬆️ Update CrispStrobe/CrispASR

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-26 01:08:09 +02:00
LocalAI [bot]
74f07ecc35 fix(backends): quote $CURDIR in run.sh (fixes backends in paths with spaces) (#10519)
fix(backends): quote $CURDIR in run.sh so backends work in paths with spaces

The backend launcher scripts derive their own directory with
CURDIR=$(dirname "$(realpath $0)") and then referenced it unquoted as
$CURDIR (e.g. [ -f $CURDIR/lib/ld.so ], export LD_LIBRARY_PATH=$CURDIR/lib:...,
exec $CURDIR/<binary> "$@"). When a backend is installed under a path that
contains a space - notably macOS's ~/Library/Application Support/... - bash
word-splits the unquoted $CURDIR, so the test builtin fails with
"binary operator expected" and exec tries to run ".../Library/Application",
yielding "No such file or directory". The backend never starts, surfacing as
a gRPC "service not ready" error and an HTTP 500. Quote $CURDIR (and the
realpath "$0") in every affected run.sh; no logic changes.

Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 01:02:48 +02:00
LocalAI [bot]
ae0da454a7 chore: pin localrecall to tagged v0.6.3 (#10518)
#10517 pinned the pseudo-version of the postgres connection-timeout fix;
mudler/LocalRecall@v0.6.3 now tags that exact commit. Use the clean release
tag instead of the pseudo-version. No code change.


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 01:02:15 +02:00
LocalAI [bot]
179210b970 chore: bump localrecall for postgres per-connection timeouts (#10517)
* chore: bump localrecall for postgres per-connection timeouts

Pulls mudler/LocalRecall#49: sets lock_timeout / idle_in_transaction
(default on) + opt-in statement_timeout on every pooled connection, so a
corrupt/wedged index (e.g. a BM25 insert spinning on a buffer-content lock)
can no longer hold its relation lock forever and head-of-line block the
whole vector store.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* docs(agents): document PostgreSQL connection safety timeouts

Note the POSTGRES_LOCK_TIMEOUT / POSTGRES_IDLE_IN_TRANSACTION_TIMEOUT /
POSTGRES_STATEMENT_TIMEOUT env vars read by the embedded vector store, and
that safe defaults are on automatically.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-26 00:53:03 +02:00
LocalAI [bot]
6c03e46390 chore: ⬆️ Update ikawrakow/ik_llama.cpp to b84902d2ad27c34f989f23947200c4b91b1568fd (#10515)
⬆️ Update ikawrakow/ik_llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-06-25 23:42:21 +02:00
LocalAI [bot]
f2ed63e39a docs(backends): make OS coverage explicit + require darwin support for new backends (#10516)
docs(backends): make OS coverage explicit + require darwin for new backends

The backend matrix is the source of truth for which OS a backend ships on, but
that was never written down, so backends were landing Linux-only by default even
when the engine builds fine on macOS.

- .github/backend-matrix.yml: header block documenting the two matrices
  (include = Linux, includeDarwin = macOS/Apple Silicon) and the policy that new
  backends target every OS they can build for.
- .agents/adding-backends.md: a 'Cover every OS' subsection in step 2 (full darwin
  wiring: includeDarwin entry, index.yaml metal: + metal-<backend> entries,
  run.sh DYLD branch + inferBackendPathDarwin case for C++ backends, the
  hw_grpc_proto protobuf/grpc link gotcha, and the path-filter touch) plus a
  verification-checklist item.
- AGENTS.md (CLAUDE.md): Quick Reference pointer so it surfaces every session.


Assisted-by: Claude:opus-4.8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-25 23:26:39 +02:00
LocalAI [bot]
286c508ce0 feat(backends): darwin build for the localvqe backend (acoustic echo cancellation) (#10512)
feat(backends): darwin build for the localvqe backend

LocalVQE (acoustic echo cancellation / noise suppression / dereverberation)
already builds on Darwin - its Makefile takes the OS=Darwin branch with
GGML_METAL=OFF (upstream is CPU + Vulkan only), producing a native arm64 CPU
image. It was just never wired into CI.

- .github/backend-matrix.yml: add localvqe to includeDarwin (build-type metal,
  lang go) - the darwin/arm64 build profile; the backend itself stays CPU.
- backend/index.yaml: metal: capability + concrete metal-localvqe(-development)
  entries pointing at the -metal-darwin-arm64-localvqe images.
- backend/go/localvqe/Makefile: note on the existing Darwin branch (also the
  per-backend change the CI path filter needs to build it here).


Assisted-by: Claude:opus-4.8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-25 22:54:36 +02:00
LocalAI [bot]
d1a9d59917 feat(backends): darwin/Metal builds for vision C++/ggml backends (depth-anything, locate-anything, rfdetr-cpp, sam3-cpp) (#10511)
feat(backends): darwin/Metal builds for the vision C++/ggml backends

depth-anything-cpp, locate-anything-cpp, rfdetr-cpp and sam3-cpp already carry
a Darwin/Metal path in their Makefiles (GGML_METAL=ON when build-type=metal),
but were never wired into CI, so no Metal image was published and Apple Silicon
could not install them.

- .github/backend-matrix.yml: add the four to includeDarwin (build-type metal,
  lang go), matching the other go+ggml *-cpp Metal entries.
- backend/index.yaml: add metal: to each backend's capabilities map (main and
  -development) plus concrete metal-<backend>(-development) entries pointing at
  the latest/master -metal-darwin-arm64-<backend> images.
- backend/go/*/Makefile: a one-line note on the existing Darwin branch (also
  the per-backend change the CI path filter needs to actually build them here).


Assisted-by: Claude:opus-4.8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-25 22:07:56 +02:00
LocalAI [bot]
f72046b5b5 fix(auth): make advisory locks dialect-aware and harden SQLite DSN (#10509)
* fix(auth): make advisory locks dialect-aware and harden SQLite DSN

Fixes #10506.

Two failures hit deployments that use the default SQLite auth database:

1. advisorylock executed PostgreSQL-only SQL (pg_advisory_lock /
   pg_try_advisory_lock) unconditionally. On a SQLite auth DB the job
   store, agent store and node registry migrations failed with
   "no such function: pg_advisory_lock". WithLockCtx/TryWithLockCtx now
   branch on the gorm dialect: PostgreSQL keeps the cross-process advisory
   lock, every other dialect uses a context-aware, per-key in-process lock
   (a SQLite auth DB is effectively single-process, so serializing within
   the process is sufficient).

2. The SQLite auth DSN set no busy timeout, so transient SQLITE_BUSY over
   network-backed storage (SMB/CIFS/NFS, e.g. Azure Files) failed the auth
   migration immediately with "database is locked". The DSN now sets
   _busy_timeout=5000 and _txlock=immediate (caller-supplied values are
   preserved). WAL is intentionally not enabled since its shared-memory
   mmap does not work over network filesystems. Docs note that PostgreSQL
   should be used when the data directory lives on shared storage.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* test(jobs): regression test for #10506 SQLite job store migration

Exercises the exact caller chain that failed in the issue:
auth.InitDB(sqlite) -> jobs.NewJobStore -> advisorylock.WithLockCtx ->
AutoMigrate. Before the dialect-aware advisory lock fix this failed with
"no such function: pg_advisory_lock"; the test now asserts it migrates
cleanly on a SQLite auth DB.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-25 17:18:55 +02:00
119 changed files with 3968 additions and 467 deletions

View File

@@ -102,6 +102,24 @@ Multi-arch backends are NOT a single matrix entry with `platforms: 'linux/amd64,
Entries whose `dockerfile` is `./backend/Dockerfile.{llama-cpp,ik-llama-cpp,turboquant}` must also set a `builder-base-image` field pointing at a prebuilt base from `quay.io/go-skynet/ci-cache:base-grpc-*` (CI builds these via `.github/workflows/base-images.yml`). The mapping is by `(build-type, platforms)` — see existing entries for the pattern. CI uses these prebuilt bases to skip the gRPC compile (~2535 min cold). Local `make backends/<name>` ignores `builder-base-image` and uses the from-source path inside the Dockerfile, so you don't need quay access for local builds.
### Cover every OS the project supports (Linux **and** Darwin)
`.github/backend-matrix.yml` has two matrices, and they are the source of truth for which OS a backend ships on:
- `include:` — the **Linux** matrix (x86_64 + arm64; CPU and CUDA / ROCm / SYCL / Vulkan).
- `includeDarwin:` — the **macOS / Apple Silicon** matrix (arm64; Metal where the engine supports it, otherwise a native arm64 CPU build).
**A new backend must target every OS it can build for — do not ship Linux-only by default.** A backend that appears only under `include:` is silently unavailable on macOS even when its code would run there. Most C/C++/GGML engines build on Darwin out of the box (ggml defaults `GGML_METAL=ON` on Apple, so a plain build is Metal-enabled), and many Python backends do too (CPU / MPS wheels). If a backend genuinely cannot support an OS (e.g. CUDA-only, no CPU variant), state that in the PR description instead of omitting it silently.
Wiring a backend into `includeDarwin:` is more than the matrix entry:
1. **`includeDarwin:` entry** — `tag-suffix: "-metal-darwin-arm64-<backend>"`, `build-type: "metal"`, `lang: "go"` for go+ggml backends; omit `build-type` for the bespoke C++ ones (llama-cpp / ds4 / privacy-filter). Match an existing entry of the same shape.
2. **`backend/index.yaml`** — add `metal:` to the backend's `capabilities` map (main and `-development`) and concrete `metal-<backend>` / `metal-<backend>-development` image entries pointing at the `-metal-darwin-arm64-<backend>` images.
3. **C/C++ backends only** — add an `inferBackendPathDarwin` case in `scripts/changed-backends.js` returning `backend/cpp/<backend>/` (the generic fallthrough assumes `backend/<lang>/`, which is wrong for a C++ source tree driven with `lang: go`), and give `run.sh` a Darwin branch that exports `DYLD_LIBRARY_PATH` instead of `LD_LIBRARY_PATH`. If the build is bespoke (single `grpc-server` + dylib bundling), model it on `scripts/build/ds4-darwin.sh` and add a `backends/<backend>-darwin` make target plus a gated step in `.github/workflows/backend_build_darwin.yml`.
4. **C++ proto gotcha** — if the backend compiles the generated gRPC/protobuf in a separate CMake target (e.g. `hw_grpc_proto`), that target must link `protobuf::libprotobuf` + `gRPC::grpc++` so the Homebrew include dirs propagate; otherwise macOS fails with `google/protobuf/runtime_version.h not found` (Linux hides this because apt headers sit in `/usr/include`).
The CI path filter only builds a backend on a PR when a file under its directory changes, so a darwin-only YAML edit builds nothing — touch a file under `backend/<lang>/<backend>/` (a one-line comment is enough) in the same PR.
## 3. Add Backend Metadata to `backend/index.yaml`
**Step 3a: Add Meta Definition**
@@ -225,6 +243,7 @@ After adding a new backend, verify:
- [ ] Backend directory structure is complete with all necessary files
- [ ] Build configurations added to `.github/backend-matrix.yml` for all desired platforms (per-arch entries with `platform-tag` for multi-arch; `builder-base-image` for llama-cpp / ik-llama-cpp / turboquant)
- [ ] **OS coverage considered**: added to `includeDarwin:` (macOS/Apple Silicon) if the backend can build there — with the `backend/index.yaml` `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD branch and `inferBackendPathDarwin` case for C++ backends — or the PR explains why an OS is unsupported. Do not ship Linux-only by default.
- [ ] Meta definition added to `backend/index.yaml` in the `## metas` section
- [ ] Image entries added to `backend/index.yaml` for all build variants (latest + development)
- [ ] Tag suffixes match between workflow file and index.yaml

View File

@@ -2,6 +2,28 @@
# Matrix data for backend container image builds.
# Consumed by scripts/changed-backends.js for both backend.yml and backend_pr.yml.
# This file is NOT a workflow — it has no top-level 'on:' or 'jobs:'.
#
# OS / platform coverage — READ THIS WHEN ADDING A BACKEND
# --------------------------------------------------------
# This file is the source of truth for which OS each backend is built and
# published for. A backend ships ONLY for the matrices it appears in:
# - Linux -> the `include:` matrix below (x86_64 + arm64; CPU and
# CUDA / ROCm / SYCL / Vulkan variants).
# - macOS -> the `includeDarwin:` matrix (Apple Silicon / arm64; Metal where
# the engine supports it, otherwise a native arm64 CPU build).
#
# New backends must target EVERY OS they can build for, not just Linux. A backend
# listed only under `include:` is silently unavailable on macOS even when its code
# would run there. Most C/C++/GGML engines build on Darwin (ggml defaults
# GGML_METAL=ON on Apple, so a plain build is Metal-enabled), and many Python
# backends do too (CPU / MPS). If a backend genuinely cannot support an OS, say so
# in its PR description rather than silently omitting it.
#
# Adding a backend to `includeDarwin:` is more than one line — see the darwin
# checklist in .agents/adding-backends.md (includeDarwin entry, the index.yaml
# `metal:` capability + `metal-<backend>` image entries, a `run.sh` Darwin/DYLD
# branch for C/C++ backends, and the inferBackendPathDarwin case in
# scripts/changed-backends.js so the path filter actually builds it).
# Linux matrix (consumed by backend-jobs).
include:
@@ -4922,6 +4944,37 @@ includeDarwin:
tag-suffix: "-metal-darwin-arm64-vibevoice-cpp"
build-type: "metal"
lang: "go"
# Vision/utility C++/ggml backends (go+cgo). Their Makefiles already carry a
# Darwin/Metal path (GGML_METAL=ON when build-type=metal); this just builds and
# publishes the metal image so Apple Silicon can install them.
- backend: "depth-anything-cpp"
tag-suffix: "-metal-darwin-arm64-depth-anything-cpp"
build-type: "metal"
lang: "go"
- backend: "locate-anything-cpp"
tag-suffix: "-metal-darwin-arm64-locate-anything-cpp"
build-type: "metal"
lang: "go"
- backend: "rfdetr-cpp"
tag-suffix: "-metal-darwin-arm64-rfdetr-cpp"
build-type: "metal"
lang: "go"
- backend: "sam3-cpp"
tag-suffix: "-metal-darwin-arm64-sam3-cpp"
build-type: "metal"
lang: "go"
# privacy-filter (PII/NER) is a C++/ggml backend built by a bespoke darwin
# script (make backends/privacy-filter-darwin); ggml defaults Metal ON on Apple
# so the build is Metal-enabled. lang=go drives runner/toolchain selection only.
- backend: "privacy-filter"
tag-suffix: "-metal-darwin-arm64-privacy-filter"
lang: "go"
# LocalVQE has no Metal path; on Apple Silicon it builds CPU-only (GGML_METAL
# OFF) but is still a native arm64 image. Uses the darwin/metal build profile.
- backend: "localvqe"
tag-suffix: "-metal-darwin-arm64-localvqe"
build-type: "metal"
lang: "go"
- backend: "voxtral"
tag-suffix: "-metal-darwin-arm64-voxtral"
build-type: "metal"

View File

@@ -99,6 +99,7 @@ jobs:
/opt/homebrew/Cellar/xxhash
/opt/homebrew/Cellar/zstd
/opt/homebrew/Cellar/nlohmann-json
/opt/homebrew/Cellar/opus
key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
- name: Dependencies
@@ -113,7 +114,12 @@ jobs:
# nlohmann-json is header-only and required by the ds4 backend
# (dsml_renderer.cpp includes <nlohmann/json.hpp>); on Linux it comes
# from the apt-installed nlohmann-json3-dev in the build image.
brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json
# opus + pkg-config are required by the opus go backend: its
# Makefile/package.sh call `pkg-config --cflags/--libs opus` to build
# libopusshim.dylib and to locate libopus.dylib for bundling. brew's
# pkg-config defaults its search path to the Homebrew prefix so the
# opus.pc is found.
brew install protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config
# Force-reinstall ccache so brew re-validates its full runtime-dep
# closure on every run. This is the durable fix: when the upstream
# ccache formula gains a new transitive dep (as it has multiple times
@@ -132,7 +138,7 @@ jobs:
# and decides "already installed" without re-linking, so on a cache-
# hit run the formulas aren't on PATH. Force-link them; --overwrite
# tolerates pre-existing symlinks from earlier installs.
brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json 2>/dev/null || true
brew link --overwrite protobuf grpc make protoc-gen-go protoc-gen-go-grpc libomp llvm ccache blake3 fmt hiredis xxhash zstd nlohmann-json opus pkg-config 2>/dev/null || true
- name: Save Homebrew cache
if: github.event_name != 'pull_request' && steps.brew-cache.outputs.cache-hit != 'true'
@@ -153,6 +159,7 @@ jobs:
/opt/homebrew/Cellar/xxhash
/opt/homebrew/Cellar/zstd
/opt/homebrew/Cellar/nlohmann-json
/opt/homebrew/Cellar/opus
key: brew-${{ runner.os }}-${{ runner.arch }}-v1-${{ hashFiles('.github/workflows/backend_build_darwin.yml') }}
# ---- ccache for llama.cpp CMake builds ----
@@ -228,8 +235,17 @@ jobs:
run: |
make backends/ds4-darwin
# privacy-filter is a C++/ggml backend like ds4 - a single grpc-server with
# otool dylib bundling - so it gets its own bespoke darwin script rather than
# the generic build-darwin-go-backend path.
- name: Build privacy-filter backend (Darwin Metal)
if: inputs.backend == 'privacy-filter'
run: |
make protogen-go
make backends/privacy-filter-darwin
- name: Build ${{ inputs.backend }}-darwin
if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4'
if: inputs.backend != 'llama-cpp' && inputs.backend != 'ds4' && inputs.backend != 'privacy-filter'
run: |
make protogen-go
BACKEND=${{ inputs.backend }} BUILD_TYPE=${{ inputs.build-type }} USE_PIP=${{ inputs.use-pip }} make build-darwin-${{ inputs.lang }}-backend

View File

@@ -24,6 +24,11 @@ jobs:
args: release --clean
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
MACOS_SIGN_P12: ${{ secrets.MACOS_CERTIFICATE }}
MACOS_SIGN_PASSWORD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
launcher-build-darwin:
runs-on: macos-latest
steps:
@@ -35,9 +40,19 @@ jobs:
uses: actions/setup-go@v5
with:
go-version: 1.23
- name: Build launcher for macOS ARM64
run: |
make build-launcher-darwin
- name: Import signing certificate
env:
MACOS_CERTIFICATE: ${{ secrets.MACOS_CERTIFICATE }}
MACOS_CERTIFICATE_PWD: ${{ secrets.MACOS_CERTIFICATE_PWD }}
MACOS_CI_KEYCHAIN_PWD: ${{ secrets.MACOS_CI_KEYCHAIN_PWD }}
run: bash contrib/macos/sign-and-notarize.sh import-cert
- name: Build, sign and notarize the DMG
env:
MACOS_SIGN_IDENTITY: ${{ secrets.MACOS_SIGN_IDENTITY }}
MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
run: make release-launcher-darwin
- name: Upload DMG to Release
uses: softprops/action-gh-release@v3
with:

3
.gitignore vendored
View File

@@ -94,3 +94,6 @@ core/http/react-ui/test-results/
# SDD / brainstorm scratch (agent-driven development)
.superpowers/
# Local Apple signing material (never commit)
.certs/

View File

@@ -9,7 +9,8 @@ source:
enabled: true
name_template: '{{ .ProjectName }}-{{ .Tag }}-source'
builds:
- main: ./cmd/local-ai
- id: local-ai
main: ./cmd/local-ai
env:
- CGO_ENABLED=0
ldflags:
@@ -35,3 +36,19 @@ snapshot:
version_template: "{{ .Tag }}-next"
changelog:
use: github-native
# Sign + notarize the macOS server binary via the quill backend (runs on Linux,
# no macOS runner needed). Disabled automatically when MACOS_SIGN_P12 is unset
# (forks / PRs), so those builds stay unsigned and green.
notarize:
macos:
- enabled: '{{ isEnvSet "MACOS_SIGN_P12" }}'
ids:
- local-ai
sign:
certificate: "{{.Env.MACOS_SIGN_P12}}"
password: "{{.Env.MACOS_SIGN_PASSWORD}}"
notarize:
issuer_id: "{{.Env.MACOS_NOTARY_ISSUER_ID}}"
key_id: "{{.Env.MACOS_NOTARY_KEY_ID}}"
key: "{{.Env.MACOS_NOTARY_KEY}}"
wait: true

View File

@@ -43,4 +43,5 @@ LocalAI follows the Linux kernel project's [guidelines for AI coding assistants]
- **New API endpoints**: LocalAI advertises its capability surface in several independent places — swagger `@Tags`, `/api/instructions` registry, auth `RouteFeatureRegistry`, React UI `capabilities.js`, docs. Read [.agents/api-endpoints-and-auth.md](.agents/api-endpoints-and-auth.md) and follow its checklist — missing any surface means clients, admins, and the UI won't know the endpoint exists.
- **Admin endpoints → MCP tool**: every admin endpoint that an admin would manage conversationally (install/list/edit/toggle/upgrade) MUST also be exposed as an MCP tool in `pkg/mcp/localaitools/`. The LocalAI Assistant chat modality and the standalone `local-ai mcp-server` consume that package; drift between REST and MCP is a real risk. Read [.agents/localai-assistant-mcp.md](.agents/localai-assistant-mcp.md) — the `TestToolHTTPRouteMappingComplete` test fails until you wire the new tool and update the route map.
- **Build**: Inspect `Makefile` and `.github/workflows/` — ask the user before running long builds
- **Backend OS coverage**: a new backend must target every OS it can build for, not just Linux. `.github/backend-matrix.yml` has two matrices — `include:` (Linux) and `includeDarwin:` (macOS / Apple Silicon). Most C/C++/GGML and many Python backends build on Darwin too — wire the `includeDarwin` entry + `backend/index.yaml` `metal:` entries, or say in the PR why an OS is unsupported. See the darwin checklist in [.agents/adding-backends.md](.agents/adding-backends.md).
- **UI**: The active UI is the React app in `core/http/react-ui/`. The older Alpine.js/HTML UI in `core/http/static/` is pending deprecation — all new UI work goes in the React UI

View File

@@ -1,5 +1,5 @@
# Disable parallel execution for backend builds
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter
.NOTPARALLEL: backends/diffusers backends/llama-cpp backends/turboquant backends/outetts backends/piper backends/stablediffusion-ggml backends/whisper backends/crispasr backends/parakeet-cpp backends/faster-whisper backends/silero-vad backends/local-store backends/huggingface backends/rfdetr backends/rfdetr-cpp backends/insightface backends/speaker-recognition backends/kitten-tts backends/kokoro backends/chatterbox backends/llama-cpp-darwin backends/neutts build-darwin-python-backend build-darwin-go-backend backends/mlx backends/diffuser-darwin backends/mlx-vlm backends/mlx-audio backends/mlx-distributed backends/stablediffusion-ggml-darwin backends/vllm backends/vllm-omni backends/sglang backends/moonshine backends/pocket-tts backends/qwen-tts backends/faster-qwen3-tts backends/qwen-asr backends/nemo backends/voxcpm backends/whisperx backends/ace-step backends/acestep-cpp backends/fish-speech backends/voxtral backends/opus backends/trl backends/llama-cpp-quantization backends/kokoros backends/sam3-cpp backends/qwen3-tts-cpp backends/omnivoice-cpp backends/vibevoice-cpp backends/localvqe backends/tinygrad backends/sherpa-onnx backends/ds4 backends/ds4-darwin backends/liquid-audio backends/supertonic backends/depth-anything-cpp backends/privacy-filter backends/privacy-filter-darwin
GOCMD=go
GOTEST=$(GOCMD) test
@@ -1129,6 +1129,10 @@ backends/ds4-darwin: build
bash ./scripts/build/ds4-darwin.sh
./local-ai backends install "ocifile://$(abspath ./backend-images/ds4.tar)"
backends/privacy-filter-darwin: build
bash ./scripts/build/privacy-filter-darwin.sh
./local-ai backends install "ocifile://$(abspath ./backend-images/privacy-filter.tar)"
build-darwin-python-backend: build
bash ./scripts/build/python-darwin.sh
@@ -1449,13 +1453,32 @@ docs: docs/static/gallery.html
########################################################
## fyne cross-platform build
build-launcher-darwin: build-launcher
go run github.com/tiagomelo/macos-dmg-creator/cmd/createdmg@latest \
--appName "LocalAI" \
--appBinaryPath "$(LAUNCHER_BINARY_NAME)" \
--bundleIdentifier "com.localai.launcher" \
--iconPath "core/http/static/logo.png" \
--outputDir "dist/"
# Build LocalAI.app from the launcher via fyne (metadata read from cmd/launcher/FyneApp.toml).
# Signing happens via contrib/macos/sign-and-notarize.sh, which is a no-op when the signing
# secrets are unset, so unsigned local/fork builds keep working.
build-launcher-darwin:
rm -rf dist/LocalAI.app cmd/launcher/LocalAI.app
mkdir -p dist
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os darwin -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)
mv cmd/launcher/LocalAI.app dist/LocalAI.app
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.app
# Wrap the (signed) app into a drag-to-Applications DMG via hdiutil, then sign the DMG.
dmg-launcher-darwin: build-launcher-darwin
rm -rf dist/dmg dist/LocalAI.dmg
mkdir -p dist/dmg
cp -R dist/LocalAI.app dist/dmg/LocalAI.app
ln -s /Applications dist/dmg/Applications
hdiutil create -volname "LocalAI" -srcfolder dist/dmg -ov -format UDZO dist/LocalAI.dmg
bash contrib/macos/sign-and-notarize.sh sign dist/LocalAI.dmg
# Submit the DMG to Apple notarization and staple the ticket (no-op without notary secrets).
notarize-launcher-darwin: dmg-launcher-darwin
bash contrib/macos/sign-and-notarize.sh notarize dist/LocalAI.dmg
# Single entrypoint for CI: build -> sign app -> dmg -> sign dmg -> notarize -> staple.
release-launcher-darwin: notarize-launcher-darwin
@echo "dist/LocalAI.dmg is ready"
build-launcher-linux:
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv launcher.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz
cd cmd/launcher && go run fyne.io/tools/cmd/fyne@latest package -os linux -icon ../../core/http/static/logo.png --executable $(LAUNCHER_BINARY_NAME)-linux && mv LocalAI.tar.xz ../../$(LAUNCHER_BINARY_NAME)-linux.tar.xz

View File

@@ -1,5 +1,5 @@
IK_LLAMA_VERSION?=d5507e33ae7ee2b7b41475f08044d3bde3b839ee
IK_LLAMA_VERSION?=b84902d2ad27c34f989f23947200c4b91b1568fd
LLAMA_REPO?=https://github.com/ikawrakow/ik_llama.cpp
CMAKE_ARGS?=

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -13,28 +13,28 @@ grep -e "flags" /proc/cpuinfo | head -1
# ik_llama.cpp requires AVX2 — default to avx2 binary
BINARY=ik-llama-cpp-avx2
if [ -e $CURDIR/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
if [ -e "$CURDIR"/ik-llama-cpp-fallback ] && ! grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 NOT found, using fallback"
BINARY=ik-llama-cpp-fallback
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
else
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec $CURDIR/$BINARY "$@"
exec "$CURDIR"/$BINARY "$@"
# We should never reach this point, however just in case we do, run fallback
exec $CURDIR/ik-llama-cpp-fallback "$@"
exec "$CURDIR"/ik-llama-cpp-fallback "$@"

View File

@@ -1,5 +1,5 @@
LLAMA_VERSION?=8be759e6f70d629638a7eb70db3824cbdcea370b
LLAMA_VERSION?=9d5d882d8cd0f0a9283d87ed5e6fe3ee0d925fb1
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?=

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -16,37 +16,37 @@ BINARY=llama-cpp-fallback
# CPU_ALL_VARIANTS: ggml's backend registry dlopens the best libggml-cpu-*.so for this
# host, so no shell-side AVX probing. GPU images (cublas/sycl/vulkan/hipblas) ship only
# llama-cpp-fallback (the accelerator does the compute), so fall back to it when absent.
if [ -e $CURDIR/llama-cpp-cpu-all ]; then
if [ -e "$CURDIR"/llama-cpp-cpu-all ]; then
BINARY=llama-cpp-cpu-all
fi
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
if [ -e $CURDIR/llama-cpp-grpc ]; then
if [ -e "$CURDIR"/llama-cpp-grpc ]; then
BINARY=llama-cpp-grpc
fi
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH=$CURDIR/lib:$DYLD_FALLBACK_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
#export DYLD_FALLBACK_LIBRARY_PATH="$CURDIR"/lib:$DYLD_FALLBACK_LIBRARY_PATH
else
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
if [ -d "$CURDIR/lib/rocblas/library" ]; then
export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
fi
fi
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec $CURDIR/$BINARY "$@"
exec "$CURDIR"/$BINARY "$@"
# We should never reach this point, however just in case we do, run fallback
exec $CURDIR/llama-cpp-fallback "$@"
exec "$CURDIR"/llama-cpp-fallback "$@"

View File

@@ -51,6 +51,14 @@ add_library(hw_grpc_proto STATIC
${HW_GRPC_SRCS} ${HW_GRPC_HDRS}
${HW_PROTO_SRCS} ${HW_PROTO_HDRS})
target_include_directories(hw_grpc_proto PUBLIC ${CMAKE_CURRENT_BINARY_DIR})
# The generated proto/grpc sources include protobuf and grpc++ headers, so this
# library must see their include dirs. Linking the imported targets propagates
# them. On Linux the apt headers live in /usr/include (default search path) so
# this was a no-op; on macOS the Homebrew headers are under /opt/homebrew and
# would otherwise be missed (runtime_version.h not found).
target_link_libraries(hw_grpc_proto PUBLIC
protobuf::libprotobuf
gRPC::grpc++)
# Build only the pf static lib (+ ggml) from the engine tree — no CLI/bench/tests.
# PF_VULKAN is honored when passed on the cmake command line (it lands in the

View File

@@ -2,7 +2,13 @@
# Entry point for the privacy-filter backend image / BACKEND_BINARY mode.
set -e
CURDIR=$(dirname "$(realpath "$0")")
export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
# macOS has no bundled ld.so; the darwin package ships only dylibs under lib/,
# resolved via DYLD_LIBRARY_PATH (the ld.so branch below is skipped there).
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:$DYLD_LIBRARY_PATH"
else
export LD_LIBRARY_PATH="$CURDIR/lib:$LD_LIBRARY_PATH"
fi
if [ -f "$CURDIR/lib/ld.so" ]; then
exec "$CURDIR/lib/ld.so" "$CURDIR/grpc-server" "$@"
fi

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,36 +15,36 @@ BINARY=turboquant-fallback
# x86/arm64 ship a single turboquant-cpu-all built with ggml CPU_ALL_VARIANTS: ggml's
# backend registry dlopens the best libggml-cpu-*.so for this host, so no shell-side
# probing. ROCm ships only turboquant-fallback, so fall back to it when cpu-all is absent.
if [ -e $CURDIR/turboquant-cpu-all ]; then
if [ -e "$CURDIR"/turboquant-cpu-all ]; then
BINARY=turboquant-cpu-all
fi
if [ -n "$LLAMACPP_GRPC_SERVERS" ]; then
if [ -e $CURDIR/turboquant-grpc ]; then
if [ -e "$CURDIR"/turboquant-grpc ]; then
BINARY=turboquant-grpc
fi
fi
# Extend ld library path with the dir where this script is located/lib
if [ "$(uname)" == "Darwin" ]; then
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
# Tell rocBLAS where to find TensileLibrary data (GPU kernel tuning files)
if [ -d "$CURDIR/lib/rocblas/library" ]; then
export ROCBLAS_TENSILE_LIBPATH=$CURDIR/lib/rocblas/library
export ROCBLAS_TENSILE_LIBPATH="$CURDIR"/lib/rocblas/library
fi
fi
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using binary: $BINARY"
exec $CURDIR/lib/ld.so $CURDIR/$BINARY "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/$BINARY "$@"
fi
echo "Using binary: $BINARY"
exec $CURDIR/$BINARY "$@"
exec "$CURDIR"/$BINARY "$@"
# We should never reach this point, however just in case we do, run fallback
exec $CURDIR/turboquant-fallback "$@"
exec "$CURDIR"/turboquant-fallback "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -21,20 +21,20 @@ if [ "$(uname)" = "Darwin" ]; then
if [ ! -e "$LIBRARY" ]; then
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
fi
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgoacestepcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgoacestepcpp-avx.so ]; then
if [ -e "$CURDIR"/libgoacestepcpp-avx.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgoacestepcpp-avx2.so ]; then
if [ -e "$CURDIR"/libgoacestepcpp-avx2.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx2.so"
fi
fi
@@ -42,22 +42,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgoacestepcpp-avx512.so ]; then
if [ -e "$CURDIR"/libgoacestepcpp-avx512.so ]; then
LIBRARY="$CURDIR/libgoacestepcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export ACESTEP_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/acestep-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/acestep-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/acestep-cpp "$@"
exec "$CURDIR"/acestep-cpp "$@"

View File

@@ -4,10 +4,10 @@ set -e
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${DYLD_LIBRARY_PATH:-}"
export DYLD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${DYLD_LIBRARY_PATH:-}"
export CED_LIBRARY="$CURDIR/lib/libced.dylib"
else
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${LD_LIBRARY_PATH:-}"
fi
# If a self-contained ld.so was packaged, route through it so the packaged

View File

@@ -1,6 +1,6 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
exec $CURDIR/cloud-proxy "$@"
exec "$CURDIR"/cloud-proxy "$@"

View File

@@ -8,7 +8,7 @@ JOBS?=$(shell nproc --ignore=1)
# CrispASR version (release tag)
CRISPASR_REPO?=https://github.com/CrispStrobe/CrispASR
CRISPASR_VERSION?=96b2a6ee31d30389fed8a7ef1a54239b75231ddc
CRISPASR_VERSION?=8f1218141b792b8868861c1af17ba1e361b05dc0
SO_TARGET?=libgocrispasr.so
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgocrispasr-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgocrispasr-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgocrispasr-avx.so ]; then
if [ -e "$CURDIR"/libgocrispasr-avx.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgocrispasr-avx2.so ]; then
if [ -e "$CURDIR"/libgocrispasr-avx2.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx2.so"
fi
fi
@@ -36,12 +36,12 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgocrispasr-avx512.so ]; then
if [ -e "$CURDIR"/libgocrispasr-avx512.so ]; then
LIBRARY="$CURDIR/libgocrispasr-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export CRISPASR_LIBRARY=$LIBRARY
@@ -49,14 +49,14 @@ export CRISPASR_LIBRARY=$LIBRARY
# Point piper's espeak-ng phonemizer at the bundled voice data. The variable
# names the directory CONTAINING espeak-ng-data (package.sh drops it next to
# this script). Harmless when espeak-ng wasn't bundled.
export CRISPASR_ESPEAK_DATA_PATH=$CURDIR
export CRISPASR_ESPEAK_DATA_PATH="$CURDIR"
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/crispasr "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/crispasr "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/crispasr "$@"
exec "$CURDIR"/crispasr "$@"

View File

@@ -40,6 +40,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DDA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libdepthanythingcpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libdepthanythingcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libdepthanythingcpp-avx.so ]; then
if [ -e "$CURDIR"/libdepthanythingcpp-avx.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libdepthanythingcpp-avx2.so ]; then
if [ -e "$CURDIR"/libdepthanythingcpp-avx2.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libdepthanythingcpp-avx512.so ]; then
if [ -e "$CURDIR"/libdepthanythingcpp-avx512.so ]; then
LIBRARY="$CURDIR/libdepthanythingcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export DEPTHANYTHING_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/depth-anything-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/depth-anything-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/depth-anything-cpp "$@"
exec "$CURDIR"/depth-anything-cpp "$@"

View File

@@ -1,6 +1,6 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
exec $CURDIR/local-store "$@"
exec "$CURDIR"/local-store "$@"

View File

@@ -32,6 +32,8 @@ endif
ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLOCALVQE_VULKAN=ON
else ifeq ($(OS),Darwin)
# Apple Silicon: CPU-only (no Metal upstream); built + published as an arm64
# image by CI (includeDarwin in .github/backend-matrix.yml) for macOS install.
CMAKE_ARGS+=-DGGML_METAL=OFF
endif

View File

@@ -1,34 +1,34 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
# LocalVQE's runtime CPU-variant loader (ggml_backend_load_all) searches
# get_executable_path() and current_path() — the second one is what saves us
# when /proc/self/exe resolves to lib/ld.so under the bundled-loader path.
# So we cd into $CURDIR (where all the libggml-cpu-*.so files live) before
# So we cd into "$CURDIR" (where all the libggml-cpu-*.so files live) before
# exec'ing the binary.
cd "$CURDIR"
if [ "$(uname)" = "Darwin" ]; then
# macOS: LocalVQE is built as a SHARED library, so dyld needs the .dylib +
# DYLD_LIBRARY_PATH. Prefer .dylib and fall back to .so just in case.
export DYLD_LIBRARY_PATH=$CURDIR:$CURDIR/lib:$DYLD_LIBRARY_PATH
LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.dylib
export DYLD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$DYLD_LIBRARY_PATH
LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.dylib
if [ ! -e "$LOCALVQE_LIBRARY" ]; then
LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.so
LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
fi
export LOCALVQE_LIBRARY
else
export LD_LIBRARY_PATH=$CURDIR:$CURDIR/lib:$LD_LIBRARY_PATH
export LOCALVQE_LIBRARY=$CURDIR/liblocalvqe.so
export LD_LIBRARY_PATH="$CURDIR":"$CURDIR"/lib:$LD_LIBRARY_PATH
export LOCALVQE_LIBRARY="$CURDIR"/liblocalvqe.so
fi
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LOCALVQE_LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/localvqe "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/localvqe "$@"
fi
echo "Using library: $LOCALVQE_LIBRARY"
exec $CURDIR/localvqe "$@"
exec "$CURDIR"/localvqe "$@"

View File

@@ -33,6 +33,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DLA_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/liblocateanythingcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/liblocateanythingcpp-avx.so ]; then
if [ -e "$CURDIR"/liblocateanythingcpp-avx.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/liblocateanythingcpp-avx2.so ]; then
if [ -e "$CURDIR"/liblocateanythingcpp-avx2.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/liblocateanythingcpp-avx512.so ]; then
if [ -e "$CURDIR"/liblocateanythingcpp-avx512.so ]; then
LIBRARY="$CURDIR/liblocateanythingcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export LOCATEANYTHING_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/locate-anything-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/locate-anything-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/locate-anything-cpp "$@"
exec "$CURDIR"/locate-anything-cpp "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgomnivoicecpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgomnivoicecpp-avx.so ]; then
if [ -e "$CURDIR"/libgomnivoicecpp-avx.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgomnivoicecpp-avx2.so ]; then
if [ -e "$CURDIR"/libgomnivoicecpp-avx2.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgomnivoicecpp-avx512.so ]; then
if [ -e "$CURDIR"/libgomnivoicecpp-avx512.so ]; then
LIBRARY="$CURDIR/libgomnivoicecpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export OMNIVOICE_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/omnivoice-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/omnivoice-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/omnivoice-cpp "$@"
exec "$CURDIR"/omnivoice-cpp "$@"

View File

@@ -1,13 +1,30 @@
GOCMD?=go
GO_TAGS?=
# The opus shim is a small C wrapper around libopus' variadic
# opus_encoder_ctl (see csrc/opus_shim.c). It is built as a shared library
# and dlopen'd at runtime by the Go backend (codec.go). The extension is
# OS-specific: Linux uses .so, macOS uses .dylib. OS is exported by the root
# Makefile (`export OS := $(shell uname -s)`).
SHIM_EXT=so
OPUS_CFLAGS := $(shell pkg-config --cflags opus)
OPUS_LIBS := $(shell pkg-config --libs opus)
SHIM_LDFLAGS := $(OPUS_LIBS)
libopusshim.so: csrc/opus_shim.c
$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(OPUS_LIBS)
ifeq ($(OS),Darwin)
SHIM_EXT=dylib
# Resolve libopus symbols lazily from the already globally-loaded
# libopus (codec.go dlopens it RTLD_GLOBAL before the shim) rather than
# recording an absolute Homebrew path in the dylib. This keeps the
# packaged shim relocatable on machines that have no Homebrew.
SHIM_LDFLAGS := -undefined dynamic_lookup
endif
opus: libopusshim.so
libopusshim.$(SHIM_EXT): csrc/opus_shim.c
$(CC) -shared -fPIC -o $@ $< $(OPUS_CFLAGS) $(SHIM_LDFLAGS)
opus: libopusshim.$(SHIM_EXT)
$(GOCMD) build -tags "$(GO_TAGS)" -o opus ./
package: opus
@@ -16,4 +33,7 @@ package: opus
build: package
clean:
rm -f opus libopusshim.so
rm -f opus libopusshim.$(SHIM_EXT)
rm -rf package
.PHONY: build package clean

View File

@@ -8,13 +8,23 @@ mkdir -p $CURDIR/package/lib
cp -avf $CURDIR/opus $CURDIR/package/
cp -avf $CURDIR/run.sh $CURDIR/package/
# Copy the opus shim library
cp -avf $CURDIR/libopusshim.so $CURDIR/package/lib/
# The shim extension is OS-specific (.so on Linux, .dylib on macOS).
SHIM_EXT=so
if [ "$(uname)" = "Darwin" ]; then
SHIM_EXT=dylib
fi
# Copy system libopus
# Copy the opus shim library
cp -avf $CURDIR/libopusshim.$SHIM_EXT $CURDIR/package/lib/
# Copy system libopus so the backend is self-contained: the runtime base
# image has neither libopus-dev (Linux) nor Homebrew (macOS), so codec.go's
# dlopen would otherwise fail. Both name patterns are attempted; only the
# host's matching one exists.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists opus; then
LIBOPUS_DIR=$(pkg-config --variable=libdir opus)
cp -avfL $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
cp -avf $LIBOPUS_DIR/libopus.so* $CURDIR/package/lib/ 2>/dev/null || true
cp -avf $LIBOPUS_DIR/libopus*.dylib $CURDIR/package/lib/ 2>/dev/null || true
fi
# Detect architecture and copy appropriate libraries
@@ -38,6 +48,8 @@ elif [ -f "/lib/ld-linux-aarch64.so.1" ]; then
cp -arfLv /lib/aarch64-linux-gnu/libdl.so.2 $CURDIR/package/lib/libdl.so.2
cp -arfLv /lib/aarch64-linux-gnu/librt.so.1 $CURDIR/package/lib/librt.so.1
cp -arfLv /lib/aarch64-linux-gnu/libpthread.so.0 $CURDIR/package/lib/libpthread.so.0
elif [ "$(uname -s)" = "Darwin" ]; then
echo "Detected Darwin — system libraries linked dynamically, no bundled loader needed"
else
echo "Warning: Could not detect architecture for system library bundling"
fi

View File

@@ -1,15 +1,20 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY=$CURDIR/lib/libopusshim.so
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec $CURDIR/lib/ld.so $CURDIR/opus "$@"
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.dylib
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export OPUS_SHIM_LIBRARY="$CURDIR"/lib/libopusshim.so
fi
exec $CURDIR/opus "$@"
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/opus "$@"
fi
exec "$CURDIR"/opus "$@"

View File

@@ -1,6 +1,6 @@
# parakeet-cpp backend Makefile.
#
# Upstream pin lives below as PARAKEET_VERSION?=89f5e2977b4d8bccd45e7bcc6f2ef7c4ed49e89a
# Upstream pin lives below as PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
# (.github/bump_deps.sh) can find and update it - matches the
# whisper.cpp / ds4 / vibevoice-cpp convention.
#
@@ -15,7 +15,7 @@
# That's what the L0 smoke test uses. The default target below does the
# proper clone-at-pin + cmake build so CI doesn't need a side-checkout.
PARAKEET_VERSION?=89f5e2977b4d8bccd45e7bcc6f2ef7c4ed49e89a
PARAKEET_VERSION?=f469a57270a1cc4554acb15febf60e56619673b9
PARAKEET_REPO?=https://github.com/mudler/parakeet.cpp
GOCMD?=go

View File

@@ -4,10 +4,10 @@ set -e
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${DYLD_LIBRARY_PATH:-}"
export DYLD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${DYLD_LIBRARY_PATH:-}"
export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.dylib"
else
export LD_LIBRARY_PATH="$CURDIR/lib:$CURDIR:${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH="$CURDIR/lib:"$CURDIR":${LD_LIBRARY_PATH:-}"
export PARAKEET_LIBRARY="$CURDIR/lib/libparakeet.so"
fi

View File

@@ -16,7 +16,15 @@ cp -rfv $CURDIR/run.sh $CURDIR/package/
cp -rfLv $CURDIR/sources/go-piper/piper-phonemize/pi/lib/* $CURDIR/package/lib/
# Detect architecture and copy appropriate libraries
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
if [ "$(uname)" = "Darwin" ]; then
# macOS has no glibc loader to bundle. The piper binary links its bundled
# libs (libucd, libespeak-ng, libpiper_phonemize, libonnxruntime) via
# @rpath but ships with no LC_RPATH, so dyld aborts at launch with
# "Library not loaded: @rpath/libucd.dylib ... no LC_RPATH's found".
# Add an @loader_path/lib rpath so @rpath resolves to package/lib/.
echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/piper"
elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
# x86_64 architecture
echo "Detected x86_64 architecture, copying x86_64 libraries..."
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so

View File

@@ -1,15 +1,20 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
export ESPEAK_NG_DATA=$CURDIR/espeak-ng-data
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export ESPEAK_NG_DATA="$CURDIR"/espeak-ng-data
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec $CURDIR/lib/ld.so $CURDIR/piper "$@"
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
exec $CURDIR/piper "$@"
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/piper "$@"
fi
exec "$CURDIR"/piper "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgoqwen3ttscpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgoqwen3ttscpp-avx.so ]; then
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgoqwen3ttscpp-avx2.so ]; then
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx2.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgoqwen3ttscpp-avx512.so ]; then
if [ -e "$CURDIR"/libgoqwen3ttscpp-avx512.so ]; then
LIBRARY="$CURDIR/libgoqwen3ttscpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export QWEN3TTS_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/qwen3-tts-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/qwen3-tts-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/qwen3-tts-cpp "$@"
exec "$CURDIR"/qwen3-tts-cpp "$@"

View File

@@ -34,6 +34,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON -DRFDETR_GGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/librfdetrcpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/librfdetrcpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/librfdetrcpp-avx.so ]; then
if [ -e "$CURDIR"/librfdetrcpp-avx.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/librfdetrcpp-avx2.so ]; then
if [ -e "$CURDIR"/librfdetrcpp-avx2.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/librfdetrcpp-avx512.so ]; then
if [ -e "$CURDIR"/librfdetrcpp-avx512.so ]; then
LIBRARY="$CURDIR/librfdetrcpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export RFDETR_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/rfdetr-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/rfdetr-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/rfdetr-cpp "$@"
exec "$CURDIR"/rfdetr-cpp "$@"

View File

@@ -31,6 +31,8 @@ else ifeq ($(BUILD_TYPE),hipblas)
else ifeq ($(BUILD_TYPE),vulkan)
CMAKE_ARGS+=-DGGML_VULKAN=ON
else ifeq ($(OS),Darwin)
# macOS/Metal: built + published as an OCI image by CI (includeDarwin in
# .github/backend-matrix.yml) so Apple Silicon users can install this backend.
ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DGGML_METAL=OFF
else

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgosam3-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgosam3-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgosam3-avx.so ]; then
if [ -e "$CURDIR"/libgosam3-avx.so ]; then
LIBRARY="$CURDIR/libgosam3-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgosam3-avx2.so ]; then
if [ -e "$CURDIR"/libgosam3-avx2.so ]; then
LIBRARY="$CURDIR/libgosam3-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgosam3-avx512.so ]; then
if [ -e "$CURDIR"/libgosam3-avx512.so ]; then
LIBRARY="$CURDIR/libgosam3-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export SAM3_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/sam3-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/sam3-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/sam3-cpp "$@"
exec "$CURDIR"/sam3-cpp "$@"

View File

@@ -1,19 +1,19 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export SHERPA_SHIM_LIBRARY=$CURDIR/lib/libsherpa-shim.dylib
export SHERPA_ONNX_LIBRARY=$CURDIR/lib/libsherpa-onnx-c-api.dylib
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export SHERPA_SHIM_LIBRARY="$CURDIR"/lib/libsherpa-shim.dylib
export SHERPA_ONNX_LIBRARY="$CURDIR"/lib/libsherpa-onnx-c-api.dylib
else
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
exec $CURDIR/lib/ld.so $CURDIR/sherpa-onnx "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/sherpa-onnx "$@"
fi
exec $CURDIR/sherpa-onnx "$@"
exec "$CURDIR"/sherpa-onnx "$@"

View File

@@ -15,7 +15,14 @@ cp -avf $CURDIR/run.sh $CURDIR/package/
cp -rfLv $CURDIR/backend-assets/lib/* $CURDIR/package/lib/
# Detect architecture and copy appropriate libraries
if [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
if [ "$(uname)" = "Darwin" ]; then
# macOS has no glibc loader to bundle. silero-vad links its bundled
# libonnxruntime via @rpath but ships with no LC_RPATH, so dyld can't find
# it at runtime. Add an @loader_path/lib rpath so @rpath resolves to
# package/lib/ (matching the piper darwin fix, #10525).
echo "Detected macOS; adding @loader_path/lib rpath so bundled libs resolve via @rpath..."
install_name_tool -add_rpath @loader_path/lib "$CURDIR/package/silero-vad"
elif [ -f "/lib64/ld-linux-x86-64.so.2" ]; then
# x86_64 architecture
echo "Detected x86_64 architecture, copying x86_64 libraries..."
cp -arfLv /lib64/ld-linux-x86-64.so.2 $CURDIR/package/lib/ld.so

View File

@@ -1,14 +1,18 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
echo "Using lib/ld.so"
exec $CURDIR/lib/ld.so $CURDIR/silero-vad "$@"
if [ "$(uname)" = "Darwin" ]; then
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
exec $CURDIR/silero-vad "$@"
# If there is a lib/ld.so, use it
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
exec "$CURDIR"/lib/ld.so "$CURDIR"/silero-vad "$@"
fi
exec "$CURDIR"/silero-vad "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -20,20 +20,20 @@ if [ "$(uname)" = "Darwin" ]; then
if [ ! -e "$LIBRARY" ]; then
LIBRARY="$CURDIR/libgosd-fallback.so"
fi
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgosd-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgosd-avx.so ]; then
if [ -e "$CURDIR"/libgosd-avx.so ]; then
LIBRARY="$CURDIR/libgosd-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgosd-avx2.so ]; then
if [ -e "$CURDIR"/libgosd-avx2.so ]; then
LIBRARY="$CURDIR/libgosd-avx2.so"
fi
fi
@@ -41,22 +41,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgosd-avx512.so ]; then
if [ -e "$CURDIR"/libgosd-avx512.so ]; then
LIBRARY="$CURDIR/libgosd-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export SD_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/stablediffusion-ggml "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/stablediffusion-ggml "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/stablediffusion-ggml "$@"
exec "$CURDIR"/stablediffusion-ggml "$@"

View File

@@ -1,21 +1,21 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
if [ "$(uname)" = "Darwin" ]; then
# macOS uses dyld: there is no ld.so loader, and the search path env
# var is DYLD_LIBRARY_PATH. ONNX Runtime ships as a .dylib here.
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH=$CURDIR/lib/libonnxruntime.dylib
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.dylib
else
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH=$CURDIR/lib/libonnxruntime.so
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
export ONNXRUNTIME_LIB_PATH="$CURDIR"/lib/libonnxruntime.so
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
exec $CURDIR/lib/ld.so $CURDIR/supertonic "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/supertonic "$@"
fi
fi
exec $CURDIR/supertonic "$@"
exec "$CURDIR"/supertonic "$@"

View File

@@ -1,7 +1,7 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -14,41 +14,41 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgovibevoicecpp-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgovibevoicecpp-avx.so ]; then
if [ -e "$CURDIR"/libgovibevoicecpp-avx.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgovibevoicecpp-avx2.so ]; then
if [ -e "$CURDIR"/libgovibevoicecpp-avx2.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx2.so"
fi
fi
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgovibevoicecpp-avx512.so ]; then
if [ -e "$CURDIR"/libgovibevoicecpp-avx512.so ]; then
LIBRARY="$CURDIR/libgovibevoicecpp-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export VIBEVOICECPP_LIBRARY=$LIBRARY
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/vibevoice-cpp "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/vibevoice-cpp "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/vibevoice-cpp "$@"
exec "$CURDIR"/vibevoice-cpp "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,35 +15,35 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgovoxtral-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgovoxtral-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgovoxtral-avx.so ]; then
if [ -e "$CURDIR"/libgovoxtral-avx.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgovoxtral-avx2.so ]; then
if [ -e "$CURDIR"/libgovoxtral-avx2.so ]; then
LIBRARY="$CURDIR/libgovoxtral-avx2.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export VOXTRAL_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it (Linux only)
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/voxtral "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/voxtral "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/voxtral "$@"
exec "$CURDIR"/voxtral "$@"

View File

@@ -2,7 +2,7 @@
set -ex
# Get the absolute current dir where the script is located
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
cd /
@@ -15,20 +15,20 @@ fi
if [ "$(uname)" = "Darwin" ]; then
# macOS: single dylib variant (Metal or Accelerate)
LIBRARY="$CURDIR/libgowhisper-fallback.dylib"
export DYLD_LIBRARY_PATH=$CURDIR/lib:$DYLD_LIBRARY_PATH
export DYLD_LIBRARY_PATH="$CURDIR"/lib:$DYLD_LIBRARY_PATH
else
LIBRARY="$CURDIR/libgowhisper-fallback.so"
if grep -q -e "\savx\s" /proc/cpuinfo ; then
echo "CPU: AVX found OK"
if [ -e $CURDIR/libgowhisper-avx.so ]; then
if [ -e "$CURDIR"/libgowhisper-avx.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx.so"
fi
fi
if grep -q -e "\savx2\s" /proc/cpuinfo ; then
echo "CPU: AVX2 found OK"
if [ -e $CURDIR/libgowhisper-avx2.so ]; then
if [ -e "$CURDIR"/libgowhisper-avx2.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx2.so"
fi
fi
@@ -36,22 +36,22 @@ else
# Check avx 512
if grep -q -e "\savx512f\s" /proc/cpuinfo ; then
echo "CPU: AVX512F found OK"
if [ -e $CURDIR/libgowhisper-avx512.so ]; then
if [ -e "$CURDIR"/libgowhisper-avx512.so ]; then
LIBRARY="$CURDIR/libgowhisper-avx512.so"
fi
fi
export LD_LIBRARY_PATH=$CURDIR/lib:$LD_LIBRARY_PATH
export LD_LIBRARY_PATH="$CURDIR"/lib:$LD_LIBRARY_PATH
fi
export WHISPER_LIBRARY=$LIBRARY
# If there is a lib/ld.so, use it
if [ -f $CURDIR/lib/ld.so ]; then
if [ -f "$CURDIR"/lib/ld.so ]; then
echo "Using lib/ld.so"
echo "Using library: $LIBRARY"
exec $CURDIR/lib/ld.so $CURDIR/whisper "$@"
exec "$CURDIR"/lib/ld.so "$CURDIR"/whisper "$@"
fi
echo "Using library: $LIBRARY"
exec $CURDIR/whisper "$@"
exec "$CURDIR"/whisper "$@"

View File

@@ -340,6 +340,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp"
intel: "intel-sycl-f32-sam3-cpp"
vulkan: "vulkan-sam3-cpp"
metal: "metal-sam3-cpp"
- &rfdetrcpp
name: "rfdetr-cpp"
alias: "rfdetr-cpp"
@@ -368,6 +369,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp"
intel: "intel-sycl-f32-rfdetr-cpp"
vulkan: "vulkan-rfdetr-cpp"
metal: "metal-rfdetr-cpp"
- &locateanything
name: "locate-anything"
alias: "locate-anything"
@@ -397,6 +399,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp"
intel: "intel-sycl-f32-locate-anything-cpp"
vulkan: "vulkan-locate-anything-cpp"
metal: "metal-locate-anything-cpp"
- !!merge <<: *locateanything
name: "locate-anything-development"
capabilities:
@@ -409,6 +412,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-locate-anything-cpp-development"
intel: "intel-sycl-f32-locate-anything-cpp-development"
vulkan: "vulkan-locate-anything-cpp-development"
metal: "metal-locate-anything-cpp-development"
- !!merge <<: *locateanything
name: "cpu-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-locate-anything-cpp"
@@ -419,6 +423,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-locate-anything-cpp"
mirrors:
- localai/localai-backends:master-cpu-locate-anything-cpp
- !!merge <<: *locateanything
name: "metal-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-locate-anything-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
name: "metal-locate-anything-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-locate-anything-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-locate-anything-cpp
- !!merge <<: *locateanything
name: "cuda12-locate-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-locate-anything-cpp"
@@ -517,6 +531,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp"
intel: "intel-sycl-f32-depth-anything-cpp"
vulkan: "vulkan-depth-anything-cpp"
metal: "metal-depth-anything-cpp"
- !!merge <<: *depthanything
name: "depth-anything-development"
capabilities:
@@ -529,6 +544,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-depth-anything-cpp-development"
intel: "intel-sycl-f32-depth-anything-cpp-development"
vulkan: "vulkan-depth-anything-cpp-development"
metal: "metal-depth-anything-cpp-development"
- !!merge <<: *depthanything
name: "cpu-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-depth-anything-cpp"
@@ -539,6 +555,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-depth-anything-cpp"
mirrors:
- localai/localai-backends:master-cpu-depth-anything-cpp
- !!merge <<: *depthanything
name: "metal-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-depth-anything-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
name: "metal-depth-anything-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-depth-anything-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-depth-anything-cpp
- !!merge <<: *depthanything
name: "cuda12-depth-anything-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-depth-anything-cpp"
@@ -1031,6 +1057,8 @@
nvidia-l4t: "vulkan-localvqe"
nvidia-l4t-cuda-12: "vulkan-localvqe"
nvidia-l4t-cuda-13: "vulkan-localvqe"
# Apple Silicon: CPU build (LocalVQE has no Metal path); still arm64-native.
metal: "metal-localvqe"
- &privacyfilter
name: "privacy-filter"
alias: "privacy-filter"
@@ -1067,6 +1095,7 @@
amd: "vulkan-privacy-filter"
intel: "vulkan-privacy-filter"
vulkan: "vulkan-privacy-filter"
metal: "metal-privacy-filter"
- &faster-whisper
icon: https://avatars.githubusercontent.com/u/1520500?s=200&v=4
description: |
@@ -2909,6 +2938,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-vulkan-privacy-filter"
mirrors:
- localai/localai-backends:master-gpu-vulkan-privacy-filter
- !!merge <<: *privacyfilter
name: "metal-privacy-filter"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-privacy-filter"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-privacy-filter
- !!merge <<: *privacyfilter
name: "metal-privacy-filter-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-privacy-filter"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-privacy-filter
- !!merge <<: *privacyfilter
name: "cuda13-privacy-filter"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-13-privacy-filter"
@@ -3220,6 +3259,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-sam3-cpp-development"
intel: "intel-sycl-f32-sam3-cpp-development"
vulkan: "vulkan-sam3-cpp-development"
metal: "metal-sam3-cpp-development"
- !!merge <<: *sam3cpp
name: "cpu-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-sam3-cpp"
@@ -3230,6 +3270,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-sam3-cpp"
mirrors:
- localai/localai-backends:master-cpu-sam3-cpp
- !!merge <<: *sam3cpp
name: "metal-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-sam3-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
name: "metal-sam3-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-sam3-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-sam3-cpp
- !!merge <<: *sam3cpp
name: "cuda12-sam3-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-sam3-cpp"
@@ -3303,6 +3353,7 @@
nvidia-l4t-cuda-13: "cuda13-nvidia-l4t-arm64-rfdetr-cpp-development"
intel: "intel-sycl-f32-rfdetr-cpp-development"
vulkan: "vulkan-rfdetr-cpp-development"
metal: "metal-rfdetr-cpp-development"
- !!merge <<: *rfdetrcpp
name: "cpu-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-cpu-rfdetr-cpp"
@@ -3313,6 +3364,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-cpu-rfdetr-cpp"
mirrors:
- localai/localai-backends:master-cpu-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "metal-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-rfdetr-cpp"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "metal-rfdetr-cpp-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-rfdetr-cpp"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-rfdetr-cpp
- !!merge <<: *rfdetrcpp
name: "cuda12-rfdetr-cpp"
uri: "quay.io/go-skynet/local-ai-backends:latest-gpu-nvidia-cuda-12-rfdetr-cpp"
@@ -4101,6 +4162,16 @@
uri: "quay.io/go-skynet/local-ai-backends:master-gpu-vulkan-localvqe"
mirrors:
- localai/localai-backends:master-gpu-vulkan-localvqe
- !!merge <<: *localvqecpp
name: "metal-localvqe"
uri: "quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-localvqe"
mirrors:
- localai/localai-backends:latest-metal-darwin-arm64-localvqe
- !!merge <<: *localvqecpp
name: "metal-localvqe-development"
uri: "quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-localvqe"
mirrors:
- localai/localai-backends:master-metal-darwin-arm64-localvqe
## kokoro
- !!merge <<: *kokoro
name: "kokoro-development"

View File

@@ -1,23 +1,23 @@
#!/bin/bash
set -ex
CURDIR=$(dirname "$(realpath $0)")
CURDIR=$(dirname "$(realpath "$0")")
export LD_LIBRARY_PATH=$CURDIR/lib:${LD_LIBRARY_PATH:-}
export LD_LIBRARY_PATH="$CURDIR"/lib:${LD_LIBRARY_PATH:-}
# SSL certificates for model auto-download
if [ -d "$CURDIR/etc/ssl/certs" ]; then
export SSL_CERT_DIR=$CURDIR/etc/ssl/certs
export SSL_CERT_DIR="$CURDIR"/etc/ssl/certs
fi
# espeak-ng data directory
if [ -d "$CURDIR/espeak-ng-data" ]; then
export ESPEAK_NG_DATA=$CURDIR/espeak-ng-data
export ESPEAK_NG_DATA="$CURDIR"/espeak-ng-data
fi
# Use bundled ld.so if present (portability)
if [ -f $CURDIR/lib/ld.so ]; then
exec $CURDIR/lib/ld.so $CURDIR/kokoros-grpc "$@"
if [ -f "$CURDIR"/lib/ld.so ]; then
exec "$CURDIR"/lib/ld.so "$CURDIR"/kokoros-grpc "$@"
fi
exec $CURDIR/kokoros-grpc "$@"
exec "$CURDIR"/kokoros-grpc "$@"

View File

@@ -0,0 +1,8 @@
Website = "https://localai.io"
[Details]
Icon = "../../core/http/static/logo.png"
Name = "LocalAI"
ID = "com.localai.launcher"
Version = "0.0.0"
Build = 1

View File

@@ -0,0 +1,14 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
<key>com.apple.security.cs.allow-jit</key>
<true/>
<key>com.apple.security.cs.allow-unsigned-executable-memory</key>
<true/>
</dict>
</plist>

View File

@@ -0,0 +1,84 @@
#!/usr/bin/env bash
# Code-sign and notarize macOS artifacts for LocalAI.
# Every sub-command is a no-op (exit 0) when its required secret is unset,
# so unsigned builds (forks, local dev, PRs) keep working.
set -euo pipefail
ENTITLEMENTS="contrib/macos/Launcher.entitlements"
KEYCHAIN="localai-ci.keychain-db"
cmd_import_cert() {
if [ -z "${MACOS_CERTIFICATE:-}" ]; then
echo "[sign] MACOS_CERTIFICATE unset: skipping cert import (unsigned build)"
return 0
fi
local certfile keychain_pwd default_keychain
certfile="$(mktemp).p12"
keychain_pwd="${MACOS_CI_KEYCHAIN_PWD:?MACOS_CI_KEYCHAIN_PWD required when signing}"
echo "$MACOS_CERTIFICATE" | base64 --decode > "$certfile"
security create-keychain -p "$keychain_pwd" "$KEYCHAIN"
security set-keychain-settings -lut 21600 "$KEYCHAIN"
security unlock-keychain -p "$keychain_pwd" "$KEYCHAIN"
security import "$certfile" -k "$KEYCHAIN" -P "${MACOS_CERTIFICATE_PWD:?}" \
-T /usr/bin/codesign -T /usr/bin/security
security set-key-partition-list -S apple-tool:,apple:,codesign: \
-s -k "$keychain_pwd" "$KEYCHAIN" >/dev/null
default_keychain="$(security default-keychain | tr -d ' "')"
security list-keychains -d user -s "$KEYCHAIN" "$default_keychain"
rm -f "$certfile"
echo "[sign] certificate imported into $KEYCHAIN"
}
cmd_sign() {
local target="$1"
if [ -z "${MACOS_SIGN_IDENTITY:-}" ]; then
echo "[sign] MACOS_SIGN_IDENTITY unset: skipping codesign of $target"
return 0
fi
case "$target" in
*.app)
# Hardened runtime + entitlements are required for notarizing the app bundle.
codesign --deep --force --options runtime --timestamp \
--entitlements "$ENTITLEMENTS" \
--sign "$MACOS_SIGN_IDENTITY" "$target"
;;
*)
# A disk image carries no entitlements/runtime; just sign the container.
codesign --force --timestamp --sign "$MACOS_SIGN_IDENTITY" "$target"
;;
esac
codesign --verify --strict --verbose=2 "$target"
echo "[sign] signed $target"
}
cmd_notarize() {
local dmg="$1"
if [ -z "${MACOS_NOTARY_KEY:-}" ]; then
echo "[notarize] MACOS_NOTARY_KEY unset: skipping notarization of $dmg"
return 0
fi
local keyfile
keyfile="$(mktemp).p8"
echo "$MACOS_NOTARY_KEY" | base64 --decode > "$keyfile"
xcrun notarytool submit "$dmg" \
--key "$keyfile" \
--key-id "${MACOS_NOTARY_KEY_ID:?}" \
--issuer "${MACOS_NOTARY_ISSUER_ID:?}" \
--wait
rm -f "$keyfile"
xcrun stapler staple "$dmg"
xcrun stapler validate "$dmg"
echo "[notarize] notarized and stapled $dmg"
}
main() {
local sub="${1:-}"; shift || true
case "$sub" in
import-cert) cmd_import_cert ;;
sign) cmd_sign "$@" ;;
notarize) cmd_notarize "$@" ;;
*) echo "usage: $0 {import-cert|sign <path>|notarize <dmg>}" >&2; exit 2 ;;
esac
}
main "$@"

View File

@@ -37,6 +37,8 @@ func (a *Application) RestartAgentJobService() error {
if d.JobStore != nil {
agentJobService.SetDistributedJobStore(d.JobStore)
}
// Keep agent tasks consistent across replicas (same client the dispatcher uses).
agentJobService.SetTaskSyncNATS(d.Nats)
}
// Start the service

View File

@@ -604,6 +604,10 @@ func (a *Application) StartAgentPool() {
usm.SetJobDBStore(s)
}
}
// Keep per-user agent tasks consistent across replicas (nil in standalone).
if d := a.Distributed(); d != nil {
usm.SetJobSyncNATS(d.Nats)
}
aps.SetUserServicesManager(usm)
a.agentPoolService.Store(aps)

View File

@@ -16,6 +16,7 @@ import (
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/jobs"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/core/services/monitoring"
"github.com/mudler/LocalAI/core/services/nodes"
"github.com/mudler/LocalAI/core/services/routing/admission"
@@ -279,6 +280,9 @@ func New(opts ...config.AppOption) (*Application, error) {
if application.agentJobService != nil {
application.agentJobService.SetDistributedBackends(distSvc.Dispatcher)
application.agentJobService.SetDistributedJobStore(distSvc.JobStore)
// Keep agent tasks consistent across replicas (jobs already sync via the
// dispatcher + DB read-through). Same NATS client the dispatcher uses.
application.agentJobService.SetTaskSyncNATS(distSvc.Nats)
}
// Wire skill store into AgentPoolService (wired at pool start time via closure)
// The actual wiring happens in StartAgentPool since the pool doesn't exist yet.
@@ -330,9 +334,14 @@ func New(opts ...config.AppOption) (*Application, error) {
gs := application.galleryService
sys := options.SystemState
cfgLoaderOpts := options.ToConfigLoaderOptions()
gs.OnModelsChanged = func(_ messaging.CacheInvalidateEvent) {
if err := application.ModelConfigLoader().LoadModelConfigsFromPath(sys.Model.ModelsPath, cfgLoaderOpts...); err != nil {
xlog.Warn("Failed to reload model configs after peer invalidation", "error", err)
gs.OnModelsChanged = func(evt messaging.CacheInvalidateEvent) {
// ApplyRemoteChange honors the op: a "delete" prunes the element
// (a reload-from-path is additive and cannot drop it), anything
// else reloads from disk; a named element's running instance is
// shut down so the new config takes effect. The originating
// replica reloads inline and never depends on this path.
if err := modeladmin.ApplyRemoteChange(application.ModelConfigLoader(), application.modelLoader, sys.Model.ModelsPath, evt, cfgLoaderOpts...); err != nil {
xlog.Warn("Failed to apply peer model config change", "error", err)
}
}
if err := application.galleryService.SubscribeBroadcasts(); err != nil {

View File

@@ -203,6 +203,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
system.WithBackendImagesReleaseTag(r.BackendImagesReleaseTag),
system.WithBackendImagesBranchTag(r.BackendImagesBranchTag),
system.WithBackendDevSuffix(r.BackendDevSuffix),
system.WithPreferDevelopmentBackends(r.PreferDevelopmentBackends),
)
if err != nil {
return err

View File

@@ -59,6 +59,22 @@ func getFallbackTagValues(systemState *system.SystemState) (latestTag, masterTag
return latestTag, masterTag, devSuffix
}
// developmentURI returns the development image URI for a released backend URI by
// swapping the released tag for the branch tag (e.g.
// latest-metal-darwin-arm64-llama-cpp -> master-metal-darwin-arm64-llama-cpp).
// The branch image tracks development. ok is false when uri has no released tag
// to swap or already uses the branch tag.
func developmentURI(uri, latestTag, masterTag string) (string, bool) {
if strings.Contains(uri, masterTag+"-") {
return "", false
}
branchURI := strings.Replace(uri, latestTag+"-", masterTag+"-", 1)
if branchURI == uri {
return "", false
}
return branchURI, true
}
// backendCandidate represents an installed concrete backend option for a given alias
type backendCandidate struct {
name string
@@ -295,15 +311,28 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
return fmt.Errorf("backend %q: %w", config.Name, optsErr)
}
uri := downloader.URI(config.URI)
// PreferDevelopmentBackends installs the development image as the primary URI,
// keeping the released image reachable as the first fallback — instead of only
// reaching development when the released image is missing.
primaryURI := string(config.URI)
mirrors := config.Mirrors
if systemState.PreferDevelopmentBackends {
if devURI, ok := developmentURI(string(config.URI), latestTag, masterTag); ok {
xlog.Info("PreferDevelopmentBackends: installing development image first", "development", devURI, "released", config.URI)
primaryURI = devURI
mirrors = append([]string{string(config.URI)}, config.Mirrors...)
}
}
uri := downloader.URI(primaryURI)
// Check if it is a directory
if uri.LooksLikeDir() {
// It is a directory, we just copy it over in the backend folder
if err := cp.Copy(config.URI, backendPath); err != nil {
if err := cp.Copy(string(uri), backendPath); err != nil {
return fmt.Errorf("failed copying: %w", err)
}
} else {
xlog.Debug("Downloading backend", "uri", config.URI, "backendPath", backendPath)
xlog.Debug("Downloading backend", "uri", primaryURI, "backendPath", backendPath)
if err := uri.DownloadFileWithContext(ctx, backendPath, config.SHA256, 1, 1, downloadStatus, downloadOpts...); err != nil {
xlog.Debug("Backend download failed, trying fallback", "backendPath", backendPath, "error", err)
@@ -316,8 +345,9 @@ func InstallBackend(ctx context.Context, systemState *system.SystemState, modelL
}
success := false
// Try to download from mirrors
for _, mirror := range config.Mirrors {
// Try to download from mirrors (when development is preferred, the
// released image is prepended here as the first fallback).
for _, mirror := range mirrors {
// Check for cancellation before trying next mirror
select {
case <-ctx.Done():

View File

@@ -0,0 +1,26 @@
package gallery
import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
var _ = Describe("developmentURI", func() {
const latest, master = "latest", "master"
It("rewrites a released image to its branch (development) image", func() {
got, ok := developmentURI("quay.io/go-skynet/local-ai-backends:latest-metal-darwin-arm64-llama-cpp", latest, master)
Expect(ok).To(BeTrue())
Expect(got).To(Equal("quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-llama-cpp"))
})
It("leaves an image already on the branch tag untouched", func() {
_, ok := developmentURI("quay.io/go-skynet/local-ai-backends:master-metal-darwin-arm64-llama-cpp", latest, master)
Expect(ok).To(BeFalse())
})
It("returns ok=false when there is no released tag to swap", func() {
_, ok := developmentURI("oci://localhost/custom-backend:edge", latest, master)
Expect(ok).To(BeFalse())
})
})

View File

@@ -23,8 +23,10 @@ import (
"github.com/mudler/LocalAI/core/application"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/finetune"
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/nodes"
"github.com/mudler/LocalAI/core/services/quantization"
@@ -400,25 +402,45 @@ func API(application *application.Application) (*echo.Echo, error) {
routes.RegisterAgentPoolRoutes(e, application, agentsMw, skillsMw, collectionsMw)
// Fine-tuning routes
fineTuningMw := auth.RequireFeature(application.AuthDB(), auth.FeatureFineTuning)
// In distributed mode pass the shared NATS client + PostgreSQL store so
// fine-tune jobs stay consistent across replicas (the SyncedMap broadcasts
// mutations and hydrates from the DB); standalone passes nil for both.
var ftNats messaging.MessagingClient
var ftStore *distributed.FineTuneStore
if d := application.Distributed(); d != nil {
ftNats = d.Nats
if d.DistStores != nil && d.DistStores.FineTune != nil {
ftStore = d.DistStores.FineTune
}
}
ftService := finetune.NewFineTuneService(
application.ApplicationConfig(),
application.ModelLoader(),
application.ModelConfigLoader(),
ftNats,
ftStore,
)
if d := application.Distributed(); d != nil {
ftService.SetNATSClient(d.Nats)
if d.DistStores != nil && d.DistStores.FineTune != nil {
ftService.SetFineTuneStore(d.DistStores.FineTune)
}
}
routes.RegisterFineTuningRoutes(e, ftService, application.ApplicationConfig(), fineTuningMw)
// Quantization routes
quantizationMw := auth.RequireFeature(application.AuthDB(), auth.FeatureQuantization)
// In distributed mode pass the shared NATS client + PostgreSQL store so
// quantization jobs stay consistent across replicas (the SyncedMap broadcasts
// mutations and hydrates from the DB); standalone passes nil for both.
var quantNats messaging.MessagingClient
var quantStore *distributed.QuantStore
if d := application.Distributed(); d != nil {
quantNats = d.Nats
if d.DistStores != nil && d.DistStores.Quant != nil {
quantStore = d.DistStores.Quant
}
}
qService := quantization.NewQuantizationService(
application.ApplicationConfig(),
application.ModelLoader(),
application.ModelConfigLoader(),
quantNats,
quantStore,
)
routes.RegisterQuantizationRoutes(e, qService, application.ApplicationConfig(), quantizationMw)

View File

@@ -3,10 +3,51 @@
package auth
import (
"net/url"
"strings"
"gorm.io/driver/sqlite"
"gorm.io/gorm"
)
func openSQLiteDialector(path string) (gorm.Dialector, error) {
return sqlite.Open(path), nil
return sqlite.Open(buildSQLiteDSN(path)), nil
}
// buildSQLiteDSN augments a SQLite file path with connection pragmas that make
// the auth DB resilient on slow or contended storage.
//
// - _busy_timeout=5000 makes SQLite retry for up to 5s on SQLITE_BUSY instead
// of failing immediately. Network-backed storage (SMB/CIFS/NFS, e.g. Azure
// Files) is prone to transient lock contention during migration (see #10506).
// - _txlock=immediate takes the write lock at BEGIN, avoiding deadlocks when a
// read transaction later upgrades to a write during AutoMigrate.
//
// We deliberately do NOT set WAL journal mode: WAL relies on a shared-memory
// mmap that does not work over SMB/NFS, which is exactly the failing case here.
//
// Caller-supplied values for either pragma are preserved.
func buildSQLiteDSN(path string) string {
base := path
rawQuery := ""
if i := strings.IndexByte(path, '?'); i >= 0 {
base = path[:i]
rawQuery = path[i+1:]
}
values, err := url.ParseQuery(rawQuery)
if err != nil {
// An unparseable query string means a hand-crafted DSN we should not
// risk corrupting; leave it untouched.
return path
}
if values.Get("_busy_timeout") == "" {
values.Set("_busy_timeout", "5000")
}
if values.Get("_txlock") == "" {
values.Set("_txlock", "immediate")
}
return base + "?" + values.Encode()
}

View File

@@ -0,0 +1,57 @@
//go:build auth
package auth
import (
"net/url"
"strings"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
// parseDSN splits a "base?query" DSN into its base and decoded query values so
// assertions don't depend on url.Values.Encode()'s key ordering.
func parseDSN(dsn string) (string, url.Values) {
base := dsn
rawQuery := ""
if i := strings.IndexByte(dsn, '?'); i >= 0 {
base = dsn[:i]
rawQuery = dsn[i+1:]
}
values, err := url.ParseQuery(rawQuery)
Expect(err).ToNot(HaveOccurred())
return base, values
}
var _ = Describe("buildSQLiteDSN", func() {
It("adds busy_timeout and txlock to a plain file path", func() {
base, values := parseDSN(buildSQLiteDSN("/data/database.db"))
Expect(base).To(Equal("/data/database.db"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("adds pragmas to an in-memory database", func() {
base, values := parseDSN(buildSQLiteDSN(":memory:"))
Expect(base).To(Equal(":memory:"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("preserves an existing query string", func() {
base, values := parseDSN(buildSQLiteDSN("/data/database.db?cache=shared"))
Expect(base).To(Equal("/data/database.db"))
Expect(values.Get("cache")).To(Equal("shared"))
Expect(values.Get("_busy_timeout")).To(Equal("5000"))
Expect(values.Get("_txlock")).To(Equal("immediate"))
})
It("does not override a caller-supplied busy_timeout or txlock", func() {
_, values := parseDSN(buildSQLiteDSN("/data/database.db?_busy_timeout=1000&_txlock=deferred"))
Expect(values["_busy_timeout"]).To(HaveLen(1), "_busy_timeout should not be duplicated")
Expect(values.Get("_busy_timeout")).To(Equal("1000"))
Expect(values["_txlock"]).To(HaveLen(1), "_txlock should not be duplicated")
Expect(values.Get("_txlock")).To(Equal("deferred"))
})
})

View File

@@ -155,7 +155,7 @@ func AutocompleteEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, a
// @Param name path string true "Model name"
// @Success 200 {object} map[string]any "success message"
// @Router /api/models/config-json/{name} [patch]
func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -173,6 +173,14 @@ func PatchConfigEndpoint(cl *config.ModelConfigLoader, _ *model.ModelLoader, app
if _, err := svc.PatchConfig(c.Request().Context(), modelName, patchMap); err != nil {
return c.JSON(httpStatusForModelAdminError(err), map[string]any{"error": err.Error()})
}
// Patch rewrites the config on disk and reloads only the local loader;
// tell peers to refresh so the change is consistent across replicas.
// No-op in standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelName, "install")
}
return c.JSON(http.StatusOK, map[string]any{
"success": true,
"message": fmt.Sprintf("Model '%s' updated successfully", modelName),

View File

@@ -45,7 +45,7 @@ var _ = Describe("Config Metadata Endpoints", func() {
app = echo.New()
app.GET("/api/models/config-metadata", ConfigMetadataEndpoint())
app.GET("/api/models/config-metadata/autocomplete/:provider", AutocompleteEndpoint(configLoader, modelLoader, appConfig))
app.PATCH("/api/models/config-json/:name", PatchConfigEndpoint(configLoader, modelLoader, appConfig))
app.PATCH("/api/models/config-json/:name", PatchConfigEndpoint(configLoader, modelLoader, nil, appConfig))
})
AfterEach(func() {

View File

@@ -10,6 +10,7 @@ import (
"github.com/labstack/echo/v4"
"github.com/mudler/LocalAI/core/config"
httpUtils "github.com/mudler/LocalAI/core/http/middleware"
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/internal"
"github.com/mudler/LocalAI/pkg/model"
@@ -55,7 +56,7 @@ func GetEditModelPage(cl *config.ModelConfigLoader, appConfig *config.Applicatio
}
// EditModelEndpoint handles updating existing model configurations
func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -70,6 +71,17 @@ func EditModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appC
if err != nil {
return c.JSON(httpStatusForModelAdminError(err), ModelResponse{Success: false, Error: err.Error()})
}
// Tell peer replicas to refresh their in-memory config: this endpoint
// only reloaded the local loader. A rename is a delete of the old name
// plus an install of the new one. No-op in standalone mode.
if gs != nil {
if result.Renamed {
gs.BroadcastModelsChanged(result.OldName, "delete")
}
gs.BroadcastModelsChanged(result.NewName, "install")
}
msg := fmt.Sprintf("Model '%s' updated successfully. Model has been reloaded with new configuration.", result.NewName)
if result.Renamed {
msg = fmt.Sprintf("Model '%s' renamed to '%s' and updated successfully.", result.OldName, result.NewName)

View File

@@ -56,7 +56,7 @@ var _ = Describe("Edit Model test", func() {
app := echo.New()
// Set up a simple renderer for the test
app.Renderer = &testRenderer{}
app.POST("/import-model", ImportModelEndpoint(modelConfigLoader, applicationConfig))
app.POST("/import-model", ImportModelEndpoint(modelConfigLoader, nil, applicationConfig))
app.GET("/edit-model/:name", GetEditModelPage(modelConfigLoader, applicationConfig))
requestBody := bytes.NewBufferString(`{"name": "foo", "backend": "foo", "model": "foo"}`)
@@ -106,7 +106,7 @@ var _ = Describe("Edit Model test", func() {
Expect(exists).To(BeTrue())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
newYAML := "name: newname\nbackend: llama\nmodel: foo\n"
req := httptest.NewRequest("POST", "/models/edit/oldname", bytes.NewBufferString(newYAML))
@@ -163,7 +163,7 @@ var _ = Describe("Edit Model test", func() {
Expect(modelConfigLoader.LoadModelConfigsFromPath(tempDir)).To(Succeed())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
req := httptest.NewRequest(
"POST",
@@ -204,7 +204,7 @@ var _ = Describe("Edit Model test", func() {
Expect(modelConfigLoader.LoadModelConfigsFromPath(tempDir)).To(Succeed())
app := echo.New()
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, applicationConfig))
app.POST("/models/edit/:name", EditModelEndpoint(modelConfigLoader, modelLoader, nil, applicationConfig))
req := httptest.NewRequest(
"POST",

View File

@@ -125,7 +125,7 @@ func ImportModelURIEndpoint(cl *config.ModelConfigLoader, appConfig *config.Appl
}
// ImportModelEndpoint handles creating new model configurations
func ImportModelEndpoint(cl *config.ModelConfigLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func ImportModelEndpoint(cl *config.ModelConfigLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
return func(c echo.Context) error {
// Get the raw body
body, err := io.ReadAll(c.Request().Body)
@@ -245,6 +245,13 @@ func ImportModelEndpoint(cl *config.ModelConfigLoader, appConfig *config.Applica
}
return c.JSON(http.StatusInternalServerError, response)
}
// Tell peer replicas to load the newly-created config from the shared
// models dir: this endpoint only reloaded the local loader. No-op in
// standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelConfig.Name, "install")
}
// Return success response
response := ModelResponse{
Success: true,

View File

@@ -60,7 +60,10 @@ func GetNodeEndpoint(registry *nodes.NodeRegistry) echo.HandlerFunc {
return func(c echo.Context) error {
ctx := c.Request().Context()
id := c.Param("id")
node, err := registry.Get(ctx, id)
// GetWithExtras (not Get) so the response carries the node's labels,
// loaded-model count, and in-flight total — the bare BackendNode keeps
// labels in a separate table, leaving the detail view's label list empty.
node, err := registry.GetWithExtras(ctx, id)
if err != nil {
return c.JSON(http.StatusNotFound, nodeError(http.StatusNotFound, "node not found"))
}

View File

@@ -7,6 +7,7 @@ import (
"github.com/labstack/echo/v4"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/galleryop"
"github.com/mudler/LocalAI/core/services/modeladmin"
"github.com/mudler/LocalAI/pkg/model"
)
@@ -24,7 +25,7 @@ import (
// @Failure 404 {object} ModelResponse
// @Failure 500 {object} ModelResponse
// @Router /api/models/{name}/{action} [put]
func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, appConfig *config.ApplicationConfig) echo.HandlerFunc {
func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoader, gs *galleryop.GalleryService, appConfig *config.ApplicationConfig) echo.HandlerFunc {
svc := modeladmin.NewConfigService(cl, appConfig)
return func(c echo.Context) error {
modelName := c.Param("name")
@@ -36,6 +37,14 @@ func ToggleStateModelEndpoint(cl *config.ModelConfigLoader, ml *model.ModelLoade
if err != nil {
return c.JSON(httpStatusForModelAdminError(err), ModelResponse{Success: false, Error: err.Error()})
}
// Enabling/disabling rewrites the config on disk and reloads only the
// local loader; tell peers to refresh so the model's availability is
// consistent across replicas. No-op in standalone mode.
if gs != nil {
gs.BroadcastModelsChanged(modelName, "install")
}
msg := fmt.Sprintf("Model '%s' has been %sd successfully.", modelName, action)
if action == modeladmin.ActionDisable {
msg += " The model will not be loaded on demand until re-enabled."

View File

@@ -72,19 +72,19 @@ func RegisterLocalAIRoutes(router *echo.Echo,
router.POST("/backends/upgrades/check", backendGalleryEndpointService.CheckUpgradesEndpoint(), adminMiddleware)
router.POST("/backends/upgrade/:name", backendGalleryEndpointService.UpgradeBackendEndpoint(), adminMiddleware)
// Custom model import endpoint
router.POST("/models/import", localai.ImportModelEndpoint(cl, appConfig), adminMiddleware)
router.POST("/models/import", localai.ImportModelEndpoint(cl, galleryService, appConfig), adminMiddleware)
// URI model import endpoint
router.POST("/models/import-uri", localai.ImportModelURIEndpoint(cl, appConfig, galleryService, opcache), adminMiddleware)
// Custom model edit endpoint
router.POST("/models/edit/:name", localai.EditModelEndpoint(cl, ml, appConfig), adminMiddleware)
router.POST("/models/edit/:name", localai.EditModelEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
// List model aliases endpoint
router.GET("/api/aliases", localai.ListAliasesEndpoint(cl), adminMiddleware)
// Toggle model enable/disable endpoint
router.PUT("/models/toggle-state/:name/:action", localai.ToggleStateModelEndpoint(cl, ml, appConfig), adminMiddleware)
router.PUT("/models/toggle-state/:name/:action", localai.ToggleStateModelEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
// Toggle model pinned status endpoint
router.PUT("/models/toggle-pinned/:name/:action", localai.TogglePinnedModelEndpoint(cl, appConfig, func() {

View File

@@ -922,7 +922,7 @@ func RegisterUIAPIRoutes(app *echo.Echo, cl *config.ModelConfigLoader, ml *model
app.GET("/api/models/config-metadata/autocomplete/:provider", localai.AutocompleteEndpoint(cl, ml, appConfig), adminMiddleware)
// PATCH config endpoint - partial update using nested JSON merge
app.PATCH("/api/models/config-json/:name", localai.PatchConfigEndpoint(cl, ml, appConfig), adminMiddleware)
app.PATCH("/api/models/config-json/:name", localai.PatchConfigEndpoint(cl, ml, galleryService, appConfig), adminMiddleware)
// VRAM estimation endpoint
app.POST("/api/models/vram-estimate", localai.VRAMEstimateEndpoint(cl, appConfig), adminMiddleware)

View File

@@ -4,14 +4,59 @@ import (
"context"
"fmt"
"hash/fnv"
"strings"
"sync"
"gorm.io/gorm"
)
// TryWithLockCtx attempts to acquire a PostgreSQL advisory lock using the provided context.
// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if the lock
// was already held, or (false, error) on failure.
// localLocks holds one buffered channel (capacity 1) per lock key, used as an
// in-process mutex for non-PostgreSQL dialects (SQLite). A SQLite auth DB is
// effectively single-process, so serializing guarded sections within this
// process is sufficient - we cannot and need not coordinate across processes
// the way a PostgreSQL advisory lock does.
var (
localLocksMu sync.Mutex
localLocks = map[int64]chan struct{}{}
)
// localLockChan returns the per-key buffered channel, creating it on first use.
func localLockChan(key int64) chan struct{} {
localLocksMu.Lock()
defer localLocksMu.Unlock()
ch, ok := localLocks[key]
if !ok {
ch = make(chan struct{}, 1)
localLocks[key] = ch
}
return ch
}
// isPostgres reports whether the gorm dialect is PostgreSQL. Anything else
// (SQLite and any non-postgres dialect) uses the in-process fallback, because
// the pg_* advisory lock functions only exist on PostgreSQL.
func isPostgres(db *gorm.DB) bool {
return strings.Contains(db.Dialector.Name(), "postgres")
}
// TryWithLockCtx attempts to acquire a lock and run fn without blocking.
// Returns (true, nil) if the lock was acquired and fn executed, (false, nil) if
// the lock was already held, or (false, error) on failure.
//
// On PostgreSQL it uses pg_try_advisory_lock (cross-process). On other dialects
// (SQLite) it uses a non-blocking in-process lock keyed by key.
func TryWithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) (bool, error) {
if !isPostgres(db) {
ch := localLockChan(key)
select {
case ch <- struct{}{}:
defer func() { <-ch }()
return true, fn()
default:
return false, nil
}
}
sqlDB, err := db.DB()
if err != nil {
return false, fmt.Errorf("get sql.DB: %w", err)
@@ -50,9 +95,31 @@ func KeyFromString(s string) int64 {
return int64(h.Sum64()>>1) | 0x100000000
}
// WithLockCtx is like WithLock but respects context cancellation.
// If ctx is cancelled while waiting for the lock, the function returns ctx.Err().
// WithLockCtx acquires a lock for key, runs fn, then releases it, respecting
// context cancellation. If ctx is cancelled while waiting for the lock, the
// function returns ctx.Err().
//
// On PostgreSQL it uses pg_advisory_lock (cross-process). On other dialects
// (SQLite) it falls back to a blocking in-process lock keyed by key, which is
// sufficient because a SQLite auth DB is effectively single-process.
func WithLockCtx(ctx context.Context, db *gorm.DB, key int64, fn func() error) error {
if !isPostgres(db) {
// Honor an already-cancelled context before attempting acquisition:
// select picks a ready case at random, so without this an already-free
// lock could be taken despite a cancelled ctx.
if err := ctx.Err(); err != nil {
return err
}
ch := localLockChan(key)
select {
case ch <- struct{}{}:
defer func() { <-ch }()
return fn()
case <-ctx.Done():
return ctx.Err()
}
}
sqlDB, err := db.DB()
if err != nil {
return fmt.Errorf("advisorylock: getting sql.DB: %w", err)

View File

@@ -0,0 +1,129 @@
package advisorylock
import (
"context"
"sync"
"sync/atomic"
"time"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"gorm.io/driver/sqlite"
"gorm.io/gorm"
)
// These specs run against an in-memory SQLite DB and therefore do NOT require
// Docker, unlike the PostgreSQL testcontainer specs.
var _ = Describe("AdvisoryLock (SQLite fallback)", Label("sqlite"), func() {
var db *gorm.DB
BeforeEach(func() {
var err error
db, err = gorm.Open(sqlite.Open("file::memory:?cache=shared"), &gorm.Config{})
Expect(err).ToNot(HaveOccurred())
Expect(db.Dialector.Name()).To(ContainSubstring("sqlite"))
})
It("WithLockCtx executes fn and returns no error on SQLite", func() {
const lockKey int64 = 12001
executed := false
err := WithLockCtx(context.Background(), db, lockKey, func() error {
executed = true
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(executed).To(BeTrue(), "function should have run under the in-process lock")
})
It("WithLockCtx serializes concurrent goroutines on the same key", func() {
const lockKey int64 = 12002
var (
mu sync.Mutex
maxRunning int32
running int32
concurrency int32
)
var wg sync.WaitGroup
for range 2 {
wg.Go(func() {
defer GinkgoRecover()
err := WithLockCtx(context.Background(), db, lockKey, func() error {
cur := atomic.AddInt32(&running, 1)
mu.Lock()
if cur > maxRunning {
maxRunning = cur
}
if cur > 1 {
atomic.AddInt32(&concurrency, 1)
}
mu.Unlock()
time.Sleep(50 * time.Millisecond)
atomic.AddInt32(&running, -1)
return nil
})
Expect(err).ToNot(HaveOccurred())
})
}
wg.Wait()
Expect(maxRunning).To(BeNumerically("<=", 1), "expected max 1 goroutine inside lock at a time")
Expect(concurrency).To(BeZero(), "detected concurrent execution inside advisory lock")
})
It("WithLockCtx returns an error and does not run fn with an already-cancelled context", func() {
const lockKey int64 = 12003
ctx, cancel := context.WithCancel(context.Background())
cancel()
err := WithLockCtx(ctx, db, lockKey, func() error {
Fail("function should not run with a cancelled context")
return nil
})
Expect(err).To(HaveOccurred())
})
It("TryWithLockCtx returns (true, nil) when free and (false, nil) when held", func() {
const lockKey int64 = 12004
acquired, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(acquired).To(BeTrue(), "expected TryWithLockCtx to acquire the free lock")
// Hold the lock in one goroutine while a concurrent TryWithLockCtx
// attempts to acquire the same key.
held := make(chan struct{})
release := make(chan struct{})
var wg sync.WaitGroup
wg.Go(func() {
defer GinkgoRecover()
ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
close(held)
<-release
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(ok).To(BeTrue())
})
<-held
ok, err := TryWithLockCtx(context.Background(), db, lockKey, func() error {
Fail("function should not run while lock is held")
return nil
})
Expect(err).ToNot(HaveOccurred())
Expect(ok).To(BeFalse(), "expected TryWithLockCtx to fail to acquire a held lock")
close(release)
wg.Wait()
})
})

View File

@@ -30,6 +30,8 @@ import (
mcpTools "github.com/mudler/LocalAI/core/http/endpoints/mcp"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/jobs"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/syncstate"
"github.com/mudler/LocalAI/core/templates"
"github.com/mudler/LocalAI/pkg/httpclient"
"github.com/mudler/LocalAI/pkg/model"
@@ -43,8 +45,18 @@ type AgentJobService struct {
configLoader *config.ModelConfigLoader
evaluator *templates.Evaluator
// tasks is the cross-replica task store: an in-memory map kept consistent
// across replicas via NATS, with read-through to the configured persister
// (file in standalone, PostgreSQL in distributed). Unlike jobs - which already
// converge via the dispatcher + DB read-through - tasks previously read
// in-memory only, so ListTasks went stale on non-originating replicas.
tasks *syncstate.SyncedMap[string, schema.Task]
// taskNats is the distributed NATS client backing the tasks SyncedMap. It is
// not available at construction time, so it is injected via SetTaskSyncNATS
// during distributed wiring; nil keeps tasks in-memory-only (standalone).
taskNats messaging.MessagingClient
// Storage (in-memory primary, persister for secondary persistence)
tasks *xsync.SyncedMap[string, schema.Task]
jobs *xsync.SyncedMap[string, schema.Job]
persister JobPersister
userID string // Scoping: empty for global (main service), set for per-user instances
@@ -96,6 +108,31 @@ func (s *AgentJobService) SetDistributedJobStore(store *jobs.JobStore) {
s.persister = &dbJobPersister{store: store}
}
// SetTaskSyncNATS wires the distributed NATS client used to keep agent *tasks*
// consistent across replicas (jobs already converge via the dispatcher + DB
// read-through, so they are left untouched). The client is not available when the
// service is constructed, so it is injected here during distributed wiring and the
// tasks SyncedMap is rebuilt to pick it up. It is always called before Start /
// hydrate, while the map is still empty, so rebuilding loses no state. Passing nil
// (standalone) keeps the map in-memory-only with no broadcast.
func (s *AgentJobService) SetTaskSyncNATS(nats messaging.MessagingClient) {
s.taskNats = nats
s.buildTasksMap()
}
// buildTasksMap (re)constructs the cross-replica tasks SyncedMap from the current
// taskNats. The Store adapter reads s.persister/s.userID live, so a persister swap
// (SetDistributedJobStore) needs no rebuild; only the NATS client, fixed at
// New-time, forces one - hence SetTaskSyncNATS calls this.
func (s *AgentJobService) buildTasksMap() {
s.tasks = syncstate.New(syncstate.Config[string, schema.Task]{
Name: "agent.tasks",
Key: func(t schema.Task) string { return t.ID },
Nats: s.taskNats,
Store: &taskStoreAdapter{svc: s},
})
}
// Dispatcher returns the distributed dispatcher (nil if not in distributed mode).
func (s *AgentJobService) Dispatcher() DistributedDispatcher {
return s.dispatcher
@@ -106,13 +143,6 @@ func (s *AgentJobService) DBStore() *jobs.JobStore {
return s.rawDBStore
}
// saveTasks persists tasks via the configured persister (file or DB).
func (s *AgentJobService) saveTasks(task schema.Task) {
if err := s.persister.SaveTask(s.userID, task); err != nil {
xlog.Warn("Failed to persist task", "error", err, "task_id", task.ID)
}
}
// saveJobs persists jobs via the configured persister (file or DB).
func (s *AgentJobService) saveJobs(job schema.Job) {
if err := s.persister.SaveJob(s.userID, job); err != nil {
@@ -129,18 +159,8 @@ func (s *AgentJobService) LoadFromDB() {
// loadFromPersister loads tasks and jobs from the configured persister into memory.
func (s *AgentJobService) loadFromPersister() {
if tasks, err := s.persister.LoadTasks(s.userID); err != nil {
if err := s.hydrateTasks(s.appConfig.Context); err != nil {
xlog.Warn("Failed to load tasks from persister", "error", err)
} else {
for _, task := range tasks {
s.tasks.Set(task.ID, task)
if task.Enabled && task.Cron != "" {
if err := s.ScheduleCronTask(task); err != nil {
xlog.Warn("Failed to schedule cron task on load", "error", err, "task_id", task.ID)
}
}
}
xlog.Info("Loaded tasks from persister", "count", len(tasks))
}
if loadedJobs, err := s.persister.LoadJobs(s.userID); err != nil {
@@ -153,6 +173,27 @@ func (s *AgentJobService) loadFromPersister() {
}
}
// hydrateTasks loads tasks into the cross-replica SyncedMap and (re)schedules
// cron entries for enabled tasks. Hydration goes through the SyncedMap's Store
// read-through (Start), not Set, so it neither re-persists nor re-broadcasts the
// loaded tasks. Each service instance hydrates exactly once: the main service via
// Start -> loadFromPersister, per-user services via LoadFromDB or LoadTasksFromFile.
func (s *AgentJobService) hydrateTasks(ctx context.Context) error {
if err := s.tasks.Start(ctx); err != nil {
return err
}
tasks := s.tasks.List()
for _, task := range tasks {
if task.Enabled && task.Cron != "" {
if err := s.ScheduleCronTask(task); err != nil {
xlog.Warn("Failed to schedule cron task on load", "error", err, "task_id", task.ID)
}
}
}
xlog.Info("Loaded tasks from persister", "count", len(tasks))
return nil
}
// JobExecution represents a job to be executed
type JobExecution struct {
Job schema.Job
@@ -200,21 +241,19 @@ func NewAgentJobServiceWithPaths(
) *AgentJobService {
retentionDays := cmp.Or(appConfig.AgentJobRetentionDays, 30)
tasks := xsync.NewSyncedMap[string, schema.Task]()
jobsMap := xsync.NewSyncedMap[string, schema.Job]()
return &AgentJobService{
s := &AgentJobService{
appConfig: appConfig,
modelLoader: modelLoader,
configLoader: configLoader,
evaluator: evaluator,
tasks: tasks,
jobs: jobsMap,
persister: &fileJobPersister{
tasks: tasks,
jobs: jobsMap,
tasksFile: tasksFile,
jobsFile: jobsFile,
taskSet: make(map[string]schema.Task),
},
jobQueue: make(chan JobExecution, 100), // Buffer for 100 jobs
cancellations: xsync.NewSyncedMap[string, context.CancelFunc](),
@@ -222,25 +261,17 @@ func NewAgentJobServiceWithPaths(
cronEntries: xsync.NewSyncedMap[string, cron.EntryID](),
retentionDays: retentionDays,
}
// Build the cross-replica tasks map standalone (nil NATS); SetTaskSyncNATS
// rebuilds it with the distributed client once that is available, before Start.
s.buildTasksMap()
return s
}
// LoadTasksFromFile loads tasks from the persister into the in-memory map
// and schedules cron entries. Named "FromFile" for backward compat; in DB
// mode it loads from the database.
func (s *AgentJobService) LoadTasksFromFile() error {
tasks, err := s.persister.LoadTasks(s.userID)
if err != nil {
return err
}
for _, task := range tasks {
s.tasks.Set(task.ID, task)
if task.Enabled && task.Cron != "" {
if err := s.ScheduleCronTask(task); err != nil {
xlog.Warn("Failed to schedule cron task on load", "error", err, "task_id", task.ID)
}
}
}
return nil
return s.hydrateTasks(s.appConfig.Context)
}
// SaveTasksToFile flushes the current tasks map via the persister. File
@@ -293,8 +324,12 @@ func (s *AgentJobService) CreateTask(task schema.Task) (string, error) {
task.Enabled = true // Default to enabled
}
// Store task
s.tasks.Set(id, task)
// Store task: Set updates the in-memory map, write-throughs to the persister
// (file or DB), and broadcasts the create to peer replicas. Background ctx
// because CreateTask carries no request ctx (mirrors the finetune service).
if err := s.tasks.Set(context.Background(), task); err != nil {
return "", fmt.Errorf("failed to persist task: %w", err)
}
// Schedule cron if enabled and has cron expression
if task.Enabled && task.Cron != "" {
@@ -303,16 +338,15 @@ func (s *AgentJobService) CreateTask(task schema.Task) (string, error) {
}
}
s.saveTasks(task)
return id, nil
}
// UpdateTask updates an existing task
func (s *AgentJobService) UpdateTask(id string, task schema.Task) error {
if !s.tasks.Exists(id) {
existing, ok := s.tasks.Get(id)
if !ok {
return fmt.Errorf("%w: %s", ErrTaskNotFound, id)
}
existing := s.tasks.Get(id)
// Preserve ID and CreatedAt
task.ID = id
@@ -324,8 +358,10 @@ func (s *AgentJobService) UpdateTask(id string, task schema.Task) error {
s.UnscheduleCronTask(id)
}
// Store updated task
s.tasks.Set(id, task)
// Store updated task: write-through + broadcast (see CreateTask).
if err := s.tasks.Set(context.Background(), task); err != nil {
return fmt.Errorf("failed to persist task: %w", err)
}
// Schedule new cron if enabled and has cron expression
if task.Enabled && task.Cron != "" {
@@ -334,24 +370,22 @@ func (s *AgentJobService) UpdateTask(id string, task schema.Task) error {
}
}
s.saveTasks(task)
return nil
}
// DeleteTask deletes a task
func (s *AgentJobService) DeleteTask(id string) error {
if !s.tasks.Exists(id) {
if _, ok := s.tasks.Get(id); !ok {
return fmt.Errorf("%w: %s", ErrTaskNotFound, id)
}
// Unschedule cron
s.UnscheduleCronTask(id)
// Remove from memory
s.tasks.Delete(id)
if err := s.persister.DeleteTask(id); err != nil {
xlog.Warn("Failed to delete task from persister", "error", err, "task_id", id)
// Delete removes from the in-memory map, deletes from the persister, and
// broadcasts the removal to peer replicas.
if err := s.tasks.Delete(context.Background(), id); err != nil {
xlog.Warn("Failed to delete task from store", "error", err, "task_id", id)
}
return nil
@@ -359,8 +393,8 @@ func (s *AgentJobService) DeleteTask(id string) error {
// GetTask retrieves a task by ID
func (s *AgentJobService) GetTask(id string) (*schema.Task, error) {
task := s.tasks.Get(id)
if task.ID == "" {
task, ok := s.tasks.Get(id)
if !ok {
return nil, fmt.Errorf("%w: %s", ErrTaskNotFound, id)
}
return &task, nil
@@ -368,7 +402,7 @@ func (s *AgentJobService) GetTask(id string) (*schema.Task, error) {
// ListTasks returns all tasks, sorted by creation date (newest first)
func (s *AgentJobService) ListTasks() []schema.Task {
tasks := s.tasks.Values()
tasks := s.tasks.List()
// Sort by CreatedAt descending (newest first), then by Name for stability
slices.SortFunc(tasks, func(a, b schema.Task) int {
if a.CreatedAt.Equal(b.CreatedAt) {
@@ -397,8 +431,8 @@ func (s *AgentJobService) buildPrompt(templateStr string, params map[string]stri
// ExecuteJob creates and queues a job for execution
// multimedia can be nil for backward compatibility
func (s *AgentJobService) ExecuteJob(taskID string, params map[string]string, triggeredBy string, multimedia *schema.MultimediaAttachment) (string, error) {
task := s.tasks.Get(taskID)
if task.ID == "" {
task, ok := s.tasks.Get(taskID)
if !ok {
return "", fmt.Errorf("%w: %s", ErrTaskNotFound, taskID)
}
@@ -1451,6 +1485,12 @@ func (s *AgentJobService) Stop() error {
if s.cronScheduler != nil {
s.cronScheduler.Stop()
}
// Release the tasks SyncedMap subscription / background workers.
if s.tasks != nil {
if err := s.tasks.Close(); err != nil {
xlog.Warn("Error closing tasks sync map", "error", err)
}
}
xlog.Info("AgentJobService stopped")
return nil
}

View File

@@ -14,24 +14,38 @@ import (
)
// fileJobPersister persists tasks and jobs to JSON files.
// It holds references to the service's syncmaps and serializes the entire
// map contents on each save (bulk write). Reads at runtime return nil
// (the in-memory map is the authoritative source); LoadTasks/LoadJobs
// are used only at startup to bootstrap the syncmaps.
//
// Jobs serialize the service's in-memory jobs syncmap on each save (bulk write).
// Tasks are kept in this persister's own taskSet map instead: the tasks SyncedMap
// calls SaveTask/DeleteTask while holding its internal lock (write-through), so
// reading back the SyncedMap here would re-enter that lock and deadlock. The
// self-contained taskSet, seeded by LoadTasks, lets a per-task write rewrite the
// whole bulk file without touching the SyncedMap.
//
// Runtime reads (GetJob/ListJobs) return nil (the in-memory state is the
// authoritative source); LoadTasks/LoadJobs bootstrap state at startup.
type fileJobPersister struct {
tasks *xsync.SyncedMap[string, schema.Task]
jobs *xsync.SyncedMap[string, schema.Job]
tasksFile string
jobsFile string
mu sync.Mutex
// taskSet is the persister's own view of all tasks, seeded by LoadTasks and
// updated by SaveTask/DeleteTask. The bulk JSON file is rewritten from it.
taskSet map[string]schema.Task
}
func (p *fileJobPersister) SaveTask(_ string, _ schema.Task) error {
return p.saveTasksToFile()
func (p *fileJobPersister) SaveTask(_ string, task schema.Task) error {
p.mu.Lock()
defer p.mu.Unlock()
p.taskSet[task.ID] = task
return p.writeTasksLocked()
}
func (p *fileJobPersister) DeleteTask(_ string) error {
return p.saveTasksToFile()
func (p *fileJobPersister) DeleteTask(taskID string) error {
p.mu.Lock()
defer p.mu.Unlock()
delete(p.taskSet, taskID)
return p.writeTasksLocked()
}
func (p *fileJobPersister) SaveJob(_ string, _ schema.Job) error {
@@ -43,7 +57,9 @@ func (p *fileJobPersister) DeleteJob(_ string) error {
}
func (p *fileJobPersister) FlushTasks() error {
return p.saveTasksToFile()
p.mu.Lock()
defer p.mu.Unlock()
return p.writeTasksLocked()
}
func (p *fileJobPersister) FlushJobs() error {
@@ -83,6 +99,12 @@ func (p *fileJobPersister) LoadTasks(_ string) ([]schema.Task, error) {
return nil, fmt.Errorf("failed to parse tasks file: %w", err)
}
// Seed the in-memory set so subsequent per-task SaveTask/DeleteTask merge into
// (rather than overwrite) the persisted tasks when the bulk file is rewritten.
for _, t := range tf.Tasks {
p.taskSet[t.ID] = t
}
xlog.Info("Loaded tasks from file", "count", len(tf.Tasks))
return tf.Tasks, nil
}
@@ -118,19 +140,20 @@ func (p *fileJobPersister) CleanupOldJobs(_ time.Duration) (int64, error) {
return 0, nil // cleanup handled via in-memory filtering
}
// saveTasksToFile serializes the entire tasks map to the JSON file.
func (p *fileJobPersister) saveTasksToFile() error {
// writeTasksLocked serializes the persister's task set to the JSON file. Callers
// must hold p.mu.
func (p *fileJobPersister) writeTasksLocked() error {
if p.tasksFile == "" {
return nil
}
p.mu.Lock()
defer p.mu.Unlock()
tf := schema.TasksFile{
Tasks: p.tasks.Values(),
tasks := make([]schema.Task, 0, len(p.taskSet))
for _, t := range p.taskSet {
tasks = append(tasks, t)
}
tf := schema.TasksFile{Tasks: tasks}
data, err := json.MarshalIndent(tf, "", " ")
if err != nil {
return fmt.Errorf("failed to marshal tasks: %w", err)

View File

@@ -20,28 +20,26 @@ var _ = Describe("JobPersister", func() {
Context("fileJobPersister", func() {
var (
p *fileJobPersister
tasks *xsync.SyncedMap[string, schema.Task]
jobsMap *xsync.SyncedMap[string, schema.Job]
tmpDir string
)
BeforeEach(func() {
tmpDir = GinkgoT().TempDir()
tasks = xsync.NewSyncedMap[string, schema.Task]()
jobsMap = xsync.NewSyncedMap[string, schema.Job]()
p = &fileJobPersister{
tasks: tasks,
jobs: jobsMap,
tasksFile: filepath.Join(tmpDir, "tasks.json"),
jobsFile: filepath.Join(tmpDir, "jobs.json"),
// taskSet is the persister's own task view (decoupled from the tasks
// SyncedMap to avoid re-entering its lock during write-through).
taskSet: make(map[string]schema.Task),
}
})
It("SaveTask writes all tasks to file", func() {
tasks.Set("t1", schema.Task{ID: "t1", Name: "Task One", Model: "m", Prompt: "p"})
tasks.Set("t2", schema.Task{ID: "t2", Name: "Task Two", Model: "m", Prompt: "p"})
Expect(p.SaveTask("", schema.Task{})).To(Succeed())
Expect(p.SaveTask("", schema.Task{ID: "t1", Name: "Task One", Model: "m", Prompt: "p"})).To(Succeed())
Expect(p.SaveTask("", schema.Task{ID: "t2", Name: "Task Two", Model: "m", Prompt: "p"})).To(Succeed())
// Verify file contents
data, err := os.ReadFile(p.tasksFile)
@@ -52,11 +50,9 @@ var _ = Describe("JobPersister", func() {
})
It("DeleteTask writes updated tasks to file", func() {
tasks.Set("t1", schema.Task{ID: "t1", Name: "Keep"})
tasks.Set("t2", schema.Task{ID: "t2", Name: "Delete"})
Expect(p.SaveTask("", schema.Task{ID: "t1", Name: "Keep"})).To(Succeed())
Expect(p.SaveTask("", schema.Task{ID: "t2", Name: "Delete"})).To(Succeed())
// Simulate deletion from memory (caller does this before calling persister)
tasks.Delete("t2")
Expect(p.DeleteTask("t2")).To(Succeed())
data, err := os.ReadFile(p.tasksFile)

View File

@@ -0,0 +1,152 @@
package agentpool
// White-box tests (package agentpool) so a spec can build two AgentJobService
// instances sharing one in-memory bus and assert that agent *tasks* converge
// across replicas - the bug this migration fixes (ListTasks used to read
// in-memory only, so a task created on replica A was invisible on replica B).
// Jobs are deliberately untouched here: they already converge via the dispatcher
// + DB read-through.
import (
"context"
"time"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/syncstate"
"github.com/mudler/LocalAI/core/services/testutil"
"github.com/mudler/LocalAI/pkg/system"
)
// newTaskSyncService builds an AgentJobService wired to the given bus and a
// throwaway data dir (so the file persister has somewhere to write). Model/config
// loaders are nil because the task sync paths under test never touch them.
func newTaskSyncService(bus messaging.MessagingClient) *AgentJobService {
tmpDir := GinkgoT().TempDir()
sysState := &system.SystemState{}
sysState.Model.ModelsPath = tmpDir
appConfig := config.NewApplicationConfig(
config.WithDynamicConfigDir(tmpDir),
config.WithContext(context.Background()),
)
appConfig.SystemState = sysState
svc := NewAgentJobServiceWithPaths(appConfig, nil, nil, nil,
// Distinct per-replica files so the file persister write-through never
// crosses replicas: convergence here must be proven via the bus alone.
tmpDir+"/tasks.json", tmpDir+"/jobs.json")
svc.SetTaskSyncNATS(bus)
return svc
}
var _ = Describe("AgentJobService task cross-replica sync", func() {
Describe("two replicas sharing one bus", func() {
var (
bus *testutil.FakeBus
a, b *AgentJobService
)
BeforeEach(func() {
// One shared bus, two replicas: exactly the distributed topology where a
// round-robin request may land on a replica that did not originate the
// change.
bus = testutil.NewFakeBus()
a = newTaskSyncService(bus)
b = newTaskSyncService(bus)
// Start hydrates (empty here) and subscribes both replicas to deltas.
Expect(a.Start(context.Background())).To(Succeed())
Expect(b.Start(context.Background())).To(Succeed())
})
AfterEach(func() {
Expect(a.Stop()).To(Succeed())
Expect(b.Stop()).To(Succeed())
})
It("makes a task created on A visible via B's GetTask and ListTasks", func() {
id, err := a.CreateTask(schema.Task{Name: "Shared", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
got, err := b.GetTask(id)
Expect(err).NotTo(HaveOccurred(), "B must see a task A just created")
Expect(got.Name).To(Equal("Shared"))
listed := b.ListTasks()
Expect(listed).To(HaveLen(1))
Expect(listed[0].ID).To(Equal(id))
})
It("propagates a task update from A to B", func() {
id, err := a.CreateTask(schema.Task{Name: "Before", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
Expect(a.UpdateTask(id, schema.Task{Name: "After", Model: "m", Prompt: "p"})).To(Succeed())
got, err := b.GetTask(id)
Expect(err).NotTo(HaveOccurred())
Expect(got.Name).To(Equal("After"), "an update on A must be visible on B")
})
It("removes a task from B when it is deleted on A", func() {
id, err := a.CreateTask(schema.Task{Name: "Doomed", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
_, err = b.GetTask(id)
Expect(err).NotTo(HaveOccurred(), "precondition: B must have the task before the delete")
Expect(a.DeleteTask(id)).To(Succeed())
_, err = b.GetTask(id)
Expect(err).To(HaveOccurred(), "a delete on A must remove the task from B")
Expect(b.ListTasks()).To(BeEmpty())
})
It("does not re-broadcast a delta it received (echo-loop guard)", func() {
subject := messaging.SubjectSyncStateDelta("agent.tasks")
_, err := a.CreateTask(schema.Task{Name: "Once", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
// Exactly one publish: A's create. B applies it without re-publishing,
// otherwise this would be 2+ and a real bus would storm.
Expect(bus.PublishCount(subject)).To(Equal(1))
})
})
Describe("ListTasks ordering and scoping", func() {
var svc *AgentJobService
BeforeEach(func() {
svc = newTaskSyncService(testutil.NewFakeBus())
Expect(svc.Start(context.Background())).To(Succeed())
})
AfterEach(func() { Expect(svc.Stop()).To(Succeed()) })
It("sorts newest-first, breaking ties by name", func() {
// CreateTask stamps CreatedAt with time.Now(); space them out so ordering
// is deterministic rather than relying on the sub-millisecond gap.
oldID, err := svc.CreateTask(schema.Task{Name: "Old", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
time.Sleep(5 * time.Millisecond)
newID, err := svc.CreateTask(schema.Task{Name: "New", Model: "m", Prompt: "p"})
Expect(err).NotTo(HaveOccurred())
listed := svc.ListTasks()
Expect(listed).To(HaveLen(2))
Expect(listed[0].ID).To(Equal(newID), "newest first")
Expect(listed[1].ID).To(Equal(oldID))
})
})
Describe("compile-time adapter contract", func() {
It("satisfies syncstate.Store for tasks", func() {
// Mirrors the var assertion in task_syncstore.go; keeps the type
// referenced from a spec so drift surfaces here too.
var _ syncstate.Store[string, schema.Task] = (*taskStoreAdapter)(nil)
Expect(&taskStoreAdapter{}).ToNot(BeNil())
})
})
})

View File

@@ -0,0 +1,47 @@
package agentpool
import (
"context"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/syncstate"
)
// taskStoreAdapter bridges the existing JobPersister (file- or DB-backed) to the
// generic syncstate.Store the tasks SyncedMap consumes. Only tasks are migrated:
// jobs already converge across replicas via the dispatcher (NATS) plus the DB
// read-through in ListJobs/GetJob, whereas ListTasks read in-memory only and so
// went stale on replicas that did not originate the change.
//
// The adapter reads svc.persister and svc.userID live (rather than capturing
// them) because both are configured by setters - SetDistributedJobStore swaps the
// file persister for the DB one, SetUserID scopes per-user queries - AFTER the
// service, and thus this adapter, is constructed. Reading them at call time means
// the SyncedMap never has to be rebuilt when the persister is swapped.
//
// The SyncedMap value type is schema.Task: the exact shape ListTasks returns, so
// reads need no conversion and REST responses are provably unchanged.
type taskStoreAdapter struct {
svc *AgentJobService
}
// compile-time assertion that the adapter satisfies the component's Store.
var _ syncstate.Store[string, schema.Task] = (*taskStoreAdapter)(nil)
// List hydrates the map from durable storage on Start/reconnect: the file's task
// list (standalone) or every task row (DB / distributed).
func (a *taskStoreAdapter) List(_ context.Context) ([]schema.Task, error) {
return a.svc.persister.LoadTasks(a.svc.userID)
}
// Upsert write-through persists a single task created/updated locally; the
// SyncedMap then broadcasts the delta to peers.
func (a *taskStoreAdapter) Upsert(_ context.Context, task schema.Task) error {
return a.svc.persister.SaveTask(a.svc.userID, task)
}
// Delete write-through removes a task locally; the SyncedMap then broadcasts the
// removal to peers.
func (a *taskStoreAdapter) Delete(_ context.Context, id string) error {
return a.svc.persister.DeleteTask(id)
}

View File

@@ -7,6 +7,7 @@ import (
"github.com/mudler/LocalAGI/webui/collections"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/jobs"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/templates"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/xlog"
@@ -28,6 +29,9 @@ type UserServicesManager struct {
// Shared distributed backends (set once, inherited by per-user job services)
jobDispatcher DistributedDispatcher
jobDBStore *jobs.JobStore
// jobNats keeps per-user agent tasks consistent across replicas (nil in
// standalone). Inherited by each per-user AgentJobService.
jobNats messaging.MessagingClient
}
// NewUserServicesManager creates a new UserServicesManager.
@@ -162,6 +166,10 @@ func (m *UserServicesManager) GetJobs(userID string) (*AgentJobService, error) {
if m.jobDispatcher != nil {
svc.SetDistributedBackends(m.jobDispatcher)
}
// Inherit the NATS client so per-user tasks broadcast across replicas. Must be
// set before the hydrate below (LoadFromDB / LoadTasksFromFile) so the tasks
// SyncedMap is rebuilt with the client while it is still empty.
svc.SetTaskSyncNATS(m.jobNats)
if m.jobDBStore != nil {
svc.SetDistributedJobStore(m.jobDBStore)
// Load tasks/jobs from DB immediately (per-user services skip Start())
@@ -189,6 +197,12 @@ func (m *UserServicesManager) SetJobDBStore(s *jobs.JobStore) {
m.jobDBStore = s
}
// SetJobSyncNATS sets the NATS client used to keep per-user agent tasks consistent
// across replicas.
func (m *UserServicesManager) SetJobSyncNATS(nats messaging.MessagingClient) {
m.jobNats = nats
}
// ListAllUserIDs returns all user IDs that have scoped data directories.
func (m *UserServicesManager) ListAllUserIDs() ([]string, error) {
return m.storage.ListUserDirs()

View File

@@ -8,6 +8,7 @@ import (
"github.com/google/uuid"
"github.com/mudler/LocalAI/core/services/advisorylock"
"gorm.io/gorm"
"gorm.io/gorm/clause"
)
// FineTuneJobRecord tracks fine-tune jobs in PostgreSQL.
@@ -80,6 +81,34 @@ func (s *FineTuneStore) List(userID string) ([]FineTuneJobRecord, error) {
return jobs, q.Find(&jobs).Error
}
// ListAll returns every fine-tune job across all users. The SyncedMap that backs
// FineTuneService is a single global map (the REST API filters by user at read
// time), so hydrate needs the full set rather than the per-user List above.
func (s *FineTuneStore) ListAll() ([]FineTuneJobRecord, error) {
var jobs []FineTuneJobRecord
return jobs, s.db.Order("created_at DESC").Find(&jobs).Error
}
// Upsert idempotently inserts or fully replaces a job row by primary key. The
// SyncedMap write-through path issues a single Set per mutation regardless of
// whether the job already exists, so it needs one create-or-update primitive
// (Create alone fails on a duplicate key, UpdateStatus alone misses new rows and
// only touches a few columns).
func (s *FineTuneStore) Upsert(job *FineTuneJobRecord) error {
if job.ID == "" {
job.ID = uuid.New().String()
}
now := time.Now()
if job.CreatedAt.IsZero() {
job.CreatedAt = now
}
job.UpdatedAt = now
return s.db.Clauses(clause.OnConflict{
Columns: []clause.Column{{Name: "id"}},
UpdateAll: true,
}).Create(job).Error
}
// UpdateStatus updates the status and message of a fine-tune job.
func (s *FineTuneStore) UpdateStatus(id, status, message string) error {
return s.db.Model(&FineTuneJobRecord{}).Where("id = ?", id).Updates(map[string]any{

View File

@@ -0,0 +1,13 @@
package distributed_test
import (
"testing"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
func TestDistributed(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "Distributed Suite")
}

View File

@@ -0,0 +1,61 @@
package distributed_test
import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/testutil"
)
var _ = Describe("FineTuneStore", func() {
var store *distributed.FineTuneStore
BeforeEach(func() {
db := testutil.SetupTestDB()
var err error
store, err = distributed.NewFineTuneStore(db)
Expect(err).ToNot(HaveOccurred())
})
Describe("ListAll", func() {
It("returns jobs across all users (unlike per-user List)", func() {
Expect(store.Create(&distributed.FineTuneJobRecord{ID: "j1", UserID: "u1", Status: "queued"})).To(Succeed())
Expect(store.Create(&distributed.FineTuneJobRecord{ID: "j2", UserID: "u2", Status: "queued"})).To(Succeed())
all, err := store.ListAll()
Expect(err).ToNot(HaveOccurred())
Expect(all).To(HaveLen(2))
perUser, err := store.List("u1")
Expect(err).ToNot(HaveOccurred())
Expect(perUser).To(HaveLen(1), "List stays per-user")
})
})
Describe("Upsert", func() {
It("inserts a new row", func() {
Expect(store.Upsert(&distributed.FineTuneJobRecord{ID: "up-1", UserID: "u1", Status: "queued"})).To(Succeed())
got, err := store.Get("up-1")
Expect(err).ToNot(HaveOccurred())
Expect(got.Status).To(Equal("queued"))
})
It("idempotently updates an existing row on a repeated key", func() {
Expect(store.Upsert(&distributed.FineTuneJobRecord{ID: "up-2", UserID: "u1", Status: "queued"})).To(Succeed())
// Second Upsert with the same primary key must update, not error on a
// duplicate-key violation (this is the SyncedMap write-through contract).
Expect(store.Upsert(&distributed.FineTuneJobRecord{ID: "up-2", UserID: "u1", Status: "completed", Message: "done"})).To(Succeed())
got, err := store.Get("up-2")
Expect(err).ToNot(HaveOccurred())
Expect(got.Status).To(Equal("completed"))
Expect(got.Message).To(Equal("done"))
all, err := store.ListAll()
Expect(err).ToNot(HaveOccurred())
Expect(all).To(HaveLen(1), "upsert must not create a duplicate")
})
})
})

View File

@@ -11,6 +11,7 @@ import (
type Stores struct {
Gallery *GalleryStore
FineTune *FineTuneStore
Quant *QuantStore
Skills *SkillStore
}
@@ -26,15 +27,21 @@ func InitStores(db *gorm.DB) (*Stores, error) {
return nil, fmt.Errorf("fine-tune store: %w", err)
}
quant, err := NewQuantStore(db)
if err != nil {
return nil, fmt.Errorf("quantization store: %w", err)
}
skills, err := NewSkillStore(db)
if err != nil {
return nil, fmt.Errorf("skills store: %w", err)
}
xlog.Info("Distributed stores initialized (Gallery, FineTune, Skills)")
xlog.Info("Distributed stores initialized (Gallery, FineTune, Quant, Skills)")
return &Stores{
Gallery: gallery,
FineTune: ft,
Quant: quant,
Skills: skills,
}, nil
}

View File

@@ -0,0 +1,105 @@
package distributed
import (
"context"
"fmt"
"time"
"github.com/google/uuid"
"github.com/mudler/LocalAI/core/services/advisorylock"
"gorm.io/gorm"
"gorm.io/gorm/clause"
)
// QuantJobRecord tracks quantization jobs in PostgreSQL. The columns mirror the
// API shape (schema.QuantizationJob); the structured Config and ExtraOptions are
// serialized into JSON text columns so a record fully reconstructs the job.
type QuantJobRecord struct {
ID string `gorm:"primaryKey;size:36" json:"id"`
UserID string `gorm:"index;size:36" json:"user_id,omitempty"`
Model string `gorm:"size:255" json:"model"`
Backend string `gorm:"size:64" json:"backend"`
ModelID string `gorm:"size:255" json:"model_id,omitempty"`
QuantizationType string `gorm:"size:32" json:"quantization_type"`
Status string `gorm:"index;size:32;default:queued" json:"status"` // queued, downloading, converting, quantizing, completed, failed, stopped
Message string `gorm:"type:text" json:"message,omitempty"`
OutputDir string `gorm:"size:512" json:"output_dir,omitempty"`
OutputFile string `gorm:"size:512" json:"output_file,omitempty"`
ConfigJSON string `gorm:"column:config;type:text" json:"-"`
ExtraOptsJSON string `gorm:"column:extra_options;type:text" json:"-"`
ImportStatus string `gorm:"size:32" json:"import_status,omitempty"`
ImportMessage string `gorm:"type:text" json:"import_message,omitempty"`
ImportModelName string `gorm:"size:255" json:"import_model_name,omitempty"`
CreatedAt time.Time `json:"created_at"`
UpdatedAt time.Time `json:"updated_at"`
}
func (QuantJobRecord) TableName() string { return "quantization_jobs" }
// QuantStore manages quantization job state in PostgreSQL.
type QuantStore struct {
db *gorm.DB
}
// NewQuantStore creates a new QuantStore and auto-migrates.
// Uses a PostgreSQL advisory lock to prevent concurrent migration races
// when multiple instances (frontend + workers) start at the same time.
func NewQuantStore(db *gorm.DB) (*QuantStore, error) {
if err := advisorylock.WithLockCtx(context.Background(), db, advisorylock.KeySchemaMigrate, func() error {
return db.AutoMigrate(&QuantJobRecord{})
}); err != nil {
return nil, fmt.Errorf("migrating quantization_jobs: %w", err)
}
return &QuantStore{db: db}, nil
}
// Create stores a new quantization job.
func (s *QuantStore) Create(job *QuantJobRecord) error {
if job.ID == "" {
job.ID = uuid.New().String()
}
job.CreatedAt = time.Now()
job.UpdatedAt = job.CreatedAt
return s.db.Create(job).Error
}
// Get retrieves a quantization job by ID.
func (s *QuantStore) Get(id string) (*QuantJobRecord, error) {
var job QuantJobRecord
if err := s.db.First(&job, "id = ?", id).Error; err != nil {
return nil, err
}
return &job, nil
}
// ListAll returns every quantization job across all users. The SyncedMap that
// backs QuantizationService is a single global map (the REST API filters by user
// at read time), so hydrate needs the full set.
func (s *QuantStore) ListAll() ([]QuantJobRecord, error) {
var jobs []QuantJobRecord
return jobs, s.db.Order("created_at DESC").Find(&jobs).Error
}
// Upsert idempotently inserts or fully replaces a job row by primary key. The
// SyncedMap write-through path issues a single Set per mutation regardless of
// whether the job already exists, so it needs one create-or-update primitive
// (Create alone fails on a duplicate key).
func (s *QuantStore) Upsert(job *QuantJobRecord) error {
if job.ID == "" {
job.ID = uuid.New().String()
}
now := time.Now()
if job.CreatedAt.IsZero() {
job.CreatedAt = now
}
job.UpdatedAt = now
return s.db.Clauses(clause.OnConflict{
Columns: []clause.Column{{Name: "id"}},
UpdateAll: true,
}).Create(job).Error
}
// Delete removes a quantization job.
func (s *QuantStore) Delete(id string) error {
return s.db.Where("id = ?", id).Delete(&QuantJobRecord{}).Error
}

View File

@@ -0,0 +1,57 @@
package distributed_test
import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/testutil"
)
var _ = Describe("QuantStore", func() {
var store *distributed.QuantStore
BeforeEach(func() {
db := testutil.SetupTestDB()
var err error
store, err = distributed.NewQuantStore(db)
Expect(err).ToNot(HaveOccurred())
})
Describe("ListAll", func() {
It("returns jobs across all users", func() {
Expect(store.Create(&distributed.QuantJobRecord{ID: "j1", UserID: "u1", Status: "queued"})).To(Succeed())
Expect(store.Create(&distributed.QuantJobRecord{ID: "j2", UserID: "u2", Status: "queued"})).To(Succeed())
all, err := store.ListAll()
Expect(err).ToNot(HaveOccurred())
Expect(all).To(HaveLen(2))
})
})
Describe("Upsert", func() {
It("inserts a new row", func() {
Expect(store.Upsert(&distributed.QuantJobRecord{ID: "up-1", UserID: "u1", Status: "queued"})).To(Succeed())
got, err := store.Get("up-1")
Expect(err).ToNot(HaveOccurred())
Expect(got.Status).To(Equal("queued"))
})
It("idempotently updates an existing row on a repeated key", func() {
Expect(store.Upsert(&distributed.QuantJobRecord{ID: "up-2", UserID: "u1", Status: "queued"})).To(Succeed())
// Second Upsert with the same primary key must update, not error on a
// duplicate-key violation (this is the SyncedMap write-through contract).
Expect(store.Upsert(&distributed.QuantJobRecord{ID: "up-2", UserID: "u1", Status: "completed", Message: "done"})).To(Succeed())
got, err := store.Get("up-2")
Expect(err).ToNot(HaveOccurred())
Expect(got.Status).To(Equal("completed"))
Expect(got.Message).To(Equal("done"))
all, err := store.ListAll()
Expect(err).ToNot(HaveOccurred())
Expect(all).To(HaveLen(1), "upsert must not create a duplicate")
})
})
})

View File

@@ -0,0 +1,13 @@
package finetune
import (
"testing"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
func TestFinetune(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "Finetune Suite")
}

View File

@@ -19,6 +19,7 @@ import (
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/syncstate"
pb "github.com/mudler/LocalAI/pkg/grpc/proto"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/utils"
@@ -32,44 +33,63 @@ type FineTuneService struct {
modelLoader *model.ModelLoader
configLoader *config.ModelConfigLoader
mu sync.Mutex
jobs map[string]*schema.FineTuneJob
// mu serializes the read-modify-write of job values. The SyncedMap guards its
// own map structure, but a job is a pointer mutated in place (e.g. the export
// goroutine), so the service still needs a lock to keep those field updates
// and the subsequent Set atomic with respect to readers.
mu sync.Mutex
// Distributed mode (nil when not in distributed mode)
natsClient messaging.Publisher
fineTuneStore *distributed.FineTuneStore
// jobs is the cross-replica job store: an in-memory map kept consistent across
// replicas via NATS, optionally read-through to PostgreSQL in distributed mode.
jobs *syncstate.SyncedMap[string, *schema.FineTuneJob]
}
// SetNATSClient sets the NATS client for distributed progress publishing.
func (s *FineTuneService) SetNATSClient(nc messaging.Publisher) {
s.mu.Lock()
defer s.mu.Unlock()
s.natsClient = nc
}
// SetFineTuneStore sets the PostgreSQL fine-tune store for distributed persistence.
func (s *FineTuneService) SetFineTuneStore(store *distributed.FineTuneStore) {
s.mu.Lock()
defer s.mu.Unlock()
s.fineTuneStore = store
}
// NewFineTuneService creates a new FineTuneService.
// NewFineTuneService creates a new FineTuneService. In distributed mode pass the
// shared NATS client and PostgreSQL store so jobs stay consistent across
// replicas; pass nil for both in standalone mode, where the disk Loader hydrates
// the map and there is nothing to broadcast.
func NewFineTuneService(
appConfig *config.ApplicationConfig,
modelLoader *model.ModelLoader,
configLoader *config.ModelConfigLoader,
nats messaging.MessagingClient,
store *distributed.FineTuneStore,
) *FineTuneService {
s := &FineTuneService{
appConfig: appConfig,
modelLoader: modelLoader,
configLoader: configLoader,
jobs: make(map[string]*schema.FineTuneJob),
}
s.loadAllJobs()
// Only attach a Store interface when a concrete store exists, otherwise the
// SyncedMap would see a non-nil interface wrapping a nil pointer and try to
// hydrate/write through a nil DB.
var syncStore syncstate.Store[string, *schema.FineTuneJob]
if store != nil {
syncStore = &fineTuneStoreAdapter{store: store}
}
s.jobs = syncstate.New(syncstate.Config[string, *schema.FineTuneJob]{
Name: "finetune.jobs",
Key: func(j *schema.FineTuneJob) string { return j.ID },
Nats: nats,
Store: syncStore,
Loader: s.loadJobsFromDisk, // ignored when Store is set (distributed mode)
})
// Hydrate + subscribe. A hydrate failure must not take the server down: log
// and continue degraded (standalone), mirroring the OpCache wiring.
if err := s.jobs.Start(appConfig.Context); err != nil {
xlog.Warn("FineTune SyncedMap start failed; running degraded", "error", err)
}
return s
}
// Close releases the SyncedMap subscription and background workers.
func (s *FineTuneService) Close() error {
return s.jobs.Close()
}
// fineTuneBaseDir returns the base directory for fine-tune job data.
func (s *FineTuneService) fineTuneBaseDir() string {
return filepath.Join(s.appConfig.DataPath, "fine-tune")
@@ -100,15 +120,18 @@ func (s *FineTuneService) saveJobState(job *schema.FineTuneJob) {
}
}
// loadAllJobs scans the fine-tune directory for persisted jobs and loads them.
func (s *FineTuneService) loadAllJobs() {
// loadJobsFromDisk scans the fine-tune directory for persisted jobs and returns
// them. It is the SyncedMap Loader used in standalone mode (no DB); the returned
// slice hydrates the map on Start.
func (s *FineTuneService) loadJobsFromDisk(_ context.Context) ([]*schema.FineTuneJob, error) {
baseDir := s.fineTuneBaseDir()
entries, err := os.ReadDir(baseDir)
if err != nil {
// Directory doesn't exist yet — that's fine
return
// Directory doesn't exist yet — that's fine, start empty.
return nil, nil
}
var jobs []*schema.FineTuneJob
for _, entry := range entries {
if !entry.IsDir() {
continue
@@ -137,12 +160,13 @@ func (s *FineTuneService) loadAllJobs() {
job.ExportMessage = "Server restarted while export was running"
}
s.jobs[job.ID] = &job
jobs = append(jobs, &job)
}
if len(s.jobs) > 0 {
xlog.Info("Loaded persisted fine-tune jobs", "count", len(s.jobs))
if len(jobs) > 0 {
xlog.Info("Loaded persisted fine-tune jobs", "count", len(jobs))
}
return jobs, nil
}
// StartJob starts a new fine-tuning job.
@@ -236,27 +260,13 @@ func (s *FineTuneService) StartJob(ctx context.Context, userID string, req schem
CreatedAt: time.Now().UTC().Format(time.RFC3339),
Config: &req,
}
s.jobs[jobID] = job
s.saveJobState(job)
// Persist to PostgreSQL in distributed mode
if s.fineTuneStore != nil {
configJSON, _ := json.Marshal(req)
extraJSON, _ := json.Marshal(req.ExtraOptions)
s.fineTuneStore.Create(&distributed.FineTuneJobRecord{
ID: jobID,
UserID: userID,
Model: req.Model,
Backend: backendName,
ModelID: modelID,
TrainingType: req.TrainingType,
TrainingMethod: req.TrainingMethod,
Status: "queued",
OutputDir: outputDir,
ConfigJSON: string(configJSON),
ExtraOptsJSON: string(extraJSON),
})
// Set write-through persists to PostgreSQL (distributed) and broadcasts to
// peer replicas; the disk state.json is written separately for restart
// recovery / standalone hydrate.
if err := s.jobs.Set(ctx, job); err != nil {
return nil, fmt.Errorf("failed to persist job: %w", err)
}
s.saveJobState(job)
return &schema.FineTuneJobResponse{
ID: jobID,
@@ -270,7 +280,7 @@ func (s *FineTuneService) GetJob(userID, jobID string) (*schema.FineTuneJob, err
s.mu.Lock()
defer s.mu.Unlock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
return nil, fmt.Errorf("job not found: %s", jobID)
}
@@ -286,7 +296,7 @@ func (s *FineTuneService) ListJobs(userID string) []*schema.FineTuneJob {
defer s.mu.Unlock()
var result []*schema.FineTuneJob
for _, job := range s.jobs {
for _, job := range s.jobs.List() {
if userID == "" || job.UserID == userID {
result = append(result, job)
}
@@ -302,7 +312,7 @@ func (s *FineTuneService) ListJobs(userID string) []*schema.FineTuneJob {
// StopJob stops a running fine-tuning job.
func (s *FineTuneService) StopJob(ctx context.Context, userID, jobID string, saveCheckpoint bool) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -323,10 +333,10 @@ func (s *FineTuneService) StopJob(ctx context.Context, userID, jobID string, sav
s.mu.Lock()
job.Status = "stopped"
job.Message = "Training stopped by user"
s.saveJobState(job)
if s.fineTuneStore != nil {
s.fineTuneStore.UpdateStatus(jobID, "stopped", "Training stopped by user")
if err := s.jobs.Set(ctx, job); err != nil {
xlog.Warn("Failed to persist stopped job", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
return nil
@@ -335,7 +345,7 @@ func (s *FineTuneService) StopJob(ctx context.Context, userID, jobID string, sav
// DeleteJob removes a fine-tuning job and its associated data from disk.
func (s *FineTuneService) DeleteJob(userID, jobID string) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -360,9 +370,10 @@ func (s *FineTuneService) DeleteJob(userID, jobID string) error {
}
exportModelName := job.ExportModelName
delete(s.jobs, jobID)
if s.fineTuneStore != nil {
s.fineTuneStore.Delete(jobID)
// Delete write-through removes the DB row (distributed) and broadcasts the
// removal to peer replicas. DeleteJob has no ctx, so use Background.
if err := s.jobs.Delete(context.Background(), jobID); err != nil {
xlog.Warn("Failed to delete job from store", "job_id", jobID, "error", err)
}
s.mu.Unlock()
@@ -398,7 +409,7 @@ func (s *FineTuneService) DeleteJob(userID, jobID string) error {
// StreamProgress opens a gRPC progress stream and calls the callback for each update.
func (s *FineTuneService) StreamProgress(ctx context.Context, userID, jobID string, callback func(event *schema.FineTuneProgressEvent)) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -427,7 +438,7 @@ func (s *FineTuneService) StreamProgress(ctx context.Context, userID, jobID stri
}, func(update *pb.FineTuneProgressUpdate) {
// Update job status and persist
s.mu.Lock()
if j, ok := s.jobs[jobID]; ok {
if j, ok := s.jobs.Get(jobID); ok {
// Don't let progress updates overwrite terminal states
isTerminal := j.Status == "stopped" || j.Status == "completed" || j.Status == "failed"
if !isTerminal {
@@ -436,10 +447,10 @@ func (s *FineTuneService) StreamProgress(ctx context.Context, userID, jobID stri
if update.Message != "" {
j.Message = update.Message
}
s.saveJobState(j)
if s.fineTuneStore != nil {
s.fineTuneStore.UpdateStatus(jobID, j.Status, j.Message)
if err := s.jobs.Set(ctx, j); err != nil {
xlog.Warn("Failed to persist progress update", "job_id", jobID, "error", err)
}
s.saveJobState(j)
}
s.mu.Unlock()
@@ -474,7 +485,7 @@ func (s *FineTuneService) StreamProgress(ctx context.Context, userID, jobID stri
// ListCheckpoints lists checkpoints for a job.
func (s *FineTuneService) ListCheckpoints(ctx context.Context, userID, jobID string) ([]*pb.CheckpointInfo, error) {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return nil, fmt.Errorf("job not found: %s", jobID)
@@ -520,7 +531,7 @@ func sanitizeModelName(s string) string {
// ExportModel starts an async model export from a checkpoint and returns the intended model name immediately.
func (s *FineTuneService) ExportModel(ctx context.Context, userID, jobID string, req schema.ExportRequest) (string, error) {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return "", fmt.Errorf("job not found: %s", jobID)
@@ -572,6 +583,9 @@ func (s *FineTuneService) ExportModel(ctx context.Context, userID, jobID string,
job.ExportStatus = "exporting"
job.ExportMessage = ""
job.ExportModelName = ""
if err := s.jobs.Set(ctx, job); err != nil {
xlog.Warn("Failed to persist export start", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
@@ -662,24 +676,30 @@ func (s *FineTuneService) ExportModel(ctx context.Context, userID, jobID string,
xlog.Info("Model exported and registered", "job_id", jobID, "model_name", modelName, "format", req.ExportFormat)
// Runs after the HTTP request returns, so use Background rather than the
// (now likely cancelled) request ctx for the write-through.
s.mu.Lock()
job.ExportStatus = "completed"
job.ExportModelName = modelName
job.ExportMessage = ""
s.saveJobState(job)
if s.fineTuneStore != nil {
s.fineTuneStore.UpdateExportStatus(jobID, "completed", "", modelName)
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist export completion", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}()
return modelName, nil
}
// setExportMessage updates the export message and persists the job state.
// setExportMessage updates the export message and persists the job state. Called
// from the background export goroutine, so it uses Background for write-through.
func (s *FineTuneService) setExportMessage(job *schema.FineTuneJob, msg string) {
s.mu.Lock()
job.ExportMessage = msg
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist export message", "job_id", job.ID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}
@@ -687,7 +707,7 @@ func (s *FineTuneService) setExportMessage(job *schema.FineTuneJob, msg string)
// GetExportedModelPath returns the path to the exported model directory and its name.
func (s *FineTuneService) GetExportedModelPath(userID, jobID string) (string, string, error) {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return "", "", fmt.Errorf("job not found: %s", jobID)
@@ -723,10 +743,10 @@ func (s *FineTuneService) setExportFailed(job *schema.FineTuneJob, message strin
s.mu.Lock()
job.ExportStatus = "failed"
job.ExportMessage = message
s.saveJobState(job)
if s.fineTuneStore != nil {
s.fineTuneStore.UpdateExportStatus(job.ID, "failed", message, "")
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist export failure", "job_id", job.ID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}

View File

@@ -0,0 +1,185 @@
package finetune
// White-box tests (package finetune) so a spec can drive the service's internal
// SyncedMap the same way StartJob does (via jobs.Set) without standing up a
// training backend, then assert the cross-replica reads (GetJob/ListJobs) and
// the adapter conversions that keep REST responses byte-for-byte unchanged.
import (
"context"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/testutil"
)
// newTestService builds a standalone FineTuneService wired to the given bus. The
// model/config loaders are nil because the read/sync paths under test never touch
// them; the data dir is a throwaway temp dir so the disk Loader finds nothing.
func newTestService(bus *testutil.FakeBus) *FineTuneService {
appConfig := &config.ApplicationConfig{
Context: context.Background(),
DataPath: GinkgoT().TempDir(),
}
return NewFineTuneService(appConfig, nil, nil, bus, nil)
}
var _ = Describe("FineTuneService", func() {
ctx := context.Background()
Describe("cross-replica job visibility", func() {
var (
bus *testutil.FakeBus
a, b *FineTuneService
)
BeforeEach(func() {
// One shared bus, two replicas: exactly the distributed topology where
// a round-robin request may land on a replica that did not originate
// the change.
bus = testutil.NewFakeBus()
a = newTestService(bus)
b = newTestService(bus)
})
AfterEach(func() {
Expect(a.Close()).To(Succeed())
Expect(b.Close()).To(Succeed())
})
It("makes a job created on A visible via B's GetJob and ListJobs", func() {
job := &schema.FineTuneJob{ID: "job-1", UserID: "user-1", Status: "queued", CreatedAt: "2026-06-27T10:00:00Z"}
// StartJob persists via jobs.Set; drive that directly to avoid a backend.
Expect(a.jobs.Set(ctx, job)).To(Succeed())
got, err := b.GetJob("user-1", "job-1")
Expect(err).ToNot(HaveOccurred(), "B must see a job A just created")
Expect(got.Status).To(Equal("queued"))
listed := b.ListJobs("user-1")
Expect(listed).To(HaveLen(1))
Expect(listed[0].ID).To(Equal("job-1"))
})
It("removes a job from B when it is deleted on A", func() {
job := &schema.FineTuneJob{ID: "job-2", UserID: "user-1", Status: "completed", CreatedAt: "2026-06-27T10:00:00Z"}
Expect(a.jobs.Set(ctx, job)).To(Succeed())
_, err := b.GetJob("user-1", "job-2")
Expect(err).ToNot(HaveOccurred(), "precondition: B must have the job before the delete")
Expect(a.jobs.Delete(ctx, "job-2")).To(Succeed())
_, err = b.GetJob("user-1", "job-2")
Expect(err).To(HaveOccurred(), "a delete on A must remove the job from B")
})
It("propagates a status update from A to B", func() {
job := &schema.FineTuneJob{ID: "job-3", UserID: "user-1", Status: "training", CreatedAt: "2026-06-27T10:00:00Z"}
Expect(a.jobs.Set(ctx, job)).To(Succeed())
updated := &schema.FineTuneJob{ID: "job-3", UserID: "user-1", Status: "completed", CreatedAt: "2026-06-27T10:00:00Z"}
Expect(a.jobs.Set(ctx, updated)).To(Succeed())
got, err := b.GetJob("user-1", "job-3")
Expect(err).ToNot(HaveOccurred())
Expect(got.Status).To(Equal("completed"))
})
})
Describe("ListJobs", func() {
var svc *FineTuneService
BeforeEach(func() {
svc = newTestService(testutil.NewFakeBus())
})
AfterEach(func() { Expect(svc.Close()).To(Succeed()) })
It("filters by user and sorts newest-first", func() {
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "old", UserID: "u1", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "new", UserID: "u1", CreatedAt: "2026-06-27T10:00:00Z"})).To(Succeed())
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "other", UserID: "u2", CreatedAt: "2026-06-26T10:00:00Z"})).To(Succeed())
jobs := svc.ListJobs("u1")
Expect(jobs).To(HaveLen(2), "only u1's jobs")
Expect(jobs[0].ID).To(Equal("new"), "newest first")
Expect(jobs[1].ID).To(Equal("old"))
})
It("returns every user's jobs when the userID filter is empty", func() {
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "a", UserID: "u1", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "b", UserID: "u2", CreatedAt: "2026-06-26T10:00:00Z"})).To(Succeed())
Expect(svc.ListJobs("")).To(HaveLen(2))
})
It("rejects GetJob for a job owned by another user", func() {
Expect(svc.jobs.Set(ctx, &schema.FineTuneJob{ID: "x", UserID: "owner", CreatedAt: "2026-06-25T10:00:00Z"})).To(Succeed())
_, err := svc.GetJob("intruder", "x")
Expect(err).To(HaveOccurred(), "a different user must not read someone else's job")
})
})
Describe("store adapter conversion", func() {
// The SyncedMap value type is *schema.FineTuneJob (the exact REST shape).
// These specs prove the DB adapter round-trips it losslessly, so hydrate
// and write-through in distributed mode keep responses unchanged.
It("round-trips a job through jobToRecord/recordToJob preserving the API shape", func() {
original := &schema.FineTuneJob{
ID: "rt-1",
UserID: "user-1",
Model: "base-model",
Backend: "trl",
ModelID: "trl-finetune-rt-1",
TrainingType: "lora",
TrainingMethod: "sft",
Status: "completed",
Message: "done",
OutputDir: "/data/fine-tune/rt-1",
ExtraOptions: map[string]string{"hf_token": "secret"},
CreatedAt: "2026-06-27T10:00:00Z",
ExportStatus: "completed",
ExportMessage: "",
ExportModelName: "base-model-ft-rt-1",
Config: &schema.FineTuneJobRequest{Model: "base-model", Backend: "trl", DatasetSource: "data.jsonl"},
}
rec := jobToRecord(original)
Expect(rec.ID).To(Equal("rt-1"))
Expect(rec.ConfigJSON).ToNot(BeEmpty(), "structured config must serialize into the JSON column")
Expect(rec.ExtraOptsJSON).ToNot(BeEmpty())
back := recordToJob(rec)
Expect(back.ID).To(Equal(original.ID))
Expect(back.UserID).To(Equal(original.UserID))
Expect(back.Model).To(Equal(original.Model))
Expect(back.Backend).To(Equal(original.Backend))
Expect(back.ModelID).To(Equal(original.ModelID))
Expect(back.TrainingType).To(Equal(original.TrainingType))
Expect(back.TrainingMethod).To(Equal(original.TrainingMethod))
Expect(back.Status).To(Equal(original.Status))
Expect(back.Message).To(Equal(original.Message))
Expect(back.OutputDir).To(Equal(original.OutputDir))
Expect(back.ExportStatus).To(Equal(original.ExportStatus))
Expect(back.ExportModelName).To(Equal(original.ExportModelName))
Expect(back.CreatedAt).To(Equal(original.CreatedAt))
Expect(back.ExtraOptions).To(Equal(original.ExtraOptions))
Expect(back.Config).ToNot(BeNil())
Expect(back.Config.DatasetSource).To(Equal("data.jsonl"))
})
})
Describe("compile-time adapter contract", func() {
It("satisfies syncstate.Store for *distributed.FineTuneStore", func() {
// Guards against drift between the adapter and the component interface;
// the var assertion in syncstore.go covers it at build time, this keeps
// the type referenced from a spec too.
var _ *distributed.FineTuneStore
Expect(&fineTuneStoreAdapter{}).ToNot(BeNil())
})
})
})

View File

@@ -0,0 +1,114 @@
package finetune
import (
"context"
"encoding/json"
"time"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/syncstate"
)
// fineTuneStoreAdapter bridges the distributed PostgreSQL FineTuneStore to the
// generic syncstate.Store the SyncedMap consumes. It is only wired in distributed
// mode; standalone leaves Store nil and hydrates from disk via a Loader instead.
//
// The SyncedMap value type is *schema.FineTuneJob (the exact shape the REST API
// returns) so reads need no conversion and the response JSON is provably
// unchanged. The adapter is the single place that translates between that API
// shape and the DB FineTuneJobRecord.
type fineTuneStoreAdapter struct {
store *distributed.FineTuneStore
}
// compile-time assertion that the adapter satisfies the component's Store.
var _ syncstate.Store[string, *schema.FineTuneJob] = (*fineTuneStoreAdapter)(nil)
func (a *fineTuneStoreAdapter) List(_ context.Context) ([]*schema.FineTuneJob, error) {
records, err := a.store.ListAll()
if err != nil {
return nil, err
}
jobs := make([]*schema.FineTuneJob, 0, len(records))
for i := range records {
jobs = append(jobs, recordToJob(&records[i]))
}
return jobs, nil
}
func (a *fineTuneStoreAdapter) Upsert(_ context.Context, job *schema.FineTuneJob) error {
return a.store.Upsert(jobToRecord(job))
}
func (a *fineTuneStoreAdapter) Delete(_ context.Context, id string) error {
return a.store.Delete(id)
}
// recordToJob maps a persisted DB record back to the API shape, reconstructing
// the structured Config / ExtraOptions from their JSON columns.
func recordToJob(r *distributed.FineTuneJobRecord) *schema.FineTuneJob {
job := &schema.FineTuneJob{
ID: r.ID,
UserID: r.UserID,
Model: r.Model,
Backend: r.Backend,
ModelID: r.ModelID,
TrainingType: r.TrainingType,
TrainingMethod: r.TrainingMethod,
Status: r.Status,
Message: r.Message,
OutputDir: r.OutputDir,
ExportStatus: r.ExportStatus,
ExportMessage: r.ExportMessage,
ExportModelName: r.ExportModelName,
CreatedAt: r.CreatedAt.UTC().Format(time.RFC3339),
}
if r.ExtraOptsJSON != "" {
// Best-effort: a malformed column must not drop the whole job from the API.
_ = json.Unmarshal([]byte(r.ExtraOptsJSON), &job.ExtraOptions)
}
if r.ConfigJSON != "" {
var cfg schema.FineTuneJobRequest
if err := json.Unmarshal([]byte(r.ConfigJSON), &cfg); err == nil {
job.Config = &cfg
}
}
return job
}
// jobToRecord maps the API shape to a DB record for write-through, serializing
// the structured Config / ExtraOptions into their JSON columns. CreatedAt is
// parsed back from the RFC3339 string the service stamps; an unparseable value
// is left zero so FineTuneStore.Upsert stamps "now".
func jobToRecord(job *schema.FineTuneJob) *distributed.FineTuneJobRecord {
rec := &distributed.FineTuneJobRecord{
ID: job.ID,
UserID: job.UserID,
Model: job.Model,
Backend: job.Backend,
ModelID: job.ModelID,
TrainingType: job.TrainingType,
TrainingMethod: job.TrainingMethod,
Status: job.Status,
Message: job.Message,
OutputDir: job.OutputDir,
ExportStatus: job.ExportStatus,
ExportMessage: job.ExportMessage,
ExportModelName: job.ExportModelName,
}
if job.Config != nil {
if data, err := json.Marshal(job.Config); err == nil {
rec.ConfigJSON = string(data)
}
}
if job.ExtraOptions != nil {
if data, err := json.Marshal(job.ExtraOptions); err == nil {
rec.ExtraOptsJSON = string(data)
}
}
if t, err := time.Parse(time.RFC3339, job.CreatedAt); err == nil {
rec.CreatedAt = t
}
return rec
}

View File

@@ -404,6 +404,36 @@ var _ = Describe("GalleryService cache invalidation broadcasts", func() {
Element: "x", Op: "install",
})).To(Succeed())
})
It("BroadcastModelsChanged delivers the element and op to a peer's OnModelsChanged", func() {
var (
mu sync.Mutex
seen []messaging.CacheInvalidateEvent
)
svcB.OnModelsChanged = func(evt messaging.CacheInvalidateEvent) {
mu.Lock()
seen = append(seen, evt)
mu.Unlock()
}
Expect(svcA.SubscribeBroadcasts()).To(Succeed())
Expect(svcB.SubscribeBroadcasts()).To(Succeed())
// An admin edit on replica A must reach replica B over the same subject
// the gallery path uses, so B refreshes its in-memory config loader.
svcA.BroadcastModelsChanged("my-alias", "install")
mu.Lock()
defer mu.Unlock()
Expect(seen).To(ContainElement(messaging.CacheInvalidateEvent{
Element: "my-alias", Op: "install",
}))
})
It("BroadcastModelsChanged is a no-op when NATS is not wired (standalone)", func() {
standalone := galleryop.NewGalleryService(&config.ApplicationConfig{}, nil)
// No SetNATSClient: must not panic and must simply do nothing.
Expect(func() { standalone.BroadcastModelsChanged("x", "delete") }).ToNot(Panic())
})
})
var _ = Describe("GalleryService PostgreSQL hydration", func() {

View File

@@ -201,6 +201,24 @@ func (g *GalleryService) publishCacheInvalidate(subject string, evt messaging.Ca
}
}
// BroadcastModelsChanged notifies peer replicas that a model config was
// created, edited, or removed out-of-band of the gallery install/delete
// channel (e.g. the admin /models/edit, /models/import and
// /models/toggle-state endpoints, which write the YAML and reload only the
// local in-memory loader). Peers receive it via OnModelsChanged and refresh
// their own ModelConfigLoader so a request load-balanced to any replica sees
// the same config. No-op in standalone mode (no NATS client).
//
// op is "install" for a create/edit (the element must be (re)loaded from
// disk) or "delete" for a removal (the element must be pruned from memory,
// which a reload-from-path cannot do because the loader is additive).
func (g *GalleryService) BroadcastModelsChanged(element, op string) {
g.publishCacheInvalidate(messaging.SubjectCacheInvalidateModels, messaging.CacheInvalidateEvent{
Element: element,
Op: op,
})
}
// mergeStatus is the broadcast-side merge: it updates the in-memory map from
// a peer's GalleryProgressEvent without re-publishing to NATS or re-writing
// to PostgreSQL. UpdateStatus is the local-write entry point and does both;

View File

@@ -0,0 +1,24 @@
//go:build auth
package jobs_test
import (
"github.com/mudler/LocalAI/core/http/auth"
"github.com/mudler/LocalAI/core/services/jobs"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
// Reproduces the #10506 caller chain: auth.InitDB(sqlite) -> jobs.NewJobStore,
// which previously failed with "no such function: pg_advisory_lock".
var _ = Describe("NewJobStore on a SQLite auth DB (#10506)", func() {
It("migrates without pg_advisory_lock errors", func() {
db, err := auth.InitDB(":memory:")
Expect(err).ToNot(HaveOccurred())
store, err := jobs.NewJobStore(db)
Expect(err).ToNot(HaveOccurred())
Expect(store).ToNot(BeNil())
})
})

View File

@@ -22,6 +22,14 @@ const subscribeConfirmTimeout = 5 * time.Second
type Client struct {
conn *nats.Conn
mu sync.RWMutex
// reconnectCbs are invoked after the underlying connection is
// re-established. nats.go transparently resubscribes existing
// subscriptions on reconnect, but it cannot know that a consumer kept
// derived in-memory state (e.g. syncstate.SyncedMap) that may have drifted
// while the link was down — these callbacks let such consumers re-hydrate.
cbMu sync.Mutex
reconnectCbs []func()
}
// New creates a new NATS client with auto-reconnect.
@@ -31,6 +39,10 @@ func New(url string, opts ...Option) (*Client, error) {
o(&cfg)
}
// Allocate the client up front so the reconnect handler closure can reach
// it; conn is populated after nats.Connect succeeds below.
c := &Client{}
natsOpts := []nats.Option{
nats.RetryOnFailedConnect(true),
nats.MaxReconnects(-1),
@@ -41,6 +53,7 @@ func New(url string, opts ...Option) (*Client, error) {
}),
nats.ReconnectHandler(func(_ *nats.Conn) {
xlog.Info("NATS reconnected")
c.runReconnectCallbacks()
}),
nats.ClosedHandler(func(_ *nats.Conn) {
xlog.Info("NATS connection closed")
@@ -103,7 +116,33 @@ func New(url string, opts ...Option) (*Client, error) {
return nil, fmt.Errorf("connecting to NATS at %s: %w", sanitize.URL(url), err)
}
return &Client{conn: nc}, nil
c.conn = nc
return c, nil
}
// OnReconnect registers a callback invoked after the NATS connection is
// re-established. It is consumed via an optional interface type-assertion
// (interface{ OnReconnect(func()) }) rather than being added to MessagingClient,
// so the messaging abstraction stays minimal and standalone/test clients are not
// forced to implement reconnect semantics. A nil callback is ignored.
func (c *Client) OnReconnect(cb func()) {
if cb == nil {
return
}
c.cbMu.Lock()
c.reconnectCbs = append(c.reconnectCbs, cb)
c.cbMu.Unlock()
}
// runReconnectCallbacks invokes registered reconnect callbacks. It copies the
// slice under the lock so a callback that (re)registers cannot deadlock.
func (c *Client) runReconnectCallbacks() {
c.cbMu.Lock()
cbs := append([]func(){}, c.reconnectCbs...)
c.cbMu.Unlock()
for _, cb := range cbs {
cb()
}
}
// Publish marshals data as JSON and publishes it to the given subject.

View File

@@ -380,6 +380,20 @@ func SubjectCacheInvalidateCollection(name string) string {
return "cache.invalidate.collections." + sanitizeSubjectToken(name)
}
// SyncedMap State Sync (Pub/Sub — broadcast to all frontends)
//
// The reusable syncstate.SyncedMap component publishes a {op,key,value} delta on
// this subject whenever a replica mutates a piece of cross-replica in-memory
// state. Peers subscribe and apply the delta to their own map, so a round-robin
// API request that lands on a replica which did not originate the change still
// sees it. Convergence on (re)connect is done by re-hydrating from the durable
// source, so no request/reply snapshot subject is needed here.
func SubjectSyncStateDelta(name string) string {
return subjectSyncStatePrefix + sanitizeSubjectToken(name) + ".delta"
}
const subjectSyncStatePrefix = "state."
// Prefix-Cache Routing Sync (Pub/Sub - broadcast to all frontends)
//
// Frontends share prefix-cache observations so a request routed to any replica

View File

@@ -0,0 +1,53 @@
package modeladmin
import (
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/xlog"
)
// opDelete is the CacheInvalidateEvent.Op value the gallery delete path and the
// admin delete endpoint use; a delete must prune (a reload-from-path cannot).
const opDelete = "delete"
// ApplyRemoteChange refreshes this replica's in-memory model state from a peer
// replica's model-config change broadcast (messaging.CacheInvalidateEvent on
// SubjectCacheInvalidateModels). It is the subscriber-side counterpart to
// GalleryService.BroadcastModelsChanged.
//
// The op matters because LoadModelConfigsFromPath is additive: it loads every
// YAML on disk into the loader but never removes an entry whose file is gone.
// So a delete cannot be propagated by a plain reload - the deleted element must
// be explicitly pruned. Specifically:
//
// - op == "delete" with a named element: prune that element from the loader.
// - otherwise: reload all configs from disk (picks up creates and edits).
//
// In both cases, when an element is named, any running instance on this replica
// is shut down (best-effort) so the next request rebuilds it from the new
// config instead of serving the stale one - mirroring what the originating
// replica does on a local edit/delete.
//
// ml may be nil (no running instances to shut down). modelsPath and opts are
// forwarded to LoadModelConfigsFromPath.
func ApplyRemoteChange(cl *config.ModelConfigLoader, ml *model.ModelLoader, modelsPath string, evt messaging.CacheInvalidateEvent, opts ...config.ConfigLoaderOption) error {
if evt.Op == opDelete && evt.Element != "" {
cl.RemoveModelConfig(evt.Element)
} else if err := cl.LoadModelConfigsFromPath(modelsPath, opts...); err != nil {
return err
}
// Drop any running instance of the affected model so the next request
// rebuilds it from the refreshed config instead of serving the stale one.
// Best-effort: the model may not be loaded on this replica, which surfaces
// as a benign error here.
if ml != nil && evt.Element != "" {
if err := ml.ShutdownModel(evt.Element); err != nil {
xlog.Debug("ApplyRemoteChange: could not shut down model instance (likely not loaded)",
"model", evt.Element, "error", err)
}
}
return nil
}

View File

@@ -0,0 +1,80 @@
package modeladmin
import (
"os"
"path/filepath"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
"gopkg.in/yaml.v3"
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/services/messaging"
)
var _ = Describe("ApplyRemoteChange", func() {
var (
dir string
loader *config.ModelConfigLoader
)
BeforeEach(func() {
dir = GinkgoT().TempDir()
loader = config.NewModelConfigLoader(dir)
})
writeYAML := func(name string, body map[string]any) {
body["name"] = name
data, err := yaml.Marshal(body)
Expect(err).ToNot(HaveOccurred())
Expect(os.WriteFile(filepath.Join(dir, name+".yaml"), data, 0644)).To(Succeed())
}
It("loads a peer-created config from disk on an install event", func() {
// Peer wrote the YAML to the shared models dir; this replica has not
// loaded it yet (empty in-memory loader).
writeYAML("peer-alias", map[string]any{"alias": "qwen"})
_, ok := loader.GetModelConfig("peer-alias")
Expect(ok).To(BeFalse(), "precondition: not yet in memory")
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{
Element: "peer-alias", Op: "install",
})
Expect(err).ToNot(HaveOccurred())
_, ok = loader.GetModelConfig("peer-alias")
Expect(ok).To(BeTrue(), "install event must reload the new config from disk")
})
It("prunes a peer-deleted config that a reload-from-path cannot drop", func() {
// Model is present in memory (loaded earlier) but its file is now gone
// from the shared dir. LoadModelConfigsFromPath is additive, so only an
// explicit prune can remove it - this is the cross-replica delete bug.
writeYAML("doomed", map[string]any{"alias": "qwen"})
Expect(loader.LoadModelConfigsFromPath(dir)).To(Succeed())
_, ok := loader.GetModelConfig("doomed")
Expect(ok).To(BeTrue(), "precondition: in memory")
Expect(os.Remove(filepath.Join(dir, "doomed.yaml"))).To(Succeed())
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{
Element: "doomed", Op: "delete",
})
Expect(err).ToNot(HaveOccurred())
_, ok = loader.GetModelConfig("doomed")
Expect(ok).To(BeFalse(), "delete event must prune the element from memory")
})
It("does a full reload when no element is named", func() {
writeYAML("m1", map[string]any{"alias": "qwen"})
writeYAML("m2", map[string]any{"alias": "qwen"})
err := ApplyRemoteChange(loader, nil, dir, messaging.CacheInvalidateEvent{})
Expect(err).ToNot(HaveOccurred())
_, ok1 := loader.GetModelConfig("m1")
_, ok2 := loader.GetModelConfig("m2")
Expect(ok1).To(BeTrue())
Expect(ok2).To(BeTrue())
})
})

View File

@@ -673,6 +673,49 @@ func (r *NodeRegistry) Get(ctx context.Context, nodeID string) (*BackendNode, er
return &node, nil
}
// GetWithExtras returns a single node enriched with the same computed fields as
// ListWithExtras (labels, loaded-model count, in-flight total). The plain Get
// returns a bare BackendNode whose Labels live in a separate table, so the node
// detail view needs this to show a node's existing labels and live counts.
func (r *NodeRegistry) GetWithExtras(ctx context.Context, nodeID string) (*NodeWithExtras, error) {
node, err := r.Get(ctx, nodeID)
if err != nil {
return nil, err
}
labels := make(map[string]string)
nodeLabels, err := r.GetNodeLabels(ctx, nodeID)
if err != nil {
xlog.Warn("GetWithExtras: failed to get labels", "node", nodeID, "error", err)
} else {
for _, l := range nodeLabels {
labels[l.Key] = l.Value
}
}
var modelCount int64
if err := r.db.WithContext(ctx).Model(&NodeModel{}).
Where("node_id = ? AND state = ?", nodeID, "loaded").
Count(&modelCount).Error; err != nil {
xlog.Warn("GetWithExtras: failed to get model count", "node", nodeID, "error", err)
}
var inFlight struct{ Total int }
if err := r.db.WithContext(ctx).Model(&NodeModel{}).
Select("COALESCE(SUM(in_flight), 0) as total").
Where("node_id = ? AND state IN ?", nodeID, []string{"loaded", "unloading"}).
Scan(&inFlight).Error; err != nil {
xlog.Warn("GetWithExtras: failed to get in-flight count", "node", nodeID, "error", err)
}
return &NodeWithExtras{
BackendNode: *node,
ModelCount: int(modelCount),
InFlightCount: inFlight.Total,
Labels: labels,
}, nil
}
// GetByName returns a single node by name.
func (r *NodeRegistry) GetByName(ctx context.Context, name string) (*BackendNode, error) {
var node BackendNode

View File

@@ -646,6 +646,38 @@ var _ = Describe("NodeRegistry", func() {
})
})
Describe("GetWithExtras", func() {
It("returns the node enriched with its labels map", func() {
node := makeNode("extras-node", "10.0.0.80:50051", 8_000_000_000)
Expect(registry.Register(context.Background(), node, true)).To(Succeed())
Expect(registry.SetNodeLabel(context.Background(), node.ID, "env", "prod")).To(Succeed())
Expect(registry.SetNodeLabel(context.Background(), node.ID, "region", "us-east")).To(Succeed())
got, err := registry.GetWithExtras(context.Background(), node.ID)
Expect(err).ToNot(HaveOccurred())
Expect(got).ToNot(BeNil())
Expect(got.ID).To(Equal(node.ID))
Expect(got.Name).To(Equal("extras-node"))
Expect(got.Labels).To(Equal(map[string]string{"env": "prod", "region": "us-east"}))
})
It("returns an empty (non-nil) labels map when the node has none", func() {
node := makeNode("extras-no-labels", "10.0.0.81:50051", 8_000_000_000)
Expect(registry.Register(context.Background(), node, true)).To(Succeed())
got, err := registry.GetWithExtras(context.Background(), node.ID)
Expect(err).ToNot(HaveOccurred())
Expect(got).ToNot(BeNil())
Expect(got.Labels).ToNot(BeNil())
Expect(got.Labels).To(BeEmpty())
})
It("returns an error for an unknown node", func() {
_, err := registry.GetWithExtras(context.Background(), "does-not-exist")
Expect(err).To(HaveOccurred())
})
})
Describe("FindNodesBySelector", func() {
It("returns nodes matching all labels in selector", func() {
n1 := makeNode("sel-match", "10.0.0.80:50051", 8_000_000_000)

View File

@@ -0,0 +1,13 @@
package quantization
import (
"testing"
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
)
func TestQuantization(t *testing.T) {
RegisterFailHandler(Fail)
RunSpecs(t, "Quantization Suite")
}

View File

@@ -17,6 +17,9 @@ import (
"github.com/mudler/LocalAI/core/config"
"github.com/mudler/LocalAI/core/gallery/importers"
"github.com/mudler/LocalAI/core/schema"
"github.com/mudler/LocalAI/core/services/distributed"
"github.com/mudler/LocalAI/core/services/messaging"
"github.com/mudler/LocalAI/core/services/syncstate"
pb "github.com/mudler/LocalAI/pkg/grpc/proto"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/utils"
@@ -30,26 +33,63 @@ type QuantizationService struct {
modelLoader *model.ModelLoader
configLoader *config.ModelConfigLoader
mu sync.Mutex
jobs map[string]*schema.QuantizationJob
// mu serializes the read-modify-write of job values. The SyncedMap guards its
// own map structure, but a job is a pointer mutated in place (e.g. the import
// goroutine), so the service still needs a lock to keep those field updates and
// the subsequent Set atomic with respect to readers.
mu sync.Mutex
// jobs is the cross-replica job store: an in-memory map kept consistent across
// replicas via NATS, optionally read-through to PostgreSQL in distributed mode.
jobs *syncstate.SyncedMap[string, *schema.QuantizationJob]
}
// NewQuantizationService creates a new QuantizationService.
// NewQuantizationService creates a new QuantizationService. In distributed mode
// pass the shared NATS client and PostgreSQL store so jobs stay consistent across
// replicas; pass nil for both in standalone mode, where the disk Loader hydrates
// the map and there is nothing to broadcast.
func NewQuantizationService(
appConfig *config.ApplicationConfig,
modelLoader *model.ModelLoader,
configLoader *config.ModelConfigLoader,
nats messaging.MessagingClient,
store *distributed.QuantStore,
) *QuantizationService {
s := &QuantizationService{
appConfig: appConfig,
modelLoader: modelLoader,
configLoader: configLoader,
jobs: make(map[string]*schema.QuantizationJob),
}
s.loadAllJobs()
// Only attach a Store interface when a concrete store exists, otherwise the
// SyncedMap would see a non-nil interface wrapping a nil pointer and try to
// hydrate/write through a nil DB.
var syncStore syncstate.Store[string, *schema.QuantizationJob]
if store != nil {
syncStore = &quantStoreAdapter{store: store}
}
s.jobs = syncstate.New(syncstate.Config[string, *schema.QuantizationJob]{
Name: "quant.jobs",
Key: func(j *schema.QuantizationJob) string { return j.ID },
Nats: nats,
Store: syncStore,
Loader: s.loadJobsFromDisk, // ignored when Store is set (distributed mode)
})
// Hydrate + subscribe. A hydrate failure must not take the server down: log and
// continue degraded (standalone), mirroring the FineTune/OpCache wiring.
if err := s.jobs.Start(appConfig.Context); err != nil {
xlog.Warn("Quantization SyncedMap start failed; running degraded", "error", err)
}
return s
}
// Close releases the SyncedMap subscription and background workers.
func (s *QuantizationService) Close() error {
return s.jobs.Close()
}
// quantizationBaseDir returns the base directory for quantization job data.
func (s *QuantizationService) quantizationBaseDir() string {
return filepath.Join(s.appConfig.DataPath, "quantization")
@@ -80,15 +120,18 @@ func (s *QuantizationService) saveJobState(job *schema.QuantizationJob) {
}
}
// loadAllJobs scans the quantization directory for persisted jobs and loads them.
func (s *QuantizationService) loadAllJobs() {
// loadJobsFromDisk scans the quantization directory for persisted jobs and
// returns them. It is the SyncedMap Loader used in standalone mode (no DB); the
// returned slice hydrates the map on Start.
func (s *QuantizationService) loadJobsFromDisk(_ context.Context) ([]*schema.QuantizationJob, error) {
baseDir := s.quantizationBaseDir()
entries, err := os.ReadDir(baseDir)
if err != nil {
// Directory doesn't exist yet — that's fine
return
// Directory doesn't exist yet — that's fine, start empty.
return nil, nil
}
var jobs []*schema.QuantizationJob
for _, entry := range entries {
if !entry.IsDir() {
continue
@@ -117,12 +160,13 @@ func (s *QuantizationService) loadAllJobs() {
job.ImportMessage = "Server restarted while import was running"
}
s.jobs[job.ID] = &job
jobs = append(jobs, &job)
}
if len(s.jobs) > 0 {
xlog.Info("Loaded persisted quantization jobs", "count", len(s.jobs))
if len(jobs) > 0 {
xlog.Info("Loaded persisted quantization jobs", "count", len(jobs))
}
return jobs, nil
}
// StartJob starts a new quantization job.
@@ -188,7 +232,12 @@ func (s *QuantizationService) StartJob(ctx context.Context, userID string, req s
CreatedAt: time.Now().UTC().Format(time.RFC3339),
Config: &req,
}
s.jobs[jobID] = job
// Set write-through persists to PostgreSQL (distributed) and broadcasts to
// peer replicas; the disk state.json is written separately for restart
// recovery / standalone hydrate.
if err := s.jobs.Set(ctx, job); err != nil {
return nil, fmt.Errorf("failed to persist job: %w", err)
}
s.saveJobState(job)
return &schema.QuantizationJobResponse{
@@ -203,7 +252,7 @@ func (s *QuantizationService) GetJob(userID, jobID string) (*schema.Quantization
s.mu.Lock()
defer s.mu.Unlock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
return nil, fmt.Errorf("job not found: %s", jobID)
}
@@ -219,7 +268,7 @@ func (s *QuantizationService) ListJobs(userID string) []*schema.QuantizationJob
defer s.mu.Unlock()
var result []*schema.QuantizationJob
for _, job := range s.jobs {
for _, job := range s.jobs.List() {
if userID == "" || job.UserID == userID {
result = append(result, job)
}
@@ -235,7 +284,7 @@ func (s *QuantizationService) ListJobs(userID string) []*schema.QuantizationJob
// StopJob stops a running quantization job.
func (s *QuantizationService) StopJob(ctx context.Context, userID, jobID string) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -256,6 +305,9 @@ func (s *QuantizationService) StopJob(ctx context.Context, userID, jobID string)
s.mu.Lock()
job.Status = "stopped"
job.Message = "Quantization stopped by user"
if err := s.jobs.Set(ctx, job); err != nil {
xlog.Warn("Failed to persist stopped job", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
@@ -265,7 +317,7 @@ func (s *QuantizationService) StopJob(ctx context.Context, userID, jobID string)
// DeleteJob removes a quantization job and its associated data from disk.
func (s *QuantizationService) DeleteJob(userID, jobID string) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -289,7 +341,11 @@ func (s *QuantizationService) DeleteJob(userID, jobID string) error {
}
importModelName := job.ImportModelName
delete(s.jobs, jobID)
// Delete write-through removes the DB row (distributed) and broadcasts the
// removal to peer replicas. DeleteJob has no ctx, so use Background.
if err := s.jobs.Delete(context.Background(), jobID); err != nil {
xlog.Warn("Failed to delete job from store", "job_id", jobID, "error", err)
}
s.mu.Unlock()
// Remove job directory (state.json, output files)
@@ -324,7 +380,7 @@ func (s *QuantizationService) DeleteJob(userID, jobID string) error {
// StreamProgress opens a gRPC progress stream and calls the callback for each update.
func (s *QuantizationService) StreamProgress(ctx context.Context, userID, jobID string, callback func(event *schema.QuantizationProgressEvent)) error {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return fmt.Errorf("job not found: %s", jobID)
@@ -353,7 +409,7 @@ func (s *QuantizationService) StreamProgress(ctx context.Context, userID, jobID
}, func(update *pb.QuantizationProgressUpdate) {
// Update job status and persist
s.mu.Lock()
if j, ok := s.jobs[jobID]; ok {
if j, ok := s.jobs.Get(jobID); ok {
// Don't let progress updates overwrite terminal states
isTerminal := j.Status == "stopped" || j.Status == "completed" || j.Status == "failed"
if !isTerminal {
@@ -365,6 +421,9 @@ func (s *QuantizationService) StreamProgress(ctx context.Context, userID, jobID
if update.OutputFile != "" {
j.OutputFile = update.OutputFile
}
if err := s.jobs.Set(ctx, j); err != nil {
xlog.Warn("Failed to persist progress update", "job_id", jobID, "error", err)
}
s.saveJobState(j)
}
s.mu.Unlock()
@@ -399,7 +458,7 @@ func sanitizeQuantModelName(s string) string {
// ImportModel imports a quantized model into LocalAI asynchronously.
func (s *QuantizationService) ImportModel(ctx context.Context, userID, jobID string, req schema.QuantizationImportRequest) (string, error) {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return "", fmt.Errorf("job not found: %s", jobID)
@@ -459,6 +518,9 @@ func (s *QuantizationService) ImportModel(ctx context.Context, userID, jobID str
job.ImportStatus = "importing"
job.ImportMessage = ""
job.ImportModelName = ""
if err := s.jobs.Set(ctx, job); err != nil {
xlog.Warn("Failed to persist import start", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
@@ -514,10 +576,15 @@ func (s *QuantizationService) ImportModel(ctx context.Context, userID, jobID str
xlog.Info("Quantized model imported and registered", "job_id", jobID, "model_name", modelName)
// Runs after the HTTP request returns, so use Background rather than the
// (now likely cancelled) request ctx for the write-through.
s.mu.Lock()
job.ImportStatus = "completed"
job.ImportModelName = modelName
job.ImportMessage = ""
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist import completion", "job_id", jobID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}()
@@ -525,10 +592,14 @@ func (s *QuantizationService) ImportModel(ctx context.Context, userID, jobID str
return modelName, nil
}
// setImportMessage updates the import message and persists the job state.
// setImportMessage updates the import message and persists the job state. Called
// from the background import goroutine, so it uses Background for write-through.
func (s *QuantizationService) setImportMessage(job *schema.QuantizationJob, msg string) {
s.mu.Lock()
job.ImportMessage = msg
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist import message", "job_id", job.ID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}
@@ -539,6 +610,9 @@ func (s *QuantizationService) setImportFailed(job *schema.QuantizationJob, messa
s.mu.Lock()
job.ImportStatus = "failed"
job.ImportMessage = message
if err := s.jobs.Set(context.Background(), job); err != nil {
xlog.Warn("Failed to persist import failure", "job_id", job.ID, "error", err)
}
s.saveJobState(job)
s.mu.Unlock()
}
@@ -546,7 +620,7 @@ func (s *QuantizationService) setImportFailed(job *schema.QuantizationJob, messa
// GetOutputPath returns the path to the quantized model file and a download name.
func (s *QuantizationService) GetOutputPath(userID, jobID string) (string, string, error) {
s.mu.Lock()
job, ok := s.jobs[jobID]
job, ok := s.jobs.Get(jobID)
if !ok {
s.mu.Unlock()
return "", "", fmt.Errorf("job not found: %s", jobID)

Some files were not shown because too many files have changed in this diff Show More