LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-13 19:27:48 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	f4036fa83f	ci(python-backends): add weekly DEPS_REFRESH cache-buster The shared backend/Dockerfile.python ends in: RUN cd /${BACKEND} && PORTABLE_PYTHON=true make which `pip install`s each backend's requirements*.txt. A scan of all 34 Python backends shows every single one ships at least some unpinned deps (torch, transformers, vllm, diffusers, ...). With the registry cache now enabled, that `make` layer's BuildKit hash depends only on Dockerfile instructions + COPYed source — not on what pip resolves at runtime — so a warm cache would freeze upstream versions indefinitely. DEPS_REFRESH is an ARG declared right before that RUN. backend_build.yml computes `date -u +%Y-W%V` (ISO week, e.g. `2026-W17`) and passes it as a build-arg, so the install layer invalidates at most once per week and re-resolves PyPI / nightly indexes. Within a week, builds stay warm. Only Dockerfile.python is affected: Go (go.sum) and Rust (Cargo.lock) already lock their deps, and the C++ backends pull gRPC at a pinned tag and llama.cpp at a pinned commit. Add .agents/ci-caching.md documenting the cache layout (quay.io/go-skynet/ci-cache:cache<tag-suffix>), read/write semantics (master writes, PRs read-only), DEPS_REFRESH semantics, and how to manually evict tags. Index it from AGENTS.md (CLAUDE.md is a symlink). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7-1m	2026-04-27 14:21:11 +00:00
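The weekly cache-buster described in f4036fa83f can be sketched as follows. This is an illustrative Python sketch, not the workflow's actual code: the function name `deps_refresh_key` is hypothetical, and it only mirrors what `date -u +%Y-W%V` prints (calendar year plus ISO 8601 week number).

```python
from datetime import datetime, timezone


def deps_refresh_key(now: datetime) -> str:
    """Return a key such as '2026-W17' that changes at most once per ISO week.

    Mirrors `date -u +%Y-W%V`: calendar year (%Y) plus ISO week number (%V).
    The function name is hypothetical and used for illustration only.
    """
    utc = now.astimezone(timezone.utc)
    iso_week = utc.isocalendar()[1]  # ISO 8601 week number, 1..53
    return f"{utc.year}-W{iso_week:02d}"
```

Passing the returned string as the `DEPS_REFRESH` build argument would change the install layer's cache key only when the ISO week rolls over, so builds within the same week keep hitting the warm cache.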
Ettore Di Giacinto	3810fe1a1e	fix(distributed): worker container healthcheck always unhealthy The Dockerfile's HEALTHCHECK probes http://localhost:8080/readyz, which is the OpenAI API server port. When the same image runs as a worker, it listens on the gRPC base port (50051) and an HTTP file transfer server on port-1 (50050) — nothing on 8080 — so docker always reports the container as unhealthy. Add unauthenticated /readyz and /healthz endpoints to the worker's HTTP file transfer server, and override HEALTHCHECK_ENDPOINT for worker-1 in the distributed compose file. Disable the healthcheck for agent-worker since it is NATS-only and exposes no HTTP server. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-27 13:51:57 +00:00
Ettore Di Giacinto	bdfa5e934a	ci: switch image/backend build cache to a dedicated registry image - Switch cache-from/cache-to in backend_build.yml and image_build.yml from the unused gha cache to type=registry pointing at quay.io/go-skynet/ci-cache:cache<tag-suffix>, mode=max with ignore-error=true. Master/tag builds populate their own per-matrix-entry cache; PR builds read-only. - Drop the broken generate_grpc_cache.yaml cron. It targeted a `grpc` Dockerfile stage that was removed by `b1fc5acd` in July 2025, has been failing every night since, and never populated the gha cache. The new registry-cache scheme is self-warming, so no separate populator is needed. - Remove the dead GRPC_VERSION / GRPC_BASE_IMAGE / GRPC_MAKEFLAGS build-args from image_build.yml and the orphan ARG GRPC_BASE_IMAGE in the root Dockerfile (the root Dockerfile no longer compiles gRPC; the source build now lives in backend/Dockerfile.{llama-cpp, ik-llama-cpp, turboquant} only and uses its own ARG defaults). - Drop the unused grpc-base-image input from image_build.yml plus the matrix passthroughs in image.yml / image-pr.yml. - Drop the unused GRPC_VERSION env in test.yml. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7-1m	2026-04-27 13:13:04 +00:00
Richard Palethorpe	deca6dbdad	feat: Log backend exit code (#9581) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-27 14:19:18 +02:00
Ettore Di Giacinto	60549a8a60	feat(react-ui): page-width archetype system + mobile/tablet nav polish Replace the universal max-width:1200px cap on .page with a four-tier archetype system (narrow 760, medium 1080, default 1600, wide unbounded) selected per page based on what its UX actually wants. Data/table pages fill ultrawide displays; forms cap at reading width; tabbed feature surfaces breathe. Mobile/tablet: - New 640/1024 breakpoint split. Tablets (640-1023) get a persistent 52px icon rail; below 640 keeps the slide-off drawer. - Drawer polish: body-scroll lock, Escape to close, focus moves into the drawer on open and back to the hamburger on close, aria-hidden + inert on main while open. - Mobile top bar carries hamburger + theme toggle + account avatar (44x44 touch targets) so theme/account aren't trapped in the drawer. - Page-level reflow on phones: page-header column-stacks, filter chips scroll horizontally, tables go edge-to-edge, OperationsBar overflows rather than wrapping. Honors prefers-reduced-motion. Manage > Models: drop the toggle column; Enable/Disable joins the per-row Actions menu alongside Stop/Pin/Edit/Logs/Delete for consistency with the other action verbs. Page-width tokens live in theme.css so future tuning is one line. Removes 7 inline maxWidth workarounds from page roots. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude Code:claude-opus-4-7 [Edit] [Bash]	2026-04-27 11:51:29 +00:00
Ettore Di Giacinto	54728e292f	feat(react-ui): split Manage backends toggle into Variants and Development Meta backends are now always shown — they're the entries operators configure against — and two independent toggles govern the noise around them. "Variants" hides platform-specific concrete builds that a meta backend aliases on the host (e.g. llama-cpp-cuda12-12.4). "Development" hides pre-release `-development` builds. Each toggle shows the count of items currently hidden in its category. The legacy `bm` URL flag is honored on read so existing deep-links resolve to the same view they used to. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-27 08:23:53 +00:00
Tai An	86fd62233f	fix(gallery): correct Qwen3.5 typo in qwen3.5-27b-claude-4.6 model override (closes #9362) (#9580) The overrides.parameters.model field referenced 'Qwen3.-27B-Claude-...' (missing the '5'), so model loads failed because the configured filename did not match the file actually downloaded by the entry's files: list ('Qwen3.5-27B-Claude-...'). Aligns the override filename with the files: entries and with the upstream HF repo (mradermacher/Qwen3.5-27B-...).	2026-04-27 09:24:00 +02:00
Alex Brick	41ed8ced70	[intel GPU support] Use latest oneapi-basekit image for Intel images to support b70 (in more places this time) (#9578) Update additional intel base images	2026-04-27 09:18:57 +02:00
LocalAI [bot]	05e94bd9e7	chore: ⬆️ Update ggml-org/llama.cpp to `f53577432541bb9edc1588c4ef45c66bf07e4468` (#9577) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-27 08:57:24 +02:00
Ettore Di Giacinto	8d124d080f	feat(gallery): add whisper-development umbrella stanza Mirrors the whisper capabilities map with -development variants so clients can pull the master-tagged whisper.cpp backend via a single platform-resolved name, matching the existing faster-whisper-development and whisperx-development entries. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-26 23:04:27 +00:00
Ettore Di Giacinto	2da1a4d230	feat(distributed): per-node backend installation from the gallery In distributed mode the Backends gallery used to fan every install out to every worker — fine for auto-resolving (meta) backends like llama-cpp where each node picks its own variant, but wrong for hardware-specific builds like cpu-llama-cpp that would silently land on every GPU node. Adds a node-targeted install path through the existing POST /api/nodes/:id/backends/install plumbing, with two entry points: - Backends gallery row gets a split-button in distributed mode. Auto- resolving keeps "Install on all nodes" as the primary; chevron menu opens the picker. Hardware-specific routes the primary directly to the picker — no fan-out path on the row. - Nodes-page drawer gets a "+ Add backend" button that navigates to /app/backends?target=<node-id>; the gallery scopes itself to that node (banner, single per-row install button, Reinstall/Remove for already- installed). One gallery, two scopes — no second UI to maintain. The picker (new NodeInstallPicker) shows a 3-state suitability column (Compatible / Override / Installed), an auto-expanding variant override disclosure that fires when selected nodes have no working GPU, parallel per-node installs with inline status and Retry-failed-nodes, and a mismatch confirm that names the consequence on the button itself. A 409 fan-out guard on /api/backends/apply protects CLI/Terraform/script users from the same footgun: hardware-specific installs in distributed mode now return code "concrete_backend_requires_target" with a human- readable error and a meta_alternative pointer. The gallery list payload now surfaces capabilities, metaBackendFor and per-row nodes (NodeBackendRef) so the picker and the new Nodes column have everything they need without re-walking the gallery client-side. 
GODEBUG=netdns=go is set on the compose services because the cgo DNS resolver follows the container's nsswitch.conf to host systemd-resolved (127.0.0.53), unreachable from inside the container; the pure-Go resolver reads /etc/resolv.conf directly and uses Docker's embedded DNS. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude Code:claude-opus-4-7[1m] [Edit] [Bash] [Read] [Write]	2026-04-26 22:05:18 +00:00
Ettore Di Giacinto	988430c850	test(react-ui): drive Manage page Backend logs link via the new kebab menu Manage page row actions moved into ActionMenu in `b336d9c6`, so the inline `<a title="Backend logs">` the e2e specs were asserting on no longer exists. Open the row's kebab and assert against the menuitem. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7	2026-04-26 20:51:01 +00:00
Ettore Di Giacinto	b336d9c626	feat(react-ui): polish Manage page with kebab menus and gallery rows Bring the System / Manage page up to the visual standard of the Install gallery so installed models and backends stop reading like a debug dump. - Unified ResourceRow anatomy (icon, name+description, badges, status, expandable detail) shared across both tabs. - Gallery enrichment cross-references installed names against the gallery list endpoints to surface icons, descriptions, license, tags, and links with a graceful "no description" fallback for custom imports. - Header summary with four StatCards (Models / Backends / Running / Updates) — clickable to switch tab + pre-set filter. - Backends meta + development entries hidden by default; "Show meta & development" paired toggle in the FilterBar with hidden-count hint. - Kebab (three-dot) ActionMenu replaces the inline button cluster on every row; restrained until hover, keyboard-navigable, danger items separated by a divider. - Backend "Version" cell now falls back to short digest, OCI tag, or ocifile basename when no semver is set, instead of showing "—" for every OCI install. Detail panel exposes full Source URI + Digest. - Drop redundant column headers ("Actions", "On") — kebabs and toggles carry their own affordance; screen readers still get a label. - Inline System / User / Meta / Dev badges next to the backend name so the dedicated Type column doesn't reserve space for "USER" repeated. - Tightened the spacing between the System Resources card and the StatCards so they no longer crowd the RAM bar. Extracted StatCard and GalleryLoader from Nodes.jsx and Models.jsx into shared components so the visual language is one source of truth. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude Code:claude-opus-4-7 [Read] [Edit] [Write] [Bash]	2026-04-26 20:33:49 +00:00
Ettore Di Giacinto	f384c64a91	fix(model-loader): also skip .ckpt, .zip, and .tag files when scanning models The local model directory scan treats every non-skipped file as a model config candidate. Sidecar artifacts that ship alongside checkpoints (checkpoint blobs, downloaded archives, ggml-style tag files) were slipping through and showing up as bogus models in the listing. Add their extensions to the suffix-skip list. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code]	2026-04-26 19:37:53 +00:00
Ettore Di Giacinto	e9d8e92988	fix(react-ui): don't yank chat scroll to bottom while user is reading The chat and agent-chat pages auto-scrolled to the bottom on every streamed token. If the user scrolled up to re-read part of a response, the next chunk pulled them back down — making long replies unreadable while streaming. Track a stickToBottomRef on each scroll event: if the user is within 80px of the bottom we keep auto-scrolling, otherwise we leave them where they are. On chat switch we snap back to the bottom and re-pin. Same fix applied to both Chat.jsx and AgentChat.jsx since they share the same streaming pattern. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code]	2026-04-26 19:35:39 +00:00
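The stick-to-bottom rule described in e9d8e92988 reduces to one distance check per scroll event. The sketch below expresses it in Python for illustration only; the real change lives in Chat.jsx and AgentChat.jsx, and the function name, parameter names, and the inclusive comparison are assumptions. Only the 80px threshold comes from the commit message.

```python
STICK_THRESHOLD_PX = 80  # distance from the bottom quoted in the commit message


def should_stick_to_bottom(scroll_top: float, client_height: float,
                           scroll_height: float,
                           threshold: float = STICK_THRESHOLD_PX) -> bool:
    """Return True when the viewport is within `threshold` px of the bottom.

    Hypothetical helper: the arguments correspond to the DOM properties
    scrollTop, clientHeight and scrollHeight of the scroll container.
    """
    distance_from_bottom = scroll_height - (scroll_top + client_height)
    return distance_from_bottom <= threshold
```

A caller would store the result on every scroll event and auto-scroll on a new streamed chunk only while the stored value is true.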
Ettore Di Giacinto	5b0196c7d0	fix(whisper): scrub invalid UTF-8 from segment text before protobuf marshal whisper.cpp can emit bytes that are not valid UTF-8 — typically a multibyte codepoint split across token boundaries. protobuf string fields reject those at marshal time, which would surface as a transcribe failure. Run strings.ToValidUTF8 on the segment text before it leaves the cgo boundary so the bad byte gets replaced with U+FFFD. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code]	2026-04-26 19:35:39 +00:00
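The scrubbing step in 5b0196c7d0 can be illustrated with a Python analogue. This is a sketch of the idea, not the backend's code (which, per the commit message, calls Go's strings.ToValidUTF8); the function name is hypothetical. One caveat: for longer runs of invalid bytes Python's decoder may emit more than one U+FFFD, so the two are not byte-for-byte equivalent.

```python
def scrub_invalid_utf8(raw: bytes) -> str:
    """Decode bytes as UTF-8, replacing undecodable bytes with U+FFFD.

    Hypothetical helper showing the same idea as the Go fix: a multibyte
    codepoint cut off at a token boundary becomes a replacement character
    instead of making the whole string unserializable.
    """
    return raw.decode("utf-8", errors="replace")
```

For example, `scrub_invalid_utf8(b"caf\xc3")` yields `"caf\ufffd"`, because `\xc3` is the first byte of a two-byte sequence whose second byte is missing.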
Ettore Di Giacinto	c8d63a1003	fix(react-ui): stop Manage page from blanking on auto-refresh; show real model use cases - useModels.refetch now runs silently — distributed-mode 10s auto-refresh no longer flips loading=true and replaces the table with a spinner card. - Manage Use Cases column derives badges from each model's actual capabilities (Chat / Image / TTS / Embeddings / etc.) instead of hardcoding a "Chat" link for every row. - FilterBar right slot is right-aligned via margin-left:auto so the Update button lives at the end of the row, not next to the chips. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code]	2026-04-26 19:35:39 +00:00
LocalAI [bot]	d9cb0d6133	chore: ⬆️ Update ggml-org/llama.cpp to `dcad77cc3b0865153f486327064fb0320a57a476` (#9572) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-26 12:38:35 +02:00
LocalAI [bot]	f5c268deac	chore: ⬆️ Update TheTom/llama-cpp-turboquant to `11a241d0db78a68e0a5b99fe6f36de6683100f6a` (#9571) ⬆️ Update TheTom/llama-cpp-turboquant Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-26 12:38:25 +02:00
Tai An	8931a2ad31	fix(gallery): normalize inconsistent tag casing/plurals across gallery models (#9574 ) - embeddings → embedding (6 models): aligns with the WebUI filter button defined in core/http/views/models.html ({ term: 'embedding', ... }), so models like nomic-embed-text-v1.5 now appear under the Embedding filter - TTS → tts (5 models), ASR → asr (2 models): lowercase, per existing convention used by 161+ models - CPU/Cpu → cpu (17 models), GPU → gpu (17 models): lowercase, per existing convention used by 666+ models - dedupe duplicate tag entries on 3 models that already had repeated tags (gpt-oss-20b had gguf x2; arcee-ai/AFM-4.5B had gpu x2; one Qwen model had default x2) Closes #9247	2026-04-26 08:33:38 +02:00
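The tag cleanup in 8931a2ad31 amounts to three rules: lowercase, map plural aliases to the singular form, and drop repeats. The sketch below is an assumption-laden illustration, not the script used for the commit: the alias table holds only the one rename named in the message, and lowercasing every tag (rather than just TTS/ASR/CPU/GPU) is a simplification.

```python
from typing import List

# Hypothetical alias table built from the rename listed in the commit message.
TAG_ALIASES = {"embeddings": "embedding"}


def normalize_tags(tags: List[str]) -> List[str]:
    """Lowercase tags, apply aliases, and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        canonical = tag.strip().lower()
        canonical = TAG_ALIASES.get(canonical, canonical)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result
```

Keeping first-seen order means a gallery entry's tag list changes only where a tag was actually renamed or duplicated.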
Ettore Di Giacinto	e16e758dff	ci(backends): build cpu-whisperx and cpu-faster-whisper for linux/arm64 (#9573) Extend the existing CPU build matrix entries to produce a multi-arch manifest (linux/amd64,linux/arm64) at the same image tags. arm64 Linux hosts without an NVIDIA GPU report the "default" capability, which already maps to cpu-whisperx / cpu-faster-whisper in backend/index.yaml -- so the manifest list lets Docker pull the right variant without any gallery changes. Both stacks install cleanly under aarch64: torch (2.4.1/2.8.0), faster-whisper, ctranslate2, whisperx, opencv-python and the remaining deps all ship manylinux2014_aarch64 wheels, so no source builds run under QEMU emulation. Follows the same pattern already used by cpu-llama-cpp-quantization. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-26 08:30:03 +02:00
LocalAI [bot]	1c45227346	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `3a945af45d45936341a45bbf7deda56776a4af26` (#9570) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-26 08:26:37 +02:00
Ettore Di Giacinto	fbe4f0a99b	fix(docs): replace Docsy `alert` shortcode with Relearn `notice` The docs site uses the hugo-theme-relearn theme, which provides `notice` instead of Docsy's `alert`. The face-recognition, voice-recognition, and stores feature pages used `{{% alert %}}`, breaking `hugo build` with "template for shortcode \"alert\" not found". Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-25 21:04:31 +00:00
Ettore Di Giacinto	d733c9cd13	fix(mlx-vlm): pin upstream to v0.4.4 to unblock CUDA builds (#9568) Blaizzy/mlx-vlm git HEAD bumped its constraint to mlx>=0.31.2, but mlx-cuda-12 and mlx-cuda-13 are only published up to 0.31.1 on PyPI. Since mlx[cudaXX]==0.31.2 forces a sibling wheel that doesn't exist, pip backtracks through every older mlx[cudaXX], none of which satisfy mlx>=0.31.2, producing ResolutionImpossible. Pin all variants to the v0.4.4 tag (mlx>=0.30.0), which resolves cleanly against mlx[cuda13]==0.31.1. cpu/mps weren't broken yet but are pinned for consistency. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-25 22:06:01 +02:00
Ettore Di Giacinto	703b4fcae8	Change cron schedule to run every 12 hours Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-04-25 18:38:28 +02:00
Richard Palethorpe	73aacad2f9	fix(vllm): drop flash-attn wheel to avoid torch 2.10 ABI mismatch (#9557 ) The pinned flash-attn 2.8.3+cu12torch2.7 wheel breaks at import time once vllm 0.19.1 upgrades torch to its hard-pinned 2.10.0: ImportError: .../flash_attn_2_cuda...so: undefined symbol: _ZN3c104cuda29c10_cuda_check_implementationEiPKcS2_ib That C10 CUDA symbol is libtorch-version-specific. Dao-AILab has not yet published flash-attn wheels for torch 2.10 -- the latest release (2.8.3) tops out at torch 2.8 -- so any wheel pinned here is silently ABI-broken the moment vllm completes its install. vllm 0.19.1 lists flashinfer-python==0.6.6 as a hard dep, which already covers the attention path. The only other use of flash-attn in vllm is the rotary apply_rotary import in vllm/model_executor/layers/rotary_embedding/common.py, which is guarded by find_spec("flash_attn") and falls back cleanly when absent. Also unpin torch in requirements-cublas12.txt: the 2.7.0 pin only existed to give the flash-attn wheel a matching torch to link against. With flash-attn gone, vllm's own torch==2.10.0 dep is the binding constraint regardless of what we put here. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-25 15:38:13 +00:00
LocalAI [bot]	806ea24ff4	chore: ⬆️ Update TheTom/llama-cpp-turboquant to `67559e580b10e4e47e9a6fd6218873997976886d` (#9497) ⬆️ Update TheTom/llama-cpp-turboquant Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 14:03:46 +02:00
LocalAI [bot]	385de3705e	chore(model gallery): 🤖 add 1 new models via gallery agent (#9558) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 14:03:15 +02:00
Ettore Di Giacinto	21eace40ec	feat(llama-cpp): expose split_mode option for multi-GPU placement (#9560) Adds split_mode (alias sm) to the llama.cpp backend options allowlist, accepting none\|layer\|row\|tensor. The tensor value targets the experimental backend-agnostic tensor parallelism from ggml-org/llama.cpp#19378 and requires a llama.cpp build that includes that PR, FlashAttention enabled, KV-cache quantization disabled, and a manually set context size. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-25 14:02:57 +02:00
Ettore Di Giacinto	24505e57f5	feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang (#9553 ) * feat(backends): add CUDA 13 + L4T arm64 CUDA 13 variants for vllm/vllm-omni/sglang Adds new build profiles mirroring the diffusers/ace-step pattern so vLLM serving (and SGLang on arm64) can be deployed on CUDA 13 hosts and JetPack 7 boards: - vllm: cublas13 (PyPI cu130 channel) + l4t13 (jetson-ai-lab SBSA cu130 prebuilt vllm + flash-attn). - vllm-omni: cublas13 + l4t13. Floats vllm version on cu13 since vllm 0.19+ ships cu130 wheels by default and vllm-omni tracks vllm master; cu12 path keeps the 0.14.0 pin to avoid disturbing existing images. - sglang: l4t13 arm64 only — uses the prebuilt sglang wheel from the jetson-ai-lab SBSA cu130 index, so no source build is needed. Cublas13 sglang on x86_64 is intentionally deferred. CI matrix gains five new images (-gpu-nvidia-cuda-13-vllm{,-omni}, -nvidia-l4t-cuda-13-arm64-{vllm,vllm-omni,sglang}); backend/index.yaml gains the matching capability keys (nvidia-cuda-13, nvidia-l4t-cuda-13) and latest/development merge entries. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] * fix(backends): use unsafe-best-match index strategy on l4t13 builds The jetson-ai-lab SBSA cu130 index lists transitive deps (decord, etc.) at limited versions / older Python ABIs. uv defaults to the first index that contains a package and refuses to fall through to PyPI, so sglang l4t13 build fails resolving decord. Mirror the existing cpu sglang profile by setting --index-strategy=unsafe-best-match on l4t13 across the three backends, and apply it to the explicit vllm install line in vllm-omni's install.sh (which doesn't honor EXTRA_PIP_INSTALL_FLAGS). 
Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] * fix(sglang): drop [all] extras on l4t13, floor version at 0.5.0 The [all] extra brings in outlines→decord, and decord has no aarch64 cp312 wheel on PyPI nor the jetson-ai-lab index (only legacy cp35-cp37 tags). With unsafe-best-match enabled, uv backtracked through sglang versions trying to satisfy decord and silently landed on sglang==0.1.16, an ancient version with an entirely different dep tree (cloudpickle/outlines 0.0.44, etc.). Drop [all] so decord is no longer required, and floor sglang at 0.5.0 to prevent any future resolver misfire from degrading the version again. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-25 12:26:29 +02:00
LocalAI [bot]	d09706dc60	chore(model gallery): 🤖 add 1 new models via gallery agent (#9555) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 09:00:37 +02:00
LocalAI [bot]	08e393f7db	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `cb58a561f0c49f68b6d125cdfda037ed80433821` (#9549) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 08:59:48 +02:00
LocalAI [bot]	47cc3dc8d7	chore: ⬆️ Update ggml-org/llama.cpp to `361fe72acb7b9bd79059cc177cbeda99b35b5db9` (#9548) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-25 08:58:27 +02:00
Ettore Di Giacinto	83b384de97	feat: surface distributed backend management errors (#9552 ) * fix(distributed): surface per-node backend op errors to OpStatus DistributedBackendManager.{Install,Upgrade,Delete}Backend discarded the per-node BackendOpResult from enqueueAndDrainBackendOp with `_, err :=`. When workers replied Success=false (e.g. an OCI image with no arm64 variant on a Jetson host), the per-node Error string was recorded in result.Nodes[].Error but never reached the toplevel return value, so OpStatus.Error stayed empty and the UI reported the install as "completed" while the backend was nowhere on the cluster. Add BackendOpResult.Err() that aggregates per-node Status=="error" entries into a single error. Queued nodes (waiting for reconciler retry) are deliberately not treated as failures. Wire the three callers and DeleteBackendDetailed to call result.Err() so reply.Success=false finally reaches OpStatus.Error → /api/backends/job/:uid → the UI. The Delete closures had a related bug: they discarded the reply with `_` and only checked the NATS round-trip error, so reply.Success=false was a silent success even with the new aggregation. Check both. Standalone mode (LocalBackendManager) already surfaces gallery errors correctly through the same OpStatus.Error path; no change needed there. Tests: 9 new Ginkgo specs covering all-success / all-fail with distinct errors / mixed / all-queued / no-nodes for Install, Upgrade, Delete. Assisted-by: Claude:claude-opus-4-7 [Bash] [Edit] [Read] [Write] * feat(react-ui): per-node backend delete + clearer upgrade affordance The Nodes page exposed a per-node "reinstall" button (fa-sync-alt, tooltip "Reinstall backend") but no per-node delete, even though the Go side has had POST /api/nodes/:id/backends/delete → RemoteUnloaderAdapter.DeleteBackend → NATS-to-specific-node wired up for a while. 
Sync icons read as "refresh data" — the action is functionally an upgrade (re-pulls the gallery image), so the affordance was misleading. Per-node backend row now renders two icon buttons: - Upgrade: btn-secondary btn-sm + fa-arrow-up, tooltip "Upgrade backend on this node". Names both action and scope to differentiate from the cluster-wide upgrade on the Backends page. - Delete: btn-danger-ghost btn-sm + fa-trash, tooltip "Delete backend from this node". Matches the node-level destructive style at the row action column rather than the solid btn-danger of primary destructive pages, since this is a secondary action inside a busy row. Delete goes through the existing ConfirmDialog (danger=true) with copy that names the backend and the node explicitly — it's a non-recoverable op on a specific scope. Reuses nodesApi.deleteBackend(id, backend) which already existed in the API client. Tests: 4 new Playwright specs covering upgrade clarity (icon + tooltip), delete button presence, confirm dialog flow with POST body assertion, and cancel-doesn't-POST. Assisted-by: Claude:claude-opus-4-7 [Bash] [Edit] [Read] [Write]	2026-04-25 08:57:59 +02:00
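The error aggregation described in 83b384de97 can be sketched as below. This is a Python illustration of the stated behaviour, not the Go implementation: the class and field names are adapted from the commit message, while the "success" status string, the "queued" status string, and the "; "-joined message format are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NodeResult:
    node: str
    status: str  # "error" is from the commit message; "success"/"queued" are assumed names
    error: str = ""


@dataclass
class BackendOpResult:
    nodes: List[NodeResult] = field(default_factory=list)

    def err(self) -> Optional[str]:
        """Aggregate per-node errors into one message, or None if nothing failed.

        Queued nodes (waiting for a retry) are deliberately not counted as failures.
        """
        failures = [f"{n.node}: {n.error}" for n in self.nodes if n.status == "error"]
        return "; ".join(failures) if failures else None
```

Returning a single aggregated value lets the caller propagate it to the top-level operation status instead of discarding the per-node detail.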
Ettore Di Giacinto	487e3fd2a4	feat(react-ui): editorial refresh with Nord palette and polished primitives (#9550 ) * feat(react-ui): editorial refresh with Nord palette and polished primitives Replaces the cool gray-blue theme with a deep Nord-inspired palette: frost-cyan accent (#88c0d0) on deep blue-black surfaces (#13171f / #1a1f2a / #242a36), snow-storm text scale, aurora status colours. - Typography: Geist Variable + Geist Mono Variable (Google Fonts) with ss01/ss03/cv11 stylistic alternates; strengthened h1-h6 hierarchy; editorial negative tracking. - Primitives: buttons gain depth (inset highlight + hover lift + brightness filter); inputs become sunken wells with sage-swap-to-frost focus rings; cards hover-lift and gain an .card--accent left-rail variant; badges become mono caps rectangles with tabular-nums. - Chrome: sidebar active state is now an inset left rail + tint (no border-left); modals get popIn animation and proper shadow lift; toasts carry an inset accent bar + slide-in instead of tinted fills; operations bar breathes on active installs. - Empty states: editorial pattern (eyebrow rule, large mono title, 52ch lede) that inherits gracefully even without page JSX edits. - Chat: assistant bubbles drop the gray-nested-in-gray card for a transparent pull-quote with a left border; user bubbles soften from loud accent fill to a subtle frost tint. - Motion: custom spring easing cubic-bezier(0.22,1,0.36,1), 180ms standard; breathing/pulse/popIn keyframes; global prefers-reduced- motion honoring. - Radii tightened to 3/5/8/10px; warm-shadow tokens redone for cool depth; ::selection, :focus-visible, kbd globals added. - Migrated hardcoded 'JetBrains Mono' CSS literals to var(--font-mono) so the Geist Mono swap lands everywhere. Scope is intentionally tokens + primitives only. Page JSX and the ~1,800 inline style={{…}} instances are untouched and flagged as follow-ups. 
Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] * feat(react-ui): complete-coverage pass — migrate inline styles to tokens Follows up the editorial/Nord token refresh with a mechanical sweep of page JSX and shared components so nothing bypasses the design system. - Font family: replaced 80+ 'JetBrains Mono' / 'Space Grotesk' inline literals (and the string-CSS variants in CollectionDetails and AgentStatus) with var(--font-mono) / var(--font-sans). SVG <text> nodes that used the attribute form were switched to style={{ }} so the CSS variable resolves. - Radii: every unquoted numeric borderRadius (2/3/4/10) is now a var(--radius-) token; 50% and 999px kept as computed shapes. - Spacing: clean-token gaps and margins (4/8/16px) moved to var(--spacing-xs/sm/md); padding: '4px 8px' and '8px 16px' lifted into token pairs. Micro-values (2/6/10/12px) left inline where no token maps cleanly. - Colors: Talk.jsx button/canvas-surface hardcodes moved to var(--color-); FineTune.jsx chart series colours now use the --color-data-* Nord palette (cyan/red/purple/orange instead of tailwind hex); AgentStatus tool-call icon and error tag hex swapped for var(--color-warning) / var(--color-text-inverse). - CodeMirror editor (utils/cmTheme.js): both themes rebased on Nord — polar-night surfaces and aurora syntax highlighting (dark), snow- storm surfaces with darkened aurora (light). Caret/selection/active line/search now frost-cyan tinted instead of legacy indigo/purple. Legitimately dynamic styles (computed widths, per-row colours, canvas 2D context fill/stroke for waveform and spectrogram drawing) remain inline — they can't be expressed as CSS tokens. 29 files, +237/-237 — identity preserved, semantics re-anchored to the token system. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write]	2026-04-24 23:35:59 +02:00
dependabot[bot]	9ab3496de2	chore(deps): bump rustls-webpki from 0.103.10 to 0.103.13 in /backend/rust/kokoros in the cargo group across 1 directory (#9546) chore(deps): bump rustls-webpki Bumps the cargo group with 1 update in the /backend/rust/kokoros directory: [rustls-webpki](https://github.com/rustls/webpki). Updates `rustls-webpki` from 0.103.10 to 0.103.13 - [Release notes](https://github.com/rustls/webpki/releases) - [Commits](https://github.com/rustls/webpki/compare/v/0.103.10...v/0.103.13) --- updated-dependencies: - dependency-name: rustls-webpki dependency-version: 0.103.13 dependency-type: indirect dependency-group: cargo ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-24 22:02:58 +02:00
dependabot[bot]	c4511be33a	chore(deps): bump postcss from 8.5.8 to 8.5.10 in /core/http/react-ui in the npm_and_yarn group across 1 directory (#9544) chore(deps): bump postcss Bumps the npm_and_yarn group with 1 update in the /core/http/react-ui directory: [postcss](https://github.com/postcss/postcss). Updates `postcss` from 8.5.8 to 8.5.10 - [Release notes](https://github.com/postcss/postcss/releases) - [Changelog](https://github.com/postcss/postcss/blob/main/CHANGELOG.md) - [Commits](https://github.com/postcss/postcss/compare/8.5.8...8.5.10) --- updated-dependencies: - dependency-name: postcss dependency-version: 8.5.10 dependency-type: indirect dependency-group: npm_and_yarn ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-24 22:02:41 +02:00
Ettore Di Giacinto	551ebdb57a	fix(distributed): correct VRAM/RAM reporting on NVIDIA unified-memory hosts (#9545) Workers on NVIDIA unified-memory hardware (DGX Spark / GB10, Jetson AGX Thor, Jetson Orin/Xavier/Nano) were reporting `available_vram=0` back to the frontend, so the Nodes UI showed the node as fully used even when most of the unified memory was actually free. Three causes addressed: * `isTegraDevice` only matched `/sys/devices/soc0/family == "Tegra"`. DGX Spark (SBSA) reports JEDEC codes there instead — `jep106:0426` for the NVIDIA manufacturer — so the Tegra/unified-memory fallback never ran. Renamed to `isNVIDIAIntegratedGPU` and extended to also match `jep106:0426[:]` via `/sys/devices/soc0/soc_id`. * The unified-iGPU code defaulted the device name to `"NVIDIA Jetson"` when `/proc/device-tree/model` was missing. That's what happens for Thor inside a docker container, and always on DGX Spark. New `nvidiaIntegratedGPUName` resolves via dt-model → `/sys/devices/soc0/machine` → `soc_id` lookup (`jep106:0426:8901` → `"NVIDIA GB10"`) so the Nodes UI labels the box correctly. * Worker heartbeat sent `available_vram=0` (or total-as-available) when VRAM usage was momentarily unknown — e.g. when `nvidia-smi` intermittently failed with `waitid: no child processes` under containers without `--init`. Each such heartbeat overwrote the DB and made the UI flip to "fully used". `heartbeatBody` now omits `available_vram` in that case so the DB keeps its last good value. Also updates the commented GPU blocks in both compose files with `NVIDIA_DRIVER_CAPABILITIES=compute,utility`, `capabilities: [gpu, utility]`, and `init: true`, and documents the requirement in the distributed-mode and nvidia-l4t pages. Without `utility`, NVML/`nvidia-smi` are absent inside the container, which is what put the DGX Spark worker into the buggy fallback in the first place.
Detection verified on live hardware (dgx.casa / GB10 and 192.168.68.23 / Thor) by running a cross-compiled probe of the new helpers on both host and inside the worker container. Assisted-by: Claude:opus-4.7 [Claude Code]	2026-04-24 22:02:23 +02:00
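Two of the causes listed in 551ebdb57a can be illustrated in isolation. The Go sketch below is not the code from the commit: `matchesNVIDIAIntegratedGPU`, `heartbeat`, and `encodeHeartbeat` are hypothetical names, the `node` field is invented for the example, and reading `jep106:0426[:]` as "exactly `jep106:0426`, or that prefix followed by a colon" is an interpretation of the message. It only shows how such a match might look and how a nil pointer with `omitempty` keeps an unknown value out of the JSON body.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// nvidiaJEP106 is the JEDEC manufacturer prefix quoted in the commit message.
const nvidiaJEP106 = "jep106:0426"

// matchesNVIDIAIntegratedGPU is a hypothetical, pure-function stand-in for the
// detection described in the commit. The caller passes in the contents of
// /sys/devices/soc0/family and /sys/devices/soc0/soc_id.
func matchesNVIDIAIntegratedGPU(family, socID string) bool {
	if strings.TrimSpace(family) == "Tegra" {
		return true
	}
	id := strings.TrimSpace(socID)
	// Assumed reading of `jep106:0426[:]`: the bare code, or the code plus ":<part>".
	return id == nvidiaJEP106 || strings.HasPrefix(id, nvidiaJEP106+":")
}

// heartbeat is an illustrative payload. Only the available_vram key comes from
// the commit message; the node field is made up for this example.
type heartbeat struct {
	Node          string  `json:"node"`
	AvailableVRAM *uint64 `json:"available_vram,omitempty"`
}

// encodeHeartbeat omits available_vram entirely when the value is unknown (nil),
// so a receiver can keep its last good value instead of storing a zero.
func encodeHeartbeat(node string, availableVRAM *uint64) string {
	b, err := json.Marshal(heartbeat{Node: node, AvailableVRAM: availableVRAM})
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	fmt.Println(matchesNVIDIAIntegratedGPU("Tegra\n", ""))                     // Jetson-style family file
	fmt.Println(matchesNVIDIAIntegratedGPU("jep106:0426", "jep106:0426:8901")) // GB10-style soc_id
	fmt.Println(matchesNVIDIAIntegratedGPU("", "jep106:0001:0001"))            // unrelated vendor code

	free := uint64(1024)
	fmt.Println(encodeHeartbeat("worker-1", nil))   // key absent: usage unknown
	fmt.Println(encodeHeartbeat("worker-1", &free)) // key present: usage known
}
```

A pointer field is used because `omitempty` on a plain integer would also drop a genuine zero; with a pointer, "unknown" (nil) and "zero bytes free" stay distinguishable.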
Andreas Egli	1d0de757c3	fix: add hipblaslt library (#9541) Signed-off-by: Andreas Egli <github@kharan.ch>	2026-04-24 18:50:03 +02:00
Alex Brick	e5337039b0	[intel GPU support] Use latest oneapi-basekit image for Intel images to support b70 (#9543) * Use latest oneapi-basekit image for Intel images The current `localai/localai:master-gpu-intel` images don't work with the intel arc pro b70. Updating the base_image to 2025.3.2 fixes it. Signed-off-by: Alex Brick <3220905+arbrick@users.noreply.github.com> * Update github workflow base image --------- Signed-off-by: Alex Brick <3220905+arbrick@users.noreply.github.com>	2026-04-24 18:29:10 +02:00
LocalAI [bot]	1c9592c77f	chore: ⬆️ Update leejet/stable-diffusion.cpp to `b8bdffc19962be7e5a84bfefeb2e31bd885b571a` (#9521) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-24 15:15:15 +02:00
Richard Palethorpe	3db60b57e6	fix(realtime): consume ChatDeltas when C++ autoparser clears Response (#9538) The llama.cpp C++-side chat autoparser clears Reply.Message and delivers parsed content/reasoning/tool-calls via Reply.chat_deltas. chat.go handles this (non-SSE path uses ToolCallsFromChatDeltas/ContentFromChatDeltas/ ReasoningFromChatDeltas), but realtime.go only read pred.Response, so any model routed through the autoparser (Qwen2.5/3 and friends) produced a silent reply: backend emitted N tokens, the session surface saw zero. Mirror the non-SSE chat path in realtime's triggerResponse: when deltas carry tool calls or content, use them directly; otherwise fall back to the existing raw-text parsing. Assisted-by: claude-opus-4-7-1M [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-24 14:41:38 +02:00
Richard Palethorpe	13734ae9fa	feat: Add Sherpa ONNX backend for ASR and TTS (#8523) feat(backend): Add Sherpa ONNX backend and Omnilingual ASR Adds a new Go backend wrapping sherpa-onnx via purego (no cgo). Same approach as opus/stablediffusion-ggml/whisper — a thin C shim (csrc/shim.c + shim.h → libsherpa-shim.so) wraps the bits purego can't reach directly: nested struct config writes, result-struct field reads, and the streaming TTS callback trampoline. The Go side uses opaque uintptr handles and purego.NewCallback for the TTS callback. Supports: - VAD via sherpa-onnx's Silero VAD - Offline ASR: Whisper, Paraformer, SenseVoice, Omnilingual CTC - Online/streaming ASR: zipformer transducer with endpoint detection (AudioTranscriptionStream emits delta events during decode) - Offline TTS: VITS (LJS, etc.) - Streaming TTS: sherpa-onnx's callback API → PCM chunks on a channel, prefixed by a streaming WAV header Gallery entries: omnilingual-0.3b-ctc-q8-sherpa (1600-language offline ASR), streaming-zipformer-en-sherpa (low-latency streaming ASR), silero-vad-sherpa, vits-ljs-sherpa. E2E coverage: tests/e2e-backends for offline + streaming ASR, tests/e2e for the full realtime pipeline (VAD + STT + TTS). Assisted-by: claude-opus-4-7-1M [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-24 14:40:06 +02:00
Ettore Di Giacinto	c0920f3273	fix(ik-llama-cpp): patch clip.cpp for new ggml_quantize_chunk signature (#9531) Bumps ik_llama.cpp pin to 16996aeab7. Upstream 286ce32...16996ae adds a trailing `const struct quantize_user_data *` parameter to `ggml_quantize_chunk` (PR ikawrakow/ik_llama.cpp#1677) but leaves `examples/llava/clip.cpp` unchanged because their build has moved to `examples/mtmd/`. LocalAI's prepare.sh still copies from `examples/llava/`, so the dead 7-arg call reaches the grpc-server compile and fails. Patch the call site to pass `nullptr` for the new param. Assisted-by: Claude:Opus-4.7 [Read] [Edit] [Bash]	2026-04-24 13:07:26 +02:00
LocalAI [bot]	7c1934b183	chore: ⬆️ Update ggml-org/llama.cpp to `187a45637054881ecacf17f8e2f6f8f2ba7df1c7` (#9520) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-24 09:17:06 +02:00
Tai An	5e062b4d1f	fix: use SetFunctionCallNameString when forcing a specific tool (3 sites) (#9526) * fix(anthropic): use SetFunctionCallNameString for specific tool forcing * fix(openai/realtime): use SetFunctionCallNameString for specific tool forcing * fix(openresponses): use SetFunctionCallNameString for specific tool forcing	2026-04-24 09:06:42 +02:00
Ettore Di Giacinto	4906cbad04	feat: add biometrics UI (#9524) * feat(react-ui): add Face & Voice Recognition pages Expose the face and voice biometrics endpoints (/v1/face/, /v1/voice/) through the React UI. Each page has four tabs driving the six endpoints per modality: Analyze (demographics with bounding boxes / waveform segments), Compare (verify with a match gauge and live threshold slider), Enrollment (register / identify / forget with a top-K matches view), Embedding (raw vector inspector with sparkline + copy). MediaInput supports file upload plus live capture: webcam snap-to-canvas for face, MediaRecorder -> AudioContext -> 16-bit PCM mono WAV transcode for voice (libsndfile on the backend only handles WAV/FLAC/OGG natively). Sidebar gets a new Biometrics section feature-gated on face_recognition / voice_recognition; routes are wrapped in <RequireFeature>. No new dependencies -- Font Awesome icons picked from the Free set. Assisted-by: Claude:Opus 4.7 * fix(localai): accept data URI prefixes with codec/charset params Browser MediaRecorder produces data URIs like data:audio/webm;codecs=opus;base64,... so the pre-';base64,' section can carry multiple parameter segments. The `^data:([^;]+);base64,` regex in pkg/utils/base64.go and core/http/endpoints/localai/audio.go only matched exactly one segment, so recordings straight from the React UI's live-capture tab failed the strip and then tripped the base64 decoder on the leading 'data:' literal, surfacing as "invalid audio base64: illegal base64 data at input byte 4" Widened both regexes to `^data:[^,]+?;base64,` so any number of ';param=value' segments between the mime type and ';base64,' are tolerated. Added a regression test covering the MediaRecorder shape.
Assisted-by: Claude:Opus 4.7 * fix(insightface): scope pack ONNX loading to known manifests LocalAI's gallery extracts buffalo_* zips flat into the models directory, which inevitably mixes with ONNX files from other backends (opencv face engine, MiniFASNet antispoof, WeSpeaker voice embedding) and older buffalo pack installs. Feeding those foreign files into insightface's model_zoo.get_model() blows up inside the router -- it assumes a 4-D NCHW input and indexes `input_shape[2]` on tensors that aren't shaped like a face model, raising IndexError mid-load and leaving the backend unusable. The router's dispatch isn't amenable to per-file try/except alone (first-file-wins picks det_10g.onnx from buffalo_l even when the user asked for buffalo_sc -- alphabetical order happens to favour the wrong pack). Instead, ship an explicit manifest of the upstream v0.7 pack contents and scope the glob to that when the requested pack is known. The manifest is small and stable; future packs can be added alongside or fall through to the tolerance loop, which also swallows any remaining IndexError / ValueError from foreign files with a clear `[insightface] skipped` stderr line for diagnostics. Assisted-by: Claude:Opus 4.7 * fix(speaker-recognition): extract FBank features for rank-3 ONNX encoders Pre-exported speaker-encoder ONNX graphs come in two shapes: rank-2 [batch, samples] -- some 3D-Speaker exports, take raw waveform directly. rank-3 [batch, frames, n_mels] -- WeSpeaker and most Kaldi-lineage encoders, expect pre-computed Kaldi FBank. OnnxDirectEngine unconditionally fed `audio.reshape(1, -1)` -- correct for rank-2, IndexError-on-input_shape[3] on rank-3, which surfaced to the UI as "Invalid rank for input: feats Got: 2 Expected: 3" Detect the input rank at session init and run Kaldi FBank (80-dim, 25ms/10ms frames, dither=0.0, per-utterance CMN) before the forward pass when rank>=3. All knobs are configurable via backend options for encoders that deviate from defaults.
torchaudio.compliance.kaldi is already in the backend's requirements (SpeechBrain pulls torchaudio in), so no new dependency. Assisted-by: Claude:Opus 4.7 * fix(biometrics): isolate face and voice vector stores Face (ArcFace, 512-D) and voice (ECAPA-TDNN 192-D / WeSpeaker 256-D) biometric embeddings were colliding inside a single in-memory local-store instance. Enrolling one after the other failed with "Try to add key with length N when existing length is M" because local-store correctly refuses to mix dimensions in one keyspace. The registries were constructed with `storeName=""`, which in StoreBackend() is just a WithModel() call. But ModelLoader's cache is keyed on `modelID`, not `model` -- so both registries collapsed to the same `modelID=""` slot and reused the same backend process despite looking isolated on paper. Three complementary fixes: 1. application.go -- give each registry a distinct default namespace ("localai-face-biometrics" / "localai-voice-biometrics"). The comment claimed isolation, now it's actually enforced. 2. stores.go -- pass the storeName as both WithModelID and WithModel so the ModelLoader cache key separates namespaces and the loader spawns distinct processes. 3. local-store/store.go -- drop the Load() `opts.Model != ""` guard. It was there to prevent generic model-loading loops from picking up local-store by accident, but that auto-load path is being retired; the guard now just blocks legitimate namespace isolation. opts.Model is treated as a tag; the per-tuple process isolation upstream handles discrimination. Assisted-by: Claude:Opus 4.7 * fix(gallery): stale-file cleanup and upgrade-tmp directory safety Two related robustness fixes for backend install/upgrade: pkg/downloader/uri.go OCI downloads passed through if filepath.Ext(filePath) != "" ... filePath = filepath.Dir(filePath) which was intended to redirect file-shaped download targets into their parent directory for OCI extraction. 
The heuristic misfires on directory-shaped paths with a dot-suffix -- gallery.UpgradeBackend uses tmpPath = "<backendsPath>/<name>.upgrade-tmp" and Go's filepath.Ext treats ".upgrade-tmp" as an extension. The rewrite landed the extraction at "<backendsPath>/", which then overwrote the real install (backends/<name>/) with a flat-layout file and left a stray run.sh at the top level. The tmp dir itself stayed empty, so the validation step that checked "<tmpPath>/run.sh" predictably failed with "upgrade validation failed: run.sh not found in new backend" Every manual upgrade silently corrupted the backends tree this way. Guard the rewrite behind "target isn't already an existing directory" -- InstallBackend / UpgradeBackend both pre-create the target as a directory, so they get the correct behaviour; existing file-path callers with a genuine dot-extension still get the parent redirect. core/gallery/backends.go InstallBackend's MkdirAll returned ENOTDIR when something at the target path was already a file (legacy dev builds dropped golang backend binaries directly at `<backendsPath>/<name>` instead of nesting them under their own subdir). That permanently blocked reinstall and upgrade for anyone carrying that state, since every retry hit the same error. Detect a pre-existing non-directory, warn, and remove it before the MkdirAll so the fresh install can write the correct nested layout with metadata.json + run.sh. Assisted-by: Claude:Opus 4.7 * fix(galleryop): refresh upgrade cache after backend ops UpgradeChecker caches the last upgrade-check result and only refreshes on the 6-hour tick or after an auto-upgrade cycle. Manual upgrades (POST /api/backends/upgrade/:name) go through the async galleryop worker, which completes the upgrade correctly but never tells UpgradeChecker to re-check -- so /api/backends/upgrades continued to list a just-upgraded backend as upgradeable, indistinguishable from a failed upgrade, for up to six hours. 
Add an optional `OnBackendOpCompleted func()` hook on GalleryService that fires after every successful install / upgrade / delete on the backend channel (async, so a slow callback doesn't stall the queue). startup.go wires it to UpgradeChecker.TriggerCheck after both services exist. Result: the upgrade banner clears within milliseconds of the worker finishing. Assisted-by: Claude:Opus 4.7 * build: prepend GOPATH/bin to PATH for protogen-go install-go-tools runs `go install` for protoc-gen-go and protoc-gen-go-grpc, which writes them into `go env GOPATH`/bin. That directory isn't on every dev's PATH, and protoc resolves its code-gen plugins via PATH, so the immediately-following protoc invocation fails with "protoc-gen-go: program not found" which in turn blocks `make build` and any `make backends/%` target that depends on build. Prepend `go env GOPATH`/bin to PATH for the protoc invocation so the freshly-installed plugins are found without requiring a shell-profile change. Assisted-by: Claude:Opus 4.7 * refactor(ui-api): non-blocking backend upgrade handler with opcache POST /api/backends/upgrade/:name used to send the ManagementOp directly onto the unbuffered BackendGalleryChannel, which blocked the HTTP request whenever the galleryop worker was busy with a prior operation. The op also didn't show up in /api/operations, so the Backends UI couldn't reflect upgrade progress on the affected row. Register the op in opcache immediately, wrap it in a cancellable context, store the cancellation function on the GalleryService, and push onto the channel from a goroutine so the handler returns right away. Response gains a `jobID` field and a `message` string so clients have a consistent handle regardless of whether the op is queued or running. Pairs with the OnBackendOpCompleted hook added in the galleryop commit — together the UI sees the upgrade start, watches progress via /api/operations, and drops the "upgradeable" flag the moment the worker finishes. 
Assisted-by: Claude:Opus 4.7	2026-04-24 08:50:34 +02:00
LocalAI [bot]	c755cd5ab5	feat(swagger): update swagger (#9518) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-23 23:26:50 +02:00
LocalAI [bot]	0fb04f7ac3	chore(model-gallery): ⬆️ update checksum (#9522) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-23 23:26:27 +02:00
Ettore Di Giacinto	d9d7b5c29b	docs(readme): add April 2026 highlights to Latest News Assisted-by: Claude-Code:claude-opus-4-7	2026-04-23 20:47:06 +00:00


6174 Commits