LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-16 20:52:08 -04:00

Author	SHA1	Message	Date
dependabot[bot]	1caab1de10	chore(deps): bump actions/checkout from 4 to 6 (#9663 ) Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6. - [Release notes](https://github.com/actions/checkout/releases) - [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/checkout/compare/v4...v6) --- updated-dependencies: - dependency-name: actions/checkout dependency-version: '6' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-05 15:37:05 +02:00
Ettore Di Giacinto	e86ade54a6	feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654 ) * feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp Closes #1648. OpenAI-style multipart endpoint that returns "who spoke when". Single endpoint instead of the issue's three-endpoint sketch (refactor /vad, /vad/embedding, /diarization) — the typical client wants one call, and embeddings can land later as a sibling without breaking this surface. Response shape borrows from Pyannote/Deepgram: segments carry a normalised SPEAKER_NN id (zero-padded, stable across the response) plus the raw backend label, optional per-segment text when the backend bundles ASR, and a speakers summary in verbose_json. response_format also accepts rttm so consumers can pipe straight into pyannote.metrics / dscore. Backends: * vibevoice-cpp — Diarize() reuses the existing vv_capi_asr pass. vibevoice's ASR prompt asks the model to emit [{Start,End,Speaker,Content}] natively, so diarization is a by-product of the same pass; include_text=true preserves the transcript per segment, otherwise we drop it. * sherpa-onnx — wraps the upstream SherpaOnnxOfflineSpeakerDiarization C API (pyannote segmentation + speaker-embedding extractor + fast clustering). libsherpa-shim grew config builders, a SetClustering wrapper for per-call num_clusters/threshold overrides, and a segment_at accessor (purego can't read field arrays out of SherpaOnnxOfflineSpeakerDiarizationSegment[] directly). Plumbing: new Diarize gRPC RPC + DiarizeRequest / DiarizeSegment / DiarizeResponse messages, threaded through interface.go, base, server, client, embed. Default Base impl returns unimplemented. Capability surfaces all updated: FLAG_DIARIZATION usecase, FeatureAudioDiarization permission (default-on), RouteFeatureRegistry entries for /v1/audio/diarization and /audio/diarization, audio instruction-def description widened, CAP_DIARIZATION JS symbol, swagger regenerated, /api/instructions discovery map updated. Tests: * core/backend: speaker-label normalisation (first-seen → SPEAKER_NN, per-speaker totals, nil-safety, fallback to backend NumSpeakers when no segments). * core/http/endpoints/openai: RTTM rendering (file-id basename, negative duration clamping, fallback id). * tests/e2e: mock-backend grew a deterministic Diarize that emits raw labels "5","2","5" so the e2e suite verifies SPEAKER_NN remapping, verbose_json speakers summary + transcript pass-through (gated by include_text), RTTM bytes content-type, and rejection of unknown response_format. mock-diarize model config registered with known_usecases=[FLAG_DIARIZATION] to bypass the backend-name guard. Docs: new features/audio-diarization.md (request/response, RTTM example, sherpa-onnx + vibevoice setup), cross-link from audio-to-text.md, entry in whats-new.md. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(diarization): correct sherpa-onnx symbol name + lint cleanup CI failures on #9654: * sherpa-onnx-grpc-{tts,transcription} and sherpa-onnx-realtime panicked at backend startup with `undefined symbol: SherpaOnnxDestroyOfflineSpeakerDiarizationResult`. Upstream's actual symbol is SherpaOnnxOfflineSpeakerDiarizationDestroyResult (Destroy in the middle, not the prefix); the rest of the diarization surface follows the same naming pattern. The mismatched name made purego.RegisterLibFunc fail at dlopen time and crashed the gRPC server before the BeforeAll could probe Health, taking down every sherpa-onnx test job — not just the diarization-related ones. * golangci-lint flagged 5 errcheck violations on new defer cleanups (os.RemoveAll / Close / conn.Close); wrap each in a `defer func() { _ = X() }()` closure (matches the pattern other LocalAI files use for new code, since pre-existing bare defers are grandfathered in via new-from-merge-base). * golangci-lint also flagged forbidigo violations: the new diarization_test.go files used testing.T-style `t.Errorf` / `t.Fatalf`, which are forbidden by the project's coding-style policy (.agents/coding-style.md). Convert both files to Ginkgo/Gomega Describe/It with Expect(...) — they get picked up by the existing TestBackend / TestOpenAI suites, no new suite plumbing needed. * modernize linter: tightened the diarization segment loop to `for i := range int(numSegments)` (Go 1.22+ idiom). Verified locally: golangci-lint with new-from-merge-base=origin/master reports 0 issues across all touched packages, and the four mocked diarization e2e specs in tests/e2e/mock_backend_test.go still pass. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(vibevoice-cpp): convert non-WAV input via ffmpeg + raise ASR token budget Confirmed end-to-end against a real LocalAI instance with vibevoice-asr-q4_k loaded and the multi-speaker MP3 sample at vibevoice.cpp/samples/2p_argument.mp3: both /v1/audio/transcriptions and /v1/audio/diarization now succeed and return correctly attributed speaker turns for the full clip. Two latent issues surfaced once the diarization endpoint actually exercised the backend with a non-trivial input: 1. vv_capi_asr only accepts WAV via load_wav_24k_mono. The previous code passed the uploaded path straight through, so anything that wasn't already a 24 kHz mono s16le WAV failed at the C side with rc=-8 and the very unhelpful "vv_capi_asr failed". prepareWavInput shells out to ffmpeg ("-ar 24000 -ac 1 -acodec pcm_s16le") in a per-call temp dir, matching the rate the model was trained on; both AudioTranscription and Diarize now route through it. This is the same shape sherpa-onnx uses (utils.AudioToWav), but vibevoice needs 24 kHz rather than 16 kHz so we don't reuse that helper. 2. The C ABI's max_new_tokens defaults to 256 when 0 is passed. That's fine for a five-second clip but not for anything past ~10 s — vibevoice stops mid-JSON, the parse fails, and the caller sees a hard error. Pass a much larger budget (16 384 ≈ ~9 minutes of speech at the model's ~30 tok/s rate); generation stops at EOS so this is a cap rather than a target. 3. As a defensive belt-and-braces, mirror AudioTranscription's existing "fall back to a single segment if the model emits non-JSON text" pattern in Diarize, so partial / unusual model output never produces a 500. This kept the endpoint usable while diagnosing (1) and (2), and is the right behaviour to keep. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(vibevoice-cpp): pass valid WAVs through directly so ffmpeg is not required at runtime Spotted by tests-e2e-backend (1.25.x): the previous fix forced every incoming audio file through `ffmpeg -ar 24000 ...`, which meant the backend container — which does not ship ffmpeg — failed even for the existing happy path where the caller already uploads a WAV. The container-side error was: rpc error: code = Unknown desc = vibevoice-cpp: ffmpeg convert to 24k mono wav: exec: "ffmpeg": executable file not found in $PATH Reading vibevoice.cpp's audio_io.cpp, `load_wav_24k_mono` uses drwav and already accepts any PCM/IEEE-float WAV at any sample rate, downmixes multi-channel input to mono, and resamples to 24 kHz internally. So the only inputs that genuinely need an external converter are non-WAV formats (MP3, OGG, FLAC, ...). Detect WAVs by RIFF/WAVE magic at bytes 0..3 / 8..11 and pass them straight through with a no-op cleanup; everything else still goes through ffmpeg with the same 24 kHz mono s16le target. The result: * Container builds without ffmpeg keep working for WAV uploads (the e2e-backends fixture is jfk.wav at 16 kHz mono s16le). * MP3 and other non-WAV inputs still get the new ffmpeg conversion path so the diarization endpoint stays useful. * If the caller uploads a non-WAV but ffmpeg isn't on PATH, the surfaced error is still descriptive enough to act on. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(ci): make gcc-14 install in Dockerfile.golang best-effort for jammy bases The LocalVQE PR (`bb033b16`) made `gcc-14 g++-14` an unconditional apt install in backend/Dockerfile.golang and pointed update-alternatives at them. That works on the default `BASE_IMAGE=ubuntu:24.04` (noble has gcc-14 in main), but every Go backend that builds on `nvcr.io/nvidia/l4t-jetpack:r36.4.0` — jammy under the hood — now fails at the apt step: E: Unable to locate package gcc-14 This blocked unrelated jobs: backend-jobs(*-nvidia-l4t-arm64-{stablediffusion-ggml, sam3-cpp, whisper, acestep-cpp, qwen3-tts-cpp, vibevoice-cpp}). LocalVQE itself is only matrix-built on ubuntu:24.04 (CPU + Vulkan), so it doesn't actually need gcc-14 anywhere else. Make the gcc-14 install conditional on the package being available in the configured apt repos. On noble: identical behaviour to today (gcc-14 installed, update-alternatives points at it). On jammy: skip the gcc-14 stanza entirely and let build-essential's default gcc take over, which is what the other Go backends compile with anyway. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-05 15:10:13 +02:00
LocalAI [bot]	1634eece6b	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `45dfd80371785731bc2ed05a76252497a4e7a282` (#9644 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 15:09:40 +02:00
LocalAI [bot]	b88ddce0f3	chore: ⬆️ Update ggml-org/llama.cpp to `eff06702b2a52e1020ea009ebd86cb9f5acabab5` (#9637 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 09:52:28 +02:00
Ettore Di Giacinto	bbcaebc1ef	feat(concurrency-groups): per-model exclusive groups for backend loading (#9662 ) * feat(concurrency-groups): per-model exclusive groups for backend loading Adds `concurrency_groups: [...]` to model YAML configs. Two models that share a group cannot be loaded concurrently on the same node — loading one evicts the others, reusing the existing pinned/busy/retry policy from LRU eviction. Layered design: - Watchdog (pkg/model): per-node correctness floor — on every Load(), evict any loaded model that shares a group with the requested one. Pinned skips surface NeedMore so the loader retries (and ultimately logs a clear warning), instead of silently allowing the rule to be violated. - Distributed scheduler (core/services/nodes): soft anti-affinity hint — scheduleNewModel prefers nodes that don't already host a same-group model, falling back to eviction only if every candidate has a conflict. Composes with NodeSelector at the same point in the candidate pipeline. Per-node, not cluster-wide: VRAM is a node-local resource, and two heavy models running on different nodes is fine. The ConfigLoader is wired into SmartRouter via a small ConcurrencyConflictResolver interface so the nodes package keeps a narrow surface on core/config. Refactors the inner LRU eviction body into a shared collectEvictionsLocked helper and the loader retry loop into retryEnforce(fn, maxRetries, interval), so both LRU and group enforcement share busy/pinned/retry semantics. Closes #9659. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(watchdog): sync pinned + concurrency_groups at startup The startup-time watchdog setup lives in initializeWatchdog (startup.go), not in startWatchdog (watchdog.go). The latter is only invoked from the runtime-settings RestartWatchdog path. As a result, neither SyncPinnedModelsToWatchdog nor SyncModelGroupsToWatchdog ran at boot, so `pinned: true` and `concurrency_groups: [...]` only became effective after a settings-driven watchdog restart. Fix by adding both sync calls to initializeWatchdog. Confirmed end-to-end: loading model A in group "heavy", then C with no group (coexists), then B in group "heavy" now correctly evicts A and leaves [B, C]. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(test): satisfy errcheck on new os.Remove in concurrency_groups spec CI lint runs new-from-merge-base, so the existing pre-existing `defer os.Remove(tmp.Name())` lines are baseline-grandfathered but the one introduced by the concurrency_groups YAML round-trip test is held to errcheck. Wrap the remove in a closure that discards the error. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-05 08:42:50 +02:00
dependabot[bot]	22ae415695	chore(deps): bump docs/themes/hugo-theme-relearn from `f69a085` to `8bb66fa` (#9665 ) chore(deps): bump docs/themes/hugo-theme-relearn Bumps [docs/themes/hugo-theme-relearn](https://github.com/McShelby/hugo-theme-relearn) from `f69a085` to `8bb66fa`. - [Release notes](https://github.com/McShelby/hugo-theme-relearn/releases) - [Commits](`f69a085322...8bb66fa674`) --- updated-dependencies: - dependency-name: docs/themes/hugo-theme-relearn dependency-version: 8bb66fa674351f3a0b0917a7552caac686eca920 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-05-05 08:42:27 +02:00
LocalAI [bot]	3a0164670e	chore: ⬆️ Update vllm-project/vllm cu130 wheel to `0.20.1` (#9649 ) ⬆️ Update vllm-project/vllm cu130 wheel Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 08:41:55 +02:00
LocalAI [bot]	a91b05907c	feat(swagger): update swagger (#9660 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 01:50:17 +02:00
LocalAI [bot]	4ef45bbccd	chore(model-gallery): ⬆️ update checksum (#9661 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 00:12:22 +02:00
Beshoy Girgis	b224a3d931	deps: update quic-go to v0.59.0 (fix session ticket panic) (#9655 ) Update quic-go from v0.54.1 to v0.59.0 to fix the crypto/tls session ticket panic described in quic-go/quic-go#5572. Co-dependency go-libp2p upgraded from v0.43.0 to v0.48.0 (required for quic-go v0.59.0 compatibility). Signed-off-by: Beshoy Girgis <shoy@1ds.us>	2026-05-04 22:07:42 +02:00
Richard Palethorpe	bb033b16a9	feat: add LocalVQE backend and audio transformations UI (#9640 ) feat(audio-transform): add LocalVQE backend, bidi gRPC RPC, Studio UI Introduce a generic "audio transform" capability for any audio-in / audio-out operation (echo cancellation, noise suppression, dereverberation, voice conversion, etc.) and ship LocalVQE as the first backend implementation. Backend protocol: - Two new gRPC RPCs in backend.proto: unary AudioTransform for batch and bidirectional AudioTransformStream for low-latency frame-by-frame use. This is the first bidi stream in the proto; per-frame unary at LocalVQE's 16 ms hop would be RTT-bound. Wire it through pkg/grpc/{client,server, embed,interface,base} with paired-channel ergonomics. LocalVQE backend (backend/go/localvqe/): - Go-Purego wrapper around upstream liblocalvqe.so. CMake builds the upstream shared lib + its libggml-cpu-.so runtime variants directly — no MODULE wrapper needed because LocalVQE handles CPU feature selection internally via GGML_BACKEND_DL. - Sets GGML_NTHREADS from opts.Threads (or runtime.NumCPU()-1) — without it LocalVQE runs single-threaded at ~1× realtime instead of the documented ~9.6×. - Reference-length policy: zero-pad short refs, truncate long ones (the trailing portion can't have leaked into a mic that wasn't recording). - Ginkgo test suite (9 always-on specs + 2 model-gated). HTTP layer: - POST /audio/transformations (alias /audio/transform): multipart batch endpoint, accepts audio + optional reference + params[]=v form fields. Persists inputs alongside the output in GeneratedContentDir/audio so the React UI history can replay past (audio, reference, output) triples. - GET /audio/transformations/stream: WebSocket bidi, 16 ms PCM frames (interleaved stereo mic+ref in, mono out). JSON session.update envelope for config; constants hoisted in core/schema/audio_transform.go. - ffmpeg-based input normalisation to 16 kHz mono s16 WAV via the existing utils.AudioToWav (with passthrough fast-path), so the user can upload any format / rate without seeing the model's strict 16 kHz constraint. - BackendTraceAudioTransform integration so /api/backend-traces and the Traces UI light up with audio_snippet base64 and timing. - Routes registered under routes/localai.go (LocalAI extension; OpenAI has no /audio/transformations endpoint), traced via TraceMiddleware. Auth + capability + importer: - FLAG_AUDIO_TRANSFORM (model_config.go), FeatureAudioTransform (default-on, in APIFeatures), three RouteFeatureRegistry rows. - localvqe added to knownPrefOnlyBackends with modality "audio-transform". - Gallery entry localvqe-v1-1.3m (sha256-pinned, hosted on huggingface.co/LocalAI-io/LocalVQE). React UI: - New /app/transform page surfaced via a dedicated "Enhance" sidebar section (sibling of Tools / Biometrics) — the page is enhancement, not generation, so it lives outside Studio. Two AudioInput components (Upload + Record tabs, drag-drop, mic capture). - Echo-test button: records mic while playing the loaded reference through the speakers — the mic naturally picks up speaker bleed, giving a real (mic, ref) pair for AEC testing without leaving the UI. - Reusable WaveformPlayer (canvas peaks + click-to-seek + audio controls) and useAudioPeaks hook (shared module-scoped AudioContext to avoid hitting browser context limits with three players on one page); migrated TTS, Sound, Traces audio blocks to use it. - Past runs saved in localStorage via useMediaHistory('audio-transform') — the history entry stores all three URLs so clicking re-renders the full triple, not just the output. Build + e2e: - 11 matrix entries removed from .github/workflows/backend.yml (CUDA, ROCm, SYCL, Metal, L4T): upstream supports only CPU + Vulkan, so we ship those two and let GPU-class hardware route through Vulkan in the gallery capabilities map. - tests-localvqe-grpc-transform job in test-extra.yml (gated on detect-changes.outputs.localvqe). - New audio_transform capability + 4 specs in tests/e2e-backends. - Playwright spec suite in core/http/react-ui/e2e/audio-transform.spec.js (8 specs covering tabs, file upload, multipart shape, history, errors). Docs: - New docs/content/features/audio-transform.md covering the (audio, reference) mental model, batch + WebSocket wire formats, LocalVQE param keys, and a YAML config example. Cross-links from text-to-audio and audio-to-text feature pages. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit Write Agent TaskCreate] Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-05-04 22:07:11 +02:00
LocalAI [bot]	de83b72bb7	fix(distributed): orchestrator resilience — auto-upgrade routing, worker bind-wait, RAG-init crash, log spam (#9657 ) * fix(nodes/health): skip stale-marking already-offline nodes The health monitor re-emitted "Node heartbeat stale" + "Marking stale node offline" + MarkOffline on every cycle for nodes that were already in the offline (or unhealthy) state. For an operator-stopped node this flooded the logs with the same WARN+INFO pair every check interval. Skip the staleness branch when the node is already StatusOffline / StatusUnhealthy — the state is already what we'd write, so neither the log lines nor the DB update carry information. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(worker): wait for backend gRPC bind before replying to backend.install The backend supervisor used to wait up to 4s (20 × 200ms) for the backend's gRPC server to answer a HealthCheck, then log a warning and reply Success with the bind address anyway. On slower nodes (a Jetson Orin doing first-boot CUDA init, large CGO library load) the gRPC listener wasn't up yet, so the frontend's first LoadModel dial returned "connect: connection refused" and the operator chased a phantom network issue instead of a startup-timing one. Two changes: - Bump the readiness window to 30s. CUDA init on Orin/Thor first boot measures in seconds, not milliseconds. - On deadline-exceeded, stop the half-started process, recycle the port, and return an error with the backend's stderr tail. The frontend now gets a real failure with diagnostic context instead of a misleading ECONNREFUSED on a downstream dial. Process death during the wait window keeps its existing fast-fail path. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(distributed): route auto-upgrade through BackendManager + bump LocalAGI/LocalRecall Two distributed-mode bugs that surfaced together in the orchestrator logs: 1. Auto-upgrade always failed with "backend not found". UpgradeChecker correctly routed CheckUpgrades through the active BackendManager (so the frontend aggregates worker state), but the auto-upgrade branch right below called gallery.UpgradeBackend directly with the frontend's SystemState. In distributed mode the frontend has no backends installed locally, so ListSystemBackends returned empty and Get(name) failed for every reported upgrade. Auto-upgrade now also goes through BackendManager.UpgradeBackend, which fans out to workers via NATS. 2. Embedding-load failure on a remote node crashed the orchestrator. When RAG init lazily called NewPersistentPostgresCollection and the remote embedding worker was unreachable, LocalRecall called os.Exit(1) inside the constructor, killing the orchestrator pod. LocalRecall now returns errors instead, LocalAGI surfaces them as a nil collection, and the existing RAGProviderFromState path returns (nil, nil, false) — the same code path the agent pool already takes when no RAG is configured. The orchestrator stays up; chat requests degrade to "no RAG available" until the embedding worker recovers. Bumps: github.com/mudler/LocalAGI → e83bf515d010 github.com/mudler/localrecall → 6138c1f535ab Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-04 19:09:16 +02:00
LocalAI [bot]	1aeb4d7e73	chore(model gallery): 🤖 add 1 new models via gallery agent (#9653 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-04 15:42:08 +02:00
Ettore Di Giacinto	a271c72931	fix(react-ui/e2e): scope backendTrigger to <main> so it skips LanguageSwitcher The LanguageSwitcher added in the i18n PR (#9642) lives in the sidebar and also uses aria-haspopup="listbox" — same attribute the import-form SearchableSelect uses. The Batch D / E tests' helper resolved the trigger with `page.locator('button[aria-haspopup="listbox"]').first()`, which now returns the language switcher (rendered first in DOM order, in the sidebar) instead of the backend dropdown. After clicking the wrong button, getByRole('option', { name: 'llama-cpp' }) naturally never resolves — language options aren't backend names — and the test times out at 30s. Scope the locator to the <main className="main-content"> wrapper so only buttons inside the route's main content area match. The page layout has the Sidebar outside <main>, so this cleanly excludes it. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-04 08:58:25 +00:00
Ettore Di Giacinto	ade5fd4b97	fix(react-ui): reflect disabled state on SearchableSelect button The Backend dropdown is disabled while /backends/known is in flight (disabled={isSubmitting \|\| backendsLoading} in ImportModel.jsx). Until now the disabled prop only guarded the internal onClick handler — there was no `disabled` HTML attribute on the <button>, so the element remained "actionable" from the outside. That regressed the import-form-ux Batch D / E Playwright tests after the i18next-suspense PR (#9642): suspending on the importModel namespace defers the useEffect that fetches /backends/known, so when the test calls backendTrigger.click() the button is rendered but backendsLoading is still true. The click hits the no-op branch, the dropdown stays closed, and `getByRole('option', { name: 'llama-cpp' })` times out at 30s. Surfacing the disabled state on the actual <button> makes Playwright auto-wait until the dropdown is ready, fixes a11y (screen readers now announce "disabled"), and removes the button from the tab order while loading. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-04 08:21:03 +00:00
LocalAI [bot]	170d55c67d	fix(distributed): honor NodeSelector in cached-replica lookup, stop empty-backend reconciler scaleups (#9652 ) * fix(distributed): honor NodeSelector in cached-replica lookup, stop empty-backend reconciler scaleups Two distinct bugs were causing tight retry loops in the distributed scheduler: 1. FindAndLockNodeWithModel ignored the model's NodeSelector. When a model was loaded on multiple nodes and only some matched the current selector, the function returned the lowest-in_flight node — even one the selector excluded. Route()'s post-check then fell through to scheduleNewModel, which targeted the matching node where the model was already at MaxReplicasPerModel capacity. Eviction couldn't help (the only loaded model on that node was the one being requested, and it was busy), so every request looped through "evicting LRU" → "all models busy". Fix: thread an optional candidateNodeIDs filter through FindAndLockNodeWithModel. Route() resolves the selector once via a new resolveSelectorCandidates helper and passes the matching IDs to both the cached-replica lookup and scheduleNewModel. The same helper replaces the inline selector block in scheduleNewModel. 2. ScheduleAndLoadModel (reconciler scale-up path) fell back to scheduleNewModel with backendType="" when no replica had ever been loaded for a model. The worker rejected the resulting backend.install ("backend name is empty") on every reconciler tick (~30s). Fix: remove the broken fallback. When GetModelLoadInfo has nothing stored, return a clear error instead of firing a doomed NATS install. The reconciler's existing scale-up failure log surfaces it once per tick; the model auto-replicates as soon as Route() serves it once and stores load info. Also downgrade the post-LoadModel-failure StopGRPC error to Debug — that cleanup attempt usually hits "model not found" because LoadModel failed before registering the process, and the outer "Failed to load model" error already carries the real reason. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash] * test(distributed): cover selector-aware FindAndLockNodeWithModel and reconciler scaleup guard Two regression tests for the bugs fixed in the previous commit: 1. FindAndLockNodeWithModel — registry-level integration tests verify the candidateNodeIDs filter: - Returns the included node even when an excluded node has lower in_flight (the original selector-mismatch loop scenario). - Returns not-found when the model is loaded only on excluded nodes, forcing Route() to fall through to a fresh schedule instead of reusing the excluded replica. 2. ScheduleAndLoadModel — mock-based test verifies the reconciler scale-up path returns an error and does NOT fire backend.install when no replica has been loaded yet. fakeUnloader gains an installCalls slice so this negative assertion is direct. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Read] [Edit] [Bash] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-04 09:42:14 +02:00
Ettore Di Giacinto	28b4857bd6	fix(ci): leave ports.ubuntu.com upstream on self-hosted runners mirrors.edge.kernel.org carries /ubuntu/ (amd64 archive) but does NOT carry /ubuntu-ports/. With the previous default both archive and ports pointed at kernel.org, so multi-arch builds (linux/amd64,linux/arm64) on bigger-runner / arc-runner-set 404'd on the arm64 leg: Err:5 http://mirrors.edge.kernel.org/ubuntu-ports noble Release 404 Not Found [IP: 213.196.21.55 80] The original outage was on archive.ubuntu.com, not ports.ubuntu.com, so default the self-hosted-ports-mirror to '' (= keep ports.ubuntu.com upstream). apt-mirror.sh and the runner-side rewrite both already no-op when the env var is empty. Self-hosted amd64 still uses kernel.org for the main archive, which worked fine in this run before the arm64 leg failed. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-04 07:28:43 +00:00
Ettore Di Giacinto	5503be1fb3	fix(ci): use http for the kernel.org mirror — bare ubuntu image has no CA bundle The Docker build runs on the minimal ubuntu:24.04 base image, which ships without ca-certificates. The very first apt-get update over HTTPS therefore fails the TLS handshake ("No system certificates available. Try installing ca-certificates."), and apt can't reach ca-certificates itself to fix the situation — chicken and egg. Apt validates package integrity via GPG-signed Release files, so plain HTTP is safe for the archive. archive.ubuntu.com / azure.archive are already accessed over HTTP for the same reason. Switch the kernel.org defaults from https://mirrors.edge.kernel.org to http://mirrors.edge.kernel.org so the in-Dockerfile rewrite works on self-hosted runners too. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-03 23:29:53 +00:00
Ettore Di Giacinto	50580a84ae	fix(ci): switch apt mirror per runner — azure on github-hosted, kernel.org on self-hosted Self-hosted runners (arc-runner-set, bigger-runner) cannot reach azure.archive.ubuntu.com — they live in different networks (e.g. our arc-runner-set Kubernetes cluster) where Azure's mirror IP is not routable. Symptom: "Connection failed [IP: 51.11.236.225 80]" with each Ign:/Err: cycle taking 60s, hanging the build for ~16 minutes before exit 100. Pick the mirror based on `runner.environment`: * github-hosted (ubuntu-latest, ubuntu-24.04-arm) → Azure (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com) — same VPC as the runner. * self-hosted (arc-runner-set, bigger-runner) → kernel.org (https://mirrors.edge.kernel.org for both archive and ports) — publicly reachable from any network. The choice now lives in one place: the .github/actions/configure-apt-mirror composite action exposes `effective-mirror` / `effective-ports-mirror` outputs so the reusable workflows can forward the same value as Docker build-args without duplicating the per-runner-environment branch. The now-redundant `apt-mirror` / `apt-ports-mirror` workflow inputs on image_build.yml and backend_build.yml are dropped — defaults live in the composite action and are visible there. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-03 22:59:26 +00:00
Ettore Di Giacinto	8edac61e57	feat(ci): allow routing apt traffic through an alternate Ubuntu mirror (#9650 ) * feat(ci): allow routing apt traffic through an alternate Ubuntu mirror Adds opt-in APT_MIRROR / APT_PORTS_MIRROR knobs to all Dockerfiles, the Makefile, and CI workflows so we can fail over to a non-canonical Ubuntu mirror when archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com are degraded (recently observed: multi-day DDoS against the default pool). Defaults are empty everywhere — behavior is unchanged unless a mirror is configured. To enable in CI, set the repo-level GitHub Actions variables APT_MIRROR (and APT_PORTS_MIRROR for arm64 builds). Locally: make docker APT_MIRROR=http://azure.archive.ubuntu.com A small POSIX-sh helper in .docker/apt-mirror.sh rewrites both DEB822 (/etc/apt/sources.list.d/ubuntu.sources, Ubuntu 24.04+) and the legacy /etc/apt/sources.list before the first apt-get update. Dockerfile stages load it via RUN --mount=type=bind, so there is no extra layer and no cache invalidation when the script is unchanged. Reusable workflows also rewrite the runner's own /etc/apt sources before any sudo apt-get call. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(apt-mirror): default to the Azure mirror, visible in the workflow source Bakes Azure (http://azure.archive.ubuntu.com / http://azure.ports.ubuntu.com) in as the default for both Docker builds and runner-side apt — rather than hiding the URL behind a GitHub Actions repo variable that's not visible from the source tree. A new composite action at .github/actions/configure-apt-mirror is the single source of truth for runner-side rewrites. Five standalone workflows (build-test, release, tests-e2e, tests-ui-e2e, update_swagger) just `uses: ./.github/actions/configure-apt-mirror`. Three workflows (image_build, backend_build, checksum_checker) keep an inline bash rewrite, because they install/upgrade git via apt before the checkout step (so the local composite action isn't loadable yet). The Azure URL is visible in those files too. The `apt-mirror` / `apt-ports-mirror` inputs of the reusable workflows keep their now-Azure defaults — they still feed the Docker build-args block in addition to the inline runner-side rewrite. Callers (image.yml, image-pr.yml, backend.yml, backend_pr.yml) drop the previous `vars.APT_MIRROR` plumbing and rely on those defaults. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(apt-mirror): drop Force Install GIT, consolidate on the composite action The PPA git upgrade ran add-apt-repository ppa:git-core/ppa, which talks to api.launchpad.net — also part of Canonical's infrastructure and currently returning HTTP 504. The Azure mirror only covers archive.ubuntu.com / security.ubuntu.com / ports.ubuntu.com, not PPAs. The system git that ubuntu-latest already ships is sufficient for actions/checkout and the build pipeline, so just drop the upgrade. With that gone, the apt-before-checkout constraint disappears too — all three holdouts (image_build, backend_build, checksum_checker) can now switch to ./.github/actions/configure-apt-mirror like the other five. Net: 0 inline apt-mirror blocks, all 8 workflows route through the composite action. Assisted-by: Claude:claude-opus-4-7[1m] [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-03 23:50:13 +02:00
Tai An	0b024f0886	chore(model gallery): add chroma1-hd diffusers model (#9646 ) Resolves https://github.com/mudler/LocalAI/issues/9604 Adds Chroma1-HD (lodestones/Chroma1-HD), an 8.9B-parameter text-to-image model derived from FLUX.1-schnell, served via the upstream-diffusers ChromaPipeline. Inference defaults follow the model card recommendations: 40 steps, CFG 3.0, bfloat16. Assisted-by: claude-code:opus-4.7	2026-05-03 09:06:31 +02:00
Ettore Di Giacinto	a6121e240e	docs: credit the LocalAI maintainers team Update README and docs to attribute maintenance to the LocalAI team (Ettore Di Giacinto and Richard Palethorpe) and drop the autonomous AI dev team section. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Edit] [Bash]	2026-05-02 23:37:04 +00:00
Ettore Di Giacinto	87cf736068	feat(react-ui): add multilingual (i18n) support (#9642 ) Adds end-to-end internationalization to the React UI with five seed languages (English, Italian, Spanish, German, Simplified Chinese) and a sidebar-footer language switcher next to the existing theme toggle. Library: react-i18next + i18next + i18next-http-backend + i18next-browser-languagedetector. The detector caches the user's choice in localStorage (key `localai-language`, mirroring the existing `localai-theme` convention) and updates the `<html lang>` attribute on change. fallbackLng is `en`, so any missing translation in another locale falls back transparently. Translation files live under `public/locales/<lng>/<ns>.json`. They ride along with the existing `//go:embed react-ui/dist/` directive, but the previous SPA route in core/http/app.go only exposed `/assets/` from the embedded React build. This commit generalizes the asset handler into a `serveReactSubdir(subdir)` helper and adds a matching `/locales/*` route so i18next-http-backend can fetch the JSONs at runtime. The http-backend `loadPath` is built via the existing `apiUrl()` helper so instances served under a sub-path (e.g. `<base href="/ui/">`) resolve correctly. Namespaces (13): common, nav, errors, auth, home, models, importModel, chat, agents, skills, collections, media, admin. Translated UI surfaces include the sidebar/header/footer chrome, login + account flows, the Home dashboard (incl. the manage-by-chat assistant CTA), the model gallery + import flow, the chat experience (Chat.jsx + ChatsMenu), agents/skills/collections list pages, the studio media tabs (Image, Video, TTS), and the admin page-headers (Settings incl. its section nav, Manage, Backends, Traces, Nodes, P2P, Users, Usage). Shared components (ConfirmDialog, Toast) take their default labels from the common namespace so callers don't need to pass strings explicitly. Tooling for incremental adoption is included: - `i18next-parser.config.js` + `npm run i18n:extract` to sweep `t()` keys into the JSON skeletons. - `scripts/translate-locales.mjs` (one-off helper) to bootstrap non-English locales from English source via OpenAI or Anthropic APIs, with --copy mode as a placeholder fallback. Idempotent; preserves existing translations unless --overwrite is passed. Larger config-driven pages (ModelEditor, Settings deep field forms, AgentChat/AgentCreate, SkillEdit, CollectionDetails, Talk, Sound, biometrics, FineTune/Quantize, Users modals, Nodes/P2P install pickers, BackendLogs, Traces deep filters, Explorer) intentionally keep their inner content untranslated for now — they fall back to English via fallbackLng so functionality is unaffected, and the extracted-strings pattern + the bootstrap script make follow-up extraction straightforward. The initial Suspense fallback at the root in main.jsx covers the first JSON fetch on cold load. A simple `.app-boot-spinner` styled in App.css provides a non-empty paint while the first namespace loads. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit Write Agent] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-02 22:42:08 +02:00
LocalAI [bot]	1ad5b5907d	feat(swagger): update swagger (#9643 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-02 22:33:47 +02:00
Russell Sim	18e039f305	fix(ci): fix AMDGPU_TARGETS empty-string bypass in hipblas builds (#9626 ) * fix(ci): fix AMDGPU_TARGETS empty-string bypass in hipblas builds `399c1dec` wired amdgpu-targets through the backend_build workflow_call interface, intending the input's default value to cover matrix entries that don't specify targets. However, GitHub Actions only applies a workflow_call input default when the caller omits the input entirely. When backend.yml passes `amdgpu-targets: ${{ matrix.amdgpu-targets }}` and the matrix entry has no amdgpu-targets key, the expression evaluates to an empty string, which is treated as an explicit value — bypassing the default. The result is Docker receiving AMDGPU_TARGETS="" which in turn causes Make's ?= default to be skipped (since the variable is already set in the environment, even to empty), and cmake gets -DAMDGPU_TARGETS= with no targets, so the HIP backend compiles for an indeterminate target rather than the intended GPU list. Fix this at two levels: 1. backend.yml: use a \|\| fallback in the expression so that an undefined matrix.amdgpu-targets never reaches the reusable workflow as an empty string. The target list is the canonical default and lives here. 2. backend_build.yml: remove the now-misleading default value from the input declaration. The default never fired due to the above bug, so keeping it implied a guarantee that didn't exist. 3. backend/cpp/llama-cpp/Makefile: add an explicit $(error ...) guard after the ?= assignment so that if AMDGPU_TARGETS is empty (whether from environment or any future CI wiring mistake) the build fails immediately with a clear message rather than silently producing a binary compiled for an unknown GPU target. Assisted-by: Claude Code:claude-sonnet-4-6 Signed-off-by: Russell Sim <rsl@simopolis.xyz> * fix(build): plumb AMDGPU_TARGETS through to Docker builds The docker-build-backend Makefile macro and Dockerfile.golang did not pass AMDGPU_TARGETS to the inner make invocation, so hipblas builds always used the backend Makefile's hardcoded default GPU targets regardless of what was specified via environment or CI inputs. Signed-off-by: Russell Sim <rsl@simopolis.xyz> --------- Signed-off-by: Russell Sim <rsl@simopolis.xyz>	2026-05-02 15:53:14 +02:00
Ettore Di Giacinto	b1a99436c7	feat(branding): admin-configurable instance name, tagline, and assets (#9635 ) Adds a whitelabeling feature so an operator can replace the LocalAI instance name, tagline, square logo, horizontal logo, and favicon from the admin Settings page. Defaults fall back to the bundled assets so existing installs are unaffected. The public GET /api/branding endpoint is reachable pre-auth so the login screen can render the configured branding before sign-in. Mutating routes (POST/DELETE /api/branding/asset/:kind) remain admin-only. Text fields (instance_name, instance_tagline) ride the existing /api/settings flow; binary assets get a dedicated multipart upload route that persists files under DynamicConfigsDir/branding/. To prevent the Settings page's stale local state from clobbering an upload on save, UpdateSettingsEndpoint preserves whatever the on-disk asset filename fields are regardless of the body — /api/branding/asset/* are the sole writers for those fields. The MCP catalog gains get_branding and set_branding tools (text fields only; file upload stays UI-only) plus a configure_branding skill prompt. While wiring this up, the same restart-loss class of bug surfaced for several existing fields whose RuntimeSettings entries were never read by the startup loader. Fix loadRuntimeSettingsFromFile() to load: - branding (instance_name, instance_tagline, *_file basenames) - auto_upgrade_backends, prefer_development_backends - localai_assistant_enabled - open_responses_store_ttl - the 7 existing AgentPool fields (enabled, default/embedding model, chunking sizes, enable_logs, collection_db_path) Also exposes 3 new AgentPool runtime settings (vector_engine, database_url, agent_hub_url) via /api/settings + the Settings UI, with the same load-on-startup wiring. The file watcher's manual-edit path is intentionally not changed — the in-process API endpoints already update appConfig directly, so the watcher is redundant for supported flows and a separate refactor for everything else. 15 TDD specs cover the loader behaviour (1 branding + 11 adjacent + 3 new agent-pool); 2 specs cover the persistence helpers and the clobber-prevention contract. Assisted-by: claude-code:claude-opus-4-7 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-02 15:51:36 +02:00
Ettore Di Giacinto	7325046650	fix(diffusers): drop compel from requirements to unblock pip resolver (#9632 ) compel 2.3.1 (latest, Nov 2025) declares transformers~=4.25 in its metadata, i.e. >=4.25,<5.0. After transformers 5.0 (2026-01-26) and huggingface-hub 1.0 (2025-10-27) shipped, the weekly DEPS_REFRESH cache rotation in CI started seeing the new majors and pip's resolver went into multi-hour backtracking storms walking every transformers 4.x candidate against every accelerate/hf-hub/tokenizers combination to find a set compel would accept. The 2026-04-29 backend-build for the diffusers backend (darwin-mps + l4t + cublas13-turboquant matrix cells) hit the GitHub Actions 6h job timeout still inside pip install — the build itself never started. compel is the only hard upper bound on transformers in this stack (diffusers, accelerate, peft, optimum-quanto are all flexible), and upstream support for transformers 5 is still in flight: damian0815/ compel#129 ("Modernize Compel for Transformers 5") and #128 ("Bump transformers version to >5.0") are both open as of today. backend.py only constructs Compel() when COMPEL=1 is set in the env (default off), so make compel a true optional extra: - Wrap the top-level `from compel import ...` in try/except ImportError, mirroring the existing sd_embed pattern. - Auto-disable COMPEL with a warning when the module isn't installed, instead of crashing on module load. - Drop compel from all eight requirements-*.txt variants so the resolver no longer has to satisfy its transformers cap. - Leave a TODO in backend.py and in each requirements file pointing at the upstream PR/issue, so the dependency can be reinstated once compel supports transformers >= 5. Users who rely on weighted-prompt embeddings can opt in with a manual `pip install compel` alongside COMPEL=1; the warning emitted on startup tells them how. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit WebFetch] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-01 14:45:14 +02:00
Ettore Di Giacinto	8452068f43	feat(importers): whisper.cpp HF repos pick a quant + nest under whisper/models (#9630 ) The WhisperImporter's Import() switch ordered LooksLikeURL ahead of the HuggingFace branch, so any https://huggingface.co/<owner>/<repo> URI (e.g. LocalAI-io/whisper-large-v3-it-yodas-only-ggml) hijacked the URL path. FilenameFromUrl returned the repo slug, the gallery entry pointed at the HTML repo page, the SHA256 was empty, and the HF file listing was effectively dead code for HTTPS imports. The HF branch only fired for huggingface://owner/repo and hf://owner/repo references. Gate the URL case on a "ggml-.bin" basename signal — mirroring how the llama-cpp importer gates on ".gguf" — so direct file URLs still take the URL path while HF repo URLs fall through to the HF branch. There the file listing is actually consulted: every ggml-.bin entry is collected and one is picked by the new preferences.quantizations preference (default q5_0; comma-separated for fallback ordering). Pin the chosen file under whisper/models/<name>/<file> so a single repo can ship q4_0/q5_0/q8_0 side-by-side without colliding on disk, matching the llama-cpp/models/<name>/ layout. The fallback when no preference matches is the last available ggml file, mirroring llama-cpp's pickPreferredGroup behaviour. Tests: replace the previous probe spec with positive assertions against LocalAI-io/whisper-large-v3-it-yodas-only-ggml (default → ggml-model-q5_0.bin, quantizations=q4_0 → ggml-model-q4_0.bin) plus two offline specs that build a fake hfapi.ModelDetails to cover the fallback rule and non-ggml filtering without touching the network. Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit WebFetch] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-01 12:03:07 +02:00
ER-EPR	0b0078047f	Add tags to qwen3-vl-reranker and Qwen3-VL-Embedding to the gallery (#9628 ) * Add tags to Qwen3-VL-Reranker models Added tags for reranker models in index.yaml. Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Add Qwen3-VL-Embedding models to gallery Added Qwen3-VL-Embedding-8B and Qwen3-VL-Embedding-2B models with detailed descriptions and file references. Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Update index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> --------- Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>	2026-05-01 10:56:58 +02:00
Tai An	80961d2da6	feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629 ) Closes #9601 Makes the temporary scratch paths in vllm, vllm-omni, tinygrad, and pocket-tts backends configurable via the standard TMPDIR env var, instead of always writing to /tmp. This is a one-line change per call site that calls tempfile.gettempdir() for the directory and keeps the same filename suffix. Users who run on systems with a small root partition (or want to relocate scratch files to a larger volume) can now redirect these by setting TMPDIR (e.g. TMPDIR=/data/tmp), without affecting the existing LOCALAI_GENERATED_CONTENT_PATH or LOCALAI_UPLOAD_PATH options that already cover other temp paths. Files touched: - backend/python/vllm/backend.py (1 site: video base64 scratch) - backend/python/tinygrad/backend.py (1 site: image fallback dst) - backend/python/pocket-tts/backend.py (1 site: tts wav fallback dst) - backend/python/vllm-omni/backend.py (2 sites: video + audio scratch)	2026-05-01 10:56:24 +02:00
LocalAI [bot]	9c4c3f9d8f	chore: ⬆️ Update ggml-org/llama.cpp to `beb42fffa45eded44804a1fd4916146222371581` (#9624 ) ⬆️ Update ggml-org/llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-01 02:02:56 +02:00
LocalAI [bot]	273416f54b	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `a8aecbf15933295af96504f9a693998322185b5c` (#9625 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-01 02:02:29 +02:00
Ettore Di Giacinto	c02a50f2ab	feat(llama-cpp): bump to d775992 and adapt to spec params refactor (#9618 ) Bumps backend/cpp/llama-cpp/Makefile LLAMA_VERSION from 665abc6 to d775992, picking up upstream PR ggml-org/llama.cpp#22397 which splits common_params_speculative into nested draft / ngram_simple / ngram_mod sub-structs. Renames every grpc-server.cpp reference to match: speculative.mparams_dft.path -> speculative.draft.mparams.path speculative.{n_max,n_min} -> speculative.draft.{n_max,n_min} speculative.{p_min,p_split} -> speculative.draft.{p_min,p_split} speculative.{n_gpu_layers,n_ctx} -> speculative.draft.{n_gpu_layers,n_ctx} speculative.ngram_size_n -> speculative.ngram_simple.size_n speculative.ngram_size_m -> speculative.ngram_simple.size_m speculative.ngram_min_hits -> speculative.ngram_simple.min_hits The "speculative.n_max" JSON key sent to the upstream server stays unchanged — server-task.cpp still reads it and routes the value into draft.n_max internally. The turboquant fork (TheTom/llama-cpp-turboquant @ 11a241d) branched before #22397 and still exposes the flat layout. Since turboquant reuses the shared backend/cpp/llama-cpp/grpc-server.cpp, extend patch-grpc-server.sh with an idempotent sed block that reverts the ten field references back to the legacy flat names on the build copy only — the original under backend/cpp/llama-cpp/ stays compiling against vanilla upstream. Drop the block once the fork rebases. ik-llama-cpp has its own grpc-server.cpp with no speculative refs (0/2661 lines), so it is unaffected. Validated locally with `make docker-build-llama-cpp` (avx, avx2, avx512, fallback, grpc + rpc-server all built; image exported). Assisted-by: Claude:claude-opus-4-7 [Bash Read Edit] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-30 08:44:43 +02:00
LocalAI [bot]	76971fb2aa	chore: ⬆️ Update leejet/stable-diffusion.cpp to `3d6064b37ef4607917f8acf2ca8c8906d5087413` (#9617 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-30 08:43:42 +02:00
LocalAI [bot]	ebd9fcbe20	chore(model gallery): 🤖 add 1 new models via gallery agent (#9615 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-29 22:33:41 +02:00
Ettore Di Giacinto	091eda8d70	feat: react chat redesign (#9616 ) * feat(react-ui): redesign chat — popover history, focus on send, density pass Replace the persistent 260px conversation sidebar with a Cmd/Ctrl+K popover (ChatsMenu) so the conversation owns the page. Once a chat has at least one message we auto-collapse the global app rail and fade non-essential header chrome; Esc gives the user back the full chrome for the rest of the session. Move Canvas mode and the MCP dropdown into the input wrapper as mode chips — they describe what's armed for the next message and now live where the user composes. The chat header drops to Chats · title · ModelSelector · overflow · settings, and an overflow menu carries admin-only Manage mode along with Info / Edit / Export / Clear. Density pass: tighter header (40px), smaller avatars with the assistant left-border accent doing the work, 88% bubble width, modern field-sizing on the textarea, 32px send/stop buttons. Empty state now surfaces a Recent strip (top 4 non-empty chats) and a Cmd+K hint, replacing the discoverability the persistent sidebar used to provide. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * feat(react-ui): chat input chips, slimmer menu, focus mode polish Move Canvas mode and the MCP dropdown into the input wrapper as compact mode chips — they describe what's armed for the next message and now sit where the user composes. The MCP popover flips upward when anchored to the input row so it stays on-screen. Eliminate the chat header overflow ("…") menu entirely; relocate each item to its semantic home so users don't have to remember a miscellany drawer: - Manage mode toggle → top of the Settings drawer, alongside the other sticky chat knobs. The shield next to the title still signals state at a glance. - Model info / Edit config → small admin-only "ⓘ" button next to the ModelSelector; the existing model-info panel now hosts the Edit config link. - Export as Markdown → per-row hover action in ChatsMenu, so it works for any chat (not just the active one). - Clear chat history → destructive button at the bottom of the Settings drawer. Make the Sidebar listen to its own `sidebar-collapse` event so the chat's focus mode actually shrinks the rail (it previously only flipped the layout class, leaving the sidebar element at full width and overlapping the chat). Drop the focus-mode toast — the visual shift is enough; the toast was noise. Define `--color-text-tertiary` in both themes; without it metadata text (recent strip timestamps and a few other sites) was inheriting the platform default, which read as black on the dark surface. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * fix(model/log-store): close merged channel exactly once; clean up Remove Two latent races in BackendLogStore.Subscribe could panic under load (distributed e2e test triggered "send on closed channel" at backend_log_store.go:288): 1. The aggregated path closed the merged channel `ch` from two places — the fan-in waiter goroutine (after all source channels drained) and unsubscribe(). When unsubscribe ran while a fan-in goroutine was mid-flight on `ch <- line`, the close beat the send and the runtime panicked. Now `ch` is closed by exactly one goroutine: the waiter that observes all fan-in goroutines finish. unsubscribe() only closes the per-buffer source channels — the for-range in each fan-in goroutine then exits naturally and the waiter takes care of the merged close. 2. Remove() closed every subscriber channel but didn't delete the entries from the subscribers map, so a concurrent unsubscribe() would call close() again on the already-closed channel ("close of closed channel"). Clear the map entry while closing. Add a regression test that hammers AppendLine concurrently with Subscribe + unsubscribe + Remove; the race detector catches both classes of regression. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 * test(model/log-store): port backend log store tests to ginkgo Bring backend_log_store_test.go in line with the rest of pkg/model (loader_test, watchdog_test, store_test): same external test package (`model_test`), same ginkgo + gomega imports, same Describe/It nesting around the public API. Behaviour is unchanged — the four existing scenarios plus the unsubscribe race regression all run as specs under the existing `TestModel` suite. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-29 22:33:26 +02:00
Ettore Di Giacinto	fe6eb57082	feat(vibevoice-cpp): add purego TTS+ASR backend (#9610 ) * feat(vibevoice-cpp): add purego TTS+ASR backend Wire up Microsoft VibeVoice via the vibevoice.cpp C ABI as a new purego-based Go backend that serves both Backend.TTS and Backend.AudioTranscription from a single gRPC binary. Mirrors the qwen3-tts-cpp / sherpa-onnx pattern so the variant matrix (cpu/cuda12/cuda13/metal/rocm/sycl-f16/f32/vulkan/l4t) and the e2e-backends gRPC harness reuse existing infrastructure. - backend/go/vibevoice-cpp/ - Makefile, CMakeLists, purego shim, gRPC Backend with model-dir auto-detection, closed-loop TTS->ASR smoke test - backend/index.yaml - &vibevoicecpp meta + 18 image entries - Makefile - .NOTPARALLEL, BACKEND_VIBEVOICE_CPP, docker-build wiring, test-extra-backend-vibevoice-cpp-{tts,transcription} e2e wrappers - .github/workflows/backend.yml - matrix entries for all variants - .github/workflows/test-extra.yml - per-backend smoke + 2 gRPC e2e jobs * feat(vibevoice-cpp): drop hardcoded glob detection, add gallery entries Refactor backend Load() to follow the standard Options[] convention used by sherpa-onnx and the rest of the multi-role backends: ModelFile is the primary gguf, supplementary paths come through opts.Options[] as key=value (or key:value for Make-target compat), resolved against opts.ModelPath. type=asr/tts decides the role of ModelFile when neither tts_model nor asr_model is set explicitly. Add gallery/index.yaml entries: - vibevoice-cpp - realtime 0.5B Q8_0 TTS + tokenizer + Carter voice - vibevoice-cpp-asr - long-form ASR Q8_0 + tokenizer Both pull from huggingface://mudler/vibevoice.cpp-models with sha256 verification. parameters.model + Options[] paths are siblings under {models_dir} per the qwen3-tts-cpp convention. Update Makefile e2e wrappers to pass BACKEND_TEST_OPTIONS comma+colon style, and tighten the per-backend Go closed-loop test to use the explicit Options API. * fix(vibevoice-cpp): force whole-archive link so vv_capi_* exports survive libvibevoice is a STATIC archive linked into the MODULE library. Without --whole-archive (or -force_load on Apple, /WHOLEARCHIVE on MSVC), the linker garbage-collects symbols not referenced from this translation unit - which means dlopen+RegisterLibFunc panics with 'undefined symbol: vv_capi_load' at backend startup, since purego looks them up by name and our cpp/govibevoicecpp.cpp doesn't call them directly. * test(vibevoice-cpp): rewrite suite with Ginkgo v2 Match the convention used by backend/go/sherpa-onnx/backend_test.go. The suite now covers backend semantics that don't need purego (Locking, empty-ModelFile rejection, TTS/ASR-without-loaded-model errors) on top of the gRPC lifecycle specs (Health, Load, closed-loop TTS->ASR). Model-dependent specs Skip() when VIBEVOICE_MODEL_DIR is unset, so `go test ./backend/go/vibevoice-cpp/` is green on a clean checkout and runs the heavyweight closed-loop spec when test.sh has staged the bundle. * fix(vibevoice-cpp): implement TTSStream + AudioTranscriptionStream The gRPC server's stream handlers (pkg/grpc/server.go) spawn a goroutine that ranges over a chan; the only thing closing that chan is the backend's own Stream method. With the default Base stub returning 'unimplemented' and never touching the chan, the server goroutine hangs forever and the client hits DeadlineExceeded - which is exactly what the e2e harness saw in the test-extra-backend-vibevoice-cpp-tts matrix run. TTSStream synthesizes via vv_capi_tts to a tempfile, then emits a streaming WAV header (chunk sizes 0xFFFFFFFF so HTTP clients can start playback before the full PCM lands) followed by the PCM body in 64 KB slices. The header + >=2 PCM frames satisfy the harness's 'expected >=2 chunks' assertion and give a real progressive stream. AudioTranscriptionStream runs the offline transcription, emits each segment as a delta, and closes with a final_result whose Text equals the concatenated deltas (the harness asserts those match). Two new Ginkgo specs guard the close-channel-on-error path so the deadline-exceeded regression can't come back silently. fix(vibevoice-cpp): silence errcheck on cleanup paths Lint flagged six unchecked Close()/Remove()/RemoveAll() calls along purely-cleanup deferred paths. Wrap each in '_ = ...' (or a closure for defers that take args) - matches what the rest of the LocalAI backend/go/* tree already does for these callsites. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(vibevoice-cpp): closed-loop slot fill + modelRoot-relative path resolution Two bugs the test-extra-backend-vibevoice-cpp-* CI matrix surfaced: 1. Closed-loop Load with ModelFile=tts.gguf + Options[asr_model=...] left v.ttsModel empty, because the default-fill block only ran when BOTH slots were empty. vv_capi_load then got tts="" + a voice and the C side rejected it with rc=-3 'TTS model required to load a voice'. Fix: ModelFile fills the primary role-slot (decided by 'type=' in Options, defaulting to tts) independently of the secondary, so ModelFile + asr_model resolves to both. 2. resolvePath stat'd CWD before falling back to relTo. With LocalAI launched from a directory that happens to contain a same-named file, supplementary Options[] paths could leak away from the models dir. Drop the CWD probe entirely - relative paths now always join onto opts.ModelPath (the gallery convention). New Ginkgo coverage: * 'ModelFile slot resolution' (4 specs) - asr_model+ModelFile, type=asr, explicit tts_model override, key:value variant. * 'resolvePath (relative-to-modelRoot)' (5 specs) - join, abs passthrough, empty input, empty relTo, and the CWD-trap regression test. * 'Load resolves relative Options paths against opts.ModelPath' - end- to-end gallery layout round-trip. Verified locally: 19/19 specs pass (with model bundle, including the closed-loop TTS->ASR; without bundle, 17 pass + 2 model-dependent skip). Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(vibevoice-cpp): use gallery convention in closed-loop spec The 'loads the realtime TTS model' / closed-loop specs were passing already-prefixed paths into Options[]: Options: ['tokenizer=' + filepath.Join(modelDir, 'tokenizer.gguf')] Combined with no ModelPath set on the request, the backend's modelRoot fell back to filepath.Dir(ModelFile) = modelDir, then resolvePath joined the prefixed Options path on top of it - producing 'vibevoice-models/vibevoice-models/tokenizer.gguf' when the CI's VIBEVOICE_MODEL_DIR is the relative './vibevoice-models'. The fix is to mirror the gallery contract LocalAI core actually sends in production: ModelPath is the models root (absolute), ModelFile is a name under it, every Options[] path is relative to ModelPath. Uses filepath.Base() to get bare filenames. Verified locally with both VIBEVOICE_MODEL_DIR=/tmp/vv-bundle (abs) and VIBEVOICE_MODEL_DIR=vibevoice-models (the relative shape that broke CI). Both: 19/19 specs pass, ~55-60s. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(vibevoice-cpp): switch ASR to Q4_K + bump transcription timeout The Q8_0 ASR gguf is ~14 GB - too big to fit alongside the runner image, the docker build cache, and the test artifacts on a free ubuntu-latest GHA runner; 'test-extra-backend-vibevoice-cpp-transcription' was getting SIGTERM'd at 90 min before the model could finish loading. Switch to Q4_K (~10 GB on disk, slightly faster CPU decode) for: * the e2e harness Make target * the gallery 'vibevoice-cpp-asr' entry (parameters + files block) * the per-backend test.sh auto-download list Bump tests-vibevoice-cpp-grpc-transcription's timeout-minutes from 90 to 150 - even with Q4_K, the 30 s JFK clip on a CPU runner needs runway above the previous 90 min cap. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(vibevoice-cpp): drop transcription gRPC e2e job - too heavy for free runners The vibevoice ASR is a 7B-parameter model. Even on Q4_K (~10 GB on disk) a single 30 s transcription saturates the per-test 30 min timeout in the e2e-backends harness on a 4-core ubuntu-latest, and the 10 GB download + Docker layer + working space leaves no headroom on the runner's free disk. Two attempts in CI got SIGTERM'd at the LoadModel boundary - the bottleneck isn't tunable from the workflow side without a paid-tier runner. The per-backend tests-vibevoice-cpp job already runs the same AudioTranscription path via a closed-loop TTS->ASR Ginkgo spec - same gRPC contract, same model, single process - so the standalone tests-vibevoice-cpp-grpc-transcription job was redundant on top of the disk/CPU pressure. The Makefile target test-extra-backend-vibevoice-cpp-transcription stays for local invocation on workstations that can afford it - useful when developing the streaming codepaths. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(vibevoice-cpp): restore transcription gRPC e2e on bigger-runner Switch tests-vibevoice-cpp-grpc-transcription from ubuntu-latest to the self-hosted 'bigger-runner' label that GPU image builds in backend.yml use, plus the documented Free-disk-space prep step (purge dotnet / ghc / android / CodeQL caches) the disabled vllm/sglang entries in this file describe. That gives the 7B-param Q4_K ASR model the disk + CPU runway it needs. Keep timeout-minutes: 150 - even on a beefier runner the 30 s JFK decode plus 10 GB download has to fit comfortably. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(vibevoice-cpp): apt-get install make on bigger-runner before transcription e2e bigger-runner is a self-hosted bare runner without the standard ubuntu image's preinstalled build tools, so the previous job died at the very first command with 'make: command not found' (exit 127). Add the Dependencies step that the disabled vllm/sglang entries in this file already document - apt-get installs make + build-essential + curl + unzip + ca-certificates + git + tar before the make target runs. Mirrors how every other 'runs-on: bigger-runner' entry in backend.yml prepares the runner. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-29 22:22:14 +02:00
LocalAI [bot]	13fe37df89	chore(model gallery): 🤖 add 1 new models via gallery agent (#9611 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-29 09:06:22 +02:00
Richard Palethorpe	4916f8c880	feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map (#9563 ) * feat(vllm): expose AsyncEngineArgs via generic engine_args YAML map LocalAI's vLLM backend wraps a small typed subset of vLLM's AsyncEngineArgs (quantization, tensor_parallel_size, dtype, etc.). Anything outside that subset -- pipeline/data/expert parallelism, speculative_config, kv_transfer_config, all2all_backend, prefix caching, chunked prefill, etc. -- requires a new protobuf field, a Go struct field, an options.go line, and a backend.py mapping per feature. That cadence is the bottleneck on shipping vLLM's production feature set. Add a generic `engine_args:` map on the model YAML that is JSON-serialised into a new ModelOptions.EngineArgs proto field and applied verbatim to AsyncEngineArgs at LoadModel time. Validation is done by the Python backend via dataclasses.fields(); unknown keys fail with the closest valid name as a hint. dataclasses.replace() is used so vLLM's __post_init__ re-runs and auto-converts dict values into nested config dataclasses (CompilationConfig, AttentionConfig, ...). speculative_config and kv_transfer_config flow through as dicts; vLLM converts them at engine init. Operators can now write: engine_args: data_parallel_size: 8 enable_expert_parallel: true all2all_backend: deepep_low_latency speculative_config: method: deepseek_mtp num_speculative_tokens: 3 kv_cache_dtype: fp8 without further proto/Go/Python plumbing per field. Production defaults seeded by hooks_vllm.go: enable_prefix_caching and enable_chunked_prefill default to true unless explicitly set. Existing typed YAML fields (gpu_memory_utilization, tensor_parallel_size, etc.) remain for back-compat; engine_args overrides them when both are set. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(vllm): pin cublas13 to vLLM 0.20.0 cu130 wheel vLLM's PyPI wheel is built against CUDA 12 (libcudart.so.12) and won't load on a cu130 host. Switch the cublas13 build to vLLM's per-tag cu130 simple-index (https://wheels.vllm.ai/0.20.0/cu130/) and pin vllm==0.20.0. The cu130-flavoured wheel ships libcudart.so.13 and includes the DFlash speculative-decoding method that landed in 0.20.0. cublas13 install gets --index-strategy=unsafe-best-match so uv consults both the cu130 index and PyPI when resolving — PyPI also publishes vllm==0.20.0, but with cu12 binaries that error at import time. Verified: Qwen3.5-4B + z-lab/Qwen3.5-4B-DFlash loads and serves chat completions on RTX 5070 Ti (sm_120, cu130). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * ci(vllm): bot job to bump cublas13 vLLM wheel pin vLLM's cu130 wheel index URL is itself version-locked (wheels.vllm.ai/<TAG>/cu130/, no /latest/ alias upstream), so a vLLM bump means rewriting two values atomically — the URL segment and the version constraint. bump_deps.sh handles git-sha-in-Makefile only; add a sibling bump_vllm_wheel.sh and a matching workflow job that mirrors the existing matrix's PR-creation pattern. The bumper queries /releases/latest (which excludes prereleases), strips the leading 'v', and seds both lines unconditionally. When the file is already on the latest tag the rewrite is a no-op and peter-evans/create-pull-request opens no PR. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> * docs(vllm): document engine_args and speculative decoding The new engine_args: map plumbs arbitrary AsyncEngineArgs through to vLLM, but the public docs only covered the basic typed fields. Add a short subsection in the vLLM section explaining the typed/generic split and showing a worked DFlash speculative-decoding config, with pointers to vLLM's SpeculativeConfig reference and z-lab's drafter collection. Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-04-29 00:49:28 +02:00
LocalAI [bot]	55afda22e3	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `453a027c17e4d63a7f16b871197a396240a65138` (#9608 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-29 00:18:19 +02:00
LocalAI [bot]	1fe3558ec6	feat(swagger): update swagger (#9607 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-29 00:18:02 +02:00
Ettore Di Giacinto	e370318bd7	fix(vllm): seed pybind11 for fastsafetensors build under --no-build-isolation fastsafetensors==0.3 (transitive dep of vllm) imports pybind11 in setup.py without declaring it in build-system.requires. With --no-build-isolation it has to already exist in the venv, otherwise the wheel build fails with ModuleNotFoundError on arm64 L4T CUDA 13 (and any other profile that picks up vllm 0.20.0). Assisted-by: Claude:claude-opus-4-7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-28 20:08:26 +00:00
Richard Palethorpe	4443250756	chore: add golangci-lint with new-from-merge-base baseline (#9603 ) * chore: add golangci-lint with new-from-merge-base baseline Configure golangci-lint v2 with the standard linter set (errcheck, govet, ineffassign, unused) plus forbidigo, which enforces the Ginkgo/Gomega-only test convention from .agents/coding-style.md by rejecting stdlib testing calls (t.Errorf, t.Fatalf, t.Run, ...). staticcheck is disabled — the codebase has many pre-existing QF-style suggestions not worth gating on. issues.new-from-merge-base = master makes the lint job a gate for new issues only; the ~1300 pre-existing baseline stays visible via 'make lint-all' for incremental cleanup. CI runs 'make lint'. Backends needing C/C++ headers we don't install in the lint runner are excluded via a deny list in the Makefile (backend/go/{piper,silero-vad, llm}, cmd/launcher). Discovery still flows through 'go list ./...', so new packages are scanned automatically. To make backend/go/{sam3-cpp,stablediffusion-ggml,whisper} typecheckable, move their .cpp/.h sources into cpp/ subdirs (matching qwen3-tts-cpp / acestep-cpp). Without this 'go list' rejects the package because Go does not allow .cpp alongside .go without cgo. Fix two real bugs found by lint in tests/integration/ (run only via 'make test-stores', not default CI): a stale zerolog reference left over from the slog migration (`c37785b7`) and an unused 'os' import. Assisted-by: Claude Code:Opus 4.7 (1M) [Bash] [Read] [Edit] [Write] Signed-off-by: Richard Palethorpe <io@richiejp.com> * ci(lint): generate proto sources and fetch full history The lint job was failing for two reasons: - pkg/grpc/proto/.go is generated, not checked in. Several packages import it, so without 'make protogen-go' typecheck fails project-wide with "no required module provides package github.com/mudler/LocalAI/ pkg/grpc/proto". - golangci-lint's new-from-merge-base needs to git-merge-base the PR against master, but actions/checkout's default shallow clone doesn't fetch master. fetch-depth: 0 brings full history; the config now references origin/master (the remote-tracking branch that survives the shallow checkout) instead of bare master (which doesn't exist locally after checkout). Assisted-by: Claude Code:Opus 4.7 (1M) [Bash] [Read] [Edit] [Write] Signed-off-by: Richard Palethorpe <io@richiejp.com> ci(lint): stub react-ui/dist for go:embed glob core/http/app.go has //go:embed react-ui/dist/*. The glob must match at least one non-hidden entry or typecheck fails the whole core/http package. We don't need the real React bundle to lint Go code, so just touch an empty index.html to satisfy the embed. Assisted-by: Claude Code:Opus 4.7 (1M) [Bash] [Read] [Edit] [Write] Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-28 22:07:44 +02:00
Ettore Di Giacinto	bcef72b9c1	feat: localai assistant chat modality (#9602 ) * fix(tests): inline model_test fixtures after tests/models_fixtures removal The previous reorg removed tests/models_fixtures/ but core/config/model_test.go still read CONFIG_FILE/MODELS_PATH env vars pointing into that directory, so `make test` failed with "open : no such file or directory" on the readConfigFile spec (the suite ran with --fail-fast and bailed before openresponses_test). Inline the YAMLs (config/embeddings/grpc/rwkv/whisper) directly into the test file, materialise them into a per-test tmpdir via BeforeEach, and drop the env-var lookups. The test no longer depends on Makefile plumbing. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash] * refactor(modeladmin): extract model-admin helpers into a service package Lift the bodies of EditModelEndpoint, PatchConfigEndpoint, ToggleStateModelEndpoint, TogglePinnedModelEndpoint and VRAMEstimateEndpoint into core/services/modeladmin so the same logic can be called by non-HTTP clients (notably the in-process MCP server that backs the LocalAI Assistant chat modality, landing in a follow-up commit). The HTTP handlers shrink to thin shells that parse echo inputs, call the matching helper, map typed errors (ErrNotFound, ErrConflict, ErrPathNotTrusted, ErrBadAction, ...) to the existing HTTP status codes, and render the existing response shapes. No REST-surface behaviour change; the existing localai endpoint tests cover the regression net. Adds focused unit tests for each helper against tmp-dir-backed ModelConfigLoader fixtures (deep-merge patch, rename + conflict, path separator guard, toggle/pin enable/disable, sync callback). Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(assistant): LocalAI Assistant chat modality with in-memory MCP server Adds a chat modality, admin-only, that wires the chat session to an in-memory MCP server exposing LocalAI's own admin/management surface as tools. An admin can install models, manage backends, edit configs and check status by chatting; the LLM calls tools like gallery_search, install_model, import_model_uri, list_installed_models, edit_model_config and surfaces the results. Same Go package powers two modes: pkg/mcp/localaitools/ NewServer(client, opts) builds an MCP server that registers the 19-tool admin catalog. The LocalAIClient interface has two impls: - inproc.Client — calls services directly (no HTTP loopback, no synthetic admin API key). Used in-process by the chat handler. - httpapi.Client — calls the LocalAI REST API. Used by the new `local-ai mcp-server --target=…` subcommand to control a remote LocalAI from a stdio MCP host. Tools and their embedded skill prompts are agnostic to which client backs them. Skill prompts are markdown files under prompts/, embedded via go:embed and assembled into the system prompt at server init. Wiring: - core/http/endpoints/mcp/localai_assistant.go — process-wide holder that spins up the in-memory MCP server once at Application start using paired net.Pipe transports, then reuses LocalToolExecutor (no fork) for every chat request that opts in. - core/http/endpoints/openai/chat.go — small branch ahead of the existing MCP block: when metadata.localai_assistant=true, defense-in-depth admin check + executor swap + system-prompt injection. All downstream tool dispatch is unchanged. - core/http/auth/{permissions,features}.go — adds FeatureLocalAIAssistant; gating happens at the chat handler entry plus admin-only `/api/settings`. - core/cli/{run.go,cli.go,mcp_server.go} — LOCALAI_DISABLE_ASSISTANT flag (runtime-toggleable via Settings, no restart), plus `local-ai mcp-server` stdio subcommand. - core/config/runtime_settings.go — `localai_assistant_enabled` runtime setting; the chat handler reads `DisableLocalAIAssistant` live at request entry. UI: - Home.jsx — prominent self-explanatory CTA card on first run ("Manage LocalAI by chatting"); collapses to a compact "Manage by chat" button in the quick-links row once used, persisted via localStorage. - Chat.jsx — admin-only "Manage" toggle in the chat header, "Manage mode" badge, dedicated empty-state copy, starter chips. - Settings.jsx — "LocalAI Assistant" section with the runtime enable toggle. - useChat.js — `localaiAssistant` flag on the chat schema; injects `metadata.localai_assistant=true` on requests when active. Distributed mode: the in-memory MCP server lives only on the head node; inproc.Client wraps already-distributed-aware services so installs propagate to workers via the existing GalleryService machinery. Documentation: `.agents/localai-assistant-mcp.md` is the contributor contract — when adding an admin REST endpoint, also add a LocalAIClient method, an inproc + httpapi impl, a tool registration, and a skill prompt update; the AGENTS.md index links to it. Out of scope (follow-ups): per-tool RBAC granularity for non-admin read-only access; streaming mcp_tool_progress for long installs; React Vitest rig for the UI changes. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(assistant): extract tool/capability/MiB/server-name constants The MCP tool surface, capability tag set, server-name default, and the chat-handler metadata key were repeated as bare string literals across seven files. Renaming any one required hand-editing every call site and risked code/test/prompt drift. This pulls them into typed constants: - pkg/mcp/localaitools/tools.go — Tool* constants for the 19 MCP tools, plus DefaultServerName. - pkg/mcp/localaitools/capability.go — typed Capability + constants for the capability tag set the LLM passes to list_installed_models. The type rides through LocalAIClient.ListInstalledModels and replaces the triplet of "embed"/"embedding"/"embeddings" with the single CapabilityEmbeddings. - pkg/mcp/localaitools/inproc/client.go — bytesPerMiB constant for the VRAMEstimate byte→MB conversion. - core/http/endpoints/mcp/tools.go — MetadataKeyLocalAIAssistant for the "localai_assistant" request-metadata key consumed by the chat handler. Tool registrations, the test catalog, the dispatch table, the validation fixtures, and the fake/stub clients all reference the constants. The embedded skill prompts under prompts/ keep their bare strings (go:embed markdown can't import Go constants); the existing TestPromptsContain SafetyAnchors guards the alignment. No behaviour change. All tests pass with -race. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(modeladmin): typed Action for ToggleState/TogglePinned The toggle/pin verbs were bare strings everywhere — handler signatures, service implementations, MCP tool args, the fake/stub clients, the inproc and httpapi LocalAIClient impls, plus 4 test files. A typo in any caller silently fell through to the runtime "must be 'enable' or 'disable'" check. Introduce core/services/modeladmin.Action (string alias) with ActionEnable, ActionDisable, ActionPin, ActionUnpin and a small Valid helper. The compiler now catches mismatches at every boundary; renames ripple through one source of truth. LocalAIClient.ToggleModelState/Pinned signatures change to take modeladmin.Action. The package is brand-new and unreleased so this is a free public-API tightening. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(assistant): respect ctx cancellation on gallery channel sends InstallModel, DeleteModel, ImportModelURI, InstallBackend and UpgradeBackend all pushed onto galleryop channels with bare sends. If the worker was paused or the buffer full, the chat-handler goroutine blocked forever — the LLM kept polling and the request leaked. Wrap the five sends in a sendModelOp/sendBackendOp helper that selects on ctx.Done() so a cancelled chat completion surfaces context.Canceled back to the LLM instead of hanging. Adds inproc/client_test.go with a pre-cancelled-ctx regression test on InstallModel; the helpers are shared so the same guarantee covers the other four call sites. Assisted-by: Claude:claude-opus-4-7 [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(assistant): graceful shutdown for in-memory holder and stdio CLI Two related leaks: - Application.start() built the LocalAIAssistantHolder but never wired Close() into the graceful-termination chain — the in-memory MCP transport pair stayed alive until process exit, and the goroutines behind net.Pipe() didn't drain. Hook into the existing signals.RegisterGracefulTerminationHandler chain (same pattern as core/http/endpoints/mcp/tools.go:770). - core/cli/mcp_server.go ran srv.Run with context.Background(); a Ctrl-C from the host (Claude Desktop, mcphost, npx inspector) or a SIGTERM from process supervision left the stdio loop reading from a closed pipe. Switch to signal.NotifyContext to surface the signal through ctx and let srv.Run drain. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(assistant): typed HTTPError + propagate prompt walk error The httpapi client detected "no such job" by substring-matching on the error string ("404", "could not find") — brittle to status-code formatting changes and to LocalAI fixing /models/jobs/:uuid to return a proper 404. Replace with a typed HTTPError whose Is() method honours errors.Is(err, ErrHTTPNotFound). The 500-with-"could not find" branch stays as a transitional fallback documented in Is(). Same change covers ListNodes' 404 fallback for the /api/nodes endpoint. Adds httptest tests for both 404 and the legacy 500 path, plus a direct errors.Is exposure test so external callers (the standalone stdio CLI host) can match without re-string-parsing. Also tightens prompts.SystemPrompt: panic when fs.WalkDir on the embedded FS fails. The only realistic cause is a build-time //go:embed misconfiguration; serving an empty system prompt to the LLM is much worse than crashing init. TestSystemPromptIncludesAllEmbeddedFiles catches regressions in CI. Assisted-by: Claude:claude-opus-4-7 [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> fix(modeladmin): atomic writes for model config files The five sites that wrote model YAML used os.WriteFile, which opens with O_TRUNC\|O_WRONLY\|O_CREATE. A crash mid-write left the destination truncated and the model unloadable until manual repair. Pre-existing behaviour inherited from the original endpoint handlers — fix once now that there's a single helper. Adds writeFileAtomic: writes to a sibling temp file, chmods, syncs via Close(), then os.Rename. Same-directory temp keeps the rename atomic on the same filesystem; cleanup runs on every error path so stray temps don't accumulate. No new dependency. Applied to: - ConfigService.PatchConfig - ConfigService.EditYAML (both rename and in-place branches) - mutateYAMLBoolFlag (drives ToggleState + TogglePinned) atomic_test.go covers the happy path plus a read-only-dir failure case that asserts the original file is preserved (skipped on Windows where the chmod trick is POSIX-specific). Assisted-by: Claude:claude-opus-4-7 [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(assistant): prune dead code, mark stub, document conventions Three small cleanups landing together: - Drop the unused errNotImplemented sentinel from inproc/client.go. All five methods that used to return it are wired to modeladmin helpers since the Phase B commit; the package var is dead. - Annotate httpapi.Client.GetModelConfig as a known stub. LocalAI's /models/edit/:name returns rendered HTML, not JSON, so the standalone CLI's get_model_config tool surfaces a clear error to the LLM. A future JSON-only /api/models/config-yaml/:name endpoint is tracked in the agent contract; FIXME points at it. - Extend `.agents/localai-assistant-mcp.md` with a "Code conventions" section that documents the audit-driven rules: tool/Capability/Action constants, errors.Is over substring matching, ctx-aware channel sends, atomic writes, and graceful shutdown. Refresh the file map so it lists tools.go and capability.go and drops the removed tools_bootstrap.go. The tools_models.go diff is a comment-only change explaining why the ModelName empty-string check stays at the tool layer (consistency across LocalAIClient implementations, since the SDK schema validator only enforces presence, not non-empty). Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test(assistant): convert test files to ginkgo + gomega The repo convention (per core/http/endpoints/localai/_test.go, core/gallery/, etc.) is Ginkgo v2 with Gomega assertions. The tests I introduced for the assistant feature used vanilla testing.T, which made them stand out and stripped the BDD structure the rest of the suite relies on. Convert every test file in the assistant scope to Ginkgo: pkg/mcp/localaitools/ dto_test.go — Describe("DTOs round-trip through JSON") prompts_test.go — Describe("SystemPrompt assembler") server_test.go — Describe("Server tool catalog"), Describe("Tool dispatch"), Describe("Tool error surfacing"), Describe("Argument validation"), Describe("Concurrent tool calls") parity_test.go — Describe("LocalAIClient parity"), hosts the suite's single RunSpecs (the file is package localaitools_test so it can import httpapi without an import cycle; Ginkgo aggregates Describes from both the internal and external test packages into one run). httpapi/client_test.go — Describe("httpapi.Client against the LocalAI admin REST surface"), Describe("ErrHTTPNotFound"), Describe("Bearer token") inproc/client_test.go — Describe("inproc.Client cancellation") core/services/modeladmin/ config_test.go — Describe("ConfigService") with sub-Describes for GetConfig, PatchConfig, EditYAML state_test.go — Describe("ConfigService.ToggleState") pinned_test.go — Describe("ConfigService.TogglePinned") atomic_test.go — Describe("writeFileAtomic") core/http/endpoints/mcp/ localai_assistant_test.go — Describe("LocalAIAssistantHolder") Each package gets a `_suite_test.go` with the standard `RegisterFailHandler(Fail) + RunSpecs(t, "...")` boilerplate. Helpers that previously took testing.T (newTestService, writeModelYAML, readMap, sortedStrings, sortGalleries, etc.) drop the T receiver and use Gomega Expectations directly. tmp dirs come from GinkgoT().TempDir(). No semantic change to test coverage — every original assertion has a direct Gomega counterpart. All suites pass with -race. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test+docs(assistant): drift detector for Tool ↔ REST route mapping Honest gap from the audit: the parity_test.go suite only checks four methods, and uses the same httpapi.Client for both sides — it asserts stability of the DTO shapes, not equivalence between in-process and HTTP. If a contributor adds an admin REST endpoint without an MCP tool, or a tool without a matching httpapi route, both surfaces silently diverge. Add a coverage test plus stronger docs: - pkg/mcp/localaitools/coverage_test.go introduces a hand-maintained toolToHTTPRoute map: every Tool* constant must list the REST endpoint the httpapi.Client hits (or "(none)" with a documented reason). Two Ginkgo specs assert the map and the published catalog stay in sync — one fails when a Tool is added without a route entry, the other fails when a route entry references a tool that no longer exists. Verified by removing the ToolDeleteModel entry locally; the test fired with a clear message pointing the contributor at the file. Deliberate non-test: we don't enumerate live admin REST routes from here. Walking the route registry requires booting Application; parsing core/http/routes/localai.go is brittle. The "new admin REST endpoint → MCP tool" direction stays a PR checklist item — see below. - AGENTS.md gets a new Quick Reference bullet that calls out the rule and points at the test by name. - .agents/api-endpoints-and-auth.md tightens the existing "Companion: MCP admin tool surface" subsection from "if useful, consider..." to "MUST be considered, with three concrete outcomes (tool added, deliberately skipped with documented reason, or forgot — which breaks the contract)". Adds a checklist item at the bottom of the file's authoritative checklist. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Write] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(assistant): drop duplicate DTOs, surface canonical types Audit feedback: localaitools/dto.go reinvented several types that already existed in the codebase. Replace the duplicates with the canonical types so the LLM-visible wire format stays aligned with the rest of LocalAI by construction (no parallel structs to keep in sync). Removed (and the canonical type now used by the LocalAIClient interface): localaitools.Gallery → config.Gallery localaitools.GalleryModelHit → gallery.Metadata localaitools.VRAMEstimate → vram.EstimateResult Tightened scope: localaitools.Backend → kept, but reduced to {Name, Installed}. ListKnownBackends now returns []schema.KnownBackend (the canonical type already used by REST /backends/known). Kept with documented rationale: localaitools.JobStatus — galleryop.OpStatus has Error error which marshals to "{}". JobStatus is the JSON-friendly mirror. localaitools.Node — nodes.BackendNode carries gorm internals + token hash; we expose only the LLM-relevant fields. ImportModelURIRequest/Response — schema.ImportModelRequest and GalleryResponse are wire-shaped, mine are LLM-shaped (BackendPreference flat, AmbiguousBackend exposed). Side wins: - Drop bytesPerMiB; vram.EstimateResult already carries human-readable display strings (size_display, vram_display) the LLM uses directly. - Drop the handler-private vramEstimateRequest in core/http/endpoints/localai/vram.go and bind directly into modeladmin.VRAMRequest (now JSON-tagged). Both clients pass through these types now where possible (e.g. ListGalleries in inproc.Client is a one-liner returning AppConfig.Galleries; httpapi.Client.GallerySearch decodes straight into []gallery.Metadata). All tests green with -race. Assisted-by: Claude:claude-opus-4-7 [Read] [Edit] [Bash] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactor(assistant): extract REST route paths into named constants httpapi.Client had 18 bare-string path sites scattered across methods. Pull them into pkg/mcp/localaitools/httpapi/routes.go: static paths as package-private constants, dynamic paths as small builders that handle url.PathEscape on segment values. No behaviour change. Drops the now-unused net/url import from client.go since path escaping moved into routes.go alongside the path it applies to. Local-only by design: the server-side registrations in core/http/routes/localai.go remain bare strings. Sharing constants across the pkg/ ↔ core/ boundary would invert the layering today; the existing Tool↔REST drift-detector in coverage_test.go is the safety net for that direction. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * docs(assistant): align with shipped UI and dropped bootstrap env vars The LocalAI Assistant doc still described the older iteration: - The in-chat toggle was renamed from "Admin" to "Manage" (the badge is now "Manage mode" and the home page exposes a "Manage by chat" CTA). - LOCALAI_ASSISTANT_BOOTSTRAP_MODEL / --localai-assistant-bootstrap-model and the bootstrap_default_model tool were removed — admins pick a model from the existing selector instead, no env-var configuration required. - The shipped tool catalog includes import_model_uri but didn't appear in the doc; bootstrap_default_model appeared but no longer exists. - The Settings → LocalAI Assistant runtime toggle wasn't mentioned as the preferred way to disable without restart. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-28 19:29:27 +02:00
Ettore Di Giacinto	142919fc79	fix(tests): inline model_test fixtures after tests/models_fixtures removal The previous reorg removed tests/models_fixtures/ but core/config/model_test.go still read CONFIG_FILE/MODELS_PATH env vars pointing into that directory, so `make test` failed with "open : no such file or directory" on the readConfigFile spec (the suite ran with --fail-fast and bailed before openresponses_test). Inline the YAMLs (config/embeddings/grpc/rwkv/whisper) directly into the test file, materialise them into a per-test tmpdir via BeforeEach, and drop the env-var lookups. The test no longer depends on Makefile plumbing. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash]	2026-04-28 12:58:49 +00:00
dependabot[bot]	439471baec	chore(deps): bump packaging from 24.1 to 26.2 in /backend/python/coqui (#9594 ) Bumps [packaging](https://github.com/pypa/packaging) from 24.1 to 26.2. - [Release notes](https://github.com/pypa/packaging/releases) - [Changelog](https://github.com/pypa/packaging/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pypa/packaging/compare/24.1...26.2) --- updated-dependencies: - dependency-name: packaging dependency-version: '26.2' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-28 08:44:53 +02:00
dependabot[bot]	eff4be6794	chore(deps): bump github.com/onsi/ginkgo/v2 from 2.28.1 to 2.28.2 (#9593 ) Bumps [github.com/onsi/ginkgo/v2](https://github.com/onsi/ginkgo) from 2.28.1 to 2.28.2. - [Release notes](https://github.com/onsi/ginkgo/releases) - [Changelog](https://github.com/onsi/ginkgo/blob/master/CHANGELOG.md) - [Commits](https://github.com/onsi/ginkgo/compare/v2.28.1...v2.28.2) --- updated-dependencies: - dependency-name: github.com/onsi/ginkgo/v2 dependency-version: 2.28.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-28 08:44:41 +02:00
dependabot[bot]	f1ec30d646	chore(deps): bump github.com/testcontainers/testcontainers-go/modules/postgres from 0.41.0 to 0.42.0 (#9591 ) chore(deps): bump github.com/testcontainers/testcontainers-go/modules/postgres Bumps [github.com/testcontainers/testcontainers-go/modules/postgres](https://github.com/testcontainers/testcontainers-go) from 0.41.0 to 0.42.0. - [Release notes](https://github.com/testcontainers/testcontainers-go/releases) - [Commits](https://github.com/testcontainers/testcontainers-go/compare/v0.41.0...v0.42.0) --- updated-dependencies: - dependency-name: github.com/testcontainers/testcontainers-go/modules/postgres dependency-version: 0.42.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-04-28 08:44:27 +02:00
LocalAI [bot]	f3500223d7	chore: ⬆️ Update leejet/stable-diffusion.cpp to `a81677f59c92d90343aebca51dfed7decf0a0cb0` (#9586 ) ⬆️ Update leejet/stable-diffusion.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-28 08:44:10 +02:00
LocalAI [bot]	b69bacfcdc	chore: ⬆️ Update ikawrakow/ik_llama.cpp to `d6f3e4e28fbf75e6181e6ea32e734de9ce9304fd` (#9585 ) ⬆️ Update ikawrakow/ik_llama.cpp Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-28 08:43:51 +02:00

1 2 3 4 5 ...

6233 Commits