LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-17 04:56:52 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	624fa946f8	feat(swagger): update swagger (#9723 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-08 23:44:55 +02:00
LocalAI [bot]	890c070e55	feat(swagger): update swagger (#9699 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-07 08:28:04 +02:00
Ettore Di Giacinto	e86ade54a6	feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654 ) * feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp Closes #1648. OpenAI-style multipart endpoint that returns "who spoke when". Single endpoint instead of the issue's three-endpoint sketch (refactor /vad, /vad/embedding, /diarization) — the typical client wants one call, and embeddings can land later as a sibling without breaking this surface. Response shape borrows from Pyannote/Deepgram: segments carry a normalised SPEAKER_NN id (zero-padded, stable across the response) plus the raw backend label, optional per-segment text when the backend bundles ASR, and a speakers summary in verbose_json. response_format also accepts rttm so consumers can pipe straight into pyannote.metrics / dscore. Backends: * vibevoice-cpp — Diarize() reuses the existing vv_capi_asr pass. vibevoice's ASR prompt asks the model to emit [{Start,End,Speaker,Content}] natively, so diarization is a by-product of the same pass; include_text=true preserves the transcript per segment, otherwise we drop it. * sherpa-onnx — wraps the upstream SherpaOnnxOfflineSpeakerDiarization C API (pyannote segmentation + speaker-embedding extractor + fast clustering). libsherpa-shim grew config builders, a SetClustering wrapper for per-call num_clusters/threshold overrides, and a segment_at accessor (purego can't read field arrays out of SherpaOnnxOfflineSpeakerDiarizationSegment[] directly). Plumbing: new Diarize gRPC RPC + DiarizeRequest / DiarizeSegment / DiarizeResponse messages, threaded through interface.go, base, server, client, embed. Default Base impl returns unimplemented. Capability surfaces all updated: FLAG_DIARIZATION usecase, FeatureAudioDiarization permission (default-on), RouteFeatureRegistry entries for /v1/audio/diarization and /audio/diarization, audio instruction-def description widened, CAP_DIARIZATION JS symbol, swagger regenerated, /api/instructions discovery map updated. Tests: * core/backend: speaker-label normalisation (first-seen → SPEAKER_NN, per-speaker totals, nil-safety, fallback to backend NumSpeakers when no segments). * core/http/endpoints/openai: RTTM rendering (file-id basename, negative duration clamping, fallback id). * tests/e2e: mock-backend grew a deterministic Diarize that emits raw labels "5","2","5" so the e2e suite verifies SPEAKER_NN remapping, verbose_json speakers summary + transcript pass-through (gated by include_text), RTTM bytes content-type, and rejection of unknown response_format. mock-diarize model config registered with known_usecases=[FLAG_DIARIZATION] to bypass the backend-name guard. Docs: new features/audio-diarization.md (request/response, RTTM example, sherpa-onnx + vibevoice setup), cross-link from audio-to-text.md, entry in whats-new.md. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(diarization): correct sherpa-onnx symbol name + lint cleanup CI failures on #9654: * sherpa-onnx-grpc-{tts,transcription} and sherpa-onnx-realtime panicked at backend startup with `undefined symbol: SherpaOnnxDestroyOfflineSpeakerDiarizationResult`. Upstream's actual symbol is SherpaOnnxOfflineSpeakerDiarizationDestroyResult (Destroy in the middle, not the prefix); the rest of the diarization surface follows the same naming pattern. The mismatched name made purego.RegisterLibFunc fail at dlopen time and crashed the gRPC server before the BeforeAll could probe Health, taking down every sherpa-onnx test job — not just the diarization-related ones. * golangci-lint flagged 5 errcheck violations on new defer cleanups (os.RemoveAll / Close / conn.Close); wrap each in a `defer func() { _ = X() }()` closure (matches the pattern other LocalAI files use for new code, since pre-existing bare defers are grandfathered in via new-from-merge-base). * golangci-lint also flagged forbidigo violations: the new diarization_test.go files used testing.T-style `t.Errorf` / `t.Fatalf`, which are forbidden by the project's coding-style policy (.agents/coding-style.md). Convert both files to Ginkgo/Gomega Describe/It with Expect(...) — they get picked up by the existing TestBackend / TestOpenAI suites, no new suite plumbing needed. * modernize linter: tightened the diarization segment loop to `for i := range int(numSegments)` (Go 1.22+ idiom). Verified locally: golangci-lint with new-from-merge-base=origin/master reports 0 issues across all touched packages, and the four mocked diarization e2e specs in tests/e2e/mock_backend_test.go still pass. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(vibevoice-cpp): convert non-WAV input via ffmpeg + raise ASR token budget Confirmed end-to-end against a real LocalAI instance with vibevoice-asr-q4_k loaded and the multi-speaker MP3 sample at vibevoice.cpp/samples/2p_argument.mp3: both /v1/audio/transcriptions and /v1/audio/diarization now succeed and return correctly attributed speaker turns for the full clip. Two latent issues surfaced once the diarization endpoint actually exercised the backend with a non-trivial input: 1. vv_capi_asr only accepts WAV via load_wav_24k_mono. The previous code passed the uploaded path straight through, so anything that wasn't already a 24 kHz mono s16le WAV failed at the C side with rc=-8 and the very unhelpful "vv_capi_asr failed". prepareWavInput shells out to ffmpeg ("-ar 24000 -ac 1 -acodec pcm_s16le") in a per-call temp dir, matching the rate the model was trained on; both AudioTranscription and Diarize now route through it. This is the same shape sherpa-onnx uses (utils.AudioToWav), but vibevoice needs 24 kHz rather than 16 kHz so we don't reuse that helper. 2. The C ABI's max_new_tokens defaults to 256 when 0 is passed. That's fine for a five-second clip but not for anything past ~10 s — vibevoice stops mid-JSON, the parse fails, and the caller sees a hard error. Pass a much larger budget (16 384 ≈ ~9 minutes of speech at the model's ~30 tok/s rate); generation stops at EOS so this is a cap rather than a target. 3. As a defensive belt-and-braces, mirror AudioTranscription's existing "fall back to a single segment if the model emits non-JSON text" pattern in Diarize, so partial / unusual model output never produces a 500. This kept the endpoint usable while diagnosing (1) and (2), and is the right behaviour to keep. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(vibevoice-cpp): pass valid WAVs through directly so ffmpeg is not required at runtime Spotted by tests-e2e-backend (1.25.x): the previous fix forced every incoming audio file through `ffmpeg -ar 24000 ...`, which meant the backend container — which does not ship ffmpeg — failed even for the existing happy path where the caller already uploads a WAV. The container-side error was: rpc error: code = Unknown desc = vibevoice-cpp: ffmpeg convert to 24k mono wav: exec: "ffmpeg": executable file not found in $PATH Reading vibevoice.cpp's audio_io.cpp, `load_wav_24k_mono` uses drwav and already accepts any PCM/IEEE-float WAV at any sample rate, downmixes multi-channel input to mono, and resamples to 24 kHz internally. So the only inputs that genuinely need an external converter are non-WAV formats (MP3, OGG, FLAC, ...). Detect WAVs by RIFF/WAVE magic at bytes 0..3 / 8..11 and pass them straight through with a no-op cleanup; everything else still goes through ffmpeg with the same 24 kHz mono s16le target. The result: * Container builds without ffmpeg keep working for WAV uploads (the e2e-backends fixture is jfk.wav at 16 kHz mono s16le). * MP3 and other non-WAV inputs still get the new ffmpeg conversion path so the diarization endpoint stays useful. * If the caller uploads a non-WAV but ffmpeg isn't on PATH, the surfaced error is still descriptive enough to act on. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] * fix(ci): make gcc-14 install in Dockerfile.golang best-effort for jammy bases The LocalVQE PR (`bb033b16`) made `gcc-14 g++-14` an unconditional apt install in backend/Dockerfile.golang and pointed update-alternatives at them. That works on the default `BASE_IMAGE=ubuntu:24.04` (noble has gcc-14 in main), but every Go backend that builds on `nvcr.io/nvidia/l4t-jetpack:r36.4.0` — jammy under the hood — now fails at the apt step: E: Unable to locate package gcc-14 This blocked unrelated jobs: backend-jobs(*-nvidia-l4t-arm64-{stablediffusion-ggml, sam3-cpp, whisper, acestep-cpp, qwen3-tts-cpp, vibevoice-cpp}). LocalVQE itself is only matrix-built on ubuntu:24.04 (CPU + Vulkan), so it doesn't actually need gcc-14 anywhere else. Make the gcc-14 install conditional on the package being available in the configured apt repos. On noble: identical behaviour to today (gcc-14 installed, update-alternatives points at it). On jammy: skip the gcc-14 stanza entirely and let build-essential's default gcc take over, which is what the other Go backends compile with anyway. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7 [Claude Code] --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-05 15:10:13 +02:00
LocalAI [bot]	a91b05907c	feat(swagger): update swagger (#9660 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-05 01:50:17 +02:00
LocalAI [bot]	1ad5b5907d	feat(swagger): update swagger (#9643 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-05-02 22:33:47 +02:00
LocalAI [bot]	1fe3558ec6	feat(swagger): update swagger (#9607 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-29 00:18:02 +02:00
LocalAI [bot]	5efbe8405f	feat(swagger): update swagger (#9587 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-27 23:28:03 +02:00
LocalAI [bot]	c755cd5ab5	feat(swagger): update swagger (#9518 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-23 23:26:50 +02:00
Ettore Di Giacinto	181ebb6df4	feat: voice recognition (#9500 ) * feat(voice-recognition): add /v1/voice/{verify,analyze,embed} + speaker-recognition backend Audio analog to face recognition. Adds three gRPC RPCs (VoiceVerify / VoiceAnalyze / VoiceEmbed), their Go service and HTTP layers, a new FLAG_SPEAKER_RECOGNITION capability flag, and a Python backend scaffold under backend/python/speaker-recognition/ wrapping SpeechBrain ECAPA-TDNN with a parallel OnnxDirectEngine for WeSpeaker / 3D-Speaker ONNX exports. The kokoros Rust backend gets matching unimplemented trait stubs — tonic's async_trait has no defaults, so adding an RPC without Rust stubs breaks the build (same regression fixed by `eb01c772` for face). Swagger, /api/instructions, and the auth RouteFeatureRegistry / APIFeatures list are updated so the endpoints surface everywhere a client or admin UI looks. Assisted-by: Claude:claude-opus-4-7 * feat(voice-recognition): add 1:N identify + register/forget endpoints Mirrors the face-recognition register/identify/forget surface. New package core/services/voicerecognition/ carries a Registry interface and a local-store-backed implementation (same in-memory vector-store plumbing facerecognition uses, separate instance so the embedding spaces stay isolated). Handlers under /v1/voice/{register,identify,forget} reuse backend.VoiceEmbed to compute the probe vector, then delegate the nearest-neighbour search to the registry. Default cosine-distance threshold is tuned for ECAPA-TDNN on VoxCeleb (0.25, EER ~1.9%). As with the face registry, the current backing is in-memory only — a pgvector implementation is a future constructor-level swap. Assisted-by: Claude:claude-opus-4-7 * feat(voice-recognition): gallery, docs, CI and e2e coverage - backend/index.yaml: speaker-recognition backend entry + CPU and CUDA-12 image variants (plus matching development variants). - gallery/index.yaml: speechbrain-ecapa-tdnn (default) and wespeaker-resnet34 model entries. The WeSpeaker SHA-256 is a deliberate placeholder — the HF URI must be curl'd and its hash filled in before the entry installs. - docs/content/features/voice-recognition.md: API reference + quickstart, mirrors the face-recognition docs. - React UI: CAP_SPEAKER_RECOGNITION flag export (consumers follow face's precedent — no dedicated tab yet). - tests/e2e-backends: voice_embed / voice_verify / voice_analyze specs. Helper resolveFaceFixture is reused as-is — the only thing face/voice share is "download a file into workDir", so no need for a new helper. - Makefile: docker-build-speaker-recognition + test-extra-backend- speaker-recognition-{ecapa,all} targets. Audio fixtures default to VCTK p225/p226 samples from HuggingFace. - CI: test-extra.yml grows a tests-speaker-recognition-grpc job mirroring insightface. backend.yml matrix gains CPU + CUDA-12 image build entries — scripts/changed-backends.js auto-picks these up. Assisted-by: Claude:claude-opus-4-7 * feat(voice-recognition): wire a working /v1/voice/analyze head Adds AnalysisHead: a lazy-loading age / gender / emotion inference wrapper that plugs into both SpeechBrainEngine and OnnxDirectEngine. Defaults to two open-licence HuggingFace checkpoints: - audeering/wav2vec2-large-robust-24-ft-age-gender (Apache 2.0) — age regression + 3-way gender (female / male / child). - superb/wav2vec2-base-superb-er (Apache 2.0) — 4-way emotion. Both are optional and degrade gracefully when transformers or the model can't be loaded — the engine raises NotImplementedError so the gRPC layer returns 501 instead of a generic 500. Emotion classes pass through from the model (neutral/happy/angry/sad on the default checkpoint); the e2e test now accepts any non-empty dominant gender so custom age_gender_model overrides don't fail it. Adds transformers to the backend's CPU and CUDA-12 requirements. Assisted-by: Claude:claude-opus-4-7 * fix(voice-recognition): pin real WeSpeaker ResNet34 ONNX SHA-256 Replaces the placeholder hash in gallery/index.yaml with the actual SHA-256 (7bb2f06e…) of the upstream Wespeaker/wespeaker-voxceleb-resnet34-LM ONNX at ~25MB. `local-ai models install wespeaker-resnet34` now succeeds. Assisted-by: Claude:claude-opus-4-7 * fix(voice-recognition): soundfile loader + honest analyze default Two issues surfaced on first end-to-end smoke with the actual backend image: 1. torchaudio.load in torchaudio 2.8+ requires the torchcodec package for audio decoding. Switch SpeechBrainEngine._load_waveform to the already-present soundfile (listed in requirements.txt) plus a numpy linear resample to 16kHz. Drops a heavy ffmpeg-linked dep and the codepath we never exercise (torchaudio's ffmpeg backend). 2. The AnalysisHead was defaulting to audeering/wav2vec2-large-robust- 24-ft-age-gender, but AutoModelForAudioClassification silently mangles that checkpoint — it reports the age head weights as UNEXPECTED and re-initialises the classifier head with random values, so the "gender" output is noise and there is no age output at all. Make age/gender opt-in instead (empty default; users wire a cleanly-loadable Wav2Vec2ForSequenceClassification checkpoint via age_gender_model: option). Emotion keeps its working Superb default. Also broaden _infer_age_gender's tensor-shape handling and catch runtime exceptions so a dodgy age/gender head never takes down the whole analyze call. Docs and README updated to match the new policy. Verified with the branch-scoped gallery on localhost: - voice/embed → 192-d ECAPA-TDNN vector - voice/verify → same-clip dist≈6e-08 verified=true; cross-speaker dist 0.76–0.99 verified=false (as expected) - voice/register/identify/forget → round-trip works, 404 on unknown id - voice/analyze → emotion populated, age/gender omitted (opt-in) Assisted-by: Claude:claude-opus-4-7 * fix(voice-recognition): real CI audio fixtures + fixture-agnostic verify spec Two issues surfaced after CI actually ran the speaker-recognition e2e target (I'd curl-tested against a running server but hadn't run the make target locally): 1. The default BACKEND_TEST_VOICE_AUDIO_* URLs pointed at huggingface.co/datasets/CSTR-Edinburgh/vctk paths that return 404 (the dataset is gated). Swap them for the speechbrain test samples served from github.com/speechbrain/speechbrain/raw/develop/ — public, no auth, correct 16kHz mono format. 2. The VoiceVerify spec required d(file1,file2) < 0.4, assuming file1/file2 were same-speaker. The speechbrain samples are three different speakers (example1/2/5), and there is no easy un-gated source of true same-speaker audio pairs (VoxCeleb/VCTK/LibriSpeech are all license- or size-gated for CI use). Replace the ceiling check with a relative-ordering assertion: d(pair) > d(same-clip) for both file2 and file3 — that's enough to prove the embeddings encode speaker info, and it works with any three non-identical clips. Actual speaker ordering d(1,2) vs d(1,3) is logged but not asserted. Local run: 4/4 voice specs pass (Health, LoadModel, VoiceEmbed, VoiceVerify) on the built backend image. 12 non-voice specs skipped as expected. Assisted-by: Claude:claude-opus-4-7 * fix(ci): checkout with submodules in the reusable backend_build workflow The kokoros Rust backend build fails with failed to read .../sources/Kokoros/kokoros/Cargo.toml: No such file because the reusable backend_build.yml workflow's actions/checkout step was missing `submodules: true`. Dockerfile.rust does `COPY . /LocalAI`, and without the submodule files the subsequent `cargo build` can't find the vendored Kokoros crate. The bug pre-dates this PR — scripts/changed-backends.js only triggers the kokoros image job when something under backend/rust/kokoros or the shared proto changes, so master had been coasting past it. The voice-recognition proto addition re-broke it. Other checkouts in backend.yml (llama-cpp-darwin) and test-extra.yml (insightface, kokoros, speaker-recognition) already pass `submodules: true`; this brings the shared backend image builder in line. Assisted-by: Claude:claude-opus-4-7	2026-04-23 12:07:14 +02:00
LocalAI [bot]	2068b6f43c	feat(swagger): update swagger (#9498 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-22 22:51:39 +02:00
Adira	607efe5a4c	fix(backend-monitor): accept model as a query parameter (#9411 ) The /backend/monitor endpoint is routed as GET but its handler bound the model name from a request body, which is invalid per REST and breaks Swagger UI and OpenAPI codegen tools that refuse to send bodies with GET. Switch to reading ?model=<name> as a query parameter and update the Swagger annotation, regenerated spec files, and documentation. The handler still falls back to body binding when the query parameter is absent, so existing clients sending {"model": "..."} continue to work. Fixes #9207 Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>	2026-04-21 22:06:35 +02:00
LocalAI [bot]	cae79d9107	feat(swagger): update swagger (#9431 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-19 23:39:50 +02:00
LocalAI [bot]	07e244d869	feat(swagger): update swagger (#9356 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-15 01:25:24 +02:00
LocalAI [bot]	0f90d17aac	feat(swagger): update swagger (#9329 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-13 09:42:36 +02:00
LocalAI [bot]	b20a2f1cea	feat(swagger): update swagger (#9318 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-11 22:31:36 +02:00
LocalAI [bot]	c39213443b	feat(swagger): update swagger (#9310 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-11 08:38:55 +02:00
LocalAI [bot]	6bc76dda6d	feat(swagger): update swagger (#9300 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-10 01:05:53 +02:00
Richard Palethorpe	557d0f0f04	feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-04 15:14:35 +02:00
LocalAI [bot]	80699a3f70	feat(swagger): update swagger (#9187 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-30 23:48:06 +02:00
LocalAI [bot]	731176ce3a	feat(swagger): update swagger (#9136 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-26 07:58:11 +01:00
LocalAI [bot]	470d5e506f	feat(swagger): update swagger (#9103 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-22 21:28:20 +01:00
LocalAI [bot]	f7e3aab4fc	feat(swagger): update swagger (#9085 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-20 21:45:03 +01:00
LocalAI [bot]	6054d2a91b	feat(swagger): update swagger (#9075 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-19 21:49:40 +01:00
LocalAI [bot]	996dd7652f	feat(swagger): update swagger (#8961 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-11 22:01:55 +01:00
LocalAI [bot]	6a633f2533	feat(swagger): update swagger (#8923 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-09 22:14:17 +01:00
LocalAI [bot]	0063e5d68f	feat(swagger): update swagger (#8706 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-01 21:33:19 +01:00
LocalAI [bot]	bb226d1eaa	feat(swagger): update swagger (#8654 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-02-25 21:52:35 +01:00
LocalAI [bot]	c8d74d35df	feat(swagger): update swagger (#8418 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-02-05 21:33:44 +01:00
Ettore Di Giacinto	53276d28e7	feat(musicgen): add ace-step and UI interface (#8396 ) * feat(musicgen): add ace-step and UI interface Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Correctly handle model dir Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop auto-download Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add to models, fixup UIs icons Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * l4t13 is incompatbile Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * avoid pinning version for cuda12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop l4t12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-05 12:04:53 +01:00
LocalAI [bot]	ee76a0cd1c	feat(swagger): update swagger (#8304 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-30 21:27:10 +01:00
LocalAI [bot]	6dd44742ea	feat(swagger): update swagger (#8150 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-21 21:44:44 +01:00
LocalAI [bot]	a021df5a88	feat(swagger): update swagger (#8098 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-18 22:10:06 +01:00
LocalAI [bot]	ab893fe302	feat(swagger): update swagger (#7964 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-10 21:46:23 +01:00
LocalAI [bot]	793e4907a2	feat(swagger): update swagger (#7847 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-01-03 22:09:39 +01:00
LocalAI [bot]	5ee6c1810b	feat(swagger): update swagger (#7820 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-01-01 21:16:38 +01:00
LocalAI [bot]	234bf7e2ad	feat(swagger): update swagger (#7794 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-12-30 21:05:01 +00:00
LocalAI [bot]	17d84c8556	feat(swagger): update swagger (#7400 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-30 21:39:29 +00:00
LocalAI [bot]	a9b8869964	feat(swagger): update swagger (#7394 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-30 09:05:46 +01:00
LocalAI [bot]	304ac94d01	feat(swagger): update swagger (#7356 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-25 22:19:53 +01:00
LocalAI [bot]	839aa7b42b	feat(swagger): update swagger (#7286 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-17 07:49:06 +01:00
LocalAI [bot]	3276d1cdaf	feat(swagger): update swagger (#7276 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-15 21:50:30 +01:00
LocalAI [bot]	b82645d28d	feat(swagger): update swagger (#7267 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-11-13 21:28:10 +00:00
LocalAI [bot]	0468456fad	feat(swagger): update swagger (#6834 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-10-27 21:24:28 +01:00
LocalAI [bot]	5de7a43319	feat(swagger): update swagger (#6384 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-10-05 22:14:10 +02:00
LocalAI [bot]	54ff70e451	feat(swagger): update swagger (#6162 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-08-29 08:11:34 +02:00
LocalAI [bot]	4594430a3e	feat(swagger): update swagger (#6111 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-08-19 22:56:00 +02:00
LocalAI [bot]	224063f0f7	feat(swagger): update swagger (#5983 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-08-07 00:32:11 +02:00
LocalAI [bot]	624f3b1fc8	feat(swagger): update swagger (#5950 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2025-07-31 21:04:23 +00:00
Ettore Di Giacinto	9aadfd485f	chore: update swagger (#5946 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-07-31 16:22:27 +02:00
Ettore Di Giacinto	2d64269763	feat: Add backend gallery (#5607 ) * feat: Add backend gallery This PR add support to manage backends as similar to models. There is now available a backend gallery which can be used to install and remove extra backends. The backend gallery can be configured similarly as a model gallery, and API calls allows to install and remove new backends in runtime, and as well during the startup phase of LocalAI. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add backends docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * wip: Backend Dockerfile for python backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: drop extras images, build python backends separately Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixup on all backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * test CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Tweaks Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop old backends leftovers Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixup CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Move dockerfile upper Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix proto Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Feature dropped for consistency - we prefer model galleries Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add missing packages in the build image Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * exllama is ponly available on cublas Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * pin torch on chatterbox Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups to index Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Debug CI * Install accellerators deps Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add target arch * Add cuda minor version Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use self-hosted runners Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: use quay for test images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups for vllm and chatterbox Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small fixups on CI Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chatterbox is only available for nvidia Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Simplify CI builds Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Adapt test, use qwen3 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(model gallery): add jina-reranker-v1-tiny-en-gguf Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Use reranker from llama.cpp in AIO images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Limit concurrent jobs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-06-15 14:56:52 +02:00

1 2

80 Commits