LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-18 05:33:09 -04:00

Author	SHA1	Message	Date
walcz-de	00fcf6936c	fix: implement encoding_format=base64 for embeddings endpoint (#9135 ) The OpenAI Node.js SDK v4+ sends encoding_format=base64 by default. LocalAI previously ignored this parameter and always returned a float JSON array, causing a silent data corruption bug in any Node.js client (AnythingLLM Desktop, LangChain.js, LlamaIndex.TS, …): // What the client does when it expects base64 but receives a float array: Buffer.from(floatArray, 'base64') Node.js treats a non-string first argument as a byte array — each float32 value is truncated to a single byte — and Float32Array then reads those bytes as floats, yielding dims/4 values. Vector databases (Qdrant, pgvector, …) then create collections with the wrong dimension, causing all similarity searches to fail silently. e.g. granite-embedding-107m (384 dims) → 96 stored in Qdrant jina-embeddings-v3 (1024 dims) → 256 stored in Qdrant Changes: - core/schema/prediction.go: add EncodingFormat string field to PredictionOptions so the request parameter is parsed and available throughout the request pipeline - core/schema/openai.go: add EmbeddingBase64 string field to Item; add MarshalJSON so the "embedding" JSON key emits either []float32 or a base64 string depending on which field is populated — all other Item consumers (image, video endpoints) are unaffected - core/http/endpoints/openai/embeddings.go: add floatsToBase64() which packs a float32 slice as little-endian bytes and base64-encodes it; add embeddingItem() helper; both InputToken and InputStrings loops now honour encoding_format=base64 Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-25 17:38:07 +01:00
Ettore Di Giacinto	031a36c995	feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092 ) * feat: wire min_p Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: inferencing defaults Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(refactor): re-use iterative parser Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: generate automatically inference defaults from unsloth Instead of trying to re-invent the wheel and maintain here the inference defaults, prefer to consume unsloth ones, and contribute there as necessary. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: apply defaults also to models installed via gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: be consistent and apply fallback to all endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 00:57:15 +01:00
Ettore Di Giacinto	f7e8d9e791	feat(quantization): add quantization backend (#9096 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 00:56:34 +01:00
Ettore Di Giacinto	d9c1db2b87	feat: add (experimental) fine-tuning support with TRL (#9088 ) * feat: add fine-tuning endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(experimental): add fine-tuning endpoint and TRL support This changeset defines new GRPC signatues for Fine tuning backends, and add TRL backend as initial fine-tuning engine. This implementation also supports exporting to GGUF and automatically importing it to LocalAI after fine-tuning. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * commit TRL backend, stop by killing process Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * move fine-tune to generic features Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * add evals, reorder menu Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-21 02:08:02 +01:00
Richard Palethorpe	8cd3f9fc47	feat(ui, openai): Structured errors and link to traces in error toast (#9068 ) First when sending errors over SSE we now clearly identify them as such instead of just sending the error string as a chat completion message. We use this in the UI to identify errors and link to them to the traces. Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-20 15:06:07 +01:00
Ettore Di Giacinto	aea21951a2	feat: add users and authentication support (#9061 ) * feat(ui): add users and authentication support Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: allow the admin user to impersonificate users Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: ui improvements, disable 'Users' button in navbar when no auth is configured Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: add OIDC support Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: gate models Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: cache requests to optimize speed Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * small UI enhancements Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ui): style improvements Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: cover other paths by auth Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: separate local auth, refactor Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * security hardening, approval mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: fix tests and expectations Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: update localagi/localrecall Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-19 21:40:51 +01:00
Richard Palethorpe	e832efeb9e	fix(ui): Refresh model list on deletion (#9059 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-18 14:07:45 +01:00
Tv	8a0edd0809	Always populate ORItemParam.Summary (#9049 ) * fix(openresponses): do not omit required fields summary and id * fix(openresponses): ensure ORItemParam.Summary is never null Normalize Summary to an empty slice at serialization chokepoints (sendSSEEvent, bufferEvent, buildORResponse) so it always serializes as [] instead of null. Closes #9047	2026-03-18 08:45:46 +01:00
Richard Palethorpe	35d509d8e7	feat(ui): Per model backend logs and various fixes (#9028 ) * feat(gallery): Switch to expandable box instead of pop-over and display model files Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(ui, backends): Add individual backend logging Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(ui): Set the context settings from the model config Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-18 08:31:26 +01:00
Ettore Di Giacinto	eef808d921	fix: call .String() on AllowedTools Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-17 23:16:19 +00:00
Ettore Di Giacinto	ee96e5e08d	chore: refactor endpoints to use same inferencing path, add automatic retrial mechanism in case of errors (#9029 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-16 21:31:02 +01:00
Ettore Di Giacinto	d8161bfe57	fix(api): unescape model names (#9024 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-16 00:56:37 +01:00
Ettore Di Giacinto	5fd42399d4	feat: support streaming mode for tool calls in agent mode, fix interleaved thinking stream (#9023 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-16 00:50:19 +01:00
Ettore Di Giacinto	dde0353432	chore(api): add path to expose collection raw files Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-14 22:31:45 +00:00
Richard Palethorpe	f9a850c02a	feat(realtime): WebRTC support (#8790 ) * feat(realtime): WebRTC support Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(tracing): Show full LLM opts and deltas Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-13 21:37:15 +01:00
Attila Györffy	5a67b5d73c	Fix image upload processing and img2img pipeline in diffusers backend (#8879 ) * fix: add missing bufio.Flush in processImageFile The processImageFile function writes decoded image data (from base64 or URL download) through a bufio.NewWriter but never calls Flush() before closing the underlying file. Since bufio's default buffer is 4096 bytes, small images produce 0-byte files and large images are truncated — causing PIL to fail with "cannot identify image file". This breaks all image input paths: file, files, and ref_images parameters in /v1/images/generations, making img2img, inpainting, and reference image features non-functional. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * fix: merge options into kwargs in diffusers GenerateImage The GenerateImage method builds a local `options` dict containing the source image (PIL), negative_prompt, and num_inference_steps, but never merges it into `kwargs` before calling self.pipe(*kwargs). This causes img2img to fail with "Input is in incorrect format" because the pipeline never receives the image parameter. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> test: add unit test for processImageFile base64 decoding Verifies that a base64-encoded PNG survives the write path (encode → decode → bufio.Write → Flush → file on disk) with byte-for-byte fidelity. The test image is small enough to fit entirely in bufio's 4096-byte buffer, which is the exact scenario where the missing Flush() produced a 0-byte file. Also tests that invalid base64 input is handled gracefully. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * test: verify GenerateImage merges options into pipeline kwargs Mocks the diffusers pipeline and calls GenerateImage with a source image and negative prompt. Asserts that the pipeline receives the image, negative_prompt, and num_inference_steps via kwargs — the exact parameters that were silently dropped before the fix. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * fix: move kwargs.update(options) earlier in GenerateImage Move the options merge right after self.options merge (L742) so that image, negative_prompt, and num_inference_steps are available to all downstream code paths including img2vid and txt2vid. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * test: convert processImageFile tests to ginkgo Replace standard testing with ginkgo/gomega to be consistent with the rest of the test suites in the project. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> --------- Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-11 08:05:50 +01:00
Ettore Di Giacinto	8818452d85	feat(ui): MCP Apps, mcp streaming and client-side support (#8947 ) * Revert "fix: Add timeout-based wait for model deletion completion (#8756)" This reverts commit `9e1b0d0c82`. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: add mcp prompts and resources Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add client-side MCP Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): allow to authenticate MCP servers Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add MCP Apps Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: update AGENTS Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: allow to collapse navbar, save state in storage Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(ui): add MCP button also to home page Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(chat): populate string content Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-11 07:30:49 +01:00
Ettore Di Giacinto	89076bab92	fix(ui): wrap struct to pass metadata fields Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-10 08:48:25 +00:00
Ettore Di Giacinto	85f3558d22	feat(ui): add canvas mode, support history in agent chat (#8927 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-09 23:42:47 +01:00
Ettore Di Giacinto	a026277ab9	feat(mlx-distributed): add new MLX-distributed backend (#8801 ) * feat(mlx-distributed): add new MLX-distributed backend Add new MLX distributed backend with support for both TCP and RDMA for model sharding. This implementation ties in the discovery implementation already in place, and re-uses the same P2P mechanism for the TCP MLX-distributed inferencing. The Auto-parallel implementation is inspired by Exo's ones (who have been added to acknowledgement for the great work!) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * expose a CLI to facilitate backend starting Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: make manual rank0 configurable via model configs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add missing features from mlx backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Apply suggestion from @mudler Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-09 17:29:32 +01:00
Ettore Di Giacinto	b2f81bfa2e	feat(functions): add peg-based parsing and allow backends to return tool calls directly (#8838 ) * feat(functions): add peg-based parsing Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: support returning toolcalls directly from backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: do run PEG only if backend didn't send deltas Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-08 22:21:57 +01:00
Ettore Di Giacinto	ac48867b7d	feat: add agentic management (#8820 ) * feat: add standalone and agentic functionalities Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * expose agents via responses api Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-07 00:03:08 +01:00
BitToby	96efa4fce0	feat: add WebSocket mode support for the response api (#8676 ) * feat: add WebSocket mode support for the response api Signed-off-by: bittoby <218712309+bittoby@users.noreply.github.com> * test: add e2e tests for WebSocket Responses API Signed-off-by: bittoby <218712309+bittoby@users.noreply.github.com> --------- Signed-off-by: bittoby <218712309+bittoby@users.noreply.github.com>	2026-03-06 10:36:59 +00:00
Ettore Di Giacinto	580517f9db	feat: pass-by metadata to predict options (#8795 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-05 22:50:10 +01:00
Ettore Di Giacinto	09ddaf94b2	feat(ui): move to React for frontend (#8772 ) * feat(ui): move to React Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add import model Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * syntax highlight Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Minor fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-05 21:47:12 +01:00
Ettore Di Giacinto	c7c4a20a9e	fix: retry when LLM returns empty messages (#8704 ) * debug Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * retry instead of re-computing a response Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-01 21:32:38 +01:00
Ettore Di Giacinto	983db7bedc	feat(ui): add model size estimation (#8684 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-28 23:03:47 +01:00
LocalAI [bot]	1027c487a6	fix: reload model after editing YAML config (issue #8647 ) (#8652 ) fix: reload model configuration after editing (issue #8647) - Add *model.ModelLoader parameter to EditModelEndpoint - Call ml.ShutdownModel() after saving config to unload the running model - Model will be reloaded on next inference request with new settings (e.g., context_size) - Update route registration to pass ml to EditModelEndpoint Signed-off-by: localai-bot <localai-bot@users.noreply.github.com> Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>	2026-02-25 22:18:42 +01:00
Copilot	3ac7301f31	Add `sample_rate` support to TTS API via post-processing resampling (#8650 ) * Initial plan * Add TTS sample_rate support via AudioResample post-processing Co-authored-by: mudler <2420543+mudler@users.noreply.github.com> --------- Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-02-25 16:36:27 +01:00
Richard Palethorpe	b1b67b973e	fix(realtime): Add functions to conversation history (#8616 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-21 19:03:49 +01:00
Ettore Di Giacinto	51902df7ba	fix: merge openresponses messages (#8615 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-21 09:56:43 +01:00
Ettore Di Giacinto	352b8aaa1b	fix(ui): pass by needed values to unbreak model editor Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-20 09:06:17 +01:00
Ettore Di Giacinto	b471619ad9	chore(deps): bump cogito and add new options to the agent config (#8601 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-18 22:10:26 +01:00
Richard Palethorpe	4fe830ff58	fix(realtime): Limit buffer sizes to prevent DoS (#8596 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-18 14:36:43 +01:00
Richard Palethorpe	86b3bc9313	fix(realtime): Better support for thinking models and setting model parameters (#8595 ) * fix(realtime): Wrap functions in OpenAI chat completions format Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): Set max tokens from session object Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Find thinking start tag for thinking extraction Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Don't send buffer cleared message when we automatically drop it Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-18 14:36:16 +01:00
Ettore Di Giacinto	1c4e5aa5c0	chore: bump cogito (#8568 ) Adapt to new API and drop call to Ask() Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-14 22:52:22 +01:00
Richard Palethorpe	5bdbb10593	fix(realtime): Send proper image data to backend (#8547 ) * fix(realtime): Allow empty parameters Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Just pass base64 string to backend Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-13 18:01:07 +01:00
Richard Palethorpe	f6c80a6987	feat(realtime): Allow sending text, image and audio conversation items" (#8524 ) feat(realtime): Allow sending text and image conversation items Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-02-12 19:33:46 +00:00
Richard Palethorpe	1479bee894	fix(realtime): Sampling and websocket locking (#8521 ) * fix(realtime): Use locked websocket for concurrent access Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Use sample rate set in session Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(config): Allow pipelines to have no model parameters Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-12 13:57:34 +01:00
Richard Palethorpe	7270a98ce5	fix(realtime): Use user provided voice and allow pipeline models to have no backend (#8415 ) * fix(realtime): Use the voice provided by the user or none at all Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(ui,config): Allow pipeline models to have no backend and use same validation in frontend Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-02-11 14:18:05 +01:00
Kolega.dev	780877d1d0	security: validate URLs to prevent SSRF in content fetching endpoints (#8476 ) User-supplied URLs passed to GetContentURIAsBase64() and downloadFile() were fetched without validation, allowing SSRF attacks against internal services. Added URL validation that blocks private IPs, loopback, link-local, and cloud metadata endpoints before fetching. Co-authored-by: kolega.dev <faizan@kolega.ai>	2026-02-10 15:14:14 +01:00
Ettore Di Giacinto	697f6aa71c	feat(audio): set audio content type (#8416 ) * feat(audio): set audio content type Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: add tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-05 19:14:12 +01:00
Ettore Di Giacinto	53276d28e7	feat(musicgen): add ace-step and UI interface (#8396 ) * feat(musicgen): add ace-step and UI interface Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Correctly handle model dir Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop auto-download Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add to models, fixup UIs icons Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Update docs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * l4t13 is incompatbile Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * avoid pinning version for cuda12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop l4t12 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-02-05 12:04:53 +01:00
Richard Palethorpe	5195062e12	fix(realtime): Include noAction function in prompt template and handle tool_choice (#8372 ) The realtime endpoint was not passing the noAction "answer" function to the model in the prompt template, causing the model to always call user-provided tools even when a direct response was appropriate. Root cause: - User tools were added to the funcs list - TemplateMessages() was called to generate the prompt - noAction function was only added AFTER templating - This meant the prompt didn't include the "answer" function, even though the grammar did Fix: - Move noAction function creation before TemplateMessages() call so it's included in both the prompt and grammar - Add proper tool_choice parameter handling to support "auto", "required", "none", and specific function selection - Match behavior of the standard chat endpoint 💘 Generated with Crush Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land> Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-02-03 14:30:37 +01:00
Alex O'Connell	b7585ca738	fix(api): Add missing field in initial OpenAI streaming response (#8341 ) Add missing field in initial OpenAI streaming response Signed-off-by: Alex O'Connell <35843486+acon96@users.noreply.github.com>	2026-02-02 08:30:04 +01:00
Andres	b6459ddd57	feat(api): Add transcribe response format request parameter & adjust STT backends (#8318 ) * WIP response format implementation for audio transcriptions (cherry picked from commit e271dd764bbc13846accf3beb8b6522153aa276f) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Rework transcript response_format and add more formats (cherry picked from commit 6a93a8f63e2ee5726bca2980b0c9cf4ef8b7aeb8) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Add test and replace go-openai package with official openai go client (cherry picked from commit f25d1a04e46526429c89db4c739e1e65942ca893) Signed-off-by: Andres Smith <andressmithdev@pm.me> * Fix faster-whisper backend and refactor transcription formatting to also work on CLI Signed-off-by: Andres Smith <andressmithdev@pm.me> (cherry picked from commit 69a93977d5e113eb7172bd85a0f918592d3d2168) Signed-off-by: Andres Smith <andressmithdev@pm.me> --------- Signed-off-by: Andres Smith <andressmithdev@pm.me> Co-authored-by: nanoandrew4 <nanoandrew4@gmail.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-02-01 17:33:17 +01:00
Ettore Di Giacinto	4077aaf978	chore: re-enable e2e tests, fixups anthropic API tools support (#8296 ) * chore(tests): add mock backend e2e tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fixup anthropic tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * prepare e2e tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop repetitive tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop specific CI workflow Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixup anthropic issues, move all e2e tests to use mocked backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-30 12:41:50 +01:00
Ettore Di Giacinto	68dd9765a0	feat(tts): add support for streaming mode (#8291 ) * feat(tts): add support for streaming mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Send first audio, make sure it's 16 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-30 11:58:01 +01:00
Richard Palethorpe	dd8e74a486	feat(realtime): Add audio conversations (#6245 ) * feat(realtime): Add audio conversations Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(realtime): Vendor the updated API and modify for server side Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat(realtime): Update to the GA realtime API Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore: Document realtime API and add docs to AGENTS.md Signed-off-by: Richard Palethorpe <io@richiejp.com> * feat: Filter reasoning from spoken output Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Send delta and done events for tool calls and audio transcripts Ensure that content is sent in both deltas and done events for function call arguments and audio transcripts. This fixes compatibility with clients that rely on delta events for parsing. 💘 Generated with Crush Signed-off-by: Richard Palethorpe <io@richiejp.com> * fix(realtime): Improve tool call handling and error reporting - Refactor Model interface to accept []types.ToolUnion and *types.ToolChoiceUnion instead of JSON strings, eliminating unnecessary marshal/unmarshal cycles - Fix Parameters field handling: support both map[string]any and JSON string formats - Add PredictConfig() method to Model interface for accessing model configuration - Add comprehensive debug logging for tool call parsing and function config - Add missing return statement after prediction error (critical bug fix) - Add warning logs for NoAction function argument parsing failures - Improve error visibility throughout generateResponse function 💘 Generated with Crush Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land> Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-01-29 08:44:53 +01:00
Ettore Di Giacinto	0fa0ac4797	fix(videogen): drop incomplete endpoint, add GGUF support for LTX-2 (#8160 ) * Debug Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop openai video endpoint (is not complete) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add download button Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-22 14:09:20 +01:00

1 2 3 4 5

224 Commits