LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-03 04:46:54 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	8af963bdd9	fix(streaming): comply with OpenAI usage / stream_options spec (#9815 ) * fix(streaming): comply with OpenAI usage / stream_options spec (#8546) LocalAI emitted `"usage":{"prompt_tokens":0,...}` on every streamed chunk because `OpenAIResponse.Usage` was a value type without `omitempty`. The official OpenAI Node SDK and its consumers (continuedev/continue, Kilo Code, Roo Code, Zed, IntelliJ Continue) filter on a truthy `result.usage` to detect the trailing usage chunk; LocalAI's zero-but-non-null usage on every intermediate chunk made that filter swallow every content chunk and surface an empty chat response while the server log looked successful. Changes: - `core/schema/openai.go`: `Usage OpenAIUsage \`json:"usage,omitempty"\`` so intermediate chunks no longer carry a `usage` key. Add `OpenAIRequest.StreamOptions` with `include_usage` to mirror OpenAI's request field. - `core/http/endpoints/openai/chat.go` and `completion.go`: keep using the `Usage` struct field as an in-process channel for the running cumulative, but strip it before JSON marshalling. When the request set `stream_options.include_usage: true`, emit a dedicated trailing chunk with `"choices": []` and the populated usage (matching the OpenAI spec and llama.cpp's server behavior). - `chat_emit.go`: new `streamUsageTrailerJSON` helper; drop the `usage` parameter from `buildNoActionFinalChunks` since chunks no longer carry usage. - Update `image.go`, `inpainting.go`, `edit.go` to wrap their Usage values with `&` for the new pointer field. - UI: send `stream_options:{include_usage:true}` from the React (`useChat.js`) and legacy (`static/chat.js`) chat clients so the token-count badge keeps populating now that the server is spec-compliant. Tests: - New `chat_stream_usage_test.go` pins the spec invariants: intermediate chunks have no `usage` key, the trailer JSON has `"choices":[]` and a populated `usage`, and `OpenAIRequest` parses `stream_options.include_usage`. - Update `chat_emit_test.go` to reflect that finals no longer embed usage. Verified against the live LocalAI instance: before the fix Continue's filter logic swallowed 16/16 token chunks; with the new shape it yields 4/5 and routes usage through the dedicated trailer chunk. Fixes #8546 Assisted-by: Claude:opus-4.7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> fix(streaming): silence errcheck on usage trailer Fprintf The new spec-compliant `stream_options.include_usage` trailer writes were flagged by errcheck since they're new code (golangci-lint runs new-from-merge-base on master); the surrounding `fmt.Fprintf` data: writes are grandfathered. Drop the return values explicitly to match the linter's contract without adding a nolint shim. Assisted-by: Claude:opus-4.7 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>	2026-05-14 08:53:46 +02:00
Richard Palethorpe	557d0f0f04	feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-04 15:14:35 +02:00
Ettore Di Giacinto	59108fbe32	feat: add distributed mode (#9124 ) * feat: add distributed mode (experimental) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix data races, mutexes, transactions Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix events and tool stream in agent chat Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * use ginkgo Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(cron): compute correctly time boundaries avoiding re-triggering Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * enhancements, refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do not flood of healthy checks Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do not list obvious backends as text backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * tests fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop redundant healthcheck Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * enhancements, refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-30 00:47:27 +02:00
Attila Györffy	5a67b5d73c	Fix image upload processing and img2img pipeline in diffusers backend (#8879 ) * fix: add missing bufio.Flush in processImageFile The processImageFile function writes decoded image data (from base64 or URL download) through a bufio.NewWriter but never calls Flush() before closing the underlying file. Since bufio's default buffer is 4096 bytes, small images produce 0-byte files and large images are truncated — causing PIL to fail with "cannot identify image file". This breaks all image input paths: file, files, and ref_images parameters in /v1/images/generations, making img2img, inpainting, and reference image features non-functional. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * fix: merge options into kwargs in diffusers GenerateImage The GenerateImage method builds a local `options` dict containing the source image (PIL), negative_prompt, and num_inference_steps, but never merges it into `kwargs` before calling self.pipe(*kwargs). This causes img2img to fail with "Input is in incorrect format" because the pipeline never receives the image parameter. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> test: add unit test for processImageFile base64 decoding Verifies that a base64-encoded PNG survives the write path (encode → decode → bufio.Write → Flush → file on disk) with byte-for-byte fidelity. The test image is small enough to fit entirely in bufio's 4096-byte buffer, which is the exact scenario where the missing Flush() produced a 0-byte file. Also tests that invalid base64 input is handled gracefully. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * test: verify GenerateImage merges options into pipeline kwargs Mocks the diffusers pipeline and calls GenerateImage with a source image and negative prompt. Asserts that the pipeline receives the image, negative_prompt, and num_inference_steps via kwargs — the exact parameters that were silently dropped before the fix. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * fix: move kwargs.update(options) earlier in GenerateImage Move the options merge right after self.options merge (L742) so that image, negative_prompt, and num_inference_steps are available to all downstream code paths including img2vid and txt2vid. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> * test: convert processImageFile tests to ginkgo Replace standard testing with ginkgo/gomega to be consistent with the rest of the test suites in the project. Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> --------- Signed-off-by: Attila Györffy <attila+git@attilagyorffy.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-11 08:05:50 +01:00
Kolega.dev	780877d1d0	security: validate URLs to prevent SSRF in content fetching endpoints (#8476 ) User-supplied URLs passed to GetContentURIAsBase64() and downloadFile() were fetched without validation, allowing SSRF attacks against internal services. Added URL validation that blocks private IPs, loopback, link-local, and cloud metadata endpoints before fetching. Co-authored-by: kolega.dev <faizan@kolega.ai>	2026-02-10 15:14:14 +01:00
Ettore Di Giacinto	797f27f09f	feat(UI): image generation improvements (#7804 ) * chore: drop mode from image generation(unused) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(UI): improve image generation front-end Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(UI): only ref images. files is to be deprecated Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do not override default steps Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-31 21:59:46 +01:00
lif	0d0ef0121c	fix: Usage for image generation is incorrect (and causes error in LiteLLM) (#7786 ) * fix: Add usage fields to image generation response for OpenAI API compatibility Fixes #7354 Added input_tokens, output_tokens, and input_tokens_details fields to the image generation API response to comply with OpenAI's image generation API specification. This resolves validation errors in LiteLLM and the OpenAI SDK. Changes: - Added InputTokensDetails struct with text_tokens and image_tokens fields - Extended OpenAIUsage struct with input_tokens, output_tokens, and input_tokens_details - Updated ImageEndpoint to populate usage object with required fields - Updated InpaintingEndpoint to populate usage object with required fields - All fields initialized to 0 as per current behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: majiayu000 <1835304752@qq.com> * fix: Correct usage field types for image generation API compatibility Changed InputTokens and OutputTokens from pointer types (*int) to regular int types to match OpenAI API specification. This fixes validation errors with LiteLLM and OpenAI SDK when parsing image generation responses. Fixes #7354 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: majiayu000 <1835304752@qq.com> --------- Signed-off-by: majiayu000 <1835304752@qq.com> Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2025-12-30 09:53:05 +01:00
Ettore Di Giacinto	c37785b78c	chore(refactor): move logging to common package based on slog (#7668 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-12-21 19:33:13 +01:00
Ettore Di Giacinto	1cdcaf0152	feat: migrate to echo and enable cancellation of non-streaming requests (#7270 ) * WIP: migrate to echo Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-11-14 22:57:53 +01:00
Ettore Di Giacinto	089efe05fd	feat(backends): add system backend, refactor (#6059 ) - Add a system backend path - Refactor and consolidate system information in system state - Use system state in all the components to figure out the system paths to used whenever needed - Refactor BackendConfig -> ModelConfig. This was otherway misleading as now we do have a backend configuration which is not the model config. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-08-14 19:38:26 +02:00
Ettore Di Giacinto	3d22bfc27c	feat(stablediffusion-ggml): add support to ref images (flux Kontext) (#5935 ) * feat(stablediffusion-ggml): add support to ref images Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add it to the model gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-07-30 22:42:34 +02:00
Ettore Di Giacinto	2c9279a542	feat(video-gen): add endpoint for video generation (#5247 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-04-26 18:05:01 +02:00
Dave	3cddf24747	feat: Centralized Request Processing middleware (#3847 ) * squash past, centralize request middleware PR Signed-off-by: Dave Lee <dave@gray101.com> * migrate bruno request files to examples repo Signed-off-by: Dave Lee <dave@gray101.com> * fix Signed-off-by: Dave Lee <dave@gray101.com> * Update tests/e2e-aio/e2e_test.go Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> --------- Signed-off-by: Dave Lee <dave@gray101.com> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2025-02-10 12:06:16 +01:00
Ettore Di Giacinto	e15d29aba2	chore(stablediffusion-ncn): drop in favor of ggml implementation (#4652 ) * chore(stablediffusion-ncn): drop in favor of ggml implementation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(ci): drop stablediffusion build Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): add Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): try to fixup current tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Try to fix tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Tests improvements Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): use quality to specify step Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): switch to sd-1.5 also increase prep time for downloading models Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-01-22 19:34:16 +01:00
Gianluca Boiano	032a33de49	chore: remove deprecated tinydream backend (#4631 ) Signed-off-by: Gianluca Boiano <morf3089@gmail.com>	2025-01-18 18:35:30 +01:00
Ettore Di Giacinto	b425a870b0	fix(diffusers): correctly parse height and width request without parametrization (#4082 ) * fix(diffusers): allow to specify width and height without enable-parameters Let's simplify usage by not gating width and height by parameters Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: use sane defaults Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-11-06 08:53:02 +01:00
Ettore Di Giacinto	59ef426fbf	feat(model-list): be consistent, skip known files from listing (#2760 ) fix(model-list): be consistent, skip known files from listing This changeset does two things: - Removes the dependency of listing models from the OpenAI schema. - Tries to reduce confusion between ListModels() in model loader and in the service - now there is only one ListModels which is in services and does not depend anymore on the OpenAI schema - The OpenAI-schema functions were moved nearby the OpenAI specific endpoints that needs the schema - Drops the ListModel Service structure as there was no real need for it. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-07-10 15:28:39 +02:00
Sertaç Özercan	5866fc8ded	chore: fix go.mod module (#2635 ) Signed-off-by: Sertac Ozercan <sozercan@gmail.com>	2024-06-23 08:24:36 +00:00
Prajwal S Nayak	4d98dd9ce7	feat(image): support `response_type` in the OpenAI API request (#2347 ) * Change response_format type to string to match OpenAI Spec Signed-off-by: prajwal <prajwalnayak7@gmail.com> * updated response_type type to interface Signed-off-by: prajwal <prajwalnayak7@gmail.com> * feat: correctly parse generic struct Signed-off-by: mudler <mudler@localai.io> * add tests Signed-off-by: mudler <mudler@localai.io> --------- Signed-off-by: prajwal <prajwalnayak7@gmail.com> Signed-off-by: mudler <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com> Co-authored-by: mudler <mudler@localai.io>	2024-05-29 14:40:54 +02:00
Ettore Di Giacinto	af9e5a2d05	Revert #1963 (#2056 ) * Revert "fix(fncall): fix regression introduced in #1963 (#2048)" This reverts commit `6b06d4e0af`. * Revert "fix: action-tmate back to upstream, dead code removal (#2038)" This reverts commit `fdec8a9d00`. * Revert "feat(grpc): return consumed token count and update response accordingly (#2035)" This reverts commit `e843d7df0e`. * Revert "refactor: backend/service split, channel-based llm flow (#1963)" This reverts commit `eed5706994`. * feat(grpc): return consumed token count and update response accordingly Fixes: #1920 Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-04-17 23:33:49 +02:00
Dave	eed5706994	refactor: backend/service split, channel-based llm flow (#1963 ) Refactor: channel based llm flow and services split --------- Signed-off-by: Dave Lee <dave@gray101.com>	2024-04-13 09:45:34 +02:00
Ettore Di Giacinto	123a5a2e16	feat(swagger): Add swagger API doc (#1926 ) * makefile(build): add minimal and api build target * feat(swagger): Add swagger	2024-03-29 22:29:33 +01:00
Ettore Di Giacinto	f895d06605	fix(config): set better defaults for inferencing (#1822 ) * fix(defaults): set better defaults for inferencing This changeset aim to have better defaults and to properly detect when no inference settings are provided with the model. If not specified, we defaults to mirostat sampling, and offload all the GPU layers (if a GPU is detected). Related to https://github.com/mudler/LocalAI/issues/1373 and https://github.com/mudler/LocalAI/issues/1723 * Adapt tests * Also pre-initialize default seed	2024-03-13 10:05:30 +01:00
Dave	1c312685aa	refactor: move remaining api packages to core (#1731 ) * core 1 * api/openai/files fix * core 2 - core/config * move over core api.go and tests to the start of core/http * move over localai specific endpoints to core/http, begin the service/endpoint split there * refactor big chunk on the plane * refactor chunk 2 on plane, next step: port and modify changes to request.go * easy fixes for request.go, major changes not done yet * lintfix * json tag lintfix? * gitignore and .keep files * strange fix attempt: rename the config dir?	2024-03-01 16:19:53 +01:00
Ettore Di Giacinto	db926896bd	Revert "[Refactor]: Core/API Split" (#1550 ) Revert "[Refactor]: Core/API Split (#1506)" This reverts commit `ab7b4d5ee9`.	2024-01-05 18:04:46 +01:00
Dave	ab7b4d5ee9	[Refactor]: Core/API Split (#1506 ) Refactors api folder to core, creates firm split between backend code and api frontend.	2024-01-05 15:34:56 +01:00

26 Commits