Commit Graph

27 Commits

Author SHA1 Message Date
Andres
b6459ddd57 feat(api): Add transcribe response format request parameter & adjust STT backends (#8318)
* WIP response format implementation for audio transcriptions

(cherry picked from commit e271dd764bbc13846accf3beb8b6522153aa276f)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Rework transcript response_format and add more formats

(cherry picked from commit 6a93a8f63e2ee5726bca2980b0c9cf4ef8b7aeb8)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Add test and replace go-openai package with official openai go client

(cherry picked from commit f25d1a04e46526429c89db4c739e1e65942ca893)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Fix faster-whisper backend and refactor transcription formatting to also work on CLI

Signed-off-by: Andres Smith <andressmithdev@pm.me>
(cherry picked from commit 69a93977d5e113eb7172bd85a0f918592d3d2168)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
Co-authored-by: nanoandrew4 <nanoandrew4@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-01 17:33:17 +01:00
Ettore Di Giacinto
797f27f09f feat(UI): image generation improvements (#7804)
* chore: drop mode from image generation(unused)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(UI): improve image generation front-end

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(UI): only ref images. files is to be deprecated

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not override default steps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-31 21:59:46 +01:00
lif
0d0ef0121c fix: Usage for image generation is incorrect (and causes error in LiteLLM) (#7786)
* fix: Add usage fields to image generation response for OpenAI API compatibility

Fixes #7354

Added input_tokens, output_tokens, and input_tokens_details fields to the
image generation API response to comply with OpenAI's image generation API
specification. This resolves validation errors in LiteLLM and the OpenAI SDK.

Changes:
- Added InputTokensDetails struct with text_tokens and image_tokens fields
- Extended OpenAIUsage struct with input_tokens, output_tokens, and input_tokens_details
- Updated ImageEndpoint to populate usage object with required fields
- Updated InpaintingEndpoint to populate usage object with required fields
- All fields initialized to 0 as per current behavior

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: majiayu000 <1835304752@qq.com>

* fix: Correct usage field types for image generation API compatibility

Changed InputTokens and OutputTokens from pointer types (*int) to
regular int types to match OpenAI API specification. This fixes
validation errors with LiteLLM and OpenAI SDK when parsing image
generation responses.

Fixes #7354

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: majiayu000 <1835304752@qq.com>

---------

Signed-off-by: majiayu000 <1835304752@qq.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-30 09:53:05 +01:00
Ettore Di Giacinto
d7f9f3ac93 feat: add support to logitbias and logprobs (#7283)
* feat: add support to logprobs in results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add support to logitbias

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-16 13:27:36 +01:00
Copilot
34bc1bda1e fix(api): SSE streaming format to comply with specification (#7182)
* Initial plan

* Fix SSE streaming format to comply with specification

- Replace json.Encoder with json.Marshal for explicit formatting
- Use explicit \n\n for all SSE messages (instead of relying on implicit newlines)
- Change %v to %s format specifier for proper string formatting
- Fix error message streaming to include proper SSE format
- Ensure consistency between chat.go and completion.go endpoints

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Add proper error handling for JSON marshal failures in streaming

- Handle json.Marshal errors explicitly in error response paths
- Add fallback simple error message if marshal fails
- Prevents sending 'data: <nil>' on marshal failures
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Fix SSE streaming format to comply with specification

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Fix finish_reason field to use pointer for proper null handling

- Change FinishReason from string to *string in Choice schema
- Streaming chunks now omit finish_reason (null) instead of empty string
- Final chunks properly set finish_reason to "stop", "tool_calls", etc.
- Remove empty content from initial streaming chunks (only send role)
- Final streaming chunk sends empty delta with finish_reason
- Addresses OpenAI API compliance issues causing client failures

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Improve code consistency for string pointer creation

- Use consistent pattern: declare variable then take address
- Remove inline anonymous function for better readability
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Move common finish reasons to constants

- Create constants.go with FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall
- Replace all string literals with constants in chat.go, completion.go, realtime.go
- Improves code maintainability and prevents typos

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Make it build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix finish_reason to always be present with null or string value

- Remove omitempty from FinishReason field in Choice struct
- Explicitly set FinishReason to nil for all streaming chunks
- Ensures finish_reason appears as null in JSON for streaming chunks
- Final chunks still properly set finish_reason to "stop", "tool_calls", etc.
- Complies with OpenAI API specification example

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-09 22:00:27 +01:00
Ettore Di Giacinto
02cc8cbcaa feat(llama.cpp): consolidate options and respect tokenizer template when enabled (#7120)
* feat(llama.cpp): expose env vars as options for consistency

This allows to configure everything in the YAML file of the model rather
than have global configurations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): respect usetokenizertemplate and use llama.cpp templating system to process messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Detect template exists if use tokenizer template is enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Better recognization of chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixes to support tool calls while using templates from tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop template guessing, fix passing tools to tokenizer

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Extract grammar and other options from chat template, add schema struct

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Automatically set use_jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Cleanups, identify by default gguf models for chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-07 21:23:50 +01:00
Ettore Di Giacinto
b9a25b16e6 feat: add reasoning effort and metadata to template (#5981)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-06 21:56:05 +02:00
Ettore Di Giacinto
3d22bfc27c feat(stablediffusion-ggml): add support to ref images (flux Kontext) (#5935)
* feat(stablediffusion-ggml): add support to ref images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add it to the model gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 22:42:34 +02:00
Ettore Di Giacinto
73ecb7f90b chore: drop assistants endpoint (#5926)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-27 21:06:09 +02:00
Max Goltzsche
eae4ca08da feat(openai): support input_audio chat api field (#5870)
Improving the chat completion endpoint OpenAI API compatibility by supporting messages of type `input_audio`, e.g.:
```
{
  ...
  "messages": [
    {
      "role": "user",
      "content": [{
        "type": "input_audio",
        "input_audio": {
          "data": "<base64-encoded audio data>",
          "format": "wav"
        }
      }]
    }
  ]
}
```

Closes #5869

Signed-off-by: Max Goltzsche <max.goltzsche@gmail.com>
2025-07-21 09:15:55 +02:00
Ettore Di Giacinto
61cc76c455 chore(autogptq): drop archived backend (#5214)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-19 15:52:29 +02:00
Ettore Di Giacinto
e15d29aba2 chore(stablediffusion-ncn): drop in favor of ggml implementation (#4652)
* chore(stablediffusion-ncn): drop in favor of ggml implementation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): drop stablediffusion build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): add

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): try to fixup current tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tests improvements

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): use quality to specify step

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): switch to sd-1.5

also increase prep time for downloading models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-22 19:34:16 +01:00
mintyleaf
96f8ec0402 feat: add machine tag and inference timings (#4577)
* Add machine tag option, add extraUsage option, grpc-server -> proto -> endpoint extraUsage data is broken for now

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

* remove redurant timing fields, fix not working timings output

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

* use middleware for Machine-Tag only if tag is specified

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

---------

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
2025-01-17 17:05:58 +01:00
Ettore Di Giacinto
191bc2e50a feat(api): allow to pass audios to backends (#3603)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 12:26:53 +02:00
Ettore Di Giacinto
fbb9facda4 feat(api): allow to pass videos to backends (#3601)
This prepares the API to receive videos as well for video understanding.

It works similarly to images, where the request should be in the form:

{
 "type": "video_url",
 "video_url": { "url": "url or base64 data" }
}

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 11:21:59 +02:00
Ettore Di Giacinto
e198347886 feat(openai): add json_schema format type and strict mode (#3193)
* feat(openai): add json_schema and strict mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* handle err vs _

security scanners prefer if we put these branches in, and I tend to agree.

Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-08-07 15:27:02 -04:00
Ettore Di Giacinto
bf9dd1de7f feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys (#2912)
* feat(functions): enhance parsing with broken JSON when we parse the raw results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* breaking: make function name by default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(grammar): dynamically generate grammars with mutating keys

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: simplify condition

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-18 17:52:22 +02:00
Ettore Di Giacinto
b8b0c7ad0b docs(swagger): core more localai/openai endpoints (#2904)
* docs(swagger): core more localai/openai endpoints

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix swagger descriptions for backend_monitor.go

Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-07-18 00:38:41 -04:00
Ettore Di Giacinto
f120a0c9f9 docs(swagger): enhance coverage of APIs (#2753)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-09 23:09:49 +02:00
Sertaç Özercan
5866fc8ded chore: fix go.mod module (#2635)
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-06-23 08:24:36 +00:00
Prajwal S Nayak
4d98dd9ce7 feat(image): support response_type in the OpenAI API request (#2347)
* Change response_format type to string to match OpenAI Spec

Signed-off-by: prajwal <prajwalnayak7@gmail.com>

* updated response_type type to interface

Signed-off-by: prajwal <prajwalnayak7@gmail.com>

* feat: correctly parse generic struct

Signed-off-by: mudler <mudler@localai.io>

* add tests

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: prajwal <prajwalnayak7@gmail.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
2024-05-29 14:40:54 +02:00
Ettore Di Giacinto
efa32a2677 feat(grammar): support models with specific construct (#2291)
When enabling grammar with functions, it might be useful to
allow more flexibility to support models that are fine-tuned against returning
function calls of the form of { "name": "function_name", "arguments" {...} }
rather then { "function": "function_name", "arguments": {..} }.

This might call out to a more generic approach later on, but for the moment being we can easily support both
as we have just to specific different types.

If needed we can expand on this later on

Signed-off-by: mudler <mudler@localai.io>
2024-05-12 01:13:22 +02:00
Ettore Di Giacinto
bbea62b907 feat(functions): support models with no grammar, add tests (#2068)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-04-18 22:43:12 +02:00
Chakib Benziane
801b481beb fixes #1051: handle openai presence and request penalty parameters (#1817)
* fix request debugging, disable marshalling of context fields

Signed-off-by: blob42 <contact@blob42.xyz>

* merge frequency_penalty request parm with config

Signed-off-by: blob42 <contact@blob42.xyz>

* openai: add presence_penalty parameter

Signed-off-by: blob42 <contact@blob42.xyz>

---------

Signed-off-by: blob42 <contact@blob42.xyz>
2024-03-17 09:43:20 +01:00
Dave
1c312685aa refactor: move remaining api packages to core (#1731)
* core 1

* api/openai/files fix

* core 2 - core/config

* move over core api.go and tests to the start of core/http

* move over localai specific endpoints to core/http, begin the service/endpoint split there

* refactor big chunk on the plane

* refactor chunk 2 on plane, next step: port and modify changes to request.go

* easy fixes for request.go, major changes not done yet

* lintfix

* json tag lintfix?

* gitignore and .keep files

* strange fix attempt: rename the config dir?
2024-03-01 16:19:53 +01:00
Ettore Di Giacinto
aa098e4d0b fix(sse): do not omit empty finish_reason (#1745)
Fixes https://github.com/mudler/LocalAI/issues/1744
2024-02-24 11:51:59 +01:00
Dave
255748bcba MQTT Startup Refactoring Part 1: core/ packages part 1 (#1728)
This PR specifically introduces a `core` folder and moves the following packages over, without any other changes:

- `api/backend`
- `api/config`
- `api/options`
- `api/schema`

Once this is merged and we confirm there's no regressions, I can migrate over the remaining changes piece by piece to split up application startup, backend services, http, and mqtt as was the goal of the earlier PRs!
2024-02-21 01:21:19 +00:00