Commit Graph

23 Commits

Author SHA1 Message Date
Richard Palethorpe
f9a850c02a feat(realtime): WebRTC support (#8790)
* feat(realtime): WebRTC support

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(tracing): Show full LLM opts and deltas

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-13 21:37:15 +01:00
Richard Palethorpe
b1b67b973e fix(realtime): Add functions to conversation history (#8616)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-21 19:03:49 +01:00
Richard Palethorpe
4fe830ff58 fix(realtime): Limit buffer sizes to prevent DoS (#8596)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-18 14:36:43 +01:00
Richard Palethorpe
86b3bc9313 fix(realtime): Better support for thinking models and setting model parameters (#8595)
* fix(realtime): Wrap functions in OpenAI chat completions format

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(realtime): Set max tokens from session object

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Find thinking start tag for thinking extraction

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Don't send buffer cleared message when we automatically drop it

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-18 14:36:16 +01:00
Richard Palethorpe
5bdbb10593 fix(realtime): Send proper image data to backend (#8547)
* fix(realtime): Allow empty parameters

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Just pass base64 string to backend

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-13 18:01:07 +01:00
Richard Palethorpe
f6c80a6987 feat(realtime): Allow sending text, image and audio conversation items" (#8524)
feat(realtime): Allow sending text and image conversation items

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-12 19:33:46 +00:00
Richard Palethorpe
1479bee894 fix(realtime): Sampling and websocket locking (#8521)
* fix(realtime): Use locked websocket for concurrent access

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Use sample rate set in session

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(config): Allow pipelines to have no model parameters

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-12 13:57:34 +01:00
Richard Palethorpe
7270a98ce5 fix(realtime): Use user provided voice and allow pipeline models to have no backend (#8415)
* fix(realtime): Use the voice provided by the user or none at all

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(ui,config): Allow pipeline models to have no backend and use same validation in frontend

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-11 14:18:05 +01:00
Richard Palethorpe
dd8e74a486 feat(realtime): Add audio conversations (#6245)
* feat(realtime): Add audio conversations

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(realtime): Vendor the updated API and modify for server side

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(realtime): Update to the GA realtime API

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore: Document realtime API and add docs to AGENTS.md

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat: Filter reasoning from spoken output

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Send delta and done events for tool calls and audio transcripts

Ensure that content is sent in both deltas and done events for function call arguments and audio transcripts. This fixes compatibility with clients that rely on delta events for parsing.

💘 Generated with Crush

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Improve tool call handling and error reporting

- Refactor Model interface to accept []types.ToolUnion and *types.ToolChoiceUnion
  instead of JSON strings, eliminating unnecessary marshal/unmarshal cycles
- Fix Parameters field handling: support both map[string]any and JSON string formats
- Add PredictConfig() method to Model interface for accessing model configuration
- Add comprehensive debug logging for tool call parsing and function config
- Add missing return statement after prediction error (critical bug fix)
- Add warning logs for NoAction function argument parsing failures
- Improve error visibility throughout generateResponse function

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land>
Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-01-29 08:44:53 +01:00
Ettore Di Giacinto
c37785b78c chore(refactor): move logging to common package based on slog (#7668)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-21 19:33:13 +01:00
Richard Palethorpe
716dba94b4 feat(whisper): Add prompt to condition transcription output (#7624)
* chore(makefile): Add buildargs for sd and cuda when building backend

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(whisper): Add prompt to condition transcription output

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-12-18 14:40:45 +01:00
Ettore Di Giacinto
d7f9f3ac93 feat: add support to logitbias and logprobs (#7283)
* feat: add support to logprobs in results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add support to logitbias

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-16 13:27:36 +01:00
Ettore Di Giacinto
1cdcaf0152 feat: migrate to echo and enable cancellation of non-streaming requests (#7270)
* WIP: migrate to echo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-14 22:57:53 +01:00
Copilot
34bc1bda1e fix(api): SSE streaming format to comply with specification (#7182)
* Initial plan

* Fix SSE streaming format to comply with specification

- Replace json.Encoder with json.Marshal for explicit formatting
- Use explicit \n\n for all SSE messages (instead of relying on implicit newlines)
- Change %v to %s format specifier for proper string formatting
- Fix error message streaming to include proper SSE format
- Ensure consistency between chat.go and completion.go endpoints

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Add proper error handling for JSON marshal failures in streaming

- Handle json.Marshal errors explicitly in error response paths
- Add fallback simple error message if marshal fails
- Prevents sending 'data: <nil>' on marshal failures
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Fix SSE streaming format to comply with specification

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Fix finish_reason field to use pointer for proper null handling

- Change FinishReason from string to *string in Choice schema
- Streaming chunks now omit finish_reason (null) instead of empty string
- Final chunks properly set finish_reason to "stop", "tool_calls", etc.
- Remove empty content from initial streaming chunks (only send role)
- Final streaming chunk sends empty delta with finish_reason
- Addresses OpenAI API compliance issues causing client failures

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Improve code consistency for string pointer creation

- Use consistent pattern: declare variable then take address
- Remove inline anonymous function for better readability
- Addresses code review feedback

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Move common finish reasons to constants

- Create constants.go with FinishReasonStop, FinishReasonToolCalls, FinishReasonFunctionCall
- Replace all string literals with constants in chat.go, completion.go, realtime.go
- Improves code maintainability and prevents typos

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

* Make it build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix finish_reason to always be present with null or string value

- Remove omitempty from FinishReason field in Choice struct
- Explicitly set FinishReason to nil for all streaming chunks
- Ensures finish_reason appears as null in JSON for streaming chunks
- Final chunks still properly set finish_reason to "stop", "tool_calls", etc.
- Complies with OpenAI API specification example

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-09 22:00:27 +01:00
Richard Palethorpe
0529c7d0a0 fix(realtime): Add transcription session created event, match OpenAI behavior (#6445)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-10-13 21:48:13 +02:00
Richard Palethorpe
e6ebfd3ba1 feat(whisper-cpp): Convert to Purego and add VAD (#6087)
* fix(ci): Avoid matching wrong backend with the same prefix

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(whisper): Use Purego and enable VAD

This replaces the Whisper CGO bindings with our own Purego based module
to make compilation easier.

In addition this allows VAD models to be loaded by Whisper. There is not
much benefit now except that the same backend can be used for VAD and
transcription. Depending on upstream we may also be able to use GPU for
VAD in the future, but presently it is disabled.

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-08-28 17:25:18 +02:00
Ettore Di Giacinto
9c7f92c81f feat(p2p): automatically sync installed models between instances (#6108)
* feat(p2p): sync models between federated nodes

This change makes sure that between federated nodes all the models are
synced with each other.

Note: this works exclusively with models belonging to a gallery. It does
not sync files between the nodes, but rather it synces the node setup.
E.g. All the nodes needs to have configured the same galleries and
install models without any local editing.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make nodes stable

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups on syncing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ui: improve p2p view

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-19 19:37:46 +02:00
Ettore Di Giacinto
089efe05fd feat(backends): add system backend, refactor (#6059)
- Add a system backend path
- Refactor and consolidate system information in system state
- Use system state in all the components to figure out the system paths
  to used whenever needed
- Refactor BackendConfig -> ModelConfig. This was otherway misleading as
  now we do have a backend configuration which is not the model config.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-14 19:38:26 +02:00
Dave
b3c2a3c257 fix: untangle pkg and core (#5896)
* migrate core/system to pkg/system - it has no dependencies FROM core, and IS USED in pkg

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/templates up to core/templates -- nothing in pkg references it, but it does reference core.

Signed-off-by: Dave Lee <dave@gray101.com>

* remove extra check, len of nil is 0

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/startup to core/startup -- it does have important and unfixable dependencies on core

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2025-07-24 15:03:41 +02:00
Richard Palethorpe
754bedc3ea fix(realtime): Reset speech started flag on commit (#5879)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-22 16:41:12 +02:00
Richard Palethorpe
932f6b01a6 feat(realtime): Add speech started and stopped events (#5856)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-18 09:22:23 +02:00
Richard Palethorpe
d650647db9 fix(realtime): Use updated model on session update (#5604)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-06-09 00:11:05 +02:00
Richard Palethorpe
bf6426aef2 feat: Realtime API support reboot (#5392)
* feat(realtime): Initial Realtime API implementation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: go mod tidy

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat: Implement transcription only mode for realtime API

Reduce the scope of the real time API for the initial realease and make
transcription only mode functional.

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): Build backends on a separate layer to speed up core only changes

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-25 22:25:05 +02:00