5976 Commits

Author SHA1 Message Date
Ettore Di Giacinto
85be4ff03c feat(api): add ollama compatibility (#9284)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-09 14:15:14 +02:00
Ettore Di Giacinto
b0d9ce4905 Remove header from OpenAI Realtime API documentation
Removed the header from the Realtime API documentation.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-04-09 09:00:28 +02:00
LocalAI [bot]
7081b54c09 chore: ⬆️ Update leejet/stable-diffusion.cpp to e8323cabb0e4511ba18a50b1cb34cf1f87fc71ef (#9281)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-09 08:12:23 +02:00
Ettore Di Giacinto
2b05420f95 chore(llama.cpp): bump to 'd12cc3d1ca6bba741cd77887ac9c9ee18c8415c7' (#9282)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-09 08:12:05 +02:00
Ettore Di Giacinto
b64347b6aa chore: add gemma4 to the gallery
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 23:44:16 +00:00
Ettore Di Giacinto
e00ce981f0 fix: try to add whisperx and faster-whisper for more variants (#9278)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 21:23:38 +02:00
Ettore Di Giacinto
285f7d4340 chore: add embeddingemma
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 17:40:55 +00:00
Richard Palethorpe
ea6e850809 feat: Add Kokoros backend (#9212)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-08 19:23:16 +02:00
Ettore Di Giacinto
b7247fc148 fix(whisperx): add alias
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 14:40:08 +00:00
Ettore Di Giacinto
39c6b3ed66 feat: track files being staged (#9275)
This changeset makes it visible when files are being staged, so users are
aware that the model "isn't ready yet" for requests.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 14:33:58 +02:00
Ettore Di Giacinto
0e9d1a6588 chore(ci): drop unnecessary test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 12:19:54 +00:00
Ettore Di Giacinto
510d6759fe fix(nodes): better detection when a node goes down or a model is not available (#9274)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 12:11:02 +02:00
Ettore Di Giacinto
154fa000d3 fix(autoscaling): extract model loading from Route() and reuse it when autoscaling (#9270)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-08 08:27:51 +02:00
LocalAI [bot]
0526e60f8d chore: ⬆️ Update ggml-org/llama.cpp to 66c4f9ded01b29d9120255be1ed8d5835bcbb51d (#9269)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-08 08:27:38 +02:00
LocalAI [bot]
db600fb5b2 docs: ⬆️ update docs version mudler/LocalAI (#9268)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-08 08:27:27 +02:00
Richard Palethorpe
9ac1bdc587 feat(ui): Interactive model config editor with autocomplete (#9149)
* feat(ui): Add dynamic model editor with autocomplete

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(docs): Add link to long-format installation video

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-07 14:42:23 +02:00
dependabot[bot]
fdc9f7bf35 chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus from 0.64.0 to 0.65.0 (#9254)
chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus

Bumps [go.opentelemetry.io/otel/exporters/prometheus](https://github.com/open-telemetry/opentelemetry-go) from 0.64.0 to 0.65.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/exporters/prometheus/v0.64.0...exporters/prometheus/v0.65.0)

---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/prometheus
  dependency-version: 0.65.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
v4.1.3
2026-04-07 00:39:52 +02:00
LocalAI [bot]
8e59346091 chore: ⬆️ Update leejet/stable-diffusion.cpp to 8afbeb6ba9702c15d41a38296f2ab1fe5c829fa0 (#9262)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:39:38 +02:00
LocalAI [bot]
e6e4e19633 chore: ⬆️ Update ace-step/acestep.cpp to e0c8d75a672fca5684c88c68dbf6d12f58754258 (#9261)
⬆️ Update ace-step/acestep.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:39:24 +02:00
Ettore Di Giacinto
505c417fa7 fix(gpu): better detection for macOS and Thor (#9263)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-07 00:39:07 +02:00
LocalAI [bot]
17215f6fbc docs: ⬆️ update docs version mudler/LocalAI (#9260)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:38:50 +02:00
LocalAI [bot]
bccaba1f66 chore: ⬆️ Update ggml-org/llama.cpp to d0a6dfeb28a09831d904fc4d910ddb740da82834 (#9259)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-07 00:38:36 +02:00
Ettore Di Giacinto
0f9d516a6c fix(anthropic): do not emit empty tokens and fix SSE tool calls (#9258)
This fixes Claude Code compatibility

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-07 00:38:21 +02:00
dependabot[bot]
33b124c6f1 chore(deps): bump github.com/aws/aws-sdk-go-v2/config from 1.32.12 to 1.32.14 (#9256)
chore(deps): bump github.com/aws/aws-sdk-go-v2/config

Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.32.12 to 1.32.14.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases)
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.32.12...config/v1.32.14)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
  dependency-version: 1.32.14
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:46:52 +02:00
dependabot[bot]
6b8007e88e chore(deps): bump github.com/jaypipes/ghw from 0.23.0 to 0.24.0 (#9250)
Bumps [github.com/jaypipes/ghw](https://github.com/jaypipes/ghw) from 0.23.0 to 0.24.0.
- [Release notes](https://github.com/jaypipes/ghw/releases)
- [Commits](https://github.com/jaypipes/ghw/compare/v0.23.0...v0.24.0)

---
updated-dependencies:
- dependency-name: github.com/jaypipes/ghw
  dependency-version: 0.24.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:46:18 +02:00
dependabot[bot]
b3837c2078 chore(deps): bump google.golang.org/grpc from 1.79.3 to 1.80.0 (#9253)
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go) from 1.79.3 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc-go/releases)
- [Commits](https://github.com/grpc/grpc-go/compare/v1.79.3...v1.80.0)

---
updated-dependencies:
- dependency-name: google.golang.org/grpc
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:45:50 +02:00
Ettore Di Giacinto
92f99b1ec3 fix(token): login via legacy api keys (#9249)
We were not checking against the api keys when db == nil.

This commit also cleans up the now-unused middleware.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-06 21:45:09 +02:00
LocalAI [bot]
ad232fdb1a docs: ⬆️ update docs version mudler/LocalAI (#9241)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
v4.1.2
2026-04-06 10:53:07 +02:00
LocalAI [bot]
11637b5a1b chore: ⬆️ Update leejet/stable-diffusion.cpp to 7397ddaa86f4e8837d5261724678cde0f36d4d89 (#9242)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-06 10:52:51 +02:00
LocalAI [bot]
0dda4fe6f0 chore: ⬆️ Update ggml-org/llama.cpp to 761797ffdf2ce3f118e82c663b1ad7d935fbd656 (#9243)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-06 10:52:38 +02:00
Ettore Di Giacinto
773489eeb1 fix(chat): do not retry if we had chatdeltas or tooldeltas from backend (#9244)
* fix(chat): do not retry if we had chatdeltas or tooldeltas from backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: use oai compat for llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: apply to non-streaming path too

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* map also other fields

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-06 10:52:23 +02:00
Ettore Di Giacinto
06fbe48b3f feat(llama.cpp): wire speculative decoding settings (#9238)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-05 14:56:30 +02:00
Ettore Di Giacinto
232e324a68 fix(autoparser): correctly pass logprobs (#9239)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-05 09:39:22 +02:00
ER-EPR
39c954764c Update index.yaml and add Qwen3.5 model files (#9237)
* Update index.yaml

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

* Add mmproj files for Qwen3.5 models

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

* Update file paths for Qwen models in index.yaml

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

* Update index.yaml

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

* Refactor Qwen3-Reranker-0.6B entry in index.yaml

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

* Update qwen3.yaml configuration parameters

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>

---------

Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>
2026-04-05 09:21:21 +02:00
Ettore Di Giacinto
9b7d5513fc chore(gallery): add mmproj file for gemma4
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
v4.1.1
2026-04-05 02:02:52 +02:00
LocalAI [bot]
84cd8c0e7f chore: ⬆️ Update ggml-org/llama.cpp to b8635075ffe27b135c49afb9a8b5c434bd42c502 (#9231)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-04 23:02:58 +02:00
LocalAI [bot]
d990f2790c chore(model-gallery): ⬆️ update checksum (#9233)
⬆️ Checksum updates in gallery/index.yaml

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-04 23:02:41 +02:00
Ettore Di Giacinto
53deeb1107 fix(reasoning): suppress partial tag tokens during autoparser warm-up
The C++ PEG parser needs a few tokens to identify the reasoning format
(e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC
layer was sending raw partial tag tokens to Go, which leaked into the
reasoning field.

- Clear reply.message in gRPC when autoparser is active but has no diffs
  yet, matching llama.cpp server behavior of only emitting classified output
- Prefer C++ autoparser chat deltas for reasoning/content in all streaming
  paths, falling back to Go-side extraction for backends without autoparser
  (e.g. vLLM)
- Override non-streaming no-tools result with chat delta content when available
- Guard PrependThinkingTokenIfNeeded against partial tag prefixes during
  streaming accumulation
- Reorder default thinking tokens so <|channel>thought is checked before
  <|think|> (Gemma 4 templates contain both)
2026-04-04 20:45:57 +00:00
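
The partial-tag guard described in the commit above can be sketched roughly as follows. This is a minimal illustration only, not LocalAI's actual code: the function name `isPartialTagPrefix`, the `thinkingTags` list, and the way the accumulated text is checked are all assumptions made for the example. The idea is that while the streamed text so far could still grow into a known reasoning tag, it is held back rather than forwarded.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingTags is a hypothetical list of reasoning start tags, ordered so that
// "<|channel>thought" is checked before "<|think|>" as the commit describes.
// The real list and its contents are assumptions here.
var thinkingTags = []string{"<|channel>thought\n", "<|think|>"}

// isPartialTagPrefix reports whether buf is a non-empty, strictly shorter
// prefix of any tag, i.e. the stream may still be in the middle of a tag.
func isPartialTagPrefix(buf string, tags []string) bool {
	if buf == "" {
		return false
	}
	for _, tag := range tags {
		if len(buf) < len(tag) && strings.HasPrefix(tag, buf) {
			return true
		}
	}
	return false
}

func main() {
	// Simulate text accumulated during streaming warm-up.
	for _, acc := range []string{"<|chan", "<|channel>thought\n", "Hello"} {
		if isPartialTagPrefix(acc, thinkingTags) {
			fmt.Printf("%q: hold back, may still become a tag\n", acc)
		} else {
			fmt.Printf("%q: safe to classify and emit\n", acc)
		}
	}
}
```

A complete tag is deliberately not treated as partial, so once the full tag has arrived the caller can classify it normally.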
Ettore Di Giacinto
c5a840f6af fix(reasoning): warm-up
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 20:25:24 +00:00
Ettore Di Giacinto
6d9d77d590 fix(reasoning): accumulate and strip reasoning tags from autoparser results (#9227)
fix(reasoning): accumulate and strip reasoning tags from autoparser results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 18:15:32 +02:00
Ettore Di Giacinto
6f304d1201 chore(refactor): use interface (#9226)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 17:29:37 +02:00
Richard Palethorpe
557d0f0f04 feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-04-04 15:14:35 +02:00
Ettore Di Giacinto
b7e3589875 fix(anthropic): show null index when not present, default to 0 (#9225)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 15:13:17 +02:00
Ettore Di Giacinto
716ddd697b feat(autoparser): prefer chat deltas from backends when emitted (#9224)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 12:12:08 +02:00
Ettore Di Giacinto
223deb908d fix(nats): improve error handling (#9222)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 12:11:54 +02:00
Ettore Di Giacinto
9f8821bba8 feat(gemma4): add thinking support (#9221)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 12:11:38 +02:00
Ettore Di Giacinto
84e51b68ef fix(ui): pass staticApiKeyRequired to show login when only api key is configured (#9220)
This fixes #9213

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-04-04 12:11:22 +02:00
LocalAI [bot]
7962dd16f7 chore: ⬆️ Update ggml-org/llama.cpp to d006858316d4650bb4da0c6923294ccd741caefd (#9215)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-04 09:44:39 +02:00
LocalAI [bot]
a1466b305a docs: ⬆️ update docs version mudler/LocalAI (#9214)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-04 09:44:25 +02:00
github-actions[bot]
57c0026715 chore: bump inference defaults from unsloth (#9219)
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-04-04 09:44:12 +02:00