dependabot[bot]
fdc9f7bf35
chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus from 0.64.0 to 0.65.0 ( #9254 )
...
chore(deps): bump go.opentelemetry.io/otel/exporters/prometheus
Bumps [go.opentelemetry.io/otel/exporters/prometheus](https://github.com/open-telemetry/opentelemetry-go ) from 0.64.0 to 0.65.0.
- [Release notes](https://github.com/open-telemetry/opentelemetry-go/releases )
- [Changelog](https://github.com/open-telemetry/opentelemetry-go/blob/main/CHANGELOG.md )
- [Commits](https://github.com/open-telemetry/opentelemetry-go/compare/exporters/prometheus/v0.64.0...exporters/prometheus/v0.65.0 )
---
updated-dependencies:
- dependency-name: go.opentelemetry.io/otel/exporters/prometheus
dependency-version: 0.65.0
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
v4.1.3
2026-04-07 00:39:52 +02:00
LocalAI [bot]
8e59346091
chore: ⬆️ Update leejet/stable-diffusion.cpp to 8afbeb6ba9702c15d41a38296f2ab1fe5c829fa0 ( #9262 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-07 00:39:38 +02:00
LocalAI [bot]
e6e4e19633
chore: ⬆️ Update ace-step/acestep.cpp to e0c8d75a672fca5684c88c68dbf6d12f58754258 ( #9261 )
...
⬆️ Update ace-step/acestep.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-07 00:39:24 +02:00
Ettore Di Giacinto
505c417fa7
fix(gpu): better detection for macOS and Thor ( #9263 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-07 00:39:07 +02:00
LocalAI [bot]
17215f6fbc
docs: ⬆️ update docs version mudler/LocalAI ( #9260 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-07 00:38:50 +02:00
LocalAI [bot]
bccaba1f66
chore: ⬆️ Update ggml-org/llama.cpp to d0a6dfeb28a09831d904fc4d910ddb740da82834 ( #9259 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-07 00:38:36 +02:00
Ettore Di Giacinto
0f9d516a6c
fix(anthropic): do not emit empty tokens and fix SSE tool calls ( #9258 )
...
This fixes Claude Code compatibility
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-07 00:38:21 +02:00
dependabot[bot]
33b124c6f1
chore(deps): bump github.com/aws/aws-sdk-go-v2/config from 1.32.12 to 1.32.14 ( #9256 )
...
chore(deps): bump github.com/aws/aws-sdk-go-v2/config
Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2 ) from 1.32.12 to 1.32.14.
- [Release notes](https://github.com/aws/aws-sdk-go-v2/releases )
- [Commits](https://github.com/aws/aws-sdk-go-v2/compare/config/v1.32.12...config/v1.32.14 )
---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go-v2/config
dependency-version: 1.32.14
dependency-type: direct:production
update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:46:52 +02:00
dependabot[bot]
6b8007e88e
chore(deps): bump github.com/jaypipes/ghw from 0.23.0 to 0.24.0 ( #9250 )
...
Bumps [github.com/jaypipes/ghw](https://github.com/jaypipes/ghw ) from 0.23.0 to 0.24.0.
- [Release notes](https://github.com/jaypipes/ghw/releases )
- [Commits](https://github.com/jaypipes/ghw/compare/v0.23.0...v0.24.0 )
---
updated-dependencies:
- dependency-name: github.com/jaypipes/ghw
dependency-version: 0.24.0
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:46:18 +02:00
dependabot[bot]
b3837c2078
chore(deps): bump google.golang.org/grpc from 1.79.3 to 1.80.0 ( #9253 )
...
Bumps [google.golang.org/grpc](https://github.com/grpc/grpc-go ) from 1.79.3 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc-go/releases )
- [Commits](https://github.com/grpc/grpc-go/compare/v1.79.3...v1.80.0 )
---
updated-dependencies:
- dependency-name: google.golang.org/grpc
dependency-version: 1.80.0
dependency-type: direct:production
update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-04-06 21:45:50 +02:00
Ettore Di Giacinto
92f99b1ec3
fix(token): login via legacy api keys ( #9249 )
...
We were not checking against the api keys when db == nil.
This commit also cleans up the now-unused middleware.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
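The check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the actual LocalAI code: `UserStore`, `ValidToken`, and `authorize` are hypothetical names, and the assumption is that static API keys are consulted first so that they still work when no database is configured.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// UserStore is a hypothetical stand-in for a database-backed token lookup.
type UserStore interface {
	ValidToken(token string) bool
}

// authorize accepts a token if it matches one of the statically configured
// (legacy) API keys. Only if no key matches does it consult the store, and
// a nil store (no database) then means the token is rejected.
func authorize(token string, apiKeys []string, store UserStore) bool {
	if token == "" {
		return false
	}
	for _, k := range apiKeys {
		// Constant-time comparison avoids leaking key contents via timing.
		if subtle.ConstantTimeCompare([]byte(token), []byte(k)) == 1 {
			return true
		}
	}
	if store == nil {
		return false
	}
	return store.ValidToken(token)
}

func main() {
	keys := []string{"secret"}
	fmt.Println(authorize("secret", keys, nil))
	fmt.Println(authorize("wrong", keys, nil))
}
```

Checking the static keys before touching the store is one way to make the `db == nil` case a non-special path; the real fix may be structured differently.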
2026-04-06 21:45:09 +02:00
LocalAI [bot]
ad232fdb1a
docs: ⬆️ update docs version mudler/LocalAI ( #9241 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v4.1.2
2026-04-06 10:53:07 +02:00
LocalAI [bot]
11637b5a1b
chore: ⬆️ Update leejet/stable-diffusion.cpp to 7397ddaa86f4e8837d5261724678cde0f36d4d89 ( #9242 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-06 10:52:51 +02:00
LocalAI [bot]
0dda4fe6f0
chore: ⬆️ Update ggml-org/llama.cpp to 761797ffdf2ce3f118e82c663b1ad7d935fbd656 ( #9243 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-06 10:52:38 +02:00
Ettore Di Giacinto
773489eeb1
fix(chat): do not retry if we had chatdeltas or tooldeltas from backend ( #9244 )
...
* fix(chat): do not retry if we had chatdeltas or tooldeltas from backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: use oai compat for llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: apply to non-streaming path too
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* map also other fields
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-06 10:52:23 +02:00
Ettore Di Giacinto
06fbe48b3f
feat(llama.cpp): wire speculative decoding settings ( #9238 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-05 14:56:30 +02:00
Ettore Di Giacinto
232e324a68
fix(autoparser): correctly pass logprobs ( #9239 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-05 09:39:22 +02:00
ER-EPR
39c954764c
Update index.yaml and add Qwen3.5 model files ( #9237 )
...
* Update index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Add mmproj files for Qwen3.5 models
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update file paths for Qwen models in index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Refactor Qwen3-Reranker-0.6B entry in index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update qwen3.yaml configuration parameters
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
---------
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
2026-04-05 09:21:21 +02:00
Ettore Di Giacinto
9b7d5513fc
chore(gallery): add mmproj file for gemma4
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
v4.1.1
2026-04-05 02:02:52 +02:00
LocalAI [bot]
84cd8c0e7f
chore: ⬆️ Update ggml-org/llama.cpp to b8635075ffe27b135c49afb9a8b5c434bd42c502 ( #9231 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:58 +02:00
LocalAI [bot]
d990f2790c
chore(model-gallery): ⬆️ update checksum ( #9233 )
...
⬆️ Checksum updates in gallery/index.yaml
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:41 +02:00
Ettore Di Giacinto
53deeb1107
fix(reasoning): suppress partial tag tokens during autoparser warm-up
...
The C++ PEG parser needs a few tokens to identify the reasoning format
(e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC
layer was sending raw partial tag tokens to Go, which leaked into the
reasoning field.
- Clear reply.message in gRPC when autoparser is active but has no diffs
  yet, matching llama.cpp server behavior of only emitting classified output
- Prefer C++ autoparser chat deltas for reasoning/content in all streaming
  paths, falling back to Go-side extraction for backends without autoparser
  (e.g. vLLM)
- Override non-streaming no-tools result with chat delta content when available
- Guard PrependThinkingTokenIfNeeded against partial tag prefixes during
  streaming accumulation
- Reorder default thinking tokens so <|channel>thought is checked before
  <|think|> (Gemma 4 templates contain both)
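The partial-prefix guard and the token ordering from the list above can be illustrated with a small sketch. The function names `isPartialTagPrefix` and `detectThinkingToken` and the `thinkingTags` list are assumptions made for illustration; they are not taken from the LocalAI source.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingTags is a hypothetical default list, ordered as the commit
// describes: "<|channel>thought" is checked before "<|think|>".
var thinkingTags = []string{"<|channel>thought", "<|think|>"}

// isPartialTagPrefix reports whether the accumulated text is a non-empty,
// strict prefix of any known tag, i.e. more tokens are needed before the
// text can be classified.
func isPartialTagPrefix(acc string, tags []string) bool {
	if acc == "" {
		return false
	}
	for _, t := range tags {
		if len(acc) < len(t) && strings.HasPrefix(t, acc) {
			return true
		}
	}
	return false
}

// detectThinkingToken returns the first tag, in list order, that appears in
// the chat template. Order matters when a template contains several tags.
func detectThinkingToken(template string, tags []string) (string, bool) {
	for _, t := range tags {
		if strings.Contains(template, t) {
			return t, true
		}
	}
	return "", false
}

func main() {
	for _, acc := range []string{"<|chan", "<|channel>thought", "plain"} {
		fmt.Printf("%q partial=%v\n", acc, isPartialTagPrefix(acc, thinkingTags))
	}
	tag, ok := detectThinkingToken("a <|think|> b <|channel>thought c", thinkingTags)
	fmt.Println(tag, ok)
}
```

While `isPartialTagPrefix` is true, a streaming consumer would hold the text back instead of emitting it as reasoning or content.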
2026-04-04 20:45:57 +00:00
Ettore Di Giacinto
c5a840f6af
fix(reasoning): warm-up
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 20:25:24 +00:00
Ettore Di Giacinto
6d9d77d590
fix(reasoning): accumulate and strip reasoning tags from autoparser results ( #9227 )
...
fix(reasoning): accumulate and strip reasoning tags from autoparser results
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 18:15:32 +02:00
Ettore Di Giacinto
6f304d1201
chore(refactor): use interface ( #9226 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 17:29:37 +02:00
Richard Palethorpe
557d0f0f04
feat(api): Allow coding agents to interactively discover how to control and configure LocalAI ( #9084 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2026-04-04 15:14:35 +02:00
Ettore Di Giacinto
b7e3589875
fix(anthropic): show null index when not present, default to 0 ( #9225 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 15:13:17 +02:00
Ettore Di Giacinto
716ddd697b
feat(autoparser): prefer chat deltas from backends when emitted ( #9224 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:12:08 +02:00
Ettore Di Giacinto
223deb908d
fix(nats): improve error handling ( #9222 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:54 +02:00
Ettore Di Giacinto
9f8821bba8
feat(gemma4): add thinking support ( #9221 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:38 +02:00
Ettore Di Giacinto
84e51b68ef
fix(ui): pass staticApiKeyRequired to show login when only an API key is configured ( #9220 )
...
This fixes #9213
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:22 +02:00
LocalAI [bot]
7962dd16f7
chore: ⬆️ Update ggml-org/llama.cpp to d006858316d4650bb4da0c6923294ccd741caefd ( #9215 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:39 +02:00
LocalAI [bot]
a1466b305a
docs: ⬆️ update docs version mudler/LocalAI ( #9214 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:25 +02:00
github-actions[bot]
57c0026715
chore: bump inference defaults from unsloth ( #9219 )
...
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:12 +02:00
Ettore Di Giacinto
1ed6b9e5ed
fix(llama.cpp): correctly parse grpc header for bearer token auth
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 21:38:41 +00:00
LocalAI [bot]
e4ee74354f
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9210 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 16:23:17 +02:00
Ettore Di Giacinto
8577bdcebc
Update asset links in README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:24:08 +02:00
Ettore Di Giacinto
0d489c7a0d
Add guided tour and update screenshots section
...
Updated README to include a guided tour section with links to various assets and details about agents and usage metrics.
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:23:03 +02:00
Ettore Di Giacinto
11dc54bda9
fix(docs): commit distribution.md
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 10:14:13 +02:00
Ettore Di Giacinto
7e0b73deaa
fix(docs): fix broken references to distributed mode
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 09:46:06 +02:00
LocalAI [bot]
c0a023d13d
chore: ⬆️ Update ggml-org/llama.cpp to a1cfb645307edc61a89e41557f290f441043d3c2 ( #9203 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 08:30:15 +02:00
Loryan Strant
0d3ae1c295
docs: Update Home Assistant integrations list ( #9206 )
...
Update Home Assistant integrations list
Signed-off-by: Loryan Strant <51473494+loryanstrant@users.noreply.github.com >
2026-04-03 08:30:00 +02:00
LocalAI [bot]
e9f10f2f50
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9202 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v4.1.0
2026-04-02 21:22:19 +02:00
Ettore Di Giacinto
b95b0b72ff
chore(ci): fix gallery agent
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-02 18:02:18 +00:00
LocalAI [bot]
26f1b94f4d
chore: ⬆️ Update ggml-org/llama.cpp to 95a6ebabb277c4cc18247e7bc2a5502133caca63 ( #9199 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:16 +02:00
LocalAI [bot]
2d40725ca2
chore: ⬆️ Update leejet/stable-diffusion.cpp to 87ecb95cbc65dc8e58e3d88f4f4a59a0939796f5 ( #9200 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:04 +02:00
Ettore Di Giacinto
6c635e8353
feat: add resume endpoint to undrain nodes ( #9197 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-01 18:21:43 +02:00
LocalAI [bot]
cc5f33ce95
chore: ⬆️ Update ggml-org/llama.cpp to 0fcb3760b2b9a3a496ef14621a7e4dad7a8df90f ( #9196 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:40 +02:00
LocalAI [bot]
ba7cdd532a
chore: ⬆️ Update leejet/stable-diffusion.cpp to 09b12d5f6d51d862749e8e0ee8baac8f012089e2 ( #9195 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:25 +02:00
Ettore Di Giacinto
6b6c136210
fix(inflight): count inflight from load model, but release afterwards ( #9194 )
...
This should fix the node list always showing an in-flight count of 1.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 23:24:45 +02:00