LocalAI [bot]
ad232fdb1a
docs: ⬆️ update docs version mudler/LocalAI ( #9241 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v4.1.2
2026-04-06 10:53:07 +02:00
LocalAI [bot]
11637b5a1b
chore: ⬆️ Update leejet/stable-diffusion.cpp to 7397ddaa86f4e8837d5261724678cde0f36d4d89 ( #9242 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-06 10:52:51 +02:00
LocalAI [bot]
0dda4fe6f0
chore: ⬆️ Update ggml-org/llama.cpp to 761797ffdf2ce3f118e82c663b1ad7d935fbd656 ( #9243 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-06 10:52:38 +02:00
Ettore Di Giacinto
773489eeb1
fix(chat): do not retry if we had chatdeltas or tooldeltas from backend ( #9244 )
...
* fix(chat): do not retry if we had chatdeltas or tooldeltas from backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: use oai compat for llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* fix: apply to non-streaming path too
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* map also other fields
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-06 10:52:23 +02:00
Ettore Di Giacinto
06fbe48b3f
feat(llama.cpp): wire speculative decoding settings ( #9238 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-05 14:56:30 +02:00
Ettore Di Giacinto
232e324a68
fix(autoparser): correctly pass by logprobs ( #9239 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-05 09:39:22 +02:00
ER-EPR
39c954764c
Update index.yaml and add Qwen3.5 model files ( #9237 )
...
* Update index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Add mmproj files for Qwen3.5 models
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update file paths for Qwen models in index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Refactor Qwen3-Reranker-0.6B entry in index.yaml
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
* Update qwen3.yaml configuration parameters
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
---------
Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com >
2026-04-05 09:21:21 +02:00
Ettore Di Giacinto
9b7d5513fc
chore(gallery): add mmproj file for gemma4
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
v4.1.1
2026-04-05 02:02:52 +02:00
LocalAI [bot]
84cd8c0e7f
chore: ⬆️ Update ggml-org/llama.cpp to b8635075ffe27b135c49afb9a8b5c434bd42c502 ( #9231 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:58 +02:00
LocalAI [bot]
d990f2790c
chore(model-gallery): ⬆️ update checksum ( #9233 )
...
⬆️ Checksum updates in gallery/index.yaml
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:41 +02:00
Ettore Di Giacinto
53deeb1107
fix(reasoning): suppress partial tag tokens during autoparser warm-up
...
The C++ PEG parser needs a few tokens to identify the reasoning format
(e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC
layer was sending raw partial tag tokens to Go, which leaked into the
reasoning field.

- Clear reply.message in gRPC when autoparser is active but has no diffs
  yet, matching llama.cpp server behavior of only emitting classified output
- Prefer C++ autoparser chat deltas for reasoning/content in all streaming
  paths, falling back to Go-side extraction for backends without autoparser
  (e.g. vLLM)
- Override non-streaming no-tools result with chat delta content when available
- Guard PrependThinkingTokenIfNeeded against partial tag prefixes during
  streaming accumulation
- Reorder default thinking tokens so <|channel>thought is checked before
  <|think|> (Gemma 4 templates contain both)
2026-04-04 20:45:57 +00:00
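The last two bullets of the commit message above (guarding against partial tag prefixes, and checking `<|channel>thought` before `<|think|>`) can be illustrated with a small sketch. This is not LocalAI's actual code: `thinkingTags`, `isPartialTagPrefix`, and `firstTagIn` are hypothetical names chosen for illustration, and the real `PrependThinkingTokenIfNeeded` may work differently.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingTags is an assumed tag list (hypothetical, for illustration).
// Order matters: the Gemma 4 style tag comes before <|think|>.
var thinkingTags = []string{"<|channel>thought", "<|think|>"}

// isPartialTagPrefix reports whether s is a non-empty, strict prefix of
// any tag, meaning the stream may still be in the middle of emitting a tag.
func isPartialTagPrefix(s string, tags []string) bool {
	if s == "" {
		return false
	}
	for _, t := range tags {
		if len(s) < len(t) && strings.HasPrefix(t, s) {
			return true
		}
	}
	return false
}

// firstTagIn returns the first tag, in list order, that occurs anywhere in
// template, or "" if none does. When a template contains several tags,
// the list order decides which one wins.
func firstTagIn(template string, tags []string) string {
	for _, t := range tags {
		if strings.Contains(template, t) {
			return t
		}
	}
	return ""
}

func main() {
	// Simulate streaming accumulation: hold output back while the
	// accumulated text could still turn into a complete tag.
	acc := ""
	for _, tok := range []string{"<|", "channel>", "thought\n", "Let me think"} {
		acc += tok
		if isPartialTagPrefix(acc, thinkingTags) {
			fmt.Printf("%q: hold back, possible partial tag\n", acc)
			continue
		}
		fmt.Printf("%q: safe to classify\n", acc)
	}

	// A template that mentions both tags resolves to the first one listed.
	tmpl := "... <|think|> ... <|channel>thought ..."
	fmt.Printf("detected tag: %q\n", firstTagIn(tmpl, thinkingTags))
}
```

The strict-prefix check is deliberately conservative: once the accumulated text is as long as a full tag, or diverges from every tag, it is no longer held back, so ordinary content is delayed by at most a few tokens.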
Ettore Di Giacinto
c5a840f6af
fix(reasoning): warm-up
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 20:25:24 +00:00
Ettore Di Giacinto
6d9d77d590
fix(reasoning): accumulate and strip reasoning tags from autoparser results ( #9227 )
...
fix(reasoning): accumulate and strip reasoning tags from autoparser results
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 18:15:32 +02:00
Ettore Di Giacinto
6f304d1201
chore(refactor): use interface ( #9226 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 17:29:37 +02:00
Richard Palethorpe
557d0f0f04
feat(api): Allow coding agents to interactively discover how to control and configure LocalAI ( #9084 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2026-04-04 15:14:35 +02:00
Ettore Di Giacinto
b7e3589875
fix(anthropic): show null index when not present, default to 0 ( #9225 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 15:13:17 +02:00
Ettore Di Giacinto
716ddd697b
feat(autoparser): prefer chat deltas from backends when emitted ( #9224 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:12:08 +02:00
Ettore Di Giacinto
223deb908d
fix(nats): improve error handling ( #9222 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:54 +02:00
Ettore Di Giacinto
9f8821bba8
feat(gemma4): add thinking support ( #9221 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:38 +02:00
Ettore Di Giacinto
84e51b68ef
fix(ui): pass by staticApiKeyRequired to show login when only api key is configured ( #9220 )
...
This fixes #9213
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:22 +02:00
LocalAI [bot]
7962dd16f7
chore: ⬆️ Update ggml-org/llama.cpp to d006858316d4650bb4da0c6923294ccd741caefd ( #9215 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:39 +02:00
LocalAI [bot]
a1466b305a
docs: ⬆️ update docs version mudler/LocalAI ( #9214 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:25 +02:00
github-actions[bot]
57c0026715
chore: bump inference defaults from unsloth ( #9219 )
...
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:12 +02:00
Ettore Di Giacinto
1ed6b9e5ed
fix(llama.cpp): correctly parse grpc header for bearer token auth
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 21:38:41 +00:00
LocalAI [bot]
e4ee74354f
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9210 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 16:23:17 +02:00
Ettore Di Giacinto
8577bdcebc
Update asset links in README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:24:08 +02:00
Ettore Di Giacinto
0d489c7a0d
Add guided tour and update screenshots section
...
Updated README to include a guided tour section with links to various assets and details about agents and usage metrics.
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:23:03 +02:00
Ettore Di Giacinto
11dc54bda9
fix(docs): commit distribution.md
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 10:14:13 +02:00
Ettore Di Giacinto
7e0b73deaa
fix(docs): fix broken references to distributed mode
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 09:46:06 +02:00
LocalAI [bot]
c0a023d13d
chore: ⬆️ Update ggml-org/llama.cpp to a1cfb645307edc61a89e41557f290f441043d3c2 ( #9203 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 08:30:15 +02:00
Loryan Strant
0d3ae1c295
docs: Update Home Assistant integrations list ( #9206 )
...
Update Home Assistant integrations list
Signed-off-by: Loryan Strant <51473494+loryanstrant@users.noreply.github.com >
2026-04-03 08:30:00 +02:00
LocalAI [bot]
e9f10f2f50
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9202 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v4.1.0
2026-04-02 21:22:19 +02:00
Ettore Di Giacinto
b95b0b72ff
chore(ci): fix gallery agent
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-02 18:02:18 +00:00
LocalAI [bot]
26f1b94f4d
chore: ⬆️ Update ggml-org/llama.cpp to 95a6ebabb277c4cc18247e7bc2a5502133caca63 ( #9199 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:16 +02:00
LocalAI [bot]
2d40725ca2
chore: ⬆️ Update leejet/stable-diffusion.cpp to 87ecb95cbc65dc8e58e3d88f4f4a59a0939796f5 ( #9200 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:04 +02:00
Ettore Di Giacinto
6c635e8353
feat: add resume endpoint to undrain nodes ( #9197 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-01 18:21:43 +02:00
LocalAI [bot]
cc5f33ce95
chore: ⬆️ Update ggml-org/llama.cpp to 0fcb3760b2b9a3a496ef14621a7e4dad7a8df90f ( #9196 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:40 +02:00
LocalAI [bot]
ba7cdd532a
chore: ⬆️ Update leejet/stable-diffusion.cpp to 09b12d5f6d51d862749e8e0ee8baac8f012089e2 ( #9195 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:25 +02:00
Ettore Di Giacinto
6b6c136210
fix(inflight): count inflight from load model, but release afterwards ( #9194 )
...
This should fix the count of 1 in flight always showing in the node list
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 23:24:45 +02:00
Ettore Di Giacinto
e587ecc485
chore(ui): allow to unload forcefully
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 17:20:53 +00:00
Ettore Di Giacinto
f259036a27
feat(gpu): add jetson/tegra detection
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 15:45:07 +00:00
Ettore Di Giacinto
221ff0f28f
feat(ui): show cluster status in home in distributed mode
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 15:37:58 +00:00
Ettore Di Giacinto
16d5cb00bd
chore: css cleanups
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 16:37:38 +02:00
Richard Palethorpe
952635fba6
feat(distributed): Avoid resending models to backend nodes ( #9193 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-03-31 16:28:13 +02:00
Ettore Di Giacinto
3cc05af2e5
chore(nodes): restore offline nodes too
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 14:22:18 +00:00
Copilot
87a63316c7
stablediffusion-ggml: replace hand-maintained enum string arrays with upstream API calls ( #9192 )
...
* Initial plan
* Remove hand-maintained enum string arrays in gosd.cpp, use upstream API functions
Agent-Logs-Url: https://github.com/mudler/LocalAI/sessions/561fb489-89ed-4588-8f1e-7b967d91ba37
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-03-31 14:53:38 +02:00
Richard Palethorpe
efdcbbe332
feat(api): Return 404 when model is not found except for model names in HF format ( #9133 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2026-03-31 10:48:21 +02:00
Ettore Di Giacinto
b4fff9293d
chore: small ui improvements in the node page
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 08:41:40 +00:00
dependabot[bot]
8180221b7e
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/common/template ( #9176 )
...
chore(deps): bump grpcio in /backend/python/common/template
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:11:04 +02:00
dependabot[bot]
52a9755e08
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/rerankers ( #9181 )
...
chore(deps): bump grpcio in /backend/python/rerankers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:10:50 +02:00