Ettore Di Giacinto
9b7d5513fc
chore(gallery): add mmproj file for gemma4
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
v4.1.1
2026-04-05 02:02:52 +02:00
LocalAI [bot]
84cd8c0e7f
chore: ⬆️ Update ggml-org/llama.cpp to b8635075ffe27b135c49afb9a8b5c434bd42c502 ( #9231 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:58 +02:00
LocalAI [bot]
d990f2790c
chore(model-gallery): ⬆️ update checksum ( #9233 )
...
⬆️ Checksum updates in gallery/index.yaml
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 23:02:41 +02:00
Ettore Di Giacinto
53deeb1107
fix(reasoning): suppress partial tag tokens during autoparser warm-up
...
The C++ PEG parser needs a few tokens to identify the reasoning format
(e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC
layer was sending raw partial tag tokens to Go, which leaked into the
reasoning field.
- Clear reply.message in gRPC when autoparser is active but has no diffs
  yet, matching llama.cpp server behavior of only emitting classified output
- Prefer C++ autoparser chat deltas for reasoning/content in all streaming
  paths, falling back to Go-side extraction for backends without autoparser
  (e.g. vLLM)
- Override non-streaming no-tools result with chat delta content when available
- Guard PrependThinkingTokenIfNeeded against partial tag prefixes during
  streaming accumulation
- Reorder default thinking tokens so <|channel>thought is checked before
  <|think|> (Gemma 4 templates contain both)
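A minimal Go sketch of the two guards described above: detecting a partial tag prefix during streaming accumulation, and checking <|channel>thought before <|think|>. The names thinkingTags, isPartialTagPrefix and detectThinkingTag are hypothetical and for illustration only. They are not taken from the LocalAI code, and the real PrependThinkingTokenIfNeeded logic may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// thinkingTags is an illustrative list, not LocalAI's actual defaults.
// Order matters: "<|channel>thought" is checked before "<|think|>",
// because a template may contain both markers.
var thinkingTags = []string{"<|channel>thought", "<|think|>"}

// isPartialTagPrefix reports whether buf is a non-empty, proper prefix of
// any known tag. While this is true, the accumulated text may still turn
// into a tag, so it should not be forwarded as reasoning or content yet.
func isPartialTagPrefix(buf string) bool {
	if buf == "" {
		return false
	}
	for _, tag := range thinkingTags {
		if len(buf) < len(tag) && strings.HasPrefix(tag, buf) {
			return true
		}
	}
	return false
}

// detectThinkingTag returns the first tag, in list order, that appears
// anywhere in text.
func detectThinkingTag(text string) (string, bool) {
	for _, tag := range thinkingTags {
		if strings.Contains(text, tag) {
			return tag, true
		}
	}
	return "", false
}

func main() {
	// Simulate accumulated stream buffers at different points in time.
	for _, buf := range []string{"<|", "<|chan", "<|channel>thought\n", "Hello"} {
		fmt.Printf("%q partial=%v\n", buf, isPartialTagPrefix(buf))
	}

	// A template that contains both markers resolves to the first list entry.
	tag, ok := detectThinkingTag("<|think|> ... <|channel>thought\n")
	fmt.Println(tag, ok)
}
```

A caller would hold back output while isPartialTagPrefix returns true and resume emitting once the buffer either completes a tag or diverges from every known prefix.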
2026-04-04 20:45:57 +00:00
Ettore Di Giacinto
c5a840f6af
fix(reasoning): warm-up
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 20:25:24 +00:00
Ettore Di Giacinto
6d9d77d590
fix(reasoning): accumulate and strip reasoning tags from autoparser results ( #9227 )
...
fix(reasoning): accumulate and strip reasoning tags from autoparser results
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 18:15:32 +02:00
Ettore Di Giacinto
6f304d1201
chore(refactor): use interface ( #9226 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 17:29:37 +02:00
Richard Palethorpe
557d0f0f04
feat(api): Allow coding agents to interactively discover how to control and configure LocalAI ( #9084 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2026-04-04 15:14:35 +02:00
Ettore Di Giacinto
b7e3589875
fix(anthropic): show null index when not present, default to 0 ( #9225 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 15:13:17 +02:00
Ettore Di Giacinto
716ddd697b
feat(autoparser): prefer chat deltas from backends when emitted ( #9224 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:12:08 +02:00
Ettore Di Giacinto
223deb908d
fix(nats): improve error handling ( #9222 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:54 +02:00
Ettore Di Giacinto
9f8821bba8
feat(gemma4): add thinking support ( #9221 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:38 +02:00
Ettore Di Giacinto
84e51b68ef
fix(ui): pass staticApiKeyRequired to show login when only an API key is configured ( #9220 )
...
This fixes #9213
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-04 12:11:22 +02:00
LocalAI [bot]
7962dd16f7
chore: ⬆️ Update ggml-org/llama.cpp to d006858316d4650bb4da0c6923294ccd741caefd ( #9215 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:39 +02:00
LocalAI [bot]
a1466b305a
docs: ⬆️ update docs version mudler/LocalAI ( #9214 )
...
⬆️ Update docs version mudler/LocalAI
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:25 +02:00
github-actions[bot]
57c0026715
chore: bump inference defaults from unsloth ( #9219 )
...
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-04 09:44:12 +02:00
Ettore Di Giacinto
1ed6b9e5ed
fix(llama.cpp): correctly parse grpc header for bearer token auth
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 21:38:41 +00:00
LocalAI [bot]
e4ee74354f
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9210 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 16:23:17 +02:00
Ettore Di Giacinto
8577bdcebc
Update asset links in README.md
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:24:08 +02:00
Ettore Di Giacinto
0d489c7a0d
Add guided tour and update screenshots section
...
Updated README to include a guided tour section with links to various assets and details about agents and usage metrics.
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-04-03 10:23:03 +02:00
Ettore Di Giacinto
11dc54bda9
fix(docs): commit distribution.md
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 10:14:13 +02:00
Ettore Di Giacinto
7e0b73deaa
fix(docs): fix broken references to distributed mode
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-03 09:46:06 +02:00
LocalAI [bot]
c0a023d13d
chore: ⬆️ Update ggml-org/llama.cpp to a1cfb645307edc61a89e41557f290f441043d3c2 ( #9203 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-03 08:30:15 +02:00
Loryan Strant
0d3ae1c295
docs: Update Home Assistant integrations list ( #9206 )
...
Update Home Assistant integrations list
Signed-off-by: Loryan Strant <51473494+loryanstrant@users.noreply.github.com >
2026-04-03 08:30:00 +02:00
LocalAI [bot]
e9f10f2f50
chore(model gallery): 🤖 add 1 new models via gallery agent ( #9202 )
...
chore(model gallery): 🤖 add new models via gallery agent
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
v4.1.0
2026-04-02 21:22:19 +02:00
Ettore Di Giacinto
b95b0b72ff
chore(ci): fix gallery agent
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-02 18:02:18 +00:00
LocalAI [bot]
26f1b94f4d
chore: ⬆️ Update ggml-org/llama.cpp to 95a6ebabb277c4cc18247e7bc2a5502133caca63 ( #9199 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:16 +02:00
LocalAI [bot]
2d40725ca2
chore: ⬆️ Update leejet/stable-diffusion.cpp to 87ecb95cbc65dc8e58e3d88f4f4a59a0939796f5 ( #9200 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-02 08:53:04 +02:00
Ettore Di Giacinto
6c635e8353
feat: add resume endpoint to undrain nodes ( #9197 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-04-01 18:21:43 +02:00
LocalAI [bot]
cc5f33ce95
chore: ⬆️ Update ggml-org/llama.cpp to 0fcb3760b2b9a3a496ef14621a7e4dad7a8df90f ( #9196 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:40 +02:00
LocalAI [bot]
ba7cdd532a
chore: ⬆️ Update leejet/stable-diffusion.cpp to 09b12d5f6d51d862749e8e0ee8baac8f012089e2 ( #9195 )
...
⬆️ Update leejet/stable-diffusion.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-04-01 00:48:25 +02:00
Ettore Di Giacinto
6b6c136210
fix(inflight): count inflight from load model, but release afterwards ( #9194 )
...
This should fix the node list always showing an in-flight count of 1
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 23:24:45 +02:00
Ettore Di Giacinto
e587ecc485
chore(ui): allow forceful unloading
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 17:20:53 +00:00
Ettore Di Giacinto
f259036a27
feat(gpu): add jetson/tegra detection
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 15:45:07 +00:00
Ettore Di Giacinto
221ff0f28f
feat(ui): show cluster status in home in distributed mode
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 15:37:58 +00:00
Ettore Di Giacinto
16d5cb00bd
chore: css cleanups
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 16:37:38 +02:00
Richard Palethorpe
952635fba6
feat(distributed): Avoid resending models to backend nodes ( #9193 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com >
2026-03-31 16:28:13 +02:00
Ettore Di Giacinto
3cc05af2e5
chore(nodes): restore offline nodes too
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 14:22:18 +00:00
Copilot
87a63316c7
stablediffusion-ggml: replace hand-maintained enum string arrays with upstream API calls ( #9192 )
...
* Initial plan
* Remove hand-maintained enum string arrays in gosd.cpp, use upstream API functions
Agent-Logs-Url: https://github.com/mudler/LocalAI/sessions/561fb489-89ed-4588-8f1e-7b967d91ba37
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
---------
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com >
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-03-31 14:53:38 +02:00
Richard Palethorpe
efdcbbe332
feat(api): Return 404 when model is not found except for model names in HF format ( #9133 )
...
Signed-off-by: Richard Palethorpe <io@richiejp.com >
2026-03-31 10:48:21 +02:00
Ettore Di Giacinto
b4fff9293d
chore: small ui improvements in the node page
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 08:41:40 +00:00
dependabot[bot]
8180221b7e
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/common/template ( #9176 )
...
chore(deps): bump grpcio in /backend/python/common/template
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:11:04 +02:00
dependabot[bot]
52a9755e08
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/rerankers ( #9181 )
...
chore(deps): bump grpcio in /backend/python/rerankers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:10:50 +02:00
dependabot[bot]
a2a1d919f9
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/coqui ( #9182 )
...
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:10:35 +02:00
dependabot[bot]
a3d37931ec
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/vllm ( #9177 )
...
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:10:17 +02:00
dependabot[bot]
5b2e25ebb0
chore(deps): bump grpcio from 1.78.1 to 1.80.0 in /backend/python/transformers ( #9180 )
...
chore(deps): bump grpcio in /backend/python/transformers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.78.1 to 1.80.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.78.1...v1.80.0)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.80.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-31 10:10:03 +02:00
LocalAI [bot]
b0b37a472f
chore: ⬆️ Update ggml-org/llama.cpp to 08f21453aec846867b39878500d725a05bd32683 ( #9190 )
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-03-31 09:27:08 +02:00
Ettore Di Giacinto
3db12eaa7a
fix(oauth/invite): do not register user (pending approval) without correct invite ( #9189 )
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 08:29:07 +02:00
Ettore Di Giacinto
8862e3ce60
feat: add node reconciler, allow scheduling to a group of nodes, min/max autoscaler ( #9186 )
...
* always enable parallel requests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* feat: add node reconciler, allow scheduling to a group of nodes, min/max autoscaler
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore: move tests to ginkgo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
* chore(smart router): order by available vram
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io >
2026-03-31 08:28:56 +02:00
LocalAI [bot]
80699a3f70
feat(swagger): update swagger ( #9187 )
...
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com >
2026-03-30 23:48:06 +02:00