625 Commits

Author SHA1 Message Date
Ettore Di Giacinto
221ff0f28f feat(ui): show cluster status in home in distributed mode
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 15:37:58 +00:00
Ettore Di Giacinto
16d5cb00bd chore: css cleanups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 16:37:38 +02:00
Richard Palethorpe
952635fba6 feat(distributed): Avoid resending models to backend nodes (#9193)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-31 16:28:13 +02:00
Ettore Di Giacinto
3cc05af2e5 chore(nodes): restore offline nodes too
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 14:22:18 +00:00
Richard Palethorpe
efdcbbe332 feat(api): Return 404 when model is not found except for model names in HF format (#9133)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-31 10:48:21 +02:00
Ettore Di Giacinto
b4fff9293d chore: small ui improvements in the node page
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 08:41:40 +00:00
Ettore Di Giacinto
3db12eaa7a fix(oauth/invite): do not register user (prending approval) without correct invite (#9189)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 08:29:07 +02:00
Ettore Di Giacinto
8862e3ce60 feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186)
* always enable parallel requests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: move tests to ginkgo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(smart router): order by available vram

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-31 08:28:56 +02:00
Ettore Di Giacinto
dd3376e0a9 chore(workers): improve logging, set header timeouts (#9171)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-30 17:26:55 +02:00
Richard Palethorpe
c2f7d1c18b feat(ui): Add media history to studio pages (e.g. past images) (#9151)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-30 00:49:55 +02:00
Ettore Di Giacinto
59108fbe32 feat: add distributed mode (#9124)
* feat: add distributed mode (experimental)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix data races, mutexes, transactions

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix events and tool stream in agent chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* use ginkgo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(cron): compute correctly time boundaries avoiding re-triggering

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not flood of healthy checks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not list obvious backends as text backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop redundant healthcheck

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-30 00:47:27 +02:00
Richard Palethorpe
d3f629f183 feat: Merge repeated log lines in the terminal (#9141)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-26 22:16:13 +01:00
walcz-de
00fcf6936c fix: implement encoding_format=base64 for embeddings endpoint (#9135)
The OpenAI Node.js SDK v4+ sends encoding_format=base64 by default.
LocalAI previously ignored this parameter and always returned a float
JSON array, causing a silent data corruption bug in any Node.js client
(AnythingLLM Desktop, LangChain.js, LlamaIndex.TS, …):

  // What the client does when it expects base64 but receives a float array:
  Buffer.from(floatArray, 'base64')

Node.js treats a non-string first argument as a byte array — each
float32 value is truncated to a single byte — and Float32Array then
reads those bytes as floats, yielding dims/4 values.  Vector databases
(Qdrant, pgvector, …) then create collections with the wrong dimension,
causing all similarity searches to fail silently.

  e.g. granite-embedding-107m (384 dims) → 96 stored in Qdrant
       jina-embeddings-v3      (1024 dims) → 256 stored in Qdrant

Changes:
- core/schema/prediction.go: add EncodingFormat string field to
  PredictionOptions so the request parameter is parsed and available
  throughout the request pipeline
- core/schema/openai.go: add EmbeddingBase64 string field to Item;
  add MarshalJSON so the "embedding" JSON key emits either []float32
  or a base64 string depending on which field is populated — all other
  Item consumers (image, video endpoints) are unaffected
- core/http/endpoints/openai/embeddings.go: add floatsToBase64()
  which packs a float32 slice as little-endian bytes and base64-encodes
  it; add embeddingItem() helper; both InputToken and InputStrings loops
  now honour encoding_format=base64

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 17:38:07 +01:00
Ettore Di Giacinto
15935e9d5f fix(auth): do not allow to register in invite mode (#9101)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 20:44:03 +01:00
Ettore Di Giacinto
5d410e5a03 fix(download): do not remove dst dir until we try all fallbacks (#9100)
This actually caused fallbacks to be compeletely no-op as we were
removing the destination dir before calling containerd.Apply

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 10:29:57 +01:00
Ettore Di Giacinto
031a36c995 feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092)
* feat: wire min_p

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: inferencing defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(refactor): re-use iterative parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: generate automatically inference defaults from unsloth

Instead of trying to re-invent the wheel and maintain here the inference
defaults, prefer to consume unsloth ones, and contribute there as
necessary.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: apply defaults also to models installed via gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: be consistent and apply fallback to all endpoint

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 00:57:15 +01:00
Ettore Di Giacinto
f7e8d9e791 feat(quantization): add quantization backend (#9096)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 00:56:34 +01:00
Ettore Di Giacinto
4b183b7bb6 feat: add quota system (#9090)
* feat: add quota system

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-21 10:09:49 +01:00
Ettore Di Giacinto
f38e91d80b feat(ui): add predictor for usage, user-breakdown statistics (#9091)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-21 10:09:36 +01:00
Ettore Di Giacinto
d9c1db2b87 feat: add (experimental) fine-tuning support with TRL (#9088)
* feat: add fine-tuning endpoint

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(experimental): add fine-tuning endpoint and TRL support

This changeset defines new GRPC signatues for Fine tuning backends, and
add TRL backend as initial fine-tuning engine. This implementation also
supports exporting to GGUF and automatically importing it to LocalAI
after fine-tuning.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* commit TRL backend, stop by killing process

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* move fine-tune to generic features

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add evals, reorder menu

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-21 02:08:02 +01:00
Richard Palethorpe
cb63bdb9e4 feat(ui): Add model pipeline editor (#9070)
This creates a new model config page. Presently just allows configuring
pipelines, but can be extending the future to other types of models.
However pipelines are quite easy to create a form for and require
editing to create.

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-20 15:07:34 +01:00
Richard Palethorpe
8cd3f9fc47 feat(ui, openai): Structured errors and link to traces in error toast (#9068)
First when sending errors over SSE we now clearly identify them as such
instead of just sending the error string as a chat completion message.

We use this in the UI to identify errors and link to them to the traces.

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-20 15:06:07 +01:00
lif
e0ab1a8b43 fix: use exact tag matching for model gallery tag filtering (#9041)
The Search() method uses strings.Contains() on comma-joined tags,
causing substring false positives (e.g., "asr" matching "image-diffusers").

Add FilterByTag() method that checks each tag with strings.EqualFold()
for exact, case-insensitive matching. Add 'tag' query parameter to
/api/models and /api/backends endpoints. Update the React frontend to
send filter selections as 'tag' instead of 'term'.

Closes #8775

Signed-off-by: majiayu000 <1835304752@qq.com>
2026-03-20 08:37:45 +01:00
Ettore Di Giacinto
c3174f9543 chore(deps): bump llama-cpp to 'a0bbcdd9b6b83eeeda6f1216088f42c33d464e38' (#9079)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-20 08:12:21 +01:00
LocalAI [bot]
2b12875302 fix: Add tracing settings loading from runtime_settings.json (#9081)
Tracing settings (EnableTracing and TracingMaxItems) were not being
loaded from runtime_settings.json on startup, causing tracing settings
configured via WebUI to be lost after service restart.

This fix adds proper loading of tracing settings in
loadRuntimeSettingsFromFile function in core/application/startup.go.

Fixes #9072

Co-authored-by: localai-bot <localai-bot@localai.io>
2026-03-20 00:58:52 +01:00
Tv
a6d0e29eba fix(openresponses): do not omit required field ORItemParam.Arguments (#9074)
See #9047
2026-03-19 22:04:45 +01:00
Ettore Di Giacinto
aea21951a2 feat: add users and authentication support (#9061)
* feat(ui): add users and authentication support

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: allow the admin user to impersonificate users

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: ui improvements, disable 'Users' button in navbar when no auth is configured

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: add OIDC support

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: gate models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: cache requests to optimize speed

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* small UI enhancements

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ui): style improvements

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: cover other paths by auth

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: separate local auth, refactor

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* security hardening, approval mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: fix tests and expectations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: update localagi/localrecall

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-19 21:40:51 +01:00
Richard Palethorpe
cfb7641eea feat(ui, gallery): Show model backends and add searchable model/backend selector (#9060)
* feat(ui, gallery): Display and filter by the backend models use

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(ui): Add searchable model backend/model selector and prevent delete models being selected

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-18 21:14:41 +01:00
Richard Palethorpe
e832efeb9e fix(ui): Refresh model list on deletion (#9059)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-18 14:07:45 +01:00
dependabot[bot]
a42548e9d1 chore(deps): bump playwright from 1.52.0 to 1.58.2 in /core/http/react-ui in the npm_and_yarn group across 1 directory (#9055)
chore(deps): bump playwright

Bumps the npm_and_yarn group with 1 update in the /core/http/react-ui directory: [playwright](https://github.com/microsoft/playwright).


Updates `playwright` from 1.52.0 to 1.58.2
- [Release notes](https://github.com/microsoft/playwright/releases)
- [Commits](https://github.com/microsoft/playwright/compare/v1.52.0...v1.58.2)

---
updated-dependencies:
- dependency-name: playwright
  dependency-version: 1.58.2
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-18 14:05:59 +01:00
LocalAI [bot]
8615ce28a8 feat: Add standalone agent run mode inspired by LocalAGI (#9056)
- Add 'agent' subcommand with 'run' and 'list' sub-commands
- Support running agents by name from pool.json registry
- Support running agents from JSON config files
- Implement foreground mode with --prompt flag for single-turn interactions
- Reuse AgentPoolService for consistent agent initialization
- Add comprehensive unit tests for config loading and overrides

Fixes #8960

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
2026-03-18 14:04:20 +01:00
Ettore Di Giacinto
8336efec41 fix(ui): correctly display backend if specified in the model config, re-order MCP buttons (#9053)
fix(ui): correctly display backend if specified in the model config

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-18 09:58:25 +01:00
Tv
8a0edd0809 Always populate ORItemParam.Summary (#9049)
* fix(openresponses): do not omit required fields summary and id

* fix(openresponses): ensure ORItemParam.Summary is never null

Normalize Summary to an empty slice at serialization chokepoints
(sendSSEEvent, bufferEvent, buildORResponse) so it always serializes
as [] instead of null.

Closes #9047
2026-03-18 08:45:46 +01:00
Richard Palethorpe
35d509d8e7 feat(ui): Per model backend logs and various fixes (#9028)
* feat(gallery): Switch to expandable box instead of pop-over and display model files

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(ui, backends): Add individual backend logging

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(ui): Set the context settings from the model config

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-18 08:31:26 +01:00
Ettore Di Giacinto
eef808d921 fix: call .String() on AllowedTools
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-17 23:16:19 +00:00
Ettore Di Giacinto
ee96e5e08d chore: refactor endpoints to use same inferencing path, add automatic retrial mechanism in case of errors (#9029)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-16 21:31:02 +01:00
Richard Palethorpe
3d9ccd1ddc fix(ui): Add tracing inline settings back and create UI tests (#9027)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-16 17:51:06 +01:00
Ettore Di Giacinto
d8161bfe57 fix(api): unescape model names (#9024)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-16 00:56:37 +01:00
Ettore Di Giacinto
5fd42399d4 feat: support streaming mode for tool calls in agent mode, fix interleaved thinking stream (#9023)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-16 00:50:19 +01:00
Ettore Di Giacinto
4ea461c330 fix(ui): correctly map watchdog fields (#9022)
Fixes: https://github.com/mudler/LocalAI/issues/9018

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-15 22:12:24 +01:00
LocalAI [bot]
c6a51289b0 fix: Automatically disable mmap for Intel SYCL backends (#9012) (#9015)
* fix: Automatically disable mmap for Intel SYCL backends

Fixes issue #9012 where Qwen3.5 models fail to load on Intel Arc GPU
with RPC EOF error.

The Intel SYCL backend has a known issue where mmap enabled causes
the backend to hang. This change automatically disables mmap when
detecting Intel or SYCL backends.

References:
- https://github.com/mudler/LocalAI/issues/9012
- Documentation mentions: SYCL hangs when mmap: true is set

* feat: Add logging for mmap auto-disable on Intel SYCL backends

As requested in PR review, add xlog.Info call to log when mmap
is automatically disabled for Intel SYCL backends. This helps
with debugging and confirms the auto-disable logic is working.

---------

Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-03-15 21:06:35 +01:00
Ettore Di Giacinto
dde0353432 chore(api): add path to expose collection raw files
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-14 22:31:45 +00:00
Ettore Di Giacinto
8e8b7df715 fix(ui): do not let from button to trigger
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-14 17:35:04 +00:00
Richard Palethorpe
e4c1c39727 fix(conf): Don't overwrite env provided galleries with runtime conf (#8994)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-13 21:39:03 +01:00
Richard Palethorpe
ed2c6da4bf fix(ui): Move routes to /app to avoid conflict with API endpoints (#8978)
Also test for regressions in HTTP GET API key exempted endpoints because
this list can get out of sync with the UI routes.

Also fix support for proxying on a different prefix both server and
client side.

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-13 21:38:18 +01:00
Richard Palethorpe
f9a850c02a feat(realtime): WebRTC support (#8790)
* feat(realtime): WebRTC support

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(tracing): Show full LLM opts and deltas

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-13 21:37:15 +01:00
Ettore Di Giacinto
4e3bf2752d fix(tests): Remove 'huggingface' substring expectation
Remove check for 'huggingface' in response body.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-13 13:12:04 +01:00
Ettore Di Giacinto
6c11c54a3b fix: avoid race condition
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-12 18:46:22 +00:00
Ettore Di Giacinto
13bd0d9944 fix(collections): start agent pool after http server (#8981)
Otherwise if using collections with postgresql we create a deadlock, as
we need embeddings to be up

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-12 19:25:49 +01:00
Ettore Di Giacinto
a738f8b0e4 feat(backends): add ace-step.cpp (#8965)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-12 18:56:26 +01:00