LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-17 11:43:42 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	85be4ff03c	feat(api): add ollama compatibility (#9284 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-09 14:15:14 +02:00
Ettore Di Giacinto	e00ce981f0	fix: try to add whisperx and faster-whisper for more variants (#9278 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 21:23:38 +02:00
Ettore Di Giacinto	39c6b3ed66	feat: track files being staged (#9275 ) This changeset makes visible when files are being staged, so users are aware that the model "isn't ready yet" for requests. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 14:33:58 +02:00
Ettore Di Giacinto	0e9d1a6588	chore(ci): drop unnecessary test Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 12:19:54 +00:00
Ettore Di Giacinto	154fa000d3	fix(autoscaling): extract load model from Route() and use as well when doing autoscale (#9270 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 08:27:51 +02:00
Richard Palethorpe	9ac1bdc587	feat(ui): Interactive model config editor with autocomplete (#9149 ) * feat(ui): Add dynamic model editor with autocomplete Signed-off-by: Richard Palethorpe <io@richiejp.com> * chore(docs): Add link to longformat installation video Signed-off-by: Richard Palethorpe <io@richiejp.com> --------- Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-07 14:42:23 +02:00
Ettore Di Giacinto	505c417fa7	fix(gpu): better detection for MacOS and Thor (#9263 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-07 00:39:07 +02:00
Ettore Di Giacinto	0f9d516a6c	fix(anthropic): do not emit empty tokens and fix SSE tool calls (#9258 ) This fixes Claude Code compatibility Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-07 00:38:21 +02:00
Ettore Di Giacinto	92f99b1ec3	fix(token): login via legacy api keys (#9249 ) We were not checking against the api keys when db == nil. This commit also cleans up now-unused middleware Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-06 21:45:09 +02:00
Ettore Di Giacinto	773489eeb1	fix(chat): do not retry if we had chatdeltas or tooldeltas from backend (#9244 ) * fix(chat): do not retry if we had chatdeltas or tooldeltas from backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: use oai compat for llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: apply to non-streaming path too Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * map also other fields Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-06 10:52:23 +02:00
Ettore Di Giacinto	232e324a68	fix(autoparser): correctly pass by logprobs (#9239 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-05 09:39:22 +02:00
Ettore Di Giacinto	53deeb1107	fix(reasoning): suppress partial tag tokens during autoparser warm-up The C++ PEG parser needs a few tokens to identify the reasoning format (e.g. "<|channel>thought\n" for Gemma 4). During this warm-up, the gRPC layer was sending raw partial tag tokens to Go, which leaked into the reasoning field. - Clear reply.message in gRPC when autoparser is active but has no diffs yet, matching llama.cpp server behavior of only emitting classified output - Prefer C++ autoparser chat deltas for reasoning/content in all streaming paths, falling back to Go-side extraction for backends without autoparser (e.g. vLLM) - Override non-streaming no-tools result with chat delta content when available - Guard PrependThinkingTokenIfNeeded against partial tag prefixes during streaming accumulation - Reorder default thinking tokens so <|channel>thought is checked before <|think|> (Gemma 4 templates contain both)	2026-04-04 20:45:57 +00:00
Ettore Di Giacinto	c5a840f6af	fix(reasoning): warm-up Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 20:25:24 +00:00
Ettore Di Giacinto	6d9d77d590	fix(reasoning): accumulate and strip reasoning tags from autoparser results (#9227 ) fix(reasoning): accumulate and strip reasoning tags from autoparser results Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 18:15:32 +02:00
Ettore Di Giacinto	6f304d1201	chore(refactor): use interface (#9226 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 17:29:37 +02:00
Richard Palethorpe	557d0f0f04	feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-04 15:14:35 +02:00
Ettore Di Giacinto	b7e3589875	fix(anthropic): show null index when not present, default to 0 (#9225 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 15:13:17 +02:00
Ettore Di Giacinto	716ddd697b	feat(autoparser): prefer chat deltas from backends when emitted (#9224 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 12:12:08 +02:00
Ettore Di Giacinto	223deb908d	fix(nats): improve error handling (#9222 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 12:11:54 +02:00
Ettore Di Giacinto	9f8821bba8	feat(gemma4): add thinking support (#9221 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 12:11:38 +02:00
Ettore Di Giacinto	84e51b68ef	fix(ui): pass by staticApiKeyRequired to show login when only api key is configured (#9220 ) This fixes #9213 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-04 12:11:22 +02:00
github-actions[bot]	57c0026715	chore: bump inference defaults from unsloth (#9219 ) Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-04 09:44:12 +02:00
Ettore Di Giacinto	6c635e8353	feat: add resume endpoint to undrain nodes (#9197 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-01 18:21:43 +02:00
Ettore Di Giacinto	6b6c136210	fix(inflight): count inflight from load model, but release afterwards (#9194 ) This should fix the count of 1 in flight always showing in the node list Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 23:24:45 +02:00
Ettore Di Giacinto	e587ecc485	chore(ui): allow to unload forcefully Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 17:20:53 +00:00
Ettore Di Giacinto	221ff0f28f	feat(ui): show cluster status in home in distributed mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 15:37:58 +00:00
Ettore Di Giacinto	16d5cb00bd	chore: css cleanups Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 16:37:38 +02:00
Richard Palethorpe	952635fba6	feat(distributed): Avoid resending models to backend nodes (#9193 ) Signed-off-by: Richard Palethorpe <io@richiejp.com> Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-31 16:28:13 +02:00
Ettore Di Giacinto	3cc05af2e5	chore(nodes): restore offline nodes too Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 14:22:18 +00:00
Richard Palethorpe	efdcbbe332	feat(api): Return 404 when model is not found except for model names in HF format (#9133 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-31 10:48:21 +02:00
Ettore Di Giacinto	b4fff9293d	chore: small ui improvements in the node page Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 08:41:40 +00:00
Ettore Di Giacinto	3db12eaa7a	fix(oauth/invite): do not register user (pending approval) without correct invite (#9189 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 08:29:07 +02:00
Ettore Di Giacinto	8862e3ce60	feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler (#9186 ) * always enable parallel requests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: add node reconciler, allow to schedule to group of nodes, min/max autoscaler Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: move tests to ginkgo Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(smart router): order by available vram Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-31 08:28:56 +02:00
Ettore Di Giacinto	dd3376e0a9	chore(workers): improve logging, set header timeouts (#9171 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-30 17:26:55 +02:00
Richard Palethorpe	c2f7d1c18b	feat(ui): Add media history to studio pages (e.g. past images) (#9151 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-30 00:49:55 +02:00
Ettore Di Giacinto	59108fbe32	feat: add distributed mode (#9124 ) * feat: add distributed mode (experimental) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix data races, mutexes, transactions Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix events and tool stream in agent chat Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * use ginkgo Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(cron): compute correctly time boundaries avoiding re-triggering Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * enhancements, refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do not flood of healthy checks Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * do not list obvious backends as text backends Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * tests fixups Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * refactoring and consolidation Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Drop redundant healthcheck Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * enhancements, refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-30 00:47:27 +02:00
Richard Palethorpe	d3f629f183	feat: Merge repeated log lines in the terminal (#9141 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-26 22:16:13 +01:00
walcz-de	00fcf6936c	fix: implement encoding_format=base64 for embeddings endpoint (#9135 ) The OpenAI Node.js SDK v4+ sends encoding_format=base64 by default. LocalAI previously ignored this parameter and always returned a float JSON array, causing a silent data corruption bug in any Node.js client (AnythingLLM Desktop, LangChain.js, LlamaIndex.TS, …): // What the client does when it expects base64 but receives a float array: Buffer.from(floatArray, 'base64') Node.js treats a non-string first argument as a byte array — each float32 value is truncated to a single byte — and Float32Array then reads those bytes as floats, yielding dims/4 values. Vector databases (Qdrant, pgvector, …) then create collections with the wrong dimension, causing all similarity searches to fail silently. e.g. granite-embedding-107m (384 dims) → 96 stored in Qdrant jina-embeddings-v3 (1024 dims) → 256 stored in Qdrant Changes: - core/schema/prediction.go: add EncodingFormat string field to PredictionOptions so the request parameter is parsed and available throughout the request pipeline - core/schema/openai.go: add EmbeddingBase64 string field to Item; add MarshalJSON so the "embedding" JSON key emits either []float32 or a base64 string depending on which field is populated — all other Item consumers (image, video endpoints) are unaffected - core/http/endpoints/openai/embeddings.go: add floatsToBase64() which packs a float32 slice as little-endian bytes and base64-encodes it; add embeddingItem() helper; both InputToken and InputStrings loops now honour encoding_format=base64 Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-25 17:38:07 +01:00
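The packing step described in the commit above (float32 values written as little-endian bytes, then base64-encoded) can be sketched as follows. This is a minimal illustration, not the code from core/http/endpoints/openai/embeddings.go: the function name floatsToBase64 comes from the commit message, but its body here is an assumption, and base64ToFloats is a hypothetical helper added only to show what a client does on the receiving side.

```go
package main

import (
	"encoding/base64"
	"encoding/binary"
	"fmt"
	"math"
)

// floatsToBase64 packs each float32 as 4 little-endian bytes and
// base64-encodes the result. Sketch only; the real LocalAI function
// may differ in signature and details.
func floatsToBase64(v []float32) string {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return base64.StdEncoding.EncodeToString(buf)
}

// base64ToFloats is a hypothetical inverse: decode the base64 string,
// then read every 4 bytes back as one little-endian float32.
func base64ToFloats(s string) ([]float32, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	if len(raw)%4 != 0 {
		return nil, fmt.Errorf("payload length %d is not a multiple of 4", len(raw))
	}
	out := make([]float32, len(raw)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(raw[4*i:]))
	}
	return out, nil
}

func main() {
	enc := floatsToBase64([]float32{1, 2})
	dec, err := base64ToFloats(enc)
	fmt.Println(enc, dec, err)
}
```

Decoded this way, a 384-dimension embedding yields 384 values again, rather than the dims/4 count the commit reports when a float array is misread as base64 input.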
Ettore Di Giacinto	15935e9d5f	fix(auth): do not allow to register in invite mode (#9101 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 20:44:03 +01:00
Ettore Di Giacinto	5d410e5a03	fix(download): do not remove dst dir until we try all fallbacks (#9100 ) This actually caused fallbacks to be a complete no-op, as we were removing the destination dir before calling containerd.Apply Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 10:29:57 +01:00
Ettore Di Giacinto	031a36c995	feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092 ) * feat: wire min_p Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat: inferencing defaults Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(refactor): re-use iterative parser Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: generate automatically inference defaults from unsloth Instead of trying to re-invent the wheel and maintain here the inference defaults, prefer to consume unsloth ones, and contribute there as necessary. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: apply defaults also to models installed via gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore: be consistent and apply fallback to all endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 00:57:15 +01:00
Ettore Di Giacinto	f7e8d9e791	feat(quantization): add quantization backend (#9096 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-22 00:56:34 +01:00
Ettore Di Giacinto	4b183b7bb6	feat: add quota system (#9090 ) * feat: add quota system Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-21 10:09:49 +01:00
Ettore Di Giacinto	f38e91d80b	feat(ui): add predictor for usage, user-breakdown statistics (#9091 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-21 10:09:36 +01:00
Ettore Di Giacinto	d9c1db2b87	feat: add (experimental) fine-tuning support with TRL (#9088 ) * feat: add fine-tuning endpoint Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(experimental): add fine-tuning endpoint and TRL support This changeset defines new gRPC signatures for fine-tuning backends, and adds a TRL backend as the initial fine-tuning engine. This implementation also supports exporting to GGUF and automatically importing it to LocalAI after fine-tuning. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * commit TRL backend, stop by killing process Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * move fine-tune to generic features Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * add evals, reorder menu Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Fix tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-21 02:08:02 +01:00
Richard Palethorpe	cb63bdb9e4	feat(ui): Add model pipeline editor (#9070 ) This creates a new model config page. Presently it just allows configuring pipelines, but it can be extended in the future to other types of models. However, pipelines are quite easy to create a form for and require editing to create. Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-20 15:07:34 +01:00
Richard Palethorpe	8cd3f9fc47	feat(ui, openai): Structured errors and link to traces in error toast (#9068 ) First, when sending errors over SSE we now clearly identify them as such instead of just sending the error string as a chat completion message. We use this in the UI to identify errors and link them to the traces. Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-20 15:06:07 +01:00
lif	e0ab1a8b43	fix: use exact tag matching for model gallery tag filtering (#9041 ) The Search() method uses strings.Contains() on comma-joined tags, causing substring false positives (e.g., "asr" matching "image-diffusers"). Add FilterByTag() method that checks each tag with strings.EqualFold() for exact, case-insensitive matching. Add 'tag' query parameter to /api/models and /api/backends endpoints. Update the React frontend to send filter selections as 'tag' instead of 'term'. Closes #8775 Signed-off-by: majiayu000 <1835304752@qq.com>	2026-03-20 08:37:45 +01:00
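The difference between substring search on comma-joined tags and exact per-tag matching, as described in the commit above, can be sketched as follows. This is an illustration under assumptions, not the gallery code: containsMatch and hasTag are hypothetical names, and the tag values are invented for the example. Only strings.Contains and strings.EqualFold are taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// containsMatch mimics the substring approach: join the tags with commas
// and look for the term anywhere in the joined string (hypothetical helper).
func containsMatch(tags []string, term string) bool {
	joined := strings.ToLower(strings.Join(tags, ","))
	return strings.Contains(joined, strings.ToLower(term))
}

// hasTag mimics the exact approach: compare each tag as a whole,
// ignoring case, with strings.EqualFold (hypothetical helper).
func hasTag(tags []string, tag string) bool {
	for _, t := range tags {
		if strings.EqualFold(strings.TrimSpace(t), tag) {
			return true
		}
	}
	return false
}

func main() {
	tags := []string{"vllm", "gpu"} // invented example tags

	// Substring search reports a hit because "llm" occurs inside "vllm".
	fmt.Println(containsMatch(tags, "llm"))

	// Exact matching reports no hit: no tag is equal to "llm".
	fmt.Println(hasTag(tags, "llm"))

	// Case differences are still tolerated by EqualFold.
	fmt.Println(hasTag([]string{"LLM", "gpu"}, "llm"))
}
```

Comparing whole tags removes the false positives while keeping the match case-insensitive, which is the behavior the commit attributes to FilterByTag().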
Ettore Di Giacinto	c3174f9543	chore(deps): bump llama-cpp to 'a0bbcdd9b6b83eeeda6f1216088f42c33d464e38' (#9079 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-20 08:12:21 +01:00
LocalAI [bot]	2b12875302	fix: Add tracing settings loading from runtime_settings.json (#9081 ) Tracing settings (EnableTracing and TracingMaxItems) were not being loaded from runtime_settings.json on startup, causing tracing settings configured via WebUI to be lost after service restart. This fix adds proper loading of tracing settings in loadRuntimeSettingsFromFile function in core/application/startup.go. Fixes #9072 Co-authored-by: localai-bot <localai-bot@localai.io>	2026-03-20 00:58:52 +01:00


650 Commits