Commit Graph

76 Commits

Author SHA1 Message Date
Ettore Di Giacinto
089efe05fd feat(backends): add system backend, refactor (#6059)
- Add a system backend path
- Refactor and consolidate system information in system state
- Use system state in all the components to figure out the system paths
  to used whenever needed
- Refactor BackendConfig -> ModelConfig. This was otherway misleading as
  now we do have a backend configuration which is not the model config.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-14 19:38:26 +02:00
Ettore Di Giacinto
b9a25b16e6 feat: add reasoning effort and metadata to template (#5981)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-06 21:56:05 +02:00
Ettore Di Giacinto
3d22bfc27c feat(stablediffusion-ggml): add support to ref images (flux Kontext) (#5935)
* feat(stablediffusion-ggml): add support to ref images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add it to the model gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 22:42:34 +02:00
Ettore Di Giacinto
73ecb7f90b chore: drop assistants endpoint (#5926)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-27 21:06:09 +02:00
Dave
b3c2a3c257 fix: untangle pkg and core (#5896)
* migrate core/system to pkg/system - it has no dependencies FROM core, and IS USED in pkg

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/templates up to core/templates -- nothing in pkg references it, but it does reference core.

Signed-off-by: Dave Lee <dave@gray101.com>

* remove extra check, len of nil is 0

Signed-off-by: Dave Lee <dave@gray101.com>

* move pkg/startup to core/startup -- it does have important and unfixable dependencies on core

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2025-07-24 15:03:41 +02:00
Richard Palethorpe
754bedc3ea fix(realtime): Reset speech started flag on commit (#5879)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-22 16:41:12 +02:00
Richard Palethorpe
932f6b01a6 feat(realtime): Add speech started and stopped events (#5856)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-18 09:22:23 +02:00
Ettore Di Giacinto
33f9ee06c9 fix(gallery): automatically install model from name (#5757)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-29 17:42:58 +02:00
Richard Palethorpe
d650647db9 fix(realtime): Use updated model on session update (#5604)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-06-09 00:11:05 +02:00
Richard Palethorpe
bf6426aef2 feat: Realtime API support reboot (#5392)
* feat(realtime): Initial Realtime API implementation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: go mod tidy

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat: Implement transcription only mode for realtime API

Reduce the scope of the real time API for the initial realease and make
transcription only mode functional.

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): Build backends on a separate layer to speed up core only changes

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-25 22:25:05 +02:00
Ettore Di Giacinto
2c9279a542 feat(video-gen): add endpoint for video generation (#5247)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-26 18:05:01 +02:00
Ettore Di Giacinto
2c425e9c69 feat(loader): enhance single active backend by treating as singleton (#5107)
feat(loader): enhance single active backend by treating at singleton

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-01 20:58:11 +02:00
Ettore Di Giacinto
28b10e8804 chore(swagger): update (#4805)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-11 09:51:01 +01:00
Dave
3cddf24747 feat: Centralized Request Processing middleware (#3847)
* squash past, centralize request middleware PR

Signed-off-by: Dave Lee <dave@gray101.com>

* migrate bruno request files to examples repo

Signed-off-by: Dave Lee <dave@gray101.com>

* fix

Signed-off-by: Dave Lee <dave@gray101.com>

* Update tests/e2e-aio/e2e_test.go

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-02-10 12:06:16 +01:00
Ettore Di Giacinto
8d45670e41 fix(openai): consistently return stop reason (#4771)
We were not returning a stop reason when no tool was actually called
(even if specified).

Fixes: https://github.com/mudler/LocalAI/issues/4716

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-06 12:41:08 +01:00
Ettore Di Giacinto
e15d29aba2 chore(stablediffusion-ncn): drop in favor of ggml implementation (#4652)
* chore(stablediffusion-ncn): drop in favor of ggml implementation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): drop stablediffusion build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): add

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): try to fixup current tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tests improvements

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): use quality to specify step

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): switch to sd-1.5

also increase prep time for downloading models

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-22 19:34:16 +01:00
Gianluca Boiano
032a33de49 chore: remove deprecated tinydream backend (#4631)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2025-01-18 18:35:30 +01:00
mintyleaf
96306a39a0 chore(docs): extra-Usage and Machine-Tag docs (#4627)
Rename LocalAI-Extra-Usage -> Extra-Usage, add MACHINE_TAG as cli flag option, add docs about extra-usage and machine-tag

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
2025-01-18 08:58:38 +01:00
mintyleaf
96f8ec0402 feat: add machine tag and inference timings (#4577)
* Add machine tag option, add extraUsage option, grpc-server -> proto -> endpoint extraUsage data is broken for now

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

* remove redurant timing fields, fix not working timings output

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

* use middleware for Machine-Tag only if tag is specified

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>

---------

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
2025-01-17 17:05:58 +01:00
Ettore Di Giacinto
f943c4b803 Revert "feat: include tokens usage for streamed output" (#4336)
Revert "feat: include tokens usage for streamed output (#4282)"

This reverts commit 0d6c3a7d57.
2024-12-08 17:53:36 +01:00
Ettore Di Giacinto
cea5a0ea42 feat(template): read jinja templates from gguf files (#4332)
* Read jinja templates as fallback

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move templating out of model loader

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Test TemplateMessages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Set role and content from transformers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tests: be more flexible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* More jinja

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small refactoring and adaptations

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-08 13:50:33 +01:00
mintyleaf
0d6c3a7d57 feat: include tokens usage for streamed output (#4282)
Use pb.Reply instead of []byte with Reply.GetMessage() in llama grpc to get the proper usage data in reply streaming mode at the last [DONE] frame

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-11-28 14:47:56 +01:00
Ettore Di Giacinto
4f1ab2366d chore(refactor): imply modelpath (#4208)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-20 18:06:35 +01:00
Ettore Di Giacinto
b425a870b0 fix(diffusers): correctly parse height and width request without parametrization (#4082)
* fix(diffusers): allow to specify width and height without enable-parameters

Let's simplify usage by not gating width and height by parameters

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: use sane defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-06 08:53:02 +01:00
Ettore Di Giacinto
ccc7cb0287 feat(templates): use a single template for multimodals messages (#3892)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-22 09:34:05 +02:00
Dave
a1634b219a fix: roll out bluemonday Sanitize more widely (#3794)
* initial pass: roll out bluemonday sanitization more widely

Signed-off-by: Dave Lee <dave@gray101.com>

* add one additional sanitize - the overall modelslist used by the docs site

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-12 09:45:47 +02:00
Ettore Di Giacinto
648ffdf449 feat(multimodal): allow to template placeholders (#3728)
feat(multimodal): allow to template image placeholders

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-10-04 18:32:29 +02:00
Dave
307a835199 groundwork: ListModels Filtering Upgrade (#2773)
* seperate the filtering from the middleware changes

---------

Signed-off-by: Dave Lee <dave@gray101.com>
2024-10-01 18:55:46 +00:00
siddimore
50a3b54e34 feat(api): add correlationID to Track Chat requests (#3668)
* Add CorrelationID to chat request

Signed-off-by: Siddharth More <siddimore@gmail.com>

* remove get_token_metrics

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Add CorrelationID to proto

Signed-off-by: Siddharth More <siddimore@gmail.com>

* fix correlation method name

Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

* Update core/http/endpoints/openai/chat.go

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Siddharth More <siddimore@gmail.com>

---------

Signed-off-by: Siddharth More <siddimore@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-09-28 17:23:56 +02:00
Ettore Di Giacinto
191bc2e50a feat(api): allow to pass audios to backends (#3603)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 12:26:53 +02:00
Ettore Di Giacinto
fbb9facda4 feat(api): allow to pass videos to backends (#3601)
This prepares the API to receive videos as well for video understanding.

It works similarly to images, where the request should be in the form:

{
 "type": "video_url",
 "video_url": { "url": "url or base64 data" }
}

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-19 11:21:59 +02:00
Ettore Di Giacinto
cf747bcdec feat: extract output with regexes from LLMs (#3491)
* feat: extract output with regexes from LLMs

This changset adds `extract_regex` to the LLM config. It is a list of
regexes that can match output and will be used to re extract text from
the LLM output. This is particularly useful for LLMs which outputs final
results into tags.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add tests, enhance output in case of configuration error

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-09-13 13:27:36 +02:00
Ettore Di Giacinto
fbaae8528d fix(chat): re-generated uuid, created, and text on each request (#3359)
This was noticed by models returning content besides function calls.
Sadly we can't test that easily in the CI so it got unnoticed.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-22 10:56:05 +02:00
Ettore Di Giacinto
e198347886 feat(openai): add json_schema format type and strict mode (#3193)
* feat(openai): add json_schema and strict mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* handle err vs _

security scanners prefer if we put these branches in, and I tend to agree.

Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-08-07 15:27:02 -04:00
Ettore Di Giacinto
2169c3497d feat(grammar): add llama3.1 schema (#3015)
* wip

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* get rid of panics

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* expose it properly from the config

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* forgot to commit

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Remove focus on test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-26 20:11:29 +02:00
Ettore Di Giacinto
5eda7f578d refactor: break down json grammar parser in different files (#3004)
* refactor: break down json grammar parser in different files

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: patch to `refactor_grammars` - propagate errors (#3006)

propagate errors around

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave Lee <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-07-25 08:41:00 +02:00
Ettore Di Giacinto
bf9dd1de7f feat(functions): parse broken JSON when we parse the raw results, use dynamic rules for grammar keys (#2912)
* feat(functions): enhance parsing with broken JSON when we parse the raw results

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* breaking: make function name by default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(grammar): dynamically generate grammars with mutating keys

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: simplify condition

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-18 17:52:22 +02:00
Ettore Di Giacinto
b8b0c7ad0b docs(swagger): core more localai/openai endpoints (#2904)
* docs(swagger): core more localai/openai endpoints

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix swagger descriptions for backend_monitor.go

Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-07-18 00:38:41 -04:00
Ettore Di Giacinto
59ef426fbf feat(model-list): be consistent, skip known files from listing (#2760)
fix(model-list): be consistent, skip known files from listing

This changeset does two things:

- Removes the dependency of listing models from the OpenAI schema.
- Tries to reduce confusion between ListModels() in model loader and in
  the service - now there is only one ListModels which is in services
and does not depend anymore on the OpenAI schema
- The OpenAI-schema functions were moved nearby the OpenAI specific
  endpoints that needs the schema
- Drops the ListModel Service structure as there was no real need for
  it.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-10 15:28:39 +02:00
Ettore Di Giacinto
f120a0c9f9 docs(swagger): enhance coverage of APIs (#2753)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-09 23:09:49 +02:00
Ettore Di Giacinto
03b1cf51fd feat(whisper): add translate option (#2649)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-24 19:21:22 +02:00
Ettore Di Giacinto
a181dd0ebc refactor: gallery inconsistencies (#2647)
* refactor(gallery): move under core/

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(unarchive): do not allow symlinks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-24 17:32:12 +02:00
Dave
12513ebae0 rf: centralize base64 image handling (#2595)
contains simple fixes to warnings and errors, removes a broken / outdated test, runs go mod tidy, and as the actual change, centralizes base64 image handling

Signed-off-by: Dave Lee <dave@gray101.com>
2024-06-24 08:34:36 +02:00
Sertaç Özercan
5866fc8ded chore: fix go.mod module (#2635)
Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2024-06-23 08:24:36 +00:00
Ettore Di Giacinto
aae7ad9d73 feat(llama.cpp): guess model defaults from file (#2522)
* wip: guess informations from gguf file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* update go mod

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Identify llama3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Do not try to guess the name, as reading gguf files can be expensive

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Allow to disable guessing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-08 22:13:02 +02:00
Ettore Di Giacinto
3b7a78adda fix(stream): do not break channel consumption (#2517)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-07 17:20:42 +02:00
Ettore Di Giacinto
3f7212c660 feat(functions): better free string matching, allow to expect strings after JSON (#2445)
Allow now any non-character, both as suffix and prefix when mixed grammars are enabled

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-31 09:36:27 +02:00
Ettore Di Giacinto
5b75bf16c7 models(gallery): add Codestral (#2442)
models(gallery): add Coderstral

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-30 18:50:26 +02:00
Prajwal S Nayak
4d98dd9ce7 feat(image): support response_type in the OpenAI API request (#2347)
* Change response_format type to string to match OpenAI Spec

Signed-off-by: prajwal <prajwalnayak7@gmail.com>

* updated response_type type to interface

Signed-off-by: prajwal <prajwalnayak7@gmail.com>

* feat: correctly parse generic struct

Signed-off-by: mudler <mudler@localai.io>

* add tests

Signed-off-by: mudler <mudler@localai.io>

---------

Signed-off-by: prajwal <prajwalnayak7@gmail.com>
Signed-off-by: mudler <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@localai.io>
2024-05-29 14:40:54 +02:00
Ettore Di Giacinto
669cd06dd9 feat(functions): allow parallel calls with mixed/no grammars (#2432)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-05-28 21:06:09 +02:00