Commit Graph

211 Commits

Author SHA1 Message Date
Ettore Di Giacinto
5f7ece3e94 fix(p2p): adapt to backend changes, general improvements (#5889)
The binary is now named "llama-cpp-rpc-server" for p2p workers.

We also decrease the default token rotation interval, in this way
peer discovery is much more responsive.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-23 12:40:32 +02:00
Ettore Di Giacinto
98e5291afc feat: refactor build process, drop embedded backends (#5875)
* feat: split remaining backends and drop embedded backends

- Drop silero-vad, huggingface, and stores backend from embedded
  binaries
- Refactor Makefile and Dockerfile to avoid building grpc backends
- Drop golang code that was used to embed backends
- Simplify building by using goreleaser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): be specific with llama-cpp backend templates

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(docs): update

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): minor fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: drop all ffmpeg references

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: run protogen-go

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Always enable p2p mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update gorelease file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(stores): do not always load

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix linting issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Mac OS fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-22 16:31:04 +02:00
Ettore Di Giacinto
7e1f2657d5 Update GPU-acceleration.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-07-06 19:03:34 +02:00
Ettore Di Giacinto
e1cc7ee107 fix(ci): enable tag-latest to auto (#5738)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 18:17:01 +02:00
Ettore Di Giacinto
6644af10c6 feat: ⚠️ reduce images size and stop bundling sources (#5721)
feat: reduce images size and stop bundling sources

Do not copy sources anymore, and reduce packages of the base images by
not using builder images.

If needed to rebuild, just build the container image from scratch by
following the docs. We will slowly try to migrate all backends to the
gallery to keep the core small.

This PR is a breaking change, it also sets the base folders to /models
and /backends instead of /build/models and /build/backends.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-26 18:41:38 +02:00
Ettore Di Giacinto
7c4a2e9b85 chore(ci): ⚠️ fix latest tag by using docker meta action (#5722)
chore(ci): fix latest tag by using docker meta action

Also uniform tagging names

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-26 18:40:25 +02:00
kilavvy
b68d6e8088 Docs: Fix typos (#5709)
* Update GPU-acceleration.md

Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com>

* Update image-generation.md

Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com>

---------

Signed-off-by: kilavvy <140459108+kilavvy@users.noreply.github.com>
2025-06-23 18:15:06 +02:00
Ettore Di Giacinto
3796558aeb Update quickstart.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-21 20:11:57 +02:00
Ettore Di Giacinto
79abe0ad77 Drop latest references to extras images 2025-06-20 15:51:16 +02:00
Ettore Di Giacinto
8131d11d1f Update quickstart.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-19 22:42:38 +02:00
Ettore Di Giacinto
1ccd64ff6a chore: drop extras references from docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-19 22:04:28 +02:00
Ettore Di Giacinto
49d026a229 Update backends.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-19 19:47:09 +02:00
leopardracer
f9b968e19d Fix Typos and Improve Clarity in GPU Acceleration Documentation (#5688)
Update GPU-acceleration.md

Signed-off-by: leopardracer <136604165+leopardracer@users.noreply.github.com>
2025-06-19 15:41:13 +02:00
Maxim Evtush
add8fc35a2 Fix Typos in Documentation and Python Comments (#5658)
* Update istftnet.py

Signed-off-by: Maxim Evtush <154841002+maximevtush@users.noreply.github.com>

* Update GPU-acceleration.md

Signed-off-by: Maxim Evtush <154841002+maximevtush@users.noreply.github.com>

---------

Signed-off-by: Maxim Evtush <154841002+maximevtush@users.noreply.github.com>
2025-06-18 22:11:13 +02:00
Ettore Di Giacinto
867db3f888 chore(docs): add backend url
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-17 22:35:21 +02:00
Ettore Di Giacinto
b79aa31398 chore: move backends docs
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-17 22:26:40 +02:00
FT
1f29b5f38e Fix Typos and Improve Documentation Clarity (#5648)
* Update p2p.go

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>

* Update GPU-acceleration.md

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>

---------

Signed-off-by: FT <140458077+zeevick10@users.noreply.github.com>
2025-06-15 16:04:44 +02:00
Ettore Di Giacinto
2d64269763 feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is ponly available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accellerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00
David Thole
38c5d16b57 feat(docs): updating the documentation on fine tuning and advanced guide. (#5420)
updating the documentation on fine tuning and advanced guide.  This mirrors how modern version of llama.cpp operate
2025-05-21 19:11:00 +02:00
omahs
0f365ac204 fix: typos (#5376)
Signed-off-by: omahs <73983677+omahs@users.noreply.github.com>
2025-05-16 12:45:48 +02:00
Ettore Di Giacinto
e52c66c76e chore(docs/install.sh): image changes (#5354)
chore(docs): image changes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-14 19:28:30 +02:00
Ettore Di Giacinto
0e8af53a5b chore: update quickstart
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-05-01 22:36:33 +02:00
Simon Redman
88857696d4 fix(CUDA): Add note for how to run CUDA with SELinux (#5259)
* Add note to help run nvidia containers with SELinux

* Use correct CUDA container references as noted in the dockerhub overview

* Clean trailing whitespaces
2025-04-28 09:00:52 +02:00
Mohit Gaur
b6e3dc5f02 docs: update docs for DisableWebUI flag (#5256)
Signed-off-by: Mohit Gaur <56885276+Mohit-Gaur@users.noreply.github.com>
2025-04-27 16:02:02 +02:00
Simon Redman
a65e012aa2 docs(Vulkan): Add GPU docker documentation for Vulkan (#5255)
Add GPU docker documentation for Vulkan
2025-04-27 09:20:26 +02:00
Ettore Di Giacinto
2c9279a542 feat(video-gen): add endpoint for video generation (#5247)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-26 18:05:01 +02:00
Ettore Di Giacinto
cc3df759f8 chore(docs): improve installer.sh docs (#5232)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-21 22:11:43 +02:00
Ettore Di Giacinto
61cc76c455 chore(autogptq): drop archived backend (#5214)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-19 15:52:29 +02:00
Ettore Di Giacinto
7547463f81 Update quickstart.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-04-16 08:48:55 +02:00
Ettore Di Giacinto
4f239bac89 feat: rebrand - LocalAGI and LocalRecall joins the LocalAI stack family (#5159)
* wip

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update lotusdocs and hugo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* rephrasing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Latest fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adjust readme section

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-04-15 17:51:24 +02:00
Ettore Di Giacinto
ac4991b069 chore(docs): update sponsor logo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-20 15:31:41 +01:00
Ettore Di Giacinto
f3ae94ca70 chore: update Image generation docs and examples (#4841)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-17 16:51:06 +01:00
Ettore Di Giacinto
7f90ff7aec chore(llama-ggml): drop deprecated backend (#4775)
The GGML format is now dead, since in the next version of LocalAI we
already bring many breaking compatibility changes, taking the occasion
also to drop ggml support (pre-gguf).

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-06 18:36:23 +01:00
Ettore Di Giacinto
28a1310890 chore(docs): enhance visibility
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-05 19:50:32 +01:00
Ettore Di Giacinto
2a702e9ca4 chore(docs): small updates
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-05 19:49:11 +01:00
Ettore Di Giacinto
3ecaea1b6e chore(docs): update sponsors in the website
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-05 19:41:55 +01:00
Ettore Di Giacinto
af41436f1b fix(tests): pin to branch for config used in tests (#4721)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-31 09:57:58 +01:00
Ettore Di Giacinto
72e52c4f6a chore: drop embedded models (#4715)
Since the remote gallery was introduced this is now completely
superseded by it. In order to keep the code clean and remove redudant
parts let's simplify the usage.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-30 00:03:01 +01:00
Ettore Di Giacinto
7f62b418a4 chore(docs): add documentation for l4t images
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-01-29 15:16:07 +01:00
Maximilian Kenfenheuer
a37b2c765c docs: update advanced-usage.md to reflect changes in #4700 (#4709)
Signed-off-by: Maximilian Kenfenheuer <maximilian.kenfenheuer@ksol.it>
2025-01-28 22:58:35 +01:00
Gianluca Boiano
032a33de49 chore: remove deprecated tinydream backend (#4631)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2025-01-18 18:35:30 +01:00
Gianluca Boiano
4bd8434ae0 fix(docs): add missing -core suffix to sycl images (#4630)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2025-01-18 15:47:49 +01:00
mintyleaf
96306a39a0 chore(docs): extra-Usage and Machine-Tag docs (#4627)
Rename LocalAI-Extra-Usage -> Extra-Usage, add MACHINE_TAG as cli flag option, add docs about extra-usage and machine-tag

Signed-off-by: mintyleaf <mintyleafdev@gmail.com>
2025-01-18 08:58:38 +01:00
Ettore Di Giacinto
ab344e4f47 docs: update compatibility-table.md (#4557)
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-01-07 21:20:44 +01:00
Ettore Di Giacinto
cab9f88ca4 chore(docs): add nvidia l4t instructions (#4454)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-23 18:59:33 +01:00
jtwolfe
ae9855a39e chore(docs): patch p2p detail in env and docs (#4434)
* Update distributed_inferencing.md

Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com>

* Update .env

Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com>

* Update distributed_inferencing.md

whoops

Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com>

---------

Signed-off-by: jtwolfe <jamie.t.wolfe@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-12-19 15:19:31 +01:00
Ettore Di Giacinto
3127cd1352 chore(docs): update available backends (#4325)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-05 16:57:56 +01:00
PetrFlegr
b90d78d9f6 Updated links of yamls (#4324)
Updated links

Links to deplyment*.yaml was changed

Signed-off-by: PetrFlegr <ptrflegr@gmail.com>
2024-12-05 16:06:51 +01:00
Ettore Di Giacinto
44a5dac312 feat(backend): add stablediffusion-ggml (#4289)
* feat(backend): add stablediffusion-ggml

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): track stablediffusion-ggml

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use default scheduler and sampler if not specified

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move cfg scale out of diffusers block

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Make it working

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: set free_params_immediately to false to call the model in sequence

https://github.com/leejet/stable-diffusion.cpp/issues/366

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-12-03 22:41:22 +01:00
Ettore Di Giacinto
3c3050f68e feat(backends): Drop bert.cpp (#4272)
* feat(backends): Drop bert.cpp

use llama.cpp 3.2 as a drop-in replacement for bert.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): make test more robust

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-27 16:34:28 +01:00