Commit Graph

24 Commits

Author SHA1 Message Date
Ettore Di Giacinto
98e5291afc feat: refactor build process, drop embedded backends (#5875)
* feat: split remaining backends and drop embedded backends

- Drop silero-vad, huggingface, and stores backend from embedded
  binaries
- Refactor Makefile and Dockerfile to avoid building grpc backends
- Drop golang code that was used to embed backends
- Simplify building by using goreleaser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(gallery): be specific with llama-cpp backend templates

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(docs): update

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(ci): minor fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: drop all ffmpeg references

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: run protogen-go

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Always enable p2p mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update gorelease file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(stores): do not always load

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix linting issues

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Mac OS fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-22 16:31:04 +02:00
Ettore Di Giacinto
294f7022f3 feat: do not bundle llama-cpp anymore (#5790)
* Build llama.cpp separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Start to try to attach some tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add git and small fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: correctly autoload external backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run AIO tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Slightly update the Makefile helps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt auto-bumper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run linux test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add llama-cpp into build pipelines

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add default capability (for cpu)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop llama-cpp specific logic from the backend loader

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* drop grpc install in ci for tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Pass by backends path for tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Build protogen at start

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(tests): set backends path consistently

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Correctly configure the backends path

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to build for darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* WIP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Compile for metal on arm64/darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to run build off from cross-arch

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to the backend index nvidia-l4t and cpu's llama-cpp backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Build also darwin-x86 for llama-cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Disable arm64 builds temporary

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Test backend build on PR

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup build backend reusable workflow

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pass by skip drivers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use crane

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Skip drivers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* x86 darwin

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add packaging step for llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix leftover from bark-cpp extraction

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Try to fix hipblas build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-18 13:24:12 +02:00
Ettore Di Giacinto
ec206cc67c feat(cli): allow to install backends from OCI tar files (#5816)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-09 18:19:51 +02:00
Ettore Di Giacinto
8276952920 feat(system): detect and allow to override capabilities (#5785)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-03 19:30:52 +02:00
Ettore Di Giacinto
b7cd5bfaec feat(backends): add metas in the gallery (#5784)
* chore(backends): add metas in the gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: correctly handle aliases and metas with same names

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-03 18:01:55 +02:00
Ettore Di Giacinto
bfdc29d316 fix(gallery): correctly show status for downloading OCI images (#5774)
We can't use the mutate.Extract written bytes as current status as that
will be bigger than the compressed image size. Image manifest don't have
any guarantee of the type of artifact (can be compressed or not) when
showing the layer size.

Split the extraction process in two parts: Downloading and extracting as
a flattened system, in this way we can display the status of downloading
and extracting accordingly.

This change also fixes a small nuance in detecting installed backends,
now it's more consistent and looks if a metadata.json and/or a path with
a `run.sh` file is present.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-02 08:25:48 +02:00
Ettore Di Giacinto
d0fb23514f Revert "fix(gallery): correctly show status for downloading OCI images"
This reverts commit 780d034ac9.
2025-07-01 21:32:04 +02:00
Ettore Di Giacinto
780d034ac9 fix(gallery): correctly show status for downloading OCI images
We can't use the mutate.Extract written bytes as current status as that
will be bigger than the compressed image size. Image manifest don't have
any guarantee of the type of artifact (can be compressed or not) when
showing the layer size.

Split the extraction process in two parts: Downloading and extracting as
a flattened system, in this way we can display the status of downloading
and extracting accordingly.

This change also fixes a small nuance in detecting installed backends,
now it's more consistent and looks if a metadata.json and/or a path with
a `run.sh` file is present.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-01 19:56:28 +02:00
Ettore Di Giacinto
7a78e4f482 fix(backends gallery): meta packages do not have URIs (#5740)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 22:23:14 +02:00
Ettore Di Giacinto
bb54f2da2b feat(gallery): automatically install missing backends along models (#5736)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-27 18:25:44 +02:00
Ettore Di Giacinto
a6d9988e84 feat(backend gallery): add meta packages (#5696)
* feat(backend gallery): add meta packages

So we can have meta packages such as "vllm" that automatically installs
the corresponding package depending on the GPU that is being currently
detected in the system.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: use a metadata file

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-24 17:08:27 +02:00
Ettore Di Giacinto
efde0eaf83 feat(backend gallery): display download progress (#5687)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-06-18 23:49:44 +02:00
Ettore Di Giacinto
2d64269763 feat: Add backend gallery (#5607)
* feat: Add backend gallery

This PR add support to manage backends as similar to models. There is
now available a backend gallery which can be used to install and remove
extra backends.
The backend gallery can be configured similarly as a model gallery, and
API calls allows to install and remove new backends in runtime, and as
well during the startup phase of LocalAI.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add backends docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip: Backend Dockerfile for python backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: drop extras images, build python backends separately

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup on all backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* test CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Tweaks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop old backends leftovers

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixup CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Move dockerfile upper

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fix proto

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Feature dropped for consistency - we prefer model galleries

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing packages in the build image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* exllama is ponly available on cublas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* pin torch on chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups to index

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug CI

* Install accellerators deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add target arch

* Add cuda minor version

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use self-hosted runners

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: use quay for test images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups for vllm and chatterbox

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small fixups on CI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chatterbox is only available for nvidia

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Simplify CI builds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Adapt test, use qwen3

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(model gallery): add jina-reranker-v1-tiny-en-gguf

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(gguf-parser): recover from potential panics that can happen while reading ggufs with gguf-parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Use reranker from llama.cpp in AIO images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Limit concurrent jobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-06-15 14:56:52 +02:00
Ettore Di Giacinto
c87870b18e feat(ui): improve chat interface (#4910)
* feat(ui): show more informations in the chat view, minor adjustments to model gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(ui): UI improvements

Visual improvements and bugfixes including:
- disable pagination during search
- fix scrolling on new message

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-26 18:27:18 +01:00
Ettore Di Giacinto
e9971b168a feat(ui): paginate model gallery (#4886)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-22 21:38:00 +01:00
Ettore Di Giacinto
7daf5ac3e3 fix(gallery): do not return overrides and additional config (#4768)
When hitting /models/available we are intersted in the model
description, name and small metadatas. Configuration and overrides are
part of internals which are required only for installation.

This also solves a current bug when hitting /models/available fails if
one of the gallery items have overrides with parameters defined

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-02-05 18:37:09 +01:00
Ettore Di Giacinto
3c3050f68e feat(backends): Drop bert.cpp (#4272)
* feat(backends): Drop bert.cpp

use llama.cpp 3.2 as a drop-in replacement for bert.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): make test more robust

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-11-27 16:34:28 +01:00
Dave
90cacb9692 test: preliminary tests and merge fix for authv2 (#3584)
* add api key to existing app tests, add preliminary auth test

Signed-off-by: Dave Lee <dave@gray101.com>

* small fix, run test

Signed-off-by: Dave Lee <dave@gray101.com>

* status on non-opaque

Signed-off-by: Dave Lee <dave@gray101.com>

* tweak auth error

Signed-off-by: Dave Lee <dave@gray101.com>

* exp

Signed-off-by: Dave Lee <dave@gray101.com>

* quick fix on real laptop

Signed-off-by: Dave Lee <dave@gray101.com>

* add downloader version that allows providing an auth header

Signed-off-by: Dave Lee <dave@gray101.com>

* stash some devcontainer fixes during testing

Signed-off-by: Dave Lee <dave@gray101.com>

* s2

Signed-off-by: Dave Lee <dave@gray101.com>

* s

Signed-off-by: Dave Lee <dave@gray101.com>

* done with experiment

Signed-off-by: Dave Lee <dave@gray101.com>

* done with experiment

Signed-off-by: Dave Lee <dave@gray101.com>

* after merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* rename and fix

Signed-off-by: Dave Lee <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-09-24 09:32:48 +02:00
Ettore Di Giacinto
a36b721ca6 fix: be consistent in downloading files, check for scanner errors (#3108)
* fix(downloader): be consistent in downloading files

This PR puts some order in the downloader such as functions are re-used
across several places.

This fixes an issue with having uri's inside the model YAML file, it
would resolve to MD5 rather then using the filename

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(scanner): do raise error only if unsafeFiles are found

Fixes: https://github.com/mudler/LocalAI/issues/3114

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-08-02 20:06:25 +02:00
Ettore Di Giacinto
2a839e1432 fix(gallery): do not attempt to delete duplicate files (#3031)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-28 10:27:56 +02:00
Ettore Di Giacinto
607900a4bb docs: more swagger, update docs (#2907)
* docs(swagger): finish convering gallery section

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* docs: add section to explain how to install models with local-ai run

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Minor docs adjustments

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-07-18 11:25:21 +02:00
Rene Leonhardt
fc87507012 chore(deps): Update Dependencies (#2538)
* chore(deps): Update dependencies

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

* chore(deps): Upgrade github.com/imdario/mergo to dario.cat/mergo

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

* remove version identifiers for MeloTTS

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>

---------

Signed-off-by: Rene Leonhardt <65483435+reneleonhardt@users.noreply.github.com>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Dave <dave@gray101.com>
2024-07-12 19:54:08 +00:00
Dave
133987b1fb feat: HF /scan endpoint (#2566)
* start by checking /scan during the checksum update

Signed-off-by: Dave Lee <dave@gray101.com>

* add back in golang side features: downloader/uri gets struct and scan function, gallery uses it, and secscan/models calls it.

Signed-off-by: Dave Lee <dave@gray101.com>

* add a param to scan specific urls - useful for debugging

Signed-off-by: Dave Lee <dave@gray101.com>

* helpful printouts

Signed-off-by: Dave Lee <dave@gray101.com>

* fix offsets

Signed-off-by: Dave Lee <dave@gray101.com>

* fix error and naming

Signed-off-by: Dave Lee <dave@gray101.com>

* expose error

Signed-off-by: Dave Lee <dave@gray101.com>

* fix json tags

Signed-off-by: Dave Lee <dave@gray101.com>

* slight wording change

Signed-off-by: Dave Lee <dave@gray101.com>

* go mod tidy - getting warnings

Signed-off-by: Dave Lee <dave@gray101.com>

* split out python to make editing easier, add some simple code  to delete contaminated entries from gallery

Signed-off-by: Dave Lee <dave@gray101.com>

* o7 to my favorite part of our old name, go-skynet

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* merge fix

Signed-off-by: Dave Lee <dave@gray101.com>

* address review comments

Signed-off-by: Dave Lee <dave@gray101.com>

* forgot secscan could accept multiple URL at once

Signed-off-by: Dave Lee <dave@gray101.com>

* invert naming and actually use it

Signed-off-by: Dave Lee <dave@gray101.com>

* missed cli/models.go

Signed-off-by: Dave Lee <dave@gray101.com>

* Update .github/check_and_update.py

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Dave <dave@gray101.com>

---------

Signed-off-by: Dave Lee <dave@gray101.com>
Signed-off-by: Dave <dave@gray101.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-07-10 13:18:32 +02:00
Ettore Di Giacinto
a181dd0ebc refactor: gallery inconsistencies (#2647)
* refactor(gallery): move under core/

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(unarchive): do not allow symlinks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-24 17:32:12 +02:00