Compare commits

...

198 Commits

Author SHA1 Message Date
Ettore Di Giacinto
d6073ac18e Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-12-01 20:05:58 +01:00
Ettore Di Giacinto
1c450d46cf Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-12-01 20:01:07 +01:00
lunamidori5
6b312a8522 Site Clean up - How to Clean up (#1342)
* Create easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request-curl.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request-openai-v0.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request-openai-v1.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-request.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Delete docs/content/howtos/easy-request-openai-v1.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Delete docs/content/howtos/easy-request-openai-v0.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Delete docs/content/howtos/easy-request-curl.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update and rename easy-model-import-downloaded.md to easy-model.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-setup-docker-cpu.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-setup-docker-gpu.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-setup-docker-gpu.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-setup-docker-cpu.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Delete docs/content/howtos/autogen-setup.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Delete docs/content/howtos/easy-request-autogen.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update easy-model.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.en.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.en.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.en.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.en.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

* Update _index.md

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>

---------

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
2023-12-01 19:12:21 +01:00
Ettore Di Giacinto
2b2007ae9e docs: add fine-tuning example (#1374)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-12-01 19:11:45 +01:00
Dave
e94a34be8c fix: OSX Build Fix Part 1: Metal (#1365)
* Make Metal the default on OSX, simplify osx-specific code, and fix the file copy error.

* fix endif / comment
2023-11-30 19:50:50 +01:00
Ettore Di Giacinto
c3fb4b1d8e ci: rename workflow
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-30 19:25:33 +01:00
Ettore Di Giacinto
e3ca1a7dbe ci: split into reusable workflows (#1366)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-30 19:24:37 +01:00
B4ckslash
2d64d8b444 fix/docs: Python backend dependencies (#1360)
* Update docs for new requirements.txt path

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

* Fix typo (.PONY -> .PHONY) in python backend makefiles

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

---------

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
2023-11-30 17:46:55 +01:00
Ettore Di Giacinto
9b98be160a ci: limit concurrent jobs (#1364)
* ci: limit concurrent image push

* docs: mention core images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-30 17:45:20 +01:00
LocalAI [bot]
9f708ff318 ⬆️ Update ggerganov/llama.cpp (#1363)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-30 00:06:28 +01:00
Ettore Di Giacinto
4e0ad33d92 docs: Update getting started and GPU section (#1362) 2023-11-29 18:51:57 +01:00
LocalAI [bot]
519285bf38 ⬆️ Update ggerganov/llama.cpp (#1351)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-29 08:29:03 +01:00
Ettore Di Giacinto
fd1b7b3f22 docs: Add docker instructions, add community projects section in README (#1359)
docs: Add docker instructions
2023-11-28 23:14:16 +01:00
Gianluca Boiano
687730a7f5 fix: go-piper add libucd at linking time (#1357)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2023-11-28 19:55:09 +00:00
Ettore Di Giacinto
b7821361c3 feat(petals): add backend (#1350)
* feat(petals): add backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-28 09:01:46 +01:00
LocalAI [bot]
63e1f8fffd ⬆️ Update ggerganov/llama.cpp (#1345)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-27 09:02:19 +01:00
Ettore Di Giacinto
824612f1b4 feat: initial watchdog implementation (#1341)
* feat: initial watchdog implementation

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* fiuxups

* Add more output

* wip: idletime checker

* wire idle watchdog checks

* enlarge watchdog time window

* small fixes

* Use stopmodel

* Always delete process

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-26 18:36:23 +01:00
LocalAI [bot]
9482acfdfc ⬆️ Update ggerganov/llama.cpp (#1340)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-26 09:27:42 +01:00
Ettore Di Giacinto
c75bdd99e4 fix: rename transformers.py to avoid circular import (#1337)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-26 08:49:43 +01:00
Ettore Di Giacinto
6f34e8f044 fix: propagate CMAKE_ARGS when building grpc (#1334)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-25 13:53:51 +01:00
Ettore Di Giacinto
6d187af643 fix: handle grpc and llama-cpp with REBUILD=true (#1328)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-25 08:48:24 +01:00
LocalAI [bot]
97e9598c79 ⬆️ Update ggerganov/llama.cpp (#1330)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-24 23:45:05 +01:00
B4ckslash
5a6a6de3d7 docs: Update Features->Embeddings page to reflect backend restructuring (#1325)
* Update path to sentencetransformers backend for local execution

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

* Rename huggingface-embeddings -> sentencetransformers in embeddings.md for consistency with the backend structure

The Dockerfile still knows the "huggingface-embeddings"
backend (I assume for compatibility reasons) but uses the
sentencetransformers backend under the hood anyway.

I figured it would be good to update the docs to use the new naming to
make it less confusing moving forward. As the docker container knows
both the "huggingface-embeddings" and the "sentencetransformers"
backend, this should not break anything.

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

---------

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
2023-11-24 18:21:04 +01:00
LocalAI [bot]
b1a20effde ⬆️ Update ggerganov/llama.cpp (#1323)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-24 08:32:36 +01:00
Ettore Di Giacinto
ba5ab26f2e docs: Add llava, update hot topics (#1322)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-23 18:54:55 +01:00
Dave
69f53211a1 Feat: OSX Local Codesigning (#1319)
* stage makefile

* OSX local code signing and entitlements file to fix incoming connections prompt
2023-11-23 15:22:54 +01:00
B4ckslash
9dddd1134d fix: move python header comments below shebang in some backends (#1321)
* Fix python header comments for some extra gRPC backends

When a Python script is to be executed directly via exec(3), either the platform knows how to execute
the file itself (i.e. special configuration is necessary) or the first line
contains a shebang (#!) specifying the interpreter to run it (similar to
shell scripts).

The shebang MUST be on the first line for the script to work on all platforms,
so any header comments need to be in the lines following it. Otherwise
executing these scripts as extra backends will yield an "exec format
error" message.

Changes:
* Move introductory comments below the shebang line
* Change header comment in transformers.py to refer to the correct
  python module

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

* Make header comment in ttsbark.py more specific

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>

---------

Signed-off-by: Marcus Köhler <khler.marcus@gmail.com>
2023-11-23 15:22:37 +01:00
Ettore Di Giacinto
c5c77d2b0d docs: Initial import from localai-website (#1312)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-22 18:13:50 +01:00
LocalAI [bot]
763f94ca80 ⬆️ Update ggerganov/llama.cpp (#1313)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-22 08:37:11 +01:00
ok2sh
20d637e7b7 fix: ExLlama Backend Context Size & Rope Scaling (#1311)
* fix: context_size not propagated to exllama backend

* fix: exllama rope scaling
2023-11-21 19:26:39 +01:00
LocalAI [bot]
480b14c8dc ⬆️ Update ggerganov/llama.cpp (#1310)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-21 00:20:37 +01:00
Ettore Di Giacinto
999db4301a ci(core): add -core images without python deps (#1309)
* ci(core): add -core images without python deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci(core): use public runners

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-20 23:01:31 +01:00
Ettore Di Giacinto
92cbc4d516 feat(transformers): add embeddings with Automodel (#1308)
* Update huggingface.py

Switch SentenceTransformer for AutoModel in order to set trust_remote_code needed to use the encode method with embeddings models like jinai-v2

Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu>

* feat(transformers): split in separate backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Lucas Hänke de Cansino <lhc@next-boss.eu>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Lucas Hänke de Cansino <lhc@next-boss.eu>
2023-11-20 21:21:17 +01:00
LocalAI [bot]
ff9afdb0fe ⬆️ Update ggerganov/llama.cpp (#1306)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-20 08:16:00 +01:00
LocalAI [bot]
3e35b20a02 ⬆️ Update mudler/go-piper (#1305)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-19 09:01:40 +01:00
LocalAI [bot]
9ea371d6cd ⬆️ Update ggerganov/llama.cpp (#1304)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-19 08:49:05 +01:00
Ettore Di Giacinto
7a0f9767da docs: fix heading
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-18 15:04:00 +01:00
Ettore Di Giacinto
9d7363f2a7 docs: update configuration readme
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-18 15:03:15 +01:00
Ettore Di Giacinto
8ee5cf38fd Delete examples/configurations/llava/README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-18 15:01:39 +01:00
Ettore Di Giacinto
a6b788d220 docs: update LLaVa instructions
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-18 15:01:16 +01:00
lunamidori5
ccd87cd9f0 llava.yaml (yaml format standardization) (#1303)
Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
2023-11-18 14:48:54 +01:00
LocalAI [bot]
b5af87fc6c ⬆️ Update ggerganov/llama.cpp (#1300)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-18 08:19:10 +01:00
Ettore Di Giacinto
3c9544b023 refactor: rename llama-stable to llama-ggml (#1287)
* refactor: rename llama-stable to llama-ggml

* Makefile: get sources in sources/

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup path

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup sources

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups sd

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* update SD

* fixup

* fixup: create piper libdir also when not built

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix make target on linux test

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-18 08:18:43 +01:00
Mathias
2f65671070 fix(api/config): allow YAML config with .yml (#1299)
This commit allow to use both `.yml` and `.yaml` extensions for YAML configuration files as
it is usually expected.
2023-11-17 22:47:30 +01:00
LocalAI [bot]
8c5436cbed ⬆️ Update ggerganov/llama.cpp (#1297)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-17 08:45:22 +01:00
Ettore Di Giacinto
548959b50f feat: queue up requests if not running parallel requests (#1296)
Return a GRPC which handles a lock in case it is not meant to be
parallel.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-16 22:20:16 +01:00
LocalAI [bot]
2addb9f99a ⬆️ Update ggerganov/llama.cpp (#1291)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-16 08:20:26 +01:00
Ettore Di Giacinto
fdd95d1d86 feat: allow to run parallel requests (#1290)
* feat: allow to run parallel requests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixup

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-16 08:20:05 +01:00
Ettore Di Giacinto
66a558ff41 fix: respect OpenAI spec for response format (#1289)
fix: properly respect OpenAI spec for response format

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-15 19:36:23 +01:00
LocalAI [bot]
733b612eb2 ⬆️ Update ggerganov/llama.cpp (#1288)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-15 18:41:09 +01:00
LocalAI [bot]
991ecce004 ⬆️ Update ggerganov/llama.cpp (#1285)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-14 18:23:09 +01:00
Ettore Di Giacinto
ad0e30bca5 refactor: move backends into the backends directory (#1279)
* refactor: move backends into the backends directory

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactor: move main close to implementation for every backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-13 22:40:16 +01:00
LocalAI [bot]
55461188a4 ⬆️ Update ggerganov/llama.cpp (#1282)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-13 00:48:26 +00:00
LocalAI [bot]
5d2405fdef ⬆️ Update ggerganov/llama.cpp (#1280)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-11 23:26:54 +00:00
LocalAI [bot]
e9f1268225 ⬆️ Update ggerganov/llama.cpp (#1272)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-11 20:00:28 +00:00
Ettore Di Giacinto
803a0ac02a feat(llama.cpp): support lora with scale and yarn (#1277)
* feat(llama.cpp): support lora with scale

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(llama.cpp): support yarn

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-11 18:40:48 +01:00
Gianluca Boiano
bde87d00b9 deps(go-piper): update to 2023.11.6-3 (#1257)
Signed-off-by: Gianluca Boiano <morf3089@gmail.com>
2023-11-11 18:40:26 +01:00
Ettore Di Giacinto
0eae727366 🔥 add LaVA support and GPT vision API, Multiple requests for llama.cpp, return JSON types (#1254)
* wip

* wip

* Make it functional

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* wip

* Small fixups

* do not inject space on role encoding, encode img at beginning of messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add examples/config defaults

* Add include dir of current source dir

* cleanup

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

* Revert "fixups"

This reverts commit f1a4731cca.

* fixes

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-11 13:14:59 +01:00
LocalAI [bot]
3b4c5d54d8 ⬆️ Update ggerganov/llama.cpp (#1265)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-10 08:50:42 +01:00
LocalAI [bot]
4e16bc2f13 ⬆️ Update ggerganov/llama.cpp (#1256)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-08 08:21:12 +01:00
LocalAI [bot]
562ac62f59 ⬆️ Update ggerganov/llama.cpp (#1242)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-07 08:37:55 +01:00
Diego
e7fa2e06f8 Fixes the bug 1196 (#1232)
* Current state of the branch.

* Now gRPC is build only when the BUILD_GRPC_FOR_BACKEND_LLAMA variable is defined.

* Now the local compilation of gRPC is executed on BUILD_GRPC_FOR_BACKEND_LLAMA.

* Revised the Makefile.

* Removed replace directives in go.mod.

---------

Signed-off-by: Diego <38375572+diego-minguzzi@users.noreply.github.com>
Co-authored-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-11-06 19:07:46 +01:00
Ettore Di Giacinto
8123f009d0 dockerfile: fixup duplicate
This should have been "exllama"

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-05 14:09:31 +01:00
Ettore Di Giacinto
622aaa9f7d dockerfile: avoid pushing a big layer
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-05 10:31:33 +01:00
Ettore Di Giacinto
7b1ee203ce tests: re-add flake-attempts
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-11-05 09:01:03 +01:00
Ettore Di Giacinto
f347e51927 feat(conda): conda environments (#1144)
* feat(autogptq): add a separate conda environment for autogptq (#1137)

**Description**

This PR related to #1117

**Notes for Reviewers**

Here we lock down the version of the dependencies. Make sure it can be
used all the time without failed if the version of dependencies were
upgraded.

I change the order of importing packages according to the pylint, and no
change the logic of code. It should be ok.

I will do more investigate on writing some test cases for every backend.
I can run the service in my environment, but there is not exist a way to
test it. So, I am not confident on it.

Add a README.md in the `grpc` root. This is the common commands for
creating `conda` environment. And it can be used to the reference file
for creating extral gRPC backend document.

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* [Extra backend] Add seperate environment for ttsbark (#1141)

**Description**

This PR relates to #1117

**Notes for Reviewers**

Same to the latest PR:
* The code is also changed, but only the order of the import package
parts. And some code comments are also added.
* Add a configuration of the `conda` environment
* Add a simple test case for testing if the service can be startup in
current `conda` environment. It is succeed in VSCode, but the it is not
out of box on terminal. So, it is hard to say the test case really
useful.

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.

<!--
Thank you for contributing to LocalAI!

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): add make target and entrypoints for the dockerfile

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add seperate conda env for diffusers (#1145)

**Description**

This PR relates to  #1117

**Notes for Reviewers**

* Add `conda` env `diffusers.yml`
* Add Makefile to create it automatically
* Add `run.sh` to support running as a extra backend
  * Also adding it to the main Dockerfile
* Add make command in the root Makefile
* Testing the server, it can start up under the env

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda):Add seperate env for vllm (#1148)

**Description**

This PR is related to #1117

**Notes for Reviewers**

* The gRPC server can be started as normal
* The test case can be triggered in VSCode
* Same to other this kind of PRs, add `vllm.yml` Makefile and add
`run.sh` to the main Dockerfile, and command to the main Makefile

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.

<!--
Thank you for contributing to LocalAI!

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda):Add seperate env for huggingface (#1146)

**Description**

This PR is related to  #1117

**Notes for Reviewers**

* Add conda env `huggingface.yml`
* Change the import order, and also remove the no-used packages
* Add `run.sh` and `make command` to the main Dockerfile and Makefile
* Add test cases for it. It can be triggered and succeed under VSCode
Python extension but it is hang by using `python -m unites
test_huggingface.py` in the terminal

```
Running tests (unittest): /workspaces/LocalAI/extra/grpc/huggingface
Running tests: /workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_embedding
/workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_load_model
/workspaces/LocalAI/extra/grpc/huggingface/test_huggingface.py::TestBackendServicer::test_server_startup
./test_huggingface.py::TestBackendServicer::test_embedding Passed

./test_huggingface.py::TestBackendServicer::test_load_model Passed

./test_huggingface.py::TestBackendServicer::test_server_startup Passed

Total number of tests expected to run: 3
Total number of tests run: 3
Total number of tests passed: 3
Total number of tests failed: 0
Total number of tests failed with errors: 0
Total number of tests skipped: 0

Finished running tests!
```

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.

<!--
Thank you for contributing to LocalAI!

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda): Add the seperate conda env for VALL-E X (#1147)

**Description**

This PR is related  to #1117

**Notes for Reviewers**

* The gRPC server cannot start up

```
(ttsvalle) @Aisuko ➜ /workspaces/LocalAI (feat/vall-e-x) $ /opt/conda/envs/ttsvalle/bin/python /workspaces/LocalAI/extra/grpc/vall-e-x/ttsvalle.py
Traceback (most recent call last):
  File "/workspaces/LocalAI/extra/grpc/vall-e-x/ttsvalle.py", line 14, in <module>
    from utils.generation import SAMPLE_RATE, generate_audio, preload_models
ModuleNotFoundError: No module named 'utils'
```

The installation steps follow
https://github.com/Plachtaa/VALL-E-X#-installation below:

* Under the `ttsvalle` conda env

```
git clone https://github.com/Plachtaa/VALL-E-X.git
cd VALL-E-X
pip install -r requirements.txt
```

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.

<!--
Thank you for contributing to LocalAI!

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: set image type

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat(conda):Add seperate conda env for exllama (#1149)

Add seperate env for exllama

Signed-off-by: Aisuko <urakiny@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Setup conda

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Set image_type arg

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: prepare only conda env in tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Dockerfile: comment manual pip calls

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* conda: add conda to PATH

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixes

* add shebang

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* file perms

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug

* Install new conda in the worker

* Disable GPU tests for now until the worker is back

* Rename workflows

* debug

* Fixup conda install

* fixup(wrapper): pass args

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Aisuko <urakiny@gmail.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Aisuko <urakiny@gmail.com>
2023-11-04 15:30:32 +01:00
LocalAI [bot]
9b17af18b3 ⬆️ Update ggerganov/llama.cpp (#1236)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-03 19:23:53 +01:00
Samuel Walker
23c7fbfe6b chianlit example (#1238) 2023-11-02 22:56:46 +01:00
Samuel Walker
035fea676a llama index example (#1237) 2023-11-02 13:35:06 -07:00
Vitor Oliveira
6e1a234d15 feat(certificates): add support for custom CA certificates (#880)
This change facilitates users working behind corporate firewalls or proxies. By allowing the integration of custom CA certificates, users can handle SSL connections that are intercepted by company infrastructure.
2023-11-01 20:10:14 +01:00
LocalAI [bot]
5b596ea605 ⬆️ Update ggerganov/llama.cpp (#1231)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-11-01 12:44:34 +00:00
Dave
6bd56460de Update .gitignore for backend/llama.cpp (#1235)
Signed-off-by: Dave <dave@gray101.com>
2023-11-01 09:52:02 +01:00
LocalAI [bot]
6ef7ea2635 ⬆️ Update ggerganov/llama.cpp (#1207)
Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-10-30 08:00:36 +00:00
Ettore Di Giacinto
f8c00fbaf1 ci: enlarge download timeout window
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-29 22:09:35 +01:00
Ettore Di Giacinto
d9a42cc4c5 ci: run only cublas on selfhosted (#1224)
* ci: run only cublas on selfhosted

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* update git

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* change testing embeddings model link

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-29 22:04:43 +01:00
Ettore Di Giacinto
fc0bc32814 ci: use self-hosted to build container images (#1206)
ci: use self-hosted

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-10-26 21:13:40 +02:00
Ettore Di Giacinto
c62504ac92 cleanup: drop bloomz and ggllm as now supported by llama.cpp (#1217)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-26 07:43:31 +02:00
Ettore Di Giacinto
f227e918f9 feat(llama.cpp): Bump llama.cpp, adapt grpc server (#1211)
* feat(llama.cpp): Bump llama.cpp, adapt grpc server

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-25 20:56:25 +02:00
Ettore Di Giacinto
c132dbadce docs(examples): Add mistral example (#1214)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-25 20:56:12 +02:00
Dave
b839eb80a1 Fix backend/cpp/llama CMakeList.txt on OSX (#1212)
* Fix backend/cpp/llama CMakeList.txt on OSX - detect OSX and use homebrew libraries

* sneak a logging fix in too for gallery debugging

* additional logging
2023-10-25 20:53:26 +02:00
renovate[bot]
23b03a7f03 fix(deps): update module github.com/onsi/gomega to v1.28.1 (#1205)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-24 09:16:02 +02:00
LocalAI [bot]
9196583651 ⬆️ Update ggerganov/llama.cpp (#1204)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-10-23 19:06:39 +02:00
Ettore Di Giacinto
fd28252e55 fix(Dockerfile): try to save some space
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-22 17:13:39 +02:00
renovate[bot]
94f20e2eb7 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to c25dc51 (#1191)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-22 16:58:45 +02:00
Ettore Di Giacinto
5ced99a8e7 ci: more cleanup for workers
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-10-22 12:27:04 +02:00
LocalAI [bot]
c377e61ff0 ⬆️ Update go-skynet/go-llama.cpp (#1156)
Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-10-22 08:55:44 +02:00
Ettore Di Giacinto
a6fe0a020a feat(llama.cpp): update (#1200)
**Description**

This PR updates llama.cpp to
465219b914

Supersedes #1195
2023-10-21 18:44:37 +02:00
Ettore Di Giacinto
bf2ed3d752 fix(Dockerfile): piper phonemize is required during build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-21 16:40:41 +02:00
Ettore Di Giacinto
d17a92eef3 example(bruno): add image generation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-21 11:38:23 +02:00
Ettore Di Giacinto
1a7be035d3 fix(Makefile): build all backends if none is specified
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-21 11:34:59 +02:00
Ettore Di Giacinto
004baaa30f feat(llama.cpp): update
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-21 11:04:03 +02:00
renovate[bot]
ef19268418 chore(deps): update actions/checkout action to v4 (#1006)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-21 08:55:44 +02:00
renovate[bot]
e82470341f fix(deps): update module google.golang.org/grpc to v1.59.0 (#1189)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-20 17:04:14 +02:00
renovate[bot]
88fa42de75 fix(deps): update github.com/tmc/langchaingo digest to c636b3d (#1188)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-20 17:03:01 +02:00
Ettore Di Giacinto
432513c3ba ci: add GPU tests (#1095)
* ci: test GPU

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: show logs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Debug

* debug

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* split extra/core images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* split extra/core images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* consider runner host dir

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-19 13:50:40 +02:00
renovate[bot]
45370c212b fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 9a19c74 (#1179)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-17 18:37:27 +02:00
Jesús Espino
e91f660eb1 feat(metrics): Adding initial support for prometheus metrics (#1176)
* feat(metrics): Adding initial support for prometheus metrics

* Fixing CI

* run go mod tidy
2023-10-17 18:22:53 +02:00
renovate[bot]
3f3162e57c fix(deps): update module github.com/gofiber/fiber/v2 to v2.50.0 (#1177)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-16 21:47:44 +02:00
renovate[bot]
208d1fce58 fix(deps): update github.com/tmc/langchaingo digest to a02d4fd (#1175)
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-16 21:46:53 +02:00
Ettore Di Giacinto
128694213f feat: llama.cpp gRPC C++ backend (#1170)
* wip: llama.cpp c++ gRPC server

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* make it work, attach it to the build process

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* update deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: add protobuf dep

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* try fix protobuf on cmake

* cmake: workarounds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add packages

* cmake: use fixed version of grpc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* cmake(grpc): install locally

* install grpc

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* install required deps for grpc on debian bullseye

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* debug

* debug

* Fixups

* no need to install cmake manually

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: fixup macOS

* use brew whenever possible

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* macOS fixups

* debug

* fix container build

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* workaround

* try mac

https://stackoverflow.com/questions/23905661/on-mac-g-clang-fails-to-search-usr-local-include-and-usr-local-lib-by-def

* Disable temp. arm64 docker image builds

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-16 21:46:29 +02:00
Jesús Espino
8034ed3473 Adding transcript subcommand (#1171)
Adding the transcript subcommand to the localai binary

This PR is related to #816
2023-10-15 09:17:41 +02:00
renovate[bot]
d22069c59e fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 22de3c5 (#1172)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `10f9b49` -> `22de3c5` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-14 12:29:22 +02:00
renovate[bot]
5a04d32b39 fix(deps): update module github.com/sashabaranov/go-openai to v1.16.0 (#1159)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | minor | `v1.15.4` -> `v1.16.0` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.16.0`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.16.0)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.4...v1.16.0)

#### What's Changed

- Add DotProduct Method and README Example for Embedding Similarity
Search by [@&#8203;ealvar3z](https://togithub.com/ealvar3z) in
[https://github.com/sashabaranov/go-openai/pull/492](https://togithub.com/sashabaranov/go-openai/pull/492)
- fix: use any for n_epochs by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/499](https://togithub.com/sashabaranov/go-openai/pull/499)
- Feat Add headers to openai responses by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/506](https://togithub.com/sashabaranov/go-openai/pull/506)
- Support get http header and x-ratelimit-\* headers by
[@&#8203;liushuangls](https://togithub.com/liushuangls) in
[https://github.com/sashabaranov/go-openai/pull/507](https://togithub.com/sashabaranov/go-openai/pull/507)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.4...v1.16.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-14 12:28:58 +02:00
Jesús Espino
ab65f3a17d Addining the tts command line subcommand (#1169)
This PR adds the tts (Text to Speach) command to the localai binary.

This PR is related to the issue #816
2023-10-14 12:27:35 +02:00
renovate[bot]
4e23cbebcf fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 10f9b49 (#1158)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `56c0d28` -> `10f9b49` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-13 18:31:13 +02:00
Ettore Di Giacinto
63418c1afc ci: cleanup worker (#1166)
**Description**

Tries to make CI green again

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-10-12 18:09:56 +02:00
Jesús Espino
8ca671761a feat(cli): Adding models subcommand with list and install subcommands (#1165)
Adding subcommands to do certain actions directly from the command line.
I'm starting with the models subcommand allowing you to list models from
your galleries and install them.

This PR partially fixes #816

My intention is to keep adding other subcommands, but I think this is a
good start, and I think this already provides value.

Also, I added a new dependency to generate the progress bar in the
command line, it is not "needed" but I think is a nice to have to have a
cooler interface.

Here is a screenshot:

![imagen](https://github.com/go-skynet/LocalAI/assets/290303/8d8c1bf0-5340-46ce-9362-812694f914cd)
2023-10-12 10:45:34 +02:00
Jesús Espino
81a5ed9f31 fix(openai): Populate ID and Created fields in OpenAI compatible responses (#1164)
Adding the extra ID and Created fields to any request to the OpenAI
Compatible API to improve the compatibility.

This PR fixes #1103
2023-10-12 02:00:08 +00:00
renovate[bot]
528b9d9206 fix(deps): update github.com/go-skynet/go-llama.cpp digest to aeba71e (#1155)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `1676dcd` -> `aeba71e` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-11 18:19:13 +02:00
renovate[bot]
1a4c57fac2 fix(deps): update module google.golang.org/grpc to v1.58.3 (#1160)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [google.golang.org/grpc](https://togithub.com/grpc/grpc-go) | require
| patch | `v1.58.2` -> `v1.58.3` |

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.58.3`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.3)

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.58.2...v1.58.3)

### Security

- server: prohibit more than MaxConcurrentStreams handlers from running
at once (CVE-2023-44487)

In addition to this change, applications should ensure they do not leave
running tasks behind related to the RPC before returning from method
handlers, or should enforce appropriate limits on any such work.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-11 18:18:32 +02:00
Dave
44a7045732 Feats: bruno example, gallery improvements for new scraper (#1161)
This PR bundles together two unrelated features:

1. Model Gallery improvements - specifically, the ability to follow
".ref" gallery links (which I made up for this specific application) to
an actual gallery yaml file (in order to have stable URLs) and the
ability to load self-contained configurations, rather than always using
a base.yaml + overrides. This is groundwork for my python-based
huggingface scraper.

2. A while ago I introduced some Insomnia request templates for people
to use. Unfortunately, Insomnia has decided to tank their product... So
I've personally switched to using
[bruno](https://github.com/usebruno/bruno/). Corresponding equivalent
files that I use for my testing have been added. Just open the folder
from bruno and everything will work. No import process required.

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-10-11 18:18:12 +02:00
renovate[bot]
8ac7186185 fix(deps): update module github.com/onsi/ginkgo/v2 to v2.13.0 (#1152)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/onsi/ginkgo/v2](https://togithub.com/onsi/ginkgo) |
require | minor | `v2.12.1` -> `v2.13.0` |

---

### Release Notes

<details>
<summary>onsi/ginkgo (github.com/onsi/ginkgo/v2)</summary>

### [`v2.13.0`](https://togithub.com/onsi/ginkgo/releases/tag/v2.13.0)

[Compare
Source](https://togithub.com/onsi/ginkgo/compare/v2.12.1...v2.13.0)

#### 2.13.0

##### Features

Add PreviewSpect() to enable programmatic preview access to the suite
report (fixes
[#&#8203;1225](https://togithub.com/onsi/ginkgo/issues/1225))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy44LjEiLCJ1cGRhdGVkSW5WZXIiOiIzNy44LjEiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-10 09:27:41 +02:00
renovate[bot]
975387f7ae fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 56c0d28 (#1140)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `6711bdd` -> `56c0d28` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-07 11:20:40 +02:00
David
d793b5af5e fix: update docker-compose.yaml (#1131)
fix issue #803
2023-10-05 22:13:18 +02:00
renovate[bot]
5188776224 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 1676dcd (#1135)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `6018c9d` -> `1676dcd` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-05 21:14:47 +02:00
LocalAI [bot]
07249c0446 ⬆️ Update go-skynet/go-llama.cpp (#1136)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-10-05 17:35:21 +02:00
renovate[bot]
188301f403 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 6018c9d (#1129)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `79f9587` -> `6018c9d` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-04 18:03:15 +02:00
LocalAI [bot]
e660721a0c ⬆️ Update go-skynet/go-llama.cpp (#1130)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-10-04 16:54:20 +02:00
renovate[bot]
e029cc66bc fix(deps): update module github.com/rs/zerolog to v1.31.0 (#1102)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/rs/zerolog](https://togithub.com/rs/zerolog) | require |
minor | `v1.30.0` -> `v1.31.0` |

---

### Release Notes

<details>
<summary>rs/zerolog (github.com/rs/zerolog)</summary>

###
[`v1.31.0`](https://togithub.com/rs/zerolog/compare/v1.30.0...v1.31.0)

[Compare
Source](https://togithub.com/rs/zerolog/compare/v1.30.0...v1.31.0)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzYuOTcuMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-02 18:14:33 +02:00
James Braza
e34b5f0119 Cleaning up examples/ models and starter .env files (#1124)
Closes https://github.com/go-skynet/LocalAI/issues/1066 and
https://github.com/go-skynet/LocalAI/issues/1065

Standardizes all `examples/`:
- Models in one place (other than `rwkv`, which was one-offy)
- Env files as `.env.example` with `cp`
    - Also standardizes comments and links docs
2023-10-02 18:14:10 +02:00
renovate[bot]
c223364816 fix(deps): update module github.com/sashabaranov/go-openai to v1.15.4 (#1122)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | patch | `v1.15.3` -> `v1.15.4` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.4`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.4)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.3...v1.15.4)

#### What's Changed

- added delete fine tune model endpoint by
[@&#8203;BrendanMartin](https://togithub.com/BrendanMartin) in
[https://github.com/sashabaranov/go-openai/pull/497](https://togithub.com/sashabaranov/go-openai/pull/497)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.3...v1.15.4

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-01 19:45:26 +02:00
renovate[bot]
74fd5844ca fix(deps): update module github.com/shirou/gopsutil/v3 to v3.23.9 (#1120)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/shirou/gopsutil/v3](https://togithub.com/shirou/gopsutil)
| require | patch | `v3.23.8` -> `v3.23.9` |

---

### Release Notes

<details>
<summary>shirou/gopsutil (github.com/shirou/gopsutil/v3)</summary>

###
[`v3.23.9`](https://togithub.com/shirou/gopsutil/compare/v3.23.8...v3.23.9)

[Compare
Source](https://togithub.com/shirou/gopsutil/compare/v3.23.8...v3.23.9)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-10-01 09:18:39 +00:00
renovate[bot]
4ebc86df84 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 79f9587 (#1085)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `d84f03c` -> `79f9587` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzcuMC4zIiwidGFyZ2V0QnJhbmNoIjoibWFzdGVyIn0=-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-30 14:15:54 +02:00
renovate[bot]
8cd03eff58 fix(deps): update github.com/tmc/langchaingo digest to e16b777 (#1101)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `2c309cf` -> `e16b777` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzcuMC4zIiwidGFyZ2V0QnJhbmNoIjoibWFzdGVyIn0=-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-30 08:13:13 +02:00
LocalAI [bot]
46660a16a0 ⬆️ Update go-skynet/go-llama.cpp (#1106)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-29 23:55:12 +00:00
renovate[bot]
27b097309e fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to 6711bdd (#1079)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `e86c637` -> `6711bdd` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuMTA3LjIiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-29 19:18:04 +02:00
renovate[bot]
d0fa1f8e94 fix(deps): update module github.com/onsi/gomega to v1.28.0 (#1113)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/onsi/gomega](https://togithub.com/onsi/gomega) | require |
minor | `v1.27.10` -> `v1.28.0` |

---

### Release Notes

<details>
<summary>onsi/gomega (github.com/onsi/gomega)</summary>

### [`v1.28.0`](https://togithub.com/onsi/gomega/releases/tag/v1.28.0)

[Compare
Source](https://togithub.com/onsi/gomega/compare/v1.27.10...v1.28.0)

#### 1.28.0

##### Features

- Add VerifyHost handler to ghttp
([#&#8203;698](https://togithub.com/onsi/gomega/issues/698))
\[[`0b03b36`](https://togithub.com/onsi/gomega/commit/0b03b36)]

##### Fixes

- Read Body for Newer Responses in HaveHTTPBodyMatcher
([#&#8203;686](https://togithub.com/onsi/gomega/issues/686))
\[[`18d6673`](https://togithub.com/onsi/gomega/commit/18d6673)]

##### Maintenance

- Bump github.com/onsi/ginkgo/v2 from 2.11.0 to 2.12.0
([#&#8203;693](https://togithub.com/onsi/gomega/issues/693))
\[[`55a33f3`](https://togithub.com/onsi/gomega/commit/55a33f3)]
- Typo in matchers.go
([#&#8203;691](https://togithub.com/onsi/gomega/issues/691))
\[[`de68e8f`](https://togithub.com/onsi/gomega/commit/de68e8f)]
- Bump commonmarker from 0.23.9 to 0.23.10 in /docs
([#&#8203;690](https://togithub.com/onsi/gomega/issues/690))
\[[`ab17f5e`](https://togithub.com/onsi/gomega/commit/ab17f5e)]
- chore: update test matrix for Go 1.21
([#&#8203;689](https://togithub.com/onsi/gomega/issues/689))
\[[`5069017`](https://togithub.com/onsi/gomega/commit/5069017)]
- Bump golang.org/x/net from 0.12.0 to 0.14.0
([#&#8203;688](https://togithub.com/onsi/gomega/issues/688))
\[[`babe25f`](https://togithub.com/onsi/gomega/commit/babe25f)]

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYXN0ZXIifQ==-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-29 19:14:36 +02:00
65a
55e38fea0e feat(llama.cpp): enable ROCm/HIPBLAS support (#1100)
**Description**

This PR fixes lack of HIPBLAS support in LocalAI.

**Notes for Reviewers**
This PR builds on https://github.com/go-skynet/go-llama.cpp/pull/235 to
enable ROCm/HIPBLAS support for gguf models running under llama.cpp
backend (not the stable ggml one). It can be enabled by using
BUILD_TYPE=hipblas. This was tested on a gfx1100 card, but should work
for gfx900,gfx1030 and other cards. Card support can be set with
AMDGPU_TARGETS environment variable.

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 
<!--
Thank you for contributing to LocalAI! 

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`). 
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

---------

Signed-off-by: 65a <65a@63bit.net>
2023-09-28 21:42:20 +02:00
renovate[bot]
274ace2898 fix(deps): update github.com/tmc/langchaingo digest to 2c309cf (#1097)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `9c8845b` -> `2c309cf` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzYuOTcuMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-24 14:55:17 +02:00
Aisuko
a8cc3709c6 Add the CONTRIBUTING.md (#1098)
**Description**

This PR is related to  #105 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 
<!--
Thank you for contributing to LocalAI! 

Contributing Conventions
-------------------------

The draft above helps to give a quick overview of your PR.

Remember to remove this comment and to at least:

1. Include descriptive PR titles with [<component-name>] prepended. We
use [conventional
commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`). 
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If
your PR gets announced, and you'd like a mention, we'll gladly shout you
out!

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.

If no one reviews your PR within a few days, please @-mention @mudler.
-->

---------

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Aisuko <urakiny@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-24 14:54:55 +02:00
Ettore Di Giacinto
a28ab18987 feat(vllm): Allow to set quantization (#1094)
This particularly useful to set AWQ

**Description**

Follow up of #1015 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-22 15:52:38 +02:00
lunamidori5
048b81373d Requested Changes from GPT4ALL to Luna-AI-Llama2 (#1092)
**Description**

This PR fixes #na

**Notes for Reviewers**
n/a

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: lunamidori5 <118759930+lunamidori5@users.noreply.github.com>
2023-09-22 11:22:17 +02:00
renovate[bot]
aea1d62ae6 fix(deps): update module google.golang.org/grpc to v1.58.2 (#1090)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [google.golang.org/grpc](https://togithub.com/grpc/grpc-go) | require
| patch | `v1.58.1` -> `v1.58.2` |

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.58.2`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.2):
Release 1.58.2

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.58.1...v1.58.2)

### Bug Fixes

-   balancer/weighted_round_robin: fix ticker leak on update

A new ticker is created every time there is an update of addresses or
configuration, but was not properly stopped. This change stops the
ticker when it is no longer needed.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi45Ny4xIiwidXBkYXRlZEluVmVyIjoiMzYuOTcuMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-22 08:44:45 +02:00
Ettore Di Giacinto
601e54000d fix(llama.cpp): update, run go mod tidy (#1088)
**Description**

This PR supersedes #1086

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-22 00:45:02 +02:00
ci-robbot [bot]
7bdf707dd3 ⬆️ Update go-skynet/go-llama.cpp (#1084)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-20 19:48:38 +02:00
Ettore Di Giacinto
4a7e7e9fdb fix(vall-e-x): copy vall-e-x next to the local-ai binary in the container image (#1082)
**Description**

This PR fixes vall-e-x in the container image

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 21:30:51 +02:00
Ettore Di Giacinto
bdf3f95346 feat(python-grpc): allow to set max workers with PYTHON_GRPC_MAX_WORKERS (#1081)
**Description**

this allows to customize the maximum number of grpc workers for python
backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 21:30:39 +02:00
Ettore Di Giacinto
453e9c5da9 fix(vllm): set default top_p with vllm (#1078)
**Description**

This PR fixes vllm when called with a request with an empty top_p

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-19 18:10:23 +02:00
Ettore Di Giacinto
3a69bd3ef5 Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-19 11:23:20 +02:00
renovate[bot]
a69c0f765e fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to e86c637 (#1059)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `cf4eb53` -> `e86c637` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 17:10:23 +02:00
renovate[bot]
97d1367764 fix(deps): update github.com/go-skynet/go-llama.cpp digest to b471eb7 (#1050)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `cc8a123` -> `b471eb7` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 17:09:51 +02:00
renovate[bot]
880e21288e fix(deps): update module github.com/valyala/fasthttp to v1.50.0 (#1060)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/valyala/fasthttp](https://togithub.com/valyala/fasthttp) |
require | minor | `v1.49.0` -> `v1.50.0` |

---

### Release Notes

<details>
<summary>valyala/fasthttp (github.com/valyala/fasthttp)</summary>

###
[`v1.50.0`](https://togithub.com/valyala/fasthttp/releases/tag/v1.50.0)

[Compare
Source](https://togithub.com/valyala/fasthttp/compare/v1.49.0...v1.50.0)

- [`8cc5539`](https://togithub.com/valyala/fasthttp/commit/8cc5539) Fix
various request timeout issues (Erik Dubbelboer)
- [`34e7da1`](https://togithub.com/valyala/fasthttp/commit/34e7da1)
Allow connection close for custom streams
([#&#8203;1603](https://togithub.com/valyala/fasthttp/issues/1603))
(Armin Becher)
- [`8236f8d`](https://togithub.com/valyala/fasthttp/commit/8236f8d)
fasthttpproxy: fix doc examples (Oleksandr Redko)
- [`4ec5c5a`](https://togithub.com/valyala/fasthttp/commit/4ec5c5a)
docs: fix typos in comments and tests (Oleksandr Redko)
- [`9aa666e`](https://togithub.com/valyala/fasthttp/commit/9aa666e)
Enable gocritic linter; fix lint issues
([#&#8203;1612](https://togithub.com/valyala/fasthttp/issues/1612))
(Oleksandr Redko)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-18 16:43:24 +02:00
James Braza
2ba9762255 Cleaned up chatbot-ui READMEs (#1075)
This PR cleans up the `chatbot-ui`/`-manual` examples:
- Fixes `Dockerfile` vs `docker-compose` confusion
- Makes it clear where to view the web UI in `## Run` sections

---------

Signed-off-by: James Braza <jamesbraza@gmail.com>
2023-09-18 16:43:06 +02:00
renovate[bot]
30f120ee6a fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.2 (#1049)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | patch | `v2.49.1` -> `v2.49.2` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.2`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.2)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.49.1...v2.49.2)

#### 🧹 Updates

- Middleware/logger: Enabling color changes padding for some fields
[#&#8203;2604](https://togithub.com/gofiber/fiber/issues/2604)
([#&#8203;2616](https://togithub.com/gofiber/fiber/issues/2616))
- Bump actions/checkout from 3 to 4
([#&#8203;2618](https://togithub.com/gofiber/fiber/issues/2618))
- Bump golang.org/x/sys from 0.11.0 to 0.12.0
([#&#8203;2617](https://togithub.com/gofiber/fiber/issues/2617))

#### 🐛 Fixes

-   Vulnerability in Ctx.IsFromLocal()

#### 📚 Documentation

- Replaced double quotes with backticks in all route parameter strings
([#&#8203;2591](https://togithub.com/gofiber/fiber/issues/2591))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.49.1...v2.49.2

Thank you [@&#8203;11-aryan](https://togithub.com/11-aryan) and
[@&#8203;AKARSHITJOSHI](https://togithub.com/AKARSHITJOSHI) for making
this update possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-17 08:39:06 +02:00
renovate[bot]
28a36e20aa fix(deps): update module google.golang.org/grpc to v1.58.1 (#1020)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [google.golang.org/grpc](https://togithub.com/grpc/grpc-go) | require
| minor | `v1.57.0` -> `v1.58.1` |

---

### Release Notes

<details>
<summary>grpc/grpc-go (google.golang.org/grpc)</summary>

### [`v1.58.1`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.1):
Release 1.58.1

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.58.0...v1.58.1)

### Bug Fixes

- grpc: fix a bug that was decrementing active RPC count too early for
streaming RPCs; leading to channel moving to IDLE even though it had
open streams
- grpc: fix a bug where transports were not being closed upon channel
entering IDLE

### [`v1.58.0`](https://togithub.com/grpc/grpc-go/releases/tag/v1.58.0):
Release 1.58.0

[Compare
Source](https://togithub.com/grpc/grpc-go/compare/v1.57.0...v1.58.0)

### API Changes

See [#&#8203;6472](https://togithub.com/grpc/grpc-go/issues/6472) for
details about these changes.

- balancer: add `StateListener` to `NewSubConnOptions` for `SubConn`
state updates and deprecate `Balancer.UpdateSubConnState`
([#&#8203;6481](https://togithub.com/grpc/grpc-go/issues/6481))
    -   `UpdateSubConnState` will be deleted in the future.
- balancer: add `SubConn.Shutdown` and deprecate
`Balancer.RemoveSubConn`
([#&#8203;6493](https://togithub.com/grpc/grpc-go/issues/6493))
    -   `RemoveSubConn` will be deleted in the future.
- resolver: remove deprecated `AddressType`
([#&#8203;6451](https://togithub.com/grpc/grpc-go/issues/6451))
- This was previously used as a signal to enable the "grpclb" load
balancing policy, and to pass LB addresses to the policy. Instead,
`balancer/grpclb/state.Set()` should be used to add these addresses to
the name resolver's output. The built-in "dns" name resolver already
does this.
- resolver: add new field `Endpoints` to `State` and deprecate
`Addresses`
([#&#8203;6471](https://togithub.com/grpc/grpc-go/issues/6471))
    -   `Addresses` will be deleted in the future.

### New Features

- balancer/leastrequest: Add experimental support for least request LB
policy and least request configured as a custom xDS policy
([#&#8203;6510](https://togithub.com/grpc/grpc-go/issues/6510),
[#&#8203;6517](https://togithub.com/grpc/grpc-go/issues/6517))
    -   Set `GRPC_EXPERIMENTAL_ENABLE_LEAST_REQUEST=true` to enable
- stats: Add an RPC event for blocking caused by the LB policy's picker
([#&#8203;6422](https://togithub.com/grpc/grpc-go/issues/6422))

### Bug Fixes

- clusterresolver: fix deadlock when dns resolver responds inline with
update or error at build time
([#&#8203;6563](https://togithub.com/grpc/grpc-go/issues/6563))
- grpc: fix a bug where the channel could erroneously report
`TRANSIENT_FAILURE` when actually moving to `IDLE`
([#&#8203;6497](https://togithub.com/grpc/grpc-go/issues/6497))
- balancergroup: do not cache closed sub-balancers by default; affects
`rls`, `weightedtarget` and `clustermanager` LB policies
([#&#8203;6523](https://togithub.com/grpc/grpc-go/issues/6523))
- client: fix a bug that prevented detection of RPC status in
trailers-only RPC responses when using `ClientStream.Header()`, and
prevented retry of the RPC
([#&#8203;6557](https://togithub.com/grpc/grpc-go/issues/6557))

### Performance Improvements

- client & server: Add experimental `[With]SharedWriteBuffer` to improve
performance by reducing allocations when sending RPC messages. (Disabled
by default.)
([#&#8203;6309](https://togithub.com/grpc/grpc-go/issues/6309))
- Special Thanks:
[@&#8203;s-matyukevich](https://togithub.com/s-matyukevich)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-17 08:38:52 +02:00
ci-robbot [bot]
a8fb4d23f8 ⬆️ Update go-skynet/go-llama.cpp (#1062)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-17 08:38:28 +02:00
Manohar Joshi
f37a4ec9c8 1038 - Streamlit bot with LocalAI (#1072)
**Description**

This PR fixes #1038

Added Streamlit example and also updated readme for examples.


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [X] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-17 08:33:23 +02:00
Ettore Di Giacinto
31ed13094b Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-09-16 23:00:42 +02:00
Ettore Di Giacinto
8ccf5b2044 feat(speculative-sampling): allow to specify a draft model in the model config (#1052)
**Description**

This PR fixes #1013.

It adds `draft_model` and `n_draft` to the model YAML config in order to
load models with speculative sampling. This should be compatible as well
with grammars.

example:

```yaml
backend: llama                                                                                                                                                                   
context_size: 1024                                                                                                                                                                        
name: my-model-name
parameters:
  model: foo-bar
n_draft: 16                                                                                                                                                                      
draft_model: model-name
```

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-14 17:44:16 +02:00
renovate[bot]
247d85b523 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to cf4eb53 (#1047)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `f0735ef` -> `cf4eb53` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-14 10:41:07 +02:00
renovate[bot]
54688db994 chore(deps): update docker/metadata-action action to v5 (#1045)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [docker/metadata-action](https://togithub.com/docker/metadata-action)
| action | major | `v4` -> `v5` |

---

### Release Notes

<details>
<summary>docker/metadata-action (docker/metadata-action)</summary>

### [`v5`](https://togithub.com/docker/metadata-action/compare/v4...v5)

[Compare
Source](https://togithub.com/docker/metadata-action/compare/v4...v5)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-14 10:40:51 +02:00
ci-robbot [bot]
8590f5a599 ⬆️ Update go-skynet/go-llama.cpp (#1048)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-14 10:40:36 +02:00
renovate[bot]
289d51c049 fix(deps): update github.com/go-skynet/go-llama.cpp digest to cc8a123 (#1041)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `4145bd5` -> `cc8a123` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 10:49:40 +00:00
renovate[bot]
813eaa867c chore(deps): update docker/login-action action to v3 (#1040)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [docker/login-action](https://togithub.com/docker/login-action) |
action | major | `v2` -> `v3` |

---

### Release Notes

<details>
<summary>docker/login-action (docker/login-action)</summary>

### [`v3`](https://togithub.com/docker/login-action/compare/v2...v3)

[Compare
Source](https://togithub.com/docker/login-action/compare/v2...v3)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:50 +02:00
renovate[bot]
abffb16292 chore(deps): update docker/build-push-action action to v5 (#1039)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[docker/build-push-action](https://togithub.com/docker/build-push-action)
| action | major | `v4` -> `v5` |

---

### Release Notes

<details>
<summary>docker/build-push-action (docker/build-push-action)</summary>

###
[`v5`](https://togithub.com/docker/build-push-action/compare/v4...v5)

[Compare
Source](https://togithub.com/docker/build-push-action/compare/v4...v5)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:28 +02:00
renovate[bot]
50e439f633 fix(deps): update module github.com/sashabaranov/go-openai to v1.15.3 (#1035)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | patch | `v1.15.2` -> `v1.15.3` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.3`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.3)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.2...v1.15.3)

#### What's Changed

- Chore Support base64 embedding format by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/485](https://togithub.com/sashabaranov/go-openai/pull/485)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.2...v1.15.3

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:17:11 +02:00
renovate[bot]
25eb1415df fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to f0735ef (#1034)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `b6e38d6` -> `f0735ef` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-13 09:16:52 +02:00
ci-robbot [bot]
0b28220f2b ⬆️ Update go-skynet/go-llama.cpp (#1043)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-13 09:16:33 +02:00
renovate[bot]
5661740990 fix(deps): update github.com/tmc/langchaingo digest to 9c8845b (#1029)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `c85d396` -> `9c8845b` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-11 09:43:11 +02:00
ci-robbot [bot]
255c31bddf ⬆️ Update go-skynet/go-llama.cpp (#1027)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-11 09:42:54 +02:00
Ettore Di Giacinto
7888fefeea docs: Update README 2023-09-10 09:21:47 +02:00
renovate[bot]
0937835802 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 4145bd5 (#1025)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `05dc4b6` -> `4145bd5` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-10 09:19:03 +02:00
renovate[bot]
ea806b37ac fix(deps): update module github.com/sashabaranov/go-openai to v1.15.2 (#1022)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | patch | `v1.15.1` -> `v1.15.2` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.2`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.2)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.15.1...v1.15.2)

#### What's Changed

- Update OpenAPI file return struct by
[@&#8203;NullpointerW](https://togithub.com/NullpointerW) in
[https://github.com/sashabaranov/go-openai/pull/486](https://togithub.com/sashabaranov/go-openai/pull/486)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15.1...v1.15.2

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi44My4wIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-10 09:18:28 +02:00
Ettore Di Giacinto
d6614f3149 feat(vllm): Initial vllm backend implementation (#1026)
Related to: https://github.com/go-skynet/LocalAI/issues/1015
2023-09-10 09:17:55 +02:00
Ettore Di Giacinto
9a50a39848 doc(README): update 2023-09-09 19:28:07 +02:00
Ettore Di Giacinto
2793e8f327 doc(citation): Add citation block
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-09 19:17:19 +02:00
Ettore Di Giacinto
c0bb5c4bf6 feat(vllm): Initial vllm backend implementation
Related to: https://github.com/go-skynet/LocalAI/issues/1015

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-09 17:03:23 +02:00
Ettore Di Giacinto
cc74fc93b4 feat(llama.cpp): update (#1024)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-08 18:38:22 +02:00
renovate[bot]
44b39195d6 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 05dc4b6 (#1004)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `d8c8547` -> `05dc4b6` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuODMuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-08 12:39:17 +02:00
Robert Deaton
2454110d81 Update README to reflect changes in Continue's config file (#1014)
**Description**

OpenAIServerInfo no longer exists, and api_base has been moved up.
Changes were made here
8967e2d53f (diff-98e147eaa7c9936befdddabb16c72447fbf8ad2df6b680c5176c24813169858e)

Signed-off-by: Robert Deaton <rdeaton@platipy.org>
2023-09-07 16:29:07 +02:00
Ettore Di Giacinto
ee59e7d45f fix(vall-e-x): make audiopath relative to models (#1012)
**Description**

This PR fixes #

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-05 19:33:36 +02:00
Ettore Di Giacinto
605c319157 feat(diffusers): don't set seed in params and respect device (#1010)
**Description**

Follow up of #998 - respect the device used to load the model and do not
specify a seed in the parameters, but rather just configure the
generator as described in
https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-04 19:38:38 +02:00
Ettore Di Giacinto
dc307a1cc0 feat: add vall-e-x (#1007)
**Description**

This PR fixes #985 

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-09-04 19:25:23 +02:00
quoing
e7981152b2 [query_data example] max_chunk_overlap in PromptHelper must be in 0..1 range (#1000)
**Description**

Simple fix, percentage value is expected to be float in range 0..1

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-04 19:12:53 +02:00
ci-robbot [bot]
b3eb5c860b ⬆️ Update go-skynet/go-llama.cpp (#1005)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-04 19:11:41 +02:00
Bo-Yi Wu
1c2f7409e3 chore(deps): remove unused package (#1003)
**Description**

Just remove Golang unused package and update the format in Makefile

Signed-off-by: appleboy <appleboy.tw@gmail.com>
2023-09-04 19:11:28 +02:00
renovate[bot]
57d41a3f94 fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.1 (#1001)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | patch | `v2.49.0` -> `v2.49.1` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.1`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.1)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.49.0...v2.49.1)

#### 🧹 Updates

- Bump github.com/valyala/fasthttp from 1.48.0 to 1.49.0
([#&#8203;2615](https://togithub.com/gofiber/fiber/issues/2615))

#### 🐛 Fixes

- Rollback changes to go.mod file
([#&#8203;2614](https://togithub.com/gofiber/fiber/issues/2614))

#### 📚 Documentation

- Add Polish translation - README_pl.md
([#&#8203;2613](https://togithub.com/gofiber/fiber/issues/2613))
- Update README_ko.md
([#&#8203;2605](https://togithub.com/gofiber/fiber/issues/2605))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.49.0...v2.49.1

Thank you [@&#8203;KompocikDot](https://togithub.com/KompocikDot),
[@&#8203;LimJiAn](https://togithub.com/LimJiAn) and
[@&#8203;gaby](https://togithub.com/gaby) for making this update
possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNzguOCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-04 19:11:09 +02:00
Max Cohen
f9d2bd24eb Allow to manually set the seed for the SD pipeline (#998)
**Description**

Enable setting the seed for the stable diffusion pipeline. This is done
through an additional `seed` parameter in the request, such as:

```bash
curl http://localhost:8080/v1/images/generations \
    -H "Content-Type: application/json" \
    -d '{"model": "stablediffusion", "prompt": "prompt", "n": 1, "step": 51, "size": "512x512", "seed": 3}'
```

**Notes for Reviewers**
When the `seed` parameter is not sent, `request.seed` defaults to `0`,
making it difficult to detect an actual seed of `0`. Is there a way to
change the default to `-1` for instance ?

**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [x] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-09-04 19:10:55 +02:00
ci-robbot [bot]
0e7e8eec53 ⬆️ Update go-skynet/go-llama.cpp (#1002)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-03 10:00:01 +02:00
renovate[bot]
9a30a246d8 fix(deps): update github.com/go-skynet/go-llama.cpp digest to d8c8547 (#997)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `c5622a8` -> `d8c8547` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 12:31:12 +00:00
ci-robbot [bot]
c332499252 ⬆️ Update go-skynet/go-llama.cpp (#996)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-09-02 09:54:50 +02:00
Dave
005f289632 feat: Model Gallery Endpoint Refactor / Mutable Galleries Endpoints (#991)
refactor for model gallery endpoints - bundle up resources into a
struct, make galleries mutable with some crud endpoints. This is
groundwork required for making efficient use of the new scraper - while
that PR isn't _quite_ ready yet, the goal is to have more, individually
smaller gallery files. Therefore, rather than requiring a full localai
service restart, these new endpoints have been added to make life
easier.

- Adds endpoints to add, list and remove model galleries at runtime
- Adds these endpoints to the Insomnia config
- Minor fix: loading file urls follows symbolic links now
2023-09-02 09:00:44 +02:00
renovate[bot]
3d7553317f fix(deps): update github.com/go-skynet/go-llama.cpp digest to c5622a8 (#992)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `bf3f946` -> `c5622a8` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 08:58:16 +02:00
renovate[bot]
8e4f6b2ee5 fix(deps): update github.com/nomic-ai/gpt4all/gpt4all-bindings/golang digest to b6e38d6 (#988)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/nomic-ai/gpt4all/gpt4all-bindings/golang](https://togithub.com/nomic-ai/gpt4all)
| require | digest | `27a8b02` -> `b6e38d6` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-02 08:56:43 +02:00
renovate[bot]
d5cad7d3ae fix(deps): update module github.com/shirou/gopsutil/v3 to v3.23.8 (#989)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/shirou/gopsutil/v3](https://togithub.com/shirou/gopsutil)
| require | patch | `v3.23.7` -> `v3.23.8` |

---

### Release Notes

<details>
<summary>shirou/gopsutil (github.com/shirou/gopsutil/v3)</summary>

###
[`v3.23.8`](https://togithub.com/shirou/gopsutil/releases/tag/v3.23.8)

[Compare
Source](https://togithub.com/shirou/gopsutil/compare/v3.23.7...v3.23.8)

<!-- Release notes generated using configuration in .github/release.yml
at v3.23.8 -->

#### What's Changed

[#&#8203;1514](https://togithub.com/shirou/gopsutil/issues/1514)
improves `Processes()` performance 6% or more. Thank you
[@&#8203;atoulme](https://togithub.com/atoulme) !

##### cpu

- Enable setting of vendor and related information for all Power
versions by [@&#8203;kishen-v](https://togithub.com/kishen-v) in
[https://github.com/shirou/gopsutil/pull/1495](https://togithub.com/shirou/gopsutil/pull/1495)
- chore: change CIRCLECI environment variable to CI. by
[@&#8203;shirou](https://togithub.com/shirou) in
[https://github.com/shirou/gopsutil/pull/1518](https://togithub.com/shirou/gopsutil/pull/1518)

##### disk

- fix: fixed windows disk package leaks by
[@&#8203;ozanh](https://togithub.com/ozanh) in
[https://github.com/shirou/gopsutil/pull/1501](https://togithub.com/shirou/gopsutil/pull/1501)
- fix IOCounters() SerialNumber enumeration by
[@&#8203;gdvalle](https://togithub.com/gdvalle) in
[https://github.com/shirou/gopsutil/pull/1508](https://togithub.com/shirou/gopsutil/pull/1508)

##### host

- \[host]\[linux]: remove double quote from lsb release info by
[@&#8203;shirou](https://togithub.com/shirou) in
[https://github.com/shirou/gopsutil/pull/1504](https://togithub.com/shirou/gopsutil/pull/1504)

##### mem

- mem: linux: fix vmstat field names by
[@&#8203;chouquette](https://togithub.com/chouquette) in
[https://github.com/shirou/gopsutil/pull/1498](https://togithub.com/shirou/gopsutil/pull/1498)

##### process

- Fix Processes() calls with many cores by
[@&#8203;atoulme](https://togithub.com/atoulme) in
[https://github.com/shirou/gopsutil/pull/1514](https://togithub.com/shirou/gopsutil/pull/1514)

#### New Contributors

- [@&#8203;kishen-v](https://togithub.com/kishen-v) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1495](https://togithub.com/shirou/gopsutil/pull/1495)
- [@&#8203;chouquette](https://togithub.com/chouquette) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1498](https://togithub.com/shirou/gopsutil/pull/1498)
- [@&#8203;ozanh](https://togithub.com/ozanh) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1501](https://togithub.com/shirou/gopsutil/pull/1501)
- [@&#8203;gdvalle](https://togithub.com/gdvalle) made their first
contribution in
[https://github.com/shirou/gopsutil/pull/1508](https://togithub.com/shirou/gopsutil/pull/1508)

**Full Changelog**:
https://github.com/shirou/gopsutil/compare/v3.23.7...v3.23.8

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-09-01 13:22:53 -04:00
Jirubizu
355e9d4fb5 [API] expose all the jobs via /models/jobs endpoint (#983)
**Description**

This PR fixes #


**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Co-authored-by: Jirubizu <jirubizu@jirubizu.cc>
2023-08-31 15:03:03 +00:00
renovate[bot]
629185e10a fix(deps): update module github.com/sashabaranov/go-openai to v1.15.1 (#984)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/sashabaranov/go-openai](https://togithub.com/sashabaranov/go-openai)
| require | minor | `v1.14.2` -> `v1.15.1` |

---

### Release Notes

<details>
<summary>sashabaranov/go-openai
(github.com/sashabaranov/go-openai)</summary>

###
[`v1.15.1`](https://togithub.com/sashabaranov/go-openai/releases/tag/v1.15.1)

[Compare
Source](https://togithub.com/sashabaranov/go-openai/compare/v1.14.2...v1.15.1)

#### What's Changed

- Chore Deprecate legacy fine tunes API by
[@&#8203;henomis](https://togithub.com/henomis) in
[https://github.com/sashabaranov/go-openai/pull/484](https://togithub.com/sashabaranov/go-openai/pull/484)

**Full Changelog**:
https://github.com/sashabaranov/go-openai/compare/v1.15...v1.15.1

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-31 14:35:13 +00:00
Samuel Maynard
deeef5fc24 fix(utf8): prevent multi-byte utf8 characters from being mangled (#981)
**Description**

This PR fixes #677 using [suggested
solution](https://github.com/go-skynet/LocalAI/issues/677#issuecomment-1695939097)
from @yantoz

before:
```
❯ curl -N http://localhost:57541/v1/completions -H "Content-Type: application/json" -d '{
     "model": "ggml-model-q4_0.bin",
     "prompt": "",
     "max_tokens": 32,
     "temperature": 0.7,
     "stream": true
   }'
data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"\ufffd"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" |"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":" I"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"text":"m"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

now:
```
❯ curl -N http://localhost:57541/v1/completions -H Content-Type: application/json -d {
   "model": "ggml-model-q4_0.bin",
   "prompt": "",
   "max_tokens": 32,
   "temperature": 0.7,
   "stream": true
 }
data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"😂"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":" "}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"|"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":" "}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"I"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"'"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

data: {"object":"text_completion","model":"ggml-model-q4_0.bin","choices":[{"index":0,"text":"m"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
```

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [X] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->

Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-08-30 23:56:59 +00:00
renovate[bot]
b905c07650 fix(deps): update github.com/go-skynet/go-llama.cpp digest to bf3f946 (#979)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `9072315` -> `bf3f946` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4xIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-30 23:02:19 +02:00
Ettore Di Giacinto
1ff30034e8 fix(deps): update go-llama.cpp (#980)
**Description**

This PR bumps llama.cpp (adding support to gguf v2) and changes the
default test model

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-30 23:01:55 +02:00
renovate[bot]
c64b59c80c fix(deps): update module github.com/valyala/fasthttp to v1.49.0 (#971)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/valyala/fasthttp](https://togithub.com/valyala/fasthttp) |
require | minor | `v1.48.0` -> `v1.49.0` |

---

### Release Notes

<details>
<summary>valyala/fasthttp (github.com/valyala/fasthttp)</summary>

###
[`v1.49.0`](https://togithub.com/valyala/fasthttp/releases/tag/v1.49.0)

[Compare
Source](https://togithub.com/valyala/fasthttp/compare/v1.48.0...v1.49.0)

- [`0e99e64`](https://togithub.com/valyala/fasthttp/commit/0e99e64)
Update golangci-lint and gosec
([#&#8203;1609](https://togithub.com/valyala/fasthttp/issues/1609))
(Erik Dubbelboer)
- [`6aea1e0`](https://togithub.com/valyala/fasthttp/commit/6aea1e0) fix
round2\_32, split round2 tests because they depend on sizeof int at
compile time
([#&#8203;1607](https://togithub.com/valyala/fasthttp/issues/1607))
(Duncan Overbruck)
- [`4b0e6c7`](https://togithub.com/valyala/fasthttp/commit/4b0e6c7)
Update ErrNoMultipartForm (Erik Dubbelboer)
- [`727021a`](https://togithub.com/valyala/fasthttp/commit/727021a)
Update security policy (Erik Dubbelboer)
- [`54fdc7a`](https://togithub.com/valyala/fasthttp/commit/54fdc7a)
Abstracts the RoundTripper interface and provides a default implement
([#&#8203;1602](https://togithub.com/valyala/fasthttp/issues/1602))
(Tim)
- [`e181af1`](https://togithub.com/valyala/fasthttp/commit/e181af1)
fasthttpproxy support ipv6
([#&#8203;1597](https://togithub.com/valyala/fasthttp/issues/1597))
(Pluto)
- [`6eb2249`](https://togithub.com/valyala/fasthttp/commit/6eb2249)
fix:fasthttp server with tlsConfig
([#&#8203;1595](https://togithub.com/valyala/fasthttp/issues/1595))
(Zhang Xiaopei)
- [`1c85d43`](https://togithub.com/valyala/fasthttp/commit/1c85d43) Fix
round2 (Erik Dubbelboer)
- [`064124e`](https://togithub.com/valyala/fasthttp/commit/064124e)
Avoid nolint:errcheck in header tests
([#&#8203;1589](https://togithub.com/valyala/fasthttp/issues/1589))
(Oleksandr Redko)
- [`0d0bbfe`](https://togithub.com/valyala/fasthttp/commit/0d0bbfe) Auto
add 'Vary' header after compression
([#&#8203;1585](https://togithub.com/valyala/fasthttp/issues/1585))
(AutumnSun)
- [`d229959`](https://togithub.com/valyala/fasthttp/commit/d229959)
Remove unnecessary indent blocks
([#&#8203;1586](https://togithub.com/valyala/fasthttp/issues/1586))
(Oleksandr Redko)
- [`6b68042`](https://togithub.com/valyala/fasthttp/commit/6b68042) Use
timeout in TCPDialer to resolveTCPAddrs
([#&#8203;1582](https://togithub.com/valyala/fasthttp/issues/1582))
(un000)

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42OC4wIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-29 22:16:10 +02:00
renovate[bot]
9a869bbaf6 fix(deps): update github.com/tmc/langchaingo digest to c85d396 (#962)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/tmc/langchaingo](https://togithub.com/tmc/langchaingo) |
require | digest | `1e2a401` -> `c85d396` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNjguMSIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-29 22:15:43 +02:00
renovate[bot]
fe1b54b713 fix(deps): update module github.com/gofiber/fiber/v2 to v2.49.0 (#966)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [github.com/gofiber/fiber/v2](https://togithub.com/gofiber/fiber) |
require | minor | `v2.48.0` -> `v2.49.0` |

---

### Release Notes

<details>
<summary>gofiber/fiber (github.com/gofiber/fiber/v2)</summary>

### [`v2.49.0`](https://togithub.com/gofiber/fiber/releases/tag/v2.49.0)

[Compare
Source](https://togithub.com/gofiber/fiber/compare/v2.48.0...v2.49.0)

####  Breaking Changes

- Add config to enable splitting by comma in parsers
([#&#8203;2560](https://togithub.com/gofiber/fiber/issues/2560))
    https://docs.gofiber.io/api/fiber#config

> EnableSplittingOnParsers splits the query/body/header parameters by
comma when it's true (default: false).
>
> For example, you can use it to parse multiple values from a query
parameter like this:
> /api?foo=bar,baz == foo\[]=bar\&foo\[]=baz

#### 🚀 New

- Add custom data property to favicon middleware config
([#&#8203;2579](https://togithub.com/gofiber/fiber/issues/2579))
    https://docs.gofiber.io/api/middleware/favicon#config

> This allows the user to use //go:embed flags to load favicon data
during build-time, and supply it to the middleware instead of reading
the file every time the application starts.

#### 🧹 Updates

- Middleware/logger: Latency match gin-gonic/gin formatter
([#&#8203;2569](https://togithub.com/gofiber/fiber/issues/2569))
- Middleware/filesystem: Refactor: use `errors.Is` instead of
`os.IsNotExist`
([#&#8203;2558](https://togithub.com/gofiber/fiber/issues/2558))
- Use Global vars instead of local vars for isLocalHost
([#&#8203;2595](https://togithub.com/gofiber/fiber/issues/2595))
- Remove redundant nil check
([#&#8203;2584](https://togithub.com/gofiber/fiber/issues/2584))
- Bump github.com/mattn/go-runewidth from 0.0.14 to 0.0.15
([#&#8203;2551](https://togithub.com/gofiber/fiber/issues/2551))
- Bump github.com/google/uuid from 1.3.0 to 1.3.1
([#&#8203;2592](https://togithub.com/gofiber/fiber/issues/2592))
- Bump golang.org/x/sys from 0.10.0 to 0.11.0
([#&#8203;2563](https://togithub.com/gofiber/fiber/issues/2563))
- Add go 1.21 to ci and readmes
([#&#8203;2588](https://togithub.com/gofiber/fiber/issues/2588))

#### 🐛 Fixes

- Middleware/logger: Default latency output format
([#&#8203;2580](https://togithub.com/gofiber/fiber/issues/2580))
- Decompress request body when multi Content-Encoding sent on request
headers ([#&#8203;2555](https://togithub.com/gofiber/fiber/issues/2555))

#### 📚 Documentation

- Fix wrong JSON docs
([#&#8203;2554](https://togithub.com/gofiber/fiber/issues/2554))
- Update io/ioutil package to io package
([#&#8203;2589](https://togithub.com/gofiber/fiber/issues/2589))
- Replace EG flag with the proper and smaller SVG
([#&#8203;2585](https://togithub.com/gofiber/fiber/issues/2585))
- Added Egyptian Arabic readme file
([#&#8203;2565](https://togithub.com/gofiber/fiber/issues/2565))
- Translate README to Portuguese
([#&#8203;2567](https://togithub.com/gofiber/fiber/issues/2567))
- Improve \*fiber.Client section
([#&#8203;2553](https://togithub.com/gofiber/fiber/issues/2553))
- Improved the config section of the middleware readme´s
([#&#8203;2552](https://togithub.com/gofiber/fiber/issues/2552))
- Added documentation about ctx Fresh
([#&#8203;2549](https://togithub.com/gofiber/fiber/issues/2549))
- Update intro.md
([#&#8203;2550](https://togithub.com/gofiber/fiber/issues/2550))
- Fixed link to slim template engine
([#&#8203;2547](https://togithub.com/gofiber/fiber/issues/2547))

**Full Changelog**:
https://github.com/gofiber/fiber/compare/v2.48.0...v2.49.0

Thank you [@&#8203;Jictyvoo](https://togithub.com/Jictyvoo),
[@&#8203;Juneezee](https://togithub.com/Juneezee),
[@&#8203;Kirari04](https://togithub.com/Kirari04),
[@&#8203;LimJiAn](https://togithub.com/LimJiAn),
[@&#8203;PassTheMayo](https://togithub.com/PassTheMayo),
[@&#8203;andersonmiranda-com](https://togithub.com/andersonmiranda-com),
[@&#8203;bigpreshy](https://togithub.com/bigpreshy),
[@&#8203;efectn](https://togithub.com/efectn),
[@&#8203;renanbastos93](https://togithub.com/renanbastos93),
[@&#8203;scandar](https://togithub.com/scandar),
[@&#8203;sixcolors](https://togithub.com/sixcolors) and
[@&#8203;stefanb](https://togithub.com/stefanb) for making this update
possible.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi42NC44IiwidXBkYXRlZEluVmVyIjoiMzYuNjQuOCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-28 08:24:13 +02:00
ci-robbot [bot]
cc84dfd50f ⬆️ Update go-skynet/go-llama.cpp (#968)
Bump of go-skynet/go-llama.cpp version

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: mudler <mudler@users.noreply.github.com>
2023-08-28 08:23:51 +02:00
Ettore Di Giacinto
158c7867e7 fix(diffusers): correctly check alpha (#967)
**Description**

Loras that have no alpha would crash otherwise

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2023-08-27 15:35:59 +02:00
renovate[bot]
997c39ccd5 fix(deps): update github.com/go-skynet/go-llama.cpp digest to 9072315 (#963)
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/go-skynet/go-llama.cpp](https://togithub.com/go-skynet/go-llama.cpp)
| require | digest | `bf63302` -> `9072315` |

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/go-skynet/LocalAI).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNi41Ni4wIiwidXBkYXRlZEluVmVyIjoiMzYuNTYuMCIsInRhcmdldEJyYW5jaCI6Im1hc3RlciJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
2023-08-27 10:11:45 +02:00
Ettore Di Giacinto
3bab307904 fix(llama): resolve lora adapters correctly from the model file (#964)
**Description**

we were otherwise expecting absolute paths. this make it relative to the
model file (as someone would expect)

**Notes for Reviewers**


**[Signed
commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
 

<!--
Thank you for contributing to LocalAI! 

Contributing Conventions:

1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR. 
3. Sign your commits

By following the community's contribution conventions upfront, the
review process will
be accelerated and your PR merged more quickly.
-->
2023-08-27 10:11:32 +02:00
Ettore Di Giacinto
02704e38d3 feat(diffusers): Add lora (#965)
**Description**

This PR fixes #914 

Now diffusers respects the `lora_adapter` configuration parameter.

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2023-08-27 10:11:16 +02:00
330 changed files with 42898 additions and 1512 deletions

43
.env
View File

@@ -23,6 +23,12 @@ MODELS_PATH=/models
## Enable debug mode
# DEBUG=true
## Disables COMPEL (Diffusers)
# COMPEL=0
## Enable/Disable single backend (useful if only one GPU is available)
# SINGLE_ACTIVE_BACKEND=true
## Specify a build type. Available: cublas, openblas, clblas.
## cuBLAS: This is a GPU-accelerated version of the complete standard BLAS (Basic Linear Algebra Subprograms) library. It's provided by Nvidia and is part of their CUDA toolkit.
## OpenBLAS: This is an open-source implementation of the BLAS library that aims to provide highly optimized code for various platforms. It includes support for multi-threading and can be compiled to use hardware-specific features for additional performance. OpenBLAS can run on many kinds of hardware, including CPUs from Intel, AMD, and ARM.
@@ -44,3 +50,40 @@ MODELS_PATH=/models
## Specify a default upload limit in MB (whisper)
# UPLOAD_LIMIT
## List of external GRPC backends (note on the container image this variable is already set to use extra backends available in extra/)
# EXTERNAL_GRPC_BACKENDS=my-backend:127.0.0.1:9000,my-backend2:/usr/bin/backend.py
### Advanced settings ###
### Those are not really used by LocalAI, but from components in the stack ###
##
### Preload libraries
# LD_PRELOAD=
### Huggingface cache for models
# HUGGINGFACE_HUB_CACHE=/usr/local/huggingface
### Python backends GRPC max workers
### Default number of workers for GRPC Python backends.
### This actually controls wether a backend can process multiple requests or not.
# PYTHON_GRPC_MAX_WORKERS=1
### Define the number of parallel LLAMA.cpp workers (Defaults to 1)
# LLAMACPP_PARALLEL=1
### Enable to run parallel requests
# PARALLEL_REQUESTS=true
### Watchdog settings
###
# Enables watchdog to kill backends that are inactive for too much time
# WATCHDOG_IDLE=true
#
# Enables watchdog to kill backends that are busy for too much time
# WATCHDOG_BUSY=true
#
# Time in duration format (e.g. 1h30m) after which a backend is considered idle
# WATCHDOG_IDLE_TIMEOUT=5m
#
# Time in duration format (e.g. 1h30m) after which a backend is considered busy
# WATCHDOG_BUSY_TIMEOUT=5m

View File

@@ -8,16 +8,24 @@ This PR fixes #
**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**
- [ ] Yes, I signed my commits.
<!--
Thank you for contributing to LocalAI!
Contributing Conventions:
Contributing Conventions
-------------------------
1. Include descriptive PR titles with [<component-name>] prepended.
2. Build and test your changes before submitting a PR.
The draft above helps to give a quick overview of your PR.
Remember to remove this comment and to at least:
1. Include descriptive PR titles with [<component-name>] prepended. We use [conventional commits](https://www.conventionalcommits.org/en/v1.0.0/).
2. Build and test your changes before submitting a PR (`make build`).
3. Sign your commits
4. **Tag maintainer:** for a quicker response, tag the relevant maintainer (see below).
5. **X/Twitter handle:** we announce bigger features on X/Twitter. If your PR gets announced, and you'd like a mention, we'll gladly shout you out!
By following the community's contribution conventions upfront, the review process will
be accelerated and your PR merged more quickly.
If no one reviews your PR within a few days, please @-mention @mudler.
-->

View File

@@ -12,6 +12,9 @@ jobs:
- repository: "go-skynet/go-llama.cpp"
variable: "GOLLAMA_VERSION"
branch: "master"
- repository: "ggerganov/llama.cpp"
variable: "CPPLLAMA_VERSION"
branch: "master"
- repository: "go-skynet/go-ggml-transformers.cpp"
variable: "GOGGMLTRANSFORMERS_VERSION"
branch: "master"
@@ -41,7 +44,7 @@ jobs:
branch: "master"
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Bump dependencies 🔧
run: |
bash .github/bump_deps.sh ${{ matrix.repository }} ${{ matrix.branch }} ${{ matrix.variable }}

63
.github/workflows/disabled/test-gpu.yml vendored Normal file
View File

@@ -0,0 +1,63 @@
---
name: 'GPU tests'
on:
pull_request:
push:
branches:
- master
tags:
- '*'
concurrency:
group: ci-gpu-tests-${{ github.head_ref || github.ref }}-${{ github.repository }}
cancel-in-progress: true
jobs:
ubuntu-latest:
runs-on: gpu
strategy:
matrix:
go-version: ['1.21.x']
steps:
- name: Clone
uses: actions/checkout@v4
with:
submodules: true
- name: Setup Go ${{ matrix.go-version }}
uses: actions/setup-go@v4
with:
go-version: ${{ matrix.go-version }}
# You can test your matrix by printing the current Go version
- name: Display Go version
run: go version
- name: Dependencies
run: |
sudo apt-get update
sudo DEBIAN_FRONTEND=noninteractive apt-get install -y make wget
- name: Build
run: |
if [ ! -e /run/systemd/system ]; then
sudo mkdir /run/systemd/system
fi
sudo mkdir -p /host/tests/${{ github.head_ref || github.ref }}
sudo chmod -R 777 /host/tests/${{ github.head_ref || github.ref }}
make \
TEST_DIR="/host/tests/${{ github.head_ref || github.ref }}" \
BUILD_TYPE=cublas \
prepare-e2e run-e2e-image test-e2e
- name: Release space from worker ♻
if: always()
run: |
sudo rm -rf build || true
sudo rm -rf bin || true
sudo rm -rf dist || true
sudo docker logs $(sudo docker ps -q --filter ancestor=localai-tests) > logs.txt
sudo cat logs.txt || true
sudo rm -rf logs.txt
make clean || true
make \
TEST_DIR="/host/tests/${{ github.head_ref || github.ref }}" \
teardown-e2e || true
sudo rm -rf /host/tests/${{ github.head_ref || github.ref }} || true
docker system prune -f -a --volumes || true

View File

@@ -14,128 +14,144 @@ concurrency:
cancel-in-progress: true
jobs:
docker:
extras-image-build:
uses: ./.github/workflows/image_build.yml
with:
tag-latest: ${{ matrix.tag-latest }}
tag-suffix: ${{ matrix.tag-suffix }}
ffmpeg: ${{ matrix.ffmpeg }}
image-type: ${{ matrix.image-type }}
build-type: ${{ matrix.build-type }}
cuda-major-version: ${{ matrix.cuda-major-version }}
cuda-minor-version: ${{ matrix.cuda-minor-version }}
platforms: ${{ matrix.platforms }}
runs-on: ${{ matrix.runs-on }}
secrets:
dockerUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
dockerPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
strategy:
# Pushing with all jobs in parallel
# eats the bandwidth of all the nodes
max-parallel: ${{ github.event_name != 'pull_request' && 2 || 4 }}
matrix:
include:
- build-type: ''
platforms: 'linux/amd64,linux/arm64'
#platforms: 'linux/amd64,linux/arm64'
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: ''
ffmpeg: ''
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: ''
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-ffmpeg'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: 'cublas'
cuda-major-version: 11
cuda-minor-version: 7
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda11'
ffmpeg: ''
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: 'cublas'
cuda-major-version: 12
cuda-minor-version: 1
cuda-major-version: "12"
cuda-minor-version: "1"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda12'
ffmpeg: ''
- build-type: ''
platforms: 'linux/amd64,linux/arm64'
tag-latest: 'false'
tag-suffix: '-ffmpeg'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: 'cublas'
cuda-major-version: 11
cuda-minor-version: 7
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda11-ffmpeg'
ffmpeg: 'true'
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: 'cublas'
cuda-major-version: 12
cuda-minor-version: 1
cuda-major-version: "12"
cuda-minor-version: "1"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda12-ffmpeg'
ffmpeg: 'true'
runs-on: ubuntu-latest
steps:
- name: Release space from worker
run: |
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
df -h
echo
sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
sudo apt-get remove --auto-remove android-sdk-platform-tools || true
sudo apt-get purge --auto-remove android-sdk-platform-tools || true
sudo rm -rf /usr/local/lib/android
sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
sudo rm -rf /usr/share/dotnet
sudo apt-get remove -y '^mono-.*' || true
sudo apt-get remove -y '^ghc-.*' || true
sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
sudo apt-get remove -y 'php.*' || true
sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
sudo apt-get remove -y '^google-.*' || true
sudo apt-get remove -y azure-cli || true
sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
sudo apt-get remove -y '^gfortran-.*' || true
sudo apt-get autoremove -y
sudo apt-get clean
echo
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
sudo rm -rfv build || true
df -h
- name: Checkout
uses: actions/checkout@v3
- name: Docker meta
id: meta
uses: docker/metadata-action@v4
with:
images: quay.io/go-skynet/local-ai
tags: |
type=ref,event=branch
type=semver,pattern={{raw}}
type=sha
flavor: |
latest=${{ matrix.tag-latest }}
suffix=${{ matrix.tag-suffix }}
- name: Set up QEMU
uses: docker/setup-qemu-action@master
with:
platforms: all
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@master
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v2
with:
registry: quay.io
username: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
password: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
- name: Build and push
uses: docker/build-push-action@v4
with:
builder: ${{ steps.buildx.outputs.name }}
build-args: |
BUILD_TYPE=${{ matrix.build-type }}
CUDA_MAJOR_VERSION=${{ matrix.cuda-major-version }}
CUDA_MINOR_VERSION=${{ matrix.cuda-minor-version }}
FFMPEG=${{ matrix.ffmpeg }}
context: .
file: ./Dockerfile
platforms: ${{ matrix.platforms }}
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
image-type: 'extras'
runs-on: 'arc-runner-set'
- build-type: ''
#platforms: 'linux/amd64,linux/arm64'
platforms: 'linux/amd64'
tag-latest: 'auto'
tag-suffix: ''
ffmpeg: ''
image-type: 'extras'
runs-on: 'arc-runner-set'
core-image-build:
uses: ./.github/workflows/image_build.yml
with:
tag-latest: ${{ matrix.tag-latest }}
tag-suffix: ${{ matrix.tag-suffix }}
ffmpeg: ${{ matrix.ffmpeg }}
image-type: ${{ matrix.image-type }}
build-type: ${{ matrix.build-type }}
cuda-major-version: ${{ matrix.cuda-major-version }}
cuda-minor-version: ${{ matrix.cuda-minor-version }}
platforms: ${{ matrix.platforms }}
runs-on: ${{ matrix.runs-on }}
secrets:
dockerUsername: ${{ secrets.LOCALAI_REGISTRY_USERNAME }}
dockerPassword: ${{ secrets.LOCALAI_REGISTRY_PASSWORD }}
strategy:
matrix:
include:
- build-type: ''
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-ffmpeg-core'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'ubuntu-latest'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda11-core'
ffmpeg: ''
image-type: 'core'
runs-on: 'ubuntu-latest'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "1"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda12-core'
ffmpeg: ''
image-type: 'core'
runs-on: 'ubuntu-latest'
- build-type: 'cublas'
cuda-major-version: "11"
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda11-ffmpeg-core'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'ubuntu-latest'
- build-type: 'cublas'
cuda-major-version: "12"
cuda-minor-version: "1"
platforms: 'linux/amd64'
tag-latest: 'false'
tag-suffix: '-cublas-cuda12-ffmpeg-core'
ffmpeg: 'true'
image-type: 'core'
runs-on: 'ubuntu-latest'

147
.github/workflows/image_build.yml vendored Normal file
View File

@@ -0,0 +1,147 @@
---
name: 'build container images (reusable)'
on:
workflow_call:
inputs:
build-type:
description: 'Build type'
default: ''
type: string
cuda-major-version:
description: 'CUDA major version'
default: "11"
type: string
cuda-minor-version:
description: 'CUDA minor version'
default: "7"
type: string
platforms:
description: 'Platforms'
default: ''
type: string
tag-latest:
description: 'Tag latest'
default: ''
type: string
tag-suffix:
description: 'Tag suffix'
default: ''
type: string
ffmpeg:
description: 'FFMPEG'
default: ''
type: string
image-type:
description: 'Image type'
default: ''
type: string
runs-on:
description: 'Runs on'
required: true
default: ''
type: string
secrets:
dockerUsername:
required: true
dockerPassword:
required: true
jobs:
reusable_image-build:
runs-on: ${{ inputs.runs-on }}
steps:
- name: Force Install GIT latest
run: |
sudo apt-get update \
&& sudo apt-get install -y software-properties-common \
&& sudo apt-get update \
&& sudo add-apt-repository -y ppa:git-core/ppa \
&& sudo apt-get update \
&& sudo apt-get install -y git
- name: Checkout
uses: actions/checkout@v4
# - name: Release space from worker
# run: |
# echo "Listing top largest packages"
# pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
# head -n 30 <<< "${pkgs}"
# echo
# df -h
# echo
# sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
# sudo apt-get remove --auto-remove android-sdk-platform-tools || true
# sudo apt-get purge --auto-remove android-sdk-platform-tools || true
# sudo rm -rf /usr/local/lib/android
# sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
# sudo rm -rf /usr/share/dotnet
# sudo apt-get remove -y '^mono-.*' || true
# sudo apt-get remove -y '^ghc-.*' || true
# sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
# sudo apt-get remove -y 'php.*' || true
# sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
# sudo apt-get remove -y '^google-.*' || true
# sudo apt-get remove -y azure-cli || true
# sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
# sudo apt-get remove -y '^gfortran-.*' || true
# sudo apt-get remove -y microsoft-edge-stable || true
# sudo apt-get remove -y firefox || true
# sudo apt-get remove -y powershell || true
# sudo apt-get remove -y r-base-core || true
# sudo apt-get autoremove -y
# sudo apt-get clean
# echo
# echo "Listing top largest packages"
# pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
# head -n 30 <<< "${pkgs}"
# echo
# sudo rm -rfv build || true
# df -h
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: quay.io/go-skynet/local-ai
tags: |
type=ref,event=branch
type=semver,pattern={{raw}}
type=sha
flavor: |
latest=${{ inputs.tag-latest }}
suffix=${{ inputs.tag-suffix }}
- name: Set up QEMU
uses: docker/setup-qemu-action@master
with:
platforms: all
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@master
- name: Login to DockerHub
if: github.event_name != 'pull_request'
uses: docker/login-action@v3
with:
registry: quay.io
username: ${{ secrets.dockerUsername }}
password: ${{ secrets.dockerPassword }}
- name: Build and push
uses: docker/build-push-action@v5
with:
builder: ${{ steps.buildx.outputs.name }}
build-args: |
BUILD_TYPE=${{ inputs.build-type }}
CUDA_MAJOR_VERSION=${{ inputs.cuda-major-version }}
CUDA_MINOR_VERSION=${{ inputs.cuda-minor-version }}
FFMPEG=${{ inputs.ffmpeg }}
IMAGE_TYPE=${{ inputs.image-type }}
context: .
file: ./Dockerfile
platforms: ${{ inputs.platforms }}
push: ${{ github.event_name != 'pull_request' }}
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
- name: job summary
run: |
echo "Built image: ${{ steps.meta.outputs.labels }}" >> $GITHUB_STEP_SUMMARY

View File

@@ -19,7 +19,7 @@ jobs:
runs-on: ubuntu-latest
steps:
- name: Clone
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
submodules: true
- uses: actions/setup-go@v4
@@ -29,6 +29,12 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install build-essential ffmpeg
git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc && \
cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
../.. && sudo make -j12 install
- name: Build
id: build
env:
@@ -60,18 +66,26 @@ jobs:
runs-on: macOS-latest
steps:
- name: Clone
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
submodules: true
- uses: actions/setup-go@v4
with:
go-version: '>=1.21.0'
- name: Dependencies
run: |
git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc && \
cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
../.. && make -j12 install && rm -rf grpc
- name: Build
id: build
env:
CMAKE_ARGS: "${{ matrix.defines }}"
BUILD_ID: "${{ matrix.build }}"
run: |
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
make dist
- uses: actions/upload-artifact@v3
with:

View File

@@ -14,14 +14,46 @@ concurrency:
cancel-in-progress: true
jobs:
ubuntu-latest:
tests-linux:
runs-on: ubuntu-latest
strategy:
matrix:
go-version: ['1.21.x']
steps:
- name: Release space from worker
run: |
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
df -h
echo
sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
sudo apt-get remove --auto-remove android-sdk-platform-tools || true
sudo apt-get purge --auto-remove android-sdk-platform-tools || true
sudo rm -rf /usr/local/lib/android
sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
sudo rm -rf /usr/share/dotnet
sudo apt-get remove -y '^mono-.*' || true
sudo apt-get remove -y '^ghc-.*' || true
sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
sudo apt-get remove -y 'php.*' || true
sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
sudo apt-get remove -y '^google-.*' || true
sudo apt-get remove -y azure-cli || true
sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
sudo apt-get remove -y '^gfortran-.*' || true
sudo apt-get autoremove -y
sudo apt-get clean
echo
echo "Listing top largest packages"
pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
head -n 30 <<< "${pkgs}"
echo
sudo rm -rfv build || true
df -h
- name: Clone
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
submodules: true
- name: Setup Go ${{ matrix.go-version }}
@@ -35,38 +67,42 @@ jobs:
run: |
sudo apt-get update
sudo apt-get install build-essential ffmpeg
curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \
sudo install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \
gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && \
sudo /bin/bash -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list' && \
sudo /bin/bash -c 'echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list' && \
sudo apt-get update && \
sudo apt-get install -y conda
sudo apt-get install -y ca-certificates cmake curl patch
sudo apt-get install -y libopencv-dev && sudo ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
sudo pip install -r extra/requirements.txt
sudo rm -rfv /usr/bin/conda || true
PATH=$PATH:/opt/conda/bin make -C backend/python/sentencetransformers
sudo mkdir /build && sudo chmod -R 777 /build && cd /build && \
curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v1.11.0.tar.gz" | \
tar -xzvf - && \
mkdir -p "spdlog-1.11.0/build" && \
cd "spdlog-1.11.0/build" && \
cmake .. && \
make -j8 && \
sudo cmake --install . --prefix /usr && mkdir -p "lib/Linux-$(uname -m)" && \
cd /build && \
mkdir -p "lib/Linux-$(uname -m)/piper_phonemize" && \
curl -L "https://github.com/rhasspy/piper-phonemize/releases/download/v1.0.0/libpiper_phonemize-amd64.tar.gz" | \
tar -C "lib/Linux-$(uname -m)/piper_phonemize" -xzvf - && ls -liah /build/lib/Linux-$(uname -m)/piper_phonemize/ && \
sudo cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /usr/lib/ && \
sudo ln -s /usr/lib/libpiper_phonemize.so /usr/lib/libpiper_phonemize.so.1 && \
sudo cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/include/. /usr/include/
# Pre-build piper before we start tests in order to have shared libraries in place
make sources/go-piper && \
GO_TAGS="tts" make -C sources/go-piper piper.o && \
sudo cp -rfv sources/go-piper/piper/build/pi/lib/. /usr/lib/ && \
# Pre-build stable diffusion before we install a newer version of abseil (not compatible with stablediffusion-ncn)
GO_TAGS="stablediffusion tts" GRPC_BACKENDS=backend-assets/grpc/stablediffusion make build
git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc && \
cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
../.. && sudo make -j12 install
- name: Test
run: |
ESPEAK_DATA="/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data" GO_TAGS="tts stablediffusion" make test
GO_TAGS="stablediffusion tts" make test
macOS-latest:
tests-apple:
runs-on: macOS-latest
strategy:
matrix:
go-version: ['1.21.x']
steps:
- name: Clone
uses: actions/checkout@v3
uses: actions/checkout@v4
with:
submodules: true
- name: Setup Go ${{ matrix.go-version }}
@@ -76,6 +112,14 @@ jobs:
# You can test your matrix by printing the current Go version
- name: Display Go version
run: go version
- name: Dependencies
run: |
git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc && \
cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
../.. && make -j12 install && rm -rf grpc
- name: Test
run: |
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
CMAKE_ARGS="-DLLAMA_F16C=OFF -DLLAMA_AVX512=OFF -DLLAMA_AVX2=OFF -DLLAMA_FMA=OFF" make test

11
.gitignore vendored
View File

@@ -1,15 +1,10 @@
# go-llama build artifacts
go-llama
go-llama-stable
/gpt4all
go-stable-diffusion
go-piper
/go-bert
go-ggllm
/piper
/sources/
__pycache__/
*.a
get-sources
/backend/cpp/llama/grpc-server
/backend/cpp/llama/llama.cpp
go-ggml-transformers
go-gpt2

3
.gitmodules vendored Normal file
View File

@@ -0,0 +1,3 @@
[submodule "docs/themes/hugo-theme-relearn"]
path = docs/themes/hugo-theme-relearn
url = https://github.com/McShelby/hugo-theme-relearn.git

72
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,72 @@
# Contributing to localAI
Thank you for your interest in contributing to LocalAI! We appreciate your time and effort in helping to improve our project. Before you get started, please take a moment to review these guidelines.
## Table of Contents
- [Getting Started](#getting-started)
- [Prerequisites](#prerequisites)
- [Setting up the Development Environment](#setting-up-the-development-environment)
- [Contributing](#contributing)
- [Submitting an Issue](#submitting-an-issue)
- [Creating a Pull Request (PR)](#creating-a-pull-request-pr)
- [Coding Guidelines](#coding-guidelines)
- [Testing](#testing)
- [Documentation](#documentation)
- [Community and Communication](#community-and-communication)
## Getting Started
### Prerequisites
- Golang [1.21]
- Git
- macOS/Linux
### Setting up the Development Environment and running localAI in the local environment
1. Clone the repository: `git clone https://github.com/go-skynet/LocalAI.git`
2. Navigate to the project directory: `cd LocalAI`
3. Install the required dependencies: `make prepare`
4. Run LocalAI: `make run`
## Contributing
We welcome contributions from everyone! To get started, follow these steps:
### Submitting an Issue
If you find a bug, have a feature request, or encounter any issues, please check the [issue tracker](https://github.com/go-skynet/LocalAI/issues) to see if a similar issue has already been reported. If not, feel free to [create a new issue](https://github.com/go-skynet/LocalAI/issues/new) and provide as much detail as possible.
### Creating a Pull Request (PR)
1. Fork the repository.
2. Create a new branch with a descriptive name: `git checkout -b [branch name]`
3. Make your changes and commit them.
4. Push the changes to your fork: `git push origin [branch name]`
5. Create a new pull request from your branch to the main project's `main` or `master` branch.
6. Provide a clear description of your changes in the pull request.
7. Make any requested changes during the review process.
8. Once your PR is approved, it will be merged into the main project.
## Coding Guidelines
- No specific coding guidelines at the moment. Please make sure the code can be tested. The most popular lint tools like []`golangci-lint`](https://golangci-lint.run) can help you here.
## Testing
`make test` cannot handle all the model now. Please be sure to add a test case for the new features or the part was changed.
## Documentation
- We are welcome the contribution of the documents, please open new PR in the official document repo [localai-website](https://github.com/go-skynet/localai-website)
## Community and Communication
- You can reach out via the Github issue tracker.
- Open a new discussion at [Discussion](https://github.com/go-skynet/LocalAI/discussions)
- Join the Discord channel [Discord](https://discord.gg/uJAeKSAGDy)
---

View File

@@ -1,22 +1,27 @@
ARG GO_VERSION=1.21-bullseye
ARG IMAGE_TYPE=extras
# extras or core
FROM golang:$GO_VERSION as requirements
FROM golang:$GO_VERSION as requirements-core
ARG BUILD_TYPE
ARG CUDA_MAJOR_VERSION=11
ARG CUDA_MINOR_VERSION=7
ARG SPDLOG_VERSION="1.11.0"
ARG PIPER_PHONEMIZE_VERSION='1.0.0'
ARG TARGETARCH
ARG TARGETVARIANT
ENV BUILD_TYPE=${BUILD_TYPE}
ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/extra/grpc/huggingface/huggingface.py,autogptq:/build/extra/grpc/autogptq/autogptq.py,bark:/build/extra/grpc/bark/ttsbark.py,diffusers:/build/extra/grpc/diffusers/backend_diffusers.py,exllama:/build/extra/grpc/exllama/exllama.py"
ENV EXTERNAL_GRPC_BACKENDS="huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh,petals:/build/backend/python/petals/run.sh,transformers:/build/backend/python/transformers/run.sh,sentencetransformers:/build/backend/python/sentencetransformers/run.sh,autogptq:/build/backend/python/autogptq/run.sh,bark:/build/backend/python/bark/run.sh,diffusers:/build/backend/python/diffusers/run.sh,exllama:/build/backend/python/exllama/run.sh,vall-e-x:/build/backend/python/vall-e-x/run.sh,vllm:/build/backend/python/vllm/run.sh"
ENV GALLERIES='[{"name":"model-gallery", "url":"github:go-skynet/model-gallery/index.yaml"}, {"url": "github:go-skynet/model-gallery/huggingface.yaml","name":"huggingface"}]'
ARG GO_TAGS="stablediffusion tts"
RUN apt-get update && \
apt-get install -y ca-certificates cmake curl patch pip
apt-get install -y ca-certificates curl patch pip cmake && apt-get clean
COPY --chmod=644 custom-ca-certs/* /usr/local/share/ca-certificates/
RUN update-ca-certificates
# Use the variables in subsequent instructions
RUN echo "Target Architecture: $TARGETARCH"
@@ -30,63 +35,52 @@ RUN if [ "${BUILD_TYPE}" = "cublas" ]; then \
dpkg -i cuda-keyring_1.0-1_all.deb && \
rm -f cuda-keyring_1.0-1_all.deb && \
apt-get update && \
apt-get install -y cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} \
apt-get install -y cuda-nvcc-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcublas-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusparse-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} libcusolver-dev-${CUDA_MAJOR_VERSION}-${CUDA_MINOR_VERSION} && apt-get clean \
; fi
ENV PATH /usr/local/cuda/bin:${PATH}
# Extras requirements
COPY extra/requirements.txt /build/extra/requirements.txt
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --upgrade pip
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
RUN if [ "${TARGETARCH}" = "amd64" ]; then \
pip install git+https://github.com/suno-ai/bark.git diffusers invisible_watermark transformers accelerate safetensors;\
fi
RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH}" = "amd64" ]; then \
pip install torch && pip install auto-gptq https://github.com/jllllll/exllama/releases/download/0.0.10/exllama-0.0.10+cu${CUDA_MAJOR_VERSION}${CUDA_MINOR_VERSION}-cp39-cp39-linux_x86_64.whl;\
fi
RUN pip install -r /build/extra/requirements.txt && rm -rf /build/extra/requirements.txt
# OpenBLAS requirements and stable diffusion
RUN apt-get install -y \
libopenblas-dev \
libopencv-dev \
&& apt-get clean
# Set up OpenCV
RUN ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
WORKDIR /build
# OpenBLAS requirements
RUN apt-get install -y libopenblas-dev
# Stable Diffusion requirements
RUN apt-get install -y libopencv-dev && \
ln -s /usr/include/opencv4/opencv2 /usr/include/opencv2
# piper requirements
# Use pre-compiled Piper phonemization library (includes onnxruntime)
#RUN if echo "${GO_TAGS}" | grep -q "tts"; then \
RUN test -n "$TARGETARCH" \
|| (echo 'warn: missing $TARGETARCH, either set this `ARG` manually, or run using `docker buildkit`')
RUN curl -L "https://github.com/gabime/spdlog/archive/refs/tags/v${SPDLOG_VERSION}.tar.gz" | \
tar -xzvf - && \
mkdir -p "spdlog-${SPDLOG_VERSION}/build" && \
cd "spdlog-${SPDLOG_VERSION}/build" && \
cmake .. && \
make -j8 && \
cmake --install . --prefix /usr && mkdir -p "lib/Linux-$(uname -m)" && \
cd /build && \
mkdir -p "lib/Linux-$(uname -m)/piper_phonemize" && \
curl -L "https://github.com/rhasspy/piper-phonemize/releases/download/v${PIPER_PHONEMIZE_VERSION}/libpiper_phonemize-${TARGETARCH:-$(go env GOARCH)}${TARGETVARIANT}.tar.gz" | \
tar -C "lib/Linux-$(uname -m)/piper_phonemize" -xzvf - && ls -liah /build/lib/Linux-$(uname -m)/piper_phonemize/ && \
cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/lib/. /usr/lib/ && \
ln -s /usr/lib/libpiper_phonemize.so /usr/lib/libpiper_phonemize.so.1 && \
cp -rfv /build/lib/Linux-$(uname -m)/piper_phonemize/include/. /usr/include/
# Extras requirements
FROM requirements-core as requirements-extras
RUN curl https://repo.anaconda.com/pkgs/misc/gpgkeys/anaconda.asc | gpg --dearmor > conda.gpg && \
install -o root -g root -m 644 conda.gpg /usr/share/keyrings/conda-archive-keyring.gpg && \
gpg --keyring /usr/share/keyrings/conda-archive-keyring.gpg --no-default-keyring --fingerprint 34161F5BF5EB1D4BFBBB8F0A8AEB4F8B29D82806 && \
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" > /etc/apt/sources.list.d/conda.list && \
echo "deb [arch=amd64 signed-by=/usr/share/keyrings/conda-archive-keyring.gpg] https://repo.anaconda.com/pkgs/misc/debrepo/conda stable main" | tee -a /etc/apt/sources.list.d/conda.list && \
apt-get update && \
apt-get install -y conda
ENV PATH="/root/.cargo/bin:${PATH}"
RUN pip install --upgrade pip
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
# \
# ; fi
###################################
###################################
FROM requirements as builder
FROM requirements-${IMAGE_TYPE} as builder
ARG GO_TAGS="stablediffusion tts"
ARG GRPC_BACKENDS
ARG BUILD_GRPC=true
ENV GRPC_BACKENDS=${GRPC_BACKENDS}
ENV GO_TAGS=${GO_TAGS}
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
ENV NVIDIA_REQUIRE_CUDA="cuda>=${CUDA_MAJOR_VERSION}.0"
@@ -94,28 +88,47 @@ ENV NVIDIA_VISIBLE_DEVICES=all
WORKDIR /build
COPY Makefile .
RUN make get-sources
COPY go.mod .
RUN make prepare
COPY . .
COPY .git .
RUN make prepare
RUN ESPEAK_DATA=/build/lib/Linux-$(uname -m)/piper_phonemize/lib/espeak-ng-data make build
# stablediffusion does not tolerate a newer version of abseil, build it first
RUN GRPC_BACKENDS=backend-assets/grpc/stablediffusion make build
RUN if [ "${BUILD_GRPC}" = "true" ]; then \
git clone --recurse-submodules -b v1.58.0 --depth 1 --shallow-submodules https://github.com/grpc/grpc && \
cd grpc && mkdir -p cmake/build && cd cmake/build && cmake -DgRPC_INSTALL=ON \
-DgRPC_BUILD_TESTS=OFF \
../.. && make -j12 install \
; fi
# Rebuild with defaults backends
RUN make build
RUN if [ ! -d "/build/sources/go-piper/piper/build/pi/lib/" ]; then \
mkdir -p /build/sources/go-piper/piper/build/pi/lib/ \
touch /build/sources/go-piper/piper/build/pi/lib/keep \
; fi
###################################
###################################
FROM requirements
FROM requirements-${IMAGE_TYPE}
ARG FFMPEG
ARG BUILD_TYPE
ARG TARGETARCH
ARG IMAGE_TYPE=extras
ENV BUILD_TYPE=${BUILD_TYPE}
ENV REBUILD=false
ENV HEALTHCHECK_ENDPOINT=http://localhost:8080/readyz
ARG CUDA_MAJOR_VERSION=11
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
ENV NVIDIA_REQUIRE_CUDA="cuda>=${CUDA_MAJOR_VERSION}.0"
ENV NVIDIA_VISIBLE_DEVICES=all
# Add FFmpeg
RUN if [ "${FFMPEG}" = "true" ]; then \
apt-get install -y ffmpeg \
@@ -128,12 +141,62 @@ WORKDIR /build
# see https://github.com/go-skynet/LocalAI/pull/658#discussion_r1241971626 and
# https://github.com/go-skynet/LocalAI/pull/434
COPY . .
RUN make prepare-sources
COPY --from=builder /build/sources ./sources/
COPY --from=builder /build/grpc ./grpc/
RUN make prepare-sources && cd /build/grpc/cmake/build && make install && rm -rf grpc
# Copy the binary
COPY --from=builder /build/local-ai ./
# To resolve exllama import error
RUN if [ "${BUILD_TYPE}" = "cublas" ] && [ "${TARGETARCH:-$(go env GOARCH)}" = "amd64" ]; then \
cp -rfv /usr/local/lib/python3.9/dist-packages/exllama extra/grpc/exllama/;\
# Copy shared libraries for piper
COPY --from=builder /build/sources/go-piper/piper/build/pi/lib/* /usr/lib/
# do not let stablediffusion rebuild (requires an older version of absl)
COPY --from=builder /build/backend-assets/grpc/stablediffusion ./backend-assets/grpc/stablediffusion
## Duplicated from Makefile to avoid having a big layer that's hard to push
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/autogptq \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/bark \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/diffusers \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/vllm \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/sentencetransformers \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/transformers \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/vall-e-x \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/exllama \
; fi
RUN if [ "${IMAGE_TYPE}" = "extras" ]; then \
PATH=$PATH:/opt/conda/bin make -C backend/python/petals \
; fi
# Copy VALLE-X as it's not a real "lib"
# TODO: this is wrong - we should copy the lib into the conda env path
RUN if [ -d /usr/lib/vall-e-x ]; then \
cp -rfv /usr/lib/vall-e-x/* ./ ; \
fi
# we also copy exllama libs over to resolve exllama import error
# TODO: check if this is still needed
RUN if [ -d /usr/local/lib/python3.9/dist-packages/exllama ]; then \
cp -rfv /usr/local/lib/python3.9/dist-packages/exllama backend/python/exllama/;\
fi
# Define the health check command
HEALTHCHECK --interval=1m --timeout=10m --retries=10 \
CMD curl -f $HEALTHCHECK_ENDPOINT || exit 1

10
Entitlements.plist Normal file
View File

@@ -0,0 +1,10 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.network.client</key>
<true/>
<key>com.apple.security.network.server</key>
<true/>
</dict>
</plist>

461
Makefile
View File

@@ -4,10 +4,12 @@ GOVET=$(GOCMD) vet
BINARY_NAME=local-ai
# llama.cpp versions
GOLLAMA_VERSION?=bf63302a2be787674e6ca4227a8aaeb95a8eb6b1
GOLLAMA_VERSION?=aeba71ee842819da681ea537e78846dc75949ac0
GOLLAMA_STABLE_VERSION?=50cee7712066d9e38306eccadcfbb44ea87df4b7
CPPLLAMA_VERSION?=1f5cd83275fabb43f2ae92c30033b384a3eb37b4
# gpt4all version
GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
GPT4ALL_VERSION?=27a8b020c36b0df8f8b82a252d261cda47cf44b8
@@ -26,23 +28,23 @@ WHISPER_CPP_VERSION?=85ed71aaec8e0612a84c0b67804bde75aa75a273
BERT_VERSION?=6abe312cded14042f6b7c3cd8edf082713334a4d
# go-piper version
PIPER_VERSION?=56b8a81b4760a6fbee1a82e62f007ae7e8f010a7
# go-bloomz version
BLOOMZ_VERSION?=1834e77b83faafe912ad4092ccf7f77937349e2f
PIPER_VERSION?=5a4c9e28c84bac09ab6baa9f88457d852cb46bb2
# stablediffusion version
STABLEDIFFUSION_VERSION?=d89260f598afb809279bc72aa0107b4292587632
# Go-ggllm
GOGGLLM_VERSION?=862477d16eefb0805261c19c9b0d053e3b2b684b
STABLEDIFFUSION_VERSION?=902db5f066fd137697e3b69d0fa10d4782bd2c2f
export BUILD_TYPE?=
export STABLE_BUILD_TYPE?=$(BUILD_TYPE)
export CMAKE_ARGS?=
CGO_LDFLAGS?=
CUDA_LIBPATH?=/usr/local/cuda/lib64/
GO_TAGS?=
BUILD_ID?=git
TEST_DIR=/tmp/test
RANDOM := $(shell bash -c 'echo $$RANDOM')
VERSION?=$(shell git describe --always --tags || echo "dev" )
# go tool nm ./local-ai | grep Commit
LD_FLAGS?=
@@ -50,7 +52,6 @@ override LD_FLAGS += -X "github.com/go-skynet/LocalAI/internal.Version=$(VERSION
override LD_FLAGS += -X "github.com/go-skynet/LocalAI/internal.Commit=$(shell git rev-parse HEAD)"
OPTIONAL_TARGETS?=
ESPEAK_DATA?=
OS := $(shell uname -s)
ARCH := $(shell uname -m)
@@ -60,13 +61,26 @@ WHITE := $(shell tput -Txterm setaf 7)
CYAN := $(shell tput -Txterm setaf 6)
RESET := $(shell tput -Txterm sgr0)
# Default Docker bridge IP
E2E_BRIDGE_IP?=172.17.0.1
ifndef UNAME_S
UNAME_S := $(shell uname -s)
endif
# workaround for rwkv.cpp
ifeq ($(UNAME_S),Darwin)
CGO_LDFLAGS += -lcblas -framework Accelerate
ifeq ($(OS),Darwin)
CGO_LDFLAGS += -lcblas -framework Accelerate
ifeq ($(OSX_SIGNING_IDENTITY),)
OSX_SIGNING_IDENTITY := $(shell security find-identity -v -p codesigning | grep '"' | head -n 1 | sed -E 's/.*"(.*)"/\1/')
endif
# on OSX, if BUILD_TYPE is blank, we should default to use Metal
ifeq ($(BUILD_TYPE),)
BUILD_TYPE=metal
# disable metal if on Darwin and any other value is explicitly passed.
else ifneq ($(BUILD_TYPE),metal)
CMAKE_ARGS+=-DLLAMA_METAL=OFF
endif
endif
ifeq ($(BUILD_TYPE),openblas)
@@ -78,6 +92,18 @@ ifeq ($(BUILD_TYPE),cublas)
export LLAMA_CUBLAS=1
endif
ifeq ($(BUILD_TYPE),hipblas)
ROCM_HOME ?= /opt/rocm
export CXX=$(ROCM_HOME)/llvm/bin/clang++
export CC=$(ROCM_HOME)/llvm/bin/clang
# llama-ggml has no hipblas support, so override it here.
export STABLE_BUILD_TYPE=
GPU_TARGETS ?= gfx900,gfx90a,gfx1030,gfx1031,gfx1100
AMDGPU_TARGETS ?= "$(GPU_TARGETS)"
CMAKE_ARGS+=-DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS="$(AMDGPU_TARGETS)" -DGPU_TARGETS="$(GPU_TARGETS)"
CGO_LDFLAGS += -O3 --rtlib=compiler-rt -unwindlib=libgcc -lhipblas -lrocblas --hip-link
endif
ifeq ($(BUILD_TYPE),metal)
CGO_LDFLAGS+=-framework Foundation -framework Metal -framework MetalKit -framework MetalPerformanceShaders
export LLAMA_METAL=1
@@ -100,169 +126,149 @@ endif
ifeq ($(findstring tts,$(GO_TAGS)),tts)
# OPTIONAL_TARGETS+=go-piper/libpiper_binding.a
# OPTIONAL_TARGETS+=backend-assets/espeak-ng-data
PIPER_CGO_CXXFLAGS+=-I$(shell pwd)/sources/go-piper/piper/src/cpp -I$(shell pwd)/sources/go-piper/piper/build/fi/include -I$(shell pwd)/sources/go-piper/piper/build/pi/include -I$(shell pwd)/sources/go-piper/piper/build/si/include
PIPER_CGO_LDFLAGS+=-L$(shell pwd)/sources/go-piper/piper/build/fi/lib -L$(shell pwd)/sources/go-piper/piper/build/pi/lib -L$(shell pwd)/sources/go-piper/piper/build/si/lib -lfmt -lspdlog -lucd
OPTIONAL_GRPC+=backend-assets/grpc/piper
endif
GRPC_BACKENDS?=backend-assets/grpc/langchain-huggingface backend-assets/grpc/falcon-ggml backend-assets/grpc/bert-embeddings backend-assets/grpc/falcon backend-assets/grpc/bloomz backend-assets/grpc/llama backend-assets/grpc/llama-stable backend-assets/grpc/gpt4all backend-assets/grpc/dolly backend-assets/grpc/gpt2 backend-assets/grpc/gptj backend-assets/grpc/gptneox backend-assets/grpc/mpt backend-assets/grpc/replit backend-assets/grpc/starcoder backend-assets/grpc/rwkv backend-assets/grpc/whisper $(OPTIONAL_GRPC)
ALL_GRPC_BACKENDS=backend-assets/grpc/langchain-huggingface backend-assets/grpc/falcon-ggml backend-assets/grpc/bert-embeddings backend-assets/grpc/llama backend-assets/grpc/llama-cpp backend-assets/grpc/llama-ggml backend-assets/grpc/gpt4all backend-assets/grpc/dolly backend-assets/grpc/gpt2 backend-assets/grpc/gptj backend-assets/grpc/gptneox backend-assets/grpc/mpt backend-assets/grpc/replit backend-assets/grpc/starcoder backend-assets/grpc/rwkv backend-assets/grpc/whisper $(OPTIONAL_GRPC)
GRPC_BACKENDS?=$(ALL_GRPC_BACKENDS) $(OPTIONAL_GRPC)
# If empty, then we build all
ifeq ($(GRPC_BACKENDS),)
GRPC_BACKENDS=$(ALL_GRPC_BACKENDS)
endif
.PHONY: all test build vendor
all: help
## GPT4ALL
gpt4all:
git clone --recurse-submodules $(GPT4ALL_REPO) gpt4all
cd gpt4all && git checkout -b build $(GPT4ALL_VERSION) && git submodule update --init --recursive --depth 1
## go-ggllm
go-ggllm:
git clone --recurse-submodules https://github.com/mudler/go-ggllm.cpp go-ggllm
cd go-ggllm && git checkout -b build $(GOGGLLM_VERSION) && git submodule update --init --recursive --depth 1
go-ggllm/libggllm.a: go-ggllm
$(MAKE) -C go-ggllm BUILD_TYPE=$(BUILD_TYPE) libggllm.a
sources/gpt4all:
git clone --recurse-submodules $(GPT4ALL_REPO) sources/gpt4all
cd sources/gpt4all && git checkout -b build $(GPT4ALL_VERSION) && git submodule update --init --recursive --depth 1
## go-piper
go-piper:
git clone --recurse-submodules https://github.com/mudler/go-piper go-piper
cd go-piper && git checkout -b build $(PIPER_VERSION) && git submodule update --init --recursive --depth 1
sources/go-piper:
git clone --recurse-submodules https://github.com/mudler/go-piper sources/go-piper
cd sources/go-piper && git checkout -b build $(PIPER_VERSION) && git submodule update --init --recursive --depth 1
## BERT embeddings
go-bert:
git clone --recurse-submodules https://github.com/go-skynet/go-bert.cpp go-bert
cd go-bert && git checkout -b build $(BERT_VERSION) && git submodule update --init --recursive --depth 1
sources/go-bert:
git clone --recurse-submodules https://github.com/go-skynet/go-bert.cpp sources/go-bert
cd sources/go-bert && git checkout -b build $(BERT_VERSION) && git submodule update --init --recursive --depth 1
## stable diffusion
go-stable-diffusion:
git clone --recurse-submodules https://github.com/mudler/go-stable-diffusion go-stable-diffusion
cd go-stable-diffusion && git checkout -b build $(STABLEDIFFUSION_VERSION) && git submodule update --init --recursive --depth 1
sources/go-stable-diffusion:
git clone --recurse-submodules https://github.com/mudler/go-stable-diffusion sources/go-stable-diffusion
cd sources/go-stable-diffusion && git checkout -b build $(STABLEDIFFUSION_VERSION) && git submodule update --init --recursive --depth 1
go-stable-diffusion/libstablediffusion.a:
$(MAKE) -C go-stable-diffusion libstablediffusion.a
sources/go-stable-diffusion/libstablediffusion.a:
$(MAKE) -C sources/go-stable-diffusion libstablediffusion.a
## RWKV
go-rwkv:
git clone --recurse-submodules $(RWKV_REPO) go-rwkv
cd go-rwkv && git checkout -b build $(RWKV_VERSION) && git submodule update --init --recursive --depth 1
sources/go-rwkv:
git clone --recurse-submodules $(RWKV_REPO) sources/go-rwkv
cd sources/go-rwkv && git checkout -b build $(RWKV_VERSION) && git submodule update --init --recursive --depth 1
go-rwkv/librwkv.a: go-rwkv
cd go-rwkv && cd rwkv.cpp && cmake . -DRWKV_BUILD_SHARED_LIBRARY=OFF && cmake --build . && cp librwkv.a ..
sources/go-rwkv/librwkv.a: sources/go-rwkv
cd sources/go-rwkv && cd rwkv.cpp && cmake . -DRWKV_BUILD_SHARED_LIBRARY=OFF && cmake --build . && cp librwkv.a ..
## bloomz
bloomz:
git clone --recurse-submodules https://github.com/go-skynet/bloomz.cpp bloomz
cd bloomz && git checkout -b build $(BLOOMZ_VERSION) && git submodule update --init --recursive --depth 1
sources/go-bert/libgobert.a: sources/go-bert
$(MAKE) -C sources/go-bert libgobert.a
bloomz/libbloomz.a: bloomz
cd bloomz && make libbloomz.a
go-bert/libgobert.a: go-bert
$(MAKE) -C go-bert libgobert.a
backend-assets/gpt4all: gpt4all/gpt4all-bindings/golang/libgpt4all.a
backend-assets/gpt4all: sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a
mkdir -p backend-assets/gpt4all
@cp gpt4all/gpt4all-bindings/golang/buildllm/*.so backend-assets/gpt4all/ || true
@cp gpt4all/gpt4all-bindings/golang/buildllm/*.dylib backend-assets/gpt4all/ || true
@cp gpt4all/gpt4all-bindings/golang/buildllm/*.dll backend-assets/gpt4all/ || true
@cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.so backend-assets/gpt4all/ || true
@cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.dylib backend-assets/gpt4all/ || true
@cp sources/gpt4all/gpt4all-bindings/golang/buildllm/*.dll backend-assets/gpt4all/ || true
backend-assets/espeak-ng-data:
backend-assets/espeak-ng-data: sources/go-piper
mkdir -p backend-assets/espeak-ng-data
ifdef ESPEAK_DATA
@cp -rf $(ESPEAK_DATA)/. backend-assets/espeak-ng-data
else
@echo "ESPEAK_DATA not set, skipping tts. Note that this will break the tts functionality."
@touch backend-assets/espeak-ng-data/keep
endif
$(MAKE) -C sources/go-piper piper.o
@cp -rf sources/go-piper/piper/build/pi/share/espeak-ng-data/. backend-assets/espeak-ng-data
gpt4all/gpt4all-bindings/golang/libgpt4all.a: gpt4all
$(MAKE) -C gpt4all/gpt4all-bindings/golang/ libgpt4all.a
sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a: sources/gpt4all
$(MAKE) -C sources/gpt4all/gpt4all-bindings/golang/ libgpt4all.a
## CEREBRAS GPT
go-ggml-transformers:
git clone --recurse-submodules https://github.com/go-skynet/go-ggml-transformers.cpp go-ggml-transformers
cd go-ggml-transformers && git checkout -b build $(GOGPT2_VERSION) && git submodule update --init --recursive --depth 1
sources/go-ggml-transformers:
git clone --recurse-submodules https://github.com/go-skynet/go-ggml-transformers.cpp sources/go-ggml-transformers
cd sources/go-ggml-transformers && git checkout -b build $(GOGPT2_VERSION) && git submodule update --init --recursive --depth 1
go-ggml-transformers/libtransformers.a: go-ggml-transformers
$(MAKE) -C go-ggml-transformers BUILD_TYPE=$(BUILD_TYPE) libtransformers.a
sources/go-ggml-transformers/libtransformers.a: sources/go-ggml-transformers
$(MAKE) -C sources/go-ggml-transformers BUILD_TYPE=$(BUILD_TYPE) libtransformers.a
whisper.cpp:
git clone https://github.com/ggerganov/whisper.cpp.git
cd whisper.cpp && git checkout -b build $(WHISPER_CPP_VERSION) && git submodule update --init --recursive --depth 1
sources/whisper.cpp:
git clone https://github.com/ggerganov/whisper.cpp.git sources/whisper.cpp
cd sources/whisper.cpp && git checkout -b build $(WHISPER_CPP_VERSION) && git submodule update --init --recursive --depth 1
whisper.cpp/libwhisper.a: whisper.cpp
cd whisper.cpp && make libwhisper.a
sources/whisper.cpp/libwhisper.a: sources/whisper.cpp
cd sources/whisper.cpp && make libwhisper.a
go-llama:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp go-llama
cd go-llama && git checkout -b build $(GOLLAMA_VERSION) && git submodule update --init --recursive --depth 1
sources/go-llama:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp sources/go-llama
cd sources/go-llama && git checkout -b build $(GOLLAMA_VERSION) && git submodule update --init --recursive --depth 1
go-llama-stable:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp go-llama-stable
cd go-llama-stable && git checkout -b build $(GOLLAMA_STABLE_VERSION) && git submodule update --init --recursive --depth 1
sources/go-llama-ggml:
git clone --recurse-submodules https://github.com/go-skynet/go-llama.cpp sources/go-llama-ggml
cd sources/go-llama-ggml && git checkout -b build $(GOLLAMA_STABLE_VERSION) && git submodule update --init --recursive --depth 1
go-llama/libbinding.a: go-llama
$(MAKE) -C go-llama BUILD_TYPE=$(BUILD_TYPE) libbinding.a
sources/go-llama/libbinding.a: sources/go-llama
$(MAKE) -C sources/go-llama BUILD_TYPE=$(BUILD_TYPE) libbinding.a
go-llama-stable/libbinding.a: go-llama-stable
$(MAKE) -C go-llama-stable BUILD_TYPE=$(BUILD_TYPE) libbinding.a
sources/go-llama-ggml/libbinding.a: sources/go-llama-ggml
$(MAKE) -C sources/go-llama-ggml BUILD_TYPE=$(STABLE_BUILD_TYPE) libbinding.a
go-piper/libpiper_binding.a:
$(MAKE) -C go-piper libpiper_binding.a example/main
sources/go-piper/libpiper_binding.a: sources/go-piper
$(MAKE) -C sources/go-piper libpiper_binding.a example/main
get-sources: go-llama go-llama-stable go-ggllm go-ggml-transformers gpt4all go-piper go-rwkv whisper.cpp go-bert bloomz go-stable-diffusion
backend/cpp/llama/llama.cpp:
$(MAKE) -C backend/cpp/llama llama.cpp
get-sources: backend/cpp/llama/llama.cpp sources/go-llama sources/go-llama-ggml sources/go-ggml-transformers sources/gpt4all sources/go-piper sources/go-rwkv sources/whisper.cpp sources/go-bert sources/go-stable-diffusion
touch $@
replace:
$(GOCMD) mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=$(shell pwd)/gpt4all/gpt4all-bindings/golang
$(GOCMD) mod edit -replace github.com/go-skynet/go-ggml-transformers.cpp=$(shell pwd)/go-ggml-transformers
$(GOCMD) mod edit -replace github.com/donomii/go-rwkv.cpp=$(shell pwd)/go-rwkv
$(GOCMD) mod edit -replace github.com/ggerganov/whisper.cpp=$(shell pwd)/whisper.cpp
$(GOCMD) mod edit -replace github.com/go-skynet/go-bert.cpp=$(shell pwd)/go-bert
$(GOCMD) mod edit -replace github.com/go-skynet/bloomz.cpp=$(shell pwd)/bloomz
$(GOCMD) mod edit -replace github.com/mudler/go-stable-diffusion=$(shell pwd)/go-stable-diffusion
$(GOCMD) mod edit -replace github.com/mudler/go-piper=$(shell pwd)/go-piper
$(GOCMD) mod edit -replace github.com/mudler/go-ggllm.cpp=$(shell pwd)/go-ggllm
$(GOCMD) mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=$(shell pwd)/sources/gpt4all/gpt4all-bindings/golang
$(GOCMD) mod edit -replace github.com/go-skynet/go-ggml-transformers.cpp=$(shell pwd)/sources/go-ggml-transformers
$(GOCMD) mod edit -replace github.com/donomii/go-rwkv.cpp=$(shell pwd)/sources/go-rwkv
$(GOCMD) mod edit -replace github.com/ggerganov/whisper.cpp=$(shell pwd)/sources/whisper.cpp
$(GOCMD) mod edit -replace github.com/go-skynet/go-bert.cpp=$(shell pwd)/sources/go-bert
$(GOCMD) mod edit -replace github.com/mudler/go-stable-diffusion=$(shell pwd)/sources/go-stable-diffusion
$(GOCMD) mod edit -replace github.com/mudler/go-piper=$(shell pwd)/sources/go-piper
prepare-sources: get-sources replace
$(GOCMD) mod download
touch $@
## GENERIC
rebuild: ## Rebuilds the project
$(GOCMD) clean -cache
$(MAKE) -C go-llama clean
$(MAKE) -C go-llama-stable clean
$(MAKE) -C gpt4all/gpt4all-bindings/golang/ clean
$(MAKE) -C go-ggml-transformers clean
$(MAKE) -C go-rwkv clean
$(MAKE) -C whisper.cpp clean
$(MAKE) -C go-stable-diffusion clean
$(MAKE) -C go-bert clean
$(MAKE) -C bloomz clean
$(MAKE) -C go-piper clean
$(MAKE) -C go-ggllm clean
$(MAKE) -C sources/go-llama clean
$(MAKE) -C sources/go-llama-ggml clean
$(MAKE) -C sources/gpt4all/gpt4all-bindings/golang/ clean
$(MAKE) -C sources/go-ggml-transformers clean
$(MAKE) -C sources/go-rwkv clean
$(MAKE) -C sources/whisper.cpp clean
$(MAKE) -C sources/go-stable-diffusion clean
$(MAKE) -C sources/go-bert clean
$(MAKE) -C sources/go-piper clean
$(MAKE) build
prepare: prepare-sources $(OPTIONAL_TARGETS)
prepare: prepare-sources $(OPTIONAL_TARGETS)
touch $@
clean: ## Remove build related file
$(GOCMD) clean -cache
rm -f prepare
rm -rf ./go-llama
rm -rf ./gpt4all
rm -rf ./go-llama-stable
rm -rf ./go-gpt2
rm -rf ./go-stable-diffusion
rm -rf ./go-ggml-transformers
rm -rf ./backend-assets
rm -rf ./go-rwkv
rm -rf ./go-bert
rm -rf ./bloomz
rm -rf ./whisper.cpp
rm -rf ./go-piper
rm -rf ./go-ggllm
rm -rf ./sources
rm -rf $(BINARY_NAME)
rm -rf release/
rm -rf ./backend/cpp/grpc/grpc_repo
rm -rf ./backend/cpp/grpc/build
rm -rf ./backend/cpp/grpc/installed_packages
$(MAKE) -C backend/cpp/llama clean
## Build:
@@ -278,6 +284,9 @@ dist: build
mkdir -p release
cp $(BINARY_NAME) release/$(BINARY_NAME)-$(BUILD_ID)-$(OS)-$(ARCH)
osx-signed: build
codesign --deep --force --sign "$(OSX_SIGNING_IDENTITY)" --entitlements "./Entitlements.plist" "./$(BINARY_NAME)"
## Run
run: prepare ## run local-ai
CGO_LDFLAGS="$(CGO_LDFLAGS)" $(GOCMD) run ./
@@ -285,12 +294,12 @@ run: prepare ## run local-ai
test-models/testmodel:
mkdir test-models
mkdir test-dir
wget https://huggingface.co/nnakasato/ggml-model-test/resolve/main/ggml-model-q4.bin -O test-models/testmodel
wget https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O test-models/whisper-en
wget https://huggingface.co/skeskinen/ggml/resolve/main/all-MiniLM-L6-v2/ggml-model-q4_0.bin -O test-models/bert
wget https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav -O test-dir/audio.wav
wget https://huggingface.co/mudler/rwkv-4-raven-1.5B-ggml/resolve/main/RWKV-4-Raven-1B5-v11-Eng99%2525-Other1%2525-20230425-ctx4096_Q4_0.bin -O test-models/rwkv
wget https://raw.githubusercontent.com/saharNooby/rwkv.cpp/5eb8f09c146ea8124633ab041d9ea0b1f1db4459/rwkv/20B_tokenizer.json -O test-models/rwkv.tokenizer.json
wget -q https://huggingface.co/nnakasato/ggml-model-test/resolve/main/ggml-model-q4.bin -O test-models/testmodel
wget -q https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.en.bin -O test-models/whisper-en
wget -q https://huggingface.co/mudler/all-MiniLM-L6-v2/resolve/main/ggml-model-q4_0.bin -O test-models/bert
wget -q https://cdn.openai.com/whisper/draft-20220913a/micro-machines.wav -O test-dir/audio.wav
wget -q https://huggingface.co/mudler/rwkv-4-raven-1.5B-ggml/resolve/main/RWKV-4-Raven-1B5-v11-Eng99%2525-Other1%2525-20230425-ctx4096_Q4_0.bin -O test-models/rwkv
wget -q https://raw.githubusercontent.com/saharNooby/rwkv.cpp/5eb8f09c146ea8124633ab041d9ea0b1f1db4459/rwkv/20B_tokenizer.json -O test-models/rwkv.tokenizer.json
cp tests/models_fixtures/* test-models
prepare-test: grpcs
@@ -301,14 +310,34 @@ test: prepare test-models/testmodel grpcs
@echo 'Running tests'
export GO_TAGS="tts stablediffusion"
$(MAKE) prepare-test
HUGGINGFACE_GRPC=$(abspath ./)/extra/grpc/huggingface/huggingface.py TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --flake-attempts 5 -v -r ./api ./pkg
HUGGINGFACE_GRPC=$(abspath ./)/backend/python/sentencetransformers/run.sh TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="!gpt4all && !llama && !llama-gguf" --flake-attempts 5 --fail-fast -v -r ./api ./pkg
$(MAKE) test-gpt4all
$(MAKE) test-llama
$(MAKE) test-llama-gguf
$(MAKE) test-tts
$(MAKE) test-stablediffusion
prepare-e2e:
mkdir -p $(TEST_DIR)
cp -rfv $(abspath ./tests/e2e-fixtures)/gpu.yaml $(TEST_DIR)/gpu.yaml
test -e $(TEST_DIR)/ggllm-test-model.bin || wget -q https://huggingface.co/TheBloke/CodeLlama-7B-Instruct-GGUF/resolve/main/codellama-7b-instruct.Q2_K.gguf -O $(TEST_DIR)/ggllm-test-model.bin
docker build --build-arg BUILD_GRPC=true --build-arg GRPC_BACKENDS="$(GRPC_BACKENDS)" --build-arg IMAGE_TYPE=core --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg CUDA_MAJOR_VERSION=11 --build-arg CUDA_MINOR_VERSION=7 --build-arg FFMPEG=true -t localai-tests .
run-e2e-image:
ls -liah $(abspath ./tests/e2e-fixtures)
docker run -p 5390:8080 -e MODELS_PATH=/models -e THREADS=1 -e DEBUG=true -d --rm -v $(TEST_DIR):/models --gpus all --name e2e-tests-$(RANDOM) localai-tests
test-e2e:
@echo 'Running e2e tests'
BUILD_TYPE=$(BUILD_TYPE) \
LOCALAI_API=http://$(E2E_BRIDGE_IP):5390/v1 \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --flake-attempts 5 -v -r ./tests/e2e
teardown-e2e:
rm -rf $(TEST_DIR) || true
docker stop $$(docker ps -q --filter ancestor=localai-tests)
test-gpt4all: prepare-test
TEST_DIR=$(abspath ./)/test-dir/ FIXTURES=$(abspath ./)/tests/fixtures CONFIG_FILE=$(abspath ./)/test-models/config.yaml MODELS_PATH=$(abspath ./)/test-models \
$(GOCMD) run github.com/onsi/ginkgo/v2/ginkgo --label-filter="gpt4all" --flake-attempts 5 -v -r ./api ./pkg
@@ -349,99 +378,141 @@ protogen: protogen-go protogen-python
protogen-go:
protoc --go_out=. --go_opt=paths=source_relative --go-grpc_out=. --go-grpc_opt=paths=source_relative \
pkg/grpc/proto/backend.proto
backend/backend.proto
protogen-python:
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/huggingface/ --grpc_python_out=extra/grpc/huggingface/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/autogptq/ --grpc_python_out=extra/grpc/autogptq/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/exllama/ --grpc_python_out=extra/grpc/exllama/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/bark/ --grpc_python_out=extra/grpc/bark/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ipkg/grpc/proto/ --python_out=extra/grpc/diffusers/ --grpc_python_out=extra/grpc/diffusers/ pkg/grpc/proto/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/sentencetransformers/ --grpc_python_out=backend/python/sentencetransformers/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/transformers/ --grpc_python_out=backend/python/transformers/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/autogptq/ --grpc_python_out=backend/python/autogptq/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/exllama/ --grpc_python_out=backend/python/exllama/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/bark/ --grpc_python_out=backend/python/bark/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/diffusers/ --grpc_python_out=backend/python/diffusers/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/vall-e-x/ --grpc_python_out=backend/python/vall-e-x/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/vllm/ --grpc_python_out=backend/python/vllm/ backend/backend.proto
python3 -m grpc_tools.protoc -Ibackend/ --python_out=backend/python/petals/ --grpc_python_out=backend/python/petals/ backend/backend.proto
## GRPC
# Note: it is duplicated in the Dockerfile
prepare-extra-conda-environments:
$(MAKE) -C backend/python/autogptq
$(MAKE) -C backend/python/bark
$(MAKE) -C backend/python/diffusers
$(MAKE) -C backend/python/vllm
$(MAKE) -C backend/python/sentencetransformers
$(MAKE) -C backend/python/transformers
$(MAKE) -C backend/python/vall-e-x
$(MAKE) -C backend/python/exllama
$(MAKE) -C backend/python/petals
backend-assets/grpc:
mkdir -p backend-assets/grpc
backend-assets/grpc/falcon: backend-assets/grpc go-ggllm/libggllm.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggllm LIBRARY_PATH=$(shell pwd)/go-ggllm \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/falcon ./cmd/grpc/falcon/
backend-assets/grpc/llama: backend-assets/grpc go-llama/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-llama LIBRARY_PATH=$(shell pwd)/go-llama \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama ./cmd/grpc/llama/
# TODO: every binary should have its own folder instead, so can have different metal implementations
backend-assets/grpc/llama: backend-assets/grpc sources/go-llama/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/sources/go-llama
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-llama LIBRARY_PATH=$(shell pwd)/sources/go-llama \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama ./backend/go/llm/llama/
# TODO: every binary should have its own folder instead, so can have different implementations
ifeq ($(BUILD_TYPE),metal)
cp go-llama/build/bin/ggml-metal.metal backend-assets/grpc/
cp backend/cpp/llama/llama.cpp/ggml-metal.metal backend-assets/grpc/
endif
backend-assets/grpc/llama-stable: backend-assets/grpc go-llama-stable/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/go-llama-stable
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-llama-stable LIBRARY_PATH=$(shell pwd)/go-llama \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama-stable ./cmd/grpc/llama-stable/
## BACKEND CPP LLAMA START
# Sets the variables in case it has to build the gRPC locally.
INSTALLED_PACKAGES=$(CURDIR)/backend/cpp/grpc/installed_packages
INSTALLED_LIB_CMAKE=$(INSTALLED_PACKAGES)/lib/cmake
ADDED_CMAKE_ARGS=-Dabsl_DIR=${INSTALLED_LIB_CMAKE}/absl \
-DProtobuf_DIR=${INSTALLED_LIB_CMAKE}/protobuf \
-Dutf8_range_DIR=${INSTALLED_LIB_CMAKE}/utf8_range \
-DgRPC_DIR=${INSTALLED_LIB_CMAKE}/grpc \
-DCMAKE_CXX_STANDARD_INCLUDE_DIRECTORIES=${INSTALLED_PACKAGES}/include
backend-assets/grpc/gpt4all: backend-assets/grpc backend-assets/gpt4all gpt4all/gpt4all-bindings/golang/libgpt4all.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(shell pwd)/gpt4all/gpt4all-bindings/golang/ \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt4all ./cmd/grpc/gpt4all/
backend/cpp/llama/grpc-server:
ifdef BUILD_GRPC_FOR_BACKEND_LLAMA
backend/cpp/grpc/script/build_grpc.sh ${INSTALLED_PACKAGES}
export _PROTOBUF_PROTOC=${INSTALLED_PACKAGES}/bin/proto && \
export _GRPC_CPP_PLUGIN_EXECUTABLE=${INSTALLED_PACKAGES}/bin/grpc_cpp_plugin && \
export PATH=${PATH}:${INSTALLED_PACKAGES}/bin && \
CMAKE_ARGS="${CMAKE_ARGS} ${ADDED_CMAKE_ARGS}" LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/llama grpc-server
else
echo "BUILD_GRPC_FOR_BACKEND_LLAMA is not defined."
LLAMA_VERSION=$(CPPLLAMA_VERSION) $(MAKE) -C backend/cpp/llama grpc-server
endif
## BACKEND CPP LLAMA END
##
backend-assets/grpc/llama-cpp: backend-assets/grpc backend/cpp/llama/grpc-server
cp -rfv backend/cpp/llama/grpc-server backend-assets/grpc/llama-cpp
# TODO: every binary should have its own folder instead, so can have different metal implementations
ifeq ($(BUILD_TYPE),metal)
cp backend/cpp/llama/llama.cpp/build/bin/ggml-metal.metal backend-assets/grpc/
endif
backend-assets/grpc/dolly: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/dolly ./cmd/grpc/dolly/
backend-assets/grpc/llama-ggml: backend-assets/grpc sources/go-llama-ggml/libbinding.a
$(GOCMD) mod edit -replace github.com/go-skynet/go-llama.cpp=$(shell pwd)/sources/go-llama-ggml
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-llama-ggml LIBRARY_PATH=$(shell pwd)/sources/go-llama-ggml \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/llama-ggml ./backend/go/llm/llama-ggml/
backend-assets/grpc/gpt2: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt2 ./cmd/grpc/gpt2/
backend-assets/grpc/gpt4all: backend-assets/grpc backend-assets/gpt4all sources/gpt4all/gpt4all-bindings/golang/libgpt4all.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/gpt4all/gpt4all-bindings/golang/ LIBRARY_PATH=$(shell pwd)/sources/gpt4all/gpt4all-bindings/golang/ \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt4all ./backend/go/llm/gpt4all/
backend-assets/grpc/gptj: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gptj ./cmd/grpc/gptj/
backend-assets/grpc/dolly: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/dolly ./backend/go/llm/dolly/
backend-assets/grpc/gptneox: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gptneox ./cmd/grpc/gptneox/
backend-assets/grpc/gpt2: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gpt2 ./backend/go/llm/gpt2/
backend-assets/grpc/mpt: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/mpt ./cmd/grpc/mpt/
backend-assets/grpc/gptj: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gptj ./backend/go/llm/gptj/
backend-assets/grpc/replit: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/replit ./cmd/grpc/replit/
backend-assets/grpc/gptneox: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/gptneox ./backend/go/llm/gptneox/
backend-assets/grpc/falcon-ggml: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/falcon-ggml ./cmd/grpc/falcon-ggml/
backend-assets/grpc/mpt: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/mpt ./backend/go/llm/mpt/
backend-assets/grpc/starcoder: backend-assets/grpc go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/starcoder ./cmd/grpc/starcoder/
backend-assets/grpc/replit: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/replit ./backend/go/llm/replit/
backend-assets/grpc/rwkv: backend-assets/grpc go-rwkv/librwkv.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-rwkv LIBRARY_PATH=$(shell pwd)/go-rwkv \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/rwkv ./cmd/grpc/rwkv/
backend-assets/grpc/falcon-ggml: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/falcon-ggml ./backend/go/llm/falcon-ggml/
backend-assets/grpc/bloomz: backend-assets/grpc bloomz/libbloomz.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/bloomz LIBRARY_PATH=$(shell pwd)/bloomz \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/bloomz ./cmd/grpc/bloomz/
backend-assets/grpc/starcoder: backend-assets/grpc sources/go-ggml-transformers/libtransformers.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-ggml-transformers LIBRARY_PATH=$(shell pwd)/sources/go-ggml-transformers \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/starcoder ./backend/go/llm/starcoder/
backend-assets/grpc/bert-embeddings: backend-assets/grpc go-bert/libgobert.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-bert LIBRARY_PATH=$(shell pwd)/go-bert \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/bert-embeddings ./cmd/grpc/bert-embeddings/
backend-assets/grpc/rwkv: backend-assets/grpc sources/go-rwkv/librwkv.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-rwkv LIBRARY_PATH=$(shell pwd)/sources/go-rwkv \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/rwkv ./backend/go/llm/rwkv
backend-assets/grpc/bert-embeddings: backend-assets/grpc sources/go-bert/libgobert.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-bert LIBRARY_PATH=$(shell pwd)/sources/go-bert \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/bert-embeddings ./backend/go/llm/bert/
backend-assets/grpc/langchain-huggingface: backend-assets/grpc
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/langchain-huggingface ./cmd/grpc/langchain-huggingface/
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/langchain-huggingface ./backend/go/llm/langchain/
backend-assets/grpc/stablediffusion: backend-assets/grpc go-stable-diffusion/libstablediffusion.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/go-stable-diffusion/ LIBRARY_PATH=$(shell pwd)/go-stable-diffusion/ \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/stablediffusion ./cmd/grpc/stablediffusion/
backend-assets/grpc/stablediffusion: backend-assets/grpc
if [ ! -f backend-assets/grpc/stablediffusion ]; then \
$(MAKE) sources/go-stable-diffusion/libstablediffusion.a; \
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/go-stable-diffusion/ LIBRARY_PATH=$(shell pwd)/sources/go-stable-diffusion/ \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/stablediffusion ./backend/go/image/; \
fi
backend-assets/grpc/piper: backend-assets/grpc backend-assets/espeak-ng-data go-piper/libpiper_binding.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" LIBRARY_PATH=$(shell pwd)/go-piper \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/piper ./cmd/grpc/piper/
backend-assets/grpc/piper: backend-assets/grpc backend-assets/espeak-ng-data sources/go-piper/libpiper_binding.a
CGO_CXXFLAGS="$(PIPER_CGO_CXXFLAGS)" CGO_LDFLAGS="$(PIPER_CGO_LDFLAGS)" LIBRARY_PATH=$(shell pwd)/sources/go-piper \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/piper ./backend/go/tts/
backend-assets/grpc/whisper: backend-assets/grpc whisper.cpp/libwhisper.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/whisper.cpp LIBRARY_PATH=$(shell pwd)/whisper.cpp \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/whisper ./cmd/grpc/whisper/
backend-assets/grpc/whisper: backend-assets/grpc sources/whisper.cpp/libwhisper.a
CGO_LDFLAGS="$(CGO_LDFLAGS)" C_INCLUDE_PATH=$(shell pwd)/sources/whisper.cpp LIBRARY_PATH=$(shell pwd)/sources/whisper.cpp \
$(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o backend-assets/grpc/whisper ./backend/go/transcribe/
grpcs: prepare $(GRPC_BACKENDS)

View File

@@ -22,14 +22,13 @@
> :bulb: Get help - [❓FAQ](https://localai.io/faq/) [💭Discussions](https://github.com/go-skynet/LocalAI/discussions) [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) [:book: Documentation website](https://localai.io/)
>
> [💻 Quickstart](https://localai.io/basics/getting_started/) [📣 News](https://localai.io/basics/news/) [ 🛫 Examples ](https://github.com/go-skynet/LocalAI/tree/master/examples/) [ 🖼️ Models ](https://localai.io/models/)
> [💻 Quickstart](https://localai.io/basics/getting_started/) [📣 News](https://localai.io/basics/news/) [ 🛫 Examples ](https://github.com/go-skynet/LocalAI/tree/master/examples/) [ 🖼️ Models ](https://localai.io/models/) [ 🚀 Roadmap ](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap)
[![tests](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml)[![Build and Release](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml)[![build container images](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml)[![Bump dependencies](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml)[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/localai)](https://artifacthub.io/packages/search?repo=localai)
**LocalAI** is a drop-in replacement REST API that's compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families that are compatible with the ggml format. Does not require GPU.
**LocalAI** is the free, Open Source OpenAI alternative. LocalAI act as a drop-in replacement REST API thats compatible with OpenAI API specifications for local inferencing. It allows you to run LLMs, generate images, audio (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families. Does not require GPU.
<p align="center"><b>Follow LocalAI </b></p>
<p align="center"><b>Follow LocalAI </b></p>
<p align="center">
<a href="https://twitter.com/LocalAI_API" target="blank">
@@ -39,7 +38,7 @@
<img src="https://dcbadge.vercel.app/api/server/uJAeKSAGDy?style=flat-square&theme=default-inverted" alt="Join LocalAI Discord Community"/>
</a>
<p align="center"><b>Connect with the Creator </b></p>
<p align="center"><b>Connect with the Creator </b></p>
<p align="center">
<a href="https://twitter.com/mudler_it" target="blank">
@@ -50,7 +49,7 @@
</a>
</p>
<p align="center"><b>Share LocalAI Repository</b></p>
<p align="center"><b>Share LocalAI Repository</b></p>
<p align="center">
@@ -64,6 +63,22 @@
</p>
## 💻 [Getting started](https://localai.io/basics/getting_started/index.html)
## 🔥🔥 Hot topics / Roadmap
[Roadmap](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap)
🆕 New! [LLM finetuning guide](https://localai.io/advanced/fine-tuning/)
Hot topics (looking for contributors):
- Backends v2: https://github.com/mudler/LocalAI/issues/1126
- Improving UX v2: https://github.com/mudler/LocalAI/issues/1373
If you want to help and contribute, issues up for grabs: https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3A%22up+for+grabs%22
<hr>
In a nutshell:
@@ -79,8 +94,6 @@ LocalAI was created by [Ettore Di Giacinto](https://github.com/mudler/) and is a
Note that this started just as a [fun weekend project](https://localai.io/#backstory) in order to try to create the necessary pieces for a full AI assistant like `ChatGPT`: the community is growing fast and we are working hard to make it better and more stable. If you want to help, please consider contributing (see below)!
## 🔥🔥 [Hot topics / Roadmap](https://localai.io/#-hot-topics--roadmap)
## 🚀 [Features](https://localai.io/features/)
- 📖 [Text generation with GPTs](https://localai.io/features/text-generation/) (`llama.cpp`, `gpt4all.cpp`, ... [:book: and more](https://localai.io/model-compatibility/index.html#model-compatibility-table))
@@ -91,8 +104,32 @@ Note that this started just as a [fun weekend project](https://localai.io/#backs
- 🧠 [Embeddings generation for vector databases](https://localai.io/features/embeddings/)
- ✍️ [Constrained grammars](https://localai.io/features/constrained_grammars/)
- 🖼️ [Download Models directly from Huggingface ](https://localai.io/models/)
- 🆕 [Vision API](https://localai.io/features/gpt-vision/)
## 💻 Usage
Check out the [Getting started](https://localai.io/basics/getting_started/index.html) section in our documentation.
### 🔗 Community and integrations
WebUIs:
- https://github.com/Jirubizu/localai-admin
- https://github.com/go-skynet/LocalAI-frontend
Model galleries
- https://github.com/go-skynet/model-gallery
Other:
- Helm chart https://github.com/go-skynet/helm-charts
### 🔗 Resources
- 🆕 New! [LLM finetuning guide](https://localai.io/advanced/fine-tuning/)
- [How to build locally](https://localai.io/basics/build/index.html)
- [How to install in Kubernetes](https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes)
- [Projects integrating LocalAI](https://localai.io/integrations/)
- [How tos section](https://localai.io/howtos/) (curated by our community)
## :book: 🎥 [Media, Blogs, Social](https://localai.io/basics/news/#media-blogs-social)
- [Create a slackbot for teams and OSS projects that answer to documentation](https://mudler.pm/posts/smart-slackbot-for-teams/)
@@ -100,19 +137,19 @@ Note that this started just as a [fun weekend project](https://localai.io/#backs
- [Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All](https://mudler.pm/posts/localai-question-answering/)
- [Tutorial to use k8sgpt with LocalAI](https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65)
## 💻 Usage
## Citation
Check out the [Getting started](https://localai.io/basics/getting_started/index.html) section in our documentation.
If you utilize this repository, data in a downstream project, please consider citing it with:
### 💡 Example: Use GPT4ALL-J model
See the [documentation](https://localai.io/basics/getting_started/#example-use-gpt4all-j-model-with-docker-compose)
### 🔗 Resources
- [How to build locally](https://localai.io/basics/build/index.html)
- [How to install in Kubernetes](https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes)
- [Projects integrating LocalAI](https://localai.io/integrations/)
```
@misc{localai,
author = {Ettore Di Giacinto},
title = {LocalAI: The free, Open source OpenAI alternative},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/go-skynet/LocalAI}},
```
## ❤️ Sponsors
@@ -127,6 +164,11 @@ A huge thank you to our generous sponsors who support this project:
| [Spectro Cloud](https://www.spectrocloud.com/) |
| Spectro Cloud kindly supports LocalAI by providing GPU and computing resources to run tests on lamdalabs! |
And a huge shout-out to individuals sponsoring the project by donating hardware or backing the project.
- [Sponsor list](https://github.com/sponsors/mudler)
- JDAM00 (donating HW for the CI)
## 🌟 Star history
[![LocalAI Star history Chart](https://api.star-history.com/svg?repos=go-skynet/LocalAI&type=Date)](https://star-history.com/#go-skynet/LocalAI&Date)

View File

@@ -11,7 +11,9 @@ import (
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/internal"
"github.com/go-skynet/LocalAI/metrics"
"github.com/go-skynet/LocalAI/pkg/assets"
"github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/gofiber/fiber/v2/middleware/cors"
@@ -78,6 +80,22 @@ func Startup(opts ...options.AppOption) (*options.Option, *config.ConfigLoader,
options.Loader.StopAllGRPC()
}()
if options.WatchDog {
wd := model.NewWatchDog(
options.Loader,
options.WatchDogBusyTimeout,
options.WatchDogIdleTimeout,
options.WatchDogBusy,
options.WatchDogIdle)
options.Loader.SetWatchDog(wd)
go wd.Run()
go func() {
<-options.Context.Done()
log.Debug().Msgf("Context canceled, shutting down")
wd.Shutdown()
}()
}
return options, cl, nil
}
@@ -120,6 +138,9 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
// Default middleware config
app.Use(recover.New())
if options.Metrics != nil {
app.Use(metrics.APIMiddleware(options.Metrics))
}
// Auth middleware checking if API key is valid. If no API key is set, no auth is required.
auth := func(c *fiber.Ctx) error {
@@ -168,9 +189,14 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
}{Version: internal.PrintableVersion()})
})
app.Post("/models/apply", auth, localai.ApplyModelGalleryEndpoint(options.Loader.ModelPath, cl, galleryService.C, options.Galleries))
app.Get("/models/available", auth, localai.ListModelFromGalleryEndpoint(options.Galleries, options.Loader.ModelPath))
app.Get("/models/jobs/:uuid", auth, localai.GetOpStatusEndpoint(galleryService))
modelGalleryService := localai.CreateModelGalleryService(options.Galleries, options.Loader.ModelPath, galleryService)
app.Post("/models/apply", auth, modelGalleryService.ApplyModelGalleryEndpoint())
app.Get("/models/available", auth, modelGalleryService.ListModelFromGalleryEndpoint())
app.Get("/models/galleries", auth, modelGalleryService.ListModelGalleriesEndpoint())
app.Post("/models/galleries", auth, modelGalleryService.AddModelGalleryEndpoint())
app.Delete("/models/galleries", auth, modelGalleryService.RemoveModelGalleryEndpoint())
app.Get("/models/jobs/:uuid", auth, modelGalleryService.GetOpStatusEndpoint())
app.Get("/models/jobs", auth, modelGalleryService.GetAllStatusEndpoint())
// openAI compatible API endpoint
@@ -224,5 +250,7 @@ func App(opts ...options.AppOption) (*fiber.App, error) {
app.Get("/v1/models", auth, openai.ListModelsEndpoint(options.Loader, cl))
app.Get("/models", auth, openai.ListModelsEndpoint(options.Loader, cl))
app.Get("/metrics", metrics.MetricsHandler())
return app, nil
}

View File

@@ -15,6 +15,7 @@ import (
. "github.com/go-skynet/LocalAI/api"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/metrics"
"github.com/go-skynet/LocalAI/pkg/gallery"
"github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
@@ -162,8 +163,12 @@ var _ = Describe("API test", func() {
},
}
metricsService, err := metrics.SetupMetrics()
Expect(err).ToNot(HaveOccurred())
app, err = App(
append(commonOpts,
options.WithMetrics(metricsService),
options.WithContext(c),
options.WithGalleries(galleries),
options.WithModelLoader(modelLoader), options.WithBackendAssets(backendAssets), options.WithBackendAssetsOutput(tmpdir))...)
@@ -296,7 +301,7 @@ var _ = Describe("API test", func() {
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "github:go-skynet/model-gallery/openllama_3b.yaml",
Name: "openllama_3b",
Overrides: map[string]interface{}{"backend": "llama-stable", "mmap": true, "f16": true, "context_size": 128},
Overrides: map[string]interface{}{"backend": "llama-ggml", "mmap": true, "f16": true, "context_size": 128},
})
Expect(response["uuid"]).ToNot(BeEmpty(), fmt.Sprint(response))
@@ -363,9 +368,10 @@ var _ = Describe("API test", func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
modelName := "codellama"
response := postModelApplyRequest("http://127.0.0.1:9090/models/apply", modelApplyRequest{
URL: "github:go-skynet/model-gallery/openllama-3b-gguf.yaml",
Name: "openllama_3b_gguf",
URL: "github:go-skynet/model-gallery/codellama-7b-instruct.yaml",
Name: modelName,
Overrides: map[string]interface{}{"backend": "llama", "mmap": true, "f16": true, "context_size": 128},
})
@@ -378,17 +384,22 @@ var _ = Describe("API test", func() {
return response["processed"].(bool)
}, "360s", "10s").Should(Equal(true))
By("testing completion")
resp, err := client.CreateCompletion(context.TODO(), openai.CompletionRequest{Model: "openllama_3b_gguf", Prompt: "Count up to five: one, two, three, four, "})
By("testing chat")
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: modelName, Messages: []openai.ChatCompletionMessage{
{
Role: "user",
Content: "How much is 2+2?",
},
}})
Expect(err).ToNot(HaveOccurred())
Expect(len(resp.Choices)).To(Equal(1))
Expect(resp.Choices[0].Text).To(ContainSubstring("five"))
Expect(resp.Choices[0].Message.Content).To(Or(ContainSubstring("4"), ContainSubstring("four")))
By("testing functions")
resp2, err := client.CreateChatCompletion(
context.TODO(),
openai.ChatCompletionRequest{
Model: "openllama_3b_gguf",
Model: modelName,
Messages: []openai.ChatCompletionMessage{
{
Role: "user",
@@ -424,7 +435,7 @@ var _ = Describe("API test", func() {
var res map[string]string
err = json.Unmarshal([]byte(resp2.Choices[0].Message.FunctionCall.Arguments), &res)
Expect(err).ToNot(HaveOccurred())
Expect(res["location"]).To(Equal("San Francisco, California"), fmt.Sprint(res))
Expect(res["location"]).To(Equal("San Francisco"), fmt.Sprint(res))
Expect(res["unit"]).To(Equal("celcius"), fmt.Sprint(res))
Expect(string(resp2.Choices[0].FinishReason)).To(Equal("function_call"), fmt.Sprint(resp2.Choices[0].FinishReason))
})
@@ -446,7 +457,7 @@ var _ = Describe("API test", func() {
Eventually(func() bool {
response := getModelStatus("http://127.0.0.1:9090/models/jobs/" + uuid)
return response["processed"].(bool)
}, "360s", "10s").Should(Equal(true))
}, "960s", "10s").Should(Equal(true))
resp, err := client.CreateChatCompletion(context.TODO(), openai.ChatCompletionRequest{Model: "gpt4all-j", Messages: []openai.ChatCompletionMessage{openai.ChatCompletionMessage{Role: "user", Content: "How are you?"}}})
Expect(err).ToNot(HaveOccurred())
@@ -473,9 +484,13 @@ var _ = Describe("API test", func() {
},
}
metricsService, err := metrics.SetupMetrics()
Expect(err).ToNot(HaveOccurred())
app, err = App(
append(commonOpts,
options.WithContext(c),
options.WithMetrics(metricsService),
options.WithAudioDir(tmpdir),
options.WithImageDir(tmpdir),
options.WithGalleries(galleries),
@@ -577,12 +592,15 @@ var _ = Describe("API test", func() {
modelLoader = model.NewModelLoader(os.Getenv("MODELS_PATH"))
c, cancel = context.WithCancel(context.Background())
var err error
metricsService, err := metrics.SetupMetrics()
Expect(err).ToNot(HaveOccurred())
app, err = App(
append(commonOpts,
options.WithExternalBackend("huggingface", os.Getenv("HUGGINGFACE_GRPC")),
options.WithContext(c),
options.WithModelLoader(modelLoader),
options.WithMetrics(metricsService),
)...)
Expect(err).ToNot(HaveOccurred())
go app.Listen("127.0.0.1:9090")
@@ -669,7 +687,7 @@ var _ = Describe("API test", func() {
Input: []string{"sun", "cat"},
},
)
Expect(err).ToNot(HaveOccurred())
Expect(err).ToNot(HaveOccurred(), err)
Expect(len(resp.Data[0].Embedding)).To(BeNumerically("==", 384))
Expect(len(resp.Data[1].Embedding)).To(BeNumerically("==", 384))
@@ -686,7 +704,7 @@ var _ = Describe("API test", func() {
})
Context("External gRPC calls", func() {
It("calculate embeddings with huggingface", func() {
It("calculate embeddings with sentencetransformers", func() {
if runtime.GOOS != "linux" {
Skip("test supported only on linux")
}
@@ -786,10 +804,13 @@ var _ = Describe("API test", func() {
modelLoader = model.NewModelLoader(os.Getenv("MODELS_PATH"))
c, cancel = context.WithCancel(context.Background())
var err error
metricsService, err := metrics.SetupMetrics()
Expect(err).ToNot(HaveOccurred())
app, err = App(
append(commonOpts,
options.WithContext(c),
options.WithMetrics(metricsService),
options.WithModelLoader(modelLoader),
options.WithConfigFile(os.Getenv("CONFIG_FILE")))...,
)

View File

@@ -20,6 +20,9 @@ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negat
SchedulerType: c.Diffusers.SchedulerType,
PipelineType: c.Diffusers.PipelineType,
CFGScale: c.Diffusers.CFGScale,
LoraAdapter: c.LoraAdapter,
LoraScale: c.LoraScale,
LoraBase: c.LoraBase,
IMG2IMG: c.Diffusers.IMG2IMG,
CLIPModel: c.Diffusers.ClipModel,
CLIPSubfolder: c.Diffusers.ClipSubFolder,

View File

@@ -6,6 +6,7 @@ import (
"regexp"
"strings"
"sync"
"unicode/utf8"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
@@ -25,7 +26,7 @@ type TokenUsage struct {
Completion int
}
func ModelInference(ctx context.Context, s string, loader *model.ModelLoader, c config.Config, o *options.Option, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) {
func ModelInference(ctx context.Context, s string, images []string, loader *model.ModelLoader, c config.Config, o *options.Option, tokenCallback func(string, TokenUsage) bool) (func() (LLMResponse, error), error) {
modelFile := c.Model
grpcOpts := gRPCModelOpts(c)
@@ -71,6 +72,7 @@ func ModelInference(ctx context.Context, s string, loader *model.ModelLoader, c
fn := func() (LLMResponse, error) {
opts := gRPCPredictOpts(c, loader.ModelPath)
opts.Prompt = s
opts.Images = images
tokenUsage := TokenUsage{}
@@ -97,9 +99,23 @@ func ModelInference(ctx context.Context, s string, loader *model.ModelLoader, c
if tokenCallback != nil {
ss := ""
err := inferenceModel.PredictStream(ctx, opts, func(s []byte) {
tokenCallback(string(s), tokenUsage)
ss += string(s)
var partialRune []byte
err := inferenceModel.PredictStream(ctx, opts, func(chars []byte) {
partialRune = append(partialRune, chars...)
for len(partialRune) > 0 {
r, size := utf8.DecodeRune(partialRune)
if r == utf8.RuneError {
// incomplete rune, wait for more bytes
break
}
tokenCallback(string(r), tokenUsage)
ss += string(r)
partialRune = partialRune[size:]
}
})
return LLMResponse{
Response: ss,

View File

@@ -16,6 +16,10 @@ func modelOpts(c config.Config, o *options.Option, opts []model.Option) []model.
opts = append(opts, model.WithSingleActiveBackend())
}
if o.ParallelBackendRequests {
opts = append(opts, model.EnableParallelRequests)
}
if c.GRPC.Attempts != 0 {
opts = append(opts, model.WithGRPCAttempts(c.GRPC.Attempts))
}
@@ -38,26 +42,35 @@ func gRPCModelOpts(c config.Config) *pb.ModelOptions {
}
return &pb.ModelOptions{
ContextSize: int32(c.ContextSize),
Seed: int32(c.Seed),
NBatch: int32(b),
NoMulMatQ: c.NoMulMatQ,
LoraAdapter: c.LoraAdapter,
LoraBase: c.LoraBase,
NGQA: c.NGQA,
RMSNormEps: c.RMSNormEps,
F16Memory: c.F16,
MLock: c.MMlock,
RopeFreqBase: c.RopeFreqBase,
RopeFreqScale: c.RopeFreqScale,
NUMA: c.NUMA,
Embeddings: c.Embeddings,
LowVRAM: c.LowVRAM,
NGPULayers: int32(c.NGPULayers),
MMap: c.MMap,
MainGPU: c.MainGPU,
Threads: int32(c.Threads),
TensorSplit: c.TensorSplit,
ContextSize: int32(c.ContextSize),
Seed: int32(c.Seed),
NBatch: int32(b),
NoMulMatQ: c.NoMulMatQ,
DraftModel: c.DraftModel,
AudioPath: c.VallE.AudioPath,
Quantization: c.Quantization,
MMProj: c.MMProj,
YarnExtFactor: c.YarnExtFactor,
YarnAttnFactor: c.YarnAttnFactor,
YarnBetaFast: c.YarnBetaFast,
YarnBetaSlow: c.YarnBetaSlow,
LoraAdapter: c.LoraAdapter,
LoraBase: c.LoraBase,
LoraScale: c.LoraScale,
NGQA: c.NGQA,
RMSNormEps: c.RMSNormEps,
F16Memory: c.F16,
MLock: c.MMlock,
RopeFreqBase: c.RopeFreqBase,
RopeFreqScale: c.RopeFreqScale,
NUMA: c.NUMA,
Embeddings: c.Embeddings,
LowVRAM: c.LowVRAM,
NGPULayers: int32(c.NGPULayers),
MMap: c.MMap,
MainGPU: c.MainGPU,
Threads: int32(c.Threads),
TensorSplit: c.TensorSplit,
// AutoGPTQ
ModelBaseName: c.AutoGPTQ.ModelBaseName,
Device: c.AutoGPTQ.Device,
@@ -78,6 +91,7 @@ func gRPCPredictOpts(c config.Config, modelPath string) *pb.PredictOptions {
return &pb.PredictOptions{
Temperature: float32(c.Temperature),
TopP: float32(c.TopP),
NDraft: c.NDraft,
TopK: int32(c.TopK),
Tokens: int32(c.Maxtokens),
Threads: int32(c.Threads),

View File

@@ -43,6 +43,13 @@ type Config struct {
// GRPC Options
GRPC GRPC `yaml:"grpc"`
// Vall-e-x
VallE VallE `yaml:"vall-e"`
}
type VallE struct {
AudioPath string `yaml:"audio_path"`
}
type FeatureFlag map[string]*bool
@@ -93,7 +100,18 @@ type LLMConfig struct {
NUMA bool `yaml:"numa"`
LoraAdapter string `yaml:"lora_adapter"`
LoraBase string `yaml:"lora_base"`
LoraScale float32 `yaml:"lora_scale"`
NoMulMatQ bool `yaml:"no_mulmatq"`
DraftModel string `yaml:"draft_model"`
NDraft int32 `yaml:"n_draft"`
Quantization string `yaml:"quantization"`
MMProj string `yaml:"mmproj"`
RopeScaling string `yaml:"rope_scaling"`
YarnExtFactor float32 `yaml:"yarn_ext_factor"`
YarnAttnFactor float32 `yaml:"yarn_attn_factor"`
YarnBetaFast float32 `yaml:"yarn_beta_fast"`
YarnBetaSlow float32 `yaml:"yarn_beta_slow"`
}
type AutoGPTQ struct {
@@ -259,7 +277,7 @@ func (cm *ConfigLoader) LoadConfigs(path string) error {
}
for _, file := range files {
// Skip templates, YAML and .keep files
if !strings.Contains(file.Name(), ".yaml") {
if !strings.Contains(file.Name(), ".yaml") && !strings.Contains(file.Name(), ".yml") {
continue
}
c, err := ReadConfig(filepath.Join(path, file.Name()))

View File

@@ -123,13 +123,12 @@ func BackendMonitorEndpoint(bm BackendMonitor) func(c *fiber.Ctx) error {
return err
}
client := bm.options.Loader.CheckIsLoaded(backendId)
if client == nil {
model := bm.options.Loader.CheckIsLoaded(backendId)
if model == "" {
return fmt.Errorf("backend %s is not currently loaded", backendId)
}
status, rpcErr := client.Status(context.TODO())
status, rpcErr := model.GRPC(false, nil).Status(context.TODO())
if rpcErr != nil {
log.Warn().Msgf("backend %s experienced an error retrieving status info: %s", backendId, rpcErr.Error())
val, slbErr := bm.SampleLocalBackendProcess(backendId)

View File

@@ -4,6 +4,7 @@ import (
"context"
"fmt"
"os"
"slices"
"strings"
"sync"
@@ -27,6 +28,7 @@ type galleryOp struct {
}
type galleryOpStatus struct {
FileName string `json:"file_name"`
Error error `json:"error"`
Processed bool `json:"processed"`
Message string `json:"message"`
@@ -50,7 +52,6 @@ func NewGalleryService(modelPath string) *galleryApplier {
}
}
// prepareModel applies a
func prepareModel(modelPath string, req gallery.GalleryModel, cm *config.ConfigLoader, downloadStatus func(string, string, string, float64)) error {
config, err := gallery.GetGalleryConfigFromURL(req.URL)
@@ -76,6 +77,13 @@ func (g *galleryApplier) getStatus(s string) *galleryOpStatus {
return g.statuses[s]
}
func (g *galleryApplier) getAllStatus() map[string]*galleryOpStatus {
g.Lock()
defer g.Unlock()
return g.statuses
}
func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
go func() {
for {
@@ -94,7 +102,7 @@ func (g *galleryApplier) Start(c context.Context, cm *config.ConfigLoader) {
// displayDownload displays the download progress
progressCallback := func(fileName string, current string, total string, percentage float64) {
g.updateStatus(op.id, &galleryOpStatus{Message: "processing", Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
g.updateStatus(op.id, &galleryOpStatus{Message: "processing", FileName: fileName, Progress: percentage, TotalFileSize: total, DownloadedFileSize: current})
utils.DisplayDownloadFunction(fileName, current, total, percentage)
}
@@ -176,18 +184,12 @@ func ApplyGalleryFromString(modelPath, s string, cm *config.ConfigLoader, galler
return processRequests(modelPath, s, cm, galleries, requests)
}
/// Endpoints
/// Endpoint Service
func GetOpStatusEndpoint(g *galleryApplier) func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
status := g.getStatus(c.Params("uuid"))
if status == nil {
return fmt.Errorf("could not find any status for ID")
}
return c.JSON(status)
}
type ModelGalleryService struct {
galleries []gallery.Gallery
modelPath string
galleryApplier *galleryApplier
}
type GalleryModel struct {
@@ -195,7 +197,31 @@ type GalleryModel struct {
gallery.GalleryModel
}
func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan galleryOp, galleries []gallery.Gallery) func(c *fiber.Ctx) error {
func CreateModelGalleryService(galleries []gallery.Gallery, modelPath string, galleryApplier *galleryApplier) ModelGalleryService {
return ModelGalleryService{
galleries: galleries,
modelPath: modelPath,
galleryApplier: galleryApplier,
}
}
func (mgs *ModelGalleryService) GetOpStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
status := mgs.galleryApplier.getStatus(c.Params("uuid"))
if status == nil {
return fmt.Errorf("could not find any status for ID")
}
return c.JSON(status)
}
}
func (mgs *ModelGalleryService) GetAllStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
return c.JSON(mgs.galleryApplier.getAllStatus())
}
}
func (mgs *ModelGalleryService) ApplyModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(GalleryModel)
// Get input data from the request body
@@ -207,11 +233,11 @@ func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan
if err != nil {
return err
}
g <- galleryOp{
mgs.galleryApplier.C <- galleryOp{
req: input.GalleryModel,
id: uuid.String(),
galleryName: input.ID,
galleries: galleries,
galleries: mgs.galleries,
}
return c.JSON(struct {
ID string `json:"uuid"`
@@ -220,11 +246,11 @@ func ApplyModelGalleryEndpoint(modelPath string, cm *config.ConfigLoader, g chan
}
}
func ListModelFromGalleryEndpoint(galleries []gallery.Gallery, basePath string) func(c *fiber.Ctx) error {
func (mgs *ModelGalleryService) ListModelFromGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
log.Debug().Msgf("Listing models from galleries: %+v", galleries)
log.Debug().Msgf("Listing models from galleries: %+v", mgs.galleries)
models, err := gallery.AvailableGalleryModels(galleries, basePath)
models, err := gallery.AvailableGalleryModels(mgs.galleries, mgs.modelPath)
if err != nil {
return err
}
@@ -239,3 +265,56 @@ func ListModelFromGalleryEndpoint(galleries []gallery.Gallery, basePath string)
return c.Send(dat)
}
}
// NOTE: This is different (and much simpler!) than above! This JUST lists the model galleries that have been loaded, not their contents!
func (mgs *ModelGalleryService) ListModelGalleriesEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
log.Debug().Msgf("Listing model galleries %+v", mgs.galleries)
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
return c.Send(dat)
}
}
func (mgs *ModelGalleryService) AddModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(gallery.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if slices.ContainsFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s already exists", input.Name)
}
dat, err := json.Marshal(mgs.galleries)
if err != nil {
return err
}
log.Debug().Msgf("Adding %+v to gallery list", *input)
mgs.galleries = append(mgs.galleries, *input)
return c.Send(dat)
}
}
func (mgs *ModelGalleryService) RemoveModelGalleryEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
input := new(gallery.Gallery)
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return err
}
if !slices.ContainsFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
}) {
return fmt.Errorf("%s is not currently registered", input.Name)
}
mgs.galleries = slices.DeleteFunc(mgs.galleries, func(gallery gallery.Gallery) bool {
return gallery.Name == input.Name
})
return c.Send(nil)
}
}

View File

@@ -6,6 +6,7 @@ import (
"encoding/json"
"fmt"
"strings"
"time"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
@@ -15,15 +16,20 @@ import (
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/go-skynet/LocalAI/pkg/utils"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/rs/zerolog/log"
"github.com/valyala/fasthttp"
)
func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
emptyMessage := ""
id := uuid.New().String()
created := int(time.Now().Unix())
process := func(s string, req *schema.OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan schema.OpenAIResponse) {
initialMessage := schema.OpenAIResponse{
ID: id,
Created: created,
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []schema.Choice{{Delta: &schema.Message{Role: "assistant", Content: &emptyMessage}}},
Object: "chat.completion.chunk",
@@ -32,6 +38,8 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
ComputeChoices(req, s, config, o, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
resp := schema.OpenAIResponse{
ID: id,
Created: created,
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []schema.Choice{{Delta: &schema.Message{Content: &s}, Index: 0}},
Object: "chat.completion.chunk",
@@ -73,6 +81,10 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
noActionDescription = config.FunctionsConfig.NoActionDescriptionName
}
if input.ResponseFormat.Type == "json_object" {
input.Grammar = grammar.JSONBNF
}
// process functions if we have any defined or if we have a function call string
if len(input.Functions) > 0 && config.ShouldUseFunctions() {
log.Debug().Msgf("Response needs to process functions")
@@ -132,14 +144,14 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
}
r := config.Roles[role]
contentExists := i.Content != nil && *i.Content != ""
contentExists := i.Content != nil && i.StringContent != ""
// First attempt to populate content via a chat message specific template
if config.TemplateConfig.ChatMessage != "" {
chatMessageData := model.ChatMessageTemplateData{
SystemPrompt: config.SystemPrompt,
Role: r,
RoleName: role,
Content: *i.Content,
Content: i.StringContent,
MessageIndex: messageIndex,
}
templatedChatMessage, err := o.Loader.EvaluateTemplateForChatMessage(config.TemplateConfig.ChatMessage, chatMessageData)
@@ -158,7 +170,7 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
if content == "" {
if r != "" {
if contentExists {
content = fmt.Sprint(r, " ", *i.Content)
content = fmt.Sprint(r, i.StringContent)
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
@@ -172,7 +184,7 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
} else {
if contentExists {
content = fmt.Sprint(*i.Content)
content = fmt.Sprint(i.StringContent)
}
if i.FunctionCall != nil {
j, err := json.Marshal(i.FunctionCall)
@@ -261,7 +273,9 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []schema.Choice{
{
FinishReason: "stop",
@@ -324,7 +338,11 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
// Otherwise ask the LLM to understand the JSON output and the context, and return a message
// Note: This costs (in term of CPU) another computation
config.Grammar = ""
predFunc, err := backend.ModelInference(input.Context, predInput, o.Loader, *config, o, nil)
images := []string{}
for _, m := range input.Messages {
images = append(images, m.StringImages...)
}
predFunc, err := backend.ModelInference(input.Context, predInput, images, o.Loader, *config, o, nil)
if err != nil {
log.Error().Msgf("inference error: %s", err.Error())
return
@@ -355,6 +373,8 @@ func ChatEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
}
resp := &schema.OpenAIResponse{
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "chat.completion",

View File

@@ -6,23 +6,31 @@ import (
"encoding/json"
"errors"
"fmt"
"time"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/options"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/go-skynet/LocalAI/pkg/grammar"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/rs/zerolog/log"
"github.com/valyala/fasthttp"
)
// https://platform.openai.com/docs/api-reference/completions
func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx) error {
id := uuid.New().String()
created := int(time.Now().Unix())
process := func(s string, req *schema.OpenAIRequest, config *config.Config, loader *model.ModelLoader, responses chan schema.OpenAIResponse) {
ComputeChoices(req, s, config, o, loader, func(s string, c *[]schema.Choice) {}, func(s string, usage backend.TokenUsage) bool {
resp := schema.OpenAIResponse{
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
ID: id,
Created: created,
Model: req.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []schema.Choice{
{
Index: 0,
@@ -57,6 +65,10 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
return fmt.Errorf("failed reading parameters from request:%w", err)
}
if input.ResponseFormat.Type == "json_object" {
input.Grammar = grammar.JSONBNF
}
log.Debug().Msgf("Parameter Config: %+v", config)
if input.Stream {
@@ -108,7 +120,9 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
}
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: []schema.Choice{
{
Index: 0,
@@ -156,6 +170,8 @@ func CompletionEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
}
resp := &schema.OpenAIResponse{
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "text_completion",

View File

@@ -3,6 +3,7 @@ package openai
import (
"encoding/json"
"fmt"
"time"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
@@ -10,6 +11,7 @@ import (
"github.com/go-skynet/LocalAI/api/schema"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/gofiber/fiber/v2"
"github.com/google/uuid"
"github.com/rs/zerolog/log"
)
@@ -62,7 +64,11 @@ func EditEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx)
result = append(result, r...)
}
id := uuid.New().String()
created := int(time.Now().Unix())
resp := &schema.OpenAIResponse{
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Choices: result,
Object: "edit",

View File

@@ -3,10 +3,12 @@ package openai
import (
"encoding/json"
"fmt"
"time"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/google/uuid"
"github.com/go-skynet/LocalAI/api/options"
"github.com/gofiber/fiber/v2"
@@ -57,10 +59,14 @@ func EmbeddingsEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fibe
items = append(items, schema.Item{Embedding: embeddings, Index: i, Object: "embedding"})
}
id := uuid.New().String()
created := int(time.Now().Unix())
resp := &schema.OpenAIResponse{
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Data: items,
Object: "list",
ID: id,
Created: created,
Model: input.Model, // we have to return what the user sent here, due to OpenAI spec.
Data: items,
Object: "list",
}
jsonResult, _ := json.Marshal(resp)

View File

@@ -5,11 +5,14 @@ import (
"encoding/base64"
"encoding/json"
"fmt"
"github.com/go-skynet/LocalAI/api/schema"
"os"
"path/filepath"
"strconv"
"strings"
"time"
"github.com/go-skynet/LocalAI/api/schema"
"github.com/google/uuid"
"github.com/go-skynet/LocalAI/api/backend"
config "github.com/go-skynet/LocalAI/api/config"
@@ -97,7 +100,7 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
}
b64JSON := false
if input.ResponseFormat == "b64_json" {
if input.ResponseFormat.Type == "b64_json" {
b64JSON = true
}
// src and clip_skip
@@ -174,8 +177,12 @@ func ImageEndpoint(cm *config.ConfigLoader, o *options.Option) func(c *fiber.Ctx
}
}
id := uuid.New().String()
created := int(time.Now().Unix())
resp := &schema.OpenAIResponse{
Data: result,
ID: id,
Created: created,
Data: result,
}
jsonResult, _ := json.Marshal(resp)

View File

@@ -23,8 +23,13 @@ func ComputeChoices(
n = 1
}
images := []string{}
for _, m := range req.Messages {
images = append(images, m.StringImages...)
}
// get the model function to call for the result
predFunc, err := backend.ModelInference(req.Context, predInput, loader, *config, o, tokenCallback)
predFunc, err := backend.ModelInference(req.Context, predInput, images, loader, *config, o, tokenCallback)
if err != nil {
return result, backend.TokenUsage{}, err
}

View File

@@ -2,8 +2,11 @@ package openai
import (
"context"
"encoding/base64"
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"os"
"path/filepath"
"strings"
@@ -24,7 +27,7 @@ func readInput(c *fiber.Ctx, o *options.Option, randomModel bool) (string, *sche
input.Cancel = cancel
// Get input data from the request body
if err := c.BodyParser(input); err != nil {
return "", nil, err
return "", nil, fmt.Errorf("failed parsing request body: %w", err)
}
modelFile := input.Model
@@ -61,6 +64,37 @@ func readInput(c *fiber.Ctx, o *options.Option, randomModel bool) (string, *sche
return modelFile, input, nil
}
// this function check if the string is an URL, if it's an URL downloads the image in memory
// encodes it in base64 and returns the base64 string
func getBase64Image(s string) (string, error) {
if strings.HasPrefix(s, "http") {
// download the image
resp, err := http.Get(s)
if err != nil {
return "", err
}
defer resp.Body.Close()
// read the image data into memory
data, err := ioutil.ReadAll(resp.Body)
if err != nil {
return "", err
}
// encode the image data in base64
encoded := base64.StdEncoding.EncodeToString(data)
// return the base64 string
return encoded, nil
}
// if the string instead is prefixed with "data:image/jpeg;base64,", drop it
if strings.HasPrefix(s, "data:image/jpeg;base64,") {
return strings.ReplaceAll(s, "data:image/jpeg;base64,", ""), nil
}
return "", fmt.Errorf("not valid string")
}
func updateConfig(config *config.Config, input *schema.OpenAIRequest) {
if input.Echo {
config.Echo = input.Echo
@@ -129,6 +163,35 @@ func updateConfig(config *config.Config, input *schema.OpenAIRequest) {
}
}
// Decode each request's message content
index := 0
for i, m := range input.Messages {
switch content := m.Content.(type) {
case string:
input.Messages[i].StringContent = content
case []interface{}:
dat, _ := json.Marshal(content)
c := []schema.Content{}
json.Unmarshal(dat, &c)
for _, pp := range c {
if pp.Type == "text" {
input.Messages[i].StringContent = pp.Text
} else if pp.Type == "image_url" {
// Detect if pp.ImageURL is an URL, if it is download the image and encode it in base64:
base64, err := getBase64Image(pp.ImageURL.URL)
if err == nil {
input.Messages[i].StringImages = append(input.Messages[i].StringImages, base64) // TODO: make sure that we only return base64 stuff
// set a placeholder for each image
input.Messages[i].StringContent = fmt.Sprintf("[img-%d]", index) + input.Messages[i].StringContent
index++
} else {
fmt.Print("Failed encoding image", err)
}
}
}
}
}
if input.RepeatPenalty != 0 {
config.RepeatPenalty = input.RepeatPenalty
}

View File

@@ -4,7 +4,9 @@ import (
"context"
"embed"
"encoding/json"
"time"
"github.com/go-skynet/LocalAI/metrics"
"github.com/go-skynet/LocalAI/pkg/gallery"
model "github.com/go-skynet/LocalAI/pkg/model"
"github.com/rs/zerolog/log"
@@ -24,6 +26,7 @@ type Option struct {
PreloadModelsFromPath string
CORSAllowOrigins string
ApiKeys []string
Metrics *metrics.Metrics
Galleries []gallery.Gallery
@@ -34,7 +37,13 @@ type Option struct {
AutoloadGalleries bool
SingleBackend bool
SingleBackend bool
ParallelBackendRequests bool
WatchDogIdle bool
WatchDogBusy bool
WatchDog bool
WatchDogBusyTimeout, WatchDogIdleTimeout time.Duration
}
type AppOption func(*Option)
@@ -60,10 +69,40 @@ func WithCors(b bool) AppOption {
}
}
var EnableWatchDog = func(o *Option) {
o.WatchDog = true
}
var EnableWatchDogIdleCheck = func(o *Option) {
o.WatchDog = true
o.WatchDogIdle = true
}
var EnableWatchDogBusyCheck = func(o *Option) {
o.WatchDog = true
o.WatchDogBusy = true
}
func SetWatchDogBusyTimeout(t time.Duration) AppOption {
return func(o *Option) {
o.WatchDogBusyTimeout = t
}
}
func SetWatchDogIdleTimeout(t time.Duration) AppOption {
return func(o *Option) {
o.WatchDogIdleTimeout = t
}
}
var EnableSingleBackend = func(o *Option) {
o.SingleBackend = true
}
var EnableParallelBackendRequests = func(o *Option) {
o.ParallelBackendRequests = true
}
var EnableGalleriesAutoload = func(o *Option) {
o.AutoloadGalleries = true
}
@@ -99,6 +138,7 @@ func WithStringGalleries(galls string) AppOption {
return func(o *Option) {
if galls == "" {
log.Debug().Msgf("no galleries to load")
o.Galleries = []gallery.Gallery{}
return
}
var galleries []gallery.Gallery
@@ -197,3 +237,9 @@ func WithApiKeys(apiKeys []string) AppOption {
o.ApiKeys = apiKeys
}
}
func WithMetrics(meter *metrics.Metrics) AppOption {
return func(o *Option) {
o.Metrics = meter
}
}

View File

@@ -55,11 +55,25 @@ type Choice struct {
Text string `json:"text,omitempty"`
}
type Content struct {
Type string `json:"type" yaml:"type"`
Text string `json:"text" yaml:"text"`
ImageURL ContentURL `json:"image_url" yaml:"image_url"`
}
type ContentURL struct {
URL string `json:"url" yaml:"url"`
}
type Message struct {
// The message role
Role string `json:"role,omitempty" yaml:"role"`
// The message content
Content *string `json:"content" yaml:"content"`
Content interface{} `json:"content" yaml:"content"`
StringContent string `json:"string_content,omitempty" yaml:"string_content,omitempty"`
StringImages []string `json:"string_images,omitempty" yaml:"string_images,omitempty"`
// A result of a function call
FunctionCall interface{} `json:"function_call,omitempty" yaml:"function_call,omitempty"`
}
@@ -69,6 +83,12 @@ type OpenAIModel struct {
Object string `json:"object"`
}
type ChatCompletionResponseFormatType string
type ChatCompletionResponseFormat struct {
Type ChatCompletionResponseFormatType `json:"type,omitempty"`
}
type OpenAIRequest struct {
config.PredictionOptions
@@ -78,7 +98,7 @@ type OpenAIRequest struct {
// whisper
File string `json:"file" validate:"required"`
//whisper/image
ResponseFormat string `json:"response_format"`
ResponseFormat ChatCompletionResponseFormat `json:"response_format"`
// image
Size string `json:"size"`
// Prompt is read only by completion/image API calls

View File

@@ -63,6 +63,8 @@ message PredictOptions {
float RopeFreqScale = 38;
float NegativePromptScale = 39;
string NegativePrompt = 40;
int32 NDraft = 41;
repeated string Images = 42;
}
// The response message containing the result
@@ -115,7 +117,23 @@ message ModelOptions {
// LLM (llama.cpp)
string LoraBase = 35;
string LoraAdapter = 36;
float LoraScale = 42;
bool NoMulMatQ = 37;
string DraftModel = 39;
string AudioPath = 38;
// vllm
string Quantization = 40;
string MMProj = 41;
string RopeScaling = 43;
float YarnExtFactor = 44;
float YarnAttnFactor = 45;
float YarnBetaFast = 46;
float YarnBetaSlow = 47;
}
message Result {

3
backend/cpp/grpc/.gitignore vendored Normal file
View File

@@ -0,0 +1,3 @@
installed_packages/
grpc_build/
grpc_repo/

View File

@@ -0,0 +1,81 @@
#!/bin/bash
# Builds locally from sources the packages needed by the llama cpp backend.
# Makes sure a few base packages exist.
# sudo apt-get --no-upgrade -y install g++ gcc binutils cmake git build-essential autoconf libtool pkg-config
SCRIPT_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
echo "Script directory: $SCRIPT_DIR"
CPP_INSTALLED_PACKAGES_DIR=$1
if [ -z ${CPP_INSTALLED_PACKAGES_DIR} ]; then
echo "CPP_INSTALLED_PACKAGES_DIR env variable not set. Don't know where to install: failed.";
echo
exit -1
fi
if [ -d "${CPP_INSTALLED_PACKAGES_DIR}" ]; then
echo "gRPC installation directory already exists. Nothing to do."
exit 0
fi
# The depth when cloning a git repo. 1 speeds up the clone when the repo history is not needed.
GIT_CLONE_DEPTH=1
NUM_BUILD_THREADS=$(nproc --ignore=1)
# Google gRPC --------------------------------------------------------------------------------------
TAG_LIB_GRPC="v1.59.0"
GIT_REPO_LIB_GRPC="https://github.com/grpc/grpc.git"
GRPC_REPO_DIR="${SCRIPT_DIR}/../grpc_repo"
GRPC_BUILD_DIR="${SCRIPT_DIR}/../grpc_build"
SRC_DIR_LIB_GRPC="${GRPC_REPO_DIR}/grpc"
echo "SRC_DIR_LIB_GRPC: ${SRC_DIR_LIB_GRPC}"
echo "GRPC_REPO_DIR: ${GRPC_REPO_DIR}"
echo "GRPC_BUILD_DIR: ${GRPC_BUILD_DIR}"
mkdir -pv ${GRPC_REPO_DIR}
rm -rf ${GRPC_BUILD_DIR}
mkdir -pv ${GRPC_BUILD_DIR}
mkdir -pv ${CPP_INSTALLED_PACKAGES_DIR}
if [ -d "${SRC_DIR_LIB_GRPC}" ]; then
echo "gRPC source already exists locally. Not cloned again."
else
( cd ${GRPC_REPO_DIR} && \
git clone --depth ${GIT_CLONE_DEPTH} -b ${TAG_LIB_GRPC} ${GIT_REPO_LIB_GRPC} && \
cd ${SRC_DIR_LIB_GRPC} && \
git submodule update --init --recursive --depth ${GIT_CLONE_DEPTH}
)
fi
( cd ${GRPC_BUILD_DIR} && \
cmake -G "Unix Makefiles" \
-DCMAKE_BUILD_TYPE=Release \
-DgRPC_INSTALL=ON \
-DEXECUTABLE_OUTPUT_PATH=${CPP_INSTALLED_PACKAGES_DIR}/grpc/bin \
-DLIBRARY_OUTPUT_PATH=${CPP_INSTALLED_PACKAGES_DIR}/grpc/lib \
-DgRPC_BUILD_TESTS=OFF \
-DgRPC_BUILD_CSHARP_EXT=OFF \
-DgRPC_BUILD_GRPC_CPP_PLUGIN=ON \
-DgRPC_BUILD_GRPC_CSHARP_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_NODE_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_OBJECTIVE_C_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_PHP_PLUGIN=OFF \
-DgRPC_BUILD_GRPC_PYTHON_PLUGIN=ON \
-DgRPC_BUILD_GRPC_RUBY_PLUGIN=OFF \
-Dprotobuf_WITH_ZLIB=ON \
-DRE2_BUILD_TESTING=OFF \
-DCMAKE_INSTALL_PREFIX=${CPP_INSTALLED_PACKAGES_DIR}/ \
${SRC_DIR_LIB_GRPC} && \
cmake --build . -- -j ${NUM_BUILD_THREADS} && \
cmake --build . --target install -- -j ${NUM_BUILD_THREADS}
)
rm -rf ${GRPC_BUILD_DIR}
rm -rf ${GRPC_REPO_DIR}

View File

@@ -0,0 +1,74 @@
## XXX: In some versions of CMake clip wasn't being built before llama.
## This is an hack for now, but it should be fixed in the future.
set(TARGET myclip)
add_library(${TARGET} clip.cpp clip.h)
install(TARGETS ${TARGET} LIBRARY)
target_link_libraries(${TARGET} PRIVATE common ggml ${CMAKE_THREAD_LIBS_INIT})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
if (NOT MSVC)
target_compile_options(${TARGET} PRIVATE -Wno-cast-qual) # stb_image.h
endif()
set(TARGET grpc-server)
# END CLIP hack
set(CMAKE_CXX_STANDARD 17)
cmake_minimum_required(VERSION 3.15)
set(TARGET grpc-server)
set(_PROTOBUF_LIBPROTOBUF libprotobuf)
set(_REFLECTION grpc++_reflection)
if (${CMAKE_SYSTEM_NAME} MATCHES "Darwin")
link_directories("/opt/homebrew/lib")
include_directories("/opt/homebrew/include")
endif()
find_package(absl CONFIG REQUIRED)
find_package(Protobuf CONFIG REQUIRED)
find_package(gRPC CONFIG REQUIRED)
find_program(_PROTOBUF_PROTOC protoc)
set(_GRPC_GRPCPP grpc++)
find_program(_GRPC_CPP_PLUGIN_EXECUTABLE grpc_cpp_plugin)
include_directories(${CMAKE_CURRENT_BINARY_DIR})
include_directories(${Protobuf_INCLUDE_DIRS})
message(STATUS "Using protobuf version ${Protobuf_VERSION} | Protobuf_INCLUDE_DIRS: ${Protobuf_INCLUDE_DIRS} | CMAKE_CURRENT_BINARY_DIR: ${CMAKE_CURRENT_BINARY_DIR}")
# Proto file
get_filename_component(hw_proto "../../../../../../backend/backend.proto" ABSOLUTE)
get_filename_component(hw_proto_path "${hw_proto}" PATH)
# Generated sources
set(hw_proto_srcs "${CMAKE_CURRENT_BINARY_DIR}/backend.pb.cc")
set(hw_proto_hdrs "${CMAKE_CURRENT_BINARY_DIR}/backend.pb.h")
set(hw_grpc_srcs "${CMAKE_CURRENT_BINARY_DIR}/backend.grpc.pb.cc")
set(hw_grpc_hdrs "${CMAKE_CURRENT_BINARY_DIR}/backend.grpc.pb.h")
add_custom_command(
OUTPUT "${hw_proto_srcs}" "${hw_proto_hdrs}" "${hw_grpc_srcs}" "${hw_grpc_hdrs}"
COMMAND ${_PROTOBUF_PROTOC}
ARGS --grpc_out "${CMAKE_CURRENT_BINARY_DIR}"
--cpp_out "${CMAKE_CURRENT_BINARY_DIR}"
-I "${hw_proto_path}"
--plugin=protoc-gen-grpc="${_GRPC_CPP_PLUGIN_EXECUTABLE}"
"${hw_proto}"
DEPENDS "${hw_proto}")
# hw_grpc_proto
add_library(hw_grpc_proto
${hw_grpc_srcs}
${hw_grpc_hdrs}
${hw_proto_srcs}
${hw_proto_hdrs} )
add_executable(${TARGET} grpc-server.cpp json.hpp )
target_link_libraries(${TARGET} PRIVATE common llama myclip ${CMAKE_THREAD_LIBS_INIT} absl::flags hw_grpc_proto
absl::flags_parse
gRPC::${_REFLECTION}
gRPC::${_GRPC_GRPCPP}
protobuf::${_PROTOBUF_LIBPROTOBUF})
target_compile_features(${TARGET} PRIVATE cxx_std_11)
if(TARGET BUILD_INFO)
add_dependencies(${TARGET} BUILD_INFO)
endif()

View File

@@ -0,0 +1,50 @@
LLAMA_VERSION?=d9b33fe95bd257b36c84ee5769cc048230067d6f
CMAKE_ARGS?=
BUILD_TYPE?=
# If build type is cublas, then we set -DLLAMA_CUBLAS=ON to CMAKE_ARGS automatically
ifeq ($(BUILD_TYPE),cublas)
CMAKE_ARGS+=-DLLAMA_CUBLAS=ON
# If build type is openblas then we set -DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS
# to CMAKE_ARGS automatically
else ifeq ($(BUILD_TYPE),openblas)
CMAKE_ARGS+=-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS
# If build type is clblast (openCL) we set -DLLAMA_CLBLAST=ON -DCLBlast_DIR=/some/path
else ifeq ($(BUILD_TYPE),clblast)
CMAKE_ARGS+=-DLLAMA_CLBLAST=ON -DCLBlast_DIR=/some/path
# If it's hipblas we do have also to set CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++
else ifeq ($(BUILD_TYPE),hipblas)
CMAKE_ARGS+=-DLLAMA_HIPBLAS=ON
endif
llama.cpp:
git clone --recurse-submodules https://github.com/ggerganov/llama.cpp llama.cpp
cd llama.cpp && git checkout -b build $(LLAMA_VERSION) && git submodule update --init --recursive --depth 1
llama.cpp/examples/grpc-server:
mkdir -p llama.cpp/examples/grpc-server
cp -r $(abspath ./)/CMakeLists.txt llama.cpp/examples/grpc-server/
cp -r $(abspath ./)/grpc-server.cpp llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/json.hpp llama.cpp/examples/grpc-server/
echo "add_subdirectory(grpc-server)" >> llama.cpp/examples/CMakeLists.txt
## XXX: In some versions of CMake clip wasn't being built before llama.
## This is an hack for now, but it should be fixed in the future.
cp -rfv llama.cpp/examples/llava/clip.h llama.cpp/examples/grpc-server/clip.h
cp -rfv llama.cpp/examples/llava/clip.cpp llama.cpp/examples/grpc-server/clip.cpp
rebuild:
cp -rfv $(abspath ./)/CMakeLists.txt llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/grpc-server.cpp llama.cpp/examples/grpc-server/
cp -rfv $(abspath ./)/json.hpp llama.cpp/examples/grpc-server/
rm -rf grpc-server
$(MAKE) grpc-server
clean:
rm -rf llama.cpp
rm -rf grpc-server
grpc-server: llama.cpp llama.cpp/examples/grpc-server
cd llama.cpp && mkdir -p build && cd build && cmake .. $(CMAKE_ARGS) && cmake --build . --config Release
cp llama.cpp/build/bin/grpc-server .

View File

File diff suppressed because it is too large Load Diff

24596
backend/cpp/llama/json.hpp Normal file
View File

File diff suppressed because it is too large Load Diff

View File

@@ -5,7 +5,6 @@ package main
import (
"flag"
bert "github.com/go-skynet/LocalAI/pkg/backend/llm/bert"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
@@ -16,7 +15,7 @@ var (
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &bert.Embeddings{}); err != nil {
if err := grpc.StartServer(*addr, &StableDiffusion{}); err != nil {
panic(err)
}
}

View File

@@ -1,4 +1,4 @@
package image
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -1,4 +1,4 @@
package bert
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -5,8 +5,6 @@ package main
import (
"flag"
rwkv "github.com/go-skynet/LocalAI/pkg/backend/llm/rwkv"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
@@ -17,7 +15,7 @@ var (
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &rwkv.LLM{}); err != nil {
if err := grpc.StartServer(*addr, &Embeddings{}); err != nil {
panic(err)
}
}

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -1,4 +1,4 @@
package gpt4all
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -5,8 +5,6 @@ package main
import (
"flag"
bloomz "github.com/go-skynet/LocalAI/pkg/backend/llm/bloomz"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
@@ -17,7 +15,7 @@ var (
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &bloomz.LLM{}); err != nil {
if err := grpc.StartServer(*addr, &LLM{}); err != nil {
panic(err)
}
}

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -1,4 +1,4 @@
package langchain
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -0,0 +1,21 @@
package main
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &LLM{}); err != nil {
panic(err)
}
}

View File

@@ -1,4 +1,4 @@
package llama
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -3,8 +3,6 @@ package main
import (
"flag"
llama "github.com/go-skynet/LocalAI/pkg/backend/llm/llama-stable"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
@@ -15,7 +13,7 @@ var (
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &llama.LLM{}); err != nil {
if err := grpc.StartServer(*addr, &LLM{}); err != nil {
panic(err)
}
}

View File

@@ -1,9 +1,10 @@
package llama
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)
import (
"fmt"
"path/filepath"
"github.com/go-skynet/LocalAI/pkg/grpc/base"
pb "github.com/go-skynet/LocalAI/pkg/grpc/proto"
@@ -13,7 +14,8 @@ import (
type LLM struct {
base.SingleThread
llama *llama.LLama
llama *llama.LLama
draftModel *llama.LLama
}
func (llm *LLM) Load(opts *pb.ModelOptions) error {
@@ -36,12 +38,15 @@ func (llm *LLM) Load(opts *pb.ModelOptions) error {
llamaOpts = append(llamaOpts, llama.SetMulMatQ(false))
}
// Get base path of opts.ModelFile and use the same for lora (assume the same path)
basePath := filepath.Dir(opts.ModelFile)
if opts.LoraAdapter != "" {
llamaOpts = append(llamaOpts, llama.SetLoraAdapter(opts.LoraAdapter))
llamaOpts = append(llamaOpts, llama.SetLoraAdapter(filepath.Join(basePath, opts.LoraAdapter)))
}
if opts.LoraBase != "" {
llamaOpts = append(llamaOpts, llama.SetLoraBase(opts.LoraBase))
llamaOpts = append(llamaOpts, llama.SetLoraBase(filepath.Join(basePath, opts.LoraBase)))
}
if opts.ContextSize != 0 {
@@ -74,7 +79,27 @@ func (llm *LLM) Load(opts *pb.ModelOptions) error {
llamaOpts = append(llamaOpts, llama.EnabelLowVRAM)
}
if opts.DraftModel != "" {
// https://github.com/ggerganov/llama.cpp/blob/71ca2fad7d6c0ef95ef9944fb3a1a843e481f314/examples/speculative/speculative.cpp#L40
llamaOpts = append(llamaOpts, llama.SetPerplexity(true))
}
model, err := llama.New(opts.ModelFile, llamaOpts...)
if opts.DraftModel != "" {
// opts.DraftModel is relative to opts.ModelFile, so we need to get the basepath of opts.ModelFile
if !filepath.IsAbs(opts.DraftModel) {
dir := filepath.Dir(opts.ModelFile)
opts.DraftModel = filepath.Join(dir, opts.DraftModel)
}
draftModel, err := llama.New(opts.DraftModel, llamaOpts...)
if err != nil {
return err
}
llm.draftModel = draftModel
}
llm.llama = model
return err
@@ -158,6 +183,9 @@ func buildPredictOptions(opts *pb.PredictOptions) []llama.PredictOption {
predictOptions = append(predictOptions, llama.SetSeed(int(opts.Seed)))
}
if opts.NDraft != 0 {
predictOptions = append(predictOptions, llama.SetNDraft(int(opts.NDraft)))
}
//predictOptions = append(predictOptions, llama.SetLogitBias(c.Seed))
predictOptions = append(predictOptions, llama.SetFrequencyPenalty(opts.FrequencyPenalty))
@@ -171,6 +199,9 @@ func buildPredictOptions(opts *pb.PredictOptions) []llama.PredictOption {
}
func (llm *LLM) Predict(opts *pb.PredictOptions) (string, error) {
if llm.draftModel != nil {
return llm.llama.SpeculativeSampling(llm.draftModel, opts.Prompt, buildPredictOptions(opts)...)
}
return llm.llama.Predict(opts.Prompt, buildPredictOptions(opts)...)
}
@@ -183,7 +214,13 @@ func (llm *LLM) PredictStream(opts *pb.PredictOptions, results chan string) erro
}))
go func() {
_, err := llm.llama.Predict(opts.Prompt, predictOptions...)
var err error
if llm.draftModel != nil {
_, err = llm.llama.SpeculativeSampling(llm.draftModel, opts.Prompt, buildPredictOptions(opts)...)
} else {
_, err = llm.llama.Predict(opts.Prompt, predictOptions...)
}
if err != nil {
fmt.Println("err: ", err)
}

View File

@@ -1,12 +1,12 @@
package main
// GRPC Falcon server
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
tts "github.com/go-skynet/LocalAI/pkg/backend/tts"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
@@ -17,7 +17,7 @@ var (
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &tts.Piper{}); err != nil {
if err := grpc.StartServer(*addr, &LLM{}); err != nil {
panic(err)
}
}

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -0,0 +1,21 @@
package main
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &LLM{}); err != nil {
panic(err)
}
}

View File

@@ -1,4 +1,4 @@
package rwkv
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

View File

@@ -5,7 +5,7 @@ package main
import (
"flag"
transformers "github.com/go-skynet/LocalAI/pkg/backend/llm/transformers"
transformers "github.com/go-skynet/LocalAI/backend/go/llm/transformers"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)

View File

@@ -0,0 +1,21 @@
package main
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &Whisper{}); err != nil {
panic(err)
}
}

View File

@@ -1,4 +1,4 @@
package transcribe
package main
import (
"fmt"

View File

@@ -1,4 +1,4 @@
package transcribe
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

21
backend/go/tts/main.go Normal file
View File

@@ -0,0 +1,21 @@
package main
// Note: this is started internally by LocalAI and a server is allocated for each model
import (
"flag"
grpc "github.com/go-skynet/LocalAI/pkg/grpc"
)
var (
addr = flag.String("addr", "localhost:50051", "the address to connect to")
)
func main() {
flag.Parse()
if err := grpc.StartServer(*addr, &Piper{}); err != nil {
panic(err)
}
}

View File

@@ -1,4 +1,4 @@
package tts
package main
// This is a wrapper to statisfy the GRPC service interface
// It is meant to be used by the main executable that is the server for the specific backend type (falcon, gpt3, etc)

38
backend/python/README.md Normal file
View File

@@ -0,0 +1,38 @@
# Common commands about conda environment
## Create a new empty conda environment
```
conda create --name <env-name> python=<your version> -y
conda create --name autogptq python=3.11 -y
```
## To activate the environment
As of conda 4.4
```
conda activate autogptq
```
The conda version older than 4.4
```
source activate autogptq
```
## Install the packages to your environment
Sometimes you need to install the packages from the conda-forge channel
By using `conda`
```
conda install <your-package-name>
conda install -c conda-forge <your package-name>
```
Or by using `pip`
```
pip install <your-package-name>
```

View File

@@ -0,0 +1,5 @@
.PHONY: autogptq
autogptq:
@echo "Creating virtual environment..."
@conda env create --name autogptq --file autogptq.yml
@echo "Virtual environment created."

View File

@@ -0,0 +1,5 @@
# Creating a separate environment for the autogptq project
```
make autogptq
```

View File

@@ -1,20 +1,23 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from pathlib import Path
import time
import grpc
import backend_pb2
import backend_pb2_grpc
from auto_gptq import AutoGPTQForCausalLM
from transformers import AutoTokenizer
from transformers import TextGenerationPipeline
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
def Health(self, request, context):
@@ -77,7 +80,7 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=1))
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()

View File

@@ -0,0 +1,86 @@
name: autogptq
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h7b6447c_0
- ca-certificates=2023.08.22=h06a4308_0
- ld_impl_linux-64=2.38=h1181459_1
- libffi=3.4.4=h6a678d5_0
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.11=h7f8727e_2
- pip=23.2.1=py311h06a4308_0
- python=3.11.5=h955ad1f_0
- readline=8.2=h5eee18b_0
- setuptools=68.0.0=py311h06a4308_0
- sqlite=3.41.2=h5eee18b_0
- tk=8.6.12=h1ccaba5_0
- wheel=0.41.2=py311h06a4308_0
- xz=5.4.2=h5eee18b_0
- zlib=1.2.13=h5eee18b_0
- pip:
- accelerate==0.23.0
- aiohttp==3.8.5
- aiosignal==1.3.1
- async-timeout==4.0.3
- attrs==23.1.0
- auto-gptq==0.4.2
- certifi==2023.7.22
- charset-normalizer==3.3.0
- datasets==2.14.5
- dill==0.3.7
- filelock==3.12.4
- frozenlist==1.4.0
- fsspec==2023.6.0
- grpcio==1.59.0
- huggingface-hub==0.16.4
- idna==3.4
- jinja2==3.1.2
- markupsafe==2.1.3
- mpmath==1.3.0
- multidict==6.0.4
- multiprocess==0.70.15
- networkx==3.1
- numpy==1.26.0
- nvidia-cublas-cu12==12.1.3.1
- nvidia-cuda-cupti-cu12==12.1.105
- nvidia-cuda-nvrtc-cu12==12.1.105
- nvidia-cuda-runtime-cu12==12.1.105
- nvidia-cudnn-cu12==8.9.2.26
- nvidia-cufft-cu12==11.0.2.54
- nvidia-curand-cu12==10.3.2.106
- nvidia-cusolver-cu12==11.4.5.107
- nvidia-cusparse-cu12==12.1.0.106
- nvidia-nccl-cu12==2.18.1
- nvidia-nvjitlink-cu12==12.2.140
- nvidia-nvtx-cu12==12.1.105
- packaging==23.2
- pandas==2.1.1
- peft==0.5.0
- protobuf==4.24.4
- psutil==5.9.5
- pyarrow==13.0.0
- python-dateutil==2.8.2
- pytz==2023.3.post1
- pyyaml==6.0.1
- regex==2023.10.3
- requests==2.31.0
- rouge==1.0.1
- safetensors==0.3.3
- six==1.16.0
- sympy==1.12
- tokenizers==0.14.0
- torch==2.1.0
- tqdm==4.66.1
- transformers==4.34.0
- triton==2.1.0
- typing-extensions==4.8.0
- tzdata==2023.3
- urllib3==2.0.6
- xxhash==3.4.1
- yarl==1.9.2

View File

File diff suppressed because one or more lines are too long

14
backend/python/autogptq/run.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/bash
##
## A bash script wrapper that runs the autogptq server with conda
export PATH=$PATH:/opt/conda/bin
# Activate conda environment
source activate autogptq
# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
python $DIR/autogptq.py $@

View File

@@ -0,0 +1,5 @@
.PHONY: ttsbark
ttsbark:
@echo "Creating virtual environment..."
@conda env create --name ttsbark --file ttsbark.yml
@echo "Virtual environment created."

View File

@@ -0,0 +1,16 @@
# Creating a separate environment for ttsbark project
```
make ttsbark
```
# Testing the gRPC server
```
<The path of your python interpreter> -m unittest test_ttsbark.py
```
For example
```
/opt/conda/envs/bark/bin/python -m unittest extra/grpc/bark/test_ttsbark.py
``````

View File

File diff suppressed because one or more lines are too long

14
backend/python/bark/run.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/bash
##
## A bash script wrapper that runs the ttsbark server with conda
export PATH=$PATH:/opt/conda/bin
# Activate conda environment
source activate ttsbark
# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
python $DIR/ttsbark.py $@

View File

@@ -0,0 +1,32 @@
import unittest
import subprocess
import time
import backend_pb2
import backend_pb2_grpc
import grpc
class TestBackendServicer(unittest.TestCase):
"""
TestBackendServicer is the class that tests the gRPC service
"""
def setUp(self):
self.service = subprocess.Popen(["python3", "ttsbark.py", "--addr", "localhost:50051"])
def tearDown(self) -> None:
self.service.terminate()
self.service.wait()
def test_server_startup(self):
time.sleep(2)
try:
self.setUp()
with grpc.insecure_channel("localhost:50051") as channel:
stub = backend_pb2_grpc.BackendStub(channel)
response = stub.Health(backend_pb2.HealthMessage())
self.assertEqual(response.message, b'OK')
except Exception as err:
print(err)
self.fail("Server failed to start")
finally:
self.tearDown()

View File

@@ -1,22 +1,32 @@
#!/usr/bin/env python3
import grpc
"""
This is an extra gRPC server of LocalAI for Bark TTS
"""
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
import signal
import sys
import os
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from pathlib import Path
from bark import SAMPLE_RATE, generate_audio, preload_models
from scipy.io.wavfile import write as write_wav
import backend_pb2
import backend_pb2_grpc
from bark import SAMPLE_RATE, generate_audio, preload_models
import grpc
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# Implement the BackendServicer class with the service methods
class BackendServicer(backend_pb2_grpc.BackendServicer):
"""
BackendServicer is the class that implements the gRPC service
"""
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", 'utf-8'))
def LoadModel(self, request, context):
@@ -51,7 +61,7 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
return backend_pb2.Result(success=True)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=1))
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
@@ -80,4 +90,4 @@ if __name__ == "__main__":
)
args = parser.parse_args()
serve(args.addr)
serve(args.addr)

View File

@@ -0,0 +1,96 @@
name: bark
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h7b6447c_0
- ca-certificates=2023.08.22=h06a4308_0
- ld_impl_linux-64=2.38=h1181459_1
- libffi=3.4.4=h6a678d5_0
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.11=h7f8727e_2
- pip=23.2.1=py311h06a4308_0
- python=3.11.5=h955ad1f_0
- readline=8.2=h5eee18b_0
- setuptools=68.0.0=py311h06a4308_0
- sqlite=3.41.2=h5eee18b_0
- tk=8.6.12=h1ccaba5_0
- wheel=0.41.2=py311h06a4308_0
- xz=5.4.2=h5eee18b_0
- zlib=1.2.13=h5eee18b_0
- pip:
- accelerate==0.23.0
- aiohttp==3.8.5
- aiosignal==1.3.1
- async-timeout==4.0.3
- attrs==23.1.0
- bark==0.1.5
- boto3==1.28.61
- botocore==1.31.61
- certifi==2023.7.22
- charset-normalizer==3.3.0
- datasets==2.14.5
- dill==0.3.7
- einops==0.7.0
- encodec==0.1.1
- filelock==3.12.4
- frozenlist==1.4.0
- fsspec==2023.6.0
- funcy==2.0
- grpcio==1.59.0
- huggingface-hub==0.16.4
- idna==3.4
- jinja2==3.1.2
- jmespath==1.0.1
- markupsafe==2.1.3
- mpmath==1.3.0
- multidict==6.0.4
- multiprocess==0.70.15
- networkx==3.1
- numpy==1.26.0
- nvidia-cublas-cu12==12.1.3.1
- nvidia-cuda-cupti-cu12==12.1.105
- nvidia-cuda-nvrtc-cu12==12.1.105
- nvidia-cuda-runtime-cu12==12.1.105
- nvidia-cudnn-cu12==8.9.2.26
- nvidia-cufft-cu12==11.0.2.54
- nvidia-curand-cu12==10.3.2.106
- nvidia-cusolver-cu12==11.4.5.107
- nvidia-cusparse-cu12==12.1.0.106
- nvidia-nccl-cu12==2.18.1
- nvidia-nvjitlink-cu12==12.2.140
- nvidia-nvtx-cu12==12.1.105
- packaging==23.2
- pandas==2.1.1
- peft==0.5.0
- protobuf==4.24.4
- psutil==5.9.5
- pyarrow==13.0.0
- python-dateutil==2.8.2
- pytz==2023.3.post1
- pyyaml==6.0.1
- regex==2023.10.3
- requests==2.31.0
- rouge==1.0.1
- s3transfer==0.7.0
- safetensors==0.3.3
- scipy==1.11.3
- six==1.16.0
- sympy==1.12
- tokenizers==0.14.0
- torch==2.1.0
- torchaudio==2.1.0
- tqdm==4.66.1
- transformers==4.34.0
- triton==2.1.0
- typing-extensions==4.8.0
- tzdata==2023.3
- urllib3==1.26.17
- xxhash==3.4.1
- yarl==1.9.2
prefix: /opt/conda/envs/bark

View File

@@ -0,0 +1,11 @@
.PHONY: diffusers
diffusers:
@echo "Creating virtual environment..."
@conda env create --name diffusers --file diffusers.yml
@echo "Virtual environment created."
.PHONY: run
run:
@echo "Running diffusers..."
bash run.sh
@echo "Diffusers run."

View File

@@ -0,0 +1,5 @@
# Creating a separate environment for the diffusers project
```
make diffusers
```

View File

@@ -1,30 +1,39 @@
#!/usr/bin/env python3
import grpc
from concurrent import futures
import time
import backend_pb2
import backend_pb2_grpc
import argparse
from collections import defaultdict
from enum import Enum
import signal
import sys
import time
import os
# import diffusers
import torch
from torch import autocast
from diffusers import StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, EulerAncestralDiscreteScheduler
from diffusers.pipelines.stable_diffusion import safety_checker
from compel import Compel
from PIL import Image
from io import BytesIO
import torch
import backend_pb2
import backend_pb2_grpc
import grpc
from diffusers import StableDiffusionXLPipeline, StableDiffusionDepth2ImgPipeline, DPMSolverMultistepScheduler, StableDiffusionPipeline, DiffusionPipeline, EulerAncestralDiscreteScheduler
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.pipelines.stable_diffusion import safety_checker
from compel import Compel
from transformers import CLIPTextModel
from enum import Enum
from safetensors.torch import load_file
_ONE_DAY_IN_SECONDS = 60 * 60 * 24
COMPEL=os.environ.get("COMPEL", "1") == "1"
CLIPSKIP=os.environ.get("CLIPSKIP", "1") == "1"
# If MAX_WORKERS are specified in the environment use it, otherwise default to 1
MAX_WORKERS = int(os.environ.get('PYTHON_GRPC_MAX_WORKERS', '1'))
# https://github.com/CompVis/stable-diffusion/issues/239#issuecomment-1627615287
def sc(self, clip_input, images) : return images, [False for i in images]
# edit the StableDiffusionSafetyChecker class so that, when called, it just returns the images and an array of True values
@@ -213,11 +222,83 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
self.compel = Compel(tokenizer=self.pipe.tokenizer, text_encoder=self.pipe.text_encoder)
if request.CUDA:
self.pipe.to('cuda')
# Assume directory from request.ModelFile.
# Only if request.LoraAdapter it's not an absolute path
if request.LoraAdapter and request.ModelFile != "" and not os.path.isabs(request.LoraAdapter) and request.LoraAdapter:
# get base path of modelFile
modelFileBase = os.path.dirname(request.ModelFile)
# modify LoraAdapter to be relative to modelFileBase
request.LoraAdapter = os.path.join(modelFileBase, request.LoraAdapter)
device = "cpu" if not request.CUDA else "cuda"
self.device = device
if request.LoraAdapter:
# Check if its a local file and not a directory ( we load lora differently for a safetensor file )
if os.path.exists(request.LoraAdapter) and not os.path.isdir(request.LoraAdapter):
self.load_lora_weights(request.LoraAdapter, 1, device, torchType)
else:
self.pipe.unet.load_attn_procs(request.LoraAdapter)
except Exception as err:
return backend_pb2.Result(success=False, message=f"Unexpected {err=}, {type(err)=}")
# Implement your logic here for the LoadModel service
# Replace this with your desired response
return backend_pb2.Result(message="Model loaded successfully", success=True)
# https://github.com/huggingface/diffusers/issues/3064
def load_lora_weights(self, checkpoint_path, multiplier, device, dtype):
LORA_PREFIX_UNET = "lora_unet"
LORA_PREFIX_TEXT_ENCODER = "lora_te"
# load LoRA weight from .safetensors
state_dict = load_file(checkpoint_path, device=device)
updates = defaultdict(dict)
for key, value in state_dict.items():
# it is suggested to print out the key, it usually will be something like below
# "lora_te_text_model_encoder_layers_0_self_attn_k_proj.lora_down.weight"
layer, elem = key.split('.', 1)
updates[layer][elem] = value
# directly update weight in diffusers model
for layer, elems in updates.items():
if "text" in layer:
layer_infos = layer.split(LORA_PREFIX_TEXT_ENCODER + "_")[-1].split("_")
curr_layer = self.pipe.text_encoder
else:
layer_infos = layer.split(LORA_PREFIX_UNET + "_")[-1].split("_")
curr_layer = self.pipe.unet
# find the target layer
temp_name = layer_infos.pop(0)
while len(layer_infos) > -1:
try:
curr_layer = curr_layer.__getattr__(temp_name)
if len(layer_infos) > 0:
temp_name = layer_infos.pop(0)
elif len(layer_infos) == 0:
break
except Exception:
if len(temp_name) > 0:
temp_name += "_" + layer_infos.pop(0)
else:
temp_name = layer_infos.pop(0)
# get elements for this layer
weight_up = elems['lora_up.weight'].to(dtype)
weight_down = elems['lora_down.weight'].to(dtype)
alpha = elems['alpha'] if 'alpha' in elems else None
if alpha:
alpha = alpha.item() / weight_up.shape[1]
else:
alpha = 1.0
# update weight
if len(weight_up.shape) == 4:
curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up.squeeze(3).squeeze(2), weight_down.squeeze(3).squeeze(2)).unsqueeze(2).unsqueeze(3)
else:
curr_layer.weight.data += multiplier * alpha * torch.mm(weight_up, weight_down)
def GenerateImage(self, request, context):
prompt = request.positive_prompt
@@ -246,6 +327,12 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
# create a dictionary of parameters by using the keys from EnableParameters and the values from defaults
kwargs = {key: options[key] for key in keys}
# Set seed
if request.seed > 0:
kwargs["generator"] = torch.Generator(device=self.device).manual_seed(
request.seed
)
image = {}
if COMPEL:
conditioning = self.compel.build_conditioning_tensor(prompt)
@@ -267,7 +354,7 @@ class BackendServicer(backend_pb2_grpc.BackendServicer):
return backend_pb2.Result(message="Model loaded successfully", success=True)
def serve(address):
server = grpc.server(futures.ThreadPoolExecutor(max_workers=1))
server = grpc.server(futures.ThreadPoolExecutor(max_workers=MAX_WORKERS))
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()

View File

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,74 @@
name: diffusers
channels:
- defaults
dependencies:
- _libgcc_mutex=0.1=main
- _openmp_mutex=5.1=1_gnu
- bzip2=1.0.8=h7b6447c_0
- ca-certificates=2023.08.22=h06a4308_0
- ld_impl_linux-64=2.38=h1181459_1
- libffi=3.4.4=h6a678d5_0
- libgcc-ng=11.2.0=h1234567_1
- libgomp=11.2.0=h1234567_1
- libstdcxx-ng=11.2.0=h1234567_1
- libuuid=1.41.5=h5eee18b_0
- ncurses=6.4=h6a678d5_0
- openssl=3.0.11=h7f8727e_2
- pip=23.2.1=py311h06a4308_0
- python=3.11.5=h955ad1f_0
- readline=8.2=h5eee18b_0
- setuptools=68.0.0=py311h06a4308_0
- sqlite=3.41.2=h5eee18b_0
- tk=8.6.12=h1ccaba5_0
- tzdata=2023c=h04d1e81_0
- wheel=0.41.2=py311h06a4308_0
- xz=5.4.2=h5eee18b_0
- zlib=1.2.13=h5eee18b_0
- pip:
- accelerate==0.23.0
- certifi==2023.7.22
- charset-normalizer==3.3.0
- compel==2.0.2
- diffusers==0.21.4
- filelock==3.12.4
- fsspec==2023.9.2
- grpcio==1.59.0
- huggingface-hub==0.17.3
- idna==3.4
- importlib-metadata==6.8.0
- jinja2==3.1.2
- markupsafe==2.1.3
- mpmath==1.3.0
- networkx==3.1
- numpy==1.26.0
- nvidia-cublas-cu12==12.1.3.1
- nvidia-cuda-cupti-cu12==12.1.105
- nvidia-cuda-nvrtc-cu12==12.1.105
- nvidia-cuda-runtime-cu12==12.1.105
- nvidia-cudnn-cu12==8.9.2.26
- nvidia-cufft-cu12==11.0.2.54
- nvidia-curand-cu12==10.3.2.106
- nvidia-cusolver-cu12==11.4.5.107
- nvidia-cusparse-cu12==12.1.0.106
- nvidia-nccl-cu12==2.18.1
- nvidia-nvjitlink-cu12==12.2.140
- nvidia-nvtx-cu12==12.1.105
- packaging==23.2
- pillow==10.0.1
- protobuf==4.24.4
- psutil==5.9.5
- pyparsing==3.1.1
- pyyaml==6.0.1
- regex==2023.10.3
- requests==2.31.0
- safetensors==0.4.0
- sympy==1.12
- tokenizers==0.14.1
- torch==2.1.0
- tqdm==4.66.1
- transformers==4.34.0
- triton==2.1.0
- typing-extensions==4.8.0
- urllib3==2.0.6
- zipp==3.17.0
prefix: /opt/conda/envs/diffusers

14
backend/python/diffusers/run.sh Executable file
View File

@@ -0,0 +1,14 @@
#!/bin/bash
##
## A bash script wrapper that runs the diffusers server with conda
export PATH=$PATH:/opt/conda/bin
# Activate conda environment
source activate diffusers
# get the directory where the bash script is located
DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"
python $DIR/backend_diffusers.py $@

View File

@@ -0,0 +1,11 @@
.PHONY: exllama
exllama:
@echo "Creating virtual environment..."
@conda env create --name exllama --file exllama.yml
@echo "Virtual environment created."
.PHONY: run
run:
@echo "Running exllama..."
bash run.sh
@echo "exllama run."

View File

@@ -0,0 +1,5 @@
# Creating a separate environment for the exllama project
```
make exllama
```

Some files were not shown because too many files have changed in this diff Show More