Compare commits

..

20 Commits

Author SHA1 Message Date
LocalAI [bot]
2f297979a7 ⬆️ Update ggerganov/llama.cpp (#2587)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-17 15:28:19 +00:00
Ettore Di Giacinto
2437a2769d models(gallery): add gemma-1.1-7b-it (#2588)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-17 14:13:27 +02:00
Ettore Di Giacinto
b58b7cad94 models(gallery): add samantha-qwen2 (#2586)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-17 10:08:29 +02:00
LocalAI [bot]
68148f2a1a ⬆️ Update ggerganov/llama.cpp (#2584)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-17 00:18:44 +00:00
Ettore Di Giacinto
4897eb0ba2 ci: pack less libs inside the binary (#2579)
The binary grew up to 1.8GB quickly - rocm at least raises +800MB by
itself - so we might just want to manage the GPU libs separately.

Adds a comment to list all the libraries found so far that we are
depending on, but will likely follow up in a way to bundle these
separately.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 22:10:28 +02:00
Ettore Di Giacinto
1b43966c48 Update README.md
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-16 20:27:37 +02:00
Ettore Di Giacinto
c5f2f11503 models(gallery): add hathor_stable-v0.2-l3-8b (#2582)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 20:24:36 +02:00
Ettore Di Giacinto
895443d1b5 models(gallery): add tess-v2.5-phi-3-medium-128k-14b (#2581)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 20:22:08 +02:00
Ettore Di Giacinto
6a0802e8e6 models(gallery): add dolphin-qwen (#2580)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 20:11:21 +02:00
Ettore Di Giacinto
94cfaad7f4 feat(libpath): refactor and expose functions for external library paths (#2578)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-16 13:58:28 +02:00
Ettore Di Giacinto
ac4a94dd44 feat(build): bundle libs for arm64 and x86 linux binaries (#2572)
This PR bundles further libs into the arm64 and x86_64 binaries

This can be improved by a lot - it's far from perfect, however in this PR I wanted to collect the required libs, and give a simple baseline to improve later upon. It is quite challenging to do this exercise with CI only - but it's the fastest way I see now. 

I hope that after the list is initially built we can further improve this down the line and remove some of the technical debt left here to speedup things and do not get stuck in the middle of CI cycles.

In this PR:

- The x86_64 binary now bundles hipblas, nvidia and intel libraries too to avoid any dependency to be installed in the host
- Similarly, for the arm64 we now bundle all the required assets

## What's left

We should be also able to cross-compile Nvidia for arm64 - however I didn't succeed so far so I've left that open. Similarly I might have missed some libraries, but we will see with bug reports and testing around with the new binaries. I've tested on my arm64 board and I could finally start things up.

An open point still is shipping libraries for e.g. tts and stablediffusion. this is not done yet, however with the same methodology we should be able to extend support also for these two backends in the binary.
2024-06-16 09:10:44 +02:00
LocalAI [bot]
58bf8614d9 ⬆️ Update ggerganov/llama.cpp (#2575)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-15 23:45:10 +00:00
Ettore Di Giacinto
3764e50b35 models(gallery): add firefly-gemma-7b (#2576)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-15 23:07:20 +02:00
Nate Harris
3f464d2d9e Fix standard image latest Docker tags (#2574)
- Fix standard image latest Docker tags

Signed-off-by: Nate Harris <nwithan8@users.noreply.github.com>
2024-06-15 22:08:30 +02:00
LocalAI [bot]
5116d561e1 ⬆️ Update ggerganov/llama.cpp (#2570)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-14 23:39:20 +00:00
Ettore Di Giacinto
96a7a3b59f fix(Makefile): enable STATIC on dist (#2569)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-14 12:28:46 +02:00
Ettore Di Giacinto
112d0ffa45 feat(darwin): embed grpc libs (#2567)
* debug

* feat(makefile): allow to bundle libs into binary

* ci: bundle protobuf into single-binary

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* ci: tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(assets): correctly reference extract folder

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* bundle also abseil

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* bundle more libs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2024-06-14 08:51:25 +02:00
LocalAI [bot]
25f45827ab ⬆️ Update ggerganov/whisper.cpp (#2565)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-14 00:26:51 +00:00
LocalAI [bot]
f322f7c62d ⬆️ Update ggerganov/llama.cpp (#2564)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2024-06-13 23:47:50 +00:00
Ettore Di Giacinto
06351cbbb4 feat(binary): support extracted bundled libs on darwin (#2563)
When offering fallback libs, use the proper env var for darwin

Note: this does not include the libraries itself, but only sets the
proper env var for the libs to be picked up on darwin.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2024-06-13 22:59:42 +02:00
10 changed files with 219 additions and 36 deletions

View File

@@ -100,7 +100,13 @@ jobs:
go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.0
export PATH=$PATH:$GOPATH/bin
export PATH=/usr/local/cuda/bin:$PATH
GO_TAGS=p2p GOOS=linux GOARCH=arm64 CMAKE_ARGS="-DProtobuf_INCLUDE_DIRS=$CROSS_STAGING_PREFIX/include -DProtobuf_DIR=$CROSS_STAGING_PREFIX/lib/cmake/protobuf -DgRPC_DIR=$CROSS_STAGING_PREFIX/lib/cmake/grpc -DCMAKE_TOOLCHAIN_FILE=$CMAKE_CROSS_TOOLCHAIN -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++" make dist-cross-linux-arm64
sudo rm -rf /usr/aarch64-linux-gnu/lib/libstdc++.so.6
sudo cp -rf /usr/aarch64-linux-gnu/lib/libstdc++.so* /usr/aarch64-linux-gnu/lib/libstdc++.so.6
GO_TAGS=p2p \
BACKEND_LIBS="./grpc/cmake/cross_build/third_party/re2/libre2.a ./grpc/cmake/cross_build/libgrpc.a ./grpc/cmake/cross_build/libgrpc++.a ./grpc/cmake/cross_build/third_party/protobuf/libprotobuf.a /usr/aarch64-linux-gnu/lib/libc.so.6 /usr/aarch64-linux-gnu/lib/libstdc++.so.6 /usr/aarch64-linux-gnu/lib/libgomp.so.1 /usr/aarch64-linux-gnu/lib/libm.so.6 /usr/aarch64-linux-gnu/lib/libgcc_s.so.1 /usr/aarch64-linux-gnu/lib/libdl.so.2 /usr/aarch64-linux-gnu/lib/libpthread.so.0" \
GOOS=linux \
GOARCH=arm64 \
CMAKE_ARGS="-DProtobuf_INCLUDE_DIRS=$CROSS_STAGING_PREFIX/include -DProtobuf_DIR=$CROSS_STAGING_PREFIX/lib/cmake/protobuf -DgRPC_DIR=$CROSS_STAGING_PREFIX/lib/cmake/grpc -DCMAKE_TOOLCHAIN_FILE=$CMAKE_CROSS_TOOLCHAIN -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++" make dist-cross-linux-arm64
- uses: actions/upload-artifact@v4
with:
name: LocalAI-linux-arm64
@@ -111,7 +117,13 @@ jobs:
with:
files: |
release/*
- name: Setup tmate session if tests fail
if: ${{ failure() }}
uses: mxschmitt/action-tmate@v3.18
with:
detached: true
connect-timeout-seconds: 180
limit-access-to-actor: true
build-linux:
runs-on: arc-runner-set
steps:
@@ -190,6 +202,7 @@ jobs:
- name: Install gRPC
run: |
cd grpc && cd cmake/build && sudo make --jobs 5 --output-sync=target install
# BACKEND_LIBS needed for gpu-workload: /opt/intel/oneapi/*/lib/libiomp5.so /opt/intel/oneapi/*/lib/libmkl_core.so /opt/intel/oneapi/*/lib/libmkl_core.so.2 /opt/intel/oneapi/*/lib/libmkl_intel_ilp64.so /opt/intel/oneapi/*/lib/libmkl_intel_ilp64.so.2 /opt/intel/oneapi/*/lib/libmkl_sycl_blas.so /opt/intel/oneapi/*/lib/libmkl_sycl_blas.so.4 /opt/intel/oneapi/*/lib/libmkl_tbb_thread.so /opt/intel/oneapi/*/lib/libmkl_tbb_thread.so.2 /opt/intel/oneapi/*/lib/libsycl.so /opt/intel/oneapi/*/lib/libsycl.so.7 /opt/intel/oneapi/*/lib/libsycl.so.7.1.0 /opt/rocm-*/lib/libamdhip64.so /opt/rocm-*/lib/libamdhip64.so.5 /opt/rocm-*/lib/libamdhip64.so.6 /opt/rocm-*/lib/libamdhip64.so.6.1.60100 /opt/rocm-*/lib/libhipblas.so /opt/rocm-*/lib/libhipblas.so.2 /opt/rocm-*/lib/libhipblas.so.2.1.60100 /opt/rocm-*/lib/librocblas.so /opt/rocm-*/lib/librocblas.so.4 /opt/rocm-*/lib/librocblas.so.4.1.60100 /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/x86_64-linux-gnu/libm.so.6 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 /usr/lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/local/cuda-*/targets/x86_64-linux/lib/libcublas.so /usr/local/cuda-*/targets/x86_64-linux/lib/libcublasLt.so /usr/local/cuda-*/targets/x86_64-linux/lib/libcudart.so /usr/local/cuda-*/targets/x86_64-linux/lib/stubs/libcuda.so
- name: Build
id: build
run: |
@@ -199,7 +212,9 @@ jobs:
export PATH=/usr/local/cuda/bin:$PATH
export PATH=/opt/rocm/bin:$PATH
source /opt/intel/oneapi/setvars.sh
GO_TAGS=p2p make -j4 dist
GO_TAGS=p2p \
BACKEND_LIBS="/usr/lib/x86_64-linux-gnu/libstdc++.so.6 /usr/lib/x86_64-linux-gnu/libm.so.6 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 /usr/lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libgomp.so.1" \
make -j4 dist
- uses: actions/upload-artifact@v4
with:
name: LocalAI-linux
@@ -210,7 +225,13 @@ jobs:
with:
files: |
release/*
- name: Setup tmate session if tests fail
if: ${{ failure() }}
uses: mxschmitt/action-tmate@v3.18
with:
detached: true
connect-timeout-seconds: 180
limit-access-to-actor: true
build-stablediffusion:
runs-on: ubuntu-latest
steps:
@@ -257,10 +278,6 @@ jobs:
with:
go-version: '1.21.x'
cache: false
- name: Setup tmate session if tests fail
uses: mxschmitt/action-tmate@v3.18
with:
limit-access-to-actor: true
- name: Dependencies
run: |
brew install protobuf grpc
@@ -272,7 +289,8 @@ jobs:
export C_INCLUDE_PATH=/usr/local/include
export CPLUS_INCLUDE_PATH=/usr/local/include
export PATH=$PATH:$GOPATH/bin
GO_TAGS=p2p make dist
BACKEND_LIBS="$(ls /opt/homebrew/opt/grpc/lib/*.dylib /opt/homebrew/opt/re2/lib/*.dylib /opt/homebrew/opt/openssl@3/lib/*.dylib /opt/homebrew/opt/protobuf/lib/*.dylib /opt/homebrew/opt/abseil/lib/*.dylib | xargs)" GO_TAGS=p2p make dist
- uses: actions/upload-artifact@v4
with:
name: LocalAI-MacOS-arm64
@@ -283,3 +301,10 @@ jobs:
with:
files: |
release/*
- name: Setup tmate session if tests fail
if: ${{ failure() }}
uses: mxschmitt/action-tmate@v3.18
with:
detached: true
connect-timeout-seconds: 180
limit-access-to-actor: true

View File

@@ -5,7 +5,7 @@ BINARY_NAME=local-ai
# llama.cpp versions
GOLLAMA_STABLE_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
CPPLLAMA_VERSION?=963552903f51043ee947a8deeaaa7ec00bc3f1a4
CPPLLAMA_VERSION?=21be9cab94e0b5b53cb6edeeebf8c8c799baad03
# gpt4all version
GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
@@ -16,7 +16,7 @@ RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp
RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6
# whisper.cpp version
WHISPER_CPP_VERSION?=420b6abc54008ab634f5887dc45bd77122c2f320
WHISPER_CPP_VERSION?=b29b3b29240aac8b71ce8e5a4360c1f1562ad66f
# bert.cpp version
BERT_VERSION?=710044b124545415f555e4260d16b146c725a6e4
@@ -313,6 +313,10 @@ build: prepare backend-assets grpcs ## Build the project
$(info ${GREEN}I BUILD_TYPE: ${YELLOW}$(BUILD_TYPE)${RESET})
$(info ${GREEN}I GO_TAGS: ${YELLOW}$(GO_TAGS)${RESET})
$(info ${GREEN}I LD_FLAGS: ${YELLOW}$(LD_FLAGS)${RESET})
ifneq ($(BACKEND_LIBS),)
$(MAKE) backend-assets/lib
cp -r $(BACKEND_LIBS) backend-assets/lib/
endif
CGO_LDFLAGS="$(CGO_LDFLAGS)" $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o $(BINARY_NAME) ./
build-minimal:
@@ -321,8 +325,11 @@ build-minimal:
build-api:
BUILD_GRPC_FOR_BACKEND_LLAMA=true BUILD_API_ONLY=true GO_TAGS=none $(MAKE) build
backend-assets/lib:
mkdir -p backend-assets/lib
dist:
STATIC=true $(MAKE) backend-assets/grpc/llama-cpp-avx2
$(MAKE) backend-assets/grpc/llama-cpp-avx2
ifeq ($(OS),Darwin)
$(info ${GREEN}I Skip CUDA/hipblas build on MacOS${RESET})
else
@@ -331,7 +338,7 @@ else
$(MAKE) backend-assets/grpc/llama-cpp-sycl_f16
$(MAKE) backend-assets/grpc/llama-cpp-sycl_f32
endif
$(MAKE) build
STATIC=true $(MAKE) build
mkdir -p release
# if BUILD_ID is empty, then we don't append it to the binary name
ifeq ($(BUILD_ID),)
@@ -344,7 +351,7 @@ endif
dist-cross-linux-arm64:
CMAKE_ARGS="$(CMAKE_ARGS) -DLLAMA_NATIVE=off" GRPC_BACKENDS="backend-assets/grpc/llama-cpp-fallback backend-assets/grpc/llama-cpp-grpc backend-assets/util/llama-cpp-rpc-server" \
$(MAKE) build
STATIC=true $(MAKE) build
mkdir -p release
# if BUILD_ID is empty, then we don't append it to the binary name
ifeq ($(BUILD_ID),)

View File

@@ -65,6 +65,7 @@ docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu
[Roadmap](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap)
- 🆕 You can browse now the model gallery without LocalAI! Check out https://models.localai.io
- 🔥🔥 Decentralized llama.cpp: https://github.com/mudler/LocalAI/pull/2343 (peer2peer llama.cpp!) 👉 Docs https://localai.io/features/distribute/
- 🔥🔥 Openvoice: https://github.com/mudler/LocalAI/pull/2334
- 🆕 Function calls without grammars and mixed mode: https://github.com/mudler/LocalAI/pull/2328

View File

@@ -43,6 +43,7 @@ type RunCMD struct {
Address string `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"`
CORS bool `env:"LOCALAI_CORS,CORS" help:"" group:"api"`
CORSAllowOrigins string `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"`
LibraryPath string `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (for e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"`
CSRF bool `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"`
UploadLimit int `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"`
APIKeys []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. When this is set, all the requests must be authenticated with one of these API keys" group:"api"`
@@ -80,6 +81,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
config.WithCors(r.CORS),
config.WithCorsAllowOrigins(r.CORSAllowOrigins),
config.WithCsrf(r.CSRF),
config.WithLibPath(r.LibraryPath),
config.WithThreads(r.Threads),
config.WithBackendAssets(ctx.BackendAssets),
config.WithBackendAssetsOutput(r.BackendAssetsPath),

View File

@@ -15,6 +15,7 @@ type ApplicationConfig struct {
Context context.Context
ConfigFile string
ModelPath string
LibPath string
UploadLimitMB, Threads, ContextSize int
DisableWebUI bool
F16 bool
@@ -101,6 +102,12 @@ func WithModelLibraryURL(url string) AppOption {
}
}
func WithLibPath(path string) AppOption {
return func(o *ApplicationConfig) {
o.LibPath = path
}
}
var EnableWatchDog = func(o *ApplicationConfig) {
o.WatchDog = true
}

View File

@@ -9,6 +9,7 @@ import (
"github.com/go-skynet/LocalAI/core/services"
"github.com/go-skynet/LocalAI/internal"
"github.com/go-skynet/LocalAI/pkg/assets"
"github.com/go-skynet/LocalAI/pkg/library"
"github.com/go-skynet/LocalAI/pkg/model"
pkgStartup "github.com/go-skynet/LocalAI/pkg/startup"
"github.com/go-skynet/LocalAI/pkg/xsysinfo"
@@ -109,6 +110,11 @@ func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.Mode
}
}
if options.LibPath != "" {
// If there is a lib directory, set LD_LIBRARY_PATH to include it
library.LoadExternal(options.LibPath)
}
// turn off any process that was started by GRPC if the context is canceled
go func() {
<-options.Context.Done()

View File

@@ -85,7 +85,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-----------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master` | `localai/localai:master` |
| Latest tag | `quay.io/go-skynet/local-ai:latest` | `localai/localai:latest` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-cpu` | `localai/localai:latest-cpu` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}` | `localai/localai:{{< version >}}` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-ffmpeg` | `localai/localai:{{< version >}}-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-ffmpeg-core` | `localai/localai:{{< version >}}-ffmpeg-core` |
@@ -97,7 +97,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-cublas-cuda11` | `localai/localai:master-cublas-cuda11` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-cublas-cuda11` | `localai/localai:latest-cublas-cuda11` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-11` | `localai/localai:latest-gpu-nvidia-cuda-11` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11` | `localai/localai:{{< version >}}-cublas-cuda11` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11-ffmpeg` | `localai/localai:{{< version >}}-cublas-cuda11-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11-ffmpeg-core` | `localai/localai:{{< version >}}-cublas-cuda11-ffmpeg-core` |
@@ -109,7 +109,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-cublas-cuda12` | `localai/localai:master-cublas-cuda12` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-cublas-cuda12` | `localai/localai:latest-cublas-cuda12` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-12` | `localai/localai:latest-gpu-nvidia-cuda-12` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12` | `localai/localai:{{< version >}}-cublas-cuda12` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12-ffmpeg` | `localai/localai:{{< version >}}-cublas-cuda12-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12-ffmpeg-core` | `localai/localai:{{< version >}}-cublas-cuda12-ffmpeg-core` |
@@ -121,7 +121,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-sycl-f16` | `localai/localai:master-sycl-f16` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-sycl-f16` | `localai/localai:latest-sycl-f16` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f16` | `localai/localai:latest-gpu-intel-f16` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16` | `localai/localai:{{< version >}}-sycl-f16` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16-ffmpeg` | `localai/localai:{{< version >}}-sycl-f16-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16-ffmpeg-core` | `localai/localai:{{< version >}}-sycl-f16-ffmpeg-core` |
@@ -133,7 +133,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-sycl-f32` | `localai/localai:master-sycl-f32` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-sycl-f32` | `localai/localai:latest-sycl-f32` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f32` | `localai/localai:latest-gpu-intel-f32` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32` | `localai/localai:{{< version >}}-sycl-f32` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32-ffmpeg` | `localai/localai:{{< version >}}-sycl-f32-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32-ffmpeg-core` | `localai/localai:{{< version >}}-sycl-f32-ffmpeg-core` |
@@ -145,7 +145,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-hipblas` | `localai/localai:master-hipblas` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-hipblas` | `localai/localai:latest-hipblas` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-hipblas` | `localai/localai:latest-gpu-hipblas` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-hipblas` | `localai/localai:{{< version >}}-hipblas` |
| Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-hipblas-ffmpeg` | `localai/localai:{{< version >}}-hipblas-ffmpeg` |
| Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-hipblas-ffmpeg-core` | `localai/localai:{{< version >}}-hipblas-ffmpeg-core` |

View File

@@ -22,6 +22,53 @@
- filename: Qwen2-7B-Instruct-Q4_K_M.gguf
sha256: 8d0d33f0d9110a04aad1711b1ca02dafc0fa658cd83028bdfa5eff89c294fe76
uri: huggingface://bartowski/Qwen2-7B-Instruct-GGUF/Qwen2-7B-Instruct-Q4_K_M.gguf
- !!merge <<: *qwen2
name: "dolphin-2.9.2-qwen2-72b"
icon: https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png
urls:
- https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf
description: |
Dolphin 2.9.2 Qwen2 72B 🐬
Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
overrides:
parameters:
model: dolphin-2.9.2-qwen2-Q4_K_M.gguf
files:
- filename: dolphin-2.9.2-qwen2-Q4_K_M.gguf
sha256: 44a0e82cbc2a201b2f4b9e16099a0a4d97b6f0099d45bcc5b354601f38dbb709
uri: huggingface://cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf/qwen2-Q4_K_M.gguf
- !!merge <<: *qwen2
name: "dolphin-2.9.2-qwen2-7b"
description: |
Dolphin 2.9.2 Qwen2 7B 🐬
Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
urls:
- https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-7b
- https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-7b-gguf
icon: https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png
overrides:
parameters:
model: dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
files:
- filename: dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
sha256: a15b5db4df6be4f4bfb3632b2009147332ef4c57875527f246b4718cb0d3af1f
uri: huggingface://cognitivecomputations/dolphin-2.9.2-qwen2-7b-gguf/dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
- !!merge <<: *qwen2
name: "samantha-qwen-2-7B"
description: |
Samantha based on qwen2
urls:
- https://huggingface.co/bartowski/Samantha-Qwen-2-7B-GGUF
- https://huggingface.co/macadeliccc/Samantha-Qwen2-7B
overrides:
parameters:
model: Samantha-Qwen-2-7B-Q4_K_M.gguf
files:
- filename: Samantha-Qwen-2-7B-Q4_K_M.gguf
sha256: 5d1cf1c35a7a46c536a96ba0417d08b9f9e09c24a4e25976f72ad55d4904f6fe
uri: huggingface://bartowski/Samantha-Qwen-2-7B-GGUF/Samantha-Qwen-2-7B-Q4_K_M.gguf
## START Mistral
- &mistral03
url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master"
@@ -176,6 +223,37 @@
- filename: gemma-2b.Q4_K_M.gguf
sha256: 37d50c21ef7847926204ad9b3007127d9a2722188cfd240ce7f9f7f041aa71a5
uri: huggingface://mlabonne/gemma-2b-GGUF/gemma-2b.Q4_K_M.gguf
- !!merge <<: *gemma
name: "firefly-gemma-7b-iq-imatrix"
icon: "https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/SrOekTxdpnxHyWWmMiAvc.jpeg"
urls:
- https://huggingface.co/Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix
- https://huggingface.co/YeungNLP/firefly-gemma-7b
description: |
firefly-gemma-7b is trained based on gemma-7b to act as a helpful and harmless AI assistant. We use Firefly to train the model on a single V100 GPU with QLoRA.
overrides:
parameters:
model: firefly-gemma-7b-Q4_K_S-imatrix.gguf
files:
- filename: firefly-gemma-7b-Q4_K_S-imatrix.gguf
sha256: 622e0b8e4f12203cc40c7f87915abf99498c2e0582203415ca236ea37643e428
uri: huggingface://Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix/firefly-gemma-7b-Q4_K_S-imatrix.gguf
- !!merge <<: *gemma
name: "gemma-1.1-7b-it"
urls:
- https://huggingface.co/bartowski/gemma-1.1-7b-it-GGUF
- https://huggingface.co/google/gemma-1.1-7b-it
description: |
This is Gemma 1.1 7B (IT), an update over the original instruction-tuned Gemma release.
Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality. We also fixed a bug in multi-turn conversations, and made sure that model responses don't always start with "Sure,".
overrides:
parameters:
model: gemma-1.1-7b-it-Q4_K_M.gguf
files:
- filename: gemma-1.1-7b-it-Q4_K_M.gguf
sha256: 47821da72ee9e80b6fd43c6190ad751b485fb61fa5664590f7a73246bcd8332e
uri: huggingface://bartowski/gemma-1.1-7b-it-GGUF/gemma-1.1-7b-it-Q4_K_M.gguf
- &llama3
url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master"
icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png
@@ -1055,6 +1133,21 @@
- filename: Tess-2.0-Llama-3-8B-Q4_K_M.gguf
sha256: 3b5fbd6c59d7d38205ab81970c0227c74693eb480acf20d8c2f211f62e3ca5f6
uri: huggingface://bartowski/Tess-2.0-Llama-3-8B-GGUF/Tess-2.0-Llama-3-8B-Q4_K_M.gguf
- !!merge <<: *llama3
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
name: "tess-v2.5-phi-3-medium-128k-14b"
urls:
- https://huggingface.co/bartowski/Tess-v2.5-Phi-3-medium-128k-14B-GGUF
icon: https://huggingface.co/migtissera/Tess-2.0-Mixtral-8x22B/resolve/main/Tess-2.png
description: |
Tess, short for Tesoro (Treasure in Italian), is a general purpose Large Language Model series.
overrides:
parameters:
model: Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
files:
- filename: Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
sha256: 9efb6ebc00de74012d0fb36134cce07d624a870fc12f38b16b57ce447b86e27e
uri: huggingface://bartowski/Tess-v2.5-Phi-3-medium-128k-14B-GGUF/Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
- !!merge <<: *llama3
name: "llama3-iterative-dpo-final"
urls:
@@ -1702,6 +1795,20 @@
- filename: Llama-3-Update-3.0-mmproj-model-f16.gguf
sha256: 3d2f36dff61d6157cadf102df86a808eb9f8a230be1bc0bc99039d81a895468a
uri: huggingface://Nitral-AI/Llama-3-Update-3.0-mmproj-model-f16/Llama-3-Update-3.0-mmproj-model-f16.gguf
- !!merge <<: *llama3
name: "hathor_stable-v0.2-l3-8b"
urls:
- https://huggingface.co/bartowski/Hathor_Stable-v0.2-L3-8B-GGUF
description: |
Hathor-v0.2 is a model based on the LLaMA 3 architecture: Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance. Making it an ideal tool for a wide range of applications; such as creative writing, educational support and human/computer interaction.
icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/FLvA7-CWp3UhBuR2eGSh7.webp
overrides:
parameters:
model: Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
files:
- filename: Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
sha256: 291cd30421f519ec00e04ae946a4f639d8d1b7c294cb2b2897b35da6d498fdc4
uri: huggingface://bartowski/Hathor_Stable-v0.2-L3-8B-GGUF/Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
- !!merge <<: *llama3
name: "bunny-llama-3-8b-v"
urls:

View File

@@ -6,6 +6,8 @@ import (
"io/fs"
"os"
"path/filepath"
"github.com/go-skynet/LocalAI/pkg/library"
)
func ResolvePath(dir string, paths ...string) string {
@@ -54,22 +56,7 @@ func ExtractFiles(content embed.FS, extractDir string) error {
// If there is a lib directory, set LD_LIBRARY_PATH to include it
// we might use this mechanism to carry over e.g. Nvidia CUDA libraries
// from the embedded FS to the target directory
library.LoadExtractedLibs(extractDir)
// Skip this if LOCALAI_SKIP_LD_LIBRARY_PATH is set
if os.Getenv("LOCALAI_SKIP_LD_LIBRARY_PATH") != "" {
return err
}
for _, libDir := range []string{filepath.Join(extractDir, "backend_assets", "lib"), filepath.Join(extractDir, "lib")} {
if _, err := os.Stat(libDir); err == nil {
ldLibraryPath := os.Getenv("LD_LIBRARY_PATH")
if ldLibraryPath == "" {
ldLibraryPath = libDir
} else {
ldLibraryPath = fmt.Sprintf("%s:%s", ldLibraryPath, libDir)
}
os.Setenv("LD_LIBRARY_PATH", ldLibraryPath)
}
}
return err
}

41
pkg/library/dynaload.go Normal file
View File

@@ -0,0 +1,41 @@
package library
import (
"fmt"
"os"
"path/filepath"
"runtime"
)
func LoadExtractedLibs(dir string) {
// Skip this if LOCALAI_SKIP_LIBRARY_PATH is set
if os.Getenv("LOCALAI_SKIP_LIBRARY_PATH") != "" {
return
}
for _, libDir := range []string{filepath.Join(dir, "backend-assets", "lib"), filepath.Join(dir, "lib")} {
LoadExternal(libDir)
}
}
func LoadExternal(dir string) {
// Skip this if LOCALAI_SKIP_LIBRARY_PATH is set
if os.Getenv("LOCALAI_SKIP_LIBRARY_PATH") != "" {
return
}
lpathVar := "LD_LIBRARY_PATH"
if runtime.GOOS == "darwin" {
lpathVar = "DYLD_FALLBACK_LIBRARY_PATH" // should it be DYLD_LIBRARY_PATH ?
}
if _, err := os.Stat(dir); err == nil {
ldLibraryPath := os.Getenv(lpathVar)
if ldLibraryPath == "" {
ldLibraryPath = dir
} else {
ldLibraryPath = fmt.Sprintf("%s:%s", ldLibraryPath, dir)
}
os.Setenv(lpathVar, ldLibraryPath)
}
}