⬆️ Update ggerganov/llama.cpp (#2587 )

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
models(gallery): add gemma-1.1-7b-it (#2588 )
10 changed files with 219 additions and 36 deletions
--- a/.github/workflows/release.yaml
+++ b/.github/workflows/release.yaml
@@ -100,7 +100,13 @@ jobs:
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.0
          export PATH=$PATH:$GOPATH/bin
          export PATH=/usr/local/cuda/bin:$PATH
-          GO_TAGS=p2p GOOS=linux GOARCH=arm64 CMAKE_ARGS="-DProtobuf_INCLUDE_DIRS=$CROSS_STAGING_PREFIX/include -DProtobuf_DIR=$CROSS_STAGING_PREFIX/lib/cmake/protobuf -DgRPC_DIR=$CROSS_STAGING_PREFIX/lib/cmake/grpc -DCMAKE_TOOLCHAIN_FILE=$CMAKE_CROSS_TOOLCHAIN -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++" make dist-cross-linux-arm64
+          sudo rm -rf /usr/aarch64-linux-gnu/lib/libstdc++.so.6
+          sudo cp -rf /usr/aarch64-linux-gnu/lib/libstdc++.so* /usr/aarch64-linux-gnu/lib/libstdc++.so.6
+          GO_TAGS=p2p \
+          BACKEND_LIBS="./grpc/cmake/cross_build/third_party/re2/libre2.a ./grpc/cmake/cross_build/libgrpc.a ./grpc/cmake/cross_build/libgrpc++.a ./grpc/cmake/cross_build/third_party/protobuf/libprotobuf.a /usr/aarch64-linux-gnu/lib/libc.so.6 /usr/aarch64-linux-gnu/lib/libstdc++.so.6 /usr/aarch64-linux-gnu/lib/libgomp.so.1 /usr/aarch64-linux-gnu/lib/libm.so.6 /usr/aarch64-linux-gnu/lib/libgcc_s.so.1 /usr/aarch64-linux-gnu/lib/libdl.so.2 /usr/aarch64-linux-gnu/lib/libpthread.so.0" \
+          GOOS=linux \
+          GOARCH=arm64 \
+          CMAKE_ARGS="-DProtobuf_INCLUDE_DIRS=$CROSS_STAGING_PREFIX/include -DProtobuf_DIR=$CROSS_STAGING_PREFIX/lib/cmake/protobuf -DgRPC_DIR=$CROSS_STAGING_PREFIX/lib/cmake/grpc -DCMAKE_TOOLCHAIN_FILE=$CMAKE_CROSS_TOOLCHAIN -DCMAKE_C_COMPILER=aarch64-linux-gnu-gcc -DCMAKE_CXX_COMPILER=aarch64-linux-gnu-g++" make dist-cross-linux-arm64
      - uses: actions/upload-artifact@v4
        with:
          name: LocalAI-linux-arm64
@@ -111,7 +117,13 @@ jobs:
        with:
          files: |
            release/*
-
+      - name: Setup tmate session if tests fail
+        if: ${{ failure() }}
+        uses: mxschmitt/action-tmate@v3.18
+        with:
+          detached: true
+          connect-timeout-seconds: 180
+          limit-access-to-actor: true
  build-linux:
    runs-on: arc-runner-set
    steps:
@@ -190,6 +202,7 @@ jobs:
      - name: Install gRPC
        run: |
          cd grpc && cd cmake/build && sudo make --jobs 5 --output-sync=target install
+      # BACKEND_LIBS needed for gpu-workload: /opt/intel/oneapi/*/lib/libiomp5.so /opt/intel/oneapi/*/lib/libmkl_core.so /opt/intel/oneapi/*/lib/libmkl_core.so.2 /opt/intel/oneapi/*/lib/libmkl_intel_ilp64.so /opt/intel/oneapi/*/lib/libmkl_intel_ilp64.so.2 /opt/intel/oneapi/*/lib/libmkl_sycl_blas.so /opt/intel/oneapi/*/lib/libmkl_sycl_blas.so.4 /opt/intel/oneapi/*/lib/libmkl_tbb_thread.so /opt/intel/oneapi/*/lib/libmkl_tbb_thread.so.2 /opt/intel/oneapi/*/lib/libsycl.so /opt/intel/oneapi/*/lib/libsycl.so.7 /opt/intel/oneapi/*/lib/libsycl.so.7.1.0 /opt/rocm-*/lib/libamdhip64.so /opt/rocm-*/lib/libamdhip64.so.5 /opt/rocm-*/lib/libamdhip64.so.6 /opt/rocm-*/lib/libamdhip64.so.6.1.60100 /opt/rocm-*/lib/libhipblas.so /opt/rocm-*/lib/libhipblas.so.2 /opt/rocm-*/lib/libhipblas.so.2.1.60100 /opt/rocm-*/lib/librocblas.so /opt/rocm-*/lib/librocblas.so.4 /opt/rocm-*/lib/librocblas.so.4.1.60100 /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1 /usr/lib/x86_64-linux-gnu/libOpenCL.so.1.0.0 /usr/lib/x86_64-linux-gnu/libm.so.6 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 /usr/lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/librt.so.1 /usr/local/cuda-*/targets/x86_64-linux/lib/libcublas.so /usr/local/cuda-*/targets/x86_64-linux/lib/libcublasLt.so /usr/local/cuda-*/targets/x86_64-linux/lib/libcudart.so /usr/local/cuda-*/targets/x86_64-linux/lib/stubs/libcuda.so
      - name: Build
        id: build
        run: |
@@ -199,7 +212,9 @@ jobs:
          export PATH=/usr/local/cuda/bin:$PATH
          export PATH=/opt/rocm/bin:$PATH
          source /opt/intel/oneapi/setvars.sh
-          GO_TAGS=p2p make -j4 dist
+          GO_TAGS=p2p \
+          BACKEND_LIBS="/usr/lib/x86_64-linux-gnu/libstdc++.so.6 /usr/lib/x86_64-linux-gnu/libm.so.6 /usr/lib/x86_64-linux-gnu/libgcc_s.so.1 /usr/lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libgomp.so.1" \
+          make -j4 dist
      - uses: actions/upload-artifact@v4
        with:
          name: LocalAI-linux
@@ -210,7 +225,13 @@ jobs:
        with:
          files: |
            release/*
-
+      - name: Setup tmate session if tests fail
+        if: ${{ failure() }}
+        uses: mxschmitt/action-tmate@v3.18
+        with:
+          detached: true
+          connect-timeout-seconds: 180
+          limit-access-to-actor: true
  build-stablediffusion:
    runs-on: ubuntu-latest
    steps:
@@ -257,10 +278,6 @@ jobs:
        with:
          go-version: '1.21.x'
          cache: false
-      - name: Setup tmate session if tests fail
-        uses: mxschmitt/action-tmate@v3.18
-        with:
-          limit-access-to-actor: true
      - name: Dependencies
        run: |
          brew install protobuf grpc
@@ -272,7 +289,8 @@ jobs:
          export C_INCLUDE_PATH=/usr/local/include
          export CPLUS_INCLUDE_PATH=/usr/local/include
          export PATH=$PATH:$GOPATH/bin
-          GO_TAGS=p2p make dist
+          
+          BACKEND_LIBS="$(ls /opt/homebrew/opt/grpc/lib/*.dylib /opt/homebrew/opt/re2/lib/*.dylib /opt/homebrew/opt/openssl@3/lib/*.dylib /opt/homebrew/opt/protobuf/lib/*.dylib /opt/homebrew/opt/abseil/lib/*.dylib | xargs)" GO_TAGS=p2p make dist
      - uses: actions/upload-artifact@v4
        with:
          name: LocalAI-MacOS-arm64
@@ -283,3 +301,10 @@ jobs:
        with:
          files: |
            release/*
+      - name: Setup tmate session if tests fail
+        if: ${{ failure() }}
+        uses: mxschmitt/action-tmate@v3.18
+        with:
+          detached: true
+          connect-timeout-seconds: 180
+          limit-access-to-actor: true
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ BINARY_NAME=local-ai

 # llama.cpp versions
 GOLLAMA_STABLE_VERSION?=2b57a8ae43e4699d3dc5d1496a1ccd42922993be
-CPPLLAMA_VERSION?=963552903f51043ee947a8deeaaa7ec00bc3f1a4
+CPPLLAMA_VERSION?=21be9cab94e0b5b53cb6edeeebf8c8c799baad03

 # gpt4all version
 GPT4ALL_REPO?=https://github.com/nomic-ai/gpt4all
@@ -16,7 +16,7 @@ RWKV_REPO?=https://github.com/donomii/go-rwkv.cpp
 RWKV_VERSION?=661e7ae26d442f5cfebd2a0881b44e8c55949ec6

 # whisper.cpp version
-WHISPER_CPP_VERSION?=420b6abc54008ab634f5887dc45bd77122c2f320
+WHISPER_CPP_VERSION?=b29b3b29240aac8b71ce8e5a4360c1f1562ad66f

 # bert.cpp version
 BERT_VERSION?=710044b124545415f555e4260d16b146c725a6e4
@@ -313,6 +313,10 @@ build: prepare backend-assets grpcs ## Build the project
 	$(info ${GREEN}I BUILD_TYPE: ${YELLOW}$(BUILD_TYPE)${RESET})
 	$(info ${GREEN}I GO_TAGS: ${YELLOW}$(GO_TAGS)${RESET})
 	$(info ${GREEN}I LD_FLAGS: ${YELLOW}$(LD_FLAGS)${RESET})
+ifneq ($(BACKEND_LIBS),)
+	$(MAKE) backend-assets/lib
+	cp -r $(BACKEND_LIBS) backend-assets/lib/
+endif
 	CGO_LDFLAGS="$(CGO_LDFLAGS)" $(GOCMD) build -ldflags "$(LD_FLAGS)" -tags "$(GO_TAGS)" -o $(BINARY_NAME) ./

 build-minimal:
@@ -321,8 +325,11 @@ build-minimal:
 build-api:
 	BUILD_GRPC_FOR_BACKEND_LLAMA=true BUILD_API_ONLY=true GO_TAGS=none $(MAKE) build

+backend-assets/lib:
+	mkdir -p backend-assets/lib
+
 dist:
-	STATIC=true $(MAKE) backend-assets/grpc/llama-cpp-avx2
+	$(MAKE) backend-assets/grpc/llama-cpp-avx2
 ifeq ($(OS),Darwin)
 	$(info ${GREEN}I Skip CUDA/hipblas build on MacOS${RESET})
 else
@@ -331,7 +338,7 @@ else
 	$(MAKE) backend-assets/grpc/llama-cpp-sycl_f16
 	$(MAKE) backend-assets/grpc/llama-cpp-sycl_f32
 endif
-	$(MAKE) build
+	STATIC=true $(MAKE) build
 	mkdir -p release
 # if BUILD_ID is empty, then we don't append it to the binary name
 ifeq ($(BUILD_ID),)
@@ -344,7 +351,7 @@ endif

 dist-cross-linux-arm64: 
 	CMAKE_ARGS="$(CMAKE_ARGS) -DLLAMA_NATIVE=off" GRPC_BACKENDS="backend-assets/grpc/llama-cpp-fallback backend-assets/grpc/llama-cpp-grpc backend-assets/util/llama-cpp-rpc-server" \
-	$(MAKE) build
+	STATIC=true $(MAKE) build
 	mkdir -p release
 # if BUILD_ID is empty, then we don't append it to the binary name
 ifeq ($(BUILD_ID),)
--- a/README.md
+++ b/README.md
@@ -65,6 +65,7 @@ docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-cpu

 [Roadmap](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap)

+- 🆕 You can now browse the model gallery without LocalAI! Check out https://models.localai.io
 - 🔥🔥 Decentralized llama.cpp:  https://github.com/mudler/LocalAI/pull/2343 (peer2peer llama.cpp!) 👉 Docs  https://localai.io/features/distribute/
 - 🔥🔥 Openvoice: https://github.com/mudler/LocalAI/pull/2334
 - 🆕 Function calls without grammars and mixed mode: https://github.com/mudler/LocalAI/pull/2328
--- a/core/cli/run.go
+++ b/core/cli/run.go
@@ -43,6 +43,7 @@ type RunCMD struct {
 	Address              string   `env:"LOCALAI_ADDRESS,ADDRESS" default:":8080" help:"Bind address for the API server" group:"api"`
 	CORS                 bool     `env:"LOCALAI_CORS,CORS" help:"" group:"api"`
 	CORSAllowOrigins     string   `env:"LOCALAI_CORS_ALLOW_ORIGINS,CORS_ALLOW_ORIGINS" group:"api"`
+	LibraryPath          string   `env:"LOCALAI_LIBRARY_PATH,LIBRARY_PATH" help:"Path to the library directory (e.g. external libraries used by backends)" default:"/usr/share/local-ai/libs" group:"backends"`
 	CSRF                 bool     `env:"LOCALAI_CSRF" help:"Enables fiber CSRF middleware" group:"api"`
 	UploadLimit          int      `env:"LOCALAI_UPLOAD_LIMIT,UPLOAD_LIMIT" default:"15" help:"Default upload-limit in MB" group:"api"`
 	APIKeys              []string `env:"LOCALAI_API_KEY,API_KEY" help:"List of API Keys to enable API authentication. When this is set, all the requests must be authenticated with one of these API keys" group:"api"`
@@ -80,6 +81,7 @@ func (r *RunCMD) Run(ctx *cliContext.Context) error {
 		config.WithCors(r.CORS),
 		config.WithCorsAllowOrigins(r.CORSAllowOrigins),
 		config.WithCsrf(r.CSRF),
+		config.WithLibPath(r.LibraryPath),
 		config.WithThreads(r.Threads),
 		config.WithBackendAssets(ctx.BackendAssets),
 		config.WithBackendAssetsOutput(r.BackendAssetsPath),
--- a/core/config/application_config.go
+++ b/core/config/application_config.go
@@ -15,6 +15,7 @@ type ApplicationConfig struct {
 	Context                             context.Context
 	ConfigFile                          string
 	ModelPath                           string
+	LibPath                             string
 	UploadLimitMB, Threads, ContextSize int
 	DisableWebUI                        bool
 	F16                                 bool
@@ -101,6 +102,12 @@ func WithModelLibraryURL(url string) AppOption {
 	}
 }

+func WithLibPath(path string) AppOption {
+	return func(o *ApplicationConfig) {
+		o.LibPath = path
+	}
+}
+
 var EnableWatchDog = func(o *ApplicationConfig) {
 	o.WatchDog = true
 }
--- a/core/startup/startup.go
+++ b/core/startup/startup.go
@@ -9,6 +9,7 @@ import (
 	"github.com/go-skynet/LocalAI/core/services"
 	"github.com/go-skynet/LocalAI/internal"
 	"github.com/go-skynet/LocalAI/pkg/assets"
+	"github.com/go-skynet/LocalAI/pkg/library"
 	"github.com/go-skynet/LocalAI/pkg/model"
 	pkgStartup "github.com/go-skynet/LocalAI/pkg/startup"
 	"github.com/go-skynet/LocalAI/pkg/xsysinfo"
@@ -109,6 +110,11 @@ func Startup(opts ...config.AppOption) (*config.BackendConfigLoader, *model.Mode
 		}
 	}

+	if options.LibPath != "" {
+		// If the library directory exists, add it to the loader search path (LD_LIBRARY_PATH, or DYLD_FALLBACK_LIBRARY_PATH on darwin)
+		library.LoadExternal(options.LibPath)
+	}
+
 	// turn off any process that was started by GRPC if the context is canceled
 	go func() {
 		<-options.Context.Done()
--- a/docs/content/docs/getting-started/container-images.md
+++ b/docs/content/docs/getting-started/container-images.md
@@ -85,7 +85,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                   |
 | --- | --- |-----------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master` | `localai/localai:master`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest` | `localai/localai:latest`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-cpu`                  | `localai/localai:latest-cpu`                  |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}` | `localai/localai:{{< version >}}`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-ffmpeg` | `localai/localai:{{< version >}}-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-ffmpeg-core` | `localai/localai:{{< version >}}-ffmpeg-core` |
@@ -97,7 +97,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                                  |
 | --- | --- |-------------------------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-cublas-cuda11` | `localai/localai:master-cublas-cuda11`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest-cublas-cuda11` | `localai/localai:latest-cublas-cuda11`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-11`                 | `localai/localai:latest-gpu-nvidia-cuda-11`                      |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11` | `localai/localai:{{< version >}}-cublas-cuda11`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11-ffmpeg` | `localai/localai:{{< version >}}-cublas-cuda11-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda11-ffmpeg-core` | `localai/localai:{{< version >}}-cublas-cuda11-ffmpeg-core` |
@@ -109,7 +109,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                                  |
 | --- | --- |-------------------------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-cublas-cuda12` | `localai/localai:master-cublas-cuda12`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest-cublas-cuda12` | `localai/localai:latest-cublas-cuda12`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-12` | `localai/localai:latest-gpu-nvidia-cuda-12`                 |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12` | `localai/localai:{{< version >}}-cublas-cuda12`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12-ffmpeg` | `localai/localai:{{< version >}}-cublas-cuda12-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-cublas-cuda12-ffmpeg-core` | `localai/localai:{{< version >}}-cublas-cuda12-ffmpeg-core` |
@@ -121,7 +121,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                                  |
 | --- | --- |-------------------------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-sycl-f16` | `localai/localai:master-sycl-f16`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest-sycl-f16` | `localai/localai:latest-sycl-f16`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f16` | `localai/localai:latest-gpu-intel-f16`                      |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16` | `localai/localai:{{< version >}}-sycl-f16`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16-ffmpeg` | `localai/localai:{{< version >}}-sycl-f16-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f16-ffmpeg-core` | `localai/localai:{{< version >}}-sycl-f16-ffmpeg-core` |
@@ -133,7 +133,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                                  |
 | --- | --- |-------------------------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-sycl-f32` | `localai/localai:master-sycl-f32`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest-sycl-f32` | `localai/localai:latest-sycl-f32`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f32` | `localai/localai:latest-gpu-intel-f32`                      |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32` | `localai/localai:{{< version >}}-sycl-f32`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32-ffmpeg` | `localai/localai:{{< version >}}-sycl-f32-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-sycl-f32-ffmpeg-core` | `localai/localai:{{< version >}}-sycl-f32-ffmpeg-core` |
@@ -145,7 +145,7 @@ Images with `core` in the tag are smaller and do not contain any python dependen
 | Description | Quay | Docker Hub                                                  |
 | --- | --- |-------------------------------------------------------------|
 | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-hipblas` | `localai/localai:master-hipblas`                      |
-| Latest tag | `quay.io/go-skynet/local-ai:latest-hipblas` | `localai/localai:latest-hipblas`                      |
+| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-hipblas`                  | `localai/localai:latest-gpu-hipblas`                  |
 | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-hipblas` | `localai/localai:{{< version >}}-hipblas`             |
 | Versioned image including FFMpeg| `quay.io/go-skynet/local-ai:{{< version >}}-hipblas-ffmpeg` | `localai/localai:{{< version >}}-hipblas-ffmpeg`      |
 | Versioned image including FFMpeg, no python | `quay.io/go-skynet/local-ai:{{< version >}}-hipblas-ffmpeg-core` | `localai/localai:{{< version >}}-hipblas-ffmpeg-core` |
--- a/gallery/index.yaml
+++ b/gallery/index.yaml
@@ -22,6 +22,53 @@
    - filename: Qwen2-7B-Instruct-Q4_K_M.gguf
      sha256: 8d0d33f0d9110a04aad1711b1ca02dafc0fa658cd83028bdfa5eff89c294fe76
      uri: huggingface://bartowski/Qwen2-7B-Instruct-GGUF/Qwen2-7B-Instruct-Q4_K_M.gguf
+- !!merge <<: *qwen2
+  name: "dolphin-2.9.2-qwen2-72b"
+  icon: https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png
+  urls:
+    - https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf
+  description: |
+    Dolphin 2.9.2 Qwen2 72B 🐬
+
+    Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
+  overrides:
+    parameters:
+      model: dolphin-2.9.2-qwen2-Q4_K_M.gguf
+  files:
+    - filename: dolphin-2.9.2-qwen2-Q4_K_M.gguf
+      sha256: 44a0e82cbc2a201b2f4b9e16099a0a4d97b6f0099d45bcc5b354601f38dbb709
+      uri: huggingface://cognitivecomputations/dolphin-2.9.2-qwen2-72b-gguf/qwen2-Q4_K_M.gguf
+- !!merge <<: *qwen2
+  name: "dolphin-2.9.2-qwen2-7b"
+  description: |
+    Dolphin 2.9.2 Qwen2 7B 🐬
+
+    Curated and trained by Eric Hartford, Lucas Atkins, and Fernando Fernandes, and Cognitive Computations
+  urls:
+    - https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-7b
+    - https://huggingface.co/cognitivecomputations/dolphin-2.9.2-qwen2-7b-gguf
+  icon: https://cdn-uploads.huggingface.co/production/uploads/63111b2d88942700629f5771/ldkN1J0WIDQwU4vutGYiD.png
+  overrides:
+    parameters:
+      model: dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
+  files:
+    - filename: dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
+      sha256: a15b5db4df6be4f4bfb3632b2009147332ef4c57875527f246b4718cb0d3af1f
+      uri: huggingface://cognitivecomputations/dolphin-2.9.2-qwen2-7b-gguf/dolphin-2.9.2-qwen2-7b-Q4_K_M.gguf
+- !!merge <<: *qwen2
+  name: "samantha-qwen-2-7B"
+  description: |
+    Samantha based on qwen2
+  urls:
+    - https://huggingface.co/bartowski/Samantha-Qwen-2-7B-GGUF
+    - https://huggingface.co/macadeliccc/Samantha-Qwen2-7B
+  overrides:
+    parameters:
+      model: Samantha-Qwen-2-7B-Q4_K_M.gguf
+  files:
+    - filename: Samantha-Qwen-2-7B-Q4_K_M.gguf
+      sha256: 5d1cf1c35a7a46c536a96ba0417d08b9f9e09c24a4e25976f72ad55d4904f6fe
+      uri: huggingface://bartowski/Samantha-Qwen-2-7B-GGUF/Samantha-Qwen-2-7B-Q4_K_M.gguf
 ## START Mistral
 - &mistral03
  url: "github:mudler/LocalAI/gallery/mistral-0.3.yaml@master"
@@ -176,6 +223,37 @@
    - filename: gemma-2b.Q4_K_M.gguf
      sha256: 37d50c21ef7847926204ad9b3007127d9a2722188cfd240ce7f9f7f041aa71a5
      uri: huggingface://mlabonne/gemma-2b-GGUF/gemma-2b.Q4_K_M.gguf
+- !!merge <<: *gemma
+  name: "firefly-gemma-7b-iq-imatrix"
+  icon: "https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/SrOekTxdpnxHyWWmMiAvc.jpeg"
+  urls:
+    - https://huggingface.co/Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix
+    - https://huggingface.co/YeungNLP/firefly-gemma-7b
+  description: |
+    firefly-gemma-7b is trained based on gemma-7b to act as a helpful and harmless AI assistant. We use Firefly to train the model on a single V100 GPU with QLoRA.
+  overrides:
+    parameters:
+      model: firefly-gemma-7b-Q4_K_S-imatrix.gguf
+  files:
+    - filename: firefly-gemma-7b-Q4_K_S-imatrix.gguf
+      sha256: 622e0b8e4f12203cc40c7f87915abf99498c2e0582203415ca236ea37643e428
+      uri: huggingface://Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix/firefly-gemma-7b-Q4_K_S-imatrix.gguf
+- !!merge <<: *gemma
+  name: "gemma-1.1-7b-it"
+  urls:
+    - https://huggingface.co/bartowski/gemma-1.1-7b-it-GGUF
+    - https://huggingface.co/google/gemma-1.1-7b-it
+  description: |
+      This is Gemma 1.1 7B (IT), an update over the original instruction-tuned Gemma release.
+
+      Gemma 1.1 was trained using a novel RLHF method, leading to substantial gains on quality, coding capabilities, factuality, instruction following and multi-turn conversation quality. We also fixed a bug in multi-turn conversations, and made sure that model responses don't always start with "Sure,".
+  overrides:
+    parameters:
+      model: gemma-1.1-7b-it-Q4_K_M.gguf
+  files:
+    - filename: gemma-1.1-7b-it-Q4_K_M.gguf
+      sha256: 47821da72ee9e80b6fd43c6190ad751b485fb61fa5664590f7a73246bcd8332e
+      uri: huggingface://bartowski/gemma-1.1-7b-it-GGUF/gemma-1.1-7b-it-Q4_K_M.gguf
 - &llama3
  url: "github:mudler/LocalAI/gallery/llama3-instruct.yaml@master"
  icon: https://cdn-uploads.huggingface.co/production/uploads/642cc1c253e76b4c2286c58e/aJJxKus1wP5N-euvHEUq7.png
@@ -1055,6 +1133,21 @@
    - filename: Tess-2.0-Llama-3-8B-Q4_K_M.gguf
      sha256: 3b5fbd6c59d7d38205ab81970c0227c74693eb480acf20d8c2f211f62e3ca5f6
      uri: huggingface://bartowski/Tess-2.0-Llama-3-8B-GGUF/Tess-2.0-Llama-3-8B-Q4_K_M.gguf
+- !!merge <<: *llama3
+  url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
+  name: "tess-v2.5-phi-3-medium-128k-14b"
+  urls:
+    - https://huggingface.co/bartowski/Tess-v2.5-Phi-3-medium-128k-14B-GGUF
+  icon: https://huggingface.co/migtissera/Tess-2.0-Mixtral-8x22B/resolve/main/Tess-2.png
+  description: |
+    Tess, short for Tesoro (Treasure in Italian), is a general purpose Large Language Model series.
+  overrides:
+    parameters:
+      model: Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
+  files:
+    - filename: Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
+      sha256: 9efb6ebc00de74012d0fb36134cce07d624a870fc12f38b16b57ce447b86e27e
+      uri: huggingface://bartowski/Tess-v2.5-Phi-3-medium-128k-14B-GGUF/Tess-v2.5-Phi-3-medium-128k-14B-Q4_K_M.gguf
 - !!merge <<: *llama3
  name: "llama3-iterative-dpo-final"
  urls:
@@ -1702,6 +1795,20 @@
    - filename: Llama-3-Update-3.0-mmproj-model-f16.gguf
      sha256: 3d2f36dff61d6157cadf102df86a808eb9f8a230be1bc0bc99039d81a895468a
      uri: huggingface://Nitral-AI/Llama-3-Update-3.0-mmproj-model-f16/Llama-3-Update-3.0-mmproj-model-f16.gguf
+- !!merge <<: *llama3
+  name: "hathor_stable-v0.2-l3-8b"
+  urls:
+    - https://huggingface.co/bartowski/Hathor_Stable-v0.2-L3-8B-GGUF
+  description: |
+    Hathor-v0.2 is a model based on the LLaMA 3 architecture: Designed to seamlessly integrate the qualities of creativity, intelligence, and robust performance. Making it an ideal tool for a wide range of applications; such as creative writing, educational support and human/computer interaction.
+  icon: https://cdn-uploads.huggingface.co/production/uploads/65d4cf2693a0a3744a27536c/FLvA7-CWp3UhBuR2eGSh7.webp
+  overrides:
+    parameters:
+      model: Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
+  files:
+    - filename: Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
+      sha256: 291cd30421f519ec00e04ae946a4f639d8d1b7c294cb2b2897b35da6d498fdc4
+      uri: huggingface://bartowski/Hathor_Stable-v0.2-L3-8B-GGUF/Hathor_Stable-v0.2-L3-8B-Q4_K_M.gguf
 - !!merge <<: *llama3
  name: "bunny-llama-3-8b-v"
  urls:
--- a/pkg/assets/extract.go
+++ b/pkg/assets/extract.go
@@ -6,6 +6,8 @@ import (
 	"io/fs"
 	"os"
 	"path/filepath"
+
+	"github.com/go-skynet/LocalAI/pkg/library"
 )

 func ResolvePath(dir string, paths ...string) string {
@@ -54,22 +56,7 @@ func ExtractFiles(content embed.FS, extractDir string) error {
 	// If there is a lib directory, set LD_LIBRARY_PATH to include it
 	// we might use this mechanism to carry over e.g. Nvidia CUDA libraries
 	// from the embedded FS to the target directory
+	library.LoadExtractedLibs(extractDir)

-	// Skip this if LOCALAI_SKIP_LD_LIBRARY_PATH is set
-	if os.Getenv("LOCALAI_SKIP_LD_LIBRARY_PATH") != "" {
-		return err
-	}
-
-	for _, libDir := range []string{filepath.Join(extractDir, "backend_assets", "lib"), filepath.Join(extractDir, "lib")} {
-		if _, err := os.Stat(libDir); err == nil {
-			ldLibraryPath := os.Getenv("LD_LIBRARY_PATH")
-			if ldLibraryPath == "" {
-				ldLibraryPath = libDir
-			} else {
-				ldLibraryPath = fmt.Sprintf("%s:%s", ldLibraryPath, libDir)
-			}
-			os.Setenv("LD_LIBRARY_PATH", ldLibraryPath)
-		}
-	}
 	return err
 }
--- a/pkg/library/dynaload.go
+++ b/pkg/library/dynaload.go
@@ -0,0 +1,41 @@
+package library
+
+import (
+	"fmt"
+	"os"
+	"path/filepath"
+	"runtime"
+)
+
+func LoadExtractedLibs(dir string) {
+	// Skip this if LOCALAI_SKIP_LIBRARY_PATH is set
+	if os.Getenv("LOCALAI_SKIP_LIBRARY_PATH") != "" {
+		return
+	}
+
+	for _, libDir := range []string{filepath.Join(dir, "backend-assets", "lib"), filepath.Join(dir, "lib")} {
+		LoadExternal(libDir)
+	}
+}
+
+func LoadExternal(dir string) {
+	// Skip this if LOCALAI_SKIP_LIBRARY_PATH is set
+	if os.Getenv("LOCALAI_SKIP_LIBRARY_PATH") != "" {
+		return
+	}
+
+	lpathVar := "LD_LIBRARY_PATH"
+	if runtime.GOOS == "darwin" {
+		lpathVar = "DYLD_FALLBACK_LIBRARY_PATH" // should it be DYLD_LIBRARY_PATH ?
+	}
+
+	if _, err := os.Stat(dir); err == nil {
+		ldLibraryPath := os.Getenv(lpathVar)
+		if ldLibraryPath == "" {
+			ldLibraryPath = dir
+		} else {
+			ldLibraryPath = fmt.Sprintf("%s:%s", ldLibraryPath, dir)
+		}
+		os.Setenv(lpathVar, ldLibraryPath)
+	}
+}
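
The path-building branch in `LoadExternal` above joins entries with a hard-coded `:` and appends `dir` even when it is already listed, so repeated calls can grow the variable. A minimal sketch of a stricter variant follows. `appendLibDir` is a hypothetical helper (it is not part of `pkg/library`), and the de-duplication is an assumption about desired behaviour, not something this change set implements.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// appendLibDir returns current with dir appended as a new path-list entry.
// Hypothetical helper for illustration only; not part of pkg/library.
func appendLibDir(current, dir string) string {
	if dir == "" {
		return current // nothing to add
	}
	if current == "" {
		return dir // first entry, no separator needed
	}
	sep := string(os.PathListSeparator) // ':' on Linux and macOS
	for _, p := range strings.Split(current, sep) {
		if p == dir {
			return current // already listed, leave the value unchanged
		}
	}
	return current + sep + dir
}

func main() {
	// The directory below is the default of the LibraryPath option in this diff.
	fmt.Println(appendLibDir("", "/usr/share/local-ai/libs"))
	fmt.Println(appendLibDir("/opt/lib", "/usr/share/local-ai/libs"))
	fmt.Println(appendLibDir("/opt/lib", "/opt/lib"))
}
```

Keeping the string manipulation in a pure function means it can be tested without touching the process environment; the caller would still be responsible for the `os.Stat` check and the `os.Setenv` call.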
Author	SHA1	Message	Date
LocalAI [bot]	2f297979a7	⬆️ Update ggerganov/llama.cpp (#2587 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-17 15:28:19 +00:00
Ettore Di Giacinto	2437a2769d	models(gallery): add gemma-1.1-7b-it (#2588 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-17 14:13:27 +02:00
Ettore Di Giacinto	b58b7cad94	models(gallery): add samantha-qwen2 (#2586 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-17 10:08:29 +02:00
LocalAI [bot]	68148f2a1a	⬆️ Update ggerganov/llama.cpp (#2584 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-17 00:18:44 +00:00
Ettore Di Giacinto	4897eb0ba2	ci: pack less libs inside the binary (#2579 ) The binary grew up to 1.8GB quickly - rocm at least raises +800MB by itself - so we might just want to manage the GPU libs separately. Adds a comment to list all the libraries found so far that we are depending on, but will likely follow up in a way to bundle these separately. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 22:10:28 +02:00
Ettore Di Giacinto	1b43966c48	Update README.md Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-06-16 20:27:37 +02:00
Ettore Di Giacinto	c5f2f11503	models(gallery): add hathor_stable-v0.2-l3-8b (#2582 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 20:24:36 +02:00
Ettore Di Giacinto	895443d1b5	models(gallery): add tess-v2.5-phi-3-medium-128k-14b (#2581 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 20:22:08 +02:00
Ettore Di Giacinto	6a0802e8e6	models(gallery): add dolphin-qwen (#2580 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 20:11:21 +02:00
Ettore Di Giacinto	94cfaad7f4	feat(libpath): refactor and expose functions for external library paths (#2578 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-16 13:58:28 +02:00
Ettore Di Giacinto	ac4a94dd44	feat(build): bundle libs for arm64 and x86 linux binaries (#2572 ) This PR bundles further libs into the arm64 and x86_64 binaries This can be improved by a lot - it's far from perfect, however in this PR I wanted to collect the required libs, and give a simple baseline to improve later upon. It is quite challenging to do this exercise with CI only - but it's the fastest way I see now. I hope that after the list is initially built we can further improve this down the line and remove some of the technical debt left here to speedup things and do not get stuck in the middle of CI cycles. In this PR: - The x86_64 binary now bundles hipblas, nvidia and intel libraries too to avoid any dependency to be installed in the host - Similarly, for the arm64 we now bundle all the required assets ## What's left We should be also able to cross-compile Nvidia for arm64 - however I didn't succeed so far so I've left that open. Similarly I might have missed some libraries, but we will see with bug reports and testing around with the new binaries. I've tested on my arm64 board and I could finally start things up. An open point still is shipping libraries for e.g. tts and stablediffusion. this is not done yet, however with the same methodology we should be able to extend support also for these two backends in the binary.	2024-06-16 09:10:44 +02:00
LocalAI [bot]	58bf8614d9	⬆️ Update ggerganov/llama.cpp (#2575 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-15 23:45:10 +00:00
Ettore Di Giacinto	3764e50b35	models(gallery): add firefly-gemma-7b (#2576 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-15 23:07:20 +02:00
Nate Harris	3f464d2d9e	Fix standard image latest Docker tags (#2574 ) - Fix standard image latest Docker tags Signed-off-by: Nate Harris <nwithan8@users.noreply.github.com>	2024-06-15 22:08:30 +02:00
LocalAI [bot]	5116d561e1	⬆️ Update ggerganov/llama.cpp (#2570 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-14 23:39:20 +00:00
Ettore Di Giacinto	96a7a3b59f	fix(Makefile): enable STATIC on dist (#2569 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-14 12:28:46 +02:00
Ettore Di Giacinto	112d0ffa45	feat(darwin): embed grpc libs (#2567 ) * debug * feat(makefile): allow to bundle libs into binary * ci: bundle protobuf into single-binary Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci: tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(assets): correctly reference extract folder Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * bundle also abseil Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * bundle more libs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2024-06-14 08:51:25 +02:00
LocalAI [bot]	25f45827ab	⬆️ Update ggerganov/whisper.cpp (#2565 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-14 00:26:51 +00:00
LocalAI [bot]	f322f7c62d	⬆️ Update ggerganov/llama.cpp (#2564 ) Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2024-06-13 23:47:50 +00:00
Ettore Di Giacinto	06351cbbb4	feat(binary): support extracted bundled libs on darwin (#2563 ) When offering fallback libs, use the proper env var for darwin Note: this does not include the libraries itself, but only sets the proper env var for the libs to be picked up on darwin. Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2024-06-13 22:59:42 +02:00
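
The variable selection described in 06351cbbb4 ("use the proper env var for darwin") can be isolated into a function that takes the OS name as an argument, which lets the darwin branch be exercised on any host. This is only a sketch: `libraryPathVar` is a hypothetical name, and the two variable names are the ones that appear in `pkg/library/dynaload.go` in the diff above.

```go
package main

import (
	"fmt"
	"runtime"
)

// libraryPathVar returns the name of the environment variable that the
// dynamic loader consults for extra library directories on the given OS.
// Hypothetical helper for illustration; the diff inlines this logic.
func libraryPathVar(goos string) string {
	if goos == "darwin" {
		return "DYLD_FALLBACK_LIBRARY_PATH"
	}
	return "LD_LIBRARY_PATH"
}

func main() {
	// runtime.GOOS is passed in explicitly so tests can supply other values.
	fmt.Printf("%s is used on %s\n", libraryPathVar(runtime.GOOS), runtime.GOOS)
}
```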