Compare commits

3 Commits

Author SHA1 Message Date
Ettore Di Giacinto
a8057b952c fix(cuda): be consistent with image tag naming (#5916)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-26 08:30:59 +02:00
Ettore Di Giacinto
fd5c1d916f chore(docs): add documentation on backend detection override (#5915)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-26 08:18:31 +02:00
LocalAI [bot]
5ce982b9c9 chore: ⬆️ Update ggml-org/llama.cpp to c7f3169cd523140a288095f2d79befb20a0b73f4 (#5913)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-25 23:08:20 +02:00
10 changed files with 28 additions and 13 deletions

@@ -39,7 +39,7 @@ jobs:
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'false'
- tag-suffix: '-gpu-nvidia-cuda12'
+ tag-suffix: '-gpu-nvidia-cuda-12'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:22.04"
makeflags: "--jobs=3 --output-sync=target"

@@ -83,7 +83,7 @@ jobs:
cuda-minor-version: "7"
platforms: 'linux/amd64'
tag-latest: 'auto'
- tag-suffix: '-gpu-nvidia-cuda11'
+ tag-suffix: '-gpu-nvidia-cuda-11'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:22.04"
makeflags: "--jobs=4 --output-sync=target"
@@ -94,7 +94,7 @@ jobs:
cuda-minor-version: "0"
platforms: 'linux/amd64'
tag-latest: 'auto'
- tag-suffix: '-gpu-nvidia-cuda12'
+ tag-suffix: '-gpu-nvidia-cuda-12'
runs-on: 'ubuntu-latest'
base-image: "ubuntu:22.04"
skip-drivers: 'false'

@@ -322,7 +322,7 @@ docker-cuda11:
--build-arg GO_TAGS="$(GO_TAGS)" \
--build-arg MAKEFLAGS="$(DOCKER_MAKEFLAGS)" \
--build-arg BUILD_TYPE=$(BUILD_TYPE) \
- -t $(DOCKER_IMAGE)-cuda11 .
+ -t $(DOCKER_IMAGE)-cuda-11 .
docker-aio:
@echo "Building AIO image with base $(BASE_IMAGE) as $(DOCKER_AIO_IMAGE)"
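
Note: for reference, the renamed image would typically come from invoking the target shown above; a minimal sketch, assuming `BUILD_TYPE=cublas` and `DOCKER_IMAGE=local-ai` (both assumptions, not part of this diff):

```bash
# Sketch: build via the Makefile target above; the image now ends in -cuda-11 rather than -cuda11.
make BUILD_TYPE=cublas DOCKER_IMAGE=local-ai docker-cuda11
```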

@@ -189,6 +189,8 @@ local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
local-ai run oci://localai/phi-2:latest
```
+ > ⚡ **Automatic Backend Detection**: When you install models from the gallery or YAML files, LocalAI automatically detects your system's GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see [GPU Acceleration](https://localai.io/features/gpu-acceleration/#automatic-backend-detection).
For more information, see [💻 Getting started](https://localai.io/basics/getting_started/index.html)
## 📰 Latest project news

@@ -1,5 +1,5 @@
- LLAMA_VERSION?=3f4fc97f1d745f1d5d3c853949503136d419e6de
+ LLAMA_VERSION?=c7f3169cd523140a288095f2d79befb20a0b73f4
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?=
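
Note: because the pin uses `?=`, it only applies when `LLAMA_VERSION` is not already set, so a one-off build can override it; a minimal sketch (the `build` target name is an assumption):

```bash
# Build against a different llama.cpp revision without editing the Makefile (target name assumed).
LLAMA_VERSION=c7f3169cd523140a288095f2d79befb20a0b73f4 make build
```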

@@ -111,7 +111,7 @@ function ensureVenv() {
# - requirements-${BUILD_TYPE}.txt
# - requirements-${BUILD_PROFILE}.txt
#
- # BUILD_PROFILE is a pore specific version of BUILD_TYPE, ex: cuda11 or cuda12
+ # BUILD_PROFILE is a pore specific version of BUILD_TYPE, ex: cuda-11 or cuda-12
# it can also include some options that we do not have BUILD_TYPES for, ex: intel
#
# NOTE: for BUILD_PROFILE==intel, this function does NOT automatically use the Intel python package index.
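
Note: a minimal sketch of the layered install the comments above describe, assuming a plain `pip install` and a hypothetical helper name; the file names follow the documented `requirements-${BUILD_TYPE}.txt` / `requirements-${BUILD_PROFILE}.txt` pattern:

```bash
# Hypothetical helper: install base requirements, then the BUILD_TYPE- and
# BUILD_PROFILE-specific files (e.g. BUILD_PROFILE=cuda-12) when they exist.
installRequirements() {
    for reqs in "requirements.txt" "requirements-${BUILD_TYPE}.txt" "requirements-${BUILD_PROFILE}.txt"; do
        if [ -f "${reqs}" ]; then
            pip install -r "${reqs}"
        fi
    done
}
```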

@@ -15,6 +15,16 @@ This section contains instruction on how to use LocalAI with GPU acceleration.
For acceleration for AMD or Metal HW is still in development, for additional details see the [build]({{%relref "docs/getting-started/build#Acceleration" %}})
{{% /alert %}}
+ ## Automatic Backend Detection
+ When you install a model from the gallery (or a YAML file), LocalAI intelligently detects the required backend and your system's capabilities, then downloads the correct version for you. Whether you're running on a standard CPU, an NVIDIA GPU, an AMD GPU, or an Intel GPU, LocalAI handles it automatically.
+ For advanced use cases or to override auto-detection, you can use the `LOCALAI_FORCE_META_BACKEND_CAPABILITY` environment variable. Here are the available options:
+ - `default`: Forces CPU-only backend. This is the fallback if no specific hardware is detected.
+ - `nvidia`: Forces backends compiled with CUDA support for NVIDIA GPUs.
+ - `amd`: Forces backends compiled with ROCm support for AMD GPUs.
+ - `intel`: Forces backends compiled with SYCL/oneAPI support for Intel GPUs.
## Model configuration
@@ -71,8 +81,8 @@ To use CUDA, use the images with the `cublas` tag, for example.
The image list is on [quay](https://quay.io/repository/go-skynet/local-ai?tab=tags):
- - CUDA `11` tags: `master-gpu-nvidia-cuda11`, `v1.40.0-gpu-nvidia-cuda11`, ...
- - CUDA `12` tags: `master-gpu-nvidia-cuda12`, `v1.40.0-gpu-nvidia-cuda12`, ...
+ - CUDA `11` tags: `master-gpu-nvidia-cuda-11`, `v1.40.0-gpu-nvidia-cuda-11`, ...
+ - CUDA `12` tags: `master-gpu-nvidia-cuda-12`, `v1.40.0-gpu-nvidia-cuda-12`, ...
In addition to the commands to run LocalAI normally, you need to specify `--gpus all` to docker, for example:
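
Note: as a usage sketch for the override and the `--gpus all` requirement documented above (the port mapping and tag choice are illustrative):

```bash
# Force the NVIDIA (CUDA) backends regardless of what auto-detection would pick,
# and expose the GPUs to the container.
docker run --gpus all \
  -e LOCALAI_FORCE_META_BACKEND_CAPABILITY=nvidia \
  -p 8080:8080 \
  localai/localai:master-gpu-nvidia-cuda-12
```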

@@ -163,9 +163,9 @@ Standard container images do not have pre-installed models.
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
- | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-nvidia-cuda11` | `localai/localai:master-gpu-nvidia-cuda11` |
+ | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-nvidia-cuda-11` | `localai/localai:master-gpu-nvidia-cuda-11` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-11` | `localai/localai:latest-gpu-nvidia-cuda-11` |
- | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-nvidia-cuda11` | `localai/localai:{{< version >}}-gpu-nvidia-cuda11` |
+ | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-nvidia-cuda-11` | `localai/localai:{{< version >}}-gpu-nvidia-cuda-11` |
{{% /tab %}}
@@ -173,9 +173,9 @@ Standard container images do not have pre-installed models.
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
- | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-nvidia-cuda12` | `localai/localai:master-gpu-nvidia-cuda12` |
+ | Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-nvidia-cuda-12` | `localai/localai:master-gpu-nvidia-cuda12` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-nvidia-cuda-12` | `localai/localai:latest-gpu-nvidia-cuda-12` |
- | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-nvidia-cuda12` | `localai/localai:{{< version >}}-gpu-nvidia-cuda12` |
+ | Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-nvidia-cuda-12` | `localai/localai:{{< version >}}-gpu-nvidia-cuda-12` |
{{% /tab %}}
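
Note: pulling a development image under the renamed scheme, for example (either registry works; a usage sketch):

```bash
# CUDA 12 development image with the hyphenated tag.
docker pull quay.io/go-skynet/local-ai:master-gpu-nvidia-cuda-12
# Docker Hub equivalent:
docker pull localai/localai:master-gpu-nvidia-cuda-12
```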

@@ -106,6 +106,9 @@ local-ai run https://gist.githubusercontent.com/.../phi-2.yaml
local-ai run oci://localai/phi-2:latest
```
+ {{% alert icon="⚡" %}}
+ **Automatic Backend Detection**: When you install models from the gallery or YAML files, LocalAI automatically detects your system's GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see [GPU Acceleration]({{% relref "docs/features/gpu-acceleration#automatic-backend-detection" %}}).
+ {{% /alert %}}
For a full list of options, refer to the [Installer Options]({{% relref "docs/advanced/installer" %}}) documentation.

@@ -672,7 +672,7 @@ install_docker() {
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
elif [ "$HAS_CUDA" ]; then
# Default to CUDA 12
- IMAGE_TAG=${LOCALAI_VERSION}-gpu-nvidia-cuda12
+ IMAGE_TAG=${LOCALAI_VERSION}-gpu-nvidia-cuda-12
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-nvidia-cuda-12
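
Note: a standalone sketch of the tag selection above (the defaults for `LOCALAI_VERSION` and `USE_AIO` are assumptions):

```bash
# Default to the CUDA 12 image; switch to the AIO variant when requested.
LOCALAI_VERSION=${LOCALAI_VERSION:-latest}
USE_AIO=${USE_AIO:-false}
IMAGE_TAG=${LOCALAI_VERSION}-gpu-nvidia-cuda-12
if [ "$USE_AIO" = true ]; then
    IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-nvidia-cuda-12
fi
echo "Selected image: localai/localai:${IMAGE_TAG}"
```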