Compare commits

...

29 Commits

Author SHA1 Message Date
Richard Palethorpe
d6274eaf4a chore(build): Rename sycl to intel (#5964)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-08-04 11:01:28 +02:00
LocalAI [bot]
4d90971424 chore: ⬆️ Update ggml-org/llama.cpp to d31192b4ee1441bbbecd3cbf9e02633368bdc4f5 (#5965)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-08-03 21:03:20 +00:00
Ettore Di Giacinto
90f5639639 feat(backends): allow backends to not have a metadata file (#5963)
In this case we generate one on the fly and we infer the metadata we
can.

Obviously this has the side effect of not being able to register
potential aliases.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-03 16:47:02 +02:00
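The fallback described in this commit message (synthesizing metadata when no metadata file ships with a backend) can be sketched roughly as below. The `BackendMetadata` struct and field names are taken from the diff later in this page; the `read` callback stands in for the real `readBackendMetadata` helper and is an assumption for illustration.

```go
package main

import "fmt"

// BackendMetadata mirrors the struct referenced in the diff; fields
// beyond Name (such as aliases) are omitted here.
type BackendMetadata struct {
	Name    string
	Aliases []string
}

// metadataFor sketches the fallback: if no metadata file exists,
// synthesize one from the directory name. Aliases cannot be inferred,
// which is the side effect the commit message notes.
func metadataFor(dirName string, metadataExists bool, read func() (*BackendMetadata, error)) (*BackendMetadata, error) {
	if !metadataExists {
		return &BackendMetadata{Name: dirName}, nil
	}
	return read()
}

func main() {
	md, _ := metadataFor("llama-cpp", false, nil)
	fmt.Println(md.Name, len(md.Aliases))
}
```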
Ettore Di Giacinto
a35a701052 feat(backends): install from local path (#5962)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-03 14:24:50 +02:00
Ettore Di Giacinto
3d8ec72dbf chore(stable-diffusion): bump, set GGML_MAX_NAME (#5961)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-08-03 10:47:02 +02:00
LocalAI [bot]
2a9d675d62 chore: ⬆️ Update ggml-org/llama.cpp to 5c0eb5ef544aeefd81c303e03208f768e158d93c (#5959)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-08-02 23:35:24 +02:00
LocalAI [bot]
c782e8abf1 chore: ⬆️ Update ggml-org/whisper.cpp to 0becabc8d68d9ffa6ddfba5240e38cd7a2642046 (#5958)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-08-02 21:04:13 +00:00
LocalAI [bot]
a1e1942d83 docs: ⬆️ update docs version mudler/LocalAI (#5956)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-08-01 22:14:23 +02:00
Dedy F. Setyawan
787302b204 fix(docs): Improve responsiveness of tables (#5954)
Signed-off-by: Dedy F. Setyawan <dedyfajars@gmail.com>
2025-08-01 22:13:53 +02:00
LocalAI [bot]
0b085089b9 chore: ⬆️ Update ggml-org/llama.cpp to daf2dd788066b8b239cb7f68210e090c2124c199 (#5951)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-08-01 08:25:36 +02:00
LocalAI [bot]
624f3b1fc8 feat(swagger): update swagger (#5950)
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-31 21:04:23 +00:00
Richard Palethorpe
c07bc55fee fix(intel): Set GPU vendor on Intel images and cleanup (#5945)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2025-07-31 19:44:46 +02:00
Ettore Di Giacinto
173e0774c0 chore(model gallery): add flux.1-krea-dev-ggml (#5949)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 18:32:06 +02:00
Ettore Di Giacinto
8ece26ab7c chore(model gallery): add flux.1-dev-ggml-abliterated-v2-q8_0 (#5948)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 17:23:48 +02:00
Ettore Di Giacinto
d704cc7970 chore(model gallery): add flux.1-dev-ggml-q8_0 (#5947)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 17:13:19 +02:00
Ettore Di Giacinto
ab17baaae1 chore(capability): improve messages (#5944)
* chore(capability): improve messages

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: isolate to constants, do not detect from the first gpu

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 16:25:19 +02:00
Ettore Di Giacinto
ca358fcdca feat(stablediffusion-ggml): allow to load loras (#5943)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 16:25:05 +02:00
Ettore Di Giacinto
9aadfd485f chore: update swagger (#5946)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-31 16:22:27 +02:00
LocalAI [bot]
da3b0850de chore: ⬆️ Update ggml-org/whisper.cpp to f7502dca872866a310fe69d30b163fa87d256319 (#5941)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-31 09:26:30 +02:00
LocalAI [bot]
8b1e8b4cda chore: ⬆️ Update ggml-org/llama.cpp to e9192bec564780bd4313ad6524d20a0ab92797db (#5940)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-31 09:26:02 +02:00
Ettore Di Giacinto
3d22bfc27c feat(stablediffusion-ggml): add support to ref images (flux Kontext) (#5935)
* feat(stablediffusion-ggml): add support to ref images

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add it to the model gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 22:42:34 +02:00
Ettore Di Giacinto
4438b4361e chore(model gallery): add qwen_qwen3-30b-a3b-thinking-2507 (#5939)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 21:18:56 +02:00
Ettore Di Giacinto
04bad9a2da chore(model gallery): add arcee-ai_afm-4.5b (#5938)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 15:37:07 +02:00
Ettore Di Giacinto
8235e53602 chore(model gallery): add qwen_qwen3-30b-a3b-instruct-2507 (#5936)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-07-30 15:29:34 +02:00
LocalAI [bot]
eb5c3670f1 chore: ⬆️ Update ggml-org/llama.cpp to aa79524c51fb014f8df17069d31d7c44b9ea6cb8 (#5934)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-29 21:05:00 +00:00
LocalAI [bot]
89e61fca90 chore: ⬆️ Update ggml-org/whisper.cpp to d0a9d8c7f8f7b91c51d77bbaa394b915f79cde6b (#5932)
⬆️ Update ggml-org/whisper.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-29 08:02:01 +02:00
LocalAI [bot]
9d6efe8842 chore: ⬆️ Update leejet/stable-diffusion.cpp to f6b9aa1a4373e322ff12c15b8a0749e6dd6f0253 (#5930)
⬆️ Update leejet/stable-diffusion.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-29 08:01:30 +02:00
LocalAI [bot]
60726d16f2 chore: ⬆️ Update ggml-org/llama.cpp to 8ad7b3e65b5834e5574c2f5640056c9047b5d93b (#5931)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-29 08:01:03 +02:00
LocalAI [bot]
9d7ec09ec0 docs: ⬆️ update docs version mudler/LocalAI (#5929)
⬆️ Update docs version mudler/LocalAI

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2025-07-28 21:03:44 +00:00
36 changed files with 1522 additions and 1263 deletions

View File

@@ -51,12 +51,12 @@ jobs:
grpc-base-image: "ubuntu:22.04"
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'sycl_f16'
- build-type: 'sycl'
platforms: 'linux/amd64'
tag-latest: 'false'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: 'sycl-f16'
tag-suffix: 'sycl'
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
- build-type: 'vulkan'

View File

@@ -109,24 +109,15 @@ jobs:
skip-drivers: 'false'
makeflags: "--jobs=4 --output-sync=target"
aio: "-aio-gpu-vulkan"
- build-type: 'sycl_f16'
- build-type: 'intel'
platforms: 'linux/amd64'
tag-latest: 'auto'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-gpu-intel-f16'
tag-suffix: '-gpu-intel'
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
aio: "-aio-gpu-intel-f16"
- build-type: 'sycl_f32'
platforms: 'linux/amd64'
tag-latest: 'auto'
base-image: "quay.io/go-skynet/intel-oneapi-base:latest"
grpc-base-image: "ubuntu:22.04"
tag-suffix: '-gpu-intel-f32'
runs-on: 'ubuntu-latest'
makeflags: "--jobs=3 --output-sync=target"
aio: "-aio-gpu-intel-f32"
aio: "-aio-gpu-intel"
gh-runner:
uses: ./.github/workflows/image_build.yml

View File

@@ -100,6 +100,8 @@ RUN if [ "${BUILD_TYPE}" = "hipblas" ] && [ "${SKIP_DRIVERS}" = "false" ]; then
ldconfig \
; fi
RUN expr "${BUILD_TYPE}" = intel && echo "intel" > /run/localai/capability || echo "not intel"
# Cuda
ENV PATH=/usr/local/cuda/bin:${PATH}

View File

@@ -5,8 +5,6 @@ BINARY_NAME=local-ai
GORELEASER?=
ONEAPI_VERSION?=2025.2
export BUILD_TYPE?=
GO_TAGS?=
@@ -340,19 +338,11 @@ docker-aio-all:
docker-image-intel:
docker build \
--build-arg BASE_IMAGE=intel/oneapi-basekit:${ONEAPI_VERSION}.0-0-devel-ubuntu24.04 \
--build-arg BASE_IMAGE=quay.io/go-skynet/intel-oneapi-base:latest \
--build-arg IMAGE_TYPE=$(IMAGE_TYPE) \
--build-arg GO_TAGS="$(GO_TAGS)" \
--build-arg MAKEFLAGS="$(DOCKER_MAKEFLAGS)" \
--build-arg BUILD_TYPE=sycl_f32 -t $(DOCKER_IMAGE) .
docker-image-intel-xpu:
docker build \
--build-arg BASE_IMAGE=intel/oneapi-basekit:${ONEAPI_VERSION}.0-0-devel-ubuntu22.04 \
--build-arg IMAGE_TYPE=$(IMAGE_TYPE) \
--build-arg GO_TAGS="$(GO_TAGS)" \
--build-arg MAKEFLAGS="$(DOCKER_MAKEFLAGS)" \
--build-arg BUILD_TYPE=sycl_f32 -t $(DOCKER_IMAGE) .
--build-arg BUILD_TYPE=intel -t $(DOCKER_IMAGE) .
########################################################
## Backends

View File

@@ -140,11 +140,7 @@ docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri
### Intel GPU Images (oneAPI):
```bash
# Intel GPU with FP16 support
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel-f16
# Intel GPU with FP32 support
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel-f32
docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel
```
### Vulkan GPU Images:
@@ -166,7 +162,7 @@ docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-ai
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11
# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel-f16
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel
# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas

View File

@@ -96,17 +96,6 @@ RUN if [ "${BUILD_TYPE}" = "hipblas" ] && [ "${SKIP_DRIVERS}" = "false" ]; then
ldconfig \
; fi
# Intel oneAPI requirements
RUN <<EOT bash
if [[ "${BUILD_TYPE}" == sycl* ]] && [ "${SKIP_DRIVERS}" = "false" ]; then
apt-get update && \
apt-get install -y --no-install-recommends \
intel-oneapi-runtime-libs && \
apt-get clean && \
rm -rf /var/lib/apt/lists/*
fi
EOT
# Install Go
RUN curl -L -s https://go.dev/dl/go${GO_VERSION}.linux-${TARGETARCH}.tar.gz | tar -C /usr/local -xz
ENV PATH=$PATH:/root/go/bin:/usr/local/go/bin:/usr/local/bin

View File

@@ -305,6 +305,9 @@ message GenerateImageRequest {
// Diffusers
string EnableParameters = 10;
int32 CLIPSkip = 11;
// Reference images for models that support them (e.g., Flux Kontext)
repeated string ref_images = 12;
}
message GenerateVideoRequest {
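On the Go side, the new `repeated string ref_images` protobuf field maps to a `[]string` slice on the generated request struct. A minimal sketch, using a stand-in struct rather than the real generated `pb.GenerateImageRequest` type:

```go
package main

import "fmt"

// GenerateImageRequest stands in for the protobuf-generated struct;
// the repeated string ref_images field becomes a []string in Go.
type GenerateImageRequest struct {
	EnableParameters string
	CLIPSkip         int32
	RefImages        []string
}

func main() {
	// Reference images are passed as file paths, as in the endpoint diff below.
	req := GenerateImageRequest{
		RefImages: []string{"/tmp/ref1.png", "/tmp/ref2.png"},
	}
	fmt.Println(len(req.RefImages))
}
```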

View File

@@ -1,5 +1,5 @@
LLAMA_VERSION?=bf78f5439ee8e82e367674043303ebf8e92b4805
LLAMA_VERSION?=d31192b4ee1441bbbecd3cbf9e02633368bdc4f5
LLAMA_REPO?=https://github.com/ggerganov/llama.cpp
CMAKE_ARGS?=
@@ -26,7 +26,7 @@ else ifeq ($(BUILD_TYPE),openblas)
# If build type is clblas (openCL) we set -DGGML_CLBLAST=ON -DCLBlast_DIR=/some/path
else ifeq ($(BUILD_TYPE),clblas)
CMAKE_ARGS+=-DGGML_CLBLAST=ON -DCLBlast_DIR=/some/path
# If it's hipblas we do have also to set CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++
else ifeq ($(BUILD_TYPE),hipblas)
ROCM_HOME ?= /opt/rocm
ROCM_PATH ?= /opt/rocm

View File

@@ -19,10 +19,10 @@ LD_FLAGS?=
# stablediffusion.cpp (ggml)
STABLEDIFFUSION_GGML_REPO?=https://github.com/leejet/stable-diffusion.cpp
STABLEDIFFUSION_GGML_VERSION?=eed97a5e1d054f9c1e7ac01982ae480411d4157e
STABLEDIFFUSION_GGML_VERSION?=5900ef6605c6fbf7934239f795c13c97bc993853
# Disable Shared libs as we are linking on static gRPC and we can't mix shared and static
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF
CMAKE_ARGS+=-DBUILD_SHARED_LIBS=OFF -DGGML_MAX_NAME=128 -DSD_USE_SYSTEM_GGML=OFF
ifeq ($(NATIVE),false)
CMAKE_ARGS+=-DGGML_NATIVE=OFF

View File

@@ -1,3 +1,5 @@
#define GGML_MAX_NAME 128
#include <stdio.h>
#include <string.h>
#include <time.h>
@@ -5,6 +7,7 @@
#include <random>
#include <string>
#include <vector>
#include <filesystem>
#include "gosd.h"
// #include "preprocessing.hpp"
@@ -85,7 +88,7 @@ void sd_log_cb(enum sd_log_level_t level, const char* log, void* data) {
fflush(stderr);
}
int load_model(char *model, char* options[], int threads, int diff) {
int load_model(char *model, char *model_path, char* options[], int threads, int diff) {
fprintf (stderr, "Loading model!\n");
sd_set_log_callback(sd_log_cb, NULL);
@@ -103,6 +106,8 @@ int load_model(char *model, char* options[], int threads, int diff) {
char *vae_path = "";
char *scheduler = "";
char *sampler = "";
char *lora_dir = model_path;
bool lora_dir_allocated = false;
fprintf(stderr, "parsing options\n");
@@ -132,6 +137,20 @@ int load_model(char *model, char* options[], int threads, int diff) {
if (!strcmp(optname, "sampler")) {
sampler = optval;
}
if (!strcmp(optname, "lora_dir")) {
// Path join with model dir
if (model_path && strlen(model_path) > 0) {
std::filesystem::path model_path_str(model_path);
std::filesystem::path lora_path(optval);
std::filesystem::path full_lora_path = model_path_str / lora_path;
lora_dir = strdup(full_lora_path.string().c_str());
lora_dir_allocated = true;
fprintf(stderr, "Lora dir resolved to: %s\n", lora_dir);
} else {
lora_dir = optval;
fprintf(stderr, "No model path provided, using lora dir as-is: %s\n", lora_dir);
}
}
}
fprintf(stderr, "parsed options\n");
@@ -176,7 +195,7 @@ int load_model(char *model, char* options[], int threads, int diff) {
ctx_params.vae_path = vae_path;
ctx_params.taesd_path = "";
ctx_params.control_net_path = "";
ctx_params.lora_model_dir = "";
ctx_params.lora_model_dir = lora_dir;
ctx_params.embedding_dir = "";
ctx_params.stacked_id_embed_dir = "";
ctx_params.vae_decode_only = false;
@@ -189,16 +208,25 @@ int load_model(char *model, char* options[], int threads, int diff) {
if (sd_ctx == NULL) {
fprintf (stderr, "failed loading model (generic error)\n");
// Clean up allocated memory
if (lora_dir_allocated && lora_dir) {
free(lora_dir);
}
return 1;
}
fprintf (stderr, "Created context: OK\n");
sd_c = sd_ctx;
// Clean up allocated memory
if (lora_dir_allocated && lora_dir) {
free(lora_dir);
}
return 0;
}
int gen_image(char *text, char *negativeText, int width, int height, int steps, int seed , char *dst, float cfg_scale) {
int gen_image(char *text, char *negativeText, int width, int height, int steps, int seed , char *dst, float cfg_scale, char *src_image, float strength, char *mask_image, char **ref_images, int ref_images_count) {
sd_image_t* results;
@@ -221,15 +249,187 @@ int gen_image(char *text, char *negativeText, int width, int height, int steps,
p.seed = seed;
p.input_id_images_path = "";
// Handle input image for img2img
bool has_input_image = (src_image != NULL && strlen(src_image) > 0);
bool has_mask_image = (mask_image != NULL && strlen(mask_image) > 0);
uint8_t* input_image_buffer = NULL;
uint8_t* mask_image_buffer = NULL;
std::vector<uint8_t> default_mask_image_vec;
if (has_input_image) {
fprintf(stderr, "Loading input image: %s\n", src_image);
int c = 0;
int img_width = 0;
int img_height = 0;
input_image_buffer = stbi_load(src_image, &img_width, &img_height, &c, 3);
if (input_image_buffer == NULL) {
fprintf(stderr, "Failed to load input image from '%s'\n", src_image);
return 1;
}
if (c < 3) {
fprintf(stderr, "Input image must have at least 3 channels, got %d\n", c);
free(input_image_buffer);
return 1;
}
// Resize input image if dimensions don't match
if (img_width != width || img_height != height) {
fprintf(stderr, "Resizing input image from %dx%d to %dx%d\n", img_width, img_height, width, height);
uint8_t* resized_image_buffer = (uint8_t*)malloc(height * width * 3);
if (resized_image_buffer == NULL) {
fprintf(stderr, "Failed to allocate memory for resized image\n");
free(input_image_buffer);
return 1;
}
stbir_resize(input_image_buffer, img_width, img_height, 0,
resized_image_buffer, width, height, 0, STBIR_TYPE_UINT8,
3, STBIR_ALPHA_CHANNEL_NONE, 0,
STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP,
STBIR_FILTER_BOX, STBIR_FILTER_BOX,
STBIR_COLORSPACE_SRGB, nullptr);
free(input_image_buffer);
input_image_buffer = resized_image_buffer;
}
p.init_image = {(uint32_t)width, (uint32_t)height, 3, input_image_buffer};
p.strength = strength;
fprintf(stderr, "Using img2img with strength: %.2f\n", strength);
} else {
// No input image, use empty image for text-to-image
p.init_image = {(uint32_t)width, (uint32_t)height, 3, NULL};
p.strength = 0.0f;
}
// Handle mask image for inpainting
if (has_mask_image) {
fprintf(stderr, "Loading mask image: %s\n", mask_image);
int c = 0;
int mask_width = 0;
int mask_height = 0;
mask_image_buffer = stbi_load(mask_image, &mask_width, &mask_height, &c, 1);
if (mask_image_buffer == NULL) {
fprintf(stderr, "Failed to load mask image from '%s'\n", mask_image);
if (input_image_buffer) free(input_image_buffer);
return 1;
}
// Resize mask if dimensions don't match
if (mask_width != width || mask_height != height) {
fprintf(stderr, "Resizing mask image from %dx%d to %dx%d\n", mask_width, mask_height, width, height);
uint8_t* resized_mask_buffer = (uint8_t*)malloc(height * width);
if (resized_mask_buffer == NULL) {
fprintf(stderr, "Failed to allocate memory for resized mask\n");
free(mask_image_buffer);
if (input_image_buffer) free(input_image_buffer);
return 1;
}
stbir_resize(mask_image_buffer, mask_width, mask_height, 0,
resized_mask_buffer, width, height, 0, STBIR_TYPE_UINT8,
1, STBIR_ALPHA_CHANNEL_NONE, 0,
STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP,
STBIR_FILTER_BOX, STBIR_FILTER_BOX,
STBIR_COLORSPACE_SRGB, nullptr);
free(mask_image_buffer);
mask_image_buffer = resized_mask_buffer;
}
p.mask_image = {(uint32_t)width, (uint32_t)height, 1, mask_image_buffer};
fprintf(stderr, "Using inpainting with mask\n");
} else {
// No mask image, create default full mask
default_mask_image_vec.resize(width * height, 255);
p.mask_image = {(uint32_t)width, (uint32_t)height, 1, default_mask_image_vec.data()};
}
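The default-mask fallback above fills a single-channel buffer with 255 (fully editable) so inpainting degrades gracefully when no mask is supplied. A Go sketch of the same idea:

```go
package main

import "fmt"

// defaultMask mirrors the fallback above: when no mask image is
// supplied, use a full mask (every pixel 255, i.e. fully editable),
// one byte per pixel for a single-channel image.
func defaultMask(width, height int) []byte {
	m := make([]byte, width*height)
	for i := range m {
		m[i] = 255
	}
	return m
}

func main() {
	m := defaultMask(4, 2)
	fmt.Println(len(m), m[0])
}
```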
// Handle reference images
std::vector<sd_image_t> ref_images_vec;
std::vector<uint8_t*> ref_image_buffers;
if (ref_images_count > 0 && ref_images != NULL) {
fprintf(stderr, "Loading %d reference images\n", ref_images_count);
for (int i = 0; i < ref_images_count; i++) {
if (ref_images[i] == NULL || strlen(ref_images[i]) == 0) {
continue;
}
fprintf(stderr, "Loading reference image %d: %s\n", i + 1, ref_images[i]);
int c = 0;
int ref_width = 0;
int ref_height = 0;
uint8_t* ref_image_buffer = stbi_load(ref_images[i], &ref_width, &ref_height, &c, 3);
if (ref_image_buffer == NULL) {
fprintf(stderr, "Failed to load reference image from '%s'\n", ref_images[i]);
continue;
}
if (c < 3) {
fprintf(stderr, "Reference image must have at least 3 channels, got %d\n", c);
free(ref_image_buffer);
continue;
}
// Resize reference image if dimensions don't match
if (ref_width != width || ref_height != height) {
fprintf(stderr, "Resizing reference image from %dx%d to %dx%d\n", ref_width, ref_height, width, height);
uint8_t* resized_ref_buffer = (uint8_t*)malloc(height * width * 3);
if (resized_ref_buffer == NULL) {
fprintf(stderr, "Failed to allocate memory for resized reference image\n");
free(ref_image_buffer);
continue;
}
stbir_resize(ref_image_buffer, ref_width, ref_height, 0,
resized_ref_buffer, width, height, 0, STBIR_TYPE_UINT8,
3, STBIR_ALPHA_CHANNEL_NONE, 0,
STBIR_EDGE_CLAMP, STBIR_EDGE_CLAMP,
STBIR_FILTER_BOX, STBIR_FILTER_BOX,
STBIR_COLORSPACE_SRGB, nullptr);
free(ref_image_buffer);
ref_image_buffer = resized_ref_buffer;
}
ref_image_buffers.push_back(ref_image_buffer);
ref_images_vec.push_back({(uint32_t)width, (uint32_t)height, 3, ref_image_buffer});
}
if (!ref_images_vec.empty()) {
p.ref_images = ref_images_vec.data();
p.ref_images_count = ref_images_vec.size();
fprintf(stderr, "Using %zu reference images\n", ref_images_vec.size());
}
}
results = generate_image(sd_c, &p);
if (results == NULL) {
fprintf (stderr, "NO results\n");
if (input_image_buffer) free(input_image_buffer);
if (mask_image_buffer) free(mask_image_buffer);
for (auto buffer : ref_image_buffers) {
if (buffer) free(buffer);
}
return 1;
}
if (results[0].data == NULL) {
fprintf (stderr, "Results with no data\n");
if (input_image_buffer) free(input_image_buffer);
if (mask_image_buffer) free(mask_image_buffer);
for (auto buffer : ref_image_buffers) {
if (buffer) free(buffer);
}
return 1;
}
@@ -245,11 +445,15 @@ int gen_image(char *text, char *negativeText, int width, int height, int steps,
results[0].data, 0, NULL);
fprintf (stderr, "Saved resulting image to '%s'\n", dst);
// TODO: free results. Why does it crash?
// Clean up
free(results[0].data);
results[0].data = NULL;
free(results);
if (input_image_buffer) free(input_image_buffer);
if (mask_image_buffer) free(mask_image_buffer);
for (auto buffer : ref_image_buffers) {
if (buffer) free(buffer);
}
fprintf(stderr, "gen_image is done: '%s'\n", dst);
return 0;

View File

@@ -29,16 +29,21 @@ func (sd *SDGGML) Load(opts *pb.ModelOptions) error {
sd.threads = int(opts.Threads)
modelPath := opts.ModelPath
modelFile := C.CString(opts.ModelFile)
defer C.free(unsafe.Pointer(modelFile))
modelPathC := C.CString(modelPath)
defer C.free(unsafe.Pointer(modelPathC))
var options **C.char
// prepare the options array to pass to C
size := C.size_t(unsafe.Sizeof((*C.char)(nil)))
length := C.size_t(len(opts.Options))
options = (**C.char)(C.malloc((length + 1) * size))
view := (*[1 << 30]*C.char)(unsafe.Pointer(options))[0:len(opts.Options) + 1:len(opts.Options) + 1]
view := (*[1 << 30]*C.char)(unsafe.Pointer(options))[0 : len(opts.Options)+1 : len(opts.Options)+1]
var diffusionModel int
@@ -70,7 +75,7 @@ func (sd *SDGGML) Load(opts *pb.ModelOptions) error {
sd.cfgScale = opts.CFGScale
ret := C.load_model(modelFile, options, C.int(opts.Threads), C.int(diffusionModel))
ret := C.load_model(modelFile, modelPathC, options, C.int(opts.Threads), C.int(diffusionModel))
if ret != 0 {
return fmt.Errorf("could not load model")
}
@@ -88,7 +93,56 @@ func (sd *SDGGML) GenerateImage(opts *pb.GenerateImageRequest) error {
negative := C.CString(opts.NegativePrompt)
defer C.free(unsafe.Pointer(negative))
ret := C.gen_image(t, negative, C.int(opts.Width), C.int(opts.Height), C.int(opts.Step), C.int(opts.Seed), dst, C.float(sd.cfgScale))
// Handle source image path
var srcImage *C.char
if opts.Src != "" {
srcImage = C.CString(opts.Src)
defer C.free(unsafe.Pointer(srcImage))
}
// Handle mask image path
var maskImage *C.char
if opts.EnableParameters != "" {
// Parse EnableParameters for mask path if provided
// This is a simple approach - in a real implementation you might want to parse JSON
if strings.Contains(opts.EnableParameters, "mask:") {
parts := strings.Split(opts.EnableParameters, "mask:")
if len(parts) > 1 {
maskPath := strings.TrimSpace(parts[1])
if maskPath != "" {
maskImage = C.CString(maskPath)
defer C.free(unsafe.Pointer(maskImage))
}
}
}
}
// Handle reference images
var refImages **C.char
var refImagesCount C.int
if len(opts.RefImages) > 0 {
refImagesCount = C.int(len(opts.RefImages))
// Allocate array of C strings
size := C.size_t(unsafe.Sizeof((*C.char)(nil)))
refImages = (**C.char)(C.malloc((C.size_t(len(opts.RefImages)) + 1) * size))
view := (*[1 << 30]*C.char)(unsafe.Pointer(refImages))[0 : len(opts.RefImages)+1 : len(opts.RefImages)+1]
for i, refImagePath := range opts.RefImages {
view[i] = C.CString(refImagePath)
defer C.free(unsafe.Pointer(view[i]))
}
view[len(opts.RefImages)] = nil
}
// Default strength for img2img (0.75 is a good default)
strength := C.float(0.75)
if opts.Src != "" {
// If we have a source image, use img2img mode
// You could also parse strength from EnableParameters if needed
strength = C.float(0.75)
}
ret := C.gen_image(t, negative, C.int(opts.Width), C.int(opts.Height), C.int(opts.Step), C.int(opts.Seed), dst, C.float(sd.cfgScale), srcImage, strength, maskImage, refImages, refImagesCount)
if ret != 0 {
return fmt.Errorf("inference failed")
}

View File

@@ -1,8 +1,8 @@
#ifdef __cplusplus
extern "C" {
#endif
int load_model(char *model, char* options[], int threads, int diffusionModel);
int gen_image(char *text, char *negativeText, int width, int height, int steps, int seed, char *dst, float cfg_scale);
int load_model(char *model, char *model_path, char* options[], int threads, int diffusionModel);
int gen_image(char *text, char *negativeText, int width, int height, int steps, int seed, char *dst, float cfg_scale, char *src_image, float strength, char *mask_image, char **ref_images, int ref_images_count);
#ifdef __cplusplus
}
#endif

View File

@@ -6,7 +6,7 @@ CMAKE_ARGS?=
# whisper.cpp version
WHISPER_REPO?=https://github.com/ggml-org/whisper.cpp
WHISPER_CPP_VERSION?=e7bf0294ec9099b5fc21f5ba969805dfb2108cea
WHISPER_CPP_VERSION?=0becabc8d68d9ffa6ddfba5240e38cd7a2642046
export WHISPER_CMAKE_ARGS?=-DBUILD_SHARED_LIBS=OFF
export WHISPER_DIR=$(abspath ./sources/whisper.cpp)

View File

@@ -7,7 +7,7 @@ import (
model "github.com/mudler/LocalAI/pkg/model"
)
func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negative_prompt, src, dst string, loader *model.ModelLoader, backendConfig config.BackendConfig, appConfig *config.ApplicationConfig) (func() error, error) {
func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negative_prompt, src, dst string, loader *model.ModelLoader, backendConfig config.BackendConfig, appConfig *config.ApplicationConfig, refImages []string) (func() error, error) {
opts := ModelOptions(backendConfig, appConfig)
inferenceModel, err := loader.Load(
@@ -33,6 +33,7 @@ func ImageGeneration(height, width, mode, step, seed int, positive_prompt, negat
Dst: dst,
Src: src,
EnableParameters: backendConfig.Diffusers.EnableParameters,
RefImages: refImages,
})
return err
}

View File

@@ -11,6 +11,7 @@ import (
"github.com/mudler/LocalAI/pkg/downloader"
"github.com/mudler/LocalAI/pkg/model"
"github.com/mudler/LocalAI/pkg/system"
cp "github.com/otiai10/copy"
"github.com/rs/zerolog/log"
)
@@ -145,18 +146,27 @@ func InstallBackend(basePath string, config *GalleryBackend, downloadStatus func
}
uri := downloader.URI(config.URI)
if err := uri.DownloadFile(backendPath, "", 1, 1, downloadStatus); err != nil {
success := false
// Try to download from mirrors
for _, mirror := range config.Mirrors {
if err := downloader.URI(mirror).DownloadFile(backendPath, "", 1, 1, downloadStatus); err == nil {
success = true
break
}
// Check if it is a directory
if uri.LooksLikeDir() {
// It is a directory, we just copy it over in the backend folder
if err := cp.Copy(config.URI, backendPath); err != nil {
return fmt.Errorf("failed copying: %w", err)
}
} else {
uri := downloader.URI(config.URI)
if err := uri.DownloadFile(backendPath, "", 1, 1, downloadStatus); err != nil {
success := false
// Try to download from mirrors
for _, mirror := range config.Mirrors {
if err := downloader.URI(mirror).DownloadFile(backendPath, "", 1, 1, downloadStatus); err == nil {
success = true
break
}
}
if !success {
return fmt.Errorf("failed to download backend %q: %v", config.URI, err)
if !success {
return fmt.Errorf("failed to download backend %q: %v", config.URI, err)
}
}
}
@@ -240,16 +250,22 @@ func ListSystemBackends(basePath string) (map[string]string, error) {
for _, backend := range backends {
if backend.IsDir() {
runFile := filepath.Join(basePath, backend.Name(), runFile)
// Skip if metadata file don't exist
var metadata *BackendMetadata
// If metadata file does not exist, we just use the directory name
// and we do not fill the other metadata (such as potential backend Aliases)
metadataFilePath := filepath.Join(basePath, backend.Name(), metadataFile)
if _, err := os.Stat(metadataFilePath); os.IsNotExist(err) {
continue
}
// Check for alias in metadata
metadata, err := readBackendMetadata(filepath.Join(basePath, backend.Name()))
if err != nil {
return nil, err
metadata = &BackendMetadata{
Name: backend.Name(),
}
} else {
// Check for alias in metadata
metadata, err = readBackendMetadata(filepath.Join(basePath, backend.Name()))
if err != nil {
return nil, err
}
}
if metadata == nil {

View File

@@ -34,7 +34,7 @@ func CreateBackendEndpointService(galleries []config.Gallery, backendPath string
// GetOpStatusEndpoint returns the job status
// @Summary Returns the job status
// @Success 200 {object} services.BackendOpStatus "Response"
// @Success 200 {object} services.GalleryOpStatus "Response"
// @Router /backends/jobs/{uuid} [get]
func (mgs *BackendEndpointService) GetOpStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
@@ -48,7 +48,7 @@ func (mgs *BackendEndpointService) GetOpStatusEndpoint() func(c *fiber.Ctx) erro
// GetAllStatusEndpoint returns all the jobs status progress
// @Summary Returns all the jobs status progress
// @Success 200 {object} map[string]services.BackendOpStatus "Response"
// @Success 200 {object} map[string]services.GalleryOpStatus "Response"
// @Router /backends/jobs [get]
func (mgs *BackendEndpointService) GetAllStatusEndpoint() func(c *fiber.Ctx) error {
return func(c *fiber.Ctx) error {
@@ -58,7 +58,7 @@ func (mgs *BackendEndpointService) GetAllStatusEndpoint() func(c *fiber.Ctx) err
// ApplyBackendEndpoint installs a new backend to a LocalAI instance
// @Summary Install backends to LocalAI.
// @Param request body BackendModel true "query params"
// @Param request body GalleryBackend true "query params"
// @Success 200 {object} schema.BackendResponse "Response"
// @Router /backends/apply [post]
func (mgs *BackendEndpointService) ApplyBackendEndpoint() func(c *fiber.Ctx) error {

View File

@@ -79,49 +79,37 @@ func ImageEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, appCon
return fiber.ErrBadRequest
}
// Process input images (for img2img/inpainting)
src := ""
if input.File != "" {
src = processImageFile(input.File, appConfig.GeneratedContentDir)
if src != "" {
defer os.RemoveAll(src)
}
}
fileData := []byte{}
var err error
// check if input.File is an URL, if so download it and save it
// to a temporary file
if strings.HasPrefix(input.File, "http://") || strings.HasPrefix(input.File, "https://") {
out, err := downloadFile(input.File)
if err != nil {
return fmt.Errorf("failed downloading file:%w", err)
}
defer os.RemoveAll(out)
fileData, err = os.ReadFile(out)
if err != nil {
return fmt.Errorf("failed reading file:%w", err)
}
} else {
// base 64 decode the file and write it somewhere
// that we will cleanup
fileData, err = base64.StdEncoding.DecodeString(input.File)
if err != nil {
return err
// Process multiple input images
var inputImages []string
if len(input.Files) > 0 {
for _, file := range input.Files {
processedFile := processImageFile(file, appConfig.GeneratedContentDir)
if processedFile != "" {
inputImages = append(inputImages, processedFile)
defer os.RemoveAll(processedFile)
}
}
}
// Create a temporary file
outputFile, err := os.CreateTemp(appConfig.GeneratedContentDir, "b64")
if err != nil {
return err
// Process reference images
var refImages []string
if len(input.RefImages) > 0 {
for _, file := range input.RefImages {
processedFile := processImageFile(file, appConfig.GeneratedContentDir)
if processedFile != "" {
refImages = append(refImages, processedFile)
defer os.RemoveAll(processedFile)
}
}
// write the base64 result
writer := bufio.NewWriter(outputFile)
_, err = writer.Write(fileData)
if err != nil {
outputFile.Close()
return err
}
outputFile.Close()
src = outputFile.Name()
defer os.RemoveAll(src)
}
log.Debug().Msgf("Parameter Config: %+v", config)
@@ -202,7 +190,13 @@ func ImageEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, appCon
baseURL := c.BaseURL()
fn, err := backend.ImageGeneration(height, width, mode, step, *config.Seed, positive_prompt, negative_prompt, src, output, ml, *config, appConfig)
// Use the first input image as src if available, otherwise use the original src
inputSrc := src
if len(inputImages) > 0 {
inputSrc = inputImages[0]
}
fn, err := backend.ImageGeneration(height, width, mode, step, *config.Seed, positive_prompt, negative_prompt, inputSrc, output, ml, *config, appConfig, refImages)
if err != nil {
return err
}
@@ -243,3 +237,51 @@ func ImageEndpoint(cl *config.BackendConfigLoader, ml *model.ModelLoader, appCon
return c.JSON(resp)
}
}
// processImageFile handles a single image file (URL or base64) and returns the path to the temporary file
func processImageFile(file string, generatedContentDir string) string {
fileData := []byte{}
var err error
// check if file is a URL; if so, download it and save it to a temporary file
if strings.HasPrefix(file, "http://") || strings.HasPrefix(file, "https://") {
out, err := downloadFile(file)
if err != nil {
log.Error().Err(err).Msgf("Failed downloading file: %s", file)
return ""
}
defer os.RemoveAll(out)
fileData, err = os.ReadFile(out)
if err != nil {
log.Error().Err(err).Msgf("Failed reading downloaded file: %s", out)
return ""
}
} else {
// base64-decode the file and write it somewhere that we will clean up
fileData, err = base64.StdEncoding.DecodeString(file)
if err != nil {
log.Error().Err(err).Msgf("Failed decoding base64 file")
return ""
}
}
// Create a temporary file
outputFile, err := os.CreateTemp(generatedContentDir, "b64")
if err != nil {
log.Error().Err(err).Msg("Failed creating temporary file")
return ""
}
// write the file data
writer := bufio.NewWriter(outputFile)
_, err = writer.Write(fileData)
if err != nil {
outputFile.Close()
log.Error().Err(err).Msg("Failed writing to temporary file")
return ""
}
// flush the buffered writer, or trailing bytes may never reach the file
if err := writer.Flush(); err != nil {
outputFile.Close()
log.Error().Err(err).Msg("Failed flushing temporary file")
return ""
}
outputFile.Close()
return outputFile.Name()
}


@@ -141,6 +141,10 @@ type OpenAIRequest struct {
// whisper
File string `json:"file" validate:"required"`
// Multiple input images for img2img or inpainting
Files []string `json:"files,omitempty"`
// Reference images for models that support them (e.g., Flux Kontext)
RefImages []string `json:"ref_images,omitempty"`
//whisper/image
ResponseFormat interface{} `json:"response_format,omitempty"`
// image


@@ -22,6 +22,17 @@ func InstallExternalBackends(galleries []config.Gallery, backendPath string, dow
for _, backend := range backends {
uri := downloader.URI(backend)
switch {
case uri.LooksLikeDir():
name := filepath.Base(backend)
log.Info().Str("backend", backend).Str("name", name).Msg("Installing backend from path")
if err := gallery.InstallBackend(backendPath, &gallery.GalleryBackend{
Metadata: gallery.Metadata{
Name: name,
},
URI: backend,
}, downloadStatus); err != nil {
errs = errors.Join(err, fmt.Errorf("error installing backend %s", backend))
}
case uri.LooksLikeOCI():
name, err := uri.FilenameFromUrl()
if err != nil {


@@ -445,15 +445,17 @@ make -C backend/python/vllm
When LocalAI runs in a container,
there are additional environment variables available that modify the behavior of LocalAI on startup:
{{< table "table-responsive" >}}
| Environment variable | Default | Description |
|----------------------------|---------|------------------------------------------------------------------------------------------------------------|
| `REBUILD` | `false` | Rebuild LocalAI on startup |
| `BUILD_TYPE` | | Build type. Available: `cublas`, `openblas`, `clblas` |
| `BUILD_TYPE` | | Build type. Available: `cublas`, `openblas`, `clblas`, `intel` (intel core), `sycl_f16`, `sycl_f32` (intel backends) |
| `GO_TAGS` | | Go tags. Available: `stablediffusion` |
| `HUGGINGFACEHUB_API_TOKEN` | | Special token for interacting with HuggingFace Inference API, required only when using the `langchain-huggingface` backend |
| `EXTRA_BACKENDS` | | A space separated list of backends to prepare. For example `EXTRA_BACKENDS="backend/python/diffusers backend/python/transformers"` prepares the python environment on start |
| `DISABLE_AUTODETECT` | `false` | Disable autodetect of CPU flagset on start |
| `LLAMACPP_GRPC_SERVERS` | | A list of llama.cpp workers to distribute the workload. For example `LLAMACPP_GRPC_SERVERS="address1:port,address2:port"` |
{{< /table >}}
Here is how to configure these variables:
@@ -471,12 +473,15 @@ You can control LocalAI with command line arguments, to specify a binding addres
In the help text below, BASEPATH is the location that local-ai is being executed from
#### Global Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| -h, --help | | Show context-sensitive help. |
| --log-level | info | Set the level of logs to output [error,warn,info,debug] | $LOCALAI_LOG_LEVEL |
{{< /table >}}
#### Storage Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| --models-path | BASEPATH/models | Path containing models used for inferencing | $LOCALAI_MODELS_PATH |
@@ -487,8 +492,10 @@ In the help text below, BASEPATH is the location that local-ai is being executed
| --localai-config-dir | BASEPATH/configuration | Directory for dynamic loading of certain configuration files (currently api_keys.json and external_backends.json) | $LOCALAI_CONFIG_DIR |
| --localai-config-dir-poll-interval | | Typically the config path picks up changes automatically, but if your system has broken fsnotify events, set this to a time duration to poll the LocalAI Config Dir (example: 1m) | $LOCALAI_CONFIG_DIR_POLL_INTERVAL |
| --models-config-file | STRING | YAML file containing a list of model backend configs | $LOCALAI_MODELS_CONFIG_FILE |
{{< /table >}}
#### Models Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| --galleries | STRING | JSON list of galleries | $LOCALAI_GALLERIES |
@@ -497,15 +504,19 @@ In the help text below, BASEPATH is the location that local-ai is being executed
| --preload-models | STRING | A List of models to apply in JSON at start |$LOCALAI_PRELOAD_MODELS |
| --models | MODELS,... | A List of model configuration URLs to load | $LOCALAI_MODELS |
| --preload-models-config | STRING | A List of models to apply at startup. Path to a YAML config file | $LOCALAI_PRELOAD_MODELS_CONFIG |
{{< /table >}}
#### Performance Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| --f16 | | Enable GPU acceleration | $LOCALAI_F16 |
| -t, --threads | 4 | Number of threads used for parallel computation. Usage of the number of physical cores in the system is suggested | $LOCALAI_THREADS |
| --context-size | 512 | Default context size for models | $LOCALAI_CONTEXT_SIZE |
{{< /table >}}
#### API Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| --address | ":8080" | Bind address for the API server | $LOCALAI_ADDRESS |
@@ -516,8 +527,10 @@ In the help text below, BASEPATH is the location that local-ai is being executed
| --disable-welcome | | Disable welcome pages | $LOCALAI_DISABLE_WELCOME |
| --disable-webui | false | Disables the web user interface. When set to true, the server will only expose API endpoints without serving the web interface | $LOCALAI_DISABLE_WEBUI |
| --machine-tag | | If not empty - put that string to Machine-Tag header in each response. Useful to track response from different machines using multiple P2P federated nodes | $LOCALAI_MACHINE_TAG |
{{< /table >}}
#### Backend Flags
{{< table "table-responsive" >}}
| Parameter | Default | Description | Environment Variable |
|-----------|---------|-------------|----------------------|
| --parallel-requests | | Enable backends to handle multiple requests in parallel if they support it (e.g.: llama.cpp or vllm) | $LOCALAI_PARALLEL_REQUESTS |
@@ -528,6 +541,7 @@ In the help text below, BASEPATH is the location that local-ai is being executed
| --watchdog-idle-timeout | 15m | Threshold beyond which an idle backend should be stopped | $LOCALAI_WATCHDOG_IDLE_TIMEOUT, $WATCHDOG_IDLE_TIMEOUT |
| --enable-watchdog-busy | | Enable watchdog for stopping backends that are busy longer than the watchdog-busy-timeout | $LOCALAI_WATCHDOG_BUSY |
| --watchdog-busy-timeout | 5m | Threshold beyond which a busy backend should be stopped | $LOCALAI_WATCHDOG_BUSY_TIMEOUT |
{{< /table >}}
### .env files


@@ -267,7 +267,7 @@ If building from source, you need to install [Intel oneAPI Base Toolkit](https:/
### Container images
To use SYCL, use the images with the `gpu-intel-f16` or `gpu-intel-f32` tag, for example `{{< version >}}-gpu-intel-f32-core`, `{{< version >}}-gpu-intel-f16`, ...
To use SYCL, use the images with `gpu-intel` in the tag, for example `{{< version >}}-gpu-intel`, ...
The image list is on [quay](https://quay.io/repository/go-skynet/local-ai?tab=tags).
@@ -276,7 +276,7 @@ The image list is on [quay](https://quay.io/repository/go-skynet/local-ai?tab=ta
To run LocalAI with Docker and sycl starting `phi-2`, you can use the following command as an example:
```bash
docker run -e DEBUG=true --privileged -ti -v $PWD/models:/models -p 8080:8080 -v /dev/dri:/dev/dri --rm quay.io/go-skynet/local-ai:master-gpu-intel-f32 phi-2
docker run -e DEBUG=true --privileged -ti -v $PWD/models:/models -p 8080:8080 -v /dev/dri:/dev/dri --rm quay.io/go-skynet/local-ai:master-gpu-intel phi-2
```
### Notes
@@ -284,7 +284,7 @@ docker run -e DEBUG=true --privileged -ti -v $PWD/models:/models -p 8080:8080 -
In addition to the commands to run LocalAI normally, you need to specify `--device /dev/dri` to docker, for example:
```bash
docker run --rm -ti --device /dev/dri -p 8080:8080 -e DEBUG=true -e MODELS_PATH=/models -e THREADS=1 -v $PWD/models:/models quay.io/go-skynet/local-ai:{{< version >}}-gpu-intel-f16
docker run --rm -ti --device /dev/dri -p 8080:8080 -e DEBUG=true -e MODELS_PATH=/models -e THREADS=1 -v $PWD/models:/models quay.io/go-skynet/local-ai:{{< version >}}-gpu-intel
```
Note also that SYCL has a known issue where it can hang when `mmap: true` is set. If mmap is explicitly enabled, disable it in the model configuration.
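A minimal model-configuration sketch turning mmap off (the model name is illustrative):

```yaml
# model YAML: disable memory mapping when running on SYCL
name: phi-2
mmap: false
```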


@@ -129,6 +129,7 @@ The server logs should indicate that new workers are being discovered.
There are options that can be tweaked or parameters that can be set using environment variables
{{< table "table-responsive" >}}
| Environment Variable | Description |
|----------------------|-------------|
| **LOCALAI_P2P** | Set to "true" to enable p2p |
@@ -142,6 +143,7 @@ There are options that can be tweaked or parameters that can be set using enviro
| **LOCALAI_P2P_TOKEN** | Set the token for the p2p network |
| **LOCALAI_P2P_LOGLEVEL** | Set the loglevel for the LocalAI p2p stack (default: info) |
| **LOCALAI_P2P_LIB_LOGLEVEL** | Set the loglevel for the underlying libp2p stack (default: fatal) |
{{< /table >}}
## Architecture


@@ -197,4 +197,4 @@ docker build --build-arg BUILD_TYPE=$(BUILD_TYPE) --build-arg BASE_IMAGE=$(BASE_
Note:
- BUILD_TYPE can be either: `cublas`, `hipblas`, `sycl_f16`, `sycl_f32`, `metal`.
- BASE_IMAGE is tested on `ubuntu:22.04` (and defaults to it)
- BASE_IMAGE is tested on `ubuntu:22.04` (and defaults to it) and `quay.io/go-skynet/intel-oneapi-base:latest` for intel/sycl


@@ -41,6 +41,7 @@ All-In-One images are images that come pre-configured with a set of models and b
In the AIO images there are models configured with the names of OpenAI models, however, they are really backed by Open Source models. You can find the table below
{{< table "table-responsive" >}}
| Category | Model name | Real model (CPU) | Real model (GPU) |
| ---- | ---- | ---- | ---- |
| Text Generation | `gpt-4` | `phi-2` | `hermes-2-pro-mistral` |
@@ -49,6 +50,7 @@ In the AIO images there are models configured with the names of OpenAI models, h
| Speech to Text | `whisper-1` | `whisper` with `whisper-base` model | <= same |
| Text to Speech | `tts-1` | `en-us-amy-low.onnx` from `rhasspy/piper` | <= same |
| Embeddings | `text-embedding-ada-002` | `all-MiniLM-L6-v2` in Q4 | `all-MiniLM-L6-v2` |
{{< /table >}}
### Usage
@@ -131,8 +133,7 @@ docker run -p 8080:8080 --name local-ai -ti -v localai-models:/models localai/lo
| Latest images for Nvidia GPU (CUDA11) | `quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-11` | `localai/localai:latest-aio-gpu-nvidia-cuda-11` |
| Latest images for Nvidia GPU (CUDA12) | `quay.io/go-skynet/local-ai:latest-aio-gpu-nvidia-cuda-12` | `localai/localai:latest-aio-gpu-nvidia-cuda-12` |
| Latest images for AMD GPU | `quay.io/go-skynet/local-ai:latest-aio-gpu-hipblas` | `localai/localai:latest-aio-gpu-hipblas` |
| Latest images for Intel GPU (sycl f16) | `quay.io/go-skynet/local-ai:latest-aio-gpu-intel-f16` | `localai/localai:latest-aio-gpu-intel-f16` |
| Latest images for Intel GPU (sycl f32) | `quay.io/go-skynet/local-ai:latest-aio-gpu-intel-f32` | `localai/localai:latest-aio-gpu-intel-f32` |
| Latest images for Intel GPU | `quay.io/go-skynet/local-ai:latest-aio-gpu-intel` | `localai/localai:latest-aio-gpu-intel` |
### Available environment variables
@@ -179,23 +180,13 @@ Standard container images do not have pre-installed models.
{{% /tab %}}
{{% tab tabName="Intel GPU (sycl f16)" %}}
{{% tab tabName="Intel GPU" %}}
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-intel-f16` | `localai/localai:master-gpu-intel-f16` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f16` | `localai/localai:latest-gpu-intel-f16` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-intel-f16` | `localai/localai:{{< version >}}-gpu-intel-f16` |
{{% /tab %}}
{{% tab tabName="Intel GPU (sycl f32)" %}}
| Description | Quay | Docker Hub |
| --- | --- |-------------------------------------------------------------|
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-intel-f32` | `localai/localai:master-gpu-intel-f32` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel-f32` | `localai/localai:latest-gpu-intel-f32` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-intel-f32` | `localai/localai:{{< version >}}-gpu-intel-f32` |
| Latest images from the branch (development) | `quay.io/go-skynet/local-ai:master-gpu-intel` | `localai/localai:master-gpu-intel` |
| Latest tag | `quay.io/go-skynet/local-ai:latest-gpu-intel` | `localai/localai:latest-gpu-intel` |
| Versioned image | `quay.io/go-skynet/local-ai:{{< version >}}-gpu-intel` | `localai/localai:{{< version >}}-gpu-intel` |
{{% /tab %}}


@@ -59,11 +59,7 @@ docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri
#### Intel GPU Images (oneAPI):
```bash
# Intel GPU with FP16 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f16
# Intel GPU with FP32 support
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel-f32
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-intel
```
#### Vulkan GPU Images:
@@ -85,7 +81,7 @@ docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-ai
docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-aio-gpu-nvidia-cuda-11
# Intel GPU version
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel-f16
docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-aio-gpu-intel
# AMD GPU version
docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-aio-gpu-hipblas


@@ -14,6 +14,7 @@ LocalAI will attempt to automatically load models which are not explicitly confi
{{% /alert %}}
{{< table "table-responsive" >}}
| Backend and Bindings | Compatible models | Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration |
|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------|
| [llama.cpp]({{%relref "docs/features/text-generation#llama.cpp" %}}) | LLama, Mamba, RWKV, Falcon, Starcoder, GPT-2, [and many others](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description) | yes | GPT and Functions | yes | yes | CUDA, openCL, cuBLAS, Metal |
@@ -34,6 +35,7 @@ LocalAI will attempt to automatically load models which are not explicitly confi
| [bark-cpp](https://github.com/PABannier/bark.cpp) | bark | no | Audio-Only | no | no | yes |
| [stablediffusion-cpp](https://github.com/leejet/stable-diffusion.cpp) | stablediffusion-1, stablediffusion-2, stablediffusion-3, flux, PhotoMaker | no | Image | no | no | N/A |
| [silero-vad](https://github.com/snakers4/silero-vad) with [Golang bindings](https://github.com/streamer45/silero-vad-go) | Silero VAD | no | Voice Activity Detection | no | no | CPU |
{{< /table >}}
Note: any backend name listed above can be used in the `backend` field of the model configuration file (See [the advanced section]({{%relref "docs/advanced" %}})).


@@ -1,3 +1,3 @@
{
"version": "v3.2.3"
"version": "v3.3.1"
}


@@ -715,11 +715,10 @@ install_docker() {
$envs \
-d -p $PORT:8080 --name local-ai localai/localai:$IMAGE_TAG $STARTCOMMAND
elif [ "$HAS_INTEL" ]; then
# Default to FP32 for better compatibility
IMAGE_TAG=${LOCALAI_VERSION}-gpu-intel-f32
IMAGE_TAG=${LOCALAI_VERSION}-gpu-intel
# AIO
if [ "$USE_AIO" = true ]; then
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-intel-f32
IMAGE_TAG=${LOCALAI_VERSION}-aio-gpu-intel
fi
info "Starting LocalAI Docker container..."


@@ -1,4 +1,32 @@
---
- &afm
name: "arcee-ai_afm-4.5b"
url: "github:mudler/LocalAI/gallery/chatml.yaml@master"
icon: https://cdn-uploads.huggingface.co/production/uploads/6435718aaaef013d1aec3b8b/Lj9YVLIKKdImV_jID0A1g.png
license: aml
urls:
- https://huggingface.co/arcee-ai/AFM-4.5B
- https://huggingface.co/bartowski/arcee-ai_AFM-4.5B-GGUF
tags:
- gguf
- gpu
- text-generation
description: |
AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. We use a modified version of TorchTitan for pretraining, Axolotl for supervised fine-tuning, and a modified version of Verifiers for reinforcement learning.
The development of AFM-4.5B prioritized data quality as a fundamental requirement for achieving robust model performance. We collaborated with DatologyAI, a company specializing in large-scale data curation. DatologyAI's curation pipeline integrates a suite of proprietary algorithms—model-based quality filtering, embedding-based curation, target distribution-matching, source mixing, and synthetic data. Their expertise enabled the creation of a curated dataset tailored to support strong real-world performance.
The model architecture follows a standard transformer decoder-only design based on Vaswani et al., incorporating several key modifications for enhanced performance and efficiency. Notable architectural features include grouped query attention for improved inference efficiency and ReLU^2 activation functions instead of SwiGLU to enable sparsification while maintaining or exceeding performance benchmarks.
The model available in this repo is the instruct model following supervised fine-tuning and reinforcement learning.
overrides:
parameters:
model: arcee-ai_AFM-4.5B-Q4_K_M.gguf
files:
- filename: arcee-ai_AFM-4.5B-Q4_K_M.gguf
sha256: f05516b323f581bebae1af2cbf900d83a2569b0a60c54366daf4a9c15ae30d4f
uri: huggingface://bartowski/arcee-ai_AFM-4.5B-GGUF/arcee-ai_AFM-4.5B-Q4_K_M.gguf
- &rfdetr
name: "rfdetr-base"
url: "github:mudler/LocalAI/gallery/virtual.yaml@master"
@@ -1878,6 +1906,43 @@
- filename: Menlo_Lucy-128k-Q4_K_M.gguf
sha256: fb3e591cccc5d2821f3c615fd6dc2ca86d409f56fbc124275510a9612a90e61f
uri: huggingface://bartowski/Menlo_Lucy-128k-GGUF/Menlo_Lucy-128k-Q4_K_M.gguf
- !!merge <<: *qwen3
name: "qwen_qwen3-30b-a3b-instruct-2507"
urls:
- https://huggingface.co/Qwen/Qwen3-30B-A3B-Instruct-2507
- https://huggingface.co/bartowski/Qwen_Qwen3-30B-A3B-Instruct-2507-GGUF
description: |
We introduce the updated version of the Qwen3-30B-A3B non-thinking mode, named Qwen3-30B-A3B-Instruct-2507, featuring the following key enhancements:
Significant improvements in general capabilities, including instruction following, logical reasoning, text comprehension, mathematics, science, coding and tool usage.
Substantial gains in long-tail knowledge coverage across multiple languages.
Markedly better alignment with user preferences in subjective and open-ended tasks, enabling more helpful responses and higher-quality text generation.
Enhanced capabilities in 256K long-context understanding.
overrides:
parameters:
model: Qwen_Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf
files:
- filename: Qwen_Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf
sha256: 382b4f5a164d200f93790ee0e339fae12852896d23485cfb203ce868fea33a95
uri: huggingface://bartowski/Qwen_Qwen3-30B-A3B-Instruct-2507-GGUF/Qwen_Qwen3-30B-A3B-Instruct-2507-Q4_K_M.gguf
- !!merge <<: *qwen3
name: "qwen_qwen3-30b-a3b-thinking-2507"
urls:
- https://huggingface.co/Qwen/Qwen3-30B-A3B-Thinking-2507
- https://huggingface.co/bartowski/Qwen_Qwen3-30B-A3B-Thinking-2507-GGUF
description: |
Over the past three months, we have continued to scale the thinking capability of Qwen3-30B-A3B, improving both the quality and depth of reasoning. We are pleased to introduce Qwen3-30B-A3B-Thinking-2507, featuring the following key enhancements:
Significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, coding, and academic benchmarks that typically require human expertise.
Markedly better general capabilities, such as instruction following, tool usage, text generation, and alignment with human preferences.
Enhanced 256K long-context understanding capabilities.
NOTE: This version has an increased thinking length. We strongly recommend its use in highly complex reasoning tasks.
overrides:
parameters:
model: Qwen_Qwen3-30B-A3B-Thinking-2507-Q4_K_M.gguf
files:
- filename: Qwen_Qwen3-30B-A3B-Thinking-2507-Q4_K_M.gguf
sha256: 1359aa08e2f2dfe7ce4b5ff88c4c996e6494c9d916b1ebacd214bb74bbd5a9db
uri: huggingface://bartowski/Qwen_Qwen3-30B-A3B-Thinking-2507-GGUF/Qwen_Qwen3-30B-A3B-Thinking-2507-Q4_K_M.gguf
- &gemma3
url: "github:mudler/LocalAI/gallery/gemma.yaml@master"
name: "gemma-3-27b-it"
@@ -19079,6 +19144,148 @@
overrides:
parameters:
model: SicariusSicariiStuff/flux.1dev-abliteratedv2
- name: flux.1-kontext-dev
license: flux-1-dev-non-commercial-license
url: "github:mudler/LocalAI/gallery/flux-ggml.yaml@master"
icon: https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/media/main/teaser.png
description: |
FLUX.1 Kontext [dev] is a 12 billion parameter rectified flow transformer capable of editing images based on text instructions. For more information, please read our blog post and our technical report. You can find information about the [pro] version in here.
Key Features
Change existing images based on an edit instruction.
Have character, style and object reference without any finetuning.
Robust consistency allows users to refine an image through multiple successive edits with minimal visual drift.
Trained using guidance distillation, making FLUX.1 Kontext [dev] more efficient.
Open weights to drive new scientific research, and empower artists to develop innovative workflows.
Generated outputs can be used for personal, scientific, and commercial purposes, as described in the FLUX.1 [dev] Non-Commercial License.
urls:
- https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev
- https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF
tags:
- image-to-image
- flux
- gpu
- cpu
overrides:
parameters:
model: flux1-kontext-dev-Q8_0.gguf
files:
- filename: "flux1-kontext-dev-Q8_0.gguf"
sha256: "ff2ff71c3755c8ab394398a412252c23382a83138b65190b16e736d457b80f73"
uri: "huggingface://QuantStack/FLUX.1-Kontext-dev-GGUF/flux1-kontext-dev-Q8_0.gguf"
- filename: ae.safetensors
sha256: afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
uri: https://huggingface.co/ChuckMcSneed/FLUX.1-dev/resolve/main/ae.safetensors
- filename: clip_l.safetensors
sha256: 660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
- filename: t5xxl_fp16.safetensors
sha256: 6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors
- !!merge <<: *flux
name: flux.1-dev-ggml-q8_0
license: flux-1-dev-non-commercial-license
url: "github:mudler/LocalAI/gallery/flux-ggml.yaml@master"
urls:
- https://huggingface.co/black-forest-labs/FLUX.1-dev
- https://huggingface.co/city96/FLUX.1-dev-gguf
overrides:
parameters:
model: flux1-dev-Q8_0.gguf
files:
- filename: "flux1-dev-Q8_0.gguf"
sha256: "129032f32224bf7138f16e18673d8008ba5f84c1ec74063bf4511a8bb4cf553d"
uri: "huggingface://city96/FLUX.1-dev-gguf/flux1-dev-Q8_0.gguf"
- filename: ae.safetensors
sha256: afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
uri: https://huggingface.co/ChuckMcSneed/FLUX.1-dev/resolve/main/ae.safetensors
- filename: clip_l.safetensors
sha256: 660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
- filename: t5xxl_fp16.safetensors
sha256: 6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors
- !!merge <<: *flux
name: flux.1-dev-ggml-abliterated-v2-q8_0
url: "github:mudler/LocalAI/gallery/flux-ggml.yaml@master"
description: |
This is an abliterated version of FLUX.1 [dev].
urls:
- https://huggingface.co/black-forest-labs/FLUX.1-dev
- https://huggingface.co/t8star/flux.1-dev-abliterated-V2-GGUF
overrides:
parameters:
model: T8-flux.1-dev-abliterated-V2-GGUF-Q8_0.gguf
files:
- filename: "T8-flux.1-dev-abliterated-V2-GGUF-Q8_0.gguf"
sha256: "aba8163ff644018da195212a1c33aeddbf802a0c2bba96abc584a2d0b6b42272"
uri: "huggingface://t8star/flux.1-dev-abliterated-V2-GGUF/T8-flux.1-dev-abliterated-V2-GGUF-Q8_0.gguf"
- filename: ae.safetensors
sha256: afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
uri: https://huggingface.co/ChuckMcSneed/FLUX.1-dev/resolve/main/ae.safetensors
- filename: clip_l.safetensors
sha256: 660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
- filename: t5xxl_fp16.safetensors
sha256: 6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors
- !!merge <<: *flux
name: flux.1-krea-dev-ggml
url: "github:mudler/LocalAI/gallery/flux-ggml.yaml@master"
description: |
FLUX.1 Krea [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post and Krea's blog post.
Cutting-edge output quality, with a focus on aesthetic photography.
Competitive prompt following, matching the performance of closed source alternatives.
Trained using guidance distillation, making FLUX.1 Krea [dev] more efficient.
Open weights to drive new scientific research, and empower artists to develop innovative workflows.
Generated outputs can be used for personal, scientific, and commercial purposes, as described in the flux-1-dev-non-commercial-license.
urls:
- https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev
- https://huggingface.co/QuantStack/FLUX.1-Krea-dev-GGUF
overrides:
parameters:
model: flux1-krea-dev-Q4_K_M.gguf
files:
- filename: "flux1-krea-dev-Q4_K_M.gguf"
sha256: "cf199b88509be2b3476a3372ff03eaaa662cb2b5d3710abf939ebb4838dbdcaf"
uri: "huggingface://QuantStack/FLUX.1-Krea-dev-GGUF/flux1-krea-dev-Q4_K_M.gguf"
- filename: ae.safetensors
sha256: afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
uri: https://huggingface.co/ChuckMcSneed/FLUX.1-dev/resolve/main/ae.safetensors
- filename: clip_l.safetensors
sha256: 660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
- filename: t5xxl_fp16.safetensors
sha256: 6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors
- !!merge <<: *flux
name: flux.1-krea-dev-ggml-q8_0
url: "github:mudler/LocalAI/gallery/flux-ggml.yaml@master"
description: |
FLUX.1 Krea [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. For more information, please read our blog post and Krea's blog post.
Cutting-edge output quality, with a focus on aesthetic photography.
Competitive prompt following, matching the performance of closed source alternatives.
Trained using guidance distillation, making FLUX.1 Krea [dev] more efficient.
Open weights to drive new scientific research, and empower artists to develop innovative workflows.
Generated outputs can be used for personal, scientific, and commercial purposes, as described in the flux-1-dev-non-commercial-license.
urls:
- https://huggingface.co/black-forest-labs/FLUX.1-Krea-dev
- https://huggingface.co/markury/FLUX.1-Krea-dev-gguf
overrides:
parameters:
model: flux1-krea-dev-Q8_0.gguf
files:
- filename: "flux1-krea-dev-Q8_0.gguf"
sha256: "0d085b1e3ae0b90e5dbf74da049a80a565617de622a147d28ee37a07761fbd90"
uri: "huggingface://markury/FLUX.1-Krea-dev-gguf/flux1-krea-dev-Q8_0.gguf"
- filename: ae.safetensors
sha256: afc8e28272cd15db3919bacdb6918ce9c1ed22e96cb12c4d5ed0fba823529e38
uri: https://huggingface.co/ChuckMcSneed/FLUX.1-dev/resolve/main/ae.safetensors
- filename: clip_l.safetensors
sha256: 660c6f5b1abae9dc498ac2d21e1347d2abdb0cf6c0c0c8576cd796491d9a6cdd
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/clip_l.safetensors
- filename: t5xxl_fp16.safetensors
sha256: 6e480b09fae049a72d2a8c5fbccb8d3e92febeb233bbe9dfe7256958a9167635
uri: https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors
- &whisper
url: "github:mudler/LocalAI/gallery/whisper-base.yaml@master" ## Whisper
name: "whisper-1"

go.mod

@@ -37,6 +37,7 @@ require (
github.com/nikolalohinski/gonja/v2 v2.3.2
github.com/onsi/ginkgo/v2 v2.22.2
github.com/onsi/gomega v1.36.2
github.com/otiai10/copy v1.14.1
github.com/otiai10/openaigo v1.7.0
github.com/phayes/freeport v0.0.0-20220201140144-74d24b5ae9f5
github.com/prometheus/client_golang v1.21.0
@@ -65,15 +66,12 @@ require (
require (
github.com/containerd/platforms v0.2.1 // indirect
github.com/cpuguy83/dockercfg v0.3.2 // indirect
github.com/cpuguy83/go-md2man/v2 v2.0.5 // indirect
github.com/distribution/reference v0.6.0 // indirect
github.com/dustin/go-humanize v1.0.1 // indirect
github.com/fasthttp/websocket v1.5.8 // indirect
github.com/felixge/httpsnoop v1.0.4 // indirect
github.com/go-task/slim-sprig/v3 v3.0.0 // indirect
github.com/json-iterator/go v1.1.12 // indirect
github.com/labstack/echo/v4 v4.13.3 // indirect
github.com/labstack/gommon v0.4.2 // indirect
github.com/libp2p/go-yamux/v5 v5.0.0 // indirect
github.com/magiconair/properties v1.8.7 // indirect
github.com/moby/docker-image-spec v1.3.1 // indirect
@@ -83,14 +81,13 @@ require (
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/morikuni/aec v1.0.0 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/otiai10/mint v1.6.3 // indirect
github.com/pion/datachannel v1.5.10 // indirect
github.com/pion/dtls/v2 v2.2.12 // indirect
github.com/pion/dtls/v3 v3.0.4 // indirect
github.com/pion/ice/v2 v2.3.37 // indirect
github.com/pion/ice/v4 v4.0.6 // indirect
github.com/pion/interceptor v0.1.37 // indirect
github.com/pion/logging v0.2.3 // indirect
github.com/pion/mdns v0.0.12 // indirect
github.com/pion/mdns/v2 v2.0.7 // indirect
github.com/pion/randutil v0.1.0 // indirect
github.com/pion/rtcp v1.2.15 // indirect
@@ -102,17 +99,12 @@ require (
github.com/pion/stun/v3 v3.0.0 // indirect
github.com/pion/transport/v2 v2.2.10 // indirect
github.com/pion/transport/v3 v3.0.7 // indirect
github.com/pion/turn/v2 v2.1.6 // indirect
github.com/pion/turn/v4 v4.0.0 // indirect
github.com/pion/webrtc/v4 v4.0.9 // indirect
github.com/rs/dnscache v0.0.0-20230804202142-fc85eb664529 // indirect
github.com/russross/blackfriday/v2 v2.1.0 // indirect
github.com/savsgio/gotils v0.0.0-20240303185622-093b76447511 // indirect
github.com/shirou/gopsutil/v4 v4.24.7 // indirect
github.com/urfave/cli/v2 v2.27.5 // indirect
github.com/valyala/fasttemplate v1.2.2 // indirect
github.com/wlynxg/anet v0.0.5 // indirect
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 // indirect
go.opentelemetry.io/auto/sdk v1.1.0 // indirect
go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp v0.56.0 // indirect
go.uber.org/mock v0.5.0 // indirect
@@ -213,7 +205,6 @@ require (
github.com/libp2p/go-nat v0.2.0 // indirect
github.com/libp2p/go-netroute v0.2.2 // indirect
github.com/libp2p/go-reuseport v0.4.0 // indirect
github.com/libp2p/go-yamux/v4 v4.0.2 // indirect
github.com/libp2p/zeroconf/v2 v2.2.0 // indirect
github.com/lucasb-eyer/go-colorful v1.2.0 // indirect
github.com/lufia/plan9stats v0.0.0-20240819163618-b1d8f4d146e7 // indirect

go.sum

@@ -94,8 +94,6 @@ github.com/cpuguy83/dockercfg v0.3.2/go.mod h1:sugsbF4//dDlL/i+S+rtpIWp+5h0BHJHf
github.com/cpuguy83/go-md2man/v2 v2.0.0-20190314233015-f79a8a8ca69d/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.0/go.mod h1:maD7wRr/U5Z6m/iR4s+kqSMx2CaBsrgA7czyZG/E6dU=
github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/cpuguy83/go-md2man/v2 v2.0.5 h1:ZtcqGrnekaHpVLArFSe4HK5DoKx1T0rq2DwVB0alcyc=
github.com/cpuguy83/go-md2man/v2 v2.0.5/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/creachadair/mds v0.21.3 h1:RRgEAPIb52cU0q7UxGyN+13QlCVTZIL4slRr0cYYQfA=
github.com/creachadair/mds v0.21.3/go.mod h1:1ltMWZd9yXhaHEoZwBialMaviWVUpRPvMwVP7saFAzM=
github.com/creachadair/otp v0.5.0 h1:q3Th7CXm2zlmCdBjw5tEPFOj4oWJMnVL5HXlq0sNKS0=
@@ -109,11 +107,8 @@ github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davidlazar/go-crypto v0.0.0-20200604182044-b73af7476f6c h1:pFUpOrbxDR6AkioZ1ySsx5yxlDQZ8stG2b88gTPxgJU=
github.com/davidlazar/go-crypto v0.0.0-20200604182044-b73af7476f6c/go.mod h1:6UhI8N9EjYm1c2odKpFpAYeR8dsBeM7PtzQhRgxRr9U=
github.com/decred/dcrd/crypto/blake256 v1.0.1 h1:7PltbUIQB7u/FfZ39+DGa/ShuMyJ5ilcvdfma9wOH6Y=
github.com/decred/dcrd/crypto/blake256 v1.0.1/go.mod h1:2OfgNZ5wDpcsFmHmCK5gZTPcCXqlm2ArzUIkw9czNJo=
github.com/decred/dcrd/crypto/blake256 v1.1.0 h1:zPMNGQCm0g4QTY27fOCorQW7EryeQ/U0x++OzVrdms8=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.3.0 h1:rpfIENRNNilwHwZeG5+P150SMrnNEcHYvcCuK6dPZSg=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.3.0/go.mod h1:v57UDF4pDQJcEfFUCRop3lJL149eHGSe9Jvczhzjo/0=
github.com/decred/dcrd/crypto/blake256 v1.1.0/go.mod h1:2OfgNZ5wDpcsFmHmCK5gZTPcCXqlm2ArzUIkw9czNJo=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0 h1:NMZiJj8QnKe1LgsbDayM4UoHwbvwDRwnI3hwNaAHRnc=
github.com/decred/dcrd/dcrec/secp256k1/v4 v4.4.0/go.mod h1:ZXNYxsqcloTdSy/rNShjYzMhyjf0LaoftYK0p+A3h40=
github.com/distribution/reference v0.6.0 h1:0IXCQ5g4/QMHHkarYzh5l+u8T3t73zM5QvfrDyIgxBk=
@@ -264,7 +259,6 @@ github.com/google/pprof v0.0.0-20250208200701-d0013a598941 h1:43XjGa6toxLpeksjcx
github.com/google/pprof v0.0.0-20250208200701-d0013a598941/go.mod h1:vavhavw2zAxS5dIdcRluK6cSGGPlZynqzFM8NdvU144=
github.com/google/renameio v0.1.0/go.mod h1:KWCgfxg9yswjAJkECMjeO8J8rahYeXnNhOm40UhjYkI=
github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.3.1/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/googleapis/gax-go v2.0.0+incompatible/go.mod h1:SFVmujtThgffbyetf+mdk2eWhX2bMyUtNHzFKcPA9HY=
@@ -308,8 +302,6 @@ github.com/ipfs/go-block-format v0.2.0 h1:ZqrkxBA2ICbDRbK8KJs/u0O3dlp6gmAuuXUJNi
github.com/ipfs/go-block-format v0.2.0/go.mod h1:+jpL11nFx5A/SPpsoBn6Bzkra/zaArfSmsknbPMYgzM=
github.com/ipfs/go-cid v0.5.0 h1:goEKKhaGm0ul11IHA7I6p1GmKz8kEYniqFopaB5Otwg=
github.com/ipfs/go-cid v0.5.0/go.mod h1:0L7vmeNXpQpUS9vt+yEARkJ8rOg43DF3iPgn4GIN0mk=
github.com/ipfs/go-datastore v0.6.0 h1:JKyz+Gvz1QEZw0LsX1IBn+JFCJQH4SJVFtM4uWU0Myk=
github.com/ipfs/go-datastore v0.6.0/go.mod h1:rt5M3nNbSO/8q1t4LNkLyUwRs8HupMeN/8O4Vn9YAT8=
github.com/ipfs/go-datastore v0.7.0 h1:a6JMuRFKYhw6XXmIVoTthF8ZFm4QQXvLDXFhXRVv8Go=
github.com/ipfs/go-datastore v0.7.0/go.mod h1:ucOWMfbOPI6ZEyaIB1q/+78RPLBPERfuUVYX1EPnNpQ=
github.com/ipfs/go-detect-race v0.0.1 h1:qX/xay2W3E4Q1U7d9lNs1sU9nvguX0a7319XbyQ6cOk=
@@ -374,24 +366,16 @@ github.com/kr/text v0.2.0 h1:5Nx0Ya0ZqY2ygV366QzturHI13Jq95ApcVaJBhpS+AY=
github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/kylelemons/godebug v1.1.0 h1:RPNrshWIDI6G2gRW9EHilWtl7Z6Sb1BR0xunSBf0SNc=
github.com/kylelemons/godebug v1.1.0/go.mod h1:9/0rRGxNHcop5bhtWyNeEfOS8JIWk580+fNqagV/RAw=
github.com/labstack/echo/v4 v4.13.3 h1:pwhpCPrTl5qry5HRdM5FwdXnhXSLSY+WE+YQSeCaafY=
github.com/labstack/echo/v4 v4.13.3/go.mod h1:o90YNEeQWjDozo584l7AwhJMHN0bOC4tAfg+Xox9q5g=
github.com/labstack/gommon v0.4.2 h1:F8qTUNXgG1+6WQmqoUWnz8WiEU60mXVVw0P4ht1WRA0=
github.com/labstack/gommon v0.4.2/go.mod h1:QlUFxVM+SNXhDL/Z7YhocGIBYOiwB0mXm1+1bAPHPyU=
github.com/libp2p/go-buffer-pool v0.1.0 h1:oK4mSFcQz7cTQIfqbe4MIj9gLW+mnanjyFtc6cdF0Y8=
github.com/libp2p/go-buffer-pool v0.1.0/go.mod h1:N+vh8gMqimBzdKkSMVuydVDq+UV5QTWy5HSiZacSbPg=
github.com/libp2p/go-cidranger v1.1.0 h1:ewPN8EZ0dd1LSnrtuwd4709PXVcITVeuwbag38yPW7c=
github.com/libp2p/go-cidranger v1.1.0/go.mod h1:KWZTfSr+r9qEo9OkI9/SIEeAtw+NNoU0dXIXt15Okic=
github.com/libp2p/go-flow-metrics v0.2.0 h1:EIZzjmeOE6c8Dav0sNv35vhZxATIXWZg6j/C08XmmDw=
github.com/libp2p/go-flow-metrics v0.2.0/go.mod h1:st3qqfu8+pMfh+9Mzqb2GTiwrAGjIPszEjZmtksN8Jc=
github.com/libp2p/go-libp2p v0.39.1 h1:1Ur6rPCf3GR+g8jkrnaQaM0ha2IGespsnNlCqJLLALE=
github.com/libp2p/go-libp2p v0.39.1/go.mod h1:3zicI8Lp7Isun+Afo/JOACUbbJqqR2owK6RQWFsVAbI=
github.com/libp2p/go-libp2p v0.40.0 h1:1LOMO3gigxeXFs50HGEc1U79OINewUQB7o4gTKGPC3U=
github.com/libp2p/go-libp2p v0.40.0/go.mod h1:hOzj2EAIYsXpVpBnyA1pRHzpUJGF9nbWiDLjgasnbF0=
github.com/libp2p/go-libp2p-asn-util v0.4.1 h1:xqL7++IKD9TBFMgnLPZR6/6iYhawHKHl950SO9L6n94=
github.com/libp2p/go-libp2p-asn-util v0.4.1/go.mod h1:d/NI6XZ9qxw67b4e+NgpQexCIiFYJjErASrYW4PFDN8=
github.com/libp2p/go-libp2p-kad-dht v0.29.0 h1:045eW21lGlMSD9aKSZZGH4fnBMIInPwQLxIQ35P962I=
github.com/libp2p/go-libp2p-kad-dht v0.29.0/go.mod h1:mIci3rHSwDsxQWcCjfmxD8vMTgh5xLuvwb1D5WP8ZNk=
github.com/libp2p/go-libp2p-kad-dht v0.29.1 h1:RyD1RnnkXOh1gwBCrMQ6ZVfTJECY5yDOY6qxt9VNqE4=
github.com/libp2p/go-libp2p-kad-dht v0.29.1/go.mod h1:tZEFTKWCsY0xngypKyAIwNDNZOBiikSUIgd/BjTF5Ms=
github.com/libp2p/go-libp2p-kbucket v0.6.5 h1:Fsl1YvZcMwqrR4DYrTO02yo9PGYs2HBQIT3lGXFMTxg=
@@ -412,8 +396,6 @@ github.com/libp2p/go-netroute v0.2.2 h1:Dejd8cQ47Qx2kRABg6lPwknU7+nBnFRpko45/fFP
github.com/libp2p/go-netroute v0.2.2/go.mod h1:Rntq6jUAH0l9Gg17w5bFGhcC9a+vk4KNXs6s7IljKYE=
github.com/libp2p/go-reuseport v0.4.0 h1:nR5KU7hD0WxXCJbmw7r2rhRYruNRl2koHw8fQscQm2s=
github.com/libp2p/go-reuseport v0.4.0/go.mod h1:ZtI03j/wO5hZVDFo2jKywN6bYKWLOy8Se6DrI2E1cLU=
github.com/libp2p/go-yamux/v4 v4.0.2 h1:nrLh89LN/LEiqcFiqdKDRHjGstN300C1269K/EX0CPU=
github.com/libp2p/go-yamux/v4 v4.0.2/go.mod h1:C808cCRgOs1iBwY4S71T5oxgMxgLmqUw56qh4AeBW2o=
github.com/libp2p/go-yamux/v5 v5.0.0 h1:2djUh96d3Jiac/JpGkKs4TO49YhsfLopAoryfPmf+Po=
github.com/libp2p/go-yamux/v5 v5.0.0/go.mod h1:en+3cdX51U0ZslwRdRLrvQsdayFt3TSUKvBGErzpWbU=
github.com/libp2p/zeroconf/v2 v2.2.0 h1:Cup06Jv6u81HLhIj1KasuNM/RHHrJ8T7wOTS4+Tv53Q=
@@ -492,8 +474,6 @@ github.com/morikuni/aec v1.0.0/go.mod h1:BbKIizmSmc5MMPqRYbxO4ZU0S0+P200+tUnFx7P
github.com/mr-tron/base58 v1.1.2/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc=
github.com/mr-tron/base58 v1.2.0 h1:T/HDJBh4ZCPbU39/+c3rRvE0uKBQlU27+QI8LJ4t64o=
github.com/mr-tron/base58 v1.2.0/go.mod h1:BinMc/sQntlIE1frQmRFPUoPA1Zkr8VRgBdjWI2mNwc=
github.com/mudler/edgevpn v0.30.1 h1:4yyhNFJX62NpRp50sxiyZE5E/sdAqEZX+aE5Mv7QS60=
github.com/mudler/edgevpn v0.30.1/go.mod h1:IAJkkJ0oH3rwsSGOGTFT4UBYFqYuD/QyaKzTLB3P/eU=
github.com/mudler/edgevpn v0.30.2 h1:3cD0UM8BHM8tQ1v3WIZOyzmktgZbKPAQQDH3KoH15rs=
github.com/mudler/edgevpn v0.30.2/go.mod h1:bGUdGQzwLOuMs3SII1N6SazoI1qQ1ekxdxNatOCS5ZM=
github.com/mudler/go-piper v0.0.0-20241023091659-2494246fd9fc h1:RxwneJl1VgvikiX28EkpdAyL4yQVnJMrbquKospjHyA=
@@ -556,8 +536,10 @@ github.com/opencontainers/runtime-spec v1.2.0/go.mod h1:jwyrGlmzljRJv/Fgzds9SsS/
github.com/opentracing/opentracing-go v1.2.0 h1:uEJPy/1a5RIPAJ0Ov+OIO8OxWu77jEv+1B0VhjKrZUs=
github.com/opentracing/opentracing-go v1.2.0/go.mod h1:GxEUsuufX4nBwe+T+Wl9TAgYrxe9dPLANfrWvHYVTgc=
github.com/openzipkin/zipkin-go v0.1.1/go.mod h1:NtoC/o8u3JlF1lSlyPNswIbeQH9bJTmOf0Erfk+hxe8=
github.com/otiai10/mint v1.6.1 h1:kgbTJmOpp/0ce7hk3H8jiSuR0MXmpwWRfqUdKww17qg=
github.com/otiai10/mint v1.6.1/go.mod h1:MJm72SBthJjz8qhefc4z1PYEieWmy8Bku7CjcAqyUSM=
github.com/otiai10/copy v1.14.1 h1:5/7E6qsUMBaH5AnQ0sSLzzTg1oTECmcCmT6lvF45Na8=
github.com/otiai10/copy v1.14.1/go.mod h1:oQwrEDDOci3IM8dJF0d8+jnbfPDllW6vUjNc3DoZm9I=
github.com/otiai10/mint v1.6.3 h1:87qsV/aw1F5as1eH1zS/yqHY85ANKVMgkDrf9rcxbQs=
github.com/otiai10/mint v1.6.3/go.mod h1:MJm72SBthJjz8qhefc4z1PYEieWmy8Bku7CjcAqyUSM=
github.com/otiai10/openaigo v1.7.0 h1:AOQcOjRRM57ABvz+aI2oJA/Qsz1AydKbdZAlGiKyCqg=
github.com/otiai10/openaigo v1.7.0/go.mod h1:kIaXc3V+Xy5JLplcBxehVyGYDtufHp3PFPy04jOwOAI=
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 h1:onHthvaw9LFnH4t2DcNVpwGmV9E1BkGknEliJkfwQj0=
@@ -577,8 +559,6 @@ github.com/pion/dtls/v2 v2.2.12 h1:KP7H5/c1EiVAAKUmXyCzPiQe5+bCJrpOeKg/L05dunk=
github.com/pion/dtls/v2 v2.2.12/go.mod h1:d9SYc9fch0CqK90mRk1dC7AkzzpwJj6u2GU3u+9pqFE=
github.com/pion/dtls/v3 v3.0.4 h1:44CZekewMzfrn9pmGrj5BNnTMDCFwr+6sLH+cCuLM7U=
github.com/pion/dtls/v3 v3.0.4/go.mod h1:R373CsjxWqNPf6MEkfdy3aSe9niZvL/JaKlGeFphtMg=
github.com/pion/ice/v2 v2.3.37 h1:ObIdaNDu1rCo7hObhs34YSBcO7fjslJMZV0ux+uZWh0=
github.com/pion/ice/v2 v2.3.37/go.mod h1:mBF7lnigdqgtB+YHkaY/Y6s6tsyRyo4u4rPGRuOjUBQ=
github.com/pion/ice/v4 v4.0.6 h1:jmM9HwI9lfetQV/39uD0nY4y++XZNPhvzIPCb8EwxUM=
github.com/pion/ice/v4 v4.0.6/go.mod h1:y3M18aPhIxLlcO/4dn9X8LzLLSma84cx6emMSu14FGw=
github.com/pion/interceptor v0.1.37 h1:aRA8Zpab/wE7/c0O3fh1PqY0AJI3fCSEM5lRWJVorwI=
@@ -586,8 +566,6 @@ github.com/pion/interceptor v0.1.37/go.mod h1:JzxbJ4umVTlZAf+/utHzNesY8tmRkM2lVm
github.com/pion/logging v0.2.2/go.mod h1:k0/tDVsRCX2Mb2ZEmTqNa7CWsQPc+YYCB7Q+5pahoms=
github.com/pion/logging v0.2.3 h1:gHuf0zpoh1GW67Nr6Gj4cv5Z9ZscU7g/EaoC/Ke/igI=
github.com/pion/logging v0.2.3/go.mod h1:z8YfknkquMe1csOrxK5kc+5/ZPAzMxbKLX5aXpbpC90=
github.com/pion/mdns v0.0.12 h1:CiMYlY+O0azojWDmxdNr7ADGrnZ+V6Ilfner+6mSVK8=
github.com/pion/mdns v0.0.12/go.mod h1:VExJjv8to/6Wqm1FXK+Ii/Z9tsVk/F5sD/N70cnYFbk=
github.com/pion/mdns/v2 v2.0.7 h1:c9kM8ewCgjslaAmicYMFQIde2H9/lrZpjBkN8VwoVtM=
github.com/pion/mdns/v2 v2.0.7/go.mod h1:vAdSYNAT0Jy3Ru0zl2YiW3Rm/fJCwIeM0nToenfOJKA=
github.com/pion/randutil v0.1.0 h1:CFG1UdESneORglEsnimhUjf33Rwjubwj6xfiOXBa3mA=
@@ -610,12 +588,8 @@ github.com/pion/transport/v2 v2.2.1/go.mod h1:cXXWavvCnFF6McHTft3DWS9iic2Mftcz1A
github.com/pion/transport/v2 v2.2.4/go.mod h1:q2U/tf9FEfnSBGSW6w5Qp5PFWRLRj3NjLhCCgpRK4p0=
github.com/pion/transport/v2 v2.2.10 h1:ucLBLE8nuxiHfvkFKnkDQRYWYfp8ejf4YBOPfaQpw6Q=
github.com/pion/transport/v2 v2.2.10/go.mod h1:sq1kSLWs+cHW9E+2fJP95QudkzbK7wscs8yYgQToO5E=
github.com/pion/transport/v3 v3.0.1/go.mod h1:UY7kiITrlMv7/IKgd5eTUcaahZx5oUN3l9SzK5f5xE0=
github.com/pion/transport/v3 v3.0.7 h1:iRbMH05BzSNwhILHoBoAPxoB9xQgOaJk+591KC9P1o0=
github.com/pion/transport/v3 v3.0.7/go.mod h1:YleKiTZ4vqNxVwh77Z0zytYi7rXHl7j6uPLGhhz9rwo=
github.com/pion/turn/v2 v2.1.3/go.mod h1:huEpByKKHix2/b9kmTAM3YoX6MKP+/D//0ClgUYR2fY=
github.com/pion/turn/v2 v2.1.6 h1:Xr2niVsiPTB0FPtt+yAWKFUkU1eotQbGgpTIld4x1Gc=
github.com/pion/turn/v2 v2.1.6/go.mod h1:huEpByKKHix2/b9kmTAM3YoX6MKP+/D//0ClgUYR2fY=
github.com/pion/turn/v4 v4.0.0 h1:qxplo3Rxa9Yg1xXDxxH8xaqcyGUtbHYw4QSCvmFWvhM=
github.com/pion/turn/v4 v4.0.0/go.mod h1:MuPDkm15nYSklKpN8vWJ9W2M0PlyQZqYt1McGuxG7mA=
github.com/pion/webrtc/v4 v4.0.9 h1:PyOYMRKJgfy0dzPcYtFD/4oW9zaw3Ze3oZzzbj2LV9E=
@@ -632,8 +606,6 @@ github.com/polydawn/refmt v0.89.0/go.mod h1:/zvteZs/GwLtCgZ4BL6CBsk9IKIlexP43ObX
github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55 h1:o4JXh1EVt9k/+g42oCprj/FisM4qX9L3sZB3upGN2ZU=
github.com/power-devops/perfstat v0.0.0-20240221224432-82ca36839d55/go.mod h1:OmDBASR4679mdNQnz2pUhc2G8CO2JrUAVFDRBDP/hJE=
github.com/prometheus/client_golang v0.8.0/go.mod h1:7SWBe2y4D6OKWSNQJUaRYU/AaXPKyh/dDVn+NZz0KFw=
github.com/prometheus/client_golang v1.20.5 h1:cxppBPuYhUnsO6yo/aoRol4L7q7UFfdm+bR9r+8l63Y=
github.com/prometheus/client_golang v1.20.5/go.mod h1:PIEt8X02hGcP8JWbeHyeZ53Y/jReSnHgO035n//V5WE=
github.com/prometheus/client_golang v1.21.0 h1:DIsaGmiaBkSangBgMtWdNfxbMNdku5IK6iNhrEqWvdA=
github.com/prometheus/client_golang v1.21.0/go.mod h1:U9NM32ykUErtVBxdvD3zfi+EuFkkaBvMb09mIfe0Zgg=
github.com/prometheus/client_model v0.0.0-20180712105110-5c3871d89910/go.mod h1:MbSGuTsp3dbXC40dX6PRTWyKYBIrTGTE9sqQNg2J8bo=
@@ -670,7 +642,6 @@ github.com/russross/blackfriday v1.5.2/go.mod h1:JO/DiYxRf+HjHt06OyowR9PTA263kcR
github.com/russross/blackfriday v1.6.0 h1:KqfZb0pUVN2lYqZUYRddxF4OR8ZMURnJIG5Y3VRLtww=
github.com/russross/blackfriday v1.6.0/go.mod h1:ti0ldHuxg49ri4ksnFxlkCfN+hvslNlmVHqNRXXJNAY=
github.com/russross/blackfriday/v2 v2.0.1/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/russross/blackfriday/v2 v2.1.0 h1:JIOH55/0cWyOuilr9/qlrm0BSXldqnqwMsf35Ld67mk=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/sashabaranov/go-openai v1.26.2 h1:cVlQa3gn3eYqNXRW03pPlpy6zLG52EU4g0FrWXc0EFI=
github.com/sashabaranov/go-openai v1.26.2/go.mod h1:lj5b/K+zjTSFxVLijLSTDZuP7adOgerWeFyZLUhAKRg=
@@ -769,16 +740,11 @@ github.com/ulikunitz/xz v0.5.9 h1:RsKRIA2MO8x56wkkcd3LbtcE/uMszhb6DpRf+3uwa3I=
github.com/ulikunitz/xz v0.5.9/go.mod h1:nbz6k7qbPmH4IRqmfOplQw/tblSgqTqBwxkY0oWt/14=
github.com/urfave/cli v1.22.2/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/urfave/cli v1.22.10/go.mod h1:Gos4lmkARVdJ6EkW0WaNv/tZAAMe9V7XWyB60NtXRu0=
github.com/urfave/cli v1.22.12 h1:igJgVw1JdKH+trcLWLeLwZjU9fEfPesQ+9/e4MQ44S8=
github.com/urfave/cli v1.22.12/go.mod h1:sSBEIC79qR6OvcmsD4U3KABeOTxDqQtdDnaFuUN30b8=
github.com/urfave/cli/v2 v2.27.5 h1:WoHEJLdsXr6dDWoJgMq/CboDmyY/8HMMH1fTECbih+w=
github.com/urfave/cli/v2 v2.27.5/go.mod h1:3Sevf16NykTbInEnD0yKkjDAeZDS0A6bzhBH5hrMvTQ=
github.com/valyala/bytebufferpool v1.0.0 h1:GqA5TC/0021Y/b9FG4Oi9Mr3q7XYx6KllzawFIhcdPw=
github.com/valyala/bytebufferpool v1.0.0/go.mod h1:6bBcMArwyJ5K/AmCkWv1jt77kVWyCJ6HpOuEn7z0Csc=
github.com/valyala/fasthttp v1.55.0 h1:Zkefzgt6a7+bVKHnu/YaYSOPfNYNisSVBo/unVCf8k8=
github.com/valyala/fasthttp v1.55.0/go.mod h1:NkY9JtkrpPKmgwV3HTaS2HWaJss9RSIsRVfcxxoHiOM=
github.com/valyala/fasttemplate v1.2.2 h1:lxLXG0uE3Qnshl9QyaK6XJxMXlQZELvChBOCmQD0Loo=
github.com/valyala/fasttemplate v1.2.2/go.mod h1:KHLXt3tVN2HBp8eijSv/kGJopbvo7S+qRAEEKiv+SiQ=
github.com/valyala/tcplisten v1.0.0 h1:rBHj/Xf+E1tRGZyWIWwJDiRY0zc1Js+CV5DqwacVSA8=
github.com/valyala/tcplisten v1.0.0/go.mod h1:T0xQ8SeCZGxckz9qRXTfG43PvQ/mcWh7FwZEA7Ioqkc=
github.com/vbatts/tar-split v0.11.3 h1:hLFqsOLQ1SsppQNTMpkpPXClLDfC2A3Zgy9OUU+RVck=
@@ -799,8 +765,6 @@ github.com/wlynxg/anet v0.0.5 h1:J3VJGi1gvo0JwZ/P1/Yc/8p63SoW98B5dHkYDmpgvvU=
github.com/wlynxg/anet v0.0.5/go.mod h1:eay5PRQr7fIVAMbTbchTnO9gG65Hg/uYGdc7mguHxoA=
github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8 h1:nIPpBwaJSVYIxUFsDv3M8ofmx9yWTog9BfvIu0q41lo=
github.com/xi2/xz v0.0.0-20171230120015-48954b6210f8/go.mod h1:HUYIGzjTL3rfEspMxjDjgmT5uz5wzYJKVo23qUhYTos=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1 h1:gEOO8jv9F4OT7lGCjxCBTO/36wtF6j2nSip77qHd4x4=
github.com/xrash/smetrics v0.0.0-20240521201337-686a1a2994c1/go.mod h1:Ohn+xnUBiLI6FVj/9LpzZWtj1/D6lUovWYBkxHVV3aM=
github.com/yuin/goldmark v1.1.27/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.2.1/go.mod h1:3hX8gzYuyVAZsxl0MRgGTJEmQBFcNTphYh9decYSb74=
github.com/yuin/goldmark v1.3.5/go.mod h1:mwnBkeHKe2W/ZEtQ+71ViKU8L12m81fl3OWwC1Zlc8k=
@@ -875,8 +839,6 @@ golang.org/x/crypto v0.18.0/go.mod h1:R0j02AL6hcrfOiy9T4ZYp/rcWeMxM3L6QYxlOuEG1m
golang.org/x/crypto v0.33.0 h1:IOBPskki6Lysi0lo9qQvbxiQ+FvsCC/YWOecCHAixus=
golang.org/x/crypto v0.33.0/go.mod h1:bVdXmD7IV/4GdElGPozy6U7lWdRXA4qyRVGJV57uQ5M=
golang.org/x/exp v0.0.0-20190121172915-509febef88a4/go.mod h1:CJ0aWSM057203Lf6IL+f9T1iT9GByDxfZKAQTCR3kQA=
golang.org/x/exp v0.0.0-20250215185904-eff6e970281f h1:oFMYAjX0867ZD2jcNiLBrI9BdpmEkvPyi5YrBGXbamg=
golang.org/x/exp v0.0.0-20250215185904-eff6e970281f/go.mod h1:BHOTPb3L19zxehTsLoJXVaTktb06DFgmdW6Wb9s8jqk=
golang.org/x/exp v0.0.0-20250218142911-aa4b98e5adaa h1:t2QcU6V556bFjYgu4L6C+6VrCPyJZ+eyRsABUPs1mz4=
golang.org/x/exp v0.0.0-20250218142911-aa4b98e5adaa/go.mod h1:BHOTPb3L19zxehTsLoJXVaTktb06DFgmdW6Wb9s8jqk=
golang.org/x/lint v0.0.0-20180702182130-06c8688daad7/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
@@ -973,7 +935,6 @@ golang.org/x/sys v0.5.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.7.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.9.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.10.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.11.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.12.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
@@ -1084,7 +1045,6 @@ google.golang.org/protobuf v1.36.5 h1:tPhr+woSbjfYvY6/GPufUoYizxw1cF/yFoxJ2fmpwl
google.golang.org/protobuf v1.36.5/go.mod h1:9fA7Ob0pmnwhb644+1+CVWFRbNajQ6iRojtC/QF5bRE=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c/go.mod h1:JHkPIbrfpd72SG/EVd6muEfDQjcINNoR0C8j2r3qZ4Q=
gopkg.in/errgo.v2 v2.1.0/go.mod h1:hNsd1EY+bozCKY1Ytp96fpM3vjJbqLJn88ws8XvfDNI=


@@ -144,6 +144,11 @@ func (u URI) LooksLikeHTTPURL() bool {
strings.HasPrefix(string(u), HTTPSPrefix)
}
func (u URI) LooksLikeDir() bool {
f, err := os.Stat(string(u))
return err == nil && f.IsDir()
}
func (s URI) LooksLikeOCI() bool {
return strings.HasPrefix(string(s), "quay.io") ||
strings.HasPrefix(string(s), OCIPrefix) ||


@@ -18,6 +18,13 @@ const (
nvidiaL4T = "nvidia-l4t"
darwinX86 = "darwin-x86"
metal = "metal"
nvidia = "nvidia"
amd = "amd"
intel = "intel"
capabilityEnv = "LOCALAI_FORCE_META_BACKEND_CAPABILITY"
capabilityRunFileEnv = "LOCALAI_FORCE_META_BACKEND_CAPABILITY_RUN_FILE"
defaultRunFile = "/run/localai/capability"
)
func (s *SystemState) Capability(capMap map[string]string) string {
@@ -35,15 +42,16 @@ func (s *SystemState) Capability(capMap map[string]string) string {
}
func (s *SystemState) getSystemCapabilities() string {
if os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY") != "" {
log.Debug().Str("LOCALAI_FORCE_META_BACKEND_CAPABILITY", os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY")).Msg("Using forced capability")
return os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY")
capability := os.Getenv(capabilityEnv)
if capability != "" {
log.Info().Str("capability", capability).Msgf("Using forced capability from environment variable (%s)", capabilityEnv)
return capability
}
capabilityRunFile := "/run/localai/capability"
if os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY_RUN_FILE") != "" {
log.Debug().Str("LOCALAI_FORCE_META_BACKEND_CAPABILITY_RUN_FILE", os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY_RUN_FILE")).Msg("Using forced capability run file")
capabilityRunFile = os.Getenv("LOCALAI_FORCE_META_BACKEND_CAPABILITY_RUN_FILE")
capabilityRunFile := defaultRunFile
capabilityRunFileEnv := os.Getenv(capabilityRunFileEnv)
if capabilityRunFileEnv != "" {
capabilityRunFile = capabilityRunFileEnv
}
// Check if /run/localai/capability exists and use it
@@ -52,37 +60,37 @@ func (s *SystemState) getSystemCapabilities() string {
if _, err := os.Stat(capabilityRunFile); err == nil {
capability, err := os.ReadFile(capabilityRunFile)
if err == nil {
log.Debug().Str("capability", string(capability)).Msg("Using capability from run file")
log.Info().Str("capabilityRunFile", capabilityRunFile).Str("capability", string(capability)).Msgf("Using forced capability run file (%s)", capabilityRunFileEnv)
return strings.Trim(strings.TrimSpace(string(capability)), "\n")
}
}
// If we are on mac and arm64, we will return metal
if runtime.GOOS == "darwin" && runtime.GOARCH == "arm64" {
log.Debug().Msg("Using metal capability")
log.Info().Msgf("Using metal capability (arm64 on mac), set %s to override", capabilityEnv)
return metal
}
// If we are on mac and x86, we will return darwin-x86
if runtime.GOOS == "darwin" && runtime.GOARCH == "amd64" {
log.Debug().Msg("Using darwin-x86 capability")
log.Info().Msgf("Using darwin-x86 capability (amd64 on mac), set %s to override", capabilityEnv)
return darwinX86
}
// If arm64 on linux and a nvidia gpu is detected, we will return nvidia-l4t
if runtime.GOOS == "linux" && runtime.GOARCH == "arm64" {
if s.GPUVendor == "nvidia" {
log.Debug().Msg("Using nvidia-l4t capability")
log.Info().Msgf("Using nvidia-l4t capability (arm64 on linux), set %s to override", capabilityEnv)
return nvidiaL4T
}
}
if s.GPUVendor == "" {
log.Debug().Msg("Using default capability")
log.Info().Msgf("Default capability (no GPU detected), set %s to override", capabilityEnv)
return defaultCapability
}
log.Debug().Str("GPUVendor", s.GPUVendor).Msg("Using GPU vendor capability")
log.Info().Str("Capability", s.GPUVendor).Msgf("Capability automatically detected, set %s to override", capabilityEnv)
return s.GPUVendor
}
@@ -106,18 +114,16 @@ func detectGPUVendor() (string, error) {
if gpu.DeviceInfo.Vendor != nil {
gpuVendorName := strings.ToUpper(gpu.DeviceInfo.Vendor.Name)
if strings.Contains(gpuVendorName, "NVIDIA") {
return "nvidia", nil
return nvidia, nil
}
if strings.Contains(gpuVendorName, "AMD") {
return "amd", nil
return amd, nil
}
if strings.Contains(gpuVendorName, "INTEL") {
return "intel", nil
return intel, nil
}
return "nvidia", nil
}
}
}
return "", nil
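The `detectGPUVendor` change replaces string literals with the new `nvidia`/`amd`/`intel` constants; the matching itself is a case-insensitive substring check with nvidia as the fallback when a GPU is present but its vendor string is unrecognized. The core of that mapping, extracted into a self-contained sketch (the constants follow this diff; the function name is illustrative):

```go
package main

import (
	"fmt"
	"strings"
)

// Vendor constants as introduced in this diff.
const (
	nvidia = "nvidia"
	amd    = "amd"
	intel  = "intel"
)

// vendorFromName maps a raw PCI vendor name to a capability string
// the same way detectGPUVendor does: uppercase, substring match,
// and nvidia as the default for unrecognized GPU vendors.
func vendorFromName(name string) string {
	n := strings.ToUpper(name)
	switch {
	case strings.Contains(n, "NVIDIA"):
		return nvidia
	case strings.Contains(n, "AMD"):
		return amd
	case strings.Contains(n, "INTEL"):
		return intel
	default:
		return nvidia
	}
}

func main() {
	fmt.Println(vendorFromName("Advanced Micro Devices, Inc. [AMD/ATI]")) // amd
	fmt.Println(vendorFromName("Intel Corporation"))                      // intel
	fmt.Println(vendorFromName("Some Unknown GPU Co."))                   // nvidia
}
```

Because the resulting string doubles as the meta-backend capability key (cf. the `sycl` → `intel` rename in #5964), keeping these values as shared constants avoids the silent mismatch a one-off literal would cause.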


@@ -92,6 +92,129 @@ const docTemplate = `{
"responses": {}
}
},
"/backends": {
"get": {
"summary": "List all Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/apply": {
"post": {
"summary": "Install backends to LocalAI.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/localai.GalleryBackend"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/available": {
"get": {
"summary": "List all available Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/delete/{name}": {
"post": {
"summary": "delete backends from LocalAI.",
"parameters": [
{
"type": "string",
"description": "Backend name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/galleries": {
"get": {
"summary": "List all Galleries",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/backends/jobs": {
"get": {
"summary": "Returns all the jobs status progress",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/services.GalleryOpStatus"
}
}
}
}
}
},
"/backends/jobs/{uuid}": {
"get": {
"summary": "Returns the job status",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/services.GalleryOpStatus"
}
}
}
}
},
"/metrics": {
"get": {
"summary": "Prometheus metrics endpoint",
@@ -185,56 +308,6 @@ const docTemplate = `{
}
}
}
},
"post": {
"summary": "Adds a gallery in LocalAI",
"parameters": [
{
"description": "Gallery details",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/config.Gallery"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
},
"delete": {
"summary": "removes a gallery from LocalAI",
"parameters": [
{
"description": "Gallery details",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/config.Gallery"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/models/jobs": {
@@ -328,94 +401,6 @@ const docTemplate = `{
}
}
},
"/v1/assistants": {
"get": {
"summary": "List available assistents",
"parameters": [
{
"type": "integer",
"description": "Limit the number of assistants returned",
"name": "limit",
"in": "query"
},
{
"type": "string",
"description": "Order of assistants returned",
"name": "order",
"in": "query"
},
{
"type": "string",
"description": "Return assistants created after the given ID",
"name": "after",
"in": "query"
},
{
"type": "string",
"description": "Return assistants created before the given ID",
"name": "before",
"in": "query"
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
}
},
"post": {
"summary": "Create an assistant with a model and instructions.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/openai.AssistantRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
}
},
"/v1/assistants/{assistant_id}": {
"get": {
"summary": "Get assistent data",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
},
"delete": {
"summary": "Delete assistents",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.DeleteAssistantResponse"
}
}
}
}
},
"/v1/audio/speech": {
"post": {
"consumes": [
@@ -529,6 +514,30 @@ const docTemplate = `{
}
}
},
"/v1/detection": {
"post": {
"summary": "Detects objects in the input image.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.DetectionRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.DetectionResponse"
}
}
}
}
},
"/v1/edits": {
"post": {
"summary": "OpenAI edit endpoint",
@@ -577,56 +586,6 @@ const docTemplate = `{
}
}
},
"/v1/files": {
"get": {
"summary": "List files.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ListFiles"
}
}
}
}
},
"/v1/files/{file_id}": {
"get": {
"summary": "Returns information about a specific file.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.File"
}
}
}
},
"delete": {
"summary": "Delete a file.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.DeleteStatus"
}
}
}
}
},
"/v1/files/{file_id}/content": {
"get": {
"summary": "Returns information about a specific file.",
"responses": {
"200": {
"description": "file",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/images/generations": {
"post": {
"summary": "Creates an image given a prompt.",
@@ -926,6 +885,75 @@ const docTemplate = `{
}
}
},
"gallery.GalleryBackend": {
"type": "object",
"properties": {
"alias": {
"type": "string"
},
"capabilities": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"description": {
"type": "string"
},
"files": {
"description": "AdditionalFiles are used to add additional files to the model",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.File"
}
},
"gallery": {
"description": "Gallery is a reference to the gallery which contains the model",
"allOf": [
{
"$ref": "#/definitions/config.Gallery"
}
]
},
"icon": {
"type": "string"
},
"installed": {
"description": "Installed is used to indicate if the model is installed or not",
"type": "boolean"
},
"license": {
"type": "string"
},
"mirrors": {
"type": "array",
"items": {
"type": "string"
}
},
"name": {
"type": "string"
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
},
"uri": {
"type": "string"
},
"url": {
"type": "string"
},
"urls": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"gallery.GalleryModel": {
"type": "object",
"properties": {
@@ -987,34 +1015,11 @@ const docTemplate = `{
}
}
},
"services.GalleryOpStatus": {
"localai.GalleryBackend": {
"type": "object",
"properties": {
"deletion": {
"description": "Deletion is true if the operation is a deletion",
"type": "boolean"
},
"downloaded_size": {
"id": {
"type": "string"
},
"error": {},
"file_name": {
"type": "string"
},
"file_size": {
"type": "string"
},
"gallery_model_name": {
"type": "string"
},
"message": {
"type": "string"
},
"processed": {
"type": "boolean"
},
"progress": {
"type": "number"
}
}
},
@@ -1026,9 +1031,6 @@ const docTemplate = `{
"type": "object",
"additionalProperties": true
},
"config_url": {
"type": "string"
},
"description": {
"type": "string"
},
@@ -1085,130 +1087,6 @@ const docTemplate = `{
}
}
},
"openai.Assistant": {
"type": "object",
"properties": {
"created": {
"description": "The time at which the assistant was created.",
"type": "integer"
},
"description": {
"description": "The description of the assistant.",
"type": "string"
},
"file_ids": {
"description": "A list of file IDs attached to this assistant.",
"type": "array",
"items": {
"type": "string"
}
},
"id": {
"description": "The unique identifier of the assistant.",
"type": "string"
},
"instructions": {
"description": "The system instructions that the assistant uses.",
"type": "string"
},
"metadata": {
"description": "Set of key-value pairs attached to the assistant.",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"description": "The model ID used by the assistant.",
"type": "string"
},
"name": {
"description": "The name of the assistant.",
"type": "string"
},
"object": {
"description": "Object type, which is \"assistant\".",
"type": "string"
},
"tools": {
"description": "A list of tools enabled on the assistant.",
"type": "array",
"items": {
"$ref": "#/definitions/openai.Tool"
}
}
}
},
"openai.AssistantRequest": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"file_ids": {
"type": "array",
"items": {
"type": "string"
}
},
"instructions": {
"type": "string"
},
"metadata": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"name": {
"type": "string"
},
"tools": {
"type": "array",
"items": {
"$ref": "#/definitions/openai.Tool"
}
}
}
},
"openai.DeleteStatus": {
"type": "object",
"properties": {
"deleted": {
"type": "boolean"
},
"id": {
"type": "string"
},
"object": {
"type": "string"
}
}
},
"openai.Tool": {
"type": "object",
"properties": {
"type": {
"$ref": "#/definitions/openai.ToolType"
}
}
},
"openai.ToolType": {
"type": "string",
"enum": [
"code_interpreter",
"retrieval",
"function"
],
"x-enum-varnames": [
"CodeInterpreter",
"Retrieval",
"Function"
]
},
"p2p.NodeData": {
"type": "object",
"properties": {
@@ -1235,7 +1113,8 @@ const docTemplate = `{
"breakdown": {
"type": "object",
"additionalProperties": {
"type": "integer"
"type": "integer",
"format": "int64"
}
},
"total": {
@@ -1256,6 +1135,7 @@ const docTemplate = `{
},
"proto.StatusResponse_State": {
"type": "integer",
"format": "int32",
"enum": [
0,
1,
@@ -1299,6 +1179,17 @@ const docTemplate = `{
}
}
},
"schema.BackendResponse": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"status_url": {
"type": "string"
}
}
},
"schema.Choice": {
"type": "object",
"properties": {
@@ -1319,17 +1210,45 @@ const docTemplate = `{
}
}
},
"schema.DeleteAssistantResponse": {
"schema.Detection": {
"type": "object",
"properties": {
"deleted": {
"type": "boolean"
},
"id": {
"class_name": {
"type": "string"
},
"object": {
"height": {
"type": "number"
},
"width": {
"type": "number"
},
"x": {
"type": "number"
},
"y": {
"type": "number"
}
}
},
"schema.DetectionRequest": {
"type": "object",
"properties": {
"image": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.DetectionResponse": {
"type": "object",
"properties": {
"detections": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Detection"
}
}
}
},
@@ -1353,35 +1272,6 @@ const docTemplate = `{
}
}
},
"schema.File": {
"type": "object",
"properties": {
"bytes": {
"description": "Size of the file in bytes",
"type": "integer"
},
"created_at": {
"description": "The time at which the file was created",
"type": "string"
},
"filename": {
"description": "The name of the file",
"type": "string"
},
"id": {
"description": "Unique identifier for the file",
"type": "string"
},
"object": {
"description": "Type of the object (e.g., \"file\")",
"type": "string"
},
"purpose": {
"description": "The purpose of the file (e.g., \"fine-tune\", \"classifications\", etc.)",
"type": "string"
}
}
},
"schema.FunctionCall": {
"type": "object",
"properties": {
@@ -1501,20 +1391,6 @@ const docTemplate = `{
}
}
},
"schema.ListFiles": {
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.File"
}
},
"object": {
"type": "string"
}
}
},
"schema.Message": {
"type": "object",
"properties": {
@@ -1610,6 +1486,13 @@ const docTemplate = `{
"description": "whisper",
"type": "string"
},
"files": {
"description": "Multiple input images for img2img or inpainting",
"type": "array",
"items": {
"type": "string"
}
},
"frequency_penalty": {
"type": "number"
},
@@ -1684,6 +1567,13 @@ const docTemplate = `{
"quality": {
"type": "string"
},
"ref_images": {
"description": "Reference images for models that support them (e.g., Flux Kontext)",
"type": "array",
"items": {
"type": "string"
}
},
"repeat_last_n": {
"type": "integer"
},
@@ -1923,6 +1813,37 @@ const docTemplate = `{
"type": "string"
}
}
},
"services.GalleryOpStatus": {
"type": "object",
"properties": {
"deletion": {
"description": "Deletion is true if the operation is a deletion",
"type": "boolean"
},
"downloaded_size": {
"type": "string"
},
"error": {},
"file_name": {
"type": "string"
},
"file_size": {
"type": "string"
},
"gallery_element_name": {
"type": "string"
},
"message": {
"type": "string"
},
"processed": {
"type": "boolean"
},
"progress": {
"type": "number"
}
}
}
},
"securityDefinitions": {

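The `/backends` routes added in this diff follow an install-then-poll pattern: `POST /backends/apply` takes a `localai.GalleryBackend` (which, per the new schema, carries a single `id` field) and returns a `schema.BackendResponse` with a job `id` and `status_url`; `GET /backends/jobs/{uuid}` then reports a `services.GalleryOpStatus` until `processed` is true. A minimal client-side sketch of the payload and URL shapes, assuming a server at `http://localhost:8080` (the base URL and the `llama-cpp` backend name are assumptions, not part of the spec):

```python
import json

BASE_URL = "http://localhost:8080"  # assumed LocalAI address, not defined by the spec


def build_apply_request(backend_id: str) -> str:
    """Body for POST /backends/apply: localai.GalleryBackend has one `id` property."""
    return json.dumps({"id": backend_id})


def job_status_url(uuid: str) -> str:
    """URL for GET /backends/jobs/{uuid}."""
    return f"{BASE_URL}/backends/jobs/{uuid}"


def is_done(op_status: dict) -> bool:
    """A services.GalleryOpStatus is finished when `processed` is true;
    `progress` is a number and `error` is present on failure."""
    return bool(op_status.get("processed"))


# Response shaped like schema.BackendResponse (illustrative values):
resp = json.loads('{"id": "1234", "status_url": "/backends/jobs/1234"}')
# Poll job_status_url(resp["id"]) until is_done(...) on the decoded body.
```

The same `GalleryOpStatus` map is what `GET /backends/jobs` returns for all jobs at once, keyed by job UUID.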
@@ -85,6 +85,129 @@
"responses": {}
}
},
"/backends": {
"get": {
"summary": "List all Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/apply": {
"post": {
"summary": "Install backends to LocalAI.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/localai.GalleryBackend"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/available": {
"get": {
"summary": "List all available Backends",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/gallery.GalleryBackend"
}
}
}
}
}
},
"/backends/delete/{name}": {
"post": {
"summary": "delete backends from LocalAI.",
"parameters": [
{
"type": "string",
"description": "Backend name",
"name": "name",
"in": "path",
"required": true
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.BackendResponse"
}
}
}
}
},
"/backends/galleries": {
"get": {
"summary": "List all Galleries",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/backends/jobs": {
"get": {
"summary": "Returns all the jobs status progress",
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "object",
"additionalProperties": {
"$ref": "#/definitions/services.GalleryOpStatus"
}
}
}
}
}
},
"/backends/jobs/{uuid}": {
"get": {
"summary": "Returns the job status",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/services.GalleryOpStatus"
}
}
}
}
},
"/metrics": {
"get": {
"summary": "Prometheus metrics endpoint",
@@ -178,56 +301,6 @@
}
}
}
},
"post": {
"summary": "Adds a gallery in LocalAI",
"parameters": [
{
"description": "Gallery details",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/config.Gallery"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
},
"delete": {
"summary": "removes a gallery from LocalAI",
"parameters": [
{
"description": "Gallery details",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/config.Gallery"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/config.Gallery"
}
}
}
}
}
},
"/models/jobs": {
@@ -321,94 +394,6 @@
}
}
},
"/v1/assistants": {
"get": {
"summary": "List available assistents",
"parameters": [
{
"type": "integer",
"description": "Limit the number of assistants returned",
"name": "limit",
"in": "query"
},
{
"type": "string",
"description": "Order of assistants returned",
"name": "order",
"in": "query"
},
{
"type": "string",
"description": "Return assistants created after the given ID",
"name": "after",
"in": "query"
},
{
"type": "string",
"description": "Return assistants created before the given ID",
"name": "before",
"in": "query"
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"type": "array",
"items": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
}
},
"post": {
"summary": "Create an assistant with a model and instructions.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/openai.AssistantRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
}
},
"/v1/assistants/{assistant_id}": {
"get": {
"summary": "Get assistent data",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.Assistant"
}
}
}
},
"delete": {
"summary": "Delete assistents",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.DeleteAssistantResponse"
}
}
}
}
},
"/v1/audio/speech": {
"post": {
"consumes": [
@@ -522,6 +507,30 @@
}
}
},
"/v1/detection": {
"post": {
"summary": "Detects objects in the input image.",
"parameters": [
{
"description": "query params",
"name": "request",
"in": "body",
"required": true,
"schema": {
"$ref": "#/definitions/schema.DetectionRequest"
}
}
],
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.DetectionResponse"
}
}
}
}
},
"/v1/edits": {
"post": {
"summary": "OpenAI edit endpoint",
@@ -570,56 +579,6 @@
}
}
},
"/v1/files": {
"get": {
"summary": "List files.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.ListFiles"
}
}
}
}
},
"/v1/files/{file_id}": {
"get": {
"summary": "Returns information about a specific file.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/schema.File"
}
}
}
},
"delete": {
"summary": "Delete a file.",
"responses": {
"200": {
"description": "Response",
"schema": {
"$ref": "#/definitions/openai.DeleteStatus"
}
}
}
}
},
"/v1/files/{file_id}/content": {
"get": {
"summary": "Returns information about a specific file.",
"responses": {
"200": {
"description": "file",
"schema": {
"type": "string"
}
}
}
}
},
"/v1/images/generations": {
"post": {
"summary": "Creates an image given a prompt.",
@@ -919,6 +878,75 @@
}
}
},
"gallery.GalleryBackend": {
"type": "object",
"properties": {
"alias": {
"type": "string"
},
"capabilities": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"description": {
"type": "string"
},
"files": {
"description": "AdditionalFiles are used to add additional files to the model",
"type": "array",
"items": {
"$ref": "#/definitions/gallery.File"
}
},
"gallery": {
"description": "Gallery is a reference to the gallery which contains the model",
"allOf": [
{
"$ref": "#/definitions/config.Gallery"
}
]
},
"icon": {
"type": "string"
},
"installed": {
"description": "Installed is used to indicate if the model is installed or not",
"type": "boolean"
},
"license": {
"type": "string"
},
"mirrors": {
"type": "array",
"items": {
"type": "string"
}
},
"name": {
"type": "string"
},
"tags": {
"type": "array",
"items": {
"type": "string"
}
},
"uri": {
"type": "string"
},
"url": {
"type": "string"
},
"urls": {
"type": "array",
"items": {
"type": "string"
}
}
}
},
"gallery.GalleryModel": {
"type": "object",
"properties": {
@@ -980,34 +1008,11 @@
}
}
},
"services.GalleryOpStatus": {
"localai.GalleryBackend": {
"type": "object",
"properties": {
"deletion": {
"description": "Deletion is true if the operation is a deletion",
"type": "boolean"
},
"downloaded_size": {
"id": {
"type": "string"
},
"error": {},
"file_name": {
"type": "string"
},
"file_size": {
"type": "string"
},
"gallery_model_name": {
"type": "string"
},
"message": {
"type": "string"
},
"processed": {
"type": "boolean"
},
"progress": {
"type": "number"
}
}
},
@@ -1019,9 +1024,6 @@
"type": "object",
"additionalProperties": true
},
"config_url": {
"type": "string"
},
"description": {
"type": "string"
},
@@ -1078,130 +1080,6 @@
}
}
},
"openai.Assistant": {
"type": "object",
"properties": {
"created": {
"description": "The time at which the assistant was created.",
"type": "integer"
},
"description": {
"description": "The description of the assistant.",
"type": "string"
},
"file_ids": {
"description": "A list of file IDs attached to this assistant.",
"type": "array",
"items": {
"type": "string"
}
},
"id": {
"description": "The unique identifier of the assistant.",
"type": "string"
},
"instructions": {
"description": "The system instructions that the assistant uses.",
"type": "string"
},
"metadata": {
"description": "Set of key-value pairs attached to the assistant.",
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"description": "The model ID used by the assistant.",
"type": "string"
},
"name": {
"description": "The name of the assistant.",
"type": "string"
},
"object": {
"description": "Object type, which is \"assistant\".",
"type": "string"
},
"tools": {
"description": "A list of tools enabled on the assistant.",
"type": "array",
"items": {
"$ref": "#/definitions/openai.Tool"
}
}
}
},
"openai.AssistantRequest": {
"type": "object",
"properties": {
"description": {
"type": "string"
},
"file_ids": {
"type": "array",
"items": {
"type": "string"
}
},
"instructions": {
"type": "string"
},
"metadata": {
"type": "object",
"additionalProperties": {
"type": "string"
}
},
"model": {
"type": "string"
},
"name": {
"type": "string"
},
"tools": {
"type": "array",
"items": {
"$ref": "#/definitions/openai.Tool"
}
}
}
},
"openai.DeleteStatus": {
"type": "object",
"properties": {
"deleted": {
"type": "boolean"
},
"id": {
"type": "string"
},
"object": {
"type": "string"
}
}
},
"openai.Tool": {
"type": "object",
"properties": {
"type": {
"$ref": "#/definitions/openai.ToolType"
}
}
},
"openai.ToolType": {
"type": "string",
"enum": [
"code_interpreter",
"retrieval",
"function"
],
"x-enum-varnames": [
"CodeInterpreter",
"Retrieval",
"Function"
]
},
"p2p.NodeData": {
"type": "object",
"properties": {
@@ -1228,7 +1106,8 @@
"breakdown": {
"type": "object",
"additionalProperties": {
"type": "integer"
"type": "integer",
"format": "int64"
}
},
"total": {
@@ -1249,6 +1128,7 @@
},
"proto.StatusResponse_State": {
"type": "integer",
"format": "int32",
"enum": [
0,
1,
@@ -1292,6 +1172,17 @@
}
}
},
"schema.BackendResponse": {
"type": "object",
"properties": {
"id": {
"type": "string"
},
"status_url": {
"type": "string"
}
}
},
"schema.Choice": {
"type": "object",
"properties": {
@@ -1312,17 +1203,45 @@
}
}
},
"schema.DeleteAssistantResponse": {
"schema.Detection": {
"type": "object",
"properties": {
"deleted": {
"type": "boolean"
},
"id": {
"class_name": {
"type": "string"
},
"object": {
"height": {
"type": "number"
},
"width": {
"type": "number"
},
"x": {
"type": "number"
},
"y": {
"type": "number"
}
}
},
"schema.DetectionRequest": {
"type": "object",
"properties": {
"image": {
"type": "string"
},
"model": {
"type": "string"
}
}
},
"schema.DetectionResponse": {
"type": "object",
"properties": {
"detections": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.Detection"
}
}
}
},
@@ -1346,35 +1265,6 @@
}
}
},
"schema.File": {
"type": "object",
"properties": {
"bytes": {
"description": "Size of the file in bytes",
"type": "integer"
},
"created_at": {
"description": "The time at which the file was created",
"type": "string"
},
"filename": {
"description": "The name of the file",
"type": "string"
},
"id": {
"description": "Unique identifier for the file",
"type": "string"
},
"object": {
"description": "Type of the object (e.g., \"file\")",
"type": "string"
},
"purpose": {
"description": "The purpose of the file (e.g., \"fine-tune\", \"classifications\", etc.)",
"type": "string"
}
}
},
"schema.FunctionCall": {
"type": "object",
"properties": {
@@ -1494,20 +1384,6 @@
}
}
},
"schema.ListFiles": {
"type": "object",
"properties": {
"data": {
"type": "array",
"items": {
"$ref": "#/definitions/schema.File"
}
},
"object": {
"type": "string"
}
}
},
"schema.Message": {
"type": "object",
"properties": {
@@ -1603,6 +1479,13 @@
"description": "whisper",
"type": "string"
},
"files": {
"description": "Multiple input images for img2img or inpainting",
"type": "array",
"items": {
"type": "string"
}
},
"frequency_penalty": {
"type": "number"
},
@@ -1677,6 +1560,13 @@
"quality": {
"type": "string"
},
"ref_images": {
"description": "Reference images for models that support them (e.g., Flux Kontext)",
"type": "array",
"items": {
"type": "string"
}
},
"repeat_last_n": {
"type": "integer"
},
@@ -1916,6 +1806,37 @@
"type": "string"
}
}
},
"services.GalleryOpStatus": {
"type": "object",
"properties": {
"deletion": {
"description": "Deletion is true if the operation is a deletion",
"type": "boolean"
},
"downloaded_size": {
"type": "string"
},
"error": {},
"file_name": {
"type": "string"
},
"file_size": {
"type": "string"
},
"gallery_element_name": {
"type": "string"
},
"message": {
"type": "string"
},
"processed": {
"type": "boolean"
},
"progress": {
"type": "number"
}
}
}
},
"securityDefinitions": {

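The new `/v1/detection` endpoint accepts a `schema.DetectionRequest` (`image` and `model`, both strings) and returns a `schema.DetectionResponse` whose `detections` each carry a `class_name` plus numeric `x`, `y`, `width`, `height`. A client-side sketch of building the request and flattening the response; note the spec does not say how the `image` string is encoded, so the base64 encoding below, like the model name, is an assumption:

```python
import base64
import json


def build_detection_request(image_bytes: bytes, model: str) -> str:
    # schema.DetectionRequest: both fields are plain strings. Base64-encoding
    # the image is an assumption -- the schema does not specify the encoding.
    return json.dumps({
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "model": model,
    })


def parse_detections(body: str) -> list:
    """Flatten a schema.DetectionResponse into (class_name, x, y, w, h) tuples."""
    resp = json.loads(body)
    return [
        (d["class_name"], d["x"], d["y"], d["width"], d["height"])
        for d in resp.get("detections", [])
    ]


# Illustrative response body shaped like schema.DetectionResponse:
sample = '{"detections": [{"class_name": "cat", "x": 10, "y": 20, "width": 64, "height": 48}]}'
```

POST the request body to `/v1/detection` and feed the 200 response through `parse_detections` to get bounding boxes.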
@@ -57,6 +57,51 @@ definitions:
uri:
type: string
type: object
gallery.GalleryBackend:
properties:
alias:
type: string
capabilities:
additionalProperties:
type: string
type: object
description:
type: string
files:
description: AdditionalFiles are used to add additional files to the model
items:
$ref: '#/definitions/gallery.File'
type: array
gallery:
allOf:
- $ref: '#/definitions/config.Gallery'
description: Gallery is a reference to the gallery which contains the model
icon:
type: string
installed:
description: Installed is used to indicate if the model is installed or not
type: boolean
license:
type: string
mirrors:
items:
type: string
type: array
name:
type: string
tags:
items:
type: string
type: array
uri:
type: string
url:
type: string
urls:
items:
type: string
type: array
type: object
gallery.GalleryModel:
properties:
config_file:
@@ -100,26 +145,10 @@ definitions:
type: string
type: array
type: object
services.GalleryOpStatus:
localai.GalleryBackend:
properties:
deletion:
description: Deletion is true if the operation is a deletion
type: boolean
downloaded_size:
id:
type: string
error: {}
file_name:
type: string
file_size:
type: string
gallery_model_name:
type: string
message:
type: string
processed:
type: boolean
progress:
type: number
type: object
localai.GalleryModel:
properties:
@@ -128,8 +157,6 @@ definitions:
description: config_file is read in the situation where URL is blank - and
therefore this is a base config.
type: object
config_url:
type: string
description:
type: string
files:
@@ -168,92 +195,6 @@ definitions:
type: string
type: array
type: object
openai.Assistant:
properties:
created:
description: The time at which the assistant was created.
type: integer
description:
description: The description of the assistant.
type: string
file_ids:
description: A list of file IDs attached to this assistant.
items:
type: string
type: array
id:
description: The unique identifier of the assistant.
type: string
instructions:
description: The system instructions that the assistant uses.
type: string
metadata:
additionalProperties:
type: string
description: Set of key-value pairs attached to the assistant.
type: object
model:
description: The model ID used by the assistant.
type: string
name:
description: The name of the assistant.
type: string
object:
description: Object type, which is "assistant".
type: string
tools:
description: A list of tools enabled on the assistant.
items:
$ref: '#/definitions/openai.Tool'
type: array
type: object
openai.AssistantRequest:
properties:
description:
type: string
file_ids:
items:
type: string
type: array
instructions:
type: string
metadata:
additionalProperties:
type: string
type: object
model:
type: string
name:
type: string
tools:
items:
$ref: '#/definitions/openai.Tool'
type: array
type: object
openai.DeleteStatus:
properties:
deleted:
type: boolean
id:
type: string
object:
type: string
type: object
openai.Tool:
properties:
type:
$ref: '#/definitions/openai.ToolType'
type: object
openai.ToolType:
enum:
- code_interpreter
- retrieval
- function
type: string
x-enum-varnames:
- CodeInterpreter
- Retrieval
- Function
p2p.NodeData:
properties:
id:
@@ -271,6 +212,7 @@ definitions:
properties:
breakdown:
additionalProperties:
format: int64
type: integer
type: object
total:
@@ -289,6 +231,7 @@ definitions:
- 1
- 2
- -1
format: int32
type: integer
x-enum-varnames:
- StatusResponse_UNINITIALIZED
@@ -314,6 +257,13 @@ definitions:
model:
type: string
type: object
schema.BackendResponse:
properties:
id:
type: string
status_url:
type: string
type: object
schema.Choice:
properties:
delta:
@@ -327,14 +277,32 @@ definitions:
text:
type: string
type: object
schema.DeleteAssistantResponse:
schema.Detection:
properties:
deleted:
type: boolean
id:
class_name:
type: string
object:
height:
type: number
width:
type: number
x:
type: number
"y":
type: number
type: object
schema.DetectionRequest:
properties:
image:
type: string
model:
type: string
type: object
schema.DetectionResponse:
properties:
detections:
items:
$ref: '#/definitions/schema.Detection'
type: array
type: object
schema.ElevenLabsSoundGenerationRequest:
properties:
@@ -349,28 +317,6 @@ definitions:
text:
type: string
type: object
schema.File:
properties:
bytes:
description: Size of the file in bytes
type: integer
created_at:
description: The time at which the file was created
type: string
filename:
description: The name of the file
type: string
id:
description: Unique identifier for the file
type: string
object:
description: Type of the object (e.g., "file")
type: string
purpose:
description: The purpose of the file (e.g., "fine-tune", "classifications",
etc.)
type: string
type: object
schema.FunctionCall:
properties:
arguments:
@@ -448,15 +394,6 @@ definitions:
total_tokens:
type: integer
type: object
schema.ListFiles:
properties:
data:
items:
$ref: '#/definitions/schema.File'
type: array
object:
type: string
type: object
schema.Message:
properties:
content:
@@ -519,6 +456,11 @@ definitions:
file:
description: whisper
type: string
files:
description: Multiple input images for img2img or inpainting
items:
type: string
type: array
frequency_penalty:
type: number
function_call:
@@ -572,6 +514,11 @@ definitions:
description: Prompt is read only by completion/image API calls
quality:
type: string
ref_images:
description: Reference images for models that support them (e.g., Flux Kontext)
items:
type: string
type: array
repeat_last_n:
type: integer
repeat_penalty:
@@ -737,6 +684,27 @@ definitions:
model:
type: string
type: object
services.GalleryOpStatus:
properties:
deletion:
description: Deletion is true if the operation is a deletion
type: boolean
downloaded_size:
type: string
error: {}
file_name:
type: string
file_size:
type: string
gallery_element_name:
type: string
message:
type: string
processed:
type: boolean
progress:
type: number
type: object
info:
contact:
name: LocalAI
@@ -792,6 +760,83 @@ paths:
$ref: '#/definitions/schema.BackendMonitorRequest'
responses: {}
summary: Backend monitor endpoint
/backends:
get:
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/gallery.GalleryBackend'
type: array
summary: List all Backends
/backends/apply:
post:
parameters:
- description: query params
in: body
name: request
required: true
schema:
$ref: '#/definitions/localai.GalleryBackend'
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.BackendResponse'
summary: Install backends to LocalAI.
/backends/available:
get:
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/gallery.GalleryBackend'
type: array
summary: List all available Backends
/backends/delete/{name}:
post:
parameters:
- description: Backend name
in: path
name: name
required: true
type: string
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.BackendResponse'
summary: delete backends from LocalAI.
/backends/galleries:
get:
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/config.Gallery'
type: array
summary: List all Galleries
/backends/jobs:
get:
responses:
"200":
description: Response
schema:
additionalProperties:
$ref: '#/definitions/services.GalleryOpStatus'
type: object
summary: Returns all the jobs status progress
/backends/jobs/{uuid}:
get:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/services.GalleryOpStatus'
summary: Returns the job status
/metrics:
get:
parameters:
@@ -843,22 +888,6 @@ paths:
$ref: '#/definitions/schema.GalleryResponse'
summary: delete models to LocalAI.
/models/galleries:
delete:
parameters:
- description: Gallery details
in: body
name: request
required: true
schema:
$ref: '#/definitions/config.Gallery'
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/config.Gallery'
type: array
summary: removes a gallery from LocalAI
get:
responses:
"200":
@@ -868,22 +897,6 @@ paths:
$ref: '#/definitions/config.Gallery'
type: array
summary: List all Galleries
post:
parameters:
- description: Gallery details
in: body
name: request
required: true
schema:
$ref: '#/definitions/config.Gallery'
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/config.Gallery'
type: array
summary: Adds a gallery in LocalAI
/models/jobs:
get:
responses:
@@ -941,62 +954,6 @@ paths:
schema:
type: string
summary: Generates audio from the input text.
/v1/assistants:
get:
parameters:
- description: Limit the number of assistants returned
in: query
name: limit
type: integer
- description: Order of assistants returned
in: query
name: order
type: string
- description: Return assistants created after the given ID
in: query
name: after
type: string
- description: Return assistants created before the given ID
in: query
name: before
type: string
responses:
"200":
description: Response
schema:
items:
$ref: '#/definitions/openai.Assistant'
type: array
summary: List available assistents
post:
parameters:
- description: query params
in: body
name: request
required: true
schema:
$ref: '#/definitions/openai.AssistantRequest'
responses:
"200":
description: Response
schema:
$ref: '#/definitions/openai.Assistant'
summary: Create an assistant with a model and instructions.
/v1/assistants/{assistant_id}:
delete:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.DeleteAssistantResponse'
summary: Delete assistents
get:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/openai.Assistant'
summary: Get assistent data
/v1/audio/speech:
post:
consumes:
@@ -1069,6 +1026,21 @@ paths:
schema:
$ref: '#/definitions/schema.OpenAIResponse'
summary: Generate completions for a given prompt and model.
/v1/detection:
post:
parameters:
- description: query params
in: body
name: request
required: true
schema:
$ref: '#/definitions/schema.DetectionRequest'
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.DetectionResponse'
summary: Detects objects in the input image.
/v1/edits:
post:
parameters:
@@ -1100,37 +1072,6 @@ paths:
$ref: '#/definitions/schema.OpenAIResponse'
summary: Get a vector representation of a given input that can be easily consumed
by machine learning models and algorithms.
/v1/files:
get:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.ListFiles'
summary: List files.
/v1/files/{file_id}:
delete:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/openai.DeleteStatus'
summary: Delete a file.
get:
responses:
"200":
description: Response
schema:
$ref: '#/definitions/schema.File'
summary: Returns information about a specific file.
/v1/files/{file_id}/content:
get:
responses:
"200":
description: file
schema:
type: string
summary: Returns information about a specific file.
/v1/images/generations:
post:
parameters: