Commit Graph

300 Commits

Author SHA1 Message Date
LocalAI [bot]
d200401e86 feat: Add --data-path CLI flag for persistent data separation (#8888)
feat: add --data-path CLI flag for persistent data separation

- Add LOCALAI_DATA_PATH environment variable and --data-path CLI flag
- Default data path: /data (separate from configuration directory)
- Automatic migration on startup: moves agent_tasks.json, agent_jobs.json, collections/, and assets/ from old config dir to new data path
- Backward compatible: preserves old behavior if LOCALAI_DATA_PATH is not set
- Agent state and job directories now use DataPath with proper fallback chain
- Update documentation with new flag and docker-compose example

This separates mutable persistent data (collectiondb, agents, assets, skills) from configuration files, enabling better volume mounting and data persistence in containerized deployments.

Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
2026-03-09 14:11:15 +01:00
LocalAI [bot]
9da24cdf85 docs: Update model compatibility documentation with missing backends (#8889)
* docs: Update model compatibility documentation with missing backends

Added the following backends to README.md and compatibility-table.md:
- vllm-omni: Multimodal vLLM with vision and audio support
- nemo: NVIDIA NeMo framework for speech models
- outetts: OuteTTS with voice cloning capabilities
- faster-qwen3-tts: Faster Qwen3 TTS implementation
- qwen-asr: Qwen automatic speech recognition
- voxcpm: VoxCPM speech understanding model
- whisperx: Enhanced Whisper with word-level transcription

These backends exist in the codebase (backend/index.yaml) but were missing
from the documentation. This update ensures accurate reflection of currently
supported backends in LocalAI.

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@example.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-09 09:22:34 +01:00
LocalAI [bot]
9297074caa docs: expand GPU acceleration guide with L4T, multi-GPU, monitoring, and troubleshooting (#8858)
- Expand multi-GPU section to cover llama.cpp (CUDA_VISIBLE_DEVICES,
  HIP_VISIBLE_DEVICES) in addition to diffusers
- Add NVIDIA L4T/Jetson section with quick start commands and cross-reference
  to the dedicated ARM64 page
- Add GPU monitoring section with vendor-specific tools (nvidia-smi, rocm-smi,
  intel_gpu_top)
- Add troubleshooting section covering common issues: GPU not detected, CPU
  fallback, OOM errors, unsupported ROCm targets, SYCL mmap hang
- Replace "under construction" warning with useful cross-references to related
  docs (container images, VRAM management)

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 21:59:57 +01:00
LocalAI [bot]
2133031b47 feat: Create comprehensive troubleshooting guide (M1 task) (#8856)
* feat: create comprehensive troubleshooting guide (M1 task)

- Consolidates troubleshooting information from scattered documentation
- Covers installation, model loading, GPU/memory, API, performance, Docker, and network issues
- Includes diagnostic commands and step-by-step solutions
- Organized by category for easy navigation

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-08 21:58:32 +01:00
LocalAI [bot]
9090bca920 feat: Add documentation for undocumented API endpoints (#8852)
* feat: add documentation for undocumented API endpoints

Creates comprehensive documentation for 8 previously undocumented endpoints:
- Voice Activity Detection (/v1/vad)
- Video Generation (/video)
- Sound Generation (/v1/sound-generation)
- Backend Monitor (/backend/monitor, /backend/shutdown)
- Token Metrics (/tokenMetrics)
- P2P endpoints (/api/p2p/* - 5 sub-endpoints)
- System Info (/system, /version)

Each documentation file includes HTTP method, request/response schemas,
curl examples, sample JSON responses, and error codes.

* docs: remove token-metrics endpoint documentation per review feedback

The token-metrics endpoint is not wired into the HTTP router and
should not be documented per reviewer request.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs: move system-info documentation to reference section

Per review feedback, system-info endpoint docs are better suited
for the reference section rather than features.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 17:59:33 +01:00
Ettore Di Giacinto
d21369ad7b Update shell completion documentation URL
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-08 09:33:36 +01:00
LocalAI [bot]
efd402207c feat: Add shell completion support for bash, zsh, and fish (#8851)
feat: add shell completion support for bash, zsh, and fish

- Add core/cli/completion.go with dynamic completion script generation
- Add core/cli/completion_test.go with unit tests
- Modify cmd/local-ai/main.go to support completion command
- Modify core/cli/cli.go to add Completion subcommand
- Add docs/content/features/shell-completion.md with installation instructions

The completion scripts are generated dynamically from the Kong CLI model,
so they automatically include all commands, subcommands, and flags.

Co-authored-by: localai-bot <localai-bot@noreply.github.com>
2026-03-08 09:32:39 +01:00
LocalAI [bot]
36c184175a docs: Add comprehensive API error reference documentation (#8848)
docs: add comprehensive API error reference documentation

Document all error response formats (OpenAI, Anthropic, Open Responses),
HTTP status codes, per-endpoint error scenarios, and client error handling
examples based on actual error handling code in the codebase.

Signed-off-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: localai-bot <localai-bot@noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-08 08:54:51 +01:00
Ettore Di Giacinto
ac48867b7d feat: add agentic management (#8820)
* feat: add standalone and agentic functionalities

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* expose agents via responses api

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-07 00:03:08 +01:00
LocalAI [bot]
ab315f2725 feat: Add LOCALAI_DISABLE_MCP environment variable to disable MCP support (#8816)
* feat: Add LOCALAI_DISABLE_MCP environment variable to disable MCP support

- Added DisableMCP field to RunCMD struct in core/cli/run.go
- Added LOCALAI_DISABLE_MCP environment variable support
- Added DisableMCP field to ApplicationConfig struct
- Added DisableMCP AppOption function
- Updated MCP endpoint routing to check appConfig.DisableMCP
- When LOCALAI_DISABLE_MCP is set to true/1/yes, MCP endpoints are not registered

When set, all MCP functionality is disabled and appropriate error messages
are returned to users.

Use Cases:
- Security-conscious deployments where MCP is not needed
- Reducing attack surface
- Compliance requirements that prohibit certain protocol support

Environment variable: LOCALAI_DISABLE_MCP=true

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

* docs: Add documentation for LOCALAI_DISABLE_MCP environment variable

- Add section explaining how to disable MCP support using environment variable
- Document use cases for disabling MCP
- Provide examples for CLI and Docker usage

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

---------

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-03-06 20:44:03 +01:00
Ettore Di Giacinto
580517f9db feat: pass-by metadata to predict options (#8795)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-05 22:50:10 +01:00
Andres
454d8adc76 feat(qwen-tts): Support using multiple voices (#8757)
* Add support for multiple voice clones in Qwen TTS

Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Add voice prompt caching and generation logs to see generation time

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-04 09:47:21 +01:00
Ettore Di Giacinto
983db7bedc feat(ui): add model size estimation (#8684)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-28 23:03:47 +01:00
LocalAI [bot]
b260378694 docs: add TLS reverse proxy configuration guide (#8673)
* docs: add TLS reverse proxy configuration guide

Add documentation explaining how to use LocalAI behind a TLS
termination reverse proxy (HAProxy, Apache, Nginx).

The documentation covers:
- How LocalAI detects HTTPS via X-Forwarded-Proto header
- Required headers that must be forwarded
- Configuration examples for HAProxy, Apache, and Nginx
- Sub-path serving configuration
- Testing and troubleshooting guide

Fixes: Issue #7176 - Web UI broken behind TLS reverse proxy

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

* docs: remove non-existent --base-url option from sub-path section

---------

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-02-28 23:02:17 +01:00
LocalAI [bot]
5e13193d84 docs: add CDI driver config for NVIDIA GPU in containers (fix #8108) (#8677)
This addresses issue #8108 where the legacy nvidia driver configuration
causes container startup failures with newer NVIDIA Container Toolkit versions.

Changes:
- Update docker-compose example to show both CDI (recommended) and legacy
  nvidia driver options
- Add troubleshooting section for 'Auto-detected mode as legacy' error
- Document the fix for nvidia-container-cli 'invalid expression' errors

The root cause is a Docker/NVIDIA Container Toolkit configuration issue,
not a LocalAI code bug. The error occurs during the container runtime's
prestart hook before LocalAI starts.

Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-02-28 08:42:53 +01:00
LocalAI [bot]
d845c39963 docs: add Podman installation documentation (#8646)
* docs: add Podman installation documentation

- Add new podman.md with comprehensive installation and usage guide
- Cover installation on multiple platforms (Ubuntu, Fedora, Arch, macOS, Windows)
- Document GPU support (NVIDIA CUDA, AMD ROCm, Intel, Vulkan)
- Include rootless container configuration
- Document Docker Compose with podman-compose
- Add troubleshooting section for common issues
- Link to Podman documentation in installation index
- Update image references to use Docker Hub and link to docker docs
- Change YAML heredoc to EOF in compose.yaml example
- Add curly brackets to notice shortcode and fix link

Closes #8645

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

* docs: merge Docker and Podman docs into unified Containers guide

Following the review comment, we have merged the Docker and Podman
documentation into a single 'Containers' page that covers both container
engines. The Docker and Podman pages now redirect to this unified guide.

Changes:
- Added new docs/content/installation/containers.md with combined Docker/Podman guide
- Updated docs/content/installation/docker.md to redirect to containers
- Updated docs/content/installation/podman.md to redirect to containers
- Updated docs/content/installation/_index.en.md to link to containers

Signed-off-by: LocalAI [bot] <localai-bot@users.noreply.github.com>
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

* docs: remove podman.md as docs are merged into containers.md

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>

---------

Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Signed-off-by: LocalAI [bot] <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-02-25 08:13:55 +01:00
LocalAI [bot]
db6ba4ef07 chore: remove install.sh script and documentation references (#8643)
* chore: remove install.sh script and documentation references

- Delete docs/static/install.sh (broken installer causing issues)
- Remove One-Line Installer section from linux.md documentation
- Remove install.sh references from installation/_index.en.md
- Remove install.sh warning and commands from README.md

Closes #8032

* fix: add missing closing braces to notice shortcode
2026-02-24 08:36:25 +01:00
Lukas Schaefer
ed0bfb8732 fix: rename json_verbose to verbose_json (#8627)
Signed-off-by: Lukas Schaefer <lukas@lschaefer.xyz>
2026-02-23 17:57:06 +00:00
Andres
e45d63c86e fix(cli): Fix watchdog running constantly and spamming logs (#8624)
* Fix watchdog running constantly and spamming logs

Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Update docs

Signed-off-by: Andres Smith <andressmithdev@pm.me>

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
2026-02-23 11:57:28 +01:00
LocalAI [bot]
559ab99890 docs: update diffusers multi-GPU documentation to mention tensor_parallel_size configuration (#8621)
* docs: update diffusers multi-GPU documentation to mention tensor_parallel_size configuration

* chore: revert backend/python/diffusers/README.md to original content

---------

Co-authored-by: Your Name <you@example.com>
2026-02-22 18:17:23 +01:00
Ettore Di Giacinto
b471619ad9 chore(deps): bump cogito and add new options to the agent config (#8601)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-18 22:10:26 +01:00
Ettore Di Giacinto
bf5a1dd840 feat(voxtral): add voxtral backend (#8451)
* feat(voxtral): add voxtral backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-09 09:12:05 +01:00
Varun Chawla
aa9ca401fa docs: update model gallery documentation to reference main repository (#8452)
Fixes #8212 - Updated the note about reporting broken models to
reference the main LocalAI repository instead of the outdated
separate gallery repository reference.
2026-02-08 22:14:23 +01:00
Richard Palethorpe
c1d0b10b14 chore(docs): Document using a local model gallery (#8426)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-06 10:28:41 +01:00
Ettore Di Giacinto
53276d28e7 feat(musicgen): add ace-step and UI interface (#8396)
* feat(musicgen): add ace-step and UI interface

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Correctly handle model dir

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop auto-download

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to models, fixup UIs icons

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* l4t13 is incompatbile

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* avoid pinning version for cuda12

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop l4t12

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-05 12:04:53 +01:00
Jonas Bernard
5ac50c9348 fix(docs): Promote DEBUG=false in production docker compose (#8390)
fix(docs): Use DEBUG=false in production docker compose

Signed-off-by: Jonas Bernard <public.jbernard@web.de>
2026-02-04 09:35:32 +01:00
Andres
b6459ddd57 feat(api): Add transcribe response format request parameter & adjust STT backends (#8318)
* WIP response format implementation for audio transcriptions

(cherry picked from commit e271dd764bbc13846accf3beb8b6522153aa276f)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Rework transcript response_format and add more formats

(cherry picked from commit 6a93a8f63e2ee5726bca2980b0c9cf4ef8b7aeb8)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Add test and replace go-openai package with official openai go client

(cherry picked from commit f25d1a04e46526429c89db4c739e1e65942ca893)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

* Fix faster-whisper backend and refactor transcription formatting to also work on CLI

Signed-off-by: Andres Smith <andressmithdev@pm.me>
(cherry picked from commit 69a93977d5e113eb7172bd85a0f918592d3d2168)
Signed-off-by: Andres Smith <andressmithdev@pm.me>

---------

Signed-off-by: Andres Smith <andressmithdev@pm.me>
Co-authored-by: nanoandrew4 <nanoandrew4@gmail.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-01 17:33:17 +01:00
Ettore Di Giacinto
68dd9765a0 feat(tts): add support for streaming mode (#8291)
* feat(tts): add support for streaming mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Send first audio, make sure it's 16

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-30 11:58:01 +01:00
Richard Palethorpe
dd8e74a486 feat(realtime): Add audio conversations (#6245)
* feat(realtime): Add audio conversations

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(realtime): Vendor the updated API and modify for server side

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat(realtime): Update to the GA realtime API

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore: Document realtime API and add docs to AGENTS.md

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* feat: Filter reasoning from spoken output

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Send delta and done events for tool calls and audio transcripts

Ensure that content is sent in both deltas and done events for function call arguments and audio transcripts. This fixes compatibility with clients that rely on delta events for parsing.

💘 Generated with Crush

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* fix(realtime): Improve tool call handling and error reporting

- Refactor Model interface to accept []types.ToolUnion and *types.ToolChoiceUnion
  instead of JSON strings, eliminating unnecessary marshal/unmarshal cycles
- Fix Parameters field handling: support both map[string]any and JSON string formats
- Add PredictConfig() method to Model interface for accessing model configuration
- Add comprehensive debug logging for tool call parsing and function config
- Add missing return statement after prediction error (critical bug fix)
- Add warning logs for NoAction function argument parsing failures
- Improve error visibility throughout generateResponse function

💘 Generated with Crush

Assisted-by: Claude Sonnet 4.5 via Crush <crush@charm.land>
Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-01-29 08:44:53 +01:00
Ettore Di Giacinto
6804ce1c39 chore(docs): change MEMORY_FILE_PATH to MEMORY_INDEX_PATH
Updated MEMORY_FILE_PATH to MEMORY_INDEX_PATH in memory configuration.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-01-25 22:14:11 +01:00
Ettore Di Giacinto
26a374b717 chore: drop bark which is unmaintained (#8207)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-25 09:26:40 +01:00
Ettore Di Giacinto
05904c77f5 chore(exllama): drop backend now almost deprecated (#8186)
exllama2 development has stalled and only old architectures are
supported. exllamav3 is still in development, meanwhile cleaning up
exllama2 from the gallery.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-24 08:57:37 +01:00
Ettore Di Giacinto
923ebbb344 feat(qwen-tts): add Qwen-tts backend (#8163)
* feat(qwen-tts): add Qwen-tts backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update intel deps

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop flash-attn for cuda13

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-23 15:18:41 +01:00
Ettore Di Giacinto
c491c6ca90 feat(openresponses): Support reasoning blocks (#8133)
* feat(openresponses): support reasoning blocks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* allow to disable reasoning, refactor common logic

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add option to only strip reasoning

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add configurations for custom reasoning tokens

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-21 00:11:45 +01:00
Ettore Di Giacinto
4bf2f8bbd8 chore(docs): update docs with Anthropic API and openresponses
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-20 09:25:24 +01:00
Ettore Di Giacinto
44d78b4d15 chore(doc): put alert on install.sh until is fixed (#8042)
See: https://github.com/mudler/LocalAI/issues/8032

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-14 22:08:48 +01:00
Ettore Di Giacinto
64d0a96ba3 feat(ui): add video gen UI (#8020)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-14 11:43:32 +01:00
Ettore Di Giacinto
a6ff354c86 feat(tts): add pocket-tts backend (#8018)
* feat(pocket-tts): add new backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add to the gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-13 23:35:19 +01:00
Richard Palethorpe
98f28bf583 chore(docs): Add Crush and VoxInput to the integrations (#7924)
* chore(docs): Add Crush and VoxInput to the integrations

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Richard Palethorpe <io@richiejp.com>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-01-08 21:39:25 +01:00
Richard Palethorpe
e6ba26c3e7 chore: Update to Ubuntu24.04 (cont #7423) (#7769)
* ci(workflows): bump GitHub Actions images to Ubuntu 24.04

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): remove CUDA 11.x support from GitHub Actions (incompatible with ubuntu:24.04)

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): bump GitHub Actions CUDA support to 12.9

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): bump base image to ubuntu:24.04 and adjust Vulkan SDK/packages

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix(backend): correct context paths for Python backends in workflows, Makefile and Dockerfile

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): disable parallel backend builds to avoid race conditions

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): export CUDA_MAJOR_VERSION and CUDA_MINOR_VERSION for override

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(backend): update backend Dockerfiles to Ubuntu 24.04

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(backend): add ROCm env vars and default AMDGPU_TARGETS for hipBLAS builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(chatterbox): bump ROCm PyTorch to 2.9.1+rocm6.4 and update index URL; align hipblas requirements

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore: add local-ai-launcher to .gitignore

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): fix backends GitHub Actions workflows after rebase

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): use build-time UBUNTU_VERSION variable

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(docker): remove libquadmath0 from requirements-stage base image

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(make): add backends/vllm to .NOTPARALLEL to prevent parallel builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix(docker): correct CUDA installation steps in backend Dockerfiles

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* chore(backend): update ROCm to 6.4 and align Python hipblas requirements

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for CUDA on arm64 builds

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(docker): update base image and backend Dockerfiles for Ubuntu 24.04 compatibility on arm64

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* build(backend): increase timeout for uv installs behind slow networks on backend/Dockerfile.python

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): switch GitHub Actions runners to Ubuntu-24.04 for vibevoice backend

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* ci(workflows): fix failing GitHub Actions runners

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>

* fix: Allow FROM_SOURCE to be unset, use upstream Intel images etc.

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): rm all traces of CUDA 11

Signed-off-by: Richard Palethorpe <io@richiejp.com>

* chore(build): Add Ubuntu codename as an argument

Signed-off-by: Richard Palethorpe <io@richiejp.com>

---------

Signed-off-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Co-authored-by: Alessandro Sturniolo <alessandro.sturniolo@gmail.com>
2026-01-06 15:26:42 +01:00
Ettore Di Giacinto
d38811560c chore(docs): add opencode, GHA, and realtime voice assistant examples
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-03 22:03:43 +01:00
Ettore Di Giacinto
c844b7ac58 feat: disable force eviction (#7725)
* feat: allow to set forcing backends eviction while requests are in flight

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: try to make the request sit and retry if eviction couldn't be done

Otherwise calls that in order to pass would need to shutdown other
backends would just fail.

In this way instead we make the request sit and retry eviction until it
succeeds. The thresholds can be configured by the user.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* expose settings to CLI

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-25 14:26:18 +01:00
Ettore Di Giacinto
bf2f95c684 chore(docs): update docs with cuda 13 instructions and the new vibevoice backend
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-25 10:00:07 +01:00
Mikhail Khludnev
53b0530275 docs: Add langchain-localai integration package to documentation (#7677)
Add `langchain-localai` integration package to documentation

Signed-off-by: Mikhail Khludnev <mkhludnev@users.noreply.github.com>
2025-12-21 21:02:14 +01:00
Ettore Di Giacinto
2387b266d8 chore(llama.cpp): Add Missing llama.cpp Options to gRPC Server (#7584)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-15 21:55:20 +01:00
Ettore Di Giacinto
fc5b9ebfcc feat(loader): enhance single active backend to support LRU eviction (#7535)
* feat(loader): refactor single active backend support to LRU

This changeset introduces LRU management of loaded backends. Users can
set now a maximum number of models to be loaded concurrently, and, when
setting LocalAI in single active backend mode we set LRU to 1 for
backward compatibility.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: add tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Update docs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-12-12 12:28:38 +01:00
Ettore Di Giacinto
00a05208bc chore(docs): center video
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-08 16:59:11 +01:00
Ettore Di Giacinto
a27d0d151f Embed YouTube video in documentation
Added an embedded YouTube video to the documentation.

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2025-12-08 16:53:20 +01:00
Igor B. Poretsky
ab022172a9 chore: switch from /usr/share to /var/lib for data storage (#7361)
* More appropriate place for data storing

The /usr/share subtree in Linux is used for data that generally are not
supposed to change. Conventional places for changeable data are usually
located under /var, so /var/lib seems to be a reasonable default here.

* Data paths consistency fix

* Directory name consistency fix
2025-11-27 09:18:28 +01:00
Ettore Di Giacinto
dd2828241c chore(docs): add documentation about import (#7315)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-11-20 23:07:36 +01:00