diff --git a/README.md b/README.md index dcc2da167..a93f160f5 100644 --- a/README.md +++ b/README.md @@ -5,35 +5,17 @@

- -LocalAI forks - LocalAI stars - -LocalAI pull-requests - -

- -

LocalAI License

-

- -LocalAI Docker hub - - -LocalAI Quay.io - -

-

Follow LocalAI_API @@ -47,22 +29,20 @@ mudler%2FLocalAI | Trendshift

-> :bulb: Get help - [❓FAQ](https://localai.io/faq/) [πŸ’­Discussions](https://github.com/go-skynet/LocalAI/discussions) [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) [:book: Documentation website](https://localai.io/) -> -> [πŸ’» Quickstart](https://localai.io/basics/getting_started/) [πŸ–ΌοΈ Models](https://models.localai.io/) [πŸš€ Roadmap](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap) [πŸ›« Examples](https://github.com/mudler/LocalAI-examples) Try on -[![Telegram](https://img.shields.io/badge/Telegram-2CA5E0?style=for-the-badge&logo=telegram&logoColor=white)](https://t.me/localaiofficial_bot) +**LocalAI** is the open-source AI engine. Run any model β€” LLMs, vision, voice, image, video β€” on any hardware. No GPU required. -[![tests](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/test.yml)[![Build and Release](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/release.yaml)[![build container images](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/image.yml)[![Bump dependencies](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml/badge.svg)](https://github.com/go-skynet/LocalAI/actions/workflows/bump_deps.yaml)[![Artifact Hub](https://img.shields.io/endpoint?url=https://artifacthub.io/badge/repository/localai)](https://artifacthub.io/packages/search?repo=localai) +- **Drop-in API compatibility** β€” OpenAI, Anthropic, ElevenLabs APIs +- **35+ backends** β€” llama.cpp, vLLM, transformers, whisper, diffusers, MLX... 
+- **Any hardware** β€” NVIDIA, AMD, Intel, Apple Silicon, Vulkan, or CPU-only +- **Multi-user ready** β€” API key auth, user quotas, role-based access +- **Built-in AI agents** β€” autonomous agents with tool use, RAG, MCP, and skills +- **Privacy-first** β€” your data never leaves your infrastructure -

- -LocalAI Examples Repository - -

+Created and maintained by [Ettore Di Giacinto](https://github.com/mudler). -**LocalAI** is the free, Open Source OpenAI alternative. LocalAI act as a drop-in replacement REST API that's compatible with OpenAI (Elevenlabs, Anthropic... ) API specifications for local AI inferencing. It allows you to run LLMs, generate images, audio (and not only) locally or on-prem with consumer grade hardware, supporting multiple model families. Does not require GPU. It is created and maintained by [Ettore Di Giacinto](https://github.com/mudler). +> [:book: Documentation](https://localai.io/) | [:speech_balloon: Discord](https://discord.gg/uJAeKSAGDy) | [πŸ’» Quickstart](https://localai.io/basics/getting_started/) | [πŸ–ΌοΈ Models](https://models.localai.io/) | [❓FAQ](https://localai.io/faq/) -## Screenshots / Video +## Screenshots ### Chat, Model gallery @@ -72,282 +52,137 @@ https://github.com/user-attachments/assets/08cbb692-57da-48f7-963d-2e7b43883c18 https://github.com/user-attachments/assets/6270b331-e21d-4087-a540-6290006b381a -### Youtube video +## Quickstart -

-
-
-
-

- -## πŸ’» Quickstart - -### macOS Download: +### macOS Download LocalAI for macOS -> Note: the DMGs are not signed by Apple as quarantined. See https://github.com/mudler/LocalAI/issues/6268 for a workaround, fix is tracked here: https://github.com/mudler/LocalAI/issues/6244 -> Install the DMG and paste this code into terminal: `sudo xattr -d com.apple.quarantine /Applications/LocalAI.app` +> **Note:** The DMG is not signed by Apple. After installing, run: `sudo xattr -d com.apple.quarantine /Applications/LocalAI.app`. See [#6268](https://github.com/mudler/LocalAI/issues/6268) for details. ### Containers (Docker, podman, ...) -> **πŸ’‘ Docker Run vs Docker Start** -> -> - `docker run` creates and starts a new container. If a container with the same name already exists, this command will fail. -> - `docker start` starts an existing container that was previously created with `docker run`. -> -> If you've already run LocalAI before and want to start it again, use: `docker start -i local-ai` +> Already ran LocalAI before? Use `docker start -i local-ai` to restart an existing container. 
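The single `docker run` commands in this section can also be expressed as a Docker Compose file for unattended setups; a minimal sketch, assuming the CPU image and a local `./models` directory (the `/models` mount path is an assumption, check the documentation for your image):

```yaml
services:
  local-ai:
    image: localai/localai:latest
    container_name: local-ai
    ports:
      - "8080:8080"
    volumes:
      # Persist downloaded models across container restarts.
      # (Mount path is an assumption; adjust for your image.)
      - ./models:/models
```

Start it with `docker compose up -d`; the per-accelerator image tags in this section can be substituted for `latest`.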
-#### CPU only image: +#### CPU only: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest ``` -#### NVIDIA GPU Images: +#### NVIDIA GPU: ```bash -# CUDA 13.0 +# CUDA 13 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-13 -# CUDA 12.0 +# CUDA 12 docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-gpu-nvidia-cuda-12 -# NVIDIA Jetson (L4T) ARM64 -# CUDA 12 (for Nvidia AGX Orin and similar platforms) +# NVIDIA Jetson ARM64 (CUDA 12, for AGX Orin and similar) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64 -# CUDA 13 (for Nvidia DGX Spark) +# NVIDIA Jetson ARM64 (CUDA 13, for DGX Spark) docker run -ti --name local-ai -p 8080:8080 --gpus all localai/localai:latest-nvidia-l4t-arm64-cuda-13 ``` -#### AMD GPU Images (ROCm): +#### AMD GPU (ROCm): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/kfd --device=/dev/dri --group-add=video localai/localai:latest-gpu-hipblas ``` -#### Intel GPU Images (oneAPI): +#### Intel GPU (oneAPI): ```bash docker run -ti --name local-ai -p 8080:8080 --device=/dev/dri/card1 --device=/dev/dri/renderD128 localai/localai:latest-gpu-intel ``` -#### Vulkan GPU Images: +#### Vulkan GPU: ```bash docker run -ti --name local-ai -p 8080:8080 localai/localai:latest-gpu-vulkan ``` -To load models: +### Loading models ```bash -# From the model gallery (see available models with `local-ai models list`, in the WebUI from the model tab, or visiting https://models.localai.io) +# From the model gallery (see available models with `local-ai models list` or at https://models.localai.io) local-ai run llama-3.2-1b-instruct:q4_k_m -# Start LocalAI with the phi-2 model directly from huggingface +# From Huggingface local-ai run huggingface://TheBloke/phi-2-GGUF/phi-2.Q8_0.gguf -# Install and run a model from the Ollama OCI registry +# From the Ollama OCI registry local-ai run ollama://gemma:2b -# Run a model from a 
configuration file +# From a YAML config local-ai run https://gist.githubusercontent.com/.../phi-2.yaml -# Install and run a model from a standard OCI registry (e.g., Docker Hub) +# From a standard OCI registry (e.g., Docker Hub) local-ai run oci://localai/phi-2:latest ``` -> ⚑ **Automatic Backend Detection**: When you install models from the gallery or YAML files, LocalAI automatically detects your system's GPU capabilities (NVIDIA, AMD, Intel) and downloads the appropriate backend. For advanced configuration options, see [GPU Acceleration](https://localai.io/features/gpu-acceleration/#automatic-backend-detection). +> **Automatic Backend Detection**: LocalAI automatically detects your GPU capabilities and downloads the appropriate backend. For advanced options, see [GPU Acceleration](https://localai.io/features/gpu-acceleration/). -For more information, see [πŸ’» Getting started](https://localai.io/basics/getting_started/index.html), if you are interested in our roadmap items and future enhancements, you can see the [Issues labeled as Roadmap here](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap) +For more details, see the [Getting Started guide](https://localai.io/basics/getting_started/). 
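Once a model is loaded, it can be queried through the OpenAI-compatible REST API; a minimal sketch, assuming LocalAI is listening on the default port 8080 and the gallery model installed above:

```bash
# Chat completion against LocalAI's OpenAI-compatible endpoint.
# Model name matches the gallery model installed above.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.2-1b-instruct",
    "messages": [{"role": "user", "content": "Hello! How are you?"}],
    "temperature": 0.7
  }'
```

Because the API follows the OpenAI specification, existing OpenAI SDKs work as-is by pointing their base URL at `http://localhost:8080/v1`.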
-## πŸ“° Latest project news -- March 2026: [Agent management](https://github.com/mudler/LocalAI/pull/8820), [New React UI](https://github.com/mudler/LocalAI/pull/8772), [WebRTC](https://github.com/mudler/LocalAI/pull/8790),[MLX-distributed via P2P and RDMA](https://github.com/mudler/LocalAI/pull/8801), [MCP Apps, MCP Client-side](https://github.com/mudler/LocalAI/pull/8947) -- February 2026: [Realtime API for audio-to-audio with tool calling](https://github.com/mudler/LocalAI/pull/6245), [ACE-Step 1.5 support](https://github.com/mudler/LocalAI/pull/8396) -- January 2026: **LocalAI 3.10.0** - Major release with Anthropic API support, Open Responses API for stateful agents, video & image generation suite (LTX-2), unified GPU backends, tool streaming & XML parsing, system-aware backend gallery, crash fixes for AVX-only CPUs and AMD VRAM reporting, request tracing, and new backends: **Moonshine** (ultra-fast transcription), **Pocket-TTS** (lightweight TTS). Vulkan arm64 builds now available. [Release notes](https://github.com/mudler/LocalAI/releases/tag/v3.10.0). -- December 2025: [Dynamic Memory Resource reclaimer](https://github.com/mudler/LocalAI/pull/7583), [Automatic fitting of models to multiple GPUS(llama.cpp)](https://github.com/mudler/LocalAI/pull/7584), [Added Vibevoice backend](https://github.com/mudler/LocalAI/pull/7494) -- November 2025: Major improvements to the UX. Among these: [Import models via URL](https://github.com/mudler/LocalAI/pull/7245) and [Multiple chats and history](https://github.com/mudler/LocalAI/pull/7325) -- October 2025: πŸ”Œ [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) support added for agentic capabilities with external tools -- September 2025: New Launcher application for MacOS and Linux, extended support to many backends for Mac and Nvidia L4T devices. Models: Added MLX-Audio, WAN 2.2. WebUI improvements and Python-based backends now ships portable python environments. 
-- August 2025: MLX, MLX-VLM, Diffusers and llama.cpp are now supported on Mac M1/M2/M3+ chips ( with `development` suffix in the gallery ): https://github.com/mudler/LocalAI/pull/6049 https://github.com/mudler/LocalAI/pull/6119 https://github.com/mudler/LocalAI/pull/6121 https://github.com/mudler/LocalAI/pull/6060 -- July/August 2025: πŸ” [Object Detection](https://localai.io/features/object-detection/) added to the API featuring [rf-detr](https://github.com/roboflow/rf-detr) -- July 2025: All backends migrated outside of the main binary. LocalAI is now more lightweight, small, and automatically downloads the required backend to run the model. [Read the release notes](https://github.com/mudler/LocalAI/releases/tag/v3.2.0) -- June 2025: [Backend management](https://github.com/mudler/LocalAI/pull/5607) has been added. Attention: extras images are going to be deprecated from the next release! Read [the backend management PR](https://github.com/mudler/LocalAI/pull/5607). -- May 2025: [Audio input](https://github.com/mudler/LocalAI/pull/5466) and [Reranking](https://github.com/mudler/LocalAI/pull/5396) in llama.cpp backend, [Realtime API](https://github.com/mudler/LocalAI/pull/5392), Support to Gemma, SmollVLM, and more multimodal models (available in the gallery). -- May 2025: Important: image name changes [See release](https://github.com/mudler/LocalAI/releases/tag/v2.29.0) -- Apr 2025: Rebrand, WebUI enhancements -- Apr 2025: [LocalAGI](https://github.com/mudler/LocalAGI) and [LocalRecall](https://github.com/mudler/LocalRecall) join the LocalAI family stack. 
-- Apr 2025: WebUI overhaul -- Feb 2025: Backend cleanup, Breaking changes, new backends (kokoro, OutelTTS, faster-whisper), Nvidia L4T images -- Jan 2025: LocalAI model release: https://huggingface.co/mudler/LocalAI-functioncall-phi-4-v0.3, SANA support in diffusers: https://github.com/mudler/LocalAI/pull/4603 -- Dec 2024: stablediffusion.cpp backend (ggml) added ( https://github.com/mudler/LocalAI/pull/4289 ) -- Nov 2024: Bark.cpp backend added ( https://github.com/mudler/LocalAI/pull/4287 ) -- Nov 2024: Voice activity detection models (**VAD**) added to the API: https://github.com/mudler/LocalAI/pull/4204 -- Oct 2024: examples moved to [LocalAI-examples](https://github.com/mudler/LocalAI-examples) -- Aug 2024: πŸ†• FLUX-1, [P2P Explorer](https://explorer.localai.io) -- July 2024: πŸ”₯πŸ”₯ πŸ†• P2P Dashboard, LocalAI Federated mode and AI Swarms: https://github.com/mudler/LocalAI/pull/2723. P2P Global community pools: https://github.com/mudler/LocalAI/issues/3113 -- May 2024: πŸ”₯πŸ”₯ Decentralized P2P llama.cpp: https://github.com/mudler/LocalAI/pull/2343 (peer2peer llama.cpp!) 
πŸ‘‰ Docs https://localai.io/features/distribute/ -- May 2024: πŸ”₯πŸ”₯ Distributed inferencing: https://github.com/mudler/LocalAI/pull/2324 -- April 2024: Reranker API: https://github.com/mudler/LocalAI/pull/2121 +## Latest News -Roadmap items: [List of issues](https://github.com/mudler/LocalAI/issues?q=is%3Aissue+is%3Aopen+label%3Aroadmap) +- **March 2026**: [Agent management](https://github.com/mudler/LocalAI/pull/8820), [New React UI](https://github.com/mudler/LocalAI/pull/8772), [WebRTC](https://github.com/mudler/LocalAI/pull/8790), [MLX-distributed via P2P and RDMA](https://github.com/mudler/LocalAI/pull/8801), [MCP Apps, MCP Client-side](https://github.com/mudler/LocalAI/pull/8947) +- **February 2026**: [Realtime API for audio-to-audio with tool calling](https://github.com/mudler/LocalAI/pull/6245), [ACE-Step 1.5 support](https://github.com/mudler/LocalAI/pull/8396) +- **January 2026**: **LocalAI 3.10.0** β€” Anthropic API support, Open Responses API, video & image generation (LTX-2), unified GPU backends, tool streaming, Moonshine, Pocket-TTS. 
[Release notes](https://github.com/mudler/LocalAI/releases/tag/v3.10.0) +- **December 2025**: [Dynamic Memory Resource reclaimer](https://github.com/mudler/LocalAI/pull/7583), [Automatic multi-GPU model fitting (llama.cpp)](https://github.com/mudler/LocalAI/pull/7584), [Vibevoice backend](https://github.com/mudler/LocalAI/pull/7494) +- **November 2025**: [Import models via URL](https://github.com/mudler/LocalAI/pull/7245), [Multiple chats and history](https://github.com/mudler/LocalAI/pull/7325) +- **October 2025**: [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) support for agentic capabilities +- **September 2025**: New Launcher for macOS and Linux, extended backend support for Mac and Nvidia L4T, MLX-Audio, WAN 2.2 +- **August 2025**: MLX, MLX-VLM, Diffusers, llama.cpp now supported on Apple Silicon +- **July 2025**: All backends migrated outside the main binary β€” [lightweight, modular architecture](https://github.com/mudler/LocalAI/releases/tag/v3.2.0) -## πŸš€ [Features](https://localai.io/features/) +For older news and full release notes, see [GitHub Releases](https://github.com/mudler/LocalAI/releases) and the [News page](https://localai.io/basics/news/). -- 🧩 [Backend Gallery](https://localai.io/backends/): Install/remove backends on the fly, powered by OCI images β€” fully customizable and API-driven. -- πŸ“– [Text generation with GPTs](https://localai.io/features/text-generation/) (`llama.cpp`, `transformers`, `vllm` ... 
[:book: and more](https://localai.io/model-compatibility/index.html#model-compatibility-table)) -- πŸ—£ [Text to Audio](https://localai.io/features/text-to-audio/) -- πŸ”ˆ [Audio to Text](https://localai.io/features/audio-to-text/) -- 🎨 [Image generation](https://localai.io/features/image-generation) -- πŸ”₯ [OpenAI-alike tools API](https://localai.io/features/openai-functions/) -- ⚑ [Realtime API](https://localai.io/features/openai-realtime/) (Speech-to-speech) -- 🧠 [Embeddings generation for vector databases](https://localai.io/features/embeddings/) -- ✍️ [Constrained grammars](https://localai.io/features/constrained_grammars/) -- πŸ–ΌοΈ [Download Models directly from Huggingface ](https://localai.io/models/) -- πŸ₯½ [Vision API](https://localai.io/features/gpt-vision/) -- πŸ” [Object Detection](https://localai.io/features/object-detection/) -- πŸ“ˆ [Reranker API](https://localai.io/features/reranker/) -- πŸ†•πŸ–§ [P2P Inferencing](https://localai.io/features/distribute/) -- πŸ†•πŸ”Œ [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) - Agentic capabilities with external tools and [LocalAGI's Agentic capabilities](https://github.com/mudler/LocalAGI) -- πŸ†•πŸ€– [Built-in Agents](https://localai.io/features/agents/) - Autonomous AI agents with tool use, knowledge base (RAG), skills, SSE streaming, import/export, and [Agent Hub](https://agenthub.localai.io) β€” powered by [LocalAGI](https://github.com/mudler/LocalAGI) -- πŸ”Š Voice activity detection (Silero-VAD support) -- 🌍 Integrated WebUI! +## Features -## 🧩 Supported Backends & Acceleration +- [Text generation](https://localai.io/features/text-generation/) (`llama.cpp`, `transformers`, `vllm` ... 
[and more](https://localai.io/model-compatibility/)) +- [Text to Audio](https://localai.io/features/text-to-audio/) +- [Audio to Text](https://localai.io/features/audio-to-text/) +- [Image generation](https://localai.io/features/image-generation) +- [OpenAI-compatible tools API](https://localai.io/features/openai-functions/) +- [Realtime API](https://localai.io/features/openai-realtime/) (Speech-to-speech) +- [Embeddings generation](https://localai.io/features/embeddings/) +- [Constrained grammars](https://localai.io/features/constrained_grammars/) +- [Download models from Huggingface](https://localai.io/models/) +- [Vision API](https://localai.io/features/gpt-vision/) +- [Object Detection](https://localai.io/features/object-detection/) +- [Reranker API](https://localai.io/features/reranker/) +- [P2P Inferencing](https://localai.io/features/distribute/) +- [Model Context Protocol (MCP)](https://localai.io/docs/features/mcp/) +- [Built-in Agents](https://localai.io/features/agents/) β€” Autonomous AI agents with tool use, RAG, skills, SSE streaming, and [Agent Hub](https://agenthub.localai.io) +- [Backend Gallery](https://localai.io/backends/) β€” Install/remove backends on the fly via OCI images +- Voice Activity Detection (Silero-VAD) +- Integrated WebUI -LocalAI supports a comprehensive range of AI backends with multiple acceleration options: +## Supported Backends & Acceleration -### Text Generation & Language Models -| Backend | Description | Acceleration Support | -|---------|-------------|---------------------| -| **llama.cpp** | LLM inference in C/C++ | CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, CPU | -| **vLLM** | Fast LLM inference with PagedAttention | CUDA 12/13, ROCm, Intel | -| **transformers** | HuggingFace transformers framework | CUDA 12/13, ROCm, Intel, CPU | -| **MLX** | Apple Silicon LLM inference | Metal (M1/M2/M3+) | -| **MLX-VLM** | Apple Silicon Vision-Language Models | Metal (M1/M2/M3+) | -| **vLLM Omni** | Multimodal vLLM with vision and 
audio | CUDA 12/13, ROCm, Intel | +LocalAI supports **35+ backends** including llama.cpp, vLLM, transformers, whisper.cpp, diffusers, MLX, MLX-VLM, and many more. Hardware acceleration is available for **NVIDIA** (CUDA 12/13), **AMD** (ROCm), **Intel** (oneAPI/SYCL), **Apple Silicon** (Metal), **Vulkan**, and **NVIDIA Jetson** (L4T). All backends can be installed on-the-fly from the [Backend Gallery](https://localai.io/backends/). -### Audio & Speech Processing -| Backend | Description | Acceleration Support | -|---------|-------------|---------------------| -| **whisper.cpp** | OpenAI Whisper in C/C++ | CUDA 12/13, ROCm, Intel SYCL, Vulkan, CPU | -| **faster-whisper** | Fast Whisper with CTranslate2 | CUDA 12/13, ROCm, Intel, CPU | -| **moonshine** | Ultra-fast transcription engine for low-end devices | CUDA 12/13, Metal, CPU | -| **coqui** | Advanced TTS with 1100+ languages | CUDA 12/13, ROCm, Intel, CPU | -| **kokoro** | Lightweight TTS model | CUDA 12/13, ROCm, Intel, CPU | -| **chatterbox** | Production-grade TTS | CUDA 12/13, CPU | -| **piper** | Fast neural TTS system | CPU | -| **kitten-tts** | Kitten TTS models | CPU | -| **silero-vad** | Voice Activity Detection | CPU | -| **neutts** | Text-to-speech with voice cloning | CUDA 12/13, ROCm, CPU | -| **vibevoice** | Real-time TTS with voice cloning | CUDA 12/13, ROCm, Intel, CPU | -| **pocket-tts** | Lightweight CPU-based TTS | CUDA 12/13, ROCm, Intel, CPU | -| **qwen-tts** | High-quality TTS with custom voice, voice design, and voice cloning | CUDA 12/13, ROCm, Intel, CPU | -| **nemo** | NVIDIA NeMo framework for speech models | CUDA 12/13, ROCm, Intel, CPU | -| **outetts** | OuteTTS with voice cloning | CUDA 12/13, CPU | -| **faster-qwen3-tts** | Faster Qwen3 TTS | CUDA 12/13, ROCm, Intel, CPU | -| **qwen-asr** | Qwen ASR speech recognition | CUDA 12/13, ROCm, Intel, CPU | -| **voxcpm** | VoxCPM speech understanding | CUDA 12/13, Metal, CPU | -| **whisperx** | Enhanced Whisper transcription | CUDA 12/13, 
ROCm, Intel, CPU | -| **ace-step** | Music generation from text descriptions, lyrics, or audio samples | CUDA 12/13, ROCm, Intel, Metal, CPU | +See the full [Backend & Model Compatibility Table](https://localai.io/model-compatibility/) and [GPU Acceleration guide](https://localai.io/features/gpu-acceleration/). -### Image & Video Generation -| Backend | Description | Acceleration Support | -|---------|-------------|---------------------| -| **stablediffusion.cpp** | Stable Diffusion in C/C++ | CUDA 12/13, Intel SYCL, Vulkan, CPU | -| **diffusers** | HuggingFace diffusion models | CUDA 12/13, ROCm, Intel, Metal, CPU | +## Resources -### Specialized AI Tasks -| Backend | Description | Acceleration Support | -|---------|-------------|---------------------| -| **rfdetr** | Real-time object detection | CUDA 12/13, Intel, CPU | -| **rerankers** | Document reranking API | CUDA 12/13, ROCm, Intel, CPU | -| **local-store** | Vector database | CPU | -| **huggingface** | HuggingFace API integration | API-based | +- [Documentation](https://localai.io/) +- [LLM fine-tuning guide](https://localai.io/docs/advanced/fine-tuning/) +- [Build from source](https://localai.io/basics/build/) +- [Kubernetes installation](https://localai.io/basics/getting_started/#run-localai-in-kubernetes) +- [Integrations & community projects](https://localai.io/docs/integrations/) +- [Media & blog posts](https://localai.io/basics/news/#media-blogs-social) +- [Examples](https://github.com/mudler/LocalAI-examples) -### Hardware Acceleration Matrix +## Autonomous Development Team -| Acceleration Type | Supported Backends | Hardware Support | -|-------------------|-------------------|------------------| -| **NVIDIA CUDA 12** | All CUDA-compatible backends | Nvidia hardware | -| **NVIDIA CUDA 13** | All CUDA-compatible backends | Nvidia hardware | -| **AMD ROCm** | llama.cpp, whisper, vllm, transformers, diffusers, rerankers, coqui, kokoro, neutts, vibevoice, pocket-tts, qwen-tts, ace-step | AMD Graphics | 
-| **Intel oneAPI** | llama.cpp, whisper, stablediffusion, vllm, transformers, diffusers, rfdetr, rerankers, coqui, kokoro, vibevoice, pocket-tts, qwen-tts, ace-step | Intel Arc, Intel iGPUs | -| **Apple Metal** | llama.cpp, whisper, diffusers, MLX, MLX-VLM, moonshine, ace-step | Apple M1/M2/M3+ | -| **Vulkan** | llama.cpp, whisper, stablediffusion | Cross-platform GPUs | -| **NVIDIA Jetson (CUDA 12)** | llama.cpp, whisper, stablediffusion, diffusers, rfdetr, ace-step | ARM64 embedded AI (AGX Orin, etc.) | -| **NVIDIA Jetson (CUDA 13)** | llama.cpp, whisper, stablediffusion, diffusers, rfdetr | ARM64 embedded AI (DGX Spark) | -| **CPU Optimized** | All backends | AVX/AVX2/AVX512, quantization support | +LocalAI is maintained with the help of a team of autonomous AI agents led by an AI Scrum Master. -### πŸ”— Community and integrations - -Build and deploy custom containers: -- https://github.com/sozercan/aikit - -WebUIs: -- https://github.com/Jirubizu/localai-admin -- https://github.com/go-skynet/LocalAI-frontend -- QA-Pilot(An interactive chat project that leverages LocalAI LLMs for rapid understanding and navigation of GitHub code repository) https://github.com/reid41/QA-Pilot - -Agentic Libraries: -- https://github.com/mudler/cogito - -MCPs: -- https://github.com/mudler/MCPs - -OS Assistant: - -- https://github.com/mudler/Keygeist - Keygeist is an AI-powered keyboard operator that listens for key combinations and responds with AI-generated text typed directly into your Linux box. 
- -Model galleries -- https://github.com/go-skynet/model-gallery - -Voice: -- https://github.com/richiejp/VoxInput - -Other: -- Helm chart https://github.com/go-skynet/helm-charts -- VSCode extension https://github.com/badgooooor/localai-vscode-plugin -- Langchain: https://python.langchain.com/docs/integrations/providers/localai/ -- Terminal utility https://github.com/djcopley/ShellOracle -- Local Smart assistant https://github.com/mudler/LocalAGI -- Home Assistant https://github.com/drndos/hass-openai-custom-conversation / https://github.com/valentinfrlch/ha-llmvision / https://github.com/loryanstrant/HA-LocalAI-Monitor -- Discord bot https://github.com/mudler/LocalAGI/tree/main/examples/discord -- Slack bot https://github.com/mudler/LocalAGI/tree/main/examples/slack -- Shell-Pilot(Interact with LLM using LocalAI models via pure shell scripts on your Linux or MacOS system) https://github.com/reid41/shell-pilot -- Telegram bot https://github.com/mudler/LocalAI/tree/master/examples/telegram-bot -- Another Telegram Bot https://github.com/JackBekket/Hellper -- Auto-documentation https://github.com/JackBekket/Reflexia -- Github bot which answer on issues, with code and documentation as context https://github.com/JackBekket/GitHelper -- Github Actions: https://github.com/marketplace/actions/start-localai -- Examples: https://github.com/mudler/LocalAI/tree/master/examples/ - - -### πŸ”— Resources - -- [LLM finetuning guide](https://localai.io/docs/advanced/fine-tuning/) -- [How to build locally](https://localai.io/basics/build/index.html) -- [How to install in Kubernetes](https://localai.io/basics/getting_started/index.html#run-localai-in-kubernetes) -- [Projects integrating LocalAI](https://localai.io/docs/integrations/) - -## :book: πŸŽ₯ [Media, Blogs, Social](https://localai.io/basics/news/#media-blogs-social) - -- πŸ†• [LocalAI Autonomous Dev Team Blog 
Post](https://mudler.pm/posts/2026/02/28/a-call-to-open-source-maintainers-stop-babysitting-ai-how-i-built-a-100-local-autonomous-dev-team-to-maintain-localai-and-why-you-should-too/) -- [Run Visual studio code with LocalAI (SUSE)](https://www.suse.com/c/running-ai-locally/) -- πŸ†• [Run LocalAI on Jetson Nano Devkit](https://mudler.pm/posts/local-ai-jetson-nano-devkit/) -- [Run LocalAI on AWS EKS with Pulumi](https://www.pulumi.com/blog/low-code-llm-apps-with-local-ai-flowise-and-pulumi/) -- [Run LocalAI on AWS](https://staleks.hashnode.dev/installing-localai-on-aws-ec2-instance) -- [Create a slackbot for teams and OSS projects that answer to documentation](https://mudler.pm/posts/smart-slackbot-for-teams/) -- [LocalAI meets k8sgpt](https://www.youtube.com/watch?v=PKrDNuJ_dfE) -- [Question Answering on Documents locally with LangChain, LocalAI, Chroma, and GPT4All](https://mudler.pm/posts/localai-question-answering/) -- [Tutorial to use k8sgpt with LocalAI](https://medium.com/@tyler_97636/k8sgpt-localai-unlock-kubernetes-superpowers-for-free-584790de9b65) - -## πŸ€– Autonomous Development Team - -LocalAI is now helped being maintained (for small tasks!) by a full team of autonomous AI agents led by an AI Scrum Master! This experiment demonstrates how open source projects can leverage AI agents for sustainable, long-term maintenance. 
- -- **πŸ“Š Live Reports**: [Automatically generated reports](http://reports.localai.io) -- **πŸ“‹ Project Board**: [Agent task tracking](https://github.com/users/mudler/projects/6) -- **πŸ“ Blog Post**: [Learn about the autonomous dev team experiment](https://mudler.pm/posts/2026/02/28/a-call-to-open-source-maintainers-stop-babysitting-ai-how-i-built-a-100-local-autonomous-dev-team-to-maintain-localai-and-why-you-should-too/) +- **Live Reports**: [reports.localai.io](http://reports.localai.io) +- **Project Board**: [Agent task tracking](https://github.com/users/mudler/projects/6) +- **Blog Post**: [Learn about the experiment](https://mudler.pm/posts/2026/02/28/a-call-to-open-source-maintainers-stop-babysitting-ai-how-i-built-a-100-local-autonomous-dev-team-to-maintain-localai-and-why-you-should-too/) ## Citation @@ -363,7 +198,7 @@ If you utilize this repository, data in a downstream project, please consider ci howpublished = {\url{https://github.com/go-skynet/LocalAI}}, ``` -## ❀️ Sponsors +## Sponsors > Do you find LocalAI useful? @@ -382,19 +217,19 @@ A huge thank you to our generous sponsors who support this project covering CI e ### Individual sponsors -A special thanks to individual sponsors that contributed to the project, a full list is in [Github](https://github.com/sponsors/mudler) and [buymeacoffee](https://buymeacoffee.com/mudler), a special shout out goes to [drikster80](https://github.com/drikster80) for being generous. Thank you everyone! +A special thanks to our individual sponsors; the full list is on [GitHub](https://github.com/sponsors/mudler) and [buymeacoffee](https://buymeacoffee.com/mudler). A special shout out to [drikster80](https://github.com/drikster80) for being generous. Thank you everyone! 
-## 🌟 Star history +## Star history [![LocalAI Star history Chart](https://api.star-history.com/svg?repos=go-skynet/LocalAI&type=Date)](https://star-history.com/#go-skynet/LocalAI&Date) -## πŸ“– License +## License LocalAI is a community-driven project created by [Ettore Di Giacinto](https://github.com/mudler/). MIT - Author Ettore Di Giacinto -## πŸ™‡ Acknowledgements +## Acknowledgements LocalAI couldn't have been built without the help of great software already available from the community. Thank you! @@ -407,9 +242,9 @@ LocalAI couldn't have been built without the help of great software already avai - https://github.com/rhasspy/piper - [exo](https://github.com/exo-explore/exo) for the MLX distributed auto-parallel sharding implementation -## πŸ€— Contributors +## Contributors -This is a community project, a special thanks to our contributors! πŸ€— +This is a community project; a special thanks to our contributors! diff --git a/docs/content/integrations.md b/docs/content/integrations.md index 4851267de..0f147002a 100644 --- a/docs/content/integrations.md +++ b/docs/content/integrations.md @@ -6,35 +6,94 @@ icon = "sync" +++ -## Community integrations +## Community Integrations -List of projects that are using directly LocalAI behind the scenes can be found [here](https://github.com/mudler/LocalAI#-community-and-integrations). +The lists below cover software and community projects that integrate with LocalAI. -The list below is a list of software that integrates with LocalAI. +Feel free to open a pull request (by clicking "Edit page" below) to get your project added! 
+ +### Build & Deploy + +- [aikit](https://github.com/sozercan/aikit) β€” Build and deploy custom LocalAI containers +- [Helm chart](https://github.com/go-skynet/helm-charts) β€” Deploy LocalAI on Kubernetes +- [GitHub Actions](https://github.com/marketplace/actions/start-localai) β€” Use LocalAI in CI/CD workflows + +### Web UIs + +- [localai-admin](https://github.com/Jirubizu/localai-admin) +- [LocalAI-frontend](https://github.com/go-skynet/LocalAI-frontend) +- [QA-Pilot](https://github.com/reid41/QA-Pilot) β€” Interactive chat for navigating GitHub code repositories +- [Big AGI](https://github.com/enricoros/big-agi) β€” Powerful web interface running entirely in the browser + +### Agentic Libraries & Assistants + +- [cogito](https://github.com/mudler/cogito) β€” Agentic library for Go +- [LocalAGI](https://github.com/mudler/LocalAGI) β€” Local smart assistant with autonomous agents + +### MCP Servers + +- [MCPs](https://github.com/mudler/MCPs) β€” Model Context Protocol servers + +### OS Assistants + +- [Keygeist](https://github.com/mudler/Keygeist) β€” AI-powered keyboard operator for Linux + +### Voice + +- [VoxInput](https://github.com/richiejp/VoxInput) β€” Use voice to control your desktop + +### IDE & Editor Plugins + +- [VSCode extension](https://github.com/badgooooor/localai-vscode-plugin) +- [GPTLocalhost (Word Add-in)](https://gptlocalhost.com/demo#LocalAI) β€” Run LocalAI in Microsoft Word locally + +### Framework Integrations + +- [Langchain (Python)](https://python.langchain.com/docs/integrations/providers/localai/) β€” [pypi](https://pypi.org/project/langchain-localai/) +- [langchain4j](https://github.com/langchain4j/langchain4j) β€” Java LangChain +- [lingoose](https://github.com/henomis/lingoose) β€” Go framework for LLM apps +- [LLPhant](https://github.com/theodo-group/LLPhant) β€” PHP library for LLMs and vector databases +- [FlowiseAI](https://github.com/FlowiseAI/Flowise) β€” Low-code LLM app builder +- 
[LLMStack](https://github.com/trypromptly/LLMStack) +- [Midori AI Subsystem Manager](https://io.midori-ai.xyz/subsystem/manager/) + +### Terminal Tools + +- [ShellOracle](https://github.com/djcopley/ShellOracle) β€” Terminal utility +- [Shell-Pilot](https://github.com/reid41/shell-pilot) β€” Interact with LLMs via pure shell scripts +- [Mods](https://github.com/charmbracelet/mods) β€” AI on the command line + +### Chat Bots + +- [Discord bot](https://github.com/mudler/LocalAGI/tree/main/examples/discord) +- [Slack bot](https://github.com/mudler/LocalAGI/tree/main/examples/slack) +- [Telegram bot](https://github.com/mudler/LocalAI/tree/master/examples/telegram-bot) +- [Hellper (Telegram)](https://github.com/JackBekket/Hellper) + +### Home Automation + +- [hass-openai-custom-conversation](https://github.com/drndos/hass-openai-custom-conversation) β€” Home Assistant integration +- [ha-llmvision](https://github.com/valentinfrlch/ha-llmvision) β€” Home Assistant LLM Vision +- [HA-LocalAI-Monitor](https://github.com/loryanstrant/HA-LocalAI-Monitor) β€” Home Assistant monitoring +- Nextcloud [integration plugin](https://apps.nextcloud.com/apps/integration_openai) and [AI assistant](https://apps.nextcloud.com/apps/assistant) + +### Automation & DevOps + +- [Reflexia](https://github.com/JackBekket/Reflexia) β€” Auto-documentation +- [GitHelper](https://github.com/JackBekket/GitHelper) β€” GitHub bot for issues with code and documentation context +- [kairos](https://github.com/kairos-io/kairos) β€” Immutable Linux OS + +### Other Integrations - [AnythingLLM](https://github.com/Mintplex-Labs/anything-llm) -- [Logseq GPT3 OpenAI plugin](https://github.com/briansunter/logseq-plugin-gpt3-openai) allows to set a base URL, and works with LocalAI. -- https://plugins.jetbrains.com/plugin/21056-codegpt allows for custom OpenAI compatible endpoints since 2.4.0 -- [Wave Terminal](https://docs.waveterm.dev/features/supportedLLMs/localai) has native support for LocalAI! 
-- https://github.com/longy2k/obsidian-bmo-chatbot -- https://github.com/FlowiseAI/Flowise -- https://github.com/k8sgpt-ai/k8sgpt -- https://github.com/kairos-io/kairos -- https://github.com/langchain4j/langchain4j -- https://github.com/henomis/lingoose -- https://github.com/trypromptly/LLMStack -- https://github.com/mattermost/openops -- https://github.com/charmbracelet/mods -- https://github.com/cedriking/spark -- [Big AGI](https://github.com/enricoros/big-agi) is a powerful web interface entirely running in the browser, supporting LocalAI -- [Midori AI Subsystem Manager](https://io.midori-ai.xyz/subsystem/manager/) is a powerful docker subsystem for running all types of AI programs -- [LLPhant](https://github.com/theodo-group/LLPhant) is a PHP library for interacting with LLMs and Vector Databases -- [GPTLocalhost (Word Add-in)](https://gptlocalhost.com/demo#LocalAI) - run LocalAI in Microsoft Word locally -- use LocalAI from Nextcloud with the [integration plugin](https://apps.nextcloud.com/apps/integration_openai) and [AI assistant](https://apps.nextcloud.com/apps/assistant) -- [Langchain](https://docs.langchain.com/oss/python/integrations/providers/localai) integration package [pypi](https://pypi.org/project/langchain-localai/) -- [VoxInput](https://github.com/richiejp/VoxInput) - Use voice to control your desktop - -Feel free to open up a Pull request (by clicking at the "Edit page" below) to get a page for your project made or if you see a error on one of the pages! 
+- [Logseq GPT3 OpenAI plugin](https://github.com/briansunter/logseq-plugin-gpt3-openai) +- [CodeGPT (JetBrains)](https://plugins.jetbrains.com/plugin/21056-codegpt) β€” Custom OpenAI-compatible endpoints +- [Wave Terminal](https://docs.waveterm.dev/features/supportedLLMs/localai) β€” Native LocalAI support +- [Obsidian BMO Chatbot](https://github.com/longy2k/obsidian-bmo-chatbot) +- [spark](https://github.com/cedriking/spark) +- [openops (Mattermost)](https://github.com/mattermost/openops) +- [Model Gallery](https://github.com/go-skynet/model-gallery) +- [Examples](https://github.com/mudler/LocalAI/tree/master/examples/) ## Configuration Guides diff --git a/docs/content/reference/compatibility-table.md b/docs/content/reference/compatibility-table.md index fc3033aa9..80cf4e781 100644 --- a/docs/content/reference/compatibility-table.md +++ b/docs/content/reference/compatibility-table.md @@ -16,55 +16,72 @@ LocalAI will attempt to automatically load models which are not explicitly confi ## Text Generation & Language Models -| Backend and Bindings | Compatible models | Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration | -|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------| -| [llama.cpp]({{%relref "features/text-generation#llama.cpp" %}}) | LLama, Mamba, RWKV, Falcon, Starcoder, GPT-2, [and many others](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description) | yes | GPT and Functions | yes | yes | CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, CPU | -| [vLLM](https://github.com/vllm-project/vllm) | Various GPTs and quantization formats | yes | GPT | no | no | CUDA 12/13, ROCm, Intel | -| [transformers](https://github.com/huggingface/transformers) | Various GPTs and quantization formats | yes | GPT, embeddings, Audio generation 
| yes | yes* | CUDA 12/13, ROCm, Intel, CPU | -| [MLX](https://github.com/ml-explore/mlx-lm) | Various LLMs | yes | GPT | no | no | Metal (Apple Silicon) | -| [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) | Vision-Language Models | yes | Multimodal GPT | no | no | Metal (Apple Silicon) | -| [vllm-omni](https://github.com/vllm-project/vllm) | vLLM Omni multimodal | yes | Multimodal GPT | no | no | CUDA 12/13, ROCm, Intel | -| [langchain-huggingface](https://github.com/tmc/langchaingo) | Any text generators available on HuggingFace through API | yes | GPT | no | no | N/A | +| Backend | Description | Capability | Embeddings | Streaming | Acceleration | +|---------|-------------|------------|------------|-----------|-------------| +| [llama.cpp](https://github.com/ggerganov/llama.cpp) | LLM inference in C/C++. Supports LLaMA, Mamba, RWKV, Falcon, Starcoder, GPT-2, [and many others](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description) | GPT, Functions | yes | yes | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T | +| [vLLM](https://github.com/vllm-project/vllm) | Fast LLM serving with PagedAttention | GPT | no | no | CUDA 12, ROCm, Intel | +| [vLLM Omni](https://github.com/vllm-project/vllm) | Unified multimodal generation (text, image, video, audio) | Multimodal GPT | no | no | CUDA 12, ROCm | +| [transformers](https://github.com/huggingface/transformers) | HuggingFace Transformers framework | GPT, Embeddings, Multimodal | yes | yes* | CPU, CUDA 12/13, ROCm, Intel, Metal | +| [MLX](https://github.com/ml-explore/mlx-lm) | Apple Silicon LLM inference | GPT | no | no | Metal | +| [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) | Vision-Language Models on Apple Silicon | Multimodal GPT | no | no | Metal | +| [MLX Distributed](https://github.com/ml-explore/mlx-lm) | Distributed LLM inference across multiple Apple Silicon Macs | GPT | no | no | Metal | -## Audio & Speech Processing +## Speech-to-Text -| Backend and Bindings | Compatible models | 
Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration | -|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------| -| [whisper.cpp](https://github.com/ggml-org/whisper.cpp) | whisper | no | Audio transcription | no | no | CUDA 12/13, ROCm, Intel SYCL, Vulkan, CPU | -| [faster-whisper](https://github.com/SYSTRAN/faster-whisper) | whisper | no | Audio transcription | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [piper](https://github.com/rhasspy/piper) ([binding](https://github.com/mudler/go-piper)) | Any piper onnx model | no | Text to voice | no | no | CPU | -| [coqui](https://github.com/idiap/coqui-ai-TTS) | Coqui TTS | no | Audio generation and Voice cloning | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [kokoro](https://github.com/hexgrad/kokoro) | Kokoro TTS | no | Text-to-speech | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [chatterbox](https://github.com/resemble-ai/chatterbox) | Chatterbox TTS | no | Text-to-speech | no | no | CUDA 12/13, CPU | -| [kitten-tts](https://github.com/KittenML/KittenTTS) | Kitten TTS | no | Text-to-speech | no | no | CPU | -| [silero-vad](https://github.com/snakers4/silero-vad) with [Golang bindings](https://github.com/streamer45/silero-vad-go) | Silero VAD | no | Voice Activity Detection | no | no | CPU | -| [neutts](https://github.com/neuphonic/neuttsair) | NeuTTSAir | no | Text-to-speech with voice cloning | no | no | CUDA 12/13, ROCm, CPU | -| [vibevoice](https://github.com/microsoft/VibeVoice) | VibeVoice-Realtime | no | Real-time text-to-speech with voice cloning | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [pocket-tts](https://github.com/kyutai-labs/pocket-tts) | Pocket TTS | no | Lightweight CPU-based text-to-speech with voice cloning | no | no | CUDA 12/13, ROCm, Intel, CPU | -| 
[mlx-audio](https://github.com/Blaizzy/mlx-audio) | MLX | no | Text-tospeech | no | no | Metal (Apple Silicon) | -| [nemo](https://github.com/NVIDIA/NeMo) | NeMo speech models | no | Speech models | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [outetts](https://github.com/edwengc/outetts) | OuteTTS | no | Text-to-speech with voice cloning | no | no | CUDA 12/13, CPU | -| [faster-qwen3-tts](https://github.com/andimarafioti/faster-qwen3-tts) | Faster Qwen3 TTS | no | Fast text-to-speech | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [qwen-asr](https://github.com/QwenLM/Qwen-ASR) | Qwen ASR | no | Automatic speech recognition | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [voxcpm](https://github.com/voxcpm/voxcpm) | VoxCPM | no | Speech understanding | no | no | CUDA 12/13, Metal, CPU | -| [whisperx](https://github.com/m-bain/whisperX) | WhisperX | no | Enhanced transcription | no | no | CUDA 12/13, ROCm, Intel, CPU | +| Backend | Description | Acceleration | +|---------|-------------|-------------| +| [whisper.cpp](https://github.com/ggml-org/whisper.cpp) | OpenAI Whisper in C/C++ | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T | +| [faster-whisper](https://github.com/SYSTRAN/faster-whisper) | Fast Whisper with CTranslate2 | CUDA 12/13, ROCm, Intel, Metal | +| [WhisperX](https://github.com/m-bain/whisperX) | Word-level timestamps and speaker diarization | CPU, CUDA 12/13, ROCm, Metal | +| [moonshine](https://github.com/moonshine-ai/moonshine) | Ultra-fast transcription for low-end devices | CPU, CUDA 12/13, Metal | +| [voxtral](https://github.com/mudler/voxtral.c) | Voxtral Realtime 4B speech-to-text in C | CPU, Metal | +| [Qwen3-ASR](https://github.com/QwenLM/Qwen3-ASR) | Qwen3 automatic speech recognition | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [NeMo](https://github.com/NVIDIA/NeMo) | NVIDIA NeMo ASR toolkit | CPU, CUDA 12/13, ROCm, Intel, Metal | + +## Text-to-Speech + +| Backend | Description | Acceleration | 
+|---------|-------------|-------------| +| [piper](https://github.com/rhasspy/piper) | Fast neural TTS | CPU | +| [Coqui TTS](https://github.com/idiap/coqui-ai-TTS) | TTS with 1100+ languages and voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal | +| [Kokoro](https://huggingface.co/hexgrad/Kokoro-82M) | Lightweight TTS (82M params) | CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [Chatterbox](https://github.com/resemble-ai/chatterbox) | Production-grade TTS with emotion control | CPU, CUDA 12/13, Metal, Jetson L4T | +| [VibeVoice](https://github.com/microsoft/VibeVoice) | Real-time TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) | TTS with custom voice, voice design, and voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [fish-speech](https://github.com/fishaudio/fish-speech) | High-quality TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [Pocket TTS](https://github.com/kyutai-labs/pocket-tts) | Lightweight CPU-efficient TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | +| [OuteTTS](https://github.com/OuteAI/outetts) | TTS with custom speaker voices | CPU, CUDA 12 | +| [faster-qwen3-tts](https://github.com/andimarafioti/faster-qwen3-tts) | Real-time Qwen3-TTS with CUDA graph capture | CUDA 12/13, Jetson L4T | +| [NeuTTS Air](https://github.com/neuphonic/neutts-air) | Instant voice cloning TTS | CPU, CUDA 12, ROCm | +| [VoxCPM](https://github.com/ModelBest/VoxCPM) | Expressive end-to-end TTS | CPU, CUDA 12/13, ROCm, Intel, Metal | +| [Kitten TTS](https://github.com/KittenML/KittenTTS) | Kitten TTS model | CPU, Metal | +| [MLX-Audio](https://github.com/Blaizzy/mlx-audio) | Audio models on Apple Silicon | Metal, CPU, CUDA 12/13, Jetson L4T | + +## Music Generation + +| Backend | Description | Acceleration | +|---------|-------------|-------------| +| [ACE-Step](https://github.com/ace-step/ACE-Step-1.5) | Music 
generation from text descriptions, lyrics, or audio | CPU, CUDA 12/13, ROCm, Intel, Metal | +| [acestep.cpp](https://github.com/ace-step/acestep.cpp) | ACE-Step 1.5 C++ backend using GGML | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T | ## Image & Video Generation -| Backend and Bindings | Compatible models | Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration | -|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------| -| [stablediffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) | stablediffusion-1, stablediffusion-2, stablediffusion-3, flux, PhotoMaker | no | Image | no | no | CUDA 12/13, Intel SYCL, Vulkan, CPU | -| [diffusers](https://github.com/huggingface/diffusers) | SD, various diffusion models,... | no | Image/Video generation | no | no | CUDA 12/13, ROCm, Intel, Metal, CPU | -| [transformers-musicgen](https://github.com/huggingface/transformers) | MusicGen | no | Audio generation | no | no | CUDA, CPU | +| Backend | Description | Acceleration | +|---------|-------------|-------------| +| [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) | Stable Diffusion, Flux, PhotoMaker in C/C++ | CPU, CUDA 12/13, Intel SYCL, Vulkan, Metal, Jetson L4T | +| [diffusers](https://github.com/huggingface/diffusers) | HuggingFace diffusion models (image and video generation) | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T | -## Specialized AI Tasks +## Specialized Tasks -| Backend and Bindings | Compatible models | Completion/Chat endpoint | Capability | Embeddings support | Token stream support | Acceleration | 
-|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------| -| [rfdetr](https://github.com/roboflow/rf-detr) | RF-DETR | no | Object Detection | no | no | CUDA 12/13, Intel, CPU | -| [rerankers](https://github.com/AnswerDotAI/rerankers) | Reranking API | no | Reranking | no | no | CUDA 12/13, ROCm, Intel, CPU | -| [local-store](https://github.com/mudler/LocalAI) | Vector database | no | Vector storage | yes | no | CPU | -| [huggingface](https://huggingface.co/docs/hub/en/api) | HuggingFace API models | yes | Various AI tasks | yes | yes | API-based | +| Backend | Description | Acceleration | +|---------|-------------|-------------| +| [RF-DETR](https://github.com/roboflow/rf-detr) | Real-time transformer-based object detection | CPU, CUDA 12/13, Intel, Metal, Jetson L4T | +| [rerankers](https://github.com/AnswerDotAI/rerankers) | Document reranking for RAG | CUDA 12/13, ROCm, Intel, Metal | +| [local-store](https://github.com/mudler/LocalAI) | Local vector database for embeddings | CPU, Metal | +| [Silero VAD](https://github.com/snakers4/silero-vad) | Voice Activity Detection | CPU | +| [TRL](https://github.com/huggingface/trl) | Fine-tuning (SFT, DPO, GRPO, RLOO, KTO, ORPO) | CPU, CUDA 12/13 | +| [llama.cpp quantization](https://github.com/ggml-org/llama.cpp) | HuggingFace β†’ GGUF model conversion and quantization | CPU, Metal | +| [Opus](https://opus-codec.org/) | Audio codec for WebRTC / Realtime API | CPU, Metal | ## Acceleration Support Summary diff --git a/docs/content/whats-new.md b/docs/content/whats-new.md index f3b57c178..05dc2e961 100644 --- a/docs/content/whats-new.md +++ b/docs/content/whats-new.md @@ -6,10 +6,20 @@ url = '/basics/news/' icon = "newspaper" +++ -Release notes have been now moved completely over Github releases. 
+Release notes have now been moved entirely to GitHub releases. You can find them [here](https://github.com/mudler/LocalAI/releases). +## 2024 Highlights + +- **April 2024**: [Reranker API](https://github.com/mudler/LocalAI/pull/2121) +- **May 2024**: [Distributed inferencing](https://github.com/mudler/LocalAI/pull/2324), [Decentralized P2P llama.cpp](https://github.com/mudler/LocalAI/pull/2343) β€” [Docs](https://localai.io/features/distribute/) +- **July/August 2024**: [P2P Dashboard, Federated mode and AI Swarms](https://github.com/mudler/LocalAI/pull/2723), [P2P Global community pools](https://github.com/mudler/LocalAI/issues/3113), FLUX-1 support, [P2P Explorer](https://explorer.localai.io) +- **October 2024**: Examples moved to [LocalAI-examples](https://github.com/mudler/LocalAI-examples) +- **November 2024**: [Voice Activity Detection (VAD)](https://github.com/mudler/LocalAI/pull/4204), [Bark.cpp backend](https://github.com/mudler/LocalAI/pull/4287) +- **December 2024**: [stablediffusion.cpp backend (ggml)](https://github.com/mudler/LocalAI/pull/4289) + +--- ## 04-12-2023: __v2.0.0__