chore(docs): simplify

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
This commit is contained in:
Ettore Di Giacinto
2026-03-22 20:24:44 +00:00
parent 15935e9d5f
commit cecd8d6aa5
4 changed files with 238 additions and 317 deletions


@@ -6,35 +6,94 @@ icon = "sync"
+++
## Community Integrations
The lists below cover software and community projects that integrate with LocalAI.
Feel free to open a pull request (by clicking "Edit page" below) to get your project added!
### Build & Deploy
- [aikit](https://github.com/sozercan/aikit) — Build and deploy custom LocalAI containers
- [Helm chart](https://github.com/go-skynet/helm-charts) — Deploy LocalAI on Kubernetes
- [GitHub Actions](https://github.com/marketplace/actions/start-localai) — Use LocalAI in CI/CD workflows
### Web UIs
- [localai-admin](https://github.com/Jirubizu/localai-admin)
- [LocalAI-frontend](https://github.com/go-skynet/LocalAI-frontend)
- [QA-Pilot](https://github.com/reid41/QA-Pilot) — Interactive chat for navigating GitHub code repositories
- [Big AGI](https://github.com/enricoros/big-agi) — Powerful web interface running entirely in the browser
### Agentic Libraries & Assistants
- [cogito](https://github.com/mudler/cogito) — Agentic library for Go
- [LocalAGI](https://github.com/mudler/LocalAGI) — Local smart assistant with autonomous agents
### MCP Servers
- [MCPs](https://github.com/mudler/MCPs) — Model Context Protocol servers
### OS Assistants
- [Keygeist](https://github.com/mudler/Keygeist) — AI-powered keyboard operator for Linux
### Voice
- [VoxInput](https://github.com/richiejp/VoxInput) — Use voice to control your desktop
### IDE & Editor Plugins
- [VSCode extension](https://github.com/badgooooor/localai-vscode-plugin)
- [GPTLocalhost (Word Add-in)](https://gptlocalhost.com/demo#LocalAI) — Run LocalAI in Microsoft Word locally
### Framework Integrations
- [Langchain (Python)](https://python.langchain.com/docs/integrations/providers/localai/) — [pypi](https://pypi.org/project/langchain-localai/)
- [langchain4j](https://github.com/langchain4j/langchain4j) — Java LangChain
- [lingoose](https://github.com/henomis/lingoose) — Go framework for LLM apps
- [LLPhant](https://github.com/theodo-group/LLPhant) — PHP library for LLMs and vector databases
- [FlowiseAI](https://github.com/FlowiseAI/Flowise) — Low-code LLM app builder
- [LLMStack](https://github.com/trypromptly/LLMStack)
- [Midori AI Subsystem Manager](https://io.midori-ai.xyz/subsystem/manager/)
### Terminal Tools
- [ShellOracle](https://github.com/djcopley/ShellOracle) — Terminal utility
- [Shell-Pilot](https://github.com/reid41/shell-pilot) — Interact with LLMs via pure shell scripts
- [Mods](https://github.com/charmbracelet/mods) — AI on the command line
### Chat Bots
- [Discord bot](https://github.com/mudler/LocalAGI/tree/main/examples/discord)
- [Slack bot](https://github.com/mudler/LocalAGI/tree/main/examples/slack)
- [Telegram bot](https://github.com/mudler/LocalAI/tree/master/examples/telegram-bot)
- [Hellper (Telegram)](https://github.com/JackBekket/Hellper)
### Home Automation
- [hass-openai-custom-conversation](https://github.com/drndos/hass-openai-custom-conversation) — Home Assistant integration
- [ha-llmvision](https://github.com/valentinfrlch/ha-llmvision) — Home Assistant LLM Vision
- [HA-LocalAI-Monitor](https://github.com/loryanstrant/HA-LocalAI-Monitor) — Home Assistant monitoring
- Nextcloud [integration plugin](https://apps.nextcloud.com/apps/integration_openai) and [AI assistant](https://apps.nextcloud.com/apps/assistant)
### Automation & DevOps
- [Reflexia](https://github.com/JackBekket/Reflexia) — Auto-documentation
- [GitHelper](https://github.com/JackBekket/GitHelper) — GitHub bot for issues with code and documentation context
- [kairos](https://github.com/kairos-io/kairos) — Immutable Linux OS
### Other Integrations
- [AnythingLLM](https://github.com/Mintplex-Labs/anything-llm)
- [k8sgpt](https://github.com/k8sgpt-ai/k8sgpt) — AI-powered Kubernetes cluster diagnostics
- [Logseq GPT3 OpenAI plugin](https://github.com/briansunter/logseq-plugin-gpt3-openai)
- [CodeGPT (JetBrains)](https://plugins.jetbrains.com/plugin/21056-codegpt) — Custom OpenAI-compatible endpoints
- [Wave Terminal](https://docs.waveterm.dev/features/supportedLLMs/localai) — Native LocalAI support
- [Obsidian BMO Chatbot](https://github.com/longy2k/obsidian-bmo-chatbot)
- [spark](https://github.com/cedriking/spark)
- [openops (Mattermost)](https://github.com/mattermost/openops)
- [Model Gallery](https://github.com/go-skynet/model-gallery)
- [Examples](https://github.com/mudler/LocalAI/tree/master/examples/)
## Configuration Guides


@@ -16,55 +16,72 @@ LocalAI will attempt to automatically load models which are not explicitly confi
## Text Generation & Language Models
| Backend | Description | Capability | Embeddings | Streaming | Acceleration |
|---------|-------------|------------|------------|-----------|-------------|
| [llama.cpp](https://github.com/ggerganov/llama.cpp) | LLM inference in C/C++. Supports LLaMA, Mamba, RWKV, Falcon, Starcoder, GPT-2, [and many others](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description) | GPT, Functions | yes | yes | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T |
| [vLLM](https://github.com/vllm-project/vllm) | Fast LLM serving with PagedAttention | GPT | no | no | CUDA 12, ROCm, Intel |
| [vLLM Omni](https://github.com/vllm-project/vllm) | Unified multimodal generation (text, image, video, audio) | Multimodal GPT | no | no | CUDA 12, ROCm |
| [transformers](https://github.com/huggingface/transformers) | HuggingFace Transformers framework | GPT, Embeddings, Multimodal | yes | yes* | CPU, CUDA 12/13, ROCm, Intel, Metal |
| [MLX](https://github.com/ml-explore/mlx-lm) | Apple Silicon LLM inference | GPT | no | no | Metal |
| [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) | Vision-Language Models on Apple Silicon | Multimodal GPT | no | no | Metal |
| [MLX Distributed](https://github.com/ml-explore/mlx-lm) | Distributed LLM inference across multiple Apple Silicon Macs | GPT | no | no | Metal |
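Whichever text-generation backend is loaded, LocalAI exposes it through an OpenAI-compatible API, so any OpenAI client can talk to it. A minimal sketch with only the Python standard library (the port `8080` and the model name `qwen2.5` are assumptions; substitute whatever model you have installed):

```python
import json
import urllib.request

# Build an OpenAI-style chat completion request against LocalAI.
# Host, port, and model name are illustrative assumptions.
payload = {
    "model": "qwen2.5",  # hypothetical model name
    "messages": [{"role": "user", "content": "Say hello"}],
    "temperature": 0.7,
}

req = urllib.request.Request(
    "http://localhost:8080/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would send it; left out so the sketch
# stays runnable without a server.
print(req.get_method(), req.full_url)
```

Because the route and request shape match OpenAI's, existing SDKs work by pointing their base URL at the LocalAI instance.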
## Speech-to-Text
| Backend | Description | Acceleration |
|---------|-------------|-------------|
| [whisper.cpp](https://github.com/ggml-org/whisper.cpp) | OpenAI Whisper in C/C++ | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T |
| [faster-whisper](https://github.com/SYSTRAN/faster-whisper) | Fast Whisper with CTranslate2 | CUDA 12/13, ROCm, Intel, Metal |
| [WhisperX](https://github.com/m-bain/whisperX) | Word-level timestamps and speaker diarization | CPU, CUDA 12/13, ROCm, Metal |
| [moonshine](https://github.com/moonshine-ai/moonshine) | Ultra-fast transcription for low-end devices | CPU, CUDA 12/13, Metal |
| [voxtral](https://github.com/mudler/voxtral.c) | Voxtral Realtime 4B speech-to-text in C | CPU, Metal |
| [Qwen3-ASR](https://github.com/QwenLM/Qwen3-ASR) | Qwen3 automatic speech recognition | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [NeMo](https://github.com/NVIDIA/NeMo) | NVIDIA NeMo ASR toolkit | CPU, CUDA 12/13, ROCm, Intel, Metal |
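The speech-to-text backends are served through the OpenAI-style `/v1/audio/transcriptions` endpoint, which expects a `multipart/form-data` upload. A sketch of assembling that body by hand with the standard library (the model name and the audio bytes are placeholders, not a valid recording):

```python
import uuid

# Assemble a multipart/form-data body manually; the audio bytes are
# placeholder data -- a real call would read an actual audio file.
boundary = uuid.uuid4().hex
audio_bytes = b"RIFF....WAVEfmt "  # placeholder, not a valid WAV

parts = []
parts.append((f"--{boundary}\r\n"
              'Content-Disposition: form-data; name="model"\r\n\r\n'
              "whisper-1\r\n").encode())  # hypothetical model name
parts.append((f"--{boundary}\r\n"
              'Content-Disposition: form-data; name="file"; filename="audio.wav"\r\n'
              "Content-Type: audio/wav\r\n\r\n").encode())
parts.append(audio_bytes + b"\r\n")
parts.append(f"--{boundary}--\r\n".encode())
body = b"".join(parts)

headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
# POST body to http://localhost:8080/v1/audio/transcriptions with these
# headers; the configured backend returns the transcribed text as JSON.
```

In practice a client library (or `curl -F`) builds this envelope for you; the sketch just shows what goes over the wire.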
## Text-to-Speech
| Backend | Description | Acceleration |
|---------|-------------|-------------|
| [piper](https://github.com/rhasspy/piper) | Fast neural TTS | CPU |
| [Coqui TTS](https://github.com/idiap/coqui-ai-TTS) | TTS with 1100+ languages and voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal |
| [Kokoro](https://huggingface.co/hexgrad/Kokoro-82M) | Lightweight TTS (82M params) | CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [Chatterbox](https://github.com/resemble-ai/chatterbox) | Production-grade TTS with emotion control | CPU, CUDA 12/13, Metal, Jetson L4T |
| [VibeVoice](https://github.com/microsoft/VibeVoice) | Real-time TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [Qwen3-TTS](https://github.com/QwenLM/Qwen3-TTS) | TTS with custom voice, voice design, and voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [fish-speech](https://github.com/fishaudio/fish-speech) | High-quality TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [Pocket TTS](https://github.com/kyutai-labs/pocket-tts) | Lightweight CPU-efficient TTS with voice cloning | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
| [OuteTTS](https://github.com/OuteAI/outetts) | TTS with custom speaker voices | CPU, CUDA 12 |
| [faster-qwen3-tts](https://github.com/andimarafioti/faster-qwen3-tts) | Real-time Qwen3-TTS with CUDA graph capture | CUDA 12/13, Jetson L4T |
| [NeuTTS Air](https://github.com/neuphonic/neutts-air) | Instant voice cloning TTS | CPU, CUDA 12, ROCm |
| [VoxCPM](https://github.com/ModelBest/VoxCPM) | Expressive end-to-end TTS | CPU, CUDA 12/13, ROCm, Intel, Metal |
| [Kitten TTS](https://github.com/KittenML/KittenTTS) | Ultra-lightweight TTS model | CPU, Metal |
| [MLX-Audio](https://github.com/Blaizzy/mlx-audio) | Audio models on Apple Silicon | Metal, CPU, CUDA 12/13, Jetson L4T |
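All of the text-to-speech backends above are reachable through a common TTS route. A sketch of the JSON request body, assuming the piper backend with a hypothetical voice model (field values are illustrative, not canonical names):

```python
import json

# Request body for LocalAI's TTS route; the model and backend values
# here are assumptions -- use a voice you actually have installed.
tts_request = {
    "model": "en-us-amy-low.onnx",  # hypothetical piper voice file
    "backend": "piper",
    "input": "Hello from LocalAI",
}
body = json.dumps(tts_request).encode("utf-8")
# POST to http://localhost:8080/tts; the response is the raw audio.
print(len(body), "bytes")
```

Switching backends (say, from piper to a voice-cloning model) is a matter of changing the `backend` and `model` fields, not the client code.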
## Music Generation
| Backend | Description | Acceleration |
|---------|-------------|-------------|
| [ACE-Step](https://github.com/ace-step/ACE-Step-1.5) | Music generation from text descriptions, lyrics, or audio | CPU, CUDA 12/13, ROCm, Intel, Metal |
| [acestep.cpp](https://github.com/ace-step/acestep.cpp) | ACE-Step 1.5 C++ backend using GGML | CPU, CUDA 12/13, ROCm, Intel SYCL, Vulkan, Metal, Jetson L4T |
## Image & Video Generation
| Backend | Description | Acceleration |
|---------|-------------|-------------|
| [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) | Stable Diffusion, Flux, PhotoMaker in C/C++ | CPU, CUDA 12/13, Intel SYCL, Vulkan, Metal, Jetson L4T |
| [diffusers](https://github.com/huggingface/diffusers) | HuggingFace diffusion models (image and video generation) | CPU, CUDA 12/13, ROCm, Intel, Metal, Jetson L4T |
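Both image backends sit behind the OpenAI-style image generation endpoint. A sketch of the request body (the model name and size are illustrative assumptions):

```python
import json

# Request body for the OpenAI-style image generation endpoint; the
# model name and size here are assumptions, not canonical values.
image_request = {
    "prompt": "a cat wearing a spacesuit",
    "model": "stablediffusion",  # hypothetical configured model name
    "size": "512x512",
    "n": 1,
}
# The "size" field encodes width x height as a single string.
width, height = (int(d) for d in image_request["size"].split("x"))
body = json.dumps(image_request).encode("utf-8")
# POST body to http://localhost:8080/v1/images/generations.
print(width, height, len(body))
```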
## Specialized Tasks
| Backend | Description | Acceleration |
|---------|-------------|-------------|
| [RF-DETR](https://github.com/roboflow/rf-detr) | Real-time transformer-based object detection | CPU, CUDA 12/13, Intel, Metal, Jetson L4T |
| [rerankers](https://github.com/AnswerDotAI/rerankers) | Document reranking for RAG | CUDA 12/13, ROCm, Intel, Metal |
| [local-store](https://github.com/mudler/LocalAI) | Local vector database for embeddings | CPU, Metal |
| [Silero VAD](https://github.com/snakers4/silero-vad) | Voice Activity Detection | CPU |
| [TRL](https://github.com/huggingface/trl) | Fine-tuning (SFT, DPO, GRPO, RLOO, KTO, ORPO) | CPU, CUDA 12/13 |
| [llama.cpp quantization](https://github.com/ggml-org/llama.cpp) | HuggingFace → GGUF model conversion and quantization | CPU, Metal |
| [Opus](https://opus-codec.org/) | Audio codec for WebRTC / Realtime API | CPU, Metal |
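The reranker backend is useful for RAG pipelines: given a query and candidate documents, it returns them scored by relevance. A sketch of the rerank request body (the model name is a placeholder; the request shape follows the Jina-style rerank API that the rerankers library implements):

```python
import json

# Jina-style rerank request; the model name is a hypothetical
# placeholder -- use whichever reranker you have configured.
rerank_request = {
    "model": "cross-encoder",  # hypothetical reranker model name
    "query": "What is LocalAI?",
    "documents": [
        "LocalAI is a drop-in OpenAI API replacement.",
        "Bananas are rich in potassium.",
    ],
    "top_n": 1,
}
body = json.dumps(rerank_request).encode("utf-8")
# POST body to http://localhost:8080/v1/rerank; the response lists the
# documents ranked by relevance to the query.
```

A typical RAG flow retrieves a broad candidate set from the vector store first, then sends only those candidates through the reranker to pick the final context.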
## Acceleration Support Summary


@@ -6,10 +6,20 @@ url = '/basics/news/'
icon = "newspaper"
+++
Release notes have now moved entirely to GitHub releases.
You can see the release notes [here](https://github.com/mudler/LocalAI/releases).
## 2024 Highlights
- **April 2024**: [Reranker API](https://github.com/mudler/LocalAI/pull/2121)
- **May 2024**: [Distributed inferencing](https://github.com/mudler/LocalAI/pull/2324), [Decentralized P2P llama.cpp](https://github.com/mudler/LocalAI/pull/2343) — [Docs](https://localai.io/features/distribute/)
- **July/August 2024**: [P2P Dashboard, Federated mode and AI Swarms](https://github.com/mudler/LocalAI/pull/2723), [P2P Global community pools](https://github.com/mudler/LocalAI/issues/3113), FLUX-1 support, [P2P Explorer](https://explorer.localai.io)
- **October 2024**: Examples moved to [LocalAI-examples](https://github.com/mudler/LocalAI-examples)
- **November 2024**: [Voice Activity Detection (VAD)](https://github.com/mudler/LocalAI/pull/4204), [Bark.cpp backend](https://github.com/mudler/LocalAI/pull/4287)
- **December 2024**: [stablediffusion.cpp backend (ggml)](https://github.com/mudler/LocalAI/pull/4289)
---
## 04-12-2023: __v2.0.0__