LocalAI/docs/content/features at 8134d6db374a6f6fa6b6ae5784244159e5bec54d

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-12 02:38:19 -04:00

Ettore Di Giacinto 8134d6db37 docs(dllm): record Q4_K_M validation and quantization guidance

Q4_K_M validated on GB10: quality holds (cosine 0.9862, coherent
generation, 19/48 stopper exit) but a forward step is ~5x slower than
BF16 (27.5s vs 5.6s: native BF16 tensor cores vs K-quant MoE dequant).
Guidance: prefer BF16 when it fits; Q4_K_M is the memory-bound option.

Assisted-by: Claude Code (Fable 5)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-11 19:22:02 +00:00

_index.en.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

agents.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

api-discovery.md

feat(api): Allow coding agents to interactively discover how to control and configure LocalAI (#9084)

2026-04-04 15:14:35 +02:00

audio-diarization.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

audio-to-text.md

feat(parakeet-cpp): real segment timestamps (NeMo-faithful) (#10207)

2026-06-07 22:08:24 +02:00

audio-transform.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

authentication.md

feat(usage): track and visualise usage per API key (#9920)

2026-05-21 16:34:02 +02:00

backend-monitor.md

fix(backend-monitor): accept model as a query parameter (#9411)

2026-04-21 22:06:35 +02:00

backends.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

cloud-proxy.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

constrained_grammars.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

distributed_inferencing.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

distributed-mode.md

fix(docs): use relearn notice shortcode instead of unsupported alert (#10206)

2026-06-07 00:37:12 +02:00

distribution.md

fix(docs): commit distribution.md

2026-04-03 10:14:13 +02:00

embeddings.md

feat(face-recognition): add insightface/onnx backend for 1:1 verify, 1:N identify, embedding, detection, analysis (#9480)

2026-04-22 21:55:41 +02:00

face-recognition.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

fine-tuning.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

gpt-vision.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

GPU-acceleration.md

feat(rocm): bump to 7.x (#9323)

2026-04-12 08:51:30 +02:00

image-generation.md

docs: fix documentation typos (#10125)

2026-06-01 14:31:08 +02:00

localai-assistant.md

feat: localai assistant chat modality (#9602)

2026-04-28 19:29:27 +02:00

mcp.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

middleware.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

mitm-proxy.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

mlx-distributed.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

model-gallery.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

object-detection.md

feat(backend): rfdetr-cpp native object detection + segmentation backend (#10028)

2026-05-27 18:43:57 +02:00

openai-functions.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

openai-realtime.md

feat(realtime): stream the LLM / TTS / transcription pipeline stages (#10176)

2026-06-11 08:43:12 +01:00

p2p.md

feat: Add documentation for undocumented API endpoints (#8852)

2026-03-08 17:59:33 +01:00

quantization.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

reranker.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00

runtime-settings.md

fix(docs): fix broken references to distributed mode

2026-04-03 09:46:06 +02:00

sound-generation.md

feat: Add documentation for undocumented API endpoints (#8852)

2026-03-08 17:59:33 +01:00

stores.md

fix(docs): replace Docsy alert shortcode with Relearn notice

2026-04-25 21:04:31 +00:00

text-generation.md

docs(dllm): record Q4_K_M validation and quantization guidance

2026-06-11 19:22:02 +00:00

text-to-audio.md

feat(qwen3-tts-cpp): normalize request language for flexible matching (#10174)

2026-06-04 17:26:31 +02:00

video-generation.md

feat: Add documentation for undocumented API endpoints (#8852)

2026-03-08 17:59:33 +01:00

voice-activity-detection.md

feat: Add documentation for undocumented API endpoints (#8852)

2026-03-08 17:59:33 +01:00

voice-recognition.md

docs: architecture & feature diagrams (blueprint style) (#10137)

2026-06-02 18:43:22 +02:00