LocalAI/core at a7a7bd646b61fa2e376e1a62d779db1baf440407

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-30 09:57:57 -04:00

LocalAI [bot] a7a7bd646b fix(mlx): route vision-language models to the mlx-vlm backend (#10274)

Vision-language checkpoints such as mlx-community/gemma-4-E4B-it-qat-4bit
declare the "image-text-to-text" pipeline tag on HuggingFace. The mlx
importer hardcoded backend "mlx" for every mlx-community model, so these
VLMs were served by the text-only mlx-lm backend whose tokenizer does not
carry the processor chat template. The template was never applied and the
model produced degenerate, looping output that echoed the prompt.

Detect the "image-text-to-text" pipeline tag in the importer and route those
models to mlx-vlm, which applies the processor-aware chat template. An
explicit backend preference still wins.
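
The routing rule described above can be sketched as follows. This is a minimal illustration and not the importer's actual code: the function name `choose_mlx_backend` and its parameters are hypothetical. Only the "image-text-to-text" tag and the backend names "mlx" and "mlx-vlm" come from the commit message.

```python
from typing import Optional

# Pipeline tag that HuggingFace reports for vision-language checkpoints.
VLM_PIPELINE_TAG = "image-text-to-text"


def choose_mlx_backend(pipeline_tag: Optional[str],
                       preferred_backend: Optional[str] = None) -> str:
    """Pick a backend name for an mlx-community model (hypothetical helper)."""
    # An explicit backend preference still wins over auto-detection.
    if preferred_backend:
        return preferred_backend
    # Vision-language models need the processor-aware chat template,
    # which only the mlx-vlm backend applies.
    if pipeline_tag == VLM_PIPELINE_TAG:
        return "mlx-vlm"
    # Every other model keeps the previous default.
    return "mlx"
```

Under these assumptions, `choose_mlx_backend("image-text-to-text")` selects "mlx-vlm", while passing `preferred_backend="mlx"` keeps the text-only backend even for a VLM.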

As a defensive backstop, the mlx backend now warns loudly when the loaded
model has no chat template, so a misrouted VLM surfaces the problem instead
of silently looping.
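
A defensive check of this kind might look like the sketch below. It assumes the loaded tokenizer exposes a Hugging Face-style `chat_template` attribute; the helper name `warn_if_no_chat_template`, the logger name, and the message wording are hypothetical and not taken from the backend.

```python
import logging

logger = logging.getLogger("mlx-backend")  # hypothetical logger name


def warn_if_no_chat_template(tokenizer, model_name: str) -> bool:
    """Log a warning and return True when the tokenizer has no chat template."""
    # Assumption: the template, if present, lives on `tokenizer.chat_template`.
    template = getattr(tokenizer, "chat_template", None)
    if template:
        return False
    logger.warning(
        "model %s has no chat template; if it is a vision-language model, "
        "it may need the mlx-vlm backend instead",
        model_name,
    )
    return True
```

Returning a boolean keeps the check easy to test; the caller can ignore it and rely on the log line alone.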

Fixes #10269


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-12 23:12:42 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

feat(realtime): make WebRTC ICE candidates configurable (#10231)

2026-06-09 22:28:03 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

dependencies_manager

feat(ui): move to React for frontend (#8772)

2026-03-05 21:47:12 +01:00

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

fix(mlx): route vision-language models to the mlx-vlm backend (#10274)

2026-06-12 23:12:42 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

feat(gallery): verify backend OCI images with keyless cosign (#9823)

2026-05-18 08:02:20 +02:00

fix(openresponses): populate Content and accept bare {role,content} items (#10039) (#10040)

2026-05-28 07:21:48 +00:00

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00