LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-06 07:46:15 -04:00

Files

Ettore Di Giacinto cb3609530a fix(realtime): always strip reasoning from spoken output

disable_thinking maps to ReasoningConfig.DisableReasoning=true on the LLM
config, which the backend reads as enable_thinking=false. But the realtime
handler reads that SAME config to drive reasoning extraction, and there
DisableReasoning=true means "skip stripping". PredictConfig() returns this
LLM config, so both the streamed (speechStreamer) and buffered realtime
paths stopped stripping <think>…</think> exactly when disable_thinking was
on — leaking raw reasoning to the client whenever the model ignored the
enable_thinking hint (e.g. lfm2.5).

Add spokenReasoningConfig() which clears DisableReasoning for extraction
(keeping custom tokens/tag pairs) and route both realtime paths through it.
Spoken output now always strips reasoning, independent of the backend
suppression hint.
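
The fix described above can be illustrated with a minimal, self-contained Go sketch. It is not the actual LocalAI code: the `ReasoningConfig` struct shape, the `TagPairs` field, and the `stripReasoning` helper are assumptions made for illustration. Only the names `spokenReasoningConfig` and `DisableReasoning` come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ReasoningConfig is a hypothetical stand-in for the LLM reasoning config.
// Only the DisableReasoning name is taken from the commit message; the
// TagPairs field is an assumed representation of "custom tokens/tag pairs".
type ReasoningConfig struct {
	DisableReasoning bool
	TagPairs         [][2]string
}

// spokenReasoningConfig returns a copy of the LLM config that is safe to use
// for extraction on the spoken path: DisableReasoning is cleared, and the
// custom tag pairs are kept.
func spokenReasoningConfig(llm ReasoningConfig) ReasoningConfig {
	out := llm
	out.DisableReasoning = false
	out.TagPairs = append([][2]string(nil), llm.TagPairs...)
	return out
}

// stripReasoning is an assumed, simplified extractor. When DisableReasoning
// is set it skips stripping, which mirrors the behavior that caused the leak.
func stripReasoning(text string, cfg ReasoningConfig) string {
	if cfg.DisableReasoning {
		return text
	}
	pairs := cfg.TagPairs
	if len(pairs) == 0 {
		pairs = [][2]string{{"<think>", "</think>"}}
	}
	for _, p := range pairs {
		for {
			start := strings.Index(text, p[0])
			if start < 0 {
				break
			}
			end := strings.Index(text[start:], p[1])
			if end < 0 {
				// Unterminated block: drop everything from the opening tag.
				text = text[:start]
				break
			}
			text = text[:start] + text[start+end+len(p[1]):]
		}
	}
	return strings.TrimSpace(text)
}

func main() {
	// disable_thinking is on, but the model ignored the hint and still
	// emitted a reasoning block.
	llm := ReasoningConfig{DisableReasoning: true}
	raw := "<think>plan the answer</think>Hello."

	fmt.Printf("before: %q\n", stripReasoning(raw, llm))
	fmt.Printf("after:  %q\n", stripReasoning(raw, spokenReasoningConfig(llm)))
}
```

In this sketch the "before" call returns the raw text including the reasoning block, while the "after" call returns only the spoken text. The caller's config is copied rather than mutated, so the backend suppression hint stays intact for the LLM request.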

Assisted-by: Claude:claude-opus-4-8 go test, golangci-lint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-05 14:03:36 +00:00

anthropic

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

elevenlabs

feat(tts): support per-request instructions and params (#10172)

2026-06-04 11:45:02 +02:00

explorer

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

jina

feat(whisper): honor client cancellation via ggml abort_callback (#9710)