mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-06 07:46:15 -04:00
Per review (richiejp): the sentence segmenter pipelined unary TTS by splitting on ASCII .!?/newline, which does nothing for languages without those boundaries (CJK/Thai); there it already degraded to buffering the whole message anyway.

Replace it with a uniform model: stream the LLM transcript live, buffer the full message, then synthesize it once. emitSpeech already streams the audio chunks when the backend implements TTSStream and falls back to a single unary delta otherwise, so this is real streaming TTS where supported and a clean whole-message synthesis elsewhere, with no per-sentence emulation and no language assumptions.

speechStreamer becomes transcriptStreamer (transcript deltas only); the whole-message synthesis moves into streamLLMResponse.

Assisted-by: Claude:claude-opus-4-8 go test, golangci-lint
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
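
The model described in this commit message (forward transcript deltas live, buffer the full message, synthesize once, and stream audio only where the backend supports it) can be sketched roughly as below. This is an illustrative sketch, not LocalAI's code: Synthesizer, StreamSynthesizer, speak, respond, run, unaryTTS and streamTTS are all hypothetical names invented for the example, and the real emitSpeech, TTSStream, transcriptStreamer and streamLLMResponse signatures are not shown in this message and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// Synthesizer is a hypothetical unary TTS backend: one request, one audio blob.
type Synthesizer interface {
	Synthesize(text string) ([]byte, error)
}

// StreamSynthesizer is a hypothetical optional capability: audio arrives in chunks.
type StreamSynthesizer interface {
	SynthesizeStream(text string, emit func(chunk []byte) error) error
}

// speak synthesizes the whole message once. It streams chunks when the backend
// supports it and otherwise falls back to a single unary audio delta.
func speak(s Synthesizer, text string, emit func([]byte) error) error {
	if ss, ok := s.(StreamSynthesizer); ok {
		return ss.SynthesizeStream(text, emit)
	}
	audio, err := s.Synthesize(text)
	if err != nil {
		return err
	}
	return emit(audio)
}

// respond forwards each transcript delta as it arrives, buffers the full
// message, and only then hands it to speak. No sentence splitting is done,
// so no assumptions are made about the language of the text.
func respond(tokens <-chan string, s Synthesizer, onText func(string), onAudio func([]byte) error) error {
	var full strings.Builder
	for tok := range tokens {
		onText(tok)
		full.WriteString(tok)
	}
	if full.Len() == 0 {
		return nil // nothing to synthesize
	}
	return speak(s, full.String(), onAudio)
}

// unaryTTS is a fake backend that returns the text bytes as "audio".
type unaryTTS struct{}

func (unaryTTS) Synthesize(text string) ([]byte, error) { return []byte(text), nil }

// streamTTS is a fake streaming backend that emits one chunk per
// whitespace-separated field, purely to make chunking observable.
type streamTTS struct{ unaryTTS }

func (streamTTS) SynthesizeStream(text string, emit func([]byte) error) error {
	for _, w := range strings.Fields(text) {
		if err := emit([]byte(w)); err != nil {
			return err
		}
	}
	return nil
}

// run feeds parts through respond and counts transcript deltas and audio chunks.
func run(s Synthesizer, parts ...string) (deltas, chunks int, err error) {
	tokens := make(chan string, len(parts))
	for _, p := range parts {
		tokens <- p
	}
	close(tokens)
	err = respond(tokens, s,
		func(string) { deltas++ },
		func([]byte) error { chunks++; return nil })
	return deltas, chunks, err
}

func main() {
	for _, s := range []Synthesizer{unaryTTS{}, streamTTS{}} {
		d, c, err := run(s, "你好。", " How", " are you?")
		fmt.Println(d, c, err)
	}
}
```

In this sketch the caller never needs to know which kind of backend it has: the type assertion in speak picks the streaming path when available, and the transcript side is unaffected either way.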