LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-31 18:38:23 -04:00

Files

LocalAI [bot] 46ba70632b fix(crispasr): write piper TTS WAV at the model's native sample rate (#10277)

CrispASR's piper backend returns PCM at the voice's native rate (from the GGUF
piper.sample_rate key: 16 kHz for x_low/low, 22.05 kHz for medium/high) and does
not resample, but the Go WAV encoder hardcoded 24000 Hz. Every piper voice was
therefore written with a wrong header and played back at the wrong pitch/speed.

Read piper.sample_rate from the model's GGUF metadata at Load via the vendored
gguf-parser-go and use it for the WAV header, falling back to the 24 kHz default
for the other CrispASR TTS engines (vibevoice/orpheus/chatterbox/qwen3-tts) that
emit 24 kHz and carry no such key.

Adds unit specs (minimal crafted GGUFs + WAV-header decode) and an env-gated
end-to-end spec (CRISPASR_PIPER_MODEL_PATH). Verified e2e: en_GB-cori-medium
synthesizes a 22050 Hz WAV through backend:piper.


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-12 23:10:17 +02:00

acestep-cpp

chore(acestep-cpp): bump pin to ed53caf and adapt wrapper to new API (#9908)

2026-05-20 21:05:32 +00:00

cloud-proxy

fix(distributed): self-heal stale 'model not loaded' routing (#10181)

2026-06-05 09:01:36 +02:00

crispasr

fix(crispasr): write piper TTS WAV at the model's native sample rate (#10277)

2026-06-12 23:10:17 +02:00

llm/llama

feat: add distributed mode (#9124)