LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-08-02 03:20:12 -04:00

Files

LocalAI [bot] de2ce74bea fix(stablediffusion-ggml): mux LTX-2 audio into output MP4 (#9990)

feat(stablediffusion-ggml): mux LTX-2 audio into output MP4

sd.cpp's generate_video now returns a sd_audio_t* alongside the video
frames for models with an audio VAE (LTX-2.3). Our gosd wrapper was
already collecting that pointer but immediately freed it without ever
muxing it into the output, so LTX-2 generations landed as silent MP4s
even though the audio VAE decode succeeded.

Stage the planar float32 waveform to a temp WAV (IEEE float, header
hand-built; samples interleaved on the fly), then add it as a second
ffmpeg input with -c:a aac -map 0:v:0 -map 1:a:0 -shortest. The temp
WAV is cleaned up unconditionally after ffmpeg exits, including on
the write/waitpid error paths.
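The WAV staging step can be sketched as follows. This is a minimal illustration, not the code in gosd.cpp: the helper names (put_le, put_tag, build_float_wav) are hypothetical, the planar buffer is assumed to be channel-major (all frames of channel 0, then channel 1, and so on), and the header is the plain 44-byte form with a 16-byte fmt chunk and format tag 3 (IEEE float), without the optional fact chunk.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Append `bytes` bytes of `value` in little-endian order (WAV is little-endian).
static void put_le(std::vector<uint8_t>& out, uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i) {
        out.push_back(static_cast<uint8_t>((value >> (8 * i)) & 0xFF));
    }
}

// Append a four-character chunk tag such as "RIFF" or "data".
static void put_tag(std::vector<uint8_t>& out, const char* tag) {
    out.insert(out.end(), tag, tag + 4);
}

// Hypothetical helper: build an IEEE-float WAV image in memory from planar
// float32 samples. Assumed layout: planar[c * frames + f].
std::vector<uint8_t> build_float_wav(const float* planar, uint32_t frames,
                                     uint16_t channels, uint32_t sample_rate) {
    const uint32_t bytes_per_sample = 4;
    const uint32_t block_align = channels * bytes_per_sample;
    const uint32_t data_bytes = frames * block_align;

    std::vector<uint8_t> out;
    out.reserve(44 + data_bytes);

    put_tag(out, "RIFF");
    put_le(out, 36 + data_bytes, 4);           // file size minus the first 8 bytes
    put_tag(out, "WAVE");
    put_tag(out, "fmt ");
    put_le(out, 16, 4);                        // fmt chunk size
    put_le(out, 3, 2);                         // format tag 3 = IEEE float
    put_le(out, channels, 2);
    put_le(out, sample_rate, 4);
    put_le(out, sample_rate * block_align, 4); // bytes per second
    put_le(out, block_align, 2);
    put_le(out, 32, 2);                        // bits per sample
    put_tag(out, "data");
    put_le(out, data_bytes, 4);

    // Interleave on the fly: for each frame, emit one sample per channel.
    for (uint32_t f = 0; f < frames; ++f) {
        for (uint16_t c = 0; c < channels; ++c) {
            const float sample = planar[static_cast<size_t>(c) * frames + f];
            uint32_t bits;
            std::memcpy(&bits, &sample, sizeof bits);
            put_le(out, bits, 4);
        }
    }
    return out;
}
```

Interleaving while writing avoids allocating a second full-size float buffer just to reorder the samples.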

Non-LTX models (Wan i2v / FLF2V) keep their current behaviour: the
audio argument is nullptr, the audio-related ffmpeg flags are not
added, and no temp file is created.
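A sketch of how the audio flags might be added conditionally. The function build_ffmpeg_args and its parameters are hypothetical, and the video input arguments are passed in by the caller because the commit message does not list them; only the audio flags themselves come from the text above. An empty wav_path stands in for the nullptr audio case.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: assemble an ffmpeg argument list. The audio input and
// mux flags are appended only when a staged WAV path is supplied.
std::vector<std::string> build_ffmpeg_args(const std::vector<std::string>& video_input,
                                           const std::string& wav_path,
                                           const std::string& output_path) {
    std::vector<std::string> args = {"ffmpeg"};
    args.insert(args.end(), video_input.begin(), video_input.end());

    if (!wav_path.empty()) {
        // Second input (index 1) is the WAV; map video from input 0 and
        // audio from input 1, encode audio as AAC, stop at the shorter stream.
        const std::vector<std::string> audio = {
            "-i", wav_path,
            "-c:a", "aac",
            "-map", "0:v:0",
            "-map", "1:a:0",
            "-shortest",
        };
        args.insert(args.end(), audio.begin(), audio.end());
    }

    args.push_back(output_path);
    return args;
}
```

With an empty wav_path the argument list is the same as before the change, which matches the unchanged behaviour described for non-LTX models.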

Assisted-by: Claude:claude-opus-4-7

Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-05-25 22:40:16 +02:00

gosd.cpp

fix(stablediffusion-ggml): mux LTX-2 audio into output MP4 (#9990)

2026-05-25 22:40:16 +02:00

gosd.h

chore: add golangci-lint with new-from-merge-base baseline (#9603)

2026-04-28 22:07:44 +02:00