mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-06 15:56:06 -04:00
chore(parakeet-cpp): bump pin to banded long-audio attention (843600590)

Update PARAKEET_VERSION to mudler/parakeet.cpp@843600590f (merge of
parakeet.cpp#9). Brings NeMo rel_pos_local_attn banded/Longformer
attention with the chunk-matmul construction: long audio now uses
O(T*window) attention instead of global O(T^2), fixing the encoder OOM
on long clips (~16.6-min clip: 54GB->9.4GB peak, ~4x faster) at NeMo's
full [128,128] window. Short clips are unchanged (global path).
No C-ABI change.

Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>