LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-08-01 02:49:51 -04:00

Files

LocalAI [bot] 8c8204d3c4 feat(parakeet-cpp): enable GGML_CUDA_GRAPHS in the cublas build (#10273)

ggml leaves GGML_CUDA_GRAPHS off by default. Passing -DGGML_CUDA_GRAPHS=ON
for cublas builds lets the CUDA backend capture and replay the compute
graph for a small free speedup (about 1% measured on a GB10, never
negative). It is not gated by parakeet.cpp's CMake options, so it passes
straight through to ggml.

Assisted-by: Claude Opus 4.8 <noreply@anthropic.com>

Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-12 18:47:36 +02:00
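
The commit message above describes passing one extra CMake define for cublas builds. As a rough sketch only: the helper name `cmake_args_for`, the build-type string `cublas` as an argument, and the companion `-DGGML_CUDA=ON` flag are assumptions for illustration, not copied from this backend's Makefile. Only `-DGGML_CUDA_GRAPHS=ON` is taken from the commit message itself.

```shell
# Hypothetical helper: print the extra CMake defines for a given build type.
# Only -DGGML_CUDA_GRAPHS=ON comes from the commit message; pairing it with
# -DGGML_CUDA=ON is an assumption about how a cublas build enables CUDA.
cmake_args_for() {
  build_type="$1"
  args=""
  if [ "$build_type" = "cublas" ]; then
    # Per the commit message, parakeet.cpp's CMake options do not gate this
    # define, so it passes straight through to ggml.
    args="-DGGML_CUDA=ON -DGGML_CUDA_GRAPHS=ON"
  fi
  printf '%s\n' "$args"
}

# Show the defines a cublas build would add in this sketch.
cmake_args_for cublas
```

Any other build type yields an empty string in this sketch, which leaves ggml's default (CUDA graphs off) untouched.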

.gitignore

feat(parakeet-cpp): add NVIDIA NeMo Parakeet ASR backend (parakeet.cpp) (#10084)

2026-05-30 14:46:10 +02:00

batcher_test.go

feat(parakeet-cpp): nemotron-3.5-asr multilingual streaming model + request language support (#10199)

2026-06-06 13:53:10 +02:00

batcher.go

feat(parakeet-cpp): nemotron-3.5-asr multilingual streaming model + request language support (#10199)

2026-06-06 13:53:10 +02:00

goparakeetcpp_test.go

feat(parakeet-cpp): real segment timestamps (NeMo-faithful) (#10207)