mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-01 04:28:59 -04:00
* feat(parakeet-cpp): L0 backend scaffold, LoadModel + AudioTranscription (text) Add a Go gRPC backend that bridges LocalAI to parakeet.cpp via the flat C-API (parakeet_capi.h), loaded with purego (cgo-less, mirrors the whisper / vibevoice-cpp backends). L0 scope: - main.go: dlopen libparakeet.so (override via PARAKEET_LIBRARY), register the C-API entry points, start the gRPC server. - goparakeetcpp.go: Load (parakeet_capi_load), AudioTranscription (parakeet_capi_transcribe_path, decoder=0 = per-arch default head), Free, serialized through base.SingleThread since the C engine is a thread-unsafe singleton. char* returns are bound as uintptr so the malloc'd buffer is freed via parakeet_capi_free_string after copy. - AudioTranscriptionStream returns a clear "not implemented in L0" error (closes the channel so the server doesn't hang), wired in L2. - Makefile: clone-at-pin + cmake (PARAKEET_VERSION for bump_deps.sh), with a local-symlink dev shortcut; run.sh / package.sh mirror whisper. - Test auto-skips without PARAKEET_BACKEND_TEST_MODEL/_WAV fixtures. Builds clean (CGO_ENABLED=0), gofmt clean, test passes. The single unsafeptr vet note in goStringFromCPtr is documented and matches the whisper backend's tolerated pattern. Word/segment timestamps (L1) and cache-aware streaming (L2) follow. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(parakeet-cpp): L1 word/segment timestamps via transcribe_path_json AudioTranscription now calls parakeet_capi_transcribe_path_json and shapes the per-word / per-token timestamps into the TranscriptResult: - Bind parakeet_capi_transcribe_path_json (purego, char* as uintptr like the other returns) and register it in main.go + the test loader. - Parse the JSON document ({"text","words":[{w,start,end,conf}], "tokens":[{id,t,conf}]}) into typed structs. 
- Synthesise a single whole-clip segment (parakeet emits no native segment boundaries) spanning the first word start to the last word end; token ids populate Segment.Tokens. - Attach word-level timings only when timestamp_granularities=["word"], matching the OpenAI API (segment-level default). secondsToNanos mirrors the whisper backend's nanosecond convention. Verified end-to-end against tdt_ctc-110m (f16): both the default and word-granularity specs pass; builds clean, gofmt clean, vet shows only the one documented unsafeptr note shared with the whisper backend. Cache-aware streaming (L2) follows. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(parakeet-cpp): L2 cache-aware streaming with EOU segmentation Wire AudioTranscriptionStream to the streaming RNN-T C-API: - Bind parakeet_capi_stream_{begin,feed,finalize,free}; feed takes 16 kHz mono float PCM ([]float32 via purego) and writes *eou_out on <EOU>/<EOB>. - Decode opts.Dst to 16 kHz mono PCM (utils.AudioToWav + go-audio, same as the whisper backend), feed it in 1 s chunks, and emit each newly-finalized text run as a TranscriptStreamResponse delta. - <EOU>/<EOB> events close the current segment; a closing FinalResult carries the full transcript plus the per-utterance segments (with a whole-clip fallback segment when no EOU fired). - stream_begin returns 0 for non-streaming models, surfaced as a clear error instead of an empty stream. Honours context cancellation between chunks. Frees every malloc'd delta and the session. Verified end-to-end against realtime_eou_120m-v1 (f16): the streamed transcript matches the offline 110m reference word-for-word, deltas reconstruct the final text, and the spec passes alongside the offline specs. Builds clean, gofmt clean, vet shows only the shared documented unsafeptr note. 
Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(parakeet-cpp): L3 register backend in build/CI/gallery (whisper parity) Wire the new Go gRPC parakeet-cpp backend (parakeet.cpp ggml port of NVIDIA NeMo Parakeet ASR) into LocalAI's build/CI/gallery surfaces, matching the existing ggml whisper Go backend 1:1. - .github/backend-matrix.yml: add 11 linux entries + 1 darwin entry mirroring every whisper build (cpu amd64/arm64, intel sycl f32/f16, vulkan amd64/arm64, nvidia cuda-12, nvidia cuda-13, nvidia-l4t-arm64, nvidia-l4t-cuda-13-arm64, rocm hipblas, metal-darwin-arm64), all on ./backend/Dockerfile.golang with backend: "parakeet-cpp" and -*-parakeet-cpp tag-suffixes. - scripts/changed-backends.js: explicit inferBackendPath branch resolving parakeet-cpp to backend/go/parakeet-cpp/ before the generic golang branch. - .github/workflows/bump_deps.yaml: track the PARAKEET_VERSION pin in backend/go/parakeet-cpp/Makefile (repo mudler/parakeet.cpp, branch master). - backend/index.yaml: add &parakeetcpp meta + latest/development image entries for every matrix tag-suffix. - Makefile: add backends/parakeet-cpp to .NOTPARALLEL, BACKEND_PARAKEET_CPP definition, docker-build target eval, and test-extra-backend-parakeet-cpp-transcription target (mirrors test-extra-backend-whisper-transcription). Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(parakeet-cpp): L4 gallery importer for parakeet GGUFs Add ParakeetCppImporter so parakeet.cpp GGUFs auto-detect on /import-model and route to the parakeet-cpp backend (it also surfaces in /backends/known, which drives the import dropdown). - Match is narrow: a .gguf whose name carries a parakeet architecture token (<arch>-<size>-<quant>.gguf, e.g. tdt_ctc-110m-f16.gguf, rnnt-0.6b-q4_k.gguf, realtime_eou_120m-v1-q8_0.gguf), a direct URL to one, or preferences.backend="parakeet-cpp". 
It deliberately does NOT claim arbitrary llama-style GGUFs, nor the upstream nvidia/parakeet-* NeMo repos (.nemo, not runnable here). - Registered in the ASR batch BEFORE LlamaCPPImporter so its GGUFs aren't swallowed by the generic .gguf importer. - Import nests files under parakeet-cpp/models/<name>/, defaults to the smallest quant (q4_k, near-lossless on parakeet) with a size-ladder fallback, and honours preferences.quantizations / name / description. Tested with synthetic HF details (no network): metadata, positive matches (HF repo, direct URL, preference), narrowness negatives (llama GGUF, NeMo repo), and import (default quant, override, direct URL), 9 specs pass, build/vet/gofmt clean. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs(parakeet-cpp): document the parakeet-cpp transcription backend Add parakeet-cpp to the audio-to-text backend list and a dedicated usage section: direct GGUF import (auto-detects to the backend), model YAML, word-level timestamps via timestamp_granularities[]=word, and cache-aware streaming with the realtime_eou model. Points at the mudler/parakeet-cpp-gguf collection repo. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * ci(parakeet-cpp): wire transcription gRPC e2e test into test-extra The L3 commit added the test-extra-backend-parakeet-cpp-transcription Makefile target but never invoked it in CI. Mirror the whisper job: - Add a parakeet-cpp output to detect-changes (emitted by changed-backends.js from the matrix entry). - Add tests-parakeet-cpp-grpc-transcription, gated on the parakeet-cpp path filter / run-all, building the backend image and running the transcription e2e against tdt_ctc-110m + the JFK clip. 
Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * style(parakeet-cpp): drop em dashes from comments and docs Replace em dashes with plain punctuation in the backend comments, the importer, package.sh, and the audio-to-text docs section (and use "and" instead of the multiplication sign). No behaviour change. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * feat(gallery): add parakeet-cpp f16 models to the model gallery Add the 10 NVIDIA Parakeet models (f16, the recommended quality/speed default) as gallery entries that install on the parakeet-cpp backend from mudler/parakeet-cpp-gguf: tdt_ctc-110m/1.1b, tdt-0.6b-v2/v3, tdt-1.1b, ctc-0.6b/1.1b, rnnt-0.6b/1.1b, and the cache-aware streaming realtime_eou_120m-v1. Each pins the file sha256 and routes transcript usecases to the backend. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(parakeet-cpp): satisfy govet lint + bump PARAKEET_VERSION - goparakeetcpp.go: //nolint:govet on the C-owned-pointer unsafe.Pointer conversion (golangci-lint reports new-only issues, so unlike the whisper backend's identical line this one is flagged). - Makefile: bump PARAKEET_VERSION to the current parakeet.cpp master commit (the previous pin's commit no longer exists after upstream history was squashed), so the backend image clone/build resolves again. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(parakeet-cpp): pin PARAKEET_VERSION to a tag-stable commit The previous SHA pin was orphaned when parakeet.cpp's single-commit master was amended/force-pushed, so the backend image clone (git fetch <sha>) failed across every build variant. Repoint to 845c29e, which upstream now keeps permanently fetchable via the `localai-backend-pin` tag, so future upstream amends no longer break the backend build. 
Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(parakeet-cpp): init the ggml submodule in the backend image clone The backend Dockerfile clones parakeet.cpp at PARAKEET_VERSION with a shallow fetch + checkout but never initialised submodules, so third_party/ggml was empty and the parakeet.cpp cmake build failed at `add_subdirectory(third_party/ggml)` (CMakeLists.txt:53) on every build variant. Add `git submodule update --init --recursive --depth 1 --single-branch` after checkout, mirroring the whisper backend. Verified locally: clone + submodule + cmake configure now succeeds. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(parakeet-cpp): statically link ggml into libparakeet.so The shared libparakeet.so linked ggml's shared libs (libggml*.so), but the package only ships libparakeet.so, so at runtime dlopen failed with "libggml.so.0: cannot open shared object file" (the e2e transcription test panicked on load). Build ggml static + PIC (BUILD_SHARED_LIBS=OFF, CMAKE_POSITION_INDEPENDENT_CODE=ON) so libparakeet.so embeds ggml and depends only on system libs already present in the runtime image. Verified locally: ldd shows no libggml dependency. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(parakeet-cpp): non-streaming fallback in AudioTranscriptionStream The e2e streaming test ran AudioTranscriptionStream against tdt_ctc-110m (not a cache-aware streaming model), so stream_begin returned 0 and the call errored. Per LocalAI's streaming contract (and the whisper backend), a non-streaming model should fall back to a single offline transcription emitted as one delta plus a closing FinalResult. Do that instead of erroring, so the streaming endpoint works for every parakeet model. Verified locally: the streaming spec passes against the non-streaming 110m model via fallback. 
Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
1122 lines
44 KiB
YAML
---
name: 'Tests extras backends'

on:
  pull_request:
  push:
    branches:
      - master
    tags:
      - '*'

concurrency:
  group: ci-tests-extra-${{ github.event.pull_request.number || github.sha }}-${{ github.repository }}
  cancel-in-progress: ${{ github.event_name == 'pull_request' }}

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      run-all: ${{ steps.detect.outputs.run-all }}
      transformers: ${{ steps.detect.outputs.transformers }}
      rerankers: ${{ steps.detect.outputs.rerankers }}
      diffusers: ${{ steps.detect.outputs.diffusers }}
      coqui: ${{ steps.detect.outputs.coqui }}
      moonshine: ${{ steps.detect.outputs.moonshine }}
      pocket-tts: ${{ steps.detect.outputs.pocket-tts }}
      qwen-tts: ${{ steps.detect.outputs.qwen-tts }}
      qwen-asr: ${{ steps.detect.outputs.qwen-asr }}
      nemo: ${{ steps.detect.outputs.nemo }}
      voxcpm: ${{ steps.detect.outputs.voxcpm }}
      liquid-audio: ${{ steps.detect.outputs.liquid-audio }}
      llama-cpp-quantization: ${{ steps.detect.outputs.llama-cpp-quantization }}
      llama-cpp: ${{ steps.detect.outputs.llama-cpp }}
      ik-llama-cpp: ${{ steps.detect.outputs.ik-llama-cpp }}
      turboquant: ${{ steps.detect.outputs.turboquant }}
      vllm: ${{ steps.detect.outputs.vllm }}
      sglang: ${{ steps.detect.outputs.sglang }}
      acestep-cpp: ${{ steps.detect.outputs.acestep-cpp }}
      qwen3-tts-cpp: ${{ steps.detect.outputs.qwen3-tts-cpp }}
      rfdetr-cpp: ${{ steps.detect.outputs.rfdetr-cpp }}
      vibevoice-cpp: ${{ steps.detect.outputs.vibevoice-cpp }}
      localvqe: ${{ steps.detect.outputs.localvqe }}
      voxtral: ${{ steps.detect.outputs.voxtral }}
      kokoros: ${{ steps.detect.outputs.kokoros }}
      insightface: ${{ steps.detect.outputs.insightface }}
      speaker-recognition: ${{ steps.detect.outputs.speaker-recognition }}
      sherpa-onnx: ${{ steps.detect.outputs.sherpa-onnx }}
      whisper: ${{ steps.detect.outputs.whisper }}
      parakeet-cpp: ${{ steps.detect.outputs.parakeet-cpp }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v6
      - name: Setup Bun
        uses: oven-sh/setup-bun@v2
      - name: Install dependencies
        run: bun add js-yaml @octokit/core
      - name: Detect changed backends
        id: detect
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITHUB_EVENT_PATH: ${{ github.event_path }}
        run: bun run scripts/changed-backends.js

  # Requires CUDA
  # tests-chatterbox-tts:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install build-essential ffmpeg
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         sudo apt-get install -y libopencv-dev
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1

  #     - name: Test chatterbox-tts
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/chatterbox
  #         make --jobs=5 --output-sync=target -C backend/python/chatterbox test
  tests-transformers:
    needs: detect-changes
    if: needs.detect-changes.outputs.transformers == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install build-essential ffmpeg
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          sudo apt-get install -y libopencv-dev
          pip install --user --no-cache-dir grpcio-tools==1.64.1

      - name: Test transformers
        run: |
          make --jobs=5 --output-sync=target -C backend/python/transformers
          make --jobs=5 --output-sync=target -C backend/python/transformers test
  tests-rerankers:
    needs: detect-changes
    if: needs.detect-changes.outputs.rerankers == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install build-essential ffmpeg
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          sudo apt-get install -y libopencv-dev
          pip install --user --no-cache-dir grpcio-tools==1.64.1

      - name: Test rerankers
        run: |
          make --jobs=5 --output-sync=target -C backend/python/rerankers
          make --jobs=5 --output-sync=target -C backend/python/rerankers test

  tests-diffusers:
    needs: detect-changes
    if: needs.detect-changes.outputs.diffusers == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          sudo apt-get install -y libopencv-dev
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test diffusers
        run: |
          make --jobs=5 --output-sync=target -C backend/python/diffusers
          make --jobs=5 --output-sync=target -C backend/python/diffusers test

  #tests-vllm:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install -y build-essential ffmpeg
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         sudo apt-get install -y libopencv-dev
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1
  #     - name: Test vllm backend
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/vllm
  #         make --jobs=5 --output-sync=target -C backend/python/vllm test
  # tests-transformers-musicgen:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install build-essential ffmpeg
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         sudo apt-get install -y libopencv-dev
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1

  #     - name: Test transformers-musicgen
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/transformers-musicgen
  #         make --jobs=5 --output-sync=target -C backend/python/transformers-musicgen test

  # tests-bark:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Release space from worker
  #       run: |
  #         echo "Listing top largest packages"
  #         pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
  #         head -n 30 <<< "${pkgs}"
  #         echo
  #         df -h
  #         echo
  #         sudo apt-get remove -y '^llvm-.*|^libllvm.*' || true
  #         sudo apt-get remove --auto-remove android-sdk-platform-tools || true
  #         sudo apt-get purge --auto-remove android-sdk-platform-tools || true
  #         sudo rm -rf /usr/local/lib/android
  #         sudo apt-get remove -y '^dotnet-.*|^aspnetcore-.*' || true
  #         sudo rm -rf /usr/share/dotnet
  #         sudo apt-get remove -y '^mono-.*' || true
  #         sudo apt-get remove -y '^ghc-.*' || true
  #         sudo apt-get remove -y '.*jdk.*|.*jre.*' || true
  #         sudo apt-get remove -y 'php.*' || true
  #         sudo apt-get remove -y hhvm powershell firefox monodoc-manual msbuild || true
  #         sudo apt-get remove -y '^google-.*' || true
  #         sudo apt-get remove -y azure-cli || true
  #         sudo apt-get remove -y '^mongo.*-.*|^postgresql-.*|^mysql-.*|^mssql-.*' || true
  #         sudo apt-get remove -y '^gfortran-.*' || true
  #         sudo apt-get remove -y microsoft-edge-stable || true
  #         sudo apt-get remove -y firefox || true
  #         sudo apt-get remove -y powershell || true
  #         sudo apt-get remove -y r-base-core || true
  #         sudo apt-get autoremove -y
  #         sudo apt-get clean
  #         echo
  #         echo "Listing top largest packages"
  #         pkgs=$(dpkg-query -Wf '${Installed-Size}\t${Package}\t${Status}\n' | awk '$NF == "installed"{print $1 "\t" $2}' | sort -nr)
  #         head -n 30 <<< "${pkgs}"
  #         echo
  #         sudo rm -rfv build || true
  #         sudo rm -rf /usr/share/dotnet || true
  #         sudo rm -rf /opt/ghc || true
  #         sudo rm -rf "/usr/local/share/boost" || true
  #         sudo rm -rf "$AGENT_TOOLSDIRECTORY" || true
  #         df -h
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install build-essential ffmpeg
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         sudo apt-get install -y libopencv-dev
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1

  #     - name: Test bark
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/bark
  #         make --jobs=5 --output-sync=target -C backend/python/bark test


  # The tests below need a GPU. Commented out for now.
  # TODO: Re-enable as soon as we have GPU nodes
  # tests-vllm:
  #   runs-on: ubuntu-latest
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install build-essential ffmpeg
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         sudo apt-get install -y libopencv-dev
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1
  #     - name: Test vllm
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/vllm
  #         make --jobs=5 --output-sync=target -C backend/python/vllm test

  tests-coqui:
    needs: detect-changes
    if: needs.detect-changes.outputs.coqui == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch espeak espeak-ng python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test coqui
        run: |
          make --jobs=5 --output-sync=target -C backend/python/coqui
          make --jobs=5 --output-sync=target -C backend/python/coqui test
  tests-moonshine:
    needs: detect-changes
    if: needs.detect-changes.outputs.moonshine == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test moonshine
        run: |
          make --jobs=5 --output-sync=target -C backend/python/moonshine
          make --jobs=5 --output-sync=target -C backend/python/moonshine test
  tests-pocket-tts:
    needs: detect-changes
    if: needs.detect-changes.outputs.pocket-tts == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test pocket-tts
        run: |
          make --jobs=5 --output-sync=target -C backend/python/pocket-tts
          make --jobs=5 --output-sync=target -C backend/python/pocket-tts test
  tests-qwen-tts:
    needs: detect-changes
    if: needs.detect-changes.outputs.qwen-tts == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test qwen-tts
        run: |
          make --jobs=5 --output-sync=target -C backend/python/qwen-tts
          make --jobs=5 --output-sync=target -C backend/python/qwen-tts test
  # TODO: s2-pro model is too large to load on CPU-only CI runners; re-enable
  # when we have GPU runners or a smaller test model.
  # tests-fish-speech:
  #   runs-on: ubuntu-latest
  #   timeout-minutes: 45
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install -y build-essential ffmpeg portaudio19-dev
  #         sudo apt-get install -y ca-certificates cmake curl patch python3-pip
  #         # Install UV
  #         curl -LsSf https://astral.sh/uv/install.sh | sh
  #         pip install --user --no-cache-dir grpcio-tools==1.64.1
  #     - name: Test fish-speech
  #       run: |
  #         make --jobs=5 --output-sync=target -C backend/python/fish-speech
  #         make --jobs=5 --output-sync=target -C backend/python/fish-speech test
  tests-qwen-asr:
    needs: detect-changes
    if: needs.detect-changes.outputs.qwen-asr == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg sox
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test qwen-asr
        run: |
          make --jobs=5 --output-sync=target -C backend/python/qwen-asr
          make --jobs=5 --output-sync=target -C backend/python/qwen-asr test
  tests-nemo:
    needs: detect-changes
    if: needs.detect-changes.outputs.nemo == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg sox
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test nemo
        run: |
          make --jobs=5 --output-sync=target -C backend/python/nemo
          make --jobs=5 --output-sync=target -C backend/python/nemo test
  tests-voxcpm:
    needs: detect-changes
    if: needs.detect-changes.outputs.voxcpm == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test voxcpm
        run: |
          make --jobs=5 --output-sync=target -C backend/python/voxcpm
          make --jobs=5 --output-sync=target -C backend/python/voxcpm test
  # liquid-audio: LFM2.5-Audio any-to-any backend. The CI smoke test
  # exercises Health() and LoadModel(mode:finetune); fine-tune mode
  # short-circuits before pulling weights (backend.py:192), so no
  # HuggingFace download or GPU is needed. The full-inference path is
  # gated on LIQUID_AUDIO_MODEL_ID, which we don't set here.
  tests-liquid-audio:
    needs: detect-changes
    if: needs.detect-changes.outputs.liquid-audio == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential ffmpeg
          sudo apt-get install -y ca-certificates cmake curl patch python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Test liquid-audio
        run: |
          make --jobs=5 --output-sync=target -C backend/python/liquid-audio
          make --jobs=5 --output-sync=target -C backend/python/liquid-audio test
  tests-llama-cpp-quantization:
    needs: detect-changes
    if: needs.detect-changes.outputs.llama-cpp-quantization == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl git python3-pip
          # Install UV
          curl -LsSf https://astral.sh/uv/install.sh | sh
          pip install --user --no-cache-dir grpcio-tools==1.64.1
      - name: Build llama-quantize from llama.cpp
        run: |
          git clone --depth 1 https://github.com/ggml-org/llama.cpp.git /tmp/llama.cpp
          cmake -B /tmp/llama.cpp/build -S /tmp/llama.cpp -DGGML_NATIVE=OFF
          cmake --build /tmp/llama.cpp/build --target llama-quantize -j$(nproc)
          sudo cp /tmp/llama.cpp/build/bin/llama-quantize /usr/local/bin/
      - name: Install backend
        run: |
          make --jobs=5 --output-sync=target -C backend/python/llama-cpp-quantization
      - name: Test llama-cpp-quantization
        run: |
          make --jobs=5 --output-sync=target -C backend/python/llama-cpp-quantization test
  tests-llama-cpp-grpc:
    needs: detect-changes
    if: needs.detect-changes.outputs.llama-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build llama-cpp backend image and run gRPC e2e tests
        run: |
          make test-extra-backend-llama-cpp
  tests-llama-cpp-grpc-transcription:
    needs: detect-changes
    if: needs.detect-changes.outputs.llama-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build llama-cpp backend image and run audio transcription gRPC e2e tests
        run: |
          make test-extra-backend-llama-cpp-transcription
  # PR-acceptance smoke gate: always runs on every PR (no detect-changes gate, no
  # paths filter). Pulls the pre-built master CPU llama-cpp image from quay
  # instead of building from source, so the cost is a docker pull (~30s) plus the
  # short Qwen3-0.6B model download. Exercises the full gRPC surface (health,
  # load, predict, stream) plus the logprobs/logit_bias specs that moved out of
  # core/http/app_test.go. Anything heavier or per-backend is gated to the
  # detect-changes path-filter above.
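  #
  # Local reproduction sketch. Assumptions: Docker and make are available on
  # the host, and the harness reads the same two variables as the CI step
  # below. Nothing here is run by the workflow itself.
  #
  #   docker pull quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp
  #   BACKEND_IMAGE=quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp \
  #   BACKEND_TEST_CAPS=health,load,predict,stream,logprobs,logit_bias \
  #     make test-extra-backend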
  tests-llama-cpp-smoke:
    runs-on: ubuntu-latest
    timeout-minutes: 20
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Pull pre-built llama-cpp backend image
        run: docker pull quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp
      - name: Run e2e-backends smoke
        env:
          BACKEND_IMAGE: quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp
          BACKEND_TEST_CAPS: health,load,predict,stream,logprobs,logit_bias
        run: |
          make test-extra-backend
  # Realtime e2e with sherpa-onnx driving VAD + STT + TTS against a mocked LLM.
  # Builds the sherpa-onnx Docker image, extracts the rootfs so the e2e suite
  # can discover the backend binary + shared libs, downloads the three model
  # bundles (silero-vad, omnilingual-asr, vits-ljs) and drives the realtime
  # websocket spec end-to-end.
  tests-sherpa-onnx-realtime:
    needs: detect-changes
    if: needs.detect-changes.outputs.sherpa-onnx == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Setup Node.js
        uses: actions/setup-node@v6
        with:
          node-version: '22'
      - name: Build sherpa-onnx backend image and run realtime e2e tests
        run: |
          make test-extra-e2e-realtime-sherpa
  # Streaming ASR via the sherpa-onnx online recognizer (zipformer
  # transducer). Exercises both AudioTranscription (buffered) and
  # AudioTranscriptionStream (real-time deltas) on the e2e-backends
  # harness.
  tests-sherpa-onnx-grpc-transcription:
    needs: detect-changes
    if: needs.detect-changes.outputs.sherpa-onnx == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build sherpa-onnx backend image and run streaming ASR gRPC e2e tests
        run: |
          make test-extra-backend-sherpa-onnx-transcription
  # End-to-end transcription via the e2e-backends gRPC harness against
  # the whisper.cpp backend. Drives AudioTranscription (offline) and
  # AudioTranscriptionStream (real, segment-callback-driven deltas) on
  # ggml-base.en + the JFK 11s clip.
  tests-whisper-grpc-transcription:
    needs: detect-changes
    if: needs.detect-changes.outputs.whisper == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build whisper backend image and run transcription gRPC e2e tests
        run: |
          make test-extra-backend-whisper-transcription
  # Parakeet ASR via the parakeet-cpp backend (C++/ggml port of NeMo
  # Parakeet). Drives AudioTranscription (offline, with word timestamps) on
  # tdt_ctc-110m + the JFK 11s clip.
  tests-parakeet-cpp-grpc-transcription:
    needs: detect-changes
    if: needs.detect-changes.outputs.parakeet-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build parakeet-cpp backend image and run transcription gRPC e2e tests
        run: |
          make test-extra-backend-parakeet-cpp-transcription
  # VITS TTS via the sherpa-onnx backend. Drives both TTS (file write) and
  # TTSStream (PCM chunks) on the e2e-backends harness.
  tests-sherpa-onnx-grpc-tts:
    needs: detect-changes
    if: needs.detect-changes.outputs.sherpa-onnx == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build sherpa-onnx backend image and run TTS gRPC e2e tests
        run: |
          make test-extra-backend-sherpa-onnx-tts
  tests-ik-llama-cpp-grpc:
    needs: detect-changes
    if: needs.detect-changes.outputs.ik-llama-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build ik-llama-cpp backend image and run gRPC e2e tests
        run: |
          make test-extra-backend-ik-llama-cpp
  tests-turboquant-grpc:
    needs: detect-changes
    if: needs.detect-changes.outputs.turboquant == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      # Exercises the turboquant (llama.cpp fork) backend with KV-cache
      # quantization enabled. The convenience target sets
      # BACKEND_TEST_CACHE_TYPE_K / _V=q8_0, which are plumbed into the
      # ModelOptions.CacheTypeKey/Value gRPC fields. LoadModel-success +
      # backend stdout/stderr (captured by the Ginkgo suite) prove the
      # cache-type config path reaches the fork's KV-cache init.
      - name: Build turboquant backend image and run gRPC e2e tests
        run: |
          make test-extra-backend-turboquant
  # tests-vllm-grpc is currently disabled in CI.
  #
  # The prebuilt vllm CPU wheel is compiled with AVX-512 VNNI/BF16
  # instructions, and neither ubuntu-latest nor the bigger-runner pool
  # offers a stable CPU baseline that supports them: runners come
  # back with different hardware between runs and SIGILL on import of
  # vllm.model_executor.models.registry. Compiling vllm from source
  # via FROM_SOURCE=true works on any CPU but takes 30-50 minutes per
  # run, which is too slow for a smoke test.
  #
  # The test itself (tests/e2e-backends + make test-extra-backend-vllm)
  # is fully working and validated locally on a host with the right
  # SIMD baseline. Run it manually with:
  #
  #   make test-extra-backend-vllm
  #
  # Re-enable this job once we have a self-hosted runner label with
  # guaranteed AVX-512 VNNI/BF16 support, or once the vllm project
  # publishes a CPU wheel with a wider baseline.
  #
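  # Before a manual run, the host CPU can be checked for the required
  # extensions. A minimal sketch, assuming a Linux host where /proc/cpuinfo
  # reports them under the kernel flag names avx512_vnni and avx512_bf16:
  #
  #   grep -o -w -E 'avx512_vnni|avx512_bf16' /proc/cpuinfo | sort -u
  #
  # Both names should be printed; if either is missing, expect the SIGILL
  # described above.
  #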
  # tests-vllm-grpc:
  #   needs: detect-changes
  #   if: needs.detect-changes.outputs.vllm == 'true' || needs.detect-changes.outputs.run-all == 'true'
  #   runs-on: bigger-runner
  #   timeout-minutes: 90
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install -y --no-install-recommends \
  #           make build-essential curl unzip ca-certificates git tar
  #     - name: Setup Go
  #       uses: actions/setup-go@v5
  #       with:
  #         go-version: '1.25.4'
  #     - name: Free disk space
  #       run: |
  #         sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /opt/hostedtoolcache/CodeQL || true
  #         df -h
  #     - name: Build vllm (cpu) backend image and run gRPC e2e tests
  #       run: |
  #         make test-extra-backend-vllm
  # tests-sglang-grpc is currently disabled in CI for the same reason as
  # tests-vllm-grpc: sglang's CPU kernel (sgl-kernel) uses __m512 AVX-512
  # intrinsics unconditionally in shm.cpp, so the from-source build
  # requires `-march=sapphirerapids` (already set in install.sh) and the
  # resulting binary SIGILLs at import on CPUs without AVX-512 VNNI/BF16.
  # The ubuntu-latest runner pool does not guarantee that ISA baseline.
  #
  # The test itself (tests/e2e-backends + make test-extra-backend-sglang)
  # is fully working and validated locally on a host with the right
  # SIMD baseline. Run it manually with:
  #
  #   make test-extra-backend-sglang
  #
  # Re-enable this job once we have a self-hosted runner label with
  # guaranteed AVX-512 VNNI/BF16 support.
  #
  # tests-sglang-grpc:
  #   needs: detect-changes
  #   if: needs.detect-changes.outputs.sglang == 'true' || needs.detect-changes.outputs.run-all == 'true'
  #   runs-on: bigger-runner
  #   timeout-minutes: 90
  #   steps:
  #     - name: Clone
  #       uses: actions/checkout@v6
  #       with:
  #         submodules: true
  #     - name: Dependencies
  #       run: |
  #         sudo apt-get update
  #         sudo apt-get install -y --no-install-recommends \
  #           make build-essential curl unzip ca-certificates git tar
  #     - name: Setup Go
  #       uses: actions/setup-go@v5
  #       with:
  #         go-version: '1.25.4'
  #     - name: Free disk space
  #       run: |
  #         sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /opt/hostedtoolcache/CodeQL || true
  #         df -h
  #     - name: Build sglang (cpu) backend image and run gRPC e2e tests
  #       run: |
  #         make test-extra-backend-sglang
  tests-acestep-cpp:
    needs: detect-changes
    if: needs.detect-changes.outputs.acestep-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl libopenblas-dev ffmpeg
      - name: Setup Go
        uses: actions/setup-go@v5
      - name: Display Go version
        run: go version
      - name: Proto Dependencies
        run: |
          # Install protoc
          curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
          unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
          rm protoc.zip
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
          PATH="$PATH:$HOME/go/bin" make protogen-go
      - name: Build acestep-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/acestep-cpp
      - name: Test acestep-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/acestep-cpp test
  tests-qwen3-tts-cpp:
    needs: detect-changes
    if: needs.detect-changes.outputs.qwen3-tts-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl libopenblas-dev ffmpeg
      - name: Setup Go
        uses: actions/setup-go@v5
      - name: Display Go version
        run: go version
      - name: Proto Dependencies
        run: |
          # Install protoc
          curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
          unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
          rm protoc.zip
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
          PATH="$PATH:$HOME/go/bin" make protogen-go
      - name: Build qwen3-tts-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/qwen3-tts-cpp
      - name: Test qwen3-tts-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/qwen3-tts-cpp test
  # Per-backend smoke for rfdetr-cpp: builds the .so + Go binary and runs
  # `make -C backend/go/rfdetr-cpp test`. test.sh fetches the small (~20 MB)
  # rfdetr-nano-q8_0 GGUF from the published mudler/rfdetr-cpp-nano HF repo
  # via curl and synthesises a tiny PNG to exercise the wire protocol.
  tests-rfdetr-cpp:
    needs: detect-changes
    if: needs.detect-changes.outputs.rfdetr-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl libopenblas-dev
      - name: Setup Go
        uses: actions/setup-go@v5
      - name: Display Go version
        run: go version
      - name: Proto Dependencies
        run: |
          # Install protoc
          curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
          unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
          rm protoc.zip
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
          PATH="$PATH:$HOME/go/bin" make protogen-go
      - name: Build rfdetr-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/rfdetr-cpp
      - name: Test rfdetr-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/rfdetr-cpp test
  # Per-backend smoke for vibevoice-cpp: builds the .so + Go binary and
  # runs `make -C backend/go/vibevoice-cpp test`. test.sh auto-downloads
  # the published mudler/vibevoice.cpp-models bundle (TTS Q8_0 + ASR Q4_K
  # + tokenizer + voice) and runs the closed-loop TTS → ASR Go test.
  tests-vibevoice-cpp:
    needs: detect-changes
    if: needs.detect-changes.outputs.vibevoice-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl libopenblas-dev ffmpeg
      - name: Setup Go
        uses: actions/setup-go@v5
      - name: Display Go version
        run: go version
      - name: Proto Dependencies
        run: |
          curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
          unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
          rm protoc.zip
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
          PATH="$PATH:$HOME/go/bin" make protogen-go
      - name: Build vibevoice-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/vibevoice-cpp
      - name: Test vibevoice-cpp
        run: |
          make --jobs=5 --output-sync=target -C backend/go/vibevoice-cpp test
  # End-to-end TTS via the e2e-backends gRPC harness. Builds the
  # vibevoice-cpp Docker image and drives Backend/TTS against it with a
  # real LocalAI gRPC client.
  tests-vibevoice-cpp-grpc-tts:
    needs: detect-changes
    if: needs.detect-changes.outputs.vibevoice-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build vibevoice-cpp backend image and run TTS gRPC e2e tests
        run: |
          make test-extra-backend-vibevoice-cpp-tts
  # End-to-end transcription via the e2e-backends gRPC harness. The
  # vibevoice ASR is a 7B-param model (Q4_K weights ~10 GB on disk)
  # and the JFK 30 s decode is too heavy for a free 4-core
  # ubuntu-latest pool runner - two CI attempts got SIGTERM'd during
  # LoadModel, before the test could even progress. Use the
  # self-hosted 'bigger-runner' label (same one the GPU image builds
  # in backend.yml use) and the documented dotnet/ghc/android cache
  # purge to clear ~10-20 GB of headroom for the model + Docker
  # image + working dir.
  tests-vibevoice-cpp-grpc-transcription:
    needs: detect-changes
    if: needs.detect-changes.outputs.vibevoice-cpp == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: bigger-runner
    timeout-minutes: 150
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends \
            make build-essential curl unzip ca-certificates git tar
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Free disk space
        run: |
          sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /opt/hostedtoolcache/CodeQL || true
          df -h
      - name: Build vibevoice-cpp backend image and run ASR gRPC e2e tests
        run: |
          make test-extra-backend-vibevoice-cpp-transcription
  # End-to-end audio transform via the e2e-backends gRPC harness. The
  # LocalVQE GGUF is small (~5 MB) and the model is real-time on CPU, so
  # the default ubuntu-latest pool is plenty.
  tests-localvqe-grpc-transform:
    needs: detect-changes
    if: needs.detect-changes.outputs.localvqe == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 60
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.25.4'
      - name: Build localvqe backend image and run audio_transform gRPC e2e tests
        run: |
          make test-extra-backend-localvqe-transform
  tests-voxtral:
    needs: detect-changes
    if: needs.detect-changes.outputs.voxtral == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake curl libopenblas-dev ffmpeg
      - name: Setup Go
        uses: actions/setup-go@v5
      # You can test your matrix by printing the current Go version
      - name: Display Go version
        run: go version
      - name: Proto Dependencies
        run: |
          # Install protoc
          curl -L -s https://github.com/protocolbuffers/protobuf/releases/download/v26.1/protoc-26.1-linux-x86_64.zip -o protoc.zip && \
          unzip -j -d /usr/local/bin protoc.zip bin/protoc && \
          rm protoc.zip
          go install google.golang.org/protobuf/cmd/protoc-gen-go@v1.34.2
          go install google.golang.org/grpc/cmd/protoc-gen-go-grpc@1958fcbe2ca8bd93af633f11e97d44e567e945af
          PATH="$PATH:$HOME/go/bin" make protogen-go
      - name: Build voxtral
        run: |
          make --jobs=5 --output-sync=target -C backend/go/voxtral
      - name: Test voxtral
        run: |
          make --jobs=5 --output-sync=target -C backend/go/voxtral test
  tests-kokoros:
    needs: detect-changes
    if: needs.detect-changes.outputs.kokoros == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y build-essential cmake pkg-config protobuf-compiler clang libclang-dev
          sudo apt-get install -y espeak-ng libespeak-ng-dev libsonic-dev libpcaudio-dev libopus-dev libssl-dev
          curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
          echo "$HOME/.cargo/bin" >> $GITHUB_PATH
      - name: Build kokoros
        run: |
          make -C backend/rust/kokoros kokoros-grpc
      - name: Test kokoros
        run: |
          make -C backend/rust/kokoros test
  tests-insightface-grpc:
    needs: detect-changes
    if: needs.detect-changes.outputs.insightface == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends \
            make build-essential curl unzip ca-certificates git tar
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.26.0'
      - name: Free disk space
        run: |
          sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /opt/hostedtoolcache/CodeQL || true
          df -h
      - name: Build insightface backend image and run both model configurations
        run: |
          make test-extra-backend-insightface-all
  tests-speaker-recognition-grpc:
    needs: detect-changes
    if: needs.detect-changes.outputs.speaker-recognition == 'true' || needs.detect-changes.outputs.run-all == 'true'
    runs-on: ubuntu-latest
    timeout-minutes: 90
    steps:
      - name: Clone
        uses: actions/checkout@v6
        with:
          submodules: true
      - name: Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends \
            make build-essential curl ca-certificates git tar
      - name: Setup Go
        uses: actions/setup-go@v5
        with:
          go-version: '1.26.0'
      - name: Free disk space
        run: |
          sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /opt/hostedtoolcache/CodeQL || true
          df -h
      - name: Build speaker-recognition backend image and run the ECAPA-TDNN configuration
        run: |
          make test-extra-backend-speaker-recognition-all