mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-03 13:36:30 -04:00
refactor(tests): split app_test.go, move real-backend coverage to e2e-backends
core/http/app_test.go had grown to 1495 lines exercising three concerns at
once: HTTP-layer integration, real-backend inference (llama-gguf, tts,
stablediffusion, transformers embeddings, whisper), and service logic that
already has unit-level coverage. Each PR paid for 6 backend builds plus
real-model downloads to satisfy a single suite.
Reorg per layer:
- app_test.go (1495 -> 1003 lines) drives the mock-backend binary only.
Kept: auth, routing, gallery API, file:// import, /system, agent-jobs
HTTP plumbing, config-file model loading. Deleted real-inference specs
(llama-gguf chat, ggml completions/streaming, logprobs, logit_bias,
transcription, embeddings, External-gRPC, Stores duplicate, Model gallery
Context). Lifted Agent Jobs out of the deleted Stores Context.
- tests/e2e-backends/backend_test.go gains logprobs, logit_bias, and
no-first-token-dup specs (the latter folded into PredictStream). Two
new caps gate them so non-LLM backends opt out.
- tests/e2e-aio/e2e_test.go gains a streaming smoke under Context("text")
to catch container-level streaming regressions.
- tests/models_fixtures/ removed; all fixtures referenced testmodel.ggml.
app_test.go now writes per-Context inline mock-model YAMLs.
CI:
- test.yml + tests-e2e.yml gain paths-ignore (docs/, examples/, *.md,
backend/) so docs and backend-only PRs skip them. test.yml drops the
6-backend Build step plus TRANSFORMER_BACKEND/GO_TAGS=tts; tests-apple
drops the llama-cpp-darwin build.
- New tests-aio.yml runs the AIO container nightly + on workflow_dispatch
+ master/tags. The tests-e2e-container job moved out of test.yml so PRs
no longer pay AIO cost.
- New tests-llama-cpp-smoke job in test-extra.yml runs on every PR with
no detect-changes gate; pulls quay.io/go-skynet/local-ai-backends:
master-cpu-llama-cpp (no build on PR) and exercises predict/stream/
logprobs/logit_bias against Qwen3-0.6B. This is the PR-acceptance
real-backend gate after AIO moved to nightly. The path-gated heavy
test-extra-backend-llama-cpp wrapper appends the same caps so it
exercises the moved specs when the backend actually changes.
Makefile:
- Deleted test-models/testmodel.ggml (the wget chain), test-llama-gguf,
test-tts, test-stablediffusion, test-realtime-models. test target
drops --label-filter, HUGGINGFACE_GRPC, TRANSFORMER_BACKEND, TEST_DIR,
FIXTURES, CONFIG_FILE, MODELS_PATH, BACKENDS_PATH; depends on
build-mock-backend. test-stores keeps a focused entry point and depends
on backends/local-store. clean-tests also clears the mock-backend
binary.
Net per typical Go-side PR: ~25min (6 backend builds + tests + AIO) +
~8min e2e drops to ~5min mock-backend test + ~8min e2e + ~5-10min
llama-cpp-smoke (image pulled). Docs and backend-only PRs skip the
always-on workflows entirely.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash]
This commit is contained in:
27
.github/workflows/test-extra.yml
vendored
27
.github/workflows/test-extra.yml
vendored
@@ -507,6 +507,33 @@ jobs:
|
||||
- name: Build llama-cpp backend image and run audio transcription gRPC e2e tests
|
||||
run: |
|
||||
make test-extra-backend-llama-cpp-transcription
|
||||
# PR-acceptance smoke gate: always runs on every PR (no detect-changes gate, no
|
||||
# paths filter). Pulls the pre-built master CPU llama-cpp image from quay
|
||||
# instead of building from source, so the cost is a docker pull (~30s) plus the
|
||||
# short Qwen3-0.6B model download. Exercises the full gRPC surface — health,
|
||||
# load, predict, stream — plus the logprobs/logit_bias specs that moved out of
|
||||
# core/http/app_test.go. Anything heavier or per-backend is gated to the
|
||||
# detect-changes path-filter above.
|
||||
tests-llama-cpp-smoke:
|
||||
runs-on: ubuntu-latest
|
||||
timeout-minutes: 20
|
||||
steps:
|
||||
- name: Clone
|
||||
uses: actions/checkout@v6
|
||||
with:
|
||||
submodules: true
|
||||
- name: Setup Go
|
||||
uses: actions/setup-go@v5
|
||||
with:
|
||||
go-version: '1.25.4'
|
||||
- name: Pull pre-built llama-cpp backend image
|
||||
run: docker pull quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp
|
||||
- name: Run e2e-backends smoke
|
||||
env:
|
||||
BACKEND_IMAGE: quay.io/go-skynet/local-ai-backends:master-cpu-llama-cpp
|
||||
BACKEND_TEST_CAPS: health,load,predict,stream,logprobs,logit_bias
|
||||
run: |
|
||||
make test-extra-backend
|
||||
# Realtime e2e with sherpa-onnx driving VAD + STT + TTS against a mocked LLM.
|
||||
# Builds the sherpa-onnx Docker image, extracts the rootfs so the e2e suite
|
||||
# can discover the backend binary + shared libs, downloads the three model
|
||||
|
||||
Reference in New Issue
Block a user