Files
LocalAI/Makefile
Ettore Di Giacinto a0317d9926 refactor(tests): split app_test.go, move real-backend coverage to e2e-backends
core/http/app_test.go had grown to 1495 lines exercising three concerns at
once: HTTP-layer integration, real-backend inference (llama-gguf, tts,
stablediffusion, transformers embeddings, whisper), and service logic that
already has unit-level coverage. Each PR paid for 6 backend builds plus
real-model downloads to satisfy a single suite.

Reorg per layer:

- app_test.go (1495 -> 1003 lines) drives the mock-backend binary only.
  Kept: auth, routing, gallery API, file:// import, /system, agent-jobs
  HTTP plumbing, config-file model loading. Deleted real-inference specs
  (llama-gguf chat, ggml completions/streaming, logprobs, logit_bias,
  transcription, embeddings, External-gRPC, Stores duplicate, Model gallery
  Context). Lifted Agent Jobs out of the deleted Stores Context.
- tests/e2e-backends/backend_test.go gains logprobs, logit_bias, and
  no-first-token-dup specs (the latter folded into PredictStream). Two
  new caps gate them so non-LLM backends opt out.
- tests/e2e-aio/e2e_test.go gains a streaming smoke under Context("text")
  to catch container-level streaming regressions.
- tests/models_fixtures/ removed; all fixtures referenced testmodel.ggml.
  app_test.go now writes per-Context inline mock-model YAMLs.

CI:

- test.yml + tests-e2e.yml gain paths-ignore (docs/, examples/, *.md,
  backend/) so docs and backend-only PRs skip them. test.yml drops the
  6-backend Build step plus TRANSFORMER_BACKEND/GO_TAGS=tts; tests-apple
  drops the llama-cpp-darwin build.
- New tests-aio.yml runs the AIO container nightly + on workflow_dispatch
  + master/tags. The tests-e2e-container job moved out of test.yml so PRs
  no longer pay AIO cost.
- New tests-llama-cpp-smoke job in test-extra.yml runs on every PR with
  no detect-changes gate; pulls quay.io/go-skynet/local-ai-backends:
  master-cpu-llama-cpp (no build on PR) and exercises predict/stream/
  logprobs/logit_bias against Qwen3-0.6B. This is the PR-acceptance
  real-backend gate after AIO moved to nightly. The path-gated heavy
  test-extra-backend-llama-cpp wrapper appends the same caps so it
  exercises the moved specs when the backend actually changes.

Makefile:

- Deleted test-models/testmodel.ggml (the wget chain), test-llama-gguf,
  test-tts, test-stablediffusion, test-realtime-models. test target
  drops --label-filter, HUGGINGFACE_GRPC, TRANSFORMER_BACKEND, TEST_DIR,
  FIXTURES, CONFIG_FILE, MODELS_PATH, BACKENDS_PATH; depends on
  build-mock-backend. test-stores keeps a focused entry point and depends
  on backends/local-store. clean-tests also clears the mock-backend
  binary.

Net per typical Go-side PR: ~25min (6 backend builds + tests + AIO) +
~8min e2e drops to ~5min mock-backend test + ~8min e2e + ~5-10min
llama-cpp-smoke (image pulled). Docs and backend-only PRs skip the
always-on workflows entirely.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash]
2026-04-27 23:09:20 +00:00

53 KiB