Commit Graph

4 Commits

Author SHA1 Message Date
Ettore Di Giacinto
a0317d9926 refactor(tests): split app_test.go, move real-backend coverage to e2e-backends
core/http/app_test.go had grown to 1495 lines exercising three concerns at
once: HTTP-layer integration, real-backend inference (llama-gguf, tts,
stablediffusion, transformers embeddings, whisper), and service logic that
already has unit-level coverage. Each PR paid for 6 backend builds plus
real-model downloads to satisfy a single suite.

Reorg per layer:

- app_test.go (1495 -> 1003 lines) drives the mock-backend binary only.
  Kept: auth, routing, gallery API, file:// import, /system, agent-jobs
  HTTP plumbing, config-file model loading. Deleted real-inference specs
  (llama-gguf chat, ggml completions/streaming, logprobs, logit_bias,
  transcription, embeddings, External-gRPC, Stores duplicate, Model gallery
  Context). Lifted Agent Jobs out of the deleted Stores Context.
- tests/e2e-backends/backend_test.go gains logprobs, logit_bias, and
  no-first-token-dup specs (the latter folded into PredictStream). Two
  new caps gate them so non-LLM backends opt out.
- tests/e2e-aio/e2e_test.go gains a streaming smoke under Context("text")
  to catch container-level streaming regressions.
- tests/models_fixtures/ removed; all fixtures referenced testmodel.ggml.
  app_test.go now writes per-Context inline mock-model YAMLs.
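
The capability gating described for tests/e2e-backends can be pictured with a small Go sketch. It is illustrative only: the environment variable name BACKEND_TEST_CAPS, the comma-separated format, and the helpers parseCaps and shouldRun are assumptions made for this example, not code taken from the suite.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// parseCaps turns a comma-separated list such as "predict,stream,logprobs"
// into a set. The list format and the cap names are assumptions.
func parseCaps(raw string) map[string]bool {
	caps := make(map[string]bool)
	for _, c := range strings.Split(raw, ",") {
		c = strings.ToLower(strings.TrimSpace(c))
		if c != "" {
			caps[c] = true
		}
	}
	return caps
}

// shouldRun reports whether a spec that needs every cap in required may run.
// A backend that does not declare a cap opts out of the specs gated on it.
func shouldRun(caps map[string]bool, required ...string) bool {
	for _, r := range required {
		if !caps[r] {
			return false
		}
	}
	return true
}

func main() {
	// BACKEND_TEST_CAPS is a hypothetical variable name used for illustration.
	caps := parseCaps(os.Getenv("BACKEND_TEST_CAPS"))
	for _, spec := range []string{"predict", "stream", "logprobs", "logit_bias"} {
		fmt.Printf("%s: run=%v\n", spec, shouldRun(caps, spec))
	}
}
```

In a Ginkgo suite, a spec would typically call Skip when shouldRun returns false, so a TTS or image backend that never declares logprobs or logit_bias simply reports those specs as skipped.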

CI:

- test.yml + tests-e2e.yml gain paths-ignore (docs/, examples/, *.md,
  backend/) so docs and backend-only PRs skip them. test.yml drops the
  6-backend Build step plus TRANSFORMER_BACKEND/GO_TAGS=tts; tests-apple
  drops the llama-cpp-darwin build.
- New tests-aio.yml runs the AIO container nightly + on workflow_dispatch
  + master/tags. The tests-e2e-container job moved out of test.yml so PRs
  no longer pay AIO cost.
- New tests-llama-cpp-smoke job in test-extra.yml runs on every PR with
  no detect-changes gate; pulls quay.io/go-skynet/local-ai-backends:
  master-cpu-llama-cpp (no build on PR) and exercises predict/stream/
  logprobs/logit_bias against Qwen3-0.6B. This is the PR-acceptance
  real-backend gate after AIO moved to nightly. The path-gated heavy
  test-extra-backend-llama-cpp wrapper appends the same caps so it
  exercises the moved specs when the backend actually changes.
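
The paths-ignore behaviour relied on above can be sketched in Go. GitHub Actions skips a workflow only when every changed file matches a paths-ignore pattern; one non-matching file is enough to run it. The matcher below is a simplification and an assumption: it uses prefix and suffix checks instead of GitHub's glob syntax, treats any .md file as ignored regardless of depth, and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// ignoredPrefixes mirrors the directories named in the commit message.
var ignoredPrefixes = []string{"docs/", "examples/", "backend/"}

// ignored reports whether a single changed path falls under the ignore list.
// Simplification: real workflows use glob patterns, not prefix/suffix checks.
func ignored(path string) bool {
	if strings.HasSuffix(path, ".md") {
		return true
	}
	for _, p := range ignoredPrefixes {
		if strings.HasPrefix(path, p) {
			return true
		}
	}
	return false
}

// workflowSkipped is true only when every changed file is ignored.
// An empty change set is treated as "run" in this sketch.
func workflowSkipped(changed []string) bool {
	if len(changed) == 0 {
		return false
	}
	for _, f := range changed {
		if !ignored(f) {
			return false
		}
	}
	return true
}

func main() {
	docsOnly := []string{"docs/content/install.md", "README.md"}
	mixed := []string{"backend/cpp/llama-cpp/grpc-server.cpp", "core/http/app.go"}
	fmt.Println("docs-only skipped:", workflowSkipped(docsOnly))
	fmt.Println("mixed skipped:", workflowSkipped(mixed))
}
```

The point of the sketch is the all-or-nothing rule: a PR that touches backend/ and core/ together still runs the always-on workflows.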

Makefile:

- Deleted test-models/testmodel.ggml (the wget chain), test-llama-gguf,
  test-tts, test-stablediffusion, test-realtime-models. test target
  drops --label-filter, HUGGINGFACE_GRPC, TRANSFORMER_BACKEND, TEST_DIR,
  FIXTURES, CONFIG_FILE, MODELS_PATH, BACKENDS_PATH; depends on
  build-mock-backend. test-stores keeps a focused entry point and depends
  on backends/local-store. clean-tests also clears the mock-backend
  binary.

Net per typical Go-side PR: ~25min (6 backend builds + tests + AIO) plus
~8min e2e drops to ~5min mock-backend test plus ~8min e2e plus ~5-10min
llama-cpp-smoke (image pulled, not built). Docs and backend-only PRs skip
the always-on workflows entirely.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash]
2026-04-27 23:09:20 +00:00
Ettore Di Giacinto
3387bfaee0 feat(api): add support for open responses specification (#8063)
* feat: openresponses

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add ttl settings, fix tests

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix: register cors middleware by default

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* satisfy schema

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Logitbias and logprobs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add grammar

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* SSE compliance

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tool JSON conversion

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* support background mode

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* swagger

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* drop code. This is handled in the handler

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Small refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* background mode for MCP

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-01-17 22:11:47 +01:00
Ettore Di Giacinto
6410c99bf2 fix(llama-cpp): correctly calculate embeddings (#6259)
* chore(tests): check that embeddings differ in llama.cpp

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(llama.cpp): use the correct field for embedding

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(llama.cpp): use embedding type none

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(tests): add test-cases in aio-e2e suite

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2025-09-13 23:11:54 +02:00
Ettore Di Giacinto
f3f6535aad fix: rename fiber entrypoint from http/api to http/app (#2096)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Dave <dave@gray101.com>
2024-04-21 22:39:28 +02:00