LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-18 05:33:09 -04:00

Author	SHA1	Message	Date
Ettore Di Giacinto	a0317d9926	refactor(tests): split app_test.go, move real-backend coverage to e2e-backends core/http/app_test.go had grown to 1495 lines exercising three concerns at once: HTTP-layer integration, real-backend inference (llama-gguf, tts, stablediffusion, transformers embeddings, whisper), and service logic that already has unit-level coverage. Each PR paid for 6 backend builds plus real-model downloads to satisfy a single suite. Reorg per layer: - app_test.go (1495 -> 1003 lines) drives the mock-backend binary only. Kept: auth, routing, gallery API, file:// import, /system, agent-jobs HTTP plumbing, config-file model loading. Deleted real-inference specs (llama-gguf chat, ggml completions/streaming, logprobs, logit_bias, transcription, embeddings, External-gRPC, Stores duplicate, Model gallery Context). Lifted Agent Jobs out of the deleted Stores Context. - tests/e2e-backends/backend_test.go gains logprobs, logit_bias, and no-first-token-dup specs (the latter folded into PredictStream). Two new caps gate them so non-LLM backends opt out. - tests/e2e-aio/e2e_test.go gains a streaming smoke under Context("text") to catch container-level streaming regressions. - tests/models_fixtures/ removed; all fixtures referenced testmodel.ggml. app_test.go now writes per-Context inline mock-model YAMLs. CI: - test.yml + tests-e2e.yml gain paths-ignore (docs/, examples/, *.md, backend/) so docs and backend-only PRs skip them. test.yml drops the 6-backend Build step plus TRANSFORMER_BACKEND/GO_TAGS=tts; tests-apple drops the llama-cpp-darwin build. - New tests-aio.yml runs the AIO container nightly + on workflow_dispatch + master/tags. The tests-e2e-container job moved out of test.yml so PRs no longer pay AIO cost. - New tests-llama-cpp-smoke job in test-extra.yml runs on every PR with no detect-changes gate; pulls quay.io/go-skynet/local-ai-backends: master-cpu-llama-cpp (no build on PR) and exercises predict/stream/ logprobs/logit_bias against Qwen3-0.6B. This is the PR-acceptance real-backend gate after AIO moved to nightly. The path-gated heavy test-extra-backend-llama-cpp wrapper appends the same caps so it exercises the moved specs when the backend actually changes. Makefile: - Deleted test-models/testmodel.ggml (the wget chain), test-llama-gguf, test-tts, test-stablediffusion, test-realtime-models. test target drops --label-filter, HUGGINGFACE_GRPC, TRANSFORMER_BACKEND, TEST_DIR, FIXTURES, CONFIG_FILE, MODELS_PATH, BACKENDS_PATH; depends on build-mock-backend. test-stores keeps a focused entry point and depends on backends/local-store. clean-tests also clears the mock-backend binary. Net per typical Go-side PR: ~25min (6 backend builds + tests + AIO) + ~8min e2e drops to ~5min mock-backend test + ~8min e2e + ~5-10min llama-cpp-smoke (image pulled). Docs and backend-only PRs skip the always-on workflows entirely. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: claude-code:claude-opus-4-7 [Edit] [Write] [Bash]	2026-04-27 23:09:20 +00:00
Ettore Di Giacinto	3387bfaee0	feat(api): add support for open responses specification (#8063 ) * feat: openresponses Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add ttl settings, fix tests Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix: register cors middleware by default Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * satisfy schema Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Logitbias and logprobs Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Add grammar Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * SSE compliance Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * tool JSON conversion Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * support background mode Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * swagger Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * drop code. This is handled in the handler Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * Small refactorings Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * background mode for MCP Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-01-17 22:11:47 +01:00
Ettore Di Giacinto	6410c99bf2	fix(llama-cpp): correctly calculate embeddings (#6259 ) * chore(tests): check embeddings differs in llama.cpp Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(llama.cpp): use the correct field for embedding Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * fix(llama.cpp): use embedding type none Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * chore(tests): add test-cases in aio-e2e suite Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2025-09-13 23:11:54 +02:00
Ettore Di Giacinto	f3f6535aad	fix: rename fiber entrypoint from http/api to http/app (#2096 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Dave <dave@gray101.com>	2024-04-21 22:39:28 +02:00

4 Commits