LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-04-29 19:44:13 -04:00

Author	SHA1	Message	Date
LocalAI [bot]	6820ec468f	chore(model gallery): 🤖 add 1 new models via gallery agent (#9491 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-22 21:56:11 +02:00
Ettore Di Giacinto	20baec77ab	feat(face-recognition): add insightface/onnx backend for 1:1 verify, 1:N identify, embedding, detection, analysis (#9480 ) * feat(face-recognition): add insightface backend for 1:1 verify, 1:N identify, embedding, detection, analysis Adds face recognition as a new first-class capability in LocalAI via the `insightface` Python backend, with a pluggable two-engine design so non-commercial (insightface model packs) and commercial-safe (OpenCV Zoo YuNet + SFace) models share the same gRPC/HTTP surface. New gRPC RPCs (backend/backend.proto): * FaceVerify(FaceVerifyRequest) returns FaceVerifyResponse * FaceAnalyze(FaceAnalyzeRequest) returns FaceAnalyzeResponse Existing Embedding and Detect RPCs are reused (face image in PredictOptions.Images / DetectOptions.src) for face embedding and face detection respectively. New HTTP endpoints under /v1/face/: * verify — 1:1 image pair same-person decision * analyze — per-face age + gender (emotion/race reserved) * register — 1:N enrollment; stores embedding in vector store * identify — 1:N recognition; detect → embed → StoresFind * forget — remove a registered face by opaque ID Service layer (core/services/facerecognition/) introduces a `Registry` interface with one in-memory `storeRegistry` impl backed by LocalAI's existing local-store gRPC vector backend. HTTP handlers depend on the interface, not on StoresSet/StoresFind directly, so a persistent PostgreSQL/pgvector implementation can be slotted in via a single constructor change in core/application (TODO marker in the package doc). New usecase flag FLAG_FACE_RECOGNITION; insightface is also wired into FLAG_DETECTION so /v1/detection works for face bounding boxes. 
Gallery (backend/index.yaml) ships three entries: * insightface-buffalo-l — SCRFD-10GF + ArcFace R50 + genderage (~326MB pre-baked; non-commercial research use only) * insightface-opencv — YuNet + SFace (~40MB pre-baked; Apache 2.0) * insightface-buffalo-s — SCRFD-500MF + MBF (runtime download; non-commercial) Python backend (backend/python/insightface/): * engines.py — FaceEngine protocol with InsightFaceEngine and OnnxDirectEngine; resolves model paths relative to the backend directory so the same gallery config works in docker-scratch and in the e2e-backends rootfs-extraction harness. * backend.py — gRPC servicer implementing Health, LoadModel, Status, Embedding, Detect, FaceVerify, FaceAnalyze. * install.sh — pre-bakes buffalo_l + OpenCV YuNet/SFace inside the backend directory so first-run is offline-clean (the final scratch image only preserves files under /<backend>/). * test.py — parametrized unit tests over both engines. Tests: * Registry unit tests (go test -race ./core/services/facerecognition/...) — in-memory fake grpc.Backend, table-driven, covers register/ identify/forget/error paths + concurrent access. * tests/e2e-backends/backend_test.go extended with face caps (face_detect, face_embed, face_verify, face_analyze); relative ordering + configurable verifyCeiling per engine. * Makefile targets: test-extra-backend-insightface-buffalo-l, -opencv, and the -all aggregate. * CI: .github/workflows/test-extra.yml gains tests-insightface-grpc, auto-triggered by changes under backend/python/insightface/. Docs: * docs/content/features/face-recognition.md — feature page with license table, quickstart (defaults to the commercial-safe model), models matrix, API reference, 1:N workflow, storage caveats. * Cross-refs in object-detection.md, stores.md, embeddings.md, and whats-new.md. * Contributor README at backend/python/insightface/README.md. Verified end-to-end: * buffalo_l: 6/6 specs (health, load, face_detect, face_embed, face_verify, face_analyze). 
* opencv: 5/5 specs (same minus face_analyze — SFace has no demographic head; correctly skipped via BACKEND_TEST_CAPS). Assisted-by: Claude:claude-opus-4-7 * fix(face-recognition): move engine selection to model gallery, collapse backend entries The previous commit put engine/model_pack options on backend gallery entries (`backend/index.yaml`). That was wrong — `GalleryBackend` (core/gallery/backend_types.go:32) has no `options` field, so the YAML decoder silently dropped those keys and all three "different insightface-" backend entries resolved to the same container image with no distinguishing configuration. Correct split: `backend/index.yaml` now has ONE `insightface` backend entry shipping the CPU + CUDA 12 container images. The Python backend bundles both the non-commercial insightface model packs (buffalo_l / buffalo_s) and the commercial-safe OpenCV Zoo weights (YuNet + SFace); the active engine is selected at LoadModel time via `options: ["engine:..."]`. * `gallery/index.yaml` gains three model entries — `insightface-buffalo-l`, `insightface-opencv`, `insightface-buffalo-s` — each setting the appropriate `overrides.backend` + `overrides.options` so installing one actually gives the user the intended engine. This matches how `rfdetr-base` lives in the model gallery against the `rfdetr` backend. The earlier e2e tests passed despite this bug because the Makefile targets pass `BACKEND_TEST_OPTIONS` directly to LoadModel via gRPC, bypassing any gallery resolution entirely. No code changes needed. Assisted-by: Claude:claude-opus-4-7 * feat(face-recognition): cover all supported models in the gallery + drop weight baking Follows up on the model-gallery split: adds entries for every model configuration either engine actually supports, and switches weight delivery from image-baked to LocalAI's standard gallery mechanism. 
Gallery now has seven `insightface-` model entries (gallery/index.yaml): insightface (family) — non-commercial research use • buffalo-l (326MB) — SCRFD-10GF + ResNet50 + genderage, default • buffalo-m (313MB) — SCRFD-2.5GF + ResNet50 + genderage • buffalo-s (159MB) — SCRFD-500MF + MBF + genderage • buffalo-sc (16MB) — SCRFD-500MF + MBF, recognition only (no landmarks, no demographics — analyze returns empty attributes) • antelopev2 (407MB) — SCRFD-10GF + ResNet100@Glint360K + genderage OpenCV Zoo family — Apache 2.0 commercial-safe • opencv — YuNet + SFace fp32 (~40MB) • opencv-int8 — YuNet + SFace int8 (~12MB, ~3x smaller, faster on CPU) Model weights are no longer baked into the backend image. The image now ships only the Python runtime + libraries (~275MB content size, ~1.18GB disk vs ~1.21GB when weights were baked). Weights flow through LocalAI's gallery mechanism: OpenCV variants list `files:` with ONNX URIs + SHA-256, so `local-ai models install insightface-opencv` pulls them into the models directory exactly like any other gallery-managed model. * insightface packs (upstream distributes .zip archives only, not individual ONNX files) auto-download on first LoadModel via FaceAnalysis' built-in machinery, rooted at the LocalAI models directory so they live alongside everything else — same pattern `rfdetr` uses with `inference.get_model()`. Backend changes (backend/python/insightface/): * backend.py — LoadModel propagates `ModelOptions.ModelPath` (the LocalAI models directory) to engines via a `_model_dir` hint. This replaces the earlier ModelFile-dirname approach; ModelPath is the canonical "models directory" variable set by the Go loader (pkg/model/initializers.go:144) and is always populated. * engines.py::_resolve_model_path — picks up `model_dir` and searches it (plus basename-in-model-dir) before falling back to the dev script-dir. This is how OnnxDirectEngine finds gallery-downloaded YuNet/SFace files by filename only. 
* engines.py::_flatten_insightface_pack — new helper that works around an upstream packaging inconsistency: buffalo_l/s/sc zips expand flat, but buffalo_m and antelopev2 zips wrap their ONNX files in a redundant `<name>/` directory. insightface's own loader looks one level too shallow and fails. We call `ensure_available()` explicitly, flatten if nested, then hand to FaceAnalysis. * engines.py::InsightFaceEngine.prepare — root-resolution order now includes the `_model_dir` hint so packs download into the LocalAI models directory by default. * install.sh — no longer pre-downloads any weights. Everything is gallery-managed now. * smoke.py (new) — parametrized smoke test that iterates over every gallery configuration, simulating the LocalAI install flow (creates a models dir, fetches OpenCV files with checksum verification, lets insightface auto-download its packs), then runs detect + embed + verify (+ analyze where supported) through the in-process BackendServicer. * test.py — OnnxDirectEngineTest no longer hardcodes `/models/opencv/` paths; downloads ONNX files to a temp dir at setUpClass time and passes ModelPath accordingly. Registry change (core/services/facerecognition/store_registry.go): * `dim=0` in NewStoreRegistry now means "accept whatever dimension arrives" — needed because the backend supports 512-d ArcFace/MBF and 128-d SFace via the same Registry. A non-zero dim still fails fast with ErrDimensionMismatch. * core/application plumbs `faceEmbeddingDim = 0`, explaining the rationale in the comment. Backend gallery description updated to reflect that the image carries no weights — it's just Python + engines. 
Smoke-tested all 7 configurations against the rebuilt image (with the flatten fix applied), exit 0: PASS: insightface-buffalo-l faces=6 dim=512 same-dist=0.000 PASS: insightface-buffalo-sc faces=6 dim=512 same-dist=0.000 PASS: insightface-buffalo-s faces=6 dim=512 same-dist=0.000 PASS: insightface-buffalo-m faces=6 dim=512 same-dist=0.000 PASS: insightface-antelopev2 faces=6 dim=512 same-dist=0.000 PASS: insightface-opencv faces=6 dim=128 same-dist=0.000 PASS: insightface-opencv-int8 faces=6 dim=128 same-dist=0.000 7/7 passed Assisted-by: Claude:claude-opus-4-7 * fix(face-recognition): pre-fetch OpenCV ONNX for e2e target; drop stale pre-baked claim CI regression from the previous commit: I moved OpenCV Zoo weight delivery to LocalAI's gallery `files:` mechanism, but the test-extra-backend-insightface-opencv target was still passing relative paths `detector_onnx:models/opencv/yunet.onnx` in BACKEND_TEST_OPTIONS. The e2e suite drives LoadModel directly over gRPC without going through the gallery, so those relative paths resolved to nothing and OpenCV's ONNXImporter failed: LoadModel failed: Failed to load face engine: OpenCV(4.13.0) ... Can't read ONNX file: models/opencv/yunet.onnx Fix: add an `insightface-opencv-models` prerequisite target that fetches the two ONNX files (YuNet + SFace) to a deterministic host cache at /tmp/localai-insightface-opencv-cache/, verifies SHA-256, and skips the download on re-runs. The opencv test target depends on it and passes absolute paths in BACKEND_TEST_OPTIONS, so the backend finds the files via its normal absolute-path resolution branch. Also refresh the buffalo_l comment: it no longer says "pre-baked" (nothing is — the pack auto-downloads from upstream's GitHub release on first LoadModel, same as in CI). Locally verified: `make test-extra-backend-insightface-opencv` passes 5/5 specs (health, load, face_detect, face_embed, face_verify). 
Assisted-by: Claude:claude-opus-4-7 * feat(face-recognition): add POST /v1/face/embed + correct /v1/embeddings docs The docs promised that /v1/embeddings returns face vectors when you send an image data-URI. That was never true: /v1/embeddings is OpenAI-compatible and text-only by contract — its handler goes through `core/backend/embeddings.go::ModelEmbedding`, which sets `predictOptions.Embeddings = s` (a string of TEXT to embed) and never populates `predictOptions.Images[]`. The Python backend's Embedding gRPC method does handle Images[] (that's how /v1/face/register reaches it internally via `backend.FaceEmbed`), but the HTTP embeddings endpoint wasn't wired to populate it. Rather than overload /v1/embeddings with image-vs-text detection — messy, and the endpoint is OpenAI-compatible by design — add a dedicated /v1/face/embed endpoint that wraps `backend.FaceEmbed` (already used internally by /v1/face/register and /v1/face/identify). Matches LocalAI's convention of a dedicated path per non-standard flow (/v1/rerank, /v1/detection, /v1/face/verify etc.). Response: { "embedding": [<dim> floats, L2-normed], "dim": int, // 512 for ArcFace R50 / MBF, 128 for SFace "model": "<name>" } Live-tested on the opencv engine: returns a 128-d L2-normalized vector (sum(x^2) = 1.0000). Sentinel in docs updated to note /v1/embeddings is text-only and point image users at /v1/face/embed instead. Assisted-by: Claude:claude-opus-4-7 * fix(http): map malformed image input + gRPC status codes to proper 4xx Image-input failures on LocalAI's single-image endpoints (/v1/detection, /v1/face/{verify,analyze,embed,register,identify}) have historically returned 500 — even when the client was the one who sent garbage. Classic example: you POST an "image" that isn't a URL, isn't a data-URI, and isn't a valid JPEG/PNG — the server shouldn't claim that's its fault. 
Two helpers land in core/http/endpoints/localai/images.go and every single-image handler is switched over: * decodeImageInput(s) Wraps utils.GetContentURIAsBase64 and turns any failure (invalid URL, not a data-URI, download error, etc.) into echo.NewHTTPError(400, "invalid image input: ..."). * mapBackendError(err) Inspects the gRPC status on a backend call error and maps: INVALID_ARGUMENT → 400 Bad Request NOT_FOUND → 404 Not Found FAILED_PRECONDITION → 412 Precondition Failed Unimplemented → 501 Not Implemented All other codes fall through unchanged (still 500). Before, my 1×1 PNG error-path test returned: HTTP 500 "rpc error: code = InvalidArgument desc = failed to decode one or both images" After: HTTP 400 "failed to decode one or both images" Scope-limited to the LocalAI single-image endpoints. The multi-modal paths (middleware/request.go, openresponses/responses.go, openai/realtime.go) intentionally log-and-skip individual media parts when decoding fails — different design intent (graceful degradation of a multi-part message), not a 400-worthy failure. Left untouched. Live-verified: every error case in /tmp/face_errors.py now returns 4xx with a meaningful message; the "image with no face (1x1 PNG)" case specifically went from 500 → 400. Assisted-by: Claude:claude-opus-4-7 * refactor(face-recognition): insightface packs go through gallery files:, drop FaceAnalysis Follows up on the discovery that LocalAI's gallery `files:` mechanism handles archives (zip, tar.gz, …) via mholt/archiver/v3 — the rhasspy piper voices use exactly this pattern. Insightface packs are zip archives, so we can now deliver them the same way every other gallery-managed model gets delivered: declaratively, checksum-verified, through LocalAI's standard download+extract pipeline. Two changes: 1. Gallery (gallery/index.yaml) — every insightface-* entry gains a `files:` list with the pack zip's URI + SHA-256. 
`local-ai models install insightface-buffalo-l` now fetches the zip, verifies the hash, and extracts it into the models directory. No more reliance on insightface's library-internal `ensure_available()` auto-download or its hardcoded `BASE_REPO_URL`. 2. InsightFaceEngine (backend/python/insightface/engines.py) — drops the FaceAnalysis wrapper and drives insightface's `model_zoo` directly. The ~50 lines FaceAnalysis provides — glob ONNX files, route each through `model_zoo.get_model()`, build a `{taskname: model}` dict, loop per-face at inference — are reimplemented in `InsightFaceEngine`. The actual inference classes (RetinaFace, ArcFaceONNX, Attribute, Landmark) are still insightface's — we only replicate the glue, so drift risk against upstream is minimal. Why drop FaceAnalysis: it hard-codes a `<root>/models/<name>/*.onnx` layout that doesn't match what LocalAI's zip extraction produces. LocalAI unpacks archives flat into `<models_dir>`. Upstream packs are inconsistent — buffalo_l/s/sc ship ONNX at the zip root (lands at `<models_dir>/*.onnx`), buffalo_m/antelopev2 wrap in a redundant `<name>/` dir (lands at `<models_dir>/<name>/*.onnx`). The new `_locate_insightface_pack` helper searches both locations plus legacy paths and returns whichever has ONNX files. Replaces the earlier `_flatten_insightface_pack` helper (which tried to fight FaceAnalysis's layout expectations; now we just find the files wherever they are). Net effect for users: install once via LocalAI's managed flow, weights live alongside every other model, progress shows in the jobs endpoint, no first-load network call. Same API surface, cleaner plumbing. Assisted-by: Claude:claude-opus-4-7 fix(face-recognition): CI's insightface e2e path needs the pack pre-fetched The e2e suite drives LoadModel over gRPC without going through LocalAI's gallery flow, so the engine's `_model_dir` option (normally populated from ModelPath) is empty. 
Previously the insightface target relied on FaceAnalysis auto-download to paper over this, but we dropped FaceAnalysis in favor of direct model_zoo calls — so the buffalo_l target started failing at LoadModel with "no insightface pack found". Mirror the opencv target's pre-fetch pattern: download buffalo_sc.zip (same SHA as the gallery entry), extract it on the host, and pass `root:<dir>` so the engine locates the pack without needing ModelPath. Switched to buffalo_sc (smallest pack, ~16MB) to keep CI fast; it covers the same insightface engine code path as buffalo_l. Face analyze cap dropped since buffalo_sc has no age/gender head. Assisted-by: Claude:claude-opus-4-7[1m] * feat(face-recognition): surface face-recognition in advertised feature maps The six /v1/face/* endpoints were missing from every place LocalAI advertises its feature surface to clients: * api_instructions — the machine-readable capability index at GET /api/instructions. Added `face-recognition` as a dedicated instruction area with an intro that calls out the in-memory registry caveat and the /v1/face/embed vs /v1/embeddings split. * auth/permissions — added FeatureFaceRecognition constant, routed all six face endpoints through it so admins can gate them per-user like any other API feature. Default ON (matches the other API features). * React UI capabilities — CAP_FACE_RECOGNITION symbol mapped to FLAG_FACE_RECOGNITION. Declared only for now; the Face page is a follow-up (noted in the plan). Instruction count bumped 9 → 10; test updated. Assisted-by: Claude:claude-opus-4-7[1m] * docs(agents): capture advertising-surface steps in the endpoint guide Before this change, adding a new /v1/* endpoint reliably missed one or more of: the swagger @Tags annotation, the /api/instructions registry, the auth RouteFeatureRegistry, and the React UI CAP_* symbol. The endpoint would work but be invisible to API consumers, admins, and the UI — and nothing in the existing docs said to look in those places. 
Extend .agents/api-endpoints-and-auth.md with a new "Advertising surfaces" section covering all four surfaces (swagger tags, /api/instructions, capabilities.js, docs/), and expand the closing checklist so it's impossible to ship a feature without visiting each one. Hoist a one-liner reminder into AGENTS.md's Quick Reference so agents skim it before diving in. Assisted-by: Claude:claude-opus-4-7[1m]	2026-04-22 21:55:41 +02:00
LocalAI [bot]	0f3bb2d647	chore(model gallery): 🤖 add 1 new models via gallery agent (#9481 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-22 08:22:05 +02:00
LocalAI [bot]	47efaf5b43	Fix: Add model parameter to neutts-air gallery definition (#8793 ) fix: Add model parameter to neutts-air gallery definition The neutts-air model entry was missing the 'model' parameter in its configuration, which caused LocalAI to fail with an 'Unrecognized model' error when trying to use it. This change adds the required model parameter pointing to the HuggingFace repository (neuphonic/neutts-air) so the backend can properly load the model. Fixes #8792 Signed-off-by: localai-bot <localai-bot@example.com> Co-authored-by: localai-bot <localai-bot@example.com>	2026-04-21 11:56:00 +02:00
LocalAI [bot]	047bc48fa9	chore(model gallery): 🤖 add 1 new models via gallery agent (#9464 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-21 11:07:07 +02:00
sec171	01bd8ae5d0	[gallery] Fix duplicate sha256 keys in Wan models (#9461 ) Fix duplicate sha256 keys in wan models gallery The wan models previously defined the `sha256` key twice in their files lists, which triggered strict mapping key checks in the YAML parser and resulted in unmarshal errors that crashed the `/api/models` loading. This removes the redundant trailing `sha256` keys from the Wan model definitions. Assisted-by: Antigravity:Gemini-3.1-Pro-High [multi_replace_file_content, run_command] Signed-off-by: Alex <codecrusher24@gmail.com>	2026-04-21 11:06:36 +02:00
LocalAI [bot]	d9808769be	chore(model-gallery): ⬆️ update checksum (#9451 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-21 00:07:58 +02:00
Ettore Di Giacinto	8ab56e2ad3	feat(gallery): add wan i2v 720p (#9457 ) feat(gallery): add Wan 2.1 I2V 14B 720P + pin all wan ggufs by sha256 Adds a new entry for the native-720p image-to-video sibling of the 480p I2V model (wan-2.1-i2v-14b-480p-ggml). The 720p I2V model is trained purely as image-to-video — no first-last-frame interpolation path — so motion is freer than repurposing the FLF2V 720P variant as an i2v. Shares the same VAE, umt5_xxl text encoder, and clip_vision_h auxiliary files as the existing 480p I2V and 720p FLF2V entries, so no new aux downloads are introduced. Also pins the main diffusion gguf by sha256 for the new entry and for the three existing wan entries that were previously missing a hash (wan-2.1-t2v-1.3b-ggml, wan-2.1-i2v-14b-480p-ggml, wan-2.1-flf2v-14b-720p-ggml). Hashes were fetched from HuggingFace's x-linked-etag header per .agents/adding-gallery-models.md. Assisted-by: Claude:claude-opus-4-7	2026-04-20 23:34:11 +02:00
Ettore Di Giacinto	f683231811	feat(gallery): add Wan 2.1 FLF2V 14B 720P (#9440 ) First-last-frame-to-video variant of the 14B Wan family. Accepts a start and end reference image and — unlike the pure i2v path — runs both through clip_vision, so the final frame lands on the end image both in pixel and semantic space. Right pick for seamless loops (start_image == end_image) and narrative A→B cuts. Shares the same VAE, umt5_xxl text encoder, and clip_vision_h as the I2V 14B entry. Options block mirrors i2v's full-list-in-override style so the template merge doesn't drop fields. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 10:34:36 +02:00
LocalAI [bot]	960757f0e8	chore(model gallery): 🤖 add 1 new models via gallery agent (#9436 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-20 08:48:47 +02:00
LocalAI [bot]	cb77a5a4b9	chore(model gallery): 🤖 add 1 new models via gallery agent (#9425 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-20 00:42:44 +02:00
Ettore Di Giacinto	9e44944cc1	fix(i2v): Add new options to the model configuration Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-04-20 00:27:05 +02:00
Ettore Di Giacinto	b27de08fff	chore(gallery): fixup wan Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-19 21:31:22 +00:00
Ettore Di Giacinto	054c4b4b45	feat(stable-diffusion.ggml): add support for video generation (#9420 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-19 09:26:33 +02:00
LocalAI [bot]	844b0b760b	chore(model gallery): 🤖 add 1 new models via gallery agent (#9400 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-17 17:56:41 +02:00
LocalAI [bot]	55c05211d3	chore(model gallery): 🤖 add 1 new models via gallery agent (#9399 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-17 16:10:02 +02:00
LocalAI [bot]	ec5935421c	chore(model-gallery): ⬆️ update checksum (#9384 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-16 22:41:52 +02:00
Matt Van Horn	c4f309388e	fix(gallery): correct gemma-4 model URIs returning 404 (#9379 ) The gemma-4-26b-a4b-it, gemma-4-e2b-it, and gemma-4-e4b-it gallery entries pointed at files that do not exist on HuggingFace, so LocalAI fails with 404 when users try to install them. Two issues per entry: - mmproj filename uses the 'f16' quantization suffix, but ggml-org publishes the mmproj projectors as 'bf16'. - The e2b and e4b URIs hardcode lowercase 'e2b'/'e4b' in the filename component. HuggingFace file paths are case-sensitive and the real files use uppercase 'E2B'/'E4B'. Updated filename, uri, sha256, and the top-level 'mmproj' and 'parameters.model' references so every entry points at a real file and the declared hashes match the content. Verified each URI resolves (HTTP 302) and each sha256 matches the 'x-linked-etag' header returned by HuggingFace. Signed-off-by: Matt Van Horn <mvanhorn@gmail.com>	2026-04-16 08:51:20 +02:00
LocalAI [bot]	08445b1b89	chore(model-gallery): ⬆️ update checksum (#9369 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-16 01:12:01 +02:00
LocalAI [bot]	8487058673	chore(model-gallery): ⬆️ update checksum (#9358 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-15 01:25:59 +02:00
Ettore Di Giacinto	b361d2ddd6	chore(gallery): add new llama.cpp supported models (qwen-asr, ocr) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-14 10:04:50 +00:00
LocalAI [bot]	c6d5dc3374	chore(model-gallery): ⬆️ update checksum (#9346 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-13 23:00:13 +02:00
Ettore Di Giacinto	be1b8d56c9	fix(gallery): override parameters for flux kontext Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-04-13 22:29:17 +02:00
Ettore Di Giacinto	7a0e6ae6d2	feat(qwen3tts.cpp): add new backend (#9316 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-11 23:14:26 +02:00
LocalAI [bot]	7edd3ea96f	chore(model-gallery): ⬆️ update checksum (#9321 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-11 22:53:48 +02:00
thelittlefireman	7c1865b307	Fix load of z-image-turbo (#9264 ) * Fix load of z-image and improve speed Signed-off-by: thelittlefireman <5165783+thelittlefireman@users.noreply.github.com> * Remove diffusion_flash_attn from z-image-ggml.yaml Removed 'diffusion_flash_attn' parameter from configuration. Signed-off-by: thelittlefireman <5165783+thelittlefireman@users.noreply.github.com> --------- Signed-off-by: thelittlefireman <5165783+thelittlefireman@users.noreply.github.com>	2026-04-11 08:42:13 +02:00
Ettore Di Giacinto	706cf5d43c	feat(sam.cpp): add sam.cpp detection backend (#9288 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-09 21:49:11 +02:00
Ettore Di Giacinto	b64347b6aa	chore: add gemma4 to the gallery Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 23:44:16 +00:00
Ettore Di Giacinto	285f7d4340	chore: add embeddingemma Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-08 17:40:55 +00:00
Richard Palethorpe	ea6e850809	feat: Add Kokoros backend (#9212 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-04-08 19:23:16 +02:00
ER-EPR	39c954764c	Update index.yaml and add Qwen3.5 model files (#9237 ) * Update index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Add mmproj files for Qwen3.5 models Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Update file paths for Qwen models in index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Update index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Refactor Qwen3-Reranker-0.6B entry in index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Update qwen3.yaml configuration parameters Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> --------- Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>	2026-04-05 09:21:21 +02:00
Ettore Di Giacinto	9b7d5513fc	chore(gallery): add mmproj file for gemma4 Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-04-05 02:02:52 +02:00
LocalAI [bot]	d990f2790c	chore(model-gallery): ⬆️ update checksum (#9233 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-04 23:02:41 +02:00
LocalAI [bot]	e4ee74354f	chore(model gallery): 🤖 add 1 new models via gallery agent (#9210 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-03 16:23:17 +02:00
LocalAI [bot]	e9f10f2f50	chore(model gallery): 🤖 add 1 new models via gallery agent (#9202 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-04-02 21:22:19 +02:00
ER-EPR	afe79568d6	fix: huggingface repo change the file name so Update index.yaml is needed (#9163 ) * Update index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Add mmproj files for Qwen3.5 models Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> * Update file paths for Qwen models in index.yaml Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com> --------- Signed-off-by: ER-EPR <38782737+ER-EPR@users.noreply.github.com>	2026-03-30 00:48:17 +02:00
Richard Palethorpe	87b3e10024	fix(flux.2-klein-9b): Use Qwen3-8b to avoid GGML assertion failure on tensor mismatch (#8995 ) Signed-off-by: Richard Palethorpe <io@richiejp.com>	2026-03-13 21:39:31 +01:00
Ettore Di Giacinto	ec91c477dc	Remove model descriptions from index.yaml Removed description fields from multiple model entries in index.yaml. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-12 21:43:21 +01:00
LocalAI [bot]	3b9abffdc8	chore(model-gallery): ⬆️ update checksum (#8985 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-12 21:34:27 +01:00
Ettore Di Giacinto	a738f8b0e4	feat(backends): add ace-step.cpp (#8965 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-12 18:56:26 +01:00
Ettore Di Giacinto	8f3efaed15	Update model entry in index.yaml Removed description and license fields from model entry. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-12 18:51:29 +01:00
LocalAI [bot]	c4cccb728e	chore(model gallery): 🤖 add 1 new models via gallery agent (#8980 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-12 18:51:00 +01:00
Ettore Di Giacinto	b209947e81	Remove Qwen3.5-35B model and update model path Removed deprecated Qwen3.5-35B-A3B model configuration and updated model path for Qwen3. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-12 18:19:55 +01:00
Ettore Di Giacinto	7dc691c171	feat: add fish-speech backend (#8962 ) * feat: add fish-speech backend Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * drop portaudio Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-12 07:48:23 +01:00
Ettore Di Giacinto	17f36e73b5	Remove model name entries from index.yaml Removed 'name' entries for various models in the index file. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-12 01:13:58 +01:00
Ettore Di Giacinto	031909d85a	Clean up gallery index by removing obsolete models Removed multiple models and their associated metadata from the gallery index. Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>	2026-03-12 00:55:19 +01:00
LocalAI [bot]	79f90de935	chore(model-gallery): ⬆️ update checksum (#8945 ) ⬆️ Checksum updates in gallery/index.yaml Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-10 21:43:52 +01:00
LocalAI [bot]	bda826d005	chore(model gallery): 🤖 add 1 new models via gallery agent (#8939 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-10 18:09:11 +01:00
Ettore Di Giacinto	05a3d00924	chore(size): display size of HF models and allow to specify it from the gallery (#8907 ) Signed-off-by: Ettore Di Giacinto <mudler@localai.io>	2026-03-09 17:38:14 +01:00
LocalAI [bot]	734b6d391f	chore(model gallery): 🤖 add 1 new models via gallery agent (#8904 ) chore(model gallery): 🤖 add new models via gallery agent Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>	2026-03-09 17:01:56 +01:00
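
Note on commit 20baec77ab: the gRPC-to-HTTP status mapping that the message attributes to `mapBackendError` can be sketched roughly as below. This is an illustrative Python sketch only. The real helper is Go code in core/http/endpoints/localai/images.go, and the names `GRPC_TO_HTTP` and `map_backend_status` are hypothetical, introduced here for illustration.

```python
# Illustrative sketch, not LocalAI's implementation (which is written in Go).
# The table mirrors the four mappings listed in the commit message; every
# other gRPC status falls through to HTTP 500, as the message describes.
GRPC_TO_HTTP = {
    "INVALID_ARGUMENT": 400,     # Bad Request
    "NOT_FOUND": 404,            # Not Found
    "FAILED_PRECONDITION": 412,  # Precondition Failed
    "UNIMPLEMENTED": 501,        # Not Implemented
}


def map_backend_status(grpc_code: str) -> int:
    """Return the HTTP status for a gRPC status-code name (hypothetical helper)."""
    return GRPC_TO_HTTP.get(grpc_code.upper(), 500)
```

Under this sketch, a backend error carrying `INVALID_ARGUMENT` surfaces as a 400, while an unlisted code such as `INTERNAL` stays a 500.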

1344 Commits
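
Note on commit 20baec77ab: the pack lookup described for `_locate_insightface_pack` (check the flat layout, then the nested `<name>/` layout, and return whichever holds ONNX files) might look roughly like the sketch below. It is an assumption-laden illustration, not the code in backend/python/insightface/engines.py. The function name `locate_pack`, its signature, and the search order are hypothetical, and the "legacy paths" the commit mentions are omitted because the message does not name them.

```python
from pathlib import Path
from typing import Optional


def locate_pack(models_dir: str, name: str) -> Optional[Path]:
    """Return the first directory that contains *.onnx files, or None.

    Hypothetical helper: checks the flat layout (<models_dir>/*.onnx) first,
    then the nested layout (<models_dir>/<name>/*.onnx).
    """
    root = Path(models_dir)
    for candidate in (root, root / name):
        # any() stops at the first match, so large directories stay cheap.
        if candidate.is_dir() and any(candidate.glob("*.onnx")):
            return candidate
    return None
```

Returning `None` rather than raising leaves the caller free to report a "no insightface pack found" style error in its own terms.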