mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-07 08:16:53 -04:00
* docs: add 'how LocalAI works' architecture diagram Add a blueprint-style architecture diagram: clients -> small core (API, router, WebUI, agents) -> gRPC -> backend processes pulled on demand as OCI images. Place it on the overview page and replace the stale external architecture image on the reference page. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs: add blueprint diagrams across feature, distributed & getting-started docs Add 24 architecture/flow/comparison diagrams (PNG + HTML source) under docs/static/images/diagrams/, wired into their docs pages, from an impact-vs-effort audit of the docs. Broaden the API surface on the overview architecture diagram (OpenAI, Anthropic, ElevenLabs, Ollama, and LocalAI's own API) and move the gRPC boundary label clear of the arrows. Pages: distributed mode (architecture, scheduling, ds4 layer-split), distributed inferencing, MLX, realtime, quantization, MCP, agents, mitm & cloud proxy, middleware, reverse-proxy TLS, VRAM, voice & face recognition, reranker, function calling, fine-tuning (recipe + jobs), diarization, audio transform, quickstart, model resolution. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs: add composable-core diagram to README hero Commit the composable-core card (small core + on-demand backend tiles) alongside the other diagrams and reference it from the README hero via a repo-relative path, so it renders on GitHub. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> * docs: fix composable-core connectors/badge and federated-vs-worker layout - composable-core: thicken the plug-in connectors so they read clearly, and widen the SEPARATE IMAGE badge so its text no longer overflows the box. - federated-vs-worker: shorten the WHOLE/SPLIT REQUEST pills to fit, and replace the tangled node-to-node activation arrows with a clean fan-out (request split across all sharded nodes), mirroring the federated panel. Assisted-by: Claude:claude-opus-4-8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io> --------- Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
297 lines
10 KiB
Markdown
297 lines
10 KiB
Markdown
+++
|
|
disableToc = false
|
|
title = "Face Recognition"
|
|
weight = 14
|
|
url = "/features/face-recognition/"
|
|
+++
|
|
|
|

|
|
|
|
LocalAI supports face recognition through the `insightface` backend:
|
|
face verification (1:1), face identification (1:N) against a built-in
|
|
vector store, face embedding, face detection, demographic analysis
|
|
(age / gender), and antispoofing / liveness detection.
|
|
|
|
The backend ships **two interchangeable engines** under one image, each
|
|
paired with a distinct gallery entry so users can pick by license and
|
|
accuracy needs.
|
|
|
|
## Licensing — read this first
|
|
|
|
| Gallery entry | Detector + recognizer | Size | License |
|
|
|---|---|---|---|
|
|
| `insightface-buffalo-l` | SCRFD-10GF + ArcFace R50 + GenderAge | ~326 MB | **Non-commercial research only** (upstream insightface weights) |
|
|
| `insightface-buffalo-s` | SCRFD-500MF + MBF + GenderAge | ~159 MB | **Non-commercial research only** |
|
|
| `insightface-opencv` | YuNet + SFace | ~40 MB | **Apache 2.0 — commercial-safe** |
|
|
|
|
The `insightface` Python library itself is MIT, but the pretrained model
|
|
packs (buffalo_l, buffalo_s, antelopev2) are released by the upstream
|
|
maintainers for **non-commercial research use only**. Pick the
|
|
`insightface-opencv` entry for production / commercial deployments.
|
|
|
|
## Quickstart
|
|
|
|
Pull the commercial-safe backend (recommended for copy-paste):
|
|
|
|
```bash
|
|
local-ai models install insightface-opencv
|
|
```
|
|
|
|
Verify that two images depict the same person:
|
|
|
|
```bash
|
|
curl -sX POST http://localhost:8080/v1/face/verify \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "insightface-opencv",
|
|
"img1": "https://example.com/alice_1.jpg",
|
|
"img2": "https://example.com/alice_2.jpg"
|
|
}'
|
|
```
|
|
|
|
Response:
|
|
|
|
```json
|
|
{
|
|
"verified": true,
|
|
"distance": 0.27,
|
|
"threshold": 0.35,
|
|
"confidence": 23.1,
|
|
"model": "insightface-opencv",
|
|
"img1_area": { "x": 120.4, "y": 82.1, "w": 198.3, "h": 260.5 },
|
|
"img2_area": { "x": 110.8, "y": 95.0, "w": 205.6, "h": 268.2 },
|
|
"processing_time_ms": 412.0
|
|
}
|
|
```
|
|
|
|
## 1:N identification workflow (register → identify → forget)
|
|
|
|
This is the primary "face recognition" flow. Under the hood it uses
|
|
LocalAI's built-in in-memory vector store — no external database to
|
|
stand up.
|
|
|
|
1. Register known faces:
|
|
|
|
```bash
|
|
curl -sX POST http://localhost:8080/v1/face/register \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "insightface-buffalo-l",
|
|
"name": "Alice",
|
|
"img": "https://example.com/alice.jpg"
|
|
}'
|
|
# → {"id": "8b7...", "name": "Alice", "registered_at": "2026-04-21T..."}
|
|
```
|
|
|
|
2. Identify an unknown probe:
|
|
|
|
```bash
|
|
curl -sX POST http://localhost:8080/v1/face/identify \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "insightface-buffalo-l",
|
|
"img": "https://example.com/unknown.jpg",
|
|
"top_k": 5
|
|
}'
|
|
# → {"matches": [{"id":"8b7...","name":"Alice","distance":0.22,"match":true,...}]}
|
|
```
|
|
|
|
3. Remove a person by ID:
|
|
|
|
```bash
|
|
curl -sX POST http://localhost:8080/v1/face/forget \
|
|
-d '{"id": "8b7..."}'
|
|
# → 204 No Content
|
|
```
|
|
|
|
{{% notice warning %}}
|
|
**Storage caveat.** The default vector store is in-memory. All
|
|
registered faces are lost when LocalAI restarts. Persistent storage
|
|
(pgvector) is a tracked future enhancement — the face-recognition HTTP
|
|
API is designed to swap the backing store without changing the wire
|
|
format.
|
|
{{% /notice %}}
|
|
|
|
## API reference
|
|
|
|
### `POST /v1/face/verify` (1:1)
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `model` | string | gallery entry name (e.g. `insightface-buffalo-l`) |
|
|
| `img1`, `img2` | string | URL, base64, or data-URI |
|
|
| `threshold` | float, optional | cosine-distance cutoff; default depends on engine |
|
|
| `anti_spoofing` | bool, optional | also run MiniFASNet liveness on each image — see [Antispoofing](#antispoofing-liveness-detection) |
|
|
|
|
Returns `verified`, `distance`, `threshold`, `confidence`, `model`,
|
|
`img1_area`, `img2_area`, and `processing_time_ms`. When
|
|
`anti_spoofing` is set, the response also carries per-image liveness
|
|
fields: `img1_is_real`, `img1_antispoof_score`, `img2_is_real`,
|
|
`img2_antispoof_score`. A failed liveness check on either image forces
|
|
`verified=false` regardless of similarity.
|
|
|
|
### `POST /v1/face/analyze`
|
|
|
|
Returns demographic attributes for every detected face:
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `model` | string | gallery entry |
|
|
| `img` | string | URL / base64 / data-URI |
|
|
| `actions` | string[] | subset of `["age","gender","emotion","race"]`; empty = all supported |
|
|
|
|
Only `insightface-buffalo-l` / `insightface-buffalo-s` populate age and
|
|
gender (genderage head). `insightface-opencv` returns face regions with
|
|
empty attributes — SFace has no demographic classifier. Emotion and
|
|
race are always empty in the current release.
|
|
|
|
### `POST /v1/face/register` (1:N enrollment)
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `model` | string | face recognition model |
|
|
| `img` | string | face to enroll |
|
|
| `name` | string | human-readable label |
|
|
| `labels` | map[string]string, optional | arbitrary metadata |
|
|
| `store` | string, optional | vector store model; defaults to local-store |
|
|
|
|
Returns `{id, name, registered_at}`. The `id` is an opaque UUID used by
|
|
`/v1/face/identify` and `/v1/face/forget`.
|
|
|
|
### `POST /v1/face/identify` (1:N recognition)
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `model` | string | face recognition model |
|
|
| `img` | string | probe image |
|
|
| `top_k` | int, optional | max matches to return; default 5 |
|
|
| `threshold` | float, optional | cosine-distance cutoff; default 0.35 (ArcFace) |
|
|
| `store` | string, optional | vector store model; defaults to local-store |
|
|
|
|
Returns a list of matches sorted by ascending distance, each with `id`,
|
|
`name`, `labels`, `distance`, `confidence`, and `match`
|
|
(`distance ≤ threshold`).
|
|
|
|
### `POST /v1/face/forget`
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `id` | string | ID returned by `/v1/face/register` |
|
|
|
|
Returns `204 No Content` on success, `404 Not Found` if the ID is
|
|
unknown.
|
|
|
|
### `POST /v1/face/embed`
|
|
|
|
Returns the L2-normalized face embedding vector for the detected face.
|
|
|
|
| field | type | description |
|
|
|---|---|---|
|
|
| `model` | string | face model |
|
|
| `img` | string | URL / base64 / data-URI |
|
|
|
|
Returns `{embedding: float[], dim: int, model: string}`. Dimension is
|
|
512 for the insightface ArcFace/MBF recognizers and 128 for OpenCV's
|
|
SFace.
|
|
|
|
> **Note:** the OpenAI-compatible `/v1/embeddings` endpoint is
|
|
> intentionally text-only by contract (`input` is a string or list of
|
|
> strings of TEXT to embed) — passing an image data-URI there does
|
|
> nothing useful. Use `/v1/face/embed` for image inputs.
|
|
|
|
### Reused endpoint
|
|
|
|
- `POST /v1/detection` — returns face bounding boxes with
|
|
`class_name: "face"`; works for both engines.
|
|
|
|
## Antispoofing (liveness detection)
|
|
|
|
All gallery entries ship the [Silent-Face-Anti-Spoofing](https://github.com/minivision-ai/Silent-Face-Anti-Spoofing)
|
|
MiniFASNetV2 + MiniFASNetV1SE ensemble (Apache 2.0, ~4 MB total, CPU-only)
|
|
alongside the face recognition weights. Set `anti_spoofing: true` on
|
|
`/v1/face/verify` or `/v1/face/analyze` to run liveness on each detected
|
|
face. The two models look at different crop scales and their softmax
|
|
outputs are averaged before argmax — the upstream-recommended setup.
|
|
|
|
`/v1/face/verify` with liveness gating:
|
|
|
|
```bash
|
|
curl -sX POST http://localhost:8080/v1/face/verify \
|
|
-H "Content-Type: application/json" \
|
|
-d '{
|
|
"model": "insightface-opencv",
|
|
"img1": "https://example.com/alice_selfie.jpg",
|
|
"img2": "https://example.com/alice_id_scan.jpg",
|
|
"anti_spoofing": true
|
|
}'
|
|
```
|
|
|
|
Response (fields added when `anti_spoofing` is enabled):
|
|
|
|
```json
|
|
{
|
|
"verified": true,
|
|
"distance": 0.27,
|
|
"threshold": 0.5,
|
|
"confidence": 46.0,
|
|
"model": "insightface-opencv",
|
|
"img1_area": { "x": 120, "y": 82, "w": 198, "h": 260 },
|
|
"img2_area": { "x": 110, "y": 95, "w": 205, "h": 268 },
|
|
"img1_is_real": true,
|
|
"img1_antispoof_score": 0.82,
|
|
"img2_is_real": true,
|
|
"img2_antispoof_score": 0.74,
|
|
"processing_time_ms": 431.0
|
|
}
|
|
```
|
|
|
|
If either image fails liveness (`is_real=false`), `verified` is forced
|
|
to `false` — similarity alone is not enough.
|
|
|
|
`/v1/face/analyze` reports per-face `is_real` and `antispoof_score`
|
|
when the flag is set.
|
|
|
|
**Fail-loud semantics.** If `anti_spoofing: true` is sent against a
|
|
model installed without the MiniFASNet files (e.g. a custom entry that
|
|
only listed the face recognition weights), the request returns a gRPC
|
|
`FAILED_PRECONDITION` error — the endpoint will never silently return
|
|
`is_real=false`. Re-install the gallery entry or point the backend at a
|
|
model that bundles the MiniFASNet ONNX files.
|
|
|
|
{{% notice info %}}
|
|
The MiniFASNet score is best at catching **printed photos and screen
|
|
replays**. Deepfake videos and high-quality prosthetics are out of
|
|
scope — liveness here is a low-cost first line of defence, not a
|
|
guarantee. For higher assurance, combine with challenge-response (e.g.
|
|
ask the user to turn their head).
|
|
{{% /notice %}}
|
|
|
|
## Choosing an engine
|
|
|
|
| Need | Entry |
|
|
|---|---|
|
|
| Commercial product | `insightface-opencv` |
|
|
| Highest accuracy (research / demos) | `insightface-buffalo-l` |
|
|
| Edge / low-memory / research | `insightface-buffalo-s` |
|
|
|
|
The recommended default `threshold` for `/v1/face/verify` and
|
|
`/v1/face/identify` depends on the recognizer:
|
|
|
|
| Recognizer | Cosine-distance threshold |
|
|
|---|---|
|
|
| ArcFace R50 (`buffalo_l`) | ~0.35 |
|
|
| MBF (`buffalo_s`) | ~0.40 |
|
|
| SFace (`opencv`) | ~0.50 |
|
|
|
|
Pass `threshold` explicitly when switching engines — the per-engine
|
|
default only fires when the field is omitted.
|
|
|
|
## Related features
|
|
|
|
- [Object Detection](/features/object-detection/) — generic bounding-box
|
|
detection; `/v1/detection` works with the insightface backend too.
|
|
- [Embeddings](/features/embeddings/) — raw vector extraction; face
|
|
embeddings live in the same endpoint under the hood.
|
|
- [Stores](/features/stores/) — the generic vector store powering the
|
|
1:N recognition pipeline.
|