LocalAI/docs/content/features/face-recognition.md
Ettore Di Giacinto f5eb13d3c2 feat(insightface): add antispoofing (liveness) detection (#9515)
* feat(insightface): add antispoofing (liveness) detection

Light up the anti_spoofing flag that was parked during the first pass.
Both FaceVerify and FaceAnalyze now run the Silent-Face MiniFASNetV2 +
MiniFASNetV1SE ensemble (~4 MB, Apache 2.0, CPU <10ms) when the flag is
set. Failed liveness on either image vetoes FaceVerify regardless of
embedding similarity. Every insightface* gallery entry now ships the
MiniFASNet ONNX weights so existing packs light up after reinstall.

Setting the flag against a model without the MiniFASNet files returns
FAILED_PRECONDITION (HTTP 412) with a clear install message — no
silent is_real=false.

FaceVerifyResponse gained per-image img{1,2}_is_real and
img{1,2}_antispoof_score (proto 9-12); FaceAnalysis's existing
is_real/antispoof_score fields are now populated. Schema fields are
pointers so they are fully absent from the JSON response when
anti_spoofing was not requested — avoids collapsing "not checked" with
"checked and fake" under Go's omitempty on bool.

Validated end-to-end over HTTP against a local install:
- verify + anti_spoofing, both real -> verified=true, score ~0.76
- verify + anti_spoofing, img2 spoof -> verified=false, img2_is_real=false
- analyze + anti_spoofing -> is_real and score per face
- flag against model without MiniFASNet -> HTTP 412 fail-loud

Assisted-by: Claude:claude-opus-4-7 go vet

* test(insightface): wire test target into test-extra

The root Makefile's `test-extra` already runs
`$(MAKE) -C backend/python/insightface test`, but the backend's
Makefile never defined the target — so the command silently errored
and the suite was never executed in CI. Adding the two-line target
(matching ace-step/Makefile) hooks `test.sh` → `runUnittests` →
`python -m unittest test.py`, which discovers both the pre-existing
engine classes (InsightFaceEngineTest, OnnxDirectEngineTest) and the
new AntispoofingTest. Each class skips gracefully when its weights
can't be downloaded from a network-restricted runner.

Assisted-by: Claude:claude-opus-4-7

* test(insightface): exercise antispoofing in e2e-backends (both paths)

Add a `face_antispoof` capability to the Ginkgo e2e suite and extend
the existing FaceVerify + FaceAnalyze specs with liveness assertions
covering BOTH paths:

  real fixture -> is_real=true, score>0, verified stays true
  spoof fixture -> is_real=false, verified vetoed to false

The spoof fixture is upstream's own `image_F2.jpg` (via the yakhyo
mirror) — verified locally against the MiniFASNetV2+V1SE ensemble to
classify as is_real=false with score ~0.013. That makes the assertion
deterministic across CI runs; synthetic/derived spoofs fool the model
unpredictably and would be flaky.

Makefile wires it up end-to-end:
- New INSIGHTFACE_ANTISPOOF_* cache dir + two ONNX downloads with
  pinned SHAs, matching the gallery entries.
- insightface-antispoof-models target shared by both backend configs.
- FACE_SPOOF_IMAGE_URL passed via BACKEND_TEST_FACE_SPOOF_IMAGE_URL.
- Both e2e targets (buffalo-sc + opencv) now:
  * depend on insightface-antispoof-models
  * pass antispoof_v2_onnx / antispoof_v1se_onnx in BACKEND_TEST_OPTIONS
  * include face_antispoof in BACKEND_TEST_CAPS

backend_test.go adds the new capability constant and a faceSpoofFile
fixture resolved the same way as faceFile1/2/3. Spoof assertions are
gated on both capFaceAntispoof AND faceSpoofFile being set, so a test
config that omits the spoof fixture degrades gracefully to "real path
only" instead of failing.

Assisted-by: Claude:claude-opus-4-7 go vet
2026-04-23 18:28:15 +02:00


+++
disableToc = false
title = "Face Recognition"
weight = 14
url = "/features/face-recognition/"
+++

LocalAI supports face recognition through the insightface backend: face verification (1:1), face identification (1:N) against a built-in vector store, face embedding, face detection, demographic analysis (age / gender), and antispoofing / liveness detection.

The backend ships two interchangeable engines under one image, each paired with a distinct gallery entry so users can pick by license and accuracy needs.

## Licensing — read this first

| Gallery entry | Detector + recognizer | Size | License |
|---|---|---|---|
| insightface-buffalo-l | SCRFD-10GF + ArcFace R50 + GenderAge | ~326 MB | Non-commercial research only (upstream insightface weights) |
| insightface-buffalo-s | SCRFD-500MF + MBF + GenderAge | ~159 MB | Non-commercial research only |
| insightface-opencv | YuNet + SFace | ~40 MB | Apache 2.0 — commercial-safe |

The insightface Python library itself is MIT, but the pretrained model packs (buffalo_l, buffalo_s, antelopev2) are released by the upstream maintainers for non-commercial research use only. Pick the insightface-opencv entry for production / commercial deployments.

## Quickstart

Pull the commercial-safe backend (recommended for copy-paste):

```bash
local-ai models install insightface-opencv
```

Verify that two images depict the same person:

```bash
curl -sX POST http://localhost:8080/v1/face/verify \
  -H "Content-Type: application/json" \
  -d '{
    "model": "insightface-opencv",
    "img1": "https://example.com/alice_1.jpg",
    "img2": "https://example.com/alice_2.jpg"
  }'
```

Response:

```json
{
  "verified": true,
  "distance": 0.27,
  "threshold": 0.35,
  "confidence": 23.1,
  "model": "insightface-opencv",
  "img1_area": { "x": 120.4, "y": 82.1, "w": 198.3, "h": 260.5 },
  "img2_area": { "x": 110.8, "y": 95.0, "w": 205.6, "h": 268.2 },
  "processing_time_ms": 412.0
}
```

## 1:N identification workflow (register → identify → forget)

This is the primary "face recognition" flow. Under the hood it uses LocalAI's built-in in-memory vector store — no external database to stand up.

1. Register known faces:

   ```bash
   curl -sX POST http://localhost:8080/v1/face/register \
     -H "Content-Type: application/json" \
     -d '{
       "model": "insightface-buffalo-l",
       "name": "Alice",
       "img": "https://example.com/alice.jpg"
     }'
   # → {"id": "8b7...", "name": "Alice", "registered_at": "2026-04-21T..."}
   ```

2. Identify an unknown probe:

   ```bash
   curl -sX POST http://localhost:8080/v1/face/identify \
     -H "Content-Type: application/json" \
     -d '{
       "model": "insightface-buffalo-l",
       "img": "https://example.com/unknown.jpg",
       "top_k": 5
     }'
   # → {"matches": [{"id":"8b7...","name":"Alice","distance":0.22,"match":true,...}]}
   ```

3. Remove a person by ID:

   ```bash
   curl -sX POST http://localhost:8080/v1/face/forget \
     -d '{"id": "8b7..."}'
   # → 204 No Content
   ```

{{% alert icon="⚠️" color="warning" %}} Storage caveat. The default vector store is in-memory. All registered faces are lost when LocalAI restarts. Persistent storage (pgvector) is a tracked future enhancement — the face-recognition HTTP API is designed to swap the backing store without changing the wire format. {{% /alert %}}

## API reference

### POST /v1/face/verify (1:1)

| field | type | description |
|---|---|---|
| model | string | gallery entry name (e.g. insightface-buffalo-l) |
| img1, img2 | string | URL, base64, or data-URI |
| threshold | float, optional | cosine-distance cutoff; default depends on engine |
| anti_spoofing | bool, optional | also run MiniFASNet liveness on each image — see Antispoofing |

Returns verified, distance, threshold, confidence, model, img1_area, img2_area, and processing_time_ms. When anti_spoofing is set, the response also carries per-image liveness fields: img1_is_real, img1_antispoof_score, img2_is_real, img2_antispoof_score. A failed liveness check on either image forces verified=false regardless of similarity.
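The gating described above can be sketched as a pure function. This is an illustration of the documented semantics, not the backend's actual code; `verify_decision` is a name invented here:

```python
def verify_decision(distance: float, threshold: float,
                    img1_is_real: bool = True,
                    img2_is_real: bool = True) -> bool:
    """Combine embedding similarity with the liveness veto.

    Faces match when the cosine distance is at or below the threshold,
    but a failed liveness check on either image always forces a
    non-match, regardless of how similar the embeddings are.
    """
    return distance <= threshold and img1_is_real and img2_is_real
```

For example, a pair with distance 0.27 against a 0.35 threshold verifies only while both images pass liveness; flipping either `is_real` flag to `False` vetoes the match.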

### POST /v1/face/analyze

Returns demographic attributes for every detected face:

| field | type | description |
|---|---|---|
| model | string | gallery entry |
| img | string | URL / base64 / data-URI |
| actions | string[] | subset of ["age","gender","emotion","race"]; empty = all supported |

Only insightface-buffalo-l / insightface-buffalo-s populate age and gender (genderage head). insightface-opencv returns face regions with empty attributes — SFace has no demographic classifier. Emotion and race are always empty in the current release.

### POST /v1/face/register (1:N enrollment)

| field | type | description |
|---|---|---|
| model | string | face recognition model |
| img | string | face to enroll |
| name | string | human-readable label |
| labels | map[string]string, optional | arbitrary metadata |
| store | string, optional | vector store model; defaults to local-store |

Returns {id, name, registered_at}. The id is an opaque UUID used by /v1/face/identify and /v1/face/forget.

### POST /v1/face/identify (1:N recognition)

| field | type | description |
|---|---|---|
| model | string | face recognition model |
| img | string | probe image |
| top_k | int, optional | max matches to return; default 5 |
| threshold | float, optional | cosine-distance cutoff; default 0.35 (ArcFace) |
| store | string, optional | vector store model; defaults to local-store |

Returns a list of matches sorted by ascending distance, each with id, name, labels, distance, confidence, and match (distance ≤ threshold).
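The ordering and match flag can be mimicked client-side over precomputed distances; `rank_matches` below is a hypothetical helper for illustration, not part of LocalAI:

```python
def rank_matches(candidates, top_k=5, threshold=0.35):
    """Rank identification candidates the way /v1/face/identify documents.

    candidates: iterable of dicts with at least 'id', 'name', 'distance'
    (an assumed shape for this sketch). Sort ascending by cosine
    distance, keep the closest top_k, and flag entries at or below the
    threshold as matches.
    """
    ranked = sorted(candidates, key=lambda c: c["distance"])[:top_k]
    return [dict(c, match=c["distance"] <= threshold) for c in ranked]
```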

### POST /v1/face/forget

| field | type | description |
|---|---|---|
| id | string | ID returned by /v1/face/register |

Returns 204 No Content on success, 404 Not Found if the ID is unknown.

### POST /v1/face/embed

Returns the L2-normalized face embedding vector for the detected face.

| field | type | description |
|---|---|---|
| model | string | face model |
| img | string | URL / base64 / data-URI |

Returns {embedding: float[], dim: int, model: string}. Dimension is 512 for the insightface ArcFace/MBF recognizers and 128 for OpenCV's SFace.
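Because the returned embeddings are L2-normalized, cosine distance between two of them reduces to one minus their dot product. A minimal sketch (helper names are ours, not LocalAI's):

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length, as the endpoint does server-side."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_distance(a, b):
    # For unit-length vectors, cosine similarity is the plain dot
    # product, so distance = 1 - dot(a, b); 0 means identical direction.
    return 1.0 - sum(x * y for x, y in zip(a, b))
```

Comparing an embedding with itself yields a distance of 0, and two orthogonal embeddings yield 1, which is why the verify/identify thresholds in this document sit in the 0.35–0.50 range.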

Note: the OpenAI-compatible /v1/embeddings endpoint is intentionally text-only by contract (input is a string or list of strings of TEXT to embed) — passing an image data-URI there does nothing useful. Use /v1/face/embed for image inputs.

### Reused endpoint

- POST /v1/detection — returns face bounding boxes with class_name: "face"; works for both engines.

## Antispoofing (liveness detection)

All gallery entries ship the Silent-Face-Anti-Spoofing MiniFASNetV2 + MiniFASNetV1SE ensemble (Apache 2.0, ~4 MB total, CPU-only) alongside the face recognition weights. Set anti_spoofing: true on /v1/face/verify or /v1/face/analyze to run liveness on each detected face. The two models look at different crop scales and their softmax outputs are averaged before argmax — the upstream-recommended setup.
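The averaging step can be illustrated as follows. This sketch assumes upstream Silent-Face's label ordering, where index 1 is the "real" class, and treats the averaged probability of that class as the reported score; both are assumptions of the example, not guarantees of the API:

```python
def ensemble_liveness(probs_v2, probs_v1se, real_index=1):
    """Fuse two MiniFASNet softmax outputs the upstream-recommended way.

    Average the per-class probabilities of the two models, then take the
    argmax of the averaged distribution. Returns (is_real, score), where
    score is the averaged probability of the assumed "real" class.
    """
    avg = [(p + q) / 2.0 for p, q in zip(probs_v2, probs_v1se)]
    label = max(range(len(avg)), key=avg.__getitem__)
    return label == real_index, avg[real_index]
```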

/v1/face/verify with liveness gating:

```bash
curl -sX POST http://localhost:8080/v1/face/verify \
  -H "Content-Type: application/json" \
  -d '{
    "model": "insightface-opencv",
    "img1": "https://example.com/alice_selfie.jpg",
    "img2": "https://example.com/alice_id_scan.jpg",
    "anti_spoofing": true
  }'
```

Response (fields added when anti_spoofing is enabled):

```json
{
  "verified": true,
  "distance": 0.27,
  "threshold": 0.5,
  "confidence": 46.0,
  "model": "insightface-opencv",
  "img1_area": { "x": 120, "y": 82, "w": 198, "h": 260 },
  "img2_area": { "x": 110, "y": 95, "w": 205, "h": 268 },
  "img1_is_real": true,
  "img1_antispoof_score": 0.82,
  "img2_is_real": true,
  "img2_antispoof_score": 0.74,
  "processing_time_ms": 431.0
}
```

If either image fails liveness (is_real=false), verified is forced to false — similarity alone is not enough.

/v1/face/analyze reports per-face is_real and antispoof_score when the flag is set.

Fail-loud semantics. If anti_spoofing: true is sent against a model installed without the MiniFASNet files (e.g. a custom entry that only listed the face recognition weights), the request returns a gRPC FAILED_PRECONDITION error — the endpoint will never silently return is_real=false. Re-install the gallery entry or point the backend at a model that bundles the MiniFASNet ONNX files.
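Client-side, the 412 is worth surfacing as a configuration error rather than a spoof verdict. A hypothetical handler (the function name and error text are ours):

```python
def check_antispoof_supported(status_code: int) -> None:
    """Distinguish "model cannot do liveness" from "liveness failed".

    HTTP 412 here maps to gRPC FAILED_PRECONDITION: the model was
    installed without the MiniFASNet weights, so the request never ran
    liveness at all. Treat it as a setup problem, not as is_real=false.
    """
    if status_code == 412:
        raise RuntimeError(
            "anti_spoofing requested but the model lacks MiniFASNet "
            "weights; re-install the gallery entry"
        )
```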

{{% alert icon="" color="info" %}} The MiniFASNet score is best at catching printed photos and screen replays. Deepfake videos and high-quality prosthetics are out of scope — liveness here is a low-cost first line of defence, not a guarantee. For higher assurance, combine with challenge-response (e.g. ask the user to turn their head). {{% /alert %}}

## Choosing an engine

| Need | Entry |
|---|---|
| Commercial product | insightface-opencv |
| Highest accuracy (research / demos) | insightface-buffalo-l |
| Edge / low-memory / research | insightface-buffalo-s |

The recommended default threshold for /v1/face/verify and /v1/face/identify depends on the recognizer:

| Recognizer | Cosine-distance threshold |
|---|---|
| ArcFace R50 (buffalo_l) | ~0.35 |
| MBF (buffalo_s) | ~0.40 |
| SFace (opencv) | ~0.50 |

Pass threshold explicitly when switching engines — the per-engine default applies only when the field is omitted.
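One way to make that explicit in client code. The defaults hard-coded below mirror the table above; `verify_payload` is an illustrative helper, not part of any LocalAI SDK:

```python
# Per-engine defaults from the threshold table (illustrative constants).
DEFAULT_THRESHOLDS = {
    "insightface-buffalo-l": 0.35,
    "insightface-buffalo-s": 0.40,
    "insightface-opencv": 0.50,
}


def verify_payload(model, img1, img2, threshold=None):
    """Build a /v1/face/verify request body with the threshold pinned.

    Always sending an explicit threshold means switching engines never
    silently changes the cutoff out from under you.
    """
    return {
        "model": model,
        "img1": img1,
        "img2": img2,
        "threshold": threshold if threshold is not None
        else DEFAULT_THRESHOLDS[model],
    }
```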

## See also

- Object Detection — generic bounding-box detection; /v1/detection works with the insightface backend too.
- Embeddings — raw vector extraction; face embeddings live in the same endpoint under the hood.
- Stores — the generic vector store powering the 1:N recognition pipeline.