Files
LocalAI/backend/python/insightface/backend.py
Ettore Di Giacinto 20baec77ab feat(face-recognition): add insightface/onnx backend for 1:1 verify, 1:N identify, embedding, detection, analysis (#9480)
* feat(face-recognition): add insightface backend for 1:1 verify, 1:N identify, embedding, detection, analysis

Adds face recognition as a new first-class capability in LocalAI via the
`insightface` Python backend, with a pluggable two-engine design so
non-commercial (insightface model packs) and commercial-safe
(OpenCV Zoo YuNet + SFace) models share the same gRPC/HTTP surface.

New gRPC RPCs (backend/backend.proto):
  * FaceVerify(FaceVerifyRequest) returns FaceVerifyResponse
  * FaceAnalyze(FaceAnalyzeRequest) returns FaceAnalyzeResponse

Existing Embedding and Detect RPCs are reused (face image in
PredictOptions.Images / DetectOptions.src) for face embedding and
face detection respectively.

New HTTP endpoints under /v1/face/:
  * verify     — 1:1 image pair same-person decision
  * analyze    — per-face age + gender (emotion/race reserved)
  * register   — 1:N enrollment; stores embedding in vector store
  * identify   — 1:N recognition; detect → embed → StoresFind
  * forget     — remove a registered face by opaque ID

Service layer (core/services/facerecognition/) introduces a
`Registry` interface with one in-memory `storeRegistry` impl backed
by LocalAI's existing local-store gRPC vector backend. HTTP handlers
depend on the interface, not on StoresSet/StoresFind directly, so a
persistent PostgreSQL/pgvector implementation can be slotted in via a
single constructor change in core/application (TODO marker in the
package doc).

New usecase flag FLAG_FACE_RECOGNITION; insightface is also wired
into FLAG_DETECTION so /v1/detection works for face bounding boxes.

Gallery (backend/index.yaml) ships three entries:
  * insightface-buffalo-l   — SCRFD-10GF + ArcFace R50 + genderage
                              (~326MB pre-baked; non-commercial research use only)
  * insightface-opencv      — YuNet + SFace (~40MB pre-baked; Apache 2.0)
  * insightface-buffalo-s   — SCRFD-500MF + MBF (runtime download; non-commercial)

Python backend (backend/python/insightface/):
  * engines.py — FaceEngine protocol with InsightFaceEngine and
    OnnxDirectEngine; resolves model paths relative to the backend
    directory so the same gallery config works in docker-scratch and
    in the e2e-backends rootfs-extraction harness.
  * backend.py — gRPC servicer implementing Health, LoadModel, Status,
    Embedding, Detect, FaceVerify, FaceAnalyze.
  * install.sh — pre-bakes buffalo_l + OpenCV YuNet/SFace inside the
    backend directory so first-run is offline-clean (the final scratch
    image only preserves files under /<backend>/).
  * test.py — parametrized unit tests over both engines.

Tests:
  * Registry unit tests (go test -race ./core/services/facerecognition/...)
    — in-memory fake grpc.Backend, table-driven, covers register/
    identify/forget/error paths + concurrent access.
  * tests/e2e-backends/backend_test.go extended with face caps
    (face_detect, face_embed, face_verify, face_analyze); relative
    ordering + configurable verifyCeiling per engine.
  * Makefile targets: test-extra-backend-insightface-buffalo-l,
    -opencv, and the -all aggregate.
  * CI: .github/workflows/test-extra.yml gains tests-insightface-grpc,
    auto-triggered by changes under backend/python/insightface/.

Docs:
  * docs/content/features/face-recognition.md — feature page with
    license table, quickstart (defaults to the commercial-safe model),
    models matrix, API reference, 1:N workflow, storage caveats.
  * Cross-refs in object-detection.md, stores.md, embeddings.md, and
    whats-new.md.
  * Contributor README at backend/python/insightface/README.md.

Verified end-to-end:
  * buffalo_l: 6/6 specs (health, load, face_detect, face_embed,
    face_verify, face_analyze).
  * opencv: 5/5 specs (same minus face_analyze — SFace has no
    demographic head; correctly skipped via BACKEND_TEST_CAPS).

Assisted-by: Claude:claude-opus-4-7

* fix(face-recognition): move engine selection to model gallery, collapse backend entries

The previous commit put engine/model_pack options on backend gallery
entries (`backend/index.yaml`). That was wrong — `GalleryBackend`
(core/gallery/backend_types.go:32) has no `options` field, so the
YAML decoder silently dropped those keys and all three "different
insightface-*" backend entries resolved to the same container image
with no distinguishing configuration.

Correct split:

  * `backend/index.yaml` now has ONE `insightface` backend entry
    shipping the CPU + CUDA 12 container images. The Python backend
    bundles both the non-commercial insightface model packs
    (buffalo_l / buffalo_s) and the commercial-safe OpenCV Zoo
    weights (YuNet + SFace); the active engine is selected at
    LoadModel time via `options: ["engine:..."]`.

  * `gallery/index.yaml` gains three model entries —
    `insightface-buffalo-l`, `insightface-opencv`,
    `insightface-buffalo-s` — each setting the appropriate
    `overrides.backend` + `overrides.options` so installing one
    actually gives the user the intended engine. This matches how
    `rfdetr-base` lives in the model gallery against the `rfdetr`
    backend.

The earlier e2e tests passed despite this bug because the Makefile
targets pass `BACKEND_TEST_OPTIONS` directly to LoadModel via gRPC,
bypassing any gallery resolution entirely. No code changes needed.

Assisted-by: Claude:claude-opus-4-7

* feat(face-recognition): cover all supported models in the gallery + drop weight baking

Follows up on the model-gallery split: adds entries for every model
configuration either engine actually supports, and switches weight
delivery from image-baked to LocalAI's standard gallery mechanism.

Gallery now has seven `insightface-*` model entries (gallery/index.yaml):

  insightface (family)  — non-commercial research use
    • buffalo-l   (326MB)  — SCRFD-10GF + ResNet50 + genderage, default
    • buffalo-m   (313MB)  — SCRFD-2.5GF + ResNet50 + genderage
    • buffalo-s   (159MB)  — SCRFD-500MF + MBF + genderage
    • buffalo-sc  (16MB)   — SCRFD-500MF + MBF, recognition only
                             (no landmarks, no demographics — analyze
                             returns empty attributes)
    • antelopev2  (407MB)  — SCRFD-10GF + ResNet100@Glint360K + genderage

  OpenCV Zoo family — Apache 2.0 commercial-safe
    • opencv       — YuNet + SFace fp32 (~40MB)
    • opencv-int8  — YuNet + SFace int8 (~12MB, ~3x smaller, faster on CPU)

Model weights are no longer baked into the backend image. The image
now ships only the Python runtime + libraries (~275MB content size,
~1.18GB disk vs ~1.21GB when weights were baked). Weights flow through
LocalAI's gallery mechanism:

  * OpenCV variants list `files:` with ONNX URIs + SHA-256, so
    `local-ai models install insightface-opencv` pulls them into the
    models directory exactly like any other gallery-managed model.

  * insightface packs (upstream distributes .zip archives only, not
    individual ONNX files) auto-download on first LoadModel via
    FaceAnalysis' built-in machinery, rooted at the LocalAI models
    directory so they live alongside everything else — same pattern
    `rfdetr` uses with `inference.get_model()`.

Backend changes (backend/python/insightface/):

  * backend.py — LoadModel propagates `ModelOptions.ModelPath` (the
    LocalAI models directory) to engines via a `_model_dir` hint.
    This replaces the earlier ModelFile-dirname approach; ModelPath
    is the canonical "models directory" variable set by the Go loader
    (pkg/model/initializers.go:144) and is always populated.

  * engines.py::_resolve_model_path — picks up `model_dir` and searches
    it (plus basename-in-model-dir) before falling back to the dev
    script-dir. This is how OnnxDirectEngine finds gallery-downloaded
    YuNet/SFace files by filename only.

  * engines.py::_flatten_insightface_pack — new helper that works
    around an upstream packaging inconsistency: buffalo_l/s/sc zips
    expand flat, but buffalo_m and antelopev2 zips wrap their ONNX
    files in a redundant `<name>/` directory. insightface's own
    loader looks one level too shallow and fails. We call
    `ensure_available()` explicitly, flatten if nested, then hand to
    FaceAnalysis.

  * engines.py::InsightFaceEngine.prepare — root-resolution order now
    includes the `_model_dir` hint so packs download into the LocalAI
    models directory by default.

  * install.sh — no longer pre-downloads any weights. Everything is
    gallery-managed now.

  * smoke.py (new) — parametrized smoke test that iterates over every
    gallery configuration, simulating the LocalAI install flow
    (creates a models dir, fetches OpenCV files with checksum
    verification, lets insightface auto-download its packs), then
    runs detect + embed + verify (+ analyze where supported) through
    the in-process BackendServicer.

  * test.py — OnnxDirectEngineTest no longer hardcodes `/models/opencv/`
    paths; downloads ONNX files to a temp dir at setUpClass time and
    passes ModelPath accordingly.

Registry change (core/services/facerecognition/store_registry.go):

  * `dim=0` in NewStoreRegistry now means "accept whatever dimension
    arrives" — needed because the backend supports 512-d ArcFace/MBF
    and 128-d SFace via the same Registry. A non-zero dim still fails
    fast with ErrDimensionMismatch.

  * core/application plumbs `faceEmbeddingDim = 0`, explaining the
    rationale in the comment.

Backend gallery description updated to reflect that the image carries
no weights — it's just Python + engines.

Smoke-tested all 7 configurations against the rebuilt image (with the
flatten fix applied), exit 0:

    PASS: insightface-buffalo-l    faces=6 dim=512 same-dist=0.000
    PASS: insightface-buffalo-sc   faces=6 dim=512 same-dist=0.000
    PASS: insightface-buffalo-s    faces=6 dim=512 same-dist=0.000
    PASS: insightface-buffalo-m    faces=6 dim=512 same-dist=0.000
    PASS: insightface-antelopev2   faces=6 dim=512 same-dist=0.000
    PASS: insightface-opencv       faces=6 dim=128 same-dist=0.000
    PASS: insightface-opencv-int8  faces=6 dim=128 same-dist=0.000
    7/7 passed

Assisted-by: Claude:claude-opus-4-7

* fix(face-recognition): pre-fetch OpenCV ONNX for e2e target; drop stale pre-baked claim

CI regression from the previous commit: I moved OpenCV Zoo weight
delivery to LocalAI's gallery `files:` mechanism, but the
test-extra-backend-insightface-opencv target was still passing
relative paths `detector_onnx:models/opencv/yunet.onnx` in
BACKEND_TEST_OPTIONS. The e2e suite drives LoadModel directly over
gRPC without going through the gallery, so those relative paths
resolved to nothing and OpenCV's ONNXImporter failed:

    LoadModel failed: Failed to load face engine:
    OpenCV(4.13.0) ... Can't read ONNX file: models/opencv/yunet.onnx

Fix: add an `insightface-opencv-models` prerequisite target that
fetches the two ONNX files (YuNet + SFace) to a deterministic host
cache at /tmp/localai-insightface-opencv-cache/, verifies SHA-256,
and skips the download on re-runs. The opencv test target depends on
it and passes absolute paths in BACKEND_TEST_OPTIONS, so the backend
finds the files via its normal absolute-path resolution branch.

Also refresh the buffalo_l comment: it no longer says "pre-baked"
(nothing is — the pack auto-downloads from upstream's GitHub release
on first LoadModel, same as in CI).

Locally verified: `make test-extra-backend-insightface-opencv` passes
5/5 specs (health, load, face_detect, face_embed, face_verify).

Assisted-by: Claude:claude-opus-4-7

* feat(face-recognition): add POST /v1/face/embed + correct /v1/embeddings docs

The docs promised that /v1/embeddings returns face vectors when you
send an image data-URI. That was never true: /v1/embeddings is
OpenAI-compatible and text-only by contract — its handler goes
through `core/backend/embeddings.go::ModelEmbedding`, which sets
`predictOptions.Embeddings = s` (a string of TEXT to embed) and never
populates `predictOptions.Images[]`. The Python backend's Embedding
gRPC method does handle Images[] (that's how /v1/face/register reaches
it internally via `backend.FaceEmbed`), but the HTTP embeddings
endpoint wasn't wired to populate it.

Rather than overload /v1/embeddings with image-vs-text detection —
messy, and the endpoint is OpenAI-compatible by design — add a
dedicated /v1/face/embed endpoint that wraps `backend.FaceEmbed`
(already used internally by /v1/face/register and /v1/face/identify).

Matches LocalAI's convention of a dedicated path per non-standard flow
(/v1/rerank, /v1/detection, /v1/face/verify etc.).

Response:

    {
      "embedding": [<dim> floats, L2-normed],
      "dim": int,           // 512 for ArcFace R50 / MBF, 128 for SFace
      "model": "<name>"
    }

Live-tested on the opencv engine: returns a 128-d L2-normalized vector
(sum(x^2) = 1.0000). Sentinel in docs updated to note /v1/embeddings
is text-only and point image users at /v1/face/embed instead.

Assisted-by: Claude:claude-opus-4-7

* fix(http): map malformed image input + gRPC status codes to proper 4xx

Image-input failures on LocalAI's single-image endpoints (/v1/detection,
/v1/face/{verify,analyze,embed,register,identify}) have historically
returned 500 — even when the client was the one who sent garbage.
Classic example: you POST an "image" that isn't a URL, isn't a
data-URI, and isn't a valid JPEG/PNG — the server shouldn't claim
that's its fault.

Two helpers land in core/http/endpoints/localai/images.go and every
single-image handler is switched over:

  * decodeImageInput(s)
      Wraps utils.GetContentURIAsBase64 and turns any failure
      (invalid URL, not a data-URI, download error, etc.) into
      echo.NewHTTPError(400, "invalid image input: ...").

  * mapBackendError(err)
      Inspects the gRPC status on a backend call error and maps:
        INVALID_ARGUMENT     → 400 Bad Request
        NOT_FOUND            → 404 Not Found
        FAILED_PRECONDITION  → 412 Precondition Failed
        Unimplemented        → 501 Not Implemented
      All other codes fall through unchanged (still 500).

Before, my 1×1 PNG error-path test returned:
    HTTP 500 "rpc error: code = InvalidArgument desc = failed to decode one or both images"
After:
    HTTP 400 "failed to decode one or both images"

Scope-limited to the LocalAI single-image endpoints. The multi-modal
paths (middleware/request.go, openresponses/responses.go,
openai/realtime.go) intentionally log-and-skip individual media parts
when decoding fails — different design intent (graceful degradation
of a multi-part message), not a 400-worthy failure. Left untouched.

Live-verified: every error case in /tmp/face_errors.py now returns
4xx with a meaningful message; the "image with no face (1x1 PNG)"
case specifically went from 500 → 400.

Assisted-by: Claude:claude-opus-4-7

* refactor(face-recognition): insightface packs go through gallery files:, drop FaceAnalysis

Follows up on the discovery that LocalAI's gallery `files:` mechanism
handles archives (zip, tar.gz, …) via mholt/archiver/v3 — the rhasspy
piper voices use exactly this pattern. Insightface packs are zip
archives, so we can now deliver them the same way every other
gallery-managed model gets delivered: declaratively, checksum-verified,
through LocalAI's standard download+extract pipeline.

Two changes:

1. Gallery (gallery/index.yaml) — every insightface-* entry gains a
   `files:` list with the pack zip's URI + SHA-256. `local-ai models
   install insightface-buffalo-l` now fetches the zip, verifies the
   hash, and extracts it into the models directory. No more reliance
   on insightface's library-internal `ensure_available()` auto-download
   or its hardcoded `BASE_REPO_URL`.

2. InsightFaceEngine (backend/python/insightface/engines.py) — drops
   the FaceAnalysis wrapper and drives insightface's `model_zoo`
   directly. The ~50 lines FaceAnalysis provides — glob ONNX files,
   route each through `model_zoo.get_model()`, build a
   `{taskname: model}` dict, loop per-face at inference — are
   reimplemented in `InsightFaceEngine`. The actual inference classes
   (RetinaFace, ArcFaceONNX, Attribute, Landmark) are still
   insightface's — we only replicate the glue, so drift risk against
   upstream is minimal.

   Why drop FaceAnalysis: it hard-codes a `<root>/models/<name>/*.onnx`
   layout that doesn't match what LocalAI's zip extraction produces.
   LocalAI unpacks archives flat into `<models_dir>`. Upstream packs
   are inconsistent — buffalo_l/s/sc ship ONNX at the zip root (lands
   at `<models_dir>/*.onnx`), buffalo_m/antelopev2 wrap in a redundant
   `<name>/` dir (lands at `<models_dir>/<name>/*.onnx`). The new
   `_locate_insightface_pack` helper searches both locations plus
   legacy paths and returns whichever has ONNX files. Replaces the
   earlier `_flatten_insightface_pack` helper (which tried to fight
   FaceAnalysis's layout expectations; now we just find the files
   wherever they are).

Net effect for users: install once via LocalAI's managed flow,
weights live alongside every other model, progress shows in the
jobs endpoint, no first-load network call. Same API surface,
cleaner plumbing.

Assisted-by: Claude:claude-opus-4-7

* fix(face-recognition): CI's insightface e2e path needs the pack pre-fetched

The e2e suite drives LoadModel over gRPC without going through LocalAI's
gallery flow, so the engine's `_model_dir` option (normally populated
from ModelPath) is empty. Previously the insightface target relied on
FaceAnalysis auto-download to paper over this, but we dropped
FaceAnalysis in favor of direct model_zoo calls — so the buffalo_l
target started failing at LoadModel with "no insightface pack found".

Mirror the opencv target's pre-fetch pattern: download buffalo_sc.zip
(same SHA as the gallery entry), extract it on the host, and pass
`root:<dir>` so the engine locates the pack without needing
ModelPath. Switched to buffalo_sc (smallest pack, ~16MB) to keep CI
fast; it covers the same insightface engine code path as buffalo_l.

Face analyze cap dropped since buffalo_sc has no age/gender head.

Assisted-by: Claude:claude-opus-4-7[1m]

* feat(face-recognition): surface face-recognition in advertised feature maps

The six /v1/face/* endpoints were missing from every place LocalAI
advertises its feature surface to clients:

  * api_instructions — the machine-readable capability index at
    GET /api/instructions. Added `face-recognition` as a dedicated
    instruction area with an intro that calls out the in-memory
    registry caveat and the /v1/face/embed vs /v1/embeddings split.
  * auth/permissions — added FeatureFaceRecognition constant, routed
    all six face endpoints through it so admins can gate them per-user
    like any other API feature. Default ON (matches the other API
    features).
  * React UI capabilities — CAP_FACE_RECOGNITION symbol mapped to
    FLAG_FACE_RECOGNITION. Declared only for now; the Face page is a
    follow-up (noted in the plan).

Instruction count bumped 9 → 10; test updated.

Assisted-by: Claude:claude-opus-4-7[1m]

* docs(agents): capture advertising-surface steps in the endpoint guide

Before this change, adding a new /v1/* endpoint reliably missed one or
more of: the swagger @Tags annotation, the /api/instructions registry,
the auth RouteFeatureRegistry, and the React UI CAP_* symbol. The
endpoint would work but be invisible to API consumers, admins, and the
UI — and nothing in the existing docs said to look in those places.

Extend .agents/api-endpoints-and-auth.md with a new "Advertising
surfaces" section covering all four surfaces (swagger tags, /api/
instructions, capabilities.js, docs/), and expand the closing checklist
so it's impossible to ship a feature without visiting each one. Hoist a
one-liner reminder into AGENTS.md's Quick Reference so agents skim it
before diving in.

Assisted-by: Claude:claude-opus-4-7[1m]
2026-04-22 21:55:41 +02:00

266 lines
9.8 KiB
Python

#!/usr/bin/env python3
"""gRPC server for the insightface face recognition backend.
Implements Health / LoadModel / Status plus the face-specific methods:
Embedding, Detect, FaceVerify, FaceAnalyze. The heavy lifting is
delegated to engines.py — this file is just the gRPC plumbing.
"""
import argparse
import base64
import os
import signal
import sys
import time
from concurrent import futures
from io import BytesIO
import backend_pb2
import backend_pb2_grpc
import cv2
import grpc
import numpy as np
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "common"))
sys.path.insert(0, os.path.join(os.path.dirname(__file__), "common"))
from grpc_auth import get_auth_interceptors # noqa: E402
from engines import FaceEngine, build_engine # noqa: E402
_ONE_DAY = 60 * 60 * 24
MAX_WORKERS = int(os.environ.get("PYTHON_GRPC_MAX_WORKERS", "1"))
# Default cosine-distance threshold for "same person" on buffalo_l
# ArcFace R50. Clients can override per-request; clients using SFace
# should pass threshold≈0.4 since the distance distribution is wider.
DEFAULT_VERIFY_THRESHOLD = 0.35
def _decode_image(src: str) -> np.ndarray | None:
"""Decode a base64-encoded image into an OpenCV BGR numpy array."""
if not src:
return None
try:
data = base64.b64decode(src, validate=False)
except Exception:
return None
arr = np.frombuffer(data, dtype=np.uint8)
if arr.size == 0:
return None
img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
return img
def _parse_options(raw: list[str]) -> dict[str, str]:
out: dict[str, str] = {}
for entry in raw:
if ":" not in entry:
continue
k, v = entry.split(":", 1)
out[k.strip()] = v.strip()
return out
class BackendServicer(backend_pb2_grpc.BackendServicer):
def __init__(self) -> None:
self.engine: FaceEngine | None = None
self.engine_name: str = ""
self.model_name: str = ""
self.verify_threshold: float = DEFAULT_VERIFY_THRESHOLD
def Health(self, request, context):
return backend_pb2.Reply(message=bytes("OK", "utf-8"))
def LoadModel(self, request, context):
options = _parse_options(list(request.Options))
# Surface LocalAI's models directory (ModelPath) so engines can
# anchor relative paths — OnnxDirectEngine's detector_onnx /
# recognizer_onnx point at gallery-managed files that LocalAI
# dropped there, and InsightFaceEngine auto-downloads its packs
# into that same directory alongside every other managed model.
# Private key to avoid clashing with user-provided options.
if request.ModelPath:
options["_model_dir"] = request.ModelPath
engine_name = options.get("engine", "insightface")
try:
self.engine = build_engine(engine_name)
self.engine.prepare(options)
except Exception as err: # pragma: no cover - exercised via e2e
return backend_pb2.Result(success=False, message=f"Failed to load face engine: {err}")
self.engine_name = engine_name
self.model_name = request.Model or options.get("model_pack", "")
if "verify_threshold" in options:
try:
self.verify_threshold = float(options["verify_threshold"])
except ValueError:
pass
print(f"[insightface] engine={engine_name} model={self.model_name} loaded", file=sys.stderr)
return backend_pb2.Result(success=True, message="Model loaded successfully")
def Status(self, request, context):
state = (
backend_pb2.StatusResponse.READY
if self.engine is not None
else backend_pb2.StatusResponse.UNINITIALIZED
)
return backend_pb2.StatusResponse(state=state)
def Embedding(self, request, context):
if self.engine is None:
context.set_code(grpc.StatusCode.FAILED_PRECONDITION)
context.set_details("face model not loaded")
return backend_pb2.EmbeddingResult()
if not request.Images:
context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
context.set_details("Embedding requires Images[0] to be a base64 image")
return backend_pb2.EmbeddingResult()
img = _decode_image(request.Images[0])
if img is None:
context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
context.set_details("failed to decode image")
return backend_pb2.EmbeddingResult()
vec = self.engine.embed(img)
if vec is None:
context.set_code(grpc.StatusCode.NOT_FOUND)
context.set_details("no face detected")
return backend_pb2.EmbeddingResult()
return backend_pb2.EmbeddingResult(embeddings=[float(x) for x in vec])
def Detect(self, request, context):
if self.engine is None:
return backend_pb2.DetectResponse()
img = _decode_image(request.src)
if img is None:
return backend_pb2.DetectResponse()
detections = []
for d in self.engine.detect(img):
x1, y1, x2, y2 = d.bbox
detections.append(
backend_pb2.Detection(
x=float(x1),
y=float(y1),
width=float(x2 - x1),
height=float(y2 - y1),
confidence=float(d.score),
class_name="face",
)
)
return backend_pb2.DetectResponse(Detections=detections)
def FaceVerify(self, request, context):
if self.engine is None:
context.set_code(grpc.StatusCode.FAILED_PRECONDITION)
context.set_details("face model not loaded")
return backend_pb2.FaceVerifyResponse()
img1 = _decode_image(request.img1)
img2 = _decode_image(request.img2)
if img1 is None or img2 is None:
context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
context.set_details("failed to decode one or both images")
return backend_pb2.FaceVerifyResponse()
threshold = request.threshold if request.threshold > 0 else self.verify_threshold
start = time.time()
e1 = self.engine.embed(img1)
e2 = self.engine.embed(img2)
if e1 is None or e2 is None:
context.set_code(grpc.StatusCode.NOT_FOUND)
context.set_details("no face detected in one or both images")
return backend_pb2.FaceVerifyResponse()
# Both engines return L2-normalized vectors, so the dot product
# is the cosine similarity directly.
sim = float(np.dot(e1, e2))
distance = 1.0 - sim
verified = distance < threshold
confidence = max(0.0, min(100.0, (1.0 - distance / threshold) * 100.0)) if threshold > 0 else 0.0
def _region(img) -> backend_pb2.FacialArea:
dets = self.engine.detect(img)
if not dets:
return backend_pb2.FacialArea()
best = max(dets, key=lambda d: d.score)
x1, y1, x2, y2 = best.bbox
return backend_pb2.FacialArea(x=x1, y=y1, w=x2 - x1, h=y2 - y1)
return backend_pb2.FaceVerifyResponse(
verified=verified,
distance=float(distance),
threshold=float(threshold),
confidence=float(confidence),
model=self.model_name or self.engine_name,
img1_area=_region(img1),
img2_area=_region(img2),
processing_time_ms=float((time.time() - start) * 1000.0),
)
def FaceAnalyze(self, request, context):
if self.engine is None:
context.set_code(grpc.StatusCode.FAILED_PRECONDITION)
context.set_details("face model not loaded")
return backend_pb2.FaceAnalyzeResponse()
img = _decode_image(request.img)
if img is None:
context.set_code(grpc.StatusCode.INVALID_ARGUMENT)
context.set_details("failed to decode image")
return backend_pb2.FaceAnalyzeResponse()
faces = []
for attrs in self.engine.analyze(img):
x, y, w, h = attrs.region
fa = backend_pb2.FaceAnalysis(
region=backend_pb2.FacialArea(x=float(x), y=float(y), w=float(w), h=float(h)),
face_confidence=float(attrs.face_confidence),
)
if attrs.age is not None:
fa.age = float(attrs.age)
if attrs.dominant_gender:
fa.dominant_gender = attrs.dominant_gender
for k, v in attrs.gender.items():
fa.gender[k] = float(v)
faces.append(fa)
return backend_pb2.FaceAnalyzeResponse(faces=faces)
def serve(address: str) -> None:
server = grpc.server(
futures.ThreadPoolExecutor(max_workers=MAX_WORKERS),
options=[
("grpc.max_message_length", 50 * 1024 * 1024),
("grpc.max_send_message_length", 50 * 1024 * 1024),
("grpc.max_receive_message_length", 50 * 1024 * 1024),
],
interceptors=get_auth_interceptors(),
)
backend_pb2_grpc.add_BackendServicer_to_server(BackendServicer(), server)
server.add_insecure_port(address)
server.start()
print("[insightface] Server started. Listening on: " + address, file=sys.stderr)
def _stop(sig, frame): # pragma: no cover
print("[insightface] shutting down")
server.stop(0)
sys.exit(0)
signal.signal(signal.SIGINT, _stop)
signal.signal(signal.SIGTERM, _stop)
try:
while True:
time.sleep(_ONE_DAY)
except KeyboardInterrupt:
server.stop(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser(description="Run the insightface gRPC server.")
parser.add_argument("--addr", default="localhost:50051", help="The address to bind the server to.")
args = parser.parse_args()
print(f"[insightface] startup: {args}", file=sys.stderr)
serve(args.addr)