feat(backend): rfdetr-cpp native object detection + segmentation backend (#10028)

Adds a Go native gRPC backend that dlopens librfdetrcpp.so (built from
mudler/rf-detr.cpp at the pinned RFDETR_VERSION) via purego and exposes
the rfdetr.cpp inference pipeline through LocalAI's existing Detect RPC.

Supports all 5 RF-DETR detection variants (Nano/Small/Base/Medium/Large)
and 6 segmentation variants (SegNano/SegSmall/SegMedium/SegLarge/
SegXLarge/Seg2XLarge) with F32/F16/Q8_0/Q4_K quantizations. Pre-built
GGUFs ship at mudler/rfdetr-cpp-* on HuggingFace.
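
The gallery entries in this commit suggest a regular naming scheme for the published files. A minimal sketch of that scheme, assuming it holds for every variant and quantization; the helper name `ggufURI` is hypothetical, and the lowercase `f32` and `q4_k` suffixes are assumptions (only `q8_0` and `f16` filenames appear in the diff):

```go
package main

import (
	"fmt"
	"strings"
)

// ggufURI builds a gallery-style download URI for a variant/quantization pair.
// The pattern is inferred from the gallery entries in this commit, not taken
// from the backend source; suffixes other than q8_0 and f16 are assumptions.
func ggufURI(variant, quant string) string {
	v := strings.ToLower(variant)
	q := strings.ToLower(quant)
	file := fmt.Sprintf("rfdetr-%s-%s.gguf", v, q)
	return fmt.Sprintf("huggingface://mudler/rfdetr-cpp-%s/%s", v, file)
}

func main() {
	fmt.Println(ggufURI("nano", "Q8_0"))
	fmt.Println(ggufURI("seg-2xlarge", "F16"))
}
```

Both printed URIs match the `uri` fields of the rfdetr-cpp-nano and rfdetr-cpp-seg-2xlarge gallery entries.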

Detection returns Bbox + class_name + confidence; segmentation also
returns PNG-encoded per-detection masks via the rfdetr_capi accessor
functions (rfdetr_capi_get_detection_{class_id,box,score,class_name,
mask_png}).
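
A client that receives a PNG-encoded mask can inspect it with the Go standard library alone. The sketch below is illustrative, not the backend's code: `maskArea` and `demoMask` are hypothetical helpers, and treating any pixel with non-zero luminance as foreground is an assumption about the mask encoding.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/color"
	"image/png"
)

// maskArea decodes a PNG-encoded mask and counts its foreground pixels.
// Assumption: a pixel with non-zero luminance belongs to the instance.
func maskArea(pngBytes []byte) (int, error) {
	img, err := png.Decode(bytes.NewReader(pngBytes))
	if err != nil {
		return 0, fmt.Errorf("decode mask: %w", err)
	}
	b := img.Bounds()
	area := 0
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			if color.GrayModel.Convert(img.At(x, y)).(color.Gray).Y > 0 {
				area++
			}
		}
	}
	return area, nil
}

// demoMask builds a 4x4 grayscale PNG with a 2x2 foreground square,
// standing in for a mask returned by the backend.
func demoMask() []byte {
	m := image.NewGray(image.Rect(0, 0, 4, 4))
	for y := 1; y < 3; y++ {
		for x := 1; x < 3; x++ {
			m.SetGray(x, y, color.Gray{Y: 255})
		}
	}
	var buf bytes.Buffer
	if err := png.Encode(&buf, m); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	area, err := maskArea(demoMask())
	if err != nil {
		panic(err)
	}
	fmt.Println("foreground pixels:", area)
}
```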

End-to-end verified through POST /v1/detection: HTTP -> gRPC -> purego
dlopen -> rfdetr.cpp -> ggml -> response (9 detections on the detection
model, 21 detections + valid PNG masks on the seg-nano model against
the kitchen fixture).
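
For readers who want to exercise the same path, the request and response bodies can be modelled with a few Go structs. This is a sketch under stated assumptions: the JSON field names (`model`, `image`, `detections`, `x`, `y`, `width`, `height`, `class_name`, `confidence`) are assumed from the fields named in this message and are not confirmed against the schema in this commit, the helper names are hypothetical, and the sample response is invented for illustration. No network call is made.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// detectionRequest mirrors the assumed body of POST /v1/detection.
type detectionRequest struct {
	Model string `json:"model"`
	Image string `json:"image"`
}

// detection holds one result; field names are assumptions, not the confirmed schema.
type detection struct {
	X          float32 `json:"x"`
	Y          float32 `json:"y"`
	Width      float32 `json:"width"`
	Height     float32 `json:"height"`
	ClassName  string  `json:"class_name"`
	Confidence float32 `json:"confidence"`
}

type detectionResponse struct {
	Detections []detection `json:"detections"`
}

// encodeRequest serializes the request body for the given model and image reference.
func encodeRequest(model, image string) ([]byte, error) {
	return json.Marshal(detectionRequest{Model: model, Image: image})
}

// decodeResponse parses a response body into its detections.
func decodeResponse(body []byte) ([]detection, error) {
	var r detectionResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	return r.Detections, nil
}

func main() {
	req, err := encodeRequest("rfdetr-cpp-nano", "https://example.com/kitchen.jpg")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(req))

	// Invented sample payload, used only to show the decoding step.
	sample := []byte(`{"detections":[{"x":10,"y":20,"width":30,"height":40,"class_name":"cup","confidence":0.9}]}`)
	dets, err := decodeResponse(sample)
	if err != nil {
		panic(err)
	}
	for _, d := range dets {
		fmt.Printf("%s at (%.0f,%.0f) %.0fx%.0f\n", d.ClassName, d.X, d.Y, d.Width, d.Height)
	}
}
```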

Wiring:
  - backend/go/rfdetr-cpp/{main.go,gorfdetrcpp.go,CMakeLists.txt,
    Makefile,run.sh,package.sh,test.sh,.gitignore}
  - Top-level Makefile: BACKEND_RFDETR_CPP, docker-build target,
    .NOTPARALLEL, prepare-test-extra, test-extra
  - backend/go/rfdetr-cpp/Makefile: `test` target invoked by test-extra
  - .github/backend-matrix.yml: CPU + CUDA-12/13 + L4T CUDA-12/13
    (arm64) + HIP + Vulkan (amd64 + arm64) + SYCL f32/f16
  - backend/index.yaml: rfdetr-cpp meta anchor + latest/development
    image entries for every matrix tag-suffix
  - .github/workflows/bump_deps.yaml: RFDETR_VERSION pin tracking
    (mudler/rf-detr.cpp branch main)
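  - gallery/index.yaml: 11 rfdetr-cpp-* entries (nano + 4 detection
    variants + 6 seg variants), all backed by mudler/rfdetr-cpp-*
    on HuggingFace; the small/medium/large and seg entries pin the
    sha256 of their F16 default (nano defaults to Q8_0, and nano and
    base carry no checksum in this diff)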
  - core/gallery/importers/rfdetr.go: GGUF auto-routing for HF imports
    (mudler/rfdetr-cpp-* repos route to rfdetr-cpp, Transformer-format
    repos stay on the Python rfdetr backend; explicit preferences.backend
    overrides both heuristics)
  - core/gallery/importers/rfdetr_test.go: table-driven coverage of the
    auto-routing + a live mudler/rfdetr-cpp-nano cross-check
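
The routing precedence described for the importer can be sketched as a small pure function. This is a simplified illustration, not the contents of core/gallery/importers/rfdetr.go: the function name `pickBackend` and the prefix check are assumptions, and the real importer may inspect repository files rather than the repository name.

```go
package main

import (
	"fmt"
	"strings"
)

// pickBackend applies the precedence described above:
//  1. an explicit preference always wins;
//  2. mudler/rfdetr-cpp-* repositories route to the native backend;
//  3. everything else stays on the Python rfdetr backend.
func pickBackend(repo, preferred string) string {
	if preferred != "" {
		return preferred
	}
	if strings.HasPrefix(repo, "mudler/rfdetr-cpp-") {
		return "rfdetr-cpp"
	}
	return "rfdetr"
}

func main() {
	// Table-driven walk through the three cases; the second repository name is made up.
	cases := []struct{ repo, preferred string }{
		{"mudler/rfdetr-cpp-nano", ""},
		{"example/rfdetr-transformers", ""},
		{"mudler/rfdetr-cpp-nano", "rfdetr"},
	}
	for _, c := range cases {
		fmt.Printf("%s (preferred=%q) -> %s\n", c.repo, c.preferred, pickBackend(c.repo, c.preferred))
	}
}
```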

scripts/changed-backends.js needs no change: the existing
Dockerfile.golang -> backend/go/${item.backend}/ branch already routes
the 9 rfdetr-cpp matrix entries to the correct backend path.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
Author: LocalAI [bot]
Date: 2026-05-27 18:43:57 +02:00
Committed by: GitHub
Parent: 893e69cbf8
Commit: 7a4ca8f60d
18 changed files with 1697 additions and 6 deletions

@@ -6182,6 +6182,317 @@
      - detection
    parameters:
      model: rfdetr-base
- name: rfdetr-cpp-nano
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-nano
  description: |
    RF-DETR Nano object detection model, served via the native rfdetr.cpp backend (ggml + purego, no Python).
    Q8_0 quantization is the recommended default for CPU: same accuracy as F16/F32, ~20MB on disk, fastest CPU latency.
    Pure C++/ggml runtime; no Python dependencies. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-nano-q8_0.gguf
  files:
    - filename: rfdetr-nano-q8_0.gguf
      uri: huggingface://mudler/rfdetr-cpp-nano/rfdetr-nano-q8_0.gguf
- name: rfdetr-cpp-base
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-base
  description: |
    RF-DETR Base object detection model, served via the native rfdetr.cpp backend.
    F16 quantization is recommended on CPU: identical accuracy to F32, half the size, fastest.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-base-f16.gguf
  files:
    - filename: rfdetr-base-f16.gguf
      uri: huggingface://mudler/rfdetr-cpp-base/rfdetr-base-f16.gguf
- name: rfdetr-cpp-small
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-small
  description: |
    RF-DETR Small object detection model (DINOv2-small backbone, 512px input, 3 decoder layers), served
    via the native rfdetr.cpp backend (ggml + purego, no Python). A step up from Nano in accuracy while
    staying lightweight on CPU. F16 quantization is the recommended default: identical accuracy to F32
    at roughly half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-small-f16.gguf
  files:
    - filename: rfdetr-small-f16.gguf
      sha256: 5365264a976bb99ab31f735f43326e50b0804a60cd1709abe8c1c95114c4d79d
      uri: huggingface://mudler/rfdetr-cpp-small/rfdetr-small-f16.gguf
- name: rfdetr-cpp-medium
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-medium
  description: |
    RF-DETR Medium object detection model (DINOv2-small backbone, 576px input, 4 decoder layers), served
    via the native rfdetr.cpp backend. Balanced detection quality vs. CPU latency; recommended when
    Base is not accurate enough but Large is too slow. F16 quantization is the recommended default:
    identical accuracy to F32, half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-medium-f16.gguf
  files:
    - filename: rfdetr-medium-f16.gguf
      sha256: 685b8f50890f099bbc603454309b2d5f1d471541420b95c20c6ed296aec1e7ae
      uri: huggingface://mudler/rfdetr-cpp-medium/rfdetr-medium-f16.gguf
- name: rfdetr-cpp-large
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-large
  description: |
    RF-DETR Large object detection model (DINOv2-small backbone, 704px input, 4 decoder layers), served
    via the native rfdetr.cpp backend. Highest-accuracy detection variant, best for offline workflows
    and high-resolution inputs where CPU latency is secondary to recall. F16 quantization is the
    recommended default: identical accuracy to F32, half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-large-f16.gguf
  files:
    - filename: rfdetr-large-f16.gguf
      sha256: 819f1abc72f746a686722eacc9c4db992b7ca853b26e390ab0a66ca6ea70060a
      uri: huggingface://mudler/rfdetr-cpp-large/rfdetr-large-f16.gguf
- name: rfdetr-cpp-seg-nano
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-nano
  description: |
    RF-DETR Seg-Nano instance segmentation model (DINOv2-small backbone, 312px input, 4 decoder layers,
    100 queries), served via the native rfdetr.cpp backend. Smallest segmentation variant: fastest CPU
    latency, ideal for edge deployment. Returns both bounding boxes and per-instance masks via the
    /v1/detection endpoint. F16 quantization is the recommended default: identical accuracy to F32,
    half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-nano-f16.gguf
  files:
    - filename: rfdetr-seg-nano-f16.gguf
      sha256: 9f9a0ab547743992b6c664d41ee1a6afcd66b21b04609a68f76c0eec88648c2b
      uri: huggingface://mudler/rfdetr-cpp-seg-nano/rfdetr-seg-nano-f16.gguf
- name: rfdetr-cpp-seg-small
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-small
  description: |
    RF-DETR Seg-Small instance segmentation model (DINOv2-small backbone, 384px input, 4 decoder layers,
    100 queries), served via the native rfdetr.cpp backend. Step up from Seg-Nano in mask quality while
    staying CPU-friendly. Returns both bounding boxes and per-instance masks via the /v1/detection
    endpoint. F16 quantization is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-small-f16.gguf
  files:
    - filename: rfdetr-seg-small-f16.gguf
      sha256: 1b569a182aea941ec645a1923c1e8ad9db05e006db36136da9f148d1ec066670
      uri: huggingface://mudler/rfdetr-cpp-seg-small/rfdetr-seg-small-f16.gguf
- name: rfdetr-cpp-seg-medium
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-medium
  description: |
    RF-DETR Seg-Medium instance segmentation model (DINOv2-small backbone, 432px input, 5 decoder layers,
    200 queries), served via the native rfdetr.cpp backend. Balanced segmentation quality vs. CPU latency;
    recommended for everyday segmentation workloads. Returns both bounding boxes and per-instance masks
    via the /v1/detection endpoint. F16 quantization is the recommended default.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-medium-f16.gguf
  files:
    - filename: rfdetr-seg-medium-f16.gguf
      sha256: 885d85ed6935495fc50ff464e06b6ea3bd8e8386865852d68a8be0f649d65afe
      uri: huggingface://mudler/rfdetr-cpp-seg-medium/rfdetr-seg-medium-f16.gguf
- name: rfdetr-cpp-seg-large
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-large
  description: |
    RF-DETR Seg-Large instance segmentation model (DINOv2-small backbone, 504px input, 5 decoder layers,
    200 queries), served via the native rfdetr.cpp backend. Higher-resolution input than Seg-Medium for
    sharper mask boundaries. Returns both bounding boxes and per-instance masks via the /v1/detection
    endpoint. F16 quantization is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-large-f16.gguf
  files:
    - filename: rfdetr-seg-large-f16.gguf
      sha256: 90423066d0791b4ae249f3986cce1f095a1e4090bf46800bf7f9e371ea80d559
      uri: huggingface://mudler/rfdetr-cpp-seg-large/rfdetr-seg-large-f16.gguf
- name: rfdetr-cpp-seg-xlarge
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-xlarge
  description: |
    RF-DETR Seg-XLarge instance segmentation model (DINOv2-small backbone, 624px input, 6 decoder layers,
    300 queries), served via the native rfdetr.cpp backend. High-capacity segmentation variant with more
    queries and a deeper decoder, best for dense scenes with many instances. Returns both bounding boxes
    and per-instance masks via the /v1/detection endpoint. F16 quantization is the recommended default.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-xlarge-f16.gguf
  files:
    - filename: rfdetr-seg-xlarge-f16.gguf
      sha256: 0b82de4a6e65a40bc930979a1a4281cb24de35203d30eeefd797c858101a7bec
      uri: huggingface://mudler/rfdetr-cpp-seg-xlarge/rfdetr-seg-xlarge-f16.gguf
- name: rfdetr-cpp-seg-2xlarge
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-2xlarge
  description: |
    RF-DETR Seg-2XLarge instance segmentation model (DINOv2-small backbone, 768px input, 6 decoder layers,
    300 queries), served via the native rfdetr.cpp backend. Highest-accuracy segmentation variant, best
    for offline workflows and high-resolution inputs where CPU latency is secondary to mask quality.
    Returns both bounding boxes and per-instance masks via the /v1/detection endpoint. F16 quantization
    is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-2xlarge-f16.gguf
  files:
    - filename: rfdetr-seg-2xlarge-f16.gguf
      sha256: 7f957997db23e844194ea8266a95b4adc3deb6d0b71c0924922b20fbdeafa299
      uri: huggingface://mudler/rfdetr-cpp-seg-2xlarge/rfdetr-seg-2xlarge-f16.gguf
- name: edgetam
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls: