Mirror of https://github.com/mudler/LocalAI.git (synced 2026-06-04 23:06:22 -04:00)
feat(backend): rfdetr-cpp native object detection + segmentation backend (#10028)

Adds a Go native gRPC backend that dlopens librfdetrcpp.so (built from
mudler/rf-detr.cpp at the pinned RFDETR_VERSION) via purego and exposes
the rfdetr.cpp inference pipeline through LocalAI's existing Detect RPC.

Supports all 5 RF-DETR detection variants (Nano/Small/Base/Medium/Large)
and 6 segmentation variants (SegNano/SegSmall/SegMedium/SegLarge/
SegXLarge/Seg2XLarge) with F32/F16/Q8_0/Q4_K quantizations. Pre-built
GGUFs ship at mudler/rfdetr-cpp-* on HuggingFace.

Detection returns Bbox + class_name + confidence; segmentation also
returns PNG-encoded per-detection masks via the rfdetr_capi accessor
functions (rfdetr_capi_get_detection_{class_id,box,score,class_name,
mask_png}).

End-to-end verified through POST /v1/detection: HTTP -> gRPC -> purego
dlopen -> rfdetr.cpp -> ggml -> response (9 detections on the detection
model, 21 detections + valid PNG masks on the seg-nano model against
the kitchen fixture).

Wiring:
- backend/go/rfdetr-cpp/{main.go,gorfdetrcpp.go,CMakeLists.txt,
  Makefile,run.sh,package.sh,test.sh,.gitignore}
- Top-level Makefile: BACKEND_RFDETR_CPP, docker-build target,
  .NOTPARALLEL, prepare-test-extra, test-extra
- backend/go/rfdetr-cpp/Makefile: `test` target invoked by test-extra
- .github/backend-matrix.yml: CPU + CUDA-12/13 + L4T CUDA-12/13
  (arm64) + HIP + Vulkan (amd64 + arm64) + SYCL f32/f16
- backend/index.yaml: rfdetr-cpp meta anchor + latest/development
  image entries for every matrix tag-suffix
- .github/workflows/bump_deps.yaml: RFDETR_VERSION pin tracking
  (mudler/rf-detr.cpp branch main)
- gallery/index.yaml: 11 rfdetr-cpp-* entries (nano + 4 detection
  variants + 6 seg variants), all backed by mudler/rfdetr-cpp-*
  on HuggingFace with sha256 pinning on the F16 default
- core/gallery/importers/rfdetr.go: GGUF auto-routing for HF imports
  (mudler/rfdetr-cpp-* repos route to rfdetr-cpp, Transformer-format
  repos stay on the Python rfdetr backend; explicit preferences.backend
  overrides both heuristics)
- core/gallery/importers/rfdetr_test.go: table-driven coverage of the
  auto-routing + a live mudler/rfdetr-cpp-nano cross-check
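
The importer routing rule in the list above can be pictured with a short table-driven sketch. It is illustrative only and is not the code in core/gallery/importers/rfdetr.go: the function `pickBackend`, its signature, the prefix match on the repository name, and the example repository `example-org/rf-detr-base` are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	backendNative = "rfdetr-cpp" // native ggml backend added by this commit
	backendPython = "rfdetr"     // existing Python backend
)

// pickBackend is a hypothetical helper mirroring the described rule:
// an explicit preference wins, mudler/rfdetr-cpp-* repos go to the native
// backend, and everything else stays on the Python backend.
func pickBackend(repo, preferred string) string {
	if preferred != "" {
		return preferred
	}
	if strings.HasPrefix(strings.ToLower(repo), "mudler/rfdetr-cpp-") {
		return backendNative
	}
	return backendPython
}

func main() {
	cases := []struct {
		repo, preferred string
	}{
		{"mudler/rfdetr-cpp-nano", ""},
		{"example-org/rf-detr-base", ""}, // hypothetical Transformer-format repo
		{"mudler/rfdetr-cpp-nano", "rfdetr"},
	}
	for _, c := range cases {
		fmt.Printf("repo=%s preferred=%q -> %s\n", c.repo, c.preferred, pickBackend(c.repo, c.preferred))
	}
}
```

Checking the explicit preference first keeps the override unconditional, which matches the statement that it beats both heuristics.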

scripts/changed-backends.js needs no change: the existing
Dockerfile.golang -> backend/go/${item.backend}/ branch already routes
the 9 rfdetr-cpp matrix entries to the correct backend path.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
@@ -6182,6 +6182,317 @@
      - detection
    parameters:
      model: rfdetr-base
- name: rfdetr-cpp-nano
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-nano
  description: |
    RF-DETR Nano object detection model, served via the native rfdetr.cpp backend (ggml + purego, no Python).
    Q8_0 quantization is the recommended default for CPU: same accuracy as F16/F32, ~20MB on disk, fastest CPU latency.
    Pure C++/ggml runtime; no Python dependencies. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-nano-q8_0.gguf
  files:
    - filename: rfdetr-nano-q8_0.gguf
      uri: huggingface://mudler/rfdetr-cpp-nano/rfdetr-nano-q8_0.gguf
- name: rfdetr-cpp-base
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-base
  description: |
    RF-DETR Base object detection model, served via the native rfdetr.cpp backend.
    F16 quantization is recommended on CPU: identical accuracy to F32, half the size, fastest.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-base-f16.gguf
  files:
    - filename: rfdetr-base-f16.gguf
      uri: huggingface://mudler/rfdetr-cpp-base/rfdetr-base-f16.gguf
- name: rfdetr-cpp-small
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-small
  description: |
    RF-DETR Small object detection model (DINOv2-small backbone, 512px input, 3 decoder layers), served
    via the native rfdetr.cpp backend (ggml + purego, no Python). A step up from Nano in accuracy while
    staying lightweight on CPU. F16 quantization is the recommended default: identical accuracy to F32
    at roughly half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-small-f16.gguf
  files:
    - filename: rfdetr-small-f16.gguf
      sha256: 5365264a976bb99ab31f735f43326e50b0804a60cd1709abe8c1c95114c4d79d
      uri: huggingface://mudler/rfdetr-cpp-small/rfdetr-small-f16.gguf
- name: rfdetr-cpp-medium
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-medium
  description: |
    RF-DETR Medium object detection model (DINOv2-small backbone, 576px input, 4 decoder layers), served
    via the native rfdetr.cpp backend. Balanced detection quality vs. CPU latency — recommended when
    Base is not accurate enough but Large is too slow. F16 quantization is the recommended default:
    identical accuracy to F32, half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-medium-f16.gguf
  files:
    - filename: rfdetr-medium-f16.gguf
      sha256: 685b8f50890f099bbc603454309b2d5f1d471541420b95c20c6ed296aec1e7ae
      uri: huggingface://mudler/rfdetr-cpp-medium/rfdetr-medium-f16.gguf
- name: rfdetr-cpp-large
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-large
  description: |
    RF-DETR Large object detection model (DINOv2-small backbone, 704px input, 4 decoder layers), served
    via the native rfdetr.cpp backend. Highest-accuracy detection variant — best for offline workflows
    and high-resolution inputs where CPU latency is secondary to recall. F16 quantization is the
    recommended default: identical accuracy to F32, half the size. Drop-in for the /v1/detection endpoint.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-large-f16.gguf
  files:
    - filename: rfdetr-large-f16.gguf
      sha256: 819f1abc72f746a686722eacc9c4db992b7ca853b26e390ab0a66ca6ea70060a
      uri: huggingface://mudler/rfdetr-cpp-large/rfdetr-large-f16.gguf
- name: rfdetr-cpp-seg-nano
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-nano
  description: |
    RF-DETR Seg-Nano instance segmentation model (DINOv2-small backbone, 312px input, 4 decoder layers,
    100 queries), served via the native rfdetr.cpp backend. Smallest segmentation variant — fastest CPU
    latency, ideal for edge deployment. Returns both bounding boxes and per-instance masks via the
    /v1/detection endpoint. F16 quantization is the recommended default: identical accuracy to F32,
    half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-nano-f16.gguf
  files:
    - filename: rfdetr-seg-nano-f16.gguf
      sha256: 9f9a0ab547743992b6c664d41ee1a6afcd66b21b04609a68f76c0eec88648c2b
      uri: huggingface://mudler/rfdetr-cpp-seg-nano/rfdetr-seg-nano-f16.gguf
- name: rfdetr-cpp-seg-small
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-small
  description: |
    RF-DETR Seg-Small instance segmentation model (DINOv2-small backbone, 384px input, 4 decoder layers,
    100 queries), served via the native rfdetr.cpp backend. Step up from Seg-Nano in mask quality while
    staying CPU-friendly. Returns both bounding boxes and per-instance masks via the /v1/detection
    endpoint. F16 quantization is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-small-f16.gguf
  files:
    - filename: rfdetr-seg-small-f16.gguf
      sha256: 1b569a182aea941ec645a1923c1e8ad9db05e006db36136da9f148d1ec066670
      uri: huggingface://mudler/rfdetr-cpp-seg-small/rfdetr-seg-small-f16.gguf
- name: rfdetr-cpp-seg-medium
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-medium
  description: |
    RF-DETR Seg-Medium instance segmentation model (DINOv2-small backbone, 432px input, 5 decoder layers,
    200 queries), served via the native rfdetr.cpp backend. Balanced segmentation quality vs. CPU latency
    — recommended for everyday segmentation workloads. Returns both bounding boxes and per-instance masks
    via the /v1/detection endpoint. F16 quantization is the recommended default.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-medium-f16.gguf
  files:
    - filename: rfdetr-seg-medium-f16.gguf
      sha256: 885d85ed6935495fc50ff464e06b6ea3bd8e8386865852d68a8be0f649d65afe
      uri: huggingface://mudler/rfdetr-cpp-seg-medium/rfdetr-seg-medium-f16.gguf
- name: rfdetr-cpp-seg-large
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-large
  description: |
    RF-DETR Seg-Large instance segmentation model (DINOv2-small backbone, 504px input, 5 decoder layers,
    200 queries), served via the native rfdetr.cpp backend. Higher-resolution input than Seg-Medium for
    sharper mask boundaries. Returns both bounding boxes and per-instance masks via the /v1/detection
    endpoint. F16 quantization is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-large-f16.gguf
  files:
    - filename: rfdetr-seg-large-f16.gguf
      sha256: 90423066d0791b4ae249f3986cce1f095a1e4090bf46800bf7f9e371ea80d559
      uri: huggingface://mudler/rfdetr-cpp-seg-large/rfdetr-seg-large-f16.gguf
- name: rfdetr-cpp-seg-xlarge
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-xlarge
  description: |
    RF-DETR Seg-XLarge instance segmentation model (DINOv2-small backbone, 624px input, 6 decoder layers,
    300 queries), served via the native rfdetr.cpp backend. High-capacity segmentation variant with more
    queries and deeper decoder — best for dense scenes with many instances. Returns both bounding boxes
    and per-instance masks via the /v1/detection endpoint. F16 quantization is the recommended default.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-xlarge-f16.gguf
  files:
    - filename: rfdetr-seg-xlarge-f16.gguf
      sha256: 0b82de4a6e65a40bc930979a1a4281cb24de35203d30eeefd797c858101a7bec
      uri: huggingface://mudler/rfdetr-cpp-seg-xlarge/rfdetr-seg-xlarge-f16.gguf
- name: rfdetr-cpp-seg-2xlarge
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls:
    - https://github.com/mudler/rf-detr.cpp
    - https://huggingface.co/mudler/rfdetr-cpp-seg-2xlarge
  description: |
    RF-DETR Seg-2XLarge instance segmentation model (DINOv2-small backbone, 768px input, 6 decoder layers,
    300 queries), served via the native rfdetr.cpp backend. Highest-accuracy segmentation variant — best
    for offline workflows and high-resolution inputs where CPU latency is secondary to mask quality.
    Returns both bounding boxes and per-instance masks via the /v1/detection endpoint. F16 quantization
    is the recommended default: identical accuracy to F32, half the size.
  license: apache-2.0
  icon: https://avatars.githubusercontent.com/u/53104118?s=200&v=4
  tags:
    - object-detection
    - image-segmentation
    - rfdetr
    - native
    - cpp
    - cpu
  overrides:
    backend: rfdetr-cpp
    known_usecases:
      - detection
    parameters:
      model: rfdetr-seg-2xlarge-f16.gguf
  files:
    - filename: rfdetr-seg-2xlarge-f16.gguf
      sha256: 7f957997db23e844194ea8266a95b4adc3deb6d0b71c0924922b20fbdeafa299
      uri: huggingface://mudler/rfdetr-cpp-seg-2xlarge/rfdetr-seg-2xlarge-f16.gguf
- name: edgetam
  url: github:mudler/LocalAI/gallery/virtual.yaml@master
  urls: