feat(dllm): backend packaging, gallery index, CI matrix

Registers the dllm backend across every surface: backend gallery index
(cpu amd64+arm64 with manifest merge, cuda13, l4t-cuda13 for GB10-class
hardware; no darwin per engine scope), top-level Makefile targets,
bump_deps pin tracking for DLLM_VERSION, and the curated known-backends
list for /backends/known (preference-only, because auto-detecting on .gguf
would shadow llama-cpp). Note: image builds and the nightly bump leg stay red
until github.com/mudler/dllm.cpp is published (planned at merge time).

Assisted-by: Claude Code (Fable 5)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto
2026-06-11 17:05:18 +00:00
parent 99184809fa
commit 52b3b68cea
8 changed files with 136 additions and 3 deletions

@@ -25,6 +25,10 @@ var knownPrefOnlyBackends = []schema.KnownBackend{
// Text LLM
// ds4: antirez/ds4 - single-model DeepSeek V4 Flash engine; auto-detected via DS4Importer
{Name: "ds4", Modality: "text", AutoDetect: false, Description: "antirez/ds4 DeepSeek V4 Flash engine (auto-detected; pref-only fallback)"},
// dllm consumes GGUF weights like llama-cpp does, but only for the
// DiffusionGemma architecture - auto-detecting on .gguf would shadow
// llama-cpp, so it stays preference-only.
{Name: "dllm", Modality: "text", AutoDetect: false, Description: "dllm.cpp DiffusionGemma block-diffusion engine (preference-only)"},
{Name: "sglang", Modality: "text", AutoDetect: false, Description: "SGLang runtime (preference-only)"},
{Name: "tinygrad", Modality: "text", AutoDetect: false, Description: "tinygrad runtime (preference-only)"},
{Name: "trl", Modality: "text", AutoDetect: false, Description: "Transformers Reinforcement Learning (preference-only)"},

@@ -135,6 +135,7 @@ var _ = Describe("Backend Endpoints", func() {
Expect(entry.Modality).To(Equal(modality))
}
expectPrefOnly("dllm", "text")
expectPrefOnly("sglang", "text")
expectPrefOnly("tinygrad", "text")
expectPrefOnly("trl", "text")
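
The reasoning in the dllm comment (a preference-only entry cannot shadow an auto-detecting backend that reads the same file type) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's resolution logic: `KnownBackend` here only mirrors the four fields visible in the diff, `resolveBackend` is a hypothetical helper, and marking `llama-cpp` as `AutoDetect: true` is assumed for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// KnownBackend mirrors only the fields visible in the diff above;
// the real schema.KnownBackend may carry more.
type KnownBackend struct {
	Name        string
	Modality    string
	AutoDetect  bool
	Description string
}

// resolveBackend is a hypothetical helper, not part of the codebase.
// An explicit preference wins if it names a known backend; otherwise
// only backends with AutoDetect set are considered for .gguf files.
func resolveBackend(known []KnownBackend, modelFile, preferred string) string {
	if preferred != "" {
		for _, b := range known {
			if b.Name == preferred {
				return b.Name
			}
		}
	}
	if strings.HasSuffix(modelFile, ".gguf") {
		for _, b := range known {
			if b.AutoDetect {
				return b.Name
			}
		}
	}
	return "" // nothing matched
}

func main() {
	known := []KnownBackend{
		// Assumption for illustration: llama-cpp auto-detects GGUF files.
		{Name: "llama-cpp", Modality: "text", AutoDetect: true},
		{Name: "dllm", Modality: "text", AutoDetect: false},
	}
	// Without a preference, the GGUF file goes to the auto-detecting backend.
	fmt.Println(resolveBackend(known, "model.gguf", ""))
	// dllm is reached only when it is requested explicitly.
	fmt.Println(resolveBackend(known, "model.gguf", "dllm"))
}
```

Under these assumptions, flipping `AutoDetect` to true for dllm would make the result depend on list order for every `.gguf` file, which is the shadowing the commit avoids.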