chore(dllm): review fixes - file modes and build-matrix doc accuracy

Drop the stray executable bit from the Go sources and Makefile (the
sibling Go backends commit them 644; only run.sh/package.sh are
executable), and correct two documentation claims found in the final
branch review: cuda13-dllm is built for amd64 only (arm64 CUDA ships as
the l4t flavor), and package.sh is the parakeet-cpp-style stub layout
with no ldd walk.

Assisted-by: Claude Code (Fable 5)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto
2026-06-11 17:17:54 +00:00
parent aba9c4794a
commit eb61e1d770
11 changed files with 7 additions and 6 deletions

@@ -112,13 +112,14 @@ carry that coverage.
 ## Build matrix
-`cpu-dllm` (amd64 + arm64), `cuda13-dllm` (amd64 + arm64), and
-`cuda13-nvidia-l4t-arm64-dllm` (Jetson / DGX Spark GB10), via
+`cpu-dllm` (amd64 + arm64), `cuda13-dllm` (amd64), and
+`cuda13-nvidia-l4t-arm64-dllm` (arm64 CUDA: Jetson / DGX Spark GB10), via
 `.github/backend-matrix.yml`. No darwin/Metal. CUDA builds forward
 `-DDLLM_CUDA=ON` (dllm.cpp gates ggml's CUDA behind its own flag - a bare
 `-DGGML_CUDA=ON` is overridden by the cache FORCE). `libdllm.so` is
-self-contained (ggml statically absorbed, PIC), so packaging only ships the
-one .so plus the usual ldd walk.
+self-contained (ggml statically absorbed, PIC), so `package.sh` only ships
+the binary, `run.sh` and that one .so (the parakeet-cpp-style stub layout;
+no ldd walk yet).
 ## Known limitations

backend/go/dllm/Makefile Executable file → Normal file
backend/go/dllm/capi.go Executable file → Normal file
backend/go/dllm/dllm.go Executable file → Normal file
backend/go/dllm/dllm_test.go Executable file → Normal file
backend/go/dllm/gemma4_parser.go Executable file → Normal file
backend/go/dllm/gemma4_parser_test.go Executable file → Normal file
backend/go/dllm/gemma4_renderer.go Executable file → Normal file
backend/go/dllm/gemma4_renderer_test.go Executable file → Normal file
backend/go/dllm/main.go Executable file → Normal file
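
The mode cleanup listed above could be reproduced locally with something like the following. This is a minimal sketch, not the command the author ran: the function name `normalize_modes` is hypothetical, and the only facts taken from the commit are that the Go sources and Makefile should end up as 644 while `run.sh` and `package.sh` keep their executable bit.

```shell
# Hypothetical helper (not part of the repository): clear the executable
# bit on every regular file directly inside a backend directory, leaving
# the two launcher scripts named in the commit message untouched.
normalize_modes() {
    dir="$1"
    for f in "$dir"/*; do
        # Skip subdirectories and anything that is not a regular file.
        [ -f "$f" ] || continue
        case "$(basename "$f")" in
            run.sh|package.sh) ;;          # meant to stay executable
            *) chmod 644 "$f" ;;           # sources, Makefile, etc.
        esac
    done
}
```

A call such as `normalize_modes backend/go/dllm` would cover the nine files in this commit. On a filesystem where Git tracks file modes, the change should then show up in `git diff` as a mode-only change (100755 to 100644) with no content lines.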

@@ -676,8 +676,8 @@ This backend is **experimental**, and the engine does not yet have a prompt-KV p
 | Flavor | Hardware |
 |---|---|
 | `cpu-dllm` | CPU (amd64 + arm64) - functional but very slow on the 26B model; mainly useful for wiring tests |
-| `cuda13-dllm` | NVIDIA CUDA 13 (amd64 + arm64) |
-| `cuda13-nvidia-l4t-arm64-dllm` | NVIDIA L4T (Jetson / DGX Spark GB10) |
+| `cuda13-dllm` | NVIDIA CUDA 13 (amd64) |
+| `cuda13-nvidia-l4t-arm64-dllm` | NVIDIA L4T arm64 (Jetson / DGX Spark GB10) |
 macOS/Metal is not available yet.