mirror of
https://github.com/mudler/LocalAI.git
synced 2026-04-30 12:08:13 -04:00
Workers on NVIDIA unified-memory hardware (DGX Spark / GB10, Jetson AGX Thor, Jetson Orin/Xavier/Nano) were reporting `available_vram=0` back to the frontend, so the Nodes UI showed the node as fully used even when most of the unified memory was actually free. Three causes addressed: * `isTegraDevice` only matched `/sys/devices/soc0/family == "Tegra"`. DGX Spark (SBSA) reports JEDEC codes there instead — `jep106:0426` for the NVIDIA manufacturer — so the Tegra/unified-memory fallback never ran. Renamed to `isNVIDIAIntegratedGPU` and extended to also match `jep106:0426[:*]` via `/sys/devices/soc0/soc_id`. * The unified-iGPU code defaulted the device name to `"NVIDIA Jetson"` when `/proc/device-tree/model` was missing. That's what happens for Thor inside a docker container, and always on DGX Spark. New `nvidiaIntegratedGPUName` resolves via dt-model → `/sys/devices/soc0/machine` → `soc_id` lookup (`jep106:0426:8901` → `"NVIDIA GB10"`) so the Nodes UI labels the box correctly. * Worker heartbeat sent `available_vram=0` (or total-as-available) when VRAM usage was momentarily unknown — e.g. when `nvidia-smi` intermittently failed with `waitid: no child processes` under containers without `--init`. Each such heartbeat overwrote the DB and made the UI flip to "fully used". `heartbeatBody` now omits `available_vram` in that case so the DB keeps its last good value. Also updates the commented GPU blocks in both compose files with `NVIDIA_DRIVER_CAPABILITIES=compute,utility`, `capabilities: [gpu, utility]`, and `init: true`, and documents the requirement in the distributed-mode and nvidia-l4t pages. Without `utility`, NVML/`nvidia-smi` are absent inside the container, which is what put the DGX Spark worker into the buggy fallback in the first place. Detection verified on live hardware (dgx.casa / GB10 and 192.168.68.23 / Thor) by running a cross-compiled probe of the new helpers on both host and inside the worker container. Assisted-by: Claude:opus-4.7 [Claude Code]
6.4 KiB
6.4 KiB