mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-27 18:06:58 -04:00
The gated-DeltaNet + SSM_CONV ops have upstream Metal/Vulkan/SYCL kernels, so the Qwen3.6 hybrids run there (non-fused) - the earlier 'no Vulkan kernel' note was wrong. The patchset's fusions are gated off off-CUDA, so the backend ships CUDA-only; non-CUDA users use stock llama-cpp. Assisted-by: Claude:opus-4.8 [Claude Code] Signed-off-by: Ettore Di Giacinto <mudler@localai.io>