LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-29 11:07:18 -04:00

Files

Ettore Di Giacinto 9787bee48b fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles

Two more hipblas-only build failures in buun's fattn.cu, fixed under the
same patches/ infrastructure:

1. cudaMemcpyToSymbol / cudaMemcpyFromSymbol — buun's Q² calibration +
   TCQ codebook upload paths call the symbol variants of cudaMemcpy.
   ggml/src/ggml-cuda/vendors/hip.h aliases every other cudaMemcpy*
   name (cudaMemcpy, cudaMemcpyAsync, cudaMemcpy2DAsync, …) but the
   symbol pair was never added. The result is 15+ "use of undeclared
   identifier" errors at fattn.cu lines 40, 54, 74-76, 94, 100-101, 371,
   883, 905, 954, 976, 1449, 1463. Add the two missing aliases alongside
   the existing memcpy block.

2. __shfl_xor_sync fwht128 calls — same 3-arg omission pattern as the
   earlier argmax top-K fix. Lines 512 (ggml_cuda_fwht128 intra-warp
   butterfly) and 536 (fwht128_store_half neighbor fetch) drop the
   width argument that hip.h:33 requires. Pass WARP_SIZE as the width.
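
The width fix in item 2 can be illustrated with a minimal host-side C++ sketch. This is not the patch itself: shfl_xor_model, SHFL_XOR_SYNC_MODEL and fwht32_model are hypothetical names, the 32-lane WARP_SIZE is an assumption, and the 32-point butterfly only stands in for the fwht128 code, which is not shown here. The point is that a four-parameter function-like macro rejects a three-argument call, so each call site has to pass the width explicitly.

```cpp
#include <array>

constexpr int WARP_SIZE = 32;  // assumption: one modeled warp has 32 lanes

using Warp = std::array<float, WARP_SIZE>;

// Simplified host model of an XOR shuffle with an explicit width: each lane
// reads the value held by lane (lane ^ lane_mask). A lane whose partner
// falls outside [0, width) keeps its own value.
inline Warp shfl_xor_model(const Warp& v, int lane_mask, int width) {
    Warp out{};
    for (int lane = 0; lane < WARP_SIZE; ++lane) {
        const int partner = lane ^ lane_mask;
        out[lane] = (partner < width) ? v[partner] : v[lane];
    }
    return out;
}

// Hypothetical four-parameter macro, shaped like the shim the commit message
// describes: calling it with only (mask, var, lane_mask) does not compile.
#define SHFL_XOR_SYNC_MODEL(mask, var, lane_mask, width) \
    shfl_xor_model((var), (lane_mask), (width))

// Unnormalized 32-point Walsh-Hadamard butterfly across one modeled warp.
inline Warp fwht32_model(Warp v) {
    for (int h = 1; h < WARP_SIZE; h <<= 1) {
        // Width passed explicitly as the fourth argument.
        const Warp other = SHFL_XOR_SYNC_MODEL(0xffffffffu, v, h, WARP_SIZE);
        for (int lane = 0; lane < WARP_SIZE; ++lane) {
            // Lower lane of each pair takes the sum, upper lane the difference.
            v[lane] = (lane & h) ? other[lane] - v[lane] : v[lane] + other[lane];
        }
    }
    return v;
}
```

In this model, a warp holding all ones transforms to 32 in lane 0 and 0 elsewhere, which is a quick sanity check that the butterfly pairs lanes as intended.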

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-04-24 20:09:36 +00:00

0001-fattn-atomicAdd-double-shim.patch

fix(buun-llama-cpp): shim atomicAdd(double*,double) for pre-sm_60 CUDA

2026-04-24 13:57:30 +00:00

0002-argmax-shfl-xor-sync-add-width.patch

fix(buun-llama-cpp): pass WARP_SIZE to argmax __shfl_xor_sync calls

2026-04-24 16:29:29 +00:00

0003-hip-add-memcpy-symbol-aliases.patch

fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles

2026-04-24 20:09:36 +00:00

0004-fattn-fwht128-shfl-xor-sync-add-width.patch

fix(buun-llama-cpp): shim cudaMemcpy{To,From}Symbol + WARP_SIZE on fwht128 shuffles

2026-04-24 20:09:36 +00:00