mirror of
https://github.com/mudler/LocalAI.git
synced 2026-06-23 16:19:07 -04:00
Every edit mapped (gather-index graph input mirroring k_idxs; gather K/V/mask by one aligned index; n_kv compaction; gated so stock stays byte-identical), with the token-identical gate and the known risks (mask transpose layout, v_trans).

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
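
The commit message describes gathering K, V and the mask by one aligned index so that n_kv shrinks, with a gate that leaves the stock path untouched. A minimal sketch of that idea in plain Python follows. It is not the code touched by this commit: `attention`, `gather`, `compacted_attention` and `gather_idx` are hypothetical names chosen for illustration, the tensors are plain lists, and the known risks named above (mask transpose layout, v_trans) are not modeled.

```python
import math

def attention(q, K, V, mask):
    """Single-query softmax attention over n_kv cells.

    mask holds one additive value per cell: 0.0 (visible) or -inf (hidden).
    """
    scores = [sum(qi * ki for qi, ki in zip(q, k)) + m for k, m in zip(K, mask)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(V[0])
    return [sum(w * v[d] for w, v in zip(weights, V)) / total for d in range(dim)]

def gather(rows, idx):
    """Select entries of rows in the order given by idx."""
    return [rows[i] for i in idx]

def compacted_attention(q, K, V, mask, gather_idx=None):
    """Attention with optional compaction (hypothetical helper).

    Gate: with no gather index, the stock path runs unchanged.
    Otherwise K, V and the mask are gathered by the SAME index, so the
    three stay aligned and n_kv becomes len(gather_idx).
    """
    if gather_idx is None:
        return attention(q, K, V, mask)
    return attention(
        q,
        gather(K, gather_idx),
        gather(V, gather_idx),
        gather(mask, gather_idx),
    )

# Toy data: four cache cells, two of them hidden by the mask.
NEG_INF = float("-inf")
q = [1.0, 0.5]
K = [[0.2, 0.1], [0.9, -0.3], [0.4, 0.4], [-0.5, 0.8]]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [2.0, -1.0]]
mask = [0.0, NEG_INF, 0.0, NEG_INF]

# Assumption for this sketch: the gather index lists exactly the visible cells.
gather_idx = [i for i, m in enumerate(mask) if m == 0.0]

full = compacted_attention(q, K, V, mask)                  # stock path, n_kv = 4
compact = compacted_attention(q, K, V, mask, gather_idx)   # compacted, n_kv = 2
```

Comparing `full` and `compact` is a small stand-in for the token-identical gate: hidden cells contribute zero weight, so dropping them should not change the result, provided K, V and the mask are gathered by the same index.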