From 62c407ed553ab4c213b7a885e2a5bde84d6856b5 Mon Sep 17 00:00:00 2001
From: Ettore Di Giacinto
Date: Fri, 26 Jun 2026 21:41:45 +0000
Subject: [PATCH] docs(paged): lever1 gather-fusion bench landed - checkpoint + attribution (patch 0028)

Anchors the rigorous same-session A/B validation of patch 0028 (residual
conv-state tap k_get_rows fusion) on this worktree branch with sign-off
attribution. The regenerated 0028 patch + bench-updated
LEVER1_GATHER_RESULTS.md first landed via a concurrent origin/master merge
(c1f1d1e8e) that swept the staged files; this records the provenance and
the bench summary in the checkpoint.

Gate (bit-exact, greedy --temp 0 --seed 1 -n 48):
dense q36-27b-nvfp4 5951a5b4d624ce891e22ab5fca9bc439,
MoE q36-35b-a3b-nvfp4 07db32c2bcb78d17a43ed18bc22705cd
(both == baseline; base == lever1).
decode_agg npl128: dense 369.95 -> 377.83 t/s (+2.13%, 96.6% of vLLM),
MoE 763.47 -> 777.95 t/s (+1.90%, 86.3% of vLLM).
nsys MoE decode: k_get_rows_float 17334 -> 15414 inst (-1920),
358.37 -> 133.52 ms, step -3.13 ms.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto
---
 .../cpp/llama-cpp/patches/paged/LEVER1_GATHER_PROGRESS.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/backend/cpp/llama-cpp/patches/paged/LEVER1_GATHER_PROGRESS.md b/backend/cpp/llama-cpp/patches/paged/LEVER1_GATHER_PROGRESS.md
index e4d14b940..c95441932 100644
--- a/backend/cpp/llama-cpp/patches/paged/LEVER1_GATHER_PROGRESS.md
+++ b/backend/cpp/llama-cpp/patches/paged/LEVER1_GATHER_PROGRESS.md
@@ -24,3 +24,11 @@ update via ggml_ssm_conv_update_inplace_ids (src[4]=ids discriminator).
Mirrors - Patch: patches/paged/0028-qwen35-recurrent-state-gather-fusion.patch (LocalAI worktree) - Docs: LEVER1_GATHER_RESULTS.md (full bench tables) - DGX bench outs: ab_{dense,moe}_{base,lever1}.out, nab_{base,lever1}.kern.csv, md5{d,m}_{base,lever1}.txt
+
+## gather-bench landed (worktree)
+
+Rigorous same-session A/B (DGX GB10) validated patch 0028 bit-exact and lifting both models;
+results folded into LEVER1_GATHER_RESULTS.md and the regenerated 0028 patch. The bench files
+first landed in this worktree via concurrent merge c1f1d1e8e (origin/master sweep); this commit
+re-anchors them with sign-off attribution. DGX llama tree dedicated commit: fafe878 (code
+byte-identical to 944636c; docs-only amend). Both trees committed, not pushed.