diff --git a/backend/cpp/llama-cpp-localai-paged/README.md b/backend/cpp/llama-cpp-localai-paged/README.md
index a536e041a..e06668d4e 100644
--- a/backend/cpp/llama-cpp-localai-paged/README.md
+++ b/backend/cpp/llama-cpp-localai-paged/README.md
@@ -164,7 +164,7 @@ swept over serving width `npl` in {8, 32, 64, 128}. Plots:
 [`qwen36_moe_decode_vs_npl.png`](docs/qwen36_moe_decode_vs_npl.png); raw data
 [`final_benchmark.csv`](docs/final_benchmark.csv).
 
-![NVFP4 decode throughput vs concurrency on GB10: llama.cpp standard vs vLLM vs LocalAI's llama.cpp patches](docs/qwen36_decode_overview.png)
+![NVFP4 decode throughput vs concurrency on GB10: llama.cpp standard vs vLLM vs LocalAI's llama.cpp patches, plus the opt-in bf16-tau ceiling](docs/qwen36_decode_overview.png)
 
 > **What was re-measured (2026-06-27).** The three llama columns - **stock**,
 > **patched**, and **patched+bf16-tau** - were all re-measured this session on one
diff --git a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_decode_overview.png b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_decode_overview.png
index 7a5f2e809..bec4bbd41 100644
Binary files a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_decode_overview.png and b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_decode_overview.png differ
diff --git a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_dense_decode_vs_npl.png b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_dense_decode_vs_npl.png
index 0f40032d6..1dd5cf000 100644
Binary files a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_dense_decode_vs_npl.png and b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_dense_decode_vs_npl.png differ
diff --git a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_moe_decode_vs_npl.png b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_moe_decode_vs_npl.png
index d06ca0759..680fd10db 100644
Binary files a/backend/cpp/llama-cpp-localai-paged/docs/qwen36_moe_decode_vs_npl.png and b/backend/cpp/llama-cpp-localai-paged/docs/qwen36_moe_decode_vs_npl.png differ