Merge origin/master + pin-sync paged backend to 0ed235ea

master auto-bumped the stock llama-cpp pin 9d5d882d -> 0ed235ea and updated the shared grpc-server.cpp. Because grpc-server.cpp is shared, the paged backend's pin must track the stock pin, so bump its LLAMA_VERSION to match.

All 28 paged patches apply cleanly on 0ed235ea (verified against a fresh upstream clone). The bf16-tau state-serialization fix (patch 0026) is included. The bit-exact gate and the full grpc-server build verification on GPU/CI are to follow.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-29 10:56:23 -04:00 · 2026-06-28 07:56:47 +00:00
parent 1f3e5ba301 de2ec2f136
commit ea72a56e2c
95 changed files with 6339 additions and 487 deletions
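The pin-sync rule stated in the commit message (the paged backend's LLAMA_VERSION must equal the stock pin) could be checked mechanically before the heavier gates run. The following is a minimal sketch, not part of this commit: the helper names `pin_of` and `pins_match` are hypothetical, and it assumes each Makefile carries a single line of the form `LLAMA_VERSION?=<sha>`, as the paged Makefile in the diff below does.

```shell
#!/bin/sh
# Sketch only: pin_of and pins_match are hypothetical helpers, not repo scripts.
# Assumption: each Makefile has one line of the form LLAMA_VERSION?=<sha>.

pin_of() {
    # Print the value assigned to LLAMA_VERSION in the Makefile given as $1.
    # In a POSIX basic regex the '?' is a literal character.
    sed -n 's/^LLAMA_VERSION?=\(.*\)$/\1/p' "$1" | head -n 1
}

pins_match() {
    # Succeed (exit 0) only if both Makefiles pin the same non-empty commit.
    a=$(pin_of "$1")
    b=$(pin_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

A caller might run `pins_match <stock Makefile> backend/cpp/llama-cpp-localai-paged/Makefile || echo "pin drift" >&2`, where the stock backend's Makefile path is left as a placeholder because it does not appear in this diff. Matching pins are only a precondition: per the Makefile comment, a PIN_SYNC still has to pass the full grpc-server build/link and the bit-exact gate.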
--- a/backend/cpp/llama-cpp-localai-paged/Makefile
+++ b/backend/cpp/llama-cpp-localai-paged/Makefile
@@ -49,7 +49,7 @@
 # helpers that the refactor pulled into the headers grpc-server.cpp includes.
 # Therefore a PIN_SYNC must pass the FULL grpc-server build/link on CI, not only
 # the bit-exact gate. See README section 7 + .agents/llama-cpp-localai-paged-backend.md.
-LLAMA_VERSION?=9d5d882d8cd0f0a9283d87ed5e6fe3ee0d925fb1
+LLAMA_VERSION?=0ed235ea2c17a19fc8238668653946721ed136fd

 CMAKE_ARGS?=
 BUILD_TYPE?=