chore(paged): decouple paged llama.cpp pin from the nightly auto-bumper

The llama-cpp-localai-paged backend reused backend/cpp/llama-cpp's
LLAMA_VERSION, which .github/workflows/bump_deps.yaml auto-bumps nightly
to the latest ggml-org/llama.cpp master tip. The stock backend is
patch-free so that bump is safe, but the paged backend applies a vendored
patch series (backend/cpp/llama-cpp/patches/paged/) hand-verified
bit-exact against ONE specific tip. A naive bump moves the tip out from
under the patches and breaks 'git apply' at build time - a dep-bump PR
would go red (or, worse, the break surfaces later in a release build).

Mirror the turboquant precedent: give the paged wrapper its OWN
LLAMA_VERSION pin (the verified 9d5d882d) and force it into every copied
build via LLAMA_VERSION=$(LLAMA_VERSION), so the nightly stock bump no
longer drags the paged build to an unverified tip. Unlike turboquant
(whose fork branch carries the patches and is safe to auto-bump), the
paged series is vendored, so it gets NO bump_deps.yaml entry: it is
advanced only by the manual PIN_SYNC process. Add cross-referencing
comments in both Makefiles and bump_deps.yaml.

Also add PIN_BUMP_APPLY_CHECK.md: an apply-feasibility report for the
latest tip (c299a92c, 23 commits ahead). The full series applies CLEAN
under 'git apply' with only benign line offsets and zero conflicts; the
lone failure (0019) is a pre-existing stray dev-doc hunk, identical on
the current pin, not a bump regression.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-27 18:06:58 -04:00 · 2026-06-27 08:02:37 +00:00
parent 400930db19
commit e160041f05
4 changed files with 150 additions and 14 deletions
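
The pin decoupling described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the actual Makefile from this commit: only LLAMA_VERSION and the 9d5d882d pin come from the message, while STOCK_DIR, BUILD_DIR, and the grpc-server target are hypothetical names assumed for the example. It relies on the GNU Make rule that a variable assigned on the command line overrides an ordinary assignment of the same variable inside the invoked Makefile.

```makefile
# Sketch only: everything except LLAMA_VERSION and the 9d5d882d pin is a
# hypothetical name chosen for illustration.

# The paged wrapper's own pin, advanced only by the manual PIN_SYNC process.
# (Short hash as quoted in the commit message.)
LLAMA_VERSION?=9d5d882d

# Hypothetical locations of the stock backend and of one copied build tree.
STOCK_DIR=../llama-cpp
BUILD_DIR=build-paged

# Copy the stock backend, then build the copy with the paged pin passed on
# the command line. The command-line assignment takes precedence over the
# LLAMA_VERSION set inside the copied Makefile, so a nightly bump of the
# stock pin does not reach this build.
$(BUILD_DIR):
	cp -a $(STOCK_DIR) $(BUILD_DIR)
	$(MAKE) -C $(BUILD_DIR) LLAMA_VERSION=$(LLAMA_VERSION) grpc-server
```

Passing the pin explicitly at every sub-make invocation, rather than editing the copied Makefile, keeps the stock file byte-identical to its source, so the only place the paged pin lives is the wrapper itself.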
--- a/.github/workflows/bump_deps.yaml
+++ b/.github/workflows/bump_deps.yaml
@@ -9,6 +9,15 @@ jobs:
    strategy:
      fail-fast: false
      matrix:
+        # NOTE: there is intentionally NO entry for the llama-cpp-localai-paged
+        # backend. It carries a vendored paged-attention patch series
+        # (backend/cpp/llama-cpp/patches/paged/) hand-verified bit-exact against
+        # ONE specific llama.cpp tip; a naive nightly bump would move the tip out
+        # from under the patches and break `git apply` at build time. Its pin is
+        # therefore decoupled (its own LLAMA_VERSION in
+        # backend/cpp/llama-cpp-localai-paged/Makefile) and advanced ONLY by the
+        # manual PIN_SYNC process. Do not add it here. (turboquant CAN be
+        # auto-bumped below because its fork branch carries the patches.)
        include:
          - repository: "ggml-org/llama.cpp"
            variable: "LLAMA_VERSION"