fix: properly terminate llama.cpp kv_overrides array with empty key + updated doc (#6672)

* fix: properly terminate kv_overrides array with empty key

The llama model loading function expects KV overrides to be terminated with an empty key (key[0] == 0). Previously, the kv_overrides vector was not being properly terminated, causing an assertion failure.

This commit ensures that after parsing all KV override strings, we add a final terminating entry with an empty key to satisfy the C-style array termination requirement. This fixes the assertion error and allows the model to load correctly with custom KV overrides.

Fixes #6643

- Also included a reference to the usage of the `overrides` option in the advanced-usage section.

Signed-off-by: blob42 <contact@blob42.xyz>

* doc: document the `overrides` option

---------

Signed-off-by: blob42 <contact@blob42.xyz>
2026-07-17 19:53:38 -04:00 · 2025-10-23 09:31:55 +02:00
parent 24ce79a67c
commit 32c0ab3a7f
2 changed files with 14 additions and 0 deletions
--- a/backend/cpp/llama-cpp/grpc-server.cpp
+++ b/backend/cpp/llama-cpp/grpc-server.cpp
@@ -291,6 +291,11 @@ static void params_parse(server_context& ctx_server, const backend::ModelOptions
        }
    }

+    if (!params.kv_overrides.empty()) {
+        params.kv_overrides.emplace_back();
+        params.kv_overrides.back().key[0] = 0;
+    }
+
    // TODO: Add yarn

    if (!request->tensorsplit().empty()) {
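
Note on the pattern in the hunk above: the added lines append a sentinel entry so that a consumer receiving only a raw pointer can find the end of the array by looking for an empty key. The following is a minimal, self-contained sketch of that idea, not the llama.cpp code itself. The `kv_override` struct, its field sizes, and the helpers `add_override`, `terminate_overrides`, and `count_overrides` are hypothetical stand-ins introduced for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a KV override entry; only the key matters here.
struct kv_override {
    char      key[128];
    long long value;
};

// Append one override. Keys longer than the buffer are truncated.
void add_override(std::vector<kv_override>& overrides, const std::string& key, long long value) {
    kv_override kvo{};  // zero-initialized, so the key is always NUL-terminated
    std::strncpy(kvo.key, key.c_str(), sizeof(kvo.key) - 1);
    kvo.value = value;
    overrides.push_back(kvo);
}

// Same shape as the fix: add the terminating entry only when overrides exist.
void terminate_overrides(std::vector<kv_override>& overrides) {
    if (!overrides.empty()) {
        overrides.emplace_back();
        overrides.back().key[0] = 0;  // empty key marks the end of the array
    }
}

// Consumer side: walk a raw pointer until the empty-key sentinel is reached.
std::size_t count_overrides(const kv_override* first) {
    assert(first != nullptr);
    std::size_t n = 0;
    while (first[n].key[0] != 0) {
        ++n;
    }
    return n;
}
```

Without the sentinel, a loop like the one in `count_overrides` would read past the end of the vector's storage, which is why a consumer that relies on this convention may assert on the last entry's key before using the array.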