fix(llama-cpp): terminate tensor/kv override vectors after passthrough

The tensor_buft_overrides padding and the kv/draft override terminators ran before the generic option passthrough, so a passthrough flag (--cpu-moe, --override-tensor, --override-kv, ...) appended a real entry after the null sentinel - tripping the model loader's back().pattern == nullptr assertion (crash) or being silently dropped.

Move all three termination/padding blocks to the end of params_parse, after both the named-option loop and common_params_parse have pushed their real entries.

Also widen the exit()-flag skip list so --version, --license, --list-devices and --cache-list cannot terminate the backend.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-06-25 09:09:07 -04:00 · 2026-06-24 17:28:22 +00:00
parent 977ccd88f0
commit 28beac9a18
2 changed files with 36 additions and 23 deletions
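
The ordering problem described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the code from this commit: `Override`, `is_terminated`, `visible_entries`, `build_terminate_first` and `build_terminate_last` are hypothetical names invented for illustration, and the struct only imitates a null-pattern-terminated override list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an override entry; a null pattern marks the end
// of the list, as with the sentinel described in the commit message.
struct Override {
    const char *pattern;
    int value;
};

// Imitates the loader-side check quoted in the commit message:
// the last element must be the null sentinel.
bool is_terminated(const std::vector<Override> &v) {
    return !v.empty() && v.back().pattern == nullptr;
}

// Number of entries a consumer sees when it stops at the first sentinel.
std::size_t visible_entries(const std::vector<Override> &v) {
    std::size_t n = 0;
    while (n < v.size() && v[n].pattern != nullptr) {
        ++n;
    }
    return n;
}

// Old order (assumed shape): terminate first, then a passthrough flag
// appends a real entry after the sentinel.
std::vector<Override> build_terminate_first() {
    std::vector<Override> v;
    v.push_back({"named-option", 1});
    v.push_back({nullptr, 0});        // sentinel written too early
    v.push_back({"passthrough", 2});  // real entry lands after the sentinel
    return v;
}

// Fixed order: every real entry is pushed before the single sentinel.
std::vector<Override> build_terminate_last() {
    std::vector<Override> v;
    v.push_back({"named-option", 1});
    v.push_back({"passthrough", 2});
    v.push_back({nullptr, 0});        // sentinel is the last element
    assert(is_terminated(v));
    return v;
}
```

In this sketch, the terminate-first vector fails the `back().pattern == nullptr` style check, and a consumer that stops at the first sentinel never reaches the passthrough entry. These correspond to the crash and the silent drop the message mentions. The terminate-last vector passes the check and exposes both entries.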
--- a/docs/content/advanced/model-configuration.md
+++ b/docs/content/advanced/model-configuration.md
@@ -524,8 +524,9 @@ Notes:
 - **Power-user territory:** an invalid flag or value is rejected by the upstream
  parser exactly as it would be by `llama-server`, which can fail model loading.
  Prefer the named options above when one exists.
- `--help`, `--usage`, and `--completion*` are ignored (they would terminate the
-  backend process).
+- Flags that would terminate the process (such as `--help`, `--usage`,
+  `--version`, `--license`, `--list-devices`, `--cache-list`, and
+  `--completion*`) are ignored.

 ### Prompt Caching