fix(llama-cpp): terminate tensor/kv override vectors after passthrough

The tensor_buft_overrides padding and the kv/draft override terminators
ran before the generic option passthrough, so a passthrough flag
(--cpu-moe, --override-tensor, --override-kv, ...) appended a real entry
after the null sentinel, either tripping the model loader's
back().pattern == nullptr assertion (crash) or being silently dropped.
Move all three termination/padding blocks to the end of params_parse,
after both the named-option loop and common_params_parse have pushed
their real entries. Also widen the exit()-flag skip list so --version,
--license, --list-devices and --cache-list cannot terminate the backend.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Ettore Di Giacinto
2026-06-24 17:28:22 +00:00
parent 977ccd88f0
commit 28beac9a18
2 changed files with 36 additions and 23 deletions
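
The ordering problem described in the commit message can be illustrated with a minimal, self-contained sketch. It is not the backend's code: `override_entry`, `parse_terminate_first`, and `parse_terminate_last` are hypothetical names, and the only detail taken from the message is that a null `pattern` acts as the end-of-list sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an override entry; a null pattern marks the end.
struct override_entry {
    const char * pattern;
};

// Mirrors the kind of check the commit message mentions:
// the last element must be the null sentinel.
static bool is_terminated(const std::vector<override_entry> & v) {
    return !v.empty() && v.back().pattern == nullptr;
}

// Counts the entries a reader that stops at the first sentinel would see.
static std::size_t visible_entries(const std::vector<override_entry> & v) {
    std::size_t n = 0;
    while (n < v.size() && v[n].pattern != nullptr) {
        ++n;
    }
    return n;
}

// Old order: terminate first, then let the passthrough append.
static std::vector<override_entry> parse_terminate_first() {
    std::vector<override_entry> v;
    v.push_back({"from-named-option"});
    v.push_back({nullptr});              // sentinel added too early
    v.push_back({"from-passthrough"});   // lands after the sentinel
    return v;
}

// New order: every real entry is pushed before the sentinel.
static std::vector<override_entry> parse_terminate_last() {
    std::vector<override_entry> v;
    v.push_back({"from-named-option"});
    v.push_back({"from-passthrough"});
    v.push_back({nullptr});              // sentinel is the final element
    return v;
}
```

With the old order the vector no longer ends in the sentinel, and a reader that stops at the first null entry never sees the passthrough entry. With the new order both real entries are visible and the list is properly terminated.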

@@ -524,8 +524,9 @@ Notes:
- **Power-user territory:** an invalid flag or value is rejected by the upstream
parser exactly as it would be by `llama-server`, which can fail model loading.
Prefer the named options above when one exists.
-- `--help`, `--usage`, and `--completion*` are ignored (they would terminate the
-  backend process).
+- Flags that would terminate the process (such as `--help`, `--usage`,
+  `--version`, `--license`, `--list-devices`, `--cache-list`, and
+  `--completion*`) are ignored.
### Prompt Caching