* fix(grammars): honor properties_order entry at index 0
The JSON-schema-to-GBNF property sort used `aOrder != 0 && bOrder != 0` as
its "is this key ordered?" guard. That treats index 0 — the first key listed
in properties_order — as unset, so `properties_order: name,arguments` fell
back to alphabetical ordering and still emitted "arguments" before "name".
Use presence in the order map instead: listed keys sort by their index and
ahead of unlisted keys, which keep a stable alphabetical order. This makes
the documented `properties_order: name,arguments` actually produce
name-first tool-call JSON. Relates to #10052.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
* fix(functions): defer tool grammar to the backend when the tokenizer template owns templating (#10052)
When use_tokenizer_template delegates templating to the backend (llama.cpp),
the backend also owns tool-call grammar generation and parsing. LocalAI was
still generating its own GBNF grammar and sending it down. With a grammar
present, llama.cpp does not hand the tools to its template, so its native
peg/json tool parser never engages: it streams the grammar-constrained
tool-call JSON back as plain content instead of emitting tool_calls. In
streaming mode the JSON object leaked into the content field, and the
Go-side incremental detector never gated content because the
LocalAI-generated grammar emitted "arguments" before "name".
The GGUF auto-import path already couples use_tokenizer_template with
grammar.disable, but that block is skipped when a template is already
configured, so gallery and hand-written configs (e.g. qwen3) that set the
tokenizer template directly never got the paired grammar.disable.
- SetDefaults now enforces the coupling for every config: when
use_tokenizer_template is set, grammar generation is disabled and tools
flow to the backend's native (name-first) pipeline. This also fixes
already-installed models without editing each config.
- Set function.grammar.disable in the shared gallery/qwen3.yaml, which is
the base config referenced by every qwen3 gallery entry.
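
The coupling enforced by SetDefaults can be sketched roughly as follows. The types and the `applyTokenizerTemplateDefaults` function are simplified, hypothetical stand-ins for the real config structures, which carry many more fields and may be named differently:

```go
package main

import "fmt"

// Hypothetical, trimmed-down config types used only for this sketch.
type GrammarConfig struct {
	Disable bool
}

type FunctionsConfig struct {
	Grammar GrammarConfig
}

type TemplateConfig struct {
	UseTokenizerTemplate bool
}

type ModelConfig struct {
	Template  TemplateConfig
	Functions FunctionsConfig
}

// applyTokenizerTemplateDefaults is an assumed name for the defaulting step.
// When the backend owns templating, it also owns tool-call grammar and
// parsing, so local grammar generation is switched off.
func applyTokenizerTemplateDefaults(c *ModelConfig) {
	if c.Template.UseTokenizerTemplate {
		c.Functions.Grammar.Disable = true
	}
}

func main() {
	cfg := &ModelConfig{Template: TemplateConfig{UseTokenizerTemplate: true}}
	applyTokenizerTemplateDefaults(cfg)
	fmt.Println(cfg.Functions.Grammar.Disable) // prints: true
}
```

Applying the rule at defaulting time, rather than only on the GGUF auto-import path, is what lets it cover gallery and hand-written configs as well.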
Verified end to end against qwen3-4b with stream:true + tools: content no
longer carries the tool-call JSON, reasoning is classified separately, and
tool calls stream as proper name-first tool_calls deltas.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
* feat: add distributed mode (experimental)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix data races, mutexes, transactions
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix events and tool stream in agent chat
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* use ginkgo
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(cron): compute time boundaries correctly to avoid re-triggering
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not flood with health checks
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* do not list obvious backends as text backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* tests fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* refactoring and consolidation
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop redundant healthcheck
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* enhancements, refactorings
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* wip
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* get rid of panics
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* expose it properly from the config
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Simplify
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* forgot to commit
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Remove focus on test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>