LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-30 19:37:00 -04:00

Files

pos-ei-don 6ab29ec8b9 fix(sglang): parse tool_call function arguments before applying the chat template (#10558 )

OpenAI wire format carries `function.arguments` as a JSON-encoded string,
but chat templates (e.g. Qwen3-Coder) iterate over it as a mapping. The
vllm backend already parses arguments before applying the chat template
(PR #10256); this mirrors that fix in the sglang backend.

Without this fix the second turn of any tool-using session (assistant
returns tool_calls, user posts `role:"tool"` result, model is invoked
with arguments still as a string) crashes inside transformers' Jinja
chat-template rendering with:

  TypeError: Can only get item pairs from a mapping.
  File ".../transformers/utils/chat_template_utils.py", in render_jinja_template
  File ".../jinja2/filters.py", in do_items
      raise TypeError("Can only get item pairs from a mapping.")

Reproduced on `lmsysorg/sglang:v0.5.14` via LocalAI v4.5.4 with
`saricles/Qwen3-Coder-Next-NVFP4-GB10` (W4A4 NVFP4 / compressed-tensors)
on NVIDIA DGX Spark (GB10, sm_121).

After the patch, a tool-call roundtrip (assistant tool_calls -> tool
result -> assistant final answer) returns http=200 with the expected
follow-up content; no behaviour change on requests that don't carry
tool_calls.

Signed-off-by: Poseidon <philipp.wacker@ibf-solutions.com>
Co-authored-by: Poseidon <philipp.wacker@ibf-solutions.com>

2026-06-30 09:00:51 +02:00

backend.py

fix(sglang): parse tool_call function arguments before applying the chat template (#10558 )

2026-06-30 09:00:51 +02:00

install.sh

fix(backends): repair release CI build/test breaks (kokoros, fish-speech, llama-cpp-quantization, sglang) (#10547 )

2026-06-27 09:42:22 +02:00

Makefile

feat(sglang): wire engine_args, add cuda13 build, ship MTP gallery demos (#9686 )

2026-05-07 17:27:29 +02:00

package.sh

feat(backends): add sglang (#9359 )