LocalAI/backend/python/common/python_utils.py at 4bf73a7e22a5ff5f72a2bbdb9fd7658fb60609e2

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-03 12:57:02 -04:00

Files

Ettore Di Giacinto 4bf73a7e22 fix(python-backends): parse tool-call arguments for chat templates and split implicit reasoning blocks

Two bugs broke OpenAI-style tool calling on the MLX backend (and any
Python backend sharing backend/python/common), reproduced end-to-end on
LocalAI v4.5.5 with the metal-mlx backend and
mlx-community/Qwen3.5-2B-MLX-8bit.

messages_to_dicts left each tool call's function.arguments as the raw
OpenAI-wire JSON string. HuggingFace chat templates (e.g. Qwen3.5)
iterate arguments as a mapping (.items()), so any request whose history
contained a prior assistant tool_calls message failed with HTTP 500
"Generation failed: Can only get item pairs from a mapping." — breaking
every agent loop on its second turn. Decode the string back into a dict
so the template sees a mapping.

split_reasoning returned ("", text) whenever the opening think tag was
absent. Models like Qwen3.5 open the assistant turn already inside
thinking, so the generated text carries only the closing </think>; the
whole chain-of-thought leaked into content. When the opener is missing
but the closer is present, treat everything before the closer as
reasoning.

Adds platform-independent unit tests under backend/python/common
(stdlib-only, no MLX/venv required, following parent_watch_test.py).

Assisted-by: Claude Code:claude-opus-4-8

2026-07-03 08:06:59 +00:00

2.9 KiB

Raw Blame History

View Raw

2.9 KiB Raw Blame History

2.9 KiB

Raw Blame History