The llama-cpp gRPC backend reconstructs OpenAI messages from proto for the
tokenizer-template path and blindly json::parse'd each message's content
string. LocalAI's Go layer always flattens content to a plain string, so a
user prompt that merely looks like JSON (e.g. mealie's ingredient array
["1/4 cup brown sugar", ...]) was reinterpreted as structured content parts and
rejected by oaicompat_chat_params_parse with "unsupported content[].type".
Normalize content per role instead: user/system/developer content is opaque
text and is never JSON-sniffed; assistant/tool content still collapses a literal
JSON null/object (tool-call bookkeeping) to a string, but a plain string is
never turned into an array/scalar. The array defense is role-independent, so the
role gate only governs the benign null/object case.
While here, extract the duplicated per-message reconstruction and the
pre-template content sanitization into shared, unit-tested helpers
(message_content.h) so the streaming (PredictStream) and non-streaming (Predict)
paths cannot drift. This removes ~490 lines of copy-pasted defensive code, the
dead tool-role parse branches, and the redundant Predict-only tool_calls branch,
while preserving the prior #7324 (null content -> "") and #7528 (tool array
content -> string) fixes.
Tests:
- backend/cpp/llama-cpp/message_content_test.cpp: standalone C++ unit tests for
all three helpers (#10524, #7324, #7528, multimodal), discovered and run by
`make test-backend-cpp` and a new generic tests-backend-cpp CI job. Also wired
as an opt-in CMake/ctest target (-DLLAMA_GRPC_BUILD_TESTS=ON).
- core/schema/message_test.go: Go regression pinning that ToProto flattens a
JSON-array-looking text part to the verbatim string.
- prepare.sh now copies message_content.h into the build tree.
Assisted-by: Claude:claude-opus-4-8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
* Build llama.cpp separately
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Start to try to attach some tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add git and small fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: correctly autoload external backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to run AIO tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Slightly update the Makefile helps
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Adapt auto-bumper
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to run linux test
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add llama-cpp into build pipelines
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add default capability (for cpu)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Drop llama-cpp specific logic from the backend loader
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* drop grpc install in ci for tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Pass by backends path for tests
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Build protogen at start
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix(tests): set backends path consistently
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Correctly configure the backends path
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to build for darwin
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* WIP
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Compile for metal on arm64/darwin
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to run build off from cross-arch
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add to the backend index nvidia-l4t and cpu's llama-cpp backends
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Build also darwin-x86 for llama-cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Disable arm64 builds temporary
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Test backend build on PR
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixup build backend reusable workflow
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* pass by skip drivers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Use crane
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Skip drivers
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* x86 darwin
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Add packaging step for llama.cpp
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fixups
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Fix leftover from bark-cpp extraction
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* Try to fix hipblas build
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>