LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-16 04:38:50 -04:00

Files

Ettore Di Giacinto 826d91ddf4 feat(llama-cpp): generic chat_template_kwargs merge (drop per-key blocks)

Replace the per-key enable_thinking/reasoning_effort handling in both the
streaming and non-streaming chat paths with a single block that parses the
chat_template_kwargs JSON blob resolved by the Go layer and merges every key
into body_json. New jinja template levers (e.g. preserve_thinking) now need
no C++ change. Issue #10329.

Assisted-by: Claude:claude-opus-4-8
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-16 08:03:50 +00:00
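
A minimal sketch of the merge described in the commit message above. The actual change lives in the C++ file grpc-server.cpp and is not shown on this page; this illustration uses Go only so that it is self-contained with the standard library. The function name mergeChatTemplateKwargs, the map-based body, and the sample keys in main are assumptions for illustration, not LocalAI's real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeChatTemplateKwargs is a hypothetical helper: it parses a JSON object
// (the chat_template_kwargs blob) and copies every key into body, replacing
// any per-key handling. Keys already present in body are overwritten.
func mergeChatTemplateKwargs(body map[string]any, kwargsJSON string) error {
	if kwargsJSON == "" {
		return nil // nothing was resolved, leave body untouched
	}
	var kwargs map[string]any
	if err := json.Unmarshal([]byte(kwargsJSON), &kwargs); err != nil {
		return fmt.Errorf("parsing chat_template_kwargs: %w", err)
	}
	for key, value := range kwargs {
		body[key] = value
	}
	return nil
}

func main() {
	// Assumed request body and kwargs blob, for illustration only.
	body := map[string]any{"model": "example"}
	blob := `{"enable_thinking": false, "preserve_thinking": true}`
	if err := mergeChatTemplateKwargs(body, blob); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

Because the loop is generic over keys, a new template lever such as preserve_thinking passes through without a dedicated branch, which is the point the commit message makes.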

CMakeLists.txt

fix(turboquant): resolve common.h by detecting llama-common vs common target (#9413)

2026-04-18 20:30:28 +02:00

grpc-server.cpp

feat(llama-cpp): generic chat_template_kwargs merge (drop per-key blocks)

2026-06-16 08:03:50 +00:00

Makefile

chore: ⬆️ Update ggml-org/llama.cpp to 7dad2f1a17d65b5e2034c277125bc9f97573a779 (#10337)

2026-06-16 08:22:26 +02:00

package.sh

fix(llama.cpp): bundle libdl, librt, libpthread in llama-cpp backend (#9099)