LocalAI/core at 5b24b4dacc2b7accdd7dc299bf3350ed298c35a8

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-20 13:13:59 -04:00

LocalAI [bot] 4a2cc64d07 feat(reasoning): honor per-request reasoning_effort on chat completions (#10082)

The OpenAI `reasoning_effort` field only reached the prompt template; it
never toggled the backend's thinking. Map it onto
ReasoningConfig.DisableReasoning (which becomes the enable_thinking gRPC
metadata) in the request merge, so reasoning_effort="none" disables
reasoning per request: the use case from #10072 (run a single Qwen3-style
model and turn reasoning off for low-latency tasks while keeping it on
for others).

Effort levels (minimal/low/medium/high) enable thinking unless the model
config explicitly disabled it (reasoning.disable: true wins and is never
re-enabled by a request); "none" always disables.
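
The precedence rules above can be sketched in Go as follows. This is a minimal illustration, not the code from the commit: `resolveDisableReasoning` and its parameters are hypothetical names, and the fallback for an empty or unrecognized effort value (keep the model config setting) is an assumption that the commit message does not state.

```go
package main

import "fmt"

// resolveDisableReasoning is a hypothetical helper that returns whether
// reasoning should be disabled for a single request.
//
// configDisabled mirrors an explicit `reasoning.disable: true` in the model
// config; effort is the per-request `reasoning_effort` value.
func resolveDisableReasoning(configDisabled bool, effort string) bool {
	switch effort {
	case "none":
		// "none" always disables reasoning for this request.
		return true
	case "minimal", "low", "medium", "high":
		// An effort level enables thinking, but an explicit disable in the
		// model config wins and is never re-enabled by a request.
		return configDisabled
	default:
		// Assumption: with no (or an unknown) effort value, the model
		// config setting is left unchanged.
		return configDisabled
	}
}

func main() {
	for _, effort := range []string{"", "none", "low", "high"} {
		fmt.Printf("effort=%q config-enabled model: disable=%v, config-disabled model: disable=%v\n",
			effort,
			resolveDisableReasoning(false, effort),
			resolveDisableReasoning(true, effort))
	}
}
```

In this sketch the effort levels and the default branch return the same value; they are kept as separate cases only to make the two rules from the message visible.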

Closes #10072


Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>

2026-05-29 22:09:07 +00:00

fix(application): stop backend processes synchronously on shutdown (#10058)

2026-05-29 11:40:43 +02:00

feat(reasoning): honor per-request reasoning_effort on chat completions (#10082)

2026-05-29 22:09:07 +00:00

fix(application): stop backend processes synchronously on shutdown (#10058)

2026-05-29 11:40:43 +02:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

fix: tool-call JSON leaks into content with stream+tools on tokenizer-template models (#10052) (#10057)

2026-05-29 10:12:53 +02:00

dependencies_manager

feat(ui): move to React for frontend (#8772)

2026-03-05 21:47:12 +01:00

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

feat(backend): rfdetr-cpp native object detection + segmentation backend (#10028)

2026-05-27 18:43:57 +02:00

feat(reasoning): honor per-request reasoning_effort on chat completions (#10082)

2026-05-29 22:09:07 +00:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

fix(openresponses): populate Content and accept bare {role,content} items (#10039) (#10040)

2026-05-28 07:21:48 +00:00

fix(distributed): sync gallery OpCache + caches across frontend replicas (#9983)

2026-05-25 17:28:14 +02:00

feat(gallery): verify backend OCI images with keyless cosign (#9823)

2026-05-18 08:02:20 +02:00

fix(openresponses): populate Content and accept bare {role,content} items (#10039) (#10040)

2026-05-28 07:21:48 +00:00

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00