LocalAI/core/http at 9748a1cbc63178233fca8d170f424e0f38cb5dbf - LocalAI - Gitea: Git with a cup of tea

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-17 03:33:44 -04:00

Files

History

Ettore Di Giacinto 9748a1cbc6 fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299 )

When TASK_RESPONSE_TYPE_OAI_CHAT is used, the first streaming token
produces a JSON array with two elements: a role-init chunk and the
actual content chunk. The grpc-server loop called attach_chat_deltas
for both elements with the same raw_result pointer, stamping the first
token's ChatDelta.Content on both replies. The Go side accumulated both,
emitting the first content token twice to SSE clients.

Fix: in the array iteration loops in PredictStream, detect role-init
elements (delta has "role" key) and skip attach_chat_deltas for them.
Only content/reasoning elements get chat deltas attached.

Reasoning models are unaffected because their first token goes into
reasoning_content, not content.

2026-04-10 08:45:47 +02:00

..

fix(oauth/invite): do not register user (prending approval) without correct invite (#9189 )

2026-03-31 08:29:07 +02:00

fix(streaming): deduplicate tool call emissions during streaming (#9292 )

2026-04-10 00:44:25 +02:00

fix(token): login via legacy api keys (#9249 )

2026-04-06 21:45:09 +02:00

feat: track files being staged (#9275 )

2026-04-08 14:33:58 +02:00

feat(api): add ollama compatibility (#9284 )

2026-04-09 14:15:14 +02:00

feat(realtime): WebRTC support (#8790 )

2026-03-13 21:37:15 +01:00

feat(realtime): WebRTC support (#8790 )

2026-03-13 21:37:15 +01:00

app_test.go

fix(streaming): skip chat deltas for role-init elements to prevent first token duplication (#9299 )

2026-04-10 08:45:47 +02:00

app.go

feat(api): add ollama compatibility (#9284 )

2026-04-09 14:15:14 +02:00

explorer.go

chore(refactor): move logging to common package based on slog (#7668 )

2025-12-21 19:33:13 +01:00

http_suite_test.go

feat(api): add support for open responses specification (#8063 )

2026-01-17 22:11:47 +01:00

openresponses_test.go

feat: add distributed mode (#9124 )

2026-03-30 00:47:27 +02:00

render.go

feat: add distributed mode (#9124 )

2026-03-30 00:47:27 +02:00