LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-12 18:58:49 -04:00

Files

Ettore Di Giacinto ad6d1dbc8b feat(grpc): request cancellation for Go backends via the Cancellable capability

The llama.cpp C++ backend aborts generation when its gRPC context is
cancelled (grpc-server.cpp polls context->IsCancelled() in the result
loops), but Go backends served by pkg/grpc never observed context
cancellation: a disconnected client left the generation running to
completion. Add an optional Cancellable capability; the server registers
context.AfterFunc on the request/stream context (after the Locking block
so queued requests cannot abort the current owner) covering both rich
and legacy paths. dllm implements it: measured cancel latency ~10ms vs
~10s of orphaned generation, and follow-up requests no longer queue
behind cancelled ones (~220ms vs ~9s in the e2e proof).

Assisted-by: Claude Code (Fable 5)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-11 17:50:04 +00:00

advanced

feat: forward reasoning_effort to the backend so jinja models honor it (#10184)

2026-06-05 13:45:43 +00:00

features

feat(grpc): request cancellation for Go backends via the Cancellable capability

2026-06-11 17:50:04 +00:00

getting-started

feat(cli): add interactive chat mode (#10226)

2026-06-09 14:58:44 +00:00

installation

feat(ui): Interactive model config editor with autocomplete (#9149)