LocalAI/core/backend at d0e6bf3aa7ff61d00eaf8f9997b04c7ccc50eaba

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-03 04:46:54 -04:00

Ettore Di Giacinto d0e6bf3aa7 fix(backend): don't let a client disconnect cancel the model load

Image generation (and the tts/transcript/embeddings/vad/rerank/llm helpers)
pass the request context to loader.Load so distributed routing decisions
reach the request's X-LocalAI-Node holder. That context also governs
cancellation of the load, so when a client disconnects mid-load the
LoadModel RPC is aborted, stopLoadProcess tears down the backend process,
and every retry restarts from scratch. Heavy diffusers/LLM models on a slow
host (e.g. a shared-memory iGPU) take long enough to load that the request
routinely ends first, so the model never finishes loading and the UI shows
"NetworkError when attempting to fetch resource".

Wrap the load context with context.WithoutCancel: the routing holder value
still propagates, but the request's cancellation no longer aborts the load,
so it runs to completion and caches for the next request. Inference keeps the
cancellable request context, so a disconnect still stops generation.

Adds a regression spec asserting a canceled request context does not cancel
the model load while the routing holder still reaches the router.

Fixes #10636

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

2026-07-02 20:52:51 +00:00

audio_transform.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

backend_suite_test.go

feat: extract output with regexes from LLMs (#3491)

2024-09-13 13:27:36 +02:00

ctx_propagation_test.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

depth.go

feat(backend): add depth-anything (Depth Anything 3) C++/ggml backend + gallery (#10352)

2026-06-16 16:28:28 +02:00

detection.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

diarization_test.go

feat(api): add /v1/audio/diarization endpoint with sherpa-onnx + vibevoice.cpp (#9654)

2026-05-05 15:10:13 +02:00

diarization.go

feat(whisper): honor client cancellation via ggml abort_callback (#9710)

2026-05-08 01:44:47 +02:00

embeddings.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

face_analyze.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

face_embed.go

feat(whisper): honor client cancellation via ggml abort_callback (#9710)

2026-05-08 01:44:47 +02:00

face_verify.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

image.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

llm_probe_test.go

Respect explicit reasoning config during GGUF thinking probe (#9463)

2026-04-21 21:53:10 +02:00

llm_test.go

feat(autoparser): prefer chat deltas from backends when emitted (#9224)

2026-04-04 12:12:08 +02:00

llm.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

model_load_trace_test.go

feat(realtime): Semantic VAD EOU token (#10444)

2026-06-30 09:01:22 +02:00

options_internal_test.go

feat: generic chat_template_kwargs (model config + per-request metadata) (#10359)

2026-06-16 12:16:34 +02:00

options.go

feat(realtime): Semantic VAD EOU token (#10444)

2026-06-30 09:01:22 +02:00

prefix_source_internal_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

prefix_source.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

rerank.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

score_test.go

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

score.go

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

sound_classification.go

feat(ced): sound-event classification backend (CED audio tagger) (#10425)

2026-06-22 01:00:28 +02:00

soundgeneration.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

stores_test.go

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

stores.go

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

token_classify_test.go

feat(pii): NER tier engine — privacy-filter.cpp backend + NER-centric PII filter (#10360)

2026-06-18 11:45:22 +01:00

token_classify.go

feat(pii): NER tier engine — privacy-filter.cpp backend + NER-centric PII filter (#10360)

2026-06-18 11:45:22 +01:00

token_metrics.go

feat(whisper): honor client cancellation via ggml abort_callback (#9710)

2026-05-08 01:44:47 +02:00

tokenize_test.go

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

tokenize.go

fix(router): production-ready request router + auto-size batch for embedding/rerank (#10104)

2026-06-12 16:21:15 +02:00

transcript_live_internal_test.go

feat(realtime): Semantic VAD EOU token (#10444)

2026-06-30 09:01:22 +02:00

transcript_live.go

feat(realtime): Semantic VAD EOU token (#10444)

2026-06-30 09:01:22 +02:00

transcript.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

tts_test.go

feat(tts): support per-request instructions and params (#10172)

2026-06-04 11:45:02 +02:00

tts.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

vad.go

fix(backend): don't let a client disconnect cancel the model load

2026-07-02 20:52:51 +00:00

video.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

voice_analyze.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

voice_embed.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00

voice_verify.go

fix(traces): cap backend trace Data to keep admin UI responsive (#9960)

2026-05-23 14:50:40 +02:00