LocalAI/core/services/nodes at 4d14fe5bef3afdbd8a3bccb9fbc2c683cf9f5ffa - LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-22 07:39:02 -04:00

Ettore Di Giacinto 4d14fe5bef fix(distributed): detach cold-load staging from the request context

A model not yet loaded on a worker is staged lazily on the inference
request path. Staging a multi-GB model takes minutes - far longer than
any client keeps its HTTP request open - so a browser refresh, an
ingress/LB idle-timeout, or a round-robined retry landing on another
frontend replica cancels the request context and aborts the upload with
"context canceled" mid-transfer. Large models then never finish staging,
so they never load (observed in a 2-replica deployment: both frontends
repeatedly failed to stage a 15.7 GB GGUF, each attempt dying at a
different offset).

Bind the cold load (staging + LoadModel + the per-model advisory lock) to
context.WithoutCancel(ctx): it keeps the request's values (prefix chain)
but drops cancellation/deadline. Each long step keeps its own bound (the
file stager's resume budget, LoadModel's 5m timeout), and the advisory
lock still de-dupes concurrent loaders across replicas.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-8 [Claude Code]

2026-06-21 23:11:26 +00:00

..

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

distributed_store_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

distributed_store.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

file_stager_http.go

feat(distributed): resumable file uploads via HTTP Content-Range (#10109)

2026-05-31 11:02:20 +00:00

file_stager_s3.go

feat: track files being staged (#9275)

2026-04-08 14:33:58 +02:00

file_stager.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

file_staging_client.go

feat: wire transcription for llama.cpp, add streaming support (#9353)

2026-04-14 16:13:40 +02:00

file_transfer_server_test.go

feat(distributed): enforce registration token for worker file transfer (#10183)

2026-06-05 14:34:28 +02:00

file_transfer_server.go

feat(distributed): enforce registration token for worker file transfer (#10183)

2026-06-05 14:34:28 +02:00

health_mock_test.go

feat(backend): add depth-anything (Depth Anything 3) C++/ggml backend + gallery (#10352)

2026-06-16 16:28:28 +02:00

health_test.go

fix(distributed): cascade-clean stale node_models rows + filter routing by healthy status (#9754)

2026-05-13 21:57:50 +02:00

health.go

fix(distributed): cascade-clean stale node_models rows + filter routing by healthy status (#9754)

2026-05-13 21:57:50 +02:00

inflight_test.go

feat(backend): add depth-anything (Depth Anything 3) C++/ggml backend + gallery (#10352)

2026-06-16 16:28:28 +02:00

inflight.go

feat(backend): add depth-anything (Depth Anything 3) C++/ggml backend + gallery (#10352)

2026-06-16 16:28:28 +02:00

install_progress_publisher_test.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

install_progress_publisher.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

interfaces.go

fix(distributed): self-heal stale 'model not loaded' routing (#10181)

2026-06-05 09:01:36 +02:00

managers_distributed_test.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

managers_distributed.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

model_router_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

model_router.go

feat(distributed): gated X-LocalAI-Node response header (middleware + wrapper) (#9976)

2026-05-25 10:51:48 +02:00

nodes_suite_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

pending_op_cleanup_test.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

probe_cache_test.go

fix(distributed): route per request across loaded replicas + cache probeHealth (#9968)

2026-05-24 08:15:27 +00:00

probe_cache.go

fix(distributed): route per request across loaded replicas + cache probeHealth (#9968)

2026-05-24 08:15:27 +00:00

reconciler_test.go

feat(distributed): declarative per-model scheduling via env/args (#10308)

2026-06-13 18:31:06 +02:00

reconciler.go

feat(distributed): declarative per-model scheduling via env/args (#10308)

2026-06-13 18:31:06 +02:00

registry_test.go

feat(distributed): declarative per-model scheduling via env/args (#10308)

2026-06-13 18:31:06 +02:00

registry.go

feat(config): hardware-tuned defaults — Blackwell batch + VRAM-scaled concurrency (#10411)

2026-06-20 14:45:59 +02:00

replicapicker.go

refactor(routing): extract replica picker into pkg/clusterrouting (#10123)

2026-06-01 09:38:55 +02:00

router_dirstage_test.go

fix(distributed): stage directory-based models to remote nodes (#10175)

2026-06-04 18:05:38 +02:00

router_hardware_internal_test.go

feat(config): hardware-tuned defaults — Blackwell batch + VRAM-scaled concurrency (#10411)

2026-06-20 14:45:59 +02:00

router_optionstage_test.go

fix(distributed): stage backend companion assets to remote nodes (#10330)

2026-06-14 16:42:59 +02:00

router_staging_context_test.go

fix(distributed): detach cold-load staging from the request context

2026-06-21 23:11:26 +00:00

router_test.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

router.go

fix(distributed): detach cold-load staging from the request context

2026-06-21 23:11:26 +00:00

scheduling_seed_test.go

feat(distributed): declarative per-model scheduling via env/args (#10308)

2026-06-13 18:31:06 +02:00

scheduling_seed.go

feat(distributed): declarative per-model scheduling via env/args (#10308)

2026-06-13 18:31:06 +02:00

staging_keys_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

staging_keys.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

staging_progress.go

feat: track files being staged (#9275)

2026-04-08 14:33:58 +02:00

unloader_test.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

unloader_upgrade_test.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00

unloader.go

fix: distributed backend reinstall/upgrade UI stuck on 'reinstalling' (#10214)

2026-06-08 10:03:02 +02:00