LocalAI/core/services/nodes at 9f41e69bc3b1faa89f3202223fc0b599fa175fee

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-05 07:16:10 -04:00

Ettore Di Giacinto 9f41e69bc3 fix(distributed): self-heal stale 'model not loaded' routing

In distributed mode the registry can list a model as loaded on a node
while the worker has evicted it (autonomous LRU eviction, an out-of-band
unload, etc.) yet the backend process survives. The router's cached-node
check only verifies the process is alive (probeHealth), so it routes there
and inference fails with "<backend>: model not loaded" — and stays broken
until the controller restarts and rebuilds its registry.

InFlightTrackingClient now reconciles this: when a tracked inference call
returns a model-not-loaded error, it drops the stale replica row
(RemoveNodeModel) so the next request reloads the model on a healthy node
instead of routing back to the evicted one. The original error is returned
unchanged; only the registry is corrected.

Assisted-by: Claude:claude-opus-4-8 go vet
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

2026-06-04 23:00:50 +00:00
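
The self-healing step described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: only the names `InFlightTrackingClient` and `RemoveNodeModel` come from the commit message. The `Registry` type, the `Backend` interface, the `Predict` method, the `RemoveNodeModel` signature, and detecting the error by substring match are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Registry is a hypothetical in-memory stand-in for the controller's
// registry of which models are loaded on which nodes.
type Registry struct {
	loaded map[string]map[string]bool // nodeID -> model -> loaded
}

func NewRegistry() *Registry {
	return &Registry{loaded: map[string]map[string]bool{}}
}

func (r *Registry) AddNodeModel(nodeID, model string) {
	if r.loaded[nodeID] == nil {
		r.loaded[nodeID] = map[string]bool{}
	}
	r.loaded[nodeID][model] = true
}

// RemoveNodeModel drops the replica row for (nodeID, model).
// The name is taken from the commit message; the signature is assumed.
func (r *Registry) RemoveNodeModel(nodeID, model string) {
	delete(r.loaded[nodeID], model)
}

func (r *Registry) HasNodeModel(nodeID, model string) bool {
	return r.loaded[nodeID][model]
}

// Backend is a hypothetical, simplified inference client interface.
type Backend interface {
	Predict(prompt string) (string, error)
}

// InFlightTrackingClient wraps a Backend bound to one node and model.
type InFlightTrackingClient struct {
	inner  Backend
	reg    *Registry
	nodeID string
	model  string
}

// isModelNotLoaded matches the error text quoted in the commit message.
// Matching on the message string is an assumption of this sketch.
func isModelNotLoaded(err error) bool {
	return err != nil && strings.Contains(err.Error(), "model not loaded")
}

// Predict forwards the call. On a model-not-loaded error it removes the
// stale registry row, then returns the original error unchanged.
func (c *InFlightTrackingClient) Predict(prompt string) (string, error) {
	out, err := c.inner.Predict(prompt)
	if isModelNotLoaded(err) {
		c.reg.RemoveNodeModel(c.nodeID, c.model)
	}
	return out, err
}

// evictedBackend simulates a live backend process whose model was evicted.
type evictedBackend struct{}

func (evictedBackend) Predict(string) (string, error) {
	return "", errors.New("llama-cpp: model not loaded")
}

func main() {
	reg := NewRegistry()
	reg.AddNodeModel("node-1", "my-model")

	client := &InFlightTrackingClient{
		inner: evictedBackend{}, reg: reg, nodeID: "node-1", model: "my-model",
	}
	_, err := client.Predict("hello")

	fmt.Println(err)                                  // the caller still sees the original error
	fmt.Println(reg.HasNodeModel("node-1", "my-model")) // the stale row is gone
}
```

In this sketch the failing request still fails; only the registry is corrected, so a router consulting it on the next request would no longer treat `node-1` as holding the model and could load it elsewhere.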

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

distributed_store_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

distributed_store.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

file_stager_http.go

feat(distributed): resumable file uploads via HTTP Content-Range (#10109)

2026-05-31 11:02:20 +00:00

file_stager_s3.go

feat: track files being staged (#9275)

2026-04-08 14:33:58 +02:00

file_stager.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

file_staging_client.go

feat: wire transcription for llama.cpp, add streaming support (#9353)

2026-04-14 16:13:40 +02:00

file_transfer_server_test.go

feat(distributed): resumable file uploads via HTTP Content-Range (#10109)

2026-05-31 11:02:20 +00:00

file_transfer_server.go

feat(distributed): resumable file uploads via HTTP Content-Range (#10109)

2026-05-31 11:02:20 +00:00

health_mock_test.go

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

health_test.go

fix(distributed): cascade-clean stale node_models rows + filter routing by healthy status (#9754)

2026-05-13 21:57:50 +02:00

health.go

fix(distributed): cascade-clean stale node_models rows + filter routing by healthy status (#9754)

2026-05-13 21:57:50 +02:00

inflight_test.go

fix(distributed): self-heal stale 'model not loaded' routing

2026-06-04 23:00:50 +00:00

inflight.go

fix(distributed): self-heal stale 'model not loaded' routing

2026-06-04 23:00:50 +00:00

install_progress_publisher_test.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

install_progress_publisher.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

interfaces.go

fix(distributed): self-heal stale 'model not loaded' routing

2026-06-04 23:00:50 +00:00

managers_distributed_test.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

managers_distributed.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

model_router_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

model_router.go

feat(distributed): gated X-LocalAI-Node response header (middleware + wrapper) (#9976)

2026-05-25 10:51:48 +02:00

nodes_suite_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

probe_cache_test.go

fix(distributed): route per request across loaded replicas + cache probeHealth (#9968)

2026-05-24 08:15:27 +00:00

probe_cache.go

fix(distributed): route per request across loaded replicas + cache probeHealth (#9968)

2026-05-24 08:15:27 +00:00

reconciler_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

reconciler.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

registry_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

registry.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

replicapicker.go

refactor(routing): extract replica picker into pkg/clusterrouting (#10123)

2026-06-01 09:38:55 +02:00

router_dirstage_test.go

fix(distributed): stage directory-based models to remote nodes (#10175)

2026-06-04 18:05:38 +02:00

router_test.go

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

router.go

fix(distributed): stage directory-based models to remote nodes (#10175)

2026-06-04 18:05:38 +02:00

staging_keys_test.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

staging_keys.go

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

staging_progress.go

feat: track files being staged (#9275)

2026-04-08 14:33:58 +02:00

unloader_test.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

unloader_upgrade_test.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00

unloader.go

fix(distributed): make admin backend installs resilient and observable (#9958)

2026-05-23 12:35:44 +02:00