mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-30 19:47:47 -04:00
Introduce a per-request Echo middleware that wraps the response writer and lazily stamps X-LocalAI-Node on the first Write / WriteHeader / Flush. This replaces the chan-based per-request rendezvous and per-handler maybeSetNodeHeader calls with a single enforcement point. The wrapper reads the picked node ID by looking up the request's model in the ModelLoader at flush time (late binding), so the value reflects the post-ml.Load state of the loader rather than any pre-route guess. Off by default; gated by ApplicationConfig.ExposeNodeHeader. Ginkgo specs cover off/on, missing model, in-process model (no node ID), absent stash, buffered + streaming flush ordering, error path, and late binding under in-handler stamp. Signed-off-by: Ettore Di Giacinto <mudler@localai.io> Assisted-by: Claude:claude-opus-4-7[1m]