mirror of
https://github.com/mudler/LocalAI.git
synced 2026-05-31 20:21:26 -04:00
The pre-existing buffered-handler tests only exercised
maybeSetNodeHeader against a pre-populated ModelLoader store. They did
nothing to verify that the streaming path attaches the header AFTER
ml.Load has stamped a node ID on the model, which is exactly the
ordering bug the streaming rendezvous chan fix addresses.
Add a streaming integration spec that:
- Builds a ModelLoader with a Model entry but NO node ID stamped on
it (so any pre-Load read returns empty).
- Installs a fake backend.ModelInferenceFunc that stamps the node ID
onto the Model AT THE MOMENT IT IS CALLED, matching production
timing where ModelRouterAdapter.Route does the stamp inside
ml.Load.
- Drives processStream with a per-request nodeIDCh and a handler-side
loop that mirrors chat.go's flush ordering (read response, apply
nodeIDCh header, write, flush).
- Asserts the recorded X-LocalAI-Node header equals the node ID the
fake backend stamped during the worker's ml.Load. With the pre-fix
code the header would be empty because the request goroutine read
the loader's node ID before the worker had stamped it.
Cover three additional scenarios:
- ExposeNodeHeader=false suppresses the header even after stamping
(opt-in is sacred).
- Two sequential requests each get THEIR OWN routing decision in the
header, not the prior request's; this is the direct regression
check for the original bug under load.
- The SSE body is still written so we don't regress streaming output
while attaching the header.
All four specs use Ginkgo; no stdlib testing patterns.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7[1m]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>