LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-05-31 20:21:26 -04:00


Ettore Di Giacinto df8418cb2d test(distributed): cover streaming X-LocalAI-Node header end-to-end

The pre-existing buffered-handler tests only exercised
maybeSetNodeHeader against a pre-populated ModelLoader store. They never
verified that the streaming path attaches the header AFTER ml.Load has
stamped a node ID on the model, which is exactly the ordering bug the
streaming rendezvous channel fix addresses.

Add a streaming integration spec that:
  - Builds a ModelLoader with a Model entry but NO node ID stamped on
    it (so any pre-Load read returns empty).
  - Installs a fake backend.ModelInferenceFunc that stamps the node ID
    onto the Model AT THE MOMENT IT IS CALLED, matching production
    timing where ModelRouterAdapter.Route does the stamp inside
    ml.Load.
  - Drives processStream with a per-request nodeIDCh and a handler-side
    loop that mirrors chat.go's flush ordering (read response, apply
    nodeIDCh header, write, flush).
  - Asserts the recorded X-LocalAI-Node header equals the node ID the
    fake backend stamped during the worker's ml.Load. With the pre-fix
    code the header would be empty because the request goroutine read
    the loader's node ID before the worker had stamped it.
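
The ordering this spec checks can be sketched with the standard library
alone. This is a minimal illustration, not the code under test: inferFunc,
fakeBackend, streamOnce and runRequest are hypothetical names, and the real
processStream and chat.go signatures are not reproduced here. It assumes the
worker publishes the node ID on a buffered per-request channel before it
emits the first chunk, so the handler can set the header before the first
write.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// inferFunc is a hypothetical stand-in for the backend inference function.
type inferFunc func(nodeIDCh chan<- string, out chan<- string)

// fakeBackend publishes the node ID at the moment it is called, then
// emits two chunks. The node ID is sent before any chunk.
func fakeBackend(nodeID string) inferFunc {
	return func(nodeIDCh chan<- string, out chan<- string) {
		nodeIDCh <- nodeID
		out <- "hello"
		out <- "world"
	}
}

// streamOnce mirrors the flush ordering described above: read a response,
// apply the header from nodeIDCh, write, flush.
func streamOnce(w http.ResponseWriter, expose bool, infer inferFunc) {
	nodeIDCh := make(chan string, 1) // per-request; buffered so the worker never blocks
	out := make(chan string)
	go func() {
		defer close(out)
		infer(nodeIDCh, out)
	}()

	headerApplied := false
	for chunk := range out {
		if !headerApplied {
			// The node ID was sent before the first chunk, so it is
			// already buffered by the time we get here.
			select {
			case id := <-nodeIDCh:
				if expose && id != "" {
					w.Header().Set("X-LocalAI-Node", id)
				}
			default:
			}
			headerApplied = true
		}
		fmt.Fprintf(w, "data: %s\n\n", chunk)
		if f, ok := w.(http.Flusher); ok {
			f.Flush()
		}
	}
}

// runRequest drives one streamed request against a response recorder and
// returns the recorded header and the SSE body.
func runRequest(nodeID string, expose bool) (header, body string) {
	rec := httptest.NewRecorder()
	streamOnce(rec, expose, fakeBackend(nodeID))
	return rec.Result().Header.Get("X-LocalAI-Node"), rec.Body.String()
}

func main() {
	header, body := runRequest("node-a", true)
	fmt.Printf("%q %q\n", header, body)
}
```

The header has to be set before the first write because net/http commits
response headers on the first Write or Flush; anything set afterwards is not
sent to the client.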

Cover three additional scenarios:
  - ExposeNodeHeader=false suppresses the header even after stamping
    (opt-in is sacred).
  - Two sequential requests each get THEIR OWN routing decision in the
    header, not the prior request's; this is the direct regression
    check for the original bug under load.
  - The SSE body is still written so we don't regress streaming output
    while attaching the header.
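
The regression these scenarios guard against can be modelled in a few lines.
This is a deliberately simplified sketch under stated assumptions: loader,
headerPreFix and headerPostFix are hypothetical names, and the pre-fix
behaviour is reduced to a synchronous read of shared state taken before the
worker has stamped it. It is not the original code.

```go
package main

import "fmt"

// loader is a hypothetical stand-in for model state shared across requests.
type loader struct{ nodeID string }

// route simulates the worker-side load stamping the routing decision.
func (l *loader) route(nodeID string) { l.nodeID = nodeID }

// headerPreFix models the bug: the shared node ID is read BEFORE the
// worker stamps this request's decision, so the value is empty or stale.
func headerPreFix(l *loader, decision string) string {
	seen := l.nodeID // read happens too early
	l.route(decision)
	return seen
}

// headerPostFix models the fix: the worker hands this request's decision
// over on a per-request channel, and the opt-in flag gates exposure.
func headerPostFix(l *loader, decision string, expose bool) string {
	nodeIDCh := make(chan string, 1)
	go func() {
		l.route(decision)
		nodeIDCh <- decision
	}()
	id := <-nodeIDCh
	if !expose {
		return ""
	}
	return id
}

func main() {
	pre := &loader{}
	fmt.Printf("%q\n", headerPreFix(pre, "node-a")) // "" (nothing stamped yet)
	fmt.Printf("%q\n", headerPreFix(pre, "node-b")) // "node-a" (prior request's decision)

	post := &loader{}
	fmt.Printf("%q\n", headerPostFix(post, "node-a", true))  // "node-a"
	fmt.Printf("%q\n", headerPostFix(post, "node-b", true))  // "node-b"
	fmt.Printf("%q\n", headerPostFix(post, "node-c", false)) // ""
}
```

In this model the second pre-fix request reports the first request's node,
which is the kind of cross-request leak the sequential-request spec is meant
to catch.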

All four specs use Ginkgo; none uses stdlib testing patterns.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Assisted-by: Claude:claude-opus-4-7[1m]

2026-05-24 20:48:06 +00:00