Mirror of https://github.com/mudler/LocalAI.git, synced 2026-06-24 16:49:06 -04:00
Phase 3 synthesis of the max_prefill_tokens (patch 0013) fair re-run: how much of the gap was prefill starvation, the genuine remaining gap to vLLM, and where par-or-beat stands per concurrency/model.

Assisted-by: Claude:opus-4.8 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>