LocalAI/pkg at 0e381897b5ca227b379578331c83fd34c6d908a6 - LocalAI - Gitea: Git with a cup of tea

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-06-30 11:26:32 -04:00

Files

History

pos-ei-don 2cee318fad fix(functions): avoid quadratic-time debug logging in CleanupLLMResult / ParseFunctionCall (#10592 )

fix(functions): avoid quadratic-time debug logging in CleanupLLMResult/ParseFunctionCall

The streaming chat path (core/http/endpoints/openai/chat_stream_workers.go)
calls CleanupLLMResult / ParseFunctionCall once per delta chunk with the
*full accumulated* LLM result so far. Both functions xlog.Debug the entire
argument on entry and exit, so a single N-chunk stream emits roughly
chunk_size * N^2 bytes of debug output.

Under LOG_LEVEL=debug this was observed in a recent SGLang-via-LocalAI
session on a DGX Spark host (about 50K tokens, long streaming generation)
to drive container logs to ~96 GiB, which interacted with the streaming
hot loop on the same filesystem and contributed to a host-wide hard hang
once disk pressure built up. Workaround was setting LOG_LEVEL=info, but
the quadratic shape remains a foot-gun for anyone intentionally enabling
debug.

Replace the four result-content debug arguments with len(...) plus a
fixed-size head (200 bytes via a new truncForLog helper), bounding per-
call output to a constant. The debug signal stays useful: the first 200
chars are enough to identify which generation is in flight, and the
length lets you observe growth without paying for the payload itself.

No API change. No behaviour change for LOG_LEVEL != debug.

Signed-off-by: Poseidon <philipp.wacker@ibf-solutions.com>
Co-authored-by: Poseidon <philipp.wacker@ibf-solutions.com>

2026-06-30 09:16:03 +02:00

..

feat(localvqe/audio): v1.3 release and add spectrograms to audio transform UI (#10113 )

2026-05-31 23:56:46 +02:00

refactor(routing): extract replica picker into pkg/clusterrouting (#10123 )

2026-06-01 09:38:55 +02:00

feat: add distributed mode (#9124 )

2026-03-30 00:47:27 +02:00

feat: prefix-cache-aware routing for distributed mode (#10071 )

2026-05-30 23:24:22 +02:00

fix(downloader): stall timeout, resume-safe cancel, and stale-partial reaping (#10406 )

2026-06-19 21:35:21 +02:00

fix(functions): avoid quadratic-time debug logging in CleanupLLMResult / ParseFunctionCall (#10592 )

2026-06-30 09:16:03 +02:00

feat(realtime): Semantic VAD EOU token (#10444 )

2026-06-30 09:01:22 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087 )

2026-05-30 12:04:10 +02:00

huggingface-api

Harden gallery-agent Hugging Face fetches against transient rate limiting (#10187 )

2026-06-05 23:43:06 +02:00

mcp/localaitools

feat(models): model aliases - redirect a model name to another configured model (#10414 )

2026-06-20 22:38:42 +02:00

feat(realtime): Semantic VAD EOU token (#10444 )

2026-06-30 09:01:22 +02:00

fix(distributed): missing agent NATS permissions (#10571 )

2026-06-28 12:58:13 +02:00

fix(oci): retry layer downloads on transient network errors (#10579 )

2026-06-28 21:21:08 +02:00

feat: prefix-cache-aware routing for distributed mode (#10071 )

2026-05-30 23:24:22 +02:00

fix(reasoning): stop prefilled <think> from swallowing tag-less answers (#10225 )

2026-06-09 09:02:04 +02:00

feat: add distributed mode (#9124 )

2026-03-30 00:47:27 +02:00

feat: add distributed mode (#9124 )

2026-03-30 00:47:27 +02:00

feat(realtime): Semantic VAD EOU token (#10444 )

2026-06-30 09:01:22 +02:00

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802 )

2026-05-25 09:28:27 +02:00

feat(backends): make PreferDevelopmentBackends install the development image as primary (#10520 )

2026-06-26 07:42:45 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087 )

2026-05-30 12:04:10 +02:00

fix(config): skip vocab arrays and mmap GGUF headers to speed up startup (#10213 )

2026-06-07 23:33:52 +02:00

feat(ui): allow to cancel ops (#7264 )

2025-11-13 18:41:47 +01:00

chore: fix go.mod module (#2635 )

2024-06-23 08:24:36 +00:00

fix(config): per-device VRAM headroom for Blackwell defaults (#10485 ) (#10494 )

2026-06-25 00:07:48 +02:00