LocalAI/pkg at 2c804bef5a6e359d6be8d005808e5645807c8ae3

mirror/LocalAI

mirror of https://github.com/mudler/LocalAI.git synced 2026-07-30 09:57:57 -04:00

Adira 2c804bef5a fix(config): skip vocab arrays and mmap GGUF headers to speed up startup (#10213)

When the models directory holds many GGUF files, startup parsed every
model's full GGUF — including the tokenizer vocab arrays
(tokenizer.ggml.tokens/scores/merges, often >100k entries) — once per
model while guessing defaults. On slow storage (e.g. a models directory
on a Docker volume) those hundreds of thousands of tiny reads dominate
boot time before the HTTP server comes up.

The default-guessing path and the VRAM metadata reader only consume
scalar metadata and array lengths, never the array contents. Parse with
SkipLargeMetadata (seek past large arrays) and UseMMap (fault in a few
header pages instead of issuing per-element read() syscalls). For a
256k-token vocab this cuts the parse from ~524k read() syscalls to 8.
The mapping is released when ParseGGUFFile returns.
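
The difference between per-element reads and walking a mapped header can be sketched with a toy model. This is a minimal illustration, not the code from this commit: it does not call ParseGGUFFile, SkipLargeMetadata, or UseMMap, and every name in it (countingReader, buildStringArray, readElementwise, skipInMemory) is hypothetical. It assumes a GGUF-like string array layout (a uint64 count, then a uint64 length plus bytes per element), uses a counting reader as a stand-in for read() syscalls, and uses a plain byte slice as a stand-in for a memory-mapped file. Under those assumptions each string costs two reads, so 262,144 tokens cost 2 × 262,144 = 524,288 reads, which would be consistent with the ~524k figure above.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// countingReader counts Read calls; a stand-in for read() syscalls on an
// unbuffered file. Hypothetical helper, not part of LocalAI.
type countingReader struct {
	r     io.Reader
	reads int
}

func (c *countingReader) Read(p []byte) (int, error) {
	c.reads++
	return c.r.Read(p)
}

// buildStringArray encodes n strings as: uint64 count, then per element a
// uint64 length followed by the bytes (assumed GGUF-like layout).
func buildStringArray(n int) []byte {
	var buf bytes.Buffer
	binary.Write(&buf, binary.LittleEndian, uint64(n))
	for i := 0; i < n; i++ {
		s := fmt.Sprintf("tok%d", i)
		binary.Write(&buf, binary.LittleEndian, uint64(len(s)))
		buf.WriteString(s)
	}
	return buf.Bytes()
}

// readElementwise materialises every element through the reader: one read for
// the length and one for the contents. Input is trusted test data here.
func readElementwise(data []byte) (count uint64, reads int, err error) {
	cr := &countingReader{r: bytes.NewReader(data)}
	if err = binary.Read(cr, binary.LittleEndian, &count); err != nil {
		return 0, cr.reads, err
	}
	for i := uint64(0); i < count; i++ {
		var l uint64
		if err = binary.Read(cr, binary.LittleEndian, &l); err != nil {
			return 0, cr.reads, err
		}
		if _, err = io.ReadFull(cr, make([]byte, l)); err != nil {
			return 0, cr.reads, err
		}
	}
	return count, cr.reads, nil
}

var errTruncated = errors.New("truncated array")

// skipInMemory walks the same array directly in a byte slice (as one would
// over a mapped file), keeping only the count and the end offset. It never
// copies element contents and issues no Read calls.
func skipInMemory(data []byte) (count uint64, end int, err error) {
	if len(data) < 8 {
		return 0, 0, errTruncated
	}
	count = binary.LittleEndian.Uint64(data)
	off := 8
	for i := uint64(0); i < count; i++ {
		if len(data)-off < 8 {
			return 0, 0, errTruncated
		}
		l := binary.LittleEndian.Uint64(data[off:])
		off += 8
		if l > uint64(len(data)-off) {
			return 0, 0, errTruncated
		}
		off += int(l) // jump over the contents without touching them
	}
	return count, off, nil
}

func main() {
	data := buildStringArray(262144) // a 256k-entry vocab
	n, reads, _ := readElementwise(data)
	fmt.Printf("element-wise: %d entries, %d reads\n", n, reads)
	n, end, _ := skipInMemory(data)
	fmt.Printf("in-memory skip: %d entries, end offset %d, 0 reads\n", n, end)
}
```

The toy keeps what the commit message says the callers need (the array length) while discarding the contents. How the real parser skips arrays and how many pages it faults in are not modelled here.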

Fixes #9790

Assisted-by: Claude:claude-opus-4-8 [Claude Code]

Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>

2026-06-07 23:33:52 +02:00

feat(localvqe/audio): v1.3 release and add spectrograms to audio transform UI (#10113)

2026-05-31 23:56:46 +02:00

refactor(routing): extract replica picker into pkg/clusterrouting (#10123)

2026-06-01 09:38:55 +02:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

fix(functions): validate auto-detected XML tool-call names — robust glm-4.5/Hermes guard (#9722, supersedes #9940) (#10059)

2026-05-29 12:03:33 +02:00

fix(distributed): self-heal stale 'model not loaded' routing (#10181)

2026-06-05 09:01:36 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

huggingface-api

Harden gallery-agent Hugging Face fetches against transient rate limiting (#10187)

2026-06-05 23:43:06 +02:00

mcp/localaitools

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

fix(model): track intentional stops, stop misreading clean shutdowns as crashes (#10060)

2026-05-29 18:54:27 +02:00

feat(distributed): Add NATS JWT authentication and TLS/mTLS options (#10159)

2026-06-03 19:43:56 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

feat: prefix-cache-aware routing for distributed mode (#10071)

2026-05-30 23:24:22 +02:00

fix(reasoning): suppress partial tag tokens during autoparser warm-up

2026-04-04 20:45:57 +00:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

feat: add distributed mode (#9124)

2026-03-30 00:47:27 +02:00

feat(middleware): Model routing, PII filtering, Cloud model proxies (#9802)

2026-05-25 09:28:27 +02:00

feat(importer): expand importer flow to almost all backends (#9466)

2026-04-22 22:42:37 +02:00

security(http): refuse redirects on outbound clients via hardened pkg/httpclient (#10087)

2026-05-30 12:04:10 +02:00

fix(config): skip vocab arrays and mmap GGUF headers to speed up startup (#10213)

2026-06-07 23:33:52 +02:00

feat(ui): allow to cancel ops (#7264)

2025-11-13 18:41:47 +01:00

chore: fix go.mod module (#2635)

2024-06-23 08:24:36 +00:00

fix(intel): VRAM detection (#9944)

2026-05-25 09:29:00 +02:00