fix: drop gguf VRAM estimation

Cleanup. This is now handled directly in llama.cpp, so there is no need to estimate it from Go. VRAM estimation is tricky in general, but llama.cpp (41ea26144e/src/llama.cpp (L168)) has recently added automatic "fitting" of models to VRAM, so we can drop the backend-specific GGUF VRAM estimation from our code instead of trying to guess, since we already enable that fitting (397f7f0862/backend/cpp/llama-cpp/grpc-server.cpp (L393)).

Fixes: https://github.com/mudler/LocalAI/issues/8302
See: https://github.com/mudler/LocalAI/issues/8302#issuecomment-3830773472
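For context, a minimal hypothetical sketch of the kind of Go-side heuristic this change makes unnecessary: guessing VRAM from model file size, context length, and a rough KV-cache cost. The function name and constants below are illustrative assumptions, not the actual LocalAI code that was removed.

```go
// Hypothetical sketch (not the removed LocalAI code): a rough Go-side VRAM
// guess that llama.cpp's automatic model "fitting" now makes redundant.
package main

import "fmt"

// estimateVRAMBytes is a purely illustrative heuristic: model weights on disk
// plus a per-context-token KV-cache estimate and a fixed overhead. Real usage
// depends on quantization, offloaded layers, and runtime buffers, which is
// exactly why guessing from Go is unreliable.
func estimateVRAMBytes(modelFileBytes, ctxTokens, kvBytesPerToken int64) int64 {
	const overheadBytes = 512 << 20 // ~512 MiB slack for compute buffers (arbitrary)
	return modelFileBytes + ctxTokens*kvBytesPerToken + overheadBytes
}

func main() {
	// Example: a ~4 GiB quantized model, 8192-token context, ~160 KiB KV per token.
	est := estimateVRAMBytes(4<<30, 8192, 160<<10)
	fmt.Printf("estimated VRAM: %.2f GiB\n", float64(est)/float64(1<<30))
}
```

Heuristics like this tend to drift from what the backend actually allocates, so deferring to llama.cpp, which knows the real buffer sizes at load time, is the simpler and more accurate option.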