fix: drop gguf VRAM estimation

Cleanup. This is now handled directly in llama.cpp, so there is no need to estimate it from Go. VRAM estimation is tricky in general, but llama.cpp (41ea26144e/src/llama.cpp (L168)) has recently added automatic "fitting" of models to VRAM, so we can drop the backend-specific GGUF VRAM estimation from our code instead of trying to guess, since we already enable that fitting (397f7f0862/backend/cpp/llama-cpp/grpc-server.cpp (L393)).

Fixes: https://github.com/mudler/LocalAI/issues/8302
See: https://github.com/mudler/LocalAI/issues/8302#issuecomment-3830773472
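For context, a minimal hypothetical sketch of the kind of Go-side heuristic this change makes unnecessary: guessing VRAM from model file size, context length, and a rough KV-cache cost. The function name and constants below are illustrative assumptions, not the actual LocalAI code that was removed.

```go
// Hypothetical sketch (not the removed LocalAI code): a rough Go-side VRAM
// guess that llama.cpp's automatic model "fitting" now makes redundant.
package main

import "fmt"

// estimateVRAMBytes is a purely illustrative heuristic: model weights on disk
// plus a per-context-token KV-cache estimate and a fixed overhead. Real usage
// depends on quantization, offloaded layers, and runtime buffers, which is
// exactly why guessing from Go is unreliable.
func estimateVRAMBytes(modelFileBytes, ctxTokens, kvBytesPerToken int64) int64 {
	const overheadBytes = 512 << 20 // ~512 MiB slack for compute buffers (arbitrary)
	return modelFileBytes + ctxTokens*kvBytesPerToken + overheadBytes
}

func main() {
	// Example: a ~4 GiB quantized model, 8192-token context, ~160 KiB KV per token.
	est := estimateVRAMBytes(4<<30, 8192, 160<<10)
	fmt.Printf("estimated VRAM: %.2f GiB\n", float64(est)/float64(1<<30))
}
```

Heuristics like this tend to drift from what the backend actually allocates, so deferring to llama.cpp, which knows the real buffer sizes at load time, is the simpler and more accurate option.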