* fix: Add VRAM cleanup when stopping models
- Add Free() method to AIModel interface for proper GPU resource cleanup
- Implement Free() in llama backend to release llama.cpp model resources
- Add Free() stub implementations in base and SingleThread backends
- Modify deleteProcess() to call Free() before stopping the process
  to ensure VRAM is properly released when models are unloaded (see
  the sketch below)
Fixes an issue where VRAM was not freed when stopping models, which
could lead to memory exhaustion when running multiple models
sequentially.
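For illustration, a minimal Go sketch of the shape of this change: AIModel, Free(), and deleteProcess() are named in the commit, while the models map, the base stub type, and the stopProcess callback are hypothetical stand-ins for LocalAI's actual model-loader internals.

```go
package main

import "log"

// AIModel gains a Free() method so a backend can release the GPU
// resources (VRAM) held by a loaded model before its process stops.
type AIModel interface {
	Free() error
}

// base is a no-op stub for backends that hold no GPU resources of
// their own (mirrors the base/SingleThread stub implementations).
type base struct{}

func (b *base) Free() error { return nil }

// deleteProcess calls Free() before stopping the backend process, so
// VRAM is released even if process teardown is delayed.
func deleteProcess(models map[string]AIModel, name string, stopProcess func(string) error) error {
	if m, ok := models[name]; ok {
		// Log but do not abort on Free() errors: the process is
		// stopped regardless.
		if err := m.Free(); err != nil {
			log.Printf("error freeing model %s: %v", name, err)
		}
		delete(models, name)
	}
	return stopProcess(name)
}

func main() {
	models := map[string]AIModel{"llama": &base{}}
	_ = deleteProcess(models, "llama", func(name string) error {
		log.Printf("stopped process for %s", name)
		return nil
	})
}
```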
* feat: Add Free RPC to backend.proto for VRAM cleanup
- Add rpc Free(HealthMessage) returns (Result) {} to backend.proto
- This RPC is required to properly expose the Free() method
  through the gRPC interface for VRAM resource cleanup (see the
  sketch below)
Refs: PR #8739
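The generated Go stubs for backend.proto would then expose Free on the backend client. Below is a hedged sketch of invoking it; the import path of the generated package, the service name Backend, and the Result field names are assumptions, not confirmed by this commit.

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Placeholder path: point this at the package generated from backend.proto.
	pb "example.com/localai/pkg/grpc/proto"
)

func freeModel(addr string) error {
	conn, err := grpc.NewClient(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		return err
	}
	defer conn.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// rpc Free(HealthMessage) returns (Result) {}: HealthMessage is
	// reused here as an empty request envelope.
	res, err := pb.NewBackendClient(conn).Free(ctx, &pb.HealthMessage{})
	if err != nil {
		return err
	}
	// Result's success/message fields are an assumption about the proto.
	log.Printf("Free result: success=%v message=%s", res.GetSuccess(), res.GetMessage())
	return nil
}

func main() {
	if err := freeModel("127.0.0.1:50051"); err != nil {
		log.Fatal(err)
	}
}
```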
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>