LocalAI [bot]
6e5a58ca70
feat: Add Free RPC to backend.proto for VRAM cleanup (#8751)
...
* fix: Add VRAM cleanup when stopping models
- Add Free() method to AIModel interface for proper GPU resource cleanup
- Implement Free() in llama backend to release llama.cpp model resources
- Add Free() stub implementations in base and SingleThread backends
- Modify deleteProcess() to call Free() before stopping the process
to ensure VRAM is properly released when models are unloaded
Fixes issue where VRAM was not freed when stopping models, which
could lead to memory exhaustion when running multiple models
sequentially.
* feat: Add Free RPC to backend.proto for VRAM cleanup
- Add rpc Free(HealthMessage) returns (Result) {} to backend.proto
- This RPC is required to properly expose the Free() method
through the gRPC interface for VRAM resource cleanup
Refs: PR #8739
* Apply suggestion from @mudler
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-03 12:39:06 +01:00
Ettore Di Giacinto
1c8db3846d
chore(faster-qwen3-tts): Add anyio to requirements.txt
...
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-03 09:43:29 +01:00
LocalAI [bot]
d846ad3a84
chore: ⬆️ Update ggml-org/llama.cpp to 4d828bd1ab52773ba9570cc008cf209eb4a8b2f5 (#8727)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-02 23:22:28 +01:00
LocalAI [bot]
2dd4e7cdc3
fix(qwen-tts): ensure all requirements files end with newline (#8724)
...
- Add trailing newline to all requirements*.txt files in qwen-tts backend
- This ensures proper file formatting and prevents potential issues with
package installation tools that expect newline-terminated files
2026-03-02 13:56:11 +01:00
LocalAI [bot]
b61536c0f4
chore: ⬆️ Update ggml-org/llama.cpp to 319146247e643695f94a558e8ae686277dd4f8da (#8707)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-02 10:08:51 +01:00
LocalAI [bot]
8b430c577b
feat: Add debug logging for pocket-tts voice issue #8244 (#8715)
...
Adds debug logging to help investigate the pocket-tts custom voice
discovery issue (Issue #8244). This is a first step toward understanding
how voices are loaded and where the failure occurs.
Signed-off-by: localai-bot <localai-bot@users.noreply.github.com>
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-03-02 09:24:59 +01:00
LocalAI [bot]
ddb36468ed
chore: ⬆️ Update ggml-org/llama.cpp to 05728db18eea59de81ee3a7699739daaf015206b (#8683)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-01 00:48:26 +01:00
Ettore Di Giacinto
1c5dc83232
chore(deps): bump llama.cpp to 'ecbcb7ea9d3303097519723b264a8b5f1e977028' (#8672)
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-28 00:33:56 +01:00
LocalAI [bot]
73b997686a
chore: ⬆️ Update ggml-org/whisper.cpp to 9453b4b9be9b73adfc35051083f37cefa039acee (#8671)
...
⬆️ Update ggml-org/whisper.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-27 21:28:48 +00:00
LocalAI [bot]
dfc6efb88d
feat(backends): add faster-qwen3-tts (#8664)
...
* feat(backends): add faster-qwen3-tts
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: this backend is CUDA only
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
* fix: add requirements-install.txt with setuptools for build isolation
The faster-qwen3-tts backend requires setuptools to build packages
like sox that have setuptools as a build dependency. This ensures
the build completes successfully in CI.
Signed-off-by: LocalAI Bot <localai-bot@users.noreply.github.com>
---------
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: LocalAI Bot <localai-bot@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-27 08:16:51 +01:00
LocalAI [bot]
8ad40091a6
chore: ⬆️ Update ggml-org/llama.cpp to 723c71064da0908c19683f8c344715fbf6d986fd (#8660)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-26 21:34:47 +00:00
LocalAI [bot]
fb86f6461d
chore: ⬆️ Update ggml-org/llama.cpp to 3769fe6eb70b0a0fbb30b80917f1caae68c902f7 (#8655)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-26 00:05:03 +01:00
Ettore Di Giacinto
b032cf489b
fix(chatterbox): add support for cuda13/aarch64 (#8653)
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-25 21:51:44 +01:00
dependabot[bot]
c4783a0a05
chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/vllm (#8635)
...
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.1)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:17:32 +01:00
dependabot[bot]
c44f03b882
chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/rerankers (#8636)
...
chore(deps): bump grpcio in /backend/python/rerankers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.1)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:16:57 +01:00
dependabot[bot]
eeec92af78
chore(deps): bump sentence-transformers from 5.2.2 to 5.2.3 in /backend/python/transformers (#8638)
...
chore(deps): bump sentence-transformers in /backend/python/transformers
Bumps [sentence-transformers](https://github.com/huggingface/sentence-transformers) from 5.2.2 to 5.2.3.
- [Release notes](https://github.com/huggingface/sentence-transformers/releases)
- [Commits](https://github.com/huggingface/sentence-transformers/compare/v5.2.2...v5.2.3)
---
updated-dependencies:
- dependency-name: sentence-transformers
  dependency-version: 5.2.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:16:41 +01:00
dependabot[bot]
842033b8b5
chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/transformers (#8640)
...
chore(deps): bump grpcio in /backend/python/transformers
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.1)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:14:55 +01:00
dependabot[bot]
a2941228a7
chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/common/template (#8641)
...
chore(deps): bump grpcio in /backend/python/common/template
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.1)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:14:43 +01:00
dependabot[bot]
791e6b84ee
chore(deps): bump grpcio from 1.76.0 to 1.78.1 in /backend/python/coqui (#8642)
...
Bumps [grpcio](https://github.com/grpc/grpc) from 1.76.0 to 1.78.1.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Commits](https://github.com/grpc/grpc/compare/v1.76.0...v1.78.1)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-version: 1.78.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-02-25 08:14:30 +01:00
LocalAI [bot]
1331e23b67
chore: ⬆️ Update ggml-org/llama.cpp to 418dea39cea85d3496c8b04a118c3b17f3940ad8 (#8649)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-25 00:04:48 +00:00
LocalAI [bot]
9a5b5ee8a9
chore: ⬆️ Update ggml-org/llama.cpp to b68a83e641b3ebe6465970b34e99f3f0e0a0b21a (#8628)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 22:02:40 +00:00
LocalAI [bot]
f40c8dd0ce
chore: ⬆️ Update ggml-org/llama.cpp to 2b6dfe824de8600c061ef91ce5cc5c307f97112c (#8622)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 09:30:58 +00:00
LocalAI [bot]
91f2dd5820
chore: ⬆️ Update ggml-org/llama.cpp to f75c4e8bf52ea480ece07fd3d9a292f1d7f04bc5 (#8619)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-22 13:20:08 +01:00
LocalAI [bot]
fcecc12e57
chore: ⬆️ Update ggml-org/llama.cpp to ba3b9c8844aca35ecb40d31886686326f22d2214 (#8613)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-21 09:57:04 +01:00
LocalAI [bot]
bb0924dff1
chore: ⬆️ Update ggml-org/llama.cpp to b908baf1825b1a89afef87b09e22c32af2ca6548 (#8612)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-20 23:47:47 +01:00
LocalAI [bot]
b1c434f0fc
chore: ⬆️ Update ggml-org/llama.cpp to 11c325c6e0666a30590cde390d5746a405e536b9 (#8607)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-19 23:32:35 +01:00
LocalAI [bot]
bb42b342de
chore: ⬆️ Update ggml-org/whisper.cpp to 21411d81ea736ed5d9cdea4df360d3c4b60a4adb (#8606)
...
⬆️ Update ggml-org/whisper.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-19 23:32:21 +01:00
LocalAI [bot]
e555057f8b
fix: multi-GPU support for Diffusers (Issue #8575) (#8605)
...
* chore: init
* feat: implement multi-GPU support for Diffusers backend (fixes #8575)
---------
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-02-19 21:35:58 +01:00
Ettore Di Giacinto
dadc7158fb
fix(diffusers): sd_embed is not always available (#8602)
...
sd_embed doesn't seem to play well with MPS and L4T, so it is made optional.
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-19 10:45:17 +01:00
LocalAI [bot]
68c7077491
chore: ⬆️ Update ggml-org/llama.cpp to b55dcdef5dcd74dc75c4921090e928d43453c157 (#8599)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-18 22:33:25 +01:00
LocalAI [bot]
ed832cf0e0
chore: ⬆️ Update ggml-org/llama.cpp to 2b089c77580d347767f440205103e4da8ec33d89 (#8592)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-17 22:35:07 +00:00
Richard Palethorpe
9e692967c3
fix(llama-cpp): Pass parameters when using embedded template (#8590)
...
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-17 18:50:05 +01:00
LocalAI [bot]
067a255435
chore: ⬆️ Update ggml-org/llama.cpp to d612901116ab2066c7923372d4827032ff296bc4 (#8588)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-17 00:57:32 +01:00
LocalAI [bot]
109f29cc24
chore: ⬆️ Update ggml-org/llama.cpp to 27b93cbd157fc4ad94573a1fbc226d3e18ea1bb4 (#8577)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-15 23:42:36 +01:00
LocalAI [bot]
587e4a21b3
chore: ⬆️ Update antirez/voxtral.c to 134d366c24d20c64b614a3dcc8bda2a6922d077d (#8578)
...
⬆️ Update antirez/voxtral.c
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-15 23:42:11 +01:00
LocalAI [bot]
3f1f58b2ab
chore: ⬆️ Update ggml-org/whisper.cpp to 364c77f4ca2737e3287652e0e8a8c6dce3231bba (#8576)
...
⬆️ Update ggml-org/whisper.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-15 21:20:04 +00:00
LocalAI [bot]
d784851337
chore: ⬆️ Update ggml-org/llama.cpp to 01d8eaa28d57bfc6d06e30072085ed0ef12e06c5 (#8567)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-14 22:52:32 +01:00
LocalAI [bot]
94df096fb9
fix: pin neutts-air to known working commit (#8566)
...
* chore: init
* fix: pin neutts-air to known working commit
---------
Co-authored-by: localai-bot <localai-bot@users.noreply.github.com>
2026-02-14 21:16:37 +01:00
Ettore Di Giacinto
820bd7dd01
fix(ci): try to fix deps for l4t13 on qwen-*
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-14 10:21:23 +01:00
Austen
42cb7bda19
fix(llama-cpp): populate tensor_buft_override buffer so llama-cpp properly performs fit calculations (#8560)
...
fix auto-fit for llama-cpp
2026-02-14 10:07:37 +01:00
Ettore Di Giacinto
2fb9940b8a
fix(voxcpm): pin setuptools (#8556)
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-13 23:44:35 +01:00
LocalAI [bot]
2ff0ad4190
chore: ⬆️ Update ggml-org/llama.cpp to 05a6f0e8946914918758db767f6eb04bc1e38507 (#8553)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-13 22:48:01 +01:00
Ettore Di Giacinto
2fd026e958
fix: update moonshine API, add setuptools to voxcpm requirements (#8541)
...
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-12 23:22:37 +01:00
LocalAI [bot]
08718b656e
chore: ⬆️ Update ggml-org/llama.cpp to 338085c69e486b7155e5b03d7b5087e02c0e2528 (#8538)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-12 23:21:53 +01:00
Austen
cff972094c
feat(diffusers): add experimental support for sd_embed-style prompt embedding (#8504)
...
* add experimental support for sd_embed-style prompt embedding
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
* add doc equivalent to compel
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
* need to use flux1 embedding function for flux model
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
---------
Signed-off-by: Austen Dicken <cvpcsm@gmail.com>
2026-02-11 22:58:19 +01:00
LocalAI [bot]
79a25f7ae9
chore: ⬆️ Update ggml-org/llama.cpp to 4d3daf80f8834e0eb5148efc7610513f1e263653 (#8513)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-11 21:48:39 +00:00
LocalAI [bot]
0ee92317ec
chore: ⬆️ Update ggml-org/llama.cpp to 57487a64c88c152ac72f3aea09bd1cc491b2f61e (#8499)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-10 21:32:46 +00:00
LocalAI [bot]
743d2d1947
chore: ⬆️ Update ggml-org/whisper.cpp to 764482c3175d9c3bc6089c1ec84df7d1b9537d83 (#8478)
...
⬆️ Update ggml-org/whisper.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-10 15:14:59 +01:00
LocalAI [bot]
df04843f34
chore: ⬆️ Update ggml-org/llama.cpp to 262364e31d1da43596fe84244fba44e94a0de64e (#8479)
...
⬆️ Update ggml-org/llama.cpp
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-10 15:14:33 +01:00
LocalAI [bot]
0c040beb59
chore: ⬆️ Update antirez/voxtral.c to c9e8773a2042d67c637fc492c8a655c485354080 (#8477)
...
⬆️ Update antirez/voxtral.c
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-09 22:20:03 +01:00