374 Commits

Author SHA1 Message Date
LocalAI [bot]
b0b37a472f chore: ⬆️ Update ggml-org/llama.cpp to 08f21453aec846867b39878500d725a05bd32683 (#9190)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-31 09:27:08 +02:00
LocalAI [bot]
3d738164b7 chore: ⬆️ Update ggml-org/llama.cpp to 7c203670f8d746382247ed369fea7fbf10df8ae0 (#9160)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-30 08:27:26 +02:00
Ettore Di Giacinto
59108fbe32 feat: add distributed mode (#9124)
* feat: add distributed mode (experimental)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix data races, mutexes, transactions

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix events and tool stream in agent chat

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* use ginkgo

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(cron): compute correctly time boundaries avoiding re-triggering

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not flood of healthy checks

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* do not list obvious backends as text backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* tests fixups

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* refactoring and consolidation

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Drop redundant healthcheck

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* enhancements, refactorings

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-30 00:47:27 +02:00
LocalAI [bot]
4c870288d9 chore: ⬆️ Update ggml-org/llama.cpp to 59d840209a5195c2f6e2e81b5f8339a0637b59d9 (#9144)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-28 18:18:06 +01:00
LocalAI [bot]
b86fa63f70 chore: ⬆️ Update ggml-org/llama.cpp to a970515bdb0b1d09519106847660b0d0c84d2472 (#9137)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-26 07:56:41 +01:00
LocalAI [bot]
9bc68b2721 chore: ⬆️ Update ggml-org/llama.cpp to 9f102a1407ed5d73b8c954f32edab50f8dfa3f58 (#9127)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-25 07:52:14 +01:00
LocalAI [bot]
2ad8c149e0 chore: ⬆️ Update ggml-org/llama.cpp to 1772701f99dd3fc13f5783b282c2361eda8ca47c (#9123)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-24 00:35:40 +01:00
LocalAI [bot]
31fcb1425d chore: ⬆️ Update ggml-org/llama.cpp to 49bfddeca18e62fa3d39114a23e9fcbdf8a22388 (#9102)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-23 01:11:18 +01:00
Ettore Di Giacinto
f891d60d26 fix(llama.cpp): bundle libdl, librt, libpthread in llama-cpp backend (#9099)
chore(llama.cpp): bundle libdl, librt, libpthread in llama-cpp backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 00:58:14 +01:00
LocalAI [bot]
b74111feed chore: ⬆️ Update ggml-org/llama.cpp to 990e4d96980d0b016a2b07049cc9031642fb9903 (#9095)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-22 00:57:39 +01:00
Ettore Di Giacinto
031a36c995 feat: inferencing default, automatic tool parsing fallback and wire min_p (#9092)
* feat: wire min_p

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: inferencing defaults

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore(refactor): re-use iterative parser

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: generate automatically inference defaults from unsloth

Instead of trying to re-invent the wheel and maintain here the inference
defaults, prefer to consume unsloth ones, and contribute there as
necessary.

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: apply defaults also to models installed via gallery

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: be consistent and apply fallback to all endpoint

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-22 00:57:15 +01:00
LocalAI [bot]
aa3e82976e chore: ⬆️ Update ggml-org/llama.cpp to 4cb7e0bd61e7e1101e8ab10db5dee70c5717a386 (#9087)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-21 09:41:11 +01:00
Ettore Di Giacinto
c3174f9543 chore(deps): bump llama-cpp to 'a0bbcdd9b6b83eeeda6f1216088f42c33d464e38' (#9079)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-20 08:12:21 +01:00
LocalAI [bot]
9a9da062e1 chore: ⬆️ Update ggml-org/llama.cpp to 5744d7ec430e2f875a393770195fda530560773f (#9063)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-19 07:58:30 +01:00
LocalAI [bot]
a58475dbef chore: ⬆️ Update ggml-org/llama.cpp to ee4801e5a6ee7ee4063144ab44ab4e127f76fba8 (#9044)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-18 08:46:12 +01:00
LocalAI [bot]
118bcee196 chore: ⬆️ Update ggml-org/llama.cpp to 9b342d0a9f2f4892daec065491583ec2be129685 (#9039)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-17 10:22:42 +01:00
LocalAI [bot]
b2030255ca chore: ⬆️ Update ggml-org/llama.cpp to 88915cb55c14769738fcab7f1c6eaa6dcc9c2b0c (#9020)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-16 00:10:11 +01:00
LocalAI [bot]
87525109f1 chore: ⬆️ Update ggml-org/llama.cpp to 3a6f059909ed5dab8587df5df4120315053d57a4 (#9009)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-15 09:46:45 +01:00
LocalAI [bot]
977063c4ba chore: ⬆️ Update ggml-org/llama.cpp to e30f1fdf74ea9238ff562901aa974c75aab6619b (#8997)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-14 01:16:42 +01:00
LocalAI [bot]
46a8941a2c chore: ⬆️ Update ggml-org/llama.cpp to 57819b8d4b39d893408e51520dff3d47d1ebb757 (#8983)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-13 07:59:15 +01:00
Richard Palethorpe
b24ca51287 fix(llama-cpp): Set enable_thinking in the correct place (#8973)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-03-12 13:32:29 +01:00
LocalAI [bot]
270eb956c7 chore: ⬆️ Update ggml-org/llama.cpp to 10e5b148b061569aaee8ae0cf72a703129df0eab (#8946)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-11 08:04:09 +01:00
LocalAI [bot]
b48920ecf6 chore: ⬆️ Update ggml-org/llama.cpp to 23fbfcb1ad6c6f76b230e8895254de785000be46 (#8921)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-10 07:30:43 +01:00
LocalAI [bot]
f06c02d10e chore: ⬆️ Update ggml-org/llama.cpp to 35bee031e17ed2b2e8e7278b284a6c8cd120d9f8 (#8872)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-08 22:25:04 +01:00
Ettore Di Giacinto
b2f81bfa2e feat(functions): add peg-based parsing and allow backends to return tool calls directly (#8838)
* feat(functions): add peg-based parsing

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: support returning toolcalls directly from backends

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* chore: do run PEG only if backend didn't send deltas

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-08 22:21:57 +01:00
LocalAI [bot]
1296167f84 chore: ⬆️ Update ggml-org/llama.cpp to c5a778891ba0ddbd4cbb507c823f970595b1adc2 (#8837)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-07 23:28:06 +01:00
LocalAI [bot]
e1df6807dc chore: ⬆️ Update ggml-org/llama.cpp to 566059a26b0ce8faec4ea053605719d399c64cc5 (#8822)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-06 23:53:23 +01:00
Ettore Di Giacinto
580517f9db feat: pass-by metadata to predict options (#8795)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-03-05 22:50:10 +01:00
LocalAI [bot]
0cf7c18177 chore: ⬆️ Update ggml-org/llama.cpp to a0ed91a442ea6b013bd42ebc3887a81792eaefa1 (#8797)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-05 22:49:45 +01:00
LocalAI [bot]
f25e450414 chore: ⬆️ Update ggml-org/llama.cpp to 24d2ee052795063afffc9732465ca1b1c65f4a28 (#8777)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-04 23:25:48 +01:00
LocalAI [bot]
6002c940a9 chore: ⬆️ Update ggml-org/llama.cpp to ecd99d6a9acbc436bad085783bcd5d0b9ae9e9e9 (#8762)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-03-04 08:08:37 +01:00
LocalAI [bot]
d846ad3a84 chore: ⬆️ Update ggml-org/llama.cpp to 4d828bd1ab52773ba9570cc008cf209eb4a8b2f5 (#8727)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-02 23:22:28 +01:00
LocalAI [bot]
b61536c0f4 chore: ⬆️ Update ggml-org/llama.cpp to 319146247e643695f94a558e8ae686277dd4f8da (#8707)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-02 10:08:51 +01:00
LocalAI [bot]
ddb36468ed chore: ⬆️ Update ggml-org/llama.cpp to 05728db18eea59de81ee3a7699739daaf015206b (#8683)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-03-01 00:48:26 +01:00
Ettore Di Giacinto
1c5dc83232 chore(deps): bump llama.cpp to 'ecbcb7ea9d3303097519723b264a8b5f1e977028' (#8672)
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
2026-02-28 00:33:56 +01:00
LocalAI [bot]
8ad40091a6 chore: ⬆️ Update ggml-org/llama.cpp to 723c71064da0908c19683f8c344715fbf6d986fd (#8660)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-26 21:34:47 +00:00
LocalAI [bot]
fb86f6461d chore: ⬆️ Update ggml-org/llama.cpp to 3769fe6eb70b0a0fbb30b80917f1caae68c902f7 (#8655)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-26 00:05:03 +01:00
LocalAI [bot]
1331e23b67 chore: ⬆️ Update ggml-org/llama.cpp to 418dea39cea85d3496c8b04a118c3b17f3940ad8 (#8649)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-25 00:04:48 +00:00
LocalAI [bot]
9a5b5ee8a9 chore: ⬆️ Update ggml-org/llama.cpp to b68a83e641b3ebe6465970b34e99f3f0e0a0b21a (#8628)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 22:02:40 +00:00
LocalAI [bot]
f40c8dd0ce chore: ⬆️ Update ggml-org/llama.cpp to 2b6dfe824de8600c061ef91ce5cc5c307f97112c (#8622)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-23 09:30:58 +00:00
LocalAI [bot]
91f2dd5820 chore: ⬆️ Update ggml-org/llama.cpp to f75c4e8bf52ea480ece07fd3d9a292f1d7f04bc5 (#8619)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-22 13:20:08 +01:00
LocalAI [bot]
fcecc12e57 chore: ⬆️ Update ggml-org/llama.cpp to ba3b9c8844aca35ecb40d31886686326f22d2214 (#8613)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-21 09:57:04 +01:00
LocalAI [bot]
bb0924dff1 chore: ⬆️ Update ggml-org/llama.cpp to b908baf1825b1a89afef87b09e22c32af2ca6548 (#8612)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-20 23:47:47 +01:00
LocalAI [bot]
b1c434f0fc chore: ⬆️ Update ggml-org/llama.cpp to 11c325c6e0666a30590cde390d5746a405e536b9 (#8607)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-19 23:32:35 +01:00
LocalAI [bot]
68c7077491 chore: ⬆️ Update ggml-org/llama.cpp to b55dcdef5dcd74dc75c4921090e928d43453c157 (#8599)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-18 22:33:25 +01:00
LocalAI [bot]
ed832cf0e0 chore: ⬆️ Update ggml-org/llama.cpp to 2b089c77580d347767f440205103e4da8ec33d89 (#8592)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
Co-authored-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
2026-02-17 22:35:07 +00:00
Richard Palethorpe
9e692967c3 fix(llama-cpp): Pass parameters when using embedded template (#8590)
Signed-off-by: Richard Palethorpe <io@richiejp.com>
2026-02-17 18:50:05 +01:00
LocalAI [bot]
067a255435 chore: ⬆️ Update ggml-org/llama.cpp to d612901116ab2066c7923372d4827032ff296bc4 (#8588)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-17 00:57:32 +01:00
LocalAI [bot]
109f29cc24 chore: ⬆️ Update ggml-org/llama.cpp to 27b93cbd157fc4ad94573a1fbc226d3e18ea1bb4 (#8577)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-15 23:42:36 +01:00
LocalAI [bot]
d784851337 chore: ⬆️ Update ggml-org/llama.cpp to 01d8eaa28d57bfc6d06e30072085ed0ef12e06c5 (#8567)
⬆️ Update ggml-org/llama.cpp

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
2026-02-14 22:52:32 +01:00