LocalAI/backend/python/mlx-distributed/sharding.py at 585d6248f2d4308716e125ed952ae74218009e20

mirror of https://github.com/mudler/LocalAI.git synced 2026-04-01 05:36:49 -04:00

Ettore Di Giacinto a026277ab9 feat(mlx-distributed): add new MLX-distributed backend (#8801)

* feat(mlx-distributed): add new MLX-distributed backend

Add a new MLX distributed backend that supports model sharding over
both TCP and RDMA.

This implementation ties into the discovery implementation already in
place and reuses the same P2P mechanism for TCP-based MLX-distributed
inference.

The auto-parallel implementation is inspired by Exo's (its authors
have been added to the acknowledgements for their great work!)

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* expose a CLI to facilitate backend starting

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: make manual rank0 configurable via model configs

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing features from mlx backend

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestion from @mudler

Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

2026-03-09 17:29:32 +01:00

4.5 KiB
