mirror of
https://github.com/mudler/LocalAI.git
synced 2026-04-01 13:42:20 -04:00
* feat(mlx-distributed): add new MLX-distributed backend

  Add a new MLX-distributed backend with support for both TCP and RDMA for model sharding. This implementation ties into the discovery implementation already in place and reuses the same P2P mechanism for TCP MLX-distributed inferencing. The auto-parallel implementation is inspired by Exo's (Exo has been added to the acknowledgements for the great work!).

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* expose a CLI to facilitate backend starting

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* feat: make manual rank0 configurable via model configs

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Add missing features from mlx backend

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* Apply suggestion from @mudler

  Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Signed-off-by: Ettore Di Giacinto <mudler@users.noreply.github.com>
24 lines
402 B
Makefile
# Default target: set up the backend by running install.sh.
.PHONY: mlx-distributed
mlx-distributed:
	bash install.sh

# Start the backend via run.sh.
.PHONY: run
run:
	@echo "Running mlx-distributed..."
	bash run.sh
	@echo "mlx-distributed run."

# Run the backend's test script.
.PHONY: test
test:
	@echo "Testing mlx-distributed..."
	bash test.sh
	@echo "mlx-distributed tested."

# Remove the generated gRPC/protobuf Python stubs.
.PHONY: protogen-clean
protogen-clean:
	$(RM) backend_pb2_grpc.py backend_pb2.py

# Remove the stubs, the virtual environment, and the Python bytecode cache.
.PHONY: clean
clean: protogen-clean
	rm -rf venv __pycache__