P2P Distribution Not Enabled
Enable peer-to-peer distribution to scale your AI workloads across multiple devices. Share instances, shard models, and pool computational resources across your network.
Instance Federation
Load balance across multiple instances
Model Sharding
Split large models across workers
Resource Sharing
Pool resources from multiple devices
How to Enable P2P
Start LocalAI with P2P enabled
local-ai run --p2p
This will automatically generate a network token for you.
Or use an existing token
export TOKEN="your-token-here"
local-ai run --p2p
If you already have a token from another instance, you can reuse it.
Access the P2P dashboard
Once enabled, refresh this page to see your network token and start connecting nodes.
How P2P Distribution Works
LocalAI uses peer-to-peer networking to distribute AI workloads across the devices in your network.
Instance Federation
Share complete LocalAI instances across your network for load balancing and redundancy. Perfect for scaling across multiple devices.
Model Sharding
Split large model weights across multiple workers. Currently supported with llama.cpp backends for efficient memory usage.
Resource Sharing
Pool computational resources from multiple devices, including your friends' machines, to handle larger workloads collaboratively.
Parallel processing
Add more nodes
Fault tolerant
Resource optimization
Network Token
{{.P2PToken}}
The network token can be used to share this instance, to join a federation, or to join a worker network. Below you will find examples of how to start a new instance or a worker with this token.
Federation
Instance sharing
nodes
Workers
Model sharding
workers
Network
Connection token
Federation Network
Instance load balancing and sharing
Start LocalAI in federated mode to share your instance, or launch a federated server to distribute requests intelligently across multiple nodes in your network.
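A minimal sketch of the two federation roles. The flags shown (`--federated`, `--p2p`) and the `local-ai federated` subcommand follow the upstream LocalAI CLI; verify them against `local-ai --help` for your version, and replace the placeholder token with the one shown on this page.

```shell
# Share this instance as a node in the federation
# (TOKEN is a placeholder; use the network token from the dashboard)
export TOKEN="your-network-token"
local-ai run --federated --p2p

# Or, on another machine, run a federated server that
# load-balances incoming requests across the available nodes
export TOKEN="your-network-token"
local-ai federated
```

Each node runs a complete LocalAI instance, so the federated server can route any request to any node.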
No nodes available
Start some nodes to see them here
Node
Worker Network
Distributed model computation (llama.cpp)
Deploy llama.cpp workers to split model weights across multiple devices. This enables processing larger models by distributing computational load and memory requirements.
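A minimal sketch of a sharded setup, assuming the `worker p2p-llama-cpp-rpc` subcommand documented upstream for llama.cpp distributed inference; check `local-ai worker --help` on your build, and substitute your real network token for the placeholder.

```shell
# On each additional device: start a llama.cpp RPC worker
# joined to this network (TOKEN is a placeholder)
export TOKEN="your-network-token"
local-ai worker p2p-llama-cpp-rpc

# On the main device: start LocalAI with P2P enabled;
# it discovers the workers and shards model weights across them
export TOKEN="your-network-token"
local-ai run --p2p
```

Because each worker holds only part of the weights, the combined pool can serve models larger than any single device's memory.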
No workers available
Start some workers to see them here