DeepZima 2da740c387 Feat/static peer discovery (#1690)
**Enabling peers to be discovered in environments where mDNS is
unavailable (SSH sessions, headless servers, Docker).**

## Motivation
Exo discovers peers exclusively via mDNS, which works well on a local
network but fails wherever multicast is blocked or the nodes do not
share an L2 broadcast domain:

- SSH sessions on macOS — TCC blocks mDNS multicast from non-GUI
sessions (#1488)
- Headless servers/rack machines — #1682 ("DGX Spark does not find other
nodes")
- Docker Compose — mDNS is often unavailable across container networks;
e.g. #1462 (E2E test framework) needs an alternative

Related work:

- #1488 (working implementation by @AlexCheema, closed because SSH had
a GUI workaround)
- #1023 (Headscale WAN, closed due to merge conflicts)
- #1656 (discovery cleanup, open)

This PR introduces an optional bootstrap mechanism for peer discovery
while leaving the existing mDNS behavior unchanged.

## Changes
Adds two new CLI flags:

- `--bootstrap-peers` (env: `EXO_BOOTSTRAP_PEERS`) — comma-separated
libp2p multiaddrs to dial on startup and retry periodically
- `--libp2p-port` — fixed TCP port for libp2p to listen on (default:
OS-assigned). Required when using bootstrap peers, so other nodes know
which port to dial.
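
As a rough illustration of how the two flags could be wired up on the Python side, here is a minimal `argparse` sketch. Only the flag names and the `EXO_BOOTSTRAP_PEERS` variable come from this PR; the helper names `parse_bootstrap_peers` and `build_parser`, and the choice that the CLI flag overrides the environment variable, are assumptions made for the example and may not match `src/exo/main.py`.

```python
import argparse
import os


def parse_bootstrap_peers(raw: str) -> list[str]:
    """Split a comma-separated string into multiaddr strings, dropping blanks."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the flag and env var names are from the PR.
    parser = argparse.ArgumentParser(prog="exo")
    parser.add_argument(
        "--bootstrap-peers",
        type=parse_bootstrap_peers,
        # Assumption: the env var supplies the default, the flag overrides it.
        default=parse_bootstrap_peers(os.environ.get("EXO_BOOTSTRAP_PEERS", "")),
        help="comma-separated libp2p multiaddrs to dial on startup",
    )
    parser.add_argument(
        "--libp2p-port",
        type=int,
        default=None,  # None stands for "let the OS assign a port"
        help="fixed TCP port for libp2p to listen on",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["--libp2p-port", "30000"])` yields an integer port and a (possibly empty) list of peer strings that can be handed to the networking layer.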

8 files: 
- `rust/networking/src/discovery.rs`: Store bootstrap addrs, dial in
existing retry loop
- `rust/networking/src/swarm.rs`: Thread `bootstrap_peers` parameter to
`Behaviour`
- `rust/networking/examples/chatroom.rs`: Updated call site for new
`create_swarm` signature
- `rust/networking/tests/bootstrap_peers.rs`: Integration tests
- `rust/exo_pyo3_bindings/src/networking.rs`: Accept optional
`bootstrap_peers` in PyO3 constructor
- `rust/exo_pyo3_bindings/exo_pyo3_bindings.pyi`: Update type stub
- `src/exo/routing/router.py`: Pass peers to `NetworkingHandle`
- `src/exo/main.py`: `--bootstrap-peers` CLI arg +
`EXO_BOOTSTRAP_PEERS` env var

## Why It Works

Bootstrap peers are dialed in the existing retry loop, the same path
that mDNS-discovered peers take. From there the swarm handles the
connection, the Noise handshake, and gossipsub mesh joining.
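
One way to picture "same retry loop, one more source of addresses" is the small Python model below. It is not the Rust code in `discovery.rs`; the function name `addrs_to_dial` and the idea of tracking connections by address string are simplifications assumed for the example (a real swarm tracks connections by PeerId).

```python
def addrs_to_dial(bootstrap, mdns_discovered, connected):
    """Return the addresses to dial on one retry tick, in a stable order.

    Hypothetical model: bootstrap and mDNS-discovered addresses are merged
    into one candidate list, and already-connected addresses are skipped.
    """
    # dict.fromkeys de-duplicates while preserving first-seen order.
    candidates = list(dict.fromkeys([*bootstrap, *mdns_discovered]))
    return [addr for addr in candidates if addr not in connected]
```

Because both sources end up in the same candidate list, a bootstrap peer that is temporarily down is simply retried on the next tick, just as an mDNS-discovered peer would be in this model.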

PeerId is intentionally not required in the multiaddr; the Noise
handshake discovers it.
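
To make the accepted address shapes concrete, here is a small Python checker for a simplified subset of the multiaddr format: `/ip4/<ipv4>/tcp/<port>` or `/dns4/<host>/tcp/<port>`, with an optional `/p2p/<peer-id>` suffix. It is an illustrative sketch, not the libp2p parser, and the names `is_valid_tcp_multiaddr` and `filter_valid` are made up for the example. Note that an `/ip4/` segment must carry a literal IPv4 address, so a hostname has to be written with `/dns4/`.

```python
import ipaddress


def is_valid_tcp_multiaddr(addr: str) -> bool:
    """Check a simplified subset of multiaddr syntax (see lead-in)."""
    parts = addr.strip().split("/")
    # A leading "/" produces an empty first element; 5 parts without a
    # peer ID, 7 parts with a trailing /p2p/<peer-id>.
    if len(parts) not in (5, 7) or parts[0] != "":
        return False
    _, proto, host, transport, port = parts[:5]
    if transport != "tcp":
        return False
    if not (port.isdecimal() and 0 < int(port) <= 65535):
        return False
    if proto == "ip4":
        try:
            ipaddress.IPv4Address(host)  # rejects hostnames such as "exo-2"
        except ValueError:
            return False
    elif proto == "dns4":
        if not host:
            return False
    else:
        return False
    if len(parts) == 7:
        # The peer ID is optional; if present it must be non-empty.
        return parts[5] == "p2p" and bool(parts[6])
    return True


def filter_valid(addrs):
    """Drop invalid entries without raising, keeping input order."""
    return [a for a in addrs if is_valid_tcp_multiaddr(a)]
```

Filtering rather than raising mirrors the "invalid multiaddrs silently filtered" behaviour covered by the integration tests, although the real validation rules live in the Rust multiaddr parser.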

Docker Compose example:

```yaml
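services:
  exo-1:
    command: ["--libp2p-port", "30000"]
    environment:
      # /ip4/ only accepts a literal IPv4 address; hostnames need /dns4/
      # (assumes the transport resolves DNS multiaddrs, else use fixed IPs).
      EXO_BOOTSTRAP_PEERS: "/dns4/exo-2/tcp/30000"
  exo-2:
    command: ["--libp2p-port", "30000"]
    environment:
      EXO_BOOTSTRAP_PEERS: "/dns4/exo-1/tcp/30000"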
```

## Test Plan

### Manual Testing
<details>
<summary>Docker Compose config</summary>

```yaml
services:
  exo-node1:
    build:
      context: .
      dockerfile: Dockerfile.bootstrap-test
    container_name: exo-bootstrap-node1
    hostname: exo-node1
    command: ["-q", "--libp2p-port", "30000", "--bootstrap-peers", "/ip4/172.30.20.3/tcp/30000"]
    environment:
      - EXO_LIBP2P_NAMESPACE=bootstrap-test
    ports:
      - "52415:52415"
    networks:
      bootstrap-net:
        ipv4_address: 172.30.20.2
    deploy:
      resources:
        limits:
          memory: 4g

  exo-node2:
    build:
      context: .
      dockerfile: Dockerfile.bootstrap-test
    container_name: exo-bootstrap-node2
    hostname: exo-node2
    command: ["-q", "--libp2p-port", "30000", "--bootstrap-peers", "/ip4/172.30.20.2/tcp/30000"]
    environment:
      - EXO_LIBP2P_NAMESPACE=bootstrap-test
    ports:
      - "52416:52415"
    networks:
      bootstrap-net:
        ipv4_address: 172.30.20.3
    deploy:
      resources:
        limits:
          memory: 4g

networks:
  bootstrap-net:
    driver: bridge
    ipam:
      config:
        - subnet: 172.30.20.0/24
```
</details> 

Two containers on a bridge network (`172.30.20.0/24`), fixed IPs,
`--libp2p-port 30000`, cross-referencing `--bootstrap-peers`.

Both nodes discovered each other, established a connection, and then
ran the election protocol.
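
When repeating this manual test, it can help to first confirm that the peer's libp2p port is reachable at the TCP level before looking at discovery logs. The helper below is a generic standard-library sketch and not part of this PR; `port_reachable` is a hypothetical name.

```python
import socket


def port_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        # create_connection performs the TCP handshake only; it does not
        # speak libp2p, so success just means something is listening.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the setup above, calling `port_reachable("172.30.20.3", 30000)` from inside the first container would show whether the second node is listening on its fixed port.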

### Automated Testing

4 Rust integration tests in `rust/networking/tests/bootstrap_peers.rs`
(`cargo test -p networking`):

| Test | What it verifies | Result |
|------|------------------|--------|
| `two_nodes_connect_via_bootstrap_peers` | Node B discovers Node A via bootstrap addr (real TCP connection) | PASS |
| `create_swarm_with_empty_bootstrap_peers` | Backward compatibility: no bootstrap peers works | PASS |
| `create_swarm_ignores_invalid_bootstrap_addrs` | Invalid multiaddrs silently filtered | PASS |
| `create_swarm_with_fixed_port` | `listen_port` parameter works | PASS |

All 4 pass. The connection test takes ~6s.

---------

Signed-off-by: DeepZima <deepzima@outlook.com>
Co-authored-by: Evan <evanev7@gmail.com>
2026-03-25 10:55:12 +00:00