BrowserMesh

Distributed LLM inference across browser peers over WebRTC. Open this URL in two tabs (or send it to a friend), connect, load the model in each browser, and run inference together.

Your peer ID: connecting to broker…

Mesh

No peers connected. Share your invite link to add some.

This browser

Detecting capabilities…

Click to download and load. Cached after first use.

Inference

Ensemble (working) | Pipeline parallel (protocol only) | MoE expert routing (protocol only)

Load the model first.

Mesh log

Deployment notes

Signaling. This demo uses the public PeerJS broker at 0.peerjs.com, which is shared and rate-limited. For production, run your own PeerServer (npm install peer, or the Docker image) and pass {host, port, path, key} to new Peer(). Alternatively, write your own signaling server in PHP or any other language: it only needs to accept WebSocket connections and relay messages between the peers in a room, keyed by ID.
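As a minimal sketch of the self-hosted option, the helper below builds the options object that would be passed to new Peer(). The helper name peerServerOptions, the host signal.example.com, and the path /mesh are placeholders for illustration and are not part of the demo; the defaults are assumed to mirror the PeerJS client's own.

```javascript
// Hypothetical helper: assembles the {host, port, path, key} options for PeerJS.
// Every concrete value here is a placeholder; substitute your own PeerServer details.
function peerServerOptions({ host, port = 443, path = "/", key = "peerjs", secure = true }) {
  if (!host) {
    throw new Error("host is required for a self-hosted PeerServer");
  }
  return { host, port, path, key, secure };
}

// Placeholder endpoint, assumed to be served over TLS on port 443.
const signalingOptions = peerServerOptions({ host: "signal.example.com", path: "/mesh" });

// In the page, with the PeerJS client loaded, this would be used as:
//   const peer = new Peer(undefined, signalingOptions);
```

Keeping the options in one function makes it easier to switch between the public broker (no options) and your own server without touching the connection code.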

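NAT traversal. The default ICE configuration uses Google's public STUN server. If two peers can't connect (for example behind a corporate or symmetric NAT), you need a TURN server to relay their traffic. Coturn on a small VPS can serve thousands of users, since only the connections that cannot be established directly are relayed.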
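A rough sketch of adding a TURN relay alongside STUN is shown here. The function name buildRtcConfig, the TURN URL, and the credentials are all hypothetical; the returned object follows the standard RTCConfiguration shape, which PeerJS accepts through its config option.

```javascript
// Hypothetical helper: returns an RTCConfiguration with STUN and, optionally, TURN.
// The STUN entry is Google's well-known public server; the TURN details are placeholders.
function buildRtcConfig(turn) {
  const iceServers = [{ urls: "stun:stun.l.google.com:19302" }];
  if (turn) {
    // TURN servers require credentials; never ship long-lived secrets to the browser.
    iceServers.push({
      urls: turn.urls,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers };
}

const rtcConfig = buildRtcConfig({
  urls: "turn:turn.example.com:3478", // placeholder coturn endpoint
  username: "demo-user",              // placeholder
  credential: "demo-secret",          // placeholder
});

// In the page this would be passed as:
//   const peer = new Peer(undefined, { config: rtcConfig });
```

ICE tries direct and STUN-derived candidates first and falls back to the TURN relay only when those fail, so listing both is the usual arrangement.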

CORS / COEP. transformers.js downloads model files from the huggingface.co CDN, which is CORS-enabled. No special headers are required unless you later add SharedArrayBuffer or threaded WASM; in that case, serve the page with Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp.
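If threaded WASM is added later, the two headers could be set along these lines. This is a sketch only: applyIsolationHeaders and canUseThreads are hypothetical names, and the response object is assumed to expose a Node-style setHeader method.

```javascript
// The two headers named above; together they make the page cross-origin isolated.
const ISOLATION_HEADERS = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
};

// Hypothetical server-side helper: sets both headers on a Node-style response.
function applyIsolationHeaders(res) {
  for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
    res.setHeader(name, value);
  }
  return res;
}

// Hypothetical client-side check: SharedArrayBuffer is only usable in a page
// that the browser reports as cross-origin isolated.
function canUseThreads() {
  return typeof SharedArrayBuffer !== "undefined" && globalThis.crossOriginIsolated === true;
}
```

Calling canUseThreads() before enabling a multi-threaded backend lets the page fall back to single-threaded WASM when the headers are missing.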

Pipeline / MoE. The protocol layer (capability advertisement, request routing, error handling, circuit breaker) is fully wired. The inference handlers currently return { unimplemented: true }; implementing them requires a custom transformers.js fork that exposes the intermediate hidden states between transformer blocks. See the handlePipelineStage and handleMoeExpert stubs at the bottom of the script for the TODO contract.
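To illustrate how a stub reply and a circuit breaker might interact, here is a minimal sketch. It is not the demo's actual code: the CircuitBreaker class, the callStage function, the thresholds, and the choice to count an unimplemented reply as a failure are all assumptions made for this example. Only the { unimplemented: true } reply shape and the handlePipelineStage name come from the notes above.

```javascript
// Hypothetical circuit breaker: opens after maxFailures consecutive failures,
// then allows a trial request once cooldownMs has elapsed.
class CircuitBreaker {
  constructor({ maxFailures = 3, cooldownMs = 10000, now = () => Date.now() } = {}) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
    this.now = now; // injectable clock, useful for testing
    this.failures = 0;
    this.openedAt = null;
  }
  canRequest() {
    if (this.openedAt === null) return true;
    return this.now() - this.openedAt >= this.cooldownMs;
  }
  recordSuccess() {
    this.failures = 0;
    this.openedAt = null;
  }
  recordFailure() {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = this.now();
  }
}

// Stand-in for the stub described above; the request argument is unused.
function handlePipelineStage(_request) {
  return { unimplemented: true };
}

// Hypothetical caller: skips peers whose breaker is open and treats an
// unimplemented reply as a failure (an assumption, not the demo's contract).
function callStage(breaker, handler, request) {
  if (!breaker.canRequest()) return { ok: false, reason: "circuit-open" };
  const reply = handler(request);
  if (reply && reply.unimplemented) {
    breaker.recordFailure();
    return { ok: false, reason: "unimplemented" };
  }
  breaker.recordSuccess();
  return { ok: true, reply };
}
```

With this arrangement, a peer that only ever answers with the stub stops receiving pipeline requests after a few attempts, and the caller can fall back to the working ensemble mode.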