Minimal WebSocket relay server for party games. Clients share rooms and exchange messages — the server just forwards them.
- A client creates a room (server assigns a 6-char code) with a max client limit
- Other clients join by room code
- Each member is assigned a numeric
index(slot id). Indices are stable for the room's lifetime and never reassigned. New joiners always get the next index, so in long-lived rooms with churn an index can exceedmaxClients - Clients pick their own
clientId; it stays server-side and acts as the bearer secret for their slot. Reconnecting with the sameclientIdreplaces the old connection in the same slot - Messages can be broadcast to all peers or sent to a specific peer index
peer_leftis broadcast immediately on disconnect, and the cap slot frees right away — a new joiner can take it. The originalclientIdcan still reclaim its old index if no one took the spot in the meantime- Rooms are cleaned up when empty
bun run server.ts
# or
PORT=8080 bun run server.tsA host creates a room; one or more guests join it.
const ws = new WebSocket("wss://your-relay.example.com");
// Persist clientId in localStorage if you want reconnects to survive a reload.
const clientId = crypto.randomUUID();
ws.onopen = () => {
ws.send(JSON.stringify({ type: "create", clientId, maxClients: 4 }));
};
ws.onerror = (event) => console.error("websocket error", event);
ws.onmessage = (event) => {
const msg = JSON.parse(event.data);
switch (msg.type) {
case "created": console.log("room code:", msg.room, "my index:", msg.index); break;
case "peer_joined": console.log("peer joined:", msg.index); break;
case "peer_left": console.log("peer left:", msg.index); break;
case "message": console.log("from", msg.from, msg.data); break;
case "error": console.error("server error:", msg.message); break;
}
};const ws = new WebSocket("wss://your-relay.example.com");
const clientId = crypto.randomUUID(); // presenting the same one re-identifies your slot
ws.onopen = () => {
ws.send(JSON.stringify({ type: "join", clientId, room: "Mu5h6Z" }));
};
ws.onerror = (event) => console.error("websocket error", event);
ws.onmessage = (event) => {
const msg = JSON.parse(event.data);
// Guests also receive `peer_joined`, `peer_left`, `message`, and `error` —
// see the host snippet above for the full event surface.
if (msg.type === "joined") {
// msg.index is your slot; msg.peers is everyone else's slots.
// Broadcast to all peers. Add `to: <peerIndex>` to target a single peer.
ws.send(JSON.stringify({ type: "send", data: { move: "left" } }));
}
};Joining with the same clientId replaces the old connection in the same slot; no special reconnect message needed. The server closes the previous WebSocket with code 4000 and reason "replaced". Treat that as terminal in your reconnect loop, otherwise the new connection will be torn down by the next replacement.
Since the clientId stays on the original client, another connection can't take over your slot — but they can fill the cap spot it freed when you disconnected. Reconnect is best-effort: if the room re-filled while you were gone you'll get Room is full. Persist clientId locally if you want reconnect to survive a refresh.
Hosts reconnect like guests: send join with the original clientId and room code, not another create.
ws.onclose = (event) => {
if (event.code === 4000) return; // replaced by a newer connection — don't reconnect
// ...your reconnect logic
};sequenceDiagram
participant H as Host
participant S as Server
participant G as Guest
Note over H,G: Create a room
H->>S: { type: "create", clientId: "host-secret", maxClients: 4 }
S->>H: { type: "created", room: "Mu5h6Z", instance: "00bb33ff", region: "fra", index: 0 }
Note over H,G: Join a room
G->>S: { type: "join", clientId: "guest-secret", room: "Mu5h6Z" }
S->>G: { type: "joined", room: "Mu5h6Z", index: 1, peers: [0] }
S->>H: { type: "peer_joined", index: 1 }
Note over H,G: Broadcast message (sender excluded)
H->>S: { type: "send", data: { move: "left" } }
S->>G: { type: "message", from: 0, data: { move: "left" } }
Note over H,G: Targeted message
G->>S: { type: "send", to: 0, data: "hello" }
S->>H: { type: "message", from: 1, data: "hello" }
Note over H,G: Disconnect
G--xS: connection closed
S->>H: { type: "peer_left", index: 1 }
All messages are JSON over WebSocket.
| type | fields | description |
|---|---|---|
create |
clientId, maxClients |
Create a new room. Server assigns the 6-char code. Keep clientId private; presenting it again reclaims the slot. |
join |
clientId, room |
Join an existing room. Reusing your prior clientId reclaims your slot. |
send |
data, to? |
Send to all peers or a specific peer (to is a numeric index). |
| type | fields | description |
|---|---|---|
created |
room, instance, region, index |
Room created. index is your slot id (always 0 for the creator). instance identifies the holding machine for cross-instance routing; region is a label. |
joined |
room, index, peers[] |
Joined room. index is your slot id; peers lists the other present slot ids. |
peer_joined |
index |
A new peer joined the room |
peer_left |
index |
A peer disconnected |
message |
from, data |
Relayed message from a peer (from is the sender's index) |
error |
message |
Error description |
docker build -t party-sockets .
docker run -p 3000:3000 -e PORT=3000 party-sockets| variable | default | description |
|---|---|---|
PORT |
3000 |
TCP port to listen on |
INSTANCE_ID |
empty | Machine identifier echoed in the created message and X-Instance-Id response header |
REGION |
empty | Region label echoed in created and /metrics |
DASHBOARD_URL |
none | Where GET / redirects. Unset → plaintext rooms / clients snapshot |
All HTTP endpoints include Access-Control-Allow-Origin: *.
Liveness probe.
- 200 —
{ status: "ok" }
Check whether a room exists on this server. The handling machine's ID is returned in the X-Instance-Id response header.
- 200 — room found:
{ clients: number, maxClients: number, origin: string } - 404 — room not found:
{ error: "Room not found" }
Prometheus exposition format. Exposes:
party_sockets_clients/party_sockets_rooms— live gaugesparty_sockets_clients_by_origin/party_sockets_rooms_by_origin— same, labeled by originparty_sockets_connections_total/party_sockets_rooms_created_total— since-boot counters per originparty_sockets_origins_tracked— origins currently tracked (size of internal map; capped at 500 with LRU eviction)process_resident_memory_bytes,process_heap_used_bytes,process_uptime_seconds— runtime health
All series are labeled with instance, region, version.
302 to DASHBOARD_URL if set; otherwise a plaintext rooms / clients snapshot for this machine.
Starter Grafana dashboard at ops/grafana-dashboard.json. Import into Grafana with a Prometheus datasource.
# Unit tests (in-process, no network)
bun test
# Live tests against a deployed instance
LIVE_URL=https://your-relay.example.com bun run test:liveOn Fly.io the platform-injected env vars unlock cross-instance and cross-region routing. None are required outside Fly.
| variable | role |
|---|---|
FLY_APP_NAME |
Enables DNS-based peer probe and default DASHBOARD_URL |
FLY_MACHINE_ID |
Fallback for INSTANCE_ID |
FLY_REGION |
Fallback for REGION; enables region-encoded room codes |
When deployed across multiple machines behind one anycast hostname, the upgrade URL can carry routing hints so connections land on the machine that holds the room:
// Pin to a known instance + room (from a previous `created` response)
new WebSocket("wss://your-relay.fly.dev/Mu5h6Z?instance=00bb33ff");
// Manual code entry: server reads /<code> from the path. Room codes encode
// their home region in the top 5 bits, so the receiving machine fly-replays
// directly to that region. Within the home region, peers probe each other
// over internal DNS to find the machine actually holding the room.
new WebSocket("wss://your-relay.fly.dev/Mu5h6Z");Single-instance deployments can omit both — they're no-ops when no peers exist. Redirects use fly-replay headers; swap the helpers in server.ts for other platforms.
Stale ?instance= values (machine replaced or destroyed) fall through to local handling rather than erroring — clients get a clean "Room not found" on join instead of a connection failure.
When FLY_REGION is set, the top 5 bits of the room code encode the region index from regions.ts, so any peer can route a /<code> or /room/<code> request directly to the home region. Locally, the full 35-bit space is random and region routing is skipped.
When FLY_APP_NAME is set, DASHBOARD_URL defaults to Fly's hosted Grafana for the app.
bun run test:live pulls machine IDs from flyctl automatically (requires fly CLI and auth). Pass LIVE_INSTANCES=id1,id2 to override. Multi-machine tests self-skip on single-machine deployments.