Apr 17, 2025

Digging deep into web sockets

Most communication on the web happens over HTTP. With HTTP, the client and server communicate through a request-response model: the client sends a request to the server, and the server sends back a response. HTTP carries these messages over a TCP connection established between the client and the server.

See the image below.

Notice the "connected to" and port number 80 parts. They indicate that the connection is established over TCP.

HTTP is a stateless protocol. This means that each request is independent of the previous request.

Limitations of HTTP

  1. Every interaction between the client and the server needs a new request and, unless the connection is kept alive and reused, a new TCP connection as well. This is inefficient.
  2. HTTP communication is always initiated by the client. The server cannot send data to the client without the client requesting it.

What are web sockets?

WebSocket is a communication protocol that enables bi-directional communication between the client and the server. Web sockets achieve this by establishing a persistent, long-lived TCP connection.

Once the connection is established, both the client and server can send data to each other without the need for a new request.

How do web sockets work?

Web sockets function through three steps.

  1. WebSocket handshake
  • The client initiates the connection by sending an HTTP GET request with specific headers, including Upgrade: websocket and Connection: Upgrade.
  • The server, if it supports WebSockets, responds with an HTTP 101 Switching Protocols response, also containing the Upgrade and Connection headers, along with a Sec-WebSocket-Accept header for security.
  • This handshake successfully "upgrades" the HTTP connection to a WebSocket connection.
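To make the Sec-WebSocket-Accept step concrete, here is a minimal Node.js sketch of how a server could derive that header from the client's Sec-WebSocket-Key. The helper names computeAcceptKey and buildHandshakeResponse are assumptions made for illustration, not part of any library; the GUID constant and the sample key come from the WebSocket specification (RFC 6455).

```javascript
// Sketch for Node.js: derive Sec-WebSocket-Accept from Sec-WebSocket-Key.
const nodeCrypto = require("node:crypto");

// Fixed GUID defined by RFC 6455; it is the same for every WebSocket server.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: SHA-1 of (key + GUID), base64-encoded.
function computeAcceptKey(secWebSocketKey) {
  return nodeCrypto
    .createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Hypothetical helper: the raw 101 response a server would write back.
function buildHandshakeResponse(secWebSocketKey) {
  return [
    "HTTP/1.1 101 Switching Protocols",
    "Upgrade: websocket",
    "Connection: Upgrade",
    "Sec-WebSocket-Accept: " + computeAcceptKey(secWebSocketKey),
    "",
    "",
  ].join("\r\n");
}

// Sample key from RFC 6455; the accept value is s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
console.log(buildHandshakeResponse("dGhlIHNhbXBsZSBub25jZQ=="));
```

The client repeats the same computation and rejects the connection if the values differ, which confirms that the server actually understood the WebSocket request rather than echoing back a cached HTTP response.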

You can easily see the handshake happening in the Network tab of your browser's developer tools. Check out the image below, which shows the popular chat application WhatsApp establishing a web socket connection.

  2. Data transfer
  • After the handshake, communication happens over the established TCP connection using WebSocket frames.
  • These frames contain a payload (the actual data) and some metadata about the data type (text, binary, etc.).
  • Since it's a persistent connection, both the client and server can send these frames whenever they have data to transmit.

Here's what a web socket frame looks like.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-------+-+-------------+-------------------------------+
|F|R|R|R| opcode|M| Payload len |    Extended payload length    |
|I|S|S|S|  (4)  |A|     (7)     |          (16/64 bits)         |
|N|V|V|V|       |S|             |  (if Payload len is 126/127)  |
| |1|2|3|       |K|             |                               |
+-+-+-+-+-------+-+-------------+-------------------------------+
|     Masking-key (32 bits)     |         Payload data          |
|     (if MASK is set to 1)     |          (variable)           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-------------------------------+

A WebSocket frame carries a header with control info (like data type and length) and an optional mask, followed by the actual data. Think of it as a lightweight envelope for your real-time messages.
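As a rough illustration of that layout, the sketch below decodes a single, complete frame from a byte array. parseFrame is a hypothetical helper written for this article, not a library function, and it assumes the whole frame is already in memory (a real server also has to handle partial reads and fragmented messages). The sample bytes are the masked "Hello" text frame given as an example in RFC 6455.

```javascript
// Sketch: decode one complete WebSocket frame held in a Uint8Array.
// parseFrame is a hypothetical helper, not part of any library.
function parseFrame(bytes) {
  const fin = (bytes[0] & 0x80) !== 0;    // FIN: last fragment of a message
  const opcode = bytes[0] & 0x0f;         // 0x1 text, 0x2 binary, 0x8 close, 0x9 ping, 0xA pong
  const masked = (bytes[1] & 0x80) !== 0; // MASK: set on client-to-server frames
  let length = bytes[1] & 0x7f;           // 7-bit payload length
  let offset = 2;

  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  if (length === 126) {
    length = view.getUint16(offset);            // 16-bit extended length, big-endian
    offset += 2;
  } else if (length === 127) {
    length = Number(view.getBigUint64(offset)); // 64-bit extended length, big-endian
    offset += 8;
  }

  let maskKey = null;
  if (masked) {
    maskKey = bytes.slice(offset, offset + 4);  // 4-byte masking key
    offset += 4;
  }

  // slice() copies, so unmasking does not modify the caller's buffer.
  const payload = bytes.slice(offset, offset + length);
  if (maskKey) {
    for (let i = 0; i < payload.length; i++) payload[i] ^= maskKey[i % 4];
  }
  return { fin, opcode, masked, length, payload };
}

const sampleFrame = parseFrame(
  Uint8Array.from([0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58])
);
console.log(sampleFrame.opcode, new TextDecoder().decode(sampleFrame.payload)); // prints: 1 Hello
```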

  3. Web socket close
  • Either the client or the server can initiate the closing of the WebSocket connection through a closing handshake.
  • This involves sending a special close frame and acknowledging the received close frame from the other party.
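The body of a close frame is small: a 2-byte status code in network byte order, optionally followed by a UTF-8 reason. The sketch below shows one way to encode and decode it. The helper names encodeClosePayload and decodeClosePayload are assumptions for illustration; in the browser you would normally just call ws.close(code, reason) and read event.code and event.reason in onclose.

```javascript
// Sketch: encode and decode the body of a close frame (opcode 0x8).
// Both helper names are hypothetical, not a library API.
function encodeClosePayload(code, reason = "") {
  const reasonBytes = new TextEncoder().encode(reason);
  const payload = new Uint8Array(2 + reasonBytes.length);
  new DataView(payload.buffer).setUint16(0, code); // status code, big-endian
  payload.set(reasonBytes, 2);                     // optional UTF-8 reason
  return payload;
}

function decodeClosePayload(payload) {
  // A close frame may carry no body at all.
  if (payload.length < 2) return { code: null, reason: "" };
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  return {
    code: view.getUint16(0),
    reason: new TextDecoder().decode(payload.subarray(2)),
  };
}

// 1000 is the status code for a normal closure.
const closeBody = encodeClosePayload(1000, "bye");
console.log(Array.from(closeBody), decodeClosePayload(closeBody));
```

Control frames are limited to 125 bytes of payload, so the reason text has to stay within 123 bytes.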

Working with web sockets in JavaScript

Here's a quick example of how to use web sockets in JavaScript.

<!DOCTYPE html>
<html>
<head>
  <title>WebSocket Example</title>
  <script>
    // Append one status line to the output area.
    // textContent is used so that server data is never interpreted as HTML.
    function log(outputDiv, message, color = 'green') {
      const line = document.createElement('span');
      line.style.cssText = 'color:' + color + '; margin-bottom: 10px; display: block;';
      line.textContent = message;
      outputDiv.appendChild(line);
    }

    function connectAndSend() {
      const outputDiv = document.getElementById('output');
      outputDiv.innerHTML = ''; // Clear previous output

      // Use a WebSocket echo server URL. Use wss for a secure connection.
      const ws = new WebSocket('wss://echo.websocket.org');
      log(outputDiv, 'Connecting...');

      ws.onopen = function () {
        log(outputDiv, 'Connected to WebSocket');
        try {
          ws.send('Hello, world!');
          log(outputDiv, 'Sent: Hello, world!');
        } catch (error) {
          log(outputDiv, 'Error sending message: ' + error.message, 'red');
        }
      };

      ws.onmessage = function (event) {
        log(outputDiv, 'Received: ' + event.data);
        ws.close(); // Close the connection after receiving the message
      };

      ws.onclose = function () {
        log(outputDiv, 'Disconnected from WebSocket');
      };

      // The error event is a plain Event without a message property,
      // so show a generic message instead of error.message.
      ws.onerror = function () {
        log(outputDiv, 'WebSocket error (see the browser console for details)', 'red');
      };

      // Optional: display the current state of the connection.
      // readyState is 0 (CONNECTING), 1 (OPEN), 2 (CLOSING) or 3 (CLOSED).
      const stateNames = ['CONNECTING', 'OPEN', 'CLOSING', 'CLOSED'];
      const stateInterval = setInterval(() => {
        log(outputDiv, 'Connection State: ' + (stateNames[ws.readyState] || 'UNKNOWN'));
        if (ws.readyState === WebSocket.CLOSED || ws.readyState === WebSocket.CLOSING) {
          clearInterval(stateInterval);
        }
      }, 500); // Update every half second
    }
  </script>
</head>
<body>
  <button onclick="connectAndSend()">Connect and Send</button>
  <div id="output"></div>
</body>
</html>

The HTML code creates a webpage with a button. When clicked, the JavaScript function connectAndSend():

  • Establishes a secure WebSocket connection using the WebSocket API to an echo server (wss://echo.websocket.org). Learn more about echo server here.
  • Displays connection status updates in green (or red for errors) within a designated HTML element.
  • Sends the message "Hello, world!" to the server.
  • Displays the server's response.
  • Closes the connection after receiving the response.

Limitations of WebSockets

Disadvantage                | Remedy
No auto-reconnect           | Implement manual reconnection with backoff (e.g., setTimeout retries)
No built-in heartbeat       | Send application-level ping/pong messages periodically
Hard to scale               | Use load balancers with sticky sessions + Redis Pub/Sub for broadcasts
Proxy/firewall issues       | Fall back to HTTP long-polling if blocked; always use wss://
No multiplexing             | Implement app-level channel IDs for multiple streams
Stateful connections        | Track clients in a shared DB; handle orphaned connections
Limited browser control     | Use workarounds (e.g., app-level pings, binary encoding)
No binary compression       | Manually compress data (e.g., zlib for Node.js)
Security risks (DoS, CSWSH) | Validate Origin, rate-limit, sanitize messages
Unpredictable performance   | Monitor latency; fall back to HTTP in poor conditions
No caching                  | Cache critical data separately via HTTP/CDN
Poor debugging tools        | Use wscat, DevTools, or specialized WebSocket testers
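As one example of these remedies, the "app-level channel IDs" idea for multiplexing could look roughly like the sketch below. ChannelMux and the { channel, data } envelope are assumptions invented for this illustration, not a standard or a library API. The send function is injected so the class works with any socket object.

```javascript
// Sketch: several logical channels over one socket, using an app-level envelope.
// ChannelMux and the { channel, data } message shape are hypothetical.
class ChannelMux {
  constructor(send) {
    this.send = send;          // function that writes a string to the socket
    this.handlers = new Map(); // channel id -> callback
  }

  subscribe(channel, handler) {
    this.handlers.set(channel, handler);
  }

  publish(channel, data) {
    this.send(JSON.stringify({ channel, data }));
  }

  // Route an incoming message; returns false if nobody listens on its channel.
  // Invalid JSON makes JSON.parse throw, which the caller should handle.
  handleMessage(raw) {
    const { channel, data } = JSON.parse(raw);
    const handler = this.handlers.get(channel);
    if (handler) handler(data);
    return Boolean(handler);
  }
}

// Loopback demo: outgoing messages are collected and fed straight back in.
const outbox = [];
const demoMux = new ChannelMux((text) => outbox.push(text));
demoMux.subscribe("chat", (data) => console.log("chat:", data));
demoMux.publish("chat", { msg: "Hi" });
demoMux.handleMessage(outbox[0]); // prints: chat: { msg: 'Hi' }
```

With a real connection you would wire it up as new ChannelMux((text) => ws.send(text)) and call handleMessage(event.data) from ws.onmessage.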

Scaling web sockets for millions of connections

Here are some of the things big companies do to scale web sockets to millions of users.

  1. Distribute Connections Using Sticky Load Balancing

To avoid overloading a single server, connections must be spread across multiple machines. Sticky sessions (session affinity) ensure that a user’s WebSocket connection consistently routes to the same backend server, preventing reconnection delays. Load balancers like NGINX, HAProxy, or AWS ALB handle this by hashing the client’s IP or session token. Without sticky sessions, reconnections could bounce users between servers, breaking real-time sync.
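The hashing idea can be sketched in a few lines. This is only an illustration of the principle, not how NGINX, HAProxy, or AWS ALB implement it internally; fnv1a and pickBackend are helper names chosen for this example, and the backend names are made up.

```javascript
// Sketch: route a client to a backend by hashing a stable client key.
// FNV-1a (32-bit) is used here simply because it is short and deterministic.
function fnv1a(text) {
  let hash = 0x811c9dc5;                       // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;  // multiply by FNV prime, keep 32 bits unsigned
  }
  return hash;
}

// The same key always maps to the same backend while the list is unchanged.
function pickBackend(clientKey, backends) {
  return backends[fnv1a(clientKey) % backends.length];
}

// Hypothetical backend names and a documentation-range IP address.
const backends = ["ws-1.internal", "ws-2.internal", "ws-3.internal"];
console.log(pickBackend("203.0.113.7", backends));
console.log(pickBackend("203.0.113.7", backends)); // same backend as the line above
```

One caveat of plain modulo hashing is that adding or removing a backend remaps most clients; consistent hashing is the usual way to limit that.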

  2. Optimize Payloads with Binary Protocols (Protobuf/MessagePack)

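JSON is human-readable but verbose. Binary formats like Protocol Buffers (protobuf) or MessagePack produce smaller payloads, saving bandwidth and CPU. Reductions of 40-60% are possible for larger structured messages, while tiny messages like the one below save less. For example:

// Instead of JSON:
ws.send(JSON.stringify({ user: "Alice", msg: "Hi" })); // 27 bytes

// Use binary encoding. MessagePack is assumed to be the global exposed by
// a MessagePack library (for example the browser build of @msgpack/msgpack).
const binaryMsg = MessagePack.encode({ user: "Alice", msg: "Hi" }); // 19 bytes
ws.send(binaryMsg);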

  3. Handle Failures with Heartbeats & Smart Reconnection

Servers crash, networks drop, and mobile users switch between Wi-Fi and cellular. To maintain reliability:

  • Ping/pong frames, sent roughly every 30 seconds, detect dead connections.
  • Clients implement exponential backoff reconnection (retry after 1s, 2s, 4s, etc.).
  • If WebSockets fail, fall back to HTTP long-polling (reportedly used by WhatsApp after ~2 minutes of inactivity).
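The exponential backoff described above can be sketched as follows. backoffDelay and connectWithRetry are hypothetical helpers, and the 1-second base and 30-second cap are assumed values. The socket factory and the timer are passed in as options, mainly so the logic can be exercised without a real network.

```javascript
// Sketch: exponential backoff (1s, 2s, 4s, ...) capped at maxMs.
function backoffDelay(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Reconnect whenever the socket closes, waiting longer after each failure.
// createSocket and schedule default to the browser's WebSocket and setTimeout.
function connectWithRetry(
  url,
  { createSocket = (u) => new WebSocket(u), schedule = setTimeout } = {}
) {
  let attempt = 0;

  function open() {
    const ws = createSocket(url);
    ws.onopen = () => {
      attempt = 0; // a successful connection resets the backoff
    };
    ws.onclose = () => {
      const delay = backoffDelay(attempt);
      attempt += 1;
      schedule(open, delay);
    };
    return ws;
  }

  return open();
}

console.log([0, 1, 2, 3, 10].map((n) => backoffDelay(n))); // [ 1000, 2000, 4000, 8000, 30000 ]
```

In a browser, connectWithRetry('wss://echo.websocket.org') would be enough to start the loop; a production client would also want a way to stop retrying when the user leaves the page.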

  4. Deploy Globally with Edge POPs to Reduce Latency

A user in Mumbai shouldn’t connect to a server in New York. Edge computing (Cloudflare, AWS Global Accelerator) places WebSocket servers in regional data centers:

  • Users connect to the nearest edge location (e.g., ws-mumbai.example.com).
  • Can reduce latency substantially (for example, from ~200 ms to ~20 ms).
  • Enables compliance with data sovereignty laws (GDPR, etc.).

  5. Monitor & Auto-Scale Based on Real-Time Metrics

Traffic spikes (e.g., New Year’s Eve) can overwhelm servers. Solutions:

Auto-scaling (Kubernetes, AWS ECS) spins up servers when CPU or memory usage exceeds a threshold such as 70%.

Prometheus/Grafana tracks:

  • Active connections per server.
  • Ping/pong latency (detect network issues).
  • Message throughput (identify bottlenecks).
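To show how a utilization threshold turns into a scaling decision, here is a small sketch of the proportional rule that the Kubernetes Horizontal Pod Autoscaler documentation describes (desired = ceil(current × currentMetric / targetMetric)). The function name desiredReplicas and the 70% default target are assumptions for this example; real autoscalers add tolerances, cooldowns, and min/max bounds on top.

```javascript
// Sketch: proportional scaling rule based on average CPU utilization.
// Percentages are passed as whole numbers (e.g. 90 for 90%).
function desiredReplicas(currentReplicas, currentCpuPercent, targetCpuPercent = 70) {
  const desired = Math.ceil((currentReplicas * currentCpuPercent) / targetCpuPercent);
  return Math.max(1, desired); // never scale below one server
}

console.log(desiredReplicas(4, 90)); // 6: scale out, servers are above the 70% target
console.log(desiredReplicas(4, 35)); // 2: scale in, servers are mostly idle
```

For WebSocket servers, active connections per server is often a more telling metric than CPU, and the same rule applies if you substitute connection counts for the CPU percentages.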