We know that HTTP is the protocol for the world wide web. First proposed in 2008 and standardized as RFC 6455 in 2011, WebSocket is another internet protocol that you can use for creating web applications.
If you are a beginner, terms like WebSocket, sockets, socket programming, etc. can be confusing. So in this post, I attempt to shed some light on these topics.
To better understand sockets and the WebSocket protocol, it is good to have a basic understanding of the different network protocols governing the internet.
The OSI Model & the TCP/IP Model
In case you are new to protocols and layers, just understand that protocols are sets of rules that define how the computers on a network communicate with each other.
If two computers need to communicate with each other, there have to be some common standards in place, right? Otherwise, how can one host understand the data sent by another? That’s what these protocols are for. These standards are formulated by the Internet Engineering Task Force (IETF) in the form of several RFC documents.
Protocols cover more than sending and receiving data. There are also protocols for transmitting data over a physical medium, whether that is satellite, fiber optic, cable, and so on. The Physical Layer (layer 1) is the lowest layer, and it includes specifications for voltage levels, pins, etc. The Application Layer is the highest, and it includes the high-level protocols. Here are the 7 layers in the OSI model from bottom to top:
- Physical Layer
- Data Link Layer
- Network Layer
- Transport Layer – TCP, UDP, etc.
- Session Layer
- Presentation Layer
- Application Layer – HTTP, FTP, SSH, WebSocket, SMTP, etc.
However, the OSI model is more of a theoretical framework. The working model actually used on the internet is the TCP/IP protocol suite, also known as the Internet Protocol Suite, which has four layers:
- Link Layer
- Internet Layer
- Transport Layer
- Application Layer
The important point is, both HTTP and WebSocket belong to the Application Layer, and depend on TCP at the Transport Layer.
What are Sockets
Before moving on to the WebSocket protocol, let’s understand sockets.
Here are the things I’ve learned about sockets:
- A socket is an endpoint for communication between two computers on a network.
- A socket on one computer connects to a socket on another computer on the network to send/receive data.
- Also known as BSD (Berkeley Software Distribution) sockets, a socket is a software structure that defines these endpoints, not a physical device.
- A socket consists of an IP address and a port number – IP addresses are part of the Network Layer while port numbers are part of the Transport Layer.
- So, a BSD socket does not fully fit into a single OSI or TCP/IP suite layer. Don’t mistake it for a protocol. It’s an API.
- This API enables us to use programming languages like C/C++/Python to open ports, listen for connections, send/receive data, etc.
- BSD sockets are supported on POSIX-compliant operating systems such as Linux and macOS. Windows is not POSIX-compliant, but it provides Winsock, an API closely modeled on BSD sockets.
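To make the idea of endpoints concrete, here is a minimal sketch using Node.js's built-in net module, which wraps the operating system's socket API. It is only an illustration: the helper name echoOnce, the loopback address 127.0.0.1, and the choice of port 0 (which asks the OS for any free port) are assumptions made for this demo. The server socket listens on an IP address and port, the client socket connects to that same pair, and the server echoes back whatever it receives.

```javascript
const net = require("net");

// Hypothetical helper: start a one-off echo server, send it one message,
// and resolve with the reply plus the address/port pair that was used.
function echoOnce(message) {
  return new Promise((resolve, reject) => {
    const server = net.createServer((conn) => {
      // Server side of the connection: send every chunk straight back.
      conn.on("data", (chunk) => conn.write(chunk));
    });
    server.on("error", reject);

    // Port 0 lets the operating system choose a free port.
    server.listen(0, "127.0.0.1", () => {
      const { address, port } = server.address();

      // Client side: connect to the same IP address and port.
      const client = net.createConnection({ host: address, port }, () => {
        client.write(message);
      });
      client.on("data", (chunk) => {
        client.end();   // close our end of the connection
        server.close(); // stop listening so the process can exit
        resolve({ reply: chunk.toString(), address, port });
      });
      client.on("error", reject);
    });
  });
}

echoOnce("hello socket").then(({ reply, address, port }) => {
  console.log(`Got "${reply}" back from ${address}:${port}`);
});
```

Note that nothing here is specific to HTTP or WebSocket. Both protocols are simply agreements about what the bytes sent over such a connection mean.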
What is HTTP
HTTP is an application layer protocol that works on top of sockets. Yes, you read that correctly. Not only the WebSocket protocol, but other TCP-based protocols like HTTP also use sockets.
The difference is that HTTP follows a request-response model: the server only sends data in reply to a request. Once the request is handled, the exchange is over, and the connection is either closed or, with HTTP/1.1 keep-alive, left idle until the client sends another request. Either way, the server cannot send anything on its own. So the client needs to send a new request every time it wants new data from the server.
Web browsers support HTTP out of the box, which allows you to access web pages from HTTP servers without even knowing what’s going on under the hood.
On the server side, HTTP is usually implemented using web server software like Apache or Nginx, which enables the server to handle HTTP requests and respond to them.
The URI scheme for HTTP is http:// (non-secure) and https:// (for secure connections over TLS).
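Because HTTP is just text exchanged over a TCP socket, a request can be written by hand. The following is a rough sketch under stated assumptions, not how a browser actually does it: the helper name rawHttpGet, the throwaway local server built with Node's http module, the loopback address, and the response body "hello over TCP" are all made up for the demo. The Connection: close header asks the server to close the connection after responding, so the client can read until the socket ends.

```javascript
const http = require("http");
const net = require("net");

// Hypothetical helper: serve one page locally, then fetch it with a
// plain TCP socket instead of an HTTP client library.
function rawHttpGet() {
  return new Promise((resolve, reject) => {
    const server = http.createServer((req, res) => {
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("hello over TCP");
    });
    server.on("error", reject);

    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address();
      const client = net.createConnection({ host: "127.0.0.1", port }, () => {
        // An HTTP request is a few lines of text ending in a blank line.
        client.write(
          "GET / HTTP/1.1\r\n" +
          `Host: 127.0.0.1:${port}\r\n` +
          "Connection: close\r\n\r\n"
        );
      });

      let raw = "";
      client.setEncoding("utf8");
      client.on("data", (chunk) => { raw += chunk; });
      // The server closes the connection once the response is sent.
      client.on("end", () => {
        server.close();
        resolve(raw);
      });
      client.on("error", reject);
    });
  });
}

rawHttpGet().then((raw) => {
  // Prints the status line, the headers, and the body exactly as sent.
  console.log(raw);
});
```

After this one exchange the client hears nothing more from the server unless it sends another request, which is the limitation WebSocket addresses.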
What is WebSocket
The URI scheme for a WebSocket connection is ws:// (non-secure) or wss:// (secure over TLS).
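As a small illustration of the two schemes, the sketch below takes a WebSocket URL apart with the standard URL class. The helper name describeWebSocketUrl is hypothetical. The default ports (80 for ws:// and 443 for wss://) come from RFC 6455 and apply when the URL does not name a port itself.

```javascript
// Default ports defined by RFC 6455 for the two WebSocket URI schemes.
const DEFAULT_PORTS = { "ws:": 80, "wss:": 443 };

// Hypothetical helper: report where a WebSocket URL points and
// whether the connection would be encrypted.
function describeWebSocketUrl(urlString) {
  const url = new URL(urlString);
  if (!(url.protocol in DEFAULT_PORTS)) {
    throw new Error(`Not a WebSocket URL: ${urlString}`);
  }
  return {
    secure: url.protocol === "wss:",
    host: url.hostname,
    // url.port is an empty string when the scheme's default port is used.
    port: url.port === "" ? DEFAULT_PORTS[url.protocol] : Number(url.port),
    path: url.pathname,
  };
}

console.log(describeWebSocketUrl("ws://codelab.local:8000/phpsock/"));
// { secure: false, host: 'codelab.local', port: 8000, path: '/phpsock/' }
```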
Unlike HTTP, a WebSocket connection is a persistent two-way connection between the client socket and the server socket. The socket server does not drop the connection right away after handling the initial request/message.
More importantly, the server can push data to the client at any time, as long as the connection is open.
This makes WebSocket highly useful for real-time web applications like chats, online games, real-time editing, live feeds, etc. Earlier, clients had to poll the server at intervals to receive new information. But now, the server can send messages as soon as they become available.
How a WebSocket connection works
For a WebSocket connection to work, two things are required:
- A client that supports the WebSocket protocol – most modern browsers do.
- A server with an open BSD socket listening for connections from clients.
Since WebSocket is not meant for fetching web pages, it won’t work if you enter a ws:// URL in the address bar. Instead, the WebSocket JavaScript API is the standard way to open connections to a socket server:
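// Open a connection to the WebSocket server.
const socket = new WebSocket("ws://codelab.local:8000/phpsock/");

// Fired once the handshake completes and the connection is ready.
const socketOpen = (e) => {
    console.log("connected to the socket");
    socket.send("This is me, the browser");
};

// Fired for every message the server sends to the browser.
const socketMessage = (e) => {
    console.log(`Message from socket: ${e.data}`);
};

// Fired when the connection closes; wasClean is false if it dropped unexpectedly.
const socketClose = (e) => {
    if (e.wasClean) {
        console.log("The connection closed cleanly");
    } else {
        console.log(`The connection closed unexpectedly (code ${e.code})`);
    }
};

// Fired when something goes wrong with the connection.
const socketError = (e) => {
    console.log("WebSocket Error");
    console.log(e);
};

socket.addEventListener("open", socketOpen);
socket.addEventListener("message", socketMessage);
socket.addEventListener("close", socketClose);
socket.addEventListener("error", socketError);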
Also, on the server side, you have to implement some mechanism to handle socket connections. You can use a programming language like Python, PHP, or Node.js (JavaScript) to do the socket programming.
A server-side socket can be functional even if it does not adhere to the WebSocket specification. For instance, a simple socket program can respond to a telnet client successfully on the command line.
However, if the socket is to handle WebSocket connections from browsers, it needs to implement the specification in RFC 6455. This includes the initial handshake, calculating the value of the Sec-WebSocket-Accept header, masking and unmasking data frames, etc.
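To give a feel for the unmasking step, here is a rough Node.js sketch that decodes a single frame sent by a browser. The helper name decodeFrame is hypothetical, and the code assumes the buffer holds exactly one complete frame, which a real server cannot take for granted. Per RFC 6455, frames from client to server are masked: a 4-byte key follows the length field, and each payload byte is XORed with one of the key bytes in turn. The sample bytes are the masked "Hello" text frame given as an example in RFC 6455, section 5.7.

```javascript
// Hypothetical helper: decode one complete WebSocket frame held in a Buffer.
function decodeFrame(frame) {
  const fin = (frame[0] & 0x80) !== 0;    // is this the final fragment?
  const opcode = frame[0] & 0x0f;         // 0x1 = text, 0x2 = binary, 0x8 = close
  const masked = (frame[1] & 0x80) !== 0; // always set on client-to-server frames
  let length = frame[1] & 0x7f;
  let offset = 2;

  // Lengths above 125 are stored in the following 2 or 8 bytes.
  if (length === 126) {
    length = frame.readUInt16BE(offset);
    offset += 2;
  } else if (length === 127) {
    length = Number(frame.readBigUInt64BE(offset));
    offset += 8;
  }

  let payload;
  if (masked) {
    const maskKey = frame.subarray(offset, offset + 4);
    offset += 4;
    payload = Buffer.alloc(length);
    // Unmask: XOR each byte with the key byte at position i mod 4.
    for (let i = 0; i < length; i++) {
      payload[i] = frame[offset + i] ^ maskKey[i % 4];
    }
  } else {
    payload = Buffer.from(frame.subarray(offset, offset + length));
  }
  return { fin, opcode, payload };
}

// Masked "Hello" text frame from RFC 6455, section 5.7.
const sample = Buffer.from([
  0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58,
]);
console.log(decodeFrame(sample).payload.toString("utf8")); // Hello
```

Frames going the other way, from server to client, are sent unmasked, so a server only needs the decoding half of this logic for incoming data.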
Here is how you can do the handshake using PHP:
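function wshandshake($requestHeader, $sock, $host_name, $port) {
    // Note: $host_name and $port are not used in this minimal version.
    // Parse the "Name: value" lines of the client's HTTP upgrade request.
    // Header names are case-insensitive, so store them in lowercase.
    $headers = array();
    $lines = preg_split("/\r\n/", $requestHeader);
    foreach ($lines as $line) {
        $line = rtrim($line);
        if (preg_match('/\A(\S+): (.*)\z/', $line, $matches)) {
            $headers[strtolower($matches[1])] = $matches[2];
        }
    }
    // Without a Sec-WebSocket-Key header this is not a WebSocket handshake.
    if (!isset($headers['sec-websocket-key'])) {
        return false;
    }
    // RFC 6455: append the fixed GUID to the key, take the SHA-1 hash,
    // and Base64-encode the raw (binary) digest.
    $secKey = $headers['sec-websocket-key'];
    $secAccept = base64_encode(sha1($secKey . '258EAFA5-E914-47DA-95CA-C5AB0DC85B11', true));
    $responseHeader = "HTTP/1.1 101 Switching Protocols\r\n" .
        "Upgrade: websocket\r\n" .
        "Connection: Upgrade\r\n" .
        "Sec-WebSocket-Accept: $secAccept\r\n\r\n";
    socket_write($sock, $responseHeader, strlen($responseHeader));
    return true;
}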
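If you want to check a handshake implementation like the one above, it helps to compare its output against a known value. The sketch below repeats the same calculation in Node.js. The function name computeAccept is hypothetical, and the sample key and expected result are the example pair given in RFC 6455, section 1.3.

```javascript
const crypto = require("crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: derive the Sec-WebSocket-Accept value
// from the client's Sec-WebSocket-Key header.
function computeAccept(secKey) {
  return crypto
    .createHash("sha1")
    .update(secKey + WS_GUID)
    .digest("base64"); // Base64 of the raw 20-byte SHA-1 digest
}

// Example key from RFC 6455, section 1.3.
console.log(computeAccept("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If a server returns a different value for the same key, the browser rejects the handshake and fires the error and close events on the client side.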
Conclusion
I hope you now have a basic understanding of sockets, the WebSocket protocol, and how it differs from HTTP.
References
Here are a few links I read before writing this blog post. I suggest you go through them as well to improve your understanding.