In this article, I’ll guide you through an overview of what happens when you navigate to a website using your web browser.

Whether you’re new to web development or have some experience, this post will help you gain a better understanding of how the web and its core technologies works.

The internet is a huge network of interconnected computers. The World Wide Web (aka web) is built on top of that technology, as well as other services such as email, chat systems or file sharing, and so on.

Computers connected to the internet are either:

  • Clients, the web user’s devices and the software that those devices use to access the web.

  • Servers, computers that store web pages, sites, or apps and the files they need to be displayed in the user’s web browser or devices.

Finding a resource: URLs

Each resource stored in a server can be located by clients using its valid associated URL. The following is an example of a valid URL:

You may already know what a URL is, but let’s see in detail each one of its parts:

  • Scheme:The first part of an URL indicates the protocol that should be used to retrieve the resource. Websites use the HTTP and the HTTPS protocol, but we’ll see more details about this later. The : after the scheme is what separates it from the next part of the URL.
  • Authority: this part is composed by the domain name and the port number separated by a colon. The port is only mandatory when the standard ports of the HTTP protocol (80 for HTTP and 443 for HTTPS) are not being used by the web server. The // before the domain name indicates the beginning of the authority.
  • Path to resource: this is the abstract or physical path to the resource in the web server.
  • Parameters: a set of key/value pairs that add extra options to apply to the returning the requested resource. They are separated by a & and each web server has its own way to handle parameters. This section starts with ?.
  • Anchor: This section, if present, starts by a # and is handled by the browser to display a specific part of the returned document. For example, it can point to a specific section in a HTML document.

There are a few things that happen when you type a URL into your browser’s address bar that allow you to navigate to a site and interact with its content. Let’s see each one in detail.

Matching IPs and URLs: DNS Resolution

While, as humans, we prefer domain names composed of words, computers communicate with each other using IP addresses. IP addresses are composed by numbers and are harder to remember for our human minds. The Domain Name System (DNS) is what puts together domain names and IP addresses.

When you type a URL, the browser will first look into the local cache to see if the results for the DNS lookup are already stored. Then, it will equally check into the operating system cache.

If there are no results stored, then the browser will call the DNS Resolver to try to find the associated IP address for the domain name.

What is the DNS Resolver?

The resolver is typically defined by your internet provider’s DNS, even though that’s the default most people use, and it is possible to change it to Google’s or Cloudflare’s or anything else you want.

Again, the provider’s DNS might have the results for the domain name stored in its cache, if not, it will ask the Root DNS server.

What is the Root DNS Server?

The Root DNS server is a system that actually drives the entire internet. It is composed of thirteen servers distributed across the planet. It returns the query from the resolver with the relevant Top Level Domain server for the requested domain name.

At this moment, the DNS resolver will cache the IP of that Top Level Domain server so it doesn’t have to ask the Root DNS server again for it.

What is the Top Level Domain Server?

The Top Level Domain (TLD) server stores the IP addresses of the authoritative name servers for the domain that the user is looking for.

In the URL, www.exampleurl.com, the top-level domain is .com. There are different types, such as generic top-level domains like .com or .org, country code top-level domains, usually represented by the two letters ISO country code, and more.

The TLD returns the authoritative name servers for the requested domain. One more time, the DNS resolver will cache the results so it doesn’t have to come back later for them.

Authoritative Nameserver

This server contains the DNS records that maps domain names to IP addresses. There is more than one name server attached to any domain.

There is no caching involved at this point, because the authoritative nameserver is the highest authority and the latest piece in the DNS resolution chain.

If the IP address is available, the authoritative nameserver responds to the DNS resolver query with the domain name’s IP address. If not available it will respond with an error. Then, the DNS resolver stores the IP and sends it back to the client’s browser.

Once the DNS lookup is completed and the browser has the IP address, it can attempt to establish a connection with the server.

Establishing a Connection: TCP/IP Model

The connection between client and server is established using the Transmission Control Protocol (TCP) and the Internet Protocol (IP). These protocols are the main ones behind the World Wide Web and other internet technologies, such as email, and determine how data travels across the network.

The TCP/IP model is a framework used to organize the different protocols involved in the internet and other network communications. The primary responsibility of TCP/IP is to divide the data into packets and send them to their final destination, ensuring the packets can be put back together on the other end of the communication.

This process follows a four-layer model, where data travels in one direction and then in the reverse direction when it reaches the destination:

The transport layer that ensures applications can exchange data by establishing data channels. It is also the layer that establishes the concept of network ports, a system of numbered data channels allocated for the specific communication channels that applications need.

The TCP/IP model’s transport layer includes two protocols that are most commonly used on the internet: the TCP and the User Datagram Protocol (UDP).

TCP includes some capabilities that make it prevalent over most of the internet-based applications such as the web, so let’s focus on it.

How Does TCP Connection Work?

TCP allows data to be transferred reliably and in order to its destination. It is a connection-oriented protocol, that means that the sender and the receiver must agree on connection parameters before actually establishing the connection. This process is known as the handshake procedure.

TCP Three-way Handshake

The handshake is a way for the client and the server to establish a secure connection and ensure that both parties are synchronized and ready to start exchanging messages.

The three steps of the TCP handshake include:

  1. The client informs the server that it wants to establish a connection by sending a SYN (synchronize) packet. This packet specifies a sequence number that subsequent segments will start with.

  2. The server receives the SYN and responds with a SYN-ACK (synchronize-acknowledgment) segment. It includes the server’s sequence number and an acknowledgment of the client’s sequence number, incremented by one.

  3. The client responds with an ACK message, acknowledging the server’s sequence number. At this point, the connection has been established.

Starting the Exchange: Client-Server Communication

Once the TCP connection is established, the client and server can start exchanging messages using the HTTP protocol.

What is the HTTP Protocol?

Hypertext Transfer Protocol (HTTP) is the most widely used application layer protocol in the TCP/IP suite, but it’s considered insecure, leading to a shift towards HTTPS, which uses TLS on top of TCP for data encryption. You see find more details about this later.

The browser will start by sending an HTTP request message to the server, asking for a copy of the site in the form of an HTML file. HTTP protocol can transfer files like HTML, CSS, JS, SVG, and so on.

HTTP Request/Response

There are two types of HTTP messages:

  • Requests, sent by the client to the server to trigger an action.

  • Responses, sent from the server to the client as an answer to the previous request.

Messages are plain text documents, structured in a precise way determined by the communication protocol, in this case, HTTP.

The three parts included in a HTTP request are:

  1. Request line: Includes the request method, which is a verb defining the action to perform. In the case we are covering in this blogpost, the browser will make a GET request to fetch a page from the server. The request line will also include the resource location, in this case an URL, and the protocol version being used.

  2. Request header: A set of key value pairs. Two of them are mandatory. Host indicates the domain name to target, and Connection which is always set to close unless it must be kept open. The request header always ends with a blank line.

  3. Request body: Is an optional field that allows sending data over the server.

The server will reply to the request with an HTTP response. Responses include information about the request status and may include the requested resource or data.

HTTP responses are structured in the following parts:

  1. Status line: Includes the used protocol version, a status code and a status text, with a human readable description of the status code.

  2. Headers: A set of key-value pairs that can either be general headers, applying to the whole message; response headers, giving additional information about the server status; or representation headers, describing the format and encoding for the message data if present.

  3. Body: Contains the requested data or resource. If no data or resource is expected by the client, the response usually won’t include a body.

When the request for a web page is approved by the server, the response will include a 200 OK message. Other existing HTTP response codes are:

  • 404 Not Found

  • 403 Forbidden

  • 301 Moved Permanently

  • 500 Internal Server Error

  • 304 Not Modified

  • 401 Unauthorized

The response will also contain a list of HTTP headers and the response body, including the corresponding HTML code for the requested page.

HTTPS

Hypertext Transfer Protocol Secure (HTTPS) is not a different protocol, but an extension of the HTTP. It is usually referred to as HTTP over Transport Layer Security (TLS). Let’s see what it exactly means.

HTTP is the protocol used for most communications between browsers and servers, but it lacks security. Any data sent over HTTP can potentially be visible to anyone on the network. This is especially risky when sensitive data is involved in the connection, such as login credentials, financial information, health information, and so on.

The main motivation behind HTTPS is to ensure data privacy, integrity, and identification. This means ensuring that data is only accessible to whom it is supposed to be and cannot be intercepted or modified by anyone in between. Also, that both sender and receiver can be identified as who they claim to be by a legitimate authority.

In HTTPS the communications are encrypted using the TLS protocol, which relies on asymmetric public key infrastructure. It combines two keys: one public and one private. The server shares its public key so the client can use it to encrypt messages that can only be decrypted by the server’s private key.

To establish an encrypted communication, the client and the server have to initiate another handshake. During the handshake, they agree on the TLS version to use and on how they will encrypt data and authenticate each other during the connection, a set of rules known as the cipher suite.

This handshake or TLS negotiation starts once a TCP connection has been established, and includes the following steps:

  • Client hello: The browser sends a hello message that includes all supported TLS versions and cipher suites.

  • Server Hello: The server responds with the chosen cipher suite and TLS version, along with its SSL certificate containing the server’s public key.

  • Authentication and Pre-Master Key: The client verifies the server’s SSL certificate with the corresponding trusted authority, then creates a pre-master key using the server’s public key (previously shared in the certificate) and shares this pre-master key with the server.

  • Pre-master key decryption: The pre-mastered key can only be decrypted using the server’s private key. If the server is able to decrypt it, the client and server can then agree on a shared master secret to use for the session.

  • Client ChangeCipherSpec: The client creates a session key using the shared master secret and sends the server all previously exchanged records, this time encrypted with the session key.

  • Server ChangeCipherSpec: If the server generates the correct session key, it will be able to decrypt the message and verify the received record. The server then sends a record to confirm that the client also has the correct keys.

  • Secured connection established: The handshake is complete.

Once the handshake is completed, all the communication between the client and server is protected by symmetric encryption using the session key, and the browser can make the first HTTP GET request for the site.

Time to First Byte

When the first request is made from the client, the first packet that arrives as response marks the Time to First Byte (TTFB), which represents the time elapsed since the request was initiated and when the first chunk of data was received as a response. It will include the time taken for the DNS lookup, the TCP handshake to establish the connection, and the TLS handshake if the request is made over HTTPS.

Rendering the response

Once the browser’s request is approved, the server will send a 200 OK message along with the response headers and the contents requested. As it is a website, content is likely to be an HTML document.

Data travels between the client and server divided into a series of small data chunks, called data packets. This makes it easy to replace corrupted chunks of data if needed and also allows data to travel to and from different locations, enabling multiple users to access data faster and at the same time.

After receiving the response, browser immediate render the whole page based on the response.

So, you fire up your favorite browser and visit a webpage. From the moment you hit the Enter key to when you finally see the page displayed, the following things happen:

Reference