16/180: What happens when you type google.com in browser

Navneet Ojha
5 min read · Apr 3, 2021


Today is day 16 of my 180-day streak, and I am still going strong with positivity. Today we are going to understand the basics of what happens when we type a URL into the browser: how it fetches the page and displays it to us.

What happens when you type in ‘www.google.com’ in your browser?

The communication between your client and a web server can be divided into the following steps. It starts as soon as you type the first character of the domain: the browser begins searching right away and shows suggestions based on your browsing history, cache, cookies, or search terms that are widely used all over the world. These are a few of the criteria the browser uses to autocomplete the URL or keyword. On every character added to the address bar the browser shows different suggestions, and it often completes the domain name before you have typed the whole URL.

Now the browser has the URL. It first checks whether the domain is in its cache. If it is not, the hosts file is checked (/etc/hosts on Linux and macOS), where hostnames can be mapped to IP addresses by hand. If the hostname is not listed there either, it is resolved through DNS. At home the DNS resolver is usually your local router, which forwards the query to the ISP's caching DNS server. Before the query can be sent, the machine needs a MAC address to send it to: if the DNS server is on the same subnet, the ARP process is carried out for the DNS server itself; if it is on a different subnet, ARP is carried out for the default gateway.
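The lookup order described above (cache, then hosts file, then DNS) can be sketched in a few lines of Python. This is only an illustration: `parse_hosts`, `resolve`, the sample hosts text and the `fake_dns` function are made-up names and data, and a real browser and operating system have more layers of caching than this.

```python
def parse_hosts(text):
    """Parse hosts-file text into a {hostname: ip} mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            table.setdefault(name.lower(), ip)  # first entry wins
    return table


def resolve(hostname, cache, hosts, dns_lookup):
    """Return (ip, source), trying the cache, then hosts entries, then DNS."""
    name = hostname.lower()
    if name in cache:
        return cache[name], "cache"
    if name in hosts:
        return hosts[name], "hosts"
    ip = dns_lookup(name)  # in real use this could be socket.gethostbyname
    cache[name] = ip       # remember the answer for next time
    return ip, "dns"


# Hypothetical hosts-file content and a stand-in for a real DNS query.
HOSTS_TEXT = """
127.0.0.1   localhost
192.0.2.10  intranet.example  # hypothetical entry
"""


def fake_dns(name):
    return "203.0.113.5"  # documentation-range address, not Google's


hosts = parse_hosts(HOSTS_TEXT)
cache = {}

first = resolve("www.google.com", cache, hosts, fake_dns)
second = resolve("www.google.com", cache, hosts, fake_dns)
local = resolve("intranet.example", cache, hosts, fake_dns)
print(first, second, local)
```

The first lookup for www.google.com falls through to the DNS stand-in, the second is answered from the cache, and the intranet name never reaches DNS because the hosts entries already cover it.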

In order to send an ARP (Address Resolution Protocol) broadcast, the network stack library needs the target IP address to look up. It also needs to know the MAC address of the interface it will use to send out the ARP broadcast. The ARP cache is checked first for an entry for our target IP. If it is in the cache, the library function returns the result: Target IP = MAC. If the entry is not in the ARP cache, the route table is looked up to see whether the target IP address is on any of the subnets in the local route table. If it is, the library uses the interface associated with that subnet. If it is not, the library uses the interface that has the subnet of our default gateway. The ARP request is then broadcast, and if the computer is directly connected to the router, the router responds with an ARP reply.

Now that the network library has the MAC address of either our DNS server or the default gateway, it can resume its DNS process. If the local/ISP DNS server does not have the answer, a recursive search is requested, and the query flows up the list of DNS servers until the authoritative server for the domain (the one holding its SOA record) is reached. If the record is found, the answer is returned.
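The choice of which address to ARP for can be sketched with Python's standard `ipaddress` module. The function names, the addresses and the `broadcast` callback are assumptions made for this illustration; a real network stack does this inside the kernel and consults its whole route table rather than a single interface.

```python
import ipaddress


def arp_target(dest_ip, interface_cidr, default_gateway):
    """Return the IP whose MAC we need: the destination or the gateway."""
    iface = ipaddress.ip_interface(interface_cidr)
    dest = ipaddress.ip_address(dest_ip)
    if dest in iface.network:
        return str(dest)        # same subnet: ARP for the host itself
    return default_gateway      # other subnet: ARP for the default gateway


def lookup_mac(ip, arp_cache, broadcast):
    """Check the ARP cache first; fall back to a broadcast on a miss."""
    if ip in arp_cache:
        return arp_cache[ip]
    mac = broadcast(ip)         # stand-in for sending a real ARP request
    arp_cache[ip] = mac
    return mac


# Hypothetical home network: our interface is 192.168.1.23/24.
INTERFACE = "192.168.1.23/24"
GATEWAY = "192.168.1.1"

same_subnet = arp_target("192.168.1.50", INTERFACE, GATEWAY)
other_subnet = arp_target("8.8.8.8", INTERFACE, GATEWAY)

arp_cache = {}
mac = lookup_mac(other_subnet, arp_cache, lambda ip: "aa:bb:cc:dd:ee:ff")
print(same_subnet, other_subnet, mac)
```

A DNS server on the local subnet is resolved directly, while a DNS server such as 8.8.8.8 on another network leads to an ARP lookup for the gateway instead.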

Once the browser receives the IP address of the destination server, it takes that and the port number from the URL (HTTP defaults to port 80 and HTTPS to port 443), makes a call to the system library function named socket, and requests a TCP socket stream. This request is first passed to the Transport Layer, where a TCP segment is built and the destination port is added to the header. The segment is then sent to the Network Layer, which wraps it in an additional IP header. The resulting packet next arrives at the Link Layer, where it is placed in a frame. These send and receive steps repeat several times as the TCP connection is set up.
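As a rough illustration of this step, the sketch below picks the host and port out of a URL and asks the operating system for a TCP stream socket. The `endpoint` helper and the example URLs are assumptions for this sketch; the socket is created but never connected, so nothing is sent over the network.

```python
import socket
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned in the text.
DEFAULT_PORTS = {"http": 80, "https": 443}


def endpoint(url):
    """Return (hostname, port), using the scheme's default port if none is given."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


host, port = endpoint("https://www.google.com/")
explicit = endpoint("http://example.com:8080/path")

# AF_INET + SOCK_STREAM requests an IPv4 TCP stream socket from the OS.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock_type = sock.type
sock.close()  # a real client would call sock.connect((ip, port)) instead

print(host, port, explicit)
```

In real use the resolved IP address and the port would be passed to `connect`, which is the point at which the handshake packets start to flow.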

Network communication starts with a three-way TCP handshake between the client and the server: a SYN, a SYN-ACK, and then an ACK. Once the client has the IP address from the DNS lookup above, it opens a connection to the server. The client chooses an initial sequence number (ISN) and sends a packet to the server with the SYN bit set to indicate that it is setting the ISN. The server receives the SYN and, if it is in an agreeable mood, chooses its own ISN and replies with a packet that has both the SYN and ACK bits set, acknowledging the client's ISN plus one. The client acknowledges the connection by sending an ACK packet back. After that, data is transferred: when one side receives a packet (or a string of packets), it sends an ACK whose acknowledgment number is the next sequence number it expects from the other side.

Because the URL uses HTTPS, a Transport Layer Security (TLS) handshake then happens between client and server. The client sends the TLS versions and ciphers it supports, and the server responds with the TLS version, the selected cipher, the selected compression method and the server's public certificate signed by a CA (Certificate Authority). The certificate contains a public key that the client uses for the rest of the handshake until a symmetric key can be agreed upon.

Sometimes, due to network congestion or flaky hardware connections, packets are dropped before they get to their final destination. The sender then has to decide how to react. The algorithm for this is called TCP congestion control.
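The sequence and acknowledgment numbers exchanged in the three-way handshake can be modelled with a small simulation. This is a toy model, not real TCP: the `three_way_handshake` function and the dictionary layout are made up for this sketch, and no packets are sent.

```python
import random

SEQ_MODULUS = 2 ** 32  # TCP sequence numbers are 32 bits and wrap around


def three_way_handshake(client_isn=None, server_isn=None):
    """Return the three handshake segments as simple dictionaries."""
    if client_isn is None:
        client_isn = random.getrandbits(32)
    if server_isn is None:
        server_isn = random.getrandbits(32)

    # 1. Client -> server: SYN carrying the client's ISN.
    syn = {"flags": {"SYN"}, "seq": client_isn}

    # 2. Server -> client: SYN-ACK with the server's ISN,
    #    acknowledging the client's ISN + 1.
    syn_ack = {
        "flags": {"SYN", "ACK"},
        "seq": server_isn,
        "ack": (syn["seq"] + 1) % SEQ_MODULUS,
    }

    # 3. Client -> server: ACK acknowledging the server's ISN + 1.
    ack = {
        "flags": {"ACK"},
        "seq": syn_ack["ack"],
        "ack": (syn_ack["seq"] + 1) % SEQ_MODULUS,
    }
    return [syn, syn_ack, ack]


segments = three_way_handshake(client_isn=100, server_isn=300)
for segment in segments:
    print(sorted(segment["flags"]), segment["seq"], segment.get("ack"))
```

With the fixed ISNs 100 and 300 used here, the server acknowledges 101 and the client acknowledges 301, which is the "ISN plus one" rule from the text.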

Once the destination receives the packet, let's say https://www.google.com is running behind an HTTP load balancer. The load balancer handles all incoming and outgoing connections between the client and the HTTP server. With DSR (direct server return), the incoming connections may come through the load balancer, but the outgoing responses go directly from the web server to the client.

Suppose the web server is Apache. When Apache receives the incoming request on its listening port (80 for HTTP, 443 for HTTPS), it passes the request to either a forked process or a thread. Apache has two modes of running that matter here, worker and prefork. In prefork mode Apache uses processes that have been forked in advance. In worker mode it uses threads. Threads consume fewer resources but are more complex to work with. Assuming prefork is in use, let's say Apache hands our request to one of its forked processes.

If the HTML that comes back references a resource on a different domain than www.google.com, the web browser goes back to the steps involved in resolving that other domain and follows all the steps up to this point for it. The Host header in that request will be set to the appropriate server name instead of google.com.
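To make the Host header concrete, here is a small sketch that builds the raw text of an HTTP/1.1 GET request for a URL. The `build_request` helper and the second URL are assumptions for this illustration, and real browsers send many more headers than this.

```python
from urllib.parse import urlsplit


def build_request(url):
    """Build a minimal HTTP/1.1 GET request as bytes (nothing is sent)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.netloc}",  # tells the server which site we want
        "Connection: close",
        "",                        # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


main_page = build_request("https://www.google.com/")
# Hypothetical resource on another domain referenced by the page.
other_domain = build_request("https://static.example.com/logo.png?v=2")

print(main_page.decode())
print(other_domain.decode())
```

The two requests differ in their Host header, which is how a server or load balancer that hosts several sites on one IP address knows which one the request is for.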

Once the server supplies the resources (HTML, CSS, JS, images, etc.) to the browser, they go through the following process:

  • Parsing — HTML, CSS, JS
  • Rendering — Construct DOM Tree → Render Tree → Layout of Render Tree → Painting the render tree
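The first part of that pipeline, turning HTML text into a tree, can be sketched with Python's built-in `html.parser` module. This is a much simplified stand-in for a browser's DOM construction: the `Node` and `TreeBuilder` classes, the short list of void tags and the sample HTML are all assumptions for this sketch, and it does nothing about CSS, layout or painting.

```python
from html.parser import HTMLParser

# A few tags that never have children or a closing tag.
VOID_TAGS = {"br", "img", "meta", "link", "input", "hr"}


class Node:
    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []
        self.text = ""


class TreeBuilder(HTMLParser):
    """Build a tree of Node objects from the tags the parser reports."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        if tag not in VOID_TAGS:
            self.current = node  # descend into the new element

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.current.tag == tag:
            self.current = self.current.parent  # climb back up

    def handle_data(self, data):
        self.current.text += data.strip()


def outline(node, depth=0):
    """Return the tree as indented lines, one tag per line."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


builder = TreeBuilder()
builder.feed("<html><body><h1>Hi</h1><p>One<br>Two</p></body></html>")
builder.close()
tree = outline(builder.root)
print("\n".join(tree))
```

A real browser combines a tree like this with the parsed CSS to build the render tree, then computes the layout and paints the result.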

These are the very basics everyone should know, and this question is often asked in interviews, so prepare it well. The resources I studied all of this from are listed below.

References

what-happens-when/README.rst at master · alex/what-happens-when (github.com)

What happens when you type in ‘www.cnn.com’ in your browser? | Syed Ali
