All you need to know about cookies - Part I (How cookies work)
Cookies are not evil. Let’s not be too harsh. You wouldn’t be so wary of cookies if you knew what they were built for.
💡What's a cookie?
🍪Cookies are small pieces of information/data that a website/web-server sends to your browser, so it can be stored in your computer’s hard-disk. The next time you make another request on the same website, your browser sends the cookie along with the request, so the website can identify you.
Stay with me, I’ll break this down and explain.
🎥FourZeroThree - YouTube
A quick shout out! Here's a “video version” of the article. If you are the visual type, I recommend watching the video. I bet you’ll enjoy it :)
💻Browser - Server interaction basics
Let’s buckle down to some “browser-server interaction” basics, so you can get the hang of what cookies are.
HTTP Request & Response
Hypertext Transfer Protocol, abbreviated as “HTTP”, is a set of rules that lets a browser (you) and a server (the websites you interact with) talk to each other. It is this protocol (or set of rules) that handles the transfer of hypermedia (hypertext, sound, videos, graphics) on the web. Hypermedia is, basically, what your web page is made of.
When you type the URL of a website in your browser’s address bar and press enter, you are essentially “requesting” the server of the corresponding website to deliver the website’s home page to your browser. The same applies when you click on links on the website. Let’s say you click the “Login” option in the main menu of the website. You are “requesting” the server to serve you the website’s “Login” page.
This interaction that you have with the server via your browser is called a “Request”. This is what a request would look like.
GET /login HTTP/1.1
Host: www[dot]whateverwebsite[dot]com
User-Agent: Mozilla/5.0 (Windows NT 10.0; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4086.0 Safari/537.36
Accept: text/html,application/xhtml+xml
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Note that the Request asks the server to “GET” the “/login” page of the website (host) www[dot]whateverwebsite[dot]com. This is a GET request.
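If you are curious how a program would read that request, here is a minimal Python sketch that splits the raw text into its parts. The helper name parse_request is made up for illustration, the host is written with real dots because that is how it travels on the wire, and the User-Agent header is left out to keep things short.

```python
# Hypothetical helper for illustration; real servers use a full HTTP parser.
RAW_REQUEST = (
    "GET /login HTTP/1.1\r\n"
    "Host: www.whateverwebsite.com\r\n"
    "Accept: text/html,application/xhtml+xml\r\n"
    "Accept-Language: en-US,en;q=0.5\r\n"
    "Accept-Encoding: gzip, deflate\r\n"
    "Connection: close\r\n"
    "\r\n"
)


def parse_request(raw):
    """Split a raw HTTP/1.1 request into method, path, version, headers, body."""
    # A blank line separates the headers from the (possibly empty) body.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    method, path, version = request_line.split(" ")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return method, path, version, headers, body


method, path, version, headers, body = parse_request(RAW_REQUEST)
print(method, path, "on", headers["host"])
```

The first line carries the method and the page being asked for; everything after it is a list of "Name: value" headers.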
Correspondingly, the act of the server delivering the “Login” page of the website to your browser is called a “Response”.
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 2966
Connection: close
<html>
...HTML garbage...
<script>...some javascript rubbish...</script>
</html>
Note that the server Responds with an “OK”, and delivers the “Login” web-page.
Don’t get intimidated by the data in the Request and Response messages. That’s not your business. For now, understand that HTTP is a Request and Response protocol.
HTTP is “Stateless”
It is this (Stateless) property of the HTTP protocol that makes it difficult for servers to remember you. You see, servers (websites) suffer from something akin to “short term memory loss”. Let me explain.
HTTP and the web were initially created to share documents. That’s all. They were not designed to operate the way they do today. If you wanted to read a document, you made a corresponding request to the website and the server responded with the web page containing the document, which was then rendered in your browser. There was no need for the server to know or “remember” who you are.
If HTTP were still used this way today, with nothing to carry state between requests, web applications would make our internet lives frustrating.
Request: You click “Sign in” on ecommerceweb[dot]com via your browser.
Response: ecommerceweb[dot]com server responds with the “Sign In” page.
Request: You type in your credentials and click “Sign In”.
Response: ecommerceweb[dot]com server responds with the “Home” page.
Request: You click on the “Your Orders” option to check your orders.
Response: ecommerceweb[dot]com server responds with the “Sign In” page.
Request: Okay (the heck??), so you sign in again.
Response: “Your Orders” page is displayed.
Request: Now, you click on an order to check your order details.
Response: ecommerceweb[dot]com server responds with the “Sign In” page.
You see, the server is unable to “remember” you! For every request you make, it would ask for your credentials.
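The frustrating exchange above can be simulated in a few lines of Python. This is only a toy model under stated assumptions: the function stateless_server, the page names and the credentials are all invented for illustration. The point is that the function keeps nothing between calls, so it can only trust what each individual request carries.

```python
# Toy model of a server with no memory; every name here is hypothetical.
VALID_CREDENTIALS = ("you", "secret")
PAGES = {"/home": "Home", "/orders": "Your Orders", "/orders/1": "Order Details"}


def stateless_server(path, credentials=None):
    """Answer one request using only the data inside that request."""
    if credentials == VALID_CREDENTIALS:
        # Credentials came with this request, so the page can be served...
        return PAGES.get(path, "Home")
    # ...but nothing was stored, so any request without them looks anonymous.
    return "Sign In"


print(stateless_server("/home", VALID_CREDENTIALS))    # Home
print(stateless_server("/orders"))                     # Sign In
print(stateless_server("/orders", VALID_CREDENTIALS))  # Your Orders
print(stateless_server("/orders/1"))                   # Sign In
```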
🍪Cookies help Websites (Servers) remember you!
This is why cookies were created in the first place, to help websites remember you and your preferences. In technical terms, cookies help in “state persistence” (remember HTTP is stateless). So how does this work? Back to requests and responses again :)
Let’s imagine John is signing in to ecommerceweb[dot]com for the first time.
Request: John types in his credentials and clicks “Sign In”.
POST /home HTTP/1.1
Host: ecommerceweb[dot]com
User-Agent: Mozilla/5.0 (Windows NT 10.0; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4086.0 Safari/537.36
Accept: text/html,application/xhtml+xml
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
username=John&password=p@$$w0rd
John wants to go to the “/home” page of the website ecommerceweb[dot]com, which is possible only after he “POST”s his credentials (username & password) on the website. This is a POST request.
Response: ecommerceweb[dot]com server responds with the “Home” page.
HTTP/1.1 200 OK
Set-Cookie: sessionid=456485
Content-Type: text/html
Content-Length: 2966
Connection: close
<html>
...HTML garbage...
<script>...some javascript rubbish...</script>
</html>
Note that the server adds a “Set-Cookie” header. The server is assigning John a cookie by the name “sessionid” and value “456485”.
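On the server side, producing that header does not take much. Here is a small sketch using Python's standard http.cookies module; the function build_login_response and the placeholder body are assumptions made for illustration, not how any particular website does it.

```python
from http.cookies import SimpleCookie


def build_login_response(session_id, body="<html>...</html>"):
    """Return the text of a 200 OK response that assigns a session cookie."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    lines = [
        "HTTP/1.1 200 OK",
        cookie.output(),  # renders as "Set-Cookie: sessionid=<value>"
        "Content-Type: text/html",
        f"Content-Length: {len(body.encode('utf-8'))}",
        "Connection: close",
        "",  # blank line between headers and body
        body,
    ]
    return "\r\n".join(lines)


print(build_login_response("456485"))
```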
Why? You’ll see.
Request: John clicks on the “Your Orders” option to check his orders.
GET /orders HTTP/1.1
Host: ecommerceweb[dot]com
User-Agent: Mozilla/5.0 (Windows NT 10.0; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4086.0 Safari/537.36
Accept: text/html,application/xhtml+xml
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: close
Cookie: sessionid=456485
Note the “Cookie” header in the request. Once a cookie is set by the server in the first response, the browser will add the “Cookie” header with the name (sessionid) and value (456485) in every subsequent request John makes.
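What the browser does here can be sketched as a tiny "cookie jar". This is a simplified illustration: the class TinyCookieJar is hypothetical, it ignores attributes such as Domain and Path, and real browsers do far more bookkeeping.

```python
from http.cookies import SimpleCookie


class TinyCookieJar:
    """Remembers name/value pairs from Set-Cookie and replays them as Cookie."""

    def __init__(self):
        self._cookies = {}

    def store(self, set_cookie_value):
        """Record the cookie found in the value of a Set-Cookie header."""
        parsed = SimpleCookie()
        parsed.load(set_cookie_value)
        for name, morsel in parsed.items():
            self._cookies[name] = morsel.value

    def cookie_header(self):
        """Build the value of the Cookie header for the next request."""
        return "; ".join(f"{name}={value}" for name, value in self._cookies.items())


jar = TinyCookieJar()
jar.store("sessionid=456485")             # from the first response
print("Cookie:", jar.cookie_header())     # attached to every later request
```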
The cookie “sessionid=456485” assigned to John is an identification. The server will now remember John as “sessionid=456485” on every subsequent request. This is because the server keeps a session record on its side (for example, a file on its disk) corresponding to the cookie set for John.
So, when John clicks the “Your Orders” link after signing in (the cookie is sent in the request), the server checks for the file corresponding to the cookie it received from John’s browser, recognizes it is John and responds with the “Your Orders” page (happy ending).
This is how a state is persisted. This is how ecommerceweb[dot]com remembers John.
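Here is a rough Python sketch of the server's half of this story. A few assumptions to flag: the functions sign_in and who_is_asking are invented for illustration, the session record lives in an in-memory dictionary rather than a file on disk, and the session id is a random token instead of a short number like 456485 (short numbers would be easy to guess).

```python
import secrets
from http.cookies import SimpleCookie

SESSIONS = {}  # session id -> username; stands in for the server's stored record


def sign_in(username, password):
    """Check credentials and return the value for a Set-Cookie header, or None."""
    if (username, password) != ("John", "p@$$w0rd"):
        return None
    session_id = secrets.token_hex(16)  # hard-to-guess identifier
    SESSIONS[session_id] = username
    return f"sessionid={session_id}"


def who_is_asking(cookie_header):
    """Look up the user behind the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get("sessionid")
    return SESSIONS.get(morsel.value) if morsel else None


set_cookie_value = sign_in("John", "p@$$w0rd")
print(who_is_asking(set_cookie_value))        # the browser echoes the same pair back
print(who_is_asking("sessionid=not-a-real-id"))
```

The cookie itself carries no personal data in this sketch; it is only a key that lets the server find what it stored about John.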
The definition of a cookie in the introduction of the article would make more sense now.
💡First Request-Response:
Cookies are small pieces of information/data (sessionid=456485) that a website/web-server sends to your browser, (via the Set-Cookie header) so it can be stored in your computer’s hard-disk.
💡Subsequent Request-Response:
The next time you make a request on the same website, your browser sends the cookie (sessionid=456485 via the Cookie header) along with the request, so the website can identify you.
🍪Cookies are specific to a domain
Cookies are specific to a domain. For example, Facebook cannot read cookies set by Amazon.
Strictly speaking, when a cookie is set by the server for the first time, the cookie would also have other “attributes” apart from a name and a value.
Set-Cookie: sessionid=456485; Path=/; Secure; Domain=ecommerceweb[dot]com
Name = sessionid
Value = 456485
Domain attribute - Specifies that the cookie sessionid=456485 is specific to ecommerceweb[dot]com (and its subdomains).
Attributes are properties of cookies that make them abide by certain rules. In the example above, the Domain attribute makes the cookie specific to ecommerceweb[dot]com.
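To see the attributes and the domain rule in action, here is a short Python sketch. The helper domain_matches is a simplified, hypothetical take on the domain-matching idea from RFC 6265 (it ignores IP addresses and other edge cases), and the host names are written with real dots.

```python
from http.cookies import SimpleCookie


def domain_matches(request_host, cookie_domain):
    """True if a cookie scoped to cookie_domain may be sent to request_host."""
    request_host = request_host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    # Exact match, or request_host is a subdomain of cookie_domain.
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


cookie = SimpleCookie()
cookie.load("sessionid=456485; Path=/; Secure; Domain=ecommerceweb.com")
morsel = cookie["sessionid"]

print(morsel.key, morsel.value)           # name and value
print(morsel["domain"], morsel["path"])   # attributes
print(domain_matches("ecommerceweb.com", morsel["domain"]))
print(domain_matches("ads.com", morsel["domain"]))
```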
I do not want to get technical, explaining what other attributes do. I’ll stick to what is enough to explain the concept of how cookies work (for now).
🍪Types of Cookies
Broadly, cookies can be of two types:
First-party cookies (stuff we discussed so far)
Third-party cookies
First-Party Cookies
“sessionid=456485” is set by ecommerceweb[dot]com. This is a cookie set by the same website (ecommerceweb[dot]com) John is currently/directly browsing. This is called a “First-Party cookie”.
Third-Party Cookies
What if ecommerceweb[dot]com had ads on their web-page? These ads, though seen on ecommerceweb[dot]com, are being served through another server, let’s say ads[dot]com.
If there is a banner ad on ecommerceweb[dot]com, to John, the ad “appears” to reside on ecommerceweb[dot]com. In reality, it is a snippet of HTML from another server (ads[dot]com) being embedded on ecommerceweb[dot]com. It is much like an embedded YouTube video on a website: the video appears on the page, but in reality it lives on YouTube’s servers.
Now, when John visits ecommerceweb[dot]com for the first time, the ads[dot]com server serving the banner ad could set a cookie for John. This is a cookie set by another website (the ad server) that John is not directly browsing (he is on ecommerceweb[dot]com). The cookie set by the ad server is a “Third-Party cookie”.
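The first-party versus third-party distinction boils down to comparing two host names: the one in John's address bar and the one that set the cookie. Below is a deliberately simplified Python sketch; classify_cookie is a made-up name, and real browsers rely on the Public Suffix List to decide what counts as the "same site", which this sketch does not attempt.

```python
def classify_cookie(page_host, cookie_host):
    """Label a cookie by comparing the site being browsed with the host that set it."""
    page_host, cookie_host = page_host.lower(), cookie_host.lower()
    same_site = (
        page_host == cookie_host
        or cookie_host.endswith("." + page_host)   # e.g. img.ecommerceweb.com
        or page_host.endswith("." + cookie_host)   # e.g. www.ecommerceweb.com
    )
    return "first-party" if same_site else "third-party"


# John is browsing ecommerceweb.com, which embeds a banner from ads.com.
print(classify_cookie("ecommerceweb.com", "ecommerceweb.com"))
print(classify_cookie("ecommerceweb.com", "ads.com"))
```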
💡Types of First-party Cookies
First-party cookies, the topic of discussion in this article, are of two types:
Session cookies
Persistent cookies
Session cookies
Session cookies are those that get deleted as soon as you close your browser. Amazon lets you shop and add items to your shopping cart without having to sign in. This is because, even without signing in, the Amazon server sets a “session cookie” with which you can shop as a “guest” (an unauthenticated user, since you haven’t signed in). With this cookie, the Amazon server can remember the items in your shopping cart. Once you close the browser, all is lost. You may have to shop again. But once you sign in, the session cookie is replaced by a persistent cookie.
Persistent cookies
Persistent cookies are, well, PERSISTENT! Ever wondered how you are forever logged in to Facebook, Twitter, Instagram, or any similar website? This is because the cookies set by these websites after you log in are persistent. They do not expire when you close the browser. These cookies go away only when they reach their expiry date, you delete them from your computer, or you log out of the website.
Persistent cookies have this property because of the “Expires” attribute (a “Max-Age” attribute has the same effect). Any cookie set by the server without an “Expires” or “Max-Age” attribute is a session cookie.
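That rule is simple enough to express as code. Here is a minimal Python sketch, assuming the value of a Set-Cookie header as input; the function name cookie_kind is invented for illustration and host names are written with real dots.

```python
from http.cookies import SimpleCookie


def cookie_kind(set_cookie_value):
    """Return 'persistent' if the cookie declares a lifetime, else 'session'."""
    cookie = SimpleCookie()
    cookie.load(set_cookie_value)
    morsel = next(iter(cookie.values()))  # the first (only) cookie in the header
    # Attributes that were not sent are left as empty strings by http.cookies.
    has_lifetime = bool(morsel["expires"]) or bool(morsel["max-age"])
    return "persistent" if has_lifetime else "session"


print(cookie_kind("sessionid=456485; Path=/; Secure; Domain=ecommerceweb.com"))
print(cookie_kind("sessionid=456485; Path=/; Expires=Wed, 15 Jun 2022 10:18:14 GMT"))
```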
Let’s take the example of John (explained above), who is logged in to ecommerceweb[dot]com.
Set-Cookie: sessionid=456485; Path=/; Secure; Domain=ecommerceweb[dot]com; Expires=Wed, 15 Jun 2022 10:18:14 GMT
Expires attribute - Specifies that the cookie sessionid=456485 should expire on the 15th of June 2022 at 10:18:14 GMT. So John, who is logged in to ecommerceweb[dot]com with the cookie sessionid=456485, would stay logged in until one of the following happens:
he logs out of the application.
he deletes all cookies belonging to ecommerceweb[dot]com.
his cookie expires on the 15th of June 2022.
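The three conditions in that list can be folded into one small check. This Python sketch is only an illustration: still_logged_in and its parameters are hypothetical names, and the date string is the Expires value from the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

EXPIRES = "Wed, 15 Jun 2022 10:18:14 GMT"


def still_logged_in(expires_value, now, logged_out=False, cookies_deleted=False):
    """John stays logged in until he logs out, deletes the cookie, or it expires."""
    expired = now >= parsedate_to_datetime(expires_value)
    return not (logged_out or cookies_deleted or expired)


day_before = datetime(2022, 6, 14, 12, 0, 0, tzinfo=timezone.utc)
day_after = datetime(2022, 6, 16, 12, 0, 0, tzinfo=timezone.utc)

print(still_logged_in(EXPIRES, day_before))                   # True
print(still_logged_in(EXPIRES, day_before, logged_out=True))  # False
print(still_logged_in(EXPIRES, day_after))                    # False
```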
Closing Notes
You may have come across or read about how several companies keep track of your browsing habits via cookies. While there are many different ways companies track you online, third-party cookies are an important tool for doing so. Well, this article is a primer. You must understand how cookies work before you can dive into the intricacies of how you are tracked online with third-party cookies (this will be the discussion in Part II of this series).
💡Remember
First-party cookies are needed for web or mobile applications to function well. User tracking for targeted ad serving purposes is achieved with third-party cookies.
This is a huge topic to discuss and I have not covered everything related to cookies. But if this article piqued your interest and you want to get into the technical nitty-gritty details, RFC 6265 - HTTP State Management Mechanism is a great read. Using HTTP cookies in MDN web docs is also well worth reading!