Understanding HTTP

HTTP is an application-layer protocol designed to transfer information between clients and servers. The protocol operates on a simple exchange: a client sends a request describing an action on a resource, and the server returns a response with the outcome. The resource is identified by a Uniform Resource Locator (URL) and the action is specified by an HTTP method.

  • HTTP follows a client-server model, i.e. a client sends a request to a server, and the server sends back a response.
  • HTTP is stateless by design. Each request is self-contained and the server does not retain any memory of previous requests by the same client. This allows HTTP to scale horizontally, as any server behind a load balancer can handle any request without needing shared state.

HTTP Message

HTTP messages are the mechanism used to exchange data between a server and a client. There are two types of messages:

  • Request: sent by the client to trigger an action on the server.
  • Response: the answer sent by the server in reply to a request.

HTTP Message Format

An HTTP message has the following format:

  • A start line that describes the HTTP version along with the request method or the outcome of the request
  • An optional set of HTTP headers containing metadata that describe the message
  • An empty line indicating the metadata of message is complete
  • An optional body containing data associated with the message.

The start line and headers of an HTTP message are collectively known as the head of the message, and the part that follows, which contains its content, is known as the body.
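
The head/body split described above can be sketched in a few lines of Python. This is a minimal illustration, not a full parser: the function name split_message is hypothetical, the sample bytes are an illustrative HTTP/1.1 request, and repeated headers or folded lines are not handled.

```python
def split_message(raw: bytes):
    """Split a raw HTTP/1.x message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Illustrative request (assumed sample data).
raw = (
    b"POST /users HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 49\r\n"
    b"\r\n"
    b"name=FirstName+LastName&email=bsmth%40example.com"
)
start_line, headers, body = split_message(raw)
print(start_line)
print(headers["content-length"], len(body))
```

Lower-casing the header names is one simple way to get case-insensitive lookups; a production parser would also need to handle repeated header names.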

HTTP request

An HTTP request looks like this:

POST /users HTTP/1.1
Host: example.com
Content-Type: application/x-www-form-urlencoded
Content-Length: 49

name=FirstName+LastName&email=bsmth%40example.com

Request Line

The start line in HTTP/1.x requests is called the request line and is made up of three parts:

<method> <request-target> <protocol>
  • Method: One of the defined HTTP methods, stating what the request should do.
  • Request target: Usually an absolute or relative URL that identifies the resource.
  • Protocol: Declares the HTTP version used, which defines the structure of the remaining message and indicates the version expected for the response.
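
As a rough sketch, the three parts of a request line can be separated by splitting on single spaces. The names RequestLine and parse_request_line are hypothetical, and the validation is deliberately minimal.

```python
from typing import NamedTuple


class RequestLine(NamedTuple):
    method: str
    target: str
    protocol: str


def parse_request_line(line: str) -> RequestLine:
    """Parse '<method> <request-target> <protocol>' into its three parts."""
    parts = line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {line!r}")
    method, target, protocol = parts
    if not protocol.startswith("HTTP/"):
        raise ValueError(f"unexpected protocol: {protocol!r}")
    return RequestLine(method, target, protocol)


print(parse_request_line("POST /users HTTP/1.1"))
```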

Request Headers

Headers are metadata sent with a request after the start line and before the body. In HTTP/1.x, each header is a case-insensitive name followed by a colon (:) and a value. The Host header is the only required header in HTTP/1.1.

Request Body

The request body is the part of a request that carries information to the server. Typically only PATCH, POST and PUT requests have a body. The body can be in various formats, such as plain text or JSON, depending on what the server expects.
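
To illustrate how a form body and its Content-Length relate, the following sketch builds a URL-encoded body with Python's standard library. The field values are assumed sample data.

```python
from urllib.parse import urlencode

# Sample form fields (assumed for illustration).
fields = {"name": "FirstName LastName", "email": "bsmth@example.com"}

# urlencode turns spaces into "+" and percent-encodes reserved characters.
body = urlencode(fields).encode("ascii")
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    # Content-Length is the size of the body in bytes, not characters.
    "Content-Length": str(len(body)),
}
print(body.decode("ascii"))
print(headers["Content-Length"])
```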

HTTP Response

Responses are the HTTP messages a server sends back in reply to a request. The response lets the client know what the outcome of the request was.

HTTP/1.1 201 Created
Content-Type: application/json
Location: http://example.com/users/123

{
  "message": "New user created",
  "user": {
    "id": 123,
    "firstName": "Example",
    "lastName": "Person",
    "email": "bsmth@example.com"
  }
}

Status Line

In a response, the start line is called the status line and has three parts:

<protocol> <status-code> <reason-phrase>
  • Protocol: The HTTP version of the message
  • Status Code: A three-digit number that indicates the outcome of the request, such as 200 for success.
  • Reason Phrase: Optional text after the status code. It is a brief, purely informational description of the status that helps a human understand the outcome of the request.
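
A status line can be taken apart the same way. The sketch below is a minimal example with a hypothetical function name; when the server omits the optional reason phrase, it falls back to the standard phrase from Python's http.HTTPStatus.

```python
from http import HTTPStatus


def parse_status_line(line: str):
    """Parse '<protocol> <status-code> <reason-phrase>'; the phrase may be absent."""
    protocol, _, rest = line.partition(" ")
    code_text, _, reason = rest.partition(" ")
    code = int(code_text)
    if not reason:
        # Fall back to the standard phrase when the server omitted it.
        try:
            reason = HTTPStatus(code).phrase
        except ValueError:
            reason = ""
    return protocol, code, reason


print(parse_status_line("HTTP/1.1 201 Created"))
```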

Response Headers

Response headers are the metadata sent with a response. In HTTP/1.x, each header is a case-insensitive name followed by a colon (:) and a value whose format depends on which header is used. There are two types of headers:

  • Response Headers: They give additional context about the message or add extra logic to how the client should make subsequent requests.
  • Representation Headers: If the message has a body, they describe the form of the message data and any encoding applied, such as text/html or application/json. This allows a recipient to reconstruct the resource as it was before it was transmitted over the network.

Response Body

A response body is included in most responses. It tells the recipient about the result of the action the request performed on the server; for example, the response to a GET request carries the requested data in its body.

HTTP Methods

The method in the request line tells the server which action to perform on the resource. HTTP defines the following methods:

| Method | Purpose | Request has payload body | Response has payload body | Safe | Idempotent | Cacheable |
|---|---|---|---|---|---|---|
| GET | Retrieve a resource | Optional | Yes | Yes | Yes | Yes |
| HEAD | Retrieve headers only (no body) | Optional | No | Yes | Yes | Yes |
| POST | Submit data for processing such as adding entry in database | Yes | Yes | No | No | Yes* |
| PUT | Replace a resource entirely | Yes | Yes | No | Yes | No |
| DELETE | Remove a resource | Optional | Yes | No | Yes | No |
| CONNECT | Establish a TCP tunnel through an HTTP proxy | Optional | Yes | No | No | No |
| OPTIONS | Ask server what methods it supports | Optional | Yes | Yes | Yes | No |
| TRACE | Make server echo back the sent request for detecting if data was modified while being sent | No | Yes | Yes | Yes | No |
| PATCH | Partially update a resource | Yes | Yes | No | No | No |

NOTE: POST is technically cacheable per the HTTP spec, but almost never cached in practice. RFC 9110 says a POST response can be cached if the server explicitly includes the right headers, specifically Cache-Control or Expires, along with a Content-Location header that matches the request URI.
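
One practical use of the Safe and Idempotent columns is deciding whether a client may repeat a request after a network failure. The sketch below encodes those two columns as data; the names METHODS and can_retry_automatically are hypothetical.

```python
# method: (safe, idempotent), copied from the table above
METHODS = {
    "GET": (True, True),
    "HEAD": (True, True),
    "POST": (False, False),
    "PUT": (False, True),
    "DELETE": (False, True),
    "CONNECT": (False, False),
    "OPTIONS": (True, True),
    "TRACE": (True, True),
    "PATCH": (False, False),
}


def can_retry_automatically(method: str) -> bool:
    """Idempotent requests can be repeated without changing the outcome."""
    _safe, idempotent = METHODS[method.upper()]
    return idempotent


print(can_retry_automatically("PUT"))   # True
print(can_retry_automatically("POST"))  # False
```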

Status Code

The status code in the response indicates the outcome of the request. Status codes are grouped into five classes by their first digit:

| Class | Range | Meaning |
|---|---|---|
| 1xx | 100–199 | Informational, request received, processing continues |
| 2xx | 200–299 | Success, request accepted and processed |
| 3xx | 300–399 | Redirection, further action needed |
| 4xx | 400–499 | Client error, malformed or unauthorized request |
| 5xx | 500–599 | Server error, valid request, server failed |
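
Because the class is determined by the first digit, it can be derived with integer division. A minimal sketch, with hypothetical names:

```python
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client error",
    5: "Server error",
}


def status_class(code: int) -> str:
    """Return the class name for a status code based on its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"status code out of range: {code}")
    return STATUS_CLASSES[code // 100]


print(status_class(201))  # Success
print(status_class(418))  # Client error
```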

Note: This note is for 418 I'm a teapot enjoyers. Click for #cat

HTTP Versions

HTTP has gone through several iterations over the years:

HTTP/0.9 (1991)

The original, extremely primitive version. It had only one method, GET, and no headers, status codes or metadata of any kind. You requested a path, and the server sent back raw HTML and closed the connection.

# Request
GET /page.html

# Response
<html>hello world</html>

HTTP/1.0 (1996)

HTTP/1.0 was the first version of HTTP documented in an RFC (RFC 1945). It made the following additions:

  • Headers: Added headers for both requests and responses
  • Status Codes: Added status codes to responses
  • Methods: Added more methods to the spec, bringing the total to three (GET, POST and HEAD)
  • Content Type: Added support for content types other than HTML
  • HTTP versioning: Added the version to the request line

HTTP/1.1 (1997)

It added the following features:

  • Persistent Connections (Keep-Alive): TCP connection stays open after a request, so multiple requests can reuse it.
  • Pipelining: The client can send multiple requests without waiting for responses, but responses must come back in order. This caused a serious problem called Head-of-Line (HOL) blocking: one slow response delays every response queued behind it.
  • Chunked Transfer Encoding: Server can stream a response in chunks without knowing the total size upfront.
  • Host Header (mandatory): Since multiple websites can share one IP address (virtual hosting), the Host header tells the server which site the client wants.
  • New Methods: PUT, DELETE, OPTIONS, TRACE and CONNECT were added.
  • Caching improvements: Cache-Control, ETag and If-None-Match were added, complementing the If-Modified-Since header already present in HTTP/1.0.
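
The chunked transfer encoding mentioned in the list can be sketched as follows: each chunk is prefixed with its size in hexadecimal, and a zero-length chunk ends the body. The function name encode_chunked is hypothetical, and trailers and chunk extensions are left out.

```python
def encode_chunked(chunks):
    """Yield the wire bytes for an iterable of body chunks."""
    for chunk in chunks:
        if not chunk:
            continue  # an empty chunk would be read as the terminator
        # Each chunk is prefixed with its size in hexadecimal.
        yield f"{len(chunk):x}\r\n".encode("ascii") + chunk + b"\r\n"
    # A zero-length chunk marks the end of the body.
    yield b"0\r\n\r\n"


wire = b"".join(encode_chunked([b"Hello, ", b"world!"]))
print(wire)
```

Since the sender only needs the size of the current chunk, it can start streaming before the total length is known.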

HTTP/2 (2015)

HTTP/2 is a complete rewrite of how HTTP is transmitted while keeping the same semantics (methods, headers and status codes are all unchanged). It is based on Google’s experimental SPDY protocol and added the following features:

  • Binary Framing: HTTP/1.x is plain text. HTTP/2 is binary. All communication is split into small frames and tagged with a type. Faster to parse, more compact, less error-prone.
[Text HTTP/1.1]          [Binary HTTP/2]
GET /index HTTP/1.1  →   0x00 0x00 0x12 0x01 ...
Host: example.com
  • Multiplexing: Multiple requests and responses travel over a single TCP connection simultaneously, each in its own numbered stream, which solves HOL blocking at the HTTP level. A lost TCP packet can still stall every stream on the connection.
  • Header Compression (HPACK): HTTP/1.1 headers are repetitive plain text sent on every request. HPACK compresses them and maintains a shared table of previously seen headers on both ends, so repeated headers (like User-Agent, Host, Cookie) are sent as tiny references instead of full strings.
  • Server Push: Server can proactively send resources the client hasn’t asked for yet. For example, when a browser requests index.html, the server can push style.css and app.js immediately, anticipating the browser will need them.
  • Stream Prioritization: Clients can assign priority weights to streams so critical resources (HTML, CSS) load before less important ones (analytics scripts).
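
To make binary framing more concrete, here is a sketch of the fixed 9-byte header that precedes every HTTP/2 frame payload: a 24-bit length, an 8-bit type, 8 bits of flags, then one reserved bit and a 31-bit stream identifier. The function names are hypothetical; the constants are the HEADERS frame type (0x1) and its END_HEADERS flag (0x4).

```python
def pack_frame_header(length: int, frame_type: int, flags: int, stream_id: int) -> bytes:
    """Build the 9-byte header that precedes an HTTP/2 frame payload."""
    if not 0 <= length < 1 << 24:
        raise ValueError("payload length must fit in 24 bits")
    return (
        length.to_bytes(3, "big")                      # 24-bit payload length
        + bytes([frame_type, flags])                   # 8-bit type, 8-bit flags
        + (stream_id & 0x7FFFFFFF).to_bytes(4, "big")  # reserved bit + 31-bit stream id
    )


def unpack_frame_header(header: bytes):
    """Inverse of pack_frame_header."""
    length = int.from_bytes(header[0:3], "big")
    frame_type, flags = header[3], header[4]
    stream_id = int.from_bytes(header[5:9], "big") & 0x7FFFFFFF
    return length, frame_type, flags, stream_id


HEADERS, END_HEADERS = 0x1, 0x4
header = pack_frame_header(18, HEADERS, END_HEADERS, stream_id=1)
print(header.hex(" "))
```

The stream identifier in each header is what lets frames from different requests interleave on one connection.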

HTTP/3 (2022)

HTTP/3 replaces TCP entirely with a new transport protocol called QUIC, which runs over UDP.

  • QUIC (Quick UDP Internet Connections): Originally developed by Google, QUIC bakes in everything TCP provided (reliability, ordering, congestion control) but does it at the application layer on top of UDP.
  • 0-RTT and 1-RTT Handshakes: TCP + TLS requires 2–3 round trips before data can flow. QUIC combines the transport and TLS handshake into one, needing just 1 round trip (1-RTT). For repeat connections to a known server, QUIC can send data in the very first packet i.e. 0-RTT.
  • Connection Migration: TCP connections are tied to a 4-tuple (source IP, source port, dest IP, dest port). If you switch from Wi-Fi to mobile data, your IP changes and the connection breaks. QUIC uses a Connection ID instead, so connections survive network changes seamlessly.
  • TLS 1.3 built in: QUIC mandates TLS 1.3 encryption.

HTTP Caching

Caching is the practice of storing a copy of a response so future requests can be served from that copy instead of hitting the server again. It reduces latency, saves bandwidth and reduces server load. There are two types of caches:

  • Private Cache: It is a cache tied to a specific client, typically a browser cache. It is suitable for personalized content.
  • Shared Cache: It is located between the client and server and stores responses that can be shared among users. Shared caches are further sub-classified into proxy caches and managed caches:
    • Proxy Cache: A cache operated by someone other than the website owner, such as an ISP, a corporation or a network admin. Since the website owner does not control it, it can cause issues such as serving stale data or caching responses that should not be cached. Proxy caches have become largely ineffective because HTTPS encrypts the traffic they would otherwise store.
    • Managed Cache: Managed caches are explicitly deployed by service developers to offload the origin server and to deliver content efficiently. Examples include reverse proxies, CDNs, and service workers in combination with the Cache API.

Heuristic Caching

HTTP is designed to cache as much as possible, so even if no Cache-Control is given, responses will get stored and reused if certain conditions are met. This is called heuristic caching. For example, take the following response. This response was last updated 1 year ago.

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1024
Date: Tue, 22 Feb 2022 22:22:22 GMT
Last-Modified: Mon, 22 Feb 2021 22:22:22 GMT

<!doctype html>

A response that has not changed in a year has a low probability of changing in the near future. The client therefore stores this response despite the lack of max-age and reuses it for a while. How long is up to the implementation, but the specification suggests about 10% of the time since the last modification (here 0.1 year, or roughly 36 days). Heuristic caching is a workaround that predates widespread Cache-Control support; today, essentially all responses should explicitly specify a Cache-Control header.
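
The 10% rule can be worked through with the two dates from the example response. This is a sketch of the suggested heuristic only; the function name is hypothetical, and real caches are free to choose a different fraction or cap the result.

```python
from email.utils import parsedate_to_datetime


def heuristic_lifetime(date_header: str, last_modified_header: str):
    """Return 10% of the interval between Last-Modified and Date."""
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    return (date - last_modified) / 10


lifetime = heuristic_lifetime(
    "Tue, 22 Feb 2022 22:22:22 GMT",
    "Mon, 22 Feb 2021 22:22:22 GMT",
)
print(lifetime)  # 36 days, 12:00:00
```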

Fresh and stale based on age

Stored HTTP responses have two states: fresh and stale. The fresh state usually indicates that the response is still valid and can be reused, while the stale state means that the cached response has already expired. The criterion for determining when a response is fresh and when it is stale is age. In HTTP, age is the time elapsed since the response was generated.

HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 1024
Date: Tue, 22 Feb 2022 22:22:22 GMT
Cache-Control: max-age=604800

<!doctype html>

The cache that stored the example response calculates the time elapsed since the response was generated and uses the result as the response’s age. For the example response, the meaning of max-age is the following:

  • If the age of the response is less than one week, the response is fresh.
  • If the age of the response is one week or more, the response is stale.
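
The freshness check above amounts to comparing the response's age with max-age. A minimal sketch follows, assuming the age is measured from the Date header alone (real caches also take the Age header and transit delays into account); the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(date_header: str, max_age: int, now: datetime) -> bool:
    """A stored response is fresh while its age is below max-age (in seconds)."""
    age = now - parsedate_to_datetime(date_header)
    return age < timedelta(seconds=max_age)


generated = "Tue, 22 Feb 2022 22:22:22 GMT"
base = datetime(2022, 2, 22, 22, 22, 22, tzinfo=timezone.utc)
print(is_fresh(generated, 604800, base + timedelta(days=6)))  # True
print(is_fresh(generated, 604800, base + timedelta(days=8)))  # False
```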

Expires or max-age

In HTTP/1.0, freshness used to be specified by the Expires header. The Expires header specifies the lifetime of the cache using an explicit time rather than by specifying an elapsed time.

Expires: Mon, 28 Feb 2022 22:22:22 GMT

When both Expires and Cache-Control: max-age are present, max-age takes precedence.

Vary

The Vary header tells caches to store separate variants of the same URL based on specific request header values. A response with Vary: Accept-Encoding instructs the cache to key stored entries on the Accept-Encoding header, so a client requesting br encoding and a client requesting gzip each get their own cached copy.

Vary: Accept-Encoding, Accept-Language

Without Vary, a cache serves the same stored response to all clients regardless of request headers. This causes problems when the origin returns different representations based on encoding, language, or other negotiated features. Vary: * effectively disables caching, because every request is treated as unique.
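
One way to picture the effect of Vary is as an extended cache key. The sketch below is illustrative only: the function name is hypothetical, and header values are compared as raw strings, whereas real caches often normalise them first.

```python
def cache_key(url: str, request_headers: dict, vary: str):
    """Build a cache key from the URL plus the request headers named in Vary."""
    names = [name.strip().lower() for name in vary.split(",") if name.strip()]
    if "*" in names:
        return None  # Vary: * means no stored response can be matched
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((name, lowered.get(name, "")) for name in sorted(names))


k_br = cache_key("/app.js", {"Accept-Encoding": "br"}, "Accept-Encoding")
k_gz = cache_key("/app.js", {"accept-encoding": "gzip"}, "Accept-Encoding")
print(k_br)
print(k_gz)
```

The two requests produce different keys, so each gets its own stored variant.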

Validation

When a stored response becomes stale, the cache does not discard it immediately. Instead, it sends a conditional request to the origin to check whether the resource has changed. This process is called revalidation.

Validators

Two validator mechanisms exist:

  • ETag: An opaque identifier representing a specific version of the resource.
  • Last-Modified: A timestamp indicating when the resource last changed. It has one-second resolution, making ETags the more reliable validator for rapidly changing resources.

Conditional request headers

The cache attaches a validator to the revalidation request:

  • If-None-Match sends the stored ETag. If the origin has the same ETag, the response has not changed. When both headers are present, If-None-Match takes precedence.
  • If-Modified-Since sends the stored Last-Modified date. If the resource has not changed since that date, the origin confirms freshness with a 304 Not Modified response.
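
From the origin's point of view, handling a conditional request might look like the sketch below. The function names and the last_changed sample value are hypothetical, and the ETag comparison is simplified: it ignores the W/ weak prefix and does not handle commas inside quoted tags.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def strip_weak(tag: str) -> str:
    """Drop the W/ prefix so tags compare by their opaque value."""
    return tag[2:] if tag.startswith("W/") else tag


def revalidate(request_headers: dict, current_etag: str, last_modified: datetime) -> int:
    """Return 304 if the client's stored copy is still valid, else 200."""
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # If-None-Match takes precedence; If-Modified-Since is then ignored.
        candidates = {strip_weak(t.strip()) for t in if_none_match.split(",")}
        matched = "*" in candidates or strip_weak(current_etag) in candidates
        return 304 if matched else 200
    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        return 304 if last_modified <= since else 200
    return 200


last_changed = datetime(2022, 2, 22, 22, 22, 22, tzinfo=timezone.utc)
print(revalidate({"If-None-Match": '"abc"'}, '"abc"', last_changed))  # 304
```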

Invalidation

Unsafe methods like POST, PUT and DELETE change server state. When a cache receives a non-error response to an unsafe request, the cache must invalidate stored responses for the target URI. The cache may also invalidate responses for URIs in the Location and Content-Location headers if they share the same origin. Invalidation marks stored responses as requiring revalidation.

Cache Groups

Cache Groups provide a mechanism for grouping related cached responses so a single unsafe request invalidates an entire set of resources. The Cache-Groups response header assigns a response to one or more named groups. The value is a list of case-sensitive strings.

HTTP/1.1 200 OK
Cache-Groups: "product-listings", "homepage"

The Cache-Group-Invalidation response header triggers invalidation of all cached responses belonging to the named groups. The header is processed only on responses to unsafe methods like POST or PUT.

HTTP/1.1 200 OK
Cache-Group-Invalidation: "product-listings"

After receiving this response, a cache invalidates all stored responses tagged with the “product-listings” group from the same origin. The invalidation does not cascade: invalidated responses do not trigger further group-based invalidations.
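
The grouping behaviour described above can be modelled with a small in-memory structure. This is a hypothetical sketch for a single origin, not an implementation of any particular cache: the class and method names are invented, and header parsing is skipped by passing group names as lists.

```python
from collections import defaultdict


class GroupedCache:
    """Toy cache that tracks which stored URLs belong to which groups."""

    def __init__(self):
        self.entries = {}               # url -> stored response body
        self.groups = defaultdict(set)  # group name -> set of urls
        self.invalid = set()            # urls that must be revalidated before reuse

    def store(self, url, body, groups=()):
        self.entries[url] = body
        self.invalid.discard(url)
        for group in groups:            # from the Cache-Groups response header
            self.groups[group].add(url)

    def invalidate_groups(self, groups):
        # One pass only: invalidated responses trigger no further invalidations.
        for group in groups:            # from Cache-Group-Invalidation
            self.invalid |= self.groups.get(group, set())


cache = GroupedCache()
cache.store("/products", b"...", ["product-listings"])
cache.store("/", b"...", ["product-listings", "homepage"])
cache.store("/about", b"...")
cache.invalidate_groups(["product-listings"])
print(sorted(cache.invalid))  # ['/', '/products']
```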

Cache-Control directives

The Cache-Control header carries directives controlling cache storage, reuse and revalidation. Directives appear in both requests and responses.

Response Directives

| Directive | Effect |
|---|---|
| max-age=N | Response is fresh for N seconds |
| s-maxage=N | Overrides max-age in shared caches; implies proxy-revalidate |
| no-cache | Cache stores the response but must revalidate before every reuse |
| no-store | Cache must not store any part of the response |
| public | Any cache stores the response, even for authenticated requests |
| private | Only private caches store the response |
| must-revalidate | Once stale, the cache must revalidate before reuse; serves 504 on failure |
| proxy-revalidate | Same as must-revalidate for shared caches only |
| no-transform | Intermediaries must not alter the response body |
| must-understand | Cache stores the response only if the status code semantics are understood |
| immutable | Response body does not change while fresh; skip revalidation on user-initiated reload |
| stale-while-revalidate=N | Serve stale for up to N seconds while revalidating in the background |
| stale-if-error=N | Serve stale for up to N seconds when revalidation encounters a 500–599 error |

Request Directives

| Directive | Effect |
|---|---|
| max-age=N | Accept a response no older than N seconds |
| max-stale[=N] | Accept a stale response, optionally no more than N seconds past expiry |
| min-fresh=N | Accept a response fresh for at least N more seconds |
| no-cache | Do not serve from cache without revalidating first |
| no-store | Do not store the request or response |
| no-transform | Intermediaries must not alter the body |
| only-if-cached | Return a stored response or 504 |
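
Request and response directives share the same comma-separated syntax, so one small parser covers both. The sketch below uses a hypothetical function name and is simplified: it does not handle quoted arguments that themselves contain commas.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; valueless directives map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-age=31536000, immutable"))
```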

CDN and edge caching

A CDN (content delivery network) operates a distributed network of shared caches at edge locations close to end users. CDN caches follow the same HTTP caching rules as proxy caches, with some platform-specific extensions.

Cache keys

CDN caches identify stored responses by a cache key, typically the request URL. Many CDNs extend the cache key with additional components: query string parameters, request headers (per Vary), cookies, or geographic region. A misconfigured cache key is a common source of cache poisoning or unintended content sharing between users.

Cache purging

CDNs provide purging APIs to invalidate cached content before expiry. Purging by URL removes a single resource. Purging by tag (surrogate key) removes all responses tagged with a specific label. Tag-based purging is useful for invalidating all pages referencing a changed asset or data source.

Googlebot and HTTP caching

Google’s crawling infrastructure implements heuristic HTTP caching. Googlebot supports ETag / If-None-Match and Last-Modified / If-Modified-Since for cache validation when re-crawling URLs. When both validators are present, Googlebot uses the ETag value as the HTTP standard requires. The Cache-Control max-age directive helps Googlebot determine how often to re-crawl a URL. A page with a long max-age is re-fetched less frequently, while a page with a short max-age or no-cache signals the content changes often and warrants more frequent visits.

Common caching patterns

Versioned assets (cache forever)

Static assets with a fingerprint or version string in the URL are safe to cache indefinitely.

Cache-Control: max-age=31536000, immutable

The URL /assets/app.d9f8e7.js changes whenever the file content changes. The immutable directive tells the browser not to revalidate even on a user-initiated reload.

HTML pages (always revalidate)

HTML pages change frequently and benefit from revalidation on every load.

Cache-Control: no-cache

The cache stores the response but checks with the origin before serving. Combined with a strong ETag, this pattern ensures fresh content with minimal transfer cost when nothing changed.

Sensitive content (never cache)

Login pages, banking portals, and other pages with private data must not be cached.

Cache-Control: no-store

Shared cache with revalidation fallback

API responses served through a CDN with graceful degradation during outages.

Cache-Control: s-maxage=300, stale-if-error=3600

The CDN caches the response for five minutes. On origin failure, the CDN serves stale content for up to one hour.

HTTP Content Negotiation

Content negotiation is the mechanism by which a client and server agree on the best representation of a resource. The server selects from available variants based on client preferences expressed through HTTP headers, or presents alternatives for the client to choose from. A single resource identified by a URI can have several representations, such as JSON, XML, PNG or AVIF, depending on the resource. Three negotiation patterns exist to determine which variant to send to the client:

  • Proactive negotiation: the server picks the best representation using preferences the client sent in the request. This is the dominant pattern, using the Accept, Accept-Encoding, Accept-Charset and Accept-Language headers.
  • Reactive negotiation: the server lists available representations and the client picks one.
  • Request content negotiation: the server advertises preferences in a response, influencing how the client formats subsequent requests.

HTTP Compression

HTTP compression reduces the size of data transferred between servers and clients. A client sends an Accept-Encoding header listing supported content codings. The server picks one, compresses the response body, and indicates the choice in a Content-Encoding header. This exchange is a form of proactive content negotiation.

The Accept-Encoding request header lists acceptable codings with optional quality values:

Accept-Encoding: br, gzip;q=0.8, zstd;q=0.9

The server selects one coding and returns the compressed body with two key headers:

  • Content-Encoding names the coding applied
  • Content-Length reflects the compressed size, not the original
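
The server-side selection step can be sketched as parsing the quality values and picking the highest-weighted coding the server supports. Function names are hypothetical, and ties are broken by the order of the server's own list, which is one reasonable policy among several.

```python
def parse_accept_encoding(value: str) -> dict:
    """Map each listed coding to its quality value (default 1.0)."""
    weights = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        coding, *params = [p.strip() for p in item.split(";")]
        q = 1.0
        for param in params:
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val)
        weights[coding.lower()] = q
    return weights


def pick_encoding(accept_encoding: str, supported: list) -> str:
    """Pick the supported coding the client prefers; fall back to identity."""
    weights = parse_accept_encoding(accept_encoding)

    def weight(coding: str) -> float:
        # "*" sets the weight for codings that are not listed explicitly.
        return weights.get(coding, weights.get("*", 0.0))

    candidates = [c for c in supported if weight(c) > 0]
    return max(candidates, key=weight) if candidates else "identity"


header = "br, gzip;q=0.8, zstd;q=0.9"
print(pick_encoding(header, ["gzip", "zstd"]))  # zstd
print(pick_encoding(header, ["gzip", "br"]))    # br
```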

Content Encoding

The most common ones are:

  • gzip: The gzip coding uses the GZIP file format, combining LZ77 and Huffman coding. It is supported by virtually every HTTP client and server and remains the most widely deployed content coding on the web.
  • br (Brotli): The br coding uses the Brotli algorithm, developed by Google. Brotli achieves higher compression ratios than gzip at comparable decompression speeds. Brotli includes a built-in static dictionary of common web content patterns, an advantage for compressing HTML, CSS, and JavaScript. All major browsers support Brotli over HTTPS connections.
  • zstd (Zstandard): The zstd coding uses the Zstandard algorithm, developed at Facebook. Zstandard offers a wide range of compression levels, from fast modes exceeding gzip speed to high modes rivaling Brotli compression ratios. Zstandard decompression is fast regardless of the compression level used. When used as an HTTP content coding, Zstandard encoders must limit the window size to 8 MB and decoders must support at least 8 MB. This cap prevents excessive memory consumption in browsers and other HTTP clients.
  • dcb (Dictionary-Compressed Brotli): The dcb coding compresses a response using Brotli with an external dictionary. The compressed stream includes a fixed header containing the SHA-256 hash of the dictionary, allowing the client to verify the dictionary matches before decompression. Dictionary-based Brotli supports compression windows up to 16 MB.
  • dcz (Dictionary-Compressed Zstandard): The dcz coding compresses a response using Zstandard with an external dictionary. Like dcb, the compressed stream starts with a header containing the SHA-256 hash of the dictionary. The header is structured as a Zstandard skippable frame, making the format compatible with existing Zstandard decoders.
  • deflate: The deflate coding wraps a DEFLATE compressed stream inside the zlib format. Historical inconsistencies between implementations (some sent raw DEFLATE without the zlib wrapper) made deflate unreliable. Modern HTTP favors gzip or br instead.
  • compress: The compress coding uses adaptive Lempel-Ziv-Welch (LZW). Rarely encountered in modern HTTP.
  • identity: The identity coding means no transformation was applied. This value appears in Accept-Encoding to signal acceptance of uncompressed responses. A server is always allowed to send an uncompressed response, even when the client lists only compression codings.
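
As a small illustration of the gzip coding, Python's standard gzip module can compress a body and produce the matching headers. The body here is assumed sample data, repeated so that the size difference is visible.

```python
import gzip

# Sample body (assumed for illustration); repetitive text compresses well.
body = b"<!doctype html><p>hello world</p>" * 50
compressed = gzip.compress(body)

headers = {
    "Content-Encoding": "gzip",
    # Content-Length reflects the compressed size, not the original.
    "Content-Length": str(len(compressed)),
}
print(len(body), headers["Content-Length"])

# The client reverses the coding to recover the original representation.
restored = gzip.decompress(compressed)
```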