There are times when server-side cache mechanisms may speed up subsequent requests, but for now let's focus on browser-related aspects.
Hovering over the waterfall 'blocks' reveals time details:
https://i.sstatic.net/5r64b.png
Here is a brief explanation of each phase (sourced from Google Developers):
- Queueing. The browser queues a request when:
  - There are higher-priority requests.
  - There are already six open TCP connections for this origin, which is the limit (applies to HTTP/1.0 and HTTP/1.1 only).
  - The browser is briefly allocating space in the disk cache.
- Stalled. The request could be stalled due to reasons mentioned under Queueing.
- DNS Lookup. The browser is resolving the request's domain name to an IP address.
- Proxy negotiation. Negotiating the request with a proxy server.
- Request sent. Sending out the request.
- ServiceWorker Preparation. Setting up the service worker.
- Request to ServiceWorker. Request being sent to the service worker.
- Waiting (TTFB). Waiting for the first byte of the response. TTFB stands for Time To First Byte; this timing includes one round trip of latency plus the time the server took to prepare the response.
- Content Download. Receiving the response content.
- Receiving Push. Getting data via HTTP/2 Server Push for the response.
- Reading Push. Reading local previously received data.
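You can observe most of these phases programmatically through the browser's Resource Timing API, whose `PerformanceResourceTiming` entries expose the same boundaries DevTools draws in the waterfall. A minimal sketch (the helper name `phaseDurations` is illustrative, not part of any API):

```javascript
// Compute the main waterfall phases from a PerformanceResourceTiming-like
// entry. The derived names mirror the DevTools labels described above.
function phaseDurations(entry) {
  return {
    dns: entry.domainLookupEnd - entry.domainLookupStart,   // DNS Lookup
    connect: entry.connectEnd - entry.connectStart,         // TCP (and TLS) setup
    ttfb: entry.responseStart - entry.requestStart,         // Waiting (TTFB)
    download: entry.responseEnd - entry.responseStart,      // Content Download
  };
}

// In a browser console you could inspect every resource on the page:
// performance.getEntriesByType('resource')
//   .forEach((e) => console.log(e.name, phaseDurations(e)));
```

Note that for cross-origin resources the detailed timestamps are zeroed unless the server sends a `Timing-Allow-Origin` header.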
So what sets the initial request apart from subsequent ones in a traditional HTTP/1.1 setup?
- DNS Lookup: Resolving DNS for the first request might take longer. Subsequent requests benefit from faster resolution using the browser's DNS cache.
- Waiting (TTFB): The first request must establish a TCP connection to the server. With HTTP keep-alive, subsequent requests reuse that connection, skipping the three-way handshake and saving roughly one round trip of latency (more if TLS is involved, since the TLS handshake adds its own round trips).
- Content Download: The initial request spends more time downloading content because of TCP slow start. Subsequent requests reuse the established connection, whose congestion window has already grown, so content downloads significantly faster than on the first request.
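These differences show up directly in the Resource Timing data: a commonly used heuristic is that a request served over an already-open keep-alive connection reports `connectStart === connectEnd`, and a DNS cache hit reports `domainLookupStart === domainLookupEnd`. A sketch using that heuristic (the helper names are illustrative):

```javascript
// Heuristics for spotting connection reuse and DNS cache hits in
// PerformanceResourceTiming entries: zero-length phases indicate the
// work was skipped because a cached result or open connection existed.
function wasConnectionReused(entry) {
  return entry.connectEnd === entry.connectStart;
}

function wasDnsCached(entry) {
  return entry.domainLookupEnd === entry.domainLookupStart;
}
```

Running these over `performance.getEntriesByType('resource')` on a warm page typically shows most same-origin requests reusing both the DNS result and the connection.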
Hence, subsequent requests are generally quicker than the first one. This leads to a common network optimization tactic: minimize the number of domains your website uses, so that more requests can reuse already-warm DNS entries and TCP connections.
HTTP/2 introduces multiplexing, allowing many concurrent requests to share a single TCP connection. This makes it a good fit for the modern frontend landscape, where numerous small assets are served from CDN servers.
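Resource Timing also reports the negotiated protocol via the `nextHopProtocol` property (`'http/1.1'`, `'h2'`, `'h3'`, ...), so you can verify whether your assets actually benefit from multiplexing. A small sketch (the function name is illustrative; HTTP/3 multiplexes over QUIC rather than TCP, but is included for completeness):

```javascript
// Check whether a resource was fetched over a multiplexing-capable
// protocol, using the ALPN identifier exposed by Resource Timing.
function isMultiplexed(entry) {
  return entry.nextHopProtocol === 'h2' || entry.nextHopProtocol === 'h3';
}
```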