System Design Interview Roadmap

Understanding Head-of-Line Blocking: HTTP/2 vs. HTTP/3 (QUIC) in Production

Mar 28, 2026

Introduction

You’re streaming a 4K video on YouTube when suddenly your Wi-Fi hiccups. A single lost packet stalls every video chunk in flight on that connection, not because the chunks are damaged, but because TCP will not hand the application any bytes that follow a gap until the missing packet (#42, say) has been retransmitted. Meanwhile, packets #43 through #127 sit idle in kernel buffers, perfectly intact but artificially stalled. This is head-of-line blocking, and it’s a major reason Google spent years rebuilding web transport on UDP.

The Fundamental Problem

Head-of-line blocking (HOL blocking) occurs when one slow or failed resource prevents processing of subsequent independent resources. In HTTP/1.1, this happened at the application layer: browsers typically opened up to 6 connections per host, but within each connection, requests queued serially. The response to request #2 couldn’t start until #1 completed, even if #2’s resource was ready. HTTP/2 solved this with multiplexing: it interleaves many requests and responses over a single TCP connection, tagging each frame with a stream ID.
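To make the application-layer queueing concrete, here is a minimal sketch that compares completion times on one serial connection with an idealized multiplexed connection. The response times are hypothetical, and the multiplexed case assumes each response is limited only by its own server time and ignores bandwidth sharing.

```python
from itertools import accumulate

# Hypothetical server time per response, in milliseconds.
# The first resource is slow; the next two are ready almost immediately.
response_ms = [2000, 100, 100]

# Serial connection (HTTP/1.1 style): each response waits for the previous one.
serial_finish_ms = list(accumulate(response_ms))

# Idealized multiplexing (HTTP/2 style): responses progress independently.
multiplexed_finish_ms = list(response_ms)

for i, (s, m) in enumerate(zip(serial_finish_ms, multiplexed_finish_ms), start=1):
    print(f"request #{i}: serial {s} ms, multiplexed {m} ms")
```

In this toy model the two fast responses finish after 100 ms when multiplexed, but only after the slow one when queued serially.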

But HTTP/2 exposed a deeper problem at the transport layer. TCP guarantees in-order delivery of bytes. When packet #42 drops on a connection carrying 100 multiplexed streams, TCP’s receive buffer holds packets #43-127 but refuses to deliver them to the application. The kernel waits for retransmission of #42, stalling all 100 streams even though the streams whose data was not in that packet have no dependency on it. This is TCP-level HOL blocking, and HTTP/2’s multiplexing can neither see it nor work around it.
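The stall can be reproduced with a toy model of a receiver that uses a single sequence space. This is a sketch rather than real TCP: it numbers whole packets instead of bytes, and the function name `tcp_deliverable` is made up for illustration.

```python
def tcp_deliverable(received_seqs, first_seq):
    """Return the packets an in-order receiver can hand to the application.

    Delivery stops at the first gap, no matter how much later data has arrived.
    """
    buffered = set(received_seqs)
    delivered = []
    expected = first_seq
    while expected in buffered:
        delivered.append(expected)
        expected += 1
    return delivered

# Packets #1..#127 were sent; packet #42 was lost in transit.
arrived_packets = [seq for seq in range(1, 128) if seq != 42]

delivered_packets = tcp_deliverable(arrived_packets, first_seq=1)
stalled_packets = sorted(set(arrived_packets) - set(delivered_packets))

print(f"delivered: {len(delivered_packets)} packets")
print(f"stalled in buffer: {len(stalled_packets)} packets "
      f"(#{stalled_packets[0]}-#{stalled_packets[-1]})")
```

Packets #43-127 have arrived intact, yet none of them reaches the application until #42 is retransmitted, regardless of which HTTP/2 stream their bytes belong to.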

HTTP/3 restructures this relationship by running on QUIC, which implements streams natively in the transport layer over UDP. Each QUIC stream orders its own bytes independently of every other stream. When a packet carrying stream #5’s data is lost, only stream #5 stalls; streams #1-4 and #6-100 continue delivering data immediately. QUIC also integrates the TLS 1.3 handshake into connection establishment, so a new connection needs 1 RTT before the first request can be sent, compared with 3 for TCP plus TLS 1.2 (or 2 for TCP plus TLS 1.3), and repeat visitors can resume with 0-RTT.
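The handshake savings are simple arithmetic. The sketch below counts round trips before the first HTTP request can leave the client. The 50 ms RTT is an assumed value, and the TCP plus TLS 1.3 row is included for comparison alongside the figures quoted above.

```python
# Round trips needed before the client can send its first HTTP request.
HANDSHAKE_RTTS = {
    "TCP + TLS 1.2": 3,            # 1 RTT for TCP, 2 for the TLS 1.2 handshake
    "TCP + TLS 1.3": 2,            # 1 RTT for TCP, 1 for the TLS 1.3 handshake
    "QUIC (new connection)": 1,    # transport and TLS 1.3 handshake combined
    "QUIC (0-RTT resumption)": 0,  # request rides along with the first flight
}

def setup_delay_ms(rtt_ms):
    """Return the connection setup delay in milliseconds for each stack."""
    return {name: rtts * rtt_ms for name, rtts in HANDSHAKE_RTTS.items()}

delays = setup_delay_ms(50)  # assumed 50 ms round-trip time
for name, delay in delays.items():
    print(f"{name}: {delay} ms before the first request")
```

On a 50 ms path, moving from TCP plus TLS 1.2 to a fresh QUIC connection removes 100 ms of setup delay per new connection.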

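The implementation difference is architectural. TCP operates on a byte stream abstraction with a single sequence number space. QUIC tracks a separate byte offset for each stream, while packet numbers, acknowledgments, loss detection, and congestion control are shared across the connection. When you send 100 HTTP/3 requests, you’re creating 100 logical channels, each with its own ordering and its own flow-control limit, under a shared connection-level flow-control limit and a single congestion controller. Packet loss delays delivery only on the stream(s) whose data was in that packet, although the resulting congestion response still slows the sending rate for the whole connection.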
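Per-stream offset tracking can be sketched in a few lines. This is an illustrative model, not a QUIC implementation: each arrived frame is a `(stream_id, offset, data)` tuple, one frame per packet is assumed, and `quic_deliverable` is a hypothetical helper name.

```python
from collections import defaultdict

def quic_deliverable(frames):
    """Reassemble each stream independently by byte offset.

    frames: iterable of (stream_id, offset, data) tuples that arrived.
    Returns {stream_id: bytes that can be delivered in order}.
    """
    by_stream = defaultdict(dict)
    for stream_id, offset, data in frames:
        by_stream[stream_id][offset] = data

    deliverable = {}
    for stream_id, chunks in by_stream.items():
        buf = b""
        next_offset = 0
        # A gap only stops delivery on the stream that owns it.
        while next_offset in chunks:
            chunk = chunks[next_offset]
            buf += chunk
            next_offset += len(chunk)
        deliverable[stream_id] = buf
    return deliverable

# Three streams, two 4-byte frames each, one frame per packet (assumed).
sent_frames = [
    (1, 0, b"aaaa"), (2, 0, b"bbbb"), (3, 0, b"cccc"),
    (1, 4, b"AAAA"), (2, 4, b"BBBB"), (3, 4, b"CCCC"),
]
lost_frame = (2, 0, b"bbbb")  # the packet carrying this frame was dropped
arrived_frames = [f for f in sent_frames if f != lost_frame]

per_stream = quic_deliverable(arrived_frames)
for stream_id in sorted(per_stream):
    print(f"stream {stream_id}: {per_stream[stream_id]!r}")
```

Streams 1 and 3 deliver all eight bytes, while stream 2 delivers nothing until its first frame is retransmitted. Running the same loss through a single shared sequence space would have stalled all three.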
