There is a constraint baked into every TCP connection that no amount of bandwidth upgrades will remove. Your users on gigabit fiber face the same first-response bottleneck as users on a 20 Mbps cable connection, because the bottleneck is the protocol, not the pipe. The roughly 13 KB threshold that falls out of this constraint is a useful entry point into understanding why size discipline remains one of the highest-leverage performance optimizations available to web developers.
How TCP Slow Start Creates a Fixed First-RTT Budget
When a TCP connection sends its first batch of data, it does not immediately use all available bandwidth. It starts with a small congestion window, cwnd, and grows it as acknowledgements arrive. This is TCP slow start, and it exists to avoid overwhelming network infrastructure before the connection has any information about available capacity.
RFC 6928, published in 2013, raised the permitted initial congestion window from roughly 3 segments (RFC 3390 allowed 2 to 4, depending on segment size) to 10. Before that, the initial window was too conservative for modern page sizes, so servers left high-bandwidth connections badly underutilized during the first round trips. The increase to 10 segments was based on Google’s data showing that most HTTP responses fit under that threshold and that the network harm was minimal.
The math is straightforward. The Maximum Segment Size (MSS) on a standard IPv4 Ethernet connection is 1460 bytes, accounting for the 40-byte overhead of IP and TCP headers within a 1500-byte Ethernet frame. Ten segments gives you:
IW10 = 10 × 1460 bytes = 14,600 bytes (14.6 kB, or about 14.3 KiB)
That is the total TCP payload the server can send before it must wait for an acknowledgement from the client. If the content fits, the client receives everything in a single data round trip. If it does not fit, the server sends 14,600 bytes, waits for ACKs, then sends the next window (now 20 segments, because slow start doubles cwnd per RTT), and so on.
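The doubling described above can be sketched in a few lines of Python. This is an idealized model, not a measurement: it assumes no packet loss, a receiver window that never limits the sender, a window that doubles exactly once per round trip, and it ignores TLS and HTTP framing. The function name `data_round_trips` is made up for this illustration.

```python
def data_round_trips(response_bytes: int, mss: int = 1460, initcwnd: int = 10) -> int:
    """Count data round trips needed under idealized slow start (assumptions above)."""
    if response_bytes <= 0:
        return 0
    cwnd_segments = initcwnd
    remaining = response_bytes
    round_trips = 0
    while remaining > 0:
        remaining -= cwnd_segments * mss  # send one full congestion window
        cwnd_segments *= 2                # slow start: window doubles each RTT
        round_trips += 1
    return round_trips


# Sizes chosen to sit on either side of the 14,600-byte boundary
for size in (13_000, 14_600, 14_601, 30_000, 100_000):
    print(f"{size:>7} bytes -> {data_round_trips(size)} data round trip(s)")
```

A single byte over 14,600 costs a second round trip, which is why the boundary matters more than the absolute size.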
Why the Number Is 13 KB, Not 14.6 KB
The actual usable application payload in that initial window is smaller than 14,600 bytes because of protocol overhead stacked on top of TCP.
TLS 1.3 uses AEAD encryption with a 16-byte authentication tag per record, plus a 5-byte record header. A TLS record covering a full segment adds roughly 21 bytes of overhead. Across 10 segments that is approximately 210 bytes consumed by TLS.
HTTP/2 adds 9-byte frame headers per DATA frame, plus a HEADERS frame at the start of the response that typically runs 200 to 400 bytes depending on response headers. By the time you account for all of this framing, the effective payload budget for your HTML or critical CSS is closer to 13 KB than 14 KB.
The exact number varies by server configuration, header verbosity, and TLS record sizing strategy. Servers that pack a single large TLS record spanning multiple TCP segments can cause a subtle problem: if that record extends beyond the initial congestion window, the client receives incomplete ciphertext and cannot decrypt any of that record until the remaining segments arrive in the next RTT. Keeping TLS records small enough that whole records fit inside the initial window avoids most of this. Nginx’s ssl_buffer_size directive controls the record size:
# Cap TLS record size (the default is 16k) so whole records fit inside the initial window
ssl_buffer_size 4k;
# Packet-coalescing options: tcp_nopush only takes effect with sendfile on; tcp_nodelay is on by default
tcp_nopush on;
tcp_nodelay on;
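To see why record size matters, here is a rough sketch of how much plaintext a client could decrypt after the first flight for a few record sizes. It assumes the response is long enough to fill the window, every record carries the same amount of plaintext, and each record costs the 21 bytes of overhead quoted above. HTTP/2 framing and response headers are ignored, and the function name is hypothetical.

```python
def decryptable_after_first_flight(
    record_plaintext: int,
    window_bytes: int = 14_600,
    record_overhead: int = 21,
) -> tuple[int, int]:
    """Return (usable plaintext bytes, wire bytes stranded in a partial record)."""
    record_wire = record_plaintext + record_overhead
    complete_records = window_bytes // record_wire   # only whole records can be decrypted
    usable = complete_records * record_plaintext
    stranded = window_bytes - complete_records * record_wire
    return usable, stranded


# 16 KB is the Nginx default record buffer; 4 KB matches the config above;
# 1,400 bytes approximates one record per TCP segment.
for size in (16_384, 4_096, 1_400):
    usable, stranded = decryptable_after_first_flight(size)
    print(f"record={size:>6}  usable={usable:>6}  stranded={stranded:>6}")
```

Under these assumptions a 16 KB record leaves nothing decryptable after the first round trip, while 4 KB records leave roughly 12 KB usable.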
Bandwidth Is Irrelevant Below the Congestion Window
This is the core point that the intuition about fast connections tends to obscure. Consider two users:
- User A: 10 Mbps connection, 30ms RTT to server
- User B: 1 Gbps connection, 30ms RTT to server
For the first 14,600 bytes of a response, both users receive the data in the same number of round trips. The congestion window does not care about bandwidth. It expands based on acknowledgement timing, which is determined by RTT. User B’s gigabit pipe is utilised at roughly 0.4% during that first RTT, because the bandwidth-delay product of a 1 Gbps, 30ms connection is 3.75 MB, and the initial window covers only 14.6 KB of it.
Fast connections only help once the congestion window has grown enough to start filling the available pipe, which takes multiple RTTs. For small resources, that growth never happens. The entire transfer completes before slow start has had time to matter.
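The utilisation figure comes from dividing the initial window by the bandwidth-delay product. A minimal sketch of that arithmetic for the two users follows, assuming a 1460-byte MSS and a 10-segment initial window; the function name is made up for this example.

```python
def first_flight_utilisation(
    bandwidth_bps: float,
    rtt_s: float,
    mss: int = 1460,
    initcwnd: int = 10,
) -> float:
    """Fraction of the bandwidth-delay product covered by the initial window."""
    bdp_bytes = bandwidth_bps * rtt_s / 8   # bits in flight per RTT, converted to bytes
    return (mss * initcwnd) / bdp_bytes


user_a = first_flight_utilisation(10e6, 0.030)   # 10 Mbps, 30 ms RTT
user_b = first_flight_utilisation(1e9, 0.030)    # 1 Gbps, 30 ms RTT
print(f"User A: {user_a:.1%} of the pipe used in the first RTT")
print(f"User B: {user_b:.2%} of the pipe used in the first RTT")
```

Both users receive the same 14,600 bytes in that round trip. The only difference is how much of their capacity sits idle while they wait.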
The implication is that, for the first render, fitting critical content into the initial window does something a bandwidth upgrade cannot: extra bandwidth never removes a round trip, while a smaller payload can. Reducing RTT (via CDN edge nodes, geographic distribution) also helps, but you can meaningfully improve performance without any infrastructure changes by keeping critical resources within budget.
The Timeline Where This Shows Up
A typical HTTPS page load over a fresh connection looks like this:
- DNS resolution: 1 RTT (to the resolver, if the answer is not already cached)
- TCP handshake: 1 RTT
- TLS 1.3 handshake: 1 RTT
- HTTP request and first response window: 1 RTT
That is a minimum of 4 round trips before the browser has received any HTML to parse. On a 30ms RTT connection, and assuming the DNS lookup costs about the same as a round trip to the server, that is 120ms before a single byte of content is available. If the HTML, critical CSS, and any blocking JavaScript fit within the initial congestion window, the browser has everything it needs to begin rendering after that 4th RTT completes. If the content is 30 KB instead of 13 KB, the browser must wait for a 5th RTT.
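The timeline above reduces to a small calculation. The sketch below uses the same simplifications as the text: every phase costs one identical RTT, and the server takes no time to think. Data delivery follows idealized slow start with no loss. Both function names are invented for this illustration.

```python
def data_round_trips(response_bytes: int, mss: int = 1460, initcwnd: int = 10) -> int:
    """Data round trips under idealized slow start (window doubles per RTT)."""
    round_trips, cwnd_segments, remaining = 0, initcwnd, response_bytes
    while remaining > 0:
        remaining -= cwnd_segments * mss
        cwnd_segments *= 2
        round_trips += 1
    return round_trips


def time_to_critical_content_ms(
    rtt_ms: float,
    critical_bytes: int,
    setup_round_trips: int = 3,   # DNS + TCP handshake + TLS 1.3 handshake
) -> float:
    """Milliseconds until all critical bytes have arrived on a fresh connection."""
    return (setup_round_trips + data_round_trips(critical_bytes)) * rtt_ms


for size in (13_000, 30_000):
    print(f"{size} bytes at 30 ms RTT: {time_to_critical_content_ms(30, size):.0f} ms")
```

Setting `setup_round_trips` to 4 approximates the TLS 1.2 case mentioned below.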
TLS 1.3 reduced this compared to TLS 1.2, which required 2 RTTs for the handshake rather than 1. QUIC and HTTP/3 reduce it further with their combined transport and TLS handshake, plus 0-RTT session resumption. But QUIC still has a recommended initial congestion window of approximately 14,720 bytes (RFC 9002, Section 7.2), because the same fundamental constraint applies: you cannot know how much bandwidth is available until you have observed the network’s behaviour.
What Actually Fits in 13 KB
With Brotli compression, 13 KB of compressed HTML represents a meaningful amount of content. A typical HTML document compresses to 20 to 30 percent of its original size. That means roughly 40 to 65 KB of source HTML can fit within the budget.
The more practical concern is inlined critical CSS. A common performance pattern is to inline the CSS required for above-the-fold rendering directly in the <head>, eliminating a render-blocking stylesheet request. If that inlined CSS plus the HTML document exceeds 13 KB compressed, you are adding an RTT to the first render.
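One way to keep an eye on this is a small check in the build pipeline. The sketch below is an approximation under stated assumptions: it uses gzip from the Python standard library as a stand-in, because Brotli needs a third-party package, and Brotli output is usually somewhat smaller. The 13,000-byte budget is the rough figure from this article, not a measured value, and the function names are hypothetical.

```python
import gzip

INITIAL_WINDOW_BUDGET = 13_000  # bytes; approximate figure from the text


def compressed_size(document: bytes) -> int:
    """Size of the document after gzip at maximum compression (stand-in for Brotli)."""
    return len(gzip.compress(document, compresslevel=9))


def fits_first_flight(document: bytes, budget: int = INITIAL_WINDOW_BUDGET) -> bool:
    """True if the compressed document fits within the first-flight budget."""
    return compressed_size(document) <= budget


# Illustrative document: HTML with inlined CSS and repetitive body markup
sample = (
    "<!doctype html><html><head><style>body{margin:0;font-family:sans-serif}</style>"
    "</head><body>" + "<p>Hello, first flight.</p>" * 500 + "</body></html>"
).encode("utf-8")

print(f"source: {len(sample)} bytes, compressed: {compressed_size(sample)} bytes")
print("fits in first flight:", fits_first_flight(sample))
```

In a real build you would read the rendered HTML from disk and fail the build when the check returns False.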
A rough budget for the 14,600-byte initial window:
HTTP/2 response headers (compressed): ~500 bytes
TLS overhead (10 records): ~210 bytes
HTTP/2 framing overhead: ~300 bytes
Available for HTML + inlined CSS: ~13,590 bytes (~13 KB)
At typical Brotli compression ratios for CSS (around 4:1 for repetitive stylesheets), a 13 KB budget spent entirely on CSS would hold about 52 KB of source. In practice the HTML shares that budget, so it is tight but achievable for a focused design system. Tailwind with PurgeCSS commonly produces under 10 KB of CSS for a simple page. Bootstrap’s critical path subset is trickier but manageable.
Adjusting the Initial Window Server-Side
Some operators configure a larger initial congestion window at the OS level to get more data out in the first RTT. On Linux:
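# View the current default route (initcwnd is listed only if it was set explicitly)
ip route show default
# default via 192.168.1.1 dev eth0 proto dhcp src 192.168.1.100 initcwnd 10
# Increase initial congestion window to 32 segments (requires root; not persistent across reboots)
sudo ip route change default via 192.168.1.1 dev eth0 initcwnd 32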
Setting initcwnd to 32 or higher gives you roughly 46 KB in the first RTT, which is more comfortable for typical web pages. CDN operators commonly do this on their edge servers. The tradeoff is that a larger initial window is more aggressive on congested paths; if many simultaneous connections all start with initcwnd 32, they compete harder with each other early in the connection lifetime.
This is a legitimate option, but it shifts the problem rather than removing it. The initial window on the client’s connection to the CDN edge may be large, but origin pull connections and the client’s own downstream congestion characteristics still apply.
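To put numbers on the larger-window option, the sketch below compares how many bytes can be delivered within a given number of round trips for initcwnd 10 and 32. It assumes a 1460-byte MSS and idealized doubling with no loss, so real connections will do worse on congested paths. The function name is invented for this example.

```python
def bytes_deliverable(round_trips: int, initcwnd: int, mss: int = 1460) -> int:
    """Total bytes sent after the given number of round trips under idealized slow start."""
    # Windows double each round trip: initcwnd * (1 + 2 + 4 + ...) segments in total
    return initcwnd * mss * (2 ** round_trips - 1)


for initcwnd in (10, 32):
    totals = [bytes_deliverable(k, initcwnd) for k in (1, 2, 3)]
    print(f"initcwnd={initcwnd:>2}: " + ", ".join(f"{t:,} bytes" for t in totals))
```

With initcwnd 32, the first round trip carries more than the first two round trips combined at initcwnd 10.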
HTTP/3 and Whether QUIC Changes the Calculus
QUIC’s 0-RTT feature gets attention for reducing connection establishment latency on repeat visits, which is a genuine improvement. But QUIC’s recommended initial congestion window uses the same 10-packet logic, expressed as a byte count: ten times the maximum datagram size, which for common datagram sizes is capped at 14,720 bytes. The goal of avoiding congestion before the path has been measured is unchanged.
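The byte-count rule can be written out directly. This sketch follows the recommendation as worded in RFC 9002, Section 7.2: ten times the maximum datagram size, limited to the larger of 14,720 bytes or twice the maximum datagram size. Real QUIC stacks may choose different values, and the function name is made up for illustration.

```python
def quic_initial_window(max_datagram_size: int) -> int:
    """Recommended initial congestion window in bytes, per RFC 9002 Section 7.2."""
    limit = max(14_720, 2 * max_datagram_size)
    return min(10 * max_datagram_size, limit)


# 1200 is QUIC's minimum supported datagram size; 1472 corresponds to
# 14,720 / 10; 1500 shows the cap taking effect.
for size in (1200, 1472, 1500):
    print(f"max_datagram_size={size}: initial window = {quic_initial_window(size):,} bytes")
```

With the 1200-byte minimum datagram size the window is 12,000 bytes, slightly smaller than TCP’s 14,600.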
What HTTP/3 does improve is head-of-line blocking. Under HTTP/2 over TCP, a single lost packet stalls all streams until the loss is recovered. QUIC handles packet loss per-stream, so a single lost datagram does not block unrelated resources. For pages loading many small resources in parallel, this matters. For the single critical first response, it changes little about the initial window constraint.
The Discipline This Requires
Keeping critical-path content under 13 KB is not about following a rule mechanically. It is about understanding which bytes the browser genuinely needs before it can start painting. Server-side rendering helps because the HTML already contains the initial state. Inlining critical CSS helps because it removes a round trip. Deferring non-critical JavaScript helps because blocking scripts in the document head pause HTML parsing.
Tools like Lighthouse and Chrome’s Coverage panel identify which CSS and JavaScript are actually used in the initial view. WebPageTest’s waterfall view shows the TCP connection timeline and makes the initial window constraint visible as a stair-step in the data delivery graph.
The number 13 KB is approximate. The principle is not: the first round trip of data delivery has a hard ceiling set by the kernel and the protocol stack, no matter how fast the connection is. Fitting your most important content within that ceiling is one of the few optimisations that pays off for every user on every network, including the ones on the fastest connections available.