When Two Servers Disagree: HTTP Desync and What It Cost Discord's Media Proxy

HTTP request smuggling has been a known class of vulnerability since Watchfire first documented it in 2005, but it spent years as an obscure footnote. That changed in 2019 when James Kettle published research at DEF CON 27 that reframed the problem, renamed it “HTTP desync,” and demonstrated it was exploitable at scale across major internet infrastructure. Since then, desync bugs have shown up at Netflix, Akamai, AWS, and now, as documented by the researcher tmctmt, in Discord’s media proxy.

The core mechanic is worth understanding precisely, because it is easy to misstate. HTTP/1.1 persistent connections mean a single TCP socket carries multiple sequential requests. Both parties, the proxy and the origin server, must agree on exactly where each request ends. HTTP gives two mechanisms for signalling body length: the Content-Length header, a plain byte count, and Transfer-Encoding: chunked, which frames the body in sized chunks terminated by a zero-length chunk. When a request carries both headers and the intermediate proxy honours one while the backend honours the other, the two sides develop incompatible views of the stream. The proxy thinks it has forwarded one complete request; the backend has consumed only part of it and treats the leftover bytes as the beginning of the next request. That leftover is the smuggled prefix.
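To make the two framing mechanisms concrete, here is a minimal Python sketch that frames the same body both ways. The function names and the chunk size of 4 are illustrative choices, not taken from any library.

```python
def frame_with_content_length(body: bytes) -> bytes:
    """Frame a body with a plain byte count."""
    return b"Content-Length: %d\r\n\r\n" % len(body) + body


def frame_chunked(body: bytes, chunk_size: int = 4) -> bytes:
    """Frame a body as chunks: hex size, CRLF, data, CRLF.

    A zero-size chunk followed by an empty line marks the end of the body.
    """
    out = b"Transfer-Encoding: chunked\r\n\r\n"
    for i in range(0, len(body), chunk_size):
        chunk = body[i:i + chunk_size]
        out += b"%x\r\n" % len(chunk) + chunk + b"\r\n"
    return out + b"0\r\n\r\n"


print(frame_with_content_length(b"hello=1"))
print(frame_chunked(b"hello=1"))
```

Both outputs describe the same seven-byte body; a receiver only recovers it correctly if it applies the same mechanism the sender used.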

The Three Variants

The community conventionally labels these by which header each side prioritises:

CL.TE: the front-end proxy uses Content-Length, the backend uses Transfer-Encoding. The attacker sends a request whose Content-Length covers everything, but whose chunked body terminates early, leaving a suffix that the backend treats as a new request.
TE.CL: the front-end uses Transfer-Encoding, the backend uses Content-Length. The attacker sends a chunked body together with a Content-Length shorter than that body, so the backend reads only the first Content-Length bytes and treats the rest as the start of the next request.
TE.TE: both sides nominally support chunked encoding but one can be confused by a malformed or obfuscated Transfer-Encoding value, such as Transfer-Encoding: xchunked or a header with whitespace padding. One side falls through to Content-Length; the other processes chunked normally.

On the wire, a minimal CL.TE desync probe looks like this (every line ends in CRLF except the final X, which has no line ending):

POST / HTTP/1.1
Host: target.example
Content-Length: 6
Transfer-Encoding: chunked

0

X

The proxy sees a body of 6 bytes (0\r\n\r\nX) and forwards the whole thing. The backend, honouring chunked encoding, reads the zero chunk as end-of-body and considers the request complete. The trailing X sits in the socket buffer, waiting to be prepended to the next request that arrives on that connection.
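The disagreement can be reproduced offline by running two simplified body readers over the same six bytes. This is a sketch, not a model of any particular proxy or backend: the reader functions are hypothetical and the chunked reader ignores chunk extensions and trailers.

```python
from typing import Tuple

PROBE_BODY = b"0\r\n\r\nX"  # the six body bytes of the probe above


def read_by_content_length(stream: bytes, length: int) -> Tuple[bytes, bytes]:
    """Consume exactly `length` bytes; return (body, leftover)."""
    return stream[:length], stream[length:]


def read_chunked(stream: bytes) -> Tuple[bytes, bytes]:
    """Consume chunks up to the zero-size chunk; return (body, leftover)."""
    body = b""
    pos = 0
    while True:
        eol = stream.index(b"\r\n", pos)
        size = int(stream[pos:eol], 16)  # chunk size is hexadecimal
        pos = eol + 2
        if size == 0:
            pos += 2  # skip the empty line that ends the body
            return body, stream[pos:]
        body += stream[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


_, front_leftover = read_by_content_length(PROBE_BODY, 6)
_, back_leftover = read_chunked(PROBE_BODY)
print(front_leftover)  # b''  : the front-end sees nothing left over
print(back_leftover)   # b'X' : the backend keeps X for the next request
```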

Why Proxies Are the Problem

The vulnerability lives at the seam between two systems rather than inside either one alone, which makes it architecturally slippery. A proxy that correctly normalises ambiguous requests before forwarding, stripping one of the conflicting headers, closes the straightforward CL.TE and TE.CL cases. The HTTP/1.1 specification (RFC 7230, Section 3.3.3, since superseded by RFC 9112) is explicit: if both Content-Length and Transfer-Encoding are present, Transfer-Encoding takes precedence, the sender must remove Content-Length before forwarding the message, and the message ought to be handled as an error. Many production proxies do not follow this rule, either for performance reasons or because they were written before the attack class was well understood.
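As an illustration of the forwarding rule, the following sketch drops Content-Length whenever Transfer-Encoding is present. The `normalise_framing` name and the list-of-tuples header representation are assumptions made for the example; a real proxy would do this inside its own parser.

```python
from typing import List, Tuple

Header = Tuple[str, str]


def normalise_framing(headers: List[Header]) -> List[Header]:
    """Remove Content-Length when Transfer-Encoding is also present.

    Header names are compared case-insensitively. Requests without
    Transfer-Encoding are returned unchanged.
    """
    names = {name.lower() for name, _ in headers}
    if "transfer-encoding" not in names:
        return list(headers)
    return [(n, v) for n, v in headers if n.lower() != "content-length"]


ambiguous = [
    ("Host", "target.example"),
    ("Content-Length", "6"),
    ("Transfer-Encoding", "chunked"),
]
print(normalise_framing(ambiguous))
```

After normalisation the backend has only one framing signal to act on, so it cannot pick a different one from the proxy.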

Discord’s media proxy, media.discordapp.net, sits in front of the CDN layer that serves user-uploaded images, video, and attachments. Requests flow through an HTTP load balancer or reverse proxy before reaching the backend that validates media tokens, checks expiry, and streams content. That two-layer topology is exactly the setup desync attacks require.

The Discord-Specific Attack Surface

What makes the Discord media proxy interesting as a target, beyond the obvious scale of the platform, is the nature of the requests it handles. Discord signs media URLs with time-limited tokens to prevent unauthorised hotlinking and to gate access to private attachment content. A valid signed URL is effectively a short-lived credential. If an attacker can smuggle a prefix that captures the beginning of another user’s request, they receive that user’s full URL including the token, the path identifying the media, and any cookies the client sends.

The exploitation chain tmctmt describes follows the standard request-capture pattern Kettle documented: send a smuggled prefix that begins a partial POST request to a path you control, then send innocuous requests to the server until another user’s request arrives on the same backend connection and gets appended to your partial body. The backend accumulates the victim request as form data and reflects it, or stores it somewhere the attacker can retrieve. In the context of a media proxy, the captured data is a signed URL pointing at private content.
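The capture step can be illustrated offline with plain byte strings. In this sketch the path, the token value, the form field name, and the `backend_parse` helper are all made up for the example; nothing here reflects Discord's actual URLs or backend. The point is only that a Content-Length larger than the smuggled prefix's own body makes a Content-Length parser absorb whatever arrives next on the socket.

```python
from typing import Tuple

# Smuggled prefix: declares 80 body bytes but supplies only 9 ("captured=").
SMUGGLED_PREFIX = (
    b"POST /attacker-controlled HTTP/1.1\r\n"  # hypothetical path
    b"Host: target.example\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"Content-Length: 80\r\n"
    b"\r\n"
    b"captured="
)

# The next request to arrive on the same backend connection (made-up URL).
VICTIM_REQUEST = (
    b"GET /attachments/1/2/photo.png?token=SECRET HTTP/1.1\r\n"
    b"Host: target.example\r\n"
    b"\r\n"
)


def backend_parse(buffer: bytes) -> Tuple[bytes, bytes]:
    """Parse one Content-Length framed request; return (body, leftover)."""
    head, _, rest = buffer.partition(b"\r\n\r\n")
    length = 0
    for line in head.split(b"\r\n")[1:]:  # skip the request line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return rest[:length], rest[length:]


body, _ = backend_parse(SMUGGLED_PREFIX + VICTIM_REQUEST)
print(body)  # the victim's request line now sits inside the attacker's form field
```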

At Discord’s traffic volumes, the window between injecting the smuggled prefix and a victim request landing on the same backend socket is measured in milliseconds, not seconds. The attack is probabilistic but not impractical. Running it repeatedly against a high-traffic endpoint tilts the odds quickly.

Why Normalisation Alone Is Not Enough

The instinct after reading about this class of bug is to reach for a simple fix: reject any request that carries both Content-Length and Transfer-Encoding. Moving to HTTP/2 also helps, because HTTP/2 does not use chunked transfer encoding at all and carries message length in its own framing layer, which makes this kind of desync structurally impossible when HTTP/2 is spoken end-to-end. But HTTP/2 at the edge only protects the client-to-proxy leg. If the proxy downgrades to HTTP/1.1 when communicating with the backend, the desync surface moves inward to that internal connection. James Kettle documented this variant in 2021 in his research on HTTP/2 downgrade attacks, labelled H2.CL and H2.TE, where HTTP/2 clients could smuggle requests through a front-end that rewrites them as HTTP/1.1.
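A downgrading proxy can reduce this risk by refusing to copy any client-supplied framing into the HTTP/1.1 request it builds. The sketch below is one way to express that, under stated assumptions: `downgrade_headers` is a hypothetical helper, headers are simple name/value tuples, and translation of pseudo-headers into the request line is left out.

```python
from typing import List, Tuple

Header = Tuple[str, str]

# Connection-specific fields that are not allowed in HTTP/2 requests.
CONNECTION_SPECIFIC = {"connection", "keep-alive", "proxy-connection",
                       "transfer-encoding", "upgrade"}


def downgrade_headers(h2_headers: List[Header], body: bytes) -> List[Header]:
    """Build HTTP/1.1 headers from HTTP/2 fields and the received body bytes."""
    out: List[Header] = []
    for name, value in h2_headers:
        if any(ch in name + value for ch in "\r\n\x00"):
            raise ValueError("control character in header field")
        if name.startswith(":"):
            continue  # pseudo-headers become the request line (not shown)
        lname = name.lower()
        if lname in CONNECTION_SPECIFIC:
            raise ValueError("connection-specific field in HTTP/2 request")
        if lname == "content-length":
            if value != str(len(body)):
                raise ValueError("content-length does not match body length")
            continue  # re-emitted below from the measured length
        out.append((lname, value))
    # Framing for the backend comes from what was actually received.
    out.append(("content-length", str(len(body))))
    return out


print(downgrade_headers([(":method", "POST"), ("x-a", "b")], b"abc"))
```

The design choice is that the only Content-Length the backend ever sees is one the proxy computed itself.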

Proxies that implement RFC 7230 Section 3.3.3 correctly, stripping Content-Length whenever Transfer-Encoding is present, close the CL.TE vector. But TE.TE obfuscation is harder to handle without a complete and pedantically strict parser, because the space of malformed Transfer-Encoding values is large and implementations differ in which ones they reject versus silently accept.
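A strict check along these lines might look like the following sketch. It is deliberately pedantic and the policy is an assumption of the example, not a standard: it accepts only a single Transfer-Encoding header whose value is exactly chunked, which also rejects some legitimate but rare values such as a list of codings. The `check_framing` name and the raw-header-line input are likewise illustrative.

```python
from typing import List


def check_framing(raw_header_lines: List[bytes]) -> str:
    """Return 'chunked', 'content-length', or 'none'.

    Raises ValueError for anything ambiguous or obfuscated.
    """
    te_values, cl_values = [], []
    for line in raw_header_lines:
        name, sep, value = line.partition(b":")
        # Reject missing colons, empty names, and whitespace around the name
        # (for example b"Transfer-Encoding : chunked").
        if not sep or not name or name != name.strip():
            raise ValueError("malformed header line")
        lname = name.lower()
        if lname == b"transfer-encoding":
            te_values.append(value.strip(b" \t").lower())
        elif lname == b"content-length":
            cl_values.append(value.strip(b" \t"))
    if te_values and cl_values:
        raise ValueError("both Transfer-Encoding and Content-Length present")
    if len(te_values) > 1 or len(cl_values) > 1:
        raise ValueError("duplicate framing header")
    if te_values:
        if te_values[0] != b"chunked":
            raise ValueError("unsupported Transfer-Encoding value")
        return "chunked"
    if cl_values:
        if not cl_values[0].isdigit():
            raise ValueError("invalid Content-Length value")
        return "content-length"
    return "none"


print(check_framing([b"Host: target.example", b"Transfer-Encoding: chunked"]))
```

The xchunked and whitespace-padded examples mentioned earlier both raise ValueError here, rather than falling through to Content-Length.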

Tooling and Detection

PortSwigger’s HTTP Request Smuggler Burp extension automates the probing process, sending timed and differential requests to detect whether a server exhibits desync behaviour without fully exploiting it. The timing-based detection works because a server that is “desynced” will hold a connection open waiting for more data, causing an anomalous delay. Differential detection compares response behaviour between requests sent through the same socket versus fresh sockets.

For defenders, the canonical mitigation checklist is:

1. Enable HTTP/2 end-to-end, not just at the edge.
2. Enforce strict header parsing: reject, do not normalise, requests with both Content-Length and Transfer-Encoding.
3. Disable backend connection reuse for requests from different clients, accepting the performance cost.
4. Use unique per-connection IDs to detect when a response is delivered to the wrong client, a symptom of desync poisoning.

Items 3 and 4 are expensive at scale, which is why large CDN and proxy operators prefer items 1 and 2.

Scope and Responsible Disclosure

The phrase “spying on a whole platform” in the title of the original write-up is technically accurate but worth contextualising. The attack does not give bulk access to a database. It captures individual requests, one at a time, from users whose traffic happens to land on the same backend socket immediately after the attacker’s smuggled prefix. Executing this at meaningful scale requires sustained, high-volume request sending, which is detectable. The harm model is opportunistic rather than targeted interception: the attacker cannot choose the victim, it is hard to establish afterwards which specific users were affected, and private media content is exposed for any user unlucky enough to land in the window.

For a platform where users share private images, medical documents in DMs, or sensitive files inside private servers, that is a serious enough primitive. Tmctmt reported the issue to Discord’s security team through their HackerOne programme, and Discord patched the affected proxy infrastructure.

The takeaway for anyone operating multi-tier HTTP infrastructure is familiar but still underweighted in practice: the proxy layer is not passive plumbing. It is an active participant in request semantics, and disagreements between its interpretation and the backend’s create a gap that is both exploitable and non-obvious in routine code review. HTTP desync is not a bug you find by reading your own code; it requires thinking about what happens at the boundary between two systems that each believe they are behaving correctly.