
Poisoning the DPI State Machine: eBPF Sock Ops and the Fake TLS Handshake Trick

Source: lobsters

The field of DPI circumvention has followed a consistent arc: as inspection infrastructure improves, evasion tooling moves deeper into the stack. What started with port-hopping and HTTP proxy tricks has become a quiet competition at the kernel level. gecit, a recent project by Bora Tanrikulu, is a technically precise entry in this space. It combines eBPF socket operation hooks, raw socket injection, TTL manipulation, MSS clamping, and a built-in DoH resolver into a layered strategy that targets the DPI state machine directly rather than simply hiding data from it.

What DPI Actually Inspects

Modern DPI appliances deployed by ISPs in Turkey, Russia, Iran, and similar environments do not read every byte of every TCP stream. That would be prohibitively expensive at backbone scale. Instead, they rely on a common set of heuristics: intercept the first TCP data segment of each new connection, parse the TLS ClientHello, extract the SNI field, and make a block or allow decision within microseconds.

The SNI field exists because virtual hosting requires it. When multiple TLS domains share an IP address, the server needs to know which certificate to present before the handshake can proceed, but encryption has not started yet. So the client sends the hostname in plaintext in the ClientHello, in an extension defined in RFC 6066. That single plaintext field, typically appearing around byte 70 to 100 into the ClientHello payload depending on cipher suite list length and session ID, is what a decade of censorship infrastructure has been built around.
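To make the inspection step concrete, here is a minimal Python sketch of the kind of parsing a DPI box performs on the first data segment. It is an illustration, not code from gecit or from any real appliance: `build_client_hello` produces a deliberately stripped-down ClientHello (one cipher suite, one extension), and `extract_sni` walks the record the way a simple classifier might. Both function names are made up for this example.

```python
import struct
from typing import Optional

SNI_EXTENSION_TYPE = 0x0000  # server_name extension, RFC 6066


def build_client_hello(hostname: str) -> bytes:
    """Build a deliberately minimal ClientHello record with a single SNI extension."""
    name = hostname.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name   # name_type = host_name
    sni_body = struct.pack("!H", len(entry)) + entry         # ServerNameList
    extensions = struct.pack("!HH", SNI_EXTENSION_TYPE, len(sni_body)) + sni_body
    body = (
        b"\x03\x03"              # legacy_version (TLS 1.2)
        + bytes(32)              # random (all zero, good enough for a sketch)
        + b"\x00"                # empty session_id
        + b"\x00\x02\x13\x01"    # cipher suite list with one entry
        + b"\x01\x00"            # one compression method: null
        + struct.pack("!H", len(extensions)) + extensions
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body  # type 1 = ClientHello
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake


def extract_sni(segment: bytes) -> Optional[str]:
    """Return the SNI hostname found in a single segment, or None if it cannot be parsed."""
    try:
        if segment[0] != 0x16 or segment[5] != 0x01:
            return None                      # not a handshake record / not a ClientHello
        pos = 5 + 4                          # record header + handshake header
        pos += 2 + 32                        # version + random
        pos += 1 + segment[pos]              # session_id
        (cipher_len,) = struct.unpack_from("!H", segment, pos)
        pos += 2 + cipher_len                # cipher suites
        pos += 1 + segment[pos]              # compression methods
        (ext_total,) = struct.unpack_from("!H", segment, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", segment, pos)
            pos += 4
            if ext_type == SNI_EXTENSION_TYPE:
                # Skip list length (2 bytes) and name type (1 byte), then read the name.
                (name_len,) = struct.unpack_from("!H", segment, pos + 3)
                name = segment[pos + 5 : pos + 5 + name_len]
                if len(name) != name_len:
                    return None              # hostname is cut off in this segment
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                          # truncated or malformed input
    return None
```

Note that the parser returns None whenever the segment ends before the hostname does. A classifier built this way has nothing to match against if the ClientHello is split across segments, which is the weakness the fragmentation tools rely on.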

A flow that passes initial inspection is typically marked as allowed in the DPI box’s connection tracking table. Subsequent packets for the same TCP four-tuple receive minimal or no re-inspection. This optimization is fundamental to achieving line-rate throughput, and it is also the property that the fake ClientHello attack exploits.

The Established Toolkit

Two dominant approaches to defeating SNI-based blocking have been in use for several years.

Fragmentation tools like GoodbyeDPI on Windows and zapret on Linux intercept outbound packets using WinDivert or Linux’s NFQUEUE, split the TLS ClientHello across multiple TCP segments so the SNI field does not appear in the first segment, and reinject the pieces. Many DPI boxes inspect only the first segment and never see the SNI.
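The core of the fragmentation idea fits in a few lines of Python. This is only a sketch of where to cut: the real tools operate on live packets through WinDivert or NFQUEUE and also have to fix up TCP headers, which is omitted here. The function name `split_inside_sni` and the toy payload are assumptions made for the example.

```python
from typing import List


def split_inside_sni(payload: bytes, hostname: str) -> List[bytes]:
    """Split a ClientHello payload so that no single piece contains the full hostname.

    The cut is placed in the middle of the hostname bytes. If the hostname is
    not present, the payload is returned unsplit.
    """
    name = hostname.encode("ascii")
    idx = payload.find(name)
    if idx < 0:
        return [payload]
    cut = idx + len(name) // 2
    return [payload[:cut], payload[cut:]]


# Toy stand-in for a ClientHello: some leading bytes, the hostname, some trailing bytes.
toy_payload = bytes(70) + b"blocked.example" + bytes(40)
first, second = split_inside_sni(toy_payload, "blocked.example")
```

A DPI box that matches hostnames against `first` alone finds only half of the name. A box that reassembles the stream sees `first + second`, which is the original payload, so the trick does nothing against it.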

The architectural cost of this approach is significant: packets must be dequeued from the kernel’s netfilter subsystem, processed in userspace, and reinjected via raw sockets. There is CPU overhead, potential for sequence number edge cases, and a dependency on iptables rules and persistent process management. More critically, it fails against stateful DPI systems that perform full TCP stream reassembly before inspection. China’s GFW has done this since at least 2020.

The second approach, Encrypted Client Hello (ECH), is architecturally correct: it encrypts the inner ClientHello entirely using a public key the server publishes in its DNS HTTPS record. The outer ClientHello carries only a generic public-facing SNI. The DPI box sees traffic addressed to a CDN frontend, not the blocked target. Cloudflare has deployed ECH in production. The problem is that ECH requires server-side support and DNS infrastructure, and censors have responded by blocking all ESNI and ECH traffic wholesale. Russia has done this; China blocks it at the DNS layer. ECH will be the durable solution, but it cannot be deployed unilaterally by a client today.

The Two-Pronged Strategy in gecit

gecit does not simply fragment; it poisons the DPI state machine with a decoy handshake before the real one arrives.

When a process initiates a TCP connection to port 443, the kernel completes the three-way handshake. At that point, BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB fires in gecit’s sock_ops eBPF program. This callback runs on the client (active) side of a newly established TCP connection, before any application data has been sent. The application’s TLS stack has not yet written the real ClientHello.

gecit uses this window to inject a fabricated ClientHello through a raw socket. The fake packet uses the same TCP four-tuple as the real connection and contains a ClientHello with a spoofed SNI pointing to a non-blocked domain. Crucially, this packet carries a deliberately low TTL.

The TTL is calibrated so the fake packet travels far enough to reach the ISP’s DPI infrastructure, which typically sits within a few router hops at the edge of the access network, but expires before reaching the destination server. The DPI box receives what looks like a valid TLS ClientHello destined for an allowed hostname, records the flow as permitted in its connection tracking table, and advances its state machine accordingly. The fake packet then dies and never arrives at the server.

The real ClientHello follows almost immediately, carrying the actual SNI of the blocked domain. The DPI box, having already classified this TCP flow as allowed, either applies no further inspection or treats re-inspection as redundant and skips it. The handshake with the server completes normally.
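The property being exploited can be modelled in a few lines of Python. This is a toy model of a first-segment classifier with a connection tracking table, under the assumption that the verdict is cached per four-tuple and never revisited. It does not describe any specific product, and the class name and hostnames are invented for the example.

```python
from typing import Dict, Iterable, Tuple

FourTuple = Tuple[str, int, str, int]  # (src_ip, src_port, dst_ip, dst_port)


class FirstSegmentDPI:
    """Toy DPI: classify a flow on its first ClientHello, then trust the cached verdict."""

    def __init__(self, blocked: Iterable[str]) -> None:
        self.blocked = set(blocked)
        self.flows: Dict[FourTuple, str] = {}

    def observe(self, flow: FourTuple, sni: str) -> str:
        cached = self.flows.get(flow)
        if cached is not None:
            return cached                      # no re-inspection for known flows
        verdict = "block" if sni in self.blocked else "allow"
        self.flows[flow] = verdict
        return verdict


dpi = FirstSegmentDPI(blocked={"blocked.example"})
flow = ("192.0.2.10", 50000, "198.51.100.7", 443)

decoy_verdict = dpi.observe(flow, "allowed.example")   # low-TTL decoy arrives first
real_verdict = dpi.observe(flow, "blocked.example")    # real ClientHello, same four-tuple
```

In this model both calls return "allow", because the second lookup hits the cached entry. A fresh flow that sends the blocked SNI first is classified as "block", which is the behaviour the decoy is designed to avoid.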

As a secondary layer, gecit also clamps the TCP MSS for the connection. Calling bpf_setsockopt(skops, SOL_TCP, TCP_MAXSEG, &small_value, sizeof(small_value)) from within the sock_ops program causes the kernel’s TCP stack to segment subsequent sends from this socket into chunks no larger than the specified value. With an MSS of something like 10 to 40 bytes, a 400-byte TLS ClientHello arrives as 10 to 40 TCP segments. Any DPI box that does not reassemble the stream before inspection sees nothing coherent in the first segment. This fragmentation protection is useful against infrastructure that might attempt re-inspection despite its connection tracking state.
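The segment counts quoted above are simple ceiling division, which a short Python sketch can confirm. It models only the arithmetic of cutting a byte stream into MSS-sized chunks, not the kernel's actual segmentation logic, and `segment_payload` is a name chosen for this example.

```python
from typing import List


def segment_payload(payload: bytes, mss: int) -> List[bytes]:
    """Cut a payload into consecutive chunks of at most `mss` bytes."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i : i + mss] for i in range(0, len(payload), mss)]


client_hello = bytes(400)                       # stand-in for a 400-byte ClientHello
segments_at_40 = segment_payload(client_hello, 40)
segments_at_10 = segment_payload(client_hello, 10)
```

With a 400-byte payload, an MSS of 40 yields 10 segments and an MSS of 10 yields 40, matching the range given in the text.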

BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB as a Precision Trigger

It is worth understanding why the eBPF sock_ops hook is the right mechanism here rather than, say, NFQUEUE or a TC egress hook.

BPF_PROG_TYPE_SOCK_OPS was introduced in Linux 4.13 by Lawrence Brakmo at Facebook for TCP performance engineering. The original use cases were things like: detect a connection to a specific remote subnet and apply a custom congestion control algorithm, set initial congestion window sizes per traffic class, or configure ECN behavior at scale. The hook was designed for observing and tuning TCP connection lifecycle events in-kernel with zero per-packet overhead except on the specific events of interest.

Attached to a cgroup via BPF_CGROUP_SOCK_OPS, the program fires for every TCP socket created by processes in that cgroup hierarchy. The struct bpf_sock_ops context at the ACTIVE_ESTABLISHED callback provides the remote IP, remote port, the local port, and the is_fullsock flag indicating whether the underlying socket is accessible via bpf_setsockopt. This is sufficient to identify port-443 connections and apply MSS clamping in-kernel before the first byte of application data moves.

For the actual packet injection, gecit cannot operate entirely within eBPF. Constructing and sending an arbitrary raw IP packet with a custom TTL and spoofed payload is not something a sock_ops program can do directly. The eBPF program instead signals userspace with the connection’s four-tuple, likely via a BPF perf event buffer, a BPF ring buffer, or a shared map. A userspace goroutine receives this event, constructs the fake ClientHello with the spoofed SNI, sets the desired TTL on the raw socket with IP_TTL, and sends it. The window between TCP establishment and the application’s first write is narrow, but it is usually sufficient in practice: the TLS library cannot begin the handshake until the connection is established and control has returned to the application.
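The userspace half of that hand-off can be sketched in Python. Everything about the event layout here is an assumption: `EVENT_FORMAT` is a hypothetical 12-byte record in network byte order, not gecit's actual wire format. The second function builds an IPv4 header by hand to show where the TTL lives; a real injector would set IP_TTL on a raw socket as described above, which needs elevated privileges and is left out.

```python
import socket
import struct
from typing import NamedTuple


class FlowEvent(NamedTuple):
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


# Hypothetical event layout: local IPv4, local port, remote IPv4, remote port.
EVENT_FORMAT = "!4sH4sH"


def decode_event(raw: bytes) -> FlowEvent:
    """Decode one four-tuple event as it might arrive from the kernel side."""
    local_ip, local_port, remote_ip, remote_port = struct.unpack(EVENT_FORMAT, raw)
    return FlowEvent(
        socket.inet_ntoa(local_ip), local_port, socket.inet_ntoa(remote_ip), remote_port
    )


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones' complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ttl: int) -> bytes:
    """Build a 20-byte IPv4 header for a TCP payload with an explicit TTL."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                   # version 4, header length 5 words
        0,                      # DSCP/ECN
        20 + payload_len,       # total length
        0,                      # identification
        0x4000,                 # flags: don't fragment
        ttl,                    # the value the decoy depends on
        socket.IPPROTO_TCP,     # protocol 6
        0,                      # checksum placeholder
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )
    checksum = internet_checksum(header)
    return header[:10] + struct.pack("!H", checksum) + header[12:]
```

The TTL sits at byte offset 8 of the header, and the header checksum covers it, so the checksum has to be computed after the TTL is chosen.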

This division of responsibility is clean. eBPF handles observation with minimal overhead. Raw socket injection handles the construction of non-standard packets the kernel’s TCP stack would never generate on its own.

The TTL Calibration Problem

The TTL-based poisoning attack is the most fragile component of gecit’s strategy. Getting the value right requires knowing the hop count to the ISP’s DPI infrastructure for a given network path. That varies by ISP, by access technology, and sometimes by geographic region within the same ISP.

Tools in this space typically start with a TTL of 5 or 6 as a reasonable guess for residential access networks, where the DPI box is usually at or near the DSLAM or CMTS. A TTL that is too high lets the fake packet reach the destination server, which receives an unexpected ClientHello on the already-established connection and may respond with a RST or otherwise mishandle the real handshake that follows. A TTL that is too low causes the packet to expire before reaching the DPI box, wasting the injection without poisoning anything.
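The three outcomes can be written down as a small Python function. The model is a simplification: it assumes a device at hop distance h sees the packet whenever the initial TTL is at least h, and it treats the DPI box and the server as sitting at known hop distances, which in reality have to be guessed or measured. The function names are invented for this sketch.

```python
def decoy_outcome(ttl: int, dpi_hop: int, server_hop: int) -> str:
    """Classify what happens to a decoy sent with the given initial TTL."""
    if dpi_hop >= server_hop:
        raise ValueError("model assumes the DPI box is closer than the server")
    if ttl < dpi_hop:
        return "expired_before_dpi"   # injection wasted, nothing poisoned
    if ttl >= server_hop:
        return "reached_server"       # server sees the decoy, handshake at risk
    return "poisoned_dpi"             # seen by the DPI box, dead before the server


def usable_ttls(dpi_hop: int, server_hop: int) -> range:
    """All initial TTL values that land in the useful window under this model."""
    return range(dpi_hop, server_hop)
```

With the DPI box at hop 4 and the server at hop 12, any TTL from 4 through 11 works in this model, so a default of 5 or 6 succeeds. The window shrinks quickly when the server is nearby, for example a CDN node hosted inside the ISP's own network.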

GoodbyeDPI’s Windows implementation offers a different answer to the same problem: it can inject a fake packet with a deliberately wrong TCP checksum. Some DPI boxes parse the packet without verifying the checksum, while the destination’s TCP stack discards it. This avoids the TTL calibration problem but depends on the DPI box not validating checksums, which is not universally true.
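The checksum variant is easy to demonstrate in Python. This sketch builds a bare TCP segment (no options, PSH and ACK set), computes the real checksum over the IPv4 pseudo-header, and optionally flips one bit of it. It is an illustration of the mechanism, not GoodbyeDPI's code, and the helper names are assumptions.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones' complement checksum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src: str, dst: str, tcp_len: int) -> bytes:
    """IPv4 pseudo-header that the TCP checksum covers in addition to the segment."""
    return socket.inet_aton(src) + socket.inet_aton(dst) + struct.pack("!BBH", 0, 6, tcp_len)


def build_tcp_segment(src: str, dst: str, sport: int, dport: int,
                      seq: int, ack: int, payload: bytes, corrupt: bool = False) -> bytes:
    """Build a 20-byte TCP header plus payload, optionally with a broken checksum."""
    offset_flags = (5 << 12) | 0x018          # data offset 5 words, PSH + ACK
    header = struct.pack("!HHIIHHHH", sport, dport, seq, ack, offset_flags, 65535, 0, 0)
    checksum = internet_checksum(pseudo_header(src, dst, len(header) + len(payload))
                                 + header + payload)
    if corrupt:
        checksum ^= 0x0001                    # flip one bit so verification fails
    return header[:16] + struct.pack("!H", checksum) + header[18:] + payload


def tcp_checksum_ok(src: str, dst: str, segment: bytes) -> bool:
    """What a receiving TCP stack checks before accepting the segment."""
    return internet_checksum(pseudo_header(src, dst, len(segment)) + segment) == 0
```

An endpoint that runs the equivalent of `tcp_checksum_ok` drops the corrupted segment, while a DPI box that skips the check still reads the payload.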

Combining TTL manipulation with MSS fragmentation gives gecit a fallback: if the TTL is miscalibrated and the poisoning fails, the fragmentation still degrades the DPI’s ability to extract the SNI from the first segment.

DNS as the Complementary Channel

SNI inspection is only one of two places where a DPI box can identify a blocked target. Before TCP even begins, the client issues a DNS query for the hostname. Standard DNS over UDP port 53 is plaintext and trivially inspectable. Many censorship systems block at the DNS layer in addition to or instead of the TLS layer, using either query filtering or response tampering.

gecit includes a built-in DoH resolver to address this. DNS over HTTPS (RFC 8484) encapsulates DNS queries in standard HTTPS POST or GET requests to a known resolver endpoint. From the network perspective, this traffic looks like ordinary HTTPS to Cloudflare’s or Google’s infrastructure. The DNS query content is encrypted within the TLS session to the resolver.
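For reference, this is roughly what the GET form of an RFC 8484 request looks like when assembled in Python. The sketch only builds the DNS wire-format query and the URL; it performs no network I/O and says nothing about how gecit's resolver is implemented. The resolver URL `https://dns.example/dns-query` is a placeholder.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1) -> bytes:
    """Build a wire-format DNS query (default type A, class IN) with ID 0."""
    # RFC 8484 recommends an ID of 0 so identical queries produce cache-friendly URLs.
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)   # RD flag set, one question
    labels = hostname.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def doh_get_url(resolver_url: str, hostname: str) -> str:
    """Encode the query as unpadded base64url in the `dns` parameter."""
    wire = build_dns_query(hostname)
    encoded = base64.urlsafe_b64encode(wire).rstrip(b"=").decode("ascii")
    return f"{resolver_url}?dns={encoded}"


url = doh_get_url("https://dns.example/dns-query", "www.example.com")
```

For `www.example.com` the encoded parameter is `AAABAAABAAAAAAAAA3d3dwdleGFtcGxlA2NvbQAAAQAB`, the same value used in the RFC's own example. An on-path observer sees only a TLS connection to the resolver; the queried name travels inside it.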

Without DoH, a tool that successfully hides the SNI from DPI inspection still leaks the target hostname in the preceding DNS query. Both channels must be covered for the bypass to be complete against a system that filters on both.

Limitations and Where This Fits

gecit operates in a specific niche: ISP-deployed DPI infrastructure that makes allow or block decisions based on the first observed TCP data segment and then relies on connection tracking state for subsequent packets. This describes a substantial fraction of censorship infrastructure in practice, because full stream reassembly at backbone scale is expensive and many deployed appliances do not do it.

It will not work against China’s GFW in its current form, which performs stateful TCP stream reassembly and is not deceived by either fragmentation or TTL-based packet injection. It will not work against systems that re-inspect each packet independently of connection tracking state. It operates in the same effectiveness band as GoodbyeDPI and zapret for ISP-level Sandvine-class infrastructure, with the architectural advantage of an in-kernel trigger and no NFQUEUE process dependency.

The tool is also purely client-side, which matters. Tools that require proxy infrastructure, VPN servers, or coordinated external endpoints are subject to server-side blocking and require ongoing operational maintenance. gecit requires only that the kernel on the client machine supports eBPF sock_ops, which has been true since Linux 4.13, and that the process has sufficient privileges for raw socket access.

The longer-term resolution to the SNI inspection problem is ECH, where the attacked field simply does not exist in plaintext by construction. But ECH adoption depends on server-side deployment, and it is actively blocked in precisely the environments where it is most needed. In that gap, client-side tools at the kernel level will keep finding new uses for the same infrastructure that network performance engineers built for entirely different purposes.
