The Arms Race Below the OS: Kernel Anti-Cheats, DMA Hardware, and Why Software Alone Can't Win
Source: hackernews
The premise of kernel anti-cheat is straightforward: if your security code runs at a higher privilege level than the cheat code, you win. Vanguard, EasyAntiCheat, and BattlEye all stake their legitimacy on this logic. They load kernel drivers at ring 0, intercept process creation, scan memory, and hook the operating system at a level that user-space cheats cannot reach. For the most part, it works. But there is a ceiling to what software-level protection can achieve, and that ceiling is the PCIe bus.
A detailed breakdown of how kernel anti-cheats function architecturally has been making the rounds, and the Hacker News thread it spawned is worth reading alongside it. The article is a solid primer covering the Windows ring model, kernel callbacks, and memory integrity checks. What it leaves open is the adversarial response to all of that: DMA-based hardware cheats, which operate outside the software trust model entirely.
The Windows Privilege Model and Why Ring 0 Matters
Windows runs user processes at ring 3 and the kernel at ring 0. The CPU enforces this at the hardware level. Code at ring 3 cannot read arbitrary physical memory, cannot modify kernel data structures, and cannot intercept system calls directly. Anti-cheat systems exploit this asymmetry by placing their detection logic in ring 0, where they have full visibility into what any ring-3 process does.
The kernel driver registers callbacks that fire on specific OS events. PsSetCreateProcessNotifyRoutine gives the driver a notification on every process creation and termination. PsSetLoadImageNotifyRoutine fires whenever a module is loaded into any process. ObRegisterCallbacks lets the driver intercept handle operations: when a cheat tries to call OpenProcess on the game process to read its memory, the anti-cheat can deny or downgrade that handle before it ever reaches user space. These three APIs form the backbone of most behavioral detection in production anti-cheat systems.
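The handle-downgrade step is easy to model outside the kernel. The sketch below simulates what an ObRegisterCallbacks pre-operation callback does to a cheat's OpenProcess request: the access-right constants are the real Windows values, but the filter function and its policy are illustrative, not any vendor's actual code.

```python
# Simulation of ObRegisterCallbacks-style handle downgrading.
# The constants below are the real Windows process access rights;
# the policy itself is a hypothetical sketch.
PROCESS_VM_OPERATION = 0x0008
PROCESS_VM_READ      = 0x0010
PROCESS_VM_WRITE     = 0x0020
PROCESS_QUERY_LIMITED_INFORMATION = 0x1000

# Rights an anti-cheat would refuse to grant on its protected game process.
DENIED_MASK = PROCESS_VM_OPERATION | PROCESS_VM_READ | PROCESS_VM_WRITE

def downgrade_handle(desired_access: int, target_is_protected: bool) -> int:
    """Return the access mask the requester actually receives."""
    if not target_is_protected:
        return desired_access
    # Strip the memory-access bits before the handle reaches user space.
    return desired_access & ~DENIED_MASK

# A cheat asking for read/write access gets back a handle that can only
# query limited information about the process.
granted = downgrade_handle(
    PROCESS_VM_READ | PROCESS_VM_WRITE | PROCESS_QUERY_LIMITED_INFORMATION,
    target_is_protected=True,
)
```

The cheat's OpenProcess call still "succeeds" from its perspective, which is the point: the returned handle is silently useless for memory reads.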
Memory scanning happens through MmCopyVirtualMemory and direct traversal of the VAD (Virtual Address Descriptor) tree. The VAD tree describes every memory region in a process’s virtual address space, including protection flags and whether a region is file-backed or anonymous committed memory. Injected shellcode and external cheat modules tend to appear in anonymous RWX regions with no backing file. A scanner walks the tree and flags regions that match suspicious profiles.
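A toy model of that scan, assuming a flattened view of the VAD tree. The field names and region values are illustrative stand-ins (the real VAD is a balanced tree of kernel structures); only PAGE_EXECUTE_READWRITE is the genuine Windows protection constant.

```python
# Toy VAD scan: flag private (non-file-backed) regions that are both
# writable and executable. Field names are illustrative, not the real
# VAD layout.
from dataclasses import dataclass

PAGE_EXECUTE_READWRITE = 0x40  # real Windows protection constant

@dataclass
class VadRegion:
    base: int
    size: int
    protection: int
    file_backed: bool

def suspicious_regions(vad_list):
    return [r for r in vad_list
            if not r.file_backed and r.protection == PAGE_EXECUTE_READWRITE]

regions = [
    VadRegion(0x10000000, 0x10000, 0x20, True),   # game .text: RX, file-backed
    VadRegion(0x20000000, 0x08000, 0x04, False),  # heap: RW, anonymous
    VadRegion(0x30000000, 0x02000, 0x40, False),  # anonymous RWX: flagged
]
flagged = suspicious_regions(regions)
```

Real scanners use richer profiles than this single predicate (allocation size, thread start addresses inside the region, known shellcode signatures), but the anonymous-RWX heuristic is the canonical first filter.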
DKOM and Why It No Longer Works
Early cheats fought back through DKOM, Direct Kernel Object Manipulation. The technique involved unlinking a process’s EPROCESS structure from the doubly-linked list that PsActiveProcessHead anchors. Task manager and most kernel enumeration APIs walk that list, so an unlinked process becomes invisible to them. A cheat process could hide itself completely from the OS’s own bookkeeping.
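The unlink trick is just doubly-linked-list surgery. This minimal model (plain Python objects standing in for EPROCESS structures) shows why it works: enumeration walks the list, but the hidden object itself is untouched and keeps running.

```python
# Minimal model of EPROCESS unlinking. Processes sit on a circular
# doubly-linked list anchored at a head node; enumeration walks it.
class Proc:
    def __init__(self, name):
        self.name = name
        self.flink = self.blink = self  # self-linked until inserted

def link(head, proc):
    # Insert at the list tail (just before head).
    proc.flink, proc.blink = head, head.blink
    head.blink.flink = proc
    head.blink = proc

def unlink(proc):
    # Classic DKOM: make the neighbors point past us. The Proc object
    # still exists; it just no longer appears in any list walk.
    proc.blink.flink = proc.flink
    proc.flink.blink = proc.blink

def enumerate_procs(head):
    out, node = [], head.flink
    while node is not head:
        out.append(node.name)
        node = node.flink
    return out

head = Proc("PsActiveProcessHead")
cheat = Proc("cheat.exe")
for p in (Proc("game.exe"), cheat, Proc("explorer.exe")):
    link(head, p)

unlink(cheat)  # cheat.exe vanishes from enumeration
visible = enumerate_procs(head)
```

In the kernel, the scheduler holds its own references to threads, so the unlinked process keeps getting CPU time despite being invisible to PsActiveProcessHead walkers.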
PatchGuard (Kernel Patch Protection), introduced with Windows Vista x64, ended most of this. It periodically checksums critical kernel structures including the EPROCESS list, the SSDT (System Service Descriptor Table), and IDT entries. Any unauthorized modification triggers a 0x109 CRITICAL_STRUCTURE_CORRUPTION bug check and forces an immediate reboot. PatchGuard runs on randomized timers with encrypted state, making it difficult to disable without that action itself being detectable.
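The detect-then-crash contract at PatchGuard's core reduces to a periodic checksum comparison. The sketch below models only that contract; real PatchGuard uses encrypted contexts, multiple check routines, and randomized DPC timers, none of which are reproduced here.

```python
# Sketch of PatchGuard's core idea: checksum a protected structure at
# a baseline, recheck later, and bug-check on any mismatch.
import zlib

CRITICAL_STRUCTURE_CORRUPTION = 0x109  # the real bug check code

def snapshot(structure: bytes) -> int:
    return zlib.crc32(structure)

def integrity_check(structure: bytes, baseline: int):
    if zlib.crc32(structure) != baseline:
        # The kernel would call KeBugCheckEx(0x109, ...) here and reboot.
        raise SystemError(hex(CRITICAL_STRUCTURE_CORRUPTION))

ssdt = bytearray(64)                    # stand-in for a protected table
baseline = snapshot(bytes(ssdt))
integrity_check(bytes(ssdt), baseline)  # clean pass: no exception

ssdt[8] = 0xCC                          # a hook patches one entry
tripped = False
try:
    integrity_check(bytes(ssdt), baseline)
except SystemError:
    tripped = True
```

The randomized timing is what makes the scheme hard to race: a cheat cannot patch, act, and restore in a window it can predict.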
HVCI (Hypervisor-Protected Code Integrity) pushes the enforcement boundary even further. It uses VT-x to run a thin hypervisor below the Windows kernel and marks kernel code pages read-only from the hypervisor level. Unsigned or tampered kernel code simply cannot execute on a properly configured system with HVCI enabled, because the hypervisor controls the page table entries and will not grant execute permission to pages it has not validated. Combined with Driver Signature Enforcement, loading an unsigned kernel driver on a current Windows system requires either disabling protections (which is itself detectable) or compromising a legitimate signing certificate.
The DMA Problem
PCIe DMA, Direct Memory Access, allows hardware devices to read and write system RAM directly, bypassing the CPU and operating system. This is how GPUs move framebuffer data, how NVMe drives achieve high-throughput I/O, and how network cards deliver packets. It is also the mechanism behind a class of cheat hardware that fundamentally defeats software-level detection.
FPGA boards like those built around the pcileech-fpga project plug into an M.2 or PCIe slot on a second machine. A PCIe interconnect cable bridges the two systems. The FPGA presents itself as a legitimate PCIe device to the victim machine and performs DMA reads against arbitrary physical addresses. The pcileech software stack on the attacker side translates virtual addresses to physical ones by reading the page table structures directly from memory, then extracts whatever it needs: game state, entity positions, the entire process heap.
From the operating system’s perspective on the victim machine, nothing unusual happened. No process opened a handle to the game. No kernel callback fired. The memory was read by hardware directly over the bus, and the OS was not involved at any point in the transaction. Anti-cheats built entirely on behavioral monitoring at the OS level have no signal to work with here.
How Anti-Cheats Try to Detect DMA Hardware
The detection landscape for DMA attacks is less mature than the software side, but active development is happening.
One approach is cache timing analysis. DMA reads over a PCIe interconnect introduce memory bus contention and latency that would not appear from local memory access patterns. Anti-cheats can probe whether their own memory has been read by external hardware using CPU cache timing side-channels. If game state data is evicted from cache in a pattern inconsistent with local access, that is a signal worth investigating. This is conceptually similar to the memory timing side-channels used in hardware security research like Rowhammer.
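A toy version of the decision logic, with the hardware measurement replaced by canned cycle counts: real probes time reloads with RDTSC and need careful calibration, so the threshold and sample values here are illustrative assumptions, not measured numbers.

```python
# Toy classifier for a cache-timing probe. The interesting part is the
# decision step; the "measurements" are canned cycle counts, and the
# threshold is an illustrative assumption, not a calibrated value.
import statistics

L1_HIT_CYCLES = 40  # assumed typical hit latency, in cycles

def looks_evicted(samples_cycles) -> bool:
    # If the median reload time of data we just touched is far above a
    # cache hit, something else pulled the lines out from under us.
    return statistics.median(samples_cycles) > 3 * L1_HIT_CYCLES

quiet_run = [38, 41, 40, 39, 42, 40]       # local access only
noisy_run = [180, 210, 40, 195, 220, 205]  # external-reader contention
```

Using the median rather than the mean keeps a single interrupt-induced outlier in an otherwise quiet run from producing a false positive.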
The more architecturally sound approach is IOMMU enforcement. Modern CPUs include an IOMMU (Intel VT-d on Intel platforms, AMD-Vi on AMD) that mediates PCIe DMA the same way the MMU mediates CPU memory access. The IOMMU restricts which physical memory regions a given PCIe device can access. A rogue PCIe device operating outside its mapped region faults. If the system enables and configures IOMMU protection strictly, a foreign DMA board cannot read arbitrary RAM. Some anti-cheat vendors have begun advocating for IOMMU as part of their hardware requirements, though enforcement remains inconsistent across OEM configurations.
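The enforcement model is simple to state: each device gets a set of mapped physical ranges, and DMA outside any of them faults instead of completing. A minimal simulation of that contract (the class and device names are invented for illustration):

```python
# Model of IOMMU mediation: per-device DMA mappings, with accesses
# outside any mapped range raising a fault instead of completing.
class IommuFault(Exception):
    pass

class Iommu:
    def __init__(self):
        self.mappings = {}  # device id -> list of (base, size) ranges

    def map_range(self, dev: str, base: int, size: int):
        self.mappings.setdefault(dev, []).append((base, size))

    def dma_read(self, dev: str, addr: int, ram: dict) -> int:
        for base, size in self.mappings.get(dev, []):
            if base <= addr < base + size:
                return ram.get(addr, 0)
        raise IommuFault(f"{dev}: unmapped DMA to {addr:#x}")

ram = {0x1000: 0x41, 0x9000: 0x7F}      # sparse "physical memory"
iommu = Iommu()
iommu.map_range("nic", 0x1000, 0x1000)  # NIC may only touch its ring buffer

ok = iommu.dma_read("nic", 0x1000, ram)  # inside the mapping: succeeds
blocked = False
try:
    iommu.dma_read("nic", 0x9000, ram)   # rogue read of game memory: faults
except IommuFault:
    blocked = True
```

The real translation goes through per-device page tables (Intel VT-d context entries, AMD-Vi device tables) rather than range lists, but the outcome is the same: unmapped means faulted, not read.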
The DMA hardware ecosystem responds to IOMMU by spoofing device identity. FPGA firmware kits now ship with profiles that emulate specific known-good devices, copying PCI vendor IDs, device IDs, and subsystem IDs from legitimate network cards or storage controllers. If the spoofed device’s identity is trusted, it may inherit the IOMMU mapping of its legitimate counterpart, depending on policy. The arms race has descended to firmware.
Hypervisor Detection
A separate escalation layer involves running the cheat inside a hypervisor beneath the OS. The anti-cheat perceives a normal Windows environment inside its guest VM; the hypervisor below intercepts and can modify the results of any kernel API, memory scan, or timing measurement the anti-cheat performs. This is conceptually the same as the rootkit-via-hypervisor techniques Blue Pill demonstrated in 2006, now applied to game cheating.
Detection relies on side effects the hypervisor cannot fully suppress. On virtualized platforms, CPUID sets the hypervisor-present bit (leaf 1, ECX bit 31) and leaf 0x40000000 returns a vendor string: VMware returns “VMwareVMware”, Hyper-V returns “Microsoft Hv”. On bare metal that bit is clear and the leaf returns no meaningful vendor string. RDTSC timing is another signal. A hypervisor exit carries overhead, and instruction sequences that force VM exits produce timing anomalies measurable with enough samples across multiple calibration runs. Vanguard’s requirement for Secure Boot and TPM 2.0 is partly motivated by these concerns: attestation primitives make running the OS inside an undetected hypervisor substantially harder because the boot chain is cryptographically verified before the anti-cheat even initializes.
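The vendor string comes back packed little-endian across EBX, ECX, and EDX, four bytes per register. The decode is shown below; since Python cannot issue CPUID directly, the register values are built from the known strings rather than read from hardware.

```python
# Decoding the CPUID hypervisor vendor leaf (0x40000000): the 12-byte
# vendor string is packed little-endian across EBX, ECX, EDX. Register
# values here are synthesized from the known strings, since Python
# cannot execute CPUID itself.
import struct

def vendor_from_regs(ebx: int, ecx: int, edx: int) -> str:
    # Each register contributes 4 bytes, least-significant byte first.
    return struct.pack("<III", ebx, ecx, edx).decode("ascii")

def regs_from_vendor(s: str):
    return struct.unpack("<III", s.encode("ascii"))

ebx, ecx, edx = regs_from_vendor("VMwareVMware")
detected = vendor_from_regs(ebx, ecx, edx)
```

A cheat-hosting hypervisor can of course lie in this leaf, which is why timing-based checks and boot-chain attestation matter more than the string itself.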
The Driver Deployment Model
The actual deployment of a kernel anti-cheat follows a consistent pattern across vendors. The driver is signed with a valid code signing certificate (kernel-mode signing has been mandatory on x64 Windows since Vista, and current Windows versions additionally require Microsoft attestation signing), loads at ring 0 via the Service Control Manager, registers its kernel callbacks, and opens a communication channel to its user-mode service component through DeviceIoControl IOCTLs. The user-mode service handles server-side telemetry, update delivery, and ban enforcement logic, while the kernel component performs local behavioral enforcement.
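The IOCTL codes passed through DeviceIoControl are built with the Windows CTL_CODE macro; the sketch below reproduces its bit layout. The constants are the real Windows values, while the IOCTL_SCAN_REQUEST name is a hypothetical example of a private driver interface, not any vendor's actual code.

```python
# The Windows CTL_CODE macro's bit layout, reproduced in Python.
# FILE_DEVICE_UNKNOWN with a vendor-defined function number is the
# typical shape for a private driver/service interface.
FILE_DEVICE_UNKNOWN = 0x22
METHOD_BUFFERED     = 0
FILE_ANY_ACCESS     = 0

def ctl_code(device_type: int, function: int, method: int, access: int) -> int:
    return (device_type << 16) | (access << 14) | (function << 2) | method

# Vendor-defined function numbers start at 0x800.
IOCTL_SCAN_REQUEST = ctl_code(FILE_DEVICE_UNKNOWN, 0x800,
                              METHOD_BUFFERED, FILE_ANY_ACCESS)
```

The user-mode service passes codes like this to DeviceIoControl, and the driver's IRP_MJ_DEVICE_CONTROL handler switches on them to dispatch scan requests and return results.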
Vanguard departs from the typical model by loading at system boot rather than at game launch. This guarantees the driver starts before any potential kernel-level tampering and gives it visibility over the boot process itself. The design generated substantial criticism when it shipped with Valorant in 2020, since a continuously-running ring-0 driver with boot persistence represents real attack surface: a vulnerability in the anti-cheat driver itself becomes a privilege escalation path on every machine where it runs. This tension between coverage and risk is not unique to Vanguard; it is a structural property of the kernel anti-cheat model.
The Structural Limit
Kernel-level anti-cheat is not a solved problem, and more aggressive ring-0 access does not solve it. The DMA hardware problem is architectural: with physical machine access and a PCIe slot, software protections are insufficient without hardware cooperation. IOMMU enforcement, TPM attestation, and Secure Boot move the trust boundary to firmware and silicon, which is the right direction, but they depend on OEM implementation quality that varies significantly across consumer hardware.
Server-side validation, keeping authoritative game state on the server and treating all client-reported data as untrusted input, is the structurally sounder defense. It is also old wisdom in this space. Many game designs make it expensive or impractical due to latency and infrastructure requirements. Until that changes, kernel anti-cheats will remain the dominant approach, DMA hardware will continue evolving to evade them, and the boundary between software security and hardware security will stay the genuinely interesting frontier in this space.