The VxD Problem: Why Win9x Compatibility Requires More Than API Translation
Source: hackernews
A new project called Windows 9x Subsystem for Linux appeared on Hacker News this week with 784 points and substantive technical discussion, which is unusual for what might otherwise look like a niche retrocomputing curiosity. The thread attracted serious attention because anyone who has thought carefully about Win9x compatibility knows the problem is hard in ways that are not obvious from the outside.
The surface-level story is that Wine does not handle Windows 9x well. That is true, but saying so explains nothing. The deeper story is about a specific design decision Microsoft made in the 1990s that had enormous consequences for anyone building a compatibility layer thirty years later: VxDs.
Virtual Device Drivers and the Myth of Ring Protection
Modern operating systems enforce a strict boundary between kernel code and user code using the x86 protection rings. Kernel code runs at Ring 0 with full hardware access. User applications run at Ring 3 and must ask the kernel for anything privileged through a defined system call interface. This boundary is the foundation of OS security and stability.
Windows 9x enforced this boundary in name but not in practice. Its driver model, the Virtual Device Driver or VxD, allowed system components to run at Ring 0, which is expected for drivers. What was not expected, from a modern perspective, is that ordinary applications could call into VxDs directly.
The mechanism was a software interrupt, INT 20h followed by an inline service identifier (the convention VxDs themselves used for dynamic linking), or a far call to an API entry point that the VxD exposed to protected-mode and V86 code. Applications that needed to talk to hardware abstraction layers, file system drivers, or multimedia components could bypass the Win32 API entirely and issue VxD calls. This was not an obscure backdoor; it was documented and widely used. DirectX 1 through 5 relied on VxDs. Many games of the era used VWIN32.VXD to perform Win32 operations from within context-switched callbacks. IFSMGR.VXD handled filesystem operations in ways that normal file I/O did not.
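To make the INT 20h form concrete, here is a minimal sketch of how a tool might recognize one such call site in raw code bytes. It assumes the commonly described encoding: the two-byte INT 20h instruction followed by a little-endian DWORD whose high word is the device ID and whose low word is the service number, with bit 15 of the low word marking a jump rather than a call. The function name, the tuple type, and the small device ID table are illustrative assumptions, not taken from the project.

```python
import struct
from typing import NamedTuple, Optional

INT_OPCODE = 0xCD   # x86 opcode for "INT imm8"
VXD_VECTOR = 0x20   # vector used by the VxD dynamic-link convention

# Illustrative subset of device IDs (an assumption to verify against DDK headers).
DEVICE_NAMES = {0x0001: "VMM", 0x002A: "VWIN32", 0x0040: "IFSMGR"}


class VxDCallSite(NamedTuple):
    device_id: int   # high word of the inline DWORD
    service: int     # low word with the jump flag masked off
    is_jump: bool    # bit 15 of the low word
    length: int      # bytes occupied by INT 20h plus the DWORD


def decode_vxd_call(code: bytes, offset: int = 0) -> Optional[VxDCallSite]:
    """Decode an INT 20h + DWORD sequence at code[offset:], or return None."""
    if len(code) - offset < 6:
        return None
    if code[offset] != INT_OPCODE or code[offset + 1] != VXD_VECTOR:
        return None
    (ident,) = struct.unpack_from("<I", code, offset + 2)
    low = ident & 0xFFFF
    return VxDCallSite(ident >> 16, low & 0x7FFF, bool(low & 0x8000), 6)


# Hypothetical call site: device 0x002A, service 3.
site = decode_vxd_call(bytes([0xCD, 0x20, 0x03, 0x00, 0x2A, 0x00]))
```

Here `site.device_id` is 0x2A and `site.service` is 3. A real subsystem would need a proper disassembler to find call sites, since the byte pair CD 20 can also occur inside unrelated instructions or data.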
For a compatibility layer, this creates a structural problem. Wine intercepts Windows API calls at the DLL import level. When an application loads and calls ReadFile, Wine catches that call before it reaches any Windows code and redirects it to a Linux read syscall. This works because NT applications are constrained to system calls through ntdll.dll; there is a clean interception point.
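A toy model of that interception point, sketched in Python rather than in Wine's actual C: the loader resolves an imported name to a replacement function, and the replacement forwards to the host's read call. The handle table, `register_fd`, and `resolve_import` are hypothetical names for illustration and do not mirror Wine's internals.

```python
import os

_handles = {}          # hypothetical table: Windows-style HANDLE -> POSIX fd
_next_handle = 0x100   # arbitrary starting value for fake handles


def register_fd(fd: int) -> int:
    """Wrap a POSIX file descriptor in a fake Windows-style handle."""
    global _next_handle
    handle = _next_handle
    _next_handle += 4
    _handles[handle] = fd
    return handle


def ReadFile(handle: int, n_bytes: int):
    """Stand-in for the ReadFile export: returns (success, data)."""
    fd = _handles.get(handle)
    if fd is None:
        return False, b""
    try:
        return True, os.read(fd, n_bytes)   # forwarded to the host read call
    except OSError:
        return False, b""


# The "export table" that the loader consults when binding imports.
EXPORTS = {"KERNEL32.DLL": {"ReadFile": ReadFile}}


def resolve_import(dll: str, name: str):
    """Bind an imported symbol to the replacement implementation."""
    return EXPORTS[dll.upper()][name]
```

Everything the application does through its import table passes through `resolve_import`, which is why this layer is a clean place to substitute behavior. Anything that does not go through an import never reaches it.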
VxD calls have no equivalent interception point. They are either software interrupts or gate calls that go below the DLL level entirely. A compatibility layer that only watches DLL imports will simply not see them.
What a Correct Implementation Requires
Handling VxD calls in user space on Linux requires intercepting software interrupts at the process level. Linux provides SIGSEGV and SIGILL delivery for certain fault conditions: an INT instruction that user code is not permitted to issue raises a general protection fault, which the kernel reports to the process as SIGSEGV. The vm86 system call on 32-bit x86 Linux also allows a process to enter virtual 8086 mode and receive notifications on privilege violations. But the Win9x VxD interface is not a DOS-era INT call into real mode; it is a protected-mode mechanism, and the vm86 path does not cover it.
The realistic approach for a user-space subsystem is to run the Win9x application binary under a sandboxed execution environment that catches all INT instructions and far calls that would trigger VxD dispatch. On modern 64-bit Linux, this means either using ptrace to observe the fault that each such instruction raises, or running the binary under a lightweight hypervisor or sandbox that virtualizes the protection ring behavior.
ptrace works but has overhead. Every VxD call becomes a context switch out of the traced process and into the tracer, which dispatches the call to a stub implementation, followed by a switch back. For applications that make VxD calls infrequently, this is acceptable. For multimedia applications that are hammering audio or graphics VxDs at 60 Hz, the latency accumulates.
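The dispatch step can be simulated without any tracing machinery. The sketch below models only what a tracer would do once it has been told the faulting instruction pointer: read the bytes at that address, look up a stub, run it, and resume past the call site. The register dictionary, the stub table, and the version value are hypothetical, and the INT 20h plus inline DWORD layout is an assumed encoding rather than something confirmed from the project.

```python
import struct


def stub_get_version(regs: dict) -> None:
    # Hypothetical stub: report a made-up version number in EAX.
    regs["eax"] = 0x0400


# (device ID, service number) -> stub implementation; contents are illustrative.
STUBS = {(0x0001, 0x0000): stub_get_version}


def handle_trap(memory: bytes, regs: dict) -> bool:
    """Emulate one trapped INT 20h call site; return False if it is not ours."""
    eip = regs["eip"]
    if len(memory) < eip + 6 or memory[eip:eip + 2] != b"\xCD\x20":
        return False                       # some other fault: let it propagate
    (ident,) = struct.unpack_from("<I", memory, eip + 2)
    stub = STUBS.get((ident >> 16, ident & 0x7FFF))
    if stub is None:
        return False                       # unimplemented service
    stub(regs)
    regs["eip"] = eip + 6                  # skip the INT and its inline DWORD
    return True


# A NOP, then a call to device 0x0001 service 0, then a RET.
memory = b"\x90" + b"\xCD\x20" + struct.pack("<I", 0x00010000) + b"\xC3"
regs = {"eip": 1, "eax": 0}
handled = handle_trap(memory, regs)
```

After the call, `regs["eip"]` is 7 and execution would resume at the RET. In a real tracer each pass through this function is bracketed by two process switches, which is where the cost described above comes from.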
An alternative is KVM-backed execution. By running the Win9x application inside a minimal KVM virtual machine with only the memory and interrupt handling it needs, a subsystem can catch privileged instructions with hardware-assisted VM exits rather than ptrace. The KVM exit overhead for an intercepted instruction is substantially lower than a ptrace round-trip, on the order of single-digit microseconds rather than tens of microseconds. QEMU can use KVM in this way for full-system emulation; its user-mode emulation, used for cross-architecture binary execution, relies on dynamic binary translation instead.
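A back-of-the-envelope calculation shows why the difference matters at 60 Hz. The per-call costs below (30 and 5 microseconds) are picked from within the rough ranges quoted above, and the figure of 200 VxD calls per frame is a purely hypothetical workload, not a measurement.

```python
def per_frame_overhead(calls_per_frame: int, cost_us: float, frame_hz: float = 60.0):
    """Return (microseconds spent per frame, fraction of the frame budget)."""
    frame_budget_us = 1_000_000 / frame_hz
    spent_us = calls_per_frame * cost_us
    return spent_us, spent_us / frame_budget_us


# Assumed workload: 200 intercepted calls per frame at 60 Hz.
ptrace_us, ptrace_share = per_frame_overhead(200, 30.0)   # "tens of microseconds"
kvm_us, kvm_share = per_frame_overhead(200, 5.0)          # "single-digit microseconds"
```

Under these assumptions interception alone consumes about 36% of each 16.7 ms frame with ptrace and about 6% with VM exits. The real numbers depend on the workload and the host, so the value of the exercise is the ratio, not the absolute figures.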
The 16-bit Problem Within the 32-bit Problem
VxDs are not the only complication. Win9x’s 32-bit system libraries are implemented partially in 16-bit code. When a 32-bit application calls many USER32.DLL and GDI32.DLL functions, and some KERNEL32.DLL functions, the runtime thunks down into 16-bit protected mode to use legacy code in USER.EXE, GDI.EXE, and KRNL386.EXE that predates the Win32 era. Microsoft called this a “flat thunk” in its documentation.
A compatibility layer that handles only 32-bit code will crash as soon as one of those thunks fires. The x86 instruction set in 16-bit protected mode uses different operand size defaults, different effective address calculations, and a segmented memory model where pointers are selector:offset pairs rather than flat linear addresses. Running 16-bit code inside a process that expects 32-bit flat addressing requires either a full x86 16-bit interpreter or a JIT that translates 16-bit instructions to equivalent 32-bit flat-address sequences.
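As one example of the different effective address calculations, 16-bit code encodes memory operands with a fixed table of base and index register pairs, and the resulting offset wraps at 64 KB. The sketch below computes that offset for a decoded ModRM byte. It is an illustrative fragment of what an interpreter would need, with hypothetical function and variable names, and it leaves out default segment selection and prefix handling.

```python
# Register combinations selected by the 3-bit r/m field in 16-bit addressing.
RM16 = [("bx", "si"), ("bx", "di"), ("bp", "si"), ("bp", "di"),
        ("si",), ("di",), ("bp",), ("bx",)]


def effective_address_16(mod: int, rm: int, regs: dict, disp: int = 0) -> int:
    """Return the 16-bit offset for a ModRM memory operand.

    mod must be 0, 1, or 2 (mod 3 names a register, not memory).
    disp is the displacement, already sign-extended by the caller.
    """
    if mod not in (0, 1, 2):
        raise ValueError("mod 3 is a register operand")
    if mod == 0 and rm == 6:
        return disp & 0xFFFF          # special case: bare disp16, no BP
    if mod == 0:
        disp = 0                      # no displacement bytes in this form
    total = sum(regs[name] for name in RM16[rm]) + disp
    return total & 0xFFFF             # offsets wrap within the 64 KB segment


regs16 = {"bx": 0xFFF0, "si": 0x0020, "di": 0, "bp": 0x1000}
wrapped = effective_address_16(0, 0, regs16)          # [BX+SI]
with_disp = effective_address_16(1, 6, regs16, -2)    # [BP-2]
```

Here `wrapped` is 0x0010, because BX+SI overflows 16 bits and wraps, and `with_disp` is 0x0FFE. The same ModRM bytes decoded under 32-bit rules would name entirely different registers, which is why a 32-bit-only engine cannot simply run this code.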
DOSEMU2, which runs DOS applications on Linux, has a 16-bit execution engine that can use the vm86 system call for real-mode code where the kernel provides it. That approach does not extend to 16-bit protected mode, where a segment register holds a selector that is resolved through a descriptor table rather than a value that is shifted and added to the offset. A Win9x compatibility layer needs a distinct implementation for protected-mode 16-bit execution.
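The difference in segment semantics is easy to show side by side. In the sketch below, the real-mode function shifts the segment value and adds the offset, while the protected-mode function treats the same 16-bit value as a selector: an index into a descriptor table plus a table-indicator bit and a privilege level. The descriptor type, the dictionary-based tables, and the exception class are simplified, hypothetical stand-ins, and A20 wraparound and access-rights checks are ignored.

```python
from typing import NamedTuple


class Descriptor(NamedTuple):
    base: int    # linear address where the segment starts
    limit: int   # highest valid offset within the segment


class ProtectionFault(Exception):
    """Raised where real hardware would signal a general protection fault."""


def real_mode_linear(segment: int, offset: int) -> int:
    # Real mode: the segment value itself encodes the base (segment * 16).
    return (segment << 4) + offset


def protected_mode_linear(selector: int, offset: int, gdt: dict, ldt: dict) -> int:
    # Protected mode: bits 3..15 index a table, bit 2 picks LDT over GDT,
    # and bits 0..1 carry the requested privilege level (unused here).
    index = selector >> 3
    table = ldt if selector & 0x4 else gdt
    desc = table.get(index)
    if desc is None or offset > desc.limit:
        raise ProtectionFault(f"selector {selector:#06x}, offset {offset:#x}")
    return desc.base + offset


# The same 16-bit value interpreted both ways (table contents are made up).
ldt = {36: Descriptor(base=0x00400000, limit=0xFFFF)}
as_real = real_mode_linear(0x0127, 0x10)
as_protected = protected_mode_linear(0x0127, 0x10, gdt={}, ldt=ldt)
```

With these made-up table contents, `as_real` is 0x1280 while `as_protected` is 0x400010. An engine built around the first rule has no notion of descriptor tables or limits, so it cannot be stretched to cover the second.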
Comparing the Scope to Wine
Wine’s success makes this sound simpler than it is. Wine has been in development since 1993 and still ships frequent compatibility fixes for NT-era applications. Its staging branch contains patches that have been pending merge for years due to their complexity. The NT API surface is well-documented through Microsoft’s own SDK, through leaked source code that was studied before being taken down, and through decades of reverse engineering.
The Win9x API surface is less documented. The VxD calling convention was never part of the official Win32 SDK. The thunk mechanisms were described in developer articles from 1996 that are now primarily accessible through archive.org. The internal behavior of KRNL386.EXE is known mainly through disassembly and the work of researchers like Michal Necasek who have documented Win9x internals in detail over many years.
Hailey’s project is tackling this with a subsystem framing rather than a full emulation framing, which is the right call. Full hardware emulation through DOSBox-X can run Win9x but at a performance cost that makes interactive use of demanding applications painful. A subsystem that runs native code and only intercepts OS interface calls can be fast. Getting the VxD layer right is the technical crux that separates a subsystem that works for simple applications from one that handles the era’s most demanding software.
The Preservation Stakes
The Win9x window runs roughly from 1995 to 2002. Software from that era is not old enough to have been systematically preserved by academic institutions and not new enough to run on anything modern. A game from 1985 can run in DOSBox. A game from 2005 often runs in Wine or works natively on modern hardware. A game from 1998 that shipped with DirectX 5 and used VxD-based audio frequently runs nowhere without significant effort.
Getting this right is not a hobbyist side project. It is genuine systems programming at the intersection of CPU architecture, OS design, and compatibility engineering. The fact that someone has taken it on seriously enough to build something worth discussing is reason enough to watch the project.