The Specific Risks in a Vibe-Coded ext4 Driver for OpenBSD

OpenBSD ships an ext2 filesystem driver in its kernel. It lives at /sys/ufs/ext2fs/ and has been there for years. The problem is that modern Linux volumes use ext4 with extents enabled by default, and OpenBSD’s driver has no extent tree support. That means most disks formatted on Linux in the past decade are unreadable on OpenBSD. The driver name says ext2, and in practice that is all it handles.

An LWN article covers a project that closes this gap. The author used an LLM as the primary writing tool and described the development method as “vibe coding.” The label has generated considerable discussion in systems programming circles, most of which misses the specific technical questions that actually determine whether this kind of work is safe.

What the Ext2 Driver Cannot Do

Ext4 is backward-compatible with ext2 at the basic inode and block level, but its defining feature, the extent tree, replaces the old indirect block scheme. When EXT4_FEATURE_INCOMPAT_EXTENTS is set in the superblock’s s_feature_incompat field, any inode carrying the EXT4_EXTENTS_FL flag stores its block map as an extent tree rather than a list of indirect block pointers. On a volume created as ext4, that covers essentially every regular file and directory.

The root of the extent tree lives in the inode’s 60-byte i_block array. A twelve-byte header (ext4_extent_header) carries the magic number 0xF30A, the entry count, the node’s capacity, and the tree depth. Leaf nodes (depth 0) contain ext4_extent records, each mapping a range of logical blocks to physical blocks. The physical block address is split across two fields: a 16-bit high part and a 32-bit low part, reconstructed as ((uint64_t)ee_start_hi << 32) | ee_start_lo. The widening cast matters, because shifting a 16-bit value left by 32 without it is undefined in C. Interior nodes contain index entries (ext4_extent_idx) pointing to blocks that hold the next level. Reading a file means traversing this tree by logical block offset, then adding the offset within the matching extent to that extent’s physical start.
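To make the layout concrete, here is a minimal userland sketch of parsing the root node and mapping a logical block through a depth-0 tree. It is not taken from the project under discussion. The struct and function names (ext_hdr, ext_leaf, ext_parse_root, ext_read_leaf, ext_map_leaf) are hypothetical, the depth cap of 5 is an assumption borrowed from Linux, and the sketch deliberately rejects unwritten extents rather than handling them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT4_EXT_MAGIC   0xF30Au
#define I_BLOCK_BYTES    60u     /* i_block: 15 slots of 4 bytes */
#define EXT_HDR_BYTES    12u     /* on-disk ext4_extent_header */
#define EXT_ENTRY_BYTES  12u     /* on-disk ext4_extent / ext4_extent_idx */
#define EXT_MAX_DEPTH    5u      /* assumption: the cap Linux applies */
#define EXT_MAX_LEN      32768u  /* larger ee_len values mark unwritten extents */

static_assert((I_BLOCK_BYTES - EXT_HDR_BYTES) / EXT_ENTRY_BYTES == 4,
              "the in-inode root has room for four entries");

/* Host-order copies of the on-disk records (hypothetical names). */
struct ext_hdr  { uint16_t magic, entries, max, depth; };
struct ext_leaf { uint32_t lblock; uint16_t len; uint64_t pstart; };

/* Byte-wise little-endian loads: correct on any host byte order. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse and validate the root header in i_block. 0 on success, -1 if corrupt. */
static int ext_parse_root(const uint8_t *iblock, struct ext_hdr *h)
{
    h->magic   = rd_le16(iblock + 0);
    h->entries = rd_le16(iblock + 2);
    h->max     = rd_le16(iblock + 4);
    h->depth   = rd_le16(iblock + 6);

    if (h->magic != EXT4_EXT_MAGIC)
        return -1;
    if (h->depth > EXT_MAX_DEPTH)   /* bound the descent before trusting it */
        return -1;
    if (h->max > (I_BLOCK_BYTES - EXT_HDR_BYTES) / EXT_ENTRY_BYTES)
        return -1;
    if (h->entries > h->max)
        return -1;
    return 0;
}

/* Decode leaf record n (0-based) that follows the header at `node`. */
static void ext_read_leaf(const uint8_t *node, unsigned n, struct ext_leaf *e)
{
    const uint8_t *p = node + EXT_HDR_BYTES + (size_t)n * EXT_ENTRY_BYTES;
    uint16_t hi;
    uint32_t lo;

    e->lblock = rd_le32(p + 0);          /* ee_block */
    e->len    = rd_le16(p + 4);          /* ee_len */
    hi        = rd_le16(p + 6);          /* ee_start_hi */
    lo        = rd_le32(p + 8);          /* ee_start_lo */
    e->pstart = ((uint64_t)hi << 32) | lo;
}

/* Map a logical block through a depth-0 node. 0 and *pblk on success, else -1. */
static int ext_map_leaf(const uint8_t *node, const struct ext_hdr *h,
                        uint32_t lblk, uint64_t *pblk)
{
    if (h->depth != 0)
        return -1;                       /* interior nodes are out of scope here */
    for (unsigned i = 0; i < h->entries; i++) {
        struct ext_leaf e;
        ext_read_leaf(node, i, &e);
        if (e.len == 0 || e.len > EXT_MAX_LEN)
            return -1;                   /* corrupt, or unwritten (not handled) */
        if (lblk >= e.lblock && lblk - e.lblock < e.len) {
            *pblk = e.pstart + (lblk - e.lblock);
            return 0;
        }
    }
    return -1;                           /* hole or past end of file */
}
```

A caller would pass the 60 raw bytes of i_block to ext_parse_root, then call ext_map_leaf with the same buffer when the depth is 0. A full implementation would also descend through index nodes, validating each block's header the same way before trusting its entries.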

OpenBSD’s ext2 driver does not know any of this. When it encounters a volume with the extents feature flag, the inode’s i_block data is misinterpreted as indirect block pointers. The result is garbage reads, not informative errors. For a user trying to access a Linux USB drive or a shared partition from a dual-boot system, the experience is silent failure.

The Term and Its Baggage

Andrej Karpathy popularized “vibe coding” in early 2025 to describe a mode of development where the programmer accepts LLM output with minimal line-by-line scrutiny, steering by intent and testing by running rather than by reading. The framing was aimed at application code. Applying it to a kernel filesystem driver is a deliberate provocation.

The provocation lands because kernel code that parses on-disk structures is a specific class of code with specific failure modes. An ext4 volume presented to the driver is untrusted data. A maliciously crafted image can contain a directory entry with a corrupt rec_len field; if the parser does not bounds-check this before advancing its cursor, the result is an infinite loop (rec_len of zero) or a read past the end of a buffer (rec_len larger than the space remaining). A crafted extent tree with a fabricated depth field can drive the descent further than any legitimate tree goes, exhausting the kernel stack in a recursive implementation or treating arbitrary blocks as tree nodes. Integer overflow in block number arithmetic produces wrong physical addresses that, in a read-only context, return wrong data and, in a write context, corrupt unrelated blocks.
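The rec_len case is the easiest of these to show in code. The following is a hedged sketch of a directory block walk that validates each record before advancing. It is not the project's code. The function name dir_block_count is hypothetical, and the sketch assumes a block size below 64 KiB, where rec_len is stored as a plain little-endian 16-bit byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIRENT_HDR_BYTES 8u  /* inode (4) + rec_len (2) + name_len (1) + file_type (1) */

static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk one directory block and count the in-use entries.
 * Returns the count, or -1 as soon as any record fails validation.
 */
static int dir_block_count(const uint8_t *blk, size_t blksz)
{
    size_t off = 0;
    int live = 0;

    assert(blk != NULL);

    while (off < blksz) {
        size_t remaining = blksz - off;
        const uint8_t *de = blk + off;
        uint32_t ino;
        uint16_t rec_len;
        uint8_t name_len;

        if (remaining < DIRENT_HDR_BYTES)
            return -1;                  /* not even a full header left */

        ino      = rd_le32(de + 0);
        rec_len  = rd_le16(de + 4);
        name_len = de[6];

        /* rec_len == 0 would never advance the cursor: an infinite loop. */
        if (rec_len < DIRENT_HDR_BYTES || rec_len % 4 != 0)
            return -1;
        /* A record must not run past the end of the block. */
        if (rec_len > remaining)
            return -1;
        /* The name must fit inside the record that claims to hold it. */
        if ((size_t)DIRENT_HDR_BYTES + name_len > rec_len)
            return -1;

        if (ino != 0)                   /* inode 0 marks an unused slot */
            live++;
        off += rec_len;
    }
    return live;
}
```

Every check runs before off is advanced, and the overrun comparison is written as rec_len > remaining rather than off + rec_len > blksz so that it cannot itself overflow. These are exactly the lines a reviewer would look for in the real patch.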

None of these are hypothetical. They are the known attack surface of filesystem parsers and have turned up repeatedly in filesystem drivers, including ones reviewed carefully by competent engineers. The question for vibe-coded kernel code is not whether LLMs are good or bad in the abstract; it is whether the specific checks these known failure modes require are present and correct in the generated output.

Why Read-Only Changes the Risk Profile

Read support and write support are not equally risky, and conflating them is the most common error in this debate. A read-only ext4 driver cannot corrupt the target volume. The worst outcomes are wrong data returned to userspace, a kernel panic, or memory corruption from a maliciously crafted image. None of these are acceptable, but they are different in kind from the failure modes of a write driver, where incorrect journal replay, wrong metadata updates, or a truncated write could corrupt a volume irreversibly.

For a read-only driver, the feature flag machinery is also more tractable. Ext4’s superblock contains three sets of feature flags: s_feature_compat (safe to ignore), s_feature_incompat (must understand completely before mounting), and s_feature_ro_compat (must understand before mounting read-write, but can be ignored for read-only). A correct read-only driver checks the s_feature_incompat field against a mask of understood features and refuses to mount if any unknown incompatible features are set. This single check bounds the exposure: the driver will not silently mishandle any feature combination it was not written to handle.

The flags that a minimal read-only driver must handle include EXT4_FEATURE_INCOMPAT_EXTENTS (0x40), EXT4_FEATURE_INCOMPAT_64BIT (0x80), EXT4_FEATURE_INCOMPAT_FILETYPE (0x2), and EXT4_FEATURE_INCOMPAT_RECOVER (0x4). That last flag indicates that the filesystem needs recovery: the journal holds transactions that have not yet been replayed. For a read-only mount, the correct response is either to replay the journal before reading or to refuse the mount; silently proceeding exposes the user to stale or inconsistent data, because changes committed to the journal have not yet been written to their final locations.
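The mount-time decision described above fits in a few lines. This is a sketch under stated assumptions, not the project's implementation. The mask name RO_SUPPORTED_INCOMPAT, the mount_verdict enum, and check_incompat are hypothetical, the flag values are the ones listed above, and the policy shown for RECOVER is to refuse the mount rather than replay the journal.

```c
#include <assert.h>
#include <stdint.h>

#define EXT4_FEATURE_INCOMPAT_FILETYPE  0x0002u
#define EXT4_FEATURE_INCOMPAT_RECOVER   0x0004u
#define EXT4_FEATURE_INCOMPAT_EXTENTS   0x0040u
#define EXT4_FEATURE_INCOMPAT_64BIT     0x0080u

/* Hypothetical mask: the incompat features this read-only sketch understands. */
#define RO_SUPPORTED_INCOMPAT                 \
    (EXT4_FEATURE_INCOMPAT_FILETYPE |         \
     EXT4_FEATURE_INCOMPAT_EXTENTS  |         \
     EXT4_FEATURE_INCOMPAT_64BIT)

/* RECOVER is recognised but handled by policy, so it stays out of the mask. */
static_assert((RO_SUPPORTED_INCOMPAT & EXT4_FEATURE_INCOMPAT_RECOVER) == 0,
              "RECOVER must not be treated as a plain supported feature");

enum mount_verdict {
    MOUNT_OK,
    MOUNT_REFUSE_UNKNOWN_FEATURE,
    MOUNT_REFUSE_NEEDS_RECOVERY
};

/*
 * Decide whether a read-only mount may proceed.
 * `incompat` is s_feature_incompat, already converted to host byte order.
 */
static enum mount_verdict check_incompat(uint32_t incompat)
{
    uint32_t known = RO_SUPPORTED_INCOMPAT | EXT4_FEATURE_INCOMPAT_RECOVER;

    /* Any bit outside the known set means the layout may not be what we expect. */
    if (incompat & ~known)
        return MOUNT_REFUSE_UNKNOWN_FEATURE;

    /* Unreplayed journal: this sketch refuses instead of replaying. */
    if (incompat & EXT4_FEATURE_INCOMPAT_RECOVER)
        return MOUNT_REFUSE_NEEDS_RECOVERY;

    return MOUNT_OK;
}
```

The important property is that the test is written against the complement of the known set, so a flag the author never heard of fails closed. A mask this small would refuse many real volumes, since a default mkfs.ext4 enables further incompat features such as flex_bg; a practical driver has to grow the mask one understood feature at a time.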

The Endianness Problem

Ext4 is a little-endian format. OpenBSD runs on amd64, arm64, and sparc64, among other platforms, and sparc64 is big-endian. Linux kernel code uses le32_to_cpu() and friends throughout its ext4 implementation to convert on-disk values to host byte order. OpenBSD’s equivalent macros are letoh32(), letoh16(), and letoh64().

An LLM trained on Linux kernel source will naturally produce Linux-style byte order conversions. Functional testing on an amd64 machine cannot reveal whether it translated them to OpenBSD conventions correctly, or whether it covered every field in the superblock, block group descriptors, inode table, extent tree, and directory entries without omission. A little-endian host reads little-endian data correctly even with no conversion at all, so endianness bugs in filesystem drivers stay invisible until the code runs on a big-endian host.
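A small userland sketch shows why the bug hides. Because letoh32() is OpenBSD-specific, the example uses portable byte-wise decoders as stand-ins for it. The function names le16_decode, le32_decode, raw_load32, and host_is_little_endian are hypothetical and are not part of any kernel API.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

static_assert(CHAR_BIT == 8, "byte-wise decoding below assumes 8-bit bytes");

/* Stand-in for letoh16(): assemble the value from bytes, lowest byte first. */
static uint16_t le16_decode(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Stand-in for letoh32(): the result does not depend on host byte order. */
static uint32_t le32_decode(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * The bug pattern: copy the on-disk bytes straight into a host integer.
 * On a little-endian host this happens to give the right answer.
 */
static uint32_t raw_load32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Runtime probe of host byte order: 1 on little-endian, 0 otherwise. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}
```

Given the on-disk bytes 00 10 00 00, le32_decode returns 0x1000 everywhere, while raw_load32 returns 0x1000 only where host_is_little_endian() is true. On amd64 the two are indistinguishable, which is why a missing conversion survives any amount of testing there and has to be caught by reading the code or by running it on a big-endian machine such as sparc64.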

OpenBSD’s existing ext2fs driver handles this correctly because it was written with explicit attention to the portability requirement. Whether the vibe-coded ext4 additions carry that attention forward is the single most important unresolved question about the implementation’s correctness.

OpenBSD’s Culture and the Code’s Future

OpenBSD is not going to merge vibe-coded kernel contributions. The project’s culture of careful review predates the AI coding conversation by decades, and Theo de Raadt has been public about his skepticism toward LLM-generated systems code. A patch submission that the author cannot fully explain will not pass review.

This does not mean the work has no value. NetBSD extended its own ext2fs driver with extent tree support years ago, producing a reference implementation that OpenBSD developers could study. A vibe-coded OpenBSD implementation, reviewed publicly and discussed in detail, produces a second data point. It also surfaces the specific places where an LLM gets the implementation right and where it does not, which is more informative than general claims about AI reliability.

The gap the project addresses is real. Ext4 is the default Linux filesystem. OpenBSD users who share disks with Linux systems, or who want to read Linux USB drives, currently have no path to accessing modern ext4 volumes. The FreeBSD and NetBSD ecosystems have closed this gap through conventional development. OpenBSD has not. Whatever the origin of the code under discussion, the conversation it has started is one that the project needed to have.

The technical questions are not mystical. Does the superblock parsing apply letoh32() consistently? Does the unknown-incompat-feature check cover the full set? Does the extent tree traversal bound the depth before recursing? Does the directory entry parser validate rec_len before advancing? These are code review questions, answerable by reading the patch. The label “vibe coded” tells you something about the process that produced the code. It does not tell you the answers to those questions, and those are the answers that matter.