
What 100 Kernel Bugs in a Month Actually Tells You About Linux Security

Source: lobsters

A recent piece on Substack made the rounds on Lobsters for a reason: someone found more than 100 Linux kernel bugs in 30 days. That number lands differently depending on your background. To a lot of developers, it sounds alarming. To anyone who has spent time in kernel security research, it sounds about right, and maybe even modest.

The more interesting question is not how many bugs were found, but what conditions make that number achievable. The answer says something important about the Linux kernel as a piece of software, about the state of security tooling, and about the gap between finding bugs and actually shipping secure systems.

The Tooling That Made This Possible

The single biggest shift in kernel vulnerability research over the last decade is syzkaller, Google’s coverage-guided kernel fuzzer. Before syzkaller, kernel fuzzing existed but was largely dumb: throw random syscalls at the kernel and hope something crashes. syzkaller changed the model by tracking code coverage at the basic block level, mutating syscall sequences to maximize new coverage, and encoding syscall argument semantics so generated inputs were structurally valid.
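
The coverage feedback loop is easier to see in miniature. The sketch below is a toy, not syzkaller: the `target` function is a hypothetical stand-in for the kernel that reports which "blocks" an input reached, and inputs are raw bytes rather than syscall sequences. It only illustrates the core idea of keeping any input that reaches new coverage and mutating from there.

```python
import random

MAGIC = b"FUZZ"  # hypothetical input prefix that reaches the deepest block


def target(data):
    """Toy stand-in for the kernel: return the set of blocks this input reached."""
    covered = {"entry"}
    for i, byte in enumerate(MAGIC):
        if i >= len(data) or data[i] != byte:
            break
        covered.add(f"block{i + 1}")
    return covered


def mutate(data, rng):
    """Insert, overwrite, or delete one random byte."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 or not buf:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif choice == 1:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(iterations, seed=0):
    """Coverage-guided loop: keep every input that reaches a block not seen before."""
    rng = random.Random(seed)
    corpus = [b""]
    seen = target(b"")
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        coverage = target(candidate)
        if not coverage <= seen:  # at least one new block was reached
            seen |= coverage
            corpus.append(candidate)
    return corpus, seen
```

A purely random generator has to guess all four magic bytes at once. The guided loop only has to guess one byte at a time, because each partial match is saved in the corpus and becomes the starting point for later mutations.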

The practical result is that syzkaller reaches deep kernel paths that pure random fuzzing never touched. It has found thousands of kernel bugs since its release, and Google’s syzbot infrastructure runs it continuously across dozens of kernel configurations, producing a steady stream of reports. If you look at the syzbot dashboard at any given moment, you’ll see hundreds of open bugs.

But syzkaller is not magic. It runs against sanitizer-enabled kernels, and without those sanitizers, most of what it finds would be silent. KASAN (Kernel Address Sanitizer) instruments memory accesses to detect out-of-bounds reads and writes and use-after-free bugs. KCSAN (Kernel Concurrency Sanitizer) detects data races by sampling memory accesses, setting watchpoints on them, and checking whether another CPU touches the same location concurrently. UBSAN (Undefined Behavior Sanitizer) catches undefined behavior such as invalid shifts and out-of-bounds array indexing. Without these, a use-after-free that doesn’t immediately corrupt anything observable produces no crash and no report, so the fuzzer has nothing to flag.
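
To make the "silent without a sanitizer" point concrete, here is a toy model of the shadow-memory idea behind KASAN. It is a sketch under simplifying assumptions, not how KASAN is implemented: the `ShadowHeap` class, its one-state-per-byte shadow, and the single-byte redzones are all invented for illustration.

```python
class MemoryAccessError(Exception):
    """Raised by the checked path when an access hits non-addressable memory."""


class ShadowHeap:
    """Toy bump allocator with a shadow state for every byte of memory."""

    ADDRESSABLE = "addressable"
    REDZONE = "redzone"
    FREED = "freed"

    def __init__(self, size):
        self.memory = bytearray(size)
        self.shadow = [self.REDZONE] * size  # everything starts poisoned
        self.next_free = 0
        self.sizes = {}

    def alloc(self, size):
        start = self.next_free + 1  # leave one redzone byte before the object
        if start + size + 1 > len(self.memory):
            raise MemoryError("toy heap exhausted")
        for i in range(start, start + size):
            self.shadow[i] = self.ADDRESSABLE
        self.next_free = start + size  # this byte stays a redzone
        self.sizes[start] = size
        return start

    def free(self, ptr):
        size = self.sizes.pop(ptr)
        for i in range(ptr, ptr + size):
            self.shadow[i] = self.FREED  # never reused, like a quarantine

    def checked_write(self, addr, value):
        """Instrumented access: consult the shadow state first."""
        state = self.shadow[addr]
        if state != self.ADDRESSABLE:
            raise MemoryAccessError(f"write to {state} byte at offset {addr}")
        self.memory[addr] = value

    def unchecked_write(self, addr, value):
        """Uninstrumented access: the write simply succeeds."""
        self.memory[addr] = value
```

Writing through `unchecked_write` after `free` succeeds without complaint, which is the situation a fuzzer faces on an uninstrumented kernel. The same write through `checked_write` raises immediately and says whether it hit a freed object or a redzone.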

The setup for a productive 30-day audit looks something like this: QEMU with KVM for fast VM reset; kernels built with CONFIG_KCOV=y for coverage, plus CONFIG_KASAN=y and CONFIG_UBSAN=y in one build and CONFIG_KCSAN=y or, optionally, CONFIG_KMSAN=y (Kernel Memory Sanitizer, for uninitialized reads) in separate builds, since Kconfig does not allow KASAN, KCSAN, and KMSAN in the same kernel; syzkaller with syscall descriptions tuned to the target subsystem; and a corpus seeded with prior reproducers from syzbot. Add targeted manual fuzzing and static analysis of recently merged code, and the conditions are in place.
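
One practical wrinkle is that KASAN, KCSAN, and KMSAN cannot all be enabled in a single kernel, so the sanitizers end up spread across several builds. The sketch below generates one config fragment per build and refuses combinations that mix them. The build names and the grouping of UBSAN with KASAN are assumptions made for illustration; a real fuzzing config needs many more options than shown here.

```python
# Sanitizers assumed to be mutually exclusive in Kconfig: one per build.
EXCLUSIVE = ("CONFIG_KASAN", "CONFIG_KCSAN", "CONFIG_KMSAN")

# Options shared by every build; KCOV provides the coverage syzkaller consumes.
COMMON = ["CONFIG_KCOV=y"]

# Hypothetical build names mapped to the sanitizer options each one enables.
BUILDS = {
    "kasan": ["CONFIG_KASAN=y", "CONFIG_UBSAN=y"],
    "kcsan": ["CONFIG_KCSAN=y"],
    "kmsan": ["CONFIG_KMSAN=y"],
}


def enabled(lines):
    """Return the names of all options set to 'y' in a fragment."""
    return {line.split("=", 1)[0] for line in lines if line.endswith("=y")}


def validate(lines):
    """Reject fragments that enable more than one exclusive sanitizer."""
    clash = sorted(enabled(lines) & set(EXCLUSIVE))
    if len(clash) > 1:
        raise ValueError("cannot combine in one build: " + ", ".join(clash))
    return lines


def fragment(build):
    """Return the validated config fragment for one named build."""
    return validate(COMMON + BUILDS[build])


if __name__ == "__main__":
    for name in BUILDS:
        print(f"# {name}.config")
        print("\n".join(fragment(name)))
        print()
```

Each fragment can then be merged into a base config, and each resulting kernel gets its own fuzzing instance.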

Where the Bugs Actually Live

Not all kernel subsystems are equally prone to bugs. The areas that consistently produce the most findings share a few traits: large recent additions of complex code, heavy use of asynchronous operations or shared state, and callback-heavy designs where lifetimes are hard to reason about.

io_uring has been the poster child for this since its introduction in Linux 5.1 in 2019. Its performance advantage comes from a shared ring buffer between kernel and userspace with minimal copying, but that design means operations are inherently asynchronous and the lifetime of resources tied to in-flight operations is complex. Jann Horn at Project Zero documented early io_uring vulnerabilities that showed how its completion model could produce use-after-free conditions. Several platform vendors and container runtimes have restricted or disabled io_uring by default as a result.

The eBPF subsystem is another consistent source. The BPF verifier is supposed to guarantee that loaded programs cannot crash or corrupt the kernel, but the verifier itself is a large, complex piece of code with subtle state-tracking logic. Researchers have repeatedly found ways to construct programs that pass verification but execute unsafe operations. CVE-2021-3490, CVE-2021-34866, and numerous others demonstrate that the verifier’s correctness is a moving target rather than a solved problem. As eBPF’s use in production networking and observability tools has grown, so has the incentive to find these bypasses.
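
The general shape of a verifier bug can be shown with a toy range tracker. This is not the BPF verifier’s logic and not the mechanism of any specific CVE. It is a minimal sketch in which a hypothetical verifier tracks an unsigned interval for one register, and one transfer function for bitwise AND is subtly unsound. All names and the three-instruction "ISA" are invented for illustration.

```python
def sound_and(lo, hi, imm):
    """x & imm can never exceed either operand, so [0, min(hi, imm)] is safe."""
    return 0, min(hi, imm)


def buggy_and(lo, hi, imm):
    """Unsound: applies the concrete operation to the bounds themselves."""
    return lo & imm, hi & imm


def verify(program, input_max, array_len, and_rule):
    """Accept the program only if every load is provably in bounds."""
    lo, hi = 0, input_max  # the register starts as an untrusted input
    for op, imm in program:
        if op == "and":
            lo, hi = and_rule(lo, hi, imm)
        elif op == "add":
            lo, hi = lo + imm, hi + imm
        elif op == "load" and hi >= array_len:
            return False
    return True


def execute(program, value, array_len):
    """Run with a concrete input; return False if a load goes out of bounds."""
    for op, imm in program:
        if op == "and":
            value &= imm
        elif op == "add":
            value += imm
        elif op == "load" and value >= array_len:
            return False
    return True


# Register holds an input in [0, 4]; mask it with 3, then index a 1-element array.
PROGRAM = [("and", 3), ("load", 0)]
```

With an input range of [0, 4], the buggy rule computes an upper bound of `4 & 3 == 0` and accepts the program, yet the concrete input 3 survives the mask and indexes past the array. The sound rule computes [0, 3] and rejects it. The real verifier tracks far more state per register, which gives this kind of mistake many more places to hide.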

The USB subsystem and Bluetooth stack are older and messier, full of driver code written when security was not the primary concern. They have enormous input attack surfaces that are reachable from userspace with minimal privilege on most default configurations.

Why the Kernel Keeps Producing Bugs

The scale of the codebase is the starting point. The Linux kernel is roughly 30 million lines of code as of recent releases, and it accepts contributions from thousands of developers across hundreds of organizations. The review process is rigorous for core subsystems but varies considerably across the tree. A lot of code gets merged that is correct in isolation but introduces subtle lifetime or ordering bugs when combined with concurrent operations from other subsystems.

C is load-bearing here in an uncomfortable way. The language gives you no memory safety guarantees, no checked integer arithmetic, and no ownership semantics. Every reference counting operation is manual. Every array index is unchecked by default. The kernel has added mitigations over time, including bounds checks on string and memory-copy functions via FORTIFY_SOURCE, array index checks via UBSAN_BOUNDS, and a push toward __counted_by annotations that tell the compiler how long a flexible array member is, but these are opt-in and incremental.

The Rust for Linux effort is the most serious attempt to address this at the language level. In safe Rust, the ownership and borrowing rules turn use-after-free and data race bugs into compile errors rather than runtime surprises. The kernel now has Rust-written drivers in the mainline tree, and the abstractions being built allow driver authors to interact with kernel subsystems through safe APIs. Whether this will measurably reduce the bug rate in new code depends on how broadly it gets adopted and whether the safe abstractions themselves, which wrap unsafe C interfaces, are correct. That is a non-trivial question.

The Gap Between Finding and Fixing

Finding 100 bugs in 30 days is one thing. Getting them fixed in all the places that matter is a different problem entirely.

The upstream kernel moves relatively quickly. Once a bug is reported with a reproducer, core developers typically respond within days for security-relevant issues. The kernel security team handles embargoed reports for more serious vulnerabilities.

But the path from mainline fix to deployed system involves a lot of steps. The stable and LTS kernel trees, maintained by Greg Kroah-Hartman and Sasha Levin, apply a selection of fixes but not everything. Distribution kernels diverge significantly from upstream and maintain their own backport processes. A bug fixed in mainline 6.8 might take months to reach an enterprise Linux distribution running a 5.14-based kernel, if it gets there at all.
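
Checking whether a given fix has reached a stable tree is mostly a matter of reading commit messages, because a backport gets a new commit ID and names the mainline commit in its body, typically as `commit <sha> upstream.` or `[ Upstream commit <sha> ]`. The sketch below scans messages for those two markers. The function names are hypothetical, and distribution kernels may record backports differently, so treat it as a starting point rather than an audit tool.

```python
import re

# Matches the two marker lines commonly found in stable backports.
UPSTREAM_RE = re.compile(
    r"^\s*(?:commit ([0-9a-f]{40}) upstream\.?"
    r"|\[ Upstream commit ([0-9a-f]{40}) \])\s*$",
    re.MULTILINE | re.IGNORECASE,
)


def upstream_ids(message):
    """Return the mainline commit IDs that one commit message refers to."""
    return {(a or b).lower() for a, b in UPSTREAM_RE.findall(message)}


def missing_fixes(wanted, stable_messages):
    """Return the mainline fixes with no matching backport, in input order."""
    backported = set()
    for message in stable_messages:
        backported |= upstream_ids(message)
    return [sha for sha in wanted if sha.lower() not in backported]
```

The messages can come from something like `git log --format=%B` on the stable branch, split per commit. Any fix that comes back in the missing list either has not been backported or was backported without the usual marker, and both cases are worth a manual look.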

The silent fix problem compounds this. Many kernel bugs that qualify as security vulnerabilities get fixed as ordinary bug fixes without a CVE assignment. This is sometimes appropriate for low-severity issues, but it means security-conscious administrators cannot easily audit what has changed between kernel versions without reading every commit. Tools like linux-vulns and the tracking work done by the SUSE and Red Hat security teams partially address this, but the coverage is incomplete.

What This Means in Practice

For most users running a maintained distribution, 100 kernel bugs in 30 days is less alarming than it sounds. The majority are not remotely exploitable, most require local access, and many require specific hardware or configurations. The security model of a well-maintained Linux system still holds.

The more relevant question is who bears the risk. Cloud providers running multi-tenant workloads where many customers share a kernel care deeply about local privilege escalation bugs. Container environments where the kernel is the only security boundary between workloads care even more. The real-world urgency of any given bug depends heavily on the threat model.

What the 100-bugs-in-30-days result demonstrates most clearly is that the kernel’s attack surface is not shrinking and that the tooling to find bugs has never been more accessible. syzkaller is open source. KASAN and KCSAN are built into the kernel. A researcher with time, a few VMs, and a working knowledge of kernel internals can produce a significant number of findings.

The kernel security community has responded to this with more sophisticated defenses: CFI for indirect call hardening, KASLR and fine-grained ASLR variants, KFENCE for low-overhead, sampling-based memory safety checking in production. These raise the cost of exploitation without eliminating bugs. The combination of better exploitation mitigations and the slow introduction of memory-safe code via Rust represents the most realistic path toward a kernel with fewer exploitable bugs over time, but that path is measured in years and releases, not 30-day sprints.
