Spinlocks, State Pruning, and the Holes in eBPF's Safety Model

The eBPF verifier is often described as the thing that makes eBPF safe. Load a program, and the kernel runs it through a static analysis pass that checks every possible execution path for memory safety, type correctness, and resource management. Only if the verifier signs off does the program run in kernel space. This guarantee is what allows eBPF to be the infrastructure layer for everything from network packet filtering to production observability tools.

But the verifier is also a piece of software with real complexity, real edge cases, and real bugs. A recent article by Rohan Varma documents the process of tracking down and fixing spinlock-related verification issues in the Linux kernel, and it’s worth unpacking why this class of bug is so subtle and what it reveals about the fundamental tension in how the verifier is built.

What the Verifier Promises for Spinlocks

bpf_spin_lock was introduced in Linux 5.1 in early 2019 to allow eBPF programs to protect shared data in maps from concurrent access. The API is straightforward: a struct bpf_spin_lock must be embedded in a map value, and programs call bpf_spin_lock() and bpf_spin_unlock() to acquire and release it.

struct value {
    struct bpf_spin_lock lock;
    int counter;
};

SEC("xdp")
int my_prog(struct xdp_md *ctx)
{
    struct value *v = bpf_map_lookup_elem(&my_map, &key);
    if (!v)
        return XDP_DROP;

    bpf_spin_lock(&v->lock);
    v->counter++;
    bpf_spin_unlock(&v->lock);
    return XDP_PASS;
}

The verifier’s job is to enforce several invariants statically: a lock must not be acquired twice without an intervening release, a lock must not be released without being held, and no lock can be held when the program exits. These are properties the verifier must prove hold across all possible execution paths through the program.

Tracking these invariants requires the verifier to maintain, as part of its internal state, knowledge of which locks are currently held at each point in the analysis. This is meta-state, separate from the register values and stack contents the verifier normally reasons about. The bpf_verifier_state structure in kernel/bpf/verifier.c maintains an active_locks counter and per-object lock tracking to capture this.

State Pruning: The Optimization That Causes Problems

To understand where things go wrong, you need to understand state pruning. The BPF verifier does a depth-first traversal of all possible execution paths through a program. Without optimizations, this traversal is exponential in the number of branches, which quickly becomes intractable for real programs.

The verifier’s primary tool against this explosion is state caching and pruning. When the verifier arrives at an instruction it has visited before, it checks whether the current abstract state is “more constrained” than a previously cached state at that instruction. If so, it prunes: it stops exploring from the current point, reasoning that if the cached (less constrained) state was safe, the current (more constrained) state must be safe too.

The equivalence check is implemented in states_equal() in verifier.c. Two states are considered equivalent when each register and stack slot in the current state is a subtype or equal value compared to the cached state. The intuition is sound: if you know less about a value (wider type, unknown value) and the program was safe, then knowing more about it (narrower type, concrete value) is still safe.

The problem is that “more constrained” must account for all state, not just register contents.

When Lock State Isn’t Part of the Comparison

Consider what happens when two execution paths arrive at the same instruction: one path has a spinlock held, the other does not. If the state pruning logic compares these two states and only looks at register types and values, it may consider them equivalent. The registers could be identical. The stack could look the same. But one path is mid-critical-section and the other is not.

If the verifier prunes using a cached “safe” state that had no lock held, but the current state has a lock held, it incorrectly concludes the current execution path is safe without ever verifying that the lock gets properly released on all future paths from that point. The program passes verification, but under the pruned execution path, the lock may never be released.

This is the core class of bug. The verifier’s state equivalence check was not fully accounting for lock-holding status in its pruning decisions. The active_locks count might be checked at the top level, but the per-lock state tracking or the conditions under which pruning is allowed were not robust against all the ways program structure can cause two superficially similar verifier states to diverge in their lock metadata.

This isn’t a theoretical concern. A verifier bug that allows an eBPF program with an unreleased spinlock to load can cause a CPU to spin indefinitely, making it unavailable to the rest of the system, a hard hang with no recovery short of reboot.

The Verifier as Abstract Interpreter

Zooming out, what the BPF verifier implements is a form of abstract interpretation, a technique from program analysis theory where program execution is simulated on an abstract domain rather than concrete values. The abstract domain for eBPF includes things like “this register holds a pointer to a map value” or “this register holds a scalar in the range [0, 255]”, and operations on those abstractions are defined to be sound over-approximations of the concrete semantics.

The soundness of pruning depends on the abstract domain including all state that matters for the safety property being checked. For memory safety, that’s register types and pointer provenance. For lock safety, that’s which locks are currently held. When the abstract domain is incomplete, the pruning logic can reach incorrect conclusions.

This is a known hard problem in abstract interpretation. More complete abstract domains give more precise analysis but are more expensive to compute. State pruning is valuable precisely because it collapses states that would otherwise be explored separately. Every time you add a new dimension of state to track, you potentially reduce the effectiveness of pruning and increase verification time.

The BPF verifier has walked this line repeatedly as new features were added. Referenced objects require tracking open file descriptor-like handles. Iterators require tracking iterator state. Timers and per-CPU variables each added new axes of meta-state. Each one represents a potential for the same class of bug: the pruning logic not fully incorporating the new state dimension.

What the Fix Looks Like

Fixes for this class of bug generally take one of two forms. The narrower fix is to ensure states_equal() correctly incorporates the relevant lock state fields in its comparison, so that states with different lock-holding status are never considered equivalent for pruning purposes. This is conservative: it may cause the verifier to reject programs it previously accepted (if those programs relied on incorrect pruning) or to take longer to verify programs (because fewer states get pruned).

The broader fix, where applicable, is to restructure how lock state is tracked so that it is naturally part of the state comparison rather than a separately handled special case. This involves changes to bpf_verifier_state and the functions that copy, merge, and compare states.

The kernel’s BPF subsystem has a practice of detailed commit messages that explain exactly what invariant was violated and how the fix restores it. Reading through the BPF mailing list archives for spinlock-related fixes shows a pattern of incremental tightening: each fix addresses a specific path through the verifier where lock state was not properly propagated or compared, and each fix is accompanied by a test case that would have triggered the bug.

Why This Matters Beyond eBPF

The broader lesson is about the limits of static verification as a security boundary. The eBPF verifier is one of the more sophisticated in-kernel static analyzers in any production OS, and it still has bugs. This doesn’t make eBPF unsafe in any practical sense, but it does mean the safety argument is “the verifier is correct” rather than “the property is provably enforced”, and those are different claims.

The Linux kernel treats verifier bugs as security issues when they can be triggered by unprivileged users. CAP_BPF or CAP_SYS_ADMIN is required to load most BPF program types, which limits exposure. But the kernel’s threat model for BPF has been evolving, with proposals to allow more BPF program types from unprivileged contexts, which would make verifier correctness more critical.

For anyone building production systems on eBPF, understanding that the verifier is a best-effort safety net, not a formal proof, is important context. The tools built on top of it, Cilium, Falco, bpftrace, are generally sound, but the foundation they sit on is maintained code, not verified code, and it benefits from the same scrutiny as any other kernel subsystem.