When the BPF Verifier Proves the Wrong Thing: Spinlocks and State Pruning

The BPF verifier is one of the most carefully constructed pieces of the Linux kernel. Before any eBPF program runs, the verifier walks every possible execution path, checking register types, bounds, memory safety, and a growing list of feature-specific invariants. The whole premise is that a program which passes the verifier is safe to run. This account of tracking down and fixing eBPF spinlock bugs demonstrates what happens when the verifier’s internal book-keeping fails to keep up with one of those invariants, and the pattern it reveals is one of the recurring challenges in BPF kernel development.

bpf_spin_lock and What It Requires

bpf_spin_lock and bpf_spin_unlock landed in Linux 5.1, introduced by Alexei Starovoitov in early 2019. The design is intentionally constrained. A spinlock must be a struct bpf_spin_lock embedded inside a BTF-annotated map value. Programs acquire the lock by passing a pointer to that embedded struct to the bpf_spin_lock() helper and release it with bpf_spin_unlock(). The lock is tied to the map value, not to a program variable, which means the kernel can locate and validate the lock at load time through BTF type information.

The constraints while holding a lock are strict. The kernel forbids tail calls, preemption, sleeping helpers, and certain map operations. More critically, the kernel requires that every execution path through the program that acquires a lock also releases it before the program exits. There is no runtime cleanup mechanism. If a program exits while holding a spinlock on a shared map, that lock stays held indefinitely. On a real workload, this causes a lockup on whatever CPU last ran that path.

The verifier is entirely responsible for enforcing these properties. It tracks lock state in struct bpf_verifier_state through a field that records whether a spinlock is currently held and which register and offset it corresponds to. At program exit points and before restricted operations, the verifier asserts this field is clear.

State Pruning: Why It Exists and What It Assumes

Tracking lock state across a linear program is trivial. The problem is loops. BPF programs that process packets, iterate map entries, or parse headers contain backward edges, and a naive path-enumeration strategy would explore them exponentially. The verifier solves this with state pruning.

At each instruction, the verifier maintains a cache of previously verified states. A state encodes everything the verifier knows at that point: the type, bounds, and liveness of every register; the contents of every stack slot; references to map values; and all the additional tracked properties. When the verifier reaches an instruction and finds a cached state that dominates the current state, it stops. A cached state dominates the current one if it has equal or tighter constraints on every tracked dimension. If the cached state was proved safe with those constraints, the current state, being no more permissive, is also safe.

This optimization is not optional. The BPF kernel selftests include programs that would require millions of path explorations without pruning. State pruning makes them verify in milliseconds. The soundness condition is the key requirement: the domination check must cover every tracked property. If any property the verifier enforces elsewhere is missing from the domination check, that property can be lost at a pruning decision point.

The Spinlock Gap

Spinlock state is exactly the kind of property that can fall out of the domination check. Consider a program with a loop that sometimes acquires a spinlock and sometimes does not. The verifier, exploring the no-lock path first, caches a state at some instruction after the branch. When it later explores the lock-holding path and reaches the same instruction, it compares against the cached state. If the comparison function does not account for whether a spinlock is held, the cache hit fires and the verifier stops exploring. The lock-holding path is marked safe by proxy, even though the cached state’s safety proof was built without any lock-release obligation.

The program that results can pass full verification and load into the kernel while containing a path that exits with a spinlock held. Nothing in the runtime catches this.

The comparison logic lives in kernel/bpf/verifier.c, in the function that decides whether two states are equivalent for pruning purposes. That function has grown considerably since BPF was first introduced, with additions for reference-counted objects, dynptrs, per-CPU variables, and various other features. Each addition to the verifier’s state model is a potential source of a new pruning gap if the corresponding field is not wired into the comparison.

This pattern repeats across BPF verifier history. Reference objects had similar issues. Dynptrs required careful integration into the pruning path. Spinlocks have appeared more than once because the lock-tracking logic spans multiple layers of the verifier’s state representation, including per-frame lock state in subprogram call frames.

Subprograms Add Another Layer

BPF subprograms, which allow BPF programs to call other BPF functions, complicate lock tracking further. The verifier maintains a stack of call frames, and each frame can carry its own lock state. A subprogram that acquires a lock and returns without releasing it leaves a lock held in the caller’s context. If subprograms are cached and reused across different call sites, the cached result must carry complete information about which locks are held at return.

Getting this right requires that the state comparison accounts for frame-level lock state, not just top-level state. A pruning decision made at a call site that does not compare frame-level lock state can allow a subprogram with a lock-holding return to be considered equivalent to one without.

The subprogram case is also where minimal tests tend to fail to catch the bug. A self-contained test for spinlock plus pruning might not involve subprograms at all. The bug only surfaces in programs that combine subprogram calls with loops and spinlocks in specific configurations, which makes a minimal reproducer harder to write.

Finding and Fixing This Class of Bug

The BPF verifier exposes its internal state through log output, enabled with the BPF_F_TEST_VERIFIER_LOG flag at program load time or through bpftool prog load with verbose output. The log shows, instruction by instruction, the register state and any pruning decisions the verifier makes. Finding a pruning soundness bug involves writing a program that should be rejected, loading it with the log enabled, and identifying the pruning event that incorrectly marked the unsafe path as already explored.

The kernel’s BPF selftests, under tools/testing/selftests/bpf/, include programs specifically designed to test verifier rejection. These are negative tests: programs expected to fail. Adding a negative test for the specific spinlock plus pruning interaction serves as both confirmation that the bug exists and validation that the fix is complete. A test the verifier incorrectly accepts before the patch and correctly rejects after it is the clearest possible demonstration.

One useful diagnostic technique for ambiguous cases is to suppress or reduce state pruning by adjusting the pruning threshold in a debug kernel build. If a program that passes with pruning enabled fails when pruning is suppressed, you have confirmed a pruning soundness gap. Without pruning, the verifier examines every path independently, so its decisions are much easier to reason about.

Fixes follow a consistent structure: locate the comparison function, add the missing field, write the regression test, verify no other code path presents the same gap. The “no other path” step is where subsequent patches often appear. A fix that adds lock state to the top-level state comparison can miss cases where lock state appears inside frame state for active subprogram calls or embedded in referenced objects. The initial patch closes the most direct case; review and follow-up work close the cases it missed.

Why the Verifier Keeps Accumulating These Bugs

The BPF verifier has an unusual obligation. It cannot be conservative in the way a type checker is conservative, rejecting programs it is uncertain about. The programs it is uncertain about are often the programs users need to deploy. It must make precise claims about safety, and those claims must be correct.

State pruning is how the verifier achieves the performance necessary to make precise claims tractable. But each optimization that speeds up verification also narrows how completely the verifier examines each path. Every time a new tracked property is added to the state model, there is a window during which pruning decisions do not account for it.

This is the structural reason these bugs keep appearing. BPF development is active, the state model keeps expanding, and correctly wiring new properties into the pruning path requires understanding an already large and intricate function. The BPF mailing list carries a steady stream of verifier fixes that follow exactly this pattern, for spinlocks, reference objects, dynptrs, and other features. Each one reflects a real gap between what the verifier claimed to verify and what it was actually verifying.

The work described in the article is part of that ongoing process. The verifier’s guarantees are only as strong as the completeness of its state comparison, and completeness is something you have to actively maintain as the model grows. Finding and closing these gaps, one at a time, is how those guarantees stay meaningful.