C’s undefined behavior was a design choice, not an oversight. C’s designers in the early 1970s, and later the ANSI committee that standardized the language in 1989, faced a genuine problem: C was meant to run on machines with ones’-complement arithmetic, sign-magnitude integers, and trap representations for ordinary integer types. Mandating two’s-complement overflow semantics would have broken portability on hardware in production use. So the standard said, effectively, “if you do this, we make no promises.” For the hardware landscape of the 1970s and 1980s, that was a reasonable position.
What nobody fully anticipated was that compilers would eventually treat “we make no promises” as “we may assume this never happens, and optimize accordingly.” The result is that code written defensively, with explicit overflow checks and null-pointer guards, can have those safety checks deleted by the optimizer because the optimizer can prove that if the code were correct, the checks could never trigger.
This is the landscape that N3861, “Ghosts and Demons: Undefined Behavior in C2Y,” addresses. The paper is part of the ongoing work for C2Y, the next C standard after C23, expected around 2028-2029.
The Taxonomy Problem
The C standard’s Annex J lists over two hundred instances of undefined behavior. They range from trivially dangerous (dereferencing a null pointer) to historically contingent (behavior when a character value does not fit in char) to genuinely necessary for optimization (strict aliasing). Treating all of them identically is the core problem, and the “Ghosts and Demons” framing makes this concrete.
Ghosts are UBs that remain in the standard because the committee could not agree on what to say about them, but which no current compiler actually exploits. They occupy the same formal category as things that will delete your security checks, but they are harmless in practice. Demons are UBs that compilers actively use as proof-theoretic handles for optimization, and which regularly produce surprising or dangerous behavior in real programs.
The distinction matters because the remedies are different. For ghosts, the fix is reclassification: promote them to implementation-defined or unspecified behavior, require implementations to document what they do, and move on without requiring compiler changes or incurring performance cost. For demons, the committee has to decide whether to define the behavior, require compilers to limit how aggressively they exploit the UB, or provide programmer-facing tools to opt into defined semantics.
How Compilers Use UB as Proof
The canonical example is signed integer overflow. In C, overflow of a signed integer is undefined behavior, which means the compiler may assume it never happens. Consider:
int will_overflow(int x) {
return x + 1 > x;
}
With optimizations enabled, GCC compiles this to return 1. This is correct per the standard: since signed overflow is undefined, the compiler assumes it cannot occur, and therefore x + 1 is always greater than x. The programmer’s intention to detect overflow is defeated entirely.
A more dangerous version appears in security-critical allocation code:
/* Intended to guard against integer overflow */
if (size + extra < size) { abort(); }
void *p = malloc(size + extra);
If size and extra are signed integers, the overflow check is itself UB when the addition overflows. The compiler may eliminate the abort on the grounds that signed overflow “cannot happen,” and the allocation proceeds with an overflowed size value. The result is a heap buffer that is smaller than the code believes it to be.
Strict aliasing produces similarly surprising transformations. The rule in C11 section 6.5p7 says an object may only be accessed through an lvalue expression of a compatible type, with a carve-out for character types. Accessing a float through an int * is undefined behavior:
float f = 3.14f;
int i = *(int *)&f; /* UB: strict aliasing violation */
Compilers use this assumption to completely separate pointer analysis for different types, which dramatically improves load/store scheduling and enables vectorization. The Linux kernel has historically relied on type punning for performance-critical network code, which is why GCC provides -fno-strict-aliasing and the kernel is compiled with it. The correct solution for type punning is memcpy, which compilers optimize to a register move regardless:
int i;
memcpy(&i, &f, sizeof i); /* defined behavior, identical codegen */
Null pointer exploitation follows the same logic. If a pointer is dereferenced, the optimizer may conclude that it was non-null at the point of dereference, and use that inference to eliminate null checks elsewhere in the function:
void process(int *p) {
*p = 42; /* compiler: p must be non-null here */
if (p == NULL) { /* dead code: eliminated by optimizer */
emergency_halt();
}
}
This pattern has appeared in real CVEs. A Linux kernel tun device vulnerability from 2009 involved GCC eliminating a null check because the pointer had been used before the check. Per the standard, the compiler was correct.
What C23 Actually Changed
C23 (ISO/IEC 9899:2024) made one significant step toward reducing UB: it mandated two’s-complement representation for all signed integer types, following N2412. Every conforming implementation had been using two’s-complement for decades; the standard finally wrote down what was already universally true.
The critical point is what C23 did not do. It required two’s-complement representation, meaning negative numbers must be stored as their two’s-complement encoding. It did not define what happens when signed arithmetic overflows. Signed integer overflow remains undefined behavior in C23. The committee was explicit: the representation mandate eliminates UB related to the storage of negative values, but the arithmetic behavior on overflow stays in the UB column, available for compiler exploitation.
C23 also added _BitInt(N), a bit-precise integer type. For unsigned _BitInt, arithmetic wraps modulo 2^N. For signed _BitInt, overflow is still UB. The type gives programmers an opt-in path to exact-width arithmetic, but it does not fix the default behavior of int. The = {} universal zero-initializer syntax reduces uninitialized value UB, and memset_explicit addresses the pattern where compilers eliminate zeroing calls because the zeroed memory is never read again. These are practical improvements, but the core signed overflow problem was left to C2Y.
What C2Y Is Targeting
C2Y work is ongoing, and N3861 is part of a broader effort to treat UB reduction as a first-class deliverable. The proposals in active discussion include the following.
Signed integer overflow is the highest-profile target. The options range from mandating wrapping semantics (equivalent to compiling everything with -fwrapv) to a lighter approach requiring compilers to preserve explicit overflow-check patterns even when they could theoretically optimize them away. A pragma or attribute for wrapping arithmetic is also under consideration, giving programmers opt-in semantics without changing the default behavior for existing code.
Pointer provenance is the most technically complex target. Work began in C23 with N2951 and involves formally specifying what it means for a pointer to be derived from an allocation. The rule that you cannot use a pointer to access a different object, even if the addresses are numerically equal, is currently specified imprecisely. Getting this right matters for alias analysis and for hardware like CHERI that enforces provenance at the hardware level.
The restrict keyword’s aliasing rules in C99 through C17 are notoriously underspecified. Programmers use restrict in high-performance code to communicate that two pointers do not alias, but what happens when they actually do alias is not clearly defined. C2Y aims to provide a formal model.
How Other Languages Have Approached This
Rust’s answer is architectural: safe Rust cannot produce undefined behavior. The borrow checker enforces aliasing rules at compile time. Overflow in debug builds panics; in release builds it wraps. Explicit alternatives are available for any case: wrapping_add, saturating_add, checked_add. The tradeoff is a more complex language with a steeper learning curve and a different ownership model that takes time to internalize.
Zig takes the position that safety is a build mode, not a language property. In debug builds, all illegal behavior causes a panic. In ReleaseFast, UB-equivalent assumptions are enabled, much as in C at -O2. Programmers who want wrapping or saturating arithmetic get explicit +% and +| operators. Zig calls these situations “detectable illegal behavior” rather than undefined behavior, which is more accurate about what is happening operationally. The distinction between “the compiler may assume this never happens” and “this will panic in debug builds” is meaningful, and it is one C has never drawn.
C cannot adopt either approach wholesale. Decades of existing C code, much of it the FFI substrate that other languages depend on for interoperation, constrain any change, and the migration cost of altering the default arithmetic semantics of signed integers would be enormous. What C2Y can do is reduce the blast radius: reclassify the ghosts and establish limits on how compilers may exploit the demons.
The Audit Framework
The N3861 paper represents a shift in how the committee is thinking about the problem. Earlier revisions of the C standard treated UB as a precision problem: these behaviors are undefined because they could not be specified precisely in 1989. C2Y is treating it as a safety problem: these behaviors are vectors for real vulnerabilities, and the standard should be actively hostile to patterns that eliminate security checks.
The audit framework proposed in the paper evaluates each UB instance by whether compilers actually exploit it, whether it reflects genuine hardware diversity among current targets, and what the security cost is in practice. Over two hundred items in Annex J are not going to be resolved in a single standard cycle, but having a methodology for working through them systematically is better than the current situation, in which UBs accumulate by accretion and are almost never removed.
C23 made two’s-complement representation mandatory because every conforming implementation had been using it for twenty years; the committee finally wrote down what was already true. C2Y has the opportunity to do the same thing for signed overflow. Every compiler that ships -fwrapv as an option already knows how to produce wrapping semantics. The question is whether the standard will catch up to the practice, or whether programmers will continue writing overflow checks that the optimizer quietly removes.