
Naming the Demons: What C2Y's Formal Approach to Undefined Behavior Actually Proposes


The phrase “nasal demons” entered C programming lore through a 1992 post on comp.std.c, where John F. Woods observed that the standard technically permits a conforming implementation to make demons fly out of your nose in response to undefined behavior. The phrase caught on because it captured something precise: the C standard does not say “your program will crash.” It says the standard imposes no requirements on the implementation whatsoever. Crashing, corrupting memory, producing wrong output, and silently eliminating safety checks are all equally valid responses.

N3861, titled “Ghosts and Demons: Undefined Behavior in C2Y,” is part of the active standardization cycle for the next C standard revision, informally called C2Y and targeting approximately 2029. The paper borrows vocabulary from formal program semantics, and the choice is deliberate. These terms have specific technical meanings that illuminate exactly what the reform effort is trying to accomplish.

What “Demons” and “Ghosts” Mean

In refinement calculus and predicate transformer semantics, “demonic nondeterminism” describes a nondeterministic choice made adversarially: the demon picks the outcome worst for program correctness. This vocabulary appears in formal verification work, Hoare logic extensions, and game semantics, where it contrasts with “angelic nondeterminism,” in which the nondeterministic choice is made favorably.

The connection to C UB is exact. When a program invokes signed integer overflow, the standard grants the implementation unconstrained freedom. The optimizer is the demon, and it exercises that freedom aggressively. The assumption “signed overflow cannot occur in a correct program” allows the compiler to eliminate safety checks, collapse loop bound calculations, and reorder memory operations. The code becomes faster; the behavior becomes wrong; and the standard says this is fine.

“Ghost” variables appear in Hoare logic and separation logic as logical quantities that exist in the proof or specification but have no runtime representation. In C memory semantics, pointer provenance is ghost-like: the formal model must track which allocation a pointer belongs to, even when that information is absent from the pointer’s bit pattern after an integer round-trip. The PNVI model (provenance-not-via-integers) developed by researchers at Cambridge and INRIA formalizes provenance as surviving certain conversions, giving the optimizer a precise definition of when aliasing can be assumed and when it cannot.

The Optimizer as Adversary

The practical damage from demonic UB is clearest with signed overflow. Consider a common overflow check:

int result = a + b;   /* UB if the sum overflows */
if (result < a) {     /* a valid wrap test only when b > 0 */
    /* overflow "detected", return error */
    return -1;
}

With signed overflow as UB, a conforming optimizer can reason as follows: if result < a were true, then a + b must have overflowed, which is UB, which cannot happen in a correct program. Therefore result < a is always false; the branch is dead code. The check that was supposed to catch overflow is compiled away silently, with no warning.

GCC and Clang both perform this elimination at common optimization levels such as -O2. The -fno-strict-overflow flag disables it. The Linux kernel has built with -fno-strict-overflow (and, in earlier and more recent configurations, -fwrapv) for years to define overflow as two’s complement wraparound, accepting the non-standard extension as necessary for writing correct low-level code.

Strict aliasing UB operates on the same principle. C23 §6.5 says an object may only be accessed through an lvalue of a compatible type, with a few narrow exceptions such as character types. Violating this is UB, so the compiler assumes violations never occur, which means loads and stores through incompatible pointer types can be freely reordered. Type-punning through pointer casts, widespread in serialization code and hardware register access, technically invokes this UB. The Linux kernel build has carried -fno-strict-aliasing as standard configuration for more than two decades, treating it as a necessary correction to the standard’s optimizer permissions.

What C23 Actually Fixed

C23 (ISO/IEC 9899:2024) made meaningful but targeted changes to the UB situation. The most significant was mandating two’s complement representation for signed integers. C had previously allowed ones’ complement and sign-magnitude representations for historical hardware compatibility, and that flexibility was part of the rationale for signed overflow UB. C23 removes that rationale: implementations must use two’s complement. Signed overflow is still UB in C23, but the exotic-hardware justification is gone.

C23 also added unreachable() in <stddef.h>, standardizing what had been __builtin_unreachable() in GCC and Clang. It added _BitInt(N) for bit-precise integers, where unsigned _BitInt(N) wraps on overflow by definition. Several attributes that had been compiler extensions ([[nodiscard]], [[deprecated]], [[noreturn]], [[fallthrough]], and [[maybe_unused]]) were standardized. Signed overflow, strict aliasing, uninitialized reads, and pointer provenance remained unaddressed.

The Erroneous Behavior Proposal

The C2Y reform effort introduces a new tier called “erroneous behavior” (EB). The concept has been developed through a series of WG14 papers in the N3000 series, and N3861 continues this line of work. In the proposed taxonomy, EB occupies the space between “implementation-defined behavior” (the implementation chooses from valid options and must document its choice) and “undefined behavior” (no requirements whatsoever).

Erroneous behavior means the program is incorrect, but the implementation must respond in one of a constrained set of ways, even though which way may be unspecified. The implementation can trap, produce an arbitrary bit pattern, or substitute an unspecified value; what it cannot do is use the erroneous operation as a justification for eliminating surrounding code through the demonic reasoning described above.

For signed integer overflow under EB, an implementation could trap at runtime (as UBSan does today), produce two’s complement wraparound (as -fwrapv does), or produce an unspecified but representable int value. All three responses are valid; the optimizer loses its license to eliminate the overflow’s observable consequences. The distinction is subtle but consequential: the implementation still has latitude, but that latitude does not extend to silent miscompilation.

For uninitialized reads, EB shifts from “arbitrary behavior including code elimination” to “the read produces an unspecified value.” This matches physical reality: most hardware loads whatever bits occupy the register or memory cell. The sanitizer behavior (trap on uninitialized read) becomes a valid conforming implementation choice rather than a helpful debugging extension that operates outside the standard’s model.

What This Means for Sanitizer Tooling

Today, Clang’s UBSan inserts runtime checks for behaviors the standard says impose no requirements. A program that traps under UBSan is neither more nor less correct per the standard than one that continues running with wrong values. UBSan is a useful debugging extension, not a conformance mechanism.

Under erroneous behavior, an implementation that traps on EB becomes a valid, conforming choice. UBSan-style instrumentation becomes a strict conforming mode rather than an unofficial tool. This matters for security and safety contexts where you want deployed builds with sanitizer-style checks to have formally defined semantics.

AddressSanitizer and MemorySanitizer cover the memory safety UBs (out-of-bounds access, use-after-free, uninitialized reads) through shadow memory techniques. The EB proposal would classify the underlying behaviors they detect as formally erroneous rather than undefined, aligning the standard’s model with what these tools have always assumed.

The -fsanitize=undefined workflow has another practical implication here: today, enabling UBSan while also enabling optimization requires careful ordering of passes so that checks are inserted before the optimizer eliminates the undefined code paths. Under erroneous behavior, that tension is reduced because the optimizer is not permitted to eliminate the erroneous paths in the first place.
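Concretely, a deployed-build workflow of this shape (the file name is a placeholder; the flags are real Clang/GCC options) is what gains formally defined semantics under EB:

```shell
# Instrument signed-overflow checks and make them fatal rather than
# recoverable, even in an optimized build.
clang -O2 -fsanitize=signed-integer-overflow \
      -fno-sanitize-recover=all \
      overflow.c -o overflow
```

Today this produces a useful but extra-standard hardening mode; under the EB proposal, the trap it inserts is simply one of the conforming responses.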

The Language Comparison

Zig’s approach to integer overflow is the closest analog to the EB proposal. In Debug and ReleaseSafe builds, integer overflow panics at runtime; in ReleaseFast the checks are removed and overflow becomes what Zig calls illegal behavior, with wrapping arithmetic available explicitly through the +% family of operators. The safety-checked modes show the EB shape: the operation is acknowledged as erroneous, the response is bounded, and the implementation has latitude over the specific outcome. C2Y’s erroneous behavior proposal follows this same structure: acknowledge the operation is erroneous, permit multiple conforming responses, but remove the optimizer’s license to use the operation as an optimization premise.

Rust separates the question architecturally. Safe Rust eliminates most UBs through the ownership and type system at compile time. unsafe Rust has a defined, enumerated list of UBs smaller than C’s Annex J.2, with invariants the programmer must maintain explicitly. Integer overflow in safe Rust panics in debug mode and wraps in release mode, with wrapping_add, checked_add, and saturating_add methods available when the programmer needs explicit control over the behavior.

C cannot adopt Rust’s ownership model without abandoning its identity as a language where the programmer controls the machine directly. The EB proposal takes the Zig path instead: define that something bounded must happen, leave the specific response to the implementation, and remove the compiler’s license to use erroneous operations as optimization premises.

Go simply defines integer overflow as two’s complement wraparound, nil dereference as a panic, and slice bounds violations as a panic. The language accepts the performance cost as a design decision. This is not practical for C, but it illustrates the far end of the spectrum: full specification at the cost of optimizer flexibility.

Where This Leaves Us

The formal semantics vocabulary in N3861 is not academic decoration. Calling the optimizer a demon and provenance a ghost is precise terminology drawn from a long tradition of formal reasoning about program correctness. The Cerberus project at Cambridge has spent years building a rigorous formal model of C memory semantics; the PNVI provenance model and the erroneous behavior proposal are connected pieces of the same effort: building a formal model of C that constrains the optimizer without sacrificing performance, and that is honest about what programs can expect when they make mistakes.

Whether erroneous behavior draws the line correctly is still under discussion. Some proposals push for fully defined overflow, meaning always two’s complement. Others want to retain full UB for performance-critical cases and add EB only for the most common safety hazards. The C2Y standardization cycle will work through these tradeoffs over the next several years.

What the proposal resolves, if adopted, is that “the compiler optimized out your safety check because of UB reasoning” is no longer a valid conforming behavior for the affected operations. The demon does not disappear; it gets a narrower cage. For anyone writing C where correctness and security matter, that narrowing is worth following closely.
