Why Production C Needs Compiler Flags the Standard Doesn't Know About

Source: lobsters

Open any large C codebase built for correctness and security, and you will find compiler flags that quietly contradict the C standard. The Linux kernel’s Makefile carries -fwrapv, -fno-strict-aliasing, and -fno-delete-null-pointer-checks. OpenSSL adds -fno-strict-aliasing. These are not optimization choices or debugging aids; they are declarations that the standard’s model of program behavior does not match what the code requires.

This pattern is the most direct evidence that C’s undefined behavior problem has become structural. N3861, “Ghosts and Demons: Undefined Behavior in C2Y”, the WG14 paper driving C2Y’s approach to UB reform, addresses the formal taxonomy. But the practical argument for reform has been sitting in production Makefiles for two decades.

What the Flags Actually Do

-fwrapv tells GCC and Clang to treat signed integer overflow as two’s complement wraparound rather than undefined behavior. The standard has always permitted compilers to treat signed overflow as impossible, which allows them to eliminate overflow checks before they can execute. This is not theoretical: GCC has eliminated signed overflow checks in security-critical code since at least the 4.x series, and the Linux kernel’s handling of user-controlled sizes, which must guard against integer overflow to prevent heap corruption, made -fwrapv a practical requirement. The kernel added it long before the formal UB discourse reached its current pitch.

-fno-strict-aliasing disables the alias analysis that the C standard enables through section 6.5p7’s rule that an object may only be accessed through a pointer of a compatible type. Violating this rule is undefined behavior; the compiler uses the rule to assume that, for example, a uint32_t * and a float * cannot point to the same memory, enabling freer load and store reordering. Network protocol parsing, hardware register access, and any code that needs to interpret raw bytes as typed values violates this rule in ways the compiler can turn into subtly wrong behavior. The Linux kernel networking stack, OpenSSL’s internal encoding routines, and dozens of similar codebases carry this flag because the alternative is auditing large amounts of working code to replace type-punning casts with memcpy.

-fno-delete-null-pointer-checks prevents a specific transformation: the compiler, having observed that a pointer is dereferenced, concludes it was non-null at the dereference point, and eliminates subsequent null checks on the same pointer. CVE-2009-1897 in the Linux kernel’s tun device driver involved exactly this: GCC eliminated a null check because the pointer had been used before the check. The check was the security boundary. The kernel added -fno-delete-null-pointer-checks as a project-wide precaution after that incident.

The Implicit Dialect

Each of these flags replaces a piece of the C standard’s undefined behavior with a non-standard but defined behavior. -fwrapv replaces “signed overflow: undefined” with “signed overflow: two’s complement wraparound.” -fno-strict-aliasing replaces “incompatible pointer access: undefined” with “incompatible pointer access: whatever the hardware does at the bit level.” -fno-delete-null-pointer-checks replaces “the compiler may use null dereference as an optimization premise” with “the compiler may not.”

This is an implicit dialect of C. Code compiled with these flags behaves correctly for its intended deployment context, but that correctness comes from GCC and Clang extensions, not from the C standard. Change compilers, change the toolchain version, or port to an embedded toolchain without these specific extensions, and the defined behavior disappears. The portability guarantee C is supposed to provide becomes conditional on matching the flag set.
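In practice the dialect is declared in a few Makefile lines. A hypothetical fragment (the annotations are mine, not from any particular project's build):

```make
# The de facto dialect, spelled out as CFLAGS:
CFLAGS += -fwrapv                          # signed overflow wraps (two's complement)
CFLAGS += -fno-strict-aliasing             # type-punning loads/stores are defined
CFLAGS += -fno-delete-null-pointer-checks  # null checks after a dereference survive
```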

Chris Lattner’s three-part series “What Every C Programmer Should Know About Undefined Behavior”, published in 2011, marked the point when the broader community understood that LLVM and GCC were both using UB as a global optimization premise. Lattner’s position was that the standard authorized this behavior and the optimizations were legitimate. He was correct on both counts. The flags became more widely understood as workarounds after that.

For safety-critical and embedded contexts where specific compiler certifications are required, this matters formally. MISRA C and CERT C reference the ISO/IEC 9899 standard. They do not automatically account for -fwrapv semantics. Developers in those contexts carry the implicit contract in their heads, or maintain supplementary documentation listing which non-standard flags are in effect and what behaviors they normalize.

What C2Y’s Erroneous Behavior Tier Actually Addresses

N3861 proposes a three-tier taxonomy: demon UB (the compiler’s optimization license), ghost UB (technically undefined but not exploited in practice), and erroneous behavior (EB), a wrong-program operation that must produce a bounded response. The EB tier is defined specifically to match what -fwrapv-style flags already provide in practice.

An implementation handling EB may trap, produce an unspecified representable value, or wrap as two’s complement. What it may not do is use the erroneous operation as a global optimization premise to eliminate surrounding code. The adversarial model, where the compiler uses UB as proof that a code path is unreachable and eliminates it, is not valid for EB operations.

If signed integer overflow moves to EB, -fwrapv becomes unnecessary for the security-critical use case. The compiler would be required to produce a bounded result without being permitted to eliminate overflow checks. The kernel’s concern about overflow guards being optimized away disappears because the behavior is now erroneous rather than undefined: the check is meaningful even if the overflow itself still violates the contract.

For uninitialized reads, EB matches what valgrind and MemorySanitizer already assume: the read produces whatever bits are in memory. Under full UB, a conforming compiler could delete code following an uninitialized read on the grounds that correct programs never read uninitialized memory, therefore this path is unreachable. Under EB, the read produces an unspecified value and the program continues with that value. The compiler cannot use the read as justification for eliminating surrounding code. C++26 is pursuing the same proposal through P2795R5, with WG14 and WG21 coordinating the design in parallel.

The Strict Aliasing Exception

The -fno-strict-aliasing case is more complicated, and C2Y will likely leave it more complicated. Strict aliasing UB is not primarily a ghost or a demon in the N3861 sense; it is genuinely load-bearing for performance. The alias analysis it enables, the assumption that float * and int * cannot alias, drives SIMD vectorization, loop hoisting, and store-to-load forwarding across type boundaries. This produces measurable differences on numerical and rendering code, not just micro-benchmark noise.

The correct fix for type punning is memcpy, which compilers optimize to a register move at -O2:

/* Undefined behavior: strict aliasing violation */
float f = 3.14f;
int i = *(int *)&f;

/* Defined behavior, compiles identically (requires <string.h>) */
int j;
memcpy(&j, &f, sizeof j);

But a codebase that has relied on type-punning casts for twenty years cannot be migrated automatically, and the undefined behavior is currently silent: the code appears to work until an optimizer becomes aggressive enough to exploit it. -fno-strict-aliasing will remain in production Makefiles for some time after C2Y, even if the standard’s EB tier addresses overflow and uninitialized reads. The Cerberus formal semantics project at Cambridge, whose PNVI provenance model informs C2Y’s pointer aliasing work, approaches this through pointer provenance rather than type-based aliasing rules, but that model change is a longer-term effort than the EB tier.

The Specification Debt

The proliferation of UB-suppression flags in production C codebases is not an implementation failure; it is a specification failure. The standard describes a language that cannot be used safely for the tasks people actually use C for, so practitioners build a private dialect using compiler extensions and document it in project Makefiles rather than standards documents.

C2Y’s taxonomy work, particularly the EB tier in N3861, is an attempt to bring the formal specification closer to the implicit specification that real codebases already operate under. If signed integer overflow becomes EB, the standard begins to describe the same language the Linux kernel has been compiled under for years. That convergence has practical value for toolchain portability, safety certification, and the coherence of teaching C to new developers.

The flags will not disappear immediately even if C2Y adopts the EB proposals. Strict aliasing remains unresolved. Pointer provenance work is ongoing. But the direction is correct: a standard that describes production C, rather than a hypothetical language where signed overflow genuinely cannot occur in correct programs, is more useful than one that does not.
