The C Preprocessor Is Not Supposed to Be Turing-Complete, and Yet

The C preprocessor is a text substitution engine. It scans tokens, replaces macro names with their expansions, and moves on. That is the entire contract. There is no type system, no scope, no runtime, and no loop construct. And yet, people have built iteration, recursion, and generic programming on top of it, using nothing but the rules of token scanning.

pfultz2’s Cloak library is one of the clearest expositions of how this actually works. Reading it forces you to confront how the preprocessor scans text, because the techniques only make sense once you understand exactly what the scanner is doing at each pass.

The Fundamental Model

The preprocessor processes tokens in one left-to-right pass. When it encounters a macro name, it expands it, then scans the result. The critical rule: a macro is not re-expanded within its own expansion. This prevents infinite loops, but it also prevents direct recursion. If FOO expands to something that mentions FOO, the inner FOO is marked as already expanded and left alone.

This is the wall everyone hits when they first try to write recursive macros in C.

#define FOO FOO // This does not recurse. It expands once and stops.

The preprocessor’s scan-once-per-macro guarantee is what makes the advanced techniques necessary, and what makes them work.

X-Macros: The Practical Starting Point

Before getting to recursion, the most broadly useful pattern is also the simplest: X-macros. The idea is to separate data from code generation.

#define COLORS \
    X(RED,   0xFF0000) \
    X(GREEN, 0x00FF00) \
    X(BLUE,  0x0000FF)

Now you can use this list to generate whatever you need by defining X differently each time:

// Generate an enum
#define X(name, value) name = value,
typedef enum { COLORS } Color;
#undef X

// Generate a string lookup table
#define X(name, value) [name] = #name,
static const char *color_names[] = { COLORS };
#undef X

This pattern shows up in production codebases constantly. SQLite uses it for opcode definitions, where the same list of opcodes drives the enum, the name table, and the bytecode interpreter’s jump table. The Linux kernel uses it for CPU feature flags and error code tables. The appeal is genuine: you maintain one canonical list and derive everything else from it automatically. Adding a new entry means editing one place.

The weakness is readability. Code that relies on an undefined X and expects the caller to define it before inclusion is genuinely confusing to anyone reading it cold. Comments help, but the indirection remains.

The Deferred Expression Technique

X-macros get you a long way, but they cannot loop over a generated or computed list. For that, you need something closer to iteration, and that means confronting the scan-once rule directly.

The deferred expression technique works by inserting an extra level of indirection that delays macro expansion until a subsequent scan. Two helper macros are the foundation:

#define EMPTY()
#define DEFER(id) id EMPTY()

EMPTY() expands to nothing. When you write DEFER(SOME_MACRO), the preprocessor expands DEFER, producing SOME_MACRO EMPTY(). At that point, SOME_MACRO is adjacent to EMPTY(), but EMPTY() has not been expanded yet. The scanner sees SOME_MACRO and the not-yet-expanded EMPTY(). In the next pass, EMPTY() expands to nothing, leaving SOME_MACRO exposed for expansion.

This is the key insight: you are not recursing directly; you are delaying evaluation until the preprocessor makes another pass over the output. The Cloak library pairs this with an EVAL macro that triggers multiple expansion passes:

#define EVAL(...)  EVAL1(EVAL1(EVAL1(__VA_ARGS__)))
#define EVAL1(...) EVAL2(EVAL2(EVAL2(__VA_ARGS__)))
#define EVAL2(...) EVAL3(EVAL3(EVAL3(__VA_ARGS__)))
#define EVAL3(...) EVAL4(EVAL4(EVAL4(__VA_ARGS__)))
#define EVAL4(...) EVAL5(EVAL5(EVAL5(__VA_ARGS__)))
#define EVAL5(...) __VA_ARGS__

Nesting three-expansion macros five levels deep gives you 3^5 = 243 total passes. This is the upper bound on your “recursion depth”: whatever you want to iterate, you are bounded by how many expansion passes EVAL performs.

This is not a hack around the standard; it is a deliberate exploitation of the standard’s scanning rules. The technique is documented in the Cloak wiki and has been independently reinvented in multiple libraries.

FOREACH and Recursive Map

Once you have deferred expressions and EVAL, building a FOREACH macro is mechanical:

#define FOREACH(macro, first, ...) \
    macro(first) \
    __VA_OPT__(DEFER(FOREACH_INDIRECT)()(macro, __VA_ARGS__))

#define FOREACH_INDIRECT() FOREACH

The indirect reference through FOREACH_INDIRECT is what prevents the scan-once rule from stopping expansion. When the preprocessor scans the output of FOREACH, it sees FOREACH_INDIRECT() rather than FOREACH itself, so the not-already-expanded rule does not apply. FOREACH_INDIRECT() expands to FOREACH, which is then available for expansion in the next pass triggered by EVAL.

Note that __VA_OPT__ requires C23 or C++20. For older codebases, you need a different emptiness-detection trick, typically using variadic argument counting combined with token concatenation.

The Boost.Preprocessor library takes a different approach to the same problem. Rather than using deferred evaluation, it uses a finite, pre-generated set of iteration macros at different depths. BOOST_PP_FOR is not a general recursive macro; it is a large collection of numbered macros (BOOST_PP_FOR_1, BOOST_PP_FOR_2, …) that call each other in sequence. This avoids the scan-once problem entirely by never actually recursing through the same macro name twice. It is more portable across older preprocessors but harder to extend beyond its pre-baked depth.

Compile-Time Conditionals and Overloading

Two other patterns from the Cloak wiki are worth understanding. The first is compile-time conditional selection:

#define CAT(a, ...) PRIMITIVE_CAT(a, __VA_ARGS__)
#define PRIMITIVE_CAT(a, ...) a ## __VA_ARGS__

#define IIF(c) PRIMITIVE_CAT(IIF_, c)
#define IIF_0(t, ...) __VA_ARGS__
#define IIF_1(t, ...) t

IIF(COND)(true_branch, false_branch) expands to one branch or the other based on whether COND is 0 or 1. This works because IIF(0) expands to IIF_0, which discards its first argument, and IIF(1) expands to IIF_1, which keeps it.

The second is argument counting for macro overloading:

#define VA_NARGS_IMPL(_1,_2,_3,_4,_5,N,...) N
#define VA_NARGS(...) VA_NARGS_IMPL(__VA_ARGS__, 5, 4, 3, 2, 1)

#define OVERLOAD(name, ...) CAT(name, VA_NARGS(__VA_ARGS__))(__VA_ARGS__)

With this, you can write OVERLOAD(foo, a, b, c) and have it expand to foo3(a, b, c). This is how many C APIs simulate variadic function overloading without actual C overloading support.

What Modern C and C++ Replace

The honest answer to “should I use these techniques” depends heavily on your toolchain constraints.

In C++, most use cases for preprocessor metaprogramming have been superseded. Constexpr functions handle compile-time computation with actual type safety and readable error messages. C++17 fold expressions handle variadic expansion. Concepts replace macro-based type guards. If you are writing C++20 or later, the preprocessor should mostly be reduced to include guards and platform detection.

In C, the situation is more nuanced. C23 added typeof, improved _Generic, and constexpr for object declarations. These close some gaps, but not all. Generic data structures in C still often use preprocessor techniques because templates do not exist, and void pointers sacrifice type safety. The Linux kernel, which targets older C standards by necessity, uses preprocessor tricks extensively throughout.

Rust’s macro system is worth comparing. Declarative macros (macro_rules!) match patterns syntactically and produce hygienically scoped output, which eliminates most of the footguns in C macro writing. Procedural macros go further, operating on the token stream as a proper AST. The result is a system that can do everything C preprocessor tricks do, with comprehensible error messages and no scan-order subtleties to reason about.

When Preprocessor Metaprogramming Still Earns Its Place

Four scenarios justify this level of complexity in C code today.

First, X-macros for maintaining parallel data structures. When you have a table of items that needs to appear as an enum, a name array, a dispatch table, and a documentation string, X-macros keep all four in sync with one canonical definition. This is maintenance value, not cleverness.

Second, portability layers. When you need the same interface to work across radically different platforms, the preprocessor lets you select implementations at compile time without runtime overhead. The Linux BUILD_BUG_ON family of macros, which generate compile errors for violated static assertions, are clearer than any runtime check.

Third, code generation for performance-critical paths. Manually unrolled loops, SIMD dispatch tables, and lookup tables generated from data definitions can all benefit from preprocessor-driven generation when the alternative is either hand-duplication or slower runtime dispatch.

Fourth, embedded and kernel code where C++ is unavailable or undesirable and C standards are locked to C11 or earlier.

For application-level code in modern C or C++, the complexity cost of recursive FOREACH macros is almost never justified. The techniques in the Cloak wiki are worth understanding because they reveal something real about how the preprocessor works, and because you will encounter them in production code. Using them as the default solution to iteration problems in new code is a different matter.

The preprocessor was designed to solve a narrow set of problems: conditional compilation, textual inclusion, and simple substitution. The fact that you can build a Turing-complete computation engine on top of it is an emergent property of those simple rules, not a design goal. Understanding the tricks gives you leverage when you genuinely need them. The rest of the time, reaching for a language feature that was designed for the job is the more defensible choice.