Swift Already Solved This: What Zig's Per-Function LLVM Modules Learn From WMO

The Zig project’s April 8, 2026 devlog entry describes progress on incremental compilation through the LLVM backend. The technical approach, creating a separate LLVM Module per function rather than one per whole program, looks novel. It is not. Swift worked through the same wall years earlier, arrived at a partially similar answer, and the differences between the two approaches reveal something worth paying attention to.

Swift’s Wall, and the WMO Answer

When Swift launched in 2014, it used LLVM as its sole code generation backend. The compilation model matched how Clang works: one LLVM Module per source file, each compiled independently, linked together. This gave incremental builds, but it also gave LLVM no visibility across file boundaries. Inlining, dead code elimination, and interprocedural optimization could not operate across the program as a whole.

For a language that was competing against Objective-C and making claims about performance, this mattered. Apple’s solution, shipped in Swift 2.0 in 2015, was Whole Module Optimization (WMO). With WMO enabled, the compiler merges all source files into a single LLVM Module before running the optimizer. The result looks like LTO but happens before the linker rather than during it.

The tradeoff was explicit and documented: WMO makes release binaries run significantly faster by enabling cross-file inlining and eliminating dead code, but it gives up incremental compilation. When any source file changes, the entire module is recompiled. Apple recommended WMO for release builds and left the default per-file compilation for development.

This is nearly identical to the split Zig is implementing, but at a different granularity. Swift’s split was per-file vs. whole-module. Zig’s split is per-function vs. whole-program. Zig is resolving the same fundamental tension at a finer level.
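As a rough illustration of the granularity difference, the following Python sketch computes which functions need their code regenerated after a single edit. The project layout, the file names, and the rebuild_set helper are all hypothetical. It also ignores dependents: a real per-file build may recompile other files that depend on the edited one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Function:
    name: str
    file: str


# Hypothetical project: three functions spread over two source files.
PROJECT = [
    Function("parse", "parser.src"),
    Function("lex", "parser.src"),
    Function("main", "main.src"),
]


def rebuild_set(changed: str, granularity: str) -> set[str]:
    """Return the names of functions whose code must be regenerated."""
    target = next(f for f in PROJECT if f.name == changed)
    if granularity == "whole":     # WMO or whole-program: everything
        return {f.name for f in PROJECT}
    if granularity == "file":      # per-file: everything in the edited file
        return {f.name for f in PROJECT if f.file == target.file}
    if granularity == "function":  # per-function: only the edited function
        return {target.name}
    raise ValueError(f"unknown granularity: {granularity}")
```

Calling rebuild_set("lex", ...) with "whole", "file", and "function" yields progressively smaller sets, which is the whole point of moving the boundary down to the function level.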

What LLVM’s Type System Does to This Plan

The per-function Module approach sounds straightforward until you think about what LLVM IR types are. LLVM uniques types inside an LLVMContext: two literal types with the same structure are represented by the same object, so type equality is a pointer comparison. Modules that share one context get this automatically. When each function lives in its own Module with its own context, it breaks.

Consider a function foo that takes a struct Point as a parameter, and a function bar that also takes Point. In one context, both functions reference the same LLVM StructType*. In separate contexts, each function has its own StructType that happens to describe the same layout. These are different objects. Anything that later combines the two at the IR level, such as LLVM’s IR linker during LTO, needs to know they are compatible. LLVM can match identified (named) struct types across Modules by name and layout, but the compiler has to be deliberate about naming them consistently. Literal (anonymous) struct types carry no name and are compared purely by structure, so they cannot preserve the distinction between two source-level types that share a layout.

Zig’s InternPool, the global interning table for types, values, and declarations in the self-hosted compiler, provides the canonical identity. Every struct type, every enum type, every function signature has a single authoritative representation in the InternPool. When the compiler creates per-function LLVM Modules, it can use the InternPool identities to generate consistent names for LLVM named struct types. Unchanged types get the same name in every Module that uses them. The linker’s type deduplication then works correctly.
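The idea can be sketched with a toy model in Python. Nothing here is the LLVM or Zig API: Context, StructType, InternPool, and stable_name are hypothetical stand-ins, and the naming scheme (qualified name plus intern index) is an assumption chosen only to show how a single canonical table can hand out the same name to every Module.

```python
class StructType:
    """Toy type object; identity (not equality) is what matters."""

    def __init__(self, fields):
        self.fields = fields


class Context:
    """Toy stand-in for a per-context type table (not the LLVM API)."""

    def __init__(self):
        self._types = {}

    def struct(self, fields):
        # Structural uniquing: the same field tuple yields the same object,
        # but only within this one context.
        key = tuple(fields)
        if key not in self._types:
            self._types[key] = StructType(key)
        return self._types[key]


class InternPool:
    """Toy global interning table shared by the whole compilation."""

    def __init__(self):
        self._index = {}

    def intern(self, key) -> int:
        # First sight of a key assigns the next index; later lookups reuse it.
        return self._index.setdefault(key, len(self._index))


def stable_name(pool, qualified_name, fields) -> str:
    """Derive a name that is identical in every Module using this type."""
    index = pool.intern((qualified_name, tuple(fields)))
    return f"{qualified_name}.{index}"
```

Two Context instances return different objects for the same field list, while stable_name returns the same string no matter which Module asks, and different strings for two distinct source types that happen to share a layout.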

This is one of those problems that is not hard once you have the right data structure in your compiler, but is practically unsolvable without it. C++ compilers deal with a version of this through the One Definition Rule and COMDAT sections. Zig has a cleaner foundation.

Debug Information Across Module Boundaries

DWARF, the debug information format used by LLVM on most platforms, has its own version of the type identity problem. DWARF describes types using DW_TAG_structure_type and similar tags, and a type described in one compilation unit can be referenced from another through cross-unit references such as DW_FORM_ref_addr. When one LLVM Module per function produces one compilation unit per function in the resulting DWARF, the type graph gets fragmented.

DWARF defines type units, which LLVM can emit, as a mechanism for factoring shared type descriptions out of individual compilation units so that several units can reference a single copy. Properly using type units avoids the type bloat that comes from every compilation unit independently describing struct Point. The Zig compiler needs to use this consistently when emitting debug info for per-function Modules, or the resulting binary will have debug information that is correct but large and slow to parse in a debugger.
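To make the bloat concrete, here is a small Python sketch that counts type descriptions with and without signature-based sharing. It is a toy model, not a DWARF emitter: the UNITS table and the textual type descriptions are made up, and type_signature only imitates the idea of an 8-byte type signature rather than following the DWARF hashing rules.

```python
import hashlib

# Hypothetical per-function compilation units and the types each describes.
UNITS = {
    "foo": ["struct Point { f32 x; f32 y; }"],
    "bar": ["struct Point { f32 x; f32 y; }",
            "struct Rect { Point a; Point b; }"],
    "baz": ["struct Point { f32 x; f32 y; }"],
}


def type_signature(description: str) -> str:
    """Toy 8-byte signature of a canonical type description."""
    return hashlib.md5(description.encode()).hexdigest()[-16:]


def emitted_type_descriptions(units, use_type_units: bool) -> int:
    """Count how many type descriptions end up in the debug info."""
    if not use_type_units:
        # Every unit carries its own copy of every type it mentions.
        return sum(len(types) for types in units.values())
    # With sharing, each distinct signature is described once.
    signatures = {
        type_signature(t) for types in units.values() for t in types
    }
    return len(signatures)
```

With three units that all mention Point, the unshared count is four descriptions and the shared count is two. The gap grows with the number of functions, which is why it matters more at per-function granularity than at per-file granularity.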

Swift’s WMO largely sidesteps this issue: the merged-module approach gives the debug info emitter a single type graph to work from. Per-function Modules require handling it explicitly.

The ORC JIT Design as a Preview

LLVM’s own codebase has an architecture for exactly the use case of treating LLVM as a per-function code generator: ORC JIT. ORC (On-Request Compilation) is LLVM’s current JIT framework and the successor to MCJIT, designed around the idea that code is compiled and linked on demand into a running process. A unit of JIT compilation can be a small Module containing one or a few functions. ORC manages symbol resolution across these small Modules so that a function compiled in one Module can call a function compiled in another.

The machinery ORC uses (JITDylib for managing symbol tables, MaterializationUnit for deferred compilation, and ResourceTracker for controlling symbol lifetimes) was designed for JIT scenarios, but the underlying model applies to AOT incremental compilation as well. The key difference is that in JIT, you need these symbols to resolve into a running process’s address space at runtime. In AOT incremental compilation, you need them to resolve at link time. The linker replaces ORC’s runtime dynamic linker, but the per-function Module granularity transfers directly.

Zig’s per-function Module approach applies ORC’s conceptual model ahead-of-time rather than at runtime. The Zig linker, which the self-hosted compiler includes rather than delegating to an external tool for most targets, handles the symbol resolution that ORC handles in JIT mode.
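A minimal Python sketch of that link-time model follows. Unit and IncrementalLinker are hypothetical names for illustration and do not correspond to ORC classes or to Zig’s linker. The assumption is that each compiled function is one unit that defines a single symbol and records which symbols it calls.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """One compiled function: the symbol it defines and what it calls."""
    defines: str
    calls: set[str] = field(default_factory=set)
    version: int = 0


class IncrementalLinker:
    """Toy symbol table that resolves calls across per-function units."""

    def __init__(self):
        self.symbols: dict[str, Unit] = {}

    def add(self, unit: Unit) -> None:
        # Adding a unit for an existing symbol replaces only that symbol;
        # every other unit stays exactly as it was.
        self.symbols[unit.defines] = unit

    def unresolved(self) -> set[str]:
        # A call is unresolved if no unit currently defines its target.
        return {
            callee
            for unit in self.symbols.values()
            for callee in unit.calls
            if callee not in self.symbols
        }
```

After an edit to one function, only its unit is replaced through add, and callers keep referring to it by symbol name. That is the same indirection ORC relies on at runtime, moved to link time.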

What the Two-Mode Split Actually Means

Both Swift and Zig have arrived at the same architectural decision: development builds and release builds are genuinely different compilation pipelines, not the same pipeline with different flags. This is the honest answer to a real constraint.

In Swift’s case, the split is between per-file (incremental, no cross-file optimization) and whole-module (full rebuild, full optimization). In Zig’s case, it will be between per-function LLVM Modules (incremental, no cross-function inlining) and whole-program LLVM compilation (full optimization). The Zig version is more fine-grained because the dependency graph the compiler maintains operates at function granularity, not file granularity.

The practical implication is that profiling a debug build in either language can be misleading. Inlining changes hot paths. Without it, function call overhead that is invisible in a release build shows up in profiling data, and the hot functions are different. This is not a new problem; C developers have known for decades that profiling an unoptimized binary gives you different results from profiling an optimized one. Making it explicit in the toolchain documentation is the responsible thing to do.
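A toy Python model shows how the hot function can change. The sample data and the profile helper are invented for illustration: each sample is a (caller, callee) pair, and an inlined callee has its time attributed to its caller, which is a simplification of what real profilers report.

```python
from collections import Counter

# Hypothetical samples: render calls clamp often and blend rarely.
SAMPLES = [("render", "clamp")] * 1000 + [("render", "blend")] * 10


def profile(samples, inlined):
    """Attribute each sample to the callee, or to the caller if inlined."""
    counts = Counter()
    for caller, callee in samples:
        counts[caller if callee in inlined else callee] += 1
    return counts


debug_profile = profile(SAMPLES, inlined=set())         # nothing inlined
release_profile = profile(SAMPLES, inlined={"clamp"})   # clamp inlined
```

In the debug profile the hottest entry is clamp. In the release profile clamp disappears and render takes its place, so an optimization decision based on the debug data would target a function that does not exist as a separate call in the shipped binary.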

What the April 2026 devlog marks is that the LLVM incremental path is becoming usable, not just designed. The type deduplication machinery, the debug info handling, and the symbol resolution infrastructure are the engineering work that turns a sound architectural decision into something you can actually compile code with. Swift shipped WMO in 2015 after dealing with equivalent infrastructure problems. Zig is working through the same class of problems with a more ambitious target.