Splitting the Module: How Zig Brings Incremental Compilation to the LLVM Backend

One of the harder problems in compiler engineering is making incremental compilation work well when LLVM sits in the middle of the pipeline. Andrew Kelley’s Zig devlog entry from April 8, 2026 describes progress on exactly this problem: extending Zig’s existing incremental compilation system to cover the LLVM backend. The technical choices involved expose a structural tension that every language using LLVM eventually runs into.

Two Backends, Two Strategies

Zig’s self-hosted compiler maintains two distinct code generation paths. The LLVM backend emits LLVM IR, sends it through LLVM’s optimization pipeline, and gets back native machine code. This is the release build path, the one that produces optimized binaries. Alongside it sits a family of native backends, most notably the x86_64 backend primarily authored by Jacob Young, that emit machine code directly from Zig’s typed intermediate representation (called Air) without constructing any LLVM IR at all.

LLVM’s optimization infrastructure is extensive: inlining, loop unrolling, auto-vectorization, scalar replacement of aggregates, and a range of other passes that produce substantially better code than a simple direct emitter. But LLVM’s optimization model is built around whole-module analysis, and its most powerful passes are inter-procedural. They need to see multiple functions, or the whole program, to be effective. The native backends give up some code quality and gain a critical property for development: they participate fully in Zig’s incremental compilation system.

How Zig’s Incremental System Works

Zig’s self-hosted compiler performs fine-grained dependency tracking at the level of individual declarations. When you change a function, the compiler re-parses only the affected file, re-analyzes only the declarations whose dependencies have changed, and re-emits only the functions that were invalidated. For the native backends, re-emission patches machine code directly into the in-memory binary image, updating the on-disk binary as a persistent compilation artifact rather than rewriting it from scratch.

This architecture is developed in detail across the Zig project repository and in Kelley’s running devlog. The key property is that the compiler maintains a live dependency graph at declaration granularity, not file granularity. When a comptime constant changes, the graph tells the compiler exactly which downstream functions were affected. When a function signature changes, the compiler computes precisely which callers need re-analysis and which do not.

This separates Zig’s approach from build-system-level incrementality, where a changed file triggers recompilation of every file that includes it. Zig’s system tracks actual semantic dependencies rather than syntactic inclusion relationships, so the set of recompiled functions is as small as correctness allows.
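The difference between the two granularities can be sketched in a few lines of Python. This is a toy model, not Zig’s implementation: the declaration names, file names, and the DECLS table are invented for illustration, and every recorded edge is treated as one that forces re-analysis.

```python
from collections import defaultdict

# Hypothetical program: declaration -> (file it lives in, declarations it uses).
DECLS = {
    "buf_size":  ("config.zig", []),
    "log_level": ("config.zig", []),
    "alloc_buf": ("mem.zig",    ["buf_size"]),
    "free_buf":  ("mem.zig",    []),
    "log":       ("log.zig",    ["log_level"]),
    "main":      ("main.zig",   ["alloc_buf", "free_buf", "log"]),
}


def _reachable(start, edges):
    """Return `start` plus everything reachable from it through `edges`."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in edges[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def invalidated_by_decl(decls, changed):
    """Declaration granularity: the changed declaration and its users."""
    users = defaultdict(set)
    for name, (_file, deps) in decls.items():
        for dep in deps:
            users[dep].add(name)
    return _reachable(changed, users)


def invalidated_by_file(decls, changed):
    """File granularity: every declaration in any file that depends on the changed file."""
    file_of = {name: f for name, (f, _deps) in decls.items()}
    file_users = defaultdict(set)
    for name, (f, deps) in decls.items():
        for dep in deps:
            if file_of[dep] != f:
                file_users[file_of[dep]].add(f)
    dirty_files = _reachable(file_of[changed], file_users)
    return {name for name, f in file_of.items() if f in dirty_files}
```

In this toy program, editing log_level invalidates three declarations under declaration-level tracking (log_level, log, main) and all six under file-level tracking, because config.zig is used by every other file, directly or indirectly.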

Why LLVM Cannot Participate by Default

LLVM has no API for “re-compile this one function and give me updated machine code to patch into an existing binary.” The fundamental unit of work in LLVM is the module: an llvm::Module (LLVMModuleRef in the C API) containing functions and globals, with the optimization pipeline operating over the whole thing. Inlining decisions depend on callee properties only visible when caller and callee share the same module. Inter-procedural passes propagate constants, attributes, and alias information across function boundaries. LLVM can even switch a function with internal linkage to a faster calling convention once it can see every caller.

ThinLTO, LLVM’s scalable approach to link-time optimization, addresses some of this by computing per-module summaries and enabling targeted cross-module inlining without loading all bitcode at once. It also supports disk caching of per-module optimization results, so unchanged modules are not re-optimized on subsequent builds. But ThinLTO’s unit of caching is still the module, and it assumes all modules are available at link time. It reduces redundant work across large builds; it does not minimize the work triggered by changing one function.

The tension is fundamental. Global optimization needs to see everything to work well; incremental compilation needs to see as little as possible to work fast. There is no resolution that satisfies both fully, only trade-offs between them.

Per-Function Modules as the Solution

The approach Zig is pursuing for the LLVM backend treats each function as its own LLVM module. When a function needs to be emitted, the compiler creates a minimal module containing just that function, runs it through LLVM independently, and links the result into the binary. When a function changes, only its per-function module gets recompiled. The linking step assembles the updated function into the final binary alongside everything that did not change.
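A minimal sketch of that shape, in Python rather than Zig, and with no LLVM involved: compile_unit is a stand-in for "build a one-function module and run it through the backend", and PerFunctionEmitter is a hypothetical name. Staleness is detected here by hashing the function’s IR text, which is an assumption made for the sketch; Zig’s compiler decides what to re-emit from its dependency graph.

```python
import hashlib


def compile_unit(name, ir_text):
    """Stand-in for compiling a one-function module (hypothetical, no LLVM API)."""
    digest = hashlib.sha256(ir_text.encode()).hexdigest()[:8]
    return f"obj({name}):{digest}".encode()


class PerFunctionEmitter:
    """Keeps one compiled unit per function and recompiles only the stale ones."""

    def __init__(self):
        self.ir_hash = {}       # function name -> hash of the IR last compiled
        self.objects = {}       # function name -> compiled unit
        self.compiled_log = []  # names recompiled so far, in order

    def update(self, functions):
        """Bring the cached units in line with `functions` (name -> IR text)."""
        for name in list(self.objects):
            if name not in functions:  # the function was deleted
                del self.objects[name], self.ir_hash[name]
        for name, ir_text in functions.items():
            digest = hashlib.sha256(ir_text.encode()).hexdigest()
            if self.ir_hash.get(name) != digest:
                self.objects[name] = compile_unit(name, ir_text)
                self.ir_hash[name] = digest
                self.compiled_log.append(name)

    def link(self):
        """Stand-in for the link step: join all units in a stable order."""
        return b"\n".join(self.objects[n] for n in sorted(self.objects))
```

Calling update with three functions compiles three units. Editing one function and calling update again compiles exactly one more, and link then reassembles the output from one fresh unit and two cached ones.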

This gives up inter-procedural optimization between separately compiled functions. In a debug build, that is an acceptable trade. Most optimizations are off in debug mode anyway, and getting from a file save to a running updated binary in well under a second is more valuable than LLVM’s inliner having visibility across all call sites. The per-function module approach serves this use case directly.

For release builds, the strategy remains unchanged: compile everything through LLVM as a full module, accept the longer build time, and get the optimized output. The two use cases get two different compilation strategies, each matched to what matters in its context. This debug/release split is not a novel idea, but implementing it at function granularity is more demanding than it looks.

How Rust and Swift Navigate the Same Problem

Zig is not the first LLVM-based language to pursue fine-grained incremental compilation. Rust enabled incremental compilation by default in version 1.24. It splits a crate into “codegen units” (CGUs), each compiled as a separate LLVM module and cached to disk. When a function changes, only the CGU containing that function is recompiled. Rust’s dep-graph tracks semantic dependencies between compiler queries to determine which CGUs are invalidated.

CGU-level incrementality works well in practice, but each unit typically contains many functions. When one function in a CGU changes, LLVM recompiles every function in that unit. The granularity is coarser than Zig’s per-function model by design: managing thousands of tiny LLVM modules would impose its own overhead in IR construction, optimization pipeline setup, and linking. CGU size is a tunable parameter that trades incremental granularity against per-compilation overhead, and Rust’s tooling has spent years finding reasonable defaults.
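The granularity trade-off can be made concrete with a small model. The round-robin assignment below is a hypothetical policy chosen only to keep the sketch short; it is not how rustc partitions a crate, and the function names are invented.

```python
def partition(functions, unit_count):
    """Assign functions to units round-robin (a hypothetical policy)."""
    units = [[] for _ in range(unit_count)]
    for i, name in enumerate(functions):
        units[i % unit_count].append(name)
    return units


def recompiled_after_change(functions, unit_count, changed):
    """Return the functions the backend must reprocess when `changed` is edited."""
    for unit in partition(functions, unit_count):
        if changed in unit:
            return unit
    raise KeyError(changed)


functions = [f"fn_{i}" for i in range(1000)]
for unit_count in (1, 16, 1000):
    redo = recompiled_after_change(functions, unit_count, "fn_42")
    print(f"{unit_count} units: {len(redo)} functions recompiled")
```

With 1000 functions, a single unit reprocesses all 1000 after a one-function edit, 16 units reprocess 62, and 1000 units reprocess exactly one. The last configuration is the per-function model, and it is also the one that has to manage 1000 separate modules.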

Swift’s approach varies by build mode. In debug mode, each source file is a separate LLVM module, giving file-level incremental compilation without a complex dependency graph. In release mode, Swift uses whole-module optimization, processing all files together for maximum cross-file analysis. Swift 5.4 added fine-grained dependency tracking between files to reduce how many files a change forces to recompile, which is similar in intent to Zig’s declaration-level dependency graph but operates at file rather than declaration granularity.

The pattern across these languages is consistent: debug builds favor fine-grained incrementality with minimal optimization; release builds favor whole-program or near-whole-program analysis. Zig is implementing the same split with a smaller unit of code generation than either Rust’s codegen units or Swift’s per-file modules, which is possible because its dependency tracking already operates at declaration granularity.

Why Dependency Precision Is the Prerequisite

The precision of Zig’s dependency tracking is what makes per-function module emission viable at scale. Without accurate tracking, the compiler would have to recompile functions conservatively whenever anything in their transitive dependency set changed. With per-declaration tracking, it recompiles only what actually needs recompilation.

This matters more at the LLVM backend level than at the native backend level, because each LLVM compilation is heavier. Native backend emission is fast enough that over-approximation is relatively cheap; LLVM module construction, optimization, and object file generation are not. If the dependency graph produces too many spurious invalidations, too many per-function modules get recompiled, and the incremental benefit narrows. The same precise dep graph that makes native backend incremental compilation practical is the prerequisite for making LLVM incremental compilation worthwhile.
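To illustrate the gap between conservative and precise invalidation, here is a toy comparison in Python. The call graph and function names are made up, and the precise rule rests on a simplifying assumption stated in the code: callers depend on a callee’s signature and never on its body, which holds only when nothing is inlined across units.

```python
from collections import defaultdict

# Hypothetical call graph: caller -> callees.
CALLS = {
    "main":   ["parse", "report"],
    "parse":  ["lex"],
    "report": [],
    "lex":    [],
}


def callers_of(calls):
    """Invert the call graph: callee -> set of direct callers."""
    rev = defaultdict(set)
    for caller, callees in calls.items():
        for callee in callees:
            rev[callee].add(caller)
    return rev


def conservative(calls, changed):
    """Recompile everything that can reach the changed function."""
    rev = callers_of(calls)
    seen, stack = {changed}, [changed]
    while stack:
        for caller in rev[stack.pop()]:
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    return seen


def precise(calls, changed, signature_changed):
    """Recompile the function, plus its direct callers only on a signature change.

    Assumption: callers depend on the callee's signature, never on its body.
    """
    result = {changed}
    if signature_changed:
        result |= callers_of(calls)[changed]
    return result
```

Editing the body of lex costs three backend compilations under the conservative rule (lex, parse, main) and one under the precise rule. When each compilation is an LLVM module rather than a quick native emission, that factor of three is the difference the paragraph above describes.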

The Zig compiler’s pipeline separates parsing, semantic analysis, Air construction, and backend emission into distinct stages, with the backend layer designed to accept per-function work items rather than a single flush of the full program at the end. Extending the LLVM backend to participate in this architecture means wiring its emission path to create and compile per-function modules on demand, rather than accumulating declarations into one large module and invoking LLVM once at the end of compilation.

Build Speed as a Design Constraint

The speed of the edit-compile-test loop shapes how code gets written. Fast incremental builds encourage smaller, more frequent changes and tighter feedback on behavioral questions; slow builds push toward batching changes and relying on CI for verification. Zig’s design has consistently treated build speed as a first-class constraint, not a secondary nicety.

The native backends exist partly because the bootstrap process requires generating Zig code before LLVM is available, but the fast incremental builds they enable are a feature in their own right. Extending incremental compilation to the LLVM backend closes a gap: right now, fast incremental builds require either a native backend, with its code quality trade-off, or accepting full LLVM rebuild times on every change. A working incremental LLVM debug mode eliminates that trade-off for most development workflows.

Whether the per-function approach stays fast as program size grows depends on how LLVM’s per-module overhead scales. Module construction, the optimization pipeline setup, and the linker each impose costs that are trivial for one function and potentially significant when multiplied across thousands of invalidated functions in a large codebase. The devlog is where the empirical picture of that scaling will accumulate, entry by entry, as the implementation matures.
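One way to reason about that scaling is a back-of-the-envelope cost model. Everything here is an assumption made for illustration: the model charges a fixed overhead per module plus a per-function codegen cost, ignores link time entirely, and takes whatever numbers the caller supplies. None of the figures are measurements of Zig or LLVM.

```python
def per_function_cost(invalidated, module_overhead, codegen):
    """Each invalidated function pays the fixed module overhead plus its own codegen."""
    return invalidated * (module_overhead + codegen)


def whole_module_cost(total_functions, module_overhead, codegen):
    """One big module pays the overhead once but regenerates every function."""
    return module_overhead + total_functions * codegen


def break_even(total_functions, module_overhead, codegen):
    """Largest invalidated count at which per-function rebuilds are no slower
    than rebuilding the whole program as a single module."""
    full = whole_module_cost(total_functions, module_overhead, codegen)
    return int(full // (module_overhead + codegen))
```

With placeholder values of 10,000 functions, 2 units of overhead per module, and 1 unit of codegen per function, break_even returns 3334: below that many invalidated functions the per-function strategy wins, above it a single full rebuild would be cheaper. The larger the fixed overhead is relative to codegen, the lower that threshold falls, which is why the per-module overhead is the number to watch.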