How Cranelift Tamed E-Graphs by Exploiting SSA's DAG Structure

The compiler mid-end is where the interesting optimization work happens. It sits between the high-level IR coming out of the frontend, where semantics are still close to the source language, and the backend, where machine-specific concerns take over. For Cranelift, the code generator used in Wasmtime and other projects, this layer was redesigned around a data structure called an acyclic e-graph, or aegraph. Chris Fallin’s writeup explains the design in detail, but the core idea rewards deeper examination, particularly in how it relates to the broader history of e-graphs in compilers and why the specific constraint Cranelift imposes is the key that makes everything else tractable.

E-graphs and the equality saturation approach

E-graphs were introduced in the late 1970s and early 1980s for theorem proving and automated reasoning. The basic concept is straightforward: instead of committing to one representation of an expression, an e-graph maintains a set of equivalent representations compactly, grouping nodes into equivalence classes (e-classes) via union-find. Discovering that x + 0 equals x merges two e-classes rather than rewriting one term to another. The original expression and its simplification coexist in the graph.

For compiler optimization, this became compelling when Max Willsey and collaborators introduced the egg library (POPL 2021). Egg implements equality saturation: you apply every rewrite rule you know to the e-graph until no new equalities can be discovered, then extract the optimal term from the saturated graph using a cost function. The appeal is that you sidestep the phase-ordering problem entirely. Traditional compilers run optimization passes sequentially, and the order matters: running dead code elimination before constant folding might miss an opportunity that only becomes visible after constants are folded. With equality saturation, all rewrites run in a single joint fixed point, so there is no ordering to get wrong.

The cost is real, though. In the worst case, equality saturation is exponential. Merging two e-classes creates new potential matches for other rules, which creates more merges, in a feedback loop. Egg manages this with iteration limits and node count budgets, which works well for many research applications. In a production compiler processing arbitrary user code, unpredictable and potentially unbounded compile times are not acceptable. This is the problem the aegraph design solves.

SSA gives you acyclicity for free

Cranelift’s IR, called CLIF, is in SSA form. In SSA, each value is defined exactly once, and all uses of a value reference its unique definition point. When you draw the def-use graph of an SSA program, ignoring back-edges from loop bodies to loop headers, the result is a directed acyclic graph. Values flow from definitions toward uses, and there are no cycles in the pure dataflow graph.

This observation is what makes the aegraph design possible. If the underlying IR is a DAG, then an e-graph built over it is also a DAG, and the cycles that drive equality saturation’s worst-case complexity simply cannot arise. When a rewrite fires and creates new nodes, those nodes may reference existing values, but they cannot create new cycles, because every new node’s operands refer to values that were defined earlier in the SSA ordering. The graph can only grow in the forward direction.

The aegraph exploits this directly. Instead of a union-find structure for merging equivalence classes, each SSA value has a list of equivalent nodes that can compute it. When a rewrite produces a node equivalent to an existing value, you append that node to the value’s list. That is the entire merge operation. Because the graph is acyclic, there are no transitive consequences to propagate; the append is O(1) and complete.

The data structure in detail

An aegraph node corresponds to a CLIF instruction: an opcode plus a list of operand value references. Each node is tagged with an eclass identifier, which is an index into a side-table mapping eclass IDs to their list of equivalent nodes.

The key difference from a traditional e-graph is that eclass IDs are stable. In a union-find structure, merging two classes requires choosing a canonical representative and updating all pointers to the absorbed class across the graph. In an aegraph, merging means adding a node to the target eclass’s list. Existing eclass IDs remain valid, no pointers need updating, and the operation touches exactly one data structure entry.

Rewrite rules in Cranelift are expressed in ISLE (Instruction Selection Lowering Engine), the same domain-specific language used for the backend’s instruction selection rules. An ISLE rule matches a pattern against the aegraph and produces new nodes. This unification of the rewrite rule surface for both the mid-end optimizer and the backend instruction selector is a significant practical benefit: contributors need to understand one DSL, and the optimization and selection logic live in the same place with the same tooling for testing and debugging.

Extraction without iteration

After all rewrite rules have been applied, Cranelift needs a concrete CLIF program for the backend. This is the extraction step, called elaboration in Cranelift’s source.

Because the aegraph is a DAG, elaboration is a recursive descent starting from the roots: values that are actually used, typically block terminators and effectful operations. For each root, the elaborator looks at the eclass’s list of equivalent nodes, selects the cheapest one using a cost function, and recursively elaborates its operands. Values shared by multiple roots, or referenced by multiple equivalent nodes, are elaborated once and reused.

The cost function is intentionally simple: prefer the node with the lowest estimated instruction cost, using opcode costs as a proxy. This is not globally optimal in the way that egg’s extraction can be, where you can plug in arbitrary cost models and solve a minimum-cost expression problem over the full e-graph. It handles the common cases well, and it runs in O(n) time, which is the property that matters for a JIT compiler where compilation latency is part of the user-visible performance profile.

What this buys compared to LLVM

LLVM does not use e-graphs in its mid-end. Its GVN (Global Value Numbering) pass identifies equivalent expressions and eliminates redundant computations, which is conceptually similar to e-class identification in one direction. Algebraic simplifications live in the instcombine pass, which applies a collection of peephole rules. These passes run sequentially, and LLVM’s optimization pipeline is famously sensitive to ordering. The reassociate pass must run before certain instcombine patterns become visible; running the pipeline in the wrong order produces demonstrably worse code.

The aegraph approach gives Cranelift something closer to a joint fixed-point of GVN and instcombine, without requiring the programmer to reason about pass ordering or how many times each pass should run. For a project that aims to keep the compiler’s optimization layer maintainable by a small team, this reduction in mental overhead matters as much as the algorithmic properties.

MLIR’s pattern rewriting infrastructure offers a middle ground: you can configure a rewriting driver to run rules to a local fixed point within a region, with topological ordering ensuring termination. It is more flexible than LLVM’s sequential passes and more predictable than full equality saturation, but it does not maintain equivalence sets the way an e-graph does, so it cannot defer the choice between equivalent forms until extraction time.

The WebAssembly context

Cranelift’s primary consumer is Wasmtime, Bytecode Alliance’s WebAssembly runtime. WebAssembly compilation has specific performance constraints that make the aegraph’s properties especially relevant. Wasmtime runs in production services where compilation latency affects startup time, and in some configurations it compiles individual functions on demand, so per-function compile time matters directly.

WebAssembly code often arrives pre-optimized by upstream toolchains like clang or wasm-pack, which have already applied general algebraic simplifications and most standard compiler optimizations. What Cranelift needs to add are IR-specific and target-specific rewrites that the upstream toolchain could not anticipate. The aegraph’s bounded behavior is well-suited to this profile: rewrites run exhaustively within the DAG structure with no artificial iteration limits, but the acyclic constraint ensures that the work done per function is proportional to the function’s size rather than to some exponent of it.

Wasm’s structured control flow also maps cleanly to the DAG model. Cranelift lowers Wasm to a CFG of basic blocks, with loop-carried values represented as block parameters rather than implicit uses. The back-edges of loops create the only genuine cycles in the program graph, and CLIF makes them explicit, preserving the DAG property for all dataflow within a loop body.

The broader pattern

The aegraph is a clean example of the kind of engineering judgment that distinguishes production compiler work from research implementation. Full equality saturation is theoretically richer: it can find optimal rewrites that the aegraph might miss when the optimal solution requires considering cycles of equivalent forms. In practice, for the kinds of programs Cranelift processes, those missed opportunities are rare and the compile-time savings are consistent and measurable.

Taking a theoretically powerful technique, identifying the property that makes it expensive in the worst case, and finding a structural constraint that removes that property while preserving most of the benefit, is a recurring theme in systems software. The aegraph applies it to compiler optimization in a way that is also simpler to implement and maintain than the alternative, which is not always how these trade-offs resolve.