The IR That Promised to Simplify Everything
In 1995, Cliff Click and Michael Paleczny published “A Simple Graph-Based Intermediate Representation”, proposing a compiler IR that unified data flow and control flow into a single graph. Rather than organizing code into basic blocks with explicit control edges, you represent each operation as a node, connect operations through data dependencies, and treat control flow as just another kind of dependency edge. Operations without explicit ordering constraints float freely in the graph until a final scheduling pass decides where they belong in the generated code.
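To make the idea concrete, here is a minimal sketch of what such a graph might look like. The `Node` class, its field names, and the operation names are hypothetical illustrations, not taken from the paper or from any production compiler. The sketch shows only the central point: a node with no control input has no fixed position.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One operation in a hypothetical Sea of Nodes graph."""
    op: str
    inputs: list[Node] = field(default_factory=list)  # data dependencies
    control: Optional[Node] = None  # control dependency; None means the node floats

    def is_floating(self) -> bool:
        return self.control is None


# Graph for: if (c) { return a + b }
start = Node("Start")
a, b, c = Node("Param a"), Node("Param b"), Node("Param c")
add = Node("Add", [a, b])                    # pure: no control input
branch = Node("Branch", [c], control=start)  # pinned to the start of the function
if_true = Node("IfTrue", control=branch)     # the taken side of the branch
ret = Node("Return", [add], control=if_true)

floating = [n.op for n in (add, branch, if_true, ret) if n.is_floating()]
```

Here `floating` contains only `"Add"`: the addition is constrained by its data inputs alone, so nothing in the graph says whether it runs before the branch or inside it.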
Click called this the Sea of Nodes. Its first major production home was HotSpot’s C2 compiler, the JIT inside the JVM. When Google built Turbofan, V8’s top-tier optimizing compiler, it made the same architectural bet, which made Turbofan one of only a handful of large-scale production compilers in the world to use this representation.
Now, as the V8 team describes in a March 2025 retrospective, they have largely finished leaving it behind. The replacement, Turboshaft, returns to the conventional Control-Flow Graph (CFG) that compilers like LLVM have used from the beginning. Nearly three years of incremental migration work tells you something about what makes compiler IRs workable in practice versus what makes them interesting on paper.
Why Sea of Nodes Seemed Worth Adopting
Sea of Nodes has genuine appeal for a JavaScript JIT compiler. JavaScript’s control flow is complex: generators, async functions, try/catch boundaries, and speculative type guards that need deoptimization fallback paths. A single unified graph handles all of these in the same framework without requiring special cases for control versus data dependencies.
The optimization benefits are also real. Loop-invariant code motion becomes natural: a computation with no control dependency pinning it inside a loop simply floats above the loop during scheduling. Global value numbering and common subexpression elimination operate directly on the graph structure. For the optimizer, the representation is genuinely expressive.
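One way to see why value numbering falls out of the graph structure is hash-consing: if a node is identified only by its operation and inputs, asking for the same computation twice can return the same node. The sketch below is an illustration under that assumption. The `Graph` class and its `make` method are hypothetical names, and the approach is only valid for pure operations with no side effects.

```python
class Graph:
    """Hypothetical graph builder that deduplicates pure operations."""

    def __init__(self):
        self.nodes = []   # node id -> (op, input ids)
        self._table = {}  # (op, input ids) -> node id

    def make(self, op, *inputs):
        key = (op, inputs)
        existing = self._table.get(key)
        if existing is not None:
            return existing  # same op, same inputs: reuse the node
        node_id = len(self.nodes)
        self.nodes.append(key)
        self._table[key] = node_id
        return node_id


g = Graph()
a = g.make("param_a")
b = g.make("param_b")
x = g.make("add", a, b)
y = g.make("add", a, b)  # requested twice, built once
z = g.make("mul", x, y)
```

After this runs, `x` and `y` are the same node id and the graph holds four nodes rather than five. The sketch does not normalize commutative operations, so `make("add", b, a)` would still create a separate node.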
Deoptimization points, which are central to how Turbofan generates fast speculative code, fit cleanly into Sea of Nodes. When a type guard fails (a value expected to be a small integer turns out not to be), the deoptimization node captures the control state needed to resume in the interpreter. The graph naturally encodes the relationship between guards and fallback states without additional bookkeeping.
The Scheduling Problem
The practical difficulty with Sea of Nodes concentrates in the scheduling phase. Because operations have no fixed position, a separate scheduler must decide where each node belongs before linear machine code can be generated. This scheduler juggles several competing goals simultaneously.
It should hoist operations as high as possible to maximize the window for subsequent optimizations. It should avoid computing values in branches that may not be taken. It must respect effect chains, so memory reads and writes are not reordered past each other. It should cluster related operations to reduce register pressure. These goals conflict regularly, and the scheduler must make heuristic tradeoffs whenever they do.
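The tension between hoisting and avoiding wasted work can be sketched with one common placement rule: the latest legal block for a floating node is the nearest block that dominates all of its uses. The code below is an illustrative simplification, not V8’s scheduler. The function names and the `idom` dictionary (mapping each block to its immediate dominator) are assumptions made for the example.

```python
def dominator_depth(idom, block):
    """Number of steps from block up to the root of the dominator tree."""
    depth = 0
    while idom[block] is not None:
        block = idom[block]
        depth += 1
    return depth


def common_dominator(idom, a, b):
    """Nearest block that dominates both a and b."""
    depth_a, depth_b = dominator_depth(idom, a), dominator_depth(idom, b)
    while depth_a > depth_b:
        a, depth_a = idom[a], depth_a - 1
    while depth_b > depth_a:
        b, depth_b = idom[b], depth_b - 1
    while a != b:
        a, b = idom[a], idom[b]
    return a


def latest_block(idom, use_blocks):
    """Latest block in which a node can be placed and still reach every use."""
    block = use_blocks[0]
    for other in use_blocks[1:]:
        block = common_dominator(idom, block, other)
    return block


# Diamond CFG: entry branches to then/else, which both jump to merge.
idom = {"entry": None, "then": "entry", "else": "entry", "merge": "entry"}

only_then = latest_block(idom, ["then"])
both_arms = latest_block(idom, ["then", "else"])
```

A value used only in `then` can stay in `then`, so it is never computed when the other arm runs. A value used in both arms has to move up to `entry`. A real scheduler then weighs this latest position against the earliest legal one, along with loop depth and register pressure, which is where the heuristics come in.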
In a mature production compiler, the scheduler accumulates a great deal of complexity from these tradeoffs. Mistakes in scheduling produce either suboptimal code or correctness bugs that are difficult to reproduce, because the bugs depend on which node placement the scheduler chose for a particular graph structure. This category of bugs does not arise in the same way in CFG-based compilers, where operations have fixed positions from the start.
Debugging is correspondingly harder. When you inspect a Sea of Nodes graph mid-compilation, operations have no inherent order. Tracing why an optimization fired or failed requires understanding both the graph structure and the scheduler’s placement logic simultaneously. V8 built a dedicated visualization tool called Turbolizer to make Turbofan’s IR inspectable, which is itself evidence of how opaque the representation is without significant tooling support.
Cache Locality and Compilation Latency
A less-discussed cost of Sea of Nodes is compilation performance. The IR’s graph structure is unfriendly to the CPU cache. Nodes point to arbitrary other nodes anywhere in memory, so traversing the graph during any optimization pass involves pointer-chasing. This degrades cache performance throughout compilation, not just in one hot path.
For V8, compilation latency is not an abstract concern. The JIT runs at runtime, and its speed directly affects application startup and responsiveness. Long-running compile jobs delay the arrival of optimized code and compete with the application for CPU time, and for JavaScript applications with many hot functions, cumulative compilation time is visible to users.
Turboshaft stores operations in a flat array, with basic blocks indexing into contiguous ranges of that array. Iterating over a block’s operations is a sequential scan through a single memory region, which the cache handles well. The structural change to CFG is simultaneously a change to the memory access pattern of every optimization pass in the compiler.
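The layout described above can be sketched as follows. This is a simplified illustration with hypothetical names (`FlatGraph`, `Block`, `emit`), not Turboshaft’s actual data structures: operations refer to their inputs by index into one shared array, and each block records the half-open range of indices it owns.

```python
from dataclasses import dataclass


@dataclass
class Block:
    begin: int  # index of the block's first operation in the flat array
    end: int    # one past the block's last operation


class FlatGraph:
    """Hypothetical CFG storage: one operation array, blocks as index ranges."""

    def __init__(self):
        self.ops = []     # every operation, in emission order
        self.blocks = []  # each block covers ops[begin:end]

    def start_block(self):
        self.blocks.append(Block(len(self.ops), len(self.ops)))
        return len(self.blocks) - 1

    def emit(self, opcode, *inputs):
        """Append an operation to the current (last) block; return its index."""
        self.ops.append((opcode, inputs))
        self.blocks[-1].end = len(self.ops)
        return len(self.ops) - 1

    def block_ops(self, block_id):
        block = self.blocks[block_id]
        return self.ops[block.begin:block.end]  # one contiguous slice


g = FlatGraph()
b0 = g.start_block()
x = g.emit("param")
one = g.emit("const_1")
s = g.emit("add", x, one)
g.emit("branch", s)
b1 = g.start_block()
g.emit("return", s)
```

Calling `g.block_ops(b0)` walks indices 0 through 3 in order, and `g.block_ops(b1)` yields the single `return` operation. Python lists hold references rather than inline values, so this shows the indexing scheme rather than the memory behavior; in a systems language the same scheme keeps a block’s operations adjacent in memory.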
What Turboshaft Looks Like
Turboshaft organizes code into explicit basic blocks, each containing a linear sequence of operations in a fixed order. The IR uses SSA form, with phi nodes at block entries where control flow merges. Blocks have explicit successor lists. Structurally, this is the same kind of representation LLVM uses: SSA values inside basic blocks connected by explicit control-flow edges.
Because operations are fixed to blocks, there is no separate scheduling phase. Optimization passes work on operations in their final order within blocks. Code motion, moving an operation from one block to another, is explicit and localized, which makes it easier to reason about in isolation. The class of scheduling-induced bugs that existed in Turbofan does not exist in Turboshaft.
The migration was incremental by design. Turboshaft initially replaced only portions of the Turbofan backend, then progressively took over more phases as each piece was validated. This allowed correctness to be verified in stages rather than requiring a full flag-day cutover of the entire compiler pipeline. The strategy reflects hard-won lessons from compiler engineering: migrations that replace everything at once tend to produce extended periods of instability that are difficult to diagnose.
The Landscape: LLVM, HotSpot, and Graal
LLVM has used a CFG-based SSA IR since its creation and has become one of the most widely deployed compiler infrastructures in history. The LLVM IR is well-specified and extensively documented, and decades of research on SSA-form optimizations applies to it directly. Its success at scale is consistent with V8’s conclusion.
HotSpot’s C2 compiler continues to use Sea of Nodes and remains a performant JIT, but it is widely considered one of the most difficult components in the JVM to modify. Contributing new optimizations to C2 requires navigating graph semantics and scheduling semantics simultaneously, and complexity has accumulated over two decades. A significant amount of new JVM JIT work happens in GraalVM rather than C2, which is itself a data point about maintainability.
GraalVM’s compiler uses a hybrid called the Graal IR: explicit basic blocks for control flow structure, combined with data flow graph edges between operations that allow some of the floating freedom of Sea of Nodes. This gives it some of the optimizer’s expressiveness while preserving block structure for analyses. The hybrid approach works, but it carries complexity from both sides of the design space, and Graal IR is not notably simpler than either pure alternative.
What the IR Choice Actually Reflects
The V8 team’s migration is not a claim that Sea of Nodes is wrong in principle. For research compilers, for environments where compilation latency is not constrained, or for small teams with deep shared context around the representation, the elegance is real. The optimization simplifications it enables are genuine.
What the migration reflects is where complexity accumulates in each design. Sea of Nodes pushes complexity into the scheduling phase, into graph traversal patterns, and into the tooling required to make an inherently unordered representation inspectable. A CFG pushes complexity into explicit block management and control flow bookkeeping. For a team maintaining a production JIT across a decade of active development, the CFG’s complexity proved more tractable.
The structural reason is not subtle: the closer your IR is to the shape of the output, the smaller the gap between representation and reality. Machine code executes in linear blocks with explicit branches. Sea of Nodes is furthest from that shape in the middle of compilation, which is precisely where most optimization work happens, but the gap has to be closed before anything can run. The scheduling phase is that gap made mandatory.
Nearly three years of sustained engineering effort to migrate off an IR is a significant organizational commitment. That the V8 team saw it through to a published retrospective is a clear statement that what they found in production did not match what they expected from the theory.