Checkpoint and Continue: What a Fully Snapshotable Wasm Interpreter Actually Takes

Source: lobsters

The premise of gabagool is simple to state and harder to execute: build a WebAssembly interpreter where you can pause execution at any point, serialize the complete machine state to a file or buffer, and later restore it and keep running. The project does this by writing a pure interpreter in Rust rather than a JIT compiler, and that choice is not a concession but a precondition. You cannot build a fully snapshotable runtime on top of JIT compilation without OS-level cooperation, and even then the result is tied to a specific platform. With an interpreter, the entire execution state is data you already own.

What “fully snapshotable” requires

A WebAssembly module at runtime has a well-defined state, laid out clearly in the Wasm core specification. There are five components you need to capture:

  • Linear memory: a flat, resizable byte array that holds everything the module has allocated
  • Value stack: a sequence of typed values (i32, i64, f32, f64, v128, funcref, externref) being actively computed
  • Call stack: frames containing local variables, return addresses, and the current label stack
  • Global variables: typed mutable and immutable values accessible across the module
  • Tables: typed arrays of references, used primarily for indirect function calls

Capturing these from an interpreter is a matter of serializing a handful of structs. The value stack is a Vec<Value>. The call stack is a Vec<Frame>. Linear memory is a Vec<u8> with a tracked page count. The interpreter loop has a program counter pointing into the bytecode. Serialize these, and you have a snapshot. Deserialize them into fresh data structures, wire up a module, and the interpreter loop picks up exactly where it left off.
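To make that concrete, here is a minimal sketch of what such interpreter state could look like as plain Rust data. The type and field names are illustrative, not gabagool's actual API; the point is that every field is owned data, so "snapshot" is just a deep copy (or a serialization, if you need persistence).

```rust
// Illustrative sketch of snapshotable interpreter state (not gabagool's real types).

#[derive(Clone, Debug, PartialEq)]
enum Value {
    I32(i32),
    I64(i64),
    F32(f32),
    F64(f64),
}

#[derive(Clone, Debug, PartialEq)]
struct Frame {
    locals: Vec<Value>,
    return_pc: usize, // where to resume in the caller's bytecode
}

#[derive(Clone, Debug, PartialEq)]
struct Snapshot {
    value_stack: Vec<Value>,
    call_stack: Vec<Frame>,
    memory: Vec<u8>, // linear memory; page count = memory.len() / 65536
    globals: Vec<Value>,
    pc: usize, // program counter into the bytecode
}

// Because every field is owned plain data, snapshotting is a deep copy.
fn snapshot(state: &Snapshot) -> Snapshot {
    state.clone()
}

fn main() {
    let state = Snapshot {
        value_stack: vec![Value::I32(42)],
        call_stack: vec![Frame { locals: vec![Value::I64(7)], return_pc: 12 }],
        memory: vec![0u8; 65536],
        globals: vec![Value::F64(1.5)],
        pc: 99,
    };
    let snap = snapshot(&state);
    assert_eq!(snap, state); // restore is the inverse: move the data back in
}
```

Restoring is the mirror image: deserialize into these structures, point them at a loaded module, and resume the dispatch loop at `pc`.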

This is not true for any JIT-compiled Wasm runtime. When Wasmtime compiles a Wasm function with Cranelift, the Wasm value stack becomes the native machine stack. Local variables live in registers or native stack slots. The program counter is the native instruction pointer. To snapshot this, you would need to freeze and serialize the entire OS-level thread state, native stack frames and all, which is what CRIU does on Linux. The resulting snapshot is tied to a specific machine architecture and OS version. Move it to an ARM machine and it fails. This is the structural reason JIT runtimes cannot offer portable snapshotting: the boundary between “the Wasm state” and “the machine state” no longer exists once compilation runs.

Prior art and adjacent approaches

The idea of snapshotting Wasm execution has appeared in several forms before, though none of them are quite the same thing.

Wizer from the Bytecode Alliance does something adjacent: it pre-initializes a Wasm module by running its initialization function, then snapshots the resulting module state and emits a new Wasm binary with the memory pre-populated. This cuts cold start time because the initialization work is already done. But Wizer snapshots the output of initialization, not an arbitrary mid-execution state. You cannot pause a computation halfway through, serialize it, and resume it elsewhere. The scope is the module boundary, not the execution frame.

FAASM takes a different approach: it is a distributed serverless runtime for Wasm that uses snapshot support to scale out stateful functions. A function can be paused, its state replicated to another node, and execution resumed there. This is close to what gabagool enables, but FAASM is a full platform with its own distributed state model and scheduling system, not a general-purpose embeddable interpreter.

WARDuino is another relevant project: an embedded Wasm VM for microcontrollers that transfers execution state between a device and a development machine for live debugging. The snapshot/restore mechanism there serves a debugging use case rather than a serverless or replay one.

Cloudflare Workers uses a related idea with V8 isolate snapshots to reduce cold start time, and the team has written publicly about the performance impact. But V8 snapshots are not portable across architectures: they capture a heap layout that depends on the machine’s pointer size and endianness. The portability limitation comes from the same JIT entanglement.

What gabagool offers is the foundational primitive: a correct Wasm interpreter with snapshotting as a first-class feature, without the surrounding platform machinery. You embed it, snapshot at will, and restore wherever you need.

The use cases this unlocks

A fully snapshotable interpreter changes what you can build in several distinct ways.

Time-travel debugging. If you snapshot interpreter state before executing each instruction, or at regular intervals, you can implement backwards execution. When a bug surfaces, restore the nearest checkpoint and replay forward from there. This is what Mozilla’s rr does for Linux processes, but rr works by recording and replaying kernel syscalls and is inherently Linux-specific. A snapshotable Wasm interpreter works entirely in userspace without kernel cooperation, and the resulting recording is portable across platforms.
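The checkpoint-and-replay mechanic can be sketched with a toy deterministic step function standing in for the interpreter's dispatch loop (this is illustrative, not rr's or gabagool's code; real Wasm execution between snapshots is deterministic in the same way once host imports are recorded):

```rust
// Toy checkpoint-based "backwards execution": snapshot every N steps,
// then reconstruct any earlier state by restoring + replaying forward.

#[derive(Clone, Debug, PartialEq)]
struct State {
    pc: usize,
    acc: i64,
}

// Stand-in for executing one deterministic instruction.
fn step(s: &mut State) {
    s.acc = s.acc.wrapping_mul(3).wrapping_add(1);
    s.pc += 1;
}

/// Run `target` instructions, snapshotting every `interval` steps.
fn run_with_checkpoints(target: usize, interval: usize) -> (State, Vec<State>) {
    let mut s = State { pc: 0, acc: 0 };
    let mut checkpoints = vec![s.clone()];
    while s.pc < target {
        step(&mut s);
        if s.pc % interval == 0 {
            checkpoints.push(s.clone());
        }
    }
    (s, checkpoints)
}

/// "Travel back" to instruction `t`: restore the nearest checkpoint at or
/// before `t`, then replay forward deterministically.
fn state_at(checkpoints: &[State], t: usize) -> State {
    let mut s = checkpoints
        .iter()
        .rev()
        .find(|c| c.pc <= t)
        .expect("checkpoint 0 always exists")
        .clone();
    while s.pc < t {
        step(&mut s);
    }
    s
}

fn main() {
    let (final_state, checkpoints) = run_with_checkpoints(100, 10);
    // Reconstructing instruction 57 replays only 7 steps from the pc=50 checkpoint.
    let replayed = state_at(&checkpoints, 57);
    let mut direct = State { pc: 0, acc: 0 };
    while direct.pc < 57 {
        step(&mut direct);
    }
    assert_eq!(replayed, direct);
    assert_eq!(final_state.pc, 100);
}
```

The interval is the usual space/time knob: denser checkpoints cost memory but shorten the replay distance to any target instruction.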

Long-running computation checkpointing. If a computation takes hours and the host process dies, a conventional Wasm runtime loses all progress. With snapshot support, you can checkpoint at regular intervals and resume from the last known-good state on the next run. This is useful for batch processing, scientific computation, and any workload where restarting from scratch is expensive.
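A sketch of the interval-checkpointing pattern, with the hypothetical progress state serialized by hand to a byte buffer (in practice this would be an atomic file write, and the "crash" would be a real process death):

```rust
// Sketch of interval checkpointing for a long-running loop.
// Persistence is shown as a byte buffer; a real system would write a file.

#[derive(Clone, Debug, PartialEq)]
struct Progress {
    iteration: u64,
    partial_sum: u64,
}

fn serialize(p: &Progress) -> Vec<u8> {
    let mut buf = Vec::with_capacity(16);
    buf.extend_from_slice(&p.iteration.to_le_bytes());
    buf.extend_from_slice(&p.partial_sum.to_le_bytes());
    buf
}

fn deserialize(buf: &[u8]) -> Progress {
    Progress {
        iteration: u64::from_le_bytes(buf[0..8].try_into().unwrap()),
        partial_sum: u64::from_le_bytes(buf[8..16].try_into().unwrap()),
    }
}

/// Run up to `until` iterations, checkpointing every `interval` iterations.
/// Returns the last checkpoint taken, simulating a crash at the end.
fn run_until_crash(start: Progress, until: u64, interval: u64) -> Vec<u8> {
    let mut p = start;
    let mut last_checkpoint = serialize(&p);
    while p.iteration < until {
        p.partial_sum += p.iteration; // the actual "work"
        p.iteration += 1;
        if p.iteration % interval == 0 {
            last_checkpoint = serialize(&p); // would be an atomic file write
        }
    }
    last_checkpoint
}

fn main() {
    // First run "crashes" at iteration 1_000; the last checkpoint was at 960.
    let start = Progress { iteration: 0, partial_sum: 0 };
    let checkpoint = run_until_crash(start, 1_000, 120);
    let resumed = deserialize(&checkpoint);
    assert_eq!(resumed.iteration, 960);
    // The next run resumes from iteration 960 instead of iteration 0.
    let finished = deserialize(&run_until_crash(resumed, 2_000, 120));
    assert_eq!(finished.iteration, 1_920);
}
```

With a snapshotable interpreter, `Progress` is replaced by the full `Snapshot` of the previous section, so the guest program itself needs no checkpoint-aware code at all.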

Serverless cold start reduction. The standard serverless model initializes a function from scratch on each cold start. If you snapshot a function after initialization and restore the snapshot on each invocation, cold start time reduces to the cost of snapshot deserialization. This is the same insight behind Wizer, but generalized to arbitrary execution states rather than a static pre-initialization step. For workloads where initialization is expensive relative to actual work, this is a meaningful optimization.
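The shape of that optimization, reduced to a toy (illustrative only; a real runtime would deserialize the full interpreter state rather than clone a struct):

```rust
// Toy comparison: re-running initialization on every cold start vs.
// restoring a post-initialization snapshot.

#[derive(Clone, PartialEq, Debug)]
struct Instance {
    table: Vec<u64>, // state produced by expensive initialization
}

/// The expensive work a cold start normally repeats on every instance.
fn initialize() -> Instance {
    let table = (0..100_000u64).map(|i| i.wrapping_mul(2_654_435_761)).collect();
    Instance { table }
}

/// Handle one invocation using the initialized state.
fn invoke(inst: &Instance, key: usize) -> u64 {
    inst.table[key % inst.table.len()]
}

fn main() {
    // Done once, ahead of time: run init, snapshot the resulting state.
    let snapshot = initialize(); // in practice: serialize to bytes and store

    // Each cold start: restore the snapshot (modeled as a clone) instead of
    // re-running initialize(). Behavior is identical, cost is a deserialize.
    let restored = snapshot.clone();
    assert_eq!(invoke(&restored, 7), invoke(&initialize(), 7));
}
```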

Live migration. A running computation can be suspended on one machine, serialized, and resumed on another. This has obvious applications in edge computing, where you might want to move a computation closer to the user who needs the result, or in resource-constrained environments where you need to evict a computation temporarily.

The interpreter performance tradeoff

None of this comes without cost. An interpreter evaluates Wasm instructions one at a time by dispatching on the opcode, maintaining its own stack as a data structure in host memory, and following branches in software. A JIT compiler translates bytecode to machine code, and the CPU executes it directly. For CPU-bound workloads, a well-implemented JIT will outperform an interpreter by a factor of 10 to 50 or more. wasm3, one of the fastest Wasm interpreters, benchmarks at roughly 10 to 20 times slower than native code. JIT-compiled runtimes like Wasmtime typically get within 1.3 to 2 times native speed on most workloads.

The tradeoff makes sense in two situations. When the workload is I/O-bound and CPU time is not the bottleneck, the gap between interpreter and JIT performance is largely irrelevant. When the capability enabled by snapshotting outweighs the raw execution cost, the slower path is the right one. For serverless functions that spend most of their time waiting on database queries or HTTP calls, cold start latency and migration overhead matter more than whether the compute runs at 1x or 0.1x native speed.

There is also a middle ground: tiered compilation, where a fast interpreter handles startup and a JIT kicks in for hot paths. Wasmtime uses this with its Winch baseline compiler. But tiered compilation makes snapshotting harder, not easier, because hot paths have JIT-compiled native frames mixed into the call stack alongside interpreter frames. You lose the clean separation that makes the interpreter case straightforward.

The WASI complication

One aspect that makes “fully snapshotable” genuinely hard is WASI state. A module that uses the WebAssembly System Interface may have open file descriptors at particular offsets, environment variables loaded into memory, clocks that have advanced, and network connections in flight. A snapshot of just the Wasm execution state is incomplete if the module is partway through a fd_read call or has opened a file and seeked to a specific position.

A fully faithful snapshot needs to capture WASI state alongside the Wasm state, or at minimum define clearly what happens to WASI handles on restore. This is a harder problem than snapshotting the interpreter itself. It requires either re-establishing WASI handles on restore (which may not be possible if the underlying file has changed) or serializing a synthetic WASI state that replicates what the module expects to find. Different use cases have different requirements here: a time-travel debugger might only need to snapshot pure computation between WASI calls, while a migration scenario needs the full state including I/O.
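One way to make the problem concrete is to sketch the extra state a faithful snapshot would need to carry per WASI descriptor, with an explicit restore policy. These types are hypothetical; gabagool's actual WASI handling may look quite different:

```rust
// Sketch of per-descriptor WASI state and restore policies (hypothetical types).

#[derive(Clone, Debug, PartialEq)]
enum RestorePolicy {
    /// Reopen the saved path and seek to the saved offset; fail if impossible.
    Reopen,
    /// Drop the handle; the guest sees an error (e.g. EBADF) on next use.
    Invalidate,
}

#[derive(Clone, Debug, PartialEq)]
struct FdState {
    guest_fd: u32,
    path: String, // preopen-relative path, not a host fd number
    offset: u64,  // file position advanced by prior reads/seeks
    policy: RestorePolicy,
}

#[derive(Clone, Debug, PartialEq)]
struct WasiSnapshot {
    fds: Vec<FdState>,
    env: Vec<(String, String)>,
    // clocks, random state, sockets: each needs its own restore policy
}

/// On restore, decide per descriptor what the guest gets back.
fn restore_fd(fd: &FdState, file_still_exists: bool) -> Result<FdState, String> {
    match fd.policy {
        // In a real implementation: open(path), then seek(offset).
        RestorePolicy::Reopen if file_still_exists => Ok(fd.clone()),
        RestorePolicy::Reopen => Err(format!(
            "{}: file missing, cannot restore fd {}",
            fd.path, fd.guest_fd
        )),
        RestorePolicy::Invalidate => Err(format!("fd {} invalidated on restore", fd.guest_fd)),
    }
}

fn main() {
    let fd = FdState {
        guest_fd: 4,
        path: "data/input.bin".into(),
        offset: 4096,
        policy: RestorePolicy::Reopen,
    };
    assert!(restore_fd(&fd, true).is_ok());
    assert!(restore_fd(&fd, false).is_err());
}
```

The policy enum is the important part: it forces the runtime to state, per resource class, what "restore" means, rather than leaving the guest to discover stale handles at runtime.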

The value of a clean baseline

Projects like gabagool matter beyond their immediate use case. A portable, correct Wasm interpreter with snapshotting as a first-class feature gives the ecosystem something concrete to build on. Distributed computing frameworks can use it as a building block. Debugging tools can use it to implement replay. Testing infrastructure can use it to inject fault conditions at specific execution points and observe outcomes.

The Wasm spec defines a clean execution model precisely because it was designed to be implementable in many ways. An interpreter that fully respects that model, without the platform entanglement that JIT compilation introduces, closes the loop between the spec and the capabilities the spec implicitly enables. Snapshotting is one of the clearest examples: the spec defines the state machine precisely enough that “snapshot” has an obvious meaning at every instruction boundary, and an interpreter exposes exactly that state as data you can manipulate directly. It is worth having a runtime that treats this as a core feature rather than an afterthought.
