WebAssembly Finally Gets the Debugger It Deserves

Source: lobsters

Debugging WebAssembly has always been an awkward experience. You get source maps when you’re lucky, DWARF info when the toolchain cooperates, and a wall of hex when neither shows up. For anyone who has spent time chasing a memory corruption bug through a compiled Rust or C module running in a browser or a WASI host, the tooling gap is obvious. Breakpoints work, mostly. Step-through works, sometimes. The moment something goes wrong and you need to understand the sequence of events that led there, you’re back to printf-style instrumentation and manual reasoning.

Gabagool is an attempt to close that gap with a time travel debugger built specifically for WebAssembly, surfaced through the Debug Adapter Protocol. The combination matters: time travel because it changes how you investigate bugs, and DAP because it means the debugger works in VS Code, Neovim, Emacs, and anything else that speaks the protocol.

What Time Travel Debugging Actually Means

The term gets thrown around loosely, so it’s worth being precise. Time travel debugging, also called reverse debugging or record-and-replay debugging, lets you step backward through a program’s execution history, not just forward. You can set a watchpoint on a corrupted variable and ask the debugger to run backward until the last write to that address. You can replay the exact sequence of instructions that produced an unexpected value.

The canonical implementation on Linux is rr, which originated at Mozilla. It runs in user space, recording the results of nondeterministic events such as system calls and signals, and uses hardware performance counters to pin down where asynchronous events landed in the instruction stream. Replay then re-runs the program deterministically from that recording. Recording overhead is usually quoted at roughly 1.2x to 5x depending on I/O intensity. WinDbg’s Time Travel Debugging fills the same role on Windows, recording a trace file that can be shared across machines.

GDB has had software-based reverse execution since version 7.0, but its built-in process record is painfully slow because it single-steps the program and logs the register and memory changes made by every instruction. For anything beyond trivial programs, it’s not practical. The rr approach of recording only the nondeterminism and replaying deterministically is far more efficient.

Why WebAssembly Changes the Equation

Here is where WebAssembly’s design makes an interesting case. Native code running on x86 or ARM is inherently nondeterministic: ASLR randomizes memory layout, system calls have side effects, timer interrupts can change observable behavior, and hardware instructions like RDTSC return different values each time. Recording all of that faithfully requires either intercepting the process at the system call boundary with help from hardware performance counters (rr’s approach) or heavier instrumentation of every instruction.

WebAssembly has a formally specified execution model that is almost entirely deterministic by construction. A module’s execution is defined in terms of a stack machine with typed values, linear memory, tables, globals, and a call stack. There is no address space layout randomization visible to the module, no way to read or write outside linear memory through a stray pointer, and no system calls that aren’t mediated through explicitly imported host functions. The entire state of a running instance at any point in time is: the current position in the code, the operand stack, the call stack with its local variables, the contents of linear memory and tables, and the values of globals. That’s it.

This means snapshotting state for time travel is conceptually straightforward. You checkpoint the relevant portions of state at some granularity, and replay is just restoring a snapshot and continuing forward from there. The expensive part is linear memory, which can be up to 4GB in a 32-bit Wasm module, though in practice most modules use far less. Efficient implementations use copy-on-write page tracking to record only the pages that change between checkpoints rather than copying the entire memory region each time.

The other relevant property is that Wasm’s nondeterminism is controlled and explicit. The core spec confines it to a short list: the bit patterns of NaN results in floating-point operations, resource exhaustion (a memory.grow that fails, or a call stack that overflows), and the behavior of imported host functions. Proposals such as threads and relaxed SIMD add further cases, but linear memory is always zero-initialized, so there is no uninitialized-memory hazard. Everything else is deterministic. A time travel implementation only needs to record the outcomes of those specific cases.

The Debug Adapter Protocol as the Right Abstraction

The Debug Adapter Protocol was designed by Microsoft in 2016 for VS Code, and it solves a real combinatorial problem: without a standard protocol, every IDE needs a custom integration for every debugger. DAP standardizes the conversation between an IDE frontend and a debug backend into requests, responses, and events, each a JSON message preceded by a Content-Length header and carried over stdio or a socket.
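
On the wire, each DAP message is a UTF-8 JSON body preceded by a Content-Length header and a blank line. A minimal Python sketch of that framing follows; encode_dap and decode_dap are hypothetical helper names, not part of any particular adapter.

```python
import json


def encode_dap(message):
    """Frame one DAP message: Content-Length header, blank line, UTF-8 JSON body."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_dap(buffer):
    """Return (message, remaining_bytes); message is None if the buffer is incomplete."""
    header_end = buffer.find(b"\r\n\r\n")
    if header_end == -1:
        return None, buffer
    length = None
    for line in buffer[:header_end].decode("ascii").split("\r\n"):
        key, _, value = line.partition(":")
        if key.strip().lower() == "content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    body_start = header_end + 4
    if len(buffer) < body_start + length:
        return None, buffer                      # body not fully received yet
    body = buffer[body_start:body_start + length]
    return json.loads(body.decode("utf-8")), buffer[body_start + length:]


request = {"seq": 1, "type": "request", "command": "stepBack",
           "arguments": {"threadId": 1}}
wire = encode_dap(request)
# Simulate a stream that holds one full message plus the start of the next.
decoded, rest = decode_dap(wire + wire[:10])
```

Returning the unconsumed bytes lets the caller keep appending socket or stdio reads to the same buffer until the next message is complete.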

The protocol has explicit support for reverse execution. The reverseContinue request tells the debugger to run backward until it reaches a breakpoint. The stepBack request moves back one step, with an optional granularity of statement, line, or instruction. An adapter advertises both by setting the supportsStepBack capability. These requests have been in the spec for years, but few adapters implement them, because most underlying debuggers cannot run backward. Implementing them correctly is a meaningful engineering challenge.

For a Wasm time travel debugger, DAP is the sensible surface area. The core logic, recording execution, managing checkpoints, and replaying to a previous state, lives in the adapter. The IDE just sends standard DAP messages. A user working in VS Code gets the same experience as someone using a DAP-compatible Neovim plugin.

The relevant DAP message flow for time travel looks like this:

// Forward execution
{ "seq": 1, "type": "request", "command": "continue", "arguments": { "threadId": 1 } }

// Reverse execution
{ "seq": 2, "type": "request", "command": "reverseContinue", "arguments": { "threadId": 1 } }

// Step backward one instruction
{ "seq": 3, "type": "request", "command": "stepBack", "arguments": { "threadId": 1, "granularity": "instruction" } }

The adapter responds with stopped events and serves stackTrace, scopes, and variables requests against whatever state it has restored.
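
The adapter side of that exchange can be sketched in Python as follows. ToyAdapter is a hypothetical stand-in that models execution history as a simple index into a recorded trace. The message fields follow the DAP schema, but using the "entry" stop reason when the start of the recording is reached is an assumption made for this sketch.

```python
from itertools import count


class ToyAdapter:
    """Hypothetical adapter core: a position in a recorded trace plus breakpoints."""

    def __init__(self, trace_length, breakpoints):
        self.position = trace_length - 1     # stopped at the end of the recording
        self.breakpoints = set(breakpoints)
        self._seq = count(1)

    def _response(self, request, body=None):
        msg = {"seq": next(self._seq), "type": "response",
               "request_seq": request["seq"], "command": request["command"],
               "success": True}
        if body is not None:
            msg["body"] = body
        return msg

    def _stopped(self, reason, thread_id):
        return {"seq": next(self._seq), "type": "event", "event": "stopped",
                "body": {"reason": reason, "threadId": thread_id}}

    def handle(self, request):
        command = request["command"]
        if command == "initialize":
            # Advertise reverse execution so the editor enables its backward controls.
            return [self._response(request, {"supportsStepBack": True})]
        thread_id = request["arguments"]["threadId"]
        if command == "stepBack":
            self.position = max(0, self.position - 1)
            return [self._response(request), self._stopped("step", thread_id)]
        if command == "reverseContinue":
            # Walk backward until a breakpoint or the start of the recording.
            self.position -= 1
            while self.position > 0 and self.position not in self.breakpoints:
                self.position -= 1
            self.position = max(0, self.position)
            reason = "breakpoint" if self.position in self.breakpoints else "entry"
            return [self._response(request), self._stopped(reason, thread_id)]
        raise ValueError(f"unsupported command: {command}")


adapter = ToyAdapter(trace_length=10, breakpoints={3})
init = adapter.handle({"seq": 1, "type": "request", "command": "initialize",
                       "arguments": {}})
back = adapter.handle({"seq": 2, "type": "request", "command": "stepBack",
                       "arguments": {"threadId": 1}})
rev = adapter.handle({"seq": 3, "type": "request", "command": "reverseContinue",
                      "arguments": {"threadId": 1}})
```

A real adapter would restore Wasm state where this sketch only decrements an index, but the protocol shape is the same: acknowledge the request, then emit a stopped event once the restored state is ready to inspect.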

Checkpoint Strategy and the Performance Trade-Off

The interesting engineering is in the checkpoint granularity. Checkpointing at every instruction gives maximum backward stepping precision but is expensive in both time and memory. Checkpointing at function entry/exit points is cheaper but means stepping backward across a function boundary requires re-execution from the nearest prior checkpoint.
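
To make the trade-off concrete, here is a small Python sketch of stepping backward by restoring the nearest earlier checkpoint and re-executing forward. ToyMachine is a hypothetical deterministic stand-in for a Wasm instance, and a fixed checkpoint interval is assumed instead of function boundaries to keep the example short.

```python
import copy


class ToyMachine:
    """Deterministic stand-in for a Wasm instance: a step counter and a small memory."""

    def __init__(self):
        self.steps = 0
        self.memory = [0] * 8

    def step(self):
        # Any deterministic transition works; this one mixes the step index into memory.
        slot = self.steps % len(self.memory)
        self.memory[slot] = (self.memory[slot] * 31 + self.steps + 1) % 1009
        self.steps += 1


class CheckpointedRun:
    def __init__(self, machine, interval):
        self.machine = machine
        self.interval = interval
        self.checkpoints = {0: copy.deepcopy(machine)}   # step index -> snapshot

    def step_forward(self):
        self.machine.step()
        if self.machine.steps % self.interval == 0:
            self.checkpoints[self.machine.steps] = copy.deepcopy(self.machine)

    def step_back(self):
        """Move one step backward; return how many steps had to be re-executed."""
        target = self.machine.steps - 1
        if target < 0:
            return 0
        base = max(s for s in self.checkpoints if s <= target)
        self.machine = copy.deepcopy(self.checkpoints[base])
        replayed = 0
        while self.machine.steps < target:
            self.machine.step()
            replayed += 1
        return replayed


run = CheckpointedRun(ToyMachine(), interval=4)
history = []
for _ in range(10):
    history.append(list(run.machine.memory))   # memory as it was before each step
    run.step_forward()
cost = run.step_back()                          # from step 10 back to step 9
```

With an interval of 4, stepping back from step 10 restores the checkpoint at step 8 and replays one step. Stepping back again lands exactly on a checkpoint and replays nothing, while the step after that replays three. A larger interval saves memory and raises that replay cost.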

Most practical implementations use a tiered approach: dense checkpoints (every N instructions) stored temporarily, sparse checkpoints stored for longer history. rr uses a similar strategy with its checkpoint infrastructure. For a Wasm-specific implementation, there are additional opportunities: Wasm’s structured control flow means you can identify safe checkpointing points at block boundaries without the ambiguity you’d have in arbitrary machine code.

Linear memory is the dominant cost. A module with 64MB of active linear memory, if checkpointed naively, would consume 64MB per snapshot. Page-level dirty tracking solves this: you track which pages were written since the last checkpoint and store only those. Tracking at the host’s page size, typically 4KB, keeps the diffs small; Wasm’s own 64KiB page is the unit of memory growth and is coarser than you want here. For many programs, memory access patterns are local enough that this is a significant win.
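
A minimal Python sketch of page-level dirty tracking follows. It assumes every write goes through an instrumented write method and that memory starts zeroed; TrackedMemory and rebuild are hypothetical names. A production implementation would more likely rely on OS page protection than on per-write bookkeeping.

```python
PAGE_SIZE = 4096  # tracking granularity assumed for this sketch


class TrackedMemory:
    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()      # page indices written since the last checkpoint

    def write(self, addr, payload):
        if addr < 0 or addr + len(payload) > len(self.data):
            raise IndexError("out-of-bounds write")   # a real Wasm store would trap
        if not payload:
            return
        self.data[addr:addr + len(payload)] = payload
        first = addr // PAGE_SIZE
        last = (addr + len(payload) - 1) // PAGE_SIZE
        self.dirty.update(range(first, last + 1))

    def checkpoint(self):
        """Return {page_index: page_bytes} for dirty pages only, then reset tracking."""
        diff = {p: bytes(self.data[p * PAGE_SIZE:(p + 1) * PAGE_SIZE])
                for p in self.dirty}
        self.dirty.clear()
        return diff


def rebuild(size, diffs):
    """Reconstruct memory at a checkpoint by applying page diffs, oldest first."""
    data = bytearray(size)
    for diff in diffs:
        for page, content in diff.items():
            data[page * PAGE_SIZE:page * PAGE_SIZE + len(content)] = content
    return data


mem = TrackedMemory(64 * PAGE_SIZE)          # 256 KiB toy memory
mem.write(10, b"abc")
mem.write(PAGE_SIZE * 2 - 1, b"xy")          # straddles pages 1 and 2
first = mem.checkpoint()
snapshot_at_first = bytes(mem.data)
mem.write(PAGE_SIZE * 5, b"later")
second = mem.checkpoint()
stored = sum(len(page) for page in first.values())
```

Here the first checkpoint stores three pages (12,288 bytes) instead of the full 262,144. The same arithmetic scales up: a 64MB memory is 16,384 pages of 4KB, so a checkpoint interval that touches 100 of them stores about 400KB.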

Where This Fits in the Wider Ecosystem

The WebAssembly tooling space has been maturing steadily. Chrome’s DevTools can debug C and C++ compiled to Wasm through an extension that reads DWARF, and LLDB has gained basic Wasm support. The DWARF for WebAssembly convention describes how DWARF debug info is embedded in custom sections of .wasm files. The WebAssembly component model and the WASI preview 2 runtime interface add new surface area that debuggers will need to handle.

On the runtime side, Wasmtime from the Bytecode Alliance exposes a stable embedding API that a debug adapter can hook into. Wasmtime provides execution callbacks and memory access primitives that make it a reasonable host for an instrumented time travel implementation. Wasmer offers similar hooks through its middleware system.

For server-side and edge use cases where Wasm is increasingly common, the ability to attach a time travel debugger to a misbehaving module without being able to reproduce the bug locally is genuinely useful. The record-and-replay workflow, where you capture a failing execution in production and analyze it offline, is the same pattern rr enabled for native Linux programs and that WinDbg TTD brought to Windows.

Gabagool is early work, but it is pointing at a real gap in the Wasm ecosystem. The deterministic execution model makes time travel more tractable here than in native code. The Debug Adapter Protocol provides a standard interface that amortizes integration work across every supporting editor. The open question is runtime coverage: supporting Wasmtime, Wasmer, and the browser’s execution environment each requires different integration work. Getting all three is a long project, but the browser may be the most impactful target given how much production Wasm still runs there.
