
The Unglamorous Reality of Compiling a Real Language to WebAssembly

Source: eli-bendersky

There is a particular kind of blog post I always respect: someone dusts off an old project, adds something genuinely hard to it, and writes up the journey without oversimplifying. Eli Bendersky just published exactly that with his Scheme-to-WebAssembly compiler, built on top of his 15-year-old project Bob.

Bob is a suite of Scheme implementations in Python — interpreter, compiler, bytecode VM, and now a WebAssembly backend. It started as a way to understand CPython-style bytecode VMs from scratch, grew to include a C++ VM with a hand-rolled mark-and-sweep GC, and has now crossed into WASM territory. That’s a project with real longevity and a clear educational thread running through it.

Why Scheme Is a Hard Compile Target

Most WASM tutorials and “let’s build a compiler” posts target toy languages that are essentially C without the standard library. You get integers, basic control flow, maybe functions — and you’re done. No GC to worry about, no closures, no dynamic dispatch.

Scheme is a different animal. You’re dealing with:

  • Lexical closures: values that capture their enclosing environment and can outlive the stack frame that created them
  • Garbage collection: you cannot just malloc and forget; the runtime has to track object lifetimes
  • First-class functions: functions passed around like values and called indirectly through references
  • Built-in data structures: pairs, lists, and symbols, all with runtime type tags
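
To make the closure bullet concrete, here is a minimal Python sketch of closure conversion, the usual trick of turning a function value into a code reference plus a heap-allocated environment. The names `Closure`, `call`, `adder_body`, and `make_adder` are mine, chosen for illustration; I am not claiming this is how Bob lays things out.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class Closure:
    """A function value: a code reference plus the variables it captured."""
    code: Callable[..., Any]  # takes the environment as its first argument
    env: List[Any]            # captured values, living on the heap


def call(closure: Closure, *args: Any) -> Any:
    # An indirect call: the target is only known at runtime.
    return closure.code(closure.env, *args)


# Scheme source being modeled:
#   (define (make-adder n) (lambda (x) (+ x n)))

def adder_body(env: List[Any], x: int) -> int:
    # 'n' is no longer a local variable; it is slot 0 of the captured environment.
    return x + env[0]


def make_adder(n: int) -> Closure:
    # The environment list outlives this call's stack frame.
    return Closure(code=adder_body, env=[n])


add5 = make_adder(5)
print(call(add5, 10))  # 15
```

The point of the sketch is that once a lambda can escape, its captured variables cannot live on the stack, which is exactly why closures and garbage collection show up together on this list.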

Compiling this to WebAssembly means you have to make decisions that a C-level language lets you ignore entirely. Where do closures live? How does the GC know what’s a pointer and what’s an integer? What does a tail call look like in WASM bytecode?
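
On the pointer-versus-integer question, one long-standing answer in dynamic language runtimes is low-bit tagging: steal a bit from every machine word to say what kind of value it holds. The sketch below is a generic illustration under my own assumptions (a one-bit tag, heap addresses that are at least 2-byte aligned); it is not taken from Eli's compiler.

```python
TAG_BITS = 1
FIXNUM_TAG = 1  # assumption: low bit set means "small integer"


def box_fixnum(n: int) -> int:
    """Encode an integer as a tagged word."""
    return (n << TAG_BITS) | FIXNUM_TAG


def is_fixnum(word: int) -> bool:
    return (word & 1) == FIXNUM_TAG


def is_pointer(word: int) -> bool:
    # Aligned heap addresses always have a zero low bit.
    return (word & 1) == 0


def unbox_fixnum(word: int) -> int:
    """Decode a tagged word back into the integer it carries."""
    if not is_fixnum(word):
        raise TypeError("not a fixnum")
    return word >> TAG_BITS


print(box_fixnum(21))                # 43
print(unbox_fixnum(box_fixnum(21)))  # 21
print(is_pointer(0x1000))            # True
```

With a scheme like this, a collector scanning memory can tell at a glance which words to follow and which to skip.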

The WASM GC Extension Changes the Equation

What makes this timely is the WASM GC proposal, now shipping in major engines. Before it landed, compiling a GC’d language to WASM meant one of two ugly options: bring your own GC in WASM linear memory (complex, slow), or use JavaScript interop as a crutch (defeats the purpose).
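
For a sense of what "bring your own GC" involves, here is a toy mark-and-sweep pass in Python over a simulated heap. Everything here is a simplification of my own: objects are just integer ids mapped to the ids they reference, and the roots are handed in as a list. A real linear-memory collector also has to find its roots somehow, which is a large part of the complexity.

```python
from typing import Dict, List, Set

# A toy heap: each object id maps to the ids it references.
Heap = Dict[int, List[int]]


def mark(heap: Heap, roots: List[int]) -> Set[int]:
    """Return the set of object ids reachable from the roots."""
    marked: Set[int] = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)
        stack.extend(heap[obj])
    return marked


def sweep(heap: Heap, marked: Set[int]) -> List[int]:
    """Free every unmarked object and return the freed ids."""
    dead = sorted(obj for obj in heap if obj not in marked)
    for obj in dead:
        del heap[obj]
    return dead


# Objects 3 and 4 reference each other but are unreachable from the root.
heap: Heap = {1: [2], 2: [], 3: [4], 4: [3]}
freed = sweep(heap, mark(heap, roots=[1]))
print(freed)  # [3, 4]
```

Note that the unreachable cycle is collected, which is the property that makes tracing collectors a natural fit for a language built on pairs and lists.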

The GC extension gives you managed heap types directly in the WASM type system — structs, arrays, references — with the host engine doing the collection. For a language like Scheme, this is significant. You can represent cons cells as WASM struct types and let V8 or SpiderMonkey worry about when to collect them.
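
Since Bob's compiler is written in Python, here is a rough Python sketch of what emitting a cons cell as a WASM GC struct might look like in WebAssembly text format. The names `$Pair`, `$car`, `$cdr`, and `$cons`, and the choice of `anyref` for both fields, are my assumptions for illustration; I have not checked them against the representation Eli actually chose.

```python
def pair_type_wat() -> str:
    """Return WAT text declaring a cons cell as a WASM GC struct type."""
    return "\n".join([
        "(type $Pair (struct",
        "  (field $car (mut anyref))",
        "  (field $cdr (mut anyref))))",
    ])


def cons_func_wat() -> str:
    """Return WAT text for a function that allocates a new pair."""
    return "\n".join([
        "(func $cons (param $a anyref) (param $d anyref) (result (ref $Pair))",
        "  (struct.new $Pair (local.get $a) (local.get $d)))",
    ])


print(pair_type_wat())
print(cons_func_wat())
```

There is no allocator and no free list anywhere in that output: `struct.new` asks the engine for a managed object, and reclaiming it is the engine's problem.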

Eli’s project is a hands-on test of whether this is actually usable for a real language runtime. Not a sample, not a benchmark — a working compiler for a complete (if modest) Scheme dialect.

What I Take Away From This

I build Discord bots, not language runtimes. But I find myself drawn to projects like Bob because they make the abstractions explicit. Every time I’ve wrestled with something weird in a dynamic language — a closure that didn’t capture what I expected, a GC pause at the wrong moment — a project like this is the explanation rendered in code.

The part that sticks with me is the framing: “experiments that compile toy languages at the C level.” It’s easy to write a blog post about compiling a made-up language with no runtime. Writing a real compiler for a language with real semantics and actually shipping it is a different commitment.

If you’ve ever been curious about what a compiler actually has to do once the AST walk is finished, this project is worth reading carefully. Start with the original Bob README for context, then follow Eli’s new post for the WASM-specific decisions. The gap between those two documents is where the interesting problems live.
