Scheme to WebAssembly: What Compiling a Real Language Actually Looks Like
Source: eli-bendersky
Most WebAssembly compiler tutorials start with a toy language: a handful of integer operations, maybe a function call or two, no heap, no GC. You get to feel like you accomplished something, but you never grapple with what makes compilation hard.
Eli Bendersky’s post on compiling Scheme to WebAssembly is refreshingly different. His project Bob is a 15-year-old suite of Scheme implementations in Python — an interpreter, a bytecode compiler, a VM — and he recently added a new backend: a compiler targeting WebAssembly directly.
Scheme is not a toy language. It has:
- First-class closures with lexical scoping
- Tail call optimization
- Garbage collection
- Proper Scheme data types (pairs, symbols, vectors)
- A runtime to support all of the above (allocation, dynamic type checks, built-in procedures)
Compiling that to WASM means you can’t just emit a flat list of instructions and call it done. You have to make real decisions about your runtime representation.
The WASM GC Extension Changes Everything
One of Eli’s explicit goals was hands-on experience with the WASM GC extension. This is significant. Classic WASM gives you linear memory and you manage it yourself — which means you’re either writing a malloc/free runtime or bolting on your own GC in WASM bytecode. Neither is fun.
WASM GC lets you define struct and array types that the host runtime manages, with reference types instead of raw pointers. For a language like Scheme, where nearly every value lives on the heap (cons cells, closures, continuations), this is a much more natural fit. A pair can be a GC-managed struct with two reference fields (anyref, for instance). A closure can be a struct holding an environment and a function reference.
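To make that concrete, here is a minimal Python sketch that renders those two struct types in WebAssembly text format. The helper names (`field`, `struct_type`), the type names, and the exact field layout are my own illustration of the description above, not code or layouts taken from Bob. A real compiler might well pick a narrower reference type than `anyref`.

```python
# Hypothetical sketch: helper names and field layouts are illustrative only
# and are not taken from Bob's source.

def field(name, valtype, mutable=False):
    """Render one struct field, wrapping the type in (mut ...) if mutable."""
    storage = f"(mut {valtype})" if mutable else valtype
    return f"(field ${name} {storage})"


def struct_type(name, fields):
    """Render a WASM GC struct type definition in WAT text format."""
    return f"(type ${name} (struct {' '.join(fields)}))"


# A cons cell: two slots holding arbitrary GC-managed references.
# They are mutable so that set-car! and set-cdr! can be supported.
PAIR = struct_type("Pair", [
    field("car", "anyref", mutable=True),
    field("cdr", "anyref", mutable=True),
])

# A closure: a captured environment plus a reference to the compiled code.
CLOSURE = struct_type("Closure", [
    field("env", "anyref"),
    field("code", "funcref"),
])

if __name__ == "__main__":
    print(PAIR)     # (type $Pair (struct (field $car (mut anyref)) (field $cdr (mut anyref))))
    print(CLOSURE)  # (type $Closure (struct (field $env anyref) (field $code funcref)))
```

Nothing here allocates or frees memory. The compiler only declares the shapes, and the host's collector takes it from there.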
This does mean you’re now coupled to the host’s GC, but for a project like this, that’s a reasonable tradeoff.
What the Compiler Actually Has to Do
The interesting engineering is in the lowering decisions. Scheme closures close over variables by reference — so you need to decide how to represent environments in WASM types. Tail calls matter too; R5RS Scheme requires proper tail call elimination, and WASM’s return_call instruction (part of the tail-call proposal) is what makes this tractable without blowing the stack.
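The tail-call half of that is easy to sketch. Below is a toy Python lowering pass that tracks whether an expression is in tail position and emits `return_call` instead of `call` when it is. Everything about it is an assumption for illustration: the tuple-based AST, the `compile_expr` name, the use of plain `i32` values, and direct calls to named functions. A real Scheme compiler would be calling through closures and passing references around, and I am not claiming this is how Bob does it.

```python
# Hypothetical sketch: the AST shape and function name are made up for
# illustration and do not come from Bob's source.

def compile_expr(expr, tail):
    """Lower a tiny Scheme-like expression to a list of WAT instructions.

    `tail` is True when the value of `expr` is returned directly by the
    enclosing function, i.e. when `expr` is in tail position.
    """
    if isinstance(expr, int):
        return [f"i32.const {expr}"]
    if isinstance(expr, str):
        return [f"local.get ${expr}"]

    head, *rest = expr
    if head == "if":
        cond, then, alt = rest
        return (
            compile_expr(cond, tail=False)  # the test is never in tail position
            + ["if (result i32)"]
            + compile_expr(then, tail)      # both branches inherit tail position
            + ["else"]
            + compile_expr(alt, tail)
            + ["end"]
        )

    # Anything else is a call: (f arg ...). Arguments are never in tail position.
    code = []
    for arg in rest:
        code += compile_expr(arg, tail=False)
    code.append(f"return_call ${head}" if tail else f"call ${head}")
    return code


if __name__ == "__main__":
    # Body of: (define (loop n acc)
    #            (if (is_zero n) acc (loop (dec n) (add acc n))))
    body = ("if", ("is_zero", "n"), "acc",
            ("loop", ("dec", "n"), ("add", "acc", "n")))
    for instr in compile_expr(body, tail=True):
        print(instr)
```

In the printed output, the calls to `is_zero`, `dec`, and `add` are ordinary `call` instructions, while the recursive call to `loop` becomes `return_call $loop`, so the loop reuses its frame instead of growing the stack.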
I’ve spent some time with WASM as a compilation target for simpler things, and even getting integer arithmetic right with the type system is finicky. Adding closures, first-class functions, and a runtime type system on top of that is genuinely non-trivial work.
Why This Is Worth Reading
There’s a class of projects where the value is in the attempt more than the result. Bob isn’t trying to ship a production Scheme runtime — it’s a way to learn by building. Adding a WASM backend to a project you’ve maintained for 15 years means you bring a lot of accumulated context about what the language needs, and the new backend has to earn its place in that design.
If you’re interested in language implementation, the WASM GC proposal, or just want to see what a real compilation pipeline looks like below the surface, Eli’s writeup is worth your time. The source is on GitHub as part of the Bob project.
It also makes me want to revisit a half-finished bytecode interpreter I wrote a couple of years ago. Maybe WASM GC would make a better target than the custom C VM I was trying to build.