· 8 min read ·

Type-Checking a Stack Machine: What WebAssembly Validation Actually Does

Source: eli-bendersky

Eli Bendersky released watgo last week, a pure-Go, zero-dependency WebAssembly toolkit covering the four core operations any WASM tooling needs: parse WAT text into a semantic IR, validate a module against the WebAssembly spec, encode the IR to binary WASM, and decode a binary back to the IR. The toolkit is positioned as a Go-native peer to wabt (C++) and wasm-tools (Rust), and the motivation for it is legible if you’ve followed Bendersky’s other recent WebAssembly writing. His Scheme-to-Wasm compiler series involved hand-authoring substantial quantities of WAT, which means reaching for wasm-tools or wat2wasm constantly. Building your own toolkit for the job is the natural conclusion of that kind of work.

The practical story around dependency elimination and Go tooling ergonomics is real, and has been covered. The more technically interesting piece is the validation step, what it actually does, and why implementing it in pure Go is tractable in a way that would not be true for a more complex language runtime.

WebAssembly Validation Is Formal, Algorithmic, and Defined in the Spec

Most binary format validation is ad hoc: check that field lengths are consistent, that offsets don’t exceed file bounds, that required headers are present. WebAssembly validation is qualitatively different. The WebAssembly specification defines validation formally, as a type system with explicit inference rules. Section 3 of the spec is titled “Validation” and reads like a type theory paper, not a format description. Every instruction has a defined type signature, expressed as a function from a stack of input types to a stack of output types. Validation is the process of type-checking a sequence of instructions against those signatures, ensuring the stack is always consistent.

The algorithm the spec defines is a linear pass over the instruction stream, maintaining a typed value stack and a stack of control frames. The value stack holds the types of values currently on the WASM operand stack: i32, i64, f32, f64, and the reference types funcref and externref (plus WasmGC struct and array reference types in newer modules). Each instruction specifies what it consumes from the stack and what it produces. i32.add consumes two i32 values and produces one. f64.load consumes an i32 address and produces an f64. If the stack doesn’t have the right types in the right order, validation fails.

Control flow complicates this. WebAssembly has structured control flow only: block, loop, and if constructs that nest lexically, with br and br_if for branching to labeled outer blocks. There are no arbitrary jumps. This is not just a safety restriction; it is what makes type checking tractable. The validator maintains a stack of control frames, one per enclosing block, loop, or if construct. Each control frame records the types that will be on the stack at the end of that block, which is the type that branches to that label must deliver. When you encounter a br n instruction, you look up the nth enclosing frame and verify the current stack matches its expected types.

The spec handles unreachable code with a polymorphic stack. After an unconditional br, return, or unreachable instruction, subsequent instructions are considered unreachable. The validator enters a mode where any type can be assumed available on the stack, because the code will never execute. This is the subtlest part of the algorithm: unreachable code must still parse and be structurally valid, but type mismatches within it are not errors. When the validator exits the unreachable region (by reaching the end of a block), it resets to the stack state defined by the control frame.

What This Looks Like in Go

watgo’s wasmir package represents a WebAssembly module as typed Go structs. A wasmir.Module has fields for each section type: Types []FuncType, Funcs []Func, Exports []Export, and so on. A FuncType is a pair of slices, parameter types and result types, where each element is a ValType (an integer constant for i32, i64, f32, f64, or a reference type).

mod, err := watgo.ParseWAT([]byte(wasmText))
// mod.Types is []wasmir.FuncType
// mod.Funcs is []wasmir.Func
// mod.Exports is []wasmir.Export

Instructions inside function bodies are similarly typed. Each instruction in a function’s code section is represented as a Go struct carrying its opcode and any immediate operands (like branch targets, function indices, or type indices). The validator walks over these, maintaining a []ValType as the value stack and a []ControlFrame as the block stack.

This maps to Go idioms cleanly. A slice append is a push; a slice shrink is a pop. Checking the top of the stack is an index operation. The control frame stack grows when entering a block and shrinks when exiting. There is no dynamic dispatch, no object hierarchy to navigate. The algorithm is essentially a loop over a slice of instruction structs, which is the kind of code Go handles with minimal friction.

The structural constraint of WebAssembly’s control flow is what makes a pure Go implementation feasible without a parser generator, an IR transformation framework, or anything beyond the standard library. A language with arbitrary jumps would require dominance analysis to type-check. A language with polymorphic dispatch would require constraint solving. WebAssembly’s design choices, made primarily to enable fast single-pass compilation in browsers, also happen to make the validation algorithm implementable in a few hundred lines of Go.

Reference Types and WasmGC Make It Harder

The core validation algorithm is linear and elegant. Newer proposals expand the type system in ways that add complexity.

Reference types (funcref, externref, and the WasmGC additions) introduce a subtype relationship. A (ref $myStruct) is a subtype of (ref null $myStruct), which is a subtype of externref in certain contexts. When two branches of an if produce different reference types, the validator must compute the least upper bound (LUB) in the type hierarchy to determine what type the join point delivers. For core WASM with only funcref and externref, the hierarchy is flat and the LUB is trivial. With WasmGC, you can declare struct types with inheritance, so the validator needs to walk the declared type hierarchy.

WasmGC also adds struct.get, struct.set, array.new, array.get, and similar instructions, each with type rules that check that the index is valid for the declared struct layout and that the field type matches the operation. The validation rules for these are defined in the spec but require looking up the declared type by index, which means the validator needs the module’s type section in scope.

For a first release of watgo, the question is how far into the WasmGC type lattice the validation step reaches. The core validation, covering everything up through the reference types proposal, is the meaningful baseline for most real-world WASM modules today. WasmGC coverage is the next layer, driven by Kotlin, Dart, and Java on WASM.

The Separation Between Validation and Execution

One architectural point worth noting: validation and execution are entirely separate in watgo’s design. watgo validates a module; it does not run it. Execution is wazero’s domain. wazero is a pure-Go, zero-dependency WASM runtime that can execute validated modules. The two projects complement each other cleanly.

This separation is not accidental. WebAssembly validation was designed to be done before execution, once at load time, as a precondition for allowing a module into a runtime. A conforming runtime may assume that any module it executes has already been validated; the runtime does not need to type-check during execution. The validator is a gate, not part of the hot path.

For Go developers building systems that accept WASM modules from external sources, this separation has a practical shape. You run watgo’s validator on ingestion, before the module reaches wazero. If validation fails, you reject the module with a meaningful error. If it passes, wazero can execute it knowing the type invariants hold. Neither step requires CGo, a system library, or a Rust toolchain. The full path from a .wat source file to a running module is now achievable in Go without leaving the go tool’s native dependency management.

Where the CLI Fits In

watgo ships a CLI designed for compatibility with wasm-tools, the Rust CLI from the Bytecode Alliance:

go install github.com/eliben/watgo/cmd/watgo@latest
watgo parse stack.wat -o stack.wasm

Bendersky migrated his wasm-wat-samples repository from wasm-tools to watgo, which is a direct demonstration of the compatibility claim. For anyone who has been including a wasm-tools installation step in a CI pipeline or a development setup script, the replacement is available. The installation story reduces to go install on any machine with Go.

The wabt comparison is worth making explicit. wabt has been fuzzed extensively and is considered the reference validator; it is the oracle when you want to know definitively whether something is spec-compliant. watgo is not a decade-old C++ project with that level of hardening. For production security-critical validation of untrusted WASM, wabt’s track record matters. For build tooling, development pipelines, testing infrastructure, and tool distribution in a Go-native environment, the trade-off reads differently. watgo’s validation is spec-derived and Go’s straightforward implementation of the algorithm is auditable without a C++ toolchain.

The Component Model Is the Next Frontier

wasm-tools has evolved significantly beyond core module tooling. Its current major surface is the Component Model: a standard for composing WASM modules with interface types defined in WIT (WebAssembly Interface Types). WASI 0.2, released in January 2024, is built on the Component Model. Cargo-component, jco, and the broader Bytecode Alliance toolchain all assume it.

watgo covers core WebAssembly modules. The Component Model is a separate layer with its own binary format, validation rules, and tooling surface. The validation algorithm for components is substantially more complex than for core modules, because components can import and export interface types (like records, variants, and resource handles) with their own subtyping rules.

When watgo reaches Component Model support, the comparison with wasm-tools becomes closer to feature parity. Until then, watgo occupies the core module layer of the WASM tooling stack. That layer covers the majority of current WASM use cases: Go, Rust, and C++ programs compiled to WASM, plugin systems using core module interfaces, and any tooling that inspects or transforms WASM binaries without using the Component Model.

For the Go developer writing WASM-adjacent tooling today, the practical summary is: the tools you need to parse, validate, and inspect WASM modules exist in Go, are installable with go install, and require no C or Rust dependencies. The validation step is the algorithmically interesting part, and watgo implements it against the formal WebAssembly spec. That is the correct foundation.

Was this interesting?