The Abstract Machine Meets the Black Box: How MiniRust Narrates Inline Assembly
Source: lobsters
Rust’s asm! macro was stabilized in Rust 1.59 back in February 2022. That was a significant moment, not because inline assembly was new to the world, but because bringing it into a language built around provable safety guarantees requires you to answer a hard question: what does the abstract machine actually do when the assembly runs?
That question is what Ralf Jung’s recent post on fitting inline assembly into MiniRust is really about. Jung is the researcher behind RustBelt, Stacked Borrows, and the ongoing MiniRust project, which is an operational semantics specification for Rust. MiniRust describes, step by step, what the Rust abstract machine does as a program runs. Most of the language fits into this model reasonably well. Inline assembly is the part that makes the model work hard.
What the asm! macro actually exposes
The Rust Reference’s inline assembly documentation covers the operand types and options in detail, but the structure is worth recapping before getting into the semantics.
Operands come in several flavors: in passes a value into the assembly, out receives a value back, inout does both with the same register, and lateout / inlateout allow the compiler to reuse the input register for the output because the input is no longer needed when the output is written. A concrete example:
use std::arch::asm;
fn add_one(x: u64) -> u64 {
let result: u64;
unsafe {
asm!(
"add {0}, 1",
inout(reg) x => result,
);
}
result
}
That is the common case. But the more interesting parts of asm! for the purposes of formal semantics are the options. The options are declarations that constrain what the assembly is allowed to do:
nomem— the assembly does not read from or write to memoryreadonly— the assembly may read memory but does not write to itnostack— the assembly does not use or modify the stack pointerpure— the assembly has no side effects beyond its outputs, and produces the same outputs for the same inputsnoreturn— the assembly does not returnpreserves_flags— the assembly does not modify condition flags
There is also clobber_abi, which takes a calling convention name (like "C" or "system") and tells the compiler that the assembly follows that convention’s caller-saved register semantics. This means the compiler knows which registers survive the assembly block and which are clobbered, without requiring you to list every register explicitly.
unsafe {
asm!(
"call some_external_fn",
clobber_abi("C"),
// no need to list rax, rcx, rdx, etc. individually
);
}
Where operational semantics gets uncomfortable
MiniRust describes execution as a sequence of abstract machine steps. Load a value from memory. Apply an arithmetic operation. Store the result. Branch on a condition. Each step is defined precisely: what the machine reads, what it computes, what it writes, and what counts as undefined behavior if the preconditions are not met.
For most Rust operations this story is tight. When you index into a slice, the abstract machine checks bounds. When you dereference a pointer, there are rules about validity, alignment, and borrow state. The story has clear chapters.
Inline assembly breaks the narrative because the assembly block is, from the abstract machine’s perspective, a black box. The machine does not know what instructions are inside. It cannot trace which memory locations are accessed, which flags are modified, which registers contain values that matter. It hands control to something opaque and then needs to reason about what came back.
The way Jung frames this in MiniRust is through the options as formal preconditions and postconditions on that black box. The assembly block is not defined instruction by instruction. Instead, MiniRust says: here is what the abstract machine does leading up to the assembly, here is the range of things the assembly is permitted to do while running, and here is what the abstract machine does after. The options bound that middle section.
When you declare nomem, you are telling MiniRust that the black box does not interact with the memory model at all. The abstract machine can hold memory state constant across the assembly block. Its alias analysis and borrow tracking remain valid because nothing in the middle could have invalidated them.
When you declare readonly, the black box can observe memory but the abstract machine’s write-tracking stays clean. No allocation has been modified. No borrow has been invalidated through a write path.
Without any such declaration, the abstract machine has to assume the worst: the assembly may have read and written arbitrary memory, may have called functions (changing global state), and may have violated any invariant that was not passed in through a declared operand. The compiler cannot hoist loads above the assembly block or sink stores below it. Optimization opportunities disappear.
The tension between tight and loose
There is a real design pressure here that Jung’s MiniRust work has to navigate. Make the semantics too strict — say, require that all accessed memory locations be declared as operands — and you cannot write useful assembly. Device driver code, cryptographic primitives, and operating system kernels all need to access memory through pointers without listing every location in advance.
Make the semantics too loose — allow the assembly to do anything unless constrained — and you lose the properties that make Rust optimizations correct and soundness arguments hold. The compiler passes code to LLVM with certain guarantees attached. If those guarantees are wrong, LLVM’s optimization pipeline produces code that violates the program’s intended semantics.
The options are where this tension gets resolved, and getting them wrong produces undefined behavior. The clearest example is nomem. Consider:
unsafe fn broken(ptr: *mut u64) -> u64 {
let result: u64;
asm!(
"mov {0}, [{1}]",
"add {0}, 1",
"mov [{1}], {0}", // writes through ptr
out(reg) result,
in(reg) ptr,
options(nomem, pure, nostack)
);
result
}
This declares nomem but then reads from and writes to the location ptr points to. The abstract machine, having been told that no memory is touched, will treat the assembly as not affecting any memory. The compiler may reorder memory operations around this call. Another thread or another part of the same thread that reads from *ptr may see stale values or have its own writes reordered unexpectedly. The operational semantics story says one thing; the assembly does another. That mismatch is undefined behavior.
The comparison with GCC-style extended asm in C is instructive here. C’s inline assembly uses a similar operand syntax but the options are thinner. You declare input and output operands and a clobber list. There is no nomem or readonly at the asm statement level. The nearest equivalent is the memory clobber, which conservatively tells the compiler that memory may have been modified and forces it to flush its state around the asm block. Rust’s approach is more granular: rather than a binary memory/no-memory flag, you have a range from nomem through readonly to the default assumption of full memory access. This lets the compiler preserve more optimization opportunities when the programmer provides tighter guarantees.
How this flows into LLVM IR
The options are not just abstract constraints for MiniRust to reason about. They map to LLVM IR attributes on the inline asm call. The nomem option corresponds to LLVM’s readnone memory effect. The readonly option corresponds to readonly. The pure option enables constant folding across repeated calls. The nostack option informs LLVM’s stack frame analysis.
When LLVM’s optimization passes see a readnone inline asm call, they treat it the same way they treat a readnone function call: safe to hoist out of loops, safe to eliminate in dead code paths, safe to reorder with surrounding operations. If that attribute was wrong because the programmer miscategorized their assembly, those optimizations produce miscompiled code.
This is why the MiniRust framing matters for practical use. The formal story the abstract machine tells about assembly is not just academic scaffolding. It is the contract that makes compiler optimizations correct. Getting the options right is not about satisfying a type checker; it is about making the compiler’s trust in your declarations earned rather than misplaced.
The storytelling approach
The metaphor Jung uses — and that the MiniRust project embodies — is that operational semantics specifies program behavior by narrating it. The abstract machine steps through time, and the specification says what each step does. This is different from axiomatic approaches that define what cannot happen without describing how execution proceeds.
For most of Rust, the two perspectives align closely. But inline assembly is a case where the narrative approach has a structural advantage. You can say: the machine reads the input operands, records them in registers, then the black box runs subject to these constraints, then the machine reads the output registers and writes them to the declared destinations. That is a coherent story even when the inside of the black box is opaque. You bound the story’s middle section with the options, and the before and after remain fully specified.
What Jung’s work on the inline assembly chapter of MiniRust does is make that middle section precise enough to reason about, without requiring the abstract machine to understand the assembly itself. The goal is that a Rust implementation is correct if its execution matches MiniRust’s story. An implementation that miscompiles code because it trusted a nomem declaration that was violated is not matching that story — the UB is the point where the story and the execution diverge.
For anyone writing low-level Rust with asm!, the practical upshot is to treat the options as binding contracts rather than hints. The nomem option is not a performance suggestion; it is a claim about what your assembly does, and the correctness of everything around it depends on that claim being true.