The volatile keyword in C tells the compiler that a memory location has side effects: do not optimize away reads or writes to it. That is the entirety of the guarantee. The keyword says nothing about access direction, ownership, valid state transitions, or concurrent access from interrupt handlers. C can express every other constraint in hardware register programming only through documentation, naming conventions, and programmer discipline.
This is the gap that Ferrous Systems’ post on hardware access in Rust addresses. The post walks through the progression from raw pointer dereferences to typed register abstractions, but the deeper argument is that Rust can move hardware contracts into the type system, where the compiler enforces them rather than the programmer.
What volatile Actually Guarantees
In a typical ARM Cortex-M UART driver, C code following the CMSIS conventions looks like this:
typedef struct {
    __I  uint32_t SR;   // Status register: read-only
    __IO uint32_t DR;   // Data register: read-write
    __IO uint32_t BRR;  // Baud rate register: read-write
    __IO uint32_t CR1;  // Control register 1: read-write
} USART_TypeDef;

#define USART1 ((USART_TypeDef *) 0x40011000UL)
ARM’s CMSIS standard, introduced around 2008 and paired with auto-generated headers derived from SVD (System View Description) XML files, standardized this struct-overlay pattern across the embedded C ecosystem. The __I and __IO macros expand to volatile const and volatile respectively, marking each register with its intended access direction.
The enforcement is purely conventional. A volatile const uint32_t* can be cast to volatile uint32_t* and written to, and the explicit cast compiles without complaint at default warning levels. C has no write-only type qualifier, so __O registers are simply volatile, readable by any code that holds a pointer. The most fundamental constraint, that only one logical owner should access a given peripheral at a time, is invisible to the compiler. Two modules can each hold a pointer to USART1 and call into it concurrently from main code and an ISR; that conflict surfaces as a logic bug at runtime, and nothing in the language flags it.
Memory ordering is a separate discipline entirely. The C compiler preserves ordering among volatile accesses relative to each other but may reorder them relative to non-volatile operations. Correct peripheral access often requires explicit barriers (__DMB(), __DSB(), __ISB()) to synchronize with DMA controllers and other bus masters. Inserting these correctly is the programmer’s responsibility, with no language mechanism for verification.
Rust’s Starting Point: Volatile as an Operation
Rust has no volatile type qualifier. Volatile semantics are explicit function calls: core::ptr::read_volatile and core::ptr::write_volatile. This reframes the model from “this memory location has side effects” to “this specific access has side effects,” which matters when building safe abstractions on top.
Both functions take a raw pointer and are unsafe, forcing each call site to acknowledge that unmediated memory access is happening:
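use core::ptr;

const UART_SR: *const u32 = 0x4001_1000 as *const u32;
const UART_DR: *mut u32 = 0x4001_1004 as *mut u32;

// Safety: the caller must ensure these addresses map to the UART on this chip.
unsafe fn uart_send(byte: u8) {
    // Spin until bit 7 of the status register (transmit register empty) is set.
    while ptr::read_volatile(UART_SR) & (1 << 7) == 0 {}
    ptr::write_volatile(UART_DR, byte as u32);
}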
Compared to USART1->SR and USART1->DR = c, this is more verbose and provides no inherent safety advantage on its own. These raw primitives are the foundation; the interesting work happens when crates build typed abstractions on top.
Typed Register Access
The volatile_register crate provides three types: RO<T> for read-only registers, WO<T> for write-only registers, and RW<T> for read-write registers. A peripheral register block becomes:
use volatile_register::{RO, RW, WO};

#[repr(C)]
pub struct UartRegisters {
    pub sr: RO<u32>,   // Status: read-only
    pub dr: RW<u32>,   // Data: read-write
    pub brr: RW<u32>,  // Baud rate: read-write
    pub cr1: RW<u32>,  // Control 1: read-write
}
The types expose only the operations their access mode permits. RO<T> has a .read() method and nothing else. WO<T> has .write() only. RW<T> has both, plus a .modify(|v| ...) closure for read-modify-write sequences. Every access goes through read_volatile or write_volatile internally.
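To make the mechanism concrete, here is a minimal sketch of how such wrappers could be built on top of read_volatile and write_volatile. The ReadOnly and ReadWrite types, the FakeUart struct, and the fake_uart helper are hypothetical names for illustration, not the crate’s actual source. The registers are backed by ordinary memory so the sketch runs on a host, and write is left safe here for brevity, whereas the real crate marks writes as unsafe.

```rust
use core::cell::UnsafeCell;
use core::ptr;

/// Hypothetical read-only register: exposes `read` and nothing else.
#[repr(transparent)]
pub struct ReadOnly<T: Copy>(UnsafeCell<T>);

/// Hypothetical read-write register: `read`, `write`, and `modify`.
#[repr(transparent)]
pub struct ReadWrite<T: Copy>(UnsafeCell<T>);

impl<T: Copy> ReadOnly<T> {
    pub fn read(&self) -> T {
        // SAFETY: the pointer comes from a live, aligned, initialized cell.
        unsafe { ptr::read_volatile(self.0.get()) }
    }
}

impl<T: Copy> ReadWrite<T> {
    pub fn read(&self) -> T {
        // SAFETY: as above.
        unsafe { ptr::read_volatile(self.0.get()) }
    }

    pub fn write(&self, value: T) {
        // SAFETY: as above; the cell grants interior mutability.
        unsafe { ptr::write_volatile(self.0.get(), value) }
    }

    /// Read-modify-write: one volatile read followed by one volatile write.
    pub fn modify(&self, f: impl FnOnce(T) -> T) {
        self.write(f(self.read()));
    }
}

/// In-memory stand-in for a register block (assumption: host testing only).
#[repr(C)]
pub struct FakeUart {
    pub sr: ReadOnly<u32>,
    pub dr: ReadWrite<u32>,
}

pub fn fake_uart(initial_sr: u32) -> FakeUart {
    FakeUart {
        sr: ReadOnly(UnsafeCell::new(initial_sr)),
        dr: ReadWrite(UnsafeCell::new(0)),
    }
}
```

Because ReadOnly has no write method at all, a call such as uart.sr.write(0) fails name resolution at compile time rather than relying on a qualifier that a cast can remove.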
Calling .write() on an RO<u32> is a compilation failure. The constraint that C expressed as a volatile const qualifier, which any cast could subvert, is now checked by the type system at every call site. The Tock OS project takes this further with tock-registers, which adds named bit-field definitions directly into the type:
use tock_registers::{register_bitfields, register_structs, registers::ReadWrite};

register_bitfields![u32,
    CR1 [
        UE OFFSET(13) NUMBITS(1) [],
        TE OFFSET(3) NUMBITS(1) [],
        M OFFSET(12) NUMBITS(1) [
            EightBits = 0,
            NineBits = 1,
        ]
    ]
];
Named field access eliminates the manual masking and shifting that fills embedded C codebases. Enumerated values for fields with defined states mean that code going through the typed API can only write bit patterns the definition names.
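The idea behind named fields can be sketched without any crate. The Field type, its methods, and the WordLength enum below are hypothetical illustrations of the technique, not the tock-registers API; the offsets mirror the CR1 definition above, and the sketch assumes every field is narrower than 32 bits.

```rust
/// Hypothetical description of a bit field: where it starts and how wide it is.
/// Assumes `numbits` is less than 32.
#[derive(Clone, Copy)]
pub struct Field {
    pub offset: u32,
    pub numbits: u32,
}

impl Field {
    /// Mask covering the field in its in-register position.
    pub const fn mask(self) -> u32 {
        ((1u32 << self.numbits) - 1) << self.offset
    }

    /// Pull the field value out of a raw register value.
    pub const fn extract(self, reg: u32) -> u32 {
        (reg & self.mask()) >> self.offset
    }

    /// Return `reg` with this field replaced by `value`; other bits are kept.
    pub const fn insert(self, reg: u32, value: u32) -> u32 {
        (reg & !self.mask()) | ((value << self.offset) & self.mask())
    }
}

pub const UE: Field = Field { offset: 13, numbits: 1 };
pub const TE: Field = Field { offset: 3, numbits: 1 };
pub const M: Field = Field { offset: 12, numbits: 1 };

/// The only values the M field may take.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u32)]
pub enum WordLength {
    EightBits = 0,
    NineBits = 1,
}

/// Callers pass a `WordLength`, so an out-of-range value cannot be expressed.
pub fn set_word_length(cr1: u32, len: WordLength) -> u32 {
    M.insert(cr1, len as u32)
}

pub fn word_length(cr1: u32) -> WordLength {
    match M.extract(cr1) {
        0 => WordLength::EightBits,
        _ => WordLength::NineBits,
    }
}
```

The masks and shifts still exist, but they are written once in Field and derived from the field description instead of being repeated at every call site.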
The PAC Generation Pipeline
Writing register structs by hand for a microcontroller with forty peripherals and several hundred registers is impractical. The embedded Rust ecosystem solves this with svd2rust, which reads SVD files, the same XML format used to generate CMSIS C headers, and produces a Peripheral Access Crate (PAC) for a specific chip.
PACs exist for most common targets: stm32f4 for STMicroelectronics’ F4 series, nrf52840-pac for Nordic’s nRF52840, rp2040-pac for the Raspberry Pi RP2040. The generated API is more structured than a raw struct overlay. For a USART peripheral on STM32:
use stm32f4::stm32f411::Peripherals;
let dp = Peripherals::take().unwrap();
let usart1 = dp.USART1;
// Named field access via typed readers
while usart1.sr.read().txe().bit_is_clear() {}
// Write closure resets to reset value, then applies changes
usart1.dr.write(|w| unsafe { w.dr().bits(b'A' as u16) });
Two design decisions here are worth examining separately. Peripherals::take() returns Option<Peripherals> and uses a static flag, checked and set inside a critical section, to ensure it succeeds at most once per program execution. The singleton constraint, which in C amounts to a documentation note that two modules should not both call USART_Init(), is now a runtime-enforced ownership invariant. Once taken and split, the USART1 token can be moved into a driver by value, and no other safe code in the program can obtain a second handle; the only way around it is the explicitly unsafe Peripherals::steal().
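The take-once behaviour can be sketched on a host in a few lines. Everything below is a hypothetical stand-in, not generated PAC code: the Peripherals, Usart1, and Driver types are illustrative, and the AtomicBool swap is simply a convenient host-side guard for the "already taken" flag.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical peripheral token. The private field means code outside
/// this module cannot construct one by hand.
pub struct Usart1 {
    _private: (),
}

pub struct Peripherals {
    pub usart1: Usart1,
}

/// Records whether the peripherals have already been handed out.
static TAKEN: AtomicBool = AtomicBool::new(false);

impl Peripherals {
    /// Returns `Some` on the first call and `None` on every later call.
    pub fn take() -> Option<Self> {
        if TAKEN.swap(true, Ordering::AcqRel) {
            None
        } else {
            Some(Peripherals {
                usart1: Usart1 { _private: () },
            })
        }
    }
}

/// A driver that owns its peripheral: the token is moved in by value.
pub struct Driver {
    _usart: Usart1,
}

impl Driver {
    pub fn new(usart: Usart1) -> Self {
        Driver { _usart: usart }
    }
}
```

After Driver::new(dp.usart1), any later use of dp.usart1 is rejected by the compiler as a use of a moved value, which is the compile-time half of the guarantee; the Option returned by take() is the runtime half.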
The write closure addresses a different class of bug. A naive read-modify-write on a control register can retain unintended bits from prior state. PAC write closures start from the register’s documented reset value and apply only the specified changes, making the result deterministic regardless of what the register contained before the call.
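The difference between the two closures is easiest to see side by side. The sketch below uses hypothetical Cr1 and W types backed by a Cell, not svd2rust output, and assumes a reset value of zero for the control register.

```rust
use core::cell::Cell;

/// Hypothetical writer proxy: accumulates the value that will be stored.
pub struct W {
    bits: u32,
}

impl W {
    fn set(&mut self, bit: u32, on: bool) -> &mut Self {
        if on {
            self.bits |= 1 << bit;
        } else {
            self.bits &= !(1 << bit);
        }
        self
    }

    /// UART enable (bit 13).
    pub fn ue(&mut self, on: bool) -> &mut Self {
        self.set(13, on)
    }

    /// Transmitter enable (bit 3).
    pub fn te(&mut self, on: bool) -> &mut Self {
        self.set(3, on)
    }
}

/// In-memory stand-in for a control register.
pub struct Cr1 {
    value: Cell<u32>,
}

impl Cr1 {
    /// Assumed reset value for this sketch.
    pub const RESET: u32 = 0x0000_0000;

    pub fn with_contents(bits: u32) -> Self {
        Cr1 { value: Cell::new(bits) }
    }

    pub fn bits(&self) -> u32 {
        self.value.get()
    }

    /// Starts from the reset value, so prior contents never leak through.
    pub fn write(&self, f: impl FnOnce(&mut W) -> &mut W) {
        let mut w = W { bits: Self::RESET };
        f(&mut w);
        self.value.set(w.bits);
    }

    /// Starts from the current contents, so untouched bits are preserved.
    pub fn modify(&self, f: impl FnOnce(&mut W) -> &mut W) {
        let mut w = W { bits: self.value.get() };
        f(&mut w);
        self.value.set(w.bits);
    }
}
```

With stale contents of 0x1004 in the register, modify(|w| w.ue(true)) leaves 0x3004, while write(|w| w.ue(true).te(true)) leaves exactly bits 13 and 3 set. Which one is appropriate depends on whether the caller means to preserve or replace the existing configuration.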
State Machines in the Type Signature
The typestate pattern extends type-level hardware contracts to cover initialization sequences and valid state transitions. A UART that must be configured before transmitting becomes a type parameterized by its current state:
pub struct Disabled;
pub struct Enabled;

pub struct Uart<STATE> {
    regs: &'static mut UartRegisters,
    _state: core::marker::PhantomData<STATE>,
}

impl Uart<Disabled> {
    pub fn configure(self, baud: u32, clk: u32) -> Uart<Enabled> {
        // Assuming an STM32-style BRR with 16x oversampling: the register holds
        // USARTDIV in 12.4 fixed point, so the raw value is clk / baud.
        let div = clk / baud;
        unsafe { self.regs.brr.write(div) };
        // Set UE (bit 13), TE (bit 3), and RE (bit 2).
        unsafe {
            self.regs.cr1.modify(|v| v | (1 << 13) | (1 << 3) | (1 << 2))
        };
        Uart { regs: self.regs, _state: core::marker::PhantomData }
    }
}

impl Uart<Enabled> {
    pub fn send_byte(&mut self, byte: u8) {
        // Wait for TXE (bit 7) before writing the data register.
        while self.regs.sr.read() & (1 << 7) == 0 {}
        unsafe { self.regs.dr.write(byte as u32) };
    }
}
Uart<Disabled> has no send_byte method. Using the peripheral before configuring it is a compilation failure. The sequence constraint that embedded C enforces through runtime assertions, or through trust that callers read the manual, is now part of the function signature.
The PhantomData<STATE> field is zero-sized. All of the type-level state tracking has no runtime representation and no runtime cost. The compiled output is equivalent to what a careful C developer would produce by hand, minus the defensive checks, because the compiler has already verified the invariants statically. HAL crates use this pattern extensively; stm32f4xx-hal represents GPIO pins as types parameterized by their current mode, making it a compile error to use a pin configured as SPI MOSI simultaneously as a generic GPIO output, because the pin’s ownership was moved into the SPI peripheral constructor.
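The pin case can be sketched the same way. The Pin, Spi, and mode marker types below are hypothetical and much simpler than what stm32f4xx-hal actually defines; set_high returns the bit mask a real HAL would write, so the sketch stays runnable on a host.

```rust
use core::marker::PhantomData;
use core::mem::size_of;

// Zero-sized mode markers.
pub struct Input;
pub struct Output;
pub struct AlternateSpi;

/// Hypothetical pin handle: one byte of data plus a zero-sized mode marker.
pub struct Pin<MODE> {
    number: u8,
    _mode: PhantomData<MODE>,
}

impl<MODE> Pin<MODE> {
    pub fn number(&self) -> u8 {
        self.number
    }

    // Private helper: consumes the pin and returns it in a new mode.
    fn into_mode<NEW>(self) -> Pin<NEW> {
        Pin { number: self.number, _mode: PhantomData }
    }
}

impl Pin<Input> {
    pub fn new(number: u8) -> Self {
        Pin { number, _mode: PhantomData }
    }

    pub fn into_output(self) -> Pin<Output> {
        self.into_mode()
    }

    pub fn into_alternate_spi(self) -> Pin<AlternateSpi> {
        self.into_mode()
    }
}

impl Pin<Output> {
    /// Only output pins have this method.
    pub fn set_high(&mut self) -> u32 {
        1u32 << self.number
    }
}

/// The SPI driver takes ownership of its MOSI pin.
pub struct Spi {
    mosi: Pin<AlternateSpi>,
}

impl Spi {
    pub fn new(mosi: Pin<AlternateSpi>) -> Self {
        Spi { mosi }
    }

    /// Hands the pin back when the driver is torn down.
    pub fn release(self) -> Pin<AlternateSpi> {
        self.mosi
    }
}

/// True when the mode marker contributes nothing to the pin's size.
pub fn marker_adds_no_bytes() -> bool {
    size_of::<Pin<Output>>() == size_of::<u8>()
}
```

Once a pin has been passed to Spi::new, the original binding is moved and cannot be used again, and Pin<AlternateSpi> has no set_high method to call in the first place.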
The Portability Layer
Code written against a PAC is chip-specific. The embedded-hal crate, which reached 1.0 in January 2024 after several years as a 0.2.x ecosystem standard, defines traits for common hardware interfaces: SPI buses and devices, I2C controllers, GPIO input and output, PWM, and delays. Serial read and write, part of the 0.2 trait set, now live in companion crates such as embedded-io and embedded-hal-nb. HAL crates for specific chips implement these traits on top of their PAC types. Driver crates target the traits rather than any specific chip, so a driver for an SPI-connected sensor works without modification on any microcontroller with a compatible HAL implementation.
This composability relies on Rust’s monomorphization. A generic driver parameterized over SPI: embedded_hal::spi::SpiBus produces specialized machine code for each concrete SPI implementation it is instantiated with. Unless the driver explicitly opts into trait objects, there is no vtable, no dynamic dispatch, and no indirection at runtime. With optimizations enabled, the abstraction layers typically inline away in the compiled binary.
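A generic driver of this shape can be sketched with a local trait. ByteBus, Sensor, and MockBus are hypothetical names used only for illustration; the trait is a deliberately tiny stand-in, not the embedded-hal definition, and MockBus uses Vec, so it is a host-only test double.

```rust
use core::convert::Infallible;

/// Hypothetical bus trait standing in for a real HAL trait.
pub trait ByteBus {
    type Error;

    /// Send one byte and return the byte clocked in at the same time.
    fn transfer(&mut self, out: u8) -> Result<u8, Self::Error>;
}

/// Driver generic over any bus; the compiler emits one copy per concrete `B`.
pub struct Sensor<B> {
    bus: B,
}

impl<B: ByteBus> Sensor<B> {
    pub fn new(bus: B) -> Self {
        Sensor { bus }
    }

    /// Read one register. Assumes the sensor marks reads with the top address bit.
    pub fn read_register(&mut self, addr: u8) -> Result<u8, B::Error> {
        self.bus.transfer(addr | 0x80)?;
        self.bus.transfer(0x00)
    }

    /// Give the bus back to the caller.
    pub fn release(self) -> B {
        self.bus
    }
}

/// Host-side test double that records traffic and replies with a fixed byte.
pub struct MockBus {
    pub sent: Vec<u8>,
    pub reply: u8,
}

impl ByteBus for MockBus {
    type Error = Infallible;

    fn transfer(&mut self, out: u8) -> Result<u8, Self::Error> {
        self.sent.push(out);
        Ok(self.reply)
    }
}
```

The driver never names MockBus or any chip-specific type. Swapping in a hardware-backed implementation changes only the type argument, and the calls to transfer are resolved statically for each instantiation.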
The embedded-hal-async crate extends this into async territory with async versions of the same trait family, and the Embassy framework builds its HALs and executor around them. A blocking SPI implementation satisfies the embedded-hal traits, while a DMA-driven async implementation satisfies their embedded-hal-async counterparts; because the two families mirror each other, driver logic carries over with little more than added .await points. Embassy has become a common choice for new embedded Rust projects targeting STM32, nRF, RP2040, and ESP32, and its HALs demonstrate the full stack in practice: PAC types at the bottom, typed register abstractions above them, trait implementations above those, and generic drivers at the top.
What Accumulates Across the Stack
The Embedded Rust Book documents the full ecosystem in detail, from PAC generation through HAL crates to driver development. The toolchain has matured considerably since the initial experiments in 2018; svd2rust, probe-rs for flashing and debugging, defmt for binary-format logging, and the RTIC framework for interrupt-driven concurrency with compile-time priority analysis now constitute a coherent platform.
The accumulated difference from C is substantial. Access direction is enforced by types rather than naming conventions. Singleton peripheral ownership is enforced at runtime through a checked take and then at compile time through move semantics. Pin assignment is tracked through ownership, so safe code cannot accidentally share a pin between two drivers. State machine invariants are enforced by the type checker rather than runtime assertions or documentation. Generic drivers work across chips without modification because the portability layer is defined in traits.
Experienced embedded C developers looking at PAC code often note the verbosity. The types carry more information than a CMSIS struct. That additional information participates in compilation, enforced by the same tool that checks every other correctness property in the program. The hardware contract, previously confined to comments and conventions that casts could quietly violate, now has the same standing as any other type-level invariant in the codebase.