bunnie’s BIO (Bao I/O) coprocessor arrives as part of the Bao platform, his custom silicon successor to Precursor. The concept it embodies, a small programmable engine dedicated to bit-level I/O work, is one of embedded systems’ most consistently reinvented ideas. Understanding why it keeps coming back, and what each generation adds, is more useful than treating any single instance in isolation.
The Problem That Will Not Go Away
Modern SoCs are expected to speak dozens of protocols: SPI, I2C, I2S, UART, SDIO, CAN, MIPI CSI, and a long tail of industrial and proprietary variants. The standard approach is to hardwire dedicated peripheral blocks in silicon. Every I2C controller, every SPI master, is fixed logic with a register interface baked into the die. This works until you need a protocol the silicon designer did not anticipate, or you need timing precision that OS-driven interrupt handling cannot provide, or you are building an open hardware platform where trusting closed peripheral IP is philosophically inconsistent with the project’s goals.
The alternative is a programmable engine: a small processor or state machine that accepts firmware and then toggles GPIO lines, shifts bits, and responds to pin transitions at deterministic cycle counts, entirely independent of whatever the main CPU is doing. That is the architectural idea behind BIO. It is also the idea behind the RP2040’s PIO blocks, TI’s PRU subsystem, the ESP32’s ULP coprocessor, and chips going back more than four decades.
The 6522 and the First Programmable Peripherals
In the late 1970s, MOS Technology introduced the 6522 Versatile Interface Adapter as a companion to the 6502. The 6522 VIA ended up in the Commodore VIC-20, the BBC Micro, and the original Macintosh. It offered two 8-bit bidirectional I/O ports with individually configurable pin directions, a shift register capable of synchronous serial at rates derived from an internal timer, and a pair of interval timers with interrupt output. It was not a general-purpose processor, but its behavior was configurable in ways that went beyond passive GPIO expansion.
Intel’s 8255 Programmable Peripheral Interface took a parallel approach for the 8080 and 8086 families: three 8-bit ports configurable across several modes, including bidirectional handshaked transfers with hardware strobe and acknowledge signals. Zilog’s Z80-PIO offered comparable handshaked parallel ports, and the Z80-SIO handled synchronous serial protocols including HDLC with its own internal state machines, requiring the Z80 only for setup and error handling.
All of these chips shared the same design logic. You configure the peripheral via registers, hand it a mode, and it runs autonomously. The CPU is freed from cycle-by-cycle bit manipulation. The limitation is that the peripheral’s behavior is fixed in silicon; you cannot reprogram it to handle a protocol the designer did not include.
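The register-driven model can be sketched in a few lines. This is a toy Python model of a port with a per-pin direction register, loosely in the spirit of the chips above; the class name `Port8` and its methods are hypothetical and do not reproduce any specific chip's register map.

```python
class Port8:
    """Toy 8-bit I/O port: a direction register plus an output latch."""

    def __init__(self):
        self.ddr = 0x00    # per-bit direction: 1 = output, 0 = input
        self.latch = 0x00  # value driven onto pins configured as outputs

    def write(self, value):
        # The CPU writes once; the latch holds the value with no further CPU work.
        self.latch = value & 0xFF

    def read(self, external_pins):
        # Output bits read back the latch; input bits reflect the outside world.
        return (self.latch & self.ddr) | (external_pins & ~self.ddr & 0xFF)


if __name__ == "__main__":
    port = Port8()
    port.ddr = 0x0F          # low nibble outputs, high nibble inputs
    port.write(0xAA)
    print(hex(port.read(0x50)))
```

Everything the port can do is fixed by the two lines inside `read` and `write`, which is exactly the limitation: new behavior needs new silicon, not new firmware.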
Intel’s 8089 I/O Processor pushed the concept further. Designed as a companion chip to the 8086, the 8089 had two independent I/O channels, each with its own instruction pointer running a custom instruction set optimized for block data transfers, scatter-gather DMA, and device control loops. It understood channel programs loaded from memory by the host. It was, in effect, a dedicated I/O CPU with its own ISA. The 8089 died when peripheral integration made dedicated I/O chips less economically interesting, but the architecture it explored would resurface repeatedly.
PRU: The Industrial Answer
Texas Instruments’ Programmable Real-Time Units, present in the AM335x (the processor in the BeagleBone Black) and a range of subsequent Sitara SoCs, represent the serious industrial answer to the same problem. Each AM335x has two PRU cores clocked at 200 MHz, so one instruction cycle is 5 ns. Each core gets 8 KB of instruction RAM and 8 KB of local data RAM, and the pair share a further 12 KB of data RAM. GPIO access is single-cycle: one instruction to read or write a pin, with deterministic latency regardless of what the main ARM core is doing.
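To make the timing concrete: at 200 MHz a PRU cycle lasts 5 ns, so a required pulse width maps directly onto a cycle count. The sketch below is host-side Python arithmetic, not PRU code; the function name `cycles_for` and the example pulse widths are illustrative assumptions.

```python
PRU_CLOCK_HZ = 200_000_000  # AM335x PRU clock quoted in the text


def cycles_for(duration_ns, clock_hz=PRU_CLOCK_HZ):
    """Return (cycles, error_ns) for the nearest whole-cycle approximation."""
    # Integer arithmetic with round-to-nearest avoids floating-point surprises.
    cycles = (duration_ns * clock_hz + 500_000_000) // 1_000_000_000
    actual_ns = cycles * 1_000_000_000 / clock_hz
    return cycles, actual_ns - duration_ns


if __name__ == "__main__":
    for ns in (400, 850, 1250):  # example pulse widths, chosen for illustration
        cycles, err = cycles_for(ns)
        print(f"{ns} ns -> {cycles} cycles (error {err:+.1f} ns)")
```

Because every cycle is accounted for, the worst-case quantization error is half a cycle, 2.5 ns at this clock rate.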
The PRU instruction set is deliberately minimal, centered on load, store, shift, logical operations, and branches. The toolchain is sparse by modern standards. pasm, the PRU assembler, produces tight, auditable binaries. There is no OS, no scheduler, no interrupt jitter. You load the firmware, assert the enable bit from the ARM side, and the PRU runs until you stop it.
Communication with the Linux host happens through shared memory and the RPMsg mailbox framework. Userspace can load firmware via the remoteproc subsystem and exchange messages with a running PRU program through character devices. The result is a real-time I/O engine that coexists with a full Linux stack without any of the timing compromises that RT-patched kernels try and sometimes fail to eliminate.
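The remoteproc side of that workflow can be sketched from userspace. The kernel exposes `firmware` and `state` files under `/sys/class/remoteproc/remoteprocN/`; which index maps to which PRU, and the firmware file name, depend on the kernel and board, so the values in the usage comment are assumptions. The helper name `load_and_start` is hypothetical.

```python
from pathlib import Path


def load_and_start(rproc_dir, firmware_name):
    """Point a remoteproc instance at a firmware image and start it.

    rproc_dir: e.g. /sys/class/remoteproc/remoteproc1 (index is an assumption)
    firmware_name: file name resolved by the kernel relative to /lib/firmware
    """
    rproc = Path(rproc_dir)
    state_file = rproc / "state"

    # A running core must be stopped before its firmware can be changed.
    if state_file.read_text().strip() == "running":
        state_file.write_text("stop")

    (rproc / "firmware").write_text(firmware_name)
    state_file.write_text("start")


# Example (needs root and real PRU hardware; both arguments are assumptions):
# load_and_start("/sys/class/remoteproc/remoteproc1", "am335x-pru0-fw")
```

Taking the directory as a parameter keeps the function testable against an ordinary temporary directory before it is pointed at real hardware.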
This architecture has been used for EtherCAT slave controllers, stepper motor drivers, industrial network stacks, and signal analyzers where microsecond precision matters. The PRU earns its complexity budget because the applications demand it.
RP2040 PIO: Programmable I/O for Everyone
Raspberry Pi’s RP2040, launched in 2021, made programmable I/O accessible to the widest audience yet and brought the concept into mainstream embedded discourse. The chip has two PIO blocks, each containing four state machines sharing a 32-instruction program memory.
The instruction set has exactly nine instructions: JMP, WAIT, IN, OUT, PUSH, PULL, MOV, IRQ, and SET. Every instruction executes in exactly one clock cycle unless it stalls (on a WAIT condition or a full or empty FIFO, for example), and an optional delay field can add up to 31 idle cycles, stretching one instruction to as many as 32. Each state machine has a 4-word TX FIFO and a 4-word RX FIFO, both reachable by the chip’s DMA controller, and a clock divider with a 16-bit integer and 8-bit fractional component for precise baud rate generation from a 125 MHz system clock.
The side-set mechanism is particularly well thought-out. An instruction can drive up to five GPIO pins to specific values in the same cycle as its main operation, with the side-set bits borrowed from the delay field. This means you can generate a clock edge and shift a data bit in the same cycle without spending an extra instruction. The four-instruction WS2812 LED driver in the SDK examples, and projects such as PicoDVI that generate DVI video from PIO, demonstrate what is achievable inside 32 instructions when the constraint forces clarity.
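The divider arithmetic is easy to work through by hand. The following Python sketch computes the integer and fractional parts of a 16.8 fixed-point divider for a target state machine clock. The function name `pio_clkdiv` is hypothetical, and the example assumes a UART transmitter that spends 8 state machine cycles per bit, which is a property of the particular PIO program, not of the hardware.

```python
def pio_clkdiv(sys_hz, sm_hz):
    """Return (int_part, frac_part, actual_hz) for a 16.8 fixed-point divider."""
    # Work in units of 1/256 so the fractional byte falls out of divmod.
    div_fixed = round(sys_hz * 256 / sm_hz)
    if not 256 <= div_fixed < 65536 * 256:
        raise ValueError("divider out of range for a 16.8 fixed-point field")
    int_part, frac_part = divmod(div_fixed, 256)
    actual_hz = sys_hz * 256 / div_fixed
    return int_part, frac_part, actual_hz


if __name__ == "__main__":
    # 115200 baud at an assumed 8 state-machine cycles per bit.
    i, f, actual = pio_clkdiv(125_000_000, 8 * 115_200)
    print(i, f, round(actual, 1))
```

For this example the divider comes out at 135 + 162/256, which lands within a few hertz of the 921600 Hz target. A non-zero fractional part is achieved by varying the length of individual cycles, so it trades a small amount of jitter for average-rate accuracy.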
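A small simulation shows why pairing a pin change with the main operation matters. The Python below is not PIO assembly and does not model the real hardware; it imitates a hypothetical two-instruction loop in which the first instruction shifts a data bit out while its side-set drives the clock low, and the second only raises the clock. The function name `shift_out` and the trace format are assumptions made for illustration.

```python
def shift_out(byte, bits=8):
    """Simulate a 2-instruction loop; return a list of (cycle, clk, data)."""
    trace = []
    shift_reg = byte & ((1 << bits) - 1)
    cycle = 0
    for _ in range(bits):
        # Instruction 1: present the MSB on the data pin; the side-set
        # drives the clock low in the very same cycle.
        data = (shift_reg >> (bits - 1)) & 1
        shift_reg = (shift_reg << 1) & ((1 << bits) - 1)
        trace.append((cycle, 0, data))
        cycle += 1
        # Instruction 2: data unchanged; the side-set raises the clock,
        # giving the receiver a rising edge to sample on.
        trace.append((cycle, 1, data))
        cycle += 1
    return trace


if __name__ == "__main__":
    for cycle, clk, data in shift_out(0xA5):
        print(f"cycle {cycle:2d}: clk={clk} data={data}")
```

Each bit costs exactly two cycles. Without a side-set equivalent, the clock transitions would need their own instructions and the loop would double in length.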
The 32-instruction limit is a choice, not a shortcoming. A PIO program you can read in two minutes is a PIO program you can verify in two minutes. Raspberry Pi’s pioasm assembler and the C SDK’s PIO support made the development workflow workable. The MicroPython rp2.PIO class brought PIO programming to the scripting audience. The resulting community of programs for PS/2 keyboards, VGA signal generation, 1-Wire, I2S audio, and half a dozen other protocols reflects how much demand existed for this kind of low-overhead, inspectable I/O flexibility.
BIO in the Context of Bao
bunnie’s Precursor used a Xilinx Spartan-7 FPGA to implement a soft RISC-V CPU and all its peripherals in fully auditable Verilog. Every I2C transaction, every SPI byte, was the product of logic that any competent engineer could read and verify. The philosophical argument was clean: closed peripheral silicon is a trust surface, and Precursor eliminated it by making the peripheral logic itself open and reconfigurable.
The FPGA approach is correct in principle but expensive, power-hungry, and difficult to manufacture at scale. Bao is the move toward custom ASIC, where the economics improve but reconfigurability at the Verilog level is gone. BIO is how the Bao platform preserves the philosophical core of the Precursor approach within a fixed-silicon design.
A programmable I/O engine means the peripheral behavior is firmware, not silicon. Users can read the program driving their I2C bus. They can audit it, modify it, and verify that it does exactly what it claims. For a device positioned as a secure, auditable communications platform, this is not a marginal advantage. It is the difference between a platform where you can inspect every layer of the I/O stack and one where you cannot.
The Design Tension Every Programmable I/O Engine Faces
There is a consistent tension in this design space between generality and verifiability. A full RISC core like the PRU can implement almost any protocol, including ones complex enough that reviewing the firmware becomes difficult. A narrow state machine like the RP2040 PIO is easy to reason about but hits its instruction limit on protocols with more than modest complexity.
The PRU approach trades simplicity for expressive power. You can build a fieldbus protocol stack on a PRU. You can also build something whose correctness is not obvious to a reviewer without significant effort. The RP2040 PIO’s constraint forces a kind of legibility that is valuable in itself, though it becomes frustrating when you need to implement a stateful protocol that genuinely requires more than 32 instructions.
Where BIO lands on this spectrum is a meaningful design statement. Given bunnie’s consistent emphasis on trust, verifiability, and the ability for users to understand their hardware at every level, the architecture choices in BIO are best read as a considered position on this tradeoff, not an oversight in either direction.
Programmable I/O as a Statement About Trust
It is easy to treat I/O coprocessors as purely pragmatic: offload the timing-sensitive work, free the main CPU, reduce interrupt overhead. Those are real benefits. In open hardware, they are secondary.
A system where every peripheral is a closed IP block from a vendor’s library is open in name only. The CPU may be auditable; what drives the SPI lines may not be. A programmable I/O engine, with its firmware published and readable, collapses that gap. The peripheral behavior is code, and code can be read, criticized, and improved by anyone with the time and interest to do so.
The lineage from the 6522 VIA through the Intel 8089, TI’s PRU, the RP2040 PIO, and now BIO is partly a technical evolution driven by process improvements and changing protocol demands. It is also a persistent argument, restated in each generation, that the right amount of flexibility in an I/O engine is the amount that keeps the behavior legible. bunnie’s contribution is building that argument into a platform where the stakes around hardware trust are taken seriously from the start.