
bhyve and ZFS: What It Looks Like When a Hypervisor Fits the OS

Source: hackernews

The Dragas post on why FreeBSD still earns genuine affection pulled 500 points on Hacker News last week, and the thread spent predictable time on jails, ZFS, and the unified codebase. One thing that came up less, but deserves more attention, is bhyve: FreeBSD’s built-in hypervisor, which has been part of the base system since FreeBSD 10.0 in 2014 and which illustrates the same design coherence that makes the rest of the system appealing.

How bhyve Is Structured

bhyve is not a monolithic application in the QEMU mold. It is split between a small kernel module, vmm.ko, and a userspace process, bhyve(8), where the division is strict. vmm.ko handles everything that requires hardware virtualization support: VT-x and AMD-V setup, extended page tables (EPT/NPT), VMCS management, and vCPU scheduling. The bhyve(8) userspace process handles device emulation: virtio block and network devices, NVMe, AHCI, USB via xhci, and the console.

This structure means each guest runs as a separate userspace process with its own file descriptors and address space. When a guest crashes or is killed, its bhyve process exits cleanly. Debugging a guest problem means looking at a process, not reasoning about shared kernel state. The kernel module's interface is exposed to userspace through per-VM character devices at /dev/vmm/&lt;vmname&gt;.
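On a running host this split is directly observable: each active guest appears both as a character device and as an ordinary process. A sketch, assuming a guest named debian is already running:

```shell
# One character device per active VM
ls /dev/vmm

# The emulation side is just a process, visible to normal tools
pgrep -lf 'bhyve: debian'

# Per-VM counters come from the kernel side via bhyvectl
bhyvectl --vm=debian --get-stats
```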

# Load the hypervisor module
kldload vmm

# Create a 2-vCPU, 4GB VM from a ZFS-backed disk image
bhyve -c 2 -m 4G -H \
  -s 0,hostbridge \
  -s 1,lpc \
  -s 2,virtio-blk,/dev/zvol/tank/vms/debian \
  -s 3,virtio-net,tap0 \
  -s 31,fbuf,tcp=0.0.0.0:5900 \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  -l com1,stdio \
  debian

The command flags map directly onto the guest's PCI topology: -s 2 places the virtio block device at PCI slot 2. The topology of the VM is explicit in the invocation rather than hidden behind layers of XML or JSON configuration. For operators who want to understand exactly what hardware a guest sees, the bhyve(8) man page is the complete reference, and it describes the running system because it ships in the same source tree as the hypervisor.

ZFS as VM Storage

The combination of bhyve and ZFS is where the system model pays off most clearly. A VM disk is a ZFS zvol: a block device backed by the ZFS storage pool, subject to all the same properties as any other dataset.

# Provision a 40GB zvol for a new VM
zfs create -V 40G -o volblocksize=16K tank/vms/debian

# Before an in-guest OS upgrade, snapshot the zvol
zfs snapshot tank/vms/debian@before-upgrade

# Start the VM against /dev/zvol/tank/vms/debian as usual
# If the upgrade goes badly:
zfs rollback tank/vms/debian@before-upgrade

Cloning a VM for testing takes milliseconds:

zfs clone tank/vms/debian@before-upgrade tank/vms/debian-test
bhyve -c 2 -m 4G -s 2,virtio-blk,/dev/zvol/tank/vms/debian-test ...

The clone shares blocks with the parent until writes diverge, so the storage cost of a clone is zero until the guest modifies its filesystem. This is the same mechanism that makes jails fast to provision when backed by ZFS datasets: zfs clone is O(1) regardless of dataset size.
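When the test run is finished, the clone can be discarded, or kept and detached from its origin; a sketch following on from the names above:

```shell
# Either: throw the test copy away (frees only the diverged blocks)
zfs destroy tank/vms/debian-test

# Or: keep it, and reverse the parent/clone dependency so the
# origin snapshot is no longer pinned by the clone
zfs promote tank/vms/debian-test
```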

ZFS properties also apply directly to VM disks. Setting compression=lz4 on the pool propagates to zvols through normal property inheritance. volblocksize can be chosen per-zvol at creation time to match guest filesystem characteristics, though it cannot be changed afterward. checksum=blake3 (available in recent OpenZFS releases) means data corruption in the VM disk image is detected by the storage layer before the guest sees it, which is not something QEMU’s raw or qcow2 formats on ext4 give you without explicit dm-integrity or similar layering.
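In practice these knobs are ordinary zfs set / zfs get calls; a sketch, where tank/vms/postgres is a hypothetical second guest:

```shell
# Compression set on the parent is inherited by every zvol below it
zfs set compression=lz4 tank/vms

# volblocksize is fixed at creation, so pick it per guest workload
zfs create -V 60G -o volblocksize=64K tank/vms/postgres

# Confirm what a given VM disk actually uses
zfs get compression,volblocksize,checksum tank/vms/debian
```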

vm-bhyve and the Management Layer

Raw bhyve(8) invocations are transparent but verbose. vm-bhyve is the most widely used management layer: written in POSIX shell, BSD-licensed, and designed around the same plain-text, flat-file configuration philosophy as FreeBSD’s rc system.

pkg install vm-bhyve bhyve-firmware

# Enable the service and point it at a ZFS dataset (mounted at /vm here)
sysrc vm_enable="YES"
sysrc vm_dir="zfs:tank/vms"
vm init

# Copy the sample guest templates shipped with the port
cp /usr/local/share/examples/vm-bhyve/* /vm/.templates/

# Create a VM definition
vm create -t debian -s 40G debian01

# Install from ISO
vm install debian01 debian-12.iso

# List running VMs
vm list
NAME     DATASTORE  LOADER  CPU  MEMORY  VNC           AUTO  STATE
debian01 default    uefi    2    4G      0.0.0.0:5900  No    Running (14203)

Each VM is a directory under the datastore containing a configuration file and, if using file-based disk images, the disk images themselves. The configuration format is a simple key-value file:

# /vm/debian01/debian01.conf
loader="uefi"
cpu=2
memory=4G
disk0_type="virtio-blk"
disk0_name="disk0"
network0_type="virtio-net"
network0_switch="public"

This is readable, version-controllable, and requires no running daemon to parse. The vm-bhyve commands translate these into bhyve(8) invocations, wrapping network setup via if_bridge(4) and tap interfaces, console management via nmdm(4) null modem device pairs, and service startup via rc.d.
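Because the format is plain key=value, nothing heavier than POSIX shell is needed to read it back; a minimal sketch, where a temporary file stands in for a real /vm/&lt;name&gt;/&lt;name&gt;.conf:

```shell
#!/bin/sh
set -eu

# Write a sample config mirroring the example above
conf=$(mktemp)
cat > "$conf" <<'EOF'
loader="uefi"
cpu=2
memory=4G
disk0_type="virtio-blk"
EOF

# Extract one key's value, stripping optional surrounding quotes
get_key() {
    sed -n "s/^$1=//p" "$2" | tr -d '"'
}

echo "loader=$(get_key loader "$conf")"
echo "cpu=$(get_key cpu "$conf")"
```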

How bhyve Compares Architecturally with KVM

bhyve and KVM address the same problem but make different tradeoffs in implementation scope.

KVM (Kernel-based Virtual Machine) is a kernel module that exposes /dev/kvm and relies on QEMU for device emulation. QEMU has twenty-plus years of accumulated device support: ISA, PCI, PCIe, USB, dozens of CPU models, machine types for ARM, MIPS, RISC-V, s390x, and others. The surface area of QEMU’s device emulation layer is very large, which is why QEMU CVEs regularly involve vulnerabilities in device emulation code for hardware that no production deployment uses.

bhyve’s device list is shorter by design: virtio-blk, virtio-net, virtio-rng, NVMe, AHCI, xhci, a framebuffer. Guest support covers FreeBSD, Linux, Windows (since FreeBSD 11, booted via UEFI firmware), OpenBSD, and NetBSD. The narrower scope makes the codebase more auditable. FreeBSD security advisories for bhyve have been in specific device emulation paths, but the total CVE count is substantially smaller than QEMU’s history.

The architectural similarity to jails is worth noting. A jail is a process tree with a restricted kernel view: namespaced network stack, limited filesystem access, no escalation path to the host. A bhyve VM is a userspace process with hardware-virtualized memory isolation: the guest OS runs in a hardware-isolated address space, with device I/O mediated by the bhyve process. Both mechanisms trust the kernel for enforcement and expose a minimal interface to the contained environment. Both integrate with ZFS for snapshot-backed storage.

Where bhyve Is Today

FreeBSD 14.2, the current stable point release, ships bhyve with improved NVMe controller emulation, better PCIe topology support, and more stable PCI passthrough of host devices via the ppt passthrough driver. PCI passthrough lets a VM receive direct, exclusive access to a host NIC or GPU without device emulation overhead, useful for network appliance VMs that need line-rate performance without virtio in the path.
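Configuring passthrough is a two-step affair: reserve the device at boot, then hand it to a guest. A sketch, where 2/0/0 is an example bus/slot/function found via pciconf -lv:

```shell
# In /boot/loader.conf, claim the device for the passthrough driver:
#   pptdevs="2/0/0"

# Then at launch: wire guest memory (-S, required for passthrough)
# and map the device into guest PCI slot 4
bhyve -c 2 -m 4G -S \
  -s 0,hostbridge \
  -s 4,passthru,2/0/0 \
  ... debian
```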

TPM 2.0 passthrough arrived in FreeBSD 14.x, which matters for Windows 11 guests and for production workflows that depend on attestation. UEFI support via the bhyve-firmware port ships OVMF for x86_64 guests. ARM64 bhyve support exists but is still maturing and not yet suitable for production use on aarch64 hardware.

TrueNAS CORE, iXsystems’ storage appliance software, uses bhyve for its built-in VM hosting, with storage backed by the same ZFS pool that serves NFS and SMB shares. This is a natural fit: a storage-primary OS where VM disks are first-class ZFS datasets, subject to the same replication, snapshots, and scrub jobs as everything else on the system.

The Operational Shape of the Thing

What makes bhyve interesting in the context of the broader FreeBSD discussion is not that it outcompetes KVM/QEMU for feature breadth, because it does not. What it demonstrates is that the same design values visible in jails, DTrace, and the unified codebase shape virtualization in FreeBSD too.

A hypervisor where each guest is a process, where disk images are ZFS datasets, where management tooling is POSIX shell that generates bhyve(8) invocations, and where the man pages describe the installed system accurately, is a coherent operational experience. The pieces fit together because they come from the same source tree and are maintained by people who understand the whole.

That coherence is the thing the Hacker News thread kept returning to, in different forms. bhyve is one more example of it in an area where the Linux ecosystem, despite its considerable raw capability, shows the accumulated cost of assembling independent projects around a shared kernel.
