Building All the Way Down: The Guix Full-Source Bootstrap

Source: lobsters

Every Linux system you have ever run started from a binary you did not compile. Someone handed you a pre-built compiler, and you used it to build everything else. That compiler came from another compiler, which came from another, and at some point the chain dissolves into a blob you simply have to trust.

In 1984, Ken Thompson demonstrated in "Reflections on Trusting Trust" that you cannot verify a compiler by reading its source if the compiler itself was compiled by something compromised. A compromised compiler can reinsert its backdoor into every compiler it builds, so the attack is self-sustaining and invisible to any source code audit. The most direct defense is to minimize and inspect the binary seed you start from.

The GNU Guix project has now done that, end to end. Their full-source bootstrap traces every binary in a complete Guix system back through a chain of auditable source to a seed of 357 bytes.

How the chain actually works

The path starts with hex0, a minimal hex assembler. Its binary is small enough to inspect by hand; the source is a human-readable hex encoding of machine instructions. From there the chain climbs:

  • hex0, hex1, hex2: progressively more capable assemblers, each built from the previous
  • M2-Planet: a C-like language compiler, built using those assemblers
  • mescc-tools and GNU Mes: assembler and linker tooling, plus a Scheme interpreter paired with a C compiler written in Scheme, built via M2-Planet
  • TinyCC: a small, fast C compiler, built using Mes
  • From TinyCC, the rest of the GCC bootstrap and the full system
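To make the first rung concrete, here is a minimal Python sketch of what a hex0-style assembler does: it reads pairs of hex digits, skips whitespace and comments, and writes the corresponding bytes. This is a toy model, not the real hex0. The function name hex0_assemble is hypothetical, and treating both # and ; as comment markers is an assumption about the input format.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

def hex0_assemble(text: str) -> bytes:
    """Convert annotated hex text into raw bytes (toy model of a hex assembler)."""
    out = bytearray()
    pending = None  # high nibble waiting for its low-nibble partner
    for line in text.splitlines():
        for ch in line:
            if ch in "#;":
                # Assumed comment markers: ignore the rest of the line.
                break
            if ch in HEX_DIGITS:
                nibble = int(ch, 16)
                if pending is None:
                    pending = nibble
                else:
                    out.append((pending << 4) | nibble)
                    pending = None
            # Any other character (spaces, tabs) is simply skipped.
    if pending is not None:
        raise ValueError("odd number of hex digits in input")
    return bytes(out)

# Illustrative input: the four magic bytes that open an ELF file,
# annotated the way a human-readable hex source might be.
sample = """
7F 45 4C 46   # ELF magic: 0x7F 'E' 'L' 'F'
01 01 01      ; hypothetical annotation
"""

if __name__ == "__main__":
    print(hex0_assemble(sample))
```

The point of the sketch is how little logic there is: a program this simple can be checked byte by byte against its annotated source, which is what makes the seed auditable.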

The stage0 project by Jeremiah Orians provided the early rungs of this ladder. GNU Mes, maintained by Jan Nieuwenhuizen, bridges the gap between that minimal foundation and a working C toolchain. Together with the broader Bootstrappable Builds effort, these pieces fit into a complete, verifiable chain.

Why this matters beyond Guix

Most distributions do not think about this at all. You install a distro, you trust the package maintainers, you trust whoever compiled their bootstrap binaries, and you move on. That is a reasonable pragmatic choice for most people most of the time.

But for anyone building software that needs to be auditable, whether for security, regulatory, or reproducibility reasons, the ability to point at a 357-byte seed and say “everything above this came from source code you can read” is genuinely meaningful. It changes the threat model. The attack surface for a Thompson-style trusting-trust compromise shrinks to something small enough to verify by hand.
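The claim that everything traces back to one seed can be pictured as a walk over a "built by" relation. The Python sketch below is illustrative only: the BUILT_BY table is a simplified, hypothetical rendering of the stages named in this article, not Guix's actual package graph, and trace_to_seed is an invented helper.

```python
# Each tool maps to the tool that built it; None marks the hand-audited seed.
# Simplified, hypothetical table based on the stages listed in the article.
BUILT_BY = {
    "hex0": None,
    "hex1": "hex0",
    "hex2": "hex1",
    "M2-Planet": "hex2",
    "GNU Mes": "M2-Planet",
    "TinyCC": "GNU Mes",
    "GCC": "TinyCC",
}

def trace_to_seed(tool: str, built_by: dict = BUILT_BY) -> list:
    """Return the chain from `tool` down to the seed, or raise if it breaks."""
    path = []
    current = tool
    while current is not None:
        if current not in built_by:
            # A builder with no recorded origin is an opaque binary.
            raise LookupError(f"{current!r} has no recorded builder")
        if current in path:
            # A cycle means the tool ultimately depends on itself.
            raise ValueError(f"cycle at {current!r}: no seed reachable")
        path.append(current)
        current = built_by[current]
    return path

if __name__ == "__main__":
    print(" <- ".join(trace_to_seed("GCC")))
```

A chain that hits a missing entry or loops back on itself is exactly the situation the full-source bootstrap is meant to rule out: a binary whose origin cannot be explained by readable source.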

It also required years of careful, unglamorous work across multiple projects and contributors. The Guix team did not build this chain alone; they assembled existing pieces and filled in the gaps. The result is a property that no other general-purpose Linux distribution can currently claim.

For anyone interested in reproducible builds, supply chain security, or just the deep plumbing of how software systems come to exist, the full writeup is worth reading carefully.