
The Measurement Gap: Why a Valid TPM Quote Doesn't Prove a Clean Boot

Source: lobsters

When security teams talk about boot integrity, the conversation usually lands on two things: Secure Boot is enabled, and the server has a TPM. Both are true on most modern hardware. Neither one, nor both together, gives you the guarantee most people think it does. This is the argument at the center of an examination of boot verification by unmitigated risk, and it is worth unpacking precisely because the gap is architectural, not something a firmware update will close.

Two Mechanisms With Different Jobs

Secure Boot and Measured Boot are often discussed as if they were two names for the same thing. They are not, and the distinction matters more than almost anything else in this space.

Secure Boot is an enforcement mechanism. The UEFI firmware checks cryptographic signatures on each boot component against a database of trusted keys (the db and dbx variables stored in NVRAM) before executing anything. If the signature is invalid or the binary is on the revocation list, execution stops. Secure Boot does not record what ran. It only enforces what is allowed to run.

Measured Boot is an audit mechanism. It does not block anything. At each stage of the boot sequence, the current component hashes the next one and extends that hash into a TPM Platform Configuration Register (PCR) before executing it. The operation is:

PCR[n] = Hash(PCR[n] || new_measurement)

PCRs are extend-only. You cannot write a value directly. The final PCR state is a hash chain representing the ordered sequence of everything measured, and a change anywhere in that sequence produces a completely different final value. TPM remote attestation then lets a verifier request a signed quote of those PCR values, with a nonce to ensure freshness. If you have a valid quote and you know the expected values, you have cryptographic evidence of what the boot chain measured.
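The extend operation above is small enough to simulate. The following is a minimal sketch, assuming a SHA-256 PCR bank and a PCR that starts as all zeros after reset; the component names are invented for illustration, and a real TPM performs the extend internally rather than in host software.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR bank


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Return the new PCR value: Hash(old PCR || measurement digest)."""
    return hashlib.sha256(pcr + measurement).digest()


def replay(components: list[bytes]) -> bytes:
    """Replay an ordered list of component images into a fresh PCR."""
    pcr = bytes(PCR_SIZE)  # assumption: this PCR resets to all zeros
    for image in components:
        digest = hashlib.sha256(image).digest()  # measure the component
        pcr = extend(pcr, digest)                # fold it into the chain
    return pcr


# Hypothetical boot chain, for illustration only.
boot_chain = [b"firmware-v1", b"bootloader-v1", b"kernel-v1"]

clean = replay(boot_chain)
reordered = replay([boot_chain[1], boot_chain[0], boot_chain[2]])
tampered = replay([b"firmware-v1", b"bootloader-evil", b"kernel-v1"])

print(clean.hex())
print(clean != reordered)  # True: same components, different order
print(clean != tampered)   # True: one component changed
```

Because each new value depends on the previous one, a verifier holding the ordered list of measurements can recompute the final PCR and compare it with the quoted value.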

The problem lives in the space between “what the boot chain measured” and “what the boot chain executed.”

What PCRs Actually Cover

The TCG PC Client Platform Firmware Profile specifies which components should be measured into which PCRs during UEFI boot. The standard allocation:

PCR    Content
0      UEFI firmware executable (the BIOS/UEFI code itself)
1      UEFI firmware configuration (NVRAM variables)
2      Option ROM code
3      Option ROM configuration data
4      Initial Program Loader (bootloader EFI stub or MBR)
5      Boot configuration data
7      Secure Boot policy: whether it is enabled, enrolled keys
8–9    OS kernel and initrd (measured by the bootloader)
10     IMA (Linux Integrity Measurement Architecture)

PCR 0 covers the UEFI firmware volume. That sounds comprehensive. It is not. PCR 0 does not cover the Intel Management Engine (ME) or AMD Platform Security Processor (PSP) firmware, which run before and alongside the UEFI code with their own execution environments and network access on some platforms. It does not cover the Embedded Controller firmware. It does not cover NIC or HBA firmware beyond what is explicitly measured as an option ROM, and measurement of option ROMs depends on the firmware implementation doing so correctly. It does not cover BMC (Baseboard Management Controller) firmware, which has full out-of-band system access.

PCR 7 records the Secure Boot policy state, including whether Secure Boot is enabled and what keys are enrolled. It does not record whether the enforcement logic in the firmware was implemented correctly, only that the policy database had a particular value at boot time.

An attacker who compromises Management Engine firmware, or implants code in a NIC’s option ROM, or modifies the BMC, may leave all PCR values entirely unchanged. The TPM quote will be valid. The verifier will see expected values. The system is compromised.
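The gap can be shown with a toy model. This is a sketch under stated assumptions: the platform is a dictionary of invented component images, everything in it is assumed to execute, and only the names listed in a hypothetical MEASURED list are ever extended into the PCR.

```python
import hashlib


def pcr_after_boot(measured_images: list[bytes]) -> str:
    """Extend each measured image into a fresh SHA-256 PCR; return hex."""
    pcr = bytes(32)
    for image in measured_images:
        pcr = hashlib.sha256(pcr + hashlib.sha256(image).digest()).digest()
    return pcr.hex()


def boot(platform: dict[str, bytes], measured_names: list[str]) -> str:
    """Everything in `platform` runs; only `measured_names` reach the PCR."""
    return pcr_after_boot([platform[name] for name in measured_names])


# Hypothetical measurement coverage: the BMC is never measured.
MEASURED = ["uefi", "bootloader", "kernel"]

clean = {
    "uefi": b"uefi-1.0",
    "bootloader": b"loader-2.0",
    "kernel": b"kernel-6.0",
    "bmc": b"bmc-3.1",
}
implanted = dict(clean, bmc=b"bmc-3.1+implant")           # unmeasured change
bootkit = dict(clean, bootloader=b"loader-2.0+bootkit")   # measured change

print(boot(clean, MEASURED) == boot(implanted, MEASURED))  # True
print(boot(clean, MEASURED) == boot(bootkit, MEASURED))    # False
```

The implanted platform produces the same PCR value as the clean one, so any quote over that PCR is identical for both.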

The Reference Values Problem

Even setting aside measurement coverage gaps, TPM remote attestation requires a known-good baseline to compare against. A TPM quote contains PCR values. To verify them, you need expected values for a clean boot of this specific hardware, with this specific firmware version, this specific Secure Boot key enrollment, and this specific boot configuration.

No universal reference database exists. Maintaining accurate baselines for a heterogeneous fleet is a continuous operational burden. Every firmware update changes PCR values. Every Secure Boot key rotation changes PCR 7. Every change to NVRAM variables changes PCR 1. Organizations that implement TPM attestation often end up with large, frequently stale reference sets, or they verify only a subset of PCRs, widening the gap further.
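A verifier's comparison step might look like the sketch below. It assumes the quote signature and nonce have already been checked elsewhere, and that PCR values are available as a mapping from index to digest. The helper names and the fake PCR values are illustrative, not taken from any attestation library.

```python
import hashlib


def fake_pcr(label: str) -> bytes:
    """Stand-in for a real PCR value, derived from a readable label."""
    return hashlib.sha256(label.encode()).digest()


def mismatched_pcrs(
    quoted: dict[int, bytes],
    reference: dict[int, bytes],
    selection: list[int],
) -> list[int]:
    """Return the selected PCR indices whose quoted value differs from baseline."""
    return [i for i in sorted(selection) if quoted[i] != reference[i]]


# Baseline recorded before a firmware update (hypothetical values).
reference = {
    0: fake_pcr("firmware-1.0"),
    4: fake_pcr("loader-A"),
    7: fake_pcr("secure-boot-policy-A"),
}

# The same machine after a firmware update: only PCR 0 changes.
quoted = dict(reference)
quoted[0] = fake_pcr("firmware-1.1")

print(mismatched_pcrs(quoted, reference, [0, 4, 7]))  # [0]
print(mismatched_pcrs(quoted, reference, [4, 7]))     # []
```

The second call passes only because PCR 0 was left out of the selection. A stale baseline and a real firmware implant look the same to this check, which is why narrowing the selection to quiet false alarms also hides genuine changes.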

The IETF RATS working group (RFC 9334) provides a conceptual framework for remote attestation that acknowledges this: the endorsement, the reference values, and the attestation result are three separate concerns that must all be handled correctly for attestation to mean anything. In practice, most deployments get one or two of those right.

Why Cloud VMs Make This Harder

In a standard cloud IaaS deployment, a customer VM receives a virtual TPM emulated in software by the hypervisor. The hypervisor writes measurements into the vTPM’s PCRs. The customer’s OS can request a TPM quote, receive one, and verify that the quote is signed by a valid key. What the customer cannot verify is whether the hypervisor measured things accurately, whether the UEFI firmware provided to the VM (typically OVMF or a vendor-supplied variant) was the version claimed, or whether the physical hardware the hypervisor runs on booted cleanly.

The physical hardware TPM is accessible to the hypervisor for its own measurements. The customer has no path to request a quote from it. The cloud provider’s hardware attestation chain exists and, in well-run environments, is used internally. It is not exposed to customers in standard IaaS offerings. The customer’s workload sits two trust boundaries away from the hardware root of trust with no cryptographic link between them.

Confidential computing architectures, specifically AMD SEV-SNP and Intel TDX, address part of this. Under SEV-SNP, the AMD Secure Processor measures the initial VM memory image at launch and produces an attestation report signed by the AMD Versioned Chip Endorsement Key (VCEK), which chains to AMD’s root CA. The hypervisor cannot forge or modify this report. The customer can verify the signature against AMD’s PKI and confirm that their VM’s initial memory image matched what they expected, independent of the cloud provider’s assertions.

This is a meaningful improvement. It is also incomplete. The SEV-SNP attestation covers the initial memory image at launch, not the complete subsequent boot sequence. None of what follows is in the SEV-SNP report: what the VM’s OVMF firmware does after that initial measurement, which OS components it loads, and whether signature verification is correctly enforced throughout. The attestation establishes the starting point. The rest of the boot chain is still subject to the same gaps as bare metal.

When the Cryptography Is Fine but the Key Infrastructure Isn’t

Firmware supply chain incidents from 2023 and 2024 demonstrated a different class of failure: the measurement and signature mechanisms worked exactly as designed, and the systems were still compromised or compromisable.

In May 2023, the Money Message ransomware group published data stolen from MSI, including Intel Boot Guard private keys for approximately 200 MSI product lines. Boot Guard is Intel’s hardware-anchored root of trust for UEFI firmware: a hash of the OEM’s public signing key is fused into the hardware at manufacturing time and is used to verify the Initial Boot Block. It operates below the UEFI layer and verifies firmware before UEFI runs. Once the signing keys are exposed, an attacker can sign modified firmware that Boot Guard will accept as legitimate. The critical constraint: because the Boot Guard key hashes are fused at manufacture, they cannot be rotated. The affected hardware is permanently unable to distinguish attacker-signed firmware from vendor-signed firmware.

The PKfail disclosure in July 2024 identified approximately 900 device models from major OEMs shipping with the same AMI test Platform Key in production firmware. The Platform Key sits at the top of the Secure Boot key hierarchy; compromise of the PK allows enrollment of any signing key, which allows signing any bootloader. The AMI “DO NOT SHIP” test key had been present in production devices from Acer, Dell, Gigabyte, Intel, MSI, and Supermicro for years.

The BlackLotus bootkit, which exploits CVE-2022-21894 and was publicly analyzed by ESET researchers in 2023, demonstrated a different key management failure. Microsoft had patched the underlying Secure Boot bypass vulnerability but had not added the vulnerable bootloaders to the UEFI dbx revocation list. Since the old bootloaders were still signed with valid Microsoft keys and not revoked, BlackLotus could use them to bypass Secure Boot on fully patched Windows 11 systems. The vulnerability was known, the patch existed, and the revocation list lagged far enough behind that exploitation remained practical in the wild. Maintaining the dbx is a coordination problem involving CPU vendors, firmware vendors, OS vendors, and OEMs, and in practice it moves slowly.
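The revocation lag can be modeled in a few lines. This sketch is a deliberate simplification: signature verification is reduced to a signer name found in db, and dbx holds plain SHA-256 hashes of whole images. Real firmware verifies Authenticode signatures, and its db and dbx can hold certificates as well as hashes. The class and the loader contents are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SecureBootPolicy:
    db: set[str] = field(default_factory=set)   # trusted signer names (simplified)
    dbx: set[str] = field(default_factory=set)  # revoked image hashes, hex

    def allows(self, image: bytes, signer: str) -> bool:
        """Revocation is checked first; otherwise the signer must be trusted."""
        if hashlib.sha256(image).hexdigest() in self.dbx:
            return False
        return signer in self.db


policy = SecureBootPolicy(db={"vendor-ca"})

# Hypothetical images: the old loader has a known bypass, the new one is patched.
old_loader = b"bootloader with known bypass"
new_loader = b"bootloader patched"

# A patch exists, but the old binary is still validly signed and not revoked.
before_revocation = policy.allows(old_loader, "vendor-ca")

# Only a dbx update actually stops the old binary from running.
policy.dbx.add(hashlib.sha256(old_loader).hexdigest())
after_revocation = policy.allows(old_loader, "vendor-ca")

print(before_revocation)  # True
print(after_revocation)   # False
print(policy.allows(new_loader, "vendor-ca"))  # True
```

Shipping the patched loader changes nothing about what the policy accepts. The window between the patch and the dbx entry is the period in which the old, vulnerable binary still boots.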

Each of these incidents exploited the space between the cryptographic model and the operational reality around it. The signature verification worked. The TPM would have produced valid quotes. The compromised state was not reflected in any measurement.

What Practitioners Can Actually Do

The picture here is not that firmware security is hopeless. It is that the guarantees are narrower than the marketing suggests, and knowing precisely where the edges are determines whether attestation is a meaningful control or an expensive checkbox.

For bare metal infrastructure, combining Secure Boot enforcement with Measured Boot logging and regular TPM attestation against maintained reference baselines covers the UEFI boot chain reasonably well, while accepting that ME/PSP firmware, BMC firmware, and option ROMs are largely outside that coverage. Platforms with Intel Boot Guard or AMD Platform Secure Boot extend coverage downward into the pre-UEFI layer, assuming OEM key management is sound.

For cloud workloads where boot integrity matters, confidential computing instances with SEV-SNP or TDX attestation provide hardware-rooted attestation that the cloud provider cannot forge, which is a substantial improvement over standard vTPM. The remaining gap, what happens after the initial measurement, requires additional controls inside the VM.

For supply chain risk, the key management layer deserves as much attention as the cryptographic mechanisms. The Boot Guard key leak and PKfail both demonstrate that the cryptography itself was functioning correctly. What failed was the process around key generation, storage, and revocation. Auditing whether devices have moved from test to production key configurations, tracking dbx currency relative to known-bad binaries, and monitoring for new firmware signing key disclosures are operational practices that sit alongside the technical attestation chain.
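One of those audits, checking for a test Platform Key, can be approximated by scanning the raw PK variable for the warning strings that test certificates carry. This is a rough sketch and not a substitute for a proper certificate parse. It assumes a Linux host that exposes UEFI variables through efivarfs. The marker list is an assumption: "DO NOT SHIP" comes from the text above and "DO NOT TRUST" is a second commonly reported variant.

```python
from pathlib import Path

# PK under the EFI global variable GUID, as exposed by Linux efivarfs.
PK_PATH = Path("/sys/firmware/efi/efivars/PK-8be4df61-93ca-11d2-aa0d-00e098032b8c")

# Assumed marker strings; extend this tuple as new test keys are disclosed.
TEST_KEY_MARKERS = ("DO NOT TRUST", "DO NOT SHIP")


def find_test_key_markers(pk_blob: bytes) -> list[str]:
    """Return the markers present in a raw PK blob, in ASCII or UTF-16 form."""
    hits = []
    for marker in TEST_KEY_MARKERS:
        encodings = (
            marker.encode("ascii"),
            marker.encode("utf-16-be"),
            marker.encode("utf-16-le"),
        )
        if any(encoded in pk_blob for encoded in encodings):
            hits.append(marker)
    return hits


def scan_local_pk() -> str:
    """Scan this machine's PK variable, if it is readable."""
    try:
        blob = PK_PATH.read_bytes()
    except OSError:
        return "PK variable not readable on this host"
    hits = find_test_key_markers(blob)
    return f"test-key markers found: {hits}" if hits else "no test-key markers found"


if __name__ == "__main__":
    print(scan_local_pk())
```

A clean result only means the known markers are absent. It says nothing about whether the enrolled key's private half has been handled properly.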

The measurement gap is not a flaw waiting for a fix. It is a structural property of how the boot chain was assembled over decades of layered standards and independent firmware ecosystems. Attestation covers what it covers. The rest requires separate controls and the discipline to maintain them.
