The Trust Deficit Behind Your Approval Chain

Apenwarr published a post this week arguing that every layer of review added to a software process makes the team roughly ten times slower. The piece hit 510 points on Hacker News and opened a 295-comment thread split between people who have lived this problem and people defending review culture. Both camps have valid points, but the framing that makes the claim useful is about what review is, structurally: a queue.

Queues compound

When you add a review step to a workflow, you are not adding a fixed delay. You are adding a queue. A reviewer has their own inbox, their own priorities, their own context-switching overhead. The work sits in that queue until they get to it. In M/M/1 queuing terms, the average time a request spends in a single-server system, waiting plus being served, is:

W = 1 / (μ - λ)

where μ is the reviewer’s service rate and λ is the arrival rate of work competing for their attention. Utilization is ρ = λ / μ, and as λ approaches μ, W grows without bound. At 80% utilization, W is five times the raw review time 1/μ; at 90% it is ten times. A busy reviewer will make your work wait far longer than their raw review time suggests.
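To put numbers on the formula, here is a minimal Python sketch. The service rate of two reviews per hour is a hypothetical figure, and the M/M/1 assumptions (random arrivals, exponentially distributed service times, a single reviewer) are a simplification of any real inbox.

```python
def mm1_time_in_system(service_rate: float, arrival_rate: float) -> float:
    """Average time in an M/M/1 system (waiting plus service): W = 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


# Hypothetical reviewer who could clear 2 reviews per hour with nothing else to do.
MU = 2.0
RAW_REVIEW_HOURS = 1.0 / MU

for utilization in (0.5, 0.8, 0.9, 0.95):
    arrival_rate = utilization * MU
    hours = mm1_time_in_system(MU, arrival_rate)
    print(
        f"utilization {utilization:.0%}: {hours:.1f} h in system "
        f"({hours / RAW_REVIEW_HOURS:.0f}x the raw review time)"
    )
```

The raw review time never changes in this sketch; only the reviewer's load does, and that alone moves the result by an order of magnitude.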

Add a second reviewer in series and you have two queues, each with its own utilization. The expected waits stack, and each one is already inflated by how busy that reviewer is. This is why the “10x per layer” framing, while not derived from a formal model, points at something real. Queuing systems behave non-linearly. A team where every PR requires two reviewers, a security sign-off, and a product manager approval has four gates in series, and it is not simply four times slower than a team with one reviewer. It is operating in a fundamentally different regime.
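In the simplest textbook model, where each gate is an independent M/M/1 queue fed at the same rate, the expected times per gate add up, and the total is dominated by the busiest gates. The sketch below works under that assumption. The arrival rate and per-gate service rates are hypothetical, chosen only to show the shape.

```python
def chain_time_hours(arrival_rate: float, service_rates: list[float]) -> float:
    """Expected time to clear gates in series: sum of 1 / (mu_i - lambda) per gate."""
    total = 0.0
    for mu in service_rates:
        if arrival_rate >= mu:
            raise ValueError("unstable gate: arrival rate must be below its service rate")
        total += 1.0 / (mu - arrival_rate)
    return total


ARRIVAL_RATE = 0.9  # hypothetical: PRs per hour reaching every gate

one_reviewer = chain_time_hours(ARRIVAL_RATE, [2.0])
# Two reviewers, a security sign-off, and a product approval, each busier than the last.
four_gates = chain_time_hours(ARRIVAL_RATE, [2.0, 1.2, 1.0, 0.95])

print(f"one lightly loaded reviewer: {one_reviewer:.1f} h")
print(f"four gates in series:        {four_gates:.1f} h")
print(f"ratio:                       {four_gates / one_reviewer:.0f}x")
```

Under these made-up rates, four gates cost far more than four times one gate, because the scarce approvers run close to saturation. Real chains also loop back for re-review, which this model ignores.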

The Accelerate research by Nicole Forsgren, Jez Humble, and Gene Kim measured this empirically across thousands of organizations. Elite performers had lead times for changes under one hour. Low performers measured lead times in weeks or months. The differences tracked back to specific practices: trunk-based development, automated testing, continuous deployment, and notably, the structure of code review. High performers reviewed fast and in small batches. Low performers accumulated review debt.

Where the 10x estimate comes from

The “10x per layer” number sounds like rhetorical amplification, but there is a path to it from plausible assumptions. If a reviewer turns around reviews within their own work day when unblocked, but has to finish their current focus block first, the average wait is something like two to four hours for a PR submitted at a random point in the day. Add a second reviewer in series and you are waiting for two independent schedules to align. Add an approval from someone in a different timezone and you are looking at a cycle that spans calendar days.

If your baseline cycle time without review is one hour of development work, and each review layer adds an expected wait of several hours, a three-layer process costs more than the sum of those waits. The summed waits are only the floor on wall-clock time; on top of them, you have context-switched away by the time each review returns. Re-engaging with a PR two days later, incorporating feedback, and re-queuing for re-review carries a real cost that is hard to measure in aggregate. The “10x” figure is loose, but it captures the right shape.
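The arithmetic can be sketched in a few lines of Python. Every number here is an assumption for illustration: a three-hour wait per local layer, a sixteen-hour wait for a cross-timezone approval, and a 30% chance that a pass through the chain ends in a request for changes and a full re-queue. The sketch also ignores the time spent making those changes and the cost of context switching.

```python
def expected_elapsed_hours(dev_hours, layer_wait_hours, rework_prob=0.0):
    """Expected wall-clock hours for one change to clear every review layer.

    Each pass through the chain costs the sum of the layer waits. A pass is
    bounced with probability rework_prob and must be repeated in full, so the
    number of passes is geometric with mean 1 / (1 - rework_prob).
    """
    if not 0.0 <= rework_prob < 1.0:
        raise ValueError("rework_prob must be in [0, 1)")
    expected_passes = 1.0 / (1.0 - rework_prob)
    return dev_hours + sum(layer_wait_hours) * expected_passes


DEV_HOURS = 1.0  # baseline from the text: one hour of development work

scenarios = [
    ("no review", [], 0.0),
    ("one reviewer", [3.0], 0.0),
    ("three local layers", [3.0, 3.0, 3.0], 0.0),
    ("two local + one remote", [3.0, 3.0, 16.0], 0.0),
    ("two local + one remote, 30% rework", [3.0, 3.0, 16.0], 0.3),
]

for label, waits, rework in scenarios:
    hours = expected_elapsed_hours(DEV_HOURS, waits, rework)
    print(f"{label:36s} {hours:5.1f} h  ({hours / DEV_HOURS:.0f}x baseline)")
```

With these inputs the multiplier reaches ten by the third layer rather than at each layer, which fits the reading that the figure is loose but directionally right.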

Lean manufacturing had this figured out first

The Toyota Production System drew a sharp distinction between flow efficiency and resource efficiency. Resource efficiency optimizes for keeping everyone busy. Flow efficiency optimizes for moving work through the system quickly. These goals conflict. A review process that keeps reviewers occupied is terrible for flow. Work queues up waiting for capacity. The system is locally efficient and globally slow.

This is why the DORA metrics treat lead time as a primary indicator of team health rather than utilization. Lead time captures the queue dynamics. Utilization metrics obscure them.
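Both measures are cheap to compute for a single change. The sketch below uses the common lean definition of flow efficiency, hands-on time divided by total lead time; the timestamps and the three hours of hands-on work are invented for illustration.

```python
from datetime import datetime, timedelta


def lead_time(first_commit: datetime, deployed: datetime) -> timedelta:
    """Elapsed time from first commit to running in production."""
    return deployed - first_commit


def flow_efficiency(hands_on: timedelta, total: timedelta) -> float:
    """Share of the lead time during which someone was actually working on the change."""
    if total <= timedelta(0):
        raise ValueError("lead time must be positive")
    return hands_on / total  # timedelta / timedelta yields a float


# Hypothetical change: committed Monday morning, deployed Wednesday afternoon.
first_commit = datetime(2024, 3, 4, 9, 0)
deployed = datetime(2024, 3, 6, 15, 0)
hands_on = timedelta(hours=3)  # coding plus the reviewers' reading time

total = lead_time(first_commit, deployed)
print(f"lead time:       {total.total_seconds() / 3600:.0f} h")
print(f"flow efficiency: {flow_efficiency(hands_on, total):.1%}")
```

Everyone involved can be fully busy for the whole period while the change itself spends nearly all of its life waiting, which is the gap utilization metrics hide.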

Toyota restructured inspection rather than eliminating it: small batch sizes and immediate feedback loops replaced end-of-run batch inspection. Defects became visible as they occurred, and the feedback loop compressed enough to be actionable. The software equivalent is continuous integration with fast automated tests, pair programming for synchronous review, and feature flags that let you deploy without a manual approval chain waiting at the end.

The bystander effect in your PR queue

Research on code review dynamics at Microsoft found that as the number of required reviewers increases, individual reviewer engagement tends to decrease. Each reviewer assumes someone else will catch the problem. More required approvals therefore mean more queues and more diffused accountability. This is the specific failure mode of approval chains: the safety guarantee weakens at the same time that the latency penalty compounds.

The trust deficit underneath

Review culture often exists because of a trust deficit, and naming that directly is more useful than debating review in the abstract. When every PR needs three approvals, it frequently means the organization does not trust individual engineers to make sound decisions independently. Sometimes that distrust is appropriate: security reviews for cryptographic code, legal review for terms changes, compliance sign-off for regulated industries. Those cases are real and the overhead is justified.

The problem emerges when the approval architecture built for high-stakes changes gets applied uniformly to all changes. The tax is the same regardless of risk. A one-line bug fix sits in the same queue as a data model migration affecting production. The overhead that makes sense for one category creates compounding friction for everything else.

High-performing teams have built systems that make risk-based distinctions automatically. Automated guardrails handle the routine safety checks, fast linting and test pipelines catch the obvious errors, and human review is reserved for decisions that genuinely require human judgment: architecture choices, security boundaries, significant behavioral changes. Google’s engineering practices document describes a rigorous review culture with explicit expectations about turnaround time and with specialized reviewers who know their domain well enough to turn reviews around efficiently. The rigor is real; the queue is managed.
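A risk-based distinction of this kind can be expressed as a small routing function. The sketch below is one hypothetical policy, not a description of any particular team's tooling: the path prefixes, the 20-line threshold, and the gate names are all invented for illustration.

```python
from dataclasses import dataclass

# Hypothetical policy values; a real team would tune these to its own codebase.
HIGH_RISK_PREFIXES = ("crypto/", "auth/", "migrations/")
SMALL_CHANGE_LINES = 20


@dataclass(frozen=True)
class Change:
    paths: tuple[str, ...]
    lines_changed: int


def required_gates(change: Change) -> list[str]:
    """Return the gates a change must clear, cheapest first."""
    gates = ["automated-checks"]  # lint and tests run on every change
    touches_high_risk = any(p.startswith(HIGH_RISK_PREFIXES) for p in change.paths)
    if touches_high_risk:
        # High-stakes areas get the full chain regardless of diff size.
        gates += ["peer-review", "domain-owner-review"]
    elif change.lines_changed > SMALL_CHANGE_LINES:
        gates.append("peer-review")
    return gates


print(required_gates(Change(("docs/README.md",), 1)))
print(required_gates(Change(("api/handlers.py",), 150)))
print(required_gates(Change(("migrations/0042_users.py",), 5)))
```

The point of the design is that the expensive gates are opt-in by risk, so a one-line fix and a production migration no longer share a queue.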

What the alternatives look like concretely

Pair programming eliminates the review queue entirely. The review happens synchronously during development and the feedback loop is immediate. It is well-suited for high-complexity changes where review friction is highest, and it often reduces total time spent while improving quality because misunderstandings surface before they are baked into a large diff.

Trunk-based development with short-lived branches compresses the batching problem. Instead of a large PR that requires significant reviewer time and context reconstruction, changes land in small increments reviewable in minutes. The queue wait shortens because the service time shortens. Merge conflicts are themselves a form of queuing overhead, the result of two batches that each waited too long before integrating, and they become rare.

Feature flags decouple deployment from review. Code can merge and deploy without being active, separating the approval process for “is this ready for users” from the technical merge process. This removes one of the most common justifications for blocking review chains: the concern that merged code will immediately affect production before it has been signed off.
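In code, the decoupling is just a conditional around the new path. This is a minimal sketch with an in-memory flag mapping; the flag name, the discount logic, and the function names are hypothetical, and production systems typically read flags from configuration or a dedicated flag service rather than a dict.

```python
from typing import Mapping, Sequence


def is_enabled(flag: str, flags: Mapping[str, bool]) -> bool:
    """Unknown flags default to off, so merged code stays dark until someone opts in."""
    return flags.get(flag, False)


def order_total(prices: Sequence[float], flags: Mapping[str, bool]) -> float:
    subtotal = sum(prices)
    if is_enabled("loyalty_discount", flags):
        # New behaviour: merged and deployed, but inactive until the flag flips.
        return round(subtotal * 0.9, 2)
    # Existing behaviour, unchanged for every user while the flag is off.
    return round(subtotal, 2)


production_flags = {"loyalty_discount": False}
print(order_total([10.0, 5.0], production_flags))
print(order_total([10.0, 5.0], {"loyalty_discount": True}))
```

The merge can go through a fast technical review, while the decision to turn the flag on gets whatever sign-off the change in user-facing behaviour actually warrants.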

None of these ideas is novel. The DevOps movement has been arguing for them for fifteen years. Apenwarr’s “10x per layer” framing earns its place in that conversation by making the cost legible. Each review layer added through organizational habit rather than deliberate risk calibration carries a compounding cost that most teams have never tried to measure, and the math suggests they would be uncomfortable if they did.