
Why the Second Reviewer Costs More Than the First

Source: hackernews

The intuitive model of code review overhead is additive. Each reviewer takes some time, reviews create some rework, and the delays accumulate as you add stages. A team adding a second review layer expects to pay roughly twice the overhead of a single layer. The actual cost is higher, and not by a small margin.

Avery Pennarun lays out the case that each review layer is closer to a 10x multiplier on cycle time than an additive overhead. The mechanism is queuing theory, and once you work through the math, the claim is hard to dismiss.

Why Queues Cost More Than Their Service Time

A code review step is not a service you consume instantaneously; it is a queue. A change arrives, waits for a reviewer to become available, gets reviewed, possibly cycles back with requests for changes, and eventually clears. The wait time in that queue depends not just on how long reviews take, but on how busy the reviewer is with everything else.

For an M/M/1 queue (single server, Poisson arrivals, exponential service times), the mean waiting time is:

W = ρ / (μ(1 - ρ))

where λ is the arrival rate, μ is the service rate, and ρ = λ/μ is the utilization. The critical part is the (1 - ρ) term in the denominator. At 50% utilization, mean wait equals one mean service time. At 80% utilization, it equals four service times. At 90%, nine service times. The curve is steep and the divergence begins well below full capacity.
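
The multiples above can be checked with a few lines of Python. This is a minimal sketch of the formula as written, not anything from Pennarun's post; the function name `mm1_mean_wait` is made up for illustration, and the service rate is set to one review per hour so that the wait in hours reads directly as a count of mean service times.

```python
def mm1_mean_wait(service_rate: float, utilization: float) -> float:
    """Mean time spent waiting in an M/M/1 queue, excluding service itself.

    Implements W = rho / (mu * (1 - rho)).
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization / (service_rate * (1 - utilization))


# Assumption for illustration: mu = 1 review per hour, so the wait in hours
# is also the wait expressed in mean service times.
for rho in (0.5, 0.8, 0.9, 0.95):
    wait = mm1_mean_wait(1.0, rho)
    print(f"utilization {rho:.0%}: mean wait = {wait:.1f} service times")
```

Going from 90% to 95% utilization roughly doubles the wait again, which is the steepness the formula describes.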

Most engineers doing code review are not dedicated reviewers. They are engineers who also carry their own feature work, attend meetings, and handle interruptions. A reviewer utilization of 80% or higher is common for someone splitting time between their own code and others’. At 80% utilization, a change that takes one hour to actually review waits four hours on average before the reviewer starts.

Two Queues in Series Are Worse Than Two Times One

Add a second required reviewer, structured so the second only sees the change after the first approves, and you now have two queues in series. The expected cycle time is the sum of both wait times plus both service times. If each reviewer operates at 80% utilization, the combined mean wait is roughly eight service times before the change ships (about ten once the two reviews themselves are counted), even before accounting for revision cycles.
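
The arithmetic for gates in series can be sketched as follows. The helper names `stage_time` and `serial_cycle_time` are hypothetical, and the sketch assumes each gate behaves as an independent M/M/1 queue with a one-hour mean review, which is a simplification rather than a measured model of any team.

```python
def stage_time(service_hours: float, utilization: float) -> float:
    """Mean time a change spends at one review gate: queue wait plus the review."""
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    # M/M/1 mean wait expressed in hours: service time * rho / (1 - rho).
    wait = service_hours * utilization / (1 - utilization)
    return wait + service_hours


def serial_cycle_time(stages: list[tuple[float, float]]) -> float:
    """Total mean time through gates that run strictly one after another.

    Each stage is a (service_hours, utilization) pair.
    """
    return sum(stage_time(service, rho) for service, rho in stages)


# Assumed inputs: one-hour reviews, reviewers at 80% utilization.
one_gate = serial_cycle_time([(1.0, 0.8)])
two_gates = serial_cycle_time([(1.0, 0.8), (1.0, 0.8)])
print(f"one gate: {one_gate:.1f} h, two gates: {two_gates:.1f} h")
```

Under these assumptions an hour of review per gate turns into about five hours of elapsed time per gate, before any revision cycle is counted.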

There is a second effect that does not appear in the M/M/1 formula directly. Each round trip between reviewer and author costs a context switch. The author, who has likely moved on to other work while waiting, has to reload the full context of the original change. Research by Gloria Mark at UC Irvine estimates the cost of re-establishing focused work context at upwards of 20 minutes after an interruption. If a review comes back requesting changes two days after submission, the author pays that reload cost on top of the rework itself.

A change that requires two rounds of feedback at each of two review stages, with each round separated by a day, accumulates four context-switch penalties plus four separate queue waits. This is why teams with multiple required approvers report cycle times measured in weeks for changes representing a few hours of actual work.
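
A rough tally of that scenario is sketched below. Only the structure (two stages, two rounds each, a day between rounds, a reload cost of about 20 minutes) comes from the text; the `ReviewPolicy` type and the one-hour review and rework figures are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReviewPolicy:
    stages: int                  # sequential review gates
    rounds_per_stage: int        # feedback rounds at each gate
    queue_wait_hours: float      # elapsed wait before each round is picked up
    review_hours: float          # reviewer time per round (assumed)
    context_reload_hours: float  # author's cost to reload context per round
    rework_hours: float          # author's rework per round (assumed)


def round_trips(policy: ReviewPolicy) -> int:
    """Each round at each stage is one queue wait and one context switch."""
    return policy.stages * policy.rounds_per_stage


def elapsed_hours(policy: ReviewPolicy) -> float:
    """Wall-clock time, treating every round trip as strictly sequential."""
    per_round = (policy.queue_wait_hours + policy.review_hours
                 + policy.context_reload_hours + policy.rework_hours)
    return round_trips(policy) * per_round


def hands_on_hours(policy: ReviewPolicy) -> float:
    """Time people actually spend working, excluding queue waits."""
    per_round = (policy.review_hours + policy.context_reload_hours
                 + policy.rework_hours)
    return round_trips(policy) * per_round


policy = ReviewPolicy(stages=2, rounds_per_stage=2, queue_wait_hours=24.0,
                      review_hours=1.0, context_reload_hours=20 / 60,
                      rework_hours=1.0)
print(f"round trips: {round_trips(policy)}")
print(f"elapsed: {elapsed_hours(policy) / 24:.1f} days, "
      f"hands-on: {hands_on_hours(policy):.1f} hours")
```

With these inputs the queue waits dominate: elapsed time comes to several calendar days while hands-on work stays under ten hours.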

How the Pull Request Model Industrialized This

Sequential review gates were not the default before pull requests became the standard collaboration model. Many projects operated with commit-to-trunk workflows where code was reviewed post-merge via mailing lists, or where trusted contributors could land changes with a brief justification. The pre-GitHub workflow was not review-free, but it was asynchronous: a change could ship while design discussion continued in parallel.

The GitHub pull request model encoded pre-merge review as the canonical path to main. This was a sensible default for open source projects with many untrusted contributors. The problem is that the same model became the default inside organizations of trusted engineers. Required reviewer counts, branch protection rules, and CODEOWNERS files turned an external-contributor workflow into a series of internal approval gates applied to colleagues who were already vetted when they were hired.

Each gate gets added for a locally defensible reason. A production incident leads to mandatory on-call review. A security audit leads to required sign-off on any change touching authentication. A compliance requirement leads to a legal review gate on certain feature areas. None of these decisions are obviously wrong in isolation. The queuing cost is invisible until cycle times become a crisis, and by then the organizational memory of why each gate was added has often faded, making removal feel risky.

What High-Velocity Teams Do Differently

The teams that ship fastest have generally solved this by reducing the scope of what requires human review, not by removing review entirely. Google’s internal code review process, described in detail in Software Engineering at Google, asks for a correctness LGTM, code owner approval, and language readability approval, but in most cases one reviewer can provide all three, so a typical change clears with a single human review plus passing automated presubmit checks. The human reviewer’s job is narrow because automated tooling handles formatting, style, test coverage thresholds, and build correctness. Reviewers focus on logic and design, and changes are kept small enough to make that tractable in a single focused session.

DORA research on software delivery performance has consistently found that high-performing teams use trunk-based development with branches that live less than a day, while lower performers use long-lived branches with multiple approval stages. The mechanism matches what queuing theory predicts: smaller changes clear review faster, faster reviews reduce context-switch costs, and the compound effect shows up directly in deployment frequency and lead time.

The Linux kernel project handles this through a tree-based delegation model. Subsystem maintainers accept patches relevant to their area, review happens in parallel across subsystems, and Linus Torvalds pulls from subsystem trees rather than reviewing individual patches. The review work is distributed across parallel queues rather than funneled through sequential gates. It is not a fast process by commercial standards, but for a project that integrates thousands of patches per release cycle, it distributes the bottleneck rather than concentrating it.

The Automation Substitution

The practical path for most teams is not to strip review out but to compress what human review must cover. Automated checks can fully replace review for categories that do not require human judgment: formatting, type errors, dependency audits, test coverage floors, license compliance. Every category that automation owns is a category that can no longer trigger a revision cycle from a human reviewer. Each prevented revision cycle eliminates a context switch, a queue wait, and the associated latency.
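
One way to put a number on a prevented revision cycle is a simple geometric model, sketched here under loud assumptions: every review round independently comes back with change requests with probability `revision_probability`, and each round costs the same fixed number of hours. Neither assumption comes from the source, and the function names are hypothetical.

```python
def expected_rounds(revision_probability: float) -> float:
    """Expected review rounds until approval.

    Assumes each round is rejected independently with the same probability,
    so the number of rounds is geometric with mean 1 / (1 - p).
    """
    if not 0 <= revision_probability < 1:
        raise ValueError("revision_probability must be in [0, 1)")
    return 1.0 / (1.0 - revision_probability)


def expected_review_hours(revision_probability: float,
                          hours_per_round: float) -> float:
    """Expected elapsed time in review, given a fixed cost per round."""
    return expected_rounds(revision_probability) * hours_per_round


# Assumed figures: 5 hours per round (4 h wait + 1 h review at 80% utilization).
# Suppose automated checks absorb enough nits to cut the chance of a
# revision request from 50% to 20%.
before = expected_review_hours(0.5, 5.0)
after = expected_review_hours(0.2, 5.0)
print(f"before automation: {before:.2f} h, after: {after:.2f} h")
```

The saving applies at every gate a change passes through, so it grows with the number of required approvals.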

This changes the calculus on tooling investment. A CI check that reliably catches the kind of nit that currently comes back in review comments pays for itself in cycle time reduction, not just bug prevention. The goal is to make the set of things a human reviewer can plausibly object to as small as possible before the change ever reaches the queue.

Pennarun’s 10x figure is a heuristic, not a derived constant. The actual multiplier depends on reviewer utilization, change size, revision probability, and how much parallel review is possible. What the math establishes clearly is that the cost function is superlinear: the second layer is proportionally more expensive than the first, and the third more expensive than the second. The organizations that treat review gates as zero-cost controls will accumulate them until the compounded latency becomes visible as a shipping problem, and by then the gates have usually become load-bearing organizational structures that nobody wants to be the one to remove.
