What Queuing Theory Says About Every Review Gate You Add

Avery Pennarun’s recent post on review overhead makes a claim that sounds extreme until you work through the math: each layer of review you add to a software process does not slow the team down by a fixed amount; it multiplies the slowdown by another factor. Two review layers at 10x each means 100x total friction. Three means 1000x. Most engineering organizations have never done this arithmetic, and the gap between intuition and reality is wide.

The mechanism is queuing, and queuing has its own mathematics. Kingman’s formula, sometimes called the VUT equation, approximates the mean wait time in a single-server queue:

W ≈ (ρ / (1 - ρ)) × ((Ca² + Cs²) / 2) × S

Here ρ is utilization, Ca and Cs are the coefficients of variation of arrival and service times, and S is the mean service time. The critical term is ρ / (1 - ρ). With the variability factor held at 1, a reviewer who is 80% occupied with their own work already imposes a wait of 4x the average service time. At 90% utilization, it is 9x. At 95%, it is 19x. The function is hyperbolic, not linear: small increases in utilization near capacity produce large increases in wait time. This is not a productivity failing; it is a mathematical property of queues.
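The multipliers above are easy to reproduce. The following is a minimal sketch, assuming the standard two-term form of the variability factor, (Ca² + Cs²) / 2, with both coefficients set to 1 so that only the utilization term varies. The function name and the chosen utilization levels are illustrative, not taken from Pennarun's post.

```python
def kingman_wait(rho, ca2, cs2, service_time):
    """Approximate mean wait in a single-server queue (Kingman's formula).

    rho          -- utilization, must be in [0, 1)
    ca2, cs2     -- squared coefficients of variation of arrivals and service
    service_time -- mean service time, in any unit
    """
    if not 0 <= rho < 1:
        raise ValueError("utilization must be in [0, 1)")
    return (rho / (1 - rho)) * ((ca2 + cs2) / 2) * service_time


if __name__ == "__main__":
    # Variability factor fixed at 1, service time fixed at 1 unit,
    # so the result is the wait expressed in multiples of service time.
    for rho in (0.5, 0.8, 0.9, 0.95):
        wait = kingman_wait(rho, ca2=1.0, cs2=1.0, service_time=1.0)
        print(f"utilization {rho:.0%}: wait is about {wait:.1f}x service time")
```

Raising either variability coefficient scales every one of these numbers up by the same factor, which is why erratic reviewer response times hurt even at moderate utilization.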

Each review gate is a queue. A reviewer has a utilization rate, variance in response time, and a mean service duration. When you add a second required reviewer, you compose two queues in series. Total wait time is not simply what each reviewer’s queue would cost on its own, added together; the irregular output of the first queue becomes the arrival stream of the second, so the total is shaped by the variance and utilization of both queues combined and tends to exceed that sum. When you add a third, you compose again.
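One way to see the interaction between gates is to chain Kingman's approximation with a textbook approximation for the variability of a queue's output, cd² ≈ ρ²·cs² + (1 - ρ²)·ca². The sketch below does that under loud assumptions: two reviewers at 90% utilization, highly variable response times (cs² = 4), and work arriving at the first gate with ca² = 1. All of these numbers are hypothetical, and the model ignores rework loops, so treat it as an illustration of the mechanism rather than a prediction.

```python
def kingman_wait(rho, ca2, cs2, service_time):
    """Approximate mean wait in a single-server queue (Kingman's formula)."""
    return (rho / (1 - rho)) * ((ca2 + cs2) / 2) * service_time


def departure_scv(rho, ca2, cs2):
    """Approximate squared coefficient of variation of departures."""
    return rho ** 2 * cs2 + (1 - rho ** 2) * ca2


def tandem_waits(stages, ca2=1.0):
    """Wait at each gate when gates are arranged in series.

    stages -- list of (rho, cs2, service_time) tuples, one per gate
    ca2    -- arrival variability seen by the first gate
    """
    waits = []
    for rho, cs2, service_time in stages:
        waits.append(kingman_wait(rho, ca2, cs2, service_time))
        # The output of this gate is the input of the next one.
        ca2 = departure_scv(rho, ca2, cs2)
    return waits


if __name__ == "__main__":
    reviewer = (0.9, 4.0, 1.0)  # hypothetical: 90% busy, erratic, 1h reviews
    waits = tandem_waits([reviewer, reviewer])
    standalone = kingman_wait(0.9, 1.0, 4.0, 1.0)
    print(f"each gate on its own: {standalone:.1f}h")
    print(f"gates in series:      {waits[0]:.1f}h + {waits[1]:.1f}h")
```

With these inputs the second gate waits longer than an identical gate fed directly, because it inherits the first reviewer's irregular output.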

Little’s Law compounds this further. Work in progress equals throughput multiplied by cycle time: L = λW. If cycle time grows because you added review layers, WIP grows proportionally at constant throughput. Higher WIP means more context switching, more merge conflicts, more stale branches, and more coordination overhead, each of which reduces effective throughput. The degradation is non-linear because queue depth and throughput are not independent variables.
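A quick worked example of Little's Law, as a minimal sketch with made-up numbers: a team merging 10 PRs per day, with cycle time growing from half a day to four days as review layers are added. The function name and figures are assumptions for illustration only.

```python
def work_in_progress(throughput_per_day, cycle_time_days):
    """Little's Law: L = lambda * W (average items in the system)."""
    return throughput_per_day * cycle_time_days


if __name__ == "__main__":
    throughput = 10  # hypothetical: PRs merged per day
    for cycle_time in (0.5, 1, 2, 4):  # hypothetical cycle times in days
        open_prs = work_in_progress(throughput, cycle_time)
        print(f"cycle time {cycle_time} days: about {open_prs:g} PRs in flight")
```

The formula itself is linear. The non-linearity described above comes from the second-order effect: the extra PRs in flight erode throughput, which the sketch holds constant.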

What the research says about high performers

The DORA research program, which has tracked software delivery performance across thousands of organizations since 2014, published its core findings in Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim. The data consistently shows that elite performers deploy multiple times per day with lead times measured in hours, while low performers deploy quarterly and wait weeks for changes to reach production.

Elite performers do not skip review. They do it differently. The distinguishing characteristics are small batch sizes, short-lived branches, trunk-based development, and review that is not allowed to become a long blocking gate before merge. Software Engineering at Google describes code review there as a knowledge-sharing and correctness-checking tool with a cultural expectation of small changes and short turnaround, not an approval checkpoint with indefinite wait time. The review happens, but it is kept from turning into a long queue.

The distinction between blocking and non-blocking review is where most organizations fail. Requiring sign-off before merge is a blocking gate that creates a queue with compounding wait time. Reviewing a change that is already deployed behind a feature flag is not. Both involve a human reading code; only one introduces queuing costs. Feature flags, continuous deployment, and immutable audit logs can serve the same risk management goals that mandatory pre-merge review is meant to achieve, without the composition of queues in series.

How review layers accumulate

Review gates accumulate because the costs are systemic and invisible, while the benefits are local and credited directly to the gate. No one proposes a new approval requirement and says it will slow the team down multiplicatively. They say it will catch more bugs, satisfy a compliance requirement, or ensure senior engineers stay informed. Each justification is locally plausible; the aggregate effect on cycle time is not tracked.

Compliance requirements are the hardest case. SOX, PCI-DSS, and HIPAA often mandate separation of duties and documented approval chains. These requirements are real, with legal consequences for ignoring them. But many organizations implement them as blocking pre-merge review when the underlying regulatory intent can be satisfied by continuous deployment pipelines with immutable audit logs, automated test coverage thresholds, and post-deploy monitoring. The DORA research documents teams in regulated industries achieving elite delivery performance while maintaining compliance, by negotiating implementation with auditors rather than accepting the default interpretation. The flexibility is often there; it takes more upfront work to find it than adding an approval checkbox.
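To make "immutable audit log" slightly more concrete, here is a minimal sketch of a hash-chained, append-only log, in which each entry commits to the one before it so that later edits are detectable. Hash chaining is an assumption on my part about how such a log might be built; the field names and event strings are hypothetical, and whether a design like this satisfies a given auditor is a separate question the sketch does not answer.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(event, prev_hash):
    """Hash the event together with the previous entry's hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, chaining it to the last entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"event": event, "prev": prev_hash,
             "hash": _digest(event, prev_hash)}
    log.append(entry)
    return entry


def verify(log):
    """Return True only if no entry has been altered, removed, or reordered."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    append_entry(log, "alice merged change 1")    # hypothetical events
    append_entry(log, "pipeline deployed change 1")
    print("log intact:", verify(log))
    log[0]["event"] = "mallory merged change 1"   # tamper with history
    print("log intact after tampering:", verify(log))
```

A record like this is written by the pipeline as a side effect of deploying, so it adds documentation without adding a queue.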

The more common case is informal review theater: process that made sense at a previous organizational size or risk profile and was never revisited. Each addition is a one-way ratchet. Adding a review requirement is politically safe. Removing one requires someone to accept visible responsibility for risks the review was notionally preventing. That asymmetry is why gates accumulate without anyone designing the system to be slow.

The hidden costs beyond wait time

The queue delay is only part of the cost. When a PR sits waiting for review, the author context-switches to other work. When the review comes back with requested changes, returning to full context takes time. Research on programmer productivity has estimated that interrupted cognitive tasks require 15 to 25 minutes to resume fully. A multi-day review cycle involves multiple such interruptions per PR, on both sides.

Meanwhile, the main branch keeps moving. A PR that waited four days for review will have more merge conflicts than one that merged within four hours, because other developers kept committing. Resolving those conflicts requires understanding changes made during the wait, which requires additional review. The cycle reinforces itself.

Batch size follows the same feedback loop. When review is slow, developers rationally make larger PRs to reduce the ratio of overhead to actual work. Larger PRs are harder to review, so review quality drops and wait time increases further. This is the review-process equivalent of a traffic jam: trying to push more traffic through a constrained segment makes congestion worse, not better.

What the alternatives actually look like

The practices that correlate with high delivery performance in DORA research share a common structure: they reduce batch size, eliminate blocking handoffs, and use automation to enforce properties that humans were reviewing manually.

Continuous integration with a strong automated test suite catches regressions faster than human review, without creating a queue. Trunk-based development, where all developers commit to a shared main branch with branches measured in hours rather than days, eliminates the branch divergence that makes merges expensive and conflict-laden. Feature flags let you deploy code before it is active, decoupling deployment from release and removing much of the risk that mandatory review was meant to manage.
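The feature-flag idea can be sketched in a few lines. This is a minimal illustration, assuming flags are read from environment variables under a hypothetical `FLAG_<NAME>` naming scheme; real systems usually use a flag service with per-user targeting, and the function names here are invented for the example.

```python
import os


def flag_enabled(name, default=False):
    """Read a boolean flag from the environment (hypothetical FLAG_<NAME>)."""
    raw = os.environ.get(f"FLAG_{name.upper()}")
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "on"}


def render_banner(user):
    """New code path ships to production but stays dark until flagged on."""
    if flag_enabled("new_banner"):
        return f"Welcome back, {user}!"  # new behavior, deployed but gated
    return f"Hello, {user}"              # existing behavior, still the default


if __name__ == "__main__":
    print(render_banner("sam"))
    os.environ["FLAG_NEW_BANNER"] = "true"  # release without redeploying
    print(render_banner("sam"))
```

Because the new branch is inert until the flag flips, a reviewer can read the change after it has merged without anyone waiting on them.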

Pair programming deserves separate mention. Two developers working on the same code simultaneously provide review coverage with no async queue costs, because review is concurrent with authoring rather than sequential to it. The practice has a high per-hour cost but a low per-PR cycle time cost, and cycle time is the metric that actually determines delivery performance.

The claim that review layers multiply rather than add their costs needs mathematical backing to be taken seriously in most organizations, because the intuitive model treats each gate as a small, fixed cost. The queuing math is unambiguous, the DORA research is consistent across years and sample sizes, and the organizational dynamic that creates the one-way ratchet is well understood. The gap is not in the evidence; it is between knowing the mechanism and having the standing to act on it.