Sequential By Default: The Pull Request Design Decision That Shaped a Decade of Code Review

Avery Pennarun published an argument this week that every review layer makes a team roughly ten times slower. The queuing math is correct, and the organizational mechanisms that accumulate review layers have been well analyzed. What the discussion tends to skip is how much of the problem originates in the design of the tools themselves. GitHub’s pull request model is not a neutral container for review; it is an opinionated design that made blocking sequential review the path of least resistance, and the industry has been operating inside that design decision for fifteen years without examining it very carefully.

How Code Review Looked Before Pull Requests

Before GitHub popularized the pull request, code review happened through mechanisms with very different default behaviors.

The Linux kernel, like most large open source projects, used email-based patch review. Developers sent patches to mailing lists. Maintainers replied with inline feedback. Patches were revised and resubmitted. The critical detail is that nothing in this workflow made review a blocking gate by default. Linus Torvalds could merge a patch without waiting for every commenter to sign off. Senior maintainers could merge code they trusted without requiring formal approval from anyone else. The process was calibrated: fast and lightweight for trusted contributors, more deliberate for new ones. The overhead tracked the actual risk.

Google developed Critique, its internal code review tool, on top of a workflow with specific properties. Changes lived in a central monorepo (Piper), not in long-lived branches. Every change needed approval from someone holding “readability” certification for the language involved, but that certification was earned once per engineer; an author who already held it did not need a separate readability reviewer on every change. The cultural expectation was that small changes should be reviewed and approved within hours. Critique was designed around a flow model: reviewers and authors could exchange comments while development continued on related changes in parallel. Separating readability certification from per-change review substantially reduced mandatory overhead for experienced engineers.

Gerrit, the open-source tool that emerged from Google’s internal tooling, brought some of this design philosophy outside Google. Gerrit’s model is built around change sets with explicit Code-Review and Verified score fields. A change can accumulate +1 votes without being approved to merge; the final merge requires a +2. This scoring model allows discussion and partial approval without the binary approved/not-approved state that dominates GitHub. Gerrit also tracks the full history of patchset revisions under a single change ID, which makes rework cycles legible: you can see that a change went through six revisions over three days. GitHub’s model, where revisions arrive as extra commits or force-pushes on a branch, tends to obscure that history.
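The voting behavior described above can be sketched in a few lines. This is a simplified illustration, not Gerrit's implementation: the `Vote` type and `can_submit` function are hypothetical names, the rule shown (at least one +2, no -2, and +1 votes never summing to approval) is an assumption modeled on the commonly used Code-Review label setup, and real Gerrit submit rules are configurable per project. The Verified label is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    """One reviewer's score on a hypothetical Code-Review label (-2..+2)."""
    reviewer: str
    score: int


def can_submit(votes: list[Vote]) -> bool:
    """Assumed rule: needs at least one +2 and no -2; +1 votes do not add up."""
    scores = [v.score for v in votes]
    return 2 in scores and -2 not in scores


# Two +1 votes record partial agreement but do not unlock the merge.
votes = [Vote("alice", 1), Vote("bob", 1)]
print(can_submit(votes))  # False

# A single +2 from someone with merge authority does.
votes.append(Vote("carol", 2))
print(can_submit(votes))  # True
```

The point of the sketch is the middle state: a change can carry visible, graded feedback without that feedback being either an approval or a block.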

What GitHub Changed

GitHub launched in 2008 and built the pull request as its central collaboration primitive. Several design properties distinguished the PR model from what came before.

Pull requests are branch-scoped rather than change-scoped. A PR contains all commits on a branch relative to a base, not a single logical change. This makes it easy to accumulate large batches of work and harder to maintain the small-change discipline that correlates with faster review and lower defect rates.

Merge blocks on required reviewer approval. GitHub’s branch protection rules make it trivially easy to require one or more approvals before a PR can merge. The UI paths for requiring review are prominent and well-designed. The paths for configuring non-blocking or post-merge review are less discoverable and require deliberate setup that most teams never do.

CODEOWNERS files specify which teams must review changes to which paths. The file accumulates entries over time as teams add themselves to paths relevant to their concerns. A security team adds their entry after an incident. A platform team adds theirs after a stability issue. A compliance function adds theirs after an audit. Once an entry exists, removing it requires a deliberate decision to accept the risk that entry was notionally preventing. The default is always toward more review, never less.
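A rough sketch of how accumulated entries turn into required approvals follows. It is deliberately simplified: rules are plain directory prefixes rather than real CODEOWNERS pattern syntax, the team names and paths are hypothetical, and the only behavior borrowed from GitHub is that the last matching rule for a file takes precedence. Each distinct owner group touched by a change stands for one approval the change has to wait on.

```python
# Hypothetical rules in file order: (path prefix, owning teams).
RULES = [
    ("", ("@org/app-team",)),                        # catch-all entry
    ("infra/", ("@org/platform",)),                  # added after a stability issue
    ("auth/", ("@org/security",)),                   # added after an incident
    ("billing/", ("@org/compliance", "@org/security")),  # added after an audit
]


def owners_for(path: str, rules=RULES) -> tuple[str, ...]:
    """Return the owners of one file; later matching rules override earlier ones."""
    matched: tuple[str, ...] = ()
    for prefix, owners in rules:
        if path.startswith(prefix):
            matched = owners
    return matched


def required_owner_groups(changed_paths: list[str], rules=RULES) -> set[tuple[str, ...]]:
    """Distinct owner groups that must each supply an approval for this change."""
    return {owners_for(path, rules) for path in changed_paths}


narrow_change = ["src/app.py"]
broad_change = ["src/app.py", "infra/deploy.yaml", "auth/login.py", "billing/invoice.py"]

print(len(required_owner_groups(narrow_change)))  # 1
print(len(required_owner_groups(broad_change)))   # 4
```

Every new row in the rules list is cheap to add, and the cost only shows up later as a larger set of groups on changes that happen to cross those paths.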

The result is a tooling default that makes sequential, blocking, multi-reviewer approval the natural outcome for any team that adds protection rules without a strong counteracting philosophy. Most teams add protection rules without a strong counteracting philosophy, because the individual decisions look like obvious safety measures and the aggregate cost does not appear on any dashboard.

The Defaults That Shaped a Generation

GitHub’s growth tracked the professionalization of software development through the 2010s. The platform moved from startup novelty to primary infrastructure for both open source and commercial engineering within roughly five years. Teams adopting modern development practices learned those practices in an environment where GitHub’s defaults were the reference implementation.

In the email-patch model, blocking was a maintainer’s judgment call rather than a tool setting, and Gerrit’s graded votes kept discussion separate from the merge gate; GitHub made binary blocking review the cheapest path. The same protection rule a team can configure in thirty seconds takes deliberate organizational coordination to remove, because removal means someone accepts visible accountability for risks the rule was managing.

This is not a criticism of GitHub’s product decisions in isolation. Blocking review by default probably made sense for the use cases GitHub was optimized for in 2008: open source projects accepting contributions from unknown contributors, where some friction was appropriate. The problem is that those defaults migrated without adjustment to commercial teams with very different risk profiles and contributor trust levels, and nobody reconfigured them because reconfiguration looked like risk acceptance.

Ship, Show, Ask

Rouan Wilsenach described the ship/show/ask model in 2021 as a framework for routing changes to the appropriate level of review scrutiny. The model distinguishes three categories: changes you ship directly without review because they are routine and well understood; changes you show to colleagues after merging because they are worth communicating but not worth blocking on; and changes you ask about before merging because they are novel, high-risk, or outside your current expertise.

The model requires that engineers categorize changes by actual risk rather than applying uniform process to everything. It requires trust between engineers and tolerance for occasional mistakes. It also requires a tooling environment that makes non-blocking and post-merge review practical, which most GitHub configurations do not provide by default.

Some teams implement this through informal norms rather than tooling: committing directly to main without formal PR review for changes below a certain size, using PRs for changes above a threshold or touching critical paths. The pattern is more common in practice than the standard discussion implies, but it requires conscious deviation from the platform defaults rather than working with them.
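The size-and-path norm described above can be written down as a small routing function. Everything specific here is an assumption for illustration: the thresholds, the critical path prefixes, and the `Change` fields are hypothetical, and size is only a crude proxy for risk. Wilsenach's model leaves the categorization to the author's judgment, which the `author_is_unsure` flag stands in for.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    SHIP = "ship"  # merge without review
    SHOW = "show"  # merge, then request non-blocking review
    ASK = "ask"    # request review before merging


@dataclass(frozen=True)
class Change:
    lines_changed: int
    paths: tuple[str, ...]
    author_is_unsure: bool = False


# Hypothetical team norms; real values would come from the team's own history.
CRITICAL_PREFIXES = ("auth/", "billing/")
SHIP_MAX_LINES = 20
ASK_MIN_LINES = 200


def route(change: Change) -> Route:
    """Pick the lightest review level the assumed norms allow."""
    touches_critical = any(p.startswith(CRITICAL_PREFIXES) for p in change.paths)
    if touches_critical or change.author_is_unsure or change.lines_changed >= ASK_MIN_LINES:
        return Route.ASK
    if change.lines_changed <= SHIP_MAX_LINES:
        return Route.SHIP
    return Route.SHOW


print(route(Change(5, ("docs/readme.md",))))    # Route.SHIP
print(route(Change(80, ("src/report.py",))))    # Route.SHOW
print(route(Change(5, ("auth/session.py",))))   # Route.ASK
```

A function like this encodes the deviation from platform defaults explicitly, which is what informal norms leave implicit.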

Automated Review and the Queue Problem

The emerging category of automated code review tools, including GitHub Copilot’s review feature and tools like CodeRabbit and Qodo, adds a new variable to this analysis. An automated reviewer that comments on PRs within seconds adds no queue time and has no utilization curve that bends upward under load.

But these tools are frequently configured as blocking requirements, creating mandatory checks that must pass before merge. An automated check that fails intermittently introduces problematic properties: variable wait time, unclear resolution paths, and no mechanism for engineers to override based on context. The latency cost of an automated check is low when it passes; when it fails and requires human judgment to resolve, it becomes a queue with unpredictable service time.
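The cost of intermittent failures compounds quickly when several checks are all required. The sketch below works through the arithmetic under stated assumptions that are not from the original argument: the change itself is clean, each check fails spuriously with the same independent probability, and any failure triggers a full rerun. The function names are illustrative.

```python
def first_try_pass_probability(p_flake: float, n_checks: int) -> float:
    """Probability that all n independent required checks pass on one attempt."""
    return (1 - p_flake) ** n_checks


def expected_attempts(p_flake: float, n_checks: int) -> float:
    """Mean number of full runs until one passes (geometric distribution)."""
    return 1 / first_try_pass_probability(p_flake, n_checks)


for n in (1, 5, 10):
    p_pass = first_try_pass_probability(0.05, n)
    print(f"{n:>2} checks: pass first try {p_pass:.1%}, "
          f"expected runs {expected_attempts(0.05, n):.2f}")
```

With a 5% spurious failure rate per check, five required checks already leave roughly a one-in-four chance that a clean change fails its first run. The model covers only rerun counts; the time an engineer spends deciding whether a failure is real is the unpredictable part and is not captured here.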

The more productive integration treats automated review tools as advisory rather than blocking. A security scanner that flags potential vulnerabilities without holding up merge adds value at low cost. Configuring that same scanner as a mandatory gate turns it into a queue with the same compounding dynamics as any other review layer. Tooling vendors have incentives to present blocking integrations as the more serious, enterprise-grade configuration, and buying teams often interpret “blocks merge” as a sign of rigor rather than a design choice with tradeoffs.

The Measurement Gap

The underlying tooling problem is this: GitHub and similar platforms have invested heavily in making it easy to add review requirements and relatively little in making it easy to measure the latency cost of existing requirements. GitHub’s insights dashboard shows review cycle time for individual PRs. It does not show aggregate wait time by approval layer, the probability distribution of revision cycles before merge, or the compounding effect of CODEOWNERS chains on end-to-end delivery lead time.
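As a minimal sketch of the first missing metric, aggregate wait time by approval layer can be computed from a flat list of review records. The record shape below is hypothetical (it is not a GitHub API response), as are the layer names and timestamps; a real version would first have to derive "requested" and "approved" times per layer from the platform's review events.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical export: (pr_id, approval_layer, requested_at, approved_at)
EVENTS = [
    (101, "team", "2024-03-04T09:00", "2024-03-04T13:00"),
    (101, "security", "2024-03-04T13:00", "2024-03-05T13:00"),
    (102, "team", "2024-03-05T10:00", "2024-03-05T12:00"),
    (102, "security", "2024-03-05T12:00", "2024-03-07T12:00"),
]


def wait_hours_by_layer(events) -> dict[str, list[float]]:
    """Group the hours each PR spent waiting, keyed by approval layer."""
    waits: dict[str, list[float]] = defaultdict(list)
    for _pr_id, layer, requested_at, approved_at in events:
        delta = datetime.fromisoformat(approved_at) - datetime.fromisoformat(requested_at)
        waits[layer].append(delta.total_seconds() / 3600)
    return dict(waits)


def summarize(events) -> dict[str, dict[str, float]]:
    """Per-layer count, median wait, and total wait in hours."""
    return {
        layer: {"count": len(hours), "median_hours": median(hours), "total_hours": sum(hours)}
        for layer, hours in wait_hours_by_layer(events).items()
    }


for layer, stats in summarize(EVENTS).items():
    print(layer, stats)
```

Even this crude aggregation separates the layers: in the sample data the second gate accounts for most of the elapsed time, which no per-PR view would make obvious.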

Teams that have built internal tooling to measure these things typically find that the gap between perceived and actual review overhead is wide. Engineers who experience the overhead calibrate their intuitions against their own PRs. Engineers who add protection rules calibrate their intuitions against the justification for each individual gate, not against the aggregate effect of the collection. Without a shared measurement, those two calibrations never converge.

Pennarun’s argument that review layers multiply slowdown rather than merely adding to it is grounded in queuing mathematics that applies to any multi-stage sequential blocking process. The mechanism that produced those layers in most organizations is a tooling default, made fifteen years ago for a specific set of use cases, that has since propagated into contexts where it was never evaluated. Making the aggregate visible is the precondition for changing it, and the tools most teams use were not designed to provide that visibility.