
Faster Code Writing Just Feeds the Queue

Source: lobsters

There is a version of this conversation happening in every engineering org right now. Someone in leadership notices that developers are spending a lot of time on GitHub Copilot demos or asking for Claude licenses, and the promise is always the same: we will ship faster. The thing is, for most teams, writing code is not where time goes.

This point gets made periodically; teams nod along, then go back to optimizing the wrong thing. So it is worth being precise about why the intuition fails and what the data shows.

Where Time Actually Goes

In 2018, Nicole Forsgren, Jez Humble, and Gene Kim published Accelerate, which synthesized years of DORA research into what separates high-performing software teams from the rest. The four metrics they identified (deployment frequency, lead time for changes, change failure rate, and time to restore service) say almost nothing about how fast developers type. Lead time for changes is measured from commit to production, so the clock starts only after the code has been written. Time spent at the keyboard is not even inside that window; everything the metric captures happens after authorship.
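To make the definitions concrete, here is a minimal sketch of how three of the four metrics could be computed from deployment records. The `Deployment` record, its field names, and the sample data are assumptions made for illustration; they are not taken from Accelerate or from any DORA tooling, and time to restore service is left out because it needs incident data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class Deployment:
    commit_time: datetime  # when the change was committed
    deploy_time: datetime  # when it reached production
    failed: bool           # whether it caused a failure in production


def lead_time_hours(deploys):
    """Median commit-to-production time, in hours."""
    return median(
        (d.deploy_time - d.commit_time).total_seconds() / 3600 for d in deploys
    )


def change_failure_rate(deploys):
    """Fraction of deployments that caused a production failure."""
    return sum(d.failed for d in deploys) / len(deploys)


def deploys_per_day(deploys, window_days):
    """Deployment frequency over an observation window."""
    return len(deploys) / window_days


# Hypothetical sample: four deployments observed over one week.
t0 = datetime(2024, 1, 1, 9, 0)
deploys = [
    Deployment(t0, t0 + timedelta(hours=30), False),
    Deployment(t0, t0 + timedelta(hours=50), True),
    Deployment(t0, t0 + timedelta(hours=40), False),
    Deployment(t0, t0 + timedelta(hours=20), False),
]
print(lead_time_hours(deploys))
print(change_failure_rate(deploys))
print(deploys_per_day(deploys, window_days=7))
```

None of the inputs records how long the code took to write, which is the point: typing speed is not an input to any of these metrics.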

Microsoft Research published a study on code review at Microsoft that found the median code review takes around 15 hours to get a first response, even on teams that care about it. For teams that do not have an explicit review culture, that number is much higher. The code itself, the part your AI tooling touches, might take two hours to write. It then sits in a queue for a day and a half.

GitHub’s own data from their 2024 Octoverse report showed that developer time is split roughly into thirds: writing new code, reviewing others’ code, and everything else, which includes meetings, incident response, and context switching. The writing portion is already a minority of the day. Compressing it does not meaningfully change the overall picture.

The Queue Is the System

Little’s Law, borrowed from queueing theory, states that the average number of items in a system equals the throughput rate multiplied by the average time each item spends in the system: L = λW. This is not a metaphor; it is a mathematical identity that holds for any stable queueing system.

If your team merges, say, ten pull requests per week, and each PR takes an average of five days from open to merge, you will have about seven open PRs at any given time, on average (ten per week times five-sevenths of a week), regardless of how fast the code gets written. If you adopt an AI coding tool and your developers now open fifteen PRs per week instead of ten, but the review process stays the same, you do not ship fifteen PRs per week. You accumulate a backlog that grows by five open PRs every week. Lead time may actually increase because reviewers have more to process.
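Little’s Law only works when the rate and the time share a unit, so a lead time in days has to be converted to weeks before it is multiplied by a weekly rate. The sketch below does that conversion and also shows how the backlog grows when PRs are opened faster than they are reviewed. The function names and the `days_per_week` parameter are illustrative assumptions, and the backlog model is deliberately simplistic: it assumes constant rates.

```python
def average_open_prs(merged_per_week, lead_time_days, days_per_week=7):
    """Little's Law, L = lambda * W, with both terms expressed in weeks."""
    lead_time_weeks = lead_time_days / days_per_week
    return merged_per_week * lead_time_weeks


def backlog_after(weeks, opened_per_week, reviewed_per_week, start=0):
    """Open PRs after `weeks` when arrivals outpace review capacity.

    Simplification: if review keeps up, the backlog is held at `start`.
    """
    growth = max(0, opened_per_week - reviewed_per_week)
    return start + growth * weeks


# Ten merges per week, five calendar days from open to merge.
print(round(average_open_prs(10, 5), 1))

# Fifteen PRs opened per week against a review capacity of ten.
print(backlog_after(4, opened_per_week=15, reviewed_per_week=10))
```

With calendar days the first call gives roughly seven open PRs; passing `days_per_week=5` to count working days gives ten. In both cases the number depends only on the merge rate and the lead time, not on how quickly the code was written.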

Eliyahu Goldratt made this argument in the manufacturing context in The Goal, which was later adapted for software in The Phoenix Project and The Unicorn Project. The core insight from the Theory of Constraints is that the throughput of a system is determined by its slowest stage. Optimizing any stage that is not the bottleneck does not increase throughput; it only creates more inventory piling up in front of the bottleneck.

For most software teams, the bottleneck is not code authorship. It is some combination of review latency, manual QA, deployment pipeline reliability, and organizational sign-off processes.
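The Theory of Constraints claim can be sketched in a few lines: if each stage has a weekly capacity, the system ships at the rate of the slowest one. The stage names and capacities below are hypothetical, chosen only to illustrate the argument.

```python
def system_throughput(stage_capacity):
    """Return the constraining stage and the throughput it allows."""
    bottleneck = min(stage_capacity, key=stage_capacity.get)
    return bottleneck, stage_capacity[bottleneck]


# Hypothetical capacities, in PRs per week.
stages = {"authoring": 10, "review": 8, "qa": 12, "deploy": 20}
print(system_throughput(stages))  # ('review', 8)

# Speed up authoring by 50%: the constraint and the throughput do not move.
stages["authoring"] = 15
print(system_throughput(stages))  # ('review', 8)

# Raise review capacity instead, and throughput rises until the next
# slowest stage becomes the constraint.
stages["review"] = 14
print(system_throughput(stages))  # ('qa', 12)
```

The extra seven PRs per week that authoring can now produce beyond what review absorbs are the inventory piling up in front of the bottleneck.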

The AI Coding Tool Problem

None of this means AI coding tools are worthless. They are genuinely useful for certain tasks: generating boilerplate, explaining unfamiliar code, writing tests for well-specified functions, navigating large codebases. But the productivity framing that vendors use to sell them is almost always measured in code-writing speed, which is the wrong unit.

GitHub’s own research on Copilot claimed a 55% speedup in a controlled task completion benchmark. That benchmark involved writing a web server in isolation, with no review process, no deployment pipeline, and no coordination with other developers. It measures the thing that was not your bottleneck and leaves everything else unchanged.

There is also a subtler problem. Faster code generation tends to produce larger pull requests, because the friction of writing the code is lower. Research on code review effectiveness, including work from Cisco and later studies at Microsoft, consistently finds that reviewer thoroughness drops sharply for PRs over 400 lines of diff. Reviewers start skimming. Defect detection rates fall. The code that your AI tool wrote faster now gets reviewed less carefully because it arrived in a larger batch.

What High-Performing Teams Actually Do

The DORA research is consistent on this. High-performing teams deploy more frequently, which means they work in smaller batches. Trunk-based development, where developers commit to the main branch frequently rather than maintaining long-lived feature branches, is one of the highest-signal practices associated with elite performance. It is not about speed of authorship; it is about reducing the size of each unit of work.

Code review turnaround time is one of the strongest levers available to most teams. A norm of reviewing PRs within two hours instead of two days compresses lead time more than any coding tool can. This requires social and organizational change, not new software purchases. Teams that run a daily standup with an explicit focus on unblocking PRs, and that treat open reviews as a shared responsibility rather than something each developer fits in around their own work, consistently outperform teams that leave review to chance.

Deployment pipeline reliability matters in a similar way. A pipeline that takes 45 minutes and fails one run in five creates enormous friction. Developers work around it by batching changes, which makes reviews harder, which increases defect rates, which increases the failure rate. Investing in a fast, reliable CI/CD pipeline compounds. A ten-minute green build that developers trust changes how they work at a fundamental level.
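A rough way to put a number on that friction is the expected time until a change gets a green build. The sketch below assumes that each run fails independently with the same probability and that a failed run is retried from the start; both are simplifying assumptions, and the 2% failure rate for the fast pipeline is hypothetical.

```python
def expected_minutes_to_green(run_minutes, failure_rate):
    """Expected wall-clock time until the first passing run.

    Runs until first success follow a geometric distribution, so the
    expected number of runs is 1 / (1 - failure_rate).
    """
    expected_runs = 1 / (1 - failure_rate)
    return run_minutes * expected_runs


# The pipeline from the text: 45 minutes, one failure in five runs.
print(expected_minutes_to_green(45, 0.20))

# A ten-minute pipeline with a hypothetical 2% failure rate.
print(expected_minutes_to_green(10, 0.02))
```

Under these assumptions the flaky pipeline costs roughly 56 minutes per change on average, against a little over ten for the fast one, before counting the context switches each failed run causes.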

The Coordination Layer

There is one more layer that rarely appears in productivity conversations: coordination overhead. Every dependency between teams, every approval gate, every meeting that exists to synchronize people who should have been able to work independently, adds latency that is invisible to code authorship metrics.

Conway’s Law observes that systems tend to mirror the communication structures of the organizations that build them. The inverse is also true: if your architecture requires constant cross-team coordination, no amount of tooling will fix the coupling. This is why some organizations have restructured around stream-aligned teams, as described in Team Topologies, to minimize the coordination tax on feature delivery.

Writing code faster does not reduce the number of people who need to sign off on it. It does not make your architecture less coupled. It does not eliminate the weekly change advisory board meeting. These are the things that determine whether a feature ships in two weeks or two months, and they all sit outside the act of writing the code.

Where This Leaves You

The next time someone proposes that the team needs better tooling to write code faster, the useful question is: where does time actually go between a developer starting work on something and a user being able to use it? Map that out concretely, including wait times at each stage, and the answer will almost never point to code authorship as the primary constraint.
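One way to do that mapping is to record, for each stage, how long someone actively works on the change and how long it waits before the stage begins. The sketch below computes flow efficiency (active time divided by total elapsed time) and finds the longest wait. The `Stage` record and every number in it are hypothetical placeholders; substitute measurements from your own tracker.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    active_hours: float  # someone is working on the change
    wait_hours: float    # the change sits idle before this stage starts


def flow_efficiency(stages):
    """Share of total elapsed time in which work is actually being done."""
    active = sum(s.active_hours for s in stages)
    total = active + sum(s.wait_hours for s in stages)
    return active / total


def longest_wait(stages):
    """The stage a change waits longest to enter."""
    return max(stages, key=lambda s: s.wait_hours)


# Hypothetical numbers for a single change.
stages = [
    Stage("write code", active_hours=2, wait_hours=0),
    Stage("code review", active_hours=1, wait_hours=15),
    Stage("QA", active_hours=2, wait_hours=24),
    Stage("deploy", active_hours=1, wait_hours=48),
]
print(f"flow efficiency: {flow_efficiency(stages):.0%}")
print(f"longest wait before: {longest_wait(stages).name}")
```

In this made-up example only 6 of 93 elapsed hours are active work, and halving the two hours of coding would save one hour out of the 93.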

This is not an argument against improving developer tooling. It is an argument for measuring the right thing before optimizing. The teams shipping reliably multiple times a day are not doing it because they type faster. They have small PRs, fast review turnaround, reliable pipelines, and organizational structures that minimize coordination overhead. Those properties take time and intentional effort to build, and no tool does it for you.
