The Constraint Your AI Coding Tool Won't Move

Most software teams cannot tell you how long it takes from a feature being prioritized to it running in production. They can tell you how fast their developers type, how many PRs were opened last week, and what their test coverage percentage is. These numbers are easy to count. They are also, mostly, not the ones that determine delivery speed.

A post on Debugging Leadership makes this case plainly: if you think coding speed is your bottleneck, you have misdiagnosed the system. The observation is not original, but it keeps getting overlooked because the solution to writing speed is visible and purchasable. A Copilot or Cursor subscription appears on a budget line. Fixing your review culture or your deployment pipeline does not.

Where the time actually goes

Nicole Forsgren’s research on software delivery performance, codified in Accelerate (2018, with Jez Humble and Gene Kim) and continued in the DORA State of DevOps reports, measures lead time for changes as one of four key metrics separating elite teams from low performers. In the 2018 report, elite teams ship changes in under an hour. Low performers take between one and six months.

That gap is not about typing speed. A month is roughly 720 hours, and a developer on a slow team is not writing code at 1/720th the rate of someone on a fast team. The difference is almost entirely in queue time: how long code sits waiting to be reviewed, waiting to be merged, waiting to be deployed, waiting for someone to approve a production change.

Flow efficiency, a concept borrowed from lean manufacturing and applied to software delivery, measures the proportion of time a work item is actively being processed versus the total time it spends in the system. Research from software delivery analytics practitioners puts typical team flow efficiency somewhere between 10% and 25%. For every hour of active development work, three to nine hours pass where a feature is sitting somewhere: awaiting review, blocked on a dependency, queued for deployment, waiting for a test environment to become available.

If code writing represents, generously, 15% of a feature’s total lead time, and you make writing twice as fast, you reduce total lead time by 7.5%. This is arithmetic.
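The same arithmetic can be written as a small function, in the spirit of Amdahl's law. This is a minimal sketch: the name `lead_time_reduction` and its parameters are made up for illustration, and it assumes the other stages are unaffected by the speedup.

```python
def lead_time_reduction(stage_share: float, speedup: float) -> float:
    """Fraction of total lead time removed when one stage gets faster.

    stage_share: fraction of total lead time spent in the stage (0 to 1).
    speedup: how many times faster the stage becomes (2.0 = twice as fast).
    Assumes every other stage takes exactly as long as before.
    """
    if not 0.0 <= stage_share <= 1.0:
        raise ValueError("stage_share must be between 0 and 1")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return stage_share * (1.0 - 1.0 / speedup)


# Writing is 15% of lead time and becomes twice as fast.
print(f"{lead_time_reduction(0.15, 2.0):.1%}")  # 7.5%

# Even near-instant writing cannot remove more than the stage's own share.
print(f"{lead_time_reduction(0.15, 1e9):.1%}")  # 15.0%
```

The second call is the ceiling: however fast writing gets, the remaining 85% of lead time is untouched.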

Why Goldratt’s constraint matters here

Eliyahu Goldratt’s Theory of Constraints, developed in the 1980s for manufacturing, rests on a deceptively simple observation: every system has exactly one binding constraint at any given time, and improving anything other than that constraint does not improve throughput. Speeding up a step upstream of the bottleneck only builds inventory in front of it.

Apply this to a software team. If code review is the constraint, and you give developers AI tools that make them produce code twice as fast, PRs now arrive in the review queue twice as fast. The merge rate does not change. Lead time does not improve. Little’s Law says the average number of items in a queue equals the arrival rate multiplied by the average wait time, so a longer queue at an unchanged merge rate means a longer wait per PR. Queueing theory adds a harsher result: as a system approaches saturation, wait times grow much faster than arrival rates do. You may have made things measurably worse.
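A rough way to see the nonlinearity is the textbook M/M/1 queue, where mean time in the system is 1 / (service rate - arrival rate). The sketch below assumes a single review queue with random arrivals and random review times, which real teams only approximate; the rate of 10 reviews per day is a hypothetical figure, not a measurement.

```python
def mm1_time_in_system(arrival_rate: float, service_rate: float) -> float:
    """Mean time a PR spends waiting plus being reviewed in an M/M/1 queue.

    Both rates are in PRs per day. At or above saturation the queue
    grows without bound, so the mean time is infinite.
    """
    if arrival_rate >= service_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)


service_rate = 10.0  # hypothetical: reviews the team can finish per day

for arrival_rate in (5.0, 8.0, 9.0, 9.5, 10.0):
    days_per_pr = mm1_time_in_system(arrival_rate, service_rate)
    # Little's Law: average PRs in flight = arrival rate * average time in system.
    prs_in_flight = arrival_rate * days_per_pr
    print(f"{arrival_rate:>4} PRs/day -> {days_per_pr:.2f} days per PR, "
          f"{prs_in_flight:.1f} PRs in flight")
```

Under these assumptions, raising arrivals from 5 to 9 PRs per day, an 80% increase, takes the time per PR from 0.2 days to 1.0 day, a fivefold increase. At 10 per day the queue never drains.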

This is the failure mode lurking in uncritical adoption of AI coding tools. Organizations purchase subscriptions, productivity metrics around code written per hour improve, the review queue grows, PR merge times extend, and the connection between the tool rollout and the slower merges often goes unmade.

What the productivity data on AI tools actually shows

GitHub’s Copilot productivity research found that developers completed a single, well-defined programming task roughly 55% faster in a controlled experiment. That is a real, measurable gain at the task level.

But task completion speed and lead time are different things. Lead time starts when work is prioritized. It ends when the feature reaches users. Task completion speed measures a specific slice of that journey: the part where a developer is in an editor writing functions. AI tools accelerate the inner loop. The outer loop, where handoffs and queues live, is untouched.

There is also a less-discussed second-order effect. When individual developers write code faster, they often produce larger PRs, because they complete more work per session before pausing to submit for review. Google’s engineering practices documentation and Microsoft Research’s work on code review at scale both point toward PR size as a significant predictor of review time: larger changes take disproportionately longer to review and carry higher deployment risk. Tools that accelerate writing can push developers toward PR sizes that slow the review stage, which is typically already the constraint.

What the constraint usually is

DORA’s State of DevOps research consistently surfaces a set of practices that correlate with high delivery performance. The common constraints they implicate are:

Code review latency. Microsoft Research’s studies on code review in large organizations find median review response times around 24 hours, with significant long-tail delays. Review is partly a social coordination problem: reviewers are context-switching, PRs are large and hard to evaluate quickly, and ownership is diffuse. When a PR sits open in a shared queue, the same bystander dynamics that apply in other group contexts apply here. Everyone assumes someone else will pick it up.

Deployment pipeline friction. Build times measured in tens of minutes, manual production promotion gates, environment scarcity, and incomplete automated test coverage all extend the time between a merged PR and a running feature. This is infrastructure debt that accumulates quietly and rarely shows up in individual productivity metrics.

Batch size. Large batches of work take longer to review, carry higher deployment risk, and take longer to recover from when they fail. Teams with consistently short lead times tend to ship small, frequent changes. This is not a developer productivity habit; it is a queueing property. Smaller work items flow through constraints faster regardless of how quickly the code in them was written.

Requirements churn. Work that must be revisited because the spec was unclear does not just cost the development time; it resets the entire flow cycle, including every queue the work item had already cleared.

None of these respond to faster code writing.
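The batch size point above can be made concrete with a toy model. This sketch assumes one reviewer, review time strictly proportional to size, and no queueing delay between batches; `average_completion_time` is a made-up helper, and the proportionality assumption is generous to large PRs, since larger changes take disproportionately longer to review.

```python
def average_completion_time(batch_sizes, hours_per_unit=1.0):
    """Average hours until each unit of work clears a single reviewer.

    Batches are reviewed one after another, and every unit in a batch
    is finished only when the whole batch is finished.
    """
    clock = 0.0
    weighted_total = 0.0
    for size in batch_sizes:
        clock += size * hours_per_unit   # reviewer finishes this batch
        weighted_total += size * clock   # all its units complete at this time
    return weighted_total / sum(batch_sizes)


# Same eight units of work, same total review effort.
print(average_completion_time([8]))      # one large PR: 8.0
print(average_completion_time([1] * 8))  # eight small PRs: 4.5
```

The reviewer does the same eight hours of work in both cases. The small PRs simply stop holding finished work hostage to unfinished work.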

What I notice even working alone

When I’m building something for personal use, with no review queue to worry about and direct control over deployments, actual code writing is still a minority of the total time. Reading documentation, testing behavior interactively, working out the right abstraction before committing to it, reviewing my own work before deploying: these together consume more time than generating new lines in an editor does. And I have no handoffs.

For teams, every phase involves a handoff. The PR sits until someone picks it up. The reviewer requests changes and must then re-review. The deployment waits on a window or an approval. Each transition point is a potential queue, and queues are where time accumulates. Writing code faster just means reaching each queue sooner.

What actually moves lead time

The DORA capabilities model maps specific practices correlated with high delivery performance: trunk-based development, deployment automation, streamlined change approval processes that replace manual gates with automated verification, and thoughtful team topology design that manages cognitive load. These are organizational and process changes, not primarily tooling changes.

Measuring your actual lead time, breaking it into component stages, and finding where work reliably stalls gives you the information needed to intervene at the right point. Not where you imagine work gets stuck, but where it does, with data from your own pipeline. This usually means instrumenting your deployment pipeline, tracking PR open-to-merge time, and measuring deployment frequency directly.
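As a starting point, a stage breakdown needs nothing more than a timestamp per handoff. The sketch below uses invented timestamps and stage names for a single work item; in practice these would come from your issue tracker, version control host, and deployment logs, and you would aggregate over many items rather than one.

```python
from datetime import datetime

# Hypothetical event timestamps for one work item; names and dates are made up.
events = {
    "prioritized":  datetime(2024, 3, 4, 9, 0),
    "first_commit": datetime(2024, 3, 6, 10, 0),
    "pr_opened":    datetime(2024, 3, 7, 15, 0),
    "pr_merged":    datetime(2024, 3, 11, 11, 0),
    "deployed":     datetime(2024, 3, 13, 16, 0),
}


def stage_durations(events):
    """Hours between consecutive events, taken in chronological order."""
    ordered = sorted(events.items(), key=lambda item: item[1])
    return {
        f"{start} -> {end}": (end_time - start_time).total_seconds() / 3600
        for (start, start_time), (end, end_time) in zip(ordered, ordered[1:])
    }


durations = stage_durations(events)
total = sum(durations.values())

# Print stages from longest to shortest with their share of total lead time.
for stage, hours in sorted(durations.items(), key=lambda item: -item[1]):
    print(f"{stage:<28} {hours:6.1f} h  {hours / total:5.1%}")
```

Sorting by duration puts the candidate constraint at the top of the list. Wall-clock hours are used deliberately, because queue time does not pause overnight or at weekends.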

This is harder than buying tools. It requires changing how people coordinate, sometimes restructuring review responsibilities, and often confronting uncomfortable facts about deployment automation debt. The payoff, measured in lead time reduction, is substantially larger than any realistic gain from accelerating the writing phase.

Code writing speed responds to tooling. Delivery lead time responds to process change. Treating them as the same lever is how you end up with a faster-writing team that ships at the same cadence as before, paying for AI subscriptions that accelerated the one part of the system that was not holding anything back.