The Developer Backlog Problem That AI Finally Changed

Source: simonwillison

The activation energy problem is something every developer with more ideas than time understands. You accumulate a list, mental or written, of tools you have been meaning to build for years: a small script to parse a particular log format, a personal dashboard that surfaces the data you care about most, a command-line utility that would save you twenty minutes a week. Most of these never get built, because the cost of starting outweighs the perceived benefit of finishing.

Simon Willison’s recent reflection on eight years of wanting and three months of building with AI names this problem precisely and documents what it looks like when the threshold finally drops. Willison, the co-creator of Django and the creator of Datasette, has spent years writing carefully about practical LLM usage. His framing is worth examining closely, because it is not primarily about speed or automation. It is about a different kind of cost.

The Backlog Is a Threshold Problem

When developers talk about things they want to build, they are usually not talking about projects they lack the technical ability to complete. They are talking about projects where the upfront cost of scaffolding, configuration, and the first few hundred lines of boilerplate exceeds the energy available on a Tuesday evening. The idea is real, the skills are there, but the cost of getting from zero to a running prototype is high enough that it keeps getting deferred.

AI coding assistants have changed this calculation in a specific way: they reduce the cost of the first hour. For a new project, that first hour is often the highest-friction part, covering project structure, dependency choices, and the initial scaffolding that a framework does not provide automatically. An experienced developer working with Claude or GPT-4 can move from a description of intent to a running skeleton in minutes, and that compression in startup cost is what moves the threshold for many long-deferred projects.

Willison’s LLM CLI tool is a clean example of this kind of project. It lets you pipe prompts to language models directly from the terminal, logs prompts and responses to a local SQLite database by default, and has grown to support dozens of model providers through a plugin system. It is narrow in scope, clear in purpose, and genuinely useful, which is precisely the profile of a project that would otherwise wait indefinitely for a sustained block of time to materialize.
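
To make the SQLite logging idea concrete, here is a minimal sketch of recording prompt and response pairs in a local database using only the Python standard library. It is an illustration of the pattern, not the LLM tool's actual implementation: the table name `exchanges`, its columns, and the helper functions `open_log`, `log_exchange`, and `search_prompts` are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a log database. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS exchanges (
            id INTEGER PRIMARY KEY,
            model TEXT NOT NULL,
            prompt TEXT NOT NULL,
            response TEXT NOT NULL,
            created_at TEXT NOT NULL
        )
        """
    )
    return conn


def log_exchange(conn: sqlite3.Connection, model: str, prompt: str, response: str) -> int:
    """Store one prompt/response pair and return its row id."""
    cursor = conn.execute(
        "INSERT INTO exchanges (model, prompt, response, created_at) VALUES (?, ?, ?, ?)",
        (model, prompt, response, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()
    return cursor.lastrowid


def search_prompts(conn: sqlite3.Connection, term: str) -> list[tuple[int, str, str]]:
    """Return (id, model, prompt) for every logged prompt containing `term`."""
    rows = conn.execute(
        "SELECT id, model, prompt FROM exchanges WHERE prompt LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return rows.fetchall()
```

The appeal of the pattern is that every past exchange becomes queryable with ordinary SQL, so a prompt that worked months ago can be found again instead of being rewritten from memory.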

The tool itself surfaces the workflow that makes it valuable. A command like:

llm 'Write a Python script that watches a directory for new JSON files and posts each one to a webhook' > watcher.py

gets you a draft in seconds. The draft is not always correct, but it eliminates the blank-page problem and gets you to a reviewable starting point rather than an empty file. For experienced developers who can evaluate the output quickly, this changes the calculus on what feels worth starting.
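
For a sense of what such a draft might look like after a review pass, here is a minimal sketch using only the Python standard library. It is not actual model output. It assumes simple polling rather than filesystem events, and the function names and the two-second default interval are hypothetical choices made for illustration.

```python
import json
import time
import urllib.request
from pathlib import Path


def find_new_json_files(directory: Path, seen: set[str]) -> list[Path]:
    """Return JSON files in `directory` not yet in `seen`, recording them as seen."""
    new_files = []
    for path in sorted(directory.glob("*.json")):
        if path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files


def post_json(path: Path, webhook_url: str) -> int:
    """POST the file's contents to the webhook and return the HTTP status code."""
    # Parsing first means an invalid file raises here instead of being sent.
    payload = json.loads(path.read_text(encoding="utf-8"))
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


def watch(directory: Path, webhook_url: str, interval_seconds: float = 2.0) -> None:
    """Poll `directory` forever, posting each JSON file once when first seen."""
    seen: set[str] = set()
    while True:
        for path in find_new_json_files(directory, seen):
            status = post_json(path, webhook_url)
            print(f"posted {path.name}: HTTP {status}")
        time.sleep(interval_seconds)
```

Calling `watch(Path("./incoming"), "https://example.com/webhook")` with a real directory and URL would start the loop. The review questions are the ones a first draft tends to skip: files already present at startup are posted on the first poll, a file that is still being written may be read half-finished, and a failed POST stops the loop rather than retrying.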

Why Experience Amplifies the Benefit

There is a pattern among developers who report the highest productivity gains from AI coding tools, and it correlates with experience rather than with enthusiasm for the tools themselves.

Andrej Karpathy coined the term vibe coding in February 2025 to describe coding by describing intent to an LLM and accepting the output without deeply reviewing it. For prototyping or throwaway scripts, this approach is reasonable. For production code you are responsible for maintaining, it accumulates a different kind of risk, because the errors in generated code tend to be subtle rather than obvious.

Willison has been consistent in his own writing that he stays in control of everything that gets committed. He reads the generated code and understands it before it enters the codebase. The experience is closer to having a pair-programmer who handles the drafts while you review, steer, and make the consequential decisions than it is to accepting machine output wholesale. The distinction matters because the value of AI assistance scales with your ability to evaluate what it produces. An experienced developer can spot a subtly wrong data model or a missing authorization check in generated code within seconds. Someone without that foundation may not catch it.

The tools amplify what you bring to them, and what Willison brings is substantial. Eight years of ideas built in three months reflects an experienced developer working fluidly with AI tools; it is not evidence that AI has replaced developer judgment. The judgment about what to build, how to structure it, and whether the generated code is correct remains human.

What Kinds of Projects Benefit Most

The projects that sit in backlogs longest tend to share a profile: narrow scope, clear purpose, genuine personal utility, but enough bootstrapping cost to prevent starting on a whim. Typical examples are data exploration tools, personal automation scripts, small web apps that surface an API in a more convenient form, and command-line utilities for a specific workflow. These are exactly the projects where AI assistance compresses build time the most, because the requirements are clear enough to prompt effectively and the scope is small enough to review completely.

I have experienced this while building Discord bots. A bot feature that would have required a focused afternoon of reading documentation and writing boilerplate now takes thirty to forty-five minutes of iterating with Claude. The discord.js API surface, the slash command registration pattern, the permission model: I know these well enough to verify generated code quickly, and the first draft is usually close enough to work from. That compression changes what feels worth attempting. Features that once felt like a project now feel like an evening.

Willison’s case is scaled up but structurally similar. His tools, including sqlite-utils and the growing Datasette plugin ecosystem, represent years of careful accumulated work. AI assistance does not replace that foundation; it lets him spend less time on the parts that do not require his specific expertise, which frees up capacity for the parts that do.

The Open Source Implication

There is a broader implication for open source that has not fully played out yet. A significant fraction of useful open source tools were never built because the person who would have built them ran out of time or motivation at the scaffolding stage. If AI assistance meaningfully lowers the activation energy for starting and completing small projects, the long tail of useful-but-never-built tools gets shorter.

This already appears to be visible in the number of small, focused GitHub repositories that have appeared over the past year, many of which would probably not exist without AI assistance. Quality varies considerably, but the volume suggests that real developer backlogs are being drained.

The caution is that quality still requires sustained attention. Generated code that nobody reviews carefully accumulates its own form of technical debt. The developers shipping reliable work with AI assistance are treating it as a drafting tool, not a final answer, and they have enough background to know the difference between a plausible-looking implementation and a correct one.

What Eight Years Actually Means

Eight years of wanting is a specific kind of frustration. You know exactly what you would build, and you could describe the architecture in a few minutes; the ideas are not fuzzy or half-formed. They are sitting there, waiting for a block of time that other priorities keep consuming.

Three months of building is what happens when the activation energy drops enough that evenings and weekends become viable build windows. Willison’s account is valuable precisely because it is specific. He is not describing a general improvement in developer productivity; he is describing what it looks like when a concrete backlog of deferred ideas becomes achievable because the cost structure changed.

That is the frame that makes AI coding tools genuinely interesting, separate from the hype around autonomous agents and fully automated pipelines. The near-term value is quieter: the scripts that finally got written, the tools that finally got built, the projects that stopped waiting for a better time to start.
