
What Actually Ends When Programming Ends


Simon Willison published “Coding After Coders: The End of Computer Programming as We Know It” yesterday, and it is the most precise articulation I have seen of something that has been half-formed in my head for at least eighteen months. Willison is a reliable thinker on this: he co-created Django, he built Datasette, and he has spent years writing carefully about what LLMs actually are and what they actually do rather than what the press releases say. When he says something about where programming is going, the argument is worth reading closely.

His thesis, roughly: we are approaching the end of programming as a profession centered on writing code. The bottleneck is moving from syntax and keystrokes to something higher up the stack, something closer to specification and judgment. The coder who types is being displaced by the coder who directs.

I think that is mostly right, but the framing needs unpacking, because the word “programming” covers such a wide surface area that collapsing it into a single trajectory obscures more than it reveals.

The Abstraction Ladder Has Always Had This Shape

Every major transition in programming history was someone arguing that the previous layer would disappear. Assembly programmers worried that C would make real programmers unnecessary. C programmers worried that managed languages would produce developers who didn’t understand the machine. The web era produced a generation of developers who had never thought about memory management, and the world did not end.

What happened in each case was not that the prior layer disappeared. It became smaller, more specialized, and more valuable per practitioner. There are fewer assembly programmers now than in 1985, but the ones who exist are doing work that nothing else can do. The abstraction stacks up; the floor doesn’t fall out.

AI-assisted coding is the next rung. The question is which parts of the work are being abstracted away and which parts are being surfaced as the real constraint.

What I See From Two Sides of the Stack

I maintain Ralph, a Discord bot that has grown from a simple command handler into something with autonomous workflows, event-driven processing, and scheduled tasks. The high-level work on Ralph (writing commands, wiring up API calls, structuring message handlers) is already heavily AI-assisted. Claude Code generates the scaffolding; I review, steer, and correct. The keystrokes are not where the time goes. The time goes into specifying what I want precisely enough that the generated code is right, into reading the output carefully enough to catch the places where it looks right but isn’t, and into debugging when the two don’t match.

That work is genuinely different from what programming felt like five years ago. But it is still programming. The mental model of what the code should do, the judgment about whether it actually does that, the debugging intuition that says “this failure pattern looks like a race condition, not a logic error” — none of that is delegated to the tool. The tool writes code faster than I can; I still have to know what code is worth writing.

On the other side, I spend time in systems programming territory: Rust, memory models, concurrency primitives, unsafe blocks. This work has also been changed by AI tools, but far less dramatically. When I am reasoning about whether a particular lifetime annotation is correct, or about whether an unsafe block genuinely upholds the invariants that Rust’s type system is relying on, the model helps but it does not lead. The probability of a plausible-looking wrong answer is too high, and the cost of a wrong answer is too serious. Understanding what the machine is actually doing is not optional here, and that understanding cannot be offloaded.

```rust
// This compiles and looks reasonable,
// but is only correct if the caller upholds the contract
// that `data` outlives the returned reference.
unsafe fn get_ref<'a>(data: *const u8) -> &'a u8 {
    &*data
}
```

A model will generate this. A model will also explain why it is sound under certain conditions. Knowing whether those conditions actually hold in your specific context requires understanding the surrounding code, the allocation lifetimes, and the concurrency guarantees in ways that cannot be fully expressed in a prompt.
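The contrast is easiest to see against a safe counterpart. Here is a minimal sketch (the function name `first_byte` is mine, not from Willison's piece): tying the output lifetime to a borrow moves the contract into the type system, so the compiler checks what the unsafe version merely documents.

```rust
// Safe counterpart: the returned reference borrows from `data`,
// so the compiler enforces that `data` outlives it. No contract
// is left for the caller to uphold on trust.
fn first_byte(data: &[u8]) -> Option<&u8> {
    data.first()
}

fn main() {
    let buf = vec![1u8, 2, 3];
    assert_eq!(first_byte(&buf), Some(&1));
    // With this signature, using the reference after `buf` is
    // dropped is a compile error rather than undefined behavior.
}
```

When a safe encoding like this exists, the judgment call disappears. The unsafe version is only justified when it does not, and deciding that is exactly the part a prompt cannot carry for you.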

The Split Willison Implies but Doesn’t Name

Willison’s piece gestures at a distinction without making it fully explicit: there is a category of programming where the hard part was always the code itself, and a category where the hard part was always the specification of what the code should do. AI tools collapse the cost of the first category dramatically. They do much less to the second.

Most web development, most glue code, most business logic, most CRUD scaffolding: these were always specification-constrained work wearing a code-writing costume. The code was not the interesting part. Getting the behavior right, handling the edge cases, making the system do what the business actually needed, that was the interesting part. AI-assisted tools strip the costume and expose the real work.

Systems programming, embedded development, compiler work, kernel code, protocol implementations: here the code is often the hard part. The behavior you want is sometimes expressible in a sentence. Getting the machine to actually do that, correctly, under all conditions, with the right performance and safety properties, is the technical problem. These domains are harder to delegate because the domain knowledge required to evaluate the output is the same domain knowledge required to produce it.

This does not mean AI tools are useless in systems contexts. Aider, Cursor, and Claude Code are all useful for drafting, for generating test cases, for suggesting approaches. But the review burden is higher and the tolerance for plausible-but-wrong is lower. You cannot accept “it compiles and the tests pass” as sufficient evidence of correctness when the correctness criteria include things the test suite cannot capture.
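A concrete sketch of that failure mode (my example, not Willison's): code that publishes a value with relaxed atomic ordering. Every sequential test passes, yet on weakly ordered hardware another thread can observe the flag before the value it guards.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

// Plausible-but-wrong: Relaxed ordering on a publish flag. A
// sequential test cannot distinguish this from the correct version.
fn publish(value: &AtomicU64, ready: &AtomicBool) {
    value.store(42, Ordering::Relaxed);
    ready.store(true, Ordering::Relaxed); // should be Ordering::Release
}

fn try_read(value: &AtomicU64, ready: &AtomicBool) -> Option<u64> {
    if ready.load(Ordering::Relaxed) { // should be Ordering::Acquire
        Some(value.load(Ordering::Relaxed))
    } else {
        None
    }
}

fn main() {
    let value = AtomicU64::new(0);
    let ready = AtomicBool::new(false);
    assert_eq!(try_read(&value, &ready), None);
    publish(&value, &ready);
    // Passes here, and in any single-threaded test suite. The
    // ordering bug only surfaces under real contention, on hardware
    // (ARM, not x86) where relaxed stores can be reordered.
    assert_eq!(try_read(&value, &ready), Some(42));
}
```

This is the sense in which the correctness criteria exceed what the test suite can capture: the bug lives in the memory model, not in any observable output the tests measure.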

What Survives

Willison is right that something is ending. The programmer whose value was primarily in the ability to produce syntactically correct code in a particular language, to hold the API surface of a framework in memory, to type faster and look things up more efficiently than their colleagues, is being automated away in the same way that calculation was automated away. That was always a thin basis for a career, and it is getting thinner.

What survives is the capacity to specify systems precisely enough that generated code can be evaluated against the specification. It is debugging intuition: the ability to look at a failure and reason backward to the cause, knowing which hypotheses are worth testing and in what order. It is the domain knowledge to recognize when generated output is subtly wrong, the kind of wrong that passes tests and looks fine and fails under production conditions you hadn’t anticipated.

In systems programming, what survives is also the model of what the machine is actually doing. Not the syntax. Not the API. The mental model of memory layout, of how the CPU pipeline behaves under branch misprediction, of what the borrow checker is actually checking and why. The MiniRust project is a formal specification of a significant fragment of Rust’s memory model. Knowing that it exists, knowing roughly what it says, and knowing when to consult it is part of what it means to write correct unsafe Rust. A language model cannot substitute for that.

The Redox OS project banned LLM-generated contributions outright for precisely this reason: in a security-focused microkernel, plausible-but-wrong is a threat model, not just an inconvenience. That policy would make no sense for a Discord bot. It makes complete sense for a kernel. The difference is what kind of work the code actually is.

The Honest Version

The honest version of Willison’s argument is not that programming is ending. It is that the population of people whose job was primarily keystroke production is shrinking, and the population of people whose job is knowing what to build and whether it was built correctly is what remains. Those people will write less code themselves. They will need to understand code more deeply than most keystroke-producers ever did, because they will be responsible for output they did not personally generate line by line.

For someone who works across both ends of the stack, that transition feels less like an ending and more like a sharpening. The work that was always the real work is now more visibly the real work. The scaffolding that used to hide it is thinner. That is uncomfortable if the scaffolding was where your value lived. It is clarifying if you were always most interested in what sat underneath it.
