Code as a Thinking Tool: Why Source Won't Disappear in the Agent Era

Martin Fowler published a piece by Unmesh Joshi called What is Code, and it lands in the middle of a debate that has been bubbling all year. If agents can produce working software from a prompt, do we still need source code? Joshi’s answer is that code does two jobs at once, and only one of them is being automated. The other job is the reason programming languages exist in the first place, and it is not going anywhere.

The two jobs are easy to state. Code is a set of instructions a machine executes, and code is a model of the problem domain that humans reason about. The compiler cares about the first. Every other person who touches the system, including the author six months later, cares about the second. Joshi traces this back to the earliest days of computing, when Grace Hopper’s A-0 compiler in 1952 introduced the idea that humans could write something more abstract than machine words and have a tool translate it down. The whole history of programming languages since then has been a slow climb up the ladder of expressiveness, not because machines needed it, but because people did.

The conceptual model is the point

The clearest articulation of this view is still Peter Naur’s 1985 essay Programming as Theory Building. Naur argued that the real artifact of a programming project is not the source code but the theory in the programmers’ heads about why the code is shaped the way it is. The source is a partial projection of that theory. When the team dissolves and the theory is lost, you can still run the program, but you cannot meaningfully evolve it. Anyone who has inherited a working system with no original authors available has felt this directly.

Eric Evans took the same idea in a different direction with Domain-Driven Design, where the Ubiquitous Language is the explicit project of making the code speak in the vocabulary of the business. The point is not pretty naming. The point is that when an underwriter says claim and the code says Claim, the gap between the business and the system shrinks, and so does the chance that a future change will violate an invariant nobody wrote down.

This is why programming languages keep evolving features that have no impact on what the machine does. Rust’s borrow checker compiles away. TypeScript’s types are erased. Pattern matching in Scala or OCaml could always be expressed as nested conditionals. None of these features make programs faster or smaller. They make the conceptual model legible. Sum types let you say a Payment is either Pending, Settled, or Refunded and have the compiler check that you handled every case. The machine never needed that. You did.
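To make the sum-type point concrete, here is a minimal TypeScript sketch. The three states come from the paragraph above; the field names (amountCents, settledAt, reason) and the summarize function are assumptions invented for illustration, not taken from Joshi’s essay.

```typescript
// Hypothetical domain model: a Payment is exactly one of three states.
// Field names are illustrative assumptions.
type Payment =
  | { kind: "Pending"; amountCents: number }
  | { kind: "Settled"; amountCents: number; settledAt: Date }
  | { kind: "Refunded"; amountCents: number; reason: string };

function summarize(payment: Payment): string {
  switch (payment.kind) {
    case "Pending":
      return `pending: ${payment.amountCents} cents`;
    case "Settled":
      // Only the Settled variant carries settledAt, so it is safe to read here.
      return `settled on ${payment.settledAt.toISOString().slice(0, 10)}`;
    case "Refunded":
      return `refunded (${payment.reason})`;
    default: {
      // If a fourth variant is added to Payment and not handled above,
      // `payment` is no longer of type `never` and this line fails to compile.
      const unreachable: never = payment;
      return unreachable;
    }
  }
}
```

All of the type annotations are erased before the program runs. The exhaustiveness check exists only for the reader and the compiler, which is the point being made.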

What LLMs are good at, and what they are not

The argument that source code becomes irrelevant tends to lean on a particular framing: if I can describe what I want and the agent produces something that works, the intermediate text is just scaffolding. Andrej Karpathy’s much-cited tweet about English becoming the hottest new programming language captures the optimistic version. Matt Welsh has gone further, arguing that programming as a discipline is ending.

The empirical record so far is more cautious. The 2024 DORA report found that AI adoption correlated with higher individual productivity but lower software delivery throughput and stability, which the authors suggested may come from AI making it easier to produce larger batches of code that then have to be reviewed, tested, and maintained. GitClear’s 2024 analysis of roughly 153 million changed lines of code showed a rise in copy-paste and churn since Copilot’s release and a fall in refactoring. METR’s July 2025 study of experienced open-source developers was the one that surprised people the most: developers using AI tools took 19 percent longer on tasks in repositories they knew well, even though they believed they had been faster.

These numbers do not say agents are useless. They say that generating code is the easy part. Verifying that the generated code matches a coherent mental model of the system is where the time goes, and that model is exactly what Joshi’s second purpose is about. When you delete the source and keep only the prompt, you have not removed the conceptual model; you have just made it implicit and uncheckable.

The interesting middle

The genuinely interesting question is what role source code plays when an agent is doing most of the typing. Two patterns are emerging.

The first is what Simon Willison has been calling vibe coding, borrowing Karpathy’s phrase. You describe behavior, the agent produces code, you run it, you iterate. For throwaway scripts and prototypes this is genuinely transformative. The source still exists, but nobody reads it, and that is fine because the artifact is disposable. Joshi’s two purposes collapse into one: the code only needs to instruct the machine, because no human will ever build a mental model from it.

The second pattern is the opposite, and it is what production engineering still looks like. The agent produces a draft, the developer treats it as a proposal, and the work is in revising the draft into something that fits the existing model of the system. Anthropic’s own engineering posts on Claude Code emphasize this: the value of CLAUDE.md, of careful prompting, of small reviewable diffs, all of it is about keeping the human in the loop on the conceptual model while delegating the mechanical work. The source code is read more carefully than ever, because it is now the artifact you use to check whether the agent understood what you meant.

Higher-level languages, not no languages

If you take Joshi’s framing seriously, LLMs are not eliminating source code; they are functioning as a new kind of compiler from a higher-level specification. The specification happens to be in English, which is ambiguous, context-dependent, and impossible to typecheck. That is a real problem. Every wave of higher-level abstraction in the history of programming has had to invent a way to make the higher level precise enough to be useful. Assembly to C, C to Java, Java to Haskell: each step added expressiveness and also added a more rigorous way to say what you meant.

The parallel today is the rise of formal specification languages used as agent input. TLA+ and Alloy have always been niche, but tools like Lean for proof-driven development and the renewed interest in property-based testing point at where the precision has to come from. If the agent’s output is going to be trusted without close reading, the input has to encode the conceptual model unambiguously, and natural language alone does not do that.
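As a rough illustration of what a property adds over a prose instruction, here is a hand-rolled property check in TypeScript. Everything in it is an assumption made for the example: the applyRefund function, its clamping rule, and the tiny random generator. A real project would more likely use an established property-based testing library than this loop. The property says that a refund is never negative, never exceeds the settled amount, and that refund plus remainder always equals the original.

```typescript
// Hypothetical shape of a refund function; names are illustrative assumptions.
type RefundFn = (settledCents: number, requestedCents: number) => {
  refunded: number;
  remaining: number;
};

// Hypothetical implementation under test: clamps the request to [0, settled].
const applyRefund: RefundFn = (settledCents, requestedCents) => {
  const refunded = Math.min(Math.max(requestedCents, 0), settledCents);
  return { refunded, remaining: settledCents - refunded };
};

// Small deterministic linear congruential generator so runs are repeatable.
function makeRng(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 0x100000000; // value in [0, 1)
  };
}

// Returns null if the property held on every generated case,
// otherwise a description of the first counterexample found.
function checkRefundProperty(fn: RefundFn, runs: number, seed: number): string | null {
  const rng = makeRng(seed);
  for (let i = 0; i < runs; i++) {
    const settled = Math.floor(rng() * 1000000);
    // Deliberately includes negative requests and over-refunds.
    const requested = Math.floor(rng() * 2000000) - 500000;
    const { refunded, remaining } = fn(settled, requested);
    const holds =
      refunded >= 0 && refunded <= settled && refunded + remaining === settled;
    if (!holds) {
      return `counterexample: settled=${settled}, requested=${requested}`;
    }
  }
  return null;
}
```

Calling checkRefundProperty(applyRefund, 1000, 42) returns null, because the clamped version satisfies the property by construction. Passing an implementation that forgets to clamp should instead return a counterexample string. The property is a statement about the domain that a person can read and a machine can check, which an English sentence in a prompt cannot offer.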

Joshi’s piece does not predict where this ends up, and neither will I. The point worth holding onto is that source code was never primarily about telling the CPU what to do. Machine code was always enough for that. Source code is the medium in which we work out what the system means, and that work does not go away when the typing is automated. It just moves.

If you want the original argument in Joshi’s own words, the essay is worth reading in full. The bibliography alone is a tour of the ideas that shaped how working programmers think about their craft, and the discipline of returning to those ideas is more useful right now than another round of predictions about what agents will replace next year.