
The Abstraction Loop LLMs Compress But Cannot Close

Source: martinfowler

The what/how divide in software is not a new observation. Structured programming texts from the 1970s routinely frame the central discipline as separating what a program should accomplish from how it accomplishes it. Dijkstra’s foundational work on structured programming treated this as the bedrock of program reasoning: to understand a system, you separate the contract from the mechanism. David Parnas formalized it with information hiding in his 1972 paper “On the Criteria to Be Used in Decomposing Systems into Modules”. Bertrand Meyer made it executable with Design by Contract. The names change, but the principle holds.

Martin Fowler, Unmesh Joshi, and Rebecca Parsons returned to this ground in a January 2026 conversation piece on martinfowler.com, framing LLMs as new participants in the what/how loop. Their framing centers on a familiar engineering concern: building systems that survive change requires managing cognitive load, and managing cognitive load requires clear mappings between the abstract intent of a domain and the concrete mechanisms that implement it. LLMs, in their reading, give us new tools for navigating that mapping.

That framing is right, but it leaves interesting territory unexplored. LLMs do not just navigate the what/how map. They change what the map looks like, and they make it easier to skip the parts of the loop that do the most work.

The Classical Loop

In traditional software design, the what/how distinction operates as a feedback mechanism. You begin with an abstract intention: “this service should return a sorted list of users by last login time.” To implement it, you are forced to make decisions. Do you sort in the database or in the application layer? How do you handle null timestamps? What is the performance contract for large datasets? Each decision refines your understanding of what you actually wanted. The implementation interrogates the specification.
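
As a minimal sketch of how the implementation interrogates the specification, the Python below sorts in the application layer. The `User` type, the `sort_by_last_login` name, the most-recent-first order, and the choice to place never-logged-in users last are all assumptions made for illustration. The one-line requirement settles none of them, which is the point.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class User:  # hypothetical shape, not taken from any real codebase
    name: str
    last_login: Optional[datetime]  # None means the user has never logged in


def sort_by_last_login(users: list[User]) -> list[User]:
    """Most recent login first; users who never logged in go last.

    Both ordering choices are assumptions the original requirement left open.
    """
    logged_in = [u for u in users if u.last_login is not None]
    never = [u for u in users if u.last_login is None]
    # list.sort is stable, so users with equal timestamps keep their input order.
    logged_in.sort(key=lambda u: u.last_login, reverse=True)
    return logged_in + never


users = [
    User("ana", datetime(2024, 5, 1)),
    User("bo", None),
    User("cy", datetime(2024, 6, 1)),
]
print([u.name for u in sort_by_last_login(users)])  # ['cy', 'ana', 'bo']
```

Splitting the list before sorting is one way to avoid comparing `None` with a `datetime`. Sorting in the database instead would raise the same question in a different form, since null ordering there is also a choice someone has to make.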

This feedback is not a side effect of implementation work; it is one of the main reasons that work is valuable. Test-driven development formalizes the same principle from the other direction: write the test first, which encodes the “what,” and let the act of making it pass reveal what the implementation needs to be. The loop between specification and implementation is how requirements get clarified and how designs stabilize.

Abstraction layers are the accumulated residue of many such loops. When you define an interface like this:

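// Contract only: callers depend on this, not on the storage mechanism behind it.
interface UserRepository {
  // Resolves to at most `limit` users who have been active since the given date.
  findRecentlyActive(since: Date, limit: number): Promise<User[]>;
}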

you are not just hiding implementation details. You are encoding the result of prior what/how negotiations. The method name says what it does. The parameters reflect constraints that emerged from implementation reality. The return type reflects the data shape that callers actually needed. Every design decision in that interface was, at some point, the resolution of a tension between intent and capability.

Where LLMs Enter

LLMs can generate both sides of this interface. Given the interface above, a capable model can produce a reasonable PostgreSQL implementation, handle common edge cases, and write test scaffolding. The acceleration is real for straightforward cases, and developers who use these tools daily know this well.

The problem is that generating the “how” from the “what” bypasses the loop. When a developer writes the implementation themselves, the friction of implementation feeds back into the design. Awkward method signatures get caught because they are awkward to implement. Missing parameters surface because the implementation needs them. You discover that findRecentlyActive needs a timezone parameter because the implementation makes that dependency obvious.
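
The timezone case can be sketched in a few lines of Python. The function name mirrors the interface above, but the in-memory `logins` mapping, the "at or after" comparison, and the decision to reject naive datetimes are hypothetical choices made for this example. Python refuses to compare naive and timezone-aware datetimes, so the missing decision surfaces as soon as the comparison is written.

```python
from datetime import datetime, timezone

# Hypothetical stored data: last-login timestamps kept as timezone-aware UTC values.
logins = {
    "ana": datetime(2024, 6, 1, 23, 30, tzinfo=timezone.utc),
    "bo": datetime(2024, 6, 2, 0, 30, tzinfo=timezone.utc),
}


def find_recently_active(since: datetime, limit: int) -> list[str]:
    """Return up to `limit` user names whose last login is at or after `since`."""
    if since.tzinfo is None:
        # The "what" never said which zone a bare date means. Writing the
        # comparison forces the question; this sketch rejects rather than guesses.
        raise ValueError("since must be timezone-aware")
    active = [name for name, ts in logins.items() if ts >= since]
    return sorted(active)[:limit]


# Comparing an aware datetime with a naive one raises TypeError,
# which is how the hidden dependency shows up during implementation.
try:
    logins["ana"] >= datetime(2024, 6, 2)
except TypeError as exc:
    print("comparison failed:", type(exc).__name__)

print(find_recently_active(datetime(2024, 6, 2, tzinfo=timezone.utc), limit=10))  # ['bo']
```

A developer writing this by hand hits the `TypeError` and has to go back to the interface. A developer accepting a generated version may never learn which answer was chosen on their behalf.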

When an LLM fills in the implementation, the feedback loop becomes optional. You can accept generated code without engaging with what it reveals about your specification. The implementation still carries the same hidden assumptions, but they sit inside generated code that the developer never fully negotiated with.

This connects to John Sweller’s cognitive load theory, which distinguishes intrinsic load (inherent complexity of the task), extraneous load (overhead from poor tooling or presentation), and germane load (the productive struggle that builds understanding and schema formation). LLMs are excellent at reducing extraneous load: the friction of typing boilerplate, looking up syntax, and remembering API signatures all decrease. But germane load does not disappear. It gets displaced to harder-to-notice places.

The Abstraction Quality Problem

There is a specific failure mode worth naming precisely. When LLMs generate implementation, the generated code tends to reflect the patterns most common in training data. For well-traveled paths, this produces good results. For domain-specific problems where the “what” is non-standard, the generated “how” often smuggles in assumptions that do not fit the domain.

Consider a billing system with non-standard proration rules. The correct implementation might look structurally similar to a standard proration function but differ in important edge cases. An LLM generating from a vague specification will produce the standard version. A developer who does not interrogate the generated implementation will ship the wrong behavior. The error is not in the generation; it is in the specification-to-implementation handoff, which is exactly where the loop is supposed to catch such mismatches.
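
To make the failure mode concrete, here is a sketch with two proration functions. Both are illustrative: the "standard" version prorates by actual days in the calendar month, and the "domain" version assumes a hypothetical contract that bills every month as 30 days. Neither rule comes from the Fowler piece or from any real billing system.

```python
import calendar
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def prorate_standard(monthly_fee: Decimal, start: date) -> Decimal:
    """Charge for the days remaining in the calendar month, start day included."""
    days_in_month = calendar.monthrange(start.year, start.month)[1]
    days_used = days_in_month - start.day + 1
    return (monthly_fee * days_used / days_in_month).quantize(CENT, rounding=ROUND_HALF_UP)


def prorate_30_day(monthly_fee: Decimal, start: date) -> Decimal:
    """Hypothetical domain rule: every month counts as 30 days.

    Days past the 30th are treated as day 30, so a start on the 31st bills one day.
    """
    day = min(start.day, 30)
    days_used = 30 - day + 1
    return (monthly_fee * days_used / 30).quantize(CENT, rounding=ROUND_HALF_UP)


fee = Decimal("90.00")
# Mid-April (a 30-day month): the two rules agree.
print(prorate_standard(fee, date(2024, 4, 16)), prorate_30_day(fee, date(2024, 4, 16)))
# Mid-February 2024 (29 days): they diverge.
print(prorate_standard(fee, date(2024, 2, 16)), prorate_30_day(fee, date(2024, 2, 16)))
```

The two functions return the same amount for the April date and different amounts for the February one. A test suite that only exercises a 30-day month would pass against either version, so the difference has to be found by reading the code against the domain rule.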

Eric Evans made the argument for domain-driven design partly on these grounds: language shapes what you can specify, and imprecise language in a domain model produces imprecise implementations no matter how carefully the code is written. LLMs do not change this principle; they amplify it, because a vague natural language prompt generates confidently structured code that looks more finished than it is. The code compiles, the basic tests pass, and the domain error lives quietly in the logic.

What Good Use Looks Like

The productive use of LLMs in the what/how loop is not to hand over the loop entirely, but to accelerate the parts of it that carry little domain-specific negotiation. Boilerplate code, test scaffolding, standard data transformations, and well-understood infrastructure plumbing can be generated without meaningful loss, because their correctness is easily verified against known patterns.

The developer’s role shifts toward specifying the “what” with enough precision that the generated “how” is interrogable. This means writing interfaces before asking for implementations. It means treating generated code not as a finished artifact but as a draft that needs to be read against the specification it claims to satisfy.

One practical pattern is to write the docstring before generating the implementation. This forces you to articulate the contract in natural language, which surfaces ambiguities before they become bugs:

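# Account and AllocationResult are domain types assumed to be defined elsewhere;
# the __future__ import keeps their annotations from being evaluated here.
from __future__ import annotations

from datetime import date
from decimal import Decimal


def allocate_credits(
    account: Account,
    amount: Decimal,
    effective_date: date,
) -> AllocationResult:
    """
    Allocate credits to the account, applying to the earliest
    unpaid invoice first. If effective_date falls within a prior
    billing period, recalculate all subsequent period balances.
    Prior-period allocations are flagged for audit and trigger
    a balance_recalculated event.
    """
    ...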

That docstring contains at least three non-obvious domain constraints: application order, retroactive recalculation, and audit flagging. A developer who reads the generated implementation and asks whether the retroactive recalculation branch is actually present is using the what/how loop correctly. A developer who accepts the implementation and moves on is not, regardless of whether the tests pass.
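
One way to make that reading mechanical is to check each clause of the docstring against observed behavior. The sketch below is a toy: `Invoice`, `Account`, and `AllocationResult` are hypothetical types, "prior billing period" is assumed to mean an effective date before `period_start`, and the `allocate_credits` body is a deliberately incomplete stand-in for a plausible generated draft, not a real implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal


@dataclass
class Invoice:  # hypothetical type
    due: date
    balance: Decimal


@dataclass
class Account:  # hypothetical type
    invoices: list[Invoice]
    period_start: date  # first day of the current billing period
    events: list[str] = field(default_factory=list)


@dataclass
class AllocationResult:  # hypothetical type
    applied: Decimal
    flagged_for_audit: bool


def allocate_credits(account: Account, amount: Decimal, effective_date: date) -> AllocationResult:
    """Stand-in for a generated draft: it covers the first clause of the contract only."""
    remaining = amount
    for invoice in sorted(account.invoices, key=lambda i: i.due):
        paid = min(invoice.balance, remaining)
        invoice.balance -= paid
        remaining -= paid
    return AllocationResult(applied=amount - remaining, flagged_for_audit=False)


def contract_gaps(account: Account, result: AllocationResult, effective_date: date) -> list[str]:
    """List the prior-period clauses of the docstring that the observed behavior misses."""
    gaps = []
    if effective_date < account.period_start:
        if not result.flagged_for_audit:
            gaps.append("prior-period allocation not flagged for audit")
        if "balance_recalculated" not in account.events:
            gaps.append("balance_recalculated event not emitted")
    return gaps


account = Account(
    invoices=[Invoice(date(2024, 3, 1), Decimal("50")), Invoice(date(2024, 2, 1), Decimal("30"))],
    period_start=date(2024, 4, 1),
)
result = allocate_credits(account, Decimal("40"), effective_date=date(2024, 3, 15))
print(result.applied)
print(contract_gaps(account, result, date(2024, 3, 15)))
```

The draft applies the full credit to the earliest invoice first, so a test of application order passes. The gap check still reports both prior-period clauses as missing, which is the kind of finding that only appears when the code is read against the contract rather than against the happy path.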

Fowler has written separately about domain-specific languages as a tool for making the “what” explicit and composable. The idea translates: any mechanism that forces you to articulate intent before generating implementation keeps the loop alive. The specific mechanism matters less than the discipline.

Surviving Change

This is where the framing from the Fowler, Joshi, and Parsons conversation returns. Building systems that survive change requires good abstractions, and good abstractions require the what/how loop to do its work. LLMs change what that work looks like in practice. The extraneous load of writing implementation from scratch decreases. The cognitive demand of specifying intent precisely, reviewing generated output against domain constraints, and preserving the loop deliberately must increase to compensate.

The what/how loop has not been eliminated. It has been compressed and made easier to skip. The developers who use these tools well will be those who recognize that the loop still does necessary work and build habits that preserve it: writing contracts before implementations, reading generated code against domain requirements rather than just running tests, and treating the act of specification as the primary design discipline rather than a preliminary to “real” coding.

The abstraction quality of software written over the next several years will reflect whether the teams building it managed that correctly.
