LLMs and the What/How Loop: Abstraction Has Always Been the Game

Software engineering has always been a negotiation between two levels of description: what a system should do, and how it should do it. Every major advance in the field, from structured programming to object-oriented design to declarative infrastructure, has been an attempt to raise the floor of what counts as a “what” and push the messy implementation details further down. A recent conversation between Martin Fowler, Unmesh Joshi, and Rebecca Parsons puts this framing at the center of how we should think about LLMs in software development, and it’s worth tracing that history carefully rather than accepting the framing uncritically.

The Long History of the What/How Divide

The separation shows up early. In the 1970s, structured programming gave us the idea that a function signature was a contract: the what, independent of the body’s how. Abstract data types formalized this into interfaces where callers only needed to know the surface, not the internals. By the time object-oriented languages became dominant, the interface/implementation split was baked into syntax itself.

Domain-specific languages pushed further. SQL let you express what data you wanted without specifying how the database engine should retrieve it. The query planner owned the how. XSLT let you describe a transformation declaratively. Make and later build systems let you describe dependency relationships, not execution order.
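The SQL case is easy to see in miniature. The sketch below uses Python's built-in sqlite3 module; the table, column names, and index name are invented for illustration. It asks the planner to describe how it would run one query, adds an index, and asks again. The query text never changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(1000)],
)

# The what: this text stays the same for the whole example.
QUERY = "SELECT name FROM users WHERE email = ?"
ARGS = ("user42@example.com",)


def plan(connection):
    """Return the planner's description of how it would run QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ARGS).fetchall()
    # The last column of each row is a human-readable description of one step.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)
result_before = conn.execute(QUERY, ARGS).fetchall()

# Change only the how: add an index. The query itself is untouched.
conn.execute("CREATE INDEX idx_users_email ON users (email)")

plan_after = plan(conn)
result_after = conn.execute(QUERY, ARGS).fetchall()

print(plan_before)  # a full table scan
print(plan_after)   # a search that uses idx_users_email
```

The exact wording of the plan text varies between SQLite versions, but the shape of the result does not: the same what, the same rows, and a different how.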

Model-Driven Architecture, or MDA, tried to take this to its logical extreme in the early 2000s. The idea was to write platform-independent models, the what, and then generate platform-specific code, the how, through automated transformation. It mostly failed, largely because the what-level models ended up encoding enough implementation detail that they stopped being genuinely abstract.

Behavior-Driven Development gave us another angle. Gherkin syntax, with its Given/When/Then structure, was a formal grammar for expressing the what of system behavior in a form that humans could read and tools could process. The intention was always that the step definitions, the how, would be written separately and could change without touching the scenarios.

Declarative infrastructure followed the same logic. A Kubernetes manifest describes the desired state: what you want to exist. The control plane figures out how to create or reconcile it. A Terraform configuration expresses what infrastructure should exist, and the provider plugins figure out the API calls required to make it so.
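The desired-state idea can be sketched in a few lines. This is a toy, not how Kubernetes or Terraform is implemented: the reconcile function and the resource dictionaries are hypothetical, and real controllers deal with ordering, retries, and partial failure. It only shows the division of labor, where the caller states what should exist and the reconciler derives the actions.

```python
def reconcile(desired, actual):
    """Compute the actions that would move `actual` toward `desired`.

    Both arguments map a resource name to its configuration dict.
    """
    actions = []
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name))
        elif actual[name] != spec:
            actions.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name))
    return actions


# The what: two services with the replica counts we want.
desired = {"web": {"replicas": 3}, "cache": {"replicas": 1}}
# The observed state of the (imaginary) cluster.
actual = {"web": {"replicas": 2}, "worker": {"replicas": 1}}

actions = reconcile(desired, actual)
print(actions)
# [('create', 'cache'), ('update', 'web'), ('delete', 'worker')]
```

Note that the caller never says "create" or "delete". Those verbs belong entirely to the how.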

Each generation raised the abstraction level. Each generation also discovered the limits of that raise.

What Each Attempt Left Unsolved

The pattern is consistent: the what representation eventually accrues how-level detail. SQL queries acquire hints. Terraform configurations acquire depends_on arguments for dependencies the planner could not infer. Kubernetes manifests grow initContainers and lifecycle hooks that exist purely to work around how the runtime behaves. The abstraction leaks.

This happens because the what-level language is designed, and then used, by people who already understand the how. When a SQL query performs badly, a developer who knows query execution adds an index hint. The hint is implementation knowledge encoded into the supposedly declarative layer. The what-level representation becomes a pidgin that mixes intent with execution detail.

BDD suffered from a related problem. The Given/When/Then scenarios were supposed to be written by business stakeholders who understood the what without knowing the how. In practice, developers wrote them, because writing good scenarios requires enough understanding of the system that the what/how boundary has already collapsed.

The Loop That LLMs Create

What changes with LLMs is not the existence of the what/how divide, but the dynamics of crossing it. Historically, moving from a what-level description to a working implementation required either a human developer or a code generator built on rigid templates. Both imposed friction. The human required context-setting; the template required that your what fit the generator’s model of the domain.

LLMs create an iterative cycle that previously didn’t exist at this speed. You express a what in natural language or near-natural language. The model generates a how. You inspect the generated implementation, notice where it diverges from your intent, and revise your what. The loop runs at conversational speed, which changes the economics of abstraction design.

Consider the difference in concrete terms. Here is an imperative approach to parsing a config file:

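def load_config(path):
    """Parse key=value lines into a dict of strings."""
    result = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments.
            if not line or line.startswith('#'):
                continue
            # partition splits on the first '='; a line without '=' yields
            # the whole line as the key and an empty string as the value.
            key, _, value = line.partition('=')
            result[key.strip()] = value.strip()
    return result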

A declarative configuration schema raises the level:

fields:
  - name: host
    type: string
    required: true
  - name: port
    type: integer
    default: 8080

An LLM-assisted what might be a natural-language description followed by iterative refinement: “Parse a config file where lines are key=value pairs, comments start with #, and the port field should default to 8080 if absent.” The model generates something resembling the first snippet. You look at it, realize you forgot validation, and refine your description. The cycle takes seconds.
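After one round of refinement, the regenerated how might look something like the sketch below. It is an illustration rather than the output of any particular model. The names SCHEMA, parse_pairs, and apply_schema are invented here, and the schema is written as Python data that mirrors the declarative fields above so the example needs no YAML library.

```python
# Mirrors the declarative schema: host is required, port defaults to 8080.
SCHEMA = [
    {"name": "host", "type": str, "required": True},
    {"name": "port", "type": int, "default": 8080},
]


def parse_pairs(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    pairs = {}
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"line {number}: expected key=value, got {line!r}")
        pairs[key.strip()] = value.strip()
    return pairs


def apply_schema(pairs, schema=SCHEMA):
    """Validate raw string pairs against the schema and fill in defaults.

    Keys that the schema does not mention are ignored in this sketch.
    """
    config = {}
    for field in schema:
        name = field["name"]
        if name in pairs:
            try:
                config[name] = field["type"](pairs[name])
            except ValueError:
                raise ValueError(f"{name}: cannot convert {pairs[name]!r}") from None
        elif field.get("required"):
            raise ValueError(f"{name}: required field is missing")
        else:
            config[name] = field.get("default")
    return config


config = apply_schema(parse_pairs("# server settings\nhost = example.org\n"))
print(config)  # {'host': 'example.org', 'port': 8080}
```

The validation rules live in the schema, not in the parsing loop, so a later change to the what (a new field, a different default) does not require touching the how.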

The key observation from the Fowler conversation is that this loop helps manage cognitive load specifically by letting you stay at the what level longer before committing to implementation. Systems that survive change tend to be ones where the what-level design was solid enough that implementation details could be swapped out. The LLM loop gives you more iterations to test whether your abstractions hold before you’re locked into a specific how.

The Risks of Natural Language as Spec

This is where the historical context matters as a corrective. Natural language is not a specification. It is a prompt. The difference is not semantic pedantry; it has concrete engineering consequences.

A SQL schema is a what that is also a formal artifact. It can be version-controlled, diffed, validated, and used to generate migration scripts. The meaning of a column definition is fixed by the database engine. When you write NOT NULL, that constraint is enforced regardless of what you intended.
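The enforcement is easy to demonstrate. A minimal sketch using Python's built-in sqlite3 module, with a made-up users table: the engine rejects the insert on its own, whatever the calling code meant to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")

try:
    # Attempt to store a row that violates the declared constraint.
    conn.execute("INSERT INTO users (email) VALUES (?)", (None,))
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print(f"rejected by the engine: {exc}")

# Nothing was stored, because the schema is the authority here.
row_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(row_count)  # 0
```

No prompt has this property. A sentence such as "email is mandatory" constrains nothing until some generated code happens to honor it.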

A natural language description is interpreted by a model that samples from a probability distribution over possible implementations. Two runs of the same prompt can produce different code. The “what” you expressed and the “how” that was generated can drift apart silently, with no compiler or runtime to catch the divergence. This is the failure mode that MDA never fully solved either, but MDA at least produced deterministic outputs from its transformations.

For domain model design and API design, this risk is especially pronounced. An API surface is a what that encodes contracts with callers. If the LLM-generated implementation of an endpoint doesn’t match the natural-language description you gave, the divergence may not surface until a client integration fails. The prompt is not a test. Generating tests from the same prompt that generated the implementation compounds the problem, since both artifacts inherit the same ambiguities.

What the Loop Is Good For

The productive use of the LLM what/how loop is in domains where you can quickly verify the generated how against ground truth that exists outside the loop. Data transformation is a good example: you express what transformation you want, generate an implementation, and run it against known inputs and outputs. The feedback is immediate and objective.
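A sketch of what "ground truth outside the loop" can look like in practice. Everything here is hypothetical: normalize_phone stands in for a generated transformation, and GOLDEN_CASES stands in for input/output pairs collected independently of the prompt, for example from production data or a domain expert.

```python
def normalize_phone(raw):
    """Stand-in for a generated transformation: keep digits only."""
    return "".join(ch for ch in raw if ch.isdigit())


# Known inputs and expected outputs, written without reference to the prompt.
GOLDEN_CASES = [
    ("(555) 010-0199", "5550100199"),
    ("555.010.0199", "5550100199"),
    ("+1 555 010 0199", "5550100199"),
]


def check(transform, cases):
    """Return (input, expected, actual) for every case the transform gets wrong."""
    failures = []
    for raw, expected in cases:
        actual = transform(raw)
        if actual != expected:
            failures.append((raw, expected, actual))
    return failures


failures = check(normalize_phone, GOLDEN_CASES)
for raw, expected, actual in failures:
    print(f"{raw!r}: expected {expected!r}, got {actual!r}")
# '+1 555 010 0199': expected '5550100199', got '15550100199'
```

The third case fails because the description never said what to do with a country code. The golden cases surface that gap in the what, which is the point: the failure feeds the next revision of the description.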

Test generation from existing code is another strong case. Here the what is the behavior of the existing implementation, the how is the test suite, and the LLM is navigating a known mapping. The generated tests can be run against the real code, which confirms that they capture its current behavior, though not that this behavior is what anyone originally intended.
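One way to picture this is a characterization test: record what the existing code does on a set of inputs, then replay those recordings later. In the sketch below, legacy_discount is an invented stand-in for existing code, and record and replay are hypothetical helpers rather than part of any test framework.

```python
def legacy_discount(total, is_member):
    """Stand-in for existing code whose behavior is the ground truth."""
    if is_member and total >= 100:
        return round(total * 0.9, 2)
    return total


def record(func, inputs):
    """Capture current behavior as (args, result) pairs."""
    return [(args, func(*args)) for args in inputs]


def replay(func, snapshot):
    """Return (args, expected, actual) for recorded cases that no longer match."""
    mismatches = []
    for args, expected in snapshot:
        actual = func(*args)
        if actual != expected:
            mismatches.append((args, expected, actual))
    return mismatches


INPUTS = [(50, True), (100, True), (100, False), (250, True)]
snapshot = record(legacy_discount, INPUTS)

# Replaying against the same code finds no differences.
print(replay(legacy_discount, snapshot))  # []
```

A snapshot like this pins down what the code does today, bugs included, so it guards against accidental change more than it validates intent.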

Domain modeling benefits most from the iterative speed when used to explore the design space rather than to produce final artifacts. Generating several candidate domain models from the same high-level description, then examining the trade-offs in each, compresses what used to be a multi-day whiteboarding exercise.

The deepest insight from the Fowler conversation is that the what/how divide is not a problem to be solved; it is a structural feature of complex systems. Every successful abstraction in software history has been a renegotiation of where that boundary sits. LLMs shift the economics of that renegotiation by making the boundary cheap to cross and re-cross. The engineering judgment required is the same as it has always been: knowing which level of the stack your problem actually lives at, and resisting the temptation to let the convenience of generation substitute for the rigor of specification.