Erik Doernenburg’s January 2026 assessment of the impact of coding agents on CCMenu adds a specific finding to what is becoming a familiar conversation: AI agents produce code that works but is structurally worse than what a careful human developer would write. The violations he found in CCMenu were concrete: coupling introduced across layers that had been deliberately kept separate, logic duplicated across the codebase rather than reused, responsibilities piled into existing classes rather than distributed appropriately.
Most of the discussion about findings like this focuses on detection, on how to catch structural violations after the agent produces them. The harder question is why the violations happen in the first place. The answer points to something upstream of review and tooling.
What a Prompt Specifies
When a developer asks a coding agent to add a feature, they write a functional specification. They describe the behavior they want. They might provide context files, explain the domain, describe edge cases. What they almost never provide is a structural specification: where this code should live, which existing abstractions it should extend or reuse, which architectural boundaries it must not cross, what the resulting shape of the code should look like when it’s done.
This isn’t an accident. Functional specification is what developers have practice writing. Bug reports, feature tickets, user stories: all of them describe behavior. The structural side of a specification is the part that has historically been communicated through code review, pairing, architectural documentation, and the accumulated conventions of a codebase. It was never codified as a prompt-shaped artifact, because there was no prompt.
A typical agent prompt for a new feature looks like:
```
Add support for parsing GitHub Actions workflows.
The app should fetch the status of runs from a repo URL
and display it in the menu bar alongside existing CI results.
```
This prompt contains no information about where in the existing module structure feed parsers live, whether there is an existing protocol for parsers to conform to, how existing parsers handle network errors, which layer converts raw API responses into the domain model, or whether networking concerns belong in the same component as parsing. The agent answers all of these questions implicitly, by making choices that feel locally coherent with whatever context it can see. If a model file is nearby, it may put logic there. If a view model is accessible, it may route through that. The choice is driven by what is nearest in the context window, not by what the architecture requires.
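For contrast, a structurally specified version of the same prompt might look like the sketch below. The paths and the `FeedParser` protocol name are illustrative assumptions about the project's layout, not CCMenu's actual structure:

```
Add support for parsing GitHub Actions workflows.
The app should fetch the status of runs from a repo URL
and display it in the menu bar alongside existing CI results.

Structural constraints:
- Add the parser in Sources/Services/Feeds/, conforming to the
  existing FeedParser protocol; do not create a new parsing abstraction.
- Reuse the existing networking client in the Services layer; do not
  add networking code to model types or views.
- Convert raw API responses into the domain model in the service
  layer, following the pattern of the existing parsers.
```

The functional half is unchanged; the second half answers exactly the questions the agent would otherwise answer implicitly.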
Coherence vs. Consistency
Coding agents are genuinely good at local coherence: code that makes sense on its own terms, follows consistent naming within the file it occupies, handles the immediate problem correctly. What they do not reliably produce is consistency with the broader codebase, because that requires knowledge that isn’t local.
Fowler’s design stamina hypothesis frames internal quality as an investment in future change speed. The violations Doernenburg found are all violations of consistency, not coherence. The new code was coherent; it fit its immediate context well enough to pass review and tests. The violations only became visible when someone compared it against the broader system’s conventions.
This distinction matters for how you think about the problem. Asking an agent to write more coherent code is a feedback problem: show it examples, give it linting rules, run its output through static analysis. Asking an agent to write code consistent with architectural conventions is a specification problem. The agent can only be consistent with information it has. If structural conventions are absent from the prompt, the agent has nothing to be consistent with.
What Structural Specification Looks Like
The closest thing to structural specification in current practice is CLAUDE.md files, architecture documentation, and explicit conventions provided as context. A CLAUDE.md that says:
```markdown
## Architecture

This project uses a three-layer structure:

- Model: pure data types, no networking, no UI
- Services: network calls, parsing, caching. Never import SwiftUI.
- Views: SwiftUI only, receive data from services via ObservableObject

When adding a new feed provider:

1. Create a struct in Sources/Services/Feeds/ conforming to FeedParser
2. Handle errors using AppError in Sources/Model/Errors.swift
3. Register the parser in Sources/Services/FeedRegistry.swift
4. Do NOT put networking logic in model types or views
```
…gives an agent a structural specification to work from. It describes not just what to build but where to build it and what it should conform to. Thoughtworks’ guidance on AI-assisted development notes that architectural context in system prompts correlates with more structurally coherent output.
The results are better, but not perfect. An agent given this context will generally put code in the right directories and conform to the right protocols. It may still miss subtler conventions: naming patterns, the level of abstraction used for similar constructs elsewhere, the project’s conventions around when to extract a new type versus extend an existing one. These conventions are rarely written down because they were never needed as written artifacts.
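Concretely, the convention in step 1 of that CLAUDE.md might produce code shaped like the following Swift sketch. The `FeedParser` protocol, `FeedStatus` type, and `GitHubActionsParser` here are hypothetical illustrations of the pattern, not CCMenu's actual API:

```swift
import Foundation

// Hypothetical Model-layer type: pure data, no networking, no UI.
struct FeedStatus {
    let pipelineName: String
    let isGreen: Bool
}

// Hypothetical Services-layer protocol every feed provider conforms to.
// Networking happens elsewhere; parsers only transform raw responses.
protocol FeedParser {
    func parse(response: Data) throws -> [FeedStatus]
}

// A new provider slots into the existing abstraction rather than
// inventing its own parsing path.
struct GitHubActionsParser: FeedParser {
    private struct Run: Decodable {
        let name: String
        let conclusion: String
    }

    func parse(response: Data) throws -> [FeedStatus] {
        // Minimal sketch: a real implementation would decode the full
        // workflow-runs payload from the GitHub API.
        let runs = try JSONDecoder().decode([Run].self, from: response)
        return runs.map { FeedStatus(pipelineName: $0.name, isGreen: $0.conclusion == "success") }
    }
}
```

The structural point is that the conformance is the specification: the agent that knows `FeedParser` exists has a place to put the code; the agent that doesn't will invent one.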
This is where the tacit knowledge problem intersects with the specification problem. Doernenburg’s assessment worked precisely because he built CCMenu over more than a decade and could detect violations against a mental model no document contained. Polanyi’s observation that “we can know more than we can tell” applies directly: much of what makes a codebase coherent is knowledge the original developers have but have never articulated in a form that can be injected into a prompt.
The implication is that structural specification scales with documentation discipline. Projects with careful architecture decision records, explicit module boundary documentation, and detailed coding conventions give agents substantially more to work with. Projects that rely on convention-by-osmosis and tribal knowledge give agents almost nothing.
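An architecture decision record for the layer rule above can be short. This one is illustrative (the number and wording are invented for the sketch):

```markdown
# ADR 007: The Services layer owns all networking

## Status
Accepted

## Decision
All network calls live in Sources/Services/. Model types are pure data
and must not import networking libraries; views receive data through
ObservableObject and never fetch directly.

## Consequences
New feed providers extend the Services layer. A change (human- or
agent-authored) that adds networking to Model or Views is rejected
even if it passes behavioral tests.
```

A record like this is equally legible to a new hire and to a context window.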
Making Structural Intent Testable
There is a category of countermeasure that converts structural intent into executable constraints: architecture fitness functions. These are tests that assert structural properties rather than behavioral ones.
For a Swift project, custom SwiftLint rules can enforce layer separation:
```yaml
custom_rules:
  service_no_swiftui:
    name: "Services must not import SwiftUI"
    regex: "^import SwiftUI"
    included: ".*/Services/.*\\.swift"
    message: "Service types must not depend on SwiftUI"
    severity: error
  model_no_network:
    name: "Model types must not import networking"
    regex: "^import (Alamofire|Foundation\\.URLSession)"
    included: ".*/Model/.*\\.swift"
    message: "Model types are pure data; networking belongs in Services"
    severity: error
```
When these rules run in CI, structural violations become build failures. The agent’s output has to pass not just behavioral tests but structural ones. Neal Ford and Mark Richards’ work on fitness functions in “Building Evolutionary Architectures” provides the conceptual framework; the tooling to implement them is available now for most languages and stacks.
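The same boundary can also be asserted in an ordinary test, independent of any particular linter. A minimal sketch of such a fitness function follows; in a real suite it would walk `Sources/` on disk, but here it takes file contents directly so the example stays self-contained, and the file names are invented:

```swift
import Foundation

// A tiny fitness function: given source file contents keyed by path,
// report the files that import a module the layer's rules forbid.
func layerViolations(files: [String: String], forbiddenImports: [String]) -> [String] {
    files.compactMap { path, source in
        let importsForbidden = source
            .components(separatedBy: .newlines)
            .contains { line in
                forbiddenImports.contains { line.hasPrefix("import \($0)") }
            }
        return importsForbidden ? path : nil
    }
}

// Usage: assert the Model layer stays free of UI and networking imports.
let modelFiles = [
    "Sources/Model/Pipeline.swift": "import Foundation\nstruct Pipeline {}",
    "Sources/Model/Feed.swift": "import SwiftUI\nstruct Feed {}",  // violation
]
let offenders = layerViolations(files: modelFiles, forbiddenImports: ["SwiftUI", "Alamofire"])
// offenders == ["Sources/Model/Feed.swift"]
```

Wrapped in an `XCTAssert` that the result is empty, this turns a structural convention into a failing build, which is the whole point of the fitness-function approach.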
This approach has limits. Custom rules catch violations you anticipate. Violations you don’t anticipate, the novel ways an agent finds to cross a boundary you hadn’t thought to protect, slip through until someone writes a new rule. The rules also require architectural intent to be specified precisely enough to be encoded: fuzzy conventions don’t translate into regex patterns.
The Specification Debt Most Teams Are Carrying
The practical situation in most teams is that structural intent exists, but mostly as tacit knowledge. Developers who have worked on a codebase long enough have internalized its conventions. New developers learn them through review and pairing. This system worked when humans wrote all the code, because the review process transferred architectural knowledge alongside functional review.
Coding agents disrupt this transfer. An agent doesn’t learn conventions through pairing; it learns from whatever appears in its context. If conventions aren’t written down, they aren’t available to the agent. The GitClear analysis of AI-assisted code changes found code duplication roughly doubling in repositories with heavy AI tool adoption. Duplication is precisely what you’d expect when an agent doesn’t know an abstraction already exists, because no one told it where to look.
The countermeasure isn’t to avoid coding agents. It’s to write down what you’ve been leaving implicit. Architecture decision records, module boundary documentation, explicit conventions: these have value independent of AI tooling, because they help human developers too. The difference is that before coding agents, this documentation was optional, a nice-to-have that made onboarding faster. With coding agents in the workflow, the structural specification is the primary mechanism through which architectural intent reaches the agent.
Doernenburg could assess the damage to CCMenu because he holds the mental model. Most teams using agents in codebases they maintain collectively, with distributed ownership and evolving conventions, don’t have a single person who holds that model clearly enough to run the same audit. The structural specification problem scales with team size and codebase age, and most teams are running a deficit they haven’t measured.
Writing the specification is the work the tooling can’t do for you.