The Agile Manifesto Had a Point
In 2001, the Agile Manifesto included a line that shaped twenty years of software culture: “working software over comprehensive documentation.” The authors were responding to something real. Waterfall-era projects spent months producing specification documents that were obsolete before the first line of code was written, were rarely consulted during development, and accumulated into filing cabinets of artifacts that served regulatory compliance more than engineering. The industry concluded, reasonably, that documentation was largely ceremonial.
The framing that emerged was that writing things down before you built them was a form of waste. A working test suite was documentation. Readable code was documentation. Any artifact that was not kept current by automated tooling would drift from reality and mislead rather than inform. These were not wrong conclusions, at the time, for the problem they were solving.
The problem has changed.
What LLMs Change About the Spec Equation
When your implementation partner is a person, implicit specification is workable. A developer on your team knows the codebase, carries context from last week’s standup, and asks clarifying questions when requirements are ambiguous. They fill in gaps from context, flag contradictions, and push back when something does not fit. The spec can live in shared understanding rather than written documents.
A language model cannot do any of that. When you give a model a vague requirement, it does not surface the gaps. It fills them silently, with plausible-sounding decisions that may or may not match what you wanted. Whether the operation should be idempotent, whether state survives a restart, whether an error message is user-facing or a developer log entry: these get decided based on statistical patterns, not your actual requirements. The output looks correct. Whether it is correct is only legible to someone who already knew what they wanted.
Better models make this harder to detect, not easier. A model that makes confident wrong decisions produces code that passes casual review. The underlying issue is not model quality. It is that vague inputs produce outputs that drift from intent, and the only structural fix is better inputs.
The Get Shit Done (GSD) system, which arrived on Hacker News with 237 points and 128 comments, is an attempt to make producing better inputs systematic. The reception suggests developers are recognizing something they had already partly arrived at independently, not encountering a novel idea.
The Spec as Model Input, Not Human Artifact
What GSD provides is a discipline for writing specifications calibrated to what models need rather than what human reviewers expect. The distinction is practical.
A waterfall-era specification was comprehensive, written for regulatory review or cross-team handoff. Its purpose was completeness for archival and compliance. A GSD spec is written at the task level, meant to be fed directly into the model’s context when generating the implementation. Its purpose is precision at the point of use.
A concrete example from the project illustrates what this looks like:
```markdown
## Command: /remind
**Purpose**: Schedule a message in the current channel after a specified duration.
**Inputs**:
- `time`: Duration string (e.g., "30m", "2h", "1d")
- `message`: The reminder text
**Constraints**:
- Maximum reminder duration: 7 days
- Maximum active reminders per user: 10
- Times stored in UTC, displayed in user's registered timezone
- All pending reminders must be restored from the database on bot restart
**Error cases**:
- Invalid duration format: reply with examples
- Duration exceeds maximum: reply with the limit
- User at reminder cap: reply with count and list offer
```
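Constraints written at this level of precision translate into code almost mechanically. A minimal sketch of the duration parsing and limit check in Python, with hypothetical names (nothing here comes from the GSD repository):

```python
import re

# Spec: maximum reminder duration is 7 days
MAX_DURATION_SECONDS = 7 * 24 * 3600
UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}

def parse_duration(text: str) -> int:
    """Parse a duration string like '30m', '2h', or '1d' into seconds.

    Raises ValueError with a user-facing message on the spec's error cases.
    """
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        # Spec: invalid format -> reply with examples
        raise ValueError('Invalid duration. Try something like "30m", "2h", or "1d".')
    seconds = int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    if seconds > MAX_DURATION_SECONDS:
        # Spec: duration exceeds maximum -> reply with the limit
        raise ValueError("Reminders can be scheduled at most 7 days ahead.")
    return seconds
```

Note that both error branches map one-to-one onto lines in the spec's error-case list; the spec decided the behavior, the code just records it.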
The restart-survival constraint is the revealing item. A human developer reading a vague requirement for a reminder command would ask about persistence. A model given the same vague requirement will implement something that works during the session and silently drop all reminders on restart, because in-memory is the simpler path and nothing in the prompt ruled it out. Writing the spec forces you to be explicit about that constraint before implementation, which is where fixing it is cheap.
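The gap between the two behaviors is small in code, which is exactly why a model defaults to the simpler one unless told otherwise. A sketch of the restore-on-restart path the constraint demands, using sqlite3 with hypothetical table and column names:

```python
import sqlite3
import time

def restore_pending_reminders(conn: sqlite3.Connection) -> list:
    """Reload reminders that are still pending after a restart.

    Spec constraint: pending reminders survive a bot restart, so they
    live in the database, not in process memory.
    """
    now = time.time()
    rows = conn.execute(
        "SELECT user_id, channel_id, message, due_at FROM reminders WHERE due_at > ?",
        (now,),
    ).fetchall()
    # Each row would be handed back to the scheduler on startup; returning
    # them keeps the sketch testable without a real event loop.
    return rows
```

The in-memory version has no equivalent of this function at all, and nothing in a vague prompt would make a model reach for it.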
This is spec-driven development, but the motivation differs from the formal methods tradition. The goal is not mathematical correctness or regulatory completeness. The goal is making the model’s implicit decisions visible before they become code.
What GSD Is, Technically
The repository is a set of markdown templates and workflow conventions. There is no library to install, no new runtime dependency, no framework to learn. This is deliberate: if the workflow lives in plain text files, it works with any AI coding tool. The same spec files and meta-prompt templates function whether you are using Claude Code, Cursor, or Aider. Tool-specific context files like CLAUDE.md and .cursorrules are part of the system rather than replacements for it.
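Because everything is plain text, the whole system can live in a handful of files at the repository root. A hypothetical layout (the actual GSD template names may differ):

```
project/
├── CLAUDE.md          # tool-specific context, generated from the docs below
├── docs/
│   ├── project.md     # bootstrap output: intent, architecture, conventions
│   └── specs/
│       └── remind.md  # per-feature spec, fed to the model at implementation time
└── prompts/
    ├── plan.md        # meta-prompt: generate a spec from a feature description
    └── refresh.md     # meta-prompt: regenerate context from current state
```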
The templates cover project context documents, per-feature spec files, meta-prompt templates for planning and review phases, and workflow documentation describing the sequence. The overall flow has four phases:
- Bootstrap: describe project intent in natural language, generate context documents
- Task planning: describe a feature, generate a spec informed by project context
- Implementation: feed spec plus relevant code into the model
- Maintenance: describe changes, regenerate coherent context rather than patch stale docs
That last step is the meta-prompting component, and it addresses a structural failure mode in manually maintained context files. A CLAUDE.md written by hand reflects the project’s history. You add build commands when you set up the project, append notes when you hit surprises, patch things forward as the project evolves. Months in, the file describes a system that no longer quite exists. A meta-prompting workflow regenerates context from the current state of the project rather than accumulating forward from the initial state. The result stays coherent rather than accreting.
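A sketch of what that regeneration step might look like, with entirely hypothetical file names and prompt text; the point is only that the prompt is assembled from present state, never by appending to the previous document:

```python
from pathlib import Path

REFRESH_PROMPT = """You are updating this project's context document.
Below is the current state of the repository. Rewrite the context document
from scratch so it describes the project as it exists now. Do not carry
forward statements you cannot confirm from the files shown.

{project_state}
"""

def build_refresh_prompt(root: Path, extensions=(".py", ".md", ".toml")) -> str:
    """Assemble a regeneration prompt from the project's current files."""
    sections = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in extensions:
            sections.append(f"--- {path.relative_to(root)} ---\n{path.read_text()}")
    return REFRESH_PROMPT.format(project_state="\n\n".join(sections))
```

The hand-maintained file accumulates claims; this rebuilds them, so a claim that is no longer supported by the codebase simply does not survive the next refresh.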
How This Fits Into the Existing Tool Landscape
Aider’s repository map is the most directly adjacent approach, and the two are complementary. Aider’s repo map auto-generates structural context by parsing the codebase with tree-sitter, producing a symbol-level index of function signatures, class definitions, and their locations. It answers what the codebase contains. GSD’s spec infrastructure answers what was decided and why. Both belong in the model’s context; they serve different questions.
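In practice "both belong in the context" means little more than concatenation. A trivial sketch, with placeholder strings standing in for Aider's actual repo-map output and a GSD spec file:

```python
def build_context(repo_map: str, spec: str, relevant_code: str) -> str:
    """Combine structural context (what the codebase contains) with
    decision context (what was decided and why) into one model input.
    """
    return "\n\n".join([
        "# Repository structure (generated: what exists)",
        repo_map,
        "# Task spec (written: what was decided and why)",
        spec,
        "# Relevant code",
        relevant_code,
    ])
```

The generated half goes stale and is cheap to regenerate; the written half encodes decisions no parser can recover, which is why neither substitutes for the other.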
Static context files in tools like Claude Code and Cursor cover project-wide conventions reliably. They do not cover the specific constraints and edge cases of a feature being implemented today. GSD’s per-task spec files fill that gap without replacing the project-level context layer. The system is additive rather than a wholesale replacement of existing tooling.
Research on SWE-bench tasks has consistently shown that agents working from well-specified intent dramatically outperform those working from implicit context, even when codebase structure is equally available to both. The benchmark's tasks are drawn from real GitHub issues, which makes the result harder to dismiss as an artifact of how the tasks were constructed.
The Discipline Problem, Honestly Stated
None of this is overhead-free. GSD is explicit that it is a methodology rather than a library, and methodologies require discipline to follow under time pressure. Teams that skip the spec step get process overhead without the benefit. The meta-prompting workflow adds latency to each task. Initial bootstrapping requires writing a detailed project description, which requires clarity about the project that may not exist yet.
The historical parallel is instructive. Test-driven development has similar properties: the discipline pays off at scale, the upfront cost makes adoption difficult, and most teams practice it intermittently rather than consistently. GSD faces the same adoption challenge that every methodology requiring upfront work has always faced.
What changes is the motivation. With waterfall-era specs, you were writing to satisfy a process requirement. With AI-mediated development, you write specs to get better code out of the next prompt. The feedback loop is immediate and concrete: a precise spec produces noticeably better implementation output than a vague one, and the difference is visible before you reach code review. That tighter feedback cycle might sustain the discipline where previous methodologies could not.
The Hacker News reception, with 128 comments skewing toward recognition rather than discovery, fits this framing. Developers who have been working seriously with AI coding tools have independently converged on similar practices. GSD’s contribution is making the pattern concrete enough to adopt deliberately, rather than arriving at it after months of accumulated frustration.
The agile intuition that documentation maintained for its own sake is waste remains correct. What GSD adds is the observation that specification written as model input is something different from documentation overhead: not an artifact for future reference, but a direct lever on what the model produces now. The discipline the agile era taught us to distrust turns out to matter again, for a different reason than the one that motivated waterfall, and at a finer granularity than project-level requirements documents ever aimed at.