The Middle Loop: What GSD Solves That Your AI Coding Tool Doesn't

Software development has always had concentric loops. The inner loop is immediate: write code, run tests, fix failures, repeat. The outer loop is longer: plan features, estimate work, ship releases, gather feedback, plan again. Most engineering tooling has been built for one of these two loops. IDEs optimize the inner loop. Project management tools, CI/CD pipelines, and release processes serve the outer loop.

AI coding tools have dramatically accelerated the inner loop. Cursor, Claude Code, and Aider can write working implementations from terse descriptions, navigate unfamiliar codebases with a combination of lexical and semantic search, and apply edits across multiple files with a single instruction. The iteration cycle from intent to testable code has compressed from hours to minutes for many classes of task. The outer loop has seen real improvements too: AI can write tickets, draft architecture docs, generate test plans, and summarize codebases for onboarding.

What the tools have not addressed is the layer between these two loops. Call it the middle loop: the process of decomposing a feature into an implementation plan, managing context across multiple AI sessions on the same task, aligning output with original intent, and reviewing generated work systematically before it ships. This layer exists and it is consequential, but current AI coding tools give you almost nothing for it.

Get Shit Done (GSD) is an attempt to fill that gap. It gathered 237 points and 128 comments on Hacker News, with many commenters describing it as a formalization of patterns they had already arrived at independently. That reception is worth examining: it suggests the gap is real and widely understood by practitioners, but that solutions remain inconsistent across teams.

What the Middle Loop Requires

The inner loop and outer loop have different failure modes, and the middle loop sits at the intersection of both.

The outer loop fails when planning disconnects from implementation reality: features estimated against misunderstood complexity, architectural decisions made without implementation context, requirements that leave too much room for interpretation. The inner loop fails when implementation context is shallow: models filling gaps with plausible-but-wrong decisions, constraint drift as sessions extend and earlier architectural constraints lose influence in the attention window, review cycles that catch code style but miss intent mismatches.

The middle loop is where outer-loop planning translates into inner-loop execution. It requires a spec that bridges human intent and model input, a context state that persists across sessions, a planning step that makes sequencing and trade-offs explicit before code is generated, and a review step that closes the loop between spec and implementation.

Each of these requirements has well-understood analogs in traditional software development. Specs, technical design documents, code review, and architecture decision records have existed for decades. GSD adapts them for an AI-assisted workflow, where one consumer of the spec is a language model rather than a person, and where inconsistency across sessions costs more than it does in a human team that shares ambient project knowledge through daily interaction.

The Meta-Prompting Layer

The most technically interesting component of GSD is meta-prompting, and it is also the one most absent from ad-hoc workflows.

A meta-prompt produces structure rather than code. Instead of asking the model to implement a feature, a meta-prompt asks it to produce an implementation plan given a spec and a set of constraints. The output is not the artifact you ship; it is the input for the prompts that generate the artifact.

This externalizes the planning step. When you prompt an LLM directly to implement something, planning and execution happen simultaneously and are largely invisible. The model makes choices about sequencing, decomposition, and trade-offs, and you see only the result. With a meta-prompting step, those choices become visible before any code is generated. You can evaluate whether the planned sequence is sensible, whether the scope is correct, whether edge cases are missing, and course-correct at the plan level rather than the code level. Plans are cheap to revise; generated code is not.
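
As a rough illustration of the difference, the sketch below assembles a meta-prompt that asks for a plan rather than code. The function name `build_meta_prompt`, the prompt wording, and the example spec are assumptions made for illustration; they are not taken from GSD's templates.

```python
def build_meta_prompt(spec: str, constraints: list[str]) -> str:
    """Assemble a prompt that asks for an implementation plan, not code.

    Hypothetical helper: the wording is illustrative, not GSD's template.
    """
    constraint_lines = "\n".join(f"- {c}" for c in constraints)
    return (
        "You are planning an implementation. Do not write code.\n\n"
        f"Spec:\n{spec.strip()}\n\n"
        f"Constraints:\n{constraint_lines}\n\n"
        "Produce a numbered plan. For each step give the files touched, "
        "the trade-offs considered, and any edge cases the spec leaves open."
    )


# Example inputs (invented for this sketch).
prompt = build_meta_prompt(
    spec="Users can schedule reminders that fire at a chosen local time.",
    constraints=[
        "Reminders must survive a process restart",
        "No new third-party dependencies",
    ],
)
print(prompt)
```

The returned text is what you would send to the model. The plan that comes back is the thing you review and revise before any implementation prompt is written.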

The GSD bootstrap phase applies meta-prompting at a more fundamental level: you describe your project in natural language, and a meta-prompt generates your context documents, spec templates, and architecture docs from that description. The model produces the scaffolding that subsequent model interactions will rely on. Rather than incrementally patching a CLAUDE.md over time as problems emerge, you generate a coherent context snapshot from a deliberate project description and update it at significant decision points.

A hand-maintained context file accumulates the history of where the project has been. A meta-prompt-generated context file reflects where the project should be, filtered through what is relevant to the current task. These are structurally different documents, and the distinction matters for how reliably the model maintains consistent behavior across long sessions. Tools like Claude Code’s CLAUDE.md and Cursor’s .cursorrules provide the injection mechanism; GSD provides a principled approach to generating and maintaining what gets injected.

What the Spec Provides That Vague Descriptions Cannot

The spec component addresses a specific failure mode in LLM-assisted development: silent gap-filling.

Human collaborators ask clarifying questions when requirements are ambiguous. They flag contradictions. They notice when a stated constraint conflicts with an implied one. LLMs do none of these things reliably. When a model encounters ambiguity in a requirement, it generates the most statistically plausible interpretation and proceeds. If that interpretation matches your intent, you never notice the gap. If it does not, you find out during review or in production.

A spec written before implementation shifts where that ambiguity surfaces. You work through the unclear cases during spec writing rather than during code review. For a reminder feature, for example, you make the restart-survival requirement, the maximum number of active reminders per user, and the timezone handling rules explicit before any code exists. These constraints are obvious in retrospect; they are invisible when you are rapidly iterating toward a working feature.

The spec format in GSD is deliberately lean rather than exhaustive. It covers purpose, inputs, constraints, and error cases, which is enough to prevent the most common category of intent-divergence in generated code without becoming a documentation burden. The discipline is in writing it before prompting for implementation, not afterward as documentation of what the implementation ended up doing.
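
To make the four sections concrete, here is a minimal sketch of a lean spec as a data structure, with a check for sections that are still empty. The `Spec` class, its method names, and the example values (including the limit of 20) are hypothetical; GSD itself ships markdown templates, not Python.

```python
from dataclasses import dataclass, field


@dataclass
class Spec:
    """Lean spec with the four sections named in the text.

    The class itself is a hypothetical illustration, not GSD's format.
    """
    purpose: str = ""
    inputs: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    error_cases: list[str] = field(default_factory=list)

    def missing_sections(self) -> list[str]:
        """Return the names of sections that are still empty."""
        sections = {
            "purpose": self.purpose.strip(),
            "inputs": self.inputs,
            "constraints": self.constraints,
            "error_cases": self.error_cases,
        }
        return [name for name, value in sections.items() if not value]

    def to_markdown(self) -> str:
        """Render the spec as a short markdown document."""
        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        return (
            f"## Purpose\n{self.purpose}\n\n"
            f"## Inputs\n{bullets(self.inputs)}\n\n"
            f"## Constraints\n{bullets(self.constraints)}\n\n"
            f"## Error cases\n{bullets(self.error_cases)}\n"
        )


# Example values are invented; the limit of 20 is a placeholder.
spec = Spec(
    purpose="Let users schedule reminders.",
    inputs=["user id", "reminder text", "fire time with timezone"],
    constraints=[
        "Pending reminders survive a restart",
        "At most 20 active reminders per user",
    ],
)
print(spec.missing_sections())  # ['error_cases']
```

A check like `missing_sections()` only catches empty sections. Whether the filled-in sections capture what matters is still a judgment the author has to make.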

SWE-bench research has consistently shown that agents working from well-specified intent substantially outperform those working from implicit context, even when the codebase is equally available to both. The spec is not ceremony; it is a functional input that changes what the model produces.

Why This Remains a Methodology Rather Than a Tool

GSD ships as markdown templates. There is no package to install, no CLI, no runtime to configure. The workflow instructions are natural language, the spec templates are markdown, and the meta-prompts are text files.

This is partly a design choice and partly an honest reflection of what the middle loop requires. The inner loop can be tooled because edit-run-test cycles have well-defined interfaces: file reads, shell commands, test runner output. The middle loop operates on intent, context coherence, and alignment between human goals and generated artifacts. Tools can prompt you to write a spec; they cannot determine whether the spec captures what matters for your specific project and your specific intent.

The practical consequence is that adopting GSD means adopting a discipline, not a dependency. Teams under deadline pressure will skip the spec phase first; it is the step that feels most like overhead when the goal is to start coding. Once spec writing is consistently dropped, what remains is a context management system, still useful, but without the loop-closing benefit of having a defined target to evaluate generated work against.

The gap GSD addresses will eventually be absorbed into the tools. Nothing in the middle loop is categorically non-automatable; the tools simply have not built it yet. Structured spec generation, context lifecycle management, and automated plan-versus-implementation comparison could all be first-class tool features. For now they are markdown files and developer discipline.
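
Plan-versus-implementation comparison can be approximated by hand today. The sketch below assumes a reviewer records, for each spec constraint, the tests or notes that cover it, and then lists the constraints with no evidence. The helper name, the data shape, and the example constraints are all hypothetical.

```python
def unverified_constraints(
    constraints: list[str], evidence: dict[str, list[str]]
) -> list[str]:
    """Return spec constraints with no recorded evidence in the review.

    `evidence` maps a constraint to the tests or review notes that cover it.
    Both the data shape and this helper are hypothetical.
    """
    return [c for c in constraints if not evidence.get(c)]


# Example data invented for this sketch.
constraints = [
    "Pending reminders survive a restart",
    "Times are stored in UTC",
]
evidence = {
    "Pending reminders survive a restart": ["test_restart_restores_reminders"],
}
print(unverified_constraints(constraints, evidence))
# ['Times are stored in UTC']
```

This only works if the spec was written first, because the spec supplies the list of constraints to check against.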

The teams getting consistent results from AI-assisted development are the ones who have assembled their own version of this layer, regardless of whether they use GSD’s templates or have a name for what they are doing. The value of GSD is not that it invents something new; it is that it makes the pattern explicit enough to adopt deliberately rather than converge on through accumulated friction.