There is a pattern that developers working seriously with LLMs arrive at, usually after several months of frustration: they stop trying to be clever in individual prompts and start thinking systematically about what information the model has available at all. The Get Shit Done (GSD) system surfaced on Hacker News with 237 points and 128 comments, a reception that signals recognition rather than discovery. The practices GSD formalizes have been circulating informally. What makes it worth examining is the specific combination of three components, and what that combination enables at the meta-level.
## The Distinction Between Prompting and Context Engineering
Prompt engineering, as a term, picked up connotations early: magic phrases, jailbreaks, tricks for steering outputs. Context engineering is the more useful framing for systematic work. You do not primarily control model outputs by choosing better words in your prompt. You control them by controlling what information occupies the context window at inference time.
This includes the system prompt but extends well beyond it: conversation history, retrieved documents, tool call outputs, file contents, and the position of all of these within the window. The "lost in the middle" finding from Liu et al. (2023) documented what practitioners were already noticing empirically: transformer models attend reliably to information at the beginning and end of long contexts, but recall of content placed in the middle degrades substantially. For a coding session that spans dozens of tool calls, any constraint or architectural decision introduced in the middle of the conversation may not be reliably applied by the time the model acts on it.
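The positional effect suggests a simple mitigation when assembling context programmatically: put stable constraints at the beginning, bulk material in the middle, and restate the active task at the end. A minimal sketch; the function and section headers are illustrative, not any tool's actual API:

```python
def assemble_context(project_rules: str, documents: list[str], task: str) -> str:
    """Order context so the high-attention regions carry critical information.

    Stable project rules go first (start of window), bulk retrieved material
    sits in the middle, and the active task is restated last, so the
    instruction the model acts on lands in a high-recall end region.
    """
    parts = [
        "# Project rules\n" + project_rules,                # beginning: high attention
        "# Reference material\n" + "\n\n".join(documents),  # middle: bulk content
        "# Current task\n" + task,                          # end: high attention
    ]
    return "\n\n".join(parts)


context = assemble_context(
    project_rules="All times are stored in UTC.",
    documents=["def save(reminder): ...", "def load_all(): ..."],
    task="Implement /remind restart recovery per the spec.",
)
```

The same ordering logic applies whether the assembly is done by a tool or by hand: the constraint you most need applied belongs at an edge of the window, not buried mid-transcript.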
Context engineering as a discipline is the deliberate management of all of this. You decide what information the model needs, where it should be positioned, and how to ensure it remains accurate as the session evolves. Tools like CLAUDE.md (used by Claude Code), .cursorrules (used by Cursor), and GitHub Copilot Workspace instructions all implement the same architectural insight independently: inject stable project-level information at the beginning of each session, where attention is highest, so it survives the full session without drift.
GSD builds on this foundation and adds two layers that existing tools leave largely unaddressed.
## Spec-Driven Development as the Working Layer
The spec-driven component of GSD formalizes something that LLM-experienced developers do informally: write an explicit specification of what you want before generating any implementation. The discipline is old. Formal methods, design-by-contract, and test-driven development all impose it from different angles. The specific application to LLM workflows has a concrete motivation.
When a model implements something, it treats your description as a complete specification. Anything you did not mention is implicitly out of scope. A description written for a human collaborator can leave architectural decisions open because the human will ask. A description given to an LLM will be implemented silently. The model will decide what your database schema looks like, whether the operation is idempotent, and what happens on restart, without surfacing those decisions as decisions.
A spec file for a Discord bot command illustrates what this looks like in practice:
```markdown
## Command: /remind

**Purpose**: Schedule a message to be sent in the current channel after a specified duration.

**Inputs**:
- `time`: Duration string (e.g., "30m", "2h", "1d")
- `message`: The reminder text

**Constraints**:
- Maximum reminder duration: 7 days
- Maximum active reminders per user: 10
- Times stored in UTC, displayed in the user's registered timezone
- All pending reminders must be restored from the database on bot restart

**Error cases**:
- Invalid duration format: reply with examples
- Duration exceeds maximum: reply with the limit
- User at reminder cap: reply with the current count and an offer to list them
```
This level of detail feels over-specified for a document you would hand to a colleague. For an LLM, it is about right. The restart-survival constraint in particular would be entirely absent from a generated implementation if you did not include it. You only think to write it when you are forcing yourself to be explicit about behavior under failure. That forcing function is the spec discipline’s value.
Spec files like this serve as both context and contract. The model generating the implementation has everything it needs. The developer reviewing the output has a clear target to evaluate against. When the generated code diverges from the spec, the mismatch is legible rather than requiring you to reconstruct what you were trying to build.
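Constraints in a spec like this translate directly into enforceable code, which is part of what makes spec-versus-output mismatches legible. A sketch of the duration handling, assuming the `30m`/`2h`/`1d` format and the 7-day cap from the spec (function and constant names are illustrative):

```python
import re
from datetime import timedelta

MAX_DURATION = timedelta(days=7)  # spec: "Maximum reminder duration: 7 days"
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a duration string like "30m", "2h", or "1d".

    Raises ValueError with a message matching the spec's error cases:
    invalid format (reply with examples) or duration over the maximum
    (reply with the limit).
    """
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if not match:
        raise ValueError('Invalid duration. Examples: "30m", "2h", "1d"')
    amount, unit = int(match.group(1)), match.group(2)
    duration = timedelta(**{_UNITS[unit]: amount})
    if duration > MAX_DURATION:
        raise ValueError(f"Duration exceeds the maximum of {MAX_DURATION.days} days")
    return duration
```

Each branch of this function maps to a line in the spec, so a reviewer can check the implementation against the contract case by case.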
## Meta-Prompting: Closing the Loop
The component that distinguishes GSD from existing tools is the meta-prompting layer. Meta-prompting, in this context, means using the LLM itself to generate and maintain the prompts and context files that drive subsequent interactions. This closes a loop that manual tooling leaves open.
A hand-maintained CLAUDE.md file has a structural problem: it reflects the project’s history rather than its current state. You add build commands when you set up the project, append a note about a gotcha when you hit it, add a convention someone violated when it caused a bug. The result is a document that accumulates rather than coheres. Six months in, the CLAUDE.md describes a system that no longer quite exists.
A meta-prompting system addresses this differently. You describe your project in natural language. The system generates the context files from that description, producing structured documentation that reflects the project as described rather than the project as accumulated. As the project evolves, you describe the changes, and the system regenerates coherent documentation rather than patching stale documentation forward.
The workflow this enables has a specific shape:
1. **Bootstrap**: Describe project intent → generate spec files, architecture docs, CLAUDE.md
2. **Task planning**: Describe feature → generate task spec informed by project context
3. **Implementation**: Feed spec + relevant code → generate implementation
4. **Context maintenance**: Describe changes → update/regenerate project context documents
Each step uses prior outputs as context for the next. The spec files generated at bootstrap inform how task specs are written. Task specs inform implementation. Implementation outcomes feed back into context maintenance. The context documents are treated as living artifacts rather than write-once reference material.
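The loop can be sketched as a pipeline in which each stage's output is appended to the context for the next. Here `llm` is a stand-in for whatever completion call you use, not a real API, and the prompt strings are illustrative:

```python
from typing import Callable


def gsd_cycle(llm: Callable[[str], str], project_description: str,
              feature_request: str) -> dict[str, str]:
    """One pass through bootstrap -> plan -> implement -> maintain.

    Each stage consumes the artifacts of earlier stages, and the final
    stage regenerates the context documents rather than patching them
    forward, so they describe the project as it now is.
    """
    artifacts: dict[str, str] = {}
    # 1. Bootstrap: project description -> context files
    artifacts["context"] = llm(f"Generate project context docs for:\n{project_description}")
    # 2. Task planning: feature + context -> task spec
    artifacts["spec"] = llm(f"{artifacts['context']}\n\nWrite a spec for:\n{feature_request}")
    # 3. Implementation: spec + context -> code
    artifacts["code"] = llm(f"{artifacts['context']}\n\n{artifacts['spec']}\n\nImplement this spec.")
    # 4. Maintenance: regenerate context to reflect the change
    artifacts["context"] = llm(f"{artifacts['context']}\n\nRegenerate the context docs to "
                               f"reflect this change:\n{artifacts['spec']}")
    return artifacts
```

The structural point is in step 4: the context document is an output of the cycle, not just an input, which is what keeps it from drifting the way a hand-patched file does.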
This is structurally different from the ad-hoc workflow that most developers use. In the typical session, you reconstruct context from scratch: paste relevant files, explain architecture, restate constraints. That overhead grows with project size and with the interdependence of tasks. Externalizing it into maintained documents eliminates the reconstruction cost and improves consistency across sessions.
## What the Meta Layer Makes Possible
The interesting property of generating context with the same model that will consume it is not just efficiency. It is that the generated context tends to be in a form that the model handles well. A human writing CLAUDE.md writes in whatever style comes naturally. A model generating project documentation produces structured prose at the granularity models are calibrated to process. This is a modest benefit on any single task and compounds across a project’s lifetime.
There is also the bootstrapping advantage. When you onboard a new developer to a project with good GSD-style context infrastructure, they get not just the code but a structured description of what the code is trying to do and why. The same context that guides the model is useful documentation for humans. Projects that invest in this infrastructure produce better handoff artifacts as a side effect.
Aider’s repository map addresses an adjacent problem: it auto-generates structural context about the codebase, a symbol-level index of what files contain, so the model forms accurate plans informed by actual structure rather than memory. GSD and Aider’s repo map are complementary. The repo map answers “what does the codebase contain,” and GSD’s spec infrastructure answers “what decisions have been made and why.” Both belong in position zero of a well-engineered context.
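The symbol-level index idea is easy to approximate for Python code with the standard library's `ast` module. This is a sketch of the concept, not Aider's actual implementation:

```python
import ast
from pathlib import Path


def repo_map(root: str) -> dict[str, list[str]]:
    """Map each Python file under root to its top-level symbol names.

    A compressed structural index like this lets a model plan against
    the codebase's actual contents without reading every file in full.
    """
    index: dict[str, list[str]] = {}
    for path in sorted(Path(root).rglob("*.py")):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        symbols = [
            node.name
            for node in tree.body
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        ]
        index[str(path.relative_to(root))] = symbols
    return index
```

A real repo map adds ranking and cross-language parsing, but even this minimal form is the right shape for position-zero context: dense, structural, and cheap to regenerate.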
## Where This Likely Breaks Down
Meta-prompting has a seam at the point where it asks the model to reason about itself: generating context from a vague project description produces coherent results when the description is detailed and structured, and produces confidently wrong results when it is not. The bootstrapping problem is real. To generate good context, you need to describe your project well, which requires some of the same clarity that good context helps you maintain.
This means the system probably works best for developers who are already reasonably disciplined about specification, and adds less value for developers who would struggle to write a clear project description in the first place. The meta-prompting layer does not automatically produce the rigor that spec-driven development requires; it amplifies whatever rigor you bring to the initial description.
The Hacker News comment count on GSD suggests the reception is largely recognition, not discovery. Developers who have converged on similar practices independently are validating that the framework formalizes something real. That is a useful signal. It means the core ideas are robust to independent rediscovery, which is typically a sign that they reflect actual constraints in the problem space rather than one person’s idiosyncratic preferences.
The practical question is durability. Whether GSD specifically becomes a stable tool or its patterns get absorbed into the defaults of AI coding editors, the framing is directionally correct. Developers who maintain context infrastructure as a first-class project artifact will work more effectively with AI tools than developers who reconstruct context session by session, for the same reason that developers with good test infrastructure outperform developers who test manually. The work compounds, and it compounds in the direction of making the next session faster rather than slower.