
Why Every AI Coding Tool Independently Invented the Same Context File

Source: martinfowler

Every time you start a new AI coding session, you start from nothing. The model has no memory of the last session, no recollection of the constraints you discovered, no awareness of the decision you made three days ago to use optimistic locking instead of pessimistic locking. You explain it again, the session goes well, and then it ends, and the next session starts from nothing.

This is not a bug in any specific tool; it is the structural reality of how context windows work. A conversation is a state machine that initializes empty every time. The model’s weights encode general programming knowledge, but your project’s specific decisions, conventions, and hard-won constraints live nowhere except in your own head and whatever you type at the start of each session.

Rahul Garg’s Context Anchoring article on Martin Fowler’s site names this problem clearly and proposes the obvious but often-skipped solution: externalize the decision context into a document that you include at the start of every session. Garg calls this a context anchor. The core argument is that the document must be living: maintained alongside the code, updated whenever a significant decision is made, specific enough to be useful, and short enough to be read.

This problem and its solution are not new. Michael Nygard introduced Architecture Decision Records in his 2011 blog post “Documenting Architecture Decisions”, which has since become a reference point in the industry, and the lightweight format was later recommended in Thoughtworks’ Technology Radar. An ADR captures the context of a decision, the decision itself, the alternatives considered, and the consequences. The whole point is that the next participant in the codebase should not have to rediscover what was already decided; they should find it documented, with the reasoning intact.

AI workflows collapse the timeline of that problem. The participant who lacks context is not a future developer six months from now; it is the same session you start tomorrow morning. ADRs were always solving this problem, and AI development just makes the problem occur at session frequency rather than onboarding frequency.

The Ecosystem Converged Without Coordination

What makes context anchoring compelling beyond Garg’s article is that every major AI coding tool vendor independently arrived at the same pattern. Anthropic’s Claude Code reads a CLAUDE.md file from the project root automatically at the start of every session. OpenAI’s Codex reads AGENTS.md. Cursor supports .cursorrules at the project root and more granular, path-scoped rules in .cursor/rules/. GitHub Copilot added support for .github/copilot-instructions.md in 2024. Windsurf has .windsurfrules.

None of this was coordinated. The tools converged independently on the same structural insight: the model needs project-specific context that cannot be encoded in its weights, and the right place to store that context is a file in the repository. The variation is in the filename and some tool-specific features. The underlying pattern is identical: a text file, usually markdown, read at session start, holding what the model would otherwise lack.
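To make the convergence concrete, here is a minimal sketch that maps the file names mentioned above to their tools and reports which tools would find a context file in a given repository. The `contextCoverage` helper and its input shape are hypothetical, and each tool's real lookup rules (nesting, precedence, additional locations) are richer than a single path.

```typescript
// File names as listed in this article, one entry per tool.
// Real lookup rules differ per tool; this table is a simplification.
const CONTEXT_FILES: Array<[tool: string, file: string]> = [
  ["Claude Code", "CLAUDE.md"],
  ["OpenAI", "AGENTS.md"],
  ["Cursor", ".cursorrules"],
  ["GitHub Copilot", ".github/copilot-instructions.md"],
  ["Windsurf", ".windsurfrules"],
];

// Hypothetical helper: given the repo-relative paths present in a repository,
// split the tools into those that would find a context file and those that would not.
function contextCoverage(repoPaths: string[]): { covered: string[]; missing: string[] } {
  const present = new Set(repoPaths);
  const covered: string[] = [];
  const missing: string[] = [];
  for (const [tool, file] of CONTEXT_FILES) {
    (present.has(file) ? covered : missing).push(tool);
  }
  return { covered, missing };
}

const coverageReport = contextCoverage(["CLAUDE.md", "AGENTS.md", "src/index.ts"]);
console.log(coverageReport.covered); // tools that have a context file in this repo
```

A team using several tools could keep one canonical file and have the others point to it, but how each tool handles that is tool-specific and not covered here.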

When multiple competing products independently solve the same problem in the same way, it is evidence that the problem is real and the solution space is constrained. Teams that resist this pattern, keeping project context in their heads or in scattered notes, will re-explain the same decisions in every session, and that friction compounds over time.

What the Document Should Contain

A context anchor is not a README. READMEs are written for human newcomers at onboarding time and updated at major release milestones. A context anchor is written for a stateless model that will read it at the start of every session and needs to quickly re-establish an accurate picture of the project.

The contents break into two categories: stable and volatile.

Stable content changes rarely. It includes the tech stack, architectural conventions, and decisions that have been made and should not be revisited without explicit discussion:

```markdown
## Tech stack
- Runtime: Node.js 22, TypeScript 5.4
- Database: PostgreSQL 16 with Drizzle ORM
- Testing: Vitest + Supertest
- Deployment: Fly.io via Docker

## Conventions
- All database access through the repository layer in src/db/
- Never call the database directly from route handlers
- Use the logger from src/lib/logger.ts, never console.log
- All errors must be instances of AppError or a subclass

## Decisions (do not revisit without discussion)
- [2025-01-15] Chose Drizzle over Prisma: better TypeScript inference, no Rust binary dependency
- [2025-02-03] REST only, no GraphQL. Clients are internal and can absorb schema changes.
```

Volatile content changes per session or per feature. It is the handoff note: what the current work is and what you have already discovered about it:

```markdown
## Currently in progress
Working on the payment webhook handler in src/routes/webhooks/stripe.ts.
The Stripe SDK event types are not well-typed for older API versions;
cast through unknown rather than trying to use the SDK types directly.
```

Keeping stable and volatile content in separate sections matters. If someone updates the “currently in progress” section every session, they should not have to touch the stable architectural content. The stable content is the anchor; the volatile content is the cursor position.
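One way to keep the two kinds of content apart in practice is to rewrite only the volatile section mechanically. The sketch below assumes the anchor uses `## ` headings as in the examples above; `replaceSection` is a hypothetical helper, not part of any tool.

```typescript
// Replace the body of one "## " section in a markdown document and leave
// every other section untouched. The heading text must match exactly.
function replaceSection(doc: string, heading: string, newBody: string): string {
  const lines = doc.split("\n");
  const start = lines.indexOf(`## ${heading}`);
  if (start === -1) {
    throw new Error(`Section not found: ${heading}`);
  }
  // The section runs until the next "## " heading or the end of the document.
  const next = lines.findIndex((line, i) => i > start && line.startsWith("## "));
  const end = next === -1 ? lines.length : next;
  // Keep one blank line before the following heading, if there is one.
  const separator = next === -1 ? [] : [""];
  return [
    ...lines.slice(0, start + 1),
    ...newBody.split("\n"),
    ...separator,
    ...lines.slice(end),
  ].join("\n");
}

const anchorDoc = [
  "## Conventions",
  "- Use the logger from src/lib/logger.ts, never console.log",
  "",
  "## Currently in progress",
  "Working on the payment webhook handler.",
].join("\n");

// Only the handoff note changes; the Conventions section is preserved as written.
const afterHandoff = replaceSection(
  anchorDoc,
  "Currently in progress",
  "Webhook handler done. Next: retry logic for failed deliveries.",
);
console.log(afterHandoff);
```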

One constraint worth enforcing: keep the whole document under a single screen of text. A 500-line context document defeats its own purpose. Link out to detailed ADR files in docs/decisions/ for anything that requires extended explanation; inline only what is universally applicable across every session.
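A length budget is easy to check automatically. The sketch below is one possible check; the 60-line limit is an assumption standing in for "a single screen", and `checkAnchorLength` is a hypothetical name.

```typescript
// Assumption: "a single screen" is treated as roughly 60 lines. Adjust to taste.
const MAX_ANCHOR_LINES = 60;

// Count the lines in the anchor document and compare against the budget.
function checkAnchorLength(
  doc: string,
  maxLines: number = MAX_ANCHOR_LINES,
): { lines: number; ok: boolean } {
  // Drop one trailing newline so the count matches what an editor shows.
  const lines = doc.replace(/\n$/, "").split("\n").length;
  return { lines, ok: lines <= maxLines };
}

const lengthCheck = checkAnchorLength("## Tech stack\n- Runtime: Node.js 22\n");
if (!lengthCheck.ok) {
  console.error(`Context anchor is ${lengthCheck.lines} lines; move detail into docs/decisions/.`);
}
```

Run in CI or a pre-commit hook, a check like this turns the length limit from a good intention into something the team actually notices when it slips.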

Negative Constraints Are More Valuable Than Positive Descriptions

One pattern worth emphasizing from Garg’s framing is the value of negative constraints. Telling the model “use the internal logger” is useful. Telling the model “never use console.log because it bypasses structured logging and breaks log aggregation in production” is more useful. The model’s training defaults will tend toward common patterns, and your codebase’s conventions often deliberately diverge from those defaults for specific reasons. The context anchor is where you document the divergence with its rationale.

This is where the ADR parallel is most instructive. ADRs are valuable precisely because they record alternatives that were considered and rejected. A bare decision such as “we use Drizzle” is less useful than a decision with rationale: “we use Drizzle because Prisma’s Rust binary causes issues in our Alpine-based Docker builds.” The rationale is what prevents a future participant, human or model, from improving the code back toward the common default that your project deliberately moved away from.

This asymmetry runs deeper than it might look. A model trained on the entire public internet has strong priors toward conventional choices. Your project’s unconventional choices need explicit justification in the context anchor, or the model will quietly revert them every time you aren’t watching.
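If the anchor is generated or linted, the rule that every constraint carries its rationale can be enforced in the data shape. The following is a sketch under that assumption; the `Constraint` type and `renderConstraints` function are hypothetical.

```typescript
// Hypothetical shape for a negative constraint: the rule plus the reason
// the project diverges from the common default.
interface Constraint {
  rule: string;      // what the model must or must not do
  rationale: string; // why the conventional choice is wrong for this project
}

// Render constraints as markdown bullets, rejecting any entry without a rationale.
function renderConstraints(constraints: Constraint[]): string {
  return constraints
    .map(({ rule, rationale }) => {
      if (rationale.trim() === "") {
        throw new Error(`Constraint has no rationale: ${rule}`);
      }
      return `- ${rule}, because ${rationale}`;
    })
    .join("\n");
}

const renderedConstraints = renderConstraints([
  {
    rule: "Never use console.log",
    rationale: "it bypasses structured logging and breaks log aggregation in production",
  },
  {
    rule: "Use Drizzle, not Prisma",
    rationale: "Prisma's Rust binary causes issues in our Alpine-based Docker builds",
  },
]);
console.log(renderedConstraints);
```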

The Discipline Is the Hard Part

Treating the context anchor as a living document requires a discipline that static documentation does not. The document needs to be updated in the same commit as the decision it records, not in a cleanup pass later. When something goes wrong in a session because the model lacked a constraint, the right response is to add that constraint to the anchor document immediately, not to re-explain it in the next session and forget.
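The same-commit rule can be approximated with a small check over the paths changed in a commit. This is a sketch with assumed paths: decision records under docs/decisions/ and an anchor file named CLAUDE.md are placeholders for whatever your project uses, and `anchorIsStale` is a hypothetical helper.

```typescript
// Assumptions: decision records live under docs/decisions/ and the anchor
// file is CLAUDE.md. Substitute your project's actual paths.
const DECISIONS_DIR = "docs/decisions/";
const ANCHOR_FILE = "CLAUDE.md";

// Given the repo-relative paths changed in a commit, return true when a
// decision record changed but the anchor document did not.
function anchorIsStale(changedPaths: string[]): boolean {
  const decisionChanged = changedPaths.some((p) => p.startsWith(DECISIONS_DIR));
  const anchorChanged = changedPaths.includes(ANCHOR_FILE);
  return decisionChanged && !anchorChanged;
}

const stagedPaths = ["docs/decisions/0007-rest-only.md", "src/routes/orders.ts"];
if (anchorIsStale(stagedPaths)) {
  console.warn("A decision record changed but the context anchor did not.");
}
```

In a pre-commit hook, the list of paths could come from `git diff --cached --name-only`. The check only catches decisions that were written down as ADRs; constraints discovered mid-session still depend on someone adding them by hand.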

This is a higher standard than writing an occasional ADR. It requires treating the context anchor as part of the definition of done for any significant architectural decision. Teams that reach this standard find that the document becomes genuinely useful beyond its AI use case; it functions as a fast-start guide for humans too, including developers returning to a project after time away.

The deeper point in Garg’s article is that AI workflows expose a documentation deficit that already existed. The deficit was tolerable because humans carried the context in their heads and the cost of re-establishing it was low. When you introduce a stateless model that cannot carry any context, the cost becomes explicit and structural. A context anchor does not remove the need for that context; it makes the documentation that supplies it actually exist, and keeps it current enough to be worth reading.
