
Why Your Team Keeps Paying for the Same AI Context, Session After Session

Source: martinfowler

There’s a recognizable pattern in AI-assisted development. You open a new session, ask the tool to add a feature, get back code that compiles and looks plausible, then spend the next ten minutes explaining why it’s wrong for your specific system: why you don’t use that library, why that abstraction layer doesn’t fit, why the generated type should implement the interface your team defined two quarters ago. The correction loop usually produces something workable, but you’ve spent more time than the task was worth.

The individual experience of this loop is well-documented. Rahul Garg names it the “frustration loop” in an article on Martin Fowler’s site, and the diagnosis is accurate: the model is capable enough, but it lacks the project-specific knowledge required to generate contextually correct code. The solution Garg proposes, which he calls knowledge priming, is to front-load that context before asking for anything, systematically and deliberately.

What the individual framing understates is how this plays out at team scale. When you think of the frustration loop as an individual productivity issue, the accounting makes the investment look marginal. Writing a comprehensive priming file takes an afternoon. Recovering from one misaligned AI response takes fifteen minutes. On a bad day you might encounter two or three such responses. The arithmetic seems borderline at best.

That math changes when you count the whole team.

The Invisible Team Cost

A context file created once is consumed by every developer on the team, in every session, across the lifetime of the project. The one-time investment cost stays constant while the benefit compounds with team size, session frequency, and project longevity.

Consider a team of four developers who each use an AI coding assistant for two hours per day, five days per week. If each developer spends an average of twenty minutes per day in correction loops, that’s eighty minutes of team time lost daily. Over a twelve-week quarter, that’s roughly eighty hours, across the whole team, spent re-explaining context the AI should have had from the start.
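
The back-of-envelope math is easy to check directly (the numbers are the scenario’s illustrative assumptions, not measurements):

```python
# Cost of correction loops under the assumptions above.
developers = 4
minutes_lost_per_dev_per_day = 20  # average time spent in correction loops
workdays = 5 * 12                  # five days a week, twelve-week quarter

daily_team_minutes = developers * minutes_lost_per_dev_per_day
quarterly_hours = daily_team_minutes * workdays / 60
print(daily_team_minutes, quarterly_hours)  # 80 minutes/day, 80.0 hours/quarter
```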

A well-maintained priming file doesn’t eliminate that cost entirely, but it attacks the most expensive part: the recurring explanations for the same structural decisions. The prohibition on a specific library, the architectural boundary between modules, the reason your team chose one concurrency pattern over another. These explanations cost the same whether they happen in week one or week twelve, and they happen every session without a priming file.

Garg draws a clean distinction between reactive and proactive approaches. Reactive means explaining your codebase mid-conversation, one session at a time, as problems surface. Proactive means explaining it once, systematically, in a form that every future session can use automatically. The reactive approach feels lower-cost per instance because the work of explaining is diffuse and invisible. The proactive investment is visible and concentrated, which makes it feel more expensive than it is.

What Actually Belongs in a Priming File

The failure mode for priming files is filling them with information the model already handles correctly. Teams write long lists of style conventions the model follows by default, obvious things about the project structure that are apparent from reading the code, rules already enforced by linters. These entries add noise without adding constraint.

The entries that earn their place share a property: they tell the model something it cannot infer from the code itself. Specifically, they capture decisions that look arbitrary without context, prohibitions that override the model’s training-data priors, and constraints that emerged from incidents or architectural choices made before the current state of the codebase.

A low-value entry looks like this:

Use TypeScript for all new files.

The model will infer this from the existing files. The entry is noise.

A high-value entry looks like this:

Do not use the pg package directly. All database access goes through
/packages/db. This was enforced after an incident where connection pool
exhaustion in one service could not be diagnosed because connections were
being opened outside the centralized pool.

The difference is not the prohibition itself; it’s the reason. Given the constraint plus its rationale, an LLM predicting what a competent developer would write can generalize to edge cases the explicit rule didn’t anticipate. A model that only knows the rule will miss the cases the rule didn’t enumerate.

Negative examples are especially valuable for patterns the model would otherwise generate with high probability. If your codebase avoids a common pattern that appears frequently in training data, showing an annotated example of what not to do suppresses the pattern more reliably than simply not mentioning it. The model has strong priors from training; explicit prohibitions with annotations are the most direct way to override them.
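
Reusing the database example above, an annotated negative entry might look like this (the exact wording is illustrative, not a prescribed format):

```text
## Do not: open database connections directly

    // WRONG: bypasses the centralized pool
    import { Pool } from "pg";
    const pool = new Pool();

All database access goes through /packages/db. Direct use of pg caused the
connection-pool exhaustion incident: connections opened outside the central
wrapper could not be observed or diagnosed. If you think you need a raw
connection, you almost certainly need a new helper in /packages/db instead.
```

The annotated wrong example matters because the direct-import pattern is exactly what the model’s training data makes most probable.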

A useful test for any candidate entry: would a capable new developer reading this file learn something non-obvious about the codebase? If yes, it belongs. If the answer is “they’d figure that out in ten minutes of reading the code,” skip it.

Stale Context Is Worse Than No Context

The compound-returns story has a failure mode that teams underestimate. A priming file that accurately described the architecture eight months ago, and hasn’t been updated through any refactoring or architectural change since, is not neutral. It actively misleads.

When a priming file states an architectural constraint, the model treats that statement as authoritative, overriding what it might otherwise infer from reading the current code. A file that describes the old module boundary causes the model to generate code consistent with an architecture the team has since dismantled. The stated constraint competes with, and often wins against, the observable reality in the codebase.

This is the specific failure mode that makes the knowledge priming practice difficult as a long-term discipline. The information in a priming file is most valuable precisely because it captures things not visible in the code, but that invisibility also means drift is hard to detect. You can’t run a test to verify that the prohibitions in your CLAUDE.md are still accurate.

Teams that handle this well treat the priming file as a first-class artifact with explicit ownership, reviewed in pull requests when architectural decisions change, included in the definition of done for any work that makes a stated constraint obsolete. The analogy to CI pipeline configuration is accurate: you wouldn’t let your pipeline configuration drift from actual project requirements without noticing; the context file deserves the same discipline.
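
Not all drift is detectable automatically, but some narrow classes are. As an illustrative sketch, a CI step could verify that every repo path a context file mentions still exists, catching renamed or deleted modules (the file name and regex are assumptions, not part of the original article):

```python
# Illustrative staleness check for a context file such as CLAUDE.md.
# It catches only one narrow class of drift: referenced paths that were
# later renamed or deleted. Semantic drift still needs human review.
import re
from pathlib import Path

def missing_paths(context_text: str, repo_root: Path) -> list[str]:
    # Collect absolute-from-root paths like /packages/db mentioned in prose,
    # stripping trailing punctuation from sentence-final mentions.
    matches = re.findall(r"(?<![\w.])/[\w./-]+", context_text)
    candidates = {m.rstrip(".,;:") for m in matches}
    return sorted(p for p in candidates
                  if not (repo_root / p.lstrip("/")).exists())

# In CI: fail the build if the context file references paths that no longer
# exist, e.g. missing_paths(Path("CLAUDE.md").read_text(), Path(".")).
```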

The Coordination Problem That Keeps Teams Underinvesting

There’s a structural reason why teams systematically underinvest in proactive context, even when they understand the ROI case. The cost of writing the priming file is individual and visible. The benefit is collective and distributed. The developer who writes the file pays the cost; the entire team, across all future sessions, captures the benefit.

This is a public goods problem. Rational actors underinvest in public goods because individual incentives don’t capture collective returns. When a single developer gets frustrated with the AI’s lack of context and types a correction, that correction costs them a few minutes. When they instead invest thirty minutes in documenting the architectural constraint that caused the confusion, the benefit accrues to every session thereafter, for every team member. But the thirty-minute cost is absorbed entirely by the person making the investment.

Garg’s framing addresses this implicitly by positioning knowledge priming as a team practice rather than a personal productivity technique. The pattern makes the investment visible and attributable, which helps justify it in planning. But the coordination problem doesn’t fully dissolve until teams explicitly assign ownership of context files, review them in PRs alongside code, and include their maintenance in engineering work estimates.

Birgitta Böckeler’s work on AI-assisted development puts the broader point clearly: productivity with these tools depends more on the quality of the supporting infrastructure than on the quality of the model. Context files are part of that infrastructure, and infrastructure without ownership degrades.

The Forcing Function Effect

Teams that write and maintain priming files often report something unexpected: the process reveals architectural disagreements among team members that were never surfaced explicitly. When someone attempts to write down the canonical approach to a particular problem, they frequently discover that no one has agreed on what canonical means. Different developers hold different mental models of the architecture, and the tacit knowledge they carry is inconsistent.

Writing a priming file forces these disagreements into the open. The question “should we document this as a prohibition or a preference?” requires the team to first answer “is this a prohibition or a preference?” That conversation produces clarity for human contributors, not only for AI sessions.

Architecture Decision Records serve a similar forcing function for documenting design choices, and they share the same maintenance discipline challenge. The priming file is arguably more tractable because its contents are in direct conversation with day-to-day tool use: when a stale entry steers the AI toward code that conflicts with the current architecture, the developer corrects it on the spot, and if the same correction keeps recurring, the drift becomes visible as a maintenance signal.

Context Investment as Engineering Work

The practical implication is that building and maintaining a priming file is engineering work with a measurable return, not documentation overhead. The investment is front-loaded; the returns compound across team size and project lifetime. The cost of a single correction loop is modest; the cost of that same correction loop multiplied by every developer across every session is substantial and largely invisible.

The tools make this straightforward. Claude Code reads CLAUDE.md automatically at the start of every session, composing a layered hierarchy of global user preferences, repo-level conventions, and per-directory constraints. Cursor applies .cursorrules globally and scopes entries under .cursor/rules/ to file patterns. GitHub Copilot reads .github/copilot-instructions.md and includes it in every Copilot Chat context in the repository.
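
As a concrete sketch, a minimal repo-level CLAUDE.md might contain nothing but high-signal entries of the kind described above (the entries themselves are illustrative):

```text
# CLAUDE.md (repo root)

## Hard constraints
- Do not use the pg package directly. All database access goes through
  /packages/db. (Enforced after a connection-pool exhaustion incident;
  connections opened outside the central pool could not be diagnosed.)

## Architecture
- Services communicate through the shared event bus, never via direct
  HTTP calls between services. This looks arbitrary without context;
  the rationale is recorded in the team's ADRs.
```

Note what is absent: no style rules the linter already enforces, nothing a reader could infer from the code in ten minutes.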

The infrastructure for proactive context is already in place across every major AI coding tool. What’s missing in most teams is the discipline to populate and maintain it systematically, with ownership and review cadence, as the codebase evolves.

The frustration loop is a context problem. Solving it once, for the whole team, at the beginning of a project, is the investment that pays for itself every day.
