· 7 min read ·

The Context Window Is the Product: What Coding Agent Configuration Actually Does

Source: martinfowler

Back in February, Birgitta Böckeler published a piece on Martin Fowler’s blog about the explosion of options for configuring coding agent context. Reading it now, a few weeks later, the timing feels right for a retrospective look at what’s changed and what the underlying mechanics actually are.

The term “context engineering” was pushed into wide circulation by Andrej Karpathy around mid-2025, who argued that “prompt engineering” was always a misnomer. The actual skill was never crafting clever instructions; it was managing the full state of what the model sees: system prompts, file contents, tool results, conversation history, retrieved documents, and injected conventions. For a coding task, the wording of your request is often the least important variable. What matters is whether the model knows your project’s conventions, which files are relevant, and what happened three tool calls ago.

Shopping for context management across the major tools right now reveals genuinely different philosophies, not just different syntaxes for the same idea.

What “Context” Means in an Agent Loop

Before comparing tools, it helps to be precise about what we’re talking about. When a coding agent processes a request, its context window contains some combination of:

  • A system prompt (instructions, persona, constraints)
  • Project-level conventions injected at session start
  • The current file being edited
  • Related files retrieved by semantic search or explicit selection
  • A compressed map or summary of the broader codebase
  • The conversation history and recent tool call results
  • Retrieved external documentation or issue context

For a 128K token window, assuming roughly 3 tokens per word or ~4 characters per token, you have space for perhaps 400-600 pages of content. That sounds generous until you realize a single moderately large source file can consume 5,000-10,000 tokens, and a full conversation with several file edits and test runs can burn through 30-40K tokens in under an hour. Context engineering is fundamentally resource allocation: deciding what deserves space and where it should sit.

The “lost in the middle” problem, documented empirically by Liu et al. in 2023, adds a spatial dimension to this. Models perform reliably on information at the start and end of the context, but degrade on material buried in the middle. This isn’t fixed in newer models as much as marketing suggests. The implication: if your critical project conventions are injected mid-context, surrounded by tool outputs and file contents, they may as well not be there.

Claude Code’s Layered Approach

Claude Code has the most explicitly designed context hierarchy of any tool currently available. It uses a cascading set of CLAUDE.md files that are injected automatically into the system prompt:

~/.claude/CLAUDE.md          # user-global: applies to every project
<project-root>/CLAUDE.md     # project-level: checked into the repo
<subdir>/CLAUDE.md           # scoped: loaded when Claude works in that directory

The cascade means a frontend subpackage can have its own CLAUDE.md with React conventions without polluting the backend conventions defined at the project root. This composability is missing from most other tools.

The content of a well-maintained project CLAUDE.md tends to be operational rather than philosophical: the exact command to run tests, which directories are off-limits, naming conventions that deviate from defaults, architecture decisions that aren’t obvious from the code, and anything the model would otherwise have to guess at or ask about. A CLAUDE.md that says “always run pnpm test:unit --filter <package> rather than the top-level test command” saves the model from discovering this the hard way by watching a timeout.

Beyond the file system, Claude Code supports hooks that fire at defined points in the agent loop:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "npx prettier --write $CLAUDE_FILE_PATHS"
          }
        ]
      }
    ]
  }
}

This is context engineering in a different sense: not what the model reads, but what the environment does in response to model actions. A post-write hook that runs a formatter means the model never sees malformatted output and doesn’t need to include formatting instructions in its context budget. The hook encodes the constraint in the environment rather than the prompt, which is generally more reliable.

The tool permission system adds another layer. Rather than giving Claude Code blanket access or constantly approving every tool call, you can declare at the project level which tools are allowed without confirmation, which require it, and which are banned. For a project where the only acceptable shell commands are the test runner and the build script, encoding that constraint in .claude/settings.json is more robust than hoping the model respects a prose instruction.

How Aider’s Repo Map Solves a Different Problem

Aider takes a different angle on codebase context, one that’s worth understanding in detail because it solves a specific problem elegantly.

The problem: for any non-trivial task, the model needs to understand where in the codebase relevant code lives. But loading every potentially relevant file would exhaust the context window. Semantic search helps, but it retrieves chunks without their structural context, making it hard for the model to understand how pieces fit together.

Aider’s solution is the repository map: a compressed structural representation of the entire codebase, generated using tree-sitter to parse every file into its AST. The map contains file names, class names, function signatures, and the relationships between them, but not function bodies. A repository with 50,000 lines of code might produce a repo map of 3,000-5,000 tokens.

The clever part is the ranking. Aider uses a graph-based algorithm, conceptually similar to PageRank, to weight which symbols are most relevant to the current task. If you’re editing a function in auth.py, symbols that are called by or call into that function rank higher, and their signatures appear more prominently in the map. The model gets structural context proportional to relevance rather than a flat dump.

This approach trades completeness for coverage. The model knows the shape of the whole codebase rather than the full content of a few files. For tasks that require understanding cross-file dependencies, it outperforms whole-file injection. For tasks requiring deep understanding of a single complex file, it falls short. The --read flag lets you explicitly add files to context without them being editable, which fills the gap when you need both.

# Add reference files to context without allowing edits
aider --read docs/architecture.md --read src/types.ts src/auth.py

Cursor’s Retrieval Index

Cursor indexes the entire project using embeddings, updated as files change, and makes this accessible via the @codebase command. When you ask Cursor to find all the places that implement a particular pattern, it performs semantic search against this index, retrieves the top-k chunks, and injects them into the next request.

The move from .cursorrules (a single flat file) to the .cursor/rules/ directory reflects a maturation of the context configuration problem. A single rules file works fine for small projects, but as projects grow, you want different rules applied to different parts of the codebase:

.cursor/rules/
  typescript.mdc         # alwaysApply: true
  react-components.mdc   # globs: ["src/components/**/*.tsx"]
  api-routes.mdc         # globs: ["src/app/api/**/*.ts"]
  testing.mdc            # globs: ["**/*.test.ts", "**/*.spec.ts"]

Each rule file can specify whether it applies always, or only when certain file patterns are present in the current context. This means your testing conventions don’t clutter the context when you’re writing a component, and your component patterns don’t appear when you’re writing an API route. The model gets a focused, relevant slice of the project’s conventions rather than everything at once.

GitHub Copilot’s equivalent is .github/copilot-instructions.md, which went GA in 2025 and follows a simpler single-file model closer to the original .cursorrules design. It’s less expressive but lower friction to set up, which fits how Copilot is typically used: as a lighter-weight assistant rather than an autonomous agent.

MCP as the Dynamic Layer

The tools above all handle static context: project conventions, file contents, codebase maps. The emerging gap is dynamic context: the current state of external systems that the agent needs to make good decisions.

Model Context Protocol (MCP), Anthropic’s open standard for connecting agents to external data sources, is becoming the standard answer to this. An MCP server exposes tools that the agent can call to retrieve live data: the current CI build status, recent error logs from production, the content of a linked issue, or the result of a database query. From the agent’s perspective, this is just another tool call, but the result gets injected into context as fresh, current information.

The pattern that’s emerging in teams using Claude Code seriously is roughly: CLAUDE.md handles static project knowledge, hooks handle environmental constraints, and MCP handles live operational context. Each layer covers what the others can’t.

Context as Architecture

The SWE-agent paper from Princeton found that the design of the agent-computer interface, including how context is structured and what information is available, accounts for more performance variation than switching between similarly-capable base models. This is worth sitting with. The tooling around the model, not the model itself, is where most of the practical leverage lives.

The Fowler article frames context engineering as an exploding option space, which is accurate. But the underlying principle is simpler: the model can only use what’s in its context window, and what’s in the window is entirely within your control. Every project convention the model doesn’t know about is a source of drift. Every relevant file it doesn’t have access to is a potential error. Every stale assumption left over from three context windows ago is a bug waiting to happen.

Treating context configuration as a first-class engineering concern, with the same care given to code architecture or deployment infrastructure, is what separates projects where the agent is consistently useful from projects where it occasionally gets lucky. The tools have caught up to the need; the discipline of using them well is still catching up to the tools.

Was this interesting?