Context Is the Program: The Engineering Layer Beneath Your Coding Agent

A few years ago, the main lever you had with an LLM-based coding tool was the words you put in your message. You learned to phrase things carefully, break requests into steps, and add clarifying context inline. That was prompt engineering, and it was a craft built around the limitations of the model.

Something has shifted. The Martin Fowler article on context engineering for coding agents, published in early 2026, is a good retrospective snapshot of where things stand, noting that the options for configuring and enriching a coding agent’s context have “exploded over the past few months.” Claude Code is named as leading this space, with other assistants following. That is accurate, but the more interesting thing to understand is why this shift happened and what the underlying architecture looks like.

The term “context engineering” started gaining traction in 2025, partly popularized by people like Andrej Karpathy, to distinguish between two different problems. Prompt engineering is about choosing the right words for your instruction. Context engineering is about what information you put into the context window at all: which files, which history, which tool outputs, which memory, and how it is all structured. For small tasks with self-contained inputs, the distinction does not matter much. For agents working on real codebases across long sessions, it is the whole ballgame.

The Memory Hierarchy Problem

Systems programmers are familiar with memory hierarchies. You have registers, L1/L2/L3 cache, RAM, and disk. Each level is faster but smaller than the one below it. Good systems performance comes from keeping the right data at the right level. The same framing applies surprisingly well to context engineering for coding agents.

A coding agent has a finite context window. Claude’s standard window is 200k tokens, which sounds large until you are working on a mid-sized codebase with deep dependencies. You cannot put everything in. The engineering question is always: what goes in, what stays out, and who decides?
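To make the "what goes in, what stays out" question concrete, here is a minimal sketch of one possible policy: rank candidate pieces of context by priority and greedily keep whatever still fits the budget. The Candidate type, the priority field, and the four-characters-per-token estimate are all assumptions made for illustration; no real agent is claimed to work this way, and a real tokenizer will give different counts.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str      # hypothetical label, e.g. a file path
    text: str      # content that would be placed in the context
    priority: int  # higher means more important (assigned by you)


def estimate_tokens(text: str) -> int:
    # Crude heuristic (assumption): roughly 4 characters per token.
    return max(1, len(text) // 4)


def pack_context(candidates: list[Candidate], budget: int) -> list[Candidate]:
    """Greedily keep the highest-priority candidates that fit in the budget."""
    chosen = []
    remaining = budget
    for cand in sorted(candidates, key=lambda c: c.priority, reverse=True):
        cost = estimate_tokens(cand.text)
        if cost <= remaining:
            chosen.append(cand)
            remaining -= cost
    return chosen


if __name__ == "__main__":
    pool = [
        Candidate("build-notes", "x" * 400, priority=3),
        Candidate("full-api-spec", "y" * 800, priority=2),
        Candidate("style-guide", "z" * 200, priority=1),
    ]
    print([c.name for c in pack_context(pool, budget=160)])
```

With a budget of 160 estimated tokens, the large spec is skipped and the two smaller items are kept, which is the kind of tradeoff the rest of this article is about.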

Claude Code has structured this into three distinct layers of persistent context:

Project memory lives in a CLAUDE.md file at the root of your repository. It gets read into every session automatically. This is where you document build commands, testing conventions, architecture decisions, files the agent should never touch, and anything else that would take a human five minutes to explain to a new teammate. It supports importing sub-documents using @path/to/file syntax, so you can compose context from multiple files instead of dumping everything into one.

User memory lives in ~/.claude/CLAUDE.md. This is your personal preferences layer: preferred code style, tools you always use, workflows you follow. It persists across all projects.

Local memory uses a CLAUDE.local.md file at the project root, excluded from version control by convention. This handles the overlap between project and personal: project-specific preferences that are yours alone, like your local database URL or the fact that you prefer running tests with a specific environment flag.

This three-level structure maps well onto the hierarchy analogy. Project memory is shared, versioned, team-owned. User memory is personal, global, always available. Local memory fills the awkward middle where personal preference meets project context. When there is a conflict, local overrides project, and both override user memory.
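The three layers can be sketched as a simple lookup. This is only an illustration of the file layout described above, not a description of how Claude Code loads memory internally; the memory_files helper is hypothetical, and the ordering (broadest scope first, so later entries are the more specific ones) simply mirrors the precedence described in this section.

```python
import tempfile
from pathlib import Path


def memory_files(project_root: Path, home: Path) -> list[Path]:
    """Return the memory files that exist, broadest scope first."""
    layers = [
        home / ".claude" / "CLAUDE.md",    # user memory: personal, all projects
        project_root / "CLAUDE.md",        # project memory: shared, versioned
        project_root / "CLAUDE.local.md",  # local memory: personal, this project
    ]
    return [path for path in layers if path.is_file()]


if __name__ == "__main__":
    # Demo in a throwaway directory so nothing on disk is touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "repo"
        home = Path(tmp) / "home"
        root.mkdir()
        home.mkdir()
        (root / "CLAUDE.md").write_text("Run tests with: npm test\n")
        (root / "CLAUDE.local.md").write_text("My local test server notes\n")
        for path in memory_files(root, home):
            print(path.relative_to(tmp))
```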

Dynamic Context: MCP and the Tool Layer

Static files cover a lot, but they have a hard limit: they can only encode what you knew when you wrote them. Real codebases change constantly, and the most valuable context is often situational.

This is where the Model Context Protocol (MCP), an open standard Anthropic released, becomes relevant. MCP defines a protocol for connecting an LLM to external servers that expose three kinds of things: resources (files, database records, API responses), tools (functions the model can call), and prompts (pre-built templates). Claude Code can connect to multiple MCP servers at once; project-scoped servers are declared in a .mcp.json file at the repository root.

The practical effect is that the agent’s effective context window expands beyond what fits in the literal token window. Instead of inlining a huge documentation file, you expose it through an MCP resource and let the model request what it needs. Instead of pre-loading database schemas, you expose a query tool and let the model ask questions.

This architecture turns context engineering from a static configuration problem into a dynamic infrastructure problem. You are not just writing a CLAUDE.md once. You are deciding what capabilities to expose, how to structure them, and what the latency cost of each retrieval is.

A concrete example: if you run a service with a large internal API surface, you could expose an MCP server that serves endpoint documentation indexed by path. The agent retrieves only the documentation for the endpoints it is actually touching, rather than ingesting your entire OpenAPI spec upfront. For a spec with hundreds of endpoints, this is the difference between fitting comfortably in the context window and burning half of it on irrelevant routes.
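The lookup at the heart of that example can be sketched in a few lines. This assumes the spec is already loaded as a Python dict in the usual OpenAPI shape (a top-level "paths" object keyed by route); the function names and the sample spec are hypothetical, and wiring the lookup into an actual MCP server is deliberately left out.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def index_by_path(spec: dict) -> dict:
    """Build a path -> path-item index from an OpenAPI-style dict."""
    return dict(spec.get("paths", {}))


def endpoint_docs(index: dict, path: str) -> str:
    """Return a compact summary for one endpoint only."""
    item = index.get(path)
    if item is None:
        return f"No documentation for {path}"
    lines = [path]
    for method, operation in item.items():
        # Path items can also hold non-operation keys such as "parameters".
        if method.lower() not in HTTP_METHODS:
            continue
        summary = operation.get("summary", "(no summary)")
        lines.append(f"  {method.upper()}: {summary}")
    return "\n".join(lines)


# Hypothetical spec, standing in for one with hundreds of routes.
SAMPLE_SPEC = {
    "paths": {
        "/users/{id}": {
            "get": {"summary": "Fetch one user"},
            "delete": {"summary": "Remove a user"},
        },
        "/orders": {"post": {"summary": "Create an order"}},
    }
}

if __name__ == "__main__":
    index = index_by_path(SAMPLE_SPEC)
    print(endpoint_docs(index, "/orders"))
```

The agent would ask for "/orders" and receive two lines of text, while the rest of the spec never enters the context window.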

Hooks: Context Injection at Runtime

Claude Code’s hooks feature is the most architecturally interesting recent addition. Hooks let you attach shell commands to specific events in the agent loop: before a tool runs, after a tool runs, when a session starts, when a notification fires. Depending on the event, a hook’s output can be fed back into the context.

This is powerful because it means you can make context reactive. If the agent is about to write to a file, a pre-tool hook can inject the current git diff for that file. If the agent has just run tests, a post-tool hook can inject coverage metrics. You are not predicting what context will be relevant; you are responding to what the agent is doing.

Hooks also handle the side-effects case cleanly. You might want to log every file edit to a separate audit trail, or update a project management ticket when certain tasks complete. These do not need to go into the context at all; they are just triggered by agent actions.

A minimal hook configuration in .claude/settings.json looks roughly like this:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | xargs git diff HEAD --"
          }
        ]
      }
    ]
  }
}

Claude Code passes the pending tool call to the hook as JSON on standard input, so the command can pull the target path out with jq and respond to what is actually happening rather than firing blindly.
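Once a hook does more than a one-liner, it is easier to move the logic into a script. The sketch below assumes the hook receives the tool call as a JSON object on standard input with the target file under tool_input.file_path; the function names are hypothetical, and the script only builds the git command rather than running it, so it can be read and tested without a repository.

```python
import json
from typing import Optional


def target_path(payload: dict) -> Optional[str]:
    """Extract the file the tool is about to touch, if the payload names one."""
    tool_input = payload.get("tool_input") or {}
    path = tool_input.get("file_path")
    return path if isinstance(path, str) and path else None


def diff_command(raw_stdin: str) -> Optional[list]:
    """Turn raw hook input into a git diff argv, or None if nothing applies."""
    try:
        payload = json.loads(raw_stdin)
    except json.JSONDecodeError:
        return None  # malformed input: do nothing instead of failing the hook
    if not isinstance(payload, dict):
        return None
    path = target_path(payload)
    if path is None:
        return None
    # "--" keeps git from mistaking an odd file name for an option.
    return ["git", "diff", "HEAD", "--", path]


if __name__ == "__main__":
    # Hypothetical payload shaped like the assumption described above.
    sample = json.dumps({"tool_name": "Write", "tool_input": {"file_path": "src/bot.js"}})
    print(diff_command(sample))
```

A real hook script would read sys.stdin, pass the result to subprocess.run, and print the diff; those steps are omitted here to keep the sketch free of side effects.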

How Other Tools Compare

Claude Code is not operating in isolation. Cursor, GitHub Copilot, Windsurf, and Aider have all converged on similar patterns, though with different architectural choices.

Cursor started with a single .cursorrules file and has since moved to a .cursor/rules/ directory structure where each rule file can be scoped to specific file globs and set to apply always, on request, or automatically based on context. This is more granular than a single CLAUDE.md, though it requires more upfront organization to be useful.
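The idea of glob-scoped rules can be illustrated with a small matcher. This is a sketch of the concept only, not Cursor's rule format or matching logic: the Rule type and its fields are hypothetical, and fnmatch is used as an approximation of glob semantics (its "*" also matches path separators).

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch


@dataclass
class Rule:
    name: str
    globs: list = field(default_factory=list)  # file patterns this rule is scoped to
    always: bool = False                       # apply regardless of the file


def rules_for(path: str, rules: list) -> list:
    """Return the names of rules that apply to the given file path."""
    return [
        rule.name
        for rule in rules
        if rule.always or any(fnmatch(path, pattern) for pattern in rule.globs)
    ]


# Hypothetical rule set for illustration.
RULES = [
    Rule("house-style", always=True),
    Rule("api-handlers", globs=["src/api/*.ts"]),
    Rule("migrations", globs=["db/migrations/*.sql"]),
]

if __name__ == "__main__":
    print(rules_for("src/api/users.ts", RULES))
```

Only the rules relevant to the file being edited are selected, which is what keeps scoped rules cheaper in tokens than one large instruction file.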

GitHub Copilot uses .github/copilot-instructions.md for project-level context, which is intentionally simple. Microsoft’s approach has been conservative: ship something that works everywhere in the existing GitHub ecosystem rather than something architecturally sophisticated.

Aider takes a fundamentally different approach with its repo-map. Rather than having you manually configure what context to load, Aider builds a compressed graph of your codebase using tree-sitter, representing function signatures, class definitions, and call relationships in a compact format. The repo-map is regenerated dynamically and fits within the model’s context window even for large projects. It means the model has a structural understanding of the whole codebase without you having to articulate it. The tradeoff is that Aider’s context is always structural; it cannot capture the intent, team conventions, or known gotchas that you would put in a CLAUDE.md.
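To give a feel for what a structural map contains, here is a toy version. It is not Aider's implementation: it swaps tree-sitter for Python's built-in ast module, handles only Python source, lists only top-level functions plus classes and their methods, and ignores call relationships entirely.

```python
import ast


def _signature(node) -> str:
    # Positional parameters only; *args, **kwargs and keyword-only
    # parameters are left out to keep the sketch short.
    params = ", ".join(arg.arg for arg in node.args.args)
    keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
    return f"{keyword} {node.name}({params})"


def repo_map(source: str) -> list:
    """Summarize one module as class names and function signatures."""
    functions = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, functions):
            lines.append(_signature(node))
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}:")
            for item in node.body:
                if isinstance(item, functions):
                    lines.append("    " + _signature(item))
    return lines


# Hypothetical module used as input.
SAMPLE = '''
class Bot:
    def start(self, token):
        pass

def register(command, handler):
    pass
'''

if __name__ == "__main__":
    print("\n".join(repo_map(SAMPLE)))
```

The output is a handful of lines per file, with bodies stripped, which is why a map like this can cover a large project in a small number of tokens.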

These approaches are not mutually exclusive. Future tools will likely combine structural maps with declarative memory files and dynamic retrieval, using each for what it is best at.

The Engineering Discipline This Requires

What the shift from prompt engineering to context engineering actually demands is a change in where you spend your attention. Writing better prompts is a conversational skill. Designing context architecture is a systems design skill.

A CLAUDE.md that works well for a solo developer on a small project will fail on a large team because the instructions will be too generic, too sparse, or actively wrong for different parts of the codebase. Context that works for greenfield development will mislead an agent doing refactoring because it does not encode the constraints of the existing system.

The useful mental model is to treat your context configuration as documentation that runs. Every piece of context you add has a cost, because it consumes tokens that could be used for actual work. Every piece you omit has a cost, because the agent may make wrong assumptions or ask clarifying questions. The discipline is in finding what is worth the space.

This is also why versioning and reviewing your CLAUDE.md matters as much as reviewing your code. An outdated build command or a stale list of conventions can cause an agent to fail in confusing ways that are hard to debug. The context is load-bearing.

What This Looks Like in Practice

For a Discord bot project, the CLAUDE.md I keep at the root covers a few things: the Node version and package manager in use, the command to run the bot locally, the environment variables required and where to find their values locally, the directory structure and what each top-level folder is for, and a short list of conventions like preferred error handling patterns and how slash commands are registered.

I also keep a local CLAUDE.local.md with my personal test server ID and some shorthand for common tasks I run in isolation. None of that belongs in version control.

The MCP setup is lighter: I have a local MCP server that can query the bot’s SQLite database so the agent can inspect state during debugging without me having to copy-paste query results. That single tool has saved more context space than any amount of careful instruction writing.

Hooks I have not fully exploited yet, but the use case I want is a pre-edit hook that injects recent git log entries for files the agent is about to modify. The history often contains the reason a piece of code is written strangely, which is exactly the context that prevents the agent from “cleaning up” something that was written that way deliberately.

The Bigger Picture

What is happening across the industry is a convergence on one insight: the model’s behavior is shaped more by the information available to it than by the wording of the instruction it is given. A well-configured context can make a mediocre prompt work adequately. A poorly configured context can make a perfect prompt fail consistently.

This means the investment in context tooling is justified. CLAUDE.md hierarchies, MCP servers, and hook systems are not frivolous features. They are the interface through which you engineer reliable agent behavior, and treating them with the same care you give to code is the right instinct to build on.