What the AI Memory Architecture Landscape Is Missing: Curated Context
Source: martinfowler
The attention that LLMs pay to tokens is not uniformly distributed. A 2023 Stanford paper by Liu et al. demonstrated this empirically: information placed in the middle of a long context is recalled less reliably than information at the beginning or end. The researchers called it the “lost in the middle” problem. It has direct implications for AI-assisted software development, where a working session might span dozens of exchanges, and a decision made in exchange three needs to influence everything that follows.
Context anchoring, described by Rahul Garg in a piece on Martin Fowler’s site, addresses this problem by externalizing decision context into a living document that gets injected at the start of each session. Before the first message, the AI already has the relevant constraints, key decisions, and current goals. The conversation starts oriented rather than blank.
The pattern is worth taking seriously, and its value becomes clearer when it is placed alongside the other approaches people use to give AI sessions persistent memory. Context anchoring occupies a specific position in that landscape, one that the other approaches cannot fill.
Two Distinct Failure Modes
The session memory problem in AI-assisted development has two components that are worth separating, because they have different causes and different solutions.
The first is within-session drift. Even in a single long conversation, early context loses influence as the exchange grows. The “lost in the middle” research quantified this for retrieval tasks: LLMs show a U-shaped performance curve relative to where information appears in their context window, performing best on information near the beginning and end, worst on information buried in the middle. In practice this means that a constraint you stated in message three may not be honored in message twenty, even though it remains technically present in the context.
The second failure mode is cross-session amnesia. Every new conversation starts from scratch. Decisions made yesterday, constraints established last week, the rationale behind an architectural choice: all of it disappears unless you write it somewhere and bring it back in.
These two failure modes require different responses. Cross-session persistence does not automatically fix within-session drift. Context anchoring addresses both at once by placing the critical information at the beginning of every session, where the attention distribution is most favorable.
The Memory Architecture Landscape
There are several common ways to give AI sessions something resembling persistent memory, and each has a different design center.
Fine-tuning encodes knowledge directly into model weights. It works for stable, domain-specific knowledge with long shelf lives, but it is expensive to update, requires significant data preparation, and is entirely unsuitable for per-project context that changes week to week. The payment service constraints you need today did not exist when the model was trained.
Retrieval-augmented generation (RAG) maintains a document store and injects the most relevant chunks at query time. This is genuinely powerful for large, heterogeneous knowledge bases where you cannot afford to put everything in context. For project-scoped working context, it often introduces more infrastructure than the problem warrants. Chunking strategy, embedding models, retrieval thresholds, index freshness: all of these become operational concerns for a problem that might be solved by a well-maintained markdown file injected at session start.
System prompts are static instructions set at the deployment or platform level. They establish the model’s general behavior and persona but are not designed to carry per-session working state. A system prompt for a coding assistant might define its general approach to code review; it cannot contain the specific constraints of the microservice you’re refactoring this sprint.
Native AI memory features, available in several commercial AI products, attempt to automate the extraction and persistence of information across sessions. The model or the platform decides what to remember, based on what it infers is worth retaining. This removes the curation cost but introduces an opacity cost. You do not know what has been retained, retained information can conflict with current context, and diagnosing why the AI is operating with wrong assumptions becomes difficult when the memory state is not directly visible.
Context anchoring occupies a different position entirely. The document is maintained by the human, injected explicitly at session start, and its full contents are always visible. Nothing is retained that was not consciously placed there.
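The injection step itself is mechanically simple. As a minimal sketch (assuming a generic chat-completion message format; the file name `ANCHOR.md` and the function name are hypothetical, not from the article):

```python
from pathlib import Path

def build_session_messages(anchor_path: str, first_user_message: str) -> list[dict]:
    """Prepend the context anchor so it occupies the start of the
    context window, where attention is most reliable."""
    anchor = Path(anchor_path).read_text(encoding="utf-8")
    return [
        # The full anchor travels as the first message of every session;
        # nothing is summarized or selectively retrieved.
        {"role": "system", "content": f"Project working context:\n\n{anchor}"},
        {"role": "user", "content": first_user_message},
    ]

# messages = build_session_messages("ANCHOR.md", "Continue the pipeline migration.")
```

Whatever client or tool actually sends the messages, the point is only that the anchor is injected explicitly, in full, and first, so its contents are both visible and favorably positioned.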
What the Document Contains
A practical context anchor document is not a project overview or a README. It is specifically the information that the AI needs to be aligned before the first exchange, and that would take too long to re-establish through conversation.
A minimal structure for a software project might look like this:
```markdown
# Working Context: Payment Service Refactor

## Active Constraints

- All changes must deploy under a feature flag
- No new third-party dependencies without security review
- The v1 API endpoints must remain backwards compatible through Q2

## Key Decisions

- Using PostgreSQL advisory locks for distributed locking, not Redis.
  Rationale: reduces operational dependencies; Redis not available in staging.
- New business logic lives in `src/services/`, not in route handlers.
  Rationale: established in architecture review, see ADR-0012.

## Current Session Goal

Migrate the order processing pipeline to use the new OrderService class.

## Known Gotchas

- `processOrder()` has an undocumented side effect on inventory state.
  Documented in PR #1823 but not yet reflected in inline comments.
- The test database does not enforce foreign keys, unlike production.
```
This is not an architecture document. It captures the things you would say if you were onboarding a colleague for the afternoon: the constraints that are not visible in the code, the decisions that have already been made, and the specific goal for the current session. The AI needs all three categories to avoid producing work that is technically correct but contextually wrong.
Why Manual Maintenance Is the Point
The labor involved in maintaining this document is sometimes framed as the pattern’s primary drawback. It is worth reframing it as its primary value.
Automated memory systems skip the curation step. That step, the act of deciding what matters, how to phrase it, and what to remove when it becomes stale, is not pure overhead. It is a lightweight form of the same reasoning that produces good technical documentation. Teams and individuals who maintain context anchors tend to get better at articulating constraints and decisions generally, not only in the AI context. The discipline of making implicit assumptions explicit has compounding benefits beyond the immediate session.
Comparison with native memory features highlights this. When a native memory system retains something incorrectly, the error is invisible until it produces a confusing output. When a context anchor document contains wrong information, it is directly auditable, correctable, and the developer who last edited it can explain why. This auditability matters in production development work, where the AI's operating assumptions need to be reproducible and reviewable, not probabilistically determined.
The maintenance lifecycle has three natural integration points: before the session starts, review and update the anchor; during the session, the AI or human can flag when the anchor appears incomplete or inconsistent with new information; after the session ends, capture any decisions or constraints that emerged. Treating the end-of-session update as a file commit alongside the code changes keeps the anchor in sync with the codebase.
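The end-of-session capture step can be scripted. A sketch of one possible helper (the function name, the dated-entry format, and the assumption that the anchor has a `## Key Decisions` section are all illustrative, not from the article):

```python
from datetime import date
from pathlib import Path

def append_decision(anchor_path: str, decision: str, rationale: str) -> None:
    """Record a decision that emerged during the session under the
    '## Key Decisions' heading, dated so stale entries are easy to spot
    during the next pre-session review."""
    entry = (
        f"- {decision} ({date.today().isoformat()})\n"
        f"  Rationale: {rationale}\n"
    )
    path = Path(anchor_path)
    text = path.read_text(encoding="utf-8")
    marker = "## Key Decisions\n"
    if marker in text:
        # Insert the newest decision directly under the heading.
        head, tail = text.split(marker, 1)
        text = head + marker + entry + tail
    else:
        # No decisions section yet; create one at the end.
        text += f"\n{marker}{entry}"
    path.write_text(text, encoding="utf-8")
```

Committing the updated anchor in the same commit as the code changes then keeps the two histories aligned, which is the property the lifecycle depends on.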
Where It Breaks Down
An anchor document that grows without discipline recreates the original attention problem. The same “lost in the middle” phenomenon that motivated the pattern will degrade it if the document grows long enough. An anchor should contain what the AI needs at session start, not a comprehensive record of everything that has happened on the project. If the document exceeds a few hundred words, it is probably including information that belongs in the codebase, in a dedicated ADR, or nowhere.
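The size discipline is easy to automate as a pre-session check. A sketch, assuming a soft budget of 300 words as one reading of the article's "a few hundred words" guideline (the threshold and function name are my choices, not the article's):

```python
from pathlib import Path

# A soft ceiling; the article suggests "a few hundred words" as the limit
# beyond which the anchor starts recreating the attention problem.
WORD_BUDGET = 300

def check_anchor_size(anchor_path: str) -> tuple[int, bool]:
    """Return the anchor's word count and whether it exceeds the budget,
    so a pre-session hook can warn before drift sets in."""
    words = len(Path(anchor_path).read_text(encoding="utf-8").split())
    return words, words > WORD_BUDGET
```

Run from a shell alias or editor hook, this turns "review the anchor before the session" from a habit into a prompt.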
The pattern also shows limited returns for short, self-contained tasks. If you are asking an AI to write a one-off script with no particular constraints or prior decisions to honor, the overhead of maintaining an anchor document outweighs the benefit. The pattern earns its cost on long-horizon work: multi-day projects, recurring tasks on an evolving codebase, anything where prior decisions compound across sessions.
Stale anchors are a distinct failure mode from having no anchor at all. A context anchor that describes a constraint that no longer applies, or a decision that was reversed last week, can actively mislead the AI in ways that are harder to diagnose than a blank-slate session. Maintenance is not optional; it is the mechanism by which the pattern works.
The Broader Frame
Context anchoring is not a novel technique invented for AI tooling. It is a software engineering discipline applied to a new medium. The practices that make projects manageable over time (writing down what was decided and why, keeping that record current, ensuring collaborators can orient quickly) transfer directly to AI-assisted work. What AI sessions make newly visible is how much tacit context engineers normally carry in their heads and re-establish through conversation, and how much of that tacit context never gets written down.
The framing from Garg’s article is accurate: this is about externalizing decision context into a living document. The reason the approach works where automated memory systems fall short is that the externalization is human-driven. The developer is forced to decide what matters, and that decision itself carries information that no automated extraction process can replicate.