Simon Willison’s guide to how coding agents work covers the tool loop clearly: the model calls a tool, the result feeds back into context, and the loop continues. What the loop-level description leaves implicit is that “edit a file” is itself a deeply non-trivial operation. The format in which the agent specifies that edit, how it tells the runtime what to change and where, is one of the most consequential design decisions in any coding agent’s architecture.
Most users never see this layer. The agent edits a file and the file is different; the mechanism is invisible. But for anyone building coding agents or debugging why a particular agent keeps corrupting files in certain situations, the edit format explains a lot.
Four Approaches, Four Different Bets
The major coding agents have landed on four distinct approaches.
Whole-file output is the simplest. The model generates the complete new file contents in its response, and the runtime replaces the file. No need to parse a diff or locate a target string. The model just outputs the finished result. The problem is token cost: a 500-line file requires generating 500 lines of output even if only three lines changed. For small files this is fine. For anything over a few hundred lines, whole-file output becomes impractical and error-prone, as the model starts drifting from the original content in the unchanged sections.
Unified diff format is what git diff produces: context lines, minus-lines for deletions, plus-lines for additions, with @@ hunk headers indicating line positions.
--- a/src/auth/session.ts
+++ b/src/auth/session.ts
@@ -42,4 +42,4 @@
 export function createSession(userId: string) {
-  const expiry = new Date(Date.now() + 3600);
+  const expiry = new Date(Date.now() + 3600 * 1000);
   return { userId, expiry };
 }
Unified diffs are compact and semantically clear. The problem is that applying them requires matching line numbers or context lines exactly, and line numbers shift as earlier edits in the same file take effect during a session. Context matching helps, but a long agent session making multiple edits to the same file accumulates position-tracking debt that leads to misapplied patches.
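The content-matching alternative can be sketched in a few lines: locate the hunk’s old block by exact search instead of trusting the @@ line numbers. This is an illustrative sketch, not a real patch implementation; tools like GNU patch add fuzz factors and offset search on top of this idea.

```typescript
// Apply one unified-diff hunk by matching content rather than line numbers.
// Context lines start with " ", deletions with "-", additions with "+".
function applyHunk(fileText: string, hunkLines: string[]): string {
  // The "old block" is what the file should currently contain here.
  const oldBlock = hunkLines
    .filter((l) => l.startsWith(" ") || l.startsWith("-"))
    .map((l) => l.slice(1))
    .join("\n");
  // The "new block" is what it should contain afterwards.
  const newBlock = hunkLines
    .filter((l) => l.startsWith(" ") || l.startsWith("+"))
    .map((l) => l.slice(1))
    .join("\n");
  const i = fileText.indexOf(oldBlock);
  if (i === -1) throw new Error("hunk context does not match file");
  return fileText.slice(0, i) + newBlock + fileText.slice(i + oldBlock.length);
}
```

Searching for the old block sidesteps stale line numbers, but it inherits the same uniqueness requirement that string-matching formats face.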
SEARCH/REPLACE blocks are Aider’s primary format. The model outputs structured text blocks within its response:
src/auth/session.ts
<<<<<<< SEARCH
  const expiry = new Date(Date.now() + 3600);
=======
  const expiry = new Date(Date.now() + 3600 * 1000);
>>>>>>> REPLACE
The runtime finds the SEARCH string in the file and replaces it with the REPLACE string. No line numbers. No diff heuristics. Just exact string matching against the current file contents.
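A minimal sketch of that apply step, assuming well-formed markers; Aider’s actual parser tolerates more formatting variation than this single regex does:

```typescript
// Parse one SEARCH/REPLACE block and apply it to the current file text.
function applySearchReplace(fileText: string, block: string): string {
  const m = block.match(
    /<{7} SEARCH\n([\s\S]*?)\n={7}\n([\s\S]*?)\n>{7} REPLACE/
  );
  if (!m) throw new Error("malformed SEARCH/REPLACE block");
  const [, search, replace] = m;
  const i = fileText.indexOf(search);
  if (i === -1) throw new Error("SEARCH text not found in file");
  // Splice by index rather than String.replace to avoid "$"-pattern surprises
  // when the replacement text happens to contain dollar signs.
  return fileText.slice(0, i) + replace + fileText.slice(i + search.length);
}
```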
old_string / new_string JSON tool parameters are what Claude Code and similar tool-calling agents use. The edit is expressed as a structured tool call:
{
  "name": "edit_file",
  "input": {
    "path": "src/auth/session.ts",
    "old_string": "  const expiry = new Date(Date.now() + 3600);",
    "new_string": "  const expiry = new Date(Date.now() + 3600 * 1000);"
  }
}
Functionally identical to SEARCH/REPLACE blocks, but using the model’s native tool-calling API rather than structured text in the assistant response.
Why String Matching Outperforms Line Numbers
The core insight behind both the SEARCH/REPLACE and old_string/new_string approaches is that string matching is more robust than line-number addressing across the span of a long agent session.
Consider what happens when an agent makes three sequential edits to the same file. After the first edit, the file has changed. If the agent planned all three edits in a single reasoning turn and expressed them as line-numbered hunks, hunks two and three will refer to line positions that may no longer exist in the modified file. String matching against current file contents does not have this problem; the runtime always searches the file as it currently exists.
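The drift is easy to demonstrate with a toy example. One insertion shifts every later line number, so a second edit recorded against the original file’s numbering lands on the wrong line, while a content search against the current file still finds its target:

```typescript
// Original file as an array of lines (1-indexed in the commentary below).
const original = ["alpha", "beta", "gamma", "delta"];

// Edit 1: insert a line after line 1.
const afterEdit1 = [...original.slice(0, 1), "inserted", ...original.slice(1)];
// afterEdit1 is ["alpha", "inserted", "beta", "gamma", "delta"]

// Edit 2 was planned against the ORIGINAL file, where "gamma" was line 3.
// Line-number addressing now points at the wrong line:
const byLineNumber = afterEdit1[3 - 1]; // "beta", not "gamma"

// String matching against current contents still finds the target:
const byString = afterEdit1.indexOf("gamma"); // correct position
```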
Aider’s documentation on edit formats explains the choice explicitly: SEARCH/REPLACE is more reliable than unified diff across the range of models Aider supports because it removes the line-number dependency. Claude Code’s edit tool makes the same bet.
There is a constraint that comes with string matching: the SEARCH string or old_string must be unique within the file. If the same line appears twice, the runtime cannot know which occurrence to replace. The model has to include enough surrounding context to make the target unique. This is occasionally a source of failures when a file contains genuinely identical repeated blocks, but in practice it is less common than the line-number drift problem it replaces.
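The uniqueness check itself is simple. The sketch below uses hypothetical function names and error wording; it is not Claude Code’s or Aider’s actual implementation, only the shape of the constraint:

```typescript
// Count non-overlapping occurrences of needle in haystack.
function countOccurrences(haystack: string, needle: string): number {
  let count = 0;
  let i = haystack.indexOf(needle);
  while (i !== -1) {
    count++;
    i = haystack.indexOf(needle, i + 1);
  }
  return count;
}

// Apply an old_string -> new_string edit, rejecting ambiguous targets.
function editFile(fileText: string, oldString: string, newString: string): string {
  const n = countOccurrences(fileText, oldString);
  if (n === 0) throw new Error("old_string not found");
  if (n > 1) {
    // The model must resend with more surrounding context to disambiguate.
    throw new Error(`old_string matches ${n} locations; add surrounding context`);
  }
  const i = fileText.indexOf(oldString);
  return fileText.slice(0, i) + newString + fileText.slice(i + oldString.length);
}
```

Rejecting ambiguous matches, rather than silently picking the first occurrence, is what turns the uniqueness constraint into a recoverable error instead of a corrupted file.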
Why Aider Uses Text Output Instead of Tool Calling
Aider predates widespread tool-calling API support and made a deliberate choice to keep using text-output edits even as tool calling became standard. Paul Gauthier, Aider’s creator, has written about this: when tool-calling APIs were less reliable and model support varied, embedding the edit format in the assistant response text was more portable and less prone to silent failures from malformed JSON arguments.
There is another property that matters in practice: text-output edits are visible in the conversation history as readable content. When you review an Aider session, you can see exactly what the model proposed at each step. With JSON tool calls, the edit parameters are in a structured blob that requires rendering to read. For debugging and auditing agent behavior, the text-output approach has a real advantage.
The tradeoff is that text-output format requires a parser that handles the model’s occasional formatting variations. Tool calling APIs return strictly structured JSON that requires no custom parsing. Anthropic’s tool call format in particular, where arguments come back as a typed object rather than a JSON string, is reliable enough that most new agents built against the Claude API use the tool approach rather than the text-output approach.
The Read-Before-Write Discipline
Both the SEARCH/REPLACE and old_string/new_string formats share a common failure mode: if the model generates a target string that does not exactly match the current file contents, the edit fails. Trailing whitespace, different indentation, and slightly different variable names all cause mismatches.
The mitigation is the read-before-write discipline. A well-designed agent should read a file before editing it, not just to understand the code, but to see the exact characters it will need to reproduce in the old_string. Claude Code’s system prompt enforces this explicitly, as does Aider’s default behavior.
When the read is recent and the file has not been modified externally since, the old_string matches reliably. Failures tend to cluster in two situations: when the model generates the old_string from training data memory rather than from the file read in context, and when another tool (the user, a formatter, a background process) modifies the file between the read and the edit.
The error message design matters here. A good edit tool returns the failure with context:
Error: old_string not found in src/auth/session.ts.
Current lines 40-45:
  const expiry = new Date(Date.now() + 3600000);
  return { userId, expiry };
Given that error, the model can self-correct by re-reading the file and constructing a new old_string against the actual current content. An error that returns only “edit failed” forces the model to guess at what went wrong, which often leads to retry spirals rather than productive recovery.
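Producing that kind of error can be sketched as follows. The locate heuristic here is deliberately crude (match on a prefix of the target’s first line), and the function name is hypothetical; real tools can afford fuzzier similarity measures:

```typescript
// Build an edit-failure message that shows the file's actual nearby content.
function editErrorWithContext(
  fileText: string,
  oldString: string,
  path: string
): string {
  const fileLines = fileText.split("\n");
  const firstLine = oldString.split("\n")[0].trim();
  // Crude locate: first file line sharing a 10-char prefix with the target.
  let near = fileLines.findIndex((l) =>
    l.trim().startsWith(firstLine.slice(0, 10))
  );
  if (near === -1) near = 0;
  const start = Math.max(0, near - 2);
  const end = Math.min(fileLines.length, near + 3);
  const context = fileLines.slice(start, end).join("\n");
  return (
    `Error: old_string not found in ${path}.\n` +
    `Current lines ${start + 1}-${end}:\n${context}`
  );
}
```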
What This Means When You Build Your Own Tools
If you are building tools for a coding agent, the edit format question comes up whenever you need the model to modify structured content. Source code is the obvious case, but the same considerations apply to configuration files, JSON, YAML, and any structured text the agent needs to update.
String-matching approaches work best when target strings are unique and the model has read the file recently. For formats where exact string matching is fragile, like heavily generated or formatted config files, providing a structured update tool that accepts a key path and a new value is more reliable than asking the model to reproduce exact text fragments.
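Such a structured update tool might look like the sketch below for JSON-like config; the setByPath name and the dot-separated path convention are assumptions for illustration, not an existing API:

```typescript
// Set a value in a nested config object by dot-separated key path,
// creating intermediate objects as needed. Mutates config in place.
function setByPath(
  config: Record<string, unknown>,
  path: string,
  value: unknown
): void {
  const keys = path.split(".");
  let node: any = config;
  for (const key of keys.slice(0, -1)) {
    if (typeof node[key] !== "object" || node[key] === null) node[key] = {};
    node = node[key];
  }
  node[keys[keys.length - 1]] = value;
}
```

Because the model only supplies a key path and a value, the edit cannot fail on whitespace or formatting differences; the tool owns serialization.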
For source code specifically, the read-before-write discipline is not optional. An agent that writes to files it has not read in the current session will produce incorrect edits at a rate that makes it unreliable for real work. The session-scoped file read is not just for understanding the code; it is the mechanism that makes precise editing possible at all.