Minor Version, Major Signal: What the Opus 4.6-to-4.7 Prompt Diff Tells Us
Source: hackernews
Simon Willison published a detailed comparison of the system prompts between Claude Opus 4.6 and 4.7 last Friday, and the 176 upvotes and 111 comments it received on Hacker News reflect something beyond casual interest. People building on these APIs have learned to watch Willison’s work because it is, functionally, the only behavioral changelog the community has.
I have written before about the general methodology Willison uses to extract and version-track these prompts. The short version: Claude’s base system prompt can be retrieved directly by asking for it, because Anthropic’s model spec instructs Claude not to actively lie about having instructions, even if it might keep specific operator prompts confidential. Put that on a cron job, commit the result to git, and you get a timeline of Anthropic’s internal guidance to their model without any cooperation from Anthropic.
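The cron-and-git loop described above can be sketched roughly as follows. This is a minimal illustration, not Willison's actual tooling: the function names `save_if_changed` and `commit_snapshot` are hypothetical, and how the prompt text is obtained is deliberately left out. The `git` invocations are ordinary `git add` and `git commit` calls run through `subprocess`.

```python
import subprocess
from pathlib import Path


def save_if_changed(prompt_text: str, path: Path) -> bool:
    """Write prompt_text to path only if it differs from the stored copy.

    Returns True when the file was written, meaning there is a change
    worth committing. (Hypothetical helper, for illustration only.)
    """
    data = prompt_text.encode("utf-8")
    if path.exists() and path.read_bytes() == data:
        return False  # nothing changed since the last snapshot
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return True


def commit_snapshot(repo_dir: Path, filename: str, message: str) -> None:
    """Stage and commit one file inside an existing git repository."""
    subprocess.run(["git", "-C", str(repo_dir), "add", filename], check=True)
    subprocess.run(
        ["git", "-C", str(repo_dir), "commit", "-m", message], check=True
    )
```

A scheduled job would call `save_if_changed` with the freshly retrieved text and call `commit_snapshot` only when it returns `True`, so the git history contains one commit per actual prompt change.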
The Opus 4.6 to 4.7 diff is interesting for a different reason than the methodology. It is evidence that system prompt changes now happen within minor version increments, not just across major model releases.
What Minor Version Means Now
In software, version numbers carry conventions. Semver says that a patch bump (4.6.1 to 4.6.2) means backwards-compatible bug fixes; a minor bump (4.6 to 4.7) means new backwards-compatible functionality; a major bump (4.x to 5.0) means breaking changes. These conventions exist because the people consuming a library need to know what level of attention to pay when upgrading.
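As a small illustration of the semver convention just described, the sketch below classifies a version change as a major, minor, or patch bump. The helper names `parse_version` and `bump_kind` are made up for this example, and it assumes plain dotted numeric versions with no pre-release tags.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Parse 'MAJOR.MINOR[.PATCH]' into a 3-tuple, padding missing parts with 0."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))
    major, minor, patch = parts[:3]
    return major, minor, patch


def bump_kind(old: str, new: str) -> str:
    """Return which semver component changed first between old and new."""
    o, n = parse_version(old), parse_version(new)
    if n[0] != o[0]:
        return "major"  # breaking changes expected
    if n[1] != o[1]:
        return "minor"  # new, backwards-compatible functionality
    if n[2] != o[2]:
        return "patch"  # backwards-compatible bug fixes
    return "none"


print(bump_kind("4.6", "4.7"))      # minor
print(bump_kind("4.6.1", "4.6.2"))  # patch
print(bump_kind("4.7", "5.0"))      # major
```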
AI model versioning does not follow semver, and nobody expects it to. But there has been an implicit assumption in how the Claude 4 series has been released that minor version bumps reflect capability improvements, not behavioral policy revisions. That assumption appears to be wrong. The Opus 4.6 to 4.7 diff shows that Anthropic is using minor version releases as a vector for updating the instructions the model operates under, not just the weights.
This matters because it collapses the distinction between “the model improved” and “the model was told to behave differently.” Both changes arrive as a version bump. Neither comes with a description of what changed behaviorally. If your application’s outputs shift after you update from Opus 4.6 to 4.7, you now have two independent variables to investigate: what the new weights do differently, and what the new system prompt instructs.
The Opus Line Specifically
Opus occupies a particular place in Anthropic’s lineup. It is the model positioned for complex, multi-step reasoning: extended writing, difficult analysis, agentic pipelines where the model is making decisions across many turns. Sonnet handles volume; Opus handles depth. Because of this positioning, Opus deployments tend to be in contexts where consistency matters more than in high-throughput, lower-stakes applications.
A developer running a customer service chatbot on Sonnet can absorb unexpected behavioral variation relatively easily. A team using Opus to drive a research pipeline, a legal document review workflow, or a coding agent that takes actions based on its judgment has a much stronger interest in knowing exactly how the model was instructed to behave and when that changed.
The Claude 4 era has seen Anthropic push hard on agentic capabilities. The Claude 4 release blog post emphasized tool use, multi-step task completion, and computer use. The system prompt is where Anthropic operationalizes those capabilities: it specifies how Claude should handle ambiguous instructions, what to do when tool calls fail, how to represent uncertainty, and when to stop and ask versus when to proceed. These are not cosmetic guidelines. They are the behavioral foundation of anything you build on top.
When that foundation shifts between 4.6 and 4.7, the shift is worth understanding.
The Archaeology of Behavioral Change
Willison’s diff approach has a useful property beyond just detecting changes. Because it treats the prompt as text committed to a repository, it captures the exact wording, not just a summary of intent. This matters because the gap between policy intent and behavioral effect in large language models is mediated almost entirely by wording.
Consider how differently a model might behave under “avoid speculation about future events” versus “do not make predictions you cannot support with evidence.” Both might represent the same underlying policy. The model interprets them differently across edge cases, particularly in agentic contexts where the model is deciding how confident to appear in a plan it is about to execute. A diff that shows this specific change, with the old and new wording side by side, gives a developer something actionable. They can test the new wording against their use cases, understand why outputs shifted, and decide whether the change helps or hurts their application.
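To make the side-by-side idea concrete, here is a minimal sketch using Python's standard `difflib` module. The two prompt snippets are the hypothetical wordings from the paragraph above plus a placeholder first line; they are not actual text from either Opus prompt, and the `wording_diff` helper and its labels are assumptions for illustration.

```python
import difflib


def wording_diff(old: str, new: str,
                 old_label: str = "opus-4.6",
                 new_label: str = "opus-4.7") -> str:
    """Return a unified diff of two prompt texts, compared line by line."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    )
    return "\n".join(lines)


# Illustrative snippets only; not real system prompt content.
old_prompt = "Be concise.\nAvoid speculation about future events."
new_prompt = (
    "Be concise.\n"
    "Do not make predictions you cannot support with evidence."
)

print(wording_diff(old_prompt, new_prompt))
```

Removed lines are prefixed with `-` and added lines with `+`, which is the same presentation a git-tracked prompt file gives you for free.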
Without the diff, the developer has only the output change itself, with no path to explanation.
What Anthropic Could Do Instead
The right comparison here is not other AI labs, most of which operate the same way. The right comparison is what Anthropic has already shown they are willing to do around transparency.
Anthropic’s usage policies are versioned and publicly updated, with dates. The model spec is a published document that gets revised. The company has a responsible scaling policy with specific commitments.
Publishing a diff of the base system prompt alongside each model version release would fit naturally within this existing framework. It would not require disclosing anything proprietary about the training process. The text of the prompt is already extractable by anyone willing to spend five minutes asking Claude for it. What publishing it would add is the narrative: why did this change, what problem was it solving, what should developers expect to see differently.
A prompt changelog entry might look like this:
## Opus 4.7 base prompt changes
- Added guidance on handling interrupted tool use sequences in agentic contexts
- Revised language around expressing uncertainty in multi-step plans
- Removed instruction to recommend human review for [category] decisions;
  replaced with calibrated judgment based on consequence severity
That is three bullet points. Writing them would take less time than explaining why they are needed. The information value for developers running Opus in production pipelines would be substantial.
The absence of this is not malicious. It is the path of least resistance when product velocity is high and formal documentation processes are expensive. But the community filling the gap with git-based prompt archaeology is a signal that the demand is real and the current supply is zero.
The Practical Takeaway
If you are building on Opus and you care about output consistency, Willison’s work is worth subscribing to. He is building a public record of something Anthropic is not publishing themselves, and over the Claude 4 series, it has become a meaningful signal rather than a curiosity.
The deeper point is architectural. Any application that depends on consistent model behavior should be testing that behavior against a fixed benchmark on every version change, not assuming that a minor bump means nothing changed. The 4.6 to 4.7 diff is a concrete example of why. The weights probably changed. The system prompt definitely changed. Your outputs may or may not have changed, and you will not find out from the release notes.
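A fixed benchmark of this kind can be very small. The sketch below is one possible shape, under stated assumptions: `EvalCase`, `run_suite`, and `regressions` are hypothetical names, and the two "models" are stub functions standing in for real API calls to the old and new versions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_suite(model: Callable[[str], str],
              cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case against a model and record pass/fail by case name."""
    return {case.name: case.check(model(case.prompt)) for case in cases}


def regressions(baseline: dict[str, bool],
                candidate: dict[str, bool]) -> list[str]:
    """Names of cases that passed on the baseline but fail on the candidate."""
    return sorted(
        name for name, ok in baseline.items()
        if ok and not candidate.get(name, False)
    )


# Stub models for illustration; a real suite would call the API here.
def fake_old_model(prompt: str) -> str:
    return "I recommend human review before proceeding."


def fake_new_model(prompt: str) -> str:
    return "Proceeding with the plan."


CASES = [
    EvalCase(
        name="flags_human_review",
        prompt="Should this contract clause be approved?",
        check=lambda out: "human review" in out.lower(),
    ),
    EvalCase(
        name="non_empty_answer",
        prompt="Summarize the clause in one sentence.",
        check=lambda out: bool(out.strip()),
    ),
]

baseline = run_suite(fake_old_model, CASES)
candidate = run_suite(fake_new_model, CASES)
print(regressions(baseline, candidate))  # prints ['flags_human_review']
```

Keeping the cases in the same repository as the application code means the suite itself is versioned, and a non-empty regression list can block a rollout to the new model version.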
Version your evaluation suite the same way you version your code. Run it on the new model before you roll it out. And keep one eye on what Willison finds in the diff, because right now that is the closest thing to a behavioral changelog you have access to.