Git Was a Journal. With Coding Agents, It Becomes a Control Surface.
Source: simonwillison
Git was designed around a simple assumption: a human author is making incremental decisions, and the commit history is a journal of those decisions. That assumption quietly breaks when a coding agent is writing the code. Simon Willison’s guide on using Git with coding agents names this shift explicitly, and it’s worth slowing down on what the inversion actually means in practice.
When you run a coding agent against your repository, you’re not the author of the resulting commits in the conventional sense. You’re the supervisor. The agent produces a batch of file changes that you then review, selectively accept, and commit. Git in this model isn’t a journal of your work; it’s the scaffolding around someone else’s work. The commit before an agent session is a checkpoint. The diff after a session is the primary review interface. The ability to git reset --hard HEAD is the escape hatch. These are control surface concerns, not authorship concerns.
Once you see it that way, the specific practices that experienced practitioners recommend stop seeming like arbitrary habits and start making structural sense.
The Clean State Rule
Every serious practitioner of agentic coding converges on this: before running an agent, your working tree should be clean.
git add -A && git commit -m "wip: checkpoint before agent session"
# or
git stash push -u -m "pre-agent-run-backup"
The reason isn’t tidiness. It’s review fidelity. When an agent runs against a dirty tree, its changes and your uncommitted changes are mixed together in the diff. If the agent introduces a regression you want to roll back, git reset --hard HEAD now discards your work too. If you try to review what the agent changed, you’re sorting through noise. The clean state rule makes git diff HEAD a precise instrument: every tracked change in the diff is something the agent did. Files the agent created are untracked, so they show up in git status rather than in the diff, and git reset --hard leaves them in place; git clean -fd removes them.
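The clean state rule is easy to automate. The following is a minimal sketch, not something any agent ships with: require_clean_tree is a hypothetical helper name, and it relies only on git status --porcelain printing nothing when the tree matches HEAD.

```shell
#!/usr/bin/env bash
# Sketch: refuse to start an agent session unless the working tree is clean.
# require_clean_tree is a hypothetical helper, not part of Git or any agent.

require_clean_tree() {
  local dirty
  # --porcelain prints one line per modified, staged, or untracked path;
  # empty output means the tree matches HEAD.
  dirty=$(git status --porcelain) || return 2
  if [ -n "$dirty" ]; then
    echo "Working tree is not clean; commit or stash first:" >&2
    echo "$dirty" >&2
    return 1
  fi
  return 0
}
```

A wrapper script could call require_clean_tree before launching the agent and bail out on a non-zero status, so the checkpoint never depends on remembering to make it.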
The corollary to this is treating git diff HEAD --stat as a scope check. If you asked an agent to add error handling to one function and the stat shows 14 files changed, that’s a signal to read the diff more carefully before staging anything. Agents frequently make changes beyond their stated scope, not maliciously but because they follow implications. The scope check is cheap and catches most of the surprises early.
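The scope check can be reduced to a single number. This sketch assumes a hypothetical scope_check helper with an arbitrary default threshold of 5 files; it counts tracked files that differ from HEAD plus untracked files, since git diff alone does not report the latter.

```shell
#!/usr/bin/env bash
# Sketch: flag agent sessions that touched more files than expected.
# scope_check and its max-files argument are hypothetical, for illustration.

scope_check() {
  local max_files=${1:-5}
  local changed
  # Tracked files that differ from HEAD, plus untracked (non-ignored) files.
  changed=$( { git diff HEAD --name-only; git ls-files --others --exclude-standard; } | sort -u | wc -l )
  changed=$((changed))  # normalize padding that some wc implementations add
  echo "$changed file(s) changed"
  if [ "$changed" -gt "$max_files" ]; then
    echo "More than $max_files files changed; read the diff before staging." >&2
    return 1
  fi
}
```

Calling scope_check 1 after asking for a single-function change turns "14 files changed" from something you might notice into something that fails loudly.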
Where Tools Diverge: The Auto-Commit Question
The most consequential design decision across current coding agents is whether they auto-commit.
Aider commits every change it makes automatically and marks those commits as its own; depending on version and configuration, that is an aider: prefix on the message or an “(aider)” suffix on the author name. By default it also commits any pre-existing uncommitted changes in a separate commit before it edits, so your work and the agent’s never share a commit; --no-dirty-commits and --no-auto-commits turn these behaviors off. This is an opinionated position but a coherent one: Aider treats Git as the primary recovery mechanism, so it enforces the clean state rule itself and creates checkpoints continuously. The undo path is the in-chat /undo command, which reverts the last commit if Aider made it. Aider also builds a “repo map” from the files Git tracks to give the model a structural index without loading every file into context, which means .gitignore accuracy affects what the model can see.
Claude Code takes the opposite stance: it reads git status, git diff, and git log autonomously to understand repository state, but defers committing to the human by default. Its built-in safety protocol explicitly avoids git push --force, git reset --hard without instruction, --no-verify flags, and git add -A. The staging and commit remain a human checkpoint. You can shift this behavior via CLAUDE.md:
## Git Policy
After each discrete logical change:
1. Run the test suite
2. Commit with: `git commit -m "<type>: <what> [agent]"`
Valid types: feat, fix, refactor, test, docs, chore
3. Never amend published commits, force push, or rebase shared branches
Cursor shows diffs before writing to disk at all, making the checkpoint even earlier. Copilot Workspace sidesteps the question by running in a cloud environment and outputting a PR rather than local commits.
None of these approaches is strictly better. Aider’s auto-commit model means every agent operation is immediately recoverable and attributable, but it produces noisier history. Claude Code’s deferred model keeps the human firmly in the loop but requires discipline to not let sessions accumulate unreviewed changes.
Git Worktrees Are the Pattern Most People Skip
Git worktrees have been available since Git 2.5 in 2015. They let you check out multiple branches simultaneously as separate working directories, all sharing the same .git object store. Most developers have never used them. Agentic workflows make them genuinely useful.
The problem worktrees solve: if you’re running parallel agent sessions on different features, you can’t check out two branches in the same working directory. You could clone the repository multiple times, but a large repo makes that expensive in disk space. A worktree gives you a full working directory on a new branch at the cost of only the working files, not the entire object history.
# Spin up two parallel agent environments, each on a new branch
git worktree add -b feature/auth-refactor ../myproject-agent-auth
git worktree add -b feature/test-coverage ../myproject-agent-tests
git worktree list
# /home/user/myproject abc1234 [main]
# /home/user/myproject-agent-auth abc1234 [feature/auth-refactor]
# /home/user/myproject-agent-tests abc1234 [feature/test-coverage]
For speculative work where you might discard the whole session:
git worktree add /tmp/agent-explore -b agent/explore-approach
# run agent, evaluate result
git worktree remove --force /tmp/agent-explore # --force is needed if the agent left uncommitted changes
git branch -D agent/explore-approach # throw it away cleanly
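The create, run, discard sequence above can be wrapped so cleanup always happens. This is a sketch under assumptions: with_scratch_worktree is a hypothetical helper, the agent/ branch prefix simply follows the naming used above, and everything the command produces is thrown away.

```shell
#!/usr/bin/env bash
# Sketch: run a command inside a throwaway worktree and always clean up.
# with_scratch_worktree is a hypothetical helper name.

with_scratch_worktree() {
  local name=$1; shift
  local dir status=0
  dir=$(mktemp -d) || return 1
  # -b creates the branch; the new worktree starts at the current HEAD.
  git worktree add -q -b "agent/$name" "$dir/wt" || return 1
  # Run the command in a subshell so the caller's directory is unchanged.
  ( cd "$dir/wt" && "$@" ) || status=$?
  # --force discards any uncommitted changes the command left behind.
  git worktree remove --force "$dir/wt"
  git branch -q -D "agent/$name"
  rm -rf "$dir"
  return "$status"
}
```

The return status is the command's own, so something like with_scratch_worktree try-idea ./run-agent-and-tests.sh (a hypothetical script) tells you whether the approach worked without leaving anything behind.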
Walden Cui’s article on combining worktrees with direnv extends this further: each worktree can have its own .envrc specifying isolated database URLs and port numbers, making parallel agent sessions genuinely independent without Docker overhead. The practical constraint is that only the object store is shared. Each worktree has its own checkout of tracked lock files (package-lock.json, Cargo.lock) and its own untracked dependency directories (node_modules, target), so dependencies must be installed in every worktree, and concurrent installs can contend on a shared package-manager cache. Run the installs one at a time before starting the parallel agent sessions.
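A per-worktree .envrc might look like the sketch below. It is not taken from the article: the variable names PORT and DATABASE_URL, the 3000 to 3999 port range, and the database naming scheme are all assumptions chosen for illustration.

```shell
# .envrc sketch, loaded by direnv when you cd into a worktree.
# PORT, DATABASE_URL, and the port range are illustrative assumptions.

wt_name=$(basename "$PWD")

# Derive a stable offset in 0..999 from the directory name, so the same
# worktree always gets the same port.
offset=$(( $(printf '%s' "$wt_name" | cksum | cut -d' ' -f1) % 1000 ))

export PORT=$(( 3000 + offset ))
# Replace dashes so the worktree name is usable as a database name.
export DATABASE_URL="postgres://localhost:5432/${wt_name//-/_}_dev"
```

Two directory names can still hash to the same offset, so a real setup would want to check for collisions or assign ports explicitly.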
Claude Code treats worktrees as a first-class pattern. It has a built-in EnterWorktree tool and supports isolation: "worktree" as an Agent tool parameter, which creates a temporary worktree for a subagent, runs it in isolation, and cleans up automatically if no changes were made.
Commit Messages Now Have Two Audiences
Here’s the inversion nobody talks about enough: commit messages aren’t just for future human maintainers anymore. They’re also context that future agent sessions will read.
Agents reading git log --oneline -20 before starting a session get temporal context: what changed recently, what decisions were made, what was recently broken and fixed. “Fix bug” tells a future agent nothing. “Fix null dereference in auth middleware when session token expires mid-request” tells it what was wrong, where, and under what condition.
The implication is that commit message quality has a compounding effect in agentic workflows that it doesn’t have in pure human workflows. A human can read the diff to recover context when messages are vague. An agent reading git log to orient itself before a session treats the message as the summary, not the diff.
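Message quality can be nudged mechanically. The sketch below is a commit-msg hook check that rejects subjects too short to orient a later reader; check_subject is a hypothetical name and the four-word minimum is an arbitrary assumption, not an established convention.

```shell
#!/usr/bin/env bash
# Sketch of a commit-msg hook check: reject subjects too vague to orient
# a later reader, human or agent. The four-word minimum is an assumption.

check_subject() {
  local subject words
  subject=$(head -n 1 "$1")
  words=$(printf '%s\n' "$subject" | wc -w)
  words=$((words))  # normalize padding that some wc implementations add
  if [ "$words" -lt 4 ]; then
    echo "Commit subject '$subject' is too vague: say what changed and where." >&2
    return 1
  fi
}

# Git passes the path of the message file as $1 when this script is
# installed as the commit-msg hook.
if [ -n "${1:-}" ]; then
  check_subject "$1" || exit 1
fi
```

A word count is a crude proxy, but it is enough to stop “Fix bug” while letting “Fix null dereference in auth middleware when session token expires mid-request” through.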
For attribution, the Co-Authored-By trailer renders in GitHub and is filterable:
git commit --trailer "Co-Authored-By: Claude <noreply@anthropic.com>" \
  -m "Add token refresh with exponential backoff" \
  -m "Agent task: the auth token refresh logic doesn't handle rate limit
responses from the token endpoint. Add exponential backoff with
jitter, cap at 32 seconds, abort after 5 attempts."
Where Aider’s aider: message prefix is in use, it gives the same filterability automatically: git log --grep="^aider:" lists the AI-generated commits.
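The trailer can be filtered the same way. A small sketch, assuming the trailer value starts with “Claude” as in the example above; agent_commits is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: list commits that carry a Co-Authored-By trailer for the agent.
# agent_commits is a hypothetical helper name.

agent_commits() {
  # --grep matches against each line of the commit message, trailers included;
  # -i ignores case, since the trailer key is often written Co-authored-by.
  # Extra arguments (a revision range, another --format) are passed through.
  git log -i --grep='^Co-Authored-By: Claude' --format='%h %s' "$@"
}
```

For example, agent_commits main..HEAD would show which commits on the current branch were agent-assisted.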
CLAUDE.md Is Advisory; Hooks Are Mandatory
This distinction matters more than it might initially seem. CLAUDE.md files tell the model what conventions to follow. Hooks enforce invariants regardless of what the model decides.
In a long agent session, instructions compete with everything else in the context window, and an instruction given early may carry less weight by the end. The “Lost in the Middle” research from Stanford and UC Berkeley documents a related effect: models use information buried in the middle of a long context less reliably than information near its start or end. If your linting rule or secret-scanning check is in CLAUDE.md, it’s advisory with occasional failures. If it’s in a pre-commit hook, it runs on every commit unconditionally.
#!/bin/bash
# pre-commit hook: runs before every commit, agent or human
set -e # stop at the first failing check so the commit is blocked
git secrets --scan # block accidental secret commits
npx tsc --noEmit # type check
npm run lint --silent # linting
Tools like pre-commit, husky, and lefthook manage this cleanly. Aider’s auto-commit mode actually benefits most from pre-commit hooks, since every agent operation triggers the hook and surfaces issues before they land in history.
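If you would rather not add a hook manager, Git’s own core.hooksPath setting (available since Git 2.9) can point at a tracked directory so the hooks are versioned with the code. A sketch, assuming a hypothetical .githooks directory and an install_hooks helper:

```shell
#!/usr/bin/env bash
# Sketch: point Git at a version-controlled hooks directory.
# install_hooks and the .githooks directory name are assumptions.

install_hooks() {
  local hooks_dir=${1:-.githooks}
  if [ ! -f "$hooks_dir/pre-commit" ]; then
    echo "No pre-commit script found in $hooks_dir" >&2
    return 1
  fi
  # Git silently skips hooks that are not executable.
  chmod +x "$hooks_dir/pre-commit"
  # Stored in the repository's local config; a relative path is resolved
  # from the top of whichever working tree the hook runs in.
  git config core.hooksPath "$hooks_dir"
}
```

Because the setting lives in the repository config, linked worktrees created for agent sessions pick up the same hooks as the main checkout.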
One critical detail on hook failure recovery: when a pre-commit hook fails, the commit did not happen. If you then git commit --amend, you’re modifying the previous successful commit, not creating a new one. The correct response to a hook failure is to fix the issue and create a new commit. This is a subtle but real footgun.
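One way to make that footgun harder to hit is to wrap the commit and say explicitly when nothing was recorded. This is a sketch; commit_checked is a hypothetical helper, and it only compares HEAD before and after the attempt.

```shell
#!/usr/bin/env bash
# Sketch: make a rejected commit explicit so nobody reaches for --amend.
# commit_checked is a hypothetical helper name.

commit_checked() {
  local before after
  before=$(git rev-parse HEAD) || return 1
  if git commit "$@"; then
    return 0
  fi
  after=$(git rev-parse HEAD)
  if [ "$before" = "$after" ]; then
    # A failed hook means no commit was created: HEAD has not moved.
    echo "Commit was rejected; HEAD is still ${before:0:7}." >&2
    echo "Fix the issue, re-stage, and run git commit again (not --amend)." >&2
  fi
  return 1
}
```

The same wording could go into CLAUDE.md as a convention, but the check itself does not depend on the agent remembering it.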
Claude Code’s PostToolUse hooks operate at a finer granularity than pre-commit hooks, firing after every file write tool call. This creates a tight feedback loop where lint errors are surfaced to the agent mid-session rather than accumulated and discovered at commit time:
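#!/bin/bash
# PostToolUse hook: fires after Write/Edit tool calls.
# Assumes the hook payload is JSON on stdin with the edited path at
# .tool_input.file_path, and that jq is installed.
file=$(jq -r '.tool_input.file_path // empty')
[ -n "$file" ] || exit 0
if ! lint_output=$(npx eslint "$file" 2>&1); then
  echo "ESLint found issues in $file:" >&2
  echo "$lint_output" >&2
  exit 2 # assumed convention: exit code 2 passes stderr back to the agent
fi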
The practical rule: use CLAUDE.md for conventions where occasional deviation is tolerable, hooks for invariants where a single violation has real cost. Don’t rely on one mechanism for both.
What Git Doesn’t Cover
These patterns address file-level changes, which is most of what coding agents do. They don’t help when an agent takes actions with external side effects: API requests, database writes, environment configuration changes, or running scripts that modify system state. For those cases, the safety model is in how you configure what actions the agent is allowed to initiate, not in version control.
Git is excellent infrastructure for agentic coding because it was built for exactly this kind of adversarial scenario: work was done, you need to understand what changed, you might need to undo it, and the record should be clear enough to reason about later. Those properties don’t change just because the author is an AI. They become more important.
The practices that emerge from Willison’s guide and the broader practitioner community around agentic engineering are, at bottom, about taking Git’s existing capabilities seriously rather than inventing new ones. Clean state, atomic commits, descriptive messages, worktrees for isolation, hooks for invariants. These were good ideas before coding agents existed. With agents doing the work, they become load-bearing.