Simon Willison’s guide on agentic engineering patterns has a section on Git that I keep coming back to. The core argument is simple: treat Git not as bookkeeping after the fact, but as the primary control mechanism for working with coding agents. What I want to dig into is why this reframe is more significant than it sounds, and what the concrete mechanics look like in practice.
When you write code yourself, Git is mostly a record-keeping tool. You make changes, they work, you commit. The diff reflects your intentions because you wrote the code. With a coding agent, that assumption breaks. The agent’s natural-language explanation of what it did and the actual diff can differ significantly. Agents describe their work in terms of intent; the diff shows what happened. This is not a flaw in any particular agent; it is structural to how they work. The only reliable way to understand what an agent changed is to read git diff.
This makes Git the observation layer, not just the version control layer.
The Clean Working Tree Rule
Every productive workflow with coding agents starts from the same place: a clean working tree. Before you hand a task to an agent, git status should report nothing to commit and no untracked files. This sounds obvious, but the implications run deeper than they appear.
If you start an agent session with uncommitted changes, you immediately lose the ability to cleanly attribute what came from where. The agent modifies files you were also editing, and now you have a mixed-authorship problem that git diff cannot cleanly separate. More practically, your recovery options collapse. If the agent does something wrong, git checkout -- . discards every uncommitted change to tracked files, including your in-progress work. If you try to surgically revert only the agent’s changes, you have to do it manually.
The workflow that actually holds up:
# Confirm clean state before every agent session
git status
# If you have in-progress work, commit it or stash it (-u also stashes untracked files)
git stash push -u -m "work in progress before agent session"
# Create a branch for the agent task
git checkout -b agent/$(date +%Y%m%d)-add-rate-limiting
This takes thirty seconds. The alternative, debugging a mixed-authorship diff after something goes wrong, takes much longer.
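The clean-state check is easy to automate. Here is a minimal sketch, assuming you want a guard that refuses to proceed when the tree is dirty; require_clean_tree is a hypothetical helper name of my own, and the demo runs in a throwaway repository so nothing real is touched.

```shell
# Sketch: refuse to start an agent session unless the working tree is clean.
# `require_clean_tree` is a hypothetical helper, not a Git command.
require_clean_tree() {
  # --porcelain prints one line per changed or untracked file; empty means clean.
  if [ -n "$(git status --porcelain)" ]; then
    echo "working tree is dirty; commit or stash first" >&2
    return 1
  fi
}

# Demo in a temporary repository (identity set via env so commits work anywhere).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q
echo "hello" > app.txt
git add app.txt
git commit -q -m "initial commit"

require_clean_tree && clean_before="yes" || clean_before="no"
echo "scratch" > notes.txt   # an untracked file makes the tree dirty
require_clean_tree 2>/dev/null && clean_after="yes" || clean_after="no"
echo "before: $clean_before, after: $clean_after"
```

Dropping a function like this into whatever script launches the agent makes the rule hard to forget.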
Worktrees for Parallel Agent Work
The single most underused Git feature for agentic workflows is git worktree. A worktree lets you check out the same repository at multiple paths simultaneously, each on a different branch. For agents, this makes parallel work straightforward.
Without worktrees, running two Claude Code sessions against the same repository means they share a working directory. File writes race, context bleeds between sessions, and neither agent has a stable view of the codebase. With worktrees, each agent gets its own checked-out copy:
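# Set up isolated worktrees for parallel agent tasks
# (-b creates each branch from main; drop -b and the trailing main if the branch already exists)
git worktree add -b feature/auth-rewrite /tmp/agent-auth main
git worktree add -b feature/test-coverage /tmp/agent-tests main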
# Run each agent in its own directory
# Agent 1 cd's to /tmp/agent-auth, has complete isolation
# Agent 2 cd's to /tmp/agent-tests, has complete isolation
# Review results independently
git diff main..feature/auth-rewrite
git diff main..feature/test-coverage
# Clean up
git worktree remove /tmp/agent-auth
git worktree remove /tmp/agent-tests
This is how I have been running multiple Claude Code sessions against my Discord bot project. The worktrees share git history but have independent working directories. Neither session can corrupt the other’s state, and I can review each diff in isolation before deciding what to merge.
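To see that isolation concretely, here is a small sketch that builds a throwaway repository, adds two worktrees, and commits in one of them. The paths, file names, and commit messages are illustrative assumptions, not taken from my actual project.

```shell
# Sketch: two worktrees on separate branches, created in a throwaway repo.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
base="$(mktemp -d)"
git init -q "$base/repo"
cd "$base/repo"
echo "v1" > README.md
git add README.md
git commit -q -m "initial commit"
git branch -M main   # normalise the branch name regardless of local defaults

# -b creates the branch and checks it out in the new worktree in one step.
git worktree add -b feature/auth-rewrite "$base/agent-auth" main >/dev/null 2>&1
git worktree add -b feature/test-coverage "$base/agent-tests" main >/dev/null 2>&1

# A commit made in one worktree does not appear in the other's directory.
echo "auth change" > "$base/agent-auth/auth.txt"
git -C "$base/agent-auth" add auth.txt
git -C "$base/agent-auth" commit -q -m "agent: add auth stub"

# The main checkout counts as a worktree too, so the list has three entries.
worktree_count="$(git worktree list | wc -l | tr -d ' ')"
if [ -e "$base/agent-tests/auth.txt" ]; then leaked="yes"; else leaked="no"; fi
echo "worktrees: $worktree_count, leaked: $leaked"
```

The working directories are independent, but the branches and objects live in one shared repository, which is why the main checkout can diff either branch without fetching anything.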
The git worktree documentation is worth reading if you have not touched it before. The feature has been in Git since 2.5 (2015), but most developers only encounter it when someone else recommends it for agent workflows.
Commit Frequency as Recovery Granularity
With human-written code, commit frequency is a style preference. With agent-written code, it determines your recovery granularity.
An agent working autonomously for twenty minutes might touch thirty files. If it goes wrong somewhere in the middle and you have no intermediate commits, your options are git reset --hard HEAD (lose everything) or manual surgical reverting (tedious and error-prone). If the agent committed after each logical subtask, you can git log, identify the last good state, and reset to that point.
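A sketch of that recovery path, assuming the agent committed after each subtask. The commit messages and file contents are made up for the demo, and in a real session you would pick the last good commit by reading git log rather than saving its hash up front.

```shell
# Sketch: checkpoint commits let you reset to the last good state.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q

echo "step 1" > work.txt
git add work.txt
git commit -q -m "agent: step 1 (good)"
last_good="$(git rev-parse HEAD)"   # remembered here only to keep the demo short

echo "step 2, broken" >> work.txt
git commit -q -am "agent: step 2 (bad)"
echo "step 3, also broken" >> work.txt
git commit -q -am "agent: step 3 (bad)"

git log --oneline                   # inspect the checkpoints
git reset -q --hard "$last_good"    # discard everything after the good one

restored="$(cat work.txt)"
commit_count="$(git rev-list --count HEAD)"
echo "content: $restored, commits: $commit_count"
```

The discarded commits remain reachable through git reflog for a while, so even the reset is not a one-way door.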
The pattern that works well:
- Give the agent a small, scoped task.
- Agent completes; run git diff.
- Review the diff carefully, not the agent’s explanation of the diff.
- If acceptable, stage and commit.
- If not, run git checkout -- . and re-prompt with a corrected instruction.
Commit messages for agent work are worth tagging. Many teams use an ai: or agent: prefix on commits produced primarily by an agent. This makes git log, and the commits that git blame points you to, more honest about authorship, which matters for code review and for tracing decisions later.
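Once the prefix is in place, filtering on it is a one-liner. A minimal sketch, assuming an agent: prefix at the start of the subject line; the commits are empty placeholders created only for the demo.

```shell
# Sketch: list only the commits whose message starts with "agent:".
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q

# --allow-empty keeps the demo short; real commits would carry changes.
git commit -q --allow-empty -m "feat: add login form"
git commit -q --allow-empty -m "agent: add rate limiter"
git commit -q --allow-empty -m "agent: add rate limiter tests"

# --grep matches against the commit message; ^ anchors to the start of a line.
agent_subjects="$(git log --grep='^agent:' --format='%s')"
agent_count="$(git log --grep='^agent:' --format='%h' | wc -l | tr -d ' ')"
echo "$agent_subjects"
echo "agent commits: $agent_count"
```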
What to Look for in Agent Diffs
An agent reviewing its own work is not the same as you reviewing it. The things agents most commonly do that they do not mention:
- Touch files they were not asked to touch (check git diff --name-only first)
- Quietly remove code they assessed as dead (which may not be dead)
- Weaken test assertions to make tests pass rather than fix the underlying issue
- Add dependencies to package.json, requirements.txt, or go.mod
- Remove or rewrite comments and documentation
- Change config files that were not part of the stated task
The review workflow:
# Start with the file list to calibrate what you're about to read
git diff --stat
# Then read the full diff
git diff
# For large diffs, review file by file
git diff -- src/handlers/rate_limit.ts
# Use interactive staging to accept only what you want
git add -p
Interactive staging with git add -p is particularly useful when an agent made some changes you want and some you do not. You can accept specific hunks and reject others, then commit only the acceptable portions.
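The "files it was not asked to touch" check can be scripted as a first pass before reading the diff. This is a rough sketch under my own assumptions: files_outside is a hypothetical helper, the allowed scope is a single path prefix, and the prefix is treated as a grep pattern, so unusual characters in it would need escaping. It also lists untracked files, which plain git diff does not show.

```shell
# Sketch: flag changed or new files that fall outside the task's expected path.
# `files_outside` is a hypothetical helper; run it from the repository root.
files_outside() {
  allowed_prefix="$1"
  # Tracked changes relative to HEAD, plus new untracked files.
  { git diff --name-only HEAD; git ls-files --others --exclude-standard; } |
    grep -v "^$allowed_prefix" || true
}

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q
mkdir -p src/handlers
echo "export const limit = 10;" > src/handlers/rate_limit.ts
echo '{"name": "demo"}' > package.json
git add .
git commit -q -m "initial commit"

# Simulate an agent session that strays outside src/handlers/.
echo "export const limit = 20;" > src/handlers/rate_limit.ts
echo '{"name": "demo", "dependencies": {}}' > package.json
mkdir -p config
echo "debug: true" > config/extra.yml

unexpected="$(files_outside "src/handlers/")"
unexpected_count="$(printf '%s\n' "$unexpected" | grep -c .)"
echo "$unexpected"
```

Anything this prints deserves a close read before you stage it, or a git add -p pass that leaves it out.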
Encoding Conventions in CLAUDE.md
Claude Code reads a CLAUDE.md file at the project root as persistent instructions. This is the right place to encode your git conventions so agents follow them without repeated prompting:
## Git Conventions
Always work on a feature branch. Never commit directly to main.
Use conventional commit prefixes: feat:, fix:, chore:, docs:, test:
Make atomic commits after each logical unit of work, not at the end of the session.
Run `npm test` before committing. Do not commit if tests fail.
Do not modify package.json or package-lock.json unless explicitly asked to add or remove a dependency.
Without this, you re-explain conventions every session. With it, the agent has persistent context and you spend less time correcting process violations.
Pre-Commit Hooks: A Nuanced Case
Pre-commit hooks that run linters and formatters work well with coding agents. Claude Code reads hook failure output and attempts to fix the issue before retrying, which is the intended behavior. A formatter hook that runs black or prettier will consistently produce clean output without any special configuration.
Heavy hooks, specifically ones that run full test suites or slow type-checkers, are more complicated. An agent in an autonomous loop that repeatedly fails a slow hook will sometimes start making increasingly contorted changes to satisfy the check rather than addressing the underlying problem. The pre-commit framework lets you tag hooks with stages so you can run fast checks on commit and reserve the slow ones for CI.
The right default: hooks for formatting (fast, auto-fixable) on commit; hooks for tests and type-checking in CI.
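To illustrate the "fast checks only" half of that default, here is a sketch that uses a plain Git hook script rather than the pre-commit framework. The whitespace check stands in for a formatter, and the file names are invented for the demo; a real setup would call your formatter or linter instead.

```shell
# Sketch: a fast pre-commit hook that rejects staged whitespace errors.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q
git commit -q --allow-empty -m "initial commit"

# Install the hook. It runs in well under a second and never touches the test suite.
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Exit non-zero (blocking the commit) if staged changes contain whitespace errors.
exec git diff --cached --check
EOF
chmod +x .git/hooks/pre-commit

printf 'clean line\n' > ok.txt
git add ok.txt
if git commit -q -m "add clean file" >/dev/null 2>&1; then
  clean_commit="accepted"
else
  clean_commit="rejected"
fi

printf 'trailing space \n' > bad.txt
git add bad.txt
if git commit -q -m "add file with trailing space" >/dev/null 2>&1; then
  dirty_commit="accepted"
else
  dirty_commit="rejected"
fi
echo "clean: $clean_commit, dirty: $dirty_commit"
```

The failure is cheap and the fix is mechanical, which is exactly the kind of feedback an agent handles well on its own.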
Squashing Before Merge
Agent-produced branches often contain WIP commits, checkpoint commits, and correction commits that represent the agent’s process rather than a clean history. Before merging, squash these into a coherent set of commits that reflect the logical changes:
# Interactive rebase against main
git rebase -i main
# Or squash everything into one commit on merge
git checkout main
git merge --squash feature/agent-auth-rewrite
git commit -m "feat: add token bucket rate limiting to auth endpoints"
The resulting commit history is then readable by humans and accurately reflects what changed, without exposing the internal back-and-forth of the agent session.
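Here is the squash-merge path end to end in a throwaway repository, as a sketch. The WIP commit messages and file contents are invented; the branch name and final commit message reuse the ones above.

```shell
# Sketch: three messy agent commits become one commit on main.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
demo_repo="$(mktemp -d)"
cd "$demo_repo"
git init -q
echo "base" > app.txt
git add app.txt
git commit -q -m "initial commit"
git branch -M main

git checkout -q -b feature/agent-auth-rewrite
echo "wip 1" >> app.txt && git commit -q -am "wip: first attempt"
echo "wip 2" >> app.txt && git commit -q -am "wip: fix tests"
echo "wip 3" >> app.txt && git commit -q -am "wip: address review"

git checkout -q main
# --squash stages the branch's combined changes without creating a merge commit.
git merge --squash feature/agent-auth-rewrite >/dev/null 2>&1
git commit -q -m "feat: add token bucket rate limiting to auth endpoints"

main_commits="$(git rev-list --count main)"
unmerged_on_branch="$(git rev-list --count main..feature/agent-auth-rewrite)"
echo "main: $main_commits commits, branch still ahead by: $unmerged_on_branch"
```

One side effect to expect: a squash merge records no ancestry, so Git still considers the branch unmerged, and deleting it afterwards takes git branch -D rather than -d.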
The Underlying Shift
What changes when you work with coding agents is the ratio of output to attention. An agent can produce more lines of change per hour than any human developer. The bottleneck shifts from generation to review, and Git is the tool that makes review tractable.
Clean trees give you clean diffs. Branches give you rollback. Worktrees give you parallelism without interference. Atomic commits give you recovery granularity. None of this requires new tooling; it requires more deliberate use of what Git already provides.
The discipline is lower overhead than it sounds. Starting from a clean tree takes seconds. Creating a branch takes seconds. The payoff, when an agent goes off-script, is that you recover in seconds rather than hours.