Git as a Control Surface for Coding Agents

Simon Willison has been documenting agentic engineering patterns at a pace that most of us struggle to keep up with, and his guide on using Git with coding agents is one of the more practically dense pieces in that series. It covers the basics well. What I want to do here is go deeper on the why, and explore how some of these patterns change the fundamental relationship between you and your version control history.

When you write code yourself, Git serves as a journal. You accumulate changes, periodically checkpoint them, and the commit log tells a story about decisions made over time. With a coding agent, that relationship inverts. Git stops being a journal and starts being a control surface. The commits come before and after agent invocations, not as a record of work, but as a frame around it.

Commit Before You Invoke

The single most important habit when working with coding agents is to have a clean working tree before you start. This sounds obvious until you’ve been burned a few times by an agent that modified five files, introduced a regression, and left you unable to cleanly reason about what changed versus what you’d already been editing.

The discipline is simple: before you run an agent on any non-trivial task, stage and commit everything in progress. Even a WIP commit is fine. What you want is a clean git diff HEAD so that after the agent finishes, you can see exactly what it did with no noise from your own prior edits.

git add -A && git commit -m "WIP: before agent refactor of auth module"

After the agent runs:

git diff HEAD

That diff becomes your primary review surface. You’re not reading through files top to bottom looking for changes; you’re reading a bounded, structured diff. One gap to know about: git diff HEAD does not show untracked files, so run git status as well to catch any files the agent created. This is a fundamentally different review posture, and a much better one for catching what an agent got wrong.

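This pattern also gives you a clean escape hatch. If the agent goes off the rails, git reset --hard HEAD restores every tracked file to the checkpoint, and git clean -fd removes any new files and directories the agent created. (git checkout . on its own only reverts unstaged edits to tracked files under the current directory, so it can leave staged changes and new files behind.) Together, the two commands get you back to where you started with zero ambiguity.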
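The checkpoint and the escape hatch are easy to wrap in two small shell functions. This is a minimal sketch rather than anything from Willison’s guide; the names agent_checkpoint and agent_discard are my own, and it assumes you are happy for untracked (but not ignored) files to be committed or deleted.

```shell
# Hypothetical helpers; the function names are illustrative, not a standard tool.

# Commit everything in progress (tracked and untracked) so the tree is clean.
agent_checkpoint() {
  # `git status --porcelain` prints nothing when the working tree is clean.
  if [ -n "$(git status --porcelain)" ]; then
    git add -A && git commit -q -m "WIP: before agent ${1:-session}"
  fi
}

# Throw away everything changed since the last commit.
agent_discard() {
  git reset -q --hard HEAD   # restore tracked files and the index
  git clean -q -fd           # delete untracked files and directories
}
```

Usage would be agent_checkpoint "refactor of auth module" before the session and agent_discard if the result is not worth keeping. Ignored files (build output, .env) are left alone by both.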

Atomic Commits as Agent Checkpoints

The corollary to committing before invocation is committing after, if the result looks good. Agents can produce a lot of correct output quickly, and if you don’t commit it, you risk conflating multiple agent sessions in a single diff and losing the ability to bisect if something breaks later.

A pattern I’ve settled into is treating each agent invocation like a small PR: start clean, let the agent work, review the diff, commit or discard. This produces a commit history that actually tells you something useful:

add OAuth token refresh handling (agent)
fix type errors in auth module (agent)
extract token validation to its own function
WIP: before agent refactor of auth module

The (agent) suffix is optional, but I find it useful for my own auditing later. When I come back to a codebase after a few weeks, knowing which commits were agent-generated changes how I read them: I’m more skeptical, and I check the surrounding context more carefully.

Willison’s guide makes a similar point about committing everything liberally, which he calls the “atom everything” pattern. The idea is that frequent commits create dense checkpoints the agent itself can reference. Some agents can be given access to git log and will use recent commit messages to understand the shape of recent work.
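If you adopt the suffix convention, accepting a session and auditing later can each be a one-liner. A rough sketch, assuming the literal string "(agent)" marks agent commits; agent_accept and agent_commits are hypothetical names of mine.

```shell
# Commit the reviewed agent output with an "(agent)" suffix on the subject line.
agent_accept() {
  git add -A && git commit -q -m "$1 (agent)"
}

# List commits whose message contains the literal string "(agent)".
agent_commits() {
  git log --oneline --fixed-strings --grep="(agent)"
}
```

After reviewing the diff, agent_accept "fix type errors in auth module" records the work, and agent_commits gives you the audit list weeks later.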

Git Worktrees for Parallel Agents

If you’re running a single agent on a single task, branches are usually sufficient. But if you want to run multiple agents in parallel on different aspects of a codebase, branches stop being enough because each agent needs its own working tree.

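This is where git worktrees become genuinely useful. A worktree lets you check out a branch into a separate directory, backed by the same repository, without cloning. The commands below assume both branches already exist; add -b <name> to create a new branch at the same time: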

git worktree add ../myproject-agent-auth feature/auth-refactor
git worktree add ../myproject-agent-tests feature/test-coverage

Now you can point two separate agent sessions at two separate directories. They share no working tree state. Each sees its own branch. Each can run git diff without interference from the other.

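The cleanup is straightforward. git worktree remove refuses to delete a worktree that has uncommitted changes unless you pass --force, and git worktree prune is only needed to clear stale records after a worktree directory was deleted by hand: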

git worktree remove ../myproject-agent-auth
git worktree prune

Worktrees were designed for this kind of parallel workflow, though they predate AI agents by about a decade. They were originally used for maintaining stable release branches while doing feature work. The agentic use case maps onto the same underlying need: isolated working trees for independent streams of work.

One caveat: some editors and language servers get confused by multiple worktrees pointing at the same repo. If you’re using an LSP that relies on a project root, you’ll want to open each worktree as a separate project window.
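Setting up several worktrees by hand gets tedious, so here is a sketch of a helper that creates one per task. The agent/ branch prefix, the directory naming scheme, and the function name are all assumptions of mine; it also assumes the repository already has at least one commit.

```shell
# Create one worktree per task, each on a new branch started from the current HEAD.
# Usage: agent_worktrees auth-refactor test-coverage
agent_worktrees() {
  top=$(git rev-parse --show-toplevel) || return 1
  parent=$(dirname "$top")
  name=$(basename "$top")
  for task in "$@"; do
    # -b creates the branch; the directory is a sibling of the main checkout.
    git worktree add -b "agent/$task" "$parent/$name-agent-$task" || return 1
  done
  git worktree list
}
```

Each agent session then gets pointed at its own sibling directory, and git worktree list shows which branch lives where.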

The Diff as Code Review

Once you’ve internalized the commit-before-invoke discipline, your review workflow changes significantly. You’re no longer reviewing files; you’re reviewing diffs. This seems minor until you realize it changes what you’re looking for.

Reviewing a file, you’re asking: does this file look correct? Reviewing a diff from an agent, you’re asking: are these specific changes appropriate? Did the agent touch things it shouldn’t have? Did it introduce something subtly wrong in a part of the code that wasn’t the focus of the task?

I’ve started using git diff --stat as a first pass. If the agent was supposed to add error handling to one function and the diff touches fourteen files, that’s a signal to look more carefully before accepting anything.

git diff HEAD --stat

For larger changes I’ll do a proper review with something like:

git diff HEAD -- src/auth/
git diff HEAD -- tests/

Scoping the diff by directory helps when the agent made widespread but shallow changes, like a rename or a type annotation pass. You want to quickly establish that the changes are uniform and mechanical before approving them in bulk.
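The "how many files did it touch" first pass can be turned into a guard. This is only a sketch: the function names and the default threshold of 5 are arbitrary choices of mine, and it should be run from the repository root so that both commands print paths relative to the same directory.

```shell
# List files changed since HEAD, including untracked ones that `git diff` omits.
agent_changed_files() {
  {
    git diff HEAD --name-only
    git ls-files --others --exclude-standard
  } | sort -u
}

# Fail loudly when the agent touched more files than expected.
agent_scope_check() {
  max=${1:-5}   # hypothetical default threshold
  count=$(agent_changed_files | wc -l | tr -d ' ')
  if [ "$count" -gt "$max" ]; then
    echo "agent touched $count files (expected at most $max); review carefully" >&2
    return 1
  fi
}
```

For a task that should only touch one function, agent_scope_check 2 is a cheap tripwire before reading the diff in detail.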

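Willison’s approach extends to using git stash as a way to temporarily shelve agent changes while you check that the existing tests pass against the pre-agent state. Use git stash -u for this, because a plain git stash leaves untracked files, including anything the agent created, in the working tree. This is good discipline, especially if you’re working in a codebase where the test suite is not fast.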
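The shelve, test, restore sequence is easy to get wrong by hand (forgetting the pop is the classic mistake), so I’d wrap it. A minimal sketch, assuming your test command is passed as arguments; baseline_check is a hypothetical name, and staged changes come back unstaged because it does not use --index.

```shell
# Shelve agent changes, run a command against the pre-agent state, then restore.
# Usage: baseline_check make test   (substitute your project's test command)
baseline_check() {
  if [ -z "$(git status --porcelain)" ]; then
    "$@"                     # nothing to shelve; just run the command
    return $?
  fi
  git stash push -q -u -m "agent changes" || return 1   # -u also shelves untracked files
  "$@"
  rc=$?
  git stash pop -q || return 1                           # bring the agent changes back
  return $rc
}
```

If the baseline already fails, you know the failure predates the agent, which changes how you read the diff.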

Branching Strategy for Agentic Work

The branch-per-task pattern maps well onto agentic workflows. Each agent session gets its own branch, the session produces a diff, you review and either merge or discard. This keeps your main branch clean and gives you a clear history of which agent sessions contributed what.

The pattern works particularly well when you’re using an agent to explore a solution space. You can create three branches, run the same task description against three separate agent sessions with slightly different prompts or context, compare the resulting diffs, and cherry-pick the best pieces. Git makes this compositional approach tractable in a way that wouldn’t work if you were just overwriting files.

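git checkout -b agent/auth-approach-1
# run agent session, then commit the result on this branch
git add -A && git commit -m "auth approach 1 (agent)"
git checkout main
git checkout -b agent/auth-approach-2
# run agent session with different prompt, then commit the same way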

This kind of parallel exploration is expensive with human time, which is why we don’t usually do it. With agents it’s cheaper because the bottleneck shifts from writing to reviewing.
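Once each approach is committed on its own branch, comparing them is a loop over git diff. A sketch under the assumption that every candidate branched from the same base; compare_approaches is a hypothetical helper, and the file path in the trailing comment is made up for illustration.

```shell
# Summarize what each candidate branch changed relative to a base branch.
# Usage: compare_approaches main agent/auth-approach-1 agent/auth-approach-2
compare_approaches() {
  base=$1
  shift
  for branch in "$@"; do
    echo "== $branch"
    # Three-dot form: diff from the merge base of $base and $branch to $branch.
    git diff --stat "$base...$branch"
  done
}

# To take a single file from one candidate into the current branch (hypothetical path):
#   git checkout agent/auth-approach-2 -- src/auth/token.py
# To take a whole commit instead:
#   git cherry-pick <commit>
```

The stat summary is usually enough to rule out one or two candidates before reading any full diff.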

What This Does to Commit History

One thing nobody talks about enough: agentic commits look different from human commits. They tend to be larger, more internally consistent, and less incremental. A human building a feature commits as they go, adding a little at a time. An agent often produces the whole feature in one shot.

This isn’t necessarily bad, but it does change how you read history. git log --oneline becomes less useful as a narrative of the work and more useful as an index into review artifacts. The interesting story is in the diffs, not the message summaries.

It also changes how useful git blame is. When an agent wrote a function, the blame annotation tells you less than it used to, because the recorded author is whoever ran the agent rather than whatever wrote the code. The interesting question is why that code was added, and that’s in the commit message and in whatever prompt or issue was the input.

Some teams are starting to put the task description or prompt into the commit body as a way of preserving intent alongside the code. This is a reasonable idea:

git commit -m "Add token refresh with exponential backoff

Agent task: The auth token refresh logic doesn't handle rate limit
responses (429) from the token endpoint. Add exponential backoff
with jitter, cap at 32 seconds, abort after 5 attempts."

That body section becomes part of the permanent record, searchable with git log --grep, and useful for anyone trying to understand the motivation behind a specific change months later.
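Writing a multi-line quoted string on the command line is fiddly, so a small wrapper helps. This sketch relies on git commit treating each -m as a separate paragraph; the "Agent task:" prefix and both function names are conventions I made up, not anything Git or Willison’s guide prescribes.

```shell
# Commit with the task description as the body; each -m becomes its own paragraph.
commit_with_task() {
  subject=$1
  task=$2
  git add -A && git commit -q -m "$subject" -m "Agent task: $task"
}

# Find commits by words from the original task (case-insensitive), with their bodies.
find_by_task() {
  git log -i --format='%h %s%n%b' --grep="$1"
}
```

Months later, find_by_task "rate limit" surfaces the commit even if the subject line never mentions rate limits.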

Clean State as Default

The broader shift that agentic coding forces is treating a clean working tree as the default rather than an occasional state. When I’m writing code by hand, I’m comfortable with a dirty tree for hours at a time. When I’m working with agents, a dirty tree before invocation creates ambiguity that compounds across sessions.

This has actually made me a more disciplined committer overall. The habit of cleaning up before invoking an agent bleeds into cleaning up more generally. Smaller commits, more frequent checkpoints, less accumulated working tree debt.

Willison’s guides on agentic patterns are worth reading in full because they’re grounded in actual practice rather than theoretical ideals. The Git section in particular is useful as a checklist of habits that most people pick up the hard way. None of these patterns are novel in isolation; what’s changed is how much they matter when an agent can make fifty changes in the time it takes you to review ten.