
Git as the Safety Net That Makes Agentic Coding Sessions Recoverable

Source: simonwillison

The patterns for using Git with coding agents differ from what you’d use in ordinary development, and the differences are worth understanding precisely. Simon Willison’s guide on agentic engineering patterns lays these out systematically, and the core insight is that Git stops being primarily a collaboration and history tool and becomes a state management layer for the session itself.

Committing as a Checkpoint, Not a Milestone

Human developers tend to treat commits as meaningful units of completed work: a feature done, a bug fixed, a refactor finished. With an agent, the right frame is different. Commits are checkpoints in a state machine. The “atom everything” approach Willison describes means committing early, committing often, committing partial work, and committing before anything risky. The goal is enough snapshots that when the agent does something unexpected, you have a clean recovery point.

# Before asking an agent to attempt something structural
git add -A && git commit -m "checkpoint before auth refactor"

This feels wasteful if you’re thinking about commit hygiene, but that framing doesn’t apply to agentic sessions. You can always squash or rebase before merging. The intermediate commits are throwaway scaffolding that buys recoverability. Treating them as permanent history is the wrong mental model.
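To make the recovery step concrete, here is a minimal sketch that builds a throwaway repository, simulates an agent run going wrong, and rolls back to the checkpoint. The temp repo, file names, and simulated edits are hypothetical; only the git commands carry over to a real session.

```shell
#!/bin/sh
# Sketch: checkpoint before a risky agent step, then roll back to it.
# The temp repo, file names, and "agent" edits are hypothetical stand-ins.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "stable code" > app.txt
git add -A
git commit -q -m "checkpoint before auth refactor"
checkpoint=$(git rev-parse HEAD)

# Simulate an agent run that goes wrong: one edited file, one stray new file.
echo "broken refactor" > app.txt
echo "scratch" > stray.txt

# Recover: restore tracked files to the checkpoint, then delete untracked ones.
git reset -q --hard "$checkpoint"
git clean -q -fd

restored=$(cat app.txt)
echo "app.txt now contains: $restored"
```

Note that git reset --hard only restores tracked files; git clean -fd is what removes files the agent created but never committed, so it is worth running git clean -n first in a real repository to preview what would be deleted.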

The practical implication is that you should include committing behavior in your agent instructions rather than leaving it to the model’s defaults. Different models have wildly different defaults: some commit obsessively, some never commit at all. Making the expected behavior explicit removes the variance:

Before starting work:
1. Create a branch named feature/<short-description>
2. Commit after completing each logical unit of work
3. Use the format "feat(module): description" for commit messages
4. Do not push until all tests pass

Worktrees for Agent Isolation

Git worktrees are the underappreciated primitive that makes parallel agentic work practical. A worktree gives you a separate working directory that shares the same .git object store as your main checkout. You can have an agent working on a feature in one worktree while you work on something else in another, without interference.

# Create an isolated workspace for an agent task
# (-b creates the branch; drop it if feature/auth-fix already exists)
git worktree add -b feature/auth-fix ../agent-auth-fix

# The agent works in ../agent-auth-fix
# Your main checkout is untouched
# Both share the same .git history and object store

# Clean up when done
git worktree remove ../agent-auth-fix

The practical value is that you can run multiple agents on the same repository simultaneously. Each gets its own working directory, its own index, and its own branch, so uncommitted edits in one never appear in another. Git also refuses to check out the same branch in two worktrees, which enforces one branch per agent. Without worktrees, you’d either clone the repo multiple times (wasteful, and the clones diverge) or serialize all agent work (slow and limiting).

Claude Code supports this pattern natively through its worktree isolation mode, where the agent operates in a temporary git worktree that gets cleaned up automatically if no changes are made. The underlying mechanism is exactly git worktree add, but surfaced as a session-level option rather than something you have to set up manually each time.

For anyone building tooling around agents, worktrees are worth supporting explicitly. Because the object store is shared, each additional agent costs only a checked-out copy of the working files rather than another copy of the full history, and you get branch isolation for free.
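A rough sketch of what that tooling support might look like: one worktree and one branch per task, created in a loop. The task names, the wt- directory prefix, and the agent/ branch prefix are assumptions for illustration, and the script runs against a throwaway repository.

```shell
#!/bin/sh
# Sketch: one worktree and one branch per agent task.
# Task names, the wt- directory prefix, and agent/ are hypothetical.
set -eu

base=$(mktemp -d)
mkdir "$base/main-checkout"
cd "$base/main-checkout"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"
echo "hello" > README.md
git add -A
git commit -q -m "initial commit"

for task in auth-fix rate-limit; do
  # -b creates the branch and checks it out in the new directory in one step.
  git worktree add -b "agent/$task" "../wt-$task" 2>/dev/null
done

# An edit in one worktree is invisible to the others and to the main checkout.
echo "agent change" >> "../wt-auth-fix/README.md"

worktree_count=$(git worktree list | wc -l | tr -d ' ')
echo "worktrees: $worktree_count"

# Clean up; --force is needed because wt-auth-fix has uncommitted edits.
git worktree remove --force "../wt-auth-fix"
git worktree remove "../wt-rate-limit"
```

Removing a worktree leaves its branch in place, so any commits the agent made there are still reachable for review afterwards.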

Reading Git State as Context

A well-configured agentic session doesn’t just write to git; it reads from it. git status, git diff, and git log give an agent concrete grounding about current state without requiring it to read every file. This is particularly useful for understanding what has already been changed in the current session.

# Useful context to provide or let an agent gather at session start
git status --short
git diff HEAD --stat
git log --oneline -10

When working on my own Discord bot, I’ve found that feeding git diff HEAD into the start of a follow-up prompt gives the agent accurate context about what was already changed in a previous session. This prevents the common failure mode where it tries to re-apply changes that are already in place, or worse, undoes them because it doesn’t know they exist.
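One caveat is that git diff HEAD only covers uncommitted changes, so if the earlier session committed its work, the diff against the base branch is the view that carries the context. The sketch below combines both into a single preamble. The session_context function name and the main base branch are assumptions, and the demo repository is a throwaway.

```shell
#!/bin/sh
# Sketch: gather git state into a text preamble for a follow-up prompt.
# session_context and the "main" base branch are hypothetical choices.
set -eu

session_context() {
  base=${1:-main}
  echo "## Uncommitted changes"
  git status --short
  echo "## Commits since $base"
  git log --oneline "$base..HEAD"
  echo "## Files changed since $base"
  git diff --stat "$base...HEAD"
}

# Demo repository: one committed agent change and one uncommitted file.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
echo "v1" > app.txt
git add -A
git commit -q -m "initial commit"

git checkout -q -b agent/add-rate-limiting
echo "limiter" > limiter.txt
git add -A
git commit -q -m "add limiter"
echo "todo" > notes.txt   # left uncommitted on purpose

context=$(session_context main)
echo "$context"
```

The output is plain text, so it can be pasted at the top of a prompt or written to a file the agent is told to read first.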

The git log --all --oneline view after a completed agentic session also serves as a compressed narrative of what the agent did. Combined with git show <commit>, you can step through the agent’s reasoning as expressed through code changes, which matters when you need to understand why it made a particular decision.

Branches as Task Scope

Creating a branch per agent task serves two purposes: it isolates the agent’s work from main, and it gives you a natural review surface. After the agent finishes, you can review the entire diff as a unified change rather than digging through interleaved commits.

git checkout -b agent/add-rate-limiting
# Assign task to agent
# Agent makes commits
git diff main...agent/add-rate-limiting
# Review, then merge or discard

The naming convention matters here. Prefixing agent branches with agent/ or ai/ makes them identifiable in branch lists and CI pipelines. Protecting main so that every merge requires an approved pull request then gives you a lightweight review gate for agent branches without building custom tooling.

GitHub’s required reviewers and branch protection rules work well here. The agent opens a pull request, a human reviews the diff, and the merge requires approval. The agent does the work; the human approves the boundary crossing into main.

Pre-Commit Hooks as Guardrails

Pre-commit hooks are underused as agent safety nets. If you’re letting an agent commit directly, a hook that runs a lint pass or type check before each commit catches a significant class of errors at the moment they’re introduced.

#!/bin/sh
# Save as .git/hooks/pre-commit and make it executable (or manage via husky or lefthook)
npx tsc --noEmit && npx eslint src/

The trade-off is that hooks slow the agent’s commit cycle. A full test run per commit is usually too slow to be practical, but a fast type check under ten seconds is worth it. When the hook fails, the agent sees the error output and can fix the issue before the commit lands, keeping the history clean and ensuring every committed state at least passes static analysis.

Tools like lefthook and husky make hook configuration portable and versioned, which matters if you’re sharing a workflow across a team or across multiple agents.
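If you want versioned hooks without adding a tool, git’s own core.hooksPath setting points it at a directory inside the repository. The sketch below shows this in a throwaway repo. The .githooks directory name is an assumption, and the FIXME check is a stand-in for whatever fast type check or lint pass you would really run.

```shell
#!/bin/sh
# Sketch: a versioned hook directory wired up with core.hooksPath.
# .githooks and the FIXME check are hypothetical; a real hook would call
# a fast type check or linter instead.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

mkdir .githooks
cat > .githooks/pre-commit <<'EOF'
#!/bin/sh
# Stand-in for a fast static check: reject staged additions containing FIXME.
if git diff --cached | grep -q '^+.*FIXME'; then
  echo "pre-commit: staged changes contain FIXME" >&2
  exit 1
fi
EOF
chmod +x .githooks/pre-commit

# Commit the hook itself first, then switch it on for later commits.
git add -A
git commit -q -m "add versioned hooks"
git config core.hooksPath .githooks

echo "clean line" > a.txt
git add -A
git commit -q -m "passes the hook"

echo "FIXME later" > b.txt
git add -A
if git commit -q -m "should be rejected" 2>/dev/null; then
  rejected=no
else
  rejected=yes
fi
echo "second commit rejected: $rejected"
```

The core.hooksPath value lives in local config, so each clone or worktree setup script still has to run that one git config line; the hook scripts themselves travel with the repository.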

Git Bisect for Agent-Introduced Bugs

When an agent works through a large refactor and introduces a subtle bug, git bisect becomes genuinely useful in a way it rarely is in human workflows. Agents tend to make many small commits, which gives bisect a fine-grained search space.

git bisect start
git bisect bad HEAD
git bisect good main
git bisect run npm test
# Git automatically finds the first bad commit
# Return to the branch you started from
git bisect reset

The atomicity of agent commits, which can look like noise when you’re scanning history, pays off during debugging. Each commit represents a discrete state that bisect can test, and the automated git bisect run mode means you don’t have to step through each one manually.

This is one area where agent workflows can be better than human workflows. A human working on a complex feature might make ten commits over a week; an agent might make thirty commits in an hour. Because bisect is a binary search, the extra commits add only a step or two, and the commit it lands on contains a much smaller change, which makes the result more precise.
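To see the search in action, here is a sketch that creates thirty small commits in a throwaway repository, breaks things at commit 19, and lets git bisect run find it. The value.txt file and the one-line check script are hypothetical stand-ins for a real project and its test command.

```shell
#!/bin/sh
# Sketch: git bisect run over a dense history. The repo, value.txt, and the
# check script are hypothetical stand-ins for a project and its test suite.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

# Stand-in test, kept outside the repo so checkouts never touch it:
# exits 0 while value.txt is exactly "ok", non-zero otherwise.
check=$(mktemp)
cat > "$check" <<'EOF'
grep -qx "ok" value.txt
EOF

# Thirty small commits; commit 19 introduces the regression.
i=1
while [ "$i" -le 30 ]; do
  if [ "$i" -lt 19 ]; then echo "ok" > value.txt; else echo "broken" > value.txt; fi
  echo "$i" > step.txt
  git add -A
  git commit -q -m "step $i"
  if [ "$i" -eq 1 ]; then git tag known-good; fi
  i=$((i + 1))
done

# Bad revision first, then the known-good one.
git bisect start HEAD known-good >/dev/null
git bisect run sh "$check" >/dev/null
first_bad=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1
echo "first bad commit: $first_bad"
```

With thirty commits the search needs only around five test runs, which is why a slow test command is more tolerable here than in a per-commit hook.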

The Audit Trail

One underappreciated benefit of frequent commits is the audit trail they create. After an agentic session, git log --all --oneline --graph gives you the full picture of what the agent did and in what order. This is useful for debugging but also for review: you can see whether the agent stayed on task or wandered into unrelated changes.

This audit function is part of why the “atom everything” philosophy matters even for changes you might normally group together. If the agent modifies a utility function as part of adding a feature, those changes being in separate commits means you can identify exactly when and why the utility changed, rather than having to untangle it from the feature work.
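A small sketch of that kind of audit, assuming the task was supposed to stay inside src/: list the files the branch touched outside that directory, then ask which commit touched them. The directory layout, branch name, and commit messages are hypothetical, and the repository is a throwaway.

```shell
#!/bin/sh
# Sketch: find changes an agent made outside the task's expected scope.
# The src/ and utils/ layout, branch name, and messages are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
mkdir src utils
echo "app" > src/app.txt
echo "helper" > utils/helper.txt
git add -A
git commit -q -m "initial commit"

git checkout -q -b agent/add-rate-limiting
echo "limiter" > src/limiter.txt
git add -A
git commit -q -m "feat(limiter): add rate limiter"
echo "helper v2" > utils/helper.txt
git add -A
git commit -q -m "refactor(utils): tweak helper"

# Files changed on the branch that fall outside src/.
out_of_scope=$(git diff --name-only main...HEAD -- . ':(exclude)src')
echo "outside src/: $out_of_scope"

# Which commit touched them, and what it said it was doing.
culprit=$(git log --format=%s main..HEAD -- utils)
echo "changed by: $culprit"
```

Because the utility change sits in its own commit, the second command returns exactly one message; had it been folded into the feature commit, the same query would only point at the feature work as a whole.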

For teams, this audit trail also provides accountability. If an agent-written change causes a production incident, the commit history tells you what changed and the PR description (if the agent wrote one) tells you why. This isn’t fundamentally different from human-authored changes, but the density of the history makes it more useful.

The Mental Model Shift

The underlying shift is treating git not as a record of finished work but as a live substrate that the agent reads from and writes to throughout its operation. The history becomes a log of agent state transitions. The branch becomes a task scope boundary. The diff becomes the primary review surface.

This changes how you set up a session. You start by ensuring the repo is clean (or committing any in-progress work), create a branch, and give the agent explicit permission and instructions for committing. You end by reviewing the diff, squashing the intermediate commits if they’re noise, and merging or discarding the branch. The git workflow is the session protocol.
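The whole protocol fits in a short script. This is a sketch under assumptions rather than a recommended wrapper: the branch name and commit messages are hypothetical, the agent’s work is simulated with a loop, and it runs in a throwaway repository.

```shell
#!/bin/sh
# Sketch: session protocol from clean check to squash merge.
# Branch name, messages, and the simulated agent commits are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"
echo "v1" > app.txt
git add -A
git commit -q -m "initial commit"

# Start of session: refuse to begin on a dirty tree, then branch.
if [ -n "$(git status --porcelain)" ]; then
  echo "commit or stash in-progress work first" >&2
  exit 1
fi
git checkout -q -b agent/add-rate-limiting

# During the session: the agent leaves checkpoint commits (simulated here).
for n in 1 2 3; do
  echo "change $n" >> app.txt
  git commit -q -am "checkpoint $n"
done

# End of session: collapse the checkpoints into one commit on main.
git checkout -q main
git merge --squash agent/add-rate-limiting >/dev/null 2>&1
git commit -q -m "feat(api): add rate limiting"
# -D because a squash merge does not mark the branch as merged.
git branch -q -D agent/add-rate-limiting

main_commits=$(git rev-list --count main)
echo "commits on main: $main_commits"
```

Discarding a bad run is the same script with the merge and commit lines removed: switch back to main and delete the branch.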

For everyday agentic work, internalizing these patterns makes sessions substantially more recoverable without much overhead. A few extra commands at the start and end of each session mean a bad agent run goes from a working directory full of confusing changes to a branch you can delete and restart. Given how often agents do unexpected things, that recoverability is worth building in from the start.
