The AI Learning Loop Teams Keep Leaving Open

Source: martinfowler

The flywheel metaphor comes from Jim Collins’s Good to Great: consistent effort in one direction builds momentum, and each turn makes the next easier. Rahul Garg borrows it in the final piece of his series on reducing friction in AI-assisted development, published on Martin Fowler’s site, to describe something specific: the loop between what individual engineers learn during AI sessions and what the team as a whole knows.

That loop, in most teams, is open. Someone discovers a prompt pattern that reliably produces good migrations, or finds that giving Claude a concrete example before asking it to generate tests cuts the correction round-trips in half. That knowledge lives in their head. Maybe they mention it in a standup. Mostly it doesn’t travel.

The cost compounds slowly enough that teams don’t notice it at first. But after a few months, you have a wide spread in AI effectiveness across the team that looks like a talent gap and is really an information gap.

What ‘Shared Artifacts’ Actually Means

The phrase ‘shared artifacts’ covers a lot of ground, so it’s worth being concrete. In the AI-assisted development context, there are a few categories:

Context files. Tools like Claude Code read a CLAUDE.md file at the repository root (and optionally in subdirectories) and load it into every session as persistent context, much like a standing system prompt. Cursor uses .cursor/rules. GitHub Copilot has its own instructions format. These files are the most direct way to encode team knowledge into the AI’s behavior. A well-maintained CLAUDE.md can capture things like:

# Project conventions

## Database migrations
Always generate migrations using Knex's `table.timestamps(true, true)` for created_at/updated_at.
Never use raw SQL in migration files unless the Knex API genuinely can't express the operation.

## Testing
We use Vitest. Test files live next to source files as `*.test.ts`.
Avoid mocking the database layer in integration tests -- we have a test DB seeding script at `scripts/seed-test.ts`.

## Code style
Prefer explicit error types over generic Error throws. See `src/errors.ts` for the error hierarchy.

This is not documentation for humans. It’s operational context that shapes every AI interaction in the repository. The difference between a team that maintains this file and one that doesn’t shows up in correction rounds: the team that maintains it needs fewer of them, because the AI’s defaults are already calibrated to the codebase.

Prompt libraries. Some teams maintain a shared collection of prompt patterns that work well for their domain. This can be as simple as a markdown file in the repo, or as structured as a shared Notion database with tags for context type (code generation, review, debugging) and effectiveness notes. The tooling matters less than the habit.

Retrospective notes. Meeting notes that capture what worked, what didn’t, and what changed. These age poorly if they’re not periodically synthesized into the artifacts above, but they serve as the raw input for that synthesis.
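To make the prompt library idea concrete, here is a minimal sketch of what one entry could look like as data, assuming the tags described above (context type plus an effectiveness note). The type names, field names, and the sample entry are all hypothetical; nothing here comes from Garg's article or from any particular tool.

```typescript
// Hypothetical shape for one entry in a shared prompt library.
// Field names are illustrative assumptions, not a standard format.
type ContextType = "code generation" | "review" | "debugging";

interface PromptPattern {
  title: string;
  contextType: ContextType;
  prompt: string;
  effectivenessNote: string; // what the team observed when using the pattern
}

// Return the patterns recorded for a given kind of work.
function findPatterns(
  library: readonly PromptPattern[],
  contextType: ContextType,
): PromptPattern[] {
  return library.filter((p) => p.contextType === contextType);
}

// Render one entry as a markdown section, so the same data can live in a repo file.
function toMarkdown(p: PromptPattern): string {
  return [
    `## ${p.title}`,
    `Context: ${p.contextType}`,
    "",
    p.prompt,
    "",
    `Notes: ${p.effectivenessNote}`,
  ].join("\n");
}

// Sample entry (made up for illustration).
const library: PromptPattern[] = [
  {
    title: "Example-first test generation",
    contextType: "code generation",
    prompt:
      "Here is one existing test from this module. Write tests for the function below in the same style.",
    effectivenessNote:
      "Fewer correction round-trips than asking for tests with no example.",
  },
];
```

Whether this lives in a typed file, a markdown page, or a Notion database is secondary; the point of the structure is that an entry records both the prompt and what was learned from using it.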

The Harvest Problem

Garg’s framework calls the act of collecting learnings from AI sessions ‘harvesting’, and the word is apt: value that exists gets lost if you don’t collect it. The question is when and how.

Ad-hoc sharing (the standup mention, the Slack message) doesn’t work reliably because it requires memory and initiative at the moment of discovery, and the discovery often happens deep in a flow state where stopping to document feels like friction. By the time the session ends, the specific insight has faded into general intuition.

A structured retrospective cadence works better. Not a full retrospective on everything, but a short focused session, weekly or bi-weekly, with a specific set of questions:

  • What did you ask the AI to do this week that it handled well with minimal correction?
  • What did you ask for that required multiple rounds of back-and-forth to get right?
  • Did you reuse a prompt pattern from last time, or did you have to rediscover something?
  • Is there a project convention the AI kept getting wrong that we haven’t documented?

The output of this session isn’t notes. It’s PRs against CLAUDE.md, additions to the prompt library, or updates to the onboarding docs. The discipline of ending with an artifact, not a discussion, is what closes the loop.
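As a rough illustration of "ending with an artifact", the sketch below turns one retrospective finding into a change to a context file's text. The function name `addConvention` is hypothetical, and it assumes the file uses `## ` section headings like the CLAUDE.md sample earlier; it does not handle headings that appear inside fenced code blocks.

```typescript
// Append a convention line under a "## <heading>" section of a context file.
// If the section does not exist yet, it is created at the end of the file.
// Assumption: sections are delimited by level-1 or level-2 markdown headings.
function addConvention(
  contextFile: string,
  heading: string,
  convention: string,
): string {
  const lines = contextFile.split("\n");
  const start = lines.findIndex((l) => l.trim() === `## ${heading}`);

  if (start === -1) {
    const body = contextFile.replace(/\s+$/, "");
    const prefix = body === "" ? "" : `${body}\n\n`;
    return `${prefix}## ${heading}\n${convention}\n`;
  }

  // The section runs until the next level-1 or level-2 heading, or the end of the file.
  const rel = lines.slice(start + 1).findIndex((l) => /^#{1,2} /.test(l));
  const end = rel === -1 ? lines.length : start + 1 + rel;

  // Step back over trailing blank lines so the entry sits with the section's text.
  let insertAt = end;
  while (insertAt > start + 1 && (lines[insertAt - 1] ?? "").trim() === "") {
    insertAt--;
  }

  // Skip the edit if the section already records this convention.
  if (lines.slice(start + 1, insertAt).includes(convention)) {
    return contextFile;
  }

  lines.splice(insertAt, 0, convention);
  return lines.join("\n");
}
```

In practice the resulting text would go into a branch and a pull request, so the rest of the team reviews the new convention the same way they review code.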

Why This Is Harder Than It Looks

Knowledge management problems have been with software teams since long before AI. Wikis rot. Coding standards documents go unread. Tribal knowledge concentrates in the engineers who’ve been around longest. None of this is new.

What AI-assisted development changes is the scale and speed of knowledge production. Before, an engineer might develop one or two strong intuitions per month about how to approach a class of problem. Now, a developer using AI intensively might discover half a dozen effective prompt patterns in a single week. The volume of tacit knowledge being generated has increased, but the mechanisms for externalizing it haven’t changed.

There’s also an incentive problem. Updating a CLAUDE.md file or writing up a prompt pattern is invisible work. It doesn’t show up in sprint velocity. It benefits colleagues more than it benefits the person doing it, at least in the short term. Teams that solve this tend to treat AI context file maintenance as part of the definition of done for certain categories of work, the same way they treat updating tests or documentation.
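One way to make "part of the definition of done" visible is a small check on the list of files a change touches. The sketch below is an assumption about how such a check might look, not something described in Garg's series: the path prefixes are hypothetical examples, and a real team would pick the categories of work that matter to it.

```typescript
// Hypothetical rule: paths whose changes often imply a convention worth recording.
const CONVENTION_SENSITIVE_PREFIXES = ["migrations/", "src/errors.ts"];
const CONTEXT_FILE = "CLAUDE.md";

// Returns true when a change touches convention-sensitive paths
// but does not touch any CLAUDE.md (root or subdirectory).
function needsContextFileReview(changedPaths: readonly string[]): boolean {
  const touchesSensitive = changedPaths.some((p) =>
    CONVENTION_SENSITIVE_PREFIXES.some((prefix) => p.startsWith(prefix)),
  );
  const touchesContextFile = changedPaths.some(
    (p) => p === CONTEXT_FILE || p.endsWith(`/${CONTEXT_FILE}`),
  );
  return touchesSensitive && !touchesContextFile;
}
```

A check like this is better used as a reminder in review than as a hard gate, since many changes to those paths will not introduce a new convention.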

The Compounding Effect

The flywheel framing is useful because it captures the compounding nature of getting this right. A team that runs good feedback loops sees accelerating improvement. In the first few months, the gains are modest: fewer correction rounds, less time spent on boilerplate. After six months, the CLAUDE.md encodes dozens of project-specific conventions, and new engineers are productive faster because the AI is already calibrated to the codebase’s norms before their first day.

This mirrors what organizations have learned about pair programming: the knowledge transfer isn’t in the moment of pairing, it’s in the accumulation of shared context over time. The practice creates a channel for knowledge to flow; the value comes from using that channel consistently.

At the individual level, AI tools are productivity multipliers. At the team level, they’re only multipliers if the learnings compound. A team of ten, where each person has independently learned how to prompt effectively but none of that knowledge is shared, gets roughly ten times the individual benefit: the gains add up, but they don’t build on each other. A team that has closed the loop gets something closer to compounding improvement, because each discovery builds on the last and the whole team benefits from each individual session.

What This Looks Like in a Real Codebase

For a concrete picture: in the Ralph bot project I work on, the CLAUDE.md has grown to cover Discord.js API quirks that Claude gets wrong by default, our JSON schema conventions for skill definitions, and the fact that we use atomic file writes for data persistence (write to a temporary file, then rename it over the target, rather than overwriting in place). None of that came from a planning session. Each entry was added after someone spent fifteen minutes in a correction loop and thought to ask why.
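For readers unfamiliar with the write-then-rename convention mentioned above, here is a minimal sketch using Node's built-in `fs` module. It is not the Ralph project's actual code: the function name `writeFileAtomic` and the temporary file naming are assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Write-then-rename: write the full contents to a temporary file in the same
// directory, then rename it over the target. On POSIX filesystems the rename
// replaces the target atomically, so a reader sees the old file or the new
// one, not a half-written file.
function writeFileAtomic(targetPath: string, contents: string): void {
  const dir = path.dirname(targetPath);
  // Hypothetical temp name; same directory keeps the rename on one filesystem.
  const tempPath = path.join(
    dir,
    `.${path.basename(targetPath)}.${process.pid}.tmp`,
  );
  try {
    fs.writeFileSync(tempPath, contents, "utf8");
    fs.renameSync(tempPath, targetPath);
  } catch (err) {
    // Remove the temp file if anything failed; the original target is untouched.
    fs.rmSync(tempPath, { force: true });
    throw err;
  }
}
```

This sketch does not call `fsync`, so it protects against partially written files when the process crashes, but it makes no durability guarantee across a power loss. It is exactly the kind of detail an AI assistant will not apply by default unless the context file says so.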

The test: if a new engineer clones the repo and their first AI session produces code that fits the project’s conventions without corrections, the shared context is working. If they spend the first week in correction loops that the senior engineers have already resolved, the knowledge isn’t being shared; it’s just being re-earned.

Garg’s series has been a useful prompt for making that test explicit. The feedback flywheel isn’t a complex system. It’s a habit, with a specific artifact as the output and a regular cadence to enforce it. The teams that build the habit early will have a compounding advantage over those that treat AI learning as purely individual.
