
What Parallel Subagent Execution Actually Requires

Source: simonwillison

When a developer first encounters the subagent pattern in Simon Willison’s agentic engineering patterns guide, the appeal is immediate: instead of a single agent serializing all work through one context window, spawn multiple agents that run concurrently and finish faster. The reality is that most multi-agent systems call their subagents sequentially, and this is not laziness on the part of framework designers. Sequential execution is always safe. Concurrent execution has prerequisites that are easy to violate.

Context Isolation Is Not the Same as Execution Isolation

Every subagent architecture gives you context isolation by default: each subagent runs in a fresh conversation context with no visibility into the parent’s accumulated history or other subagents’ work. This holds in Claude Code’s Task tool, the OpenAI Agents SDK, LangGraph, and any framework implementing the orchestrator-subagent pattern. Context isolation is structural; it falls out of the design.

Execution isolation, meaning two subagents can run simultaneously without producing conflicting results, is a separate property that depends on what the tasks actually do to shared resources. Claude Code runs subagents sequentially by default. The OpenAI Agents SDK supports async execution via asyncio.gather, but the orchestrating model still tends to issue tool calls one at a time unless the system prompt explicitly encourages concurrency. Both defaults reflect the same conservative reasoning: sequential execution is always correct, parallel execution requires proof.

The Two Prerequisites

Two conditions must hold simultaneously for a pair of subagents to run in parallel safely.

The first is no write-access overlap. If both subagents might write to the same file, push to the same branch, modify the same database table, or create the same external resource, they cannot run concurrently without coordination. The outcome depends on which subagent finishes last, and that ordering is non-deterministic. This is a standard write-write conflict, identical to what makes concurrent database transactions without serialization unsafe.
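The non-determinism is easy to reproduce in a few lines of asyncio. A toy sketch, with an illustrative file path and agent names, in which two concurrent tasks write the same key and the surviving value depends on scheduling order:

```python
import asyncio
import random

async def writer(store: dict, name: str) -> None:
    # A random delay stands in for variable subagent runtimes.
    await asyncio.sleep(random.random() / 100)
    store["src/auth/types.ts"] = name  # both tasks write the same resource

async def main() -> str:
    store: dict = {}
    await asyncio.gather(writer(store, "agent-A"), writer(store, "agent-B"))
    return store["src/auth/types.ts"]  # last writer wins; either is possible

winner = asyncio.run(main())
```

Run it repeatedly and `winner` flips between the two names: the merged state depends on which task finished last, exactly the write-write conflict described above.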

The second is no input-output dependency. If subagent B needs the output of subagent A as an input to its task, they cannot run at the same time. This sounds obvious in the abstract and is easy to miss in practice. The dependency is often indirect: subagent A generates a shared type definition that subagent B imports, or subagent A updates configuration that subagent C’s tests rely on. The tasks look independent until you trace what each one reads.

Building the Access Matrix

When decomposing a large task into potential subagent work, the right approach is to build an access matrix before scheduling anything. This does not need to be formal. A simple table covering which files or resources each task reads and writes is sufficient.

Task           | Reads                | Writes
--------------------------------------------------------
type-defs      |                      | src/auth/types.ts
auth-module    | src/auth/types.ts    | src/auth/*.ts
               |                      | tests/auth/*.ts
user-service   | src/auth/types.ts    | src/user/*.ts
               |                      | tests/user/*.ts
gateway-mw     | src/auth/token.ts    | src/gateway/*.ts
               |                      | tests/gateway/*.ts

In this example, type-defs writes src/auth/types.ts. Both auth-module and user-service read that file, so they have input-output dependencies on type-defs: it must complete before either can start. gateway-mw reads src/auth/token.ts, which falls inside auth-module's src/auth/*.ts write set, so it depends on auth-module instead. And that same write glob covers src/auth/types.ts, the file type-defs writes: a write-access overlap, so type-defs and auth-module cannot run concurrently.
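The matrix is small enough to encode directly and check mechanically. A minimal sketch in Python, assuming glob-style write patterns and concrete read paths as in the table (the helper names are illustrative, not from any framework):

```python
from fnmatch import fnmatch

# The access matrix from the table above, as data.
matrix = {
    "type-defs":    {"reads": [],                    "writes": ["src/auth/types.ts"]},
    "auth-module":  {"reads": ["src/auth/types.ts"], "writes": ["src/auth/*.ts", "tests/auth/*.ts"]},
    "user-service": {"reads": ["src/auth/types.ts"], "writes": ["src/user/*.ts", "tests/user/*.ts"]},
    "gateway-mw":   {"reads": ["src/auth/token.ts"], "writes": ["src/gateway/*.ts", "tests/gateway/*.ts"]},
}

def overlaps(a: str, b: str) -> bool:
    # A glob and a concrete path overlap if either matches the other.
    # (Glob-vs-glob overlap is only approximated; fine for this table.)
    return fnmatch(a, b) or fnmatch(b, a)

def write_conflict(t1: str, t2: str) -> bool:
    """True if the two tasks' write sets overlap (must be serialized)."""
    return any(overlaps(w1, w2)
               for w1 in matrix[t1]["writes"]
               for w2 in matrix[t2]["writes"])

def depends_on(reader: str, writer: str) -> bool:
    """True if `reader` consumes something inside `writer`'s write set."""
    return any(overlaps(r, w)
               for r in matrix[reader]["reads"]
               for w in matrix[writer]["writes"])
```

With this in hand, `write_conflict("type-defs", "auth-module")` is true (types.ts sits inside src/auth/*.ts), while `write_conflict("user-service", "gateway-mw")` is false, which is what licenses running those two in parallel.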

What looks like a four-way parallel decomposition turns out to have required ordering:

  1. type-defs runs first.
  2. auth-module runs after type-defs completes.
  3. user-service and gateway-mw can run in parallel, since their write sets do not overlap and everything they read (src/auth/types.ts and src/auth/token.ts) is stable once auth-module has finished.

You get partial parallelism at the final step, not the full parallelism you might have expected. The shape of the task graph is narrower than intuition suggests for most real refactors.
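The wave structure can be derived mechanically rather than tracked by hand. A sketch using Kahn-style topological layering, with the dependency edges hand-encoded from the access matrix above:

```python
# Edges derived from the access matrix: a task lists what must finish first.
deps = {
    "type-defs": set(),
    "auth-module": {"type-defs"},                   # reads types.ts; write sets also overlap
    "user-service": {"type-defs", "auth-module"},   # reads types.ts, which auth-module may rewrite
    "gateway-mw": {"auth-module"},                  # reads token.ts from auth-module's write set
}

def waves(deps: dict) -> list:
    """Group tasks into waves; tasks within a wave may run in parallel."""
    remaining = {t: set(d) for t, d in deps.items()}
    done, order = set(), []
    while remaining:
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("cycle in dependency graph")
        order.append(ready)
        done.update(ready)
        for t in ready:
            del remaining[t]
    return order

schedule = waves(deps)
```

The result is three waves: type-defs alone, then auth-module alone, then user-service and gateway-mw together, matching the ordering listed above.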

What Tool Access Declarations Reveal

This analysis is a natural byproduct of the minimal footprint principle that Willison’s guide introduces as a security recommendation: give each subagent only the tools it needs for its assigned task. When you specify exactly which tools each subagent receives, you get an enumeration of what resources it can touch. A subagent with write_file access scoped to src/auth/ has a precise write domain. A subagent with only read tools is trivially safe to parallelize with any other subagent.
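A sketch of what such scoping can look like in plain Python; `make_scoped_write` is a hypothetical helper, not an API from any of the frameworks mentioned, and it records writes instead of touching disk:

```python
from pathlib import PurePosixPath

def make_scoped_write(root: str):
    """Return a write tool whose write domain is exactly `root`."""
    written: list = []
    def write_file(path: str, content: str) -> None:
        parents = {str(p) for p in PurePosixPath(path).parents}
        if root not in parents:
            raise PermissionError(f"{path} is outside {root}")
        written.append(path)  # stand-in for the real filesystem write
    return write_file, written

# A subagent handed this tool has an enumerable write domain: src/auth/.
write_auth, auth_writes = make_scoped_write("src/auth")
write_auth("src/auth/token.ts", "...")          # allowed
# write_auth("src/user/service.ts", "...")      # would raise PermissionError
```

Because the write domain is declared rather than discovered, two subagents' scopes can be compared before any work starts, which is exactly the design-time analysis the paragraph above describes.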

The security benefit is real. The InjecAgent benchmark found that attack success rates compound across each hop in a multi-agent chain, making broad tool access at each level disproportionately dangerous. But tool access scoping also forces the dependency analysis to happen at design time rather than at runtime through failed execution. If you cannot enumerate which files a subagent will write, the task is not well-scoped enough to safely schedule in parallel with anything else.

From Implicit to Explicit Dependencies

The frameworks have taken different approaches to encoding this structure.

LangGraph represents workflows as directed acyclic graphs where dependencies are explicit structural relationships. When you define an edge from node A to node B, the framework knows A must complete before B starts. Nodes with no incoming edges from running nodes can execute concurrently automatically. The dependency information is in the graph structure, not in the developer’s head.

The OpenAI Agents SDK with asyncio leaves dependency management to the developer. You call asyncio.gather for tasks you know are independent and await them sequentially otherwise. The correctness of the ordering is entirely implicit:

from agents import Agent, Runner
import asyncio

# type_agent, auth_agent, user_agent, and gateway_agent are defined elsewhere

async def main():
    # type-defs must finish first
    type_result = await Runner.run(type_agent, task_types)

    # auth-module depends on type-defs output
    auth_result = await Runner.run(auth_agent, task_auth)

    # user-service and gateway-mw are independent of each other
    user_result, gateway_result = await asyncio.gather(
        Runner.run(user_agent, task_user),
        Runner.run(gateway_agent, task_gateway),
    )

asyncio.run(main())

This is not wrong, but it requires the developer to have done the dependency analysis upfront and encoded it into the structure of the orchestrator. Most introductory examples skip that step and show all subagents running in a single asyncio.gather, which works for the tutorial’s carefully chosen independent tasks and fails silently for anything with write-access overlap.
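One way to make that failure loud instead of silent is to have the orchestrator check declared write sets before gathering. A sketch, assuming each task declares its write set upfront; the declaration mechanism is hypothetical, since neither asyncio nor the Agents SDK provides one:

```python
import asyncio

# Hypothetical upfront declarations of each task's write domain.
WRITE_SETS = {
    "user-service": {"src/user", "tests/user"},
    "gateway-mw": {"src/gateway", "tests/gateway"},
}

async def run_task(name: str) -> str:
    await asyncio.sleep(0)  # stand-in for Runner.run(agent, task)
    return f"{name}: done"

async def gather_if_disjoint(names: list) -> list:
    # Fail fast at schedule time instead of merging conflicting output later.
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = WRITE_SETS[a] & WRITE_SETS[b]
            if shared:
                raise ValueError(f"{a} and {b} both write {sorted(shared)}")
    return await asyncio.gather(*(run_task(n) for n in names))

results = asyncio.run(gather_if_disjoint(["user-service", "gateway-mw"]))
```

The guard costs a few lines and turns a write-access overlap into an immediate scheduling error rather than a subtle inconsistency in the merged output.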

The Practical Shape of Real Refactors

For tasks that are genuinely independent by design, such as generating tests for independent modules or updating documentation across services that share no types, the prerequisites for parallelism are easy to verify. The tasks are parallel because the design made them parallel.

For tasks derived from a unified codebase refactor, the dependency analysis is more revealing than developers expect. Most large refactors have a load-bearing abstraction at the center: a shared type, an interface, a configuration schema. Updating it is the prerequisite for everything else. The subagent decomposition ends up being: update the core abstraction first, then fan out the downstream changes in a second wave that can run in parallel.

This is the same shape as a database migration with dependent application changes, or a library API change with downstream consumer updates. The parallel structure exists, but it is two-phase, not flat.

Understanding this before writing the orchestration code prevents a category of runtime failures where two subagents silently conflict, both complete and return success to the orchestrator, and the merged output has subtle inconsistencies that only surface in integration tests or production. The dependency graph is not optional bookkeeping. It determines whether a multi-agent system produces correct output or confidently wrong output that passes local verification and fails later.
