
When Codex Delegates to Your Custom Agents, Debugging Gets Harder

Source: simonwillison

OpenAI’s Codex CLI has added support for subagents and user-defined custom agents, as noted by Simon Willison in his March 2026 writeup. The feature lets Codex spawn subordinate agent processes for specific subtasks and lets users configure their own agents with tailored tool access and system prompts. On the surface that sounds like a logical extension of the tool. Beneath the surface, it introduces a class of coordination problems that monolithic agents simply do not have.

What Codex Already Does

For context: Codex CLI, open-sourced by OpenAI in April 2025, is a terminal-resident coding agent that runs in a sandboxed environment. It reads your codebase, executes shell commands, writes and edits files, and runs tests, all within a loop driven by a language model. The key configuration surface is AGENTS.md, a file you can place at the repo root or in subdirectories to inject task-specific instructions, coding conventions, or tool constraints into the agent’s context.
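For readers who have not used one, an AGENTS.md file is plain markdown read as instructions by the agent. A minimal sketch might look like this (the conventions, commands, and paths are illustrative, not from any real repository):

```markdown
# AGENTS.md (repo root)

## Conventions
- Use TypeScript strict mode; avoid `any`.
- Run `npm test` after any change under `src/`.

## Constraints
- Never modify files under `migrations/`.
- Keep diffs small and reviewable; one concern per change.
```

Files placed in subdirectories can layer more specific instructions on top of the root file for the code that lives there.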

The model driving it is typically GPT-4o or one of OpenAI’s o-series reasoning models. Each run gets a context window, fills it with relevant file content and conversation history, and produces tool calls that map to real filesystem and shell operations. When the task is done, or the context is exhausted, the run ends.

That architecture works well for self-contained tasks. It starts to strain when you need to coordinate across multiple concerns that don’t all fit cleanly in one context window.

The Subagent Model

Subagent support addresses this strain by letting the primary Codex agent delegate subtasks to child agent instances. Instead of a single agent accumulating all context for a long task, the orchestrating agent can spawn a subagent, describe the subtask, hand it the relevant files and instructions, and wait for a result.

This is, at its core, a remote procedure call. The parent agent serializes a task description, the child executes it and returns output, and the parent incorporates that output into its own reasoning. The child does not share the parent’s conversation history; it gets only what the parent explicitly passes. Each subagent call starts a fresh context window.
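The RPC analogy can be made concrete with a toy sketch. Everything here, `AgentRun`, `delegate`, the `run_model` callable, is a hypothetical stand-in for illustration, not Codex's actual API; the point is that the child's context starts empty and the parent receives only the serialized result:

```python
from dataclasses import dataclass, field


@dataclass
class AgentRun:
    """One agent invocation with its own, isolated context window."""
    context: list[str] = field(default_factory=list)

    def run(self, task: str, run_model) -> str:
        # The child sees only the task description it was handed,
        # never the parent's accumulated conversation history.
        self.context.append(task)
        return run_model(self.context)


def delegate(parent: AgentRun, subtask: str, run_model) -> str:
    # Serialize the subtask, execute it in a fresh context, and
    # incorporate only the returned output into the parent's context.
    child = AgentRun()  # fresh context window, nothing inherited
    result = child.run(subtask, run_model)
    parent.context.append(f"subagent result: {result}")
    return result
```

Whatever the parent does not write into `subtask` simply does not exist from the child's point of view.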

This has some real advantages. Parallel subagent invocations can run simultaneously, which maps well onto tasks like “refactor these five modules independently” or “run linting on each package in this monorepo.” Isolation also means the child cannot be confused by irrelevant context accumulated in the parent’s turn. And the parent’s context window stays smaller, which keeps latency and token costs from spiraling as tasks grow longer.
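The fan-out pattern for independent subtasks is essentially concurrent dispatch. A minimal sketch, with `run_subagent` as a hypothetical placeholder for a real subagent call:

```python
import asyncio


async def run_subagent(task: str) -> str:
    # Stand-in for a real subagent invocation; each task runs in
    # isolation with only its own task description.
    await asyncio.sleep(0)  # yield control, as a real call would
    return f"done: {task}"


async def fan_out(tasks: list[str]) -> list[str]:
    # Independent subtasks are dispatched concurrently and the
    # parent waits for all results before continuing.
    return await asyncio.gather(*(run_subagent(t) for t in tasks))


results = asyncio.run(fan_out([f"lint package {i}" for i in range(5)]))
```

The parent's context grows by five short results rather than five full transcripts.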

But the fresh-context guarantee is a double-edged property. The child has no memory of decisions the parent made earlier in the session unless the parent restates them. If the parent discovered midway through a task that a certain API is deprecated, the child will not know that unless the parent explicitly includes it in the delegation prompt. Implicit state that accumulates naturally in a single-agent conversation has to be made explicit at every delegation boundary.
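One way to discipline that handoff is to carry an explicit list of session facts and restate them in every delegation prompt. A sketch with a hypothetical helper:

```python
def build_delegation_prompt(task: str, session_facts: list[str]) -> str:
    """Restate parent-session decisions the child cannot otherwise know.

    `session_facts` accumulates discoveries made mid-session, e.g.
    "the v1 billing API is deprecated; call v2 instead".
    """
    lines = [f"Task: {task}"]
    if session_facts:
        lines.append("Constraints discovered earlier in this session:")
        lines.extend(f"- {fact}" for fact in session_facts)
    return "\n".join(lines)
```

It is crude, but it turns "remember to mention it" into a mechanical step that happens at every delegation boundary.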

Custom Agents as Configuration Units

Custom agents extend this further by letting users define purpose-specific agents with their own system prompts, model selections, and tool access restrictions. You might define a reviewer agent that has read-only tool access and uses a reasoning-optimized model for analysis, a test-writer agent configured to only touch test files, or a docs agent that operates on markdown exclusively.

This maps to a pattern that has appeared across the agent ecosystem in 2025 and 2026: treating agents as composable units rather than monolithic processes. LangChain’s agent framework, Claude Code’s agent tool, and Microsoft’s AutoGen have all converged on similar decompositions, where a supervisor agent delegates to specialists with constrained capabilities.

The constraint part matters. A test-writing agent that cannot touch production code is a weaker attack surface than a general-purpose agent that can touch everything. If a prompt injection arrives through test output or a malicious dependency, the blast radius is bounded by what tools the agent can invoke. This is not a complete security solution, but it is a meaningful reduction in risk compared to a flat permission model.
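The blast-radius argument reduces to an allowlist check at the tool boundary. A toy sketch (the agent names, tool names, and `TOOLS` registry are all illustrative):

```python
# Fake tool registry standing in for real filesystem/shell operations.
TOOLS = {
    "read_file": lambda path: f"<contents of {path}>",
    "list_dir": lambda path: [],
    "write_file": lambda path, text: None,
    "run_tests": lambda: "ok",
}

# Per-agent allowlists: the reviewer is read-only, the test-writer
# can write but has no general shell access.
ALLOWED_TOOLS = {
    "reviewer": {"read_file", "list_dir"},
    "test-writer": {"read_file", "write_file", "run_tests"},
}


def invoke_tool(agent: str, tool: str, *args):
    """Refuse any tool call outside the agent's declared allowlist,
    so an injected instruction cannot escalate past it."""
    if tool not in ALLOWED_TOOLS.get(agent, set()):
        raise PermissionError(f"{agent} may not call {tool}")
    return TOOLS[tool](*args)
```

Even if the reviewer agent is fully compromised by injected text, the worst it can do is read files it could already read.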

The Codex approach to custom agent configuration through AGENTS.md and CLI flags fits naturally into the existing workflow. Developers who already maintain AGENTS.md files for codebase-specific instructions can extend that pattern to define agent specializations without leaving the filesystem-as-configuration model that Codex already uses.

The Debugging Surface Expands

Here is where things get genuinely complicated. When a single agent fails or produces a wrong result, you have one conversation trace to inspect. When a parent agent delegates to three subagents in parallel, each of which may have spawned further subagents, the error could have originated anywhere in that tree, and the parent’s trace will show only the summarized output the child returned, not the child’s internal reasoning.

This is the same problem distributed systems engineers have faced for decades: correlating failures across service boundaries. The tooling for LLM agent systems is nowhere near as mature as distributed tracing infrastructure like OpenTelemetry. Most agent frameworks currently give you log files and token counts, not flame graphs of agent execution with annotated decision points.

The specific failure mode to watch for in hierarchical Codex setups is context loss at delegation boundaries. A parent agent might correctly identify that a module needs refactoring, spawn a subagent to do it, but underspecify the constraints in the delegation prompt. The subagent completes its task successfully given what it was told, but the result violates an assumption the parent held that was never written down. Neither agent made an error; the gap was in the handoff.

Good subagent prompts need to be explicit about constraints that would normally be implicit in a single-agent conversation: relevant architectural decisions, files that must not be modified, APIs with in-flight changes, test requirements, style conventions. Writing those prompts well is a skill distinct from writing good system prompts for single-agent tasks.

Prompt Injection Scales With Depth

The security concern worth flagging explicitly is how prompt injection interacts with agent depth. A single agent reading a file can encounter malicious instructions embedded in that file. With a single agent, the injected instruction competes with the system prompt for influence. With a multi-level agent hierarchy, a successful injection at a leaf agent can produce output that the parent agent trusts and incorporates into its own reasoning, potentially triggering further actions with the parent’s broader tool access.

This was discussed in the context of indirect prompt injection by Kai Greshake and colleagues in 2023, and the concern compounds as agent systems add delegation layers. Each level of delegation is a potential amplification point for injected instructions. Custom agents with restricted tool access help contain this, but the trust model between parent and child agents deserves careful design rather than implicit inheritance.
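One small piece of that design is refusing to treat child output as instructions at all. The sketch below delimits child output as quoted data and flags directive-like phrases for review. To be clear, pattern matching is not a defense against prompt injection; this only illustrates the "untrusted by default" posture, and the regex is a hypothetical example:

```python
import re

# Crude heuristic for directive-like phrasing; real injections
# will not reliably match any fixed pattern.
SUSPICIOUS = re.compile(r"(?i)\b(ignore (all|previous) instructions|disregard)\b")


def incorporate_child_output(parent_context: list[str], child_output: str) -> None:
    """Append child output as delimited data, never as instructions,
    flagging directive-like phrases for human review."""
    if SUSPICIOUS.search(child_output):
        child_output = "[FLAGGED: possible injected instruction]\n" + child_output
    # Delimiters signal to the model that this span is quoted data.
    parent_context.append(f"<child_output>\n{child_output}\n</child_output>")
```

The trust decision happens once, at the boundary, instead of being left to the parent model's judgment mid-reasoning.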

What This Means in Practice

For most Codex users working on self-contained tasks, subagents will rarely be relevant. The single-agent model handles the majority of coding work effectively. Where subagents become valuable is in longer-running automated pipelines: CI-integrated agents that run on pull requests, repository-wide refactoring tasks, or multi-step workflows where different phases genuinely benefit from different model configurations.

Custom agents are immediately useful for teams maintaining shared AGENTS.md conventions. Defining a project-specific reviewer agent that knows your team’s conventions and only has read access creates a repeatable code review step that can be triggered by the orchestrating agent without granting it write permissions to your codebase.

The tooling for inspecting and debugging multi-agent runs will need to mature before hierarchical Codex workflows become routine practice for most developers. Right now, building with subagents means accepting that you will spend more time reasoning about what was passed between agents than about what any individual agent did. That trade-off makes sense for large enough tasks; for smaller ones, a single well-configured agent with a clear AGENTS.md is almost always simpler to reason about and debug.

The capability is a meaningful addition to what Codex can do. How well it works in practice depends heavily on how carefully you design the delegation boundaries.
