OpenAI’s article on the Codex App Server, published in February 2026, is worth reading for what it reveals about protocol design under pressure. The team needed to embed an autonomous coding agent into applications, not just expose an LLM endpoint, and the choice of a bidirectional JSON-RPC 2.0 API reflects a clear understanding of what that distinction requires.
What embedding an agent actually means
When you call an LLM API, the flow is fixed: send a prompt, get tokens back. Even with streaming via Server-Sent Events, communication is unidirectional. The client requests; the server responds. The OpenAI Responses API uses exactly this model: HTTP POST, SSE stream back.
Embedding an autonomous agent is structurally different. The agent doesn’t just respond; it acts. It reads files, runs shell commands, writes code, proposes changes. In any real deployment, the embedding application needs to handle four distinct concerns:
- Knowing what the agent is doing in real time (streaming progress)
- Controlling which tools the agent can invoke and executing them on its behalf (tool use)
- Gating risky operations on human confirmation before they proceed (approvals)
- Previewing proposed code changes before they are written to disk (diffs)
All four require the server side (the running agent) to send requests to the client side (the embedding application). That inverts the conventional client-server relationship, and it is why a simple REST or SSE design breaks down. You need a channel where either party can initiate a request and await a response.
JSON-RPC 2.0: the protocol that already solved this
JSON-RPC 2.0 is a minimal specification for remote procedure calls over JSON. It defines three message types:
- Request: carries an `id`, expects a corresponding response
- Response: carries a `result` or `error`, keyed to a request `id`
- Notification: no `id`, no response expected, fire-and-forget
The specification is transport-agnostic and direction-agnostic. Both sides can send any message type at any time. Run it over a persistent socket and you have a full peer-to-peer RPC channel with correlation, error semantics, and async notifications built in.
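As a concrete sketch, a receiving peer can classify any incoming message from those rules alone, by inspecting which of `method` and `id` are present (minimal Python, not tied to any particular transport):

```python
import json

def classify(raw: str) -> str:
    """Classify a JSON-RPC 2.0 message as request, notification, or response."""
    msg = json.loads(raw)
    if "method" in msg:
        # A method call with an id expects a reply; without one it is
        # fire-and-forget.
        return "request" if "id" in msg else "notification"
    if "result" in msg or "error" in msg:
        # No method: this is a reply, correlated to a request by its id.
        return "response"
    raise ValueError("not a valid JSON-RPC 2.0 message")

classify('{"jsonrpc": "2.0", "id": 5, "method": "ping"}')           # "request"
classify('{"jsonrpc": "2.0", "method": "progress", "params": {}}')  # "notification"
classify('{"jsonrpc": "2.0", "id": 5, "result": {"ok": true}}')     # "response"
```

Because classification depends only on message shape, the same dispatch logic serves both directions of the channel.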
This is not new. The Language Server Protocol has used bidirectional JSON-RPC since 2016. When you hover over a symbol in VS Code and the language server responds with documentation, that is a client-to-server request. When the server decides to push a diagnostic annotation without being asked, that is a server-to-client notification. LSP mixes both directions constantly, over the same stdio pipe or socket.
The Debug Adapter Protocol follows the same design. The IDE sends a “step over” request to the debugger; the debugger sends a “stopped” event back when execution halts at a breakpoint. Neither side is purely a client.
The Codex App Server applies this pattern to agent embedding. The choice reads less like an invention and more like a recognition that the problem is structurally identical to what LSP already handles.
The four pillars in protocol terms
Streaming progress
An agent working on a non-trivial task can run for minutes. Streaming progress notifications give the embedding application a live feed of agent activity, so UIs can show meaningful state rather than a spinner. In JSON-RPC terms, these are notifications: no id, no expected response.
```json
{
  "jsonrpc": "2.0",
  "method": "codex/progress",
  "params": {
    "taskId": "t_abc123",
    "message": "Running test suite...",
    "step": 3,
    "totalSteps": 7
  }
}
```
The embedding application listens and updates its UI. The agent continues without waiting for acknowledgment.
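In the embedding application, handling such a notification might look like this minimal sketch (field names follow the example above; the real payload schema may differ):

```python
import json

def on_progress(params: dict, ui_state: dict) -> None:
    """Fold a codex/progress notification into UI state; nothing is sent back."""
    ui_state["status"] = params["message"]
    ui_state["fraction"] = params["step"] / params["totalSteps"]

raw = ('{"jsonrpc": "2.0", "method": "codex/progress", "params": '
       '{"taskId": "t_abc123", "message": "Running test suite...", '
       '"step": 3, "totalSteps": 7}}')
msg = json.loads(raw)
ui_state: dict = {}
if "id" not in msg:  # a notification: no id, so no reply is owed
    on_progress(msg["params"], ui_state)
```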
Tool use
When the agent needs to run a shell command, read a file, or call an external service, it sends a request to the embedding application and blocks until it receives a result. The embedding application holds OS access; the agent running inside the App Server is sandboxed.
```jsonc
// Agent to App: request tool execution
{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "codex/tool",
  "params": {
    "taskId": "t_abc123",
    "tool": "shell",
    "command": "npm test",
    "cwd": "/project"
  }
}

// App to Agent: tool result
{
  "jsonrpc": "2.0",
  "id": 5,
  "result": {
    "stdout": "42 passed, 0 failed",
    "exitCode": 0
  }
}
```
This inversion is what makes the bidirectional channel necessary. The agent is the server in one sense, processing the overall task, but it becomes the requester when it needs a tool executed. A unidirectional protocol cannot express this.
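On the application side, servicing such a request reduces to: run the command, then reply with the same `id` (a sketch using the field names from the example above; the actual schema may differ):

```python
import subprocess

def handle_tool_request(msg: dict) -> dict:
    """Execute a shell tool request from the agent and build the correlated reply."""
    params = msg["params"]
    proc = subprocess.run(
        params["command"], shell=True, cwd=params["cwd"],
        capture_output=True, text=True,
    )
    # Echo back the request id so the blocked agent can match the result.
    return {
        "jsonrpc": "2.0",
        "id": msg["id"],
        "result": {"stdout": proc.stdout, "exitCode": proc.returncode},
    }

request = {
    "jsonrpc": "2.0", "id": 5, "method": "codex/tool",
    "params": {"taskId": "t_abc123", "tool": "shell",
               "command": "echo ok", "cwd": "."},
}
reply = handle_tool_request(request)
```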
Approvals
For operations that carry real risk, the agent surfaces an approval request before proceeding. Deleting files, overwriting configuration, making network calls: these warrant explicit confirmation.
```jsonc
// Agent to App: approval required
{
  "jsonrpc": "2.0",
  "id": 9,
  "method": "codex/approval",
  "params": {
    "taskId": "t_abc123",
    "operation": "deleteFile",
    "path": "src/legacy-auth.ts",
    "reason": "No longer referenced after refactor"
  }
}

// App to Agent: decision
{
  "jsonrpc": "2.0",
  "id": 9,
  "result": { "approved": true }
}
```
The agent blocks on the request. The embedding application surfaces a confirmation dialog; the user decides; the result flows back through the channel. This is a synchronous human-in-the-loop checkpoint embedded naturally into an otherwise asynchronous task flow.
Treating approval as a protocol primitive rather than an application-layer concern is the clearest design decision in the whole system. It means any embedding application gets human oversight for free, without having to intercept agent outputs and guess which ones are risky.
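On the application side, this is a blocking request handler like the tool case, except the "execution" is a human decision (a sketch; `ask_user` stands in for whatever confirmation dialog the host UI provides):

```python
def handle_approval(msg: dict, ask_user) -> dict:
    """Decide a codex/approval request and build the correlated reply."""
    params = msg["params"]
    prompt = f'{params["operation"]} {params["path"]}: {params["reason"]}'
    # ask_user blocks until the human answers; the agent is blocked meanwhile.
    approved = ask_user(prompt)
    return {"jsonrpc": "2.0", "id": msg["id"], "result": {"approved": approved}}

request = {
    "jsonrpc": "2.0", "id": 9, "method": "codex/approval",
    "params": {"taskId": "t_abc123", "operation": "deleteFile",
               "path": "src/legacy-auth.ts",
               "reason": "No longer referenced after refactor"},
}
# Stand-in for a real confirmation dialog: approve everything.
reply = handle_approval(request, ask_user=lambda prompt: True)
```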
Diffs
Before writing changes to disk, the agent sends structured diffs for review. Structured rather than raw unified diff text means the embedding application receives the semantic information it needs to render a proper code review UI: side-by-side views, syntax highlighting, hunk-level accept-or-reject controls. The protocol defines what changed; the UI decides how to present it.
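No structured diff schema appears in the material above, but the idea can be illustrated with a hypothetical payload: hunks as addressable data rather than unified-diff text, so each one can be accepted or rejected and applied independently (all field names here are invented for illustration):

```python
def apply_hunks(lines: list[str], hunks: list[dict]) -> list[str]:
    """Apply accepted hunks to a file's lines. startLine is 1-indexed."""
    out = list(lines)
    # Apply bottom-up so earlier hunks' line numbers stay valid.
    for h in sorted(hunks, key=lambda h: h["startLine"], reverse=True):
        i = h["startLine"] - 1
        out[i:i + len(h["removed"])] = h["added"]
    return out

# Hypothetical structured-diff payload for one file.
diff = {
    "path": "src/auth.ts",
    "hunks": [{
        "startLine": 2,
        "removed": ["const token = legacyAuth();"],
        "added": ["const token = await oauthFlow();"],
    }],
}
accepted = list(diff["hunks"])  # in a real UI, the user filters this list
patched = apply_hunks(
    ["import { oauthFlow } from './oauth';",
     "const token = legacyAuth();",
     "export { token };"],
    accepted,
)
```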
Why not WebSockets with a custom protocol, or gRPC
WebSockets provide a bidirectional transport, but nothing on top of it. You still need to define message formats, correlation IDs, error handling, and notification semantics. JSON-RPC 2.0 is a thin spec that handles all of that with library support in every major language.
gRPC offers bidirectional streaming with strong typing via Protocol Buffers, but comes with meaningful operational weight: schema compilation, binary encoding, HTTP/2 as a hard requirement, and more complex tooling. For a developer-facing embedding API, friction at integration time matters. JSON-RPC over a WebSocket or stdio pipe is something a developer can read in a text editor and implement in an afternoon.
The stdio option in particular is pragmatic. If the App Server runs as a subprocess, the parent application reads and writes JSON-RPC messages on stdin and stdout. This is exactly how LSP works between editors and language servers, and it means embedding requires no network infrastructure whatsoever. The agent process is just another child process.
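A sketch of that arrangement in Python, using newline-delimited JSON for framing (LSP itself frames messages with `Content-Length` headers; the framing and the echoing child used here are stand-ins, not the real App Server):

```python
import json
import subprocess
import sys

def send(proc: subprocess.Popen, msg: dict) -> None:
    """Write one newline-delimited JSON-RPC message to the child's stdin."""
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()

def recv(proc: subprocess.Popen) -> dict:
    """Read one newline-delimited JSON-RPC message from the child's stdout."""
    return json.loads(proc.stdout.readline())

# Stand-in child that echoes every line back; a real deployment would spawn
# the app-server binary here instead.
child = ("import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line)\n"
         "    sys.stdout.flush()\n")
proc = subprocess.Popen(
    [sys.executable, "-c", child],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
send(proc, {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}})
echoed = recv(proc)
proc.stdin.close()
proc.wait()
```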
The separation that matters
The App Server architecture has one implication worth holding onto: the protocol is the interface, not the implementation. If OpenAI changes the internal model or the agent’s reasoning approach, the JSON-RPC contract stays stable. Embedding applications code against message types and method names, not against model internals.
The approval and tool-use flows can also be extended without touching the agent. An embedding application can implement custom tool handlers for domain-specific operations, policy-based auto-approval for low-risk file writes, or rich diff UIs with inline comment support. The protocol defines the seams; both sides evolve independently behind them.
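One way to build those seams is a dispatch table the embedding application owns, mapping method names to handlers (a sketch; every method name other than `codex/approval` is invented for illustration):

```python
# Method-name -> handler table owned by the embedding application. The agent
# never sees this table; handlers can be swapped or added without touching it.
handlers = {}

def register(method: str):
    def wrap(fn):
        handlers[method] = fn
        return fn
    return wrap

@register("codex/approval")
def auto_approve_low_risk(params: dict) -> dict:
    # Hypothetical policy: file writes inside src/ skip the confirmation dialog.
    ok = params["operation"] == "writeFile" and params["path"].startswith("src/")
    return {"approved": ok}

@register("acme/deploy")  # domain-specific tool the base protocol never defined
def deploy(params: dict) -> dict:
    return {"status": "queued", "target": params["target"]}

decision = handlers["codex/approval"]({"operation": "writeFile",
                                       "path": "src/app.ts"})
```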
The Codex CLI, open-sourced in April 2025, took a simpler approach, driving the Responses API directly from a terminal where a human watches stdout. That is appropriate for a single-user command-line tool. The App Server trades that simplicity for the expressiveness required when the embedding application needs programmatic control over every step of execution: knowing when the agent reads a file, approving each shell command, previewing every diff before commit.
For anyone building a coding assistant into an IDE, a web application, or a CI pipeline, the bidirectional JSON-RPC model is the design worth understanding. It is a considered answer to what agent embedding actually requires, shaped by the same constraints that produced LSP a decade earlier.