Two Agents, One IRC Server, and What the Stack Reveals About Transport Design

George Larson’s nullclaw/ironclaw project landed on Hacker News this week and collected over 300 points, mostly from people who reacted to the IRC angle as though it were a novelty. It isn’t. The interesting part isn’t that someone used IRC in 2026; it’s that IRC turns out to be a well-suited transport for a persistent conversational agent, for reasons that are easy to see once you look at the actual constraints.

The Stack, Concretely

Two agents, two separate VPS boxes. The public one, nullclaw, is a 678 KB Zig binary consuming roughly 1 MB of RAM. It connects to an Ergo IRC server and presents itself to visitors through a gamja web client embedded directly in the site. You can also connect with any standard IRC client to irc.georgelarson.me:6697 over TLS and land in #lobby.

The private one, ironclaw, handles email and scheduling. It’s only reachable over Tailscale, and the two agents talk to each other using Google’s Agent2Agent (A2A) protocol. Crucially, ironclaw doesn’t have its own Anthropic API key; it borrows nullclaw’s inference pipeline via A2A passthrough, so there’s one billing relationship regardless of which agent is doing the work.

Inference is tiered: Claude Haiku 4.5 handles conversational turns (sub-second, cheap), while Sonnet 4.6 only activates for tool use. There’s a hard cap at $2/day.

Why IRC Works Here

I build Discord bots, so I spend a lot of time thinking about transport. Discord’s gateway is a WebSocket connection with a heartbeat requirement, a session ID, reconnect sequences, and rate limits at both the connection level and per route. The client library handles most of this, but the complexity is real. When something breaks, you’re debugging a stateful session against an opaque cloud API.

IRC is a line-delimited text protocol over TCP, specified in RFC 1459 in 1993 and incrementally extended by the IRCv3 working group since. A minimal client fits comfortably in a few hundred lines of code. A persistent agent connecting to IRC gets several things for free:

Multi-user presence in a channel without any additional state management
A natural queue: messages arrive in order, one per line
Reconnection semantics that are simple enough to implement in a tight loop
No rate-limit buckets to track at the channel-message level for normal use
The ability for multiple clients to share the same session if the server supports it (Ergo does, via its built-in bouncer functionality)

For a conversational agent that needs to be always-on and stateless between turns, this is a good fit. The agent doesn’t need to maintain a session object between messages; it just reads lines, processes them, and writes lines back. The broker (the IRC server) handles everything else.
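To make "reads lines, processes them, and writes lines back" concrete, here is a minimal Python sketch of that loop. It is not nullclaw's code (which is Zig and which I haven't read); the names parse_line, handle, run, and reply_fn are mine, and reply_fn stands in for whatever calls the model.

```python
import socket
import ssl
from typing import Callable, NamedTuple, Optional


class IrcMessage(NamedTuple):
    prefix: Optional[str]   # e.g. "nick!user@host", or None
    command: str            # e.g. "PRIVMSG", "PING", "001"
    params: list            # middle params plus the trailing param, if any


def parse_line(line: str) -> IrcMessage:
    """Parse one IRC line; IRCv3 message tags are skipped, not interpreted."""
    line = line.rstrip("\r\n")
    if line.startswith("@"):                    # drop the tags section
        _, _, line = line.partition(" ")
    prefix = None
    if line.startswith(":"):
        prefix, _, line = line[1:].partition(" ")
    head, sep, trailing = line.partition(" :")  # trailing param may contain spaces
    parts = head.split()
    if sep:
        parts.append(trailing)
    command = parts[0].upper() if parts else ""
    return IrcMessage(prefix, command, parts[1:])


def handle(msg: IrcMessage, nick: str, reply_fn: Callable[[str], str]) -> list:
    """Return the raw lines to send in response to one incoming message."""
    if msg.command == "PING" and msg.params:
        return ["PONG :" + msg.params[-1]]
    if msg.command == "PRIVMSG" and len(msg.params) == 2:
        target, text = msg.params
        sender = (msg.prefix or "").split("!", 1)[0]
        if sender and sender != nick:           # never answer ourselves
            # Reply in the channel, or directly to the sender for a private message.
            dest = target if target.startswith("#") else sender
            # Collapse newlines so a reply cannot smuggle in extra IRC commands.
            reply = " ".join(reply_fn(text).splitlines())
            return [f"PRIVMSG {dest} :{reply}"]
    return []


def run(host: str, port: int, nick: str, channel: str,
        reply_fn: Callable[[str], str]) -> None:
    """Connect over TLS, register, join one channel, answer until disconnected."""
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port)) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as sock:
            def send(line: str) -> None:
                sock.sendall((line + "\r\n").encode("utf-8"))

            send(f"NICK {nick}")
            send(f"USER {nick} 0 * :{nick}")
            buf = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    return                      # server closed the connection
                buf += chunk
                while b"\r\n" in buf:
                    raw_line, buf = buf.split(b"\r\n", 1)
                    if not raw_line.strip():
                        continue
                    msg = parse_line(raw_line.decode("utf-8", errors="replace"))
                    if msg.command == "001":    # welcome numeric: registration done
                        send(f"JOIN {channel}")
                    for out in handle(msg, nick, reply_fn):
                        send(out)
```

Reconnection is then just calling run() again inside a loop with a sleep between attempts. A real bot would also split replies to stay under the protocol's line-length limit, which this sketch ignores. Try it against a server you control.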

Compare this to building the same thing on an HTTP webhook model. You’d need a publicly routable endpoint, TLS termination, a framework to parse request bodies, and some persistent state to track conversation context across requests. You’d also lose the multi-user channel model unless you built it yourself. IRC gives you the channel abstraction without paying for it.

Zig for the Gateway Binary

A 678 KB binary using 1 MB of RAM is not an accident. Zig achieves this through a combination of design choices: no runtime, no hidden allocations, comptime-driven code generation that strips unused paths, and a standard library small enough to reason about completely. The language was built to produce lean, predictable binaries for exactly this kind of systems work.

For a long-running agent process on a $7/month VPS, this matters. A Node.js IRC bot with an Anthropic SDK dependency would consume somewhere between 80 and 150 MB of RSS just to start. A Go binary would be smaller, maybe 20-30 MB including the runtime. Zig eliminates the runtime entirely. At 1 MB, nullclaw leaves the rest of the box free for the IRC server and anything else running on the machine.

The tradeoff is that Zig’s toolchain and language are still maturing. Async support dropped out of the compiler around the 0.11 release and has been under redesign since, and the language doesn’t yet have a stable ABI. For a personal project with one contributor, this is manageable. For a team, you’d want to weigh it more carefully.

The A2A Pattern for Private Agents

The more architecturally interesting part of this project is how the private agent, ironclaw, connects to the public one. Google’s A2A protocol, announced at Cloud Next 2025, is a JSON-based specification for agent interoperability over HTTP. An agent advertises its capabilities via an Agent Card served from /.well-known/agent.json, and other agents can discover and invoke it using a standardized task-and-response model.
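For a sense of what an Agent Card carries, here is a rough Python sketch that builds one as a dict and looks up a skill in it. The field names follow my reading of the published A2A spec. Every value is invented for illustration, including the tailnet URL, the skill IDs, and the helper names build_agent_card and find_skill. I don't know what ironclaw's real card contains.

```python
import json
from typing import Optional


def build_agent_card(base_url: str) -> dict:
    """Return an illustrative Agent Card; all values here are made up."""
    return {
        "name": "ironclaw",
        "description": "Private agent for email and scheduling (illustrative).",
        "url": base_url,
        "version": "0.1.0",
        "capabilities": {"streaming": False},
        "defaultInputModes": ["text/plain"],
        "defaultOutputModes": ["text/plain"],
        "skills": [
            {
                "id": "schedule-meeting",       # hypothetical skill ID
                "name": "Schedule a meeting",
                "description": "Find a free slot and create a calendar entry.",
                "tags": ["calendar"],
            },
            {
                "id": "draft-email",            # hypothetical skill ID
                "name": "Draft an email",
                "description": "Write a reply for the owner to approve.",
                "tags": ["email"],
            },
        ],
    }


def find_skill(card: dict, skill_id: str) -> Optional[dict]:
    """Return the skill entry with the given ID, or None if the card lacks it."""
    for skill in card.get("skills", []):
        if skill.get("id") == skill_id:
            return skill
    return None


if __name__ == "__main__":
    # The hostname is a placeholder for a Tailscale-only address.
    card = build_agent_card("http://ironclaw.example-tailnet.ts.net:8080/")
    print(json.dumps(card, indent=2))
```

A calling agent would fetch this document from the well-known path, check that the skill it needs is listed, and only then send a task.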

The clever part here is using Tailscale as the auth layer. Tailscale creates a WireGuard mesh network where every device has a stable identity derived from your identity provider. When ironclaw is only reachable via Tailscale, you get mutual authentication and encryption for free without managing certificates or API keys between agents. A2A then sits on top of this as the application-layer protocol.

The billing passthrough is worth noting separately. Rather than giving ironclaw its own Anthropic API key, nullclaw proxies inference requests on its behalf. This means one key, one usage dashboard, one rate limit bucket to monitor. In a multi-agent system, credential sprawl is a real operational problem. Centralizing API access in the public gateway and letting private agents borrow that pipeline is a pattern worth copying.
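The shape of that passthrough can be sketched in a few lines of Python. This is my guess at the pattern, not the project's implementation: InferenceGateway, complete, and the backend callable are all hypothetical names, and the backend is a stub in place of a real provider call.

```python
from collections import Counter
from typing import Callable


class InferenceGateway:
    """Holds the only API key and forwards prompts on behalf of other agents."""

    def __init__(self, api_key: str, backend: Callable[[str, str], str]):
        self._api_key = api_key            # never handed to calling agents
        self._backend = backend            # (api_key, prompt) -> completion text
        self.requests_by_agent = Counter() # one place to see who used what

    def complete(self, agent_id: str, prompt: str) -> str:
        """Run one inference request for agent_id using the gateway's key."""
        self.requests_by_agent[agent_id] += 1
        return self._backend(self._api_key, prompt)


def stub_backend(api_key: str, prompt: str) -> str:
    """Stand-in for a real provider call; echoes the prompt."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    gateway = InferenceGateway("sk-placeholder", stub_backend)
    print(gateway.complete("ironclaw", "summarize today's inbox"))
    print(dict(gateway.requests_by_agent))
```

The private agent only ever sees complete(); the key stays in one process, and per-agent usage falls out of the counter for free.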

Tiered Inference as a Cost Model

The Haiku-for-conversation, Sonnet-for-tools split is something I’ve been thinking about for my own bot work. The intuition is correct: most conversational turns don’t require the full capability of a frontier model. Acknowledging a message, asking a clarifying question, or giving a short factual answer are tasks that Haiku handles well at a fraction of the cost and latency.

Tool use is different. When an agent needs to decide whether to call a function, compose multiple tool calls, or interpret a structured result, you want a model that reasons reliably about schemas and sequences. Sonnet earns its higher cost there.
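The routing decision itself can be tiny. Here is a hedged Python sketch. I don't know how nullclaw decides that a turn needs tools, so the keyword check below is a deliberately naive placeholder, and the tier labels are stand-ins rather than real model identifiers.

```python
from typing import Callable

FAST_TIER = "fast"     # stands in for the Haiku-class conversational model
TOOL_TIER = "tools"    # stands in for the Sonnet-class tool-use model

# Hypothetical trigger words; a real system would use something less brittle.
TOOL_HINTS = ("schedule", "email", "remind", "calendar")


def looks_like_tool_request(text: str) -> bool:
    """Naive placeholder: treat a turn as tool use if it mentions a hint word."""
    lowered = text.lower()
    return any(hint in lowered for hint in TOOL_HINTS)


def pick_tier(text: str,
              needs_tools: Callable[[str], bool] = looks_like_tool_request) -> str:
    """Send tool-shaped turns to the expensive tier, everything else to the cheap one."""
    return TOOL_TIER if needs_tools(text) else FAST_TIER


if __name__ == "__main__":
    for turn in ("thanks, that helps", "Can you schedule a call for Friday?"):
        print(turn, "->", pick_tier(turn))
```

Passing the predicate in as an argument keeps the policy swappable: the same router works whether the signal is a keyword match, a classifier, or the cheap model asking for escalation.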

The $2/day hard cap is a sensible guardrail for a personal project. Without it, a single conversation with a verbose user or a runaway tool loop could produce an unexpectedly large bill. Most provider SDKs don’t expose spend-based circuit breakers natively; you’d implement this by counting input and output tokens per request, converting them to dollars at each model’s per-token price, accumulating a daily total, and refusing new inference requests once you cross the threshold.
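A spend cap like this is a few dozen lines. The Python sketch below is one way to do it, assuming the provider reports token counts per response. DailySpendCap and its methods are names I made up, and the prices are parameters because I'm not going to hard-code anyone's rate card.

```python
import datetime as dt
from typing import Callable


class DailySpendCap:
    """Accumulates estimated spend per calendar day and trips at a fixed cap."""

    def __init__(self, cap_usd: float,
                 today: Callable[[], dt.date] = dt.date.today):
        self.cap_usd = cap_usd
        self._today = today            # injectable clock, handy for testing
        self._day = today()
        self.spent_usd = 0.0

    def _roll_over(self) -> None:
        """Reset the running total when the date changes."""
        now = self._today()
        if now != self._day:
            self._day = now
            self.spent_usd = 0.0

    def allow(self) -> bool:
        """True while today's spend is still under the cap."""
        self._roll_over()
        return self.spent_usd < self.cap_usd

    def record(self, input_tokens: int, output_tokens: int,
               usd_per_mtok_in: float, usd_per_mtok_out: float) -> float:
        """Add one request's cost (prices are per million tokens); return the cost."""
        self._roll_over()
        cost = (input_tokens * usd_per_mtok_in
                + output_tokens * usd_per_mtok_out) / 1_000_000
        self.spent_usd += cost
        return cost


if __name__ == "__main__":
    cap = DailySpendCap(2.00)
    if cap.allow():
        # Token counts and prices here are invented for the example.
        cap.record(input_tokens=1_200, output_tokens=300,
                   usd_per_mtok_in=1.0, usd_per_mtok_out=5.0)
    print(f"spent today: ${cap.spent_usd:.4f}")
```

Note that the check happens before the request and the accounting after it, so one request can overshoot the cap slightly. For a $2/day personal budget that seems like an acceptable trade against the complexity of pre-estimating output length.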

What This Project Actually Demonstrates

The thing that makes nullclaw/ironclaw notable isn’t the use of IRC or even the Zig binary size. It’s that someone made deliberate choices at every layer of the stack and those choices compound. IRC as transport eliminates a class of complexity. Zig as the implementation language eliminates a class of resource overhead. Tailscale as the network perimeter eliminates a class of credential management. A2A as the inter-agent protocol gives the private agent a standard interface without custom RPC code. Tiered inference eliminates a class of unnecessary cost.

None of these are individually surprising. Together, they produce a system that runs two agents for $7/month in infrastructure plus a small variable API spend, with a codebase that’s small enough for one person to hold in their head.

That’s the actual lesson. Not that IRC is back, but that the right small decision at each layer is worth more than any single framework choice.