
IRC as an Agent Bus: What a 678KB Zig Binary Gets Right

Source: hackernews

George Larson’s Nullclaw Doorman project is a two-agent system running on a $7/month VPS, where all inter-agent communication happens over IRC. The public-facing agent, nullclaw, is a 678KB Zig binary consuming roughly 1MB of RAM, connected to an Ergo IRC server with a gamja web client embedded in the site so visitors can drop into the same channel the agents use. The private agent, ironclaw, handles email and scheduling and is reachable only over Tailscale. The project surfaced on Hacker News with 318 points and a lively discussion about whether this is clever or anachronistic.

The instinct is to call it a novelty project, an experiment in using a decades-old protocol for something new. The architecture is more principled than that reading suggests.

Why IRC Works as an Agent Transport

IRC has three properties that make it a reasonable choice for agent-to-agent communication: it is a pub/sub system with built-in presence, it is human-observable by default, and it has almost no wire overhead.

Channels are topic-based message queues. An agent JOINs a channel and receives every PRIVMSG sent to it. That is the entire subscription model. No Kafka topic configuration, no Redis pub/sub client library, no separate message broker to operate. The semantics are simple enough to implement correctly in a small binary.
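To make the "entire subscription model" concrete, here is a minimal sketch of the receiving side in Python. It is not taken from the Nullclaw source: the names parse_irc_line and channel_messages are hypothetical, IRCv3 message tags are ignored, and channel names are compared with a plain lowercase match rather than the server's advertised casemapping.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class IrcMessage(NamedTuple):
    prefix: Optional[str]  # e.g. "nullclaw!user@host"; None when the line has no prefix
    command: str           # e.g. "PRIVMSG", "JOIN", "PING"
    params: list[str]      # middle parameters followed by the trailing parameter, if any


def parse_irc_line(line: str) -> IrcMessage:
    """Parse one IRC line (without IRCv3 message tags) into prefix, command, params."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        prefix, _, line = line[1:].partition(" ")
    # Everything after the first " :" is one trailing parameter that may contain spaces.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        raise ValueError("empty IRC line")
    params = parts[1:]
    if sep:
        params.append(trailing)
    return IrcMessage(prefix, parts[0].upper(), params)


def channel_messages(lines: Iterable[str], channel: str) -> Iterator[tuple[str, str]]:
    """Yield (sender_nick, text) for every PRIVMSG addressed to `channel`."""
    for line in lines:
        msg = parse_irc_line(line)
        if (
            msg.command == "PRIVMSG"
            and len(msg.params) == 2
            # Simplification: real servers advertise a casemapping; lower() is an approximation.
            and msg.params[0].lower() == channel.lower()
        ):
            nick = (msg.prefix or "").split("!", 1)[0]
            yield nick, msg.params[1]
```

Fed the lines read from a socket after a JOIN, channel_messages behaves like a subscriber on a topic: every message to the channel arrives, and nothing else needs to be configured.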

Presence comes for free at the protocol level. NICK, JOIN, PART, and QUIT events tell any connected client whether an agent is online. For orchestration, that matters: a coordinator can verify a worker agent is ready before dispatching tasks using the same connection and protocol primitives already used for messaging. You do not need a separate health-check endpoint.
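A coordinator could track readiness from those events with very little state. The sketch below is an assumption about how one might do it, not a description of the project's code. PresenceTracker is a hypothetical name, events are assumed to be already parsed into (nick, command, params), and seeding the member list from the server's 353 (RPL_NAMREPLY) replies on join is left out for brevity.

```python
class PresenceTracker:
    """Tracks which nicks are present in which channels from JOIN/PART/QUIT/NICK events."""

    def __init__(self) -> None:
        self.channels: dict[str, set[str]] = {}

    def handle(self, nick: str, command: str, params: list[str]) -> None:
        if command == "JOIN":
            self.channels.setdefault(params[0], set()).add(nick)
        elif command == "PART":
            self.channels.get(params[0], set()).discard(nick)
        elif command == "QUIT":
            # QUIT carries no channel: the nick leaves every channel at once.
            for members in self.channels.values():
                members.discard(nick)
        elif command == "NICK":
            # A rename keeps channel membership but changes the identity.
            new_nick = params[0]
            for members in self.channels.values():
                if nick in members:
                    members.discard(nick)
                    members.add(new_nick)

    def is_online(self, nick: str, channel: str) -> bool:
        return nick in self.channels.get(channel, set())
```

Before dispatching a task, the coordinator would check is_online("ironclaw", "#agents") (both names illustrative) and hold or reroute the work if it returns False.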

The human-observability aspect is underrated in agent infrastructure. When something goes wrong with a Discord bot or an HTTP-based multi-agent system, you look at logs, correlate request IDs, and reconstruct what happened. When something goes wrong in a system like this, you open an IRC client and read the channel. The IRCv3 extensions that Ergo supports, specifically chathistory and labeled-response, let you replay missed messages and correlate sent messages with their server-side delivery confirmations. The entire communication bus is introspectable with any standard client.

The Zig Binary as a Gateway

The 678KB binary and 1MB RSS figures deserve attention on their own terms. This is not a stripped-down Node.js service or a Go binary with the standard library embedded. Zig produces genuinely small binaries because it has no runtime and no garbage collector, and it adds little overhead as long as you avoid pulling in large standard library components.

For a gateway process that primarily does IRC protocol parsing, message routing, and HTTP calls to an inference API, that footprint makes sense. The process spends most of its time blocked on network I/O. It does not need a garbage collector running between requests, and it does not need Go’s runtime and goroutine scheduler, which typically add a few megabytes of binary size and resident memory even at idle.

On a $7/month VPS, you typically have 512MB to 1GB of RAM shared across all running processes. A 1MB RSS process is negligible. A Node.js or Python service doing equivalent routing work commonly consumes tens of megabytes or more once the runtime and its dependencies are loaded, before the application does any real work. That gap matters when you are fitting multiple services on constrained hardware.

Zig also gives you predictable memory layout and explicit allocation control, which is useful for a long-running gateway process where you want to avoid latency spikes from GC pauses or fragmentation.

A2A as the Protocol Between Agents

Google’s Agent2Agent (A2A) protocol, announced in April 2025, defines a JSON-RPC 2.0 envelope for agent task communication with a structured lifecycle: submitted, working, input-required, completed, failed, canceled. Each agent publishes an Agent Card at /.well-known/agent.json describing its capabilities and accepted input/output modalities. Client agents discover remote agents via those cards and issue task requests against them.

The standard A2A transport is HTTPS. The Nullclaw system routes A2A messages over IRC instead, using PRIVMSG as the carrier for the JSON-RPC payload. The agent’s IRC nick becomes its identity. The IRC server handles delivery, channel membership, and presence; A2A handles task state and content structure. Those are properly separated concerns.

A representative A2A task request looks like this:

{
  "jsonrpc": "2.0",
  "id": "req-001",
  "method": "tasks/send",
  "params": {
    "id": "task-42",
    "message": {
      "role": "user",
      "parts": [{"type": "text", "text": "Summarize my email from this morning"}]
    }
  }
}

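That payload travels as the body of a PRIVMSG from the public agent’s nick to the private agent’s nick. The private agent replies with a JSON-RPC response that carries the same id and the task’s output as its result. The IRCv3 labeled-response extension lets the sender match the server’s reply to the specific PRIVMSG it sent, which confirms that the server accepted the message. That is an acknowledgment base IRC lacks, though it is not proof that the recipient processed the message; the A2A response itself serves that role.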
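Wrapping a JSON-RPC request in a PRIVMSG might look like the sketch below. The function name a2a_privmsg is hypothetical. The size check reflects the classic IRC limit of 512 bytes per line including the trailing CRLF; how Nullclaw handles payloads that exceed it is not described in the source, so the sketch simply refuses them.

```python
import json

# Classic IRC line limit in bytes, including the trailing CRLF (IRCv3 tags excluded).
MAX_IRC_LINE = 512


def a2a_privmsg(target_nick: str, request: dict) -> bytes:
    """Encode an A2A JSON-RPC request as a single PRIVMSG line for `target_nick`."""
    # Compact separators keep the payload small. json.dumps escapes control
    # characters, so the payload cannot contain a raw CR or LF that would
    # split the IRC line.
    payload = json.dumps(request, separators=(",", ":"), ensure_ascii=False)
    line = f"PRIVMSG {target_nick} :{payload}\r\n".encode("utf-8")
    # The server prepends ":nick!user@host " when relaying, so the practical
    # budget is somewhat lower than this check enforces.
    if len(line) > MAX_IRC_LINE:
        raise ValueError(f"A2A payload too large for one IRC line: {len(line)} bytes")
    return line
```

The task request shown above serializes to well under 512 bytes, so it fits in one line. Longer messages would need chunking or a server-specific extension, neither of which is covered here.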

One detail from the architecture: ironclaw borrows nullclaw’s inference pipeline via A2A passthrough. There is one Anthropic API key and one billing relationship, regardless of which agent initiated the request. This avoids credential distribution across trust boundaries while letting the private-side agent access the same inference capabilities.

Tiered Inference and the Hard Cost Cap

The inference setup uses Haiku 4.5 for conversation and Sonnet 4.6 for tool use, with a hard cap at $2/day.

Haiku 4.5 handles low-latency conversational responses at approximately $1 per million input tokens. Sonnet 4.6 runs around $3 per million input tokens and handles function calling and structured reasoning more reliably. For an IRC-connected agent where the majority of messages are conversational, Haiku absorbs the high-volume cheap work. Sonnet activates only when a tool call is genuinely required.

This tiering is straightforward to implement because the Anthropic API accepts a model parameter per request. No special routing infrastructure is needed, just conditional logic: if the request requires tool use, pass claude-sonnet-4-6; otherwise pass claude-haiku-4-5-20251001. Because Haiku’s input price is roughly a third of Sonnet’s, this single routing decision can cut inference spend by more than half on a workload that is mostly conversational, compared with sending everything to Sonnet.
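The conditional logic is small enough to show in full. In this sketch the model identifiers come from the text, while pick_model, build_request, blended_price, and the needs_tools flag are hypothetical; how the real gateway decides that a message needs tools is not described. The request body uses the model, max_tokens, and messages fields of the Anthropic Messages API.

```python
HAIKU = "claude-haiku-4-5-20251001"
SONNET = "claude-sonnet-4-6"


def pick_model(needs_tools: bool) -> str:
    """Route tool-using requests to Sonnet and everything else to Haiku."""
    return SONNET if needs_tools else HAIKU


def build_request(user_text: str, needs_tools: bool, max_tokens: int = 512) -> dict:
    """Build a Messages API request body; only the model differs between tiers."""
    return {
        "model": pick_model(needs_tools),
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": user_text}],
    }


def blended_price(conversational_share: float, cheap: float, expensive: float) -> float:
    """Average price per token when `conversational_share` of traffic uses the cheap model."""
    return conversational_share * cheap + (1.0 - conversational_share) * expensive
```

As a worked example, if 90% of traffic is conversational and the price ratio is 1:3, blended_price(0.9, 1.0, 3.0) comes to 1.2, or 40% of the all-Sonnet price. That counts input tokens only and assumes both tiers see similar message sizes.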

The $2/day hard cap is not just frugality. It bounds the blast radius if the agent enters an unexpected loop, gets hit with adversarial input that triggers repeated tool calls, or encounters a bug in the routing logic. Without a cap, a misbehaving agent can generate a significant invoice before anyone notices. A hard limit at the API key level means the worst case is roughly $60/month ($2 × 30 days) in unexpected spend, not an unbounded surprise.
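The source does not say how the cap is enforced beyond the limit itself. As an illustration only, a gateway could also keep an in-process guard like the one below, which refuses any call that would push the day's spend past the cap. DailyBudget and its injectable today clock are hypothetical, and the cost estimate for each call is assumed to come from elsewhere.

```python
from datetime import date
from typing import Callable


class DailyBudget:
    """Refuses spending once the current day's total would exceed the cap."""

    def __init__(self, cap_usd: float = 2.00, today: Callable[[], date] = date.today) -> None:
        self.cap_usd = cap_usd
        self._today = today      # injectable clock, so the rollover can be tested
        self._day = today()
        self.spent_usd = 0.0

    def try_spend(self, cost_usd: float) -> bool:
        """Record the cost and return True, or return False if it would exceed the cap."""
        current = self._today()
        if current != self._day:
            # A new day has started: reset the running total.
            self._day, self.spent_usd = current, 0.0
        if self.spent_usd + cost_usd > self.cap_usd:
            return False
        self.spent_usd += cost_usd
        return True
```

The gateway would call try_spend with an estimated cost before each inference request and return a canned "budget exhausted" reply when it gets False. A runaway loop then stops at $2 rather than running until someone notices.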

Tailscale as the Trust Boundary

The private agent is reachable only over Tailscale, which means it has no public IP exposure. Its A2A endpoint is a stable Tailscale hostname accessible only to nodes on the same tailnet. Compromising the public agent does not automatically grant control of the private one: an attacker could reach only what the tailnet’s ACLs allow that node to reach, and would need separate Tailscale credentials for anything beyond that.

For agents handling sensitive operations such as email, calendar access, and personal scheduling data, this boundary is a meaningful security property. The Tailscale ACL policy file defines exactly which nodes can reach which services. The private agent’s A2A endpoint does not rely on an OS-level firewall or a VPN concentrator for protection; it simply has no route from the public internet at all.

Tailscale’s MagicDNS gives each node a stable hostname (ironclaw.tailnet-name.ts.net), which means the public agent can discover the private agent by a known name without maintaining a service registry or a configuration file that maps addresses.

The Comparison to Platform-Native Bot Development

Building Discord bots, you accumulate a specific kind of complexity: gateway shard management, rate limit header parsing, interactions that must be acknowledged within three seconds, webhook signature verification, embed character limits that differ from message character limits. The Discord platform is designed for human users, with bots accommodated as a secondary use case, and the API surface reflects that priority.

An Ergo IRC server has none of that ceremony. You connect over TCP, authenticate with SASL, JOIN a channel, and PRIVMSG. The protocol surface is small enough to implement fully in a few hundred lines of code, which is part of why a 678KB binary can handle it competently.

The tradeoff is that you build more infrastructure yourself. There is no built-in concept of slash commands, no interaction framework, no attachment handling pipeline. What you gain is a transport layer you can fully understand and debug with standard tooling, running on hardware you control, with no platform terms of service governing what your agents can or cannot do.

Ergo specifically earns its place in this stack. It ships with an embedded BuntDB datastore, so there is no external database dependency. It handles TLS natively and works with Let’s Encrypt certificates. Its WebSocket support means gamja can connect directly to the IRC server from a browser without a separate WebSocket-to-IRC proxy. The combination gives you a production-quality IRC server that operates as a single binary on the same VPS as everything else.

The result is a system where two AI agents coordinate via a protocol that is nearly 40 years old, run on a VPS that costs less per month than a Netflix subscription, and are fully observable to anyone with an IRC client. The constraint is the point: working within that budget forces architectural decisions that turn out to be correct for reasons beyond cost.
