The $500k Interpreter: What Running JSONata at Scale Actually Costs

Reco.ai published a post this week about rewriting their JSONata-based processing pipeline using AI assistance in a single day, claiming $500k per year in cloud savings. The headline is provocative enough to generate 229 comments on Hacker News, and the usual HN skepticism is warranted. But underneath the AI hype and the large cost number, there is a real and interesting technical story about what it costs to run a dynamic expression language in a production hot path.

What JSONata Is, and Why People Reach for It

JSONata is a query and transformation language for JSON, originally developed at IBM. It solves a genuine problem: sometimes you need to let an operator or configuration author write expressions that reshape JSON documents at runtime, without deploying new code. The expressions look like Account.Order.Product.(Price * Quantity) or $sum(items[category = "books"].price). It is essentially what XPath and XSLT did for XML, brought into the JSON era.

The language is used in IBM App Connect, Node-RED, and a range of iPaaS and low-code platforms. The npm package consistently pulls around 500k-1M weekly downloads. For integration platforms where non-engineers write transformation rules, it is a reasonable tool.

The npm package jsonata exposes a two-step API: parse the expression once, then evaluate it against data repeatedly:

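const jsonata = require('jsonata');
// Parse once, then reuse the parsed expression for every document.
const expression = jsonata('Account.Order.Product.(Price * Quantity)');
// evaluate() returns a Promise in jsonata 2.x, so this line must run inside an async function.
const result = await expression.evaluate(data);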

The parsing step is relatively cheap. The evaluation step is where the cost hides.

The Interpreter Overhead

JSONata is implemented as a pure JavaScript AST interpreter. When you call evaluate(), it walks the parsed AST node by node against the input document. There is no compilation to bytecode, no JIT-friendly hot path, no native code generation. Every evaluation is a fresh interpreter run through the tree.
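
To make the per-evaluation cost concrete, here is a minimal sketch of a tree-walking evaluator for a toy expression language. It is not JSONata’s actual implementation: the node shapes (path, number, binary) and the function name interpret are invented for illustration. The point is only that the dispatch and the property lookups run again for every document.

```javascript
// Toy AST interpreter: a sketch, not JSONata's real evaluator.
// The node shapes ("number", "path", "binary") are hypothetical.
function interpret(node, input) {
  switch (node.type) {
    case 'number':
      return node.value;
    case 'path':
      // Walk one property at a time; every step is a dynamic lookup.
      return node.steps.reduce(
        (value, key) => (value == null ? undefined : value[key]),
        input
      );
    case 'binary': {
      const left = interpret(node.left, input);
      const right = interpret(node.right, input);
      switch (node.op) {
        case '*': return left * right;
        case '+': return left + right;
        case '>': return left > right;
        default: throw new Error(`Unsupported operator: ${node.op}`);
      }
    }
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// Hand-built AST for the toy expression `Price * Quantity`.
const priceTimesQuantity = {
  type: 'binary',
  op: '*',
  left: { type: 'path', steps: ['Price'] },
  right: { type: 'path', steps: ['Quantity'] },
};

// The same switch dispatch and lookups repeat on every call.
const lineTotal = interpret(priceTimesQuantity, { Price: 4, Quantity: 3 });
console.log(lineTotal); // 12
```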

This is fine when you evaluate an expression occasionally. It becomes expensive when you evaluate thousands of times per second in a real-time data pipeline.

Consider what happens in a security or SaaS observability product like Reco.ai. Their system presumably ingests a continuous stream of events, applies JSONata expressions to filter, classify, or transform each one, and makes decisions based on the results. If they are processing, say, 10,000 events per second and each JSONata evaluation takes 1-5ms on a moderately complex expression, that is 10 to 50 CPU-seconds of interpreter work per wall-clock second. You need a lot of parallel compute to keep up.
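
The arithmetic behind that estimate is short enough to check directly. The event rate and per-evaluation latencies below are the illustrative figures from the paragraph above, not measurements from Reco.ai, and the function name is made up for this sketch.

```javascript
// Back-of-envelope load estimate. The inputs are assumptions taken
// from the illustrative figures in the text, not measured values.
function cpuSecondsPerSecond(eventsPerSecond, msPerEvaluation) {
  return (eventsPerSecond * msPerEvaluation) / 1000;
}

const eventsPerSecond = 10000;
const low = cpuSecondsPerSecond(eventsPerSecond, 1);
const high = cpuSecondsPerSecond(eventsPerSecond, 5);

// One fully busy core supplies one CPU-second per wall-clock second,
// so each figure is also a lower bound on the number of busy cores.
console.log(`${low} to ${high} CPU-seconds per second`); // 10 to 50
```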

The GC pressure compounds this. JSONata builds intermediate objects during evaluation: array results, object constructions, function call frames. In a high-throughput Node.js process, this generates a steady stream of short-lived allocations that V8’s garbage collector has to collect. The GC time adds latency jitter and reduces effective throughput further.

For comparison, equivalent transformations written in native JavaScript benefit from V8’s full optimization pipeline: JIT compilation, inline caching, hidden class optimization for object shapes. The same logical operation can be 10x to 100x faster when expressed as plain JS code that V8 can specialize for the data it sees, versus a generic AST interpreter whose dispatch and property lookups look different for every expression it runs.

Getting to $500k

The math on $500k per year is roughly $1,370 per day, or about $57 per hour in compute costs attributable to JSONata evaluation. On typical cloud instance pricing (say $0.10-0.30 per hour for a compute-optimized instance), that pays for somewhere between 190 and 570 instances running around the clock, or roughly 4,600 to 13,700 instance-hours per day, just for the interpreter overhead.
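
The conversion can be verified in a few lines. Dividing the hourly spend by an hourly instance price gives the number of instances that the spend keeps running continuously. The prices are the illustrative range used above, not quotes from any provider, and impliedInstances is a name chosen for this sketch.

```javascript
// Converts an annual cloud bill into an implied always-on instance count.
// The hourly prices are an assumed illustrative range, not real quotes.
const HOURS_PER_DAY = 24;
const DAYS_PER_YEAR = 365;

function impliedInstances(annualCostUsd, pricePerInstanceHourUsd) {
  const hourlySpend = annualCostUsd / (DAYS_PER_YEAR * HOURS_PER_DAY);
  return hourlySpend / pricePerInstanceHourUsd;
}

const annualCost = 500000;
const costPerDay = annualCost / DAYS_PER_YEAR;
const costPerHour = costPerDay / HOURS_PER_DAY;
const atCheapPrice = impliedInstances(annualCost, 0.10);
const atExpensivePrice = impliedInstances(annualCost, 0.30);

console.log(Math.round(costPerDay), Math.round(costPerHour)); // 1370 57
console.log(Math.round(atExpensivePrice), Math.round(atCheapPrice)); // 190 571
```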

That is entirely plausible for a company running a real-time event processing pipeline at meaningful scale. If JSONata evaluation is your CPU bottleneck, you end up provisioning extra instances to absorb the load. Replace the bottleneck with native code and you can run the same throughput on a fraction of the infrastructure.

The HN comments are appropriately skeptical about whether this is genuinely $500k in waste versus compute that would have been needed regardless. But the direction of the claim is credible: interpreted expression evaluation is a well-known source of unnecessary compute overhead in production pipelines.

What “Rewriting JSONata” Likely Means

There are two ways to approach this kind of rewrite. The first is to statically analyze the JSONata expressions used in the codebase (if they are known at build time, not user-authored at runtime) and transpile each one into an equivalent native JavaScript function. The expression items[price > 100].name becomes something like:

(data) => (data.items || [])
  .filter(item => item.price > 100)
  .map(item => item.name)

This approach can preserve the semantics, provided the generated code also reproduces JSONata’s sequence rules, while giving V8 full optimization opportunity. The transpiler has to handle the full JSONata grammar, including path navigation, predicates, aggregation functions, conditional expressions, and the $ and $$ context variables, but once it is written it runs automatically.
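
JSONata’s documented sequence rules are what the three-line arrow function leaves out: an empty result comes back as undefined rather than an empty array, and a single-item result comes back bare rather than wrapped in an array. A hedged sketch of a hand-written version that mirrors those two rules follows. The helper names toSequence and unwrapSequence are hypothetical, and type errors that JSONata would raise are not reproduced.

```javascript
// Hand-written equivalent of the JSONata path `items[price > 100].name`,
// sketched to mirror two sequence rules: an empty result is undefined,
// and a single-item result is returned without the enclosing array.
// Type errors that JSONata raises are not reproduced here.
function toSequence(value) {
  if (value === undefined) return [];
  return Array.isArray(value) ? value : [value];
}

function unwrapSequence(values) {
  if (values.length === 0) return undefined;
  return values.length === 1 ? values[0] : values;
}

function expensiveItemNames(data) {
  const names = toSequence(data == null ? undefined : data.items)
    .filter(item => item != null && typeof item.price === 'number' && item.price > 100)
    // flatMap drops items without a name and flattens array-valued names.
    .flatMap(item => toSequence(item.name));
  return unwrapSequence(names);
}

const catalog = {
  items: [
    { name: 'desk', price: 150 },
    { name: 'pen', price: 2 },
    { name: 'chair', price: 200 },
  ],
};
console.log(expensiveItemNames(catalog)); // [ 'desk', 'chair' ]
console.log(expensiveItemNames({ items: [{ name: 'desk', price: 150 }] })); // desk
console.log(expensiveItemNames({})); // undefined
```

A transpiler would emit helpers like these once and reuse them across every generated function.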

The second approach, if the expressions are truly dynamic (user-authored at runtime), is to compile each expression once, when it is authored or first seen, and cache the compiled version, or to replace the use case with a different mechanism that does not require runtime interpretation.
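
A compile-once cache for runtime-authored expressions could look roughly like this. Everything here is an assumption for illustration: createCompiledCache is not part of any library, and compileDottedPath is a deliberately tiny stand-in for a real transpiler that only understands dotted property paths.

```javascript
// Sketch of a compile-once cache for runtime-authored expressions.
// `compile` is a stand-in for whatever turns expression text into a
// function; it is supplied by the caller.
function createCompiledCache(compile, maxEntries = 1000) {
  const cache = new Map();
  return function getCompiled(expressionText) {
    let fn = cache.get(expressionText);
    if (fn === undefined) {
      fn = compile(expressionText);
      // Evict the oldest entry; a Map iterates in insertion order.
      if (cache.size >= maxEntries) {
        cache.delete(cache.keys().next().value);
      }
      cache.set(expressionText, fn);
    }
    return fn;
  };
}

// Hypothetical compiler for demonstration: supports only dotted paths.
let compileCount = 0;
function compileDottedPath(expressionText) {
  compileCount += 1;
  const keys = expressionText.split('.');
  return data =>
    keys.reduce((value, key) => (value == null ? undefined : value[key]), data);
}

const getCompiled = createCompiledCache(compileDottedPath);
const first = getCompiled('user.email')({ user: { email: 'a@example.com' } });
const second = getCompiled('user.email')({ user: { email: 'b@example.com' } });
console.log(first, second, compileCount); // compileCount stays at 1
```

The size bound matters when end users can author an unbounded number of distinct expressions; without it the cache becomes a memory leak.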

Either way, the bulk of the work is mechanical: for each JSONata AST node type, write a code generator that emits the equivalent JavaScript. The grammar has roughly 30-40 distinct node types. Each one has a clear transformation rule.
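
As a sketch of what one code generator per node type means in practice, the following emits JavaScript source for a toy AST and turns it into a function. The node shapes and the names emit and compileAst are hypothetical and much simpler than JSONata’s real AST, and the generated code ignores JSONata’s sequence semantics entirely.

```javascript
// Sketch of per-node code generation for a toy AST (hypothetical node
// shapes, not JSONata's). Each node type maps to a JavaScript source
// fragment; the fragments are assembled into one function body.
function emit(node) {
  switch (node.type) {
    case 'number':
      // Coerce through Number so only a numeric literal reaches the source.
      return String(Number(node.value));
    case 'path':
      // JSON.stringify quotes each key safely for use in source text.
      return node.steps.reduce(
        (source, key) => `${source}?.[${JSON.stringify(key)}]`,
        'input'
      );
    case 'binary':
      if (!['*', '+', '-', '>', '<'].includes(node.op)) {
        throw new Error(`Unsupported operator: ${node.op}`);
      }
      return `(${emit(node.left)} ${node.op} ${emit(node.right)})`;
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

function compileAst(node) {
  // new Function executes generated source; only feed it trusted ASTs.
  return new Function('input', `return ${emit(node)};`);
}

const largeOrderAst = {
  type: 'binary',
  op: '>',
  left: { type: 'path', steps: ['order', 'total'] },
  right: { type: 'number', value: 100 },
};

console.log(emit(largeOrderAst)); // (input?.["order"]?.["total"] > 100)
const isLargeOrder = compileAst(largeOrderAst);
console.log(isLargeOrder({ order: { total: 250 } })); // true
```

Once compiled, the function is ordinary JavaScript with fixed property accesses, which is the form V8 can optimize.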

Why This Is a Good Problem for AI Assistance

Mechanical, high-volume code transformation with clear correctness criteria is exactly where current LLMs perform well. The JSONata grammar is documented. The target JavaScript idioms are well-understood. The validation criterion is straightforward: run the new implementation against a test suite and compare outputs with the old one.
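
That validation step is a differential test: feed the same inputs to the old and new implementations and collect every disagreement. A minimal sketch, with both implementations as hypothetical stand-ins, is below. Because jsonata’s evaluate() returns a Promise, a real harness would await the reference side; this version is synchronous to stay short.

```javascript
// Differential test sketch: run a reference implementation and a
// candidate on the same inputs and record every disagreement.
function stableStringify(value) {
  if (value === undefined) return 'undefined';
  if (value === null || typeof value !== 'object') return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  // Sort keys so property order does not cause false mismatches.
  const keys = Object.keys(value).sort();
  return `{${keys.map(k => `${JSON.stringify(k)}:${stableStringify(value[k])}`).join(',')}}`;
}

function findMismatches(reference, candidate, inputs) {
  const mismatches = [];
  for (const input of inputs) {
    const expected = stableStringify(reference(input));
    const actual = stableStringify(candidate(input));
    if (expected !== actual) mismatches.push({ input, expected, actual });
  }
  return mismatches;
}

// Hypothetical stand-ins: a reference and a candidate with a planted bug.
const referenceTotal = order =>
  order.lines.reduce((sum, line) => sum + line.price * line.qty, 0);
const buggyTotal = order =>
  order.lines.reduce((sum, line) => sum + line.price, 0); // ignores qty

const sampleOrders = [
  { lines: [{ price: 5, qty: 1 }] },
  { lines: [{ price: 5, qty: 3 }] },
];
const found = findMismatches(referenceTotal, buggyTotal, sampleOrders);
console.log(found.length); // 1
```

Replaying recorded production events as the input set is what gives a harness like this its value; hand-picked samples rarely cover the odd document shapes.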

The AI does not need to understand the business logic. It needs to systematically apply transformation rules across a bounded set of cases. That is a pattern-matching and code-generation task, which models tend to handle well. The engineer’s role becomes writing the test harness, reviewing edge cases in the generated code, and handling the small number of constructs the model gets wrong.

This is meaningfully different from asking AI to reason about novel architectural decisions or debug subtle concurrency bugs. The “we used AI” framing often obscures whether the AI did the hard part or the easy part. In this case, the hard part was recognizing that JSONata was the bottleneck and deciding to replace it; the AI handled the grunt work of generating equivalent code for each expression in the codebase. That is a reasonable division of labor.

The Broader Lesson About DSLs in Hot Paths

JSONata’s performance characteristics are not a bug; they are a consequence of its design goals. A language that can be authored by non-engineers at runtime, evaluated against arbitrary JSON, and extended with custom functions is always going to have interpretation overhead. The library is not poorly implemented. It does what it says.

The mistake is architectural: treating a configuration-layer tool as if it belongs in a processing hot path. JSONata is well-suited to a low-code integration platform where a human configures a transformation that runs occasionally. It is poorly suited to a real-time event pipeline processing thousands of documents per second.

This pattern shows up in other contexts. Template engines used as API response serializers. Rule engines evaluated on every request. Scripting languages embedded in application logic that runs in tight loops. The DSL buys you flexibility and writability at the cost of execution speed. That tradeoff is only worth it if you are actually using the flexibility.

If your JSONata expressions are static (defined in your codebase, not user-authored at runtime), you have been carrying the interpretation overhead without getting the flexibility benefit. That is where the waste accumulates.

For teams running JSONata in production pipelines, the decision framework is simple: audit whether your expressions are static or dynamic. If they are static and you are processing at volume, the calculation that Reco.ai did is worth doing. If they are genuinely dynamic (end-user authored at runtime), you are paying for a feature you are using and the economics look different.

The jsonata GitHub repository has longstanding open requests for a compilation mode that would transpile expressions to JavaScript functions rather than interpreting them. That work has not landed in the mainline package. The community tooling around JSONata is oriented toward correctness and expressiveness, not throughput. For teams with throughput requirements, the gap between what the library offers and what the workload needs has historically required exactly the kind of one-off rewrite that Reco.ai did.