
Why the JSONata Rewrite Worked, and Where Rewrites Like It Go Wrong

Source: hackernews

The headline from Reco.ai is good: they rewrote JSONata with AI assistance in a single day and are saving $500,000 per year in compute costs. On Hacker News the post picked up 249 points and over 200 comments, most of them debating whether the savings figure is credible.

The savings figure probably is credible, for reasons worth explaining. But the more interesting thread to pull on is what the rewrite required, why JSONata of all things was the bottleneck, and what AI-assisted rewrites of this type produce versus what teams assume they produce.

What JSONata Is and Why It Accumulates Cost

JSONata is a query and transformation language for JSON data, created by Andrew Coleman at IBM and released around 2016. The npm package sits at tens of millions of total downloads and powers IBM App Connect Enterprise, Node-RED pipelines, and a wide range of data integration and security tooling.

The syntax is worth understanding. Where something like JMESPath gives you path navigation and basic filters, JSONata gives you a full expression language: path navigation with wildcards and descendants, predicate filters, arithmetic, string manipulation, lambda expressions, higher-order functions, partial application, aggregation, and date/time operations. An expression like Account.Order[OrderID = 'order1'].Product.Price navigates nested JSON and filters in a single step. $sum(Account.Order.Product.(Price * Quantity)) aggregates across nested arrays with inline arithmetic. This is closer to XPath/XQuery territory than to simple JSON querying.
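To make the two expressions concrete, here is a rough Python analogue of what each one computes. The sample document and its values are invented for illustration; only the field names come from the expressions above, and the comprehensions are a hand-written equivalent, not a description of how JSONata evaluates internally.

```python
# Hypothetical sample document; the values are invented for illustration.
doc = {
    "Account": {
        "Order": [
            {
                "OrderID": "order1",
                "Product": [
                    {"Price": 10.0, "Quantity": 2},
                    {"Price": 5.0, "Quantity": 1},
                ],
            },
            {
                "OrderID": "order2",
                "Product": [{"Price": 20.0, "Quantity": 3}],
            },
        ]
    }
}

# Analogue of: Account.Order[OrderID = 'order1'].Product.Price
prices = [
    product["Price"]
    for order in doc["Account"]["Order"]
    if order["OrderID"] == "order1"
    for product in order["Product"]
]

# Analogue of: $sum(Account.Order.Product.(Price * Quantity))
total = sum(
    product["Price"] * product["Quantity"]
    for order in doc["Account"]["Order"]
    for product in order["Product"]
)
```

Each one-line expression stands in for a nested loop with a filter or an accumulation, which is where the XPath/XQuery comparison comes from.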

The implementation is a pure JavaScript recursive descent parser and interpreter. Parsing is separate from evaluation: an expression is parsed once into an AST, and each call then evaluates it by walking that tree. For light use this is completely acceptable. For a security monitoring platform that processes millions of events per day and runs JSONata expressions against each one, the interpreter overhead compounds. Every property access, every function call, and every array allocation sits in the hot path. There is no JIT specialization for JSONata expression semantics; it is just a JavaScript program calling other JavaScript.
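The parse-once, walk-many-times shape is easy to see in miniature. The sketch below is a toy tree-walking evaluator with hypothetical node types (they are not JSONata's actual AST), and the tree is built by hand rather than parsed. It is only meant to show where the per-event cost comes from.

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical node types for illustration; not JSONata's real AST.
@dataclass
class Number:
    value: float


@dataclass
class Field:
    name: str  # look up a key on the current value


@dataclass
class Path:
    steps: list  # apply each step to the result of the previous one


@dataclass
class Multiply:
    left: Any
    right: Any


def evaluate(node, data):
    """Walk the tree for one input; each node costs a type check and a call."""
    if isinstance(node, Number):
        return node.value
    if isinstance(node, Field):
        return data.get(node.name) if isinstance(data, dict) else None
    if isinstance(node, Path):
        current = data
        for step in node.steps:
            current = evaluate(step, current)
        return current
    if isinstance(node, Multiply):
        return evaluate(node.left, data) * evaluate(node.right, data)
    raise TypeError(f"unknown node: {node!r}")


# Built once (by hand here, standing in for the parser), reused per event.
ast = Multiply(Path([Field("item"), Field("price")]), Number(2))
events = [{"item": {"price": 3}}, {"item": {"price": 4.5}}]
results = [evaluate(ast, event) for event in events]
```

Even in this toy, a three-token expression turns into five recursive calls per event. Multiply that by a realistic expression and millions of events and the constant factor becomes the bill.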

This is the pattern behind most of the big performance rewrites in the JS ecosystem. SWC replaced Babel with a Rust compiler. esbuild replaced webpack-era bundling with a Go program. Biome is doing what ESLint and Prettier do, in Rust. The common thread: all of these tools run in inner loops where constant JavaScript runtime overhead accumulates faster than the language’s strengths can compensate.

The $500k/Year Arithmetic

Half a million dollars a year is around $41,700 a month. At current AWS pricing for a moderately sized EC2 fleet, this implies somewhere in the range of 20 to 50 large-instance equivalents running continuously to handle JSONata evaluation load. For a company with a busy SaaS security pipeline, that could easily be a real number. Security monitoring products ingest high volumes of audit log events, normalize them, and evaluate policy expressions against each one. If that normalization and evaluation layer is JSONata-heavy, the compute footprint follows directly.
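The arithmetic is quick to check. This sketch uses only the figures in the paragraph above plus one assumption (730 hours in an average month), and derives the per-instance hourly rate that the 20-to-50 range implies. It makes no claim about actual AWS list prices.

```python
ANNUAL_SAVINGS = 500_000  # dollars per year, from the headline
HOURS_PER_MONTH = 730     # assumption: 8760 hours per year / 12

monthly = ANNUAL_SAVINGS / 12


def implied_hourly_rate(instances: int) -> float:
    """Hourly cost per instance if the monthly figure is spread over a fleet."""
    return monthly / instances / HOURS_PER_MONTH


for fleet_size in (20, 50):
    rate = implied_hourly_rate(fleet_size)
    print(f"{fleet_size} instances -> ${rate:.2f}/hour each")
# prints:
# 20 instances -> $2.85/hour each
# 50 instances -> $1.14/hour each
```

In other words, the estimate holds if each always-on instance costs somewhere between about $1.14 and $2.85 per hour all-in.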

The number is also plausible if they were running JSONata in Lambda functions with per-invocation billing, where even a few milliseconds of interpreter overhead at high concurrency turns into meaningful monthly spend. This is the math that makes performance rewrites of runtime libraries so impactful compared to build-tool rewrites: a 10x faster build tool saves developer time, but a 10x faster runtime library reduces your infrastructure invoice every month indefinitely.

What AI Gets Right in This Kind of Rewrite

A recursive descent interpreter for a well-specified expression language is, structurally, a good target for AI-assisted translation. The source code is self-contained, the semantics are defined by a published specification and a substantial test suite, and the translation task is largely mechanical: parse the JavaScript, understand the tree-walking pattern, emit equivalent code in the target language.

This is where “in a day” becomes plausible. The original JSONata JavaScript source is readable, well-organized, and not deeply coupled to Node.js internals. An AI model given the source and asked to produce a Go or Rust equivalent will produce a working first pass quickly. The tokenizer, the parser, and the basic expression evaluators for arithmetic and path navigation all translate with relatively low friction.

What takes longer, and what AI still struggles with, is getting the edge cases right. JSONata’s semantics have several behaviors that are surprising on first encounter and easy to get subtly wrong in a reimplementation.

The most notorious is array singleton flattening. In JSONata, expressions that return a single-element array frequently get unwrapped to a scalar automatically, but not always. The rules are context-sensitive and documented but non-obvious. Get this wrong and you produce a reimplementation that passes 95% of the test suite and fails in production on the narrow slice of behavior your application depends on.
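A simplified model of the basic rule looks like the sketch below. The function name and the keep_array flag are hypothetical, and this covers only the simplest case (empty means no result, one item means the item itself). The real language has more context-sensitive cases than this handles, such as syntax that asks for the array to be preserved.

```python
def as_sequence_result(values: list, keep_array: bool = False):
    """Simplified model of how a result sequence is handed back.

    Assumption: only the basic rule is modelled here. None stands in
    for JSONata's "no result"; real implementations need a distinct
    value so that JSON null stays representable.
    """
    if keep_array:
        return values      # caller explicitly asked to keep the array
    if len(values) == 0:
        return None        # empty sequence: no result at all
    if len(values) == 1:
        return values[0]   # singleton: unwrapped to the bare value
    return values          # two or more items: stays an array


unwrapped = as_sequence_result([42])
preserved = as_sequence_result([42], keep_array=True)
several = as_sequence_result([1, 2])
```

A port that always returns the list (the natural thing to do in Go or Rust) would hand back [42] where callers expect 42, and every downstream comparison against a scalar would quietly fail.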

Similarly, null and undefined propagation in JSONata is intentional: expressions that would throw in most languages instead return undefined and are silently dropped from results. Path expressions against missing keys return undefined rather than throwing. This is a deliberate design choice that makes JSONata comfortable for schema-flexible data, but it means a reimplementation has to faithfully reproduce the error suppression semantics rather than defaulting to the target language’s own error handling conventions.
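In a target language that prefers exceptions or explicit errors, this has to be built deliberately. The sketch below shows one way to model it, using a hypothetical MISSING sentinel and helper names that are not from any JSONata implementation.

```python
# Hypothetical stand-in for "undefined"; kept distinct from None so that
# a JSON null in the data remains a real value.
MISSING = object()


def get_path(data, *keys):
    """Follow keys through nested dicts; return MISSING instead of raising."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return MISSING
        current = current[key]
    return current


def collect(events, *keys):
    """Evaluate the path on each event, silently dropping missing results."""
    found = (get_path(event, *keys) for event in events)
    return [value for value in found if value is not MISSING]


events = [
    {"user": {"email": "a@example.com"}},
    {"user": {}},         # no email key
    {"actor": "system"},  # no user key at all
]
emails = collect(events, "user", "email")
```

Two of the three events contribute nothing and raise nothing. A translation that uses plain indexing instead would fail on the second event.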

There is also the question of the built-in function library. JSONata ships with over 70 built-in functions covering string manipulation, numeric operations, aggregation, array manipulation, date/time, and URL encoding. Each one has documented behavior for edge inputs. A rewrite that covers path navigation and predicates but silently mishandles $formatDateTime will work fine until someone runs a time-sensitive query.

The Part Worth Watching

Reco.ai’s post is honest that they validated against JSONata’s test suite, which is the right move. The original repository includes a comprehensive test suite that covers the language’s more unusual behaviors. Running a rewrite against that suite is how you discover the singleton flattening problems and the undefined propagation edge cases before they become production incidents.

What is less clear from the article is how much of the JSONata feature surface they are using. If their workload only exercises a subset of JSONata’s capabilities, say common path navigation, predicates, and a handful of built-in functions, the risk profile of the rewrite is much lower. A partial reimplementation tested against the full suite but only relied upon for a known subset of features is a reasonable strategy. It is also the kind of thing you can validate empirically: shadow the new implementation against the old one in production for a week and compare outputs.
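A shadow comparison of that kind does not need much machinery. The sketch below assumes both implementations can be wrapped as a function taking an expression and an event; the names, the record format, and the toy evaluators at the bottom are all hypothetical.

```python
from typing import Any, Callable, Iterable

Evaluator = Callable[[str, Any], Any]  # (expression, event) -> result


def shadow_compare(
    old_impl: Evaluator,
    new_impl: Evaluator,
    traffic: Iterable[tuple],
) -> list:
    """Treat old_impl as the reference; record where new_impl disagrees."""
    mismatches = []
    for expression, event in traffic:
        expected = old_impl(expression, event)
        try:
            actual = new_impl(expression, event)
        except Exception as exc:  # a crash in the candidate is a mismatch too
            actual = f"error: {exc!r}"
        if actual != expected:
            mismatches.append({
                "expression": expression,
                "event": event,
                "expected": expected,
                "actual": actual,
            })
    return mismatches


# Toy evaluators standing in for the two real implementations.
def old_eval(expression, event):
    return event.get(expression)  # missing key -> None


def new_eval(expression, event):
    return event[expression]      # missing key -> KeyError


traffic = [("id", {"id": 1}), ("name", {"id": 2})]
report = shadow_compare(old_eval, new_eval, traffic)
```

The useful output is not the pass rate but the list of mismatching expressions, since those show which parts of the language the workload actually exercises.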

This is a general principle for AI-assisted rewrites of DSLs: the specification is your friend. The JSONata project’s test suite is the artifact that makes this kind of rewrite tractable. Without it, you are reimplementing by reading source code and hoping you understood all the semantics. With it, you have a runnable definition of correctness that AI tools can be evaluated against at each step.
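In practice, "runnable definition of correctness" means a loop like the one below, run after every change the AI proposes. The case format here (expr, data, expected, group) is a hypothetical simplification rather than the JSONata suite's actual file layout, and the evaluator is a toy.

```python
from collections import defaultdict


def run_conformance(evaluate, cases):
    """Run every case; return per-group [passed, total] and the failures."""
    tally = defaultdict(lambda: [0, 0])
    failures = []
    for case in cases:
        try:
            ok = evaluate(case["expr"], case["data"]) == case["expected"]
        except Exception:
            ok = False  # a crash counts as a failed case
        tally[case["group"]][1] += 1
        if ok:
            tally[case["group"]][0] += 1
        else:
            failures.append(case)
    return dict(tally), failures


# Toy evaluator and cases, for illustration only.
def toy_evaluate(expr, data):
    return data[expr]  # raises KeyError on a missing key


cases = [
    {"group": "fields", "expr": "a", "data": {"a": 1}, "expected": 1},
    {"group": "fields", "expr": "b", "data": {"a": 1}, "expected": None},
]
tally, failures = run_conformance(toy_evaluate, cases)
```

Grouping the tally by feature matters more than the overall percentage: a suite that is 95% green overall can still be 0% green on the one group your expressions depend on.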

Where This Fits

The Reco.ai story fits a broader pattern that is now mature enough to have a methodology. Performance-critical JavaScript libraries with well-defined semantics are being rewritten in faster runtimes, with AI handling the mechanical translation and existing test suites providing the correctness baseline. The economics make sense: the rewrite cost is bounded because the scope is known and the tests exist, and the savings are permanent because the infrastructure bill arrives every month.

What has changed in the last two years is that the mechanical translation step, once requiring a skilled systems programmer who could hold the entire JS source in their head while writing Go or Rust, is now something an AI model can rough out in hours. The human work shifts toward verification: running the test suite, investigating failures, understanding the edge cases well enough to judge whether the AI’s fix is correct or just passes the specific failing test.

What matters about the Reco.ai story is not the speed of the initial rewrite but the rigor applied after: running the full test suite, measuring the performance improvement, and validating before committing to the operational risk of switching. That combination (clear specification, thorough testing, and empirical validation before cutover) is harder to get right than any single day of writing code.
