What a One-Day JSONata Port Reveals About AI-Assisted Migration
Source: simonwillison
Simon Willison linked to a case study from Vine this week about porting their JSONata implementation with AI assistance in roughly a day, resulting in $500K/year in savings. The headline grabs attention, but the more interesting question is why JSONata specifically made this feasible, and what the economics of enterprise JSONata usage look like in the first place.
What JSONata Actually Is
JSONata is a query and transformation language for JSON data, created by Andrew Coleman at IBM. If you’ve used XPath or XQuery against XML documents, the mental model transfers directly. JSONata lets you navigate, filter, and transform JSON using compact expressions:
/* Navigate to nested data */
Account.Order.Product.Price
/* Filter with predicates */
Account.Order.Product[Price > 10].Description
/* Aggregate */
$sum(Account.Order.Product.Price)
/* Transform structure */
{
  "total": $sum(Account.Order.Product.Price),
  "items": Account.Order.Product.Description
}
The npm package jsonata has been pulling in several million downloads per week for years. It's the transformation engine embedded in Node-RED (which originated at IBM and is now an OpenJS Foundation project), IBM App Connect, and various other IBM integration products. You encounter it most often in enterprise integration contexts: ETL pipelines, API response transformations, data mapping between systems.
The reference implementation is a pure-JavaScript interpreter. It tokenizes the expression, builds an AST, then walks that AST against the input JSON document. No compilation, no JIT, just a recursive evaluator. The core evaluator sits at around 3,500 lines of JavaScript. It works, it’s correct, and the test suite is thorough. But for high-throughput workloads, you pay for every evaluation in CPU time.
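To make the tokenize, parse, and walk pipeline concrete, here is a minimal sketch of a tree-walking evaluator in Python. It is not the JSONata implementation: it handles only numbers, dotted field paths, `+`, and `*`, and the function names (`tokenize`, `parse`, `evaluate`) and the tuple-based AST are assumptions made for illustration.

```python
import re

# One token per match: a number, a name, or any other single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+(?:\.\d+)?)|([A-Za-z_]\w*)|(.))")

def tokenize(expr):
    """Split an expression string into (kind, value) tokens."""
    tokens = []
    for number, name, op in TOKEN_RE.findall(expr):
        if number:
            tokens.append(("num", float(number)))
        elif name:
            tokens.append(("name", name))
        elif op.strip():  # skip stray whitespace
            tokens.append(("op", op))
    return tokens

def parse(tokens):
    """Build a tuple-based AST; '*' binds tighter than '+'."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", None)

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def atom():
        kind, value = advance()
        if kind == "num":
            return ("num", value)
        if kind != "name":
            raise SyntaxError(f"unexpected token {value!r}")
        names = [value]
        while peek() == ("op", "."):
            advance()
            kind, value = advance()
            if kind != "name":
                raise SyntaxError("expected a field name after '.'")
            names.append(value)
        return ("path", names)

    def term():
        node = atom()
        while peek() == ("op", "*"):
            advance()
            node = ("binary", "*", node, atom())
        return node

    def expr():
        node = term()
        while peek() == ("op", "+"):
            advance()
            node = ("binary", "+", node, term())
        return node

    tree = expr()
    if peek()[0] != "end":
        raise SyntaxError(f"unexpected trailing token {peek()[1]!r}")
    return tree

def evaluate(node, data):
    """Recursively walk the AST against the input document."""
    if node[0] == "num":
        return node[1]
    if node[0] == "path":
        value = data
        for name in node[1]:
            value = value[name]
        return value
    _, op, left, right = node
    lhs, rhs = evaluate(left, data), evaluate(right, data)
    return lhs + rhs if op == "+" else lhs * rhs

# Parse once, then reuse the tree for every document.
tree = parse(tokenize("Order.Price * Order.Quantity + 2"))
total = evaluate(tree, {"Order": {"Price": 3, "Quantity": 4}})
```

Even in this toy, every evaluation allocates intermediate values and recurses through the tree, which is the per-call overhead that matters at high throughput. Parsing once and reusing the tree avoids only the front half of that cost.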
Where $500K/Year Comes From
That figure sounds large until you consider where it might originate. There are two plausible sources.
The first is compute cost. If Vine was processing JSONata expressions at any real scale on cloud infrastructure, the evaluation cost adds up. JSONata’s JavaScript interpreter is not particularly fast. Running it inside a Node.js process for millions of daily transformations means millions of parse-and-evaluate cycles, each one allocating objects, building AST nodes, and walking the tree. A reimplementation in Go, Rust, or even a more carefully optimized TypeScript/JavaScript build can cut that CPU budget significantly. On cloud platforms, CPU is money.
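The shape of that arithmetic can be sketched in a few lines of Python. Every input below (core count, price per core-hour, speedup factor) is a hypothetical placeholder, not a figure from the case study; the point is only how a faster evaluator translates into an annual number.

```python
HOURS_PER_YEAR = 24 * 365

def annual_compute_cost(cores, price_per_core_hour):
    """Cost of keeping `cores` busy all year at a flat hourly price."""
    return cores * price_per_core_hour * HOURS_PER_YEAR

def annual_savings(cores, price_per_core_hour, speedup):
    """Savings if the same workload needs 1/speedup of the CPU time."""
    before = annual_compute_cost(cores, price_per_core_hour)
    after = before / speedup
    return before - after

# Hypothetical inputs, chosen only to show the calculation.
example = annual_savings(cores=100, price_per_core_hour=0.04, speedup=5.0)
```

The savings scale linearly with fleet size, so the same speedup that is a rounding error for a small service becomes a budget line item for a large one.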
The second possibility is licensing. IBM’s enterprise integration products that bundle JSONata can carry substantial annual license fees. If a team was running an IBM integration product primarily for its JSONata transformation capability, replicating that capability in a standalone implementation and cutting the IBM dependency entirely would save the full license cost. Enterprise IBM licensing for serious workloads can easily reach six figures annually.
Either way, the math is believable.
Why JSONata Was a Good Porting Target
Not all codebases are equally tractable for AI-assisted migration. JSONata has several properties that make it close to ideal.
The semantics are well defined. JSONata's documentation, together with the reference implementation's test suite, pins down what each expression should evaluate to. There are no implicit platform behaviors, no environment-specific quirks, no side effects to worry about. The function takes an expression string and a JSON document and, apart from a few built-ins such as $now(), $millis(), and $random(), returns a deterministic result.
The test suite is comprehensive. The JSONata project ships a substantial set of test cases covering the full expression language, including edge cases for type coercion, null handling, error propagation, and the built-in function library. When porting an implementation to a new language, these tests become the correctness oracle. You run the tests, fix failures, run again. The feedback loop is tight and mechanical.
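A harness for that loop can be very small. The sketch below is in Python and assumes a hypothetical case layout with `expr`, `data`, and `result` keys; the real JSONata test suite has its own file format, which a port would need to load instead. `toy_evaluate` is a stand-in for the implementation under test.

```python
def run_cases(cases, evaluate):
    """Run each case through `evaluate` and return a list of failure records."""
    failures = []
    for case in cases:
        try:
            actual = evaluate(case["expr"], case["data"])
        except Exception as exc:
            # An exception counts as a failure for this simple harness.
            failures.append({"expr": case["expr"], "error": repr(exc)})
            continue
        if actual != case["result"]:
            failures.append(
                {"expr": case["expr"], "expected": case["result"], "actual": actual}
            )
    return failures

def toy_evaluate(expr, data):
    """Stand-in for the ported implementation: handles only dotted field paths."""
    value = data
    for name in expr.split("."):
        value = value[name]
    return value

# Hypothetical cases; the last one uses a feature the toy does not support.
CASES = [
    {"expr": "Account.Name", "data": {"Account": {"Name": "Firefly"}}, "result": "Firefly"},
    {"expr": "Account.Total", "data": {"Account": {"Total": 3}}, "result": 3},
    {"expr": "$sum(Account.Prices)", "data": {"Account": {"Prices": [1, 2]}}, "result": 3},
]

failures = run_cases(CASES, toy_evaluate)
```

The failure records are what gets handed back to the model on the next iteration: the expression, what was expected, and what the port produced.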
The codebase is self-contained. The reference JavaScript implementation has no significant external dependencies. It doesn't call out to databases, APIs, or file systems. The entire logic lives in a manageable set of files, so an AI model can be given the full source as context without running into context-window limits.
The task is translation, not design. This matters. When AI helps with greenfield feature design, it has to make architectural decisions under uncertainty, and those decisions can be wrong in non-obvious ways. Porting a working implementation to a new language is different: the architecture is already proven, the algorithm is already correct, and the job is primarily syntactic and idiomatic translation. LLMs are significantly better at this narrow task than at open-ended design.
The Methodology
The practical workflow for an AI-assisted port of something like JSONata would look roughly like this. You feed the AI the source files along with the target language’s standard library documentation. You ask for a translation of each module, proceeding from the simplest (the tokenizer) to the most complex (the evaluator). At each step, you run the existing test suite against the translated code. Failures give the AI concrete feedback to iterate on.
The tokenizer and parser sections tend to go smoothly. Lexing and parsing code translates well between languages with similar string and array primitives. The evaluator is harder because it requires idiomatic handling of the target language's type system, especially around JSONata's dynamic typing rules and its handling of sequences (JSONata has a specific notion of singletons versus arrays that has subtle implications throughout).
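The singleton-versus-array behavior is easier to see in code. The Python sketch below is a simplified model, not the reference algorithm: each path step maps over the current sequence and flattens array values, and the final sequence collapses to a bare value when it holds exactly one item. The helper names are hypothetical, `None` stands in for JSONata's "undefined", and the reference implementation has additional rules (for example around array-valued fields and array constructors) that this sketch does not attempt to reproduce.

```python
def step(sequence, name):
    """Apply one field step to every item, flattening array values."""
    out = []
    for item in sequence:
        if isinstance(item, dict) and name in item:
            value = item[name]
            if isinstance(value, list):
                out.extend(value)  # arrays are merged into the result sequence
            else:
                out.append(value)
    return out

def evaluate_path(data, names):
    """Evaluate a dotted path with simplified sequence semantics."""
    sequence = [data]
    for name in names:
        sequence = step(sequence, name)
    if not sequence:
        return None  # stand-in for JSONata's "undefined"
    if len(sequence) == 1:
        return sequence[0]  # a singleton sequence collapses to its value
    return sequence

orders = {
    "Account": {
        "Order": [
            {"Product": [{"Price": 10}, {"Price": 20}]},
            {"Product": {"Price": 5}},
        ]
    }
}

many = evaluate_path(orders, ["Account", "Order", "Product", "Price"])
one = evaluate_path(
    {"Account": {"Order": {"Product": {"Price": 7}}}},
    ["Account", "Order", "Product", "Price"],
)
missing = evaluate_path(orders, ["Account", "Missing"])
```

In a statically typed target language, the return type of `evaluate_path` (a value, a list, or nothing) has to be modeled explicitly, which is where much of the evaluator's translation effort tends to go.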
The built-in function library is largely mechanical translation: string functions, numeric functions, aggregation, date/time handling. Tedious in human time, fast with AI assistance.
The parts that require human attention are the semantic edge cases where the reference implementation’s behavior is only partially documented and is instead encoded in tests. JSONata has a few of these, particularly around how predicates interact with singleton propagation, and how certain error conditions propagate versus being swallowed. These are the places where an automated test run will surface unexpected failures and where a human needs to reason about intended behavior rather than just translating code.
What the “One Day” Number Actually Means
One day of AI-assisted work is credible for this kind of port, given the right preparation. The preparation matters more than the single-day sprint. Before that day, someone had to understand JSONata’s semantics well enough to recognize when a translation was wrong. Someone had to set up the test harness in the target language. Someone had to make the architectural decision about which language to port to.
The one-day figure also likely elides the bug-fixing tail. Getting 90% of the test cases passing quickly is feasible; getting to 100% on the edge cases takes longer. Real production use surfaces failure modes that the test suite doesn’t cover. The “done in a day” frame is accurate for the initial translation; it’s not accurate for the full validation cycle.
That caveat does not diminish what was accomplished. Before capable AI coding tools existed, porting a non-trivial language interpreter to a new runtime was a weeks-long project, possibly months if the target language had significant semantic differences. The AI compresses the tedious mechanical work while leaving the genuinely hard semantic reasoning to the engineer.
The Broader Pattern
This case fits a pattern that's been appearing across the industry: AI-assisted migration is most tractable when the source has good tests, the semantics are well defined, and the task is translation rather than design. Database client libraries, serialization codecs, protocol parsers, and embedded language interpreters like JSONata are all in this category.
The category where AI-assisted migration struggles is large systems with implicit environmental dependencies, extensive mutable state, or behaviors that are only defined by production usage rather than tests. The lesson from the JSONata case is partly about AI capability and partly about the value of writing well-tested, self-contained code with documented semantics. Those properties were valuable long before AI tools existed; they’re just dramatically more valuable now.
For teams running JSONata at scale and paying for the privilege, the tooling exists to do what Vine did. The question is whether the organization has the test coverage and language expertise to catch the semantic edge cases that automated testing will miss.