
Language Lock-in Has a Price Tag: The JSONata Port That Saved $500K

Source: simonwillison

Simon Willison linked to a case study this week that deserves more attention than it will probably get. A company called Vine rewrote the JSONata runtime in a day using AI assistance and cut $500K per year in infrastructure costs. The headline is a good hook, but the interesting part is what it says about a class of problem that has been quietly expensive for years.

What JSONata Is and Why It Matters

JSONata is a query and transformation language for JSON data, created by Andrew Coleman at IBM around 2015. If you’ve used IBM App Connect, Node-RED, or various iPaaS tools, you’ve probably written JSONata without knowing it. The syntax borrows from XPath but feels natural against JSON structures. A simple example:

Account.Order.Product.(Price * Quantity)

This navigates nested JSON, iterates over arrays implicitly, and computes inline expressions. More complex expressions involve filters, closures, user-defined functions, and built-in operations for strings, numbers, datetimes, and arrays. The full language spec covers about 60 built-in functions and a reasonably expressive functional programming model.
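For readers who have not written JSONata, the expression above corresponds roughly to the nested loops below. This is a minimal Python sketch, assuming Order and Product are always arrays. The sample document and the function name line_totals are made up for illustration, and real JSONata also copes with single objects and missing fields.

```python
def line_totals(doc):
    """Rough equivalent of Account.Order.Product.(Price * Quantity).

    Assumes Order and Product are lists; JSONata itself is more lenient.
    """
    totals = []
    for order in doc.get("Account", {}).get("Order", []):
        for product in order.get("Product", []):
            # The parenthesised block runs once per product.
            totals.append(product["Price"] * product["Quantity"])
    return totals


# Hypothetical sample data, not taken from the case study.
sample = {
    "Account": {
        "Order": [
            {"Product": [{"Price": 10, "Quantity": 2}, {"Price": 5, "Quantity": 1}]},
            {"Product": [{"Price": 4, "Quantity": 3}]},
        ]
    }
}

print(line_totals(sample))  # [20, 5, 12]
```

The one-line expression hides two levels of iteration and the flattening of their results, which is part of what an evaluator in another language has to reproduce.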

The reference implementation is the jsonata npm package, written in JavaScript. It’s about 7,000 lines of reasonably clean code: a tokenizer, a recursive descent parser, an AST evaluator that passes a frame object for variable scoping, and the built-in function library. Version 2.x added Promise-based async evaluation so custom functions can call out to async services.

For the JavaScript ecosystem, this is fine. For everyone else, it’s a problem.

The Infrastructure Tax

If you’re running a Python, Go, or Java service and you need to evaluate JSONata expressions against user-provided data, you have three realistic options.

First, you run a Node.js microservice alongside your main service and make HTTP requests to it. This is the most common enterprise approach. It works, it’s correct, and it’s expensive. Every horizontal scale event in your main service potentially means scaling the Node.js sidecar too. You’re paying for the compute, the inter-service latency, the networking, the operational overhead of a second runtime in your deployment, and the engineering time to maintain it. At scale, across multiple services or high-traffic systems, this compounds quickly.

Second, you embed a JavaScript engine directly in your process. GraalVM’s JavaScript support can run JavaScript inside a JVM process. QuickJS is embeddable in C applications. These approaches trade the network hop for memory overhead and startup complexity. GraalVM in particular adds significant heap pressure and warm-up time.

Third, you use one of the community ports. There’s jsonata4java, an IBM-maintained Java port, and a handful of Go and Python attempts. These ports are incomplete. The JSONata test suite has over 500 test cases covering edge cases in the language semantics, and the community ports fail a non-trivial number of them. Using an incomplete port means either accepting incorrect behavior or writing workarounds for every gap you discover in production.

The $500K figure in the Vine case study almost certainly reflects option one at scale. A fleet of Node.js instances running to serve JSONata evaluation requests, scaled to handle traffic, across what sounds like a significant data integration workload. That’s not unusual for companies in the iPaaS or enterprise integration space.

Why Manual Porting Failed

Ports of the JSONata runtime have been attempted by serious engineers multiple times. The Java port has existed since around 2018 and is still incomplete. The gap isn’t a reflection on the engineers who worked on it; it reflects the nature of the problem.

JSONata’s built-in function library has about 60 functions. Many of them have subtle JavaScript-specific semantics. The $string() function follows JavaScript’s toString rules, including how it formats numbers. The $number() function follows JavaScript’s number parsing behavior. The datetime functions have specific behavior around timezone handling that mirrors JavaScript’s Date object. Getting any of these subtly wrong means your port passes simple tests but fails on edge cases that real data will hit.
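To make the number-formatting point concrete, here is a sketch of JavaScript's Number-to-string rule written in Python. It relies on Python's repr() producing the shortest round-trip digits, and the function name js_number_to_string is invented for this example. It models only the plain JavaScript conversion; JSONata's $string() may apply further rounding on top, so treat this as an illustration of the mismatch, not as the library's behavior.

```python
import math
from decimal import Decimal


def js_number_to_string(x: float) -> str:
    """Approximate JavaScript's Number-to-string conversion for a double."""
    if math.isnan(x):
        return "NaN"
    if math.isinf(x):
        return "Infinity" if x > 0 else "-Infinity"
    if x == 0:
        return "0"  # JavaScript prints both 0 and -0 as "0"

    # repr() yields the shortest digits that round-trip; Decimal splits them.
    sign, digit_tuple, exponent = Decimal(repr(x)).as_tuple()
    digits = "".join(map(str, digit_tuple)).rstrip("0")
    exponent += len(digit_tuple) - len(digits)  # account for stripped zeros
    k = len(digits)
    n = k + exponent  # the value equals 0.<digits> * 10**n

    if k <= n <= 21:
        body = digits + "0" * (n - k)            # integer, no exponent
    elif 0 < n <= 21:
        body = digits[:n] + "." + digits[n:]     # ordinary decimal
    elif -6 < n <= 0:
        body = "0." + "0" * (-n) + digits        # small decimal
    else:
        e = n - 1                                # exponent form
        mantissa = digits[0] + ("." + digits[1:] if k > 1 else "")
        body = f"{mantissa}e{'+' if e >= 0 else '-'}{abs(e)}"
    return ("-" if sign else "") + body
```

Python's own str() gives "1.0", "1e+16", and "1e-07" where this function returns "1", "10000000000000000", and "1e-7". Each difference is small, and each is the kind of thing a naive port gets wrong.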

The evaluator’s scoping model uses mutable JavaScript objects as frames, passed by reference through recursive evaluation. Translating this into a language with different object semantics requires understanding the full execution model, not just the surface syntax.
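A frame chain of this kind might be modelled as follows. This is a hypothetical Python sketch of the general technique (mutable scopes linked to a parent and shared by reference), not a transcription of the jsonata source; the class and method names are assumptions.

```python
class Frame:
    """A mutable variable scope with a link to its enclosing scope."""

    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent

    def bind(self, name, value):
        self.bindings[name] = value

    def lookup(self, name):
        # Walk outward until the name is found or the chain ends.
        frame = self
        while frame is not None:
            if name in frame.bindings:
                return frame.bindings[name]
            frame = frame.parent
        return None


root = Frame()
root.bind("rate", 2)

inner = Frame(parent=root)
inner.bind("x", 10)


def closure():
    # Holds a reference to `inner`, not a snapshot of its contents.
    return inner.lookup("x") * inner.lookup("rate")


before = closure()    # 20
root.bind("rate", 3)  # mutate the outer frame after the closure exists
after = closure()     # 30
```

Because the closure sees the later mutation, a port that copies frames by value instead of sharing them would behave differently in exactly this kind of case.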

And the async evaluation path in version 2 is genuinely complex to port to languages that don’t have JavaScript’s particular flavor of async/await semantics.

All of this adds up to a port that is mechanically large, demands careful attention to semantic detail, and offers little that engineers find rewarding. It’s the kind of work that’s hard to staff because it’s tedious, and hard to complete because the surface area is wide. Estimates for a correct, complete port historically ran to months of engineer time.

What Changed

The Vine case study describes completing this in roughly a day with AI assistance. That compression factor, months to a day, deserves unpacking.

The built-in function library is the most direct example. Sixty functions, each with specific semantics, each needing to be translated from JavaScript to the target language, each needing tests. This is exactly the kind of repetitive, well-specified translation task where a language model does well. The model doesn’t get bored on function 40. It doesn’t start cutting corners because the deadline is approaching. You feed it the JavaScript source, you describe the target language’s idioms, and it produces a reasonable translation. You run the test suite, you feed the failures back, you iterate.

The JSONata test suite is the hidden enabler here. Without it, this approach wouldn’t work. You’d have no reliable way to know whether your translation was correct. The test suite acts as an executable specification. Every test that passes is evidence of correct behavior; every failure is a precise, actionable bug report. The iteration loop of translate, test, fix is tight and mechanical, which is exactly where AI assistance provides leverage.
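The translate, test, fix loop can be sketched as a small conformance harness. The case format below (expr, data, result) is an assumption for illustration and may not match the real suite's layout, and toy_evaluate is a deliberately tiny stand-in for a port under test that only handles dotted paths.

```python
def run_cases(evaluate, cases):
    """Run each case through `evaluate` and collect the failures."""
    failures = []
    for case in cases:
        try:
            actual = evaluate(case["expr"], case["data"])
        except Exception as exc:  # a crash is also a failed case
            failures.append({"expr": case["expr"], "error": repr(exc)})
            continue
        if actual != case["result"]:
            failures.append(
                {"expr": case["expr"], "expected": case["result"], "actual": actual}
            )
    return failures


def toy_evaluate(expr, data):
    """Stand-in evaluator: follows a dotted path and nothing else."""
    value = data
    for key in expr.split("."):
        value = value[key]
    return value


# Hypothetical cases; the second and third are meant to fail.
cases = [
    {"expr": "a.b", "data": {"a": {"b": 1}}, "result": 1},
    {"expr": "a.c", "data": {"a": {"c": "x"}}, "result": "y"},
    {"expr": "a.b.c", "data": {"a": 5}, "result": None},
]

failures = run_cases(toy_evaluate, cases)
for failure in failures:
    print(failure)
```

Each failure record names the expression and shows either the mismatch or the exception, which is the kind of precise report that can be handed back to the model for the next iteration.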

This points to a general principle: AI-assisted porting is most effective when there’s a comprehensive test suite, a clean separation between the translation task and the design task, and a target language that the model has substantial training data for. JSONata checks all three boxes.

The ROI Calculation Shifted

The interesting economic story isn’t just that the port saved $500K. It’s that this port was probably considered and rejected multiple times before. The engineering estimate said months of work. The infrastructure cost said $500K per year. The savings were clear, but the investment was hard to justify given other priorities and the risk of an incomplete port creating production bugs.

AI assistance changed the denominator. A one-day project with high confidence of correctness, backed by a comprehensive test suite, is a completely different ROI calculation than a months-long project with uncertain completion and a long tail of correctness issues to shake out.
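A back-of-the-envelope payback calculation shows how much the cost side moves. Only the $500K annual saving comes from the case study. The daily engineering cost and the 120-day estimate below are hypothetical placeholders chosen for illustration, and the sketch ignores the risk of an incomplete port.

```python
def payback_days(project_cost, annual_savings):
    """Days of savings needed to recover the project cost."""
    return project_cost / annual_savings * 365


ANNUAL_SAVINGS = 500_000   # from the case study headline
COST_PER_ENG_DAY = 1_000   # hypothetical loaded cost, an assumption
MANUAL_PORT_DAYS = 120     # hypothetical stand-in for "months of work"
AI_PORT_DAYS = 1           # "roughly a day"

manual = payback_days(MANUAL_PORT_DAYS * COST_PER_ENG_DAY, ANNUAL_SAVINGS)
assisted = payback_days(AI_PORT_DAYS * COST_PER_ENG_DAY, ANNUAL_SAVINGS)

print(f"manual port pays back in about {manual:.0f} days")
print(f"assisted port pays back in about {assisted:.2f} days")
```

Under these assumed inputs even the manual port would pay for itself within a few months, which suggests the uncertain completion and the long tail of correctness bugs were what actually blocked it.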

This is where AI tooling is actually moving the needle in software engineering: not in writing greenfield code faster, but in making previously uneconomical projects economical. This is work that was technically possible but cost-prohibitive relative to its benefit: porting a library runtime, migrating a large codebase to a new framework, translating a legacy system from one language to another.

These projects have existed in backlogs for years because the cost-benefit didn’t work out. The cost side of that equation is changing.

What This Means for Language Ecosystem Design

There’s a design lesson here for library authors. JSONata is in this situation because it was designed as a JavaScript library for a JavaScript ecosystem. The language specification and the JavaScript implementation are deeply entangled. The reference for “what does this function do” is the JavaScript source code, not a language-agnostic specification.

Libraries designed this way are a hidden liability for adopters who might want to use them from other languages. The JSONata test suite exists and is comprehensive, which is what made the Vine port tractable. But the spec-as-tests approach is less robust than a formal language specification that could drive independent implementations.

The Vine story will probably accelerate more ports of JavaScript-only libraries. The pattern is now proven: comprehensive test suite, AI-assisted translation, iterative correction. Any library sitting at the center of an expensive infrastructure workaround is now a candidate.
