The Conformance Suite Is Why That One-Day JSONata Port Worked

When Vine posted about rewriting JSONata with AI in a single day, the $500K/year savings figure got the attention. Simon Willison flagged it with the observation that this is a pattern worth paying attention to. Both framings are accurate, but neither one explains the part I find most interesting: why this particular project worked when so many prior attempts to port JSONata failed.

The answer is the conformance test suite. The AI was necessary but not sufficient. What made the result trustworthy was having a machine-checkable oracle to verify correctness against at every step.

What JSONata Actually Does

JSONata is a query and transformation language for JSON, originally created by Andrew Coleman at IBM. The comparison to XPath/XSLT for XML is instructive. JSONata does not just filter or extract: it transforms. You can navigate paths, filter predicates, aggregate, write lambdas, compose with higher-order functions, and construct entirely new JSON structures from input data.

A few examples to make this concrete:

/* Path navigation with implicit iteration */
Account.Order.Product.Price

/* Predicate filter */
Account.Order.Product[Price > 10].Description

/* Aggregation over a computed expression */
$sum(Account.Order.Product.(Price * Quantity))

/* Descendant wildcard: matches Price at any depth */
Account.**.Price

The standard library covers around sixty built-in functions. The chain operator (~>) pipes the result of one expression into the next function, which enables point-free composition. There is a full lambda syntax with closures and partial application. This is not a query language that accidentally got some transformation features bolted on; the transformation capability is the point.

Where JSONata gets used: IBM App Connect Enterprise uses it as the expression language for data mapping, Node-RED uses it for IoT flow programming, and AWS Step Functions added native JSONata support in late 2024 as an alternative to its own intrinsic function syntax.

The Embedding Problem

Here is the practical friction. JSONata’s reference implementation is a JavaScript library. The npm package is MIT licensed and costs nothing. But if your system is written in Go, Rust, Python, or anything JVM-based, “MIT licensed” does not mean “free to use.” It means you are now running a Node.js sidecar.

In practice that sidecar comes with cold starts in the 200ms to 1 second range, baseline memory around 50MB before your application code, and a process boundary that adds latency to every evaluation call. At low volumes this is annoying. At millions of evaluations per day, the compute costs compound, and you carry the operational burden of a separate deployment pipeline with its own observability, on-call responsibilities, and version management. The $500K figure is plausible either as a compute bill for a large sidecar fleet or as IBM enterprise platform licensing for teams running App Connect primarily for JSONata capability.

The obvious fix is a native implementation in the language where your system actually lives. Community ports have existed since around 2018. JSONata4Java, maintained by IBM itself, has been in development for years and is still incomplete. There are Go and Python attempts in similar states. None of them has reached production-complete status.

Why Previous Ports Stalled

The gap between “mostly works” and “semantically correct” in a JSONata port comes down to three specific behaviors that are deeply entangled with the semantics of the JavaScript reference implementation.

Sequences are not arrays. JSONata’s sequence concept comes from XPath. When a path expression matches a single value, it returns that value directly, not a single-element array. When it matches multiple values, it returns a sequence. A bare value and an array of one element behave the same in some contexts and differently in others. Appending [] to a path step forces the result to stay an array even when it holds a single value. Get this wrong and your implementation is correct 90% of the time with subtle failures at composition boundaries.
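To make the collapse rule concrete, here is a minimal Python sketch. It is a simplified model, not the reference algorithm: `navigate` and `keep_array` are hypothetical names, Python's `None` stands in for an empty result, and the real evaluator has further rules (for instance around array-valued fields) that this ignores.

```python
from typing import Any, List


def navigate(data: Any, path: List[str], keep_array: bool = False) -> Any:
    """Follow `path` through nested dicts and lists, then collapse the result."""
    current = [data]
    for key in path:
        matched = []
        for item in current:
            # Arrays are iterated implicitly: the step applies to each element.
            elements = item if isinstance(item, list) else [item]
            for element in elements:
                if isinstance(element, dict) and key in element:
                    value = element[key]
                    # Array-valued fields are flattened into the sequence.
                    matched.extend(value if isinstance(value, list) else [value])
        current = matched
    if keep_array:
        return current        # loosely analogous to the [] modifier
    if not current:
        return None           # stand-in for an empty result in this sketch
    if len(current) == 1:
        return current[0]     # a singleton sequence collapses to the bare value
    return current


one_order = {"Order": [{"Product": {"Price": 5}}]}
two_orders = {"Order": [{"Product": {"Price": 5}}, {"Product": {"Price": 7}}]}
price_path = ["Order", "Product", "Price"]

single = navigate(one_order, price_path)                    # bare value
several = navigate(two_orders, price_path)                  # list of values
forced = navigate(one_order, price_path, keep_array=True)   # one-element list
```

The same path yields three differently shaped results depending on the input and the modifier, which is where a port that treats sequences as ordinary arrays starts to drift.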

Undefined propagates silently. Accessing a missing path returns nothing, not null and not an error. $uppercase(foo.bar.baz) on input where the path does not exist returns nothing rather than throwing. Every one of the roughly sixty built-in functions and every operator needs to respect this. Rust’s Option<T>, Go’s comma-ok idiom, and Python’s None all have different default behavior, so you cannot simply follow the target language’s conventions; you have to implement the JSONata convention consistently everywhere.
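One way to enforce that convention in a port is a dedicated sentinel plus a wrapper applied to the built-ins. The Python sketch below is an illustration under assumptions: `UNDEFINED`, `propagates_undefined`, `lookup`, and `uppercase` are hypothetical names, and it only covers the first argument of a function.

```python
import functools

# Sentinel for "nothing". It must stay distinct from None, which models JSON null.
UNDEFINED = object()


def propagates_undefined(func):
    """Wrap a built-in so an undefined first argument yields undefined."""
    @functools.wraps(func)
    def wrapper(first, *rest):
        if first is UNDEFINED:
            return UNDEFINED
        return func(first, *rest)
    return wrapper


@propagates_undefined
def uppercase(value):
    return value.upper()


def lookup(data, *keys):
    """Walk nested dicts; a missing key yields UNDEFINED instead of raising."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return UNDEFINED
        current = current[key]
    return current


present = uppercase(lookup({"foo": {"bar": {"baz": "hi"}}}, "foo", "bar", "baz"))
absent = uppercase(lookup({}, "foo", "bar", "baz"))
```

Keeping the sentinel separate from `None` matters because a field that is explicitly null and a field that is missing are different results.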

The regex engine is JavaScript’s regex engine. The $match(), $replace(), and $split() functions use JavaScript regex semantics, including lookahead, lookbehind, named capture groups, and Unicode property escapes. Go’s standard regexp package and Rust’s regex crate both follow the RE2 approach, which deliberately excludes lookahead, lookbehind, and backreferences to guarantee linear-time matching. You can work around this by binding PCRE via FFI, documenting the divergence, or implementing a JavaScript-compatible evaluator, but none of these options is trivial.
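The mismatch is not limited to RE2-style engines. As an illustration in Python (a different engine from the Go and Rust ones named above), the sketch below probes which JavaScript-syntax patterns the standard `re` module will compile. The pattern list and the `probe` helper are made up for this example.

```python
import re

# JavaScript-syntax patterns a JSONata port would need to accept (illustrative).
JS_PATTERNS = {
    "lookahead": r"\d+(?= USD)",
    "lookbehind": r"(?<=\$)\d+",
    "named group, JS syntax": r"(?<year>\d{4})",
    "unicode property escape": r"\p{L}+",
}


def probe(patterns):
    """Return {feature: bool} for whether the host engine compiles each pattern."""
    support = {}
    for feature, pattern in patterns.items():
        try:
            re.compile(pattern)
            support[feature] = True
        except re.error:
            support[feature] = False
    return support


support = probe(JS_PATTERNS)
for feature, ok in support.items():
    print(f"{feature}: {'accepted' if ok else 'rejected'}")
```

Python's `re` is a backtracking engine, so it accepts lookaround, but it spells named groups as `(?P<name>...)` and has no `\p{...}` escapes. Running a probe like this against the target engine early shows which of the three workarounds you will actually need.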

These three semantic differences compound. Sequence semantics interact with undefined propagation. Both interact with how functions are called. A port that handles each in isolation may still fail when they combine in ways not covered by casual testing. This is why human-driven ports tend to stall: you get to 90% and then discover the remaining edge cases require either deep knowledge of JavaScript internals or a comprehensive way to enumerate what you have not gotten right yet.

What the Conformance Suite Changes

The JSONata conformance test suite exists as a separate repository specifically for this purpose: enabling third-party implementations to verify behavioral compatibility. Each test is a self-contained JSON document with an expression, input data, and expected output. The suite covers hundreds of cases organized by feature area, including the edge behaviors that informal testing misses.

This is the component that transforms the AI-assisted port from a bet into a methodology. Without it, running a model over the source JavaScript and generating a Go or Rust translation produces something with unknown correctness. You have output that compiles and passes whatever tests you wrote, but you have no way to know what you missed.

With the conformance suite, every translation attempt immediately produces a pass rate. Every failure is a precise, actionable bug report: expression, input, expected output, actual output. You can feed that directly back to the model alongside the relevant section of translated code and ask it to fix the discrepancy. The feedback loop is tight enough to iterate on within hours.
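The loop described above can be sketched in a few lines of Python. This is a sketch under assumptions: the case keys `expr`, `data`, and `expected` are illustrative names for the three parts of a test, not the suite's actual schema, and `toy_evaluate` is a deliberately incomplete stand-in for a port in progress.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Failure:
    """One failing case: everything needed for a bug report."""
    expr: str
    data: Any
    expected: Any
    actual: Any


def run_suite(
    cases: List[dict], evaluate: Callable[[str, Any], Any]
) -> Tuple[float, List[Failure]]:
    """Run every case through `evaluate`; return (pass rate, failures)."""
    failures = []
    for case in cases:
        try:
            actual = evaluate(case["expr"], case["data"])
        except Exception as exc:  # a crash in the port counts as a failed case
            actual = f"raised {type(exc).__name__}: {exc}"
        if actual != case["expected"]:
            failures.append(
                Failure(case["expr"], case["data"], case["expected"], actual)
            )
    pass_rate = (len(cases) - len(failures)) / len(cases) if cases else 1.0
    return pass_rate, failures


def toy_evaluate(expr: str, data: Any) -> Any:
    """Incomplete stand-in for a port: handles dotted field paths only."""
    current = data
    for key in expr.split("."):
        current = current[key]  # raises KeyError on a missing field
    return current


SAMPLE_CASES = [
    {"expr": "a.b", "data": {"a": {"b": 1}}, "expected": 1},
    {"expr": "a.b.c", "data": {"a": {"b": {"c": "x"}}}, "expected": "x"},
    # Missing path: None stands in for an empty result in this sketch.
    {"expr": "a.missing", "data": {"a": {"b": 1}}, "expected": None},
]

pass_rate, failures = run_suite(SAMPLE_CASES, toy_evaluate)
for failure in failures:
    print(f"{failure.expr!r}: expected {failure.expected!r}, got {failure.actual!r}")
```

Each `Failure` already carries the four fields of the bug report, so it can be handed to the model together with the relevant translated code. A real harness would also need a stricter comparison than Python's `!=`, which treats `1`, `1.0`, and `True` as equal.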

The approximate progression Vine described: translating the lexer and parser gets you to around a third of tests passing. Fixing evaluator edge cases pushes that above two thirds. Working through built-in function failures gets you to the low nineties. The sequence semantics and context edge cases take you the rest of the way. The last few percent tend to be JavaScript-specific Unicode normalization and numeric precision decisions where you may choose to intentionally diverge and document it.
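Tracking the pass rate per feature area, rather than as one number, shows which stage of that progression you are in. A small Python sketch, assuming each result is recorded as a hypothetical `(group, passed)` pair:

```python
from collections import defaultdict


def pass_rate_by_group(results):
    """Map each group name to the fraction of its cases that passed.

    `results` is an iterable of (group_name, passed) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # group -> [passed, total]
    for group, passed in results:
        totals[group][1] += 1
        if passed:
            totals[group][0] += 1
    return {group: ok / total for group, (ok, total) in totals.items()}


# Made-up results, purely to show the shape of the output.
example = [
    ("parser", True),
    ("parser", True),
    ("regex", False),
    ("regex", True),
]
rates = pass_rate_by_group(example)
```

A group that stays stuck while the others climb points at the semantic area that still needs attention.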

Why JSONata Is an Ideal Target for This Approach

Not every library port benefits equally from AI assistance, and it is worth being explicit about what made this one tractable.

JSONata’s entire implementation is self-contained with no external dependencies. The full source fits comfortably within a large model’s context window. The architecture is classical: a Pratt parser feeding into a tree-walking evaluator. Models have extensive training data on both patterns. The task is translation, not design; the architecture is proven and the model is converting it, not inventing it.
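For readers who have not met the pattern, here is the shape of that architecture in miniature: a Pratt parser for integer arithmetic feeding a tree-walking evaluator. It is a Python sketch written for this article, not code from JSONata; the grammar, node tuples, and binding powers are invented for the example, and the real language is far larger.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*()])")
# Binding power per infix operator: higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20}
PREFIX_POWER = 30  # unary minus binds tighter than any infix operator


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected input at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse(tokens, min_power=0):
    """Parse a prefix term, then absorb infix operators binding above min_power."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token.isdigit():
        left = ("num", int(token))
    elif token == "(":
        left = parse(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("expected ')'")
    elif token == "-":
        left = ("neg", parse(tokens, PREFIX_POWER))
    else:
        raise SyntaxError(f"unexpected token {token!r}")
    while tokens and BINDING_POWER.get(tokens[0], 0) > min_power:
        op = tokens.pop(0)
        # Recursing with the operator's own power makes it left-associative.
        right = parse(tokens, BINDING_POWER[op])
        left = (op, left, right)
    return left


def evaluate(node):
    """Tree-walking evaluator: recurse over the AST, dispatching on node kind."""
    kind = node[0]
    if kind == "num":
        return node[1]
    if kind == "neg":
        return -evaluate(node[1])
    left, right = evaluate(node[1]), evaluate(node[2])
    if kind == "+":
        return left + right
    if kind == "-":
        return left - right
    return left * right


def calc(text):
    tokens = tokenize(text)
    tree = parse(tokens)
    if tokens:
        raise SyntaxError(f"unexpected token {tokens[0]!r}")
    return evaluate(tree)
```

Both halves are short, regular, and mechanical to carry from one language to another, which is part of why translating this kind of code is a comparatively safe task for a model.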

Compare this to a storage engine, a network stack, or a distributed systems component. Those have implicit behavioral dependencies on their runtime environment, assumptions about error handling that are encoded in operational practice rather than tests, and semantics that only become visible through production incident history. An AI-assisted port of PostgreSQL storage management would not produce the same outcome as this.

The conformance suite also matters more for a language with complex semantics than for a pure computation library. If you are porting a library that does arithmetic or string manipulation, unit tests of representative inputs are probably sufficient. JSONata’s sequence model and undefined propagation are subtle enough that you need exhaustive specification coverage to know your port is complete.

The Design Implication

JSONata was designed as a JavaScript library for the JavaScript ecosystem. Its specification and its reference implementation are deeply entangled with JavaScript. The fact that a conformance suite exists at all, maintained separately as a cross-implementation verification tool, is what made this port tractable. Many libraries in this situation never build that infrastructure.

The lesson for library authors is not subtle: if you design a language or a complex behavioral library and you want non-reference implementations to be possible, a language-agnostic conformance suite is the investment that makes it happen. The alternative is that your library remains locked to whichever language you wrote it in, and every team that needs it in a different context pays the sidecar tax forever.

JavaScript-originated libraries in similar positions include JSON Schema validators like Ajv, Markdown parsers like marked, and various DSL interpreters that grew from Node.js tooling. Some of these have conformance suites. Many do not. The difference shows up every time a team needs the library outside the JavaScript ecosystem.

The Economic Threshold Shift

The broader observation, which is what Willison and others are pointing at, is that AI-assisted porting has changed the cost-benefit calculation for remediation work like this. Previously, such a port took months of careful engineering by someone with deep knowledge of both the source and target languages, which put it out of reach for most teams. Even at $500K/year in operational cost, where the math arguably worked, the project tended to sit in the backlog indefinitely because the upfront engineering investment looked too large.

A port that takes one day, even with verification and integration work extending over another week, looks completely different on a cost-benefit spreadsheet. Projects that have been deferred for years because the economics never worked become viable. The threshold for what saves enough money to justify the remediation effort has shifted down significantly.

This does not generalize to arbitrary rewrites. The JSONata port worked because of a very specific combination: bounded scope, classical algorithms, thorough documentation, and most importantly a comprehensive conformance suite that functioned as an automated correctness oracle. That combination is not common. But it is more common than it used to be, and teams building complex libraries should think harder about whether their own work could one day become as portable as JSONata just became.