Running Untrusted JavaScript: Why Language-Level Sandboxing Keeps Breaking

Simon Willison published a roundup of JavaScript sandboxing research this past week, and it’s worth sitting with for a moment, because the picture it paints is not flattering. After decades of JavaScript runtimes, a mature spec, and serious engineering effort from well-funded organizations, running untrusted JavaScript safely remains genuinely hard. Not “hard like implementing a B-tree” hard. Hard in the sense that the language’s design actively works against the goal.

This post is an attempt to explain why, using the concrete failure modes that have defined the space, and to lay out what the approaches that do work have in common.

The Fundamental Problem: Shared Primordials

JavaScript is a prototype-based language where almost everything inherits from a small set of built-in objects: Object, Function, Array, Error, and so on. These are called primordials. In a standard JavaScript runtime, all code running in the same agent shares these primordials. That sharing is the root of the sandboxing problem.
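
To make the sharing concrete, here is a minimal sketch of one piece of code changing another's behavior through a shared primordial. The names isAllowed and untrustedPlugin are hypothetical, and the "plugin" is just a function in the same realm; the point is only that it never touches the host's check directly.

```javascript
// Host code: an allowlist check that relies on a built-in method.
const allowedCommands = ['status', 'version'];
function isAllowed(command) {
  return allowedCommands.includes(command);
}

// Hypothetical untrusted plugin running in the same realm. It never touches
// isAllowed, only the Array.prototype that every array inherits from.
function untrustedPlugin() {
  Array.prototype.includes = function () {
    return true;
  };
}

const originalIncludes = Array.prototype.includes;

const before = isAllowed('rm -rf /'); // rejected by the allowlist
untrustedPlugin();
const after = isAllowed('rm -rf /'); // now accepted, the check was subverted

// Put the real method back so the rest of the program is unaffected.
Array.prototype.includes = originalIncludes;
```

Nothing here is exotic. Any code that can reach a shared prototype can rewrite the behavior of every object that inherits from it.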

Consider the classic escape:

const escape = ({}).constructor.constructor;
escape('return process')();

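This works in any sandbox that shares a realm with its host: ({}) creates a plain object, .constructor retrieves Object, and .constructor on Object retrieves the realm’s real Function constructor, which no amount of variable shadowing can hide. Once you have it, you can compile code that runs in the global scope, outside whatever names the sandbox tried to restrict. (In a separate context such as node:vm, the same walk starts from any host object the sandbox can reach.) The sandbox never sees it happen.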
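
A small sketch of that escape against a deliberately naive scope-based sandbox follows. The naiveSandbox helper is hypothetical and stands in for any containment that works by shadowing names; it assumes a Node.js host, where process is a global.

```javascript
// A naive scope-based "sandbox": shadow dangerous names with undefined
// parameters and hope the code cannot reach them any other way.
function naiveSandbox(code) {
  const run = new Function(
    'process',
    'require',
    'globalThis',
    `"use strict"; return (${code});`
  );
  return run(undefined, undefined, undefined);
}

// Direct access sees only the shadowed, undefined parameter.
const direct = naiveSandbox('typeof process');

// The prototype chain still leads to the real Function constructor, and code
// compiled by it runs in the global scope, where process is visible again.
const leaked = naiveSandbox(
  `({}).constructor.constructor('return typeof process')()`
);
```

Under Node.js, direct is 'undefined' while leaked is 'object': the shadowing held, and it did not matter.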

The list of escape variants is long and keeps growing. Proxy traps, generator protocol leaks, Symbol.unscopables, Symbol.iterator, Symbol.toPrimitive on built-in objects: every protocol that JavaScript uses to make its built-ins extensible is also a potential tunnel out of any scope-based containment.

vm2: A Case Study in Language-Level Failure

vm2 was for years the most widely used Node.js sandbox. It wrapped Node’s built-in node:vm module, which runs code in a new V8 context, and added extensive prototype-level shielding to prevent escape. The Node.js docs have always been explicit that node:vm is not a security boundary, but vm2 tried to make it one anyway.
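
Why node:vm alone is not a boundary can be sketched in a few lines. This is an illustration using only Node’s built-in vm module, not vm2’s code; the hostData object is a hypothetical stand-in for anything the host hands to the sandbox.

```javascript
// CommonJS form; in an ES module use `import vm from 'node:vm'` instead.
const vm = require('node:vm');

// hostData is created in the host realm and then exposed to the context.
const context = vm.createContext({ hostData: { user: 'alice' } });

// The context has its own global object, so process is simply not there.
const insideView = vm.runInContext('typeof process', context);

// It also has its own primordials: its arrays are not the host's arrays.
const sameArray = vm.runInContext('[]', context) instanceof Array;

// But any host object carries the host's prototype chain with it, and that
// chain ends at the host's Function constructor.
const escaped = vm.runInContext(
  `hostData.constructor.constructor('return typeof process')()`,
  context
);
```

The first two results look like isolation ('undefined' and false). The third is 'object', because the function was compiled by the host realm’s Function constructor. This is the kind of gap vm2’s wrappers had to close for every object that crossed the boundary.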

In April 2023, the maintainers received a report for CVE-2023-29199, a critical (CVSS 9.8) sandbox escape via the exception handling mechanism. Crafted code could throw an exception object that, when caught and inspected by vm2’s own error-handling wrapper, gave access to the outer Function constructor. A patch landed. Then CVE-2023-30547 arrived within weeks: another escape, different vector, same severity.

Further escapes followed through July 2023, at which point the maintainers discontinued the project and pointed users to isolated-vm, effectively conceding that the problem cannot be solved at the pure-language level. They were right. The attack surface is not a bug in vm2’s implementation; it is the prototype chain itself.

What the TC39 Realms Proposal Actually Does

The Realms proposal is often cited in sandboxing discussions, and the confusion it generates is worth clearing up. In the proposal’s original design, a Realm gives you a separate global object and a fresh set of primordials:

const r = new Realm();
r.evaluate('globalThis') !== globalThis; // true

That sounds like isolation. It is not. Both the host realm and the Realm share the same agent, the same process, and the same event loop. Objects crossing the realm boundary via return values or shared references expose host-realm constructors immediately. The proposal’s own documentation explicitly states it is not a security primitive; it is designed for module system isolation and polyfill sandboxing, not for running untrusted code.

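The proposal was later reworked into the ShadowRealm API, which reached Stage 3 at TC39 in 2021 and has moved slowly since. ShadowRealm allows only primitives and wrapped functions across the boundary, which closes the direct object leak, but the code still runs in the host’s agent and on its event loop, so an infinite loop or runaway allocation inside it still takes the host down. It is useful for things like isolating test environments or running user-authored configuration scripts where you trust the author but want a clean global. It is the wrong tool for untrusted code.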

SES and Hardened JavaScript: The Language-Level Approach That Works

Agoric’s SES (Secure ECMAScript) takes a different approach. Rather than trying to intercept escapes after the fact, SES prevents them by freezing all primordials before any untrusted code runs:

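import 'ses';
lockdown(); // freezes Object, Function, Array, and all other primordials

// safeFetch is a host-defined wrapper (not shown) that enforces your own policy.
// The first argument lists the globals the compartment is allowed to see.
const c = new Compartment({ fetch: safeFetch });
c.evaluate(`fetch('https://example.com/api')`);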

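After lockdown(), Object.prototype is frozen. Function.prototype is frozen. Nothing in user code can modify them or use them to navigate to something mutable. Freezing alone would not stop the constructor.constructor walk, so lockdown() also replaces the Function constructor reachable through the prototype chain with an inert version that throws, and each compartment gets its own eval and Function that evaluate inside that compartment. The Compartment API then provides a controlled evaluation environment where you explicitly choose what globals to expose.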

This works because it changes the invariant: instead of trying to catch all escapes (impossible), it removes the mutable state that makes escapes useful. The approach has real costs: it is a shim, not a native feature; lockdown() must run before any other code in the process; and it breaks libraries that mutate primordials at startup (which is, unfortunately, a lot of them).
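
What frozen primordials mean in practice can be shown without SES at all. The sketch below is a simplified illustration, not what lockdown() actually does: it freezes three constructors by hand, whereas the real shim covers far more of the intrinsics. The tryPollute helper is hypothetical.

```javascript
// Simplified stand-in for lockdown(): freeze a few primordials by hand.
for (const ctor of [Object, Function, Array]) {
  Object.freeze(ctor);
  Object.freeze(ctor.prototype);
}

// Attempt a classic prototype pollution and report what happened.
function tryPollute() {
  'use strict'; // in strict mode, writes to a frozen object throw
  try {
    Object.prototype.isAdmin = true;
    return 'polluted';
  } catch (err) {
    return err.name;
  }
}

const outcome = tryPollute();
const visible = 'isAdmin' in {}; // did the write leak onto other objects?
```

Here outcome is 'TypeError' and visible is false. The write never lands, so there is nothing for later code to inherit. The same freeze is also what breaks libraries that patch built-ins at startup.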

Agoric builds their entire Endo framework on SES for capability-secure JavaScript distribution. The design is sound, but adoption requires fully committing to the model.

V8 Isolates: The Approach That Scales

Cloudflare Workers takes the position that language-level isolation is too fragile and uses V8 Isolates instead. A V8 Isolate is an independent instance of the V8 engine with its own heap, garbage collector, and primordials. Code in one Isolate genuinely cannot access objects in another Isolate; there is no shared prototype chain.

Cloudflare’s workerd runtime (open-sourced in 2022) manages thousands of Isolates per process. Cloudflare has put the per-Isolate memory overhead at a few megabytes, compared to tens of megabytes for a separate process running its own copy of a language runtime, and more again for a container or VM. This is what allows Cloudflare to run Workers at the network edge at scale.

The npm package isolated-vm brings the same primitive to Node.js applications via native bindings. It gives you proper Isolate-based sandboxing without requiring you to deploy to Cloudflare:

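import ivm from 'isolated-vm';

// memoryLimit is in megabytes
const isolate = new ivm.Isolate({ memoryLimit: 128 });
const context = await isolate.createContext();
const jail = context.global; // a Reference to the isolate's global object

// Expose one host function, wrapped in a Reference rather than passed live
await jail.set('log', new ivm.Reference(console.log));
await context.eval(`log.applySync(undefined, ['hello from sandbox'])`);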

Crossing the Isolate boundary is explicit and typed: you pass Reference objects, not live JavaScript values. That boundary crossing is the security model. It is clunky compared to just calling a function, but that friction is intentional.
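
The "copies, not live values" half of that model can be illustrated with structuredClone, which is built into Node.js. This is only an analogy: the copyAcross helper is hypothetical, it is not isolated-vm’s API, and inside a single realm it provides no security on its own. It just shows what the receiving side can and cannot do with a copy.

```javascript
// Stand-in for a boundary crossing: only a structured copy gets through.
function copyAcross(value) {
  return structuredClone(value);
}

const hostConfig = { limits: { cpuMs: 50 } };
const received = copyAcross(hostConfig);

// The receiving side mutates what it was given...
received.limits.cpuMs = 9999;

// ...but the host's object is a different object and is unchanged.
const hostValueAfter = hostConfig.limits.cpuMs;

// Functions cannot be copied at all, so no live behavior crosses by accident.
let functionError = null;
try {
  copyAcross({ run() {} });
} catch (err) {
  functionError = err.name;
}
```

The host’s value stays at 50, and the attempt to send a function fails with a DataCloneError. An explicit Reference is the opt-in for the cases where the host really does want to expose something callable.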

QuickJS and the Embedding Approach

QuickJS, Fabrice Bellard’s compact JavaScript engine, is worth mentioning as a different axis of the tradeoff space. When embedded in a C or Rust application, QuickJS exposes only what the host explicitly provides through its C API. There is no process, no require, no file system access unless you wire those up yourself. The isolation boundary is the FFI layer.

This is how many game engines and plugin systems approach the problem: run a minimal JS engine, expose a narrow API, and never give user code access to the host process’s standard library. quickjs-emscripten brings this to browser and Node.js environments by compiling QuickJS to WebAssembly.

The Pattern That Emerges

Looking across all the approaches that have held up under scrutiny, the pattern is consistent: working sandboxes enforce the boundary below the language, in the engine, the WebAssembly runtime, or the operating system, rather than in JavaScript itself. V8 Isolates use V8’s internal heap separation. QuickJS embedded in WASM uses WASM’s linear memory model. Process isolation uses the OS. SES is the exception, and it works because it changes the language state rather than trying to police runtime access.

Pure scope-based containment in a standard JS runtime does not work. The prototype chain is too deeply wired into the language’s execution model. Every pure-JS sandbox that has tried to solve this has eventually encountered a CVE that proves the point.

The research Simon links to maps the same terrain from a more formal angle, cataloging attack vectors and evaluating mitigations. What it confirms is that the ecosystem has largely converged on the right conclusion: if you need to run untrusted JavaScript, you need a real isolation boundary. The question is which one fits your deployment constraints, not whether you can avoid it.