
The vm2 Collapse and What It Proved About JavaScript Sandboxing

Source: simonwillison

The need to execute untrusted JavaScript safely has gone from a niche problem to an everyday one. LLM-powered tools now routinely generate and run code in loops; plugin systems embed user-supplied scripts into larger applications; browser-based notebooks evaluate arbitrary expressions server-side. Simon Willison’s recent roundup of JavaScript sandboxing research is a useful moment to take stock of where this problem actually stands, because the landscape changed significantly with a public post-mortem in 2023 that most people in the ecosystem have still not fully absorbed.

The vm2 Post-Mortem

For years, vm2 was the default answer for running untrusted JavaScript in Node.js. It wrapped Node’s built-in vm module with a Proxy-based interception layer designed to prevent untrusted code from reaching host objects. It was downloaded millions of times and embedded in countless production systems.

In September 2022, Oxeye Security disclosed CVE-2022-36067, a CVSS 10.0 sandbox escape they named “Sandbreak.” The mechanism was Error.prepareStackTrace: Node.js allows you to override this function to receive a raw CallSite array when a stack trace is formatted. Inside a vm2 sandbox, triggering an exception caused the override to run with a CallSite whose .getThis() method returned the actual host context rather than the sandboxed proxy. From there, process.mainModule.require was accessible and the sandbox was gone.
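The hook itself is easy to observe in plain Node.js. The following is a minimal sketch of the mechanism only, not the Sandbreak exploit: it installs an Error.prepareStackTrace override, reads a stack trace, and inspects the structured CallSite objects V8 hands to the override. The function name thrower is made up for the demonstration.

```javascript
// Demonstrates the V8 stack-trace hook only; this is not an exploit.
const original = Error.prepareStackTrace;
let captured = null;

Error.prepareStackTrace = (error, callSites) => {
  captured = callSites; // structured CallSite objects, not strings
  return 'custom trace'; // whatever is returned becomes error.stack
};

function thrower() {
  // Reading .stack while the override is installed runs the hook.
  return new Error('boom').stack;
}

const trace = thrower();
Error.prepareStackTrace = original; // restore the default formatter

console.log(trace); // custom trace
console.log(typeof captured[0].getThis); // function
console.log(captured[0].getFunctionName()); // thrower
```

Each CallSite exposes accessors such as getThis() and getFunction(), which is why a frame belonging to host code is dangerous to hand to sandboxed code: the accessors return live objects rather than descriptions of them.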

The vm2 maintainer patched that escape. Then came CVE-2023-29017 in April 2023, another critical-severity escape: vm2 did not properly handle host objects passed to Error.prepareStackTrace when an async error went unhandled, so the same stack-trace hook leaked host objects again through a different path. The same month brought CVE-2023-29199, a flaw in the source transformer’s exception-sanitization logic that let code bypass handleException() and obtain unsanitized host exceptions, and then CVE-2023-30547, which raised an unsanitized host exception inside handleException() itself and so defeated the previous fix. In July 2023, with further escapes still unpatched, the maintainer discontinued the project, stating that the library contains critical security issues and should not be used in production.

What these CVEs collectively showed is that you cannot build a security boundary at the JavaScript layer, above V8, using proxy objects and wrapper functions. V8’s internal machinery has too many callback sites: stack trace hooks, microtask queue callbacks, Symbol.* hooks, Proxy handler edge cases with exotic objects, arguments in strict versus non-strict mode. Fixing one escape leaves the others. The failure was not a matter of missed edge cases; it was the architectural choice of wrapping a non-boundary as if it were one.

Primordial Poisoning: The Root Problem

To understand why same-realm sandboxing fails at a deeper level, you need to understand primordial pollution. JavaScript’s prototype chain is a shared mutable data structure. Every plain object {} in a realm inherits from Object.prototype. Every function from Function.prototype. Every array from Array.prototype. These objects exist once per realm and are shared by all code running in that realm.

If untrusted code runs in the same realm as trusted code, and if the shared primordials are mutable, the untrusted code can modify them before the trusted code runs:

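// Attacker code, runs in the same realm.
// `exfiltrate` stands in for whatever channel the attacker uses to ship data out.
const originalPush = Array.prototype.push;
Array.prototype.push = function (...args) {
  exfiltrate(args); // steal all data pushed to any array
  // Delegate to the saved original; calling Array.prototype.push here would recurse forever.
  return originalPush.apply(this, args);
};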

Now every push call anywhere in the process delivers data to the attacker. The attack operates at the level of prototype lookup at runtime, not at the level of function call interception. There is no way to detect or prevent this after the fact; you have to freeze the prototypes before any untrusted code evaluates.

Beyond the obvious Array.prototype.push, there are subtler channels: Symbol.toPrimitive hooks into type coercion, Symbol.iterator intercepts for...of loops, spread, and array destructuring, Symbol.hasInstance intercepts instanceof, and Symbol.species on constructors such as Array and RegExp controls which constructor methods like .map() and .filter() use to build their return value. Each of these is a path through which mutations by untrusted code affect trusted code that runs afterward.
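To make one of these subtler channels concrete, here is a small sketch of Symbol.iterator poisoning. The names seen and trustedSum are invented for the example, and the "untrusted" and "trusted" halves are simply two parts of the same script standing in for code from different authors sharing a realm.

```javascript
'use strict';

const seen = [];
const originalIterator = Array.prototype[Symbol.iterator];

// "Untrusted" code replaces the iterator that for...of, spread, and
// array destructuring all look up at runtime.
Array.prototype[Symbol.iterator] = function* () {
  for (let i = 0; i < this.length; i++) {
    seen.push(this[i]); // observe every element the victim iterates
    yield this[i];
  }
};

// "Trusted" code that never calls anything the attacker supplied.
function trustedSum(values) {
  let total = 0;
  for (const v of values) total += v;
  return total;
}

const total = trustedSum([10, 20, 12]);
Array.prototype[Symbol.iterator] = originalIterator; // undo the poisoning

console.log(total); // 42
console.log(seen); // [ 10, 20, 12 ]
```

The trusted function computes the right answer and has no way to notice that every value it touched was also recorded elsewhere.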

This is the problem that Agoric’s Hardened JavaScript / SES solves with lockdown(). The call transitively freezes every primordial object in the ECMAScript specification: Object.prototype and all its methods, Function.prototype and all its methods, every array method, every string method, every error prototype, every iterator and generator prototype, the Promise prototype, Math, JSON, Reflect, Proxy, and every Symbol.* hook. After lockdown() runs, any attempt to mutate these fails, with a TypeError in strict-mode code (which includes every ES module). The prototype chain becomes read-only for the lifetime of the process.

import 'ses';

lockdown();

// This now throws:
Array.prototype.push = function() {};
// TypeError: Cannot assign to read only property 'push'
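The difference between strict and sloppy code is worth seeing, because a frozen primordial only produces a TypeError in strict mode. The sketch below does not use SES at all: it freezes Array.prototype by hand with Object.freeze, and it does so inside a throwaway realm created with node:vm so the host process is left untouched. node:vm serves here purely as a realm factory, not as a security boundary, and tryPoison is a name invented for the example.

```javascript
const vm = require('node:vm');

// Runs a poisoning attempt against a frozen Array.prototype in a fresh realm.
function tryPoison(strict) {
  const source = `
    ${strict ? "'use strict';" : ''}
    const original = Array.prototype.push;
    Object.freeze(Array.prototype);
    let threw = false;
    try {
      Array.prototype.push = function () {};
    } catch (err) {
      threw = err instanceof TypeError;
    }
    ({ threw, stillOriginal: Array.prototype.push === original });
  `;
  // The script's completion value (the final object literal) is returned.
  return vm.runInNewContext(source);
}

const strictOutcome = tryPoison(true);
const sloppyOutcome = tryPoison(false);

console.log(strictOutcome.threw, strictOutcome.stillOriginal); // true true
console.log(sloppyOutcome.threw, sloppyOutcome.stillOriginal); // false true
```

In both cases the mutation is blocked; only the strict-mode attempt is reported to the code that tried it.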

lockdown() solves the poisoning problem within a realm. To give untrusted code its own global scope, SES provides the Compartment API, which is also the subject of a TC39 proposal, currently at Stage 1:

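import 'ses';

lockdown();

// SES's long-standing constructor form takes the endowments (extra globals)
// as its first argument. Recent SES releases also accept an options bag with
// a `globals` key, but only when `__options__: true` is set alongside it.
const compartment = new Compartment({
  console: { log: console.log },
});

// Math is one of the shared, frozen intrinsics, so it does not need to be passed in.
const result = compartment.evaluate(`Math.sqrt(16)`);
// result === 4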

The Compartment gets its own eval and its own Function constructor scoped to that compartment. Beyond the shared, frozen intrinsics, it sees only the globals you explicitly provide. Combined with lockdown, this gives you defensible language-level isolation in a single V8 process. The limitation worth stating clearly: this is language-level, not process-level. A V8 engine bug can still escape this boundary, and timing side-channels between compartments sharing a CPU core remain a theoretical concern. The threat model is untrusted-but-not-malicious code, or code constrained to language-level attacks.

Real Heap Boundaries

For stronger guarantees, you need a real heap boundary: two separate V8 heaps with no shared memory.

isolated-vm is the practical answer for Node.js. It wraps the V8 C++ Isolate API and exposes it as a Node.js native addon. Each isolate has its own garbage-collected heap; objects cannot cross the boundary by reference. Data passes via structured clone (ExternalCopy) or explicit handles (Reference) that marshal through the embedding API:

const ivm = require('isolated-vm');

// Top-level await is not available in a CommonJS module, so wrap the calls.
async function main() {
  // Create an isolate with a 128MB memory cap
  const isolate = new ivm.Isolate({ memoryLimit: 128 });
  const context = await isolate.createContext();
  const jail = context.global;

  // Inject a controlled API into the sandbox
  await jail.set('log', new ivm.Reference((...args) => {
    console.log('[sandbox]', ...args);
  }));

  const script = await isolate.compileScript(
    `log.applySync(undefined, ['hello']); 1 + 1`
  );
  const result = await script.run(context);
  // result === 2

  isolate.dispose(); // hard-kill the isolate
  return result;
}

main().catch(console.error);

The heap boundary is enforced at the V8 engine level. There is no prototype chain connection between the host and the sandbox. The overhead of crossing the boundary (calling a Reference) is roughly in the microseconds range, which matters if you design a chatty host-sandbox protocol, but code running within the isolate JITs normally.
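What "copy, not reference" means in practice can be shown without isolated-vm. This sketch uses Node’s built-in structuredClone (available in Node 17 and later) as a stand-in for the copy step; it is not isolated-vm’s ExternalCopy, and the hostState object is invented for the example.

```javascript
// Host-side data that we want to hand to a sandbox.
const hostState = { user: 'alice', tags: ['a', 'b'], nested: { count: 1 } };

// A structured clone is a deep copy with no identity link to the original.
const copy = structuredClone(hostState);
copy.nested.count = 99;
copy.tags.push('c');

console.log(hostState.nested.count); // 1
console.log(hostState.tags.length); // 2

// Functions carry behavior and closures, so they cannot be cloned at all.
let cloneError = null;
try {
  structuredClone({ callback() {} });
} catch (err) {
  cloneError = err.name;
}
console.log(cloneError); // DataCloneError
```

Mutations on the copy never reach the host object, and anything that cannot be serialized has to cross through an explicit handle instead, which is the point at which the host decides what the sandbox may call.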

For environments where a native addon is impractical, quickjs-emscripten takes a different approach: it compiles Fabrice Bellard’s QuickJS engine to WebAssembly via Emscripten. The untrusted JavaScript runs inside QuickJS’s WASM linear memory, completely separate from V8’s heap. There are no shared prototypes, no shared intrinsics, nothing in common unless you build an explicit bridge:

import { getQuickJS } from 'quickjs-emscripten';

const QuickJS = await getQuickJS();
const vm = QuickJS.newContext();

// 1MB heap limit
vm.runtime.setMemoryLimit(1024 * 1024);

// CPU timeout via interrupt handler
const deadline = Date.now() + 500;
vm.runtime.setInterruptHandler(() => Date.now() > deadline);

const result = vm.evalCode(`[1, 2, 3].reduce((a, b) => a + b, 0)`);
// evalCode returns { value } on success or { error } on failure.
// unwrapResult throws on failure (including an interrupt) and
// returns the value handle otherwise.
const handle = vm.unwrapResult(result);
const val = vm.dump(handle);
handle.dispose();
vm.dispose();
// val === 6

Note the explicit dispose() calls. QuickJS is reference-counted, and every handle returned by the quickjs-emscripten API must be manually freed. A handle that is never disposed keeps its value alive, so the memory leaks inside the WASM heap for as long as the runtime exists. The library provides scope helpers to manage this, but the API burden is real compared to isolated-vm.
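The idea behind a scope helper is simple enough to sketch in a few lines. The DisposalScope and FakeHandle classes below are hypothetical and written for this article; they are not the helpers quickjs-emscripten ships, only an illustration of the pattern of registering handles as they are created and releasing them all when the scope exits, even if evaluation throws.

```javascript
// Hypothetical stand-in for an engine handle: anything with dispose().
class FakeHandle {
  constructor(name, log) {
    this.name = name;
    this.log = log;
    this.alive = true;
  }
  dispose() {
    this.alive = false;
    this.log.push(this.name);
  }
}

class DisposalScope {
  #handles = [];

  // Register a handle and return it so calls can be chained inline.
  manage(handle) {
    this.#handles.push(handle);
    return handle;
  }

  // Release handles in reverse order of acquisition.
  dispose() {
    while (this.#handles.length > 0) this.#handles.pop().dispose();
  }

  // Run fn with a fresh scope and always clean up afterwards.
  static run(fn) {
    const scope = new DisposalScope();
    try {
      return fn(scope);
    } finally {
      scope.dispose();
    }
  }
}

const disposed = [];
let failure = null;
try {
  DisposalScope.run((scope) => {
    scope.manage(new FakeHandle('context', disposed));
    scope.manage(new FakeHandle('result', disposed));
    throw new Error('evaluation failed'); // simulate a sandbox error
  });
} catch (err) {
  failure = err.message;
}

console.log(failure); // evaluation failed
console.log(disposed); // [ 'result', 'context' ]
```

Releasing in reverse order matters when later handles depend on earlier ones, as a result value depends on the context that produced it.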

QuickJS has a meaningful performance penalty relative to V8. It is an interpreter with no JIT, so compute-intensive code inside the sandbox runs roughly an order of magnitude slower than in Node.js. For sandboxing LLM-generated glue code, configuration scripts, or formula evaluation, this is usually acceptable. For anything CPU-bound, it is not.

How Production Platforms Handle This at Scale

Cloudflare Workers and Deno Deploy both rely on V8 Isolates at the runtime level, and the economics of this choice are instructive. Cloudflare runs thousands of isolates in a single OS process per edge server. A V8 Isolate costs on the order of a few megabytes of memory versus tens of megabytes for a bare Node.js process; cold starts land around 5ms versus hundreds of milliseconds or more for a container. The open-sourced workerd runtime that powers Workers provides a restricted Web API surface with no Node.js fs, no process, no path to the OS.

Both platforms explicitly disable SharedArrayBuffer for user code. The reason is that SharedArrayBuffer combined with a Worker-based spinning thread lets you implement a high-resolution timer, which is the core primitive behind Spectre-style timing attacks between co-located isolates sharing a CPU core. This is the same reason browsers require Cross-Origin-Opener-Policy and Cross-Origin-Embedder-Policy headers before re-enabling SharedArrayBuffer. Security at the Isolate level does not automatically protect against timing side-channels at the microarchitecture level.
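The timer construction is short enough to show directly. This is a sketch of the clock primitive only, with no side-channel attack attached: a worker thread spins on a shared counter, and the main thread reads that counter before and after a piece of work. It assumes Node.js with worker_threads; ticksFor and the workload inside it are invented for the demonstration, and the tick counts vary by machine.

```javascript
const { Worker } = require('node:worker_threads');

const shared = new SharedArrayBuffer(4);
const counter = new Int32Array(shared);

// The worker does nothing but increment the shared counter as fast as it can.
const worker = new Worker(
  `
  const { workerData } = require('node:worker_threads');
  const counter = new Int32Array(workerData);
  Atomics.add(counter, 0, 1);
  Atomics.notify(counter, 0); // wake the main thread once ticking has begun
  for (;;) Atomics.add(counter, 0, 1);
  `,
  { eval: true, workerData: shared }
);

// Block until the worker is ticking (or give up after 5 seconds).
Atomics.wait(counter, 0, 0, 5000);

// Measure a workload in counter ticks instead of Date.now() milliseconds.
function ticksFor(iterations) {
  const start = Atomics.load(counter, 0);
  let sink = 0;
  for (let i = 0; i < iterations; i++) sink += i % 7;
  const end = Atomics.load(counter, 0);
  return { ticks: end - start, sink };
}

const short = ticksFor(1e6);
const long = ticksFor(5e7);
console.log('short workload ticks:', short.ticks);
console.log('long workload ticks:', long.ticks);

worker.terminate(); // stop the spinning thread so the process can exit
```

The counter advances far more often than a millisecond clock does, which is the property an attacker needs, and it is why removing shared memory takes this primitive away.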

The TC39 Compartment Proposal

The TC39 Compartment proposal, championed by Agoric’s Mark Miller and Kris Kowal, is currently at Stage 1. It would make new Compartment() a native language feature rather than a polyfill, with native module evaluation support via the companion ModuleSource proposal. Native compartments would not eliminate the need for something like lockdown() to prevent primordial poisoning; the two mechanisms address different layers. But native support would make the story more ergonomic and would allow engine-level optimization of compartment boundaries that a userland polyfill cannot achieve.

The path from Stage 1 to shipping in engines is long and full of implementation feedback, so relying on this for production sandboxing in the near term is premature. The Agoric SES polyfill implements the Compartment API today, but it carries the lockdown requirement and the compatibility trade-offs that come with permanently freezing primordials in the host process.

Picking the Right Tool

Node’s built-in vm module is documented as not a security mechanism. That documentation has been there for years; vm2’s post-mortem simply provided empirical confirmation at scale. Same-realm JavaScript sandboxing using proxies and wrappers does not work against a determined attacker.
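The classic demonstration of why the vm module is not a boundary takes one line of "untrusted" code. This sketch uses only node:vm; the variable name escaped is chosen for the example.

```javascript
const vm = require('node:vm');

// The sandbox object {} is created by host code, so its prototype chain leads
// to the host realm's Object and, from there, to the host's Function constructor.
const escaped = vm.runInNewContext(
  `this.constructor.constructor('return process')()`,
  {}
);

// The "sandboxed" code now holds the real host process object.
console.log(escaped === process); // true
```

Any host object reachable from inside the context carries a path back to the host’s Function constructor, and closing off every such path with proxies is exactly the task vm2 attempted.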

For running untrusted JavaScript in Node.js where security matters, isolated-vm gives you a real V8 Isolate boundary with manageable API overhead. For portability or environments without native addons, quickjs-emscripten gives you a WASM-separated heap at the cost of performance and an explicit memory management API. For language-level isolation within a controlled deployment, SES lockdown() plus Compartment is defensible, but you should be clear about what threat model it covers and what it does not.

The common thread across everything that works is that it crosses a genuine boundary: separate V8 heap, separate WASM linear memory, or frozen primordials before any untrusted code evaluates. Approaches that try to impose a boundary above the engine, after the fact, using JavaScript-layer interception, have a structural problem that no amount of patching resolves.
