Why Running Untrusted JavaScript Safely Is Still an Open Problem

Source: simonwillison

Running untrusted JavaScript in a controlled environment sounds like it should be a solved problem by now. V8 is mature. Node.js is everywhere. We have a decade of serverless infrastructure built on the premise of isolating JavaScript workloads. And yet the space of “run this code safely” remains genuinely difficult, with a graveyard of abandoned libraries and at least one widely-used sandboxing package that turned out to be fundamentally broken. Simon Willison’s recent roundup of JavaScript sandboxing research is worth reading as a current-state survey, but the topic has enough depth to warrant unpacking the design decisions and failure modes that make this hard in the first place.

The Node.js vm Module Is Not a Sandbox

The most common first attempt at JavaScript sandboxing in Node.js is the built-in vm module. You create a context, compile a script into it, and run it. The official documentation carries a prominent warning that most developers skim past: the vm module is not a security mechanism and should not be used to run untrusted code. Code running inside a vm context can still escape to the host process. The classic escape walks the constructor chain of the context’s global object:

const vm = require('vm');
const ctx = vm.createContext({});
vm.runInContext(
  `this.constructor.constructor('return process')().exit(1)`,
  ctx
);

That one-liner exits the host process. The vm module was designed for code isolation in the sense of variable scoping and module boundaries, not for security boundaries. Treating it as one has burned enough projects that it’s worth stating plainly: if you’re using vm to run arbitrary user code, you have a vulnerability.
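To see why the escape works without killing the host, a minimal sketch along these lines reads the host’s process ID from inside the context instead of calling exit. The variable names are illustrative; only the vm calls come from Node.js.

```javascript
const vm = require('node:vm');

// The sandbox object is created in the host realm, so its prototype chain
// still leads to the host's Object and, through it, the host's Function.
const ctx = vm.createContext({});

// Step 1: code inside the context can reach the host's Function constructor.
const leaked = vm.runInContext('this.constructor.constructor', ctx);
const reachesHostFunction = leaked === Function;

// Step 2: that constructor builds functions in the host realm, where
// `process` is visible even though the context never received it.
const leakedPid = vm.runInContext(
  "this.constructor.constructor('return process')().pid",
  ctx
);

console.log(reachesHostFunction, leakedPid === process.pid);
```

Nothing was passed into the context, yet the sandboxed code ends up holding a host object. Any other host capability is reachable the same way.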

vm2 and Why It Collapsed

For years, vm2 was the go-to answer when someone asked how to sandbox JavaScript in Node.js. It wrapped the vm module with a Proxy-based interception layer designed to catch and neutralize escape attempts. It had millions of weekly downloads and was used in competitive programming judges, online IDEs, and various code execution services.

In 2023, a series of critical sandbox escape vulnerabilities was disclosed. The maintainer’s response was to archive the repository and recommend against using vm2 for any security-sensitive purpose. The underlying problem was not a one-off bug but an architectural one: trying to build a security boundary on top of JavaScript’s own reflection and prototype machinery means you are defending against the full expressiveness of the language itself. Every new language feature, every Proxy edge case, every interaction between built-in objects and user code is a potential escape route. The attack surface scales with the language spec.

This is the core tension in JavaScript sandboxing. You either isolate at the VM level, below the language, or you accept that language-level sandboxing requires ongoing maintenance against an ever-expanding spec.
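The fragility of language-level interception can be shown with a toy example. The sketch below is a deliberately naive sandbox written for illustration (naiveSandbox is a hypothetical helper, and this is not how vm2 was implemented): it guards identifier lookups with a Proxy and is bypassed by a route the guard never considered.

```javascript
// A deliberately naive language-level "sandbox" (hypothetical, illustration
// only): a Proxy answers every identifier lookup, so globals such as
// `process` appear to be undefined.
function naiveSandbox(expression) {
  const scope = new Proxy({}, {
    has: () => true,                    // claim to own every name
    get: (target, key) => target[key],  // ...and resolve each to undefined
  });
  // `new Function` bodies are sloppy mode, so `with` is permitted here.
  return new Function('scope', `with (scope) { return (${expression}); }`)(scope);
}

// Direct access to the global is intercepted by the Proxy.
const direct = naiveSandbox('typeof process');

// Only identifier lookup was guarded. Property access on a function literal
// still reaches the real Function constructor, which evaluates its body in
// the global scope, outside the `with` block.
const bypass = naiveSandbox(
  "(function () {}).constructor('return typeof process')()"
);

console.log(direct, bypass);
```

Patching this one hole would mean intercepting function literals as well, then generators, async functions, and so on, which is the maintenance treadmill described above.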

V8 Isolates: The Right Abstraction Level

isolated-vm takes the lower-level approach. It exposes V8 Isolates directly to Node.js code. A V8 Isolate is V8’s fundamental unit of isolation: a separate heap, a separate garbage collector, no shared objects with other isolates. Communication between the host and the sandboxed code happens through explicit message passing with serialization, which means you can’t accidentally share a reference to a mutable object.

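const ivm = require('isolated-vm');

async function main() {
  // memoryLimit is in megabytes.
  const isolate = new ivm.Isolate({ memoryLimit: 32 });
  const context = await isolate.createContext();
  const jail = context.global;
  // Expose the isolate's own global as `global`; never hand it the host's ivm.
  await jail.set('global', jail.derefInto());
  const script = await isolate.compileScript('1 + 1');
  const result = await script.run(context, { timeout: 1000 });
  console.log(result);
}

main();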

V8 Isolates are also the primitive that Cloudflare Workers is built on, although Workers embeds V8 directly rather than going through isolated-vm. Each worker gets its own V8 Isolate, which provides strong memory isolation without the overhead of a full OS process boundary. The tradeoff is API surface: getting data in and out of an isolate requires explicit serialization, and any capability you give the sandboxed code (network access, timers, a custom API) has to be explicitly threaded through a Reference or ExternalCopy. That explicitness is the security model working correctly, but it makes building a rich execution environment significantly more work.
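The copy-only boundary can be illustrated without isolated-vm itself. The sketch below uses the built-in structuredClone as a stand-in for the copy step (an assumption made for illustration; isolated-vm has its own transfer API, which is not shown), and the object names are hypothetical.

```javascript
// Host-side data that the sandboxed code should be able to read.
const hostConfig = { limits: { maxItems: 10 }, tags: ['a'] };

// Crossing the boundary produces a deep copy, not a shared reference.
const received = structuredClone(hostConfig);

// Whatever the "sandboxed" side does to its copy stays on its side.
received.limits.maxItems = 9999;
received.tags.push('injected');
const hostUnchanged =
  hostConfig.limits.maxItems === 10 && hostConfig.tags.length === 1;

// Functions cannot be copied at all, so a capability never crosses by
// accident; it has to be wired through explicitly.
let functionRejected = false;
try {
  structuredClone({ readFile: () => 'secret' });
} catch (err) {
  functionRejected = err.name === 'DataCloneError';
}

console.log(hostUnchanged, functionRejected);
```

The second half is the reason rich environments take work: every function the guest may call needs its own deliberate bridge.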

Hardened JavaScript and the SES Compartment Proposal

A different approach comes from the Secure ECMAScript (SES) project, developed largely by people at Agoric. Rather than isolating at the VM level, SES works by hardening the intrinsics of the JavaScript environment and then providing a Compartment abstraction that restricts what code can access.

lockdown() is called once to freeze all the built-in objects that JavaScript programs rely on: Object.prototype, Array.prototype, and so on. Once frozen, untrusted code cannot modify shared state through the prototype chain. Compartments then provide lexically scoped environments where you control what global names are visible.

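import 'ses';
lockdown();

// The first argument lists the endowments: the only host-provided
// globals that code inside the compartment can see.
const c = new Compartment({
  Math,
  log: console.log,
});
c.evaluate(`Math.random()`);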
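What freezing buys you can be seen on a small scale without SES. The sketch below uses a hypothetical class, SharedIntrinsic, as a stand-in for a built-in such as Array.prototype, so that the demo does not freeze the real intrinsics of the running process. Object.freeze here only approximates what lockdown() does, which is considerably more thorough.

```javascript
// Stand-in for a shared intrinsic such as Array.prototype (hypothetical).
class SharedIntrinsic {
  describe() { return 'original'; }
}

// "Untrusted" code that tries to poison the shared prototype.
function poison() {
  'use strict'; // strict mode turns a rejected write into a TypeError
  SharedIntrinsic.prototype.describe = () => 'hijacked';
}

// Without hardening, the write succeeds and every caller sees it.
poison();
const beforeFreeze = new SharedIntrinsic().describe();

// Restore the method, then freeze the prototype.
SharedIntrinsic.prototype.describe = function () { return 'original'; };
Object.freeze(SharedIntrinsic.prototype);

let writeRejected = false;
try {
  poison();
} catch (err) {
  writeRejected = err instanceof TypeError;
}
const afterFreeze = new SharedIntrinsic().describe();

console.log(beforeFreeze, writeRejected, afterFreeze);
```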

The TC39 Compartment proposal aims to bring this into the language standard, which would give a native, spec-maintained compartment boundary. As of early 2026 the proposal is at Stage 1, meaning the concept is accepted for exploration but the API is not stable. The work being done in userland SES is effectively a prototype of what the spec might eventually look like.

The SES approach is well-suited to contexts where you want to run code that shares a JavaScript environment with the host, because you’re working with the language’s own semantics rather than isolating away from them. It’s the model that makes sense for smart contract execution (Agoric’s primary use case) where you want deterministic, auditable behavior, not just process separation.

Deno and the Permissions Model

Deno took a different angle by making the runtime itself capability-based. All I/O operations require explicit permission flags, and by default a Deno program can’t read files, access the network, or spawn subprocesses. This is a process-level sandbox rather than an in-process one: you’re running code in a full JavaScript runtime, but the runtime mediates all interactions with the outside world.

deno run --allow-net=api.example.com script.ts

For use cases where you control the execution environment, Deno’s model is ergonomic. For use cases where you’re embedding a runtime inside an existing Node.js application, it’s not directly applicable. Deno Deploy uses V8 Isolates underneath, similar to Cloudflare Workers.
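When the host is a Node.js service that shells out to Deno, the permission flags can be built from an allow-list rather than written by hand. This is a sketch under assumptions: buildDenoArgs and runInDeno are hypothetical helpers, a deno binary is assumed to be on PATH, and runInDeno is defined but not invoked here.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: turn an allow-list into `deno run` arguments.
// Anything not listed stays denied, and --no-prompt makes a missing
// permission an error instead of an interactive question.
function buildDenoArgs(script, { allowNet = [], allowRead = [] } = {}) {
  const args = ['run', '--no-prompt'];
  if (allowNet.length > 0) args.push(`--allow-net=${allowNet.join(',')}`);
  if (allowRead.length > 0) args.push(`--allow-read=${allowRead.join(',')}`);
  args.push(script);
  return args;
}

// Hypothetical runner: assumes `deno` is installed; not called in this sketch.
function runInDeno(script, permissions, timeoutMs = 5000) {
  return spawnSync('deno', buildDenoArgs(script, permissions), {
    encoding: 'utf8',
    timeout: timeoutMs,
  });
}

const args = buildDenoArgs('script.ts', { allowNet: ['api.example.com'] });
console.log(args.join(' '));
```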

QuickJS and Lightweight Engines

Another direction is using a JavaScript engine that was designed to be embedded: QuickJS, written by Fabrice Bellard. It’s a small, complete ES2023 implementation in C, designed to be compiled into another application. Bindings like quickjs-emscripten compile QuickJS to WebAssembly, so you can run it inside a browser or Node.js with near-total isolation, since WebAssembly’s memory model is sandboxed by construction.

import { getQuickJS } from 'quickjs-emscripten';
const QuickJS = await getQuickJS();
const result = QuickJS.evalCode('1 + 1');

The appeal is strong: if QuickJS is compiled to WASM, it runs in WASM’s linear memory model and cannot directly access the host’s memory or APIs. You get sandboxing through the WASM boundary rather than through JavaScript’s own semantics. The cost is performance, both startup latency and execution throughput, and a smaller set of available APIs since you’re building a fresh environment from scratch.
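The WASM boundary itself is easy to observe from Node.js. The sketch below loads a tiny hand-assembled module that exports an add function. It stands in for a compiled engine and is not QuickJS; the bytes and variable names are illustrative.

```javascript
// Hand-assembled WebAssembly module exporting add(a, b).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const wasmModule = new WebAssembly.Module(bytes);

// A module can only reach the host through imports it declares and the host
// chooses to supply. This one declares none, so it can compute and nothing else.
const requestedImports = WebAssembly.Module.imports(wasmModule);
const instance = new WebAssembly.Instance(wasmModule, {});
const sum = instance.exports.add(2, 3);

console.log(requestedImports.length, sum);
```

An engine compiled to WASM does declare imports, and that import list is exactly the attack surface the host has to audit.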

Why AI Tooling Is Renewing Interest in This Problem

The renewed attention to JavaScript sandboxing in 2025 and 2026 is largely driven by AI code execution. LLMs generating and running JavaScript is now a common pattern: code interpreters in AI assistants, tool-use plugins, browser-based agent sandboxes. The execution environment requirements for LLM-generated code are different from traditional use cases in a few ways.

First, the code has to be treated as adversarial by default. Even without malicious intent, an LLM will produce code that accesses globals, modifies prototypes, and tries to use APIs that aren’t available. A sandbox that crashes or throws on unexpected input is not useful in an inference loop. Second, the latency budget matters. If you’re running code as part of a tool-use chain, isolate startup time and serialization overhead compound across multiple calls. Third, the outputs need to be inspectable. A code interpreter that returns a final value isn’t enough; you want stdout, stderr, intermediate state, and potentially a REPL-style interaction model.
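The first and third requirements amount to a result shape: the runner never throws, and it always reports what happened. The sketch below shows one possible shape using a child process. runCaptured is a hypothetical helper, and the default runner (the current Node.js binary) adds no security boundary by itself; in practice the command would be something like Deno with minimal permissions.

```javascript
const { spawnSync } = require('node:child_process');

// Hypothetical helper: run a snippet in a child process and always return a
// structured report instead of throwing into the calling loop.
function runCaptured(code, { command = process.execPath, args = ['-e'], timeoutMs = 2000 } = {}) {
  const started = Date.now();
  const child = spawnSync(command, [...args, code], {
    encoding: 'utf8',
    timeout: timeoutMs,
  });
  return {
    stdout: child.stdout ?? '',
    stderr: child.stderr ?? '',
    exitCode: child.status, // null when the child was killed by a signal
    timedOut: child.error?.code === 'ETIMEDOUT',
    elapsedMs: Date.now() - started,
  };
}

// Generated code that prints something, then calls an API that does not exist.
const report = runCaptured("console.log('partial'); undefinedApi();");
console.log(report.exitCode, report.stdout.trim());
```

The partial stdout and the error text both survive the failure, which is what lets a model see what went wrong and try again.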

None of the existing approaches handles all of these requirements cleanly. isolated-vm gives you strong isolation but requires manual API wiring. SES Compartments give you a natural JavaScript environment but require hardening work and careful auditing of what you expose. QuickJS-on-WASM gives you memory-safe isolation but limited stdlib. Deno gives you a clean permissions model but requires process spawning.

The active research is mostly happening at the boundaries: better ergonomics for the isolated-vm pattern, progress on the TC39 Compartment proposal, and experiments with Wasm-compiled engines for browser-side execution. The Endo project from the SES team is building a full capability-based platform on top of Hardened JavaScript that addresses the object-capability model more completely.

What the Landscape Tells You

If you’re building something that needs to run untrusted JavaScript today, the decision tree is approximately: if you control the hosting environment and can spawn processes, use Deno with minimal permissions. If you’re embedding inside Node.js and need strong isolation, use isolated-vm and accept the serialization overhead. If you’re building something that needs a shared JavaScript environment with auditable capability passing, look at SES. If you’re targeting a browser or WASM context and can accept the performance trade-off, QuickJS-on-WASM gives you the cleanest isolation story.
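For readers who prefer code to prose, the decision tree above can be written down directly. chooseSandbox and its constraint names are hypothetical, invented for this sketch, and the order of the checks simply follows the order of the paragraph.

```javascript
// Hypothetical helper encoding the decision tree described above.
function chooseSandbox({
  canSpawnProcesses = false,
  embeddedInNode = false,
  needsSharedEnvironment = false,
  runsInBrowserOrWasm = false,
} = {}) {
  if (canSpawnProcesses) return 'Deno with minimal permissions';
  if (embeddedInNode && !needsSharedEnvironment) return 'isolated-vm';
  if (needsSharedEnvironment) return 'SES (Hardened JavaScript)';
  if (runsInBrowserOrWasm) return 'QuickJS compiled to WebAssembly';
  return 'no clear fit: revisit the constraints';
}

const choice = chooseSandbox({ embeddedInNode: true });
console.log(choice);
```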

The vm module remains a trap. vm2 is archived. Any library built on top of either of these without a V8 Isolate or process boundary underneath it deserves careful scrutiny.

The Compartment proposal, if it reaches Stage 3 or 4, would change this landscape by giving the runtime itself a maintained, spec-level compartment abstraction. Until then the problem remains genuinely open, with the best answers depending heavily on which constraints matter most to your use case.
