Simon Willison published a research roundup on JavaScript sandboxing recently, and the pattern across every approach he documents is consistent enough to name directly: sandboxing that operates at the language level keeps failing, and the failures are not implementation bugs. They are consequences of how JavaScript was designed.
This matters more now than it did five years ago. Code execution is a standard capability for LLM-powered tools. Discord bots, AI assistants, and data platforms all want to run user-supplied or model-generated code in some contained way. The pressure to get sandboxing right has increased, but the underlying JavaScript runtime has not become easier to contain.
The Shared Primordials Problem
Every JavaScript engine initializes a small set of built-in objects before any user code runs: Object, Function, Array, Error, and their prototypes. These are called primordials, and in a standard JavaScript environment, all code in the same agent shares them. That shared state is the root of the containment problem.
The classic sandbox escape fits in two short lines:
const escape = ({}).constructor.constructor;
escape('return process')();
The steps here are entirely within the language spec. {} constructs an object literal. .constructor retrieves Object from the prototype chain, which the sandbox did not prevent because it is a legitimate property access. .constructor again on Object retrieves Function from the outer realm, the actual Function constructor from the host environment. Calling escape('return process')() constructs a new function in the outer realm with full access to the outer scope.
This is not a bug to patch. It is how prototype-based inheritance works. The same mechanism that makes Array.prototype.map available on every array also makes Function reachable from any object. Any sandbox that shares prototype chain roots with its host can be escaped via this path, or one of the many variations of it: Symbol.toPrimitive, generator protocol leaks, Proxy traps, Symbol.unscopables. Every extensibility hook in the language is a potential tunnel.
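To make the shared-primordials problem concrete, here is a minimal sketch of a same-realm sandbox and its escape. The naiveSandbox helper is hypothetical, invented for this illustration. It hides process, require, and globalThis by shadowing their names, which is roughly what scope-based containment amounts to.

```javascript
// Hypothetical naive sandbox (illustration only): hide dangerous globals by
// shadowing their names with undefined parameters. Runs in the host realm.
function naiveSandbox(source) {
  const run = new Function(
    'process', 'require', 'globalThis',
    `"use strict"; return (${source});`
  );
  return run(undefined, undefined, undefined);
}

// Direct access is blocked: the name resolves to the shadowing parameter.
const direct = naiveSandbox('typeof process');

// The prototype chain is not shadowed: {} -> Object -> Function.
// The recovered Function constructor compiles code in the global scope,
// where the real `process` is still visible.
const viaPrototype = naiveSandbox(
  `({}).constructor.constructor('return typeof process')()`
);

console.log(direct);       // 'undefined'
console.log(viaPrototype); // 'object' when run under Node.js
```

Shadowing more names does not help, because the escape never refers to a name the sandbox can intercept.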
vm2 and the Evidence
The vm2 package was the most widely used JavaScript sandbox in Node.js for years. It wrapped Node’s built-in node:vm module, which creates a new V8 context, and added prototype-level shielding to block the known escape paths. The implementation was careful and actively maintained.
In April 2023, CVE-2023-29199 was published with a CVSS score of 9.8. The vulnerability was in vm2’s exception handling: crafted code threw an exception object that, when caught by vm2’s own error-handling wrapper, exposed the outer Function constructor. The sandbox’s own defensive code created the escape route.
Within weeks, CVE-2023-30547 arrived. Also 9.8. Different vector, same severity. The maintainers deprecated the project entirely, writing in the deprecation notice that the problem cannot be solved at the pure language level. That is the correct conclusion, arrived at after years of engineering effort.
Node’s own node:vm module has never claimed to be a security boundary. The official documentation says so explicitly: the module is not a security mechanism and should not be used to run untrusted code. It creates a new V8 context, but within the same process and on the same event loop, and any host object handed to that context brings the host’s prototype chain with it. It was designed for module system isolation and test environment setup, and it is good at those things. Using it as a security primitive is a misreading of what it provides.
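The well-known node:vm escape can be reproduced with nothing but the built-in module. This is a minimal sketch with illustrative variable names. It assumes the common pattern of passing an ordinary host-created object as the sandbox.

```javascript
const vm = require('node:vm');

// The sandbox object is created in the host realm, so its prototype chain
// leads to the host's Object and, from there, the host's Function.
const sandbox = {};
vm.createContext(sandbox);

// Inside the context, `process` is not a global.
const inside = vm.runInContext('typeof process', sandbox);

// `this` is the context's global, backed by the host-created sandbox object.
// Walking constructor.constructor lands on the host's Function constructor.
const leaked = vm.runInContext(
  `this.constructor.constructor('return typeof process')()`,
  sandbox
);

console.log(inside); // 'undefined'
console.log(leaked); // 'object'
```

The context hides the name process, but the object that backs the context's global still belongs to the host.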
The TC39 Realms Proposal
The TC39 Realms proposal, since renamed ShadowRealm, reached Stage 3 in 2021. It gives each realm a fresh global object and fresh primordials, so the realm’s globalThis and the host’s globalThis are different objects. Any host object that reaches code inside a realm, though, immediately exposes host-realm constructors to that code; in the Stage 3 design only primitives and wrapped functions may cross the boundary. The proposal’s own documentation states that Realms are not a security primitive; they are designed for module system isolation, polyfill sandboxing, and trusted-author configuration scripting. Passing values and callbacks between host and realm is the normal use case, and that crossing is where untrusted code finds its footing.
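The realm behavior described here can be observed today with node:vm contexts, which are also separate realms. The sketch below uses them as a stand-in. It is not the Realms API, and all names in it are illustrative.

```javascript
const vm = require('node:vm');

// A new context has its own primordials, distinct from the host's.
const guestArray = vm.runInNewContext('Array');
const guestList = vm.runInNewContext('[1, 2, 3]');
const freshPrimordials = guestArray !== Array;
const identityGap = !(guestList instanceof Array) && Array.isArray(guestList);

// An object born inside the context only reaches the context's own Function.
const guestFunction = vm.runInNewContext('({}).constructor.constructor');

// A host object passed in still points at the host's Function.
const hostObject = { n: 1 };
const reachedFromGuest = vm.runInNewContext(
  'hostObject.constructor.constructor',
  { hostObject }
);

console.log(freshPrimordials);              // true
console.log(identityGap);                   // true
console.log(guestFunction === Function);    // false
console.log(reachedFromGuest === Function); // true
```

Fresh primordials protect only the objects that never cross. One host object in guest hands is enough to reach the host's Function.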
SES: Changing the Invariant Instead of Patching the Escape
The SES (Secure ECMAScript) project from Agoric takes a structurally different approach. Rather than intercepting escape attempts at runtime, it removes the mutable state that makes escapes useful. The lockdown() call freezes all primordials before any untrusted code runs:
import 'ses';
lockdown(); // Object.prototype, Function.prototype, Array.prototype, all frozen
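// safeFetch is a host-defined wrapper (not shown) that enforces the host's policy
const compartment = new Compartment({
  globals: { fetch: safeFetch },
  __options__: true, // selects the options-bag signature; older SES releases take the globals object as the first argument
});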
compartment.evaluate(`fetch('https://example.com/api')`);
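After lockdown(), Object.prototype cannot be modified by anything. The prototype chain still exists, but navigating it no longer reaches mutable state, and SES also replaces the shared Function.prototype.constructor with a version that throws, so the constructor.constructor path no longer yields a working evaluator. The Compartment API provides a controlled evaluation environment where the available globals are explicitly chosen by the host.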
This is the only language-level approach that has held up. It works because it changes what is true about the environment before the attacker’s code runs, rather than trying to catch and block operations after the fact.
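The invariant can be illustrated without SES by freezing a few primordials by hand. This is a toy stand-in for lockdown(), which does considerably more. It runs inside a throwaway node:vm context so the host's own primordials are left alone.

```javascript
const vm = require('node:vm');

// Throwaway context so the host's own primordials stay untouched.
const context = vm.createContext({});

// Toy stand-in for lockdown(): freeze a few primordials before guest code runs.
vm.runInContext(`
  for (const ctor of [Object, Function, Array]) {
    Object.freeze(ctor);
    Object.freeze(ctor.prototype);
  }
`, context);

// "Guest" code that arrives later tries to poison Object.prototype.
const outcome = vm.runInContext(`
  'use strict';
  let result;
  try {
    Object.prototype.polluted = true;
    result = 'mutated';
  } catch (err) {
    result = err.name;
  }
  result;
`, context);

console.log(outcome); // 'TypeError'
```

Nothing monitors the guest here. The attempt fails because the state it targets was already immutable when the guest started.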
The cost is real. lockdown() must run before anything else in the process, including library initialization. Many npm packages mutate primordials at startup, sometimes intentionally as polyfills, sometimes as a side effect of design decisions made when primordial mutation was unconstrained and considered fine. Adopting SES in an existing Node.js application means auditing every dependency. Agoric builds their entire Endo framework on top of SES, which demonstrates that the approach is viable for new systems built around it from the start.
V8 Isolates: The Scalable Path
For production systems running untrusted code at scale, V8 Isolates provide a different boundary. An isolate is an independent instance of the V8 engine with its own heap, garbage collector, and primordials. Code in one isolate genuinely cannot access objects in another because they do not share a prototype chain or any heap memory.
Cloudflare Workers runs thousands of isolates per process with about 128 KB overhead per isolate at startup, compared to tens of megabytes for a container. The sub-millisecond cold start comes from V8 heap snapshots: a snapshot of the initialized engine state is restored via memcpy for each new isolate. Workers has run on this model since Cloudflare announced it in 2017.
For Node.js environments that need this level of isolation, the isolated-vm package provides V8 Isolate sandboxing via native bindings:
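import ivm from 'isolated-vm';
// memoryLimit is in megabytes
const isolate = new ivm.Isolate({ memoryLimit: 128 });
const context = await isolate.createContext();
const jail = context.global;
// Expose console.log as a Reference: the sandbox gets a handle, not the function itself
await jail.set('log', new ivm.Reference(console.log));
// Inside the isolate, `log` is a Reference and must be invoked through applySync
await context.eval(`log.applySync(undefined, ['hello from sandbox'])`);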
Crossing the isolate boundary requires using Reference objects rather than passing live values. The friction is intentional: it makes the crossing explicit and typed, which is the security model. There is no path from inside the isolate to outside it that does not go through the explicitly defined interface.
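What an explicit crossing means can be seen with the built-in structuredClone, which implements the structured clone algorithm that Node's worker_threads uses to copy messages between isolates. This is an analogy only, not the isolated-vm API, and the Account class is hypothetical.

```javascript
// Illustration only: copy a value the way a message is copied between isolates.
class Account {
  constructor(balance) { this.balance = balance; }
  isOverdrawn() { return this.balance < 0; }
}

const original = new Account(100);
const copy = structuredClone(original);

// Data crosses; identity, prototype, and methods do not.
const sameObject = copy === original;
const keepsData = copy.balance === 100;
const plainPrototype = Object.getPrototypeOf(copy) === Object.prototype;
const hasMethod = typeof copy.isOverdrawn === 'function';

// Functions cannot be copied at all, so a capability such as console.log has
// to be exposed deliberately through a handle.
let functionResult;
try {
  structuredClone({ log: console.log });
  functionResult = 'copied';
} catch (err) {
  functionResult = err.name;
}

console.log(sameObject, keepsData, plainPrototype, hasMethod); // false true true false
console.log(functionResult); // 'DataCloneError'
```

A copied value carries no live link back to the host's prototype chain, which is why there is nothing for constructor.constructor to walk.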
WebAssembly provides a structurally similar guarantee via its linear memory model. Wasm code can address only its own linear memory: module validation rules out forged pointers into host memory, and every load and store is bounds-checked against the size of that memory. The WASI capability model extends this: filesystem and network access must be explicitly granted by the host, with no ambient authority. Projects like quickjs-emscripten apply the same idea to JavaScript itself, compiling QuickJS to WebAssembly and using the Wasm boundary as the containment layer.
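The linear memory boundary is easy to observe with a tiny hand-assembled module. The byte sequence below is an illustrative sketch written for this example. It declares one 64 KiB page of memory and a single load function that reads a 32-bit integer at a given address.

```javascript
// Hand-assembled module (illustrative): exports "mem" (one 64 KiB page) and
// "load", a function of type (i32) -> i32 that performs an i32.load.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic, version
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f,       // type: (i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // one function, type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                         // memory: min 1 page
  0x07, 0x0e, 0x02, 0x03, 0x6d, 0x65, 0x6d, 0x02, 0x00, // export "mem"
  0x04, 0x6c, 0x6f, 0x61, 0x64, 0x00, 0x00,             // export "load"
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code: one body, no locals
  0x20, 0x00, 0x28, 0x02, 0x00, 0x0b,                   // local.get 0; i32.load; end
]);

const wasmModule = new WebAssembly.Module(bytes);
const instance = new WebAssembly.Instance(wasmModule, {});
const { mem, load } = instance.exports;

// The host writes into the guest's memory; the guest reads it back.
new DataView(mem.buffer).setUint32(0, 42, true); // Wasm memory is little-endian
const inBounds = load(0);

// One byte past the end of the page: the engine traps instead of reading.
let outOfBounds;
try {
  load(65536);
  outOfBounds = 'read succeeded';
} catch (err) {
  outOfBounds = err instanceof WebAssembly.RuntimeError ? 'trap' : err.name;
}

console.log(inBounds);    // 42
console.log(outOfBounds); // 'trap'
```

The module was given no imports, so the only things it can touch are its own memory and whatever the host reads out of it.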
Why LLMs Make This Urgent
The Snowflake Cortex sandbox escape that Willison documented earlier this month illustrates why the sandboxing question has become more urgent. Cortex AI executed malware that escaped its sandboxed execution environment. The sandbox in question was designed for trusted users writing UDFs, not for containing code generated by a model that might have been influenced by prompt injection in the data it processed.
Traditional execution environments run known code written by authorized users. The sandbox bounds what those users wrote. LLM-powered execution environments run generated code, sometimes based on inputs from untrusted sources: customer records, scraped content, third-party API responses. A prompt injection in a database row can become arbitrary code execution if the model has a code execution tool and the sandbox is not strong enough.
The 2023 edition of the OWASP Top 10 for LLM Applications ranks prompt injection and insecure output handling as the top two risks, and those two combine directly into this problem. When building anything that gives a model access to code execution, a strong sandbox is not optional.
The Pattern Across All of This
The approaches that hold up share a common property: they put a separate heap or process between host and guest, or they change the language state before untrusted code runs. node:vm shares a process and lets host objects carry their prototype roots into the context. vm2 tried to police access after the fact and accumulated critical CVEs until it was abandoned. Realms provide isolation for trusted-author scenarios, not adversarial ones.
SES works at the language level but only by eliminating the mutable shared state before any code runs, not by monitoring what code does. V8 Isolates and WebAssembly work by giving each execution context its own memory and its own primordials, with no shared state to navigate through.
If you are building something that runs untrusted code, the checklist is straightforward: node:vm is off the table for security purposes, vm2 is deprecated for good reason, and any pure-scope-based containment approach in a standard JavaScript runtime should be treated with the same skepticism. The things that actually work involve a real boundary between execution contexts, whether that boundary is an isolate, a WASM linear memory region, a process, or a lockdown that runs before any untrusted code sees the environment.