
Node.js Has No Interface: The Architectural Root of the VFS Problem

Source: hackernews

There is a gap in Node.js that bundlers, test frameworks, and single-executable application tools have been navigating around for years. It is not a missing feature so much as an architectural consequence: the fs module has no interface layer that user code or the runtime itself can intercept. When Deno and Bun built their compile-to-binary pipelines, they got this for free because they wrote their own fs implementations in Rust and Zig respectively. Node.js inherited libuv, and libuv does not know anything about virtual paths.

The Platformatic post on this topic lays out the immediate motivation: you cannot build a reliable single-executable application with Node.js if any of your dependencies call fs.readFile with a path relative to __dirname. The Node.js Single Executable Applications feature (introduced as experimental in Node.js 19.7.0 and 18.16.0, and still documented as under active development rather than stable) provides a getAsset() API on the node:sea module (added in 21.7.0 and 20.12.0), but that only works for assets you explicitly declared in sea-config.json. Code that calls fs.readFile(path.join(__dirname, 'templates/base.html')) will try to hit the real filesystem, fail, and there is nothing you can do about it short of patching the dependency.
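To make the limitation concrete, here is a minimal sketch of what SEA-aware code has to look like. The helper name readAsset and its baseDir parameter are illustrative assumptions, not part of Node.js; isSea() and getAsset() are the node:sea calls, and the key must match an entry under "assets" in sea-config.json.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: reads a declared SEA asset when running inside a
// single-executable binary, and falls back to the real filesystem otherwise.
function readAsset(key, baseDir) {
  let sea = null;
  try {
    sea = require('node:sea'); // not available on older Node.js versions
  } catch {
    sea = null;
  }
  if (sea && sea.isSea()) {
    // Only works if `key` was declared under "assets" in sea-config.json.
    return sea.getAsset(key, 'utf8');
  }
  return fs.readFileSync(path.join(baseDir, key), 'utf8');
}
```

Every call site has to opt in to this branch. A dependency that calls fs.readFile directly never reaches it.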

What the Hook APIs Can and Cannot Do

Node.js does have loader hooks. Since Node.js 20.6.0 (backported to 18.19.0), you can register a module loader programmatically:

import { register } from 'node:module';
register('./my-loader.mjs', import.meta.url);

The hooks you can export from that loader are resolve and load. The resolve hook lets you rewrite import specifiers; the load hook lets you return synthetic source for any URL, including made-up schemes like virtual:config. Together, they are powerful enough to build a complete ESM module virtualization layer.
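As a rough illustration of those two hooks, the sketch below serves one module from memory under a made-up virtual: scheme. The virtualModules map and its contents are assumptions for the example; in a real my-loader.mjs both functions would be declared with export.

```javascript
// Contents of a hypothetical my-loader.mjs. In the real file, both functions
// would be declared with `export` so Node.js can find them.

// Hypothetical in-memory modules keyed by a made-up `virtual:` scheme.
const virtualModules = new Map([
  ['virtual:config', 'export default { port: 8080 };'],
]);

async function resolve(specifier, context, nextResolve) {
  if (virtualModules.has(specifier)) {
    // shortCircuit tells Node.js not to call the remaining resolve hooks.
    return { url: specifier, shortCircuit: true };
  }
  return nextResolve(specifier, context);
}

async function load(url, context, nextLoad) {
  if (virtualModules.has(url)) {
    // Return synthetic ESM source instead of reading anything from disk.
    return { format: 'module', source: virtualModules.get(url), shortCircuit: true };
  }
  return nextLoad(url, context);
}
```

With the loader registered, import config from 'virtual:config' would resolve to the in-memory source. Anything not in the map falls through to the default hooks.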

But they only fire for module loading. A require() call, a fs.readFile() call, a fs.createReadStream() call, fs.stat(), fs.access(): none of these pass through the loader hook system. CommonJS module loading (Module._load, Module._resolveFilename) is a completely separate code path that the ESM hooks do not touch. And direct fs calls bypass everything.

The gap is not subtle. If you are building a tool that needs to virtualize the filesystem, you are left with monkey-patching:

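// preload.js, loaded via node --require
const fs = require('fs');
const originalReadFile = fs.readFile;
// Illustrative in-memory store; a real tool would load this from its bundle.
const virtualFiles = new Map([['/virtual/hello.txt', 'hello']]);
fs.readFile = function (filePath, options, callback) {
  // options is optional: fs.readFile(path, callback) is a valid call
  if (typeof options === 'function') { callback = options; options = undefined; }
  if (typeof filePath === 'string' && virtualFiles.has(filePath)) {
    const data = Buffer.from(virtualFiles.get(filePath));
    const encoding = typeof options === 'string' ? options : options?.encoding;
    // Stay asynchronous, like the real readFile
    return process.nextTick(callback, null, encoding ? data.toString(encoding) : data);
  }
  return originalReadFile.call(this, filePath, options, callback);
};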

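Libraries like mock-fs and memfs have been used this way for years, primarily for testing: mock-fs temporarily backs the built-in fs module with an in-memory filesystem, while memfs provides an in-memory fs implementation that is patched in or handed to tools explicitly. webpack-dev-middleware, for example, uses memfs as its in-memory output filesystem. It works well enough in controlled environments, but it has a hard ceiling: native addons written in C++ that call libuv directly bypass the JavaScript shim entirely. The interception only works at the JavaScript binding layer, not below it.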
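Even within JavaScript, the surface to cover is wide, because each fs entry point has to be patched separately. The sketch below patches only readFileSync and fs.promises.readFile; the virtualFiles map and the /virtual/ prefix are assumptions for the example.

```javascript
const fs = require('node:fs');

// Hypothetical in-memory store; the '/virtual/' prefix is an assumption.
const virtualFiles = new Map([['/virtual/greeting.txt', 'hello from memory']]);

// Returns the virtual content (Buffer or string), or undefined on a miss.
function lookup(filePath, options) {
  if (typeof filePath !== 'string' || !virtualFiles.has(filePath)) return undefined;
  const encoding = typeof options === 'string' ? options : options?.encoding;
  const data = Buffer.from(virtualFiles.get(filePath));
  return encoding ? data.toString(encoding) : data;
}

const original = {
  readFileSync: fs.readFileSync,
  promisesReadFile: fs.promises.readFile,
};

fs.readFileSync = function (filePath, options) {
  const hit = lookup(filePath, options);
  return hit !== undefined ? hit : original.readFileSync.call(this, filePath, options);
};

fs.promises.readFile = async function (filePath, options) {
  const hit = lookup(filePath, options);
  return hit !== undefined ? hit : original.promisesReadFile.call(this, filePath, options);
};
```

After this runs, reads of /virtual/greeting.txt are served from memory, but fs.existsSync, fs.stat, fs.createReadStream, and every other unpatched function still go to the real disk. A complete shim has to repeat the pattern for each of them.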

The pkg History Is a Warning

The pkg tool, originally from Zeit and later maintained by Vercel before being archived in 2023, built a real working VFS by patching process.binding('fs') at build time. Every packaged binary got a synthetic root at /snapshot/ (or C:\snapshot\ on Windows), and all __dirname values were rewritten to point into it. The VFS was a serialized dictionary of all project files, stored directly in the binary.

This worked. For several years, pkg was the standard way to distribute Node.js applications as single executables. But process.binding is an internal API that Node.js changes without notice, and every major Node.js release required a corresponding update to pkg’s patching strategy. The community fork @yao-pkg/pkg continues to maintain it, but the fundamental fragility has not changed. You are patching internals that were never meant to be patched.

Nexe took a different approach: patch only Module._resolveFilename and Module._extensions, which covers require() calls but not raw fs operations. Less fragile, but also less complete.
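The general shape of that technique can be sketched as follows. This is not Nexe's actual code: the virtualModules map and the /virtual/ path are assumptions, and Module._resolveFilename and Module._extensions are undocumented internals that can change between releases.

```javascript
const Module = require('node:module');
const fs = require('node:fs');

// Hypothetical in-memory module source, keyed by absolute path.
const virtualModules = new Map([
  ['/virtual/config.js', "module.exports = { name: 'demo' };"],
]);

// Step 1: make resolution succeed for virtual paths without touching disk.
const originalResolveFilename = Module._resolveFilename;
Module._resolveFilename = function (request, parent, isMain, options) {
  if (virtualModules.has(request)) return request;
  return originalResolveFilename.call(this, request, parent, isMain, options);
};

// Step 2: compile virtual modules from memory instead of reading the file.
const originalJsHandler = Module._extensions['.js'];
Module._extensions['.js'] = function (module, filename) {
  if (virtualModules.has(filename)) {
    return module._compile(virtualModules.get(filename), filename);
  }
  return originalJsHandler.call(this, module, filename);
};

const config = require('/virtual/config.js');
const visibleToFs = fs.existsSync('/virtual/config.js');
```

Here require() finds the module, while fs.existsSync on the same path still consults the real filesystem. That is the "less complete" half of the trade-off.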

Both tools demonstrate the demand clearly. The lack of a VFS API has not stopped people from building VFS solutions; it has just meant every solution is a workaround.

What Deno and Bun Got Right

When deno compile produces a self-contained binary, arbitrary assets embedded with --include are accessible via Deno.readTextFileSync() using their original specifier paths. The VFS is transparent to application code. This is not magic; it is a consequence of Deno implementing its own fs-equivalent operations in Rust, routing through its own resource/op system. Inserting a VFS layer before the OS call is a straightforward architectural decision when you own the entire stack from JavaScript down to the syscall.

Bun’s bun build --compile does the same thing with its Zig-based runtime. Files imported with the { type: "file" } import attribute are embedded in the binary and can be read via Bun.file() or through the node:fs compatibility layer, and Bun’s documentation describes embedding a whole directory by passing a shell glob as additional entrypoints:

bun build --compile ./index.ts ./public/**/*.html --outfile myapp

Code that reads an embedded file by its imported path just works inside the binary, because Bun’s fs implementation checks the embedded files before calling the OS. The VFS is a first-class concern, not an afterthought.

Node.js’s fs module calls libuv. Libuv calls the OS. There is no layer in between where you could insert a VFS without either patching C++ code or replacing libuv, neither of which is a reasonable ask for application developers.

What a Real API Might Look Like

The proposal that makes sense is a hook at the Node.js binding level, something like a registered filesystem provider that the runtime consults before delegating to libuv. The interface would need to cover the full fs surface: reads, stats, directory listings, and probably writes (for tools that want a fully writable in-memory filesystem, as mock-fs enables for testing).

A minimal version might look like:

// Hypothetical API: registerFileSystemProvider does not exist in node:fs today.
import { registerFileSystemProvider } from 'node:fs';

registerFileSystemProvider({
  canHandle(path) {
    return path.startsWith('/virtual/');
  },
  readFile(path, options) {
    return virtualFiles.get(path);
  },
  stat(path) {
    return virtualStats.get(path);
  }
});
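The "consult the provider first, then fall back" dispatch can be approximated in userland to show the intended semantics. Everything here is a stand-in: registerProvider and readFileSyncWithProviders are hypothetical names, and a real implementation would live below the JavaScript layer rather than in a wrapper function.

```javascript
const fs = require('node:fs');

// Userland stand-in for the hypothetical provider registry.
const providers = [];
function registerProvider(provider) {
  providers.push(provider);
}

// What each fs entry point would do: ask providers first, then fall through.
function readFileSyncWithProviders(filePath, options) {
  const provider = providers.find((p) => p.canHandle(filePath));
  return provider
    ? provider.readFile(filePath, options)
    : fs.readFileSync(filePath, options);
}

// Hypothetical virtual file set used only for this example.
const virtualFiles = new Map([['/virtual/app.json', '{"ok":true}']]);
registerProvider({
  canHandle: (p) => typeof p === 'string' && p.startsWith('/virtual/'),
  readFile: (p) => virtualFiles.get(p),
});
```

The first provider whose canHandle returns true wins. Paths no provider claims reach the real filesystem unchanged, which keeps the common case on the existing code path.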

The hard part is not the API design. It is threading the provider through the C++ binding layer so that it is consulted for all fs operations, including those initiated by native addons. AsyncLocalStorage provides a precedent: context that propagates through async call chains, including those crossing the JS/C++ boundary. A similar mechanism for filesystem dispatch is technically feasible, though it would require significant work in Node.js core.

The HN thread predictably surfaces the counterargument that this is scope creep and that the existing monkey-patch approach is sufficient for most use cases. That argument holds until you try to embed a native addon in a SEA binary (you cannot load it from memory; it must be written out to a temporary file at runtime) or until you need VFS interception inside a worker thread, where each worker loads its own copy of the fs module and a patch applied in the main thread does not carry over unless the worker re-applies it.
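The worker-thread case can be observed directly. Each worker loads its own instance of the built-in modules, so a patch applied programmatically in the main thread is not visible there. A minimal sketch, with illustrative function names:

```javascript
const fs = require('node:fs');
const { Worker } = require('node:worker_threads');

// Patch the main thread's fs with a named wrapper so it is easy to detect.
const originalReadFileSync = fs.readFileSync;
fs.readFileSync = function patchedReadFileSync(...args) {
  return originalReadFileSync.apply(this, args);
};

// The worker requires its own node:fs and reports which function it sees.
function readFileSyncNameInWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      "const { parentPort } = require('node:worker_threads');" +
      "parentPort.postMessage(require('node:fs').readFileSync.name);",
      { eval: true },
    );
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}
```

In the main thread, fs.readFileSync.name is now patchedReadFileSync, while the promise resolves to the name of the worker's untouched built-in. Any VFS shim has to be installed again inside every worker.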

The Practical State of Things

For now, the realistic options are:

  • Node.js SEA for applications you can bundle into a single JS file (using esbuild or rollup) with only explicit assets accessed via node:sea’s getAsset(). Simple apps that control all their code can make this work.
  • @yao-pkg/pkg for applications that need a full VFS and can tolerate the maintenance overhead of depending on a tool that tracks Node.js internals.
  • Bun or Deno if you are starting a new project and portability is acceptable. Both have robust compile pipelines with real VFS support.
  • Monkey-patching fs in test environments via memfs or mock-fs, with the understanding that native addons will bypass it.

None of these are the answer Node.js should be giving. The runtime has invested heavily in making single-executable applications a first-class feature, but the getAsset() API only solves the easy part of the problem. Libraries written by people who did not know they would ever run inside a SEA binary call fs.readFile. That is not going to change. The runtime needs to meet them where they are.
