
Why Node.js Can Hook into require() but Not into fs

Source: hackernews

There is a pattern in how Node.js gets extended that is worth understanding before reading any proposal about virtual filesystems. Node.js has spent several years building an official API for intercepting module loading. You can hook require() and import today without monkey-patching anything, using module.register() and the Module Hooks API, which the Node.js 22 documentation lists as a release candidate rather than as fully stable. What Node.js does not have is any equivalent hook for fs. There is no official way to intercept readFile, createReadStream, stat, or any other filesystem operation at a level that would serve every caller transparently.

Platformatic’s argument for a virtual filesystem in Node.js Single Executable Applications sits squarely inside this asymmetry. The missing piece is not exotic. Other runtimes have it. Other languages have it. But it requires a kind of extensibility that Node.js built for its module system and then stopped short of extending to its I/O layer.

What the Module Hooks API Actually Does

The Module Hooks API grew out of the --experimental-loader flag, took its current resolve/load shape in Node.js 16.12.0, and went through several redesigns before arriving at the module.register() form. It lets you intercept the full module loading pipeline. A hook can modify how specifiers resolve, change what file a specifier maps to, and transform the source code before it executes:

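// hooks.mjs
export async function resolve(specifier, context, nextResolve) {
  if (specifier.startsWith('virtual:')) {
    // Encode the generated source so quotes and colons survive inside the URL.
    const source = `export default ${JSON.stringify(specifier)};`;
    return {
      shortCircuit: true,
      url: `data:text/javascript,${encodeURIComponent(source)}`,
    };
  }
  return nextResolve(specifier, context);
}

export async function load(url, context, nextLoad) {
  const prefix = 'data:text/javascript,';
  if (url.startsWith(prefix)) {
    // Strip the whole "data:<mime>," prefix, not just "data:", then decode.
    return {
      format: 'module',
      shortCircuit: true,
      source: decodeURIComponent(url.slice(prefix.length)),
    };
  }
  return nextLoad(url, context);
}

// app.mjs
import { register } from 'node:module';
register('./hooks.mjs', import.meta.url);

// Hooks only apply to modules loaded after register(), so import dynamically.
const { default: name } = await import('virtual:greeting');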

This is powerful. It is how test frameworks like Vitest intercept module imports for mocking without monkey-patching require. It is how TypeScript loaders like tsx transpile .ts files on the fly. The module loading pipeline is genuinely extensible.

But notice the scope of these hooks. They intercept module loading: the process of resolving a specifier to a URL and loading source code to evaluate as a module. They do not intercept fs.readFile('./config.json', 'utf8'). That call goes directly to Node.js’s bindings layer, bypasses the module system entirely, and hits the operating system’s filesystem APIs through libuv. A module hook sees the import statements in your code, not the file I/O that happens after the module has loaded.

Why This Matters for SEA

Node.js Single Executable Applications shipped as an experimental feature and are still documented as under active development in Node.js 22. The mechanism is straightforward: bundle your application into one JavaScript file, generate a blob with node --experimental-sea-config, and inject it into a copy of the Node.js binary using postject. Assets can be embedded via sea-config.json:

{
  "main": "app.bundle.js",
  "output": "sea-prep.blob",
  "assets": {
    "config.json": "./config/default.json",
    "template.html": "./templates/index.html"
  }
}

And retrieved via the node:sea module:

import { getAsset } from 'node:sea';
const config = JSON.parse(getAsset('config.json', 'utf8'));

This API is deliberately explicit. It does not integrate with node:fs. A library that calls fs.readFile('./config.json') will not find the embedded asset. The node:sea module and the node:fs module are separate namespaces that do not communicate.

You could write a module hook that intercepts import { readFileSync } from 'node:fs' and returns a mocked version. But you cannot intercept a call to readFileSync that happens inside an already-loaded module, because the hook fires when the import resolves, not each time the function executes. Once node:fs is loaded, readFileSync is a function reference. Replacing it in one module’s scope does nothing to the copies held by other modules.
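The function-reference point can be seen without any hooks at all. The sketch below is only an illustration: the virtual path is made up, and it assumes nothing exists at that path on disk. A reference destructured before a replacement is installed keeps pointing at the original implementation.

```javascript
const fs = require('node:fs');

// A module that destructured readFileSync before any replacement was installed.
const { readFileSync: capturedEarly } = fs;

// Hypothetical virtual path used only for this demonstration.
const VIRTUAL_PATH = '/virtual/hello.txt';

const originalReadFileSync = fs.readFileSync;
fs.readFileSync = function patchedReadFileSync(path, options) {
  if (path === VIRTUAL_PATH) return 'served from memory';
  return originalReadFileSync.call(this, path, options);
};

// Callers that look the function up on the module object see the replacement.
const viaModuleObject = fs.readFileSync(VIRTUAL_PATH, 'utf8');

// Callers holding the earlier reference go straight to the real implementation.
let viaCapturedReference;
try {
  viaCapturedReference = capturedEarly(VIRTUAL_PATH, 'utf8');
} catch (err) {
  // Assumes the path does not exist on disk, so the real call fails.
  viaCapturedReference = err.code;
}

// Restore the original so the demo leaves fs untouched.
fs.readFileSync = originalReadFileSync;
```

Any dependency that wrote const { readFileSync } = require('node:fs') at its top level and was loaded first is in the position of capturedEarly here.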

The Monkey-Patch Dead End

The obvious workaround is to mutate the fs module’s exported properties before any other code runs, using a --require or --import preload:

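// preload.cjs
const fs = require('node:fs');
const path = require('node:path');
const { getAsset, isSea } = require('node:sea');

// Illustrative helper (not part of node:sea): maps a path to an asset key by
// basename and returns a Buffer, or null when no such asset is embedded.
function tryGetVirtualAsset(filePath) {
  if (typeof filePath !== 'string') return null;
  try {
    return Buffer.from(getAsset(path.basename(filePath)));
  } catch {
    return null; // getAsset() throws when the key is not embedded
  }
}

if (isSea()) {
  const originalReadFile = fs.readFile;
  fs.readFile = function (filePath, options, callback) {
    const cb = typeof options === 'function' ? options : callback;
    const virtualContent = tryGetVirtualAsset(filePath);
    if (virtualContent !== null) {
      // Honor an encoding passed as a string or as { encoding }.
      const encoding = typeof options === 'string' ? options : options?.encoding;
      const result = encoding ? virtualContent.toString(encoding) : virtualContent;
      process.nextTick(cb, null, result);
      return;
    }
    return originalReadFile.call(this, filePath, options, callback);
  };
  // ... repeat for readFileSync, readdir, stat, fs.promises.readFile, etc.
}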

This is what pkg did. It is not unreasonable, but it has real limitations. The fs module exports over 60 functions. Wrapping all of them correctly, including the promise-based variants under fs.promises, the streaming variants, stat, lstat, access, opendir, and everything else that filesystem-aware code might call, is significant work. More importantly, this approach breaks whenever code bypasses the JavaScript fs module entirely and uses native addons that call into libuv or the OS filesystem directly.

Monkey-patching does not compose. If two libraries both try to intercept fs this way, the second one wraps the first one's wrapper, the result depends on load order, and neither library can safely undo its patch without discarding the other's. An official hook point avoids this because hooks compose explicitly through a chain, the same way module hooks do.
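A small sketch of that composition problem, using a stand-in object rather than the real fs module. The installPatch helper and the /a/ and /b/ prefixes are invented for illustration and do not correspond to any particular library.

```javascript
// A stand-in for the fs module so the demo does not touch the real one.
const fakeFs = { readFileSync: (p) => `disk:${p}` };

// Each "library" wraps whatever function is currently installed and
// remembers it so the wrapper can be removed later.
function installPatch(target, prefix, label) {
  const previous = target.readFileSync;
  target.readFileSync = (p) =>
    p.startsWith(prefix) ? `${label}:${p}` : previous(p);
  return function uninstall() {
    target.readFileSync = previous;
  };
}

const uninstallA = installPatch(fakeFs, '/a/', 'A');
const uninstallB = installPatch(fakeFs, '/b/', 'B');

// While both wrappers are stacked, B's paths are served by B.
const beforeUninstall = fakeFs.readFileSync('/b/x');

// A restores the function it originally saw, silently dropping B's wrapper.
uninstallA();
const afterUninstall = fakeFs.readFileSync('/b/x');
```

Neither library did anything wrong on its own terms; the breakage comes from both of them mutating a single shared slot with no registry recording who is in the chain.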

The Binding Layer Is Where This Would Have to Live

Node.js’s fs module is implemented in a combination of JavaScript and C++. The JavaScript layer in lib/fs.js calls through to internalBinding('fs'), a native binding that invokes libuv’s filesystem functions. A proper VFS hook would need to intercept at the binding layer, not at the public JavaScript layer, to be transparent to every JavaScript caller, including Node.js internals such as the CommonJS loader that reach the binding without going through the public fs exports. Native addons that call libuv or the OS directly would bypass even that layer.

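ASAR in Electron is the closest production precedent. Electron ships its own patched build of Node.js and installs archive support inside the runtime's bootstrap, before any application code runs, overriding the fs entry points along with a few internal binding functions the module loader relies on. When a path resolves into an .asar archive, that layer reads the archive header and serves the embedded content. The JavaScript fs module, the module loader, everything that goes through Node.js’s filesystem API sees the virtual path as real; native code that opens files through the OS directly does not.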

The cost of Electron’s approach is that it requires maintaining a fork. Node.js core has been reluctant to accept ASAR-style patches for several reasons: the right abstraction for a hookable filesystem binding is not obvious, the performance implications of checking every I/O call against a registered hook table are not free, and the security model for which code can register which virtual paths needs careful design.

What the Module Hooks API Shows Is Possible

The module hooks precedent is instructive because it shows that Node.js can add extensibility to a core subsystem without compromising the normal case. Hooks installed with module.register() run in a separate thread, isolated from application code. The hook chain is set up once at startup. The overhead for the common case is small: an application that registers no hooks goes through the default resolve and load steps directly and pays nothing for the extension point.

An fs hooks API would need similar design. A rough shape might look like:

// Hypothetical API: registerFsHook does not exist in node:fs today.
import { registerFsHook } from 'node:fs';
import { getAsset } from 'node:sea';

registerFsHook({
  handles(path) {
    return path.startsWith('/bundled/');
  },
  readFile(path, options) {
    const key = path.replace('/bundled/', '');
    return Buffer.from(getAsset(key));
  },
  stat(path) {
    const content = getAsset(path.replace('/bundled/', ''));
    return { isFile: () => true, size: content.byteLength };
  },
  readdir(path) {
    return [];
  }
});

The implementation would need to check registered hooks in the binding layer before falling through to the real filesystem. For paths with no matching hook, the cost would be a cheap path-prefix check against the registered hooks, similar in spirit to the check ASAR performs for .asar in a path. The Platformatic team’s full writeup goes into more specific API design proposals.
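The dispatch logic itself is small. What follows is a minimal userland sketch, not a Node.js API: FsHookRegistry, the in-memory assets map, and the fakeDisk function are all hypothetical stand-ins. It shows only the check-then-fall-through order described above.

```javascript
// Hypothetical userland registry; Node.js has no such API today.
class FsHookRegistry {
  #hooks = [];

  register(hook) {
    this.#hooks.push(hook);
  }

  // Ask each hook in registration order; fall through to the real reader.
  readFile(path, readFromDisk) {
    for (const hook of this.#hooks) {
      if (hook.handles(path)) return hook.readFile(path);
    }
    return readFromDisk(path);
  }
}

// In-memory stand-in for embedded assets (getAsset() would fill this role in a SEA).
const assets = new Map([['config.json', '{"port":3000}']]);

const registry = new FsHookRegistry();
registry.register({
  handles: (path) => path.startsWith('/bundled/'),
  readFile: (path) => Buffer.from(assets.get(path.slice('/bundled/'.length))),
});

// Stand-in for the real filesystem so the sketch has no side effects.
const fakeDisk = (path) => Buffer.from(`disk contents of ${path}`);

const bundled = registry.readFile('/bundled/config.json', fakeDisk).toString();
const passthrough = registry.readFile('/etc/hosts', fakeDisk).toString();
```

The hard part of a real design is where this check lives. In userland it only helps callers that go through the registry, which is the same limitation as getAsset().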

The Contrast with Go and Python

Go’s //go:embed directive and embed.FS type, introduced in Go 1.16, show what happens when this problem gets solved at the toolchain level. The embedded filesystem implements io/fs.FS, the standard read-only filesystem interface, so it composes with anything that accepts an fs.FS: HTTP file servers, template parsers, archive writers. There is no monkey-patching and no special runtime hook because the compiler resolves the embedding at build time and the type system ensures composability.

Python’s approach is older and messier, but instructive. zipimport has been part of the standard library since Python 2.3, allowing imports from .zip archives. PyInstaller builds on the same import machinery with a more complete scheme that handles data files alongside modules. The importlib.resources API, added in Python 3.7 and extended with files() in Python 3.9, provides a standard interface for accessing package data that works whether the package is installed normally or bundled into an archive. None of this is elegant, but it exists as official infrastructure that third-party tools can rely on.

Node.js has neither the compile-time embedding of Go nor the standardized package data API of Python. What it has is a very capable module hook system that solves half the problem, module loading, while the other half, file I/O, remains unaddressed.

What Userland Can and Cannot Do Today

If you need a SEA-compatible VFS today, the practical options are limited. You can preload a script that wraps the fs module, accepting the incomplete coverage and composability limitations. You can use esbuild to inline all assets as Base64-encoded string literals, which works for text and small binary files but is unwieldy for anything larger. You can structure your code to always use getAsset() and never use fs for embedded content, which works for code you control but fails for any dependency that reads files internally.
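The third option usually ends up as a small helper like the one sketched here. The name readAssetText and its injectable sea parameter are illustrative choices, not an established pattern from Node.js or Platformatic. The only node:sea calls it relies on are isSea() and getAsset().

```javascript
const fs = require('node:fs');

// node:sea only exists in newer Node.js releases, so load it defensively.
function loadSea() {
  try {
    return require('node:sea');
  } catch {
    return null;
  }
}

// Hypothetical helper: read an embedded asset when running inside a SEA,
// or the corresponding file on disk during normal development.
function readAssetText(key, diskPath, sea = loadSea()) {
  if (sea && sea.isSea()) {
    return sea.getAsset(key, 'utf8');
  }
  return fs.readFileSync(diskPath, 'utf8');
}
```

A call such as readAssetText('config.json', './config/default.json') covers your own code in both environments. It does nothing for a dependency that calls fs.readFileSync internally, which is the limit of this option.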

The broader Module Hooks API does give enough surface area to intercept import() and require() calls for module loading, so you can serve synthetic modules from embedded content. But that only covers the module system. The filesystem remains opaque to hooks.

Filling this gap properly requires either changes to Node.js core to add a hook point at the binding layer, or a native addon that patches the binding table directly, which is fragile and version-sensitive. The Platformatic article is arguing for the former. Given that the Module Hooks API moved from an experimental flag to a release-candidate API over several release cycles with real design effort behind it, there is precedent for this kind of work landing in core.

The question is whether the Node.js core team treats it as a priority in a world where Deno’s deno compile and Bun’s bun build --compile both handle embedded assets through the standard file-reading APIs without asking users to think about binding layers at all. The asymmetry between Node.js’s extensible module system and its non-extensible filesystem is the concrete architectural gap behind every conversation about why distributing Node.js applications is still harder than it should be. Closing that gap requires design work at the binding layer; pkg patched around it, and Electron still patches around it today. Re-solving it in each generation of tooling is the cost of leaving it there.
