
The Abstraction Layer That Node.js fs Never Had

Source: hackernews

Node.js 21.7 added embedded assets to Single Executable Applications (SEA). The feature does what it says: you can bundle your application into a self-contained binary, add assets via a config file, and distribute a single file. It is a genuine improvement over the previous state of affairs, where pkg did this job by patching unstable internal APIs.

The limitation that Matteo Collina’s Platformatic article identifies is specific and worth taking seriously: getAsset(), the API that retrieves embedded files, is entirely separate from node:fs. A dependency that calls fs.readFile('./config/schema.json') inside a SEA binary still goes to the OS filesystem, and gets an ENOENT error unless the file also exists on disk. The fs module has no knowledge of the asset store. You can embed files, but the filesystem is not aware of them.
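
A minimal sketch of what that separation forces on application code. The helper name readBundled and its two-argument shape are hypothetical; isSea() and getAsset() are the node:sea calls, and the module is loaded defensively because older Node.js releases do not ship it.

```javascript
const fs = require('node:fs');

// Load node:sea defensively: the module is absent on older Node.js releases.
let sea = null;
try {
  sea = require('node:sea');
} catch {
  sea = null;
}

// readBundled is a hypothetical helper, not part of Node.js.
// `key` is the asset name from the SEA config file; `fallbackPath` is where
// the same file lives on disk during development.
function readBundled(key, fallbackPath) {
  if (sea && sea.isSea()) {
    // Inside a single executable: read from the embedded asset store.
    return sea.getAsset(key, 'utf8');
  }
  // Everywhere else: read from the OS filesystem.
  return fs.readFileSync(fallbackPath, 'utf8');
}
```

A wrapper like this only helps code you control. A dependency that calls fs.readFile on its own never reaches the getAsset() branch.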

This matters beyond SEA. The same gap affects testing tools, edge runtimes, and any scenario where you want to serve files from somewhere other than the OS filesystem. To understand why closing this gap is hard, you need to follow a filesystem read all the way down.

How a filesystem read works in Node.js

Every call to fs.readFile flows through lib/fs.js for argument validation, then into src/node_file.cc, a C++ binding layer. From there it passes to libuv, which dispatches the operation to the OS via a thread pool for async operations or a blocking syscall for the sync variants:

fs.readFile()
  → lib/fs.js         (JavaScript, argument validation)
  → src/node_file.cc  (C++ binding)
  → libuv             (C, async I/O)
  → OS syscall

There is no seam in this chain where an alternative implementation can be inserted from JavaScript. Module customization hooks, registered via module.register(), intercept module resolution and loading, but they fire while a module is being loaded, not on every fs.readFile call. Once a module holds a reference to fs.readFileSync, no hook intercepts subsequent calls to it.
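
The captured-reference problem is easy to reproduce. The sketch below uses a scratch file in the OS temp directory (an assumption made only so the example has something real to read) and swaps fs.readFileSync after another piece of code has already destructured it.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// A scratch file so the example has something real to read.
const target = path.join(os.tmpdir(), `seam-demo-${process.pid}.txt`);
fs.writeFileSync(target, 'real contents');

// A dependency destructures its own reference at load time.
const { readFileSync: capturedRead } = fs;

// Later, a test helper or preload script swaps the property on the module object.
const originalRead = fs.readFileSync;
fs.readFileSync = () => 'patched';

const viaProperty = fs.readFileSync(target, 'utf8'); // goes through the patch
const viaCaptured = capturedRead(target, 'utf8');    // bypasses the patch

// Restore the real function and clean up.
fs.readFileSync = originalRead;
fs.unlinkSync(target);

console.log({ viaProperty, viaCaptured });
// → { viaProperty: 'patched', viaCaptured: 'real contents' }
```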

The problem compounds because files arrive through many paths: fs.readFile, fs.createReadStream, fetch('file://...') (a separate implementation from fs), and native addons calling uv_fs_read() directly. A JavaScript-layer interception covers almost none of these uniformly, because most of them eventually reach the C layer through different routes.

How pkg worked, and why it stopped

pkg, the tool Vercel archived in September 2023, solved this by patching process.binding('fs') before any user code ran. That binding is the boundary between lib/fs.js and src/node_file.cc. All files were stored in a snapshot dictionary embedded in the binary, and __dirname was rewritten to point into a virtual root. Because the interception happened below the JavaScript layer, it covered readFileSync, readFile, stat, readdir, createReadStream, and require(). Dependencies required no changes:

const fs = require('fs');
const path = require('path');

// Works identically in development and in a pkg binary.
const html = fs.readFileSync(path.join(__dirname, 'views/index.html'), 'utf8');
const config = require('./config/defaults.json');
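
The snapshot-dictionary idea can be approximated in a few lines. This is a sketch of the routing logic only, not pkg's code: pkg patched the internal binding, whereas this version replaces the public fs.readFileSync and so covers far less. The snapshot contents and the function names are made up for illustration.

```javascript
const fs = require('node:fs');

// Hypothetical snapshot: virtual path -> file contents.
const SNAPSHOT_ROOT = '/snapshot/';
const snapshot = new Map([
  ['/snapshot/app/views/index.html', '<h1>hello</h1>'],
]);

const realReadFileSync = fs.readFileSync;

function patchedReadFileSync(file, options) {
  // Paths under the virtual root are served from the dictionary.
  if (typeof file === 'string' && file.startsWith(SNAPSHOT_ROOT)) {
    if (!snapshot.has(file)) {
      const err = new Error(`ENOENT: no such file or directory, open '${file}'`);
      err.code = 'ENOENT';
      throw err;
    }
    const data = Buffer.from(snapshot.get(file));
    // Accept both readFileSync(p, 'utf8') and readFileSync(p, { encoding }).
    const encoding = typeof options === 'string' ? options : options && options.encoding;
    return encoding ? data.toString(encoding) : data;
  }
  // Everything else falls through to the real filesystem.
  return realReadFileSync(file, options);
}

fs.readFileSync = patchedReadFileSync;
```

Doing the same check one layer lower, at the binding, is what let pkg cover stat, readdir, streams, and require() with a single interception point.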

The problem was that process.binding is an internal API with no stability guarantee. Maintaining pkg required building and hosting patched Node.js binaries for every version and every target platform. Vercel eventually deprecated pkg in favor of SEA, but SEA did not close the gap pkg had filled. The community fork @yao-pkg/pkg continues, but the maintenance burden has not shrunk, because the underlying problem has not changed.

Electron proved binding-level interception works

Electron has shipped this in production since around 2014. The ASAR format, a sequential archive with a JSON header describing file offsets, is read by a patched binding layer in Electron’s Node.js fork. When the binding receives a path containing .asar, it reads from the archive rather than the OS. The interception covers the complete fs surface, including native addons, because the intercept is written in C at the binding layer. VS Code, Slack, and Discord all ship this way.
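
To make the "header of offsets" idea concrete, here is a toy archive in the same spirit. The layout (a 4-byte length, a JSON header, then raw file bytes) is a deliberate simplification and not the real ASAR encoding, and all function names are hypothetical.

```javascript
// Build a toy archive: [4-byte header length][JSON header][file bytes...]
function packArchive(files) {
  const header = { files: {} };
  const chunks = [];
  let offset = 0;
  for (const [name, content] of Object.entries(files)) {
    const data = Buffer.from(content);
    header.files[name] = { offset, size: data.length };
    chunks.push(data);
    offset += data.length;
  }
  const headerBuf = Buffer.from(JSON.stringify(header));
  const lengthBuf = Buffer.alloc(4);
  lengthBuf.writeUInt32LE(headerBuf.length, 0);
  return Buffer.concat([lengthBuf, headerBuf, ...chunks]);
}

// Look a file up in the header and slice its bytes out of the archive.
function readFromArchive(archive, name) {
  const headerLength = archive.readUInt32LE(0);
  const header = JSON.parse(archive.subarray(4, 4 + headerLength).toString('utf8'));
  if (!Object.prototype.hasOwnProperty.call(header.files, name)) {
    throw Object.assign(new Error(`ENOENT: ${name}`), { code: 'ENOENT' });
  }
  const { offset, size } = header.files[name];
  const start = 4 + headerLength + offset;
  return archive.subarray(start, start + size);
}

// Split "/opt/app.asar/views/index.html" into the archive and the inner path.
// Returns null for ordinary paths, which would go to the OS as usual.
function splitAsarPath(p) {
  const marker = '.asar/';
  const i = p.indexOf(marker);
  if (i === -1) return null;
  return { archivePath: p.slice(0, i + 5), innerPath: p.slice(i + marker.length) };
}
```

The path split is the routing decision; the header lookup is the read. Electron makes both below the JavaScript layer, which is why callers never see the difference.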

ASAR has never been upstreamed: Electron carries the patch in its own Node.js fork, there has been no consensus on API design, and the governance overhead for a proposal of this scope in Node.js core has historically been prohibitive. The existence proof is still meaningful, though: binding-level virtual filesystem interception is tractable and production-viable. Node.js 20 added --experimental-permission, which checks fs calls in node_file.cc before they reach libuv. The infrastructure for this kind of interception already exists in the codebase.

Why Deno and Bun handle this transparently

deno compile and bun build --compile both produce binaries where embedded assets are accessible via the same filesystem APIs you use in development. Call Deno.readTextFile('./config.json') inside a compiled Deno binary and you get the embedded content, without a getAsset() equivalent. In Bun, fs.readFile on an embedded path works the same way.

Deno and Bun achieve this because they own the runtime stack end-to-end. Deno implements all I/O through an “op” system dispatched into Rust. The VFS is a routing decision in one codebase: the Rust layer checks an embedded file map before falling through to the OS. Bun’s node:fs compatibility layer is written in Zig, and asset lookup happens before any OS call. Neither runtime has to patch a foreign binding layer, because there is no foreign binding layer.

Node.js is built on libuv, developed separately, with no virtual filesystem hook and no per-path dispatch table. uv_fs_open() calls open(). That separation is a strength in many respects, but it means VFS support requires changes to two codebases or a workaround at the binding layer between them.

What Go got right

Go 1.16 added io/fs and embed in the same release, and that timing matters. The io/fs.FS interface is deliberately minimal:

type FS interface {
    Open(name string) (File, error)
}

The standard library was updated to thread this interface through http.FileServer, html/template, and io/fs.WalkDir. Implementations include os.DirFS for real files and embed.FS for compile-time embedded files. Any conforming implementation gets full compatibility with both the standard library and third-party code. The embed package lets you ship a web server with templates baked in, with no changes to the template parsing code:

//go:embed templates/*.html
var content embed.FS

// http.FileServer does not know or care whether this is disk or an embedded archive.
http.Handle("/", http.FileServer(http.FS(content)))
tmpl, _ := template.ParseFS(content, "templates/*.html")

This worked on day one because the abstraction was defined alongside the implementation. Node.js has no equivalent. The fs module is a set of concrete functions calling into libuv. Libraries receive path strings and call fs.readFile directly. Retrofitting a composable interface on top of that requires changes across much of the standard library and a long compatibility window.
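
For comparison, this is roughly what the Go pattern looks like when transplanted to JavaScript. Nothing here exists in Node.js: the two-method fsLike shape, dirFS, mapFS, and loadTemplates are all hypothetical names chosen for the sketch.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical minimal interface: { readFileSync(name, encoding), readdirSync(dir) }.

// Disk-backed provider rooted at a directory (similar in spirit to os.DirFS).
function dirFS(root) {
  return {
    readFileSync: (name, encoding) => fs.readFileSync(path.join(root, name), encoding),
    readdirSync: (dir) => fs.readdirSync(path.join(root, dir)),
  };
}

// In-memory provider backed by a plain object (similar in spirit to embed.FS).
function mapFS(files) {
  return {
    readFileSync(name, encoding) {
      if (!Object.prototype.hasOwnProperty.call(files, name)) {
        throw Object.assign(new Error(`ENOENT: ${name}`), { code: 'ENOENT' });
      }
      const data = Buffer.from(files[name]);
      return encoding ? data.toString(encoding) : data;
    },
    readdirSync(dir) {
      const prefix = dir.endsWith('/') ? dir : `${dir}/`;
      return Object.keys(files)
        .filter((n) => n.startsWith(prefix) && !n.slice(prefix.length).includes('/'))
        .map((n) => n.slice(prefix.length));
    },
  };
}

// A library function that takes the interface instead of reaching for node:fs.
function loadTemplates(fsLike, dir) {
  const out = {};
  for (const name of fsLike.readdirSync(dir)) {
    if (name.endsWith('.html')) {
      out[name] = fsLike.readFileSync(`${dir}/${name}`, 'utf8');
    }
  }
  return out;
}
```

loadTemplates works with either provider without knowing which one it received. The catch is that every library has to accept the parameter, and existing Node.js libraries do not.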

Java’s java.nio.file.FileSystemProvider SPI, added in Java 7, solves the same problem through provider registration; Spring Boot fat JARs, which nest JARs three levels deep, rely on exactly this mechanism. Python has had zipimport in the standard library since 2.3 and importlib.resources since 3.7. The pattern of defining the filesystem abstraction before shipping the runtime is not unusual. Node.js is the outlier.

The monkey-patch ceiling

Because no composable seam exists, the ecosystem patches the fs module object. memfs provides an in-memory filesystem with the same API surface, and unionfs overlays multiple implementations. These work well for testing and webpack’s internal compilation VFS, but they hit hard limits.

Patching the module cache for 'fs' does not automatically patch 'node:fs'. Any library using the node: prefix, standard since Node.js 14.18, escapes the patch. Native addons calling uv_fs_open() reach the real filesystem regardless of what you have done to require('fs'). Worker threads with fresh module scopes bypass any --require preload patches. mock-fs, which attempted to patch the binding layer directly, worked through Node.js 16 and has broken progressively since, as process.binding() was deprecated and Node.js 22 removed several internal bindings entirely.
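
The prefix gap takes only a few lines to reproduce. The sketch follows the behaviour the Node.js modules documentation describes for require.cache; the fakeFs object is a stand-in for whatever replacement a patching tool would install.

```javascript
const realFs = require('node:fs');

// Stand-in for a replacement filesystem installed by a patching tool.
const fakeFs = { readFileSync: () => 'from fake fs' };

// Overwrite the module cache entry for the unprefixed name.
require.cache.fs = { exports: fakeFs };

const unprefixed = require('fs');     // served from the cache: fakeFs
const prefixed = require('node:fs');  // skips the cache: the real module

// Undo the patch so later code sees the real module again.
delete require.cache.fs;

console.log(unprefixed === fakeFs, prefixed === realFs);
```

Any dependency that writes require('node:fs') gets the second result, no matter how early the cache was patched.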

What a fix requires

The Node.js single-executable working group has listed VFS support as a future goal beyond getAsset(). A practical proposal faces specific constraints. Synchronous variants like readFileSync cannot go async; providers must respond synchronously, ruling out network-backed storage. Native addons requiring a real path for dlopen still need sidecar extraction. Every fs call in every application would check a hook table, so the no-VFS-registered path must carry near-zero overhead. Worker threads require separate provider registration, and any hook must cover both require('fs') and require('node:fs') to close the prefix gap.
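
Two of these constraints, the near-free path when nothing is registered and the ban on asynchronous providers, can be sketched in JavaScript. This is an illustration of the shape of the problem, not a proposed API: registerProvider, vfsReadFileSync, and the prefix-based lookup are all hypothetical, and a real hook would sit in the binding layer.

```javascript
const fs = require('node:fs');

// Hypothetical hook table: entries are { prefix, provider }.
const providers = [];

function registerProvider(prefix, provider) {
  providers.push({ prefix, provider });
}

function vfsReadFileSync(file, encoding) {
  // Fast path: with no provider registered, the only extra cost is one length check.
  if (providers.length === 0) return fs.readFileSync(file, encoding);

  for (const { prefix, provider } of providers) {
    if (typeof file === 'string' && file.startsWith(prefix)) {
      const result = provider.readFileSync(file.slice(prefix.length), encoding);
      // A sync call cannot await, so a provider that returns a promise is an error.
      if (result && typeof result.then === 'function') {
        throw new TypeError(`provider for ${prefix} returned a promise from a sync call`);
      }
      return result;
    }
  }
  // No provider claimed the path: fall through to the OS.
  return fs.readFileSync(file, encoding);
}
```

A provider backed by an in-memory map or an embedded archive passes the synchronous check. One backed by a network fetch cannot, which is the restriction described above.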

Two plausible paths exist. A binding-level hook in node_file.cc with synchronous-only providers would give a general VFS mechanism usable for testing, SEA, and edge runtimes, comparable in scope to what pkg achieved with its internal hack. A narrower extension to node:sea, where embedded assets become transparently accessible via fs.readFile without a separate API, would close the gap for the most common use case and carry less risk.

Either way, getAsset() is a partial answer. The Platformatic post frames this as Node.js needing to catch up to Deno and Bun. The more precise framing is that Node.js has a filesystem binding with no interface layer above it, and the ecosystem has been working around that absence for a decade through increasingly fragile means. Adding the interface is the real work, and the design decisions made now, around synchronous providers, hook registration scope, and the node: prefix gap, will determine whether the solution ends up as composable as Go’s io/fs or as limited as the current patchwork of module cache overwrites and abandoned binding hacks.
