How Go's io/fs Interface Solved the Problem Node.js SEA Didn't

Node.js has spent years building solid interception points for module loading. The --experimental-loader flag arrived in Node 9, and module.register() landed in Node 18.19 and 20.6 as a composable, non-deprecated way to hook import and require. If you need to intercept module resolution, rewrite specifiers, or return synthetic modules, there is a clean API for it.

The filesystem has nothing equivalent. Matteo Collina’s recent article for Platformatic makes this case directly: Node.js needs a virtual filesystem layer. The argument is sound, and Collina, a Node.js TSC member, understands the constraints from the inside. What is worth examining alongside it is how Go and the JVM tackled the same problem, because the design choices they made show what Node.js is missing at the structural level.

The Go Approach: Define the Interface First

Go 1.16, released in February 2021, introduced two packages simultaneously: io/fs and embed. The pairing is what made both useful.

io/fs.FS is a deliberately minimal interface with a single method:

type FS interface {
    Open(name string) (File, error)
}

The standard library ships several implementations: os.DirFS wraps the real OS filesystem, embed.FS contains compile-time embedded files, and zip.Reader implements it for ZIP archives. More important than the implementations is what the standard library does with the interface: text/template.ParseFS, html/template.ParseFS, and fs.WalkDir all accept fs.FS directly, and http.FileServer accepts one through the http.FS adapter. Any conforming implementation plugs into all of them.

The embed package provides the compile-time half:

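//go:embed templates/*.html config/*.json
var content embed.FS // package-level variable; requires: import "embed"

// Inside a function (error handling elided for brevity):
data, _ := content.ReadFile("config/default.json")
tmpl, _ := template.ParseFS(content, "templates/*.html")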

Files are bundled into the binary at compile time. At runtime, content.ReadFile() returns data from memory without any OS call. Because embed.FS implements io/fs.FS, passing it to template.ParseFS works the same as passing a real directory. Application code does not need to know whether the files came from disk or a binary.

This works because Go defined the abstraction before shipping the implementation. The standard library already spoke io/fs.FS on day one of Go 1.16. New filesystem implementations, whether in-memory, ZIP-backed, embedded, or network-backed, inherit full standard library compatibility without additional wiring.

What Java Got Right in 2011

Java’s java.nio.file package, introduced in Java 7, included a FileSystemProvider SPI. Providers register via java.util.ServiceLoader and handle URI schemes. The built-in ZipFileSystem provider handles jar: URIs:

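// newFileSystem(Path) requires Java 13+; on Java 7-12 use
// FileSystems.newFileSystem(jarPath, (ClassLoader) null) instead.
Path jarPath = Paths.get("myapp.jar");
try (FileSystem fs = FileSystems.newFileSystem(jarPath)) {
    Path config = fs.getPath("/config/default.properties");
    byte[] data = Files.readAllBytes(config);
}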

Any code that uses java.nio.file.Files works with any FileSystem implementation. Third-party libraries can register their own providers. The standard library is written against the abstraction, so every new implementation gets ecosystem compatibility automatically.

Java’s approach is heavier than Go’s: a full service provider interface with registration, URI schemes, and a factory pattern. But the design principle is the same: define the abstraction first, then build the standard library on top of it.

Node.js fs Has No Interface

The Node.js fs module is a set of functions, each of which calls into a C++ binding that calls libuv, which calls the OS. The call stack for fs.readFile:

fs.readFile()          ← lib/fs.js (argument validation, wrapping)
binding('fs').open()   ← internal C++ binding (node_file.cc)
uv_fs_open()           ← libuv async I/O
OS syscall             ← kernel

There is no layer in this stack where you can substitute a custom implementation. You cannot pass a custom filesystem provider to fs.readFile. You cannot register a handler that runs before the C++ binding. The module loader could be refactored to support hooks because it is implemented in JavaScript, where composition is straightforward. The fs module has never had an equivalent seam added.

memfs provides a complete fs-compatible object that you can inject explicitly:

import { createFsFromVolume, Volume } from 'memfs';

const vol = Volume.fromJSON({ '/app/config.json': '{}' });
const myFs = createFsFromVolume(vol);

// Works for code you own and can refactor:
myFs.readFileSync('/app/config.json', 'utf8');

// Unaffected: third-party code that imports node:fs directly
import fs from 'node:fs';

memfs works for code you control and can refactor to accept an injected fs object. It does not work for any library that imports node:fs internally, which is most libraries.
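The injection pattern can be sketched without any third-party package. The names below (loadConfig, makeMemoryFs) are hypothetical and not part of memfs or Node.js; the in-memory object implements only readFileSync and accepts only a string encoding, which is far less than a real fs replacement would need.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: reads JSON through whatever fs-like object it is handed.
function loadConfig(fsLike, file) {
  return JSON.parse(fsLike.readFileSync(file, 'utf8'));
}

// Hypothetical in-memory stand-in that exposes only readFileSync.
function makeMemoryFs(files) {
  const table = new Map(Object.entries(files));
  return {
    readFileSync(file, encoding) {
      if (!table.has(file)) {
        const err = new Error(`ENOENT: no such file or directory, open '${file}'`);
        err.code = 'ENOENT';
        throw err;
      }
      const data = Buffer.from(table.get(file));
      // Simplification: only a string encoding is handled, not an options object.
      return encoding ? data.toString(encoding) : data;
    },
  };
}

// The same function works against the in-memory object...
const memFs = makeMemoryFs({ '/app/config.json': '{"port": 8080}' });
const fromMemory = loadConfig(memFs, '/app/config.json');

// ...and against the real node:fs module, using a temporary file on disk.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'inject-'));
const realFile = path.join(tmpDir, 'config.json');
fs.writeFileSync(realFile, '{"port": 3000}');
const fromDisk = loadConfig(fs, realFile);

console.log(fromMemory.port, fromDisk.port);
```

The catch is visible in the signature: loadConfig only benefits because it takes fsLike as a parameter. A dependency that calls require('node:fs') on its own never sees the injected object.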

mock-fs targets the binding layer instead: it patches the internal C++ binding directly. This worked through Node.js 16 but has broken progressively as process.binding() was deprecated and core moved to internal bindings that userland code cannot reach. Node 22 removed several internal bindings entirely. Even when it worked, it did not affect native addons that call libuv without going through lib/fs.js.

The bundler ecosystem found a different workaround: static transformation at build time. esbuild’s text loader (configured as, for example, loader: { '.txt': 'text' }) embeds file contents as string literals. webpack’s asset/inline converts files to base64 data URLs. Rollup plugins offer similar inlining. These approaches replace runtime filesystem reads with inline values baked into the bundle. They work for static, predictable assets. They do not work for paths constructed at runtime, plugins that read their own configuration, or code that enumerates directories.

The pkg Story

vercel/pkg, archived in September 2023, solved the transparency problem by patching Node.js at the binding layer. It compiled a modified version of Node.js where the C++ binding checked a virtual filesystem table before passing the call to libuv. Files bundled into the executable were stored at virtual paths with a prefix convention (/snapshot/myapp/ on Linux and macOS). When code called fs.readFile('/snapshot/myapp/config.json'), the patched binding returned data from memory without an OS call.

Existing code worked without modification. Third-party libraries, framework internals, the module loader itself: all went through the same patched binding and all got transparent VFS access. pkg worked because it put the interception at the right level.

The maintenance cost was substantial; the pkg team had to re-patch Node.js for every release, maintaining a fork of the compiled binary. When Node.js SEA arrived, the team deprecated pkg and directed users upstream. Node.js SEA did not include the VFS interception. Instead, it added a new getAsset() API in the node:sea module:

const { getAsset } = require('node:sea');
const config = getAsset('config.json', 'utf8');
const template = getAsset('email.html', 'utf8');

getAsset() requires code changes. Every callsite that previously used fs.readFileSync to load a bundled asset must be rewritten to call getAsset() instead. The API also only works inside a SEA context; calling it in a regular Node.js process throws. This is functional for applications you own entirely, but it does not help with third-party dependencies or framework code that reads files internally.
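A common way to contain that rewrite is a small wrapper that uses getAsset() inside a single executable and falls back to disk elsewhere. The sketch below is one possible shape, not an official pattern: readBundledText and the fallbackDir parameter are hypothetical, and it assumes asset keys double as file names relative to that directory. It relies on sea.isSea() and sea.getAsset() from node:sea, and guards the require because older Node.js releases do not ship that module.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// node:sea is missing on older Node.js releases, so load it defensively.
let sea = null;
try {
  sea = require('node:sea');
} catch {
  sea = null;
}

// Hypothetical helper: one callsite that works both inside and outside a SEA.
function readBundledText(key, fallbackDir) {
  if (sea && sea.isSea()) {
    // Inside a single executable: read the embedded asset.
    return sea.getAsset(key, 'utf8');
  }
  // Regular process (development, tests): read the same asset from disk.
  return fs.readFileSync(path.join(fallbackDir, key), 'utf8');
}

// Demo in a regular process: the fallback branch reads a temporary file.
const assetDir = fs.mkdtempSync(path.join(os.tmpdir(), 'assets-'));
fs.writeFileSync(path.join(assetDir, 'config.json'), '{"mode":"dev"}');
const configText = readBundledText('config.json', assetDir);
console.log(configText);
```

The wrapper keeps application code readable, but it only helps code that calls it. A dependency that reads its own files with fs.readFileSync is still outside its reach.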

Deno and Bun did not make this tradeoff. Both deno compile and bun build --compile produce binaries where existing file-reading code works transparently on embedded assets. Deno implements its EmbeddedFs at the op layer (the Rust-level equivalent of Node’s binding layer), and the lookup runs before any OS call. Bun does the same in its Zig-based native layer. Both runtimes own their full stack from JavaScript to the OS boundary, and both put the VFS at the level where it needed to be.

What a Real Solution Requires

The Node.js single-executable working group has listed VFS support as a future goal beyond the initial getAsset() API. The design challenge is not trivial.

A hook system modeled on module hooks might look like:

// Hypothetical future API
import { registerFsHook } from 'node:fs/hooks';

registerFsHook({
  async read(path) {
    const asset = embeddedAssets.get(path);
    if (asset) return asset;
    return null; // fall through to real fs
  }
});

This runs into trouble with synchronous calls. fs.readFileSync blocks the event loop thread via SyncCall() in node_file.cc. Running async JavaScript in that path has no clean mechanism in the current architecture. A hook that runs in JavaScript would have to be synchronous, which constrains what it can do and limits its usefulness for async-heavy startup code.
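To make the synchronous constraint concrete, here is a userland approximation that patches fs.readFileSync on the module object. It is a sketch under stated assumptions, not a proposed core API: installSyncReadHook and embeddedAssets are hypothetical names, the hook must return a Buffer or null immediately, and only string paths are intercepted.

```javascript
const fs = require('node:fs');

// Hypothetical stand-in for assets embedded in a binary.
const embeddedAssets = new Map([
  ['/snapshot/myapp/config.json', Buffer.from('{"embedded":true}')],
]);

const originalReadFileSync = fs.readFileSync;

// Hypothetical registration function. The hook runs synchronously and returns
// either a Buffer (serve from memory) or null (fall through to the real fs).
function installSyncReadHook(hook) {
  fs.readFileSync = function patchedReadFileSync(file, options) {
    const hit = typeof file === 'string' ? hook(file) : null;
    if (hit === null) {
      return originalReadFileSync.call(fs, file, options);
    }
    const encoding =
      typeof options === 'string' ? options : options && options.encoding;
    // Return a copy so callers cannot mutate the embedded data.
    return encoding ? hit.toString(encoding) : Buffer.from(hit);
  };
  // Return a function that restores the original implementation.
  return function uninstall() {
    fs.readFileSync = originalReadFileSync;
  };
}

const uninstall = installSyncReadHook((file) => embeddedAssets.get(file) ?? null);

// Served from the Map; no file exists at this path on disk.
const viaHook = fs.readFileSync('/snapshot/myapp/config.json', 'utf8');
console.log(viaHook);

uninstall();
```

This covers one function out of dozens. It misses fs.promises.readFile, any reference to readFileSync captured before the patch was installed, and native addons entirely, which is why patching in JavaScript is a weaker position than intercepting below the binding.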

The more robust solution is a C++ implementation: a virtual filesystem table in Node.js core, populated via a stable native API, checked in node_file.cc before the libuv call. pkg took this approach in its patched binary. Moving it into upstream Node.js would make it available to all callers, including native addons, and would handle both sync and async paths uniformly.

The JVM’s FileSystemProvider SPI remains the most principled design for a system at Java’s scale. Go’s io/fs.FS is simpler and fits Node.js’s preference for minimal APIs better. Node.js does not need to copy either design exactly, but it does need to pick a layer, implement the interception there, and expose it as a stable API surface.

The Platformatic post is a clear call for that work. The Hacker News discussion confirms the frustration is widespread. The implementation precedent exists in pkg, in Deno, and in Bun. The remaining step is getting it into Node.js core.
