
Every npm Install Is a Code Execution Event

Source: simonwillison

Simon Willison’s recent piece, Package Managers Need to Cool Down, lands at a moment when the JavaScript and Python ecosystems are both mid-churn on tooling. The JavaScript side has npm, Yarn Classic, Yarn Berry, pnpm, and Bun all coexisting, each with subtly different resolution strategies and lock formats. Python has pip, Pipenv, Poetry, conda, and now uv racing ahead on speed. Each new entrant adds features, changes defaults, and shifts what “installing a package” means in practice. Willison’s argument, broadly, is that this pace of change has a cost the ecosystem underweights.

The cost I want to focus on is specific: package managers have become implicit code execution environments, and most developers interact with that fact only after something goes wrong.

What Happens When You Type npm install

The package.json format allows any package to define lifecycle scripts that execute at install time. A dependency deep in your tree can include:

{
  "scripts": {
    "preinstall": "node ./scripts/check-env.js",
    "install": "node-gyp rebuild",
    "postinstall": "node ./scripts/telemetry.js"
  }
}

These scripts run under your user account, with access to your filesystem, your environment variables, and your network. When a project has 300 transitive dependencies, npm install can trigger scripts from dozens of those packages. You consented to none of them individually.
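To make the scale of this concrete, here is a minimal JavaScript sketch that lists which packages in a set of manifests declare install-time hooks. The function name findInstallScripts, the sample package names, and the input shape (an array of already-parsed package.json objects) are assumptions for illustration, not part of npm.

```javascript
// Sketch: report which packages declare install-time lifecycle scripts.
// `manifests` is a hypothetical input: an array of parsed package.json objects.
const INSTALL_HOOKS = ["preinstall", "install", "postinstall"];

function findInstallScripts(manifests) {
  const findings = [];
  for (const manifest of manifests) {
    const scripts = manifest.scripts || {};
    // Keep only the hooks that npm runs while installing a dependency.
    const hooks = INSTALL_HOOKS.filter((hook) => typeof scripts[hook] === "string");
    if (hooks.length > 0) {
      findings.push({
        name: manifest.name,
        version: manifest.version,
        hooks: Object.fromEntries(hooks.map((hook) => [hook, scripts[hook]])),
      });
    }
  }
  return findings;
}

// Made-up manifests standing in for entries under node_modules.
const example = [
  { name: "plain-utility", version: "1.0.0", scripts: { test: "node test.js" } },
  { name: "native-thing", version: "2.1.0", scripts: { install: "node-gyp rebuild" } },
  { name: "chatty", version: "0.3.0", scripts: { postinstall: "node ./scripts/telemetry.js" } },
];

for (const finding of findInstallScripts(example)) {
  console.log(`${finding.name}@${finding.version}`, finding.hooks);
}
```

In a real project the manifests would come from reading each package.json under node_modules. The sketch also misses one implicit case: when a package ships a binding.gyp and defines no install or preinstall script of its own, npm defaults the install step to node-gyp rebuild.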

The node-gyp ecosystem exists largely because of this mechanism. Native addons need to compile C++ against your local Node headers, and the install hook is how they do it; other packages use postinstall to download a prebuilt binary instead. That’s a legitimate use case. But the same hooks are available to every package, regardless of whether it actually needs to compile anything.

The Incidents That Should Have Changed This

The event-stream incident in 2018 is the canonical example. A new maintainer, handed publish rights to a widely-used stream utility, added a dependency on flatmap-stream, whose published build carried an obfuscated payload targeting Copay Bitcoin wallet users. That attack did not need a lifecycle script at all: the payload ran when the module was loaded. The package had millions of weekly downloads, and the malicious dependency was live for more than two months before detection.

In January 2022, the maintainer of colors.js and faker.js deliberately shipped broken versions in protest of unpaid open source labor. Thousands of projects that accepted latest or a floating version range, rather than an exact version, broke on their next install. The protest was understandable; the delivery channel, a new release pulled in automatically, was the same one attackers use.

The ua-parser-js hijacking a few months earlier, in October 2021, is the clearest install-script case: a maintainer’s npm account was compromised, malicious versions were published, and a preinstall script downloaded a cryptocurrency miner.

Then came xz-utils in 2024, which was not npm-specific but was the clearest demonstration of how patient and sophisticated supply chain attacks had become. A two-year social engineering campaign resulted in a backdoor in a compression library that ships in nearly every Linux distribution. The attacker contributed meaningfully to the project before inserting the payload.

Each of these incidents prompted retrospectives, new tooling announcements, and pledges to do better. The structural problem, which is that package installation implies trusting arbitrary code from arbitrary authors at install time, remained intact.
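The colors.js and faker.js breakage reached projects through version specifiers that float. As a rough illustration, the sketch below flags dependency entries that are not an exact version. The helper name findFloatingDependencies and the sample dependency map are hypothetical, and the regular expression is a deliberate simplification rather than a full semver parser.

```javascript
// Sketch: flag dependency specifiers that may resolve to a different version
// on the next install. Only plain exact versions ("1.4.0", "1.4.0-beta.1")
// count as pinned here; anything else (ranges, tags, URLs) is reported.
const EXACT_VERSION = /^\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;

function findFloatingDependencies(dependencies) {
  return Object.entries(dependencies)
    .filter(([, spec]) => !EXACT_VERSION.test(spec))
    .map(([name, spec]) => ({ name, spec }));
}

// Hypothetical "dependencies" section from a package.json.
const deps = {
  colors: "^1.4.0",
  faker: "latest",
  express: "4.18.2",
  chalk: "~5.3.0",
};

for (const { name, spec } of findFloatingDependencies(deps)) {
  console.log(`${name}: "${spec}" is not an exact version`);
}
```

A committed lockfile installed with npm ci gives similar protection without rewriting package.json, since npm ci installs exactly what the lockfile records.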

How Cargo Handled This Differently

Cargo, Rust’s package manager, made a deliberate choice here. It has no install-time lifecycle hooks: fetching a crate runs nothing. If a crate needs build-time code execution, it ships a build script, build.rs by convention, which Cargo compiles and runs when the crate is built:

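// build.rs (assumes the `cc` crate is listed under [build-dependencies] in Cargo.toml)
fn main() {
    // Re-run this script only when the C source changes.
    println!("cargo:rerun-if-changed=src/native.c");
    // Compile src/native.c into a static library named "native" and link it.
    cc::Build::new().file("src/native.c").compile("native");
}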

The key difference is visibility and explicitness, not isolation. Cargo does not sandbox build scripts or procedural macros; they run with your user’s privileges like any other program. But a build.rs is a first-class artifact in the source tree, visible in code review and checked into version control, rather than a one-line entry in a JSON metadata file.

This does not eliminate supply chain risk. A malicious build.rs can still do damage. But the surface area is smaller, the signal is more visible, and the ecosystem norm pushes toward native compilation rather than arbitrary shell execution.

Pip, for its part, has no equivalent of npm’s lifecycle hooks for wheels: installing a wheel unpacks files and runs no package code. Source distributions are the exception, because building one executes the package’s build backend (a setup.py under setuptools, or meson), and pip does that automatically when no compatible wheel is available. The uv package manager from Astral, which has become the default recommendation for Python tooling in many circles, follows the same wheel-first model and adds fast, reproducible, lockfile-driven installs on top of it.

AI-Assisted Development Makes This Worse

There is a compounding factor here that Willison’s framing touches and that deserves direct attention. AI coding tools suggest package additions constantly. A suggestion from Copilot or Claude to “add the sharp library for image processing” is usually reasonable; sharp is well-maintained, and its install-time step, which fetches or builds a native binary, is expected and documented. But the average developer using an AI assistant to scaffold a project is not auditing the install scripts of every suggested dependency. The cognitive overhead of reviewing those scripts has always been high; AI tooling has shifted the question from “should I use this library” to “the AI added this library, do I remove it.”

Socket.dev has built a business on exactly this gap. Their tool analyzes packages before installation and flags unusual behaviors: newly added install scripts, suspicious network activity in postinstall, unexpected maintainer changes. It’s good tooling, but it should not be necessary. The default trust model should not require a separate audit layer on top of it.

The --ignore-scripts Flag Nobody Uses

npm has had --ignore-scripts for years:

npm install --ignore-scripts

This disables all lifecycle scripts across the entire install. It’s also largely unusable in practice because a meaningful fraction of packages with native dependencies require their install or postinstall step to function. Try running a project with node-gyp-compiled packages after an --ignore-scripts install and you’ll get runtime errors immediately. The workable version is to follow up with npm rebuild for the specific packages you have decided to trust, which few teams bother to do.

A better design would flip the default: require packages to opt into install-time execution, require explicit user acknowledgment when a new dependency declares that requirement, and surface those declarations in the package registry itself. pnpm 10 and Bun already lean this way, skipping dependency lifecycle scripts by default unless a package is allowlisted; npm itself has not followed. npm’s provenance feature, which cryptographically links published packages to their source repository and CI build, is a step in this direction but addresses authenticity rather than execution scope.
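The opt-in model described above can be sketched in a few lines of JavaScript. This is only an illustration of the policy, not an npm feature: planInstallScripts, the allowlist argument, and the package entries are all hypothetical.

```javascript
// Sketch of an opt-in policy: install scripts run only for packages the
// project has explicitly allowlisted; everything else is held for review.
function planInstallScripts(packagesWithScripts, allowlist) {
  const allowed = new Set(allowlist);
  const run = [];
  const blocked = [];
  for (const pkg of packagesWithScripts) {
    (allowed.has(pkg.name) ? run : blocked).push(pkg);
  }
  return { run, blocked };
}

// Hypothetical input: packages that declare install-time hooks.
const plan = planInstallScripts(
  [
    { name: "sharp", hooks: ["install"] },
    { name: "unknown-helper", hooks: ["postinstall"] },
  ],
  ["sharp"] // the project's explicit acknowledgments
);

console.log("would run:", plan.run.map((p) => p.name));
console.log("needs acknowledgment:", plan.blocked.map((p) => p.name));
```

The design choice that matters is the direction of the default: a package absent from the list is blocked, so a newly added dependency with a hook produces a visible decision instead of silent execution.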

What Cooling Down Actually Looks Like

For npm specifically, cooling down means decoupling the package format from implicit execution and surfacing install-time scripts as first-class security signals rather than buried metadata. The registry already holds every published version’s manifest, which is enough to flag packages that added an install script for the first time, or that changed their install script between versions. Surfacing that in npm install output would cost little and would have flagged install-script attacks such as the ua-parser-js hijack.
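As a rough sketch of that signal, the function below compares the install-time hooks of two versions of a package and reports what was added, removed, or changed. The name diffInstallHooks and the sample manifests are hypothetical; a real check would pull both manifests from registry metadata.

```javascript
// Sketch: compare install-time hooks between two published versions.
const HOOKS = ["preinstall", "install", "postinstall"];

function diffInstallHooks(previousManifest, nextManifest) {
  const before = previousManifest.scripts || {};
  const after = nextManifest.scripts || {};
  const changes = [];
  for (const hook of HOOKS) {
    if (before[hook] === after[hook]) continue; // identical, or absent in both
    let kind = "changed";
    if (before[hook] === undefined) kind = "added";
    else if (after[hook] === undefined) kind = "removed";
    changes.push({ hook, kind, before: before[hook], after: after[hook] });
  }
  return changes;
}

// Made-up manifests for two consecutive releases of the same package.
const v1 = { name: "some-lib", version: "1.2.0", scripts: { test: "node test.js" } };
const v2 = {
  name: "some-lib",
  version: "1.2.1",
  scripts: { test: "node test.js", preinstall: "node fetch.js" },
};

for (const change of diffInstallHooks(v1, v2)) {
  console.log(`${v2.name} ${v1.version} -> ${v2.version}: ${change.kind} ${change.hook}`);
}
// prints "some-lib 1.2.0 -> 1.2.1: added preinstall"
```

A patch release that adds a hook where none existed is exactly the kind of change a reviewer would want to see before the install proceeds.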

For the broader ecosystem, cooling down means resisting the pull toward feature parity across package managers. Every new capability a package manager acquires is a new attack surface and a new behavior developers have to model mentally. Bun adding a test runner and a bundler alongside its package manager is technically impressive; it also means that bun install is now part of a larger executable with a larger blast radius if something goes wrong.

The package managers that age well are the ones that do less, do it predictably, and make the security properties of every operation legible. Cargo did not win by being faster than other ecosystems’ package managers; it won by making dependency management feel safe enough that developers stopped dreading it. That’s the bar.

The JavaScript and Python ecosystems are both capable of getting there. But the path runs through restraint, not through the next feature release.
