Nix Type Checking Hits a Hard Ceiling at callPackage

Building a type checker for Nix sounds like a matter of encoding its small value universe into a suitable type system. Nix has nine value kinds: integers, floats, booleans, strings, paths, null, lists, attribute sets, functions, and derivations. For a language with that small a surface, adding types seems tractable. Johns’ recent article on building a type checker and LSP for Nix explores exactly this territory, and the project makes real progress on the parts of Nix that behave like an ordinary typed functional language.

The hard part is not encoding Nix’s type universe; it is callPackage, and what it would actually take to verify it statically.

How nixpkgs Actually Works

callPackage is how virtually every package in nixpkgs is instantiated. A simplified version of the pattern:

# A package file
{ stdenv, fetchurl, lib }:

stdenv.mkDerivation rec {
  pname = "hello";
  version = "2.12";
  src = fetchurl { /* ... */ };
}

# At the nixpkgs level
hello = callPackage ./pkgs/hello/default.nix {};

callPackage inspects the function’s argument names at runtime using builtins.functionArgs, then pulls matching values from the package set. The function in hello/default.nix expects { stdenv, fetchurl, lib }, so callPackage looks up those three names in the package set and passes them. If you add a new parameter to the package function without making it available in the package set, the error surfaces at evaluation time, possibly after a long build.

The mechanism that makes this work is builtins.functionArgs:

builtins.functionArgs ({ foo, bar }: foo + bar)
# => { foo = false; bar = false; }

This returns an attribute set whose keys are the parameter names of the function. The boolean values indicate whether each parameter has a default. callPackage uses this result to know what to supply from the package set.

Why This Cannot Be Represented Without Dependent Types

TypeScript can extract a function’s argument types through conditional types and infer:

type ArgType<F> = F extends (arg: infer A) => any ? A : never;

What TypeScript cannot do is extract a function’s argument names as string literals from an arbitrary function type. There is no TypeScript type machinery that yields "foo" | "bar" from (foo: number, bar: string) => void. Parameter names are erased at the type level. The type system operates on value shapes, not on the identifiers used to denote parameters.

callPackage’s correctness depends on a computation that takes a function value, extracts its parameter names as strings, and looks those strings up in another attribute set. The type of callPackage’s result depends not on the type of the function argument but on a structural property of its internal representation: specifically, the names its parameters happen to have. This is a dependent type. Hindley-Milner inference cannot handle it. TypeScript’s conditional types cannot model it. Encoding callPackage’s behavior statically requires a type system with dependent types, or a significant extension like the row polymorphism that Nickel uses.

Nickel is the most serious typed alternative in this space. It implements gradual typing with row polymorphism for records: a Nickel function type { foo: Num | r } -> Num accepts any record with at least a foo : Num field, where r is a row variable capturing additional fields. This can express “a record that is a superset of this shape” without pinning down the exact field names. It is considerably closer to what callPackage needs than TypeScript’s structural object types. But Nickel is a new language, not a type system for existing Nix code, and the two ecosystems do not share code.

The `with` Statement and Scope Injection

callPackage is the structurally deepest case, but with creates a related problem at the expression level. Nearly every file in nixpkgs begins with something like:

{ lib, ... }:

with lib;

mkDerivation { /* ... */ }

with lib; injects every attribute of lib into lexical scope. To type-check a reference to mkDerivation following that statement, a type checker needs to know the complete type of lib at that point, then simulate scope injection from a value.

TypeScript has no equivalent mechanism. The closest approximation is an explicit import: import { mkDerivation } from './lib'. Handling with through a TypeScript-based type system means rewriting with expr; body into explicit attribute references, at which point the type-encoded representation no longer resembles Nix. For a tool that aims to check actual Nix code rather than a TypeScript encoding of it, with expressions that reference anything other than a statically known attrset with a fully specified type are opaque.

Both callPackage and with point in the same direction: the Nix idioms that are easiest to check statically are the ones that look most like ordinary typed functional programming. The idioms at the core of nixpkgs are the hardest, because they rely on runtime structural introspection that static type systems are specifically designed not to require.

What nil Does Differently

The nil language server for Nix, written in Rust, takes a different approach entirely. It does not attempt to be a sound type checker. Instead, it performs lightweight local type inference to support IDE features: hover information, go-to-definition, basic completion. For the callPackage case, nil parses the function being called, extracts argument names directly from the AST, and uses that information to provide completions for the override attribute set. This is parse-time analysis, not type-theoretic analysis.

The distinction matters because nil’s approach sidesteps the dependent type problem by never trying to assign callPackage a type that captures its behavior across all possible inputs. Instead, it reads the structure of the specific code being edited and provides locally grounded information. nixd takes this further by actually invoking the Nix evaluator to provide completion candidates, analogous to what clangd does with the Clang frontend. Using the real evaluator means nixd can answer questions that static analysis alone cannot, at the cost of evaluation overhead.

The pattern here mirrors how editors handle dynamic languages generally. Python’s language servers can look up attribute names for an object whose type they’ve inferred, even if they cannot verify soundness across all call sites. The goal is useful IDE support, not a proof of correctness.

What the Type Checker Provides, Honestly

The value of a Nix type checker is not uniform across the language’s features. For the parts that work well: attrset field types, function argument types, list element types, module option declarations via lib.mkOption, and basic structural mismatches, a statically analyzed type layer provides genuine value. Errors appear before nixos-rebuild switch runs. Editor completions work. Mismatched types on NixOS module options surface inline rather than surfacing halfway through an evaluation.

For callPackage, with expressions over unknown attrsets, dynamic attribute access via set.${dynamicKey}, and the NixOS module system’s merging semantics (lib.mkIf, lib.mkForce, lib.fixedPoints.fix), any static type checker without dependent types hits a wall. It cannot verify these cases without either abandoning soundness or refusing to check them at all.

This is not a reason to abandon the project. Python’s mypy has equivalent gaps around __getattr__, locals(), and runtime class manipulation. TypeScript cannot fully type eval. Ruby’s Sorbet gives up on some dynamic method dispatch patterns. The pragmatic measure of a type checker is not whether it is sound on the entire language but whether it catches enough real mistakes to justify the annotation overhead.

The LSP features, completions, hover types, go-to-definition, and rename refactoring, deliver value even where the type system is incomplete. A language server that infers the type of a NixOS module option from its lib.mkOption declaration and provides completion for valid values is useful regardless of whether it can verify callPackage usage downstream.

What Would Actually Move the Needle

Two things would expand coverage significantly without requiring dependent types.

First, type stubs generated from nixpkgs’s existing lib.types declarations would give any static type checker coverage over the NixOS option surface with minimal manual annotation. The lib.types system already encodes the type information at runtime; extracting it for static analysis is a generation problem, not a design problem. A tool that reads lib.mkOption declarations and emits type stubs could provide type-checked editor support for the entire NixOS configuration surface.

Second, treating callPackage and with as acknowledged coverage gaps, rather than cases to encode imprecisely, produces a more honest and more useful tool. A type checker that reports “this expression is unverifiable due to runtime introspection” is more trustworthy than one that assigns a conservative or incorrect type and produces false confidence.

The Nix type checking problem is not uniformly hard. Most of the language is straightforward to analyze statically. The hard cases are specific and well-understood. Knowing where the ceiling is, and designing around it explicitly rather than obscuring it, is what separates a useful tool from one that gives you types in the easy cases and quietly fails in the cases that matter most.