
Nix Finally Gets a Type Checker, and the Hard Part Isn't What You'd Expect

Source: lobsters

The Nix language has always occupied an uncomfortable position in the tooling landscape. It is powerful enough to describe the entire dependency graph of a Linux distribution, yet its editor support has lagged behind languages a fraction of its age. You get syntax highlighting if you’re lucky, maybe a formatter, and an LSP that can tell you where a variable is defined but not what type it has. A recent project by John Lorenz changes that by building both a type checker and a language server for Nix from scratch, and the engineering story behind it reveals why nobody had done it properly before.

What Makes Nix Hard to Type

Nix is a purely functional, lazily evaluated, dynamically typed language. Those three properties compound in ways that defeat the standard approaches to adding types to a dynamic language.

Lazy evaluation means a value is never computed until something demands it. In a strict language, if you write a function that would produce a type error, the error surfaces the first time that code path runs. In Nix, a value can sit unevaluated in an attribute set forever without causing problems, and a type error buried in an unused branch may never be observed at runtime. Static analysis has to reason about code that the evaluator would never touch.
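To make the point concrete, here is a minimal sketch that models lazy attribute values as thunks. It is written in Python rather than Nix, and the Thunk class is a hypothetical stand-in for the evaluator's internal representation, not how Nix is actually implemented.

```python
class Thunk:
    """A delayed computation, forced at most once (a rough model of a lazy value)."""

    def __init__(self, compute):
        self._compute = compute
        self._forced = False
        self._value = None

    def force(self):
        if not self._forced:
            self._value = self._compute()
            self._forced = True
            self._compute = None  # drop the closure once the value is known
        return self._value


# Roughly the Nix set { good = 1 + 2; bad = 1 + "hello"; }
attrs = {
    "good": Thunk(lambda: 1 + 2),
    "bad": Thunk(lambda: 1 + "hello"),  # ill-typed, but only fails if forced
}

result = attrs["good"].force()  # "bad" is never touched, so no error surfaces
```

Only forcing attrs["bad"] would raise an error. A static checker has to flag that attribute even though this particular program never demands it.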

Attribute sets are the more fundamental challenge. In Nix, attribute sets (written { foo = 1; bar = "hello"; }) are the primary data structure. Everything composes through them: packages, modules, overlays, NixOS configurations. The problem is that attribute sets in Nix are structurally dynamic. You can merge two sets with //, let their attributes refer to one another with rec, bring their attributes into scope with with, and pass them to functions that destructure them with set patterns. A function might accept { pkgs, lib, ... }:, where the ... means “any other attributes are fine too.” Typing this correctly requires row polymorphism or some form of structural subtyping, neither of which is part of the basic Hindley-Milner algorithm that underlies most ML-family type inference.

The with expression deserves special mention. with pkgs; brings every attribute of pkgs into scope, making it impossible to know statically which names are available without evaluating pkgs first. This is why the existing language servers, nil and nixd, handle with differently. nil does conservative static analysis and treats with scopes carefully; nixd partially evaluates actual nixpkgs to provide completion candidates. Both are pragmatic workarounds rather than solutions to the underlying problem.
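The difficulty can be sketched as a tiny name resolver. This is an illustrative Python model, not code from nil, nixd, or the project discussed here; the resolve function and its return labels are hypothetical. It relies on one fact about Nix scoping: names bound lexically (by let or function arguments) take priority over names introduced by with.

```python
def resolve(name, lexical_names, with_depth):
    """Classify a variable reference without evaluating anything.

    lexical_names: names bound by enclosing let bindings or function arguments.
    with_depth: number of enclosing `with` expressions.
    """
    if name in lexical_names:
        return "lexical"          # statically known binding
    if with_depth > 0:
        return "maybe-from-with"  # only evaluation can tell whether it exists
    return "undefined"            # definitely an error


# Inside `{ pkgs, lib, ... }: with pkgs; [ hello lib ]`
lib_kind = resolve("lib", {"pkgs", "lib"}, with_depth=1)
hello_kind = resolve("hello", {"pkgs", "lib"}, with_depth=1)
```

Every name that lands in the "maybe-from-with" bucket is one a purely static tool can neither confirm nor reject, which is why a single with around a large expression weakens undefined-variable diagnostics for everything inside it.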

The Existing LSP Landscape

nil is probably the most widely used Nix language server today. Written in Rust, it provides go-to-definition, find references, completion, and some basic diagnostics without evaluating any Nix code. It parses using a hand-rolled parser and does name resolution statically. Its diagnostics are conservative: it won’t tell you that you’re calling a function with the wrong argument type because it does not track types at all.

nixd, by contrast, relies on the real Nix evaluator for certain operations. This gives it much better completion inside nixpkgs expressions because it can actually evaluate the attribute set you’re inside and enumerate its keys. The cost is evaluation latency and the requirement that your system has a working Nix installation. It is a different philosophy: lean on the evaluator rather than re-implement its semantics.

Neither tool does type inference. Lorenz’s project targets this gap specifically.

Typing an Untyped Language

The canonical reference point for this kind of project is TypeScript. TypeScript added a structural type system to JavaScript, a language with a similarly dynamic character, and navigated the same tension between soundness and practicality. TypeScript explicitly chose to be unsound: some programs that are type-correct according to TypeScript will still throw at runtime. The payoff is that you can type most real JavaScript patterns, including object spread, dynamic property access, and union types that arise from conditional logic.

For Nix, the analogous design questions are: how do you type attribute set merging (//), how do you handle rec sets where attributes reference each other, and how do you represent the type of a function that accepts open attribute patterns?
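For the first of these questions, the simplest case is a merge of two sets whose fields are all known. The sketch below is an assumption about how such a rule could look, not the project's actual rule: record types are plain Python dicts from field name to a type name, and merge_type is a hypothetical helper.

```python
def merge_type(left, right):
    """Type of `left // right` for closed record types (field name -> type name).

    Fields of the right operand win, and the merge is shallow: a nested
    record on the right replaces the one on the left instead of merging into it.
    """
    merged = dict(left)
    merged.update(right)
    return merged


# { a = 1; b = "x"; } // { b = 2; c = true; }
result_type = merge_type(
    {"a": "int", "b": "string"},
    {"b": "int", "c": "bool"},
)
# result_type is {"a": "int", "b": "int", "c": "bool"}
```

The rule stops being this simple as soon as either operand has an unknown set of fields, which is where row variables or a dynamic type come in.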

Row polymorphism, as used in Elm and PureScript, is one answer to the last question. A function { pkgs, lib, ... }: ... would have a type like { pkgs : PkgSet, lib : Lib | r } -> ..., where r is a row variable representing the remaining fields. This lets the type checker verify that pkgs and lib are present and correctly typed without constraining what other fields the caller provides. It is more expressive than simple record types but significantly more complex to implement and explain.
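A heavily simplified sketch of that check follows, assuming record types are dicts from field name to a type name. The function match_open_pattern and the type names PkgSet, Lib, and Config are illustrative placeholders. A real implementation would unify row variables inside a full inference engine; this only shows what the row variable ends up bound to.

```python
def match_open_pattern(required, argument):
    """Check an argument record type against an open pattern `{ ..., ... }`.

    required: fields the function names explicitly (field -> type name).
    argument: the record type supplied by the caller.
    Returns the leftover fields, i.e. what the row variable r is bound to.
    Raises TypeError if a required field is missing or has the wrong type.
    """
    rest = dict(argument)
    for field, expected in required.items():
        if field not in rest:
            raise TypeError(f"missing attribute '{field}'")
        actual = rest.pop(field)
        if actual != expected:
            raise TypeError(f"attribute '{field}': expected {expected}, got {actual}")
    return rest


# Calling `{ pkgs, lib, ... }: ...` with a set that also carries `config`
row = match_open_pattern(
    {"pkgs": "PkgSet", "lib": "Lib"},
    {"pkgs": "PkgSet", "lib": "Lib", "config": "Config"},
)
# row is {"config": "Config"}: extra fields are accepted and remembered
```

For a closed pattern without the trailing ..., the same check would additionally require the leftover row to be empty.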

Gradual typing is another lever. Rather than requiring every expression to have a known type, a gradual type system provides a dynamic type (often written ? or Any) that opts out of checking. Python’s mypy and pyright use this model: unannotated function parameters generally default to Any, which is compatible with everything. The trade-off is that errors can hide behind Any boundaries.
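The core of a gradual system is usually a consistency relation that replaces type equality. Below is a minimal sketch under an assumed encoding, where base types are strings and compound types are tuples such as ("list", element_type); it is not the representation used by mypy, pyright, or the project discussed here.

```python
ANY = "?"  # the dynamic type


def consistent(a, b):
    """Return True if types a and b are compatible under gradual typing."""
    if a == ANY or b == ANY:
        return True  # the dynamic type is consistent with everything
    if isinstance(a, tuple) and isinstance(b, tuple):
        # Compound types: same shape, and all components consistent.
        return len(a) == len(b) and all(consistent(x, y) for x, y in zip(a, b))
    return a == b


ok_list = consistent(("list", "int"), ("list", ANY))  # True
clash = consistent("int", "string")                   # False
hidden = consistent("int", ANY) and consistent(ANY, "string")  # True
```

The last line shows how an error hides: an int can flow into a ? and the ? can flow into a string position, even though int and string are not consistent with each other directly.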

For Nix, gradual typing makes particular sense because the language is used in two very different modes. Nix expression files that configure packages are often one-off scripts where type checking would add friction without benefit. NixOS module definitions, by contrast, have a structured interface with known attribute types, and type errors in modules can produce confusing evaluation failures that are hard to debug without type information.

The LSP Layer

Building a type checker is one problem; building a language server that uses it is another. An LSP must respond to user input in tens of milliseconds. Full re-analysis on every keystroke is not viable for anything but trivial files, which means the architecture needs incrementality baked in from the start.

The approach used by rust-analyzer, the gold standard for LSP implementation, is to model compilation as a demand-driven computation graph using the Salsa framework. Each analysis phase (parsing, name resolution, type inference) is expressed as a query; results are cached and only recomputed when their inputs change. A single character edit invalidates a small portion of the graph, and only the affected queries re-execute.
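The caching idea can be illustrated with a toy query database. This is not Salsa's API and is far cruder than what rust-analyzer does: the Database class is hypothetical, invalidation is tracked per file rather than per dependency, and the "analysis" is just whitespace tokenization.

```python
class Database:
    """Toy demand-driven query cache with revision-based invalidation."""

    def __init__(self):
        self.revision = 0
        self.sources = {}  # path -> (text, revision at which it last changed)
        self.cache = {}    # (query name, path) -> (value, revision when computed)
        self.runs = []     # log of queries that actually executed

    def set_source(self, path, text):
        self.revision += 1
        self.sources[path] = (text, self.revision)

    def _query(self, name, path, compute):
        text, changed_at = self.sources[path]
        hit = self.cache.get((name, path))
        if hit is not None and hit[1] >= changed_at:
            return hit[0]  # input unchanged since we computed: reuse
        self.runs.append((name, path))
        value = compute(text)
        self.cache[(name, path)] = (value, self.revision)
        return value

    def tokens(self, path):
        return self._query("tokens", path, lambda text: text.split())

    def token_count(self, path):
        # A derived query that depends on the `tokens` query for the same file.
        return self._query("token_count", path, lambda _text: len(self.tokens(path)))


db = Database()
db.set_source("a.nix", "x: x")
db.set_source("b.nix", "{ }")
first = (db.token_count("a.nix"), db.token_count("b.nix"))

db.set_source("a.nix", "x: x + 1")  # edit one file
db.runs.clear()
second = (db.token_count("a.nix"), db.token_count("b.nix"))
reran = list(db.runs)  # only the queries for a.nix executed again
```

After the edit, the queries for b.nix are answered from the cache. A production system tracks dependencies between individual queries and can also stop early when a recomputed result turns out to be unchanged.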

For Nix, this is complicated by the language’s lazy and import-heavy nature. A Nix file can use import ./other-file.nix to pull in another file, and because the argument to import is an ordinary expression, the path can be computed dynamically. The dependency graph between files is not always statically knowable.

Lorenz’s approach, based on the article, keeps the type checker independent of the Nix evaluator and focuses on what can be determined syntactically and through local inference. This is the right call for a first iteration: a deliberately incomplete type checker that catches real errors in well-structured code is more useful than an ambitious system that requires solving the halting problem.

Nickel and the Road Not Taken

It is worth noting that Nickel, developed by Tweag, took the other road: instead of adding types to Nix, they designed a new configuration language with a gradual type system built in. Nickel’s type system supports both static types and runtime contracts, letting you mix typed and untyped code at module boundaries. It is a cleaner solution in theory, but it requires migrating away from the entire Nix ecosystem, including nixpkgs, which contains more than 100,000 packages.

For the working Nix user, a type checker that understands .nix files is far more immediately useful than a better language that nobody’s packages are written in yet.

What This Unlocks

A working type checker for Nix would change how NixOS modules are written and debugged. Module options are already typed at runtime through the NixOS module system’s own type DSL, which includes types like types.str, types.listOf types.package, and types.attrsOf types.str. These runtime types could serve as ground truth for a static type checker: if a module declares options.foo = lib.mkOption { type = types.str; };, the type checker can verify that every assignment to foo in the configuration is actually a string.
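As a rough illustration of that idea, the sketch below checks literal configuration values against declared option types. Everything in it is an assumption made for the example: option types are encoded as tuples like ("listOf", ("int",)), Python values stand in for Nix literals, and the services.example.* option names are hypothetical.

```python
def check(value, ty):
    """Return True if a literal value matches a declared option type."""
    kind = ty[0]
    if kind == "str":
        return isinstance(value, str)
    if kind == "bool":
        return isinstance(value, bool)
    if kind == "int":
        # bool is a subclass of int in Python, so exclude it explicitly
        return isinstance(value, int) and not isinstance(value, bool)
    if kind == "listOf":
        return isinstance(value, list) and all(check(v, ty[1]) for v in value)
    if kind == "attrsOf":
        return isinstance(value, dict) and all(check(v, ty[1]) for v in value.values())
    raise ValueError(f"unknown option type: {kind}")


# Declared options (hypothetical), mirroring types.str and types.listOf types.int
declared = {
    "services.example.user": ("str",),
    "services.example.ports": ("listOf", ("int",)),
}

# Assignments found in a configuration
config = {
    "services.example.user": "alice",
    "services.example.ports": [80, "443"],  # a string slipped into the list
}

errors = [name for name, value in config.items() if not check(value, declared[name])]
# errors is ["services.example.ports"]
```

A static checker would work on inferred types of expressions rather than on values, but the declared option types would play the same role as the expected side of the comparison.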

Diagnostics for common mistakes, like passing an integer where a derivation is expected, or accessing a nonexistent attribute of a known attrset, would catch errors that currently only surface when you run nixos-rebuild switch and wait for evaluation to fail several seconds later.

The project is early, and building a complete, sound type system for Nix may not be the goal. But even partial type inference, combined with a responsive language server, would represent a meaningful step forward for a language that a large community of developers relies on daily and that has historically had to rely on runtime errors for feedback.
