· 9 min read ·

Why Giving Nix a Type Checker Is So Much Harder Than It Looks

Source: lobsters

A recent post on Lobsters described building a type checker and Language Server Protocol server for Nix from scratch. The project is experimental, but it sits at the intersection of several genuinely hard problems in programming language theory. Reading it prompted me to think about what it would actually take to do this rigorously, and why nobody has managed it yet despite Nix being old enough to vote.

Nix the language is deceiving. It looks small. The syntax fits on a single page. There are no classes, no interfaces, no statements, no loops. Just functions, attribute sets, strings, numbers, paths, and lists. But the combination of lazy evaluation, dynamic scoping via with, heterogeneous lists, and open-record function arguments creates a type inference problem that touches almost every hard corner of the literature.

The Existing Landscape

Two serious Nix LSP servers exist today. nil, written in Rust by oxalica, is the more mature option. It uses a lossless concrete syntax tree built on the rowan library (the same approach as rust-analyzer), runs entirely statically, and gives you go-to-definition, find-references, rename, and unused-binding diagnostics without ever invoking a Nix evaluator. It is fast and reliable. It also has no type inference whatsoever. Hover shows you the source definition, not a type.

nixd takes the opposite approach. It embeds libnix and actually evaluates Nix code to answer queries. Hover over pkgs.stdenv and nixd will show you what stdenv actually is, because it evaluated pkgs to find out. Completion for services.nginx.<TAB> works because nixd evaluated the NixOS module system to enumerate the options. This is powerful, but it means nixd is doing evaluation, not type checking. It tells you about values, not types, and it can hang on slow or infinite evaluations.

rnix-lsp was the earlier attempt and is now officially deprecated in favor of nil and nixd.

What none of these tools provide is static type inference: the ability to look at a Nix expression and tell you, without evaluating it, that it has type { stdenv: Derivation, fetchFromGitHub: FetchArgs -> Derivation } -> Derivation or that a particular call site is passing a string where an integer is expected. That is what a type checker actually means, and it is where the real difficulty lives.

The Row Polymorphism Problem

The most immediate obstacle is how Nix functions receive arguments. Nearly every non-trivial Nix function uses a pattern like this:

{ stdenv, fetchFromGitHub, pkg-config, ... }:
stdenv.mkDerivation {
  pname = "my-package";
  version = "1.0";
  src = fetchFromGitHub { owner = "..."; repo = "..."; rev = "..."; hash = "..."; };
  nativeBuildInputs = [ pkg-config ];
}

The ... is not a variadic argument in the usual sense. It means the function accepts an attribute set with at least the named fields, but possibly more. This is structural subtyping on open records, which in type theory terms requires row polymorphism to express properly.

Row polymorphism lets you say something like: this function accepts any record that has at least a stdenv field of type Stdenv and a fetchFromGitHub field of type FetchFromGitHubFn, plus any additional fields of any types. Standard Hindley-Milner does not handle this. You need an extension like the one in OCaml’s polymorphic variants and object types, or the approach taken by MLton, or the structural record types in PureScript.

This matters practically because the nixpkgs callPackage mechanism works by passing the entire nixpkgs attrset to a function and letting the function select what it needs via destructuring. Any type checker that cannot handle open records is dead on arrival for real nixpkgs code.

Dynamic Attribute Access

Row polymorphism covers the open-record problem, but Nix also has dynamic attribute access:

let key = someComputedString;
in attrset.${key}

The type of attrset.${key} where key is not a literal string is fundamentally a dependent type problem. The result type depends on the runtime value of key. No practical mainstream type system handles this, and the tradeoffs are severe: you either reject all dynamic attribute access as ill-typed, assign it a top type like Any, or require the user to annotate constraints on key.

In practice, dynamic attribute access shows up everywhere in nixpkgs infrastructure. builtins.getAttr, builtins.hasAttr, lib.getAttrFromPath, and lib.attrByPath are used to navigate attribute paths computed at evaluation time. A type checker that rejects all of this is not useful for the actual Nix ecosystem.

The with Expression

The with expression brings every attribute of an attrset into scope:

with import <nixpkgs> {};
stdenv.mkDerivation { buildInputs = [ git curl openssl ]; }

From a static analysis perspective, this is a scope that can only be resolved by evaluating import <nixpkgs> {}. Without evaluation, you do not know what names with introduces. nil works around this by tracking with-introduced names as unknown and suppressing undefined-variable diagnostics for identifiers that might have come from a with. nixd resolves it by evaluation. A static type checker has no clean option: it must either evaluate (defeating the purpose), use annotations, or treat with-introduced names as having type Any.

This is not a minor edge case. Nearly every older nixpkgs expression and a large fraction of NixOS configurations use with pkgs; or with lib; at the top level.

Laziness and Thunks

Nix is lazy in the call-by-need sense. Bindings are not evaluated until their value is demanded:

let
  badExpr = 1 + "this is a type error";
  goodExpr = "hello";
in goodExpr

This evaluates without error because badExpr is never forced. A sound type checker would still report the error in badExpr, which is correct behavior. But laziness also means that infinite structures are valid:

let
  ones = [ 1 ] ++ ones;  # infinite list
in builtins.head ones    # evaluates to 1

And recursive attribute sets are ubiquitous:

rec {
  a = b + 1;
  b = 2;
  c = a * b;
}

Type inference for rec requires solving a system of mutually recursive type constraints simultaneously. Standard HM handles simple recursive let bindings via the value restriction and generalization, but rec attrsets with arbitrary mutual recursion between fields require a fixed-point iteration over the constraint system.

The interaction gets worse when rec is combined with // (attrset merge), which is used to override individual fields:

let defaults = rec { a = 1; b = a + 1; };
in defaults // { a = 10; }  # b is now 11, not 2

The merged attrset has b depending on the new a. This is essentially a form of inheritance with late binding, and its type depends on the full evaluation semantics of // applied to rec sets.

What Nickel Chose Instead

Tweag’s Nickel is worth studying here because it is an attempt to build a properly typed language that does what Nix does, starting from scratch. Rather than retrofitting types onto Nix, Nickel designed a language with gradual typing from day one.

Nickel uses the | operator for type annotations and contracts:

let f : Number -> Number = fun x => x + 1 in
f 5

For records, Nickel uses row-polymorphic types with explicit open/closed syntax:

let greet : forall r. { name: String; .. | r } -> String =
  fun person => "Hello, " ++ person.name

The .. marks the record as open. This is exactly the row polymorphism that Nix would need. Nickel also has Dyn as the gradual escape hatch, analogous to TypeScript’s any.

Nickel is not Nix-compatible and was not designed to be, but it demonstrates the design space clearly: if you want a typed language for configuration management with the same expressiveness as Nix, you need row polymorphism, gradual typing, contract types for runtime validation, and careful handling of recursive records. These are not optional features you can add incrementally; they are load-bearing parts of the type system.

What a Practical Type Checker Would Need

If someone wanted to build a genuinely useful Nix type checker rather than a proof-of-concept, the minimum viable feature set looks something like this.

First, a type inference algorithm capable of row-polymorphic record types. Standard HM is insufficient. The HM(X) constraint-based framework extended with row variables is the theoretical foundation, but implementing it efficiently for a language where every function call potentially involves record subtyping is nontrivial. Daan Leijen’s extensible records with scoped labels is a cleaner model, though it has different tradeoffs.

Second, a Dyn or Any type with gradual typing semantics, plus a principled decision about how many things to assign it. with expressions, dynamic attribute access, and import-based module loading all need an escape hatch. A checker that is too aggressive about Dyn gives no useful information; one that is too strict rejects valid code.

Third, special typing rules for builtins. The builtins are not ordinary Nix functions. builtins.isString, builtins.isInt, etc. are type predicates that should narrow types in conditional branches, exactly like TypeScript’s typeof checks. Without type narrowing for these predicates, the checker cannot handle the extremely common pattern of validating inputs at the top of a function.

Fourth, the NixOS module system needs to be understood at some level. Modules use lib.types.* to declare option types at runtime, but those types have a regular structure that a type checker could recognize. Without some awareness of the module system, the checker cannot provide useful diagnostics for the largest category of Nix code most users actually write.

LSP Infrastructure

Beyond the type theory, the LSP plumbing itself has real costs. The Language Server Protocol 3.17 specification defines around fifty request and notification types. A competitive LSP needs completion, hover, goto-definition, find-references, diagnostics, document symbols, and inlay hints at minimum. Each of these must be correct under concurrent edits, respond within acceptable latency, and handle syntactically broken files gracefully, since files are always partially broken while being edited.

Position encoding is a known pitfall: LSP historically used UTF-16 offsets because of VS Code’s internal representation, but LSP 3.17 added positionEncodings negotiation to allow UTF-8 and UTF-32. Servers that get this wrong produce subtly incorrect behavior on files containing non-ASCII characters.

For a Nix LSP specifically, incremental analysis is critical. nixpkgs imports fan out into hundreds of files; re-analyzing the full dependency graph on every keystroke is not viable. nil’s use of rowan and a Salsa-style incremental computation framework is the right architecture here. Any serious implementation needs the same.

Where This Leaves Things

The project at johns.codes is doing the hard, underappreciated work of actually trying to build this rather than just theorizing about why it is difficult. Experimental type checkers for dynamic languages often start exactly this way: get something running, discover the hard cases, iterate. TypeScript started as a proof-of-concept for adding types to JavaScript before it became infrastructure for the entire web ecosystem. mypy started as a research project before Python formalized type hints in PEP 484.

Nix’s trajectory may be similar. The language is increasingly important as Flakes matures and NixOS adoption grows, and the tooling gap compared to languages with mature type systems is real. nil and nixd are genuinely useful, but neither replaces the feedback loop you get from a type checker that can tell you, before evaluation, that you are passing a list where a string is expected.

The theoretical tools exist. Row polymorphism is well-understood. Gradual typing has decades of research behind it. Incremental LSP architectures have been demonstrated in rust-analyzer and similar projects. What the Nix ecosystem needs is someone willing to synthesize them for this specific language, which is exactly what projects like this one are attempting. Whether this particular implementation becomes production-ready or serves primarily as a research artifact that informs a future effort, the exploration itself advances the state of the art.

Was this interesting?