· 6 min read ·

The Precision Dial: How Type Systems Close the Gap Between Spec and Code

Source: lobsters

Gabriel Gonzalez makes the argument on haskellforall that a sufficiently detailed specification is code. The claim sounds like a provocative reframe until you trace its history, at which point it reads less like a thesis and more like a delayed acknowledgment of something type theorists established sixty years ago. The useful question is not whether the statement is true, but what degree of precision is required and what happens to the gap between specification and implementation as you increase it.

The Gap That Never Stayed Closed

Formal specification methods have existed since at least the 1970s. Z Notation, developed at Oxford by Jean-Raymond Abrial, let you write a system description in set theory and predicate logic with mathematical rigor. VDM, the Vienna Development Method, offered similar capability. Both were serious tools and both had the same structural failure: the spec lived in one document and the code lived elsewhere. The two could drift apart, and proving they still corresponded required continuous effort. In practice, most teams stopped maintaining the spec once the code shipped.

Leslie Lamport’s TLA+ is the most prominent survivor of this lineage. Amazon Web Services used it to find subtle bugs in S3, DynamoDB, and other distributed protocols. It works, and it is a serious engineering tool. But the fundamental separation remains: TLA+ describes what the system should do, and some other language implements it. The gap between spec and code is a maintenance burden measured in engineering hours.

Bertrand Meyer’s solution with Eiffel in 1986 was to embed the spec directly in the code as preconditions, postconditions, and invariants checked at runtime. Design by Contract gave specifications a place to live inside the implementation file, which at least kept them from drifting invisibly. But the contracts described properties of the code rather than being the code itself. The gap narrowed without closing.

Types Are Already Specifications

Every typed function signature is a fragment of specification. In Python with type hints, def head(xs: list[T]) -> T specifies that the output type matches the element type of the input. That constraint alone eliminates a large class of wrong implementations. In Haskell, parametric polymorphism tightens this considerably.

Philip Wadler’s 1989 paper “Theorems for Free!” showed that you can derive nontrivial behavioral theorems from a polymorphic type signature without inspecting the implementation at all. A function of type [a] -> [a] in Haskell cannot inspect the elements of the list, because it knows nothing about the type a. A theorem follows from the type alone: every element in the output was in the input. The type specifies this, and parametricity guarantees it holds for every valid implementation.

This is the foundation of the Curry-Howard correspondence. Types are propositions; programs are proofs of those propositions. A type signature is a theorem statement. An implementation is a proof that the theorem holds. When the compiler type-checks your function, it is verifying the proof. This is not a metaphor; it is a precise formal isomorphism established by William Howard in 1969, building on Haskell Curry’s earlier observation about typed lambda calculus.

When the Type Uniquely Determines the Implementation

The correspondence has a direct corollary. If a type, read as a proposition, has exactly one proof, then only one implementation satisfies the type. Consider:

id :: a -> a
id x = x

fst :: (a, b) -> a
fst (x, _) = x

flip :: (a -> b -> c) -> b -> a -> c
flip f y x = f x y

Each of these is fully determined by its type. Implementing id requires no design decisions: the function receives a value of some unknown type a and must return a value of that same type, and there is nothing else it can do with a universally quantified type variable. The spec has become the program.

Compare to sort :: Ord a => [a] -> [a]. The type says the output is ordered and contains the same elements as the input, but leaves the algorithm entirely unspecified. Quicksort, mergesort, heapsort, and insertion sort all satisfy it. The type is a partial specification. Uniquely pinning down the implementation would require specifying stability, asymptotic complexity, and memory allocation behavior, none of which natural language can embed into a Haskell type today. For that, you need a richer type language.

Refinement Types and Liquid Haskell

Liquid Haskell, developed by Ranjit Jhala’s group at UCSD, extends Haskell’s type system with refinement types: types annotated with logical predicates checked by an SMT solver (Z3) at compile time. The canonical example:

{-@ head :: {v:[a] | len v > 0} -> a @-}
head :: [a] -> a
head (x:_) = x

The annotation requires a non-empty list as a precondition. Z3 verifies at every call site that this precondition holds statically. The spec has moved into the type layer and is enforced at compile time, not at runtime and not in a separate document. The gap is narrower still.

Liquid Haskell has been applied to verify bounds safety in bytestring and to check termination properties in other libraries. Importantly, it works on ordinary Haskell code with annotations. You can push spec precision into the type layer incrementally, as far as your current correctness requirements demand, without switching to a proof assistant or rewriting existing code.

Dependent Types Close It Entirely

In Idris 2 and Lean 4, types can depend on runtime values, which allows the full realization of Curry-Howard. In Idris, vector types encode their length in the type itself:

data Vect : Nat -> Type -> Type where
  Nil  : Vect 0 a
  (::) : a -> Vect n a -> Vect (S n) a

zipWith : (a -> b -> c) -> Vect n a -> Vect n b -> Vect n c

The length n is part of the type. An implementation of zipWith that drops an element or produces a vector of the wrong length simply fails to type-check. The precondition (equal-length inputs), the postcondition (output length matches), and the signature are a single object. There is no separate spec document and no runtime assertion; the type checker enforces the entire contract.

Lean 4 has become the serious contender for combining verified programming with automated proof assistance. The Mathlib4 library contains formalized mathematics at graduate level, and the community around it has grown substantially. Recent work on AI-assisted proof search, including models fine-tuned on Lean proof corpora, has started to make the construction of proof terms more automated. The theoretical claim and the 2026 development moment are meeting from opposite directions: when the spec is formal enough to live in a Lean type, the implementation becomes a proof search problem, and proof search is increasingly tractable for automated systems.

Where GHC Is Heading

Most production Haskell does not use Liquid Haskell or approach dependent types today. GHC 9.10 introduced RequiredTypeArguments, a concrete step toward visible dependent quantification. The broader dependent types project, architecturally grounded in Richard Eisenberg’s research, is advancing incrementally rather than arriving as a single language flag. Each GHC release closes the expressiveness gap between Haskell and Idris a little further.

In the meantime, the ecosystem already applies spec-as-type in practical ways. The servant library encodes a REST API structure in a type like "users" :> Capture "id" Int :> Get '[JSON] User, and derives both the server-side handler type and client-side call functions from that single type expression. The API spec and the interface contract are the same object. GADTs encode valid state machine transitions in types, making illegal transitions a compile error rather than a runtime panic. Linear types in GHC 9.x express resource usage constraints that the runtime cannot check. Each of these narrows the distance between what you intend and what the compiler verifies.

The Practical Upshot

The claim that a sufficiently detailed spec is code is not an aspiration; it describes a spectrum. At one end, a comment in natural language specifies nothing the compiler can check. At the other end, a dependent type signature specifies everything, and the implementation is a proof term that the type checker accepts or rejects. Between those poles, there is a precision dial with named positions: parametric types, design by contract, property-based tests, refinement types, indexed types, full dependent types.

Turning the dial up costs verbosity and cognitive load. It pays off in the proportion of correctness guarantees you get statically rather than at runtime or not at all. Where to sit on the dial is an engineering judgment driven by the cost of bugs in your particular domain. Safety-critical systems, cryptographic code, and distributed protocols justify the cost of Liquid Haskell annotations or a move to Lean. A Discord bot probably does not.

The theory just says the limit exists. At the limit, the spec is the code, and the compiler is the verifier.

Was this interesting?