· 6 min read ·

The Line Between Spec and Code Is Just Precision

Source: lobsters

There is a gradient that every programmer crosses without noticing. You start with a prose description of what a function should do. You add a type signature. You write some unit tests. You add property-based tests. At some point along that path, the “specification” you were writing starts to look indistinguishable from the implementation. Gabriel Gonzalez makes this observation in a recent post on haskellforall.com, and it is worth following the thread to where it actually leads.

The Spectrum

Most specifications start as prose. “The function takes a list of users and returns those with active subscriptions.” This is underspecified in dozens of ways. What counts as active? What if the list contains duplicates? What if it is empty? Prose outsources these decisions to the implementer.

Types narrow the contract without closing it:

function getActiveUsers(users: User[]): User[]

This rules out returning a string, but it says nothing about correctness. You could return an empty array from every input and satisfy the type.

Property-based tests go further. In Hypothesis (Python’s QuickCheck descendant):

from hypothesis import given
from hypothesis import strategies as st

@given(st.lists(user_strategy))
def test_active_filter(users):
    result = get_active_users(users)
    assert all(u.active for u in result)
    assert all(u in result for u in users if u.active)

Now the spec is behavioral and complete for this function. The two assertions together constrain the implementation to a single logical definition. There is only one correct answer for any given input, and the test says exactly what it is. Notice that you could almost read those two assertions as the implementation itself.

In a dependently typed language like Lean 4, you can express this contract directly in the type:

def getActiveUsers (users : List User) :
  { result : List User //
    (∀ u ∈ result, u.active = true) ∧
    (∀ u ∈ users, u.active = true → u ∈ result) } :=
  ⟨users.filter (·.active), by simp [List.mem_filter], by simp [List.mem_filter]⟩

The type is the spec. Any term with this type is a correct implementation by construction.

The Curry-Howard View

This gradient has a formal name. The Curry-Howard correspondence, developed by Haskell Curry in the 1930s and extended by William Howard in 1969, establishes that types are propositions and programs are proofs. A function type A → B corresponds to logical implication: a program of that type is a proof that A implies B.

When you write a precise type, you state a proposition. When you write a program satisfying that type, you prove the proposition. The spec and the implementation are the same mathematical object viewed through different lenses. This is not a metaphor; it is the theoretical basis of proof assistants like Coq, Agda, and Lean 4, all of which use this correspondence as their foundation.

The practical consequence is that the gap between specification and code closes exactly as fast as precision increases. Add enough constraints to a type and the implementation space shrinks to one point. At that point the spec and the implementation are, in the formal sense, identical.

The Same Insight, Rediscovered

This keeps recurring across different parts of computing.

TLA+: Leslie Lamport’s TLA+ is a specification language for concurrent and distributed systems. A TLA+ spec describes the system’s state and the transitions allowed between states. The TLC model checker can execute this spec exhaustively over small state spaces and find violations. Amazon and Microsoft have published reports on using TLA+ to find bugs in production distributed systems, bugs that code review and testing both missed. The spec runs; the bugs appear.

SQL: SQL was designed explicitly as a declarative language. A SELECT statement specifies what data you want, not how to retrieve it. The query optimizer may choose among many physical execution plans, all of which satisfy the spec. In this sense SQL is a specification language that happens to be executable. Every SQL query is simultaneously a description of data and a program that retrieves it.

Dhall: Gabriel Gonzalez built Dhall as a typed, total configuration language. A Dhall type annotation for a configuration schema IS the specification of what configurations are valid. Because Dhall is total (no general recursion, guaranteed to terminate), a configuration file that type-checks is correct by construction relative to the spec expressed in the type. The “specification” of the config format is the Dhall type; the implementation is any file that satisfies it.

JSON Schema and Zod: In the JavaScript ecosystem, Zod schemas have become a common pattern for defining the shape and constraints of data at runtime. A Zod schema is both a specification (“this API response must look like this”) and executable validation logic. When teams use Zod schemas as the source of truth and derive TypeScript types from them, the schema IS the spec AND the type AND the runtime validator. Three things that used to be separate documents become one artifact.

Where It Shows Up in Practice

I ran into this building a Discord bot that ingests RSS feeds and classifies articles. The bot needs a schema for how an article is represented internally. I started with a TypeScript interface:

interface Article {
  title: string;
  url: string;
  publishedAt: Date;
  tags: string[];
}

That is a type-level spec. It rules out obvious nonsense but says nothing about whether url is a valid URL, whether tags is empty, or whether publishedAt is in the past. So I reached for Zod:

import { z } from 'zod';

const ArticleSchema = z.object({
  title: z.string().min(1).max(500),
  url: z.string().url(),
  publishedAt: z.date().max(new Date()),
  tags: z.array(z.string().min(1)).min(1),
});

type Article = z.infer<typeof ArticleSchema>;

Now the schema generates the TypeScript type, validates runtime data, and documents the constraints. The schema is the spec. When I change a constraint, the type, the validator, and the documentation all change together. There is no separate spec document to drift out of sync.

Property-based testing with fast-check would take this further: I could write properties describing what must hold for any valid Article and have the library generate thousands of inputs to stress the processing logic.

The Calibration Problem

There is a real cost to precision. The Lean type signature above is complete and correct, but most teams will not reach for dependent types to specify a list-filter function. The expressiveness of the specification language needs to match the stakes of the domain.

A few calibrations that hold up in practice:

Specify behavior, not implementation. A property like “result contains exactly the active users” is better than “result equals users.filter(u => u.active)” because the latter is just the implementation restated. Properties that describe the contract independently of the implementation are the useful ones.

Pin down the hard parts. Edge cases are where prose specifications lie. An empty list, null inputs, duplicate entries: these are the cases where a formal constraint pays for itself.

Let types do structural work. Even without dependent types, a precise type signature closes gaps that prose leaves open. A function typed (input: NonEmptyArray<User>) => [User, ...User[]] communicates a constraint that a prose spec might mention in a footnote.

What the Convergence Points At

Every few years someone rediscovers this principle in a new context. TDD gestures at it: write the spec as tests before writing the implementation. BDD extends it: write specs in language readable by non-programmers. Property-based testing sharpens it: write behavioral properties over all inputs. Formal verification takes it to the limit: make the spec machine-checkable.

The convergence is not coincidental. Specification and code are not different things. They are different levels of precision applied to the same underlying object: a function with a domain, a codomain, and a set of required properties. Make the required properties fully precise and you have a complete definition of that function. That definition can be run, tested, model-checked, or formally verified depending on what tools you bring to it.

This is what the Curry-Howard correspondence makes formal: a program that satisfies a precise type is a proof of a proposition. The spec and the proof are the same artifact. A sufficiently detailed spec is not like code. It is code.

Was this interesting?