· 6 min read ·

The Executable Specification Was the Program All Along

Source: lobsters

The Haskell for All post makes an observation that sounds almost tautological: if you describe what you want with enough precision, you have written code. The difference between a spec and a program is not categorical. It is a matter of how much detail you have committed to.

That framing is worth taking seriously, because it reframes a question that most developers treat as settled. We act as if specs live in documents and code lives in files, and the work is translating from one to the other. But the translation model is wrong, or at least incomplete. The spec and the implementation are the same kind of object at different points on a precision dial.

Hoare Logic and the First Precise Specs

Tony Hoare’s 1969 paper “An Axiomatic Basis for Computer Programming” introduced what became known as Hoare logic: a formal system for reasoning about program behavior using preconditions and postconditions. The notation {P} C {Q} means that if predicate P holds before executing command C, then Q will hold after. This is unambiguously a specification language, not an implementation language. You cannot run a Hoare triple. But consider what it contains: a precise, machine-checkable characterization of what a function must do. The only thing separating it from code is executability.

Hoare logic was a theoretical tool, not a practical one. It required manual proof construction and did not scale to production software. But the underlying idea, that correctness properties can be stated precisely and checked mechanically, set the direction for everything that followed.

Design by Contract Made Specs Executable

Bertrand Meyer’s Eiffel language (1986) took Hoare’s preconditions and postconditions and made them first-class language features. Design by Contract lets you write:

class BANK_ACCOUNT
  feature
    balance: INTEGER

    deposit (amount: INTEGER)
      require
        positive_amount: amount > 0
      do
        balance := balance + amount
      ensure
        balance_updated: balance = old balance + amount
      end
end

The require and ensure blocks are not comments. They are executable. At runtime, if a precondition is violated by a caller, or a postcondition is violated by the implementation, the program raises an exception with the name of the violated clause. The spec runs alongside the code and polices it.

Design by Contract never conquered mainstream programming, but the idea survived in scattered forms: assert statements, Java’s @Preconditions annotations, Python’s hypothesis library. Each represents an attempt to give specs teeth at runtime.

QuickCheck: Specs as Generators

Haskell’s QuickCheck library (Claessen and Hughes, 2000) approached the problem from a different direction. Instead of annotating functions with contracts that run on specific inputs, QuickCheck lets you write properties and then generates hundreds of random inputs to check them:

prop_reverse :: [Int] -> Bool
prop_reverse xs = reverse (reverse xs) == xs

prop_sort_idempotent :: [Int] -> Bool
prop_sort_idempotent xs = sort (sort xs) == sort xs

prop_insert_length :: Int -> [Int] -> Bool
prop_insert_length x xs = length (insert x xs) == length xs + 1

These properties are specifications. They say nothing about how reverse or sort work. They say what must be true about the result. QuickCheck treats the property as an oracle and the random inputs as a search for counterexamples.

The specification is executable, and the execution is adversarial. You are not writing test cases; you are writing laws, and the framework tries to break them. The gap between spec and test closes significantly: a QuickCheck property is simultaneously both.

Dependent Types: The Spec Moves into the Type

In languages with dependent types, the gap closes further. In Idris or Agda, types can express arbitrarily complex propositions:

data Vect : Nat -> Type -> Type where
  Nil  : Vect Z a
  (::) : a -> Vect n a -> Vect (S n) a

(++) : Vect m a -> Vect n a -> Vect (m + n) a
(++) Nil       ys = ys
(++) (x :: xs) ys = x :: (xs ++ ys)

The type of ++ is not just documentation. It is a machine-checked theorem: concatenating a vector of length m with one of length n produces one of length m + n. If you write an implementation that does not satisfy this type, the compiler rejects it. The type is the specification, and the specification is enforced at compile time.

This is the Curry-Howard correspondence made concrete. In type theory, types are propositions and programs are proofs. A function of type A -> B is a proof that A implies B. Writing a well-typed program is constructing a proof that the type’s proposition holds. The spec and the implementation are literally the same syntactic object.

Lean 4 has taken this further in the context of mathematics. The Mathlib4 library contains tens of thousands of formally verified theorems. The proofs are programs. The theorems are types. There is no separate specification layer.

The Practical Spectrum

For most working code, this plays out as a spectrum rather than a binary:

  • Comments and docs: informal, not checked by anything
  • Type signatures: lightweight, checked at compile time for consistency
  • Runtime assertions: checked on specific inputs during execution
  • Property tests: checked probabilistically across generated inputs
  • Refinement types: types annotated with predicates, checked statically
  • Dependent types: arbitrary propositions encoded in types, fully verified
  • Explicit proof terms: manually constructed proofs, machine-verified

Different languages land at different points. Python sits near the informal end; most correctness properties live in comments or are caught at runtime if tested. TypeScript adds a type layer that rules out large classes of errors. Rust’s ownership and lifetime system encodes memory safety properties directly in types, enforced at compile time without a garbage collector. Liquid Haskell extends Haskell’s types with SMT-checked refinement predicates. Agda and Lean 4 operate at full dependent-type expressiveness.

The trend over fifty years is consistent: languages keep moving up the spectrum, making more of the specification machine-checkable.

What This Changes About How You Work

Accepting the continuum has practical consequences that do not require switching to a dependently-typed language.

Write types before implementations. A function that accepts NonEmptyList instead of [] is encoding the non-emptiness precondition in the type, where the compiler enforces it for free. A newtype wrapper for validated email addresses expresses the validation constraint in the type system rather than in comments.

Write properties before test cases. A specific test case checks one point in the input space. A property checks the entire space up to whatever your generator reaches. reverse . reverse == id is more general than any individual input-output pair, and it catches bugs that no specific case would find.

Treat invariants as load-bearing specifications. If a data structure must maintain an ordering, or a count must stay in sync with a collection, express that invariant somewhere machine-checkable. A runtime assertion in the insert function is better than a comment. A type that makes invalid states unrepresentable is better than both.

The Direction of Travel

The Haskell for All post is pointing at something the programming language research community has known for decades but that has not fully landed in mainstream practice: spec and code are not different activities. Writing a spec is writing an incomplete program. Writing code is filling in the spec until it runs.

The question is not whether to write specs, but how precise to make them and how much of that precision to move into machine-checkable form. Hoare asked what it would mean to reason precisely about programs in 1969. Meyer made that reasoning executable in 1986. Claessen and Hughes made it probabilistically exhaustive in 2000. The dependent-type community made it fully verifiable in the decades since.

The gap between spec and code has been closing this whole time. The languages and tools that are gaining ground are the ones that close it faster.

Was this interesting?