
One Language for Computing and Proof: The Case Leo de Moura Is Making for Lean

Source: lobsters

The proof assistant landscape has been stable for a long time. Coq, Isabelle, and Agda have occupied their respective niches for decades, each carrying significant mathematical libraries and communities built up over years of careful work. Lean 4, by contrast, arrived as a full rewrite with an unusually clear argument: that separating the programming language from the proof assistant was a mistake from the start. Leo de Moura’s recent post makes the case directly, and reading it alongside the history of formal verification tools reveals what makes Lean’s position distinctive. The argument is not primarily about syntax or ecosystem size. It rests on a wager about what kind of language infrastructure makes formal methods scale beyond small research groups.

The Unified Language Thesis

Most proof assistants keep two languages apart: the object language, in which you state definitions and theorems, and the meta-language, in which you write the automation that orchestrates proofs. In Isabelle, you write custom tactics in ML. In Coq, you write terms in Gallina and tactics in Ltac. In Lean 4, you write Lean.

The practical consequences are significant. When your tactic language is the same as your proof language, the cost of writing new automation drops substantially. You can write a decision procedure as an ordinary Lean function and run it as part of a proof check without switching contexts. The decide tactic in Lean 4 does this for decidable propositions by asking the kernel to evaluate the proposition’s Decidable instance:

example : 2 + 2 = 4 := by decide
example : ∀ n : Fin 100, n.val < 100 := by decide

For larger finite computations, native_decide compiles the decision procedure to native code before running it. This is possible because tactics and programs are written in the same language and pass through the same elaboration pipeline. It comes at a cost in trust, though: the kernel does not re-check the compiled computation. It accepts the result through the Lean.ofReduceBool axiom, which adds the compiler to the trusted code base, so proofs that must rest on the kernel alone stay with decide.
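The difference between the two tactics can be made visible with #print axioms. The following is a minimal sketch; the theorem names are invented for illustration, and the exact wording of the reported output may vary between Lean versions.

```lean
-- `decide` has the kernel evaluate the `Decidable` instance by reduction.
theorem small_fact : 10 * 10 = 100 := by decide

-- `native_decide` evaluates the same kind of instance with compiled code.
theorem larger_fact : (List.range 1000).foldl (· + ·) 0 = 499500 := by
  native_decide

-- Inspect which axioms each proof depends on.
#print axioms small_fact    -- expected to report no axioms
#print axioms larger_fact   -- expected to include Lean.ofReduceBool
```

Checking the axiom list this way is a cheap habit when a development needs to know which of its results depend on trusting the compiler.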

Coq has moved in this direction with Ltac2 and its OCaml plugin interface, but the metaprogramming story in Lean 4 is more systematic. The MacroM, MetaM, TermElabM, and TacticM monads give you access to the full elaboration pipeline. You can write syntax extensions, define new term-level constructs, and inspect or manipulate proof states using the same type system you use for everything else. There is no separate plugin language with its own documentation and failure modes.
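To make the layering concrete, here is a small sketch of two user-defined tactics, one written as a macro and one as an elaborator. The tactic names my_triv and print_goal are hypothetical; getMainGoal, logInfo, and the macro and elab commands are part of Lean's metaprogramming API.

```lean
import Lean
open Lean Elab Tactic

-- A macro is a syntax-to-syntax rewrite: `my_triv` expands to
-- "try `rfl`, and if that fails try `decide`".
macro "my_triv" : tactic => `(tactic| first | rfl | decide)

-- An elaborator runs in `TacticM` and can inspect the proof state.
-- This one only reports the type of the main goal.
elab "print_goal" : tactic => do
  let goal ← getMainGoal
  logInfo m!"current goal: {← goal.getType}"

example : 2 + 2 = 4 := by
  print_goal   -- logs the goal as an info message, leaves it unchanged
  my_triv
```

Both definitions are ordinary Lean declarations in the same file as the proof that uses them, which is the point of the unified design.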

What Metaprogramming Actually Enables

The practical payoff shows up in mathlib4, the community-maintained library of formalized mathematics. At well over a million lines of Lean, mathlib4 depends heavily on tactics like simp, ring, omega, linarith, and aesop. Each of these is implemented in Lean itself, not in a separate plugin language.

example (x y : ℤ) (h1 : x + y = 10) (h2 : x - y = 4) : x = 7 := by
  omega

example (x y : ℝ) : (x + y)^2 = x^2 + 2*x*y + y^2 := by
  ring

The omega tactic handles linear arithmetic over integers and natural numbers. The ring tactic closes polynomial ring equalities. Both are Lean programs that run during elaboration. When they fail, you get error messages within the same framework as any other Lean error.

In Coq’s Ltac, debugging failed tactics is notoriously difficult. The meta-language has different error semantics from the object language, and the mental model required to understand what went wrong spans two systems. In Lean 4, tactic failures propagate through ordinary monadic error handling. You can write a tactic, test it with #eval, and debug it with the same tools you would use for any other program. The cognitive overhead of switching between “proof mode” and “programming mode” largely disappears.
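As a rough illustration of that error handling, the sketch below wraps rfl in an ordinary try/catch and rethrows with extra context. The tactic name rfl_or_report is made up for this example; evalTactic, throwError, and Exception.toMessageData come from Lean's metaprogramming API.

```lean
import Lean
open Lean Elab Tactic

-- Hypothetical tactic: run `rfl`; if it fails, catch the exception
-- like any other monadic error and rethrow it with a clearer prefix.
elab "rfl_or_report" : tactic => do
  try
    evalTactic (← `(tactic| rfl))
  catch e =>
    throwError "rfl_or_report: rfl could not close the goal\n{e.toMessageData}"

-- On a goal that holds by computation, the tactic simply succeeds.
example : 1 + 1 = 2 := by
  rfl_or_report
```

The control flow here is the same try/catch one would write in any other Lean program that can throw.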

The Dependent Type Foundation

Lean 4’s type theory is based on the Calculus of Inductive Constructions, shared with Coq. A key design decision separating Lean from Agda, which also has dependent types, is the embrace of classical axioms. Lean 4 ships with three axioms: propositional extensionality (propext), quotient soundness (Quot.sound), and Classical.choice, which gives you the axiom of choice. Function extensionality is a theorem derived from Quot.sound, and excluded middle is a theorem derived from the three together.

-- Classical.choice is an axiom; Classical.em and funext are theorems
#check Classical.em       -- ∀ (p : Prop), p ∨ ¬p (proved using Classical.choice)
#check Classical.choice   -- {α : Sort u} → Nonempty α → α
#check funext             -- function extensionality, derived from Quot.sound

Agda takes a more constructive stance by default, and the Agda community remains divided on whether classical axioms should be standard. Lean’s position is pragmatic: mathematicians working in ZFC-style classical logic should not have to fight the type theory. The classical axioms are isolated enough that constructive users can avoid them, but present enough that mainstream mathematics formalization does not require constant workarounds.
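One way to check how isolated the axioms are is to ask Lean which ones a given theorem depends on. The sketch below uses #print axioms on three small theorems whose names are invented for illustration; the expected results are described in comments rather than quoted, since the output format may differ between versions.

```lean
-- A purely constructive proof.
theorem swap_and (p q : Prop) (h : p ∧ q) : q ∧ p := ⟨h.2, h.1⟩

-- A proof that appeals to excluded middle.
theorem em_instance (p : Prop) : p ∨ ¬p := Classical.em p

-- A proof that appeals to function extensionality.
theorem ext_instance (f g : Nat → Nat) (h : ∀ x, f x = g x) : f = g :=
  funext h

#print axioms swap_and      -- expected: no axioms
#print axioms em_instance   -- expected: propext, Classical.choice, Quot.sound
#print axioms ext_instance  -- expected: Quot.sound only
```

A constructively minded user can run the same check on their own results to confirm that Classical.choice never appears.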

Isabelle/HOL takes a different approach entirely, using higher-order logic rather than dependent types. HOL’s simpler type system lends itself to strong proof automation, which made it historically popular for hardware and software verification through projects like the seL4 microkernel proof (the L4.verified effort). The expressiveness ceiling is lower, though. Formalizing category theory in Isabelle requires significant contortion, because types there cannot depend on terms; in Lean 4, it is direct. The mathlib4 category theory hierarchy exists largely because Lean’s type theory makes these structures natural to express.
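To show what "direct" means here, the following is a simplified sketch of a category as a dependently typed structure, where the type of morphisms depends on a pair of objects. MiniCat and TypeCat are hypothetical names for this example and are not the mathlib4 definitions; universe levels are fixed to keep the sketch short.

```lean
-- Hypothetical, simplified category structure (not the mathlib4 one).
structure MiniCat where
  Obj   : Type 1
  Hom   : Obj → Obj → Type               -- morphism type depends on two objects
  ident : (X : Obj) → Hom X X
  comp  : {X Y Z : Obj} → Hom X Y → Hom Y Z → Hom X Z
  id_comp : ∀ {X Y : Obj} (f : Hom X Y), comp (ident X) f = f
  comp_id : ∀ {X Y : Obj} (f : Hom X Y), comp f (ident Y) = f
  assoc : ∀ {W X Y Z : Obj} (f : Hom W X) (g : Hom X Y) (h : Hom Y Z),
    comp (comp f g) h = comp f (comp g h)

-- Types and functions form an instance; every law holds by computation.
def TypeCat : MiniCat where
  Obj := Type
  Hom := fun A B => A → B
  ident := fun _ a => a
  comp := fun f g a => g (f a)
  id_comp := fun _ => rfl
  comp_id := fun _ => rfl
  assoc := fun _ _ _ => rfl
```

The field Hom : Obj → Obj → Type is the part that has no direct counterpart in simple type theory, since it is a family of types indexed by values.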

The Tooling Layer

One concrete advantage Lean 4 has over its predecessors is investment in editor tooling. The Lean 4 language server, built into the compiler, provides real-time proof state display, error highlighting, and goal visualization inside VS Code and Neovim. When you write a tactic proof, the editor shows you the current goal after each tactic step without any configuration.

This interactivity is central to how proof development actually works. Formal proof is not like writing a program where you know the structure in advance. You explore. You try tactics, see what remains, and adjust. The fast feedback loop in Lean 4’s server makes this practical in a way that batch-mode proof checkers do not. Coq has a similar interactive mode through Proof General and CoqIDE, but the server protocol is older and the response latency is higher for large files. Lean 4 was designed from the start to support incremental checking: after an edit, the server reuses its results for everything above the change and re-elaborates only from that point onward.
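A small tactic proof shows what that exploration looks like. In the sketch below, the comments describe what the editor's goal view would show at each position; the proposition itself is just an illustration.

```lean
example (p q : Prop) (hp : p) (hq : q) : q ∧ p := by
  -- goal at this point: q ∧ p
  constructor
  -- now two goals remain: q, then p
  · exact hq   -- closes the first goal
  · exact hp   -- closes the second goal
```

Placing the cursor after constructor and seeing both remaining goals is usually what tells you which hypothesis to reach for next.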

The #check, #eval, and #print commands reinforce this exploratory style:

#check List.map           -- shows the type, roughly (α → β) → List α → List β
#eval [1, 2, 3].map (· * 2)  -- [2, 4, 6]
#print Nat.add            -- prints the definition

These commands treat the proof assistant as an interactive development environment rather than a batch checker, which lowers the barrier to experimentation.

Where the Tradeoffs Are

Lean 4 is not without costs. The Coq ecosystem is substantially larger and older. Projects like CompCert (the verified C compiler), VST (the Verified Software Toolchain), and Iris (a framework for concurrent separation logic) represent decades of work with no Lean 4 equivalents yet.

The Lean 4 community is also more concentrated around pure mathematics than software verification. The mathlib4 library is extraordinary in scope, but if you want to verify a concurrent algorithm or a compiler transformation, Coq has better-established infrastructure today. Projects like Lean4Lean, which reimplements the Lean 4 kernel’s type checker in Lean 4 itself as a step toward verifying it, suggest the verification direction is not absent, but it is behind.

The learning curve is also steeper than it appears at first. Lean 4’s type theory is powerful enough to express constructions that confuse newcomers, and the interaction between Prop and Type, universe polymorphism, and the instance search system requires investment to understand properly. The documentation has improved, but Lean 4 still demands more upfront commitment than picking up a conventional programming language.
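The three concepts named above can each be seen in a few lines. This is a minimal sketch; the names twice and HasDefault are invented for illustration and do not refer to anything in the standard library.

```lean
universe u

-- Prop versus Type: propositions and data live in different universes.
#check 1 + 1 = 2   -- a proposition, so its type is Prop
#check Nat         -- Nat : Type
#check Type        -- Type : Type 1

-- Universe polymorphism: `twice` works for types in any universe `u`.
def twice {α : Type u} (f : α → α) (x : α) : α := f (f x)

#eval twice (· + 3) 10   -- 16

-- Instance search: Lean finds the `List Nat` instance by chaining
-- the `List` instance with the `Nat` instance.
class HasDefault (α : Type u) where
  default : α

instance : HasDefault Nat := ⟨0⟩
instance {α : Type u} [HasDefault α] : HasDefault (List α) :=
  ⟨[HasDefault.default]⟩

#eval (HasDefault.default : List Nat)   -- [0]
```

Each piece is small on its own; the investment comes from learning how they interact in larger developments.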

Why This Argument Matters Now

Formal verification is moving from a research curiosity to an engineering practice with real production stakes. The recent progress toward formalizing the Fermat’s Last Theorem proof in Lean demonstrated that frontier mathematics is within reach of the tooling. At the same time, AI-assisted proof search, represented by tools like LeanDojo and the work coming out of DeepMind and other labs, is betting heavily on Lean as the substrate for training and evaluation.

The choice of proof assistant is increasingly a practical organizational decision, not just an academic one. De Moura’s argument is that Lean 4’s unified language model is what scales, for mathematics, for software verification, and for the emerging integration of machine learning with formal reasoning. The alternatives are coherent and mature, with larger existing libraries and more established toolchains for specific domains. The question is whether the architectural bet on language unification compounds over time, as the ecosystem fills in, or whether the incumbents’ head start proves durable. The answer depends mostly on what the community builds next, and there is reason to think the momentum is real.
