The Language That Invented AI Cannot Be Written by It

There is a particular irony buried in this post by Dan Haskin: John McCarthy invented Lisp in 1958 specifically as a language for artificial intelligence research. It powered the symbolic AI boom at MIT, Stanford, and CMU through the 1970s and 1980s. Lisp machines were dedicated hardware built to run it. Now, sixty-eight years later, the AI coding tools that have transformed how most developers work are largely useless when you write Lisp. The cause is a training data problem, a structural syntax problem, a macro problem, and a feedback loop problem, all compounding one another.

Haskin’s post focuses on the lived experience: reaching for GitHub Copilot or Claude and getting suggestions that look plausible but are structurally broken, semantically wrong, or just blatantly hallucinated. He is not the first to notice. The r/lisp and r/Common_Lisp communities have documented this for years. Copilot routinely generates unbalanced parentheses. ChatGPT confuses let and let*, gets loop macro syntax wrong, and invents library functions that do not exist. Claude does better at structural coherence but still makes characteristic errors on anything beyond basic patterns.
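The let versus let* confusion is easy to show concretely. The following is a minimal sketch, not taken from Haskin's post; the function name binding-demo is made up for illustration. let evaluates every init form before binding anything, while let* binds one variable at a time, so the two forms below differ by a single character and return different results.

```lisp
;; Hypothetical helper, for illustration only.
(defun binding-demo ()
  "Return what Y is bound to under LET and under LET*."
  (let ((x 1))
    (list
     ;; LET: the init form for Y is evaluated before the new X exists,
     ;; so Y sees the outer X.
     (let ((x 10) (y x)) (list x y))
     ;; LET*: bindings are sequential, so Y sees the new X.
     (let* ((x 10) (y x)) (list x y)))))

;; (binding-demo) => ((10 1) (10 10))
```

A model that swaps one for the other produces code that reads fine and compiles, which is why the mistake is easy to miss in review.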

The Training Data Problem Is Severe

The fundamental issue is representation. Modern LLMs are trained on enormous code corpora, and those corpora reflect the open-source ecosystem: predominantly Python, JavaScript, TypeScript, Java, C, and C++. Estimates from datasets like The Stack v2 suggest Python accounts for somewhere in the range of 200 billion tokens of training material. Common Lisp sits in the low hundreds of millions. Clojure fares better given its JVM ecosystem crossover, but it remains in the low single-digit billions. Scheme and Racket are smaller still.

This is not just a numbers problem. Quality and recency matter. Much of the existing Lisp training data comes from OCR’d textbooks, Usenet archives from the 1990s, and tutorial-level code. The idiomatic, modern Common Lisp you would write in 2026 using ASDF, Quicklisp, and the full standard library is thinly represented. A model trained on this corpus has learned what Lisp looks like superficially but has not seen enough of it written well to reliably reproduce it.

Clojure is the exception that proves the rule. Because Clojure interoperates with Java and attracted significant attention during its growth years in the early 2010s, there are more blog posts, Stack Overflow answers, and GitHub repositories. The AI experience with Clojure is noticeably better than with Common Lisp, which tracks directly with corpus size and recency.

S-Expressions Defeat Token Prediction at a Structural Level

Beyond the data deficit, there is a deeper mismatch between how LLMs generate text and what Lisp code requires.

Consider a non-trivial Common Lisp function:

(defun process-records (records &key (predicate #'identity) (transform #'identity))
  (loop for record in records
        when (funcall predicate record)
        collect (funcall transform record)
        into results
        finally (return (sort results #'string<))))

This is one logical unit, one top-level form, spanning multiple lines with parentheses that must balance across the entire thing. An LLM generating tokens left-to-right must track nesting depth through the whole sequence to close correctly. Compare that to Python, where indentation gives the model a strong local signal about structure at every line boundary, or C, where braces appear as syntactically obvious matched pairs. In Lisp, the signal is global: the count of open parentheses minus the count of closed parentheses must reach zero at the end of each form. Models lose count. They close one parenthesis too many or too few, and the code does not parse.
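The bookkeeping described above can be sketched in a few lines. This is an illustrative helper, not part of any tool mentioned here; the name paren-balance is an assumption, and the scan deliberately ignores strings, comments, and character literals, which a real reader would have to handle.

```lisp
;; Hypothetical helper: a naive scan that tracks nesting depth.
(defun paren-balance (text)
  "Return two values: the final nesting depth of TEXT and the lowest depth reached.
A complete form ends at depth 0 and never dips below 0."
  (let ((depth 0)
        (lowest 0))
    (loop for ch across text
          do (cond ((char= ch #\() (incf depth))
                   ((char= ch #\))
                    (decf depth)
                    (setf lowest (min lowest depth)))))
    (values depth lowest)))

;; (paren-balance "(return (sort results #'string<))") => 0, 0
;; (paren-balance "(return (sort results #'string<)")  => 1, 0   ; one close missing
;; (paren-balance "(a))")                              => -1, -1 ; one close too many
```

The point of the sketch is that the answer depends on every parenthesis seen so far. Nothing local to the final line says how many closing parentheses it needs.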

The uniform prefix notation of S-expressions, which makes Lisp trivially parseable by machines and elegant for practitioners of structural editing, is precisely what makes it hard for statistical token prediction. Every subexpression looks the same at the token level. Nothing on the surface distinguishes a function call from a special form, a macro invocation, or a data literal. Python has def, class, for, and if as unmistakable syntactic landmarks. Lisp has (.
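A running Lisp image can tell these cases apart, but only by consulting its environment, not the text. The sketch below uses the standard functions special-operator-p, macro-function, and fboundp; the wrapper name operator-kind and its keyword results are assumptions made for illustration.

```lisp
;; Hypothetical helper: classify a form by asking the environment
;; about its operator. The text of the form alone does not say.
(defun operator-kind (form)
  "Classify the operator of FORM as seen by the current Lisp image."
  (if (and (consp form) (symbolp (first form)))
      (let ((op (first form)))
        (cond ((special-operator-p op) :special-operator)
              ((macro-function op) :macro)
              ((fboundp op) :function)
              (t :unknown)))
      :not-a-call))

;; All four forms have the same shape: (symbol arg arg ...).
;; (operator-kind '(if test a b))     ; IF is a special operator
;; (operator-kind '(when test a))     ; WHEN is a standard macro
;; (operator-kind '(list test a))     ; LIST is a function
;; (operator-kind '(red green blue))  ; plain data, unless RED happens to be defined
```

A token predictor has no such environment to query. It sees four parenthesised lists of symbols and has to guess the category from context.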

Macros Make the Semantic Problem Intractable

Syntax is only part of it. Lisp’s macro system adds a layer that LLMs cannot reason about reliably.

When you write:

(with-open-file (stream "data.txt" :direction :input)
  (loop for line = (read-line stream nil)
        while line
        collect line))

with-open-file is a macro. It expands at compile time into code that opens the file, wraps the body in unwind-protect, and guarantees the stream is closed on exit. An LLM generating code that uses with-open-file needs to know that this is a macro, that its first argument is a binding form rather than a function call, and that the variable named there, stream, is bound only within the body. None of this is inferable from token statistics alone, and Common Lisp codebases use macros heavily. The standard library itself is full of them: loop, with-slots, define-condition, pushnew.
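To make the expansion concrete, here is a deliberately simplified stand-in. It is not how any implementation defines with-open-file (the real macro accepts all of open's options and treats abnormal exits specially); the name with-input-file is hypothetical and exists only to show the shape of the generated code.

```lisp
;; Hypothetical, simplified stand-in for WITH-OPEN-FILE (input only).
(defmacro with-input-file ((var path) &body body)
  "Bind VAR to an input stream on PATH, run BODY, and close the stream
even if BODY exits non-locally."
  `(let ((,var (open ,path :direction :input)))
     (unwind-protect
          (progn ,@body)
       (close ,var))))

;; (macroexpand-1 '(with-input-file (stream "data.txt")
;;                   (read-line stream nil)))
;; => (LET ((STREAM (OPEN "data.txt" :DIRECTION :INPUT)))
;;      (UNWIND-PROTECT (PROGN (READ-LINE STREAM NIL))
;;        (CLOSE STREAM)))
```

Note that (stream "data.txt") at the call site is never evaluated as a call to a function named stream. The macro destructures it, which is exactly the kind of knowledge that is not visible in the surface text.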

The problem compounds because Common Lisp allows user-defined macros that are syntactically indistinguishable from function calls. A model has no way to know whether (my-thing x y z) is a function, a macro that transforms x before evaluation, or a macro that does not evaluate y and z at all. The semantics are invisible at the surface level. Contrast this with Python decorators or JavaScript async/await, where special syntax marks out the non-standard evaluation explicitly and the model has strong statistical priors about what follows.
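A small sketch shows how identical call syntax can hide different evaluation rules. The names pick-fn, pick-macro, and evaluation-trace are made up for this example.

```lisp
;; Two hypothetical operators with identical call syntax.
(defun pick-fn (test a b)
  "Function: all three arguments are evaluated before the call."
  (if test a b))

(defmacro pick-macro (test a b)
  "Macro: expands to IF, so only one of A and B is ever evaluated."
  `(if ,test ,a ,b))

(defun evaluation-trace ()
  "Record which argument forms actually run under each operator."
  (let ((log '()))
    (flet ((note (tag) (push tag log) tag))
      (pick-fn t (note :fn-a) (note :fn-b))
      (pick-macro t (note :macro-a) (note :macro-b)))
    (nreverse log)))

;; (evaluation-trace) => (:FN-A :FN-B :MACRO-A)
```

The two calls inside evaluation-trace differ only in the operator's name, yet :macro-b is never evaluated. Telling them apart requires knowing the definitions, not reading the call.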

The Paredit Problem: AI Completions Break the Editing Workflow

There is a tooling mismatch that Haskin touches on and that deserves more attention on its own. The standard way to write Lisp is with structural editing, either paredit or parinfer. These tools operate on the tree structure of the code rather than its text representation. They guarantee that parentheses stay balanced at all times. Inserting a character, deleting a node, or moving an expression always produces valid S-expression structure.

AI completion suggestions are generated as raw text. When Copilot suggests a completion that is structurally broken, a structural editing mode such as paredit can refuse the insertion or leave the buffer in a garbled state. The editing environment and the AI completion system are operating on fundamentally different models of what the code is. This is not just an annoyance; it means AI assistance is architecturally at odds with how Lisp developers actually work.

A genuinely useful AI tool for Lisp would need to be tree-aware, operating on AST nodes rather than token sequences. Some experiments in this direction exist, including tree-sitter-based editor integrations, but nothing in production-grade AI tooling has made this leap. The economic incentive to do so is not there, which brings us to the real problem.
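The smallest step toward tree-awareness is to refuse any suggestion that does not read as a complete form. The sketch below does that with the standard reader; it is an assumption about how such a filter might look, not a description of any existing integration, and the name well-formed-completion-p is hypothetical.

```lisp
;; Hypothetical filter: accept a completion only if the Lisp reader
;; parses it as exactly one complete form.
(defun well-formed-completion-p (text)
  "True if TEXT reads as one complete form with only whitespace left over."
  (handler-case
      (let ((*read-eval* nil))          ; refuse #. so untrusted text cannot run code
        (multiple-value-bind (form end) (read-from-string text)
          (declare (ignore form))
          (every (lambda (ch) (member ch '(#\Space #\Tab #\Newline)))
                 (subseq text end))))
    ;; Unbalanced or otherwise unreadable input signals an error.
    (error () nil)))

;; (well-formed-completion-p "(collect (funcall transform record))")  ; true
;; (well-formed-completion-p "(collect (funcall transform record)")   ; false: unclosed
;; (well-formed-completion-p "(collect record))")                     ; false: stray close
```

This only checks structure, and it has limits: reading interns symbols in the current package, and text that uses an unknown package prefix or a custom reader macro will be rejected even if it is valid in the target project. It says nothing about whether the form means what the developer wanted.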

The Feedback Loop That May Be Permanent

All of this creates a compounding dynamic. Less AI support means the developer experience of Lisp, already non-trivial to onboard into, is significantly worse relative to Python or TypeScript in an era where AI assistance has become a normal part of the workflow. Fewer developers adopting Lisp means less new Lisp code written publicly. Less new code means sparser training data in future model generations, which means worse AI support, completing the loop.

This matters because the developer population that Lisp might attract in 2026 includes people who learned to code with AI assistance and expect it to work. For those developers, reaching for a language where the AI is actively counterproductive is a serious friction point. The argument that Lisp rewards you with a fundamentally different way of thinking about programs still holds. It is simply harder to make to someone whose current tooling accelerates them when the tooling on offer in Lisp actively misleads them.

Haskin’s sadness is about watching a language with genuine intellectual depth get further marginalised by a tooling trend it has no way to participate in. The underlying cause is structural. The properties that make Lisp resistant to AI assistance (its syntactic uniformity, its macro power, its structural editing ecosystem) are the same properties that make it worth writing in the first place. You cannot separate them without making Lisp into something else.

Whether this changes depends on whether any AI lab decides it is worth training on high-quality Lisp code with curated macro semantics, or whether tree-aware code generation ever becomes mainstream. Neither looks likely in the near term. In the meantime, Lisp developers are writing code without the assistant, which, depending on how you look at it, is either a loss or the whole point.