Lisp comes from the man who gave artificial intelligence its name as a research field. John McCarthy, who coined the term for the 1956 Dartmouth workshop, designed Lisp at MIT in 1958 to explore symbolic computation, and for three decades afterward Lisp was the substrate on which serious AI work happened. MACSYMA, the early computer algebra system. SHRDLU, Terry Winograd’s natural language understanding demo. The precursors to expert systems and planning algorithms. Nearly all of it written in Lisp.
So Dan Haskin’s observation lands with some force: modern AI coding assistants are nearly useless when you’re writing Lisp. The language of AI cannot get AI to help write it. Dan is sad about this, which is the honest position. He wants the assistance. The tools do real work for him in other languages. But when he writes Common Lisp or Clojure, the suggestions degrade, hallucinations multiply, and idiomatic patterns come out wrong in ways that erode trust in the tool entirely.
Why the Training Data Gap Is Structural
LLMs learn to write code by ingesting enormous amounts of existing code. GitHub, Stack Overflow, documentation sites, tutorials, open source projects. The quality of output in any given language is roughly proportional to how much of that language appeared in the training set, weighted by the quality and diversity of the examples. This is not a subtle finding; it falls directly out of how these models work.
Python is among the largest languages on public GitHub, and JavaScript and TypeScript together are comparable. Go, Rust, Java, and C++ each have meaningful representation. Clojure, by most estimates, sits below one percent. Common Lisp is smaller still. Scheme is a rounding error.
This matters not just for raw line counts but for coverage of idioms, library APIs, error patterns, and domain-specific conventions. Python has millions of Stack Overflow answers. Clojure has thousands. When an LLM is uncertain about a Python approach, there are enough training examples that it can usually fall back to something plausible. When it is uncertain about a Clojure approach, the fallback is something that looks superficially correct but misses the idiomatic center of gravity.
The concrete failure modes follow predictably. Ask for idiomatic Clojure and you often get Java-flavored Clojure: explicit loops where sequence operations belong, verbose if-let chains where destructuring in function arguments would be natural, missed opportunities to use the threading macros that are central to how Clojure code actually reads in practice.
;; What you write if you know Clojure
(->> users
     (filter :active)
     (map :email)
     (into #{}))

;; What a confused LLM often produces
(into #{} (map :email (filter :active users)))

;; Or worse, loop/recur where reduce works fine
Both versions are valid Clojure, but the threading macro form is the one you would write after reading any amount of real Clojure code. The nested form reads like someone who knows Common Lisp notation but not Clojure convention. The model has seen enough Lisp to generate legal syntax but not enough idiomatic Clojure to produce the style practitioners expect.
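The loop/recur case mentioned in the last comment is easier to judge when it sits next to the alternatives. The following is a minimal sketch, assuming a hypothetical users vector of maps with :email and :active keys; the function names are made up for illustration. All three functions return the same set, and the difference is only in how much of the iteration is spelled out by hand.

```clojure
;; Hypothetical sample data, invented for this illustration.
(def users
  [{:email "ada@example.com"   :active true}
   {:email "grace@example.com" :active false}
   {:email "alan@example.com"  :active true}])

;; Explicit loop: correct, but the traversal and accumulator are manual.
(defn active-emails-loop [users]
  (loop [remaining users
         acc #{}]
    (if-let [[u & more] (seq remaining)]
      (recur more (if (:active u) (conj acc (:email u)) acc))
      acc)))

;; reduce: same accumulation, with the traversal handled for you.
(defn active-emails-reduce [users]
  (reduce (fn [acc u] (if (:active u) (conj acc (:email u)) acc))
          #{}
          users))

;; Threaded sequence operations: each step names one transformation.
(defn active-emails [users]
  (->> users
       (filter :active)
       (map :email)
       (into #{})))

(active-emails users)
;; => #{"ada@example.com" "alan@example.com"}
```

The loop version is not wrong, which is part of the problem: it compiles, it passes tests, and it still reads as foreign to someone who writes Clojure daily.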
For Common Lisp the situation is worse. Models mix in Emacs Lisp idioms, generate format directives that have the right shape but get the control flow wrong, and occasionally produce code that was correct in a pre-ANSI dialect but not in modern standard Common Lisp. The condition system, which is one of Common Lisp’s most distinctive features and one of the things that makes it worth using, comes out garbled. Models tend to reach for handler-case patterns that work but miss the restart machinery (restart-case, handler-bind, invoke-restart) that makes conditions actually useful.
The hallucination problem is more acute than in popular languages. For Python, if you reference a library function that does not exist, the model often has enough coverage to suggest a real alternative. For Clojure, it will confidently invent plausible-sounding namespaces. Fabricated entries like clojure.data.collections/group-and-merge follow the naming conventions exactly but are pure hallucination. There is just enough signal to sound right but not enough to be right.
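One cheap guard against an invented namespace is to ask the runtime before trusting the suggestion. The sketch below assumes Clojure 1.10 or later, where requiring-resolve is available; var-exists? is a hypothetical helper name, not part of any library.

```clojure
(defn var-exists?
  "True when the namespace-qualified symbol resolves to a var.
   requiring-resolve tries to load the namespace first and throws if it
   cannot be found, so any exception is treated as 'does not exist'."
  [qualified-sym]
  (try
    (some? (requiring-resolve qualified-sym))
    (catch Exception _ false)))

(var-exists? 'clojure.set/union)
;; => true

(var-exists? 'clojure.data.collections/group-and-merge)
;; => false
```

This only tells you whether a name is real on the current classpath. It says nothing about whether the suggested call is the idiomatic choice.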
Tokenization Adds Another Layer
Below the training data problem is a lower-level issue. Modern LLMs use byte-pair encoding tokenization, where common sequences of characters get merged into single tokens. In popular languages, frequent constructs like function, return, const, import, and common library names get compressed. Effective sequence lengths stay manageable for the attention mechanism.
Lisp’s s-expression syntax means deeply nested code is a long stream of individual parentheses and symbols. Each paren is typically its own token. The structural information that a Lisp programmer reads through indentation and nesting must be recovered by the model from a flat token sequence. Correctly tracking that a particular closing paren closes the let form opened forty tokens earlier is the kind of positional reasoning that attention mechanisms handle inconsistently at length.
Paredit, the structural editing mode that most Lisp programmers use, abstracts all of this away by operating at the s-expression level rather than the character level. The programmer never thinks about individual parens. The model does not get that abstraction.
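The gap between the flat view and the structural view can be made concrete with a small sketch. Note the assumptions: crude-tokens is a hypothetical regex splitter that treats every delimiter as its own token, standing in for a real BPE tokenizer, which would split differently. close-distances is likewise an illustrative helper that reports, for each closing delimiter, how many tokens back its opener sits.

```clojure
(require '[clojure.edn :as edn])

(def src "(let [x 1] (inc x))")

;; Crude stand-in for a tokenizer (NOT real BPE): each delimiter is one
;; token, and each run of other non-whitespace characters is one token.
(defn crude-tokens [s]
  (re-seq #"[()\[\]{}]|[^\s()\[\]{}]+" s))

(def openers #{"(" "[" "{"})
(def closers #{")" "]" "}"})

;; Walk the flat sequence with a stack of opener positions. Assumes the
;; input is balanced; an unmatched closer would throw on pop.
(defn close-distances [tokens]
  (:out
   (reduce (fn [{:keys [stack out] :as acc} [i tok]]
             (cond
               (openers tok) (update acc :stack conj i)
               (closers tok) {:stack (pop stack)
                              :out   (conj out (- i (peek stack)))}
               :else acc))
           {:stack [] :out []}
           (map-indexed vector tokens))))

(crude-tokens src)
;; => ("(" "let" "[" "x" "1" "]" "(" "inc" "x" ")" ")")

(close-distances (crude-tokens src))
;; => [3 3 10]

;; The reader recovers the structure directly: one list of three elements.
(count (edn/read-string src))
;; => 3
```

Even in this one-line form, the final paren has to be matched against an opener ten tokens back. In a realistic function body that distance grows quickly, while the reader and a structural editor never deal with it at all.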
A Feedback Loop With No Natural Floor
The situation is self-reinforcing in a way that is difficult to escape. Less popular languages have less training data, which produces worse AI assistance, which makes the language less attractive to newcomers who now expect AI tools to work reasonably well, which keeps the language less popular, which produces less training data for the next model generation.
This feedback loop already existed in weaker form with Stack Overflow and documentation. Niche languages have less community support, which makes them harder to learn, which keeps communities small. AI assistance amplifies the same dynamic because the quality gap is larger and more visible. When Python gets measurably better AI help than Clojure for the same category of task, the productivity difference is legible to anyone evaluating language choices.
Haskell, Erlang, and OCaml are in similar positions. Strong communities, genuinely well-designed languages, but AI assistance that consistently misses the idiom. Haskell code generation is noticeably worse than TypeScript code generation even though Haskell has been around longer and has extensive documentation and academic literature. The model can write Haskell syntax but fails on type class idioms, lens usage, and the interaction between IO and pure code that every real Haskell program navigates constantly.
The Symbolic Versus Statistical Divide
The deeper irony is about the nature of the AI that displaced Lisp’s AI.
Lisp was the language of symbolic AI: search, logic, tree manipulation, formal reasoning over structured representations. The program synthesis work of the 1970s and 1980s operated by searching the space of possible programs, using rules and heuristics about program structure. For that paradigm, Lisp’s homoiconicity, the property that Lisp code is itself a Lisp data structure navigable with the standard list operations, was a genuine technical advantage. You could write a program that reasoned about programs, manipulated their syntax trees directly, generated new code from templates using the same operations you used on any other data.
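A small illustration of what homoiconicity buys in practice, written in Clojure rather than a 1970s Lisp. The form below is an arbitrary example chosen for this sketch; the point is that a quoted expression is an ordinary list, so the same functions that work on data can inspect it, rewrite it, and hand it to eval.

```clojure
(require '[clojure.walk :as walk])

;; A quoted expression is plain data: a list holding a symbol, a number,
;; and a nested list.
(def form '(+ 2 (+ 3 4)))

(first form)
;; => +

(eval form)
;; => 9

;; Rewrite the program with an ordinary tree walk: swap every + for *.
(def rewritten (walk/postwalk-replace {'+ '*} form))

rewritten
;; => (* 2 (* 3 4))

(eval rewritten)
;; => 24
```

No parser, no separate AST library, no string templating. That is the advantage symbolic program synthesis leaned on.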
Modern LLM-based code generation is statistical and text-based. It works by pattern-matching over the distribution of text in the training set. For this approach, homoiconicity provides no benefit whatsoever. What matters is frequency of occurrence in the training distribution. And here, Lisp loses badly.
The very property that made Lisp suitable for symbolic program synthesis does nothing for statistical program synthesis. McCarthy’s insight, that code and data should be the same thing so that programs can manipulate programs, remains elegant. It just turns out to be irrelevant to the specific mechanism by which modern AI generates code. A language designed to be reasoned about by symbolic AI gets left behind by statistical AI.
The REPL Defense, and Its Limits
There is a counterargument that Lisp programmers have always had better tools than AI assistance provides. The REPL offers a tighter feedback loop than any static suggestion system. You evaluate a form and immediately see what it returns. Paredit handles the mechanical work of structural editing faster than autocomplete. The macro system means you build abstractions that eliminate boilerplate rather than generating it. The Common Lisp condition system handles error recovery in ways that reduce the need for the error pattern lookup that AI tools are often used for.
Much of what AI coding tools do well, REPL-driven development also addresses, through different means and often better. The question of whether some unfamiliar function exists in the standard library is answered in seconds by evaluating a form in the REPL rather than waiting for a suggestion that might be hallucinated anyway.
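As a rough sketch of that workflow in Clojure, using the standard clojure.repl helpers: search the loaded namespaces by name, read the docstring, then try the function on a throwaway value. The sample maps are invented for illustration.

```clojure
(require '[clojure.repl :refer [apropos doc]])

;; Search all loaded namespaces for public names containing "group-by".
;; Returns a seq of fully qualified symbols, which should include
;; clojure.core/group-by.
(apropos "group-by")

;; Print the argument list and docstring.
(doc group-by)

;; Try it immediately on throwaway data.
(group-by :active
          [{:email "ada@example.com"   :active true}
           {:email "grace@example.com" :active false}])
;; => {true  [{:email "ada@example.com", :active true}],
;;     false [{:email "grace@example.com", :active false}]}
```

Each step takes about as long as reading a suggestion would, and none of the answers can be hallucinated, because they come from the running image.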
But this defense only goes so far. There are genuine parts of any programming workflow where AI assistance is useful regardless of language: sketching unfamiliar data transformation logic, working through a domain you have not written code in before, pulling in the structure of an approach you vaguely remember but cannot quite reconstruct. Lisp programmers are not exempt from those situations. They get worse assistance in them, and the gap is growing as AI tools improve faster for the popular languages than for the niche ones.
Dan’s framing as sadness rather than indifference is the right one. He is not arguing that Lisp is broken or that he wants to stop writing it. He is observing that a tool that makes other parts of his work meaningfully better does not work well for this part, and that the reason is structural rather than incidental. The fix would require either a major shift in the training data distribution or targeted fine-tuning on high-quality Lisp corpora from sources like Quicklisp and Clojars. Neither is imminent.
For now, if you write Lisp, you write it mostly with the tools the language community built before AI assistance existed. Those tools are genuinely good. But they are not improving at the pace of the tools everyone else is getting.