The irony is hard to ignore. Lisp was invented in 1958 by John McCarthy, the same researcher who coined the term “artificial intelligence” and helped organize the 1956 Dartmouth Conference that launched the field. For the next three decades, Lisp and AI were nearly synonymous. MACSYMA and hundreds of expert systems were written in Lisp dialects such as Maclisp and Interlisp. The language was so closely associated with AI research that university departments bought specialized Lisp machines just to run it faster.
Now, in 2026, AI coding tools struggle to write Lisp. Dan Haskin’s post captures the frustration of a developer who finds modern AI assistants nearly useless when working in Lisp dialects. The complaints are specific: suggestions that ignore available macros, completions that reach for the wrong functions, code that is syntactically valid but stylistically wrong. The model writes Lisp like someone who memorized the parenthesis rules but never read any actual Lisp.
This is, at its core, a training data problem. But calling it only that undersells how structural and self-reinforcing the problem is.
The Numbers Are Stark
The Stack Overflow Developer Survey consistently places Clojure around 1.6-1.9% of respondents, Common Lisp under 1.5%, and Scheme and Racket in similar territory. GitHub’s Octoverse reports don’t list Lisp dialects at all among the top languages by repository count or contributor activity. Python, JavaScript, TypeScript, Java, and C# occupy those positions by a wide margin.
LLMs trained on code learn from distributions that reflect this reality. The Pile, CodeParrot, and similar pretraining corpora source heavily from GitHub, Stack Overflow, and the open web. If Lisp represents 1-2% of developers and a smaller fraction of public repositories, it represents a similarly small slice of training tokens. A model that sees vastly more Python than Common Lisp will be correspondingly better at Python. This is not a failure of model architecture. It is an accurate reflection of what the models were trained on.
Dialect Fragmentation Makes It Worse
The category “Lisp” is not a single language. It is a family of languages with shared syntax but divergent semantics, standard libraries, and idioms. Common Lisp, Clojure, Racket, Scheme, Guile, Chicken Scheme, Chez Scheme, Fennel, Janet, and Emacs Lisp are all meaningfully different programming environments. A Clojure developer and a Common Lisp developer share a rough syntactic family resemblance, but they write completely different code in almost every particular.
The 1-2% of developers writing “Lisp” gets split across these dialects. A model that sees some Clojure, some Common Lisp, and some Racket learns a blurry average of all of them. When it generates Clojure, it may reach for Common Lisp idioms. When it generates Common Lisp, it may use Clojure naming conventions. The dialects are similar enough to confuse the model but different enough that this confusion produces unusable output.
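To make the confusion concrete, here is a small sketch of the same trivial definitions in Clojure, with rough Common Lisp equivalents shown as comments. The function names `square` and `sum-of-squares` are illustrative only and do not come from Haskin's post.

```clojure
;; Clojure: parameters go in a vector, and `let` bindings are a flat vector.
(defn square [x]
  (* x x))

(defn sum-of-squares [xs]
  (let [squares (map square xs)]
    (reduce + squares)))

;; Rough Common Lisp equivalents, shown as comments for comparison:
;;
;; (defun square (x)
;;   (* x x))
;;
;; (defun sum-of-squares (xs)
;;   (let ((squares (mapcar #'square xs)))
;;     (reduce #'+ squares)))

(sum-of-squares [1 2 3]) ; => 14
```

A model that blends the two might write `(defn square (x) ...)` or `(let ((squares ...)) ...)` in a Clojure file. Both look like plausible Lisp, and neither compiles as Clojure.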
Python doesn’t have this problem at the same scale. There are Python 2 and Python 3, and models have enough examples of each to handle the distinction. The Lisp dialects don’t have enough examples for similar disambiguation.
The Macro Vocabulary Problem
Here is the most concrete technical failure mode. Lisp’s distinguishing feature is its macro system. Macros in Lisp are not preprocessor text substitution; they are code transformations that run at compile time, receive unevaluated syntax, and return new syntax. This enables library authors to build domain-specific languages that look and behave like first-class language features.
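As a minimal sketch of what "receive unevaluated syntax and return new syntax" means in Clojure, consider the macro below. The name `unless` is hypothetical and chosen for illustration; `clojure.core` already ships the equivalent `when-not`.

```clojure
;; `unless` is a hypothetical macro for illustration;
;; clojure.core already provides the equivalent `when-not`.
(defmacro unless
  "Evaluates body only when test is falsey."
  [test & body]
  ;; `test` and `body` arrive as unevaluated forms. The macro
  ;; returns a new form, built here with syntax-quote.
  `(if ~test
     nil
     (do ~@body)))

;; Inspect the transformation without running the result.
(macroexpand-1 '(unless (empty? xs) (first xs)))
;; expands to: (if (empty? xs) nil (do (first xs)))

(unless (empty? [1 2 3])
  :not-empty) ; => :not-empty
```

To the caller, `unless` is indistinguishable from a built-in control form. That is the property that lets each library grow its own vocabulary.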
In practice, idiomatic Clojure makes heavy use of threading macros like -> and ->>:
;; Assumes clojure.string is aliased as str
(require '[clojure.string :as str])

;; Non-idiomatic: nested function calls
(str/upper-case (str/trim (get user :name)))

;; Idiomatic: threading macro
(-> user
    :name
    str/trim
    str/upper-case)
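The threaded form above is purely a syntactic rewrite, which can be checked with `macroexpand-1`. The sketch below also shows `->>`, which threads the value in as the last argument rather than the first. The `users` data is made up for illustration.

```clojure
(require '[clojure.string :as str])

;; `->` rewrites the pipeline into nested calls before evaluation.
(macroexpand-1 '(-> user :name str/trim str/upper-case))
;; expands to: (str/upper-case (str/trim (:name user)))

;; `->>` threads the value as the LAST argument, which suits sequence functions.
(def users
  [{:name " ada "} {:name "grace"} {:name nil}])

(->> users
     (keep :name)          ; drop entries whose :name is nil
     (map str/trim)
     (map str/upper-case)) ; => ("ADA" "GRACE")
```

Choosing between `->` and `->>` depends on where each function expects its main argument. This is the kind of convention a model picks up only from seeing enough real Clojure.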
Idiomatic Common Lisp uses loop, with-slots, handler-case, and dozens of library-specific macros. Idiomatic Racket uses define-syntax, syntax-rules, and syntax-parse to build pattern-matching forms that look completely unlike anything in mainstream languages.
A model that hasn’t seen enough examples of a specific library’s macro vocabulary will not use those macros. It will write verbose longhand, if that exists, or produce structurally plausible code that doesn’t compile. The result is Lisp that passes a syntax check but ignores the expressive vocabulary that makes Lisp worth writing in the first place. The model knows S-expressions. It doesn’t know the specific dialect’s accumulated library idioms.
The Homoiconicity Paradox
One might expect Lisp to be an easy target for language models. Lisp is homoiconic: code and data use the same representation. An S-expression like (defun square (x) (* x x)) is simultaneously source code and a list data structure. There is no ambiguity about the parse tree. The parentheses are always balanced. There is no operator precedence to memorize, no special-case syntax for conditionals or loops or blocks.
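A short Clojure sketch of the same idea, using `defn` in place of the Common Lisp `defun` above. A quoted form is an ordinary list that can be inspected and rewritten with sequence functions, then handed to the evaluator. The names `form`, `cube-form`, and `cube` are illustrative.

```clojure
;; A quoted form is plain data: a list of symbols, a vector, and a nested list.
(def form '(defn square [x] (* x x)))

(list? form)  ; => true
(first form)  ; => defn
(second form) ; => square
(nth form 2)  ; => [x]
(last form)   ; => (* x x)

;; Because it is data, ordinary functions can rewrite it. Here the
;; parameter vector is reused and the body gains one more `x`.
(def cube-form
  (let [[_ _ params body] form]
    (list 'fn params (concat body '(x)))))
;; cube-form is now (fn [x] (* x x x))

;; The rewritten list can be evaluated as code.
(def cube (eval cube-form))
(cube 3) ; => 27
```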
In theory, this regularity should make Lisp syntactically trivial for a language model. In practice, it doesn’t help much because the problem is not syntactic. The problem is semantic and idiomatic. A model learns to produce balanced parentheses easily. Producing the right parentheses, calling the right functions with the right arguments, using the macros that a working Lisp programmer would reach for, is a knowledge problem. And that knowledge comes from training data volume.
The homoiconicity argument occasionally surfaces in discussions about AI theorem proving and program synthesis, where Lisp’s uniform representation is genuinely useful for symbolic manipulation. But those use cases involve generating small, constrained expressions, not the kind of open-ended library-aware code that developers want from a coding assistant.
A Self-Reinforcing Loop
The consequence of poor AI support isn’t just inconvenience for existing Lisp programmers. It changes the economics of adopting the language at all.
The quality of AI coding assistance has become a real consideration when developers choose tools for a new project. A Python project benefits from completions, suggestions, error explanations, and refactoring help from models that have seen hundreds of millions of examples. A Clojure project gets less useful assistance. A Common Lisp project gets even less. This tilts decisions toward mainstream languages for developers on the margin.
Fewer Lisp projects means less public Lisp code. Less public Lisp code means less training data for the next generation of models. The gap widens over time. This dynamic is not unique to Lisp. Erlang, Haskell, Forth, Prolog, and many other languages with small but dedicated communities face the same feedback loop. Lisp’s case is notable partly because of its historical centrality to computing, and partly because its structural properties seem like they should make it a natural fit for formal reasoning tools.
What Might Actually Change This
Fine-tuning on dialect-specific data is theoretically tractable. The Lisp community is small but produces high-quality code, and a meaningful portion of it is public. A model fine-tuned specifically on Clojure, trained to recognize when to use threading macros and when to reach for core.async, would serve Clojure developers far better than a general-purpose model weighted toward Python idioms.
The challenge is economic. Fine-tuning and maintaining a dialect-specific model requires sustained investment. The population of Clojure developers willing to pay for better AI tooling is not large enough to obviously justify that cost against improving TypeScript support.
The Clojure community has made incremental progress elsewhere. Clojure LSP provides structural editing and navigation that don’t depend on AI at all. Malli provides schema-driven data validation with enough structure that generation tools could theoretically leverage it for constrained output. These are useful, but they don’t close the gap with what Python or TypeScript developers have access to.
Better prompting strategies help at the margins. Providing the model with relevant function signatures, a few idiomatic examples, and explicit context about which dialect you are working in improves output quality. This is more setup than most developers want to do for routine tasks, but for a complex library or macro-heavy section of code it’s often the difference between a useful suggestion and a plausible-looking hallucination.
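One way to make that setup repeatable is a small helper that assembles the context into a prompt. This is only a sketch. The function `build-prompt`, its map keys, and the wording of the prompt are all assumptions for illustration, not part of any particular assistant's API.

```clojure
(require '[clojure.string :as str])

(defn build-prompt
  "Assembles a plain-text prompt from a dialect name, function signatures,
  idiomatic examples, and a task description. All keys are hypothetical."
  [{:keys [dialect signatures examples task]}]
  (str/join
   "\n\n"
   [(str "You are writing " dialect ". Do not use idioms from other Lisp dialects.")
    (str "Available functions and macros:\n" (str/join "\n" signatures))
    (str "Idiomatic examples:\n" (str/join "\n\n" examples))
    (str "Task: " task)]))

(def prompt
  (build-prompt
   {:dialect    "Clojure"
    :signatures ["(clojure.string/trim s)"
                 "(clojure.string/upper-case s)"]
    :examples   ["(-> user :name str/trim str/upper-case)"]
    :task       "Normalize the :email field of a user map."}))

(println prompt)
```

The resulting string can be pasted into whatever assistant is in use. Naming the dialect explicitly and listing real signatures addresses the two failure modes described earlier: cross-dialect confusion and unfamiliar library vocabulary.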
For now, writing Lisp means writing without much AI assistance. That may not be entirely bad, depending on how you value the workflow. Lisp has always rewarded careful reading, deep familiarity with the macro vocabulary, and thinking about program structure before committing to it. The AI coding workflow, where you describe what you want and evaluate a suggestion, doesn’t map cleanly onto Lisp development regardless of model quality. The language asks for something different.
Haskin’s frustration is still reasonable. The benefit of AI assistance for boilerplate, for documentation lookup, for catching simple errors at the margins, is real and worth having. Lisp programmers are not getting it, and the trend lines don’t point toward improvement without deliberate effort from the tooling community. McCarthy built the language that made AI research possible. It would be fitting if the field returned the favor.