When Legal Isn't Enough: AI, Clean-Room Reimplementation, and the Slow Death of Copyleft

Source: hackernews

There’s a question that’s been nagging at the edges of open source discourse for a while now, and Hong Minhee finally put it into sharp relief: is something being legal the same as it being legitimate? The specific context is AI reimplementation of copyleft software, and the implications are worth sitting with.

The classic clean-room reimplementation has always been a known escape hatch from copyleft. The idea: have one team read the GPL-licensed code to understand its behavior, write thorough documentation, then have a separate team implement from scratch based only on that documentation. Never look at the original source. No derivative work, no license obligation. It’s tedious, expensive, and historically rare enough that copyleft held up reasonably well in practice.

AI changes that calculus dramatically.

Now you can feed a model a pile of GPL code, ask it to explain the behavior, and then ask a second model — or the same one in a new context — to produce a functionally equivalent implementation in any language you like. The output is arguably not a derivative work in the legal sense. You never copied a line. The license, strictly read, may not apply.

The Gap Between Law and Spirit

Copyleft was never just a legal mechanism. It was a social contract. The GPL’s viral nature was by design — Richard Stallman wanted to ensure that anyone who built on free software had to give back. The license was the enforcement arm of a philosophy.

When you use AI to reimplement a GPL library and ship it as proprietary code, you’re not violating the letter of the GPL. But you’re absolutely violating its spirit. You benefited from the collective work of the open source community. You used that work — mediated through a model — to produce something you then locked down. The copyleft didn’t propagate, even though the value did.

This is the erosion Hong Minhee is pointing at. It’s not a sudden collapse. It’s a slow leak.

Why This Matters More Than It Seems

You might think: so what? Companies have always found ways around copyleft. The AGPL exists partly because the original GPL didn’t cover network use. Licenses evolve.

But there’s something qualitatively different about AI-assisted reimplementation. The barrier is now so low that it’s no longer just well-resourced companies doing it. Any developer with API access can produce a clean-room reimplementation in an afternoon. The friction that made copyleft practically enforceable — the sheer cost of reimplementation — is gone.

The legal system hasn’t caught up. Courts haven’t ruled clearly on whether AI-mediated clean-room reimplementation creates derivative works. Until they do, the ambiguity itself is corrosive.

What Can Actually Be Done

Honestly? I’m not optimistic that license language alone can fix this. You can’t write a clause that says “and also don’t use AI to get around this” in a way courts will reliably enforce.

What might actually matter:

  • Social norms and naming: calling this practice what it is — laundering copyleft obligations through AI — and making it reputationally costly
  • Community-based enforcement: the open source community has always run partly on trust and peer pressure, and those mechanisms still work when the community actually applies them
  • New license models: some projects are already experimenting with licenses that focus on use and benefit rather than just code derivation

The hardest part is that the people doing this probably don’t think of themselves as bad actors. They’re following the rules as written. But legal compliance was never supposed to be the ceiling — it was supposed to be the floor.