What the Linux Kernel's AI Policy Document Actually Says About Trust
Source: hackernews
The Linux kernel codebase recently gained a new document: Documentation/process/coding-assistants.rst. It showed up on Hacker News with 453 points and 342 comments, which tells you people had opinions. Reading the document itself, though, is more interesting than the discourse around it.
This is not a ban. It is not an endorsement. It is a carefully worded statement of what the kernel community expects contributors to understand before submitting AI-assisted code, and the specifics reveal a lot about both the kernel’s culture and the genuine limitations of current AI tools in this context.
Why the Kernel Is a Hard Case
Most codebases that struggle with AI-generated contributions are dealing with quality problems: code that works but is poorly structured, or suggestions that don’t match the project’s conventions. The kernel has all of those problems, but it also has a few that come from being a kernel.
Kernel code runs with full privileges: nothing protects the kernel’s memory from its own bugs, there is no standard library, and certain classes of failure have no recoverable error state. A subtle off-by-one in a user-space web server might cause a 404. The same mistake in a kernel memory allocator might cause silent data corruption or a privilege escalation that nobody notices for months. The cost of a confident wrong answer is higher here than in most software.
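To make the off-by-one concrete, here is a minimal user-space sketch in C. The names (TABLE_SIZE, index_ok, table_store) are hypothetical and are not taken from the kernel or from the policy document; the point is only that the buggy and correct checks differ by one character and both compile cleanly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SIZE 8

/* Buggy check: accepts idx == TABLE_SIZE, which is one slot past the end. */
static bool index_ok_buggy(size_t idx)
{
    return idx <= TABLE_SIZE;
}

/* Correct check: valid indices are 0 .. TABLE_SIZE - 1. */
static bool index_ok(size_t idx)
{
    return idx < TABLE_SIZE;
}

/* Store a value only if idx is in range. Returns 0 on success, -1 otherwise. */
static int table_store(int *table, size_t idx, int value)
{
    if (!index_ok(idx))
        return -1;
    table[idx] = value;
    return 0;
}
```

In user space, a write through the buggy check usually lands in neighbouring memory owned by the same process. In a kernel allocator, the neighbouring memory can belong to an unrelated subsystem, which is why the same slip is so much harder to trace.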
AI coding assistants are very good at being confidently wrong. The document addresses this directly, noting that these tools can produce code that looks correct, compiles cleanly, and fails in ways that are difficult to trace back to the original mistake. This is not a novel observation, but the kernel is one of the few places where that failure mode has a realistic path to affecting millions of systems before anyone catches it.
The document also highlights a subtler issue: kernel idioms are underrepresented in what AI tools learned from. The training corpus for most large language models is dominated by user-space code, web applications, and library code. Kernel-specific patterns appear far less frequently and in far more varied forms: the discipline required around rcu_read_lock(), the rule that code in interrupt context must never sleep, or the precise semantics of memory barriers on different architectures. A model trained on GitHub has likely seen thousands of React hooks for every correct RCU critical section.
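The ordering discipline behind RCU-style publication can be sketched in user space with C11 atomics. This is an analogy, not kernel code: the kernel uses rcu_assign_pointer() on the writer side and rcu_dereference() inside an rcu_read_lock()/rcu_read_unlock() section on the reader side. The struct and function names below are hypothetical, and the sketch deliberately leaves out the hard part of RCU, which is deciding when an old object can safely be freed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct config {
    int timeout_ms;
    int retries;
};

/* Shared pointer; NULL until a writer publishes a config. */
static _Atomic(struct config *) current_config = NULL;

/*
 * Writer: initialise every field first, then publish the pointer with
 * release ordering so readers can never observe a half-built struct.
 */
static void publish_config(struct config *cfg, int timeout_ms, int retries)
{
    cfg->timeout_ms = timeout_ms;
    cfg->retries = retries;
    atomic_store_explicit(&current_config, cfg, memory_order_release);
}

/*
 * Reader: load the pointer exactly once with acquire ordering and use
 * only that snapshot. Returns -1 if nothing has been published yet.
 */
static int read_timeout_ms(void)
{
    struct config *cfg =
        atomic_load_explicit(&current_config, memory_order_acquire);

    if (cfg == NULL)
        return -1;
    return cfg->timeout_ms;
}
```

Swapping the two steps in publish_config(), or using a plain store instead of a release store, still compiles and usually still passes a quick test. That is the kind of mistake that looks correct in review unless the reviewer knows the idiom.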
The License Problem Is Real and Underappreciated
Section two of the document gets into copyright and licensing, and this is where things get complicated in ways that even experienced contributors sometimes miss.
The Linux kernel is licensed under GPL-2.0-only, which is stricter than it sounds. Many projects use GPL-2.0-or-later, which lets recipients apply the terms of a later GPL version. The kernel does not. This means that code contributed to the kernel cannot be derived from code under an incompatible license, and the list of incompatible licenses includes ones that are common across open source, such as Apache-2.0 and GPL-3.0.
AI tools are trained on code from many sources. When a model suggests an implementation of, say, a particular algorithm for a driver, it may be reproducing or closely paraphrasing code from an Apache-licensed project, from a BSD- or MIT-licensed project whose license is compatible but still requires its copyright notice to be preserved, or from something with no clear license at all. The contributor who submits that code is responsible for its licensing, not the tool that suggested it. The document makes this explicit: if you cannot verify the provenance of a suggestion, you cannot submit it.
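Once the origin of a snippet is known, the compatibility question itself is mechanical. The sketch below is a hypothetical helper, not part of any kernel tooling, and its table is illustrative and incomplete. It maps a few SPDX identifiers to whether code under that license can be combined into a GPL-2.0-only work; anything not listed is treated as unknown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum compat { COMPATIBLE, INCOMPATIBLE, UNKNOWN };

struct license_rule {
    const char *spdx_id;
    enum compat verdict;
};

/* Illustrative subset only; a real check needs the full SPDX list. */
static const struct license_rule rules[] = {
    { "GPL-2.0-only",     COMPATIBLE },
    { "GPL-2.0-or-later", COMPATIBLE },
    { "MIT",              COMPATIBLE },   /* notice must be preserved */
    { "BSD-2-Clause",     COMPATIBLE },   /* notice must be preserved */
    { "BSD-3-Clause",     COMPATIBLE },   /* notice must be preserved */
    { "GPL-3.0-only",     INCOMPATIBLE },
    { "GPL-3.0-or-later", INCOMPATIBLE },
    { "Apache-2.0",       INCOMPATIBLE },
    { "CDDL-1.0",         INCOMPATIBLE },
};

/* Look up an SPDX identifier; NULL or unlisted identifiers are UNKNOWN. */
static enum compat gpl2_only_compat(const char *spdx_id)
{
    if (spdx_id == NULL)
        return UNKNOWN;
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); i++) {
        if (strcmp(rules[i].spdx_id, spdx_id) == 0)
            return rules[i].verdict;
    }
    return UNKNOWN;
}
```

The important row is the one that is not in the table. An AI suggestion arrives with no SPDX identifier at all, so it starts as UNKNOWN, and the lookup cannot run until the contributor has established where the code came from.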
This is not a theoretical concern. Other open source projects have already reported AI-generated contributions containing passages that closely matched proprietary or incompatibly licensed code. In the kernel, where the license terms are stricter and the code ships on far more systems, the cost of such a problem is higher.
What the Document Permits
The document is not hostile to AI tools. It describes them as potentially useful for several things: understanding unfamiliar code, generating documentation, writing test cases, and getting oriented in a large codebase. These are the cases where AI tools tend to be genuinely helpful and where the failure modes are less catastrophic. A wrong explanation of what a function does is annoying; a wrong implementation of that function in the kernel is a security vulnerability.
The core requirement the document establishes is straightforward: you must understand every line you submit. If an AI tool generates a patch and you cannot explain why each line is correct, you should not submit that patch. This is not a higher standard than what already exists; it has always been true that kernel contributors are expected to understand their own code. The document is clarifying that the source of the code does not change this expectation.
This framing is more useful than a blanket prohibition would be, because it puts the responsibility exactly where it belongs. The tool is not submitting a patch. The contributor is. The review process will hold that contributor accountable for every line, and maintainers have limited patience for authors who cannot answer basic questions about their own submissions.
The Maintainer Perspective
The document exists in part because maintainers have been dealing with an increase in low-quality patches that appear to be AI-generated. Greg Kroah-Hartman and other senior maintainers have been vocal about this. The kernel’s mailing-list-based review process is already a significant time sink for everyone involved, and a flood of patches that require substantial back-and-forth to identify basic problems makes it worse.
This is a real tension in the open source ecosystem right now. AI tools lower the barrier to producing code, including the barrier to producing code that looks plausible but is not quite right. For projects with limited review bandwidth, this is a genuine problem, not just a philosophical one. Review time is scarce even for maintainers whose employers fund their kernel work; many review on top of their other responsibilities because they care about the project, and there is a finite amount of care that any person can sustain.
The document is partly an educational resource and partly a way of setting expectations clearly enough that maintainers can reject problematic contributions by pointing to written policy rather than having to relitigate the same arguments repeatedly.
What This Looks Like in Practice
For someone actually contributing to the kernel with AI assistance, the workflow the document implies is roughly: use AI tools to understand the codebase and generate drafts, then do the work of actually verifying those drafts against the kernel’s coding standards, the specific subsystem’s conventions, the relevant hardware documentation, and your own understanding of the problem. The AI output is a starting point, not a submission.
This is honestly how AI-assisted development should work in any high-stakes context. I use AI tools constantly when building out new features in my Discord bot or working through unfamiliar systems code, and the pattern that produces good results is always the same: generate, review, understand, revise. The difference in the kernel is that the review step requires a level of domain expertise that takes years to develop, and the consequences of getting it wrong are more severe than a bot going offline.
The broader software industry is still working out how to integrate AI assistance into processes that were designed around the assumption that every line of code was written by someone who understood it. The kernel’s document is one of the more thoughtful attempts to articulate what that integration should look like, and the answer it arrives at is not very complicated: the tools can help, but the accountability stays with the person who signs off on the code.
That is not a radical position. It is the only position that makes sense.