The Linux Kernel Sets Ground Rules for AI-Assisted Contributions

The Linux kernel project recently published a formal document, Documentation/process/coding-assistants.rst, laying out its position on AI and LLM-based coding assistants. The policy does not ban AI assistance. Instead, it requires contributors to disclose when AI tools were used to generate or assist with code, and it reaffirms that the human submitter bears complete responsibility for everything they send to the mailing lists, regardless of how it was produced. That framing is sensible, but the more consequential part of the document is what it implies about the Developer Certificate of Origin (DCO), the certification every kernel contributor makes when adding a Signed-off-by: line to a patch.

What the Policy Actually Requires

The document is not lengthy. It establishes three core obligations. First, disclose AI tool use in the patch or submission. Second, understand every line of the submitted code well enough to explain and defend it. Third, recognize that the standard submission process, including the DCO, applies in full. The third point is where the document gets complicated, because the DCO was written for a world where humans wrote code and the question of authorship was straightforward.
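The first and third obligations leave visible traces in a patch, so they can be checked mechanically in a way the second cannot. The following is a minimal sketch of such a check, not anything the kernel project ships. It assumes disclosure is recorded as a trailer alongside Signed-off-by:, and the tag name Assisted-by: is an assumption made for illustration; the function names and the example message are hypothetical.

```python
import re

# "Signed-off-by" is the kernel's DCO tag. "Assisted-by" is an ASSUMED
# disclosure tag, used here for illustration only.
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*):\s*(.+)$")


def parse_trailers(message: str) -> list[tuple[str, str]]:
    """Return (name, value) pairs found in the last paragraph of a commit message."""
    paragraphs = [p for p in message.strip().split("\n\n") if p.strip()]
    if not paragraphs:
        return []
    trailers = []
    for line in paragraphs[-1].splitlines():
        match = TRAILER_RE.match(line.strip())
        if match:
            trailers.append((match.group(1), match.group(2).strip()))
    return trailers


def check_submission(message: str, disclosure_tag: str = "Assisted-by") -> dict[str, bool]:
    """Report whether a sign-off and a (hypothetical) disclosure trailer are present."""
    names = {name.lower() for name, _ in parse_trailers(message)}
    return {
        "signed_off": "signed-off-by" in names,
        "ai_disclosed": disclosure_tag.lower() in names,
    }


# Hypothetical commit message, invented for this example.
example = """foo: fix off-by-one in bar_alloc()

The loop bound skipped the last slot.

Assisted-by: ExampleAssistant
Signed-off-by: Jane Developer <jane@example.org>
"""

result = check_submission(example)
```

A check like this can confirm that a trailer exists. It cannot confirm that the disclosure is truthful or that the submitter understands the code, which is why the second obligation carries most of the weight.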

DCO 1.1 asks the contributor to certify one of several alternatives. The first reads: “The contribution was created in whole or in part by me and I have the right to submit it under the open source license indicated in the file.” The certification carries legal weight, although the DCO text itself names no penalty. For AI-assisted code, that language starts to strain. The US Copyright Office has taken the position, in its guidance and in registration decisions, that works generated by AI without sufficient human creative input are not eligible for copyright protection. If a contributor prompts a model, accepts its output with light review, and submits it, the copyright status of that contribution is genuinely uncertain. The contributor may not qualify as the author in any legally meaningful sense.

The DCO Problem

This matters for an open-source project that depends on copyright for its legal foundation. The GPL’s copyleft mechanism works because copyright exists in the covered code. The GPL grants downstream recipients rights conditioned on preserving those terms, and that conditioning only functions if there is a copyright holder whose exclusive rights are being licensed. If AI-generated contributions have no copyright holder because no human author exists in the required sense, the GPL’s reciprocal terms may not attach to those contributions. That is a practical gap in the copyleft chain, not merely a theoretical one.

The kernel policy’s answer is to place responsibility on the submitter’s good faith and technical judgment. Disclose that you used an AI tool, verify that you understand and stand behind the code, and sign off. That transfers moral responsibility clearly. It does not resolve the legal question. A contributor can honestly disclose Copilot usage and still not be able to answer whether the model’s training data creates copyright claims, or whether the contributor qualifies as the author of something a model generated in response to a prompt.

The kernel community introduced the DCO in 2004 as a lightweight alternative to contributor license agreements; the Linux Foundation, formed in 2007, now publishes it. Its design deliberately places the certification burden on the individual contributor through self-attestation rather than institutional process. That design is efficient and has worked well for two decades, but a self-attestation model has no mechanism for auditing copyright ownership claims. In a world where AI tools can produce plausible kernel code in seconds, that gap becomes meaningful.

How GCC Handles the Same Problem Differently

The contrast with the GCC project is worth examining. GCC requires contributors to assign copyright to the Free Software Foundation for non-trivial contributions before patches can be merged. This is more burdensome than the kernel’s DCO approach, but it surfaces the AI authorship problem more directly. You cannot assign copyright you do not own. If an AI-generated contribution has no copyright owner, there is nothing to assign, and the FSF’s standard process cannot accommodate the patch.

The FSF has stated publicly that AI outputs may not be copyrightable, and GCC maintainers have been asking contributors to confirm that submitted code is not purely AI-generated, in part for this reason. The copyright assignment requirement that many contributors have found burdensome turns out to be a cleaner mechanism for handling AI contributions, because it makes the ownership question unavoidable rather than deferred. The kernel’s lighter-weight DCO approach can sidestep the question but cannot answer it.

Quality and the Ring 0 Problem

There is a technical dimension separate from the legal one. Kernel code runs at the CPU’s highest privilege level (ring 0 on x86). A bug in user-space code might produce a wrong result or a crash; a bug in kernel memory management or locking code can produce data corruption, security vulnerabilities, or hardware damage, often under specific load or timing conditions that automated testing misses.

Greg Kroah-Hartman, who maintains the stable kernel tree and the USB and driver core subsystems, among others, has been publicly rejecting AI-generated patches since at least 2023. The pattern he documented is consistent: patches that appear to fix issues that do not exist, that introduce subtle lock ordering violations, or that follow the surface syntax of kernel idioms while misunderstanding their invariants. AI models can produce code that looks like correct RCU (Read-Copy-Update) usage, correct memory barrier placement, or correct interrupt handling while violating the underlying constraints those patterns enforce.

The RCU mechanism in particular represents the kind of subsystem where AI assistance is most likely to produce dangerous output. RCU’s correctness depends on precise understanding of which code paths can preempt which other code paths, and on correctly distinguishing between read-side and update-side critical sections. That understanding comes from deep familiarity with the subsystem, not pattern matching on existing code. Current generation models pattern-match; they do not reason about the operational semantics.

The policy’s requirement that contributors “fully understand” submitted code is pointing at this directly. It is not boilerplate. It is the kernel project’s way of saying that the review process is not a complete substitute for submitter competence, and that AI-assisted code requires the submitter to supply the understanding the model cannot.
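Lock ordering is one invariant of the kind described above: each code path can look correct in isolation while the combination can deadlock. The sketch below is a toy user-space illustration in Python, loosely in the spirit of the kernel's lockdep validator but far simpler and not derived from it. The class name, method names, and lock names are all hypothetical, and it only records acquisition order rather than taking real locks.

```python
from collections import defaultdict


class LockOrderChecker:
    """Record the order in which named locks are taken and flag inversions."""

    def __init__(self) -> None:
        # edges[a] holds every lock that was acquired while `a` was held.
        self.edges: dict[str, set[str]] = defaultdict(set)
        self.held: list[str] = []
        self.violations: list[tuple[str, str]] = []

    def _reachable(self, start: str, target: str) -> bool:
        """True if `target` was ever taken, directly or transitively, under `start`."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == target:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges[node])
        return False

    def acquire(self, name: str) -> None:
        for outer in self.held:
            # Taking `name` under `outer` is an inversion if an earlier path
            # already established the order name -> ... -> outer.
            if self._reachable(name, outer):
                self.violations.append((outer, name))
            self.edges[outer].add(name)
        self.held.append(name)

    def release(self, name: str) -> None:
        self.held.remove(name)


checker = LockOrderChecker()

# Path 1 takes A, then B.
checker.acquire("A")
checker.acquire("B")
checker.release("B")
checker.release("A")

# Path 2 takes B, then A. Neither path is wrong alone; together they can deadlock.
checker.acquire("B")
checker.acquire("A")
checker.release("A")
checker.release("B")

inversions = checker.violations  # [("B", "A")]
```

The inversion only shows up once both paths are considered together, which is the kind of whole-system knowledge a submitter is expected to bring and that reading a single function, or a single model-generated patch, does not supply.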

Enforceability and What the Policy Actually Changes

There is no technical enforcement mechanism. AI-generated code cannot be reliably detected. Statistical classifiers exist, but they produce false positives and false negatives, and the kernel project has no practical way to run them across the hundreds of subsystem mailing lists that receive patches. The policy works on the same honor system as the DCO itself.
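The cost of classifier errors can be made concrete with Bayes' rule. The sketch below computes what fraction of flagged patches would actually be AI-generated; every number in it is a hypothetical input chosen for illustration, not a measurement of any real classifier or of real kernel patch traffic.

```python
def flagged_patch_precision(
    base_rate: float, true_positive_rate: float, false_positive_rate: float
) -> float:
    """Fraction of flagged patches that really are AI-generated (Bayes' rule)."""
    true_flags = true_positive_rate * base_rate
    false_flags = false_positive_rate * (1.0 - base_rate)
    return true_flags / (true_flags + false_flags)


# HYPOTHETICAL inputs: 5% of patches are AI-generated, the classifier catches
# 90% of those, and it wrongly flags 10% of human-written patches.
precision = flagged_patch_precision(
    base_rate=0.05, true_positive_rate=0.90, false_positive_rate=0.10
)
print(f"{precision:.3f}")  # 0.321
```

Under those assumed numbers, roughly two out of three flagged patches would be human-written, so a maintainer acting on the classifier alone would mostly be accusing the wrong people.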

What the document actually changes is the social contract. Maintainers who suspected that a patch was undisclosed AI output had been rejecting such patches on informal grounds, with varying consistency across subsystems. The policy formalizes that informal norm. It gives any maintainer clear standing to ask about AI tool use, to require disclosure, and to reject patches that don’t comply. The cost of that formalization is low and the benefit, in terms of consistent treatment across the project, is real even without technical enforcement.

Where Open Source More Broadly Stands

The kernel is not alone in working through this. The Apache Software Foundation published AI contribution guidelines in 2023 recommending disclosure and careful review. OpenBSD’s core team has expressed strong skepticism about AI contributions on licensing grounds, consistent with that project’s long-standing caution about code provenance. The LLVM Foundation has discussed disclosure requirements without publishing a standalone policy document. GCC is navigating its copyright assignment tension.

The convergence across these projects is notable: none of them have resolved the copyright question, and all of them have defaulted to some version of requiring disclosure while placing responsibility on the human contributor. That convergence reflects the state of the legal landscape. Until courts, the Copyright Office, or legislators provide clearer answers about the copyright status of AI-generated code and its relationship to training data, open source projects are operating with genuine uncertainty, and disclosure is the minimum viable response to that uncertainty.

The coding-assistants.rst document is the Linux kernel project’s formal entry into that conversation. It draws a clear line around submitter responsibility at a time when that clarity is valuable. The harder questions, about copyright in AI outputs, about what the DCO certifies when a model wrote the code, about how the GPL interacts with contributions that may have no author, remain open. The policy does not close them. What it does is establish the ground rules while the answers develop.