Signed-off-by Means Something: The Linux Kernel's Policy on AI Coding Assistants
Source: lobsters
The Linux kernel’s process documentation recently gained a new file: Documentation/process/coding-assistants.rst, laying out the official position on using AI coding tools when contributing to the kernel. The headline is not a ban but a responsibility framework, and understanding why the kernel needed one at all says more about the state of AI-assisted development than the document itself does.
What the Policy Actually Says
The document does not prohibit AI tools. What it does is clarify several things that should already be obvious but apparently were not:
- If you use an AI coding assistant to generate or assist with code you submit, you must disclose that use.
- The human submitting the patch bears full legal and technical responsibility for the code. The AI cannot sign off on anything.
- The Developer Certificate of Origin (DCO), which every kernel contributor signs via the Signed-off-by line, still applies to you, the human. You are certifying that you have the right to submit the code and that you understand it.
- LLMs hallucinate. In the kernel, hallucinated APIs, incorrect locking patterns, or wrong memory model assumptions do not produce the kind of visible test failures you get in application code. They produce data corruption, kernel panics, and CVEs.
There is also a direct warning about the pattern the kernel community had been observing before this was written down: contributors using AI tools to produce plausible-looking patches at scale without actually understanding what those patches do. That is the proximate cause for the document existing.
The DCO Problem
The Developer Certificate of Origin is a lightweight legal mechanism that lets contributors certify, roughly, that the code is their original work or that they have the right to submit it under an appropriate license, and that they are doing this knowingly. The full text lives in Documentation/process/submitting-patches.rst. The kernel introduced it in 2004, after SCO-era litigation made provenance documentation important, and many other open source projects have since adopted it.
When you add Signed-off-by: Your Name <email@example.com> to a patch, you are personally certifying that lineage. This works because the assumption underlying the whole mechanism is that a human developed the code, understands it, and can vouch for it.
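To make the shape of that line concrete, here is a minimal sketch in C of a purely syntactic check for a Signed-off-by trailer. The function names is_signoff_line() and message_has_signoff() are hypothetical, and this is not how git or the kernel's own tooling validates trailers; it only assumes the "Signed-off-by: Name <address>" form quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if `line` (no trailing newline) has the form
 * "Signed-off-by: Name <local@host>". Syntax only, nothing more. */
bool is_signoff_line(const char *line)
{
    static const char prefix[] = "Signed-off-by: ";
    const size_t plen = sizeof(prefix) - 1;

    assert(line != NULL);
    if (strncmp(line, prefix, plen) != 0)
        return false;

    const char *name = line + plen;
    const char *lt = strchr(name, '<');
    if (lt == NULL || lt == name)
        return false;                 /* no address, or empty name */

    const char *at = strchr(lt, '@');
    const char *gt = strchr(lt, '>');
    if (at == NULL || gt == NULL)
        return false;

    /* Need text on both sides of '@' inside <...>, and nothing after '>'. */
    return at > lt + 1 && gt > at + 1 && gt[1] == '\0';
}

/* Hypothetical helper: true if any line of `msg` is a sign-off line. */
bool message_has_signoff(const char *msg)
{
    char line[256];

    assert(msg != NULL);
    while (*msg != '\0') {
        size_t len = strcspn(msg, "\n");
        if (len < sizeof(line)) {     /* overlong lines are skipped */
            memcpy(line, msg, len);
            line[len] = '\0';
            if (is_signoff_line(line))
                return true;
        }
        msg += len;
        if (*msg == '\n')
            msg++;
    }
    return false;
}
```

A check like this can confirm that the line is present and well formed. It cannot confirm the thing the line certifies, which is that a person stands behind the patch.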
AI-generated code can break the understanding half of that contract even when the right-to-submit half is technically satisfied. A contributor can legitimately submit AI-assisted code if they review, verify, and own it in a way that satisfies the DCO’s intent. But they cannot satisfy that intent if they did not read, understand, and verify the code before submitting it. The kernel’s policy makes this explicit: disclosure is required, and the human developer takes on full responsibility for everything in the patch.
The scope of this question extends beyond the kernel. The DCO is used by many open source projects, including Git and a large portion of the Linux ecosystem tooling. How any project interprets “I wrote this code” in an era where LLMs can plausibly draft most code a contributor would want to submit is something every significant open source project will eventually have to address directly.
Why Kernel Code Is an Unusually Poor Place to Deploy Unverified LLM Output
Most software has a reasonable tolerance for subtle bugs. Tests catch them. Type systems narrow the error space. Integration tests surface the mismatch between component behavior and system expectations. A hallucinated function signature produces a compile error, not a silent correctness bug, and the feedback loop is tight enough that problems surface quickly.
Kernel code, specifically the subsystems that most kernel patches touch, has almost none of those properties.
Consider the Read-Copy-Update (RCU) subsystem. RCU is a synchronization mechanism that lets readers proceed concurrently with updates without taking locks. Using it correctly requires knowing which objects are RCU-protected, when you need to be inside an rcu_read_lock() / rcu_read_unlock() section, when to use rcu_dereference() versus a plain pointer read, and when a grace period has elapsed so that an object removed from a data structure is safe to free. Getting any of these wrong can produce memory corruption that only manifests under specific concurrent load patterns on specific architectures, months after the patch ships in a stable release.
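The rules in that paragraph are easier to see in a small example. What follows is a deliberately naive userspace sketch in C11, not kernel code and not how the kernel implements RCU. The toy_* helpers, struct config, and the reader-counting "grace period" are all assumptions made for illustration, and the sketch assumes a single updater. Comments mark which kernel primitive each step stands in for.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct config {
    int timeout_ms;
};

/* Pointer shared between readers and the (single) updater. */
static _Atomic(struct config *) current_config;
/* Toy grace-period tracking: readers currently inside a section. */
static atomic_int active_readers;

static void toy_read_lock(void)
{
    atomic_fetch_add(&active_readers, 1);
}

static void toy_read_unlock(void)
{
    int prev = atomic_fetch_sub(&active_readers, 1);
    assert(prev > 0);                 /* catches unbalanced unlocks */
    (void)prev;
}

/* Stand-in for synchronize_rcu(): wait until no reader is in a section. */
static void toy_synchronize(void)
{
    while (atomic_load(&active_readers) != 0)
        ;                             /* spin */
}

/* Reader: load the pointer only inside the read-side section and copy
 * out what is needed before leaving it. */
int read_timeout(void)
{
    int timeout = -1;

    toy_read_lock();
    struct config *c = atomic_load(&current_config); /* rcu_dereference() role */
    if (c != NULL)
        timeout = c->timeout_ms;
    toy_read_unlock();
    /* `c` must not be used past this point: it may already be freed. */
    return timeout;
}

/* Updater: initialise fully, publish, wait for readers, then free. */
int update_timeout(int timeout_ms)
{
    struct config *new_cfg = malloc(sizeof(*new_cfg));
    if (new_cfg == NULL)
        return -1;
    new_cfg->timeout_ms = timeout_ms;

    /* rcu_assign_pointer() role: publish only after initialisation. */
    struct config *old = atomic_exchange(&current_config, new_cfg);
    toy_synchronize();                /* synchronize_rcu() role */
    free(old);                        /* free(NULL) is a no-op */
    return 0;
}
```

Each mistake the paragraph lists maps to a line here: reading the pointer outside the section, holding on to it after the unlock, or freeing the old object before the wait completes. None of these would be a compile error, and all of them would pass a single-threaded test.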
An LLM trained on historical kernel source will have seen substantial RCU usage, but RCU’s API has evolved significantly across major versions. Patterns that were correct in 2.6.x-era code are wrong or deprecated in 6.x-era code. A model generates superficially plausible RCU code, the code looks correct to a reviewer who is not an RCU expert, it passes the static analysis tools available without running the kernel, and it introduces a use-after-free on a slow path that triggers under specific production load conditions. That is a realistic failure mode.
The same analysis applies to spinlock and interrupt context interactions, the constraints around sleeping in atomic context, the memory ordering semantics of READ_ONCE() and WRITE_ONCE(), and architecture-specific barriers. Tools like KASAN and KFENCE can catch some of this in testing, but only if the test workload exercises the relevant code paths, which is not guaranteed for edge-case concurrency bugs.
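As an illustration of why the ordering question is easy to get wrong, here is a minimal message-passing sketch in userspace C11. It is not kernel code: struct mailbox and both functions are hypothetical, and the C11 release and acquire operations are only rough analogues of the kernel's smp_store_release() and smp_load_acquire(), just as relaxed atomics are only a rough analogue of READ_ONCE() and WRITE_ONCE().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox: a payload plus a "ready" flag. */
struct mailbox {
    int payload;
    atomic_bool ready;
};

/* Writer: fill in the payload, then set the flag with release ordering
 * so the payload write cannot be reordered after the flag write. */
void mailbox_publish(struct mailbox *m, int value)
{
    assert(m != NULL);
    m->payload = value;
    atomic_store_explicit(&m->ready, true, memory_order_release);
    /* A relaxed store here would still be a single untorn write, but it
     * would not order the payload before the flag for another CPU. */
}

/* Reader: check the flag with acquire ordering before touching the
 * payload. Returns false if nothing has been published yet. */
bool mailbox_try_read(struct mailbox *m, int *out)
{
    assert(m != NULL && out != NULL);
    if (!atomic_load_explicit(&m->ready, memory_order_acquire))
        return false;
    *out = m->payload;
    return true;
}
```

The two versions of the writer differ by one argument and behave identically in a single-threaded test, which is exactly the kind of difference a plausible-looking generated patch can get wrong without anyone noticing.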
Greg Kroah-Hartman, the maintainer of the kernel’s stable tree and one of its most active reviewers, documented the practical problem before the formal policy existed. On the Linux Kernel Mailing List and in public posts, he described receiving batches of patches that appeared to be LLM-generated, containing hallucinated function names, wrong include paths, and technically incorrect fixes for problems that were not actual bugs. His approach was to reject them and ask submitters to demonstrate that they understood the code they had submitted. The policy document is the institutionalization of that already-established maintainer behavior.
The Broader Open Source Landscape
The Linux kernel’s approach lands in the middle of a range that other major projects are navigating.
PostgreSQL’s development culture enforces quality through extensive review and a relatively small core contributor group with deep codebase knowledge. AI-generated patches tend to fail review for the same reason poorly-understood patches always fail: reviewers notice when submitters cannot defend their design choices. The project has not issued a formal policy, because its existing quality bar handles the problem implicitly.
LLVM’s community works similarly. The codebase is large and internally consistent enough that patches not fitting the existing architecture are immediately visible to reviewers. No formal AI policy exists, but the review standards function as one in practice.
OpenSSL sits in a more difficult position. Security-critical cryptographic code shares the kernel’s property that plausible but subtly wrong implementations are far more dangerous than obviously wrong ones, because they ship, get audited as correct, and get trusted in production. The project has expressed caution about AI-generated contributions without a formal published policy, relying instead on reviewer judgment.
What distinguishes the Linux kernel’s approach is the choice to write the policy down explicitly rather than relying on the existing quality bar to handle it implicitly. This fits the kernel’s broader documentation culture: its process documentation is unusually thorough compared to most open source projects, covering patch structure, subsystem-specific requirements, and maintainer communication norms in considerable detail. Adding AI tool guidance to that corpus follows the existing pattern of making implicit expectations explicit and discoverable for new contributors.
What This Policy Does Not Resolve
The policy is unenforceable at the technical level. No reliable automated method exists for detecting AI-generated code. Disclosure depends entirely on submitters being honest, and the contributors most likely to cause problems are precisely those using AI to produce patches for credential-building rather than to solve problems they actually care about.
Enforcement is social rather than technical, which was true before the policy was written. Maintainers can identify patches whose submitters do not understand what they submitted, request explanation, and reject on those grounds. That mechanism was already functioning before this document existed. What the policy adds is clear written documentation for maintainers to reference when explaining rejections, and an unambiguous signal to good-faith contributors about what is expected of them.
There is also the copyright question, which the policy acknowledges without fully resolving. The legal status of AI-generated code remains unsettled. The kernel’s practical position is that the human submitter bears responsibility regardless of how that question is eventually resolved, which is the right pragmatic approach given that legal clarity will arrive on a much longer timescale than patches need to be reviewed.
The Frame That Makes Sense Here
The policy carries no anti-AI premise; it is a clarification that the kernel’s existing accountability model extends to AI-assisted contributions in exactly the same way it extends to any other kind. If you use an LLM to help understand an unfamiliar subsystem, generate structural boilerplate, or sketch a first draft of a fix, then read the output carefully against the current source tree, verify it, test it, understand it, and submit it with a proper explanation of what it does and why it is correct, the policy has nothing to say against that workflow.
What it rules out is treating AI output as a substitute for understanding. In kernel development, that is not a stylistic preference or a philosophical position about AI. It is a correctness requirement grounded in how the kernel is developed and what happens when subtle errors make it into a stable release. The Signed-off-by line exists because someone needs to be accountable for what goes into the kernel, and that accountability requires a human being who knows what the code does.
The document formalizes something the kernel’s maintainers were already enforcing through case-by-case rejection. That formalization matters because written policy is clearer than informal practice, fairer to new contributors who do not yet know the unwritten rules, and more useful as a reference point when the community needs to discuss how the expectations should evolve as the tools themselves change.