The survey that Niko Matsakis published in late February, summarizing perspectives from across the Rust project on AI tools, covers familiar ground on the surface: some teams find AI useful, others don’t, the borrow checker makes AI-generated Rust unreliable, unsafe code is a specific concern. What it doesn’t say explicitly, but what emerges from reading between the lines, is that Rust occupies a genuinely unusual position in the AI-assisted development landscape. Most discussion focuses on AI’s weaknesses with Rust. The more interesting question is what Rust’s friction with AI means for how people learn to program.
The Skill Atrophy Problem in Other Languages
Over the past few years, the standard model of AI-assisted development has converged on something like this: describe what you want, the model produces a working draft, you verify and iterate. For languages with weak static guarantees, this loop works surprisingly well. An LLM-generated Python function that does approximately the right thing will often run, pass tests, and ship. The developer’s job becomes validation rather than construction.
The concern this raises is not hypothetical. Developers who learned Python or JavaScript primarily through AI-assisted workflows often have significant gaps in their understanding of the language’s actual semantics. They know how to prompt effectively. They struggle to debug outputs that don’t fit the pattern of things models typically produce. The 2024 Stack Overflow Developer Survey showed growing AI tool use alongside rising reports that developers feel less confident solving problems without them. Whether these trends are causally related is contested, but the direction is plausible: if you outsource the hard parts consistently enough, you stop developing the ability to do the hard parts yourself.
This is the context in which the Rust survey’s findings become interesting.
Why the Borrow Checker Cannot Be Bypassed
The distinguishing feature of AI-generated Rust is not that it fails often. It is how it fails. When an LLM produces Python with a logic error, that error can be subtle and often requires running the code to surface. When an LLM produces Rust that violates ownership or lifetime rules, the compiler refuses it immediately with a structured diagnostic. The failure is legible, explicit, and local.
This matters for skill development because fixing a borrow check error requires understanding what went wrong. Consider a common pattern:
    fn first_even(items: &[i32]) -> Option<&i32> {
        for item in items {
            if item % 2 == 0 {
                return Some(item);
            }
        }
        None
    }
Straightforward. But when you combine iteration with mutation, or return references through closures, or compose futures across async boundaries, the borrow checker generates errors that cannot be resolved by pattern recognition alone. You have to reason about what owns what and when.
    // AI-generated code that fails to compile
    let mut results = vec![];
    let source = vec![1, 2, 3, 4];
    let filtered = source.iter().filter(|&&x| {
        results.push(x); // the closure borrows `results` mutably for as long as it exists
        x % 2 == 0
    });
    println!("{:?}", results); // error[E0502]: cannot borrow `results` as immutable
    let evens: Vec<_> = filtered.collect(); // because the closure held by `filtered` still holds the mutable borrow
The fix requires understanding why the closure’s mutable capture of results, which stays alive as long as the iterator holding the closure does, conflicts with any other use of results: an aliasing problem. An LLM can sometimes produce the fix. It cannot teach you to see why the original was wrong. When it fails in a single shot, you are left with a compiler error that you have to reason through yourself.
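One way to resolve it, sketched here with invented names (collect_evens, seen, neither from the article), is to consume the iterator in a single expression, so the closure’s mutable borrow ends before the vector is read again:

```rust
// A sketch of one fix: drive the iterator to completion in one
// expression, so the closure's mutable borrow of `seen` is released
// before `seen` is used again.
fn collect_evens(source: &[i32]) -> (Vec<i32>, Vec<i32>) {
    let mut seen = vec![];
    let evens: Vec<i32> = source
        .iter()
        .filter(|&&x| {
            seen.push(x); // mutable borrow lives only inside this chain
            x % 2 == 0
        })
        .copied()
        .collect(); // the closure is dropped here, ending the borrow
    (seen, evens) // `seen` is freely usable again: no overlapping borrows
}
```

The point is not the specific restructuring but the reasoning behind it: the borrow’s lifetime had to be made shorter than every other use of the vector.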
This is where the survey’s implicit lesson lives. The contributors who reported finding AI tools genuinely useful for Rust were not using them as a primary construction mechanism. They were using them for well-bounded tasks: generating trait implementations for Display or From, producing iterator chains over known types, scaffolding error type hierarchies, tasks where the compiler can validate output mechanically. For anything involving non-trivial lifetimes, async runtimes, FFI boundaries, or unsafe blocks, the consensus was that AI tools cost more in review overhead than they save in writing time.
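As an illustration of that first category, here is the sort of mechanically checkable trait scaffolding involved; ParseConfigError is an invented example type, not something from the survey:

```rust
use std::fmt;

// A hypothetical error type: the kind of Display/From boilerplate that
// contributors reported delegating to AI tools, because the compiler
// validates the entire result mechanically.
#[derive(Debug)]
struct ParseConfigError {
    line: usize,
    message: String,
}

impl fmt::Display for ParseConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "config parse error at line {}: {}", self.line, self.message)
    }
}

impl From<std::num::ParseIntError> for ParseConfigError {
    fn from(e: std::num::ParseIntError) -> Self {
        // Line number unknown at conversion time; 0 is a placeholder.
        ParseConfigError { line: 0, message: e.to_string() }
    }
}
```

If a model gets any of this wrong, the mistake surfaces as a type error, not a latent bug, which is exactly why the review overhead stays low.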
The Training Data and Polonius Factors
There is also a practical issue with training data. Rust has significantly less public code in LLM training corpora than Python, JavaScript, or C++. Idiomatic Rust patterns, particularly around non-lexical lifetimes, the ? operator, Pin and async combinators, and Deref coercion, are learned from exposure to production codebases, not derived from first principles. Empirical benchmarks on code generation consistently show LLMs scoring lower on Rust than on Python or TypeScript: the model’s first attempt fails to compile more often, and the iterations required to reach valid output are more demanding.
This gap is likely to widen before it narrows, because Rust itself is in the middle of a significant evolution. Polonius, the next-generation borrow-checker formulation, originally prototyped in Datalog, reasons about the loans a reference may carry rather than inferring lexical regions, and is expected to accept a strictly larger class of valid programs. Code that LLMs generate today and that the existing borrow checker rejects may, under Polonius, actually be valid. As the Inside Rust blog noted in 2023, this work has been in progress for years precisely because getting the semantics right is harder than producing something that mostly works. Models trained on current Rust will need to update their priors once Polonius ships in stable form. The language is moving under the models at the same time that the models are improving generally.
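The shape of that gap can be made concrete. A classic pattern the current borrow checker rejects, often cited as a motivating case for Polonius, is a conditional borrow returned from one branch. The sketch below (get_default is an illustrative name) shows the double-lookup workaround that compiles today; the natural single-lookup version, shown in the comment, is rejected even though it is semantically sound:

```rust
use std::collections::HashMap;

// Illustrative sketch. The natural version of this function:
//     if let Some(v) = map.get(&key) { return v; }
//     map.insert(key, String::from("default"));
//     map.get(&key).unwrap()
// is rejected by today's borrow checker, which extends the shared
// borrow from the first `get` over the `insert`. Polonius is expected
// to accept it. What compiles today is this double-lookup workaround:
fn get_default(map: &mut HashMap<u32, String>, key: u32) -> &String {
    if !map.contains_key(&key) {
        map.insert(key, String::from("default")); // extra lookup to satisfy the checker
    }
    map.get(&key).unwrap()
}
```

(In real code the entry API, map.entry(key).or_insert_with(...), avoids the double lookup entirely; the point here is the borrow-checker limitation, not HashMap idiom.)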
What This Means for Learning Rust Now
The practical advice for developers who want to use AI tools while learning Rust is probably the opposite of what you would do with Python. With Python, leaning on AI for first drafts and iterating to correctness is a reasonable way to get productive quickly. With Rust, that workflow tends to produce an understanding gap that costs time later. You can get AI-generated code to compile through enough iteration, and still not know why it compiles, which means you will recreate the same problems in slightly different forms.
A better approach: use AI tools to explore the standard library, understand crate APIs, and generate trait scaffolding, but write the logic that touches ownership and borrowing yourself. The rust-analyzer LSP already provides rich semantic context that AI tools can leverage for more accurate completions than raw text prediction. The rustc error index remains one of the best error explanation resources in any language. The compiler’s structured diagnostics turn what would be a runtime debugging problem in most languages into a conversation with a precise static analyzer.
The Rust project’s survey, in mapping where AI tools succeed and fail against the language, inadvertently produced a map of where understanding is and is not required. For the parts where AI tools reliably work, understanding is optional. For the parts where they reliably fail, understanding is not. That map is also a curriculum.
Whether this makes Rust a good investment of learning time during a period when AI is eroding the return on understanding hard things is a separate question. The survey does not answer it. But Rust is the language where the question is most clearly posed, and the Rust community’s response, gathering structured data from the people closest to the language rather than issuing a verdict, is itself a model worth noticing.