There’s a specific kind of problem that doesn’t announce itself. It accumulates over months of reasonable, productive decisions. A developer starts using an AI coding assistant for boilerplate, then for understanding unfamiliar libraries, then for debugging, then for architectural choices. At no single point does something break catastrophically. The code compiles, reviews pass, and deployments succeed. The drift is comfortable precisely because nothing in the short term signals that anything has gone wrong.
This is the central argument in a recent post on Ergosphere that has been generating significant discussion in engineering circles. The author isn’t claiming that AI produces unreliable code or that the tools themselves are the root problem. The concern is what happens to the engineers who use them: a gradual, unnoticed change that continues until something forces the question.
The Mechanism Isn’t New
Automation-induced complacency is well-documented outside of software, most thoroughly in aviation. Air France 447, the 2009 crash that killed 228 people, is the case most often cited. According to the BEA investigation, the autopilot disengaged after iced-over pitot tubes produced unreliable airspeed readings, and the crew’s manual inputs then put the aircraft into a stall that they never recognized or recovered from. Stall recognition is among the first things any pilot is taught, and the report cited, among other factors, a lack of training in manual handling at high altitude. Years of routine reliance on automation had left the crew without the fluent, practiced response that the situation demanded.
The pattern is worth understanding precisely: competence persists as knowledge but degrades as reflex. You retain the ability to describe what you should do long after you lose the ability to do it quickly enough, or confidently enough, when the system stops guiding you.
The FAA has studied this for decades and has issued guidance on automation dependency and manual flying proficiency, including safety alerts that encourage operators to have pilots hand-fly regularly. The industry response has been to build periodic manual flight operations into training and line flying, specifically to maintain capabilities that automation otherwise erodes. There is no equivalent practice in software development. There is no requirement to solve problems without AI assistance, no proficiency check, and almost no feedback mechanism to detect the degradation as it happens.
The Progression in Practice
The drift in software development tends to follow a recognizable sequence. First, AI handles the tedious parts: boilerplate, test skeletons, repetitive handlers for known patterns. This stage is largely healthy. The developer understands the output and is genuinely saving time on work that required no deep thought anyway.
The second stage is more complicated. The developer uses AI to understand unfamiliar code, to explain what an API does, to suggest an approach to a problem they haven’t encountered before. This is still defensible as tool use, but something material has shifted: the developer is no longer building understanding so much as renting it for the duration of the task. Once the work is done, the understanding may not stick. The AI explained it, the feature shipped, and the next time a similar problem appears, the same transaction repeats.
By the third stage, the developer is generating code they could not have written independently and cannot meaningfully evaluate without asking the AI to evaluate it. The output looks correct, tests pass, and the PR gets merged. The developer’s ability to reason about correctness, edge cases, performance characteristics, or failure modes has quietly atrophied while productivity metrics remained strong or improved.
The core of the Ergosphere argument sits here: the problem is not that the AI is wrong. The problem is that the developer has lost the capacity to recognize when it’s wrong.
Why Software Is Particularly Exposed
Software is unusually vulnerable to this kind of drift because its complexity is almost entirely invisible. When a pilot stops flying manually, the consequences are observable: rougher corrections, slower responses, degraded technique during line checks. When a developer stops reasoning deeply, the consequences may not surface for months or years, buried in the first serious production incident, the debugging session that takes days instead of hours, or the refactor that nobody on the team feels confident executing.
The 2011 study by Sparrow, Liu, and Wegner, published in Science and grounded in research on transactive memory, found that people who expect to be able to look information up later are less likely to encode it. They remember where to find the answer rather than the answer itself. Google was already producing this effect before AI coding tools existed. What AI assistants add is a much richer version of the same substitution, one that extends from factual recall into complex reasoning, design judgment, and debugging strategy.
Code review is the mechanism most teams rely on to catch shallow understanding, and it is increasingly happening with AI assistance on both sides. If the author generated the code and the reviewer uses AI to evaluate it, the depth of human understanding in the loop may be considerably thinner than either party has stopped to consider.
Andrej Karpathy introduced the term “vibe coding” in early 2025 to describe accepting AI-generated output without close reading. For certain contexts, rapid prototyping among them, the productivity case is genuine. The concern from senior engineers was immediate precisely because they recognized the trajectory: vibe coding as a time-boxed experiment is one thing, but as a habitual mode extending into production systems it leads somewhere specific, and that somewhere is a team that cannot explain what its own codebase does.
What Distinguishes Tool Use from Understanding Replacement
There is a real distinction between tools that amplify understanding and tools that substitute for it. A compiler error that teaches you something about a type system is amplifying your model of the language. A linter that flags a mistake you were about to make does the same. A profiler that shows you where time is being spent makes your mental model of the program more accurate.
An AI that generates code you cannot evaluate is doing something different. It is not helping you reason better; it is removing the need to reason in this particular case. Whether that accumulates into genuine skill degradation depends on how frequently you exercise independent reasoning and whether you have any mechanism for noticing that the frequency has been quietly declining.
The aviation parallel is instructive because it shows this problem has been addressed before, imperfectly but credibly, through deliberate practice requirements. The solution was not to reject automation; it was to keep the underlying skills exercised alongside it, on a schedule, with explicit intent.
Holding the Line
The practical response is to keep using AI tools while treating independent reasoning as something that requires active maintenance. Engineers who reject the tools outright are not building deeper understanding by doing so; they are mostly doing slower work. The question is whether the capacity to evaluate AI output critically, to catch the cases where it is confidently and plausibly wrong, is being exercised enough to remain intact.
One useful threshold: if you cannot assess whether the AI’s output is correct without asking the AI to assess it, that gap deserves attention. Write things from scratch periodically. Debug without assistance first. Read the source of libraries you depend on rather than an AI summary of them. Work through at least some problems where you have to construct the understanding yourself, not because the AI couldn’t do it, but because the construction is the point.
These are not acts of nostalgia for a harder workflow. They are maintenance of the capacity that makes you useful when the stakes are higher than the daily task queue: when the production system is broken at 2 a.m., the AI is generating plausible-sounding but wrong hypotheses, and you need to be able to tell the difference.
The drift is comfortable because productivity signals remain green while understanding signals are mostly invisible. The Ergosphere post is worth reading not because it offers a solution but because it names the shape of the problem clearly, and naming it is what makes it possible to notice.