The dead internet theory first circulated on obscure forums around 2021. The core claim was that most of what you encounter online (the content, the engagement, the profiles sharing it) is generated by automated systems rather than real people. When it emerged, it was easy to dismiss as paranoid pattern-matching from people who’d spent too long in comment sections. Three years later, the evidence has piled up to the point where dismissal requires its own motivated reasoning.
The article on adriankrebs.ch that recently surfaced on Hacker News puts a fine point on this transition. As someone who builds bots professionally, I find that the story it tells matches exactly what I see from the inside.
The Numbers
Imperva’s annual Bad Bot Report, which aggregates traffic data across thousands of websites and applications, found that automated traffic had crossed the 49% mark of all internet traffic, with malicious bots alone accounting for roughly 32% of total requests. That is not a rounding error. For certain verticals like gaming and financial services, bad bot traffic exceeds legitimate human traffic outright.
These are not contested figures. Cloudflare publishes similar findings through its Radar traffic reports, and Akamai’s State of the Internet series has tracked bot traffic growth for years. The story is consistent: the automated fraction of internet traffic has grown steadily and accelerated sharply post-2022.
The consensus across major CDN and security vendors is that automated traffic has been a majority, or near-majority, of total internet traffic for several years. The question is no longer whether the bots are there. It is what they are doing, and why the economics keep working in their favor.
The LLM Acceleration
Before late 2022, generating convincing human-sounding text at scale required either a large team of low-wage workers or fairly sophisticated ML infrastructure that most operators could not afford. ChatGPT changed that arithmetic completely. The cost of generating a coherent, topically relevant article dropped from dollars to fractions of a cent per piece.
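To make that arithmetic concrete, here is a back-of-the-envelope sketch. Every token count and per-token price below is an illustrative assumption, not a quoted rate from any provider:

```python
# Back-of-the-envelope cost of one AI-generated article.
# All figures are illustrative assumptions, not quoted prices.
PROMPT_TOKENS = 200         # short instruction prompt
OUTPUT_TOKENS = 1_200       # roughly a 900-word article
PRICE_IN_PER_MTOK = 0.25    # assumed $ per 1M input tokens, small model
PRICE_OUT_PER_MTOK = 1.25   # assumed $ per 1M output tokens

cost = (PROMPT_TOKENS * PRICE_IN_PER_MTOK
        + OUTPUT_TOKENS * PRICE_OUT_PER_MTOK) / 1_000_000
print(f"${cost:.5f} per article")  # prints $0.00155 per article
```

Even if the assumed prices are off by an order of magnitude, the per-article cost stays well under a cent, which is the whole point: the marginal cost of publishing approaches zero.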
NewsGuard, which tracks misinformation and low-quality news sites, documented over 900 AI-generated news sites by mid-2023, up from nearly none a year prior. Many had professional layouts, consistent publishing schedules, and articles that passed a casual reading without triggering obvious flags. Some were generating thousands of posts per month.
Google’s response was the Helpful Content Update, deployed in multiple waves through 2023 and 2024. The results have been mixed. Many AI content farms took short-term ranking hits before adapting. Some operators started mixing human-written content with AI-generated material. Others found that targeting less competitive long-tail queries, where ranking signals are weaker, kept their operations profitable regardless of the algorithm changes.
The underlying economics never changed. When content generation is cheap enough and the upside from ad revenue or affiliate commissions is large enough, the rational move is to keep publishing.
The Mechanics, From Someone Who Builds This
I build Discord bots, not content farms or engagement operations. But the process of making software that behaves like a person online is something I understand at a technical level, and the infrastructure for synthetic presence has become shockingly accessible.
For basic web automation, Playwright and Puppeteer let you drive a full browser with a few dozen lines of code. For evading behavioral detection, libraries like playwright-extra with puppeteer-extra-plugin-stealth mask the fingerprints that platforms use to identify automated clients: things like the webdriver property being set, missing browser plugins, inconsistent canvas rendering. Residential proxy networks, services like BrightData or Oxylabs, rent out IP addresses belonging to real consumer devices, so requests appear to originate from genuine users in real geographic locations.
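As a toy illustration of what those fingerprint signals look like from the detection side, a scoring heuristic might flag exactly the properties the stealth plugins patch. The weights and threshold here are invented for illustration; real systems combine hundreds of signals with ML models:

```python
# Toy bot-detection heuristic over browser fingerprint signals.
# Weights and threshold are invented; this is not any platform's logic.

def looks_automated(fingerprint: dict) -> bool:
    score = 0
    # Vanilla headless Chrome exposes navigator.webdriver = true;
    # stealth plugins delete or spoof the property.
    if fingerprint.get("webdriver"):
        score += 3
    # Real browsers report installed plugins; bare headless often reports none.
    if fingerprint.get("plugin_count", 0) == 0:
        score += 2
    # Canvas output inconsistent with the claimed OS/GPU is suspicious.
    if not fingerprint.get("canvas_consistent", True):
        score += 2
    return score >= 3

print(looks_automated({"webdriver": True, "plugin_count": 0}))  # True
print(looks_automated({"webdriver": False, "plugin_count": 5,
                       "canvas_consistent": True}))             # False
```

The stealth plugins win this particular game by making the second fingerprint out of the first, which is why detection has shifted toward behavioral and network-level signals.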
A basic engagement operation in 2025 might look something like this:
from playwright.async_api import async_playwright
import anthropic

async def generate_engagement(post_url: str, persona_context: str):
    client = anthropic.Anthropic()

    # Generate contextually appropriate comment at near-zero cost
    response = client.messages.create(
        model="claude-haiku-4-5-20251001",
        max_tokens=150,
        messages=[{
            "role": "user",
            "content": (
                f"Write a natural, conversational comment for a social media post. "
                f"Persona: {persona_context}. Keep it under 100 words."
            ),
        }],
    )
    comment_text = response.content[0].text

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        context = await browser.new_context(
            user_agent="Mozilla/5.0 ...",  # realistic UA
            # route through residential proxy here
        )
        page = await context.new_page()
        await page.goto(post_url)
        # locate comment box, type comment, submit
        await browser.close()

    return comment_text
The cost of running something like this at scale, including LLM API calls at sub-cent rates, residential proxy bandwidth, and hosting, is low enough that the economics work for anyone trying to manipulate rankings, build perceived social proof, or generate ad revenue through traffic. I am not describing exotic tradecraft here. These techniques are documented across SEO forums, GitHub repositories, and mainstream marketing automation tools. The barrier to entry is a weekend of reading and a credit card.
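The same back-of-the-envelope exercise works for the engagement loop itself. Every rate below is an assumption for illustration, not a vendor quote:

```python
# Rough cost of an engagement operation, per 10,000 comments.
# All rates are illustrative assumptions, not vendor quotes.
COMMENTS = 10_000
LLM_COST_PER_COMMENT = 0.0005   # assumed sub-cent API call
PROXY_MB_PER_COMMENT = 2        # assumed page weight fetched via proxy
PROXY_COST_PER_GB = 8.00        # assumed residential proxy rate
HOSTING = 20.00                 # assumed small VPS, flat monthly

llm = COMMENTS * LLM_COST_PER_COMMENT
proxy = COMMENTS * PROXY_MB_PER_COMMENT / 1024 * PROXY_COST_PER_GB
total = llm + proxy + HOSTING
print(f"LLM ${llm:.2f} + proxy ${proxy:.2f} + hosting ${HOSTING:.2f}"
      f" = ${total:.2f} for {COMMENTS:,} comments")
```

Under these assumptions the proxy bandwidth, not the LLM, dominates the bill, and the total still lands around two cents per comment. That is cheap enough that even marginal returns on manufactured social proof cover the spend.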
Why Platforms Have Not Solved This
Platform incentives are structurally awkward on this topic. Engagement metrics (likes, comments, shares, follower counts) are what platforms sell to advertisers. Aggressive bot removal deflates those numbers. Multiple researchers have documented cases where platform transparency reports appear to undercount coordinated inauthentic behavior, though attributing deliberate motive rather than detection limitations is genuinely difficult.
There is also a real detection problem. Behavioral analysis catches unsophisticated bots: things that click too fast, never scroll, post at inhuman intervals. Modern operations are designed around human behavioral distributions. Headless browser fingerprinting can be defeated with the right plugins. Rate limiting can be gamed by spreading activity across hundreds of accounts and residential IPs. CAPTCHA solving services, which combine ML models with human solvers, handle most remaining friction at around two dollars per thousand solves.
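To illustrate what designing around human behavioral distributions means in practice, an operator can draw inter-action delays from a heavy-tailed distribution instead of sleeping a fixed interval. The log-normal parameters here are invented for illustration, not measured from real user data:

```python
import random

# Human inter-action gaps are roughly log-normal: mostly a few seconds,
# with a long tail of pauses. Parameters are illustrative assumptions.
def human_delay(rng: random.Random, mu: float = 1.0, sigma: float = 0.6) -> float:
    return rng.lognormvariate(mu, sigma)

rng = random.Random(42)
delays = [human_delay(rng) for _ in range(5)]
# Irregular gaps, unlike a naive bot sleeping exactly 2.0s between actions
print([round(d, 2) for d in delays])
```

A fixed 2.0-second interval is a trivially detectable spike in the timing histogram; jittered, heavy-tailed delays blend into the human baseline, which is why interval-based rate heuristics alone stopped being sufficient.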
Meta’s Community Standards Enforcement Reports show hundreds of millions of fake account removals per quarter. Removal counts measure what got caught, not the overall population of synthetic accounts. The Stanford Internet Observatory has published detailed analyses of coordinated inauthentic behavior campaigns that ran undetected for months or years before removal.
The Engagement Farm Model
One specific mechanism worth understanding is the engagement farm. These are operations that combine automation with human oversight: a small team manages thousands of accounts using automation tools, intervening manually when a human touch is needed to pass detection. The Stanford Internet Observatory documented these in political contexts at scale, but the same infrastructure serves commercial purposes, inflating product reviews, boosting creator metrics, manufacturing social proof for new services.
The dead internet theory, in its original form, sometimes implied a conspiracy: a deliberate effort to hollow out genuine human connection for some shadowy coordinated purpose. The reality is more prosaic and in some ways more depressing. It is the aggregate outcome of thousands of independent economic actors, each optimizing for some local objective (ad revenue, political influence, product visibility), using tools that have gotten cheaper and more effective every year. No single actor planned the synthetic web. It emerged from the combined pressure of everyone playing the same optimization game.
Where Genuine Interaction Has Retreated
Closed platforms and access-gated spaces seem to have held up better than the open web. Discord servers, private forums, group chats, and email newsletters are the places where synthetic presence is less economical, because the surface area is smaller and the signal-to-noise ratio matters more to participants. The public web, particularly the content-ranked-for-search portion of it, has gotten measurably worse by most qualitative assessments.
This is not an entirely new dynamic. The early web had its share of link farms, spam blogs, and keyword-stuffed doorway pages. What has changed is the quality floor: generating content that is indistinguishable from competent human writing used to be hard. It is not hard anymore.
The adriankrebs.ch piece frames this as a reckoning, which feels accurate. The dead internet theory was never really a claim that bots existed; those have been around for decades. It was a claim about ratios: that automated synthetic activity had crossed a threshold where the underlying assumption of the internet as primarily a space for human exchange was no longer valid. Whether that threshold is here or a few percentage points away, the operational difference is minimal.
The tools to build synthetic internet presence are mature, cheap, and widely documented. The incentives to use them are structural, not incidental, which means they are not going away. Understanding the mechanics is at least a start toward thinking clearly about what the internet is now, versus what it was assumed to be.