When someone runs a proper audit of a major news site and the page weight comes in at 49MB, the number sounds wrong at first. A single web page should not weigh more than a CD-ROM track. But after spending time with browser devtools open on news sites, the number stops being surprising and starts being instructive.
The audit over at thatshubham.com breaks down exactly what’s loading on modern news pages, and the findings are a useful map of how the commercial web has evolved since ad networks became the primary business model for online publishing.
Where the Weight Comes From
A 49MB page is not mostly articles. The editorial content of a news story, even with images, might be 500KB on a good day. The rest is infrastructure for a business model built on surveillance and attention capture.
The typical breakdown on a heavy news site looks something like this:
- JavaScript: 5-10MB of script, split between first-party app code and dozens of third-party vendors
- Video: Autoplay prerolls and hero videos, often loading regardless of whether the user scrolls to them
- Images: Unoptimized hero images, ad creatives, and tracking pixels
- Fonts: Multiple web font families loaded for brand consistency
- Third-party scripts: Analytics, A/B testing platforms, consent managers, and ad tech of every variety
The third-party category is where things get instructive. A large news site might load scripts from 60 to 100 distinct third-party origins. Each of those is a separate DNS lookup, TCP handshake, and TLS negotiation, plus the actual payload. Many of them chain to additional scripts. You load a consent management platform, which loads the ad stack, which loads the data management platform, which loads retargeting pixels for six different ad exchanges.
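The browser exposes enough data to count those origins yourself via the Resource Timing API. A minimal sketch, assuming a helper that takes the entries array so it runs anywhere; in a real browser console you would pass `performance.getEntriesByType('resource')` and the page's own hostname (the entries and domain names below are mocked for illustration):

```javascript
// Count distinct third-party origins from Resource Timing-style entries.
// Only the { name } field (the resource URL) is relied on here.
function thirdPartyOrigins(entries, firstPartyHost) {
  const origins = new Set();
  for (const { name } of entries) {
    try {
      const host = new URL(name).hostname;
      // Treat the site's own host and its subdomains as first-party.
      if (host !== firstPartyHost && !host.endsWith('.' + firstPartyHost)) {
        origins.add(host);
      }
    } catch {
      // Skip entries that are not parseable URLs.
    }
  }
  return origins;
}

// Mock entries standing in for a real page load:
const entries = [
  { name: 'https://example-news.com/article.html' },
  { name: 'https://cdn.example-news.com/app.js' },
  { name: 'https://securepubads.g.doubleclick.net/tag/js/gpt.js' },
  { name: 'https://cdn.adsafeprotected.com/iasPET.1.js' },
];
console.log(thirdPartyOrigins(entries, 'example-news.com').size); // 2
```

On a heavy news page, the same one-liner against the live `performance` entries is often the quickest way to confirm the 60-to-100-origin figure.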
The Ad Tech Stack Is Its Own Runtime
Modern programmatic advertising is not a simple banner system. It is a distributed real-time auction infrastructure that gets embedded directly into the browser for every page load.
A typical header bidding setup on a news site might look like this:
// Prebid.js configuration excerpt (simplified)
pbjs.addAdUnits([{
  code: 'div-gpt-ad-hero',
  mediaTypes: { banner: { sizes: [[728, 90], [970, 250]] } },
  bids: [
    { bidder: 'appnexus', params: { placementId: '12345' } },
    { bidder: 'rubicon', params: { accountId: '1234', siteId: '5678', zoneId: '9012' } },
    { bidder: 'openx', params: { unit: '123456', delDomain: 'pub.openx.net' } },
    { bidder: 'ix', params: { siteId: '12345', size: [728, 90] } },
    // ... 15 more bidders
  ]
}]);
Prebid.js itself is a legitimate open-source project for managing header bidding, and the library runs around 300KB minified. But each bidder adapter adds weight, and large publishers configure 20 or more. Before a single ad renders, the browser has made dozens of requests to ad servers, data brokers, and measurement vendors, each of which may set cookies or fingerprint the device.
The Google Publisher Tag (GPT) is nearly always present as well, adding another layer of requests to Google’s ad infrastructure. Because these systems need to coordinate, they often block rendering or delay interactive content while the auction negotiates.
This Is Not New, But It Has Gotten Worse
Maciej Cegłowski gave a talk in 2015 called The Website Obesity Crisis that holds up well. At the time, the median page weight hovered around 2MB. His argument was that the web was getting fat not because of richer content but because of accumulated cruft, tracking scripts, and cargo-culted dependencies.
A decade later, the median desktop page weight according to the HTTP Archive sits above 2.5MB, with news and media sites consistently at the heavy end of the distribution. The Web Almanac that HTTP Archive publishes annually shows third-party content accounting for a growing share of that weight, separate from first-party editorial content.
The trajectory follows the economics. Each additional tracker represents a revenue stream or a contractual obligation. Publishers sign deals with data brokers to share audience segments. They integrate ad verification vendors because advertisers demand it. They add consent platforms to comply with GDPR, then load all the previously blocked scripts once consent is obtained, often through deliberately confusing UI flows. Every layer adds weight and latency.
What the Browser Actually Experiences
Forty-nine megabytes of transferred data understates the processing cost. Compressed JavaScript that transfers as 5MB might decompress to 15-20MB of source text, which the engine must then parse, compile, and execute. Mobile CPUs, which are what the majority of the world uses to read news, handle that parsing serially on the main thread.
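The wire-versus-parse gap is also measurable per resource: Resource Timing entries carry both `transferSize` (bytes over the network) and `decodedBodySize` (bytes after decompression). A sketch that aggregates them, with mocked entries; in a browser you would pass `performance.getEntriesByType('resource').filter(e => e.initiatorType === 'script')`:

```javascript
// Sum wire bytes vs decoded bytes across script entries and report the ratio.
// Only transferSize and decodedBodySize are relied on from each entry.
function parseOverhead(entries) {
  let transferred = 0;
  let decoded = 0;
  for (const e of entries) {
    transferred += e.transferSize;
    decoded += e.decodedBodySize;
  }
  return { transferred, decoded, ratio: decoded / transferred };
}

// Mock script entries; the sizes are illustrative, not measured.
const scripts = [
  { transferSize: 300_000, decodedBodySize: 1_100_000 }, // a bundled app
  { transferSize: 90_000,  decodedBodySize: 350_000 },   // an ad library
];
console.log(parseOverhead(scripts));
```

A decoded-to-transferred ratio of 3-4x is common for minified, gzipped JavaScript, which is why a "5MB" script payload is really a much larger parsing job.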
Alex Russell has written extensively about the performance costs of JavaScript on mid-range Android devices. A script that executes in 200ms on a MacBook Pro can take 1-2 seconds on a $200 Android device. Multiply that across 50 scripts and the page becomes functionally unusable on hardware that billions of people actually own.
Google’s Core Web Vitals were partly designed to make this visible in search rankings. Largest Contentful Paint, Interaction to Next Paint, and Cumulative Layout Shift are all metrics that heavy news sites tend to fail. The layout shift metric captures something readers recognize immediately: you start reading a paragraph, an ad loads, and the text jumps down by 300 pixels.
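That jump is what the layout-shift entries record. Browsers expose each shift as a performance entry with a score and a flag for whether it followed user input; real CLS additionally groups shifts into session windows and reports the worst window, so the sketch below is a simplification that just sums, with mocked entries (in a browser the entries come from `new PerformanceObserver(cb).observe({ type: 'layout-shift', buffered: true })`):

```javascript
// Accumulate layout-shift scores, excluding shifts caused by recent user input,
// the way CLS-style scoring does. Entry shape: { value, hadRecentInput }.
function cumulativeShift(entries) {
  return entries
    .filter((e) => !e.hadRecentInput)
    .reduce((sum, e) => sum + e.value, 0);
}

// Mock shift entries standing in for a heavy news page:
const shifts = [
  { value: 0.18, hadRecentInput: false }, // late ad pushes the article down
  { value: 0.02, hadRecentInput: true },  // shift right after a tap: not counted
  { value: 0.09, hadRecentInput: false }, // second ad slot resizes
];
console.log(cumulativeShift(shifts)); // ~0.27
```

Google's published threshold treats anything above 0.25 as poor, so two ad-driven shifts of this size are already enough to fail the metric.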
The Consent Theater Problem
GDPR and similar regulations were supposed to give users genuine control over tracking. In practice, many large news sites treat consent management as an obstacle to route around rather than a signal to respect.
The pattern works like this: show a consent dialog, make “Accept All” the large prominent button, bury “Manage Preferences” behind several taps, then load maximum tracking infrastructure on consent. The consent management platform itself is another 200-500KB of JavaScript that must load before anything else on the page. Publishers then continue fingerprinting through “legitimate interest” claims regardless of what the user selected.
The noyb.eu organization has filed hundreds of complaints about exactly this pattern, and regulators have issued fines. The behavior persists because the revenue from behavioral advertising outweighs the compliance risk calculation at the publisher level.
The Tools That Reveal It
Anyone who wants to see this for themselves can open Chrome DevTools, go to the Network tab, reload a major news site, and sort by size or by initiator. The waterfall chart tells the story. A few initial requests load the editorial content. Then dozens of third-party domains light up in parallel, each spawning their own chains of requests.
WebPageTest provides a more structured view, with connection view, request details by domain, and a Content Breakdown chart that segments weight by content type. Running a test from a simulated mid-tier Android device on a 4G connection makes the user experience implications concrete in a way that a desktop test does not.
The uBlock Origin extension takes a different approach to the same data, showing a count of blocked requests per page. On major news sites that number commonly runs between 50 and 120 requests. Each blocked request is weight that a reader without an ad blocker absorbs on every visit.
Who Is Doing It Better
Text-heavy sites built on simple stacks, personal blogs, and documentation sites routinely load in under 100KB. Modern news organizations have engineering teams that know exactly how to build fast pages; when the choice goes the other way, it reflects deliberate prioritization of ad revenue over reader experience.
Some publishers have pushed in the other direction. The Guardian has documented performance engineering work aimed at reducing page weight. The Washington Post experimented with AMP-based experiences for speed, though AMP came with its own trade-offs around distribution and Google eventually backed away from using it as a ranking signal.
The most useful frame for thinking about page weight is to ask whose interests each resource serves at load time. Editorial content, images, and the code that renders them serve the reader. Everything else, the bidding libraries, the pixel trackers, the session replay tools, the A/B testing SDKs, is overhead serving someone else’s business objective, loaded on the reader’s device, consuming the reader’s data plan and battery.
Forty-nine megabytes, measured against that frame, tells you something clear about the trade-offs the publisher made. The technology to build fast news pages has existed for years. The 49MB page is a product decision, documented in network requests.