
The Small Web Has Built Its Own Infrastructure Stack

Source: hackernews

Kevin Boone’s recent essay makes a point worth sitting with: the personal, non-commercial web is larger and more active than the mainstream narrative suggests. He frames it as a counterculture to surveillance capitalism and JavaScript bloat, which is a fair framing, but the more interesting story is about the infrastructure that supports this ecosystem. The small web has built real, working technical systems, and they are worth understanding on their own terms.

The Discovery Problem

The first problem any decentralized publishing ecosystem faces is discovery. If your content lives outside Google’s index priorities, you are invisible to most readers. The small web has addressed this through a mix of old and new approaches, some of them quite clever.

Marginalia Search is the most compelling example. Built by Viktor Löfgren in Sweden and launched in 2021, it is an independent search engine that deliberately ranks non-commercial, text-heavy personal websites above their commercial counterparts. The crawler scores pages against heuristics: heavy JavaScript usage, the presence of ad networks, cookie consent banners, and tracking pixels all push a site’s ranking down, while a plain HTML page with dense, well-organized text ranks higher. The project is open source (Java), funded through an NLnet grant, and indexes over 80 million documents. The result is a search engine where you can reliably find the kind of content that mainstream search buried years ago.
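To make the idea of heuristic ranking concrete, here is a minimal sketch of a page scorer in Python. It is not Marginalia's actual algorithm: the weights, the ad-host list, and the function name `score_page` are all assumptions invented for illustration. It only shows the general shape of rewarding text and penalizing scripts, ad networks, consent banners, and tracking pixels.

```python
import re

# Hypothetical weights, chosen only for illustration.
SCRIPT_PENALTY = 1.0         # per <script> tag
AD_PENALTY = 5.0             # page references a known ad host
COOKIE_BANNER_PENALTY = 3.0  # page mentions cookie consent
TRACKING_PIXEL_PENALTY = 4.0 # page contains a 1x1 image

# Illustrative host list, not taken from any real ranking system.
AD_HOSTS = ("doubleclick.net", "googlesyndication.com")


def score_page(html: str) -> float:
    """Return a crude 'small web' score; higher means more text, less cruft."""
    lowered = html.lower()

    # Strip scripts and tags to approximate the visible text.
    text = re.sub(r"<script.*?</script>", " ", lowered, flags=re.S)
    text = re.sub(r"<[^>]+>", " ", text)
    score = len(text.split()) / 100.0  # reward: one point per 100 words

    score -= SCRIPT_PENALTY * lowered.count("<script")
    if any(host in lowered for host in AD_HOSTS):
        score -= AD_PENALTY
    if "cookie" in lowered and "consent" in lowered:
        score -= COOKIE_BANNER_PENALTY
    # Very rough tracking-pixel check: an explicit 1x1 dimension pair.
    if 'width="1"' in lowered and 'height="1"' in lowered:
        score -= TRACKING_PIXEL_PENALTY
    return score
```

A real crawler would work from a parsed document and far richer signals; the point here is only that this kind of ranking can be expressed as a handful of transparent, inspectable rules.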

Wiby.me takes a different approach: a curated index of simple, text-based personal pages, manually submitted by users. It is small by design, perhaps half a million pages, and searching it genuinely feels like the late 1990s web. There is a “surprise me” button that loads a random indexed page, which is more useful for lateral discovery than it might sound.

Webrings have also made a substantial comeback. A webring is a circular linked list of websites around a shared topic, with each site including navigation links to the previous and next members. They peaked around 1995 to 2001, disappeared when Google made topic-based discovery frictionless, and have returned as communities rebuilt them without corporate gatekeeping. The communities around 32bit.cafe maintain dozens of active webrings. The discovery mechanism is structurally different from algorithmic ranking: you browse laterally through a set of curated peers rather than down a relevance-sorted list.
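The circular structure is simple enough to sketch in a few lines of Python. The member URLs and the function name `ring_neighbors` are hypothetical; real webrings usually implement this with a small script or a static member list, and details vary by ring.

```python
def ring_neighbors(members: list[str], current: str) -> tuple[str, str]:
    """Return the (previous, next) sites for `current` in a circular ring."""
    i = members.index(current)  # raises ValueError if the site is not a member
    previous_site = members[(i - 1) % len(members)]
    next_site = members[(i + 1) % len(members)]
    return previous_site, next_site


# Hypothetical ring of three member sites.
ring = [
    "https://alice.example/",
    "https://bob.example/",
    "https://carol.example/",
]
print(ring_neighbors(ring, "https://alice.example/"))
```

The modulo arithmetic is what makes the list circular: the first member's "previous" link wraps around to the last member, so a visitor following "next" links eventually returns to where they started.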

The Protocols

The IndieWeb community has been building web standards for personal site infrastructure since 2011, when Tantek Çelik and Aaron Parecki started holding IndieWebCamp events following discussions at the W3C Federated Social Web Summit. Several of their specifications became W3C Recommendations.

WebMention (W3C Recommendation, 2017) is the most widely deployed. When you write a post that links to another site, your publishing software sends an HTTP POST to the target site’s WebMention endpoint, notifying them of the reference. The receiving site can display these mentions alongside the original content, creating a decentralized comment and cross-site conversation layer. It is simpler than XML-RPC pingbacks and more reliable because the receiver is required to verify that the source page actually links to the target before accepting the mention.
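The verification step can be sketched as a check a receiving endpoint might run once it has fetched the source page: does the page really contain a link to the target? The Python below is a minimal illustration using only the standard library. The names `source_mentions_target` and `_LinkCollector` are made up for this sketch, and it inspects only `<a href>` attributes, which is narrower than what a production receiver would check.

```python
from html.parser import HTMLParser


class _LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def source_mentions_target(source_html: str, target_url: str) -> bool:
    """Return True if the source document links to exactly target_url."""
    collector = _LinkCollector()
    collector.feed(source_html)
    return target_url in collector.hrefs
```

Fetching the source page, rate limiting, and handling redirects are left out here; those are the parts where a real endpoint needs the most care.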

A site advertises its WebMention endpoint in a <link> tag:

<link rel="webmention" href="https://example.com/webmention" />

A sender discovers this endpoint and posts:

POST /webmention HTTP/1.1
Content-Type: application/x-www-form-urlencoded

source=https://sender.com/my-post&target=https://example.com/their-post

Services like webmention.io handle the endpoint side for sites that want WebMention support without running their own server.
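For the sending side, the discovery and request-building steps can be sketched in Python with the standard library alone. This is a simplified illustration: the function names are invented here, the sketch looks only at `<link>` and `<a>` elements in the HTML (a complete sender would also check the HTTP `Link` header), and it builds the request body without sending anything.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class _EndpointFinder(HTMLParser):
    """Find the first <link> or <a> element with rel="webmention"."""

    def __init__(self):
        super().__init__()
        self.endpoint = None

    def handle_starttag(self, tag, attrs):
        if self.endpoint is not None or tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").split()
        if "webmention" in rels and attributes.get("href") is not None:
            self.endpoint = attributes["href"]


def discover_endpoint(target_url: str, target_html: str):
    """Return the absolute endpoint URL advertised by the target, or None."""
    finder = _EndpointFinder()
    finder.feed(target_html)
    if finder.endpoint is None:
        return None
    # The advertised href may be relative, so resolve it against the target.
    return urljoin(target_url, finder.endpoint)


def build_body(source: str, target: str) -> bytes:
    """Form-encode the two parameters a WebMention request carries."""
    return urlencode({"source": source, "target": target}).encode("ascii")
```

Note that `urlencode` percent-encodes the colons and slashes in both URLs, which is what the `application/x-www-form-urlencoded` content type expects on the wire; the request shown earlier leaves them unescaped for readability.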

Micropub (W3C Recommendation, 2017) defines a standard API for posting to your own site from third-party clients. The practical effect is that you can use any Micropub-compatible client, whether a mobile app, a browser extension, or a command-line tool, to create posts on your own site regardless of what software runs it. Combined with IndieAuth, which uses your own domain as an OAuth2 identity, this gives the personal web a portable, interoperable publishing stack that requires no dependence on any single platform.
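As a rough illustration of how small a Micropub client can be, the Python sketch below builds (but does not send) a form-encoded request that creates a basic entry. The endpoint URL and token are placeholders, and `build_micropub_request` is a name made up for this example; a real client would first discover the endpoint and obtain the token through IndieAuth.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_micropub_request(endpoint: str, token: str, content: str) -> Request:
    """Build a POST request that asks a Micropub endpoint to create an entry."""
    body = urlencode({"h": "entry", "content": content}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


# Placeholder endpoint and token, for illustration only.
request = build_micropub_request(
    "https://example.com/micropub", "TOKEN", "Hello small web"
)
print(request.get_method(), request.full_url)
print(request.data)
```

Passing the returned object to `urllib.request.urlopen` would send it. The whole interaction is an ordinary form POST with a bearer token, which is why so many different clients can target the same site.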

The design philosophy behind these protocols is consistent: plain HTTP with standard content types, no custom client libraries required, and request formats legible to anyone who can read HTTP. That simplicity is a deliberate feature.

The Hosting Layer

Neocities launched in 2013 as a direct revival of GeoCities, which Yahoo shut down in 2009. Kyle Drake built it to make personal website hosting as accessible as the old GeoCities interface while keeping it free. As of 2023, Neocities hosts over 500,000 sites. Growth accelerated in 2022 during the Twitter/X ownership transition, as people started taking platform independence more seriously. The free tier provides 1 GB of storage and 200 GB of bandwidth per month; a paid supporter tier adds custom domains, more storage, and unlimited bandwidth.

The tildeverse is a different model: a federation of shared Unix servers where users get a ~/public_html directory and terminal access. tilde.club, tilde.town, and around twenty related servers form the network. Each server typically runs IRC, and many run Gopher and Gemini servers alongside HTTP. It is a direct revival of the 1990s ISP shell account model, and the communities are small and close-knit. The barrier to entry is low enough for motivated beginners, yet the experience remains meaningfully different from managed hosting.

The Gemini Tangent

Gemini deserves a mention as a distinct piece of infrastructure, though it occupies a narrower niche. Created by “Solderpunk” in 2019, it is a new application-layer protocol positioned between Gopher and HTTP. Requests are a single URL line over a mandatory TLS connection; responses consist of a status code plus a body in a minimal markup format called Gemtext. There are no cookies, no JavaScript, no headers beyond the status line, and no persistent connections.
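The wire format is small enough to sketch in Python without any networking. In the Gemini specification, the request is the absolute URL followed by CRLF, and the response begins with one header line holding a two-digit status code, a space, and a meta field (for successful responses, the MIME type). The function names below are invented for this sketch, and the TLS connection itself is deliberately left out.

```python
def build_request(url: str) -> bytes:
    """A Gemini request is just the absolute URL terminated by CRLF."""
    return (url + "\r\n").encode("utf-8")


def parse_response_header(header_line: bytes) -> tuple[int, str]:
    """Split a response header line into (status code, meta field)."""
    line = header_line.decode("utf-8").rstrip("\r\n")
    status, _, meta = line.partition(" ")
    return int(status), meta


print(build_request("gemini://example.com/"))
print(parse_response_header(b"20 text/gemini\r\n"))
```

A working client would wrap a socket with Python's `ssl` module, send the request bytes, read the header line, and then read the body until the server closes the connection.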

Gemtext is notably constrained: six line types only, consisting of text, link (=> prefix), heading (#, ##, ###), unordered list (*), quote (>), and preformatted toggle. No inline formatting of any kind.

# A Gemtext page

Welcome to my capsule.

=> gemini://example.com/about.gmi About me
=> gemini://example.com/posts.gmi Writing

> The protocol is the constraint.
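Because every line's type is determined by its first few characters, a Gemtext parser needs no lookahead. The Python sketch below classifies lines into the types listed above. The function name and the string labels it returns are arbitrary choices for this illustration, and lines inside a preformatted block get their own label.

```python
FENCE = "`" * 3  # the three-backtick marker that toggles preformatted mode


def classify_lines(gemtext: str) -> list[tuple[str, str]]:
    """Return (kind, line) pairs for each line of a Gemtext document."""
    result = []
    preformatted = False
    for line in gemtext.splitlines():
        if line.startswith(FENCE):
            preformatted = not preformatted
            result.append(("toggle", line))
        elif preformatted:
            # Inside a preformatted block, no other prefixes are interpreted.
            result.append(("preformatted", line))
        elif line.startswith("=>"):
            result.append(("link", line))
        elif line.startswith("#"):
            result.append(("heading", line))
        elif line.startswith("* "):
            result.append(("list", line))
        elif line.startswith(">"):
            result.append(("quote", line))
        else:
            result.append(("text", line))
    return result
```

Run against the sample page above, this yields one heading, two links, one quote, and plain text lines (including the blank ones) for everything else.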

Servers like Agate (Rust) and Molly Brown (Go) are actively maintained. The ecosystem has roughly 2,000 to 4,000 active capsules. The spec is deliberately frozen; Solderpunk has resisted feature additions to preserve simplicity. The trade-off is real: no inline images, no styling, and a full TLS handshake per request make it less capable and potentially less accessible on constrained hardware than basic HTTP/1.1 with minimal HTML. But for text publishing, it is a coherent engineering choice that forces both authors and clients into a defined set of behaviors.

RSS as the Connective Tissue

Running beneath all of this are RSS and Atom, the syndication formats that predate the social media era. They have seen a strong revival as the small web’s primary subscription layer. Self-hosted feed readers like Miniflux and FreshRSS have grown meaningfully. The hosted options, Feedbin and NewsBlur among them, have maintained steady user bases through the platform migrations of 2022 and 2023. Blogrolls, shared as OPML files, have returned as a discovery mechanism; Blogroll.org is a service for publishing and browsing them.
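An OPML blogroll is plain XML, so reading one takes very little code. The Python sketch below extracts feed titles and URLs using the standard library, relying on the common convention that each feed is an `<outline>` element carrying an `xmlUrl` attribute. The sample document and the function name `feeds_from_opml` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def feeds_from_opml(opml_text: str) -> list[tuple[str, str]]:
    """Return (title, feed URL) pairs for every feed outline in an OPML file."""
    root = ET.fromstring(opml_text)
    feeds = []
    for outline in root.iter("outline"):
        url = outline.get("xmlUrl")
        if url:  # outlines without xmlUrl are usually folders, so skip them
            title = outline.get("text") or outline.get("title") or url
            feeds.append((title, url))
    return feeds


# Hypothetical blogroll with one folder containing two feeds.
SAMPLE = """<opml version="2.0">
  <head><title>Blogroll</title></head>
  <body>
    <outline text="Small web">
      <outline text="Alice" type="rss" xmlUrl="https://alice.example/feed.xml"/>
      <outline text="Bob" type="rss" xmlUrl="https://bob.example/atom.xml"/>
    </outline>
  </body>
</opml>"""

for title, url in feeds_from_opml(SAMPLE):
    print(title, url)
```

Because `root.iter` walks the whole tree, nested folders are flattened automatically, which is usually what you want when importing someone else's blogroll into a reader.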

The small web’s discovery stack is essentially: RSS for subscriptions, webrings and directories for lateral browsing, WebMentions for cross-site conversation, and Marginalia for search. Each layer is replaceable and most are self-hostable. That redundancy is a reasonable response to the lesson that depending on a single platform for any one of those functions tends to end badly.

What This Adds Up To

Kevin Boone is right that the small web is bigger than people assume. It is also more infrastructurally developed than critics give it credit for. There are working search engines, W3C-standardized protocols, active hosting platforms, and federated community systems, none of which require corporate backing or venture funding to operate.

None of it approaches the scale of the commercial web, and that is partly the point. Marginalia deliberately avoids the commercial optimization that would make it grow into something else. Neocities stays free to keep the barrier low. The IndieWeb protocols are designed for individual site owners rather than enterprise publishing platforms. These systems were built to optimize for different values than speed, scale, and engagement metrics, and the fact that they function is its own argument for those values being viable.
