
How Wayland's Correct Architecture Created a Decade of Desktop Regressions

Source: hackernews

The Wayland project began in 2008 when Kristian Høgsberg, an X.Org developer at Red Hat, started sketching out what a modern display protocol might look like if you could discard thirty years of accumulated X11 complexity. The design goals were compelling on paper: eliminate the display server as a separate process, give compositors direct responsibility for rendering, and enforce a security model where windows cannot spy on each other’s input or screen content.

Eighteen years later, a post on omar.yt makes the provocative claim that this transition set the Linux desktop back by ten years. It sparked nearly 400 comments on Hacker News, which tells you something about how much accumulated frustration exists on this subject. The argument is partially right, but its most interesting part is not the conclusion. It is the question of which specific design choices caused the damage.

What the Security Model Required Throwing Away

The core of Wayland’s design is that the compositor mediates everything. In X11, all windows share a common display tree, and any application can query, inspect, or manipulate other windows. This is what let xdotool send synthetic key events to arbitrary windows, what let xrandr reconfigure displays from a terminal, and what made screen recorders work without any special privileges.

Wayland deliberately removed all of that. Each window communicates only with the compositor, and the compositor decides what to expose. A Wayland client cannot enumerate other windows, inject input events, or read another application’s screen contents without explicit permission routed through a portal interface.
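The contrast between the two models can be sketched as a toy simulation. This is not real X11 or Wayland code: the classes `ToyX11Server` and `ToyWaylandCompositor`, the `grants` table, and the `"screencast"` capability name are all hypothetical stand-ins, chosen only to show where the permission check sits.

```python
from dataclasses import dataclass, field


class PermissionDenied(Exception):
    """Raised when a client asks for something it has not been granted."""


@dataclass
class ToyX11Server:
    # Maps window title -> owning client. Every client sees the same tree.
    windows: dict = field(default_factory=dict)

    def list_windows(self, client: str) -> list:
        # Any client may enumerate all windows, whoever owns them.
        return sorted(self.windows)

    def send_key(self, client: str, target: str, key: str) -> str:
        # Any client may inject input into any window; no check is made.
        return f"{key} -> {target}"


@dataclass
class ToyWaylandCompositor:
    windows: dict = field(default_factory=dict)
    # Hypothetical grant table of (client, capability) pairs,
    # standing in for permissions approved through a portal.
    grants: set = field(default_factory=set)

    def list_windows(self, client: str) -> list:
        # A client only learns about the windows it owns.
        return sorted(t for t, owner in self.windows.items() if owner == client)

    def capture_screen(self, client: str) -> str:
        # Screen contents are released only after an explicit grant.
        if (client, "screencast") not in self.grants:
            raise PermissionDenied(f"{client} has no screencast grant")
        return "frame"
```

In this sketch, the same two windows owned by different clients are both visible to either client through `ToyX11Server.list_windows`, while `ToyWaylandCompositor.list_windows` returns only the caller's own window, and `capture_screen` fails until a grant is added.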

This is a legitimate security improvement. On X11, any application you run can silently screenshot your screen or log your keystrokes. The X11 architecture predates the assumption that desktop applications might be adversarial. Wayland’s model is appropriate for a world where you run untrusted code from Flatpak, Snap, or arbitrary downloads.

The problem is that the clean model required rebuilding, from scratch, a large set of features that X11 users relied on. That rebuild took much longer than anyone admitted upfront.

The Feature Debt That Accumulated

When Fedora 25 shipped GNOME on Wayland by default in late 2016, the feature parity gap was stark. Screen recording did not work for most applications. Remote desktop via X11 forwarding, a staple of developer and sysadmin workflows for decades, was simply gone. Tools like xdotool and wmctrl, used everywhere from test automation to scripted window management, had no Wayland equivalents.

The global shortcuts problem was particularly damaging for gamers and power users. X11 let any application grab keys globally, through calls such as XGrabKey, and so register keybindings that fired regardless of which window had focus. Push-to-talk in voice chat clients, media controls in music players, and screenshot tools all relied on this. Wayland had no standardized equivalent for years. KDE and GNOME each developed their own compositor-specific mechanisms, and a cross-compositor answer only began materializing with the GlobalShortcuts interface in xdg-desktop-portal. Adoption across all compositors lagged further.

Input method support for CJK languages was another casualty. The text-input-v3 protocol took years to stabilize, and even today some input methods behave differently under XWayland versus native Wayland clients in certain compositors.

Color management and HDR had no standardized Wayland protocol at all until the experimental xx-color-management-v4 protocol began making its way into compositors. GNOME and KDE shipped HDR support in 2023 and 2024 respectively, years after equivalent Windows and macOS support.

Screen sharing in video calls required both browser and compositor support for the xdg-desktop-portal screen capture interface. This worked reliably in Firefox and Chrome on major Wayland compositors only around 2021. OBS added PipeWire-based Wayland screen capture in the same year. Five years after Fedora’s default switch, a workflow that had worked trivially on X11 was finally functional again.

The Protocol Fragmentation Problem

Wayland’s architecture splits the protocol into a minimal core and compositor-specific extensions. The intent was to keep the core clean while letting compositors experiment with new features before standardizing them. In practice, this produced a fragmented ecosystem where applications must detect compositor capabilities at runtime and implement multiple code paths.
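A minimal sketch of what that runtime detection looks like for a hypothetical taskbar that needs a list of open windows. In a real client the advertised interface names arrive through `wl_registry` global events; here they are passed in as a plain set so the example stays self-contained. The three interface names are real protocol globals, but the function name, the preference order, and the sample sets are assumptions made for illustration.

```python
# Ordered by preference: (advertised global interface, code path to use).
WINDOW_LIST_BACKENDS = [
    ("ext_foreign_toplevel_list_v1", "ext-foreign-toplevel-list"),
    ("zwlr_foreign_toplevel_manager_v1", "wlr-foreign-toplevel-management"),
    ("org_kde_plasma_window_management", "plasma-window-management"),
]


def pick_window_list_backend(advertised):
    """Return the first supported code path, or None if there is none.

    `advertised` is the set of global interface names the compositor
    announced when the client connected.
    """
    for interface, backend in WINDOW_LIST_BACKENDS:
        if interface in advertised:
            return backend
    return None


# Illustrative sets only, not recorded from real compositor sessions.
wlroots_like = {"wl_compositor", "xdg_wm_base", "zwlr_foreign_toplevel_manager_v1"}
plasma_like = {"wl_compositor", "xdg_wm_base", "org_kde_plasma_window_management"}
core_only = {"wl_compositor", "xdg_wm_base"}

for name, globals_seen in [("wlroots-like", wlroots_like),
                           ("plasma-like", plasma_like),
                           ("core-only", core_only)]:
    print(name, "->", pick_window_list_backend(globals_seen))
```

The `None` case is the important one: on a compositor that advertises none of the extensions, the application has no code path at all and the feature has to be disabled or routed through something compositor-specific.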

The wlr-protocols project, maintained by the wlroots community and used by compositors like Sway, Hyprland, and River, defines extensions that GNOME and KDE do not implement. The plasma-wayland-protocols package covers capabilities like screen edge activation, virtual desktop management, and window thumbnails. GNOME Shell exposes its own extensions for shell integration.

Application developers targeting cross-compositor support face a combinatorial problem. Something as mundane as moving a window to a specific position, trivial in X11 with XMoveWindow, requires compositor-specific protocol support and is deliberately unsupported by the xdg-toplevel spec for security reasons.

The rule that “if you need something, propose a protocol extension” sounds reasonable until you trace the actual cycle time. A feature going from identified need to reliable cross-compositor support has historically required: a detailed technical proposal, review by developers across competing compositor projects, independent implementation in each compositor, and then implementation in toolkits or applications. The realistic timeline for that cycle, across GNOME, KDE, and the wlroots ecosystem, has typically been three to seven years.

XWayland, the compatibility layer that runs X11 applications inside a Wayland session, papered over some of these gaps. An X11 application under XWayland gets a full X11 environment with the old capabilities intact. But clipboard and drag-and-drop integration between XWayland and native Wayland applications remained buggy for years. Fractional scaling under XWayland, relevant for high-DPI displays, produced blurry rendering until Xwayland 23.1 in 2023, nearly a decade after HiDPI displays became common on desktops.
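The blurriness at fractional scales comes down to simple arithmetic. The sketch below assumes a simplified model in which the X11 client renders at scale 1 and the compositor stretches that buffer to the output scale; the function name is made up for this example. It checks whether the edges of the source pixels land on whole device pixels after stretching.

```python
from fractions import Fraction


def lands_on_pixel_grid(buffer_px: int, scale: Fraction) -> bool:
    """True when every source-pixel edge maps to a whole device pixel.

    Source pixel i spans [i * scale, (i + 1) * scale) in device pixels.
    If an edge falls between device pixels, that device pixel has to
    blend two source pixels, which is what shows up as blur.
    """
    return all((i * scale).denominator == 1 for i in range(buffer_px + 1))


# At an integer scale, each source pixel covers exactly two device pixels.
print(lands_on_pixel_grid(4, Fraction(2)))

# At 1.5x, the edges fall at 0, 1.5, 3, 4.5, 6: half of them are off-grid.
print(lands_on_pixel_grid(4, Fraction(3, 2)))
```

Exact fractions are used instead of floats so the grid check is not affected by rounding.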

Where Things Stand Now

The situation is genuinely better than it was in 2016 or 2019. KDE Plasma 6.0, released in February 2024, shipped mature Wayland support including working screen sharing, global shortcuts, HDR on compatible hardware, and proper fractional scaling. GNOME 46 and 47 continued closing remaining gaps. The xdg-desktop-portal ecosystem has matured into a functional cross-compositor abstraction for screen capture, file access, and other capabilities.

The argument that X.Org was unmaintainable is also credible. The X.Org Server codebase accumulated security vulnerabilities at a steady rate, the graphics architecture did not support modern GPU capabilities cleanly, and the pool of willing maintainers was shrinking. A protocol replacement was probably necessary regardless of how the transition was handled.

The Cost and Who Bore It

The ten-year framing captures something real. The transition created a prolonged period where Linux desktop users had worse support for common workflows than they had under a system from the 1980s. The typical response to bug reports during this period was that the missing feature required a new protocol extension, and that the reporter was welcome to propose one.

The cost was borne almost entirely by users, not by the developers who made the design decisions. People running professional workflows, dual-monitor setups with specific scaling requirements, remote desktop sessions, or accessibility tools found themselves on a degraded platform for years.

Wayland illustrates a recurring pattern in systems software: clean-slate designs solve the architectural problems of the previous system while creating a transition period where users have less capability than before. The new design may be correct in the long run, but the long run tends to be longer than anyone admits when the decision is made.

X11’s accumulated complexity was real. Wayland’s security model is correct. Even the extension-based protocol model, fragmented as the resulting ecosystem is, is a structural improvement over X11’s global mutable state. But the commitment to deprecating X11 before Wayland achieved parity, combined with a development culture that categorized missing features as protocols to be designed rather than regressions to be fixed, meant the transition cost landed squarely on users. That is what a decade-long setback looks like in practice.
