Wayland's Ecosystem Debt and the Price of Principled Design

There’s a post making the rounds on Lobsters titled “Wayland set the Linux Desktop back by 10 years”, and the title is provocative enough that it’s generating the usual split reaction: one camp treating it as vindication, the other dismissing it as X11 nostalgia. Both reactions miss the more interesting question, which is not whether Wayland was the right direction but whether the Linux desktop paid an unnecessary price for how it got there.

My position: Wayland’s isolation model was the correct architectural choice. X11’s security model is genuinely dangerous. But the timeline of defaulting to Wayland in major distributions substantially preceded the readiness of the replacement infrastructure, and that gap caused real, measurable regression in daily-use functionality that persisted for years.

What X11 Actually Permitted

To understand what broke, you need to understand what X11’s permissive model enabled. X11 has no privilege separation between applications. Any client can call XGrabKey or XGrabButton to intercept global keyboard and mouse input before other applications see it. Any client can call XGetImage to capture pixels from any window, or the entire screen, without permission. XSendEvent lets a client inject synthetic input events into other applications’ windows. The clipboard is a shared, unmediated resource.
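To make that permissiveness concrete, here is a minimal sketch using the third-party python-xlib bindings (an assumption on my part; any Xlib binding would look similar). The two function names are illustrative and not taken from any existing tool. The underlying requests are the same GetImage and GrabKey requests described above, and in an X11 session neither one prompts the user.

```python
# Sketch only: needs an X11 session and the third-party python-xlib package.
# Function names are illustrative, not from any existing tool.


def capture_root_pixels():
    """Read the pixels of the entire screen via GetImage on the root window."""
    from Xlib import X, display  # imported lazily so this module loads without X

    d = display.Display()  # connects to the server named by $DISPLAY
    root = d.screen().root
    geom = root.get_geometry()
    # Equivalent of XGetImage: no permission check, no user prompt.
    image = root.get_image(0, 0, geom.width, geom.height, X.ZPixmap, 0xFFFFFFFF)
    return geom.width, geom.height, image.data


def grab_global_hotkey(keysym_name="F12"):
    """Install a passive key grab on the root window (equivalent of XGrabKey)."""
    from Xlib import X, XK, display

    d = display.Display()
    root = d.screen().root
    keycode = d.keysym_to_keycode(XK.string_to_keysym(keysym_name))
    # Key presses for this keycode are now delivered to this client,
    # regardless of which window has focus.
    root.grab_key(keycode, X.AnyModifier, True, X.GrabModeAsync, X.GrabModeAsync)
    d.sync()
    return d, keycode
```

Calling capture_root_pixels() from any unprivileged process returns the raw framebuffer contents, which is exactly what a screenshot tool wants and exactly what a keylogger's companion wants too.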

This model is a security nightmare. It means a malicious or compromised application can log your keystrokes, screenshot your password manager, and inject clicks into your banking session. The “X11 privilege escalation” problem was documented thoroughly by Joanna Rutkowska as far back as 2011, long before Wayland was practical.

But that same permissive model was the foundation of decades of legitimate tooling. Screenshot utilities, screen recorders, clipboard managers, global hotkey daemons, accessibility tools, remote desktop software, and window management utilities all built directly on X11’s ability to observe and manipulate the shared display state. When Wayland replaced that model with strict compositor-mediated isolation, those capabilities didn’t disappear cleanly. They fractured into a patchwork of protocols.

The Protocol Fragmentation Problem

Wayland’s core protocol is deliberately minimal. It covers surface creation, input delivery, and compositor frame synchronization. Everything else, including things as fundamental as listing the windows currently open on a desktop, requires extensions negotiated between the application and the compositor.

This is architecturally sound. The problem is what happened when different compositor projects solved the same problems independently:

wlroots-based compositors (Sway, River, and Hyprland until it moved to its own backend in 2024 while keeping these protocols) developed wlr-protocols: a collection of extensions like wlr-layer-shell-unstable-v1 for panels and overlays, wlr-data-control-unstable-v1 for clipboard manager access, and wlr-output-management-unstable-v1 for display configuration. These work well within the wlroots ecosystem and not at all outside it.

KDE developed its own set: plasma-window-management for taskbar window enumeration, kde-output-device-v2 for display configuration, and various others tied to KWin specifically.

GNOME Shell implemented a different subset, often through D-Bus rather than Wayland protocols directly, and for years explicitly declined to implement wlr-data-control on the grounds that clipboard access should go through the portal stack.

The ext-* namespace, which represents protocols that are supposed to be compositor-agnostic and stable, has been slowly filling in. ext-session-lock-v1 and ext-idle-notify-v1 both landed in wayland-protocols during 2022 (releases 1.25 and 1.27 respectively). ext-foreign-toplevel-list-v1 for enumerating open windows followed in release 1.32 in 2023, and as of early 2025 it was still a staging protocol that not every major compositor implemented. These are capabilities that X11 applications have had trivial access to since the 1990s.

An application that needs to work across GNOME, KDE, and Sway now has to speak multiple dialects or fall back to the xdg-desktop-portal D-Bus interface, which adds a runtime dependency on the portal infrastructure and requires the compositor to have a matching portal backend installed and running.
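In practice, “speaking multiple dialects” means inspecting the globals the compositor advertises and picking whichever protocol happens to be present. The sketch below models only that selection step in plain Python, with no Wayland connection. The interface names are the globals those protocols advertise; the function name, the backend labels, and the two example sets of globals are hypothetical and deliberately incomplete.

```python
# Selection logic only; a real client would fill `advertised` from wl_registry.
# Function name, labels, and the example sets are hypothetical.

WINDOW_LIST_CANDIDATES = [
    ("ext_foreign_toplevel_list_v1", "ext-foreign-toplevel-list"),
    ("zwlr_foreign_toplevel_manager_v1", "wlr-foreign-toplevel-management"),
    ("org_kde_plasma_window_management", "plasma-window-management"),
]

CLIPBOARD_MANAGER_CANDIDATES = [
    ("zwlr_data_control_manager_v1", "wlr-data-control"),
]


def pick_backend(advertised, candidates):
    """Return the label of the first candidate protocol the compositor offers."""
    for interface, label in candidates:
        if interface in advertised:
            return label
    return None  # caller must fall back to a portal, D-Bus API, or give up


# Illustrative, incomplete sets of globals for two kinds of compositor.
wlroots_like = {
    "wl_compositor",
    "xdg_wm_base",
    "zwlr_layer_shell_v1",
    "zwlr_data_control_manager_v1",
    "zwlr_foreign_toplevel_manager_v1",
}
minimal_compositor = {"wl_compositor", "xdg_wm_base"}

print(pick_backend(wlroots_like, CLIPBOARD_MANAGER_CANDIDATES))
print(pick_backend(minimal_compositor, CLIPBOARD_MANAGER_CANDIDATES))
```

Every entry in those candidate lists is a separate code path the application has to implement, test, and maintain, and the None branch is the one users notice.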

The Portal Layer and Its Latency

The xdg-desktop-portal project is the intended solution for most of these capabilities. It provides a D-Bus API where sandboxed applications can request screen capture, file access, global shortcuts, remote desktop sessions, and more. The portal implementation is provided by the desktop environment, so GNOME ships xdg-desktop-portal-gnome, KDE ships xdg-desktop-portal-kde, and wlroots-based desktops use xdg-desktop-portal-wlr (or increasingly xdg-desktop-portal-hyprland for Hyprland specifically).

For screen capture, the portal works by creating a PipeWire stream that the application reads. This is genuinely elegant: it provides compositor-controlled sharing with explicit user consent, solves the latency and format negotiation problem through PipeWire’s graph model, and works reasonably well once configured. OBS Studio added PipeWire capture support and it now works on modern distributions without incident.
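For readers who have not used the portal, the shape of a screen capture request is roughly as follows. The bus name, object path, interface, and method names below come from the xdg-desktop-portal specification; the helper function and constant names are mine, and the sketch stops short of making any D-Bus calls. It shows the call order and how a client predicts the Request object path on which each asynchronous reply arrives.

```python
import re

# Names defined by the xdg-desktop-portal specification.
PORTAL_BUS_NAME = "org.freedesktop.portal.Desktop"
PORTAL_OBJECT_PATH = "/org/freedesktop/portal/desktop"
SCREENCAST_INTERFACE = "org.freedesktop.portal.ScreenCast"

# Call order for a capture session. Start is where the user is asked for
# consent; OpenPipeWireRemote returns the file descriptor used to read
# the PipeWire stream.
SCREENCAST_CALL_ORDER = [
    "CreateSession",
    "SelectSources",
    "Start",
    "OpenPipeWireRemote",
]


def request_path(unique_name, handle_token):
    """Predict the Request object path for a portal call (hypothetical helper).

    The sender part is the caller's unique bus name with the leading ':'
    removed and every '.' replaced by '_'.
    """
    if not re.fullmatch(r"[A-Za-z0-9_]+", handle_token):
        raise ValueError("handle_token must be a valid object path element")
    sender = unique_name[1:] if unique_name.startswith(":") else unique_name
    sender = sender.replace(".", "_")
    return f"{PORTAL_OBJECT_PATH}/request/{sender}/{handle_token}"


print(request_path(":1.42", "capture1"))
# prints /org/freedesktop/portal/desktop/request/1_42/capture1
```

Subscribing to the Response signal on that path before issuing the call avoids a race with the portal's reply. Compared with a single XGetImage call this is a lot of ceremony, but every step is a point where the compositor or the user can say no.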

But “now works on modern distributions” is doing a lot of work in that sentence. Fedora 25, released in November 2016, was the first major distribution to default to Wayland for GNOME sessions. PipeWire didn’t reach production readiness until around 2021. The GlobalShortcuts portal interface (org.freedesktop.portal.GlobalShortcuts) wasn’t added to the specification until the end of 2022, and backend implementations only started to appear from 2023. For the years in between, screen recording and global hotkeys were either broken outright, required falling back to XWayland, or required distribution-specific configuration that most users never discovered.

XWayland: The Compatibility Tax

XWayland runs a nested X server inside the Wayland compositor, allowing unmodified X11 applications to run. It is an impressive piece of engineering and it kept the Linux desktop functional during the transition. It is also a long-term liability.

XWayland’s HiDPI handling was, until recently, limited to integer scaling. A 2x HiDPI display works fine; a 1.5x fractional scale produces blurry X11 windows. The wp-fractional-scale-v1 protocol merged into wayland-protocols 1.31 in late 2022, and experimental handling on the XWayland side followed, but none of this was remotely sorted out at the time Wayland became the default on most user-facing distributions.
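The arithmetic behind the blur is simple enough to write down. wp-fractional-scale-v1 communicates the preferred scale as an integer numerator over a fixed denominator of 120, so 1.5x is sent as 180. The sketch below compares a native client that renders at that scale with an X11 client rendered at 1x and stretched by the compositor. The second case is an assumption about one common configuration, not a description of every compositor, and all function names are illustrative.

```python
SCALE_DENOMINATOR = 120  # wp-fractional-scale-v1 expresses scale as n / 120


def to_wire_scale(scale):
    """Convert a scale factor such as 1.5 to its n/120 numerator."""
    return round(scale * SCALE_DENOMINATOR)


def buffer_size(logical_w, logical_h, wire_scale):
    """Pixel size a native client should render at (rounds half up)."""
    def scaled(value):
        return (value * wire_scale + SCALE_DENOMINATOR // 2) // SCALE_DENOMINATOR
    return scaled(logical_w), scaled(logical_h)


def stretched_x11_window(logical_w, logical_h, wire_scale):
    """Assumed legacy path: the X11 client renders at 1x, the compositor stretches.

    Returns (rendered size, on-screen size, device pixels per rendered pixel).
    """
    rendered = (logical_w, logical_h)
    on_screen = buffer_size(logical_w, logical_h, wire_scale)
    return rendered, on_screen, wire_scale / SCALE_DENOMINATOR


wire = to_wire_scale(1.5)
print(wire)                                   # 180
print(buffer_size(800, 600, wire))            # (1200, 900)
print(stretched_x11_window(800, 600, wire))   # ((800, 600), (1200, 900), 1.5)
```

In the native case every device pixel is rendered directly. In the stretched case each rendered pixel has to cover one and a half device pixels, so the compositor must interpolate, and text edges go soft. At 2x the ratio is a whole number and the problem disappears, which is why integer scaling looked fine.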

Input method support under XWayland has its own set of quirks. CJK input through IBus or Fcitx5 in XWayland applications uses a different code path than native Wayland clients, and for years the two environments had different behavior around commit timing, pre-edit rendering, and focus handling. Users relying on input methods for daily work, particularly in East Asian languages, had legitimate reasons to prefer X11 sessions.

Clipboard synchronization between XWayland and native Wayland applications requires active effort from the compositor. When that synchronization has bugs, which it periodically does, you get cut-and-paste failures between your native and XWayland applications with no obvious explanation.

What Wayland Genuinely Gets Right

None of the above changes the fact that the direction was correct.

HiDPI on X11 was a mess. Different scaling implementations across toolkits, uncoordinated DPI handling, and the fundamental mismatch between X11’s protocol and modern display hardware created a situation where high-resolution displays on Linux were either unusable or required per-application workarounds. Wayland’s per-output scaling and the viewporter protocol (wp_viewporter) provide a foundation that actually works.

Multi-monitor setups with different DPIs, which are increasingly common, were practically impossible to get right under X11. The Wayland model handles them correctly by design.

Frame synchronization and tearing were persistent X11 problems that Wayland addresses architecturally. The compositor has full control over when frames are presented, which enables proper VSync, variable refresh rate support, and consistent frame timing for applications that need it.

The security isolation is real. An application running in a Wayland session cannot silently observe keystrokes typed into other applications. For threat models that include compromised application dependencies or third-party software, this matters.

The Honest Account

The “10 years” figure in the original article is probably calibrated for rhetorical effect rather than precision, but the underlying observation has merit. The Linux desktop defaulted to Wayland compositors while essential replacement infrastructure, things like standardized screen sharing, global shortcuts, window enumeration, and clipboard manager protocols, was still years from being ready.

The fragmentation across compositor projects compounded this. An application developer targeting “Wayland” is not targeting a single coherent platform. They are targeting three or four partially overlapping protocol sets with a lowest common denominator that, even in 2025, still does not include some features that X11 made trivially available in 1995.

This was not inevitable. A more conservative transition, where distributions would not default to Wayland until the portal and ext-* protocol coverage was genuinely equivalent to X11 for common workflows, would have avoided years of user-facing regression. The decision to move the default earlier was made with the belief that putting users on Wayland would accelerate ecosystem development. There is something to that argument. There is also a real cost that was paid by people who needed clipboard managers, screen readers, input methods, or any of the other X11 capabilities that the Wayland ecosystem was still working on.

Wayland will be the right answer when the ecosystem catches up to it. That process took longer than it needed to, and the Linux desktop absorbed that cost in the form of years of rough edges, broken tooling, and the ongoing maintenance burden of two parallel code paths that XWayland requires. That is worth acknowledging clearly.