
River's Layout Protocol and the Problem Wayland Created for Window Managers

Source: lobsters

When you run a tiling window manager on X11, the separation of concerns is visible at the process level. The window manager is just another X client. It calls XSelectInput on the root window with SubstructureRedirectMask, which tells the X server to route map and configure requests for top-level windows through it; only one client can hold that mask on the root window at a time. From that point, every new window goes to the WM first. The WM decides where to put it, optionally reparents it into a frame, and tells the X server the final geometry. The display server and the policy layer are genuinely separate things running in separate processes.

Wayland collapsed that. The Wayland compositor is simultaneously the display server, the compositing manager, and the window manager. There is no SubstructureRedirect equivalent because there is no separate display server to redirect through. If you want to influence window placement, you need to either be the compositor or have the compositor cooperate with you through a protocol. This is not a flaw in Wayland’s design so much as a consequence of it; the X11 model worked partly because the X server was shared mutable state that any connected client could poke at. Wayland deliberately avoids that.

The pragmatic response to this constraint was to bundle everything. Sway, the most widely used Wayland tiling compositor, takes all the i3 tiling logic and puts it inside the compositor itself. The layout engine, the IPC socket, the configuration parser, all of it lives in one binary. Sway is excellent software, and that approach works. But it means that if you want different tiling behavior, you fork Sway or steer its built-in layouts from outside with IPC commands; you do not swap in a separate layout program.

River, written in Zig by Isaac Freund and built on wlroots, took a different position. The compositing infrastructure, input handling, rendering, and the Wayland protocol implementation all live in River. The tiling logic does not. River exports a Wayland protocol called river-layout-v3 that lets external processes act as layout generators, and the compositor asks them where to put windows rather than deciding itself.

How river-layout-v3 Works

The protocol is request-response at its core. When River needs to arrange windows on an output, it sends a layout_demand event to the layout generator bound to that output. The event carries five pieces of information: the number of views to arrange, the usable width and height of the output, the tag bitmask of the output, and a serial number to match the response.

The layout generator receives this demand and responds by sending one push_view_dimensions request per view, each specifying x, y, width, and height along with the serial of the demand. After all the view geometries are sent, the generator sends a commit request carrying a layout name and the matching serial. River then applies the layout.

A minimal exchange looks like this (the tags value and layout name are illustrative):

# River -> layout generator
layout_demand(view_count=3, usable_width=1920, usable_height=1080, tags=1, serial=42)

# Layout generator -> River (one per view)
push_view_dimensions(x=0,   y=0,   width=960, height=1080, serial=42)
push_view_dimensions(x=960, y=0,   width=960, height=540,  serial=42)
push_view_dimensions(x=960, y=540, width=960, height=540,  serial=42)

# Layout generator -> River
commit(layout_name="tall", serial=42)

The generator does not know what those windows contain. It does not need to. It receives a count and a bounding box and returns rectangles. The protocol deliberately keeps the layout generator stateless with respect to window content; River tracks views, the generator tracks geometry rules.
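The geometry side of such a generator can be sketched as a pure function. This is a minimal illustration in Python, not River's or rivertile's implementation: the names Rect and tall_stack and the main_ratio parameter are hypothetical, and a real generator would wrap this logic in a Wayland client that binds river-layout-v3.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    """One view's geometry, in the order push_view_dimensions expects."""
    x: int
    y: int
    width: int
    height: int


def tall_stack(view_count: int, usable_width: int, usable_height: int,
               main_ratio: float = 0.5) -> List[Rect]:
    """Return one rectangle per view: a main view on the left, the rest stacked on the right."""
    if view_count <= 0:
        return []
    if view_count == 1:
        # A single view takes the whole usable area.
        return [Rect(0, 0, usable_width, usable_height)]

    main_width = int(usable_width * main_ratio)
    stack_width = usable_width - main_width
    stack_count = view_count - 1
    base_height = usable_height // stack_count
    remainder = usable_height % stack_count

    rects = [Rect(0, 0, main_width, usable_height)]
    y = 0
    for i in range(stack_count):
        # Give leftover pixels to the first stack views so the column fills the height exactly.
        height = base_height + (1 if i < remainder else 0)
        rects.append(Rect(main_width, y, stack_width, height))
        y += height
    return rects
```

Called as tall_stack(3, 1920, 1080), this yields the same three rectangles as the exchange above. Nothing in the function refers to window contents, which is the point of the protocol boundary.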

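The layout generator also registers under a namespace string. River uses this to let users choose and steer layouts at runtime: riverctl default-layout and riverctl output-layout select which namespace handles demands, and riverctl send-layout-cmd forwards a command string to the generator registered under a given namespace. You can swap from a tall stack to a grid to a monocle arrangement by changing which generator handles demands for a given output.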

The Ecosystem This Enables

River ships with rivertile, a layout generator that handles the common cases: main-area ratio, main view count, padding, and orientation. For most users, rivertile is sufficient. But because the interface is a Wayland protocol that any process can implement, the community has produced alternatives.
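Parameters like these are typically adjusted at runtime through command strings forwarded to the generator. The sketch below shows one way a generator might keep and update that state. The TileSettings class, its defaults, and the clamping bounds are assumptions for illustration; the command names are modeled on rivertile's main-ratio and main-count, but the parsing here is not rivertile's code.

```python
from dataclasses import dataclass


@dataclass
class TileSettings:
    """Hypothetical runtime state for a tiling layout generator."""
    main_ratio: float = 0.6
    main_count: int = 1

    def apply(self, command: str) -> None:
        """Apply a command such as 'main-ratio +0.05' or 'main-count 2'.

        A value with a leading '+' or '-' is relative to the current
        setting; any other value replaces it.
        """
        parts = command.split()
        if len(parts) != 2:
            raise ValueError(f"expected '<name> <value>', got {command!r}")
        name, value = parts
        relative = value[0] in "+-"
        if name == "main-ratio":
            number = float(value)
            new_ratio = self.main_ratio + number if relative else number
            # Clamp so neither the main area nor the stack vanishes (assumed bounds).
            self.main_ratio = min(0.9, max(0.1, new_ratio))
        elif name == "main-count":
            number = int(value)
            new_count = self.main_count + number if relative else number
            # Keep at least one main view (an assumption of this sketch).
            self.main_count = max(1, new_count)
        else:
            raise ValueError(f"unknown command {name!r}")
```

The generator would call apply for each forwarded command and use the updated settings the next time a layout demand arrives.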

stacktile extends the concept with multiple configurable stacks. rivercarro adds monocle mode and gaps. kile is written in Rust while River itself is written in Zig, which tells you something about the kind of flexibility this model affords; you can implement the protocol in any language that can speak Wayland. The protocol boundary adds little overhead because it uses the same socket transport as all other Wayland communication, and the compositor only sends a demand to the generator when an actual layout change is needed.
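Because a generator only returns rectangles, a different layout is just a different function. As a rough sketch of a monocle arrangement with an outer gap (the function name and the outer_gap parameter are hypothetical and not taken from rivercarro):

```python
from typing import List, Tuple


def monocle(view_count: int, usable_width: int, usable_height: int,
            outer_gap: int = 0) -> List[Tuple[int, int, int, int]]:
    """Give every view the same (x, y, width, height): the usable area minus an outer gap."""
    # Never let the gap shrink a view below one pixel in either dimension.
    width = max(1, usable_width - 2 * outer_gap)
    height = max(1, usable_height - 2 * outer_gap)
    return [(outer_gap, outer_gap, width, height)] * max(0, view_count)
```

All views overlap in this layout, so which one is visible depends on the compositor's stacking order and focus, not on the generator.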

This is architecturally similar to what X11 window managers had by accident: a dwm fork and a stock dwm installation are separate binaries that talk to the same unmodified X server. River makes that separation explicit and protocol-defined rather than accidental.

Tags, Control, and the Rest of the Protocol Surface

River uses a tag system in the dwm tradition rather than numbered workspaces. Each view has a bitmask of tags, and each output has a bitmask of visible tags. Filtering is bitwise AND: a view is shown on its output when the two masks share at least one bit. This is meaningfully different from workspaces because a view can carry several tags at once and so appear under each of them, and you can display multiple tags at once on a single output.
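The filtering rule is small enough to show directly. This is an illustrative sketch, not River's code; the helper names tag_mask and is_visible are hypothetical, and it assumes the common convention that tag number n corresponds to bit n - 1.

```python
def tag_mask(*tags: int) -> int:
    """Build a bitmask from 1-based tag numbers (tag 1 is bit 0)."""
    mask = 0
    for tag in tags:
        mask |= 1 << (tag - 1)
    return mask


def is_visible(view_tags: int, output_tags: int) -> bool:
    """A view is shown when it shares at least one tag with the output's visible tags."""
    return (view_tags & output_tags) != 0


# A view tagged 2 and 3 is shown when the output displays tag 3,
# and hidden when the output displays only tags 1 and 4.
view = tag_mask(2, 3)
shown = is_visible(view, tag_mask(3))
hidden = not is_visible(view, tag_mask(1, 4))
```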

The river-control protocol handles the command interface. riverctl is the userspace tool, but the actual communication goes over a Wayland protocol, not a Unix socket with a custom serialization format. This means the control surface has the same version negotiation that all Wayland protocols have, which is a small but meaningful consistency win.

River also implements river-status, which notifies clients of focused output, focused view, and tag state changes. Status bars and similar panel clients use this to display what is happening without polling or guessing.

Comparison with Sway’s Approach

Sway’s monolithic model has real advantages. The tiling engine has direct access to view metadata, window titles, application IDs, and output properties without going through a protocol. Sway can implement container hierarchies, marks, and scratchpads with full knowledge of window state. The i3 IPC protocol is rich because the compositor can expose rich information.

River’s model trades that richness for composability. A layout generator only knows counts and bounding boxes. It cannot make decisions based on window title or application class, at least not through river-layout-v3 alone. Combining river-layout-v3 with wlr-foreign-toplevel-management or the newer ext-foreign-toplevel-list standard protocol would let a sufficiently motivated layout generator fetch toplevel metadata and make smarter decisions, though at significant implementation cost.

In exchange, River’s compositor surface stays smaller and more auditable. The layout generator can crash or be replaced without touching the compositor. You can prototype a new tiling algorithm in a scripting language without recompiling anything. If your layout generator has a bug, you get bad window placement, not a dead compositor.

Implications for Wayland Protocol Standardization

river-layout-v3 is a River-specific protocol, not a wlroots or freedesktop standard. Other compositors do not implement it. This is a recurring tension in the Wayland ecosystem: compositors build compositor-specific extension protocols, and applications that want to work across compositors either target the intersection of standardized protocols or write compositor-specific backends.

The standardization process under freedesktop.org is slow and conservative, which is intentional. Protocols that get standardized tend to be stable for a long time. River’s approach of shipping a compositor-specific protocol and iterating on it, currently at version three, lets the design evolve based on real usage before anyone attempts standardization. The ext-foreign-toplevel-list protocol that River also supports is an example of that pipeline working: similar functionality existed as a wlroots extension (wlr-foreign-toplevel-management) for years before being formalized.

Whether a layout-generator protocol eventually gets standardized depends on whether enough compositors see value in the model. Right now the compositors with the most users tend toward the Sway model. But River’s architecture makes a credible case that you can have a capable tiling compositor without embedding tiling policy, and the existence of a working protocol and a growing list of third-party generators is more persuasive evidence than a design document.

The X11 window manager ecosystem thrived partly because the SubstructureRedirect mechanism was a common interface every WM could target. Wayland will not replicate that mechanism, but River’s work suggests that a well-scoped protocol can recover some of that diversity without sacrificing the security properties Wayland was designed to provide.
