
Ten Years of Slug: The Maintenance Cost of Exact GPU Font Rendering

Ten years is a long time to maintain a commercial graphics library. In that span, graphics APIs have completed a full generational cycle: OpenGL gave way to Vulkan, Metal, and Direct3D 12; GLSL, HLSL, and Metal Shading Language all accumulated significant revisions; and mobile GPUs from ARM and Qualcomm now run shaders that would have been desktop-only a decade ago. Through all of this, Eric Lengyel’s Slug library has rendered TrueType and OpenType fonts analytically on the GPU, and his ten-year retrospective is most usefully read as a case study in what exactness costs as a long-term engineering commitment.

The Core Algorithm

Slug’s fragment shader receives the actual Bezier control points for each glyph, stored in GPU-accessible buffers, and evaluates them analytically per fragment. For each fragment, the shader identifies which curve segments have y-intervals overlapping the current fragment row, finds their intersections with a notional horizontal ray by solving the curve’s polynomial for the ray’s y-coordinate, accumulates signed winding contributions from those intersections, and integrates the winding number over the fragment area to produce fractional coverage between 0 and 1.

For the quadratic Bezier curves that make up TrueType outlines, intersection-finding requires solving a quadratic:

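// Coefficients of y(t) - frag_y = a*t^2 + b*t + c for control points P0, P1, P2
float a = P0.y - 2.0 * P1.y + P2.y;
float b = 2.0 * (P1.y - P0.y);
float c = P0.y - frag_y;
if (abs(a) < 1e-6) {
    // Nearly straight segment: solve b*t + c = 0 rather than dividing by a.
    // The 1e-6 threshold is an illustrative choice, not a tuned value.
    if (b != 0.0) { float t = -c / b; /* accumulate if t is in [0,1] */ }
} else {
    float disc = b * b - 4.0 * a * c;
    if (disc >= 0.0) {
        float s = sqrt(disc);
        float t1 = (-b - s) / (2.0 * a);
        float t2 = (-b + s) / (2.0 * a);
        // accumulate signed winding from roots in [0,1]
    }
}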
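The accumulation step that the shader comment leaves open can be sketched on the CPU. The following is a minimal Python reference, not Slug's implementation: it counts signed crossings of a +x ray at a single sample point, so it yields an integer winding number rather than fractional coverage, and it uses a simple half-open parameter interval to avoid double counting shared endpoints. The names `quad_roots`, `winding_number`, `line`, and `square` are illustrative assumptions.

```python
import math

def quad_roots(a, b, c, eps=1e-12):
    """Real roots of a*t^2 + b*t + c = 0, falling back to linear when a ~ 0."""
    if abs(a) < eps:
        return [] if abs(b) < eps else [-c / b]
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []
    s = math.sqrt(disc)
    return [(-b - s) / (2.0 * a), (-b + s) / (2.0 * a)]

def winding_number(curves, px, py):
    """Signed crossings of a +x ray from (px, py) with quadratic Bezier curves."""
    winding = 0
    for (x0, y0), (x1, y1), (x2, y2) in curves:
        a = y0 - 2.0 * y1 + y2
        b = 2.0 * (y1 - y0)
        c = y0 - py
        for t in quad_roots(a, b, c):
            if not (0.0 <= t < 1.0):      # half-open: shared endpoints count once
                continue
            x = (1 - t) ** 2 * x0 + 2 * (1 - t) * t * x1 + t ** 2 * x2
            if x <= px:                   # crossing lies behind the ray origin
                continue
            dy = 2.0 * a * t + b          # direction of travel in y at the crossing
            if dy > 0.0:
                winding += 1
            elif dy < 0.0:
                winding -= 1              # dy == 0 is a tangent touch: no contribution
    return winding

def line(p, q):
    """A straight edge expressed as a quadratic with its control point at the midpoint."""
    return (p, ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0), q)

# Hypothetical test contour: a counter-clockwise unit square.
square = [line((0, 0), (1, 0)), line((1, 0), (1, 1)),
          line((1, 1), (0, 1)), line((0, 1), (0, 0))]
```

A nonzero result means the sample point is inside under the nonzero fill rule. A production shader needs more careful handling of roots near curve endpoints and of nearly flat curves than this sketch provides.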

For CFF cubic curves in OpenType fonts, the pattern is the same, with a cubic solve in place of the quadratic. A spatial acceleration structure that organizes each glyph’s curve segments into horizontal bands bounds typical per-fragment work to 3 to 6 curve evaluations for Latin glyphs, rather than the full outline length.
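The banding idea can be illustrated with a small Python sketch. It is an assumption-laden simplification, not Slug's data layout: it buckets curve indices by the horizontal bands that each curve's control-point bounding box overlaps, which is conservative because a Bezier curve lies inside the convex hull of its control points. `build_bands` and `band_for` are hypothetical helper names.

```python
def build_bands(curves, y_min, y_max, band_count):
    """Bucket curve indices by the horizontal bands their y-extent overlaps."""
    bands = [[] for _ in range(band_count)]
    height = (y_max - y_min) / band_count
    for index, curve in enumerate(curves):
        ys = [point[1] for point in curve]
        lo = int((min(ys) - y_min) / height)
        hi = int((max(ys) - y_min) / height)
        lo = max(0, min(band_count - 1, lo))   # clamp to valid band indices
        hi = max(0, min(band_count - 1, hi))
        for band in range(lo, hi + 1):
            bands[band].append(index)
    return bands

def band_for(y, y_min, y_max, band_count):
    """Band index containing the given y-coordinate, clamped to the glyph box."""
    height = (y_max - y_min) / band_count
    return max(0, min(band_count - 1, int((y - y_min) / height)))

# Hypothetical curves, each given as three (x, y) control points.
demo_curves = [
    ((0.0, 0.0), (0.5, 0.1), (1.0, 0.2)),   # low in the glyph box
    ((0.0, 0.4), (0.5, 0.5), (1.0, 0.6)),   # around the middle
    ((0.0, 0.1), (0.5, 0.5), (1.0, 0.9)),   # spans most of the height
]
demo_bands = build_bands(demo_curves, 0.0, 1.0, 4)
```

A fragment at height `y` then only needs to test the curves listed in `demo_bands[band_for(y, 0.0, 1.0, 4)]` instead of the whole outline.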

Lengyel published the algorithm in 2017 in the Journal of Computer Graphics Techniques as “GPU-Centered Font Rendering Directly from Glyph Outlines,” providing an independently auditable specification for what the library computes. That paper exists separately from the implementation. Its mathematics have not changed.

What Algorithm Stability Covers and What It Does Not

The stability of the core algorithm is the central feature of Slug’s maintenance story, and it is worth being precise about what that stability includes.

The winding number rule, Bezier intersection arithmetic, and fractional coverage integration are not API-specific constructs. They are classical computational geometry. None of these required re-derivation when Vulkan shipped, when Metal became the dominant macOS API, or when SPIR-V became the intermediate representation shared by Vulkan and OpenCL. The mathematics ran on 2015 hardware and run on 2025 hardware unchanged.

What algorithm stability does not cover is the substantial work of expressing the same computation in every shader language that a commercial graphics library must support. GLSL, HLSL, and MSL have different syntax, different memory model semantics, different buffer binding models and limits, and different tooling. A Vulkan backend adds synchronization obligations that OpenGL abstracted away. Metal’s argument buffer approach to resource binding differs from both. SPIR-V compilation, validation, and transpilation through SPIRV-Cross each add toolchain surface area. None of this changes what the fragment shader computes; all of it is real engineering work that has arrived continuously through a decade of API evolution.

Format Extensions and the Price of Exactness

The OpenType specification introduced variable fonts in version 1.8 in 2016. A single variable font file encodes a continuous design space along named axes such as weight, width, and optical size; glyph outlines at any point in that space are produced by interpolating control points. For Slug, variable font support reduces to interpolating the Bezier control points before uploading them to the GPU buffer. The fragment shader evaluates whatever control points it receives and has no concept of named axes or design-space position. The mathematics compose without modification.
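A rough Python sketch of that CPU-side step, under simplifying assumptions: each axis contributes one set of per-point deltas, scaled linearly by a normalized axis position. Real variable fonts use region-based scalars and can infer deltas for untouched points, which this sketch ignores. The function name `apply_deltas` and the data layout are hypothetical.

```python
def apply_deltas(default_points, deltas_by_axis, location):
    """Offset default control points by per-axis deltas scaled by axis position.

    default_points: list of (x, y) control points at the default instance.
    deltas_by_axis: {axis_tag: list of (dx, dy)}, one delta per control point.
    location:       {axis_tag: normalized position}, missing axes count as 0.
    """
    result = []
    for i, (x, y) in enumerate(default_points):
        for axis, deltas in deltas_by_axis.items():
            scalar = location.get(axis, 0.0)
            dx, dy = deltas[i]
            x += scalar * dx
            y += scalar * dy
        result.append((x, y))
    return result

# Hypothetical single-curve glyph with a weight axis.
default_curve = [(0, 0), (50, 100), (100, 0)]
weight_deltas = {"wght": [(0, 0), (10, 20), (20, 0)]}
halfway = apply_deltas(default_curve, weight_deltas, {"wght": 0.5})
```

The output is an ordinary list of control points, which is why the renderer downstream needs no knowledge of the design space.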

Color fonts are a more substantive case. OpenType has accumulated multiple competing color font formats: COLRv0 for layer-based coloring, COLRv1 with gradients and compositing operations, sbix for bitmap-embedded emoji, CBDT/CBLC for PNG-compressed bitmaps, and an SVG table for scalable artwork. Each of these requires specific handling in the renderer. An exact renderer cannot absorb format subtleties into approximation tolerance. If COLRv1 gradient compositing is implemented incorrectly, the result is visibly wrong; there is no smoothed-over distance field obscuring the error.

This is where the maintenance profile of exactness diverges most sharply from approximate methods. A signed distance field pipeline can generate an atlas that is subtly incorrect for some edge case in glyph geometry, and the error may be imperceptible within the approximation’s tolerance. An exact renderer is accountable to every geometry detail it claims to handle, because the geometry is evaluated at display time.

The SDF Comparison

Valve’s signed distance field technique, introduced by Chris Green at SIGGRAPH 2007, stores a precomputed signed distance field in a texture. The fragment shader is trivially cheap: a single texture sample and a smoothstep threshold. The shader itself rarely needs to change across API generations. The maintenance debt accumulates instead in the pipeline that produces the texture atlas, which must account for every new font format, every target platform with different atlas dimension constraints, and every design-space point in a variable font that requires its own bake.
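To make "a single texture sample and a smoothstep threshold" concrete, here is a small Python sketch of the per-fragment arithmetic. It assumes the common convention that the stored value is in [0, 1] with 0.5 on the glyph edge; the `softness` parameter and the name `sdf_coverage` are illustrative assumptions, and the texture fetch itself is represented by the `sample` argument.

```python
def smoothstep(edge0, edge1, x):
    """Hermite interpolation between 0 and 1, matching the shader built-in."""
    t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
    return t * t * (3.0 - 2.0 * t)

def sdf_coverage(sample, softness=0.05):
    """Approximate coverage from one distance-field sample (0.5 = on the edge)."""
    return smoothstep(0.5 - softness, 0.5 + softness, sample)
```

Everything expensive happens before this point, when the atlas is generated.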

MSDF, Viktor Chlumský’s multi-channel extension, improves corner sharpness by encoding directional information across RGB channels. The fundamental trade-off is the same: cheap fragment shader, preprocessing pipeline with its own format-specific maintenance burden.
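The decoding step for a multi-channel field is similarly small. This Python sketch follows the commonly used approach of taking the median of the three channels as the signed distance and converting it to opacity with a linear ramp; `screen_px_range` (how many screen pixels the encoded distance range spans) and the function names are assumptions for illustration.

```python
def median3(r, g, b):
    """Median of three channel values."""
    return max(min(r, g), min(max(r, g), b))

def msdf_coverage(r, g, b, screen_px_range=4.0):
    """Opacity from one multi-channel sample (0.5 = on the edge)."""
    signed_distance = median3(r, g, b) - 0.5
    return max(0.0, min(1.0, screen_px_range * signed_distance + 0.5))
```

The median is what lets two channels disagree near a corner without rounding it off.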

Neither SDF nor MSDF handles arbitrary scale correctly. Corner rounding in single-channel SDF reflects information that was never encoded; the distance field records proximity to the nearest edge, not the angular character of that edge. At large display sizes or significant zoom, this becomes visible and cannot be corrected without regenerating the atlas at higher resolution. Slug’s winding number integration is scale-independent because it operates on the actual curves at display time.

The Loop-Blinn technique from SIGGRAPH 2005 addressed the scale limitation by rendering Bezier curves analytically in the fragment shader, using an elegant implicit-form formulation in which texture coordinates assigned per vertex allow containment evaluation with a single multiply and comparison. The limitation was that this produces a binary inside/outside decision rather than fractional coverage. Antialiasing required MSAA, and CPU-side tessellation into a specific triangle configuration was needed per glyph. Slug’s contribution was closing the coverage gap that Loop-Blinn left open.
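For the quadratic case, the implicit test can be sketched in a few lines of Python. The triangle's vertices (curve start, control point, curve end) carry the coordinates (0, 0), (0.5, 0), and (1, 1); the rasterizer interpolates them, and the fragment evaluates u² − v. Here the interpolation is emulated with explicit barycentric weights, and the function name and `flip` flag are illustrative assumptions.

```python
def loop_blinn_inside(bary, flip=False):
    """Binary containment test for one quadratic curve triangle.

    bary: barycentric weights (w0, w1, w2) of the fragment with respect to the
          curve start point, control point, and end point.
    flip: selects which side of the curve counts as inside.
    """
    w0, w1, w2 = bary
    u = 0.0 * w0 + 0.5 * w1 + 1.0 * w2   # interpolated first coordinate
    v = 0.0 * w0 + 0.0 * w1 + 1.0 * w2   # interpolated second coordinate
    f = u * u - v                        # zero exactly on the curve
    return f > 0.0 if flip else f < 0.0
```

The midpoint of the chord, `(0.5, 0.0, 0.5)`, falls between the chord and the curve and tests as inside; the control point itself, `(0.0, 1.0, 0.0)`, tests as outside. The result is a boolean, which is the coverage gap the paragraph above describes.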

Where Things Stand in 2026

The GPU text rendering ecosystem now contains several coexisting approaches, each suited to different use cases.

Vello, the compute-shader tiling renderer developed under the Linebender umbrella, is also exact in the same mathematical sense as Slug. It tiles the screen, assigns curve segments to tiles in a compute geometry pass, and accumulates exact area coverage per tile in compute shaders. The architectural difference matters in practice: Vello accumulates per-tile, which maps well to 2D scene rendering where entire scenes are submitted together, but sits less naturally inside a 3D game pipeline with an existing frame graph and separate per-object draw calls. Vello is open source and Rust-native; Slug is a commercial C++ SDK targeting games and real-time graphics applications across multiple platform APIs.

SDF and MSDF remain dominant for mobile games and applications rendering text at predictable sizes. On bandwidth-constrained mobile hardware, a trivial fragment shader and a cached texture atlas outperform analytical per-fragment evaluation for straightforward text scenarios. The quality trade-offs are acceptable when text is decorative UI within a defined size range and display density is high enough to hide the remaining edge artifacts.

What the Commercial Model Enables

Slug is a licensed commercial product. The implications for long-term maintenance are worth naming directly: Lengyel has financial accountability to integrators depending on stable API and behavior guarantees across a ten-year span. Projects exploring similar technical territory without that structure have not all sustained development at the same pace. Pathfinder, Mozilla’s analytical GPU text renderer, is now archived. Commercial licensing creates continuity incentives that volunteer maintenance alone does not reliably provide.

The decade also demonstrates something about exactness versus approximation as a long-term engineering strategy. Exactness changes the location of maintenance work rather than reducing it. Instead of maintaining preprocessing pipelines and asset regeneration tooling for each new format extension, you maintain shader ports and specification-tracking for the same format extensions. The algorithm is stable; the world it runs in is not. Lengyel’s retrospective is, among other things, an honest accounting of what that distinction costs in practice, and why the trade-off still holds for the applications that need it.
