
Searching Rust by Type: What Roogle Borrows from Haskell and Where It Goes From There

Source: lobsters

There is a particular kind of frustration unique to statically typed languages with large standard libraries. You know a function exists. You can describe what it should do in terms of types. You want Vec<T> -> usize. Or maybe &str -> Option<u64>. The function is there, somewhere, but the documentation search is keyword-based and you cannot remember the name. Haskell solved this problem two decades ago with Hoogle. The Rust ecosystem is catching up, and Roogle is one of the more interesting attempts at that.

The Hoogle Model

Hoogle, originally built by Neil Mitchell around 2004 and continuously maintained since, is a search engine that indexes Haskell packages by function type signature. You type a type, and it finds functions that match. Search for a -> [a] and you get pure, return, repeat, and friends. Search for [a] -> [a] -> [a] and (++) appears near the top. The magic is that type variables like a are treated as wildcards during matching: any concrete type can unify with a, so the search does not require you to know the exact polymorphic type, only its shape.

This works cleanly in Haskell for a few reasons. Haskell’s type system is built on Hindley-Milner, so types are structurally uniform and a type variable can stand for any type. The surface syntax for types is relatively small. There are no lifetimes, no mutability markers, and no distinction between a reference and a value in the type language itself. Unification across a large corpus of Haskell functions is computationally tractable because the type grammar is compact.

The result is that experienced Haskell programmers use Hoogle constantly. It is not a fallback for when you forget a name. It is a primary discovery interface, a way to explore what the standard library can do by describing the shape of the transformation you need.

What Roogle Does

Roogle ports this idea to Rust. It ingests the JSON output that rustdoc can produce for a crate, builds an index of function signatures, and then answers queries like (String, usize) -> String or Vec<T> -> Option<T>. Type variables in queries can be written in lowercase, as in Hoogle; Rust’s own convention for generic parameters is uppercase, as in the T above.

The core algorithm is a form of type unification. Given a query type fn(a, Vec<a>) -> Option<a>, Roogle walks the indexed signatures and tries to find a consistent substitution where every occurrence of a maps to the same concrete type throughout the signature. If a substitution exists, the function is a candidate result. The ranking layer then orders candidates by how specific the match is, preferring exact matches over those that required substituting several type variables.
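The consistency requirement can be sketched in a few lines. This is a minimal illustration and not Roogle’s actual code: the Ty and Sig types, the match_ty and matches functions, and the two sample index entries are all hypothetical, and generic parameters inside indexed signatures are ignored for simplicity.

```rust
use std::collections::HashMap;

/// A deliberately tiny type language (hypothetical, not Roogle's own):
/// query variables such as `a`, and constructors such as `Vec<a>` or `i32`.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Var(String),
    Con(String, Vec<Ty>),
}

fn var(name: &str) -> Ty {
    Ty::Var(name.to_string())
}

fn con(name: &str, args: Vec<Ty>) -> Ty {
    Ty::Con(name.to_string(), args)
}

struct Sig {
    name: &'static str,
    params: Vec<Ty>,
    ret: Ty,
}

/// Match one query type against one indexed type, extending `subst`.
/// A variable that is already bound must see the same type again.
fn match_ty(query: &Ty, target: &Ty, subst: &mut HashMap<String, Ty>) -> bool {
    match (query, target) {
        (Ty::Var(v), _) => {
            if let Some(bound) = subst.get(v) {
                return bound == target;
            }
            subst.insert(v.clone(), target.clone());
            true
        }
        (Ty::Con(qn, qargs), Ty::Con(tn, targs)) => {
            qn == tn
                && qargs.len() == targs.len()
                && qargs.iter().zip(targs).all(|(q, t)| match_ty(q, t, subst))
        }
        // Simplification: variables in the index never match a concrete query type.
        (Ty::Con(..), Ty::Var(_)) => false,
    }
}

/// A whole signature matches if one substitution works for every
/// parameter and for the return type.
fn matches(query: &Sig, candidate: &Sig) -> bool {
    let mut subst: HashMap<String, Ty> = HashMap::new();
    query.params.len() == candidate.params.len()
        && query
            .params
            .iter()
            .zip(&candidate.params)
            .all(|(q, t)| match_ty(q, t, &mut subst))
        && match_ty(&query.ret, &candidate.ret, &mut subst)
}

/// The query from the text: fn(a, Vec<a>) -> Option<a>.
fn sample_query() -> Sig {
    Sig {
        name: "query",
        params: vec![var("a"), con("Vec", vec![var("a")])],
        ret: con("Option", vec![var("a")]),
    }
}

/// Two made-up index entries: one binds `a` consistently, one does not.
fn sample_index() -> Vec<Sig> {
    let int = || con("i32", vec![]);
    let string = || con("String", vec![]);
    vec![
        Sig {
            name: "consistent",
            params: vec![int(), con("Vec", vec![int()])],
            ret: con("Option", vec![int()]),
        },
        Sig {
            name: "inconsistent",
            params: vec![int(), con("Vec", vec![string()])],
            ret: con("Option", vec![int()]),
        },
    ]
}

fn main() {
    let query = sample_query();
    for sig in sample_index() {
        println!("{}: {}", sig.name, matches(&query, &sig));
    }
}
```

The second entry is rejected because a is first bound to i32 and then meets String inside Vec. A ranking layer could then count how many variables had to be bound, which is one plausible reading of “how specific the match is”.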

Indexing is driven by rustdoc’s JSON output, which is still unstable: it is available only on nightly toolchains, behind -Z unstable-options --output-format json, and its schema changes between releases. In principle Roogle can index any crate that rustdoc can document, which would cover crates.io if you ran each crate through something like cargo +nightly rustdoc -- -Z unstable-options --output-format json.

A query against the Rust standard library looks like this:

// Find functions taking a slice of T and returning the length
[T] -> usize

// Find functions converting a string to a number, possibly failing  
&str -> Result<u64, E>

// Find functions taking two arguments of the same type and returning bool
a -> a -> bool

The last example is genuinely useful. It surfaces equality checks and ordering predicates without needing to remember whether the relevant trait method is called eq, ne, lt, le, or something else. Methods such as cmp and partial_cmp take the same arguments but return Ordering and Option<Ordering>, so they belong to a different query.
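For orientation, here are standard-library functions whose shapes correspond to the three queries above. The wrapper functions are illustrative names of my own; only the calls inside them (slice len, str::parse, PartialEq::eq, PartialOrd::lt) are real std APIs. Whether a given search engine actually returns these for those queries depends on how it treats references, which is an assumption here.

```rust
use std::num::ParseIntError;

/// `[T] -> usize`: the length of a slice.
fn slice_len(xs: &[i32]) -> usize {
    xs.len()
}

/// `&str -> Result<u64, E>`: `str::parse`, with `E = ParseIntError`.
fn to_number(s: &str) -> Result<u64, ParseIntError> {
    s.parse::<u64>()
}

/// `a -> a -> bool`: in Rust these take both arguments by reference.
fn same(a: &i32, b: &i32) -> bool {
    PartialEq::eq(a, b)
}

fn less(a: &i32, b: &i32) -> bool {
    PartialOrd::lt(a, b)
}

fn main() {
    println!("{}", slice_len(&[10, 20, 30]));
    println!("{:?}", to_number("42"));
    println!("{} {}", same(&1, &1), less(&1, &2));
}
```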

Where Rust Makes This Harder

Rust’s type system is substantially more complex than Haskell’s, and that complexity surfaces immediately when you try to do type-directed search.

The first problem is lifetimes. A Rust function signature like fn split_at(&self, mid: usize) -> (&str, &str) carries lifetime information that is often elided in source code but present in the fully elaborated type. Deciding whether (&str, usize) -> (&str, &str) should match split_at requires either ignoring lifetimes entirely during search or handling them gracefully during unification. Ignoring them is the pragmatic choice but means you lose information that could matter for disambiguation.
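A small sketch of what is at stake. The first two functions show that the elided and the fully written-out signature are the same type. The last two (pick_first and pick_longer, both hypothetical) have the same shape once lifetimes are erased, even though their lifetimes say different things about which input the result borrows from.

```rust
/// Elided form, as it usually appears in source and documentation.
fn split_elided(s: &str, mid: usize) -> (&str, &str) {
    s.split_at(mid)
}

/// The same signature with the elided lifetime written out.
fn split_explicit<'a>(s: &'a str, mid: usize) -> (&'a str, &'a str) {
    s.split_at(mid)
}

/// The result borrows only from `x`.
fn pick_first<'a, 'b>(x: &'a str, _y: &'b str) -> &'a str {
    x
}

/// The result may borrow from either argument.
fn pick_longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() {
        x
    } else {
        y
    }
}

fn main() {
    println!("{:?}", split_elided("hello", 2));
    println!("{:?}", split_explicit("hello", 2));
    // With lifetimes erased, both of these look like (&str, &str) -> &str.
    println!("{}", pick_first("ab", "cde"));
    println!("{}", pick_longer("ab", "cde"));
}
```

A search that drops lifetimes cannot tell pick_first from pick_longer by type alone, which is the disambiguation cost described above.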

The second problem is trait bounds. Rust generics usually carry trait bounds. A function fn max<T: Ord>(a: T, b: T) -> T has a shape that looks like a -> a -> a, but only works for types implementing Ord. A Haskell-style search that ignores bounds will return this function for any query matching a -> a -> a, even if the caller’s type does not implement Ord. Properly incorporating bounds into the matching would require the search engine to reason about the trait graph, which brings in significant complexity.
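One way to picture bound-aware matching is as a check that runs after unification: for each generic parameter that was bound to a concrete type, look up whether that type implements the required traits. The sketch below assumes a flat, hand-written table of implementations; Candidate, known_impls and bounds_satisfied are hypothetical names, and a real engine would need blanket impls, supertraits and generic impls, which is where the complexity comes from.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical index entry: generic parameter name -> required traits.
struct Candidate {
    name: &'static str,
    bounds: HashMap<&'static str, Vec<&'static str>>,
}

/// fn max<T: Ord>(a: T, b: T) -> T, reduced to its bounds.
fn max_candidate() -> Candidate {
    Candidate {
        name: "max",
        bounds: HashMap::from([("T", vec!["Ord"])]),
    }
}

/// Hand-written (type, trait) facts standing in for the trait graph.
/// f64 implements PartialOrd but not Ord.
fn known_impls() -> HashSet<(&'static str, &'static str)> {
    HashSet::from([("i32", "Ord"), ("String", "Ord"), ("f64", "PartialOrd")])
}

/// After unification has produced `subst`, check every bound on every
/// parameter that was actually bound to a concrete type.
fn bounds_satisfied(
    cand: &Candidate,
    subst: &HashMap<&'static str, &'static str>,
    impls: &HashSet<(&'static str, &'static str)>,
) -> bool {
    cand.bounds.iter().all(|(param, traits)| match subst.get(param) {
        Some(concrete) => traits.iter().all(|tr| impls.contains(&(*concrete, *tr))),
        // Parameter left open by the query: nothing to check.
        None => true,
    })
}

fn main() {
    let cand = max_candidate();
    let impls = known_impls();
    for ty in ["i32", "f64"] {
        let subst = HashMap::from([("T", ty)]);
        println!("{}<{}>: {}", cand.name, ty, bounds_satisfied(&cand, &subst, &impls));
    }
}
```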

The third problem is method resolution and self types. Many of the most useful Rust functions are methods on types, and their signatures are written as impl Foo { fn bar(&self, ...) -> ... }. The self parameter is special. Whether it appears in a query as Foo, &Foo, or &mut Foo matters for correctness but may not matter for discovery. Roogle has to make pragmatic choices about how strictly to interpret method receivers.
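The strict-versus-permissive choice for receivers can be sketched as a normalization step. This is an assumption about how such a switch could work, not a description of Roogle’s behavior; the Ty enum, strip_refs and receiver_matches are hypothetical.

```rust
/// Minimal type representation for this sketch: a named type or a reference to one.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Named(String),
    Ref { mutable: bool, inner: Box<Ty> },
}

fn named(name: &str) -> Ty {
    Ty::Named(name.to_string())
}

fn reference(mutable: bool, inner: Ty) -> Ty {
    Ty::Ref { mutable, inner: Box::new(inner) }
}

/// Peel off every layer of `&` / `&mut`.
fn strip_refs(ty: &Ty) -> &Ty {
    match ty {
        Ty::Ref { inner, .. } => strip_refs(inner),
        other => other,
    }
}

/// Strict mode compares receivers exactly; permissive mode ignores
/// whether the method takes `self`, `&self`, or `&mut self`.
fn receiver_matches(query: &Ty, receiver: &Ty, strict: bool) -> bool {
    if strict {
        query == receiver
    } else {
        strip_refs(query) == strip_refs(receiver)
    }
}

fn main() {
    let by_value = named("Foo");
    let by_ref = reference(false, named("Foo"));
    println!("strict: {}", receiver_matches(&by_value, &by_ref, true));
    println!("permissive: {}", receiver_matches(&by_value, &by_ref, false));
}
```

Permissive matching favors discovery: a query written as Foo still finds a method taking &self. Strict matching favors correctness when the caller already knows how the value is held.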

None of these problems are unsolvable, but each requires a design decision about how strict versus permissive the matching should be. Hoogle had the luxury of a simpler type language. Roogle is navigating a richer surface.

The Rustdoc Search Evolution

Roogle did not develop in isolation. Rust’s own built-in documentation search, served through docs.rs and generated by rustdoc, has grown significantly more capable over time. In Rust 1.73, released in October 2023, rustdoc shipped type-based search as a stable feature. You can now type usize, usize -> bool into the docs.rs search bar and get functions matching that signature shape.

The rustdoc implementation uses a different strategy than Roogle. Rather than full type unification, it uses a type-aware scoring system on top of the existing inverted index. The index stores type paths for each function parameter and return type, and queries can specify types using -> syntax to filter by return type or provide comma-separated argument types. It is less expressive than full Hoogle-style unification but significantly faster and already integrated into the tooling every Rust developer uses.
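To make the contrast concrete, here is a toy inverted index keyed by type name. It is only a sketch of the general idea: it filters rather than scores, it ignores how many times a type appears, and it is not rustdoc’s implementation. FnEntry, Index, and the three sample functions are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical index record: a function name plus the type names it mentions.
struct FnEntry {
    name: &'static str,
    inputs: Vec<&'static str>,
    output: &'static str,
}

struct Index {
    entries: Vec<FnEntry>,
    by_input: HashMap<&'static str, BTreeSet<usize>>,
    by_output: HashMap<&'static str, BTreeSet<usize>>,
}

impl Index {
    /// Build posting lists from type name to the ids of functions using it.
    fn build(entries: Vec<FnEntry>) -> Self {
        let mut by_input: HashMap<&'static str, BTreeSet<usize>> = HashMap::new();
        let mut by_output: HashMap<&'static str, BTreeSet<usize>> = HashMap::new();
        for (id, entry) in entries.iter().enumerate() {
            for ty in &entry.inputs {
                by_input.entry(*ty).or_default().insert(id);
            }
            by_output.entry(entry.output).or_default().insert(id);
        }
        Index { entries, by_input, by_output }
    }

    /// Functions returning `output` whose inputs mention every queried type.
    fn search(&self, inputs: &[&str], output: &str) -> Vec<&'static str> {
        let mut hits: BTreeSet<usize> =
            self.by_output.get(output).cloned().unwrap_or_default();
        for ty in inputs {
            let with_ty = self.by_input.get(*ty).cloned().unwrap_or_default();
            hits = hits.intersection(&with_ty).copied().collect();
        }
        hits.into_iter().map(|id| self.entries[id].name).collect()
    }
}

/// Three made-up functions, enough to exercise the posting lists.
fn sample_index() -> Index {
    Index::build(vec![
        FnEntry { name: "contains_char", inputs: vec!["str", "char"], output: "bool" },
        FnEntry { name: "is_blank", inputs: vec!["str"], output: "bool" },
        FnEntry { name: "char_count", inputs: vec!["str"], output: "usize" },
    ])
}

fn main() {
    let index = sample_index();
    println!("{:?}", index.search(&["str", "char"], "bool"));
    println!("{:?}", index.search(&["str"], "bool"));
}
```

Lookups like this are cheap set intersections, which is why the approach is fast. What it cannot express is a constraint that two positions hold the same, unspecified type.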

Roogle’s value proposition has therefore shifted somewhat. When it was first proposed, type search in Rust docs simply did not exist. Now the gap is about expressiveness: Roogle can match type variable patterns across arguments and return types simultaneously in ways that rustdoc’s search does not fully support. Searching for a -> a -> a as a truly polymorphic pattern, where all three a tokens must unify to the same type, is qualitatively different from filtering by individual argument types.

The Architecture and Practical Status

Roogle’s codebase is organized around a few key stages. The parser reads either a rustdoc JSON file or a pre-built index and constructs an in-memory representation of all documented items. The query parser handles the type syntax in the search input, building a type AST that the unification engine can work with. The unification engine then iterates the index, attempts substitutions, and collects matches. Results are ranked and returned.
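As an illustration of the query-parsing stage, the sketch below turns a string such as a, Vec<a> -> Option<a> into a small type AST. The grammar is an assumption chosen for brevity (comma-separated parameters, one arrow, angle-bracket generics) and is not Roogle’s actual query syntax; Ty, Query, split_top_level, parse_ty and parse_query are hypothetical names.

```rust
/// Type AST for this sketch: a name with optional generic arguments.
#[derive(Debug, PartialEq)]
struct Ty {
    name: String,
    args: Vec<Ty>,
}

#[derive(Debug, PartialEq)]
struct Query {
    params: Vec<Ty>,
    ret: Ty,
}

/// Split on commas that are not nested inside `<...>`.
fn split_top_level(s: &str) -> Vec<&str> {
    let mut parts = Vec::new();
    let mut depth = 0usize;
    let mut start = 0;
    for (i, c) in s.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => {
                parts.push(&s[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    parts.push(&s[start..]);
    parts
}

/// Parse `Name` or `Name<Arg, ...>`; returns None on malformed input.
fn parse_ty(s: &str) -> Option<Ty> {
    let s = s.trim();
    if s.is_empty() {
        return None;
    }
    match s.find('<') {
        None => Some(Ty { name: s.to_string(), args: vec![] }),
        Some(open) => {
            let inner = s.strip_suffix('>')?.get(open + 1..)?;
            let args = split_top_level(inner)
                .into_iter()
                .map(parse_ty)
                .collect::<Option<Vec<_>>>()?;
            Some(Ty { name: s[..open].trim().to_string(), args })
        }
    }
}

/// Parse `params -> ret`.
fn parse_query(s: &str) -> Option<Query> {
    let (lhs, rhs) = s.split_once("->")?;
    let params = split_top_level(lhs)
        .into_iter()
        .map(parse_ty)
        .collect::<Option<Vec<_>>>()?;
    Some(Query { params, ret: parse_ty(rhs)? })
}

fn main() {
    println!("{:?}", parse_query("a, Vec<a> -> Option<a>"));
}
```

The resulting AST is what a unification engine would walk against each indexed signature before ranking the matches.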

The project is experimental. The rustdoc JSON format that Roogle depends on is still unstable and evolving, so changes to its schema require updates to Roogle’s parser. The corpus size is also a challenge: indexing all of crates.io at once would produce a large index, and the project has focused primarily on the standard library as a demonstration target.

For practical use today, the most accessible path is pointing Roogle at a specific crate or small set of crates you are actively working with, rather than expecting a full crates.io search. That matches how most developers would use it anyway: you are trying to find a function in the library you already imported, not searching the entire ecosystem.

Why This Matters Beyond the Feature

Type-directed search is a specific instance of a broader idea: that the type system, which developers invest significant effort maintaining, should be queryable as a first-class artifact. A type signature is a compact, precise description of a function’s contract. The function name is just a label, often chosen for readability in the library author’s context, which may not match the vocabulary a caller uses when searching.

In Rust specifically, the type system is unusually rich. The combination of generics, traits, and the ownership model means that type signatures carry more semantic weight than they do in many other languages. A function like fn drain<R: RangeBounds<usize>>(&mut self, range: R) -> Drain<T> tells you a great deal about what it does and what it requires. Making that information queryable is a direct investment in making the type system legible.

Projects like Roogle, and the type search improvements in rustdoc itself, represent the ecosystem recognizing that documentation is not just prose. The types are documentation too, and they are machine-readable. Building search on top of them is an obvious step once you frame it that way.

The Haskell community learned this early. The Rust community is building the same infrastructure with the added difficulty of a more complex type system and the practical constraints of a language that runs without a garbage collector. Roogle is an early, honest attempt at that problem, and whether or not the project itself reaches maturity, the ideas it explores are the same ones now shaping how Rust’s official tooling handles discovery.
