
Searching Rust APIs by Type Signature: What Roogle Gets Right and Why It's Hard

Source: lobsters

There is a particular kind of frustration that comes from knowing exactly what shape a function should have and having no idea what it is called. You need something that takes a Vec<u8> and a delimiter and gives you back an iterator of slices. You know this function exists somewhere in the standard library. You search docs.rs for “split”, wade through a dozen methods on str, and eventually find split_at, which is not it. Then you stumble across chunks, which is closer but still wrong, and twenty minutes later you find split on slices, buried three pages down. The whole time you knew the type signature. You just had no way to search by it.

Roogle is a Rust API search engine built to solve exactly this. The name is a direct nod to Hoogle, Haskell’s famous type-directed search tool. The concept is the same: describe the function you want in terms of its types, and the engine finds it. But the implementation details diverge considerably, because Rust and Haskell have different type systems with different properties, and bridging that gap for search purposes is genuinely non-trivial.

What Hoogle Does and Why It Works

Hoogle has been around since 2004 and has become central to how Haskell programmers discover APIs. The input is a Haskell type signature. You type (a -> b) -> [a] -> [b] and Hoogle returns map. You type [a] -> Int -> a and get (!!). The search engine performs type unification: it checks whether the query type can be made to match a known function’s type through substitution of type variables.

Haskell’s type system makes this tractable for several reasons. Types are built from a small set of primitive constructs: type variables, type constructors, and function arrows. There are no lifetimes, no mutability qualifiers, no where clauses with complex bounds. When you write a -> b -> a, there are two type variables and one constraint implied by the structure: the return type must be the same as the first argument. Unification over these structures is well-understood and efficient.

Hoogle indexes packages from Hackage, Haskell’s package repository, using the type information extracted from package documentation. The search index is built from these extracted signatures and ranking is based on a combination of type match quality and package popularity.

The Same Problem in Rust

Rust programmers encounter the same friction. The standard library exposes thousands of public functions and methods. The ecosystem on crates.io has hundreds of thousands more. Text-based search on docs.rs works reasonably well when you know the name or part of it, but when you are reasoning from types outward, it fails you.

Consider a concrete example. You want a function that converts an Option<Result<T, E>> into a Result<Option<T>, E>. This transposition pattern comes up frequently. The function Option::transpose exists in the standard library, but if you do not already know that name, no amount of keyword searching will reliably surface it. If you could query by type:

Option<Result<T, E>> -> Result<Option<T>, E>

You would get there in one step.
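For reference, here is what that function does once you have found it. This is a small sketch; the helper name parse_optional is made up for illustration and is not part of any library.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse an optional string field into an optional number.
// `Option::map` produces an Option<Result<i32, ParseIntError>>, and
// `Option::transpose` flips it into a Result<Option<i32>, ParseIntError>,
// which is the shape a caller can use with `?`.
fn parse_optional(field: Option<&str>) -> Result<Option<i32>, ParseIntError> {
    field.map(|s| s.parse::<i32>()).transpose()
}

fn main() {
    assert_eq!(parse_optional(Some("42")), Ok(Some(42)));
    assert_eq!(parse_optional(None), Ok(None));
    assert!(parse_optional(Some("not a number")).is_err());
}
```

The type query in the text describes exactly the last step of that helper: an Option of a Result goes in, a Result of an Option comes out.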

Roogle makes this possible. It indexes the Rust standard library and supports type-directed queries. You describe a function signature and the engine returns matching functions ranked by how closely they fit.

How Roogle Indexes Rust APIs

The foundation of Roogle’s approach is rustdoc’s JSON output format, available on nightly by passing -Z unstable-options --output-format json to rustdoc. This produces a machine-readable representation of every public item in a crate: functions, structs, enums, traits, type aliases, and their full type information. Unlike parsing source code, the JSON output has already been through the compiler’s name resolution, so the types in each signature are resolved and unambiguous.

The JSON format represents types as a recursive structure. A function item has an input and output type, each of which can be a primitive, a generic parameter, a resolved path (like std::vec::Vec), a reference with a lifetime, a slice, a tuple, and so on. For a function like:

pub fn transpose<T, E>(opt: Option<Result<T, E>>) -> Result<Option<T>, E>

The JSON would encode both the input and output types as nested type nodes, with T and E represented as generic parameters. Roogle uses this structure to build its search index.
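To make the recursive shape concrete, here is a deliberately simplified model of such a type tree. The enum, its variant names, and the helper functions are assumptions made for illustration; the real rustdoc JSON schema has more variants and different field names, and this is not Roogle's actual data structure.

```rust
// Simplified, hypothetical model of a type tree. Not the rustdoc JSON schema.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Primitive(String),                    // u8, usize, bool
    Generic(String),                      // T, E
    Path { name: String, args: Vec<Ty> }, // Option<...>, Result<..., ...>
    Ref(Box<Ty>),                         // &T (lifetime omitted here)
    Slice(Box<Ty>),                       // [T]
    Tuple(Vec<Ty>),                       // (A, B)
}

#[derive(Debug, Clone, PartialEq)]
struct FnSig {
    name: String,
    inputs: Vec<Ty>,
    output: Ty,
}

fn path(name: &str, args: Vec<Ty>) -> Ty {
    Ty::Path { name: name.to_string(), args }
}

fn generic(name: &str) -> Ty {
    Ty::Generic(name.to_string())
}

// Nesting depth of a type tree, computed by structural recursion.
fn depth(ty: &Ty) -> usize {
    match ty {
        Ty::Primitive(_) | Ty::Generic(_) => 1,
        Ty::Ref(inner) | Ty::Slice(inner) => 1 + depth(inner),
        Ty::Path { args, .. } => 1 + args.iter().map(depth).max().unwrap_or(0),
        Ty::Tuple(items) => 1 + items.iter().map(depth).max().unwrap_or(0),
    }
}

// The transpose signature from the text, written as nested type nodes.
fn transpose_sig() -> FnSig {
    FnSig {
        name: "transpose".to_string(),
        inputs: vec![path("Option", vec![path("Result", vec![generic("T"), generic("E")])])],
        output: path("Result", vec![path("Option", vec![generic("T")]), generic("E")]),
    }
}

fn main() {
    let sig = transpose_sig();
    // Option<Result<T, E>> and Result<Option<T>, E> both nest three levels deep.
    assert_eq!(depth(&sig.inputs[0]), 3);
    assert_eq!(depth(&sig.output), 3);
    println!("{:?}", sig);
}
```

Any operation a search index needs, such as normalizing or comparing types, can be written as the same kind of structural recursion over this tree.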

The Matching Algorithm

Type matching for Rust signatures is harder than for Haskell signatures because there are more things that need to match and more ways they can legitimately vary.

The core operation is type unification: given a query type and a candidate function’s signature, can we find a substitution for the type variables that makes them identical? For simple cases this is straightforward. A query of Vec<u8> -> usize, looking for a function that takes a Vec<u8> and returns its length, matches Vec::len. The method actually takes &self, so the engine has to look past the reference on the receiver, but after that and after substituting T = u8, the signatures align.
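The Vec::len case can be sketched as a small one-way matcher, where only the candidate signature contains type variables. This is a minimal illustration under that assumption, not Roogle's algorithm; the Ty and Sig types and the function names are invented for the example, and the candidate is written with the receiver's reference already removed.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Generic(String),        // a type variable in the candidate, such as T
    Named(String, Vec<Ty>), // a concrete constructor: usize, Vec<u8>, Vec<T>
}

struct Sig {
    inputs: Vec<Ty>,
    output: Ty,
}

fn named(name: &str, args: Vec<Ty>) -> Ty {
    Ty::Named(name.to_string(), args)
}

fn var(name: &str) -> Ty {
    Ty::Generic(name.to_string())
}

// Try to make `candidate` equal to `query` by binding the candidate's variables.
fn unify(candidate: &Ty, query: &Ty, subst: &mut HashMap<String, Ty>) -> bool {
    match (candidate, query) {
        (Ty::Generic(v), q) => {
            // A variable that is already bound must agree with its earlier binding.
            if let Some(bound) = subst.get(v) {
                return bound == q;
            }
            subst.insert(v.clone(), q.clone());
            true
        }
        (Ty::Named(cn, cargs), Ty::Named(qn, qargs)) => {
            cn == qn
                && cargs.len() == qargs.len()
                && cargs.iter().zip(qargs).all(|(c, q)| unify(c, q, subst))
        }
        // A concrete candidate type never matches a variable in the query here.
        _ => false,
    }
}

// Unify every input and the output under one shared substitution.
fn match_sig(candidate: &Sig, query: &Sig) -> Option<HashMap<String, Ty>> {
    let mut subst = HashMap::new();
    let ok = candidate.inputs.len() == query.inputs.len()
        && candidate
            .inputs
            .iter()
            .zip(&query.inputs)
            .all(|(c, q)| unify(c, q, &mut subst))
        && unify(&candidate.output, &query.output, &mut subst);
    if ok { Some(subst) } else { None }
}

fn main() {
    // Candidate: Vec<T> -> usize (Vec::len with the receiver's reference dropped).
    let len = Sig { inputs: vec![named("Vec", vec![var("T")])], output: named("usize", vec![]) };
    // Query: Vec<u8> -> usize.
    let query = Sig {
        inputs: vec![named("Vec", vec![named("u8", vec![])])],
        output: named("usize", vec![]),
    };
    let subst = match_sig(&len, &query).expect("signatures should unify");
    assert_eq!(subst.get("T"), Some(&named("u8", vec![])));

    // A candidate (T, T) -> T cannot match (u8, usize) -> u8: T would need two bindings.
    let same = Sig { inputs: vec![var("T"), var("T")], output: var("T") };
    let mixed = Sig {
        inputs: vec![named("u8", vec![]), named("usize", vec![])],
        output: named("u8", vec![]),
    };
    assert!(match_sig(&same, &mixed).is_none());
}
```

A real engine would also need to handle variables on the query side and argument reordering, which this sketch leaves out.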

But Rust introduces complications that Haskell does not have:

Lifetimes. A function signature in Rust might be fn foo<'a>(s: &'a str) -> &'a str. A naive query of &str -> &str needs to match this despite the explicit lifetime annotation. Lifetime parameters need to be treated similarly to type parameters: abstractable, substitutable, and ignorable at the query level when the user has not specified them.

Trait bounds. In Haskell, typeclass constraints appear explicitly in the type signature: Eq a => [a] -> [a] -> Bool. In Rust, trait bounds appear in where clauses or angle-bracket syntax: fn contains<T: PartialEq>(slice: &[T], item: &T) -> bool. A search engine needs to decide how strictly to enforce these bounds when matching. Requiring exact bound matches would make the search too narrow; ignoring them entirely would produce too many false positives.

impl Trait and dyn Trait. Rust has two ways to express “some type that implements this trait” in function signatures. impl Iterator<Item = u8> in a return position means the function returns some concrete iterator type the caller does not name. dyn Iterator<Item = u8> means a trait object. Both need to be handled in the index and matched against appropriately vague queries.

Associated types. Many Rust traits have associated types. Iterator::Item, Deref::Target, Add::Output. A function that returns <T as Iterator>::Item has a return type that depends on an associated type projection. Matching against these projections requires additional machinery beyond straightforward unification.
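The four complications above all show up in ordinary signatures. The functions below are invented for illustration (none of them are std APIs); each pair or item is something a type-directed search would plausibly want to treat as the same shape, or at least as a close match.

```rust
// Lifetimes: the explicit and elided spellings describe the same signature.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

fn first_word_elided(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// Trait bounds: inline syntax and a where clause express the same constraint.
fn contains_inline<T: PartialEq>(slice: &[T], item: &T) -> bool {
    slice.iter().any(|x| x == item)
}

fn contains_where<T>(slice: &[T], item: &T) -> bool
where
    T: PartialEq,
{
    slice.contains(item)
}

// impl Trait vs dyn Trait: an opaque concrete type versus a boxed trait object.
fn bytes_opaque(data: &[u8]) -> impl Iterator<Item = u8> + '_ {
    data.iter().copied()
}

fn bytes_boxed(data: &[u8]) -> Box<dyn Iterator<Item = u8> + '_> {
    Box::new(data.iter().copied())
}

// Associated types: the return type is a projection that depends on I.
fn first_item<I: Iterator>(mut iter: I) -> Option<<I as Iterator>::Item> {
    iter.next()
}

fn main() {
    assert_eq!(first_word_explicit("hello world"), first_word_elided("hello world"));
    assert_eq!(contains_inline(&[1, 2, 3], &2), contains_where(&[1, 2, 3], &2));

    let data = [1u8, 2, 3];
    let opaque: Vec<u8> = bytes_opaque(&data).collect();
    let boxed: Vec<u8> = bytes_boxed(&data).collect();
    assert_eq!(opaque, boxed);

    // The same function returns Option<u8> or Option<char> depending on the iterator.
    assert_eq!(first_item(vec![10u8, 20].into_iter()), Some(10u8));
    assert_eq!(first_item("hi".chars()), Some('h'));
}
```

In each case the surface syntax differs while the caller-visible behavior is identical or nearly so, which is why exact structural matching alone is too strict.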

Roogle’s approach to these challenges, based on the project structure, involves normalizing types during indexing (stripping lifetimes, treating type parameters uniformly) and using an approximate matching strategy that ranks results by closeness rather than requiring exact structural identity.
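A rough sketch of what normalization and closeness ranking could look like follows. Everything here is an assumption made for illustration: the Ty type, the canonical names T0, T1, and so on, and the scoring rule are not taken from Roogle's source.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Named(String, Vec<Ty>),
    Generic(String),
    Ref { lifetime: Option<String>, inner: Box<Ty> },
}

// Normalize a type for indexing: drop lifetime names and rename type
// parameters to T0, T1, ... in order of first appearance.
fn normalize(ty: &Ty, names: &mut HashMap<String, String>) -> Ty {
    match ty {
        Ty::Named(n, args) => {
            Ty::Named(n.clone(), args.iter().map(|a| normalize(a, names)).collect())
        }
        Ty::Generic(n) => {
            let next = format!("T{}", names.len());
            Ty::Generic(names.entry(n.clone()).or_insert(next).clone())
        }
        Ty::Ref { inner, .. } => Ty::Ref {
            lifetime: None,
            inner: Box::new(normalize(inner, names)),
        },
    }
}

// Crude closeness score: count the nodes that agree when both trees are
// walked in lockstep from the root. Higher means closer.
fn score(a: &Ty, b: &Ty) -> u32 {
    match (a, b) {
        (Ty::Generic(x), Ty::Generic(y)) if x == y => 1,
        (Ty::Named(an, aargs), Ty::Named(bn, bargs)) if an == bn => {
            1 + aargs.iter().zip(bargs).map(|(x, y)| score(x, y)).sum::<u32>()
        }
        (Ty::Ref { inner: x, .. }, Ty::Ref { inner: y, .. }) => 1 + score(x, y),
        _ => 0,
    }
}

// Build `&str`, optionally with a named lifetime such as 'a.
fn str_ref(lifetime: Option<&str>) -> Ty {
    Ty::Ref {
        lifetime: lifetime.map(|l| l.to_string()),
        inner: Box::new(Ty::Named("str".to_string(), vec![])),
    }
}

fn main() {
    // Indexed parameter type: &'a str. Query type: &str.
    let indexed = normalize(&str_ref(Some("'a")), &mut HashMap::new());
    let query = normalize(&str_ref(None), &mut HashMap::new());
    assert_eq!(indexed, query);

    // Rank two candidates against the query; the reference to str wins.
    let string = Ty::Named("String".to_string(), vec![]);
    let mut candidates = vec![("takes_string", string), ("takes_str_ref", indexed)];
    candidates.sort_by_key(|(_, ty)| std::cmp::Reverse(score(ty, &query)));
    assert_eq!(candidates[0].0, "takes_str_ref");
}
```

Scoring by matching nodes is only one possible notion of closeness; an edit-distance style cost over the two trees would be another reasonable choice.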

Comparison with Other Ecosystems

The idea of type-directed search has spread beyond Haskell. Pursuit does the same for PureScript, which has a similar type system to Haskell. Elm’s package documentation supports type search as a first-class feature. These languages all share Haskell’s relatively simple type grammar, which makes the matching problem tractable.

For languages with more complex type systems, solutions are harder. TypeScript has a type-based search tool called TypeSearch for DefinitelyTyped, but it does structural matching rather than unification because TypeScript’s structural subtyping makes nominal matching inadequate. Java and C# have reflection-based API browsers but nothing resembling type-directed search because their type systems (especially with generics) make the unification problem considerably more complex.

Rust sits somewhere between Haskell and Java in this regard. The type system is more expressive than Java’s but has enough structural regularity to make search feasible. Lifetimes are the main complication without a direct analog in other languages.

The Current State

Roogle is an active but early-stage project. It currently indexes the Rust standard library and supports a usable subset of type query syntax. The hosted version is accessible online and functional for basic queries. It does not yet cover the full crates.io ecosystem, which would require both the indexing infrastructure to process thousands of crates and a ranking system to surface relevant results over that corpus.

The rustdoc JSON format itself has been stabilizing over several Rust releases. RFC 2963 formalized the format, and it has been incrementally extended to cover more of Rust’s type surface. As the format matures, tools like Roogle benefit from increasingly complete type information without needing to reimplement parts of the compiler’s type resolution.

There is an open tracking issue in the Rust repository for type-based search in rustdoc itself, suggesting interest from the core team in eventually incorporating something like this into the official documentation tooling. Whether that means adopting Roogle, building something new, or integrating with docs.rs remains open.

Why This Matters

Type-directed search changes how you explore a library. With text search, you start from names and navigate toward types. With type search, you start from types and arrive at names. For programmers who think type-first, the second workflow is significantly more natural, and the difference compounds when the codebase you are exploring is unfamiliar.

The Haskell community’s experience with Hoogle suggests the effect is real and substantial. Hoogle is not just a convenience; it has shaped how Haskell libraries are designed, because authors know their functions will be findable by type. Functions get cleaner, more compositional signatures partly because those signatures become the primary interface through which developers discover them.

Rust has a similarly strong type system culture. Types carry significant semantic weight in idiomatic Rust code. A tool that lets you search that semantic weight directly, rather than routing through names and documentation text, fits naturally into how Rust programmers already reason about their code.

Roogle is early, but the direction is right. The hard parts, handling lifetimes, trait bounds, and associated types in a way that produces useful results without too much noise, are tractable engineering problems. The infrastructure for indexing, in the form of rustdoc’s JSON output, is already there. The main work is in the matching algorithm and the scale of the index.

For now, it covers the standard library well enough to be genuinely useful for the case you hit most often: that function you know exists in std but cannot name. That alone is worth knowing about.
