Searching Rust APIs by Type Signature, Not Name

Source: lobsters

When you know what types you’re working with but not what the function is called, Rust’s documentation ecosystem doesn’t help much. The search on docs.rs is name-based, matching against function identifiers and crate names. cargo search is even further removed from the code itself. If you have an Option<Vec<u8>> and need to flatten it, or you want to convert a Result<T, E> to an Option<T> without knowing the method is called ok, you’re left browsing method lists manually.

Roogle is a Rust API search engine that addresses this directly. You give it a type signature query written in Rust syntax, and it finds functions that match that shape. The name is a portmanteau of Rust and Hoogle, Haskell’s type-directed search engine, and the lineage is intentional.

Hoogle and the Idea of Type as Query

Haskell’s Hoogle has been around since 2004. The core insight is that in a typed language, the type signature of a function encodes a significant portion of what it does. If you need a function that takes a list and returns one element, you search for [a] -> a and get head, last, and a handful of others. If you need something that applies a function to every element of a list, (a -> b) -> [a] -> [b] gives you map. The type is the specification.

This works especially well in Haskell because of parametric polymorphism. A function of type [a] -> [a] knows nothing about a, so it cannot invent or inspect elements; it can only rearrange, drop, or duplicate the ones it was given. Hoogle handles typeclass constraints too, so Ord a => [a] -> [a] surfaces sort even when you don’t know what it’s called.

The idea spread elsewhere over time. Elm had type-based search in its package catalog for several years. OCaml’s Merlin does type-directed completion inside editors. Agda and Idris have type-directed proof search baked into their core workflows. But for a language as widely used as Rust, this kind of search tooling arrived late.

What Roogle Does

Roogle implements type-directed search over Rust’s public API surface. You write a query that looks like a Rust function signature, and it returns matching functions from indexed crates. The query syntax mirrors Rust’s own syntax closely: types are written as they appear in actual code, with generics, references, and slices all expressible.

A query for fn(Vec<T>) -> Option<T> would match functions that take a Vec of some type and return an Option of the same type. The type variable T is treated as a unification variable, not a concrete type. A query like fn(&str) -> String surfaces string conversion functions. You can leave parts of the signature underspecified with wildcards to broaden results.

The engine indexes crates by consuming the JSON output that rustdoc produces when invoked with --output-format json. This output format, which landed as an unstable feature around Rust 1.55 and has been evolving since, provides a complete machine-readable description of a crate’s public API: every item, its type, its documentation, and its relationships to other items. Rather than scraping HTML documentation or reimplementing Rust’s type system from scratch, Roogle works from this structured representation that the compiler itself produces.

Type Unification in Practice

The matching algorithm at the heart of Roogle is a form of first-order unification. When you submit a query signature, the engine walks the type tree of each candidate function and attempts to construct a substitution that maps the type variables in your query to concrete types in the candidate, such that the two signatures become identical under that substitution.

For a concrete example: if you query fn(T, T) -> bool, the unifier needs to find functions where both argument types are the same, whatever that type is. Setting aside the references on their receivers, this is the shape of comparison methods such as PartialEq::eq and PartialOrd::lt, which take &self and a reference to the other operand. It is not the shape of fn(T, U) -> bool, where the two type variables are independent. The constraint that T must bind to the same type in both positions is what filters the results.
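The repeated-variable constraint can be sketched in a few lines. This is a minimal illustration of one-way unification (matching), not Roogle’s actual code: the names Ty, Sig, match_ty, and match_sig are hypothetical, and the type language is reduced to query variables and named constructors.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified type language (an assumption for this
/// sketch): a query variable, or a named constructor applied to arguments.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Var(&'static str),
    Ctor(&'static str, Vec<Ty>),
}

/// A function signature: argument types plus a return type.
#[derive(Debug, Clone, PartialEq)]
struct Sig {
    args: Vec<Ty>,
    ret: Ty,
}

/// Substitution from query variable names to candidate types.
type Subst = HashMap<&'static str, Ty>;

/// Shorthand for a constructor with no arguments, such as `bool` or `i32`.
fn named(name: &'static str) -> Ty {
    Ty::Ctor(name, Vec::new())
}

fn sig(args: Vec<Ty>, ret: Ty) -> Sig {
    Sig { args, ret }
}

/// Try to bind variables in `query` so that it equals `candidate`.
/// Types inside the candidate are treated as opaque.
fn match_ty(query: &Ty, candidate: &Ty, subst: &mut Subst) -> bool {
    match query {
        Ty::Var(name) => match subst.get(name) {
            // Already bound: the earlier binding must agree with this position.
            Some(bound) => bound == candidate,
            // First occurrence: record the binding.
            None => {
                subst.insert(*name, candidate.clone());
                true
            }
        },
        Ty::Ctor(qn, qargs) => match candidate {
            Ty::Ctor(cn, cargs) => {
                qn == cn
                    && qargs.len() == cargs.len()
                    && qargs.iter().zip(cargs).all(|(q, c)| match_ty(q, c, subst))
            }
            Ty::Var(_) => false,
        },
    }
}

/// Match a whole signature, returning the substitution on success.
fn match_sig(query: &Sig, candidate: &Sig) -> Option<Subst> {
    let mut subst = Subst::new();
    let ok = query.args.len() == candidate.args.len()
        && query
            .args
            .iter()
            .zip(&candidate.args)
            .all(|(q, c)| match_ty(q, c, &mut subst))
        && match_ty(&query.ret, &candidate.ret, &mut subst);
    if ok { Some(subst) } else { None }
}

fn main() {
    // Query: fn(T, T) -> bool
    let query = sig(vec![Ty::Var("T"), Ty::Var("T")], named("bool"));
    // fn(i32, i32) -> bool matches, binding T to i32.
    let same = sig(vec![named("i32"), named("i32")], named("bool"));
    // fn(i32, String) -> bool fails: T cannot be both i32 and String.
    let mixed = sig(vec![named("i32"), named("String")], named("bool"));
    println!("{:?}", match_sig(&query, &same));
    println!("{:?}", match_sig(&query, &mixed));
}
```

A real engine would also need to instantiate generic candidates, rank partial matches, and handle references and bounds, all of which this sketch leaves out.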

This gets substantially more complex with Rust’s type system than with Haskell’s. Rust distinguishes between T, &T, &mut T, Box<T>, Rc<T>, and Arc<T>. A query for fn(&str) -> usize is not the same as fn(String) -> usize, and the engine needs to handle these distinctions correctly. Lifetimes add another layer: fn<'a>(&'a str) -> &'a str is a different contract from fn(&str) -> String, expressing that the output borrows from the input rather than owning new memory. Whether Roogle treats lifetime variables as proper unification variables or elides them for matching purposes is a significant design decision that affects both correctness and the volume of results.
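To make the borrowing-versus-owning distinction concrete, here are two small functions with the two shapes mentioned above. The names trimmed and shouted are made up for this illustration; only str::trim and str::to_uppercase come from the standard library.

```rust
/// Shape: fn<'a>(&'a str) -> &'a str. The result borrows from the input.
fn trimmed(s: &str) -> &str {
    s.trim()
}

/// Shape: fn(&str) -> String. The result owns freshly allocated memory.
fn shouted(s: &str) -> String {
    s.to_uppercase()
}

fn main() {
    let owned: String;
    {
        let input = String::from("  hello  ");
        // `trimmed(&input)` could not be stored in a variable that outlives
        // `input`, because the returned slice points into it.
        assert_eq!(trimmed(&input), "hello");
        // `shouted` returns an independent String, so it can escape the block.
        owned = shouted(trimmed(&input));
    }
    assert_eq!(owned, "HELLO");
}
```

A search engine that ignores lifetimes entirely would still tell these two apart by their return types, but it could not distinguish two functions that both return &str with different borrowing relationships to their inputs.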

Trait bounds play the role that typeclass constraints play in Hoogle queries, and they have to be handled with the same care. If a function is generic over T: Hash + Eq, that constraint is meaningful for understanding what the function does, and a user querying for fn(&HashMap<K, V>, &K) -> Option<&V> probably expects to find HashMap::get, which requires the key type to implement Hash + Eq. Surfacing these correctly without generating noise from overly permissive matches requires the engine to index bounds as part of the signature and reason about them during unification.
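As a rough illustration, here is the queried shape written out as a free function with its bounds made explicit, taking the map by reference as HashMap::get does. The wrapper name lookup is hypothetical; the real HashMap::get is slightly more general, since it accepts any borrowed form of the key.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// The shape fn(&HashMap<K, V>, &K) -> Option<&V>, with the bounds that
/// make it callable spelled out. `lookup` is a made-up wrapper for this sketch.
fn lookup<'m, K: Hash + Eq, V>(map: &'m HashMap<K, V>, key: &K) -> Option<&'m V> {
    // Calling the method through its path shows that the receiver is
    // just the first argument.
    HashMap::get(map, key)
}

fn main() {
    let mut ages = HashMap::new();
    ages.insert(String::from("ada"), 36);
    assert_eq!(lookup(&ages, &String::from("ada")), Some(&36));
    assert_eq!(lookup(&ages, &String::from("bob")), None);
}
```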

Why This Problem Is Harder in Rust Than in Haskell

Haskell’s type system, while sophisticated in its own ways, is more uniform in how polymorphism is expressed. You have type variables, typeclass constraints, and function arrows. Rust adds ownership, borrowing, multiple smart pointer types, and the split between concrete types and the traits they implement, with methods spread across inherent impls and trait impls.

A Haskell function f :: [a] -> a has multiple natural Rust analogs: fn(&[T]) -> &T, fn(Vec<T>) -> T, fn(&mut Vec<T>) -> Option<T>. These are meaningfully different functions with different ownership semantics. Roogle has to decide whether a query for one should surface the others, and under what conditions. There is no clean answer, only trade-offs between recall and precision.
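The three shapes can be written side by side. The function names below are invented for illustration, and the first two return Option because the standard library methods they wrap (slice::first and Iterator::next) do so for empty inputs.

```rust
/// Borrowing analog: the caller keeps the collection and gets a reference.
fn first_ref<T>(xs: &[T]) -> Option<&T> {
    xs.first()
}

/// Consuming analog: the collection is moved in and an owned element comes out.
fn first_owned<T>(xs: Vec<T>) -> Option<T> {
    xs.into_iter().next()
}

/// Mutating analog: the element is removed from the caller's collection.
fn take_last<T>(xs: &mut Vec<T>) -> Option<T> {
    xs.pop()
}

fn main() {
    let mut v = vec![String::from("a"), String::from("b")];
    assert_eq!(first_ref(&v).map(String::as_str), Some("a"));
    assert_eq!(take_last(&mut v), Some(String::from("b")));
    // `v` is moved here and cannot be used afterwards.
    assert_eq!(first_owned(v), Some(String::from("a")));
}
```

All three answer "give me one element of this list", yet a strict unifier would treat them as three unrelated signatures.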

The scale of the ecosystem also creates pressure. Hoogle’s default index is built around Stackage, a curated subset of Hackage. Crates.io hosts well over a hundred thousand crates with enormous variance in documentation quality and API design conventions. Efficient indexing and fast lookup over this surface requires careful data structure choices, particularly since type queries can be structurally complex and expensive to unify naively.

Method vs. function syntax adds a normalization problem. A method Vec<T>::pop(&mut self) -> Option<T> is semantically a function fn(&mut Vec<T>) -> Option<T>. A type-directed search should surface it when you query for that shape. Normalizing receiver-style methods into regular function signatures for indexing is conceptually straightforward but needs to be handled consistently across the entire index.
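Rust itself already treats the two forms as interchangeable, which is what makes the normalization sound. In this short sketch the helper name pop_as_fn is made up; Vec::pop is the standard library method.

```rust
/// Returns `Vec::pop` as a plain function pointer of the normalized shape
/// fn(&mut Vec<T>) -> Option<T>, here with T = i32.
fn pop_as_fn() -> fn(&mut Vec<i32>) -> Option<i32> {
    Vec::pop
}

fn main() {
    let mut v = vec![1, 2, 3];
    // Method-call syntax.
    assert_eq!(v.pop(), Some(3));
    // Path syntax: the receiver is passed as the first argument.
    assert_eq!(Vec::pop(&mut v), Some(2));
    // Through a function pointer with the normalized signature.
    let pop = pop_as_fn();
    assert_eq!(pop(&mut v), Some(1));
    assert_eq!(pop(&mut v), None);
}
```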

The Tooling Gap It Fills

The existing Rust documentation tooling is good at what it does. rustdoc produces excellent HTML docs, docs.rs hosts them reliably, and rust-analyzer provides deep IDE integration for code you’re already writing. But none of these help when you’re standing outside a codebase trying to discover what functions exist.

The closest thing in the standard ecosystem is rustdoc’s built-in JavaScript search, which as of recent Rust versions includes some limited type filtering. You can search for a function name and it will narrow results by type, but this is closer to filtering than unification, and it’s scoped to a single crate’s documentation page rather than cross-crate.

There have been ongoing discussions in the Rust project about improving API discoverability, including work on making rustdoc’s search more expressive. Roogle as a community project applies concrete pressure to that conversation: it demonstrates that the underlying data (rustdoc JSON) supports this use case and that the demand exists.

A Practical Use Case

Consider the situation where you have a Result<T, E> and want to convert it to an Option<T>, discarding the error. You might not know that the method is called ok. A query to Roogle for fn(Result<T, E>) -> Option<T> should surface it immediately, and the mirror query fn(Result<T, E>) -> Option<E> should surface err. This is a small example, but the pattern scales: any time you know the type transformation you need and not the name of the function that performs it, type-directed search is the right tool.
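For reference, this is what the two standard library methods look like when viewed as plain functions of those shapes. The helper names ok_as_fn and err_as_fn are hypothetical, and the type parameters are fixed to i32 and String only to keep the example concrete.

```rust
/// `Result::ok` viewed as fn(Result<T, E>) -> Option<T>.
fn ok_as_fn() -> fn(Result<i32, String>) -> Option<i32> {
    Result::ok
}

/// `Result::err` viewed as fn(Result<T, E>) -> Option<E>.
fn err_as_fn() -> fn(Result<i32, String>) -> Option<String> {
    Result::err
}

fn main() {
    let good: Result<i32, String> = Ok(7);
    let bad: Result<i32, String> = Err(String::from("boom"));

    // `ok` keeps the success value and discards the error.
    assert_eq!(ok_as_fn()(good.clone()), Some(7));
    assert_eq!(ok_as_fn()(bad.clone()), None);

    // `err` keeps the error and discards the success value.
    assert_eq!(err_as_fn()(good), None);
    assert_eq!(err_as_fn()(bad), Some(String::from("boom")));
}
```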

The same applies when learning a new crate. Rather than reading through an entire API surface to find the function that converts a type you have into a type you need, you describe the transformation in types and let the search do the navigation.

What the Implementation Reveals

The Roogle repository is worth reading as an example of what becomes possible once rustdoc JSON is treated as a first-class data source. The architecture is relatively clean: a JSON ingestion layer that parses rustdoc output, a type index built from the ingested items, a query parser that handles Rust’s type syntax, and a unification engine that matches queries against the index.

Building on rustdoc JSON means Roogle inherits the compiler’s understanding of types rather than reimplementing it. The format includes resolved types with their full generic structure, making it significantly more tractable to implement unification than if you were parsing raw source code.

Rust’s strong type system creates information density in function signatures that name-based search throws away. The type signature of a function tells you, precisely and verifiably, what it consumes and what it produces. Roogle recovers that information and puts it to work as a query language, which is the direction this kind of tooling should have been heading all along.
