
Finding Rust Functions by Shape, Not by Name

Source: lobsters

The problem is straightforward: you are writing Rust, you know a function exists that turns a Vec<u8> into a String, but you do not know what it is called. Docs.rs search is built primarily around names and keywords. You guess a few terms, read through some results, and eventually land on String::from_utf8, which returns a Result<String, FromUtf8Error> because the bytes may not be valid UTF-8. This is a solved problem in other languages. Roogle is the Rust community’s most direct attempt to solve it here.
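For reference, here is a minimal sketch of the function this search eventually lands on. The wrapper name bytes_to_string is hypothetical and only there to make the shape explicit; String::from_utf8 itself is part of the standard library.

```rust
use std::string::FromUtf8Error;

// Hypothetical wrapper: it only spells out the shape of String::from_utf8.
// The Result is there because arbitrary bytes may not be valid UTF-8.
fn bytes_to_string(bytes: Vec<u8>) -> Result<String, FromUtf8Error> {
    String::from_utf8(bytes)
}

fn main() {
    let ok = bytes_to_string(b"Rust".to_vec());
    assert_eq!(ok.unwrap(), "Rust");

    // The byte 0xFF never appears in valid UTF-8, so this conversion fails.
    let bad = bytes_to_string(vec![0xFF]);
    assert!(bad.is_err());
}
```

Note that the real signature is Vec<u8> -> Result<String, FromUtf8Error>, so a type search has to tolerate a wrapped return type to find it from a query for Vec<u8> -> String.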

The Hoogle Model

Haskell’s Hoogle has offered type-based API search for over two decades. The premise is that in a statically typed language, a function’s type signature describes what it does more precisely than its name often does. Searching for a -> [a] -> [a] returns (:), insert, intersperse, and other list functions that take an element and a list and return a list. The query encodes the structural constraint, and the search engine finds everything that satisfies it.

The matching works through alpha-equivalence: type variable names in your query do not need to match the names in the function signature, because only the structure matters. Searching for a -> b -> a finds const :: a -> b -> a, and searching for concrete types like String -> Int finds functions with compatible signatures. The engine handles partial matches too, so you can search on just the return type or a subset of the arguments.
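To make alpha-equivalence concrete, here is a minimal sketch of a checker over a toy type representation. The Ty enum and the functions alpha_eq, walk, var, con, and fun are all hypothetical names for illustration; this is not how Hoogle or Roogle is implemented. It only covers exact matching up to variable renaming, not partial matches.

```rust
use std::collections::HashMap;

// Toy type terms (an assumption for illustration, not a real engine's model).
#[derive(Debug)]
enum Ty {
    Var(String),           // a type variable such as `a`
    Con(String, Vec<Ty>),  // a named type with arguments, e.g. Int or List a
    Fun(Box<Ty>, Box<Ty>), // a function type `arg -> result`
}

fn var(name: &str) -> Ty { Ty::Var(name.to_string()) }
fn con(name: &str, args: Vec<Ty>) -> Ty { Ty::Con(name.to_string(), args) }
fn fun(arg: Ty, result: Ty) -> Ty { Ty::Fun(Box::new(arg), Box::new(result)) }

// Two types are alpha-equivalent if a one-to-one renaming of variables
// turns one into the other.
fn alpha_eq(query: &Ty, candidate: &Ty) -> bool {
    let mut fwd = HashMap::new(); // query variable -> candidate variable
    let mut bwd = HashMap::new(); // candidate variable -> query variable
    walk(query, candidate, &mut fwd, &mut bwd)
}

fn walk<'a>(
    q: &'a Ty,
    c: &'a Ty,
    fwd: &mut HashMap<&'a str, &'a str>,
    bwd: &mut HashMap<&'a str, &'a str>,
) -> bool {
    match (q, c) {
        (Ty::Var(x), Ty::Var(y)) => {
            // Record the pairing on first sight, then insist it stays consistent
            // in both directions so that the renaming is a bijection.
            let mapped = *fwd.entry(x.as_str()).or_insert(y.as_str());
            let back = *bwd.entry(y.as_str()).or_insert(x.as_str());
            mapped == y.as_str() && back == x.as_str()
        }
        (Ty::Con(n1, a1), Ty::Con(n2, a2)) => {
            if n1 != n2 || a1.len() != a2.len() {
                return false;
            }
            for (l, r) in a1.iter().zip(a2.iter()) {
                if !walk(l, r, fwd, bwd) {
                    return false;
                }
            }
            true
        }
        (Ty::Fun(a1, r1), Ty::Fun(a2, r2)) => {
            walk(a1, a2, fwd, bwd) && walk(r1, r2, fwd, bwd)
        }
        _ => false,
    }
}

fn main() {
    // Query: a -> b -> a
    let query = fun(var("a"), fun(var("b"), var("a")));
    // const :: x -> y -> x has the same shape under renaming.
    let konst = fun(var("x"), fun(var("y"), var("x")));
    // x -> y -> y returns the second argument, so the shape differs.
    let second = fun(var("x"), fun(var("y"), var("y")));

    assert!(alpha_eq(&query, &konst));
    assert!(!alpha_eq(&query, &second));

    // Concrete types must match by name: a -> List a vs. x -> List x.
    let q2 = fun(var("a"), con("List", vec![var("a")]));
    let c2 = fun(var("x"), con("List", vec![var("x")]));
    assert!(alpha_eq(&q2, &c2));
}
```

Tracking the renaming in both directions is what stops a -> b -> a from matching x -> x -> x, where two distinct query variables would collapse into one.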

This works cleanly in Haskell because the type system has a relatively uniform structure. Types, type variables, and type class constraints occupy well-defined roles, and the surface area a search engine has to cover is manageable.

Why Rust Makes This Harder

Rust’s type system adds several dimensions that complicate type-based search in ways Hoogle never had to deal with.

Lifetimes are structural but frequently implicit in practice. A function like fn longest<'a>(x: &'a str, y: &'a str) -> &'a str has a meaningful lifetime relationship between its parameters and its return value. A user searching for (&str, &str) -> &str expects to find it, but a naive matcher that requires literal signature equality misses it unless lifetime annotations are normalized away or treated as wildcards.
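One way to picture "normalizing lifetimes away" is a pre-processing step that erases lifetime names before comparison. The sketch below works on signature text purely for illustration. The function erase_lifetimes is hypothetical, it only handles lifetimes attached to references, and a real engine would more plausibly do this on a parsed type tree rather than on strings.

```rust
// Hypothetical helper: drop `'name` tokens (and the whitespace after them)
// so that `&'a str` and `&str` compare equal.
// Assumption: the input is a type-only signature, with no generic parameter
// list such as `<'a>` and no character literals.
fn erase_lifetimes(signature: &str) -> String {
    let mut out = String::with_capacity(signature.len());
    let mut chars = signature.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\'' {
            // Skip the lifetime name itself.
            while matches!(chars.peek(), Some(n) if n.is_alphanumeric() || *n == '_') {
                chars.next();
            }
            // Skip the whitespace separating the lifetime from the type.
            while matches!(chars.peek(), Some(n) if n.is_whitespace()) {
                chars.next();
            }
        } else {
            out.push(c);
        }
    }
    out
}

fn main() {
    let annotated = "(&'a str, &'a str) -> &'a str";
    let query = "(&str, &str) -> &str";
    assert_eq!(erase_lifetimes(annotated), query);
    // Text without lifetimes passes through unchanged.
    assert_eq!(erase_lifetimes(query), query);
}
```

Erasing lifetimes entirely is the bluntest policy: it also hides the fact that the result of longest borrows from both arguments, which a more careful matcher might want to keep as ranking information.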

Trait bounds change the semantics of generic type variables. fn f<T: Clone>(x: T) -> T is different from fn f<T>(x: T) -> T because the body of the first version can call .clone() on its argument, and callers can only pass types that implement Clone. A search engine has to decide whether a query for T -> T should match both, match only the unconstrained version, or surface them in different ranking bands. No single answer is universally correct.
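The "ranking bands" option can be sketched as follows. This is one possible policy, not Roogle's documented behavior: the Band enum and the band function are hypothetical, and bounds are simplified to a deduplicated list of trait names on a single type variable.

```rust
// Hypothetical ranking bands. Deriving Ord makes Exact sort before ExtraBounds.
#[derive(Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Band {
    Exact,       // the candidate has exactly the bounds the query named
    ExtraBounds, // the candidate requires more than the query named
}

// Assumed policy: every bound named in the query must appear on the
// candidate; additional bounds on the candidate demote it to a lower band
// instead of excluding it. Both lists are assumed to contain no duplicates.
fn band(query_bounds: &[&str], candidate_bounds: &[&str]) -> Option<Band> {
    if !query_bounds.iter().all(|b| candidate_bounds.contains(b)) {
        return None;
    }
    if candidate_bounds.len() == query_bounds.len() {
        Some(Band::Exact)
    } else {
        Some(Band::ExtraBounds)
    }
}

fn main() {
    // Query `T -> T` against `fn f<T>(x: T) -> T`.
    assert_eq!(band(&[], &[]), Some(Band::Exact));
    // Query `T -> T` against `fn f<T: Clone>(x: T) -> T`.
    assert_eq!(band(&[], &["Clone"]), Some(Band::ExtraBounds));
    // Query `T: Clone -> T` against the unconstrained version.
    assert_eq!(band(&["Clone"], &[]), None);
    // Bands order naturally for sorting results.
    assert!(Band::Exact < Band::ExtraBounds);
}
```

Under this policy a query for T -> T returns both functions but lists the unconstrained one first. Flipping the rule, so that unconstrained candidates satisfy a bounded query, is an equally defensible choice.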

impl Trait and dyn Trait present two structurally different representations for what is often the same semantic intent. fn foo(x: impl Write) and fn foo(x: &mut dyn Write) both accept something that implements Write, but they are different types at the syntactic level. A search engine has to decide how aggressively to unify them.
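The two forms can be put side by side. In this small sketch the function names greet_static and greet_dynamic are made up for illustration, and both are called with the same kind of argument even though their parameter types differ.

```rust
use std::io::{self, Write};

// Static dispatch: a separate copy is compiled for each concrete writer type.
fn greet_static(mut out: impl Write) -> io::Result<()> {
    out.write_all(b"hello")
}

// Dynamic dispatch: one compiled function, called through a trait object.
fn greet_dynamic(out: &mut dyn Write) -> io::Result<()> {
    out.write_all(b"hello")
}

fn main() -> io::Result<()> {
    let mut a: Vec<u8> = Vec::new();
    let mut b: Vec<u8> = Vec::new();

    // `&mut Vec<u8>` implements Write, so it satisfies `impl Write`.
    greet_static(&mut a)?;
    // `&mut Vec<u8>` coerces to `&mut dyn Write` at the call site.
    greet_dynamic(&mut b)?;

    assert_eq!(a, b);
    assert_eq!(a, b"hello".to_vec());
    Ok(())
}
```

From the caller's side the two call sites look identical, which is the argument for letting a query for "something that implements Write" match both.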

Reference mutability is a first-class type distinction in Rust. &T and &mut T are different types, and a search for a function taking &T might or might not want to include functions that require &mut T, depending on context.
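The asymmetry behind that "might or might not" shows up in a few lines. The function names sum and double_all are illustrative only.

```rust
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn double_all(xs: &mut [i32]) {
    for x in xs.iter_mut() {
        *x *= 2;
    }
}

fn main() {
    let mut v = vec![1, 2, 3];

    let exclusive: &mut [i32] = &mut v;
    double_all(exclusive);
    // A `&mut [i32]` coerces to `&[i32]`, so a caller holding an exclusive
    // reference can still use functions that take a shared one.
    assert_eq!(sum(exclusive), 12);

    let shared: &[i32] = &v;
    assert_eq!(sum(shared), 12);
    // The reverse does not compile: a shared reference cannot be passed
    // where `&mut [i32]` is required.
    // double_all(shared);
}
```

So a caller who holds a &mut T can reasonably be shown functions taking &T, while a caller who only has a &T gains nothing from results that demand &mut T.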

None of these are hard blockers. They are design decisions that require explicit policy choices, and getting them wrong produces a search engine that either floods results with noise or silently omits obvious matches.

How Roogle Approaches It

Roogle builds its index from rustdoc’s JSON output format, which provides structured type information for every public item in a crate. This is a sensible foundation: the JSON format is well-defined, consistently available for any crate that produces documentation, and carries exactly the type information a search engine needs, including generics, bounds, and lifetime structure.

Queries use a syntax modeled on Rust type signatures. The engine applies alpha-equivalence so the names of type variables in a query do not need to match those in the function signature. A search for (usize, usize) -> usize finds binary arithmetic operations. A search with type variables finds functions with matching structural shapes across concrete types.

To make the matching concrete:

-- Find a function that wraps a value in Some
T -> Option<T>

-- Find a function that extracts from an Option with a fallback
(Option<T>, T) -> T

-- Find a function that takes two slices and combines them
(&[T], &[T]) -> Vec<T>

The first query surfaces Some and Option::from. The second points to Option::unwrap_or; dropping the fallback argument from the query, leaving Option<T> -> T, is what reaches Option::unwrap_or_default, which takes no second argument and requires T: Default. The third finds slice concatenation utilities. In each case, you are searching by what the function must do structurally, not by what someone decided to call it.
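These shapes can be checked against the standard library by coercing the functions to function pointer types with T fixed to i32. This is only a sketch of what "has this shape" means in Rust terms. The helper join is hypothetical, standing in for whatever concatenation utility the third query returns, and is built on the standard slice method concat.

```rust
// Hypothetical helper with the shape (&[T], &[T]) -> Vec<T>.
fn join<T: Clone>(a: &[T], b: &[T]) -> Vec<T> {
    [a, b].concat()
}

fn main() {
    // T -> Option<T>
    let wrap: fn(i32) -> Option<i32> = Some;
    let wrap_from: fn(i32) -> Option<i32> = <Option<i32> as From<i32>>::from;
    assert_eq!(wrap(5), Some(5));
    assert_eq!(wrap_from(5), Some(5));

    // (Option<T>, T) -> T
    let fallback: fn(Option<i32>, i32) -> i32 = Option::unwrap_or;
    assert_eq!(fallback(None, 7), 7);
    assert_eq!(fallback(Some(1), 7), 1);

    // Option<T> -> T, which needs T: Default and takes no fallback argument
    let or_default: fn(Option<i32>) -> i32 = Option::unwrap_or_default;
    assert_eq!(or_default(None), 0);

    // (&[T], &[T]) -> Vec<T>
    assert_eq!(join(&[1, 2], &[3]), vec![1, 2, 3]);
}
```

If an assignment like these compiles, the function has the queried shape for that choice of T, which is the property a type search engine is trying to approximate without running the compiler.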

The current implementation covers the Rust standard library well. Extending coverage to the broader crate ecosystem requires running the rustdoc JSON pipeline across crates.io at scale, which is a non-trivial infrastructure problem separate from the search engine itself.

Rustdoc Gained Type Search Too

One significant piece of context: rustdoc itself added type-based search within individual crate documentation, shipping in stable Rust in late 2023. The search box on docs.rs now accepts type signature queries for single-crate lookups, which means if you already know you are working within a specific crate, you can search by type shape directly in the browser.

This shifts Roogle’s value proposition toward the cross-crate case: finding which crate or standard library module has the function you need, before you know where to look. That requires an index spanning the full ecosystem rather than a single crate’s documentation, which is a significantly harder infrastructure problem.

Lib.rs, the community-built crate index, demonstrates that ecosystem-wide indexing is achievable through sustained effort. Building something similar for type-based search requires a build pipeline that processes rustdoc JSON for new crate versions continuously, a storage layer that can hold the resulting indices efficiently, and a query engine that handles the matching at scale without degrading response time.

The Size of the Problem

The Rust ecosystem has grown past 150,000 crates on crates.io. The standard library itself spans enough ground that no developer has all of it memorized, and the collection of high-quality utility crates for parsing, serialization, async, numerics, and system interfaces has grown proportionally.

Name-based search works when you know the vocabulary of the function you need. Type-based search works when you know what the function must do but not what someone decided to call it. These two approaches are complementary. Haskell developers have had both for a long time, and they use them in different situations depending on how much they already know about the API surface they are searching.

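A Rust developer who wants to count the items of an iterator that satisfy a predicate can search docs.rs for “count” and find Iterator::count, which takes no predicate, and then work out that it has to be combined with Iterator::filter. Or they could type (Self, FnMut(&Self::Item) -> bool) -> usize into a type search engine. No single Iterator method has exactly that shape, but an engine that ranks partial matches could surface the close ones together: Iterator::filter (same arguments, different return type), Iterator::count (same return type), and Iterator::position (a predicate in, an Option<usize> out). The second approach requires no prior knowledge of Rust’s naming conventions.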
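For concreteness, here is a function with that predicate-to-count shape, written against the standard Iterator trait. The name count_matching is hypothetical. In the standard library this is normally spelled as a composition of Iterator::filter and Iterator::count rather than a single method.

```rust
// Hypothetical helper with the queried shape:
// (iterator, FnMut(&Item) -> bool) -> usize
fn count_matching<I, P>(iter: I, predicate: P) -> usize
where
    I: Iterator,
    P: FnMut(&I::Item) -> bool,
{
    iter.filter(predicate).count()
}

fn main() {
    let words = ["roogle", "hoogle", "rustdoc"];

    let n = count_matching(words.iter(), |w| w.ends_with("oogle"));
    assert_eq!(n, 2);

    // A near miss in shape: Iterator::position takes a predicate but
    // returns the index of the first match as an Option<usize>.
    let first = words.iter().position(|w| w.starts_with('r'));
    assert_eq!(first, Some(0));
}
```

Because the answer here is a composition of two methods, this is also a case where exact-signature matching finds nothing and ranking of partial matches does the real work.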

The case for type-based search gets stronger as the ecosystem grows, because the vocabulary gap between what you know how to describe and what the crate author decided to name things widens with more crates and more API surface. Roogle is an early project working in a space that Rust tooling has needed for a while. The technical challenges involved in matching across Rust’s type system are real and solved only partially so far, and the infrastructure work for ecosystem-wide coverage is the clearest remaining gap. That rustdoc itself moved toward type search signals that the broader tooling community considers the idea worth investing in, and community projects like Roogle are where that investment starts.
