Searching Rust APIs by Type: What Roogle Is Actually Doing Under the Hood
Source: lobsters
If you have spent time in the Haskell ecosystem, you have used Hoogle. You type a type signature like (a -> b) -> [a] -> [b] and it tells you that map is what you are looking for. The idea is simple: when you know what shape of transformation you need, you should be able to search for it by that shape rather than guessing what the standard library author named it.
Rust has always had a gap here. docs.rs is excellent for reading documentation once you know what crate and function you are targeting, but it offers only name-based search. cargo search searches crate descriptions. IDEs with rust-analyzer give you completion, but completion is local to your current project graph and requires that you already have the crate as a dependency. None of these answer the question: “I have a &str and I want an IpAddr, what function does that?”
Roogle is an attempt to fill that gap. It is a type-signature search engine for Rust, and it works in a way that is worth understanding in detail.
The Foundation: rustdoc’s JSON Output
Before Roogle can search anything, it needs a machine-readable description of the public API of every crate it indexes. That description comes from rustdoc’s JSON output format. The format is still unstable and only available on nightly, and the rustdoc-types crate publishes Rust type definitions for its schema.
You can generate it yourself:
RUSTDOCFLAGS="-Z unstable-options --output-format json" cargo +nightly doc --no-deps
This produces a JSON file in target/doc/, named after the crate, that contains a complete structural description of every public item in your crate: functions with their full signatures, structs, enums, traits, and their interrelationships. The rustdoc-types crate provides Rust types that deserialize this JSON output, giving downstream tools a typed interface to work with. Because the format is still unstable, the schema carries a format version and can change between releases.
Roogle ingests this JSON. That is its corpus. Rather than parsing source code or re-implementing type inference, it relies on the compiler’s own documentation output as the source of truth about what a function’s signature actually is after all the type resolution the compiler has already done.
What a Type Signature Query Actually Means
When you write a Hoogle query like String -> Int, you are asking for any function that takes a String and returns an Int. The match is structural. Hoogle’s search uses type unification to determine whether a candidate function’s type could match the query under some substitution of type variables.
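To make "unification under some substitution" concrete, here is a minimal textbook sketch of first-order unification over a toy type language. The Ty enum, the helper constructors, and the unify function are all hypothetical names chosen for illustration; this is not Hoogle's code, only the general technique the paragraph describes.

```rust
use std::collections::HashMap;

/// Toy type language (hypothetical): type variables and named constructors.
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Var(String),
    Con(String, Vec<Ty>),
}

/// A substitution maps type-variable names to types.
type Subst = HashMap<String, Ty>;

fn var(name: &str) -> Ty { Ty::Var(name.to_string()) }
fn con(name: &str, args: Vec<Ty>) -> Ty { Ty::Con(name.to_string(), args) }
fn func(from: Ty, to: Ty) -> Ty { con("->", vec![from, to]) }
fn list(elem: Ty) -> Ty { con("[]", vec![elem]) }

/// Apply the substitution everywhere inside `ty`.
fn resolve(ty: &Ty, subst: &Subst) -> Ty {
    match ty {
        Ty::Var(name) => match subst.get(name) {
            Some(bound) => resolve(bound, subst),
            None => ty.clone(),
        },
        Ty::Con(name, args) => {
            Ty::Con(name.clone(), args.iter().map(|a| resolve(a, subst)).collect())
        }
    }
}

/// Does the variable `name` appear anywhere inside `ty`?
fn occurs(name: &str, ty: &Ty) -> bool {
    match ty {
        Ty::Var(v) => v == name,
        Ty::Con(_, args) => args.iter().any(|a| occurs(name, a)),
    }
}

/// Try to make `a` and `b` equal by extending `subst`; true on success.
fn unify(a: &Ty, b: &Ty, subst: &mut Subst) -> bool {
    let (a, b) = (resolve(a, subst), resolve(b, subst));
    match (&a, &b) {
        (Ty::Var(x), Ty::Var(y)) if x == y => true,
        (Ty::Var(x), other) | (other, Ty::Var(x)) => {
            // Occurs check: refuse infinite types such as a = [a].
            if occurs(x, other) {
                return false;
            }
            subst.insert(x.clone(), other.clone());
            true
        }
        (Ty::Con(n1, a1), Ty::Con(n2, a2)) => {
            n1 == n2
                && a1.len() == a2.len()
                && a1.iter().zip(a2).all(|(x, y)| unify(x, y, subst))
        }
    }
}
```

Unifying the type of map, (a -> b) -> [a] -> [b], with the query (Int -> String) -> [Int] -> [String] succeeds and binds a to Int and b to String. Unifying a -> a with Int -> String fails, because a cannot be both.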
Roogle operates on the same principle but applied to Rust’s type system. A query might look like:
fn(u32) -> Option<u32>
or:
fn(&str) -> Result<IpAddr, _>
The underscore in that second query acts as a wildcard, letting you express partial knowledge about a type. Roogle attempts to match these queries against indexed function signatures by checking structural compatibility.
The implementation uses a similarity scoring approach rather than strict unification. A function matches if its signature is compatible with the query, but results are ranked by how closely they match. This handles the case where the user’s query is approximate, which is the common case when you are searching precisely because you do not know the exact types involved.
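The difference between strict matching and ranked matching is easier to see in code. The sketch below is an assumption-laden toy, not Roogle's scoring function: the Ty and Sig types, the 0.5 weighting, and the hand-written index are all made up for illustration, and the indexed signatures are simplified.

```rust
/// Simplified type for queries and indexed signatures (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    /// `_` in a query: compatible with anything.
    Wildcard,
    /// A named type with generic arguments, e.g. `Option<u32>`.
    Named(&'static str, Vec<Ty>),
}

/// A function signature: parameter types and a return type.
#[derive(Debug)]
struct Sig {
    name: &'static str,
    params: Vec<Ty>,
    ret: Ty,
}

/// Shorthand for a named type without generic arguments.
fn t(name: &'static str) -> Ty {
    Ty::Named(name, vec![])
}

/// Similarity in [0.0, 1.0]; 1.0 means the two types agree everywhere.
fn similarity(query: &Ty, candidate: &Ty) -> f64 {
    match (query, candidate) {
        (Ty::Wildcard, _) | (_, Ty::Wildcard) => 1.0,
        (Ty::Named(qn, qa), Ty::Named(cn, ca)) => {
            if qn != cn {
                return 0.0;
            }
            let n = qa.len().max(ca.len());
            if n == 0 {
                return 1.0;
            }
            // Half the credit for the outer name, half for the arguments.
            let args: f64 = qa.iter().zip(ca).map(|(q, c)| similarity(q, c)).sum();
            0.5 + 0.5 * args / n as f64
        }
    }
}

/// Average similarity over all parameters and the return type.
fn score(query: &Sig, candidate: &Sig) -> f64 {
    if query.params.len() != candidate.params.len() {
        return 0.0;
    }
    let mut total = similarity(&query.ret, &candidate.ret);
    for (q, c) in query.params.iter().zip(&candidate.params) {
        total += similarity(q, c);
    }
    total / (query.params.len() + 1) as f64
}

/// Score every indexed signature, drop the zeros, and sort best-first.
fn rank(query: &Sig, index: &[Sig]) -> Vec<(&'static str, f64)> {
    let mut hits: Vec<(&'static str, f64)> = index
        .iter()
        .map(|c| (c.name, score(query, c)))
        .filter(|(_, s)| *s > 0.0)
        .collect();
    hits.sort_by(|a, b| b.1.total_cmp(&a.1));
    hits
}

/// A tiny hand-written index with simplified signatures (illustrative only).
fn example_index() -> Vec<Sig> {
    vec![
        Sig { name: "str::len", params: vec![t("&str")], ret: t("usize") },
        Sig {
            name: "u32::from_str",
            params: vec![t("&str")],
            ret: Ty::Named("Result", vec![t("u32"), t("ParseIntError")]),
        },
        Sig {
            name: "IpAddr::from_str",
            params: vec![t("&str")],
            ret: Ty::Named("Result", vec![t("IpAddr"), t("AddrParseError")]),
        },
        Sig { name: "str::contains", params: vec![t("&str"), t("&str")], ret: t("bool") },
    ]
}
```

Ranking the query fn(&str) -> Result<IpAddr, _> against this toy index puts IpAddr::from_str first with a perfect score, then u32::from_str (right shape, wrong success type), then str::len (right input only). The two-argument entry is dropped entirely.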
Why Rust Makes This Harder Than Haskell
Haskell’s type system, while expressive, has properties that make type search tractable. A signature carries no references, lifetimes, or ownership annotations, so it describes the input-output relationship with very little that a search engine has to normalize away.
Rust signatures carry much more information, and some of it is noise for the purpose of search. Consider:
pub fn parse<F: FromStr>(&self) -> Result<F, F::Err>
This single function signature involves a generic type parameter with a trait bound, a reference with an implicit lifetime, an associated type, and a Result wrapper. A user searching for “something that turns a string into a number” might write fn(&str) -> i32, but the function they want is str::parse which returns a Result<i32, _>, not a bare i32.
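For reference, this is what calling str::parse looks like in practice. The wrapper functions to_number and to_ip are hypothetical names added for the example; the standard library calls themselves are real.

```rust
use std::net::{AddrParseError, IpAddr};
use std::num::ParseIntError;

/// `&str -> i32`, except the real return type is a Result.
fn to_number(s: &str) -> Result<i32, ParseIntError> {
    s.parse::<i32>()
}

/// The same method answers the `&str -> IpAddr` question; only the
/// target type `F` changes, and with it the error type `F::Err`.
fn to_ip(s: &str) -> Result<IpAddr, AddrParseError> {
    s.parse()
}
```

One generic method covers both conversions, which is exactly why a literal query for fn(&str) -> i32 never lines up with the indexed signature unless the search engine is willing to look inside the Result and instantiate F.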
Roogle has to make decisions about how loosely to match. Does a query for fn(&str) -> i32 match fn(&str) -> Result<i32, _>? Arguably it should, with a penalty. Does it match when the &str is borrowed with a named lifetime? Almost certainly yes: lifetimes should probably be erased for search purposes.
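Both of those decisions can be sketched in a few lines. The following is one possible policy, not a description of what Roogle does: the Ty enum, erase_lifetimes, match_penalty, and the penalty values are assumptions made for the example.

```rust
/// Simplified type with references that may carry a lifetime (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Named(&'static str, Vec<Ty>),
    Ref { lifetime: Option<&'static str>, inner: Box<Ty> },
}

/// Drop every lifetime so that `&'a str` and `&str` compare equal.
fn erase_lifetimes(ty: &Ty) -> Ty {
    match ty {
        Ty::Named(name, args) => {
            Ty::Named(*name, args.iter().map(erase_lifetimes).collect())
        }
        Ty::Ref { inner, .. } => Ty::Ref {
            lifetime: None,
            inner: Box::new(erase_lifetimes(inner)),
        },
    }
}

/// None: no match. Some(0): exact match after lifetime erasure.
/// Some(1): the candidate wraps the queried type in Result or Option.
fn match_penalty(query: &Ty, candidate: &Ty) -> Option<u32> {
    let (q, c) = (erase_lifetimes(query), erase_lifetimes(candidate));
    if q == c {
        return Some(0);
    }
    match &c {
        Ty::Named(name, args)
            if (*name == "Result" || *name == "Option") && args.first() == Some(&q) =>
        {
            Some(1)
        }
        _ => None,
    }
}
```

Under this policy a query for i32 matches Result<i32, ParseIntError> with a penalty of 1, and &str matches &'a str with no penalty at all.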
Generics add another layer. A function fn<T: Display>(val: T) -> String could plausibly match a query for fn(i32) -> String, because i32 implements Display. Checking that properly requires either running trait resolution or maintaining a pre-computed index of trait impls. Roogle approaches this with type variable substitution: generic parameters are treated as wildcards that can match any concrete type, which is a reasonable approximation.
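A minimal sketch of the "generics as wildcards" idea follows, assuming a toy Ty enum and a matches function that are not taken from Roogle. A signature is encoded as a Named("fn", ...) node whose last argument is the return type. Trait bounds are deliberately ignored, which is the approximation being described; the only rule enforced is that one generic parameter must bind to the same concrete type everywhere it appears.

```rust
use std::collections::HashMap;

/// Simplified type for indexed signatures and queries (hypothetical).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    /// A generic parameter of the candidate, e.g. `T`.
    Generic(&'static str),
    /// A concrete named type with arguments; a whole signature is
    /// encoded here as `Named("fn", [params.., ret])`.
    Named(&'static str, Vec<Ty>),
}

/// Match a candidate against a concrete query, treating the candidate's
/// generic parameters as wildcards. Trait bounds are NOT checked.
fn matches(candidate: &Ty, query: &Ty, bindings: &mut HashMap<&'static str, Ty>) -> bool {
    match candidate {
        Ty::Generic(name) => {
            // A parameter seen before must bind to the same type again.
            if let Some(bound) = bindings.get(name) {
                return bound == query;
            }
            bindings.insert(*name, query.clone());
            true
        }
        Ty::Named(cn, cargs) => match query {
            Ty::Named(qn, qargs) => {
                cn == qn
                    && cargs.len() == qargs.len()
                    && cargs.iter().zip(qargs).all(|(c, q)| matches(c, q, bindings))
            }
            Ty::Generic(_) => false,
        },
    }
}
```

With this, fn<T>(T) -> String matches the query fn(i32) -> String and records T = i32, while fn<T>(T, T) -> T does not match fn(i32, u8) -> i32 because T would need two different bindings. A stricter engine would then check the recorded bindings against the bounds.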
Associated types are the hardest case. Iterator::collect has a signature roughly like fn<B: FromIterator<Self::Item>>(self) -> B. Matching this against a query requires understanding the relationship between B and the iterator’s item type. This is the kind of case where a search engine either has to be conservative and miss matches, or approximate and produce false positives.
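A short example shows why collect is awkward to index. The helper names below are hypothetical; the point is that the same method call produces a different return type depending on what the caller asks for.

```rust
use std::collections::HashMap;

/// Sample data: an iterator over these yields `(&str, usize)` items.
fn word_lengths() -> Vec<(&'static str, usize)> {
    vec![("a", 1), ("bb", 2)]
}

/// Here `B` is `Vec<(&str, usize)>`.
fn into_vec() -> Vec<(&'static str, usize)> {
    word_lengths().into_iter().collect()
}

/// Same call, but `B` is `HashMap<&str, usize>`, which works because
/// HashMap<K, V> implements FromIterator<(K, V)>.
fn into_map() -> HashMap<&'static str, usize> {
    word_lengths().into_iter().collect()
}
```

A query such as "iterator of pairs to HashMap" should find collect, but nothing in the written signature mentions HashMap. The link only exists through the FromIterator impl and the iterator's Item type.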
The Broader Ecosystem Context
Roogle is not the only project thinking about this problem. The Rust project itself has had long-running discussions about adding type-based search to docs.rs. There have been experiments with integrating search into rust-analyzer at the IDE level. The challenge in all cases is the same: the corpus is large, the type system is complex, and getting high-quality results requires real type system knowledge, not just string matching.
Other languages have solved adjacent problems differently. In Go, pkg.go.dev has name-based search with filtering by package. Java developers rely on IDE completion almost exclusively for API discovery because the ecosystem is too large to navigate manually. Python’s help() and dir() functions are runtime-introspection tools, not static search.
The Hoogle model is compelling because it works at the level of abstraction that matters for API discovery. When you are writing code, you think in types. You have a value of some type and you need a value of another type. The name of the function is secondary information you will learn once you find it.
Current State and Limitations
Roogle is still early-stage. The index it ships with covers a subset of the Rust standard library and some popular crates, not the full ecosystem on crates.io. Building and maintaining an index of the full ecosystem at scale is a significant infrastructure problem that requires regular re-indexing as crates release new versions.
The query syntax is also still evolving. Expressing complex queries, like “a function that takes an iterator of something and returns a HashMap”, requires getting the syntax right, and the current syntax is not yet as forgiving as Hoogle’s.
Despite these limitations, the core idea is sound and the implementation demonstrates that type-based search for Rust is feasible. The rustdoc-types JSON format provides a clean ingestion path. The similarity scoring approach handles the fuzziness inherent in type-level queries. The harder engineering problems are about scale and index maintenance, not about whether the matching approach works.
Why This Matters for Rust’s Tooling Story
Rust’s reputation for a steep learning curve is partly about the ownership system, but it is also about discoverability. The standard library is large, the iterator adapter chain is extensive, and knowing that flat_map, chain, zip, scan, and fold exist at all requires either reading through the full Iterator documentation or having already written enough Rust to encounter them naturally.
Type-based search shortcuts that discovery process. You know you have a Vec<Option<T>> and you want a Vec<T> with the None values removed. You do not need to know that flatten or filter_map exists; you just describe the transformation and the tool finds the name for you.
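For completeness, here is what the answer to that query looks like once you have found it. The function names are made up for the example; flatten and filter_map are the real Iterator methods.

```rust
/// Vec<Option<T>> -> Vec<T>, using Iterator::flatten.
/// Option implements IntoIterator, yielding zero or one item.
fn drop_nones<T>(items: Vec<Option<T>>) -> Vec<T> {
    items.into_iter().flatten().collect()
}

/// The same transformation with Iterator::filter_map.
fn drop_nones_with_filter_map<T>(items: Vec<Option<T>>) -> Vec<T> {
    items.into_iter().filter_map(|item| item).collect()
}
```

Both versions keep the Some values in their original order and discard every None.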
Tools like Roogle sit alongside rust-analyzer, docs.rs, and the Rust playground as part of the broader question of what it means to have good language tooling. The language itself can be excellent; the ecosystem of tools around it determines how accessible that excellence is to someone working through a problem they have not solved before.
The roogle-rs/roogle repository is worth watching. The core problem it is solving is real, the approach is principled, and there is a clear path to it becoming genuinely useful as the index grows and the query language matures.