JSON Search Doesn't Need a Query Language

jq has been the default answer to “how do I process JSON in a shell script” for over a decade. It is fast, it is everywhere, and it is the first thing people install after git and curl. But there is a friction point that surfaces every time you just want to find a value somewhere in a JSON blob: the syntax asks you to think in terms of filters and pipelines before you can do what amounts to a search.

Micah Kepe’s jsongrep is a Rust-built CLI that targets this specific case, and the HN discussion it generated shows how widely that frustration is shared. The tool does not try to replace jq for transformation workloads; it occupies a different position, closer to grep than to awk.

What jq’s DSL Actually Costs You

jq’s filter language is genuinely elegant if you work in functional programming. The pipe operator (|), the identity filter (.), recursive descent (..), and the alternative operator (//) compose cleanly into surprisingly powerful expressions. For transformations, there is nothing better in the shell toolkit.

For search, the language gets in the way. Say you want to find every occurrence of an email field anywhere in a deeply nested structure:

jq '.. | .email? // empty' data.json

That .. is recursive descent, the ? suppresses errors on nodes where .email cannot be looked up (strings, numbers, arrays), and // empty discards the null results from objects that lack the field, along with any literal false. Three separate language features are needed to express what is semantically just “find the field named email.” Compare this to gron, which flattens JSON to greppable text:

gron data.json | grep -F '.email = '
# json.users[0].email = "alice@example.com";
# json.users[2].email = "bob@example.com";

The results include the full path automatically, and the query requires no new syntax beyond standard grep flags. The trade-off is output format: gron produces flat assignment statements rather than JSON, so turning the matches back into JSON means piping them through gron --ungron.

jsongrep takes a third path, keeping output JSON-native while replacing the query language with a pattern interface that works like grep. You provide a key name or a pattern and it finds matches recursively, without requiring any knowledge of the tree structure.
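The idea of a recursive key search can be sketched in a few lines of Python. This is only an illustration of the operation, not jsongrep’s implementation or its output format; the function name find_key and the sample document are made up for the example.

```python
import json

def find_key(node, key, path="json"):
    """Yield (path, value) for every object field named `key`, at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            child = f"{path}.{name}"
            if name == key:
                yield child, value
            # Keep descending: the key may also appear deeper in the tree.
            yield from find_key(value, key, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key(item, key, f"{path}[{index}]")

# Hypothetical sample data, shaped like the gron output shown earlier.
doc = json.loads("""
{"users": [
  {"name": "Alice", "email": "alice@example.com"},
  {"name": "Carol"},
  {"name": "Bob", "email": "bob@example.com"}
]}
""")

matches = list(find_key(doc, "email"))
for path, value in matches:
    print(f"{path} = {json.dumps(value)}")
```

The caller supplies only the key name. The walk records the path as it goes, so nothing about the tree’s shape has to be known in advance, and the matched values stay ordinary JSON values.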

Where jq Quietly Fails

The DSL complexity is the most visible pain point, but jq has deeper issues that push people toward alternatives.

Memory usage is significant for server-side workflows. jq loads the entire input into memory before processing. That is fine for API responses but becomes a problem for log files or data exports in the hundreds of megabytes. jq offers a --stream flag for incremental processing:

jq -n --stream 'fromstream(1|truncate_stream(inputs))' huge.json

That expression comes from the official documentation, and it requires rewriting your entire query in a fundamentally different paradigm. Streaming in jq is not an optimization path you can drop in; it is a separate programming model.
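To see why this counts as a different model, it helps to look at what a streaming filter receives. The Python sketch below imitates the event shape described in the jq manual: [path, leaf] pairs plus closing events. It walks an already-parsed document, so it is not a streaming parser, and the name to_stream is chosen here only for illustration.

```python
def to_stream(value, path=()):
    """Yield jq-style stream events for an already-parsed document."""
    if isinstance(value, dict) and value:
        keys = list(value)
        for key in keys:
            yield from to_stream(value[key], path + (key,))
        # Closing event: a one-element list holding the path of the last child.
        yield [list(path + (keys[-1],))]
    elif isinstance(value, list) and value:
        for index, item in enumerate(value):
            yield from to_stream(item, path + (index,))
        yield [list(path + (len(value) - 1,))]
    else:
        # Scalars and empty containers are emitted as [path, value] leaves.
        yield [list(path), value]

events = list(to_stream({"a": [1, 2], "b": "x"}))
for event in events:
    print(event)
```

A filter written against these events has to rebuild whatever structure it needs from paths and leaves, which is why an ordinary jq expression cannot simply be reused under --stream.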

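Number precision is a quieter issue. jq does its arithmetic in IEEE 754 double-precision floats, and releases before 1.7 converted every number to a double on input, so integers larger than 2^53 lose precision silently:

echo 'null' | jq '9007199254740993'
# 9007199254740992   (jq 1.6; 1.7 and later keep an unmodified literal intact, but arithmetic on it still rounds)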

Snowflake IDs, database primary keys at scale, nanosecond timestamps: any of these can be corrupted without warning. jaq, the Rust reimplementation of jq, handles this correctly, which is one of several reasons it has become the preferred alternative for precision-sensitive work.
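The rounding is a property of doubles rather than of jq specifically, and it can be reproduced in Python. The sketch below assumes nothing beyond the standard library; the "id" field is a hypothetical example.

```python
import json

n = 2**53 + 1                      # 9007199254740993, one past the exact range
as_double = float(n)               # what a double-only number model stores

raw = '{"id": 9007199254740993}'   # hypothetical record with a large ID
exact = json.loads(raw)["id"]                   # Python keeps integers exact
lossy = json.loads(raw, parse_int=float)["id"]  # force the double-only model

print(n, int(as_double), exact, int(lossy))
```

The two values that pass through a double come back as 9007199254740992, which is a different ID from the one that went in, and nothing signals the change.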

The // operator has its own trap. It treats false as absent, the same as null, so default-value patterns break on boolean fields:

echo '{"active": false}' | jq '.active // "default"'
# "default"

The fix is an explicit null check, for example if .active == null then "default" else .active end. The behavior is documented, but it surprises people regularly, and the failure is silent.
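The difference between the two behaviors can be modeled in Python. This is a simplified model: jq’s real // operator also deals with errors and multiple outputs, which are ignored here, and both function names are invented for the example.

```python
def alternative(value, default):
    """Simplified model of jq's `a // b`: fall back on null or false."""
    return default if value is None or value is False else value

def default_if_null(value, default):
    """Explicit null check: fall back only when the value is missing or null."""
    return default if value is None else value

record = {"active": False}

with_alternative = alternative(record.get("active"), "default")
with_null_check = default_if_null(record.get("active"), "default")
missing_field = default_if_null({}.get("active"), "default")

print(with_alternative, with_null_check, missing_field)
```

Only the explicit check preserves a legitimate false. Values such as 0 and the empty string pass through both versions unchanged, which matches jq, where only null and false trigger the fallback.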

The Ecosystem Has Been Splitting for Years

The current landscape of JSON CLI tools reflects a decade of developers solving specific sub-problems that jq handles poorly. The fragmentation falls along three axes: search versus transform, interactive versus pipeline, and in-memory versus streaming.

For interactive exploration, jless is a Rust-built TUI pager with vi keybindings that renders JSON as a collapsible tree. fx, rewritten from Node.js to Go in recent versions for performance, combines an interactive browser with a pipeline mode that accepts JavaScript expressions:

fx data.json '.users.filter(u => u.age > 30).map(u => u.name)'

Neither of these does what jq does. They exist for understanding a structure before writing a query, not for writing the query itself.

For transformation at jq’s level with better performance and error messages, jaq runs the jq filter language, handles integers correctly, streams natively, and produces error messages that identify what actually went wrong. Benchmarks from the jaq repository show roughly 2 to 5 times throughput improvement over jq on large files with significantly lower peak memory.

For data beyond JSON, yq brings a jq-like syntax to YAML, XML, TOML, and CSV, with format conversion between them. Miller takes a verb-based pipeline approach that handles JSON, CSV, and TSV with native streaming:

mlr --json filter '$age > 30' then cut -f name,email employees.json

For workloads that are fundamentally tabular, dsq and sq let you run SQL queries over JSON files using an embedded SQLite engine under the hood, which fits better than jq’s DSL when you are thinking about rows and columns rather than trees.
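The underlying idea, loading JSON records into an embedded SQL engine and querying them as rows, can be sketched with Python’s standard sqlite3 module. This does not show how dsq or sq work internally; the table name, columns, and records are hypothetical.

```python
import json
import sqlite3

# Hypothetical array of flat records, the shape where SQL fits well.
records = json.loads("""[
  {"name": "Alice", "age": 34, "email": "alice@example.com"},
  {"name": "Bob",   "age": 27, "email": "bob@example.com"},
  {"name": "Carol", "age": 41, "email": "carol@example.com"}
]""")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, age INTEGER, email TEXT)")
# Named placeholders are filled from each record's keys.
conn.executemany(
    "INSERT INTO employees VALUES (:name, :age, :email)", records
)

rows = conn.execute(
    "SELECT name, email FROM employees WHERE age > 30 ORDER BY name"
).fetchall()
print(rows)
conn.close()
```

The query reads the same way as the Miller filter above: keep the rows where age exceeds 30, then project two columns. It says nothing about the tree the rows came from.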

jsongrep’s position in this ecosystem is narrower than all of these: it handles the search case, the “I have a JSON blob and need to find where a value lives” case. Having a tool that does only that, quickly and without ceremony, is more useful than reaching for jq and spending time composing the right recursive filter.

Matching Tool to Operation

For transforming or reshaping JSON, jq or jaq is the right choice. jaq is preferable for new work because the error messages and number handling are improvements that compound over time. The filter language is close enough to jq that existing expressions mostly transfer.

For searching JSON without transforming it, jsongrep or gron cover the case with minimal friction. gron works with every existing grep flag and produces output that other Unix tools can process directly; jsongrep is the better choice when the output needs to remain valid JSON.

For exploring an unfamiliar JSON structure interactively, jless or fx in interactive mode avoid the write-and-rewrite cycle that blind jq querying produces.

For large files where memory matters, jaq and Miller both stream natively. jq without --stream will load the entire file, and the --stream mode is too cumbersome to use routinely.

For tabular JSON, SQL via dsq or sq expresses the operation in a way that matches how you think about the data, and it often performs better on large record sets, where the equivalent jq expressions tend to be verbose and slow.

What This Points To

jq remains the most capable general-purpose tool in this space. Its filter language is genuinely powerful, and for complex transformations it has no peer in the shell ecosystem. But the assumption that jq is the correct default for every JSON operation has always been a convenience rather than a technical conclusion.

A search operation does not need a functional language runtime. A paging operation does not need a transformation engine. A tabular query does not need a tree-walking DSL. What jsongrep and tools like it represent is a gradual recognition that JSON tooling covers several distinct problems, and that a tool built for one of them will usually serve that case better than a general-purpose tool does.

The Unix philosophy of doing one thing well took longer to reach JSON tooling than it should have, partly because jq arrived early and did enough things well that alternatives seemed unnecessary. The current ecosystem, with its proliferating specializations, is making up for that lag.