jq Gets the Job Done. That's Not Enough Anymore.

Every few months, a new JSON query tool lands on Hacker News promising to replace jq. Some focus on friendlier syntax. Some focus on speed. jsongrep goes for both, and the discussion it sparked — 343 points and 216 comments — suggests it landed on something real.

Before getting into what makes jsongrep different, it’s worth being precise about what jq actually is, because “JSON processor” undersells it in ways that explain both its power and its friction.

What jq Really Is

jq is a Turing-complete functional programming language whose runtime happens to read JSON. Stephen Dolan released the first version around 2012, and the design was deliberate: rather than build a query language bolted onto JSON, he built a composable filter system in which every expression takes an input and produces zero or more outputs. The pipe operator works like a Unix pipe, except that it passes JSON values rather than byte streams.

This is why jq can do things that look like magic:

# Group an array of log entries by status code and count each group
jq '[group_by(.status) | .[] | {status: .[0].status, count: length}]' logs.json

# Recursively collect the value of every "error" key in a nested structure (null and false values are skipped)
jq '[.. | objects | .error? // empty]' response.json

# Join the array in a.json with the array in b.json on a shared id (INDEX stringifies its keys, hence tostring)
jq 'INDEX(input[]; .id) as $lookup | .[] | . + {extra: $lookup[.id | tostring]}' a.json b.json

That last example uses INDEX, a built-in that constructs a lookup dictionary. jq ships with functions for math, string manipulation, date handling, path operations, and more. It’s not a query language. It’s a language.
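The filter model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how jq is implemented: each filter is modeled as a function from one value to a stream of values, and the hypothetical helpers pipe, field, each, and select stand in for jq's |, .name, .[], and select(f).

```python
def _apply(f, stream):
    # Feed every value in the stream through filter f and flatten the results.
    for item in stream:
        yield from f(item)


def pipe(*filters):
    """Compose filters left to right, in the spirit of jq's `|`."""
    def run(value):
        stream = iter([value])
        for f in filters:
            stream = _apply(f, stream)
        return stream
    return run


def field(name):
    """Like `.name`: emit one value, or None when the key is missing."""
    def run(value):
        yield value.get(name)
    return run


def each(value):
    """Like `.[]`: emit every element of a list."""
    yield from value


def select(predicate):
    """Like `select(f)`: pass the input through only if the predicate holds."""
    def run(value):
        if predicate(value):
            yield value
    return run


doc = {"items": [{"id": 1, "active": True},
                 {"id": 2, "active": False},
                 {"id": 3, "active": True}]}

# Roughly: .items | .[] | select(.active) | .id
active_ids = pipe(field("items"), each, select(lambda v: v["active"]), field("id"))
print(list(active_ids(doc)))  # [1, 3]
```

Because a filter may emit zero, one, or many values, select and .[] need no special cases. That uniformity is where much of jq's power comes from, and also much of its unfamiliarity.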

The cost of that expressiveness is a learning curve that doesn’t flatten. The mental model required to write non-trivial jq is genuinely different from anything else in a typical Unix toolkit. Most developers have a handful of patterns memorized and reach for Stack Overflow for everything else.

Where the Performance Goes

jq’s architecture is tree-based. When you run it against a file, it parses the entire JSON document into an in-memory tree, compiles your filter expression into a bytecode representation, and then executes that bytecode against the tree. For small files, the overhead is invisible. For large files — multi-gigabyte API responses, log aggregations, database exports — it becomes the bottleneck.

The parsing step is where modern alternatives have the most room to win. The simdjson library demonstrated in 2019 that JSON parsing could hit 2-3 GB/s on commodity hardware by exploiting SIMD instructions to process multiple bytes in parallel. jq, written in C without SIMD optimization, parses at roughly 100-300 MB/s depending on document structure. For a 500 MB JSON file, that is the difference between roughly 1.7 to 5 seconds of parsing and about 0.2 seconds.

But parsing speed is only part of the story. If your query just extracts a value at a known path, you don’t need to build a full tree at all. A streaming parser that skips irrelevant tokens can answer path-based queries without ever allocating memory proportional to the document size. This is the insight that tools like jsongrep exploit: for the common case — “give me the value at this path” — you can be dramatically faster by doing dramatically less.
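To make the "doing less" idea concrete, here is a minimal Python sketch of a path lookup that skips over values it does not need instead of building them. It is not jsongrep's implementation. The function names are hypothetical, the path is limited to object keys, the input is assumed to be well-formed JSON, and the text is held in a string here, whereas a real tool would read it incrementally.

```python
import json

_WS = " \t\r\n"
_decoder = json.JSONDecoder()


def _skip_ws(text, i):
    while i < len(text) and text[i] in _WS:
        i += 1
    return i


def _skip_string(text, i):
    # text[i] is the opening quote; return the index just past the closing quote.
    i += 1
    while text[i] != '"':
        i += 2 if text[i] == "\\" else 1
    return i + 1


def _skip_value(text, i):
    """Advance past one JSON value without building any object for it."""
    i = _skip_ws(text, i)
    ch = text[i]
    if ch == '"':
        return _skip_string(text, i)
    if ch in "{[":
        depth = 0
        while True:
            ch = text[i]
            if ch == '"':
                i = _skip_string(text, i)  # brackets inside strings do not count
                continue
            if ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
                if depth == 0:
                    return i + 1
            i += 1
    # Number, true, false, or null: run to the next delimiter.
    while i < len(text) and text[i] not in ",}]" + _WS:
        i += 1
    return i


def extract(text, path):
    """Return the value at `path` (a list of object keys), or None if absent."""
    i = 0
    for key in path:
        i = _skip_ws(text, i)
        if text[i] != "{":
            return None
        i = _skip_ws(text, i + 1)
        found = False
        while text[i] != "}":
            end = _skip_string(text, i)
            name = json.loads(text[i:end])
            i = _skip_ws(text, end)        # now at ':'
            i = _skip_ws(text, i + 1)      # now at the start of the value
            if name == key:
                found = True
                break
            i = _skip_ws(text, _skip_value(text, i))
            if text[i] == ",":
                i = _skip_ws(text, i + 1)
        if not found:
            return None
    # Only the matching value is ever decoded into Python objects.
    value, _ = _decoder.raw_decode(text, _skip_ws(text, i))
    return value


doc = json.dumps({"meta": {"tags": ["a", "}", "q\"x"], "n": 1},
                  "data": {"user": {"name": "Ada"}}})
print(extract(doc, ["data", "user", "name"]))  # Ada
```

The scan stops as soon as the target is decoded, so nothing after the match is ever read. That is the early-exit behavior that gives streaming tools their advantage when the target sits near the start of a large file.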

The Grep Mental Model

The naming choice in jsongrep is doing real work. grep is one of the most universally understood tools in the Unix ecosystem, and its mental model is simple: give it a pattern, give it input, get matching lines back. No special output format, no composable filters, no higher-order functions.

Translating that model to JSON means path-based pattern matching rather than functional transformation:

# jq equivalent for extracting all user names from a nested structure
jq '.users[].name' data.json

# grep-style path expression (conceptually)
jsongrep 'users[*].name' data.json

The query in the grep-style version maps directly onto how you’d describe the path in conversation. You’re not thinking about filter composition; you’re thinking about navigation. For much of everyday jq usage (extract this field, filter these objects, flatten this array), the navigation model is sufficient and the functional model is overhead.
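The navigation model is small enough to sketch in full. The Python below implements a toy path matcher for expressions shaped like users[*].name. The syntax it accepts is an assumption made for illustration and should not be read as jsongrep's actual grammar. For clarity it also walks an already-parsed document rather than a token stream.

```python
import re

WILDCARD = object()  # sentinel for [*]
_STEP = re.compile(r"\[(\*|\d+)\]|([^.\[\]]+)")


def parse_path(expr):
    """Turn 'users[*].name' into ['users', WILDCARD, 'name']."""
    steps = []
    for index, key in _STEP.findall(expr):
        if key:
            steps.append(key)
        elif index == "*":
            steps.append(WILDCARD)
        else:
            steps.append(int(index))
    return steps


def find(value, steps):
    """Yield every value reachable from `value` by following `steps`."""
    if not steps:
        yield value
        return
    head, rest = steps[0], steps[1:]
    if head is WILDCARD:
        children = value if isinstance(value, list) else []
    elif isinstance(head, int):
        children = value[head:head + 1] if isinstance(value, list) else []
    else:
        children = [value[head]] if isinstance(value, dict) and head in value else []
    for child in children:
        yield from find(child, rest)


data = {"users": [{"name": "Ada"}, {"name": "Lin"}, {"id": 3}]}
print(list(find(data, parse_path("users[*].name"))))  # ['Ada', 'Lin']
```

There is no notion of a filter producing values for another filter here. A path either leads somewhere or it does not, which is why the model is easy to learn and also why it cannot express aggregation.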

This is the same bet that tools like gron made, though with a different mechanism. gron flattens JSON into a text format that grep can actually process:

gron response.json | grep 'error'
# json.errors[0].message = "not found";
# json.errors[1].message = "rate limited";

The advantage over jsongrep’s approach is that gron composes with every other Unix tool transparently. The disadvantage is that you’re doing a full document parse and serialize just to get to grep. For large files, you’ve added overhead rather than removed it.
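The flattening step itself is simple, which is part of gron's appeal. A rough Python sketch of the idea follows. It is not gron's code, the function name is hypothetical, and details such as key ordering are not reproduced.

```python
import json


def flatten(value, path="json"):
    """Yield one assignment-style line per node, in the spirit of gron's output."""
    if isinstance(value, dict):
        yield f"{path} = {{}};"
        for key, child in value.items():
            # Keys that are not identifier-like fall back to bracket notation.
            step = f".{key}" if key.isidentifier() else f"[{json.dumps(key)}]"
            yield from flatten(child, path + step)
    elif isinstance(value, list):
        yield f"{path} = [];"
        for index, child in enumerate(value):
            yield from flatten(child, f"{path}[{index}]")
    else:
        yield f"{path} = {json.dumps(value)};"


response = {"errors": [{"message": "not found"}, {"message": "rate limited"}]}
for line in flatten(response):
    if "message" in line:  # the grep step
        print(line)
```

Every line carries its full path, so a plain substring match is enough to locate a value. The cost is visible in the sketch too: the whole document is parsed and every node is serialized before any filtering happens.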

Performance Claims and What They Mean

Benchmark comparisons between JSON tools tend to measure specific things that may or may not match your workload. The scenarios where a streaming tool like jsongrep will win decisively are:

Large documents (hundreds of MB or more) with simple path queries
Cases where the target data is near the beginning of the file, letting the streamer exit early
High-volume pipeline usage where per-invocation overhead compounds

The scenarios where jq holds its own or wins:

Complex transformations that require multiple passes or aggregation
Documents small enough to fit comfortably in cache
Situations where the expressiveness of jq’s DSL saves you from writing multiple commands

jq has also accumulated significant optimization work over the years. The --stream flag enables a streaming parser mode that avoids full-document allocation for compatible queries. It’s not the default, and its output is unusual: instead of whole values, it emits a sequence of [path, leaf value] events that the filter has to reassemble. But it exists for exactly the large-file scenarios where streaming would help.
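The shape of that streaming output is easier to see than to describe. The Python sketch below generates events modeled on the documented behavior of jq's tostream builtin: a [path, value] pair for each leaf, plus a path-only closing event when an array or object ends. It works on an already-parsed value, so it shows the format only, not the memory savings, and the function name is hypothetical.

```python
import json


def to_stream(value, path=()):
    """Yield stream-style events for a parsed JSON value."""
    if isinstance(value, dict) and value:
        children = list(value.items())
    elif isinstance(value, list) and value:
        children = list(enumerate(value))
    else:
        # Scalars and empty containers are leaves: [path, value].
        yield [list(path), value]
        return
    for key, child in children:
        yield from to_stream(child, path + (key,))
    # Closing event: the path of the last child, with no value attached.
    yield [list(path) + [children[-1][0]]]


for event in to_stream({"a": [1, 2]}):
    print(json.dumps(event))
```

A filter working on this stream has to match on paths and rebuild any structure it wants, which is why --stream queries look so different from ordinary jq programs.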

The Ecosystem That Keeps Growing

jsongrep is one of many attempts to occupy the space between “full jq” and “grep on raw text.” Each makes a slightly different set of tradeoffs:

dasel (Go) — multi-format (JSON, YAML, TOML, CSV) with a unified selector syntax, sacrifices some JSON-specific optimization for breadth
jql (Rust) — subset of jq’s syntax with Rust’s memory safety and performance profile
fx — interactive terminal viewer with JavaScript expression support, targets exploration over pipeline use
jnv (Rust) — interactive filter builder that evaluates jq expressions in real time, useful for developing queries before using them in scripts

None of these have displaced jq in scripting contexts, which suggests the question is less about which tool is best in the abstract and more about what you’re actually doing. If you’re processing API responses in a shell script and you need .data.items[] | select(.active) | .id, jq’s expressiveness is the right fit. If you’re tailing structured logs and need to extract a field from each line as fast as possible, a streaming tool is probably the better choice.

The Real Takeaway

What’s interesting about jsongrep’s reception isn’t that it’s faster than jq; several tools are, for specific cases. It’s that the post drew 343 points and 216 comments, which suggests genuine appetite for something that sits between jq and grep in both complexity and capability.

jq’s position as the default JSON tool in shell scripting is stable because it’s genuinely powerful and because institutional knowledge of it is widespread. But the friction of its learning curve is also real, and for users who spend most of their time doing path extraction rather than transformation, the complexity-to-value ratio doesn’t always feel justified.

Tools like jsongrep aren’t replacing jq. They’re carving out the use cases where jq is more than you need, and making those cases faster and simpler to handle. That’s a legitimate and useful thing to exist in the ecosystem, even if it doesn’t change what goes in your .bashrc.