Elasticsearch's Complexity Is Load-Bearing, and Meilisearch Proves It
Source: lobsters
Most developers who have self-hosted Elasticsearch have a similar story. The cluster comes up fine, indexing works, and then a few months later you are tuning JVM heap settings, fighting a yellow cluster status after a node restart, or discovering that the shard count you chose at index creation time was wrong and now you need to reindex sixty million documents. The tooling to manage all of this exists, but the cognitive load compounds over time regardless.
This article at anisafifi.com walks through that frustration and makes the case for switching to Meilisearch. It is a reasonable case. But the framing of “here’s what nobody tells you” skips over something worth examining: Elasticsearch’s complexity reflects the scope of problems it was built to solve, and Meilisearch’s simplicity is a deliberate constraint on that scope. Understanding both sides makes for a better-informed migration decision.
Why Elasticsearch Got This Way
Elasticsearch was released in 2010, built as a distributed wrapper around Apache Lucene, which had existed since 2001. Lucene handles the core of full-text search: tokenization, building inverted indexes, and BM25 relevance scoring. Elasticsearch’s job is to distribute that across nodes, manage cluster state, and expose it through a JSON HTTP API.
The distribution layer is where the operational weight lives. Elasticsearch uses a primary/replica shard model. Each index has a fixed number of primary shards set at creation time, and each primary can have replicas for redundancy and read scaling. At query time, a coordinating node fans requests out to all relevant shards and merges the results. Cluster state is managed through a consensus mechanism that evolved from the original Zen discovery to a rewritten cluster-coordination subsystem introduced in v7.0.
Choosing the wrong shard count at index creation locks you in until you reindex. JVM heap sizing directly impacts how much Lucene can cache in memory, affecting query latency. Elasticsearch recommends allocating no more than half of system RAM to the JVM heap, leaving the remainder for the OS page cache to hold Lucene’s on-disk segment files. Getting that balance wrong is a common production issue.
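As a sketch of where that lock-in happens, here is an index-creation body (the index name and values are illustrative) sent to PUT /my-index. The number_of_shards value is fixed for the life of the index; number_of_replicas can be changed later:

```json
{
  "settings": {
    "number_of_shards": 3,
    "number_of_replicas": 1
  }
}
```

If 3 turns out to be wrong for your data volume, the fix is reindexing into a new index with a different shard count.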
This is the price of running at petabyte scale across dozens of nodes, executing complex aggregation queries across billions of documents, and powering observability pipelines like the ELK stack. A query like “give me a histogram of HTTP response latencies, bucketed by hour, filtered to a specific service, for the last 30 days” is not just useful, it is the primary reason Elasticsearch exists for many organizations. Meilisearch cannot execute that query.
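That histogram query can be sketched in Elasticsearch's aggregation DSL. Field names here are illustrative (loosely following common log-schema conventions), not taken from any specific deployment:

```json
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term": { "service.name": "checkout" } },
        { "range": { "@timestamp": { "gte": "now-30d" } } }
      ]
    }
  },
  "aggs": {
    "per_hour": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "1h"
      },
      "aggs": {
        "latency_buckets": {
          "histogram": { "field": "http.latency_ms", "interval": 50 }
        }
      }
    }
  }
}
```

Nested aggregations like this, executed in a distributed fashion across shards, are exactly the capability Meilisearch does not attempt to offer.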
What Meilisearch Is
Meilisearch is a search engine written in Rust, first released in 2018, with a clear scope: fast, typo-tolerant, developer-friendly search for application use cases. The v1.0 release in February 2023 marked API stability after several years of iteration.
The storage layer uses LMDB, a memory-mapped B-tree database that is fast for reads and ACID-compliant for writes. Indexing is single-node and asynchronous through a task queue; documents become searchable within seconds of submission without any shard configuration or mapping decisions.
The relevance model is where Meilisearch’s philosophy diverges most sharply from Elasticsearch. Rather than BM25’s statistical scoring, which computes term frequency weighted against inverse document frequency across the corpus, Meilisearch uses a configurable ranking rules pipeline. The defaults, applied in order as tiebreakers, are: words (documents matching more query terms rank higher), typo (fewer edit-distance corrections rank higher), proximity (matched terms closer together in the document rank higher), attribute (matches in higher-priority fields rank higher), sort, and exactness.
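Those defaults correspond to a plain ordered array in the index settings (sent to the index's ranking-rules settings endpoint), which you can reorder or extend with custom rules such as a descending sort on a numeric field. A sketch, with the custom rule name illustrative:

```json
[
  "words",
  "typo",
  "proximity",
  "attribute",
  "sort",
  "exactness",
  "release_date:desc"
]
```
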
This produces results that feel more predictable and tunable. With BM25, a short document containing an unusual term can outscore a more thorough document purely because of the IDF component. With Meilisearch’s pipeline, you configure what relevance means for your domain. The tradeoff is that statistical relevance on large text corpora, where frequency signals carry genuine information, is something rule-based ranking cannot replicate without explicit tuning.
Typo Tolerance Under the Hood
Meilisearch’s built-in typo tolerance is often cited as its headline feature, and the implementation is worth understanding. The engine builds deterministic finite automata (DFAs) derived from Levenshtein distance to match query terms against indexed tokens. At query time, for each query term, the DFA evaluates which indexed tokens fall within an edit distance threshold: no typos for terms of 1-4 characters, one typo for 5-8, two typos for 9 or more by default. Once the automaton is constructed, evaluating a candidate token against it is linear in the token’s length, regardless of the edit-distance budget.
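Those thresholds are exposed through the typo-tolerance index settings. A sketch of the settings body with the default values spelled out explicitly:

```json
{
  "enabled": true,
  "minWordSizeForTypos": {
    "oneTypo": 5,
    "twoTypos": 9
  },
  "disableOnWords": [],
  "disableOnAttributes": []
}
```

Raising oneTypo and twoTypos makes matching stricter; disableOnWords and disableOnAttributes let you exempt exact-match-sensitive terms such as SKUs.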
Elasticsearch supports fuzzy matching through the fuzzy query type, which also uses Levenshtein distance. The difference is ergonomics: it is opt-in per query, verbose to configure, and easy to forget. Meilisearch applies it by default, transparently, to every search. In practice, developers building app search on Elasticsearch often rediscover the need for fuzzy matching after users complain about missed results, then add it, then spend time tuning the fuzziness: "AUTO" parameter. Meilisearch handles this without any configuration.
The API in Practice
A basic fuzzy search against a title field in Elasticsearch looks like this:
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "title": {
              "query": "javascript frameworks",
              "fuzziness": "AUTO"
            }
          }
        }
      ]
    }
  }
}
The equivalent in Meilisearch:
{
  "q": "javascript frameworks",
  "attributesToSearchOn": ["title"]
}
The Elasticsearch query DSL is not unnecessarily complex. The bool query’s must, should, must_not, and filter clauses encode real semantic distinctions, particularly around whether a clause affects scoring or only filtering. Understanding those distinctions takes time, and for most product or content search you do not need them. Meilisearch’s API fits comfortably in a README and typically requires no framework-specific client library to use effectively.
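One example of the scoring-versus-filtering distinction: a clause under filter constrains results without contributing to the relevance score (and is cacheable), while the same clause under must would influence ranking. Field names are illustrative:

```json
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "javascript frameworks" } }
      ],
      "filter": [
        { "term": { "status": "published" } }
      ]
    }
  }
}
```
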
What You Give Up
Meilisearch has no aggregation pipeline. Sums, averages, histograms, percentile distributions, cardinality counts: none of these are available at query time. Faceting, which Meilisearch does support, is limited to counts per distinct value, not computed metrics. If your search requirements include any analytical querying over stored documents, Meilisearch is not a substitute.
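For contrast, the extent of Meilisearch's analytical capability is a facets request (attribute names illustrative, and the attributes must be declared filterable in the index settings):

```json
{
  "q": "running shoes",
  "facets": ["brand", "size"]
}
```

The response's facetDistribution gives a count per distinct value for each listed attribute. There is no equivalent of sums, averages, or percentiles over those buckets.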
There is no native cluster mode. High availability requires either running multiple instances with shared storage or using Meilisearch Cloud. For most single-region deployments at moderate scale, a single Meilisearch instance on a reasonably sized machine handles tens of millions of documents with low single-digit millisecond query latencies. For horizontal read scaling or multi-region active-active deployments, the architecture requires more thought.
Vector search landed in Meilisearch v1.6, enabling hybrid search that combines embedding-based semantic similarity with keyword ranking. Elasticsearch added dense vector fields in the 7.x era and approximate k-NN search using HNSW in v8.0, an implementation that is more mature and tested at larger scale. If high-dimensional hybrid search is a core requirement, Elasticsearch has more production headroom.
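A hybrid search request in Meilisearch is a small extension of the plain search body. This sketch assumes an embedder named "default" has already been configured in the index settings; semanticRatio weights semantic similarity against keyword ranking, with 0 being pure keyword and 1 pure semantic:

```json
{
  "q": "lightweight trail shoes",
  "hybrid": {
    "embedder": "default",
    "semanticRatio": 0.5
  }
}
```
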
Elastic also changed its licensing in 2021, moving from Apache 2.0 to the Elastic License 2.0 and SSPL for new versions, which restricts certain cloud-provider use cases. Meilisearch is MIT-licensed, which is worth noting for teams where licensing factors into infrastructure decisions.
When the Switch Makes Sense
The migration described in the source article makes complete sense for its context: a product or content search use case where user-facing search UX is the goal, operational simplicity matters, and the full Elasticsearch feature set is going unused. For this category of application, including documentation search, e-commerce product catalogs, developer tool search, and knowledge base search, Meilisearch is a significantly better fit. The developer experience is better, the memory footprint is lower (Rust without a JVM has meaningful consequences at small to medium scale), and the path from “I need search” to “search works” is measured in minutes rather than days.
The miscalibration in the “what nobody tells you” framing is treating Elasticsearch’s operational requirements as waste. For applications that only need search-box-with-typo-tolerance, those requirements are pure overhead. For applications running the aggregation pipeline, using Elasticsearch’s distributed data model, or relying on it within a broader observability stack, that complexity is doing real work.
Meilisearch is an excellent tool for a well-defined problem; Elasticsearch is an excellent tool for a different and harder one. The decision should start with the problem rather than the operational preference.