From Lucene to LMDB: What the Elasticsearch-to-Meilisearch Migration Actually Changes
Source: lobsters
The migration from Elasticsearch to Meilisearch that Ani Safifi describes follows a pattern many developers recognize: Elasticsearch started as the pragmatic choice, became the operational burden, and eventually the question of switching became harder to dismiss. The switch to Meilisearch is genuinely the right call for a specific class of problems. But the reasons go deeper than “simpler API” and the limitations go further than most migration posts acknowledge.
Why Elasticsearch Is What It Is
Elasticsearch is built on Apache Lucene, which has been around since 2001. Lucene’s inverted index design was built for high-throughput document indexing and precise relevance scoring over large corpora. Elasticsearch took that foundation and wrapped a distributed architecture around it: shards, replicas, cluster state management, master node elections. This is the right architecture for log aggregation at scale, for analytics over billions of events, for geospatial queries across millions of points.
The complexity is not gratuitous. It is the cost of that capability.
When you run Elasticsearch, you’re running a JVM application that follows the 50% heap rule: give Elasticsearch no more than half your system RAM so the OS can use the other half for the filesystem cache that Lucene depends on. A reasonable production setup starts at 4GB heap, which means 8GB minimum RAM before you have indexed a single document. The Query DSL is verbose by necessity because it needs to express a wide range of query types: match, multi_match, bool, nested, function_score, script_score. A simple search with typo tolerance and field boosting in Elasticsearch looks like this:
{
  "query": {
    "bool": {
      "should": [
        {
          "multi_match": {
            "query": "comfortable running shoes",
            "fields": ["title^3", "body", "tags^2"],
            "fuzziness": "AUTO",
            "type": "best_fields"
          }
        }
      ]
    }
  }
}
This is not unnecessarily complex. It is expressing a real set of choices: which fields matter, how much fuzziness, how to combine scores. The problem is that most user-facing search boxes do not need that level of control, and the operational overhead of running a cluster to serve a product search page is disproportionate to what you get.
What Meilisearch Actually Is
Meilisearch is written in Rust and uses LMDB (Lightning Memory-Mapped Database) as its storage engine. LMDB is a B-tree-based key-value store built on memory-mapped files, which means it can work with datasets larger than available RAM by relying on the OS page cache. Where Elasticsearch layers cluster coordination on top of Lucene's segment files, Meilisearch's single-node LMDB architecture removes that coordination overhead entirely.
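The mechanism LMDB relies on can be illustrated with the operating system's mmap primitive directly. This is a sketch of the general technique, not Meilisearch or LMDB code: the OS faults pages in from disk on access and caches them, so only the working set, not the whole file, has to fit in RAM.

```python
import mmap
import os
import tempfile

# Write a file to disk, then map it into the address space.
# The OS page cache, not application buffers, decides what stays in RAM.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 16 * 1024 * 1024)  # 16 MiB of data on disk
    path = f.name

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Slicing faults in only the pages actually touched; the rest of
    # the file stays on disk until accessed.
    first_page = mm[:4096]
    last_page = mm[-4096:]
    mm.close()

os.remove(path)
print(len(first_page), len(last_page))  # 4096 4096
```

The same property is why the 50% heap rule exists on the Elasticsearch side: Lucene also leans on the OS page cache, but it competes with a large JVM heap for the same RAM.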
The API is a simple REST interface. The same search above becomes:
curl \
-X POST 'http://localhost:7700/indexes/products/search' \
-H 'Content-Type: application/json' \
--data-binary '{"q": "comfortable running shoes", "limit": 20}'
Typo tolerance is on by default. The ranking rules are applied in order: words, typo, proximity, attribute, sort, exactness. You customize attribute weights by setting searchableAttributes in priority order, not by numeric boost values:
{
  "searchableAttributes": ["title", "tags", "body"]
}
The order of that array is the weighting. Meilisearch scores documents that match earlier attributes higher. This is opinionated and less flexible than Elasticsearch’s boost multipliers, but it covers the majority of use cases for product search, documentation search, and content discovery without any relevance tuning required.
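The effect of attribute order can be sketched with a toy ranking function. This is an illustration of the principle only, not Meilisearch's actual engine: a match in an earlier searchableAttribute outranks a match in a later one.

```python
# Toy illustration of attribute-priority ranking (not Meilisearch's
# real algorithm): earlier attributes in the list win.
SEARCHABLE_ATTRIBUTES = ["title", "tags", "body"]

def attribute_rank(doc: dict, query: str) -> int:
    """Index of the first attribute containing the query,
    or len(SEARCHABLE_ATTRIBUTES) if nothing matches."""
    for i, attr in enumerate(SEARCHABLE_ATTRIBUTES):
        if query.lower() in str(doc.get(attr, "")).lower():
            return i
    return len(SEARCHABLE_ATTRIBUTES)

docs = [
    {"id": 1, "title": "Hiking boots", "body": "running shoes mentioned here"},
    {"id": 2, "title": "Running shoes", "body": "lightweight and comfortable"},
]

# Sort ascending: a lower rank means an earlier (higher-priority) attribute matched.
results = sorted(docs, key=lambda d: attribute_rank(d, "running shoes"))
print([d["id"] for d in results])  # [2, 1]: title match beats body match
```

The contrast with Elasticsearch's `title^3` multipliers is that there is no number to tune: reordering the array is the only knob.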
Meilisearch targets sub-50ms latency for most queries. The v1.0 release in February 2023 formalized the stability guarantees around the API surface alongside that performance target.
The Ranking Model Is a Deliberate Simplification
Elasticsearch uses BM25 as its default similarity algorithm, a probabilistic ranking function that weighs term frequency against inverse document frequency. It is well-suited for information retrieval scenarios where domain experts tune relevance over time. Meilisearch uses its own ranking algorithm built around those six default rules, which is closer to how a user intuitively expects search to work: an exact match beats a typo match, a word in the title beats one in the body.
For user-facing search in a product or SaaS context, Meilisearch’s defaults win in practice because they match user expectations without customization. For a research corpus or legal document retrieval, where term frequency and document frequency carry semantic weight, Elasticsearch’s BM25 is better suited because its probabilistic model is built for exactly that kind of retrieval.
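BM25's behavior is easy to make concrete. The sketch below is the textbook formula with the common k1 and b defaults, not Elasticsearch's exact implementation (Lucene adds its own refinements on top):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Textbook BM25: the probabilistic ranking function Lucene's
    default similarity is based on. doc and corpus are token lists."""
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    n_docs = len(corpus)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)  # document frequency
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        tf = doc.count(term)  # term frequency in this document
        # Term frequency saturates (k1) and is normalized by doc length (b).
        score += idf * (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * len(doc) / avgdl))
    return score

corpus = [
    "comfortable running shoes for marathon training".split(),
    "running tips for beginners".split(),
    "comfortable office chairs".split(),
]

# The document containing both query terms outscores documents with one.
scores = [bm25_score(["comfortable", "running"], d, corpus) for d in corpus]
print(max(range(3), key=lambda i: scores[i]))  # 0
```

Notice that nothing here encodes "title beats body" or "exact beats typo"; those intuitions are exactly what Meilisearch's rule order hardcodes and what Elasticsearch leaves to per-field boosting.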
Since Meilisearch v1.7, the engine also supports hybrid search, combining vector embeddings with its keyword ranking. You configure a semanticRatio between 0 and 1 to blend vector similarity with traditional scoring:
{
  "q": "comfortable running shoes",
  "hybrid": {
    "semanticRatio": 0.5,
    "embedder": "default"
  }
}
This brings Meilisearch meaningfully closer to Elasticsearch’s semantic search capabilities without requiring Elasticsearch’s operational footprint.
What You Actually Lose
Migration posts tend to underemphasize three areas where Meilisearch genuinely falls short.
Aggregations. Elasticsearch’s aggregation framework is one of its strongest features: terms aggregations for faceted navigation, date_histogram for time-series breakdowns, percentiles for latency distributions, bucket_script for derived metrics. Meilisearch offers faceted search for simple category counts, but if you need nested aggregation pipelines or statistical distributions over your dataset, Meilisearch cannot help you. This matters if your search UI includes a price range histogram or a “results by month” chart.
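The gap shows up in the request shapes themselves. Below is a sketch of a nested Elasticsearch aggregation body next to the closest Meilisearch equivalent, a facets request; the index and field names are illustrative:

```python
# Elasticsearch: a nested aggregation pipeline -- category counts, each
# broken down by month. This composability has no Meilisearch analogue.
es_aggregation_body = {
    "size": 0,  # aggregation-only request, no hits returned
    "aggs": {
        "by_category": {
            "terms": {"field": "category"},
            "aggs": {
                "by_month": {
                    "date_histogram": {
                        "field": "created_at",
                        "calendar_interval": "month",
                    }
                }
            },
        }
    },
}

# Meilisearch: flat facet counts attached to a search request.
# Simple category counts -- no nesting, no histograms, no derived metrics.
meili_search_body = {
    "q": "running shoes",
    "facets": ["category"],
}

print(sorted(es_aggregation_body["aggs"]["by_category"].keys()))  # ['aggs', 'terms']
```

The nesting under `by_category` is the point: every Elasticsearch bucket can host further aggregations, while Meilisearch's facets stop at one level of counts.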
Distributed scale. Meilisearch runs as a single node. The multi-search and federated search features allow querying multiple indexes in one call, which helps with multi-tenant architectures, but horizontal scaling across nodes to handle hundreds of millions of documents or thousands of queries per second is not part of the current architecture.
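What multi-index querying does offer looks roughly like the payloads below, a sketch of /multi-search request bodies with illustrative indexUid values (the federation options shown are an assumption about the current API surface, so check the docs for your version):

```python
# Meilisearch /multi-search: several indexes queried in one HTTP call,
# with results returned per query.
multi_search_body = {
    "queries": [
        {"indexUid": "products", "q": "running shoes"},
        {"indexUid": "articles", "q": "running shoes"},
    ]
}

# Federated search: a federation object asks Meilisearch to merge the
# results of all queries into one ranked list instead of per-index sets.
federated_body = {
    "federation": {"limit": 20},
    "queries": multi_search_body["queries"],
}

print(len(federated_body["queries"]))  # 2
```

This is convenience for multi-tenant layouts on one node, not sharding: both indexes still live on the same machine.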
Log ingestion and analytics. The ELK stack exists because Elasticsearch handles high-volume append-only data and aggregation queries over time windows. Meilisearch is designed for a user-facing search box, not an observability pipeline. If you need both, you will likely end up running both.
The License Situation
In January 2021, Elastic changed Elasticsearch’s license from Apache 2.0 to the Server Side Public License (SSPL). AWS forked Elasticsearch at version 7.10 to create OpenSearch, which remains Apache 2.0 licensed. If the license change influenced your decision to leave Elasticsearch, OpenSearch is a viable path that keeps the full feature set, including aggregations and cluster distribution. Meilisearch is MIT licensed, which is permissive with no cloud-provider restrictions.
This is not an abstract concern for many organizations. Depending on your deployment model and whether you are offering search as part of a hosted product, the SSPL can introduce legal ambiguity that the MIT license avoids entirely.
When the Migration Makes Sense
The switch to Meilisearch is the right call when your use case is user-facing search over a bounded dataset, you want reasonable relevance out of the box with minimal tuning, and your operational budget does not include infrastructure engineering time for cluster management. A documentation search, a product catalog, a blog search index, an internal knowledge base: these all fit Meilisearch’s design well.
The switch is the wrong call if you need aggregations, if you are ingesting high-volume event streams, if you require full horizontal scaling beyond a single powerful node, or if you have existing relevance tuning in Elasticsearch that took months to calibrate and represents real business logic.
Running both is also a reasonable outcome: Elasticsearch for analytics and log pipelines where its aggregation engine earns its keep, Meilisearch for the search box that users actually type into. The operational overhead of maintaining two systems is real, but it is sometimes lower than the overhead of building aggregation tooling on top of a system that was never designed for it, or the overhead of running a full Elasticsearch cluster for a search box that processes a few hundred queries per day.
Meilisearch’s simplicity comes from narrowing the problem scope, not from solving the same problem more elegantly. That distinction is worth keeping clear before the migration starts, because it determines what you will be asking each system to do a year later.