Beyond WAL Mode: SQLite Replication Strategies for Docker and Kubernetes

When Simon Willison documented the behavior of SQLite WAL mode across Docker containers sharing a volume, the central finding was precise: it works on a single Linux host with local storage, but breaks silently on network filesystems and Docker Desktop for Mac and Windows. That boundary matters more than it looks, because container orchestration has a habit of dissolving the single-host assumption the moment you move from development to production.

The WAL mode limitation comes down to how SQLite coordinates concurrent readers and writers. The -shm file, which accompanies every WAL-mode database, is a shared memory index that all connected processes must access through a coherent view. On a single Linux host, two containers mounting the same named volume both reach the same inode, and mmap() with MAP_SHARED gives them the same physical kernel page-cache pages. POSIX byte-range locks on that file work correctly because the kernel mediates all lock state. Take the same setup across a network filesystem like EFS or Azure Files, and the coherency guarantee disappears: each host maintains its own page cache, mmap() regions on different machines are not synchronized, and the WAL index becomes incoherent between containers.
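To see the files involved, here is a minimal sketch using Python's standard sqlite3 module. The function name wal_sidecar_files and the app.db path are illustrative only. It switches a fresh database to WAL mode and reports whether the -wal and -shm sidecar files exist while the connection is open, which is the state every cooperating container has to share.

```python
import os
import sqlite3
import tempfile


def wal_sidecar_files(db_path):
    """Report the journal mode and which WAL sidecar files exist while connected."""
    conn = sqlite3.connect(db_path)
    try:
        # The pragma returns the journal mode actually in effect.
        mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (1)")
        conn.commit()
        return {
            "journal_mode": mode,
            "wal": os.path.exists(db_path + "-wal"),
            "shm": os.path.exists(db_path + "-shm"),
        }
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(wal_sidecar_files(os.path.join(d, "app.db")))
```

On a local filesystem both sidecar files should be reported as present. The -shm file is the one that must be mapped coherently by every process that opens the database.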

This is a well-documented SQLite limitation. The SQLite WAL mode documentation states directly that WAL does not work over a network filesystem, and the howtocorrupt.html page describes the consequences. But knowing the rule does not immediately tell you what to do instead. When you have a working single-host setup and need to scale it, or when your orchestration platform might schedule containers on different nodes, you face a real architectural choice. Five options are worth understanding in depth.

Rollback Journal Mode as the Simple Fallback

The easiest path is to stop using WAL mode entirely. Rollback journal mode (PRAGMA journal_mode=DELETE) predates WAL and has different coordination properties: it puts POSIX byte-range locks directly on the main database file rather than on a shared memory index. Readers block writers and writers block readers, which is exactly why WAL mode was introduced in SQLite 3.7.0, but the locking model is far simpler and does not require coherent mmap() semantics across hosts.

PRAGMA journal_mode=DELETE;
PRAGMA synchronous=FULL;
PRAGMA busy_timeout=5000;

The trade-offs are real. Rollback mode serializes readers and writers in a way WAL mode avoids. A long read transaction prevents any write from committing for its duration, and a committing writer in turn locks out new readers. For read-heavy workloads, this regression can be significant. For applications with infrequent writes and modest concurrency, it is often the right call: no shared memory dependency, no -shm file, no coherency assumptions that deployment changes can violate.
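The concurrency difference can be observed directly. The following is a rough sketch, not a benchmark: the helper name write_blocked_by_reader is made up for illustration, and the busy timeout is set to zero so that a blocked write fails immediately instead of waiting. One connection holds a read transaction open while a second attempts an insert.

```python
import os
import sqlite3
import tempfile


def write_blocked_by_reader(journal_mode):
    """Return True if an open read transaction makes a concurrent write fail."""
    path = os.path.join(tempfile.mkdtemp(), "app.db")

    setup = sqlite3.connect(path)
    setup.execute(f"PRAGMA journal_mode={journal_mode}")
    setup.execute("CREATE TABLE t (x INTEGER)")
    setup.close()

    # isolation_level=None: no implicit transactions, we issue BEGIN ourselves.
    # timeout=0: report a busy database immediately rather than retrying.
    reader = sqlite3.connect(path, isolation_level=None, timeout=0)
    writer = sqlite3.connect(path, isolation_level=None, timeout=0)
    try:
        reader.execute("BEGIN")
        # The read lock (or WAL snapshot) is held until the transaction ends.
        reader.execute("SELECT count(*) FROM t").fetchone()
        try:
            writer.execute("INSERT INTO t VALUES (1)")
            return False
        except sqlite3.OperationalError:
            # SQLite reports "database is locked" when the commit cannot proceed.
            return True
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    for mode in ("DELETE", "WAL"):
        print(mode, "write blocked:", write_blocked_by_reader(mode))
```

In rollback mode the insert is expected to fail while the reader is open; in WAL mode it is expected to succeed. With the busy_timeout of 5000 shown above, the rollback-mode writer would instead wait up to five seconds for the reader to finish.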

On a network filesystem, rollback mode still requires the filesystem to correctly implement POSIX fcntl() advisory locks. NFS through the lockd daemon and NLM protocol has a long history of failures here, so “safer than WAL” does not mean “safe for NFS under all conditions.” Validate lock behavior explicitly if your backing store is NFS rather than assuming it works.
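One way to validate lock behavior is a small probe like the sketch below. It is an assumption-laden starting point rather than a complete test: the file name locktest.bin and the function name are hypothetical, and it runs both processes on the same host. To exercise NFS locking properly, the two halves would need to run on different hosts that mount the same export. The parent takes an exclusive fcntl() byte-range lock, then a child process tries to take the same lock without blocking.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Child process: try to take the same byte-range lock without blocking.
CHILD = """
import fcntl, sys
with open(sys.argv[1], "r+b") as f:
    try:
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0)
    except OSError:
        sys.exit(0)  # lock correctly refused
    sys.exit(1)      # lock granted twice: exclusion is broken
"""


def locks_exclude_other_processes(directory):
    """Return True if a second process is refused a lock the first one holds."""
    path = os.path.join(directory, "locktest.bin")
    with open(path, "wb") as f:
        f.write(b"\0" * 16)
    with open(path, "r+b") as f:
        # Exclusive POSIX lock on the first byte, held while the child runs.
        fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0)
        result = subprocess.run([sys.executable, "-c", CHILD, path])
        return result.returncode == 0


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.mkdtemp()
    print("locks enforced:", locks_exclude_other_processes(target))
```

A False result on the volume you plan to use is a strong signal that SQLite cannot coordinate safely there in any journal mode. A True result from a single host does not prove cross-host correctness.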

Litestream: WAL Streaming to Object Storage

Litestream takes a different approach: instead of sharing a database file directly between containers, it replicates WAL frames in real time to an object storage backend like S3, GCS, or Azure Blob Storage. Each container maintains its own local copy of the database and runs Litestream as a sidecar to handle restoration and replication.

The typical pattern for a Kubernetes pod looks like this:

initContainers:
  - name: restore
    image: litestream/litestream:latest
    args: ["restore", "-if-db-not-exists", "-if-replica-exists", "-config", "/etc/litestream.yml", "/data/app.db"]
    volumeMounts:
      - name: db
        mountPath: /data
containers:
  - name: app
    image: myapp:latest
    volumeMounts:
      - name: db
        mountPath: /data
  - name: litestream
    image: litestream/litestream:latest
    args: ["replicate", "-config", "/etc/litestream.yml"]
    volumeMounts:
      - name: db
        mountPath: /data

The init container restores the latest snapshot before the application starts. The sidecar then replicates all subsequent WAL frames continuously. This gives you durability, point-in-time recovery, and the ability to start new instances from a recent snapshot without coordinating between running containers.

The constraint is that Litestream is a single-writer architecture. Multiple containers can each maintain a local read copy, but writes must route through one primary. If two replicas both try to write, there is a conflict problem that Litestream does not resolve. For stateless services behind a load balancer where one pod accepts writes, this works cleanly. For anything requiring active-active write distribution, it does not.

Litestream ships WAL changes on a short interval (about one second by default), so replication lag to S3 is typically on that order under normal conditions. The Litestream documentation covers how to tune checkpoint frequency and segment size for different durability requirements.

LiteFS: FUSE-Based Replication with Leader Election

LiteFS, developed at Fly.io, addresses the single-writer coordination problem more directly. It is a FUSE filesystem that intercepts database reads and writes at the filesystem level, detects transaction boundaries, and replicates each transaction's changes over HTTP to follower instances. Leader election is handled through a lease, typically held in Consul; a statically configured primary is also supported.

Each container mounts its own LiteFS FUSE filesystem at the same path. Writes on the elected leader pass through the FUSE layer, where LiteFS captures them and replicates them to followers. Followers serve reads from their local copy. The LiteFS documentation covers the Docker and Kubernetes setup in detail, including the Consul integration for leader election.

FUSE introduces latency and operational complexity. Every filesystem operation goes through the FUSE kernel module to the LiteFS process, which adds overhead compared to direct filesystem access. On write-heavy workloads, this overhead is measurable. LiteFS also requires deploying a coordination backend for leader election, which adds infrastructure dependencies a simple SQLite deployment does not have.

The payoff is genuine multi-reader scale with a single write point, without the manual routing coordination that Litestream requires. Fly.io positions LiteFS for applications with a primary write region and multiple distributed read replicas, not concurrent multi-writer architectures. That model fits a large category of web applications.

rqlite: Raft Consensus Around SQLite

rqlite is a structurally different category of solution. Rather than replicating a SQLite file between nodes, it wraps SQLite in the Raft consensus protocol and exposes an HTTP API for database access. Multiple rqlite nodes form a cluster; all writes go through a leader and are replicated to followers via Raft before acknowledgment. The rqlite consistency documentation explains the available read consistency levels, from fully linearizable to eventually consistent reads from followers.

This gives you distributed consistency with automatic leader failover, at the cost of a client-server model. Your application no longer opens a local SQLite file; it speaks HTTP to the rqlite API. The query syntax is standard SQL, but the embedded database ergonomics that make SQLite attractive in the first place are no longer present.
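To make the shift concrete, here is a sketch of what "speaking HTTP" looks like from Python using only the standard library. It builds requests for rqlite's /db/execute and /db/query endpoints without sending them. The function names are illustrative, and the base URL (a node listening on localhost:4001) is an assumption about your deployment.

```python
import json
import urllib.parse
import urllib.request


def execute_request(base_url, statements):
    """Build a write request: a JSON array of SQL statements POSTed to /db/execute."""
    body = json.dumps(statements).encode("utf-8")
    return urllib.request.Request(
        base_url + "/db/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_request(base_url, sql, level="weak"):
    """Build a read request; `level` selects the read consistency level."""
    params = urllib.parse.urlencode({"q": sql, "level": level})
    return urllib.request.Request(f"{base_url}/db/query?{params}", method="GET")


if __name__ == "__main__":
    base = "http://localhost:4001"  # assumed address of one rqlite node
    write = execute_request(base, [["INSERT INTO foo(name) VALUES(?)", "fiona"]])
    read = query_request(base, "SELECT * FROM foo", level="strong")
    print(write.full_url, read.full_url)
    # To actually send: urllib.request.urlopen(write)
```

Every query now carries network latency and a consistency choice, which is the ergonomic cost described above.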

rqlite targets use cases where you need writes to stay available across node failures, with automatic failover, not just read scaling. A three-node cluster, which tolerates the loss of one node and re-elects a leader automatically, is a reasonable production configuration. For applications that chose SQLite to avoid client-server database infrastructure, rqlite reintroduces exactly that overhead. For applications that need distributed consistency and chose SQLite for its data model rather than its embedded nature, it is a coherent fit.

libSQL and Turso: A Protocol for Remote SQLite

libSQL, the SQLite fork underlying Turso, extends SQLite with a network protocol that allows remote database access while preserving the SQLite API surface for application code. The database runs either on Turso's infrastructure or on your own sqld server; your application connects via a client library that speaks the sqld protocol and can keep a local replica for read performance.

The client libraries for Python and Node.js offer an API modeled on the standard SQLite drivers, with connection strings targeting either a local file or a remote server. For containerized deployments, this means containers connect to a shared remote database without coordinating local filesystem access at all.

The trade-off is a dependency on managed infrastructure or the operational cost of running your own sqld server, combined with network round-trips for writes and reads that miss the local cache. The performance profile shifts noticeably from embedded SQLite. For applications serving a global user base from multiple regions and wanting SQLite semantics without local file coordination, libSQL is the most complete solution in the current ecosystem.

Choosing Between Them

The decision tree maps to what you actually need. If you are running a single-host Docker Compose setup with a volume backed by local storage, WAL mode works and you should use it. If you are on a network-backed volume, or on Kubernetes where pods sharing the volume may be scheduled onto different nodes, switch to rollback journal mode first and measure whether the concurrency regression affects your workload.

If you need durability beyond the local disk and a single-writer model fits your application, Litestream offers the lowest operational complexity. If you need read scaling across containers with a managed single primary, LiteFS adds what Litestream cannot provide. If you need distributed consistency with automatic failover, rqlite is the appropriate tool. If you want SQLite semantics without any filesystem coordination, libSQL trades embedded simplicity for managed infrastructure.
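The same decision tree can be written down as a small function. This is only a restatement of the guidance above; the parameter names are invented for illustration, and the order in which the checks are applied when several needs overlap is an assumption, not a rule from any of the tools.

```python
def recommend(*, single_host_local_volume,
              needs_offsite_durability=False,
              needs_read_replicas=False,
              needs_failover=False,
              wants_no_filesystem_coordination=False):
    """Map deployment needs to one of the options discussed in this article."""
    # Assumed precedence: the most demanding requirement wins.
    if wants_no_filesystem_coordination:
        return "libSQL"
    if needs_failover:
        return "rqlite"
    if needs_read_replicas:
        return "LiteFS"
    if needs_offsite_durability:
        return "Litestream"
    if single_host_local_volume:
        return "WAL mode"
    return "rollback journal mode"


if __name__ == "__main__":
    print(recommend(single_host_local_volume=True))
    print(recommend(single_host_local_volume=False))
    print(recommend(single_host_local_volume=False, needs_read_replicas=True))
```

Real deployments often combine options, for example WAL mode on a local volume with Litestream for off-host durability, so treat the single return value as a starting point.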

All five options exist because SQLite in production containers is a real engineering problem with multiple valid solutions. The WAL shared memory constraint is a property of the implementation, not a verdict on SQLite’s fitness for containerized deployments. Understanding which layer of the stack breaks under your specific deployment conditions is what makes the right solution clear.