
The Bloat Problem Nobody Talks About When You Use Postgres as a Queue

Source: lobsters

Using Postgres as a job queue has become a respectable architectural choice. The appeal is obvious: you already have a database, it has ACID transactions, and tools like SELECT ... FOR UPDATE SKIP LOCKED make it straightforward to implement a reliable queue without adding Redis or RabbitMQ to your stack. But there is a cost that almost every naive implementation eventually runs into, and it comes from the same feature that makes Postgres reliable in the first place: MVCC.

What MVCC Does to Queue Tables

Postgres uses Multi-Version Concurrency Control (MVCC) so that readers and writers do not block each other. An UPDATE never overwrites a row in place. Instead, it writes a new version of the row and marks the old version as dead. Dead row versions accumulate in the table until VACUUM reclaims the space. For most tables, this is fine: rows change infrequently, VACUUM keeps up, and bloat stays manageable.
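To make the mechanism concrete, here is a minimal Python sketch of a heap that keeps row versions the way the paragraph above describes. ToyHeap and RowVersion are hypothetical names invented for this illustration. The model ignores transactions, visibility rules, and pages, so it shows only the bookkeeping, not how Postgres is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class RowVersion:
    row_id: int
    data: str
    dead: bool = False  # True once superseded by UPDATE or removed by DELETE


@dataclass
class ToyHeap:
    """A deliberately simplified stand-in for a Postgres heap table."""
    versions: list = field(default_factory=list)

    def _live(self, row_id: int) -> RowVersion:
        for version in self.versions:
            if version.row_id == row_id and not version.dead:
                return version
        raise KeyError(row_id)

    def insert(self, row_id: int, data: str) -> None:
        self.versions.append(RowVersion(row_id, data))

    def update(self, row_id: int, data: str) -> None:
        # No overwrite in place: retire the old version, append a new one.
        self._live(row_id).dead = True
        self.versions.append(RowVersion(row_id, data))

    def delete(self, row_id: int) -> None:
        # The version is only marked dead; it stays stored until vacuum().
        self._live(row_id).dead = True

    def dead_count(self) -> int:
        return sum(1 for version in self.versions if version.dead)

    def vacuum(self) -> int:
        """Drop dead versions and return how many were reclaimed."""
        before = len(self.versions)
        self.versions = [v for v in self.versions if not v.dead]
        return before - len(self.versions)


heap = ToyHeap()
heap.insert(1, "pending")
heap.update(1, "running")
heap.update(1, "completed")
print(len(heap.versions), heap.dead_count())  # 3 2
print(heap.vacuum())                          # 2
```

One logical row, three stored versions, two of them dead until the vacuum step runs.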

A queue table is a different animal entirely. Consider the lifecycle of a single job: it is inserted as pending, updated to running when a worker claims it, then updated again to completed or failed. That is three versions of a single row written in rapid succession, two of which are dead as soon as the job finishes. Multiply that by thousands of jobs per second and the table generates dead tuples faster than autovacuum can collect them. The table grows, its indexes bloat alongside it, and eventually every scan is wading through pages that are mostly dead tuples.
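A back-of-the-envelope sketch of that race, in Python. Every number here is hypothetical, and the model treats vacuum as a constant reclaim rate, which real autovacuum is not (it runs in passes). It is only meant to show how quickly a small deficit compounds.

```python
def dead_tuple_backlog(jobs_per_sec: float,
                       dead_per_job: int,
                       vacuum_tuples_per_sec: float,
                       seconds: float) -> float:
    """Dead tuples left unreclaimed after `seconds` of constant load."""
    produced_per_sec = jobs_per_sec * dead_per_job
    net_per_sec = produced_per_sec - vacuum_tuples_per_sec
    # A vacuum that keeps up leaves no backlog; it cannot go negative.
    return max(0.0, net_per_sec * seconds)


# Hypothetical workload: 2,000 jobs/s, two dead tuples per job,
# and a vacuum that reclaims 3,000 tuples/s on average.
backlog = dead_tuple_backlog(jobs_per_sec=2_000,
                             dead_per_job=2,
                             vacuum_tuples_per_sec=3_000,
                             seconds=3_600)
print(f"{backlog:,.0f} dead tuples after one hour")  # 3,600,000 dead tuples after one hour
```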

This is not hypothetical. Teams running pg-boss or hand-rolled queue implementations at scale have hit this wall. Autovacuum is tunable, but there is a ceiling on how aggressively you can run it before it starts competing with your actual workload for I/O.

The Existing Landscape

pg-boss, a Node.js library, is one of the most widely used Postgres queue libraries. It has scheduling, retries, fan-out, throttling, and a mature API. It manages bloat through DELETE rather than soft-deletes and through aggressive vacuum configuration recommendations, but it still carries the fundamental burden of row-per-job state transitions. It works well at moderate scale, and its battle-tested reliability matters.

pgmq (from Tembo) takes a more explicit approach. It is modeled on SQS semantics: messages have a visibility timeout, a consumer locks a message by updating a vt (visible after) timestamp, and completed messages are deleted. The design is clean, but UPDATE-on-dequeue is still an MVCC event, and under high throughput the bloat problem resurfaces.

River (from Riverqueue) is a Go library that writes directly against a Postgres schema without an intermediate API layer. It is fast and has a thoughtful design around worker concurrency, but again, jobs go through state transitions via UPDATE.

All of these systems are fundamentally fighting the same physics. You have a mutable state machine (pending, running, done) expressed as a Postgres row, and MVCC is working against you every time that row changes.

What PgQue Does Differently

PgQue, built by Nikolay Samokhvalov, attacks this from the schema level rather than through VACUUM tuning. The core insight is that the majority of bloat comes from UPDATE statements on queue rows. If you can eliminate the status-transition updates, you eliminate most of the bloat.

The approach PgQue takes is insert-then-delete. Instead of updating a row’s status column, the worker deletes the job from the queue as soon as it has been processed successfully. No running state is written back to the table. Within a single transaction, the worker claims the job using SELECT ... FOR UPDATE SKIP LOCKED, does the work, and issues a DELETE; the row lock lasts until that transaction commits. The only write operations on the queue table are INSERT and DELETE, never UPDATE.

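This changes the MVCC story significantly. INSERT creates a single row version. DELETE marks that version dead, and it still occupies space in its page until VACUUM reclaims it, exactly like the old version left behind by an UPDATE. The difference is volume. An insert-then-delete job leaves one dead tuple over its whole lifetime, while a job that moves through pending, running, and completed leaves one for every transition. Each UPDATE also writes a new row version, and unless it qualifies as a heap-only tuple (HOT) update, which is not possible when the changed column is indexed, as a status column often is, every index on the table gets a new entry as well. With no UPDATEs, VACUUM has fewer dead tuples and fewer dead index entries to clean up per job.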
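The comparison reduces to counting writes. The Python sketch below assumes the simplified rule that every UPDATE or DELETE retires exactly one existing row version and an INSERT retires none. The two lifecycles are illustrative, not taken from any library’s actual schema.

```python
# Simplified rule: UPDATE and DELETE each leave one dead row version behind.
DEAD_TUPLES_PER_WRITE = {"INSERT": 0, "UPDATE": 1, "DELETE": 1}


def dead_tuples(lifecycle) -> int:
    """Total dead row versions produced by one job's sequence of writes."""
    return sum(DEAD_TUPLES_PER_WRITE[op] for op in lifecycle)


# pending -> running -> completed, then purged later (illustrative)
status_column = ["INSERT", "UPDATE", "UPDATE", "DELETE"]
# inserted, then deleted on success (illustrative)
insert_delete = ["INSERT", "DELETE"]

print(dead_tuples(status_column))  # 3
print(dead_tuples(insert_delete))  # 1
```

The count covers heap tuples only. Dead index entries from non-HOT updates would widen the gap further.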

For failed jobs and retry tracking, PgQue uses a separate table rather than putting retry state on the original job row. This keeps the hot path completely clean and moves the write-heavy retry bookkeeping to a colder path.

SKIP LOCKED and Why It Matters

The SELECT ... FOR UPDATE SKIP LOCKED clause, available since Postgres 9.5, is the foundation of every modern Postgres queue implementation worth using. Without it, workers competing for jobs would block each other. With it, a worker that tries to claim a row another worker has already locked simply skips that row and moves on to the next one. This lets many workers pull from the same table in parallel without serializing on each other.

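-- Run inside a transaction: the row lock is held until COMMIT or ROLLBACK.
-- Claims the oldest pending job; rows locked by other workers are skipped, not waited on.
SELECT id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;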
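The skipping behavior can be imitated in plain Python with non-blocking lock acquisition. This is an analogy only: the Row class and claim_skip_locked function are hypothetical, and a threading.Lock stands in for a Postgres row lock.

```python
import threading


class Row:
    def __init__(self, job_id: int) -> None:
        self.job_id = job_id
        self.lock = threading.Lock()  # stands in for a row-level lock


def claim_skip_locked(rows):
    """Return the first row whose lock can be taken without waiting, else None."""
    for row in rows:
        # blocking=False means "do not wait": a locked row is simply skipped.
        if row.lock.acquire(blocking=False):
            return row
    return None


queue = [Row(1), Row(2), Row(3)]
a = claim_skip_locked(queue)  # worker A locks job 1
b = claim_skip_locked(queue)  # worker B finds job 1 locked and takes job 2
print(a.job_id, b.job_id)     # 1 2

a.lock.release()              # worker A "rolls back", releasing its lock
c = claim_skip_locked(queue)  # job 1 is claimable again
print(c.job_id)               # 1
```

A plain FOR UPDATE would correspond to a blocking acquire: worker B would wait on job 1 instead of moving on to job 2.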

In PgQue’s append-only model, there is no status column to filter on. The presence of a row in the queue table means it is available. SKIP LOCKED handles the rest. This simplifies the query significantly and removes the index complexity that comes from filtering on a frequently changing status column. A partial index on a mutable column is another bloat source; removing that column removes the index problem too.

The Tradeoffs

Nothing is free. The append-only, delete-on-complete model means you lose the built-in audit trail that a soft-delete or status-column design gives you. With pg-boss or pgmq, you can query for all jobs that ran in the last hour, how many failed, and which ones are still running. With PgQue’s approach, completed jobs are gone. You need a separate event log or audit table if you want that history.

This is a genuine tradeoff, not a deficiency. Many queue workloads do not need the history. A background email sender does not need to query completed jobs. A webhook delivery system that writes delivery outcomes to a separate deliveries table does not need queue history either. If your use case fits, the simplicity of PgQue’s model is a net win.

The retry story also requires more thought. Traditional queues handle retry by updating a counter and rescheduling in the same row. PgQue’s separate retry table keeps this from polluting the hot path, but it adds a join for the rare case where you need to correlate a failed job with its original metadata.

When the Bloat Problem Actually Bites You

Small-scale queue usage rarely triggers this. If you are processing a few hundred jobs per minute, autovacuum keeps up fine and bloat never becomes visible. As a rough guide, the problem tends to surface somewhere in the range of 10,000 to 100,000 jobs per minute, depending on job duration, the number of concurrent workers, and how aggressively you have tuned settings such as autovacuum_vacuum_cost_delay.
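Part of why throughput matters is that autovacuum only visits a table once its dead tuples pass a trigger point, which the Postgres documentation gives as autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * number of tuples. The Python sketch below evaluates that formula with the stock defaults of 50 and 0.2. The function name is made up for this illustration, and your settings may differ.

```python
def autovacuum_trigger_point(reltuples: float,
                             threshold: int = 50,
                             scale_factor: float = 0.2) -> float:
    """Dead tuples needed before autovacuum considers vacuuming the table.

    Defaults mirror autovacuum_vacuum_threshold and
    autovacuum_vacuum_scale_factor in a stock configuration.
    """
    return threshold + scale_factor * reltuples


for live_rows in (1_000, 1_000_000):
    trigger = autovacuum_trigger_point(live_rows)
    print(f"{live_rows:,} rows -> vacuum after {trigger:,.0f} dead tuples")
```

A queue that usually holds few live rows crosses this point constantly, so autovacuum is triggered over and over. A queue with a large backlog can build up a large number of dead tuples before autovacuum starts at all.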

There are a few signals that indicate you are approaching the wall:

  • pg_stat_user_tables.n_dead_tup stays persistently high on the queue table even after autovacuum runs
  • pg_stat_user_tables.last_autovacuum is always recent and autovacuum_count keeps climbing, meaning autovacuum is running almost continuously
  • Query times on the queue table are creeping up even as the logical queue size stays constant
  • pg_relation_size() on the queue table is much larger than the live row count times the average row size would suggest
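Two of these signals reduce to simple ratios. The Python sketch below assumes you have already fetched n_live_tup and n_dead_tup from pg_stat_user_tables and the table size from pg_relation_size(). The function name, the sample numbers, and the average row size are all hypothetical.

```python
def bloat_signals(n_live_tup: int,
                  n_dead_tup: int,
                  relation_bytes: int,
                  avg_row_bytes: int) -> dict:
    """Turn raw statistics into the two ratios discussed above."""
    total = n_live_tup + n_dead_tup
    dead_fraction = n_dead_tup / total if total else 0.0
    # Size the table "should" be if it held only its live rows.
    expected_bytes = n_live_tup * avg_row_bytes
    size_ratio = relation_bytes / expected_bytes if expected_bytes else float("inf")
    return {"dead_fraction": dead_fraction, "size_ratio": size_ratio}


# Hypothetical snapshot of a struggling queue table.
signals = bloat_signals(n_live_tup=1_000,
                        n_dead_tup=9_000,
                        relation_bytes=8_000_000,
                        avg_row_bytes=200)
print(signals)  # {'dead_fraction': 0.9, 'size_ratio': 40.0}
```

What counts as too high depends on the workload. A single snapshot matters less than a ratio that keeps rising while the logical queue depth stays flat.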

At that point, VACUUM tuning is a band-aid. The underlying issue is the write pattern, and PgQue addresses that at the root.

Putting It in Context

Samokhvalov has done extensive work on Postgres performance, including the postgres.ai tooling around Database Lab and Joe (the Postgres query optimization bot). PgQue fits within that tradition of approaching Postgres scaling problems from first principles rather than through configuration knobs.

The timing is also interesting because Tembo’s pgmq has been gaining significant adoption and has driven a lot of conversation about the right way to build Postgres queues. PgQue represents a distinct philosophical position: optimize the common case (successful job processing) rather than making the full state machine cheaper. Most jobs succeed. Optimize for that.

For teams already running pg-boss or pgmq at scale and hitting bloat, PgQue is worth benchmarking in a realistic load test. The migration path is not trivial since the data model is genuinely different, but the bloat elimination is structural rather than parametric.

For teams starting fresh with a new queue system, the choice depends on what you need from your queue. If you need history, visibility into running jobs, complex retry policies with per-job configuration, and scheduling, the more established options are more complete. If you need raw throughput with a clean Postgres footprint, PgQue is solving the right problem in the right place.

The broader lesson is that Postgres as a queue is a valid pattern but not a free one. The MVCC accounting is real and the cost scales with your throughput. Any serious Postgres queue implementation has to have a story for bloat, and the best story is to not create the bloat in the first place.
