
What a Million-Line Monolith Actually Teaches You

Source: lobsters

The industry spent most of the 2010s treating monoliths as a problem to escape. Then the receipts started arriving. Amazon Prime Video published a case study in 2023 describing how they moved a distributed serverless microservices pipeline back toward a single-process architecture and cut costs by 90%. Stack Overflow’s infrastructure lead Nick Craver spent years documenting how the entire Stack Exchange network, handling over a billion page views per month, ran as a single .NET monolith on fewer than a dozen web servers. DHH wrote “The Majestic Monolith” in 2016 and then followed it up by having his team leave AWS entirely, saving millions per year while keeping their Rails codebase intact.

Against that backdrop, Isaac Lyman’s 113-lesson distillation of scaling a Rails monolith from a small codebase to one million lines reads less like a counterargument and more like a manual. It is a long piece worth reading in full. But the lessons, taken together, point at something specific: the monolith does not fail because of size. It fails because teams do not invest in the infrastructure, discipline, and tooling that large codebases require.

The Module Boundary Problem

The single most important lesson in any large codebase is the one that sounds bureaucratic until you have felt its absence: module boundaries need physical enforcement, not just social convention.

At twenty engineers and eighty thousand lines of code, you can get away with architectural rules that live in a README or a Notion document. At two hundred engineers and a million lines, those rules are violated constantly, not out of malice but because nobody can hold the full dependency graph in their head while under deadline pressure.

Shopify solved this with Packwerk, an open-source static analysis tool that enforces package privacy and dependency direction within a Rails monolith. A package declares which other packages it depends on and can mark its constants private; any reference that breaks those declarations surfaces as a CI failure. The key design choice is that enforcement happens at the CI level, not the code review level. A reviewer can miss a violation; a CI gate cannot.

Java teams have ArchUnit, which lets you write architectural rules as test code: “the service layer must not import from the controller layer,” verified on every build. The .NET world uses assembly boundaries as a natural enforcement mechanism. Go has compiler-enforced package visibility (unexported identifiers and internal packages), a compiler that rejects import cycles outright, and go mod graph for printing the module dependency graph. The specifics differ; the principle is the same across all of them.
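The "architectural rule as test code" idea is small enough to sketch. The following is a minimal Python illustration, not how ArchUnit or Packwerk work internally. It assumes a hypothetical layout with top-level packages named services and controllers; the rule table and the find_forbidden_imports function are invented for the example.

```python
from __future__ import annotations

import ast

# Hypothetical rule table: a layer may not import from the listed layers.
FORBIDDEN = {"services": {"controllers"}}


def find_forbidden_imports(layer: str, source: str) -> list[str]:
    """Return the forbidden modules imported by `source`, which lives in `layer`."""
    banned = FORBIDDEN.get(layer, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            # Compare the top-level package against the banned layers.
            if name.split(".")[0] in banned:
                violations.append(name)
    return violations


# Illustrative source text for a file inside the services layer.
SERVICE_CODE = """
import json
from controllers.orders import OrdersController
"""

violations = find_forbidden_imports("services", SERVICE_CODE)
print(violations)
```

Run over every file in a layer and asserted to be empty in the test suite, a check like this turns the README rule into a build failure, which is the property that matters.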

Circular dependencies are particularly destructive at scale because they are nearly impossible to remove once established. Two modules that depend on each other cannot be refactored independently. You cannot extract one to a service, cannot test one in isolation, cannot understand one without understanding the other. Enforcing a directed acyclic graph of module dependencies early, even imperfectly, costs almost nothing compared to untangling cycles after three years of feature development.
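Checking that the module graph stays acyclic is a depth-first search. Here is a sketch under stated assumptions: the DEPENDENCIES table is a made-up graph, and in practice it would be generated from whatever your boundary tool or import scanner reports.

```python
from __future__ import annotations

# Hypothetical module graph: each key depends on the modules in its list.
DEPENDENCIES = {
    "billing": ["accounts"],
    "accounts": ["notifications"],
    "notifications": ["billing"],  # closes a cycle
    "search": ["accounts"],
}


def find_cycle(graph: dict[str, list[str]]) -> list[str] | None:
    """Return one dependency cycle as a list of module names, or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path: list[str] = []

    def visit(node: str) -> list[str] | None:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we found a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


print(find_cycle(DEPENDENCIES))
```

A CI step that fails whenever find_cycle returns something other than None is the "imperfect but early" enforcement described above: it does not fix existing tangles, but it stops new ones from landing.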

Database Migrations Are Permanent

The other place where monolith teams consistently underinvest is migration discipline. At a ten-table database, a poorly written migration is an inconvenience. At five hundred tables with eight years of production data and a team that ships thirty times per week, it is a production incident.

The pattern that works is called Expand/Contract, sometimes called “parallel change.” Replacing a column requires three separate deploys, not one. First, add the new column and update writers to populate it alongside the old one (expand). Second, backfill historical data and update readers to use the new structure (migrate). Third, remove the old column only after all readers have been updated and consistency has been verified (contract). This requires more discipline per schema change, but the alternative is either long maintenance windows or corrupted data.
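The three phases are easier to see against a real schema. This sketch uses Python's built-in sqlite3 module with an in-memory database; the users table and the name to full_name rename are hypothetical, and each commented phase stands in for a separate production deploy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('Ada'), ('Grace')")

# Deploy 1 (expand): add the new column; writers now populate both columns.
conn.execute("ALTER TABLE users ADD COLUMN full_name TEXT")
conn.execute("INSERT INTO users (name, full_name) VALUES ('Linus', 'Linus')")

# Deploy 2 (migrate): backfill old rows, then switch readers to full_name.
conn.execute("UPDATE users SET full_name = name WHERE full_name IS NULL")
unmigrated = conn.execute(
    "SELECT COUNT(*) FROM users WHERE full_name IS NOT name"
).fetchone()[0]
assert unmigrated == 0, "verify consistency before contracting"

# Deploy 3 (contract): remove the old column. Rebuilding the table keeps the
# sketch portable to SQLite versions without ALTER TABLE ... DROP COLUMN.
conn.executescript(
    """
    CREATE TABLE users_new (id INTEGER PRIMARY KEY, full_name TEXT);
    INSERT INTO users_new (id, full_name) SELECT id, full_name FROM users;
    DROP TABLE users;
    ALTER TABLE users_new RENAME TO users;
    """
)

columns = [row[1] for row in conn.execute("PRAGMA table_info(users)")]
names = [row[0] for row in conn.execute("SELECT full_name FROM users ORDER BY id")]
print(columns, names)
```

The point of the consistency check between phases two and three is that the contract step is the only irreversible one. Everything before it can be rolled back by redeploying old code.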

Two specific ORM patterns deserve extra attention at scale. Polymorphic associations in ActiveRecord and similar ORMs are a convenience that becomes a data integrity and query planner problem: the database cannot enforce a foreign key on a column that sometimes contains a user_id and sometimes contains an order_id, and every join has to filter on the type column as well as the id. Without a composite index on both columns, the query plan for a join against a polymorphic association is effectively a sequential scan dressed up as a feature. The second pattern is N+1 queries, which are invisible in development, where your database has five records, and catastrophic in production, where it has five million. Tools like Bullet for Rails detect these automatically; running them in your test suite is straightforward and the payoff is substantial.
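The N+1 shape is the same in every ORM, so it can be shown without one. This sketch uses Python's sqlite3 module and its set_trace_callback hook to count the statements each approach issues; the authors and posts tables are hypothetical, and the counting is a crude stand-in for what a detector like Bullet reports, not a description of how Bullet works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
    INSERT INTO posts (author_id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c'), (1, 'd');
    """
)

statements = []
conn.set_trace_callback(statements.append)  # record every SQL statement run

# N+1: one query for the posts, then one more per post for its author.
posts = conn.execute("SELECT id, author_id FROM posts").fetchall()
for _post_id, author_id in posts:
    conn.execute("SELECT name FROM authors WHERE id = ?", (author_id,)).fetchone()
n_plus_one = len(statements)

statements.clear()

# Batched: one query for the posts, one for all referenced authors.
posts = conn.execute("SELECT id, author_id FROM posts").fetchall()
author_ids = sorted({author_id for _post_id, author_id in posts})
placeholders = ", ".join("?" for _ in author_ids)
conn.execute(
    f"SELECT id, name FROM authors WHERE id IN ({placeholders})", author_ids
).fetchall()
batched = len(statements)

print(n_plus_one, batched)
```

The first count grows with the number of posts while the second stays constant, which is why the problem is invisible with five rows and ruinous with five million.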

The Test Suite as Architecture Signal

One of the more useful observations in Lyman’s piece is that a slow test suite is an architectural smell, not a CI infrastructure problem. If your tests take forty minutes, developers stop running them locally. That collapse in the feedback loop changes how code gets written: conservatively, with less refactoring, because the cost of finding out you broke something is too high.

The standard advice is to maintain a test pyramid: many fast unit tests, fewer integration tests, even fewer end-to-end tests. But at a million lines of code, the pyramid shape matters less than the absolute time budget. A suite of fifty thousand unit tests that finishes in four minutes is better than a suite of ten thousand that takes twenty. Parallel test execution and careful management of database state (transactional cleanup is meaningfully faster than truncation at scale) are the main levers.
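Transactional cleanup means each test runs inside a transaction that is rolled back at the end, so nothing has to be deleted afterwards. Here is a minimal sketch with Python's sqlite3 module; the run_isolated helper and the orders table are hypothetical, and real test frameworks wrap this in fixtures.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode so the
# sketch controls transactions explicitly with BEGIN and ROLLBACK.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total INTEGER)")


def run_isolated(case) -> None:
    """Run one test case inside a transaction and roll back whatever it wrote."""
    conn.execute("BEGIN")
    try:
        case(conn)
    finally:
        conn.execute("ROLLBACK")


def creates_order_case(db: sqlite3.Connection) -> None:
    db.execute("INSERT INTO orders (total) VALUES (42)")
    # Inside the transaction the test sees its own write.
    assert db.execute("SELECT COUNT(*) FROM orders").fetchone()[0] == 1


run_isolated(creates_order_case)
remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(remaining)
```

A rollback discards only what the test touched, whereas truncation has to visit every table whether or not the test wrote to it. That difference grows with the table count.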

The harder problem is test quality. Line coverage is a nearly useless metric at this scale: 100% line coverage on a million-line codebase tells you that every line executed, not that your assertions catch real bugs. Mutation testing tools like mutant for Ruby or Stryker for JavaScript and TypeScript measure test suite quality by introducing small code changes and checking whether the tests detect them. The results are often humbling and almost always informative.
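The gap between coverage and quality fits in a dozen lines. This sketch hand-writes a single mutant rather than using a real mutation tool; is_adult, the two suites, and the mutant are all invented for illustration.

```python
def is_adult(age: int) -> bool:
    return age >= 18


def is_adult_mutant(age: int) -> bool:
    # The kind of change a mutation tool makes: >= becomes >
    return age > 18


def weak_suite(fn) -> bool:
    """Executes every line of fn, so line coverage is 100%."""
    return fn(30) is True and fn(5) is False


def strong_suite(fn) -> bool:
    """Adds the boundary case that the mutant gets wrong."""
    return weak_suite(fn) and fn(18) is True


# A suite "kills" a mutant when it fails against the mutated code.
results = {
    "weak kills mutant": not weak_suite(is_adult_mutant),
    "strong kills mutant": not strong_suite(is_adult_mutant),
}
print(results)
```

Both suites pass against the original function and both report full line coverage, but only the one with the boundary assertion notices the changed operator. A mutation score surfaces that difference and a coverage number hides it.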

Feature Flags as a Deployment Primitive

The most underrated technical investment for any monolith operating at scale is a feature flag system, not as a nice-to-have for gradual rollouts but as the primitive that decouples deployment from release.

The reason this matters structurally is the Expand/Contract migration pattern described above. The pattern requires that old and new code run simultaneously during the migration window. Feature flags are how you route traffic between them. Without flags, you are either doing big-bang releases, which are risky, or maintaining parallel code paths that only differ by a git branch, which becomes unmanageable quickly.

Flipper is the standard Rails option, with adapters for Redis, ActiveRecord, and in-memory storage. Unleash is a well-maintained open-source option for polyglot environments. Both give you the core primitive: a named flag, a set of targeting rules, and a storage backend. The more advanced features (experimentation, audit trails, percentage rollouts) can be added over time. Starting with the primitive is what matters.
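The primitive itself is small. What follows is a sketch of the three parts (named flag, targeting rules, storage backend) in Python. It is not Flipper's or Unleash's API; the Flag and InMemoryFlags names, the actor allow-list, and the hash-based percentage bucketing are all assumptions made for the example.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field


@dataclass
class Flag:
    name: str
    enabled_actors: set[str] = field(default_factory=set)  # explicit allow-list
    percentage: int = 0  # share of actors enabled, 0 to 100


class InMemoryFlags:
    """Hypothetical storage backend; a real one would wrap Redis or a table."""

    def __init__(self) -> None:
        self._flags: dict[str, Flag] = {}

    def save(self, flag: Flag) -> None:
        self._flags[flag.name] = flag

    def enabled(self, name: str, actor_id: str) -> bool:
        flag = self._flags.get(name)
        if flag is None:
            return False  # unknown flags default to off
        if actor_id in flag.enabled_actors:
            return True
        # Stable bucketing: the same actor always lands in the same bucket.
        digest = hashlib.sha256(f"{name}:{actor_id}".encode()).hexdigest()
        return int(digest, 16) % 100 < flag.percentage


flags = InMemoryFlags()
flags.save(Flag("new_checkout", enabled_actors={"user-1"}))

# Call sites branch on the flag, so both code paths ship in the same deploy.
for actor in ("user-1", "user-2"):
    path = "new" if flags.enabled("new_checkout", actor) else "old"
    print(actor, path)
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a typo in a flag name then fails closed, leaving users on the old path instead of exposing unfinished code.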

What Microservices Actually Solve

Martin Fowler’s MonolithFirst recommendation argued that microservices solve organizational scaling problems, not technical ones. That framing has aged well.

The teams that genuinely needed microservices first (Netflix, Google, Amazon at peak scale) had a specific constraint: thousands of engineers across dozens of teams, with genuinely independent deployment requirements. A deployment by one team should not require coordination with all other teams. Microservices enforce that independence structurally, at the cost of distributed systems complexity: network partitions, partial failures, distributed transactions, inter-service authentication, distributed tracing, and the operational overhead of running dozens of independently deployable units.

A modular monolith provides most of the organizational discipline of microservices with a fraction of the operational complexity. Each module owns its data, exposes a clean API to other modules, and enforces privacy on its internals. The refactoring cost when those boundaries need to change is low: no API versioning, no separate deployments, no distributed transaction coordination. If a module genuinely needs independent scaling or separate team ownership, you can extract it; but that should be a deliberate decision based on a real constraint, not a default architectural assumption.

Shopify runs a Rails monolith with over 2.8 million lines of Ruby, processes billions in GMV during Black Friday peaks, and has consistently chosen to fix the monolith rather than migrate to services. Their investment in Packwerk, in database sharding at the infrastructure layer rather than the application layer, and in continuous deployment tooling is what makes that scale possible.

The lesson from a million lines of code is that the monolith’s advantages compound with investment. Better tooling, enforced module boundaries, disciplined migrations, fast tests, and feature flags all make each other more effective. The microservices architecture’s disadvantages also compound: each additional service adds operational overhead, testing complexity, and latency budget. The tradeoff only favors microservices when the organizational problem is real and the team is prepared to pay the distributed systems tax in full.

Most teams are not at that point, and the honest read of both Lyman’s lessons and the broader industry evidence is that most teams never will be.
