· 8 min read ·

Five Years In, C++20 Modules Still Have a Build System Problem

Source: isocpp

C++20 modules were finalized in the standard in late 2020. The pitch was compelling: replace the #include preprocessor hack with a proper module system, dramatically improve compile times, eliminate header guard boilerplate, enforce better encapsulation, and finally make the One Definition Rule something compilers could actually help you with. Five years later, you still cannot write import boost; in a production codebase and expect it to work reliably.

Rubén Pérez Hidalgo, the author of Boost.MySQL, gave a talk about this journey on the state of C++20 modules from a library developer’s perspective. The talk is worth watching in full, but the broader story it tells is one the C++ community keeps circling: a genuinely good language feature held hostage by toolchain fragmentation and ecosystem inertia.

What Modules Actually Do

The core idea in C++20 modules is straightforward. Instead of textually including a header file every time you need its declarations, you declare a module interface unit that the compiler parses once and caches as a Binary Module Interface (BMI). Downstream translation units import that cached representation rather than re-parsing thousands of lines of template-heavy headers.

A minimal module interface looks like this:

export module mylib;

export int add(int a, int b) { return a + b; }

export class Connection {
public:
    void connect(std::string_view host);
};

Files that use it write import mylib; instead of #include "mylib.hpp". The module boundary is real: macros defined inside the module stay inside, non-exported symbols are genuinely hidden, and the compiler enforces the encapsulation rather than trusting that nobody reaches into namespace detail.

For large codebases, the compile-time gains can be substantial. Victor Zverovich’s experiments with the {fmt} library in Clang showed roughly a 10x speedup for translation units that imported a prebuilt module versus including the equivalent headers. The speedup compounds as more translation units import the same module: the BMI is parsed once, and every subsequent import is essentially free compared to header parsing.

Modules also allow splitting a large module across multiple files using module partitions, which is the mechanism a library like Boost would need to organize its many headers under a unified import boost; interface:

// boost_mysql.cppm - a partition
export module boost:mysql;
export class connection { ... };

// boost.cppm - primary interface
export module boost;
export import :mysql;
export import :asio;

The design is coherent. The execution is where things break down.

The BMI Is Not Portable

The most fundamental structural problem with C++20 modules is that the Binary Module Interface format is compiler-specific and not standardized. MSVC generates .ifc files, Clang generates .pcm files, GCC generates .gcm files. These are not interchangeable, and there is no guarantee of compatibility even between minor versions of the same compiler.

This matters enormously for library distribution. When you install a Boost package from your system package manager or vcpkg, you get compiled .a/.lib files and headers. Those headers work with any conforming compiler because they are text. A pre-built BMI would only work with the exact compiler, version, and potentially flags used to generate it. The implications for binary package distribution are severe enough that there is no consensus solution yet.

For now, libraries that want to support modules must ship module interface source files and require users to compile them as part of their own build. This is workable, but it means the module compilation cost is borne by every downstream project rather than amortized across the library’s release.

The Build System Problem Is Deeper Than CMake Support

Traditional C++ build systems assume translation units are independent. You hand make a list of .cpp files, and it can compile them in any order, in parallel, without coordination. Modules break this assumption at a fundamental level.

A module interface unit must be compiled before any translation unit that imports it. Determining which files import which modules requires scanning source files before compilation begins. This dependency scanning phase is new, not present in any existing build tooling, and requires active cooperation between the build system and the compiler.

CMake added experimental modules support in version 3.28, released in late 2023. The canonical way to declare module sources looks like this:

cmake_minimum_required(VERSION 3.28)
project(ModuleExample CXX)
set(CMAKE_CXX_STANDARD 20)

add_library(mylib)
target_sources(mylib
  PUBLIC FILE_SET CXX_MODULES FILES
    src/mylib.cppm
)

The support only works with the Ninja generator, because GNU Make cannot express the dynamic dependencies that module scanning produces. If your project uses Makefiles, you are stuck. Meson and Bazel support is even less mature.

Package managers have not caught up either. vcpkg and Conan, which underpin a significant fraction of C++ dependency management, do not have first-class support for consuming or distributing module interface units. A library that ships modules today is essentially asking its users to do extra work that the tooling should automate.

The Macro Wall

C++20 modules cannot export macros. This is a deliberate design choice: modules operate at the language level, and macros are a preprocessor feature that predates the type system entirely. A macro defined in a module interface unit is visible within that module, but import mylib; gives you none of them.

For libraries like Boost, this is a significant obstacle. BOOST_FOREACH, BOOST_STATIC_ASSERT, BOOST_PP_SEQ_FOR_EACH, and hundreds of others are macros that users call directly. A module interface cannot expose these.

The standard workaround is to ship both: a module interface unit for the parts of the library that map cleanly to language constructs, and a companion header for the macro API:

import boost.mysql;               // language-level API via module
#include <boost/mysql/macros.hpp> // macro API via traditional header

This works, but it puts a two-interface burden on library authors and makes the developer experience messier than the original #include <boost/mysql.hpp> approach it was supposed to replace.

The global module fragment partially addresses the inverse problem. When a module implementation needs to use a macro-heavy legacy header internally without leaking it to importers, you put the includes before the module declaration:

module;  // begin global module fragment
#include <boost/preprocessor.hpp>  // internal use only, not exported

export module mylib;  // module proper begins here
export void do_something();

This lets library authors consume their own macro-heavy dependencies without those macros bleeding through the module boundary to users. It is an adequate workaround for internal dependencies, but it does not help when the macros themselves are the public API.

How Other Languages Solved This

The comparison with other compiled languages is instructive. Rust’s crate system requires no separate interface declaration: the crate boundary is the unit of compilation, the compiler derives the public interface from visibility markers on declarations, and there is no binary artifact to distribute separately from the source or compiled library. Go’s package system similarly derives everything from the source, with no header-equivalent at all.

Java’s .class files contain enough metadata that the interface is derivable by any JVM, standardized enough to be portable across compiler versions, and the module system added in Java 9 layers on top of this existing portable bytecode foundation.

C++20 modules arrived at the problem differently because they had to. The language has 40 years of header-based code to remain compatible with, a template system that requires source-level information at instantiation time, and no standardized bytecode layer to anchor a portable module format. The BMI is necessarily compiler-specific because it encodes intermediate representations specific to how each compiler’s optimizer and template instantiation engine works.

This is not a failure of design ambition; it is a consequence of C++‘s constraints. But it does mean the migration path is longer and harder than it was for Rust, which could design the module system simultaneously with the language.

What import std; Looks Like Today

C++23 adds import std; as a way to import the entire standard library as a module. Compiler support as of mid-2026 is MSVC 19.35 and later, Clang 18 with partial support, and GCC 15 in experimental mode. Using it in CMake still requires opting into experimental flags with an unstable UUID that changes between CMake versions, which communicates something about how experimental the feature remains.

The irony is that import std; is where most of the compile-time benefit is concentrated. Standard library headers are parsed in every non-trivial translation unit. Moving them into a single BMI that the compiler loads once would eliminate a large fraction of the redundant parsing that makes large C++ codebases slow to build. It is precisely this high-value case that remains the least reliably available.

The State of Boost.MySQL

Pérez Hidalgo’s library, Boost.MySQL, is a modern async MySQL/MariaDB client using Asio coroutines. It is a good test case for modules because it is a relatively new Boost library without decades of preprocessor baggage, written in contemporary C++17/20 style. Even so, the talk makes clear that getting to import boost.mysql; requires navigating all of the above: the build system coordination, the macro compatibility question, the compiler divergences, and the distribution story for the resulting interface files.

The libraries best positioned to lead this transition are exactly the ones Boost.MySQL represents: new libraries without macro-heavy public APIs, written by developers who understand the module compilation model and can structure their code to work within it. Older, more macro-entangled parts of Boost face a harder path.

There is also a sequencing problem. Boost.MySQL depends on Boost.Asio and Boost.Beast among others. A complete import boost.mysql; experience requires those libraries to also provide module interfaces, or for the global module fragment workaround to be used extensively. The dependency graph of a major library does not modularize one library at a time.

Why It Will Eventually Work Out

None of the obstacles above are insurmountable, and several are improving on a visible trajectory. CMake’s module support will mature. Compiler implementations are converging. Package managers will add module-aware workflows once the underlying toolchain support stabilizes enough to build on.

The compile-time argument is strong enough to sustain the effort. A codebase that replaces heavy header inclusion with module imports can see dramatic build speedups, which translates directly to developer productivity and CI costs. For a library like Boost, which is already in the dependency tree of an enormous fraction of C++ projects, a modules interface would compound those savings across the ecosystem.

The path Pérez Hidalgo is walking, building out the module interface for Boost.MySQL, is exactly the kind of advance work that will eventually produce the patterns, the tooling guidance, and the lived experience the ecosystem needs. The question is not whether import boost; happens, but how long the tools take to become solid enough to recommend without caveats.

Was this interesting?