
C++26 Reflection Solves the Harder Half of the Python/C++ Bridge Problem

Source: isocpp

The standard framing of Python-vs-C++ in trading is about speed: Python for strategy development, C++ for sub-microsecond execution. That framing is correct but incomplete. The binding layer connecting them has a problem that is more persistent and more expensive than the performance gap, and it has nothing to do with latency.

When you write a C++ pricer and expose it to Python via pybind11, you produce two representations of the same API: the C++ class definition and the binding registration block. Every function, every parameter, every data member must be listed twice. When the C++ API changes, the binding must be updated separately. If you forget to register a newly added member, the build still succeeds. The gap only surfaces at runtime, possibly in production, when Python code fails to find the attribute. This is the binding maintenance problem, and tools like pybind11 and nanobind have never addressed it. They refined the syntax for writing bindings manually; they did not eliminate the need to write them manually.

Richard Hickling’s isocpp.org post frames C++26 reflection as dissolving the Python/C++ performance trade-off. The more precise description is that it makes the binding a derived artifact of the C++ source rather than a separately maintained document. The performance story was already solved, more or less. The maintenance story had not been.

Two decades of manual binding registration

Boost.Python appeared in 2002. Its model, adopted with minor syntax changes by pybind11 in 2015 and by nanobind in 2022, requires the developer to re-describe their API in a registration DSL. A typical pybind11 binding file looks like this:

PYBIND11_MODULE(pricer, m) {
    py::class_<BlackScholesPricer>(m, "BlackScholesPricer")
        .def(py::init<double, double, double, double, double>(),
             py::arg("spot"), py::arg("strike"),
             py::arg("rate"), py::arg("vol"), py::arg("expiry"))
        .def_readwrite("spot", &BlackScholesPricer::spot)
        .def("price",  &BlackScholesPricer::price)
        .def("delta",  &BlackScholesPricer::delta)
        .def("gamma",  &BlackScholesPricer::gamma)
        .def("vega",   &BlackScholesPricer::vega);
}

Every member listed here exists in the C++ header. None of this information is new; it is being transcribed. The compiler enforces that the types match, but it cannot enforce that the transcription is complete. A new method added to BlackScholesPricer simply will not appear on the Python side until someone manually adds a .def() line. Nanobind improved compile times and binary size (its own benchmarks report roughly 4x faster compilation and 5x smaller binaries than pybind11), but the fundamental authorship model is identical.
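The drift can be reproduced without pybind11 at all. The sketch below uses a hand-written name-to-method table as a stand-in for a registration block; MiniPricer, make_manual_bindings, and has_method are hypothetical names invented for illustration, not part of any binding library.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Hypothetical pricer; theta() was added after the table below was written.
struct MiniPricer {
    double spot = 100.0;
    double price() const { return spot * 0.1; }
    double delta() const { return 0.5; }
    double theta() const { return -0.02; }  // new method, never registered
};

using Method = std::function<double(const MiniPricer&)>;

// Stand-in for a manual registration block: each entry re-states the API.
inline std::map<std::string, Method> make_manual_bindings() {
    return {
        {"price", [](const MiniPricer& p) { return p.price(); }},
        {"delta", [](const MiniPricer& p) { return p.delta(); }},
        // "theta" is missing, and nothing at compile time says so.
    };
}

// Simulates the by-name lookup that happens on the scripting side.
inline bool has_method(const std::map<std::string, Method>& table,
                       const std::string& name) {
    return table.find(name) != table.end();
}
```

The code compiles cleanly even though theta() is unreachable by name, which is the failure mode described above: the omission is only observable when something asks for "theta" at runtime.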

SWIG and external codegen tools like Binder, built by Rosetta Commons for their protein modeling codebase, take a different approach: they parse C++ headers outside the compiler, walk the resulting AST, and emit binding code automatically. Binder reads the full Clang AST and handles default argument values that in-compiler reflection cannot access. The limitation is that these tools run as a separate build step, maintain their own parser, and require explicit configuration for template instantiations. The binding file is still generated rather than continuously derived, and the synchronization guarantee depends on the codegen step running correctly in every build.

Why C++ reflection could not do this before

C++ reflection has been proposed multiple times. The P0194 reflexpr proposal from 2016 encoded metadata as types: a list of members became tuple<type<A>, type<B>, type<C>>. Operating on that metadata required recursive template instantiation, which hit template depth limits on non-trivial class hierarchies and produced error messages spanning multiple screens. The compile cost grew quadratically with the number of members. P0194 was technically functional and practically unusable for binding generation at any real scale.

P2996, voted into the C++26 working draft at WG21’s Sofia meeting in June 2025, takes a fundamentally different approach. Reflected entities are values, not types. The reflection operator ^^ (spelled ^ in earlier revisions of the proposal) produces a std::meta::info scalar. A list of members is a std::vector<std::meta::info> that exists during constant evaluation. You can filter it with std::views::filter, pass it to ordinary consteval functions, and iterate it without recursive template instantiation:

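// Later revisions of P2996 require an access context on member queries.
auto public_members =
    std::meta::nonstatic_data_members_of(^^T, std::meta::access_context::unchecked())
    | std::views::filter(std::meta::is_public);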

This is the design decision that makes everything else tractable. When metadata is a value, standard algorithms apply without modification. Compile errors point at the offending line in the reflection logic, not at a wall of template instantiation noise. The std::meta namespace provides consteval query functions for names, types, member lists, parameter lists, and layout information, all returning std::meta::info values or constexpr standard containers of them. The [: r :] splice operator converts a std::meta::info back into a usable C++ entity, enabling member access and pointer-to-member formation from reflected descriptors.
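The value-based model can be imitated in C++17 to show why it is easier to work with. In this sketch, MemberInfo is a hypothetical stand-in for std::meta::info and the kMembers table is written by hand; with P2996 the compiler would supply the list. The point is only that filtering and counting become ordinary constexpr loops over values.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for std::meta::info: one plain value per member.
struct MemberInfo {
    std::string_view name;
    bool is_public;
};

// Hand-written here; with P2996 the compiler would produce this list.
inline constexpr std::array<MemberInfo, 4> kMembers{{
    {"spot", true}, {"strike", true}, {"cache_", false}, {"vol", true},
}};

// Counting is an ordinary loop evaluated at compile time.
constexpr std::size_t count_public() {
    std::size_t n = 0;
    for (const MemberInfo& m : kMembers) {
        if (m.is_public) ++n;
    }
    return n;
}

// Filtering walks values; no recursive template instantiation is involved.
constexpr auto public_names() {
    std::array<std::string_view, count_public()> out{};
    std::size_t i = 0;
    for (const MemberInfo& m : kMembers) {
        if (m.is_public) out[i++] = m.name;
    }
    return out;
}

static_assert(count_public() == 3);
static_assert(public_names()[2] == "vol");
```

A type-based encoding of the same list would need a recursive metafunction for each of these two operations; here a mistake inside the loop is reported as an error on the loop itself.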

What automated binding generation looks like

With P2996 and the companion P1306 expansion statements, a binding generator for an entire class reduces to a function template:

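template <typename T>
void bind_class(py::module_& m) {
    // Sketch against the P2996/P1306 drafts; query names have shifted between revisions.
    constexpr auto ctx = std::meta::access_context::unprivileged();  // public members only
    auto cls = py::class_<T>(m, std::meta::identifier_of(^^T).data());

    // define_static_array (P3491) makes the transient vector usable in template for.
    template for (constexpr auto field :
                  std::define_static_array(std::meta::nonstatic_data_members_of(^^T, ctx))) {
        if constexpr (std::meta::is_const(std::meta::type_of(field))) {
            cls.def_readonly(std::meta::identifier_of(field).data(), &[:field:]);
        } else {
            cls.def_readwrite(std::meta::identifier_of(field).data(), &[:field:]);
        }
    }

    template for (constexpr auto method :
                  std::define_static_array(std::meta::members_of(^^T, ctx))) {
        // Keep named non-static member functions; skip constructors, destructors, operators.
        if constexpr (std::meta::is_function(method)
                   && !std::meta::is_static_member(method)
                   && !std::meta::is_constructor(method)
                   && !std::meta::is_destructor(method)
                   && !std::meta::is_special_member_function(method)
                   && std::meta::has_identifier(method)) {
            cls.def(std::meta::identifier_of(method).data(), &[:method:]);
        }
    }
}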

The binding file for the entire pricer library becomes:

PYBIND11_MODULE(pricers, m) {
    bind_class<BlackScholesPricer>(m);
    bind_class<HestonModel>(m);
    bind_class<SABRModel>(m);
}

Adding a theta() method to BlackScholesPricer in C++ surfaces in Python on the next rebuild of the extension module. No binding file needs to be touched. A quant can call the new method without waiting for an engineer to update a registration block.

The template for construct is the P1306 expansion statement, which was adopted for C++26 alongside P2996. On a compiler that lacks it, iterating a constexpr range requires an index-sequence workaround. The result is equivalent either way; the workaround is more verbose but should generate the same bindings. The Bloomberg Clang fork implements both P2996 and P1306 together, and both are available on Compiler Explorer under the experimental reflection builds.
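The index-sequence workaround can be shown in plain C++17. In this sketch the member list is a hand-written tuple of descriptors rather than a reflected range; Quote, Field, kQuoteFields, for_each_field, and field_names are all hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

struct Quote {
    double bid = 0.0;
    double ask = 0.0;
    int size = 0;
};

// Hypothetical hand-written descriptor; reflection would generate these.
template <typename Class, typename Member>
struct Field {
    const char* name;
    Member Class::* ptr;
};

// Stand-in for the reflected member list of Quote.
inline constexpr auto kQuoteFields = std::make_tuple(
    Field<Quote, double>{"bid", &Quote::bid},
    Field<Quote, double>{"ask", &Quote::ask},
    Field<Quote, int>{"size", &Quote::size});

// The workaround: expand indices 0..N-1 and visit each element with a fold.
template <typename Tuple, typename Fn, std::size_t... I>
void for_each_field_impl(const Tuple& fields, Fn&& fn, std::index_sequence<I...>) {
    (fn(std::get<I>(fields)), ...);
}

template <typename Tuple, typename Fn>
void for_each_field(const Tuple& fields, Fn&& fn) {
    for_each_field_impl(fields, std::forward<Fn>(fn),
                        std::make_index_sequence<std::tuple_size_v<Tuple>>{});
}

// Example consumer: collect the names a binding generator would register.
inline std::vector<std::string> field_names() {
    std::vector<std::string> names;
    for_each_field(kQuoteFields, [&](const auto& f) { names.emplace_back(f.name); });
    return names;
}
```

A generator written against for_each_field could later swap its body for an expansion statement without changing its callers, since only the iteration mechanism differs.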

What still requires manual annotation

The automation ceiling for this approach is roughly 70-80% of a typical class API surface. Several categories still require explicit handling.

Overloaded functions are the most common case. Reflection can enumerate all overloads via members_of and retrieve their parameter types via parameters_of, but it cannot decide how to present them in Python. A class might have price(double spot) and price(double spot, double vol). Python has no native overloading. pybind11 can register both under one name and dispatch on argument types at call time, but the binding author may instead prefer two differently named Python methods or a single method with an optional argument. Reflection sees the type structure; the presentation is a design choice that belongs to a human.
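The choice can be made concrete in plain C++. The sketch below selects each overload with a cast to its exact member-function-pointer type, then shows one possible presentation, a single entry point with an optional argument. OverloadedPricer, price_optional, and the negative-value sentinel are hypothetical and chosen only for this example.

```cpp
#include <cassert>

struct OverloadedPricer {
    double vol = 0.2;
    // Two overloads sharing one C++ name.
    double price(double spot) const { return spot * vol; }
    double price(double spot, double vol_override) const { return spot * vol_override; }
};

// A bare &OverloadedPricer::price is ambiguous; the target type selects one overload.
using PriceSpot = double (OverloadedPricer::*)(double) const;
using PriceSpotVol = double (OverloadedPricer::*)(double, double) const;

inline constexpr PriceSpot kPrice =
    static_cast<PriceSpot>(&OverloadedPricer::price);
inline constexpr PriceSpotVol kPriceWithVol =
    static_cast<PriceSpotVol>(&OverloadedPricer::price);

// One presentation: a single entry point with an optional argument.
// In this toy example a negative override means "use the stored vol".
inline double price_optional(const OverloadedPricer& p, double spot,
                             double vol_override = -1.0) {
    return vol_override < 0.0 ? (p.*kPrice)(spot)
                              : (p.*kPriceWithVol)(spot, vol_override);
}
```

Exposing kPrice and kPriceWithVol under two separate Python names would be equally valid. Nothing in the types says which presentation the library's users expect, which is why a generator needs a human decision or a project-wide convention here.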

Default argument values are a direct limitation of P2996. The proposal explicitly does not expose default expression values, only whether a default exists. Binder handles this because it reads Clang’s full AST, which retains parsed expressions from earlier compilation phases. In-compiler reflection, operating inside the type system at constant-evaluation time, does not have access to those expression trees.

Return value policies require annotation for anything involving pointer or reference returns. Pybind11 defines seven return_value_policy values, including the automatic defaults. The compiler can see that a method returns const Config&; it cannot determine whether that reference is safe to expose to Python as a view or must be copied. P3394, which adds annotations that reflection can read, was adopted for C++26 and offers a clean way to attach such a policy to a declaration, though doing so means annotating the source.
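The ambiguity is visible in plain C++ before any binding library is involved. In the sketch below, the same const Config& return supports two behaviours that roughly correspond to a copy policy and a reference policy; Config, Engine, and both helper functions are hypothetical names used only for illustration.

```cpp
#include <cassert>

struct Config {
    double tolerance = 1e-6;
};

struct Engine {
    Config cfg;
    // The signature alone does not say how callers should hold the result.
    const Config& config() const { return cfg; }
};

// Caller takes a snapshot: later changes to the engine are not observed.
inline double tolerance_seen_by_copy() {
    Engine e;
    Config snapshot = e.config();
    e.cfg.tolerance = 1e-3;
    return snapshot.tolerance;
}

// Caller keeps a view: later changes are observed, and the view must not
// outlive the engine it points into.
inline double tolerance_seen_by_view() {
    Engine e;
    const Config& view = e.config();
    e.cfg.tolerance = 1e-3;
    return view.tolerance;
}
```

Both functions call the same method with the same reflected type, yet they return different values. Which behaviour Python should see, and whether the view's lifetime must be tied to its parent object, is information that lives outside the signature.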

GIL release semantics for CPU-intensive methods similarly require explicit annotation that cannot be derived from type information alone.

Prior art: D has been doing this for years

Compile-time binding generation is not a concept introduced by C++26. D’s autowrap library has been generating Python, Excel, and Jupyter bindings for D code automatically using D’s __traits introspection and compile-time function evaluation. It has been in production in trading environments for years. The C++26 version of the concept arrives in a language with considerably wider industry adoption, which changes the practical impact substantially.

Rust’s PyO3 takes a related approach via procedural macros: decorating Rust types with #[pyclass] and functions with #[pyfunction] generates binding code at compile time. The annotations are explicit rather than automatic, but the generated code is always synchronized with the source. Binding drift as a runtime failure mode does not exist in PyO3-managed code because the binding is compiled, not transcribed.

The C++26 path is closer to D’s autowrap than to PyO3: the goal is full automation with minimal annotation, where the binding generator reflects over the class structure without requiring source modification. Compared to PyO3’s approach, you get broader out-of-the-box coverage but sacrifice the explicitness that makes ownership and GIL semantics unambiguous. For a codebase with existing C++ code that you do not want to annotate, that trade-off tends to favor the reflection-based approach.

Implementation status and the path to production

The Bloomberg Clang fork has a substantially complete implementation of P2996, accessible on Compiler Explorer under “clang (experimental reflection)” with -freflection -std=c++2c. The EDG frontend, used for standards testing by the P2996 authors, has the most complete implementation. Upstream Clang has partial merges from the Bloomberg fork under experimental flags. GCC has a community-driven implementation underway. MSVC has no public implementation or timeline announced.

C++26 finalizes late 2026. Production-quality compiler support typically follows 12-24 months after standardization, putting realistic adoption around 2027-2028 for Clang-heavy Linux shops, later for MSVC-dependent environments.

For teams evaluating this today, the practical path is to experiment with the Bloomberg Clang fork, design the binding generator as a template library that also compiles under current compilers using the index-sequence workaround, and migrate to template for once upstream Clang support stabilizes. The binding generator’s API does not change between the two forms; only the iteration mechanism does.

The binding as a derived artifact

The binding maintenance problem persists because the binding file is authored separately from the source it describes. It is a second document that must be kept in sync with the first. When synchronization fails, the compiler does not notice. The test suite may not notice either if the dropped method had no test coverage.

What C++26 reflection enables is eliminating that second document. The binding becomes a function of the source: a constexpr computation over the reflected type graph, executed during normal compilation. If the source changes, the binding changes with it. If the binding becomes inconsistent with what pybind11 can accept, the failure is a compile error at the point of inconsistency, not a runtime exception in a strategy evaluation loop.

This is a structural change in how the Python/C++ bridge is maintained, not a change in how fast it runs. It does not make C++ execution faster and it does not make Python iteration faster in itself. It removes the synchronization overhead between two representations of the same API: a real cost in any environment where the C++ pricer layer evolves frequently and quant iteration speed is the actual constraint. The latency numbers on the C++ side were already good. The bottleneck was the engineer keeping two files in sync.
