
What the Hard Cases in C++26 Reflection Tell You About C++ API Design

Source: isocpp

The Python/C++ bridge in quantitative finance has always been a maintenance tax. Write a pricing model in C++, expose it to Python researchers via pybind11, and then update the binding file every time the C++ API changes. The update is mechanical, error-prone, and invisible to the compiler if you skip it. Richard Hickling’s article on isocpp.org frames C++26 reflection as the solution: instead of maintaining a hand-written binding file, you write one template and let the compiler derive the binding from the C++ type.

The promise is real. P2996, voted into the C++26 working draft at the June 2025 WG21 meeting in Sofia, gives C++ code first-class access to structural information about types: member names, access specifiers, method signatures, enumerator values. Combined with P1306’s expansion statements (template for), you can write a single registration template that works for any class and stays synchronized with its C++ definition:

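template <typename T>
void bind_type(py::module_& m) {
    // Sketch against the adopted P2996/P1306 wording; exact spellings have
    // shifted between paper revisions and compiler snapshots.
    constexpr auto ctx = std::meta::access_context::current();
    auto cls = py::class_<T>(m, std::meta::identifier_of(^^T).data());

    // Expose every public data member under its C++ name.
    template for (constexpr auto field : std::define_static_array(
                      std::meta::nonstatic_data_members_of(^^T, ctx))) {
        if constexpr (std::meta::is_public(field)) {
            cls.def_readwrite(
                std::meta::identifier_of(field).data(),
                &[:field:]);
        }
    }

    // Expose every public, named, non-static member function.
    template for (constexpr auto method : std::define_static_array(
                      std::meta::members_of(^^T, ctx))) {
        if constexpr (std::meta::is_public(method)
                   && std::meta::is_function(method)
                   && !std::meta::is_static_member(method)
                   && !std::meta::is_constructor(method)
                   && !std::meta::is_special_member_function(method)
                   && std::meta::has_identifier(method)) {
            cls.def(
                std::meta::identifier_of(method).data(),
                &[:method:]);
        }
    }
}

The ^^T operator reflects a type into a std::meta::info value (early revisions of P2996 spelled it ^T). nonstatic_data_members_of returns a vector of such values, one per field, filtered by the access context passed to it; define_static_array turns that vector into a static array that template for can iterate. identifier_of extracts the member name as a string_view, and the has_identifier check skips operators and conversion functions, which have no plain name. The &[:field:] splice converts a reflected member back into a pointer-to-member. The result: add a method to your pricer in C++, and Python sees it on the next build without any binding update.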

Practical experiments with the Bloomberg-maintained Clang fork that implements P2996 — already testable on Compiler Explorer — put the automation coverage for a typical class at roughly 70 to 80 percent. The remaining fraction requires manual annotation or explicit registration.

What is worth examining is not just where the ceiling is, but why it exists at those specific places. The hard cases for reflection-based binding generation are not arbitrary. They map to genuine design ambiguities in C++ APIs, and several of them are worth resolving in the C++ code itself, independent of any reflection concerns.

Overloaded Methods Are an API Design Question

The most common failure mode for automatic binding generation is method overloading. When a class exposes several overloads under one name, taking the member's address by name, as in &T::compute, is ambiguous, which is why hand-written pybind11 bindings reach for py::overload_cast<ArgTypes...>. A reflected overload splices to exactly one function, so the generator can always produce a pointer, but it still has to know which overloads Python should see, and whether they should share one Python name with runtime dispatch or appear under separate names.

The reflection system cannot make this decision because it is a Python API design decision, not a structural one. Should compute(SingleScenario const&) and compute(ScenarioGrid const&) appear as one compute in Python, or as compute_single and compute_grid? That depends on how Python callers will actually use the API, which the type system does not encode.

The practical resolution, in most trading library contexts, is that overloaded methods in Python-facing APIs are a design smell regardless of reflection. Python’s duck typing makes overloading by argument type less natural than in C++. A method that computes risk differently depending on whether you pass a single scenario or a grid is better served by two distinct names that communicate intent:

// Before: overloaded, requires manual binding
double compute(SingleScenario const& s);
double compute(ScenarioGrid const& grid);

// After: distinct names, automation works
double compute_single(SingleScenario const& s);
double compute_grid(ScenarioGrid const& grid);

Alternatively, if the intent is truly polymorphic behavior, a std::variant<SingleScenario, ScenarioGrid> input type eliminates the overload entirely and exposes a cleaner single-method interface to both C++ and Python callers. The reflection limitation surfaces an API decision that was already worth making.
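A minimal sketch of the std::variant option, in standard C++17 with no binding code involved. SingleScenario and ScenarioGrid are given toy fields here, and RiskEngine, ScenarioInput, and risk_of are hypothetical names introduced only for illustration; the arithmetic is a placeholder for real pricing logic.

```cpp
#include <variant>
#include <vector>

// Hypothetical stand-ins for the scenario types named in the text.
struct SingleScenario { double shock; };
struct ScenarioGrid   { std::vector<double> shocks; };

// The single input type that replaces the overload set.
using ScenarioInput = std::variant<SingleScenario, ScenarioGrid>;

// Toy risk figures; these free functions stay overloaded internally,
// but they are not part of the class surface a binder would walk.
inline double risk_of(SingleScenario const& s) { return s.shock * 100.0; }
inline double risk_of(ScenarioGrid const& g) {
    double total = 0.0;
    for (double shock : g.shocks) total += shock * 100.0;
    return total;
}

class RiskEngine {
public:
    // One public member function named compute: nothing to disambiguate.
    double compute(ScenarioInput const& input) const {
        return std::visit([](auto const& s) { return risk_of(s); }, input);
    }
};
```

The overloads still exist, but as an implementation detail behind std::visit. The bound class exposes exactly one compute, so a generated binding has a single member to register under that name.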

Return Value Policies Reveal Ownership Ambiguity

The second hard case is more fundamental. pybind11 asks you to choose a return value policy whenever its default guess is wrong: seven options covering how Python should treat returned values, pointers, and references, from copy to reference_internal to take_ownership. The right policy depends on the lifetime relationship between the returned object and the object that owns it. A method returning const Config* might return a pointer to an internally managed singleton, a pointer whose lifetime is tied to the bound object, or a freshly allocated object the caller is expected to delete. Reflection sees the return type; the ownership contract lives in documentation.

This is precisely where C++ API design can resolve the ambiguity structurally. The seven pybind11 policies exist because raw pointer and reference returns do not encode ownership in the type system. Smart pointers do:

// Ambiguous to reflection, requires policy annotation
const Config* get_config() const;

// Ownership is in the type; reflection handles it naturally
std::shared_ptr<const Config> get_config() const;

// Clearly caller-owned; deep copy, no lifetime concern
Config get_config() const;

std::shared_ptr<T> return types are handled correctly by pybind11 without any policy annotation, provided the class is registered with std::shared_ptr<T> as its holder type, because the shared ownership semantics are explicit. A const T& return is copied under the default automatic policy; exposing it as a reference whose lifetime is tied to the bound object means requesting reference_internal explicitly, which pybind11 applies by default only to property getters. Value returns are moved or copied into an object Python owns. The cases that require explicit policy are almost always cases where raw pointer returns are carrying implicit ownership contracts that smart pointers or value semantics would make explicit.
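The lifetime difference can be seen in plain C++ before any binding layer is involved. The following is a sketch under assumptions: Pricer, its curve_id field, and the accessor names shared_config and config_copy are hypothetical, chosen only to show the two signatures from the listing above side by side.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical configuration type for illustration.
struct Config { std::string curve_id; };

class Pricer {
public:
    explicit Pricer(std::string curve_id)
        : config_(std::make_shared<Config>(Config{std::move(curve_id)})) {}

    // Shared ownership is stated in the signature: the caller may keep
    // the result alive after this Pricer has been destroyed.
    std::shared_ptr<const Config> shared_config() const { return config_; }

    // Value return: the caller receives an independent copy.
    Config config_copy() const { return *config_; }

private:
    std::shared_ptr<const Config> config_;
};
```

With either signature, a caller holding the result after the Pricer is gone is still safe, which is the property a binding layer needs and which a const Config* return cannot promise on its own.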

For trading infrastructure with existing ABI constraints, the redesign may not be immediately feasible. The forward path in that case is P3394, Annotations for Reflection, which was adopted for C++26 alongside P2996. It attaches constant values to declarations with an [[=expr]] syntax and lets reflection code read them back at compile time, which would allow something like:

[[=py_policy::reference_internal]]
const Config* get_config() const;

Here py_policy is a hypothetical user-defined enum, not part of pybind11 or the standard. Until your toolchain supports annotations, the practical approach is to document the intended policy as a comment in the exact form of the future annotation, so the conversion is mechanical when the feature arrives.

Default Arguments and the Constructor Design Trade-off

C++26 reflection does not expose default argument values. std::meta::has_default_argument(param), part of the function parameter reflection added by P3096, reports whether a default exists, but not what it is. A pricer constructor with twelve parameters, half defaulted to standard market conventions, cannot be reproduced fully by an automatic binder. Callers who rely on the Python binding seeing py::arg("vol") = 0.2 will need the binding written manually, or will lose that convenience.

Binder, the external Clang-AST-based generator used by Rosetta Commons for binding generation at scale, handles this because it reads default expressions directly from Clang’s parsed AST, where they are available as subexpressions. In-compiler reflection does not see them; the C++ type system deliberately does not model default expressions as inspectable values.

The design fix here is different. A constructor taking twelve parameters, half of them defaulted, is a design pattern that causes friction beyond binding generation: it is hard to read at call sites, hard to extend without breaking callers, and hard to document. A named parameter struct resolves the problem:

// Hard to bind, hard to call
BlackScholesPricer(double spot, double strike, double rate = 0.05,
                   double vol = 0.20, double expiry = 1.0,
                   double dividend = 0.0, /* ... */);

// Cleaner for C++ and Python both
struct BlackScholesParams {
    double spot;       double strike;
    double rate    = 0.05;  double vol  = 0.20;
    double expiry  = 1.0;   double dividend = 0.0;
};
BlackScholesPricer(BlackScholesParams const& p);

The BlackScholesParams struct is a plain aggregate: reflection handles it completely, and all fields appear in Python with their C++ names. If the binding also exposes a default constructor, a value-initialized BlackScholesParams already carries the C++ defaults, so Python callers only set the fields they want to change, by name. The default values live in the struct definition and remain visible in C++ documentation and IDE tooling, even though reflection cannot turn them into Python argument defaults directly. The constructor becomes a single, unambiguous entry point.
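How the defaults travel with the struct can be sketched in standard C++. The struct below repeats the article's fields; make_params and the forward() member are hypothetical additions for illustration, and the forward-price formula stands in for real pricing code.

```cpp
#include <cmath>

struct BlackScholesParams {
    double spot;
    double strike;
    double rate     = 0.05;
    double vol      = 0.20;
    double expiry   = 1.0;
    double dividend = 0.0;
};

class BlackScholesPricer {
public:
    explicit BlackScholesPricer(BlackScholesParams const& p) : p_(p) {}

    // Forward price under continuous compounding: S * exp((r - q) * T).
    double forward() const {
        return p_.spot * std::exp((p_.rate - p_.dividend) * p_.expiry);
    }

private:
    BlackScholesParams p_;
};

// Hypothetical helper: value-initialize to pick up the member defaults,
// then fill in the two fields that have none.
inline BlackScholesParams make_params(double spot, double strike) {
    BlackScholesParams p{};
    p.spot = spot;
    p.strike = strike;
    return p;
}
```

A caller writes make_params(100.0, 105.0), overrides p.vol or p.rate by name where needed, and hands the struct to the pricer. A Python caller going through a generated binding would follow the same shape: construct the params object, assign the fields that differ, and pass it on.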

The Pattern Behind All Three Cases

Overloads, raw pointer returns, and deep default argument lists have something in common: they are places where a C++ interface expresses intent through convention rather than through the type system. An overloaded compute assumes callers know which variant they need. A const Config* return assumes the caller knows whether they own it. A twelve-parameter constructor assumes the caller has read the documentation on which parameters are optional.

Reflection-based binding generation cannot automate these cases because the information needed to handle them correctly is not in the type system. It is in the programmer’s mental model of the interface.

Designing C++ APIs with reflection in mind means moving that implicit information into explicit structure: distinct method names instead of overloads, smart pointer or value returns instead of raw pointer returns, parameter structs instead of long optional argument lists. These changes serve reflection, but they also produce C++ interfaces that are easier to reason about, easier to test, and easier to extend. The 20 to 30 percent that reflection cannot automate is a diagnostic for C++ interface debt, not a limitation of the reflection proposal itself.

Where Things Stand for Practical Use

The experimental bloomberg/clang-p2996 fork implements the core of P2996 and is accessible today on Compiler Explorer. P1306 expansion statements are implemented experimentally in the same branch. Neither is production-ready, and C++26 will not finalize until late 2026. Compiler adoption for complex features like this typically trails standardization by one to two years, putting production deployment on a 2027 to 2028 horizon depending on your compiler.

The productive use of that time is not waiting. It is reviewing Python-facing C++ APIs for the three patterns above: overloads that should be named distinctly, pointer returns that should encode ownership in their types, and multi-default constructors that should become parameter structs. Each change makes the future automation work better and makes the current C++ interface cleaner. None of it requires C++26 to be deployed first.

When compilers do ship full P2996 support, the teams that have been designing this way will find that their binding layer is mostly already automated by construction.
