C++20 shipped co_yield as a language keyword but deliberately omitted std::generator from the standard library. If you wanted lazy sequences in the years between C++20 and C++23, you either pulled in a third-party library or wrote the promise_type machinery yourself. The isocpp deep-dive by Quasar Chunawala covers the implementation mechanics clearly, but the more interesting question is why the committee had co_yield available and still did not ship a standard generator. The answer is not a scheduling gap.
What You Write Before co_yield Does Anything
A generator is the simplest coroutine use case: a function that produces a sequence of values lazily, driven by the caller. Python and JavaScript both have built-in syntax for this. In C++20, you have the keyword but none of the supporting machinery, and the language requires you to supply a promise_type before any of it works.
The minimum for a typed generator looks like this:
#include <coroutine>
#include <exception>
#include <utility>

template <typename T>
class Generator {
public:
    struct promise_type {
        T current_value;
        std::exception_ptr exception;

        Generator get_return_object() {
            return Generator{Handle::from_promise(*this)};
        }
        std::suspend_always initial_suspend() noexcept { return {}; }
        std::suspend_always final_suspend() noexcept { return {}; }
        std::suspend_always yield_value(T value) noexcept {
            current_value = std::move(value);
            return {};
        }
        void return_void() noexcept {}
        void unhandled_exception() { exception = std::current_exception(); }
    };
    using Handle = std::coroutine_handle<promise_type>;
    // iterator, begin(), end(), RAII destructor follow...
};
Each method is present for a specific reason, and omitting any one of them produces a compile error with minimal guidance about what is missing.
initial_suspend returning std::suspend_always makes the generator lazy: the body does not execute until the caller explicitly asks for the first value by calling begin(). Returning std::suspend_never instead would run the coroutine body to the first co_yield at construction time, which produces surprising behavior for infinite generators and makes initialization order hard to reason about.
final_suspend returning std::suspend_always keeps the coroutine frame alive after the body finishes. The frame must outlive the body completion because the caller needs to observe handle.done() to detect exhaustion. If final_suspend returned std::suspend_never, the frame would self-destruct on completion, and any subsequent access through the handle would be undefined behavior.
yield_value is the bridge between co_yield expr and the caller. The compiler transforms co_yield value into co_await promise.yield_value(value). The promise stores value in current_value, returns std::suspend_always, and the coroutine suspends. The caller reads the stored value through the handle. co_yield is syntactic sugar for an await call on the promise’s yield_value method, nothing more.
Add a move-only RAII wrapper, an iterator type with operator++ that calls handle.resume() and rethrows any stored exception, and a begin() function that performs the first resume. That brings you to roughly forty lines before user code can write a single useful co_yield. This is the protocol the language requires you to supply because C++ deliberately left all policy decisions to the library layer.
The Stack Problem That Stopped Standardization
The flat sequence case works, and for most generators it is all you need. The recursive case exposes what held up standardization.
Consider in-order tree traversal:
Generator<int> inorder(Node* n) {
    if (!n) co_return;
    for (int v : inorder(n->left))
        co_yield v;
    co_yield n->value;
    for (int v : inorder(n->right))
        co_yield v;
}
This compiles, and for shallow trees it runs correctly, but for trees of any meaningful depth it overflows the stack. The reason requires following the call chain precisely.
Each iteration of for (int v : inorder(n->left)) calls operator++ on the inner generator’s iterator, which calls handle.resume() on the inner coroutine. resume() is a regular function call; it occupies a real stack frame for as long as the resumed coroutine runs. When that inner coroutine iterates over its own child generator, it calls resume() again from inside the body that the first resume() is still executing. Only when the deepest coroutine suspends at its co_yield does each resume() return in turn, letting each enclosing level re-yield the value. At the moment the deepest node yields, one operator++ frame and one resume() frame sit on the real call stack for every level of the tree: O(N) frames for a tree of depth N, independent of the coroutine frame allocations. For trees of depth in the thousands, this overflows the stack.
Fixing this requires symmetric transfer. When await_suspend returns a coroutine_handle rather than void, the runtime jumps directly to the target coroutine without allocating a stack frame. This is a tail call at the coroutine level, reducing stack depth from O(N) to O(1) for arbitrarily deep recursion. Lewis Baker documented the symmetric transfer mechanism in detail in 2020, and it became the key primitive for making recursive generators viable in library code.
Building symmetric transfer correctly into a standard generator required settling questions about the allocator story, ranges integration, and how delegation interacts with exception propagation. The committee had the primitive available, but the complete settled design took until C++23.
What C++23 Added
std::generator<T> from <generator> resolves the recursive case through a dedicated mechanism: std::ranges::elements_of. Instead of looping over a sub-generator and re-yielding each element, you write:
std::generator<int> inorder(Node* n) {
    if (!n) co_return;
    co_yield std::ranges::elements_of(inorder(n->left));
    co_yield n->value;
    co_yield std::ranges::elements_of(inorder(n->right));
}
std::ranges::elements_of is a thin wrapper that tags a range for delegation. When the generator’s yield_value receives one, the promise recognizes it as a delegation request and performs symmetric transfer to the inner generator rather than suspending and returning to the outer caller. Stack depth stays bounded regardless of recursion depth. This is a deliberately constructed path through the promise machinery that plain co_yield cannot reach, because co_yield always yields to the immediate resumer and has no mechanism to transfer control transitively.
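In pseudocode, the dispatch rides on overload resolution inside the promise (a deliberately simplified sketch; real implementations additionally track a stack of nested generators so the innermost one yields straight to the consumer):

```cpp
// Pseudocode sketch, not actual library source.
struct promise_type {
    // Plain co_yield: store the value, suspend back to the consumer.
    std::suspend_always yield_value(T value);

    // co_yield std::ranges::elements_of(gen): overload resolution picks
    // this; the returned awaiter's await_suspend hands the runtime the
    // inner generator's handle, triggering symmetric transfer into it.
    template <class R>
    /* awaiter */ auto yield_value(std::ranges::elements_of<R> inner);
};
```

Because the delegation path is a distinct overload, generators that never delegate pay nothing for the mechanism.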
std::generator also carries the full std::ranges::input_range concept, which means it composes with standard range adaptors without a wrapper:
#include <generator>
#include <ranges>

std::generator<int> fibonacci() {
    int a = 0, b = 1;
    while (true) {
        co_yield a;
        auto next = a + b;
        a = b;
        b = next;
    }
}

for (int n : fibonacci() | std::views::drop(5) | std::views::take(10)) {
    // elements 5 through 14 of the Fibonacci sequence, evaluated lazily
}
The generator holds its state in local variables across yields. No explicit state struct is needed.
The full template std::generator<Ref, Val, Alloc> also supports custom allocators for the coroutine frame via std::allocator_arg_t in the promise constructor, which matters for high-frequency generators in performance-sensitive paths where heap allocation is undesirable.
How Other Languages Addressed Delegation
Python addressed recursive delegation in version 3.3 with PEP 380 and the yield from syntax:
def inorder(node):
    if node is None:
        return
    yield from inorder(node.left)
    yield node.value
    yield from inorder(node.right)
yield from handles delegation at the runtime level and also threads .send(value) and .throw(exc) through to the sub-generator, so two-way communication works across delegation boundaries transparently. Python and JavaScript (yield*) could add syntax because they control the entire runtime. The delegation semantics are baked into the language, not the library.
C++ operates under stricter constraints. The coroutine machinery must require no runtime support, the protocol must remain extensible to arbitrary coroutine return types, and paths that do not use delegation must carry zero overhead. elements_of satisfies all three: it is a library type, it opts in through the promise’s yield_value overload, and it only activates symmetric transfer when the delegation path is explicitly taken.
Rust’s situation is instructive from a different angle. Synchronous generators in Rust are still stabilizing (the gen keyword was in nightly through 2025), partly because Rust’s borrow-checker constraints create distinct design challenges for coroutine frames. The guarantee that makes Rust’s async futures safe, that borrowed data cannot outlive the future holding the borrow, requires Pin<&mut Self> semantics that add API complexity. C++ sidesteps this through move-only convention without enforcement, which avoids the pinning complexity while creating the dangling-reference problem that the committee has separately documented as a structural limitation.
Practical Guidance
On C++23 or later, std::generator is the right choice for lazy sequences. The implementation handles recursive delegation correctly, participates in the ranges concept hierarchy, and composes with standard adaptors without wrappers.
On C++20, the implementation above is serviceable for flat generators, which are the common case. Recursive delegation through manual loops is possible but carries O(depth) real stack growth. For non-recursive producers (file lines, pagination results, token streams), a C++20 generator matches a hand-written iterator in throughput once the compiler elides the initial frame allocation. GCC and Clang both perform this elision reliably at -O2 when the generator's lifetime is provably contained within the caller's scope, an optimization commonly known as HALO (Heap Allocation eLision Optimization) that the standard permits but does not require.
The three years between C++20 and std::generator in C++23 reflected real design work. The recursive delegation problem was genuine, the symmetric transfer solution required careful integration, and the committee was right to wait for a design they could standardize with confidence rather than ship something that worked for flat sequences but was unsafe for tree traversal. Understanding the problem elements_of solves makes the rest of the C++23 generator design legible.