The Promise Protocol: What Every Method in C++20 Coroutines Is Actually Doing
Source: isocpp
Quasar Chunawala published a thorough walkthrough of C++20 coroutines in February 2026, covering the mechanics you need to get a coroutine compiling. It is a good starting point, and if you have not read it, the setup there pairs well with what follows here. This post goes a layer deeper: specifically into why promise_type looks the way it does, what each of its methods controls in the underlying state machine, and where the design got genuinely clever.
The Compiler Rewrites Your Function
A function becomes a coroutine when its body contains co_await, co_yield, or co_return. The compiler does not just add suspension points; it rewrites the function entirely into a state machine backed by a heap-allocated frame. That frame holds the promise object, copies of the parameters, all locals that live across a suspension point, and, in the common implementations, two function pointers: one to resume the coroutine at the right label, one to clean it up.
The heap allocation is real overhead, though the standard permits the compiler to elide it (the optimization is called HALO and is described in P0981R0) when it can prove the coroutine’s lifetime is bounded by the caller’s scope. In optimized builds this typically fires for generators used immediately in range-for loops, but it is an optimization, not a guarantee. For networked async tasks where the handle escapes to a scheduler queue, you are paying for an allocation on every task creation.
The promise_type Is the Control Plane
Every coroutine return type R must expose a nested R::promise_type. This type is not incidental boilerplate; it is the interface through which you configure every behavioral decision the coroutine makes:
struct promise_type {
    ReturnType get_return_object();               // called first, before the body runs
    std::suspend_always initial_suspend();        // eager vs. lazy start
    std::suspend_always final_suspend() noexcept; // whether the frame survives completion
    void return_value(T value);                   // handles co_return <expr>
    void unhandled_exception();                   // handles exceptions escaping the body
    std::suspend_always yield_value(T val);       // handles co_yield <expr>
};
Each method maps to a specific moment. get_return_object runs before initial_suspend, which means the return object is constructed before the coroutine body executes at all; this lets you return a handle to a not-yet-started coroutine, which is the pattern that makes lazy tasks work. initial_suspend returning suspend_always gives you a lazy coroutine; suspend_never gives you an eager one. The difference matters for task composition: a lazy task does not start until someone explicitly awaits it, which is usually what you want.
final_suspend is where people get burned most often. If it returns suspend_never, the coroutine frame is destroyed immediately when the body finishes, before the caller can read any result stored in the promise. The safe default is suspend_always, letting the owning RAII wrapper call handle.destroy() at the right time.
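unhandled_exception must be declared, or the coroutine will not compile; the trap is giving it an empty body, in which case exceptions escaping the coroutine body silently vanish. The standard pattern is exception_ptr_ = std::current_exception(), followed by a rethrow at the co_await site on the consumer side, usually from await_resume.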
How co_await Lowers to Three Method Calls
co_await expr expands into a sequence that consults an awaiter object. The compiler first checks for promise_type::await_transform(expr) to allow the promise to intercept or transform expressions before they are awaited. Then it resolves the awaiter itself, either through operator co_await or by using the expression directly if it already satisfies the awaiter concept.
The awaiter interface has three methods:
bool await_ready(); // if true, skip suspension entirely
void await_suspend(coroutine_handle<promise_type> h); // called on suspension
T await_resume(); // called on resume, value becomes result of co_await
await_ready exists as a short-circuit for values already available synchronously. await_suspend receives the handle of the suspending coroutine, which the awaiter can store, pass to a scheduler, or resume immediately. await_resume is where the value materializes; its return type becomes the type of the co_await expression.
co_yield expr is syntactic sugar for co_await promise.yield_value(expr). The typical generator implementation stores the value in the promise and returns suspend_always from yield_value, suspending until the caller calls handle.resume() again.
Symmetric Transfer and the Stack Depth Problem
The most interesting design decision in the C++20 coroutine spec is symmetric transfer, introduced by P0913. Without it, resuming a continuation from await_suspend looks like this:
void await_suspend(std::coroutine_handle<promise_type> finished) noexcept {
    finished.promise().continuation.resume(); // nested call, grows the stack
}
In a chain of N tasks each co_awaiting the next, naive resumption nests N frames on the call stack. For production async systems with long chains, or loops that await synchronously completing tasks, this can overflow the stack.
Symmetric transfer solves this by allowing await_suspend to return a coroutine_handle<>. The compiler generates a tail call to that handle’s resume function, keeping stack depth constant regardless of chain length:
std::coroutine_handle<> await_suspend(
        std::coroutine_handle<promise_type> dying) noexcept {
    auto cont = dying.promise().continuation;
    return cont ? cont : std::noop_coroutine();
}
The std::noop_coroutine() sentinel handles the case where there is no continuation; resuming it does nothing, so control returns to whoever called resume(), typically the executor. This is the pattern used by cppcoro’s task<T>, and it is common in other Task<T> implementations.
What the Standard Does and Does Not Include
C++20 ships the machinery but none of the components you would actually use: no task<T>, no generator<T>, no executor. C++23 added std::generator<T> for synchronous lazy sequences. Async tasks remain library territory.
The practical landscape has a few established options. Boost.Asio (asio::awaitable<T>) is the most battle-tested; it integrates with Asio’s I/O context and is widely deployed in networked applications. Folly’s coroutine library (folly::coro::Task<T>) is production-grade with built-in cancellation via CancellationToken and structured concurrency through AsyncScope. For anything green-field, these two cover most use cases.
The longer-term direction is P2300 (std::execution), the Sender/Receiver proposal adopted into the C++26 working draft. Coroutines sit on top as syntactic sugar over the Sender model; unifex::task<T> in libunifex shows what that integration looks like. The separation of coroutine machinery from execution policy is one of the cleaner architectural decisions in the C++20 design.
A Minimal Generator to Make It Concrete
#include <coroutine>
#include <exception>  // std::current_exception, std::rethrow_exception
#include <optional>
#include <utility>    // std::exchange, std::move

template<typename T>
struct Generator {
    struct promise_type {
        std::optional<T> value;  // most recently yielded value

        Generator get_return_object() { return Generator{Handle::from_promise(*this)}; }
        std::suspend_always initial_suspend() noexcept { return {}; }  // lazy: nothing runs until begin()
        std::suspend_always final_suspend() noexcept { return {}; }    // keep the frame alive for done()
        void return_void() noexcept {}
        // Propagate the exception out of resume() to the consumer.
        void unhandled_exception() { std::rethrow_exception(std::current_exception()); }
        std::suspend_always yield_value(T v) { value = std::move(v); return {}; }
    };
    using Handle = std::coroutine_handle<promise_type>;

    explicit Generator(Handle h) : h_(h) {}
    ~Generator() { if (h_) h_.destroy(); }
    Generator(const Generator&) = delete;
    Generator(Generator&& o) noexcept : h_(std::exchange(o.h_, {})) {}

    struct Sentinel {};
    struct Iterator {
        Handle h;
        bool operator==(Sentinel) const noexcept { return h.done(); }
        Iterator& operator++() { h.resume(); return *this; }
        T& operator*() const { return *h.promise().value; }
    };

    Iterator begin() { h_.resume(); return {h_}; }  // run to the first co_yield
    Sentinel end() { return {}; }

private:
    Handle h_;
};

Generator<int> range(int n) {
    for (int i = 0; i < n; ++i) co_yield i;
}
Notice that begin() calls h_.resume() to advance past initial_suspend and run the body to its first co_yield. The iterator’s operator== checks h.done() to detect that the coroutine body has finished. The destructor calls h_.destroy() because final_suspend returns suspend_always, leaving the frame alive for that final check.
The Real Cost of the Protocol
The promise_type protocol is verbose, and that verbosity reflects genuine design surface. Every method corresponds to a decision that could go either way; the standard chose to expose all of them rather than picking defaults that would not fit everyone. Whether that tradeoff was correct is a fair debate. Lewis Baker’s coroutine theory series remains the best treatment of why each piece exists, written from the perspective of someone who shaped the final design.
For most application code, you will never write a promise_type directly. You pick a library (asio::awaitable, folly::coro::Task, std::generator) and use the protocol they provide. The mechanics described here are what those libraries implement on your behalf, and knowing them makes it considerably easier to debug when something in the coroutine lifecycle goes wrong.