C++ Coroutines Are a Scheduler Protocol, and Unity Makes That Obvious
Source: lobsters
Most developers who come to C++20 coroutines arrive from async/await territory. They have used Python’s asyncio, C#’s Task, Rust’s async fn, or JavaScript’s Promise, and they expect to find roughly the same thing in a C++ costume. What they find instead is promise_type, await_suspend, await_ready, await_resume, and several hundred words of boilerplate just to define a coroutine return type. The gap between expectation and reality produces genuine frustration.
Mathieu Ropert’s recent article describes finding the mental model that closes that gap: Unity’s coroutine system. Not the Unity that uses C#’s async/await (though it supports that too), but the older one, built on IEnumerator and yield return, where the engine drives each coroutine one frame at a time.
The connection is not obvious at first. Unity coroutines are cooperative game-loop tools; C++ coroutines are marketed as async machinery. But the shared underlying structure turns out to be the key to understanding why C++ made the choices it did.
What Unity Actually Does
Unity coroutines are written in C# and look like this:
IEnumerator FadeOut(float duration) {
    float elapsed = 0f;
    while (elapsed < duration) {
        SetAlpha(1f - (elapsed / duration));
        elapsed += Time.deltaTime;
        yield return null; // suspend until the next frame
    }
    SetAlpha(0f);
}

void Start() {
    StartCoroutine(FadeOut(2f));
}
The C# compiler transforms any method returning IEnumerator that contains yield return into a state machine class. The class stores local variables as fields and implements MoveNext() as a switch statement dispatching on an integer state index. StartCoroutine registers this state machine with Unity’s coroutine scheduler.
Each frame, Unity’s engine calls MoveNext() on every registered coroutine that is due to run. MoveNext() advances the state machine to the next yield return and returns true if the coroutine is still running, false when it finishes. The value passed to yield return (null, a WaitForSeconds, a WaitUntil, etc.) tells the engine what condition must hold before it calls MoveNext() again.
The coroutine itself does not drive anything. It defines where it can pause and what it wants to happen before it runs again. Unity, the engine, is the scheduler. Unity decides when to call MoveNext(). Unity inspects the yielded value. Unity owns the lifecycle.
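To make the compiler transformation concrete, here is a rough hand-written equivalent of the state machine, sketched in C++ rather than C#. The names FadeOutStateMachine, move_next, and run_frames are invented for illustration, the frame time is a fixed field rather than Time.deltaTime, and SetAlpha is replaced by recording values into a vector. The real generated class differs in detail; the point is only the shape: locals become fields, and a state index selects where to resume.

```cpp
#include <vector>

// Hand-written stand-in for the class a compiler would generate for FadeOut.
// All names are illustrative; this is not Unity's or C#'s actual output.
struct FadeOutStateMachine {
    float duration;
    float delta_time;             // assumed fixed frame time, for the sketch only
    std::vector<float>* alphas;   // records what SetAlpha would have received
    int state = 0;                // which resume point to jump to
    float elapsed = 0.0f;         // former local variable, now a field

    // Returns true while the coroutine still has work left.
    bool move_next() {
        switch (state) {
        case 0:  // first call: enter the loop
        case 1:  // resumed after `yield return null`
            if (elapsed < duration) {
                alphas->push_back(1.0f - elapsed / duration);
                elapsed += delta_time;
                state = 1;
                return true;      // suspend until the next frame
            }
            alphas->push_back(0.0f);
            state = 2;
            return false;         // finished
        default:
            return false;         // already finished
        }
    }
};

// Plays the role of the engine: one move_next() call per frame.
int run_frames(FadeOutStateMachine& sm) {
    int frames = 0;
    while (sm.move_next()) ++frames;
    return frames;
}
```

Nothing in the state machine decides when it runs. run_frames is the only place that does, which is the separation the rest of this article leans on.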
The Same Structure in C++
C++20 coroutines use the same separation. A coroutine defines where it can suspend; something else decides when to resume it. The difference is that in C++, “something else” is not the engine. It is whatever you write.
When the C++ compiler encounters a function containing co_await, co_yield, or co_return, it transforms the function body into a state machine whose locals and resume point live in a coroutine frame (heap-allocated by default), and makes that frame reachable through a std::coroutine_handle<>. Whoever holds that handle can call handle.resume() to advance the coroutine to its next suspension point, much as Unity calls MoveNext().
The customization points that confuse newcomers correspond directly to pieces of the Unity system.
The promise type is the equivalent of Unity’s engine scheduler. You define initial_suspend() to control whether the body runs immediately on first call or waits to be resumed. You define yield_value() to process whatever the coroutine yields. You define final_suspend() to control what happens when the coroutine finishes. The promise type is literally where you write your own engine loop.
The awaitable (co_await target) is the equivalent of yield return null versus yield return new WaitForSeconds(2f). Each awaitable’s await_suspend method receives the coroutine handle and decides what to do with it: store it in a timer queue, push it onto a thread pool, schedule it for the next frame. The awaitable is how the coroutine tells the scheduler: here is my handle, here is the condition under which you should call resume.
Here is what a next-frame awaitable looks like in C++:
struct NextFrame {
    GameLoop& loop;

    bool await_ready() { return false; } // always suspend
    void await_suspend(std::coroutine_handle<> h) {
        loop.schedule_next_frame([h]() mutable { h.resume(); });
    }
    void await_resume() {}
};

Task<void> fade_out(GameLoop& loop, float duration) {
    float elapsed = 0.0f;
    while (elapsed < duration) {
        set_alpha(1.0f - (elapsed / duration));
        elapsed += loop.delta_time();
        co_await NextFrame{loop}; // yield return null
    }
    set_alpha(0.0f);
}
The structure is identical to the Unity version. co_await NextFrame{loop} hands the coroutine handle to the game loop scheduler, which calls handle.resume() on the next tick. The only difference is that in C++ you wrote the scheduler; in Unity, the engine team wrote it.
Why the Boilerplate Exists
This is where the mental model pays off. The standard library deliberately ships only the mechanism. There is no standard executor, no standard event loop, no canonical Task<T>. This matches the fact that Unity’s scheduler is part of Unity, not part of C#. A scheduler for async I/O looks different from a scheduler for per-frame game logic, which looks different from a thread pool work queue, which looks different from a synchronous generator.
std::generator<T> was standardized in C++23 (via P2502) precisely because generators have exactly one meaningful scheduler: the calling code. The promise type for std::generator is simple because the scheduling decision is trivial. For everything else, the committee’s position is that the right scheduler depends on context only you know, so the customization point is open.
Lewis Baker’s cppcoro library, written before C++20 shipped, demonstrates what the full abstraction looks like once a scheduler is committed to. The task<T> type uses symmetric transfer (standardized in P0913) to ensure that resuming a chain of N awaiting coroutines uses O(1) stack depth regardless of chain length, the kind of guarantee a production async scheduler needs.
Boost.Asio has offered C++20 coroutine support since Boost 1.75, and its asio::awaitable<T> with co_spawn and use_awaitable provides a complete scheduler-plus-coroutine system:
asio::awaitable<void> handle_connection(tcp::socket socket) {
    char buf[1024];
    for (;;) {
        auto n = co_await socket.async_read_some(
            asio::buffer(buf), asio::use_awaitable);
        co_await asio::async_write(
            socket, asio::buffer(buf, n), asio::use_awaitable);
    }
}
The experience of writing this is close to C# async/await, because Asio is acting as the engine. The boilerplate lives in Asio, not in your code.
The Real Design Trade-offs
There is a genuine cost to this approach: nothing in a function’s signature distinguishes a C++ coroutine from a regular function. A function returning Task<int> might be a coroutine; it might be a regular function that constructs a Task<int> from scratch. There is no async keyword, and only the presence of co_await, co_yield, or co_return in the body makes the difference.
This was a deliberate committee decision documented in P0973R0. The argument is that the return type already carries the semantic signal (a Task<int> implies deferred work), and adding a keyword would create backward-compatibility complications for existing code. The practical consequence is that there is no compiler-level protection against calling a coroutine and discarding the return value, which for a lazily started task silently means the body never executes. Adding [[nodiscard]] to the return type (common in cppcoro, folly::coro, and similar libraries) catches this at the call site, but it is a library convention rather than a language guarantee.
The more serious structural problem is reference parameters across suspension points:
// Dangerous: the reference may dangle after suspension
Task<void> process(const std::string& config) {
    co_await async_setup(); // coroutine suspends here
    use(config);            // config may be dangling by now
}

// At the call site:
auto t = process(compute_config()); // temporary destroyed at semicolon
The compiler does not warn. Clang-Tidy’s cppcoreguidelines-avoid-reference-coroutine-parameters check catches this pattern statically, and it is worth enabling on any codebase using coroutines. Rust’s borrow checker catches the equivalent at compile time, which is a genuine advantage C++ does not currently have, and the C++26 roadmap does not address it.
That said, the other side of the trade-off is real control. promise_type::operator new can be overridden to allocate frames from per-thread arenas or fixed-size pools. await_transform intercepts every co_await in a coroutine body, allowing a scheduler to inject cancellation checks or thread-affinity validation at every suspension point without requiring the coroutine author to annotate each one. With the heap allocation elision optimization (HALO), an optimizing compiler can elide the heap allocation entirely when the frame’s lifetime is provably bounded within the caller, reducing a coroutine resumption to an indirect function call. This is an optimization rather than a guarantee, so whether it fires depends on the compiler, the optimization level, and how the coroutine type is written.
What the Mental Model Unlocks
Once you see coroutines as a scheduler protocol rather than async syntax, several things that seemed like quirks become legible engineering decisions.
The heap allocation of the coroutine frame exists because the frame’s lifetime is decoupled from the calling stack, much as Unity’s IEnumerator object lives on the managed heap independent of the calling Start() method. The reason final_suspend must be declared noexcept is that by final suspension the coroutine body has already finished and its result or exception has been handed to the promise; an exception thrown at that point would have no coherent recovery semantics. The reason await_suspend can return void (suspend unconditionally), bool (false resumes the coroutine immediately instead of suspending), or a coroutine_handle<> (symmetric transfer to another coroutine) is that different scheduler designs need different behavior at the suspension site.
The generator case is where this becomes most concrete for developers who have never written a game engine. std::generator<T> yields values to its caller and resumes when the caller advances the range. The caller is the scheduler. The range-for loop is MoveNext(). Writing co_yield value in a generator is exactly yield return value in Unity, with the calling loop playing the role of the Unity engine.
The Remaining Gap
What C++ still lacks, and what P2300 (std::execution, targeting C++26) aims to close, is a standard scheduler. Unity provides its scheduler as part of the engine package. C++ provides the protocol for writing any scheduler, but no canonical scheduler for common cases. P2300’s sender/receiver model is designed to compose with coroutines via std::execution::as_awaitable, which would mean a standard way to express “run on this thread pool” or “complete when this I/O finishes” without depending on a specific library.
Until P2300 ships and tooling stabilizes, the practical path is Boost.Asio for networked I/O, std::generator for synchronous generators, and cppcoro or folly::coro as references for building custom task types. The mechanism is solid; the library layer is still assembling itself.
The Unity lens does not make that gap disappear. But it makes clear why the gap exists: Unity is a complete game engine, not just a coroutine mechanism. C++20 shipped the mechanism and left the engine to the ecosystem, intentionally. Whether that was the right call is worth debating. That it was a conscious choice, made with a coherent reason, is what the Unity comparison finally makes legible.