Async Spread Like a Virus, and the Runtime Was Always the Hidden Part

There is a version of the async/await story where everything worked out. You write code that looks sequential, the compiler turns it into a state machine, threads stay unblocked, and throughput climbs. That version is true, in the same way it is true that a map projection accurately represents the earth.

The original essay on causality.blog traces the gap between what async promised and what it actually delivered. The short version: the syntax hid complexity rather than removing it. The longer version is worth unpacking, because the gap is different depending on which language you are in, and in some cases the delivery was genuinely good.

The Problem Before Async

To understand why async/await felt like liberation, you have to remember what it replaced. In early Node.js, non-blocking I/O meant callback pyramids:

fs.readFile('config.json', (err, data) => {
  if (err) return handleError(err);
  parseConfig(data, (err, config) => {
    if (err) return handleError(err);
    connectDb(config.db, (err, db) => {
      if (err) return handleError(err);
      db.query('SELECT 1', (err, rows) => {
        // by now you've lost the thread
      });
    });
  });
});

Error handling was manually threaded through every callback. Stack traces were useless. The control flow was inverted: instead of the code describing what happens in sequence, it described what to do when each step eventually completed. Promises flattened the nesting but preserved the inversion. async/await finally gave you the sequential appearance back.

async function boot() {
  const data = await fs.promises.readFile('config.json');
  const config = await parseConfig(data);
  const db = await connectDb(config.db);
  const rows = await db.query('SELECT 1');
  return rows;
}

This reads like synchronous code. It is not synchronous code. That distinction is the source of most of the trouble.

The Color Problem

In 2015, Bob Nystrom wrote “What Color is Your Function?”, which named the core structural problem. Async functions are a different “color” from synchronous ones. You can call a synchronous function from an async context. You cannot straightforwardly call an async function from a synchronous one. This asymmetry is not a quirk: it is load-bearing.

The consequence is that async spreads. The moment one function deep in a call stack becomes async, every caller that wants to actually await its result must also become async. This propagation has no natural stopping point except the event loop itself or a runtime boundary like block_on in Rust or asyncio.run in Python. In practice, it means that introducing async I/O into a library forces callers to adopt async whether they wanted to or not.
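The propagation can be seen in miniature with Python's asyncio. This is a minimal sketch, not taken from the original essay: read_setting, load_config, sync_caller and main are hypothetical names, and the sleep stands in for real I/O.

```python
import asyncio
import inspect


async def read_setting(key: str) -> str:
    # Hypothetical leaf function; the sleep stands in for real async I/O.
    await asyncio.sleep(0)
    return f"value-for-{key}"


async def load_config() -> dict[str, str]:
    # Has to be async as well, because it wants the awaited result.
    return {"db": await read_setting("db")}


def sync_caller() -> bool:
    # A plain function can call load_config(), but it only gets a coroutine
    # object back; nothing has run yet and there is no result to use.
    pending = load_config()
    is_coroutine = inspect.iscoroutine(pending)
    pending.close()  # discard it to avoid a "never awaited" warning
    return is_coroutine


def main() -> dict[str, str]:
    # The runtime boundary: asyncio.run starts an event loop, drives the
    # coroutine to completion, and hands back a plain value.
    return asyncio.run(load_config())
```

Only main, which owns the event loop, gets to stay synchronous. Every function between it and the leaf either becomes async or never sees the result.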

In JavaScript this is especially visible because much of the Node.js standard library now exists in two forms: the original callback APIs, and promise-returning versions (fs.promises, or callback functions wrapped with util.promisify) that async/await can reach. The color boundary sits at the language runtime level, not at any user-controlled abstraction.

What the State Machine Actually Looks Like

When the compiler transforms an async function, it builds a state machine. Each await point is a state transition. The function’s local variables become fields of a struct that can be suspended and resumed. In Rust, this is explicit through the Future trait:

trait Future {
    type Output;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output>;
}

A future does nothing until it is polled. When the executor calls poll, the future either returns Poll::Ready(value) or arranges for the waker in cx to be notified and returns Poll::Pending. This is a pull-based model, and it is where Rust made a distinctive choice: zero-cost futures. The generated state machine allocates nothing on the heap unless you box it explicitly, although executors typically allocate once when a task is spawned. No thread is consumed while the future is waiting.
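The same "inert until driven" behaviour can be sketched in Python by stepping a coroutine by hand. This is a loose analogy rather than Rust's Future: send plays the role of poll, there is no waker, and suspend, two_steps and drive are hypothetical names made up for the illustration.

```python
import types


@types.coroutine
def suspend(tag):
    # A bare yield inside a generator-based coroutine suspends the whole
    # await chain and hands `tag` to whoever is driving it.
    resumed_with = yield tag
    return resumed_with


async def two_steps(log):
    log.append("started")
    first = await suspend("waiting-1")   # first suspension point
    log.append(f"resumed with {first}")
    second = await suspend("waiting-2")  # second suspension point
    log.append(f"resumed with {second}")
    return first + second


def drive(coro, values):
    """Toy executor: resume the coroutine once per value until it returns."""
    seen = [coro.send(None)]               # first step: runs to the first await
    for value in values:
        try:
            seen.append(coro.send(value))  # resume: runs to the next await
        except StopIteration as done:      # the coroutine returned a value
            return seen, done.value
    return seen, None
```

Calling two_steps(log) runs none of its body. The local variables first and second live inside the suspended coroutine object between calls to send, which is the role the generated struct fields play in a compiled state machine.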

The tradeoff is that Rust futures require an executor, and no executor is bundled with the language. You pick tokio or async-std or something smaller, and those choices are not always composable. A library that depends on tokio’s timer cannot easily run under a different executor. The viral spread in Rust is not just about the async keyword: it extends into runtime coupling.

Python’s Middle Ground

Python’s asyncio is a closer parallel to JavaScript than to Rust. The event loop is managed by the standard library, and async def functions produce coroutines rather than the poll-based futures of Rust. The coloring problem is just as present: you cannot await inside a regular function, and calling async code from synchronous code requires asyncio.run() or a running event loop.

What the Python ecosystem added with Trio, a third-party library, is a clearer structural model. Nathaniel J. Smith’s work on structured concurrency formalizes a constraint: tasks cannot outlive the scope that created them. In Trio, nurseries replace arbitrary task spawning:

import trio

async def main():
    async with trio.open_nursery() as nursery:
        nursery.start_soon(fetch_data, url1)
        nursery.start_soon(fetch_data, url2)
    # both tasks are done here, guaranteed

This eliminates an entire class of leak and error-propagation bugs that asyncio’s create_task leaves open. Python 3.11 brought a similar model into the standard library with asyncio.TaskGroup. The structured concurrency guarantee is that you know when your concurrent work finishes, and exceptions propagate out of the group rather than silently dying in a background task.

But Python also has the GIL, which means async concurrency and parallel execution are separate problems. Async handles I/O concurrency; multiprocessing handles CPU parallelism. This split is explicit and honest, which is better than pretending async solves both.

Go Said No

Go’s answer to the color problem was to not have it. Goroutines are cheap enough (each starts with a stack of about 2 KB in current Go releases, grown on demand) that you can create hundreds of thousands of them. The scheduler multiplexes goroutines onto OS threads transparently. Any function can block; the scheduler simply parks the goroutine and runs another one.

func fetchAll(urls []string) []string {
    results := make([]string, len(urls))
    var wg sync.WaitGroup
    for i, url := range urls {
        wg.Add(1)
        go func(i int, url string) {
            defer wg.Done()
            results[i] = fetch(url) // blocking call, no special syntax
        }(i, url)
    }
    wg.Wait()
    return results
}

No async, no await, no color. The function fetch does not need to know it is called concurrently. The complexity is in synchronization primitives (channels, WaitGroups, mutexes) rather than in the type system. The criticism is that this makes it harder to see where concurrency happens in code review; the defense is that this is the right tradeoff because concurrency should be a scheduling concern, not a type concern.
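For comparison with the async examples, the colorless style can be approximated in Python with a thread pool. This is only a rough analogue: it uses OS threads, which are far heavier than goroutines, and fetch here is a hypothetical blocking function whose sleep stands in for network I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch(url: str) -> str:
    # An ordinary blocking function with no async keyword. It has no idea
    # whether it is being called sequentially or concurrently.
    time.sleep(0.01)
    return f"body of {url}"


def fetch_all(urls: list[str]) -> list[str]:
    # One worker per URL; max_workers must be at least 1, hence the `or 1`.
    with ThreadPoolExecutor(max_workers=len(urls) or 1) as pool:
        # map() preserves input order, like the indexed writes in the Go code.
        return list(pool.map(fetch, urls))
```

The signature of fetch is identical in the sequential and concurrent cases. The concurrency lives entirely in fetch_all, which is the property the Go version has by default.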

The goroutine model does have costs: the garbage collector must scan goroutine stacks, and the scheduler adds overhead that a hand-tuned async state machine avoids. Some published benchmarks show Rust’s futures outperforming goroutines in raw I/O throughput at high connection counts, though such results vary with the workload and the runtime versions compared. Whether that matters for your application is a different question.

Java Came Around Eventually

Project Loom brought virtual threads to the JVM, and they became a final feature in Java 21. Like goroutines, virtual threads are cheap to create and can block freely without consuming an OS thread. Most of the existing Java ecosystem, which was written for blocking I/O, works without modification. You do not rewrite your JDBC calls; you just run them in a virtual thread. The main caveat in Java 21 is pinning: a virtual thread that blocks inside a synchronized block holds on to its carrier OS thread for the duration.

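try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    var futures = urls.stream()
        .map(url -> executor.submit(() -> fetch(url)))
        .toList();
    // Each fetch blocks only its own virtual thread. get() blocks the caller
    // and throws checked exceptions, so it cannot be passed as Future::get.
    var results = new ArrayList<String>(); // assumes fetch returns a String
    for (var future : futures) {
        results.add(future.get()); // InterruptedException, ExecutionException
    }
    return results;
}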

This is arguably the most pragmatic solution: no new syntax, no ecosystem split, no viral coloring. The JVM team spent years on continuation support to make it work. Whether other languages could have taken this path depends on runtime architecture; it requires the ability to capture and restore call stacks cheaply, which is easier on a managed runtime with its own JIT.

What Was Actually Delivered

Async/await delivered exactly what it promised on the narrow question of syntax. Code looks sequential. Callback pyramids are gone. Error handling with try/catch works across await boundaries in ways it never did with callbacks.

What it did not deliver is simplicity. The complexity moved: from callback nesting into the type system, the executor model, the runtime dependency graph, and the debugging toolchain. Stack traces in async code are still fragmented in many runtimes. In Node.js, V8’s async stack traces (on by default since Node.js 12) help, but they follow await chains and lose the trail across raw callbacks and hand-built promise chains. In Rust, async stack traces remain a known pain point that tools like tokio-console partially address by providing a dedicated async task inspector.

Cancellation is another place where the abstraction leaks. In JavaScript, AbortController was bolted onto fetch and the broader web platform after promises were already widely deployed, because the original design had no structured way to cancel in-flight work. In Rust, dropping a future cancels it: the future is never polled again, so execution simply stops at whichever .await it was suspended on. That is efficient, but it requires every .await point to be a safe cancellation boundary, which is not always true for code using non-async resources.
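Python’s asyncio shows the same shape from a different angle: cancellation arrives as an exception at whichever await the task is suspended on, so cleanup has to be written for that path. A minimal sketch, with worker and run_and_cancel as hypothetical names:

```python
import asyncio


async def worker(log: list[str]) -> None:
    log.append("acquired")
    try:
        # Cancellation is delivered here, at the suspended await.
        await asyncio.sleep(3600)
        log.append("finished")  # never reached in this sketch
    finally:
        # Runs on the cancellation path as well as the normal one.
        log.append("released")


async def run_and_cancel() -> list[str]:
    log: list[str] = []
    task = asyncio.create_task(worker(log))
    await asyncio.sleep(0)  # give the worker one turn so it reaches its await
    task.cancel()           # request cancellation
    try:
        await task
    except asyncio.CancelledError:
        log.append("cancelled")
    return log
```

Without the try/finally, the "released" step would be skipped silently. Every await in a function is a place where this can happen, which is what makes each one a cancellation boundary the author has to think about.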

Structured concurrency addresses some of this, but it is an addition to the model rather than a property of async/await itself. It took the ecosystem years to converge on it, and it is still not universal.

The Honest Assessment

Async/await is better than callbacks. It is better than manual continuation-passing style. For I/O-bound workloads where thread-per-connection does not scale, it solves a real problem with acceptable costs.

But the pitch was often that it makes concurrency easy. It does not. It makes certain concurrency patterns syntactically tractable while pushing the structural complexity elsewhere. The coloring problem is real; it does constrain ecosystem design. The runtime coupling in Rust is real; it does fragment library compatibility. The broken stack traces are real; they do make debugging harder.

Go’s goroutine model and Java’s virtual threads represent a different bet: that hardware is cheap enough and schedulers are good enough that you do not need to expose the async boundary to users at all. Given that a single Java 21 JVM can run millions of virtual threads, memory permitting, while preserving the entire existing synchronous API surface, that bet looks increasingly correct for general-purpose server-side code.

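For systems-level code where allocator control and deterministic latency matter, Rust’s explicit model remains the right choice. The zero-cost abstraction is not marketing: each generated state machine is sized at compile time to the state it must hold across its await points, often far less than a goroutine’s initial stack, and resuming it is an ordinary function call rather than a scheduler context switch.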

The lesson is not that async/await failed. The lesson is that “sequential-looking code” and “simple concurrency” are not the same thing, and selling the former as the latter created years of confusion about what the model actually required from the programmer and the runtime beneath them.