How Virtual Threads Saved Me from Becoming a Meme

It’s confession time: for a while there, I thought I was actually turning into a meme. Specifically, a substrain of the GNU/Linux copypasta meme.
Whenever a discussion about asynchronous programming in Java came up, I’d get triggered – and then I’d spout something like this:
I’d just like to interject for a moment. What you’re referring to as async is, in fact, non-blocking, or as I’ve recently taken to calling it, non-blocking IO.
If you think I’m joking, here’s the latest example from a discussion on Slack:

And a little further down the thread, you can see the moment when I finally snapped out of it:

But first, let’s set the stage with some context.
Let me introduce you to the problem
If you’ve talked to me about programming over the past ~10 years, you’ve probably heard me mention technologies like RxJava or its Spring-aligned successor, Project Reactor. These libraries, along with underlying technologies such as the Netty framework, have enabled us to build highly scalable HTTP API applications in Java.
The long-standing problem with Java apps was the “one thread per request” model.
This works for relatively low-traffic use cases, where you can let the JVM spawn a few hundred threads and hope it’s enough. Dedicating a thread is also unavoidable for CPU-intensive tasks, like running regexes or hashing passwords with bcrypt, where the thread actually needs to run.
But for HTTP APIs, most time is spent waiting on IO to backend services or databases, so threads mostly sit idle while consuming resources. Context switching and reserved stack memory add up, limiting how many threads the JVM can realistically handle.
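To make the cost concrete, here’s a minimal sketch of the thread-per-request model. The handler and its 50 ms “IO wait” are hypothetical stand-ins, but the arithmetic holds: with a pool of 200 platform threads, each request holding its thread for 50 ms, throughput tops out around 4,000 requests per second no matter how fast the CPU is.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPerRequest {
    // Hypothetical handler: most of its wall-clock time is spent waiting,
    // yet it holds an OS thread (and its stack) the whole time.
    static String handleRequest(int id) throws InterruptedException {
        Thread.sleep(50); // stands in for a database or backend call
        return "response-" + id;
    }

    public static void main(String[] args) throws Exception {
        // A bounded pool of platform threads caps concurrency:
        // 200 threads / 0.05 s per request ≈ 4,000 req/s at best.
        ExecutorService pool = Executors.newFixedThreadPool(200);
        Future<String> future = pool.submit(() -> handleRequest(1));
        System.out.println(future.get());
        pool.shutdown();
    }
}
```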
And then… comes callback hell
Historically (and I’m talking Java EE 6, circa 2009 levels of historic here) the solution to this was asynchronous programming.
The underlying technique is using a callback, potentially invoked by a different thread later. This allows a method to return early while leaving part of its logic in the callback to continue execution. This matters in Java’s cooperative concurrency: once a method runs on a thread, it can’t be forcefully stopped – it must return or throw an exception, or the thread remains stuck.
This callback technique can lead to ergonomic issues, often referred to as “callback hell.” Fortunately, there are solutions to address this.
One approach is to wrap callbacks in promise objects.
In the Java world, we can achieve this with Future, and more recently with CompletableFuture. We can go a step further to build more complex chains of asynchronous processing that may produce multiple responses. This is where libraries like Reactor come in. In some cases, the language itself provides syntactic sugar to simplify this – for example, JavaScript’s async/await keywords (and other languages that transpile to JavaScript).
Ultimately, all of these are just different interfaces on top of the same underlying callback mechanism. The same applies to Kotlin’s coroutines – they all inherit the same fundamental challenge.
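As a sketch, here is the same hypothetical lookup wrapped in a `CompletableFuture`, with stages chained instead of callbacks nested. Under the hood it’s still the callback mechanism; the promise is just a nicer interface over it:

```java
import java.util.concurrent.CompletableFuture;

public class PromiseExample {
    // The same hypothetical async lookup, now returning a promise
    // instead of taking a raw callback parameter.
    static CompletableFuture<String> fetchUserName(int id) {
        return CompletableFuture.supplyAsync(() -> "user-" + id);
    }

    public static void main(String[] args) {
        // thenApply chains stages instead of nesting callbacks;
        // each stage may be invoked later, on a different thread.
        String greeting = fetchUserName(42)
                .thenApply(name -> "hello, " + name)
                .join(); // block here only for the sake of the demo
        System.out.println(greeting);
    }
}
```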
Sometimes functions don’t play nice
In order for any of this to work, the functions that are getting invoked need to support it. They need to either work with callbacks or return futures or reactor types.
It’s relatively easy to convert one of these into another, so they can all be considered equivalent. At the end of the day, you’ll inevitably end up with two types of functions in your codebase: those that work with callbacks (or any of their ergonomic wrappers) and those that don’t.
Because of the cooperative concurrency model, as a function caller, you can’t force a function that doesn’t support callbacks to work with them. This is sometimes called the “colored function problem,” and it’s considered the cardinal sin of this approach to concurrency.
By the way, do you recognize the color problem I mentioned in the Slack thread above? Nikolay added some cool new features to his library and asked me to migrate the API Gateway service to use them. The issue was that the gateway relies on “red” (non-blocking) functions, while the new feature was implemented as a “blue” (blocking) function.
In the past, this would have made it unusable in the gateway – but not anymore!
The solution? Virtual threads
The solution is virtual threads – a relatively new (welcome to the 2020s!) feature that has been gradually rolling out since Java 19. By version 24, they were stable enough for production use, and with the release of the latest LTS, version 25, we can now use them without reservation.
Let’s take a moment to explore exactly how virtual threads help us here.
Remember, the original problem was that having many threads in a waiting state is costly. The callback-style approach mitigates this by reducing the number of waiting threads: properly implemented functions (“red” ones) can return early, freeing threads that would otherwise be idle.
Virtual threads address the same problem by lowering the cost of waiting threads directly. They decouple Java threads from OS threads, reducing context-switching overhead, and they minimize the stack memory the JVM allocates per thread. By the way, this is the same approach used by Go with its goroutines.
Advantages over callbacks
This approach neatly sidesteps the problems caused by cooperative concurrency: functions no longer need to return early, so there is no need to track non-blocking “red” versus blocking “blue” functions.
In practice, this means we can freely call any kind of function from any other function. In the case of the library upgrade mentioned in the Slack thread, the gateway service can use the new feature without requiring that feature to be refactored to support callbacks.
Additionally, the need for libraries that simplify the ergonomics of callbacks is largely eliminated. We can write and invoke plain old Java functions just like we used to before 2009. Surprisingly, this is a win.
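A sketch of what that looks like in practice (`newLibraryFeature` is a hypothetical blocking “blue” function, standing in for the one from the Slack thread). Instead of refactoring it to return a future, we simply call it on a virtual thread:

```java
public class NoColors {
    // "Blue": a plain blocking function that knows nothing about
    // callbacks, futures, or Reactor.
    static String newLibraryFeature(String input) {
        try {
            Thread.sleep(100); // stands in for blocking IO
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "processed " + input;
    }

    public static void main(String[] args) throws InterruptedException {
        // No wrapper, no callback: just invoke it on a virtual thread.
        Thread t = Thread.ofVirtual().start(() ->
                System.out.println(newLibraryFeature("request")));
        t.join();
    }
}
```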
Reactor is EVERYWHERE
A lot of code currently uses the callback approach – sometimes directly, but more often abstracted behind promises or Project Reactor types. Libraries such as Reactor Netty provide robust and reliable HTTP server and client implementations.
In the past few weeks alone, I encountered Reactor playing a major role in two modern projects I worked on:
- The Spring AI project provides many of its libraries in two flavors, blocking and non-blocking. The business logic is implemented in the non-blocking libraries, while the blocking ones are thin wrappers that essentially call a blocking get method and return the result.
- Microsoft’s azure-kusto-java library for working with ADX just released a new major version, 7. Its flagship new feature: Reactor-based async APIs.
This is just anecdotal evidence, but it illustrates the massive Java ecosystem – with all its inertia – moving in the direction of a reactive-style, callback-based model.
Bringing the story back to the Slack thread case, the API gateway is one such Reactor-based application. It’s unrealistic to expect that it will be rewritten to avoid using Reactor anytime soon. Fortunately, there’s no need to do that.
There’s a simple and inexpensive way to switch existing reactive workflows to virtual threads: we can instruct a Mono or Flux to run on a scheduler backed by virtual threads. The change is as straightforward as this:
// Given a Mono or Flux chain:
chain
    // publishOn switches the *downstream* operators to the given scheduler,
    // so it must come before the blocking call
    .publishOn(Schedulers.fromExecutor(Executors.newVirtualThreadPerTaskExecutor()))
    .map(element -> someBlockingFunctionCall(element))
    // rest of the chain...
Just make sure the app is running on Java 24 or later; on older JVMs, blocking inside a synchronized block can pin the carrier OS thread – a catastrophic failure mode that is hard to diagnose and understand.
Now we have the best of both worlds – modern Reactor and classic Java
For more details on how Reactor uses threads and schedulers, check out the 3rd installment in the Flight of the Flux blog series: Hopping Threads and Schedulers. (BTW, the fact that this awesome blog post series even exists highlights an additional problem of the reactive approach: its complexity.)
One potentially tricky thing is that once a reactive chain has been switched over to run on virtual threads (or any other scheduler), it can’t be returned to the original scheduler in a generic way. If you have tight control over all the schedulers involved, you can move back; in general, though, it is not possible.
Luckily, this is usually not necessary – you can let the chain run on virtual threads until the end. Just be aware that a lot of your code will run on virtual threads after this change. Fortunately, the language and ecosystem have matured to the point where this is no longer the problem it was a year ago.
Now we get the best of both worlds: we can continue using libraries and frameworks built on Project Reactor, while seamlessly integrating code that exposes only classic, blocking “blue” functions. Over time, we can write more and more classic code, which is easier to understand and maintain.
And for new projects we can start fresh and write code like it’s 2008 again!


