Flavors of Concurrency in Java: Threads, Executors, ForkJoin and Actors


We live in a world where multiple things happen at the same time. Naturally, the programs we write reflect this trait and are capable of doing things concurrently. Except for Python code of course, but even then you can use Jython to run your programs on the JVM and make use of the fabulous power of multiprocessor machines.

However, the complexity of concurrent programs does not sit well with the limited performance of human brains. By comparison, we downright suck: we are not created to think about multithreaded programs, assess concurrent access to limited resources and predict where errors or at least bottlenecks will occur.

As with many hard problems, humanity has come up with a number of solutions and models for concurrent computation. They emphasize different parts of the problem and make different choices among the computational tradeoffs that come up when we try to achieve parallelism.

In this post, I’d like to examine code that implements a concurrent solution to the same problem in several ways, and talk about what’s good about each approach, what some potential drawbacks are, and what pitfalls may lie in wait for you.

We’ll go over the following methods and approaches to enable concurrent processing and asynchronous code:

  • Bare Threads
  • Executors & Services
  • ForkJoin framework and parallel streams
  • Actor model

In this post we won’t look at Fibers, also known as lightweight threads, but here you can find a great explanation of what Fibers are and when to use them.

To make it more interesting, I didn’t just provide arbitrary code to illustrate each approach, but used a common task, so the code in every section is more or less equivalent. Oh, and don’t take the code for anything more than an illustration: most of the initialization code should not be in the same method, and in general these are not production-level examples. If you’re interested in the Java 8 vs Java 7 performance benchmark blog I wrote, you should read that as well!

Oh, one last thing: Towards the end, there is a 1-question poll that asks what you or your organization uses for concurrency, so please fill it out for the sake of your fellow engineers!

The Task

The task: Implement a method that takes a message and a list of strings, each corresponding to a search engine query page, issues HTTP requests to query the message, and returns the first result available, preferably as soon as it is available.

In case everything goes wrong, it’s acceptable to throw an exception or return null. I just tried to avoid looping forever waiting for the result.

Quick note: I cannot go really deep into the details of how multiple threads communicate or into the Java Memory Model at this time, but if you have a strong thirst for such things, you can start with my previous post on the subject: testing concurrency with JCStress harness.

So here we go, let’s start with the most straightforward and hardcore way to do concurrency on the JVM: managing bare threads by hand.

Method 1: Try hand-crafted, fully-organic, GMO-free Bare Threads

Unleash your inner code naturalist with bare threads! Threads are the most basic concurrency primitive there is. Java threads are actually mapped to the operating system threads and every Thread object represents one of the lower level computation threads.

Naturally, the lifecycle of a thread is taken care of by the JVM and scheduling is not your concern as long as you don’t have to make Threads communicate with each other.

Every thread gets its own stack space, consuming a part of the designated JVM process memory.

The Thread API is pretty straightforward: you feed it a Runnable and call .start() to begin the computation. There’s no good API to stop a Thread (Thread.stop() is deprecated as unsafe), so you have to implement stopping yourself, using some kind of boolean flag to communicate.
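
Here’s a minimal sketch of that boolean-flag approach. The class and method names (StoppableWorker, requestStop, startAndStop) are made up for this illustration, and the loop body is only a stand-in for real work.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example class, not part of the original post.
class StoppableWorker implements Runnable {
  // volatile makes the write in requestStop() visible to the worker thread
  private volatile boolean running = true;
  private final AtomicLong iterations = new AtomicLong();

  @Override
  public void run() {
    while (running) {
      iterations.incrementAndGet(); // stand-in for one unit of real work
      Thread.yield();               // be polite to other threads in this toy loop
    }
  }

  // Called from another thread to ask the worker to finish.
  void requestStop() {
    running = false;
  }

  long iterations() {
    return iterations.get();
  }

  // Starts a worker, lets it run briefly, asks it to stop and waits for it.
  // Returns true if the thread actually terminated.
  static boolean startAndStop(long runMillis) {
    StoppableWorker worker = new StoppableWorker();
    Thread thread = new Thread(worker, "stoppable-worker");
    thread.start();
    try {
      Thread.sleep(runMillis);
      worker.requestStop();
      thread.join(1000); // give it up to a second to notice the flag
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }

  public static void main(String[] args) {
    System.out.println("stopped cleanly: " + startAndStop(50));
  }
}
```

The flag only works because the worker checks it regularly. A thread stuck in a blocking call will not see it, which is where Thread.interrupt() usually comes in instead.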

In the following example, we create a new Thread per search engine to be queried. The result of the querying is set into the AtomicReference, which doesn’t require locking or anything to ensure that only a single write will happen. Here we go!

private static String getFirstResult(String question, List<String> engines) {
 AtomicReference<String> result = new AtomicReference<>();
 for(String base: engines) {
   String url = base + question;
   // one thread per search engine; the first one to finish sets the result
   new Thread(() -> {
     result.compareAndSet(null, WS.url(url).get());
   }).start();
 }
 while(result.get() == null); // wait for some result to appear
 return result.get();
}

The main benefit of using bare threads is that you are the closest to the operating system / hardware model of concurrent computations and the best thing is that this model is quite simple. Multiple threads run, communicate via shared memory and that’s it.
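
The busy-wait loop in the example keeps a core spinning and never returns if every engine fails. As a sketch of one possible alternative, still with bare threads, the version below blocks on a CountDownLatch and gives up after a timeout. The fetcher parameter is my stand-in for the WS.url(url).get() call, and the class name and timeout are assumptions for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

// Hypothetical sketch: `fetcher` stands in for the post's WS.url(url).get() call.
class FirstResultWithTimeout {

  static String getFirstResult(String question, List<String> engines,
                               Function<String, String> fetcher, long timeoutMillis) {
    AtomicReference<String> result = new AtomicReference<>();
    CountDownLatch firstAnswer = new CountDownLatch(1);

    for (String base : engines) {
      String url = base + question;
      Thread t = new Thread(() -> {
        try {
          String body = fetcher.apply(url);
          // only the first non-null answer wins; later ones are ignored
          if (body != null && result.compareAndSet(null, body)) {
            firstAnswer.countDown();
          }
        } catch (RuntimeException e) {
          // a failed engine simply never answers
        }
      });
      t.setDaemon(true); // do not keep the JVM alive for slow engines
      t.start();
    }

    try {
      // block without burning CPU, and give up after the timeout
      firstAnswer.await(timeoutMillis, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return result.get(); // null if nothing arrived in time
  }

  public static void main(String[] args) {
    List<String> engines = Arrays.asList("https://a.example/?q=", "https://b.example/?q=");
    System.out.println(getFirstResult("java", engines, url -> "fake page for " + url, 1000));
  }
}
```

It still spawns one thread per engine, so the cost argument below applies unchanged.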

The biggest disadvantage of managing threads yourself is that it’s so easy to go overboard with the number of threads you spawn. They are costly objects that take up a decent amount of memory and time to create. Paradoxically, with too few threads you sacrifice potential parallelism, but with too many you will probably run into memory issues and more complex scheduling.

However, if you need a quick and simple solution, you can definitely use this approach without much hassle.

Method 2: Get dead serious with Executors and CompletionServices

Another option is to use the API for managing groups of threads behind the curtain. Luckily, our wonderful JVM offers us exactly that with the Executor interface. The executor interface definition is quite simple:

public interface Executor {
  void execute(Runnable command);
}

It abstracts away the details about how the Runnable will be processed. It just says, “Simple developer! You’re nothing but a bag of meat, give me the task, I’ll handle it.”

And what’s even cooler is that the Executors class offers a bunch of methods to create thread pools and executors that have sane configurations. We’ll go with newFixedThreadPool(), which creates a predefined number of threads and doesn’t allow the pool to grow over time. It means that all submitted commands will have to wait in a queue when all threads are in use, but this is also handled by the executor itself.

On top of that there is ExecutorService, which gives control over the executor lifecycle, and CompletionService, which abstracts away even more details and acts like a queue for finished tasks. Thanks to that, we don’t have to worry about getting only the first result.

The call to service.take() below will return us only one result at a time.

private static String getFirstResultExecutors(String question, List<String> engines) {
 ExecutorCompletionService<String> service = new ExecutorCompletionService<String>(Executors.newFixedThreadPool(4));

 for(String base: engines) {
   String url = base + question;
   service.submit(() -> {
     return WS.url(url).get();
   });
 }
 try {
   // take() blocks until the first task has finished
   return service.take().get();
 }
 catch(InterruptedException | ExecutionException e) {
   return null;
 }
}

Going with executors and executor services is the right way if you want to have precise control over how many threads your program will generate and their exact behavior. For example, one important question to ponder is: what is the strategy for the tasks when all threads are busy doing other things? Do we spawn a new worker to handle it, up to some number of threads or infinitely? Do we put the task into a queue? What if that one is full? Grow the queue unboundedly?
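
To make those questions concrete, here is a sketch of a ThreadPoolExecutor configured by hand, where each constructor argument answers one of them. The numbers and the class name BoundedPoolDemo are arbitrary choices for illustration, not a recommendation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the sizes below are picked only for illustration.
class BoundedPoolDemo {

  static ThreadPoolExecutor newBoundedPool() {
    return new ThreadPoolExecutor(
        2, 4,                          // core size, and the hard upper limit on threads
        30, TimeUnit.SECONDS,          // idle time before threads above the core size retire
        new ArrayBlockingQueue<>(10),  // bounded queue: tasks wait here once the core threads are busy
        new ThreadPoolExecutor.CallerRunsPolicy()); // queue full and pool at maximum: the submitter runs the task
  }

  // Submits `count` trivial tasks and returns how many of them ran.
  static int runTasks(int count) {
    ThreadPoolExecutor pool = newBoundedPool();
    AtomicInteger completed = new AtomicInteger();
    try {
      for (int i = 0; i < count; i++) {
        pool.execute(completed::incrementAndGet);
      }
    } finally {
      pool.shutdown(); // stop accepting work, let queued tasks finish
    }
    try {
      pool.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return completed.get();
  }

  public static void main(String[] args) {
    System.out.println("completed tasks: " + runTasks(100));
  }
}
```

With this executor, extra threads beyond the core size are created only once the queue is full. CallerRunsPolicy slows the submitter down instead of dropping work; the JDK also ships AbortPolicy, DiscardPolicy and DiscardOldestPolicy if a different answer fits better.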

Thanks to the JDK, many configurations that answer these questions are already available with sensible names for you, like the Executors.newFixedThreadPool(4) above.

The lifecycle of threads and services is also mostly handled with options to shut things down appropriately. The only downside is that the configuration could be simpler and more intuitive for beginners. However, you’ll hardly find anything simple when talking about concurrent programming.

All in all, I personally think that for a larger system you want to use the executors approach.
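
As a sketch of what shutting things down appropriately can look like for our task, the variant below uses ExecutorService.invokeAny(), which returns the result of one task that completed successfully and cancels the rest, and always releases the pool in a finally block. The fetcher parameter again stands in for WS.url(url).get(); the class name, the five-second timeout and the one-thread-per-engine sizing are my assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical sketch: `fetcher` stands in for the post's WS.url(url).get() call.
class FirstResultInvokeAny {

  static String getFirstResult(String question, List<String> engines,
                               Function<String, String> fetcher) {
    if (engines.isEmpty()) {
      return null; // invokeAny rejects an empty task list
    }
    // one thread per engine, so no engine has to wait in the queue
    ExecutorService pool = Executors.newFixedThreadPool(engines.size());
    try {
      List<Callable<String>> tasks = new ArrayList<>();
      for (String base : engines) {
        String url = base + question;
        tasks.add(() -> fetcher.apply(url));
      }
      // returns the result of one successfully completed task
      // and cancels the tasks that have not completed yet
      return pool.invokeAny(tasks, 5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } catch (ExecutionException | TimeoutException e) {
      return null; // every engine failed, or none answered in time
    } finally {
      pool.shutdownNow(); // always release the worker threads
    }
  }

  public static void main(String[] args) {
    List<String> engines = Arrays.asList("https://a.example/?q=", "https://b.example/?q=");
    System.out.println(getFirstResult("java", engines, url -> "fake page for " + url));
  }
}
```

Creating a pool per call is only acceptable in a demo. In a real system the executor would normally be created once and shared.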

Method 3: Get wet and wild in the ForkJoinPool (FJP) with parallel streams

Parallel streams were added to Java 8, and since then we have a straightforward way to achieve parallel processing of collections. Together with lambdas, they form a powerful tool for organising concurrent computation.

There are a couple of catches that can get you if you decide to go this way. First of all, you’ll have to grasp some functional programming concepts, which actually is more a benefit than a downside. Next, it’s difficult to be sure that the parallel stream is actually using more than a single thread for the operations, which is left for the stream implementation to decide. And if you don’t control the source of the stream, you can never be sure what it does.
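
If you are curious which threads a parallel stream ends up using on your machine, a quick experiment like the sketch below can help. It records the name of every thread that processed an element. The class and method names are hypothetical, and the result depends on the machine and JVM, so treat it as a probe rather than a guarantee.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical probe class, not part of the original post.
class ParallelStreamThreads {

  // Runs a parallel stream and collects the names of all threads that touched an element.
  static Set<String> threadsUsed(int elements) {
    Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
    IntStream.range(0, elements)
        .parallel()
        .forEach(i -> names.add(Thread.currentThread().getName()));
    return names;
  }

  public static void main(String[] args) {
    System.out.println(threadsUsed(10_000));
  }
}
```

On a multi-core machine this typically lists the calling thread plus a few ForkJoinPool.commonPool-worker threads, but a single name is a perfectly legal outcome too.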

Additionally, you have to remember that, by default, parallelism is achieved by using the ForkJoinPool.commonPool(). The common pool is managed by the JVM and is shared across everything that runs inside the JVM process. This simplifies configuration to the point where you don’t have to worry about it at all.

private static String getFirstResult(String question, List<String> engines) {
 // get element as soon as it is available
 Optional<String> result = engines.stream().parallel().map((base) -> {
   String url = base + question;
   return WS.url(url).get();
 }).findAny();
 return result.get();
}

Looking at the example above, we don’t really care where or by whom the individual tasks will be completed. However, it also means that in one careless move you can find yourself with multiple stalled parts of your application without knowing it. In another post on the subject of parallel streams, I described the issue in more detail and while there is a workaround, it’s not the most obvious solution in the world.
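
One commonly used workaround, which may or may not be the one that other post describes, is to start the parallel stream from inside a task submitted to your own ForkJoinPool, so that blocking work does not occupy the shared common pool. The sketch below shows the idea. As far as I know this relies on how the stream implementation forks its subtasks rather than on a documented guarantee, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

// Hypothetical sketch of running a parallel stream in a dedicated ForkJoinPool.
class DedicatedPoolStream {

  static List<Integer> lengthsInOwnPool(List<String> words, int parallelism) {
    ForkJoinPool pool = new ForkJoinPool(parallelism);
    try {
      // The stream starts inside a task of `pool`, so its subtasks are forked
      // into `pool` instead of the common pool (implementation behaviour).
      Callable<List<Integer>> job = () ->
          words.parallelStream().map(String::length).collect(Collectors.toList());
      return pool.submit(job).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      pool.shutdown(); // the dedicated pool is ours to clean up
    }
  }

  public static void main(String[] args) {
    System.out.println(lengthsInOwnPool(Arrays.asList("threads", "executors", "actors"), 2));
  }
}
```

The price is that you are back to owning a pool, with its sizing and its shutdown, which is exactly the configuration the common pool was supposed to spare you.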

ForkJoin is a great framework, written and preconfigured by people much smarter than me. So that would be my first choice if I had to write a small program with some parallel processing.

The biggest downside is that you have to foresee the complications it might produce, which is not easy without a deep understanding of how the JVM works as a whole. And this most probably comes with experience only.

Method 4: Hire yourself an Actor

Actors represent a model that is a somewhat odd addition to the groups of approaches we’re looking at in this post. There is no implementation of actors in the JDK, so you’ll have to include some library that can implement them for you.

In short, in the actor model you think of everything as an actor. An actor is a computational entity, like a thread was in the first example above, that can receive messages from other actors (naturally, because everything is one).

In response to a message it can send messages to other actors or create new ones and interact with them, or just change its own internal state.

Pretty simple, but it’s a very powerful concept. The lifecycle and message passing are handled by the framework for you; you just specify what the work units should be. Additionally, the actor model emphasizes avoiding global state, which comes with several benefits. You can often get supervision strategies (like retry) for free, much simpler distributed system design, fault tolerance and so forth.

Below is an example of the code using Akka Actors, one of the most popular JVM actor libraries that has a Java API. Actually, it has one for Scala too and, in fact, Akka is the default actor library for Scala, which once had an internal implementation of actors. Several JVM languages, for instance Fantom if you’re into that kind of stuff, have implementations of actors too. This just shows that the actor model is broadly accepted and seen as a valuable addition to a language.
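
To show the bare idea without any framework, here is a toy actor built from JDK parts only: a mailbox, a single thread that processes one message at a time, and state that nobody else touches. This is not how Akka is implemented and it has no supervision or routing; CounterActor and its methods are hypothetical names for this sketch.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// A toy actor: one mailbox, one thread, private state. Not Akka, only the idea.
class CounterActor {
  private static final Object STOP = new Object(); // "poison pill" message

  private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
  private final Thread worker = new Thread(this::processMailbox, "counter-actor");
  private int count = 0; // touched only by the actor's own thread, so no locks are needed

  static CounterActor start() {
    CounterActor actor = new CounterActor();
    actor.worker.start();
    return actor;
  }

  // Sending never blocks the sender and never touches the actor's state directly.
  void tell(Object message) {
    mailbox.add(message);
  }

  // Lets the actor finish the messages queued so far, then returns its final count.
  int stopAndGetCount() {
    mailbox.add(STOP);
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return count; // safe to read once join() has returned normally
  }

  private void processMailbox() {
    try {
      while (true) {
        Object message = mailbox.take(); // one message at a time, in arrival order
        if (message == STOP) {
          return;
        }
        if (message instanceof Integer) {
          count += (Integer) message;
        } // unknown messages are ignored in this sketch
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  public static void main(String[] args) {
    CounterActor actor = CounterActor.start();
    for (int i = 1; i <= 10; i++) {
      actor.tell(i);
    }
    System.out.println(actor.stopAndGetCount()); // prints 55
  }
}
```

A real actor library multiplexes many actors over a small thread pool instead of giving each its own thread, which is a large part of why actors scale.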

static class Message {
 String url;
 Message(String url) {this.url = url;}
}
static class Result {
 String html;
 Result(String html) {this.html = html;}
}

static class UrlFetcher extends UntypedActor {

 @Override
 public void onReceive(Object message) throws Exception {
   if (message instanceof Message) {
     Message work = (Message) message;
     String result = WS.url(work.url).get();
     getSender().tell(new Result(result), getSelf());
   } else {
     unhandled(message);
   }
 }
}

static class Querier extends UntypedActor {
 private String question;
 private List<String> engines;
 private AtomicReference<String> result;

 public Querier(String question, List<String> engines, AtomicReference<String> result) {

   this.question = question;
   this.engines = engines;
   this.result = result;
 }

 @Override public void onReceive(Object message) throws Exception {
   if(message instanceof Result) {
     // the first Result to arrive wins
     result.compareAndSet(null, ((Result) message).html);
   }
   else {
     // any other message starts the search: one fetcher actor per engine
     for(String base: engines) {
       String url = base + question;
       ActorRef fetcher = this.getContext().actorOf(Props.create(UrlFetcher.class), "fetcher-"+base.hashCode());
       Message m = new Message(url);
       fetcher.tell(m, self());
     }
   }
 }
}

private static String getFirstResultActors(String question, List<String> engines) {
 ActorSystem system = ActorSystem.create("Search");
 AtomicReference<String> result = new AtomicReference<>();

 final ActorRef q = system.actorOf(
   Props.create((UntypedActorFactory) () -> new Querier(question, engines, result)), "master");
 q.tell(new Object(), ActorRef.noSender());

 while(result.get() == null);
 return result.get();
}

Akka actors use the ForkJoin framework for internal worker handling, and the code here is quite verbose. Don’t worry, most of it is the definitions of the message classes, Message and Result, and then two different actors: Querier to organise the search across all search engines and UrlFetcher to fetch a given URL. If there are more lines of code here, it’s because I didn’t want to inline all the things. The power of the actor model comes from the API on the Props objects, where we can define specific routing patterns, a custom mailbox address for the actor, etc. The resulting system is extremely configurable and contains very few moving parts. Which is always a great sign!

One disadvantage of using the actor model is that it really wants you to avoid global state, so you have to design your application a bit differently, which may complicate a project in mid-migration. At the same time, it includes a number of benefits, so getting acquainted with a new paradigm and learning to use a new library is totally worthwhile.

Feedback time: What do you use?

What is your default way to handle concurrency? Do you understand what model of computation lies behind it or is it just a framework with some Jobs or background tasks objects that automagically add async capabilities to your code?

In order to gather more data and find out if I should continue exploring different approaches to concurrency in more depth (for example, by writing a detailed blogpost about how Akka Actors work and what’s good and bad in their Java API), I’ve created a simple single-question survey for you. Please, dear reader, if you got this far, answer it too. I appreciate your interactivity!


In this post we discussed different ways to add parallelism to your Java application. Starting with managing Java threads ourselves, we gradually looked at more advanced solutions involving different executor services, the ForkJoin framework and the actor model of computation.

Wondering what to pick when you’re facing a real-world problem? They all have their own pros and cons, and mostly make different trade-offs between intuitiveness and ease of use on one side, and configurability and the raw power to increase (or decrease) the performance of your machine on the other.

In the next post, I’d like to present a more detailed overview of one approach. Which one will it be? Answer the survey question above, and the top answer will get the spot!

Do you have a war-story about concurrent programming or know a common pitfall to avoid? Share it in the comments below, so we can all grow as professionals, or chat with me on Twitter: @shelajev.

Update: we’ve published the Java Concurrency Flavors Followup post with the survey results. Check it out!



  • Pierre DAL-PRA

    It’s a tiny bit misleading to keep talking about Scala’s own Actor Model implementation, since Akka has been the official Actor Model implementation for Scala since Scala 2.10 (released in early 2013), and Scala’s own was deprecated with Scala 2.11.

  • Oleg Šelajev

    True, good catch, thank you Pierre. The intention was not to create confusion, but to show that many other languages have adopted the actor model deeper than Java.

    I’ve edited the post to clarify that.

  • Bruno Santos

    I think you should also include the CompletableFuture API from Java 8

  • Oleg Šelajev

    That’s a great idea for a continuation post! It does boil down to using ForkJoinPool or a supplied executor, but it’s a very important addition to Java 8.

    I’m definitely thinking about creating the second part to the blogpost now :)

  • Guillermo Guzmán

    Some time ago, i used aparapi to perform a simulation of the sun position using an algorithm developed by Roberto Grena. Because aparapi uses the GPU, the computation was really fast. You should talk about it.

  • Shailendra Singh

    As most of the Java developers are actually Java EE developers (don’t have numbers to prove it), I would like to throw some light on this from Java EE perspective –

    Bare threads – It was never an option for a Java EE application as its a well known myth that one should never create threads in a Java EE application due to blah blah reasons.

    ExecutorService – JSR 236 brought ManagedExecutorService to Java EE applications which is a counterpart of ExecutorService for Java EE applications. But as there is no independent implementation available for JSR 236, it can’t be used with Tomcat (which according to a report by this site is used by 50% of Java developers). As confirmed by tomcat team one can use ExecutorService directly with tomcat, this makes your Java EE application unportable. So either stick to standard and search endlessly to find out how to make it work with tomcat or use ExecutorService and make you Java EE application unportable.

    ForkJoinPool – Most of the time it happens (especially in case of Java concurrency utilities) that whenever new shinny features are introduced in Java SE, Java EE developers post questions on various forums asking how to use these shinny features in Java EE applications. This is the current situation of ForkJoinPool. Even after going through many forums I could not find any concrete information which confirms that ForkJoinPool can be used in a Java EE application.

    Actor Model – Have not done any research on this.

    And so winner is “NONE”.

  • Oleg Šelajev

    Yeah, this is the downside of Java EE. Because an application server is a host for several applications, it cannot allow one application to consume all the parallelism, right? It has to be fair and meet some expectations, so the only solution is to limit the parallelism it offers by default and hope that developers will figure it out, change it and behave responsibly.

    Other means are even harder to configure so it doesn’t bring the system down in case of overload or any problems. Which leaves ManagedExecutorService as a viable solution. And it is a good solution, maybe some containers don’t implement it well yet, but it will come. I hope :)

    If I had that choice, I’d sacrifice a bit of portability and would go with the executor services that I can configure on the app level. And I’d document that and reach out to the operations people managing the deployments to clear things up with them, etc.

    However, that’s a great point, I cannot remember any good resource about concurrency in the enterprise. Or about scaling the enterprise solution for that matter. I’ll look into that, sounds really interesting! Thanks for making me think about it! :)

  • dwaltz

    On the contrary: in EE you respect some simple rules and get concurrency management for free.

  • Maxim

    It is a good article, but I found nothing about the STM (software transactional memory).

  • Igor Spasić

    There is a small logical issue with method #2 (or any other method where number of urls > number of used threads). The number of threads in the pool determine which input urls will be used. Method #2, as it is written above, does not work the same as e.g. #1. The result will come _always_ from one of the first 4 urls, and querying other urls make no sense. But ok, this is just sample code ;)

  • Rob W

    The company I work for is a scala shop, so we stick to actors and futures for managing concurrency. Actors really do a great job of helping you conceptualize your business logic either as a workflow or even as a state machine (Akka has an FSM module, but I haven’t used it). We were recently ramping up the number of concurrent users the site could handle and were seeing futures time out all over the place and services erroring out (at a much lower concurrency level than any of us had anticipated) and when we spent time with our monitoring tools we saw that all of our services were barely touching the CPU (ie. 4 core machines with a load factor barely getting above one). It turns out, we were experiencing thread starvation because of all the blocking code happening inside of the actors, and we weren’t putting those blocking tasks onto a separate thread pool. Additionally, engineers have sprinkled Await() calls on Futures which basically blocks the current thread to wait for the future to complete (so you end up blocking 2 threads (facepalm)). Our interim hack, while we rethink things a bit, has been to jack up the throughput setting within the Actor system’s dispatcher. Essentially, this tells the underlying actor system to keep the actor running on its thread until it processes x messages from its mailbox, the default is 5, we are running it upwards of 1000. The moral of the story is that even when you are using an abstraction like an actor system or a future, you still really need to understand the mechanics of what is going on at the JVM and OS level, because the defaults are little more than guesses. The corollary to that moral is that blocking code mixed into async systems can cause nasty performance problems that are very tricky to debug and actually limit the efficacy of horizontal scaling, which makes everyone very sad.