Java Parallel Streams Are Bad for Your Health!

This post continues the series that started with sneaky default methods in interfaces, which, when used unwisely, can turn your application into a code mess that you don’t want to look at.

As we claimed previously, Java 8 delivers three major features everyone is eager to use: lambdas, the stream API and default methods in interfaces. Sadly, all of them can easily be abused and can actually be a detriment to your code if you add them to your toolbelt without care.

Today, we will look at the stream API, specifically parallel streams. If you want to get an overview of the pitfalls that await you with serial stream processing, check out this post on the jOOQ blog by Lukas Eder.

But for now let’s focus on the parallel execution that the stream API is praised for. Allegedly, it might speed up some tasks your application executes by utilizing multiple threads from the default ForkJoinPool.

Mousetrap of parallel streams

Here’s a classic example of the awesomeness that parallel streams promise you. In this example we want to query multiple search engines and return the output from the first to reply.

public static String query(String question) {
    List<String> engines = Arrays.asList(
        "http://www.google.com/?q=",
        "http://duckduckgo.com/?q=",
        "http://www.bing.com/search?q=");
    // get an element as soon as it is available
    Optional<String> result = engines.stream().parallel().map(base -> {
        String url = base + question;
        // open a connection and fetch the result (this call blocks)
        return WS.url(url).get();
    }).findAny();
    return result.get();
}

Nice, isn’t it? But let’s dig a bit deeper and check what happens in the background. Parallel streams are processed by the thread that invoked the operation and, additionally, by the threads of the JVM’s default fork/join pool: ForkJoinPool.commonPool().
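To see this for yourself, here is a minimal sketch that records which threads execute the lambda of a parallel stream. The class and method names (StreamThreads, workerNames) are made up for illustration, and the exact worker names may differ between JDK versions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

class StreamThreads {
    // Runs a small parallel stream and records the name of every thread
    // that executed the lambda.
    static Set<String> workerNames() {
        Set<String> names = ConcurrentHashMap.newKeySet();
        IntStream.range(0, 1_000).parallel()
                 .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("caller: " + Thread.currentThread().getName());
        // Typically prints the caller plus some common pool workers.
        workerNames().forEach(name -> System.out.println("ran on: " + name));
    }
}
```

On a multi-core machine you would normally expect to see the calling thread alongside a handful of common pool workers, which is exactly the shared resource the rest of this post worries about.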

However, an important aspect to notice here is that querying a search engine is a blocking operation. So at some point every worker thread will call the get() operation and sit right there waiting for the results to come back.

Hang on, isn’t this what we wanted in the first place? Instead of going through the list and waiting for each url to respond sequentially, we wait on all of the responses at the same time. Saving your time, just like using JRebel does (sorry couldn’t resist :-) ).

However, one side effect of such parallel waiting is that, instead of just the main thread waiting, ForkJoinPool workers are waiting too. And given the current ForkJoinPool implementation, which doesn’t compensate for workers that are stuck waiting by spawning fresh ones, at some point all the threads in ForkJoinPool.commonPool() will be exhausted.

This means that the next time you call the query method above while any other parallel stream processing is running, the performance of the second task will suffer!

However, don’t rush to blame the ForkJoinPool implementation. In a different use case you’d be able to give it a ManagedBlocker instance and ensure that it knows when to compensate for workers stuck in a blocking call, and get your scalability back.
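For the curious, here is a rough sketch of what wrapping a blocking call in a ManagedBlocker could look like. ForkJoinPool.ManagedBlocker and ForkJoinPool.managedBlock are real JDK APIs; the BlockingCall class and its run helper are hypothetical names invented for this example, and whether the pool actually adds a compensating worker is up to the ForkJoinPool implementation.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.function.Supplier;

// Hypothetical helper: wraps one blocking call so the pool is told about it.
class BlockingCall<T> implements ForkJoinPool.ManagedBlocker {
    private final Supplier<T> call;
    private T result;
    private boolean done;

    BlockingCall(Supplier<T> call) {
        this.call = call;
    }

    @Override
    public boolean block() {
        if (!done) {
            result = call.get(); // the potentially long, blocking operation
            done = true;
        }
        return true; // true means no further blocking is necessary
    }

    @Override
    public boolean isReleasable() {
        return done;
    }

    // Runs the supplier via managedBlock, so that a ForkJoinPool worker
    // calling this gives the pool a chance to compensate.
    static <T> T run(Supplier<T> call) {
        BlockingCall<T> blocker = new BlockingCall<>(call);
        try {
            ForkJoinPool.managedBlock(blocker);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return blocker.result;
    }

    public static void main(String[] args) {
        // Thread.sleep stands in for a slow network request.
        String answer = BlockingCall.run(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "response";
        });
        System.out.println(answer);
    }
}
```

Inside a mapping lambda, the blocking request would then be passed to BlockingCall.run instead of being called directly. Treat this as an illustration of the mechanism rather than a recommendation to do blocking I/O in parallel streams.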

Now, the interesting bit is that it doesn’t take a parallel stream with blocking calls to stall the performance of your system. Any long-running function used to map over a collection can produce the same issue.

Consider this example:

long a = IntStream.range(0, 100).parallel().mapToLong(x -> {
    // deliberately slow body: keeps the worker busy for a long time
    for (int i = 0; i < 100_000_000; i++) {
        System.out.println("X:" + i);
    }
    return x;
}).sum();

This code suffers from the same problem as our networking attempt. No lambda execution is instantaneous, and during all that time the workers won’t be available to other components of the system.

This means that any system that relies on parallel streams can have unpredictable latency spikes when someone else occupies the common ForkJoin pool.

So what, I’m the boss in my program anyway, right?

Indeed if you’re creating an otherwise single-threaded program and know exactly when you intend to use parallel streams, then you might think that this issue is kinda superficial. However, many of us deal with web applications, various frameworks and heavy application servers.

How can a server that is designed to be a host for multiple independent applications, that do who knows what, offer you a predictable parallel stream performance if it doesn’t control the inputs?

One way to do this is to limit the parallelism that the ForkJoinPool offers you. You can do it yourself by supplying the JVM option -Djava.util.concurrent.ForkJoinPool.common.parallelism=1, so that the pool size is limited to one and no gain from parallelization can tempt you into using it incorrectly.
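If you want to check what your JVM actually ended up with, a small sketch like the following reports the target parallelism of the common pool. The class name CommonPoolInfo is invented for the example; ForkJoinPool.getCommonPoolParallelism() is the JDK call doing the work.

```java
import java.util.concurrent.ForkJoinPool;

class CommonPoolInfo {
    // Returns the target parallelism of the common pool for this JVM.
    static int commonParallelism() {
        return ForkJoinPool.getCommonPoolParallelism();
    }

    public static void main(String[] args) {
        System.out.println("available processors: "
            + Runtime.getRuntime().availableProcessors());
        System.out.println("common pool parallelism: " + commonParallelism());
    }
}
```

Run it once as is and once with -Djava.util.concurrent.ForkJoinPool.common.parallelism=1 on the command line to compare. Note that the property has to be set before the common pool is first used.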

Alternatively, a parallelStream() implementation that would accept a ForkJoinPool to run on might be a workaround for that. Unfortunately it is not currently offered by the JDK.
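A trick you will often see suggested instead is to start the parallel stream from inside a task submitted to your own ForkJoinPool, in which case the stream’s subtasks tend to stay in that pool. Be warned that this relies on an implementation detail rather than on anything the stream API promises, so take the sketch below with a grain of salt. The DedicatedPool class, its method and the pool size are all made up for illustration.

```java
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class DedicatedPool {
    // Names of the threads that processed stream elements, kept for inspection.
    static final Set<String> THREADS = ConcurrentHashMap.newKeySet();

    // Sums 1..upTo with a parallel stream started from inside a private pool.
    static long sumInOwnPool(int parallelism, long upTo)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            Callable<Long> task = () -> LongStream.rangeClosed(1, upTo)
                .parallel()
                .peek(x -> THREADS.add(Thread.currentThread().getName()))
                .sum();
            // The terminal operation runs on a worker of our own pool.
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum: " + sumInOwnPool(4, 1_000));
        THREADS.forEach(name -> System.out.println("ran on: " + name));
    }
}
```

The upside is that a slow lambda only starves your own pool. The downside is that you now own the pool’s lifecycle and are betting on behaviour that is not part of the documented contract.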

Moral of the story

Parallel streams are unpredictable and complex to use correctly. Almost any use of parallel streams can affect the performance of other, unrelated system components in an unpredictable way. I have no doubt that there are people who can manage to use them to their benefit, clearly and correctly. However, I’d think twice before typing stream.parallel() into my code and would look twice when reviewing code that contains it.

Do you think I’m overdramatizing the issue? Leave a comment or find me on Twitter: @shelajev.