Streams: The Real Powerhouse in Java 8 by Venkat Subramaniam

Welcome to a new year of wonderful online presentations about software engineering, Java, best practices, tools and technology choices. All of that is, naturally, Virtual JUG: the online Java User Group that brings you the best technical sessions from all over the world.

The first session in 2016 was a real treat! None other than the man himself, Dr. Venkat Subramaniam, delivered a fast-paced, astonishing presentation: Streams: The Real Powerhouse in Java 8. This was Venkat’s second time on Virtual JUG; the first time, he spoke about creating reactive applications.

This session, however, was all about Java 8 streams. What do they represent, what are the common code patterns around streams, and how do you get the most out of them in your codebase?

Just click on the video below and enjoy the session.

Streams: The Real Powerhouse in Java 8

The official Java 8 release came with a myriad of features, the most prominent of which are undoubtedly lambdas and the streams API. Many projects upgraded to Java 8 just to leverage the sweet lambda syntax, or because existing frameworks updated themselves to use it. Streams are no less important.

The whole idea of streams is to enable functional-style operations on sequences of elements. A stream is an abstraction, not a data structure: it is not a collection where you can store elements. The most important difference between a stream and a data structure is that a stream doesn’t hold the data. For example, you cannot point to a location in the stream where a certain element exists; you can only specify the functions that operate on the data. In other words, a stream is an abstraction of an immutable sequence of functions applied, in order, to the data.

A stream represents a pipeline through which the data will flow and the functions to operate on the data. Here’s an example where a stream is used as a fancy iterator:

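// needs: import static java.util.stream.Collectors.toList;
List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
List<Integer> result = numbers.stream()
  .filter(e -> (e % 2) == 0)  // keep only the even values
  .map(e -> e * 2)            // double each remaining value
  .collect(toList());         // result is [4, 8]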

In this example we select only the even values with the filter method and double them by mapping each one through a function that doubles its input. What does this give us? The streams API lets us specify a sequence of operations on the data as individual steps. We don’t write any conditional processing code, we are not tempted to write large, complex functions, and we don’t have to manage the data flow. In fact, we only concern ourselves with one data processing step at a time: we compose the functions, and the streams framework moves the data through them.

The example above shows one of the most important patterns you’ll end up using with streams:

  • Raise a collection to a stream
  • Ride the stream: filter values, transform values, limit the output
  • Compose small individual operations
  • Collect the result back into a concrete collection
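As a rough sketch of these four steps, here is a small pipeline placed next to an imperative loop that does the same job. The class and method names (PipelinePattern, firstDoubledEvens, firstDoubledEvensLoop) are made up for this illustration and do not come from the session.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelinePattern {

    // Stream version: raise, ride (filter, transform, limit), collect.
    static List<Integer> firstDoubledEvens(List<Integer> numbers, int max) {
        return numbers.stream()              // raise the collection to a stream
            .filter(e -> e % 2 == 0)         // keep the even values
            .map(e -> e * 2)                 // transform each value
            .limit(max)                      // limit the output
            .collect(Collectors.toList());   // collect into a concrete list
    }

    // Imperative version of the same logic, for comparison.
    static List<Integer> firstDoubledEvensLoop(List<Integer> numbers, int max) {
        List<Integer> result = new ArrayList<>();
        for (int e : numbers) {
            if (result.size() >= max) {
                break;
            }
            if (e % 2 == 0) {
                result.add(e * 2);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(firstDoubledEvens(numbers, 2));     // [4, 8]
        System.out.println(firstDoubledEvensLoop(numbers, 2)); // [4, 8]
    }
}
```

In the loop, the stop condition, the test and the transformation are tangled together; in the pipeline, each of them is a separate step.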

Common operations on streams

In Java 8 you can easily obtain a stream from any collection by calling the stream() method. After that, there are a few fundamental functions that you’ll encounter all the time.

  • Filter returns a new stream that contains some of the elements of the original. It accepts a predicate that decides which elements should be kept in the new stream and drops the rest. In imperative code we would use conditional logic to specify what should happen if an element satisfies the condition. In the functional style we don’t bother with ifs: we filter the stream and work only on the values we require.
  • Map transforms the stream elements into something else. It accepts a function to apply to each and every element of the stream and returns a stream of the values that function produced. This is the bread and butter of the streams API: map allows you to perform a computation on the data inside a stream.
  • Reduce (also sometimes called a fold) performs a reduction of the stream to a single element. If you want to sum all the integer values in the stream, reduce is the function to use. If you want to find the maximum in the stream, reduce is your friend too.
  • Collect is the way to get out of the streams world and obtain a concrete collection of values, like a list in the example above.
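The four operations above can be sketched in a few lines. This is only an illustration: the class name StreamOperations and its helper methods are invented here, while reduce, filter, map and collect are the standard java.util.stream.Stream methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

class StreamOperations {

    // reduce with an identity value: the sum of all elements (0 for an empty list).
    static int sum(List<Integer> values) {
        return values.stream().reduce(0, Integer::sum);
    }

    // reduce without an identity returns an Optional, because the stream may be empty.
    static Optional<Integer> max(List<Integer> values) {
        return values.stream().reduce(Integer::max);
    }

    // filter + map + collect: the squares of the odd values.
    static List<Integer> oddSquares(List<Integer> values) {
        return values.stream()
            .filter(e -> e % 2 != 0)
            .map(e -> e * e)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(3, 1, 4, 1, 5);
        System.out.println(sum(values));        // 14
        System.out.println(max(values).get());  // 5
        System.out.println(oddSquares(values)); // [9, 1, 1, 25]
    }
}
```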

Of course you won’t use all of these functions every time you encounter a stream, but you have them available to use at will.

There are some caveats to using the streams API, though, and Venkat showed us a great example of stream processing getting a tad out of hand. Imagine we have the following class Person:

class Person { 
   Gender gender; String name; 
   public Gender getGender() { return gender; }
   public String getName() { return name; }
}
enum Gender { MALE, FEMALE, OTHER }

This is a typical Java bean with some getters on the fields. Now, suppose we have a list of these persons and want to get the list of uppercase names of all the “FEMALE” people in that list.

Easy you say, right?

List<String> names = new ArrayList<String>(); 
List<Person> people = …
people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase)
  .forEach(name -> names.add(name)); 

The code feels natural: we just follow the specification of what we have to do at every step. The problem, though, is the mutation of shared state. We know nothing about the nature of the stream in our hands, and if the stream is parallel, concurrently adding elements to the names list (an ArrayList, which is not thread-safe) can lead to errors.

Instead, we should collect the stream into the resulting list and leave concurrency and mutability concerns to the streams framework. Here’s how to do so:

List<Person> people = …
List<String> names = people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase)
  .collect(Collectors.toList()); 
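To make the point concrete, here is a self-contained sketch that runs the same pipeline on a parallel stream. The constructor on Person, the class name CollectInsteadOfMutate and the sample names are assumptions added so the example can run; they were not part of the session’s code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectInsteadOfMutate {

    enum Gender { MALE, FEMALE, OTHER }

    static class Person {
        private final String name;
        private final Gender gender;

        // Constructor added for this sketch; the article's Person does not show one.
        Person(String name, Gender gender) {
            this.name = name;
            this.gender = gender;
        }

        Gender getGender() { return gender; }
        String getName() { return name; }
    }

    // No shared mutable list: the collector builds the result,
    // even when the stream is processed in parallel.
    static List<String> upperCaseFemaleNames(List<Person> people) {
        return people.parallelStream()
            .filter(p -> p.getGender() == Gender.FEMALE)
            .map(Person::getName)
            .map(String::toUpperCase)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = Arrays.asList(
            new Person("Sara", Gender.FEMALE),
            new Person("Bob", Gender.MALE),
            new Person("Paula", Gender.FEMALE));
        System.out.println(upperCaseFemaleNames(people)); // [SARA, PAULA]
    }
}
```

Because the source list is ordered, the collected list keeps the encounter order even though the work may be split across threads.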

In general, the Collectors class provides almost all the primitives you need to transform a stream into a concrete collection. One of the examples Venkat showed was the toMap() collector. You might wonder how an element can be transformed into the key-value pair required for a map. Easy: you specify one function that turns the element into the key and another function that creates the value. Here’s an example that collects the same stream of people into a map:

List<Person> people = …
Map<String, Person> names = people.stream()
  .collect(Collectors.toMap(p -> p.getName(), p -> p)); 

The first function given to the toMap method transforms the element into the key, and the second transforms it into the value stored under that key.
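One detail worth knowing, sketched below with plain strings instead of Person objects to keep it short: the two-argument toMap throws an IllegalStateException when two elements produce the same key, and a three-argument overload accepts a merge function to resolve such collisions. The class and method names (ToMapSketch, lengthByWord, countByWord) are invented for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class ToMapSketch {

    // Key: the word itself; value: its length.
    // Throws IllegalStateException if the same word appears twice.
    static Map<String, Integer> lengthByWord(List<String> words) {
        return words.stream()
            .collect(Collectors.toMap(Function.identity(), String::length));
    }

    // The third argument merges values that land on the same key;
    // here it adds them up, which counts the occurrences of each word.
    static Map<String, Integer> countByWord(List<String> words) {
        return words.stream()
            .collect(Collectors.toMap(Function.identity(), w -> 1, Integer::sum));
    }

    public static void main(String[] args) {
        System.out.println(lengthByWord(Arrays.asList("map", "filter")).get("filter")); // 6
        System.out.println(countByWord(Arrays.asList("a", "b", "a")).get("a"));         // 2
    }
}
```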

Intermediate and terminal operations

One of the virtues of streams is that they are lazily evaluated. Operations that return another stream, such as filter and map, are called intermediate. They are not evaluated when they are specified; instead, the computation happens only when its result is needed.

This means that if we just specify the code like:

Stream<String> names = people.stream()
  .filter(p -> p.getGender() == Gender.FEMALE)
  .map(Person::getName)
  .map(String::toUpperCase);

None of the names will be collected or converted to upper case immediately. When does the computation occur, you might ask? When a terminal operation is called. Operations that return something other than a stream, such as forEach, collect and reduce, are terminal. Because work is done only on demand, streams can be particularly efficient at handling large amounts of data.
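Laziness is easy to observe with a counter. The sketch below is an illustration only (the LazyEvaluation class and its elementsInspected method are hypothetical names), and counting inside a lambda is done here purely for demonstration on a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

class LazyEvaluation {

    // Returns how many elements the filter looked at before findFirst() was satisfied.
    static int elementsInspected(List<Integer> numbers, int threshold) {
        AtomicInteger inspected = new AtomicInteger();

        Stream<Integer> pipeline = numbers.stream()
            .filter(e -> {
                inspected.incrementAndGet();
                return e > threshold;
            })
            .map(e -> e * 2);

        // Only intermediate operations have been specified, so nothing has run yet.
        if (inspected.get() != 0) {
            throw new IllegalStateException("pipeline ran before a terminal operation");
        }

        // The terminal operation pulls elements through and stops at the first match.
        pipeline.findFirst();
        return inspected.get();
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println(elementsInspected(numbers, 3)); // 4
    }
}
```

Only the elements 1 through 4 are inspected; the remaining six never enter the pipeline.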

On top of that, you can always try to parallelize the processing by calling the parallel() method to convert the stream into a parallel stream. Note that the method is always there, even if the underlying source does not parallelize well, so whether you get a performance benefit depends on the nature of the stream. There are also pitfalls to running every stream operation in parallel: by default, parallel streams run their work on the common ForkJoinPool, which is shared across the whole JVM. Thus you can easily make one particular stream a bit faster while sacrificing the performance of everything else in the JVM without even realizing it!
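A minimal sketch of switching a pipeline to parallel execution follows. The ParallelSketch class and sumOfSquares method are illustrative names, and the example makes no claim about speed: it only shows that the result is the same and how to check the size of the shared common pool.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class ParallelSketch {

    // Same computation, sequential or parallel depending on the flag.
    static long sumOfSquares(int n, boolean parallel) {
        LongStream range = LongStream.rangeClosed(1, n);
        if (parallel) {
            range = range.parallel(); // intermediate operation: marks the stream as parallel
        }
        return range.map(x -> x * x).sum();
    }

    public static void main(String[] args) {
        // The common pool is shared by parallel streams across the JVM.
        System.out.println("common pool parallelism: "
            + ForkJoinPool.commonPool().getParallelism());

        System.out.println(sumOfSquares(10, false)); // 385
        System.out.println(sumOfSquares(10, true));  // 385
    }
}
```

Whether the parallel variant is actually faster depends on the data size, the cost of each step and what else is competing for the common pool, so it is worth measuring before committing to it.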

Naturally, solving problems with functional programming requires a different way of thinking, but with a bit of experimentation you can definitely get a handle on it. You will often struggle to come up with a functional solution, but once you get it, you realize that it’s not particularly complicated. And the next time, solving a similar problem will be much easier.

Resources

One of the best resources to learn about streams is, surprisingly, the javadoc of the java.util.stream package. It will guide you through the common stream idioms, explain how streams are lazily evaluated, the difference between intermediate and terminal stream operations, the possibilities for parallelizing a stream, and so on.

Agile Learner is a website where you can access many more presentations by Venkat. It is a commercial resource, so you’d be required to purchase access to the videos. But it might totally be worth it!

We have previously published a one-page cheat sheet that covers the best practices of using lambdas, streams and Optionals in Java 8. Print it out and keep it handy near your desk to remember some typical idioms.

RebelLabs Interview with Virtual JUG speaker

After the session, Venkat joined me for our regular interview with the Virtual JUG speaker. Besides asking him about his favorite JVM language and the most terrible habit a programmer can have, we discussed one real-life experiment on the readability of imperative code vs. code written in the functional style. Venkat shares his opinion on the adoption of Java 8 and how he thinks teams should go about adding Java 8 streams to their code.





  • Cedric Chaveriat

    This is why everybody should use an IDE

    filter(e – e % 2 == e)

    filter(e -> e % 2 == 0))

    List names = people.stream()
    .collect(Collectors.toMap(p -> p.getName(), p -> p));

    Map cache = people.stream()
    .collect(Collectors.toMap(p -> p.getName(), p -> p));

    List names = people.stream()
    .filter(p -> p.getGender() == “Female”)
    .map(Person::getName)
    .map(String::toUpperCase);

    List names = people.stream()
    .filter(p -> p.getGender() == “Female”)
    .map(Person::getName)
    .map(String::toUpperCase)
    .collect(Collectors.toList());

  • Oleg Šelajev

    Thank you, I fixed the code samples.

  • Kennedy Oliveira

    Also, in the example:

    List names = people.stream()
    .filter(p -> p.getGender() == “Female”)
    .map(Person::getName)
    .map(String::toUpperCase);

    it should be

    List names = people.stream()
    .filter(p -> “Female”.equals(p.getGender()))
    .map(Person::getName)
    .map(String::toUpperCase);

    Strings should be compared with equals and not ‘==’