Scala Stream: There’s no reason to use it if you don’t actually need one.

The first time I read about Scala Streams in Functional Programming in Scala I was impressed by the example’s captivating simplicity. I assume a certain familiarity with the idea of a Stream but just as a quick recap if that is not the case, the original example in the book was:

List(1,2,3,4).map(_ + 10).filter(_%2==0).map(_ * 3)

Which gets evaluated, step by step, as:

List(11,12,13,14).filter(_%2==0).map(_ * 3)
List(12,14).map(_ * 3)
List(36,42)

Whereas a Stream version

Stream(1,2,3,4).map(_ + 10).filter(_%2==0).map(_ * 3).toList1

because of its laziness, will evaluate in the following way:

Stream(1,2,3,4).map(_ + 10).filter(_ % 2 == 0).toList
cons(11, Stream(2,3,4).map(_ + 10)).filter(_ % 2 == 0).toList
Stream(2,3,4).map(_ + 10).filter(_ % 2 == 0).toList
cons(12, Stream(3,4).map(_ + 10)).filter(_ % 2 == 0).toList
12 :: Stream(3,4).map(_ + 10).filter(_ % 2 == 0).toList
12 :: cons(13, Stream(4).map(_ + 10)).filter(_ % 2 == 0).toList
12 :: Stream(4).map(_ + 10).filter(_ % 2 == 0).toList
12 :: cons(14, Stream().map(_ + 10)).filter(_ % 2 == 0).toList
12 :: 14 :: Stream().map(_ + 10).filter(_ % 2 == 0).toList
12 :: 14 :: List()2

Note that no intermediate results like List(11,12,13,14) or List(12,14) get instantiated in this case. It looks like there is simply less work to do but this impression of efficiency might be misleading. A simple micro benchmark reveals how the Stream perform worse than the List:

testOriginalExampleWithList    thrpt   20  4368200.644 ± 155109.000  ops/s
testOriginalExampleWithStream  thrpt   20  2881809.414 ± 122061.743  ops/s

The reason has to be tracked down in the Stream lazy nature along with its immutability as it belongs to the Scala immutable collection but at the same time its tail is not “real” yet (sort to speak) and it will be evaluated when required. On that respect the tail resemble a rule or a set of instructions to materialise its elements. The marriage between these two aspects of a Stream requires some synchronisation which is the toll to pay (and the reason behind the different performance above):

/** A lazy cons cell, from which streams are built. */
@SerialVersionUID(-602202424901551803L)
final class Cons[+A](hd: A, tl: => Stream[A]) extends Stream[A] {
  override def isEmpty = false
  override def head = hd
  @volatile private[this] var tlVal: Stream[A] = _
  @volatile private[this] var tlGen = tl _
  def tailDefined: Boolean = tlGen eq null
  override def tail: Stream[A] = {
    if (!tailDefined)
      synchronized {
        if (!tailDefined) {
          tlVal = tlGen()
          tlGen = null
        }
      }

    tlVal
  }
}3

So, why use a Stream after all? As the title (and the Scala doc) suggest, there are occasions where a lazy sequence comes handy. One well known example is the possibility to define “infinite” sequences like the Fibonacci one in the documentation. An other aspect, and potentially an advantage in some situation, is the generally low memory footprint of a Stream (remember, the intermediate sequences, like List(11,12,13,14) and List(12,14) in the example, don’t get instantiated).

The following terminate successfully on my machine with roughly 4GB of heap memory4:

Stream.fill(600)("Some random sample text.")
      .map(_ * 500000 + "test")
      .zipWithIndex.filter(_._2 % 2 == 0)
      .map(_._1.contains("test"))

Whereas its List equivalent throws a

java.lang.OutOfMemoryError: Java heap space

.The above case is build purely with the intent to trigger the exception but compare this:

final case class Wrapper(index: Int, text: String)

val listInput: List[Int] = (1 to 500).toList

listInput
      .map(Wrapper(_, "Some simple text"))
      .map(_.copy(text = "some simple text" * 500))
      .map(w => w.copy(text = w.text + "look for me!"))
      .filter(_.index % 2 == 0)
      .map(_.text.toLowerCase)
      .count(_.contains("look for me!"))

with this one:

val streamInput: Stream[Int] = (1 to 500).toStream

streamInput
      .map(Wrapper(_, "Some simple text"))
      .map(_.copy(text = "some simple text" * 500))
      .map(w => w.copy(text = w.text + "look for me!"))
      .filter(_.index % 2 == 0)
      .map(_.text.toLowerCase)
      .count(_.contains("look for me!"))

Both complete successfully but the Stream version in this case is performing better than the List one:

List    thrpt   15  76.588 ± 2.734  ops/s
Stream  thrpt   15  80.246 ± 2.260  ops/s

Flight Recorder shows what is happening under the hood: the first example with List generates more intermediate results that then increase the amount of work for the garbage collector:

gc-list

Whereas in the Stream case the garbage collector is definitely less busy and less CPU resources are needed in order to free the heap.

gc-stream

In this last case, the amount of extra work on the garbage collector was sufficient to tip the balance in favor of a lazy sequence. Those cases are probably the exceptions rather than the rule but I guess the lesson is that wherever there are long chain of transformations on non lazy collection and/or the suspect that those might involve extra work for the GC, then perhaps the idea to test a lazy alternative should be taken into consideration.

All code and tests available here.

———————————

1. [a Stream is lazy hence we need to turn it into a List to trigger the evaluation]
2. [Chiusano P., Bjarnason R. Functional Programming in Scala. Manning, 2014. p.72]
3. [The code comes from the definition of the Stream class.]
4. [All the numbers regarding performance and potential out of memory errors or general memory usage, depends obviously on the specific hardware and configuration in use (in my case an old laptop and not much heap assigned) but I think the general idea behind those tests, rather than the specific numbers, still stands.]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s