The first time I read about Scala Streams in Functional Programming in Scala I was impressed by the example’s captivating simplicity. I assume a certain familiarity with the idea of a Stream but just as a quick recap if that is not the case, the original example in the book was:
List(1,2,3,4).map(_ + 10).filter(_%2==0).map(_ * 3)
Which gets evaluated, step by step, as:
List(11,12,13,14).filter(_%2==0).map(_ * 3) List(12,14).map(_ * 3) List(36,42)
Whereas a Stream version
Stream(1,2,3,4).map(_ + 10).filter(_%2==0).map(_ * 3).toList1
because of its laziness, will evaluate in the following way:
Stream(1,2,3,4).map(_ + 10).filter(_ % 2 == 0).toList cons(11, Stream(2,3,4).map(_ + 10)).filter(_ % 2 == 0).toList Stream(2,3,4).map(_ + 10).filter(_ % 2 == 0).toList cons(12, Stream(3,4).map(_ + 10)).filter(_ % 2 == 0).toList 12 :: Stream(3,4).map(_ + 10).filter(_ % 2 == 0).toList 12 :: cons(13, Stream(4).map(_ + 10)).filter(_ % 2 == 0).toList 12 :: Stream(4).map(_ + 10).filter(_ % 2 == 0).toList 12 :: cons(14, Stream().map(_ + 10)).filter(_ % 2 == 0).toList 12 :: 14 :: Stream().map(_ + 10).filter(_ % 2 == 0).toList 12 :: 14 :: List()2
Note that no intermediate results like List(11,12,13,14)
or List(12,14)
get instantiated in this case. It looks like there is simply less work to do but this impression of efficiency might be misleading. A simple micro benchmark reveals how the Stream perform worse than the List:
testOriginalExampleWithList thrpt 20 4368200.644 ± 155109.000 ops/s testOriginalExampleWithStream thrpt 20 2881809.414 ± 122061.743 ops/s
The reason has to be tracked down in the Stream
lazy nature along with its immutability as it belongs to the Scala immutable collection but at the same time its tail is not “real” yet (sort to speak) and it will be evaluated when required. On that respect the tail resemble a rule or a set of instructions to materialise its elements. The marriage between these two aspects of a Stream requires some synchronisation which is the toll to pay (and the reason behind the different performance above):
/** A lazy cons cell, from which streams are built. */ @SerialVersionUID(-602202424901551803L) final class Cons[+A](hd: A, tl: => Stream[A]) extends Stream[A] { override def isEmpty = false override def head = hd @volatile private[this] var tlVal: Stream[A] = _ @volatile private[this] var tlGen = tl _ def tailDefined: Boolean = tlGen eq null override def tail: Stream[A] = { if (!tailDefined) synchronized { if (!tailDefined) { tlVal = tlGen() tlGen = null } } tlVal } }3
So, why use a Stream after all? As the title (and the Scala doc) suggest, there are occasions where a lazy sequence comes handy. One well known example is the possibility to define “infinite” sequences like the Fibonacci one in the documentation. An other aspect, and potentially an advantage in some situation, is the generally low memory footprint of a Stream (remember, the intermediate sequences, like List(11,12,13,14)
and List(12,14)
in the example, don’t get instantiated).
The following terminate successfully on my machine with roughly 4GB of heap memory4:
Stream.fill(600)("Some random sample text.") .map(_ * 500000 + "test") .zipWithIndex.filter(_._2 % 2 == 0) .map(_._1.contains("test"))
Whereas its List equivalent throws a
java.lang.OutOfMemoryError: Java heap space
.The above case is build purely with the intent to trigger the exception but compare this:
final case class Wrapper(index: Int, text: String) val listInput: List[Int] = (1 to 500).toList listInput .map(Wrapper(_, "Some simple text")) .map(_.copy(text = "some simple text" * 500)) .map(w => w.copy(text = w.text + "look for me!")) .filter(_.index % 2 == 0) .map(_.text.toLowerCase) .count(_.contains("look for me!"))
with this one:
val streamInput: Stream[Int] = (1 to 500).toStream streamInput .map(Wrapper(_, "Some simple text")) .map(_.copy(text = "some simple text" * 500)) .map(w => w.copy(text = w.text + "look for me!")) .filter(_.index % 2 == 0) .map(_.text.toLowerCase) .count(_.contains("look for me!"))
Both complete successfully but the Stream version in this case is performing better than the List one:
List thrpt 15 76.588 ± 2.734 ops/s Stream thrpt 15 80.246 ± 2.260 ops/s
Flight Recorder shows what is happening under the hood: the first example with List generates more intermediate results that then increase the amount of work for the garbage collector:
Whereas in the Stream case the garbage collector is definitely less busy and less CPU resources are needed in order to free the heap.
In this last case, the amount of extra work on the garbage collector was sufficient to tip the balance in favor of a lazy sequence. Those cases are probably the exceptions rather than the rule but I guess the lesson is that wherever there are long chain of transformations on non lazy collection and/or the suspect that those might involve extra work for the GC, then perhaps the idea to test a lazy alternative should be taken into consideration.
All code and tests available here.
———————————
1. [a Stream is lazy hence we need to turn it into a List to trigger the evaluation]↩
2. [Chiusano P., Bjarnason R. Functional Programming in Scala. Manning, 2014. p.72]↩
3. [The code comes from the definition of the Stream class.]↩
4. [All the numbers regarding performance and potential out of memory errors or general memory usage, depends obviously on the specific hardware and configuration in use (in my case an old laptop and not much heap assigned) but I think the general idea behind those tests, rather than the specific numbers, still stands.]↩