Value classes have been available since Scala 2.10. The initial proposal, SIP-15, included several use cases for this new mechanism, but in this post I would like to focus on one in particular: the possibility of creating wrappers with no boxing overhead, and how we can leverage this to improve our type system at zero cost.
And in doing that, I’ll also add some numbers to show what “zero cost” exactly means.
The class Wrapper below:
class Wrapper(val flag: Boolean) extends AnyVal
is a value class. Generally speaking, a value class has a single public val parameter and directly extends AnyVal. There is a specific list of criteria that a value class must satisfy in order to be considered as such (for all the details, including universal traits, see SIP-15), but here I would like to focus on the general idea behind the mechanism and its benefits.
In the example above, Wrapper is the type at compile time, but at runtime the representation is a plain boolean. The “free” in the title refers precisely to the fact that, in the common case, a Wrapper instance never actually gets created at runtime: hence no boxing overhead and no extra memory allocation (but I’ll get back to this after an example).
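As a rough way to see this erasure in practice, the sketch below uses Java reflection to inspect the JVM-level signature of a method that takes and returns a Wrapper. The object ErasureDemo and the methods negate and negateSignature are names made up for this illustration; only Wrapper comes from the example above.

```scala
object ErasureDemo {
  // Same value class as in the example above.
  class Wrapper(val flag: Boolean) extends AnyVal

  // Hypothetical helper: in the source code it only deals with Wrapper values.
  def negate(w: Wrapper): Wrapper = new Wrapper(!w.flag)

  // Looks up the compiled `negate` method and returns its JVM-level
  // parameter type and return type.
  def negateSignature: (Class[_], Class[_]) = {
    val m = getClass.getDeclaredMethods.find(_.getName == "negate").get
    (m.getParameterTypes.head, m.getReturnType)
  }

  def main(args: Array[String]): Unit = {
    val (param, result) = negateSignature
    println(s"negate takes $param and returns $result")
  }
}
```

On the JVM, both types should come back as the primitive boolean rather than Wrapper, which is what lets the compiler skip the allocation.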
So far so good… but how could we exploit this feature to our advantage? An example (perhaps a bit clumsy) should illustrate the point. Let’s say we are working on an online auction website and we have an entity for a generic auction item:
final case class Item(
  itemId: Int,
  asked: Double,
  soldPrice: Double
)
The model is pretty simple: an id to uniquely identify an item, the asking price expressed as a Double (although in a real-world scenario BigDecimal[1] would be the right choice) and the price the item has been sold for. And, for no particular reason beyond our curiosity, we also have a method to check whether, overall, the buyers paid more than the asking price (for example, for all the items sold in the last month):
def totalPerformanceBase(list: List[Item]): Double = {
  list.foldLeft(0d)((acc, item) => {
    acc + (item.soldPrice - item.asked)
  })
}
The method is called totalPerformanceBase because we are going to check the performance of this method later and use it as our base reference.
Representing a price as a Double[2] is perfectly fine, but on the other hand we also have bets, shipping costs and quite likely many other domain models which include a price. We could create our own type to specifically represent an Item’s price in order to make our code more robust and prevent common mistakes (passing a price for a bet to an item or vice versa); after all, that’s the advantage of having a type system. The compiler will spot the error immediately, which means fewer runtime bugs, an improved IDE experience and so forth. So we write a wrapper for our price:
final case class Price(price: Double)
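To make the kind of mistake this prevents a bit more concrete, here is a small sketch. ShippingCost and isAboveReserve are hypothetical names invented for the example; only Price comes from this post.

```scala
object TypeSafetyDemo {
  final case class Price(price: Double)

  // Hypothetical second domain type that is also "just a Double" underneath.
  final case class ShippingCost(cost: Double)

  // Hypothetical helper: accepts item prices and nothing else.
  def isAboveReserve(sold: Price, reserve: Price): Boolean =
    sold.price > reserve.price

  def main(args: Array[String]): Unit = {
    println(isAboveReserve(Price(12.5), Price(10.0)))

    // The call below is rejected by the compiler with a type mismatch,
    // because a ShippingCost is not a Price:
    // isAboveReserve(ShippingCost(12.5), Price(10.0))
  }
}
```

Had both parameters been plain Double values, the commented-out mix-up would have compiled without complaint.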
And re-factor the Item class:
final case class Item(
  itemId: Int,
  asked: Price,
  soldPrice: Price
)
Also totalPerformance needs to be refactored accordingly:
def totalPerformance(list: List[Item]): Double = {
  list.foldLeft(0d)((acc, item) => {
    acc + (item.soldPrice.price - item.asked.price)
  })
}
In the graph below, the blue line represents our base implementation with a Double whereas the orange line represents the performance of the version with the wrapper Price.
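If you just want a quick feel for this kind of comparison without setting up a benchmarking library, a hand-rolled timer along the following lines can serve as a starting point. It is only a sketch and not the harness used for the measurements in this post: the timed helper is hypothetical, the two Item variants are renamed to ItemBase and ItemWrapped so that they can live in one file, and timings taken with System.nanoTime on the JVM are noisy.

```scala
object TimingSketch {
  final case class Price(price: Double)

  // Renamed copies of the two Item variants so they can coexist in one file.
  final case class ItemBase(itemId: Int, asked: Double, soldPrice: Double)
  final case class ItemWrapped(itemId: Int, asked: Price, soldPrice: Price)

  def totalPerformanceBase(list: List[ItemBase]): Double =
    list.foldLeft(0d)((acc, item) => acc + (item.soldPrice - item.asked))

  def totalPerformance(list: List[ItemWrapped]): Double =
    list.foldLeft(0d)((acc, item) => acc + (item.soldPrice.price - item.asked.price))

  // Naive timer (hypothetical helper): evaluates `body` a few times to warm
  // up, then returns the result of one more evaluation together with the
  // elapsed time of that evaluation in milliseconds.
  def timed[A](warmups: Int)(body: => A): (A, Double) = {
    (1 to warmups).foreach(_ => body)
    val start = System.nanoTime()
    val result = body
    (result, (System.nanoTime() - start) / 1e6)
  }

  def main(args: Array[String]): Unit = {
    val ids = (1 to 500000).toList
    val base = ids.map(n => ItemBase(n, n.toDouble, n + 1.0))
    val wrapped = ids.map(n => ItemWrapped(n, Price(n.toDouble), Price(n + 1.0)))

    val (sumBase, msBase) = timed(10)(totalPerformanceBase(base))
    val (sumWrapped, msWrapped) = timed(10)(totalPerformance(wrapped))
    println(f"Double: $sumBase%.1f in $msBase%.2f ms")
    println(f"Price:  $sumWrapped%.1f in $msWrapped%.2f ms")
  }
}
```

Treat the printed numbers as indicative only; a proper benchmarking tool handles warm-up, garbage collection and outliers far more carefully.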
You can find the ScalaMeter tests and all the code here, but bear in mind that results might vary based on your hardware. The important point, however, is that introducing Price as a wrapper for extra type safety comes with a small price in terms of performance. When totalPerformance crunches a list of half a million items we can see a difference of a couple of milliseconds[3]. Moreover, because the wrapper Price actually gets instantiated, we can observe a similar behavior in terms of memory usage. Let’s consider, for example, the following methods:
// For the original Item
def instantiate(list: List[Int]): List[Item] = {
  list.map(n => Item(n, n, n + 1.0))
}

// For the Item using Price as wrapper
def instantiate(list: List[Int]): List[Item] = {
  list.map(n => Item(n, Price(n), Price(n + 1.0)))
}
From the graph below we can see how the heap memory usage differs as the number of items progressively increases (in orange the implementation with the Price wrapper, in blue the original implementation with Double).
Just one note: the y-axis is actually measured in Kb (that’s a known issue with ScalaMeter), so when the size is 970000, for instance, the heap usage for the Item with Double is 54222.97 Kb against 93120.00 Kb for the version with the Price wrapper.
Value classes come into play because they allow us to keep the type safety we gain with the wrapper without paying the extra price (both in terms of execution time and memory usage).
We can slightly change the Price definition to make it a value class:
final case class PriceVal(price: Double) extends AnyVal {
  def -(that: PriceVal): PriceVal = PriceVal(this.price - that.price)
  def +(that: PriceVal): PriceVal = PriceVal(this.price + that.price)
}
and change our Item definition to reflect this:
final case class Item(
  itemId: Int,
  asked: PriceVal,
  soldPrice: PriceVal
)
We also need to refactor our methods a bit to accommodate the new Item definition:
def totalPerformance(list: List[Item]): PriceVal = {
  list.foldLeft(PriceVal(0d))((acc, item) => {
    acc + (item.soldPrice - item.asked)
  })
}

def instantiate(list: List[Int]): List[Item] = {
  list.map(n => Item(n, PriceVal(n), PriceVal(n + 1.0)))
}
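Putting the pieces together, a self-contained version that can be run as-is might look like the sketch below. The wrapping object AuctionDemo and the three sample items are made up for the illustration; the definitions themselves follow the ones above.

```scala
object AuctionDemo {
  final case class PriceVal(price: Double) extends AnyVal {
    def -(that: PriceVal): PriceVal = PriceVal(this.price - that.price)
    def +(that: PriceVal): PriceVal = PriceVal(this.price + that.price)
  }

  final case class Item(itemId: Int, asked: PriceVal, soldPrice: PriceVal)

  def totalPerformance(list: List[Item]): PriceVal =
    list.foldLeft(PriceVal(0d))((acc, item) => acc + (item.soldPrice - item.asked))

  def main(args: Array[String]): Unit = {
    // Made-up sample data: two items sold above the asking price, one below.
    val items = List(
      Item(1, PriceVal(10.0), PriceVal(12.0)),
      Item(2, PriceVal(20.0), PriceVal(25.0)),
      Item(3, PriceVal(8.0), PriceVal(7.0))
    )
    println(totalPerformance(items)) // prints PriceVal(6.0)
  }
}
```

Defining + and - on PriceVal keeps the fold free of any .price unwrapping, so the arithmetic stays within the PriceVal type from start to finish.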
Now we can compare the execution time of totalPerformance across all the solutions we came up with:
In orange is the implementation with the wrapper Price
In blue is the original implementation of Item with Double
In green is the one with the value class PriceVal
It is quite evident that there isn’t any significant difference between the base implementation with Double and the one with value classes. On the other hand, the version with a Price wrapper doesn’t perform as efficiently as the others. The outcome is analogous when we examine the results for memory usage:
Once again, there is no difference between the Double and value class implementations, which is why you see only one line: the green and blue lines overlap perfectly.
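One way to convince yourself why the two lines overlap is to look at the field types the JVM actually sees. The sketch below does this with Java reflection; the object RepresentationDemo, the helper askedFieldType and the renamed ItemWrapped and ItemVal classes are assumptions made for the illustration. It also shows a case where a value class does get instantiated: when it is stored in a generic collection.

```scala
object RepresentationDemo {
  final case class Price(price: Double)
  final case class PriceVal(price: Double) extends AnyVal

  // Renamed copies of the two Item variants so they can coexist in one file.
  final case class ItemWrapped(itemId: Int, asked: Price, soldPrice: Price)
  final case class ItemVal(itemId: Int, asked: PriceVal, soldPrice: PriceVal)

  // JVM-level type of the private field that backs `asked` (hypothetical helper).
  def askedFieldType(cls: Class[_]): Class[_] =
    cls.getDeclaredField("asked").getType

  def main(args: Array[String]): Unit = {
    println(askedFieldType(classOf[ItemWrapped])) // a reference to a Price object
    println(askedFieldType(classOf[ItemVal]))     // the primitive double

    // Stored in a generic collection, a PriceVal has to be boxed,
    // so here an actual PriceVal instance exists at runtime.
    val generic: List[Any] = List(PriceVal(1.0))
    println(generic.head.getClass)
  }
}
```

So the “zero cost” holds for the Item fields measured in this post, but it is not unconditional: generic collections, arrays and pattern matching are among the situations where a value class instance is allocated after all.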
In case the two overlapping lines hide a bit too much of what’s happening, the graph below instead compares just the memory usage of the Price wrapper solution (in orange) against the version using the PriceVal value class (in green).
———————————-
1. [Here is a good explanation of why Double is not a wise choice when dealing with money, but for the purpose of this post it is perfectly fine and illustrates the point quite effectively]
2. [Note that in Scala, Double itself is not represented by an object in the underlying runtime system and is equivalent to a Java double primitive]
3. [Whether those few milliseconds actually make a difference largely depends on the context. For the purpose of this post, the only relevant aspect is that there is a difference]