Scala Type Classes and Family Resemblances

This post is not a general introduction to type classes in Scala. There are quite a few good posts already on the subject (here, here and here for example), but it is about type classes and the somewhat-cryptic title require a brief explanation. In Scala with Cats, the authors describe a type class as a programming pattern that “allow us to extend existing libraries with new functionality, without using traditional inheritance, and without altering the original library source code.” Although that is not the only possible definition, it’s a perfect starting point for what I would like to cover. Can type classes help us at “removing” old functionalities in the same way? And how and why we might end up in such a situation? Let’s try to model a simple type for a bird to see how the issue could arise:

sealed trait Bird {
  def fly: Unit
}

In a OOP approach we could proceed adding a few implementations, a parrot for example:

sealed trait Bird {
  def fly: Unit
}

class Parrot(name: String) extends Bird {
  override def fly: Unit = println(s"I'm a parrot and I can fly. My name is $name")
}

And perhaps a few other birds until we get to penguins. Penguins are birds, but there is a problem: penguins don’t fly.

sealed trait Bird {
  def fly: Unit
}

class Penguin(name: String) extends Bird {
  override def fly: Unit = ??? // throw new UnsupportedOperationException("Sorry, can't fly")
}

A solution might be to throw an UnsupportedOperationException but not an optimal one you might argue: catching those problems at compile time would be better. We probably were a bit rushed to add a fly method as part of the Bird trait. Not every birds flies and penguins are not the only exception. But was it just an oversight or there is more?

Ludwig Wittgenstein introduced in his Philosophical Investigations the the idea of family resemblance to describe certain concepts in our language for which we don’t have a clear set of traits shared across all members but more a general overlapping mesh of these features like in a large family picture where we can recognize each single member as part of the family and the common traits (hair color, shape of the nose, etc.) spread across them but where no definite set of those traits is shared by every single member.

Back to our problem, “flying” is such a common characteristic of birds that is almost understandable why that def fly slipped within the Bird trait.1
But perhaps there is something that a type class approach could do to mitigate those issues (all code available here2). Assume we have the same birds as above:

sealed trait Bird // no fly method now

class Parrot(val name: String) extends Bird
class Penguin(val name: String) extends Bird

And proceed defining the various components of our type classes. First the traits/behavior (note that there is no mention of Bird in them)

trait Fly[A] {
  def fly(a: A): Unit
}

and

// for penguins
trait Swim[A] {
  def swim(a: A): Unit
}
// for parrots
trait Talk[A] {
  def talk(a: A): Unit
}

and then two instances implementing the above traits (one for a penguin and the other for a parrot):

implicit val parrotFly: Fly[Parrot] = new Fly[Parrot] {
    def fly(parrot: Parrot): Unit = {
      println(s"I'm a parrot and I can fly. My name is ${parrot.name}")
    }
  }

implicit val parrotTalk: Talk[Parrot] = new Talk[Parrot] {
    def talk(parrot: Parrot): Unit = {
      println(s"I'm a parrot and I can talk. My name is ${parrot.name}")
    }
  }

  implicit val penguinSwim: Swim[Penguin] = new Swim[Penguin] {
    def swim(penguin: Penguin): Unit = {
      println(s"I'm a penguin and I can swim. My name is ${penguin.name}")
    }
  }

And finally the api, the last bit that acts like glue and, relying on implicits, puts everything together.

  implicit class FlyOps[A](val value: A) extends AnyVal {
    def fly(implicit flyInstance: Fly[A]): Unit = flyInstance.fly(value)
  }

implicit class TalkOps[A](val value: A) extends AnyVal {
    def talk(implicit talkInstance: Talk[A]): Unit = talkInstance.talk(value)
  }

  implicit class SwimOps[A](val value: A) extends AnyVal {
    def swim(implicit swimInstance: Swim[A]): Unit = swimInstance.swim(value)
  }

Now that we have all the pieces in place, we can see if we achieved something good out of it. We need two birds first:

val parrot = new Parrot("George")
val penguin = new Penguin("Pingu")

The api and instances defined above need to be imported for the implicits to do their magic:

// here is where I defined the api in the sample code
import com.lansalo.family.resemblance.fp.api.ImplicitBehavioursApi._
// here the instances implementing the traits
import com.lansalo.family.resemblance.fp.Instances._ 

parrot.fly
// As expected, parrots can fly and this will output:
// "I'm a parrot and I can fly. My name is George"

parrot.talk
// Will print: "I'm a parrot and I can talk. My name is George"

parrot.swim
// compilation error because parrots can't swim

penguin.swim
// will output: "I'm a penguin and I can swim. My name is Pingu"

penguin.fly
// compilation error, because penguin don't fly and more practically, 
// because we didn't define an instance for Fly[Penguin] 
// and the api will complain that one can't be found

So, both parrots and penguins are Birds but parrots can fly (but not swim) whereas penguins can swim (but not fly, or talk for that matter) and errors are picked up at compile time. All good then! Or not? Well… it turns out there is a friendly parrot in New Zealand that can’t fly: the kakapo (and apparently is the only one). Once again we found ourselves victims of the same mistake, to reiterate how easy it is to generalize and ignore edge cases. But perhaps this time we are in a better position to face the issue without the need to throw runtime errors (ie. UnsupportedOperationException) or to modify the data defined so far (which could be out of our control). Clearly a new member has to be added to the Bird family and even if it doesn’t fly, it still has to be a parrot:

sealed trait Bird

class Parrot(val name: String) extends Bird
class Penguin(val name: String) extends Bird
class Kakapo(override val name: String) extends Parrot(name)

the Kakapo is still a parrot:

kakapo.isInstanceOf[Parrot] // returns true

But it can’t fly (kakapo.fly will not compile) although the reason might not be immediately clear. It’s true that we have an instance of Fly[Parrot] in scope and equally, a Kakapo is a parrot (as shown above) but their parent/child relationship doesn’t get propagated to the higher-kinded type Fly[A] as this is invariant in its type A.
For the same reason, the kakapo can’t talk like other parrots. I must admit that my knowledge on the whole kakapo subject is quite limited but from the information I gather, that seems to be the case. On the other hand, the ability to speak is widespread across the parrot family. It might well be that with a bit of patient training someone could teach few words to a kakapo. In that case, we can still adapt to this newly discovered behavior (still without modify our models) by adding an instance for it:

implicit val kakapoTalk: Talk[Kakapo] = new Talk[Kakapo] {
    def talk(kakapo: Kakapo): Unit = {
      println(s"I'm a kakapo and I can talk. My name is ${kakapo.name}")
    }
  }

Even if there is no perfect solution in those cases, a type class approach seems to provide a better way to deal with those scenarios where the we are trying to model concepts that relate to the family resemblance paradigm.

———————————

1. [All that remind me of my first experience as a developer, working for a big Italian utility company. It was long time ago and we were trying (among other things) to model a “meter reader” attached to user accounts. We were struggling with old legacy data and when we thought we capture the essence of a meter reader (and its types and hierarchy) it was just a matter of time and a new account pop up with a generally old type of reader, breaking our assumptions. It was then a matter to find someone with the know out in a position to throw some light on the weird new type. Usually someone close to retirement.  ]
2. [The examples will rely on implicits, although, strictly speaking, type classes can be implemented without using them. In the code there are also few example of the more verbose version which does not use implicits]

 

Scala Variance Explained

There are quite a few articles and resources about variance in Scala available on line. Still, I decided to add my own contribute with this post, trying to keep the whole topic as simple as I could, using familiar concept, few diagrams and some code you can play with. That hopefully will make the whole thing a bit easier to grasp. I also decided not to include reference to category theory: not because of not use (on the contrary) but because I hope this rough explanation could work as a sort of encouragement and introduction to it (if you want to dig it further I linked some resources at the end of the page).

Kingdom Animalia

As we need an example, actually an analogy, I chose something that lack originality but not familiarity (hopefully). The code is available here but you won’t find any surprise in it. It’s just a hierarchy of classes modeling the relationship between different groups of animals:
Kingdom Animalia

Variance

From the hierarchy above Vertebrate is the parent class of Fish (or Fish a subtype of Vertebrate) and we can clearly write something like:

val vertebrate: Vertebrate = new Fish()

And that’s because a fish is a vertebrate (or at least we modeled it wisely as such). But what about a higher-kinded types like T[A]? Let’s say for example that we have a generic class Cage[A] that takes a type parameter, what can be said about the relationship between Cage[Vertebrate] and Cage[Fish]? Is Cage[Vertebrate] the parent of Cage[Fish]? Does the same relationship between a vertebrate and a fish apply to the respective higher-kinded type?

functor
Variance is mainly about how we answer this question. We have precisely three options: invariance, covariance, and contravariance

Invariance

In the invariant case the answer to the previous question is pretty simple. There is no relationship between the types and their corresponding higher-kinded types.

invariance
An invariant class Cage would be defined in the following way in Scala:

class Cage[A]

Then we can unsurprisingly write val mammalCage: Cage[Mammal] = new Cage[Mammal] but, despite Primate being a subtype of Mammal, the following will not compile:

val primateCage: Cage[Mammal] = new Cage[Primate]
// Expect an error like:
// [error] Note: kingdom.animalia.vertebrate.mammal.primate.Primate <: kingdom.animalia.vertebrate.mammal.Mammal, but class Cage is invariant in type A.
// [error] You may wish to define A as +A instead.

And for analogous reasons val vertebrateCage: Cage[Mammal] = new Cage[Vertebrate] won’t compile either. Conclusion: an invariant class Cage does not preserve the inheritance relationship between its type arguments.

Covariance

In the covariant case the answer to our initial question is more articulate and interesting. If B is a subtype of A, then even F[B] is a subtype of F[A].

covariance
Back to our example, a covariant Cage class in Scala would look like:

class Cage[+A]

And:

val mammalCage: Cage[Mammal] = new Cage[Mammal]
val primateCage: Cage[Mammal] = new Cage[Primate]
val homoCage: Cage[Mammal] = new Cage[Homo]

Will all compile without errors because of the parent/child relationship between mammal/primate (or mammal/homo) is extended to the higher-kinded type Cage so that Cage[Mammal] is the parent of Cage[Primate]. Of course the same won’t be true of:

// expect a type mismatch error if you try to compile any of these
val reptileCage: Cage[Mammal] = new Cage[Reptile]
val vertebrateCage: Cage[Mammal] = new Cage[Vertebrate]

As there is no parent/child relationship between a mammal and a reptile, nor between a mammal and a vertebrate. In this last case though, the opposite is true actually. All mammals are vertebrate and we modeled Vertebrate to be the parent of Mammal and that introduce us to the next topic.

Contravariance

As you might have guessed at this point, contravariance is the opposite of covariance:

contravariance
We still extend the types’ inheritance to the corresponding higher-kinded type but in this case, so to speak, the arrow is inverted. More precisely, if B is a subtype of A, then F[A] is a subtype of F[B] (note the difference with the covariant case). A contravariant Cage class in Scala would look like:

class Cage[-A]

Then in such a cage for a mammal, we can keep:

val mammalCage: Cage[Mammal] = new Cage[Mammal]
val vertebrateCage: Cage[Mammal] = new Cage[Vertebrate]
val animalCage: Cage[Mammal] = new Cage[Animal]

But the following lines will not compile:

// type mismatch error
val primateCage: Cage[Mammal] = new Cage[Primate]
// type mismatch error
val molluscCage: Cage[Mammal] = new Cage[Mollusc]

Recap

  • In an invariant Cage[Mammal] I can keep only a type Mammal (not its subtypes or supertypes)
  • In a covariant Cage[Mammal] I can keep a type Mammal or one of its subtypes (but not its supertypes)
  • In a contravariant Cage[Mammal] I can keep a type Mammal or one of its supertypes (but not its subtypes)

And note that this restriction is imposed at compile time. In other words, the compiler will check for you what can and cannot be passed to an higher-kinded type accordingly to its type arguments variance definition.

Covariant type A occurs in contravariant position

That was a sort of really general overview but It might leave you with some question about the use of variance from a practical perspective, including advantages and disadvantages. In this section (and the following), I’ll try to present some example along with some common Scala class that makes use of variance and I’ll start first with some code that does not compile: a broken cage.

class BrokenCage[+A](var guest: A)

The BrokenCage class is defined as covariant in is type but then we simply pass a guest of type A. The compiler will not digest that and will return an error like covariant type A occurs in contravariant position in type A of value guest.
To explain what is happening here and why the compiler is not happy, an absurd reasoning might help. So let’s assume for a moment that the code above compile and see what happens. I first create a cage with a primate in it:

val primateCage = new BrokenCage[Primate](new Primate)

Because BrokenCage is covariant in A, BrokenCage[Primate] is a subtype of BrokenCage[Animal] and therefore we can assign a BrokenCage[Primate] to a BrokenCage[Animal]

val animalCage: BrokenCage[Animal] = primateCage

So now we have an animalCage of type BrokenCage[Animal]. Then there shouldn’t be any problem to assign an invertebrate as a guest. After all an invertebrate is an animal and BrokenCage is covariant in its type.

animalCage.guest = new Invertebrate

But our animalCage is the primateCage we define at the beginning so basically we managed to put an invertebrate in a cage for a primate! This issue is known as heap pollution and the Scala compiler will prevent these sort of problems from happening, but it can occur for example in a normal Java array:

Primate[] primates = new Primate[5];
Animal[] animals =  primates;
animals[0] = new Invertebrate();

This code will compile and will result in an ArrayStoreException at runtime. One of the advantages of variance in Scala is precisely that these kind of error are captured at compile time rather than at runtime.
And just as a note (as it’s out of scope for the present post), the BrokenCage class can be fixed either using a val (rather than a var) or an upper type bound like in:

class FixedCage[+A, B <: A](var guest: B)

Where we specify that the guest B has to be a subtype of A (there are a couple of simple test around it in the code)

Option

To see how variance (covariance in the specific case) is used in Scala, the familiar Option trait is a good example. A simple implementation would be (and the actual Scala implementation is not much different):

sealed trait Option[+A]

case class Some[+A]( value: A ) extends Option[A]
case object None extends Option[Nothing]

Option is covariant in it’s type A (and so it’s Some). But why is covariant might not be immediately evident so to understand a bit more about it, let’s try to make it invariant and see what sort of behavior we should expect as a result of it:

sealed trait Option[A]

case class Some[A]( value: A ) extends Option[A]
case object None extends Option[Nothing]

The first thing to notice is that even Some has to be invariant in its type argument A as it extends Option and now Option is invariant in A. Besides (not surprisingly) we cannot assign an Option[Primate] to an Option[Mammal].

var optionalMammal: Option[Mammal] = Some(new Mammal)
val optionalPrimate: Option[Primate] = Some(new Primate)
// This won't compile
// optionalMammal = optionalPrimate

Despite Primate being a subtype of Mammal, that relationship doesn’t get passed to our invariant version of Option. Same applies to this invariant version of Some (for the same reason). Perhaps not so obvious, we cannot assign a None to Option[Mammal]:

// This won't compile:
// val mammal: Option[Mammal] = None

Nothing is a subtype of any type in Scala but this time Option is invariant so we can’t assign to an Option[Mammal] anything that is not an Option[Mammal], hence not an Option[Nothing], nor None which extends Option[Nothing]. We could get around this inconvenient with an upper type bound defining Option trait like sealed trait Option[A <: Nothing] but I let you imagine the usefulness of such Option type. More generally, an invariant version of Option fails to meet those expectation an Option type is supposed to deliver (in the code, you can find also a version of a covariant Option with an invariant definition for Some).

Function

Functions are another interesting example of variance usage in the Scala language. The trait Function1 is defined as contravariant in its argument and covariant in its returned type. Simplifying it looks like:

trait Function1[-X,+Y] {
  def apply(arg: X): Y
}

What we are saying with such type definition is that if A is a supertype of X and B is a subtype of Y, then Function1[A,B] is a subtype of Function1[X,Y]. In terms of the kingdom animalia analogy used so far, Function1[Vertebrate,Homo] is a subtype of Function1[Mammal,Primate] as Vertebrate is a supertype of Mammal and Homo is a subtype of Primate:
functions

Although the reason for such a choice might not be immediately evident. When we think about a class in our animal kingdom it’s quite intuitive what a parent/child relationship involves. Primate is a subtype of Mammal because all Primates are Mammals. Hence, if you ask me for a mammal, I can give you a primate as a primate is a mammal. This concept has a name: Liskov substitution principle. It states (from Wikipedia):

If S is a subtype of T, then objects of type T may be replaced with objects of type S (i.e. an object of type T may be substituted with any object of a subtype S) without altering any of the desirable properties of T

Back to our example, if we want to understand why Function1[Vertebrate,Homo] is a subtype of Function1[Mammal,Primate], or in other words, why Function1 is defined as contravariant in its argument and covariant in its returned type, then we need to start from those desirable properties that the present definition is not supposed to alter (and that, likely, a different definition would have altered instead). For a function (and for a category in category theory) we expect the composition operation as a primitive. Function composition can be defined more rigorously but intuitively it says that if we have a function from A to B and a function from B to C, then we can always compose the two functions and obtain a function from A to C.

function1_right
Translated in Scala

val f: Vertebrate => Mammal = ???
val g: Mammal => Primate = ???

// We can compose those to obtain a new function Vertebrate => Primate
val fromVertebrateToPrimate: Vertebrate => Primate = f andThen g // or g compose f

Given the definition of Function we saw, can we safely say that replacing a function with one of its subtypes doesn’t alter the compose operation? Let’s start checking whether making the return type covariant is actually a problem.
Assume g stays the same but we replace f with a subtype of f. Accordingly to our definition of function the return type is covariant so Vertebrate => Homo is a subtype of f (as Homo is a subtype of Mammal)

function-compose2

It should be clear from the picture that replacing f with its subtype doesn’t prevent us from compose the two functions. g takes a Mammal as argument and Homo is a Mammal.
Now let’s keep f as it is and replace g with a subtype of g. As a function is contravariant in its argument, a function Vertebrate => Primate is a subtype of g (as the argument Vertebrate is a supertype of Mammal)

function-compose1

Even in this case the composition is preserved. g takes a vertebrate now and all mammals are vertebrate. To complete the picture, remain to be seen how a different definition of function would have impact on the composition operation. What for example if the return type is contravariant? Then we could replace f with a function Vertebrate => Vertebrate (with the return type being a a supertype of Mammal and therefore Vertebrate => Vertebrate a subtype of Vertebrate => Mammal). Would that impact on our ability to compose the two functions?

function-boken1

Yes, as the picture above show. For example we pass a vertebrate to f and f returns a fish (which is a vertebrate but not a mammal). g is not defined for fishes though, only for mammals.
Similar outcome if we define function as being covariant in its argument. In the following example we keep f unchanged but we replace the original g with a subtype: Primate => Primate. If we define Function as covariant in its argument then I can replace Mammal with its subtype Primate and therefore Primate => Primate would be a legitimate subtype of Mammal => Primate accordingly to this definition:

function-broken2

Even in this case our ability to compose the two function is lost as g is not defined for all the possible mammals but just for a specific subset of them: primates. If f returns a dog, g wouldn’t know what to do with it.
The lesson in case of function can be expressed in more formal way introducing the concept of domain and codomain of a function. A domain is the set of values for which the function is defined (its possible arguments) and the codomain is the set of all possible values that the function can return. So, for instance, in a function Mammal => Homo the set of all mammals is the domain of the function and the set of all humans is the codomain. We can now say that given two functions A => B and C => D, those can be composed into a function A => D if and only if the codomain of the first function (B) is a subset of the domain of the second function (C). And that is what is captured in Scala definition of function with its contravariant argument type and covariant returned type.

Code is available here: https://github.com/lansaloltd/type-variance/
And here: https://github.com/lansaloltd/kingdom-animalia