# Why is ADT pattern matching allowed?

by Stephen Compall on Nov 10, 2014

technical

One of the rules of the Scalazzi Safe Scala Subset is “no type casing”; in other words, testing the type via `isInstanceOf` or type patterns isn’t allowed. It’s one of the most important rules therein for preservation of free theorems. Common functional programming practice in Scala seems to violate this rule in a subtle way. However, as we will see, that practice carves out a very specific exception to this rule that, morally, isn’t an exception at all, carrying convenient advantages and none of the drawbacks.

## Why forbid type tests?

With the “no type tests” rule, we forbid writing functions like this:

``````def revmaybe[T](xs: List[T]): List[T] = {
val allInts = xs.forall{case _:Int => true
case _ => false}
if (allInts) xs.reverse else xs
}
``````

Which violates the free theorem of `revmaybe`’s type `revmaybe(xs map f) = revmaybe(xs) map f`, as follows.

``````val xs = List(1, 2, 3)
def f(i:Int) = Some(i)

scala> revmaybe(xs map f)
res2: List[Some[Int]] = List(Some(1), Some(2), Some(3))

scala> revmaybe(xs) map f
res3: List[Some[Int]] = List(Some(3), Some(2), Some(1))
``````

## ADTs are OK to go

On the other hand, the Scalazzi rules are totally cool with pattern matching to separate the parts of ADTs. For example, this is completely fine.

``````def headOption[T](xs: List[T]): Option[T] = xs match {
case x :: _ => Some(x)
case _ => None
}
``````

Even more exotic matches, where we bring type information forward into runtime, are acceptable, as long as they’re in the context of ADTs.

``````sealed abstract class Expr[T]
final case class AddExpr(x: Int, y: Int) extends Expr[Int]

def eval[T](ex: Expr[T]): T = ex match {
case AddExpr(x, y) => x + y
}
``````

Let’s look at the compiled code of the `eval` body, specifically, the `case` line.

``````         2: aload_2
6: ifeq          39
13: astore_3
18: istore        4
24: istore        5
``````

So, instead of calling `unapply` to presumably check whether `AddExpr` matches, scalac checks and casts its argument to `AddExpr`. Why does it do that? Let’s see if we could use `AddExpr.unapply` instead.

``````scala> AddExpr.unapply _
``````

In other words, the `unapply` call can’t tell you whether an `Expr` is an `AddExpr`; it can’t be called with arbitrary `Expr`.

The only actual check here is inserted by scalac as part of compiling the pattern match expression, and it is a type test, supposedly verboten under Scalazzi rules. `headOption`, too, is implemented with type tests and casts, not `unapply` calls.

We’ve exhorted Scala users to avoid type tests, but then turn around and say that type tests are OK! What’s going on?

## An equivalent form

In every case where we use pattern matching on an ADT, there’s an equivalent way we could write the expression without pattern matching, by adding an encoding of the whole ADT as a method on the class or trait we use as the base type. Let’s redefine the `Option` type with such a method to see how this is done.

``````sealed abstract class Maybe[T] {
def fold[Z](nothing: => Z, just: T => Z): Z
}

final case class MNothing[T]() extends Maybe[T] {
override def fold[Z](nothing: => Z, just: T => Z): Z =
nothing
}

final case class Just[T](get: T) extends Maybe[T] {
override def fold[Z](nothing: => Z, just: T => Z): Z =
just(get)
}
``````

It’s key to our reasoning that we completely avoid `match` in our implementations; in other words, the `fold` is matchless.

With the `fold` method, the following two expressions are equivalent, notwithstanding scalac’s difficulty optimizing the latter, even in the presence of inlining.

``````(selector: Maybe[A]) match {
case MNothing() => "default case"
case Just(x) => justcase(x)
}

selector.fold("default case", x => justcase(x))
``````

It’s a simple formula: the fold takes as many arguments as there are cases, always returns the given, sole type parameter, and each argument is a function that results in that same parameter. There’s a free theorem that a `fold` implementation on data structures without recursion, like `Maybe`, can only invoke one of these arguments and return the result directly, just as the pattern match does.

If you prefer the clarity of named cases, just use Scala’s named arguments. Here’s that last fold:

``````selector.fold(nothing = "default case",
just = x => justcase(x))
``````

Encoding `Expr` is a little bit more complicated. For the full power of the type, we have to turn to `Leibniz` to encode the matchless `fold`.

``````import scalaz.Leibniz, Leibniz.{===, refl}

sealed abstract class Expr2[T] {
def fold[Z](add: (Int, Int, Int === T) => Z,
concat: (String, String, String === T) => Z): Z
}
``````

What does this mean? The type `Int === T`, seen in the `add` argument signature, is inhabited if and only if the type `T` is the type `Int`. So an implementation of `fold` can only call the `add` function if it can prove that type equality. There is, of course, one that can:

``````final case class AddExpr2(x: Int, y: Int) extends Expr2[Int] {
override
def fold[Z](add: (Int, Int, Int === Int) => Z,
concat: (String, String, String === Int) => Z): Z =
}
``````

Not only does `AddExpr2` know that `Expr2`’s type parameter is `Int`, we must make the type substitution when implementing methods from `Expr2`! At that point it is enough to mention `refl`, the evidence that every type is equal to itself, to satisfy `add`’s signature.

This may seem a little magical, but it is no less prosaic than implementing `java.lang.Comparable` by making this substitution. So you can do this sort of thing every day even in Java.

``````public interface Comparable<T> {
int compareTo(T o);
}

class MyData implements Comparable<MyData> {
@Override
public int compareTo(MyData o) {   // note T is replaced by MyData
// ...
}
}
``````

If only Java had higher kinds, you could go the rest of the way and actually implement GADTs.

Moving on, let’s see another case for `Expr2`, and finally to tie it all together, `eval2` with some extra constant data in for good measure.

``````final case class ConcatExpr2(x: String, y: String) extends Expr2[String] {
override
def fold[Z](add: (Int, Int, Int === String) => Z,
concat: (String, String, String === String) => Z): Z =
concat(x, y, refl)
}

def eval2[T](ex: Expr2[T]): T =
ex.fold((x, y, intIsT) => intIsT(1 + x + y),
(x, y, strIsT) => strIsT("one" + x + y))
``````

Using the `Leibniz` proof is, unfortunately, more involved than producing it in the fold implementations. See my previous posts, “A function from type equality to Leibniz” and “Higher Leibniz”, for many details on applying `Leibniz` proof to make type transformations.

While the pattern matching `eval` didn’t have to explicitly apply type equality evidence – it just knew that `Int` was `T` when the `IntExpr` pattern matched – Scala has holes in its implementation, discussed in the aforementioned posts on `Leibniz`, that sometimes make the above implementation strategy an attractive choice even though pattern matching is available.

## We could, but that’s good enough, so we won’t

You might have noticed that adding another case to `Expr` caused us not only to implement an extra `fold`, but to add another argument to the base `fold` to represent the new case, and then go through every implementation to add that argument. This isn’t so bad for just two cases, but indeed has quadratic growth, to the point that adding a new case to a large datatype is a majorly annoying project all by itself.

There is an interesting property of `fold`, though: the strategy isn’t available for our first function, `revmaybe`, to discriminate arguments of arbitrary type! To do that, we would have to add a signature like this to `Any`.

``````def fold[Z](int: Int => Z, any: Any => Z): Z = any(this)
// and, in the body of class Int
override def fold[Z](int: Int => Z, any: Any => Z): Z = int(this)
``````

Obviously, you cannot do this.

You can only add `fold` methods to types you know; I can only call `fold` in `expr2` by virtue of the fact that I know that the argument has type `Expr2[T]` for some `T`. If the argument was just `T`, I wouldn’t have enough static type information to call `fold`. So the use of `fold`s doesn’t break parametricity. Equivalently, a pattern match that could be implemented using a matchless fold also does not break parametricity.

As we have seen, it is unfortunately inconvenient to actually go through the bother of writing `fold` methods, when pattern matching is there. But it is enough to reason that we could write a matchless `fold` and replace the pattern matching with it, to prove that the pattern matching is safe, no matter how many underlying type tests scalac might use to implement it.

A simple test follows: if you could write a matchless fold, and use that instead, the pattern match is type-safe.

## A selector subtlety

Here’s a pattern match that violates parametricity.

``````selector match {
case MNothing() => "default case"
case Just(x) => justcase(x)
}
``````

Wait, but didn’t we rewrite that using a `fold` earlier? Not quite. Oh, I didn’t mention? The type of `selector` is `T`, because we’re in a function like this:

``````def notIdentity[T](selector: T) =
// match expression above goes here
``````

Scala will permit this pattern match to go forward. It doesn’t require us to prove that the selector is of the ADT root type we happened to define; that’s an arbitrary point as far as Scala’s subtyping system is concerned. All that is required is that the static type of `selector` be a supertype of each of `MNothing[_]` and `Just[_]`, which `T` is, not being known to be more refined than `Any`.

The test works here, though! What is ambiguous to scalac is a bright line in our reasoning. We can’t define a matchless `fold` that can be invoked on this `selector`, so we reach the correct conclusion, that the match violates parametricity.

## The rule revisited

So we’ve carved out a clear “exception” to the “no type tests” Scalazzi rule, and seen that it isn’t an exception at all. There’s a straightforward test you can apply to your pattern matches,

If and only if I could, hypothetically, write a matchless fold, or use an existing one, and rewrite this in its terms, this pattern match is safe.

but beware the subtle case where the match’s selector has a wider type than you anticipated.

Finally, this is a rule specifically about expressions that don’t violate our ability to reason about code. This doesn’t hold for arbitrary type-unsafe rewrites: that you could write a program safely means you should write it safely. Unlike arbitrary rewrites into nonfunctional code, the pattern match uses no non-referentially-transparent and no genuinely non-parametric expressions.