Consider the simple cons-list datatype.

```
import scalaz.Equal, scalaz.std.option._, scalaz.syntax.std.option._,
scalaz.std.anyVal._, scalaz.std.function._,
scala.reflect.runtime.universe.reify
sealed abstract class XList[A]
final case class XNil[A]() extends XList[A]
final case class XCons[A](head: A, tail: XList[A]) extends XList[A]
```

And a simple function over this structure. Say, a simple summing function.

```
def sum(xs: XList[Int]): Int = xs match {
case XNil() => 0
case XCons(x, xs) => x + sum(xs)
}
```

And that seems to work out alright.

```
scala> val nums = XCons(2, XCons(3, XCons(4, XCons(42, XNil()))))
nums: XCons[Int] = XCons(2,XCons(3,XCons(4,XCons(42,XNil()))))
scala> sum(nums)
res0: Int = 51
```

Has it ever struck you as curious that, though its own value was
required to construct a value like `sum`

, the system has no problem
with that?

Oh, well, that’s just a recursive function, you say. Well, what’s so special about recursive functions? Why do they get special treatment so that they can define themselves with themselves?

First, let’s be clear: there’s a limit to how much of `sum`

can be
used in its own definition.

Let us consider the moral equivalent of the statement “this function gives the sum of a list of integers because it is the function that gives the sum of a list of integers.”

```
def sum2(xs: XList[Int]): Int = sum2(xs)
```

scalac will compile this definition; it is well-typed. However, it will be nonsensical at runtime, because it is nonsense; it will either throw some exception or loop forever.

Let us consider a similar case: the infinite list of 42s.

```
scala> val fortyTwos: Stream[Int] = 42 #:: fortyTwos
fortyTwos: Stream[Int] = Stream(42, ?)
scala> fortyTwos take 5 toList
res0: List[Int] = List(42, 42, 42, 42, 42)
```

The definition of `fortyTwos`

is like that of `sum`

; it uses its own
value while constructing said value. A similar definition to `sum2`

is, likewise, nonsense, though scalac can catch this particular case:

```
scala> val fortyTwos2: Stream[Int] = fortyTwos2
<console>:7: error: value fortyTwos2 does nothing other than call itself recursively
val fortyTwos2: Stream[Int] = fortyTwos2
^
```

Obviously, functions *aren’t* special; non-function values, like
functions, can be defined using their own values. But how can we
characterize the difference between the good, terminating definitions,
and the bad, nonterminating definitions?

Proof systems like Coq and Agda perform a strong check on recursive
definitions; for definitions like `sum`

, they require the recursion
match the structure of the data type, just as ours does, so that each
recursive call is known to operate over smaller data. For definitions
like `fortyTwos`

, they apply other strategies. In Scala, we have to
make do with informality.

I like to think of it this way: **a recursive definition must always
perform at least one inductive step**. `sum`

does so because, in the
recursive case, it gives “supposing I have the sum of `tail`

, the sum
is the `head`

plus that.” `fortyTwos`

does because it says “the value
`fortyTwos`

is `42`

consed onto the value `fortyTwos`

.” It is, at
least, the start of a systematic way of thinking about terminating
recursive definitions.

Now that we have a framework for thinking about what is required in a recursive definition, we can start abstracting over it.

The above recursive definitions were accomplished with special
language support: the right-hand side of any term definition, `val`

or
`def`

, can refer to the thing being so defined. Scalaz provides
the `fix`

function,
which, if it were provided intrinsically, would eliminate the need for
this language support.

```
def fix[A](f: (=> A) => A): A = {
lazy val a: A = f(a)
a
}
```

In this definition, the value returned by `f`

*is* the value given to
it as an argument. It’s a by-name argument because that’s how we
enforce the requirement: `f`

must perform at least one inductive step
in the definition of its result, though it can refer to that result by
its argument, which we enforce by requiring it to return a value
*before* evaluating that argument.

Let’s redefine `sum`

with `fix`

, after importing it from
`scalaz.std.function`

.

```
val sum3: XList[Int] => Int = fix[XList[Int] => Int](rec => {
case XNil() => 0
case XCons(x, xs) => x + rec.apply(xs)
})
```

And Scala thinks that’s alright.

```
scala> sum3(nums)
res1: Int = 51
```

The interesting thing here is that `sum3`

’s definition doesn’t refer
to the name `sum3`

; the recursion is entirely inside the `fix`

argument. So one advantage of `fix`

is that it’s easy to write
recursive values as expressions without giving them a name.

For example, there’s the definition of `fortyTwos`

:

```
scala> fix[Stream[Int]](42 #:: _)
res2: Stream[Int] = Stream(42, ?)
scala> res2 take 5 toList
res3: List[Int] = List(42, 42, 42, 42, 42)
```

It can be inconvenient to avoid evaluating the argument when providing
an induction step. Fortunately, the requirement that `f`

be nonstrict
in its argument is too strong to characterize the space of values that
can be defined with `fix`

-style recursion.

For a given data type, there’s often a way to abstract out the
nonstrictness. For example, here’s an `Equal`

instance combinator
that is fully evaluated, but doesn’t force the argument until after
the (equivalent) result has been produced.

```
def lazyEqual[A](A: => Equal[A]): Equal[A] = new Equal[A] {
def equal(l: A, r: A): Boolean = A equal (l, r)
override def equalIsNatural = A.equalIsNatural
}
```

Given that, we can produce a `fix`

variant for `Equal`

that passes the
`Equal`

argument strictly. You’re simply not allowed to invoke any of
the typeclass’s methods.

```
def fixEq[A](f: Equal[A] => Equal[A]): Equal[A] =
fix[Equal[A]](A => f(lazyEqual(A)))
```

And now, we have the machinery to build a fully derived `Equal`

instance for `XList`

, without function recursion, by defining the base
case and inductive step!

```
def `list equal`[A: Equal]: Equal[XList[A]] =
fixEq[XList[A]](implicit rec =>
Equal.equalBy[XList[A], Option[(A, XList[A])]]{
case XNil() => None
case XCons(x, xs) => Some((x, xs))
})
```

That works out to interesting compiled output. Note especially the
last line, and its (strict) use of `rec`

towards the end.

```
scala> reify(fixEq[XList[Int]](implicit rec =>
| Equal.equalBy[XList[Int], Option[(Int, XList[Int])]]{
| case XNil() => None
| case XCons(x, xs) => Some((x, xs))
| }))
res10: reflect.runtime.universe.Expr[scalaz.Equal[XList[Int]]] =
Expr[scalaz.Equal[XList[Int]]]($read.fixEq[$read.XList[Int]](((implicit rec) =>
Equal.equalBy[$read.XList[Int], Option[Tuple2[Int, $read.XList[Int]]]]
(((x0$1) => x0$1 match {
case $read.XNil() => None
case $read.XCons((x @ _), (xs @ _)) => Some.apply(Tuple2.apply(x, xs))
}))(option.optionEqual(tuple.tuple2Equal(anyVal.intInstance, rec))))))
```

f0, a binary serialization library, uses a similar technique to help define codecs on recursive data structures.

`XList`

?If we can abstract out the idea of recursive value definitions, what
about recursive type definitions? Well, thanks to higher kinds, sure!
Scalaz doesn’t provide it, but it is commonly called `Mu`

.

```
final case class Mu[F[_]](value: F[Mu[F]])
```

We have to put a class in the middle of it so that we don’t have an
infinite type; Haskell has a similar restriction. But the principle
is the same as with `fix`

: feed one datatype induction step `F`

to the
higher-order type `Mu`

and it will feed `F`

’s result back to itself.

For example, here is the equivalent definition of `XList`

with `Mu`

.

```
type XList2Step[A] = {type λ[α] = Option[(A, α)]}
type XList2[A] = Mu[XList2Step[A]#λ]
```

Note the typelambda’s similarity to the second type argument to
`#equalBy`

above. And for demonstration, the isomorphism with
`XList`

.

```
def onetotwo[A](xs: XList[A]): XList2[A] = xs match {
case XNil() => Mu[XList2Step[A]#λ](None)
case XCons(x, xs) => Mu[XList2Step[A]#λ](Some((x, onetotwo(xs))))
}
def twotoone[A](xs: XList2[A]): XList[A] =
xs.value cata ({case (x, xs) => XCons(x, twotoone(xs))}, XNil())
```

Of course, `fix`

lends itself to both of these definitions; I have
left its use off here. But let’s check those functions:

```
scala> onetotwo(nums)
res11: XList2[Int] = Mu(Some((2,Mu(Some((3,Mu(Some((4,Mu(Some((42,Mu(None)))))))))))))
scala> twotoone(res11)
res12: XList[Int] = XCons(2,XCons(3,XCons(4,XCons(42,XNil()))))
```

`fix`

over `Mu`

And, finally, the associated general `Equal`

definition for `Mu`

. The
`contramap`

step is just noise to deal with the fact that the `Mu`

structure has to actually exist; you can ignore it for the most part.

```
def equalMu[F[_]](fa: Equal[Mu[F]] => Equal[F[Mu[F]]]): Equal[Mu[F]] =
fixEq[Mu[F]](emf => fa(emf) contramap (_.value))
```

The evidence we really want is `forall a. Equal[a] => Equal[F[a]]`

,
but that’s too hard to express in Scala, so this does it in a pinch.
All we’re interested in is that we can derive `F`

’s equality given the
equality of any type argument given to it. Let’s prove that we have
such an `Equal`

-lifter:

```
// redefined because Tuple2Equal scalaz is strict on equalIsNatural
class Tuple2Equal[A1, A2](_1: Equal[A1], _2: Equal[A2])
extends Equal[(A1, A2)] {
def equal(f1: (A1, A2), f2: (A1, A2)) =
_1.equal(f1._1, f2._1) && _2.equal(f1._2, f2._2)
override def equalIsNatural: Boolean = _1.equalIsNatural && _2.equalIsNatural
}
implicit def tup2eq[A1: Equal, A2: Equal] =
new Tuple2Equal[A1, A2](implicitly, implicitly)
abstract class Blah // just a placeholder
scala> {implicit X: Equal[Blah] => implicitly[Equal[XList2Step[Int]#λ[Blah]]]}
res4: scalaz.Equal[Blah] => scalaz.Equal[Option[(Int, Blah)]] = <function1>
```

And now that we have `F`

equality, we’re done, because `Mu`

is `F`

s
all the way down.

```
scala> equalMu[XList2Step[Int]#λ](implicit fa => implicitly)
res5: scalaz.Equal[Mu[[α]Option[(Int, α)]]] = scalaz.Equal$$anon$2@de52bcf
scala> res5 equal (onetotwo(nums), onetotwo(nums))
res6: Boolean = true
scala> res5 equal (onetotwo(nums), onetotwo(XCons(3,nums)))
res7: Boolean = false
```

*This article was tested with Scala 2.10.4 & Scalaz 7.0.6.*

Unless otherwise noted, all content is licensed under a Creative Commons Attribution 3.0 Unported License.

Back to blog