10 - Demo of a type-level sorter in Scala

The live demo you saw on Friday was there just so you can see what Scala's type system inferrer can do.
If less than 100% of that entire demo code made sense, it's absolutely fine. We'll go at great length here to make everything click.

At the actual demo, I followed the ideas from this quicksort article with some modifications.
For this explanation I'll cover my own mergesort version (instead of quicksort) which seems a bit easier and more gradual to build.

Objective

So what do we want?

We want to sort types (instead of values), at compile time.

Of course, nobody in their right mind would want to do that for any purely practical reason, but hey:

  1. First of all, people in their right mind use boring languages and by now you've seen that Scala is anything but.
  2. More importantly, we want to understand the magic behind Scala's type inference system.
    If we understand this, few programming concepts will ever stand in our way.

OK, so we're crazy and we want to make the compiler sort types by itself. Let's do this.

Defining natural numbers

What we first need is an axiomatic definition of natural numbers as types, so we want to be able to define type _0, type _1 etc as analogues to regular values.

Well we can't define all natural numbers, since there's quite a few of them, it would take us a while.
But we can define them in terms of their succession:

trait Nat
class _0 extends Nat
class Succ[A <: Nat] extends Nat

So in 3 lines of code we defined all natural numbers, as every number has a unique successor. In our object with the main method we can now define some type aliases:

object TypeDemo {
  type _1 = Succ[_0]
  type _2 = Succ[_1] // = Succ[Succ[_0]]
  type _3 = Succ[_2] // = Succ[Succ[Succ[_0]]]
 
  ...
}

Comparing numbers - "less" is actually a lot

OK, that was easy. Now in order to be able to sort anything, we need to be able to compare things.
No Booleans here, because we just said goodbye to regular values. We'll declare the less-than relationship as a type:

trait <[A <: Nat, B <: Nat]

And of course we used the < name, because Scala allows us to.
Important thing: _0 < _1, which is sugar for <[_0, _1], is a type, not an expression.
Again, we want to define the inequality in an axiomatic way.

First of all, we know that 0 < succ(x) for any natural x.
Secondly, for any two numbers x and y, succ(x) < succ(y) if and only if x < y.

For the Scala compiler, the first axiom means finding the type _0 < Succ[Something], where Something <: Nat.
The second axiom means the compiler has to work. The intuitive mechanism should work like this. Let me just write

val x: _2 < _4 = ...

When the compiler sees the type _2 < _4, it should work out the following:

  1. Hey, that's the same as saying Succ[_1] < Succ[_3] by the names I've given to the _2 and the _4 types!
  2. Does Succ[_1] < Succ[_3] exist?
  3. Hm, first axiom doesn't work.
  4. OK, second axiom - Succ[_1] < Succ[_3] exists if _1 < _3 exists. OK.
  5. Hey, that's the same as saying Succ[_0] < Succ[_2]!
  6. Does Succ[_0] < Succ[_2] exist?
  7. Hm, first axiom doesn't work.
  8. OK, second axiom - Succ[_0] < Succ[_2] exists if _0 < _2 exists. OK.
  9. Hey, that's the same as saying _0 < Succ[_1]!
  10. Does _0 < Succ[_1] exist?
  11. First axiom - Yes!
  12. So _0 < _2 exists,
  13. So _1 < _3 exists,
  14. So _2 < _4 exists!

So the compiler infers that the type <[_2, _4] (or _2 < _4 in sugar mode) exists, which is the same as us saying 2 < 4.
How does the compiler know how to look for axioms?

Through implicits.

We'll declare a companion for the < trait as follows:

object < {
  implicit def ltbasic[B <: Nat]: <[_0, Succ[B]] = new <[_0, Succ[B]] {}
}

in which this implicit method says to the compiler: hey, if you ever need an implicit value of type <[_0, Succ[B]] for any type B which is a natural, then let me give you an instance of the <[_0, Succ[B]] type. We're not really concerned about the actual instance returned by the method - the compiler only relies on the fact that it exists.

So if I add an apply method to the companion:

def apply[A <: Nat, B <: Nat](implicit lt: <[A, B]): <[A, B] = lt

and I say

val x = <[_0, _2]

the compiler fires up:

  1. hey, this programmer's calling the apply method of the object < with the types _0 and _2
  2. the method takes an implicit parameter, it has the type <[_0, _2] or <[_0, Succ[_1]]
  3. well I have an implicit method which gives me an instance of exactly <[_0, Succ[_1]] - hey implicit, when the program runs, I'll call you for this one!
  4. implicit says: no problem boss.
  5. well good, implicit said OK, let me link up the types and just give a green light to the programmer.

So for us humans, the compiler just chose the axiom that 0 < succ(x) for any x.

Let's add the other axiom, in the form of another implicit:

implicit def inductive[A <: Nat, B <: Nat](implicit lt: <[A, B]): <[Succ[A], Succ[B]] = new <[Succ[A], Succ[B]] {}

which says: hey compiler, if you ever need an instance of type <[Succ[A], Succ[B]], just let me know; give me a value of type <[A, B] and I'll compute an instance of <[Succ[A], Succ[B]] for you.

So if I, programmer, say

val x = <[_1, _2]

the compiler fires up:

  1. the damn programmer's calling the apply method with the types _1 and _2.
  2. the apply method takes an implicit parameter, it has the type <[_1, _2] or <[Succ[_0], Succ[_1]].
  3. well I have an implicit method which gives me an instance of exactly <[Succ[_0], Succ[_1]], it's called inductive.
  4. hey inductive, when the program runs, I'll call you for this one!
  5. inductive says: no problem boss, just give me an instance of <[_0, _1] when you need me and I'll do that for you.
  6. OK, so I'll need an instance of <[_0, _1].
  7. luckily I have an implicit named basic that does just that - hey basic, when the program runs, I'll need you to give me an instance of <[_0, _1]!
  8. basic says: no problem, boss.
  9. well good, basic and inductive said OK:
    1. when basic gives me the instance of <[_0, _1] then I'll pass it to inductive
    2. inductive will give me an instance of <[_1, _2] then I'll pass it to apply
  10. everything's fine, let me link up the types and just give a green light to the damn programmer.

Finally, I hope we're good. The comparison code looks like this:

trait <[A <: Nat, B <: Nat]
object < {
  def apply[A <: Nat, B <: Nat](implicit lt: <[A, B]): <[A, B] = lt
  implicit def ltbasic[B <: Nat]: <[_0, Succ[B]] = new <[_0, Succ[B]] {}
  implicit def inductive[A <: Nat, B <: Nat](implicit lt: <[A, B]): <[Succ[A], Succ[B]] = new <[Succ[A], Succ[B]] {}
}

and I'll finally be able to write

val x = <[_1, _2]

but instead if I try

val x = <[_2, _1]

smart compiler smacks me: could not find implicit value for parameter lt: <[_2, _1]
which is its gentle way of saying: “wtf, there's no such thing as _2 < _1, you crazy? I'm not compiling that piece of sh for you.”

A small break

We went so very granular in understanding the 5 lines of the < definition above because this principle is at the core of the entire thing.
If you went through this and feel you got it, that's awesome. You are awesome.
You now understand how the compiler looks for implicit values and how we want exploit this mechanism with the meaning of finding mathematical axioms.

Less than, or equal

Shamelessly copying most of the stuff from the strict < trait:

trait <=[A <: Nat, B <: Nat]
object <= {
  def apply[A <: Nat, B <: Nat](implicit lte: <=[A, B]): <=[A, B] = lte
  implicit def basic[B <: Nat]: <=[_0, B] = new <=[_0, B] {}
  implicit def inductive[A <: Nat, B <: Nat](implicit lt: <=[A, B]): <=[Succ[A], Succ[B]] = new <=[Succ[A], Succ[B]] {}
}

which has a slightly different basic axiom in that _0 is less than or equal to B for any natural B.
You should be able to easily read the other axiom with what you've learned so far.

A type-level list

Ok, cool. We defined natural numbers and a type-level comparison method.
Now in order to sort anything, we need a definition of a type-level list, which now “stores” types instead of values.

We'll define a list as

trait HList
class HNil extends HList
class ::[H <: Nat, T <: HList] extends HList

while again using the nice :: name for the “cons” type, so we can later easily say H :: T as sugar for ::[H, T].

Well, good. We have numbers, comparison and lists. We now need the algorithm.

You know the algorithm, but just a quick recap:

  1. Need to sort a list? Fine, split it half.
  2. Sort the halves.
  3. Merge the sorted halves in a sorted fashion.

We'll do this in three parts: splitting, merging and actual sorting.

Mergesort, episode 1 - The Split

Judging by most of the code you've written so far, all your operations are composed of functions.
You've kept track so far, so you know this is unlike most code. Instead of functions, we'll use types.

The Split type will look something like this:

trait Split[L <: HList] {
  type Left <: HList
  type Right <: HList
}

Now this is a tiny bit different than comparing numbers, since this trait now “stores” types.
They will be the “results” of our computation, or rather, the compiler's computation.

As before, we'll add a companion to Split, now with a small twist:

object Split {
  type Aux[HL <: HList, L <: HList, R <: HList] = Split[HL] {
    type Left = L
    type Right = R
  }
}

Just for the sake of not expanding this code more than it's worth, read the above as the compiler would, as follows: “whenever someone gives me an instance of Aux[HL, L, R], I'll actually think of it as a Split[HL]; in it, I'll store the Left and Right type members as L and R, respectively.”

Good. Now the axioms. What, you thought you were done with axioms?

One basic case is splitting the empty list, which is dead easy:

  implicit def basic: Aux[HNil, HNil, HNil] = new Split[HNil] {
    type Left = HNil
    type Right = HNil
  }

Now look at the actual value I'm returning - I declare I'm returning an Aux, but I'm returning a Split with the exact Left = HNil, Right = HNil match. I do that because

  1. I can't instantiate Aux on its own, because it's just a type name, not a class/trait declaration
  2. I have to specify the types for Left and Right and they have to match - otherwise the compiler's type equalizer will fail
  3. I'm helping the compiler actually “write” the types in the Split[HNil] instance I'm returning

We'll use this principle throughout the entire mergesort algorithm.

Back to axioms. Another basic case is splitting a one-element list:

  implicit def basic2[H <: Nat]: Aux[H :: HNil, H :: HNil, HNil] = new Split[H :: HNil] {
    type Left = H :: HNil
    type Right = HNil
  }

which keeps the one element in the Left side of the split.

OK, now the hard one. What should happen for a list with more than one element?
The trick here is not to split the list in half, at its half. That's a bit complicated.
By the definition of a list with more than one element, it has at least two elements.
So just put one in the Left side, one in the Right side, and split the rest recursively.
So axiomatically (in an ad-hoc invented language) we could say something like

split(HNil) = { Left = HNil, Right = HNil }
split(H :: HNil) = { Left = H :: HNil, Right = HNil }
split(H :: HH :: T) = { Left = H :: split(T).Left, Right = HH :: split(T).Right }

Hopefully the above code makes sense, as a condensed version of the above explanation.
We already have the first two axioms. The third, rewritten in Scala, looks like this:

implicit def inductive[H <: Nat, HH <: Nat, T <: HList] (implicit split: Split[T])
  : Aux[H :: HH :: T, H :: split.Left, HH :: split.Right]
  = new Split[H :: HH :: T] {
    type Left = H :: split.Left
    type Right = HH :: split.Right
  }

and finally, an apply method:

def apply[L <: HList](implicit split: Split[L]): Aux[L, split.Left, split.Right] 
  = new Split[L] {
  type Left = split.Left
  type Right = split.Right
}

Try to apply the “hey implicit” logic we described in the number comparison, here. Example: Split[_0 :: _1 :: _2 :: HNil]:

  1. apply method - I need an implicit which gives me a Split[_0 :: _1 :: _2 :: HNil]
  2. inductive is the only one which returns a Split[_0 :: _1 :: Something]. Hey inductive
  3. inductive: ok boss, give me a Split[_2 :: HNil] and I'll do whatever you need.
  4. basic2 is the only implicit which gives me a Split[_2 :: HNil]. Hey basic2
  5. basic2: ok boss.
  6. All good. Linking up the types:
    • Split[_2 :: HNil] { Left = _2 :: HNil, Right = HNil } from basic2, now going back to
    • Split[_0 :: _1 :: _2 :: HNil] { Left = _0 :: _2 :: HNil, Right = _1 :: HNil } from inductive.

Mergesort, episode 2 - The Merge

By now you should be pretty familiar with the logic we (and the compiler) follow with implicits and axioms.
We'll go a tiny bit faster with the merge.

The merge “operation” will be, as before, a trait

trait Merge[LA <: HList, LB <: HList] extends HList {
  type Out <: HList
}

which has two lists and a “result” type. Its companion has an auxiliary type which follows a similar logic as Split, for the compiler's sake:

object Merge {
  type Aux[LA <: HList, LB <: HList, R <: HList] = Merge[LA, LB] {
    type Out = R
  }
}

Now the axioms.

First of all, any L merged with HNil will give us exactly L.
Problem is, Merge[L, HNil] is different from Merge[HNil, L], so we'll write two axioms:

  implicit def basicleft[LA <: HList]: Aux[HNil, LA, LA] = new Merge[HNil, LA] {
    type Out = LA
  }
  implicit def basicright[LA <: HList]: Aux[LA, HNil, LA] = new Merge[LA, HNil] {
    type Out = LA
  }

Now the inductive ones. Both arguments are non-empty lists, so we can write them as HA :: TA and HB :: TB.
There are two options:

merge(HA :: TA, HB :: TB) = HA :: merge(TA, HB :: TB) if HA ⇐ HB (forgive the ⇐, it's less than or equal - the wiki doesn't help me)
OR
merge(HA :: TA, HB :: TB) = HB :: merge(HA :: TA, TB) if HB < HA

Now those < and types may come in handy!

  implicit def inductivelte[HA <: Nat, TA <: HList, HB <: Nat, TB <: HList] (implicit merged: Merge[TA, HB :: TB], lte: HA <= HB)
  : Aux[HA :: TA, HB :: TB, HA :: merged.Out] = new Merge[HA :: TA, HB :: TB] {
    type Out = HA :: merged.Out
  }
 
  implicit def inductiveg[HA <: Nat, TA <: HList, HB <: Nat, TB <: HList] (implicit merged: Merge[HA :: TA, TB], g: HB < HA)
  : Aux[HA :: TA, HB :: TB, HB :: merged.Out] = new Merge[HA :: TA, HB :: TB] {
    type Out = HB :: merged.Out
  }

And an apply method:

  def apply[LA <: HList, LB <: HList](implicit merged: Merge[LA, LB])
  : Aux[LA, LB, merged.Out] = new Merge[LA, LB] {
    type Out = merged.Out
  }

Try to follow the “hey implicit” logic on your own this time. For example: Merge[_0 :: _3 :: HNil, _1 :: _2 :: HNil].

Mergesort, episode 3 - The Sorting

We're here. The epic finale.

No surprises so far:

trait Sorted[L <: HList] extends HList {
  type Out <: HList
}
object Sorted {
  type Aux[L <: HList, S <: HList] = Sorted[L] {
    type Out = S
  }
}

Basic axioms say sorting an empty or a one-element list will give the same list:

  implicit def basicnil: Aux[HNil, HNil] = new Sorted[HNil] {
    type Out = HNil
  }
  implicit def basicone[H <: Nat]: Aux[H :: HNil, H :: HNil] = new Sorted[H :: HNil] {
    type Out = H :: HNil
  }

Now the crux of it. What is the axiomatic definition of sorting a list with at least two elements?
Well, you have to split, sort the halves and merge them together, right?
Let's think it backwards. Let U be the unsorted list.

  1. Sorted[U] = merged.Out
  2. such that merged satisfies merged = Merge[SL, SR]
  3. such that SL and SR satisfy Sorted.Aux[L, SL] and Sorted.Aux[R, SR], respectively…
  4. such that L and R satisfy Split.Aux[U, L, R].

Now read the “such that…” in the words of the compiler, which says “I need to find an implicit such that…”.

More words are worthless. The dragon-killing code is below:

  implicit def inductive[U <: HList, L <: HList, R <: HList, SL <: HList, SR <: HList]
  (implicit
    split: Split.Aux[U, L, R],
    sl: Sorted.Aux[L, SL],
    sr: Sorted.Aux[R, SR],
    merged: Merge[SL, SR])
  : Aux[U, merged.Out] 
  = new Sorted[U] { type Out = merged.Out }

If you find the above code a bit hard to read, you passed the test - you're a normal human being seeing alien code for the first time.
Write two or three more of these and you'll start to really read them.

The above code is the exact representation of the axiom for the induction case.
If you know a bit of Prolog and constraint satisfaction, here's an equivalent Prolog code, it might help.
The compiler goes through almost the same resolution process:

sort([], []).
sort([H], [H]).
sort(U, M) :-
  split(U, L, R),
  sort(L, SL),
  sort(R, SR),
  merge(SL, SR, M).

Also if it helps, here's a potato.

And finally, a (very boring by now) apply method:

  def apply[L <: HList](implicit sorted: Sorted[L]): Aux[L, sorted.Out] = new Sorted[L] {
    type Out = sorted.Out
  }

And that was it. You can now mergesort types, at compile time. If we make Sorted extend HList and put in some magic test code

  def show[L <: HList](s: L)(implicit tag: TypeTag[L]) = tag
 
  def main(args: Array[String]): Unit = {
    val x = Sorted[_1 :: _5 :: _0 :: _3 :: _2 :: HNil]
    println(show(x))
  }

You'll see that…
… well, test it for yourself!

Congratulations - you can now mergesort types, at compile time, with Scala.

Now what?

Breathe.

sesiuni/scala/lab10.txt · Last modified: 2016/07/24 17:26 by dciocirlan