I’ve been working through this list of 99 Scala Problems, which is modeled after this list of 99 Prolog Problems. As I’ve been going through them, I have been comparing my solutions to those provided (obviously). Sometimes, my solution is more or less the same as the “official” solution. Sometimes, theirs is better. In the case of problem 28, I think mine is far easier to read and understand.
Problem 28 has two parts. The first part reads:
a) We suppose that a list contains elements that are lists themselves. The objective is to sort the elements of the list according to their length. E.g. short lists first, longer lists later, or vice versa.
Running the function should look like this:
[scala]
scala> lsort(List(List("a", "b", "c"), List("d", "e"), List("f", "g", "h"), List("d", "e"), List("i", "j", "k", "l"), List("m", "n"), List("o")))
res0: List[List[java.lang.String]] = List(List(o), List(d, e), List(d, e), List(m, n), List(a, b, c), List(f, g, h), List(i, j, k, l))
[/scala]
For this part, my solution was almost identical to the official one. Here’s what I came up with:
[scala]
def lsort[T](ls: List[List[T]]) = {
  ls.sortWith {(a, b) => a.length < b.length}
}
[/scala]
You can see that this function takes a List of Lists of type T, and then calls the sortWith method on that list, passing in a function value that compares two lists by length, so the result runs shortest to longest. The “official” solution was only slightly different:
[scala]
def lsort[A](ls: List[List[A]]): List[List[A]] =
  ls sort { _.length < _.length }
[/scala]
Here, they used A instead of T, but that doesn’t affect anything, and they specified the return type, while I left mine inferred. They also call sort, which newer Scala versions replace with sortWith. Instead of giving each of the two lists being compared a name, as I did, they use the underscore placeholder. The two functions are functionally (get it?) identical, but theirs is a bit shorter because they dropped the outer braces and, thanks to the underscores, were able to skip the parameter list.
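To make the equivalence concrete, here is a small sketch with three spellings of the same length sort. The names byName, byPlaceholder, and byKey are mine, and sortBy is a third variant that neither solution above uses.

[scala]
val lists = List(List("a", "b", "c"), List("d", "e"), List("o"))

// Named parameters, as in my version
val byName = lists.sortWith((a, b) => a.length < b.length)

// Placeholder syntax: the first _ is the first argument, the second _ is the second
val byPlaceholder = lists.sortWith(_.length < _.length)

// Sorting by a key instead of by a comparison function (not used in either solution)
val byKey = lists.sortBy(_.length)

// All three give List(List(o), List(d, e), List(a, b, c))
[/scala]

Which one reads best is mostly taste. The placeholder form is shortest, but the named form makes it obvious that two different lists are being compared.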
Now, the second part is where I diverge from the official solution. Here’s the problem description:
b) Again, we suppose that a list contains elements that are lists themselves. But this time the objective is to sort the elements according to their length frequency; i.e. in the default, sorting is done ascendingly, lists with rare lengths are placed [first], others with a more frequent length come later.
And the expected call and result are:
[scala]
scala> lsortFreq(List(List("a", "b", "c"), List("d", "e"), List("f", "g", "h"), List("d", "e"), List("i", "j", "k", "l"), List("m", "n"), List("o")))
res1: List[List[java.lang.String]] = List(List(i, j, k, l), List(o), List(a, b, c), List(f, g, h), List(d, e), List(d, e), List(m, n))
[/scala]
First, let’s look at what they presented as the solution. It referenced functions from other files, but I have included them all here for ease of viewing.
[scala]
def lsortFreq[A](ls: List[List[A]]): List[List[A]] = {
  val freqs = Map(encode(ls map { _.length } sort { _ < _ }) map { _.swap }:_*)
  ls sort { (e1, e2) => freqs(e1.length) < freqs(e2.length) }
}
def encode[T](ls: List[T]): List[(Int, T)] = {
  val packedList = pack(ls)
  packedList map {list => (list.length, list.head)}
}
def pack[T](ls: List[T]): List[List[T]] = ls match {
  case Nil => Nil
  case h :: tail => (h :: tail.takeWhile(_ == h)) :: pack(tail.dropWhile(_ == h))
}
[/scala]
I think this is very confusing code. It calls the encode function, which run-length encodes the sorted list of lengths, giving (count, length) pairs. Those pairs are swapped and turned into a Map from length to frequency, which is then used to sort the passed-in list. The five underscores in the first line obscure where the parameters are coming from, and the final underscore isn’t even a placeholder: it belongs to the : _* syntax, which passes the elements of a list as separate arguments to Map.
My solution, while being a longer function, is far more readable, in my opinion. And it’s the same number of lines as the three-method solution. Here it is:
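It helped me to pull that first line apart and look at each intermediate value. This is only a sketch: I’ve substituted sorted for sort { _ < _ } so that it runs on current Scala versions, I’ve repeated pack and encode (encode slightly condensed) so the block stands alone, and the names lengths, runs, pairs, and freqs are my own.

[scala]
def pack[T](ls: List[T]): List[List[T]] = ls match {
  case Nil => Nil
  case h :: tail => (h :: tail.takeWhile(_ == h)) :: pack(tail.dropWhile(_ == h))
}

def encode[T](ls: List[T]): List[(Int, T)] =
  pack(ls) map { list => (list.length, list.head) }

val ls = List(List("a", "b", "c"), List("d", "e"), List("f", "g", "h"), List("d", "e"),
  List("i", "j", "k", "l"), List("m", "n"), List("o"))

// Lengths of the inner lists, sorted so that equal lengths sit next to each other
val lengths = ls.map(_.length).sorted   // List(1, 2, 2, 2, 3, 3, 4)

// Run-length encoding: (count, length) pairs
val runs = encode(lengths)              // List((1,1), (3,2), (2,3), (1,4))

// Swap each pair so the length becomes the key and the count the value
val pairs = runs.map(_.swap)            // List((1,1), (2,3), (3,2), (4,1))

// ": _*" passes the list's elements as separate arguments to Map(...)
val freqs = Map(pairs: _*)              // Map(1 -> 1, 2 -> 3, 3 -> 2, 4 -> 1)
[/scala]

Spread over four named steps, the logic is easy enough to follow. Packed into one expression, it takes real effort to unpick.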
[scala]
def lsortFreq[T](ls: List[List[T]]) = {
  val lengthMap = scala.collection.mutable.Map[Int, Int]()
  for (l <- ls) {
    val len = l.length
    if (!lengthMap.contains(len)) {
      lengthMap(len) = 1
    } else {
      lengthMap(len) += 1
    }
  }
  ls sortWith {(a, b) => lengthMap(a.length) < lengthMap(b.length)}
}
[/scala]
In my function, I create a mutable Map and then iterate over the list, getting each item’s length and keeping a running tally of how many items have that length. The map has these lengths as its keys, and the number of items with that length as its values. Get it? I then sort the original list by having each item in the comparison look up how many items share its length, and use that as the sort criterion.
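For comparison, the same tally can be built without a mutable map. This is a sketch of a third approach, not what either solution above does, and lsortFreqGrouped, counts, input, and result are names I made up for it.

[scala]
def lsortFreqGrouped[T](ls: List[List[T]]): List[List[T]] = {
  // Group the inner lists by length, then keep only the size of each group
  val counts: Map[Int, Int] =
    ls.groupBy(_.length).map { case (len, group) => len -> group.size }
  // sortBy is stable, so lists whose lengths are equally common keep their original order
  ls.sortBy(l => counts(l.length))
}

val input = List(List("a", "b", "c"), List("d", "e"), List("f", "g", "h"), List("d", "e"),
  List("i", "j", "k", "l"), List("m", "n"), List("o"))

val result = lsortFreqGrouped(input)
// List(List(i, j, k, l), List(o), List(a, b, c), List(f, g, h), List(d, e), List(d, e), List(m, n))
[/scala]

It is shorter than my loop, but it assumes the reader already knows what groupBy returns, which is exactly the kind of trade-off this post is about.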
I have no idea which of these solutions is more efficient. For small problems like this, I doubt there’s any measurable difference. But I do believe that my solution is easier to read and understand. So much so, in fact, that I think someone who is not familiar with Scala would be able to easily figure out what it’s doing. I don’t know that the same can be said of the other solution.
I got criticized for promoting terse code in this article, so this is my attempt at balance.
Note: I did change the inputs to these functions from symbols to strings. The code formatter I use on the blog wasn’t colorizing things properly when there were symbols involved.
