# Scala's Modular Roots

Earlier this year Martin Odersky went on a whistle-stop tour of conferences and meetups with his latest talk, ‘Scala — The Simple Parts’. The first stop was the flatMap(Oslo) conference in Norway, and then one might have observed a blur resembling Martin across Europe and the US, with a grand finale at this years’s Scala Days.

- 12 May 2014 flatMap(Oslo)
- 13 May 2014 Jax
- 19 May 2014 Gilt Groupe and Hunter College
- 21 May 2014 GOTO Chicago
- 22 May 2014 SF Scala
- 16 June 2014 Scala Days

Similar to his 2013 talk ‘Scala with Style’, which also concluded its tour at that year’s Scala Days, Odersky’s 2014 talk appeared as a mix of advocacy, evangelism, and an attempt by a language designer to make explicit his authorial intent—as much as that is possible. Due to the scheduling, ‘Scala — The Simple Parts’ had an air of a roadshow (he is a founder of Typesafe after all); due to the content, it also had an air of defensiveness. This latter aspect is, while understandable, rather unfortunate. I will not attempt to describe or diagnose the nature and intricacies of the Scala community, but I have the sense that Odersky’s attempt to highlight the essence of Scala–-the simple parts, as *he* sees them—was ultimately received with some skepticism. While Odersky seemed successful in drawing out seven core aspects of Scala, it remains a language that, as a whole, is powerful, expressive, unopinionated, and *complicated*. And what’s more, I also have the sense that many ‘Scala people’ are “ok with that”.

The purpose of this article, however, is not to critique Odersky’s talk and vision, but to explore just a single slide. Naturally, he refined his slide deck over the course of the tour, but one constant and core idea in his talk was Scala as a ‘modular language’. The original academic agenda for Scala was to create a hybrid language that combines object-oriented and functional programming. In his talk, Odersky laments that Scala is seen as precariously balancing between OOP and FP, and argues that rather than call Scala an ‘*object/functional*’ language—thus setting itself up to be caught in the crossfire of perpetual holy wars—we should instead call it a ‘modular’ language. With this context, the slide I’d like to discuss in depth is titled “Scala’s Modular Roots”. Here, Odersky points out the inspiration he drew from Modula-2, Modula-3, Haskell, and Standard ML; of particular interest is the equivalence he draws between Scala and Standard ML’s module system.

```
Object ≅ Structure
Class ≅ Functor
Trait ≅ Signature
Abstract Type ≅ Abstract Type
Refinement ≅ Sharing Constraint
```

## Scala à la ML modules

For the rest of this article we’re going to walk through an example to aid our exploration of the above equivalence between Scala and Standard ML. I’m going to use a set data structure adapted from chapter 2.2 of Purely Functional Data Structures by Chris Okasaki. This is a handy source as all the implementations in this book are in Standard ML (and Haskell in the appendix), so we can compare our Scala translations against the original.

Before we proceed much further we should make sure we have an understanding of the ML module system.

An ML

structurecombines related types, values and other structures, with a uniform naming discipline. An MLsignaturespecifies a class of structures by listing the name and type (or other attributes) of each component. […] ML also providesfunctors—structures taking other structures as parameters—[…]

L.C. Paulson, *ML for the Working Programmer*, 2nd Edition, (1996) pp.59

## Signatures as Traits

We will come back to ML functors in due course, but let’s start with a signature. To implement a set, we’re going to need to be able to compare the elements, so we’ll demand that our elements are ordered.

```
signature ORDERED =
sig
type T
val compare : T × T -> int
end
```

`ORDERED`

is our ML signature, which specifies a type `T`

and a function named `compare`

that takes a pair of `T`

s to an `int`

. The contract is that the result of compare is negative if the first is less than the second, zero if equal, otherwise, positive. This is not the signature listed by Okasaki, but the signature here is closer to the `Ordering`

trait in Scala standard library:

```
trait Ordering[T] {
def compare(x: T, y: T): Int
}
```

This is the first case of Odersky’s equivalence: signatures and traits. There are some obvious differences. Leaving aside the fact that ML signatures are pure interface and Scala traits can contain implementation, the `Ordering`

trait is a parameterized type—it has a type parameter `T`

. In contrast, `ORDERED`

declares a type `T`

as part of the signature in the same manner as it declares the function `compare`

. But, this is no concern, as in Scala we can trade positional type parameters for abstract type members, as follows:

```
trait Ordering {
type T
def compare(x: T, y: T): Int
}
```

Now we really have a exact correspondence. There is a larger discussion to be had for type parameters versus type members, but we’ll shelve it for now.

## Structures as Objects

Now let’s see a structure. The following is a structure that implements the signature for integers. (There is an obvious overflow bug here, but we’ll gloss over that for the sake of brevity.)

```
structure IntOrdered : ORDERED =
struct
type T = Int
fun compare(x, y) = x - y
end
```

Odersky’s equivalence says we should translate an ML structure as a Scala `object`

. We saw two traits above, so we’ll show both objects.

```
object IntOrdering extends Ordering[Int] {
def compare(x: Int, y: Int): Int = x - y
}
object IntOrdering extends Ordering {
type T = Int
def compare(x: Int, y: Int): Int = x - y
}
```

The former is roughly what appears in the standard library. The latter clearly corresponds, modulo syntax, exactly to the ML structure.

## Functors as Classes

Now that we have an abstraction for comparable elements in place, we can move on to implementing a set data structure. Let’s see an adaptation of Okasaki’s set in ML in its entirely first. First, the signature:

```
signature SET =
sig
type Elem
type Set
val empty : Set
val insert : Elem × Set -> Set
val member : Elem × Set -> bool
end
```

This signature declares types for both the collection type and the element type, as well as functions to construct an empty set, insert a element, and test membership.

We’ve reached the next component of Odersky’s equivalence—Scala classes and ML functors—so we need a little more detail on what exactly an ML functor is.

If structure

`B`

depends on structure`A`

, and we wish to replace`A`

by another structure`A'`

, […] ML lets us declare`B`

to take a structure as a parameter. We can then invoke`B(A)`

and`B(A')`

, possibly at the same time. A parametric structure, such as`B`

, is called afunctor. Functors let us treat`A`

and`A'`

as interchangeable parts.

L.C. Paulson, *ML for the Working Programmer*, 2nd Edition, (1996) pp.257

Okasaki’s implementation of a set is as a binary tree, and it is formulated as a functor so that it is parametric in the element and its ordering—the following is our adaptation.

```
functor UnbalancedSet(Element : ORDERED): SET =
struct
type Elem = Element.T
datatype Tree = E | T of Tree × Elem × Tree
type Set = Tree
val empty = E
fun member (x, E) = false
| member (x, T (a, y, b)) =
if Element.compare (x, y) < 0 then member (x, a)
else if Element.compare (x, y) > 0 then member (x, b)
else true
fun insert (x, E) = T (E, x, E)
| insert (x, s as T (a, y, b)) =
if Element.compare (x, y) < 0 then T (insert (x, a), y, b)
else if Element.compare (x, y) > 0 then T (a, y, insert (x, b))
else s
end
```

The `UnbalancedSet`

functor take a structure that is an *instance* of the `ORDERED`

signature, and produces a structure that is an instance of the `SET`

signature. The `Elem`

type of the set is implemented as the `T`

of the ordering structure `Element`

; the `Set`

type is implemented as a basic binary tree and in terms of the `Elem`

type. The rest of the `SET`

signature is implemented as one would expect, making use of the `compare`

function that comes from the `Element`

parameter. To construct a concrete structure, we simply apply the functor to a chosen ordering. The following constructs a structure of integer trees using the ordering structure we defined for integers earlier.

```
structure UnbalancedIntSet = UnbalancedSet(IntOrdered)
```

Even if you are unfamiliar with Standard ML, I’m sure the above is clear and comprehensible.

## Scala, the *object/functional* language

Let’s turn our attention back to Scala. While I’m cautious to claim that the following implementation of a set is ‘idiomatic’ Scala, it certainly hits all the facets of Scala that make it an ‘*object/functional*’ language. It also stakes out a point in the spectrum that is far away from the ML implementation above, which we will attempt to reconcile.

```
sealed abstract class Set[+A] {
import Set._
def insert[B >: A](x: B)(implicit o: Ordering[B]): Set[B] = this match {
case Leaf => Branch(Leaf, x, Leaf)
case b@Branch(l,y,r) =>
if (o.compare(x, y) < 0)
Branch(l.insert(x), y, r)
else if (o.compare(x, y) > 0)
Branch(l, y, r.insert(x))
else b
}
def member[B >: A](x: B)(implicit o: Ordering[B]): Boolean = this match {
case Leaf => false
case Branch(l,y,r) =>
if (o.compare(x, y) < 0)
l.member(x)
else if (o.compare(x, y) > 0)
r.member(x)
else
true
}
def isEmpty: Boolean
}
object Set {
def empty[A]: Set[A] = Leaf
case object Leaf extends Set[Nothing] {
val isEmpty = true
}
final case class Branch[+A](
left: Set[A],
elem: A,
right: Set[A]) extends Set[A] {
val isEmpty = false
}
}
```

This implementation is quite different in many regards, and it also highlights many of key features of Scala, some of which made Odersky’s list of ‘simple parts’.

First off, it follows an object-oriented style of methods over functions. For example, set insertion will used as `mySet.insert(x)`

rather than `SetStruct.insert(x, mySet)`

—the set data structure is the receiver of the methods. Related to this is the use of a ‘companion object’, where we provided the static method `empty`

to construct an empty set. In some sense, `empty`

here is much closer to the ML implementation, than `insert`

or `member`

. I’ve chosen to nest `Leaf`

and `Branch`

inside the companion object, for no reason other than to highlight Scala’s flexibility in allowing one to nest most things in most other things.

The set is a parametric type, like our first declaration of the `Ordering`

trait. Moreover, we are using declaration-site variance and we have declared our set to be covariant with the `+`

. (As an aside, see this Stackoverflow question and Scala-user thread for a discussion on the invariance of `Set`

in the Scala standard library.) This still requires use-site variance for the `insert`

and `member`

methods where the parametric type appears in contravariant position for the method parameters. Note that `Leaf <: Set[Nothing]`

, and because of `Set`

’s covariance, `Leaf <: Set[A]`

for all `A`

.

In his talk, Odersky made a point of highlighting that `if`

—`then`

—`else`

is an expression in Scala, not a statement. Another expression is `match`

, which mirrors ML’s pattern matching and function definition by cases. In ML the binary search tree is defined as an algebraic data type.

```
datatype Tree = E | T of Tree × Elem × Tree
```

This is a sum-of-products, recursive representation of a tree. There are two cases `E`

and `T`

, the former being a nullary constructor and the latter a product of a left tree, an element, and a right tree. This is expressed in Scala using the `case`

keywords,

```
case object Leaf extends Set[Nothing]
final case class Branch[+A](
left: Set[A],
elem: A,
right: Set[A]) extends Set[A]
```

as well as the `sealed`

keyword on the `abstract`

`Set`

class. The combination of these two aspects means that we can pattern match over a closed, finite set of cases (the consequence of a *sealed* type hierarchy) and as well as extract the contents of each case (using the `unapply`

methods generated for us by `case`

, *cf.* Chapter 15 of Programming in Scala, First Edition).

Scala being a hybrid *object/functional* language, there is often more than one way of doing things. What was just described is the functional approach—the approach taken by ML as well as Haskell and other related languages. The object-oriented approach, is demonstrated with the `isEmpty`

method. The method signature is given in the supertype `Set`

and left abstract; the two subtypes, `Leaf`

and `Branch`

, provide an implementation. The correct code path is determined by dynamic dispatch instead of by pattern matching. We could have implemented `member`

and `insert`

this way as well, and there would be no need for `sealed`

or `case`

—indeed, this would be the typical implementation strategy in Java (although the visitor pattern is also an option). In contrast to *closed* world of the functional approach, this is an *open* world, where we are free to expand the cases at some later point. For example, we could turn the binary tree into a 2-3 tree, by introducing

```
final case class Branch4[+A](
left: Set[A],
elem1: A,
middle: Set[A],
elem2: A,
right: Set[A]) extends Set[A] {
val isEmpty = false
}
```

Due to openness, we can provide the appropriate implementation for `isEmpty`

here; due to closedness, we would have to go back and rewrite the implementations of `member`

and `insert`

, as there is a now new case. The flip side is that with the closed-world assumption, we can freely add new methods based on pattern matching without having to touch `Leaf`

or `Branch`

. This tradeoff is the essence of The Expression Problem. However, this is a much larger discussion, and we will leave this aside before we get side-tracked any further.

## Direct ML style

Returning to the topic at hand, let’s now jump to the other end of the spectrum and attempt to replicate the ML set signature and the ML functor for sets implemented as unbalanced binary trees. The signature is straightforward:

```
trait SetSig {
type Elem
type Set
def empty: Set
def insert(e: Elem, s: Set): Set
def member(e: Elem, s: Set): Boolean
}
```

There should be nothing surprising here: this is close to a transliteration of the ML signature. However, this becomes a tad tricky when we attempt to do the same for the functor. The following is a first stab, which doesn’t quite work out.

```
class UnbalancedSet(val Element: Ordering) extends SetSig {
type Elem = Element.T
sealed trait Set
case object Leaf extends Set
case class Branch(left: Set, elem: Elem, right: Set) extends Set
val empty = Leaf
def member(x: Elem, s: Set): Boolean = s match {
case Leaf => false
case Branch(l, y, r) =>
if (Element.compare(x, y) < 0)
member(x, l)
else if (Element.compare(x, y) > 0)
member(x, r)
else
true
}
def insert(x: Elem, s: Set): Set = s match {
case Leaf => Branch(Leaf, x, Leaf)
case Branch(l, y, r) =>
if (Element.compare(x, y) < 0)
Branch(insert(x, l), y, r)
else if (Element.compare(x, y) > 0)
Branch(l, y, insert(x, r))
else
s
}
}
```

The primary constructor of the class takes a instance of `Ordering`

(the one we declared earlier) in much the same way that the ML functor took a parameter. The first nuance is that the constructor parameter must be turned into a public member. If we were to instead only do,

```
class UnbalancedSet(Element: Ordering) extends SetSig {
…
```

we would get the following compilation error:

```
[error] private value Element escapes its defining scope as part of type UnbalancedSet.this.Element.T
[error] type Elem = Element.T
[error] ^
```

The issue here is that we are refining the abstract type member `Elem`

that was declared in the signature to be equal to the type member `T`

of the value `Element`

of type `Ordering`

. Therefore, as `Elem`

is public, `Element`

must also be.

Now we encounter the next issue. While the code above will compile, it will not be very useful to us. Observe the following console session.

```
scala> object S extends UnbalancedSet(IntOrdering)
defined module S
scala> S.insert(1, S.empty)
error: type mismatch;
found : Int(1)
required: S.Element.T
S.insert(1, S.empty)
^
```

Unfortunately we can’t get very far. This is because the type checker is unable to prove that the type `T`

of the `Ordering`

value `Element`

for this module `S`

is in fact `Int`

. This would require it to “track values in types”. To help us understand the workaround, let’s first expand the shorthand that is occurring in the primary constructor.

```
class UnbalancedSet(val Element: Ordering) extends SetSig { …
```

is equivalent to

```
class UnbalancedSet(o: Ordering) extends SetSig {
val Element: Ordering = o
…
```

When object `S`

extends this, the type of `Element`

is still just `Ordering`

, so it has no idea what the concrete type of `Element.T`

is as `Ordering#T`

is abstract. To help the type checker we’re going to need to do some type refinement. The first step is to leave `Element`

abstract.

```
abstract class UnbalancedSet extends SetSig {
val Element: Ordering
…
```

The second step is the refinement.

```
object S extends UnbalancedSet {
val Element = IntOrdering
}
```

And now everything will work as expected. Why? Well, it may be instructive to consider the following. If we were to adjust the above to,

```
object S extends UnbalancedSet {
val Element: Ordering = IntOrdering
}
```

we’d be back where we started—we’ve thrown away type information. If we specify *all* the types, we get:

```
object S extends UnbalancedSet {
val Element: IntOrdering.type = IntOrdering
}
```

So, when the type checker tries to unify `Int(1)`

with `S.Element.T`

, it now knows that the type of the value `Element`

is `IntOrdering.type`

, and `IntOrdering.T`

is `Int`

, so `Element.T`

is `Int`

. Problem solved!

## Mixed ML and Scala style

So, after a little type-gymnastics to ensure that abstract types are fully refined, we have successfully translated ML signatures, structures, and functors into equivalents in Scala. Often, the translation was simple enough to be more akin to a transliteration; however, the style of using our sets,

```
S.insert(1, S.empty)
```

is function-like, not method-like. To adhere to the object style of Scala, we would prefer to write something like,

```
S.empty.insert(1)
```

where the set data structure is the receiver for methods like `insert`

and `member`

(while `empty`

still makes sense as a constructor function).

To set this up we need something akin to a nested signature:

```
trait SetModule {
type Elem
type Set <: SetSig
trait SetSig {
def insert(e: Elem): Set
def member(e: Elem): Boolean
}
def empty: Set
}
```

The outer trait defines two abstract type members, `Elem`

and `Set`

, same as before, however, now `Set`

is constrained by the inner trait `SetSig`

, which declares the signature of set-like objects. Finally there is the `empty`

constructor. Note that we could have done something like,

```
trait SetModule {
type Elem
trait SetSig {
def insert(e: Elem): SetSig
def member(e: Elem): Boolean
}
def empty: SetSig
}
```

however, the types here are slightly less precise. Before, we could say that whatever refines `Set`

by implementing the `SetSig`

signature, that’s the type that `insert`

and `empty`

will return. As an aside, the naming convention to draw the parallels with ML is breaking down here: I have chosen the suffix `Module`

for the outer trait and `Sig`

for the inner trait.

Finally, we can implement an abstract class that implements `SetModule`

using the same unbalanced binary tree as before.

```
abstract class UnbalancedSet extends SetModule {
val Element: Ordering
type Elem = Element.T
sealed abstract class Set extends SetSig {
def member(x: Elem): Boolean = this match {
case Leaf => false
case Branch(l, y, r) =>
if (Element.compare(x, y) < 0)
l.member(x)
else if (Element.compare(x, y) > 0)
r.member(x)
else
true
}
def insert(x: Elem): Set = this match {
case Leaf => Branch(Leaf, x, Leaf)
case Branch(l, y, r) =>
if (Element.compare(x, y) < 0)
Branch(l.insert(x), y, r)
else if (Element.compare(x, y) > 0)
Branch(l, y, r.insert(x))
else
this
}
}
case object Leaf extends Set
case class Branch(left: Set, elem: Elem, right: Set) extends Set
val empty = Leaf
}
```

This presents something that is definitely inspired by ML’s module system, but retains a bit more of Scala’s character.

## Conclusions

I should make it clear that none of what has been presented here is novel: the techniques for expressing modular systems in Scala is clearly near and dear to Martin Odersky’s heart. In fact, the final formulation of `SetModule`

follows the same formulation as a module for a graph data structure that Odersky shows towards the end of his ‘Simple Parts’ talk (and in more detail in a CACM article ‘Unifying functional and object-oriented programming with Scala’). Abstractly providing the element type of the set via the `Ordering`

signature does go a little beyond what Odersky shows with his graph module; however, the issue of helping the type checker along with type refinement should be familiar to those who have written bigger macros, as that is exactly what is needed when passing the path-dependent macro context around components of a large macro.

Scala is clearly a powerful language: powerful enough that you can craft your own module system, and emulate a very well regarded one, such as ML’s, as we have seen here. What I have presented, I am not necessarily advocating or endorsing—although Odersky has used something similar to promote Scala, the ‘modular’ language—I am simply attempting to explore the design space and hope to encourage others to do the same. Another powerful language feature in Scala are implicits. There are plenty of use cases, but probably the best known is using them to emulate Haskell’s type classes. Projects such as Miles Sabin’s Shapeless are at the cutting edge of what is possible with Scala, and implicits are used there to great effect. Operations on Shapeless HLists take implicit parameters that amount to proof objects—proofs of properties such as the length of the list. This is commonly seen in fully dependently-typed languages such as Coq, Agda, and Idris; on the latter, one should watch Miles Sabin and Edwin Brady’s talk ‘Scala vs. Idris’. Implicits were barely used in this article, only when showing the initial, ‘idiomatic’ Scala implementation of a set; however, it seems reasonable to suspect that implicits could be used to greater effect when crafting a module system inspired by ML’s—a language feature that ML does not itself possess.

When first talking about the `Ordering`

trait we briefly discussed the choice between type parameters and abstract type members, but set aside any detailed discussion. This was alluding to another talk by Oderky, ‘The Trouble With Types’ at Strange Loop ’13, where he contrasted parameters with abstract members, characterizing the former as positional and functional, and the latter as nominal and object-oriented (or modular). We have seen some of that contrast in this article. The larger news from the talk at Strange Loop was the future of Scala in DOT, a calculus for ‘Dependent Object Types’, which is proposed as a type-theoretic foundation for Scala. Should a future Scala have DOT as its core, type parameters would certainly remain, given their convenience, but they would be syntactic sugar on top of abstract type members. This is the *modular* language on the horizon.

Finally, I should highlight what has been left out here. As objects are values in Scala, that makes what we have show a first-class module system. While some members of the ML family have added extensions to cover this, an ML module is not first-class as standard. It would be interesting to discuss the consequences of this further. Also, we have left two aspects of Odersky’s equivalence between Scala and the ML module system, namely abstract types in an ML sense and sharing constraints. This will also need to be left for another time.

## Further links

The following are some links for further reading.

## Scala — The Simple Parts

## Purely Functional Data Structures

- The book at Cambridge University Press
- The book at Google Books
- The original PhD dissertation
- A retrospective from Chris Okasaki

## Scala and Modules

- Modular Programming Using Objects (Chapter 27 of Programming in Scala, First Edition)
- Unifying functional and object-oriented programming with Scala
- A Nominal Theory of Objects with Dependent Types
- Scalable Component Abstractions
- Mixin’ Up the ML Module System

## Bio

Dan James is an API-Lead at Pellucid Analytics. He has a PhD in Computer Science from Oxford University. He’s a major contributor to our open source project Datomisca, a Scala library for Datomic, as well as our Framian and AWS-Wrap libraries. Rich Content