## Introduction to Mezzo

- January 11, 2013

Some of you may have heard about it already: this blog post is about
a new programming language that’s being designed right there at Gallium,
codenamed * Mezzo*.

Designing a new programming language is nothing out of the
extraordinary: many students in our research area do it for a living,
and I’m no exception. I’ve been doing the traveling salesman, running
presentations of Mezzo all over the place: Texas, New York, and going
even as far as *Plateau de Saclay*. Now is the time for a blog
post.

* Mezzo* is a joint work between François
Pottier, my advisor, and myself. Recently, Thibaut Balabonski joined our
team as a post-doctoral researcher to work on the concurrent aspects of
the language; we also had an undergrad intern, Henri Chataing, work on
the compilation part of the project.

* Mezzo* draws inspiration from many other
research projects; I don’t want to bother the reader with a long
introduction – this is not a research paper: I will provide an extended
comparison of our work with other material in a later blog post. For the
moment, let me just mention that we blend ideas from affine type
systems, separation logic, and systems based on regions and
capabilities.

* Mezzo* is still in a preliminary stage, but
I feel like it is time to start the revolution, and introduce this
almighty language to a wider audience. Behold!

### What is **Mezzo**?

**Mezzo**

**Mezzo**^{1} draws most of its
inspiration from ML. Therefore, a big chunk of the language reads and
feels just like ML (keywords: strict evaluation, functional, impure).
However, there’s a reason we’re designing a new language: we want to go
further than ML.

- We want to reject some programs previously accepted by ML; in a sense, we want to forbid patterns we consider dangerous.
- Conversely, we want to accept some programs previously deemed unsafe by an ML type-checker: we want to make more programming patterns possible.

Most of these improvements come at a cost, naturally. The design that I’m about to introduce is still very much in flux, and time will tell whether the cognitive cost of our new system is worth the benefits.

However, we deliberately tried to cut down the complexity of the system: in a sense, we don’t want to end up doing proofs within the type system. Therefore, the mechanisms that I’m about to introduce may seem low-tech compared to recent research works, but we believe that’s what makes it possible to explain the language to a (seasoned) ML programmer, and also what makes writing a type-checker relatively doable, without requiring a huge burden of annotations from the programmer.

### Permissions

This blog post assumes that the reader is familiar with ML; in particular, the reader is expected to know what references are, what a boxed value is, and how only things in the heap may be mutable.

One of the goals for * Mezzo* is to offer
better control of

*aliasing*and

*ownership*of mutable data. In order to make this possible,

*replaces the traditional concept of “type” with that of “permission”.*

**Mezzo**In essence, one can view a permission as a conceptual access token to
a part of the heap, that is, a heaplet. A permission both describes the
shape of the heaplet, and grants its owner the right to *access*
this heaplet. Permissions are an artefact of the type system: they do
not exist at runtime.

Consider the following piece of code:

```
let x = (2, "foo") in
...
```

Right after this line of code, a new permission is made available to
the programmer, namely `x @ (int, string)`

. This permission
basically tells the programmer that `x`

points to a block in
the heap, with two fields. The first one contains an integer, and the
second one contains a pointer to a string. Moreover, the permission
grants its owner the right to access this portion of the heap.

At any given point in the program, a set of permissions is available. They determine the fragment of the heap that we can access at this point of the program.

In the example above, several permissions may be available along
`x @ (int, string)`

; we chose to focus on the small subset of
permissions we are interested in. In general, there is more than just
one permission available in a given scope; for instance, we may have
`x @ (int, string) ∗ y @ float`

, that is, both
`x @ (int, string)`

and `y @ float`

at the same
time: `∗`

denotes the conjunction of two permissions^{2}.

#### What are permissions useful for?

##### A simple example: the
`length`

function

So far, there’s nothing new here. Let us stay low-tech for a moment
and consider the signature of the `length`

function that, as
the name suggests, computes the length of a `list`

.

```
val length: [a] (x: list a) -> int
```

`[a]`

denotes universal quantification, that is,
`length`

is valid for any type `a`

, and
`x`

is a binder, that is, it gives us a way to talk about the
argument of `length`

.

The `length`

function, in order to perform its job,
requires that the *caller* passes it the “right” to read the list
at address `x`

. We say that this function needs the
permission `x @ list a`

to run: indeed, it needs to iterate
over the elements of the list in order to compute its length, and the
aforementioned permission enables it to do just that.

We understand this function to *return to its caller* the
permission `x @ list a`

, along with a value of type
`int`

. In other words, if comments denote the permissions
available at a given program point:

```
(* empty *)
let x = ... in
(* x @ list a *)
let l = length x in
(* l @ int ∗ x @ list a *)
```

The permission `x @ list a`

is temporarily taken from the
caller’s context, conceptually passed to the `length`

function for the duration of its execution, then returned to the caller
once `length`

terminates.

##### A more
sophisticated example: the `mswap`

function

We are now ready to move on to a more sophisticated example that will
demonstrate the power of * Mezzo*! We now
consider

*mutable*pairs of elements, which we write as

`mpair a b`

(think `('a, 'b) mpair`

^{3}). We wish to write a function that will swap,

*in place*, the two components of a pair. That is, without allocating a fresh pair.

What is the type for this function?

```
val mswap: [a, b] (consumes x: mpair a b) -> (| x @ mpair b a)
```

Let us review this fairly complex type. There are several elements at play here.

- Unlike the
`length`

example, the argument`x`

is*consumed*, that is, when the caller passes the`x @ mpair a b`

permission to`mswap`

, it*loses it*. - However,
`mswap`

will return to the caller a different permission saying that`x`

, after`mswap`

has run, has a different type, namely`x @ mpair b a`

. - The
`(|p)`

syntax indicates that the function returns a permission`p`

, but no value^{4}

```
(* empty *)
let x = ... in
(* x @ mpair a b *)
mswap x;
(* x @ mpair b a *)
```

This is a function that we cannot write in ML. Indeed, in order for
this function to be sound, we must make sure that no one else can ever
hold an *outdated* copy of `x @ mpair a b`

. Therefore,
`mswap`

must have an *exclusive* access to its
argument. * Mezzo* provides such a guarantee:

*mutable portions of the heap are guaranteed to have a unique owner*; therefore,

`mswap`

*does*have an exclusive access to its argument:

`x @ mpair a b`

is an
exclusive permission, and the system guarantees that it has an unique
owner. In particular, as soon as `mswap`

runs, the caller has
lost its permission on `x`

.#### Different flavors of permissions

Getting back to our very first example, the fragment of the heap that
the permission `x @ (int, string)`

describes is immutable;
therefore, it is safe to share this knowledge with others. In other
words, *this permission can be freely copied*, and passed to
other parts of the programs (think about threads, for instance). This
permission is said to be *duplicable*.

As we just saw, not all permissions are duplicable. Because
* Mezzo* aims to control aliasing to mutable
data, we make sure that only one person at a time can “see” a mutable
fragment of the heap: this is of utmost importance when it comes to
concurrency. The way we enforce this is by forbidding duplication of
such a permission. For instance,

`x @ mpair a b`

is said to
be *exclusive*, and

*cannot*be copied.

Therefore, having two copies of a permission is not necessarily the same as having just one copy.

##### The mode system

Some functions can only be type-checked if they know that their
argument is duplicable. The `find`

function, because it
returns a pointer to an element in the list, must require that the type
of the elements in the list be duplicable: indeed, because a pointer to
the element is returned to the caller *and* remains in the list
at the same time, there are two people pointing to it, so it better be
duplicable.

We express this fact using a mode constraint:

```
val find: [a] duplicable a => (a -> bool, list a) -> option a
```

#### Accounting for aliasing relationships

So far, we’ve seen that * Mezzo* provides a
better control of ownership by making sure a mutable part of the heap
has a unique owner;

*also describes effects in a more accurate manner compared to traditional ML, by leveraging permissions to give finer-grained types to functions.*

**Mezzo*** Mezzo* is able to talk about aliasing as
well, through the use of

*singleton types*. A singleton type basically means “same as”, as is written

`=t`

. For instance,
if a variable `x`

“has type” `=y`

, this means that
`x`

and `y`

are aliases for each other. We write
`x @ =y`

or, more concisely, `x = y`

.```
let x = ... in
(* x @ τ *)
let y = x in
(* x @ τ ∗ y = x *)
...
```

The important point here is that we *did not duplicate* the
permission `x @ τ`

, which would be illegal, since we know
nothing about `τ`

. We merely added a new permission that
accounts for the aliasing relationship.

Similarly, if we hold `x @ mpair a b`

, we know that
`x`

points to a block in the heap with two slots, the first
one having type `a`

and the second one having type
`b`

. This amounts to the following drawing:

```
(l)
,-----,
,---------------------> | a |
(x) / `-----`
,-----/------------,
| / | | (r)
| left | right | ,-----,
| | \ | ,--> | b |
`---------------\--` / `-----`
\ /
`-----`
```

* Mezzo* is able to accurately reflect this
aliasing pattern through the permission system. Let us consider the
definition (finally!) of the

`mpair`

type.```
mutable data mpair a b =
MPair { left: a; right: b }
```

In * Mezzo*, sum type and record types are
fused together.

*is able to automatically transform:*

**Mezzo**```
x @ mpair a b
```

into

```
x @ MPair { left: a; right: b }
```

thus reflecting the fact that `x`

is a block with two
fields, named left and right. However, just like I’ve given names to the
two anonymous blocks with types `a`

and `b`

,
* Mezzo* will introduce names for these
components and say that this permission is also equivalent to the
following conjunction:

```
x @ MPair { left: = l; right: = r } ∗ l @ a ∗ r @ b
```

We write this more concisely as:

```
x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b
```

(note the `left = l`

shorthand without a colon).

#### A final example

Here is an example that puts all these mechanisms together: the
implementation of the `mswap`

function. As usual, comments
indicate the set of available permissions.

```
val mswap [a, b] (consumes x: mpair a b): (| x @ mpair b a) =
(* We start off with:
x @ mpair a b
which the type-checker automatically converts into:
x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b *)
let t = x.left in
(* x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b ∗ t = l *)
x.left <- x.right;
(* x @ MPair { left = r; right = r } ∗ l @ a ∗ r @ b ∗ t = l *)
x.right <- t
(* We end up with:
x @ MPair { left = r; right = t } ∗ l @ a ∗ r @ b ∗ t = l
which the type-checker automatically converts into:
x @ mpair a b
in order to satisfy the return type of the function. *)
```

At the end of the function, the type-checker, in order to satisfy the
return type of the function, must check that the final expression “has
type” `()`

, that is, the unit type: it is the case. It must
also check that the new permission `x @ mpair b a`

is
available: by recombining the final set of permissions, we can indeed
trade it for `x @ mpair b a`

: the return condition of the
function is satisfied, and the example above will type-check^{5}:

The expressive power of * Mezzo* allows it to
accurately reflect the aliasing and points-to relationships in the type
system. This is what makes this example type-check, since we need a very
precise reasoning in order to justify its soundness. In a sense, the
permission system is close, in this example, to symbolic execution, or
abstract interpretation.

### An incomplete presentation of Mezzo

The permission mechanism is very good at describing local
relationships between identifiers, but this very precise description
cannot scale to larger^{6}, more complex structures such as
doubly-linked lists. Indeed, there are some patterns that the permission
system cannot express at this stage: if a list contains the same
exclusive element twice, for instance, the permission system will never
accept such a list: there would be two ways of reaching the exclusive
element, which means that one could gain two permissions for it: this is
something that the system explicitly disallows.

In a second blog post, I will detail the main contribution that we
bring with * Mezzo*, namely, a powerful adoption
and abandon mechanism that provides a way to express these examples
using permissions, at the expense of having dynamic checks.

### Some bonus examples

In case you are still skeptical about the benefits of such a
sophisticated (if not complex) type system, here are some examples
slightly more motivating than the `mswap`

example which, I
admit, will rarely pop up “in the real world”.

These examples are taken from the test suite of our prototype type-checker. I’ll give a brief overview of the current state of our type-checker in a subsequent post.

#### Tail-recursive
`list::concat`

without `Obj.magic`

This function is written in destination-passing-style; that is, the
`concat1`

function keeps a pointer to the previous cell.
Cells are mutable: they represent an “unfinished” list cell; once their
`tail`

field has been mutated, `cell`

s can be
“frozen” into a regular list cell, thus enabling tail-recursion.

```
data list a =
| Cons { head: a; tail: list a }
| Nil
mutable data cell a =
| Cell { head: a; tail: () }
val rec concat1 [a] (
consumes dst: cell a,
consumes xs: list a,
consumes ys: list a): (| dst @ list a)
=
match xs with
| Nil ->
dst.tail <- ys;
tag of dst <- Cons
| Cons { head; tail } ->
let dst' = Cell {
head = head; tail = ()
} in
dst.tail <- dst';
tag of dst <- Cons;
concat1 (dst', tail, ys)
end
val concat [a] (
consumes xs: list a,
consumes ys: list a): list a
=
match xs with
| Nil ->
ys
| Cons { head; tail } ->
let dst = Cell {
head = head; tail = ()
} in
concat1 (dst, tail, ys);
dst
end
```

We also have a tail-recursive version of `map`

that uses
the exact same trick.

#### Computing the size of a tree

The following function:

- computes the size of a tree,
- while annotating all sub-trees with their sizes,
- accepts a tree that was previously annotated with any kind of data.

```
mutable data tree a =
| Node { left: tree a; right: tree a; size: a }
| Leaf
val rec tree_size (consumes x: tree unknown): (int | x @ tree int) =
match x with
| Leaf ->
1
| Node { left; right } ->
let l = tree_size left in
let r = tree_size right in
let total = l + r + 1 in
x.size <- total;
total
end
```

Former names for this language include: “the language of the future” and “Ha

**ML**eT”. My very own proposal of “Kre**ML**in” never really took off.↩︎A man of the trade will undoubtedly feel an immediate familiary with separation logic. Indeed, our

`*`

conjunction acts as a separating conjunction on*exclusive*portions of the heap. On*duplicable*portions of the heap, it merely indicates that the set of permissions must be consistent with itself. More on that in the next section.↩︎Unlike OCaml, type applications are done the same way as term applications, namely, left-to-right.↩︎

The

`(t|p)`

syntax allows one to bundle a type`t`

along with a permission`p`

. For the special case where`t`

is equal to the unit type`()`

, we write`(|p)`

.↩︎Function types are always annotated in

: we specify the type of the arguments as well as the return type of the function. If the function needs any extra permissions to run, these must be specified as well.↩︎**Mezzo**Precisely, the permission system we just saw cannot express structures of arbitrary length with sharing.↩︎