## Introduction to Mezzo

- January 11, 2013

Some of you may have heard about it already: this blog post is about a new programming language that's being designed right there at Gallium, codenamed * Mezzo*.

Designing a new programming language is nothing out of the extraordinary: many students in our research area do it for a living, and I'm no exception. I've been doing the traveling salesman, running presentations of Mezzo all over the place: Texas, New York, and going even as far as *Plateau de Saclay*. Now is the time for a blog post.

* Mezzo* is a joint work between François Pottier, my advisor, and myself. Recently, Thibaut Balabonski joined our team as a post-doctoral researcher to work on the concurrent aspects of the language; we also had an undergrad intern, Henri Chataing, work on the compilation part of the project.

* Mezzo* draws inspiration from many other research projects; I don't want to bother the reader with a long introduction -- this is not a research paper: I will provide an extended comparison of our work with other material in a later blog post. For the moment, let me just mention that we blend ideas from affine type systems, separation logic, and systems based on regions and capabilities.

* Mezzo* is still in a preliminary stage, but I feel like it is time to start the revolution, and introduce this almighty language to a wider audience. Behold!

### What is **Mezzo**?

**Mezzo**

**Mezzo**^{1} draws most of its inspiration from ML. Therefore, a big chunk of the language reads and feels just like ML (keywords: strict evaluation, functional, impure). However, there's a reason we're designing a new language: we want to go further than ML.

- We want to reject some programs previously accepted by ML; in a sense, we want to forbid patterns we consider dangerous.
- Conversely, we want to accept some programs previously deemed unsafe by an ML type-checker: we want to make more programming patterns possible.

Most of these improvements come at a cost, naturally. The design that I'm about to introduce is still very much in flux, and time will tell whether the cognitive cost of our new system is worth the benefits.

However, we deliberately tried to cut down the complexity of the system: in a sense, we don't want to end up doing proofs within the type system. Therefore, the mechanisms that I'm about to introduce may seem low-tech compared to recent research works, but we believe that's what makes it possible to explain the language to a (seasoned) ML programmer, and also what makes writing a type-checker relatively doable, without requiring a huge burden of annotations from the programmer.

### Permissions

This blog post assumes that the reader is familiar with ML; in particular, the reader is expected to know what references are, what a boxed value is, and how only things in the heap may be mutable.

One of the goals for * Mezzo* is to offer better control of

*aliasing*and

*ownership*of mutable data. In order to make this possible,

*replaces the traditional concept of "type" with that of "permission".*

**Mezzo**In essence, one can view a permission as a conceptual access token to a part of the heap, that is, a heaplet. A permission both describes the shape of the heaplet, and grants its owner the right to *access* this heaplet. Permissions are an artefact of the type system: they do not exist at runtime.

Consider the following piece of code:

```
let x = (2, "foo") in
...
```

Right after this line of code, a new permission is made available to the programmer, namely `x @ (int, string)`

. This permission basically tells the programmer that `x`

points to a block in the heap, with two fields. The first one contains an integer, and the second one contains a pointer to a string. Moreover, the permission grants its owner the right to access this portion of the heap.

At any given point in the program, a set of permissions is available. They determine the fragment of the heap that we can access at this point of the program.

In the example above, several permissions may be available along `x @ (int, string)`

; we chose to focus on the small subset of permissions we are interested in. In general, there is more than just one permission available in a given scope; for instance, we may have `x @ (int, string) ∗ y @ float`

, that is, both `x @ (int, string)`

and `y @ float`

at the same time: `∗`

denotes the conjunction of two permissions^{2}.

#### What are permissions useful for?

##### A simple example: the `length`

function

So far, there's nothing new here. Let us stay low-tech for a moment and consider the signature of the `length`

function that, as the name suggests, computes the length of a `list`

.

`val length: [a] (x: list a) -> int`

`[a]`

denotes universal quantification, that is, `length`

is valid for any type `a`

, and `x`

is a binder, that is, it gives us a way to talk about the argument of `length`

.

The `length`

function, in order to perform its job, requires that the *caller* passes it the "right" to read the list at address `x`

. We say that this function needs the permission `x @ list a`

to run: indeed, it needs to iterate over the elements of the list in order to compute its length, and the aforementioned permission enables it to do just that.

We understand this function to *return to its caller* the permission `x @ list a`

, along with a value of type `int`

. In other words, if comments denote the permissions available at a given program point:

```
(* empty *)
let x = ... in
(* x @ list a *)
let l = length x in
(* l @ int ∗ x @ list a *)
```

The permission `x @ list a`

is temporarily taken from the caller's context, conceptually passed to the `length`

function for the duration of its execution, then returned to the caller once `length`

terminates.

##### A more sophisticated example: the `mswap`

function

We are now ready to move on to a more sophisticated example that will demonstrate the power of * Mezzo*! We now consider

*mutable*pairs of elements, which we write as

`mpair a b`

(think `('a, 'b) mpair`

^{3}). We wish to write a function that will swap,

*in place*, the two components of a pair. That is, without allocating a fresh pair.

What is the type for this function?

`val mswap: [a, b] (consumes x: mpair a b) -> (| x @ mpair b a)`

Let us review this fairly complex type. There are several elements at play here.

- Unlike the
`length`

example, the argument`x`

is*consumed*, that is, when the caller passes the`x @ mpair a b`

permission to`mswap`

, it*loses it*. - However,
`mswap`

will return to the caller a different permission saying that`x`

, after`mswap`

has run, has a different type, namely`x @ mpair b a`

. - The
`(|p)`

syntax indicates that the function returns a permission`p`

, but no value^{4}

```
(* empty *)
let x = ... in
(* x @ mpair a b *)
mswap x;
(* x @ mpair b a *)
```

This is a function that we cannot write in ML. Indeed, in order for this function to be sound, we must make sure that no one else can ever hold an *outdated* copy of `x @ mpair a b`

. Therefore, `mswap`

must have an *exclusive* access to its argument. * Mezzo* provides such a guarantee:

*mutable portions of the heap are guaranteed to have a unique owner*; therefore,

`mswap`

*does*have an exclusive access to its argument:

`x @ mpair a b`

is an exclusive permission, and the system guarantees that it has an unique owner. In particular, as soon as `mswap`

runs, the caller has lost its permission on `x`

.#### Different flavors of permissions

Getting back to our very first example, the fragment of the heap that the permission `x @ (int, string)`

describes is immutable; therefore, it is safe to share this knowledge with others. In other words, *this permission can be freely copied*, and passed to other parts of the programs (think about threads, for instance). This permission is said to be *duplicable*.

As we just saw, not all permissions are duplicable. Because * Mezzo* aims to control aliasing to mutable data, we make sure that only one person at a time can "see" a mutable fragment of the heap: this is of utmost importance when it comes to concurrency. The way we enforce this is by forbidding duplication of such a permission. For instance,

`x @ mpair a b`

is said to be *exclusive*, and

*cannot*be copied.

Therefore, having two copies of a permission is not necessarily the same as having just one copy.

##### The mode system

Some functions can only be type-checked if they know that their argument is duplicable. The `find`

function, because it returns a pointer to an element in the list, must require that the type of the elements in the list be duplicable: indeed, because a pointer to the element is returned to the caller *and* remains in the list at the same time, there are two people pointing to it, so it better be duplicable.

We express this fact using a mode constraint:

`val find: [a] duplicable a => (a -> bool, list a) -> option a`

#### Accounting for aliasing relationships

So far, we've seen that * Mezzo* provides a better control of ownership by making sure a mutable part of the heap has a unique owner;

*also describes effects in a more accurate manner compared to traditional ML, by leveraging permissions to give finer-grained types to functions.*

**Mezzo*** Mezzo* is able to talk about aliasing as well, through the use of

*singleton types*. A singleton type basically means "same as", as is written

`=t`

. For instance, if a variable `x`

"has type" `=y`

, this means that `x`

and `y`

are aliases for each other. We write `x @ =y`

or, more concisely, `x = y`

.```
let x = ... in
(* x @ τ *)
let y = x in
(* x @ τ ∗ y = x *)
...
```

The important point here is that we *did not duplicate* the permission `x @ τ`

, which would be illegal, since we know nothing about `τ`

. We merely added a new permission that accounts for the aliasing relationship.

Similarly, if we hold `x @ mpair a b`

, we know that `x`

points to a block in the heap with two slots, the first one having type `a`

and the second one having type `b`

. This amounts to the following drawing:

```
(l)
,-----,
,---------------------> | a |
(x) / `-----`
,-----/------------,
| / | | (r)
| left | right | ,-----,
| | \ | ,--> | b |
`---------------\--` / `-----`
\ /
`-----`
```

* Mezzo* is able to accurately reflect this aliasing pattern through the permission system. Let us consider the definition (finally!) of the

`mpair`

type.```
mutable data mpair a b =
MPair { left: a; right: b }
```

In * Mezzo*, sum type and record types are fused together.

*is able to automatically transform:*

**Mezzo**`x @ mpair a b`

into

`x @ MPair { left: a; right: b }`

thus reflecting the fact that `x`

is a block with two fields, named left and right. However, just like I've given names to the two anonymous blocks with types `a`

and `b`

, * Mezzo* will introduce names for these components and say that this permission is also equivalent to the following conjunction:

`x @ MPair { left: = l; right: = r } ∗ l @ a ∗ r @ b`

We write this more concisely as:

`x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b`

(note the `left = l`

shorthand without a colon).

#### A final example

Here is an example that puts all these mechanisms together: the implementation of the `mswap`

function. As usual, comments indicate the set of available permissions.

```
val mswap [a, b] (consumes x: mpair a b): (| x @ mpair b a) =
(* We start off with:
x @ mpair a b
which the type-checker automatically converts into:
x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b *)
let t = x.left in
(* x @ MPair { left = l; right = r } ∗ l @ a ∗ r @ b ∗ t = l *)
x.left <- x.right;
(* x @ MPair { left = r; right = r } ∗ l @ a ∗ r @ b ∗ t = l *)
x.right <- t
(* We end up with:
x @ MPair { left = r; right = t } ∗ l @ a ∗ r @ b ∗ t = l
which the type-checker automatically converts into:
x @ mpair a b
in order to satisfy the return type of the function. *)
```

At the end of the function, the type-checker, in order to satisfy the return type of the function, must check that the final expression "has type" `()`

, that is, the unit type: it is the case. It must also check that the new permission `x @ mpair b a`

is available: by recombining the final set of permissions, we can indeed trade it for `x @ mpair b a`

: the return condition of the function is satisfied, and the example above will type-check^{5}:

The expressive power of * Mezzo* allows it to accurately reflect the aliasing and points-to relationships in the type system. This is what makes this example type-check, since we need a very precise reasoning in order to justify its soundness. In a sense, the permission system is close, in this example, to symbolic execution, or abstract interpretation.

### An incomplete presentation of Mezzo

The permission mechanism is very good at describing local relationships between identifiers, but this very precise description cannot scale to larger^{6}, more complex structures such as doubly-linked lists. Indeed, there are some patterns that the permission system cannot express at this stage: if a list contains the same exclusive element twice, for instance, the permission system will never accept such a list: there would be two ways of reaching the exclusive element, which means that one could gain two permissions for it: this is something that the system explicitly disallows.

In a second blog post, I will detail the main contribution that we bring with * Mezzo*, namely, a powerful adoption and abandon mechanism that provides a way to express these examples using permissions, at the expense of having dynamic checks.

### Some bonus examples

In case you are still skeptical about the benefits of such a sophisticated (if not complex) type system, here are some examples slightly more motivating than the `mswap`

example which, I admit, will rarely pop up "in the real world".

These examples are taken from the test suite of our prototype type-checker. I'll give a brief overview of the current state of our type-checker in a subsequent post.

#### Tail-recursive `list::concat`

without `Obj.magic`

This function is written in destination-passing-style; that is, the `concat1`

function keeps a pointer to the previous cell. Cells are mutable: they represent an "unfinished" list cell; once their `tail`

field has been mutated, `cell`

s can be "frozen" into a regular list cell, thus enabling tail-recursion.

```
data list a =
| Cons { head: a; tail: list a }
| Nil
mutable data cell a =
| Cell { head: a; tail: () }
val rec concat1 [a] (
consumes dst: cell a,
consumes xs: list a,
consumes ys: list a): (| dst @ list a)
=
match xs with
| Nil ->
dst.tail <- ys;
tag of dst <- Cons
| Cons { head; tail } ->
let dst' = Cell {
head = head; tail = ()
} in
dst.tail <- dst';
tag of dst <- Cons;
concat1 (dst', tail, ys)
end
val concat [a] (
consumes xs: list a,
consumes ys: list a): list a
=
match xs with
| Nil ->
ys
| Cons { head; tail } ->
let dst = Cell {
head = head; tail = ()
} in
concat1 (dst, tail, ys);
dst
end
```

We also have a tail-recursive version of `map`

that uses the exact same trick.

#### Computing the size of a tree

The following function:

- computes the size of a tree,
- while annotating all sub-trees with their sizes,
- accepts a tree that was previously annotated with any kind of data.

```
mutable data tree a =
| Node { left: tree a; right: tree a; size: a }
| Leaf
val rec tree_size (consumes x: tree unknown): (int | x @ tree int) =
match x with
| Leaf ->
1
| Node { left; right } ->
let l = tree_size left in
let r = tree_size right in
let total = l + r + 1 in
x.size <- total;
total
end
```

Former names for this language include: "the language of the future" and "Ha

**ML**eT". My very own proposal of "Kre**ML**in" never really took off.↩A man of the trade will undoubtedly feel an immediate familiary with separation logic. Indeed, our

`*`

conjunction acts as a separating conjunction on*exclusive*portions of the heap. On*duplicable*portions of the heap, it merely indicates that the set of permissions must be consistent with itself. More on that in the next section.↩Unlike OCaml, type applications are done the same way as term applications, namely, left-to-right.↩

The

`(t|p)`

syntax allows one to bundle a type`t`

along with a permission`p`

. For the special case where`t`

is equal to the unit type`()`

, we write`(|p)`

.↩Function types are always annotated in

: we specify the type of the arguments as well as the return type of the function. If the function needs any extra permissions to run, these must be specified as well.↩**Mezzo**Precisely, the permission system we just saw cannot express structures of arbitrary length with sharing.↩