## Singleton types for code inference, continued

- December 30, 2012

In this post you can find out what was cut from the previous post on singleton types. These are more technical considerations that may require some familiarity with type theory. In particular, I'll discuss linear and dependent types in the context of describing more and more singletons.

In the previous
post,
I discussed a code inference construction `(Γ |- ? : σ)`

that would be
replaced, after type-checking, by the unique term `t`

of type `σ`

(modulo program equivalence) in the environment Γ, or fail if no term
or several distinct terms exist at type σ.

## More singletons

I have argued that the terms we will look for should live in
a different term language than the host language in which we would
consider using code-inference, to restrict ourselves to well-behaved,
pure terms. Any term language is fine as long as it can be embedded
into the host language, and can reasonably be shown to the user: you
may want to keep the `?`

in the source code, but also sometimes show
the inferred term to the user, to help program understanding, or
because finding the term was the point as in my example of "I forgot
the argument order".

Another point of variability in the type system: we can in fact pick
any type system we want: in the most explicit form where the user
would write `(Γ |- ? : σ)`

in the source, the type `σ`

could live in
the type system for code inference, and not be a type expressions in
the host language. We only need to be able to translate the inferred
*terms* back into the host term language (and then we can verify
that they are type-correct if we don't trust the term
inference engine).

The embedding of the host type system to the type system for code
inference could be partial: maybe we won't be able to handle some of
the types inferred by the host system when using the simple `?`

construction, and those cases would fail with an error. The more
interesting aspect is that we can also have *finer types* in the
type system for code inference. For example, we can have linear
types.

For example, `∀α. (α → α → α) → α → α`

is not a singleton type: it is
inhabited by `fun f x -> x`

, but also `fun f x -> f x x`

, ```
fun f x ->
f x (f x x)
```

, etc. However, `∀α. (α → α → α) → α ⊸ α`

, where the last
arrow is the linear arrow `⊸`

(`-o`

in glorious ASCII) which means "use
my argument exactly once", is a singleton type.

This suggests some experiment with finer type systems. For example,
`(β→γ) → (α*β*α) → (α*γ*α)`

is not a singleton type:
the first and third components of the input tuple, of the same
type α, could be swapped in the output. But there are some type
systems (some of which my coworker Julien is working on) that are
somehow stricter than linear type systems in that they request not
only each input to be consumed exactly once, but its structure to be
preserved (no swapping of pair elements, etc.). Consider for example
a language of subtyping coercions: if you have a coercion of type
`β ≤ γ`

in context, then there is only one coercion of type
`(α * β * α) ≤ (α * γ * α)`

, and if seen as a function it does not
swap the tuple components.

We can observe the difference between the usual arrow `→`

, the
linear arrow `⊸`

and a notion of structure-preserving arrow
sketched above, that I would write `⇒`

, on the type of `List.map`

:

`∀αβ. (α → β) → (List α → List β)`

is inhabited by any function that applies its function argument to the elements of its list argument, but may drop or reorder some of them. For example, always returning the empty list is ok.the inhabitants

`∀αβ. (α ⊸ β) → (List α ⊸ List β)`

are not allowed to drop elements from the input list anymore, but may still reorder them.`[x1,x2] ↦ [f x2, f x1]`

is okthe inhabitants of

`∀αβ. (α ⇒ β) → (List α ⇒ List β)`

must preserve the structure of the input list, neither reordering nor dropping elements; it is a singleton type. I think of it as the coercion`(α ≤ β) |- List α ≤ List β`

turned into a term.

(You may be unsure what the `→`

arrow between the function argument
and rest means in the two latter example. I am as well.)

## Towards dependent types

Of course, another obvious direction to refine the type system is to
introduce dependent types (and refinement types, etc.). Consider the
type of List.fold_right: `∀αβ. (α → β → β) → (List α → β → β)`

: in
this case, using linear or structure-preserving arrows does not get
us a singleton type, because there is no structure in the return
type `β`

to be preserved. However, a dependent type for `fold_right`

on lists (with arguments reordered for readability) is a singleton:

```
∀α ∀(P : List α → ★).
P nil →
(∀(x : α) (xs : List α). P xs → P (cons x xs)) →
∀(li : List α). P li
```

Of course, you may notice that writing this type is in no way easier than writing the corresponding term:

```
let rec fold init f = function
| nil -> init
| cons x xs -> f x xs (fold init xs)
```

If you move to sophisticated enough type systems, the idea that
"types help you infer code" becomes a bit of cheating as writing the
types themselves is just as much work as writing the terms. The idea of
using singleton types for code inference is not a magic wand that
will have you suddenly write order of magnitudes less *stuff*,
rather one principled way to move between term implementation and
type specification, type and code inference, and to study the
practical aspects under different points of view. The study of
singleton types in sufficiently sophisticated type systems may turn
out to be of theoretical rather than practical interest, but it's
enough for me, because it looks *fun*.

### Singleton types?

The concept of singleton types already exists in the literature in
a quite different form: for any term `M`

, we can consider the type
`(= M)`

which classifies the set of programs... equal to `M`

. This has
been found quite useful by the people working on module systems to
capture just the level of dependent types needed to express public
type declarations in a module signature (they're used one level up:
singleton *kinds* that publish equalities with a *type*; but the
term/type level is more convenient for explanations).

Of course, the type `(= M)`

is clearly a singleton type, and if you
determine that a type σ has a unique inhabitant `t`

, this should be
equivalent to saying that you've discovered an equality between `σ`

and the singleton type `(= t)`

. For example, if `∀αβ.(α*β)→(β*α)`

is
"a singleton type" then it is the same type as
`(= λ(x,y).(y,x))`

. This means that you could do everything of
interest by adding this notion of singleton types `(= M)`

in your
language, and proving type equalities between regular types and this
explicit form of singletons.

However that only works if the *host* language and the *search*
language are the same. If I use singleton types to infer code in OCaml
or Haskell, I'll want to infer stuff from `∀αβ.(α*β)→(β*α)`

while this
type is not equal to `(= λ(x,y).(y,x))`

in the host language. I'd
rather keep the two different notions of singleton types separate, to
emphasize that my focus here is on code inference, not equality
reasoning (which, again, may not hold in the host language).

## Singleton types as "proof search"

There is a nice interplay with Curry-Howard (already explicitly
employed in the previous works of Wells and Yakobowski): if we see
terms as proofs, we are interested in properties that have a *unique*
proof. I am not aware of logicians having worked on this question
(probably ignorance on my side), but to be fair mathematicians tend to
be interested in *existence* of proofs (the question of
type inhabitation) more than unicity. A good thing however is that
a large part of the literature on proof search is concerned with
finding *canonical* proofs, to avoid "redundancy" between proof
terms. Proof nets, uniform proofs, focusing are all somehow concerned
about this aspect (because proving meta-theorems about your proof
system is easier when proof objects are strongly structured). And it
turns out that the work to have more "canonical" proofs consists
exactly in finding representations where less different proofs are
equivalent as programs. This is an excellent excuse to learn more
about the proof search literature, to see if we can find ideas inside
that are helpful to determine singleton types.

As a form of proof search, there is also a clear relation with
*tactics* (or maybe even more with this fascinating Emacs gadget
that Agda users have, that allows them to interactively refine proof
goals with "obvious" steps according to the shape of the goal). `?`

is a specific kind of tactics that is interested not only in finding
some proof term, but in the details of its dynamic semantics. Are
there other "tactics" that are also useful for programming rather
than proving?

## A closing remark and a quote

Some readers will have recognized some example of types I gave here as
examples of reasoning on parametricity. The fact that `∀α.α→α`

is only
inhabited by the identity function is, for example, often argued by
invoking the kind of results of "Theorems for free!". But a simple
algorithm to find singletons will simply prove that there is only one
correct βη-normal form of this type: first you have to use a Λαλ(x:α)
by η-expanding, and then you have nothing to use as a function body in
your context other than (x:α).

I'll remark that the link between parametricity and singleton types is maybe weaker than it could seem, as a practical algorithm to determine that types are singleton will rather work by trying to exhaust the search space for terms, than reasoning on semantic relations derived from the type -- but who knows. Let's say, at least, that I understand term search better than I understand parametricity, a topic on which I suspect we have still a lot to find out.

Let me conclude with a nice 2011 quote from Conor McBride on Quora (no idea why he put it there).

Whilst I don't want to gainsay the importance of types as a source of corrective raspberry-blowing, I would like to offer the prospect that types might have an active role to play, structuring the process of program inference. Overloading allows you to get rid of boring lumps of code if it can be figured out from types. Datatype-generic programming uses representations of the structure of types to calculate specific instances of algorithm-schemes. Dependent type systems often allow run-time-relevant values to be inferred silently from type information.

Crucially, also, types structure the search for programs in useful ways, provided your editing environment offers you type information and makes it easy to select type-appropriate choices. Sometimes it's easier to search for good programs in the space of well typed programs, rather than in the space of ascii turds.

This position constitutes a change of viewpoint in the purpose of types. If programs worked just the same with the types rubbed out, then types would represent a form of piety often inadequate with respect to testing. It's when types contribute information to algorithm selection, design statements which program definitions need merely refine, that they constitute a significant win.

To be fair, even in last century's typed languages, types had a beneficial organisational effect on programmers. This century, it's just possible types will have a comparable effect on programs. Types are concepts and now mechanisms supporting program-discovery as well as error-discovery. I think that's more than just gravy.