## Happy Ada Lovelace Day

- October 16, 2012
- Last updated on 2012/10/18

Tuesday October 16 is an excellent day for a lot of truly excellent reasons. The one of concern here is that it was chosen to be the Ada Lovelace Day, whose purpose is to celebrate important women in one's scientific field.

(A recent ESOP submission has left me enthusiastic about exotic time zones. I'm aiming to publish this article on Octobre 16 before 23:59 GMT-11, which might look like the following day to some readers, including those living in Samoa as it changed timezone in December 2011. Oh well.)

As far as I have witnessed, the Gallium-related fields in computer science research (programming languages design, compilation, type systems...) share the bane of most other computer science domains (or software development communities) in having a very low ratio of women at all levels.

Despite this imbalance, we actually have quite a few women that made very important advances. In the process of writing this billet, I gathered a rather long list of names of persons I would like to write about. Those are mostly researchers from the second half of the 20th century, but not only: Emmy Noether was suggested to me by Xavier, as the well-founded induction techniques she developed are quite useful when writing formal devlopments in a proof assistant such as Coq -- and of course her leading role in the movement of formalization of mathematics through abstract algebra had wide-ranging consequences that can definitely be felt, indirectly, in the development of programming languages and systems.

I have finally decided to avoid enumerations and rather attempt to write something interesting on just a few women. This also has the advantage of leaving ample material for the next years, even if they will be new names to add to the list by then.

## Barbara Liskov

The works of Barbara Liskov that had the most influence on me, personally, where abstract data types (not to be confused with algebraic data types) and the Liskov Substituability Principle (LSP).

### Abstract data types

The idea of abstract data types and encapsulation seems so natural now that it is difficult to imagine that is wasn't around before the beginning of the seventies. A lot of people were doing similar things at that time, but Barbara Liskov proposed a lucid notion of abstract datatypes, whose internal implementation choices were hidden from outside user modules. She then integrated it in 1974 into a new language design, CLU, which proved very influential for many other aspects as well: garbage collected, iterators, multiple value return, exception handling... If you want to learn more about the genesis of abstract types and CLU at large, you should read A History of CLU (PDF,31 pages) that she wrote later in 1992.

Abstract types were named "clusters", and this is where the CLU name comes from. Let's show the Wikipedia example:

```
complex_number = cluster is add, subtract, multiply
rep = record [ real_part: real, imag_part: real ]
add = proc ... end add;
subtract = proc ... end subtract;
multiply = proc ... end multiply;
end complex_number;
```

This cluster exports the type `complex_number`

along with operations
`add`

, `substract`

and `multiply`

. This is all the user of the cluster
ever sees. Internally, the type is given a representation (a type
definition `rep = ...`

) and the operations are implemented in terms of
this representation, but this is encapsulated and doesn't leak to the
outside.

Barbara Liskov has also played an influential in the development of object-oriented techniques (and languages for distributed computing), but CLU was not object-oriented. Some aspects of the design were influenced by Simula (1967), but the "History of CLU" paper gives a careful and detailed explanation of the difference between the "abstract data-type" and the "object-oriented" (in the spirit of Simula and Smalltalk) design style, that would be later expanded upon and constrasted -- see William Cook's On Understanding Data Abstraction, Revisited.

Abstract types are a very important concept in ML programming languages. In 1988, Plotkin and Mitchell proposed to understand such type abstraction as use of existential types (Abstract Typess have Existential Type), but the ML module systems actually takes a relatively different approach to type abstraction. Several authors have tried to bridge the gap, including in particular our very own Benoît Montagu in his 2009 article with Didier Rémy, Modeling Abstract Types in Modules with Open Existential Types.

### Liskov Substituability Principle

The Liskov Substituability Principle
(LSP)
stems from a 1994 paper that Barbara Liskov wrote with Jeannette
Wing, A Behavioral
Notion of
Subtyping. It
gives the *right* definition of what subtyping means: a type `S`

is
a subtype of a type `T`

if all values of type `S`

behave like values
of type `T`

. From this vision you can derive most properties of
subtyping relations (in particular contravariance in
input parameters), but those were already well-understood in 1994;
Liskov and Wing made the notion of "behave like" *formal*, by saying
that anything that could be *proved* about values of type `T`

should
also be provable on values of type `S`

.

The rest of the paper is interesting (they provide limited extension
techniques that guarantee that this strong semantics requirement
is met), but what made history was how this principle got a name, LSP,
and was widely diffused to practitioners of object-oriented
programming. It is now part of all methodology advice on programming.
Mainstream object-oriented languages went for richer notions of
extension that those that are clearly correct, and systems mixing
programs and proofs to statically ensure substituability are still at
the research stage, but everyone shoud know by now that the derived
types the programmers implement *must respect* the LSP -- it could be
argued that the preferred extension method of object-oriented
languages, implementation inheritance, makes it particularly easy to
mistakenly violate this invariant, so talking about it loud is
extremely useful.

## Christine Paulin-Mohring

I had the luck to have Christine Paulin-Mohring as teacher at the MPRI (if you are looking for a research-oriented graduate curriculum in computer science, consider coming to Paris for the MPRI). Christine played an instrumental role in the development of the Coq proof assistant, including leading the development of it. Among the most apparent works of Christine are her work on program extraction out of Coq, and on inductive definitions.

If you already know a bit of Coq, you should consider reading the Credits page of the Coq documentation, that is essentially a changelog of important Coq version since Coq 5.10 in 1995 -- it really is surprisingly interesting. Christine Paulin-Mohring is directly cited in most of those release notes, and wrote several of them.

Program extraction allows to take a Coq file and output a program in
a functional programming language such as OCaml, Haskell or
Scheme. A Coq file actually is a mix of programs (or proofs) written
in a style that resembles strongly typed functional programming
language, and tactics that are evil little metaprogramming scripts
that *build* proofs (or programs) and are composed interactively by
the user, with direct feedback on the desired type of the terms to
build and the available environment. The job of extraction is to take
this happy mix and produce a Haskell, OCaml or Scheme program that can
be compiled down to something efficient -- in particular, it strives
to remove unnecessary computations that are not really needed during
the program execution but corresponds to static correctness proof
about the program.

In Coq, this separation between what's needed at runtime and what can
be "erased" is done by a strict typing discipline: some types live in
`Prop`

, the universe of logical propositions (this mean that they
themselves have type `Prop`

), and their values can always be erased,
and the rest, outside `Prop`

, is kept. Note that this does not depend
on how the terms are written (programs or tactics); if you're
adventurous, you may use tactics as a meta-programming tool to write
honest programs executed at runtime, or write proofs by directly
writing the corresponding proof terms as ML-like programs.

`Prop`

has specific typing rules (it is called impredicative)
designed to make it convenient to use for mathematical proofs, and the
type-checking rules have been designed to allow extraction. In
particular, if an inductive type in `Prop`

has non-erasable parameters
or at least two constructors, you may not pattern-match on it in an
expression returning a non-erasable value. The following examples
fails to type-check with a rather self-descriptive error:

```
Inductive Choice : Prop := A | B.
Definition zero_or_one choice : nat :=
match choice with
| A => 0
| B => 1
end.
```

Error:

Incorrect elimination of "choice" in the inductive type "Choice":

the return type has sort "Set" while it should be "Prop".

Elimination of an inductive object of sort Prop

is not allowed on a predicate in sort Set

because proofs can be eliminated only to build proofs.

Indeed, the parameter `choice`

, having type `Choice`

, will be erased
during extraction. Yet the return type of `zero_or_one`

is a concrete
natural number; it is impossible to compute this result if the
parameter `choice`

is not passed at runtime. One can fix this by
moving the type of `Choice`

out of `Prop`

, or the result type inside
`Prop`

. When writing certified programs these choices are usually
rather simple to make, as it is clear from the whole program
architecture which parts are "program" and which are "proofs about the
program" : the program definition should not depend on its own
correctness proof. Of course in some cases you would wish for a more
flexible distinction, but that's for another discussion.

While this whole description is a bit fuzzy on the details and make some simplifications, I took care to point out that the limitation on pattern-matching of propositions to return runtime values held for inductive types of at least two constructors (or using non-erasable parameters). Indeed, if a type has only one constructor, of which all parameters are themselves erasable, matching on it does not actually influence the runtime execution of the program (it is called "non-informative") and may be used for its type-checking virtues. The reason I develop this example here is that it interleaves both topics of inductive definitions and program extraction.

The one-constructor case is extremely useful. That is how, for
example, equalities between terms can be used to type-check
a non-erasable program : they have only one constructor,
reflexivity. This is also the key to the use of accessibility
predicates (a constructive remnant of noetherian, or
well-founded, induction) to write programs with subtle termination
proof. Given a relation `R`

on the type `A`

, here is the type that
claims that `R`

is well-founded:

```
Inductive Acc (x: A) : Prop :=
Acc_intro : (forall y:A, R y x -> Acc y) -> Acc x.
```

When making a recursive call that has not obviously decreasing
parameters, one may instead use such an accessibility predicate to
show only that the recursive arguments are related (~ smaller than)
the arguments of the current call. This allows the user to choose the
termination argument (encoded in the relation `R`

) in a very general
way.

The other major use of this rule is the case of the inductive type
having no constructor at all. This is the proposition `False`

, whose
proposition can be asked to the Coq system (here using the `coqtop`

interactive toplevel):

Coq < Print False.

Inductive False : Prop :=

That's it: `False`

have no constructor. This means that when you have
a term at type `False`

(a proof of contradiction), you may say "I'm
done, this case never happens" through an empty pattern matching:
`match contradiction with end.`

This pattern-matching works at any
target type `A`

(the outer function you're defining tries to produce
an `A`

) as all cases of the match are well-typed at type `A`

. And it
works inside proofs, but also to build terms despite `False`

being
a proposition, as it has less than two constructors.

Christine Paulin-Mohring worked on all of this, and later taught it to me and numerous other students.