Research

I'm a postdoc' in the Gallium team at INRIA Paris-Rocquencourt. There, I work with François Pottier and Didier Rémy. This involves some Mezzo, and some ornamentation.

Prior to joining Gallium, I was Conor McBride's PhD student. In my PhD, I've worked on the theory behind the intensional fragment of Epigram. I've studied some universes of data-types and the possibilities offered by such an approach. If Epigram were to be (eventually) implemented, it might contain the cool stuff I've modelled with polynomial functors.

Publications

Coq: The world's best macro assembler?

Andrew Kennedy, Nick Benton, Jonas Jensen, Pierre-Evariste Dagand

We describe a Coq formalization of a subset of the x86 architecture. One emphasis of the model is brevity: using dependent types, type classes and notation we give the x86 semantics a makeover that counters its reputation for baroqueness. We model bits, bytes, and memory concretely using functions that can be computed inside Coq itself; concrete representations are mapped across to mathematical objects in the SSReflect library (naturals, and integers modulo 2n) to prove theorems. Finally, we use notation to support conventional assembly code syntax inside Coq, including lexically-scoped labels. Ordinary Coq definitions serve as a powerful ``macro'' feature for everything from simple conditionals and loops to stack-allocated local variables and procedures with parameters. Assembly code can be assembled within Coq, producing a sequence of hex bytes. The assembler enjoys a correctness theorem relating machine code in memory to a separation-logic formula suitable for program verification.

Transporting Functions across Ornaments

Pierre-Evariste Dagand, Conor McBride
Under submission to JFP
Extended version of the ICFP'12 paper

Programming with dependent types is a blessing and a curse. It is a blessing to be able to bake invariants into the definition of data-types: we can finally write correct-by-construction software. However, this extreme accuracy is also a curse: a data-type is the combination of a structuring medium together with a special purpose logic. These domain-specific logics hamper any effort of code reuse among similarly structured data.

In this paper, we exorcise our data-types by adapting the notion of ornament to our universe of inductive families. We then show how code reuse can be achieved by ornamenting functions. Using these functional ornament, we capture the relationship between functions such as the addition of natural numbers and the concatenation of lists. With this knowledge, we demonstrate how the implementation of the former informs the implementation of the latter: the user can ask the definition of addition to be lifted to lists and she will only be asked the details necessary to carry on adding lists rather than numbers.

Our presentation is formalised in a type theory with a universe of data-types and all our constructions have been implemented as generic programs, requiring no extension to the type theory.

A Cosmology of Datatypes
(PhD thesis)

Pierre-Evariste Dagand

With special thanks to:

  • Clément!
  • Fredrik!
  • Guillaume (A.)!
  • Guillaume (M.-M.)!
  • Hank!
  • Lorenzo!
  • Matteo!
  • Pedro!
  • Stevan!
  • Vincent!

A Categorical Treatment of Ornaments

Pierre-Evariste Dagand, Conor McBride

Ornaments aim at taming the multiplication of special-purpose datatypes in dependently typed programming languages. In type theory, purpose is logic. By presenting datatypes as the combination of a structure and a logic, ornaments relate these special-purpose datatypes through their common structure. In the original presentation, the concept of ornament was introduced concretely for an example universe of inductive families in type theory, but it was clear that the notion was more general.

This paper digs out the abstract notion of ornaments in the form of a categorical model. As a necessary first step, we abstract the universe of datatypes using the theory of polynomial functors. We are then able to characterise ornaments as cartesian morphisms between polynomial functors. We thus gain access to powerful mathematical tools that shall help us understand and develop ornaments. We shall also illustrate the adequacy of our model. Firstly, we rephrase the standard ornamental constructions into our framework. Thanks to its conciseness, this process gives us a deeper understanding of the structures at play. Secondly, we develop new ornamental constructions, by translating categorical structures into type theoretic artefacts.

Elaborating Inductive Definitions

Pierre-Evariste Dagand, Conor McBride

We present an elaboration of inductive definitions down to a universe of datatypes. The universe of datatypes is an internal presentation of strictly positive types within type theory. By elaborating an inductive definition -- a syntactic artifact -- to its code -- its semantics -- we obtain an internalized account of inductives inside the type theory itself: we claim that reasoning about inductive definitions could be carried in the type theory, not in the meta-theory as it is usually the case.

Besides, we give a formal specification of that elaboration process. It is therefore amenable to formal reasoning too. We prove the soundness of our translation and hint at its completeness with respect to Coq's Inductive definitions. The practical benefits of this approach are numerous. For the type theorist, this is a small step toward bootstrapping, ie. implementing the inductive fragment in the type theory itself. For the programmer, this means better support for generic programming: we shall present a lightweight deriving mechanism, entirely definable by the programmer and therefore not requiring any extension to the type theory.

Fully-Abstract Compilation to JavaScript

Cédric Fournet, Nikhil Swamy, Juan Chen, Pierre-Evariste Dagand, Pierre-Yves Strub, Ben Livshits

Many tools allow programmers to develop applications in high-level languages and deploy them in web browsers via compilation to JavaScript. While practical and widely used, these compilers are ad-hoc: no guarantee is provided on their correctness for whole programs, nor their security for programs executed within arbitrary JavaScript contexts.

This paper presents a compiler with such positive guarantees. We compile an ML-like language with higher-order functions and references to JavaScript, while preserving all source program properties. Relying on type-based invariants and applicative bisimilarity, we show full abstraction: two programs are equivalent in all source contexts if and only if their wrapped translations are equivalent in all JavaScript contexts. We evaluate our compiler on sample programs, including a series of secure libraries.

Transporting Functions across Ornaments

Pierre-Evariste Dagand, Conor McBride

Programming with dependent types is a blessing and a curse. It is a blessing to be able to bake invariants into the definition of data-types: we can finally write correct-by-construction software. However, this extreme accuracy is also a curse: a data-type is the combination of a structuring medium together with a special purpose logic. These domain-specific logics hamper any effort of code reuse among similarly structured data.

In this paper, we exorcise our data-types by adapting the notion of ornament to our universe of inductive families. We then show how code reuse can be achieved by ornamenting functions. Using these functional ornament, we capture the relationship between functions such as the addition of natural numbers and the concatenation of lists. With this knowledge, we demonstrate how the implementation of the former informs the implementation of the latter: the user can ask the definition of addition to be lifted to lists and she will only be asked the details necessary to carry on adding lists rather than numbers.

Our presentation is formalised in a type theory with a universe of data-types and all our constructions have been implemented as generic programs, requiring no extension to the type theory.

Ornaments, organising the Zoo of data-types

Pierre-Evariste Dagand, Conor McBride

When programming in type theory, we discover a whole zoo of data-types. We have all met unary natural numbers, lists, finite sets, vectors and other similarly structured yet often exotic breeds. Despite their specificities, we notice a striking similarity between these data-types, with natural numbers appearing here as a common ancestor. In type theory, such a diverse evolution comes out of necessity: to be fit for purpose, our types have to be as precise as possible. From a reusability perspective, this is a disaster: these purpose-built data-types are too specific to fit into or benefit from a global library.

In this work, we study an organisation principle for the zoo of data-types. McBride's Ornaments define a general principle for creating new data-types from old. We shall give a categorical presentation of this construction and show that it exactly captures our intuition of a ``lineage''. Our abstract treatment greatly simplifies the type-theoretic definition. Hence, we can explain the standard constructions (e.g., ornamental algebra, algebraic ornament) in simpler terms. Further, we are able to tap into the rich categorical structure of ornaments to uncover new programming artifacts.

Besides giving a mathematical basis for our classification work, this presentation gives a hint on how one could semi-automatically lift operations from ancestor to descendant. Whilst lifting algebras is still work in progress, we have obtained some promising results for restricted classes of morphisms of data-types.

This work has been formalised in Agda, using Interaction Structures to model inductive families.

The Gentle Art of Levitation

James Chapman, Pierre-Evariste Dagand, Conor McBride, Peter Morris

We present a closed dependent type theory whose inductive datatypes are presented not by a scheme for generative datatype declaration, but by encoding in a "universe". Each inductive datatype is thus given by interpreting its description" -- an ordinary first-class value in a datatype of descriptions. Moreover, the latter itself has a description.

By presenting datatypes via such a self-encoding universe, datatype-generic programming becomes ordinary programming. We show some of the resulting generic operations and deploy them in particular, useful ways on the datatype of datatype descriptions itself. Surprisingly this apparently self-supporting setup is achievable without paradox or infinite regress.

Filet-o-Fish: practical and dependable domain-specific languages

Pierre-Evariste Dagand, Andrew Baumann, Timothy Roscoe

We address a persistent problem with using domain-specific languages to write operating systems: the effort of implementing, checking, and debugging the DSL usually outweighs any of its benefits. Because these DSLs generate C by templated string concatenation, they are tedious to write, fragile, and incompatible with automated verification tools.

We present Filet-o-Fish (FoF), a semantic language to ease DSL construction. Building a DSL using FoF consists of safely composing semantically-rich building blocks. This has several advantages: input files for the DSL are formal specifications of the system's functionality, automated testing of the DSL is possible via existing tools, and we can prove that the C code generated by a given DSL respects the semantics expected by the developer.

Early experience has been good: FoF is in daily use as part of the tool chain of the Barrelfish multicore OS, which makes extensive use of domain-specific languages to generate low-level OS code. We have found that the ability to rapidly generate DSLs we can rely on has changed how we have designed the OS.

The Multikernel: A new OS architecture for scalable multicore systems

Andrew Baumann, Paul Barham, Pierre-Evariste Dagand, Tim Harris, Rebecca Isaacs, Simon Peter,
Timothy Roscoe, Adrian Schupbach, Akhilesh Singhania

Commodity computer systems contain more and more processor cores and exhibit increasingly diverse architectural tradeoffs, including memory hierarchies, interconnects, instruction sets and variants, and IO configurations. Previous high-performance computing systems have scaled in specific cases, but the dynamic nature of modern client and server workloads, coupled with the impossibility of statically optimizing an OS for all workloads and hardware variants pose serious challenges for operating system structures.

We argue that the challenge of future multicore hardware is best met by embracing the networked nature of the machine, rethinking OS architecture using ideas from distributed systems. We investigate a new OS structure, the multikernel, that treats the machine as a network of independent cores, assumes no inter-core sharing at the lowest level, and moves traditional OS functionality to a distributed system of processes that communicate via message-passing.

We have implemented a multikernel OS to show that the approach is promising, and we describe how traditional scalability problems for operating systems (such as memory management) can be effectively recast using messages and can exploit insights from distributed systems and networking. An evaluation of our prototype on multicore systems shows that, even on present-day machines, the performance of a multikernel is comparable with a conventional OS, and can scale better to support future hardware.

Opis: reliable distributed systems in OCaml

Pierre-Evariste Dagand, Dejan Kostic, Viktor Kuncak

Concurrency and distribution pose algorithmic and implementation challenges in developing reliable distributed systems, making the field an excellent testbed for evaluating programming language and verification paradigms. Several specialized domain-specific languages and extensions of memory-unsafe languages were proposed to aid distributed system development. We present an alternative to these approaches, showing that modern, higher-order, strongly typed, memory safe languages provide an excellent vehicle for developing and debugging distributed systems.

We present Opis, a functional-reactive approach for developing distributed systems in Objective Caml. An Opis protocol description consists of a reactive function (called event function) describing the behavior of a distributed system node. The event functions in Opis are built from pure functions as building blocks, composed using the Arrow combinators. Such architecture aids reasoning about event functions both informally and using interactive theorem provers. For example, it facilitates simple termination arguments.

Given a protocol description, a developer can use higher-order library functions of Opis to 1) deploy the distributed system, 2) run the distributed system in a network simulator with full-replay capabilities, 3) apply explicit-state model checking to the distributed system, detecting undesirable behaviors, and 4) do performance analysis on the system. We describe the design and implementation of Opis, and present our experience in using Opis to develop peer-to-peer overlay protocols, including the Chord distributed hash table and the Cyclon random gossip protocol. We found that using Opis results in high programmer productivity and leads to easily composable protocol descriptions. Opis tools were effective in helping identify and eliminate correctness and performance problems during distributed system development.

Contact

email:
pierre-evariste.dagand --at-- inria.fr
office:
29
address:
INRIA Rocquencourt
Domaine de Voluceau
B.P. 105
78153 Le Chesnay, France