Project short title

Test the OCaml compiler with random tests and a reference interpreter

Long description

The aim of this project is to extend an existing testcase-generator for the OCaml compiler, using a reference interpreter (existing or newly developed) to find a lot of bugs in the compiler (read below for how that works), and fix as much of them as possible.

Randomly testing a language implementation is subtle because it is difficult to test if a randomly-generated program is “doing the expected thing”. The usual solution is to use differential testing: compare two implementations of the same language, and check that they do the same thing – otherwise, one of them is buggy. This assumes that the programs tested are “fully determined”: they have only one possible behavior under any valid implementation of the language. If you test a program with undefined behavior, or several valid evaluation orders, or non-determinism (for example, it prints a random number between 1 and 6), you can observe two different behaviors without having a bug.

It would be too difficult and time-consuming to manually inspect each generated program with a difference to determine if it is a bug or not (you would find thousands of non-bugs for each bug), so we write specialized program generators that stay in a “fully determined” subset of the language. Building a good generator of “fully determined” program is the hardest part of random compiler testing, where most of the effort goes. When done well, the results can be spectacular; the most well-known differential testing tool is Csmith, generating C programs; it has found hundredths of bugs in all existing C compilers (gcc, clang, etc.).

A program generator exists for differential testing of a well-determined subset of OCaml program. It is an early prototype written by a research team in Denmark (the corresponding research article is “Effect-Driven QuickChecking of Compilers”, Jan Midtgaard, Mathias Nygaard Justesen, Patrick Kasting, Flemming Nielson and Hanne Riis Nielson, 2017). The tool found some bug in the OCaml compiler, mostly around the evaluation order of side-effects, but not as many as it could (as you would!) due to two limitations:

  1. The subset of the language supported by their generator is very small (simple expressions, mostly computations on integers).

  2. The two implementations they compared are the two official implementations of the language, a bytecode compiler and a native compiler that share a large part of their codebase. It is very likely that there are bugs in the part of the code that they share; a program would do the same wrong thing with both compilers, so differential testing will not detect those bugs.

The goal of the present project is to find many more bugs in the OCaml compiler(s) by working on these two problems:

  1. Extend the existing generator to other parts of the language (user-defined datatypes and pattern matching, the module system. etc.), so that we can generate many more fully-determined programs and find more bugs.

  2. Instead of comparing those two very close compilers together, use a “reference interpreter” as a comparison point: an interpreter for the language that is built to be simple and closely follow the specification, rather than to be efficient. Some early prototypes of reference OCaml interpreters exist (https://github.com/johnwhitington/ocamli, https://github.com/Ekdohibs/camlboot, mlinterp); they may have to be completed to support larger fragments of the language.

The idea of using a reference interpreter is that it is more likely to be correct with respect to the language specification; of course, currently those interpreter prototypes may have bugs (they are used less often than the compilers, so less well-tested); differential testing would start by finding those bugs in the interpreter, but because the codebase is simpler they should be easier to fix.

Minimum system requirements

There are no specific computer requirements for this project. The OCaml compiler implementation builds in a few minutes on most laptops, and can be built under most operating systems (Windows, OSX, Linux, BSD, etc.).

Repository

https://github.com/ocaml/ocaml

Issue tracker

https://caml.inria.fr/mantis/

Newcomer issue tag

junior_job

Intern tasks

The first task is to build and run the existing program generator, against a development version of the OCaml compiler. We might find some new bugs at this step, which should be reported – and we could try to fix them as well.

The second task is to connect one of the existing interpreters prototypes to the program generator. We would probably find bugs this way; we have to find out if they are bugs in the interpreter or in the compiler, report them in the right place, and try to fix them as well.

The main tasks are to:

Finally, an important task would be to document the testing process and ensure that it is easy for others to run the same tests. Ideally, the compiler developers would keep running those tests regularly to find any new regression in the compiler codebase.

Intern benefits

Working on this project is an excellent way to learn how programming languages are implemented (how to execute the source code), and how to do random testing (fuzzing) of programming language implementations. Learning how programming languages work (interpreters, compilers) is fascinating; learning how to find bugs in production implementations is a useful skill that can make you employable in compiler-implementation teams (for any language or system).

Community benefits

Your work could find (and fix) many bugs in the main OCaml implementations, and build good tools to more bugs and detect regressions in the future. The OCaml programming language is used to write high-assurance programming tools (static analyzers, program verifiers, theorem provers, etc.) that are themselves used to check critical software (airplane control systems, etc.); compiler bugs could have dire consequences.

How can applicants make contributions to your project

Small tasks

Here are some easier issues marked as “junior_job” on the OCaml issue tracker:

Medium tasks

Larger tasks