Gagallium : Florian compiler weekly, 11 December 2023

This series of blog post aims to give a short weekly glimpse into my (Florian Angeletti) work on the OCaml compiler. This week subject is the release of OCaml 5.1.1, some review work on occurrences analysis for OCaml projects and a bit of on-going work on structured logs.

OCaml 5.1.1

OCaml 5.1.0 has been released nearly three months ago, in those months we have discovered a few significant bugs that were impeding the use of OCaml 5.1:

one type system soundness bug
one dependency bug
one GC performance regression bug in numerical code

To fix those issues, and have a version of OCaml 5.1 usable by anyone, we have release a new patched version 5.1.1 of OCaml.

Let’s spend some time analysing those issues.

Type system (and type printing) bug

The first worrying bug in OCaml 5.1.0 was an issue with the refactoring of variance computation. Suddenly, in OCaml 5.1.0 one could write

type -'a n
type +'a p
type +'a ko = 'a p n

and the typechecker would accept the absurd statement that + * - = +. Such a statement can be easily exploited to crash an OCaml program, and it was introduced due to a small typo when refactoring the typechecker.

and mn =
  mem May_pos v1 && mem May_neg v2 || mem May_pos v1 && mem May_neg v2

(Can you spot the typo?)

Fortunately, the typo was easy to fix and the type system unsoundness was soon fixed.

But that was only the first problem. Soon after, a second report came in, mentioning that the typechecker was crashing on cyclic abbreviations like

type 'a t = 'a s * 'a irr
and 'a irr = unit
and  'a s = 'a t

but only when the -short-paths flag was specified. After some investigation, it was not the typechecker that was crashing but the type printer! (which is unfortunately not a that rare occurrence)

More precisely, the type printer was crashing when trying to print the improved type error message in OCaml 5.1 for cyclic definition

Error: The type abbreviation t is cyclic:
         'a t = 'a s * 'a irr,
         'a s * 'a irr contains 'a s,
         'a s = 'a t

because in presence of the flag -short-path the printer tried to find a better name for 'a t in a type environment which contained the recursive definition. This endeavour was doomed to fail. Fortunately, the fix was quite straightforward: don’t feed the type printer invalid type environment.

Packaging bug

The introduced of compressed compiler artefacts (.cmi, .cmt, and .cmo files) decreased the size of a compiler installation by a third. However, when focusing on the nice decrease in size of the installation; we forgot that using an external C library for compression within the runtime added this library as dependency for all users of the OCaml runtime.

In other words, we had added a new C dependency to all OCaml users. If this didn’t seem to create problem on macOS or most linux situation, at least for developers, it was clearly a non ideal situation. A compiler shouldn’t impose its dependency on compiled programs.

Moreover, if compressed marshalling is very nice for the compiler, this is a quite biased use case. In particular, one cannot use marshalled data coming from untrustworthy sources (because such data breaks type safety). Consequently, marshalled data is mostly used as a computation cache. This is what the compiler or Coq are doing, but this is not a frequent use case in the rest of ecosystem. Thus adding a new dependency to all OCaml programs for a niche use case was really not optimal.

Thus, after this packaging issue was discovered, we went on a journey to remove this dependency. After exploring hacks, visiting the limitations of dead symbol elimination by linkers, we finally reached a not that comfortable conclusion: the most robust solution was to remove the Compression flag from the standard library, while moving the compression support to a compiler internal library.

With this solution, the compression library dependency is no longer propagated to OCaml programs. But if someone wishes to used compressed marshalling, the only simple option for now is to use the compiler library.

However, this also means for the first time in a long time, a patch version of OCaml introduces a breaking change. Such breaking changes could have been avoided by ignoring the current Marshal.Compression flag. However, having a disabled flag exposed in the standard library sowing confusion amongst all users seemed to be a worse outcome than breaking the code of the few advanced users that might had released code relying on this new feature.

Garbage Collection Performance Regression

Finally, while the compression odyssey was on-going, I received a report by an user complaining that their numerical program ported to OCaml from Python was slower in OCaml. After looking at the performance, the Garbage Collector was surprisingly high in the list of hot performance spot for a purely numerical program.

Knowing that there were maybe some bugs lurking in the shadows of the 5.1.0 GC, I suggested to check the performance of this program on OCaml 4.14.1. Lo and behold, OCaml 4.14 was faster than python.

What happened to OCaml 5 to create such slowdown on a single thread program? Comparing the amount of time the GC triggered a major collection was surprising:

compiler version	time	minor	major
4.14.1	0.803s	6626	1656
5.0.0	1.448s	8697	8679
5.1.0	5.507s	56162	56156
5.2.0+trunk	1.701s	10945	10939
5.1+gc_fixes	1.740s	10563	10557

Between OCaml 4.14 and OCaml 5.1.0, the amount of major collection had been multiplied by a factor 50 .

If the root cause behind this change has not been completely understood yet, we fortunately had an easy way out: the OCaml 5.2 development version had already accumulated enough fixes to limit this increase to a less dramatic factor 6.

This is still not satisfying, but this is half expected for a still experimental runtime. And while waiting for a better solution, the collection of bug fixes integrated in OCaml 5.1.1 should made it possible to use numerical code in OCaml.

Project-wide occurence index

Beyond the release of OCaml 5.1.1, I have been working on reviewing PRs.

With Gabriel Scherer and Ulysse Gérard, we spend an afternoon reading and discussing the design of a PR by Ulysse which improves the shape metadata to make them provides an index of value and definition occurrences within an OCaml module.

The shape metadata are a new form of metadata introduced in OCaml 4.14. Those metadata tracks definitions of types, modules, module types, …

One important characteristic of shapes is that they are able to see through functors applications and includes to find back the original definition of values.

With the proposed PR, which still need more review time (maybe this week), one could use this information to find the occurrences of any values in an OCaml project (through the use of IDEs)

Structured log for the OCaml compiler

One of my ongoing work during this release cycle has been to create a structured log API for all output of the compiler.

One of the expected benefits for those logs would be a complete and versioned description of all user-facing outputs emitted by the compiler.

Consequently, a scheme for a log should have a clear version. But my current log API has a notion of nested logs, since the scheme DSL is essentially a versioned description of Algebraic Data Types.

Which leads me to the question, should nested logs have a version too?

My current answer is that only toplevel logs should have a version. Currently, this is implemented in OCaml by using a handful of generative functors as a fun use case for mixing phantom types and generative functors.

The idea is that we first define a Root module type that can define new versions by creating a value of type id scheme_version (with id a fresh new type).

type 'a scheme_version
type ('a,'b) key
module type Def = sig
  type id (* this should be a fresh and unique id *)
  type scheme = id sch
  type log = id t
  type nonrec 'a key = ('a,id) key
  val scheme: scheme
end

module type Root = sig
  include Def
  val v1: id scheme_version
  val new_key: id scheme_version  -> string -> 'a typ -> 'a key
  val new_version: version -> id scheme_version
end

Then nested records or sum types can only create new fields or constructors by using a typed version defined by the parent module:

module type Sum = sig
  type root
  include Def
  val new_constr: root scheme_version -> string -> 'a typ -> 'a key
end

and we can use generative functors to ensure that our type-level identifiers are as fresh as we need them to be:

module New_root_scheme(): Root
module New_record(Root:Def)(): Record with type root := Root.id
module New_sum(Root:Def)(): Sum with type root := Root.id