This series of blog post aims to give a short weekly glimpse into my (Florian Angeletti) daily work on the OCaml compiler. This week, the focus is on two interesting bugs in the first alpha release of OCaml 5.1.0.

With the release of the first alpha for OCaml 5.1.0, I have shifted a part of my time towards updating core tools like odoc for OCaml 5.1 and hunting bugs in the new release.

Last Monday, working with Kate Deplaix, we found two interesting bugs by looking at the opam-health-check reports.

  • A bug in the new parsetree node for value bindings
  • A potentially painful change of behavior for explicitly polymorphic with open polymorphic variant types.

Coercion on value definitions

The first bug stems from a rare construct in the OCaml language: coercion in value definition:

  let x :> < x : int > = object
    method m = 0
    method n = 1
  end

Here, we are coercing the body of the value definition to the type <x:int> masking the method m. This syntax is a bit surprising we have an explicit coercion which is an expression which is applied on the pattern side of the definition.

Before OCaml 5.1, such constructions were desugared in the parser itself to:

  let (x : < x : int >) = (object
    method m = 0
    method n = 1
        end:> <x:int>
  )

When I updated the abstract syntax tree to avoid desugaring value bindings of the form

let pat: typ = exp

in the parsetree, I forgot this case which means that the AST dropped the coercion part of the annotation.

This mistake ought to be fixed but it leads to an interesting question on how to represent constraints on value bindings. Should we be generic and represent the constraints as:

type value_constraint = {
  locally_abstract_types: string loc list; (* type a b c . ... *);
  typ: type_expr option (* ... : typ? ...* );
  coercion: type_expr option (* ...  :> typ? *)
}

However, with this representation, we cover two possible new cases that could be written in fantasy syntax as

type a b c. typ :> coercion

and

type a b c. :> coercion

More problematically, this product type allows for constraints without any real constraints

type a b c.

Thus the generic product feels a tad too wide.

Another option is to tailor a type closer to the currently supported syntax with

type value_constraint =
  | Constraint of {
      locally_abstract_types: string loc list; (* type a b c . ... *);
      typ: type_expr (* ...: typ *)
    }
  | Coercion of {
      ground type_expr option (* typ? ...* );
      coercion: type_expr  (* ...  :> typ *)
    }

This representation has the disadvantage of losing part of the similarity between the Coercion and constraint case but it covers exactly the constructs allowed in the syntax.

This my current bug fix proposal for OCaml 5.1.0 .

Polymorphic variants and explicit universal quantification

Another interesting difference of behavior between OCaml 5.1.0 and 5.0.0 appears when writing code that mix both open polymorphic variant types and explicit polymorphic annotation:

let f : 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false

This code compiled in OCaml 5.0.0, but fails in OCaml 5.1.0 with

Error: This pattern matches values of type [? `X of 'b ]
      but a pattern was expected which matches values of type [> `X of 'a ]
      The universal variable 'a would escape its scope

because the universal variable 'a might escape through the global row variable hidden in [> X of _ ].

The issue can be fixed by making sure that the row variable is also bound by the explicit universal quantification:

let f : 'a 'r. ( [> `X of 'a ] as 'r ) -> 'a = function (`X x) -> x | _ -> assert false

Not only this fix is not that obvious, but it is not compatible with the short syntax for universal-outside and locally abstract-inside type variables. For instance, if we start with

type 'a w = Int: int w
let f : type a. a w -> [> `X of a ] -> a = fun Int -> function (`X x) -> x | _ -> 0

adding a local type r doesn’t help

type ('a,'r) c = [> `X of  'a ] as 'r
let f : type a r. a w -> (a,r) c -> a = fun Int -> function (`X x) -> x | _ -> 0
Error: This type r should be an instance of type [> `X of a ]

because we would need a constrained abstract type.

Thus, we are left with no other options than desugaring the short hand to:

let f: 'a 'r. 'a w -> ( [> `X of 'a ] as 'r ) -> 'a = fun (type a): (a w -> [> `X of a ] -> a) -> fun Int ->
function (`X a) -> a | _ -> 0

which is a bit of mouthful compared to our starting point.

Thus, I have been investigating with Gabriel Scherer a possibility to keep the previous definition working

let f: 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false

by making the assumption that any row variables that are unnamed in an explicit type annotation under an explicit universal quantification should be bound by the binder. In other words, we could consider that whenever an user write

let f: 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false

they meant to write

let f: 'a 'r. ([> `X of 'a ] as 'r) -> 'a = function (`X x) -> x | _ -> assert false

and thus the typechecker ought to add implicitly the row variable 'r' to the list of bound variables in the left-hand side of ..