Florian's OCaml compiler weekly, 24 April 2023
- April 24, 2023
This series of blog post aims to give a short weekly glimpse into my (Florian Angeletti) daily work on the OCaml compiler. This week, the focus is on two interesting bugs in the first alpha release of OCaml 5.1.0.
With the release of the first alpha for OCaml 5.1.0, I have shifted a part of my time towards updating core tools like odoc for OCaml 5.1 and hunting bugs in the new release.
Last Monday, working with Kate Deplaix, we found two interesting bugs by looking at the opam-health-check reports.
- A bug in the new parsetree node for value bindings
- A potentially painful change of behavior for explicitly polymorphic with open polymorphic variant types.
Coercion on value definitions
The first bug stems from a rare construct in the OCaml language: coercion in value definition:
let x :> < x : int > = object
method m = 0
method n = 1
end
Here, we are coercing the body of the value definition to the type
<x:int>
masking the method m
. This
syntax is a bit surprising we have an explicit coercion which is an
expression which is applied on the pattern side of the definition.
Before OCaml 5.1, such constructions were desugared in the parser itself to:
let (x : < x : int >) = (object
method m = 0
method n = 1
end:> <x:int>
)
When I updated the abstract syntax tree to avoid desugaring value bindings of the form
let pat: typ = exp
in the parsetree, I forgot this case which means that the AST dropped the coercion part of the annotation.
This mistake ought to be fixed but it leads to an interesting question on how to represent constraints on value bindings. Should we be generic and represent the constraints as:
type value_constraint = {
locally_abstract_types: string loc list; (* type a b c . ... *);
typ: type_expr option (* ... : typ? ...* );
coercion: type_expr option (* ... :> typ? *)
}
However, with this representation, we cover two possible new cases that could be written in fantasy syntax as
type a b c. typ :> coercion
and
type a b c. :> coercion
More problematically, this product type allows for constraints without any real constraints
type a b c.
Thus the generic product feels a tad too wide.
Another option is to tailor a type closer to the currently supported syntax with
type value_constraint =
| Constraint of {
locally_abstract_types: string loc list; (* type a b c . ... *);
typ: type_expr (* ...: typ *)
}
| Coercion of {
ground type_expr option (* typ? ...* );
coercion: type_expr (* ... :> typ *)
}
This representation has the disadvantage of losing part of the
similarity between the Coercion
and constraint
case but it covers exactly the constructs allowed in the syntax.
This my current bug fix proposal for OCaml 5.1.0 .
Polymorphic variants and explicit universal quantification
Another interesting difference of behavior between OCaml 5.1.0 and 5.0.0 appears when writing code that mix both open polymorphic variant types and explicit polymorphic annotation:
let f : 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false
This code compiled in OCaml 5.0.0, but fails in OCaml 5.1.0 with
Error: This pattern matches values of type [? `X of 'b ]
but a pattern was expected which matches values of type [> `X of 'a ]
The universal variable 'a would escape its scope
because the universal variable 'a
might escape through
the global row variable hidden in [> X of _ ]
.
The issue can be fixed by making sure that the row variable is also bound by the explicit universal quantification:
let f : 'a 'r. ( [> `X of 'a ] as 'r ) -> 'a = function (`X x) -> x | _ -> assert false
Not only this fix is not that obvious, but it is not compatible with the short syntax for universal-outside and locally abstract-inside type variables. For instance, if we start with
type 'a w = Int: int w
let f : type a. a w -> [> `X of a ] -> a = fun Int -> function (`X x) -> x | _ -> 0
adding a local type r
doesn’t help
type ('a,'r) c = [> `X of 'a ] as 'r
let f : type a r. a w -> (a,r) c -> a = fun Int -> function (`X x) -> x | _ -> 0
Error: This type r should be an instance of type [> `X of a ]
because we would need a constrained abstract type.
Thus, we are left with no other options than desugaring the short hand to:
let f: 'a 'r. 'a w -> ( [> `X of 'a ] as 'r ) -> 'a = fun (type a): (a w -> [> `X of a ] -> a) -> fun Int ->
function (`X a) -> a | _ -> 0
which is a bit of mouthful compared to our starting point.
Thus, I have been investigating with Gabriel Scherer a possibility to keep the previous definition working
let f: 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false
by making the assumption that any row variables that are unnamed in an explicit type annotation under an explicit universal quantification should be bound by the binder. In other words, we could consider that whenever an user write
let f: 'a. [> `X of 'a ] -> 'a = function (`X x) -> x | _ -> assert false
they meant to write
let f: 'a 'r. ([> `X of 'a ] as 'r) -> 'a = function (`X x) -> x | _ -> assert false
and thus the typechecker ought to add implicitly the row variable
'r'
to the list of bound variables in the left-hand side of
.
.