Gagallium : Using well-disciplined type-propagation to disambiguate label and constructor names

An explanation of the raging discussion that has taken place on the OCaml bugtracker about new ways to resolve record fields and variant constructor names using type information. Read more for a presentation of the different proposals and an explanation of the consensus reached.

Resolving record fields and variant constructors in OCaml today

Today, when you write an access to a record field/label r.l, the OCaml type-checker will look for what the label l means. If your code is correct, it will find in the typing environment that the label l corresponds to some record type t (defined as a record of the form {...; l:foo; ...}), infer that the expression r must have type t, and that the return type of r.l must be the type of the field l in t (foo in our example).

# type t = {l1 : int; l2: bool};;
type t = { l1 : int; l2 : bool; }
# let test
r = rl1;;
val test : t -> int = <fun>

In case you have actually defined t and its labels not in the current module but in a sub-module that has not been opened, you need to prefix the label with the module name.

# module P = struct type t = {l1 : int; l2:
bool} end;;
module P : sig type t = { l1 : int; l2 : bool; } end
# let test r = rPl1;;
val test : P.t -> int = <fun>

Symmetrically, when constructing a record, OCaml will look at the labels to determine the type of the whole expression. {P.l1 = 1; P.l2 = true} will be inferred at type P.t. The exact same story holds for variant/sum types: if you write Some 12, OCaml will look in the typing environment for the type having constructor Some (it is the builtin type 'a option), and infer the result type int option. Symmetrically, when pattern matching (match v with None -> true | Some _ -> false), it checks that v indeed has an option type.

Note that this is fundamentally different from how ocaml objects and polymorphic variants work. They are open, extensible types, you do not need to have defined types with a given method or polymorphic variant tag to use it.

# let get_m = fun o -> om;;
val get_m : < m : 'a; .. > -> 'a = <fun>
# let foo =
`Foo true;;
val foo : [> `Foo of bool ] = `Foo true

To infer the type for get_m, the type checker didn’t have to know “which type” had a method named m: it works for any object type with such a method. Similarly, the [> in the type of foo indicate that this value has a polymorphic type: it is not typed with a particular sum type having the constructor `Foo, but is of the type of any polymorphic variant having at least a constructor `Foo with a single parameter of type bool.

One way to say this is that methods have “polymorphic” access (they can work on any types with that has this method) while record fields have “monomorphic” access: they denote a field in a unique, well-determined record type that has this label.

(In this regard, variant constructors behave like record fields, and polymorphic variant tags behave like object methods. This will be the case during much of the discussion, so I won’t repeat it and keep talking about fields, while I really mean to discuss both record fields and variant constructors.)

In particular, it does not make sense to prefix a method name with a module (o#P.m), because the method name m does not have a “point of definition” that would be in a module, it is a global name. The OCaml manual says that record fields are “named objects” that may be prefixed by an access path of one or several module names, while method names are “global labels” that “need not be qualified”.

There are two perceived issues with the current state of how field access is type-checked, that led to this discussion of how to add a new feature to alleviate them.

Problem 1: Writing module prefixes can be inconvenient

If the type of the record you’re trying to build is named in the submodule Bar of the module Foo, building a value with {Foo.Bar.l1 = ...; Foo.Bar.l2 = ...} can be inconvenient. Similarly, pattern matching with two cases, one on the constructor Foo.Bar.Var x and the other on Foo.Bar.App (a,b) is redundant and painful.

In the case of record construction, this problem has been solved (in 3.12 I believe) by allowing to prefix only one of the labels, instead of all the labels: it is legal OCaml to only write {Foo.Bar.l1 = ...; l2 = ...}, the type-checker will notice that you gave the full name for one of the fields (not necessarily the first one) and not complain.

Perhaps strangely, this feature has not been implemented for variant pattern matching. But since 3.12, thanks to Alain Frisch’s feature proposal, you can locally open modules let open Foo.Bar in ... or even Foo.Bar.( ... ) around expressions using those labels or constructors. Compared to global open, this avoids some environment pollution, but you still get a lot of names imported in your scope (that may shadow names in your environment) when you only wanted to speak about those damn variant constructors.

Problem 2: Label shadowing is possible and extremely awkward

You are allowed to define two types that share some record names in the same module.

type t1 = { foo : char; bar : bool };;
let make_t1 foo bar = { foo; bar };;
type t2 = { foo : bool; baz : int
};;
let make_t2 foo baz = { foo; baz };;

(The feature I used in the make_.. functions is record field punning, where at record creation time foo; is syntactic sugar for foo = foo. You may not know that if your record is defined in a submodule P, then let (foo, bar) = (1, true) in {P.foo; bar} will do the right thing.)

If you do that, you end up in an uncomfortable situation where you may create value at types t1 or t2, but you cannot access the field foo of the type t1 anymore: it has been shadowed by the definition of t2. If you use the label foo anywhere in your program after those definitions, it may only mean the label foo of t2.

# let get_bar r =
rbar;;
val get_bar : t1 -> bool = <fun>
# let get_foo_and_bar r = (rfoo, rbar);;
Line 1, characters 34-37:
Error: The record type t2 has no field bar
Did you mean baz?

Note that there is an interaction between those two issues: to avoid writing qualified labels, you can open the modules in which the labels are defined. But if you open two (sub)modules that share some labels, you get into the shadowing problem. To avoid shadowing, you need to use long names more often than you would like. One typical example is the following, where you translate from some form of Abstract Syntax Tree (AST) to another, that have been defined in different modules but share some natural constructor names:

let rec type_check env : UntypedAst.expr -> TypedAst.expr * Type.t =
  function
  | UntypedAst.Var x ->
      (TypedAst.Var x, lookup x env)
  | ...

One may open either UntypedAst or TypedAst or both, but for conflicting constructor names (Var, App…) only those of the module opened last can be used unqualified. Short of using polymorphic variants to avoid this issue, the idiomatic solution is to prefix such constructor names with a letter, for example UVar in UntypedAst and TVar in TypedAst to be able to open them both without shadowing. This is a specialized example of the real-world cases (syntax transformations) that Alain is considering problematic in OCaml today.

The two competing feature proposal

Fixing problem 1: the scope-inferring proposal

Jacques Garrigue (re)started the discussion with a rather simple change proposal for the type-checker that had the following effect: if type information is available, the type-checker guesses the module prefix that needs not be written explicitly. This was merely a simple prototype to explore the subject, and Jacques expected the final semantics to be different; but this initial proposal is still important to understand the design space.

Consider the following example:

module Foo = struct
  type t = {l1 : int; l2 : int}
end

let tuple (r : Foo.t) = (r.Foo.l1, r.Foo.l2)

With this scope-inferring proposal, tuple can be simply written

let tuple (r : Foo.t) = (r.l1, r.l2)

The labels l1 and l2 appear to be unknown (note that we did not open the module Foo), but type information is available and the type-checker can rewrite it into the previous form.

While this proposal doesn’t directly help with the problem of label shadowing, it can be used to work around it. Indeed, we can rewrite the previous conflicting example into:

module M = struct
  type t1 = { foo : char; bar : bool }
end
module N = struct
  type t2 = { foo : bool; baz : int }
end
open M
open N

let get_foo_and_bar (r : t1) = (r.foo, r.bar)

I have wrapped the two conflicting record declarations into distinct submodules. Then, by providing a type annotation (r : t1), we can access r.foo which is understood as r.M.foo. In a sense, the non-ambiguous prefix M was inferred.

This also works naturally in the AST transformation example, as the conflicting labels are already defined in distinct submodules. With this proposal you could then write the following:

let rec type_check env : UntypedAst.expr -> TypedAst.expr * Type.t =
  function
  | Var x -> (Var x, lookup x env)
  | ...

One downside of this proposal is that may not be obvious to understand which module prefixes are inferred (type information can flow from less explicit places such as function applications or annotations that are far away); an user that has been used to the idea that labels appearing unqualified in the code are available in the naming environment can be surprised and have a difficulty finding the definition place of the labels/constructors – specialized tooling such as Jun Furuse’s ocamlspotter could help with this issue.

On the other front, this proposal is not intended to help with the problem 2 of conflicting labels, and the possible workaround is not terribly convenient. Alain Frisch, in his local variant of the OCaml language at Lexifi, has been using conflict-solving strategies for a long time, and was asking for a change that would address this problem as well.

Fixing problem 2: the conflict-solving proposal

Solving conflicting label names is another way to use type information. Remember the initial wrong example:

type t1 = { foo : char; bar : bool }
type t2 = { foo : bool; baz : int }

let get_foo_and_bar (r : t1) = (r.foo, r.bar)

In this situation, the type-checker knows when checking r.foo that r has type t1, and could resolve the field access as meaning “the field foo of t1” rather than using the shadowing rule (the last field t1 defined or opened in the naming environment). We could write this explicitly by an imaginary extension of the OCaml syntax, where the label l at type t would be written l^t. In our example, the type-checker could disambiguate r.foo into r.foo^t1.

This proposal corresponds to a rather unoriginal treatment of type-directed overloading: some restricted class of identifiers is considered to have potentially several simultaneous meanings, and a type-directed disambiguation strategy is applied. Labels are “named objects” and resolved as such; on the other hand, restricting the potential ambiguities to only labels (and variant constructors) limits the scope of the ambiguity and its potential dangers of making the code harder to read.

A similar feature exists in the Agda programming language, where constructor names do not follow a strict shadowing discipline but may be disambiguated by typing. This is very convenient in a dependently typed language where the current techniques to ornaments of data structures is to define several versions of the same inductive type with varying degrees of type information.

This proposal doesn’t directly address the problem 1: if t1 and t2 were both defined in a submodule P, you would still need to write (r.P.foo, r.P.bar); the type information is only used to disambiguate between the two candidate labels foo^(P.t1) and foo^(P.t2).

This feature also works if the two types added to the unqualified scope came from different modules. Therefore it help solve the problem of having to prefix labels by their module paths. Indeed, you can then open the corresponding modules, and use type-based disambiguation to avoid the shadowing problem. For example you could write:

let rec type_check env : UntypedAst.expr -> TypedAst.expr * Type.t =
  let open UntypedAst in
  let open TypedAst in
  function
  | Var x -> (Var x, lookup x env)
  | ...

and have it disambiguated into

let rec type_check env : UntypedAst.expr -> TypedAst.expr * Type.t =
  let open UntypedAst in
  let open TypedAst in
  function
  | Var^(UntypedAst.expr) x ->
      (Var^(TypedAst.expr) x, lookup x env)
  | ...

It is more explicit than the scope-inferring proposal as to where the labels are defined. Compared to this example with the scope-inferring proposal, we had to use two open directives that explicitly enrich the naming environment.

Note that while the explicit syntax r.P.l^t is not available in OCaml, this proposal adds a way to express it: you can write (r:t).P.l, where the type annotation (r:t) guarantees that type-based disambiguation will work. This proposal is therefore strictly more convenient than adding r.P.l^t alone: you can have the type annotation somewhere close (the type information is propagated), as in my type_check example where I annotate the whole function rather than each constructor.

Still, the scope-inferring proposal had the nice property that the result of the disambiguation could be expressed as an explicit OCaml program without type annotations, by adding the inferred module prefixes. In absence of the explicit l^t syntax, this proposal doesn’t have this desirable property. More generally, it adds a form of syntactic overloading, which has been carefully avoided in the design of OCaml so far.

Choices, consequences and compromises

The argument in favor of the scope-inferring proposal is that you need to write less code, in situations where the conflict-solving proposal would require either an open (with the associated environment pollution risk) or an explicit module prefix.

The converse argument supports the conflict-solving proposal: by requiring to be strictly more explicit in terms of which labels are meant, programs are easier to read and more robust to change of type information (removal of an annotation, etc.). In particular, the “specific semantics” introduced by the proposal would only kick in when two labels exist in the same scope, instead of being a general way to shorten label denotations: you only pay (in complexity of the part of system that needs to be explained to understand the program) in the already-fishy case where label shadowing happens.

In the bugtracker discussion, Jacques was in favor of the scope-inferring proposal, Alain wished to have both proposals implemented, while Leo P. White and myself were in favor of the conflict-solving proposal only. After some time to reach convergence, the arguments on all sides were well-understood, and there remained a difference of opinion that couldn’t be resolved (no pun intended) by objective arguments. Jacques and Alain carry a lot of weight in the discussion, as Jacques is the maintainer of the type-checker code (and is therefore the person that ultimately has a say on the matter), and Alain has implemented and used (a variant of) the two proposals in his own dialect of the OCaml language used at Lexifi. Leo opinion was also important in keeping the discussion open, as he is one of the few brave souls to have worked on the type-checker code to implement his external extensible open datatypes proposal for OCaml).

I don’t want to give the impression that this is only a matter of blunt opinion and authority: there were theoretical arguments on the side of both proposals, but I don’t know how to sum them up concisely so they will be the topic of a second, more technical blog post.

In the end, the following compromise made consensus : implement both type-directed disambiguation strategies (scope inference and label conflict resolution) as a way to subsume both proposals, but raise a warning when scope inference has an effect. By treating this warning as an error, you get exactly the conflict-solving proposal. By disabling this warning, you get the union of the conflict-solving and scope-inferring proposal.

The implementation of this feature is surprisingly tricky and Jacques and Leo are discussing back-and-forth on how to do that properly. While I don’t have enough information to give any guarantee on this (the proposal may still be rejected by other developers feeling the feature is too adventurous, and indeed it seems quite complex), I would not be surprised to see this compromise effectively available in the next major version of OCaml (4.01)… maybe.

That’s it!

PS: after this post, you may also want to read the second post for a more technical exposition of the details and design arguments.