I've been investigating the issue further, searching the relevant literature for previous work on GADTs+subtyping. This situation is indeed described in the work by Vincent Simonet and François Pottier, "A Constraint-Based Approach to Guarded Algebraic Data Types", 2007. http://www.normalesup.org/~simonet/publis/index.en.html#simonet-pottier-hmg-toplas They describe a generic system HMG(X) that adds GADTs on top of an existing type system optionally supporting subtyping -- in the spirit of HM(X), a generic study of adding Hindley/Milner parametric polymorphism to a language. When X is instantiated with (an idealized view of) the OCaml type system, their work can be used to get soundness proof for the current GADT implementation. # The soundness requirement from Simonet & Pottier Simonet and Pottier prove type soundness of a language with subtyping extended with GADT under general hypotheses that the underlying time system must verify. There are three requirements, two saying that function types must have the correct variance and must not be in subtyping relation with a GADT type, and one that is relevant to the present discussion, that says in essence that an "inversion" principle must hold for all constructors taken independently: for each GADT (ᾱ t) (ᾱ denotes a sequence of variables α₁,α₂...αn) and each of its constructor (K : σ(ᾱ)), it must holds in the underlying type system that (ᾱ t ≤ ᾱ' t) implies (σ(ᾱ) ≤ σ(ᾱ'). More precisely, a GADT constructor K for ᾱ t has a type scheme of the form K : ∀ᾱ∀β̄[D̄] τ̄ → ᾱ t or, equivalently, K : ∀ᾱ (∃β̄[D̄] τ̄) → ᾱ t The ᾱ are the variables of the parametrized type (ᾱ t), the β̄ are the existential types for this constructor K, used in the types τ̄ of the constructor parameter, and D̄ is a set of constraints on those ᾱ and β̄, equality constraints as with current-ocaml-GADTs or possibly other forms of constraints, such as subtyping constraints – with the assumption D is always satisfiable, other K would be a dead constructor. The "product" GADT in my previous mail would be written: type α expr Val : ∀α α → α expr Prod: ∀α∀β₁β₂[α = β₁*β₂] β₁ expr * β₂ expr → α expr Formally, the statement of Simonet and Pottier (page 19) says that if ∀ᾱ∀β̄[D̄] τ̄ → ᾱ t and ∀ᾱ'∀β̄'[D̄'] τ̄' → ᾱ t are two alpha-renamed copies of the type of a single constructor K (I mean ᾱ,β̄ are distinct from ᾱ', β̄') then the following constraint entailment holds: D̄ ∧ (ᾱ t ≤ ᾱ' t) ⊩ ∃β̄'. (D̄' ∧ τ̄ ≤ τ̄') For example, the (Prod) constructor of the (α expr) type gives rise to the following requirement: α = β₁*β₂ ∧ (α expr ≤ α' expr) ⊩ ∃β'₁β'₂, (α' = β'₁*β'₂) ∧ (β₁ t ≤ β'₁ t) ∧ (β₂ t ≤ β'₂ t) I have applied this requirement to the various previously discussed examples and counter-examples of GADTs parameter variance. If the requirement holds given some variance annotations, then we know for sure that the annotations are correct. If the variance annotations contradict the requirements, then they are incorrect. The study of examples allowed to formulate a general case criterion as refinement of Simonet&Pottier requirement in the equality case. The criterion, however, depends on properties of OCaml's subtyping relation that is strongly affected, in particular, by the presence of `private` types. This raises interesting questions on the type system design and its potential evolution, with several points that would be worthy of further exploration. (To lighten the present mail I have separated some technical arguments to a joint document that you should feel free to skip.) # A definition of parameter variance My definition of variance is as follows. The parameters α₁,α₂...an of a parametrized type ᾱ t can be each assigned a variance annotation among {+,-,=} (covariance, contravariant and invariant) with the following meaning: (ᾱ t ≤ ᾱ' t) holds *if and only if*, for each variable αi ∈ ᾱ: - if it is annotated with +, αi ≤ α'i - if it is annotated with -, αi ≥ α'i - if it is annotated with =, αi = α'i We will write (≤i) the order corresponding to the variance of the parameter αi: (≤), (≥) or (=) if it is respectively (+), (-) or (=). We can then write the variance fact for the type ᾱ t as: (ᾱ t ≤ ᾱ' t) iff (∀i, αi ≤i α'i) ## Simple admissible criteria There are two simple facts to note: - choosing invariance by default is always safe - the (more permissive) current OCaml choice for GADTs is safe If all parameters of a given GADT (ᾱ t) are considered invariant, then the requirement of Simonet and Pottier holds for all constructors of this GADT, which means that this choice preserves soundness. Indeed, given ᾱ, ᾱ' and β̄ we need to prove that: ᾱ t ≤ ᾱ t ∧ D̄ ⊩ ∃β̄'. (D̄' ∧ τ̄ ≤ τ̄') By invariance, we have that for all i, αi = α'i, that is ᾱ = ᾱ'. Then taking β̄':=β̄ we have D̄=D̄' and τ̄=τ̄', so indeed ᾱ=ᾱ ∧ D̄ ⊩ D̄' ∧ τ̄≤τ̄' The current OCaml choice is to use the usual variance determination rules for parameters αi that "are not instantiated", which means that there is no constraint on αi in D, and mark the other invariant. But then we have, as in the previous case, D̄=D̄', and τ̄≤τ̄' is guaranteed by the usual variance check on the non-instantiated αi. # Simple examples and counter-examples It is now interesting to have a look at the examples and counter-examples that were given in the discussion. ## My earlier 'Prod' example type +α expr Val : ∀α α → α expr Prod: ∀α∀β₁β₂[α = β₁*β₂] β₁ expr * β₂ expr → α expr The type of the 'Val' constructor, which is not instantiated, falls under the previous criterion: the existing covariance check suffices to prove the requirement of Simonet&Pottier. For the Prod constructor, the requirement was: α = β₁*β₂ ∧ (α expr ≤ α' expr) ⊩ ∃β'₁β'₂, (α' = β'₁*β'₂) ∧ (β₁ expr ≤ β'₁ expr) ∧ (β₂ expr ≤ β'₂ expr) Under covariance assumption (α ≤ α' iff α expr ≤ α' expr) this is equivalent to: α = β*β₂ ∧ α ≤ α' ⊩ ∃β'₁β'₂, (α' = β'₁*β'₂) ∧ (β₁ ≤ β'₁) ∧ (β₂ ≤ β'₂) But we "know" (this is an assumption I make about the OCaml type system) that the following "inversion principle" holds on tuples: β*β₂ ≤ α' implies that α' is of the form β´₁*β´₂ with β₁≤β´₁, β₂≤β´₂. Hence our requirement is met. ## Jeremy Yallop's 'eq' counter-example type (=α,+β) eq Refl : ∀αβ∀γ[α=γ,β=γ] (α,β) eq The requirement derived from this definition is: α=γ ∧ β=γ ∧ (α,β) eq ≤ (α',β') eq ⊩ ∃γ', α'=γ' ∧ β'=γ' Which is equivalent, by the assumed variance annotations, to α=γ ∧ β=γ ∧ α=α' ∧ β≤β' ⊩ ∃γ', α'=γ' ∧ β'=γ' that is γ=α' ∧ γ≤β' ⊩ β'=γ which is false in a model with at least one strict subtyping relation. Note that (+α,+β) (or any other annotation different from (=,=)) would also fail, as α≤α', β≤β' and α=β do not entail α'=β': intuitively, variant parameters must be free to vary "independently". # The constant case: Upward-closed, downward-closed Jacques private example is equally interesting: type priv_int = private int type +α expr Priv : ∀α[α=priv_int]. α → α expr The requirement is: α = priv_int ∧ α expr ≤ α' expr ⊩ α' = priv_int ∧ α ≤ α' Which is equivalent under the covariance assumption to: α = priv_int ∧ α ≤ α' ⊩ α' = priv_int But this is false: for α' := int we have α≤α' yet α' ≠ priv_int. As the requirement is not met, we have no proof of the soundness of the system and should reject this choice of variance annotation. ## Covariance and upward-closed types Note that the above criterion would be true of any "constant constructor": for a base type `c` not parametrized by α, the constructor: K : ∀α[α=c]. α → α expr gives birth, under a covariance assumption, to the requirement: α = c ∧ α ≤ α' ⊩ α' = c This requirement exactly says that c is 'upward-closed': all types bigger than c are equal to c. This is wrong for the type 'priv_int' above, and probably polymorphic variant types, but true for a lot of OCaml types: int, int list, etc. There is something strange happening, where we start doing semantic distinctions on types that we didn't have to do before; now we care whether a type is a private type or not, because it affects the variance of GADTs constructors using it for parameter instantiation. ## There are no downward-closed types Of course the same phenomenon happens in contravariant position: constant types instantiating contravariant positions should be downward-closed. But this is not possible in OCaml type theory, where no type (except ⊥) is downward-closed because of the pervasive presence of private types: morally, (private σ) is always strictly smaller than σ (even if σ is itself a private type). This means that accepting any constant instance in contravariant position, eg. type -α t K : ∀α[α=int] (α → int) → α t would be unsound: eliminating (K (fun n -> n) : priv_int t) would allow to forge private integers. # Constructors cases: general inversion In the general case we have non-constant constraints of the form (αi = Ti(β̄)), where Ti(β̄) is a type expression having as free variables a subset of the existential variables of the constructor. D̄ ∧ (ᾱ t ≤ ᾱ' t) ⊩ ∃β̄'. (D̄' ∧ τ̄ ≤ τ̄') which is equivalent, assuming some variance on the ᾱ, to D̄ ∧ (∀j, αj ≤j α'j) ⊩ ∃β̄'. (D̄' ∧ τ̄ ≤ τ̄') Let's assume that, for a given αi, the existential variables appearing in Ti(β̄) are not used in any other constraint (we will discuss this requirement later). We can therefore study independently the necessary sub-requirement (αi = Ti(β̄)) ∧ αi ≤i α'i ⊩ ∃β̄' (α'i = Ti(β̄') ∧ τ̄ ≤ τ̄[β̄'/β̄]) which is equivalent, substituting equalities, to Ti(β̄) ≤i α'i ⊢ ∃β̄' (α'i = Ti(β̄') ∧ τ̄ ≤ τ̄[β̄'/β̄]) This is an "eliminability" requirement on type expressions Ti. In the covariant case for example, it states that all types bigger than Ti(β̄) are also of the form Ti(β̄') for some β̄'. The requirement further asks that the variance of the β̄ in Ti coincides with their variance in the parameter types τ̄ (this is what τ̄ ≤ τ̄[β̄'/β̄] means). This extends the notion of upward/downward closure for constant types. A parametrized type (ᾱ t) is "upward closed" if all types τ such that (ᾱ t ≤ τ) are of the form (ᾱ' t): ᾱ t ≤ τ ⊩ ∃ᾱ'(τ = ᾱ' t) (Combined with the variance information on t, we also know that, for all satisfying ᾱ', we have (∀i, αi ≤i α'i)). Again, all the usual OCaml parametrized datatypes are upward closed, private types are not, and no type is downward closed because of the presence of private types. A type expression T(β̄) with head constructor (ᾱ t) is upward closed if its subexpressions appearing in covariant positions (for t) are upward closed, and those in contravariant positions are downward closed. # Existential variable interference The reasoning in the constructor case was done under the assumptions that the set of existential variables β̄ appearing in each Ti(β̄) were distinct. I claim that it is both necessary and sufficient (to satisfy the requirement) to assume that eventual shared variables are invariant with respect to the whole GADT type. If one writes the constructor type in OCaml syntax: K : τ̄ → (T₁(β̄),T₂(β̄)...) t then, if a given β appears in several Ti,Tj, each of its occurrence in the type expression (T₁(β̄),T₂(β̄)...) t must be in invariant position. It is easy to show that this ensures soundness, but I found it harder to get a (draft of) proof that it is also required. It may be interesting to see examples of interference. In Jeremy Yallop's equality example Refl : ∀β. (β,β) eq that is Refl : ∀α₁α₂∀β[α₁=β,α₂=β] (α₁,α₂) eq both constraints are covariant in α₁, so the corresponding parameters must both be invariant: only (=α₁,=α₂) eq is a correct variance annotation. A (convoluted) variation of this example is type (+α₁,=α₂) refpair RefPair : ∀βγ. β * γ ref → (γ ref, β * γ) or equivalently, in constraint form RefPair : ∀α₁α₂∀βγ[α₁ = γ ref, α₂ = γ * β]. β * γ ref → (α₁, α₂) refpair This case is safe: while the variable γ is shared, it is only used invariantly in (γ ref, β * γ) t as it is instantiated invariantly in the first parameter, and the second one is invariant. If (γ ref, γ * β) refpair ≤ (α'₁, α'₂) refpair, then we have (γ ref ≤ α'₁), which means (by upward closure and then invariance of 'ref') that α'₁ = γ ref, and α'₂ = γ * β (by invariance). # A criterion in the general case I believe (and have a draft of proof in the attached technical annex) that the requirement of Simonet&Pottier is equivalent, in the equality case, to the following criterion that *could* be implemented in a type-checker – more easily than by adding a general constraint solver. Given variance annotations (≤i) for the type ᾱ t, the type constructor K : ∀ᾱ∀β̄[α₁ = T₁(β̄), α₂ = T₂(β̄)...] τ̄ → ᾱ t or equivalently (in OCaml syntax) K : τ̄ → (T₁(β̄),T₂(β̄)...) t respects the variance if all the conditions below are met: 1. Non-interference: if an existential variable β appears in several Ti(β̄), then all its occurences in the whole type expression (T₁(β̄),T₂(β̄)...) t must be in invariant position. 2. Upward/downward closure, or eliminability: for each i, β̄, α', if (Ti(β̄) ≤i α') holds then α' is of the form Ti(β̄'). 3. Constructor parameters variance: the variance of the β̄ in the type expression (T₁(β̄),T₂(β̄)...) t must respect their variance in the constructor parameters τ̄ – just like if we had a non-GADT declaration in term of the β̄, (type β̄ t = K of τ̄ | ...). I believe this is sound, not unreasonably difficult to check, and would help in a lot of concrete cases that are currently rejected. # Discussion of private types and contravariance There is something quite disturbing in the idea that the mere existence of private types creates a strong asymmetry between covariant and contravariant parameters for GADTs: while upward-closed types are relatively common (so that we can expect "the usual examples" of GADTs to have "the expected variance"), there are no more downward-closed types, which means no contravariant constrained parameter. Upward or downward closure can be easily decided on fully transparent types (types whose definition is known). But there is a problem with opaque types: as we don't know how they were defined, we can't check if they are upward/downward closed. From a design point of view, the "right thing" would be to let people publish this information in the signature for an opaque type. Just as we can annotate the type with variance information, we should be able to publish its upward/downward closure properties. Another natural question is: what if we added the symmetric of the "private" concept, a type (blind σ) strictly above σ for all types? Would we lose all reasonable variance properties on OCaml types? One idea from Didier Rémy is to allow, when defining a new type 't' (not a type synonym), to define it as "non-privatizable" (and correspondingly non-blindable). We or another programmer would then not be allowed to define a type 'private t'. In exchange for this lost flexibility, 't' would be downward-closed. The idea is that introducing private types increased our freedom, and reciprocally lessened our ability to reason about "what cannot be done"; non-privatizable annotations allow to locally make the inverse choice of losing 'private' for better subtyping properties. (Of course, I am not requesting short-term language changes to add those features.) From a pragmatic standpoint, I suspect the current restrictions 'private' entails on the GADT variance checks are not going to be really problematic in practice. We tend to use covariant types (data) more often than contravariant types (computations). The OCaml type system is already skewed towards covariance in one place: while the relaxed value restriction (which, I recall, is the reason why I needed GADT variance in the first place) would work equally well to generalize variables that appear only in covariant or only in contravariant positions, the relaxation is implemented in the type-checker only for covariant parameters. And I have not yet heard of someone requesting generalization of only-contravariant parameters, so apparently treating contravariant parameters as second-class citizens is ok among OCaml users. # A theoretical open-world approach Some people may consider the upward-downward criterion as fundamentally unsatisfying because of the closed-world assumption it makes: it is not preserved by enrichment of the OCaml subtyping lattice. There is another possibility to get some variance properties out of GADTs that still works in an open-world system: allow GADTs constraints to have subtyping constraints, instead of only equality constraint. For example we could define the type of well-typed expressions in this way (note the ≥ in constraints): type +α expr = | Val : ∀α. α → α expr | Prod : ∀α∀βγ[α ≥ β*γ] β*γ → α expr | Priv : ∀α[α ≥ priv_int] priv_int → α expr It is trivial that this definition is covariant in its type parameter, because covariance was baked in the definition. Indeed, if α≤α' and α≥β*γ, then α'≥β*γ by transitivity: the constraint is preserved by going to a larger type. In the soundness requirement of Simonet&Pottier, one can use the exact same existential variables on both sides (β̄':=β̄), and the constraints will be satisfied. The downside of this easier covariance property is that we get weaker types on deconstruction (pattern matching of GADT values): when matching on a (α t), in the Prod clause, we only know that (α ≥ β*γ) rather than (α = β*γ) – note that we could regain the stronger hypothesis by locally using the closed-world assumption. This simple idea has two potential problems: 1. This may not be implementable as part of the current OCaml type-checker, which uses type equalities internally and has more limited subtyping support (in particular we can't abstract over arbitrary subtyping assumptions). That said, this particular way of adding subtyping hypothesis in the context might be simple enough for it to handle. 2. Getting a weaker hypothesis may be unusable: maybe the usual GADT programs really make use of the equality assumptions, and going for subtyping assumptions would make untypable programs that would be perfectly fine under the closed-world assumption. Jacques can tell us about point 1. Regarding point 2., I'm not sure and have not thoroughly checked examples, but my intuition is that it is actually not a problem. I think that the types we naturally want to be covariant are also the types we want to manipulate as data (so it's ok if the type is larger than what we expect), rather than consumers, and conversely. For example, the idiomatic use of the (+α expr) datatype is the (eval : ∀α. α expr → α) function, and it can be written on this monotonous version. let rec eval : ∀α. α expr → α = function | Val (v : α) → (v : α) | Prod ∃βγ[α≥β*γ] (b:β, c:γ) → ((b, c) : β*γ :> α) | Priv [α ≥ priv_int] (n : priv_int) → (n : priv_int :> α) One would need to study more examples to get a better understanding of this question. # Conclusion and questions We have a relaxed criterion that we're confident is sound and can plausibly be implemented in OCaml (including possibly before the next release). But it relies on a closed-world assumption that is currently skewed towards covariance because of private types -- not a problem for our use case, the relaxed value restriction. Finally, it would suggest some language extensions to benefit fully from it, by enabling upward/downward closure specification for abstract types, and allowing to forbid making a new type `private` in the future (or blind) to gain stronger variance properties. On the other hand, we know that this criterion would not hold as is in an open world scenario. Is this consideration relevant to the OCaml type system? If one wants to keep the door open to the open world case, would the suggested approach (reasoning under subtyping constraints in pattern clauses) be implementable? I don't believe the two choices are exclusive: if we have an annotation to say "this type will never be marked private", it becomes downward-closed even in an open-world scenario, and we can recover the current behavior by saying that all types, by default, "will never be marked blind" but "may be marked private". Those defaults are design areas to explore. I would welcome feedback on the current state of affairs. Given what we know, is there a hope of relaxing the variance check for GADT at some short-to-medium-term point?