** Step 1: Separating Simonet&Pottier into the type instance part and data parameter part Setting: our user is defining a GADT type (ᾱ t) and has fixed the variance of its parameters. A variance assignment is a family of comparison functions (≤i) among {(≤),(≥),(=)}, that adds the following assumption on our model: (ᾱ t ≤ ᾱ' t) <=> (∀i, αi ≤i α'i) With this variance assignment fixed, we will reason over the requirement of Simonet&Pottier, which is a condition that must holds for each of the GADT type constructor for the type system to be sound. If (under our variance assumption) the requirement are not met for some constructor, the type definition, or at least this constructor type, must be rejected as incorrect. I recall the general criterion of Simonet&Pottier: D̄ ∧ ᾱ t ≤ ᾱ' t ⊩ ∃β'. D̄' ∧ τ̄ ≤ τ̄' When the constraints D are equality constraints of the form (αi = Ti(β̄)), we can substitute the equalities of D̄ to get the following equivalent requirement: {Ti(β̄)}i t ≤ {α'i}i t ⊩ ∃β̄'. (∀i, α'i = Ti(β̄')) ∧ τ̄ ≤ τ̄' Notation: {foo}i is the vector of values indexed by the variable i; for example {α'i}i is also noted ᾱ'. Note that we will use 'i' and 'j' to iterate over the parameters ᾱ of ᾱ t, and 'v' to iterate over its existential variables β̄. More precisely, for each i, Ti(β̄) only depends on a subset of all the existential variables β̄ (which we will note fv(Ti(β̄))), so the family {βv}v of existential variables used by Ti will range over only such variables. It will be clear from the context whether a quantification (∀v, ...) ranges over all β̄ or only those of some Ti. This requirement is also equivalent, using the variance assumption, to: (∀i, Ti(β̄) ≤i α'i) ⊩ ∃β̄'. (∀i, α'i = Ti(β̄')) ∧ τ̄ ≤ τ̄' [req] In [req], (∀i, α'i = Ti(β̄)) is a property on the type instances of (ᾱ' t), and (τ̄ ≤ τ̄') is a property on the types of the GADT parameters. We can split this requirement into an equivalent conjunction of two requirements, that separates these two aspects: (∀i Ti(β̄) ≤i α'i) ⊩ ∃β̄'. (∀i, α'i = Ti(β̄')) [req-inst] ∧ (∀i Ti(β̄) ≤i Ti(β̄')) ⊩ τ̄ ≤ τ̄' [req-data] Proof of ([req] <=> [req-inst]∧[req-data]): [req-inst]∧[req-data] => [req] assume (∀i, Ti(β̄) ≤i α'i) by [req-inst] we have β̄' such that ∀i, α'i = Ti(β̄') we can substitute those equalities in our assumption, and get (∀i, Ti(β̄) ≤i Ti(β̄')) which gives (τ̄ ≤ τ̄') by [req-data] [req] => [req-inst]: trivial [req] => [req-data]: to avoid conflict we α-rename β̄' into β̄": assume (∀i, Ti(β̄) ≤i Ti(β̄")), and prove τ̄ ≤ τ̄'[β̄"/β̄'] we can instantiate [req] in the case α'i := Ti(β̄"): ∃β̄'. (∀i, Ti(β̄") = Ti(β̄')) ∧ τ̄ ≤ τ̄' but Ti(β̄") = Ti(β̄') implies ∀v, β"v = β'v (by case analysis on Ti's variance)¹ so we have β̄" = β̄', so from τ̄ ≤ τ̄' our result τ̄ ≤ τ̄'[β̄"/β̄'] ¹: We should, and will in future versions of this document, describe precisely what "Ti's variance" means – we have only defined variance for type definitions, not type expressions. For now let's assume that for each i, there is a family {(≤iv)}v of relations among {(=),(≤),(≥)} such that: Ti(β̄) ≤ Ti(β̄') <=> (∀v, βv ≤iv β'v) Whatever (≤iv) is in {(=),(≤),(≥)}, we have that Ti(β̄) = Ti(β̄") implies (∀v, β' = β") by symmetry. Finally, we will write (≤i,iv) for the composition of the relations (≤i) and (≤iv): - if (≤i) is (=), (≤i,iv) is also (=) - if (≤i) is (≤), (≤i,iv) is (≤iv) - if (≤i) is (≥), (≤i,iv) is the dual of (≤iv) ((≤)↦(≥), (≥)↦(≤), (=)↦(=)) This composed relation verifies the following property: Ti(β̄) ≤i Ti(β̄') <=> (∀v, βv ≤i,iv β'v) Our splitting of [req] into [req-inst] and [req-data] can be understood as follows: a GADT is "just a datatype with some constraints", and we have separated, in the requirement, the [req-inst] part that speaks about the constraints and the [req-data] part that simply speaks about the datatype parameters. ** Step 2: a simple criterion on the data part Given our variance assumption on the {Ti}i, (∀i Ti(β̄) ≤i Ti(β̄')) ⊩ τ̄ ≤ τ̄' [req-data] can be rewritten into: (∀i,∀v, βv ≤i,iv β'v) ⊩ τ̄ ≤ τ̄' This really looks like (the left-to-right implication of) a variance requirement on the τ̄: it almost says that the {βv}v have variance (≤i,iv) in τ̄, which means that [req-data] would be the criterion we already have for plain old OCaml datatypes. However, this would only be true if the variables constrained by each (≤iv) were distinct; there may be some βv that is defined for several i, breaking the analogy with a variance assignment property. We will show that this approximation can be made precise by taking such interferences of existential variables across several Ti into account. This is actually a result from [req-inst]. ** Step 3: a fine-grained study of type instance requirement We will here decompose (equivalently) our first requirement [req-inst] into two different requirements; one that speaks about the properties of the {Ti}i, and one that speaks about the interference case when a single βv appears in more than one Ti (or rather, {fv(Ti(β))}i). Indeed, we claim that [req-inst] (∀i Ti(β̄) ≤i α'i) ⊩ ∃β̄'. (∀i, α'i = Ti(β̄')) is equivalent to the conjunction of ∀i, (Ti(β̄) ≤i α'i ⊩ ∃β̄'. α'i = Ti(β̄')) [req-elim] and ∀v, (#{j | βv ∈ fv(Tj(β̄))} > 1) => (∀i (≤i,iv) = (=)) [req-interf] [req-elim] is a simple, local elimination property: we require that Ti is "closed by subtyping" in the sense that any α'i in (≤i)-relation with a type of the form Ti(β̄) is itself of the form Ti(β̄') for some β̄'. [req-interf] is our criterion about what happens in case of interference, with a single existential variable βv being shared among several distinct Tj(β̄). If a βv appears in some Ti(β̄), (≤i,iv) is the relation corresponding to β's variance in Ti composed with the variance of αi. Requirement [req-interf] says that if βv appears in several Tj, then all Tj must either use βv invariantly or be invariant themselves. That is, we request that the variable βv be invariant when considered as part of the wider type expression {Ti(β̄)}i t. [req-elim]∧[req-interf] => [req-inst] assume (∀i, Ti(β̄) ≤i α'i) by [req-elim], for each i, we have a family {β'iv}v such that α'i = Ti({β'iv}v) which means in particular that ∀i, Ti(β̄) ≤i Ti({β'iv}v) that is, under our variance assumption on Ti, (∀i∀v, βv ≤i,iv β'iv) we must construct a family {β'v}v such that (∀i, α'i = Ti({β'v}v)) idea: take (β'v := β'iv) for some i such that β'iv is defined but what if we have two such i,j (β'iv, β'jv both defined)? then by [req-interf] we have (≤i,iv) = (≤j,jv) = (=) so our assumption (∀i∀v, βv ≤i,iv β'iv) gives us βv = β'iv and βv = β'jv so by transitivity β'iv = β'jv => for each i, for all defined v, β'iv = β'v => (∀i, α'i = Ti({β'v}v)) [req-inst] => [req-elim] ∧ [req-interf] [req-inst] => [req-elim] assume that for some fixed i, we have some α'i such that Ti(β̄) ≤i α'i we must show that ∃β̄'. α'i = Ti(β̄') pose α'j := Tj(β̄) for all j≠i we have (∀i, Ti(β̄) ≤ α'i) so by [req-inst] we have ∃β̄'. (∀i, α'i = Ti(β̄')) [req-inst] => [req-interf] assume for some v that (#{j | βv ∈ fv(Tj(β̄))} > 1) do we have ∀i ((≤i,iv) = (=))? claim: yes, for all β"v such that (βv ≤i,iv β"v), we have (β"v = βv) given such a β"v, we pose some specific ᾱ' as such: α'j := Tj(β̄) for all j≠i α'i := Ti(β̄[β"v/βv]) (we only replace the variable v) those definitions respect the property (∀i, Ti(β̄) ≤i α'i) (trivially for j≠i, and because of the hypothesis (βv ≤iv β"v) for i) so by [req-inst] we have ∃β̄'. (∀i, α'i = Ti(β̄')) now: - for all j≠i we have α'j := Tj(β̄) and α'j = Tj(β̄'), so for all v ∈ ⋃{j≠i} fv(Tj(β̄)) we have β'v = βv - we also have α'i = Ti(β̄') and α'i := Ti(β̄[β"v/βv]), so in particular β"v = βv as we have (#{j | βv ∈ fv(Tj(β̄))} > 1), there is some j≠i such that βv ∈ fv(Tj(β̄)) so we have both β'v = βv and β"v = βv, giving our result β"v = βv