A few days ago Jacques-Pascal Deplaix came to ask me this question: when using the include trick to override some definition in a module, how can we also override some definition in a submodule? The answer is short -- just scroll to the last code block -- and may be of interest to some readers.

Module overriding

The proposed use-case is the following: you want to write code that looks like it uses the module List from the standard library, but in fact the functions hd and tl are the variants that return option values, because you like that style better than catching exceptions.

There is a rather well-known trick to do that: put the following code at the beginning of your source file, or in a separate My_std module that you always open first.

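module List = struct
  include List
  (* Shadow the exception-raising versions with option-returning ones. *)
  let hd = function
    | [] -> None
    | x::_ -> Some x
  let tl = function
    | [] -> None
    | _::xs -> Some xs
end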
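
As a minimal sketch of what client code looks like afterwards (first_or_zero is a hypothetical helper, not something from the standard library): the overridden functions return options, while everything else from List is still available through the include.

```ocaml
module List = struct
  include List
  (* Option-returning replacements for the exception-raising originals. *)
  let hd = function
    | [] -> None
    | x :: _ -> Some x
  let tl = function
    | [] -> None
    | _ :: rest -> Some rest
end

(* Hypothetical helper: no exception handler needed, just a match. *)
let first_or_zero l =
  match List.hd l with
  | None -> 0
  | Some x -> x

(* Functions that were not overridden still come from the standard List. *)
let three = List.length [10; 20; 30]
```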

However, you may run into a problem if the definition you want to override is not directly in the module, but in one of its submodules. Let's consider the following code:

module Toto = struct
  let x = 1
  module Tata = struct
    let y = 2
  end
  module Titi = struct
    let z = 3
  end
end

Suppose I want to redefine Toto.Tata.y so that it has type string rather than int. The obvious approach doesn't work.

module Totoverride = struct
  include Toto
  module Tata = struct
    include Toto.Tata
    let y = "2"
  end
end
Error: Multiple definition of the module name Tata.
       Names must be unique in a given structure or signature.

The problem is that, while OCaml will let you shadow an existing let declaration, it does not let you declare other kinds of items (types, modules, module types...) twice in the same module. Here a Tata submodule is created in Totoverride a first time at the include Toto point, and then a second time explicitly.
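
To make the asymmetry concrete, here is a small sketch (the module name Shadowing is made up for the example): a second let with the same name is accepted and simply hides the first one.

```ocaml
module Shadowing = struct
  let y = 2
  (* Accepted: this y shadows the previous one, and can even use it. *)
  let y = string_of_int y
  (* By contrast, writing two explicit "module Tata = struct ... end"
     definitions in this structure would be rejected by the compiler. *)
end
```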

One manual solution is to use a signature ascription to remove the Tata submodule from Toto at include time.

module Totoverride = struct
  include (Toto : sig
             val x : int
             module Titi : sig val z : int end
           end)
  module Tata = struct
    include Toto.Tata
    let y = "2"
  end
end

The problem with this solution is that you have to repeat most of Toto's signature, which is a painful maintenance burden. Not only will changes in Toto's definition (such as the addition of new fields) require changes in this place, but the compiler will not tell you when such changes happen: Totoverride will just be silently incomplete with respect to Toto.
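
A sketch of how this goes wrong, assuming someone later adds a field w to Toto (w is a hypothetical name introduced only for this illustration): everything still compiles, but the new field never reaches Totoverride.

```ocaml
module Toto = struct
  let x = 1
  let w = 4 (* hypothetical field added to Toto later on *)
  module Tata = struct
    let y = 2
  end
  module Titi = struct
    let z = 3
  end
end

module Totoverride = struct
  (* The hand-written signature was not updated, so w is silently dropped. *)
  include (Toto : sig
             val x : int
             module Titi : sig val z : int end
           end)
  module Tata = struct
    include Toto.Tata
    let y = "2"
  end
end

(* Toto.w exists, but Totoverride.w would be rejected as an unbound value. *)
let sum = Toto.w + Totoverride.x
```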

module type of and destructive substitution

Destructive substitution (with ... := ...) is a newish feature of the OCaml language, as it was added in 3.12 -- released in summer 2010. The documentation in the manual is extremely clear and provides good examples, so if you don't know about it you should just go read it now.

module type of was also added in 3.12, and lets you speak about the interface of a module without defining it explicitly: module type of Foo is an interface that Foo satisfies.
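
A minimal sketch of module type of on its own, with made-up modules Counter and Even_counter: the signature is recovered from the first module instead of being written by hand, and then used to check a second one.

```ocaml
module Counter = struct
  let start = 0
  let next n = n + 1
end

(* COUNTER stands for: sig val start : int val next : int -> int end *)
module type COUNTER = module type of Counter

(* Any module providing (at least) these items can be checked against it. *)
module Even_counter : COUNTER = struct
  let start = 0
  let next n = n + 2
end
```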

There are subtleties in the semantics of both these constructions, so you should not abuse them unless you want to read papers about module systems -- you probably don't -- to understand why your code doesn't type-check. However, they can be combined in a perfectly reasonable way to remove some types or submodules of a module at inclusion time: module type of Foo with type t := Foo.t is the signature of Foo, minus the type t, and the same thing works for with module. This lets us solve our module overriding problem:

module Totoverride = struct
  include (Toto : module type of Toto with module Tata := Toto.Tata)
  module Tata = struct
    include Toto.Tata
    let y = "2"
  end
end
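
The same pattern with a type instead of a submodule can be sketched as follows. Point and Point3 are hypothetical modules invented for this example: the type t is removed at include time, so that a new t can be declared, while the included values keep referring to Point.t.

```ocaml
module Point = struct
  type t = int * int
  let origin : t = (0, 0)
  let norm1 ((x, y) : t) = abs x + abs y
end

module Point3 = struct
  (* Include everything from Point except the type t itself;
     origin and norm1 are now expressed in terms of Point.t. *)
  include (Point : module type of Point with type t := Point.t)
  (* No clash: the included signature no longer declares a type t. *)
  type t = int * int * int
  let lift ((x, y) : Point.t) : t = (x, y, 0)
end
```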

Completeness is always hard

This is a neat trick to have in your toolbox, but you should know that it may not satisfy all your module overriding drives. You can remove types and modules from a signature, but there are a lot of other OCaml signature items that don't have a corresponding destructive substitution: module types, classes, class types and exceptions.

The problem is that it's not always clear what the intended semantics of destructive substitution should be. For example, I'm not really sure what destructive substitution of exception declarations would mean -- exceptions are a kind of blind spot as they don't appear in types. But in any case, if you try to override everything, you'll run into this incompleteness and go add your lament to the feature request PR#5460, "Replace/rename/remove module types".

To be honest, the signature language grows piece by piece as needs are justified (and semantics are understood), so it's not particularly surprising that it is not complete. There was a notable attempt by Norman Ramsey in 2001 to think about what a more self-sufficient signature language should be, Towards a Calculus of Signatures (with, I just found out, a draft implementation by Jürgen Pfitzenmaier). The good news is that probably the most important construct (destructive substitution of types) has been integrated since, but otherwise things are going at their own pace.