Independently of classes, Objective Caml features a powerful module system,
inspired by that of Standard ML.
The benefits of modules are numerous. They make large programs compilable by allowing them to be split into pieces that can be compiled
separately. They make large programs understandable by adding structure
to them. More precisely, modules encourage, and sometimes force, the
specification of the links (interfaces) between program components, hence
they also make large programs maintainable and reusable.
Additionally, by enforcing abstraction, modules usually make programs safer.
Compared with other languages already equipped with modules such as
Modula-2, Modula-3, or Ada, the originality of the ML module system is to
be a small typed functional language “on top” of the base language. The
ML module system can actually be parameterized by the base language,
which need not necessarily be ML. Thus, it could provide a module
language for other base languages.
Basic modules are structures, i.e. collections of phrases, written
struct p1 … pn end.
Phrases are those of the core language, plus definitions of
sub-modules module X = M and of module types module type T
= S. Our first example is an implementation of stacks.
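The stack implementation itself is not reproduced here. A minimal sketch is given below; the record representation with a mutable field named elements is an assumption, chosen to agree with the operations Stack.create, Stack.push and Stack.pop and with the field Stack.elements used later in this section.

```ocaml
(* Sketch of a stack module. The representation (a record with a
   mutable list field [elements]) is an assumption. *)
module Stack =
  struct
    type 'a t = { mutable elements : 'a list }
    let create () = { elements = [] }
    let push x s = s.elements <- x :: s.elements
    let pop s =
      match s.elements with
      | h :: t -> s.elements <- t; h
      | [] -> failwith "Empty stack"
  end

(* Components of the structure are accessed with the dot notation. *)
let s : int Stack.t = Stack.create ()
let () = Stack.push 1 s; Stack.push 2 s
let top = Stack.pop s   (* the last element pushed *)
```

The type annotation on s is only there to keep the example free of weakly polymorphic types; it is not required by the module system.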
Components of a module S are accessed with the dot notation, as in S.x.
Alternatively, the directive open S makes it possible to omit both the
prefix and the dot: struct open S … (f x : t)
… end. A module may also be a subcomponent of another module.
The “dot notation” and open extend to sub-modules and can be used
inside them.
Note that the directive open T.R in a module Q makes all
components of T.R visible to the rest of the module Q but it
does not add these components to the module Q.
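The following sketch illustrates sub-modules, the dot notation and open together. The module names T, R and Q follow the text; their components (x, double, y, z) are hypothetical and serve only as an illustration.

```ocaml
module T =
  struct
    module R =
      struct
        let x = 1
        let double n = 2 * n
      end
    (* The dot notation reaches into the sub-module R. *)
    let y = R.x + 1
  end

module Q =
  struct
    open T.R              (* x and double are visible from here on *)
    let z = double x
  end

let a = T.R.x             (* nested dot notation from the outside *)
let b = Q.z
(* Writing Q.x would be rejected: open T.R does not add x to Q. *)
```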
The system infers signatures of modules, as it infers types of values.
Types of basic modules, called signatures,
are sequences of (type) specifications, written
sig s1 … sn end.
The different forms of specifications are described
in figure 4.1.
Figure 4.1: Specifications and their forms

    Specification of    Form
    ----------------    ----------------------
    values              val x : σ
    abstract types      type t
    manifest types      type t = τ
    exceptions          exception E
    classes             class z : object … end
    sub-modules         module X : S
    module types        module type T [ = M ]
For instance, the system's answer to the Stack example was:
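The answer is not reproduced here. Assuming that Stack is implemented as a record with a mutable field elements (an assumption, consistent with the expression Stack.elements used below), the inferred signature would have the shape of the module type written out in this sketch; the names STACK and Checked are hypothetical.

```ocaml
(* Assumed implementation of the Stack example. *)
module Stack =
  struct
    type 'a t = { mutable elements : 'a list }
    let create () = { elements = [] }
    let push x s = s.elements <- x :: s.elements
    let pop s =
      match s.elements with
      | h :: t -> s.elements <- t; h
      | [] -> failwith "Empty stack"
  end

(* The signature inferred for the structure above, written out
   explicitly: one specification per component, with the type
   definition left fully visible. *)
module type STACK =
  sig
    type 'a t = { mutable elements : 'a list }
    val create : unit -> 'a t
    val push : 'a -> 'a t -> unit
    val pop : 'a t -> 'a
  end

(* Accepted by the typechecker: Stack matches this signature. *)
module Checked : STACK = Stack
```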
An explicit signature constraint can be used to restrict the signature
inferred by the system, much as type constraints restrict the types inferred
for expressions. Signature constraints are written (M : S) where M is
a module and S is a signature. There is also the syntactic sugar module X :
S = M standing for module X = (M : S).
Precisely, a signature constraint is two-fold: first, it checks that
the structure complies with the signature; that is, all components specified
in S must be defined in M, with types that are at least as general;
second, it makes components of M that are not components of S
inaccessible. For instance, consider the following declaration:
    module S : sig type t val y : t end =
      struct type t = int let x = 1 let y = x + 1 end
Then, both expressions S.x and S.y + 1 would produce errors.
The former, because x is not externally visible in S.
The latter because the component S.y has the abstract type S.t
which is not compatible with type int.
Signature constraints are often used to enforce type abstraction.
For instance, the module Stack defined above exposes its
representation. This allows stacks to be created directly without
calling Stack.create.
    Stack.pop { Stack.elements = [2; 3] };;
However, in another situation, the implementation of stacks might have
assumed invariants that would not be verified for arbitrary elements of the
representation type. To prevent such confusion, the implementation of
stacks can be made abstract, forcing the creation of stacks to use the
function Stack.create supplied especially for that purpose.
    module Astack :
      sig
        type 'a t
        val create : unit -> 'a t
        val push : 'a -> 'a t -> unit
        val pop : 'a t -> 'a
      end = Stack;;
Abstraction may also be used to produce two isomorphic but incompatible views
of the same structure. For instance, all currencies are represented by floats;
however, currencies are certainly not all equivalent and should not be
mixed. Currencies are isomorphic but disjoint structures, with
respective incompatible units Euro and Dollar. This is modeled
in OCaml by a signature constraint.
    module type CURRENCY =
      sig
        type t
        val unit : t
        val plus : t -> t -> t
        val prod : float -> t -> t
      end;;
Note that multiplication became an external operation on floats
in the signature CURRENCY.
Constraining the signature of Float to be CURRENCY
returns another, incompatible view of Float.
Moreover, repeating this operation returns two isomorphic structures but
with incompatible types t.
In Float the type t is concrete, so its values can be used as
values of type float. Conversely, it is abstract in modules Euro and
Dollar. Thus, Euro.t and Dollar.t are incompatible.
Note that there is no code duplication between Euro and Dollar.
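The definitions of Float, Euro and Dollar are not reproduced here. One plausible sketch is the following; the body of Float is an assumption, and in recent versions of OCaml this definition shadows the standard library module of the same name. The values ten_euros and twenty_euros are hypothetical and only illustrate use of the abstract view.

```ocaml
module type CURRENCY =
  sig
    type t
    val unit : t
    val plus : t -> t -> t
    val prod : float -> t -> t
  end

(* Assumed implementation: a currency represented by a float. *)
module Float =
  struct
    type t = float
    let unit = 1.0
    let plus = ( +. )
    let prod = ( *. )
  end

(* Two abstract views of the same structure; no code is duplicated. *)
module Euro = (Float : CURRENCY)
module Dollar = (Float : CURRENCY)

let ten_euros = Euro.prod 10.0 Euro.unit
let twenty_euros = Euro.plus ten_euros ten_euros
(* Euro.plus ten_euros Dollar.unit is rejected by the typechecker,
   since Euro.t and Dollar.t are distinct abstract types. *)
```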
A slight variation on this pattern can be used to provide multiple views of
the same module. For instance, a module may be given a restricted interface
in a given context so that certain operations
(typically, the creation of values) would not be permitted.
    module type PLUS =
      sig
        type t
        val plus : t -> t -> t
      end;;
    module Plus = (Euro : PLUS)

    module type PLUS_Euro =
      sig
        type t = Euro.t
        val plus : t -> t -> t
      end;;
    module Plus = (Euro : PLUS_Euro)

With the first definition, the type Plus.t is incompatible with Euro.t.
With the second, the type t is partially abstract and compatible with
Euro.t; the view Plus allows the manipulation of values that are
built with the view Euro.
The with notation allows the addition of type equalities in a
(previously defined) signature.
The expression PLUS with type t = Euro.t is an abbreviation
for the signature

    sig type t = Euro.t val plus : t -> t -> t end

The with notation is a convenience to create partially abstract
signatures and is often inlined:
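As an illustration, the constraint can be placed directly in the module definition, without naming the partially abstract signature. The sketch below is self-contained; the definition of Euro as floats sealed by CURRENCY is an assumption, and two_euros is a hypothetical value.

```ocaml
module type CURRENCY =
  sig
    type t
    val unit : t
    val plus : t -> t -> t
    val prod : float -> t -> t
  end

module type PLUS =
  sig
    type t
    val plus : t -> t -> t
  end

(* Assumed definition of Euro: floats viewed through CURRENCY. *)
module Euro : CURRENCY =
  struct
    type t = float
    let unit = 1.0
    let plus = ( +. )
    let prod = ( *. )
  end

(* The type equality is added inline with the with notation. *)
module Plus = (Euro : PLUS with type t = Euro.t)

(* Values built with the view Euro can be combined with the view Plus. *)
let two_euros = Plus.plus Euro.unit Euro.unit
```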
Modules are also used to facilitate separate compilation. This is obtained
by matching toplevel modules and their signatures to files as follows. A
compilation unit A is composed of two files:
The implementation file a.ml is a sequence of phrases,
like those within struct … end.
The interface file a.mli (optional) is a sequence of specifications,
like those within sig … end.
Another compilation unit B may access A as if it
were a structure, using either the dot notation
A.x or the directive open A.
Let us assume that the source files are: a.ml, a.mli, b.ml.
That is, the interface of B is left unconstrained.
The compilation steps are summarized below:

    Command                         Compiles               Creates
    ----------------------------    -------------------    -------
    ocamlc -c a.mli                 interface of A         a.cmi
    ocamlc -c a.ml                  implementation of A    a.cmo
    ocamlc -c b.ml                  implementation of B    b.cmo
    ocamlc -o myprog a.cmo b.cmo    linking                myprog

The program behaves as the following monolithic code:
    module A : sig (* content of a.mli *) end =
      struct (* content of a.ml *) end
    module B = struct (* content of b.ml *) end
The order of module definitions corresponds to the order of
.cmo object files on the linking command line.
A functor, written functor (S : T) -> M, is a function from modules to
modules. The body of the functor M is explicitly parameterized by the
module parameter S of signature T. The body may access the components
of S by using the dot notation.
As with functions, the body of a functor cannot be accessed directly:
the functor must first be explicitly applied to an implementation of
signature T.
    module T1 = T(S1)
    module T2 = T(S2)
The modules T1, T2 can then be used as regular structures.
Note that T1 and T2 share their code, entirely.
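A small sketch may help make functor definition and application concrete. All names below (MONOID, Sum, Int_monoid, Float_monoid, Int_sum, Float_sum) are hypothetical and are not taken from the text.

```ocaml
(* Signature of the functor parameter. *)
module type MONOID =
  sig
    type t
    val zero : t
    val plus : t -> t -> t
  end

(* A functor: its body uses the parameter M through the dot notation. *)
module Sum = functor (M : MONOID) ->
  struct
    let sum l = List.fold_left M.plus M.zero l
  end

(* Two implementations of the parameter signature. *)
module Int_monoid = struct type t = int let zero = 0 let plus = ( + ) end
module Float_monoid = struct type t = float let zero = 0.0 let plus = ( +. ) end

(* The functor is applied before its components can be used. *)
module Int_sum = Sum (Int_monoid)
module Float_sum = Sum (Float_monoid)

let six = Int_sum.sum [1; 2; 3]
let three_quarters = Float_sum.sum [0.5; 0.25]
```

Both applications reuse the single compiled body of Sum; only the argument differs.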
In this section, we use the running example of a bank to illustrate
most features of modules and to combine them.
Let us focus on bank accounts and, in particular, the way the bank and the
client may or may not create and use accounts. For security purposes, the
client and the bank should obviously have different access privileges to
accounts. This can be modeled by providing different views of accounts to
the client and to the bank:
    module type CLIENT = (* client's view *)
      sig
        type t
        type currency
        val deposit : t -> currency -> currency
        val retrieve : t -> currency -> currency
      end;;
    module type BANK = (* banker's view *)
      sig
        include CLIENT
        val create : unit -> t
      end;;
We start with a rudimentary model of the bank: the account book is given to
the client. Of course, only the bank can create the account, and to prevent
the client from forging new accounts, it is given to the client, abstractly.
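The code of this first model is not reproduced here. The sketch below shows one way it could be written. The functor name Old_bank is taken from later in the text; the body, the meaning of the results (deposit returns the new balance, retrieve returns the amount withdrawn) and the names Post, Client and my_account are assumptions. Amounts are compared with the polymorphic comparison, which presumes a float-like representation of currencies.

```ocaml
module type CURRENCY = sig
  type t
  val unit : t
  val plus : t -> t -> t
  val prod : float -> t -> t
end

module type CLIENT = sig
  type t
  type currency
  val deposit : t -> currency -> currency
  val retrieve : t -> currency -> currency
end

module type BANK = sig
  include CLIENT
  val create : unit -> t
end

(* Rudimentary model: the account itself holds the balance. *)
module Old_bank (M : CURRENCY) : BANK with type currency = M.t =
  struct
    type currency = M.t
    type t = { mutable balance : currency }
    let zero = M.prod 0.0 M.unit
    let neg = M.prod (-1.0)
    let create () = { balance = zero }
    (* Returns the new balance; non-positive amounts are ignored. *)
    let deposit c x =
      if x > zero then c.balance <- M.plus c.balance x;
      c.balance
    (* Returns the amount withdrawn, or zero if the request is refused. *)
    let retrieve c x =
      if x > zero && c.balance >= x then begin
        c.balance <- M.plus c.balance (neg x);
        x
      end else zero
  end

(* Assumed definition of Euro: floats viewed through CURRENCY. *)
module Euro : CURRENCY = struct
  type t = float
  let unit = 1.0
  let plus = ( +. )
  let prod = ( *. )
end

module Post = Old_bank (Euro)
(* The client's view hides create but shares the types with Post. *)
module Client : CLIENT with type currency = Post.currency
                        and type t = Post.t = Post

let my_account = Post.create ()
let balance1 = Post.deposit my_account (Euro.prod 100.0 Euro.unit)
let balance2 = Client.deposit my_account (Euro.prod 100.0 Euro.unit)
(* Client.create () is rejected: CLIENT does not specify create. *)
```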
This model is fragile because all information lies in the account
itself. For instance, if the client loses his account, he loses his money
as well, since the bank does not keep any record. Moreover, security relies
on type abstraction to be unbreakable…
However, the example already illustrates some interesting benefits of
modularity: the clients and the banker have different views of the bank
account.
As a result an account can be created by the bank and used for deposit by
both the bank and the client, but the client cannot create new accounts.
Moreover, several accounts can be created in different currencies, with no
possibility to mix one with another, such mistakes being detected by
typechecking.
Furthermore, the implementation of the bank can be changed while preserving
its interface. We use this capability to build a more robust, and more
realistic, implementation of the bank where the account book is maintained
in the bank database while the client is only given an account number.
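A possible shape for this second implementation is sketched below. The functor name Bank is taken from the text; the use of a hash table as the database, the meaning of the results and the names Demo_bank and acct are assumptions made for the illustration.

```ocaml
module type CURRENCY = sig
  type t
  val unit : t
  val plus : t -> t -> t
  val prod : float -> t -> t
end

module type CLIENT = sig
  type t
  type currency
  val deposit : t -> currency -> currency
  val retrieve : t -> currency -> currency
end

module type BANK = sig
  include CLIENT
  val create : unit -> t
end

module Bank (M : CURRENCY) : BANK with type currency = M.t =
  struct
    type currency = M.t
    type t = int                      (* an account number *)
    let zero = M.prod 0.0 M.unit
    let neg = M.prod (-1.0)
    (* The bank database: account numbers mapped to balances. *)
    let all_accounts : (int, currency ref) Hashtbl.t = Hashtbl.create 10
    let last = ref 0
    let create () =
      incr last;
      Hashtbl.add all_accounts !last (ref zero);
      !last
    (* Returns the new balance; non-positive amounts are ignored. *)
    let deposit n x =
      let balance = Hashtbl.find all_accounts n in
      if x > zero then balance := M.plus !balance x;
      !balance
    (* Returns the amount withdrawn, or zero if the request is refused. *)
    let retrieve n x =
      let balance = Hashtbl.find all_accounts n in
      if x > zero && !balance >= x then begin
        balance := M.plus !balance (neg x);
        x
      end else zero
  end

(* Assumed definition of Euro: floats viewed through CURRENCY. *)
module Euro : CURRENCY = struct
  type t = float
  let unit = 1.0
  let plus = ( +. )
  let prod = ( *. )
end

module Demo_bank = Bank (Euro)
let acct = Demo_bank.create ()
let balance = Demo_bank.deposit acct (Euro.prod 100.0 Euro.unit)
```

Since t is abstract in BANK, the client cannot tell that an account is an integer, and so cannot forge an account number.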
Using functor application we can create several banks. Because the body
of a functor is evaluated anew at each application, they will have
independent and private databases, as desired.
    module Central_Bank = Bank (Euro);;
    module Banque_de_France = Bank (Euro);;
Furthermore, since the two modules Old_bank and Bank have the
same interface, one can be used instead of the other, so as to create
banks running on different models.
All banks have the same interface, however they were built. In fact, it
happens that the user cannot even observe the difference
between the two implementations; however, this would not be true in general.
Indeed, such a property cannot be enforced by the typechecker.
Check the equality (X + Y) (X − Y) = (X² − Y²) by treating polynomials with
two variables as polynomials with one variable X whose coefficients
are taken in the ring of polynomials with one variable Y.
Write a program that reads a polynomial on the command line and evaluates it
at each of the points given on stdin (one integer per line); the
results should be printed on stdout.