A few days ago, Hugo Heuzard asked about the details of the problem with OCamlbuild’s parallelism today on mantis: PR#5754. My mantis comment quickly grew into the blog post below.

I worked last September on better parallelization for OCamlbuild, but didn’t finish the thing (I couldn’t get something mergeable in time for ICFP) and got distracted with tons on other things to do since. I just uploaded the state of my branch:

https://github.com/gasche/ocaml/tree/ocamlbuild-indirect-style

Why ocamlbuild’s parallelization is often disappointing today

Below is an explanation of my understanding of the current problem with ocamlbuild’s parallelization. OCamlbuild central form of abstraction is the “rule action”, an OCaml function that knows how to build a target. In fact, the action

  • is passed a “building function” to invoke whenever it needs to build a dependency of the target
  • returns exactly one command (think of a command-line invocation) that will finally produce the target

For example, the rule to produce a .cmo from a .ml file whose .mli file is also present is defined here in ocaml_specific.ml:

ocamlbuild-indirect-style is exactly this: define a type of actions that is data instead of a blackbox function, use it for all built-in rule descriptions, and implement an “execution” function that evaluates such a rule to actually produce a result.

The nice thing with this change is that it can be done in a very incremental way: it’s easy to still expose the old interface where rule expects a function (as action), and on-the-side expose the new interface with a indirect_rule function whose action is the new action description format. You can then let users convert their rules if they want to, at their own leisure (it’s even possible to mix both styles in the same action description); just converting the builtin OCaml-specific rules to the indirect style should already give a nice parallelization boost.

If I remember correctly, when I stopped working on that (expecting to resume soon, of course), the missing piece was a good “evaluation” function that would take an indirection action and execute it correctly, exploiting the extra parallelism information. The naive way to do it doesn’t quite work, and you have to be careful about sharing work when two rules end up requiring the same subtarget at different times in the build – building an explicit task-dependency graph could be a good way to do it properly.

In the branch, the code I’ve shown previously looks as follows (ocamlc_c is strictly unchanged):

Jenga, you may have noticed the remarkable similarity between the monadic version I describe here and Jenga’s rule design (or maybe you know about Shake and also found a strong link). I noticed myself when I first saw a talk about Jenga… a few months after doing this work. The indirect-style rules are only a small (and extremely natural) improvement over OCamlbuild’s current state, which is in fact quite close to Jenga’s design to start with.

In particular, one thing I am curious about is whether it would be possible to reuse Jenga’s runtime (the part that gets the dependency graph and runs it as fast as possible or does clever thing with cache of previous runs and filesystem notifiers). But this experiment would be more invasive than the work strictly necessary to just improve parallelization on OCamlbuild (which has no reason to be particularly slow on its own), so it’s not a priority.

Remark 2

If you’re interested in contributing to OCamlbuild, there is certainly interesting work to be done in the area of parallelization. My gut feeling, however, is that other issues may be more pressing (and less time-consuming), such as satisfying reasonable feature requests, or helping improve the documentation.