Performance improvements in the universe checker of Coq
- December 3, 2012
One of my parsing-related
Coq files takes more than two hours to compile, which is pretty annoying
when the file is less than one thousand lines long and does no complex
computations. After some profiling, it appeared that Coq was spending
most of its time in the Univ
module. This module is part of the
Coq kernel, its role is to check that the user is using universes in a
consistent way.
You can get an introduction of what universes are in
this book
chapter. In Coq, universes are implicit: it is the role of the
Coq kernel to determine whether it is possible to give a number to
each occurrence of Type
, in a manner that is proved not to
introduce inconsistencies.
This computation is done in the Univ
module: the rest of the Coq
kernel generates some constraints, and this module is supposed to warn
if they are not satisfiable. Roughly, theses constraints are of the
form: "universe A has an index strictly smaller than universe B", or
"universe A has an index smaller or equal to universe B". In order to
check that they are satisfiable, it is enough to see all these
constraints as a graph where edges are constraints and vertexes are
universes, and check that there is no cycle containing a strict
constraint.
This graph can be simplified: if there is a cycle of large constraints, that does mean that all universes in this cycle are in fact necessarily equivalent, and we can replace this cycle with only one universe. We have to keep track of equivalences, using a specialized data structure. We recognize here the classical union-find problem, which have very efficient solutions.
However, in order to do this task, in Coq, a naive approach is used: it builds equivalence trees, but nothing is done to prevent these trees to be very unbalanced. And, in fact, that was the problem in my file: for some reason, my tactics generated hundred of universes constraints. All these universes were inferred to be equivalent. In order to keep track of this information, a long chain of universes were created: for each of them, Coq knows that it is equivalent to the next one, but it does not know how to go directly to the head of the chain, that is effectively present in the graph. So, it was actually spending most of its time following this long chain...
In the literature, there are two main heuristics to solve this (classical) problem: "union by rank" and "path compression". Using either one gives logarithmic amortized complexity, and they give a practically constant amortized complexity if implemented together. By experience, I know that each of these heuristics gives very good performances when used alone, because the worst case is very rare. Moreover, it is not easy to implement path compression in a persistent way, because it involves side effects. So, I decided to implement "union by rank" inside the Univ module of Coq.
It was not really difficult: I just added a rank
field
in the record representing the nodes of the graph, that contains the
length of the longest path of equivalences leading to this node. So,
when we are merging nodes, we can chose to keep the node having the
longest chain as the head node. This keeps the graph balanced: if the
other nodes were part of smaller chains, the longest chain's length
doesn't increase.
I got a spectacular improvement of performances when Coq is executed on my code: from more than two hours, I went to about 6.5 minutes. I get a reasonable 1% improvement when compiling the Coq standard library: it is actually expected, because the standard library does not contain a lot of complex polymorphic code.
You can see the current patch (which may still evolve a bit) in this pull request. My first implementation was against 8.3, this patch is on the trunk version of Coq, and we're planning to port it to 8.4 soon. Do not hesitate to test it if you have universe-heavy developments; any feedback is welcome.