Hash tables changed slightly between OCaml 3.12.1 and OCaml 4.00.0. While some care has been taken for forward compatibility, you might encounter strange behavior if you accidentally backport a hash table, that is, if you read with 3.12.1 a table that was serialized by 4.00.0.

Here are two snippets of code:

(* dump.ml *)
let _ =
  let h = Hashtbl.create 2 in
  Hashtbl.add h 23l "Hello";
  Hashtbl.add h 42l "World";
  (* Marshal the whole table, internal layout included, to the file "dump" *)
  let oc = open_out_bin "dump" in
  output_value oc h;
  close_out oc

(* read.ml *)
let _ =
  let ic = open_in_bin "dump" in
  (* input_value is not type-safe: the annotation is trusted blindly *)
  let h = (input_value ic : (int32, string) Hashtbl.t) in
  Printf.printf "iter\n%!";
  Hashtbl.iter (fun k v -> Printf.printf "%ld -> %s\n" k v) h;
  Printf.printf "find\n%!";
  let s1 = Hashtbl.find h 23l in
  let s2 = Hashtbl.find h 42l in
  Printf.printf "print\n%!";
  Printf.printf "%s %s\n" s1 s2;
  close_in ic

Now, here is the output I got from running read:

$ ./read
42 -> World
23 -> Hello
Fatal error: exception Not_found

What kind of sorcery is this!?

The problem is: I work on two machines, one of which is not mine, and quite hostile. Therefore, instead of setting up my whole build environment on it, I just hacked my path to point to my boss’s OCaml build directory. dump (of course, what I presented here is only a simplification of it) has to be run on this machine, because it has a PowerPC architecture, which is useful in this project. However, I run read on my own machine, because it’s much simpler. Both used to run OCaml 3.12.1, since the project can’t be built under 4.00.

However, one day, the boss updated OCaml on the PowerPC machine to 4.00. After that, I re-ran dump, oblivious to that change, and then was a bit puzzled by read’s output! («It used to work!»™)

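So, why does Hashtbl.iter behave well, while Hashtbl.find can’t find the keys? It’s just that iter walks through every bucket and never calls the hash function, while find hashes the key and looks only in the bucket for that particular hash. The table was filled under 4.00.0, so each key sits in the bucket chosen by the 4.00.0 hash function, but read, built with 3.12.1, hashes the same key with the old function and looks in the wrong bucket. Since the hash function changed, but not the underlying representation of hash tables, the table unmarshals without complaint, iter succeeds, and find fails.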
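
To make the mechanism concrete, here is a toy sketch of a bucket-based table. It is not the real Hashtbl implementation: the type toy, the function sample, and the two hash functions old_hash and new_hash are made up for illustration and merely stand in for "the hash used when filling the table" and "the hash used when reading it".

```ocaml
(* Toy model of a bucket-based hash table; NOT the real Hashtbl internals. *)
type ('k, 'v) toy = { buckets : ('k * 'v) list array }

(* Two hypothetical hash functions that disagree with each other. *)
let old_hash k = k mod 7
let new_hash k = (k * 31 + 5) mod 7

let create n = { buckets = Array.make n [] }

(* add stores the binding in the bucket selected by the given hash. *)
let add hash t k v =
  let i = hash k mod Array.length t.buckets in
  t.buckets.(i) <- (k, v) :: t.buckets.(i)

(* iter walks every bucket and never calls a hash function. *)
let iter f t = Array.iter (List.iter (fun (k, v) -> f k v)) t.buckets

(* find hashes the key and searches only the bucket at that index. *)
let find hash t k =
  let i = hash k mod Array.length t.buckets in
  List.assoc k t.buckets.(i)

(* A table filled with one hash function... *)
let sample () =
  let t = create 4 in
  add old_hash t 23 "Hello";
  add old_hash t 42 "World";
  t

let () =
  let t = sample () in
  (* iter sees both bindings, whatever hash was used to store them. *)
  iter (fun k v -> Printf.printf "%d -> %s\n" k v) t;
  (* ...but looked up with the other hash function, the key is missed. *)
  match find new_hash t 23 with
  | s -> Printf.printf "found %s\n" s
  | exception Not_found -> print_endline "Not_found"
```

The match ... with exception syntax needs a reasonably recent OCaml (4.02 or later); on older compilers a plain try ... with does the same job.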

Conclusion: Beware when passing serialized data structures across heterogeneous environments. Well, we already knew that, didn’t we? :-)
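
One possible way to sidestep this particular trap, sketched below, is to serialize the bindings rather than the table itself and to rebuild the table on the reading side, so that every key is rehashed by the reader’s own Hashtbl. The helper names to_bindings and of_bindings are mine, not part of the standard library, and the sketch assumes that the marshalled list itself is readable by both OCaml versions. For brevity it round-trips through a string with Marshal instead of going through a file.

```ocaml
(* Flatten the table into a list of bindings: no bucket layout is stored. *)
let to_bindings h = Hashtbl.fold (fun k v acc -> (k, v) :: acc) h []

(* Rebuild a table with the local Hashtbl, so keys are rehashed locally. *)
let of_bindings l =
  let h = Hashtbl.create (max 1 (List.length l)) in
  List.iter (fun (k, v) -> Hashtbl.add h k v) l;
  h

let () =
  let h = Hashtbl.create 2 in
  Hashtbl.add h 23l "Hello";
  Hashtbl.add h 42l "World";
  (* Writer side: marshal the bindings, not the table. *)
  let data = Marshal.to_string (to_bindings h) [] in
  (* Reader side: unmarshal the list and build a fresh table from it. *)
  let h' = of_bindings (Marshal.from_string data 0 : (int32 * string) list) in
  Printf.printf "%s %s\n" (Hashtbl.find h' 23l) (Hashtbl.find h' 42l)
```

The price is a full rebuild of the table at load time, which is usually negligible next to the I/O.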