Statistical memory profiling for OCaml

speaker: Jacques-Henri Jourdan
where: Chameau sur le plateau seminar, LRI
date: November 12th, 2019

(Notes taken by Gabriel Scherer)

Q: could the users decide to sample certain allocation points in particular?

Jacques-Henri: an easy way to see more of certain allocations is to increase the sampling rate, and discard the samples that you are not interested in.

Gabriel: you could also give a sort of "virtual size" to certain blocks that represent resources you care about.

Jacques-Henri: yes, but only for custom blocks; you cannot afford to check this for minor-heap allocations.

Q: what's the use of having several samples in a block?

Jacques-Henri: you should use the size of the block only for statistics, not to actually give weight to the allocation.

Q: if I use a sampling rate of 1., I'm monitoring everything the GC is doing?

Jacques-Henri: yes, but it's going to be costly.

Q: in the callback, why not have an Obj.t?

Jacques-Henri: you want it to be a weak pointer so that it does not interact with liveness properties, and allocating a weak pointer is costly. The `allocation` record can be allocated in the minor heap, while weak pointers go to the major heap. A former API used ephemerons: you had access to the Obj.t value, but it was noticeably slower.

Q: allocations on the major heap are always large?

Jacques-Henri: yes, small blocks get *promoted*, not allocated directly. Except in some very infrequent cases: weak pointers, ephemerons, etc.

Jacques-Henri: and we sample each major block to keep this part of the code simple, but we could switch to the same sampling strategy as the minor heap if the need were to arise.

Q: instead of looking for whsize in the frame table, could you increment before and decrement after the call?

A: yes, but the new code is more compact. For processors that support it (ARM), the new allocation code is just a conditional call.
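
The answers above about the sampling rate and about several samples landing in one block can be pictured with a small simulation. This is only a sketch under stated assumptions, not the runtime's code: it assumes each allocated word is sampled independently with probability `rate`, and the names `sample`, `sample_block` and `estimate_words` are hypothetical.

```ocaml
(* Hypothetical model of per-word sampling; not the runtime's actual code. *)

(* A sampled block, as a callback might see it. Field names are assumptions. *)
type sample = { size : int; n_samples : int }

(* Sample one block of [size] words. Each word is sampled independently
   with probability [rate]; a rate of 1.0 samples every word.
   Returns [None] when no word of the block was sampled. *)
let sample_block ~rate size =
  let n = ref 0 in
  for _i = 1 to size do
    if rate >= 1.0 || Random.float 1.0 < rate then incr n
  done;
  if !n > 0 then Some { size; n_samples = !n } else None

(* Estimate the total number of words allocated. In this model each
   sample stands for [1 / rate] words, so a block is weighted by its
   number of samples and not by its size. *)
let estimate_words ~rate samples =
  let total = List.fold_left (fun acc s -> acc + s.n_samples) 0 samples in
  float_of_int total /. rate

let () =
  let sizes = [ 3; 10; 250; 2 ] in
  let rate = 0.01 in
  let samples = List.filter_map (sample_block ~rate) sizes in
  Printf.printf "kept %d of %d blocks, estimated %.0f words\n"
    (List.length samples) (List.length sizes)
    (estimate_words ~rate samples)
```

In this model, a rate of 1. makes every block visible with `n_samples` equal to `size`, which matches the remark that everything is monitored, at a cost. At lower rates, large blocks are the ones likely to carry several samples.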

Questions (from Gabriel):

- support for allocation of custom blocks with virtual resource sizes?

  Jacques-Henri: still to be done

- what happens if you call Statmemprof.start, and some previously-sampled blocks are not promoted/deallocated yet? You could have a landmark-style API where you want to sample only a portion of your code, so you would call Gc.start several times when entering there.

  Jacques-Henri: a better way to do this would be to have one Gc.start call at the beginning of the application, and then have the Landmark-style library give enough information to filter callbacks outside the region you want to instrument. I don't want to have to store a pointer to the callbacks inside the sampled blocks.
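
The last suggestion (install the callbacks once, then filter by region) might look roughly like the following. This is a hedged sketch, not an existing library: `in_region`, `with_region`, `kept` and `on_alloc` are hypothetical names, the `allocation` record here is a stand-in with assumed fields, and the callback is called by hand instead of by the runtime.

```ocaml
(* Hypothetical sketch: none of these names come from an actual library. *)

(* Flag set by a landmark-style library on region entry and exit. *)
let in_region = ref false

(* Run [f] with the region flag set, restoring the previous value
   afterwards even if [f] raises. *)
let with_region f =
  let saved = !in_region in
  in_region := true;
  Fun.protect ~finally:(fun () -> in_region := saved) f

(* Stand-in for the information passed to an allocation callback;
   the fields are assumptions. *)
type allocation = { size : int; n_samples : int }

(* Samples kept by the profiler. *)
let kept : allocation list ref = ref []

(* The single callback, meant to be installed once at program start.
   Returning [None] means "do not track this block". *)
let on_alloc (a : allocation) : unit option =
  if !in_region then begin
    kept := a :: !kept;
    Some ()
  end else None

let () =
  (* Outside the region: the sample is discarded. *)
  ignore (on_alloc { size = 4; n_samples = 1 });
  (* Inside the region: the sample is kept. *)
  with_region (fun () -> ignore (on_alloc { size = 8; n_samples = 1 }));
  Printf.printf "kept %d sample(s)\n" (List.length !kept)
```

The point of this design is that the region is global state consulted by one fixed callback, so sampled blocks do not need to carry a pointer to the callbacks that were active when they were allocated.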