[12:42:24] I wrote a little bit of code to explore the question of how many samples you need in a bucket before you get a reasonable estimate of p75
[12:43:41] I used 1 million timings from api.log as input
[12:43:48] the results are surprising
[12:45:16] using a bucket size of 7 overestimates the p75 by a factor of 4
[12:49:24] it's much worse than my assumption that it would be biased towards the mean
[12:50:23] the mean is 51 and the p75 is 36, but if you use a bucket size of 7 you get a p75 estimate of 180!
[12:52:36] to get p75 within a long piece of chalk (10% rel error) you need 300-400 samples in the bucket
[12:54:14] that's 10% RMS error, 10% mean error is achieved with a bucket size of 52
[13:00:02] for p90, a long piece of chalk is achieved at 400 for the mean error and 5000 for the RMS error
[13:01:15] the code is https://phabricator.wikimedia.org/P35378
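
For readers without access to the paste, here is a minimal sketch of this kind of experiment. It is not the code at the link above, and every name in it (`percentile`, `bucket_errors`, `synthetic_timings`) is hypothetical. It assumes that a "bucket" is a random draw of N timings, that the per-bucket estimate is a plain sample percentile with linear interpolation, and that a log-normal distribution is an acceptable stand-in for the api.log timings, which are not available here. Because of those assumptions, the numbers it prints should not be expected to match the ones quoted above.

```python
import math
import random


def percentile(values, q):
    """Return the q-th percentile (0-100) of values, using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of empty sequence")
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac


def bucket_errors(data, bucket_size, q=75, trials=2000, seed=0):
    """Return (mean, RMS) relative error of per-bucket percentile estimates.

    Each trial draws bucket_size timings without replacement, estimates the
    q-th percentile from that bucket alone, and compares it with the
    percentile of the full data set.
    """
    rng = random.Random(seed)
    truth = percentile(data, q)
    rel_errors = []
    for _ in range(trials):
        bucket = rng.sample(data, bucket_size)
        rel_errors.append((percentile(bucket, q) - truth) / truth)
    mean_err = sum(rel_errors) / trials
    rms_err = math.sqrt(sum(e * e for e in rel_errors) / trials)
    return mean_err, rms_err


def synthetic_timings(n, seed=0):
    """Assumption: heavy-tailed log-normal timings stand in for api.log."""
    rng = random.Random(seed)
    return [rng.lognormvariate(3.0, 1.0) for _ in range(n)]


if __name__ == "__main__":
    data = synthetic_timings(100_000)
    for size in (7, 52, 400, 5000):
        mean_err, rms_err = bucket_errors(data, size, q=75, trials=300)
        print(f"bucket={size:5d} mean_err={mean_err:+.3f} rms_err={rms_err:.3f}")
```

The mean error captures systematic bias in the estimate, while the RMS error also includes the trial-to-trial scatter, so the RMS figure is never smaller than the magnitude of the mean figure. That would be consistent with the RMS target needing a larger bucket than the mean target. Passing `q=90` repeats the experiment for p90.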