[12:42:24] I wrote a little bit of code to explore the question of how many samples you need in a bucket before you get a reasonable estimate of p75
[12:43:41] I used 1 million timings from api.log as input
[12:43:48] the results are surprising
[12:45:16] using a bucket size of 7 overestimates the p75 by a factor of 4
[12:49:24] that's much worse than I assumed, I only expected the estimate to be biased towards the mean
[12:50:23] the mean is 51 and the p75 is 36, but if you use a bucket size of 7 you get a p75 estimate of 180!
[12:52:36] to get p75 within a long piece of chalk (10% relative error) you need 300-400 samples in the bucket
[12:54:14] that's for 10% RMS error; 10% mean error is achieved with a bucket size of 52
[13:00:02] for p90, a long piece of chalk is achieved at a bucket size of 400 for the mean error and 5000 for the RMS error
[13:01:15] the code is at https://phabricator.wikimedia.org/P35378
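For anyone who wants to try this kind of experiment without the paste, here is a minimal Python sketch. It is not the code from the paste and may differ from it in method. It assumes contiguous, non-overlapping buckets, a nearest-rank percentile, "mean error" meaning the average signed error relative to the full-data percentile, and "RMS error" meaning the root-mean-square error relative to the same value. The names `percentile` and `bucket_errors` are made up for illustration, and the input is synthetic log-normal data, not the api.log timings, so the numbers it prints will not match the ones above.

```python
import math
import random


def percentile(values, p):
    """Nearest-rank percentile of values, for an integer p in 1..100."""
    ordered = sorted(values)
    # Nearest-rank: the smallest rank covering at least p% of the values.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def bucket_errors(samples, bucket_size, p):
    """Return (mean_error, rms_error) of per-bucket percentile estimates.

    Both are relative to the percentile of the full sample set.
    Assumption: buckets are contiguous, non-overlapping slices, and a
    trailing partial bucket is dropped.
    """
    true_value = percentile(samples, p)
    estimates = [
        percentile(samples[i:i + bucket_size], p)
        for i in range(0, len(samples) - bucket_size + 1, bucket_size)
    ]
    n = len(estimates)
    mean_err = sum(e - true_value for e in estimates) / n / true_value
    rms_err = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n) / true_value
    return mean_err, rms_err


if __name__ == "__main__":
    # Synthetic heavy-tailed stand-in for request timings (an assumption).
    rng = random.Random(0)
    samples = [rng.lognormvariate(3.0, 1.5) for _ in range(100_000)]
    for size in (7, 52, 400):
        mean_err, rms_err = bucket_errors(samples, size, 75)
        print(f"bucket size {size}: mean error {mean_err:+.1%}, RMS error {rms_err:.1%}")
```

The RMS error includes the scatter of individual bucket estimates as well as the bias, so it can never be smaller in magnitude than the mean error. That fits with the RMS targets above needing much bigger buckets than the mean-error targets.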