> About outliers: I split every test into chunks of size 100, did IQR calculations, and calculated the average time over the valid chunks. You can find the result in the attachment; I also pushed it to GitLab.
> Also, since statistics and benchmarking are sciences in themselves, I am struggling a bit with how to approach the problem, and it feels outside the GSoC project scope. I wanted to share this, with your indulgence. Of course, I will proceed according to your instructions.
Hmm, this is lacking basic maths skills... Cutting the data into chunks and recombining them isn't going to deal with outliers. Read about "median" and "quantile" on Wikipedia or by Googling. Anyway, you want to calculate the median time. E.g. sort 100 numbers by size and take the average of the 50th and 51st; your error is the difference between the 91st and the 10th quantile (that is, the 91st and the 10th values when you sort them in order of size). If you can do that for the entire set, do it for the whole set; if not, use a running median, i.e. the median of every chunk of 100, then combine the running medians.
This way, the top 9 and bottom 9 values of each 100 contribute nothing to your outcome. That is how you deal with outliers.
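
To make the recipe concrete, here is a minimal Python sketch of it, not the project's actual code. The function names (median_and_spread, chunk_medians, combined_median) are made up for illustration, and combining the per-chunk medians by taking their median again is only one possible reading of "combine"; an average of the chunk medians would be another.

```python
from statistics import median


def median_and_spread(times):
    """Return (median, spread) for one batch of timings.

    For exactly 100 values the median is the average of the 50th and
    51st sorted values, and the spread is the 91st minus the 10th
    sorted value, so the top 9 and bottom 9 values are ignored.
    """
    ordered = sorted(times)
    n = len(ordered)
    skip = (n * 9) // 100          # 9 values skipped at each end when n == 100
    low = ordered[skip]            # 10th smallest when n == 100
    high = ordered[n - 1 - skip]   # 91st smallest when n == 100
    return median(ordered), high - low


def chunk_medians(times, size=100):
    """Median of every consecutive chunk of `size` timings."""
    return [median(times[i:i + size]) for i in range(0, len(times), size)]


def combined_median(times, size=100):
    """ASSUMPTION: combine the chunk medians by taking their median."""
    return median(chunk_medians(times, size))
```

For example, with the timings 1..99 plus one wild value of 1000000, median_and_spread gives a median of 50.5 and a spread of 81, exactly as if the wild value had been 100: the outlier has no effect on either number.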