## Re: [gnubg] Help with a new MET

**From**: Timothy Y. Chow

**Subject**: Re: [gnubg] Help with a new MET

**Date**: Tue, 12 Nov 2019 12:04:09 -0500 (EST)

**User-agent**: Alpine 2.21 (LRH 202 2017-01-01)

On Tue, 12 Nov 2019, Joseph Heled wrote:

Hi Timothy,
Here is a stats question I encounter from time to time.

Suppose I run N BG games and collect the average win rates and gammon
rates.
These are 4 estimates which are dependent, as they sum to 1. How do I
determine the confidence intervals for each? This is a 4d vector and it
seems like a non-trivial question, but I assume this crops up a lot and
must have a standard answer. What is your take?

Thanks, Joseph

Joseph,

I'm guessing that what you're really interested in is some measure of the
variation or dispersion of your sample dataset. In that case, you can
simply compute the sample standard deviation for each parameter of
interest. The fact that each sample consists of 4 numbers that satisfy
the equation that their sum equals 1 just means that your 4 estimated
standard deviations aren't independent estimates, but for most practical
purposes this is an irrelevant technicality.
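A minimal sketch of that computation, assuming each game is recorded as one of four mutually exclusive outcome labels. The label names, the `rate_summary` helper, and the tallies are all hypothetical, since the original question does not say what the four categories are. Alongside the sample standard deviation it also reports the standard error (the standard deviation divided by the square root of N), which is the quantity that describes how precisely each average is pinned down.

```python
import math
from collections import Counter

# Hypothetical outcome labels; the original post does not name the four categories.
OUTCOMES = ("win", "win_gammon", "loss", "loss_gammon")

def rate_summary(results):
    """Return {outcome: (rate, sample_sd, standard_error)} for a list of labels."""
    n = len(results)
    if n < 2:
        raise ValueError("need at least two games")
    counts = Counter(results)
    summary = {}
    for outcome in OUTCOMES:
        p = counts[outcome] / n
        # Each game contributes a 0/1 indicator for this outcome, so the
        # sample variance (n - 1 denominator) is n / (n - 1) * p * (1 - p).
        sd = math.sqrt(n / (n - 1) * p * (1 - p))
        se = sd / math.sqrt(n)  # standard error of the estimated rate
        summary[outcome] = (p, sd, se)
    return summary

# Made-up tallies, for illustration only.
games = ["win"] * 40 + ["win_gammon"] * 12 + ["loss"] * 38 + ["loss_gammon"] * 10
summary = rate_summary(games)
for outcome, (p, sd, se) in summary.items():
    print(f"{outcome:12s} rate={p:.3f} sd={sd:.3f} se={se:.3f}")
```

The four rates are computed from the same games and sum to 1, so the four error estimates are not independent of one another; the sketch simply treats each outcome on its own.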

On the other hand, if you really want to compute a confidence interval for
the purposes of hypothesis testing, then you need to be explicit about
what your null hypothesis and alternative hypotheses are. If you're not
sure what your null and alternative hypotheses are, then to me that
confirms that what you're interested in is not hypothesis testing but some
sense of how good an estimate your averages are.

It's important to realize that a 95% confidence interval does *not* mean
that there is a 95% probability that the quantity you're trying to
estimate lies in your interval. This is a common misconception about what
confidence intervals are.

https://en.wikipedia.org/wiki/Confidence_interval#Misunderstandings
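One way to see what the 95% does refer to is a small simulation: fix a true win rate, repeat the whole N-game experiment many times, and count how often the computed interval contains the true value. The sketch below is illustrative only. The normal-approximation (Wald) interval, the function names, and the chosen true rate of 0.52 are assumptions made for the example, not something taken from the thread.

```python
import math
import random

def wald_interval(wins, n, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p_hat = wins / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half

def coverage(true_p, n_games, n_experiments, seed=0):
    """Fraction of repeated experiments whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # Simulate one experiment of n_games independent games.
        wins = sum(rng.random() < true_p for _ in range(n_games))
        lo, hi = wald_interval(wins, n_games)
        hits += lo <= true_p <= hi
    return hits / n_experiments

# The true rate is fixed and known here only because this is a simulation.
frac = coverage(true_p=0.52, n_games=1000, n_experiments=2000)
print(f"fraction of intervals containing the true rate: {frac:.3f}")
```

The fraction should come out close to 0.95. The probability statement is about the procedure across repeated experiments; any single interval either contains the fixed true rate or it does not.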

If you really want to make statements of the form "there is a 95%
probability that the win rate is in such-and-such an interval" then you
need to adopt a Bayesian rather than a frequentist framework. In
particular you'll need to choose some prior probability distribution and
compute the posterior probability distribution by applying Bayes's rule to
your data.

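As a rough sketch of what that could look like, assume the four outcome rates get a symmetric Dirichlet prior (a uniform prior when `prior_alpha=1.0`). With multinomial counts the posterior is again Dirichlet, and the posterior for any single rate is a Beta distribution, so a credible interval can be read off from sampled quantiles. The choice of prior, the `credible_interval` helper, and the counts are assumptions for illustration; a different prior would give a different interval.

```python
import random

def credible_interval(count, total, prior_alpha=1.0, n_categories=4,
                      level=0.95, n_draws=20000, seed=0):
    """Equal-tailed credible interval for one category's rate.

    Assumes a symmetric Dirichlet(prior_alpha, ...) prior over the categories.
    The posterior marginal for one category is then
    Beta(prior_alpha + count, (n_categories - 1) * prior_alpha + total - count).
    """
    rng = random.Random(seed)
    a = prior_alpha + count
    b = (n_categories - 1) * prior_alpha + (total - count)
    # Approximate the posterior quantiles by Monte Carlo sampling.
    draws = sorted(rng.betavariate(a, b) for _ in range(n_draws))
    tail = (1 - level) / 2
    lo = draws[int(tail * n_draws)]
    hi = draws[int((1 - tail) * n_draws) - 1]
    return lo, hi

# Made-up data: 40 plain wins observed in 100 games.
lo, hi = credible_interval(count=40, total=100)
print(f"95% credible interval for the plain-win rate: ({lo:.3f}, {hi:.3f})")
```

Under the stated prior, this interval does support the statement "there is a 95% probability that the rate lies in the interval", which is the reading a frequentist confidence interval does not license.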
Tim