bug-gnubg

## Re: [gnubg] Help with a new MET

From: Timothy Y. Chow
Subject: Re: [gnubg] Help with a new MET
Date: Tue, 12 Nov 2019 12:04:09 -0500 (EST)
User-agent: Alpine 2.21 (LRH 202 2017-01-01)

On Tue, 12 Nov 2019, Joseph Heled wrote:

> Hi Timothy,
> Here is a stats question I encounter from time to time.
>
> Suppose I run N BG games and collect the average win rates and gammon rates. These are 4 estimates which are dependent, as they sum to 1. How do I determine the confidence intervals for each? This is a 4d vector and it seems like a nontrivial question, but I assume this crops up a lot and must have a standard answer. What is your take?
>
> Thanks, Joseph

Joseph,

I'm guessing that what you're really interested in is some measure of the variation or dispersion of your sample dataset. In that case, you can simply compute the sample standard deviation for each parameter of interest. The fact that the 4 numbers in each sample sum to 1 just means that your 4 estimated standard deviations aren't independent estimates, but for most practical purposes this is an irrelevant technicality.
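
As a rough illustration of the per-parameter computation, here is a minimal sketch. It assumes each game falls into exactly one of four outcome categories; the labels in `OUTCOMES`, the helper `summarize`, and the simulated probabilities are all made up for the example and do not come from gnubg. Each game is treated as a 0/1 indicator per outcome, and the sketch also reports the standard error (sample standard deviation divided by the square root of N), which is one common way to gauge how good an estimate an average is.

```python
import math
import random

# Hypothetical outcome labels; the thread only says there are 4 rates summing to 1.
OUTCOMES = ("win", "win_gammon", "loss", "loss_gammon")


def summarize(games):
    """Return {outcome: (mean, sample_sd, standard_error)} for a list of outcome labels."""
    n = len(games)
    stats = {}
    for outcome in OUTCOMES:
        # Each game contributes a 0/1 indicator for this outcome.
        indicators = [1.0 if g == outcome else 0.0 for g in games]
        mean = sum(indicators) / n
        # Sample variance with the n - 1 (Bessel) correction.
        var = sum((x - mean) ** 2 for x in indicators) / (n - 1)
        sd = math.sqrt(var)
        stats[outcome] = (mean, sd, sd / math.sqrt(n))
    return stats


# Simulated data standing in for real game results (made-up probabilities).
rng = random.Random(0)
games = rng.choices(OUTCOMES, weights=[0.40, 0.15, 0.35, 0.10], k=10_000)
stats = summarize(games)
for outcome, (mean, sd, se) in stats.items():
    print(f"{outcome:12s} mean={mean:.4f} sd={sd:.4f} se={se:.4f}")
```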

On the other hand, if you really want to compute a confidence interval for the purposes of hypothesis testing, then you need to be explicit about what your null hypothesis and alternative hypotheses are. If you're not sure what your null and alternative hypotheses are, then to me that confirms that what you're interested in is not hypothesis testing but some sense of how good an estimate your averages are.

It's important to realize that a 95% confidence interval does *not* mean that there is a 95% probability that the quantity you're trying to estimate lies in your interval. This is a common misconception about what confidence intervals are.

https://en.wikipedia.org/wiki/Confidence_interval#Misunderstandings

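
One way to see what the 95% does refer to is a small simulation of the frequentist reading: the true win rate stays fixed, the experiment is repeated many times, and roughly 95% of the resulting intervals contain the true value. The sketch below is only illustrative. It uses the normal-approximation (Wald) interval as a stand-in for whatever interval you would actually compute, and the true win rate of 0.55, the function names, and the sample sizes are all assumptions chosen for the example.

```python
import math
import random


def wald_interval(wins, n, z=1.96):
    """Normal-approximation (Wald) 95% interval for a proportion."""
    p_hat = wins / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def coverage(true_p, n_games, n_experiments, seed=0):
    """Fraction of repeated experiments whose interval contains true_p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        # One simulated experiment: n_games independent games.
        wins = sum(rng.random() < true_p for _ in range(n_games))
        lo, hi = wald_interval(wins, n_games)
        hits += lo <= true_p <= hi
    return hits / n_experiments


# The 95% describes the procedure over many repetitions, not any single interval.
rate = coverage(true_p=0.55, n_games=1000, n_experiments=2000)
print(f"intervals containing the true win rate: {rate:.3f}")
```
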
If you really want to make statements of the form "there is a 95% probability that the win rate is in such-and-such an interval" then you need to adopt a Bayesian rather than a frequentist framework. In particular you'll need to choose some prior probability distribution and compute the posterior probability distribution by applying Bayes's rule to your data.
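
For concreteness, here is a minimal sketch of that Bayesian route under one particular choice of prior. It assumes a uniform Dirichlet prior over four outcome categories, which is only one of many possible priors, and the outcome labels, the helper names, and the counts are made up for illustration. With a Dirichlet prior the posterior is again Dirichlet, with the observed counts added to the prior parameters, so the sketch samples from it and reads off an equal-tailed 95% credible interval for the total win rate.

```python
import random

OUTCOMES = ("win", "win_gammon", "loss", "loss_gammon")  # hypothetical labels


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from Dirichlet(alphas) via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def credible_interval(counts, prior=1.0, level=0.95, n_samples=20_000, seed=0):
    """Equal-tailed credible interval for the total win rate (win + win_gammon)."""
    rng = random.Random(seed)
    # Conjugate update: the posterior is Dirichlet(prior + counts).
    alphas = [prior + counts[o] for o in OUTCOMES]
    win_rates = sorted(
        sum(sample_dirichlet(alphas, rng)[:2]) for _ in range(n_samples)
    )
    tail = (1 - level) / 2
    lo = win_rates[int(tail * n_samples)]
    hi = win_rates[int((1 - tail) * n_samples) - 1]
    return lo, hi


# Made-up counts from 1000 games, used only to exercise the function.
counts = {"win": 400, "win_gammon": 150, "loss": 350, "loss_gammon": 100}
lo, hi = credible_interval(counts)
print(f"95% credible interval for the win rate: [{lo:.3f}, {hi:.3f}]")
```

Under this prior, the printed interval can be read as "there is a 95% posterior probability that the win rate lies in this range"; a different prior would give a different interval.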

Tim