
Re: [Bug-gnubg] TEST RESULT: Variance Reduction


From: Joern Thyssen
Subject: Re: [Bug-gnubg] TEST RESULT: Variance Reduction
Date: Sat, 5 Jul 2003 19:19:21 +0000
User-agent: Mutt/1.4.1i

On Mon, Jun 30, 2003 at 11:15:20AM +0100, Ian Shaw wrote
> I've re-tested rollouts of the initial position now that the variance
> reduction has been corrected. I'd like to get some comment on before
> posting the test to GammOnLine.
> 
> My observations are:
> 
> 1) Wins reduced The winner of the opening roll wins less often. This
> is to be expected since doubles would always be a good opening roll.
> The old VR was factoring in some rolls being doubles and
> overcorrecting for what it saw as a poor set of rolls during the
> rollout.
> 
> 2) Crawford 2-away Match Equity Leader's MEQ at Crawford 2-away is
> even lower than the original rollout suggests, 68.2% rather than
> 68.3%. At 2-ply, my rollouts suggested that Trailer benefited even
> more, so we can expect Leaders' MEQ to drop below my current best
> value of 67.7%, which is already lower than any published MET. (Oh how
> I want to get multi-processing before I repeat THAT rollout!)

Interesting. The Danish player Lars Trabolt has suggested values of
66-67%, although he didn't do any rollouts -- he was just guessing based
on the fact that the trailer can play extremely aggressively and happily
ignore his own gammon losses.

http://www.dbgf.dk/Debat/showflat.php?Cat=&Board=diverse&Number=23514&page=1&view=collapsed&sb=5&o=0&fpart=
[sorry, in Danish]


> 
> 3) Standard Error The reported overall std error is a LOT lower (see
> result below). I would have expected some reduction because, without
> doubles, there is simply less variation in the opening rolls. There is
> a huge swing on 66 for starters - it wins about 2/3 of games
> (according to 2-ply evaluation).  Even so, I was surprised to find
> such a large reduction in the std error. The individual std errors for
> w, wg etc have not changed much, so it's hard to see how the overall
> value would change so much. Could you have a look at this, please.

I can't explain this...

Try doing a similar rollout but without RAIP in order to compare the std.
errors.  I would expect the std. errors of a normal rollout and a RAIP
rollout to be of similar size.

> 4) Rotating opening rolls I rolled out 7776 times (6 * 36 * 36). Then
> I wondered how the stratification is affected when it's the opening
> roll. Will the rollout have done any stratification at all? Should I
> have done a multiple of 21 * 36 = 756? Will the stratification work if
> I do?

I assume you mean 30 * 36?

Stratification will always work, independently of the number of trials.
For normal rollouts the stratification is optimal when the number of
trials is a multiple of 36, 1296, etc. For example, in a 37-trial
rollout one of the rolls will be grossly over-represented compared
to any of the other rolls. The same is true of a 3700-trial rollout, but
there the effect is obviously much smaller.
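
To illustrate the over-representation point, here is a minimal Python
sketch. It assumes the simplest possible scheme, in which trial i gets
first roll number i mod 36; this is only an illustration and not
necessarily how gnubg assigns rolls internally. The names
first_roll_counts and imbalance are made up for this example.

```python
from collections import Counter

# The 36 ordered dice rolls (1-1, 1-2, ..., 6-6).
ROLLS = [(a, b) for a in range(1, 7) for b in range(1, 7)]


def first_roll_counts(trials):
    """Count how often each first roll is used when trial i is
    assigned roll number i mod 36 (an assumed, simplified scheme)."""
    counts = Counter({roll: 0 for roll in ROLLS})
    for i in range(trials):
        counts[ROLLS[i % len(ROLLS)]] += 1
    return counts


def imbalance(trials):
    """Ratio of the most-used to the least-used first roll.
    Requires trials >= 36 so that every roll is used at least once."""
    counts = first_roll_counts(trials)
    return max(counts.values()) / min(counts.values())


for n in (36, 37, 3700):
    print(n, round(imbalance(n), 4))
```

With 36 trials every roll appears once; with 37 one roll appears twice
as often as the rest; with 3700 the ratio is 103/102, so the same
imbalance exists but is far smaller in relative terms.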

I'm not sure exactly what happens for RAIP. Jim? Nis?

> 
> 5) Match Equity Calculation Finally, I'm a bit suspicious of the
> reported match equity values. I calculated them by hand from the win
> percentages, and got different results. 
> 
> Match equities reported by gnubg and calculated by hand:
>
>                                 gnubg      by hand
> Wrong variance reduction
>   Trailer:                      33.784%    33.635%
>   Leader:                       69.939%    70.265%
> Corrected variance reduction
>   Trailer:                      69.081%    69.080%
>   Leader:                       69.081%    69.080%

This is most likely due to the wrong variance reduction, now fixed.

I've seen examples where the variance reduction results in negative
gammon rates or other artifacts. gnubg will correct this on the
fly (i.e., by applying a sanity check to the result of the rollout so
far). However, the calculated MWC is not corrected, hence you may see
examples where a manual calculation of the MWC differs from the MWC
calculated by gnubg.

For example:

The first trial returns: 51% -1% 0% - 49% 0% 0%. The MWC calculated from
this is 25%. gnubg applies the sanity check to the gwc and arrives at
51% 0% 0% - 49% 0% 0%. A "manual" MWC calculation from these corrected
figures gives 25.5%.
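
The arithmetic can be reproduced with a small Python sketch. The match
winning chances per outcome used here (50% after a single win, 100%
after a gammon win, 0% after a loss) are an assumption chosen because
they reproduce the 25% and 25.5% figures; they are not stated above.
The function names mwc and sanity_check are hypothetical and this is
not gnubg's actual code.

```python
def mwc(win, win_gammon, mwc_single=0.5, mwc_gammon=1.0, mwc_loss=0.0):
    """Match winning chance from cumulative probabilities.

    `win` includes gammon wins, so single wins are win - win_gammon.
    The three payoff values are assumptions made for this example.
    """
    single = win - win_gammon
    return (single * mwc_single
            + win_gammon * mwc_gammon
            + (1.0 - win) * mwc_loss)


def sanity_check(p):
    """Clamp a probability estimate so that it cannot be negative."""
    return max(0.0, p)


# Raw trial result: 51% wins, -1% gammons.
raw = mwc(0.51, -0.01)
# Same result after the sanity check has clamped the gammon rate to 0%.
checked = mwc(0.51, sanity_check(-0.01))

print(round(raw, 4), round(checked, 4))
```

Under these assumed payoffs the raw figures give about 0.25 and the
clamped figures give about 0.255, which is the half-percent gap between
the MWC gnubg computes and the one worked out by hand from the
displayed probabilities.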

A negative number of gammons is possible due to imperfections in the
luck analysis.

Jørn



