
## Re: [Bug-gnubg] error rate and match winning chances

From: Robert-Jan Veldhuizen
Subject: Re: [Bug-gnubg] error rate and match winning chances
Date: Sat, 9 Aug 2008 16:36:36 +0200

On Tue, Aug 5, 2008 at 6:24 PM, Mueller Achim wrote:
Hi Christian,

Am 05.08.2008 um 15:13 schrieb Christian Anthon:

Hi Achim,

it is not an entirely easy question to answer. But here is a short
explanation of the numbers:

Luck total EMG (MWC)                    -5.700 (-12.144%)       -2.564 (+20.291%)

means that B gained 20.291+12.144 = 32.435% MWC by luck. As he
presumably started the match with 50% MWC, that means that he gained
100-50-32.435=17.57% MWC through skill:

+17.57%

or that he wins 32.43:67.57% of the matches. Whether these numbers are
good or meaningful I don't know.

I think we're both right, assuming Christian means that B wins 67.57% with that last sentence.

The thing is, you need +50% MWC to win a match; if you get +32.435% from luck then the rest must be due to skill.
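As a rough sketch of that arithmetic (not anything gnubg computes itself), the split between luck and skill can be written out as below. The function name and arguments are made up for illustration, and it assumes that B won the match and started at 50% MWC, as Christian presumed.

```python
def split_luck_and_skill(luck_a_pct, luck_b_pct, start_mwc_pct=50.0):
    """Split the match winner's MWC gain into a luck part and a skill part.

    Assumption: player B won, so B's MWC went from start_mwc_pct to 100%.
    All values are in MWC percentage points.
    """
    # B's net luck is B's own luck minus A's luck (A's bad luck helps B).
    net_luck_b = luck_b_pct - luck_a_pct
    # MWC the winner had to gain over the course of the match.
    needed = 100.0 - start_mwc_pct
    # Whatever luck does not explain is attributed to skill.
    skill_b = needed - net_luck_b
    return net_luck_b, skill_b


# Luck totals quoted above: A -12.144%, B +20.291%
net_luck_b, skill_b = split_luck_and_skill(-12.144, 20.291)
print(f"B net luck: {net_luck_b:+.3f}% MWC")
print(f"B skill:    {skill_b:+.3f}% MWC")
```

Adding the skill part to the 50% starting point gives the luck-adjusted winning chance for B that Christian quoted.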

Unfortunately, GNUBG's evaluations of luck are less accurate than its normal play evaluations. So for individual matches, especially short ones (5-point matches and shorter), GNUBG's estimates of luck can be far off. This is even more true when you use the default 0-ply luck evaluation.

An interesting thing to try is the command "set analysis luckanalysis plies n" with n=1,2,3 and see what the result is. Sometimes you get huge jumps in the luck adjusted result.

N.B. luckanalysis at n plies takes about as much time as a normal play analysis at n+1 ply! So with n=3 it's really very slow.

No matter whether I use your interpretation (player B wins 32.43%, player A 67.57%) or
his (vice versa), this number can't be very meaningful. Your "skill" always depends on
how much luck you have. With an error rate difference less than 1 (snowie) the
difference in skill can't be that big. But perhaps I miss something here.

They are two different measurements, which explains why they aren't always in agreement. Usually the error-based analysis is more accurate, but it suffers from bot bias (i.e. in games where gnubg's evaluations go way off, the error analysis could be way off too).

For this one particular match, you can immediately see how the error-based analysis reflects on your winning chances:

Overall Statistics:
Error total EMG (MWC)                   -3.935 (-28.802%)       -4.498 (-25.202%)

This means that B was a 50+28.802-25.202 = 53.6% favourite to win this match.
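The same calculation as a small sketch, in case it helps. The function name is hypothetical, and it assumes an even 50% start with each player's total error given as a negative MWC percentage, as in the statistics line above.

```python
def error_based_mwc(own_error_pct, opp_error_pct, start_mwc_pct=50.0):
    """Estimate a player's winning chance from both players' total MWC errors.

    Errors are reported as negative MWC percentages, so the opponent's
    errors add to your chances and your own errors subtract from them.
    """
    return start_mwc_pct + abs(opp_error_pct) - abs(own_error_pct)


# Error totals quoted above: A -28.802% MWC, B -25.202% MWC
mwc_b = error_based_mwc(own_error_pct=-25.202, opp_error_pct=-28.802)
mwc_a = error_based_mwc(own_error_pct=-28.802, opp_error_pct=-25.202)
print(f"B: {mwc_b:.1f}%  A: {mwc_a:.1f}%")
```

The two results always sum to 100%, since one player's error total is the other player's gain.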

The difference from looking at the gnubg or snowie error rate is that the above method makes you pay more for mistakes when there was more MWC at stake; it's totally MWC based. The more often used error rate is EMG based, sort of trying to treat each game as equally important. Both methods have their merits.

In your particular example, they lead to contradictory results about who played best. A did better in EMG, B did better in MWC.
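To make the contradiction concrete, here is a minimal sketch that compares the two error totals from the statistics line. It only looks at totals; the per-move error rates that gnubg and snowie report also divide by a number of moves, which is not given here. The dictionary layout and helper name are just for illustration.

```python
# Error totals from the "Overall Statistics" line quoted above.
error_totals = {
    "A": {"emg": -3.935, "mwc_pct": -28.802},
    "B": {"emg": -4.498, "mwc_pct": -25.202},
}


def better_player(totals, measure):
    """Return the player with the smaller loss on the given measure.

    Losses are negative, so the value closest to zero (the maximum) wins.
    """
    return max(totals, key=lambda player: totals[player][measure])


print("Better in EMG:", better_player(error_totals, "emg"))
print("Better in MWC:", better_player(error_totals, "mwc_pct"))
```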

So in the end, gnubg gives you several different ways to compare yourself with your opponent, and they may give different results:

1. error based total MWC difference
2. error based EMG per move difference
   a. gnubg method
   b. snowie method