Re: [Bug-gnubg] Even, odd and half plies (WAS: New Contact Net Error Rat

From:	Nis
Subject:	Re: [Bug-gnubg] Even, odd and half plies (WAS: New Contact Net Error Rates)
Date:	Tue, 25 Feb 2003 20:47:05 +0100

ALERT: Long and boring article ahead. Includes math.

--On Thursday, February 20, 2003 12:02 -0500 "Moore, Dave"<address@hidden> wrote:

Jim Segrave wrote:

This leads me to a point I've been wondering about. Everyone discusses
gnubg having errors on odd ply evaluations.

I remembered it this way also - but looking back at the gnubg archive, it seems like the only thing discussed was that gnubg has huge DIFFERENCES between odd and even ply on some types of positions. See this thread:


http://mail.gnu.org/archive/html/bug-gnubg/2002-07/msg00023.html

I think this has been mixed up, in my brain and in others', with the fact that 1-ply does not play much better than 0-ply - which I also remember reading somewhere, probably on this list.

Can someone explain why
this would be? I have always taken it on faith that this is true, but
I'd like to understand the mechanism.


Here is a quick explanation:

This is a little imprecise. What happens is:

The 1-ply evaluation is the result of:

1. Finding the best move on one ply for each of the 21 possible rolls

2. Averaging the resulting positions, as evaluated on 0-ply (with the other player on roll)

Thus 1-ply evaluation is the average of 21 OTHER positions with the opponent on roll. However, a lot of these positions will be similar to the current one (see the example in the link above).
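As a rough sketch of the two steps above (this is not gnubg's actual code in eval.c), the recursion could look like the following. It assumes gnubg's convention that 0-ply is the raw net output, weights each roll by its probability (doubles 1/36, other rolls 2/36), and works on a single equity number instead of the full set of net outputs. The names `evaluate`, `eval_0ply` and `successors` are hypothetical, and finished games are not handled.

```python
from itertools import combinations_with_replacement

# The 21 distinct rolls: a double comes up 1 time in 36, any other roll 2 times.
ROLLS = [(a, b, 1 if a == b else 2)
         for a, b in combinations_with_replacement(range(1, 7), 2)]


def evaluate(pos, plies, eval_0ply, successors):
    """Equity of `pos` for the player on roll, looking `plies` rolls ahead.

    eval_0ply(pos)        -> static equity for the player on roll (hypothetical)
    successors(pos, roll) -> non-empty list of positions reachable with `roll`,
                             each seen from the opponent's side, who is then
                             on roll (hypothetical)
    """
    if plies == 0:
        return eval_0ply(pos)
    total = 0.0
    for a, b, weight in ROLLS:
        # Step 1: the best move is the one leaving the opponent the lowest equity.
        lowest_for_opponent = min(
            evaluate(nxt, plies - 1, eval_0ply, successors)
            for nxt in successors(pos, (a, b))
        )
        # Step 2: flip the sign back to our side and add to the weighted average.
        total += weight * -lowest_for_opponent
    return total / 36.0
```

The point to notice is that every leaf is evaluated with the opponent on roll when `plies` is odd and with the original player on roll when it is even, which is where a side-dependent bias in the net would show up.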

If gnubg is better at evaluating a certain position from one side than the other, then 1-ply and 0-ply might differ a lot. The same is true for higher even and odd plies.

So far, I have seen no good arguments for why gnubg should be better at one than the other. Since I haven't been able to find the explanation, here is a try at it:

Gnubg has been trained with the specific purpose of being able to make better decisions on 0-ply. Thus, either through evolution (the training process) or breeding (the selection of which changes to make to the net input), the neural nets have been improving more on positions where there are active decisions to make. Since crashed positions are mostly characterized by one side having very few checker play decisions, one side of these positions has been favored by this.

It is worth noting that this means it might not be a good idea to fix the evaluations - since this would mean weaker play (at even ply) for the side having to make the hard decisions. The exception would be doubling decisions - which are likely to be equally important from both sides of the board.

My naive interpretation:

For example, a position viewed from player 0's point of view may evaluate
to +0.600 equity, but the same position evaluated from the other side will
evaluate to -0.589 equity.

Almost. It is the result of the 21 resulting positions (with opponent on roll) having equity -0.589 on average.

I also have two naive questions:

1.  Wouldn't it be possible to run the odd-ply evaluations while always
evaluating the board from player 0's point of view?  You would still go
through the possible dice and moves for player 1, but the move would be
selected by evaluating the resulting position from player 0's point of
view, always.  This would eliminate the jumps in absolute equity numbers.

In short: No, and even if we could, it wouldn't give more precise results, only more consistent ones.

(very naive)
2.  Could positions that evaluate to different equity from different sides
of the board be used as training data so that the Net would converge to an
agreed upon answer when looking at things from either side of the board?

Not stupid at all. It seems the general agreement last time was that this was too dangerous, since there would be a risk that the net became better at this kind of position at the expense of other, more common types of positions.

I have, however, thought of an idea for overcoming differences between odd and even plies:

The basic idea is to introduce the half-ply: the average of the 0-ply and 1-ply evaluations, or in general of the (n-1)-ply and n-ply evaluations. This would decrease the average squared error for this kind of position - since the "true" equity for the position is most likely to be somewhere between these two evaluations.
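A small sketch of why the midpoint helps whenever the true equity really does lie between the two evaluations. The 0.600 and 0.589 figures reuse the example quoted earlier in this mail; the "true" value is made up purely for illustration, and `half_ply` is a hypothetical name.

```python
def half_ply(e_low, e_high):
    """Plain average of the (n-1)-ply and n-ply equities."""
    return 0.5 * (e_low + e_high)


e0, e1 = 0.600, 0.589  # 0-ply and 1-ply equities from the example
truth = 0.592          # made-up "true" equity lying between the two
mid = half_ply(e0, e1)

for name, e in (("0-ply", e0), ("1-ply", e1), ("half-ply", mid)):
    print(f"{name:9s} {e:+.4f}  squared error {(e - truth) ** 2:.2e}")
```

If the true value is anywhere between the two evaluations, the midpoint is off by at most half the gap between them, while at least one of the two endpoints is off by half the gap or more. The midpoint can still be worse than the better endpoint, which is the loss discussed next.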

At the same time, however, we lose something - since hopefully the evaluation at n-ply should be better on average than the one at (n-1)-ply. After all, that is why we evaluate at higher plies.

An obvious way of correcting for this fact would be to use a weighted average of the n-ply and (n-1)-ply evaluations - with a weighting factor determined by empirical research. Does anyone have a large database of rolled-out positions lying around - if possible including 0-ply and 1-ply evaluations from the current net as well?

My idea would be to find the average of (rollout - 0-ply)/(1-ply - 0-ply) and use this as the weight given to the n-ply evaluation.
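A minimal sketch of that estimate, assuming the database can be reduced to (0-ply, 1-ply, rollout) equity triples taken from the same side of the board. The function names and the sample numbers are hypothetical. Skipping positions where 0-ply and 1-ply nearly agree is an extra assumption of this sketch, added because the ratio blows up there.

```python
def estimate_weight(samples, min_gap=0.01):
    """Average of (rollout - e0) / (e1 - e0) over the samples.

    samples: iterable of (e0, e1, rollout) equities, all from the same side.
    min_gap: samples with |e1 - e0| below this are skipped (assumption of
             this sketch, not part of the proposal).
    """
    ratios = [(rollout - e0) / (e1 - e0)
              for e0, e1, rollout in samples
              if abs(e1 - e0) >= min_gap]
    if not ratios:
        raise ValueError("no sample has a large enough 0-ply/1-ply gap")
    return sum(ratios) / len(ratios)


def weighted_half_ply(e_low, e_high, w):
    """Blend with weight w on the higher-ply evaluation."""
    return (1.0 - w) * e_low + w * e_high


# Made-up numbers, only to show the call shape.
data = [(0.60, 0.50, 0.54), (0.10, 0.30, 0.22), (-0.200, -0.205, -0.210)]
w = estimate_weight(data)
print(w, weighted_half_ply(0.600, 0.589, w))
```

A weight near 1 would say the 1-ply number is usually close to the rollout, a weight near 0.5 would support the plain half-ply, and a weight outside the range 0 to 1 would mean the rollout tends to fall outside the two evaluations.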

Another way to make a half-ply evaluation would be to evaluate some of the rolls at 0-ply, some at 1-ply. This can be extended to (n+1/2)-ply by recursion, just like it is done with the integer plies today.

When I got this idea, I thought to myself: "So THAT must be how reduced evaluation works". Looking in the list archives and then into the source code, it seems like this is not the case. Gnubg actually only evaluates some of the rolls at each leaf in the ply-tree. This does however mean that we have an existing framework in eval.c for doing the half-ply evaluation.
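The mail does not say which rolls would get the deeper look or how the base case is defined, so the following is only one possible reading: at the top of the tree, the position reached after each roll is evaluated at one of two depths chosen per roll. All names here (`evaluate_mixed`, `depth_for_roll`, `eval_at`, `successors`, `doubles_deeper`) are hypothetical, and this is not how reduced evaluation in eval.c works.

```python
from itertools import combinations_with_replacement

# The 21 distinct rolls: a double comes up 1 time in 36, any other roll 2 times.
ROLLS = [(a, b, 1 if a == b else 2)
         for a, b in combinations_with_replacement(range(1, 7), 2)]


def evaluate_mixed(pos, depth_for_roll, eval_at, successors):
    """Average over the 21 rolls, with a per-roll choice of search depth.

    depth_for_roll(roll)  -> number of plies used below this roll (hypothetical)
    eval_at(pos, plies)   -> ordinary integer-ply equity for the player on
                             roll in `pos` (hypothetical)
    successors(pos, roll) -> non-empty list of positions reachable with `roll`,
                             seen from the opponent's side (hypothetical)
    """
    total = 0.0
    for a, b, weight in ROLLS:
        depth = depth_for_roll((a, b))
        lowest_for_opponent = min(eval_at(nxt, depth)
                                  for nxt in successors(pos, (a, b)))
        total += weight * -lowest_for_opponent
    return total / 36.0


def doubles_deeper(roll):
    """Example selector: look one ply deeper after doubles only."""
    return 1 if roll[0] == roll[1] else 0
```

Which rolls to send deeper is the open design question. A fixed pattern like `doubles_deeper` is cheap but arbitrary. A selector that balances the total probability weight between the two depths would come closer to a true half-way point.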

This approach would have the same positive and negative effects as the (weighted) average model described above - with the exception that we do not "waste our time" by doing a full 1-ply eval, but get a very good approximation of what it would have been.

I have more ideas than the ones given here - but let me hear the reactions of the rest of you before I "go wild".


--
Nis Jorgensen
Greenpeace
Amsterdam
