From: Nis
Subject: Re: [Bug-gnubg] Even, odd and half plies (WAS: New Contact Net Error Rates)
Date: Tue, 25 Feb 2003 20:47:05 +0100
ALERT: Long and boring article ahead. Includes math.
--On Thursday, February 20, 2003 12:02 -0500 "Moore, Dave"
<address@hidden> wrote:
Jim Segrave wrote:
This leads me to a point I've been wondering about. Everyone discusses
gnubg having errors on odd ply evaluations.
I remembered it this way also - but looking back at the gnubg archive, it
seems like the only thing discussed was that gnubg has huge DIFFERENCES
between odd and even ply on some types of positions. See this thread:
http://mail.gnu.org/archive/html/bug-gnubg/2002-07/msg00023.html
I think this has become mixed up, in my mind and in others', with the fact
that 1-ply does not play much better than 0-ply - which I also remember
reading somewhere, probably on this list.
Can someone explain why
this would be? I have always taken it on faith that this is true, but
I'd like to understand the mechanism.
Here is a quick explanation:
This is a little imprecise. What happens is:
The 1-ply evaluation is the result of:
1. Finding the best move, looking one ply ahead, for each of the 21
possible rolls
2. Averaging the resulting positions, as evaluated on 0-ply (with the other
player on roll)
Thus 1-ply evaluation is the average of 21 OTHER positions with the
opponent on roll. However, a lot of these positions will be similar to the
current one (see the example in the link above).
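
The two steps above can be sketched in a few lines of Python. This is only
an illustration, not gnubg's code: it assumes gnubg's convention that 0-ply
is the static net evaluation, and the names successors, eval_0ply and the
toy game at the bottom are all hypothetical. The 21 distinct rolls are
weighted by probability (1/36 for doubles, 2/36 otherwise), which is what
"averaging" is taken to mean here.

```python
from itertools import combinations_with_replacement


def distinct_rolls():
    """Yield (roll, probability) for the 21 distinct rolls of two dice."""
    for a, b in combinations_with_replacement(range(1, 7), 2):
        yield (a, b), (1 if a == b else 2) / 36


def one_ply_equity(position, successors, eval_0ply):
    """Equity for the player on roll, looking one roll ahead.

    successors(position, roll): hypothetical helper returning a non-empty
        list of positions reachable with that roll, opponent on roll.
    eval_0ply(position): hypothetical static evaluation, giving the equity
        of whoever is on roll in that position.
    """
    total = 0.0
    for roll, prob in distinct_rolls():
        # The best move leaves the opponent with the lowest equity.
        opponent_equity = min(eval_0ply(p) for p in successors(position, roll))
        # Flip the sign to get back to the original player's point of view.
        total += prob * -opponent_equity
    return total


# Toy stand-ins (not backgammon): a position is just a number.
def toy_successors(position, roll):
    return [position - sum(roll), position + sum(roll)]


def toy_eval(position):
    return position / 100


print(one_ply_equity(0, toy_successors, toy_eval))
```

Note that every call to eval_0ply is made with the opponent on roll, so the
result never uses the evaluation of the original position from the original
player's side.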
If gnubg is better at evaluating a certain position from one side than the
other, then 1-ply and 0-ply might differ a lot. The same is true for higher
even and odd plies.
So far, I have seen no good arguments for why gnubg should be better at one
than the other. Since I haven't been able to find the explanation, here is
a try at it:
Gnubg has been trained with the specific purpose of being able to make
better decisions on 0-ply. Thus, either through evolution (the training
process) or breeding (the selection of which changes to make to the net
input), the neural nets have been improving more on positions where there
are active decisions to make. Since crashed positions are mostly
characterized by one side having very few checker play decisions, one side
of these positions has been favored by this.
It is worth noting that this means it might not be a good idea to fix
the evaluations - since this would mean weaker play (at even ply) for the
side having to make the hard decisions. The exception would be doubling
decisions - which are likely to be equally important from both sides of the
board.
My naive interpretation:
For example, a position viewed from player 0's point of view may evaluate
to +0.600 equity, but the same position evaluated from the other side will
evaluate to -0.589 equity.
Almost. It is the result of the 21 resulting positions (with opponent on
roll) having equity -0.589 on average.
I also have two naive questions:
1. Wouldn't it be possible to run the odd-ply evaluations while always
evaluating the board from player 0's point of view?
You would still go
through the possible dice and moves for player 1, but the move would be
selected by evaluating the resulting position from player 0's point of
view, always. This would eliminate the jumps in absolute equity numbers.
In short: No, and even if we could, it wouldn't give more precise results,
only more consistent ones.
(very naive)
2. Could positions that evaluate to different equity from different sides
of the board be used as training data so that the Net would converge to an
agreed upon answer when looking at things from either side of the board?
Not stupid at all. It seems the general agreement last time was that this
was too dangerous, since there would be a risk that the net became better
at this kind of position at the expense of other, more common types of
positions.
I have, however, thought of an idea for overcoming differences between odd
and even plies:
The basic idea is to introduce the half-ply: the average between 0 and 1
ply, or in general between n and (n-1) ply. This would decrease the mean
squared error for this kind of position - since the "true" equity for
the position is most likely to be somewhere between these two evaluations.
At the same time, however, we lose something - since hopefully the
evaluation at n-ply should be better on average than the one at (n-1).
After all, that is why we evaluate at higher plies.
An obvious way of correcting for this fact would be to use a weighted
average of the n and (n-1) evaluations - with a weighting factor determined
by empirical research. Does anyone have a large database of rolled out
positions lying around - if possible including 0-ply and 1-ply evaluations
from the current net as well?
My idea would be to find the average of (rollout - 0-ply)/(1-ply - 0-ply)
and use this as the weight given to the n-ply evaluation.
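
As a rough sketch of this arithmetic (the function names and the sample
numbers are made up for illustration; only 0.600 and 0.589 are taken from
the example quoted earlier, used here as magnitudes):

```python
def blend_weight(samples):
    """Mean of (rollout - e0) / (e1 - e0) over (rollout, e0, e1) triples.

    e0 and e1 are the 0-ply and 1-ply equities of the same position.
    Triples with e0 == e1 are skipped, since the ratio is undefined there.
    """
    ratios = [(r - e0) / (e1 - e0) for r, e0, e1 in samples if e1 != e0]
    if not ratios:
        raise ValueError("no usable samples")
    return sum(ratios) / len(ratios)


def half_ply(e_lower, e_higher, w=0.5):
    """Weighted average giving weight w to the higher-ply evaluation."""
    return (1 - w) * e_lower + w * e_higher


# Plain half-ply: the unweighted average of the two evaluations.
print(half_ply(0.600, 0.589))

# Invented (rollout, 0-ply, 1-ply) triples, purely to show the calculation.
samples = [(0.5, 0.4, 0.6), (0.3, 0.2, 0.6)]
w = blend_weight(samples)
print(w, half_ply(0.600, 0.589, w))
```

One caveat with a plain mean of ratios: positions where the 0-ply and 1-ply
numbers are nearly equal produce very large ratios and can dominate the
average, so some filtering of such positions may be needed in practice.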
Another way to make a half-ply evaluation would be to evaluate some of the
rolls at 0-ply, some at 1-ply. This can be extended to (n+1/2)-ply by
recursion, just like it is done with the integer plies today.
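
One possible reading of this, as a sketch only: each roll's resulting
positions are scored by either a shallow or a deep evaluator, and the
results are combined with the usual roll probabilities. Which rolls get the
deeper look is not specified above, so the use_deep rule here is an
assumption, and all names are hypothetical rather than taken from eval.c.

```python
from itertools import combinations_with_replacement


def mixed_ply_equity(position, successors, eval_shallow, eval_deep, use_deep):
    """Equity for the player on roll, mixing two evaluation depths.

    successors(position, roll): non-empty list of resulting positions,
        opponent on roll (hypothetical helper).
    eval_shallow / eval_deep: equity of a position for whoever is on roll,
        e.g. a 0-ply and a 1-ply evaluator (hypothetical).
    use_deep(roll): True if this roll should get the deeper evaluation.
    """
    total = 0.0
    for a, b in combinations_with_replacement(range(1, 7), 2):
        prob = (1 if a == b else 2) / 36
        evaluate = eval_deep if use_deep((a, b)) else eval_shallow
        # Best move leaves the opponent the lowest equity; flip the sign.
        total += prob * -min(evaluate(p) for p in successors(position, (a, b)))
    return total


# Toy check: the deep evaluator is applied to doubles only (6 of 36 rolls).
result = mixed_ply_equity(
    position=0,
    successors=lambda pos, roll: [pos],
    eval_shallow=lambda pos: 0.0,
    eval_deep=lambda pos: 1.0,
    use_deep=lambda roll: roll[0] == roll[1],
)
print(result)
```

Passing a mixed evaluator of the same kind as eval_deep would give the
recursive extension to higher half-plies.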
When I got this idea, I thought to myself: "So THAT must be how reduced
evaluation works". Looking in the list archives and then into the source
code, it seems like this is not the case. Gnubg actually only evaluates
some of the rolls at each leaf in the ply-tree. This does, however, mean
that we have an existing framework in eval.c for doing the half-ply
evaluation.
This approach would have the same positive and negative effects as the
(weighted) average model described above - with the exception that we do
not "waste our time" by doing a full 1-ply eval, but get a very good
approximation of what it would have been.
I have more ideas than the ones given here - but let me hear the reactions
of the rest of you before I "go wild".
--
Nis Jorgensen
Greenpeace
Amsterdam