
## Re: [Bug-gnubg] Neural network symmetry question

From: Joseph Heled
Subject: Re: [Bug-gnubg] Neural network symmetry question
Date: Sat, 10 Dec 2011 19:10:04 +1300

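The weight constraints proposed in the quoted message below can be checked numerically. What follows is a minimal sketch, not gnubg code: it assumes a toy input layout in which the first half of the input vector belongs to one player and the second half to the other (so the counterpart of input j is j* = j + n_half), sigmoid activations, and made-up helper names (`make_symmetric_net`, `evaluate`, `flip`). It treats the flip as a pure swap of the two players' inputs and does not address the question of who is on the move.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_symmetric_net(n_half, n_hidden, rng):
    """Build random weights that obey the constraints from the quoted message.

    Assumed layout: inputs are [one player's half | other player's half],
    so the counterpart of input j is j* = j + n_half.
    """
    # Only half of the hidden<-input weights are free: w(i, j*) = -w(i, j).
    free = [[rng.uniform(-1, 1) for _ in range(n_half)] for _ in range(n_hidden)]
    w_hidden = [row + [-w for w in row] for row in free]
    # Output<-hidden weights are unconstrained; the output bias is tied to
    # them: bias = -1/2 * sum(output<-hidden weights).
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = -0.5 * sum(w_out)
    return w_hidden, w_out, b_out

def evaluate(net, x):
    """Single-output network; hidden nodes have no bias weight."""
    w_hidden, w_out, b_out = net
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sigmoid(b_out + sum(v * h for v, h in zip(w_out, hidden)))

def flip(x):
    """Swap the two players' halves of the input vector."""
    n_half = len(x) // 2
    return x[n_half:] + x[:n_half]

rng = random.Random(0)
net = make_symmetric_net(n_half=8, n_hidden=5, rng=rng)
x = [rng.random() for _ in range(16)]

p = evaluate(net, x)
p_flipped = evaluate(net, flip(x))
print(p + p_flipped)  # 1 up to floating-point rounding
```

The reason this works: swapping the halves negates every hidden node's pre-activation, so each hidden value h becomes 1 - h, and the tied output bias then negates the output pre-activation as well, which turns the output p into 1 - p.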
```
Well, I am not sure how you flip the position, since it matters who is
on the move.

-Joseph

On 10 December 2011 16:17, Mark Higgins <address@hidden> wrote:
> I've been playing around a bit with neural networks for backgammon and found
> something interesting, and want to see whether this is already part of gnubg.
>
> Assume a Tesauro-style network with the usual inputs, and some number of
> hidden nodes. And for simplicity, just one output representing the
> probability of win.
>
> If I take a given board and translate the position into the inputs and then
> evaluate the network, it gives me a probability of win. If I then flip the
> board's perspective (i.e. white vs black) and do the same, I get another
> probability of win. Those two probabilities should sum to 1, since one or the
> other player must win (or equivalently, the probability of white winning =
> probability of black losing = 1 - probability of black winning).
>
> But that constraint isn't satisfied with the usual TD setup.
>
> If however you make a few assumptions:
>
> * Hidden layer nodes don't include bias weight.
> * Hidden->input weights have a specific symmetry: weight of the i'th hidden
> node vs the j'th input node = w(i,j) = -w(i,j*), where j* is the index of the
> other player's corresponding position.
> * Output layer node doesn't include a bias weight.
>
> Then you can show that, as long as the output->hidden node weights sum to
> zero, the flip-the-perspective constraint is satisfied.
>
> This seems to reduce the number of weights by about half, since you need only
> half the middle weights. The network should be more accurate since a known
> symmetry is respected, and should converge quicker since there are fewer
> parameters to optimize.
>
> You can generalize to a bias weight on the output node; in that case, the
> constraint is on the bias weight that it = -1/2 sum( output->hidden node
> weights ).
>
> You can generalize as well to including a "gammon win" output node. In this
> case there are no constraints on the output->hidden node weights, but the
> probability of a gammon loss can be calculated from the gammon win node's
> weights, and you don't need to explicitly include an output node
> for the gammon loss.
>
> I googled around a fair bit but couldn't figure out whether this is well
> known or already included somewhere in gnubg. I took a look through eval.c
> but it's a bit daunting. :) Is there documentation somewhere that I've just
>
>
>
> _______________________________________________
> Bug-gnubg mailing list