How's your rally driving going? ;-)
On Sun, Dec 11, 2011 at 4:45 AM, Mark Higgins <address@hidden> wrote:
I notice in gnubg and other neural networks the probability of gammon gets its own output node, alongside the probability of (any kind of) win.
Doesn't this sometimes mean that the estimated probability of gammon could be larger than the probability of win, since both sigmoid outputs run from 0 to 1?
There is a sanity check function, called after the neural net evaluation, that checks that gammons don't exceed wins and that backgammons don't exceed gammons.
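To illustrate the kind of clamping such a check performs, here is a minimal sketch in Python. It is not the gnubg source: the function name and the five-output layout (win, win gammon, win backgammon, lose gammon, lose backgammon) are assumptions made for the example.

```python
def clamp_outputs(win, win_g, win_bg, lose_g, lose_bg):
    """Force win >= win_g >= win_bg and (1 - win) >= lose_g >= lose_bg.

    Hypothetical helper: each value is clipped into the range allowed
    by the output one level above it.
    """
    win = min(max(win, 0.0), 1.0)
    win_g = min(max(win_g, 0.0), win)        # gammon wins cannot exceed wins
    win_bg = min(max(win_bg, 0.0), win_g)    # backgammons cannot exceed gammons
    lose = 1.0 - win
    lose_g = min(max(lose_g, 0.0), lose)     # same constraints on the losing side
    lose_bg = min(max(lose_bg, 0.0), lose_g)
    return win, win_g, win_bg, lose_g, lose_bg
```

Clamping repairs the inconsistency after the fact; it does not make the net itself aware of the constraint.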
I'm playing around with making the gammon node represent the probability of a gammon win conditioned on a win; then the unconditional probability of a gammon win = prob of win * conditional prob of gammon win. In that setup, both outputs are free to roam (0,1) without causing inconsistencies.
That's a possibility, but I do not believe it gains anything. (This is of course just a guess, since I've not tried it. And you are of course free to try.) I guess you would also need a similar scheme for backgammons?
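A minimal sketch of the conditional scheme, extended one level down to backgammons, could look like this in Python. The function and parameter names are hypothetical, and the extra backgammon-given-gammon output is an assumption about how the idea would be carried over.

```python
def unconditional_from_conditional(p_win, p_g_given_win, p_bg_given_g):
    """Turn chained conditional outputs into unconditional probabilities.

    p_win          : P(win)
    p_g_given_win  : P(gammon win | win)
    p_bg_given_g   : P(backgammon win | gammon win)
    Each input may be anywhere in [0, 1]; the products are consistent
    by construction.
    """
    p_win_g = p_win * p_g_given_win      # P(gammon win) <= P(win)
    p_win_bg = p_win_g * p_bg_given_g    # P(backgammon win) <= P(gammon win)
    return p_win, p_win_g, p_win_bg
```

For example, inputs of 0.5, 0.5 and 0.5 give unconditional values 0.5, 0.25 and 0.125, so no sanity check is needed for the ordering.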
Is there something I'm missing here about why this is suboptimal? Is there some other way people tend to ensure that prob of gammon win <= prob of any kind of win?
I guess you have to divide by the win prob in the training, and that is still just an estimate. Hmmm... I'm still thinking; maybe it can gain something, since the two outputs kind of depend on each other.
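The division step might be sketched as below, in Python. Everything here is an assumption for illustration: the function name, the `eps` guard, and especially the fallback of 0.0 when the win estimate is close to zero, where the conditional target is essentially undefined.

```python
def conditional_gammon_target(target_win_gammon, est_win, eps=1e-6):
    """Convert an unconditional gammon-win target into a conditional one.

    target_win_gammon : training target for P(gammon win)
    est_win           : current estimate of P(win), itself uncertain
    eps               : hypothetical guard against dividing by ~0
    """
    if est_win < eps:
        # Assumption: with almost no win chance there is nothing to
        # condition on, so fall back to 0.0.
        return 0.0
    ratio = target_win_gammon / est_win
    # An inaccurate win estimate can push the ratio above 1, so clip it.
    return min(1.0, max(0.0, ratio))
```

The clipping shows where the estimate leaks in: any error in `est_win` is passed straight into the target for the conditional node.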
However... what I would rather try is to have six outputs with a softmax activation function. Several neural net experts recommend softmax in their books and papers, and other parameter update rules (other than backpropagation) have been developed based on softmax outputs.
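One way to read the six-output idea is as one output per mutually exclusive game result (single, gammon and backgammon, for each side). That reading is an assumption, as are all names in this Python sketch. Softmax makes the six values sum to 1, and the usual cumulative figures follow by addition.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw output values."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def to_cumulative(probs):
    """Map six exclusive outcome probabilities to cumulative outputs.

    Assumed order: win single, win gammon, win backgammon,
                   lose single, lose gammon, lose backgammon.
    """
    ws, wg, wb, ls, lg, lb = probs
    return {
        "win": ws + wg + wb,
        "win_gammon": wg + wb,       # gammon or better
        "win_backgammon": wb,
        "lose_gammon": lg + lb,      # gammon or worse
        "lose_backgammon": lb,
    }
```

With this layout, gammons cannot exceed wins and backgammons cannot exceed gammons, because each cumulative value is a sum of non-negative terms contained in the one above it.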