So, we don't care about the exactness of the absolute evaluation; we
care about the relative evaluation between the moves (or the resulting
positions after each move). That is what makes it select good moves!
This strategy was originally adopted by Tesauro. I agree that it is fine for
chequerplay, where you only have to find the best play relative to the
other candidates. But for cube decisions it is important to know the
absolute equity. It is
known that gnubg is inaccurate in some areas, most notably holding-game
cube action, where gnubg overestimates the holding player’s chances. I
wonder if this is due to only training for relative move selection.
It might be worth devising a training regime that trains for absolute
equity. This ought to give good chequerplay, too, since if the nn can
accurately determine the absolute value of each position it will
inevitably rank candidates correctly, too.
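
To make the distinction concrete, here is a toy sketch in Python. All
numbers, names and the doubling threshold are made up for illustration;
none of this is gnubg code or gnubg output. It assumes an evaluator that
is off by a constant amount: the ranking of plays is unchanged, but a
decision that compares equity against a fixed threshold can flip.

```python
# Hypothetical true equities for three candidate plays (illustrative only).
true_equity = {"play A": 0.42, "play B": 0.37, "play C": 0.30}

BIAS = 0.15              # assumed constant overestimate by the evaluator
DOUBLE_THRESHOLD = 0.50  # made-up cutoff, not a real doubling window


def biased_eval(equity, bias=BIAS):
    """Evaluator that preserves the ordering but is wrong in absolute terms."""
    return equity + bias


def best_play(equities, evaluate=lambda e: e):
    """Return the play whose evaluated equity is highest."""
    return max(equities, key=lambda play: evaluate(equities[play]))


def should_double(equity, threshold=DOUBLE_THRESHOLD):
    """Toy cube rule: double once the equity reaches a fixed threshold."""
    return equity >= threshold


# Chequerplay: both evaluators pick the same play.
same_choice = best_play(true_equity) == best_play(true_equity, biased_eval)

# Cube action: the constant bias pushes the position over the threshold.
position = true_equity["play A"]
true_decision = should_double(position)
biased_decision = should_double(biased_eval(position))

print(same_choice, true_decision, biased_decision)
```

With these invented figures the chosen play is identical under both
evaluators, while the cube decision differs, which is the kind of error
that training purely for relative move selection would not penalise.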