May I ask a couple of questions regarding NN training?
From my limited understanding, I suppose there are two sources of error in an evaluation:
a) the NN may be intrinsically unable (say, because of the type, number, etc. of its inputs) to "score" well when compared with the true equity/%s of a position (or at least the ones you want to replicate), or
b) the benchmark against which the NN is being tested might not contain the "true" equity/%s because, for example, the rollouts were done at 0-ply.
Is there any way to know which of these two factors is limiting the improvement of gnubg's NN the most? I mean, to improve the evaluation, do we need to improve the benchmark, do more training, or change the NN altogether?
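To make the distinction concrete, here is a toy sketch of how I picture the two sources combining. Everything in it is an assumption for illustration: the numbers are synthetic (not gnubg data), the function name `simulate_errors` is made up, and both errors are modelled as independent Gaussian noise around a known "true" equity. A real 0-ply rollout may well be biased rather than merely noisy, in which case this simple picture would not hold.

```python
import random

def simulate_errors(n_positions=20000, model_sd=0.03, bench_sd=0.02, seed=0):
    """Toy model with synthetic numbers (assumptions, not gnubg data).

    model_sd: spread of the NN's own error around the true equity, source (a)
    bench_sd: spread of the benchmark's error around the true equity, source (b)
    Returns (MSE of NN vs benchmark, MSE of NN vs true equity).
    """
    rng = random.Random(seed)
    sq_vs_bench = 0.0
    sq_vs_true = 0.0
    for _ in range(n_positions):
        true_equity = rng.uniform(-1.0, 1.0)
        nn_equity = true_equity + rng.gauss(0.0, model_sd)      # source (a)
        bench_equity = true_equity + rng.gauss(0.0, bench_sd)   # source (b)
        sq_vs_bench += (nn_equity - bench_equity) ** 2
        sq_vs_true += (nn_equity - true_equity) ** 2
    return sq_vs_bench / n_positions, sq_vs_true / n_positions

mse_vs_bench, mse_vs_true = simulate_errors()
print("MSE of NN measured against the benchmark:", mse_vs_bench)
print("MSE of NN measured against the true equity:", mse_vs_true)
print("Sum of the two assumed variances:", 0.03 ** 2 + 0.02 ** 2)
```

Under these assumptions the error measured against the benchmark comes out close to the sum of the two variances, so the benchmark's own noise would put a floor under the score that no amount of training or redesign of the NN could get below. What I cannot tell is how large each term is in practice.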
Thanks