On 19 June 2012 09:36, Philippe Michel <address@hidden> wrote:
On Tue, 5 Jun 2012, Philippe Michel wrote:
The benchmark database for the crashed positions seems seriously corrupted.
I have rerolled it. How should I proceed to get it uploaded to ftp.demon.nl?
I should remind everyone that the benchmark measures how well the net does against a "hypothetical super computer player" that is fast enough to use rollouts instead of evaluations. Also, the rollouts for crashed positions are nowhere near as good as those for non-crashed contact positions.
A better approach (better net, more nets, evaluation+rollout hybrid, something else) is still sorely needed.
-Joseph
The change for checker plays is quite large.
Original database:
% perr.py -W $DATA/nets/nngnubg.weights $DATA/benchmarks/crashed.bm
98 Non interesting, 99902 considered for moves.
0p errors 26022 of 99902 avg 0.00772197376664
n-out ( 1026 ) 1.03%
26651 errors of 213398
cube errors interesting 26651 of 213394
me 0.00196758116477 eq 0.00016451881191
cube errors non interesting 0 of 4
me 0.0 eq 0.0
New database:
% perr.py -W $DATA/nets/nngnubg.weights $DATA/benchmarks/crashed.bmn
78 Non interesting, 99922 considered for moves.
0p errors 24169 of 99922 avg 0.00599333832388
n-out ( 578 ) 0.58%
26662 errors of 213398
cube errors interesting 26662 of 213391
me 0.00201874481948 eq 0.000165458107733
cube errors non interesting 0 of 7
me 0.0 eq 0.0
I think there was nothing wrong with the cube results, and the small discrepancy is only due to the different weights files used in the rollouts.
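
To put rough numbers on "quite large" for checker plays and "small" for the cube, the relative changes can be worked out from the figures quoted in the two runs. This is only a sketch: the dictionary keys and the rel_change helper are illustrative names, not part of perr.py, and the reading of "avg" as the average checker-play error and "me" as the mean cube error is an assumption based on the output labels.

```python
# Figures copied verbatim from the two perr.py runs quoted above.
# Key names are illustrative, not perr.py identifiers.
old = {
    "moves": 99902,              # positions considered for moves
    "avg_err": 0.00772197376664, # "avg" on the "0p errors" line
    "n_out": 1026,               # count on the "n-out" line
    "cube_me": 0.00196758116477, # "me" for interesting cube errors
}
new = {
    "moves": 99922,
    "avg_err": 0.00599333832388,
    "n_out": 578,
    "cube_me": 0.00201874481948,
}


def rel_change(before, after):
    """Return the change from before to after as a fraction of before."""
    return (after - before) / before


def n_out_share(run):
    """Return the n-out count as a percentage of positions considered."""
    return 100.0 * run["n_out"] / run["moves"]


print(f"checker-play avg error: {rel_change(old['avg_err'], new['avg_err']):+.1%}")
print(f"n-out share: {n_out_share(old):.2f}% -> {n_out_share(new):.2f}%")
print(f"cube mean error: {rel_change(old['cube_me'], new['cube_me']):+.1%}")
```

On these figures the average checker-play error drops by roughly 22% with the new database, while the cube mean error moves by under 3%.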