
Re: [Bug-gnubg] Problem with the crashed benchmark database


From: Joseph Heled
Subject: Re: [Bug-gnubg] Problem with the crashed benchmark database
Date: Tue, 19 Jun 2012 13:39:38 +1200



On 19 June 2012 09:36, Philippe Michel <address@hidden> wrote:
On Tue, 5 Jun 2012, Philippe Michel wrote:

The benchmark database for the crashed positions seems seriously corrupted.

I have rerolled it. How should I proceed to get it uploaded to ftp.demon.nl?

I should remind everyone that the benchmark measures how well the net does against a "hypothetical super computer player" which is fast enough to use rollouts instead of evaluations. Also, the rollouts for crashed positions are nowhere near as good as those for non-crashed contact positions.
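
To illustrate what such a benchmark measures, here is a minimal sketch, not the actual perr.py code. It assumes each benchmark position stores a rollout equity for every candidate move, and that the net's error on a position is the rollout equity given up by playing the move the net ranks first. The names `move_error` and `average_error` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def move_error(rollout_equity: Sequence[float],
               net_equity: Sequence[float]) -> float:
    """Equity lost when the net's top-ranked move is judged by the rollouts.

    Both sequences are indexed by candidate move (assumed layout).
    """
    if not rollout_equity or len(rollout_equity) != len(net_equity):
        raise ValueError("need one rollout and one net equity per move")
    # Index of the move the net would play (its highest evaluation).
    net_choice = max(range(len(net_equity)), key=lambda i: net_equity[i])
    # The rollout acts as the "super computer player": its best move is the
    # reference, and the error is how far the net's choice falls short.
    return max(rollout_equity) - rollout_equity[net_choice]


def average_error(
    positions: Iterable[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Mean per-position error over (rollout_equity, net_equity) pairs."""
    errors = [move_error(r, n) for r, n in positions]
    if not errors:
        raise ValueError("no positions given")
    return sum(errors) / len(errors)
```

Under this reading, a position counts as an error only when the net's first choice differs in rollout equity from the rollout's best move, so a benchmark built from poor rollouts penalises or rewards the net for the wrong moves.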

A better approach (better net, more nets, evaluation+rollout hybrid, something else) is still sorely needed.

-Joseph



The change for checker plays is quite large.

Original database:

% perr.py -W $DATA/nets/nngnubg.weights  $DATA/benchmarks/crashed.bm
98 Non interesting, 99902 considered for moves.
0p errors 26022 of 99902 avg 0.00772197376664
n-out ( 1026 ) 1.03%
26651 errors of 213398
cube errors interesting 26651 of 213394
 me 0.00196758116477 eq 0.00016451881191
cube errors non interesting 0 of 4
 me 0.0 eq 0.0

New database:

% perr.py -W $DATA/nets/nngnubg.weights $DATA/benchmarks/crashed.bmn
78 Non interesting, 99922 considered for moves.
0p errors 24169 of 99922 avg 0.00599333832388
n-out ( 578 ) 0.58%
26662 errors of 213398
cube errors interesting 26662 of 213391
 me 0.00201874481948 eq 0.000165458107733
cube errors non interesting 0 of 7
 me 0.0 eq 0.0

I think there was nothing wrong with the cube results, and the small discrepancy is only due to the different weights files used in the rollouts.
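
For a quick side-by-side reading of the two runs, the figures printed above can be compared with a few lines of Python. This is only an arithmetic sketch: the dictionary keys are made-up labels, and the numbers are copied from the two perr.py outputs.

```python
def relative_change(old: float, new: float) -> float:
    """Signed change from old to new, as a fraction of old."""
    return (new - old) / old


# Figures copied from the perr.py output for each database.
original = {"move_errors": 26022, "moves": 99902,
            "avg_error": 0.00772197376664, "cube_me": 0.00196758116477}
rerolled = {"move_errors": 24169, "moves": 99922,
            "avg_error": 0.00599333832388, "cube_me": 0.00201874481948}

for label, run in (("original", original), ("rerolled", rerolled)):
    share = run["move_errors"] / run["moves"]
    print(f"{label}: {share:.2%} of considered moves in error, "
          f"avg {run['avg_error']:.5f}")

avg_change = relative_change(original["avg_error"], rerolled["avg_error"])
cube_change = relative_change(original["cube_me"], rerolled["cube_me"])
print(f"avg move error change: {avg_change:+.1%}")
print(f"cube 'me' change:      {cube_change:+.1%}")
```

The average checker-play error drops by roughly 22% with the rerolled database, while the cube "me" figure moves by under 3%, which matches the description of a large change for checker plays and a small discrepancy for cube decisions.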

