
Re: [Bug-gnubg] Re: The importance of METs


From: Joseph Heled
Subject: Re: [Bug-gnubg] Re: The importance of METs
Date: Fri, 05 Sep 2003 23:08:10 +1200
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030624



Douglas Zare wrote:

Ok. I'm not sure that I see enough accuracy to say 0.12% rather than 0.0-0.2%,
but I'll trust that someone has gone through that carefully. However, the
Woolsey-Heinrich MET is a straw man. Woolsey says he doesn't use it (for
extreme scores), and there are scores which seem to be quite wrong, such as for
3-away 4-away. If you have a new MET that is supposed to be an improvement over
what is out there, why not test it against METs people believe, or at least
better ones?

I think perhaps you are missing the history of this discussion. It started as my reaction to the suggestion that we make the Woolsey table the default for GNUbg. I thought that, with only two-digit accuracy, it was not a good choice, and I wanted to give some direct evidence. Remember, I am not talking about the "truth" of the various entries, only about the playing ability of GNUbg with that particular table.

It turns out that even toppling this straw man is not easy. Past experience (and other runs I did, such as Snowie/mec26 and Trice/mec26) convinced me once more that the differences are small, and any reasonable table will do.

My guess is that NN errors (or noise, if you will) are much bigger than the differences between METs, so a better MET will become important only if the NN becomes much better, which is unlikely before bots start playing using rollouts in real time (perhaps 5 years from now, assuming Moore's law holds?).

-Joseph

If you just want to be able to report as large a (correct) advantage as
possible, you might want to use match lengths at which the defects in the W-H
table show up more. Rather than use a percentage system that does not make
sense out of context, why not translate the advantage into Elo points?
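
As a rough illustration of the Elo suggestion, here is a minimal sketch. The thread does not say which rating formula is meant; the sketch assumes the FIBS formula, in which the favourite's chance of winning an N-point match is 1 / (1 + 10^(-D*sqrt(N)/2000)) for a rating difference D, and simply inverts it. The function name and its parameters are made up for this example.

```python
import math


def fibs_elo_difference(win_prob, match_length):
    """Rating difference implied by a match-winning probability.

    Assumption: the FIBS rating formula applies, i.e.
        win_prob = 1 / (1 + 10 ** (-D * sqrt(match_length) / 2000))
    This function solves that equation for D.
    """
    if not 0.0 < win_prob < 1.0:
        raise ValueError("win_prob must be strictly between 0 and 1")
    if match_length < 1:
        raise ValueError("match_length must be at least 1")
    odds = win_prob / (1.0 - win_prob)
    return 2000.0 / math.sqrt(match_length) * math.log10(odds)
```

Under this formula the same winning percentage corresponds to fewer rating points in a longer match, so the match length has to be reported together with the figure.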


I expected the correlation to be much higher - I am surprised that the MET
used influences the outcome of more than a quarter of matches (although
these METs differ much more than Snowie and mec26 do)

Better variance reduction may fix this. If I understand your methodology, if the
length of a game but not the result depends on the MET, then the rest of the
match should be only slightly more correlated than independent trials starting
at the resulting match score. If so, you may find a greater correlation if you
make the rolls of each game independent of the number of moves made up to that
point. You could test why the matches diverge, too.
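
One way to make each game's rolls independent of what happened earlier in the match is to give every game its own dice stream, seeded from the trial and the game number instead of drawing from one generator shared by the whole match. The sketch below is only an illustration of that idea, not how GNUbg or the test harness actually generates dice; `game_dice`, `trial_seed` and `game_index` are hypothetical names.

```python
import random
from itertools import islice


def game_dice(trial_seed, game_index):
    """Yield an endless sequence of dice rolls for one game of one trial.

    The stream depends only on (trial_seed, game_index), so two matches
    played with different METs see identical dice in game k even if
    game k-1 lasted a different number of moves in each match.
    """
    rng = random.Random(f"{trial_seed}:{game_index}")
    while True:
        yield rng.randint(1, 6), rng.randint(1, 6)


def first_rolls(trial_seed, game_index, n):
    """Convenience helper: the first n rolls of a given game's stream."""
    return list(islice(game_dice(trial_seed, game_index), n))
```

With this layout, using up 10 or 40 rolls in game 2 has no effect on the rolls dealt in game 3 of the same trial. The same function could be keyed on something other than the game number if a different pairing is wanted.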


I guess I was wrong in assuming that the rolls were generated randomly for each
pair of matches rather than for each pair of games. Why not generate them
randomly for each match score, rather than for each game? There must be a lot
of transposition.
Douglas Zare


