|Subject:||[gnubg] Help with a new MET (2)|
|Date:||Tue, 26 Nov 2019 09:41:25 +0000 (UTC)|
I have made several attempts to subscribe to the email list directly, without success, I think. After finding the archives I see that there have been a number of replies to my first post. I didn't read them until very recently; I wasn't ignoring your feedback, I just had not seen it. Thank you, all, for what I received. Also, I apologise that this post almost certainly starts a new thread, though I considered it more important to respond now.
I have been in direct email contact with both Joseph and Philippe behind the scenes and this has been most fruitful for me; thank you both. Philippe has found an error in the way the external player handles its "Takes". Gergely Elias (the savvy friend I mentioned before) has compiled a new Gnubg build to take advantage of this bug-fix. However, it seems we still have problems. Gergely will contact Philippe directly about those.
Thank you for the interest/encouragement in my project! I provide a little background for you, though please feel free to chat with me via email for more detail if you wish.
If we take (say) Gnubg Supremo+, or XG at 3-ply, 4-ply or higher, as an approximation to a near-perfect player (npp), then I think it is fairly easy to believe that:
A (npp) vs another (npp) should be using a near-perfect MET (npm)
Our current (npm)s are not 'perfect', of course, although we can use the Rockwell-Kazaross MET or Kazaross-XG2 as a reasonable approximation. Note: Despite the rollout bot used and the number of rollout trials involved for each MET, I suspect that the Roc-Kaz MET may be better. A discussion for another day, perhaps.
In my case, the rollouts I performed recently were all originally done solely to calibrate my theoretical "Variable MET" (VM). After doing the rollout trials and achieving what I consider a very successful calibration, it occurred to me that it would be interesting to find out what PR level the rollouts correspond to. After analysing an assortment of matches, mainly around (and averaging) the 5pt length, I got a PR of ~2.1. The idea for a 'PR2 MET' was born from this finding.
Hence, the premise I wish to explore is that a top-level human player (tlhp) playing another (tlhp) might actually do better not using a (npm). Said another way:
A (tlhp) vs another (tlhp) should use a MET for their level of play like the PR2 MET.
Incidentally, I am not being heretical here. There are several posts online sharing this viewpoint, and no, I am not looking for them ;-) However, I did revisit my copy of "CAN A FISH TASTE TWICE AS GOOD" and there is a paragraph, or two, sympathetic to my stated case. I can re-type a couple of sentences if I have to. FWIW, book co-author, Walter Trice, would have been a fascinating man to talk to about this.
As Joseph pointed out to me, there is no theoretical reason that using some 'lesser strength' MET achieves a better result for humans. I agree, though I do wonder why humans are playing on METs based on underlying g+bg rates of around 28.2% that give a 2a1a equity of ~32.3%, whereas our top humans are more likely to be achieving numbers closer to 27.3% and 31.6% respectively. I determined these last values from my rollouts, and they are inherently used in the PR2 MET.
There is a lot more to it than provided here, the link between g+bg rates and the supposed link to 2a1aC, for one. Both those numbers (along with others) are important inputs in my (VM) used to create the PR2 MET.
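As a rough illustration only of how a g+bg rate can feed into a 2-away/1-away Crawford figure, here is a minimal Python sketch. It is not the (VM) and not how the PR2 MET was built. It assumes both sides win the Crawford game equally often, that the given rate is the fraction of the trailer's wins that are gammons or backgammons, and that the following game at 1-away/1-away is an even 50%. The function and parameter names are made up for this example.

```python
def trailer_mwc_2a1a_crawford(gammon_rate, win_prob=0.5, dmp_mwc=0.5):
    """Naive match-winning chance for the trailer at 2-away/1-away Crawford.

    Assumptions (illustrative, not taken from any published MET):
      - the trailer wins the Crawford game with probability win_prob
      - gammon_rate is the fraction of those wins that are gammons/backgammons
      - a single win leads to 1-away/1-away, worth dmp_mwc to the trailer
    """
    if not 0.0 <= gammon_rate <= 1.0:
        raise ValueError("gammon_rate must be between 0 and 1")
    gammon_wins = win_prob * gammon_rate          # wins the match outright
    single_wins = win_prob * (1.0 - gammon_rate)  # levels the match at DMP
    return gammon_wins + single_wins * dmp_mwc


if __name__ == "__main__":
    for rate in (0.282, 0.273):
        print(rate, trailer_mwc_2a1a_crawford(rate))
```

With the default assumptions this reduces to 0.25 + rate/4, giving about 0.3205 and 0.3183 for the two rates above. These do not reproduce the ~32.3% and ~31.6% quoted earlier, which fits the point that the real link involves more than this simple relation.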
Also, it is not as if one human plays another with a MET each sitting in 'the cloud' above their heads awaiting exact use (a nice analogy from Joseph, I thought). The MET we use comes in later during analysis and/or before a match when we attempt to benchmark our play and cube decisions. I have barely scratched the surface of using the PR2 MET with my own matches. Are there differences? Yes, and those interest me. How much overall difference does it make? Very unclear, and I realise that the difference (if one exists) is probably tiny and will be very difficult to show with statistical significance.
I am having fun trying though ;-)
(Australian Backgammon Federation, Secretary)