Re: [Bug-gnubg] Alternative weights files and call for benchmarkers (lon

bug-gnubg

From:	Philippe Michel
Subject:	Re: [Bug-gnubg] Alternative weights files and call for benchmarkers (long)
Date:	Sun, 24 Jun 2012 21:37:35 +0200 (CEST)
User-agent:	Alpine 2.00 (BSF 1167 2008-08-23)

On Sun, 24 Jun 2012, Joseph Heled wrote:

> I am very interested to know how those nets were generated?

They were trained with your gnubg-nn tools, but from improved training data. This is basically how it went:

I first tried to train the crashed net. Since it seemed one of its problems was dubious absolute equities in many positions and large discrepancies between even and odd evaluations, I used the original set of positions with the average of 3ply and 4ply evaluations.
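A minimal sketch of the averaging step described above, assuming each evaluation is a plain list of output probabilities. The function name and the five-output layout are illustrative assumptions, not part of gnubg-nn:

```python
def average_evaluations(eval_3ply, eval_4ply):
    """Return the element-wise mean of two evaluations of the same position."""
    if len(eval_3ply) != len(eval_4ply):
        raise ValueError("evaluations must have the same number of outputs")
    return [(a + b) / 2.0 for a, b in zip(eval_3ply, eval_4ply)]


# Hypothetical outputs, assumed to be ordered as:
# win, win gammon, win backgammon, lose gammon, lose backgammon
target = average_evaluations(
    [0.60, 0.20, 0.01, 0.10, 0.00],
    [0.50, 0.10, 0.01, 0.20, 0.00],
)
```

The averaged list would then replace the single-ply target for that position in the training set.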

Early results looked promising but it didn't go very far, the 0ply errors going from:

checkers 771 cube 1088 (total errors in the 0.90.0 net)
to
checkers 747 cube 753
to
checkers 741 cube 776 (with the training set evaluated with the above net)
to
checkers 753 cube 787

Checking the worst positions (worst meaning 3ply differing most from 4ply), it was clear that although the large differences went down from more than 4.0 in the old net to about 1.0, the equity given by a rollout was often close to either 3ply or 4ply, often outside the interval between them, and taking the average wasn't converging.
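The check described above can be sketched as follows. This is only an illustration with hypothetical function names and made-up equities; it shows why the 3ply/4ply average cannot be right when the rollout equity falls outside the interval:

```python
def classify_against_interval(eq_3ply, eq_4ply, eq_rollout):
    """Say where the rollout equity lies relative to the 3ply/4ply interval."""
    low, high = sorted((eq_3ply, eq_4ply))
    if eq_rollout < low:
        return "below"
    if eq_rollout > high:
        return "above"
    return "inside"


def average_error(eq_3ply, eq_4ply, eq_rollout):
    """Absolute error of the 3ply/4ply average, taking the rollout as reference."""
    return abs((eq_3ply + eq_4ply) / 2.0 - eq_rollout)


# Made-up equities: the plies differ by 1.0 and the rollout is above both.
where = classify_against_interval(0.25, 1.25, 1.5)
error = average_error(0.25, 1.25, 1.5)
```

When the result is "below" or "above", the average is further from the rollout than the nearer of the two evaluations, so averaging moves the target the wrong way.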

At this point I started to roll out the whole crashed training database (1296 trials, 0ply) using the 741/776 net. I used a slightly modified gnubg since gnubg-nn, not using SSE, would be much slower.


Training from that led to a benchmark of checkers 766 cube 514.

Then I looked at what I could change in the training set to improve checker play. Since it had been reported that the crashed net was bad at containment play and rolling outside primes, making bizarre stacks in the outfield, I started there.

Looking at the training positions, I found quite a few such positions: stacks of 7 or 8 checkers on the 12 point, things like that. I tried to remove them, but since you added them in pairs, I tried hard to remove groups of related positions, not single ones. It was tedious and led only to minimal improvement. I gave up and left all the original positions.

I then tried to add positions from rolling a prime from far away (playing out from something like Advanced Backgammon's position 127 with a varying number of men already off) against one or two checkers. I asked gnubg for its 0ply hint and if its 2 or 3 favorites looked wrong I added them, as well as my choice, to the training set. All in all, I added 700-800 positions. This worked quite well, decreasing the checker error to the 730s.

While investigating these 1-checker containment positions, I had noted that the original training set was very unbalanced, with something like 1500 positions seen from the container and 30 from the runner. This is quite logical if most positions were added when 0ply and 2ply plays differ, but I had the idea that the even/odd effect might be somehow related to this, especially for crashed positions where checker mobility, hence a training set more or less automatically generated, would be likely to be very asymmetrical.

To test this, I did the same full rollouts on the race training database as well as its positions with the other player on roll. It worked well, improving the 0ply benchmark a little and the 1ply one a lot. The swapped positions are not related pairs like most original ones but they still help.
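A rough sketch of adding the swapped positions, assuming a position is stored as a pair of tuples (checkers of the player on roll, checkers of the opponent), each seen from its owner's side of the board. This representation and the function names are assumptions for illustration, not the format gnubg-nn uses:

```python
def swap_side_on_roll(position):
    """Return the same board with the other player on roll."""
    on_roll, opponent = position
    return (opponent, on_roll)


def with_inverted_positions(positions):
    """Return the original positions followed by any swapped ones not already present."""
    seen = set(positions)
    result = list(positions)
    for pos in positions:
        swapped = swap_side_on_roll(pos)
        if swapped not in seen:
            seen.add(swapped)
            result.append(swapped)
    return result


# Tiny made-up boards with two points per side, just to show the shape.
augmented = with_inverted_positions([((0, 2), (1, 0)), ((1, 1), (1, 1))])
```

Only the positions are generated this way. Each swapped position still needs its own rollout, since its equity cannot be derived from the original one.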

Redoing full rollouts on the crashed training set or even only on its swapped positions was going to take more time than I wished so I settled on doing a truncated rollout (324 trials, truncated at 8 ply then 2 ply evaluation) of the whole new set (original + inverted positions).
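As a back-of-the-envelope illustration of the trade-off between 324 and 1296 trials, the sketch below compares standard errors under the simplifying assumption of independent trials with equal per-trial variance. That assumption does not hold exactly for real rollouts (truncation changes both the variance and the bias), so this is only an order-of-magnitude guide:

```python
import math


def relative_standard_error(trials, reference_trials):
    """Ratio of standard errors, assuming independent trials of equal variance."""
    return math.sqrt(reference_trials / trials)


# 324 trials against the 1296 used for the full rollouts.
ratio = relative_standard_error(324, 1296)
```

Under that assumption, a quarter of the trials costs a factor of two in statistical noise, before counting the time saved by truncating each trial.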

Xavier Dufaure from XG had claimed in the Bgonline forum that its roller++ evaluations (similar to the above truncated rollout) were generally about as good as a full rollout. Later experience makes me think this is not quite the case for gnubg (for starters, its equity estimates at the truncation point are not good enough). But this seemed like a decent compromise that I used to reevaluate the contact training set (original + inverted positions) as well. My thoughts were that it would somehow diffuse the improved estimations from the crashed and race nets and the easy late positions deeper than a simple 2ply evaluation would.

The weights attached to the earlier message are those resulting from the above process. In summary:

- full rollout of the original race database + inverted positions using the 0.90.0 net and train a race net from that
- truncated rollout of the original crashed database + inverted positions + new containment positions using an intermediate crashed net and the above race net, and train from that
- truncated rollout of the contact database + inverted positions using the 0.90.0 contact net and above crashed and race nets, and train from that
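The three stages summarized above can be written down as data, which makes the dependencies between nets explicit. This is only a descriptive sketch: the `Stage` class and its field names are hypothetical and do not correspond to any gnubg-nn configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    net: str               # net trained at the end of this stage
    training_set: tuple    # sources of the positions
    rollout: str           # "full" or "truncated"
    evaluated_with: tuple  # nets used while rolling out


PIPELINE = (
    Stage("race",
          ("original race database", "inverted positions"),
          "full",
          ("0.90.0 net",)),
    Stage("crashed",
          ("original crashed database", "inverted positions",
           "new containment positions"),
          "truncated",
          ("intermediate crashed net", "new race net")),
    Stage("contact",
          ("contact database", "inverted positions"),
          "truncated",
          ("0.90.0 contact net", "new crashed net", "new race net")),
)

for stage in PIPELINE:
    print(f"{stage.net}: {stage.rollout} rollout using {', '.join(stage.evaluated_with)}")
```

The order matters: each later stage is evaluated with nets produced by the earlier ones.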

After that I tried another pass of truncated rollouts on the contact and crashed training sets but it didn't improve the benchmarks (or maybe it did: this is when I realized the crashed benchmark was flawed, but the improvement, if any, looked like it would be minimal).

Since I didn't see any obvious prospects for further quick improvements but the nets seemed to be worthwhile, I trained corresponding pruning nets and posted the weights files as they were.

At this point, I'm looking at redoing full rollouts of the contact and crashed databases (1.8M positions!). I've done some tests on a few thousand positions with the smallest and largest pip counts. No surprise here: the former are fast but the current data is already accurate, the latter take a lot of time but the results are quite often much more plausible than the current estimates.
