
Re: [Bug-gnubg] Confused

From: Philippe Michel
Subject: Re: [Bug-gnubg] Confused
Date: Tue, 23 Jun 2015 01:06:01 +0200 (CEST)
User-agent: Alpine 2.20 (BSF 67 2015-01-07)

On Tue, 16 Jun 2015, Ian Shaw wrote:

> Did you also re-roll the benchmark data? Or was this not used? I remember what a big effort it was the first time round.

Yes. There were some problems with the previous benchmark databases: not all of them were rolled out with the same nets, and the evaluation of backgammons could be very wrong.

It took some time to do this (a few weeks, I think), but current computers must be at least 10 times faster than they were in the early to mid-2000s.

> I noticed that in the earlier post you added some positions with your own selection of best move. This must have been very labour intensive, harking back to the earliest days of bot training with expert knowledge. I take my hat off to you!

I did this only to add positions about rolling an outer prime against one or a few checkers. It was more tedious than really "labour intensive", and hardly required expert knowledge.

I had a modified gnubg that logged the id of every evaluated position. I played the positions out human vs. human, asked for a 0-ply hint with the evaluation cache disabled, selected the first few choices from gnubg, added a few of my own if gnubg's didn't look right, and asked for a 0-ply evaluation of the selected choices. Every position logged twice was rolled out and added to the training set.
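The filtering step above (keep only positions whose id was logged twice: once when played, once when re-evaluated as a selected candidate) could be sketched like this; `select_training_positions` is a hypothetical name, not part of gnubg:

```python
from collections import Counter

def select_training_positions(logged_ids):
    """Keep only position ids that appear at least twice in the
    evaluation log, i.e. were evaluated both during play and again
    when the chosen candidates were re-evaluated."""
    counts = Counter(logged_ids)
    return [pos_id for pos_id, n in counts.items() if n >= 2]

# Toy log: "A" and "B" were evaluated more than once, "C" only once.
log = ["A", "B", "A", "C", "B", "A"]
print(select_training_positions(log))  # -> ['A', 'B']
```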

> What would be the next stage of bot training? Repeating the rollout process and re-training? Or would it be better to search for new positions that the bot does not understand well?

I don't know. Redoing the rollouts would certainly help, but it is a lot of work for an uncertain gain. Redoing what Joseph did to make 0-ply mimic 2-ply, starting from a more accurate base, might work as well.

Adding positions shouldn't hurt, but you would have to add a lot of them. Removing those that are too unlikely might help as well (but you have to find them first).

Something I think would help would be to train the various nets on slightly overlapping sets of positions, to avoid discontinuities between the contact and race nets and between the contact and crashed nets.
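The overlap idea could look something like the following toy sketch. Everything here is an assumption for illustration: `contact_distance` stands in for whatever scalar feature decides the contact/race split, and `MARGIN` is a made-up width for the overlap band:

```python
MARGIN = 2  # assumed width of the overlap band around the boundary

def training_sets(positions):
    """Split (position, contact_distance) pairs into 'contact' and
    'race' training sets, duplicating positions within MARGIN of the
    boundary into both sets so the two nets are trained on an
    overlapping band and agree near the transition."""
    contact, race = [], []
    for pos, contact_distance in positions:
        if contact_distance > 0:
            contact.append(pos)
        else:
            race.append(pos)
        if abs(contact_distance) <= MARGIN:
            # Near the boundary: train the other net on it too.
            (race if contact_distance > 0 else contact).append(pos)
    return contact, race

c, r = training_sets([("p1", 5), ("p2", 1), ("p3", -1), ("p4", -6)])
print(c)  # -> ['p1', 'p2', 'p3']  (p3 duplicated from the race side)
print(r)  # -> ['p2', 'p3', 'p4']  (p2 duplicated from the contact side)
```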

Of course, there could be something to do about the net inputs. These are mostly from Berliner, 40 years ago, when everybody was rather clueless by modern standards. Some of them are very costly to compute; could they be replaced by a few simpler ones combined through the neural net?

There is also the issue of net structure. Adding a second intermediate layer? Something mixed, like feeding the 200 raw inputs into, say, a 50-node layer, adding the 50 complex inputs, and passing the 100 values to another layer? That would imply significant changes in the training code.
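A forward pass through that mixed structure could be sketched in NumPy as below. The 200/50/50 sizes come from the message; the size of the second layer and the 5 outputs (win, gammon and backgammon probabilities for each side, as in gnubg's nets) are assumptions, and the weights are random stand-ins rather than trained values:

```python
import numpy as np

rng = np.random.default_rng(0)

N_RAW, N_HID1 = 200, 50     # 200 raw inputs -> 50-node layer
N_COMPLEX = 50              # hand-crafted "complex" inputs, appended
N_HID2 = 100                # assumed size of the second layer
N_OUT = 5                   # win / gammon / backgammon outputs

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Random stand-in weights; a real net would train these.
W1 = rng.standard_normal((N_HID1, N_RAW)) * 0.1
W2 = rng.standard_normal((N_HID2, N_HID1 + N_COMPLEX)) * 0.1
W3 = rng.standard_normal((N_OUT, N_HID2)) * 0.1

def forward(raw_inputs, complex_inputs):
    h1 = sigmoid(W1 @ raw_inputs)                  # 200 raw -> 50
    mixed = np.concatenate([h1, complex_inputs])   # 50 + 50 = 100 values
    h2 = sigmoid(W2 @ mixed)                       # 100 -> second layer
    return sigmoid(W3 @ h2)                        # final outputs

out = forward(rng.standard_normal(N_RAW), rng.standard_normal(N_COMPLEX))
print(out.shape)  # -> (5,)
```

The appeal of this shape is that the costly hand-crafted inputs skip the first layer entirely, while the raw board encoding gets one extra layer to compensate for its simplicity.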

There may be something to do on the tree-search side as well. Currently, deeper evaluations use the most radical form of forward pruning, assuming the static evaluation gives the right move at intermediate plies. We could do a limited minimax search instead, for instance exploring the 2 best moves, or something like "if the best move is a hit, also explore the best non-hit if it is close", and vice versa.
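The hit/non-hit widening rule could be sketched as below. The `evaluate` and `is_hit` helpers and the `window` equity threshold are all assumptions for illustration, not gnubg code:

```python
def candidate_moves(moves, evaluate, is_hit, window=0.02):
    """Keep the statically best move, plus the best move of the
    *other* kind (hit vs. non-hit) when its static evaluation is
    within `window` equity of the best.  This widens pure forward
    pruning without exploring every move."""
    ranked = sorted(moves, key=evaluate, reverse=True)
    best = ranked[0]
    keep = [best]
    for m in ranked[1:]:
        if is_hit(m) != is_hit(best) and evaluate(best) - evaluate(m) <= window:
            keep.append(m)   # best move of the other kind, close enough
            break
    return keep

# Toy moves as (static_equity, is_a_hit) pairs.
moves = [(0.10, True), (0.09, False), (0.05, False)]
kept = candidate_moves(moves, evaluate=lambda m: m[0], is_hit=lambda m: m[1])
print(kept)  # -> [(0.1, True), (0.09, False)]
```

The search would then recurse only on the kept candidates, so the cost stays close to pure pruning while hit/no-hit decisions get a real one-level minimax comparison.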
