bug-gnubg

Re: [Bug-gnubg] TRAINING


From: Joseph Heled
Subject: Re: [Bug-gnubg] TRAINING
Date: Tue, 8 Aug 2017 17:15:21 +1200

Here are my few short cents,

On 8 August 2017 at 09:39, Philippe Michel <address@hidden> wrote:
On Sat, 29 Jul 2017, greg etem wrote:

I want to know how I can train GNU Backgammon on my PC. Thanks in advance!

You cannot train GNU backgammon with itself. What was used to train the current networks is available by CVS at
cvs.savannah.gnu.org:/cvsroot/gnubg/gnubg-nn
(instructions to obtain the gnubg source by CVS are at http://www.gnubg.org/index.php?itemid=26; for gnubg-nn the only difference is the last argument, since you check out gnubg-nn instead of gnubg).

Since the above software performs supervised training, you need a training database. The one used to train the current networks is available at
http://files.gnubg.org/media/nn-training/pmichel/training_data/
There are 3 files since gnubg uses 3 different networks for various stages of the game.

In sibling directories you'll find the current nets (format is slightly different from the one used in gnubg itself but the conversion from one to the other is trivial) in .../nn-training/pmichel/nets and benchmark databases used to evaluate the results of training in .../nn-training/pmichel/benchmarks/


You don't say what you aim to do exactly, but you must realize that if you just start with the above training database and networks you are unlikely to obtain something meaningfully different from the current level of play.

The kinds of changes you may try could be:

- start with a more accurate training database. The current one contains positions rolled out at 0-ply. Most of them are ok but complex positions like containment play or backgames could have been badly misevaluated.
Re-rolling out the databases should improve them (the current networks are better than what was used to create the current databases) but this is a lot of work. Re-rolling the misevaluated ones only would be faster but you would have to identify them first.

- add positions for the classes of positions that are misevaluated. Starting from one of these you would need some way to generate hundreds of similar positions (adding one or two or ten won't be enough).

Adding more positions of certain types can be beneficial or detrimental. The resulting trained net can turn out better or worse. You always need some kind of independent evaluator. In GNUBG we have a separate set of rolled-out positions to evaluate the quality of a net.
  

- remove "toxic" positions from the training database. At some stage (10 or so years ago) misevaluated positions were added automatically to the training database, but since the program didn't play too well back then, some of them are bizarre positions that don't happen with sensible play and may in addition be misevaluated in the rollouts.

- use a different network structure. Current networks have one hidden layer of 128 neurons. Frank Berger, the creator of BgBlitz, reported that he tried different sizes (up to 160 if I remember correctly) and the larger ones didn't help.

I tried different sizes for the hidden layer. There was no clear reason to increase the size from 128. Of course this was a long time ago.

Of course this was for a different program and may work out differently for gnubg. Moreover, trying a smaller hidden layer may lead to a program that would be weaker but in a more human-like way than the current randomly-weakened levels of gnubg.
Trying something more complex like a second hidden layer or some kind of not fully connected setup could be interesting but would need some programming work.

- adding new inputs or modifying current ones. This is probably the most promising way to improve the level of play, but it seems to be pretty hard. The inputs currently used are mostly (or even exclusively) from old research papers by Hans Berliner 40 years ago. I don't think there is a lot of more recent literature on the subject.
As far as I understand (I wasn't interested in gnubg at the time), Joseph Heled tried some things in this area with gnubg in the early 2000s but was frustrated by the difficulty of getting meaningful improvements.
Of course this one involves some coding in both gnubg and gnubg-nn.
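
As a rough illustration of the "identify them first" step from the first suggestion above, one possible heuristic is to rank training entries by how much the stored rollout result disagrees with what the current net predicts, and re-roll only the worst offenders. This is only a sketch under stated assumptions: the position encoding, the `evaluate` callback and the function name are hypothetical and are not part of gnubg or gnubg-nn, and a large disagreement can just as well mean the net is wrong rather than the rollout.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a position is any hashable encoding, and both the
# stored rollout result and the net output are sequences of probabilities.
Position = Tuple[int, ...]
Outputs = Sequence[float]


def rank_by_disagreement(
    entries: List[Tuple[Position, Outputs]],
    evaluate: Callable[[Position], Outputs],
    top_n: int,
) -> List[Tuple[float, Position]]:
    """Return the top_n entries whose stored target differs most from
    the current net's evaluation (largest per-output gap first)."""
    scored = []
    for position, target in entries:
        predicted = evaluate(position)
        # Worst single-output gap between rollout target and net output.
        gap = max(abs(p - t) for p, t in zip(predicted, target))
        scored.append((gap, position))
    # Sort on the gap only, so positions never need to be comparable.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    # Toy data: a fake "net" that returns the same answer everywhere.
    toy_entries = [((1,), [0.5, 0.1]), ((2,), [0.9, 0.3]), ((3,), [0.2, 0.0])]
    for gap, pos in rank_by_disagreement(toy_entries, lambda _: [0.5, 0.1], 2):
        print(pos, round(gap, 3))
```

The flagged positions are only candidates for a fresh rollout; the rollout itself still has to decide which side was wrong.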

The original net had 22 (x 2) inputs in addition to the "basic" 100 (x 2). I removed 4 and added 7 others. I also changed the definition of some of the existing ones. Those changes had a significant effect on performance. I tried many other inputs over the years, but failed to find something with a major impact. 
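
To make the sizes concrete, here is a small sketch of the input arithmetic and of a net with a single hidden layer of 128 neurons. It assumes that the 4 removed and 7 added inputs are counted per side, like the original 22. The number of outputs, the sigmoid activation, the random weights and all names are assumptions made for illustration; this is not the gnubg code.

```python
import math
import random

BASIC_PER_SIDE = 100            # "basic" inputs, per side
EXTRA_PER_SIDE = 22 - 4 + 7     # original extras, minus removed, plus added
N_INPUTS = 2 * (BASIC_PER_SIDE + EXTRA_PER_SIDE)
N_HIDDEN = 128                  # hidden layer size mentioned in the thread
N_OUTPUTS = 5                   # assumption, not stated in this thread


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def make_net(n_in: int, n_hidden: int, n_out: int, seed: int = 0) -> dict:
    """Build a net with small random weights and zero biases (untrained)."""
    rng = random.Random(seed)

    def layer(rows: int, cols: int):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "w1": layer(n_hidden, n_in), "b1": [0.0] * n_hidden,
        "w2": layer(n_out, n_hidden), "b2": [0.0] * n_out,
    }


def forward(net: dict, inputs: list) -> list:
    """Fully connected forward pass: inputs -> hidden -> outputs."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    return [
        sigmoid(sum(w * h for w, h in zip(row, hidden)) + b)
        for row, b in zip(net["w2"], net["b2"])
    ]


if __name__ == "__main__":
    net = make_net(N_INPUTS, N_HIDDEN, N_OUTPUTS)
    print(N_INPUTS, len(forward(net, [0.0] * N_INPUTS)))
```

Under that per-side reading the input count comes to 2 x (100 + 25) = 250. Changing `N_HIDDEN` is all it takes to try a smaller or larger hidden layer in a sketch like this; a second hidden layer would need another weight matrix and another loop in `forward`.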

But I am sure that other good inputs exist and can greatly help the 0-ply.
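
Whether a candidate input (or any other change) actually helps has to be judged on an independent set of rolled-out positions, as noted earlier. A minimal sketch of such a comparison follows. It uses the mean absolute difference between net outputs and rollout results as the score, which is a simplification chosen for illustration; how the gnubg-nn tools actually score a net is not described in this thread, and all names here are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Position = Tuple[int, ...]      # hypothetical position encoding
Outputs = Sequence[float]       # rollout result or net output


def mean_abs_error(
    benchmark: List[Tuple[Position, Outputs]],
    evaluate: Callable[[Position], Outputs],
) -> float:
    """Average absolute gap between net outputs and rollout results."""
    total = 0.0
    count = 0
    for position, rollout in benchmark:
        for predicted, reference in zip(evaluate(position), rollout):
            total += abs(predicted - reference)
            count += 1
    return total / count if count else 0.0


def compare_nets(
    benchmark: List[Tuple[Position, Outputs]],
    old_eval: Callable[[Position], Outputs],
    new_eval: Callable[[Position], Outputs],
) -> Tuple[float, float, bool]:
    """Return (old error, new error, whether the new net scores lower)."""
    old_error = mean_abs_error(benchmark, old_eval)
    new_error = mean_abs_error(benchmark, new_eval)
    return old_error, new_error, new_error < old_error


if __name__ == "__main__":
    # Toy benchmark with two positions and two outputs each.
    toy = [((1,), [1.0, 0.0]), ((2,), [0.0, 0.0])]
    print(compare_nets(toy, lambda _: [0.5, 0.5], lambda _: [0.75, 0.0]))
```

The benchmark positions must not overlap with the training database, otherwise the comparison says little about how the net plays on positions it has not seen.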

-Joseph



_______________________________________________
Bug-gnubg mailing list
address@hidden
https://lists.gnu.org/mailman/listinfo/bug-gnubg

