Re: current development

From: Timothy Y. Chow
Subject: Re: current development
Date: Thu, 5 Dec 2019 11:32:00 -0500 (EST)
User-agent: Alpine 2.21 (LRH 202 2017-01-01)

On Thu, 5 Dec 2019, Nikos Papachristou wrote:
> My personal view on improving GNUBG: Why not try to "upgrade" your existing supervised learning approach? There have been lots of advances in optimization/regularization algorithms for neural networks in the past years and it might be less demanding than trying a new RL self-play approach from scratch.
>
> Regarding expected results, I also believe that backgammon bots are very close to perfection and whatever improvements (from any approach) will be marginal.

In order to determine whether a new network is doing better than the old network, it helps to have examples of positions where the old network is clearly playing poorly. Here's one example of a game that I played against eXtreme Gammon where the bot made a lot of obvious blunders:


For example, search for "10/8 6/4(3)". The bot's ridiculous play here would not be among the top 50 plays of any halfway decent human player. Admittedly this was XG, but I would expect GNU to behave similarly, if not in these specific positions then in similar ones.

Playing around with positions like this will quickly disabuse anyone of the illusion that "backgammon bots are very close to perfection."

As I recall, in the past, people have tried specifically training neural nets on positions like these, as well as "snake" positions where you have to roll a prime for a long distance, and the problem was that it seemed to degrade performance on other types of positions. It's possible that, as Papachristou suggests, recent incremental improvements in regularization algorithms might be good enough to overcome these difficulties. Anecdotal evidence from Robert Wachtel's revised version of "In the Game Until the End" suggests that Xavier was able to improve eXtreme Gammon's post-coup classique play significantly, without a wholesale switch to modern deep learning methods.
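
To make the "better on these positions, worse on others" concern concrete, here is a minimal sketch of how one might compare an old and a new network category by category. Everything in it is an assumption for illustration: the function name `compare_by_category`, the category labels, the idea of measuring each network's equity error against some trusted reference (such as a long rollout), and the sample numbers, which are placeholders rather than measurements from GNUBG or XG.

```python
from collections import defaultdict
from statistics import mean


def compare_by_category(results, tolerance=0.0):
    """Summarise old vs. new network error per position category.

    results: iterable of (category, old_error, new_error) tuples, where each
    error is the equity given up relative to a trusted reference evaluation
    (hypothetically, a rollout). Lower is better.
    Returns {category: (old_mean, new_mean, verdict)}.
    """
    grouped = defaultdict(list)
    for category, old_err, new_err in results:
        grouped[category].append((old_err, new_err))

    report = {}
    for category, pairs in grouped.items():
        old_mean = mean(old for old, _ in pairs)
        new_mean = mean(new for _, new in pairs)
        if new_mean < old_mean - tolerance:
            verdict = "improved"
        elif new_mean > old_mean + tolerance:
            verdict = "regressed"
        else:
            verdict = "unchanged"
        report[category] = (old_mean, new_mean, verdict)
    return report


# Placeholder data only: (category, old net error, new net error).
sample = [
    ("snake", 0.30, 0.10),
    ("snake", 0.20, 0.10),
    ("race", 0.01, 0.05),
]

for category, (old_mean, new_mean, verdict) in compare_by_category(
    sample, tolerance=0.005
).items():
    print(f"{category}: old={old_mean:.3f} new={new_mean:.3f} {verdict}")
```

Reporting per category rather than as a single overall average is the point of the sketch: a net that was tuned on snake or post-coup classique positions could look better overall while quietly getting worse on ordinary races or contact positions.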

