bug-gnubg

Re: The status of gnubg?


From: Isaac Keslassy
Subject: Re: The status of gnubg?
Date: Mon, 19 Oct 2020 23:23:54 +0300
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101 Thunderbird/68.12.1

Hi,

It would be great to renew the effort on gnubg!

I have a question regarding the fundamental NN weight-improvement technique. If I understand correctly, you are taking a supervised-learning approach: pick tough positions, determine the best move with rollouts, then gradually optimize the NN weights toward those targets. As Joseph mentioned, though, this may degrade the NN's play in the positions that arise in regular games.

There are, however, other techniques that have proved more effective in games such as chess. They avoid long rollouts and work directly on positions from regular games. For instance:

1. SPSA: This is an obvious approach. Let the NN play against a very slightly perturbed copy of itself, move toward the winner's parameters, and let this random walk gradually converge to better weights.
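To make the SPSA idea concrete, here is a minimal toy sketch. Everything in it is hypothetical: `play_match` stands in for "let the two versions of the net play a short match" (its hidden optimum `W_OPT` plays the role of the unknown best weights), and the gains `a` and `c` are arbitrary constants (classic SPSA decays them over time; fixed values keep the sketch short).

```python
import math
import random

# Hidden "true" optimum; stands in for the unknown best NN weights.
W_OPT = [0.5, -1.0, 2.0]

def strength(w):
    # Toy playing strength: peaks at W_OPT. A real tuner never sees this.
    return -sum((wi - oi) ** 2 for wi, oi in zip(w, W_OPT))

def play_match(w_plus, w_minus, games=200, rng=random):
    """Return w_plus's average score in [-1, +1] over a short noisy match."""
    p_win = 1.0 / (1.0 + math.exp(strength(w_minus) - strength(w_plus)))
    wins = sum(rng.random() < p_win for _ in range(games))
    return 2.0 * wins / games - 1.0

def spsa_tune(w, iters=500, a=0.05, c=0.1, seed=1):
    rng = random.Random(seed)
    for _ in range(iters):
        # Perturb all weights at once with a random +/-1 vector.
        delta = [rng.choice((-1.0, 1.0)) for _ in w]
        w_plus = [wi + c * di for wi, di in zip(w, delta)]
        w_minus = [wi - c * di for wi, di in zip(w, delta)]
        score = play_match(w_plus, w_minus, rng=rng)
        # Move toward the winner: the match score is the gradient estimate.
        w = [wi + a * score * di for wi, di in zip(w, delta)]
    return w

w0 = [0.0, 0.0, 0.0]
w1 = spsa_tune(w0)
```

The appeal is that each step needs only one short, noisy match rather than rollouts; the random-walk noise averages out over many iterations.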

2. Logistic regression: Instead of teaching the best move, teach the position equity (as Aaron also mentioned). Specifically, we could try to minimize the equity error associated with each position. Assume DMP for simplicity. Play a million games of self-play and label every position encountered with the final game result (-1 for a loss, +1 for a win). Then tune all the NN weights through gradient descent to minimize the difference between each position's equity estimate and the final result.
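As a toy sketch of this outcome-regression idea (again, everything here is hypothetical: synthetic random "positions" instead of self-play data, a single tanh layer instead of gnubg's net, plain batch gradient descent):

```python
import math
import random

rng = random.Random(0)
N_FEATURES = 5

# Hidden evaluation used only to generate synthetic labeled data.
w_true = [rng.uniform(-1, 1) for _ in range(N_FEATURES)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_position():
    # A fake "position": feature vector x, DMP result r in {-1, +1}
    # sampled from a hidden true winning probability.
    x = [rng.uniform(-1, 1) for _ in range(N_FEATURES)]
    p_win = 1.0 / (1.0 + math.exp(-4.0 * dot(w_true, x)))
    r = 1.0 if rng.random() < p_win else -1.0
    return x, r

data = [make_position() for _ in range(5000)]

def equity(w, x):
    return math.tanh(dot(w, x))  # model's equity estimate in [-1, +1]

def mse(w):
    return sum((equity(w, x) - r) ** 2 for x, r in data) / len(data)

def train(w, epochs=30, lr=0.1):
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, r in data:
            e = equity(w, x)
            g = 2.0 * (e - r) * (1.0 - e * e)  # chain rule through tanh
            for i, xi in enumerate(x):
                grad[i] += g * xi
        w = [wi - lr * gi / len(data) for wi, gi in zip(w, grad)]
    return w

w0 = [0.0] * N_FEATURES
w1 = train(w0)
```

Even though each individual game result is extremely noisy, minimizing the squared error against millions of outcomes pushes the evaluation toward the true expected result, which at DMP is exactly the equity.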

(For more details, see https://www.chessprogramming.org/Automated_Tuning, which covers Texel tuning, SPSA, and related methods.)

Has anybody tried such alternative methods?

Thanks,
Isaac


