Re: [Bug-gnubg] ***SPAM*** Re: Variance reduction efficiency


From: Philippe Michel
Subject: Re: [Bug-gnubg] ***SPAM*** Re: Variance reduction efficiency
Date: Sat, 26 Aug 2017 23:47:27 +0200 (CEST)
User-agent: Alpine 2.21 (BSF 202 2017-01-01)

On Thu, 24 Aug 2017, tchow wrote:

> On 2017-08-24 16:28, Philippe Michel wrote:
>> With your proposal used for the usual 2-ply rollouts the first few
>> steps would be 20 times slower, the following ones unchanged. The
>> total number of steps would depend on the position but the final cost
>> may be in the 1.5x to 2x range, similar to the gain in accuracy when
>> *all* steps use 2-ply variance reduction.
>
> Thanks for taking a look. I thought of a related idea, which is to increase the ply level for the VR occasionally, but not necessarily for the first play; instead, one increases the ply level for the VR "as needed." I'm not sure exactly what "as needed" should mean, but one possibility is that if the roll chosen in the rollout trial is extremely lucky or extremely unlucky, then we invest some extra effort to make sure that the luck estimate is accurate for that roll. If the threshold for "extremely lucky or unlucky" is chosen so that the 20x slowdown is invoked only 1/20 of the time, then the overall time penalty should be in the 2x range. Of course it's not clear whether my intuition is correct that the extremely lucky/unlucky rolls are the ones most in need of accurate VR.

I now think it couldn't work, no matter how accurately you are able to select the rolls that would get a deeper-ply VR, whether they are the first few, the most volatile, or anything else.

Let's assume a 2-ply rollout and that, as suggested in my previous sample, doing a 2-ply instead of a 1-ply VR decreases the SD in the same proportion as doing twice as many trials would, but at the cost of being 20 times slower.

Correcting the 1-ply vs. 2-ply inaccuracy of the VR for all moves does this, and your idea amounts to hoping that by fixing the VR for a small fraction (less than 5%) of the moves you could get most of the benefit.

That could only work if the 1-ply vs. 2-ply inaccuracy were concentrated in a few spikes, and this is simply not the case.
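
To put rough numbers on this argument, here is a minimal Python sketch. It only uses the two figures assumed above (a 20x slowdown for the deeper VR, and a variance halved when all moves get it, which is what "equivalent to twice as many trials" means); the function names are made up for the illustration and have nothing to do with the gnubg code.

```python
def relative_cost(fraction_deep, slowdown=20.0):
    """Time per trial, relative to 1-ply VR everywhere, when a given
    fraction of the moves gets the deeper (assumed 20x slower) VR."""
    return (1.0 - fraction_deep) + fraction_deep * slowdown


def required_share_of_full_gain(fraction_deep, slowdown=20.0, full_variance_ratio=0.5):
    """Share of the full variance reduction that the selective scheme must
    capture just to break even with spending the same time on more trials.

    full_variance_ratio=0.5 is the assumption from the text: 2-ply VR on
    all moves is worth twice as many trials, i.e. it halves the variance.
    """
    # Running 'cost' times more plain trials divides the variance by 'cost'.
    breakeven_ratio = 1.0 / relative_cost(fraction_deep, slowdown)
    return (1.0 - breakeven_ratio) / (1.0 - full_variance_ratio)


if __name__ == "__main__":
    for fraction in (0.01, 0.05):
        cost = relative_cost(fraction)
        share = required_share_of_full_gain(fraction)
        print(f"deep VR on {fraction:.0%} of moves: cost x{cost:.2f}, "
              f"needs {share:.0%} of the full variance reduction to break even")
```

Under these assumptions, deep VR on 5% of the moves costs about 1.95x, so those moves would have to account for almost all of the 1-ply vs. 2-ply inaccuracy before the scheme merely matches running more trials.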

The only possibly worthwhile case could be to do this on the first one or two rolls because, by recording the results, you could do the VR part only 21 or 441 times, regardless of the number of rollout trials. That would entail some significant coding effort for, at best, a very limited gain.
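
For what it's worth, the 21 and 441 figures are the number of distinct dice rolls and of distinct pairs of rolls. A small Python sketch of the recording idea follows; `expensive_luck_estimate` is a hypothetical placeholder for the deeper-ply evaluation, not an existing gnubg function, and the sketch assumes the result depends only on the sequence of rolls so far.

```python
import random
from itertools import combinations_with_replacement

# The 21 distinct rolls of two dice: (1, 1), (1, 2), ..., (6, 6).
ROLLS = list(combinations_with_replacement(range(1, 7), 2))


def sample_roll(rng):
    """Roll two dice and return them as an ordered pair (low, high)."""
    a, b = rng.randint(1, 6), rng.randint(1, 6)
    return (min(a, b), max(a, b))


def expensive_luck_estimate(prefix):
    """Hypothetical stand-in for the slow deeper-ply VR computation."""
    return 0.0  # placeholder value, not a real luck estimate


def count_deep_evaluations(trials, depth, seed=0):
    """Run 'trials' simulated openings of 'depth' rolls and return how many
    times the expensive evaluation was actually needed, given that results
    are recorded per roll sequence and reused."""
    rng = random.Random(seed)
    recorded = {}
    for _ in range(trials):
        prefix = tuple(sample_roll(rng) for _ in range(depth))
        if prefix not in recorded:
            recorded[prefix] = expensive_luck_estimate(prefix)
    return len(recorded)


if __name__ == "__main__":
    print(len(ROLLS), len(ROLLS) ** 2)
    print(count_deep_evaluations(trials=5000, depth=1))
    print(count_deep_evaluations(trials=5000, depth=2))
```

However many trials are run, the counts are bounded by 21 for one roll and 441 for two, which is the only reason the extra cost stays fixed in this case.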


