bug-gnubg
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Bug-gnubg] Feature request: Rollout frequency based on std. errors


From: Christopher Yep
Subject: [Bug-gnubg] Feature request: Rollout frequency based on std. errors
Date: Mon, 31 Aug 2009 05:05:58 -0400

I don't know how difficult this would be to program, but one possible enhancement (which the user can optionally select) for rolling out positions is to rollout moves (candidate plays) with large standard errors more frequently than moves with small standard errors.

For example, a user may want to rollout four moves (A, B, C, and D). Moves A and B may usually lead to complex (future) positions which gnubg has a hard time evaluating, while moves C and D may usually lead to simple positions (e.g. holding games, races, etc.) which gnubg can easily evaluate. So, variance reduction is much less effective for moves A and B than for moves C and D. It may be the case that the standard errors of moves A and B are about 4 times as large as the standard errors of moves C and D (for an N-game rollout of each move).

Instead of rolling out all four moves N times, this new feature would rollout A and B approximately 16 (i.e. 4*4) times more frequently than C and D, resulting in approximately equal standard errors for each move at the conclusion of the rollout (and even during the rollout, based on the algorithms below).

For a single thread, the algorithm works as follows (for the example of rolling out 4 moves):

1. Rollout A, B, C, and D two times each.
2. Check the standard errors of each move. Rollout one additional game of the move with the highest standard error.
3. Repeat step 2 until the rollout is complete.

Note that "set rollout trials" could correspond to (1) the number of times to rollout move A (the first play selected), (2) the average number of rolled out games per play, (3) the maximum number of games to rollout each play, or something else. To avoid the possibility of unexpected long rollouts (i.e. if move A has extremely low standard errors), I suggest choosing (2) or (3).

For multiple threads, the algorithm is similar: when a thread needs a new game to rollout, assign it to rollout the move which currently has the highest standard error.

If all of the above results in too much overhead (I doubt this is the case, but I suppose it's possible), then the single-thread algorithm could be carried out in batches of 36 games:

1. Rollout A, B, C, and D 36 times each.
2. Check the standard errors of each move. Rollout 36 additional games of the move with the highest standard error.
3. Repeat step 2 until the rollout is complete.

(The multiple-thread algorithm would have to be tweaked a little, but the idea is essentially the same.)

Chris





reply via email to

[Prev in Thread] Current Thread [Next in Thread]