bug-gnubg

Re: [Bug-gnubg] Re: Different rollouts are good


From: Jim Segrave
Subject: Re: [Bug-gnubg] Re: Different rollouts are good
Date: Mon, 30 Sep 2002 18:12:16 +0200
User-agent: Mutt/1.4i

On Mon 30 Sep 2002 (13:32 +0200), Øystein O Johansen wrote:
> 
> Let us first agree to keep a limit and handle only _two_ different search
> spaces.


I considered (and decided that I wouldn't touch it) whether the plies
and search spaces should decrease in some way as the rollout gets
further along. My feeling is that it would add a lot of complexity for
probably little gain. Kit Woolsey's suggestion seemed to be that
inaccuracies several plays deep in a rollout would have only a small
influence, to the point where a 0- or 1-ply evaluation with a fairly
small search space would be sufficient.
 
> The user must be able to set the two different evaluation classes and the
> number of plies at which to change the search space.
> The rollout context struct (in eval.h) must therefore include at least one
> evalcontext for the late decisions, and the rollout context must also
> include the limit value at which to change evalcontext.
> 
> typedef struct _rolloutcontext {
> 
>   evalcontext aecCube[ 2 ], aecChequer [ 2 ]; /* evaluation parameters */
>   evalcontext ecLateDecisions;
> 
>   unsigned int fCubeful : 1; /* Cubeful rollout */
>   unsigned int fVarRedn : 1; /* variance reduction */
>   unsigned int fInitial: 1;  /* roll out as opening position */
>   unsigned int fRotate : 1;  /* rotate dice of first two rolls */
> 
>   unsigned short nTruncate; /* truncation */
>   unsigned short nTrials; /* number of rollouts */
>   unsigned short nLateDecisionsLimit; /* turn after which ecLateDecisions applies */
>   rng rngRollout;
>   int nSeed;
> } rolloutcontext;
> 
> We may also discuss whether it is necessary to add an aecLateCube[2] and
> aecLateChequer[2].

 
> Now the rest of the changes will be in BasicCubefulRollout () in rollout.c
> 
> I suggest this is added in line 262:
> 
>   int nTruncate = prc->nTruncate;
>   int cGames = prc->nTrials;
>   int nLateDecisionsLimit = prc->nLateDecisionsLimit;
> 
> Then in line 332 the cube decision has to be split:
> 
> if ( iTurn > nLateDecisionsLimit ) {
> 
>    if ( GeneralCubeDecisionE ( aar, aanBoard[ ici ],
>                                pci,
>                                &prc->ecLateDecisions ) < 0 )
>       return -1;
> 
> } else {
> 
>    if ( GeneralCubeDecisionE ( aar, aanBoard[ ici ],
>                                pci,
>                                &prc->aecCube[ pci->fMove ] ) < 0 )
>       return -1;
> 
> }
> 
> 
>    cd = FindCubeDecision ( arDouble, aar, pci );
> 
> 
> The same goes for the chequer play decisions: split the FindBestMove call in
> line 490. I think that should do. (And also the call in line 530, just in
> case the player doesn't use variance reduction.)
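
A minimal sketch of how the two branches above could share one selection
step, so that the cube and chequer call sites need not each duplicate the
if/else. The helper name CubeContextForTurn is hypothetical, and the struct
definitions are cut-down stand-ins for the real ones in eval.h, kept only
so the fragment compiles on its own:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the gnubg types; only the fields used here. */
typedef struct {
    unsigned int nPlies;
} evalcontext;

typedef struct {
    evalcontext aecCube[2];             /* per-player cube settings */
    evalcontext ecLateDecisions;        /* shared settings for late turns */
    unsigned short nLateDecisionsLimit; /* last turn using aecCube */
} rolloutcontext;

/* Hypothetical helper: choose the evalcontext for a cube decision.
 * Turns after nLateDecisionsLimit use the late-decision settings,
 * matching the "iTurn > nLateDecisionsLimit" test proposed above. */
static const evalcontext *
CubeContextForTurn(const rolloutcontext *prc, int iTurn, int fMove)
{
    assert(prc != NULL);
    assert(fMove == 0 || fMove == 1);

    if (iTurn > (int) prc->nLateDecisionsLimit)
        return &prc->ecLateDecisions;

    return &prc->aecCube[fMove];
}
```

With something like this, the call site would pass the returned pointer to
a single GeneralCubeDecisionE call, and an analogous helper could serve the
FindBestMove calls.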
> 
> What about the evalcontext for variance reduction aecVarRedn? Do we have to
> change this in some kind of way as well?

I don't know. I don't really know how variance reduction works, so I
don't really know the effect of this.


> Now, what about the evaluation at the truncation point? Should this
> position be evaluated by the first or the LateDecision evalcontext? How much
> will it cost in rollout speed, and how much will we gain by using the first
> (best) eval context?

If I understand correctly, at the truncation point, no further
rollouts will be done. So this should happen once per rolled out
game. The savings from the late decisions really come from the
(presumably) large number of moves between the starting position and
the truncation point, so I'd assume that you go back to the non-late
parameters for the final evaluation.
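
A rough sketch of that cost argument, counting how many decisions per game
would fall under each set of parameters. The names evalcounts and
CountEvaluations are made up for illustration, and the model is an
assumption: one decision per turn, turns numbered from 0, and a single
evaluation at the truncation point.

```c
#include <assert.h>

/* Per-game tally of which parameters each evaluation would use. */
typedef struct {
    int nEarly; /* decisions made with the normal parameters */
    int nLate;  /* decisions made with the late parameters */
    int nFinal; /* evaluations at the truncation point */
} evalcounts;

/* Hypothetical helper: count decisions for a game truncated after
 * nTruncate turns, switching parameters once iTurn exceeds
 * nLateDecisionsLimit. */
static evalcounts
CountEvaluations(int nTruncate, int nLateDecisionsLimit)
{
    evalcounts ec = { 0, 0, 0 };
    int iTurn;

    assert(nTruncate >= 0);

    for (iTurn = 0; iTurn < nTruncate; iTurn++) {
        if (iTurn > nLateDecisionsLimit)
            ec.nLate++;
        else
            ec.nEarly++;
    }

    /* The truncation point is evaluated once, whatever the limit is. */
    ec.nFinal = 1;

    return ec;
}
```

Under these assumptions, the final evaluation is a fixed single call per
game, while the number of late decisions grows with the truncation depth.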
> 
> For user interface:
> 
> set rollout latedecision eval plies ...
> set rollout latedecision eval cubeful ..
> ....
> set rollout latedecisions limit 7

If anyone can think of a better word than 'late', I'd be happy. I
started it, I know, but I don't think it would be very clear to anyone
what was meant by it.

> ?? OK
> 
> For the GTK GUI:
> An "advanced rollout" button in the dialog. This button opens a new dialog
> where the user can set the limit and the evalcontext.

When I compare the current Analysis single page to the Evaluation and
per-player setups, I find I personally much prefer the single page. If
we do have a toggle for applying this algorithm, then the settings for
the late evaluations simply get greyed out when it is off.

My guess would be that either people know a fair bit about how the
rollout settings interact or they follow someone's suggestions blindly
in picking the settings.


-- 
Jim Segrave           address@hidden




