Re: [Bug-gnubg] TEST RESULT: Variance Reduction


From: Jim Segrave
Subject: Re: [Bug-gnubg] TEST RESULT: Variance Reduction
Date: Mon, 7 Jul 2003 11:29:15 +0200
User-agent: Mutt/1.2.5.1i

On Mon 07 Jul 2003 (07:37 +0000), Joern Thyssen wrote:
> On Sun, Jul 06, 2003 at 11:38:13AM +0200, Jim Segrave wrote
> 
> > I've just been looking at this as part of my implementation of
> > extending rollouts (I have it working for chequer play but not yet for
> > cube decisions, and I need to add some things so that saving a match
> > will save the rollout info so that you can extend a rollout of a saved
> > match). I expect to be committing later this week.
> 
> I don't know if this is related to your problem, but RolloutGeneral and
> BasicCubefulRollout are probably overly complex since they allow the
> input of multiple cubeinfo's. There is no gain in speed from this, so we
> could remove it to make the routines a bit less complex (i.e., avoid all
> the loops over cci).
> 
> The only place where gnubg is called with multiple cubeinfos is
> GeneralCubeDecisionR, which is easily fixed by calling RolloutGeneral
> twice instead.
> 
> I can do the necessary changes, but I wouldn't like to introduce merge
> conflicts into your work so I'll await your "go".

I have got extending cube rollouts working with the current
structure. It looks pretty good so far, but it's not heavily tested
(verifying that a stopped and resumed rollout gives the same result as
simply doing the same rollout on the old code is very
time-consuming). I've been running simple rollouts and comparing the
text mode report of results for exact matches in all the stats.

I think people will like it. I save the internal state of
RolloutGeneral as part of the evalsetup. If you select rollout on a
move or cube decision which has already been rolled out, it temporarily
sets the rollout context from the saved state, but takes the number of
games to be rolled out and any special stop condition (currently only
the stop on std deviation) from the current settings. The rollout then
simply continues. I've tried it with one move where different
alternatives were rolled out with drastically different rollout
settings, each one resumed with the appropriate settings. It should
also preserve the full state of the quasi-random dice setup.
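
A minimal sketch of that merge of saved and current settings, purely for
illustration. The struct and its field names are hypothetical stand-ins,
not gnubg's actual rollout context: everything comes from the saved
state except the number of games and the stop-on-std-deviation
condition, which come from the current settings.

```c
#include <assert.h>

/* Hypothetical, cut-down rollout settings (not gnubg's real struct). */
typedef struct {
    int   nTrials;     /* total number of games to roll out */
    int   fStopOnSTD;  /* stop early on standard deviation? */
    float rStdLimit;   /* threshold for that stop condition */
    int   nPlies;      /* stand-in for all the other settings */
} rollout_settings_sketch;

/* Build the settings used to resume a rollout: start from the saved
 * state, then override only the game count and the stop condition
 * with whatever the user currently has selected. */
rollout_settings_sketch
resume_settings(const rollout_settings_sketch *saved,
                const rollout_settings_sketch *current)
{
    rollout_settings_sketch rs;

    assert(saved != 0 && current != 0);
    rs = *saved;
    rs.nTrials    = current->nTrials;
    rs.fStopOnSTD = current->fStopOnSTD;
    rs.rStdLimit  = current->rStdLimit;
    return rs;
}
```

Because the result is a copy, the user's current settings are left
untouched once the resumed rollout finishes.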

A pleasant side-effect is that the export of rollout results now shows
the actual number of games rolled out.

While I was coding, I was wondering where and how you would get as
many as 16 cubeinfo's in a call, but I just decided to live with
it. Reducing this would save some memory, as extending rollouts
requires keeping the following internal state of the rollout code:

float aarMu[max-cube][rollout-outputs]
float aarSigma[max-cube][rollout-outputs]
float aarVariance[max-cube][rollout-outputs]
float aarResult[max-cube][rollout-outputs]
cubeinfo aciLocal[max-cube]
      where a cubeinfo is about 8 ints and 4 floats
2 other ints (number of games rolled out, nSkip for the quasi-random dice)

max-cube is 16, rollout-outputs is 7

I'm not convinced that aciLocal[] has to be preserved, I think it can
be recreated whenever you resume, but I wanted to be safe.
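
As a rough sketch, the state listed above could be laid out like this in
C. The struct names, the cubeinfo stand-in and the helper function are
hypothetical, chosen only to mirror the list; the real gnubg types
differ in detail. With 4-byte floats the four float arrays alone come
to 4 * 16 * 7 * 4 = 1792 bytes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CUBE        16  /* "max-cube" in the list above */
#define ROLLOUT_OUTPUTS  7  /* "rollout-outputs" in the list above */

/* Hypothetical stand-in for cubeinfo: about 8 ints and 4 floats. */
typedef struct {
    int   anInts[8];
    float arFloats[4];
} cubeinfo_sketch;

/* Hypothetical container for the state needed to resume a rollout. */
typedef struct {
    float aarMu[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarSigma[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarVariance[MAX_CUBE][ROLLOUT_OUTPUTS];
    float aarResult[MAX_CUBE][ROLLOUT_OUTPUTS];
    cubeinfo_sketch aciLocal[MAX_CUBE];
    int nGamesDone;  /* number of games rolled out so far */
    int nSkip;       /* position in the quasi-random dice sequence */
} rollout_state_sketch;

/* Bytes taken by the four float arrays when the first index is nCube. */
size_t float_array_bytes(size_t nCube)
{
    assert(nCube >= 1 && nCube <= MAX_CUBE);
    return 4 * nCube * ROLLOUT_OUTPUTS * sizeof(float);
}
```

sizeof(rollout_state_sketch) gives the full per-evalsetup cost on a
given platform, including aciLocal and the two ints.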

So it's adding just under 2 Kbytes per evalsetup, whether or not it's a
rollout. It would be possible to avoid the overhead for non-rollout
evalsetups, but I didn't want to try this straight away as it requires
a lot of work to ensure that the memory is malloced whenever it's
needed and freed whenever an analysis is re-done or a moverecord is
deleted.

The consequences are:

For a match of say 10 games, 50 moves/game, 10 alternatives analysed
per move (plus a cube eval for every move):

500 moves + 5000 alternatives = 5500 evalsetups, or about 10 MB. So
reducing the first index of those arrays would be quite profitable. I'm
hoping people won't find the overhead too much in general.
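
The arithmetic behind that estimate can be checked with a small sketch.
It assumes 4-byte floats and counts only the four float arrays (aarMu,
aarSigma, aarVariance, aarResult), ignoring aciLocal and the two extra
ints; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Evalsetups in a match: one per analysed alternative, plus one cube
 * evaluation per move. */
size_t count_evalsetups(size_t nGames, size_t nMovesPerGame,
                        size_t nAlternatives)
{
    size_t nMoves = nGames * nMovesPerGame;

    return nMoves * nAlternatives + nMoves;
}

/* Bytes spent on the four float arrays across all evalsetups, for a
 * given first index (nCube) and second index (nOutputs). */
size_t rollout_state_overhead(size_t nEvalsetups, size_t nCube,
                              size_t nOutputs)
{
    assert(nCube > 0 && nOutputs > 0);
    return nEvalsetups * 4 * nCube * nOutputs * sizeof(float);
}
```

With these assumptions, count_evalsetups(10, 50, 10) is 5500 and
rollout_state_overhead(5500, 16, 7) is 9856000 bytes. Dropping the first
index to 1 (a hypothetical value, picked only to show the scale of the
saving) brings that down to 616000 bytes.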

I just have to finish working on sgf.c so I can save and restore
rollout internal state. I expect to commit somewhere between tomorrow
and Friday.

I then want to look at a couple of possible extra uses:

When rolling out several moves from the analysis window, instead of
doing the first move to completion before starting the next one,
roll out one or more games of the first move, then do the same for the
second, third, etc. This allows people to know if a rollout to compare
two moves is going to be pointless (one move is so clearly
better/worse) without having to wait until one move is completely
rolled out and the other has been going for a while.
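
One way to picture that interleaving is a simple round-robin loop. This
is only a sketch under assumed names: move_progress and
rollout_interleaved are hypothetical, and the actual rolling out of
games is reduced to a counter update.

```c
#include <assert.h>

/* Hypothetical per-move progress record. */
typedef struct {
    int nGamesDone;  /* games rolled out so far for this move */
} move_progress;

/* Advance every unfinished move by up to nChunk games per pass until
 * all moves have reached nTarget games. Returns the number of passes. */
int rollout_interleaved(move_progress *moves, int nMoves,
                        int nTarget, int nChunk)
{
    int nPasses = 0;

    assert(nMoves > 0 && nTarget > 0 && nChunk > 0);
    for (;;) {
        int fAnyWork = 0;

        for (int i = 0; i < nMoves; i++) {
            int nLeft = nTarget - moves[i].nGamesDone;
            int n;

            if (nLeft <= 0)
                continue;  /* this move is already complete */
            n = nLeft < nChunk ? nLeft : nChunk;
            /* a real implementation would roll out n games here */
            moves[i].nGamesDone += n;
            fAnyWork = 1;
        }
        if (!fAnyWork)
            break;
        nPasses++;
    }
    return nPasses;
}
```

After every pass all moves have the same number of games behind them,
so intermediate results can be compared on an equal footing.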

The second would be that, when rolling out a list of moves, after
doing some minimum number of games for each candidate, compute the
joint standard deviation of the equity differences of each pair of
moves and stop rolling out moves where that difference is greater than
some user-selected number of standard deviations. Continue until only
one move is left, then stop the rollout. Included in this would be
noting if the remaining candidates' equities change enough that one of
the moves which has been stopped might be a candidate again; in that
case, roll out that move until it has caught up. This is to address the
cases where someone wants to know "which of the following moves is the
best (within a 95% confidence)?"
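
A minimal sketch of such a stopping test follows. It makes several
assumptions that are not in the proposal itself: the two rollouts are
treated as independent, so the joint variance of the difference is the
sum of the two squared standard errors; each candidate is compared only
against the current best move rather than against every other move; and
all type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical per-candidate rollout summary. */
typedef struct {
    float rEquity;  /* mean equity so far */
    float rStdErr;  /* standard error of that mean */
    int   fActive;  /* nonzero while the move is still being rolled out */
} candidate;

/* Nonzero if a trails b by more than rLimit joint standard deviations.
 * Squared values are compared, so no square root is needed. */
int clearly_worse(const candidate *a, const candidate *b, float rLimit)
{
    float rDiff = b->rEquity - a->rEquity;
    float rJointVar = a->rStdErr * a->rStdErr + b->rStdErr * b->rStdErr;

    return rDiff > 0.0f && rDiff * rDiff > rLimit * rLimit * rJointVar;
}

/* Recompute fActive for every candidate against the current best move.
 * Returns the number of candidates that remain active. */
int prune_candidates(candidate *c, int n, float rLimit)
{
    int iBest = 0, nActive = 0;

    assert(n > 0 && rLimit > 0.0f);
    for (int i = 1; i < n; i++)
        if (c[i].rEquity > c[iBest].rEquity)
            iBest = i;
    for (int i = 0; i < n; i++) {
        c[i].fActive = (i == iBest)
            || !clearly_worse(&c[i], &c[iBest], rLimit);
        nActive += c[i].fActive;
    }
    return nActive;
}
```

Since fActive is recomputed on every call, a stopped move becomes active
again if the equities shift in its favour. Catching it up on games
before comparing again is left out of this sketch.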


-- 
Jim Segrave           address@hidden




