
RE: [Bug-gnubg] Is it time for Gnubg 0.15? Improving the evaluation func


From: Ian Shaw
Subject: RE: [Bug-gnubg] Is it time for Gnubg 0.15? Improving the evaluation function
Date: Tue, 18 Jul 2006 12:48:43 +0100

Joseph Heled wrote on 18 July 2006 10:59

> On 7/18/06, Jonathan Kinsey <address@hidden> wrote:
> > Ian Shaw wrote:
> > >
> > > I think we've rested on our laurels long enough, and it's about
> > > time we started trying to improve the playing strength of our
> > > favourite bot.
> > >
> > > I can think of several ways where we might seek to make
> > > improvements:
> > >
> > > A) Speed up the evaluation function so gnubg can search faster,
> > > and maybe deeper.
> > > B) Improve the evaluation function by changing the neural net
> > > inputs or hidden nodes.
> >
> > Is having more neural nets a good idea?  The race net does seem
> > nearly perfect, the crashed net is quite specialised,
>
> Yes but was developed by isolating a case where gnubg was provably
> not playing well.
>
> > this seems to leave a lot
> > of positions for the contact net (the vast majority I guess).  If we
> > split the contact positions up into several/lots of different
> > categories (e.g. back games, holding games, prime positions) would
> > this produce a stronger bot?
> >
> > I've deliberately side-stepped how you would exactly define these
> > types of positions
>
> I am glad you explicitly mention this because it is the whole crux
> of the matter. Past experience tells us that classifying positions
> in groups whose graph contains cycles gets very tricky when it comes
> to getting the nets to work well together. While it may work,
> previous attempts to do so have failed.
>
> > and also the work involved...  Just wondered if it was a direction
> > worth considering?
> >

I think Snowie have adopted this approach - using seven nets. I can't
remember where I saw this information, though. 

If we could split the nets, I would try using fuzzy logic to smooth out
the gaps at the edges. A position could be evaluated as being 60% prime,
35% holding, 5% race. The evaluations from those nets would then be
combined proportionally to give an overall equity. Maybe Snowie already
does this.
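
A minimal sketch of the proportional blending described above, assuming
each specialised net has already produced a scalar equity and some
classifier has produced a non-negative membership weight per class. The
name blend_equity and the flat-array layout are hypothetical; nothing
here is taken from gnubg or Snowie.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: combine per-class equities using fuzzy
 * membership weights (e.g. 0.60 prime, 0.35 holding, 0.05 race).
 * The weights are normalised here, so they need not sum to 1. */
double blend_equity(const double *weights, const double *equities, size_t n)
{
    double total = 0.0; /* sum of the weights */
    double sum = 0.0;   /* weighted sum of the equities */
    size_t i;

    for (i = 0; i < n; i++) {
        assert(weights[i] >= 0.0);
        total += weights[i];
        sum += weights[i] * equities[i];
    }
    assert(total > 0.0); /* at least one class must apply */
    return sum / total;
}
```

With weights {0.60, 0.35, 0.05} and equities {0.20, -0.10, 0.50}, the
blended equity is 0.12 - 0.035 + 0.025 = 0.11. Each class with a
non-zero weight still costs one full net evaluation.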

However, it should be pointed out that this would mean evaluating each
position using a number of networks. This would slow the evaluation
function down, not to mention the overhead of defining the position
class and combining the weighted results.

One of the advantages of Gnubg over Snowie is that it is much faster. It
is also just as strong as Snowie even though it uses one network for
about 80% of positions.

My conclusion is that it isn't worth the effort. I would rather
concentrate on improving the one neural net. I've started looking at the
evaluation function CalculateHalfInputs, to try to understand what the
current inputs are, and where we might add some more. (It's a hell of a
way to learn C!) I understand the raw board encoding, it's the
hand-crafted features I'm looking at.

Does anyone have any idea why those particular inputs were chosen, what
they are intended to model, and how effective they are? Some of them are
obvious or well commented, but others at the bottom of the enum are
undocumented.

I also see that some of the inputs, A_CONTAIN and I_CONTAIN, also have
the same value squared. I'm curious as to why.

To try to understand it, I set up some positions in gnubg and looked at
the evaluation output. However, most of the values seem to be zero. What
is going on? For example, BREAK_CONTACT always seems to be zero, even
when there is contact.

I have the idea of trying to encapsulate the concepts from Robertie's
"Modern Backgammon" into the new inputs. These are connectivity,
flexibility, robustness and non-commitment. Some of these may already be
covered, under another name, in the network inputs, which is one of the
reasons I am trying to understand what is there now. They may also be
implicit in the raw board encoding. For example,
Connectivity: Might be measured as the sum of the distances from each
chequer to the next one in front of it.
Flexibility: Might measure the number of point-making rolls for the
spares and blots. 
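
One possible reading of the connectivity measure above, as a sketch
only. It assumes a hypothetical layout where board[i] holds the number
of one player's chequers on point i+1, with lower points further
forward, and it charges every chequer on a point the distance to the
nearest occupied point strictly in front. The bar is ignored. This is
not an existing gnubg input.

```c
#include <assert.h>

#define NUM_POINTS 24

/* Hypothetical connectivity input: for each chequer, the distance in
 * pips to the nearest occupied point in front of it, summed over all
 * chequers. Chequers on the frontmost occupied point contribute 0. */
int connectivity(const int board[NUM_POINTS])
{
    int total = 0;
    int front = -1; /* nearest occupied point seen so far, -1 if none */
    int i;

    for (i = 0; i < NUM_POINTS; i++) {
        assert(board[i] >= 0);
        if (board[i] == 0)
            continue;
        if (front >= 0)
            total += board[i] * (i - front);
        front = i;
    }
    return total;
}
```

For the opening position (5 on the 6 point, 3 on the 8, 5 on the 13, 2
on the 24) this gives 3*2 + 5*5 + 2*11 = 53. A lower value means the
chequers are closer together.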

Maybe I_MOBILITY and I_MOMENT cover these concepts, though.


If new inputs are developed, would it be useful to test them first on
the pruning net? As I understand it, the pruning net uses the same
inputs as the main net, but only five hidden nodes. I assume it's much
quicker to train.
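
A rough way to see why a five-node net should be cheap, assuming a
fully connected net with a single hidden layer. The function name
mac_count and the sizes used in the example are illustrative
assumptions, not figures taken from the gnubg source.

```c
#include <assert.h>

/* Hypothetical cost estimate: multiply-accumulate operations for one
 * forward pass through a fully connected net with one hidden layer
 * (input->hidden weights plus hidden->output weights). */
long mac_count(int n_inputs, int n_hidden, int n_outputs)
{
    assert(n_inputs > 0 && n_hidden > 0 && n_outputs > 0);
    return (long)n_inputs * n_hidden + (long)n_hidden * n_outputs;
}
```

With, say, 250 inputs and 5 outputs, 128 hidden nodes cost 32640
operations per evaluation against 1275 for 5 hidden nodes, so the cost
is roughly proportional to the hidden-node count. This ignores the cost
of computing the inputs themselves.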

On the subject of hidden nodes, how much work has been done on
optimising their number? Is there any mileage here?

-- Ian









