
[gnugo-devel] Pattern tuning and regression tests


From: Arend Bayer
Subject: [gnugo-devel] Pattern tuning and regression tests
Date: Tue, 18 Dec 2001 18:00:59 +0100 (CET)

I wonder about a good way to judge the effectiveness of pattern tuning
patches. A quick statistical comparison shows that regression tests
are not enough: currently, we have 1247 patterns in fuseki.db,
patterns.db and patterns2.db alone. OTOH, there are 944 regression tests
using some genmove command. Since usually a move is suggested by maybe 2
or 3 patterns, this means that, on average, each pattern suggests maybe
2 moves in the whole regression test suite that finally get selected.
(This does not yet take into account that certainly many of the test
cases involve decisions relying mainly on owl/tactical reading.)
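
To spell out the arithmetic behind that estimate, here is a small Python
sketch. The counts are the ones quoted above; the "2 or 3 patterns per
move" figure is only the rough guess from the text, and the function name
is made up for illustration.

```python
def selected_moves_per_pattern(genmove_tests, patterns_per_move, total_patterns):
    """Average number of selected moves each pattern contributes to,
    assuming every genmove test yields one selected move and the
    contributions are spread evenly over all patterns."""
    return genmove_tests * patterns_per_move / total_patterns


# 944 genmove tests and 1247 patterns are the counts quoted in the text;
# 2 and 3 are the guessed numbers of patterns suggesting each move.
for patterns_per_move in (2, 3):
    avg = selected_moves_per_pattern(944, patterns_per_move, 1247)
    print(f"{patterns_per_move} patterns per move: "
          f"about {avg:.2f} selected moves per pattern")
```

Both values land around 2, and the even-spread assumption is generous:
in practice a few patterns probably account for many of the tests, so
most patterns would be covered even less.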

And, if the pattern is really involved in the final decision of genmove,
the test cases are far more likely to be situations where the pattern
produces a bad move than ones where it is needed to find a good move.

Hence, if one takes a single test case and modifies a pattern to solve
this case, one will hardly ever get negative feedback, even if the
change would actually result in a decrease of GNU Go's playing strength.
So we rely mostly on subjective opinions on whether a pattern is useful;
I guess the idea behind the regression suite should be to allow a more
objective opinion on such matters.

Instead, I wonder whether it might make sense to have an automatically
maintained database that would just record, for each pattern, which moves
it suggested in (say) GNU Go's NNGS games.  Maybe it would be sufficient
to only record those moves that actually got selected. At least for
specializing and/or removing patterns, this would allow an easy check
whether the pattern is really unnecessary. OTOH, newly added/generalized
patterns could be checked after a while to see whether they actually get
used in "real life".
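
A minimal Python sketch of what such a database could look like, just to
make the idea concrete. Everything here is an assumption: GNU Go has no
such logging hook that I know of, and the function names, record layout
and pattern names are invented for the example. It stores only the moves
that actually got selected.

```python
from collections import defaultdict


def new_usage_db():
    """Map pattern name -> list of (game_id, move_number, move) records."""
    return defaultdict(list)


def record_selected_move(db, game_id, move_number, move, pattern_names):
    """Record one selected move under every pattern that suggested it."""
    for name in pattern_names:
        db[name].append((game_id, move_number, move))


def unused_patterns(db, all_pattern_names):
    """Patterns that never contributed to a selected move so far."""
    return sorted(name for name in all_pattern_names if not db.get(name))


# Hypothetical data: pattern names, game id and moves are made up.
db = new_usage_db()
record_selected_move(db, "nngs-0001", 12, "Q16", ["PatA", "PatB"])
record_selected_move(db, "nngs-0001", 14, "D4", ["PatA"])

print(len(db["PatA"]), "selected moves involved PatA")
print("never used:", unused_patterns(db, ["PatA", "PatB", "PatC"]))
```

Before specializing or removing a pattern one would look at its recorded
moves; for a newly added or generalized pattern one would check after a
while whether it still shows up among the unused ones.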

Has this been discussed before on this list? 

-Arend




