[gnugo-devel] twin endgame match

From:	alain Baeckeroot
Subject:	[gnugo-devel] twin endgame match
Date:	Fri, 3 Mar 2006 14:38:29 +0100
User-agent:	KMail/1.9.1

Hi

Following Arend's advice, gg378 and twin-378 played an 85-game endgame match:
- twin: 26 wins (1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 3 5 7 10 14 15 21 25 28)
- GNU Go: 14 wins (-9 -3 -3 -3 -2 -2 -2 -1 -1 -1 -1 -1 -1 -1)
- 45 unchanged
The sum is +135, and the average over 85 games is +1.6.
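
For anyone who wants to re-derive the tally, here is a minimal sketch. It assumes the numbers in parentheses are per-game margins from the twin's point of view, copied exactly as typed in the two lists above; the names (`summarize`, `TWIN_WINS`, ...) are made up for illustration and are not part of GNU Go. The totals it prints depend entirely on the lists as typed, so they can differ from the quoted sum if a margin was lost when the lists were written out.

```python
# Margins copied verbatim from the two lists in the message.
TWIN_WINS = "1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 3 5 7 10 14 15 21 25 28"
GNUGO_WINS = "-9 -3 -3 -3 -2 -2 -2 -1 -1 -1 -1 -1 -1 -1"
UNCHANGED = 45  # games whose result did not change


def summarize(twin_wins, gnugo_wins, unchanged):
    """Count games and compute the net and average margin (hypothetical helper)."""
    twin = [int(x) for x in twin_wins.split()]
    gnugo = [int(x) for x in gnugo_wins.split()]
    games = len(twin) + len(gnugo) + unchanged
    net = sum(twin) + sum(gnugo)  # gnugo margins are already negative
    return {
        "twin_games": len(twin),
        "gnugo_games": len(gnugo),
        "games": games,
        "net": net,
        "average": net / games,
    }


summary = summarize(TWIN_WINS, GNUGO_WINS, UNCHANGED)
for key, value in summary.items():
    print(key, value)
```

Keeping the lists as plain strings makes it easy to paste in a corrected or per-boardsize list and rerun.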

_But_ when one looks at the attached plot of cumulative +PASS -FAIL versus 
game_status, the twin fails a lot of endgame tests (game_status > 0.85). It is 
already a huge task to check the big failures, and I feel too lazy to investigate 
these 40 tests and more than 50 regressions in the endgame (and I am a very bad 
yose player ;-).

By construction, the twin "knows" exactly how gg378 evaluates the game, so 
the twin may steal a big point before gg378 plays it, but it is still 
GNU Go logic. So I wonder whether this endgame match is significant or 
just reflects a systematic error.

In other words, a reliable endgame comparison should involve another engine, 
good at endgame, and compare the results of both against that reference 
engine.

Am I right, or just paranoid?
Is there such an engine available?

- Alain

PS: the plot includes all board sizes; it is not so flat when separating them, 
but I did too much clean-up and erased the results, so ... I am rerunning the 
regression tests :(

twin4-d1.5_cumul+P-F_vs_gstatus.png
Description: PNG image
