|
From: | John Washburn |
Subject: | [gnugo-devel] Regression test suites |
Date: | Thu, 28 Jul 2005 07:55:39 -0600 |
I would be interested in the regression test suite
improvement/expansion. I have done software testing for 11 years and have
the CSQE certification from ASQ. As for the super KO task on the tactical list, would using (hashes/message
digests) to encode the board position of a given move be adequate. What I envision is this:
i.
361 O’s for no handicap stones,
ii.
60 O’s, a B, 239 O’s, a
B, and 60 O’s if 2 handicap stones are placed by Japanese rules
i.
I will use 512-bits (SHA512) in this
example.
ii.
This is represented in memory as an
array of 1. 64, 8-bit
bytes, 2. 32, 16-bit words, 3. 16, 32-bit double
words 4. 8, 64-bit
quad words
iii.
Since the number is used internally
by the program, there should be no interoperability issues or even little/big-endian
issues.
The super Ko rule then is this:
i.
If different, Ko has certainly NOT
been violated. Return NO_SUPER_KO.
ii.
If same , continue to next pair array
elements.
There are only 3^361 possible board patterns. This is less
than 2^573. Thus, SHA512 with 512-bits should prove an efficient substitute for
encoding the whole-board pattern of any move. But, if the fact that there
are 2^61 of the possible 3^361 board patterns which “fold” into the
same 512-bit message digest, then create 2, SHA512 hashes. One hash is
for the original board string and the second hash is for the lexical reversal
of the board string. Use strrev() for example. This reversal for a bouble
hash will only be effective if the hash algorithm is cryptographically sound.
CRC’s would not work since the CRC of message and the CRC of the reversal
is (by design) the same. The reason to use a sound, cryptographic hash algorithm is
that the: 1. The hash is
designed to be computationally efficient. 2. The hash is
designed to minimize collision (2 different messages yields the same hash
value) 3. The hash is
designed to maximize diffusion. (A single bit change anywhere in the board
pattern is nearly equi-likely to change a given bit in the hash value).
This increases the likelihood that a move anywhere on the board affected a bit
in the first 32-bits of the hash. Thus, the difference of board position can
be detected quickly. In summary: Two, 512-bit hashes (128, 8-bit bytes) can encode any board
position without loss of information. A single 512-bit hash (64, 8-bit bytes) can encode any board
position with some loss of information; on the order of 1 part in 2^512. A single 256-bit hash (32, 8-bit bytes) can encode any board
position with some loss of information; on the order of 1 part in 2^256. I recommend the double SHA512 approach. This is a
total memory usage of 35,200 bytes for the double hashes of a game consisting
of 275 moves. I do not know if this storage requirement is too large for
GNU Go. Hope this helps. In John Washburn |
[Prev in Thread] | Current Thread | [Next in Thread] |