Re: [Bug-gnubg] License of training data on ftp.

From: Philippe Michel
Date: Wed, 12 Oct 2016 22:52:30 +0200 (CEST)
Øystein Schønning-Johansen wrote:

Do we consider the training data on the ftp site under any license, or do we consider the training data public domain?

In case someone feels ownership of the data, please state your opinion.

It may depend on what you mean by "ftp site". The positions at ftp://ftp.demon.nl/pub/Museum/Demon/games/gnubg/nn-training/training_data/ or http://files.gnubg.com/media/nn-training/nl20040621/training_data/ are from Joseph Heled. The output values are probably 2ply evaluations by the previous iteration of the nets.

At http://files.gnubg.com/media/nn-training/pmichel/training_data/, about 49% of the training positions are the same as above, from Joseph Heled's work in the mid-2000s; another 49% are the same positions with the other player on roll; and the last 1-2% are those I added. The output values are rollouts.

I don't "feel ownership" of anything there.

I'm asking because I guess we could try uploading the training dataset to Kaggle, and maybe some of the experts there can take a shot at it and provide some insight. Good idea?

In case you think this is a good idea or don't mind in any way, I will volunteer to convert the data to a standard format suitable for kagglers, and write up an information document.

I'm not sure how they could get something out of that in the current format, with the key for each row being an obscure string. Would converting this to a suitable format mean turning it into something that looks more like a backgammon board?
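
To make the question concrete, here is a rough sketch of what "something looking more like a backgammon board" could mean. It assumes, purely for illustration, that the row key were a standard 14-character gnubg Position ID as described in the gnubg manual; the keys in the training files may well use a different encoding, in which case only the general idea carries over. The function name decode_position_id is made up for this example, and which of the two returned lists belongs to the player on roll should be checked against the manual.

```python
import base64


def decode_position_id(position_id):
    """Decode a 14-character gnubg Position ID into two lists of 25
    checker counts (24 points followed by the bar), one list per player.

    Assumption: the key is a standard gnubg Position ID. The key used
    in the training data files may be encoded differently.
    """
    # 14 base64 characters plus "==" padding decode to 10 bytes (80 bits).
    raw = base64.b64decode(position_id + "==")
    # Bits are read least-significant first within each byte.
    bits = [(byte >> i) & 1 for byte in raw for i in range(8)]

    players, counts, run = [], [], 0
    for bit in bits:
        if bit:
            run += 1            # one more checker on the current point
        else:
            counts.append(run)  # a 0 closes the current point
            run = 0
            if len(counts) == 25:
                players.append(counts)
                counts = []
                if len(players) == 2:
                    break       # remaining bits are padding
    return players


first, second = decode_position_id("4HPwATDgc/ABMA")  # starting position
print(first)
print(second)
```

Each list could then become 25 integer columns per player, which is probably easier for someone without gnubg knowledge to work with than the raw key.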
