swarm-support

Re: Robustness Check


From: Theodore C. Belding
Subject: Re: Robustness Check
Date: Thu, 8 Jul 1999 16:44:09 -0400 (EDT)

On Thu, 8 Jul 1999, Jan Kreft wrote:

> Regarding #1 and #2, it would be better to start the program only once
> with one seed and do all 100 runs in a sequence. This way, you avoid the
> possibility of overlap of the sequence of numbers you draw from the random
> number generator. Also, you have only one file with all results that you
> can analyse in one go.

In theory, this would be ideal. In practice, it isn't so important. It's
far more important to use a good, well-tested random number generator
(RNG) that has a long enough period before it starts to repeat (say at
least 10000 times as long as the maximum total number of calls that you're
going to make for all of the runs). You can generally just seed the RNG
for each run from the current time, or something similar (make sure that
each seed that you generate this way is different and in the allowed range
of seeds for your generator!).
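As a minimal sketch of the seeding advice above (not part of any Swarm
API): the function name distinct_seeds and the default range 1..2**31-2
are assumptions chosen for illustration, so substitute the range your own
generator allows. It starts from the clock and skips any value already
handed out, so every run gets a different seed.

```python
import time

def distinct_seeds(n_runs, base=None, max_seed=2**31 - 2):
    """Return n_runs different seeds, each in the range 1..max_seed.

    base defaults to the current time in seconds; pass a fixed value
    to make the list reproducible. max_seed is an assumed limit.
    """
    if n_runs > max_seed:
        raise ValueError("more runs than available seeds")
    if base is None:
        base = int(time.time())
    seeds, seen = [], set()
    candidate = base
    while len(seeds) < n_runs:
        seed = candidate % max_seed + 1   # map into 1..max_seed
        if seed not in seen:              # never hand out a seed twice
            seen.add(seed)
            seeds.append(seed)
        candidate += 1
    return seeds
```

Record the seeds alongside the results so that any run can be reproduced
later.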

As long as the initial state of the random number generator is simply its
initial seed and the generator is deterministic, then you can easily
calculate an upper bound on the probability that two sequences with
different initial seeds will overlap: It's simply the probability that one
of the seeds falls within the sequence of random numbers that begins with
the other seed. Assuming the seeds land at effectively random positions in
the generator's cycle, an upper bound on this probability is two times the
maximum number of calls that you make to the generator in one run, divided
by the period of the random number generator. Now multiply that
probability by the square of the number of runs (a generous overcount of
the number of pairs of runs). The result is an upper bound on the
probability that you will have at least one pair of overlapping runs. If
this second probability is large (larger than say 0.01), then you should
use a different generator with a longer period.
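The arithmetic can be sketched in a few lines. The function name
overlap_bound and the example numbers (10 million calls per run, 100 runs,
and the two periods) are assumptions for illustration only; plug in the
figures for your own model and generator.

```python
def overlap_bound(calls_per_run, n_runs, period):
    """Upper bound on the chance that at least one pair of runs overlaps.

    Pairwise bound: 2 * calls_per_run / period.
    Multiplying by n_runs squared overcounts the pairs, so the result
    is still an upper bound. It is capped at 1.0 because it is a
    probability.
    """
    pair_bound = 2 * calls_per_run / period
    return min(1.0, pair_bound * n_runs ** 2)

# Assumed example: a generator with a period near 2**31 versus one
# with a period near 2**61, for 100 runs of 10 million calls each.
short_period = overlap_bound(10**7, 100, 2**31 - 2)
long_period = overlap_bound(10**7, 100, 2**61)
print(short_period, long_period)
```

With these assumed numbers the short-period bound is useless (it hits the
cap), while the long-period bound is far below the 0.01 threshold.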

So if the RNG period is long enough, you don't have to worry too much. And
if your results are so shaky that one pair of overlapping runs makes a
difference, then you shouldn't be publishing them anyway. :)

If you're *really* worried about this, then you should test your RNG by
seeding it 100 times, producing 100 sequences of random numbers that are
as long as the sequences that your program uses, and checking that there's
no significant correlation between the sequences.  In fact, you should
check your RNG output even if you're only seeding the RNG once: Simply
continuing each run from where the previous run left off won't guarantee
independent sequences if your RNG is broken. And it's always good to
replicate your
results using a different type of RNG and a different set of seeds.
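A rough sketch of such a check, using Python's built-in generator as a
stand-in for whatever RNG your model uses. The helper names pearson and
max_pairwise_correlation are hypothetical, and a pairwise Pearson
correlation is only one simple test; it will not catch every defect.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def max_pairwise_correlation(seeds, length):
    """Generate one sequence per seed and return the largest absolute
    correlation found, together with the pair of seeds that produced it."""
    seqs = []
    for s in seeds:
        rng = random.Random(s)            # stand-in for your own RNG
        seqs.append([rng.random() for _ in range(length)])
    worst, worst_pair = 0.0, None
    for i in range(len(seeds)):
        for j in range(i + 1, len(seeds)):
            r = abs(pearson(seqs[i], seqs[j]))
            if r > worst:
                worst, worst_pair = r, (seeds[i], seeds[j])
    return worst, worst_pair
```

For independent sequences of length n, the correlations should scatter
around zero with a spread of roughly 1/sqrt(n), so a value several times
larger than that deserves a closer look.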

By the way, in some random number generators, you can easily jump n
numbers ahead in a sequence. So you can just choose one seed and jump k *
n numbers ahead from the seed at the beginning of each run, where k is a
different integer for each run and n is the maximum number of times the
generator is called in a single run. That will guarantee that each run
uses a non-overlapping sequence of random numbers. One such generator is:

L'Ecuyer, P., and S. Cote. (1991). Implementing a random number package
with splitting facilities.  ACM Transactions on Mathematical Software
17(1):98-111.
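To illustrate the jump-ahead idea only, here is a sketch using a simple
multiplicative congruential generator (multiplier 16807, modulus 2**31-1).
This is not the L'Ecuyer and Cote package itself, and the function names
are made up for the example. Jumping n steps ahead just means multiplying
the state by the multiplier raised to the nth power, modulo the modulus.

```python
M = 2**31 - 1    # modulus of the example generator
A = 16807        # multiplier of the example generator

def step(x):
    """Advance the generator state by one call."""
    return (A * x) % M

def jump(x, n):
    """Advance the state by n calls at once, using modular
    exponentiation instead of n separate steps."""
    return (pow(A, n, M) * x) % M

def run_start_states(seed, n_runs, calls_per_run):
    """Starting state for each run: run k begins k * calls_per_run
    steps after the seed. The seed must be in 1..M-1."""
    return [jump(seed, k * calls_per_run) for k in range(n_runs)]
```

The runs are non-overlapping only while n_runs * calls_per_run stays
below the period of the generator, which is yet another reason to prefer
a generator with a long period.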

Having all of the results in one file might be easier in some situations,
but it's easy to write a Perl script to collate results from many separate
files into one file and reformat them as needed.
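The same collation is just as short in other scripting languages. The
sketch below uses Python rather than Perl, and assumes (purely for
illustration) that each run wrote one value per line to its own text
file; the function name collate and the output columns are hypothetical.

```python
import csv
from pathlib import Path

def collate(run_files, out_file):
    """Merge per-run result files into one CSV file.

    Each output row holds the source file name, the line number within
    that file, and the text of the line. Blank lines are skipped.
    Returns the number of data rows written.
    """
    rows = 0
    with open(out_file, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["run", "line", "value"])
        for path in sorted(run_files):
            with open(path) as f:
                for lineno, line in enumerate(f, 1):
                    line = line.strip()
                    if line:
                        writer.writerow([Path(path).name, lineno, line])
                        rows += 1
    return rows
```

Keeping the file name in every row means the combined file can still be
split back into individual runs during analysis.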

Finally, you don't have to do your analysis manually if you're using Unix:
You can write a script that formats your results correctly and analyzes
them in Mathematica or whatever. You can do the same thing on the Mac
with MacPerl, Applescript, and Excel. If you're using Excel on Windows,
you might be able to do the same thing with Visual Basic, Perl, or some
other scripting language; I don't know (and don't want to know :) 
-Ted

--
Ted Belding                              address@hidden 
University of Michigan Center for the Study of Complex Systems
Homepage: http://www-personal.umich.edu/~streak/
PGP key:  http://www-personal.umich.edu/~streak/pgp-key.html







