lynx-dev
[Top][All Lists]

## Re: lynx-dev Options, V.Links, random

 From: Philip Webb Subject: Re: lynx-dev Options, V.Links, random Date: Tue, 21 Dec 1999 13:38:01 -0500 (EST)

```991221 Doug Kaufman emulated Joan of Arc:
> On Tue, 21 Dec 1999, Doug Kaufman wrote:
>> Correct formula for chance of duplicate values (20 from pool of 10000)
>> 1 - ( 9999! / (9980! * 10E76 * 20!))
>> It certainly is humbling to be so wrong in public.
> I guess that I should also learn not to reply after midnight
> when I haven't had much sleep the night before.
> Now that I have had some sleep,
> I believe that the first equation I posted was correct.
> Look at a reductionist case, 3 draws out of a pool of 5.
> After one draw the result is always unique.  After a second draw,
> there is 4/5 chance that it is still unique, regardless of order.
> After a third draw, the chance that it is still unique is (4/5)*(3/5).
> The total number of combinations, drawing 20 out of 10000 is (10E80 /20!).
> The number of unique combinations ditto is (10000! / (9980! * 20!)).
> Hence the chance of a unique combination is:
> ((10000! / (9980! * 20!)) / (10E80 / 20!))
> which reduces to: (10000! / (9980! * 10E80)) or (9999! / (9980! * 10E76))
> So the chance of a duplicate filename is: 1 - (9999! / 9980! * 10E76)
> It looks like the chance of a unique permutation is the same as the chance
> of a unique combination.  Any corrections to this would be appreciated.

like JoA, you are correct.  the classic text for statistical theory is
William Feller's `An Introduction to Probability Theory & Its Applications',
originally written c 1950, which i have in a Wiley paperback.
our problem is basically the same as (a, c, d) on pp 31-3 ,
the last being the famous birthday surprise,
ie if 23 people are at a party, it's more likely than not
that  >= 2  of them will have the same birth date.

in our case, the probability (p) that all filenames will be different is
(10000)20 / 10000^20 ,
where `^' is the power symbol & `(10000)20' is shorthand for
10000 x (10000-1) x ... x (10000-(20-1)) ;
p  reduces to  10000/10000 x 9999/10000 x ... x 9981/10000 ,
ie  0,9999 x 0,9998 x ... x 0,9981 = 0,981167233 (exactly) ;
so the probability of getting a clash is  0,018832767 (exactly) .

--
========================,,============================================
SUPPORT     ___________//___,  Philip Webb : address@hidden
ELECTRIC   /] [] [] [] [] []|  Centre for Urban & Community Studies
TRANSIT    `-O----------O---'  University of Toronto
```