coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Generating pseudo-random integers


From: Jim Meyering
Subject: Re: Generating pseudo-random integers
Date: Sat, 05 Feb 2011 15:13:22 +0100

Pádraig Brady wrote:
> On 05/02/11 07:51, Jim Meyering wrote:
>> Melikamp T. Medley wrote:
>>
>>> Thanks!
>>>
>>> I think I want to write a utility that prints pseudo-random
>>> integers (I have in CL, but I like it fast, so this time in C),
>>> and link it with coreutils. I looked at Paul Eggert's work,
>>> and I won't be able to do any better than that if I spend
>>> years on it :) I want to print uniformly distributed integers
>>> fast, formatted, in bulk if needed. And I want to call it roll.
>>> Let me know if you have a word of advice.
>>>
>>> On 02/04/2011 11:35 PM, Eric Blake wrote:
>>>> Using just coreutils, though, I'm afraid your options are slim.  You
>>>> could use 'seq 0 9 | sort -R | head -n1' (or use shuf instead of sort
>>>> -R) to generate one random number at a time, but that gets expensive.
>>>> Or, if you don't mind limiting yourself to 62**6 values (instead of the
>>>> more traditional 2**32), you could do 'mkdir tmp; mktemp -u -p tmp
>>>> XXXXXX'.  But that's the extent of coreutils' randomness exposed to the
>>>> user.
>>
>> It's hard to justify writing a new program in C
>> when you can do the job with a Perl one-liner:
>>
>>   $ perl -le 'foreach (1..20) { print int rand 1000 }'
...
>> However, adding an *option* to shuf might make sense,
>> since you can already give shuf both a range and a count
>> of how many numbers to print.
>
> Sorry for being dense, but what option shuf is needed? formatting?

shuf -i A-B prints a permutation of a set of integers.
You can make it print a subset of that permutation, but it's
still a subset of a permutation.  I.e., when the input is a range,
the same number cannot appear more than once.

A new option would enable shuf to print N randomly selected
elements, but with repetition.  Hence N could be larger than B-A+1,
(in general, N could be larger than the number of inputs)
and the output is no longer a permutation or subset of one.
Since it doesn't need to save state, it can be more efficient.

> Though on trying shuf to generate the numbers I notice:
>
> $ shuf -i0-$((2**31)) -n1
> 1241475845
> $ shuf -i0-$((2**31)) -n2
> shuf: memory exhausted

For small N and large B-A, that code should be improved.
It certainly doesn't need to allocate so much space.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]