[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Shuffling elements in a dataset (fwd)
From: |
Ted Harding |
Subject: |
Shuffling elements in a dataset (fwd) |
Date: |
Sat, 22 Feb 1997 15:44:40 +0000 (GMT) |
( Re Message From: John W. Eaton )
>
> On 21-Feb-1997, Ted Harding <address@hidden> wrote:
>
> | I find that something like
> |
> | [dummy,ix] = sort(rand(1,rows(x))); new_x = x(ix,:);
> |
> | seems pretty fast. (0.04 secs for 10000 rows, 0.05 secs for 100000 rows,
> | or for 1000000, on a 386-DX/25MHz; 0.003 secs for 10000 rows, 0.004 secs
> | for 100000 rows, or for 1000000, on Pentium-120, i.e. almost independent
> | of number of rows. However for 10000000 rows it starts swapping and takes
> | a while (48 MB RAM)). Above timings for 1 column only; reduce sizes pro
> | rata for extra columns (RAM limit).
>
> Hmm. Here is what I get (counting down to minimize memory problems):
>
> for n = [[8, 6, 4, 2]*1e5, 10.^[5, 4, 3, 2, 1]]
> x = rand (n, 1);
> t = cputime ();
> [dummy,ix] = sort(rand(1,rows(x)));
> new_x = x(ix,:);
> t = cputime () - t;
> printf ("N = %8d, t = %6.2f seconds, n*log(n) = %.2e\n", n, t, n*log(n));
> fflush (stdout);
> endfor
>
> N = 500000, t = 13.96 seconds
> N = 100000, t = 2.30 seconds
> N = 10000, t = 0.17 seconds
> N = 1000, t = 0.02 seconds
> N = 100, t = 0.01 seconds
> N = 10, t = 0.01 seconds
>
> at N == 10^6, I get into paging trouble on my 64MB Pentium system.
>
> How did you end up with almost constant time? I took Octave's sort
> algorithm from Knuth. It's good, but I don't think it's *that* good.
>
> jwe
Eerrrmm OOPS. Turns out I inadvertently did it on a row vector instead of
a column vector. Too late at night. Anyway, it's next day now, and here's
some typical results (Pentium 120, 48MB RAM).
> x=(1:1000)' ; tic; [x0,ix]=sort(rand(1,rows(x))); x(ix); toc
ans = 0.049538
> x=(1:10000)' ; tic; [x0,ix]=sort(rand(1,rows(x))); x(ix); toc
ans = 0.74168
> x=(1:100000)'; tic; [x0,ix]=sort(rand(1,rows(x))); x(ix); toc
ans = 13.884
Very comparable. I still reckon it's pretty fast! (especially since
there's a lot else happening on the machine at the moment).
Best wishes,
Ted. (address@hidden)