[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: strange FFT timing
From: |
Sergei Steshenko |
Subject: |
Re: strange FFT timing |
Date: |
Sun, 20 Apr 2008 15:50:27 -0700 (PDT) |
--- David Bateman <address@hidden> wrote:
> Przemek Klosowski wrote:
> > I would expect that FFT would be fasterst for datasets of the size
> > that is a power of 2, but a quick test doesn't bear this out. I am
> > using Octave Version 3.0.0 on Fedora 8 (2.6.24) runing on a 2.1 GHz
> > Athlon XP 2800+ with 1GB of memory (no swap activity during the
> > calculation). The 1,000,000 element FFT is almost twice as fast as
> > 2^20 element FFT (see below for excerpt of the Octave session).
> >
> > Anyone would venture what is going on? It could be cache aliasing
> > (this Athlon has a 512kB cache). In any case, FFTW seems to be very
> > good handling the non-power-of-2 sizes, and seems
> > to be optimized for power-of-ten sizes.
> >
> > v=rand(2^20,1); tic;fft(v);toc
> > Elapsed time is 0.339144 seconds.
> > Elapsed time is 0.290032 seconds.
> > Elapsed time is 0.283474 seconds.
> > Elapsed time is 0.291376 seconds.
> > Elapsed time is 0.285946 seconds.
> > v=rand(1e6,1); tic;fft(v);toc
> > Elapsed time is 0.223884 seconds.
> > Elapsed time is 0.179272 seconds.
> > Elapsed time is 0.179442 seconds.
> > Elapsed time is 0.184591 seconds.
> > Elapsed time is 0.179012 seconds.
> >
> > Similar results obtained with v=rand(1e6,1);x=cputime;fft(v);cputime-x
>
> At a guess I'd say SIMD memory alignment. With 16byte data alignment
> FFTW can use the SSE2/SSE3 instructions to speed things up. Maybe your
> 1e6 case just got lucky. I see the reverse behavior..
>
> D.
>
> _______________________________________________
> Help-octave mailing list
> address@hidden
> https://www.cae.wisc.edu/mailman/listinfo/help-octave
>
On my machine - Athlon AM2 X2 4600+:
"
octave:1> v=rand(2^20,1); tic;fft(v);toc
Elapsed time is 0.244031 seconds.
octave:2> tic;fft(v);toc
Elapsed time is 0.199732 seconds.
octave:3> tic;fft(v);toc
Elapsed time is 0.199396 seconds.
octave:4> tic;fft(v);toc
Elapsed time is 0.197449 seconds.
octave:5> tic;fft(v);toc
Elapsed time is 0.198605 seconds.
octave:6> v=rand(1e6,1); tic;fft(v);toc
Elapsed time is 0.184027 seconds.
octave:7> tic;fft(v);toc
Elapsed time is 0.15657 seconds.
octave:8> tic;fft(v);toc
Elapsed time is 0.144034 seconds.
octave:9> tic;fft(v);toc
Elapsed time is 0.152095 seconds.
octave:10> tic;fft(v);toc
Elapsed time is 0.155661 seconds.
octave:11>
"
1e6 is faster than 2^20.
FFTW3 is with SSE2 support.
Regards,
Sergei.
Applications From Scratch: http://appsfromscratch.berlios.de/
____________________________________________________________________________________
Be a better friend, newshound, and
know-it-all with Yahoo! Mobile. Try it now.
http://mobile.yahoo.com/;_ylt=Ahu06i62sR8HDtDypao8Wcj9tAcJ