
Re: [Discuss-gnuradio] GNURadio and CUDA reprised


From: Steven Clark
Subject: Re: [Discuss-gnuradio] GNURadio and CUDA reprised
Date: Wed, 12 Jan 2011 09:56:19 -0500

On Wed, Jan 12, 2011 at 2:44 AM, Moeller <address@hidden> wrote:
On 11.01.2011 23:13, Andrew Hofmaier wrote:
> I've begun to look into accelerating GNURadio applications with Nvidia CUDA GPU's
> and have scanned through the archives of the discussion list.  I had two
> questions on the topic:
>
> 1.  Is the CUDA-GNURadio port done by Martin DvH circa 2008 still
> available and runnable?  All links I've seen are broken.

Is CUDA really suitable? There is a certain overhead in data communication.
CUDA is only useful if it can compute complex things without communicating,
but a data streaming application needs lots of I/O.
The CPU with SSE is also very fast at things like FFTs.
I ran some experiments with CUDA, but they were not very successful:
performance was far below the peak FLOPS you see in benchmarks.
But I'm not an experienced programmer ...

> 2.  Much of the results I've seen, both here and elsewhere, suggest that
> CUDA is not typically applicable to general GNURadio applications.  It
> has worked in specific cases, but only where the data throughput
> requirements are very high and the algorithms are extremely

Yes, I had the same experience. I tried letting CUDA do a one-dimensional FFT.
It was slower than on the CPU and had a large communication overhead.
It may be better with larger FFT sizes, a 2D FFT, or better programming ...
In contrast, the sample programs were very fast, but also very specialized,
such as fractal computation, image processing, or particle physics.

> these cards for GNURadio applications?  Some of the major relevant
> improvements are the ability to concurrently schedule multiple kernels
> and asynchronously perform memory transfers.

I think the important point is that the kernels have to do a lot of computation
relative to the data transfer. A 1D FFT is not very compute-intensive relative to
the data movement involved. What kind of algorithm do you want to port to CUDA?




I've done some work with both CUDA and GNURadio, and I think there is real potential in using them together, but only for certain applications, and only if the software is architected intelligently.

GPUs are incredibly powerful, with 1+ TFLOPS of compute and 100+ GB/s of memory bandwidth within the GPU. I've used GPUs to perform real-time signal processing on 300+ MHz of continuously streaming data without dropping a sample. But the PCI Express bus bandwidth of ~5 GB/s between host and GPU can be a real bottleneck, so you have to design accordingly.
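
As a rough illustration of the bus budget, here is a minimal Python sketch of the arithmetic. Only the ~5 GB/s bus figure comes from the numbers above. Reading "300+ MHz" as 300 Msamples/s of complex data and the two sample sizes (4 and 8 bytes) are assumptions made for the example, and `bus_utilization` is a hypothetical helper name.

```python
def bus_utilization(sample_rate_hz, bytes_per_sample, bus_bytes_per_s=5e9):
    """Fraction of the host-to-GPU bus consumed by a continuous sample stream.

    bus_bytes_per_s defaults to the ~5 GB/s figure quoted above.
    """
    return sample_rate_hz * bytes_per_sample / bus_bytes_per_s


# Assumed: 300 Msamples/s of complex samples, 16-bit I + 16-bit Q = 4 bytes each.
print(f"{bus_utilization(300e6, 4):.0%} of the bus")  # prints "24% of the bus"

# Assumed: same stream converted to complex float32 (8 bytes) before upload.
print(f"{bus_utilization(300e6, 8):.0%} of the bus")  # prints "48% of the bus"
```

Under these assumptions a single upload of the raw stream fits comfortably, but there is not much headroom for moving the same data across the bus several times.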

You DON'T want to try to make individual drop-in CUDA replacements for multiple GNURadio processing blocks in a chain. It doesn't make any sense to send data to the GPU, perform an operation (e.g., filtering), bring the result back to the host, send some more data to the GPU, perform a second operation, bring the data back, etc. The PCI Express transfers will eat you alive.

The key is to send large chunks (tens or hundreds of MB) of data to the GPU, and do as much computation as possible while it is there. Large batched FFTs, wideband frequency searches, channelizing: it's all gravy. It's great if you can stream wideband data to the GPU, have it do some computationally intensive stuff, perform a rate reduction, then stream the lower-bandwidth data back to the host to do further (annoyingly serial) operations. You could even (if you wanted to) implement an entire transmitter or receiver within the GPU, with the CPU solely shuttling data to or from the ADC/DAC.
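
To make the trade-off concrete, here is a back-of-envelope Python model of transfer time only (GPU compute time is ignored). It is a sketch under stated assumptions, not a measurement: the ~5 GB/s bus figure is the one quoted above, while the 10 microsecond per-transfer latency, the chunk sizes, the block count, the decimation factor, and all function names are hypothetical values chosen for illustration.

```python
# Back-of-envelope model; every constant below is an illustrative assumption.
BUS_BYTES_PER_S = 5e9        # ~5 GB/s host<->GPU figure quoted above
TRANSFER_LATENCY_S = 10e-6   # assumed fixed cost per transfer (hypothetical)


def transfer_time(num_bytes, bus=BUS_BYTES_PER_S, latency=TRANSFER_LATENCY_S):
    """Seconds to move num_bytes across the bus in a single transfer."""
    return latency + num_bytes / bus


def per_block_round_trips(chunk_bytes, num_blocks):
    """Drop-in style: every block uploads the chunk and downloads the result."""
    return num_blocks * 2 * transfer_time(chunk_bytes)


def fused_chain(chunk_bytes, decimation):
    """One upload, all blocks run on the GPU, one rate-reduced download."""
    return transfer_time(chunk_bytes) + transfer_time(chunk_bytes / decimation)


def effective_bandwidth(chunk_bytes):
    """Achieved bytes/s for one transfer once the fixed latency is included."""
    return chunk_bytes / transfer_time(chunk_bytes)


if __name__ == "__main__":
    chunk = 64e6  # assumed 64 MB chunk
    print("4 drop-in blocks:", per_block_round_trips(chunk, num_blocks=4), "s")
    print("fused, decimate by 16:", fused_chain(chunk, decimation=16), "s")
    print("4 KB chunk:", effective_bandwidth(4096) / 1e9, "GB/s effective")
    print("64 MB chunk:", effective_bandwidth(chunk) / 1e9, "GB/s effective")
```

In this model the fused chain crosses the bus once in each direction regardless of how many operations run on the GPU, and small chunks are dominated by the fixed per-transfer cost, which is the reasoning behind sending large chunks and returning only rate-reduced data.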

In summary, yes, please do get excited about CUDA/OpenCL; it's great technology. When the USRP 9.0 comes out with a gigasample ADC/DAC, GPUs will be ready to do the heavy lifting :)

-Steven
