Re: [Discuss-gnuradio] GNURadio and CUDA reprised
From: Michael Dickens
Subject: Re: [Discuss-gnuradio] GNURadio and CUDA reprised
Date: Wed, 12 Jan 2011 15:22:58 -0500
On Jan 12, 2011, at 2:56 PM, Moeller wrote:
> On 12.01.2011 14:25, Michael Dickens wrote:
>> the CPU). I think that if a GPU can be used, it will be most effective in
>> things like filterbanks, or when searching for packets (via their unique
>> sync sequence, so matched filtering), or very large FIR filters -- places
>> where a LOT of computations and data must be processed and can be
>> parallelized easily.
>
> Is there an efficient parallel FIR implementation for CUDA? You need only few
> operations on a large set of data. So, isn't this too much for the
> stream-processor local-memory?
> If GPU global memory has to be used, this would lead to a slower concurrent
> access.
> And then there is still the transfer time from/to the computer RAM.
> It would be great to have a fast filter, but is it really faster than an
> optimized SSE CPU FIR?
> I had the feeling, that the ratio of computing operations vs. number of
> samples has to be high for a significant GPU vs. CPU speedup.
> I'm curious about how much speedup you can achieve for FIR filters
> (let's say large/sharp filters of 1024 taps).
The "very large FIR filters" suggestion was just a thought: an example of an
operation that might benefit from a GPU, at least when using OpenCL (or CUDA).
I haven't done any testing yet, so I don't know whether a GPU can do better
than a CPU using vector instructions ... but I'm getting there. If/when I do
get there, I'll post my results & thoughts.
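As a back-of-the-envelope sketch of the "operations vs. number of samples"
ratio raised above (an estimate, not a measurement), consider a direct-form FIR
whose taps are already on the GPU. The assumptions here are illustrative:
complex float samples and taps (8 bytes per sample), 8 floating-point
operations per complex multiply-add, and one sample crossing the host/GPU bus
in each direction per output. The function name is hypothetical.

```python
def fir_flops_per_byte(num_taps, bytes_per_sample=8, flops_per_tap=8):
    """Estimate flops per byte transferred for a direct-form FIR filter.

    Assumes steady state: one input sample goes to the device and one
    output sample comes back per filter output; taps stay on the device.
    """
    flops_per_output = num_taps * flops_per_tap
    bytes_per_output = 2 * bytes_per_sample  # one sample in, one sample out
    return flops_per_output / bytes_per_output


# Compare a short, a medium, and a long filter.
for taps in (16, 128, 1024):
    print(taps, fir_flops_per_byte(taps))
```

Under these assumptions the ratio grows linearly with the number of taps, which
is why long filters look like the more plausible candidates. Whether that
translates into a real speedup over an SSE implementation still has to be
measured.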
Your comment about global versus local memory certainly does seem true from
reading the OpenCL specs. Most modern GPUs have 3 levels of memory: global
(for the whole GPU, across all cores), core (shared across all kernel execution
units within a core), and kernel (private to one execution unit). OpenCL calls
these global, local, and private memory. They are listed in order of decreasing
size, increasing access speed, and increasing time needed to move data to and
from them. I've been playing around with global memory only so far, but I'll
look into the other levels as well to see what they can provide & the
trade-offs required.
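One common way to work within a small, fast memory level is to tile the
output: a block of B outputs from an N-tap filter needs only B + N - 1 input
samples, so those can be copied once into fast memory and reused for every tap.
The sketch below is plain Python rather than OpenCL or CUDA, purely to
illustrate the data reuse. The function names and tile size are illustrative
assumptions, and the `local` list merely stands in for per-core memory.

```python
def fir_direct(x, taps):
    """Reference FIR: y[n] = sum_k taps[k] * x[n - k], with x[m] = 0 for m < 0."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(taps)):
            if n - k >= 0:
                acc += taps[k] * x[n - k]
        y.append(acc)
    return y


def fir_tiled(x, taps, tile_size):
    """Same filter, computed one tile of outputs at a time.

    Each tile copies the (tile length + len(taps) - 1) inputs it needs into a
    small buffer and then reads only from that buffer.
    """
    history = len(taps) - 1
    padded = [0.0] * history + list(x)  # zero history before the first sample
    y = []
    for start in range(0, len(x), tile_size):
        stop = min(start + tile_size, len(x))
        local = padded[start:stop + history]  # one bulk copy per tile
        for i in range(stop - start):
            acc = 0.0
            for k in range(len(taps)):
                # x[start + i - k] sits at local[i + history - k]
                acc += taps[k] * local[i + history - k]
            y.append(acc)
    return y


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
coeffs = [0.5, 0.25, 0.25]
print(fir_tiled(samples, coeffs, tile_size=3) == fir_direct(samples, coeffs))
```

The trade-off is that larger tiles amortize the N - 1 samples of overlap
better, but the tile plus its overlap has to fit in whatever fast memory the
device provides, which for long filters may be the limiting factor.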
Good & interesting discussion! - MLD