[Discuss-gnuradio] CUDA-Enabled GNURadio gr_benchmark10 possible improve

From: Yu-Hua Yang
Subject: [Discuss-gnuradio] CUDA-Enabled GNURadio gr_benchmark10 possible improvements
Date: Mon, 29 Jun 2009 05:10:52 -0400

testblock3 = cuda.fir_filter_fff(1, taps)
testblock4 = cuda.multiply_const_ff(1.0)
testblock5 = cuda.multiply_const_ff(1.0)
testblock6 = cuda.multiply_const_ff(1.0)

I attempted to "increase" the GPU performance by passing very large floating-point numbers as the constants to cuda.multiply_const_ff, and also by experimenting with taps, which is declared as:

taps = range(1, 64, 1)

In doing so, I assumed that I was passing in "more work", so the GPU should come out ahead, but it does not. The CPU still takes a fraction of a second to complete (even with the large floating-point constants), while the GPU takes a little over 1 second.
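One way to check the "more work" assumption is to count arithmetic operations per sample for the blocks above. The sketch below is a back-of-the-envelope count, not a measurement. It assumes that a direct-form FIR costs one multiply per tap plus one add per tap after the first, and that multiply_const costs one multiply per sample whatever the constant is. These assumptions are not taken from the GNU Radio or CUDA sources, and the helper names are hypothetical.

```python
# Rough per-sample operation count for the flowgraph blocks above.
# Assumptions (not taken from the GNU Radio source): a direct-form FIR
# does one multiply per tap and one add per tap after the first; a
# multiply-by-constant does one multiply, independent of the constant.

def fir_ops_per_sample(taps, decimation=1):
    """Operations per input sample for a direct-form FIR (hypothetical helper)."""
    n = len(taps)
    return (n + (n - 1)) / float(decimation)

def multiply_const_ops_per_sample(constant):
    """One multiply per sample; the value of `constant` does not matter."""
    return 1

taps = list(range(1, 64, 1))          # same declaration as in the benchmark
fir_ops = fir_ops_per_sample(taps)
const_ops_small = multiply_const_ops_per_sample(1.0)
const_ops_large = multiply_const_ops_per_sample(1.0e30)

# One FIR block followed by three multiply_const blocks.
total_ops = fir_ops + 3 * const_ops_small

print("taps:", len(taps))
print("FIR ops/sample:", fir_ops)
print("multiply_const ops/sample (1.0 vs 1e30):", const_ops_small, const_ops_large)
print("total ops/sample:", total_ops)
```

Under these assumptions only the number of taps changes the work per sample. A larger constant in multiply_const_ff does not add any arithmetic.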

- Following this thread: http://lists.gnu.org/archive/html/discuss-gnuradio/2009-01/msg00378.html
I would like to approach the problem by increasing the computational intensity, which is why I am changing the benchmark parameters, but it doesn't seem to work. Am I approaching this correctly?

- From this thread: http://lists.gnu.org/archive/html/discuss-gnuradio/2008-11/msg00292.html

If I benchmark a single block with a big output_multiple then I do see performance increases.

How do I do the above? How have the experts (Martin, Achilleas) been able to tweak the performance of CUDA-Enabled GNURadio to show that GPU computing can indeed be faster?

- Is there any way to measure the time taken by the memory transfers between the CPU and CUDA? That way we would know exactly what the overhead is.
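For the last question, one generic approach is to time each stage separately on the host. The sketch below is a plain-Python illustration, not the CUDA-enabled GNU Radio API: `copy_to_device`, `run_kernel`, and `copy_to_host` are hypothetical stand-ins that only copy or scale a list on the CPU. With real asynchronous GPU calls, the device would presumably have to be synchronized before reading the clock, otherwise the timing may only cover the launch.

```python
import time
from contextlib import contextmanager

timings = {}  # stage name -> accumulated seconds

@contextmanager
def timed(stage):
    """Accumulate wall-clock time spent inside the with-block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = timings.get(stage, 0.0) + (time.perf_counter() - start)

# Hypothetical stand-ins for the real transfer and compute calls.
def copy_to_device(host_buf):
    return list(host_buf)

def run_kernel(dev_buf, k):
    return [k * x for x in dev_buf]

def copy_to_host(dev_buf):
    return list(dev_buf)

samples = [float(i) for i in range(100000)]
for _ in range(10):
    with timed("host_to_device"):
        dev_in = copy_to_device(samples)
    with timed("compute"):
        dev_out = run_kernel(dev_in, 2.0)
    with timed("device_to_host"):
        result = copy_to_host(dev_out)

# Report each stage and its share of the total measured time.
total = sum(timings.values())
for stage in ("host_to_device", "compute", "device_to_host"):
    share = 100.0 * timings[stage] / total
    print("%-15s %.6f s (%.1f%%)" % (stage, timings[stage], share))
```

Splitting the measurement this way shows how much of the total run time goes to transfers rather than to the computation itself.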

Please help!!
