discuss-gnuradio

Re: [Discuss-gnuradio] Using volk


From: Marcus Müller
Subject: Re: [Discuss-gnuradio] Using volk
Date: Thu, 09 Oct 2014 14:53:04 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

Compilers are good, just use a linear comparison:

const float *current = input;
const float *end = input + length_of_array;
float max = *current;   /* assumes length_of_array >= 1 */
while (++current < end) {
    if (*current > max)
        max = *current;
}

:)

On 09.10.2014 13:09, Mostafa Alizadeh wrote:
> Thank you so much Marcus,
>
> I've learnt so much from you here :)
> The algorithm for finding the max of a vector by comparing one half to the
> other half is a good idea! I can't use GNU Radio blocks for this
> calculation because I must do it within my own block.
>
> Unfortunately, I'm not familiar with optimization and parallelization
> algorithms, so I just want to compute this fast, not necessarily as fast as
> possible :)
>
> Best,
> Mostafa
>
> On Wed, Oct 8, 2014 at 5:56 AM, Marcus Müller <address@hidden>
> wrote:
>
>>  Hi Mostafa,
>>
>> VOLK is but the Vector-Optimized Library of Kernels, i.e. a library of
>> accelerated vector operations.
>> What you want is basically three operations:
>> a) finding maximum absolute
>> b) finding average absolute
>> c) dividing these two values
>>
>> Now, looking closer at a) and b), one notices that both require the
>> samples to be converted to their magnitudes, first. And because we're in
>> the business of optimizing things, let's just use the squared magnitude,
>> because that saves one sqrt per sample. Note that this yields the RMS
>> rather than the average absolute value. So this boils down to
>> a) take mag_squared of input (length N)
>> b1) find maximum of a)
>> b2) find sum of a)
>> c) sqrt(b2 / (N * b1))
>>
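
For reference, steps a) to c) can be sketched in one pass of plain C, without
VOLK. This is only a sketch under assumptions: the function name is made up,
and the samples are assumed to arrive as interleaved floats (I0, Q0, I1, Q1,
...). It returns the mean-to-peak power ratio, (b2 / N) / b1; take the square
root of the result if you want the amplitude (RMS-to-peak) ratio.

```c
#include <stddef.h>

/* Hypothetical helper, not a VOLK or GNU Radio function.
 * iq holds n complex samples as interleaved floats: I0, Q0, I1, Q1, ...
 * Returns (mean of |x|^2) / (max of |x|^2), or 0 for empty or all-zero input. */
float mean_to_peak_power(const float *iq, size_t n)
{
    float peak_sq = 0.0f; /* b1: maximum squared magnitude */
    float sum_sq = 0.0f;  /* b2: sum of squared magnitudes */

    for (size_t i = 0; i < n; i++) {
        float re = iq[2 * i];
        float im = iq[2 * i + 1];
        float mag_sq = re * re + im * im; /* a: squared magnitude */

        if (mag_sq > peak_sq)
            peak_sq = mag_sq;
        sum_sq += mag_sq;
    }

    if (n == 0 || peak_sq == 0.0f)
        return 0.0f;

    /* c: mean power divided by peak power */
    return sum_sq / ((float)n * peak_sq);
}
```

A constant-envelope signal gives 1.0 here; the peakier the signal, the closer
the result gets to 0.
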
>> As you can see, c) is not a vector operation, and thus not a case for volk.
>> For a) ("Complex to Mag ^ 2") there is a GNU Radio block that uses VOLK.
>> That's the example for using VOLK that I would have recommended to read,
>> anyway :)
>>
>> In other terms: If you don't have to write your own highly optimized
>> block, don't use VOLK directly, use the standard GNU Radio blockset. It's
>> rather optimized ;)
>>
>> Now, for the maximum search b1, things are a bit more complicated.
>> Searching for a maximum is not *easily* vectorizable, because it is an
>> inherently sequential operation (think of it as the first step of a bubble
>> sort).
>> Now, you can achieve *awesome* performance by basically turning your
>> linear search into an N-ary tree, with N being the order of parallelism you
>> can achieve by using a maximum-finding SIMD instruction. But that requires
>> the size of the problem to be a power of N. That just doesn't fly well with
>> the usually more "multiple of 64 bit"-typey alignment restrictions.
>> You are, however, highly encouraged to try just that: use the existing
>> volk_32f_x2_max_32f, which compares two vectors, and stores the
>> element-wise maximum in a third one, to compare the first with the second
>> half of your mag_squared vector, and repeat the same with the first and
>> second half of the result (and so on) until you have a single maximum
>> value. That's the comparison tree from above for the N=2 case. You can
>> employ clever overlapping, using as many input values twice as needed, to
>> virtually extend your input's length to a power of two, and then just waltz
>> on.
>>
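
The halving scheme described above, including the overlap trick for lengths
that are not a power of two, might look roughly like this in plain C. The
helper elementwise_max is a hypothetical scalar stand-in for the element-wise
maximum kernel (volk_32f_x2_max_32f in the text); it is not the VOLK call
itself, and tree_max is a made-up name as well. The buffer is overwritten.

```c
#include <stddef.h>

/* Hypothetical scalar stand-in for an element-wise maximum kernel:
 * out[i] = max(a[i], b[i]) for i in [0, n). */
static void elementwise_max(float *out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (a[i] > b[i]) ? a[i] : b[i];
}

/* Reduce buf (n >= 1 elements) to its maximum by repeatedly comparing the
 * first half with the last half. For odd n the two halves overlap by one
 * element, so that element is simply used twice. Overwrites buf. */
float tree_max(float *buf, size_t n)
{
    while (n > 1) {
        size_t half = (n + 1) / 2;   /* ceil(n / 2) */
        size_t offset = n - half;    /* start of the (possibly overlapping) last half */

        elementwise_max(buf, buf, buf + offset, half);
        n = half;
    }
    return buf[0];
}
```

Each pass halves the problem, so about log2(n) kernel calls are needed. The
in-place use works with this scalar helper because writes never run ahead of
reads; whether a given SIMD kernel tolerates aliased buffers is something to
check before doing the same with it.
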
>> For b2) you can simply use the "integrate" block, which is not VOLK
>> optimized (possibly because it's a gengen template and these are *so much
>> fun* to specialize). But seeing as it is simply an accumulating for loop, I
>> kind of expect your compiler to make the best of the situation. However,
>> you can also use the volk_32f_accumulator_s32f VOLK kernel. I kind of want
>> to use that in integrate, because for my machine, the SSE VOLK kernel is 4
>> times as fast as the generic implementation, which nicely matches the
>> 4-operand SSE SIMD instruction behind it.
>>
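
To illustrate why a 4-wide SIMD accumulator can be about four times as fast as
a simple loop, here is a scalar C sketch that keeps four independent partial
sums, one per "lane". This is not the code of the VOLK kernel, only the idea
behind it; the name sum_4lane is made up for this example.

```c
#include <stddef.h>

/* Hypothetical illustration: sum n floats using four independent partial
 * sums, the way a 4-wide SIMD register would hold them. */
float sum_4lane(const float *in, size_t n)
{
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    size_t i = 0;

    /* Main loop: four elements per iteration, one per lane. */
    for (; i + 4 <= n; i += 4) {
        acc[0] += in[i];
        acc[1] += in[i + 1];
        acc[2] += in[i + 2];
        acc[3] += in[i + 3];
    }

    /* Horizontal add of the four lanes. */
    float total = (acc[0] + acc[1]) + (acc[2] + acc[3]);

    /* Tail: the up to three leftover elements. */
    for (; i < n; i++)
        total += in[i];

    return total;
}
```

Because the additions happen in a different order than in a simple for loop,
the result can differ from it in the last few bits for general float input.
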
>> Greetings,
>> Marcus
>>
>>
>> On 07.10.2014 21:49, Mostafa Alizadeh wrote:
>>
>> Hello all,
>>
>> I was wondering about volk. I want to use it to compute the mean-to-peak
>> value of a complex array. How could I do this?
>> Besides, I really need to know whether there is any example of using volk.
>> The code itself doesn't show the input and output parameters explicitly.
>>
>> Best,
>> Mostafa
>>
>> _______________________________________________
>> Discuss-gnuradio mailing list
>> address@hidden
>> https://lists.gnu.org/mailman/listinfo/discuss-gnuradio
>>
>>
>



