discuss-gnuradio

Re: [Discuss-gnuradio] VOLK division between complexes


From: Marcus Müller
Subject: Re: [Discuss-gnuradio] VOLK division between complexes
Date: Fri, 13 May 2016 21:58:56 +0200

Hi Federico,

I don't know if this will help much, but regarding this step:
volk_32fc_magnitude_squared_32f(&mag_sq_b[0], &b[0], N); // mag_sq_b = |b|^2
Doing it in-place, i.e.
volk_32fc_magnitude_squared_32f((float *)&b[0], &b[0], N); // b = |b|^2
might be even faster; just don't forget that you're then treating the first half of b as floats instead of complexes, and that the original contents of b are gone afterwards.

I just realized there's the _mm_rcp_ps SSE intrinsic (an approximate packed-float reciprocal)... maybe that complex/complex VOLK kernel is closer than I thought.
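
To make the idea concrete, here is a minimal sketch of how the 1/|b|^2 step could use that intrinsic. It is not a VOLK kernel: the function name approx_reciprocal is hypothetical, and the single Newton-Raphson refinement step is an assumption added here because _mm_rcp_ps alone only gives roughly 12 bits of precision. Inputs are assumed to be finite and non-zero; on builds without SSE the code falls back to a plain scalar division.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper (not part of VOLK): out[i] ~= 1 / in[i].
 * Assumes every in[i] is finite and non-zero. */
void approx_reciprocal(float *out, const float *in, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    const __m128 two = _mm_set1_ps(2.0f);
    /* Process four floats per iteration; unaligned loads keep it simple. */
    for (; i + 4 <= n; i += 4) {
        __m128 d = _mm_loadu_ps(in + i);
        __m128 x = _mm_rcp_ps(d);            /* coarse estimate of 1/d */
        /* One Newton-Raphson step: x = x * (2 - d * x). */
        x = _mm_mul_ps(x, _mm_sub_ps(two, _mm_mul_ps(d, x)));
        _mm_storeu_ps(out + i, x);
    }
#endif
    /* Scalar tail (and full fallback when SSE is unavailable). */
    for (; i < n; ++i)
        out[i] = 1.0f / in[i];
}
```

Whether this beats volk_32f_x2_divide_32f in practice would have to be measured; the sketch only shows the shape of the computation.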

Cheers,
Marcus
On 13.05.2016 20:59, Federico Larroca wrote:
Thank you Andy. However, I only need the division, although this would indeed be a good idea if more operations were needed.

So far, I've applied the following lines with some significant savings (w.r.t. a loop):

volk_32fc_x2_multiply_conjugate_32fc(&c[0], &a[0], &b[0], N); // c = a*conj(b)
volk_32fc_magnitude_squared_32f(&mag_sq_b[0], &b[0], N); // mag_sq_b = |b|^2
volk_32f_x2_divide_32f(&inv_mag_sq_b[0], &ones[0], &mag_sq_b[0], N); // inv_mag_sq_b = 1/|b|^2 (ones is a previously defined array containing N ones)
volk_32fc_32f_multiply_32fc(&out[0], &c[0], &inv_mag_sq_b[0], N); // out = c*inv_mag_sq_b = a*conj(b)/|b|^2 = a/b
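
The four calls above can be cross-checked against a scalar reference. The following is a minimal sketch in plain C that performs the same decomposition one element at a time; it does not use VOLK, and the cf32 struct and the divide_via_conjugate name are hypothetical stand-ins chosen for illustration. It assumes no element of b is zero.

```c
#include <stddef.h>

/* Hypothetical stand-in for one interleaved complex float sample. */
typedef struct { float re, im; } cf32;

/* out[i] = a[i] / b[i], computed as a * conj(b) / |b|^2.
 * Mirrors the four vector steps element by element.
 * Assumes b[i] != 0 for every i (there is no zero-division guard). */
void divide_via_conjugate(cf32 *out, const cf32 *a, const cf32 *b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        /* Step 1: c = a * conj(b) */
        float c_re = a[i].re * b[i].re + a[i].im * b[i].im;
        float c_im = a[i].im * b[i].re - a[i].re * b[i].im;
        /* Step 2: |b|^2 */
        float mag_sq = b[i].re * b[i].re + b[i].im * b[i].im;
        /* Step 3: 1 / |b|^2 */
        float inv_mag_sq = 1.0f / mag_sq;
        /* Step 4: out = c * (1 / |b|^2) */
        out[i].re = c_re * inv_mag_sq;
        out[i].im = c_im * inv_mag_sq;
    }
}
```

For example, a = 1 + 2j and b = 3 + 4j give a*conj(b) = 11 + 2j and |b|^2 = 25, so a/b = 0.44 + 0.08j.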

Using VOLK's pow operator to compute the inverse instead turned out to be significantly slower.

I've also experienced interesting performance improvements by simplifying some for loops not amenable to VOLK, as suggested by Marcus. On the other hand, I'm crazy enough to try to implement a VOLK kernel that performs the division. I've just started, don't know if I'll be successful, but guess I'll learn something anyhow.

best
Federico

2016-05-13 15:14 GMT-03:00 Andy Walls <address@hidden>:
On Thu, 2016-05-12 at 16:24 -0400, address@hidden
wrote:
> Date: Wed, 11 May 2016 16:09:56 -0300
> From: Federico Larroca
> To: address@hidden
> Subject: [Discuss-gnuradio] VOLK division between complexes

> Hello everyone,
> We are at the stage of optimizing our project (gr-isdbt). One of the
> most time-consuming blocks is OFDM synchronization, and in particular the
> equalization phase. This is simply the division between the input
> signal and the estimated channel gains (two modestly big arrays of
> ~5000 complexes for each OFDM symbol).
> Until now, this was performed by a for loop, so my plan was to change
> it for a volk function. However, there is no complex division in VOLK.
> So I've done a rather indirect operation using the property that a/b =
> a*conj(b)/|b|^2, resulting in six lines of code (a multiply conjugate,
> a magnitude squared, a deinterleave, a couple of float divisions and
> an interleave). Obviously the performance gain (measured with the
> Performance Monitor) is marginal (to be optimistic)...
> Does anyone have a better idea?

I have a different idea, but I doubt it is better.  The transformation

w = Log(z) = ln|z| + j*Arg(z)

transforms multiplication, division, power and root operations into
addition, subtraction, multiplication and division  operations
respectively.

So if c = Log(a), d = Log(b), then a/b = Exp(c-d) .

If along with your complex division, you also have a lot of additional
complex multiplication, power, and/or (real) root operations to perform,
then the transform *might* give you a savings.  A savings would also be
more likely, if you don't need to invert the transformation at the end
(i.e. no need for z = Exp(w)).

Regards,
Andy

>  Implementing a new kernel is simply out of my knowledge scope.
> Best
> Federico




