discuss-gnuradio

Re: [Discuss-gnuradio] Writing SIMD code with sse


From: Dominik Auras
Subject: Re: [Discuss-gnuradio] Writing SIMD code with sse
Date: Wed, 12 Dec 2007 19:45:10 +0100
User-agent: Thunderbird 2.0.0.9 (X11/20071115)

Hi!

The intrinsics are more or less C wrapper functions for assembler instructions. You can find a detailed description here:

http://www.intel.com/products/processor/manuals/index.htm
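
To illustrate the "wrapper" idea, here is a minimal sketch using the standard intrinsics from <xmmintrin.h>. The helper name add4 is made up for this example, and the scalar branch is only there so the code also builds where SSE is not available:

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>  // SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps
#define EXAMPLE_HAVE_SSE 1
#endif

// Hypothetical helper: out[k] = a[k] + b[k] for k = 0..3.
void add4(const float* a, const float* b, float* out) {
#ifdef EXAMPLE_HAVE_SSE
    const __m128 va = _mm_loadu_ps(a);        // unaligned load of 4 floats (MOVUPS)
    const __m128 vb = _mm_loadu_ps(b);        // unaligned load of 4 floats (MOVUPS)
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   // packed add (ADDPS), unaligned store
#else
    // Scalar fallback for builds without SSE.
    for (std::size_t k = 0; k < 4; ++k) out[k] = a[k] + b[k];
#endif
}
```

Each intrinsic call corresponds roughly to one instruction, but the compiler still does the register allocation and scheduling for you.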

SSE1-3 is supported by modern AMD and Intel processors.

There are many possible improvements, but they require selecting code per processor, since not every CPU supports every instruction set extension.
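
One possible way to do such a selection at runtime is sketched below. It assumes GCC or Clang on x86, where the builtins __builtin_cpu_init and __builtin_cpu_supports are available; on other compilers or architectures this sketch simply reports "no SSE2". The names cpu_has_sse2, select_convert, convert_fn and convert_scalar are invented for the example and are not part of GNU Radio:

```cpp
#include <cstddef>

// Signature shared by all conversion routines in this sketch.
using convert_fn = void (*)(const short*, float*, std::size_t);

// Portable reference path: plain scalar conversion.
void convert_scalar(const short* in, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) out[i] = static_cast<float>(in[i]);
}

// Returns true if the running CPU reports SSE2 support.
// Assumption: GCC/Clang builtins on x86; everything else returns false.
bool cpu_has_sse2() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();                          // safe to call more than once
    return __builtin_cpu_supports("sse2") != 0;
#else
    return false;
#endif
}

// Picks the SIMD routine when the CPU supports it, else the scalar one.
convert_fn select_convert(convert_fn sse2_impl, convert_fn scalar_impl) {
    return cpu_has_sse2() ? sse2_impl : scalar_impl;
}
```

The selection would typically be done once at startup and the resulting function pointer reused, so the check does not cost anything in the inner loop.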

An example for intrinsics:

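// GCC vector types, 16 bytes (128 bits) each
typedef float v4sf __attribute__ ((vector_size(16)));      // 4 x float
typedef short int v8hi __attribute__ ((vector_size(16)));  // 8 x 16-bit int
typedef int v4si __attribute__ ((vector_size(16)));        // 4 x 32-bit int

// buffer, usrp_buffer, nbytes and i come from the surrounding code.
v4sf * o = static_cast<v4sf*>(buffer->write_pointer());
const v8hi * in = reinterpret_cast<const v8hi*>(usrp_buffer);
// Each iteration reads 16 bytes (8 shorts) and writes 8 floats (2 vectors).
for(i = 0; i < nbytes; i+=16, o+=2, ++in){
  const v8hi x = *in;

  // punpcklwd(x,x) copies each of the lower 4 shorts into both halves of a
  // 32-bit lane; the arithmetic right shift by 16 sign-extends it;
  // cvtdq2ps converts the 4 ints to floats.
  o[0] = __builtin_ia32_cvtdq2ps(
         __builtin_ia32_psradi128(
         reinterpret_cast<v4si>(
         __builtin_ia32_punpcklwd128(x,x)),16));
  // Same for the upper 4 shorts.
  o[1] = __builtin_ia32_cvtdq2ps(
         __builtin_ia32_psradi128(
         reinterpret_cast<v4si>(
         __builtin_ia32_punpckhwd128(x,x)),16));
}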

The code snippet quickly converts the shorts delivered by the USRP to floats using SSE (the integer instructions involved need SSE2). Note that it ignores byte order and assumes little-endian. The buffer size must be a multiple of 16 bytes.
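
For comparison, the same unpack / shift / convert trick can be written with the standard intrinsics from <emmintrin.h> instead of the GCC builtins. This is only a sketch: the function name shorts_to_floats is hypothetical, it uses unaligned loads and stores so it makes no alignment assumption, and it adds a scalar tail loop so the length does not have to be a multiple of 8 samples. Like the snippet above, it treats the input as native (little-endian on x86) 16-bit samples:

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define EXAMPLE_HAVE_SSE2 1
#endif

// Hypothetical helper: converts n signed 16-bit samples to floats.
void shorts_to_floats(const std::int16_t* in, float* out, std::size_t n) {
    std::size_t i = 0;
#ifdef EXAMPLE_HAVE_SSE2
    for (; i + 8 <= n; i += 8) {
        // Unaligned load of 8 shorts (16 bytes).
        const __m128i x =
            _mm_loadu_si128(reinterpret_cast<const __m128i*>(in + i));
        // Interleaving x with itself puts each short into the upper half of a
        // 32-bit lane; the arithmetic shift by 16 then sign-extends it.
        const __m128i lo = _mm_srai_epi32(_mm_unpacklo_epi16(x, x), 16);
        const __m128i hi = _mm_srai_epi32(_mm_unpackhi_epi16(x, x), 16);
        // Convert 4 ints to 4 floats each and store 8 floats in total.
        _mm_storeu_ps(out + i, _mm_cvtepi32_ps(lo));
        _mm_storeu_ps(out + i + 4, _mm_cvtepi32_ps(hi));
    }
#endif
    // Scalar tail (and complete fallback when SSE2 is not available).
    for (; i < n; ++i) out[i] = static_cast<float>(in[i]);
}
```

If the buffers are known to be 16-byte aligned, the aligned variants _mm_load_si128 and _mm_store_ps could be used instead of the unaligned ones.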

Dominik



