Re: [Lzip-bug] Speedup by including intrinsics for vectorization

From: Antonio Diaz Diaz
Subject: Re: [Lzip-bug] Speedup by including intrinsics for vectorization
Date: Thu, 13 Oct 2016 19:27:01 +0200
User-agent: Mozilla/5.0 (X11; U; Linux i586; en-US; rv: Gecko/20110420 SeaMonkey/2.0.14

Hello Erick,

Erick Couts II wrote:
> I wanted to point out that no SSE intrinsics were included in the source
> code in order to vectorize the encoding process.  I've found that a small
> but decent speed gain can be achieved by including the immintrin.h header
> and then compiling with auto-vectorization enabled in GCC and LTO for
> linktime.  I also profiled the program after running with the --best option
> in order to further optimize the program.

What program? There are several programs in the lzip family.

I have tried your suggestion and I haven't noticed any increase in speed for '-9' in lzip-1.18 after including 'immintrin.h' (or 'ammintrin.h') and compiling with '-O3 -flto' on an AMD Athlon64 X2.

> I know that you like to keep code simple, but just adding in the #include
> immintrin.h to the headers will allow for auto-vectorization without
> requiring further changes to any of the existing code.

I like to keep code simple and portable. For example, 'immintrin.h' does not exist on the computer from which I'm writing this.
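To illustrate the portability concern, here is a minimal sketch (not code from lzip) of how such a header would have to be guarded so that the build does not break where it is missing. The macro 'HAVE_IMMINTRIN' and the function 'equal_prefix_len' are hypothetical names chosen for this example, and the guard assumes a compiler that provides '__has_include'; older compilers simply skip the include.

```cpp
#include <cstddef>
#include <cstdint>

// Sketch only: HAVE_IMMINTRIN is a hypothetical macro, not part of lzip.
// Include the header only where the compiler reports that it exists.
#if defined(__has_include)
#  if __has_include(<immintrin.h>)
#    include <immintrin.h>
#    define HAVE_IMMINTRIN 1
#  endif
#endif
#ifndef HAVE_IMMINTRIN
#  define HAVE_IMMINTRIN 0
#endif

// Hypothetical helper: number of leading bytes that are equal in a and b.
// It is a plain portable loop that uses no intrinsics, so it compiles
// whether or not the header above was found.
inline std::size_t equal_prefix_len(const std::uint8_t* a,
                                    const std::uint8_t* b,
                                    std::size_t limit) {
    std::size_t n = 0;
    while (n < limit && a[n] == b[n]) ++n;
    return n;
}
```

The loop itself is the same in both cases. The header only declares intrinsic functions, so code that never calls them is left to the compiler and its optimization flags.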

But the problem with these optimization hacks is that they tend not to be reproducible in other environments, as seems to be the case for this one.

Best regards,
