help-gplusplus
Re: Help with Hand-Optimized Assembly


From: Terje Mathisen
Subject: Re: Help with Hand-Optimized Assembly
Date: Wed, 28 Mar 2012 18:29:57 -0000
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0.1) Gecko/20111221 Firefox/9.0.1 SeaMonkey/2.6.1

sfuerst wrote:
There is a straightforward algorithm using the fact that only one of
the bounds can be crossed...

Something like this:
(Inputs in %xmm0 and %xmm1, output in %xmm0)

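subsd %xmm1, %xmm0               # xmm0 = diff = a - b
movsd plusM_PI(%rip), %xmm1      # xmm1 = +pi
movsd minusM_PI(%rip), %xmm2     # xmm2 = -pi

cmpltsd %xmm0, %xmm1             # xmm1 = all-ones if +pi < diff
cmpnlesd %xmm0, %xmm2            # xmm2 = all-ones if !(-pi <= diff); SSE2 has no cmpgtsd

andpd  minus2M_PI(%rip), %xmm1   # xmm1 = -2*pi if diff > +pi, else 0
andpd  plus2M_PI(%rip), %xmm2    # xmm2 = +2*pi if diff < -pi, else 0

addsd %xmm1, %xmm0               # apply the downward correction, if any
addsd %xmm2, %xmm0               # apply the upward correction, if any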

I probably have some of the comparisons reversed by mistake... but you
get the idea.  You can do both comparisons in parallel.  Using sign
tricks doesn't seem to be profitable, as that increases the length of
the critical path.
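
For reference, the same compare-and-mask idea can be sketched in portable C. This is only an illustration under stated assumptions: the function name wrap_angle_diff and the constants PI and TWO_PI are hypothetical, both inputs are assumed to already lie in [-pi, +pi] (so at most one bound can be crossed), and the comparison directions reflect my reading of the intent (subtract 2*pi above +pi, add 2*pi below -pi). A compiler may or may not turn the two selects into the cmpsd/andpd pattern.

```c
/* Hypothetical constants; not taken from the original post. */
static const double PI     = 3.14159265358979323846;
static const double TWO_PI = 6.28318530717958647692;

/* Difference a - b wrapped back into [-pi, +pi].
 * Assumes a and b are each already within [-pi, +pi]. */
double wrap_angle_diff(double a, double b)
{
    double diff = a - b;

    /* The two corrections are independent of each other, so they can
     * be evaluated in parallel; at most one of them is non-zero. */
    double down = (diff >  PI) ? -TWO_PI : 0.0;
    double up   = (diff < -PI) ?  TWO_PI : 0.0;

    return diff + down + up;
}
```

For example, wrap_angle_diff(3.0, -3.0) takes the raw difference 6.0 and subtracts 2*pi, giving roughly -0.2832.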

Very nice, and definitely much better than my approach!
:-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
