bug-binutils

[Bug gas/22871] Encode instructions of 64-bit operand without the REX_W bit


From: address@hidden
Subject: [Bug gas/22871] Encode instructions of 64-bit operand without the REX_W bit
Date: Fri, 23 Feb 2018 18:26:14 +0000

https://sourceware.org/bugzilla/show_bug.cgi?id=22871

--- Comment #20 from Linus Torvalds <address@hidden> ---
I thought I could make the numbers more stable by using serializing
instructions (cpuid with %eax=0) around the rdtsc, but that just caused some
odd bi-modal behavior where testb/testl and testw/testq "pair up":

Round 0
testb  : 150868957
testw  : 117736338
testl  : 147902663
testq  : 117523153
Round 1
testb  : 146681110
testw  : 118466921
testl  : 147758755
testq  : 118308050
Round 2
testb  : 147607803
testw  : 118383229
testl  : 147303788
testq  : 118873304
Round 3
testb  : 147266141
testw  : 121145806
testl  : 151399470
testq  : 116112309

Funky. But when profiling, I notice that most of the cost is in the main()
function, not the test functions, so I suspect it ends up being just about
cacheline alignment of the test loops: in the fast cases, the loop start is
16-byte aligned; in the slow cases it's only 8-byte aligned.
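One way to check this kind of suspicion is to look at the low bits of the address where a loop (or the function containing it) starts. The following is a minimal sketch, not the code used for the measurements above; the helper names alignment_of() and is_aligned() are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two, capped at `cap`, that divides `addr`.
 * `cap` must itself be a power of two (e.g. 64 for a cacheline). */
uint64_t alignment_of(uintptr_t addr, uint64_t cap)
{
    assert(cap != 0 && (cap & (cap - 1)) == 0);
    uint64_t a = 1;
    /* Double `a` for as long as the next address bit is still zero. */
    while (a < cap && (addr & a) == 0)
        a <<= 1;
    return a;
}

/* True if `addr` is a multiple of `n`, where `n` is a power of two. */
int is_aligned(uintptr_t addr, uint64_t n)
{
    assert(n != 0 && (n & (n - 1)) == 0);
    return (addr & (n - 1)) == 0;
}
```

Passing something like (uintptr_t)some_test_function would report the alignment of the function entry. That is only a proxy: the loop start inside the function can sit at a different offset, so a disassembly or symbol listing is the more direct check.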

If I align everything on cacheline boundaries, things stabilize a lot:

Round 0
testb  : 146282055
testw  : 145670901
testl  : 147173973
testq  : 146631984
Round 1
testb  : 149647175
testw  : 145634421
testl  : 145738496
testq  : 150404114
Round 2
testb  : 147685735
testw  : 146392328
testl  : 144992998
testq  : 146145547
Round 3
testb  : 145870460
testw  : 146986702
testl  : 145906570
testq  : 146429161
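For reference, here is a rough sketch of the two ingredients discussed in this comment: a timestamp read with cpuid (%eax=0) around the rdtsc, and a test function forced onto a 64-byte boundary. It is an illustration under assumptions, not the benchmark that produced the numbers above. The names serialized_tsc(), count_nonzero() and time_loop() are hypothetical, the exact placement of the cpuid instructions is a guess, the 64-byte cacheline size is assumed, and the GCC/Clang `aligned` attribute aligns the function entry rather than the loop itself.

```c
#include <assert.h>
#include <stdint.h>

/* Read the TSC with a serializing cpuid (leaf 0) on both sides. */
uint64_t serialized_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t a, b, c, d, lo, hi;
    /* cpuid with %eax=0 waits for earlier instructions to complete */
    __asm__ volatile("cpuid"
                     : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
                     : "0"(0) : "memory");
    __asm__ volatile("rdtsc" : "=a"(lo), "=d"(hi));
    __asm__ volatile("cpuid"
                     : "=a"(a), "=b"(b), "=c"(c), "=d"(d)
                     : "0"(0) : "memory");
    return ((uint64_t)hi << 32) | lo;
#else
    return 0; /* no TSC on this target; keeps the sketch compilable */
#endif
}

/* Test loop placed on an (assumed) 64-byte cacheline boundary.
 * The bit check may compile to a test instruction, depending on
 * the compiler and optimization level. */
__attribute__((aligned(64)))
uint64_t count_nonzero(const volatile uint32_t *p, uint64_t iters)
{
    uint64_t hits = 0;
    for (uint64_t i = 0; i < iters; i++)
        if (*p & 0x100)
            hits++;
    return hits;
}

/* Cycles spent in one run of the loop, including cpuid overhead. */
uint64_t time_loop(const volatile uint32_t *p, uint64_t iters)
{
    uint64_t start = serialized_tsc();
    (void)count_nonzero(p, iters);
    uint64_t end = serialized_tsc();
    return end - start;
}
```

The cpuid overhead is included in every measurement, so only differences between runs of the same length are meaningful. To pin the loop start itself, an alignment directive such as .p2align in hand-written assembly would be needed.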


