[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Crash on Linux+Clang (NSTimer)
From: |
Riccardo Mottola |
Subject: |
Re: Crash on Linux+Clang (NSTimer) |
Date: |
Wed, 11 Nov 2020 00:33:22 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.12.0 |
Hello David,
thank you for the thorough analysis.
when I wrote x86, I meant x86-32, to distinguish from x86-64!
On 09/11/2020 10:58, David Chisnall wrote:
Hi Riccardo,
Nothing there looks obviously wrong, but the fact that you're getting
a SEGV on an line that isn't explicitly accessing memory is
interesting, as is the fact that this is in a `+initialize` method.
I wonder if the code in `objc_msgSend` is leaving the stack
incorrectly aligned? I encountered a similar problem with some
assembly trampolines in another project recently, where everything was
fine 99% of the time, but in a few cases the compiler was using SSE
instructions on the stack and was crashing because the stack pointer
was only 8-byte aligned and not the ABI-mandated 16-byte aligned.
This is clang10 and in March I perhaps used clang9, maybe they optimize
more - this cpu has SSE & SSE2... who knows there were a couple of
months where I had little time to devote do GNUstep and didn't use this
second laptop, so i don't know when things broke, but not overly
interesting I think.
If this is the case, you'll be able to see by looking at the value of
the %rsp register and the current instruction. Please can you:
- Use `show registers rsp` to let me know your current stack pointer
value.
- Use `disas` and show me the instruction that it points to
I don't have that registerm it is for x86-64, right? esp should be the
equivalent IA-32, no?
here my dump:
(gdb) info registers
eax 0xb7afa5e4 -1213225500
ecx 0xb7850510 -1216019184
edx 0x2 2
ebx 0xb7ad1000 -1213394944
esp 0xbfffeb84 0xbfffeb84
ebp 0x0 0x0
esi 0x834bde4 137674212
edi 0x833ef24 137621284
eip 0xb7850558 0xb7850558 <-[NSTimer
initWithFireDate:interval:target:selector:userInfo:repeats:]+72>
eflags 0x210246 [ PF ZF IF RF ID ]
cs 0x73 115
ss 0x7b 123
ds 0x7b 123
es 0x7b 123
fs 0x0 0
gs 0x33 51
disas:
Dump of assembler code for function -[NSTimer
initWithFireDate:interval:target:selector:userInfo:repeats:]:
0xb7850510 <+0>: push %ebp
0xb7850511 <+1>: push %ebx
0xb7850512 <+2>: push %edi
0xb7850513 <+3>: push %esi
0xb7850514 <+4>: sub $0x1c,%esp
0xb7850517 <+7>: call 0xb785051c <-[NSTimer
initWithFireDate:interval:target:selector:userInfo:repeats:]+12>
0xb785051c <+12>: pop %ebx
0xb785051d <+13>: movsd 0x3c(%esp),%xmm0
0xb7850523 <+19>: mov 0x38(%esp),%ebp
0xb7850527 <+23>: mov 0x44(%esp),%edi
0xb785052b <+27>: mov 0x30(%esp),%esi
0xb785052f <+31>: xorpd %xmm1,%xmm1
0xb7850533 <+35>: add $0x280ae4,%ebx
0xb7850539 <+41>: movsd -0x10d648(%ebx),%xmm3
0xb7850541 <+49>: movapd %xmm0,%xmm2
0xb7850545 <+53>: test %ebp,%ebp
0xb7850547 <+55>: cmpnlesd %xmm1,%xmm2
0xb785054c <+60>: andpd %xmm2,%xmm0
0xb7850550 <+64>: andnpd %xmm3,%xmm2
0xb7850554 <+68>: orpd %xmm0,%xmm2
For extra confirmation, put an breakpoint on the first instruction of
`NSRunLoop`'s `+initialize` and let me know what the value of %rsp is
on function entry?
That would be breaking:
Breakpoint 1, +[NSRunLoop initialize] (
self=0xb7af2944 <._OBJC_CLASS_NSRunLoop>, _cmd=0x8085db0)
at NSRunLoop.m:746
746 if (self == [NSRunLoop class])
(gdb) info registers esp
esp 0xbfffec44 0xbfffec44
Looking at the code, I think we are spilling 24 8-byte words to the
stack, but the weirdness related to the x86 call instruction means
that we will be doing the wrong thing for the ABI.
Before any fix, libobjc2 test suite:
76% tests passed, 44 tests failed out of 186
Notably:
125 - objc_msgSend (Child aborted)
126 - objc_msgSend_optimised (Child aborted)
127 - objc_msgSend_legacy (Child aborted)
128 - objc_msgSend_legacy_optimised (Child aborted)
If you want to try a fix, I believe changing the 0x98 to 0xa0 on these
two lines should work:
https://github.com/gnustep/libobjc2/blob/41808111aa0a58708daf66b2322940d4353e37d8/objc_msgSend.x86-64.S#L216
https://github.com/gnustep/libobjc2/blob/41808111aa0a58708daf66b2322940d4353e37d8/objc_msgSend.x86-64.S#L257
I believe doing that should cause the objc_msgSend test in the
runtime's test suite to fail, but then changing the 0xD8 to 0xE0 on
this line should fix it:
https://github.com/gnustep/libobjc2/blob/41808111aa0a58708daf66b2322940d4353e37d8/objc_msgSend.x86-64.S#L241
If your theory about alignment still applies, maybe you can hint me the
32bit version of it?
The 32bit method in slowLookUp is much shorter :)
5: # slowSend:
mov \sel(%esp), %ecx
lea \receiver(%esp), %eax
push %ecx # _cmd
push %eax # &self
.cfi_def_cfa_offset 12
call CDECL(slowMsgLookup)@PLT
add $8, %esp # restore the stack
I see "12" delcared here as frame offset
Thanks and looking forward for more from you!
Riccardo