Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
From: Igor Kovalenko
Subject: Re: [Qemu-devel] tcg/tcg.c:1892: tcg fatal error
Date: Mon, 11 Apr 2011 07:16:32 +0400
On Mon, Apr 11, 2011 at 12:00 AM, Artyom Tarasenko <address@hidden> wrote:
> On Sun, Apr 10, 2011 at 9:41 PM, Igor Kovalenko
> <address@hidden> wrote:
>> On Sun, Apr 10, 2011 at 11:37 PM, Artyom Tarasenko <address@hidden> wrote:
>>> On Sun, Apr 10, 2011 at 8:52 PM, Igor Kovalenko
>>> <address@hidden> wrote:
>>>> On Sun, Apr 10, 2011 at 10:35 PM, Artyom Tarasenko <address@hidden> wrote:
>>>>> On Sun, Apr 10, 2011 at 7:57 PM, Blue Swirl <address@hidden> wrote:
>>>>>> On Sun, Apr 10, 2011 at 8:48 PM, Artyom Tarasenko <address@hidden> wrote:
>>>>>>> On Sun, Apr 10, 2011 at 4:44 PM, Blue Swirl <address@hidden> wrote:
>>>>>>>> On Sun, Apr 10, 2011 at 5:09 PM, Artyom Tarasenko <address@hidden>
>>>>>>>> wrote:
>>>>>>>>> On Sun, Apr 10, 2011 at 3:24 PM, Aurelien Jarno <address@hidden>
>>>>>>>>> wrote:
>>>>>>>>>> On Sun, Apr 10, 2011 at 02:29:59PM +0200, Artyom Tarasenko wrote:
>>>>>>>>>>> While trying to boot a proprietary OS, I get a qemu-system-sparc64
>>>>>>>>>>> crash with a
>>>>>>>>>>>
>>>>>>>>>>> tcg/tcg.c:1892: tcg fatal error
>>>>>>>>>>>
>>>>>>>>>>> error message.
>>>>>>>>>>>
>>>>>>>>>>> It looks like it can be a platform independent bug though, because
>>>>>>>>>>> when a '-singlestep' option IS present, qemu doesn't crash and seems
>>>>>>>>>>> to translate the code properly.
>>>>>>>>>>>
>>>>>>>>>>> (gdb) bt
>>>>>>>>>>> #0 0x00000032c2e327f5 in raise () from /lib64/libc.so.6
>>>>>>>>>>> #1 0x00000032c2e33fd5 in abort () from /lib64/libc.so.6
>>>>>>>>>>> #2 0x000000000051933d in tcg_reg_alloc_call (s=<value optimized
>>>>>>>>>>> out>,
>>>>>>>>>>> def=0x89d340, opc=INDEX_op_call, args=0x10acc98, dead_iargs=3) at
>>>>>>>>>>> qemu/tcg/tcg.c:1892
>>>>>>>>>>> #3 0x000000000051a557 in tcg_gen_code_common (s=0x10b8940,
>>>>>>>>>>> gen_code_buf=0x40338b60 "address@hidden 3\355I\211\256\220") at
>>>>>>>>>>> qemu/tcg/tcg.c:2099
>>>>>>>>>>> #4 tcg_gen_code (s=0x10b8940, gen_code_buf=0x40338b60
>>>>>>>>>>> "address@hidden
>>>>>>>>>>> 3\355I\211\256\220") at qemu/tcg/tcg.c:2142
>>>>>>>>>>> #5 0x00000000004d38f1 in cpu_sparc_gen_code (env=0x10cce10,
>>>>>>>>>>> tb=0x7fffe91bc218, gen_code_size_ptr=0x7fffffffd9b4) at
>>>>>>>>>>> qemu/translate-all.c:93
>>>>>>>>>>> #6 0x00000000004d1fd7 in tb_gen_code (env=0x10cce10, pc=18868776,
>>>>>>>>>>> cs_base=18868780, flags=15, cflags=0) at qemu/exec.c:989
>>>>>>>>>>> #7 0x00000000004d4029 in tb_find_slow (env1=<value optimized out>)
>>>>>>>>>>> at
>>>>>>>>>>> qemu/cpu-exec.c:167
>>>>>>>>>>> #8 tb_find_fast (env1=<value optimized out>) at cpu-exec.c:194
>>>>>>>>>>> #9 cpu_sparc_exec (env1=<value optimized out>) at
>>>>>>>>>>> qemu/cpu-exec.c:556
>>>>>>>>>>> #10 0x0000000000408868 in tcg_cpu_exec () at qemu/cpus.c:1066
>>>>>>>>>>> #11 cpu_exec_all () at qemu/cpus.c:1102
>>>>>>>>>>> #12 0x000000000053c756 in main_loop (argc=<value optimized out>,
>>>>>>>>>>> argv=<value optimized out>, envp=<value optimized out>) at
>>>>>>>>>>> qemu/vl.c:1430
>>>>>>>>>>>
>>>>>>>>>>> I inspected ts->val_type causing the abort() case and it turned out
>>>>>>>>>>> to be 0.
>>>>>>>>>>>
>>>>>>>>>>> The last lines of qemu.log (without -singlestep)
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fe9f0: rdpr %pstate, %g1
>>>>>>>>>>> 0x00000000011fe9f4: wrpr %g1, 2, %pstate
>>>>>>>>>>> --------------
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fe9f8: ldub [ %o0 ], %o1
>>>>>>>>>>> 0x00000000011fe9fc: mov %o1, %o2
>>>>>>>>>>> 0x00000000011fea00: rdpr %tick, %o3
>>>>>>>>>>> 0x00000000011fea04: cmp %o1, %o2
>>>>>>>>>>> 0x00000000011fea08: be %icc, 0x11fea00
>>>>>>>>>>> 0x00000000011fea0c: ldub [ %o0 ], %o2
>>>>>>>>>>>
>>>>>>>>>>> Search PC...
>>>>>>>>>>> Search PC...
>>>>>>>>>>> Search PC...
>>>>>>>>>>> Search PC...
>>>>>>>>>>> Search PC...
>>>>>>>>>>> Search PC...
>>>>>>>>>>> --------------
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fe9f8: ldub [ %o0 ], %o1
>>>>>>>>>>> 0x00000000011fe9fc: mov %o1, %o2
>>>>>>>>>>> 0x00000000011fea00: rdpr %tick, %o3
>>>>>>>>>>> 0x00000000011fea04: cmp %o1, %o2
>>>>>>>>>>> 0x00000000011fea08: be %icc, 0x11fea00
>>>>>>>>>>> 0x00000000011fea0c: ldub [ %o0 ], %o2
>>>>>>>>>>>
>>>>>>>>>>> 110521: Data Access MMU Miss (v=0068) pc=00000000011fe9f8
>>>>>>>>>>> npc=00000000011fe9fc SP=000000000180ae41
>>>>>>>>>>> pc: 00000000011fe9f8 npc: 00000000011fe9fc
>>>>>>>>>>>
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fea00: rdpr %tick, %o3
>>>>>>>>>>> 0x00000000011fea04: cmp %o1, %o2
>>>>>>>>>>> 0x00000000011fea08: be %icc, 0x11fea00
>>>>>>>>>>> 0x00000000011fea0c: ldub [ %o0 ], %o2
>>>>>>>>>>> --------------
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fea10: brz,pn %o2, 0x11fe9f8
>>>>>>>>>>> 0x00000000011fea14: mov %o2, %o4
>>>>>>>>>>> --------------
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fea18: rdpr %tick, %o5
>>>>>>>>>>> 0x00000000011fea1c: cmp %o2, %o4
>>>>>>>>>>> 0x00000000011fea20: be %icc, 0x11fea18
>>>>>>>>>>> 0x00000000011fea24: ldub [ %o0 ], %o4
>>>>>>>>>>> --------------
>>>>>>>>>>> IN:
>>>>>>>>>>> 0x00000000011fea28: brz,pn %o4, 0x11fe9f4
>>>>>>>>>>> 0x00000000011fea2c: wrpr %g0, %g1, %pstate
>>>>>>>>>>> <EOF>
>>>>>>>>>>>
>>>>>>>>>>> The crash is 100% reproducible and always happens in the same place,
>>>>>>>>>>> so it's probably a pure TCG issue, not related to receiving
>>>>>>>>>>> external/timer interrupts.
>>>>>>>>>>>
>>>>>>>>>>> Do you need any additional info?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What would be interesting would be to get the corresponding TCG code
>>>>>>>>>> from qemu.log (-d op,op_opt).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> OP:
>>>>>>>>> ---- 0x11fea28
>>>>>>>>> ld_i64 tmp6,regwptr,$0x20
>>>>>>>>> movi_i64 cond,$0x0
>>>>>>>>> movi_i64 tmp8,$0x0
>>>>>>>>> brcond_i64 tmp6,tmp8,ne,$0x0
>>>>>>>>> movi_i64 cond,$0x1
>>>>>>>>> set_label $0x0
>>>>>>>>>
>>>>>>>>> ---- 0x11fea2c
>>>>>>>>> movi_i64 tmp7,$0x0
>>>>>>>>> xor_i64 tmp0,tmp7,g1
>>>>>>>>> movi_i64 pc,$0x11fea2c
>>>>>>>>> movi_i64 tmp8,$compute_psr
>>>>>>>>> call tmp8,$0x0,$0
>>>>>>>>> movi_i64 tmp8,$0x0
>>>>>>>>> brcond_i64 cond,tmp8,eq,$0x1
>>>>>>>>> movi_i64 npc,$0x11fe9f4
>>>>>>>>> br $0x2
>>>>>>>>> set_label $0x1
>>>>>>>>> movi_i64 npc,$0x11fea30
>>>>>>>>> set_label $0x2
>>>>>>>>> movi_i64 tmp8,$wrpstate
>>>>>>>>> call tmp8,$0x0,$0,tmp0
>>>>>>>>> mov_i64 pc,npc
>>>>>>>>> movi_i64 tmp8,$0x4
>>>>>>>>> add_i64 npc,npc,tmp8
>>>>>>>>> exit_tb $0x0
>>>>>>>>>
>>>>>>>>> OP after liveness analysis:
>>>>>>>>> ---- 0x11fea28
>>>>>>>>> ld_i64 tmp6,regwptr,$0x20
>>>>>>>>> movi_i64 cond,$0x0
>>>>>>>>> movi_i64 tmp8,$0x0
>>>>>>>>> brcond_i64 tmp6,tmp8,ne,$0x0
>>>>>>>>> movi_i64 cond,$0x1
>>>>>>>>> set_label $0x0
>>>>>>>>>
>>>>>>>>> ---- 0x11fea2c
>>>>>>>>> nopn $0x2,$0x2
>>>>>>>>> nopn $0x3,$0x68,$0x3
>>>>>>>>> movi_i64 pc,$0x11fea2c
>>>>>>>>> movi_i64 tmp8,$compute_psr
>>>>>>>>> call tmp8,$0x0,$0
>>>>>>>>> movi_i64 tmp8,$0x0
>>>>>>>>> brcond_i64 cond,tmp8,eq,$0x1
>>>>>>>>> movi_i64 npc,$0x11fe9f4
>>>>>>>>> br $0x2
>>>>>>>>> set_label $0x1
>>>>>>>>> movi_i64 npc,$0x11fea30
>>>>>>>>> set_label $0x2
>>>>>>>>> movi_i64 tmp8,$wrpstate
>>>>>>>>> call tmp8,$0x0,$0,tmp0
>>>>>>>>> mov_i64 pc,npc
>>>>>>>>> movi_i64 tmp8,$0x4
>>>>>>>>> add_i64 npc,npc,tmp8
>>>>>>>>> exit_tb $0x0
>>>>>>>>> end
>>>>>>>>>
>>>>>>>>> Does it mean the last block is processed correctly and the crash
>>>>>>>>> happens on the next instruction which doesn't make it to the log?
>>>>>>>>> The next instruction would be a
>>>>>>>>>
>>>>>>>>> 0x00000000011fea30: retl
>>>>>>>>>
>>>>>>>>> Since it's a branch instruction I guess this would also be a tcg
>>>>>>>>> block boundary.
>>>>>>>>
>>>>>>>> Because abort() was called from tcg_reg_alloc_call, I'd say 'retl'
>>>>>>>> (synthetic op for 'jmpl %o7 + 8, %g0') was the problem.
>>>>>>>
>>>>>>> Any idea why? retl is not a rare instruction...
>>>>>>
>>>>>> Sorry, calls are generated for helpers, so it's not 'jmpl' but the
>>>>>> call to wrpstate helper.
>>>>>
>>>>> And why doesn't it happen in singlestep mode?
>>>>> I tried to comment out
>>>>> cpu_check_irqs(env);
>>>>> in the helper_wrpstate but it made no difference. The only suspicious
>>>>> thing left is register bank switching. Is it safe to switch register
>>>>> banks in the helper function? Shouldn't we end the translation block
>>>>> before?
>>>>
>>>> Not sure if I have seen a write to pstate in a delay slot, but switching
>>>> globals with PS_AG appears to be safe.
>>>> Do you know which bits are changed in the pstate?
>>>
>>> Hard to say. With a breakpoint set qemu doesn't crash.
>>> The breakpoint shows the change from 0x14->0x16.
>>> So the only difference is that interrupts are getting enabled. No
>>> register bank change.
>>> (And now also no cpu_check_irqs(env) call, because I commented it out.)
>>>
>>> But given there was a Data Access MMU Miss, I would expect there must
>>> have been a PS_MG switch.
>>>
>>> Also, the breakpoint makes tcg cut the translation block before the wrpr:
>>>
>>> IN:
>>> 0x00000000011fea18: rdpr %tick, %o5
>>> 0x00000000011fea1c: cmp %o2, %o4
>>> 0x00000000011fea20: be %icc, 0x11fea18
>>> 0x00000000011fea24: ldub [ %o0 ], %o4
>>> --------------
>>> IN:
>>> 0x00000000011fea28: brz,pn %o4, 0x11fe9f4
>>> --------------
>>> IN:
>>> 0x00000000011fea2c: wrpr %g0, %g1, %pstate
>>> --------------
>>> IN:
>>> 0x00000000011fea30: retl
>>> --------------
>>> IN:
>>> 0x00000000011fea30: retl
>>> 0x00000000011fea34: sub %o5, %o3, %o0
>>>
>>
>> You can try enabling DEBUG_PSTATE to see which bits are changed.
>
> I put an additional DPRINTF in the helper and it doesn't get executed
> at 11fea2c. Only at 11fe9f4 (0x16->0x14).
In such cases I would run with -d in_asm,int to get more data for
comparing the two runs.
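As a rough aid for that comparison, here is a minimal sketch (not part of QEMU) that pulls the guest addresses out of two in_asm logs and reports which addresses were translated in only one of the runs. The function names are made up for illustration, and it assumes the disassembly lines look like the ones quoted above ("0x...: insn"). Comparing sets of addresses rather than line-by-line text sidesteps the different block boundaries that -singlestep or a breakpoint produce.

```python
import re

# Matches disassembly lines such as "0x00000000011fea2c: wrpr %g0, %g1, %pstate".
INSN_RE = re.compile(r"^(0x[0-9a-fA-F]+):")


def translated_pcs(log_text):
    """Return guest addresses of all disassembled instructions, in log order."""
    pcs = []
    for line in log_text.splitlines():
        m = INSN_RE.match(line.strip())
        if m:
            pcs.append(int(m.group(1), 16))
    return pcs


def only_in_one(pcs_a, pcs_b):
    """Return (addresses only in run A, addresses only in run B), both sorted."""
    a, b = set(pcs_a), set(pcs_b)
    return sorted(a - b), sorted(b - a)
```

Fed with the tail of the crashing log and the tail of the breakpoint log quoted above, the second list should contain 0x11fea30 and 0x11fea34, i.e. the retl and its delay slot that the crashing run never got to translate. The log texts would be read from the two saved qemu.log files; their names and locations are left to the reader.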
The attached patch may help a bit: it adds verbose pstate output.
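Independently of the attached patch (whose contents are not reproduced here), the pstate values mentioned earlier in the thread can be decoded by hand. The sketch below is only an illustration: the helper names are hypothetical, and the bit positions are the SPARC V9 / UltraSPARC PSTATE layout as I understand it, so check them against the manual before relying on them. The two-bit memory-model field (mask 0x0c0) is deliberately ignored.

```python
# Single-bit PSTATE flags, lowest bit first (assumed SPARC V9 / UltraSPARC layout).
PSTATE_FLAGS = [
    (0x001, "AG"),    # alternate globals
    (0x002, "IE"),    # interrupt enable
    (0x004, "PRIV"),  # privileged mode
    (0x008, "AM"),    # 32-bit address masking
    (0x010, "PEF"),   # FPU enabled
    (0x020, "RED"),   # RED state
    (0x100, "TLE"),   # trap little-endian
    (0x200, "CLE"),   # current little-endian
    (0x400, "MG"),    # MMU globals
    (0x800, "IG"),    # interrupt globals
]


def decode_pstate(value):
    """Return the names of the single-bit flags set in a pstate value."""
    return [name for mask, name in PSTATE_FLAGS if value & mask]


def pstate_diff(old, new):
    """Return (flags newly set, flags newly cleared) going from old to new."""
    changed = old ^ new
    set_bits = [n for m, n in PSTATE_FLAGS if changed & m and new & m]
    cleared = [n for m, n in PSTATE_FLAGS if changed & m and old & m]
    return set_bits, cleared
```

Under these assumptions, pstate_diff(0x14, 0x16) reports only IE as newly set, which matches the observation above that the write merely enables interrupts and does not switch register banks.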
Do you have a public test case?
It is possible to code this delay-slot write as a test, but the real issue
may be corruption elsewhere.
--
Kind regards,
Igor V. Kovalenko
Attachment: sparc64-dump-pstate-verbose (binary data)