qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i38


From: Pavel Dovgaluk
Subject: Re: [Qemu-devel] [PATCH v2 0/3] Fix exceptions handling for MIPS and i386
Date: Thu, 18 Jun 2015 09:18:38 +0300

> From: Aurelien Jarno [mailto:address@hidden
> On 2015-06-17 15:41, Pavel Dovgalyuk wrote:
> > This set of patches fixes exception handling for MIPS and i386 targets.
> > These targets contain instructions that break correct execution in
> > icount/TCG modes (MIPS) and in regular TCG mode (i386).
> 
> Just to be clear, this is not something specific to MIPS and i386.
> Every target which call cpu_loop_exit without doing a retranslation is
> affected by the icount bug.

True. But I haven't checked other platforms yet.

> > Incorrect execution for i386 is causes by exceptions raised by MMU 
> > functions.
> > MMU helper functions are called from generated code and other helper
> > functions. In both cases they try to get function's return address for
> > restoring virtual CPU state.
> >
> > When MMU helper is called from some other helper function
> > (like helper_maskmov_xmm) through cpu_st* function, the return address
> > will point to that helper. That is why CPU state cannot be restored in
> > the case of MMU fault.
> >
> > This bug can occur when maskmov instruction is located in the middle of the
> > translation block.
> >
> > Execution sequence for this example:
> >
> > TB start:
> > PC1: instr1
> >      instr2
> > PC2: maskmov <page fault>
> >      <page fault processing>
> > PC1: instr1
> >      instr2
> >      maskmov
> >
> > At the start of TB execution guest PC points to instr1. When page fault 
> > occurs
> > QEMU tries to restore guest PC (which should be equal to PC2). It reads 
> > host PC
> > from the call stack and checks whether it points to TB or not. Bug in ldst
> > helpers implementation provides incorrect host PC, which is not located 
> > within
> > the TB. That's why QEMU cannot recover guest PC and it remains the same 
> > (PC1).
> > After page fault processing QEMU restarts TB and executes instr1 and instr2
> > for the second time, because guest PC was not recovered.
> 
> Also just to be clear, the way you propose is just one way to fix the
> issue. The other way (which is used for all similar instructions) is to
> call just before the helper:
>     gen_update_cc_op(s);
>     gen_jmp_im(pc_start - s->cs_base);

This would work only for regular execution, not for icount one.

> > Bugs in MIPS helper functions do not break the execution in regular TCG 
> > mode,
> > because PC value is updated before calling the functions that can raise
> > an exception. But icount value cannot be updated this way. Therefore
> > exceptions make execution in icount mode non-determinisic.
> > In icount mode every translation block looks as follows:
> >
> > if icount < n then exit
> > icount -= n
> > instr1
> > instr2
> > ...
> > instrn
> > exit
> >
> > When one of these instructions initiates an exception, icount should be
> > restored and adjusted number of instructions should be subtracted from 
> > icount
> > instead of initial n.
> >
> > tlb_fill function passes retaddr to raise_exception, which allows restoring
> > current instructions in TB and correct icount calculation.
> >
> > When exception triggered with other function (e.g. by embedding call to
> > exception raising helper into TB), then PC is not passed as retaddr and
> > correct icount is not recovered. In such cases icount will be decreased
> > by the value equal to the size of TB.
> >
> > This behavior leads to incorrect values of virtual clock and
> > non-deterministic execution of the code.
> >
> > These patches passes pointer to the translation block code to the exception
> > handler. It allows correct restoring of PC and icount values.
> 
> While I think it's the correct way for load/stores or FPU exceptions, I
> am not convinced we should do that in all cases. Retranslation has a
> cost, and when the exception is likely to occur, it's better to save the
> CPU state instead of going for a retranslation. Actually we had to
> rollback such a change on SH4, as it has some performances issues. See
> commit 1012740098d0307ce5d957ebbe9a7f020da7f574.

Ok, I'll double check the patch to make translation to stop when exception
is likely to occur after current instruction.

> One way to fix that would be to reduce the cost of retranslation, for
> example by using a kind of exception table that we generate at the same
> time of the TB.

What exactly do you mean?

Pavel Dovgalyuk




reply via email to

[Prev in Thread] Current Thread [Next in Thread]