
Re: [Qemu-devel] [PATCH 2/9] S/390 CPU emulation


From: Laurent Desnogues
Subject: Re: [Qemu-devel] [PATCH 2/9] S/390 CPU emulation
Date: Mon, 2 Nov 2009 20:03:49 +0100

On Mon, Nov 2, 2009 at 7:42 PM, Aurelien Jarno <address@hidden> wrote:
> On Mon, Nov 02, 2009 at 05:16:44PM +0200, Ulrich Hecht wrote:
>> On Thursday 22 October 2009, Aurelien Jarno wrote:
>> > Probably the second. Changing the instruction pointer in the helper
>> > instead of using the proper goto_tb TCG op prevents TB chaining, and
> >> > therefore has a huge impact on performance.
>> >
>> > It's something not difficult to implement, and that I would definitely
>> > want to see in the patch before getting it merged.
>>
> >> OK, I implemented it, and the surprising result is that performance does
>> not get any better; in fact it even suffers a little bit. (My standard
>> quick test, the polarssl test suite, shows about a 2% performance impact
>> when profiled with cachegrind).
>
> > That looks really strange, as TB chaining clearly reduces the number of
> > instructions to execute, by not having to look up the TB after each
> > branch. Also, using a brcond instead of a helper should change nothing, as
> > it is located at the end of the TB, where all the globals must be saved
> > anyway.
>
> > Also, a recent bug found on ARM hosts with regard to TB chaining has shown
> > that it can give a noticeable speed gain.

That indeed looks strange:  fixing the TB chaining on ARM
made nbench i386 three times faster.  Note the gain was
less for FP parts of the benchmark due to the use of
helpers.

Ulrich,

Out of curiosity, could you post your tb_set_jmp_target1
function?  The only things I can think of at the moment that
could make the code slower are that the program you ran
was not reusing blocks and/or that the cache flushing in
tb_set_jmp_target1 is overkill.
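
To make that reasoning concrete, here is a toy model in Python.  It is
not QEMU code: run(), the traces and the counters are all made up for
illustration, and it ignores details such as the limit of two jump
slots per TB.  It only counts how many slow TB lookups happen and how
many jumps get patched (each patch standing in for one call to
something like tb_set_jmp_target1, with whatever cache flush that
implies on the host).

```python
from collections import Counter


def run(trace, chaining):
    """Simulate executing a sequence of guest block addresses.

    Returns a Counter with:
      'lookups' - slow lookups of the next TB after a branch
      'patches' - jumps patched to chain one TB to the next
    """
    stats = Counter()
    chained = set()  # (from_pc, to_pc) pairs that are already chained
    prev = None
    for pc in trace:
        if chaining and prev is not None and (prev, pc) in chained:
            # Direct jump from the previous TB: no lookup needed.
            pass
        else:
            stats["lookups"] += 1
            if chaining and prev is not None:
                # First time we see this edge: patch the jump.
                chained.add((prev, pc))
                stats["patches"] += 1
        prev = pc
    return stats


# A hot loop of three blocks, executed 1000 times: blocks are reused.
hot_loop = [0, 1, 2] * 1000
# 3000 distinct blocks, each executed once: no block is reused.
straight_line = list(range(3000))

for name, trace in (("hot loop", hot_loop), ("straight line", straight_line)):
    for chaining in (False, True):
        s = run(trace, chaining)
        print(name, "chaining" if chaining else "no chaining",
              "lookups:", s["lookups"], "patches:", s["patches"])
```

In this model the hot loop drops from 3000 lookups to 4 with chaining,
at the cost of 3 patches.  The straight-line trace still needs 3000
lookups with chaining and pays for 2999 patches on top, so if each
patch is expensive, chaining can only make such a workload slower.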


Laurent



