qemu-devel

[Qemu-devel] Re: [PATCH 0/4] Improve -icount, fix it with iothread


From: Jan Kiszka
Subject: [Qemu-devel] Re: [PATCH 0/4] Improve -icount, fix it with iothread
Date: Wed, 23 Feb 2011 12:39:52 +0100
User-agent: Mozilla/5.0 (X11; U; Linux i686 (x86_64); de; rv:1.8.1.12) Gecko/20080226 SUSE/2.0.0.12-1.1 Thunderbird/2.0.0.12 Mnenhy/0.7.5.666

On 2011-02-23 12:08, Edgar E. Iglesias wrote:
> On Wed, Feb 23, 2011 at 11:25:54AM +0100, Paolo Bonzini wrote:
>> On 02/23/2011 11:18 AM, Edgar E. Iglesias wrote:
>>> Sorry, I don't know the code well enough to give any sensible feedback
>>> on patch 2 - 4. I did test them with some of my guests and things seem
>>> to be OK with them but quite a bit slower.
>>> I saw around 10 - 20% slowdown with a cris guest and -icount 10.
>>>
>>> The slowdown might be related to the issue with super slow icount together
>>> with iothread (addressed by Marcelo's iothread timeout patch).
>>
>> No, this supersedes Marcelo's patch.  10-20% doesn't seem comparable to 
>> "looks like it deadlocked" anyway.  Also, Jan has ideas on how to remove 
>> the synchronization overhead in the main loop for TCG+iothread.
> 
> I see. I tried booting two of my MIPS and CRIS linux guests with iothread
> and -icount 4. Without your patch, the boot crawls super slow. Your patch
> gives a huge improvement. This was the "deadlock" scenario which I
> mentioned in previous emails.
> 
> Just to clarify the previous test where I saw slowdown with your patch:
> A CRIS setup that has a CRIS and basically only two peripherals,
> a timer block and a device (X) that computes stuff but delays the results
> with a virtual timer. The guest CPU is 99% of the time just
> busy-waiting for device X to get ready.
> 
> This latter test runs in 3.7s with icount 4 and without iothread,
> with or without your patch.
> 
> With icount 4 and iothread it runs in ~1m5s without your patch and
> ~1m20s with your patch. That was the 20% slowdown I mentioned earlier.
> 
> Don't know if that info helps...
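
For reference, the timings quoted above can be put side by side. The following is only a rough sketch: the helper names (parse_seconds, slowdown_percent) are made up for illustration, and the inputs are the approximate wall-clock figures from the mail (3.7s, ~1m5s, ~1m20s), so the results are no more precise than those. With these numbers the patch costs about 23% in the iothread case, and iothread itself is roughly 18x to 22x slower than the non-iothread run.

```python
# Rough comparison of the run times quoted in this thread.
# Helper names are hypothetical; inputs are the approximate figures from the mail.

def parse_seconds(text):
    """Convert a duration such as '3.7s' or '1m20s' to seconds."""
    minutes, _, rest = text.rpartition("m")
    seconds = float(rest.rstrip("s"))
    return (float(minutes) if minutes else 0.0) * 60 + seconds


def slowdown_percent(before, after):
    """Relative increase in run time, in percent."""
    return (after - before) / before * 100.0


no_iothread = parse_seconds("3.7s")       # icount 4, no iothread, with or without patch
iothread_plain = parse_seconds("1m5s")    # icount 4, iothread, without patch
iothread_patched = parse_seconds("1m20s") # icount 4, iothread, with patch

print("patch slowdown with iothread: %.1f%%"
      % slowdown_percent(iothread_plain, iothread_patched))
print("iothread vs. no iothread (unpatched): %.1fx" % (iothread_plain / no_iothread))
print("iothread vs. no iothread (patched):   %.1fx" % (iothread_patched / no_iothread))
```

The gap between the iothread and non-iothread runs is far larger than the difference the patch makes in this busy-waiting workload.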

You should try to trace the event flow in qemu, either via strace, via
the built-in tracer (which likely requires a bit more tracepoints), or
via a system-level tracer (ftrace / kernelshark).

Did my patches contribute a bit to overhead reduction? They specifically
target the costly vcpu/iothread switches in TCG mode (caused by TCG's
excessive lock-holding times).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux


