Re: [Qemu-devel] QEMU throughput is down with SMP


From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] QEMU throughput is down with SMP
Date: Fri, 1 Oct 2010 16:09:00 +0100

On Fri, Oct 1, 2010 at 4:04 PM, Venkateswararao Jujjuri (JV)
<address@hidden> wrote:
> On 10/1/2010 6:38 AM, Ryan Harper wrote:
>>
>> * Stefan Hajnoczi <address@hidden> [2010-10-01 03:48]:
>>>
>>> On Thu, Sep 30, 2010 at 8:19 PM, Venkateswararao Jujjuri (JV)
>>> <address@hidden>  wrote:
>>>>
>>>> On 9/30/2010 2:13 AM, Stefan Hajnoczi wrote:
>>>>>
>>>>> On Thu, Sep 30, 2010 at 1:50 AM, Venkateswararao Jujjuri (JV)
>>>>> <address@hidden>    wrote:
>>>>>>
>>>>>> Code: Mainline QEMU (git://git.qemu.org/qemu.git)
>>>>>> Machine: LS21 blade.
>>>>>> Disk: Local disk through VirtIO.
>>>>>> Did not select any cache option. Defaulting to writethrough.
>>>>>>
>>>>>> Command tested:
>>>>>> 3 parallel instances of: dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000
>>>>>>
>>>>>> QEMU with smp=1
>>>>>> 19.3 MB/s + 19.2 MB/s + 18.6 MB/s = 57.1 MB/s
>>>>>>
>>>>>> QEMU with smp=4
>>>>>> 15.3 MB/s + 14.1 MB/s + 13.6 MB/s = 43.0 MB/s
>>>>>>
>>>>>> Is this expected?
>>>>>
>>>>> Did you configure with --enable-io-thread?
>>>>
>>>> Yes I did.
>>>>>
>>>>> Also, try using dd oflag=direct to eliminate effects introduced by the
>>>>> guest page cache and really hit the disk.
>>>>
>>>> With oflag=direct I see no difference, and the throughput is so low that
>>>> I would not expect to see any difference.
>>>> It is 225 KB/s for each thread, with either smp=1 or smp=4.
>>>
>>> If I understand correctly you are getting:
>>>
>>> QEMU oflag=direct with smp=1
>>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>>
>>> QEMU oflag=direct with smp=4
>>> 225 KB/s + 225 KB/s + 225 KB/s = 675 KB/s
>>>
>>> This suggests the degradation for smp=4 is guest kernel page cache or
>>> buffered I/O related.  Perhaps lockholder preemption?
>>
>> or just a single spindle maxed out because the blade hard drive doesn't
>> have writecache enabled (it's disabled by default).
>
> Yes, I am sure we are hitting the limit of the blade's local disk.
> The question is why smp=4 degraded performance in the cached mode.
>
> I am running the latest upstream kernel (2.6.36-rc5) on the guest, and using
> block I/O.
> Do we have any known issues there which could explain the performance
> degradation?
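
For reference, the setup under discussion amounts to roughly the following.
The QEMU binary name, memory size, image path, and per-instance output files
are guesses (the report only gives the dd command, the virtio disk, and the
default writethrough cache); this is a sketch, not the exact command line used:

    # Host side: virtio disk with no cache= option, so QEMU defaults to
    # writethrough here; only the -smp value changes between the two runs.
    qemu-system-x86_64 -m 1024 -smp 4 \
        -drive file=/path/to/guest.img,if=virtio

    # Guest side: three parallel buffered writers, as in the report above.
    for i in 1 2 3; do
        dd if=/dev/zero of=/pmnt/my_pw$i bs=4k count=100000 &
    done
    wait

    # Variant suggested earlier in the thread: bypass the guest page cache.
    dd if=/dev/zero of=/pmnt/my_pw bs=4k count=100000 oflag=direct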

I suggested that lockholder preemption might be the issue.  If you
check /proc/lock_stat in a guest debug kernel after seeing poor
performance, do the lock statistics look suspicious (very long hold
times)?

Stefan
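
A rough sequence for collecting those statistics inside the guest, assuming
the debug kernel was built with CONFIG_LOCK_STAT=y (the sysctl path below is
from the lock-stat documentation and worth double-checking on 2.6.36):

    # Inside the guest, on the lock-stat-enabled debug kernel:
    echo 0 > /proc/lock_stat              # clear previously collected statistics
    echo 1 > /proc/sys/kernel/lock_stat   # make sure collection is enabled

    # ... rerun the dd workload while the slowdown is reproducible ...

    less /proc/lock_stat                  # scan the per-lock-class entries for
                                          # unusually large holdtime/waittime values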


