

From: Chris Friesen
Subject: Re: [Qemu-devel] is there a limit on the number of in-flight I/O operations?
Date: Mon, 25 Aug 2014 15:50:02 -0600
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

On 08/23/2014 01:56 AM, Benoît Canet wrote:
The Friday 22 Aug 2014 à 18:59:38 (-0600), Chris Friesen wrote :
On 07/21/2014 10:10 AM, Benoît Canet wrote:
The Monday 21 Jul 2014 à 09:35:29 (-0600), Chris Friesen wrote :
On 07/21/2014 09:15 AM, Benoît Canet wrote:
The Monday 21 Jul 2014 à 08:59:45 (-0600), Chris Friesen wrote :
On 07/19/2014 02:45 AM, Benoît Canet wrote:

I think in the throttling case the number of in-flight operations is limited by
the emulated hardware queue. Else requests would pile up and throttling would be
ineffective.

So this number should be around: #define VIRTIO_PCI_QUEUE_MAX 64 or something
like that.

Okay, that makes sense.  Do you know how much data can be written as part of
a single operation?  We're using 2MB hugepages for the guest memory, and we
saw the qemu RSS numbers jump from 25-30MB during normal operation up to
120-180MB when running dbench.  I'd like to know what the worst-case would be.

Sorry, I didn't understand this part at first read.

In the Linux guest, can you monitor the following?
$ cat /sys/class/block/xyz/inflight

This would give us a fairly precise number of the requests actually in flight
between the guest and qemu.


After a bit of a break I'm looking at this again.


Strange.

I would use dd with the flag oflag=nocache to make sure the write requests
do not go through the guest cache though.

Best regards

Benoît

While doing "dd if=/dev/zero of=testfile bs=1M count=700" in the guest, I
got a max "inflight" value of 181.  This seems quite a bit higher than
VIRTIO_PCI_QUEUE_MAX.

I've seen throughput as high as ~210 MB/sec, which also kicked the RSS
numbers up above 200MB.

I tried dropping VIRTIO_PCI_QUEUE_MAX down to 32 (it didn't seem to work at
all for values much less than that, though I didn't bother getting an exact
value) and it didn't really make any difference; I saw inflight values as
high as 177.

I think I might have a glimmering of what's going on. Someone please correct me if I get something wrong.

I think that VIRTIO_PCI_QUEUE_MAX doesn't really mean anything with respect to max inflight operations, and neither does virtio-blk calling virtio_add_queue() with a queue size of 128.
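
For context, here is roughly where that 128 comes from. This is paraphrased from memory of the virtio-blk setup code, so the surrounding function name and field names may differ between versions:

/* paraphrased from memory of hw/block/virtio-blk.c; names approximate */
static void virtio_blk_device_init(VirtIODevice *vdev)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);

    /* The ring gets 128 descriptors.  VIRTIO_PCI_QUEUE_MAX only bounds how
     * many virtqueues a device may expose, not how many requests can be
     * outstanding once they've been popped off the ring. */
    s->vq = virtio_add_queue(vdev, 128, virtio_blk_handle_output);
}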

I think what's happening is that virtio_blk_handle_output() spins, pulling data off the 128-entry queue and calling virtio_blk_handle_request(). At this point that queue entry can be reused, so the queue size isn't really relevant.
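
Roughly, the shape of that loop as I read it (again from memory, so the names may be slightly off):

static void virtio_blk_handle_output(VirtIODevice *vdev, VirtQueue *vq)
{
    VirtIOBlock *s = VIRTIO_BLK(vdev);
    VirtIOBlockReq *req;
    MultiReqBuffer mrb = {
        .num_writes = 0,
    };

    /* keep popping requests until the ring is empty */
    while ((req = virtio_blk_get_request(s))) {
        virtio_blk_handle_request(req, &mrb);
    }

    /* submit whatever writes are still batched up */
    virtio_submit_multiwrite(s->bs, &mrb);
}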

In virtio_blk_handle_write() we add the request to a MultiReqBuffer and every 32 writes we'll call virtio_submit_multiwrite() which calls down into bdrv_aio_multiwrite(). That tries to merge requests and then for each resulting request calls bdrv_aio_writev() which ends up calling qemu_rbd_aio_writev(), which calls rbd_start_aio().
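
Something like this, if I'm reading virtio_blk_handle_write() right (take the details with a grain of salt):

static void virtio_blk_handle_write(VirtIOBlockReq *req, MultiReqBuffer *mrb)
{
    BlockRequest *blkreq;

    /* once 32 writes have piled up, push the whole batch down the block layer */
    if (mrb->num_writes == 32) {
        virtio_submit_multiwrite(req->dev->bs, mrb);
    }

    /* park this request in the batch; actual submission happens later */
    blkreq = &mrb->blkreq[mrb->num_writes];
    blkreq->qiov = &req->qiov;
    blkreq->cb = virtio_blk_rw_complete;
    blkreq->opaque = req;
    blkreq->error = 0;

    mrb->num_writes++;
}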

rbd_start_aio() allocates a buffer and converts from iovec to a single buffer. This buffer stays allocated until the request is acked, which is where the bulk of the memory overhead with rbd is coming from (has anyone considered adding iovec support to rbd to avoid this extra copy?).
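
The relevant bit of block/rbd.c, as best I recall (field names approximate):

/* in rbd_start_aio(), paraphrased from memory */
acb->bounce = qemu_blockalign(bs, qiov->size);      /* one flat buffer per request */
if (cmd == RBD_AIO_WRITE) {
    qemu_iovec_to_buf(acb->qiov, 0, acb->bounce, qiov->size);  /* linearize the iovec */
}
/* acb->bounce isn't freed until the rbd completion callback runs, so every
 * in-flight write pins a full copy of its payload in qemu's RSS. */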

The only limit I see in the whole call chain from virtio_blk_handle_request() on down is the call to bdrv_io_limits_intercept() in bdrv_co_do_writev(). However, that doesn't provide any limit on the absolute number of inflight operations, only on operations/sec. If the ceph server cluster can't keep up with the aggregate load, then the number of inflight operations can still grow indefinitely.
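
Just to make the distinction concrete, here is a purely hypothetical sketch (not existing QEMU code) of an absolute in-flight cap, as opposed to the ops/sec accounting that bdrv_io_limits_intercept() does:

/* hypothetical only -- nothing like this exists today */
typedef struct {
    CoQueue  waiters;     /* coroutines waiting for a free slot */
    unsigned in_flight;   /* requests currently outstanding */
    unsigned max;         /* hard cap, e.g. 64 */
} InflightCap;

static void coroutine_fn inflight_cap_acquire(InflightCap *cap)
{
    while (cap->in_flight >= cap->max) {
        qemu_co_queue_wait(&cap->waiters);   /* block on a count, not a rate */
    }
    cap->in_flight++;
}

static void inflight_cap_release(InflightCap *cap)
{
    cap->in_flight--;
    qemu_co_queue_next(&cap->waiters);       /* wake one waiter, if any */
}

Acquiring before the write is issued and releasing in the completion callback would bound the number of bounce buffers rbd can have pinned at once, no matter how far behind the ceph cluster gets.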

Chris


