
From: Ketan Nilangekar
Subject: Re: [Qemu-devel] [PATCH v7 RFC] block/vxhs: Initial commit to add Veritas HyperScale VxHS block device support
Date: Mon, 7 Nov 2016 20:27:39 +0000
User-agent: Microsoft-MacOutlook/0.0.0.160109





On 11/7/16, 2:22 AM, "Stefan Hajnoczi" <address@hidden> wrote:

>On Fri, Nov 04, 2016 at 06:30:47PM +0000, Ketan Nilangekar wrote:
>> > On Nov 4, 2016, at 2:52 AM, Stefan Hajnoczi <address@hidden> wrote:
>> >> On Thu, Oct 20, 2016 at 01:31:15AM +0000, Ketan Nilangekar wrote:
>> >> 2. The idea of having a multi-threaded, epoll-based network client was to 
>> >> drive more throughput by using a multiplexed epoll implementation and 
>> >> (fairly) distributing IOs from several vdisks (a typical VM is assumed to 
>> >> have at least 2) across 8 connections. 
>> >> Each connection is serviced by a single epoll and does not share its 
>> >> context with other connections/epoll. All memory pools/queues are in the 
>> >> context of a connection/epoll.
>> >> The QEMU thread enqueues each IO request in one of the 8 epoll queues in 
>> >> round-robin order. Responses are also handled in the context of an epoll 
>> >> loop and do not share context with other epolls. Any synchronization code 
>> >> that you see today in the driver callback handles the split IOs, which we 
>> >> plan to address by a) implementing readv in libqnio and b) 
>> >> removing the 4MB limit on write IO size.
>> >> The number of client epoll threads (8) is a #define in qnio and can 
>> >> easily be changed. However, our tests indicate that we are able to drive a 
>> >> good number of IOs using 8 threads/epolls.
>> >> I am sure there are ways to simplify the library implementation, but for 
>> >> now the performance of the epoll threads is more than satisfactory.
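
For reference, here is a rough sketch of the dispatch scheme described above: a
fixed number of connection/epoll contexts, each with its own queue, and a
round-robin enqueue from the submitting thread. The names and details are
illustrative only, not the actual libqnio code.

/*
 * Illustrative sketch, not libqnio source: one context per connection/epoll,
 * nothing shared between contexts, round-robin enqueue from the submitter.
 */
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define QNIO_NUM_EPOLL_THREADS 8    /* the compile-time knob mentioned above
                                       (name illustrative) */

struct io_request {
    struct io_request *next;
    void *buf;
    uint64_t offset;
    size_t len;
};

/* One context per connection; queue, pools and epoll fd are all private. */
struct epoll_ctx {
    pthread_mutex_t lock;           /* initialised at connection setup (not shown) */
    pthread_cond_t kick;
    struct io_request *head, *tail;
    /* ... epoll fd, socket fd, per-connection memory pool ... */
};

static struct epoll_ctx ctxs[QNIO_NUM_EPOLL_THREADS];
static unsigned next_ctx;           /* round-robin cursor on the submitter side */

/* Called from the submitting (QEMU) thread. */
void qnio_submit(struct io_request *req)
{
    unsigned i = __atomic_fetch_add(&next_ctx, 1, __ATOMIC_RELAXED);
    struct epoll_ctx *c = &ctxs[i % QNIO_NUM_EPOLL_THREADS];

    pthread_mutex_lock(&c->lock);
    req->next = NULL;
    if (c->tail) {
        c->tail->next = req;
    } else {
        c->head = req;
    }
    c->tail = req;
    pthread_cond_signal(&c->kick);  /* wake that context's epoll thread */
    pthread_mutex_unlock(&c->lock);
}

Responses are then read and completed entirely within the owning epoll thread,
which is why no state needs to be shared across contexts.
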
>> > 
>> > By the way, when you benchmark with 8 epoll threads, are there any other
>> > guests with vxhs running on the machine?
>> > 
>> 
>> Yes. In fact, the total throughput with around 4-5 VMs scales well, saturating 
>> around 90% of the available storage throughput of a typical PCIe SSD device.
>> 
>> > In a real-life situation where multiple VMs are running on a single host
>> > it may turn out that giving each VM 8 epoll threads doesn't help at all
>> > because the host CPUs are busy with other tasks.
>> 
>> The exact number of epolls required for optimal throughput is something the 
>> qnio library could adjust dynamically in subsequent revisions. 
>> 
>> But as I mentioned, today we can change this simply by rebuilding qnio with a 
>> different value for that #define.
>
>In QEMU there is currently work to add multiqueue support to the block
>layer.  This enables true multiqueue from the guest down to the storage
>backend.

Is there any spec or documentation on this that you can point us to?

>
>virtio-blk already supports multiple queues but they are all processed
>from the same thread in QEMU today.  Once multiple threads are able to
>process the queues it would make sense to continue down into the vxhs
>block driver.
>
>So I don't think implementing multiple epoll threads in libqnio is
>useful in the long term.  Rather, a straightforward approach of
>integrating with the libqnio user's event loop (as described in my
>previous emails) would simplify the code and allow you to take advantage
>of full multiqueue support in the future.

Makes sense. We will take this up in the next iteration of libqnio.
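
For concreteness, the kind of interface we have in mind looks roughly like the
declarations below. All of the qnio_* names here are hypothetical, not the
current libqnio API: the library would stop owning epoll threads, expose the
connection fd, and let the caller (the QEMU vxhs driver) register that fd with
its own event loop, e.g. via aio_set_fd_handler() on the appropriate AioContext
(the exact signature of that helper varies across QEMU versions).

#include <stddef.h>
#include <stdint.h>

typedef struct qnio_channel qnio_channel;

/* Completion callback, invoked on the caller's event loop thread. */
typedef void (*qnio_done_fn)(void *opaque, int64_t ret);

/* Open a non-blocking connection; no internal threads are created. */
qnio_channel *qnio_channel_open(const char *host, int port);

/* The fd the caller registers with its event loop for read readiness. */
int qnio_channel_get_fd(qnio_channel *ch);

/* Submit an asynchronous request; completion is reported via 'done'. */
int qnio_channel_submit(qnio_channel *ch, void *buf, uint64_t offset,
                        size_t len, qnio_done_fn done, void *opaque);

/*
 * Called by the caller's event loop when the fd becomes readable: drains
 * whatever responses are available and invokes the matching 'done'
 * callbacks, all on the caller's thread.
 */
void qnio_channel_read_ready(qnio_channel *ch);

With that shape, completions are delivered on the block driver's own thread, so
the synchronization currently needed in the driver callback goes away, and the
driver could later attach one channel per queue once multiqueue support lands.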

Thanks,
Ketan.

>
>Stefan
