Re: [Qemu-devel] QEMU GSoC 2018 Project Idea (Apply polling to QEMU NVMe)


From: Paolo Bonzini
Subject: Re: [Qemu-devel] QEMU GSoC 2018 Project Idea (Apply polling to QEMU NVMe)
Date: Mon, 26 Feb 2018 09:45:37 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0

On 25/02/2018 23:52, Huaicheng Li wrote:
> I remember there were some discussions back in 2015 about this, but I
> don't see that it was ever finished. For this project, I think we can
> go in three steps: (1) add shadow doorbell buffer support to the QEMU
> NVMe emulation, which will reduce the number of VM-exits; (2) replace
> the timers currently used by the QEMU NVMe emulation with a separate
> polling thread, so that VM-exits can be eliminated completely; (3)
> going even further, adapt the architecture to use one polling thread
> per NVMe queue pair, which should provide still more performance.
> (Step 3 can be left for next year if the workload is too much for 3
> months.)

Slightly rephrased:

(1) add shadow doorbell buffer and ioeventfd support to the QEMU NVMe
emulation, which will reduce the number of VM-exits and make the
remaining ones less expensive (reducing vCPU latency).  A rough sketch
of the ioeventfd side follows right after (2).

(2) add iothread support to the QEMU NVMe emulation.  This can also be
used to eliminate VM-exits entirely, because iothreads can do adaptive
polling; a second sketch below shows the mechanism.
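
For (1), the ioeventfd half could look roughly like the sketch below.
To be clear, this is only a sketch: NvmeCtrl, NvmeSQueue and
nvme_process_sq() are the existing names in hw/block/nvme.c, but the
per-queue "notifier" and "db_addr" (shadow doorbell address) fields do
not exist yet -- adding them is precisely the project work.

static void nvme_sq_notifier(EventNotifier *e)
{
    NvmeSQueue *sq = container_of(e, NvmeSQueue, notifier);
    NvmeCtrl *n = sq->ctrl;
    uint32_t v;

    event_notifier_test_and_clear(e);

    /* Instead of decoding an MMIO doorbell write in QEMU, read the tail
     * value that the guest stored in the shadow doorbell buffer it set
     * up with the Doorbell Buffer Config admin command.  */
    pci_dma_read(&n->parent_obj, sq->db_addr, &v, sizeof(v));
    sq->tail = le32_to_cpu(v);

    nvme_process_sq(sq);    /* existing per-queue processing path */
}

static void nvme_init_sq_ioeventfd(NvmeCtrl *n, NvmeSQueue *sq)
{
    /* With doorbell stride 0, SQ y's tail doorbell sits at 0x1000 + 8*y. */
    hwaddr offset = 0x1000 + sq->sqid * 8;

    event_notifier_init(&sq->notifier, 0);
    event_notifier_set_handler(&sq->notifier, nvme_sq_notifier);

    /* Doorbell writes now just signal the eventfd; KVM completes them
     * in the kernel instead of returning to the QEMU MMIO handler.  */
    memory_region_add_eventfd(&n->iomem, offset, 4, false, 0, &sq->notifier);
}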

(1) and (2) seem okay for at most 1.5 months, especially if you already
have experience with QEMU.
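
For (2), the mechanism would be the same one virtio-blk/virtio-scsi
dataplane already use: move the queue notifier into an iothread's
AioContext and let that context's adaptive polling take over.  Again
only a sketch -- the nvme device has no "iothread" property today, so
n->iothread and the two functions below are hypothetical; the
AioContext calls are the real APIs:

/* Poll callback: with the shadow doorbell buffer in place we can notice
 * new submissions without any doorbell write at all.  In this API the
 * poll callback also does the work when it finds some.  */
static bool nvme_sq_notifier_poll(void *opaque)
{
    EventNotifier *e = opaque;
    NvmeSQueue *sq = container_of(e, NvmeSQueue, notifier);
    uint32_t v;

    pci_dma_read(&sq->ctrl->parent_obj, sq->db_addr, &v, sizeof(v));
    if (le32_to_cpu(v) == sq->tail) {
        return false;
    }
    sq->tail = le32_to_cpu(v);
    nvme_process_sq(sq);
    return true;
}

/* Call with the iothread's AioContext acquired.  */
static void nvme_attach_sq_to_iothread(NvmeCtrl *n, NvmeSQueue *sq)
{
    AioContext *ctx = iothread_get_aio_context(n->iothread);

    /* Handle this queue's kicks in the iothread instead of the main
     * loop, so submission processing never goes through the BQL.  */
    aio_set_event_notifier(ctx, &sq->notifier, true,
                           nvme_sq_notifier, nvme_sq_notifier_poll);
}

The polling window itself is then just a property of the iothread
object, e.g. "-object iothread,id=io1,poll-max-ns=32768 -device
nvme,...,iothread=io1" (where the iothread= property on nvme is the part
that doesn't exist yet).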

For (3), there is work in progress to add multiqueue support to QEMU's
block device layer.  We're hoping to get the infrastructure part in
(removing the AioContext lock) during the first half of 2018.  As you
say, we can see what the workload will be.

Including a RAM disk backend in QEMU would be nice too, and it may
interest you as it would reduce the delta between upstream QEMU and
FEMU.  So this could be another idea.

However, the main issue that I'd love to see tackled is interrupt
mitigation.  With higher rates of I/O ops and high queue depth (e.g.
32), it's common for the guest to become slower when you introduce
optimizations in QEMU.  The reason is that lower latency causes higher
interrupt rates and that in turn slows down the guest.  If you have any
ideas on how to work around this, I would love to hear about it.
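
The classic technique is interrupt coalescing, and NVMe even
standardizes a knob for it (the Interrupt Coalescing feature of Set
Features, FID 08h, with an aggregation threshold and time) that the
emulation could honor.  Just to illustrate the idea -- the per-CQ
"pending" counter and "coalesce_timer" below are hypothetical, only
msix_notify() and the timer API are the real primitives:

#define CQ_IRQ_THRESHOLD   8             /* completions per interrupt   */
#define CQ_IRQ_MAX_DELAY   (50 * 1000)   /* 50 us of added latency, max */

static void nvme_cq_fire_irq(void *opaque)
{
    NvmeCQueue *cq = opaque;

    if (cq->pending) {
        cq->pending = 0;
        msix_notify(&cq->ctrl->parent_obj, cq->vector);
    }
}

/* Called after a completion entry has been written to guest memory.
 * cq->coalesce_timer would be created in the CQ setup path with
 * timer_new_ns(QEMU_CLOCK_VIRTUAL, nvme_cq_fire_irq, cq).  */
static void nvme_cq_post_completion(NvmeCQueue *cq)
{
    cq->pending++;

    if (cq->pending >= CQ_IRQ_THRESHOLD) {
        nvme_cq_fire_irq(cq);            /* batch full: notify now */
    } else if (cq->pending == 1) {
        /* First pending completion: bound the extra latency we add.  */
        timer_mod(cq->coalesce_timer,
                  qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + CQ_IRQ_MAX_DELAY);
    }
}

The hard part, of course, is picking the threshold and delay adaptively
so that coalescing helps at high IOPS without hurting latency-sensitive
workloads.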

In any case, I would very much like to mentor this project.  Let me know
if you have any more ideas on how to extend it!

Paolo


