Subject: Re: [Qemu-devel] [RFC 1/3] aio-posix: add aio_set_poll_handler()
From: Stefan Hajnoczi
Date: Thu, 10 Nov 2016 10:17:35 +0000
User-agent: Mutt/1.7.1 (2016-10-04)
On Wed, Nov 09, 2016 at 06:30:11PM +0100, Paolo Bonzini wrote:
Thanks for the feedback. I hope that Karl will be able to find a
QEMU_AIO_POLL_MAX_NS setting that improves the benchmark. At that point
I'll send a new version of this series so we can iron out the details.
> > +static bool run_poll_handlers(AioContext *ctx)
> > +{
> > +    int64_t start_time;
> > +    unsigned int loop_count = 0;
> > +    bool fired = false;
> > +
> > +    /* Is there any polling to be done? */
>
> I think the question is not "is there any polling to be done" but rather
> "is there anything that requires looking at a file descriptor". If you
> have e.g. an NBD device on the AioContext you cannot poll. On the other
> hand if all you have is bottom halves (which you can poll with
> ctx->notified), AIO and virtio ioeventfds, you can poll.
This is a good point. Polling should only be done if all resources in
the AioContext benefit from polling - otherwise it adds latency to
resources that don't support polling.
Another thing: only poll if there is work to be done. Linux AIO must
only poll the ring when there are >0 requests outstanding. Currently it
always polls (doh!).
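Roughly what I have in mind (just a sketch - the io_poll field and this
helper are made up for illustration, not code from the series):

    static bool aio_can_poll(AioContext *ctx)
    {
        AioHandler *node;

        QLIST_FOREACH(node, &ctx->aio_handlers, node) {
            /* A single resource that cannot poll forces us back to
             * ppoll(), otherwise we would add latency to it.
             */
            if (!node->io_poll) {
                return false;
            }
        }
        return true;
    }

The "only poll when there is work" part would then live in each handler,
e.g. a Linux AIO poll callback returning false immediately when 0 requests
are in flight.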
> In particular, testing for bottom halves is necessary to avoid incurring
> extra latency on flushes, which use the thread pool.
The current code uses a half-solution: it uses aio_compute_timeout() to
see if any existing BHs are ready to execute *before* beginning to poll.
Really we should poll BHs since they can be scheduled during the polling
loop.
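Something along these lines inside the polling loop (sketch only -
run_poll_handlers_once() is a hypothetical helper and this is not the loop
from this RFC):

    static bool run_poll_handlers(AioContext *ctx, int64_t max_ns)
    {
        int64_t start_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);

        do {
            if (run_poll_handlers_once(ctx)) {
                return true;    /* an fd handler made progress */
            }
            /* A BH scheduled by another thread sets ctx->notified via
             * aio_notify(), so checking it here lets us stop polling and
             * dispatch the BH instead of spinning until the timeout.
             */
            if (atomic_read(&ctx->notified)) {
                return true;
            }
        } while (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start_time < max_ns);

        return false;
    }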
> Perhaps the poll handler could be a parameter to aio_set_event_notifier?
> run_poll_handlers can just set revents (to G_IO_IN for example) if the
> polling handler returns true, and return true as well. aio_poll can
> then call aio_notify_accept and aio_dispatch, bypassing the poll system
> call altogether.
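i.e. something like this, if I understand the suggestion correctly (the
io_poll parameter and the AioPollFn typedef are only a sketch, not existing
API):

    typedef bool AioPollFn(void *opaque);

    void aio_set_event_notifier(AioContext *ctx,
                                EventNotifier *notifier,
                                bool is_external,
                                EventNotifierHandler *io_read,
                                AioPollFn *io_poll);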
This is problematic. The poll source != file descriptor so there is a
race condition:
1. Guest increments virtqueue avail.idx
2. QEMU poll notices avail.idx update and marks fd.revents readable.
3. QEMU dispatches fd handler:
    void virtio_queue_host_notifier_read(EventNotifier *n)
    {
        VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

        if (event_notifier_test_and_clear(n)) {
            virtio_queue_notify_vq(vq);
        }
    }
4. Guest kicks virtqueue -> ioeventfd is signalled
Unfortunately polling is "too fast" and event_notifier_test_and_clear()
returns false; we won't process the virtqueue!
Pretending that polling is the same as fd monitoring only works when #4
happens before #3. We have to solve this race condition.
The simplest solution is to get rid of the if statement (i.e. enable
spurious event processing). Not sure if that has a drawback though.
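i.e. a sketch of that change, based on the handler quoted above:

    void virtio_queue_host_notifier_read(EventNotifier *n)
    {
        VirtQueue *vq = container_of(n, VirtQueue, host_notifier);

        /* Clear the notifier but always process the virtqueue, so a poll
         * that raced with the guest's kick still makes progress.
         */
        event_notifier_test_and_clear(n);
        virtio_queue_notify_vq(vq);
    }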
Do you have a nicer solution in mind?