[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 1/1] virtio: fallback from irqfd to non-irqfd no
From: |
Halil Pasic |
Subject: |
Re: [Qemu-devel] [PATCH 1/1] virtio: fallback from irqfd to non-irqfd notify |
Date: |
Wed, 1 Mar 2017 17:08:39 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.7.0 |
On 03/01/2017 03:29 PM, Paolo Bonzini wrote:
>
>
> On 01/03/2017 14:22, Halil Pasic wrote:
>> Here a trace:
>>
>> address@hidden:virtio_blk_req_complete req 0x2aa6b117e10 status 0
>> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> address@hidden:virtio_blk_req_complete req 0x2aa6b118980 status 0
>> address@hidden:virtio_blk_req_complete req 0x2aa6b119260 status 0
>> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> address@hidden:virtio_blk_req_complete req 0x2aa6b118980 status 0
>> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de880
>> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e61c8 vq 0x2aa6b4de8f8
>> address@hidden:virtio_blk_data_plane_stop dataplane 0x2aa6b0e5540
>> ^== DATAPLANE STOP
>> address@hidden:virtio_blk_req_complete req 0x2aa6b117e10 status 0
>> address@hidden:virtio_guest_notifier_read vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> ^== comes from k->set_guest_notifiers(qbus->parent,
>> nvqs, false);
>> in virtio_blk_data_plane_stop and done
>> immediately after
>> irqfd is cleaned up by the transport
>> address@hidden:virtio_notify_irqfd vdev 0x2aa6b0e19d8 vq 0x2aa6b4c0870
>> halil: error in event_notifier_set: Bad file descriptor
>> ^== here we have the problem
>>
>> If you want a stacktrace that can be arranged to.
>>
>>> like a reset should cause it (the only call in virtio-blk is from
>>> virtio_blk_data_plane_stop), and then the guest doesn't care anymore
>>> about interrupts.
>> I do not understand this with 'doesn't care anymore about interrupts'.
>> I was debugging a virtio-blk device being stuck waiting for a host
>> notification (interrupt) after migration.
>
> Ok, this explains it better then. The issue is that
> virtio_blk_data_plane_stop doesn't flush the bottom half, which you want
> to do when the caller is, for example, virtio_ccw_vmstate_change.
>
> Does it work if you call to qemu_bh_cancel(s->bh) and notify_guest_bh(s)
> after
>
> blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
>
> ?
>
With
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -260,6 +260,8 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev)
/* Drain and switch bs back to the QEMU main loop */
blk_set_aio_context(s->conf->conf.blk, qemu_get_aio_context());
+ qemu_bh_cancel(s->bh);
+ notify_guest_bh(s);
applied I do not see the problem any more. I will most likely
turn this into a patch tomorrow. I would like to give it some more testing and
thinking (see questions below) until tomorrow.
I should probably cc stable, or?
I would also like to do some diagnostic stuff if virtio_notify_irqfd fails.
Maybe assert success for event_notifier_set. Would that be OK with you?
I have a couple of questions about the ways of the dataplane code. If
you are too busy, feel free to not answer -- I will keep thinking myself.
Q1. For this to work correctly, it seems to me, we need to be sure that
virtio_blk_req_complete can not be happen between the newly added
notify_guest_bh(s);
and
vblk->dataplane_started = false;
becomes visible. How is this ensured?
Q2. The virtio_blk_data_plane_stop should be from the thread/context
associated with the main event loop, and with that
vblk->dataplane_started = false too. But I think dataplane_started
may end up being used form a different thread (e.g. req_complete).
How does the sequencing work there and/or is it even important?
Regards,
Halil