Re: [Qemu-devel] [PATCH] libvhost-user: Start VQs on SET_VRING_CALL
From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH] libvhost-user: Start VQs on SET_VRING_CALL
Date: Tue, 17 Jan 2017 21:00:20 +0200
On Tue, Jan 17, 2017 at 06:53:17PM +0000, Felipe Franciosi wrote:
>
> > On 17 Jan 2017, at 10:41, Michael S. Tsirkin <address@hidden> wrote:
> >
> > On Fri, Jan 13, 2017 at 10:29:46PM +0000, Felipe Franciosi wrote:
> >>
> >>> On 13 Jan 2017, at 10:18, Michael S. Tsirkin <address@hidden> wrote:
> >>>
> >>> On Fri, Jan 13, 2017 at 05:15:22PM +0000, Felipe Franciosi wrote:
> >>>>
> >>>>> On 13 Jan 2017, at 09:04, Michael S. Tsirkin <address@hidden> wrote:
> >>>>>
> >>>>> On Fri, Jan 13, 2017 at 03:09:46PM +0000, Felipe Franciosi wrote:
> >>>>>> Hi Marc-Andre,
> >>>>>>
> >>>>>>> On 13 Jan 2017, at 07:03, Marc-André Lureau <address@hidden> wrote:
> >>>>>>>
> >>>>>>> Hi
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> Currently, VQs are started as soon as a SET_VRING_KICK is received.
> >>>>>>>> That is too early in the VQ setup process, as the backend might not
> >>>>>>>> yet have
> >>>>>>>
> >>>>>>> I think we may want to reconsider queue_set_started(), move it
> >>>>>>> elsewhere, since kick/call fds aren't mandatory to process the rings.
> >>>>>>
> >>>>>> Hmm. The fds aren't mandatory, but I imagine in that case we should
> >>>>>> still receive SET_VRING_KICK/CALL messages without an fd (ie. with the
> >>>>>> VHOST_MSG_VQ_NOFD_MASK flag set). Wouldn't that be the case?
> >>>>>
> >>>>> Please look at docs/specs/vhost-user.txt, Starting and stopping rings
> >>>>>
> >>>>> The spec says:
> >>>>> Client must start ring upon receiving a kick (that is, detecting that
> >>>>> file descriptor is readable) on the descriptor specified by
> >>>>> VHOST_USER_SET_VRING_KICK, and stop ring upon receiving
> >>>>> VHOST_USER_GET_VRING_BASE.
> >>>>
> >>>> Yes I have seen the spec, but there is a race with the current
> >>>> libvhost-user code which needs attention. My initial proposal (which got
> >>>> turned down) was to send a spurious notification upon seeing a callfd.
> >>>> Then I came up with this proposal. See below.
> >>>>
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>>> a callfd to notify in case it received a kick and fully processed the
> >>>>>>>> request/command. This patch only starts a VQ when a SET_VRING_CALL is
> >>>>>>>> received.
> >>>>>>>
> >>>>>>> I don't like that much, as soon as the kick fd is received, it should
> >>>>>>> start polling it imho. callfd is optional, it may have one and not
> >>>>>>> the other.
> >>>>>>
> >>>>>> So the question is whether we should be receiving a SET_VRING_CALL
> >>>>>> anyway or not, regardless of an fd being sent. (I think we do, but I
> >>>>>> haven't done extensive testing with other device types.)
> >>>>>
> >>>>> I would say not, only KICK is mandatory and that is also not enough
> >>>>> to process ring. You must wait for it to be readable.
> >>>>
> >>>> The problem is that Qemu takes time between sending the kickfd and the
> >>>> callfd. Hence the race. Consider this scenario:
> >>>>
> >>>> 1) Guest configures the device
> >>>> 2) Guest puts a request on a virtq
> >>>> 3) Guest kicks
> >>>> 4) Qemu starts configuring the backend
> >>>> 4.a) Qemu sends the masked callfds
> >>>> 4.b) Qemu sends the virtq sizes and addresses
> >>>> 4.c) Qemu sends the kickfds
> >>>>
> >>>> (When using MQ, Qemu will only send the callfd once all VQs are
> >>>> configured)
> >>>>
> >>>> 5) The backend starts listening on the kickfd upon receiving it
> >>>> 6) The backend picks up the guest's request
> >>>> 7) The backend processes the request
> >>>> 8) The backend puts the response on the used ring
> >>>> 9) The backend notifies the masked callfd
> >>>>
> >>>> 4.d) Qemu sends the callfds
> >>>>
> >>>> At which point the guest missed the notification and gets stuck.
> >>>>
> >>>> Perhaps you prefer my initial proposal of sending a spurious
> >>>> notification when the backend sees a callfd?
> >>>>
> >>>> Felipe
> >>>
> >>> I thought we read the masked callfd when we unmask it,
> >>> and forward the interrupt. See kvm_irqfd_assign:
> >>>
> >>>     /*
> >>>      * Check if there was an event already pending on the eventfd
> >>>      * before we registered, and trigger it as if we didn't miss it.
> >>>      */
> >>>     events = f.file->f_op->poll(f.file, &irqfd->pt);
> >>>
> >>>     if (events & POLLIN)
> >>>         schedule_work(&irqfd->inject);
> >>>
> >>>
> >>>
> >>> Is this a problem you observe in practice?
> >>
> >> Thanks for pointing out this code; I wasn't aware of it.
> >>
> >> Indeed I'm encountering it in practice. And I've checked that my kernel
> >> has the code above.
> >>
> >> Starts to sound like a race:
> >> Qemu registers the new notifier with kvm
> >> Backend kicks the (now no longer registered) maskfd
> >
> > vhost user is not supposed to use maskfd at all.
> >
> > We have this code:
> >     if (net->nc->info->type == NET_CLIENT_DRIVER_VHOST_USER) {
> >         dev->use_guest_notifier_mask = false;
> >     }
> >
> > isn't it effective?
>
> I'm observing this problem when using vhost-user-scsi, not -net. So the code
> above is not in effect. Anyway, I'd expect the race I described to also
> happen on vhost-scsi.
>
> The problem is aggravated on storage for the following reason:
> SeaBIOS configures the vhost-(user)-scsi device and finds the boot drive and
> reads the boot data.
> Then the guest kernel boots, the virtio-scsi driver loads and reconfigures
> the device.
> Qemu sends the new virtq information to the backend, but as soon as the
> device status is OK the guest sends reads to the root disk.
> And if the irq is lost the guest will wait for a response forever before
> making progress.
>
> Unlike networking (which must cope with packet drops), the guest hangs
> waiting for the device to answer.
>
> So even if you had this race in networking, the guest would eventually
> retransmit which would hide the issue.
>
> Thoughts?
> Felipe
maskfd is simply racy for vhost-user at the moment. I'm guessing vhost-scsi
should just set use_guest_notifier_mask = false for vhost-user, as the net
code above does; that should fix it. Alternatively, rework masking to
support synchronizing with the backend, but I doubt it's useful.
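
To make the failure mode concrete, here is a minimal standalone sketch. It
uses plain Linux eventfds and is not QEMU or libvhost-user code; all function
names are made up for illustration. It assumes the backend still holds the
old mask fd when it signals completion, while the guest side is already
listening on a new call fd. The event stays pending on the old fd and never
shows up on the new one.

```c
#include <assert.h>
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Return 1 if fd is readable (an event is pending), 0 if not, -1 on error. */
static int fd_has_pending_event(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, 0);          /* timeout 0: do not block */

    if (rc < 0) {
        return -1;
    }
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}

/* Signal an eventfd, the way a backend would notify a call fd. */
static int notify_fd(int fd)
{
    uint64_t one = 1;

    return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Hypothetical reproduction of the lost notification:
 * returns 1 if the event is pending on the old mask fd but absent on the
 * new call fd, 0 otherwise, -1 on error.
 */
static int demo_lost_notification(void)
{
    int maskfd = eventfd(0, EFD_NONBLOCK);  /* fd the backend still holds */
    int callfd = eventfd(0, EFD_NONBLOCK);  /* fd the guest side now watches */
    int on_mask, on_call;

    if (maskfd < 0 || callfd < 0) {
        if (maskfd >= 0) close(maskfd);
        if (callfd >= 0) close(callfd);
        return -1;
    }

    /* Backend finishes a request and signals the only fd it knows about. */
    if (notify_fd(maskfd) < 0) {
        close(maskfd);
        close(callfd);
        return -1;
    }

    on_mask = fd_has_pending_event(maskfd);
    on_call = fd_has_pending_event(callfd);

    close(maskfd);
    close(callfd);
    return (on_mask == 1 && on_call == 0) ? 1 : 0;
}
```

Calling demo_lost_notification() from a main() on Linux shows the event
sitting on the old fd only. Nothing in this sketch forwards it to the new fd,
which is the step that has to happen somewhere for the guest not to hang.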
> >
> >
> >
> >> Qemu sends the new callfd to the application
> >>
> >> It's not hard to repro. How could this situation be avoided?
> >>
> >> Cheers,
> >> Felipe
> >>
> >>
> >>>
> >>>>
> >>>>>
> >>>>>>>
> >>>>>>> Perhaps it's best for now to delay the callfd notification with a
> >>>>>>> flag until it is received?
> >>>>>>
> >>>>>> The other idea is to always kick when we receive the callfd. I
> >>>>>> remember discussing that alternative with you before libvhost-user
> >>>>>> went in. The protocol says both the driver and the backend must handle
> >>>>>> spurious kicks. This approach also fixes the bug.
> >>>>>>
> >>>>>> I'm happy with whatever alternative you want, as long as it makes
> >>>>>> libvhost-user usable for storage devices.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Felipe
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> Signed-off-by: Felipe Franciosi <address@hidden>
> >>>>>>>> ---
> >>>>>>>> contrib/libvhost-user/libvhost-user.c | 26 +++++++++++++-------------
> >>>>>>>> 1 file changed, 13 insertions(+), 13 deletions(-)
> >>>>>>>>
> >>>>>>>> diff --git a/contrib/libvhost-user/libvhost-user.c b/contrib/libvhost-user/libvhost-user.c
> >>>>>>>> index af4faad..a46ef90 100644
> >>>>>>>> --- a/contrib/libvhost-user/libvhost-user.c
> >>>>>>>> +++ b/contrib/libvhost-user/libvhost-user.c
> >>>>>>>> @@ -607,19 +607,6 @@ vu_set_vring_kick_exec(VuDev *dev, VhostUserMsg *vmsg)
> >>>>>>>>          DPRINT("Got kick_fd: %d for vq: %d\n", vmsg->fds[0], index);
> >>>>>>>>      }
> >>>>>>>>
> >>>>>>>> -    dev->vq[index].started = true;
> >>>>>>>> -    if (dev->iface->queue_set_started) {
> >>>>>>>> -        dev->iface->queue_set_started(dev, index, true);
> >>>>>>>> -    }
> >>>>>>>> -
> >>>>>>>> -    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
> >>>>>>>> -        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
> >>>>>>>> -                       vu_kick_cb, (void *)(long)index);
> >>>>>>>> -
> >>>>>>>> -        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
> >>>>>>>> -               dev->vq[index].kick_fd, index);
> >>>>>>>> -    }
> >>>>>>>> -
> >>>>>>>>      return false;
> >>>>>>>>  }
> >>>>>>>>
> >>>>>>>> @@ -661,6 +648,19 @@ vu_set_vring_call_exec(VuDev *dev, VhostUserMsg *vmsg)
> >>>>>>>>
> >>>>>>>>      DPRINT("Got call_fd: %d for vq: %d\n", vmsg->fds[0], index);
> >>>>>>>>
> >>>>>>>> +    dev->vq[index].started = true;
> >>>>>>>> +    if (dev->iface->queue_set_started) {
> >>>>>>>> +        dev->iface->queue_set_started(dev, index, true);
> >>>>>>>> +    }
> >>>>>>>> +
> >>>>>>>> +    if (dev->vq[index].kick_fd != -1 && dev->vq[index].handler) {
> >>>>>>>> +        dev->set_watch(dev, dev->vq[index].kick_fd, VU_WATCH_IN,
> >>>>>>>> +                       vu_kick_cb, (void *)(long)index);
> >>>>>>>> +
> >>>>>>>> +        DPRINT("Waiting for kicks on fd: %d for vq: %d\n",
> >>>>>>>> +               dev->vq[index].kick_fd, index);
> >>>>>>>> +    }
> >>>>>>>> +
> >>>>>>>>      return false;
> >>>>>>>>  }
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>> 1.9.4
> >>>>>>>>
> >>>>>>>>
> >>>>>>