qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC v2 12/13] vdpa: preemptive kick at enable


From: Si-Wei Liu
Subject: Re: [RFC v2 12/13] vdpa: preemptive kick at enable
Date: Sat, 4 Feb 2023 03:04:02 -0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.6.1



On 2/2/2023 8:53 AM, Eugenio Perez Martin wrote:
On Thu, Feb 2, 2023 at 1:57 AM Si-Wei Liu <si-wei.liu@oracle.com> wrote:


On 1/13/2023 1:06 AM, Eugenio Perez Martin wrote:
On Fri, Jan 13, 2023 at 4:39 AM Jason Wang <jasowang@redhat.com> wrote:
On Fri, Jan 13, 2023 at 11:25 AM Zhu, Lingshan <lingshan.zhu@intel.com> wrote:

On 1/13/2023 10:31 AM, Jason Wang wrote:
On Fri, Jan 13, 2023 at 1:27 AM Eugenio Pérez <eperezma@redhat.com> wrote:
Spuriously kick the destination device's queue so it knows in case there
are new descriptors.

RFC: This is somehow a gray area. The guest may have placed descriptors
in a virtqueue but not kicked it, so it might be surprised if the device
starts processing it.
So I think this is kind of the work of the vDPA parent. For the parent
that needs this trick, we should do it in the parent driver.
Agree, it looks easier implementing this in parent driver,
I can implement it in ifcvf set_vq_ready right now
Great, but please check whether or not it is really needed.

Some device implementation could check the available descriptions
after DRIVER_OK without waiting for a kick.

So IIUC we can entirely drop this from the series (and I hope we can).
But then, what with the devices that does *not* check for them?
I wonder how the kick can be missed from the first place. Supposedly the
moment when vhost_dev_stop() calls .suspend() into vdpa driver, the
vcpus already stopped running (vm_running = false) and all pending kicks
are delivered through vhost-vdpa's host notifiers or mapped doorbell
already then device won't get new ones.
I'm thinking now in cases like the net rx queue.

When the guest starts it fills it and kicks the device. Let's say
avail_idx is 255.

Following the qemu emulated virtio net,
hw/virtio/virtio.c:virtqueue_split_pop will stash shadow_avail_idx =
255, and it will not check it again until it is out of rx descriptors.

Now the NIC fills N < 255 receive buffers, and VMM migrates. Will the
destination device check rx avail idx even if it has not received any
kick? (here could be at startup or when it needs to receive a packet).
- If the answer is yes, and it will be a bug not to check it, then we
can drop this patch. We're covered even if there is a possibility of
losing a kick in the source.
So this is not an issue of missing delivery of kicks, but more of how device is expected to handle pending kicks during suspend? For network device, it's not required to process up to avail_idx during suspend, but this doesn't mean it should ignore the kick for new descriptors, or instead I would say the device shouldn't specifically rely on kick, either at suspend or at startup. If at suspend, the device doesn't process up to avail_idx, correspondingly the implementation of it should sync the avail_idx in memory at startup. Even if the device implementation has to process up to avail_idx at suspend, for interoperability (i.e. source device didn't sync at suspend) point of view it still needs to check avail_idx at startup (resume) time and go on to process any pending request, right? So in any case, it seems to me the "implicit" kick at startup is needed for any device implementation anyway. I wouldn't say mandatory but that's the way how its supposed to work I feel.

- If the answer is that it is not mandatory, we need to solve it
somehow. To me, the best way is to spuriously kick as we don't need
changes in the device, all we need is here. A new feature flag
_F_CHECK_AVAIL_ON_STARTUP or equivalent would work the same, but I
think it complicates everything more.

For tx the device should suspend "immediately", so it may receive a
kick, fetch avail_idx with M pending descriptors, transmit P < M and
then receive the suspend. If we don't want to wait indefinitely, the
device should stop processing so there are still pending requests in
the queue for the destination to send. So the case now is the same as
rx, even if the source device actually receives the kick.

Having said that, I didn't check if any code drains the vhost host
notifier. Or, as mentioned in the meeting, check that HW cannot
reorder kick and suspend call.
Not sure how order matters here, though I thought device suspend/resume doesn't tie in with kick order?


If the device intends to
purposely ignore (note: this could be a device bug) pending kicks during
.suspend(), then consequently it should check available descriptors
after reaching driver_ok to process outstanding descriptors, making up
for the missing kick. If the vdpa driver doesn't support .suspend(),
then it should flush the work before .reset() - vhost-scsi does it this
way.  Or otherwise I think it's the norm (right thing to do) device
should process pending kicks before guest memory is to be unmapped at
the late game of vhost_dev_stop(). Is there any case kicks may be missing?

So process pending kicks means to drain all tx and rx descriptors?
No it doesn't have to. What I said is it shouldn't ignore pending kicks as if there's no more buffer posted to the device. But really, a fair expectation for suspending or starting device is device should do a spontaneous sync for getting the latest avail_idx in the memory, which does not have to depend on kicks. For network hardware device, I thought suspend just needs to wait until the completion of ongoing Tx/Rx DMA transaction already in the flight, rather than to drain all the upcoming packets until avail_idx.

Regards,
-Siwei
That would be a solution, but then we don't need virtqueue_state at
all as we might simply recover it from guest's vring avail_idx.

Thanks!

-Siwei


If we drop it it seems to me we must mandate devices to check for
descriptors at queue_enable. The queue could stall if not, isn't it?

Thanks!

Thanks

Thanks
Zhu Lingshan
Thanks

However, that information is not in the migration stream and it should
be an edge case anyhow, being resilient to parallel notifications from
the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
    hw/virtio/vhost-vdpa.c | 5 +++++
    1 file changed, 5 insertions(+)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 40b7e8706a..dff94355dd 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -732,11 +732,16 @@ static int vhost_vdpa_set_vring_ready(struct vhost_dev 
*dev, int ready)
        }
        trace_vhost_vdpa_set_vring_ready(dev);
        for (i = 0; i < dev->nvqs; ++i) {
+        VirtQueue *vq;
            struct vhost_vring_state state = {
                .index = dev->vq_index + i,
                .num = 1,
            };
            vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
+
+        /* Preemptive kick */
+        vq = virtio_get_queue(dev->vdev, dev->vq_index + i);
+        event_notifier_set(virtio_queue_get_host_notifier(vq));
        }
        return 0;
    }
--
2.31.1





reply via email to

[Prev in Thread] Current Thread [Next in Thread]