From: Eugenio Perez Martin
Subject: Re: [PATCH v2 00/13] Dynamycally switch to vhost shadow virtqueues at vdpa net migration
Date: Wed, 15 Feb 2023 19:40:59 +0100
On Fri, Feb 10, 2023 at 1:58 PM Gautam Dawar <gdawar@amd.com> wrote:
>
> Hi Eugenio,
>
> I've tested this patch series on Xilinx/AMD SN1022 device without
> control vq and VM Live Migration between two hosts worked fine.
>
> Tested-by: Gautam Dawar <gautam.dawar@amd.com>
>
Thanks for testing!
>
> Here is some minor feedback:
>
> Please fix the typo (Dynamycally -> Dynamically) in the Subject.
>
> On 2/8/23 15:12, Eugenio Pérez wrote:
> > It's possible to migrate vdpa net devices if they are shadowed from the
> > start. But to always shadow the dataplane is to effectively break its host
> > passthrough, so its not convenient in vDPA scenarios.
> I believe you meant efficient instead of convenient.
> >
> > This series enables dynamically switching to shadow mode only at
> > migration time. This allows full data virtqueues passthrough all the
> > time qemu is not migrating.
> >
> > In this series only net devices with no CVQ are migratable. CVQ adds
> > additional state that would make the series bigger and still had some
> > controversy on previous RFC, so let's split it.
> >
> > The first patch delays the creation of the iova tree until it is really
> > needed, and makes it easier to dynamically move from and to SVQ mode.
> It would help adding some detail on the iova tree being referred to here.
> >
> > Next patches from 02 to 05 handle the suspending and getting of vq state
> > (base) of the device at the switch to SVQ mode. The new _F_SUSPEND
> > feature is negotiated and stop device flow is changed so the state can
> > be fetched trusting the device will not modify it.
> >
> > Since vhost backend must offer VHOST_F_LOG_ALL to be migratable, last
> > patches but the last one add the needed migration blockers so vhost-vdpa
> > can offer it
>
> "last patches but the last one"?
>
I think I addressed all of the above in v3, thanks for pointing them out!
Would it be possible to test with v3 as well?
> Thanks.
>
> >
> > safely. They also add the handling of this feature.
> >
> > Finally, the last patch makes virtio vhost-vdpa backend to offer
> > VHOST_F_LOG_ALL so qemu migrate the device as long as no other blocker
> > has been added.
> >
> > Successfully tested with vdpa_sim_net with patch [1] applied and with
> > the qemu emulated device with vp_vdpa with some restrictions:
> >
> > * No CVQ. No feature that didn't work with SVQ previously (packed, ...)
> > * VIRTIO_RING_F_STATE patches implementing [2].
> > * Expose _F_SUSPEND, but ignore it and suspend on ring state fetch like
> >   DPDK.
> >
> > Comments are welcome.
> >
> > v2:
> > - Check for SUSPEND in vhost_dev.backend_cap, as .backend_features is
> >   empty at the check moment.
> >
> > v1:
> > - Omit all code working with CVQ and block migration if the device
> >   supports CVQ.
> > - Remove spurious kick.
> Even with the spurious kick, the datapath didn't resume at the destination
> VM after LM, as the kick happened before DRIVER_OK. So IMO, the vdpa
> parent driver will need to simulate a kick after creating/starting the
> HW rings.
Right, it did not solve the issue.
If I'm not wrong, all vdpa drivers are moving to that model, checking
for new avail descriptors right after DRIVER_OK. Maybe it is better to
keep this discussion on patch 12/13 of RFC v2?
Thanks!
> >
> > - Move all possible checks for migration to vhost-vdpa instead of the
> >   net backend. Move them to init code from start code.
> > - Suspend on vhost_vdpa_dev_start(false) instead of in vhost-vdpa net
> >   backend.
> > - Properly split suspend after getting base and adding of status_reset
> >   patches.
> > - Add possible TODOs to points where this series can improve in the
> >   future.
> > - Check the state of migration using migration_in_setup and
> >   migration_has_failed instead of checking all the possible migration
> >   status in a switch.
> > - Add TODO with possible low hanging fruit using RESUME ops.
> > - Always offer _F_LOG from virtio/vhost-vdpa and let migration blockers
> >   do their thing instead of adding a variable.
> > - RFC v2 at
> >   https://lists.gnu.org/archive/html/qemu-devel/2023-01/msg02574.html
> >
> > RFC v2:
> > - Use a migration listener instead of a memory listener to know when
> >   the migration starts.
> > - Add stuff not picked with ASID patches, like enable rings after
> >   driver_ok.
> > - Add rewinding on the migration src, not in dst.
> > - RFC v1 at
> >   https://lists.gnu.org/archive/html/qemu-devel/2022-08/msg01664.html
> >
> > [1] https://lore.kernel.org/lkml/20230203142501.300125-1-eperezma@redhat.com/T/
> > [2] https://lists.oasis-open.org/archives/virtio-comment/202103/msg00036.html
> >
> > Eugenio Pérez (13):
> >   vdpa net: move iova tree creation from init to start
> >   vdpa: Negotiate _F_SUSPEND feature
> >   vdpa: add vhost_vdpa_suspend
> >   vdpa: move vhost reset after get vring base
> >   vdpa: rewind at get_base, not set_base
> >   vdpa net: allow VHOST_F_LOG_ALL
> >   vdpa: add vdpa net migration state notifier
> >   vdpa: disable RAM block discard only for the first device
> >   vdpa net: block migration if the device has CVQ
> >   vdpa: block migration if device has unsupported features
> >   vdpa: block migration if dev does not have _F_SUSPEND
> >   vdpa: block migration if SVQ does not admit a feature
> >   vdpa: return VHOST_F_LOG_ALL in vhost-vdpa devices
> >
> >  include/hw/virtio/vhost-backend.h |   4 +
> >  hw/virtio/vhost-vdpa.c            | 126 +++++++++++++++-----
> >  hw/virtio/vhost.c                 |   3 +
> >  net/vhost-vdpa.c                  | 192 +++++++++++++++++++++++++-----
> >  hw/virtio/trace-events            |   1 +
> >
> >  5 files changed, 267 insertions(+), 59 deletions(-)
> >
> > --
> > 2.31.1